StrokesPlus.net
Rob  
#1 Posted : Wednesday, January 20, 2021 2:10:10 PM(UTC)
Rob

Rank: Administration

Reputation:

Groups: Translators, Members, Administrators
Joined: 1/11/2018(UTC)
Posts: 1,349
United States
Location: Tampa, FL

Thanks: 28 times
Was thanked: 416 time(s) in 354 post(s)
DISCLAIMER:

I am not versed in Tesseract nor OCR beyond understanding what they are in general. I simply pulled together the necessary files and two supporting methods in the SPTesseract class to be able to make this work.
I have only tested this in the most basic way: selecting an area with a light background and dark, legible text.



Plug-In and Required Files:

https://www.strokesplus.net/files/plugins/SPTesseract_1.1.0.0.zip (177 MB)

IMPORTANT:
- You need StrokesPlus.net 0.4.1.1 or later
- Extract all files/folders directly into the StrokesPlus.net\Plug-Ins folder using 7-Zip or another tool that doesn't leave the extracted DLLs blocked
- If you change the current working directory used by S+, the plug-in might fail to load the trained data files
- Should look like the image below



IMPORTANT:
You will also need to ensure you have the Microsoft VC Runtime installed: MSVC Runtime



Put this function in Global Actions > Load/Unload > Load (script):
This is a wrapper for extracting text based on a language and rectangle
Code:
function GetTesseractText(lang, rect) {

    //Capture an image from the screen using the Rectangle passed in

    var memoryImage = new drawing.System.Drawing.Bitmap(rect.Width, rect.Height);
    var memoryGraphics = drawing.System.Drawing.Graphics.FromImage(memoryImage);
    memoryGraphics.CopyFromScreen(rect.X, rect.Y, 0, 0, new Size(rect.Width, rect.Height));
    memoryGraphics.Dispose();

    /*
    Get Tesseract Page object 
     
    First param is the Bitmap image
      - From code above

    Second param is language
      - See Plug-Ins\TesseractTrainedData folder for other trained data files
      - Appears to be just the first part of the file name; "eng" for English

    Third param is Tesseract.EngineMode enum value: 
      - TesseractOnly = "0"
      - LstmOnly = "1"
      - TesseractAndLstm = "2"
      - Default = "3"

    Last param is Tesseract.PageSegMode
      - OsdOnly = "0"
      - AutoOsd = "1"
      - AutoOnly = "2"
      - Auto = "3"
      - SingleColumn = "4"
      - SingleBlockVertText = "5"
      - SingleBlock = "6"
      - SingleLine = "7"
      - SingleWord = "8"
      - CircleWord = "9"
      - SingleChar = "10"
      - SparseText = "11"
      - SparseTextOsd = "12"
      - RawLine = "13"
      - Count = "14"

    SPTesseract class also has the function below defined:

    List<Rectangle> GetPageSegmentedRegions(Page page, string iteratorLevel)

    Pass Page object from the SPTesseract.GetPage method with one of the PageIteratorLevel values:
    Tesseract.PageIteratorLevel
    
      - Block = "0"
      - Para = "1"
      - TextLine = "2"
      - Word = "3"
      - Symbol = "4"

    All other methods/properties are standard Tesseract 4.1.1 from:
     - https://www.nuget.org/packages/Tesseract/
     - https://github.com/charlesw/tesseract


  */
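    var tpage = SPTesseract.GetPage(memoryImage, lang, "3", "3");
    var ocrText = tpage.GetText();
    memoryImage.Dispose();
    //Release the Page once the text has been read (ReleasePage is available in plug-in 1.1)
    SPTesseract.ReleasePage();
    return ocrText;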
}
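
The engine mode, page segmentation mode and iterator level are all passed as numeric strings, which are easy to mistype. As a minimal sketch, the values listed in the comment block above can be kept in named lookup objects. The object names EngineMode, PageSegMode and PageIteratorLevel are only borrowed from the Tesseract enums for readability; the objects themselves are an illustrative convenience and are not provided by the plug-in.

```javascript
// Named constants for the string values documented in the comment block.
// These objects are a hypothetical convenience, not part of SPTesseract.
var EngineMode = Object.freeze({
    TesseractOnly: "0",
    LstmOnly: "1",
    TesseractAndLstm: "2",
    Default: "3"
});

// "Count" ("14") is left out on purpose: in Tesseract it marks the number
// of modes rather than a mode you would select.
var PageSegMode = Object.freeze({
    OsdOnly: "0",
    AutoOsd: "1",
    AutoOnly: "2",
    Auto: "3",
    SingleColumn: "4",
    SingleBlockVertText: "5",
    SingleBlock: "6",
    SingleLine: "7",
    SingleWord: "8",
    CircleWord: "9",
    SingleChar: "10",
    SparseText: "11",
    SparseTextOsd: "12",
    RawLine: "13"
});

var PageIteratorLevel = Object.freeze({
    Block: "0",
    Para: "1",
    TextLine: "2",
    Word: "3",
    Symbol: "4"
});
```

With these in the Load script, the GetPage call in the wrapper could read SPTesseract.GetPage(memoryImage, lang, EngineMode.Default, PageSegMode.Auto), which passes the same "3", "3" as before.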

Example action script to test recognition and show source rectangle:
Code:
//Use the square gesture, to draw a box around the desired area
//Shows the text recognized by Tesseract (English data file) and the dimensions of the rectangle

sp.MessageBox(`Text: ${GetTesseractText("eng", new Rectangle(action.Bounds.X, action.Bounds.Y, action.Bounds.Width, action.Bounds.Height))}
Rect.X: ${action.Bounds.X}
Rect.Y: ${action.Bounds.Y}
Rect.Width: ${action.Bounds.Width}
Rect.Height: ${action.Bounds.Height}`, 
"Text, Rect");

Edited by user Tuesday, August 3, 2021 2:53:42 PM(UTC)  | Reason: Updated download to version 1.1 - fixed memory leak

thanks 3 users thanked Rob for this useful post.
AppleBag on 4/17/2021(UTC), Matija on 6/24/2021(UTC), bjarkirafn on 7/6/2021(UTC)
Rob  
#2 Posted : Tuesday, August 3, 2021 3:06:49 PM(UTC)
IMPORTANT

Replace version 1.0 with the new download (1.1) linked in the original post. Version 1.0 had a huge memory leak: each OCR request created a new OCR engine and page and never disposed of them.

Existing scripts should still be compatible; the only difference is that, behind the scenes, new calls to GetPage will:
  • conditionally reuse the existing engine (if neither the language nor the engine mode has changed) or release it and create a new one
  • release the previous Page and create a new one


Also, there are additional calls available to the plugin:
  • void SetEngine(string lang, string engineMode)
  • void ReleaseEngine()
  • void ReleasePage()


None of these calls are required, but I would recommend calling ReleasePage after you're done extracting text.
If you're not using OCR calls frequently, you can also then call ReleaseEngine - but note that initializing the OCR engine takes some time, so if you release the engine after each call, scripts that use OCR will have a delay as a new engine is instantiated.
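
The release pattern described above could be sketched roughly as follows. This is only an illustration under a few assumptions: ocrAndRelease is a hypothetical helper name, and the plug-in class is passed in as the ocr parameter (in S+ that would be SPTesseract, or SPTesseract.SPTesseract on 0.5.0.0 and later) so that the same function works with either. It also assumes that calling ReleasePage is harmless when GetPage failed before creating a page, which I have not verified.

```javascript
// Hypothetical helper: read the text, then always release the Page, and
// optionally the engine too, even if recognition throws.
function ocrAndRelease(ocr, image, lang, releaseEngine) {
    try {
        // "3", "3" = EngineMode Default, PageSegMode Auto
        var page = ocr.GetPage(image, lang, "3", "3");
        return page.GetText();
    } finally {
        // Runs whether or not GetPage/GetText succeeded
        ocr.ReleasePage();
        if (releaseEngine) {
            // Frees the engine; the next OCR call pays the start-up delay again
            ocr.ReleaseEngine();
        }
    }
}
```

Passing false for releaseEngine keeps the engine alive for scripts that use OCR often; passing true suits scripts that only run OCR occasionally.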
Rob  
#3 Posted : Tuesday, August 3, 2021 4:37:33 PM(UTC)
Updated function, if you want to pass in a scaling multiplier, e.g. 2 to scale the captured image to 200% of its original size (double the width and height)
Code:
function GetTesseractText(lang, rect, scale) {

    //Capture an image from the screen using the Rectangle passed in

    var memoryImage = new System.Drawing.Bitmap(rect.Width, rect.Height);
    var memoryGraphics = System.Drawing.Graphics.FromImage(memoryImage);
    memoryGraphics.CopyFromScreen(rect.X, rect.Y, 0, 0, new System.Drawing.Size(rect.Width, rect.Height));
    memoryGraphics.Dispose();

    //Only scale when a positive multiplier was passed (skips undefined, null, 0 and NaN)
    if(scale > 0) {
        //Original Image attributes
        var originalWidth = memoryImage.Width;
        var originalHeight = memoryImage.Height;

        // now we can get the new height and width
        var newHeight = parseInt(originalHeight * scale);
        var newWidth = parseInt(originalWidth * scale);

        var scaledImage = new System.Drawing.Bitmap(newWidth, newHeight);
        var scaledGraphics = System.Drawing.Graphics.FromImage(scaledImage);

        scaledGraphics.InterpolationMode = System.Drawing.Drawing2D.InterpolationMode.HighQualityBicubic;
        scaledGraphics.SmoothingMode = System.Drawing.Drawing2D.SmoothingMode.HighQuality;
        scaledGraphics.PixelOffsetMode = System.Drawing.Drawing2D.PixelOffsetMode.HighQuality;
        scaledGraphics.CompositingQuality = System.Drawing.Drawing2D.CompositingQuality.HighQuality;

        scaledGraphics.Clear(Color.Transparent);
        scaledGraphics.DrawImage(memoryImage, 0, 0, newWidth, newHeight);
        scaledGraphics.Dispose();
        memoryImage.Dispose();
        memoryImage = scaledImage;
    }

    /*
    Get Tesseract Page object 
     
    First param is the Bitmap image
      - From code above

    Second param is language
      - See Plug-Ins\TesseractTrainedData folder for other trained data files
      - Appears to be just the first part of the file name; "eng" for English

    Third param is Tesseract.EngineMode enum value: 
      - TesseractOnly = "0"
      - LstmOnly = "1"
      - TesseractAndLstm = "2"
      - Default = "3"

    Last param is Tesseract.PageSegMode
      - OsdOnly = "0"
      - AutoOsd = "1"
      - AutoOnly = "2"
      - Auto = "3"
      - SingleColumn = "4"
      - SingleBlockVertText = "5"
      - SingleBlock = "6"
      - SingleLine = "7"
      - SingleWord = "8"
      - CircleWord = "9"
      - SingleChar = "10"
      - SparseText = "11"
      - SparseTextOsd = "12"
      - RawLine = "13"
      - Count = "14"

    SPTesseract class also has the function below defined:

    List<Rectangle> GetPageSegmentedRegions(Page page, string iteratorLevel)

    Pass Page object from the SPTesseract.GetPage method with one of the PageIteratorLevel values:
    Tesseract.PageIteratorLevel
    
      - Block = "0"
      - Para = "1"
      - TextLine = "2"
      - Word = "3"
      - Symbol = "4"

    All other methods/properties are standard Tesseract 4.1.1 from:
     - https://www.nuget.org/packages/Tesseract/
     - https://github.com/charlesw/tesseract


  */
    var tpage = SPTesseract.SPTesseract.GetPage(memoryImage, lang, "3", "3");
    var ocrText = tpage.GetText();
    memoryImage.Dispose();
    SPTesseract.SPTesseract.ReleasePage();
    return ocrText;
}


EDIT:
Added release page call
NOTE - the above script is intended for use in version 0.5.0.0 (beta as of this writing) or higher.
For < 0.5.0.0, just replace SPTesseract.SPTesseract with SPTesseract.
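
For anyone who wants to check the size arithmetic separately from the screen capture, the scaling step can be sketched as a small pure function. scaledSize is a hypothetical helper name, not something the plug-in or S+ provides, and it rounds to the nearest pixel where the script above truncates with parseInt.

```javascript
// Hypothetical helper: compute the target bitmap size for a scale multiplier.
function scaledSize(width, height, scale) {
    var s = Number(scale);
    // A missing, non-numeric, zero or negative scale means "no scaling"
    if (!(s > 0) || !isFinite(s)) {
        return { width: width, height: height };
    }
    // Round to whole pixels and never go below 1x1, since a bitmap
    // cannot be created with a zero dimension
    return {
        width: Math.max(1, Math.round(width * s)),
        height: Math.max(1, Math.round(height * s))
    };
}
```

For example, a 100 x 40 capture with a scale of 2 gives 200 x 80, and leaving scale out returns the original 100 x 40.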

Edited by user Tuesday, August 3, 2021 11:07:11 PM(UTC)  | Reason: Not specified
