Tesseract OCR
T-Plan Robot Enterprise v2.2 introduces support for Tesseract OCR v2,
which is considered to be the most accurate open source engine for
optical character recognition. T-Plan Robot Enterprise uses the engine
to recognize text displayed on the connected remote desktop. The engine
is exposed in the scripting language as a standard image comparison
method called "tocr" and can be employed through the standard CompareTo,
Screenshot and WaitFor commands.
The engine is not part of the T-Plan Robot Enterprise product and must
be downloaded and installed separately. T-Plan Robot Enterprise then
only needs to know its location, which can be configured in the
Tesseract OCR panel of the Preferences window. See the method
documentation for installation and configuration instructions.
Be aware that Tesseract's recognition capabilities are limited to the
most common fonts and languages, and T-Plan doesn't guarantee any
accuracy or compatibility with a particular test environment. As the
engine lacks support for layout analysis, it is accurate only on images
with minimal non-textual elements, and it is highly recommended to
restrict the recognition to a smaller area where the text may appear.
For example, to get the text of a button it is recommended to find its
corners using the image search method and then limit the OCR to the
button rectangle. When the engine is employed to recognize text on a
larger area with multiple GUI controls, or even against the whole
desktop, it gets distracted by the GUI and produces inaccurate results.
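The corner-based restriction described above comes down to simple rectangle arithmetic. The following is a minimal sketch in plain Python, not in the T-Plan scripting language; the function name, the Rect type and the sample coordinates are hypothetical and only illustrate how two matched corner positions could be turned into an area for the OCR.

```python
from typing import NamedTuple, Tuple


class Rect(NamedTuple):
    """Rectangle given by its top-left point and its size (hypothetical type)."""
    x: int
    y: int
    w: int
    h: int


def rect_from_corners(top_left: Tuple[int, int],
                      bottom_right: Tuple[int, int],
                      corner_size: Tuple[int, int]) -> Rect:
    """Build the rectangle spanned by two matched corner templates.

    top_left and bottom_right are the (x, y) positions where the two
    corner images were found; corner_size is the (width, height) of the
    bottom-right corner image, so that the rectangle includes it.
    """
    x1, y1 = top_left
    x2 = bottom_right[0] + corner_size[0]
    y2 = bottom_right[1] + corner_size[1]
    if x2 <= x1 or y2 <= y1:
        raise ValueError("corners do not span a valid rectangle")
    return Rect(x1, y1, x2 - x1, y2 - y1)


# Sample values only: corners found at (200, 300) and (272, 318),
# with an 8x6 pixel bottom-right corner image.
button = rect_from_corners((200, 300), (272, 318), (8, 6))
print(button)  # Rect(x=200, y=300, w=80, h=24)
```

The resulting rectangle is what would be handed to the OCR as the comparison area, so that only the button text is recognized and the surrounding GUI is left out.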
The following example shows how to employ the OCR engine to recognize
the date and time on the Windows task bar and take advantage of a
regular expression to test whether the current month is August.

Tesseract OCR demo on Windows 7
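The regular expression check from the demo can also be sketched outside of the tool. The snippet below is plain Python rather than the T-Plan scripting language, and it assumes that the recognized task bar text contains a US-style date such as "8/15/2011"; the sample text and the helper name are hypothetical, and the actual format depends on the regional settings of the tested Windows machine.

```python
import re

# Hypothetical OCR output for the task bar clock (time line, date line).
ocr_text = "4:27 PM\n8/15/2011"

# Month 8 (optionally zero-padded), followed by a day and a four-digit year.
AUGUST_DATE = re.compile(r"\b0?8/\d{1,2}/\d{4}\b")


def is_august(text: str) -> bool:
    """Return True if the text contains a month/day/year date in August."""
    return AUGUST_DATE.search(text) is not None


print(is_august(ocr_text))  # True
```

The word boundary at the start keeps dates such as "12/8/2011" (December 8th) from being mistaken for August, because the match must also cover the day and the year that follow the month.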