dpScreenOCR User Manual

Version 1.2

1 About

dpScreenOCR is a free and open-source program to recognize text on the screen. Powered by Tesseract, it supports more than 100 languages and can split independent text blocks, such as columns.

2 Installation

2.1 Installing dpScreenOCR

2.1.1 Unix-like systems

The dpScreenOCR website provides several options, including repositories for Debian, Ubuntu, and derivatives. If you don't find a suitable choice for your system, download the source code tarball, unpack it, and follow the instructions in the "doc/building-unix.txt" file.

2.1.2 Windows

The dpScreenOCR website provides an installer and a ZIP archive. The latter doesn't need installation: unpack it anywhere and run dpscreenocr.exe.

2.2 Installing languages

2.2.1 Unix-like systems

Use your package manager to install languages for Tesseract. Package names may vary across systems, but they usually start with "tesseract" and end with a language code or name. For example, the package for German is named "tesseract-ocr-deu" on Debian and Ubuntu, "tesseract-langpack-deu" on Fedora, and "tesseract-data-deu" on Arch Linux.

When searching for a language, be aware that some codes are not from ISO 639-3. In particular, "frk" is German Fraktur rather than Frankish. The Tesseract developers are aware of this and will probably fix the code in the future (see issues 68, 49, and 61); meanwhile, if "frk" is described as "Frankish" in your package manager, you can report the problem to the package maintainer.

There are also two special packs that provide extra features rather than languages: "osd" (automatic script and orientation detection) and "equ" (math and equation detection). dpScreenOCR doesn't use them.

2.2.2 Windows

dpScreenOCR for Windows comes with the English language pack. To install other languages, visit the languages page, download the ".traineddata" files you want, and place them in the C:\Users\(your name)\AppData\Local\dpscreenocr\tesseract_5_data folder. To quickly navigate to this folder, press Windows + R to open "Run" and paste %LOCALAPPDATA%\dpscreenocr\tesseract_5_data. You can also paste this path into the address bar of File Explorer.

If the "tesseract_5_data" folder does not exist in %LOCALAPPDATA%\dpscreenocr when you start dpScreenOCR, the program will copy it from the directory of the EXE file.

2.2.2.1 Migrating from version 1.0

When upgrading from version 1.0, the installer will automatically migrate languages from the old location, which was the "tessdata" folder in the directory of the EXE file. To be more specific, the contents of "tessdata" are copied to the "tesseract_5_data" folder in the directory of the EXE file, and then each user launching dpScreenOCR gets their own copy of "tesseract_5_data" in %LOCALAPPDATA%\dpscreenocr.

For the ZIP version, you need to manually copy languages from "tessdata" to %LOCALAPPDATA%\dpscreenocr\tesseract_5_data.
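
For the ZIP version, the manual copy can also be scripted. The following Python sketch copies every ".traineddata" file from a folder you name on the command line to %LOCALAPPDATA%\dpscreenocr\tesseract_5_data. The script name copy_languages.py and the function copy_languages are hypothetical, and the sketch assumes that the language packs are exactly the ".traineddata" files in the old folder.

```python
import os
import shutil
import sys
from pathlib import Path


def copy_languages(src_dir, dst_dir):
    """Copy every *.traineddata file from src_dir to dst_dir.

    Files that already exist in dst_dir are left untouched.
    Returns the names of the files that were copied.
    """
    src = Path(src_dir)
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for item in sorted(src.glob("*.traineddata")):
        target = dst / item.name
        if not target.exists():
            shutil.copy2(item, target)
            copied.append(item.name)
    return copied


if __name__ == "__main__":
    # Usage: python copy_languages.py <path to the old "tessdata" folder>
    if len(sys.argv) > 1 and "LOCALAPPDATA" in os.environ:
        new_dir = (Path(os.environ["LOCALAPPDATA"])
                   / "dpscreenocr" / "tesseract_5_data")
        for name in copy_languages(sys.argv[1], new_dir):
            print("Copied", name)
```

Note that the sketch creates the destination folder if it is missing. Because dpScreenOCR copies its bundled "tesseract_5_data" only when that folder does not exist yet, you may prefer to start the program once before running the script.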

3 Usage

dpScreenOCR is simple to use:

  1. Choose languages in the Main tab.
  2. Move the mouse pointer near the screen area containing text and press the hotkey shown in the Main tab to start the selection.
  3. Move the mouse so that the selection covers the text and press the hotkey again.

After these steps, dpScreenOCR will recognize the text from the selected area and process it according to the actions from the Actions tab.

3.1 Main tab

3.1.1 Status

The status describes the current state of dpScreenOCR. Green means the program is ready to use, and you can press the hotkey to start the selection. Yellow shows the progress of recognition. Red warns that the program needs some setup, and you will not be able to start the selection until the problem is fixed.

3.1.2 Character recognition

3.1.2.1 Split text blocks

If this option is enabled, dpScreenOCR will try to detect and split independent text blocks, such as columns. Otherwise, everything is treated as a contiguous block of text. This behavior is best described by the following picture, which shows a two-column text layout (A) recognized with (B) and without (C) the "Split text blocks" option:

3.1.2.2 Languages

This is the list of languages that dpScreenOCR can use to recognize text. You can choose more than one, but be aware that this may slow down recognition and reduce its accuracy.

Read the "Installing languages" section on how to install language packs.

3.1.3 Hotkey

The hotkey starts and ends the on-screen selection. To cancel the selection, press Escape.

The hotkey is global: it works even if the dpScreenOCR window is minimized. If pressing the hotkey has no effect, another program is probably already using it. In this case, try another key combination.

3.2 Actions tab

The Actions tab lets you choose what to do with the recognized text: copy it to the clipboard, add it to the history (see the History tab), or pass it as an argument to an executable.

3.2.1 Run executable

The "Run executable" action runs an executable with the recognized text as the first argument. The entry expects either an absolute path to the executable or, if the executable is located in one of the directories listed in your PATH environment variable, just its name.
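
As a minimal sketch of an executable that could be used with this action, the following Python script appends the text it receives as its first argument to a log file. The file name dpscreenocr_log.txt and the function name append_entry are hypothetical choices for illustration; they are not part of dpScreenOCR.

```python
#!/usr/bin/env python3
"""Append the text passed as the first argument to a log file."""
import sys
from datetime import datetime
from pathlib import Path

# Hypothetical location; change it to whatever suits you.
LOG_PATH = Path.home() / "dpscreenocr_log.txt"


def append_entry(log_path, text):
    """Append text to log_path, preceded by a timestamp line."""
    stamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(f"[{stamp}]\n{text}\n\n")


if __name__ == "__main__":
    # dpScreenOCR passes the recognized text as the first argument.
    if len(sys.argv) > 1:
        append_entry(LOG_PATH, sys.argv[1])
```

The whole text arrives as a single argument, including any line breaks, so the script does not need to join or re-split anything.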

3.2.1.1 Running scripts on Unix-like systems

Before using your script, make sure it starts with a proper shebang and that you have execute permission for it (run chmod u+x your_script).

Here is an example Unix shell script that translates the recognized text to your native language using Translate Shell and displays the translation as a desktop notification.

#!/bin/sh

# Quote the substitution so the whole translation is passed as one argument.
notify-send "Translation" "$(trans -b "$1")"

3.2.1.2 Running scripts on Windows

3.2.1.2.1 Batch files

dpScreenOCR doesn't run batch files (".bat" or ".cmd") because there's no way to safely pass them arbitrary text. Please use another scripting language instead.

3.2.1.2.2 Creating file associations

Before using a script, make sure that the file association is configured correctly so that you can launch the script just by its file name, without mentioning the interpreter explicitly. The simplest way to test this is to type the name of the script with some arguments in cmd.exe. If the script runs and receives all arguments, you can skip this section.
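
To make this check concrete, here is a minimal sketch of a test script in Python. The file name show_args.py and the function format_args are hypothetical. Save the script, change to its folder in cmd.exe, and type something like show_args.py one "two words". If the association forwards arguments correctly, the script lists both arguments after its own path.

```python
import sys


def format_args(argv):
    """Return one line per argument, numbered from 0 (the script path)."""
    return [f"argv[{i}] = {arg!r}" for i, arg in enumerate(argv)]


if __name__ == "__main__":
    print(f"Received {len(sys.argv) - 1} argument(s) besides the script path:")
    print("\n".join(format_args(sys.argv)))
```

If only argv[0] is listed, the association drops the arguments and needs to be fixed.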

We will use Python as an example, but the process is similar for other languages. Open cmd.exe as administrator and run assoc with the extension of the script file as an argument:

> assoc .py

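assoc prints the file type that the extension is mapped to (for example, Python.File). Pass that name to ftype to see the command that opens such files, and make sure the command ends with %*, which forwards all arguments to the script.

If the script still receives only one argument (the path to the script), this means that Windows actually uses a different association for the given extension and ignores the one set with assoc/ftype. To fix that, open regedit and make sure the values of the following keys use the correct path to the Python executable and end with %*: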

HKEY_CLASSES_ROOT\Applications\python.exe\shell\open\command
HKEY_CLASSES_ROOT\py_auto_file\shell\open\command

A tip for Python users: prefer an association that uses Python Launcher (py.exe) rather than a concrete Python executable (python.exe). This allows using shebang lines to select the Python version on a per-script basis. For more information, read Using Python on Windows.

3.2.1.2.3 Hiding console window

Most scripting language interpreters for Windows are shipped with a special version of the executable that doesn't show the console window. For example, it's pyw.exe for Python and wlua.exe for Lua.

A special file association is usually added during installation of the interpreter, so you can hide the console window by simply changing the extension of the script. For example, Python scripts with the ".pyw" extension are associated with pyw.exe instead of py.exe. Other languages can have their own conventions, like ".wlua" for Lua (wlua.exe). If such an association does not exist, create it manually as described in the previous section.

3.3 History tab

The History tab shows the history of the recognized texts. A text is only added here if the corresponding action is enabled in the Actions tab. Each text has a timestamp taken at the moment you finish the selection.

4 Tweaking

This section describes how to change some settings that are not available in the dpScreenOCR interface.

dpScreenOCR saves settings in settings.cfg. Depending on the platform, you can find it in the following directories:

Each line in settings.cfg contains an option as a key-value pair. A value is a string, which, depending on the option, represents a boolean (true or false), number (like 10 or -5), file path, etc.

The value can contain the following escape sequences:

Any other character preceded by \ is kept as is. To preserve leading spaces, escape the first one with \; to preserve trailing spaces, either escape the last one or put \ after it at the end of the line.
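
The rules for backslashes and edge spaces can be sketched as a small Python helper that prepares a value before you paste it into settings.cfg. The function name escape_value is hypothetical, and the sketch covers only what this paragraph states: it doubles backslashes (assuming that a backslash preceded by \ yields a single backslash) and escapes the first leading and the last trailing space. It does not handle characters that require one of the named escape sequences.

```python
def escape_value(value):
    """Escape a value so its backslashes and edge spaces survive parsing."""
    # Double each backslash so it is read back as one literal backslash.
    pieces = ["\\\\" if ch == "\\" else ch for ch in value]
    # Escape the first leading space and the last trailing space.
    if pieces and pieces[0] == " ":
        pieces[0] = "\\ "
    if pieces and pieces[-1] == " ":
        pieces[-1] = "\\ "
    return "".join(pieces)


if __name__ == "__main__":
    # Example: a value with a leading space and backslashes.
    print(escape_value(" C:\\tools\\run.exe"))
```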

To reset an option to the default value, remove it from settings.cfg; to reset all options, clear or delete the file. Be aware that dpScreenOCR rewrites settings on exit, so make sure you close the program before making changes.

Here is the list of options that can only be changed by editing the settings file:

5 Troubleshooting

This section contains the list of possible issues and their solutions. If the solution doesn't help, or you have an issue that is not listed here, please report the problem on the issue tracker. You can also contact the author by email; the link is at the bottom of the dpScreenOCR website.