FAQs

What is the "OCR - Image Reader" add-on and how can I use it?
This extension is a simple yet powerful in-page OCR tool that removes the need for a native application. It lets you extract text from images or from page content that cannot be selected or copied. To start using it, click the toolbar button to activate the selection mode, drag over the area of interest, and release the mouse button. The extension then captures the selected screen area and sends it to the OCR engine, which runs inside a frame element on the page. A popup window shows the progress of two steps, each with its own progress bar: fetching the language training database from the server, and extracting the text content from the selected area. Fetching the database takes some time on the first run, but later runs are faster because the browser caches it.
What's new in this version?
Please check the Logs section.
Does this extension use an online service for doing the text recognition?
No, the process of extracting text content from an image occurs locally. However, it's worth noting that this extension obtains the training data from a remote server since the database is approximately 30 megabytes and cannot be included in the extension. Apart from fetching the database, this extension does not interact with any remote services.
Is it possible to send a large image to the OCR engine using this extension?
It is possible, in theory, to send large images to the OCR engine. However, it may take a considerable amount of time for the extension to process the image, and it could require excessive CPU resources to extract the content. I recommend using the area selection tool to only select the necessary area instead of sending a large image with significant amounts of space around it.
What is the OCR engine of this extension?
This extension uses the well-known Tesseract.js OCR engine, which features online language training resources to maintain the most up-to-date database.
Tesseract.js is an open-source OCR (optical character recognition) engine based on the Tesseract project, whose development was sponsored by Google for many years. Tesseract.js is a JavaScript port of Tesseract (the core engine is compiled to WebAssembly) and runs in a browser or a Node.js environment.
Could you provide instructions for setting up a local testing server to use the OCR extension on local images? When I try to use this extension on these pages, I get the "Cannot access contents of the page" error.
Open the extension manager of your browser and make sure "Allow access to file URLs" (available on Chromium browsers) is enabled for this extension. Now try the OCR on the local tab one more time. You may still encounter a notification stating "Cannot access contents of the page. Extension manifest must request permission to access the respective host." Unfortunately, some browsers do not allow extensions to access the "FILE" scheme. To perform OCR, create a local server and access images using the "http://127.0.0.1/..." address instead. Having a local server to access your pages is similar to accessing web content and will enable the extension to function properly.
To set up a local testing server, you can refer to the documentation on "How do you set up a local testing server". This will guide you through the steps required for creating a local server and accessing your images using the HTTP scheme, enabling you to use your browser extensions on the local resources.
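As one possible way to do this with Node.js, the sketch below serves a folder of images over HTTP. It is a minimal illustration and not part of the extension: the function names (resolveRequestPath, createStaticServer), the content-type table, and the folder and port used afterwards are all assumptions chosen for the example.

```javascript
const http = require('http');
const fs = require('fs');
const path = require('path');

// Small content-type table for common image formats (extend as needed)
const TYPES = {
  '.png': 'image/png',
  '.jpg': 'image/jpeg',
  '.jpeg': 'image/jpeg',
  '.gif': 'image/gif',
  '.webp': 'image/webp',
  '.html': 'text/html'
};

// Hypothetical helper: map a request URL to a file inside "root".
// Returns null when the URL is malformed or points outside "root".
function resolveRequestPath(root, requestUrl) {
  let pathname;
  try {
    pathname = decodeURIComponent(new URL(requestUrl, 'http://127.0.0.1').pathname);
  }
  catch (e) {
    return null;
  }
  const absoluteRoot = path.resolve(root);
  const filePath = path.resolve(absoluteRoot, '.' + pathname);
  const inside = filePath === absoluteRoot || filePath.startsWith(absoluteRoot + path.sep);
  return inside ? filePath : null;
}

// Hypothetical helper: create (but do not start) a static file server for "root"
function createStaticServer(root) {
  return http.createServer((req, res) => {
    const filePath = resolveRequestPath(root, req.url);
    if (!filePath) {
      res.writeHead(403);
      return res.end('Forbidden');
    }
    fs.readFile(filePath, (err, data) => {
      if (err) {
        res.writeHead(404);
        return res.end('Not found');
      }
      const type = TYPES[path.extname(filePath).toLowerCase()] || 'application/octet-stream';
      res.writeHead(200, {'Content-Type': type});
      res.end(data);
    });
  });
}
```

Assuming your scans are in a folder named "scans", calling createStaticServer('./scans').listen(8000, '127.0.0.1') would make a file such as "scans/page1.png" available at "http://127.0.0.1:8000/page1.png", where the extension can access it like any other web page.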
Is it possible for the OCR extension to detect the language of an image and use the appropriate training database? I often work with documents in various languages, and such a feature would be beneficial.
The extension allows language auto-detection starting from version 0.2.3. It performs OCR on the image in three different languages (English, Arabic, and Japanese) when this mode is activated. Then, it utilizes the Compact Language Detector (CLD) algorithm to determine the language of the extracted text. Once it identifies the language successfully, the extension uses the corresponding OCR engine to perform the actual OCR. Keep in mind that the first detection might take some time since the extension has to fetch the training database for several languages.
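The flow described above (probe in a few languages, detect the language of the extracted text, then run the real OCR) can be sketched roughly as follows. This is only an illustration under stated assumptions and not the extension's actual code: the "recognize" and "detectLanguage" callbacks are hypothetical stand-ins for the OCR engine and the language detector, the detector is assumed to return a two-letter code with a confidence value, and the fallback to English is an arbitrary choice.

```javascript
// Tesseract language codes used for the three probe runs
const PROBE_LANGUAGES = ['eng', 'ara', 'jpn'];

// Assumed mapping from two-letter detector codes to Tesseract language codes
const DETECTOR_TO_TESSERACT = {en: 'eng', ar: 'ara', ja: 'jpn', de: 'deu', fr: 'fra'};

// Hypothetical helper:
//   recognize(image, lang) resolves to the extracted text
//   detectLanguage(text) resolves to {code, confidence} or null
async function autoDetectAndRecognize(image, {recognize, detectLanguage}) {
  let best = null;
  for (const lang of PROBE_LANGUAGES) {
    // Probe run: the text may be poor, but usually good enough for detection
    const text = await recognize(image, lang);
    const guess = await detectLanguage(text);
    if (guess && (!best || guess.confidence > best.confidence)) {
      best = guess;
    }
  }
  // Fall back to English when detection fails or the code is unknown
  const lang = (best && DETECTOR_TO_TESSERACT[best.code]) || 'eng';
  // Final run with the training data of the detected language
  return {lang, text: await recognize(image, lang)};
}
```

Each probe run needs its own training database, which is why the first detection is noticeably slower than later ones.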
Is it possible to post the OCR result to a server?
As of version 0.2.4, you can define a custom server to post the result. A use case is to copy the data to a local text file without a manual copy and paste process. To configure the server, use Shift + click on the "Post Result" button. General Format:
```
[GET|POST|PUT]|URL|[POST|PUT body]
```
Post Example:
```
POST|http://127.0.0.1:8080|&content;
POST|http://127.0.0.1:8080|{"body":"&content;"}
```
Put Example:
```
PUT|http://127.0.0.1:8080|&content;
```
Get Example:
```
GET|http://127.0.0.1:8080?data=&content;|
```
Open in a Browser Tab Example:
```
OPEN|http://127.0.0.1:8080?data=&content;|
```
The "OPEN" command can be used, for instance, to search for the extracted content in a new browser tab or to send the content to a website that needs user interaction. The &content; keyword in the URL part is replaced with the actual result and encoded with encodeURIComponent, whereas the &content; keyword in the body section is inserted as-is, without encoding. You can have one instance of the &content; keyword in the URL and one instance in the body part. You can write the server code in any language such as Python, PHP, or JavaScript. Here you can find sample code written in JavaScript. This code is meant for Node.js. Alter the code to fit your needs:
```
const http = require('http');

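// Minimal receiver for the requests sent by the "Post Result" button
const server = http.createServer(function(req, res) {
  // Uncomment if the browser blocks the request because of CORS
  // res.setHeader('Access-Control-Allow-Origin', '*');

  console.log('Request URL: ' + req.url);
  console.log('Request method: ' + req.method);

  if (req.method === 'GET') {
    // GET requests carry the OCR result in the URL, so there is no body to read
    res.end();
  }
  else {
    // POST and PUT requests carry the OCR result in the body,
    // which may arrive in several chunks
    const chunks = [];
    req.on('data', chunk => {
      chunks.push(chunk);
    });
    req.on('end', () => {
      console.log('Body:', Buffer.concat(chunks).toString('utf8'));
      res.end();
    });
  }
});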

server.listen(8080);
```
Supported keywords:
- &content; OCR result
- &href; Document URL
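To make the template format concrete, the following sketch shows how a "METHOD|URL|BODY" rule could be expanded into a request description. It is a hypothetical helper written for illustration, not the extension's implementation: the function name expandTemplate and its return shape are assumptions, and it assumes &href; is treated the same way as &content; (encoded in the URL, inserted as-is in the body).

```javascript
// Hypothetical helper: expand a "METHOD|URL|BODY" rule using the OCR values
function expandTemplate(template, values) {
  // Split on the first two separators only, so the body may contain "|"
  const first = template.indexOf('|');
  const second = template.indexOf('|', first + 1);
  if (first === -1 || second === -1) {
    throw new Error('Expected the format METHOD|URL|BODY');
  }
  const method = template.slice(0, first).toUpperCase();
  const url = template.slice(first + 1, second);
  const body = template.slice(second + 1);

  // Replace the keywords in a single pass; a callback is used so that
  // "$" characters in the OCR result are not treated as replacement patterns
  const fill = (text, encode) => text.replace(/&(content|href);/g, (match, key) => {
    const value = String(values[key]);
    return encode ? encodeURIComponent(value) : value;
  });

  return {
    method,
    url: fill(url, true),    // URL part: encoded with encodeURIComponent
    body: fill(body, false)  // body part: inserted without encoding
  };
}
```

For example, expanding the GET rule shown earlier with the OCR result "a b&c" yields the URL "http://127.0.0.1:8080?data=a%20b%26c" and an empty body, while the same result in a POST body stays "a b&c".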
[Version 0.2.7] Is it possible to close all result panels at once?
Yes, it is possible to close all result panels together by using the Shift key while pressing the "Close" button. This action closes all open floating windows on the current page, saving you time and effort if you have many panels open.

Can I translate the extracted text from an image using this extension?

You can use the "Post Result" button to send the extracted text to a translator service. For instance, to send the text to Google Translate and translate any language into English use:

OPEN|https://translate.google.com/?sl=auto&tl=en&text=&content;|

To send the text to DeepL, use the following (this example translates from English into German; change the language codes in the URL as needed):
OPEN|https://www.deepl.com/translator#en/de/&content;|

Is it possible that my text highlighter or text-to-speech extensions have access to the content extracted by this extension?
The interface of each extension is isolated, so other extensions cannot directly access the content extracted by this extension. However, you can use the "OPEN" command of the "Post Result" button to open the extracted content in a new web page that other extensions can access:
OPEN|https://webbrowsertools.com/simple-text-editor/?content=&content;|
How do I delete the cached training database used by this extension?
Starting from version 0.3.2, you can delete the cached training database by pressing Ctrl+Click or Command+Click on the "close" button of an OCR result box. This action removes all the cached training storage, and the extension will fetch a fresh copy on the next detection request.
Can I activate the OCR mode using a keyboard shortcut instead of clicking the action button?
Yes, to set up a keyboard shortcut for this extension, follow these steps:
Chromium Browser (Google Chrome):
- Open a new tab in your Chromium browser.
- In the address bar, type "chrome://extensions/shortcuts" and press Enter.
- Locate this extension.
- Assign a keyboard shortcut of your choice to activate the OCR function.
Firefox:
- Open a new tab in your Firefox browser.
- In the address bar, type "about:addons" and press Enter.
- On the left sidebar, click on "Extensions."
- Click the gear icon at the top of the page and select "Manage Extension Shortcuts."
- Find this extension and assign a preferred shortcut key combination for OCR activation.
I have scans of several documents saved in a folder. Can I use this extension to perform OCR on them?
Starting from version 0.4.1, the extension allows you to drop multiple image files into its interface. It will OCR each image one by one and automatically move on to the next. This enables you to OCR several documents with just a single action. Additionally, you can automate the storage of OCR results. By setting up a local server to handle the results and configuring the extension to send the OCR data to this server, you can easily store the results in a database with a simple drag-and-drop operation.
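As a rough illustration of the storage step, the sketch below is a variant of the Node.js sample from the "Post Result" answer that appends every posted result to a text file instead of a database. The function names (formatEntry, createResultServer), the entry layout, and the file name used afterwards are assumptions made for this example.

```javascript
const http = require('http');
const fs = require('fs');

// Hypothetical helper: format one OCR result with a timestamp header
function formatEntry(text, date) {
  return '--- ' + date.toISOString() + ' ---\n' + text.trim() + '\n\n';
}

// Hypothetical helper: create (but do not start) a server that appends
// the body of every POST or PUT request to "filePath"
function createResultServer(filePath) {
  return http.createServer((req, res) => {
    // May be needed if the browser applies CORS checks to the request
    res.setHeader('Access-Control-Allow-Origin', '*');

    if (req.method !== 'POST' && req.method !== 'PUT') {
      res.statusCode = 405;
      return res.end();
    }
    // The body may arrive in several chunks
    const chunks = [];
    req.on('data', chunk => chunks.push(chunk));
    req.on('end', () => {
      const text = Buffer.concat(chunks).toString('utf8');
      fs.appendFile(filePath, formatEntry(text, new Date()), err => {
        res.statusCode = err ? 500 : 200;
        res.end();
      });
    });
  });
}
```

With the server started through createResultServer('ocr-results.txt').listen(8080) and the "Post Result" button configured as POST|http://127.0.0.1:8080|&content;, each posted result would be appended to "ocr-results.txt" in the order it arrives.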
Does this extension have daily or usage limits?
No. All OCR processing happens entirely in your browser, with no server-side interaction. Because of this, there are no usage limits. The extension does not track your activity and includes no mechanisms to monitor or restrict usage.
I have a document that includes structured data such as tables, mathematical equations, or formatted text. The default engine in this extension doesn't parse it correctly. Is there any fix?
The default engine, Tesseract, only extracts plain text from images and doesn't preserve formatting, layout, or math. However, there's an experimental engine called Granite-Docling, available in the “accuracy” selector.
Granite-Docling is far more advanced—it's a vision-to-sequence transformer that understands both the visual layout and semantic structure of a document. Unlike Tesseract, it can accurately interpret headings, lists, styles (bold, italics), tables, and mathematical equations, and output clean, structured results in formats like Markdown, HTML, or LaTeX.
Note:
- This new engine runs on WebGPU and is powered by a large language model (LLM). It's still experimental and may be unstable or buggy — use with caution.
- If it becomes unresponsive, you can close the OCR iframe to stop the process, or use the “accuracy” selector to switch back to the Tesseract engine.

Permission	Description
storage	to keep the internal preferences
activeTab	to inject area select script into the active page after a user action
notifications	to display possible warnings during the OCR process
