What is the "OCR - Image Reader" add-on and how can I use it?
This extension provides a simple yet powerful in-page OCR tool, eliminating the need for a native application. It is user-friendly and makes it easy to extract text from images or from page content that cannot be selected or copied. To start using it, click the toolbar button to activate the selection mode, drag over the area of interest, and release the mouse button. The extension then captures that screen area and sends it to the OCR engine, which runs inside a frame element on the page. A popup window shows the progress of two steps: fetching the language training database from the server, and extracting the text content from the selected area. Each step has its own progress bar. Fetching the database takes some time on the first run, but later runs are faster because the browser caches it.
What's new in this version?
Please check the Logs section.
Does this extension use an online service for doing the text recognition?
No, the process of extracting text content from an image occurs locally. However, it's worth noting that this extension obtains the training data from a remote server since the database is approximately 30 megabytes and cannot be included in the extension. Apart from fetching the database, this extension does not interact with any remote services.
Is it possible to send a large image to the OCR engine using this extension?
It is possible, in theory, to send large images to the OCR engine. However, it may take a considerable amount of time for the extension to process the image, and it could require excessive CPU resources to extract the content. I recommend using the area selection tool to only select the necessary area instead of sending a large image with significant amounts of space around it.
What is the OCR engine of this extension?
This extension uses the well-known Tesseract.js OCR engine, which features online language training resources to maintain the most up-to-date database.
Tesseract.js is an open-source OCR (optical character recognition) engine based on the Tesseract project maintained by Google. Tesseract.js is written purely in JavaScript and runs in a browser or a Node.js environment.
Could you provide instructions for setting up a local testing server to use the OCR extension on local images? When I try to use this extension on these pages, I get the "Cannot access contents of the page" error.
Open the extension manager of your browser and make sure "Allow access to file URLs" (available on Chromium browsers) is enabled for this extension. Now try the OCR on the local tab one more time. You may still encounter a notification stating "Cannot access contents of the page. Extension manifest must request permission to access the respective host." Unfortunately, some browsers do not allow extensions to access the "FILE" scheme. To perform OCR, create a local server and access images using the "http://127.0.0.1/..." address instead. Having a local server to access your pages is similar to accessing web content and will enable the extension to function properly.
To set up a local testing server, you can refer to the documentation on "How do you set up a local testing server". This will guide you through the steps required for creating a local server and accessing your images using the HTTP scheme, enabling you to use your browser extensions on the local resources.
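As a rough illustration of such a local server, here is a minimal Node.js sketch that serves files from the current folder over HTTP. It is not part of the extension; the port (8080), the `--serve` flag, the file name `server.js`, and the helper names `resolvePath` and `createServer` are all assumptions made for this example.

```javascript
// Minimal static file server for local images (a sketch, not a hardened server).
const http = require('http');
const fs = require('fs');
const path = require('path');

// Assumed defaults: serve the current directory on port 8080.
const ROOT = process.cwd();
const PORT = 8080;

// A few common types; anything else is sent as a generic binary stream.
const MIME_TYPES = {
  '.png': 'image/png',
  '.jpg': 'image/jpeg',
  '.jpeg': 'image/jpeg',
  '.gif': 'image/gif',
  '.webp': 'image/webp',
  '.html': 'text/html'
};

// Map a request URL to a file path inside root, or return null when the
// URL is malformed or would point outside root.
function resolvePath(root, requestUrl) {
  let pathname;
  try {
    pathname = decodeURIComponent(new URL(requestUrl, 'http://127.0.0.1').pathname);
  } catch (err) {
    return null; // malformed URL or percent-encoding
  }
  if (pathname.includes('\0')) return null;
  const filePath = path.join(root, pathname);
  const relative = path.relative(root, filePath);
  const escapes = relative === '..' ||
    relative.startsWith('..' + path.sep) ||
    path.isAbsolute(relative);
  return escapes ? null : filePath;
}

function createServer(root) {
  return http.createServer((req, res) => {
    const filePath = resolvePath(root, req.url);
    if (!filePath) {
      res.writeHead(403);
      res.end('Forbidden');
      return;
    }
    fs.readFile(filePath, (err, data) => {
      if (err) {
        res.writeHead(404);
        res.end('Not found');
        return;
      }
      const ext = path.extname(filePath).toLowerCase();
      res.writeHead(200, { 'Content-Type': MIME_TYPES[ext] || 'application/octet-stream' });
      res.end(data);
    });
  });
}

// Start listening only when asked to, e.g. `node server.js --serve`.
if (process.argv.includes('--serve')) {
  createServer(ROOT).listen(PORT, '127.0.0.1', () => {
    console.log(`Serving ${ROOT} at http://127.0.0.1:${PORT}/`);
  });
}
```

Saved as `server.js` in the folder that holds the images and started with `node server.js --serve`, an image named `scan.png` would then be reachable at http://127.0.0.1:8080/scan.png, where the extension can access it like any other web page.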
Is it possible for the OCR extension to detect the language of an image and use the appropriate training database? I often work with documents in various languages, and such a feature would be beneficial.
The extension allows language auto-detection starting from version 0.2.3. It performs OCR on the image in three different languages (English, Arabic, and Japanese) when this mode is activated. Then, it utilizes the Compact Language Detector (CLD) algorithm to determine the language of the extracted text. Once it identifies the language successfully, the extension uses the corresponding OCR engine to perform the actual OCR. Keep in mind that the first detection might take some time since the extension has to fetch the training database for several languages.
Is it possible to post the OCR result to a server?
As of version 0.2.4, you can define a custom server to post the result. A use case is to copy the data to a local text file without a manual copy and paste process. To configure the server, use Shift + click on the "Post Result" button. General Format:
[GET|POST|PUT]|URL|[POST|PUT body]
Post Example:
POST|http://127.0.0.1:8080|&content;
POST|http://127.0.0.1:8080|{"body":"&content;"}
Put Example:
PUT|http://127.0.0.1:8080|&content;
Get Example:
GET|http://127.0.0.1:8080?data=&content;|
Open in a Browser Tab Example:
OPEN|http://127.0.0.1:8080?data=&content;|
The "OPEN" command can be used, for instance, to search for the extracted content in a new browser tab or to send the content to a website that needs user interaction. The &content; keyword in the URL part is replaced with the actual result, encoded with encodeURIComponent, while the &content; keyword in the body section is replaced with the result as-is, without encoding. You can have one instance of the &content; keyword in the URL and one instance in the body part. You can write the server code in any language, such as Python, PHP, or JavaScript. Here is a sample written in JavaScript for Node.js. Alter the code to fit your needs:
const http = require('http');
const server = http.createServer(function(req, res) {
  // Optionally allow cross-origin requests:
  // res.setHeader('Access-Control-Allow-Origin', '*');
  console.log('Request URL: ' + req.url);
  console.log('Request method: ' + req.method);
  if (req.method === 'GET') {
    // GET requests carry the result in the URL, so there is no body to read
    res.end();
  }
  else {
    // POST and PUT requests carry the result in the body, which arrives in chunks
    req.on('data', chunk => {
      console.log('Chunk:', chunk.toString());
    });
    req.on('end', () => {
      res.end();
    });
  }
});
server.listen(8080);
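To make the command format and the keyword substitution described above more concrete, here is a small sketch of how a command string could be split and filled in. It illustrates the documented behavior only and is not the extension's actual code. The function name `expandCommand` is hypothetical, and applying the same encoding rule to `&href;` as to `&content;` is an assumption. The sketch replaces every occurrence of a keyword, although the text above only promises one instance per part.

```javascript
// Expand a "Post Result" command such as "POST|http://127.0.0.1:8080|&content;".
// Hypothetical helper: the extension's real parser may differ.
function expandCommand(command, { content = '', href = '' } = {}) {
  // Split on the first two "|" only, so the body may itself contain "|".
  const first = command.indexOf('|');
  const second = command.indexOf('|', first + 1);
  if (first === -1 || second === -1) {
    throw new Error('Expected the format METHOD|URL|BODY');
  }
  const method = command.slice(0, first).toUpperCase();
  const url = command.slice(first + 1, second);
  const body = command.slice(second + 1);

  const values = { content, href };
  // A replacer function is used so that "$" in the OCR text is inserted literally.
  const fill = (template, encode) =>
    template.replace(/&(content|href);/g, (match, key) => encode(values[key]));

  return {
    method,
    url: fill(url, encodeURIComponent), // URL part: percent-encoded
    body: fill(body, text => text)      // body part: inserted without encoding
  };
}

// Usage: build the URL that an OPEN command would load.
const request = expandCommand(
  'OPEN|https://translate.google.com/?sl=auto&tl=en&text=&content;|',
  { content: 'Guten Tag' }
);
console.log(request.method); // OPEN
console.log(request.url);    // https://translate.google.com/?sl=auto&tl=en&text=Guten%20Tag
```

Encoding only the URL part keeps query strings valid, while leaving the body untouched lets the server receive the text exactly as it was recognized.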
Supported keywords:

Keyword | Description |
---|---|
&content; | OCR result |
&href; | Document URL |

[Version 0.2.7] Is it possible to close all result panels at once?
Yes, it is possible to close all result panels together by using the Shift key while pressing the "Close" button. This action closes all open floating windows on the current page, saving you time and effort if you have many panels open.
Can I translate the extracted text from an image using this extension?
You can use the "Post Result" button to send the extracted text to a translation service. For instance, to send the text to Google Translate and translate from any language into English, use:
OPEN|https://translate.google.com/?sl=auto&tl=en&text=&content;|
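To send the text to DeepL, use the command below. The "en/de" part of the URL selects the source and target languages (here, English to German), so adjust it to the language pair you need:
OPEN|https://www.deepl.com/translator#en/de/&content;|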
Is it possible that my text highlighter or text-to-speech extensions have access to the content extracted by this extension?
The interface of each extension is isolated, so other extensions cannot directly access the content extracted by this extension. However, you can configure the "Post Result" button with an "OPEN" command to open the extracted content in a new web page that other extensions can access. For example:
OPEN|https://webbrowsertools.com/simple-text-editor/?content=&content;|
How do I delete the cached training database used by this extension?
Starting from version 0.3.2, you can delete the cached training database by pressing Ctrl+Click or Command+Click on the "close" button of an OCR result box. This action removes all the cached training storage, and the extension will fetch a fresh copy on the next detection request.
Can I activate the OCR mode using a keyboard shortcut instead of clicking the action button?
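Yes, to set up a keyboard shortcut for this extension, follow these steps:
Chromium Browser (Google Chrome): open chrome://extensions/shortcuts, find this extension in the list, and assign a key combination to its action.
Firefox: open about:addons, click the gear icon, choose "Manage Extension Shortcuts", and assign a key combination to this extension.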
I have scans of several documents saved in a folder. Can I use this extension to perform OCR on them?
Starting from version 0.4.1, the extension allows you to drop multiple image files into its interface. It will OCR each image one by one and automatically move on to the next. This enables you to OCR several documents with just a single action. Additionally, you can automate the storage of OCR results. By setting up a local server to handle the results and configuring the extension to send the OCR data to this server, you can easily store the results in a database with a simple drag-and-drop operation.
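As one possible way to automate the storage step, the sketch below is a small Node.js server that appends every posted OCR result to a text file. It assumes the extension is configured with `POST|http://127.0.0.1:8080|&content;`. The output file name (`ocr-results.txt`), the `--serve` flag, and the helper names `formatEntry` and `createServer` are assumptions made for this example. Whether the `Access-Control-Allow-Origin` header is needed depends on the browser, so treat that line as optional.

```javascript
// Append each OCR result received over HTTP to a local text file (a sketch).
const http = require('http');
const fs = require('fs');

// Assumed defaults; change them to fit your setup.
const OUTPUT_FILE = 'ocr-results.txt';
const PORT = 8080;

// Format one OCR result as a timestamped record.
function formatEntry(text, date = new Date()) {
  return `--- ${date.toISOString()} ---\n${text.trim()}\n\n`;
}

function createServer(outputFile) {
  return http.createServer((req, res) => {
    // May not be required; the sample server in this FAQ keeps it commented out.
    res.setHeader('Access-Control-Allow-Origin', '*');
    if (req.method !== 'POST' && req.method !== 'PUT') {
      res.writeHead(405);
      res.end();
      return;
    }
    // Collect the body chunks, then decode them once as UTF-8.
    const chunks = [];
    req.on('data', chunk => chunks.push(chunk));
    req.on('end', () => {
      const text = Buffer.concat(chunks).toString('utf8');
      fs.appendFile(outputFile, formatEntry(text), err => {
        res.writeHead(err ? 500 : 200);
        res.end();
      });
    });
  });
}

// Start listening only when asked to, e.g. `node store.js --serve`.
if (process.argv.includes('--serve')) {
  createServer(OUTPUT_FILE).listen(PORT, '127.0.0.1', () => {
    console.log(`Appending OCR results to ${OUTPUT_FILE}`);
  });
}
```

With this server running, dropping a batch of images into the extension and posting each result would add one timestamped record per image to the file. Writing to a real database instead would only require replacing the `fs.appendFile` call.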
Permission | Description |
---|---|
storage | to keep the internal preferences |
activeTab | to inject area select script into the active page after a user action |
notifications | to display possible warnings during the OCR process |