Creating an Optical Character Recognition Application (OCR)
  • 08 Feb 2024
  • 3 Minutes to read

34 STEPS

1. Select Character Recognition from the Application Builder. The builder offers many application types; for this tutorial, choose Character Recognition under Vision AI Applications.

Step 1 image

2. Select the Name text field to enter a name for the OCR (Optical Character Recognition) application.

Step 2 image

3. Type the name of the OCR application. Choose a meaningful name that directly relates to the application's purpose. In the current scenario we named it 'automobile-plates'.

Step 3 image

4. Click the blue Next button to proceed to the Input Subjects page.

Step 4 image

5. Click the blue Next button to proceed to the Input Subjects page.

Step 5 image

6.

Step 6 image

7. Type an Input Subject name. Again, choose a name that will help you identify the Input Subject in the future. We have chosen 'automobiles' for the current scenario.

Step 7 image

8. Click Create a new subject. In the current scenario, the new subject is named 'automobiles'.

Step 8 image

9. Click the blue Next button at the bottom of the page to proceed to the Output Subjects page.

Step 9 image

10. The Output Subject of the OCR application is where you will find the labeled and sorted images. The Output Subject is usually populated with media (images/videos) after the application has been trained successfully.

Step 10 image

11. Type an Output Subject name. Again, choose the name carefully: it should relate directly to the subject of interest shown in the images. We have chosen 'automobiles' for the current scenario.

Step 11 image

12. Click Create a new subject. In the current scenario, the new subject is named 'automobiles'.

Step 12 image

13. Click Next

Step 13 image

14. Click Character Set. Here you can add more characters that may appear in the images. You can also add Donated Subjects.

Step 14 image

15. Scroll down and click Complete

Step 15 image

16. Select the Input Subject, which is always displayed at the beginning of the Application Pipelines. In the current scenario, select 'automobiles'.

Step 16 image

17. Click Upload Media to add the initial dataset to the Input Subject. Once the media has been uploaded to the Input Subject, it becomes available for the user to train the application.

Step 17 image

18. Click select files or select folder to upload the initial dataset of images/videos to the Input Subject of the application.

Step 18 image

19. Click Next

Step 19 image

20. Click Upload

Step 20 image

21. Click Replays

Step 21 image

22. Click automobiles

Step 22 image

23. Click highlight

Step 23 image

24. With your mouse, click and drag highlight, then drop it on highlight

Step 24 image

24b. Drop

Step 24b image

25. Click highlight

Step 25 image

26. With your mouse, click and drag highlight, then drop it on highlight

Step 26 image

26b. Drop

Step 26b image

27. Scroll down and click Force Feedback

Step 27 image

28. Click Replay Subjects

Step 28 image

29. Click No Feedback

Step 29 image

30. Select the text field

Step 30 image

31. Type the characters shown in the image.

Step 31 image

32. Select the blue Confirm button to submit the feedback for the image.

Step 32 image

33. While providing feedback, the Back, Sideline, and Skip options are also available.

Step 33 image

34. That's it. You're done.

Step 34 image

Here's an interactive tutorial

** Best experienced in Full Screen (click the icon in the top right corner before you begin) **

https://www.iorad.com/player/2274890/Cogniac---How-to-create-an-OCR-app-and-train-it

I. When to use the app / what the app does

Optical character recognition (OCR) is a computer vision technique that can extract text from images and videos. The OCR app identifies the individual characters in an image and maps them to their corresponding characters in a known alphabet.
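The mapping from recognized shapes to characters in a known alphabet can be pictured with a small sketch. It assumes a model has already produced one score per alphabet character for each character position; the `ALPHABET` string, the `decode` function, and the toy scores are hypothetical illustrations, not part of the Cogniac application.

```python
# Hypothetical alphabet; a real application's character set is configured in the app.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"


def decode(scores_per_position, alphabet=ALPHABET):
    """Pick the highest-scoring alphabet character at each position."""
    chars = []
    for scores in scores_per_position:
        if len(scores) != len(alphabet):
            raise ValueError("each score row needs one entry per alphabet character")
        # Index of the largest score in this row.
        best_index = max(range(len(scores)), key=scores.__getitem__)
        chars.append(alphabet[best_index])
    return "".join(chars)


def one_hot(char, alphabet=ALPHABET):
    """Build a toy score row that strongly favors `char` (illustration only)."""
    return [0.9 if c == char else 0.1 / (len(alphabet) - 1) for c in alphabet]


if __name__ == "__main__":
    toy_scores = [one_hot(c) for c in "CA123"]
    print(decode(toy_scores))
```

A character that is missing from the alphabet can never appear in the output, which is why the character set should cover every symbol expected in the images.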

II. Use cases / where the app can be used in real life:

  • Reading license plates - The OCR app can be used with speed cameras to detect license plates and convert the visible characters into text.

  • Digitizing paper-based documents - The OCR app can convert scanned documents, such as invoices, contracts, and medical records, into editable and searchable digital formats, saving time and improving efficiency for businesses and organizations.

  • Automating document processing - The OCR app can automate document processing tasks such as extracting data from forms or receipts. This helps reduce errors and improves the accuracy and speed of data entry.
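Once text has been extracted, downstream code typically cleans it up before use. The following is a minimal sketch of such post-processing for two of the use cases above, assuming the OCR output is available as a plain string. The function names, the sample strings, and the regular expression are illustrative assumptions, and the amount pattern assumes totals written without thousands separators.

```python
import re


def normalize_plate(raw_text):
    """Uppercase the text and keep only ASCII letters and digits."""
    return re.sub(r"[^A-Z0-9]", "", raw_text.upper())


# Matches the standalone word "total" followed by an amount such as 12.50.
# Assumption: amounts use a dot as the decimal mark and no thousands separator.
TOTAL_PATTERN = re.compile(
    r"\btotal\b\s*[:=]?\s*\$?\s*(\d+(?:\.\d{1,2})?)", re.IGNORECASE
)


def extract_total(receipt_text):
    """Return the first amount following the word 'total', or None if absent."""
    match = TOTAL_PATTERN.search(receipt_text)
    if match is None:
        return None
    return float(match.group(1))


if __name__ == "__main__":
    print(normalize_plate(" ca-1234 bt "))
    print(extract_total("Subtotal: 10.00\nTotal: 12.50"))
```

The word boundary in the pattern keeps "Subtotal" from being mistaken for "Total"; real forms usually need layout-specific rules on top of this.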

III. What could cause issues while using the app:

  • OCR apps are highly reliant on the quality of the image being processed. If the image is blurry or noisy, the OCR app may struggle to identify the individual characters accurately. This can lead to errors in the output generated by the app.

  • Capturing media from various angles can cause significant issues for the OCR app when recognizing characters accurately. To ensure that the OCR app functions optimally, capturing media from the same angle is essential.

  • OCR struggles with handwritten documents, especially if the writing is unclear.
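Because image quality matters so much, it can help to screen out blurry images before uploading them. One common heuristic is the variance of the Laplacian: sharp images have strong edges and a high variance, while blurry ones score low. The sketch below is pure Python and assumes the image is already a grayscale grid (a list of rows of pixel intensities); the function names and the default threshold are assumptions that would need tuning on real data, and this check is not part of the Cogniac application.

```python
def laplacian_variance(gray):
    """Variance of the 4-neighbor Laplacian over the interior pixels of a 2D grid."""
    height = len(gray)
    width = len(gray[0]) if gray else 0
    if height < 3 or width < 3:
        raise ValueError("image must be at least 3x3 pixels")
    responses = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Sum of the four direct neighbors minus four times the center pixel.
            responses.append(
                gray[y - 1][x] + gray[y + 1][x] + gray[y][x - 1] + gray[y][x + 1]
                - 4 * gray[y][x]
            )
    mean = sum(responses) / len(responses)
    return sum((r - mean) ** 2 for r in responses) / len(responses)


def looks_blurry(gray, threshold=100.0):
    """Flag an image as blurry when its Laplacian variance falls below the threshold.

    The default threshold is an arbitrary placeholder, not a recommended value.
    """
    return laplacian_variance(gray) < threshold


if __name__ == "__main__":
    flat = [[128] * 5 for _ in range(5)]  # no edges at all
    checker = [[255 if (x + y) % 2 == 0 else 0 for x in range(5)] for y in range(5)]
    print(looks_blurry(flat), looks_blurry(checker))
```

For production use, an image library would normally compute this far faster than nested Python loops; the loops are kept here only to make the arithmetic visible.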

