Introduction

This task is to develop a general-purpose text detection and recognition algorithm for the CV180x/CV181x processors that detects text regions in images and recognizes their content accurately. The algorithm is intended for a range of applications, including but not limited to document recognition, text extraction from images, and automated report generation, with the goal of increasing the degree of automation in text processing.

  • Acceptance Criteria

Algorithm performance will be evaluated on the evaluation set, focusing mainly on accuracy and computational complexity.

  1. Accuracy: The algorithm must achieve at least 95% accuracy on the evaluation set, so that text areas are detected and recognized reliably across the different scenarios.
  2. FLOPs: The computational complexity of the algorithm, measured in floating-point operations (FLOPs), should suit the processor platform and must not exceed 30 GFLOPs, so that the algorithm runs efficiently on embedded devices.
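
The two criteria above can be expressed as a single pass/fail check. The following is a minimal sketch, not a prescribed evaluation tool: the `EvalResult` structure and the `meets_acceptance` function are hypothetical names, the definition of a "correct" sample is left to the evaluator, and the FLOPs figure is assumed to come from an external profiler.

```python
from dataclasses import dataclass

ACCURACY_THRESHOLD = 0.95  # 95% accuracy, from the acceptance criteria
MAX_FLOPS = 30e9           # 30G, from the acceptance criteria


@dataclass
class EvalResult:
    """Hypothetical summary of one evaluation run."""
    correct: int        # samples judged correct (the judging rule is an assumption)
    total: int          # samples in the evaluation set
    model_flops: float  # operation count reported by an external profiler


def meets_acceptance(result: EvalResult) -> bool:
    """Return True only if both the accuracy and the FLOPs criteria hold."""
    if result.total == 0:
        return False  # an empty evaluation run cannot pass
    accuracy = result.correct / result.total
    return accuracy >= ACCURACY_THRESHOLD and result.model_flops <= MAX_FLOPS
```

A run passes only when both conditions hold; a highly accurate model that exceeds the FLOPs budget is still rejected.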

  • Evaluation Set

The evaluation set will contain multiple scenarios to simulate various situations that may be encountered in actual applications:

  1. Text density variation: Text areas with different densities, including sparse and dense text layouts.
  2. Font and size variations: Images of different fonts and text sizes to test the algorithm’s adaptability to various text styles.
  3. Lighting conditions: Images captured under uniform lighting and under low-light conditions.
  4. Sample size: The evaluation set should contain more than 500 samples so that algorithm performance is evaluated adequately.

An algorithm that meets these requirements on such an evaluation set is expected to show high accuracy and robustness in real applications and to satisfy the practical needs of general text detection and recognition.
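
The evaluation-set requirements can be checked mechanically before any model is run. The sketch below assumes each sample is described by a dictionary of tags; the tag names (`density`, `lighting`, `font`, `size`) and their values are illustrative assumptions, not a format defined by this document.

```python
MIN_SAMPLES = 500  # the set must contain MORE than this many samples

# Hypothetical tag vocabulary covering the scenarios named in the text.
REQUIRED_TAGS = {
    "density": {"sparse", "dense"},
    "lighting": {"uniform", "low_light"},
}


def check_evaluation_set(samples):
    """Return a list of problems; an empty list means the set passes."""
    problems = []
    if len(samples) <= MIN_SAMPLES:
        problems.append(f"need more than {MIN_SAMPLES} samples, got {len(samples)}")
    # Every named density and lighting condition must appear at least once.
    for tag, required in REQUIRED_TAGS.items():
        seen = {s.get(tag) for s in samples}
        for value in sorted(required - seen):
            problems.append(f"no sample with {tag}={value}")
    # Fonts and sizes are open-ended, so only require more than one distinct value.
    for tag in ("font", "size"):
        distinct = {s.get(tag) for s in samples if s.get(tag) is not None}
        if len(distinct) < 2:
            problems.append(f"{tag} does not vary")
    return problems
```

Returning a list of problems rather than a single boolean makes it easier to see which scenario still needs more data.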

Data Collection Process

  • Data Collection Plan

Base the collection plan on the algorithm's actual application scenarios. First, determine the collection variables, for example collection scene, quantity, number of people, gender, and age. Number the values of each variable and assign each an English abbreviation. For example, with three variables (collection scene, distance, and gender), the values could be numbered as follows:

  1. Scene 1: indoor; Scene 2: outdoor
  2. Distance 1: 1m; Distance 2: 3m; Distance 3: 5m
  3. Gender 1: male; Gender 2: female

Then determine the collection procedure based on these variables: for example, whether to iterate over variable 1 or variable 2 first, what actions the subjects should perform, and any precautions to observe. Record the collection plan and the variable definitions in a Word document.
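
To see how many distinct conditions a plan implies, the variable values can be enumerated programmatically. The sketch below uses the three example variables from the list above; the `VARIABLES` mapping and the `collection_conditions` helper are hypothetical names, and the iteration order (first variable outermost) is only one possible choice.

```python
from itertools import product

# The three example variables, listed in collection order.
VARIABLES = {
    "scene": ["indoor", "outdoor"],
    "distance": ["1m", "3m", "5m"],
    "gender": ["male", "female"],
}


def collection_conditions(variables):
    """List every combination of values, varying the last variable fastest."""
    names = list(variables)
    return [dict(zip(names, combo)) for combo in product(*variables.values())]


conditions = collection_conditions(VARIABLES)
print(len(conditions))  # 2 * 3 * 2 = 12 conditions
print(conditions[0])    # {'scene': 'indoor', 'distance': '1m', 'gender': 'male'}
```

With this ordering, all indoor conditions are collected before any outdoor ones, which corresponds to collecting "according to variable 1 first".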

  • Prepare Collection Equipment

The collection equipment should match the equipment used in deployment as closely as possible and should remain available for long-term use. Prepare the collection firmware.

  • Data Collection and Saving

Collect the data and name each file by joining the values of the collection variables, in the order defined in the "Data Collection Plan", followed by a sequence number: x-x-x-x.xxx. For example, with the three variables scene, distance, and gender, the file indoor-3m-female-12.xxx is the 12th sample collected for "indoor scene, 3m distance, female". After collection is complete, submit the collected data together with the collection plan document from the "Data Collection Plan" step for inspection.
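
The naming convention can be enforced with a small pair of helpers, which also makes the files easy to filter later. This is a sketch under assumptions: `VARIABLE_ORDER`, `build_name`, and `parse_name` are hypothetical names, the `jpg` extension is only a stand-in for the unspecified `.xxx`, and variable values are assumed not to contain hyphens.

```python
# Variable order as defined in the collection plan (example from the text).
VARIABLE_ORDER = ("scene", "distance", "gender")


def build_name(values, index, ext):
    """Join the variable values and the sample index into a file name."""
    parts = [values[name] for name in VARIABLE_ORDER]
    for part in parts:
        if "-" in part:
            raise ValueError(f"value {part!r} must not contain '-'")
    return "-".join(parts + [str(index)]) + "." + ext


def parse_name(filename):
    """Split a file name back into (values, index, extension)."""
    stem, _, ext = filename.rpartition(".")
    fields = stem.split("-")
    if len(fields) != len(VARIABLE_ORDER) + 1:
        raise ValueError(f"unexpected file name: {filename!r}")
    values = dict(zip(VARIABLE_ORDER, fields[:-1]))
    return values, int(fields[-1]), ext


name = build_name({"scene": "indoor", "distance": "3m", "gender": "female"}, 12, "jpg")
print(name)  # indoor-3m-female-12.jpg
```

Parsing every collected file name with `parse_name` before submission is a quick way to catch files that do not follow the plan.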