Other News
- China develops ethnic language optical character recognition system
Date: 30-Jan-2007 Sources: (Xinhua Online)
Chinese researchers have succeeded in developing an ethnic-language optical character recognition (OCR) system, which can convert images of text written in ethnic languages, such as scanned paper documents, into computer-editable text.
The system can be used for documents written in major Chinese ethnic languages, including Mongolian, Tibetan, Uygur, Kazak, Korean and Kirgiz, said Ding Xiaoqing, a professor at Tsinghua University who headed the team responsible for the project.
It can also be used to recognize and transform materials written in Arabic, Ding added.
Most OCR technologies in China are applicable only to materials written in Chinese and English, and cannot be used to process characters in ethnic languages, said Ding.
'The system is designed with capacities to handle multiple ethnic languages and it can recognize up to 96.2 percent of the text content,' said Ding.
The technology passed an appraisal by several academicians from the Chinese Academy of Sciences and the Chinese Academy of Engineering on Monday.
It will be used to preserve documents written in ethnic languages and to promote the application of information technology among China's ethnic groups, said Ni Guangnan, an academician from the Chinese Academy of Engineering who was on the appraisal team.
More than 40 researchers from Tsinghua University, Inner Mongolia University and northwest China's Xinjiang University spent eight years developing and improving the technology.
OCR (optical character recognition) technology involves the translation of optically scanned images of printed or written text characters into character codes that can be manipulated by computers.
