About Script Identification From Multi-Script Handwritten Documents
In recent years, the escalating use of physical documents has driven progress towards creating electronic documents, which make communication and storage easier. However, physical documents are still prevalent in most communication. Paper is also a comfortable and secure medium to work with, so the demand for physical documents will continue for many years to come. There is therefore great demand for software that automatically extracts, analyzes and stores information from physical documents for later retrieval. A multilingual document, such as a railway reservation form, question paper, language translation book or money-order form, may contain text in more than one script or language. In addition, one script may be used to write more than one language. An Optical Character Recognition (OCR) module designed for a specific language will not work on such multilingual documents. Hence, a pre-OCR script identification system is essential before running an individual OCR for a specific language. This book addresses the problem of script identification.