Document Image Recognition System: Framework and Applications

Iping Supriana Suwardi1, Peb Ruswono Aryan1, Bugi Wibowo1
1Bandung Institute of Technology, Jalan Ganesa 10, Bandung 40132, Indonesia

Abstract: Document image recognition (DIR) is one part of a document image understanding (DIU) or intelligent document processing (IDP) system. The objective of document image recognition is to extract data from a document: the textual or graphical data it contains, or structural information such as document layout or document style, which makes an exact reconstruction of the document possible. The extracted data is then used by further applications, which together form a document image understanding system or an automatic document processing system. Document image recognition and understanding have been studied for over three decades. Many commercial and free software packages are available; however, they are primarily built for specific applications. This paper discusses the development of a flexible framework for a DIR system that can be applied to many fields of document image recognition. The paper also discusses the classification of document complexity that is used, the methods used, and some application prototypes of a DIR system built with this framework.

This paper was published in the Proceedings of the International Conference on Rural Information and Communication Technology 2007, Bandung, Indonesia.

This research was funded by the Center for ICT Research of ITB under the competitive research category, with funding code RU/PPTIK/004.



One comment

  1. khalid · August 7, 2008

    How can I read the paper? I tried searching, but I could only find the title, not the paper itself. Thank you.
