Text Line Segmentation of Handwritten Documents using Clustering Method based on Thresholding Approach
Ravi M Kumar, Nayana N Shetty and B P Pragath. Text Line Segmentation of Handwritten Documents using Clustering Method based on Thresholding Approach. IJCA Proceedings on National Conference on Advanced Computing and Communications 2012, NCACC(1):9-12, August 2012.
BibTeX
@article{key:article,
  author = {M. Ravi Kumar and Nayana N Shetty and B. P. Pragath},
  title = {Text Line Segmentation of Handwritten Documents using Clustering Method based on Thresholding Approach},
  journal = {IJCA Proceedings on National Conference on Advanced Computing and Communications 2012},
  year = {2012},
  volume = {NCACC},
  number = {1},
  pages = {9-12},
  month = {August},
  note = {Full text available}
}
Abstract
Segmentation of text lines in unconstrained handwritten documents is still a challenging task because handwritten text lines are often non-uniformly skewed and curved, and the space between lines is not obvious. In this paper, we propose a text-line segmentation algorithm based on clustering using a threshold. The connected components of the document image are grouped into clusters, and the text lines are then extracted dynamically by assigning a distinct color to each line.
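The pipeline described in the abstract (connected components, threshold-based grouping, coloring of lines) can be illustrated with a minimal sketch. The abstract does not state which feature is compared against the threshold or how the threshold is chosen, so the code below assumes, purely for illustration, that components are grouped by the vertical position of their centroids and that the threshold is a fixed number of pixels. All function names and the toy image are hypothetical and are not taken from the paper.

```python
from collections import deque


def connected_components(image):
    """Return the 8-connected foreground components (pixels equal to 1)
    of a binary image, each as a list of (row, col) pixels."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill starting from an unvisited ink pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(pixels)
    return components


def vertical_center(pixels):
    """Mean row index of a component (assumed grouping feature)."""
    return sum(y for y, _ in pixels) / len(pixels)


def group_into_lines(components, threshold):
    """Greedy threshold clustering: a component joins the current line if its
    vertical center is within `threshold` rows of that line's mean center;
    otherwise it starts a new line. This rule is an assumption."""
    lines = []
    for comp in sorted(components, key=vertical_center):
        if lines:
            current = lines[-1]
            mean = sum(vertical_center(p) for p in current) / len(current)
            if abs(vertical_center(comp) - mean) <= threshold:
                current.append(comp)
                continue
        lines.append([comp])
    return lines


def color_lines(image, lines):
    """Build a label map: 0 for background, k for pixels of the k-th line."""
    colored = [[0] * len(row) for row in image]
    for label, line in enumerate(lines, start=1):
        for comp in line:
            for y, x in comp:
                colored[y][x] = label
    return colored


# Toy binary "page": two text lines separated by blank rows.
page = [
    [1, 1, 0, 1, 0, 0, 1, 1],
    [1, 1, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 1, 0, 0],
    [0, 1, 1, 0, 1, 1, 0, 0],
]
components = connected_components(page)
lines = group_into_lines(components, threshold=1.5)
colored = color_lines(page, lines)
```

A fixed vertical threshold like this only suits roughly horizontal lines. Skewed or curved lines, which the abstract names as the hard case, would need a different distance measure or an adaptive threshold.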