10.5120/17792-8585
Jan-Hendrik Worch and Bjoern Gottfried. Article: Choosing Shape Features by means of Genetic Algorithms for Glyph-clustering of Historical Documents. International Journal of Computer Applications 102(3):1-6, September 2014.
BibTeX
@article{key:article,
  author = {Jan-Hendrik Worch and Bjoern Gottfried},
  title = {Article: Choosing Shape Features by means of Genetic Algorithms for Glyph-clustering of Historical Documents},
  journal = {International Journal of Computer Applications},
  year = {2014},
  volume = {102},
  number = {3},
  pages = {1-6},
  month = {September},
  note = {Full text available}
}
Abstract
A solution to a feature selection problem in document image processing is presented. Choosing shape features to describe the glyphs of historical documents is a non-trivial task, since the variations of glyphs across different documents are innumerable; selecting shape features manually would therefore be cumbersome. To select a subset of features from a given set, a genetic algorithm is used that optimises the result of a clustering process by x-means. The result of x-means is evaluated using different quality measures. The optimisation methodology is illustrated in a case study in which the selection of an appropriate set of features is a crucial part of the system. The intended application supports users who are transcribing historical documents by showing them similar occurrences of a given glyph.
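The general idea of the abstract, a genetic algorithm searching over feature subsets with a clustering quality score as fitness, can be sketched roughly as follows. This is only an illustration under several assumptions and not the authors' implementation: x-means is replaced here by plain k-means with a fixed number of clusters, the quality measure is a simple explained-scatter ratio, and the genetic operators (tournament selection, one-point crossover, bit-flip mutation, elitism) as well as all function names and the toy data are hypothetical choices made for this example.

```python
import random

def sqdist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Component-wise mean of a non-empty list of tuples."""
    return tuple(sum(col) / len(points) for col in zip(*points))

def kmeans(points, k, rng, iters=20):
    """Plain k-means (stand-in for x-means); returns one label per point."""
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: sqdist(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = mean(members)
    return labels

def fitness(mask, data, k):
    """Assumed quality measure: 1 - within-cluster scatter / total scatter."""
    idx = [i for i, bit in enumerate(mask) if bit]
    if not idx:
        return 0.0  # an empty feature subset cannot be clustered
    points = [tuple(row[i] for i in idx) for row in data]
    labels = kmeans(points, k, random.Random(0))  # fixed seed: deterministic score
    centre = mean(points)
    total = sum(sqdist(p, centre) for p in points)
    within = 0.0
    for c in set(labels):
        members = [p for p, lab in zip(points, labels) if lab == c]
        m = mean(members)
        within += sum(sqdist(p, m) for p in members)
    return 1.0 - within / total if total > 0 else 0.0

def select_features(data, k, rng, pop_size=12, generations=15, p_mut=0.1):
    """Genetic search over bit masks; bit i == 1 means feature i is used."""
    n = len(data[0])
    cache = {}

    def fit(mask):
        key = tuple(mask)
        if key not in cache:
            cache[key] = fitness(mask, data, k)
        return cache[key]

    # Start from the full feature set plus random subsets.
    pop = [[1] * n] + [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size - 1)]
    for _ in range(generations):
        nxt = [max(pop, key=fit)[:]]  # elitism: carry over the best mask
        while len(nxt) < pop_size:
            a = max(rng.sample(pop, 2), key=fit)  # tournament selection
            b = max(rng.sample(pop, 2), key=fit)
            cut = rng.randint(1, n - 1)           # one-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ 1 if rng.random() < p_mut else bit for bit in child]
            nxt.append(child)
        pop = nxt
    return max(pop, key=fit)

def make_toy_data(rng, per_cluster=20):
    """Toy 'glyph' vectors: 2 informative features followed by 3 noise features."""
    data = []
    for cx, cy in [(0.0, 0.0), (5.0, 5.0), (0.0, 5.0)]:
        for _ in range(per_cluster):
            informative = (rng.gauss(cx, 0.5), rng.gauss(cy, 0.5))
            noise = tuple(rng.uniform(-5, 5) for _ in range(3))
            data.append(informative + noise)
    return data

if __name__ == "__main__":
    rng = random.Random(42)
    data = make_toy_data(rng)
    best = select_features(data, k=3, rng=rng)
    print("selected mask:", best, "score:", round(fitness(best, data, 3), 3))
```

Because the full feature set is part of the initial population and the best mask is always carried over, the returned subset never scores worse than using all features under this particular quality measure. In a real setting the fitness function would wrap x-means and the quality measures chosen for the application.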