
Towards Arabic Spell-Checker Based on N-Grams Scores

International Journal of Computer Applications
© 2012 by IJCA Journal
Volume 53 - Number 3
Year of Publication: 2012
Authors:
Hasan Muaidi
Rasha Al-tarawneh
10.5120/8400-2168

Hasan Muaidi and Rasha Al-tarawneh. Article: Towards Arabic Spell-Checker Based on N-Grams Scores. International Journal of Computer Applications 53(3):12-16, September 2012. Full text available. BibTeX

@article{key:article,
	author = {Hasan Muaidi and Rasha Al-tarawneh},
	title = {Article: Towards Arabic Spell-Checker Based on N-Grams Scores},
	journal = {International Journal of Computer Applications},
	year = {2012},
	volume = {53},
	number = {3},
	pages = {12-16},
	month = {September},
	note = {Full text available}
}

Abstract

The main purpose of this paper is to develop a simple and flexible spell-checker for the Arabic language. The proposed spell-checker is based on N-gram scores. For this purpose, eleven matrices are built to represent the combinations of letters within Arabic words. Each matrix records the connections between the two letters of a 2-gram. Each cell in a generated matrix is assigned an integer value of 2, 1, or 0: 2 if a word ends with the two letters, 1 if the two letters are connected but the word has not yet ended, and 0 otherwise. To check a word, the spell-checker extracts each pair of letters in the word and examines the corresponding value. If the value is 0, the test word is considered wrong. If the value is 1, a connection exists and checking continues with the next pair until a value of 2 is reached, which indicates that the word is correct. The overall accuracy of the proposed spell-checker reached 98.99%.
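
The scoring scheme described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes that the eleven matrices are indexed by the position of the letter pair within the word (so words of up to twelve letters are covered), that a cell keeps the higher score when a pair both ends one word and continues another, and that a non-final pair only needs a nonzero score. The names `build_matrices`, `is_correct`, and `MAX_PAIRS`, and the three-word lexicon, are hypothetical.

```python
from collections import defaultdict

# Assumption: one matrix per letter-pair position, eleven in total.
MAX_PAIRS = 11


def build_matrices(lexicon):
    """Build position-indexed 2-gram score tables from a list of correct words.

    Each table maps a (letter, next_letter) pair to 2 if some word ends with
    that pair at that position, 1 if the pair only occurs mid-word, and is
    treated as 0 when the pair is absent.
    """
    matrices = [defaultdict(int) for _ in range(MAX_PAIRS)]
    for word in lexicon:
        n_pairs = len(word) - 1
        if n_pairs < 1 or n_pairs > MAX_PAIRS:
            continue  # assumption: skip words the eleven matrices cannot cover
        for i in range(n_pairs):
            pair = (word[i], word[i + 1])
            score = 2 if i == n_pairs - 1 else 1
            # Assumption: keep the higher score on conflict.
            matrices[i][pair] = max(matrices[i][pair], score)
    return matrices


def is_correct(word, matrices):
    """Return True if every pair is connected and the last pair ends a word."""
    n_pairs = len(word) - 1
    if n_pairs < 1 or n_pairs > len(matrices):
        return False
    for i in range(n_pairs):
        score = matrices[i].get((word[i], word[i + 1]), 0)
        if score == 0:
            return False  # no connection: the word is rejected
        if i == n_pairs - 1:
            return score == 2  # the final pair must be marked as word-ending
    return False


# Hypothetical toy lexicon, for illustration only.
LEXICON = ["كتب", "كتاب", "باب"]
matrices = build_matrices(LEXICON)

print(is_correct("كتاب", matrices))  # True: every pair is known, last pair scores 2
print(is_correct("كتا", matrices))   # False: the last pair scores 1, not 2
print(is_correct("كبت", matrices))   # False: the first pair scores 0
```

Because only adjacent pairs are checked, a scheme like this can accept letter sequences that are not in the lexicon when all of their pairs happen to be valid. That is a property of this sketch and is not a claim about the paper's reported accuracy.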
