
Feature Subset Selection Algorithm for High-Dimensional Data by using FAST Clustering Approach

IJCA Proceedings on International Conference on Knowledge Collaboration in Engineering
© 2014 by IJCA Journal
ICKCE
Year of Publication: 2014
Authors:
Kumaravel. V
Raja. K

Kumaravel. V and Raja. K. Article: Feature Subset Selection Algorithm for High-Dimensional Data by using FAST Clustering Approach. IJCA Proceedings on International Conference on Knowledge Collaboration in Engineering ICKCE:21-25, April 2014. Full text available. BibTeX

@article{key:article,
	author = {Kumaravel. V and Raja. K},
	title = {Article: Feature Subset Selection Algorithm for High-Dimensional Data by using FAST Clustering Approach},
	journal = {IJCA Proceedings on International Conference on Knowledge Collaboration in Engineering},
	year = {2014},
	volume = {ICKCE},
	pages = {21-25},
	month = {April},
	note = {Full text available}
}

Abstract

Feature selection is the process of identifying the most useful subset of features, one that produces results comparable to those of the original feature set. Two measures are used to evaluate a feature selection algorithm: efficiency, which concerns the time required to find the subset, and effectiveness, which concerns the quality of the subset found. With these criteria in mind, the FAST clustering algorithm was proposed and evaluated. It works in two steps: features are first partitioned into clusters, and then a representative feature strongly related to the target class is selected from each cluster. Because features in different clusters are relatively independent, FAST has a high probability of producing a subset of useful and independent features. Its performance was evaluated against several well-known selection algorithms (FCBF, Relief, and CFS), all of which it outperforms. Results on 35 real-world datasets (image, microarray, and text data) show that FAST not only produces smaller subsets but also improves performance.
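The two-step procedure described in the abstract (cluster the features, then pick one representative per cluster) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes discrete-valued features, uses symmetric uncertainty (SU) as the correlation measure, builds a minimum spanning tree over the feature graph, and cuts MST edges whose feature-to-feature correlation falls below the class relevance of both endpoints. All function names here are invented for the sketch, and the relevance threshold is simplified to zero.

```python
import numpy as np
from collections import Counter
from itertools import combinations

def entropy(values):
    """Shannon entropy (bits) of a discrete sequence."""
    counts = np.array(list(Counter(values).values()), dtype=float)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))

def symmetric_uncertainty(x, y):
    """SU(X, Y) = 2 * IG(X; Y) / (H(X) + H(Y)), normalised to [0, 1]."""
    hx, hy = entropy(x), entropy(y)
    gain = hx + hy - entropy(list(zip(x, y)))  # mutual information
    denom = hx + hy
    return 2.0 * gain / denom if denom > 0 else 0.0

def fast_subset(features, target, names):
    """Pick one representative feature per MST-derived cluster."""
    n = len(features)
    # T-relevance: SU between each feature and the class.
    t_rel = [symmetric_uncertainty(f, target) for f in features]
    # Drop irrelevant features (the paper uses a threshold; 0 here).
    keep = [i for i in range(n) if t_rel[i] > 0]

    # Build an MST over the complete feature graph (Kruskal's algorithm).
    # Edge weight = 1 - SU, so the tree prefers strongly correlated pairs.
    edges = sorted((1.0 - symmetric_uncertainty(features[i], features[j]), i, j)
                   for i, j in combinations(keep, 2))
    parent = {i: i for i in keep}
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i
    mst = []
    for w, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            mst.append((w, i, j))

    # Partition: cut MST edges whose F-correlation is smaller than the
    # T-relevance of *both* endpoints -- they join weakly related features.
    parent = {i: i for i in keep}
    for w, i, j in mst:
        su = 1.0 - w
        if su >= t_rel[i] or su >= t_rel[j]:
            parent[find(i)] = find(j)

    # One representative per cluster: the feature most relevant to the class.
    clusters = {}
    for i in keep:
        clusters.setdefault(find(i), []).append(i)
    return [names[max(c, key=lambda i: t_rel[i])] for c in clusters.values()]
```

For example, given two redundant informative features and one uninformative one, the sketch keeps a single representative of the redundant pair and discards the noise, which matches the paper's claim of producing small, independent subsets.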

References

  • Liu, H., Motoda, H. and Yu, L., 2004. "Selective Sampling Approach to Active Feature Selection," Artificial Intelligence, vol. 159, nos. 1/2, pp. 49-74.
  • Guyon, I. and Elisseeff, A., 2003. "An Introduction to Variable and Feature Selection," J. Machine Learning Research, vol. 3, pp. 1157-1182.
  • Mitchell, T. M., 1982. "Generalization as Search," Artificial Intelligence, vol. 18, no. 2, pp. 203-226.
  • Dash, M. and Liu, H., 1997. "Feature Selection for Classification," Intelligent Data Analysis, vol. 1, no. 3, pp. 131-156.
  • Souza, J., 2004. "Feature Selection with a General Hybrid Algorithm," PhD dissertation, Univ. of Ottawa.
  • Langley, P., 1994. "Selection of Relevant Features in Machine Learning," Proc. AAAI Fall Symp. Relevance, pp. 1-5.
  • Ng, A. Y., 1998. "On Feature Selection: Learning with Exponentially Many Irrelevant Features as Training Examples," Proc. 15th Int'l Conf. Machine Learning, pp. 404-412.
  • Das, S., 2001. "Filters, Wrappers and a Boosting-Based Hybrid for Feature Selection," Proc. 18th Int'l Conf. Machine Learning, pp. 74-81.
  • Xing, E., Jordan, M. and Karp, R., 2001. "Feature Selection for High-Dimensional Genomic Microarray Data," Proc. 18th Int'l Conf. Machine Learning, pp. 601-608.
  • Pereira, F., Tishby, N. and Lee, L., 1993. "Distributional Clustering of English Words," Proc. 31st Ann. Meeting of the Assoc. for Computational Linguistics, pp. 183-190.
  • Baker, L. D. and McCallum, A. K., 1998. "Distributional Clustering of Words for Text Classification," Proc. 21st Ann. Int'l ACM SIGIR Conf. Research and Development in Information Retrieval, pp. 96-103.
  • Dhillon, I. S. and Kumar, R., 2003. "A Divisive Information Theoretic Feature Clustering Algorithm for Text Classification," J. Machine Learning Research, vol. 3, pp. 1265-1287.
  • Jaromczyk, J. W. and Toussaint, G. T., 1992. "Relative Neighborhood Graphs and Their Relatives," Proc. IEEE, vol. 80, no. 9, pp. 1502-1517.
  • John, G. H., Kohavi, R. and Pfleger, K., 1994. "Irrelevant Features and the Subset Selection Problem," Proc. 11th Int'l Conf. Machine Learning, pp. 121-129.
  • Forman, G., 2003. "An Extensive Empirical Study of Feature Selection Metrics for Text Classification," J. Machine Learning Research, vol. 3, pp. 1289-1305.
  • Hall, M. A., 2000. "Correlation-Based Feature Selection for Discrete and Numeric Class Machine Learning," Proc. 17th Int'l Conf. Machine Learning, pp. 359-366.
  • Kononenko, I., 1994. "Estimating Attributes: Analysis and Extensions of RELIEF," Proc. European Conf. Machine Learning, pp. 171-182.
  • Battiti, R., 1994. "Using Mutual Information for Selecting Features in Supervised Neural Net Learning," IEEE Trans. Neural Networks, vol. 5, no. 4, pp. 537-550.
  • Hall, M. A., 1999. "Correlation-Based Feature Subset Selection for Machine Learning," PhD dissertation, Univ. of Waikato.
  • Yu, L. and Liu, H., 2003. "Feature Selection for High-Dimensional Data: A Fast Correlation-Based Filter Solution," Proc. 20th Int'l Conf. Machine Learning, vol. 20, no. 2, pp. 856-863.
  • Yu, L. and Liu, H., 2004. "Efficient Feature Selection via Analysis of Relevance and Redundancy," J. Machine Learning Research, vol. 10, no. 5, pp. 1205-1224.
  • Fleuret, F., 2004. "Fast Binary Feature Selection with Conditional Mutual Information," J. Machine Learning Research, vol. 5, pp. 1531-1555.
  • Kohavi, R. and John, G. H., 1997. "Wrappers for Feature Subset Selection," Artificial Intelligence, vol. 97, nos. 1/2, pp. 273-324.
  • Press, W. H., Flannery, B. P., Teukolsky, S. A. and Vetterling, W. T., 1988. Numerical Recipes in C. Cambridge Univ. Press.
  • Almuallim, H. and Dietterich, T. G., 1994. "Learning Boolean Concepts in the Presence of Many Irrelevant Features," Artificial Intelligence, vol. 69, nos. 1/2, pp. 279-305.
  • Robnik-Sikonja, M. and Kononenko, I., 2003. "Theoretical and Empirical Analysis of Relief and ReliefF," Machine Learning, vol. 53, pp. 23-69.
  • Dash, M., Liu, H. and Motoda, H., 2000. "Consistency Based Feature Selection," Proc. Fourth Pacific Asia Conf. Knowledge Discovery and Data Mining, pp. 98-109.
  • Cohen, W., 1995. "Fast Effective Rule Induction," Proc. 12th Int'l Conf. Machine Learning (ICML '95), pp. 115-123.