International Journal of Computer Applications
© 2014 by IJCA Journal
Volume 107 - Number 5
Year of Publication: 2014
DOI: 10.5120/18747-0000
Bhavin M Jasani and C K Kumbharana. Article: Analyzing Different Web Crawling Methods. International Journal of Computer Applications 107(5):23-26, December 2014. Full text available. BibTeX
@article{key:article, author = {Bhavin M. Jasani and C. K. Kumbharana}, title = {Article: Analyzing Different Web Crawling Methods}, journal = {International Journal of Computer Applications}, year = {2014}, volume = {107}, number = {5}, pages = {23-26}, month = {December}, note = {Full text available} }
Abstract
The number of internet users is increasing day by day at an enormous rate. Maintaining resource discovery on the World Wide Web (WWW) is therefore a crucial task. Many algorithms and architectures have been introduced to make WWW resource discovery effective.