An Open Source ETL Tool - Medium and Small Scale Enterprise ETL(MaSSEETL)

International Journal of Computer Applications
© 2014 by IJCA Journal
Volume 108 - Number 4
Year of Publication: 2014
Authors:
Rupali Gill
Jaiteg Singh
DOI: 10.5120/18899-0190

Rupali Gill and Jaiteg Singh. An Open Source ETL Tool - Medium and Small Scale Enterprise ETL(MaSSEETL). International Journal of Computer Applications 108(4):15-22, December 2014.

BibTeX

@article{key:article,
	author = {Rupali Gill and Jaiteg Singh},
	title = {An Open Source ETL Tool - Medium and Small Scale Enterprise ETL(MaSSEETL)},
	journal = {International Journal of Computer Applications},
	year = {2014},
	volume = {108},
	number = {4},
	pages = {15-22},
	month = {December},
	note = {Full text available}
}

Abstract

In a Data Warehouse (DW) environment, Extraction-Transformation-Loading (ETL) processes consume up to 70% of resources. Data quality tools aim to detect and correct data problems that affect the accuracy and efficiency of data analysis applications. Source data imported into the data warehouse often varies in quality, format, coding, etc. ETL tools are used to bring all of this data together in a standard, homogeneous environment. The ETL solutions available so far are either proprietary or have limited functionality. Small and Medium Scale Enterprises (SME) and Small Scale Enterprises (SSE) cannot afford the licensing cost of these paid tools. The tool developed here, MaSSEETL, provides an integrated, open source data quality solution that deals with naming conflicts, structural conflicts, date conversions, missing values and changing dimensions. MaSSEETL resolves these errors and reports each with an appropriate level of warning. In this paper, we present the working of MaSSEETL. The tool provides pragmatic evidence of the strategic intensification of data quality in academic and business enterprises.
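To make three of the problem classes named in the abstract concrete (naming conflicts, date conversions and missing values), the following is a minimal Python sketch of a per-record cleaning step. It is not MaSSEETL's implementation, which this page does not describe. The column aliases, the accepted date formats, the `birth_date` field and the function names are all assumptions chosen for illustration.

```python
from datetime import datetime

# Hypothetical aliases: different sources name the same field differently
# (a "naming conflict"). These names are illustrative, not from the paper.
COLUMN_ALIASES = {
    "cust_name": "customer_name",
    "CustomerName": "customer_name",
    "dob": "birth_date",
    "DateOfBirth": "birth_date",
}

# Date layouts assumed for illustration; tried in order.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%Y%m%d")


def normalize_date(value):
    """Return the date as ISO 8601 text, or None if no known format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def clean_record(record):
    """Rename columns, normalize the date field, and collect warnings."""
    warnings = []
    # Resolve naming conflicts by mapping every known alias to one name.
    cleaned = {COLUMN_ALIASES.get(key, key): value for key, value in record.items()}

    raw_date = cleaned.get("birth_date")
    if raw_date is None or str(raw_date).strip() == "":
        # Missing value: keep an explicit None and report it.
        cleaned["birth_date"] = None
        warnings.append("birth_date: missing value")
    else:
        iso = normalize_date(str(raw_date))
        if iso is None:
            warnings.append(f"birth_date: unrecognized format {raw_date!r}")
        cleaned["birth_date"] = iso
    return cleaned, warnings
```

Calling `clean_record({"CustomerName": "Asha", "DateOfBirth": "05/03/1990"})` returns the record under the unified column names with the date in ISO form and an empty warning list. A record with an empty date yields `None` for that field plus one warning. Returning warnings alongside the cleaned row, rather than raising, mirrors the abstract's idea of resolving errors while reporting them at some warning level.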
