Topic > Database Systems: Evolution and Efficiency of Big Data

Big Data: A Continuing Evolution

Big Data continues to evolve and appears to be in the early stages of its development. It will keep growing, and sustained research will be needed to keep pace. This paper reviews the definition of Big Data and how it is used, why current DBMSs cannot handle Big Data efficiently, what hardware and software solutions are being tested, and what challenges researchers face.

Big Data is a term for the rapidly growing volume of data (mostly unstructured, but also structured and semi-structured) that is available to be mined [1]. Data mining attempts to derive meaningful information from data. As the amount and variety of data continue to increase, it becomes more difficult to extract useful information with acceptable turnaround times. Current software and hardware tools cannot keep up with the demands of Big Data, which requires the ability to process complex data at the petabyte or exabyte scale [2].

Big Data comes from many sources, and because storage capacity has doubled every 14 months for the past 30 years, storing data has become steadily cheaper [3]. Sources include social media, mobile sensors, astronomy, transaction logs, and many others [4]. Companies now want to collect large amounts of data that may not be useful today but may be useful later. The popular social media site Facebook, for example, collects over 500 terabytes per day [4].

Big Data is defined not only by its volume but also by the ability to retrieve knowledge from it in a reasonable period of time. For example, Netflix, a video streaming service, uses a machine learning technique…

… middle of the paper…

… the future,” ACM SIGKDD Explorations Newsletter, vol. 14, no. 2, pp. 1-5, 2012.
[6] C. Ordonez, “Can We Analyze Big Data inside a DBMS?” in DOLAP '13 Proceedings of the Sixteenth International Workshop on Data Warehousing and OLAP, San Francisco, 2013.
[7] Bear, A. Lamb, and N. Tran, “The Vertica Database: SQL RDBMS For Managing Big Data,” in MBDS '12 Proceedings of the 2012 Workshop on Managing Big Data Systems, San Jose, 2012.
[8] Tran, S. Bodagala, and J. Dave, “Designing query optimizers for big data problems of the future,” Proceedings of the VLDB Endowment, vol. 6, no. 1168-1169, 2013.
[9] IEEE Internet Computing, vol. 16, no. 4-6, May 2012.
[10] Big Data and Technology Readiness Levels, IEEE, vol. 42, pp. 8-9, 2014.