Netscientium

Big Data Basics

Definition: According to Wikipedia, “Big data is a broad term for data sets so large or complex that traditional data processing applications are inadequate. Challenges include analysis, capture, data curation, search, sharing, storage, transfer, visualization, and information privacy. The term often refers simply to the use of predictive analytics or other certain advanced methods to extract value from data, and seldom to a particular size of data set. Accuracy in big data may lead to more confident decision making. And better decisions can mean greater operational efficiency, cost reduction and reduced risk.”

The 3 main characteristics of big data are:

Volume
Data sizes extending from terabytes to zettabytes.
Variety
Growing from structured data alone to a mix of structured and unstructured data.
Velocity
Moving from batch data to streaming data.
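The velocity characteristic above can be illustrated with a small sketch: computing an average over a fixed batch versus updating it incrementally as records stream in. This is a minimal illustration, not tied to any specific big data tool; the function names are hypothetical.

```python
def batch_mean(values):
    # Batch processing: the full data set is available up front.
    return sum(values) / len(values)

def streaming_mean(stream):
    # Streaming processing: update an incremental mean one record at a
    # time, without holding the full data set in memory.
    count, mean = 0, 0.0
    for x in stream:
        count += 1
        mean += (x - mean) / count
        yield mean

readings = [3, 5, 7, 9]
print(batch_mean(readings))            # one answer after all data arrives
print(list(streaming_mean(readings)))  # an updated answer per record
```

The streaming version converges to the same final answer as the batch version, but can report an up-to-date value after every record, which is the essential trade-off between batch and streaming pipelines.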

The growth of big data has been driven by increases in storage capacity, processing power, and data availability. Data in all sectors has grown tremendously, and data from social media adds to this; together, this increases the data available for processing and for making data-driven decisions.

Typical tools used for big data

NoSQL: MongoDB, CouchDB, Cassandra, BigTable, HBase, ZooKeeper, etc.
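The common idea behind document-oriented NoSQL stores such as MongoDB and CouchDB is that records are schemaless documents rather than rows in a fixed table. A minimal sketch of that model, using plain Python dicts so no database server is assumed (the `insert`/`find` helpers are hypothetical, not a real client API):

```python
# A toy in-memory "document store": records are schemaless dicts,
# so each document can carry different fields (unlike a fixed SQL schema).
collection = []

def insert(doc):
    collection.append(doc)

def find(**criteria):
    # Return documents whose fields match every given criterion.
    return [d for d in collection
            if all(d.get(k) == v for k, v in criteria.items())]

insert({"name": "alice", "city": "Pune"})
insert({"name": "bob", "city": "Pune", "tags": ["admin"]})  # extra field is fine
print(find(city="Pune"))  # both documents match, despite different shapes
```

Real stores add indexing, replication, and richer query operators on top of this basic document-and-query model.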

MapReduce: Hadoop, Hive, Pig, Cascading, Cascalog, Kafka, Oozie, etc.
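The MapReduce model that Hadoop implements can be sketched with the classic word-count example: a map phase emits key–value pairs per input record, and a reduce phase aggregates all values sharing a key. This is a plain-Python illustration of the model, not Hadoop's actual API.

```python
from collections import defaultdict
from itertools import chain

def map_phase(line):
    # Map: emit a (word, 1) pair for every word in the input record.
    return [(word, 1) for word in line.lower().split()]

def reduce_phase(pairs):
    # Shuffle + reduce: group pairs by key and sum the counts per word.
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

lines = ["big data big tools", "data pipelines"]
mapped = chain.from_iterable(map_phase(line) for line in lines)
print(reduce_phase(mapped))  # {'big': 2, 'data': 2, 'tools': 1, 'pipelines': 1}
```

Because each map call depends only on its own record, the map phase can run in parallel across many machines; the framework handles shuffling pairs to reducers, which is what makes the model scale to very large data sets.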

Processing: R, Yahoo Pipes, Mechanical Turk, Elasticsearch, Datameer, BigSheets, TinkerPop, etc.

Gains from big data per sector

September 3, 2015


2014 © Netscitus Corporation. All Rights Reserved.