Big Data tutorials, technologies, questions and answers

In this section we have organized Big Data tutorials, articles, technologies, questions and answers.

Big Data technologies help companies analyze the huge data sets generated by various sources. They provide tools for handling large quantities of structured and unstructured data, along with software systems for analyzing that data in real time or through scheduled jobs.

Well-known companies such as Facebook, Twitter and Google use Big Data technologies to handle the huge data sets generated by their users. Innovation in Big Data technologies lets these companies store this much data and still access it quickly.

This section provides many articles, tutorials, and questions and answers on various Big Data technologies.

Big Data Tutorials

The following articles are good for learning Big Data technologies:

Big Data Technologies

Let's discuss the technologies used in a Big Data environment.

  • Hadoop - Apache Hadoop is a software system for storing and processing big data sets. Many technologies run on top of Hadoop to achieve Big Data analytics.
     
  • Hadoop HDFS - Hadoop HDFS (Hadoop Distributed File System) is a framework for storing files across distributed servers in a fault-tolerant way, by splitting them into blocks among other means. This makes it possible to store huge data sets in a Big Data environment. Many tools and software packages on top of Hadoop HDFS are used for storing and analyzing the data.
     
  • HBase - HBase is a NoSQL database on top of Hadoop HDFS that provides fast, random access to the data.
     
  • Hive - Apache Hive provides an SQL query interface for querying the data stored in HDFS.
     
  • Pig - Apache Pig is a high-level abstraction for creating and running MapReduce jobs on data stored in HDFS.
     
  • Mahout - Apache Mahout is a machine learning framework that works with the Hadoop ecosystem.
     
  • Oozie - Apache Oozie is a workflow scheduling system for the Hadoop ecosystem. It is used for scheduling jobs in a Big Data environment.
     
  • ZooKeeper - Apache ZooKeeper is a software system in the Hadoop ecosystem used for centralized management of configuration information, naming, distributed synchronization and group services.
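
To make the HDFS idea above more concrete, here is a minimal Python sketch of splitting a file into blocks and keeping several copies of each block on different servers. It is a toy model, not HDFS code: the function names, the node names, the tiny block size and the simple round-robin placement are all assumptions made for illustration. Real HDFS uses much larger blocks and its own placement policy.

```python
from __future__ import annotations


def split_into_blocks(data: bytes, block_size: int) -> list[bytes]:
    """Cut a byte string into fixed-size blocks; the last block may be shorter."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_replicas(num_blocks: int, nodes: list[str], replication: int) -> dict[int, list[str]]:
    """Assign each block to `replication` distinct nodes.

    Round-robin placement is an assumption for this sketch, not the HDFS policy.
    """
    if replication > len(nodes):
        raise ValueError("replication factor exceeds number of nodes")
    return {
        block_id: [nodes[(block_id + r) % len(nodes)] for r in range(replication)]
        for block_id in range(num_blocks)
    }


def readable_after_failure(placement: dict[int, list[str]], failed_node: str) -> bool:
    """Return True if every block still has a replica on a node that did not fail."""
    return all(
        any(node != failed_node for node in replicas)
        for replicas in placement.values()
    )


if __name__ == "__main__":
    data = b"x" * 1000                      # stand-in for a large file
    blocks = split_into_blocks(data, 256)   # illustrative block size
    nodes = ["node-a", "node-b", "node-c", "node-d"]  # hypothetical servers
    placement = place_replicas(len(blocks), nodes, replication=3)

    for block_id, replicas in placement.items():
        print(block_id, len(blocks[block_id]), replicas)

    # Losing one server does not lose any block, because copies live elsewhere.
    print(readable_after_failure(placement, "node-b"))
```

Because every block lives on more than one server, the file can still be read in full when a single server goes down. This is the sense in which the storage is fault-tolerant.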

Machine Learning

Here are tutorials on machine learning:

Big Data tutorials on devmanuals.com

The following are the best topics from http://www.devmanuals.com/.