Big Data Fundamentals

This is a beginner's course in Big Data. In this training, you will gain a clear understanding of what Big Data is, what it is composed of, why Hadoop is a widely used tool for working with Big Data, and the main components of the Hadoop ecosystem, such as MapReduce, HDFS, HBase, Hive, Pig, Sqoop, Spark, and Storm, along with Scala and other tools and technologies.

 

Outline:

Introduction to Big Data

  • Big Data and its importance, and what a distributed system is

Examples of Big Data

  • Characteristics of Big Data, introduction to Hadoop and how it works, and examples of Big Data

Hadoop Uses

  • What Hadoop is and how it works

Hadoop Installation

  • Installation of single-node Apache Hadoop

Hadoop Architecture

  • What HDFS is
  • Planning a Hadoop cluster and hardware considerations
  • Hadoop cluster maintenance

MapReduce

  • What MapReduce is, an example program (word count), the Map phase, shuffle and sort, the Reduce phase, and the name node

Hive

  • Introduction to Hive: what Hive is, its origin, Hive as a Hadoop-based system, and where not to use it

Mahout

  • Introduction to Mahout and fundamentals of machine learning
  • What classification, clustering, recommendation, and pattern mining are

Cassandra

  • What Cassandra is and who develops it, and what "non-relational" and "eventually consistent" mean
  • Introduction to NoSQL, comparison with RDBMS, and problems of RDBMS

SAP HANA

  • OLTP, OLAP, SAP BW, SAP BW with BWA, SAP's in-memory strategy, and comparison of SAP HANA with BWA

Spark

  • What Spark is, the architecture of Spark, uses of Spark, components of Spark, and its comparison with Hadoop

Scala

  • What Scala is and the uses of Scala
  • Advantages of Scala, how Scala solves problems of Java, C, etc., the architecture of Scala, and Spark coding using Scala

Splunk

  • What Splunk is, its latest version, supported operating systems, what Splunk does, and what it provides