Implementing Big Data with Hadoop

Program Objectives

Learn the fundamentals of Hadoop and YARN, and write applications using HDFS, MapReduce, Hive, Pig, Sqoop, Flume, and ZooKeeper

Write Spark applications using Spark SQL, Spark Streaming, DataFrames, RDDs, GraphX, and MLlib

Work with the Avro data format

Practice on real-life projects using Hadoop and Apache Spark

Be equipped to pass a Big Data Hadoop certification exam

Who Needs This?

Software Developers

System Administrators

Project Managers

Analytics Engineers

Fresh Graduates

What Will You Learn?

◾ The architecture of a Hadoop cluster
◾ What are High Availability and Federation?
◾ How to set up a production cluster
◾ Various shell commands in Hadoop
◾ Understanding configuration files in Hadoop
◾ Installing a single node cluster with Cloudera Manager
◾ Understanding Spark, Scala, Sqoop, Pig, and Flume

◾ Introducing Big Data and Hadoop
◾ What is Big Data and where does Hadoop fit in?
◾ Two important Hadoop ecosystem components, namely, MapReduce and HDFS
◾ In-depth HDFS (replication, block size, Secondary NameNode, High Availability) and in-depth YARN (ResourceManager and NodeManager)
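To make the MapReduce model concrete, here is a minimal pure-Python sketch of the map → shuffle → reduce flow that the Hadoop framework automates at cluster scale. The function names and sample data are illustrative only, not Hadoop APIs:

```python
from collections import defaultdict

def map_phase(line):
    # Mapper: emit a (word, 1) pair for every word in an input line
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    # Shuffle/sort: group values by key, as the framework does between map and reduce
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    # Reducer: sum the counts for one word
    return key, sum(values)

lines = ["Hadoop stores data in HDFS", "MapReduce processes data in HDFS"]
pairs = [kv for line in lines for kv in map_phase(line)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
print(counts["hdfs"])  # "hdfs" appears once in each line, so this prints 2
```

In real Hadoop, the map and reduce functions run on different nodes and the shuffle moves data over the network; the logic per record is the same.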

◾ Why use Apache Spark?
◾ Functional Programming Basics
◾ Parallel Programming using Resilient Distributed Datasets
◾ Scale out / Data Parallelism in Apache Spark
◾ Dataframes and SparkSQL
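The functional-programming style behind RDDs can be previewed with plain Python: pure transformations chained together, no mutation. This is only an analogue; in actual PySpark the equivalent chain would be `sc.parallelize(records).map(...).filter(...).reduce(...)`, executed in parallel across partitions:

```python
from functools import reduce

# Illustrative data: pretend each number is a record in a partition
records = list(range(1, 11))

# Pure transformations, composed lazily, as the RDD API encourages
squared = map(lambda x: x * x, records)         # like rdd.map(...)
evens = filter(lambda x: x % 2 == 0, squared)   # like rdd.filter(...)
total = reduce(lambda a, b: a + b, evens)       # like rdd.reduce(...)

print(total)  # 4 + 16 + 36 + 64 + 100 = 220
```

Because each step is a pure function of its input, Spark can re-run any lost partition from lineage, which is what makes RDDs resilient.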

◾ Indexing in Hive
◾ The Map-Side Join in Hive
◾ Working with complex data types
◾ The Hive user-defined functions
◾ Introduction to Impala
◾ Comparing Hive with Impala
◾ The detailed architecture of Impala
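The idea behind Hive's map-side (broadcast) join can be sketched in a few lines of Python: the small dimension table is loaded into an in-memory hash map on every mapper, so the large table can be joined as it streams past with no shuffle or reduce phase. The table names and data below are hypothetical:

```python
# Small dimension table fits in memory: each mapper holds it as a hash map,
# which is what Hive's map-side join does with the smaller join input.
departments = {1: "Engineering", 2: "Analytics"}  # illustrative data

# Large fact table is streamed row by row; each row is joined locally.
employees = [("alice", 1), ("bob", 2), ("carol", 1)]

joined = [(name, departments[dept_id])
          for name, dept_id in employees
          if dept_id in departments]
print(joined)
```

This is why map-side joins are only safe when one side is small enough to fit in each task's memory; otherwise Hive falls back to a common (shuffle) join.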

How Much Is the
Required Investment?

IDR 10.000.000/pax