Location
Online
Dates
Can be taken anytime
Course Type
Professional Training Course
Accreditation
Yes
Language
English
Price
$10

Course Overview

The world of Hadoop and Big Data has hundreds of different technologies with cryptic names that form the Hadoop ecosystem. With this course, you'll not only understand what those systems are and how they fit together, but you'll also go hands-on and learn how to use them to solve real business problems. This course is comprehensive, covering over 25 different technologies in over 14 hours of video lectures. It's filled with hands-on activities and exercises, so you get real experience in using Hadoop; it's not just theory. By the end of this course, you can expect to leave with a real, deep understanding of Hadoop and its associated distributed systems, and to be able to apply Hadoop to real-world problems.

Coupon code - WIISEGT

Who should take this course

This course is open to all learners.

Accreditation

WIISE

Course content

The course outline is as follows:

  • Introduction - Introduction. Hadoop Overview. Overview of the Hadoop Ecosystem. Tips and Tricks.

  • Using Hadoop's Core: HDFS and MapReduce - HDFS Overview. Install the MovieLens dataset into HDFS using the Ambari UI. Install the MovieLens dataset into HDFS using the command line. MapReduce Overview. MapReduce distributes processing. MapReduce example: Break down movie ratings by rating score. Installing Python, MRJob, and nano. Coding up the ratings histogram MapReduce job. Rank movies by their popularity. Check your results against mine.

  • Programming Hadoop with Pig - Introducing Ambari. Introducing Pig. Find the oldest movie with a 5-star rating using Pig. Find old 5-star movies with Pig. More Pig Latin. Find the most-rated one-star movie. Pig Challenge: Compare Your Results to Mine.

  • Programming Hadoop with Spark - Spark Overview. The Resilient Distributed Dataset (RDD). Find the movie with the lowest average rating - with RDDs. Datasets and Spark 2.0. Finding the movie with the lowest average rating. Movie recommendations with MLlib. Filter the lowest-rated movies by number of ratings. Check your results against mine.

  • Usage of relational data stores with Hadoop - What is Hive? Use Hive to find the most popular movie. How Hive works. Use Hive to find the movie with the highest average rating. Compare Solutions. Integrating MySQL with Hadoop. Install MySQL and import our movie data. Use Sqoop to import data from MySQL to HDFS/Hive. Use Sqoop to export data from Hadoop to MySQL.

  • Usage of non-relational data stores with Hadoop - Why NoSQL? What is HBase? Import movie ratings into HBase. Use HBase with Pig to import data at scale. Cassandra overview. Installing Cassandra. Write Spark output into Cassandra. MongoDB overview. Install MongoDB, and integrate Spark with MongoDB. Using the MongoDB shell. Choosing a database technology. Choose a database for a given problem.

  • Querying your Data Interactively - Overview of Drill. Setting up Drill. Querying across multiple databases with Drill. Overview of Phoenix. Install Phoenix and query HBase with it. Integrate Phoenix with Pig. Overview of Presto. Install Presto and query Hive with it. Query both Cassandra and Hive using Presto.

  • Managing your Cluster - YARN explained. Tez explained. Use Hive on Tez and measure the performance benefit. Mesos explained. ZooKeeper explained. Simulating a failing master with ZooKeeper. Oozie explained. Set up a simple Oozie workflow. Zeppelin overview. Use Zeppelin to analyze movie ratings: Part 1. Use Zeppelin to analyze movie ratings: Part 2. Hue overview. Other technologies worth mentioning.

  • Feeding Data to your Cluster - Kafka explained. Setting up Kafka and publishing some data. Publishing web logs with Kafka. Flume explained. Set up Flume and publish logs with it. Set up Flume to monitor a directory and store its data in HDFS.

  • Analyzing Streams of Data - Spark Streaming: Introduction. Analyze web logs published with Flume using Spark Streaming. Monitor Flume-published logs for errors in real time. Exercise solution: Aggregating HTTP access codes with Spark Streaming. Apache Storm: Introduction. Count words with Storm. Flink: An Overview. Counting words with Flink.

  • Designing Real-World Systems - The Best of the Rest. Review: How the pieces fit together. Understanding your requirements. Sample application: consume webserver logs and keep track of top-sellers. Sample application: serving movie recommendations to a website. Design a system to report web sessions per day. Exercise solution: Design a system to count daily sessions.

  • BONUS - Books and online resources. Bonus lecture: Discounts on my other big data / data science courses!

About Course Provider

WIISE is a 'Professional Learning Network' with global reach that helps anyone learn anything to achieve personal and professional goals.

We bring top-rated interactive learning courses & certifications from across the world through respected Global Academic Institutes and Industry experts to our learners.

WIISE for Teams is a smart training solution for growing businesses (SMBs) that delivers cost-effective, on-demand online training, staff engagement, and upskilling to their employees and customers. WIISE incorporates the latest micro-learning and social-learning techniques, which provide fast and engaging training at a fraction of the cost of traditional training methods.

WIISE is brought to you by PositiveShift Group, a respected learning services and skill development company based in Silicon Valley, CA, USA, and India (www.positiveshift.in). The company has been awarded a unique innovation partnership with the National Skill Development Corporation (NSDC) and the Ministry of Skill Development and Entrepreneurship, Govt of India.
