Hadoop, MapReduce for Big Data Problems - Part 2

Location
Online
Dates
Can be taken anytime
Course Type
Professional Training Course
Accreditation
Yes
Language
English
Price
$10

Course Overview

This course covers Hadoop and MapReduce in depth and gives you hands-on experience with both. You'll learn how to set up your own cluster using both VMs and the cloud. All the major features of MapReduce are covered, including advanced topics like Total Sort and Secondary Sort. You will also learn how to build your own Hadoop cluster and customize your own MapReduce jobs. This course is a continuation of Part 1 of the Hadoop course.

Coupon code - WIISEGT

Who should take this course

This course is open to all learners.

Accreditation

WIISE

Course content

The course outline is as follows:

The Inverted Index, Custom Data Types for Keys, Bigram Counts and Unit Tests:

  • The Heart of Search Engines - The Inverted Index
  • Generating the Inverted Index using MapReduce
  • Custom Data Types for Keys - The Writable Interface
  • Represent a Bigram using a WritableComparable
  • MapReduce to Count the Bigrams in Input Text
  • Test your MapReduce job using MRUnit

Input and Output Formats and Customized Partitioning:

  • Introducing the File Input Format
  • Text And Sequence File Formats
  • Data Partitioning using a Custom Partitioner
  • Make the Custom Partitioner Real in Code
  • Total Order Partitioning
  • Input Sampling, Distribution, Partitioning
  • Secondary Sort

Collaborative Filtering:

  • Introduction to Collaborative Filtering
  • Friend Recommendations using Chained MR Jobs
  • The First MapReduce
  • The Second MapReduce

Hadoop as a Database:

  • Structured Data in Hadoop
  • Running an SQL Select with MapReduce
  • Running an SQL Group By with MapReduce
  • The Map Side
  • The Reduce Side
  • Sorting and Partitioning
  • Putting it all together

K-Means Clustering:

  • Overview
  • A MapReduce job for K-Means Clustering
  • The Distance Between Points
  • Custom Writables for Input / Output
  • Configuring the Job
  • The Mapper and Reducer
  • The Iterative MapReduce Job

Setting up a Hadoop Cluster:

  • Manually configuring a Hadoop cluster (Linux VMs)
  • Getting started with Amazon Web Services
  • Start a Hadoop Cluster with Cloudera Manager on AWS

Appendix:

  • Set up a Virtual Linux Instance (For Windows users)
  • Path and other Environment Variables

About Course Provider

WIISE is a 'Professional Learning Network' with global outreach that helps anyone learn anything to achieve personal and professional goals.

We bring top-rated interactive learning courses & certifications from across the world through respected Global Academic Institutes and Industry experts to our learners.

WIISE for Teams is a smart training solution for growing businesses (SMBs) that delivers cost-effective, on-demand online training, staff engagement, and upskilling to their employees and customers. WIISE incorporates the latest micro-learning and social-learning techniques to provide fast, engaging training at a fraction of the cost of traditional training methods.

WIISE is brought to you by PositiveShift Group, a respected learning services and skill development company based in Silicon Valley, CA, USA, and India (www.positiveshift.in). The company has been awarded a unique innovation partnership with the National Skill Development Corporation (NSDC) and the Ministry of Skill Development and Entrepreneurship, Govt. of India.
