Learning Path - Data Science With Apache Spark 2

Location
Online
Dates
Can be taken anytime
Course Type
Professional Training Course
Accreditation
Yes
Language
English
Price
$10

Course Overview

Spark is one of the most widely used large-scale data processing engines and is known for its speed. Its tools are equally useful to application developers and data scientists.

This Learning Path begins with an introduction to Apache Spark. We first cover the basics of Spark and introduce SparkR, then look at Python's charting and plotting features in conjunction with Spark data processing, and finally explore Spark's data processing libraries. We then develop a real-world Spark application. Next, we help you become comfortable and confident using Spark for data science by exploring Spark's data science libraries on a dataset of tweets.

Begin your journey into fast, large-scale, and distributed data processing using Spark with this Learning Path.

Basic knowledge:

  • Requires basic knowledge of either Python or R

Who should take this course

Application developers, data scientists, and big data architects who want to put the data processing power of Apache Spark to work will find this course very useful. Because the Spark implementations are shown in Scala and Python, some programming knowledge of these languages is needed. This course is for anyone who wants to work with Spark on large and complex datasets. Basic knowledge of statistics and computational mathematics is expected.

With the help of real-world use cases on the main features of Spark, this course offers an easy introduction to the framework. This practical, hands-on course covers the fundamentals of Spark needed to get to grips with data science through a single dataset. It is also the next step for those who are comfortable with Spark programming and want to apply Spark to data science.

Accreditation

Course Completion Certificate

Course content

What will you learn:

  • Get to know the fundamentals of Spark 2.0 and the Spark programming model using Scala and Python
  • Know how to use Spark SQL and DataFrames using Scala and Python
  • Get an introduction to Spark programming using R
  • Perform Spark data processing, charting, and plotting using Python
  • Get acquainted with Spark stream processing using Scala and Python
  • Be introduced to machine learning with Spark using Scala and Python
  • Get started with graph processing with Spark using Scala
  • Develop a complete Spark application
  • Understand the Spark programming model and its ecosystem of packages for data science
  • Obtain and clean data before processing it
  • Understand Spark's machine learning algorithms and build a simple pipeline
  • Work with interactive visualization packages in Spark
  • Apply data mining techniques on the available data sets
  • Build a recommendation engine

About Course Provider

Simpliv LLC is a platform for learning and teaching online courses. We focus on online learning that helps people learn business concepts and software technology, and pursue their personal and professional goals, through a video library created by recognized industry experts and trainers.

Why Simpliv

With industry trends constantly evolving, there is a constant need for professionally designed learning solutions that deliver key innovations on time and on budget to achieve long-term success.

Simpliv understands these changing needs and lets learners around the world evaluate their technical abilities by aligning what they learn with key business objectives, filling the skills gaps that exist in business areas including IT, Marketing, Business Development, and many more.
