Hadoop Ecosystem Essentials
Develop essential data analyst skills as you delve into the Hadoop ecosystem and learn how to handle large amounts of data with this online course from Packt.
Duration
4 weeks
Weekly study
2 hours
100% online
For data analysts, Hadoop is an extremely powerful tool for processing large amounts of data; it grew out of Google’s MapReduce and GFS papers and is used at scale by companies such as Spotify.
On this four-week course, you’ll learn how to use Hadoop to its full potential, making it easier to store, analyse, and scale big data.
Through step-by-step guides and exercises, you’ll gain the knowledge and practical skills to take into your role in data analytics.
You’ll understand how to manage clusters with Yet Another Resource Negotiator (YARN), Mesos, ZooKeeper, Oozie, Zeppelin, and Hue.
With this knowledge, you’ll be able to ensure high performance, workload management, security, and more.
Next, you’ll uncover the techniques to handle and stream data in real-time using Kafka, Flume, Spark Streaming, Flink, and Storm.
This understanding will help you detect and respond quickly to any issues that arise in your data pipelines.
Finally, you’ll learn how to design real-world systems using the Hadoop ecosystem to ensure you can use your skills in practice.
By the end of the course, you’ll have the knowledge to handle large amounts of data using Hadoop.
Welcome to Hadoop Ecosystem Essentials and the start of your learning journey, brought to you by Packt.
In this activity, we will discuss how to install and use Apache Drill to query across multiple databases.
In this activity, we will describe Apache Phoenix, how to install the SQL driver, and how to integrate Phoenix with Pig.
In this activity, we will discuss Presto, a query engine developed by Facebook.
You have reached the end of Week 1. In this activity, you'll reflect on what you have learned.
Welcome to Week 2. In this activity, we'll highlight the main topics that will be covered this week.
In this activity, we will discuss different technologies for managing resources in your cluster.
In this activity, we will describe technologies for managing clusters and tasks.
In this activity, we will discuss Hue, a web-based interface that competes with Hortonworks' Ambari, as well as some older Hadoop tools.
You have reached the end of Week 2. In this activity, you'll reflect on what you have learned.
Welcome to Week 3. In this activity, we'll highlight the main topics that will be covered this week.
In this activity, we will describe how Kafka provides a scalable and reliable means of collecting data across a cluster of computers and broadcasting it for further processing.
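Kafka's core abstraction can be pictured before the activity with a toy model in plain Python: an append-only log of records that independent consumer groups read at their own pace via per-group offsets. This is a conceptual sketch only (the class name `TopicLog` and its methods are illustrative inventions), not the actual Kafka API.

```python
class TopicLog:
    """Toy model of a Kafka topic: an append-only log that many
    consumer groups can read independently via their own offsets."""

    def __init__(self):
        self.log = []       # ordered sequence of records (never rewritten)
        self.offsets = {}   # consumer group -> next position to read

    def produce(self, record):
        self.log.append(record)  # producers only ever append

    def consume(self, group, max_records=10):
        start = self.offsets.get(group, 0)
        batch = self.log[start:start + max_records]
        self.offsets[group] = start + len(batch)  # commit the new offset
        return batch


topic = TopicLog()
for event in ["click", "view", "purchase"]:
    topic.produce(event)

# Two consumer groups read the same records independently.
print(topic.consume("analytics"))  # ['click', 'view', 'purchase']
print(topic.consume("alerts", 2))  # ['click', 'view']
print(topic.consume("alerts", 2))  # ['purchase']
```

Because consumers track their own offsets against a shared, durable log, one stream of events can feed many downstream systems without the producer knowing about any of them.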
In this activity, we will discuss another way to stream data using Apache Flume.
In this activity, we will discuss using Spark Streaming for processing continuous streams of data in real-time.
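Spark Streaming's classic approach chops a continuous stream into small batches and carries state across them. The micro-batch idea can be sketched in plain Python (this is an illustration of the model, not the Spark API; the function names are hypothetical):

```python
from collections import Counter

def micro_batches(stream, batch_size):
    """Split an incoming stream into small fixed-size batches,
    mimicking the micro-batch model used by Spark Streaming."""
    batch = []
    for record in stream:
        batch.append(record)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # flush any trailing partial batch

def running_word_count(stream, batch_size=2):
    """Maintain a running word count, updated once per batch."""
    totals = Counter()  # state carried across batches
    for batch in micro_batches(stream, batch_size):
        totals.update(word for line in batch for word in line.split())
    return totals

counts = running_word_count(["big data", "data streams", "streams of data"])
print(counts["data"])  # 3
```

The key point is that each batch is processed with ordinary batch logic while the running totals persist between batches, which is how near-real-time results emerge from a batch engine.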
In this activity, we will describe streaming with Apache Storm, another tool for real-time data processing.
In this activity, we will explore the Flink stream processing engine.
You have reached the end of Week 3. In this activity, you'll reflect on what you have learned.
Welcome to Week 4. In this activity, we'll highlight the main topics that will be covered this week.
In this activity, we will discuss how to fit various systems together to design an architecture that solves real-world business problems.
You have reached the end of Week 4. In this activity, you'll reflect on what you have learned.