Big Data | Hadoop

Hadoop is a software framework for storing data and running applications on clusters of commodity hardware. In this course, students learn the characteristics of big data, the need for a framework such as Hadoop, and the Hadoop ecosystem.

Course Content

  • Big Data & Hadoop Introduction

  • HDFS - Hadoop Distributed File System & Hadoop Distributions

  • Hadoop Cluster Setup & Working with Hadoop Cluster

  • Hadoop Configurations & Daemon Logs

  • Hadoop Cluster Maintenance & Administration

  • Hadoop Computational Frameworks

  • Scheduling - Managing Resources via Schedulers

  • Hadoop Cluster Planning

  • Hadoop Clients & HUE Interface

  • Data Ingestion in Hadoop Cluster

  • Hadoop Ecosystem Components & Services

  • Hadoop Security—Securing Hadoop Cluster

  • Cluster Monitoring—Monitoring Hadoop Cluster

Course Duration

3 Months


Prerequisites

Knowledge of Linux

Basic Java programming principles