Hadoop Training

Hadoop Online Training

Hadoop is a free, Java-based programming framework that supports the processing of large data sets in a distributed computing environment. It is part of the Apache project sponsored by the Apache Software Foundation. Hadoop makes it possible to run applications on systems with thousands of nodes and thousands of terabytes of data. Its distributed file system provides rapid data transfer rates among nodes and allows the system to keep operating when a node fails, which lowers the risk of catastrophic system failure even if a significant number of nodes become inoperative. PsnTrainings provides the best Hadoop online training by real-time experts.

Introduction

  • Introduction to Hadoop
  • History of Hadoop
  • Building Blocks - The Hadoop Ecosystem
  • Who is Behind Hadoop
  • What Hadoop is good for and what it is not

HDFS

  • Configuring HDFS
  • Interacting With HDFS
  • HDFS Permissions and Security
  • Additional HDFS Tasks
  • HDFS Overview and Architecture
  • HDFS Installation
  • Hadoop File System Shell
  • File System Java API (a short sketch follows this list)
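
A minimal sketch of the File System Java API covered in this module is given below; the NameNode address and the paths are assumptions made up for illustration, not course material.

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class HdfsApiExample {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            conf.set("fs.defaultFS", "hdfs://localhost:9000"); // assumed NameNode address
            FileSystem fs = FileSystem.get(conf);

            Path dir = new Path("/user/training/demo");
            fs.mkdirs(dir);                                    // create a directory
            fs.copyFromLocalFile(new Path("notes.txt"),        // upload a local file
                                 new Path(dir, "notes.txt"));

            // Read the file back line by line
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(fs.open(new Path(dir, "notes.txt"))))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    System.out.println(line);
                }
            }
            fs.close();
        }
    }

The Hadoop File System Shell topic covers the command-line equivalents of these calls (hadoop fs -mkdir, -put and -cat).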

MAPREDUCE

  • Map/Reduce Overview and Architecture
  • Installation
  • Developing Map/Reduce Jobs
  • Input and Output Formats
  • Job Configuration
  • Job Submission
  • Practicing MapReduce Programs (at least 10 MapReduce algorithms; see the word-count sketch after this list)
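
As a preview of the programs practiced in this module, here is a minimal word-count sketch against the standard MapReduce Java API; input and output paths are supplied on the command line.

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCount {

        // Mapper: emit (word, 1) for every word in the input line
        public static class TokenMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
            private static final IntWritable ONE = new IntWritable(1);
            private final Text word = new Text();

            @Override
            protected void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                for (String token : value.toString().split("\\s+")) {
                    if (!token.isEmpty()) {
                        word.set(token);
                        context.write(word, ONE);
                    }
                }
            }
        }

        // Reducer: sum the counts collected for each word
        public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
            @Override
            protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                    throws IOException, InterruptedException {
                int sum = 0;
                for (IntWritable value : values) {
                    sum += value.get();
                }
                context.write(key, new IntWritable(sum));
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "word count");
            job.setJarByClass(WordCount.class);
            job.setMapperClass(TokenMapper.class);
            job.setCombinerClass(SumReducer.class);
            job.setReducerClass(SumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }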

Getting Started With Eclipse IDE

  • Map/Reduce Overview and Architecture
  • Configuring Hadoop API on Eclipse IDE
  • Connecting Eclipse IDE to HDFS

Hadoop Streaming

Advanced MapReduce Features

  • Custom Data Types (see the Writable sketch after this list)
  • Input Formats
  • Output Formats
  • Partitioning Data
  • Reporting Custom Metrics
  • Distributing Auxiliary Job Data
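
For the Custom Data Types topic, the sketch below shows a hand-rolled Writable value type; the PageView class and its fields are hypothetical.

    import java.io.DataInput;
    import java.io.DataOutput;
    import java.io.IOException;
    import org.apache.hadoop.io.Writable;

    // A custom value type that MapReduce can serialize between map and reduce tasks.
    public class PageView implements Writable {
        private String url;   // illustrative field
        private long hits;    // illustrative field

        public PageView() { } // Hadoop requires a no-arg constructor for deserialization

        public PageView(String url, long hits) {
            this.url = url;
            this.hits = hits;
        }

        @Override
        public void write(DataOutput out) throws IOException {
            out.writeUTF(url);
            out.writeLong(hits);
        }

        @Override
        public void readFields(DataInput in) throws IOException {
            url = in.readUTF();
            hits = in.readLong();
        }

        public String getUrl() { return url; }
        public long getHits()  { return hits; }
    }

Types used as keys must additionally implement WritableComparable so that MapReduce can sort them during the shuffle.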

Distributing Debug Scripts

Using Yahoo Web Services

Pig

  • Pig Overview
  • Installation
  • Pig Latin (see the sketch after this list)
  • Pig with HDFS
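
As a sketch of Pig Latin (normally typed into the Grunt shell), the example below embeds the same statements in Java through the PigServer API; the input file and its schema are assumptions made up for illustration.

    import org.apache.pig.ExecType;
    import org.apache.pig.PigServer;

    public class PigSketch {
        public static void main(String[] args) throws Exception {
            // Local mode for experimentation; use ExecType.MAPREDUCE against a cluster
            PigServer pig = new PigServer(ExecType.LOCAL);

            // Pig Latin statements registered as strings (file and schema are assumed)
            pig.registerQuery("logs = LOAD 'access_log.txt' AS (user:chararray, bytes:long);");
            pig.registerQuery("grouped = GROUP logs BY user;");
            pig.registerQuery("totals = FOREACH grouped GENERATE group, SUM(logs.bytes);");

            // Write the result out (to HDFS when running in MAPREDUCE mode)
            pig.store("totals", "user_totals");
        }
    }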

Hive

  • Hive Overview
  • Installation
  • HiveQL (see the JDBC sketch after this list)
  • Analyzing Unstructured Data with Hive
  • Analyzing Semi-structured Data with Hive
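
The HiveQL topic can be previewed with the JDBC sketch below; it assumes a running HiveServer2 at localhost:10000 and a hypothetical weblogs table, both purely illustrative.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class HiveQlSketch {
        public static void main(String[] args) throws Exception {
            Class.forName("org.apache.hive.jdbc.HiveDriver");

            // HiveServer2 JDBC endpoint (address, database and credentials are assumptions)
            Connection conn = DriverManager.getConnection(
                    "jdbc:hive2://localhost:10000/default", "hive", "");
            Statement stmt = conn.createStatement();

            // Hypothetical table over raw, tab-delimited log files stored in HDFS
            stmt.execute("CREATE TABLE IF NOT EXISTS weblogs (ip STRING, url STRING, bytes BIGINT) "
                       + "ROW FORMAT DELIMITED FIELDS TERMINATED BY '\\t'");

            // HiveQL query that Hive compiles down to MapReduce jobs
            ResultSet rs = stmt.executeQuery(
                    "SELECT url, SUM(bytes) AS total FROM weblogs GROUP BY url");
            while (rs.next()) {
                System.out.println(rs.getString(1) + "\t" + rs.getLong(2));
            }
            conn.close();
        }
    }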

HBase

  • HBase Overview and Architecture
  • HBase Installation
  • HBase Shell
  • CRUD Operations (see the Java client sketch after this list)
  • Scanning and Batching
  • Filters
  • HBase Key Design
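
The CRUD operations topic maps onto the HBase Java client sketch below; the users table and the info column family are assumptions and would need to be created first (for example from the HBase shell).

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Delete;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    public class HBaseCrudSketch {
        public static void main(String[] args) throws Exception {
            try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
                 Table table = conn.getTable(TableName.valueOf("users"))) {

                // Create / update: write one cell in the 'info' column family
                Put put = new Put(Bytes.toBytes("row1"));
                put.addColumn(Bytes.toBytes("info"), Bytes.toBytes("name"), Bytes.toBytes("Ravi"));
                table.put(put);

                // Read: fetch the row back and extract the cell value
                Result result = table.get(new Get(Bytes.toBytes("row1")));
                System.out.println(Bytes.toString(
                        result.getValue(Bytes.toBytes("info"), Bytes.toBytes("name"))));

                // Delete: remove the row
                table.delete(new Delete(Bytes.toBytes("row1")));
            }
        }
    }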

ZooKeeper

  • ZooKeeper Overview (a client sketch follows this list)
  • Installation
  • Server Maintenance
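
A bare-bones client sketch for the overview topic, assuming a ZooKeeper server at localhost:2181; the znode path and data are made up for illustration.

    import org.apache.zookeeper.CreateMode;
    import org.apache.zookeeper.ZooDefs;
    import org.apache.zookeeper.ZooKeeper;

    public class ZooKeeperSketch {
        public static void main(String[] args) throws Exception {
            // Connect to a local ensemble member (address and timeout are assumptions).
            // A production client would wait for the SyncConnected event before issuing
            // requests; that handshake is omitted here to keep the sketch short.
            ZooKeeper zk = new ZooKeeper("localhost:2181", 5000, event -> { });

            // Create a persistent znode and read its data back
            String path = "/training-demo";
            if (zk.exists(path, false) == null) {
                zk.create(path, "hello".getBytes(),
                          ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
            }
            byte[] data = zk.getData(path, false, null);
            System.out.println(new String(data));

            zk.close();
        }
    }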

Sqoop

  • Sqoop Overview
  • Installation
  • Imports and Exports (see the import sketch after this list)
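
As a hedged illustration of the imports topic, the sketch below drives Sqoop's import tool from Java; the JDBC URL, credentials and table name are hypothetical, and the same import is normally run from the command line with sqoop import.

    import org.apache.sqoop.Sqoop;

    public class SqoopImportSketch {
        public static void main(String[] args) throws Exception {
            // Equivalent to: sqoop import --connect ... --table employees --target-dir ...
            String[] importArgs = {
                "import",
                "--connect", "jdbc:mysql://dbhost/corp",    // hypothetical source database
                "--username", "training",
                "--password", "secret",
                "--table", "employees",                     // hypothetical table
                "--target-dir", "/user/training/employees"  // HDFS destination
            };
            int exitCode = Sqoop.runTool(importArgs);
            System.exit(exitCode);
        }
    }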

CONFIGURATION

  • Basic Setup (see the configuration sketch after this list)
  • Important Directories
  • Selecting Machines
  • Cluster Configurations
  • Small Clusters: 2-10 Nodes
  • Medium Clusters: 10-40 Nodes
  • Large Clusters: Multiple Racks
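
Basic setup largely amounts to a few properties in core-site.xml and hdfs-site.xml; the sketch below sets common ones programmatically, with hostnames and values that are assumptions for illustration.

    import org.apache.hadoop.conf.Configuration;

    public class ClusterConfigSketch {
        public static void main(String[] args) {
            Configuration conf = new Configuration();

            // Normally placed in core-site.xml: where the NameNode lives
            conf.set("fs.defaultFS", "hdfs://namenode.example.com:9000");

            // Normally placed in hdfs-site.xml: replication factor and DataNode storage directories
            conf.set("dfs.replication", "3");
            conf.set("dfs.datanode.data.dir", "/data/1/dfs,/data/2/dfs");

            // Print the effective value to confirm which resource supplied it
            System.out.println("fs.defaultFS = " + conf.get("fs.defaultFS"));
        }
    }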

Integrations

Putting it all together

  • Distributed installations
  • Best Practices