BIG DATA HADOOP ADMIN SYLLABUS
1. Understanding Big Data and Hadoop
Introduction to big data, limitations of existing solutions
Hadoop architecture, Hadoop components and ecosystem
Data loading & reading from HDFS
Replication rules, rack awareness theory
Hadoop cluster administrator: roles and responsibilities
2. Hadoop Architecture and Cluster setup
Hadoop server roles and their usage
Hadoop installation and initial configuration
Deploying Hadoop in a pseudo-distributed mode
Deploying a multi-node Hadoop cluster
Installing Hadoop Clients
Understanding how HDFS works and resolving simulated problems.
3. Hadoop Cluster Administration & Understanding MapReduce
Understanding the Secondary NameNode
Working with Hadoop distributed cluster
Commissioning and decommissioning nodes
Understanding schedulers and enabling them.
4. Backup, Recovery and Maintenance
Common admin commands such as Balancer, Trash, and Import Checkpoint
DistCp, data backup and recovery
Enabling trash, setting name quotas and space quotas, manual failover, and metadata recovery.
5. Hadoop Cluster: Planning and Management
Planning the Hadoop cluster
Cluster sizing, hardware
Network and software considerations
Popular Hadoop distributions, workload and usage patterns.
6. Hadoop 2.0 and its features
Limitations of Hadoop 1.x
Features of Hadoop 2.0
YARN framework, MRv2
Hadoop high availability and federation
YARN ecosystem and Hadoop 2.0 cluster setup.
7. Setting up Hadoop 2.X with High Availability and upgrading Hadoop
Configuring Hadoop 2 with high availability
Upgrading to Hadoop 2
Working with Sqoop
Working with Hive
Working with HBase.
8. Understanding Cloudera Manager and cluster setup, overview of Kerberos
Hive administration, HBase architecture
HBase setup, Hadoop/Hive/HBase performance optimization
Cloudera Manager and cluster setup
Pig setup and working with the Grunt shell
Why Kerberos and how it helps