Hadoop: The Definitive Guide Reading Notes

Hadoop: The Definitive Guide, Fourth Edition: http://shop.oreilly.com/product/0636920033448.do
Code and Data: http://hadoopbook.com/code.html
Download the NCDC weather dataset: https://gist.github.com/rehevkor5/2e407950ca687b36fc54
Building and Running:

$ git clone https://github.com/tomwhite/hadoop-book.git
$ cd hadoop-book
# do a full build and create example JAR files in the top-level directory
$ mvn package -DskipTests
# run an example
$ export HADOOP_CLASSPATH=hadoop-examples.jar
$ hadoop MaxTemperature /hadoop-book/ncdc/ output1
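
Once the job finishes, the result can be inspected with the `hadoop fs` shell. The following is a minimal sketch rather than part of the book's instructions. It assumes the job wrote to `output1` on the default filesystem, that the job left the usual `_SUCCESS` marker file, and that the reducer output follows the `part-r-NNNNN` naming of the new MapReduce API. The helper name `show_job_output` is made up for illustration.

```shell
# Hypothetical helper: print the reducer output of a finished job.
# $1 is the job output directory (defaults to output1, as used above).
show_job_output() {
  out_dir="${1:-output1}"
  # _SUCCESS is an empty marker file that MapReduce writes by default
  # when a job completes successfully.
  if hadoop fs -test -e "$out_dir/_SUCCESS"; then
    # The glob is quoted so that Hadoop, not the local shell, expands it.
    hadoop fs -cat "$out_dir/part-r-*"
  else
    echo "no _SUCCESS marker in $out_dir; did the job finish?" >&2
    return 1
  fi
}
```

Calling `show_job_output output1` after the run above should print one line per key emitted by the reducer, provided those assumptions hold for your setup.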

Reading Notes


Skipped sections

  • Chapter 2 MapReduce
  • Chapter 3 HDFS
    • HDFS Federation
    • HDFS High Availability
    • The Java Interface
    • Data Flow, Coherency Model