Tuesday, 11 June 2013

Executing the Hadoop samples

I am continuing to follow the Single Node Setup document.
First I copy the conf folder from the local file system to HDFS:
bin/hadoop fs -put conf input
The result can be checked on the
NameNode page - http://localhost:50070/
The files are listed in the /user/zvika/input directory.
Note that each file's block size is 64 MB, the owner is zvika, and the group is supergroup.
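
The same check can also be done from the command line instead of the NameNode page. This is only a small sketch: `fs -ls` is a standard Hadoop file system shell command, but the `HADOOP` variable and the `list_input` helper are names made up for illustration, and the default path assumes the shell is sitting in the Hadoop install directory.

```shell
#!/bin/sh
# Path to the Hadoop launcher. bin/hadoop assumes the current directory is the
# Hadoop install directory, as in the commands in this post.
HADOOP=${HADOOP:-bin/hadoop}

# List the uploaded files. A relative HDFS path such as "input" resolves
# under the user's HDFS home directory (/user/zvika here).
list_input() {
    "$HADOOP" fs -ls "${1:-input}"
}

# Only call Hadoop when the launcher is actually present.
if [ -x "$HADOOP" ]; then
    list_input input
fi
```

The listing should show the same owner and group columns as the NameNode page.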

I run the grep example with:
 zvika@ubuntu:~/myStaff/Hadoop/hadoop-1.1.2$ bin/hadoop jar hadoop-examples-*.jar grep input output 'dfs[a-z.]+'
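
To get a feel for what this example computes, here is a rough local equivalent in plain shell. The Hadoop grep example runs two chained jobs: the first counts how often each string matching the regular expression occurs in the input, and the second sorts those counts in descending order. The sample files and their contents below are invented for illustration; only the regular expression comes from the command above. It also assumes a grep that supports `-o` (GNU and BSD grep do).

```shell
#!/bin/sh
# Local sketch of what the two chained jobs compute (not how Hadoop runs them).
dir=$(mktemp -d)

# Two made-up input files standing in for the conf folder.
printf 'dfs.replication\ndfs.name.dir dfs.replication\n' > "$dir/a.xml"
printf 'dfsadmin and dfs.replication\n' > "$dir/b.xml"

# Job 1 equivalent: emit every regex match, then count occurrences per match.
# Job 2 equivalent: sort the (count, match) pairs by count, descending.
result=$(cat "$dir"/* \
    | grep -oE 'dfs[a-z.]+' \
    | sort | uniq -c \
    | sort -rn \
    | awk '{print $1 "\t" $2}')

echo "$result"
rm -rf "$dir"
```

With this made-up input the first output line is the count 3 followed by dfs.replication; the two matches that occur once each follow it.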

The job prints a very long log:

13/06/11 21:51:08 INFO util.NativeCodeLoader: Loaded the native-hadoop library
13/06/11 21:51:08 WARN snappy.LoadSnappy: Snappy native library not loaded
13/06/11 21:51:08 INFO mapred.FileInputFormat: Total input paths to process : 20
13/06/11 21:51:09 INFO mapred.JobClient: Running job: job_201306112139_0001
13/06/11 21:51:10 INFO mapred.JobClient:  map 0% reduce 0%
13/06/11 21:51:14 INFO mapred.JobClient:  map 10% reduce 0%
13/06/11 21:51:16 INFO mapred.JobClient:  map 15% reduce 0%
13/06/11 21:51:17 INFO mapred.JobClient:  map 20% reduce 0%
13/06/11 21:51:18 INFO mapred.JobClient:  map 25% reduce 0%
13/06/11 21:51:19 INFO mapred.JobClient:  map 30% reduce 0%
13/06/11 21:51:20 INFO mapred.JobClient:  map 40% reduce 0%
13/06/11 21:51:21 INFO mapred.JobClient:  map 45% reduce 0%
13/06/11 21:51:22 INFO mapred.JobClient:  map 50% reduce 0%
13/06/11 21:51:23 INFO mapred.JobClient:  map 55% reduce 13%
13/06/11 21:51:24 INFO mapred.JobClient:  map 60% reduce 13%
13/06/11 21:51:25 INFO mapred.JobClient:  map 65% reduce 13%
13/06/11 21:51:26 INFO mapred.JobClient:  map 70% reduce 13%
13/06/11 21:51:27 INFO mapred.JobClient:  map 80% reduce 13%
13/06/11 21:51:28 INFO mapred.JobClient:  map 85% reduce 13%
13/06/11 21:51:29 INFO mapred.JobClient:  map 90% reduce 13%
13/06/11 21:51:30 INFO mapred.JobClient:  map 100% reduce 13%
13/06/11 21:51:32 INFO mapred.JobClient:  map 100% reduce 23%
13/06/11 21:51:34 INFO mapred.JobClient:  map 100% reduce 100%
13/06/11 21:51:34 INFO mapred.JobClient: Job complete: job_201306112139_0001
13/06/11 21:51:34 INFO mapred.JobClient: Counters: 30
13/06/11 21:51:34 INFO mapred.JobClient:   Job Counters
13/06/11 21:51:34 INFO mapred.JobClient:     Launched reduce tasks=1
13/06/11 21:51:34 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=33979
13/06/11 21:51:34 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
13/06/11 21:51:34 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
13/06/11 21:51:34 INFO mapred.JobClient:     Launched map tasks=20
13/06/11 21:51:34 INFO mapred.JobClient:     Data-local map tasks=20
13/06/11 21:51:34 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=19571
13/06/11 21:51:34 INFO mapred.JobClient:   File Input Format Counters
13/06/11 21:51:34 INFO mapred.JobClient:     Bytes Read=29676
13/06/11 21:51:34 INFO mapred.JobClient:   File Output Format Counters
13/06/11 21:51:34 INFO mapred.JobClient:     Bytes Written=180
13/06/11 21:51:34 INFO mapred.JobClient:   FileSystemCounters
13/06/11 21:51:34 INFO mapred.JobClient:     FILE_BYTES_READ=82
13/06/11 21:51:34 INFO mapred.JobClient:     HDFS_BYTES_READ=31840
13/06/11 21:51:34 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=1081971
13/06/11 21:51:34 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=180
13/06/11 21:51:34 INFO mapred.JobClient:   Map-Reduce Framework
13/06/11 21:51:34 INFO mapred.JobClient:     Map output materialized bytes=196
13/06/11 21:51:34 INFO mapred.JobClient:     Map input records=843
13/06/11 21:51:34 INFO mapred.JobClient:     Reduce shuffle bytes=196
13/06/11 21:51:34 INFO mapred.JobClient:     Spilled Records=6
13/06/11 21:51:34 INFO mapred.JobClient:     Map output bytes=70
13/06/11 21:51:34 INFO mapred.JobClient:     Total committed heap usage (bytes)=3346661376
13/06/11 21:51:34 INFO mapred.JobClient:     CPU time spent (ms)=5060
13/06/11 21:51:34 INFO mapred.JobClient:     Map input bytes=29676
13/06/11 21:51:34 INFO mapred.JobClient:     SPLIT_RAW_BYTES=2164
13/06/11 21:51:34 INFO mapred.JobClient:     Combine input records=3
13/06/11 21:51:34 INFO mapred.JobClient:     Reduce input records=3
13/06/11 21:51:34 INFO mapred.JobClient:     Reduce input groups=3
13/06/11 21:51:34 INFO mapred.JobClient:     Combine output records=3
13/06/11 21:51:34 INFO mapred.JobClient:     Physical memory (bytes) snapshot=4006256640
13/06/11 21:51:34 INFO mapred.JobClient:     Reduce output records=3
13/06/11 21:51:34 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=22421676032
13/06/11 21:51:34 INFO mapred.JobClient:     Map output records=3
13/06/11 21:51:34 INFO mapred.FileInputFormat: Total input paths to process : 1
13/06/11 21:51:34 INFO mapred.JobClient: Running job: job_201306112139_0002
13/06/11 21:51:35 INFO mapred.JobClient:  map 0% reduce 0%
13/06/11 21:51:38 INFO mapred.JobClient:  map 100% reduce 0%
13/06/11 21:51:45 INFO mapred.JobClient:  map 100% reduce 33%
13/06/11 21:51:47 INFO mapred.JobClient:  map 100% reduce 100%
13/06/11 21:51:47 INFO mapred.JobClient: Job complete: job_201306112139_0002
13/06/11 21:51:47 INFO mapred.JobClient: Counters: 30
13/06/11 21:51:47 INFO mapred.JobClient:   Job Counters
13/06/11 21:51:47 INFO mapred.JobClient:     Launched reduce tasks=1
13/06/11 21:51:47 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=3134
13/06/11 21:51:47 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
13/06/11 21:51:47 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
13/06/11 21:51:47 INFO mapred.JobClient:     Launched map tasks=1
13/06/11 21:51:47 INFO mapred.JobClient:     Data-local map tasks=1
13/06/11 21:51:47 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=8441
13/06/11 21:51:47 INFO mapred.JobClient:   File Input Format Counters
13/06/11 21:51:47 INFO mapred.JobClient:     Bytes Read=180
13/06/11 21:51:47 INFO mapred.JobClient:   File Output Format Counters
13/06/11 21:51:47 INFO mapred.JobClient:     Bytes Written=52
13/06/11 21:51:47 INFO mapred.JobClient:   FileSystemCounters
13/06/11 21:51:47 INFO mapred.JobClient:     FILE_BYTES_READ=82
13/06/11 21:51:47 INFO mapred.JobClient:     HDFS_BYTES_READ=297
13/06/11 21:51:47 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=101471
13/06/11 21:51:47 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=52
13/06/11 21:51:47 INFO mapred.JobClient:   Map-Reduce Framework
13/06/11 21:51:47 INFO mapred.JobClient:     Map output materialized bytes=82
13/06/11 21:51:47 INFO mapred.JobClient:     Map input records=3
13/06/11 21:51:47 INFO mapred.JobClient:     Reduce shuffle bytes=82
13/06/11 21:51:47 INFO mapred.JobClient:     Spilled Records=6
13/06/11 21:51:47 INFO mapred.JobClient:     Map output bytes=70
13/06/11 21:51:47 INFO mapred.JobClient:     Total committed heap usage (bytes)=220528640
13/06/11 21:51:47 INFO mapred.JobClient:     CPU time spent (ms)=790
13/06/11 21:51:47 INFO mapred.JobClient:     Map input bytes=94
13/06/11 21:51:47 INFO mapred.JobClient:     SPLIT_RAW_BYTES=117
13/06/11 21:51:47 INFO mapred.JobClient:     Combine input records=0
13/06/11 21:51:47 INFO mapred.JobClient:     Reduce input records=3
13/06/11 21:51:47 INFO mapred.JobClient:     Reduce input groups=1
13/06/11 21:51:47 INFO mapred.JobClient:     Combine output records=0
13/06/11 21:51:47 INFO mapred.JobClient:     Physical memory (bytes) snapshot=296771584
13/06/11 21:51:47 INFO mapred.JobClient:     Reduce output records=3
13/06/11 21:51:47 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=2142064640
13/06/11 21:51:47 INFO mapred.JobClient:     Map output records=3
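
Most of this log is progress lines; the counters are the interesting part. Below is a minimal sketch for pulling just the `name=value` counter lines out of a saved log. It assumes the console output was saved to a file; here a temporary file with four lines copied from the log above stands in for it.

```shell
#!/bin/sh
# Stand-in for a saved job log; the lines are copied from the output above.
log=$(mktemp)
cat > "$log" <<'EOF'
13/06/11 21:51:34 INFO mapred.JobClient:  map 100% reduce 100%
13/06/11 21:51:34 INFO mapred.JobClient:     Launched map tasks=20
13/06/11 21:51:34 INFO mapred.JobClient:     Map input records=843
13/06/11 21:51:34 INFO mapred.JobClient:     Map output records=3
13/06/11 21:51:47 INFO mapred.JobClient:     Launched map tasks=1
EOF

# Keep only counter lines and strip the timestamp and logger prefix,
# leaving "name=value". Progress lines have no "=" and are dropped.
counters=$(sed -n \
    's/^.*mapred\.JobClient: *\([^ =][^=]*=[0-9][0-9]*\)$/\1/p' "$log")

echo "$counters"
# Launched map tasks=20
# Map input records=843
# Map output records=3
# Launched map tasks=1
rm -f "$log"
```

The two "Launched map tasks" values show the difference between the two jobs: 20 map tasks for the first job (one per input file) and a single map task for the second.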


To watch the Hadoop job being processed, open http://ubuntu:50060/tasktracker.jsp and refresh the page while the job is running.
