Running the Hadoop WordCount Example on Linux
Entered by: 李隆权    Editor: quan    Author: anonymous    Source: linuxidc

On Ubuntu, open a terminal with Ctrl + Alt + T.

Start Hadoop with start-all.sh.

A normal start produces output like the following:

hadoop@HADOOP:~$ start-all.sh

Warning: $HADOOP_HOME is deprecated.

 

starting namenode, logging to /home/hadoop/hadoop-1.1.2/libexec/../logs/hadoop-hadoop-namenode-HADOOP.MAIN.out

HADOOP.MAIN: starting datanode, logging to /home/hadoop/hadoop-1.1.2/libexec/../logs/hadoop-hadoop-datanode-HADOOP.MAIN.out

HADOOP.MAIN: starting secondarynamenode, logging to /home/hadoop/hadoop-1.1.2/libexec/../logs/hadoop-hadoop-secondarynamenode-HADOOP.MAIN.out

starting jobtracker, logging to /home/hadoop/hadoop-1.1.2/libexec/../logs/hadoop-hadoop-jobtracker-HADOOP.MAIN.out

HADOOP.MAIN: starting tasktracker, logging to /home/hadoop/hadoop-1.1.2/libexec/../logs/hadoop-hadoop-tasktracker-HADOOP.MAIN.out

 

Use the jps command to check which Hadoop services have started:

hadoop@HADOOP:~$ jps

3615 Jps

2699 NameNode

3461 TaskTracker

2922 DataNode

3137 SecondaryNameNode

3231 JobTracker
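
Checking the jps listing by eye is easy to get wrong, so the check can be scripted. The following is a minimal sketch, assuming a Hadoop 1.x single-node setup with the five daemons listed above; the function name missing_daemons is made up for this illustration and is not part of Hadoop.

```shell
#!/bin/sh
# Print the name of every expected Hadoop 1.x daemon that is absent
# from a jps listing. The listing is read from standard input, so the
# function works on live output (jps | missing_daemons) or saved text.
missing_daemons() {
    listing=$(cat)
    for daemon in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
        # Compare against the whole second field so that NameNode
        # is not matched by the SecondaryNameNode line.
        if ! printf '%s\n' "$listing" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            printf '%s\n' "$daemon"
        fi
    done
}

# The listing shown above contains all five daemons, so this prints nothing.
missing_daemons <<'EOF'
3615 Jps
2699 NameNode
3461 TaskTracker
2922 DataNode
3137 SecondaryNameNode
3231 JobTracker
EOF
```

On a running machine the call would be jps | missing_daemons; any name it prints points to a daemon whose log file under the logs directory is worth reading.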

 

Create a local directory:

hadoop@HADOOP:~$ mkdir ~/file

 

Create two .txt files in the file directory:

hadoop@HADOOP:~$ cd file

hadoop@HADOOP:~/file$ echo "Hello World" > file1.txt

hadoop@HADOOP:~/file$ echo "Hello Hadoop" > file2.txt

hadoop@HADOOP:~/file$ ls

file1.txt file2.txt

hadoop@HADOOP:~/file$

 

Create an input directory on HDFS:

hadoop@HADOOP:~/file$ hadoop fs -mkdir input

Check where the input directory was created:

hadoop@HADOOP:~$ hadoop fs -ls

Warning: $HADOOP_HOME is deprecated.

 

Found 5 items

-rw-r--r--  3 Administrator supergroup   6296230 2014-09-03 10:38 /user/hadoop/cloud.txt

drwxr-xr-x  - hadoop        supergroup          0 2014-09-02 16:31 /user/hadoop/hadi_curbm

drwxr-xr-x  - hadoop        supergroup          0 2014-09-04 09:59 /user/hadoop/input

drwxr-xr-x  - hadoop        supergroup          0 2014-09-02 16:31 /user/hadoop/pegasus

hadoop@HADOOP:~$

 

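The directory was created at /user/hadoop/input, because a relative HDFS path is resolved against the user's HDFS home directory, which here is /user/hadoop.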
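
The way HDFS turns a relative path into an absolute one can be written out in a few lines. This is only an illustration of the rule, assuming the default home directory layout /user/<username>; resolve_hdfs_path is a hypothetical helper and does not call Hadoop.

```shell
#!/bin/sh
# Illustrative helper (not part of Hadoop): show where a path given to
# "hadoop fs" ends up for a given user.
resolve_hdfs_path() {
    user=$1
    path=$2
    case "$path" in
        # Absolute paths are used unchanged.
        /*) printf '%s\n' "$path" ;;
        # Relative paths are placed under the user's HDFS home directory.
        *)  printf '/user/%s/%s\n' "$user" "$path" ;;
    esac
}

resolve_hdfs_path hadoop input                 # prints /user/hadoop/input
resolve_hdfs_path hadoop /user/hadoop/input    # prints /user/hadoop/input
```

This is why input and /user/hadoop/input can be used interchangeably in the commands of this walkthrough when they are run as the hadoop user.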

 

Upload the local .txt files to the input directory. The subcommand is -put, with a leading dash:

hadoop@HADOOP:~$ hadoop fs -put ~/file/*.txt /user/hadoop/input
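
The upload step can be wrapped in a small script that also lists the target directory afterwards, so an empty input directory is noticed before the job runs. The sketch below is an assumption-laden example: the helper names run and upload_txt_files and the DRY_RUN variable are invented here, and only the hadoop fs -put and hadoop fs -ls commands come from Hadoop itself.

```shell
#!/bin/sh
# Run a command, or only print it when DRY_RUN=1 is set. Printing makes
# it possible to review the commands on a machine without Hadoop.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Upload every .txt file from a local directory to an HDFS directory,
# then list the HDFS directory to confirm that the files arrived.
upload_txt_files() {
    src_dir=$1
    dest_dir=$2
    for f in "$src_dir"/*.txt; do
        # When nothing matches, the pattern stays unexpanded; skip it.
        [ -e "$f" ] || continue
        run hadoop fs -put "$f" "$dest_dir"
    done
    run hadoop fs -ls "$dest_dir"
}
```

With DRY_RUN=1 exported, a call such as upload_txt_files ~/file /user/hadoop/input prints one hadoop fs -put line per file followed by the hadoop fs -ls line; without it the same commands are executed.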

 

Locate the examples jar in the Hadoop directory:

hadoop@HADOOP:~$ cd hadoop-1.1.2

hadoop@HADOOP:~/hadoop-1.1.2$ ls

bin         docs                         hadoop-test-1.1.2.jar   LICENSE.txt  src

build.xml   hadoop-ant-1.1.2.jar         hadoop-tools-1.1.2.jar  logs         webapps

c++         hadoop-client-1.1.2.jar      ivy                     NOTICE.txt   Wordcount.jar

CHANGES.txt hadoop-core-1.1.2.jar        ivy.xml                 README.txt

conf        hadoop-examples-1.1.2.jar    lib                     sbin

contrib     hadoop-minicluster-1.1.2.jar libexec                 share

hadoop@HADOOP:~/hadoop-1.1.2$
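
Because the jar file name contains the Hadoop version, a script that should survive an upgrade can look the file up instead of hard-coding it. A minimal sketch, assuming the jar follows the hadoop-examples-<version>.jar naming seen in the listing above; find_examples_jar is a hypothetical helper name.

```shell
#!/bin/sh
# Print the path of the first hadoop-examples-*.jar in the given
# directory. Returns a non-zero status when no such file exists.
find_examples_jar() {
    for jar in "$1"/hadoop-examples-*.jar; do
        if [ -f "$jar" ]; then
            printf '%s\n' "$jar"
            return 0
        fi
    done
    return 1
}
```

For the installation in this walkthrough, find_examples_jar /home/hadoop/hadoop-1.1.2 would print /home/hadoop/hadoop-1.1.2/hadoop-examples-1.1.2.jar. Running hadoop jar on that file without a program name should print the list of valid program names, which includes wordcount.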

 

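Run the wordcount program from the examples jar to count the words in the files under input. The program name is lowercase, and the output directory must not already exist:

hadoop@HADOOP:~$ hadoop jar /home/hadoop/hadoop-1.1.2/hadoop-examples-1.1.2.jar wordcount /user/hadoop/input output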

Warning: $HADOOP_HOME is deprecated.

 

14/09/04 10:10:44 INFO input.FileInputFormat: Total input paths to process : 0

14/09/04 10:10:45 INFO mapred.JobClient: Running job: job_201409040943_0001

14/09/04 10:10:46 INFO mapred.JobClient:  map 0% reduce 0%

14/09/04 10:10:54 INFO mapred.JobClient:  map 0% reduce 100%

14/09/04 10:10:55 INFO mapred.JobClient: Job complete: job_201409040943_0001

14/09/04 10:10:55 INFO mapred.JobClient: Counters: 18

14/09/04 10:10:55 INFO mapred.JobClient:   Job Counters

14/09/04 10:10:55 INFO mapred.JobClient:     Launched reduce tasks=1

14/09/04 10:10:55 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=4087

14/09/04 10:10:55 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0

14/09/04 10:10:55 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0

14/09/04 10:10:55 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=4068

14/09/04 10:10:55 INFO mapred.JobClient:   File Output Format Counters

14/09/04 10:10:55 INFO mapred.JobClient:     Bytes Written=0

14/09/04 10:10:55 INFO mapred.JobClient:   FileSystemCounters

14/09/04 10:10:55 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=55309

14/09/04 10:10:55 INFO mapred.JobClient:   Map-Reduce Framework

14/09/04 10:10:55 INFO mapred.JobClient:     Reduce input groups=0

14/09/04 10:10:55 INFO mapred.JobClient:     Combine output records=0

14/09/04 10:10:55 INFO mapred.JobClient:     Reduce shuffle bytes=0

14/09/04 10:10:55 INFO mapred.JobClient:     Physical memory (bytes) snapshot=35037184

14/09/04 10:10:55 INFO mapred.JobClient:     Reduce output records=0

14/09/04 10:10:55 INFO mapred.JobClient:     Spilled Records=0

14/09/04 10:10:55 INFO mapred.JobClient:     CPU time spent (ms)=120

14/09/04 10:10:55 INFO mapred.JobClient:     Total committed heap usage (bytes)=15925248

14/09/04 10:10:55 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=377499648

14/09/04 10:10:55 INFO mapred.JobClient:     Combine input records=0

14/09/04 10:10:55 INFO mapred.JobClient:     Reduce input records=0

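hadoop@HADOOP:~$

Note: the log recorded above reports "Total input paths to process : 0", map progress that stays at 0%, and Bytes Written=0, which means this particular run found no files in /user/hadoop/input. With file1.txt and file2.txt uploaded, the job should report 2 input paths and the map phase should reach 100%.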
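
The "Total input paths to process" line is a quick indicator of whether the job saw any input at all. The sketch below pulls that number out of a saved job log; input_path_count is a hypothetical helper, and the sample line is taken from the log in this walkthrough.

```shell
#!/bin/sh
# Read a job log on standard input and print the number from the first
# "Total input paths to process" line. Prints nothing if there is none.
input_path_count() {
    sed -n 's/.*Total input paths to process *: *\([0-9][0-9]*\).*/\1/p' |
        head -n 1
}

log_line='14/09/04 10:10:44 INFO input.FileInputFormat: Total input paths to process : 0'
count=$(printf '%s\n' "$log_line" | input_path_count)

# A count of 0 means the input directory was empty when the job started.
if [ "$count" = "0" ]; then
    echo "no input files found: check the upload step"
fi
```

For the sample line this prints the warning. To use it on a real run, the job's console output could first be saved with something like 2>&1 | tee job.log and then passed to input_path_count.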

 

 

List the output directory:

hadoop@HADOOP:~$ hadoop fs -ls output

Warning: $HADOOP_HOME is deprecated.

 

Found 3 items

-rw-r--r--  1 hadoop supergroup          0 2014-09-04 10:10 /user/hadoop/output/_SUCCESS

drwxr-xr-x  - hadoop supergroup          0 2014-09-04 10:10 /user/hadoop/output/_logs

-rw-r--r--  1 hadoop supergroup          0 2014-09-04 10:10 /user/hadoop/output/part-r-00000

hadoop@HADOOP:~$

 

View the result:

hadoop@HADOOP:~$ hadoop fs -cat output/part-r-00000

Hadoop	1

Hello	2

World	1
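
For input this small, the expected counts can be cross-checked locally with standard Unix tools before or after running the job. This is a rough sketch and not how Hadoop computes the result: it assumes words are separated by spaces, tabs, or newlines, and local_wordcount is a name invented for this example.

```shell
#!/bin/sh
# Count words in the given files and print "word<TAB>count" lines,
# sorted by word, similar in layout to part-r-00000.
local_wordcount() {
    cat "$@" |
        tr -s ' \t' '\n\n' |    # put each word on its own line
        grep -v '^$' |          # drop empty lines
        LC_ALL=C sort |         # byte-order sort so uppercase sorts first
        uniq -c |               # count adjacent identical words
        awk '{ printf "%s\t%s\n", $2, $1 }'
}

# Recreate the two sample files from this walkthrough in a temp directory.
dir=$(mktemp -d)
echo "Hello World" > "$dir/file1.txt"
echo "Hello Hadoop" > "$dir/file2.txt"
local_wordcount "$dir/file1.txt" "$dir/file2.txt"
rm -rf "$dir"
```

For the two sample files this prints Hadoop 1, Hello 2, and World 1, each as a tab-separated pair, which matches the result shown above.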
