Hadoop's "mapred.ReduceTask: java.net.ConnectExceptio

WBOY
Published: 2016-06-07 15:22:05

Node 91 in the cluster developed a fault, and the following appeared in the log:

 

[plain]
2013-11-08 08:32:13,908 WARN org.apache.hadoop.mapred.ReduceTask: attempt_201311061017_18902_r_000000_0 copy failed: attempt_201311061017_18902_m_000003_0 from node-192
2013-11-08 08:32:13,921 WARN org.apache.hadoop.mapred.ReduceTask: java.net.ConnectException: Connection timed out
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(Unknown Source)
    at java.net.AbstractPlainSocketImpl.connectToAddress(Unknown Source)
    at java.net.AbstractPlainSocketImpl.connect(Unknown Source)
    at java.net.SocksSocketImpl.connect(Unknown Source)
    at java.net.Socket.connect(Unknown Source)
    at sun.net.NetworkClient.doConnect(Unknown Source)
    at sun.net.www.http.HttpClient.openServer(Unknown Source)
    at sun.net.www.http.HttpClient.openServer(Unknown Source)
    at sun.net.www.http.HttpClient.<init>(Unknown Source)
    at sun.net.www.http.HttpClient.New(Unknown Source)
    at sun.net.www.http.HttpClient.New(Unknown Source)
    at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(Unknown Source)
    at sun.net.www.protocol.http.HttpURLConnection.plainConnect(Unknown Source)
    at sun.net.www.protocol.http.HttpURLConnection.connect(Unknown Source)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getInputStream(ReduceTask.java:1631)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.setupSecureConnection(ReduceTask.java:1588)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1488)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1399)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1331)
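
When many of these warnings pile up, it can help to pull out the host each failed copy was fetching from. The following is only a rough sketch: the class `CopyFailureParser` and the method `failedSourceHost` are made-up names, and the regular expression is an assumption based solely on the one "copy failed" line shown above, so other Hadoop versions may format the message differently.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Hadoop.
class CopyFailureParser {
    // Assumed format, taken from the single warning line in the log above:
    // "... ReduceTask: <reduce attempt> copy failed: <map attempt> from <host>"
    private static final Pattern COPY_FAILED = Pattern.compile(
            "ReduceTask: (attempt_\\S+) copy failed: (attempt_\\S+) from (\\S+)");

    /** Returns the host the reducer failed to fetch from, or empty if the line does not match. */
    static Optional<String> failedSourceHost(String logLine) {
        Matcher m = COPY_FAILED.matcher(logLine);
        return m.find() ? Optional.of(m.group(3)) : Optional.empty();
    }

    public static void main(String[] args) {
        String line = "2013-11-08 08:32:13,908 WARN org.apache.hadoop.mapred.ReduceTask: "
                + "attempt_201311061017_18902_r_000000_0 copy failed: "
                + "attempt_201311061017_18902_m_000003_0 from node-192";
        System.out.println(failedSourceHost(line).orElse("(no match)"));
    }
}
```

Run against the warning above, this prints `node-192`.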

Analyzing the Hadoop code that determines a node's local hostname:

 

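[java]
localFs = FileSystem.getLocal(fConf);
// An explicitly configured slave.host.name takes precedence.
if (fConf.get("slave.host.name") != null) {
  this.localHostname = fConf.get("slave.host.name");
}
// Otherwise, derive the hostname from DNS using the configured
// interface and nameserver (both default to "default").
if (localHostname == null) {
  this.localHostname =
      DNS.getDefaultHost(
          fConf.get("mapred.tasktracker.dns.interface", "default"),
          fConf.get("mapred.tasktracker.dns.nameserver", "default"));
}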
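
The precedence in this snippet can be tried out without a Hadoop installation. The sketch below is an illustration only: `LocalHostnameSketch`, `resolveLocalHostname`, the `Map`-based configuration, and the `Supplier` are hypothetical stand-ins for `fConf` and `DNS.getDefaultHost`, not Hadoop APIs.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical stand-in for the hostname logic quoted above; not Hadoop code.
class LocalHostnameSketch {
    /**
     * Mirrors the precedence in the quoted snippet: an explicit
     * "slave.host.name" wins; otherwise fall back to a DNS-derived default.
     */
    static String resolveLocalHostname(Map<String, String> conf, Supplier<String> dnsDefault) {
        String configured = conf.get("slave.host.name");
        return configured != null ? configured : dnsDefault.get();
    }

    public static void main(String[] args) {
        // Placeholder for what DNS.getDefaultHost(...) would return on a real node.
        Supplier<String> dns = () -> "name-from-dns";

        Map<String, String> conf = new HashMap<>();
        System.out.println(resolveLocalHostname(conf, dns)); // no override configured

        conf.put("slave.host.name", "node-191");
        System.out.println(resolveLocalHostname(conf, dns)); // explicit override
    }
}
```

With no override the DNS-derived name is used, which is why a hostname that other nodes cannot resolve correctly ends up in the addresses reducers try to fetch from.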

 

Ping this hostname from that node:

 

[plain]
ping node-191
PING node-128-191.localhost (220.250.64.228) 56(84) bytes of data.
64 bytes from 220.250.64.228: icmp_seq=1 ttl=247 time=14.8 ms
64 bytes from 220.250.64.228: icmp_seq=2 ttl=247 time=14.3 ms
64 bytes from 220.250.64.228: icmp_seq=3 ttl=247 time=14.4 ms

The address that comes back is not node 191's IP at all.
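
The same check can be made from the JVM's point of view, which is what the reducer's HTTP fetch actually relies on. This is a minimal sketch using only `java.net.InetAddress`; the class `HostnameCheck` and the method `resolvesTo` are made-up names, and the expected IP has to be supplied by you, since node 191's real address is not given here.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical diagnostic helper; not part of Hadoop.
class HostnameCheck {
    /**
     * True if the hostname resolves to the expected IP address.
     * Returns false if either argument cannot be resolved.
     */
    static boolean resolvesTo(String hostname, String expectedIp) {
        try {
            // When given a literal IP, getByName only parses it; no lookup is performed.
            InetAddress expected = InetAddress.getByName(expectedIp);
            for (InetAddress addr : InetAddress.getAllByName(hostname)) {
                if (addr.equals(expected)) {
                    return true;
                }
            }
            return false;
        } catch (UnknownHostException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("usage: java HostnameCheck <hostname> <expected-ip>");
            return;
        }
        System.out.println(args[0]
                + (resolvesTo(args[0], args[1]) ? " resolves to " : " does NOT resolve to ")
                + args[1]);
    }
}
```

Running it on each node with `node-191` and that node's intended IP would flag any machine where the name resolves somewhere else.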

 

Checking the hosts file on that node shows that no entry for 191's hostname is configured either.

 

That explains the problem: the hostname resolves to the wrong address, so connections to it time out.

 

Add 191's hostname to the hosts file on every node in the cluster, then restart the TaskTracker. That fixes it.
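
To confirm that a node's hosts file really contains the entry after the change, a small check like the one below may be useful. It is a sketch under assumptions: `HostsFileCheck` and `hasEntry` are hypothetical names, the file is assumed to live at `/etc/hosts` (the usual Linux location), and `node-191` is used as the default hostname only because it is the name pinged earlier.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical helper; not part of Hadoop.
class HostsFileCheck {
    /** True if any non-comment line maps an address to the given hostname or alias. */
    static boolean hasEntry(List<String> hostsLines, String hostname) {
        for (String raw : hostsLines) {
            // Strip trailing comments, then surrounding whitespace.
            int hash = raw.indexOf('#');
            String line = (hash >= 0 ? raw.substring(0, hash) : raw).trim();
            if (line.isEmpty()) {
                continue;
            }
            String[] fields = line.split("\\s+");
            // fields[0] is the IP address; the remaining fields are names and aliases.
            for (int i = 1; i < fields.length; i++) {
                if (fields[i].equalsIgnoreCase(hostname)) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) throws IOException {
        String hostname = args.length > 0 ? args[0] : "node-191";
        // Assumed location of the hosts file on a Linux node.
        List<String> lines = Files.readAllLines(Paths.get("/etc/hosts"));
        System.out.println(hostname + (hasEntry(lines, hostname) ? " is" : " is NOT")
                + " listed in /etc/hosts");
    }
}
```

The check only looks at names, not at whether the mapped IP is correct, so it complements rather than replaces a resolution test.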
