HDFS - [Error] TaskTracker and DataNode startup failures, and how to fix them

JJun ™ 2013. 7. 13. 06:36

Source: http://develop.sunshiny.co.kr/885

          (a blog that also documents Hadoop-related installation in detail!)


# When the TaskTracker will not start


 

Symptom (1): hostname: Name or service not known

 

2013-04-10 10:07:31,681 INFO org.apache.hadoop.mapred.TaskTracker: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting TaskTracker
STARTUP_MSG:   host = java.net.UnknownHostException: hostname: hostname: Name or service not known
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 1.1.2
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.1 -r 1440782; compiled by 'hortonfo' on Thu Jan 31 02:03:24 UTC 2013

************************************************************/
2013-04-10 10:07:32,868 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2013-04-10 10:07:33,198 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
2013-04-10 10:07:33,212 ERROR org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Error getting localhost name. Using 'localhost'...
java.net.UnknownHostException: hostname: hostname: Name or service not known
        at java.net.InetAddress.getLocalHost(InetAddress.java:1438)
        at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.getHostname(MetricsSystemImpl.java:463)
        at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.configureSystem(MetricsSystemImpl.java:394)
        at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.configure(MetricsSystemImpl.java:390)
        at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.start(MetricsSystemImpl.java:152)
        at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.init(MetricsSystemImpl.java:133)
        at org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.init(DefaultMetricsSystem.java:40)
        at org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.initialize(DefaultMetricsSystem.java:50)
        at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3905)
Caused by: java.net.UnknownHostException: hostname: Name or service not known
        at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
        at java.net.InetAddress$1.lookupAllHostAddr(InetAddress.java:866)
        at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1258)
        at java.net.InetAddress.getLocalHost(InetAddress.java:1434)
        ... 8 more
2013-04-10 10:07:33,223 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2013-04-10 10:07:33,223 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: TaskTracker metrics system started
2013-04-10 10:07:34,553 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
2013-04-10 10:07:34,814 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
2013-04-10 10:07:36,192 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2013-04-10 10:07:37,348 INFO org.apache.hadoop.http.HttpServer: Added global filtersafety (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
2013-04-10 10:07:37,601 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
2013-04-10 10:07:37,685 ERROR org.apache.hadoop.mapred.TaskTracker: Can not start task tracker because java.net.UnknownHostException: hostname: hostname: Name or service not known
        at java.net.InetAddress.getLocalHost(InetAddress.java:1438)
        at org.apache.hadoop.security.SecurityUtil.getLocalHostName(SecurityUtil.java:271)
        at org.apache.hadoop.security.SecurityUtil.login(SecurityUtil.java:289)
        at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1563)
        at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3906)
Caused by: java.net.UnknownHostException: hostname: Name or service not known
        at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
        at java.net.InetAddress$1.lookupAllHostAddr(InetAddress.java:866)
        at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1258)
        at java.net.InetAddress.getLocalHost(InetAddress.java:1434)
        ... 4 more

2013-04-10 10:07:37,722 INFO org.apache.hadoop.mapred.TaskTracker: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down TaskTracker at java.net.UnknownHostException: hostname: hostname: Name or service not known
************************************************************/

# Fix
Add an entry to /etc/hosts that maps the hostname to its IP address.

 

[hadoop@hostname bin]$ cat /etc/hosts
192.168.1.110       localhost
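A quick way to confirm the fix before restarting the TaskTracker is to check that the machine's own hostname actually resolves. This is a sketch, assuming a glibc-based Linux host where `getent` is available:

```shell
# Verify local hostname resolution before restarting Hadoop daemons.
# An empty getent result is exactly what produces the UnknownHostException.
hostname                                                  # the name the JVM tries to resolve
getent hosts "$(hostname)" || echo "NOT RESOLVABLE - add it to /etc/hosts"
getent hosts localhost                                    # sanity check: should always print an entry
```

If the second command prints nothing, the daemon will fail at startup with the same `UnknownHostException` shown in the log above.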



# When the DataNode will not start

 

Symptom (1): Invalid directory in dfs.data.dir: Incorrect permission

 

2013-04-10 10:20:57,201 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting DataNode
STARTUP_MSG:   host = hostname/192.168.1.45
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 1.1.2
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.1 -r 1440782; compiled by 'hortonfo' on Thu Jan 31 02:03:24 UTC 2013
************************************************************/
2013-04-10 10:20:58,220 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2013-04-10 10:20:58,289 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
2013-04-10 10:20:58,301 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2013-04-10 10:20:58,305 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
2013-04-10 10:20:58,933 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
2013-04-10 10:20:59,366 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Invalid directory in dfs.data.dir: Incorrect permission for /data/node01, expected: rwxr-xr-x, while actual: rwxrwxr-x
2013-04-10 10:20:59,445 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Invalid directory in dfs.data.dir: Incorrect permission for /data/node02, expected: rwxr-xr-x, while actual: rwxrwxr-x
2013-04-10 10:20:59,445 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: All directories in dfs.data.dir are invalid.
2013-04-10 10:20:59,445 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Exiting Datanode
2013-04-10 10:20:59,472 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at hostname/192.168.1.45
************************************************************/

# Check
Change the permissions of each directory listed under dfs.data.dir in conf/hdfs-site.xml to 755.

[hadoop@hostname conf]$ cat hdfs-site.xml
<property>
 <name>dfs.data.dir</name>
 <value>/data/node01,/data/node02</value>
</property>


# Set the permissions

 

[hadoop@hostname conf]$ chmod -R 755 /data/node*
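As a sanity check after the chmod, the sketch below prints a directory's octal mode the same way the DataNode evaluates it (`stat -c` assumes GNU coreutils; /tmp/dn_demo is just a stand-in for /data/node01 and /data/node02):

```shell
# Print a directory's octal permission bits.
check_mode() {
  stat -c '%a' "$1"
}

# Demo on a throwaway directory; on the cluster you would run
# check_mode /data/node01 and check_mode /data/node02, expecting 755.
mkdir -p /tmp/dn_demo
chmod 755 /tmp/dn_demo
check_mode /tmp/dn_demo    # prints 755
```

Anything other than 755 (the log above shows 775, i.e. rwxrwxr-x) makes the DataNode reject the directory.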



Symptom (2): Datanode denied communication with namenode: datanode02:50010

2013-05-12 16:11:13,708 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting DataNode
STARTUP_MSG:   host = zookeeper1/192.168.1.45
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 1.1.2
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.1 -r 1440782; compiled by 'hortonfo' on Thu Jan 31 02:03:24 UTC 2013
************************************************************/
2013-05-12 16:11:14,556 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2013-05-12 16:11:14,619 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
2013-05-12 16:11:14,625 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2013-05-12 16:11:14,628 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
2013-05-12 16:11:15,090 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
2013-05-12 16:11:17,182 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Registered FSDatasetStatusMBean
2013-05-12 16:11:17,231 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Opened data transfer server at 50010
2013-05-12 16:11:17,250 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Balancing bandwith is 1048576 bytes/s
2013-05-12 16:11:17,575 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2013-05-12 16:11:18,072 INFO org.apache.hadoop.http.HttpServer: Added global filtersafety (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
2013-05-12 16:11:18,123 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: dfs.webhdfs.enabled = true
2013-05-12 16:11:18,129 INFO org.apache.hadoop.http.HttpServer: addJerseyResourcePackage: packageName=org.apache.hadoop.hdfs.server.datanode.web.resources;org.apache.hadoop.hdfs.web.resources, pathSpec=/webhdfs/v1/*
2013-05-12 16:11:18,142 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50075
2013-05-12 16:11:18,144 INFO org.apache.hadoop.http.HttpServer: listener.getLocalPort() returned 50075 webServer.getConnectors()[0].getLocalPort() returned 50075
2013-05-12 16:11:18,148 INFO org.apache.hadoop.http.HttpServer: Jetty bound to port 50075
2013-05-12 16:11:18,150 INFO org.mortbay.log: jetty-6.1.26
2013-05-12 16:11:20,086 INFO org.mortbay.log: Started SelectChannelConnector@0.0.0.0:50075
2013-05-12 16:11:20,138 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source jvm registered.
2013-05-12 16:11:20,146 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source DataNode registered.
2013-05-12 16:11:21,643 INFO org.apache.hadoop.ipc.Server: Starting SocketReader
2013-05-12 16:11:21,669 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcDetailedActivityForPort50020 registered.
2013-05-12 16:11:21,677 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcActivityForPort50020 registered.
2013-05-12 16:11:21,695 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: dnRegistration = DatanodeRegistration(zookeeper1:50010, storageID=DS-537727669-192.168.1.45-50010-1368149158612, infoPort=50075, ipcPort=50020)
2013-05-12 16:11:21,714 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.server.protocol.DisallowedDatanodeException: Datanode denied communication with namenode: zookeeper1:50010
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.registerDatanode(FSNamesystem.java:2539)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.register(NameNode.java:1013)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:601)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:578)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1393)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1389)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1387)

        at org.apache.hadoop.ipc.Client.call(Client.java:1107)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
        at com.sun.proxy.$Proxy5.register(Unknown Source)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.register(DataNode.java:740)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.runDatanodeDaemon(DataNode.java:1549)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1609)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1734)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1751)

2013-05-12 16:11:21,732 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at zookeeper1/192.168.1.45
************************************************************/

# Check
When the dfs.hosts property is set in conf/hdfs-site.xml, the NameNode accepts only the nodes listed in the include file it points to (conf/include_server in this setup).


# Fix

1) On the NameNode, add the DataNode's hostname and IP address to /etc/hosts.
2) Add that hostname to the conf/include_server file.
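Put together, the two steps might look like the sketch below. The hostname, IP, and include-file path follow this post's example, and `$HADOOP_HOME` is assumed to point at the Hadoop install; `hadoop dfsadmin -refreshNodes` is the Hadoop 1.x command that makes the NameNode re-read the include file. Treat this as a config fragment run on the NameNode host, not a drop-in script:

```shell
# 1) Map the DataNode's hostname to its IP in /etc/hosts.
echo "192.168.1.45    datanode02" >> /etc/hosts

# 2) Add the hostname to the include file referenced by dfs.hosts.
echo "datanode02" >> $HADOOP_HOME/conf/include_server

# 3) Ask the NameNode to re-read the include file (no restart needed).
hadoop dfsadmin -refreshNodes
```

After the refresh, restarting the DataNode should let it register without the DisallowedDatanodeException.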