Hadoop with a problematic startup

With my single-node cluster configuration, Hadoop (0.20.2) does not start cleanly. Here is the status:

# netstat -an | grep LISTEN | grep tcp
tcp        0      0 0.0.0.0:46631           0.0.0.0:*               LISTEN     
tcp        0      0 0.0.0.0:40072           0.0.0.0:*               LISTEN     
tcp        0      0 127.0.0.1:9001          0.0.0.0:*               LISTEN     
tcp        0      0 0.0.0.0:50060           0.0.0.0:*               LISTEN     
tcp        0      0 0.0.0.0:50030           0.0.0.0:*               LISTEN     
tcp        0      0 0.0.0.0:42004           0.0.0.0:*               LISTEN     
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN     
tcp6       0      0 :::22                   :::*                    LISTEN

Nothing is listening on TCP port 9000. From the logs I can see that the TaskTracker starts without a problem, but all the other daemons are in trouble.
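For context, TCP 9000 is the NameNode RPC port: it is whatever fs.default.name in conf/core-site.xml says, and a stock pseudo-distributed setup uses:

    <property>
         <name>fs.default.name</name>
         <value>hdfs://localhost:9000</value>
    </property>

(9001, which is listening, is mapred.job.tracker, the JobTracker port.)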

The NameNode could not start:

ERROR org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem initialization failed.
org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /tmp/hadoop-localhost/dfs/name is in an inconsistent state: storage directory does not exist or is not accessible.
        at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:290)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:311)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:292)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:201)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:279)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:956)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:965)
INFO org.apache.hadoop.ipc.Server: Stopping server on 9000
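The /tmp/hadoop-localhost/dfs/name path in that error is nothing I set myself; it falls out of the shipped defaults (core-default.xml and hdfs-default.xml in this version), which chain like this:

    <property>
         <name>hadoop.tmp.dir</name>
         <value>/tmp/hadoop-${user.name}</value>
    </property>
    <property>
         <name>dfs.name.dir</name>
         <value>${hadoop.tmp.dir}/dfs/name</value>
    </property>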

The DataNode says it cannot connect:

INFO org.apache.hadoop.ipc.RPC: Server at localhost/127.0.0.1:9000 not available yet, Zzzzz...
INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 0 time(s).
INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 1 time(s).
INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 2 time(s).
INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 3 time(s).

Of course, the SecondaryNameNode is down, too:

INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 0 time(s).

I had already looked at Michael Noll's post about installing Hadoop on Ubuntu and tried once more with IPv6 support disabled, but no luck. After that, I went through the Hadoop Cluster Configuration guide, which points out options such as dfs.name.dir and dfs.data.dir. I think the problem is that Hadoop's data files get deleted, since by default they live under /tmp on the local filesystem.
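(For reference, that IPv6 workaround is a one-liner in conf/hadoop-env.sh forcing the JVM onto the IPv4 stack:)

# export HADOOP_OPTS=-Djava.net.preferIPv4Stack=true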

I edited conf/hdfs-site.xml to add these properties:

    <property>
         <name>dfs.name.dir</name>
         <value>/hometohadoop/hadoop-0.20.2/logs/transLogs</value>
    </property>
    <property>
         <name>dfs.data.dir</name>
         <value>/hometohadoop/hadoop-0.20.2/dataDir</value>
    </property>
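As a side note, both options accept a comma-separated list of directories: dfs.name.dir then keeps a redundant copy of the name table in each one, while dfs.data.dir spreads blocks across all of them. For example (the second path is hypothetical):

    <property>
         <name>dfs.name.dir</name>
         <value>/hometohadoop/hadoop-0.20.2/logs/transLogs,/mnt/backup/nameDir</value>
    </property>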

Now I am telling Hadoop to store the relevant files under these directories instead of some volatile location under /tmp. Then I made sure the folders actually exist:

# mkdir -p /hometohadoop/hadoop-0.20.2/dataDir
# mkdir -p /hometohadoop/hadoop-0.20.2/logs/transLogs
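(If Hadoop runs as a dedicated user rather than root, these directories must also be owned by that user; hadoop:hadoop below is just a hypothetical user and group:)

# chown -R hadoop:hadoop /hometohadoop/hadoop-0.20.2/dataDir /hometohadoop/hadoop-0.20.2/logs/transLogs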

I restarted. Now the NameNode says:

ERROR org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem initialization failed.
 java.io.IOException: NameNode is not formatted.

So I formatted the NameNode (which, of course, deletes all HDFS data), so that from now on I will finally have a stable data store:

# bin/hadoop namenode -format
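Restarting is just the stock scripts (the usual single-node routine):

# bin/stop-all.sh
# bin/start-all.sh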

And after the restart, the NameNode log finally shows:

INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 9000: starting
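To double-check that the NameNode really answers on 9000, a quick smoke test (any trivial fs command will do):

# netstat -an | grep 9000
# bin/hadoop fs -ls /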

***

Good…

***