What if you do not set the property 'hadoop.tmp.dir'?

In the Hadoop installation directory, the core-site.xml file under conf contains a property called 'hadoop.tmp.dir'. It defines the base directory on the local filesystem where the NameNode and the DataNode store their data.
There are three HDFS properties whose default values are derived from hadoop.tmp.dir (see the sketch after this list for how the substitution is resolved):

  • dfs.name.dir: directory where the NameNode stores its metadata; default value ${hadoop.tmp.dir}/dfs/name.
  • dfs.data.dir: directory where HDFS data blocks are stored; default value ${hadoop.tmp.dir}/dfs/data.
  • fs.checkpoint.dir: directory where the secondary NameNode stores its checkpoints; default value ${hadoop.tmp.dir}/dfs/namesecondary.
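
You can see how these defaults are resolved with a few lines of Java. The sketch below is only an illustration: it assumes the classic Hadoop 1.x property names listed above and that core-site.xml and hdfs-default.xml are available on the client's classpath; the class name ShowHdfsDirs is made up.

    import org.apache.hadoop.conf.Configuration;

    public class ShowHdfsDirs {
        public static void main(String[] args) {
            // new Configuration() loads core-default.xml and core-site.xml from the
            // classpath; variable references such as ${hadoop.tmp.dir} and
            // ${user.name} are expanded when a value is read with get().
            Configuration conf = new Configuration();

            // hdfs-default.xml is not loaded by a plain Configuration, so pull it in
            // explicitly (assumption: it is present on this client's classpath).
            conf.addResource("hdfs-default.xml");

            System.out.println("hadoop.tmp.dir    = " + conf.get("hadoop.tmp.dir"));
            System.out.println("dfs.name.dir      = " + conf.get("dfs.name.dir"));
            System.out.println("dfs.data.dir      = " + conf.get("dfs.data.dir"));
            System.out.println("fs.checkpoint.dir = " + conf.get("fs.checkpoint.dir"));
        }
    }
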
Usually the default value of 'hadoop.tmp.dir' is '/tmp/hadoop-{username}'. On many operating systems the contents of /tmp are wiped on restart, so you may lose both the NameNode's metadata directory and the DataNode's block data. If you then stop all the services and start them again, you will face an error saying the NameNode directory does not exist.
To avoid this situation, set 'hadoop.tmp.dir' to a directory location that does not get cleared at OS start-up.
For example, in core-site.xml:
<property>
  <name>hadoop.tmp.dir</name>
  <value>/home/hadoop/hdfs-tmp</value>
</property>
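
Once the property is changed, a quick way to confirm that the daemons will pick up a usable location is a small check like the sketch below (the class name CheckTmpDir is hypothetical; it only assumes core-site.xml is on the classpath):

    import java.io.File;
    import org.apache.hadoop.conf.Configuration;

    public class CheckTmpDir {
        public static void main(String[] args) {
            // Reads the resolved hadoop.tmp.dir from core-default.xml / core-site.xml.
            Configuration conf = new Configuration();
            String tmpDir = conf.get("hadoop.tmp.dir");
            File dir = new File(tmpDir);

            if (!dir.exists()) {
                System.err.println(tmpDir + " does not exist yet; create it and give ownership to the user running the Hadoop daemons.");
            } else if (!dir.canWrite()) {
                System.err.println(tmpDir + " exists but is not writable by the current user.");
            } else {
                System.out.println("hadoop.tmp.dir resolves to " + tmpDir + " and is usable.");
            }
        }
    }

Note that changing hadoop.tmp.dir does not move any existing data; on a cluster that already holds data, the NameNode and DataNode directories have to be copied under the new location before the services are restarted.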
