How to clear OS cache

Modern computers come with plenty of memory, and operating systems use the free memory to cache things for faster access, such as inodes and file data on disk.  This is great for day-to-day use because caching makes things faster.  It is not so great if you are an “experimental computer scientist” who often runs serious performance tests: the OS cache gets in the way and messes up your timing information.  Many people bite the bullet and reboot the machine between runs (assuming they have the privilege to do so), which makes the performance tests take much longer to finish.

Under Linux, you don’t need to reboot the machine.  You can clear the OS cache with the following command chain (you do still need root access to run it):

> sudo su
> sync; echo 3 > /proc/sys/vm/drop_caches

sync makes sure all dirty buffers are flushed to disk.  Writing 3 to /proc/sys/vm/drop_caches clears everything: the page cache, directory entries (dentries), and inodes.  You can also choose to clear only the page cache using “echo 1”, or only dentries and inodes using “echo 2”.
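One pitfall worth noting: a plain `sudo echo 3 > /proc/sys/vm/drop_caches` fails with “Permission denied”, because the redirection is performed by your unprivileged shell, not by sudo.  If you prefer a one-liner over switching to a root shell, a common workaround is to let `tee` do the privileged write:

```shell
# Flush dirty buffers, then drop all caches; tee runs under sudo,
# so it can open /proc/sys/vm/drop_caches for writing
sync; echo 3 | sudo tee /proc/sys/vm/drop_caches
```

Substitute 1 or 2 for 3 to drop only the page cache or only dentries and inodes, as described above.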

hbase not-so-quick start

I wanted to play with Apache HBase, so I downloaded v0.94.2 to an Ubuntu VirtualBox VM and followed the quick start.  But it didn’t start at all.  The log file had exceptions similar to the following:

2012-11-15 09:37:28,728 INFO org.apache.hadoop.ipc.HBaseRPC: Server at localhost/127.0.0.1:40408 could not be reached after 1 tries, giving up.
 2012-11-15 09:37:28,732 WARN org.apache.hadoop.hbase.master.AssignmentManager: Failed assignment of -ROOT-,,0.70236052 to localhost,40408,1352990244709, trying to assign elsewhere instead; retry=0
 org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed setting up proxy interface org.apache.hadoop.hbase.ipc.HRegionInterface to localhost/127.0.0.1:40408 after attempts=1
 at org.apache.hadoop.hbase.ipc.HBaseRPC.handleConnectionException(HBaseRPC.java:291)
 at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:259)
 at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1313)
 at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1269)
 at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1256)
 at org.apache.hadoop.hbase.master.ServerManager.getServerConnection(ServerManager.java:550)
 at org.apache.hadoop.hbase.master.ServerManager.sendRegionOpen(ServerManager.java:483)
 at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1640)
 at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1363)
 at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1338)
 at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1333)
 at org.apache.hadoop.hbase.master.AssignmentManager.assignRoot(AssignmentManager.java:2212)
 at org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:632)
 at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:529)
 at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:344)
 at org.apache.hadoop.hbase.master.HMasterCommandLine$LocalHMaster.run(HMasterCommandLine.java:220)
 at java.lang.Thread.run(Thread.java:722)
 Caused by: java.net.ConnectException: Connection refused
 at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
 at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:692)
 at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
 at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
 at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupConnection(HBaseClient.java:416)
 at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:462)
 at org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:1150)
 at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:1000)
 at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:150)
 at $Proxy12.getProtocolVersion(Unknown Source)
 at org.apache.hadoop.hbase.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:183)
 at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:335)
 at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:312)
 at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:364)
 at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:236)
 ... 15 more
 2012-11-15 09:37:28,732 WARN org.apache.hadoop.hbase.master.AssignmentManager: Unable to find a viable location to assign region -ROOT-,,0.70236052

After trying a few different things, I found this blog post.  The post was about Cloudera releases, but the symptoms were very similar to what I experienced.  It presented two approaches to solving the issue: either comment out the “127.0.0.1 COMPNAME” line in /etc/hosts, or add “-Djava.net.preferIPv4Stack=true” to HADOOP_OPTS in hadoop-env.sh.  I tried the first approach by commenting out all the 127.0.0.1 COMPNAME lines in /etc/hosts.  In my case there were two such lines, one resolving to localhost and the other to evabuntu (the name I gave to my virtual machine).
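Concretely, the first approach amounted to commenting out lines like these in my /etc/hosts (evabuntu is the hostname of my VM; yours will differ):

```
# 127.0.0.1       localhost
# 127.0.0.1       evabuntu
```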

When I tried to start HBase again, I got a different exception:


2012-11-15 10:06:22,062 ERROR org.apache.hadoop.hbase.master.HMasterCommandLine: Failed to start master
java.lang.RuntimeException: Failed construction of Master: class org.apache.hadoop.hbase.master.HMasterCommandLine$LocalHMasterevabuntu
at org.apache.hadoop.hbase.util.JVMClusterUtil.createMasterThread(JVMClusterUtil.java:134)
at org.apache.hadoop.hbase.LocalHBaseCluster.addMaster(LocalHBaseCluster.java:197)
at org.apache.hadoop.hbase.LocalHBaseCluster.<init>(LocalHBaseCluster.java:147)
at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:140)
at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:103)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:76)
at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:1806)
Caused by: java.net.UnknownHostException: evabuntu: evabuntu
at java.net.InetAddress.getLocalHost(InetAddress.java:1438)
at org.apache.hadoop.net.DNS.getDefaultHost(DNS.java:185)
at org.apache.hadoop.hbase.master.HMaster.<init>(HMaster.java:241)
at org.apache.hadoop.hbase.master.HMasterCommandLine$LocalHMaster.<init>(HMasterCommandLine.java:215)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:525)
at org.apache.hadoop.hbase.util.JVMClusterUtil.createMasterThread(JVMClusterUtil.java:131)
... 7 more
Caused by: java.net.UnknownHostException: evabuntu
at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method)
at java.net.InetAddress$1.lookupAllHostAddr(InetAddress.java:866)
at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1258)
at java.net.InetAddress.getLocalHost(InetAddress.java:1434)
... 15 more

This time, the exception was much easier to understand.  HBase picked up the name “evabuntu” from /etc/hostname but wasn’t able to resolve it, because there was no entry for it in /etc/hosts.  So all I had to do was add an IPv6 entry for it in /etc/hosts.

First, use ifconfig to find the IPv6 address:


> /sbin/ifconfig

eth0      Link encap:Ethernet  HWaddr 08:00:27:f4:8f:83
inet addr:10.0.2.15  Bcast:10.0.2.255  Mask:255.255.255.0
inet6 addr: fe80::a00:27ff:fef4:8f83/64 Scope:Link
UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
RX packets:66784 errors:0 dropped:0 overruns:0 frame:0
TX packets:38494 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:86804874 (86.8 MB)  TX bytes:2845363 (2.8 MB)

eth1      Link encap:Ethernet  HWaddr 08:00:27:df:4d:e0
inet addr:192.168.56.101  Bcast:192.168.56.255  Mask:255.255.255.0
inet6 addr: fe80::a00:27ff:fedf:4de0/64 Scope:Link
UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
RX packets:93 errors:0 dropped:0 overruns:0 frame:0
TX packets:92 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:11166 (11.1 KB)  TX bytes:12582 (12.5 KB)

lo        Link encap:Local Loopback
inet addr:127.0.0.1  Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING  MTU:16436  Metric:1
RX packets:9652 errors:0 dropped:0 overruns:0 frame:0
TX packets:9652 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:971333 (971.3 KB)  TX bytes:971333 (971.3 KB)

In my case I wanted to use eth1, but you can pick the IPv6 address of any interface.  So I copied and pasted the inet6 address fe80::a00:27ff:fedf:4de0 into /etc/hosts:


> cat /etc/hosts

# The following lines are desirable for IPv6 capable hosts
::1     ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
fe80::a00:27ff:fedf:4de0 evabuntu

and HBase started.
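A quick way to double-check that the new entry actually resolves (getent is standard on Ubuntu; evabuntu is my VM’s hostname, substitute your own) before blaming HBase again:

```shell
# Look up the hostname through the system resolver; it should print
# the address you just added to /etc/hosts
getent hosts evabuntu
```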

Then I tried the second approach outlined in that blog post, which forces Java to use IPv4.  The example in the post added the property to HADOOP_OPTS in conf/hadoop-env.sh, but for Apache HBase it needs to go into HBASE_OPTS in conf/hbase-env.sh.  This alone wasn’t enough, because as I mentioned earlier, /etc/hostname contains evabuntu, so I also needed to add an entry for evabuntu to /etc/hosts.  The drill is the same as above, except this time you pick out the IPv4 address assigned to the network interface instead of the IPv6 one.
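As a sketch, the second approach combines these two edits (the IPv4 address below is the eth1 address from my ifconfig output above, and evabuntu is my VM’s hostname; both will differ on your machine):

```shell
# In the HBase install directory: append the IPv4 flag to HBASE_OPTS
echo 'export HBASE_OPTS="$HBASE_OPTS -Djava.net.preferIPv4Stack=true"' >> conf/hbase-env.sh

# Map the hostname (from /etc/hostname) to the interface's IPv4 address
echo '192.168.56.101 evabuntu' | sudo tee -a /etc/hosts
```

After both changes, restart HBase as in the quick start.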