Sunday, January 4, 2015

Follow up on HDFS Client Bandwidth Utilisation

In a previous post, I explained the source of unexpected bandwidth consumption in the HDFS client. This post follows up on that topic. Sadly, I still have no new solution for keeping bandwidth utilization low for random "small" reads with the HDFS client, but I do have new insight into how the HDFS client protocol works.

Thursday, January 1, 2015

Unexpected High Bandwidth Consumption in HDFS (Hadoop 2.0.0 and Cloudera 4.x)

I am trying to use HDFS as a backend for data storage through the Java API. The details of this data storage will be published in a future post (and are irrelevant to this discussion). During my experiments, I ran into unexpected bandwidth consumption on the client (reading) nodes. My findings are shared below.

Sunday, August 17, 2014

The command "tee -" Considered Harmful

I spent a lot of time debugging a corrupted buffer problem and thought what I learned would be useful to a larger audience.

Note: I was debugging someone else's script! :-)

tl;dr: if you are using "tee -", you had better know what you are doing. If you have doubts, read on.