In a previous post, I explained the source of unexpected bandwidth consumption in the HDFS client. This follow-up post looks further at HDFS client bandwidth utilization. Sadly, I still have no new solution for keeping bandwidth utilization low for random "small" reads with the HDFS client, but I do have new insight into how the HDFS client protocol works.
Sunday, January 4, 2015
Thursday, January 1, 2015
I am trying to use HDFS as a backend for data storage using the Java API. The details of this data storage will be published in a future post (and are irrelevant to this discussion). During my experiments, I ran into unexpected bandwidth consumption on the client (reading) nodes. My findings are shared below.