HDFS' Location Awareness


According to several sources (1, 2, 3), HDFS' Location Awareness is about knowing the physical location of nodes and replicating data across different racks, to reduce the impact of a rack failure caused by, e.g., a power supply or switch problem.


How does HDFS know the physical location of nodes and racks and subsequently decide to replicate data to nodes located on other racks?


Rack-awareness is configured when the cluster is set up. This can be done either manually for each node or through a script.

Each DataNode is given a network location, which is simply a string, much like a file system path (for example, /datacenter1/rack1).
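As a rough illustration of the script option, here is a minimal sketch of a rack-mapping script. It assumes the usual Hadoop convention that the script named by the net.topology.script.file.name property is called with one or more host names or IP addresses as arguments and prints one network location for each. The addresses, the rack names and the resolve helper are made up for this example.

```python
#!/usr/bin/env python3
"""Hypothetical rack-mapping script: prints one network location per argument."""
import sys

# Assumed mapping for illustration only; a real cluster would load this
# from a file or derive it from the subnet.
RACK_BY_HOST = {
    "10.0.1.11": "/dc1/rack1",
    "10.0.1.12": "/dc1/rack1",
    "10.0.2.21": "/dc1/rack2",
}

# Location used for hosts that are not in the mapping.
DEFAULT_RACK = "/default-rack"


def resolve(hosts):
    """Return the network location for each host, in the same order."""
    return [RACK_BY_HOST.get(host, DEFAULT_RACK) for host in hosts]


if __name__ == "__main__":
    # Print one location per argument, separated by whitespace.
    print(" ".join(resolve(sys.argv[1:])))
```

Hosts missing from the mapping fall back to a single default rack, so an incomplete mapping quietly puts those nodes on one rack instead of failing.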



The NameNode then builds a network topology (basically a tree structure) using the network locations of each DataNode. This topology is then used to determine block replica placement.
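To make the idea concrete, here is a simplified sketch of how network locations could be grouped into a topology and then used to pick replica targets. It is not the NameNode's actual code. It assumes a flat two-level layout (racks containing nodes) and follows the commonly documented default for three replicas: one on the writer's node, one on a node in a different rack, and one on another node in that same remote rack. The names build_topology and choose_replicas are hypothetical.

```python
import random
from collections import defaultdict


def build_topology(locations):
    """Group nodes by rack: {"/dc1/rack1": ["dn1", "dn2"], ...}."""
    racks = defaultdict(list)
    for node, rack in locations.items():
        racks[rack].append(node)
    return dict(racks)


def choose_replicas(locations, writer, rng=random):
    """Pick three targets for a block written from the node `writer`.

    Simplified policy (an assumption, not HDFS source code):
    1. the writer's own node,
    2. a random node on a different rack,
    3. another node on the same rack as the second replica.
    """
    racks = build_topology(locations)
    local_rack = locations[writer]

    # Only remote racks with at least two nodes can hold replicas 2 and 3.
    candidates = sorted(
        rack for rack, nodes in racks.items()
        if rack != local_rack and len(nodes) >= 2
    )
    if not candidates:
        raise ValueError("need a remote rack with at least two nodes")

    remote_rack = rng.choice(candidates)
    second, third = rng.sample(racks[remote_rack], 2)
    return [writer, second, third]


# Example cluster: node name -> network location (made-up values).
cluster = {
    "dn1": "/dc1/rack1",
    "dn2": "/dc1/rack1",
    "dn3": "/dc1/rack2",
    "dn4": "/dc1/rack2",
}
replicas = choose_replicas(cluster, writer="dn1")
```

With this layout, losing rack1 still leaves two replicas on rack2, and losing rack2 still leaves the copy on the writer's node.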
