Saturday, April 23, 2011

Fun With Hadoop

Now that the mail server is finally wrapped up, I've moved on to my next final project, our Wifi Heatmap. Part of the server-side component of this application is a compilation step that clusters samples of network signal strength into estimates of access point locations, and then tiles those access points for efficient display.

To make this a little more exciting, I decided to use the Apache Hadoop framework to perform this computation as two MapReduce jobs. The first Map pass is trivial: it outputs the sample points keyed by the BSSID of the access point. The first Reduce pass is the interesting one; it is here that we cluster each set of samples sharing a BSSID to estimate that access point's location. In the second Map pass, we accomplish the bulk of the tiling by hashing each access point to its tile ID. In the second Reduce pass, we concatenate all access points with the same tile ID into a downloadable tile that we can then serve to our map viewer on demand.
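Here is a rough sketch of how the two passes fit together, written in plain Python with a tiny in-memory stand-in for Hadoop rather than the real Hadoop API. Everything specific in it is an assumption made for illustration: the sample layout (BSSID, latitude, longitude, a positive strength weight), the use of a strength-weighted centroid as the "clustering" step, the grid-cell tile ID, and all function names. The actual project may do each of these differently.

```python
from collections import defaultdict
import math

TILE_DEG = 0.01  # hypothetical tile size, in degrees of latitude/longitude


def run_mapreduce(records, mapper, reducer):
    """In-memory stand-in for one MapReduce job: map, group by key, reduce."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    output = []
    for key in sorted(groups):
        output.extend(reducer(key, groups[key]))
    return output


# Pass 1: key samples by BSSID, then cluster each group into one location.
def map_sample(sample):
    bssid, lat, lon, strength = sample
    yield bssid, (lat, lon, strength)


def reduce_cluster(bssid, samples):
    # Assumption: a strength-weighted centroid stands in for the real
    # clustering step; "strength" is treated as a positive weight.
    total = sum(w for _, _, w in samples)
    lat = sum(la * w for la, _, w in samples) / total
    lon = sum(lo * w for _, lo, w in samples) / total
    yield (bssid, lat, lon)


# Pass 2: hash each access point to a tile ID, then collect each tile.
def map_tile(access_point):
    _, lat, lon = access_point
    # Assumption: the tile ID is simply the grid cell containing the point.
    tile_id = (math.floor(lat / TILE_DEG), math.floor(lon / TILE_DEG))
    yield tile_id, access_point


def reduce_tile(tile_id, access_points):
    yield (tile_id, sorted(access_points))


def build_tiles(samples):
    """Chain the two jobs: samples -> access points -> tiles."""
    access_points = run_mapreduce(samples, map_sample, reduce_cluster)
    return dict(run_mapreduce(access_points, map_tile, reduce_tile))
```

Calling `build_tiles` on a list of sample tuples returns a dictionary from tile ID to the access points that fall inside that tile. In a real Hadoop job the grouping and sorting done by `run_mapreduce` would be handled by the framework's shuffle phase, and the output of the first job would be written out and read back in as the input of the second.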
