Hadoop is an important batch data processing framework in use by companies of all sizes. It has a very approachable architecture and can be applied to a large group of modern computing problems. In addition, the framework supports an implementation of mapreduce which allows users to run jobs on any size cluster to fit their data size. Come learn about the architecture of this framework, management of the cluster, and how to develop mapreduce jobs.
About the speaker
Dan Colish is a Core Data Engineer at Urban Airship. He is also a maintainer and active open source developer for Xapian and other smaller projects. He resides in Portland with his family and enjoys snowshoeing and hiking around Mt. Hood.