Denver JUG Hadoop and Encryption Presentations

15 Jan 2010

Denver JUG January Meeting

I had the pleasure of hanging out with about 60 of my local friends at the Denver Java Users Group (DJUG to the locals) on Wednesday night and talking about Encryption on the JVM as well as Hadoop. I had the good fortune of having Andy Sautins of Returnpath.net, who's an active user of Hadoop, field a few of the questions. I really appreciate the time a few of the folks spent giving me feedback on Speakerrate.com. For your future reference, below are the slides and sample source. Feedback and suggestions are always welcome at matthewm@ambientideas.com

Encryption Bootcamp on the JVM

Abstract

Does your application transmit customer information? Are there fields of sensitive customer data stored in your DB? Can your application be used on insecure networks? If so, you need a working knowledge of encryption and how to leverage Open Source APIs and libraries to make securing your data as easy as possible. Encryption is quickly becoming a developer’s new frontier of responsibility in many data-centric applications.

In today’s data-sensitive and news-sensationalizing world, don’t become the next headline by an inadvertent release of private customer or company data. Secure your persisted, transmitted and in-memory data and learn the terminology you’ll need to navigate the ecosystem of symmetric and public/private key encryption.

Intro to Hadoop

Abstract

Moore’s law has finally hit the wall and CPU speeds have actually decreased in the last few years. The industry is reacting with hardware with an ever-growing number of cores and software that can leverage “grids” of distributed, often commodity, computing resources. But how is a traditional Java developer supposed to easily take advantage of this revolution? The answer is the Apache Hadoop family of projects. Hadoop is a suite of Open Source APIs at the forefront of this grid computing revolution and is considered the absolute gold standard for the divide-and-conquer model of distributed problem crunching. The well-travelled Apache Hadoop framework is currently being leveraged in production by prominent names such as Yahoo, IBM, Amazon, Adobe, AOL, Facebook and Hulu just to name a few.

In this session, you’ll start by learning the vocabulary unique to the distributed computing space. Next, we’ll discover how to shape a problem and processing to fit the Hadoop MapReduce framework. We’ll then examine the incredible auto-replicating, redundant and self-healing HDFS filesystem. Finally, we’ll fire up several Hadoop nodes and watch our calculation process get devoured live by our Hadoop grid. At this talk’s conclusion, you’ll feel equipped to take on any massive data set and processing your employer can throw at you with absolute ease.