Apache Storm is a distributed realtime computation system. In layman's terms, it lets you process large amounts of incoming data in realtime, spread across many machines, so that you can act on that data as it arrives. It was developed by BackType (later acquired by Twitter) and is now a project of the Apache Software Foundation. Storm makes it simple to do realtime processing over a large incoming stream of data. It is known to be fast: a benchmark clocked it at over a million messages processed per second per node. It is highly scalable, fault-tolerant, guarantees your data will be processed, and is easy to set up and operate.
It is also great fun to use on some very interesting problems.
During this session we will walk through:
1. What it means to have a distributed realtime computation system.
2. How Apache Storm is designed.
3. Understand the various terms used in Apache Storm.
4. Write some code to do some basic realtime computation on incoming data.
5. If time permits, we will discuss how we have changed our algorithms to suit realtime computation.
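To give a flavour of item 4, here is a minimal sketch of the dataflow idea behind a streaming word count. It is plain Python, not the Storm API: "spout" (a stream source) and "bolt" (a processing step) are Storm's terms, but the function names `sentence_spout`, `split_bolt`, `count_bolt` and `run_topology` are hypothetical, and everything runs in one process rather than being distributed across a cluster.

```python
from collections import Counter
from typing import Iterable, Iterator, List, Tuple


def sentence_spout(lines: Iterable[str]) -> Iterator[str]:
    """Stream source (a "spout" in Storm terms): emits one item per input line."""
    for line in lines:
        yield line


def split_bolt(sentences: Iterable[str]) -> Iterator[str]:
    """Processing step (a "bolt"): splits each sentence into lowercase words."""
    for sentence in sentences:
        for word in sentence.lower().split():
            yield word


def count_bolt(words: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Stateful bolt: emits the running count of a word each time it is seen."""
    counts = Counter()
    for word in words:
        counts[word] += 1
        yield word, counts[word]


def run_topology(lines: Iterable[str]) -> List[Tuple[str, int]]:
    """Wire spout -> split -> count, loosely mirroring how a topology is wired."""
    return list(count_bolt(split_bolt(sentence_spout(lines))))


if __name__ == "__main__":
    # Each word is counted as soon as it arrives, without waiting for the
    # whole input to be available.
    for word, count in run_topology(["storm is fast", "storm is fun"]):
        print(word, count)
```

The point of the sketch is that each stage handles one item at a time and passes results downstream immediately; in a real cluster, each stage would run as many parallel tasks on different machines.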
Slides from the talk :
Session difficulty level: Intro/101
Guys, Twitter has made a FREE online course on this topic along with Udacity: https://www.udacity.com/course/real-time-analytics-with-apache-storm--ud381