
Category Archives: Zookeeper

Storm

While looking around for a suitable system to do realtime analytics for the project group, I came across Storm.

https://github.com/nathanmarz/storm

Storm is a system for distributing realtime analytics across a cluster of servers. It uses Zookeeper to coordinate the master and worker nodes, and message queues to build the data streams that connect the worker nodes. Such a network of connected worker nodes is called a topology. A topology can be deployed locally on a single server or submitted to the cluster.
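To make the topology idea a bit more concrete, here is a minimal sketch in plain Python. It does not use Storm's API at all: it only imitates the pattern of nodes connected by queues inside a single process, which is roughly what a local deployment amounts to. Storm calls its stream sources spouts and its processing nodes bolts; the function names, the `_DONE` sentinel and the word-count task are my own assumptions for illustration.

```python
import queue
import threading
from collections import Counter

# Hypothetical end-of-stream marker; real Storm streams are unbounded.
_DONE = object()


def sentence_spout(out_q, sentences):
    """Source node: pushes raw sentences into the first stream."""
    for sentence in sentences:
        out_q.put(sentence)
    out_q.put(_DONE)


def split_bolt(in_q, out_q):
    """Processing node: splits each sentence into lowercase words."""
    while True:
        item = in_q.get()
        if item is _DONE:
            out_q.put(_DONE)  # pass the end marker downstream
            return
        for word in item.split():
            out_q.put(word.lower())


def count_bolt(in_q, counts):
    """Processing node: keeps a running count per word."""
    while True:
        item = in_q.get()
        if item is _DONE:
            return
        counts[item] += 1


def run_topology(sentences):
    """Wire spout -> split -> count with queues and run every node in its own thread."""
    sentence_stream = queue.Queue()
    word_stream = queue.Queue()
    counts = Counter()
    nodes = [
        threading.Thread(target=sentence_spout, args=(sentence_stream, sentences)),
        threading.Thread(target=split_bolt, args=(sentence_stream, word_stream)),
        threading.Thread(target=count_bolt, args=(word_stream, counts)),
    ]
    for node in nodes:
        node.start()
    for node in nodes:
        node.join()
    return dict(counts)


if __name__ == "__main__":
    print(run_topology(["Storm is fast", "storm uses Zookeeper"]))
```

In this sketch every node runs as a thread on one machine. In a real cluster the nodes would be spread over the worker machines, with Zookeeper handling the coordination between master and workers as described above.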

I have read up on the system, and it looks easy to run on our cluster and to connect to, e.g., HBase. All we need is to set up the master process and the worker nodes; Zookeeper is already running for HBase.

Here is some further information:

A presentation on Storm:
http://www.infoq.com/presentations/Storm

The Storm wiki:
https://github.com/nathanmarz/storm/wiki

 

 

Posted on 26.03.2012 in Hadoop, Zookeeper

 

Hadoop and Pig at Twitter

Related to the previous post on how Hadoop is used at Facebook, there is an interesting slide deck on how Twitter is using Hadoop and Pig. Take a look.

The video of the original talk is available at Yahoo!. It is very impressive to see how much less code you have to write in Pig compared to native Java MapReduce jobs (around 20:00).

 

Posted on 31.10.2011 in Hadoop, Pig, Zookeeper

 
