Twitter to open source Storm analytics platform

Just like Hadoop but for infinite processes.

Twitter has revealed plans to open source Storm, the Hadoop-like analytics platform it gained through its acquisition of BackType last month. 

Hadoop is an open source framework for distributed storage and batch processing of large data sets. BackType has previously called Storm the “Hadoop of real-time processing”.

While Hadoop runs finite “MapReduce jobs” with queues and workers, Storm's “topology” processes messages forever or until it is actively switched off, BackType’s Nathan Marz wrote on Twitter’s Engineering blog.
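The difference between a finite job and a computation that runs until it is stopped can be illustrated with a minimal Python sketch. This is not Storm's or Hadoop's API; the function names and the `endless_source` feed are hypothetical and exist only to show the contrast.

```python
from collections import Counter
from itertools import islice
from typing import Iterable, Iterator


def batch_word_count(lines: Iterable[str]) -> Counter:
    """Finite job: consume all of the input, return one result, then finish."""
    counts = Counter()
    for line in lines:
        counts.update(line.split())
    return counts


def streaming_word_count(messages: Iterable[str]) -> Iterator[Counter]:
    """Continuous computation: yield updated counts after every message.

    Keeps running for as long as the source keeps producing messages.
    """
    counts = Counter()
    for message in messages:
        counts.update(message.split())
        yield counts.copy()


def endless_source() -> Iterator[str]:
    """Hypothetical unbounded message source standing in for a live feed."""
    while True:
        yield "storm hadoop storm"


if __name__ == "__main__":
    # The batch job sees a fixed input and terminates on its own.
    print(batch_word_count(["storm hadoop", "storm"]))

    # The stream never ends by itself; the caller decides when to stop.
    for snapshot in islice(streaming_word_count(endless_source()), 3):
        print(snapshot)
```

In the batch case the result only exists once all input has been read, whereas the streaming version has an up-to-date answer after every message and stops only when the caller stops consuming it.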

The most notable users of Hadoop clusters were Yahoo and Facebook; however, in recent months hardware vendors such as Dell and EMC have delivered pre-configured Hadoop stacks for enterprise customers.

Twitter also used a distribution of Hadoop built by Cloudera, Dell’s partner for its recently announced offering, according to Read Write Web.

Storm’s advantages over Hadoop, according to Marz, were that it avoided the need for queues when updating a database and was fault tolerant; that it supported continuous computation suitable for streaming Twitter’s trending topics in a browser; and that it could run “intense” queries in parallel.

"It abstracts the message passing away, automatically parallelizes the stream computation on a cluster of machines, and lets you focus on your realtime processing logic," Marz explained in an earlier blog post for BackType.

Another standout feature was “Storm's awesome automated deploy” that allowed a user to create a Storm cluster on Amazon’s EC2 cloud “with just the click of a button”, he said.

Prior to Twitter's acquisition, BackType had released a successful product, BackTweets, which offered companies a sales lead-generation system to track who messages were reaching. 
