Facebook's data query engine now open source

By
Follow google news

Presto pokes petabytes.

Social network Facebook has released its structured query language (SQL) engine as open source, following a strategy set out a year ago.

Facebook's data query engine now open source

Called Presto, the query engine is able to use a variety of data bases, from multiple Hadoop Distributed File System (HDFS) sources to proprietary data stores. The purpose of Presto is to provide fast interactive analytics with the speed of commercial data warehouses, and was written from the ground up by Facebook.

It scales to very large sized data sources, and Facebook itself uses Presto for its 300 petabyte data warehouse, with over a thousand employees running 30,000 queries daily, scanning over a petabyte of information each time.

A large subset of standard ANSI SQL is supported, but as of yet, cannot write output data back into tables with queries being streamed to clients instead.

Presto architecture; source: Facebook
Presto architecture; source: Facebook

Facebook engineer Martin Traverso said in a blog post that Presto came about after the social network's data warehouse grew to petabyte scale, and it wanted an interactive system optimised for low latency for queries against its HDFS-based clusters.

As a result, Presto does not use MapReduce and does all processing in memory which in turn is pipelined through the network between stages to avoid unnecessary I/O which would cause greater latency for queries.

According to Traverso, Presto is ten times better than Hive and MapReduce for processor efficiency and query latency in most cases. Traverso says Presto was written in Java for speed of development and "has a great ecosystem" while being easy to integrate with other components Facebook has built in the same language.

Got a news tip for our journalists? Share it with us anonymously here.
Copyright © iTnews.com.au . All rights reserved.
Tags:

Most Read Articles

National photo licence recognition system set to go live in 2025

National photo licence recognition system set to go live in 2025

ANZ CEO backs Plus tech stack, but changes "inefficient" delivery

ANZ CEO backs Plus tech stack, but changes "inefficient" delivery

Qld lifts 12-year ban on IBM after $1.25bn payroll failure

Qld lifts 12-year ban on IBM after $1.25bn payroll failure

Google says Australian law on age verification 'extremely difficult' to enforce

Google says Australian law on age verification 'extremely difficult' to enforce

Log In

  |  Forgot your password?