LNG operator ConocoPhillips has turned to Hadoop to put off upgrading more expensive parts of its data infrastructure.
The company, which has two existing liquefied natural gas (LNG) operations in northern Australia and a joint venture interest in the Australia Pacific LNG project, is also using public cloud as a sandbox to experiment with emerging data use cases.
Speaking at the Hadoop Summit 2016 in San Jose last week, ConocoPhillips’ director of analytic platforms Kelly Cook provided a rare insight into how under-pressure resources companies are spinning out what’s left of their IT budgets.
“We’re in an industry that’s in a terrible financial position right now because of the collapse in oil prices and the current economic situation, and so any time we can save or defer capital [expenditure], we’re doing something good for the business,” Cook said.
“What we’re seeing is Hadoop is a platform that offers us a way to defer capital expense for the larger traditional data warehouse platforms [we run].
“The primary use cases that we’ve looked at so far are things like ELT and ETL offload - basically using Hadoop as a less expensive alternative to the many jobs that are running in our data warehouse environments throughout the organisation, and a way to defer upgrades or expansion of those other platforms.”
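The offload pattern Cook describes - lifting transformation jobs out of the warehouse and running them on cheaper batch infrastructure - can be sketched in miniature. The example below is illustrative only, not ConocoPhillips' actual pipeline: the field names and data are invented, and in practice the transform would run as a Hive or MapReduce job over files in HDFS rather than as plain Python.

```python
import csv
import io

# Hypothetical raw extract that previously fed an expensive
# in-warehouse transformation job; fields invented for illustration.
RAW = """well_id,date,barrels
W-001,2016-06-01,120
W-001,2016-06-02,115
W-002,2016-06-01,98
W-002,2016-06-02,101
"""

def transform(raw_csv):
    """Aggregate daily production per well - the kind of ETL step
    that could be offloaded from the warehouse to a batch cluster."""
    totals = {}
    for row in csv.DictReader(io.StringIO(raw_csv)):
        totals[row["well_id"]] = totals.get(row["well_id"], 0) + int(row["barrels"])
    return totals

if __name__ == "__main__":
    print(transform(RAW))  # -> {'W-001': 235, 'W-002': 199}
```

In a real offload the same aggregation would be expressed as a scheduled Hive query or Spark job against data landed in the cluster, so the warehouse no longer burns capacity on it - which is what lets the warehouse upgrade be deferred.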
Switching the company’s data infrastructure allegiance to Hadoop had already resulted in “tangible savings”, according to Cook.
It cost Cook just US$6000 (A$8039) to set up the Hadoop cluster.
“The way that we were able to get an initial investment in Hadoop was that we went down to the loading dock and we took a pallet of technical workstations that were going to be recycled,” he said in a separate presentation.
“We bought US$6000 worth of drives from CDW, found an end of a building that wasn’t being used and we built a Hadoop cluster in the cubes [office cubicles] of that building out of these technical workstations.
“We had the luxury of finding pretty sweet workstations that had 64GB of RAM and Xeon chips, and our initial 15-node cluster was able to process some internal data sets in minutes that people hadn’t been able to see before.
“Demonstrating those kinds of successes at that financial level is the kind of attention you’re looking for.”
Expanding use cases
Cook indicated there was much more that ConocoPhillips could do with the platform.
“We’re [Hadoop] rookies,” he said. “We’re fairly new at this. The technology is pregnant with opportunity for us.”
He planned to point Hadoop at a range of “analytics, machine learning and … typical efficiency-type analysis” that has become a hallmark of the optimisation drives of many resources and industrial firms in recent years.
“With cost being the theme on everyone’s mind, we’re looking at literally anything and everything across the organisation that we can optimise, from supply chain assets to maintenance schedules to the operation of [oil] wells,” Cook said.
“Literally anything you can optimise is a candidate.”
Not all potential use cases for Hadoop are internally driven: an increasing number are coming from engagements with technology vendors, Cook said.
“We’re starting to see a number of applications draw us to Hadoop,” he said. “In other words, vendors bring in applications where Hadoop is an underlying platform or an underlying platform requirement.”
Cook said ConocoPhillips was testing some of those potential use cases out in a cloud-based sandbox.
“You don’t want to test them out in your production [Hadoop] cluster or in your data lake,” he said.
“The cloud provides a perfect opportunity to run those trials, tests and proof-of-concept opportunities.”
Cook said the cloud gave the analytics team somewhere to try out new ideas and establish business cases before asking the business for money to fund them.
“When you come up with a new use case, and the business isn’t entirely sure how that’s going to impact your on-premises infrastructure and isn’t really ready to invest in expanding the infrastructure to support that use case, the cloud provides us with an opportunity to expand out that use case, develop it a little bit, see if there’s business value there, and then determine how to operationalise it at a later point in time,” he said.
Understanding open source
While ConocoPhillips introduced Hadoop to defer spend on traditional technologies, it faced some challenges from a user base that had become accustomed to the way the traditional technology vendors – and their solutions – worked.
“I think if I had it to do over again I would go back and spend some time helping people understand that we weren’t simply adopting yet another thing that holds data,” Cook said.
“Going back a couple of years [when we presented] Hadoop to the company, I and my team understood very well that we were becoming participants and partakers in an open source community.
“But when I presented the platform I think there were a lot of assumptions on the part of the people I presented the platform to. They just assumed it was like everything else they consumed from the typical big tech vendors - that it had a certain cycle of maturity, a certain level of interface and a certain slow progress. [But] it’s a very different paradigm.”
Cook said it had taken a “six month-ish” internal drive to win business support to invest in giving back to the Hadoop and open source communities.
“[In adopting Hadoop] we were becoming part of an open source community that supports an ecosystem of stuff that does an incredible amount of things for just a ridiculous value, and the more we invest, the more we get out,” Cook said.
“I think people have latched on now and appreciate the fact that being part of the open source community – even contributing back to it – is incredibly valuable for the enterprise.”