Sailing into the big data clouds

Cloud means lower cost, higher competitiveness for big data.

Food, sports and gambling aren’t the first things that come to mind when the twin concepts of big data and the cloud are thrown around.

But companies working in these three industries – and many more besides – are finding competitive advantage and cost savings by using the cloud as the location for their big data storage and analytics.

Take sailing. The America’s Cup is one of the most competitive, challenging yacht races on earth. Jon Bilger’s been on the podium twice, once in 2003 and again in 2007. Both times he was team weather manager for Alinghi, and both times the team came out on top.

After the second victory, Bilger started thinking about how the weather prediction techniques he used during the races could be commercialised and offered to yachtsmen across the globe.

“We licensed a weather modelling tool from the CSIRO, and then installed 25 servers in a US data centre,” said Bilger, from his home in New Zealand. “That was how we founded PredictWind.”

Because of the limited compute power, PredictWind had to divide the globe into four regions, and the resolution of those regions was only 100 square kilometres. This was adequate to provide a good-enough service, but PredictWind wanted greater accuracy and better resolution.

To do that, the company needed more computing power. It also needed to ditch the servers in the US, which were suffering from reliability problems because they were running full-tilt, 24 hours a day.

The answer PredictWind came up with was to move its weather data and computing to the cloud. At the beginning of this year, PredictWind’s big data needs were handed over to Amazon Web Services, and the company hasn’t looked back.

"There are around 1.1 terabytes of netCDF files produced every 24 hours," noted Bilger. Those files, however, are not stored permanently. "The working dataset is around 376 gigabytes, and there are 4.5GB of compressed GRIB files archived daily. The site is also dealing with around three requests per second," he added.
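To put those figures in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes decimal units (one terabyte is 10^12 bytes) and a load spread evenly across the day, neither of which the article confirms. The function names are made up for illustration.

```python
# Rough rates derived from the figures Bilger quotes.
# Assumptions (not from the article): decimal units and an evenly spread load.
SECONDS_PER_DAY = 24 * 60 * 60

NETCDF_BYTES_PER_DAY = 1.1e12  # "around 1.1 terabytes ... every 24 hours"
REQUESTS_PER_SECOND = 3        # "around three requests per second"


def average_output_mb_per_s(bytes_per_day: float) -> float:
    """Average file output in megabytes per second over a full day."""
    return bytes_per_day / SECONDS_PER_DAY / 1e6


def requests_per_day(per_second: float) -> int:
    """Total requests in a day at a steady per-second rate."""
    return round(per_second * SECONDS_PER_DAY)


if __name__ == "__main__":
    print(f"{average_output_mb_per_s(NETCDF_BYTES_PER_DAY):.1f} MB/s of netCDF output")
    print(f"{requests_per_day(REQUESTS_PER_SECOND):,} requests per day")
```

On those assumptions the model output averages a little under 13 megabytes every second, around the clock, which helps explain why the files are not kept permanently.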

Big data – the unfair advantage

Greyhound racing is a long way from sailing, and unlike the sport of CEOs, it has an air of old Australia about it. Aging gents walking their lithe hounds early in the morning, cigarette butts and betting stubs littering the grandstands at the tracks, and dogs, crazy fast dogs tearing after an artificial rabbit while punters tear their hair, hoping for a win.

It also turns out greyhound racing is the perfect platform for statistical analysis in the cloud, said Jon Rout, co-founder of Tipster.

“We are using quant models, which are used by investment banks,” said Rout. “Tipster allows you to create a quant model of a greyhound race and do an analysis of which dog will likely win.”

Call it an unfair advantage. Rout has historical data for greyhound racing going back years. There’s also the fact a greyhound race has a limited number of variables, those being the performance of each dog. It’s the perfect venue for throwing heavy computational analysis at. Tipster, instead of buying its own servers and computational iron, outsources it all to the cloud.

“The calculations are done in memory via Amazon ElastiCache,” said Rout. “It takes about three seconds to calculate the odds. If we were pulling the stats in and out of a database, it could take thirty minutes.”

Tipster captures an average of 100 races per day with up to 10 dogs per race. Add to that the 50 or so metrics per dog and Tipster is adding around 50,000 new records per day.
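The arithmetic behind that daily figure, and one way a punter's own weightings might turn per-dog metrics into a score, can be sketched in a few lines of Python. This is an illustration only: the factor names, values and weights are hypothetical, and the simple weighted sum is an assumed scoring rule rather than Tipster's actual model.

```python
# Illustrative sketch only: factor names, values and weights are hypothetical,
# and the weighted sum is an assumed scoring rule, not Tipster's actual model.

# Daily record volume from the article's figures. This is an upper bound,
# since races have "up to" 10 dogs.
RACES_PER_DAY = 100
DOGS_PER_RACE = 10
METRICS_PER_DOG = 50
records_per_day = RACES_PER_DAY * DOGS_PER_RACE * METRICS_PER_DOG


def score_dog(factors, weights):
    """Weighted sum of one dog's factors; factors without a weight count as 0."""
    return sum(weights.get(name, 0.0) * value for name, value in factors.items())


def rank_field(field, weights):
    """Return (dog, score) pairs sorted from highest score to lowest."""
    scores = {dog: score_dog(factors, weights) for dog, factors in field.items()}
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)


# Hypothetical two-dog field with factors normalised to the range 0..1.
field = {
    "Dog A": {"recent_form": 0.8, "box_draw": 0.5, "track_record": 0.6},
    "Dog B": {"recent_form": 0.6, "box_draw": 0.9, "track_record": 0.7},
}
# Hypothetical user weightings for a custom model.
weights = {"recent_form": 0.5, "box_draw": 0.2, "track_record": 0.3}

if __name__ == "__main__":
    print(f"{records_per_day:,} new records per day")
    for dog, score in rank_field(field, weights):
        print(f"{dog}: {score:.2f}")
```

Changing the weights changes the ranking, which is the point of letting each user build a custom model.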

"There are currently just under three million total stored statistics used to calculate quant factors and hence the scores for each dog when combined with the weightings that a user inputs into their custom model," Rout said. 

Tipster and PredictWind are part of a trend in which small and medium-sized companies are using the cloud to perform calculations and analysis which would have been the sole province of supercomputers only a few years back, said Denise Montgomery, research director, financial services, Ovum Asia Pacific.

“When we’re looking at big data in the cloud, it seems the greatest [...],” said Montgomery. “When it comes to the cloud, there is a natural fit, partly for the smaller players who don’t have the resources that big players have.”

Rout said Tipster is looking at moving into other sports, but settled on greyhounds because they were a relatively simple proposition data-wise. “Greyhounds are relatively simple because there’s a single competitor,” he said. “With horses, there’s the jockey and the trainer, as well as the horse itself to consider. In terms of computing complexity, it’s easier to work with greyhounds. With soccer, there are 11 players’ form you need to take into account.”

You can lead a horse to water…

One of the singularly surprising things heavy mobile data users discover is how poor many restaurant and food websites are. “We estimate 95 percent of restaurants are ignoring this mobile traffic,” said James Eling, founder of Marketing4Restaurants.

For Eling, the idea of using big data in the cloud to help restaurants take advantage of mobile came from personal experience. “I was trying to find a restaurant which would deliver to my home,” he said. “I tried typing into Google where I lived and restaurant delivery, and it could not tell me what I was after.”

Marketing4Restaurants takes mobile search data, as well as real time Facebook and Twitter data relating to restaurants, and pushes it to the cloud. “We aggregate all this data into a single useable stream, and then feed it back to the restaurants and they can use it to streamline their marketing,” he said.

Eling decided to go with MongoDB running on Microsoft Azure. “We chose those systems because they were easy to use, and we essentially had it up and running in no time at all,” he said. At present, the data stored in the cloud isn't huge, added Eling, at around 35 gigabytes. "However, we're also dealing with around 45,000 queries per day, and our traffic is growing at 25 percent per month."
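A quick Python sketch shows what those numbers imply if the growth holds. It assumes queries grow at the same 25 percent monthly rate as traffic and that the rate stays constant, which is a simplification the article does not claim. The function names are illustrative.

```python
import math

# Assumptions (not from the article): query volume grows at the same rate as
# traffic, and the 25 percent monthly rate is sustained without change.
SECONDS_PER_DAY = 24 * 60 * 60


def average_qps(queries_per_day: float) -> float:
    """Average queries per second for a given daily total."""
    return queries_per_day / SECONDS_PER_DAY


def projected_daily_queries(initial: float, monthly_growth: float, months: int) -> float:
    """Daily queries after compounding monthly growth for a number of months."""
    return initial * (1 + monthly_growth) ** months


def doubling_time_months(monthly_growth: float) -> float:
    """Months needed for volume to double at a constant monthly growth rate."""
    return math.log(2) / math.log(1 + monthly_growth)


if __name__ == "__main__":
    print(f"{average_qps(45_000):.2f} queries per second on average")
    print(f"{projected_daily_queries(45_000, 0.25, 12):,.0f} queries per day after 12 months")
    print(f"doubles roughly every {doubling_time_months(0.25):.1f} months")
```

At that rate the load doubles in a little over three months, which is the kind of curve where paying for capacity as it is needed tends to beat buying servers up front.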

Restaurants pay a flat membership fee, and then ongoing monthly fees to take advantage of Marketing4Restaurants’ market intelligence. At present, Eling said the business has around 100 customers in Australia and New Zealand, and they’re looking to expand into the US during 2013.

What big data means for small and medium-sized organisations is that they’re able to compete at the big end of town, at a cost-effective price, added Eling.

Cloud means lower costs

PredictWind fits the bill for a small, data-rich company wanting to take advantage of big data in the cloud. According to founder Bilger, the shift to Amazon Web Services has allowed it to increase the resolution of its predictions from 100 kilometres down to one kilometre.

Each data point has also increased from a single wind vector to 3600 individual vectors.

“When you go to a one-kilometre resolution, you need a lot of computing power,” Bilger said. “Jobs which used to take 11 hours on our old infrastructure are now down to two one-hour jobs per day.”
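Why finer resolution is so hungry for compute can be seen with a small Python sketch. It assumes the resolution figures refer to the spacing of a regular square grid, and it uses a hypothetical 1,000 by 1,000 kilometre region. Neither detail comes from PredictWind.

```python
# Assumptions (not from the article): resolution means the spacing of a regular
# square grid, and the 1,000 km x 1,000 km region is purely hypothetical.


def cells_in_region(width_km: int, height_km: int, spacing_km: int) -> int:
    """Number of grid cells needed to cover a rectangular region."""
    return (width_km // spacing_km) * (height_km // spacing_km)


def cell_count_ratio(old_spacing_km: float, new_spacing_km: float) -> float:
    """How many times more cells the finer grid needs for the same area."""
    return (old_spacing_km / new_spacing_km) ** 2


if __name__ == "__main__":
    coarse = cells_in_region(1_000, 1_000, 100)
    fine = cells_in_region(1_000, 1_000, 1)
    print(f"100 km grid: {coarse:,} cells")
    print(f"1 km grid:   {fine:,} cells")
    print(f"ratio: {cell_count_ratio(100, 1):,.0f}x")
```

Because the cell count grows with the square of the improvement in spacing, a hundredfold finer grid needs ten thousand times as many cells over the same area, before counting any extra vectors per point.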

Because PredictWind pays Amazon for compute power on an hourly basis, the cost of creating predictions has also dropped significantly, while accuracy has increased by orders of magnitude.

“We’re now able to offer sailors around the world extremely sophisticated tools,” Bilger said. “We can plan routes which take advantage of the best-prevailing wind, and which take a sailor around the worst weather for a more comfortable journey.”

“We wouldn’t have been able to do this without the cloud,” he added. 
