Computing for the Large Hadron Collider

David Foster, deputy head of IT at CERN.

Did CERN develop its own software for operating the global compute grid?

For the last 10 to 12 years, there has been a continuous stream of EU investment in middleware production.

Very early on there were EU-funded projects to help develop the grid middleware. It started with the European Data Grid around 2000. After that there were three more EU projects to develop middleware under a project titled Enabling Grids for E-Science in Europe (EGEE).

Following that there was a separate middleware project that continues today – the European Middleware Initiative (EMI) – which manages software development and collaboration around the middleware, and the European Grid Initiative (EGI), which handles the operational aspects of the grid such as account management and monitoring.

What this consistent funding has produced is a robust and organic system. At any time there may be sites unavailable, sites down for maintenance, network problems or new sites joining – all of this means the grid has to be very dynamic. It has to accommodate a resource pool that changes almost in real time. If a job fails, it can be rescheduled at a different site.
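The rescheduling behaviour described above can be sketched roughly as follows. This is an illustrative toy, not actual grid middleware code – the function and parameter names are invented for the example:

```python
import random

def run_job(job, sites, max_attempts=3):
    """Illustrative sketch of grid-style rescheduling.

    `sites` maps a site name to a submit function that returns True on
    success. If a site fails, the job is simply retried elsewhere, up to
    `max_attempts` times. Real middleware also tracks site availability,
    queues and priorities; none of that is modelled here.
    """
    attempts = 0
    pool = list(sites)
    random.shuffle(pool)  # the resource pool has no fixed order; sites come and go
    for site in pool:
        if attempts >= max_attempts:
            break
        attempts += 1
        if sites[site](job):
            return site  # the job completed at this site
    return None  # all attempts failed; the job would go back to the queue
```

The key point the sketch captures is that failure at one site is routine, not fatal: the scheduler just tries another member of the changing pool.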

Were commercial packages available back when this was developed?

We don't tend to use complete commercial solutions. Wherever a commercial solution emerges that is viable for us to use, we won't develop our own. But grid computing technology was, and still is, very new in commercial terms, so it has required some investment of our own.

Who owns the middleware?

Most of the grid middleware developed by EMI and used by the LHC is released under the Apache Software Licence v2.0 (ASLv2). The ASLv2 licence allows redistribution of the software as part of commercial products. Although the original redistributed code must retain the ASLv2 licence, any work derived from it can be licensed under different, possibly commercial, software licences. The choice of ASLv2 was made explicitly to allow commercial companies to include the grid middleware in their products.

Currently no fees are paid for using the software and no patents have been filed to my knowledge.

Has this software been exploited for use by cloud computing providers?

It’s important first to note the distinction between cloud and grid computing.

Grid computing is the connection of resources owned by the community. In grid computing, we both own computers and share those resources to optimise use of our capital expenditure. We both have a big CapEx invested, and we both want to optimise use of that CapEx.

In cloud computing, computing resources are bought and sold over the network. A third party has spent CapEx on infrastructure and customers buy those resources on an OpEx basis.

The distinction is predominantly one of business model. From a technical point of view, there is rather less of a difference; the underlying technology isn't so different. Cloud computing is another interface to get a raw virtual machine or application. It's an interface to get work done, where you can re-purpose processors to do different jobs.

You could say that the defining difference from a technical perspective is also that cloud computing can be rather more interactive, whereas grid computing is entirely batch in its current form.

The paradigm shift for both was advances in networking. We saw the potential for a cloud business model very early on. Once there was a big market of people with access to high-performance networks, service providers could make computing available at massive economies of scale through huge data centres. That reduces OpEx to a minimum for the customer. Everyone recognises this as a big way forward. You can do so much more when you treat computing as an operational expense.

Does CERN make use of the cloud, even with your large CapEx investment?

So far we are experimenting with cloud computing. There are some practical issues associated with it. One is that it is not in the interests of cloud providers to offer a homogeneous interface to the cloud: they see a competitive advantage in the specialised interfaces they provide, so switching from one cloud provider to another is problematic.

Further, moving the amount of data we produce in and out of a cloud is prohibitively expensive. Current cloud pricing models are heavily weighted towards the cost of moving I/O.
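A back-of-envelope calculation shows why transfer cost dominates at this scale. The per-gigabyte rate below is a purely assumed illustrative figure, not any provider's actual price:

```python
# Back-of-envelope estimate of the cost of moving data out of a cloud.
# ASSUMPTION: a flat per-GB egress rate; $0.10/GB is an invented figure
# used only for illustration, not a real price list.
ASSUMED_EGRESS_USD_PER_GB = 0.10

def egress_cost_usd(petabytes, rate_per_gb=ASSUMED_EGRESS_USD_PER_GB):
    """Cost of transferring `petabytes` of data at a flat per-GB rate."""
    gigabytes = petabytes * 1_000_000  # 1 PB = 1,000,000 GB (decimal units)
    return gigabytes * rate_per_gb

# At 15 PB per year (a rough LHC-era data volume) and the assumed rate,
# transfer alone would run to about $1.5 million per year.
```

Even if the assumed rate is off by a factor of a few, moving petabytes repeatedly in and out of a cloud quickly swamps the other costs.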

And finally, we already have a very large capital expenditure. We have all the skills and the equipment, and we do things ourselves, so we are probably not going to pay a premium to do the same thing with a cloud provider. At a certain scale, it will always be more expensive with a cloud provider.

All that being said, the fact that cloud is an OpEx play makes it an interesting model for new, smaller-scale science experiments. Instead of hiring sys admins, buying hardware, and paying for a hosting environment and a network expert, perhaps you just buy the computing you need – it's more convenient. Why would you invest your capital in your own infrastructure?

Whether it works for you is a question of application, scale and risk. It depends on the scale and complexity of the data and processing you need to do. You might be a four-person team, but you might also be running an experiment generating petabytes of data. So yes, you might be a small outfit, but in that case it's probably cheaper to hire two more people and build your own infrastructure than to buy from a cloud.

There will be more use cases as cloud providers push down their prices. We're clearly at a premium now.
