Five researchers from Alcatel-Lucent's Bell Labs and Stony Brook University have created a cloud storage platform they hope will provide a reference design for a future flexible, customisable framework.
The researchers last week published their paper on their SEARS - Space Efficient And Reliable Storage - system.
They said cloud storage customers were clear on what they expected of today's cloud services but the reality was these expectations - reliability, fast response, good usability - were not easy to achieve in unison.
Users were therefore being forced to choose between file retrieval times and storage efficiency, the researchers said.
"Any file system must offer reliable storage whether through file duplication that requires more space but less computation complexity such as GFS or through erasure coding that requires less space but more computation complexity such as RAID systems," they wrote [pdf].
As such, the researchers have come up with SEARS, which uses deduplication and erasure coding techniques to allow systems administrators to configure the system based on their needs for either high storage efficiency or fast file retrieval.
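The space-versus-computation trade-off the researchers describe can be made concrete with some simple arithmetic. The sketch below is illustrative only: the three-copy and (9, 6) figures are assumed example parameters, not values taken from the SEARS paper, and it assumes an ideal erasure code in which any k of the n stored pieces are enough to rebuild a file.

```python
def replication_overhead(copies: int) -> float:
    """Bytes stored per byte of user data when keeping full copies."""
    if copies < 1:
        raise ValueError("need at least one copy")
    return float(copies)


def erasure_overhead(n: int, k: int) -> float:
    """Bytes stored per byte of user data for an (n, k) erasure code.

    The file is split into k pieces and encoded into n pieces;
    assumption: any k of the n pieces can reconstruct the file.
    """
    if not 0 < k <= n:
        raise ValueError("require 0 < k <= n")
    return n / k


def tolerated_losses_replication(copies: int) -> int:
    """Number of copies that can be lost while the file survives."""
    return copies - 1


def tolerated_losses_erasure(n: int, k: int) -> int:
    """Number of pieces that can be lost while the file survives."""
    return n - k


if __name__ == "__main__":
    # Hypothetical example parameters, chosen only for illustration.
    print("3 copies:", replication_overhead(3), "x space,",
          tolerated_losses_replication(3), "losses tolerated")
    print("(9, 6) code:", erasure_overhead(9, 6), "x space,",
          tolerated_losses_erasure(9, 6), "losses tolerated")
```

Under these assumed parameters the erasure code stores half as much data as three full copies while surviving more lost pieces, at the price of encoding and decoding work on every write and rebuild.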
"Given a file, there are different ways to associate data chunks with available storage servers and retrieve data. Archive-based backup systems mainly care about storage efficiency and reliability," they wrote.
"However, interactive cloud storage systems also care about file retrieval speed.
"With proper association of data to storage server clusters, SEARS provides flexible mixing of different configurations, suitable for real-time and archival applications."
To meet different needs, the system offers two schemes that let systems administrators trade off retrieval performance against storage efficiency.
Chunk-level binding is for archival storage running in the background. It offers system-wide data deduplication (ideal for large media content repositories where users share the same or similar content).
User-level binding deals with applications focused on performance and puts fast file retrieval ahead of system-wide deduplication.
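The difference between the two schemes can be sketched in a few lines of code. This is a toy model under stated assumptions, not the SEARS implementation: the `ToyStore` class, the four-byte fixed-size chunks and the hash-based choice of cluster are all hypothetical, and erasure coding is left out entirely. It assumes chunk-level binding places each chunk according to its content hash, while user-level binding places all of a user's chunks on one cluster.

```python
import hashlib

CHUNK_SIZE = 4  # tiny fixed-size chunks, for illustration only


def split_chunks(data: bytes, size: int = CHUNK_SIZE) -> list:
    """Split data into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def chunk_id(chunk: bytes) -> str:
    """Identify a chunk by the SHA-256 hash of its content."""
    return hashlib.sha256(chunk).hexdigest()


class ToyStore:
    """Hypothetical store with several clusters and one binding policy."""

    def __init__(self, num_clusters: int, binding: str):
        if binding not in ("chunk", "user"):
            raise ValueError("binding must be 'chunk' or 'user'")
        self.num_clusters = num_clusters
        self.binding = binding
        # One dict per cluster: chunk id -> chunk bytes.
        self.clusters = [dict() for _ in range(num_clusters)]

    def _cluster_for(self, user: str, cid: str) -> int:
        # Chunk binding: placement depends on the chunk's content hash,
        # so identical chunks from any user land in the same place.
        # User binding: placement depends only on the user.
        key = cid if self.binding == "chunk" else user
        digest = hashlib.sha256(key.encode()).digest()
        return int.from_bytes(digest[:8], "big") % self.num_clusters

    def put(self, user: str, data: bytes) -> list:
        """Store a file; return its recipe of (cluster, chunk id) pairs."""
        recipe = []
        for chunk in split_chunks(data):
            cid = chunk_id(chunk)
            cluster = self._cluster_for(user, cid)
            # Deduplicate: keep a chunk only if this cluster lacks it.
            self.clusters[cluster].setdefault(cid, chunk)
            recipe.append((cluster, cid))
        return recipe

    def get(self, recipe: list) -> bytes:
        """Rebuild a file from its recipe."""
        return b"".join(self.clusters[c][cid] for c, cid in recipe)

    def stored_chunks(self) -> int:
        """Total number of chunks physically stored across clusters."""
        return sum(len(c) for c in self.clusters)
```

In this model, chunk binding stores each distinct chunk once across the whole system but may scatter one file over several clusters, whereas user binding keeps every file on a single cluster and only removes duplicates that happen to share that cluster.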
The researchers tested the system on ten machines in the US and claimed it outperformed Amazon EC2, with a retrieval time of 2.5 seconds for 3 MB files compared to Amazon's 7 seconds.