When Australian swimmers touch down in Japan for the 2021 Tokyo Olympics in under 70 days’ time, they’ll have a single source of truth for racing data at their disposal.
Swimming Australia, the country’s peak body for swimming, has stood up a data lake on Amazon Web Services to give coaches a better picture of how athletes are performing.
The data lake, which has been developed as part of an 18-month engagement with AWS, pulls together data, metrics and statistics that were previously siloed across the organisation.
“In high performance sport, we’re measuring a lot of stuff,” Swimming Australia performance solutions manager Jess Corones said ahead of the AWS Summit on Wednesday.
“We’ve got a lot of different tools, some that are specifically built primarily for swimming, but our data [sources weren't] talking to each other.
“[Data] was all over the place, and we had to find a way to get more power out of that data, so we wanted to democratise it and put it… in the hands of the people that matter most.”
Corones said that spreadsheets had long been the main method for tracking athlete development, meaning the kinds of reports or data produced usually depended on “who a coach would ask”.
“Every sport scientist or every university was really just developing their own dashboards or Excel reports or graphs, so even for a coach… it was very difficult to read,” she said.
“It was a really disorganised domain for [coaches], and they would take from it what they could, but it took a lot of translation from sport scientists as to how to interpret that data.
“We had the data there, but we were really immature in the way we handled the data and surfaced the data.”
The data lake, called Atlantis, is now the “mothership” for evidence-based decision making, containing both competition and training data – Swimming Australia's two biggest data sources.
“It’s the ‘city’ that sits under the sea of all our data sources, and by putting [all the sources] in one place, we can now compare,” Corones said.
“We wanted the data [sources] to be able to talk to each other, so we could go, ‘OK, how does what they do in training relate to what we’re actually seeing… at the competition?’”
One of the initial use cases for the data lake is a ‘relay model’ that allows coaches to pick and choose swimmers based on their performance and to see how rivals are performing at race meets, a previously manual task.
“Normally, it's days and weeks and hours of scraping the internet looking for international results to try and understand and see where their form is at,” she said.
Working with AWS, Swimming Australia found a “way to get those data sources quickly”, giving coaches a better understanding of the form of other nations “at a touch of a button”.
“So when we have coaches say, ‘Great Britain just had their trials, where are [the results] at’, instead of having to go away and type it all out… we can give them the answers straightaway,” she said.
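As a rough illustration of the kind of comparison a relay model makes, the sketch below picks the four fastest swimmers from each squad and compares the combined times. It is a minimal sketch, not Swimming Australia's model: the function name `pick_relay`, the squad names and every time shown are hypothetical, and it ignores changeover gains and stroke order.

```python
from typing import Dict, List, Tuple


def pick_relay(best_times: Dict[str, float], legs: int = 4) -> Tuple[List[str], float]:
    """Pick the `legs` fastest swimmers and return their names and combined time.

    `best_times` maps a swimmer's name to a recent best time in seconds.
    This is a simplification: it treats the relay as a plain sum of
    individual times (hypothetical approach, not the real model).
    """
    if len(best_times) < legs:
        raise ValueError("not enough swimmers to fill the relay")
    # Sort by time, fastest first, and keep the first `legs` entries.
    fastest = sorted(best_times.items(), key=lambda item: item[1])[:legs]
    names = [name for name, _ in fastest]
    total = sum(time for _, time in fastest)
    return names, total


# Hypothetical 100m freestyle times in seconds (invented for illustration).
home_squad = {"Swimmer A": 47.5, "Swimmer B": 48.0, "Swimmer C": 48.25,
              "Swimmer D": 48.5, "Swimmer E": 49.0}
rival_squad = {"Rival A": 47.75, "Rival B": 48.0, "Rival C": 48.5,
               "Rival D": 48.75}

home_team, home_total = pick_relay(home_squad)
rival_team, rival_total = pick_relay(rival_squad)
margin = rival_total - home_total  # positive means the home squad is faster
```

With results from rival trials already loaded, a comparison like this can be rerun whenever new times arrive instead of being typed out by hand.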
The data lake can also be used at the junior swimming level to understand retention rates, after an algorithm developed by the University of Sydney was embedded within it.
The algorithm was specifically developed to understand the reasons why some kids drop out, particularly in age brackets where there can be a big difference in maturity between athletes.
“One of the projects we did was linking this algorithm that the University of Sydney had developed,” Corones said.
“It’s an age correction model that looks at the athlete’s performance and their date of birth.
“It works a little bit like a golf handicap.”
A third proof-of-concept centres on performance benchmarking, which involves “breaking down a race into starts, turns [and] finishes, and seeing where [swimmers] would rank with those metrics”.
“Being able to put that in the coaches’ hands on the pool deck is a huge, huge benefit for them because that’s where we get most of the questions,” Corones said.
“Not having to go back to the office and mine… that information, and to be able to give them that feedback then and there, is really, really helpful for them and gives them an [immediate] target.”
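The benchmarking idea described above, splitting a race into segments and ranking swimmers on each, can be sketched in a few lines. This is a minimal sketch under stated assumptions: the `RaceSplit` fields, the `rank_by_segment` helper and all of the times are hypothetical, and the article does not say how Swimming Australia actually defines or measures each segment.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class RaceSplit:
    """Segment times in seconds for one swimmer (hypothetical fields)."""
    swimmer: str
    start_s: float
    turn_s: float
    finish_s: float


SEGMENTS = ("start_s", "turn_s", "finish_s")


def rank_by_segment(splits: List[RaceSplit]) -> Dict[str, Dict[str, int]]:
    """Return {segment: {swimmer: rank}}, where rank 1 is the fastest."""
    rankings: Dict[str, Dict[str, int]] = {}
    for segment in SEGMENTS:
        # Lower time is better, so an ascending sort puts the fastest first.
        ordered = sorted(splits, key=lambda s: getattr(s, segment))
        rankings[segment] = {
            s.swimmer: position for position, s in enumerate(ordered, start=1)
        }
    return rankings


# Invented segment times, for illustration only.
field = [
    RaceSplit("Swimmer A", start_s=6.10, turn_s=7.95, finish_s=5.40),
    RaceSplit("Swimmer B", start_s=5.95, turn_s=8.10, finish_s=5.55),
    RaceSplit("Swimmer C", start_s=6.20, turn_s=7.80, finish_s=5.35),
]

ranks = rank_by_segment(field)
```

In this made-up field, a swimmer who ranks first off the start but last on turns has an obvious target, which is the kind of immediate feedback Corones describes.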
Corones said that while the data lake is still in its “infancy”, having only been handed over from AWS in January, its potential is “immense”.
“We’re only just starting to tap the surface of what we can do with it and gradually exposing the coaches to how we can use the data lake and what they can get from it as well,” she said.
“We’re using the data from it for the first time at the Tokyo Olympics, which is really exciting.”