The University of Queensland is switching on learning analytics to help lecturers and tutors identify students at risk of failing subjects, and to measure class attendance.
UQ is also providing researchers and students access to data about campus usage, which is being analysed to improve space utilisation and - in the future - energy management.
These are all use cases for data being drawn together by a new data lake at the university.
The lake is powered primarily by an AWS stack, and contains both raw data streamed in from a variety of source systems and a growing number of “curated” datasets for specific purposes.
Deputy director of IT services David Stockdale told the recent AWS Public Sector Summit in Canberra that eight use cases had so far been identified for the lake.
These were clustered around two broader visions for the lake.
“The important bit is to support university decision-making and business performance analysis,” Stockdale said.
“In our hearts, what was actually really important was to make UQ a data-centric organisation.
“That might seem an obvious thing to do, but I can guarantee that virtually no organisation is truly data-centric. It might have a good business intelligence unit - we do at the university - but we're not fully data-centric and really data is the core to what we make our decisions on.
“The other vision that we had was actually to make the University of Queensland into one big living laboratory.
“We've got three main campuses and 53,500 students. We're a small town. We can actually do a lot of research on what we're actually doing ourselves of how our campus operates.”
A pilot use case for data-centric decision-making at the university is what Stockdale calls “learning analytics” - or identifying “students at risk” of not passing subjects.
Stockdale said that universities generally were “very aware” that most of their income was derived from course fees, for which students expected a return.
“The most important thing is making sure the outcome is good for the student,” he said.
“They've actually invested in your institution, and they want an outcome.
“Now, I know there's all the affiliated things - there's all the fun part of going to university - but at the end of the day, they've probably come to you to get a degree to advance their careers, and so the more that we can do in terms of looking after them, ensuring that if they are starting to fail that we can get in there quickly and turn that around, is very important.”
In the pilot use case, which is now being put into production, lecturers and tutors can access a “student dashboard” that provides an “overview of all of the students and how they're performing.”
“This is just a quick snapshot, [but it’s] one of the applications we've built on top of our data lake,” he said.
“This is live now, but not on all courses yet.”
Other pilot use cases for data in the lake involve opening access to that data to researchers and students.
“One of the examples where we're actually doing that is with fourth-year undergraduates,” Stockdale said.
“We're giving them anonymised data. They're actually looking at how people interact with the campus itself and then that's feeding back into professional services to inform some of the decisions about it.
“But it's enabling those students to do something that's real - using real data in this real environment.”
Stockdale said space utilisation was a problem for universities generally - and UQ was no exception.
“Universities are particularly bad at utilisation of space,” he said.
“The University of Queensland works on a two-semester scheme; about half the year we don't do anything in those spaces. Let's be honest, most of the space is actually not used.
“Even in a research-intensive university, how can we actually get the maximum efficiency out of [our spaces]?”
Stockdale said a tool had been built atop the data lake to better understand “floor utilisation” - the split of users in a space, whether staff or students, and trends around how utilisation is changing.
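The staff/student split described above can be illustrated with a minimal Python sketch. It is not UQ's tool: the function name `utilisation_split` and the idea that each person observed in a space arrives as a simple role label are assumptions made for illustration only.

```python
from collections import Counter


def utilisation_split(roles):
    """Return each role's share of the people observed in a space.

    `roles` is a hypothetical list of labels such as "staff" or "student",
    one per person observed in the space.
    """
    counts = Counter(roles)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {role: n / total for role, n in counts.items()}


# Hypothetical observations for one floor.
floor_roles = ["staff", "student", "student", "student"]
split = utilisation_split(floor_roles)
print(split)
```

Running the same calculation over successive weeks would give the kind of trend the article mentions, under the same assumptions.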
Another tool has been built to specifically analyse teaching spaces, which draws on enrolment, timetable, room and wi-fi data.
This is being used to draw inferences about attendance in specific classes.
Stockdale noted this was not 100 percent accurate - because “not everybody's connected to your wi-fi” - but he said accuracy was as high as 80 percent.
“We were working from a baseline of zero percent accuracy, so if I can get us to 70 or 80 percent that's a massive improvement,” he said.
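As a rough illustration of how enrolment, timetable, room and wi-fi data might be combined, the sketch below counts enrolled students whose devices were seen on wi-fi in the timetabled room during the class. It is a sketch under stated assumptions, not the university's implementation: the `ClassSession` and `WifiSighting` records, the mapping of wi-fi sightings to rooms, and the anonymised student identifiers are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ClassSession:
    # Hypothetical timetable record: one class in one room.
    course: str
    room: str
    start: datetime
    end: datetime


@dataclass(frozen=True)
class WifiSighting:
    # Hypothetical wi-fi record, assumed already mapped to a room
    # and keyed by an anonymised student identifier.
    student_id: str
    room: str
    seen_at: datetime


def infer_present(session, enrolled, sightings):
    """Return enrolled students seen on wi-fi in the room during the session."""
    return {
        s.student_id
        for s in sightings
        if s.room == session.room
        and session.start <= s.seen_at <= session.end
        and s.student_id in enrolled
    }


def attendance_rate(session, enrolled, sightings):
    """Share of enrolled students inferred present (0.0 if nobody is enrolled)."""
    if not enrolled:
        return 0.0
    return len(infer_present(session, enrolled, sightings)) / len(enrolled)


# Hypothetical example data.
session = ClassSession(
    course="COURSE101",
    room="R1",
    start=datetime(2019, 9, 2, 10, 0),
    end=datetime(2019, 9, 2, 11, 0),
)
enrolled = {"s1", "s2", "s3", "s4"}
sightings = [
    WifiSighting("s1", "R1", datetime(2019, 9, 2, 10, 5)),
    WifiSighting("s1", "R1", datetime(2019, 9, 2, 10, 40)),  # repeat sighting
    WifiSighting("s2", "R1", datetime(2019, 9, 2, 10, 30)),
    WifiSighting("s3", "R2", datetime(2019, 9, 2, 10, 15)),  # wrong room
    WifiSighting("s4", "R1", datetime(2019, 9, 2, 12, 0)),   # after class
    WifiSighting("s5", "R1", datetime(2019, 9, 2, 10, 20)),  # not enrolled
]

present = infer_present(session, enrolled, sightings)
rate = attendance_rate(session, enrolled, sightings)
print(sorted(present), rate)  # ['s1', 's2'] 0.5
```

A student who attends without connecting to wi-fi is simply missed by this approach, which is the source of the inaccuracy Stockdale describes.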
In the immediate future, the university plans to build a tool to provide “energy intelligence” about its campuses.
“The University of Queensland is quite unique because it's going to be the first university in the world to generate more energy than it uses [because] we're building such a big solar farm,” Stockdale said.
“That's a massive accomplishment, but that does not mean we should be throwing away the energy or not practising good energy management, so we're building some good tooling around how we use our energy.”
UQ piloted its data lake - and analytics - using a full AWS stack.
Stockdale indicated there was scope to substitute individual cloud components where necessary - and the university has already done so - but said 90 percent of the data infrastructure remained AWS services.
These services include Kinesis to stream data, Glue to catalogue it, and S3 to store it.
The university is using AWS services not just to ingest and store data, but to run analytics and visualisations on top.
He said this related to a recent principle to “take the algorithm to the data, and not the data to the algorithm.”
“We're building a data lake, we're going to have lots of data in there, we're going to have raw data, and we're going to have curated datasets - why then extract all the data to do the analysis somewhere else?” he said.
Stockdale said UQ had used AWS professional services as it stood up the AWS infrastructure, and consultancy Servian “to deliver sprints to build some of that tooling on top of” the lake.
He said the project so far had come in “well under budget”.
“We delivered the first part of that data lake for 25 percent of what we'd budgeted for, and that meant that we had 75 percent [of the budget] left that we could start to build business capability on top of the university's foundational data lake,” he said.