The University of NSW’s two-tiered data lake architecture has helped the university transition to ‘hybrid’ online and in-person teaching while limiting inaccurate or “rogue” data use.
UNSW had completed its first data lake in Microsoft Azure by the middle of last year, and has since gone on to create a second data lake in Azure with Databricks to serve as a curated collection of data that can easily be interpreted for reporting and strategy development.
Separating data into raw and curated collections has helped streamline the way staff use data and reduced ambiguity, UNSW’s chief data and insights officer Kate Carruthers said.
In an interview with Microsoft, Carruthers said that strategy was complemented by embedding the university’s small data team into different parts of the business, such as human resources, as data engineers and domain specialists.
“We've been working with our colleagues in HR - so when we had to suddenly all go off campus our colleagues in HR developed a number of Power BI dashboards that they made available very quickly.”
Data projects must also follow strict governance rules, with departments signing a data sharing agreement, naming a data owner and getting approval from Carruthers’ team.
“We're starting to stop all of that rogue data use,” she said.
“You used to have to logon to a system and download a CSV and nobody knew about it. And then they present their reports, and you'd often get people who'd done different reports in the same meeting and having fights about which data was correct.
“Now when they're talking about X, it's absolutely X and there's no debate about it.”
That level of data insight is proving useful as the university and the broader higher education sector grapple with simultaneous challenges to the way they operate on a day-to-day basis and with the loss of income from international students, with UNSW alone set to lose almost 500 full-time jobs.
Offering courses online to help address the crisis comes with its own challenges too, Carruthers said.
“Now that every course that we offer is available online, the opportunities for contract cheating (where someone hires a third party to complete an assignment for them) have just grown exponentially.
“We are now doing a machine learning proof of concept with support from Insight and Microsoft to start to identify that.”
Student outcomes are also affected by teachers’ ability to ‘lean into’ digital media, with those who can better adapt in-person materials for online learning seeing better results than those who are still doing the same thing they did in the classroom.
“If you stood in front of the PowerPoint and talked about the PowerPoint in the classroom and you're trying to just do that online, it's not engaging.”
To try to fix that, the data team is establishing a machine learning DevOps project to give staff the information they need to do their jobs in the new learning environment without having to go searching for that data.
“We're onboarding the Moodle data for this proof of concept in the data lake. That will then enable us to then use ML techniques to then do some learning analytics,” Carruthers said.
“We're trying to work out how we can engage students and I think it's changing the way that we teach.
“We will be teaching in hybrid mode for the foreseeable future so even when the domestic students come back on campus, the internationals probably won't be here. That's a big shift for us.
“We want to identify what are the real issues that are impacting on student experience and student performance, and not mere correlates.”