Warmest 100 is back with new bag of tricks

 

The tech used to predict the Triple J Hottest 100 ahead of the countdown.

The team from the Warmest 100 are back with what they’re calling “the web’s most accurate prediction of Triple J’s Hottest 100”, with a swag of new tricks up their sleeve.

Last year, a group of data analysts correctly predicted 92 of the 100 songs in the world's largest music vote (though mostly not in order). They did so by mining social network posts that the broadcaster auto-generated to extend the reach of the massive voting exercise.

So accurate were their predictions that this year, Triple J disabled the social sharing function that allowed for the data to be scraped.

Yesterday, The Vine reported that the team, who initially weren’t planning a repeat attempt, had a change of heart on Sunday after encouragement from Chicago economist David Quach, and have compiled a new list for this year.

Today iTnews went behind the scenes to find out how they did it.

On Sunday morning, Australia time, Quach contacted Nick Drewe from last year’s Warmest 100 team to say that he’d collected around 400 votes from a search of Twitter, and asked Drewe if he was sure he didn’t want to run the Warmest 100 again.

People had been posting images showing their votes, he noted, and Quach had manually read them and tallied them up. Drewe had a change of heart after repeating Quach’s method.

Instagram “turned out to be a goldmine”, and Drewe re-used code from an Instagram search tool he’d written to search for images tagged with “hottest100”. His code used the Instagram API to find the images, and simply downloaded them.

The team then used a free trial of Optical Character Recognition (OCR) software called Maestro to process the images and extract the votes. The votes were tallied in a simple spreadsheet.
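The tallying step is simple enough to sketch in a few lines of Python. This is an illustration only, not the team's spreadsheet: the `tally_votes` name and the sample ballots are made up for the example, and it assumes each image has already been reduced to a list of song titles.

```python
from collections import Counter


def tally_votes(ballots):
    """Count how many ballots mention each song.

    `ballots` is a list of lists of song titles, one inner list per
    voter. A song is counted at most once per ballot.
    """
    counts = Counter()
    for ballot in ballots:
        counts.update(set(ballot))
    return counts


# Hypothetical ballots, as they might look after OCR and clean-up.
ballots = [
    ["Song A", "Song B", "Song C"],
    ["Song B", "Song C"],
    ["Song C", "Song C"],  # duplicate within one ballot counts once
]

# Sort from most-voted to least-voted.
ranking = tally_votes(ballots).most_common()
print(ranking)
```

Sorting by count gives the predicted order; as last year's result showed, a sample like this is much better at picking which songs make the list than at placing them.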

Independently, Mark Pazolli, an engineer and mathematician from Western Australia, developed his own method, similar to that of the Warmest 100 group. He decided to try after hearing that the Warmest 100 wouldn’t run again.

“When I heard the guys weren’t doing the Warmest 100 again, I thought, ‘why not?’” he said.

Pazolli’s more sophisticated approach allowed him to complete his own list ahead of the Warmest 100 team, as they publicly acknowledged via Twitter.

Pazolli also used the Instagram API to find the source images containing people’s votes, using a program he wrote in Python. He then used wget to download the images, and the open-source OCR program Tesseract to process the images.
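For readers curious what the OCR step might look like, here is a minimal sketch of driving the Tesseract command-line tool from Python. It is not Pazolli's code: the function names and directory layout are assumptions, and it presumes the images have already been downloaded and that the `tesseract` binary is installed and on the PATH.

```python
import subprocess
from pathlib import Path


def tesseract_command(image_path, output_base):
    """Build the command line `tesseract <image> <output_base>`.

    Tesseract appends `.txt` to the output base itself.
    """
    return ["tesseract", str(image_path), str(output_base)]


def ocr_image(image_path, work_dir):
    """Run Tesseract on one image and return the recognised text.

    Requires the `tesseract` binary to be available on PATH.
    """
    image_path = Path(image_path)
    output_base = Path(work_dir) / image_path.stem
    subprocess.run(tesseract_command(image_path, output_base), check=True)
    # Append ".txt" by hand so file stems that contain dots survive.
    return Path(str(output_base) + ".txt").read_text(encoding="utf-8")


def ocr_directory(image_dir, work_dir):
    """OCR every .jpg in `image_dir`, returning {filename: text}."""
    Path(work_dir).mkdir(parents=True, exist_ok=True)
    return {
        image.name: ocr_image(image, work_dir)
        for image in sorted(Path(image_dir).glob("*.jpg"))
    }
```

Calling `ocr_directory("images", "ocr_out")` (both directory names are hypothetical) would leave one text file per image in `ocr_out`, ready for clean-up and matching.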

More Python code cleaned up the resulting text file and cross-matched it against a list of artists and song titles that Triple J provided to all Hottest 100 voters as a PDF.
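The clean-up and cross-matching can be illustrated with Python's standard library. The sketch below uses `difflib` for the fuzzy comparison, which is a stand-in chosen for brevity and not the matching method Pazolli settled on. The entries, the sample OCR text, and the 0.8 cutoff are all invented for the example.

```python
import difflib
import re


def clean_line(line):
    """Lower-case a line and replace anything that is not a letter,
    digit, or space with a space, then collapse repeated spaces."""
    line = re.sub(r"[^a-z0-9 ]+", " ", line.lower())
    return " ".join(line.split())


def match_votes(ocr_text, official_entries, cutoff=0.8):
    """Map each OCR line to the closest official entry.

    Lines with no sufficiently close match (OCR noise, captions,
    interface text) are skipped.
    """
    cleaned = {clean_line(entry): entry for entry in official_entries}
    matches = []
    for raw in ocr_text.splitlines():
        line = clean_line(raw)
        if not line:
            continue
        hit = difflib.get_close_matches(line, list(cleaned), n=1, cutoff=cutoff)
        if hit:
            matches.append(cleaned[hit[0]])
    return matches


# Hypothetical official list and OCR output, for illustration only.
official = ["Artist One - First Song", "Artist Two - Second Song"]
ocr_text = "Artlst One - F1rst Song\n@@ ~~ share\nArtist Two - Second 5ong\n"

print(match_votes(ocr_text, official))
```

Here the two misread lines still resolve to the right entries, while the stray "share" line is dropped because nothing in the official list is close to it. Comparing every line against every entry this way is slow on large inputs, which is the usual reason for reaching for a hashing scheme instead.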

Pazolli tried various matching methods, eventually using a locality-sensitive hash called Nilsimsa, augmented with some hinting to offset the method’s relative slowness. His approach netted him a total of 14,000 votes, short of the 17,800 votes collected by the Warmest 100 team.
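The idea behind a locality-sensitive hash is that similar inputs produce similar digests, so a misread line can still be matched to the right song by comparing digests bit by bit. The sketch below demonstrates that idea with a simplified SimHash-style scheme over character trigrams. It is not Nilsimsa and not Pazolli's code; the function names and sample strings are assumptions made for the example.

```python
import hashlib

BITS = 64


def trigrams(text):
    """Return the overlapping three-character windows of the
    lower-cased, whitespace-normalised text."""
    text = " ".join(text.lower().split())
    return [text[i:i + 3] for i in range(len(text) - 2)]


def simhash(text):
    """Compute a 64-bit SimHash-style digest from character trigrams.

    Each trigram votes +1 or -1 on every bit position, and the sign of
    the total decides the bit. Similar strings share most trigrams, so
    most of their bits agree.
    """
    totals = [0] * BITS
    for gram in trigrams(text):
        digest = hashlib.sha256(gram.encode("utf-8")).digest()
        value = int.from_bytes(digest[:8], "big")
        for bit in range(BITS):
            totals[bit] += 1 if (value >> bit) & 1 else -1
    return sum(1 << bit for bit in range(BITS) if totals[bit] > 0)


def hamming(a, b):
    """Number of bit positions in which two digests differ."""
    return bin(a ^ b).count("1")


# Hypothetical strings: an official entry, an OCR misreading of it,
# and an unrelated entry.
target = "artist one first song title"
noisy = "artlst one first song title"
other = "completely different entry here"

print(hamming(simhash(target), simhash(noisy)))
print(hamming(simhash(target), simhash(other)))
```

The first distance should come out well below the second, since the misread string shares all but three of its trigrams with the original. A matcher built this way would hash the official list once, then assign each OCR line to the entry with the smallest distance.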

The Warmest 100 website was spun up quickly as a relatively simple update from last year. The Hydra.js modular architecture library provides the basic structure, and JavaScript drives the parallax scrolling effect.

It makes liberal use of cloud services, embedding players from SoundCloud and YouTube to play songs from the page itself, and traffic is measured using XiTi.com and Google Analytics. This year the team is using the CloudFlare Content Delivery Network (CDN) as a front-end to help manage load to the site.

We now have to wait for the countdown on 26 January to see how accurate this year’s predictions turn out to be.

Copyright © iTnews.com.au . All rights reserved.

