One of the first questions we get from companies who want to sponsor machine learning challenges is about the expected quality of the end result. In our experience, there are two essential drivers of innovation: competition and collaboration. Hackathons and competitions encourage both.
Not surprisingly, several groundbreaking results in machine learning and artificial intelligence can be traced back to competitions. In this article, we highlight the most notable cases where open challenges inspired significant advances.
The ImageNet competition
For many years, the ImageNet competition was one of the main driving forces behind the innovation in computer vision. Remember when convolutional networks first exploded in popularity? It was because of the annual ImageNet competition.
In short, the goal of the annual ImageNet Large Scale Visual Recognition Challenge, held between 2010 and 2017, was to build an accurate classifier for a massive dataset of more than 1.4 million images, categorized into 1000 classes. Before the challenge series’ launch, achieving even acceptable performance on such a complex task was a formidable feat.
However, 2012 was a turning point. In their landmark paper titled ImageNet Classification with Deep Convolutional Neural Networks, authors Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton introduced a novel architecture (later named AlexNet) that represented a hyperspace jump in terms of performance. Its top-5 error was as low as 15.3%, a whopping 10.8 percentage point improvement over the runner-up. Below, you can see the results visualized between 2011 and 2016.
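To make the metric concrete: a prediction counts as correct under top-5 error only if the true class appears among the model’s five highest-scoring classes. Below is a minimal plain-Python sketch of this calculation; the function name and the toy score lists are our own illustration, not taken from the paper.

```python
def top5_error(predictions, labels):
    """Fraction of samples whose true label is NOT among the five
    highest-scoring predicted classes.

    predictions: list of per-class score lists, one list per sample
    labels:      list of true class indices, one per sample
    """
    misses = 0
    for scores, label in zip(predictions, labels):
        # Class indices ranked by score, highest first
        ranked = sorted(range(len(scores)), key=lambda c: scores[c], reverse=True)
        if label not in ranked[:5]:
            misses += 1
    return misses / len(labels)


# Toy example with 6 classes and 2 samples (illustrative values only):
preds = [
    [0.90, 0.05, 0.01, 0.01, 0.01, 0.02],  # true class 0 is ranked first -> hit
    [0.01, 0.02, 0.03, 0.04, 0.85, 0.05],  # true class 0 is ranked last -> miss
]
print(top5_error(preds, [0, 0]))  # one miss out of two samples -> 0.5
```

On ImageNet’s 1000 classes, a random guesser would have a top-5 error of 99.5%, which puts AlexNet’s 15.3% into perspective.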
AlexNet was the first convolutional network to win the competition, and the first to demonstrate the true capabilities of the architecture. After this landmark submission, every subsequent winner utilized the power of convolutional networks.
The competition served as a benchmark during its years, with many famous entries such as VGG, GoogLeNet, and ResNet.
By 2017, the ImageNet challenge was considered solved. Most entries in the final competition reached 95% accuracy, a threshold previously considered an extremely difficult feat. After its great successes, the challenge was discontinued in this form. However, the organizers announced that the challenge would return in a renewed form, focusing on 3D vision.
The CASP challenge
For a long time, the protein folding problem was the Holy Grail of bioinformatics. Predicting a protein’s three-dimensional structure from its amino acid sequence is an extremely complex task, requiring a deep understanding of thermodynamics and the interactions between molecules.
AlphaFold by Google DeepMind solved this 50-year-old problem. After decades of painfully slow progress, its performance improvement over the previous state of the art was unprecedented. As John Moult put it in DeepMind’s announcement post,
“We have been stuck on this one problem – how do proteins fold up – for nearly 50 years. To see DeepMind produce a solution for this, having worked personally on this problem for so long and after so many stops and starts, wondering if we’d ever get there, is a very special moment.”
– John Moult, Co-founder and Chair of CASP, University of Maryland
The method debuted at the 13th Critical Assessment of Techniques for Protein Structure Prediction (CASP) competition, a series that has long been a driving force of development in the field.
Like AlexNet, AlphaFold ended up revolutionizing a field. Without CASP, this would probably have happened much later.
The COCO challenge
The four most common tasks in computer vision are, in increasing order of difficulty,
- image classification,
- object detection,
- semantic segmentation,
- instance segmentation.
Instance segmentation, which requires identifying precisely which pixels belong to an object and which category the object belongs to, was an insurmountable challenge for a long time. COCO, short for Common Objects in Context, aimed to tackle it. Since the dataset’s publication and the competition’s launch in 2015, average precision has nearly doubled. Year after year, top-tier groups from places like Facebook, Alibaba, and Microsoft have competed to push the state of the art even further.
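The average precision metric behind the COCO leaderboard rests on intersection-over-union (IoU): a predicted mask only counts as a match if it overlaps the ground-truth mask sufficiently. Below is a minimal sketch of mask IoU, representing each mask as a set of pixel coordinates; this is a simplification of our own (COCO itself stores masks in run-length-encoded form).

```python
def mask_iou(mask_a, mask_b):
    """Intersection-over-union of two binary masks, each given as a set
    of (row, col) pixel coordinates. Returns a value in [0, 1]."""
    intersection = len(mask_a & mask_b)
    union = len(mask_a | mask_b)
    return intersection / union if union else 0.0


# Toy example: two 2-pixel masks sharing exactly one pixel.
a = {(0, 0), (0, 1)}
b = {(0, 1), (0, 2)}
print(mask_iou(a, b))  # 1 shared pixel / 3 total pixels ≈ 0.333
```

COCO's average precision then averages the precision of matched predictions over a range of IoU thresholds (0.50 to 0.95), which is what makes the benchmark so demanding: masks must be not just roughly right, but pixel-accurate.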
One notable submission to this challenge was Mask R-CNN, published by Kaiming He, Georgia Gkioxari, Piotr Dollár, and Ross Girshick from Facebook AI Research (FAIR). For a while, it represented the pinnacle of region-based convolutional networks, with hundreds of applications across many domains. For example, part of our team used Mask R-CNN to develop a powerful method for cell nuclei segmentation in microscopy images.
Competitive programming has been a significant part of computer science for a long time. However, with the rise of machine learning, such competitions have surged in popularity. In essence, teamwork and competition can drive brilliant minds to develop ingenious solutions to all kinds of challenges. A fixed-term competition is a distillation of how research and development work on a larger timescale.
During the past decade, open machine learning competitions have been a significant driving force of development. To see this, it is enough to look at the ImageNet challenge, where AlexNet premiered in 2012, single-handedly popularizing convolutional neural networks in computer vision.
Besides pushing the state of the art, competitions can serve other purposes with great success. Companies often organize them to supercharge development, but they can lead to great results in classroom settings as well.
For us, the lesson is clear: if you want to solve interesting and hard problems, go participate in a competition. On the other hand, if you want to accelerate progress in a field, organize an open challenge.