Post by Greg Szopiński, Liudmyla Kyrashchuk, and Rachel Melby.
Kaggle Days began in May of 2018 in Warsaw. The first edition had nearly 100 Kagglers from all over the world who came to meet and learn from Kaggle Masters and Grandmasters. There was also a first-of-its-kind Kaggle offline competition.
For our second edition of Kaggle Days, held in January 2019, we doubled down, gathering nearly 200 Kagglers in Paris to meet, learn and code with Kaggle Grandmasters, and of course to compete in our traditional offline competition. We are so pleased with how the event went, and now that we’ve had a chance to catch our breath, we want to offer you a recap of the experience!
The Night Before
Before the intensive two days of learning and coding, we gathered in the center of Paris to admire Notre Dame, the Hôtel de Ville, the Louvre and many other beautiful places in the French capital. As is common at this time of year, the air was brisk, and after the sightseeing everyone wanted to head indoors to a local French bar to relax and continue the merriment. There, Kagglers met with Anthony Goldbloom (CEO of Kaggle), Mikel Bober-Irizar and Darius Barusauskas, and spent a nice evening networking.
“Presentations and workshops were clear and pragmatic. LogicAI did a nice job setting expectations with speakers. Speakers genuinely had deep knowledge on the topics they spoke about,” said Anthony Goldbloom.
Before the start of presentations on the first day, Anthony Goldbloom and Anna Montoya gave a welcoming keynote speech and answered audience questions about Kaggle. This was followed by short speeches from LVMH and Dior.
After that, the day kicked off with two presentations, both of which focused on machine learning interpretability. Alberto Danese, a senior data scientist at Cerved Group and Kaggle Grandmaster, talked about how interpretability can enable widespread adoption of ML technologies in the enterprise. Konstantin Lopukhin spoke on the specific case of neural networks, which he noted are often considered black-box models.
Attendees were given the option to attend a workshop instead of the sponsor presentations. Darius Barusauskas (currently number 9 on the Kaggle leaderboard) led a workshop focused on the machine learning techniques that allowed him to win first prize in the NCAA competition.
Overall, the day was organized in this double-track way; if you were not interested in talks, you could instead attend a workshop or brainstorming session and vice versa.
After a social lunch break, Jean-Francois Puget (IBM, 33 on Kaggle) shared his tricks for feature engineering and hyperparameter optimization. This included general tips and an overview of more competition-specific ideas from Kaggle. The main takeaway was to do as much feature engineering as possible (even if you fail at times), including using deep learning to generate new features, and to avoid ensembling too early in the competition.
Paul Deveau gave one of the few non-competition-oriented presentations that day. He explained how machine learning can help in solving problems in cellular biology and in understanding human behavior. Stanislav Semenov, who formerly held Kaggle’s number one ranking, shared some of his tricks for competitions, ranging from a simple LASSO model for feature selection to more *arcane* techniques, such as isotonic regression.
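For readers who have not met isotonic regression before: it fits a non-decreasing sequence to a set of values while minimizing squared error, which makes it handy for things like calibrating model scores. The snippet below is not Stanislav's code. It is a minimal sketch of the classic pool-adjacent-violators approach in plain Python, and the function name `isotonic_regression` and the sample numbers are our own illustrative assumptions.

```python
def isotonic_regression(y, weights=None):
    """Fit a non-decreasing sequence to y using pool-adjacent-violators.

    Hypothetical helper for illustration only; not taken from the talk.
    """
    if weights is None:
        weights = [1.0] * len(y)

    # Each block holds [weighted mean, total weight, number of points].
    blocks = []
    for value, weight in zip(y, weights):
        blocks.append([float(value), float(weight), 1])
        # Merge backwards while the last two blocks are out of order.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            mean_b, w_b, n_b = blocks.pop()
            mean_a, w_a, n_a = blocks.pop()
            total = w_a + w_b
            pooled = (mean_a * w_a + mean_b * w_b) / total
            blocks.append([pooled, total, n_a + n_b])

    # Expand each block back to one fitted value per input point.
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted


# Made-up scores: the dip from 3 to 2 is pooled into their average.
print(isotonic_regression([1, 3, 2, 4]))  # [1.0, 2.5, 2.5, 4.0]
```

In practice most Kagglers would likely reach for a library implementation rather than hand-rolling this, but the pooling step above is the whole idea.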
Because success stories are so inspiring, we were delighted that Darius Barusauskas shared his with us: how he became a Grandmaster and turned his success in Kaggle competitions into a startup company. During his talk, Darius shared his personal inspirations and some lessons learned from competing on Kaggle (e.g., to be careful with tree-based models and noisy features).
During this session, there were also two workshops to choose from. Luca Massaron and Pietro Marinelli gave insight into using gradient-boosted trees and into hyperparameter optimization with the Skopt library. Later, Kaggle’s current number five, Pavel Pleskov, gave some insights into image classification.
Participants interested in neither workshops nor presentations were able to take part in one of the three brainstorming sessions, held on behalf of Dior Couture, Louis Vuitton, and Sephora. “I think people were intrigued by LVMH’s interest in data science,” said Anthony Goldbloom. Teams with the best ideas were awarded some fancy gifts from the respective company.
“I think the networking went better than any event I’ve ever attended. Part of the reason is attendees have something they’re passionate about in common. Where else would you see a deep learning professor and a high school student sharing approaches to optimizing the learning rate?” said Anthony Goldbloom.
After a brief afternoon break, Paweł and Wojtek from LogicAI, together with Carey Chou from LVMH, shared some details about the cooperation between the two companies on the style-transfer project. The main idea was to translate a person’s taste or mood into the style of a fashion product (e.g. a bag or a dress) to make it more personal and unique. The final presentation of day one was Gabor Fodor’s talk about the different qualities of Kaggle and how it is more than a competition platform. In short, Kaggle kernels allow you to achieve quite impressive results even if you don’t have much computing power. Meanwhile, in the workshop area, Pavel Ostyakov presented how to win a Kaggle competition while spending a minimum amount of time (main tip: stay away from Jupyter notebooks).
The first day ended with a huge announcement from the organizers: the main topic of the live competition to take place the following day. It was quite the teaser and attendees couldn’t wait to receive further details, but they had to wait until the morning!
The second day was vastly different from the first one. This time there were no presentations or workshops. The competition was the featured event.
The mission of the competition was to predict sales of given LVMH products after three consecutive time periods (one, two and three months). The prizes made the game worth playing. First prize was Louis Vuitton smartwatches for each member of the winning team — and they would be hard-won as the data set was quite complex! It contained features ranging from typical sales data to vectorized product images.
Over the next 11 hours, 76 teams tried to find the winning solution, while the speakers from the previous day took the role of mentors and provided the participants with some clues.
“It was a great model: everyone enjoyed being mentors and those of us who competed benefited from the mentoring,” said Anthony Goldbloom.
Apart from sharing their wisdom, speakers also tried to crack the competition themselves and set a very high level of rivalry. The final shake-up swapped the first and second teams (on the private leaderboard), although the final scores were pretty close to each other. Initially, the sponsors planned to award the top three teams, but in the end they gave an award to the fourth-place team as well. Kudos to them!
After this really intense 11-hour machine learning marathon, there was some much-needed downtime. Mentors, participants, sponsors, and staff partied in one of Paris’s clubs, located on the rooftop of the fashion museum near the bank of the Seine.
Overall, our second installment of Kaggle Days was a great success. Attendees, sponsors, and organizers all agreed to do it again! We hope you’ll join us at the next Kaggle Days event!
Stay tuned for future events at https://kaggledays.com/.