When we won the 2019 RecSys competition, we built a huge system for extracting features. As is often the case with competitions, the result was not immediately practical.
By "not practical" I mean that the system we constructed was quite complicated: we had hundreds of features and the models were slow. This is not the first time such a thing has happened. The Netflix Prize awarded 1 million dollars to the winning solution, yet that solution was not used because it was too complex (https://www.wired.com/2012/04/netflix-prize-costs/). But the ideas survived, and the competition itself sparked research interest in this domain.
Going back to RecSys: after the competition finished, we wanted to implement our ideas in a live system. When we talked with potential clients, we lacked evidence that we could design and implement a working real-time solution.
Competitions vs real world scenarios
People often say that competitions don't give you the experience needed to put something into a production system. I tend to disagree. Sure, the solutions that win are often too complex to be considered in a production setting. However, if a person or a team is capable of creating a complex and good solution to a problem, then they also know how to compromise and achieve slightly worse results with simpler methods. What I mean is that competitions show you the whole spectrum of possibilities. Once you have squeezed the last drop of water from a rock, it will be much easier to squeeze it from a lemon.
Why are real-time recommendations important?
As it turns out, real-time recommendations matter. Compared to nightly recommendations, Pixie (a system designed at Pinterest) improved user engagement by up to 50%.
Back to the drawing board - Architecture
We wanted the system to be fast and responsive. The technology stack we went for:
- Python
- Sanic
- Redis
Although Python is rarely the first choice for performance-critical systems, it turned out to be quite good for prototyping. At this point, quickly testing new ideas is more important for us than having 2-3 times more throughput. And since we are quite good at Scala, we can rewrite it at any time.
First, we implemented the web service in Flask but soon realized that it was not that fast. To make it fast, you must rely on gunicorn to spawn many processes. Sanic is asynchronous at its core, and running it with many workers is as easy as adding a single parameter, workers:
app.run(host="0.0.0.0", port=8001, workers=8)
Switching from Flask to Sanic was painless since both frameworks have almost the same API.
As it stands, Redis is the most popular key-value store in the DB-Engines ranking (https://db-engines.com/en/ranking).
In our solution, we rely on "accumulators", which track statistics about users, items, and the interactions between them. Each accumulator has a state that must be updated many times per second. Redis seemed like an ideal choice because each accumulator has its own distinct information schema, which a key-value store accommodates easily.
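To make the accumulator idea more concrete, here is a minimal sketch. It is not our production code: the class names, key layout, and event fields are all hypothetical, and an in-memory stand-in replaces the database so the example runs without a server. The stand-in only mimics the two hash commands the sketch needs (HINCRBY and HGETALL); redis-py's client exposes hincrby and hgetall with the same argument order, so a real client could be passed in instead.

```python
from collections import defaultdict


class InMemoryStore:
    """Stand-in for a Redis client (assumption: only HINCRBY/HGETALL are needed).

    Note: a real redis-py client returns bytes from hgetall unless it is
    created with decode_responses=True.
    """

    def __init__(self):
        self._hashes = defaultdict(lambda: defaultdict(int))

    def hincrby(self, key, field, amount=1):
        self._hashes[key][field] += amount
        return self._hashes[key][field]

    def hgetall(self, key):
        return dict(self._hashes[key])


class ItemActionAccumulator:
    """Hypothetical accumulator: counts actions per item."""

    def __init__(self, store):
        self.store = store

    def update(self, event):
        self.store.hincrby(f"acc:item:{event['item_id']}", event["action"], 1)

    def state(self, item_id):
        return self.store.hgetall(f"acc:item:{item_id}")


class UserActivityAccumulator:
    """Hypothetical accumulator with a different schema: per-user totals."""

    def __init__(self, store):
        self.store = store

    def update(self, event):
        key = f"acc:user:{event['user_id']}"
        self.store.hincrby(key, "events", 1)
        self.store.hincrby(key, f"action:{event['action']}", 1)

    def state(self, user_id):
        return self.store.hgetall(f"acc:user:{user_id}")


def process(event, accumulators):
    """Fan one incoming event out to every accumulator."""
    for accumulator in accumulators:
        accumulator.update(event)


store = InMemoryStore()
items = ItemActionAccumulator(store)
users = UserActivityAccumulator(store)
for event in [
    {"user_id": "u1", "item_id": "i1", "action": "click"},
    {"user_id": "u1", "item_id": "i1", "action": "click"},
    {"user_id": "u1", "item_id": "i2", "action": "view_image"},
]:
    process(event, [items, users])

print(items.state("i1"))
print(users.state("u1"))
```

Each accumulator owns its key prefix and its own set of fields, so adding a new one does not require touching the others.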
There is another thing about Redis that makes it even more interesting: it supports many data types that are very useful for recommendations, such as sets, bloom filters, and maps (hashes). It would be possible to build a working recommendation system using only the commands this database engine provides, which is quite amazing. This blog post goes through the scenarios in which Redis can be used in this context: http://nosqlgeek.blogspot.com/2019/02/building-recommendation-engine-with.html.
Some of the ideas from the blog:
- Content-Based Filtering (using Sets)
- Collaborative Filtering (using Sets)
- Ratings based Collaborative Filtering (using Sorted Sets)
- Social Collaborative Filtering (using RedisGraph)
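As an illustration of the set-based collaborative filtering idea, here is a small sketch that uses plain Python sets in place of Redis sets. It is my own simplified version, not necessarily the approach from the linked blog: the function name, the data, and the scoring rule (weighting each candidate item by how many liked items the two users share) are assumptions. In Redis, the intersection and difference steps would correspond to the SINTER and SDIFF commands.

```python
def recommend(user, likes, top_n=3):
    """Recommend items liked by users who share likes with `user`.

    likes: dict mapping user id -> set of liked item ids (hypothetical layout,
    one set per user, as one might store with SADD in Redis).
    """
    own = likes.get(user, set())
    scores = {}
    for other, items in likes.items():
        if other == user:
            continue
        overlap = len(own & items)  # analogous to SINTER
        if overlap == 0:
            continue  # no shared taste, ignore this user
        for item in items - own:  # analogous to SDIFF: items the user lacks
            scores[item] = scores.get(item, 0) + overlap
    # Highest score first; ties broken by item id for a stable order.
    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return [item for item, _ in ranked[:top_n]]


likes = {
    "alice": {"a", "b", "c"},
    "bob": {"b", "c", "d"},
    "carol": {"c", "e"},
    "dave": {"x"},
}
print(recommend("alice", likes))  # ['d', 'e']
```

Bob shares two items with Alice and Carol shares one, so Bob's item "d" ranks above Carol's item "e", while Dave's item is ignored because he has nothing in common with Alice.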
When testing the system on a local machine, we achieved a throughput of about 2.5k events per second, which translates to about 30k writes per second to the Redis database (roughly 12 writes per event). Some events, such as transactions, spawn a separate process that saves the state of the user for later use. Based on those saved states, we are able to build a scoring model.
Real-time item rescoring
When designing our system, we wanted it to have the same quality as our solution in the RecSys competition. Collaborative filtering is not enough when it comes to personalization based on activity on the website. Users do many things that traditional recommendation systems do not make use of.
Some examples of activities that can be important:
- Filtering the product results
- Sorting the product results
- Interacting with images, ratings, comments
- Interacting with many products versus a few specific ones
Our system can extract features from all those activities and build a model that scores a set of items.
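To give a feeling for what extracting features from those activities can look like, here is a minimal sketch that turns a list of session events into a flat feature dictionary. The event format, the action names ("filter", "sort", "image"), and the chosen features are assumptions made for illustration; our real system has many more features.

```python
from collections import Counter


def session_features(events):
    """Summarize one session's events as a flat dict of numeric features.

    events: list of dicts with an "action" key and an optional "item_id" key
    (hypothetical schema).
    """
    action_counts = Counter(event["action"] for event in events)
    items = [event["item_id"] for event in events if event.get("item_id") is not None]
    n_unique = len(set(items))
    return {
        "n_events": len(events),
        "n_filter": action_counts["filter"],
        "n_sort": action_counts["sort"],
        "n_image_interactions": action_counts["image"],
        "n_unique_items": n_unique,
        # High values suggest a focus on a few specific products,
        # values near 1 suggest broad browsing.
        "interactions_per_item": len(items) / n_unique if n_unique else 0.0,
        "last_item": items[-1] if items else None,
    }


session = [
    {"action": "filter"},
    {"action": "sort"},
    {"action": "click", "item_id": "h1"},
    {"action": "image", "item_id": "h1"},
    {"action": "image", "item_id": "h1"},
    {"action": "click", "item_id": "h2"},
]
print(session_features(session))
```

A dictionary like this can be recomputed after every event and passed to a scoring model together with item-level features.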
A real-time system is especially useful when a user has never been seen on your website before. That was the case for Trivago: most users visit the website 1-2 times a year, and many of them are not even logged in.
A system like ours can be used to:
- Recommend products based on the last interactions with other products
- Boost search results to present better items to the user
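The boosting use case can be sketched as blending the search engine's own score with a score derived from the current session. This is only an illustration under simple assumptions: the function name, the linear blend, and the weight parameter are hypothetical, and in practice the session score would come from a trained model rather than a hand-written dictionary.

```python
def boost_results(search_results, session_scores, weight=0.5):
    """Re-rank search results using session-based scores.

    search_results: list of (item_id, base_score) in the original order.
    session_scores: dict item_id -> score in [0, 1]; missing items count as 0.
    weight: 0 keeps the original ranking, 1 ranks by session score only.
    """
    rescored = [
        (item, (1 - weight) * base + weight * session_scores.get(item, 0.0))
        for item, base in search_results
    ]
    # sorted() is stable, so items with equal scores keep their search order.
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


results = [("h1", 0.9), ("h2", 0.8), ("h3", 0.7)]
session_scores = {"h3": 1.0, "h2": 0.5}
for item, score in boost_results(results, session_scores):
    print(item, round(score, 2))
```

With these numbers, "h3" moves from the last position to the first because the session suggests the user is interested in it, while "h1" drops to the bottom.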
It turned out that designing a real-time system for handling session context was not that hard for us. Batch predictions are, of course, different from streaming predictions, but the same rules apply.
We spent many months on the RecSys competition, and eventually we used this experience to build a fully functional system. Schedule a meeting with us to talk about how you can benefit from real-time recommendations.