Rust for a real-time application – no dependencies allowed

A tale of migrating from Python to Rust to implement a real-time recommendation engine.

It has been almost 3 months since we started rewriting RecoAI (our recommendation engine) from Python to Rust. To call it a rewrite is an understatement, because you cannot simply move a project from one language to another without changing its architecture or structure. The main difference is that in the previous version the events coming into the system were dynamic, without any underlying schema. That was fine for a prototype but not for a stable system. Memory safety and minimal generalization give us a level of comfort we could not achieve with Python.

Unorthodox architecture

This is the third version of the application, and each time we create a new version we try to improve on what we think was missing. In short, we made some interesting choices that might turn a few heads (for good or bad).

No dependencies on external services (yes, not even a database)

This means we don’t rely on a database to keep the application’s state. All of the state is held in in-memory data structures: HashMap, Vec, HashSet, and so on. You might think this is a bad idea, but if the system is real-time and must respond in milliseconds to provide accurate recommendations, it is a price worth paying.
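
To make that concrete, here is a minimal sketch of what such in-memory state can look like; the fields are illustrative, not our actual layout:

use std::collections::{HashMap, HashSet};

// Illustrative only: the real state is richer, but everything lives in
// plain Rust collections instead of a database.
struct AppState {
    // item_id -> number of views
    view_counts: HashMap<String, u64>,
    // user_id -> items the user has interacted with
    user_items: HashMap<String, HashSet<String>>,
    // ordered log of processed event ids
    processed_events: Vec<String>,
}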

The application can provide things like a /metrics endpoint for Prometheus, but that is a one-way dependency and we want to keep it that way.
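
Exposing metrics keeps that one-way direction: Prometheus simply scrapes whatever we render. A minimal sketch, assuming the prometheus crate (the crate choice is an assumption for illustration, not something the original setup specifies):

use prometheus::{Encoder, IntCounter, Opts, Registry, TextEncoder};

fn main() {
    // A registry plus one counter, registered by hand to keep the sketch small.
    let registry = Registry::new();
    let events_total =
        IntCounter::with_opts(Opts::new("events_total", "Number of events processed")).unwrap();
    registry.register(Box::new(events_total.clone())).unwrap();

    events_total.inc();

    // Render the registry in the Prometheus text exposition format; the
    // /metrics handler would simply return this string.
    let mut buffer = Vec::new();
    TextEncoder::new()
        .encode(&registry.gather(), &mut buffer)
        .unwrap();
    println!("{}", String::from_utf8(buffer).unwrap());
}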

No REST API

The only way to communicate with the application is through predefined events. Because of this we can recreate the state of the application at any point in time, which matters a lot for a machine learning application: the ability to go back in time is essential for tasks like ours.

The only downside of this approach is that we need a way to serialize the state; otherwise we have to replay the entire event history. We might bring back a REST API for things like user management, but that would probably be handled by a separate backend.

No cloud (maybe apart from BigQuery)

The application is quite memory heavy, so we keep the deployments on dedicated servers (cost also plays a role…). Since the application is just a single binary copied to the server, this is not a big deal. Above all we aim for simplicity, and looking at cloud offerings I often see them as more of a complication than a necessity. The only cloud services we rely on are BigQuery and Data Studio for reporting.

Why we switched to Rust

Python is an excellent language for prototyping ideas but it has three major flaws:

1. Lack of static types – Python is a dynamic language, which makes it quite hard to manage a complex system like ours. There are more and more ideas for improving this, such as type annotations and checkers, but at the end of the day we prefer to rely on a compiled language.

2. Performance – Recommendations often rely on very tight loops and iteration over large data structures. Implementing some algorithms in Python turned out to be too slow for production use.

3. Lack of a good multithreading model – We wanted to keep our data structures in memory, and safe shared access to such data across threads is effectively out of reach in Python. You can use an in-memory database like Redis, but it brings other problems, such as serialization/deserialization overhead and a lack of flexibility for defining more complex data structures (see the sketch after this list for how this looks in Rust).
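
In Rust, by contrast, sharing in-memory structures across threads is straightforward; a minimal sketch (the map and its contents are illustrative):

use std::collections::HashMap;
use std::sync::{Arc, RwLock};
use std::thread;

fn main() {
    // Shared in-memory state: many readers, occasional writers.
    let counts: Arc<RwLock<HashMap<String, u64>>> = Arc::new(RwLock::new(HashMap::new()));

    // A writer thread bumps a counter while other threads could read.
    let writer = {
        let counts = Arc::clone(&counts);
        thread::spawn(move || {
            let mut guard = counts.write().unwrap();
            *guard.entry("item-42".to_string()).or_insert(0) += 1;
        })
    };
    writer.join().unwrap();

    // Reading after the write has finished.
    println!("{:?}", counts.read().unwrap().get("item-42"));
}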

Type-safe events

Our system is fully based on events, and the events are no longer dynamic as they were in Python. The architecture slightly resembles Event Sourcing (https://microservices.io/patterns/data/event-sourcing.html): you can only change the state of the system using predefined events. This has real benefits for a system like ours – it means we can replay the events and reproduce the state of the system at any moment.

At first we wanted to carry parts of our Python design over to Rust directly. We had a lot of code that was declarative and evaluated lazily. Moving that design to Rust as-is was harder than expected, because the Rust compiler must know the types of variables at all times – there is no place for dynamic events with unknown attributes. The fact that the compiler does not allow unknown types meant that a lot of what we had planned had to be rethought. The Rust compiler resisted overly clever code: we found a few workarounds here and there, but it was evident that Rust does not like very abstract code. Once we started writing in line with how Rust wants things done, the compiler stopped complaining and the design we ended up with was much simpler.
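
In spirit, the design we converged on boils down to something like the following sketch; the event and field names are illustrative, not our real schema:

use std::collections::HashMap;

// Illustrative events; the real system has many more kinds.
enum Event {
    ItemViewed { item_id: String },
    AddToCart { item_id: String },
}

#[derive(Default)]
struct State {
    view_counts: HashMap<String, u64>,
    cart_adds: HashMap<String, u64>,
}

impl State {
    // The only way to change state is to apply a predefined event.
    fn apply(&mut self, event: &Event) {
        match event {
            Event::ItemViewed { item_id } => {
                *self.view_counts.entry(item_id.clone()).or_insert(0) += 1;
            }
            Event::AddToCart { item_id } => {
                *self.cart_adds.entry(item_id.clone()).or_insert(0) += 1;
            }
        }
    }

    // Replaying the event log reproduces the state at any point in time.
    fn replay(events: &[Event]) -> Self {
        let mut state = State::default();
        for event in events {
            state.apply(event);
        }
        state
    }
}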

In our system there are about 30 types of events you can send, and every event type has a predefined schema. Because all events have static schemas, generating JSON Schema documents from the Rust structs is very straightforward, which in turn let us create SDKs for popular languages with almost no additional work.

Everything is a library, RecoAI is a library

RecoAI has no dependencies on external services, so a deployment is just one binary file copied to the server. You don’t even need Docker to deploy it. Every dependency is included:

  • server
  • search engine
  • database (or lack thereof)

We build on Rust libraries, and RecoAI is itself a Rust library. If you want to integrate a recommendation engine into your Rust application, you can do so easily. This is a very Rust-like thing to do: the Rust community loves libraries and zero-dependency deployments.
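
To show what such an integration could look like, here is a purely hypothetical sketch; Engine and Event below are local stand-ins, not RecoAI’s actual public API:

// Hypothetical sketch only: these types illustrate the shape of embedding
// an engine directly in your own binary, nothing more.
struct Engine; // stand-in for an engine type a crate could export

enum Event {
    AddToCart { item_id: String },
}

impl Engine {
    fn new() -> Self {
        Engine
    }

    // Feed an event into the in-memory state.
    fn apply(&mut self, _event: Event) {}

    // Ask for recommendations for a user.
    fn recommend(&self, _user_id: &str, k: usize) -> Vec<String> {
        vec!["item-1".to_string(); k.min(1)]
    }
}

fn main() {
    let mut engine = Engine::new();
    engine.apply(Event::AddToCart { item_id: "item-1".into() });
    println!("{:?}", engine.recommend("user-42", 5));
}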

Performance

RecoAI can process up to 100,000 events per second, which is two orders of magnitude faster than the Python version. There are two reasons behind the performance improvement:

  • Rust is much faster than Python (100-200x in some cases)
  • Every data structure is kept in memory as a pure Rust data structure – we don’t communicate with any database to hold the application’s state. Even a database like Redis, which also stores data in memory, carries a high serialization and deserialization cost.

Libraries

It came as a surprise to us that Rust already offers a choice of very high-quality libraries. Some of them cover quite niche topics, and it is nice that a language that is not yet mainstream has such a selection. Some noteworthy libraries we use in the project (apart from the obvious ones):

  • Tantivy – https://github.com/tantivy-search/tantivy – The building block of our search engine, modeled after Lucene.
  • FST – https://github.com/BurntSushi/fst – A library we use for fuzzy search, with Levenshtein distance as the search parameter.
  • SchemaRS – https://github.com/GREsau/schemars – This library is a godsend for us. It generates a JSON Schema from a Rust struct, which is an important step in generating SDKs for the languages we support.
  • Tracing and its ecosystem – https://github.com/tokio-rs/tracing – This library lets us log all endpoint requests in a nice JSON format, capture their responses, and route different information to different files with ease; a minimal setup sketch follows below.
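
As an example of that last point, a minimal tracing setup could look like this; the exact layers and filters are our choice for the sketch, not necessarily RecoAI’s configuration, and it assumes tracing-subscriber with the json and env-filter features:

use tracing::{info, instrument};

fn init_tracing() {
    // Structured JSON output with a level filter; handy for log aggregation.
    tracing_subscriber::fmt()
        .json()
        .with_env_filter(tracing_subscriber::EnvFilter::new("info"))
        .init();
}

#[instrument]
fn handle_event(event_name: &str) {
    info!(event_name, "processing event");
}

fn main() {
    init_tracing();
    handle_event("AddToCart");
}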

Types and schemas

I mentioned that all events coming into the system have a predefined schema. Let’s say we have an event for someone putting an item into their basket.

struct AddToCart {
    cart_id: Option<String>,
    item: ItemDetails,
}

This is automatically converted into a JSON schema:

{
   "$schema": "http://json-schema.org/draft-07/schema#",
   "title": "AddToCart",
   "type": "object",
   "required": [
       "item",
   ],
   "properties": {
       "cart_id": {
           "type": [
               "string",
               "null"
           ]
       },
       "item": {
           "$ref": "common.json#/definitions/ItemDetails"
       }
   }
}

And this JSON in turn is converted into models in different languages.

@dataclass
class AddToCart:
   item: ItemDetails
   cart_id: Optional[str] = None

   @staticmethod
   def from_dict(obj: Any) -> 'AddToCart':
       ...

   def to_dict(self) -> dict:
       ...

JSON schemas are used not only to create SDKs but also to build forms in the user interface.
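
For completeness, here is a minimal sketch of how a schema like the one above can be produced with SchemaRS; the ItemDetails stub and the derive setup are illustrative, and the exact output (for example how ItemDetails is referenced) depends on configuration:

use schemars::{schema_for, JsonSchema};
use serde::{Deserialize, Serialize};

// Illustrative stub; the real ItemDetails has more fields.
#[derive(Serialize, Deserialize, JsonSchema)]
struct ItemDetails {
    item_id: String,
}

#[derive(Serialize, Deserialize, JsonSchema)]
struct AddToCart {
    cart_id: Option<String>,
    item: ItemDetails,
}

fn main() {
    // schema_for! builds the JSON Schema from the derive; we just print it.
    let schema = schema_for!(AddToCart);
    println!("{}", serde_json::to_string_pretty(&schema).unwrap());
}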

Productivity

Overall we have four people working on the Rust version right now, most of whom had no prior Rust experience. While I think we are still Rust noobs and many places in the code need fixing, I believe the learning curve in Rust is not that bad.

Disadvantages

Rust is not perfect. Some places in our code are very verbose, especially pattern-matching dispatchers. Rust offers ways to reduce the redundancy, but they often rely on macros, which are not easy to learn. Macros in Rust are different from macros in other languages, mostly because of how powerful they are. Also, coming from Python, where macros do not exist, they are something you have to wrap your head around.
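
To give a flavour of the verbosity, here is a simplified dispatcher of the kind we mean; the event and handler names are illustrative:

// Every new event type needs another arm in every dispatcher like this one.
enum Event {
    AddToCart,
    RemoveFromCart,
    ItemViewed,
    OrderPlaced,
}

fn dispatch(event: &Event) {
    match event {
        Event::AddToCart => handle_add_to_cart(),
        Event::RemoveFromCart => handle_remove_from_cart(),
        Event::ItemViewed => handle_item_viewed(),
        Event::OrderPlaced => handle_order_placed(),
    }
}

fn handle_add_to_cart() {}
fn handle_remove_from_cart() {}
fn handle_item_viewed() {}
fn handle_order_placed() {}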

Compilation times are also pretty bad. It takes around 1 minute to compile our project for testing – not including downloading dependencies.

Summary

We are very happy with Rust. The language has some disadvantages, but overall it is a pleasure to work with. Looking at what is happening in the Rust community and the investments made by the biggest companies, I’m sure we made the right choice switching to Rust.

