Federated learning in recommendations

Liudmyla Kyrashchuk Data Scientist
1 year, 7 months ago · 3 min read Machine Learning Rust Recommendation system

Google introduced federated learning (FL) to the machine learning community in 2016. Since then, slowly but steadily, other companies have been catching up with the technology and starting to offer their own solutions based on it.

What is FL?

source: Federated Learning: Collaborative Machine Learning without Centralized Training Data

Federated learning is a fairly new approach in machine learning based on the idea of data decentralization. The company keeps the global model on its server and sends the model weights and a training program to users' devices, where the model is trained locally on each user's data. Once training finishes, the weight update is sent back to the server, where it is averaged with the updates from other users. The company then applies that averaged update to the global model, and the cycle repeats. On the client's side, the local copy of the model doesn't make any predictions; it is only trained on the device. The device communicates with the server only while it is idle and connected to power. The raw data stays on the device and the company doesn't need to store it in its databases, so it's a win-win situation.

Federated learning can be:

In their 2016 article, Google proposed Federated Averaging as the optimization algorithm. It has the following steps:

  1. Sample a random set of clients
  2. Compute updates for each client in parallel locally and send them to the server
  3. Average the clients' updates
  4. Update the global model with the averaged update
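The four steps above can be sketched in a few lines of plain Python. This is only a toy illustration under stated assumptions, not Google's implementation: the model is a two-weight linear regression, the clients are simulated lists of examples, step 2 runs sequentially instead of in parallel, and the names `local_update`, `federated_averaging` and `make_client` are hypothetical helpers made up for this example.

```python
import random


def local_update(global_weights, data, lr=0.1, epochs=5):
    """Train a copy of the global weights on one client's data with SGD
    and return only the weight delta (the 'update' sent to the server)."""
    w = list(global_weights)
    for _ in range(epochs):
        for x, y in data:
            pred = sum(wi * xi for wi, xi in zip(w, x))
            err = pred - y
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return [wl - wg for wl, wg in zip(w, global_weights)]


def federated_averaging(global_weights, clients, rounds=20,
                        clients_per_round=3, seed=0):
    """Run several rounds of the four steps listed above."""
    rng = random.Random(seed)
    w = list(global_weights)
    for _ in range(rounds):
        # 1. Sample a random set of clients.
        sampled = rng.sample(clients, clients_per_round)
        # 2. Compute an update per client (sequential here; parallel in practice).
        updates = [local_update(w, data) for data in sampled]
        # 3. Average the clients' updates coordinate by coordinate.
        avg = [sum(col) / len(updates) for col in zip(*updates)]
        # 4. Update the global model with the averaged update.
        w = [wi + ai for wi, ai in zip(w, avg)]
    return w


def make_client(rng, n=20):
    """Simulated client data for y = 2 * x - 1 (hypothetical toy data).
    Each example is ([x, 1.0], y); the constant 1.0 acts as a bias input."""
    data = []
    for _ in range(n):
        x0 = rng.uniform(-1.0, 1.0)
        data.append(([x0, 1.0], 2.0 * x0 - 1.0))
    return data


data_rng = random.Random(42)
clients = [make_client(data_rng) for _ in range(10)]
weights = federated_averaging([0.0, 0.0], clients)
print(weights)  # should end up close to [2.0, -1.0]
```

Note that the server only ever sees weight deltas, never the `(x, y)` pairs held by each simulated client.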

While this approach brings better data protection and puts less client data into company databases, some challenges remain:

Security. Threats include client-side attacks that try to confuse the global model, either by hindering convergence or by steering training towards adversarial targets, as well as GAN-based attacks from the model or client side. Another question is the privacy of the updates, both as data in themselves (differential privacy) and as a computation problem (using secure multi-party computation (MPC), homomorphic encryption (HE), and trusted execution environments (TEEs)).
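To make the differential privacy idea a little more concrete, here is a minimal sketch of one commonly described recipe: clip each client's update to a maximum L2 norm, then add Gaussian noise to the sum before averaging. The function names and the `max_norm` and `noise_std` parameters are assumptions made for this illustration, and the noise level is not calibrated to any formal privacy guarantee.

```python
import math
import random


def clip_update(update, max_norm):
    """Scale the update down so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(u * u for u in update))
    if norm <= max_norm:
        return list(update)
    scale = max_norm / norm
    return [u * scale for u in update]


def private_average(updates, max_norm=1.0, noise_std=0.1, rng=None):
    """Average clipped updates after adding Gaussian noise to their sum.

    noise_std is expressed relative to max_norm. Illustrative values only:
    a real system would derive the noise from a target privacy budget.
    """
    if rng is None:
        rng = random.Random()
    clipped = [clip_update(u, max_norm) for u in updates]
    summed = [sum(col) for col in zip(*clipped)]
    noisy = [s + rng.gauss(0.0, noise_std * max_norm) for s in summed]
    return [v / len(clipped) for v in noisy]
```

Clipping bounds how much any single client can move the global model, which also limits the damage one adversarial client can do in a round; the noise hides the exact contribution of each client in the average.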

Machine learning. Because a device must be idle and connected to power to send updates to the server, the data behind the updates is often unbalanced, non-IID (not independent and identically distributed), temporarily unavailable, and geographically biased. One of the key challenges is optimizing on such data while the work is parallelized across clients. Other challenges include hyperparameter tuning, architecture selection, and use in unsupervised settings.
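One common way to account for unbalanced clients is to weight each update by the number of local examples it was trained on, rather than taking a plain mean. A minimal sketch follows; `weighted_average` and its arguments are hypothetical names for this example. It addresses only the imbalance in data volume, not the non-IID problem itself.

```python
def weighted_average(updates, num_examples):
    """Average client updates, weighting each by its local example count."""
    if len(updates) != len(num_examples):
        raise ValueError("need one example count per update")
    total = sum(num_examples)
    if total <= 0:
        raise ValueError("total number of examples must be positive")
    dim = len(updates[0])
    return [
        sum(u[i] * n for u, n in zip(updates, num_examples)) / total
        for i in range(dim)
    ]


# A client with 3 examples counts three times as much as a client with 1.
print(weighted_average([[1.0, 0.0], [0.0, 1.0]], [3, 1]))  # [0.75, 0.25]
```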

Bias and fairness. When the data is decentralized, measuring fairness and bias can be a challenge, because the company doesn't receive the information it needs to check the algorithm.

Deployment. Maintaining the process is another challenge. Upload is slower than download, so new ways of sending model updates are in development. You can read about two of them here.

Federated learning in RecoAI

While Google recently announced that it is rolling out a trial of Federated Learning of Cohorts (FLoC) for its advertising, we firmly believe that this technology can also be used in recommendations.

There are two possible aspects where and how it can be used:

Why does it matter? At RecoAI we believe that we can create a better product by offering privacy to the user. We should be able to provide basic comforts, such as deleting an event from the recommendations, opting out of sending data or of getting recommendations, and making us "forget" about some viewed or bought products. The core of RecoAI is written in Rust. Rust compensates for the slowness of Python and the heavy memory usage of Java, but in this context the key is its ability to compile naturally to WebAssembly. Being able to prepare recommendations in the browser for a specific user can take security and privacy to another level.
