In this tutorial we will learn how to extract image embeddings from the MobileNet V2 deep learning model using its ONNX model definition.

Extracting deep learning image embeddings in Rust

Paweł Jankiewicz CTO / Data Scientist Author LinkedIn Profile
1 year, 6 months ago · 3 min read Machine Learning Rust ONNX tutorial

State of Rust Machine Learning

Rust may not be the first choice for developing a deep learning model. The site that tracks Rust's progress in this domain is about right when it says the language is not yet ready for the task.

While the language is more than capable of handling this task, the ecosystem is not yet ready for general machine learning development. In my opinion, one problem is that there is no obvious way to work with dataframes, to name just the first thing. I'm pretty sure that in 2-3 years Rust will be a language that can serve for machine learning experimentation and model development, but right now its main use is in model deployment.

Vision models deployment using Rust

However, Rust can offer real value in machine learning deployments. While building RecoAI (our fast, accurate and fair recommendation engine), we had the idea to generate image embeddings using deep learning models.

It turns out that probably the best way to do this currently is not to use a generic deep learning framework, but instead to rely on ONNX, an intermediate model format co-developed by Microsoft and Facebook.

Let's create a command line app in Rust that can inspect the ONNX model (print out the layer names) and extract embeddings for a selected image.

use tract_onnx::prelude::*;
use clap::{AppSettings, Clap};
use std::error::Error;

// Options for the `embed` subcommand.
#[derive(Clap, Debug)]
struct Image {
    #[clap(long, default_value = "cat.jpeg")]
    image_path: String,
    #[clap(long, default_value = "Reshape_103")]
    layer_name: String,
    // Boolean flag: passing --normalize enables input normalization.
    #[clap(long)]
    normalize: bool,
    #[clap(long, default_value = "224")]
    image_size: usize,
}

// The two tasks the tool supports.
#[derive(Clap, Debug)]
enum SubCommand {
    Inspect,
    Embed(Image),
}

// Top-level options shared by both subcommands.
#[derive(Clap, Debug)]
#[clap(version = "0.1", author = "Paweł Jankiewicz")]
#[clap(setting = AppSettings::ColoredHelp)]
struct Opts {
    #[clap(long, default_value = "mobilenetv2-7.onnx")]
    model_path: String,
    #[clap(subcommand)]
    subcmd: SubCommand,
}

We are using the excellent clap crate to create the command-line interface. The SubCommand enum defines two tasks: inspect and embed.
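To make the role of the subcommand enum concrete, here is a minimal sketch that uses only the standard library. It assumes the two variants are named Inspect and Embed. The dispatch function and its string return value are hypothetical stand-ins for calling the real inspect and embed functions; in the actual tool, clap fills in these values from the command line.

```rust
// Plain-Rust mirror of the CLI types, without the clap derive attributes.
#[derive(Debug)]
struct Image {
    image_path: String,
    layer_name: String,
    normalize: bool,
    image_size: usize,
}

// Assumed variant names: one per task.
#[derive(Debug)]
enum SubCommand {
    Inspect,
    Embed(Image),
}

// Hypothetical dispatcher: returns a description instead of doing the work.
fn dispatch(model_path: &str, subcmd: &SubCommand) -> String {
    match subcmd {
        SubCommand::Inspect => format!("inspect {}", model_path),
        SubCommand::Embed(image) => format!(
            "embed {} with {} at layer {} (size {}, normalize {})",
            image.image_path, model_path, image.layer_name, image.image_size, image.normalize
        ),
    }
}

fn main() {
    let embed = SubCommand::Embed(Image {
        image_path: "cat.jpeg".to_string(),
        layer_name: "Reshape_103".to_string(),
        normalize: true,
        image_size: 224,
    });
    println!("{}", dispatch("mobilenetv2-7.onnx", &SubCommand::Inspect));
    println!("{}", dispatch("mobilenetv2-7.onnx", &embed));
}
```

Matching on an enum means the compiler will flag any task that is added later but not handled.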

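fn inspect_model(opts: Opts) -> Result<(), Box<dyn Error>> {
    // Load the ONNX model from the path given on the command line.
    let model = tract_onnx::onnx().model_for_path(&opts.model_path)?;

    // Print the name of every node (layer) in the graph.
    for name in model.node_names() {
        println!("{:?}", name);
    }

    Ok(())
}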

The inspect_model function takes the options as a parameter and loads the ONNX model from the provided path. We then enumerate the node names. After running

cargo run -- --model-path mobilenetv2-7.onnx inspect

It should print out:

(...redacted for brevity)

In this case we are interested in the Reshape_103 node, which holds a vector of 1280 floats representing the last layer before classification. The code to extract this embedding is a little more involved, mostly because we need to provide the input image as a tensor of floats and, even worse, we need to normalize the input using the mean and standard deviation that were used when the model was originally trained.
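The preprocessing step can be looked at in isolation. The following is a minimal sketch using only the standard library, assuming the input arrives as interleaved 8-bit RGB bytes and that the model expects channel-first floats standardized with the usual ImageNet statistics. The helper names normalize_channel and to_chw are hypothetical and are not part of tract or the image crate.

```rust
// ImageNet channel statistics in RGB order.
const MEAN: [f32; 3] = [0.485, 0.456, 0.406];
const STD: [f32; 3] = [0.229, 0.224, 0.225];

/// Scale an 8-bit channel value to [0, 1], then standardize it.
fn normalize_channel(value: u8, channel: usize) -> f32 {
    (value as f32 / 255.0 - MEAN[channel]) / STD[channel]
}

/// Convert interleaved RGB bytes (height x width x 3) into a planar,
/// channel-first buffer (3 x height x width) of normalized floats.
fn to_chw(pixels: &[u8], width: usize, height: usize) -> Vec<f32> {
    assert_eq!(pixels.len(), width * height * 3);
    let mut out = vec![0.0f32; 3 * width * height];
    for y in 0..height {
        for x in 0..width {
            for c in 0..3 {
                let src = (y * width + x) * 3 + c;
                let dst = c * width * height + y * width + x;
                out[dst] = normalize_channel(pixels[src], c);
            }
        }
    }
    out
}

fn main() {
    // A hypothetical 2x1 image: one black pixel followed by one white pixel.
    let pixels = [0u8, 0, 0, 255, 255, 255];
    let tensor = to_chw(&pixels, 2, 1);
    println!("{:?}", tensor);
}
```

A black pixel maps to a negative value and a white pixel to a positive one in every channel, which is the value range a model trained on standardized inputs expects.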

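fn embed(opts: &Opts, image_opt: &Image) -> Result<(), Box<dyn Error>> {
    let image_size = image_opt.image_size;
    // Load the model and fix the input shape to 1 x 3 x H x W.
    // Reconstructed step: the requested layer is selected as the output
    // before the model is optimized and made runnable.
    let model = tract_onnx::onnx()
        .model_for_path(&opts.model_path)?
        .with_input_fact(0, InferenceFact::dt_shape(f32::datum_type(), tvec!(1, 3, image_size, image_size)))?
        .with_output_names(vec![&image_opt.layer_name])?
        .into_optimized()?
        .into_runnable()?;

    // Open the image, convert it to RGB, and resize it to the model input size.
    let image = image::open(&image_opt.image_path)?.to_rgb8();
    let resized =
        image::imageops::resize(&image, image_size as u32, image_size as u32, ::image::imageops::FilterType::Triangle);

    // Build the input tensor, optionally normalizing each channel.
    let image: Tensor = if image_opt.normalize {
        tract_ndarray::Array4::from_shape_fn((1, 3, image_size, image_size), |(_, c, y, x)| {
            let mean = [0.485, 0.456, 0.406][c];
            let std = [0.229, 0.224, 0.225][c];
            (resized[(x as _, y as _)][c] as f32 / 255.0 - mean) / std
        })
        .into()
    } else {
        tract_ndarray::Array4::from_shape_fn((1, 3, image_size, image_size), |(_, c, y, x)| {
            resized[(x as _, y as _)][c] as f32
        })
        .into()
    };

    // run the model on the input
    let result = model.run(tvec!(image))?;
    // Flatten the selected layer's output into a plain vector of floats.
    let best: Vec<_> = result[0]
        .to_array_view::<f32>()?
        .iter()
        .cloned()
        .collect();

    println!("{:?}", best);
    Ok(())
}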

To run image embedding on a cat image, run

cargo run -- --model-path mobilenetv2-7.onnx embed --normalize --image-size 224 --image-path cat.jpeg

This command should print out a list of 1280 floats. You can see the code for the whole project in the repository.


There are other considerations than just writing Rust for fun. Using Rust and ONNX is meant to improve the performance of deep learning model deployments. Let's compile our small project and see how fast it loads the model and extracts the embeddings:

cargo build --release
time target/release/image-embedding-rust --model-path mobilenetv2-7.onnx embed --normalize --image-size 224 --image-path cat.jpeg

On my old ThinkPad P52 with an i7 processor, it takes 0.2 seconds to load the model and perform feature extraction.

Even with a cold start, where the model is loaded from disk on every run, this is not bad at all. The result could be improved by keeping the model loaded in memory.
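One way to keep the model in memory is to load it once and share it for the lifetime of the process. The sketch below shows that pattern with the standard library's OnceLock. The Model struct and load_model function are hypothetical placeholders for the optimized tract model and its loading code, and the counter exists only to show that the load runs a single time.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Hypothetical stand-in for a loaded, optimized model.
struct Model {
    name: String,
}

/// Counts how many times the expensive load actually ran.
static LOAD_COUNT: AtomicUsize = AtomicUsize::new(0);
/// Process-wide slot that holds the model after the first load.
static MODEL: OnceLock<Model> = OnceLock::new();

/// Placeholder for parsing and optimizing the ONNX file.
fn load_model(path: &str) -> Model {
    LOAD_COUNT.fetch_add(1, Ordering::SeqCst);
    Model { name: path.to_string() }
}

/// Return the shared model, loading it on first use only.
fn shared_model() -> &'static Model {
    MODEL.get_or_init(|| load_model("mobilenetv2-7.onnx"))
}

fn main() {
    // Simulate several requests; only the first one pays the load cost.
    for _ in 0..3 {
        let model = shared_model();
        println!("using {}", model.name);
    }
    println!("loads performed: {}", LOAD_COUNT.load(Ordering::SeqCst));
}
```

In a long-running service, this moves the load cost out of the per-request path, so each request pays only for image preprocessing and inference.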

What's next

In the next episode we will wrap this code in a web service that can index images and return the most similar image when queried. Stay tuned.
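As a small preview of what "most similar" could mean for embedding vectors, here is a minimal sketch that assumes cosine similarity as the measure and a brute-force scan over an in-memory list. Both are assumptions made for illustration, not necessarily what the web service will use, and the function names are hypothetical.

```rust
/// Dot product of two equally sized vectors.
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Cosine similarity, or None if the lengths differ or a vector is all zeros.
fn cosine_similarity(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() {
        return None;
    }
    let norm_a = dot(a, a).sqrt();
    let norm_b = dot(b, b).sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot(a, b) / (norm_a * norm_b))
}

/// Position of the indexed embedding that is most similar to the query.
fn most_similar(query: &[f32], index: &[Vec<f32>]) -> Option<usize> {
    index
        .iter()
        .enumerate()
        .filter_map(|(i, v)| cosine_similarity(query, v).map(|s| (i, s)))
        .max_by(|a, b| a.1.total_cmp(&b.1))
        .map(|(i, _)| i)
}

fn main() {
    // Tiny hypothetical 2-dimensional embeddings; real ones have 1280 floats.
    let index = vec![vec![1.0, 0.0], vec![0.0, 1.0], vec![0.7, 0.7]];
    let query = [0.9, 1.0];
    println!("{:?}", most_similar(&query, &index));
}
```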

If you are interested in problems like this and using Rust in production development of the fastest real-time recommendation engine check out RecoAI website or drop us an e-mail at
