Session-based recommendation system

Recommending the most relevant products in real time is the Holy Grail of data science in the e-commerce industry. Together with Sephora, we took on the challenge of improving online recommendations and increasing the click-through rate (CTR). Although offline recommendation approaches were already in use, they did not incorporate the latest user behavior. Our task was to implement a system that would generate accurate, real-time recommendations in response to users’ most recent activities in the app.

We approached the problem by creating a framework based on accumulators: objects that transform raw events into meaningful data that machine learning models can use later. We proved the power of this method by winning the 2019 RecSys Challenge. With the data prepared, we combined existing offline recommendations with newly generated online ones, giving us a large set of potentially suitable products. The final step was to rank those products from most to least relevant and present the user with only a few of the best ones.
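To make the accumulator idea more concrete, here is a minimal sketch in Python. The class name `ClickCountAccumulator`, the event fields (`user_id`, `action`, `item_id`) and the feature it produces are assumptions made for illustration only; they are not taken from the framework described above.

```python
from collections import defaultdict


class ClickCountAccumulator:
    """Hypothetical accumulator: counts how often each user clicked each item."""

    def __init__(self):
        # (user_id, item_id) -> number of clicks seen so far
        self.counts = defaultdict(int)

    def update(self, event):
        # Consume one raw event and update the internal state.
        if event["action"] == "click":
            self.counts[(event["user_id"], event["item_id"])] += 1

    def feature(self, user_id, item_id):
        # Expose the state as a numeric feature for a ranking model.
        return self.counts.get((user_id, item_id), 0)


# Illustrative event stream (made-up data).
events = [
    {"user_id": "u1", "action": "click", "item_id": "lipstick"},
    {"user_id": "u1", "action": "view", "item_id": "lipstick"},
    {"user_id": "u1", "action": "click", "item_id": "lipstick"},
]

acc = ClickCountAccumulator()
for event in events:
    acc.update(event)

print(acc.feature("u1", "lipstick"))  # 2
```

Because the accumulator is updated one event at a time, the same object could in principle serve both offline feature building and real-time scoring.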

Ministry of Health

Fraud detection for the Polish Ministry of Health

GovTech Polska uses a competition format to involve tech startups in solving state-scale technological challenges with Artificial Intelligence and Data Science. The central entity is the public sector, which reports challenges and looks for modern ways to solve them, but the indirect beneficiaries are, of course, citizens.

The system uses Machine Learning and Data Science algorithms to detect anomalies in hospital records that have a financial impact on contracts with the Ministry of Health. As a result, the Ministry can avoid contract abuse and save taxpayers millions of EUR, starting from 2020.


Price optimization

Price is one of the most vulnerable parts of any online venture. Every store tries hard to find the sweet spot where the price is both profitable and emphasizes the value of the product. Instead of going the hard way, Frisco, a giant of the Polish e-grocery market, decided to approach this in a smart way by leveraging the power of machine learning.

Our goal, in line with Frisco's strategy, was to increase sales without decreasing margin in some categories, and to increase margin without decreasing sales of specified products in other categories. We met this goal by adapting two concepts: price elasticity and market basket penetration. To investigate the relationship between price changes and market basket penetration for each product, we used a probabilistic approach. It allowed us to use different levels of price sensitivity, introduce expert knowledge into the equation, and obtain a distribution of likely values. Throughout the process, we gained a deeper understanding of the e-grocery industry, which we can transfer to other companies.
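For readers unfamiliar with the two concepts, the sketch below shows their textbook point definitions in Python. It is not the probabilistic model used in the project. The function names and the sample numbers are made up for illustration, and "market basket penetration" is assumed here to mean the share of baskets that contain a product.

```python
def price_elasticity(old_price, new_price, old_qty, new_qty):
    """Relative change in quantity sold divided by relative change in price."""
    price_change = (new_price - old_price) / old_price
    qty_change = (new_qty - old_qty) / old_qty
    return qty_change / price_change


def basket_penetration(baskets, product):
    """Share of baskets that contain the given product (assumed definition)."""
    containing = sum(1 for basket in baskets if product in basket)
    return containing / len(baskets)


# A 10% price increase followed by a 15% drop in units sold
# gives an elasticity of roughly -1.5.
elasticity = price_elasticity(old_price=10.0, new_price=11.0, old_qty=100, new_qty=85)

# Two of the four illustrative baskets contain milk.
baskets = [{"milk", "bread"}, {"milk"}, {"eggs"}, {"bread", "eggs"}]
milk_share = basket_penetration(baskets, "milk")
```

A probabilistic treatment would replace the single elasticity number with a distribution over plausible values, which is where expert knowledge can enter as a prior.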

Alior Bank

Predicting defaults for small and medium businesses

Predicting defaults and the risk of small and medium businesses is a challenge that every financial institution faces. Alior Bank wanted to enrich its modeling infrastructure by using the transactions of its SMB clients. The bank didn't have the tools to analyze the sequential information and diverse data types present in the transactions.

Together with Alior Bank, we built a prototype of a framework called EventAI, which simplifies model declaration. The framework lets analysts create sophisticated metrics describing customers and use them easily. The tool we created is available to all financial institutions in need of such a mechanism.


Digitalization program

The goal was to implement a digitalization program across all offices and factories, covering 70 000 engineers, and to build internal AI communities that make ‘AI as common as Excel’.

We interviewed Data Science teams in 3 locations around the world and gathered insights about their cooperation with local and international teams. The result was over 70 easy-to-implement initiatives, most of which involve no cost and can be implemented in less than a month.

Wall Street English

Predicting student progress, optimizing class schedules

Predicting any kind of human behaviour is a difficult task, and predicting student progress in online/offline courses is no exception. Our client, an international language school, wanted to improve the scheduling of its offline classes by predicting students’ progress in the online part of the course.

By exploiting the power of EventAI to transform event-based data into meaningful input for machine learning algorithms, we were able to predict when a given student would finish each online chunk of the course. EventAI, with its ease of declaring features and aggregation functions, made the feature engineering process simpler, less error-prone and easier to control. With the predictions in hand, we were able to optimise the schedule of offline classes, improve class occupancy and reduce students’ waiting time.


Matching product to category tree automation

For a large e-commerce business, it's crucial to match its products with the merchants' offers as fast as possible. With COVID-19 and the rapid increase in online shopping, this need reached a boiling point. Our client, an online platform selling diverse goods, wanted to automate the onboarding of new merchants.

We created a web-based application powered by a machine learning model to match a merchant's product to the relevant category tree. We praise FastAPI for its simple documentation, built-in perks, and development speed. It allowed us to come up with a simple application design in a short amount of time. Our deep learning model was able to predict the whole category tree down to its leaves based only on the product title and description. This process can be transferred to any e-commerce platform with deep and wide category trees to speed up the onboarding of new resellers.

Oil pump

Fiber optic sensing data analysis application

Distributed fiber optic sensing, in particular Distributed Temperature Sensing (DTS) and Distributed Acoustic Sensing (DAS), is a technology that allows in-well monitoring, e.g. for leakage detection. To make sense to a reservoir engineer, the measurements need to be processed and visualized.

We used Python and standard data processing libraries: NumPy, pandas and Dask. The last one proved especially useful when running CPU-heavy, domain-specific algorithms. We also used the HoloViz Panel library to build the app, and HoloViews/Bokeh for visualization. This allowed us to easily make fully interactive plots with zooming, hovers, and inter-plot interactions.

Investment and savings optimization for Limitless

Fintech products are inseparable from the AI industry. AI makes fintech safe, accurate, fair and effective. It is crucial for financial apps to have solid data management and algorithms. Limitless is a financial wellbeing app targeted at Millennials and licensed to corporations. Limitless reached out to us to develop a way to predict the size of the monthly savings that customers are willing to put aside. The app helps corporate employees build their financial safety net and boost their overall satisfaction. It enables users to invest money automatically, with a solution adapted to their daily lives. The open banking mobile app uses algorithms to analyse transactional data and determine the maximum amount that can be safely moved to savings and investment accounts.

To make the Limitless app safe and accurate, we worked closely with the team to ensure our predictions were precise. We combined ML and DL techniques in a recommendation model that suggests how to save towards a dream goal, and used optimization to find the best personalized investment strategy within the user’s given timeline.


Kaggle Days Paris Competition

For our second and more chic event, Kaggle Days in Paris, we chose a competition about forecasting sales for the next three months based on the first seven days of sales after launch. That way we helped a luxury goods retailer, LVMH, plan its sales strategies with advanced machine learning techniques.

During 11 hours of competition, 76 teams tried their best, using every trick and technique, to reach the lowest error range. The data consisted of product descriptions, sales, social media, website navigation, and image data. The error dropped from the baseline of 0.8 to the unimaginable level of 0.6, and everybody happily went on to celebrate 2 intensive days loaded with knowledge, tips and lessons learnt the hard way from the data science community.

Inline Inspection

Inline Inspection involves evaluating pipelines by using intelligent devices to search for internal and external damage. Such inspections are very costly and are carried out every couple of years. Afterwards, pipeline engineers manually match anomalies across inspections from different years to analyze the growth of the damage. The goal is to predict changes that may violate the integrity of the pipeline. The whole process is very laborious and tedious.


Using standard data science libraries in Python, we were able to fully automate this task with unsupervised learning and predictive modeling. This will save hundreds of hours of work and enable pipeline engineers to spend more time on less monotonous projects.
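As a rough illustration of what matching anomalies between two inspections can look like, here is a simplified sketch in plain Python. It uses a greedy nearest-position rule rather than the unsupervised models used in the project, and every name in it (`match_anomalies`, `position_m`, `depth_pct`, `tolerance_m`) as well as the sample data is hypothetical.

```python
def match_anomalies(earlier, later, tolerance_m=1.0):
    """Greedily pair each earlier anomaly with the closest unused later one.

    Returns a list of (earlier, later, depth_growth) tuples for pairs whose
    positions along the pipeline differ by at most tolerance_m metres.
    """
    matches = []
    unused = list(later)
    for old in earlier:
        if not unused:
            break
        closest = min(unused, key=lambda new: abs(new["position_m"] - old["position_m"]))
        if abs(closest["position_m"] - old["position_m"]) <= tolerance_m:
            growth = closest["depth_pct"] - old["depth_pct"]
            matches.append((old, closest, growth))
            unused.remove(closest)
    return matches


# Illustrative anomalies from two inspection runs (made-up values).
run_earlier = [
    {"position_m": 100.2, "depth_pct": 10},
    {"position_m": 250.0, "depth_pct": 20},
]
run_later = [
    {"position_m": 100.5, "depth_pct": 14},
    {"position_m": 400.0, "depth_pct": 5},
]

pairs = match_anomalies(run_earlier, run_later)
```

In this example only the anomaly near 100 m is matched, with a depth growth of 4 percentage points; the growth values of matched pairs are the kind of signal a predictive model could then be trained on.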


Digitalization introduction - Training and mentorship for Sanofi

During the workshop, participants were introduced to the basics of AI: a proper definition, the use of AI in everyday tools, different types of artificial intelligence and common misconceptions about it. Then we talked about Big Data, AI, ML and deep learning: the differences and links between them, and how those systems are connected and depend on each other.

With that basic yet crucial knowledge in mind, we could move on to learning how to analyze the potential of data and come up with AI ideas. We asked participants to share realistic challenges they face at their company, and together we worked out AI-based solutions that could be implemented in real life to automate business processes.

After the brainstorming session, participants presented their ideas and got feedback from mentors.

The workshop was followed by one-on-one mentoring sessions where participants could develop their ideas further and make them ready to implement at the company.

Building Data Science Community for UAE

Better World Hackathon was the first step towards building a strong AI community in the UAE and also part of the International Exhibition for National Security and Resilience in Abu Dhabi, UAE.

The aim of Better World Hackathon was to bring talent and ideas to the country and to give data science experts from all over the world an opportunity to build innovative AI products. It was a global challenge in which AI and Data Science teams competed for prizes worth 310 K USD. The Hackathon was divided into 3 areas: security, civil defence and services.

Besides providing organisation, logistics and marketing expertise, we worked with several ministry departments to analyze their data and prepare them for digitalization. As a result, we became an official Data Science supplier to the Ministry of Interior for its digitalization.

Scraping tool

Today's retailers are rightly convinced that data is a key strategic asset. The organizations that can use data most effectively will be best equipped to succeed. To seize this opportunity, however, organizations cannot rely on older web scraping tools and tactics. AI-based solutions enable retailers to identify, extract, prepare, integrate and use internet data. Since the web is the largest database and the information it contains is valuable, Redbubble, an Australian e-commerce platform giving artists a new way to sell their products, asked us to build a web scraping solution.

The innovative solution we created for them scrapes product data from over 20 e-commerce stores. The scraped data is processed and used for product comparison. The job isn't easy because products on the web are constantly changing. Thanks to these capabilities, our solution provides the high-quality, up-to-date, and comprehensive information they need to stay ahead of their competitors and at the top of dynamic markets.

NymCard Fintech segmentation and recommendation engines

The way we use our credit and debit cards says more about us than any other offline activity. Based on this data, a bank can create a unique banking experience for us, suggesting the perfect savings product, attractive credit terms or even a tailor-made investment program for our kids. All this is possible thanks to data multisegmentation and strong recommendation engines.
We designed a tailor-made system for a fintech app that used banking data to predict customer behaviour and suggest the best offer for each customer's needs. Using a multisegmentation engine, we were able to automatically prepare personalized financial strategies that not only helped serve the customer better in the bank branch but, most of all, made it possible to prepare unique banking product offers that answer a true customer need and provide a great customer experience.

Predicting the best hearing device settings

Demant, an international leader in hearing aid kits, already had very good physical models for calibrating device settings based on audiograms. These settings are programmed into the devices by trained audiologists during the patient's first visit. The question emerged: is it possible to predict the settings using audiograms and machine learning models? And if so, can the predicted settings be even better than those proposed by audiologists?
Thanks to machine learning, we found how audiograms and settings are connected. Our models proved successful in predicting correct device settings without any work required from an audiologist. Such a solution is faster and less prone to error, and reliability is very important when it comes to the settings of any medical device. This shows how machine learning can help automate the calibration of hearing aid kits.

Identifying and addressing one-time customers

One-time customers, or one-timers, are customers who use a service or buy a product from a specific provider once and never do so again. It is important to keep customers engaged so that they return to businesses they have chosen before.
Together with LVMH and Dior, we worked on a solution to identify one-timers and created a more systematic approach to turning them into returning customers.

Our approach was based on the idea that a customer can be viewed as a collection of events, such as creating an account or buying a product or a service, but also receiving a phone call or an email. This innovative way of thinking allowed us to identify one-timers and create a systematic procedure for engaging them and turning them into returning customers. As a result, our algorithms helped to achieve a significant lift, based on active customer engagement strategies.
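The event-based view of a customer can be sketched in a few lines of Python. This is a simplified illustration, not the algorithm used in the project; the event schema (`customer_id`, `type`), the event type names and the sample data are assumptions.

```python
from collections import Counter


def find_one_timers(events):
    """Return the customers with exactly one purchase event, sorted by id."""
    purchases = Counter(
        event["customer_id"] for event in events if event["type"] == "purchase"
    )
    return sorted(cid for cid, count in purchases.items() if count == 1)


# Each customer is represented only by the events recorded for them.
events = [
    {"customer_id": "anna", "type": "account_created"},
    {"customer_id": "anna", "type": "purchase"},
    {"customer_id": "ben", "type": "purchase"},
    {"customer_id": "ben", "type": "email_received"},
    {"customer_id": "ben", "type": "purchase"},
    {"customer_id": "cleo", "type": "phone_call"},
]

one_timers = find_one_timers(events)
print(one_timers)  # ['anna']
```

Non-purchase events such as emails and phone calls are ignored by this rule, but in an event-based model they could serve as features describing how a one-timer was engaged.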


Clothes segmentation and style transfer

The problem was to reflect a person's mood in their clothes, for example by applying the music the person listens to to the style of the clothes. The challenge was to find the correspondence between music and visual style.

During the project we performed clothes segmentation and style transfer to match the style with the mood of the music. The keynote was presented during Kaggle Days in Paris in 2019.

Dubai Police

Predicting number of crimes

Organizing Kaggle Days in Dubai, we were lucky to cooperate with Dubai Police. The goal was to reduce patrols' waiting time. We divided it into two subtasks: predicting the number of crimes within each hour of the day in smaller areas (hexagons) and clustering those hexagons into manageable groups.
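As an illustration of the first subtask, a naive baseline can be sketched in Python: the average number of historical incidents per hexagon and hour of day. This is only an assumed starting point, not a description of any competition solution; the function name, the data layout and the sample values are hypothetical.

```python
from collections import defaultdict


def hourly_baseline(incidents, n_days):
    """Average number of incidents per (hexagon, hour) over n_days of history."""
    totals = defaultdict(int)
    for hexagon_id, hour in incidents:
        totals[(hexagon_id, hour)] += 1
    return {key: count / n_days for key, count in totals.items()}


# Each record is (hexagon id, hour of day); two days of made-up history.
history = [("hex_1", 22), ("hex_1", 22), ("hex_1", 9), ("hex_2", 22)]
baseline = hourly_baseline(history, n_days=2)
```

Here `baseline[("hex_1", 22)]` is 1.0 incident per day at 22:00, while the other two observed combinations average 0.5. A real model would add further signals on top of such a baseline.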

The competition was about the first subtask. Participants used historical incident data, collected over the previous four years, from 319 areas of Dubai. Additionally, we provided them with weather, temperature, and public holiday datasets. Twenty-six teams competed for 9.5 hours. To exceed expectations, LogicAI prepared an algorithm that clusters regions of Dubai to reduce the response time of the police forces.

With the help of seasoned Data Scientists, we also held a brainstorming session on two serious problems that all societies struggle with nowadays. We decided to focus on reducing road fatalities and on designing AI projects that would help track child abuse online and, as a result, reduce this crime.

Automatic scheduling and workload prediction

All companies face HR challenges when it comes to managing people on a mass scale. One of the major struggles nowadays is optimizing HR costs, where possible, without reducing staff or cutting salaries.
It is important to plan employees' work and monitor their performance, but also to take care of their happiness so that they are more loyal to the company.
Grafik Optymalny is a revolutionary Polish startup that has built an application for scheduling employee shifts. They asked us to build several AI-based modules, two of which stand out:
1. Automatic scheduling of employee shifts
- We created an advanced algorithm that lets companies automatically schedule and optimize the shifts of hundreds of employees across many locations. The algorithm not only optimized costs and reduced management workload, but also focused on keeping employees happy to reduce unnecessary churn.
2. Predicting employee churn based on job schedules
- We built a powerful churn prediction tool that allowed employers to predict potential staff shortages.