Porsche Holding Data Science

Summary

Building a data warehouse and data-driven analyses and models

Keywords

AWS, MS Azure, ETL Pipelines, TensorFlow, Elasticsearch, SQL, NoSQL, Kafka, Python

Description

The goal was to consolidate various NoSQL and SQL databases into a central data warehouse and data lake. Data was collected in two ways: through nightly batch jobs, and as “live” data streamed via Kafka from a range of end devices.
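
To illustrate the two ingestion paths, here is a minimal sketch of how nightly batch rows and streamed messages might be mapped onto one shared schema and deduplicated before loading. It is not the project's actual pipeline: the field names (`id`, `eventId`, `timestampMs` and so on) and the functions `from_batch_row`, `from_stream_message` and `merge_events` are hypothetical, and the streamed payloads are assumed to be JSON bytes that a Kafka consumer has already received.

```python
import json
from datetime import datetime, timezone


def from_batch_row(row):
    """Map one row of a nightly export (a dict) to the shared schema.

    Assumes `created_at` is an ISO 8601 string that includes a UTC offset.
    """
    return {
        "event_id": str(row["id"]),
        "user_id": str(row["user_id"]),
        "event_type": row["type"],
        "ts": datetime.fromisoformat(row["created_at"]).astimezone(timezone.utc),
        "source": "batch",
    }


def from_stream_message(payload):
    """Map one streamed message (JSON bytes) to the shared schema.

    Assumes `timestampMs` is a Unix timestamp in milliseconds.
    """
    msg = json.loads(payload)
    return {
        "event_id": str(msg["eventId"]),
        "user_id": str(msg["userId"]),
        "event_type": msg["eventType"],
        "ts": datetime.fromtimestamp(msg["timestampMs"] / 1000, tz=timezone.utc),
        "source": "stream",
    }


def merge_events(batch_rows, stream_payloads):
    """Combine both sources, keeping one record per event_id.

    If an event arrives through both paths, the batch record is kept.
    The result is sorted by timestamp.
    """
    merged = {}
    for row in batch_rows:
        record = from_batch_row(row)
        merged[record["event_id"]] = record
    for payload in stream_payloads:
        record = from_stream_message(payload)
        merged.setdefault(record["event_id"], record)
    return sorted(merged.values(), key=lambda r: r["ts"])
```

Preferring the batch record on duplicates is only one possible choice; a real pipeline might instead keep the most recently updated version of each event.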

A particular challenge of the project was the large volume of data to be managed (20+ million web events / month, 2 TB/month) while meeting the security standards of Volkswagen AG. I was also heavily involved in implementing the GDPR (DSGVO) requirements, and worked with Volkswagen AG's group-wide “Team Cloud” on expanding the cloud infrastructure on AWS and MS Azure.

Various recommender systems were developed for use in apps and on websites, with the aim of showing each user the most suitable vehicle. Simple cases were implemented using graphs in Elasticsearch; more complex scenarios used precomputed clusters or machine learning models, which were deployed using OpenFaaS and Kubernetes.
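
As a rough illustration of the precomputed-cluster approach, the sketch below recommends vehicles the user has not yet viewed from the clusters they viewed most often. It is a simplified stand-in, not the system built in the project: the function `recommend`, the vehicle IDs and the cluster assignments are all hypothetical, and the cluster mapping is assumed to have been computed offline.

```python
from collections import Counter


def recommend(viewed_ids, vehicle_cluster, top_n=3):
    """Suggest unseen vehicles from the clusters the user viewed most often.

    viewed_ids: vehicle IDs the user has already looked at.
    vehicle_cluster: precomputed mapping of vehicle ID -> cluster ID.
    """
    # Count how many of the user's viewed vehicles fall into each cluster.
    cluster_counts = Counter(
        vehicle_cluster[v] for v in viewed_ids if v in vehicle_cluster
    )
    seen = set(viewed_ids)
    # Keep unseen vehicles from clusters the user has shown interest in.
    # Sorting by (-count, id) ranks popular clusters first, with ties
    # broken alphabetically so the output is deterministic.
    candidates = [
        (-cluster_counts[cluster], vehicle_id)
        for vehicle_id, cluster in vehicle_cluster.items()
        if vehicle_id not in seen and cluster_counts[cluster] > 0
    ]
    return [vehicle_id for _, vehicle_id in sorted(candidates)[:top_n]]
```

Because the clusters are computed ahead of time, a lookup like this is cheap enough to run inside a small serverless function at request time.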

Furthermore, machine learning models were developed to forecast unit volumes, inventory levels and vehicle registration figures.

Are you working on a similar project, or interested in something similar? Contact us now for a free 15-minute consultation.

DataFortress.cloud is your partner for high quality, state-of-the-art solutions.