Nico Jahn | Senior IT Consultant / Software Engineer

About Me

My name is Nico Jahn, and I am a computer scientist from Germany, currently based in Munich. Before that, I lived in Berlin for seven consecutive years while pursuing my Bachelor’s and Master’s degrees in computer science. During my studies, I focused on automation, scalability, and machine learning; since then, my expertise has grown into many other areas. Over the past years, I have developed a strong passion for large-scale computation and infrastructure management, so I spend most of my time finding, reading, and learning from documentation. I had the opportunity to write a paper on the algorithm from my Bachelor’s thesis together with my supervisor Prof. Dr. Obermayer, my advisor Thomas Goerttler, and two more scientists from TU Berlin. A second paper, which extends this work and summarizes some of the work done at my former employer, was published in a different journal. I am grateful that I could write my Master’s thesis in the same department where I had written my Bachelor’s thesis three years earlier. I always look forward to vacations with a few good friends, whom I visit regularly even though they live all across Germany.

Projects

Open Neural-APC

nicojahn/open-neural-apc

A top-notch automatic passenger counting algorithm for public transportation

This project was my Bachelor’s thesis. I received around 2,500 video sequences of low-resolution 3D footage (LiDAR, or more precisely, ToF), along with passenger counts for each sequence. The sequences were labeled manually; the labels denote all boarding and alighting passengers during one door-opening phase (one video). I later had the opportunity to publish a paper on the algorithm, including some information about the dataset.

EPFD pipeline

nicojahn/EPFD-pipeline

A pipeline wrapper for the EPFD ("Ensemble pruning based on objection maximization with a general distributed framework") implementation

I wrote the wrapper and combined it with the Weights&Biases¹ client to run several experiments. EPFD was introduced by Bian et al. in 2019 and is an ensemble pruning technique based on iterative optimization. It works well when the population of ensemble members to draw from is not too large; otherwise, the approximation algorithm becomes very slow to compute. The project was part of the university class ‘Hot Topics in Computer Vision - Seminar’, where we could conduct experiments on a new task using, for example, pre-implemented algorithms.

¹ Weights&Biases website: https://wandb.ai/

Genre classification with DNN

nicojahn/DL4AED

Deep Learning for Audio Event Detection was a university class

In this class, we were a group of three (with my fellow students Joel and Wassim) and wanted to classify music genres in a semi-supervised fashion. We took data from the GTZAN¹ and FMA² datasets (30-second music clips) and split the clips into 3-second chunks. We then computed logarithmic Mel spectrograms (the frequencies and intensities of the chunks) and tested a ResNet autoencoder (a ResNet encoder plus a basic decoder).

¹ GTZAN dataset: http://marsyas.info/downloads/datasets.html
² FMA dataset: https://github.com/mdeff/fma/
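
The chunking step described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function name `split_into_chunks`, the choice to drop a trailing remainder, and the toy sample rate are all assumptions made for the example. The log-Mel spectrogram step is not shown, since it would normally be delegated to an audio library.

```python
from typing import List, Sequence


def split_into_chunks(
    samples: Sequence[float],
    sample_rate: int,
    chunk_seconds: float = 3.0,
) -> List[Sequence[float]]:
    """Split a mono signal into consecutive, non-overlapping chunks.

    Assumption: a trailing remainder shorter than one chunk is dropped.
    """
    if sample_rate <= 0 or chunk_seconds <= 0:
        raise ValueError("sample_rate and chunk_seconds must be positive")
    chunk_len = int(round(sample_rate * chunk_seconds))
    if chunk_len < 1:
        raise ValueError("chunk must contain at least one sample")
    n_chunks = len(samples) // chunk_len
    # Slice the signal into n_chunks windows of equal length.
    return [samples[i * chunk_len:(i + 1) * chunk_len] for i in range(n_chunks)]


# Hypothetical 30-second clip at a toy sample rate of 100 Hz (silence).
clip = [0.0] * (30 * 100)
chunks = split_into_chunks(clip, sample_rate=100)
print(len(chunks), len(chunks[0]))  # 10 chunks of 300 samples each
```

With these settings, each 30-second clip yields ten 3-second training examples, which is how chunking multiplies the number of samples available per clip.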

Skills and Expertise

Deep Learning Frameworks: TensorFlow 1 & 2, Keras, PyTorch
Soft skills: Leadership, Mentorship
Technologies: Virtualization (ESXi, OpenStack), Containerization & Orchestration (Docker, Kubernetes), Network stacks, Cloud Environments (AWS, Azure), Kafka
Programming & Scripting Languages: Python, C/C++, Java, Shell
Languages: German (native language), English C1 (business fluent)

Certifications

AWS Certified Cloud Practitioner (CLF-C02)
AWS Certified AI Practitioner (AIF-C01)
AWS Certified Machine Learning - Specialty (MLS-C01)
AWS Certified Solutions Architect - Associate (SAA-C03)
Certified Kubernetes Administrator (CKA)
Certified Kubernetes Application Developer (CKAD)
Certified Kubernetes Security Specialist (CKS)
HashiCorp Certified: Terraform Associate (HCTAO-003)
Professional Scrum Master™ I (PSM I)

Experience

Data Reply GmbH

Senior IT Consultant

March 2024 - present

https://www.reply.com/data-reply/en

Helping to improve the machine learning infrastructure and develop solutions for our clients.

Data Reply GmbH

IT Consultant

April 2022 - February 2024

https://www.reply.com/data-reply/en

Helping to improve the machine learning infrastructure and develop solutions for our clients.

Interautomation Deutschland GmbH

Working student

August 2019 - March 2022

https://www.interautomation.de/en

As a software engineer, I further developed and optimized Neural APC and created training and inference workflows tailored to the algorithm’s special requirements.

TU Berlin

Working student

May 2018 - July 2019

https://www.its.tu-berlin.de

My area of responsibility was the server infrastructure. This included updating and managing the faculty’s Windows domain controller, managing the software systems (package distribution), and keeping the Windows clients from crashing due to Windows updates. The environment was a good mixture of Ubuntu and Windows servers, as well as macOS and Windows clients.

Education

TU Berlin

MA Computer Science

2018 - 2022

I graduated with very good grades (1.3 in the German grading system). I specialized in machine learning, with a broader overview of various application areas. I deliberately took more time for classes and projects, so I rarely felt that more time would have resulted in better grades. In my Master’s thesis, I worked on improving (two-tone) Mooney images.

TU Berlin

BA Computer Science

2015 - 2019

I graduated with good grades (2.8 in the German grading system). I wrote my Bachelor’s thesis about automatic passenger counting (yes, the algorithm listed above). During this degree, I had already taken most of the basic machine learning classes at the university. I also had a great ungraded project with a group of 9 fellow students. As a group leader, I was responsible for one part of a smart-home system (creating a UDP and TCP/SSL client/server architecture in Java).

Publications

NAPC: A Neural Algorithm for Automated Passenger Counting in Public Transport on a Privacy-Friendly Dataset

IEEE Open Journal of Intelligent Transportation Systems

2021

10.1109/OJITS.2021.3139393

Real-time load information in public transport is of high importance for both passengers and service providers. Neural algorithms have shown a high performance on various object counting tasks and play a continually growing methodological role in developing automated passenger counting systems. However, the publication of public-space video footage is often contradicted by legal and ethical considerations to protect the passengers’ privacy. This work proposes an end-to-end Long Short-Term Memory network with a problem-adapted cost function that learned to count boarding and alighting passengers on a publicly available, comprehensive dataset of approx. 13,000 manually annotated low-resolution 3D LiDAR video recordings (depth information only) from the doorways of a regional train. These depth recordings do not allow the identification of single individuals. For each door opening phase, the trained models predict the correct passenger count (ranging from 0 to 67) in approx. 96% of boarding and alighting, respectively. Repeated training with different training and validation sets confirms the independence of this result from a specific test set.

Engineering the Neural Automatic Passenger Counter

Engineering Applications of Artificial Intelligence

2022

10.1016/j.engappai.2022.105148

Automatic passenger counting (APC) in public transportation has been approached with various machine learning and artificial intelligence methods since its introduction in the 1970s. It is mainly used for revenue sharing, which (in Germany alone) is in the billions annually and supply planning, which is essential for services of general interest. While equivalence testing is becoming more popular than difference detection (Student’s t-test), the former is much more difficult to pass to ensure low user risk. On the other hand, recent developments in artificial intelligence have led to algorithms that promise much higher counting quality (lower bias). However, gradient-based methods (including Deep Learning) typically run into local optima. In this work, we explore and exploit various aspects of machine learning to increase the reliability, performance, and counting quality of the Neural APC 3D depth video-based LSTM neural network. We perform a grid search with several fundamental parameters: the selection and size of the training set, which is similar to cross-validation, and the initial network weights and randomness during the training process. Using this experiment, we show how aggregation techniques such as ensemble quantiles can reduce bias, and we give an idea of the overall spread of the results. We utilize the test success chance, a simulative metric based on the empirical distribution. We also employ a post-training Monte Carlo quantization approach and introduce cumulative summation to turn counting into a stationary method and allow unbounded counts. All in all, our numerous additions provide a major quality increase to the NAPC.
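
To give a rough idea of what aggregating an ensemble by quantiles can look like, here is a generic sketch. It is not the method from the paper: the helper names `quantile` and `aggregate_ensemble`, the linear-interpolation rule, and the example counts are all assumptions chosen for illustration.

```python
from typing import List, Sequence


def quantile(values: Sequence[float], q: float) -> float:
    """Return the q-quantile of values using linear interpolation."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must lie in [0, 1]")
    ordered = sorted(values)
    position = q * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


def aggregate_ensemble(
    predictions: Sequence[Sequence[float]], q: float = 0.5
) -> List[float]:
    """Aggregate predictions column-wise; each row holds one model's outputs."""
    return [quantile(column, q) for column in zip(*predictions)]


# Hypothetical boarding counts from three models for four door-opening phases.
model_counts = [
    [3, 0, 12, 7],
    [3, 1, 11, 7],
    [4, 0, 12, 9],
]
print(aggregate_ensemble(model_counts))  # [3.0, 0.0, 12.0, 7.0]
```

Using the median (q = 0.5) means a single model that over- or undercounts one phase does not shift the aggregated count, which is the intuition behind using a quantile rather than a mean.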