Ding-Ze Hu

Software Engineer focused on Python, data engineering, and full-stack development, now orienting toward AI/ML-adjacent roles. Based in Hamburg, with three years of professional experience at DESY and an MSc in Computer Science from Göttingen. Languages: Mandarin (native), English (C1), German (B1).

Professional experience

Hands-on ownership of data-centric systems, research tooling, and full-stack delivery in a large scientific organisation.

Software Engineer (Full Stack)

Deutsches Elektronen-Synchrotron (DESY) — Hamburg · Apr 2023 – present

  • Architected and optimized scalable metadata schemas and Python-based automation engines, reducing manual data collection overhead.
  • Led the containerization of complex research environments using Docker, ensuring 100% reproducibility and streamlining deployment across distributed systems.
  • Designed and maintained end-to-end data pipelines with automated data cleaning and transformation workflows that produce reproducible, production-ready research datasets.
  • Delivered production-ready full-stack features (Django, React) used by 10+ research teams, improving discoverability and accessibility of scientific data.
  • Took ownership of features end to end, from data ingestion and backend services to frontend delivery.
  • Acted as the technical bridge between research users and engineering, translating vague scientific requirements into robust, deployable software systems.

About

I build software that helps researchers find, trust, and use their data — mostly in Python, with solid full-stack and DevOps habits.

At DESY in Hamburg, I build the Python services and data pipelines that 10+ research teams depend on daily — from metadata ingestion through to the Django/React interfaces they actually use. My MSc at Göttingen focused on database performance for real-time analysis (thesis at GWDG), which still shapes how I think about data-intensive systems.

Technical skills

Programming

Python (Pandas, NumPy, scikit-learn) · SQL (PostgreSQL, MySQL) · TypeScript · C++

Frameworks

Django · React · Next.js

Databases

Supabase · PostgreSQL · MongoDB · MySQL · Redis · InfluxDB

Data & ML

Anomaly Detection · Time Series Forecasting · Graph Embeddings · Geospatial Visualization · Leaflet · ECharts

DevOps & tools

Docker · Git · GitHub · CI/CD

Projects

Thesis benchmark chart: full retrieval comparison figure

In-Memory Database Solution for near Real-Time Data Analysis

Oct 2021 – Jun 2022 · Master’s thesis at GWDG

  • Benchmarked Redis and MongoDB as in-memory caching layers for IoT time-series data stored in InfluxDB, comparing fetch/write performance and memory consumption across custom data structures.
  • Designed three Redis schema variants (sorted sets with pickled data, sorted sets with hashes, multi-set composite) and optimized MongoDB queries using pagination and custom _id indexing, achieving a 10x retrieval speedup over legacy InfluxDB queries.
  • Technologies: Python, Pandas, NumPy, Redis, MongoDB, InfluxDB, Jupyter Notebook

Screenshot of the data visualization and network project

Data Visualization & Graph Engineering

Apr 2021 – Jul 2022

  • Developed a high-performance scraping engine to extract publication datasets from Semantic Scholar; processed and cleaned complex relational data.
  • Implemented graph embedding techniques to transform publication metadata into 2D coordinates for interactive network visualizations.
  • Technologies: Python, Pandas, NumPy, HTML, Graph embeddings, Node.js

Visualization from the sensor data forecasting project

Analysis and Prediction of Sensor Data to build a Forecast Model Using AI Technology

Nov 2019 – Apr 2020

  • Architected a high-scale time-series processing and ML pipeline for a 20B+ record dataset, implementing outlier detection, feature extraction, and ML models for environmental trend forecasting.
  • Developed dynamic heat maps to visualize temporal temperature variations.
  • Built and deployed predictive models for environmental trend forecasting based on large-scale citizen-science datasets.

Education

Master of Science in Computer Science

Georg-August-Universität Göttingen

Oct 2018 – Jun 2022

  • Focus: Machine Learning, Databases, Big Data, Software Development
  • Thesis at GWDG: In-Memory Database Solution for near Real-Time Data Analysis

Bachelor of Science in Information and Telecommunications Engineering

Ming Chuan University, Taipei, Taiwan

Sep 2014 – Jun 2018

  • Bachelor thesis: Design and Implementation of Network Performance Testing Website
  • Designed and developed a web application to measure real-time network Quality of Service (QoS) parameters.
  • Implemented functionality for users to run tests and analyze performance results in real time.

Certifications

  • CCNA (Cisco Certified Network Associate) – Routing and Switching
  • Oracle Certified Professional, Java SE 6 Programmer
  • Certified Android Mobile App Developer, Ministry of Economic Affairs, R.O.C.