  • Northeastern University
  • Oakland
Jessy-Ding/README.md

Hi there 👋, I am Mengyuan (Jessy) Ding.

AI Systems · Research Engineering · Machine Learning

MSCS @ Northeastern University
Former Researcher @ Brigham and Women's Hospital / Harvard Medical School

I build machine learning systems, data infrastructure, and research pipelines for complex real-world problems.

My current interests focus on:

  • AI systems and evaluation
  • machine learning robustness under domain shift
  • multimodal and imperfect-data learning
  • scalable backend and data infrastructure
  • human-in-the-loop AI systems
  • reproducible ML workflows

Biomedical and neuroimaging applications are currently my primary problem domains; my long-term focus is AI/ML systems and research engineering.


What I'm Working On

Cross-domain medical image segmentation

Current interests include:

  • domain shift
  • imperfect and heterogeneous data
  • annotation-efficient learning
  • self-supervised representation learning
  • robustness across scanners/sites/distributions

Particularly interested in:

  • representation learning
  • sample selection
  • generalization under distribution shift
  • scalable biomedical ML pipelines

AI systems and infrastructure

Working on:

  • multi-agent AI systems
  • evaluation pipelines
  • semantic normalization workflows
  • backend services and access-control systems
  • scalable research tooling

I am interested in how AI systems behave in noisy, real-world environments rather than in benchmark-only settings.


Technical Interests

Machine Learning

  • representation learning
  • multimodal ML
  • self-supervised learning
  • medical imaging
  • segmentation
  • model evaluation
  • robustness and generalization

Systems & Infrastructure

  • backend APIs
  • authentication systems
  • RBAC
  • databases
  • reproducible pipelines
  • data versioning
  • scalable tooling

Data & Scientific Computing

  • neuroimaging workflows
  • MRI pipelines
  • BIDS infrastructure
  • large-scale biomedical datasets
  • scientific Python ecosystem

Selected Technical Areas

Python
PyTorch
Flask
REST APIs
MongoDB
JWT / RBAC
Docker
Git/GitHub
Machine Learning
Deep Learning
Medical Imaging
Data Pipelines
Research Infrastructure

Current Learning Goals

  • algorithms and data structures
  • ML systems engineering
  • robust machine learning
  • scalable backend systems
  • software engineering practices
  • deep learning for real-world deployment

Looking to Collaborate On

  • ML systems
  • robust AI pipelines
  • multimodal learning
  • representation learning
  • data-centric AI
  • medical imaging AI
  • AI infrastructure
  • backend engineering for ML systems

Research Direction

I am particularly interested in research problems involving:

  • distribution shift
  • limited-label learning
  • imperfect real-world datasets
  • trustworthy AI systems
  • evaluation and benchmarking
  • scalable research tooling
  • human-centered AI systems

Long-term goal: build AI systems that remain reliable under noisy, heterogeneous, and real-world conditions.


Philosophy

I prefer building systems that are:

  • reproducible
  • scalable
  • inspectable
  • modular
  • deployable
  • robust to imperfect data

I care less about leaderboard optimization and more about whether systems continue to work under realistic constraints.


Ask Me About

  • machine learning under domain shift
  • medical imaging pipelines
  • AI evaluation systems
  • backend engineering for ML workflows
  • transitioning from neuroscience/medicine into AI systems engineering
  • reproducible research infrastructure

GitHub Roadmap

This GitHub profile is gradually being organized around:

  1. ML systems
  2. research engineering
  3. data infrastructure
  4. robust machine learning
  5. biomedical AI pipelines
  6. reproducible scientific workflows

Public repositories will mainly focus on:

  • tooling
  • pipelines
  • infrastructure
  • methods
  • reproducible engineering workflows

Contact

Pinned

  1. SPARC-FAIR-Codeathon/Transcriptomic_oSPARC (Public)

    SPARC Portal transcriptomic data visualization in o²S²PARC

    Jupyter Notebook 3 3

  2. BIDS-Lite-Organizer (Public)

    Lightweight desktop app to automatically organize brain imaging data into BIDS format (Idea proposed by Dr. Michael Fox, MD, PhD, Brigham and Women's Hospital).

    Python 1