Heungsub Lee
- Contact
- [email protected]
- Web Sites
- subl.ee, github.com/sublee, linkedin.com/in/sublee
Interests
AI services and platforms, real-time communication in distributed systems, cost and performance optimization, and developer experience.
Skills
- Programming Languages
- Go, Python, TypeScript
- Service Development
- Linux, AWS, gRPC, React, Pulumi, Kubernetes, Redis, ZeroMQ, concurrent programming, testing
- AI Engineering
- MCP, PyTorch, NCCL, NVIDIA Nsight Systems
Work Experience
- Lead Software Engineer
- Global AI Platform, Sep 2023 – Present
- Leading the development of Aster, a personal AI agent service that applies planning to solve user problems by integrating data and functionality from multiple MCP servers.
- Directed LangDiff, an open-source library that bridges structured LLM outputs with progressive UI rendering.
- Software Engineering Manager
- NAVER, Aug 2020 – Jul 2023
- Supervised 25 engineers on MLOps platforms to optimize inference performance and productivity for HyperCLOVA, a Korean-focused LLM.
- Developed NSMLv2, a large-scale ML research platform at CLOVA. Designed a multi-tenant architecture with an economics-driven approach, enabling diverse organizations to share GPU clusters for HPC while maximizing resource efficiency.
- Software Engineer
- Kakao Brain, Dec 2018 – Aug 2020
- Developed torchgpipe, an open-source pipeline parallelism library for PyTorch.
- Developed a serverless training framework and distributed hyperparameter search pipelines for an AutoML service.
- Game Server Engineer
- NEXON, Mar 2011 – Dec 2018
- Developed cloud-native distributed MMORPG servers for Durango using pub/sub over a spatial grid system, supporting up to 70k concurrent users per game world.
- Developed online racing game servers and matchmaking for KartRider Dash and KartRider Coin Rush.
- Back-end Web Developer
- nPine, Dec 2008 – Feb 2011
- Developed e-commerce web services for stock image platforms.
Open Source Experience
- torchgpipe, Feb 2019 – Apr 2020
- Implemented GPipe, a multi-GPU pipeline parallelism technique for large-scale model training, as a PyTorch library with CUDA, autograd, and long skip connection optimizations; later upstreamed into PyTorch as the official Pipe APIs.
- Hangulize, Oct 2010 – Present
- Designed a Hangul transcription algorithm and released it as a free web tool widely used by professional Korean translators.
- TrueSkill, Jan 2012 – Dec 2015
- Implemented TrueSkill™, the rating algorithm behind Xbox Live, as a Python library; presented at PyData Berlin 2019.
- Contributions
- Contributed upstream patches to PyTorch improving GPU safety (#27371) and API consistency (#21006, #25985). Fixed a subdomain URL bug (#108) in Flask.
Publications
- H. Park et al., “HPCClusterScape: Increasing Transparency and Efficiency of Shared High-Performance Computing Clusters for Large-scale AI Models,” arXiv:2310.02120, Oct 2023.
- B. Kim et al., “What changes can large-scale language models bring? Intensive study on HyperCLOVA: Billions-scale Korean generative pretrained Transformers,” arXiv:2109.04650, Sep 2021.
- C. Kim*, Heungsub Lee* et al., “torchgpipe: On-the-fly pipeline parallelism for training giant models,” arXiv:2004.09910, Apr 2020.
*Contributed equally
Public Speeches
- “NSML, the hyper-scale ML training platform,” KRnet, Jun 2022.
- “Remake of Hangulize,” Golang Korea Meetup, Aug 2018.
- “Profiling,” PyCon Korea, Aug 2015.
- “The server architecture of Durango,” NDC, 2014, 2016, and 2018.
Languages
- Korean — Native
- English — Conversant in reading and writing
Education
Computer Software, Kwangwoon University, 2008 (completed the first year only).