Zhan Zhuang

Ph.D. Candidate in Computer Science specializing in AI/ML
Hong Kong, HK.

About

Zhan Zhuang is a Ph.D. candidate in Computer Science at CityUHK, specializing in post-training and transfer learning for foundation models, with an expected graduation in June 2026. He has published in top-tier AI/ML venues including ICLR, ICML, NeurIPS, and ICCV, and has industry research experience at ByteDance. His work on parameter-efficient updates, automated self-improvement, and long-context memory mechanisms targets robust, generalizable, and self-improving AI systems.

Work

ByteDance Seed Research

Research Intern

Shenzhen, Guangdong, China

Summary

Conducted industry research on foundation models and self-improving learning systems.

Highlights

Developed and implemented parameter-efficient update strategies for foundation models, enabling frequent, low-cost model refreshes while preserving prior capabilities.

Contributed to the design and evaluation of automated self-improvement mechanisms, utilizing curriculum/data synthesis and self-evaluation for robust model performance.

Explored and integrated long-context and memory architectures to improve task completion and generalization in complex AI systems.

Helped optimize large-scale AI systems, using deployment feedback loops to drive continuous improvement and robust generalization.

Education

City University of Hong Kong (CityUHK)
Hong Kong SAR

Ph.D.

Computer Science

Southern University of Science and Technology (SUSTech)
Shenzhen, Guangdong, China

B.Eng.

Computer Science and Technology

Awards

Outstanding Academic Performance Award

Awarded By

CityUHK

Recognized for exceptional academic achievement and contributions during Ph.D. studies.

Outstanding Graduate Award

Awarded By

SUSTech

Awarded for distinguished academic performance and overall contributions as a graduating student.

Top 10 Graduate Award at Shude College

Awarded By

SUSTech

Acknowledged as one of the top 10 graduates from Shude College, demonstrating academic excellence.

Excellent Student Cadre

Awarded By

SUSTech

Recognized for outstanding leadership and service in student organizations.

Excellent League Member

Awarded By

SUSTech

Awarded for exemplary commitment and active participation within the student league.

Publications

Method and device for acquiring information of close contact person, server and storage medium

Published by

Granted Patent (CN112866991B)

Summary

Invented and patented a method and device for acquiring close contact information, enhancing public health initiatives.

Data processing method and device based on block chain, equipment and storage medium

Published by

Granted Patent (CN112635061B)

Summary

Developed and patented a novel data processing method and device leveraging blockchain technology for secure and efficient information management.

HAD: Heterogeneity-Aware Distillation for Lifelong Heterogeneous Learning

Published by

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Summary

Authored research on heterogeneity-aware distillation, advancing lifelong heterogeneous learning in computer vision.

MoSA: Mosaic Shared Adaptation of Large Language Models

Published by

Thirteenth International Conference on Learning Representations (ICLR)

Summary

Co-authored work on mosaic shared adaptation, improving parameter-efficient fine-tuning for large language models.

PLAN: Proactive Low-Rank Allocation for Continual Learning

Published by

International Conference on Computer Vision (ICCV)

Summary

Introduced PLAN, a proactive low-rank allocation strategy to optimize continual learning performance.

HiRA: Parameter-Efficient Hadamard High-Rank Adaptation for Large Language Models

Published by

Thirteenth International Conference on Learning Representations (ICLR)

Summary

Presented HiRA, a parameter-efficient Hadamard high-rank adaptation method for LLMs; selected for oral presentation (top 1.8%).

Moonshine: Distilling Game Content Generators into Steerable Generative Models

Published by

Thirty-Ninth AAAI Conference on Artificial Intelligence (AAAI)

Summary

Delivered an oral presentation on Moonshine, a method for distilling game content generators into steerable generative models.

Come Together, But Not Right Now: A Progressive Strategy to Boost Low-Rank Adaptation

Published by

Forty-Second International Conference on Machine Learning (ICML)

Summary

Proposed a progressive strategy to boost the performance of low-rank adaptation.

MoPFormer: Motion-Primitive Transformer for Wearable-Sensor Activity Recognition

Published by

Thirty-ninth Annual Conference on Neural Information Processing Systems (NeurIPS)

Summary

Developed MoPFormer, a novel transformer architecture for enhancing wearable-sensor activity recognition.

Gradual Domain Adaptation via Gradient Flow

Published by

Twelfth International Conference on Learning Representations (ICLR)

Summary

Published research on gradual domain adaptation via gradient flow; selected as a Spotlight paper (top 5%).

Extend Wave Function Collapse Algorithm to Large-Scale Content Generation

Published by

IEEE Conference on Games

Summary

Awarded 'Best Paper' (top 5%) for extending the Wave Function Collapse algorithm to enable large-scale content generation.

Skills

Programming Languages

Python, Java, C, C#, JavaScript, SQL, Scala, Verilog, HTML5.

Development Tools & Platforms

Linux, Android, Unity, Arduino, Web Development, STM32, Jetson Nano, Hololens.

Machine Learning & AI

Foundation Models, Transfer Learning, Continual Learning, Parameter-Efficient Fine-Tuning (PEFT), Large Language Models (LLM), Computer Vision (CV), Deep Learning, Generative Models, Reinforcement Learning.

Research Methodologies

Algorithmic Screening, Data Synthesis, Self-Evaluation, High-Performance Computing (HPC), Model Optimization, Deployment Feedback Loops, Strategic Prioritization.