Experiences
Research
LLM as NER Data Generator
Sep. 2023 - Mar. 2024
Prof. Chao Zhang's group @ GaTech
- Innovated 1st structured named-entity recognition (NER) training dataset generation with LLMs by attributed prompting for diversity
- Developed & Optimized multi-stage generation pipeline including parallel API calls, data filtering & cleaning, pretty logging & summary stats
- Manually Inspected generated samples; Case-studied & Analyzed multiple paradigms for LLM annotation feedback and self-correction
- Increased DeBERTa NER F1 score by >5% on average using <$1 API cost & <10 labeled samples, matching ChatGPT-3.5 teacher performance while 20X faster; Wrote 40-page paper with failure analysis
Parameter-Efficient Personalization
May 2023 - Dec. 2023
CLARITY lab @ UMich
- Collaborated with Christopher Clarke
- Explored storage-efficient methods for personalization focusing on subjective text classification tasks
- Surveyed literature to select subjective datasets (e.g., irony)
- Designed & Executed user-wise PEFT, Adapter and Personalized Head training & evaluation pipelines for Flan-T5 generative text classification
- Benchmarked 7 prompting and PEFT methods across 11 subjective tasks, each with up to 5K users and 120K total samples
Symbolic Music Generation
Oct. 2021 - Feb. 2023
LIT @ UMich
- Mentored by Artem Abzaliev
- Designed & Implemented a compact music token representation for long song sequences, the first to integrate music theory annotations
- Coded & Optimized tokenization pipeline to process 10K+ raw MIDI files including batching and concurrency optimization, channel reduction and efficient handling of >50 edge cases
- Tailored Transformer-XL & Reformer architectures for long music sequence training; Designed music-specific evaluation metrics; Inspected >100 generated music pieces
Personalized Text Classification Dataset
Jul. 2022 - Oct. 2022
CLARITY lab @ UMich
- Collaborated with Yiping Kang & Ashish Mahendra
- Designed a tree-structured text classification dataset schema for nested and temporally-changing label sets
- Processed 15K production user data from Myca productivity tool spanning 2 years
Zero-Shot Text Classification
Feb. 2022 - Jun. 2022
CLARITY lab @ UMich
- Collaborated with Christopher Clarke
- Benchmarked 3 zero-shot classification paradigms across 18 datasets
- Re-Implemented a closed-source, prior GPT-2-based 0-shot approach
- Developed & Optimized training & eval pipelines, reducing GPT-2 inference time by 2X; Launched experiments
- Improved classifier accuracies by 1% on average with simple domain-conditioned training; Designed illustrations & wrote paper sections
Industry
Front-end Software Engineer Intern
May 2021 - Jul. 2021
Seller Experience team @ eBay (remote)
- Mentored by Wei Don and Srini
- Successfully launched a new video feature in the item listing tool, impacting more than 10% of eBay sellers
- Presented architecture, implementation, upstream service challenges & next steps; Live-demoed to an internal team of 40+
- Received a return offer!
Others
Heterogeneous Bi-Encoder
Jul. 2022
CLARITY lab @ UMich
- Collaborated with Ashish Mahendra
- Investigated independent context and candidate encoders for intent classification
- Provided feedback on architecture implementation and experiments; Wrote sections of and edited paper submission
Multi-Robot Collaboration Researcher
Sep. 2021 - Dec. 2021
Barton Research Group @ UMich
- Research question: “How to get the relative pose between two robots directly (as opposed to global positioning), exploiting the capabilities of both robots?”
- Case-studied relative pose estimation between two robots to reduce noise resulting from global-positioning-based localization
- Devised algorithmic formalization for static robot collaboration, as a point-matching problem given laser scans
ECG Signal Processing Researcher
Sep. 2020 - Apr. 2022
Michigan Medicine @ UMich
- Mentored by Dr. Mohammed Saeed
- Developed Dash-based ECG signal web app with features including thumbnail, channel toggle, box measurement & annotation
- Designed UI wireframes tailored for physicians’ retrospective study and annotation needs; Gathered feedback from cardiologists
- Algorithmically Optimized rendering efficiency of GBs of signal records
- Devised self-supervised pretraining objectives for 12-channel ECG timeseries based on symbolic and real-valued representations, inspired by NLP and vision pre-training
- Reviewed literature to Compile a dataset collection of 50K+ 12-lead ECG records from 8 datasets
- Pioneered applying the Vision Transformer architecture for ECG disease classification with heart-signal data augmentations; Visualized attention layers for explainability
Predictive Maintenance Portability Researcher
Jan. 2021 - Apr. 2021
Barton Research Group @ UMich
- Research question: “Does the prior failure prediction approach generalize well to similar bearing systems?”
- Applied prior bearing failure prediction method and analyzed generalization to a new dataset
- Re-organized prior codebase; Formalized prior method into components with applicability criteria
UX/UI Designer & Developer Intern
Mar. 2020 - May 2020
OpptIn (Startup) @ Scranton, PA (remote)
- Brainstormed UI framework with functional widgets to augment location-specific real-life spaces
- Iterated company logo designs 3 times with ~30 illustrations; Prototyped >20 UI layouts for space discovery and management in a team of 6; Implemented layouts in Android
- Prototyped UI animations and dynamic style changes for pre- and post-joining spaces
- Served as the key communicator on UX concepts and engineering constraints between design and development teams
Bioengineering Imaging Research Assistant
May 2019 - Jul. 2019
Prof. Yu Chen @ UMD
- Manually refined kidney imaging segmentation model annotations for transplant success prediction in a group of 6 annotators
- Surveyed & analyzed image texture filters for kidney imaging noise segmentation
Art Museum Experience Researcher & Mentor
Apr. 2019 - May 2020
Prof. Kyungjin Yoo @ UMD
- Developed an ARCore-based Android app that renders art museum paintings in 6 base colors to teach color theory & raise engagement
- Compared prior approaches for theme-colored visualization; Implemented primary & secondary color extraction & color map file parsing
- Developed & Tested painting segmentation heuristics based on color & semantics
- Mentored 30 students on VR, AR, location intelligence training & art museum experience
- Reviewed students’ Unity project submissions
- Wrote Android Studio & AR development tutorial based on Google SDK
- Led lab discussions & provided feedback on research proposals 3 times weekly
Soldier Intelligence Trainer
Nov. 2018 - Feb. 2019
UMD
- Participated in Prof. Fawzi Emad’s course challenge about AI soldiers fighting on a 2D grid battlefield
- Read an AI and Games textbook; Implemented genetic search algorithms with custom-designed reward heuristics
- Tracked soldiers’ in-game behavior and computed statistics to monitor performance
- Automated game restart to increase self-play training trials