Overview
Led dataset integration and preprocessing for a parameter-efficient fine-tuning pipeline that adapts Mistral-7B-Instruct into a course-specific AI tutor for UF's CIS6930 Large Language Models course. Fine-tuned with LoRA/QLoRA on a blended instruction corpus of UltraChat 200k, Infinity-Instruct, and Symbolic IT, with training run on UF's HiPerGator cluster. The tuned model achieved a ~10% relative improvement in token-level F1 over the base model and low perplexity on course slide reconstruction.
Highlights
- Led dataset integration combining UltraChat 200k, Infinity-Instruct, and Symbolic IT into a unified JSONL corpus with format standardization and deduplication
- Implemented core training pipelines using Hugging Face Transformers, PEFT, and TRL with LoRA rank r=64 across all attention and MLP projection layers
- Achieved ~10% relative improvement in mean token-level F1 (0.31 → 0.34) on a 400-example held-out instruction set
- Performed two-stage fine-tuning: general instruction tuning followed by CIS6930 lecture slide specialization, achieving perplexity of 4–8 on course material
- Trained on a single NVIDIA L4 GPU via UF's HiPerGator cluster using 4-bit NF4 quantization to fit within 24 GB VRAM
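The dataset-integration step above can be sketched as a normalization-and-deduplication pass over JSONL records. This is a minimal illustration: the field names (`prompt`, `response`) and the `normalize` helper are hypothetical, since each source corpus has its own schema.

```python
import hashlib
import json

def normalize(record, source):
    """Map a source-specific record into one shared instruction/response schema.
    Field names here are illustrative, not the actual corpus schemas."""
    return {
        "source": source,
        "instruction": record.get("prompt", "").strip(),
        "response": record.get("response", "").strip(),
    }

def dedup(records):
    """Drop exact duplicates by hashing the normalized instruction+response text."""
    seen, out = set(), []
    for r in records:
        key = hashlib.sha256(
            (r["instruction"] + "\x1f" + r["response"]).encode("utf-8")
        ).hexdigest()
        if key not in seen:
            seen.add(key)
            out.append(r)
    return out

raw = [
    {"prompt": "What is LoRA?", "response": "A low-rank adapter method."},
    {"prompt": "What is LoRA?", "response": "A low-rank adapter method."},
    {"prompt": "Define perplexity.", "response": "exp of the mean NLL."},
]
corpus = dedup([normalize(r, "demo") for r in raw])
jsonl = "\n".join(json.dumps(r) for r in corpus)  # unified JSONL output
print(len(corpus))  # → 2 (exact duplicate dropped)
```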
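A configuration along the lines described above, using the PEFT and bitsandbytes APIs, is sketched below. The rank (r=64), NF4 quantization, and the attention/MLP projection targets come from the bullets; the alpha, dropout, and compute dtype values are illustrative assumptions, not the project's actual settings.

```python
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit NF4 quantization so the 7B base model fits in 24 GB VRAM (QLoRA setup)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype="bfloat16",   # illustrative; float16 also common
    bnb_4bit_use_double_quant=True,
)

# LoRA rank 64 on every attention and MLP projection in the Mistral architecture
lora_config = LoraConfig(
    r=64,
    lora_alpha=16,       # illustrative scaling factor
    lora_dropout=0.05,   # illustrative
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)
```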
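Token-level F1 (the metric behind the 0.31 → 0.34 result) scores each prediction against its reference as bags of tokens. A minimal version with simple whitespace tokenization (the evaluation may well use the model's tokenizer instead):

```python
from collections import Counter

def token_f1(prediction: str, reference: str) -> float:
    """F1 over overlapping tokens, as in common QA/instruction evaluations."""
    pred_toks = prediction.lower().split()
    ref_toks = reference.lower().split()
    overlap = sum((Counter(pred_toks) & Counter(ref_toks)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_toks)
    recall = overlap / len(ref_toks)
    return 2 * precision * recall / (precision + recall)

print(round(token_f1("the answer is LoRA", "the answer is QLoRA"), 2))  # → 0.75
```

The reported score would be this value averaged over the 400 held-out examples.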
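Perplexity on the slide-reconstruction evaluation is the exponential of the mean per-token negative log-likelihood. The loss values below are hypothetical, chosen only to show how a perplexity in the reported 4–8 range arises:

```python
import math

def perplexity(nll_per_token):
    """exp of the mean token-level cross-entropy (natural log)."""
    return math.exp(sum(nll_per_token) / len(nll_per_token))

# Hypothetical per-token losses from an evaluation pass
losses = [1.6, 1.8, 1.5, 1.9]
print(round(perplexity(losses), 2))  # → 5.47, within the 4–8 range
```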
Tech Stack
Python · PyTorch · Hugging Face Transformers · PEFT · LoRA/QLoRA · TRL · BitsAndBytes · Mistral-7B · HiPerGator