Simulating Edge Device Inference Mar. 2025 — May 2025
Course project for CS598: Systems for GenAI. Collaborated with four other graduate students to develop a containerized LLM sharding system that simulates inference across virtual edge devices, enabling practical latency and efficiency tests on a single machine under configurable constraints.
[Paper] [Code]