Publication
ISSCC 2024
Conference paper

A software-assisted peak current regulation scheme to improve power-limited inference performance in a 5nm AI SoC

Abstract

The compute utilization and average power of AI model inference can vary significantly with model class, number of tokens, and batch size. Discrete AI accelerator cards must also strictly adhere to system specifications for peak current draw. Exploiting the predictable nature of AI workloads, a novel software-assisted feed-forward current-limiting scheme is proposed that works in conjunction with optimized closed-loop control to maximize current-limited performance.
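
The abstract does not describe the mechanism in detail, so the following is only a toy sketch of the general idea of pairing a feed-forward limit with a closed-loop correction. It is not the paper's implementation. Every name here (Phase, feed_forward_scale, closed_loop_trim, run), the linear current-versus-speed model, and the step size are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Phase:
    """One workload phase (hypothetical model, not from the paper)."""
    predicted_amps: float  # software estimate of current draw at full speed
    actual_amps: float     # current the phase really draws at full speed


def feed_forward_scale(predicted_amps: float, limit_amps: float) -> float:
    """Pick a speed scale in (0, 1] so the predicted current fits the limit."""
    if predicted_amps <= limit_amps:
        return 1.0
    return limit_amps / predicted_amps


def closed_loop_trim(trim: float, measured_amps: float, limit_amps: float,
                     step: float = 0.05) -> float:
    """Lower the trim after an over-limit measurement, else recover toward 1.0."""
    if measured_amps > limit_amps:
        return max(0.1, trim - step)
    return min(1.0, trim + step)


def run(phases, limit_amps: float):
    """Apply the feed-forward scale per phase, corrected by the feedback trim."""
    trim = 1.0
    history = []
    for phase in phases:
        scale = feed_forward_scale(phase.predicted_amps, limit_amps) * trim
        # Assumption: current scales linearly with the speed scale.
        measured = phase.actual_amps * scale
        history.append((scale, measured))
        trim = closed_loop_trim(trim, measured, limit_amps)
    return history
```

In this sketch, an accurate prediction lets the feed-forward term keep a phase at the limit before any overshoot is measured, while the feedback trim only reacts after a phase whose prediction was too low. That ordering is the motivation the abstract gives for using software knowledge of the workload alongside closed-loop control.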