Publication
IEEE JSSC
Paper

SMIV: A 16-nm 25-mm² SoC for IoT With Arm Cortex-A53, eFPGA, and Coherent Accelerators


Abstract

Emerging Internet of Things (IoT) devices necessitate systems-on-chip (SoCs) that can scale from ultralow-power always-on (AON) operation all the way up to less frequent high-performance tasks at high energy efficiency. Specialized accelerators are essential to help meet these needs at both ends of the scale, but maintaining workload flexibility remains an important goal. This article presents a 25-mm² SoC in 16-nm FinFET technology that demonstrates targeted, flexible acceleration of key compute-intensive kernels spanning machine learning (ML), DSP, and cryptography. The SMIV SoC includes a dedicated AON sub-system, a dual-core Arm Cortex-A53 CPU cluster, an SoC-attached embedded field-programmable gate array (eFPGA), and a quad-core cache-coherent accelerator (CCA) cluster. Measurement results demonstrate: 1) a 1236× power envelope, from 1.1 mW (only AON cluster) up to 1.36 W (whole SoC at maximum throughput); 2) a 5.5–28.9× energy efficiency gain from offloading compute kernels from the A53 to the eFPGA; 3) a 2.94× latency improvement using coherent memory access (CCA cluster); and 4) a 55× MobileNetV1 energy-per-inference improvement on the CCA compared to the CPU baseline. The overall flexibility-efficiency range on SMIV spans measured energy efficiencies of 1× (dual-core A53), 3.1× (A53 with SIMD), 16.5× (eFPGA), 54.9× (CCA), and 256× (AON) at a peak efficiency of 4.8 TOPS/W.
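
The power-envelope figure follows directly from the two endpoint powers quoted in the abstract, and the efficiency ladder can be read as a sequence of step-to-step gains. The short sketch below only redoes that arithmetic from the quoted numbers. The function and variable names are illustrative and do not come from the paper, and the step ratios assume all five efficiency figures share the same dual-core A53 baseline, as the 1× entry suggests.

```python
# Figures quoted in the abstract; names below are illustrative, not from the paper.
AON_POWER_W = 1.1e-3   # only the AON cluster active
PEAK_POWER_W = 1.36    # whole SoC at maximum throughput

# Measured energy efficiency relative to the dual-core A53 (1x), in abstract order.
EFFICIENCY_GAIN = {
    "dual-core A53": 1.0,
    "A53 with SIMD": 3.1,
    "eFPGA": 16.5,
    "CCA": 54.9,
    "AON": 256.0,
}


def power_envelope(min_w: float, max_w: float) -> float:
    """Ratio between the highest and lowest operating power."""
    return max_w / min_w


def step_gains(gains: dict) -> dict:
    """Gain of each tier over the previous one (assumes a shared baseline)."""
    names = list(gains)
    return {f"{a} -> {b}": gains[b] / gains[a] for a, b in zip(names, names[1:])}


if __name__ == "__main__":
    print(f"power envelope: {power_envelope(AON_POWER_W, PEAK_POWER_W):.0f}x")
    for step, ratio in step_gains(EFFICIENCY_GAIN).items():
        print(f"{step}: {ratio:.1f}x")
```

Dividing 1.36 W by 1.1 mW gives roughly 1236, matching the quoted envelope.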
