Edge AI Inference Optimization: Quantization and Pruning on Resource-Constrained Platforms
The deployment of sophisticated artificial intelligence models on resource-constrained embedded systems presents fundamental challenges in balancing computational efficiency with accuracy preservation. Contemporary edge devices, including ARM Cortex-A processors and automotive electronic control units, operate under severe limitations on memory capacity, computational throughput, and power budget that preclude direct deployment of standard floating-point neural networks. Quantization techniques systematically reduce numerical precision from 32-bit floating point to 8-bit integer or binary representations, achieving compression ratios exceeding 50× while keeping accuracy within acceptable degradation thresholds. Post-training quantization converts pre-trained models directly, without retraining, while quantization-aware training adapts network representations to accommodate extreme precision reduction. Pruning methodologies exploit overparameterization in neural architectures through selective parameter elimination, with magnitude-based approaches achieving sparsity levels of 80-90% and structured pruning variants enabling hardware acceleration on conventional processors. Hardware-aware optimization strategies align sparsity patterns with SIMD execution units and memory access characteristics, maximizing inference throughput on embedded platforms. Empirical validation on ARM Cortex-A processors and Raspberry Pi systems demonstrates practical deployment of vision and language models within tight resource envelopes, achieving previously unattainable real-time inference performance on cost-effective embedded hardware. The convergence of efficient neural architectures, aggressive model compression, and platform-specific optimization democratizes artificial intelligence capabilities in cost-sensitive automotive, industrial, and consumer applications. [ABSTRACT FROM AUTHOR]
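The post-training quantization the abstract describes can be illustrated with a minimal sketch of symmetric, per-tensor int8 conversion. This is not the paper's implementation; the function names and NumPy-based approach are illustrative assumptions, showing only the core idea: a float32 tensor is mapped to int8 with a single scale factor, cutting storage 4× and bounding the round-trip error by half the quantization step.

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor post-training quantization to int8 (illustrative sketch)."""
    scale = np.max(np.abs(w)) / 127.0  # one scale maps the largest magnitude to 127
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover an approximate float32 tensor from the int8 representation."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((64, 64)).astype(np.float32)
q, s = quantize_int8(w)
w_hat = dequantize(q, s)
err = np.max(np.abs(w - w_hat))  # rounding error is at most scale / 2
```

Quantization-aware training differs only in when this mapping is applied: the same round-to-scale operation is simulated in the forward pass during training, so the network learns weights that tolerate the precision loss.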
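Magnitude-based pruning, cited above as reaching 80-90% sparsity, can likewise be sketched in a few lines. The helper below is a hypothetical illustration (not the paper's method): it keeps only the largest-magnitude weights until a target fraction is zeroed, which is the unstructured variant; structured pruning would instead remove whole rows, channels, or blocks so that dense hardware kernels benefit.

```python
import numpy as np

def magnitude_prune(w, sparsity=0.9):
    """Zero out the smallest-magnitude weights until `sparsity` of them are zero."""
    k = int(w.size * sparsity)  # number of weights to eliminate
    if k == 0:
        return w.copy()
    # k-th smallest absolute value becomes the pruning threshold
    thresh = np.partition(np.abs(w).ravel(), k - 1)[k - 1]
    mask = np.abs(w) > thresh  # keep only weights strictly above the threshold
    return w * mask

rng = np.random.default_rng(0)
w = rng.standard_normal((128, 128))
w_sparse = magnitude_prune(w, sparsity=0.9)
achieved = 1.0 - np.count_nonzero(w_sparse) / w.size  # fraction of zeroed weights
```

In practice the pruned model is fine-tuned afterwards to recover accuracy, and the sparsity pattern is chosen to match the platform's SIMD width when hardware speedup, not just compression, is the goal.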