Neural Network Optimization for Resource-Constrained IoT Devices

Authors

  • Er. Aman Shrivastav, ABESIT Engineering College, Ghaziabad

DOI:

https://doi.org/10.63345/cfbff019

Keywords:

Neural Networks, IoT Devices, Optimization, Model Compression, Quantization, Pruning, Knowledge Distillation

Abstract

The rapid expansion of the Internet of Things (IoT) has created an urgent need for neural networks that deliver reliable intelligence under stringent memory, compute, and energy constraints. This paper presents a unified, deployment-oriented framework for neural network optimization on resource-constrained IoT devices, integrating structured pruning, post-training quantization and quantization-aware training, knowledge distillation, and lightweight architectural redesign. We formalize a multi-objective cost function that balances accuracy, latency, model size, and energy per inference, and we operationalize it via a staged pipeline: (i) sparsity-inducing pruning with topology preservation for microcontroller kernels, (ii) mixed-precision quantization along 8- and 4-bit pathways, calibrated on device-representative data, (iii) teacher-student distillation with temperature-scaled soft targets to recover accuracy, and (iv) hardware–software co-tuning for common IoT platforms (Raspberry Pi, ESP32, and Cortex-M microcontrollers).
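
The abstract describes but does not reproduce the multi-objective cost function. One illustrative formalization consistent with that description (the weights λ and the normalizing budgets are assumptions for exposition, not values from the paper) is:

J(θ) = λ_acc · (1 − Acc(θ)) + λ_lat · Lat(θ)/Lat_budget + λ_size · Size(θ)/Flash_budget + λ_energy · E(θ)/E_budget,

minimized subject to hard feasibility constraints such as Size(θ) ≤ Flash_budget and peak working memory ≤ SRAM_budget, so that any candidate model that cannot be flashed or run on the target device is excluded outright.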
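
As a concrete illustration of stages (ii) and (iii), the following minimal sketch shows how a temperature-scaled distillation loss and post-training full-integer quantization with device-representative calibration data could be written with TensorFlow and the TensorFlow Lite converter. The function names, the alpha/T values, and the rep_samples placeholder are illustrative assumptions, not the authors' implementation.

import tensorflow as tf

# Stage (iii): temperature-scaled teacher-student distillation loss.
# Cross-entropy against the teacher's softened distribution (equivalent to the
# KL term up to a constant), scaled by T^2, mixed with hard-label cross-entropy.
def distillation_loss(teacher_logits, student_logits, labels, T=4.0, alpha=0.5):
    soft_targets = tf.nn.softmax(teacher_logits / T)
    soft_log_preds = tf.nn.log_softmax(student_logits / T)
    kd = -tf.reduce_mean(tf.reduce_sum(soft_targets * soft_log_preds, axis=-1)) * T * T
    ce = tf.keras.losses.sparse_categorical_crossentropy(
        labels, student_logits, from_logits=True)
    return alpha * kd + (1.0 - alpha) * tf.reduce_mean(ce)

# Stage (ii): post-training full-integer (int8) quantization, calibrated on
# device-representative samples; the resulting flatbuffer targets int8 kernels
# such as those used by TensorFlow Lite Micro on Cortex-M class devices.
def quantize_for_mcu(student_model, rep_samples):
    def representative_dataset():
        for x in rep_samples:
            yield [tf.expand_dims(tf.cast(x, tf.float32), 0)]
    converter = tf.lite.TFLiteConverter.from_keras_model(student_model)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.representative_dataset = representative_dataset
    converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
    converter.inference_input_type = tf.int8
    converter.inference_output_type = tf.int8
    return converter.convert()

In a pipeline of this shape, the distillation loss would drive student fine-tuning after pruning, and the quantized flatbuffer returned by the converter is what would ultimately be profiled for latency and energy on the target board.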

References

• Han, S., Pool, J., Tran, J., & Dally, W. (2015). Learning both weights and connections for efficient neural network. Advances in Neural Information Processing Systems, 28, 1135–1143.

• Jacob, B., Kligys, S., Chen, B., Zhu, M., Tang, M., Howard, A., ... & Adam, H. (2018). Quantization and training of neural networks for efficient integer-arithmetic-only inference. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2704–2713.

• Hinton, G., Vinyals, O., & Dean, J. (2015). Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531.

• Howard, A. G., Zhu, M., Chen, B., Kalenichenko, D., Wang, W., Weyand, T., ... & Adam, H. (2017). MobileNets: Efficient convolutional neural networks for mobile vision applications. arXiv preprint arXiv:1704.04861.

• Iandola, F. N., Han, S., Moskewicz, M. W., Ashraf, K., Dally, W. J., & Keutzer, K. (2016). SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size. arXiv preprint arXiv:1602.07360.

• Zhang, X., Zhou, X., Lin, M., & Sun, J. (2018). ShuffleNet: An extremely efficient convolutional neural network for mobile devices. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 6848–6856.

• Lane, N. D., Bhattacharya, S., Georgiev, P., Forlivesi, C., Jiao, L., Qendro, L., & Kawsar, F. (2016). DeepX: A software accelerator for low-power deep learning inference on mobile devices. Proceedings of the 15th ACM/IEEE International Conference on Information Processing in Sensor Networks (IPSN), 1–12.

• Zhang, T., Ye, S., Zhang, K., Tang, J., & Pan, P. (2018). A systematic DNN weight pruning framework using alternating direction method of multipliers. European Conference on Computer Vision (ECCV), 184–199.

• Wu, J., Leng, C., Wang, Y., Hu, Q., & Cheng, J. (2016). Quantized convolutional neural networks for mobile devices. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 4820–4828.

• Cheng, Y., Wang, D., Zhou, P., & Zhang, T. (2017). A survey of model compression and acceleration for deep neural networks. arXiv preprint arXiv:1710.09282.

• Li, H., Kadav, A., Durdanovic, I., Samet, H., & Graf, H. P. (2017). Pruning filters for efficient convnets. International Conference on Learning Representations (ICLR).

• Tan, M., & Le, Q. V. (2019). EfficientNet: Rethinking model scaling for convolutional neural networks. Proceedings of the International Conference on Machine Learning (ICML), 6105–6114.

• Reddi, V. J., Cheng, C., Karsai, G., Krishnan, S., Li, H., Lin, H., ... & Venkataramani, S. (2020). MLPerf Tiny benchmark. arXiv preprint arXiv:2010.07502.

• David, R., Duke, J., Jain, A., Reddi, V. J., Jeffries, N., Li, J., ... & Warden, P. (2021). TensorFlow Lite Micro: Embedded machine learning for tinyML systems. Proceedings of Machine Learning and Systems (MLSys), 3, 800–811.

• Xu, Z., & Xu, W. (2020). Knowledge distillation for deep neural networks: A survey. International Journal of Automation and Computing, 17(2), 151–167.

• Alsubaei, F., Abuhussein, A., & Shiva, S. (2019). Security and privacy in the Internet of Medical Things: Taxonomy and risk assessment. Future Generation Computer Systems, 97, 509–520.

• Wang, H., Zhang, Z., Xu, S., & Chen, Y. (2019). Lightweight convolutional neural networks for mobile devices. IEEE Access, 7, 106974–106983.

• Liu, Z., Li, J., Shen, Z., Huang, G., Yan, S., & Zhang, C. (2017). Learning efficient convolutional networks through network slimming. Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2736–2744.

• Chen, T., Goodfellow, I., & Shlens, J. (2016). Net2Net: Accelerating learning via knowledge transfer. International Conference on Learning Representations (ICLR).

• Xu, R., Chen, Y., Lin, H., & Wang, F. (2021). Edge intelligence: Architectures, challenges, and applications. Journal of Systems Architecture, 117, 102110.

Published

02-07-2025

Issue

Vol. 2 No. 3 (2025)

Section

Review Article

How to Cite

Neural Network Optimization for Resource-Constrained IoT Devices. (2025). Scientific Journal of Artificial Intelligence and Blockchain Technologies, 2(3), 37–45. https://doi.org/10.63345/cfbff019
