Show simple item record

dc.contributor.author: Ma, Ye-Da
dc.contributor.author: Zhao, Zhi-Chao
dc.contributor.author: Liu, Di
dc.contributor.author: He, Zhenli
dc.contributor.author: Zhou, Wei
dc.date.accessioned: 2024-06-25T10:52:36Z
dc.date.available: 2024-06-25T10:52:36Z
dc.date.created: 2023-08-22T13:49:18Z
dc.date.issued: 2023
dc.identifier.citation: Journal of systems architecture. 2023, 142. (en_US)
dc.identifier.issn: 1383-7621
dc.identifier.uri: https://hdl.handle.net/11250/3135695
dc.description.abstract: In this paper, we propose a new on-device class-aware pruning method for edge systems, namely OCAP. The motivation is that Deep Neural Network (DNN) models are usually trained on a large dataset so that they can learn diverse features and generalize to accurately predict numerous classes. Prior works reveal that some features (channels) are only related to certain classes, and edge systems are usually deployed in a specific environment where the classes the system detects are limited. As a result, deploying a generally trained model in a specific edge environment leads to unnecessary redundancy, while transferring data and models to the cloud for personalization raises privacy issues. Thus, an on-device class-aware pruning method can remove the channels that are irrelevant to the classes the edge system mostly observes, thereby reducing the model's Floating Point Operations (FLOPs), memory footprint, and latency, improving energy efficiency, and keeping relatively high accuracy for the observed classes while protecting in-situ data privacy. OCAP introduces a novel class-aware pruning method based on the intermediate activations of input images to identify class-irrelevant channels. Moreover, we propose a method based on KL-divergence to select diverse and representative data for effectively fine-tuning the pruned model. The experimental results show the effectiveness and efficiency of OCAP: in comparison with state-of-the-art class-aware pruning methods, OCAP achieves better accuracy and a higher compression ratio. Additionally, we evaluate OCAP on the Nvidia Jetson Nano, Nvidia Jetson TX2, and Nvidia Jetson AGX Xavier in terms of efficiency, and the experimental results demonstrate the applicability of OCAP on edge systems. The code is available at https://github.com/mzd2222/OCAP. (en_US)
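For orientation only, the two ideas named in the abstract (scoring channels by their intermediate activations on locally observed classes, and KL-divergence-based selection of fine-tuning data) can be sketched roughly as below. This is a minimal, hypothetical PyTorch sketch under stated assumptions, not the authors' implementation: the function names, the per-channel mean-activation score, and the use of a single reference distribution are assumptions, and the actual method is in the linked repository.

```python
# Hypothetical sketch, NOT the OCAP implementation
# (see https://github.com/mzd2222/OCAP for the real code).
import torch


def score_channels_by_observed_classes(model, conv_layer, loader, device="cpu"):
    """Average per-channel activation magnitude over locally observed images.

    Channels with low scores would be candidates for pruning; the criterion
    in the paper differs in detail.
    """
    batch_scores = []

    def hook(_module, _inputs, output):
        # Conv output has shape (N, C, H, W); reduce to a per-channel score.
        batch_scores.append(output.detach().abs().mean(dim=(2, 3)))

    handle = conv_layer.register_forward_hook(hook)
    model.eval()
    with torch.no_grad():
        for images, _labels in loader:
            model(images.to(device))
    handle.remove()
    return torch.cat(batch_scores, dim=0).mean(dim=0)  # shape (C,)


def select_diverse_samples(candidate_probs, reference_probs, k, eps=1e-8):
    """Pick the k candidates whose predicted class distributions diverge most
    (by KL divergence) from a reference distribution, as a stand-in for the
    paper's diversity/representativeness selection."""
    p = candidate_probs.clamp_min(eps)   # (N, num_classes)
    q = reference_probs.clamp_min(eps)   # (num_classes,), broadcasts over N
    kl = (p * (p / q).log()).sum(dim=1)  # per-sample KL(p || q)
    return kl.topk(k).indices
```

How the scored channels are removed (structured pruning) and how the selected samples are used for fine-tuning are specified in the paper and repository, not in this sketch.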
dc.language.iso: eng (en_US)
dc.publisher: Elsevier (en_US)
dc.rights: Navngivelse 4.0 Internasjonal
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/deed.no
dc.title: OCAP: On-device Class-Aware Pruning for personalized edge DNN models (en_US)
dc.title.alternative: OCAP: On-device Class-Aware Pruning for personalized edge DNN models (en_US)
dc.type: Journal article (en_US)
dc.type: Peer reviewed (en_US)
dc.description.version: acceptedVersion (en_US)
dc.source.pagenumber: 15 (en_US)
dc.source.volume: 142 (en_US)
dc.source.journal: Journal of systems architecture (en_US)
dc.identifier.doi: 10.1016/j.sysarc.2023.102956
dc.identifier.cristin: 2168771
cristin.ispublished: true
cristin.fulltext: postprint
cristin.qualitycode: 1


