Show simple item record

dc.contributor.author: Ma, Ye-Da
dc.contributor.author: Zhao, Zhi-Chao
dc.contributor.author: Liu, Di
dc.contributor.author: He, Zhenli
dc.contributor.author: Zhou, Wei
dc.date.accessioned: 2024-06-25T10:52:36Z
dc.date.available: 2024-06-25T10:52:36Z
dc.date.created: 2023-08-22T13:49:18Z
dc.date.issued: 2023
dc.identifier.citation: Journal of systems architecture. 2023, 142. (en_US)
dc.identifier.issn: 1383-7621
dc.identifier.uri: https://hdl.handle.net/11250/3135695
dc.description.abstract: In this paper, we propose a new on-device class-aware pruning method for edge systems, namely OCAP. The motivation is that Deep Neural Network (DNN) models are usually trained on a large dataset so that they can learn diverse features and generalize to accurately predict numerous classes. Prior works reveal that some features (channels) are only related to certain classes, and edge systems are usually deployed in a specific environment where the classes the system detects are limited. As a result, deploying a generally trained model in a specific edge environment leads to unnecessary redundancy, while transferring data and models to the cloud for personalization raises privacy issues. Thus, an on-device class-aware pruning method can remove the channels that are irrelevant to the classes the edge system mostly observes, thereby reducing the model's Floating Point Operations (FLOPs), memory footprint, and latency, improving energy efficiency, and keeping relatively high accuracy for the observed classes while protecting in-situ data privacy. OCAP introduces a novel class-aware pruning method based on the intermediate activations of input images to identify class-irrelevant channels. Moreover, we propose a method based on KL-divergence to select diverse and representative data for effectively fine-tuning the pruned model. The experimental results show the effectiveness and efficiency of OCAP: in comparison with state-of-the-art class-aware pruning methods, OCAP achieves better accuracy and a higher compression ratio. Additionally, we evaluate OCAP on the Nvidia Jetson Nano, Nvidia Jetson TX2, and Nvidia Jetson AGX Xavier in terms of efficiency, and the experimental results demonstrate the applicability of OCAP on edge systems. The code is available at https://github.com/mzd2222/OCAP. (en_US)
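For orientation only, the two ideas named in the abstract (scoring channels by their intermediate activations on locally observed classes, and KL-divergence-based selection of fine-tuning data) can be sketched roughly as below. This is a minimal, hypothetical PyTorch sketch under stated assumptions, not the authors' implementation: the function names, the per-channel mean-activation score, and the use of a single reference distribution are assumptions, and the actual method is in the linked repository.

```python
# Hypothetical sketch, NOT the OCAP implementation
# (see https://github.com/mzd2222/OCAP for the real code).
import torch


def score_channels_by_observed_classes(model, conv_layer, loader, device="cpu"):
    """Average per-channel activation magnitude over locally observed images.

    Channels with low scores would be candidates for pruning; the criterion
    in the paper differs in detail.
    """
    batch_scores = []

    def hook(_module, _inputs, output):
        # Conv output has shape (N, C, H, W); reduce to a per-channel score.
        batch_scores.append(output.detach().abs().mean(dim=(2, 3)))

    handle = conv_layer.register_forward_hook(hook)
    model.eval()
    with torch.no_grad():
        for images, _labels in loader:
            model(images.to(device))
    handle.remove()
    return torch.cat(batch_scores, dim=0).mean(dim=0)  # shape (C,)


def select_diverse_samples(candidate_probs, reference_probs, k, eps=1e-8):
    """Pick the k candidates whose predicted class distributions diverge most
    (by KL divergence) from a reference distribution, as a stand-in for the
    paper's diversity/representativeness selection."""
    p = candidate_probs.clamp_min(eps)   # (N, num_classes)
    q = reference_probs.clamp_min(eps)   # (num_classes,), broadcasts over N
    kl = (p * (p / q).log()).sum(dim=1)  # per-sample KL(p || q)
    return kl.topk(k).indices
```

How the scored channels are removed (structured pruning) and how the selected samples are used for fine-tuning are specified in the paper and repository, not in this sketch.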
dc.language.iso: eng (en_US)
dc.publisher: Elsevier (en_US)
dc.rights: Navngivelse 4.0 Internasjonal
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/deed.no
dc.title: OCAP: On-device Class-Aware Pruning for personalized edge DNN models (en_US)
dc.title.alternative: OCAP: On-device Class-Aware Pruning for personalized edge DNN models (en_US)
dc.type: Journal article (en_US)
dc.type: Peer reviewed (en_US)
dc.description.version: acceptedVersion (en_US)
dc.source.pagenumber: 15 (en_US)
dc.source.volume: 142 (en_US)
dc.source.journal: Journal of systems architecture (en_US)
dc.identifier.doi: 10.1016/j.sysarc.2023.102956
dc.identifier.cristin: 2168771
cristin.ispublished: true
cristin.fulltext: postprint
cristin.qualitycode: 1


