
dc.contributor.advisor: Kulahci, Murat
dc.contributor.advisor: Tyssedal, John Sølve
dc.contributor.author: Cacciarelli, Davide
dc.date.accessioned: 2024-04-26T05:44:23Z
dc.date.available: 2024-04-26T05:44:23Z
dc.date.issued: 2024
dc.identifier.isbn: 978-82-326-7963-8
dc.identifier.issn: 2703-8084
dc.identifier.uri: https://hdl.handle.net/11250/3128141
dc.description: Double Ph.D. degree program between the Technical University of Denmark (DTU), Department of Applied Mathematics and Computer Science, and the Norwegian University of Science and Technology (NTNU), Department of Mathematical Sciences
dc.description.abstract: As businesses increasingly rely on machine learning models to make informed decisions, the ability to develop accurate and reliable models is critical. However, in many industrial contexts, data annotation represents a major bottleneck to the training and deployment of predictive models. This thesis focuses on data-efficient strategies for developing machine learning models in label-scarce settings. The increasing availability of unlabeled data in various applications has led to the need for efficient methods that minimize the cost associated with collecting labeled observations. Traditional active learning approaches, such as pool-based methods, have been extensively studied, but the emergence of data streams has necessitated the development of stream-based active learning strategies able to select the most informative observations from data streams in real time. The thesis begins with a survey of active learning, providing an overview of recently proposed approaches for selecting informative observations from data streams. It presents the strengths and limitations of the state of the art and discusses the challenges and opportunities that arise in this area of research. Next, the thesis presents a novel stream-based active learning strategy for linear models inspired by optimal experimental design theory. By setting a threshold on the informativeness of unlabeled data points, the proposed strategy enables the learner to decide in real time whether to label an instance or discard it. Then, the thesis investigates the robustness of online active learning in the presence of outliers and irrelevant features. The thesis also provides initial results related to an adaptive sampling scheme for drifting regression data streams. Finally, the thesis presents a stream-based active distillation framework for developing lightweight yet powerful object detection models. This approach combines active learning and knowledge distillation, allowing a compact student model to be fine-tuned using pseudo-labels generated by a large pre-trained teacher model. Overall, this thesis contributes to the field of stream-based active learning by providing insights into various techniques and addressing concerns related to robustness and scalability. The findings expand the potential applications of active learning in real-time data streams and pave the way for more efficient and effective model development. [en_US]
dc.language.iso: eng [en_US]
dc.publisher: NTNU [en_US]
dc.relation.ispartofseries: Doctoral theses at NTNU;2024:186
dc.relation.haspart: Paper 1: Cacciarelli, Davide; Kulahci, Murat. Sampling strategies for industrial applications through active learning [en_US]
dc.relation.haspart: Paper 2: Cacciarelli, Davide; Kulahci, Murat. Active learning for data streams: a survey. Machine Learning 2023; Volume 113, pp. 185-239. https://doi.org/10.1007/s10994-023-06454-2 This article is licensed under a Creative Commons Attribution 4.0 International License (CC BY). [en_US]
dc.relation.haspart: Paper 3: Cacciarelli, Davide; Kulahci, Murat; Tyssedal, John Sølve. Stream-based active learning with linear models. Knowledge-Based Systems 2022; Volume 254. https://doi.org/10.1016/j.knosys.2022.109664 This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/). [en_US]
dc.relation.haspart: Paper 4: Cacciarelli, Davide; Kulahci, Murat; Tyssedal, John Sølve. Robust online active learning. Quality and Reliability Engineering International 2023; Volume 40(1), pp. 277-296. https://doi.org/10.1002/qre.3392 This is an open access article under the terms of the Creative Commons Attribution License (CC BY). [en_US]
dc.relation.haspart: Paper 5: Stream-Based Active Learning for Regression with Dynamic Feature Selection. https://doi.org/10.1109/TransAI60598.2023.00030 © 2023 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. [en_US]
dc.relation.haspart: Paper 6: Dani Manjah, Davide Cacciarelli, Baptiste Standaert, Mohamed Benkedadra, Gauthier Rotsart de Hertaing, Benoît Macq, Stéphane Galland, Christophe De Vleeschouwer. Stream-Based Active Distillation for Scalable Model Deployment. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2023, pp. 4999-5007 [en_US]
dc.title: Active Learning for Data Streams [en_US]
dc.type: Doctoral thesis [en_US]
dc.subject.nsi: VDP::Mathematics and natural science: 400::Mathematics: 410 [en_US]



