Localizing and identifying key features of a standardized electronic component datasheet utilizing object detection and natural language processing

Rønning, Gaute Wierød

dc.contributor.advisor	Sharma, Kshitij
dc.contributor.author	Rønning, Gaute Wierød
dc.date.accessioned	2023-10-20T17:20:31Z
dc.date.available	2023-10-20T17:20:31Z
dc.date.issued	2023
dc.identifier	no.ntnu:inspera:142737689:37361552
dc.identifier.uri	https://hdl.handle.net/11250/3097877
dc.description	Full text not available
dc.description.abstract	This study presents a software solution that uses and experiments with state-of-the-art object detection models and natural language processing to locate and extract points of interest in electronic component datasheets. The overall goal of the thesis is to make the use of datasheets in the production process more efficient and to improve information flow in production for the assembly of circuit boards, electronic devices, and other components. In addition, the software solution will provide a simple way to store and format the data in a datasheet in external systems, for example a product lifecycle management system. Most processing is done manually today, with considerable variation in formatting and information, which lends itself to a classification problem. The resulting solution built during this master's thesis can process a PDF document and find points of interest and critical linguistic information in these regions, using YOLOv8 and BERT models combined with optical character recognition.
dc.description.abstract	This study presents a pipeline that uses and experiments with state-of-the-art object detection models and natural language processing to find and extract points of interest in electronic component datasheets. The overarching goal of the thesis is to streamline the use of datasheets in the production process and to improve the information flow in production for printed circuit boards, electronic assemblies, and other components. In addition, the pipeline provides an easy way to save and format the data contained in datasheets in external systems, for instance a product lifecycle management system. Most of this processing is done manually today, and datasheets vary significantly in formatting and information, which makes the task suitable for treatment as a classification problem. The resulting pipeline constructed during this thesis can process an entire PDF document and find its points of interest and the vital linguistic information within those regions, using YOLOv8 and BERT models combined with optical character recognition.
dc.language	eng
dc.publisher	NTNU
dc.title	Localizing and identifying key features of a standardized electronic component datasheet utilizing object detection and natural language processing
dc.type	Master thesis

This item appears in the following Collection(s)

Institutt for datateknologi og informatikk [6808]
