Classification of Maintenance Reports - Statistical NLP meets the Oil & Gas Industry

dc.contributor.advisor	Gulla, Jon Atle
dc.contributor.author	Selvig, Ole Christer Andre Asikainen
dc.date.accessioned	2019-09-11T10:55:40Z
dc.date.created	2018-07-09
dc.date.issued	2018
dc.identifier	ntnudaim:17793
dc.identifier.uri	http://hdl.handle.net/11250/2615793
dc.description.abstract	Several problematic data characteristics were revealed, such as multilingual reports and significant class imbalances. While no consistent scheme for conducting data preparation was found, several techniques recurred in the most promising experiments. Of the three classifiers tested (Naive Bayes, Support Vector Machines, and Random Forest), Support Vector Machines was the best overall choice, being the only classifier to generalize well beyond the observed data. The various resampling techniques decreased overall performance, which suggests that they introduced more noise instead.	en
dc.language	eng
dc.publisher	NTNU
dc.subject	Informatics, Artificial Intelligence	en
dc.title	Classification of Maintenance Reports - Statistical NLP meets the Oil & Gas Industry	en
dc.type	Master thesis	en
dc.source.pagenumber	107
dc.contributor.department	Norges teknisk-naturvitenskapelige universitet, Fakultet for informasjonsteknologi og elektroteknikk, Institutt for datateknologi og informatikk	nb_NO
dc.date.embargoenddate	10000-01-01