Vis enkel innførsel

dc.contributor.authorHashmi, Ehtesham
dc.contributor.authorYildirim-Yayilgan, Sule
dc.contributor.authorShaikh, Sarang
dc.date.accessioned2024-04-23T07:46:12Z
dc.date.available2024-04-23T07:46:12Z
dc.date.created2024-04-19T19:49:44Z
dc.date.issued2024
dc.identifier.issn1869-5450
dc.identifier.urihttps://hdl.handle.net/11250/3127710
dc.description.abstractPeople in the modern digital era are increasingly embracing social media platforms to express their concerns and emotions in the form of reviews or comments. While positive interactions within diverse communities can considerably enhance confidence, it is critical to recognize that negative comments can hurt people’s reputations and well-being. Currently, individuals tend to express their thoughts in their native languages on these platforms, which is quite challenging due to potential syntactic ambiguity in these languages. Most of the research has been conducted for resource-aware languages like English. However, low-resource languages such as Urdu, Arabic, and Hindi present challenges due to limited linguistic resources, making information extraction labor-intensive. This study concentrates on code-mixed languages, including three types of text: English, Roman Urdu, and their combination. This study introduces robust transformer-based algorithms to enhance sentiment prediction in code-mixed text, which is a combination of Roman Urdu and English in the same context. Unlike conventional deep learning-based models, transformers are adept at handling syntactic ambiguity, facilitating the interpretation of semantics across various languages. We used state-of-the-art transformer-based models like Electra, code-mixed BERT (cm-BERT), and Multilingual Bidirectional and Auto-Regressive Transformers (mBART) to address sentiment prediction challenges in code-mixed tweets. Furthermore, results reveal that mBART outperformed the Electra and cm-BERT models for sentiment prediction in code-mixed text with an overall F1-score of 0.73. In addition to this, we also perform topic modeling to uncover shared characteristics within the corpus and reveal patterns and commonalities across different classes.en_US
dc.language.isoengen_US
dc.publisherSpringeren_US
dc.rightsNavngivelse 4.0 Internasjonal*
dc.rights.urihttp://creativecommons.org/licenses/by/4.0/deed.no*
dc.titleAugmenting sentiment prediction capabilities for code-mixed tweets with multilingual transformersen_US
dc.title.alternativeAugmenting sentiment prediction capabilities for code-mixed tweets with multilingual transformersen_US
dc.typeJournal articleen_US
dc.typePeer revieweden_US
dc.description.versionpublishedVersionen_US
dc.source.journalSocial Network Analysis and Miningen_US
dc.identifier.doi10.1007/s13278-024-01245-6
dc.identifier.cristin2263139
cristin.ispublishedtrue
cristin.fulltextoriginal
cristin.qualitycode1


Tilhørende fil(er)

Thumbnail

Denne innførselen finnes i følgende samling(er)

Vis enkel innførsel

Navngivelse 4.0 Internasjonal
Med mindre annet er angitt, så er denne innførselen lisensiert som Navngivelse 4.0 Internasjonal