dc.contributor.author: Patel, Vrushang
dc.date.accessioned: 2021-08-23T17:24:31Z
dc.date.available: 2021-08-23T17:24:31Z
dc.date.issued: 2021-08-13
dc.identifier.citation: Patel, Vrushang. Short Text Classification with Tolerance Near Sets; A Dissertation Submitted in Partial Fulfillment of the Requirements for the Degree of Master of Science in the Department of Applied Computer Science. Winnipeg: University of Winnipeg, 2021. DOI: 10.36939/ir.202108231232. [en_US]
dc.identifier.uri: https://hdl.handle.net/10680/1962
dc.description.abstract: Text classification is a classical machine learning application in Natural Language Processing, which aims to assign labels to textual units such as documents, sentences, paragraphs, and queries. Applications of text classification include sentiment classification and news categorization. Sentiment classification identifies the polarity of text (positive, negative, or neutral) based on textual features. In this thesis, we implemented a modified form of a tolerance-based algorithm (TSC) to classify the sentiment polarity of tweets as well as news categories from text. The TSC algorithm is a supervised algorithm designed to perform short text classification with tolerance near sets (TNS). The proposed TSC algorithm uses vectors from a pre-trained SBERT model to create tolerance classes. The effectiveness of the TSC algorithm has been demonstrated by testing it on ten well-researched datasets. One of the datasets (Covid-Sentiment) was hand-crafted from tweets expressing opinions related to COVID. Experiments demonstrate that, measured by weighted F1-score, TSC outperforms five classical ML algorithms on one dataset and is comparable to them on all other datasets. [en_US]
dc.language.iso: en [en_US]
dc.publisher: University of Winnipeg Library [en_US]
dc.rights: info:eu-repo/semantics/openAccess [en_US]
dc.subject: Sentiment Classification [en_US]
dc.subject: Machine Learning [en_US]
dc.subject: Tolerance Near Sets [en_US]
dc.subject: Transformer [en_US]
dc.subject: Natural Language Processing [en_US]
dc.title: Short Text Classification with Tolerance Near Sets [en_US]
dc.type: Thesis [en_US]
dc.description.degree: Master of Science in Applied Computer Science [en_US]
dc.publisher.grantor: University of Winnipeg [en_US]
dc.identifier.doi: https://doi.org/10.36939/ir.202108231232