Page 30 - 2023
Ph.D.
(Engineering & Technology)
EFFICIENT MULTI-DOMAIN ADAPTATION IN SENTIMENT
ANALYSIS USING MACHINE LEARNING AND CROSS DOMAIN
SEMANTIC LIBRARY
Ph.D. Scholar : Patel Dipakkumar Chinubhai
Research Supervisor : Dr. Kiran R. Amin
Regi. No.: 17276341004
Abstract :
The rapid growth of the internet has made data generation effortless. These data can
take the form of web pages, blogs, emails, posts on various social networks, or anything
else that is uploaded to the internet. A technique is needed to retrieve valuable
information from this vast data store. Classification is a machine learning technique
for automatically categorizing data into specified categories.
Sentiment Analysis (SA) is a classification problem in which user-generated data are
assigned to one of two classes (negative or positive). Sentiment analysis is
implemented using machine learning techniques and lexicon-based techniques. Owing to
their accuracy, simplicity, and adaptability, machine learning approaches have attracted
researchers. Traditional sentiment analysis techniques are trained on one topic (also
called the domain) and tested on the same topic.
The domain on which the machine is trained is called the source domain, and the domain
on which it is tested is called the target domain. Sometimes labelled data are not
available in the target domain. Traditional SA models cannot deal with this missing
labelled data, and their accuracy degrades substantially when they are trained on one
domain (the source domain) and used to classify data from a different domain (the
target domain, for which labels are not available). This situation is known as domain
adaptation. To improve the classification
accuracy, the machine needs to be trained on the corresponding target domain data, but
labelling each new domain is a difficult and time-consuming task. In semi-supervised
techniques, the training and target data normally come from the same domain. Here, the
source and target domains differ. Hence, a domain adaptation technique is needed to
solve the data labelling problem and to make the machine general enough to classify
data from a domain on which it was not trained. The similarity measure plays a vital role
in domain adaptation by selecting important pivot (common) features from the target
domain that match those of the source domains. This similarity matching of target domain
features to source domain features helps to assign class labels to some of the features
of the target domain. Our research work proposes an enhanced cross-entropy measure to
match the distribution of the source domains' data with that of the target domain and to
find new target domain-specific features based on some common (pivot) features.
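
The idea of comparing source and target distributions and picking pivot features can be illustrated with a minimal sketch. It uses the standard cross entropy H(p, q) = -sum_w p(w) log q(w) over smoothed word distributions, not the enhanced measure proposed in the thesis, whose definition is not given in this abstract. The function names, the smoothing parameter alpha, the min_count threshold, and the toy reviews are all assumptions made for illustration.

```python
import math
from collections import Counter


def word_distribution(docs, vocab, alpha=1.0):
    """Laplace-smoothed unigram distribution over a shared vocabulary (assumed setup)."""
    counts = Counter(w for doc in docs for w in doc.lower().split())
    total = sum(counts[w] for w in vocab) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def cross_entropy(p, q):
    """Standard cross entropy H(p, q) = -sum_w p(w) * log q(w)."""
    return -sum(p[w] * math.log(q[w]) for w in p)


def pivot_features(source_docs, target_docs, min_count=2):
    """Hypothetical pivot rule: words seen at least min_count times in both domains."""
    src = Counter(w for d in source_docs for w in d.lower().split())
    tgt = Counter(w for d in target_docs for w in d.lower().split())
    return sorted(w for w in src if src[w] >= min_count and tgt[w] >= min_count)


# Toy, made-up reviews (not data from the thesis).
source_docs = ["great plot great acting", "poor plot poor acting"]        # e.g. movies
target_docs = ["great battery great screen", "poor battery poor screen"]  # e.g. phones

vocab = sorted({w for d in source_docs + target_docs for w in d.lower().split()})
p_target = word_distribution(target_docs, vocab)
q_source = word_distribution(source_docs, vocab)

pivots = pivot_features(source_docs, target_docs)  # words shared by both domains
h_ts = cross_entropy(p_target, q_source)           # target measured against source
h_tt = cross_entropy(p_target, p_target)           # entropy of the target itself
```

In this toy setting the shared sentiment words act as pivots, and the gap between h_ts and h_tt gives a rough indication of how far the source distribution is from the target one. A smaller gap would suggest that a source-trained classifier transfers more readily.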