Page 30 - 2023

Ph.D.
                                                                               (Engineering & Technology)
        EFFICIENT MULTI-DOMAIN ADAPTATION IN SENTIMENT
        ANALYSIS USING MACHINE LEARNING AND CROSS DOMAIN
        SEMANTIC LIBRARY
        Ph.D. Scholar : Patel Dipakkumar Chinubhai
        Research Supervisor : Dr. Kiran R. Amin



                                                                              Regi. No.: 17276341004
        Abstract :
        The rapid growth of the internet has made data generation nearly effortless. These data
        take the form of web pages, blogs, emails, posts on social networks, or anything else
        uploaded to the internet. Techniques are needed to retrieve valuable information from
        this vast store of data. Classification is a machine learning technique for automatically
        assigning data to specified categories.
        Sentiment Analysis (SA) is a classification problem in which user-generated data are
        assigned to one of two classes (negative or positive). Sentiment Analysis is implemented
        using either machine learning techniques or lexicon-oriented techniques. Machine learning
        approaches have attracted researchers because of their accuracy, simplicity, and
        adaptability. Traditional sentiment analysis techniques are trained on one topic (also
        called the domain) and tested on the same topic.
        The domain on which the machine is trained is called the source domain, and the domain
        on which it is tested is called the target domain. Labelled data are sometimes unavailable
        in the target domain. Traditional SA models cannot cope with this lack of labels: their
        accuracy degrades sharply when they are trained on one domain (the source) and then
        classify data from a different domain (the target) for which no labels are available.
        This situation is known as domain adaptation. To improve classification accuracy, the
        machine would need to be trained on data from the target domain, but labelling each new
        domain is difficult and time-consuming. In semi-supervised techniques, the training and
        target data normally come from the same domain; here, the source and target domains
        differ. Hence, a domain adaptation technique is needed to avoid the labelling problem and
        make the machine general enough to classify data from a domain on which it was not
        trained. The similarity measure plays a vital role
        in domain adaptation for selecting important pivot (common) features from the target
        domain that match the source domains. Matching target domain features to source domain
        features in this way helps to assign class labels to some of the target domain features.
        Our research work proposes an enhanced cross entropy measure to match the distribution of
        the source domains' data with that of the target domain and to find new target
        domain-specific features based on some common (pivot) features.
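
        The two ideas above, pivot features and a cross entropy comparison of source and target
        distributions, can be illustrated with a minimal sketch. It uses the standard
        (unenhanced) cross entropy over smoothed unigram distributions, not the enhanced measure
        proposed in the thesis, whose definition is not given in this abstract. The function
        names, the min_count threshold, the smoothing constant alpha, and the toy reviews are
        all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(doc):
    """Whitespace tokenizer (an assumption; any tokenizer could be used)."""
    return doc.lower().split()


def pivot_features(source_docs, target_docs, min_count=2):
    """Words that occur at least min_count times in BOTH domains."""
    src = Counter(tok for d in source_docs for tok in tokenize(d))
    tgt = Counter(tok for d in target_docs for tok in tokenize(d))
    return sorted(w for w in src if src[w] >= min_count and tgt[w] >= min_count)


def unigram_distribution(docs, vocab, alpha=1.0):
    """Laplace-smoothed unigram distribution over a fixed vocabulary."""
    counts = Counter(tok for d in docs for tok in tokenize(d) if tok in vocab)
    total = sum(counts.values()) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def cross_entropy(p, q):
    """Standard cross entropy H(p, q) = -sum_w p(w) * log q(w), in nats."""
    return -sum(p[w] * math.log(q[w]) for w in p)


# Hypothetical toy data: book reviews (source) and blender reviews (target).
source_docs = [
    "great book excellent plot",
    "boring book terrible plot",
    "excellent story great read",
]
target_docs = [
    "great blender excellent motor",
    "terrible blender noisy motor",
    "great value excellent blender",
]

vocab = {tok for d in source_docs + target_docs for tok in tokenize(d)}
p_source = unigram_distribution(source_docs, vocab)
p_target = unigram_distribution(target_docs, vocab)

print("pivot features:", pivot_features(source_docs, target_docs))
print("H(source, source):", cross_entropy(p_source, p_source))
print("H(source, target):", cross_entropy(p_source, p_target))
```

        By Gibbs' inequality, H(source, target) is never smaller than H(source, source), so the
        gap between the two values gives a rough indication of how far the target domain's
        vocabulary has drifted from the source domain's.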