Page 53 - 2023

companies based on the market capitalization from various authorized online sources.
        After gathering news feeds and Twitter tweets, the polarity and subjectivity scores are
        calculated using various methods such as CountVectorizer, TF-IDF, and Word2Vec, of which
        TF-IDF gives the best result. A Spark NLP based pipeline was implemented, and three
        machine learning algorithms (Naïve Bayes Classifier, Multi-Class Logistic Regression, and
        Random Forest Classifier) were applied to compare their performance. Two libraries, VADER
        and TextBlob, are used for data labeling and sentiment analysis.
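
        As an illustration of the TF-IDF weighting mentioned above, the following is a minimal
        sketch in plain Python. It is not the Spark NLP pipeline used in this work: the
        whitespace tokenizer, the smoothed IDF variant, and the example headlines are all
        assumptions made for the illustration.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Assumption: raw term counts for TF and the smoothed IDF
    log((N + 1) / (df + 1)); other variants exist.
    """
    # Naive whitespace tokenization (an assumption, not the source's tokenizer).
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)
        weights.append({
            term: count * math.log((n_docs + 1) / (df[term] + 1))
            for term, count in tf.items()
        })
    return weights


# Hypothetical headlines, for illustration only.
headlines = [
    "shares rally after strong results",
    "shares fall after weak results",
    "market flat ahead of results",
]
vectors = tf_idf(headlines)
```

        A term that occurs in every document (here "results") receives zero weight, while a
        term confined to one document (here "rally") receives the largest weight.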

        After this task is completed, an analytical study known as technical analysis is carried
        out. In this phase, the numerical historical data for the top 100 BSE companies by market
        capitalization are gathered from the official BSE portal. The input data are then
        pre-processed to validate them statistically and to calculate various technical
        indicators for the input parameters.
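
        The technical indicators are not enumerated here. As a hedged example of what such a
        calculation can look like, the sketch below computes two widely used indicators, a
        simple moving average and a relative strength index based on plain averages. The choice
        of indicators, the function names, and the closing prices are assumptions made for the
        illustration.

```python
def simple_moving_average(prices, window):
    """Mean of each run of `window` consecutive prices."""
    if window <= 0 or window > len(prices):
        raise ValueError("window must be between 1 and len(prices)")
    return [
        sum(prices[i - window:i]) / window
        for i in range(window, len(prices) + 1)
    ]


def relative_strength_index(prices, period):
    """RSI over the last `period` price changes.

    Uses plain sums of gains and losses (not Wilder's smoothing).
    """
    if period <= 0 or len(prices) < period + 1:
        raise ValueError("need at least period + 1 prices")
    # Differences between consecutive prices in the last `period` steps.
    changes = [b - a for a, b in zip(prices[-period - 1:-1], prices[-period:])]
    gains = sum(c for c in changes if c > 0)
    losses = sum(-c for c in changes if c < 0)
    if gains == 0 and losses == 0:
        return 50.0  # flat window: treated as neutral (an assumption)
    if losses == 0:
        return 100.0  # no down moves in the window
    rs = gains / losses
    return 100.0 - 100.0 / (1.0 + rs)


# Hypothetical closing prices, for illustration only.
closes = [100.0, 102.0, 101.0, 104.0, 103.0, 105.0]
sma = simple_moving_average(closes, window=3)
rsi = relative_strength_index(closes, period=5)
```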

        Finally, a predictive model is implemented. The proposed model predicts intraday stock
        price movement on the basis of historical stock data, sentiment scores (the impact of
        textual information on the stock), and various technical indicators. The model predicts
        the class of stock price movement as Up, Down, or Neutral, and automatically suggests
        whether the intraday price movement of a specific stock will be upward, downward, or
        unaffected, given current and past scenarios. A combined big data analytics and machine
        learning approach is applied to optimize the performance of the model. Among the neural
        network techniques, the Multi-layer Perceptron classifier is identified as the
        best-performing technique, and the parameter values for model implementation are
        identified through a parameter tuning comparison.
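
        The three-class target described above can be sketched as follows. How the Neutral
        class is defined is not stated here, so the dead band `threshold`, the use of open and
        close prices, and the function name are hypothetical choices made only for the
        illustration.

```python
def label_movement(open_price, close_price, threshold=0.001):
    """Classify an intraday move as 'Up', 'Down', or 'Neutral'.

    `threshold` is a hypothetical dead band on the relative change;
    moves smaller than it in either direction count as Neutral.
    """
    if open_price <= 0:
        raise ValueError("open_price must be positive")
    change = (close_price - open_price) / open_price
    if change > threshold:
        return "Up"
    if change < -threshold:
        return "Down"
    return "Neutral"


# Hypothetical (open, close) pairs, for illustration only.
labels = [
    label_movement(100.0, 101.0),
    label_movement(100.0, 99.0),
    label_movement(100.0, 100.05),
]
```

        Labels of this kind would serve as the target column for a classifier such as the
        Multi-layer Perceptron, with the sentiment scores and technical indicators as features.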

        Key words: Stock Market, Big Data, SparkNLP, MLlib, ANN


























