Page 53 - 2023
companies based on the market capitalization from various authorized online sources.
After the news feeds and tweets are gathered, polarity and subjectivity scores are
calculated using various methods such as CountVectorizer, TF-IDF, and Word2Vec, of which
TF-IDF gives the best result. A Spark NLP based pipeline was implemented, and three
machine learning algorithms (Naïve Bayes Classifier, Multi-Class Logistic Regression, and
Random Forest Classifier) were applied to compare performance. Two libraries, VADER and
TextBlob, are used for data labeling and sentiment analysis.
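As an illustration of the TF-IDF weighting mentioned above, the following is a minimal sketch in plain Python, not the Spark NLP pipeline used in the study. The function name tf_idf, the whitespace tokenizer, the unsmoothed formula, and the sample headlines are all assumptions made for illustration; library implementations typically apply smoothing and more careful tokenization.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    tf  = term count / document length
    idf = log(N / number of documents containing the term)
    This is the plain textbook form; it is an assumption, not the
    exact formula used by any particular library.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical headlines, invented for illustration only.
headlines = [
    "shares rally after strong results",
    "shares fall after weak results",
    "shares steady",
]
scores = tf_idf(headlines)
```

A term that occurs in every document, such as "shares" here, receives zero weight, while a term confined to one headline, such as "rally", is weighted highest. This is the property that lets TF-IDF emphasize distinctive words over common ones.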
After this task is completed, an analytical study known as technical analysis is
carried out. In this phase, numerical historical data for the BSE top 100 companies
by market capitalization are gathered from the official BSE portal.
The input data are then pre-processed to validate them statistically, and
various technical indicators are calculated for the input parameters.
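The text does not list which technical indicators are calculated. As one common example, a simple moving average over closing prices could be computed as sketched below. The function name simple_moving_average, the window of 3, and the sample prices are assumptions for illustration only.

```python
def simple_moving_average(prices, window):
    """Return the simple moving average of `prices` over `window` points.

    The result has len(prices) - window + 1 entries; the first entry
    averages prices[0:window].
    """
    if window <= 0:
        raise ValueError("window must be positive")
    averages = []
    running = 0.0
    for i, price in enumerate(prices):
        running += price
        if i >= window:
            # Drop the price that has just left the window.
            running -= prices[i - window]
        if i >= window - 1:
            averages.append(running / window)
    return averages


# Hypothetical closing prices, invented for illustration only.
closes = [100.0, 102.0, 101.0, 105.0, 107.0]
sma3 = simple_moving_average(closes, 3)
```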
Finally, a predictive model is implemented. The proposed model predicts intraday
stock price movement on the basis of historical stock data, sentiment scores (the
impact of textual information on the stock), and various technical indicators. The
model predicts the class of stock price movement as Up, Down, or Neutral. It also
suggests automatically whether the intraday price movement of a specific stock will be
upward, downward, or no effect, given current and past scenarios. A combined big data
analytics and machine learning approach is applied to optimize the performance of the
model. Among the neural network techniques, the Multi-layer Perceptron classifier is
identified as the best-performing technique, and the parameter values for model
implementation are identified through a parameter tuning comparison.
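The three-class target (Up, Down, Neutral) could be derived from intraday prices along the following lines. This is a sketch under stated assumptions: the text does not say how the classes are defined, so the function name label_movement, the use of open and close prices, and the 0.5 percent threshold are all hypothetical.

```python
def label_movement(open_price, close_price, threshold_pct=0.5):
    """Classify an intraday move as "Up", "Down", or "Neutral".

    threshold_pct is a hypothetical cut-off (in percent); moves whose
    magnitude does not exceed it are treated as "Neutral".
    """
    if open_price <= 0:
        raise ValueError("open_price must be positive")
    change_pct = (close_price - open_price) / open_price * 100.0
    if change_pct > threshold_pct:
        return "Up"
    if change_pct < -threshold_pct:
        return "Down"
    return "Neutral"


# Hypothetical (open, close) pairs, invented for illustration only.
labels = [label_movement(o, c) for o, c in [(100.0, 101.0), (100.0, 99.0), (100.0, 100.2)]]
```

Labels produced this way would serve as the target column for a classifier such as the Multi-layer Perceptron named above, with the sentiment scores and technical indicators as input features.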
Key words: Stock Market, Big Data, SparkNLP, MLlib, ANN