Twitter Sentiment Analysis of COVID-19 Vaccination Integrating SenticNet-7 and SentiWordNet-Adjusted VADER Models
Abstract
Social media platforms continually generate massive volumes of data that offer deep insight into human thought, behaviour, and trends. Platforms such as Twitter have changed how people communicate and share ideas and opinions, and they have produced an abundance of data that is ready for analysis. Extracting relevant information from this vast, unstructured pool is both crucial and difficult, and manual extraction faces many obstacles. Data mining methodologies address this problem by applying statistical methods and algorithms to extract patterns, connections, and insights from large databases. Text mining, a subfield of data mining, focuses on retrieving knowledge from unstructured textual data, especially user-generated content on social networking sites such as posts, comments, reviews, and tweets. By utilising machine learning, linguistic analysis, and natural language processing, text mining techniques enable sentiment analysis, topic extraction, and the extraction of other important information from social media text. The objective of this study is to improve the accuracy of sentiment classification in social media mining, with a focus on Twitter data. This is accomplished by combining SentiWordNet-Adjusted VADER Sentiment Analysis with SenticNet-7, a sentiment dictionary with a medical focus, in an approach referred to as SAVSA-SN7. SAVSA-SN7 first uses SentiWordNet to assign sentiment scores to the individual words in a tweet.
Then, the Valence Aware Dictionary and sEntiment Reasoner (VADER) sentiment analyzer is used to refine the sentiment scores. SenticNet-7, which is customised for the medical sector, is also included to account for sentiment nuances specific to that domain. The experimental results show that this combined method is effective, especially on the brief texts common on Twitter, where sentiment can vary greatly with context. The evaluation of the proposed methodology highlights its accuracy and its ability to capture sentiment and generate useful recommendations for decision-making. By integrating data mining, text mining, and social media mining techniques, this research contributes to advancing sentiment analysis, particularly for Twitter data.
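The three-stage idea summarised in the abstract (word-level prior scores, a rule-based refinement, and a domain lexicon) can be illustrated with a minimal sketch. The abstract does not state how the three scores are combined, so the weighted average, the weights, the single-word negation rule, and the ±0.05 decision threshold below are all assumptions made for illustration. The three dictionaries are tiny invented stand-ins, not entries from SentiWordNet, VADER, or SenticNet-7, and the function names `score_tweet` and `label` are hypothetical.

```python
import re

# Toy stand-ins for the three resources named in the abstract. The words and
# scores are invented for illustration only; they are NOT taken from
# SentiWordNet, VADER, or SenticNet-7.
WORD_PRIOR = {"safe": 0.5, "effective": 0.6, "pain": -0.4, "scared": -0.6}
RULE_VALENCE = {"safe": 0.4, "effective": 0.5, "pain": -0.6, "scared": -0.5}
DOMAIN_POLARITY = {"vaccine": 0.1, "dose": 0.0, "fever": -0.5, "immunity": 0.7}

# Assumed negation cues; a negation flips the next scored word.
NEGATIONS = {"not", "no", "never"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens (apostrophes allowed)."""
    return re.findall(r"[a-z']+", text.lower())


def score_tweet(text, w_prior=0.4, w_rule=0.4, w_domain=0.2):
    """Average a weighted blend of the three lexicon scores over scored words.

    The weights are assumptions; the source does not specify them.
    Returns 0.0 when no token appears in any lexicon.
    """
    total = 0.0
    hits = 0
    negate = False
    for tok in tokenize(text):
        if tok in NEGATIONS:
            negate = True
            continue
        known = tok in WORD_PRIOR or tok in RULE_VALENCE or tok in DOMAIN_POLARITY
        if not known:
            continue  # unscored word: leave any pending negation in place
        score = (
            w_prior * WORD_PRIOR.get(tok, 0.0)
            + w_rule * RULE_VALENCE.get(tok, 0.0)
            + w_domain * DOMAIN_POLARITY.get(tok, 0.0)
        )
        if negate:
            score = -score
            negate = False
        total += score
        hits += 1
    return total / hits if hits else 0.0


def label(score, threshold=0.05):
    """Map a blended score to a class using an assumed symmetric threshold."""
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for tweet in [
        "The vaccine is safe and effective",
        "Scared of fever and pain",
        "Not safe",
        "Second dose tomorrow",
    ]:
        print(tweet, "->", label(score_tweet(tweet)))
```

In a fuller implementation the toy dictionaries would be replaced by lookups into the actual lexicons, and the combination weights would be tuned on labelled tweets rather than fixed by hand.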