Ph.D. Thesis Description
Social networks let users publicly post messages, which can in turn attract interactions in the form of comments. These comments express users' opinions about a product or a particular subject. In this thesis, we investigate how users' opinions can affect the ranking of mobile applications. Our first goal is to collect data from social networks. We will then apply deep learning techniques to classify users' sentiments. This classification forms the core of mobile application ranking prediction.
This thesis aims to analyze the opinions that users of mobile applications express on social networks such as Facebook. The analysis will first determine whether a review is subjective (that is, whether it discloses an opinion), and then determine its polarity (positive or negative).
Traditional methods rely on machine learning combined with either bag-of-words representations or human annotations. In such methods, only a single word (unigram) is considered at a time, ignoring word order (I. Ounis and I. Soboroff, 2008). As a result, word combinations that reverse polarity are not detected (for example, "avoid war" or "limited freedom"). One way to address this issue is to consider pairs of words (bigrams), which makes it possible to capture simple negations. The drawback is that this solution implies a very large feature vector.
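The difference between unigram and bigram representations can be illustrated with a minimal sketch. It assumes plain whitespace tokenization, and the helper name `ngram_counts` and the example sentences are hypothetical choices made for illustration only, not part of the method proposed in this thesis.

```python
from collections import Counter


def ngram_counts(text, n):
    """Count the n-grams (tuples of n consecutive words) in a text.

    Assumption: simple lowercase + whitespace tokenization.
    """
    tokens = text.lower().split()
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


# Two hypothetical reviews with opposite meanings but the same words.
review_a = "the app is good not bad"
review_b = "the app is bad not good"

# Unigrams ignore word order, so both reviews get the same representation.
same_unigrams = ngram_counts(review_a, 1) == ngram_counts(review_b, 1)

# Bigrams keep local order, so the two reviews become distinguishable.
same_bigrams = ngram_counts(review_a, 2) == ngram_counts(review_b, 2)

# A bigram keeps "avoid war" together as a single feature.
avoid_war = ngram_counts("avoid war", 2)

# Cost: with a vocabulary of V words there are up to V * V possible bigrams.
vocabulary_size = len(ngram_counts(review_a, 1))
max_bigram_features = vocabulary_size ** 2

print(same_unigrams, same_bigrams)
print(avoid_war)
print(vocabulary_size, max_bigram_features)
```

The last two values show why the bigram feature vector grows quickly: the number of possible bigram features is bounded by the square of the vocabulary size, even though only a fraction of them occur in practice.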
One of the key advantages of deep neural networks over traditional machine learning techniques is their capacity to capture sentence structure and semantics (Ronan Collobert et al., 2011). In the Natural Language Processing (NLP) domain, hierarchical abstract features may be computed from online social media interactions in order to determine the sentiment expressed in a specific message or post. By analyzing whole sentences rather than isolated words, deep learning algorithms can capture sentiment better than traditional methods (Socher Richard, 2013).
This thesis aims to apply deep learning models to sentiment analysis in social media such as Facebook. The first task consists of building large datasets of messages and posts collected from social media. Feature engineering and data labeling will be performed automatically by deep learning algorithms rather than by human annotators: deep neural networks (Pengfei Liu et al., 2015; Soujanya Poria et al., 2016) will handle the selection of attributes and annotation classes without human intervention. One issue with training deep networks on large data is that it frequently runs into the vanishing gradient problem (Bengio et al., 1994). The second task, then, is to find a deep network model suited to the problem addressed. This model will be validated by evaluating the performance of a supervised classifier trained on the dataset generated in this step. As a possible improvement in prediction performance, we propose a third task, which aims to characterize an opinion by its sentiment intensity. This can be treated as a semantic composition problem (Socher Richard et al., 2013). This model will also be validated by a supervised classifier. The two models will then be compared and analyzed in order to justify the interest of modeling sentiment intensity.
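The vanishing problem cited above (Bengio et al., 1994), usually called the vanishing gradient problem, can be illustrated numerically. The sketch below is a deliberate simplification and not the model studied in this thesis: it assumes a chain of scalar sigmoid units with a shared weight, and the function name `gradient_through_chain` and its default values are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic sigmoid function."""
    return 1.0 / (1.0 + math.exp(-x))


def gradient_through_chain(depth, weight=1.0, x=0.5):
    """Return |d output / d input| for a chain of `depth` scalar sigmoid units.

    Assumption: every unit computes sigmoid(weight * previous_activation).
    """
    grad = 1.0
    activation = x
    for _ in range(depth):
        activation = sigmoid(weight * activation)
        # Chain rule: d sigmoid(w * a) / d a = w * s * (1 - s), with s <= 1.
        grad *= weight * activation * (1.0 - activation)
    return abs(grad)


# The gradient magnitude shrinks roughly geometrically with depth.
for depth in (1, 5, 10, 20):
    print(depth, gradient_through_chain(depth))
```

Because the derivative of the sigmoid never exceeds 0.25, each additional unit in this toy chain (with weight 1) multiplies the gradient by at most 0.25, so the signal that reaches the earliest units becomes too small to drive learning.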