Open Access

Topic Sentiment Analysis in Online Learning Community from College Students

May 20, 2020

Introduction

Due to the rapid development of Web 2.0 and social media, the Online Learning Community (OLC) is increasingly being utilized by academic institutions to create a more convenient learning environment (Fariza, 2019; Wu, Hsieh, & Yang, 2017). OLC establishes a virtual social form through teaching, research and other activities, with interactive learning, collaborative learning and independent learning. In summary, OLC consists of three basic elements: technology, teaching, and academic sentiment interaction. The purpose of improving academic sentiment interaction is to cultivate learners’ sense of belonging to the community, so that learners are willing to stay in the community for a long time and maintain learning motivation at a high level (Cho, Kim, & Choi, 2017). Academic sentiments are generally hidden in the text records of learning community activities, such as documents, statements, and sentences. Through the techniques of sentiment analysis, weight calculation, and semantic understanding, the sentiment experience related to learning processes can be observed. In this regard, academic sentiment mining (AEM) can gain insights from the comments in OLC to analyze the factors affecting learning outcomes, which is of great significance for the improvement of teaching theories and methods (Kohoulat et al., 2017).

In the realization of AEM, much attention has been paid to discovering the sentiments expressed in applications such as academic recommender systems (Kaklauskas, Zavadskas, & Seniut, 2013), opinion leader identification (Li, Ma, & Zhang, 2013), forum topic mining (Cheng et al., 2015; Colace, Santo, & Luca Greco, 2014), and community quality analysis (Ghiasifard et al., 2015). However, none of the methods mentioned above identifies and visualizes topic sentiment on the basis of learning topic mining and sentiment clustering at various granularity levels. In light of these studies, the current paper aims to construct a topic sentiment analysis method for OLC. Our ultimate goal is to obtain a list of terms relevant to a given learning topic and to visualize their association relationships interactively, based on the sentiment classification.

To access a certain plaint set without checking all the document information, the current paper highlights the identification and visualization of topic sentiment based on learning topic mining and sentiment clustering at various granularity levels. To meet this need, we first propose a topic analytics method, using actual data sets from the website (www.icourses.cn), to construct the documents-topics probability matrix and the topics-terms probability matrix. Secondly, the topic sentiment distances are measured on the basis of the topic-clustered concept lattice. Then, the sentiment polarity is calculated with the help of the sentiment dictionary and domain context. Finally, a set of formal concepts containing topic-terms, as well as a set of association rules, is generated and visualized.

The following proposals are given in the paper.

A topic analytics method is proposed to analyze and extract potential topics in OLC.

On the basis of college students’ feedback, a novel approach is developed to identify the topic sentiment by measuring the sentiment distance.

In addition, the hierarchical and associated relationships as well as the granularity of sentiment information are obtained.

The rest of the paper is constituted as follows: Section 2 introduces a related survey of methodologies in the area of topic detection, sentiment analysis, and sentiment concept clustering. Section 3 gives introductions about Latent Dirichlet Allocation (LDA) and Formal Concept Analysis (FCA). The topic sentiment analysis method is proposed in Section 4. Section 5 contains the illustration and implementation of the proposed method followed by the results and discussions given in Section 6. Section 7 presents the conclusions and future work followed by the acknowledgment and references.

Related work

The OLC contains a large amount of information, which can be divided into two parts: learning resources and student review information (Shea, Li, & Pickett, 2006). How to accurately extract the topic of students’ attention and make corresponding optimization according to the sentiment of students’ comments has become a key link to improve the quality of community service and enhance the learning effect of students. To achieve this goal, many scholars have carried out extensive and in-depth research, which can be summarized into three main steps: topic detection (Lu et al., 2013), sentiment analysis (Nan & Wu, 2010), and sentiment concept clustering (Pappas et al., 2017). In this regard, various investigations and studies have been put forward to optimize the quality of data mining in OLC. In order to explain the proposed method in an understandable way, this section focuses on the detection methods of community topics and analyzes different sentiment distribution models according to the corresponding topics.

Topic detection for online community

The topic detection model is a probabilistic generative model for text content that simulates the human thought process to find the best topic set and its vocabulary. The existing topic models mainly include latent semantic analysis (LSA) (Martínez, 2015), the probabilistic latent semantic indexing model (PLSI) (Parvathy, 2016), and Latent Dirichlet Allocation (LDA) (Yue, Barnes, & Jia, 2017).

The LSA method represents documents in a low-dimensional implicit semantic space by introducing semantic dimensions. Although the model can construct text representations without dictionary spaces, LSA is still derived from linear algebra and generates a large number of negative values in various dimensions. To deal with this problem, Hofmann proposed the probabilistic latent semantic indexing model (PLSI) (Hofmann, 1999). PLSI emphasizes the semantic interpretability of topic text based on implicit semantic indexing, but it is unable to deal with the over-fitting problems caused by massive text. Because the number of PLSI parameters grows linearly with the document set, Blei proposed the LDA (Latent Dirichlet Allocation) model to retrieve potential topics, representing the high-dimensional word space with a low-dimensional topic space (Blei, Ng, & Jordan, 2003). The LDA model is a multi-layer unsupervised Bayesian network that has been widely used to mine document subject knowledge. The LDA-based approaches for online communities can be summarized into two categories. The first identifies similar topics under different time segments and analyzes their evolution trends. Chu and Li (2010) proposed a method to track the evolution of topics. They utilized the original corpus for topic classification. Nagori developed a content-based recommender system to personalize e-learning systems (Nagori & Aghila, 2012). They exploited the topic model by introducing similarity metrics. Yang (Yang, Zhang, & Shi, 2014) adjusted the prior parameters of the model to find changeable topics in the text. Ge extracted the hidden micro blog topics to surface topics that need to be expressed in the community (Ge, Chen, & Du, 2013). The second category combines LDA with other models to enhance the deep semantic relationships of the topic.
Santosh, Vardhan, and Ramesh (2016) focused on the analysis of the feature attributes of online product reviews. They used the LDA model to obtain the feature keywords of the product and combined it with the feature ontology tree (FOT) to improve the accuracy of subject detection. Cerulo and Distante (2013) obtained a topic-terms matrix by developing a topic recognition model, which was utilized to form a formal context for constructing a theme concept lattice for topic-driven navigation. Zhong et al. (2018) designed an evaluation framework for the quality of student comments in online communities. They considered the dimensional characteristics of online commentary data quality and constructed a set of topic features. To sum up, the current methods focus on the evolution analysis of community topic mining, semantic relationship enhancement, and probabilistic topic modeling, ignoring the hierarchical relationship between topics and the sentiment analysis of students’ feedback.

Sentiment analysis for online students

In the process of topic detection, adding sentiment analysis can identify the sentiment changes of online students implied in the topic. Therefore, it is necessary to identify sentiment distributions according to the corresponding topic. Sentiment analysis, also known as opinion mining, is the process of analyzing, processing, and classifying subjective texts with sentiment techniques. At present, the mainstream sentiment analysis methods can be divided into three categories. The first analyzes the text by constructing a sentiment dictionary and relies mainly on the quality of sentiment lexicons with specific semantic rules. Pointwise Mutual Information (PMI) and Latent Dirichlet Allocation (LDA) are often used in constructing sentiment lexicons: PMI can be used to judge the sentiment tendency of words, while LDA is utilized to extract sentiment words from a corpus (Li, Ba, & Huang, 2015). Turney and Littman (2003) developed the PMI algorithm to extend the sentiment dictionary and then proposed a semantic polarity algorithm to analyze the sentiment tendency of the text, which improves the accuracy of text data classification. Yang, Peng, and Chen (2014) proposed an LDA-based method to construct a domain-specific sentiment dictionary on the basis of an existing public sentiment dictionary, where the extracted topic words are viewed as a priori knowledge from the corpus. The second category mines the sentiment features of the text based on Machine Learning (ML), such as Support Vector Machine (SVM) (Liu, Bi, & Fan, 2017), Naive Bayes (NB) (Shirakawa et al., 2017), and Maximum Entropy (ME) (Ficamos, Yan, & Chen, 2013). Vinodhini (2014) designed a hybrid framework of SVM and principal component analysis (PCA) to improve the sentiment classification accuracy by reducing the complexity of the sentiment mining model.
Mertiya and Singh (2016) proposed an unsupervised polarity selection method to determine the polarity of tweets by merging NB and adjective analysis theory. Xie et al. (2017) extracted seed sentiment words from Wikipedia by using probabilistic latent semantic analysis, which are used as the input matrix of the ME model. Meanwhile, to classify sentiment, they used entropy classification theory to select sentiment features. The third category is the deep-learning based approach, which converts word embeddings into a text vector to extract deep sentiment features and mainly includes Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN). Shin, Lee, and Choi (2017) integrated lexicon embeddings and attention mechanisms into CNN to analyze sentiment features with fewer noisy words. Ethemet, Aysu and Fazli (2018) proposed a cross-language emotion analysis model, which can realize sentiment analysis based on CNN under the condition of a small corpus. Although many researchers have put a lot of effort into improving the sentiment classification of online communities for practical work, there is still a lack of evaluation of sentiment unit combination, especially when it comes to the OLC. Since the sentiment analysis of students’ learning is closely related to the context where the topic is located, it is necessary to establish a set of association rules with contextual awareness. To meet this need, we introduce formal concept analysis theory into online sentiment analysis by exploring the sentiment association rules between students.

To sum up, the current paper tries to make improvements in two ways. Firstly, the granularity of learning topics is analyzed to visualize their hierarchical relationships. Secondly, we focus on finding the negative sentiment in students’ comments on the basis of sentiment score calculation to form the basic association rule sets.

Introduction to LDA and FCA
Latent Dirichlet Allocation

Latent Dirichlet Allocation (LDA) is a three-layer Bayesian probability network, which assumes that documents in the corpus select a topic based on a certain probability, and each topic also selects a term based on a certain probability. Therefore, a document is a mixture of multiple topics, and a topic is also a mixture of multiple terms. Suppose the topic distribution vector in the document is θ, the topic term distribution is φ, z is the topic, and w is the term, where θ and z are the implicit variables, w is the explicit variable. The topic recognition process of the LDA model is shown in Figure 1 (Blei D. M., Ng A. Y., & Jordan M. I., 2003). Topics-Terms Matrix and Topics-Documents Matrix can be mined in the end.
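
The generative assumption described above can be illustrated with a minimal Python sketch. It is not the implementation used in this paper: the vocabulary, the number of topics, the concentration values, and the function names are hypothetical and chosen only to make the roles of θ, φ, z, and w concrete.

```python
import random

def sample_dirichlet(concentration, size, rng):
    """Draw one sample from a symmetric Dirichlet via normalized Gamma draws."""
    draws = [rng.gammavariate(concentration, 1.0) for _ in range(size)]
    total = sum(draws)
    return [d / total for d in draws]

def generate_document(vocabulary, phi, alpha, length, rng):
    """Generate one document following the LDA generative story."""
    # theta: the document's topic distribution (implicit variable).
    theta = sample_dirichlet(alpha, len(phi), rng)
    topics, words = [], []
    for _ in range(length):
        # z: latent topic drawn from theta.
        z = rng.choices(range(len(phi)), weights=theta)[0]
        # w: observed term drawn from the term distribution phi[z] of topic z.
        w = rng.choices(vocabulary, weights=phi[z])[0]
        topics.append(z)
        words.append(w)
    return theta, topics, words

rng = random.Random(0)
# Hypothetical vocabulary and sizes, for illustration only.
vocabulary = ["course", "video", "teacher", "exam", "homework"]
num_topics = 2
phi = [sample_dirichlet(0.5, len(vocabulary), rng) for _ in range(num_topics)]
theta, topics, words = generate_document(vocabulary, phi, alpha=1.0, length=8, rng=rng)
```

Topic recognition runs this story in reverse: only the terms w are observed, and θ, φ, and z are inferred from them.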

Figure 1

Schematic diagram of topic generation based on LDA model.

Formal concept analysis

Formal concept analysis (FCA) is a hierarchical concept construction theory based on Galois connection, which is utilized to describe the domain knowledge in depth on the basis of the mapping relationships between objects and attributes. The FCA theory consists of four basic notions of formal context, formal concept, partial ordering, and concept lattice. To further analyze the collections of documents in OLC by referring to Ren’s paper (Ren, Ling, & Yao, 2018), four definitions are given separately.

Definition 1

(Formal context) (Wei et al., 2019) A formal context is represented as a triple K=(S, T, I), where I ⊆ S × T is a binary relation between a set of objects S={s1, …, si, …, sN} (a collection of documents) and a set of attributes T={t1, …, tj, …, tK} (a collection of topics).

Definition 2

(Formal concept) (Wei et al., 2019) Let K=(S, T, I) be a formal context with X ⊆ S and Y ⊆ T. The pair (X, Y) is a formal concept if it satisfies the following conditions:

$$\forall x \in X, \forall y \in Y \Rightarrow (x,y) \in I \tag{1}$$

$$\forall x \notin X \Rightarrow \exists y \in Y, (x,y) \notin I \tag{2}$$

$$\forall y \notin Y \Rightarrow \exists x \in X, (x,y) \notin I \tag{3}$$

The inheritance relationship between different formal concepts can be utilized to construct a complete concept lattice through partial order relations, which is defined as follows.

Definition 3

(Partial ordering) (Zhang, Wei, & Qi, 2005) Let (X1, Y1) and (X2, Y2) be two formal concepts. The partial ordering relation between the two formal concepts is valid on condition that X1 ⊆ X2 or Y1 ⊇ Y2. In that case, (X1, Y1) is called a subconcept of (X2, Y2).

Definition 4

(Concept lattice) (Zhang, Wei, & Qi, 2005) Let ≺ be the set of partial orderings among all the formal concepts. H(S, T, ≺) is defined as a concept lattice based on the formal context K=(S, T, I).
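
Definitions 1 to 4 can be made concrete with a small Python sketch. It assumes a toy binary context with three hypothetical documents (d1 to d3) and three hypothetical topics (t1 to t3), and it enumerates concepts by brute force over attribute subsets, which is only practical for very small contexts; it is not the lattice construction algorithm used in this paper.

```python
from itertools import combinations

def common_attributes(objects, context, attributes):
    """Attributes shared by every object in `objects`."""
    return frozenset(a for a in attributes if all(a in context[o] for o in objects))

def common_objects(attrs, context):
    """Objects that possess every attribute in `attrs`."""
    return frozenset(o for o, has in context.items() if attrs <= has)

def formal_concepts(context, attributes):
    """Enumerate all (extent, intent) pairs by closing every attribute subset."""
    concepts = set()
    for r in range(len(attributes) + 1):
        for subset in combinations(sorted(attributes), r):
            extent = common_objects(frozenset(subset), context)
            intent = common_attributes(extent, context, attributes)
            concepts.add((extent, intent))
    return concepts

def is_subconcept(c1, c2):
    """(X1, Y1) is a subconcept of (X2, Y2) when X1 is a subset of X2."""
    return c1[0] <= c2[0]

# Hypothetical formal context: document -> set of topics it is related to.
context = {
    "d1": frozenset({"t1", "t2"}),
    "d2": frozenset({"t2"}),
    "d3": frozenset({"t2", "t3"}),
}
attributes = {"t1", "t2", "t3"}
concepts = formal_concepts(context, attributes)
top = (frozenset(context), frozenset({"t2"}))
```

For this toy context the enumeration yields four concepts, with `top` as the most general one; ordering them with `is_subconcept` gives the concept lattice.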

The current paper utilizes LDA for opinion mining for the following reasons. When traditional machine learning methods are applied to sentiment classification, the classification effect is unstable, and most of them are supervised methods, which require a certain number of labeled training samples. The manual labeling process is relatively time-consuming and labor-intensive, with poor portability across fields. Therefore, unsupervised learning algorithms have become an important research direction for sentiment analysis of online reviews. However, although the existing topic-sentiment hybrid models can extract both the subject and the sentiment information of a document at the same time, their sentiment classification performance and stability are not ideal, due to local negation and the number of subjects in a subjective document. In fact, text sentiment classification is still essentially a text classification problem.

The innovations of this paper can be summarized as the following two points: 1) Modeling the specific domain knowledge of college students’ online learning communities, as well as proposing a framework that supports small-scale knowledge acquisition and modeling, and further refines the granularity of subject knowledge and sentiment. 2) On the basis of LDA theory, a concept hierarchy analysis method is introduced to design and implement a topic-clustered concept lattice generation algorithm for review documents, which is helpful for mining sentiment of sparse short text data.

The aim of the approach is not to demonstrate the advantages of utilizing LDA for opinion mining, but to construct a set of feature categories of students’ comment for the online learning community. The innovative point proposed is to use the relationship between topic features and review text classification to build the formal context of formal concept analysis, thereby constructing a hierarchical topic concept lattice to reveal the implied sentiment.

For the sentiment classification of texts in online learning communities, review topics of students often have characteristics such as limitedness, which can lead to calculation errors in the sentiment similarity of reviews. In order to reduce the interference of topic content on sentiment classification, this article first mines the implicit topics of online reviews based on the LDA topic model, and combines the sentiment dictionary to calculate the sentiment polarity of the topics to obtain the sentiment tendency of the comments. To enact these goals, two approaches are proposed: on the one hand, the topics are detected via the LDA probability topic model; on the other hand, the sentiment scores matrix based on FCA is obtained by calculating sentiment similarities.

Topic sentiment analysis method
Design of the framework

The visualization for topic sentiment analysis in online learning community (TSAOLC) consists of four modules: data preprocessing, topic detection, sentiment analysis, and visualization. The architectural overview of TSAOLC is shown in Figure 2. The modules are applied in sequence, and each step is described as follows.

Figure 2

The overall structure of the proposed methodology.

Data preprocessing. Most of the text in OLC is unstructured or semi-structured text. Therefore, before extracting the text topic, the original text needs to be structured and represented in three steps: First, we use the web crawler tool to download related webpages and data sets; Then, stop word removal and Chinese word segmentation are applied to process the original corpora, which are used to edit the dictionary to get the document-word segmentation matrix; Finally, based on the total frequency of occurrences, the top n terms for topic detection are selected to simplify relevant sets of candidate terms.

Topic detection. Firstly, the LDA probability topic model is used to obtain the subject candidate set with probability dependence in the document, during which the documents-topics probability matrix and the topics-terms probability matrix are obtained. The documents-topics probability matrix can be utilized to assign the membership values of topics to the corresponding relevant documents. Meanwhile, the top terms in the topics-terms probability matrix are likely to be semantically related and can be clustered to form a meaningful concept. Therefore, the generated documents-topics probability matrix is regarded as the input for the formal context to construct the topic-clustered concept lattice.

Sentiment analysis. In the process of sentiment analysis, the common process is to extract the sentiment terms from the domain context and calculate the sentiment polarity with the help of the sentiment dictionary. In general, the main steps can be divided into two stages: 1) First, it is necessary to establish a sentiment vocabulary related to OLC; 2) Secondly, the sentiment scores are calculated to determine the semantic distance between the sentiment words.

Visualization. The process of visualizing topic sentiment can be divided into three stages. First, based on the FCA theory, a set of formal concepts containing topic-terms as well as a set of association rules are generated. Then, the sentiment scores matrix is mapped to the set of association rules to obtain a set of topic sentiment that satisfy different confidence levels. Finally, the negative sentiment is screened out as specific constraints to generate a sentiment rule set that reflects the negative tendency of the students in the text.

Data preprocessing

First, we set the website list of OLC as the seed URLs and use web crawler software to download the web pages and data sets. After the web pages are downloaded to the local disk, their source code is analyzed to extract the useful information, including the web page titles, which is saved in the database. Meanwhile, in order to filter irrelevant webpages, unnecessary symbols are removed on the basis of a common stop word list belonging to a predefined domain vocabulary, which consists of the Baidu stopwords vocabulary and the machine learning stopwords list of Sichuan University. Afterwards, we utilize the Chinese word segmentation system NLPIR-ICTCLAS to process the text corpus, which can automatically discover new terms and adaptively test the linguistic probability distribution from longer text content. After extracting the candidate terms, TF-IDF (Term Frequency-Inverse Document Frequency) is introduced to assess the importance of a term to a document (Wu et al., 2008).
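
As an illustration of the TF-IDF weighting step, the following Python sketch assumes that the comments have already been segmented and stripped of stop words, so each document is a non-empty list of terms. The sample documents are hypothetical, and the particular variant (relative term frequency multiplied by the natural logarithm of the inverse document frequency) is one common choice rather than necessarily the exact formula used here.

```python
import math
from collections import Counter

def tf_idf(documents):
    """Return one {term: weight} dictionary per document."""
    n_docs = len(documents)
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for doc in documents for term in set(doc))
    weights = []
    for doc in documents:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights

# Hypothetical pre-segmented comments.
docs = [
    ["teacher", "video", "clear"],
    ["video", "slow", "video"],
    ["exam", "hard"],
]
weights = tf_idf(docs)
```

A term that is frequent in one document but rare across the corpus receives a high weight, which is why the top-ranked terms are reasonable candidates for topic detection.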

Topic detection

Topic-modeling. The documents-terms matrix in Section 4.2 is used as the initial training input for the LDA model. The procedure can be divided into three steps: 1) Model initialization. For each document, a topic is extracted from the topic distribution. Meanwhile, a term is extracted from the term distribution corresponding to the extracted topic. The above process is repeated until each term in the document is traversed. 2) Model training. The training parameter set of the LDA model includes (α, β, k, n, i), where α represents a symmetric Dirichlet parameter, which refers to the smoothness degree of the generated topic words; β represents an asymmetric Dirichlet parameter, which refers to the smoothness degree with which topic features generate feature terms; k indicates the number of topic clusters; n indicates the number of feature terms under each topic; and i indicates the number of iterations of the Gibbs sampling algorithm. In this paper, the parameter set (α, β, k, n, i) is set to (1, 0.01, 50, 500, 5000). 3) Identity array definition. The topics-terms matrix and the documents-topics matrix are generated after the marginal distribution of these variables is obtained. We select the top N terms according to the probability ranking under each topic, and assign an array of identifiers to the different topics by combining the domain knowledge of OLC.
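
The three steps above can be sketched with a compact collapsed Gibbs sampler in plain Python. This is an illustrative sketch under simplifying assumptions, not the implementation used in this paper: both Dirichlet priors are treated as symmetric scalars, the sample documents are hypothetical, and the example uses a tiny k and iteration count instead of the reported setting (1, 0.01, 50, 500, 5000).

```python
import random

def lda_gibbs(docs, num_topics, alpha, beta, iterations, seed=0):
    """Collapsed Gibbs sampling for LDA; returns (vocab, theta, phi)."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)
    doc_topic = [[0] * num_topics for _ in docs]        # counts n(d, k)
    topic_word = [[0] * V for _ in range(num_topics)]   # counts n(k, w)
    topic_total = [0] * num_topics                      # counts n(k)
    assignments = []
    # 1) Initialization: give every token a random topic.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(num_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assignments.append(zs)
    # 2) Training: resample each token's topic from its conditional distribution.
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for n, w in enumerate(doc):
                z, v = assignments[d][n], word_id[w]
                doc_topic[d][z] -= 1
                topic_word[z][v] -= 1
                topic_total[z] -= 1
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][v] + beta) / (topic_total[k] + V * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][n] = z
                doc_topic[d][z] += 1
                topic_word[z][v] += 1
                topic_total[z] += 1
    # 3) Read out the documents-topics (theta) and topics-terms (phi) matrices.
    theta = [[(c + alpha) / (len(doc) + num_topics * alpha) for c in row]
             for row, doc in zip(doc_topic, docs)]
    phi = [[(c + beta) / (topic_total[k] + V * beta) for c in topic_word[k]]
           for k in range(num_topics)]
    return vocab, theta, phi

# Hypothetical pre-segmented comments.
docs = [["video", "clear", "video"], ["exam", "hard", "exam"], ["video", "exam", "clear"]]
vocab, theta, phi = lda_gibbs(docs, num_topics=2, alpha=1.0, beta=0.01, iterations=50)
# Top-2 terms per topic, ranked by probability.
top_terms = [[term for _, term in sorted(zip(row, vocab), reverse=True)[:2]] for row in phi]
```

Each row of `theta` and `phi` sums to one, so the rows can be read directly as the documents-topics and topics-terms probability distributions.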

Formal concept analysis. In our case, we employ the FCA theory, as presented in Section 3.2, to construct a hierarchical topic-clustered concept lattice, where the topics-terms matrix is fed as an input to be converted into a topic formal context represented as K=(Documents, Topics, I). In the classical theory of FCA, the formal concept lattice can only be constructed through binary relations. However, the association matrix of the previous step usually contains multi-valued attributes that cannot be used to construct a topic-clustered concept lattice directly. Therefore, we first use the conceptual scaling technique to structure and single-value the multi-valued attributes in the association matrix, and then convert the obtained multi-valued formal context into a single-valued formal context; this step is named association matrix binarization. After that, the generated documents-topics matrix is assigned to the formal context to estimate the probabilities of each topic. Based on the analysis above, the proposed algorithm for constructing the hierarchical topic-clustered concept lattice is summarized in Table 1.

Table 1. The proposed algorithm for topic-clustered concept lattice generation.

Input: A set of topic and comment documents D, where |D| = n, and the number of potential topics m.
Output: A topic-clustered concept lattice CL, a topics-terms probability matrix P, and a documents-topics probability matrix R.
1. for each di ∈ D.
2.   di → CWSi. // Convert the document into word segments.
3.   for each cws in CWSi.
4.     W = W ∪ {cws}. // Obtain a collection of phrases that contains topic attributes.
5.   end for.
6. end for.
7. for each cws in CWSi.
8.   CWSi → tfidfi. // Calculate the term frequency of attributes.
9.   $D' = [D : {tfidf}_i]$. // Obtain the term frequency vector.
10. end for.
11. $(D', W) \xrightarrow{LDA} (D, P, I)$. // Perform topic detection.
12. $(D', P) \to R$. // Classify the topic association matrix.
13. Find the subset of topic attributes represented as tj.
14. for j = 1 to 2^m.
15.   Compute the set of objects by applying the Galois connection.
16.   R → I′. // Convert the topic association matrix to a multi-valued formal context.
17.   I′ → I. // Convert the multi-valued formal context to a binary single-valued formal context.
18.   $(D', R, I') \xrightarrow{FCA} CL(D', R, I')$. // Construct a hierarchical topic concept lattice.
19. end for.
20. Return {CL(D′, R, I′), P, R}.
21. Derive the topic-clustered sets.
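
The association matrix binarization used in the algorithm above can be illustrated with a short Python sketch. It assumes the simplest possible scaling, a single probability threshold, which is an assumption made for illustration; the threshold value, the sample matrix, and the identifiers are hypothetical.

```python
def binarize(doc_topic, threshold):
    """Map a documents-topics probability matrix to a 0/1 incidence matrix."""
    return [[1 if p >= threshold else 0 for p in row] for row in doc_topic]

def to_formal_context(binary, doc_ids, topic_ids):
    """Turn a 0/1 matrix into a formal context: document -> set of related topics."""
    return {
        doc: frozenset(t for t, flag in zip(topic_ids, row) if flag)
        for doc, row in zip(doc_ids, binary)
    }

# Hypothetical documents-topics probability matrix (rows sum to 1).
R = [
    [0.70, 0.20, 0.10],
    [0.40, 0.50, 0.10],
    [0.05, 0.35, 0.60],
]
incidence = binarize(R, threshold=0.3)
context = to_formal_context(incidence, ["d1", "d2", "d3"], ["t1", "t2", "t3"])
```

A lower threshold relates each document to more topics and produces a larger lattice, so the threshold acts as a granularity control.
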
Sentiment analysis

Sentiment identification. In order to compute sentiment scores, a sentiment scores matrix (SSM) is created for storing the semantic similarity distances among the opinion terms. We first extract the topic formal concepts from the concept lattice in Section 4.3 to build a set of term datasets. In each dataset, documents and topics are viewed as the rows and columns of the matrix, respectively. Then, to obtain the sentiment polarity of a term, it is necessary to calculate the semantic similarity between the new term and the positive or negative seed terms on the basis of the sentiment dictionary; the results serve as the basis for the sentiment classification of the term. The construction of the sentiment dictionary consists of two steps: first assigning an initial value to each seed term, and then assigning sentiment weights to the modifiers of the sentiment terms (such as degree adverbs and negative words). The collection of seed terms in the sentiment dictionary is generally divided into positive, negative, and neutral sentiment. It is the basis for constructing a collection of sentiment words and is mainly derived from the judgment of domain experts. After obtaining the seed terms, we obtain the classification threshold of the sentiment polarity by calculating their mutual information. Then, by calculating the mutual information between the newly added term ti and the known sentiment term tj, the sentiment distance between them is obtained as shown in Equation 4. Modifiers such as transitional conjunctions, negative words, and degree adverbs in the sentiment dictionary often cause the strength of sentiment words to fluctuate. Existing methods cannot effectively solve the problem of sentiment ablation when multiple fusions are performed. To address this, this article makes improvements in two ways.
On the one hand, before calculating the weight of the newly added term, we use the clause after the transitional conjunction in the sentence to replace the whole sentence. At the same time, we calculate the sentiment comprehensive value based on the multi-feature linear fusion method to effectively avoid errors in sentiment discrimination. On the other hand, in order to quantify the influence of degree adverbs on sentiment intensity, we assign weights to common degree adverbs by referring to the literature (Gao, Luo, & Wang, 2017), namely adv1 (1.5), adv2 (1.3), adv3 (1.1), and adv4 (1). The synonyms of degree adverbs, which are listed in Table 2, are considered to have equal sentiment intensity. The sentiment comprehensive value is calculated with Equation 5 on the basis of the reference (Wang, Pan, & Yang, 2019).

$${Sim}_{MI}(t_i, t_j) = \sum_{t_i \in \{0,1\}} \sum_{t_j \in \{0,1\}} \log \frac{p(t_i, t_j)}{p(t_i)\,p(t_j)} \times p(t_i, t_j) \tag{4}$$

where p(ti, tj) represents the probability that the terms ti and tj appear in the same document; p(ti) and p(tj) represent the probabilities that the document contains ti and tj, respectively. These probabilities can be estimated from the initial corpus by maximum likelihood estimation, based on the sentiment dictionary proposed in the reference (Zhao et al., 2016).

$$SD(i_{new}, i_{seed}) = ({adv}_i) * (neg) * \left\{ \frac{\sum_{i_{seed} \in \{0,m\}} \sum_{i_{seed} \in S_{pos}} {Sim}_{MI}(i_{new}, i_{seed})}{|S_{pos}|} + \frac{\sum_{i_{seed} \in \{0,n\}} \sum_{i_{seed} \in S_{neg}} {Sim}_{MI}(i_{new}, i_{seed})}{|S_{neg}|} \right\} \tag{5}$$

where inew represents the new term and iseed represents the positive or negative seed terms. (advi) represents the weight of the degree adverb and takes one of the values 1, 1.1, 1.3, or 1.5. (neg) is a negation variable for the sentence where inew is located, and its value is 1 or −1. When (neg)=1, there is no negative term on the right side of inew, and the sentiment tendency has not changed. When (neg)=−1, there is a negative term on the right side of inew, and the sentiment tendency is opposite to that of the existing sentiment words in this sentence. Spos and Sneg represent the sets of positive and negative sentiment terms, respectively; m and n represent the numbers of terms contained in the sets of positive and negative sentiment terms, respectively. If SD(inew, iseed) is greater than 0, inew has a positive tendency and should be added to the set of positive sentiment terms, and vice versa.
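
Equations 4 and 5 can be illustrated with a minimal Python sketch. It assumes that each document is a set of segmented terms and that probabilities are estimated by simple document counts; the sample documents, the seed terms, and the function names are hypothetical. The `sentiment_distance` function follows Equation 5 as printed, that is, the degree adverb weight and the negation variable multiply the sum of the two seed-set averages.

```python
import math

def mutual_information(term_a, term_b, documents):
    """Equation 4: mutual information of the presence indicators of two terms."""
    n = len(documents)
    total = 0.0
    for a in (True, False):
        for b in (True, False):
            joint = sum(1 for doc in documents
                        if (term_a in doc) == a and (term_b in doc) == b) / n
            p_a = sum(1 for doc in documents if (term_a in doc) == a) / n
            p_b = sum(1 for doc in documents if (term_b in doc) == b) / n
            if joint > 0:  # cells with zero probability contribute nothing
                total += math.log(joint / (p_a * p_b)) * joint
    return total

def sentiment_distance(new_term, pos_seeds, neg_seeds, documents, adv_weight=1.0, neg=1):
    """Equation 5 as printed: modifier weights times the sum of seed-set averages."""
    pos_mean = sum(mutual_information(new_term, s, documents) for s in pos_seeds) / len(pos_seeds)
    neg_mean = sum(mutual_information(new_term, s, documents) for s in neg_seeds) / len(neg_seeds)
    return adv_weight * neg * (pos_mean + neg_mean)

# Hypothetical corpus of segmented comments and seed terms.
documents = [{"clear", "good"}, {"clear", "good"}, {"slow", "bad"}, {"slow", "bad"}]
mi = mutual_information("clear", "good", documents)
# adv2 weight (1.3) with a negation on the right side of the new term (neg = -1).
sd = sentiment_distance("clear", ["good"], ["bad"], documents, adv_weight=1.3, neg=-1)
```

In this toy corpus "clear" always co-occurs with "good" and never with "bad", and both relationships carry the same mutual information, since mutual information measures the strength of dependence rather than its direction.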

Table 2. Classification weights for adverbs of degree.

Level (weight)    Included adverbs
adv1 (1.5)        excessively, completely, extensively, dreadfully, entirely, absolutely
adv2 (1.3)        fairly, pretty, rather, quite, very, much, greatly, by far, highly, deeply
adv3 (1.1)        really, almost, nearly, even, just, still
adv4 (1)          slightly, a little, a bit, trifle, somewhat

Sentiment score calculation. In order to obtain a student’s sentiment tendency towards a topic, the SSM needs to be initialized. The main process can be described as follows. Firstly, the student’s comment topics are extracted from the topic formal context to establish the text-topic matrix. Secondly, the similarity distance between different topics is calculated to obtain the mapping relationship between students and topics. Afterwards, the maximum sentiment value of the sentiment terms under different topics is calculated to get the sentiment similarity distance that converges under a certain topic.

Topic similarity computing. KL distance is utilized to calculate the similarity distance between different topics-terms probability distributions (Zheng et al., 2013). However, the KL distance is asymmetric, so it cannot be used directly as a symmetric measure between topic distribution functions. Therefore, inspired by the literature (Wang, Zuo, & Tao, 2015), we introduce a symmetrized relative entropy into the topic similarity calculation, iterating over all the terms, as given in Equation 6.

$${sim}_{KL}(t_i, u_j) = -\frac{1}{2}\left( \sum_{t_i \in (0,1),\, u_j \in (0,1)} p(t_i) \log \frac{t_i}{u_j} + \sum_{t_i \in (0,1),\, u_j \in (0,1)} q(u_j) \log \frac{u_j}{t_i} \right) \tag{6}$$

where (ti, uj) represents two topics; p(ti) and q(uj) represent the distribution functions of the topics under their respective conditional probabilities.
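
A minimal Python sketch of Equation 6 follows. It assumes that the ratio inside each logarithm is the ratio of the two topics' term probabilities and that both distributions are defined over the same vocabulary with strictly positive entries (for example, after smoothing); the two sample distributions are hypothetical.

```python
import math

def kl_divergence(p, q):
    """Relative entropy KL(p || q) of two distributions over the same terms."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def sim_kl(p, q):
    """Equation 6: negated average of the two directed KL divergences."""
    return -0.5 * (kl_divergence(p, q) + kl_divergence(q, p))

# Hypothetical topics-terms probability rows for two topics.
topic_t = [0.5, 0.3, 0.2]
topic_u = [0.2, 0.3, 0.5]
similarity = sim_kl(topic_t, topic_u)
```

Because of the leading minus sign, identical topics reach the maximum value of 0, and the value becomes more negative as the two distributions diverge.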

Sentiment similarity computing. After obtaining the mapping relationships between the student and the topic, it is possible to calculate the sentiment value of the student (as shown in Equation 7) and obtain the sentiment tendency under a certain topic.

$$
Sentiment_{score}\left( i_n^{d,t} \right) = \sum\limits_{j = 1}^{d} \sum\limits_{k = 1}^{t} SD\left( i_{new}, i_{seed} \right)
$$

Here, d and t are the numbers of documents and topics, respectively; SD represents the sentiment comprehensive value.

On the basis of the equations above, the proposed algorithm for calculating the SSM is listed in Table 3. Steps 1 to 3 initialize the sentiment score and the probability distribution to 0. Step 4 derives the positive and negative seed terms to construct a collection of sentiment words. Step 5 computes the topic similarity between each topic of the students and each topic in the topic formal context. Step 6 computes the sentiment comprehensive value between each topic and the seed terms. Steps 7 to 13 iterate over all student topics contained in the topic formal context. Step 8 initializes the sentiment comprehensive value. Afterwards, Steps 9 to 11 compute the sum of the membership values. Step 12 accumulates the overall sentiment score of each topic of the students.

A proposed method algorithm for calculating sentiment scores matrix.

Input: A topic formal context K = (U, T, I), where U = {u1, u2, …, un} represents a set of topics belonging to a group of students and T = {t1, t2, …, tm} represents the set of topics; n is the size of the student set and m is the number of topics.
Output: A sentiment scores matrix Sentiment_score(t_i), where Sentiment_score(t_i) represents the sentiment score of topic t_i.
1. for each topic t_i in T
2.     Sentiment_score(t_i) = 0
3.     P(t_i, u_j) = 0
4.     Derive the positive and negative seed terms on the basis of domain experts
5.     Compute sim_KL(t_i, u_j) // Compute the topic similarity
6.     Compute SD(t_i, t_seed) // Compute the sentiment comprehensive value
7.     for each topic of student u_j in the topic formal context K
8.         SD(t_i, u_j) = 0
9.         for each topic t_i in the topic formal context K
10.             SD(t_i, u_j) = SD(t_i, t_seed) + SD(u_j, t_seed)
11.         end for
12.         Sentiment_score(t_i) = Sentiment_score(t_i) + SD(t_i, u_j)
13.     end for
14. end for
15. Return Sentiment_score(t_i)

Note: For the selection of positive and negative seed terms, "domain experts" refers to ten participants, including the authors, who use Borda counts to vote on the candidate seed terms. Specifically, each term to be classified is ranked against three alternative sentiment classes (positive sentiment, negative sentiment, and neutral sentiment), the classes are sorted by score, and the term is finally assigned to the highest-scoring class according to the majority voting principle.
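The accumulation loop of the algorithm above can be sketched as follows. This is a simplified reading rather than the authors' implementation: it assumes the sentiment comprehensive values SD(·, t_seed) have already been computed for every topic (Step 6), and it collapses the inner loop of Steps 9 to 11 into a single addition because that loop assigns the same sum on every pass. The function name, the dictionary layout, and the toy values are hypothetical.

```python
from typing import Dict, Iterable


def sentiment_scores(
    topics: Iterable[str],
    student_topics: Iterable[str],
    sd_to_seed: Dict[str, float],
) -> Dict[str, float]:
    """Accumulate a sentiment score per topic (Steps 1-2, 7-13 and 15)."""
    student_topics = list(student_topics)
    scores: Dict[str, float] = {}
    for t in topics:                       # Step 1
        score = 0.0                        # Step 2
        for u in student_topics:           # Step 7
            # Steps 8-11: SD(t, u) = SD(t, seed) + SD(u, seed)
            sd_tu = sd_to_seed[t] + sd_to_seed[u]
            score += sd_tu                 # Step 12
        scores[t] = score
    return scores                          # Step 15


# Hypothetical precomputed SD values for two context topics and two student topics.
sd = {"t1": 0.5, "t2": -0.25, "u1": 0.25, "u2": -0.5}
ssm = sentiment_scores(["t1", "t2"], ["u1", "u2"], sd)
print(ssm)
```

The listing also computes sim_KL in Step 5 but does not say how it feeds into the accumulation, so the sketch leaves it out instead of guessing.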

The sentiment adverbs in Table 2 act as adverbs modifying the whole sentence, which allows a more accurate understanding of the comments, thoughts and experiences expressed in the comment text. Besides, the stronger the expressed sentiment polarity is, the more clearly it reflects students' demands regarding learning effects. Therefore, it is necessary to quantify the influence of degree adverbs on sentiment intensity. We classify common degree adverbs into 4 levels on the basis of the literature (Zhang et al., 2017).

Visualization based on FCA

After obtaining the SSM, the topic set of formal concepts generated in the previous section is used as input to ConExp 1.3, which outputs two kinds of association rules under different confidence values: weak association rules and strong association rules. The weak association rule set is also called the Luxenburger set of approximate rules, while the strong association rules are called the Duquenne-Guigues set of implication rules, whose support and confidence values are both greater than the minimum support and minimum confidence thresholds (Qodmanan, Nasiri M., & Minaei, 2011). In order to analyze the sentiment state of online learning in more depth, we use both kinds of rules to map each topic to its relevant sentiment. To this end, the strong association rules are generated with the “Calculate Duquenne-Guigues set of implications” module, which creates the implication rules. Meanwhile, the weak association rules are generated with the “Calculate Association Rule” module. A generated association rule has the form “Rule number <Number of objects satisfying the preconditions> Precondition =[Confidence]=> <Number of objects satisfying the preconditions and conclusions> Conclusion”. The most effective association rules are obtained by adjusting the number of association rules through the minimum confidence and minimum support.
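To make the rule format concrete, here is a minimal sketch that counts the objects satisfying a precondition, counts those that also satisfy the conclusion, and derives the confidence from the two counts. It does not call ConExp; the binary context, the document identifiers, and the function names are hypothetical and serve only to show how the two counts and the percentage in a rule line relate.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical binary formal context: document -> set of attributes it satisfies.
CONTEXT: Dict[str, Set[str]] = {
    "D1": {"Learner", "NT2", "Interaction"},
    "D2": {"Learner", "NT2", "Interaction"},
    "D3": {"Learner", "NT2", "Interaction", "Postgraduate"},
    "D4": {"Learner", "NT2"},
    "D5": {"Learner", "PT2", "Interaction"},
}


def rule_stats(context: Dict[str, Set[str]], premise: List[str],
               conclusion: List[str]) -> Tuple[int, int, float]:
    """Return (#objects matching premise, #matching premise and conclusion, confidence)."""
    pre, con = set(premise), set(conclusion)
    with_premise = [obj for obj, attrs in context.items() if pre <= attrs]
    with_both = [obj for obj in with_premise if con <= context[obj]]
    confidence = len(with_both) / len(with_premise) if with_premise else 0.0
    return len(with_premise), len(with_both), confidence


def format_rule(number: int, context: Dict[str, Set[str]],
                premise: List[str], conclusion: List[str]) -> str:
    """Render a rule in the '<n> Precondition =[conf]=> <m> Conclusion' layout."""
    n_pre, n_both, conf = rule_stats(context, premise, conclusion)
    return (f"{number} <{n_pre}> {' '.join(premise)} "
            f"=[{conf:.0%}]=> <{n_both}> {' '.join(conclusion)}")


print(format_rule(3, CONTEXT, ["Learner", "NT2"], ["Interaction"]))
```

In this toy context four documents satisfy the precondition and three of them also satisfy the conclusion, so the confidence is 3/4. Raising the minimum confidence above 75% would filter this rule out.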

The same student has different information needs in different situations, so the relative identities and needs of students in OLC are dynamic. Therefore, the current method focuses on mining association rules of learning identities to identify how students' relative roles change across situations, which helps the community provide precise services. In addition, mining the association rules for student behaviors can help to establish the sentiment evolution path of students on specific topics, which strengthens the basis for role transformation across different learning groups. Therefore, we also discuss the behavioral association rules. The two kinds of association rules are explained in detail in Section 6.

Implementation

In this part, we implement the model of topic sentiment analysis in OLC as several modules for expressing and monitoring opinions. The first module, for data preprocessing, collects 171,430 comments by crawling the text corpora from http://www.icourses.cn/home/ in four categories, namely Computer science, Economic management, Medicine & Health, and Agriculture and forestry. The second module detects discussion topics in order to construct the hierarchical topic-clustered concept lattice. The topic terms and their corresponding probability distributions are shown in Table 4, where each topic is represented by the 5 terms with the highest probability. The third module computes the SSM in order to obtain the sentiment information hidden in the topics. The topic formal context is a multi-valued formal context containing the sentiment scores of each topic, as shown in Table 5. Afterwards, we convert the formal context in Table 5 into a binary single-valued formal context via association matrix binarization based on positive, objective, and negative relationships.

However, since the information in Table 5 does not relate to the sentiment of the students during the learning process, it is relatively difficult to analyze the association rules between the student’s identity and the learning behavior. To address this, we further mine terms from the relevant documents containing student identity information and learning behaviors. Specifically, the student identity attributes mainly contain “learner”, “administrator”, “freshman”, “junior”, “postgraduate”, and “psychological stress”. The learning behavior attributes mainly contain “cooperation”, “interaction”, “information provider>AVG”, “information provider<AVG”, “information searcher>AVG”, “information searcher<AVG”, “information sharer>AVG” and “information sharer<AVG”. We therefore add these attributes (columns of Table 6) to capture the sentiment information of the students during the learning process, and the resulting binary single-valued sentiment formal context is shown in Table 6. The fourth module shows a flat structure view of the associated topic sentiment by mapping topics to their association rules.

Recognition results of topic terms.

| Topic | Terms and their probabilities |
| --- | --- |
| T1 | Course selection/0.023, Learning objectives/0.021, Difficulty of knowledge/0.018, Teaching methods/0.017, Guidance methods/0.013 |
| T2 | Credits/0.025, Content organization/0.023, Teaching methods/0.021, Learning support/0.021, Homework and assessment methods/0.020 |
| T3 | Case presentation/0.032, Procedural evaluation/0.031, Knowledge expansion/0.029, Analysis of difficult points/0.027, Group discussion/0.027 |
| T4 | Communication and feedback/0.033, Resource sharing/0.033, Information update/0.032, Response time/0.031, Information acceptance/0.030 |

T1: Instructional design; T2: Course content; T3: Teaching effect; T4: Teaching interaction.

Multi-valued sentiment formal context based on topic association matrix.

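| | T1 | T2 | T3 | T4 |
| --- | --- | --- | --- | --- |
| D1 | −3.427 | 2.874 | 4.315 | −1.306 |
| D2 | 2.641 | −0.597 | −2.105 | 2.635 |
| D3 | 4.715 | 2.132 | 1.624 | 0 |
| D4 | 2.334 | 0 | −1.748 | 4.316 |
| D5 | −3.619 | −1.857 | 3.624 | −0.391 |
| D6 | −2.107 | 2.167 | 2.419 | 2.361 |
| D7 | 0 | −0.524 | −0.267 | 2.638 |
| D8 | 2.369 | 1.629 | 2.364 | 0 |
| D9 | 1.024 | −0.121 | 3.478 | 2.964 |
| D10 | 2.361 | 1.493 | −0.328 | −1.267 |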
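The conversion from the multi-valued context to a binary one can be illustrated with a short sketch. The paper does not state the thresholds used for binarization, so the rule below (score above 0 gives a positive attribute P, score of exactly 0 gives an objective attribute O, score below 0 gives a negative attribute N) is an assumption, as are the function and variable names. The two sample rows are D1 and D3 from the table above.

```python
from typing import Dict, Set

# Rows D1 and D3 of the multi-valued sentiment formal context.
MULTI_VALUED: Dict[str, Dict[str, float]] = {
    "D1": {"T1": -3.427, "T2": 2.874, "T3": 4.315, "T4": -1.306},
    "D3": {"T1": 4.715, "T2": 2.132, "T3": 1.624, "T4": 0.0},
}


def binarize(context: Dict[str, Dict[str, float]]) -> Dict[str, Set[str]]:
    """Map each (document, topic) score to one attribute: P, O or N plus the topic."""
    binary: Dict[str, Set[str]] = {}
    for doc, scores in context.items():
        attrs: Set[str] = set()
        for topic, score in scores.items():
            if score > 0:
                prefix = "P"   # positive sentiment toward the topic
            elif score < 0:
                prefix = "N"   # negative sentiment toward the topic
            else:
                prefix = "O"   # objective: assumed to mean a score of exactly 0
            attrs.add(prefix + topic)
        binary[doc] = attrs
    return binary


binary_context = binarize(MULTI_VALUED)
print(sorted(binary_context["D1"]))
print(sorted(binary_context["D3"]))
```

Each document receives exactly one of the three attributes per topic, which yields the PT, OT and NT columns used as attributes in the binary context.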

The binary sentiment of the single-valued formal context.

T1T2T3T4T5T6T7T8T9T10T11T12T13T14T15T16T17T18T19T20
D1********
D2*********
D3********
D4********
D5*********
D6********
D7********
D8*********
D9********
D10**********

Note: * represents criterion satisfied. T1 represents freshman; T2 represents junior; T3 represents postgraduate; T4 represents administrator; T5 represents learner; T6 represents information provider<AVG; T7 represents information provider>AVG; T8 represents information sharer<AVG; T9 represents information sharer>AVG; T10 represents information searcher<AVG; T11 represents information searcher>AVG; T12 represents psychological stress; T13 represents cooperation; T14 represents interaction; T15 represents PT1; T16 represents OT1; T17 represents NT1; T18 represents PT2; T19 represents OT2; T20 represents NT2; T21 represents PT3; T22 represents OT3; T23 represents NT3; T24 represents PT4; T25 represents OT4; T26 represents NT4. Owing to space limits, attributes T21 to T26 are not shown in this table.

Figure 3 shows a screenshot of the proposed method, which enables teachers and supervisors to view each specific topic by adjusting the controls of the browser. Supervisors can select each topic (top-right section) based on LDA, together with its most relevant topics-terms matrix (middle-right section) and documents-topics matrix (middle-left section). Besides, five documents whose TF-IDF values are greater than the preset threshold are randomly selected and assigned to the topic (bottom-right section). Afterwards, the hierarchical topic concept lattice is constructed based on FCA (central part). Finally, the two sets of implication rules and association rules are listed on the basis of the sentiment scores matrix (bottom-left part).

Figure 3

A screenshot of the tool of documents-topics for sentiment mining.

The platform shown in Figure 3 is built on two open-source tools, namely the interactive visualization library pyLDAvis and the tool Colibri / ML. On the one hand, for the analysis and discovery of topics, the topic model visualization library pyLDAvis is used for semi-automatic mining of potential comment topics from unstructured text resources. This process mainly includes preprocessing the data, generating a document word frequency matrix, generating an LDA model, and mining association rules. First, in the data preprocessing phase, the large number of HTML tags and non-Chinese characters are removed. Secondly, in the frequency matrix generation stage, Chinese text analysis software is used to process the text data so that it meets the basic requirements of machine learning. At the same time, meaningless words are excluded.
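The cleaning step in the preprocessing phase can be sketched with the standard library alone. The regular expressions below are assumptions for illustration: tags are matched naively as anything between angle brackets, and "Chinese characters" are taken to be the CJK Unified Ideographs block U+4E00 to U+9FFF. The paper does not name the actual tools or rules used for this step.

```python
import re

# Naive HTML tag pattern (assumption: adequate for simple crawled comments).
TAG_RE = re.compile(r"<[^>]+>")
# Any run of characters outside the basic CJK Unified Ideographs block.
NON_CHINESE_RE = re.compile(r"[^\u4e00-\u9fff]+")


def clean_comment(raw: str) -> str:
    """Strip HTML tags, then replace every non-Chinese run with one space."""
    without_tags = TAG_RE.sub(" ", raw)
    return NON_CHINESE_RE.sub(" ", without_tags).strip()


print(clean_comment("<p>这门课程 very 好!</p>"))
```

Replacing non-Chinese runs with a space rather than deleting them keeps neighbouring Chinese fragments apart, which may help the later word segmentation step. Segmentation and stop-word removal are outside this sketch.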

Finally, in the LDA topic generation stage, the number of topics K needs to be set in advance, and the LDA model is then trained by adjusting the model parameters to generate a model_twords.dat file containing (topicID, word, probability) entries. After that, the different topics and their related words can be obtained for interactive display. In addition, the lambda parameter of pyLDAvis can be fine-tuned to adjust word weights dynamically.

For the second part, FCA builds a concept lattice based on the “document-topic-probability” matrix output by LDA. The open-source tool Colibri / ML is introduced, which was developed to implement a novel geometric representation of structural patterns and their violations in programs. Specifically, the probabilistic relationship between the topic concepts is first mapped to the upper-lower semantic relationship between objects and attributes, and Colibri-java is used to perform reduction, purification and other editing of the formal context. Secondly, based on the topic formal context, concept hierarchy analysis is performed on the “document-topic” matrix to generate the hierarchical topic concept lattice and to obtain the upper-lower relationships between topics. The result is displayed in the document tree view in the form of a Hasse diagram. Finally, we use the “Calculate Association Rules” plugin of ConExp 1.3 to mine the association rules implicit in the topic formal context.

Results and discussions

In order to verify the effectiveness and accuracy of the proposed model in mining learning behaviors and the hidden sentiment under different learning conditions, we first analyze the implication rules and association rules generated by the third module in Section 5. Afterwards, mean absolute error (MAE), precision, recall and F-measure are used to assess the overall performance of the proposed method in comparison with other state-of-the-art models.

Analysis of the implication rules and association rules

The model generates a total of 164 association rules with a confidence level greater than 50%, and 97 implication rules. Because negative sentiment expresses a stronger need for information on certain topics than positive sentiment, the rule sets are filtered accordingly, which leaves a list of 48 association rules and 47 implication rules. In addition, by adjusting the minimum confidence and minimum support, the most efficient of these rules can be highlighted with a smaller number of rule conditions. Assuming a support of three and a confidence greater than 50%, ten association rules remain when the preconditions in the rule sets contain “learner” and the conclusions involve student behaviors. Similarly, when the preconditions in the rule sets relate to learning behavior and the conclusions contain student identity information or student behavior, three implication rules are obtained. The selected rules are listed in Table 7.

From the sentiment mining of the association rules, three valuable basic rules can be summarized. Firstly, students who are learning online are very likely to meet their dynamic learning needs by adjusting their learning behaviors. Rule 1 indicates that students who do not like to provide information will obtain information through information tracking when they are dissatisfied with the content of the course. Secondly, when students with relatively higher academic levels express negative sentiments, they often improve their learning effects through communication with professors, academic authorities, etc. Rule 6 shows that when learners are psychologically stressed and dissatisfied with the learning effect, there is a 67% chance that the learner is a postgraduate student who is willing to communicate their information needs by interacting with others. Finally, when students are dissatisfied with a topic, they usually adjust by themselves through some inherent learning behavior habits, which reflects the autonomy of students in OLC. Rule 4 shows that when learners are dissatisfied with the learning content, there is a 75% chance that they deal with the related issues through information tracking and active sharing.

Besides, the basic rules hidden in the implication rule sets can be summarized in two points. On the one hand, the attitude of the students to the teaching effect greatly affects their learning status, and learning behavior under stress changes accordingly. Rule 2 indicates that when students who tend to interact and are willing to share their personal attitudes show anxiety about the teaching effect, they are often reluctant to alleviate their psychological stress through information search. On the other hand, there is a relatively strong correlation between different learning behaviors. Both Rules 1 and 3 indicate that students who are likely to share information (retrieve information) and are willing to cooperate with each other often track relevant information (share information) to conduct more in-depth learning and show more positive learning sentiment. This conclusion indicates that cultivating students’ good learning behavior habits is more important than the course content, which is of great guiding significance for improving students’ online learning efficiency.

The implication rules and association rules.

Association rules
1 <3> Learner Information provider<AVG NT2 =[100%]=> <3> Information searcher>AVG;
2 <4> Learner Psychological stress PT1 =[75%]=> <3> Information provider<AVG NT3;
3 <4> Learner NT2 =[75%]=> <3> Interaction;
4 <4> Learner NT2 =[75%]=> <3> Information sharer>AVG Information searcher>AVG;
5 <3> Learner Information sharer<AVG Psychological stress Cooperation PT1 =[67%]=> <2> Information provider<AVG NT3;
6 <3> Learner Information searcher>AVG Psychological stress PT1 NT3 =[67%]=> <2> Postgraduate Information searcher<AVG Interaction;
7 <3> Learner Information provider<AVG PT1 PT4 =[67%]=> <2> Information searcher<AVG NT2;
8 <3> Learner Information provider<AVG Information searcher<AVG NT2 =[67%]=> <2> Information sharer>AVG Psychological stress Interaction;
9 <3> Learner NT2 PT4 =[67%]=> <2> Postgraduate Interaction;
10 <3> Learner NT2 PT4 =[67%]=> <2> Information sharer<AVG;

Implication rules
1 <2> Learner Information sharer>AVG Interaction Cooperation ==> Information searcher>AVG Psychological stress PT2;
2 <2> Learner Interaction Information sharer>AVG NT3 ==> Information searcher<AVG Psychological stress;
3 <2> Learner Information searcher>AVG Interaction Cooperation ==> Information sharer>AVG Psychological stress PT4;

Note: The preset condition for the association rules is (Preconditions contain learners =[>50%]=> Conclusions related to student behavior). The preset condition for the implication rules is (Preconditions related to student behaviors => Conclusions related to student identities or student behaviors). When the frequency of a user behavior in Table 7 is greater than the mean value, it can be considered that, under the constraint of the precondition, the student has a relatively high probability of adopting that behavior.

Experimental verification and evaluation criteria

The experimental data of this paper are selected from the topics of “Computer Science”, “Information Science”, “Network Engineering” and “Software Engineering” on http://www.icourses.cn/home/. The data cover the period from January 7, 2019 to January 13, 2019, which is the last week before the final test in many schools. The number of active users online was 40,758, and the number of valid comments was 102,846. The text corpus is divided into 7 consecutive time segments, which are named St1–St7 in turn. Then, we input the data set into the following models to calculate topic sentiment values: the proposed method (TSAOLC), the association rule algorithm (RA) (Zhi, 2002), the clustering algorithm (CG) (Meng, Shen, & Chen, 2013), TextBlob (https://github.com/sloria/TextBlob), and the co-training algorithm (CoT) (Hady & Schwenker, 2008). The results of each method are ranked by sentiment value in descending order and compared with the results verified by human annotators (the ranking bias is set to ±6) (Chen, 2018).

Besides, to evaluate the advantages and effectiveness of the proposed method, we further compare it with the semi-supervised method (co-training algorithm) using an SVM classifier on high-dimensional datasets. We also compare the SVM classifier with Naïve Bayes, multilayer perceptron and random forest classifiers, in order to select a suitable classifier for building a predictive model of the quality of topic sentiment analysis.

The calculation results of the relevant evaluation indicators are shown in Tables 8–10. TSAOLC achieves the highest precision, recall and F-measure and the lowest MAE in every time segment, which supports the effectiveness and stability of the proposed method.

Precision contrast between different methods based on SVM.

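| Method | St1 | St2 | St3 | St4 | St5 | St6 | St7 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| RA | 49.32 | 37.51 | 40.67 | 42.52 | 43.77 | 41.26 | 45.33 |
| CG | 52.33 | 34.96 | 38.79 | 41.68 | 40.17 | 37.74 | 42.59 |
| CoT | 57.73 | 46.28 | 48.85 | 44.84 | 51.39 | 47.77 | 48.25 |
| TextBlob | 58.86 | 45.16 | 46.07 | 42.33 | 52.78 | 45.56 | 52.63 |
| TSAOLC | 61.34 | 50.23 | 54.95 | 49.83 | 53.95 | 62.98 | 54.36 |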

Recall contrast between different methods based on SVM.

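| Method | St1 | St2 | St3 | St4 | St5 | St6 | St7 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| RA | 44.45 | 42.06 | 47.64 | 44.37 | 45.98 | 41.63 | 48.21 |
| CG | 42.68 | 40.97 | 48.86 | 42.07 | 43.63 | 42.88 | 47.71 |
| CoT | 49.99 | 47.38 | 52.84 | 55.36 | 52.09 | 49.23 | 53.84 |
| TextBlob | 54.18 | 45.84 | 51.67 | 58.07 | 62.29 | 53.46 | 60.06 |
| TSAOLC | 56.49 | 58.03 | 62.27 | 59.96 | 65.59 | 58.76 | 62.34 |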

F-measure contrast between different methods based on SVM.

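| Method | St1 | St2 | St3 | St4 | St5 | St6 | St7 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| RA | 46.67 | 39.65 | 43.88 | 43.43 | 44.85 | 41.44 | 46.73 |
| CG | 47.01 | 37.73 | 43.25 | 41.87 | 41.83 | 40.15 | 45.00 |
| CoT | 53.58 | 46.82 | 50.77 | 49.55 | 51.74 | 48.49 | 50.89 |
| TextBlob | 56.42 | 45.50 | 48.71 | 48.97 | 57.14 | 49.19 | 56.10 |
| TSAOLC | 58.82 | 53.85 | 59.38 | 54.43 | 59.20 | 60.80 | 58.08 |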

MAE contrast between different methods based on SVM.

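| Method | St1 | St2 | St3 | St4 | St5 | St6 | St7 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| RA | 98.42 | 92.46 | 90.87 | 88.38 | 89.07 | 91.45 | 95.63 |
| CG | 82.03 | 85.56 | 87.69 | 89.06 | 92.61 | 94.97 | 86.36 |
| CoT | 78.84 | 76.34 | 72.19 | 68.78 | 75.43 | 76.35 | 78.62 |
| TextBlob | 72.93 | 67.45 | 69.37 | 64.92 | 70.14 | 68.62 | 62.15 |
| TSAOLC | 58.99 | 54.56 | 57.32 | 55.25 | 57.20 | 59.15 | 53.13 |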
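For readers who want to check how the indicators relate, the sketch below computes the F-measure as the harmonic mean of precision and recall, and a generic mean absolute error between two rankings. The St1 entries for CG and TSAOLC in the tables above are consistent with the harmonic-mean formula. The paper does not spell out its MAE formula, so the ranking-based definition and the sample ranks here are assumptions for illustration only.

```python
from typing import Sequence


def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same unit as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def mean_absolute_error(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Average absolute difference between predicted and reference values."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


# St1 precision and recall taken from the tables above.
print(round(f_measure(52.33, 42.68), 2))  # CG
print(round(f_measure(61.34, 56.49), 2))  # TSAOLC

# Hypothetical model ranks versus annotator ranks for three topics.
print(mean_absolute_error([1, 2, 3], [2, 2, 5]))
```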

In order to select a suitable classifier to establish a quality prediction model, we further performed a comparative analysis with Naïve Bayes, multilayer perceptron and random forest. The maximum number of decision trees in a random forest is 100. The hidden layer of the multilayer perceptron is 3, and the learning rate is 0.2.

The evaluation performance of the different classifiers on part of the data is shown in Figures 4 and 5. The experimental results in Figure 4 show that the precision, recall and F-measure of the topic sentiment prediction model are around 0.5 on all the data, while the value of MAE stays below 0.7. With the TSAOLC model defined in this paper, the random forest method gives the best overall performance, meaning that its weighted average of the various indicators over the three sub-datasets is the best (the average MAE is 0.4758).

Figure 4

Precision (left) and recall (right) comparison based on various classifiers.

Figure 5

F-measure (left) and MAE (right) comparison based on various classifiers.

Illustrative example

To better illustrate how the proposed method can help teachers or supervisors manage the teaching process, the construction of the topic-clustered concept lattice can be divided into a description layer, a topic feature layer, a learning sentiment analysis layer and a visualization layer. The description layer uses data preprocessing technology to mine the document-text matrix from student text, which mainly includes content such as community postings, classroom discussions and students’ online course selection. The topic feature layer and the learning sentiment analysis layer provide statistical information on learning topic content and topic-level clustering, so that teachers and managers can view each specific topic by adjusting the controls of the browser. Specifically, teachers and supervisors can select specific topic words to obtain a vector of topic variables and visualize the hierarchical dependencies between topics.

The visualization layer mainly includes the dynamic display of the hierarchical topic-clustered concept lattice and the visualization of association rules. In the constructed concept lattice, super-concepts have larger extents than sub-concepts, and sub-concepts have richer intents than super-concepts. A white semicircle node indicates that the concept has an attribute, and a black semicircle node indicates that the concept has an object. As the level increases, the attributes of the concepts in that layer gradually increase, the number of objects belonging to these concepts gradually decreases, and finally a specific object is located. Teachers can obtain all formal concepts that contain a topic word by selecting appropriate confidence thresholds. At the same time, they can use the attributes as preconditions in the association rule sets to obtain association rules and implication rules, so as to identify the student group’s sentiment tendency on specific topics.

In addition, teachers or teaching managers can dynamically display topic concepts with the same clustering characteristics by clicking on different concept nodes of the concept lattice. If this topic concept set is used as a precondition of the association rules, and the number of association rules is tuned through the minimum confidence and minimum support, a more concentrated negative sentiment evaluation of a certain type of topic can be obtained, thereby providing a reasonable basis for curriculum reform.
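The relationship between super-concepts and sub-concepts can be illustrated with the two derivation operators of FCA. The sketch below is not tied to ConExp or Colibri; the function names are hypothetical, and the small context assumes that documents D1, D3 and D8 of the multi-valued context are binarized with a zero threshold (positive score gives P, zero gives O, negative gives N), which is an assumption of this example.

```python
from typing import Dict, Iterable, Set, Tuple

# Hypothetical binary context derived from three documents (see assumptions above).
CONTEXT: Dict[str, Set[str]] = {
    "D1": {"NT1", "PT2", "PT3", "NT4"},
    "D3": {"PT1", "PT2", "PT3", "OT4"},
    "D8": {"PT1", "PT2", "PT3", "OT4"},
}


def extent(context: Dict[str, Set[str]], attrs: Iterable[str]) -> Set[str]:
    """Objects that have every attribute in attrs."""
    wanted = set(attrs)
    return {obj for obj, has in context.items() if wanted <= has}


def intent(context: Dict[str, Set[str]], objs: Iterable[str]) -> Set[str]:
    """Attributes shared by every object in objs (all attributes if objs is empty)."""
    objs = list(objs)
    if not objs:
        return set().union(*context.values())
    return set.intersection(*(context[o] for o in objs))


def concept_of(context: Dict[str, Set[str]],
               attrs: Iterable[str]) -> Tuple[Set[str], Set[str]]:
    """Formal concept generated by an attribute set: (extent, closed intent)."""
    ext = extent(context, attrs)
    return ext, intent(context, ext)


super_concept = concept_of(CONTEXT, ["PT2"])
sub_concept = concept_of(CONTEXT, ["PT1", "PT2"])
print(sorted(super_concept[0]), sorted(super_concept[1]))
print(sorted(sub_concept[0]), sorted(sub_concept[1]))
```

In this toy context the concept generated by PT2 covers all three documents, while the concept generated by PT1 and PT2 covers fewer documents but carries more attributes, which matches the ordering described above.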

Conclusions and perspectives

This paper designs a model for online sentiment analysis of various topics in OLC. Based on the LDA topic detection approach, the model obtains the topic-terminology hybrid matrix and the document-topic hybrid matrix from real users’ comment information. Afterwards, a topic clustering concept lattice based on the FCA model is constructed, in which the topic sentiment can be identified by measuring sentiment scores. In addition, the topic sentiment can be visualized based on the implication and association rules to refine the granularity of sentiment knowledge. Finally, from the results of the experiment, the following conclusions can be obtained:

The proposed model can effectively perceive students’ sentiment tendencies on different topics, which provides powerful practical reference for improving the quality of information services in teaching practice.

The topic-sentiment visualization framework can clarify the hierarchical dependencies between different topics, which lays the foundation for improving the accuracy of teaching content recommendation and optimizing the knowledge coherence of related courses.

In order to improve the accuracy of the topic-sentiment analysis model, follow-up research will focus on optimizing the semantic constraint capabilities between different topics. In addition, exploring the intensity of students’ sentiments and their evolutionary trends is also an interesting direction, which will improve the adaptive ability of opinion mining.

eISSN: 2543-683X
Language: English
Publication timeframe: 4 times per year
Journal Subjects: Computer Sciences, Information Technology, Project Management, Databases and Data Mining