Qiaozhi Wang, Hao Xue, Fengjun Li, Dongwon Lee and Bo Luo
The paper deals with the evaluation and ranking of students according to two main criteria of learning: theoretical knowledge and practical skills. These criteria are divided into several sub-criteria that reflect different aspects of the learning outcomes. To perform such a complex evaluation, a utility function based on the simple multi-attribute rating technique (SMART) is proposed. This new utility function includes not only the evaluation scores and the weighting coefficients for criteria importance, but also additional coefficients that indicate how much theoretical knowledge and practical skills contribute to the aggregated final assessment. The formulated model is applied to the assessment of students in web programming. The students are ranked in three different cases, in which theoretical knowledge and practical skills contribute differently to the aggregated assessment. The results demonstrate the applicability of the described approach: the ranking changes depending on the importance given to the theoretical and practical aspects.
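The kind of utility function described above can be sketched as a weighted sum in which each group of sub-criteria gets its own share coefficient. This is a minimal illustration, not the authors' exact formulation: the function name, the coefficients `alpha` and `beta`, and all scores and weights are assumptions made up for the example.

```python
def aggregated_score(theory, practice, w_theory, w_practice, alpha, beta):
    """Hypothetical SMART-style utility:
    U = alpha * sum(w_t * s_t) + beta * sum(w_p * s_p),
    where alpha and beta set the share of theory and practice."""
    t = sum(w * s for w, s in zip(w_theory, theory))
    p = sum(w * s for w, s in zip(w_practice, practice))
    return alpha * t + beta * p


# Invented scores for two students on two theory and two practice sub-criteria.
students = {
    "A": {"theory": [9, 8], "practice": [5, 6]},
    "B": {"theory": [6, 5], "practice": [9, 9]},
}
w_theory = [0.5, 0.5]      # assumed sub-criteria weights (sum to 1)
w_practice = [0.5, 0.5]


def rank(alpha, beta):
    """Return student names ordered from best to worst for the given shares."""
    scores = {
        name: aggregated_score(s["theory"], s["practice"],
                               w_theory, w_practice, alpha, beta)
        for name, s in students.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


print(rank(0.7, 0.3))  # theory dominates
print(rank(0.3, 0.7))  # practice dominates
```

With these invented numbers the order of the two students flips when the emphasis moves from theory to practice, which is the effect the three ranking cases in the abstract are meant to expose.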
A model for predicting the outcome indicators of e-Learning, based on the Balanced ScoreCard (BSC) and built with Neural Networks (NN), is proposed. The development of NN models runs into the problem of a small data sample. To reduce the number of variables and increase the number of training examples, the data are preprocessed with interpolation and Principal Component Analysis (PCA). A method for optimizing the structure of the neural network is applied to linear and nonlinear neural network architectures. The highest prediction accuracy is obtained by applying the Optimal Brain Damage (OBD) method to the nonlinear neural network. The efficiency and applicability of the suggested method are demonstrated by numerical experiments on real data.
Jagadish S. Kallimani, K. G. Srinivasa and B. Eswara Reddy
The method for filtering information from large volumes of text is called Information Extraction. It is a more limited task than full text understanding. In full text understanding, we aim to represent explicitly all the information in a given text. In Information Extraction, by contrast, we delimit in advance, as part of the task specification, the semantic range of the result. Only the extractive summarization method is considered and developed in this study. This article proposes a model for summarizing large documents using a novel approach, applied to one of the South Indian regional languages (Kannada). It deals with single-document summarization based on a statistical approach. The purpose of a summary of an article is to facilitate quick and accurate identification of the topic of the published document. The objective is to save prospective readers’ time and effort in finding useful information in a long article. The results are also analysed and compared with those obtained for the English language.
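A statistical extractive summarizer of the general kind mentioned above can be sketched by scoring each sentence by the average document frequency of its words and keeping the top-scoring sentences in their original order. This is a generic illustration, not the authors' model: the scoring rule and the function name `summarize` are assumptions, and the regex tokenizer is a placeholder suited to English-like text. Kannada text would need a tokenizer that keeps combining vowel signs attached to their words.

```python
import re
from collections import Counter


def summarize(text, n_sentences=1):
    """Return the n highest-scoring sentences, in document order."""
    # Naive sentence split on ., ! or ? followed by whitespace (an assumption).
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(re.findall(r"\w+", text.lower()))

    def score(sentence):
        tokens = re.findall(r"\w+", sentence.lower())
        # Average word frequency, so long sentences are not favoured by length alone.
        return sum(freq[t] for t in tokens) / len(tokens) if tokens else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:n_sentences])
    return " ".join(sentences[i] for i in chosen)


sample = "Cats chase mice. Cats sleep a lot. Dogs bark."
print(summarize(sample, 1))
```

The sketch omits stop-word removal and stemming, both of which a practical system would likely add before counting frequencies.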
Gunjan Ansari, Tanvir Ahmad and Mohammad Najmud Doja
In this work, we propose an ensemble feature selection method that combines local and global filters to reduce the high dimensionality of the feature space and increase the accuracy of spam review classification. The selected features are then used to train various classifiers for spam detection. Experimental results with four classifiers on two available datasets of hotel reviews show that the proposed feature selector improves spam classification performance in terms of well-known metrics such as the AUC score.
Daniela Borissova, Ivan Mustakerov and Dilian Korsemov
The paper proposes a business intelligence tool based on group decision making. The group decision making uses a combinatorial optimization modeling technique. It takes into account the weighting coefficients that the decision makers assign to the evaluation criteria, together with their scores for the alternatives with respect to these criteria. The proposed optimization model also considers the knowledge level of the group members involved as decision makers. The model is implemented in a three-layer architecture of a Web application for business intelligence by group decision making. The developed Web application is numerically tested on a representative software selection problem with six decision makers, three alternatives and 19 evaluation criteria. The results show the practical applicability and effectiveness of the proposed approach.
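The ingredients listed above (criteria weights per decision maker, scores per alternative, and a knowledge-level coefficient per decision maker) can be combined in a simple weighted-sum aggregation. The sketch below is a stand-in for illustration only and does not reproduce the paper's combinatorial optimization model. The function name, the argument layout and all numbers are assumptions.

```python
def best_alternative(expertise, weights, scores):
    """Pick the alternative with the highest aggregated score.

    expertise[k]     -- assumed knowledge-level coefficient of decision maker k
    weights[k][j]    -- weight that decision maker k gives to criterion j
    scores[k][i][j]  -- score that decision maker k gives alternative i on criterion j
    """
    n_alt = len(scores[0])
    totals = [0.0] * n_alt
    for lam, w, s in zip(expertise, weights, scores):
        for i in range(n_alt):
            totals[i] += lam * sum(wj * sij for wj, sij in zip(w, s[i]))
    best = max(range(n_alt), key=totals.__getitem__)
    return best, totals


# Invented data: two decision makers, two alternatives, two criteria.
expertise = [0.75, 0.25]
weights = [[0.5, 0.5], [0.25, 0.75]]
scores = [
    [[8, 4], [6, 8]],   # decision maker 0
    [[2, 4], [8, 8]],   # decision maker 1
]
best, totals = best_alternative(expertise, weights, scores)
print(best, totals)
```

Because only one alternative is selected, exhaustive evaluation is enough here. A real model with additional constraints would call for an optimization solver.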
Linked data has been widely recognized as an important paradigm for representing data, and one of the most important aspects of supporting its use is the discovery of links between datasets. Many datasets contain a significant amount of textual information in the form of labels, descriptions and documentation about their elements, and the foundation of precise linking lies in applying semantic textual similarity to this information. However, most linking tools so far rely only on simple string similarity metrics such as Jaccard scores. We present an evaluation of several metrics that have performed well in recent semantic textual similarity evaluations and apply them to linking existing datasets.
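For reference, the Jaccard score mentioned above is the size of the intersection of two token sets divided by the size of their union. The sketch below assumes whitespace tokenization and lowercasing, and the labels are invented. It also shows why such a metric falls short for semantic linking: synonyms with no shared tokens score zero.

```python
def jaccard(a, b):
    """Jaccard similarity between the lowercase token sets of two labels."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0  # convention chosen here: two empty labels count as identical
    return len(ta & tb) / len(ta | tb)


print(jaccard("city of Paris", "Paris city"))  # shared tokens, high score
print(jaccard("automobile", "car"))            # same meaning, no shared tokens
```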
El Moatez Billah Nagoudi, Ahmed Khorsi, Hadda Cherroun and Didier Schwab
Measuring the amount of information shared between two documents is key to addressing a number of Natural Language Processing (NLP) challenges such as Information Retrieval (IR), Semantic Textual Similarity (STS), Sentiment Analysis (SA) and Plagiarism Detection (PD). In this paper, we report a plagiarism detection system based on two layers of assessment: 1) fingerprinting, which simply compares the documents’ fingerprints to detect verbatim reproduction; 2) word embedding, which uses the semantic and syntactic properties of words to detect much more complicated reproductions. Moreover, Word Alignment (WA), Inverse Document Frequency (IDF) and Part-of-Speech (POS) weighting are applied to the examined documents to support the identification of the words that are most descriptive in each textual unit. In the present work, we focus on Arabic documents and evaluate the performance of the system on a dataset containing three types of plagiarism: 1) simple reproduction (copy and paste); 2) word and phrase shuffling; 3) intelligent plagiarism, including synonym substitution, diacritics insertion and paraphrasing. The results show a recall of 88% and a precision of 86%. Compared with the systems that participated in the Arabic Plagiarism Detection Shared Task 2015, our system outperforms all of them, with a plagiarism detection score (Plagdet) of 83%.
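The first layer, fingerprinting, can be illustrated by hashing overlapping word n-grams and comparing the resulting sets. This is a minimal sketch under assumptions not stated in the abstract: word 3-grams, SHA-256 hashes, whitespace tokenization and a Jaccard-style overlap are all choices made for the example, and the function names are hypothetical.

```python
import hashlib


def fingerprints(text, n=3):
    """Set of hashes of all overlapping word n-grams in the text."""
    words = text.lower().split()
    grams = (" ".join(words[i:i + n]) for i in range(len(words) - n + 1))
    return {hashlib.sha256(g.encode("utf-8")).hexdigest() for g in grams}


def overlap(a, b, n=3):
    """Share of n-gram fingerprints common to both texts (0.0 to 1.0)."""
    fa, fb = fingerprints(a, n), fingerprints(b, n)
    if not fa or not fb:
        return 0.0
    return len(fa & fb) / len(fa | fb)


source = "the quick brown fox jumps over the lazy dog"
shuffled = "dog lazy the over jumps fox brown quick the"
print(overlap(source, source))    # verbatim copy
print(overlap(source, shuffled))  # same words, reversed order
```

A verbatim copy shares every fingerprint, while the reordered text shares none even though it reuses every word. That gap is what a second, embedding-based layer is meant to cover.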