Science

Permanent URI for this community: http://repository.kln.ac.lk/handle/123456789/1

Search Results

Now showing 1 - 2 of 2
  • Item
    Plagiarism detection educational tool: A student’s assessments similarity checker
    (Faculty of Science, University of Kelaniya, Sri Lanka, 2016) Jayakody, J.R.K.C.
    Plagiarism is common among students in higher education institutes for many reasons, such as lack of subject knowledge, poor academic writing skills or difficulty in meeting a given deadline. The most popular method of plagiarism is to take content from web pages or e-books, since it is easy to copy material from the internet, modify it and submit it as original work. Hence, many online and offline software tools exist to detect plagiarism. However, fewer tools exist to identify work copied among students. Therefore, in this research I developed a plagiarism detection tool to identify plagiarized assignments and tutorials. Individual assignments and tutorials given in software engineering courses of the Department of Computing and Information System of Wayamba University were used as the dataset. Natural language processing algorithms were developed to derive statistical features from the assignments, such as bag of words, most frequent words, number of words, named entities and paragraphs. Moreover, a Term Frequency-Inverse Document Frequency (TF-IDF) module was developed to generate a similarity index value among assignments. In addition, a latent semantic analysis module was developed with the word dictionary and vector corpus. Features generated and extracted from every module were used to identify clusters of similar assignments. K-means clustering algorithms in RapidMiner were used to identify the clusters. Most of the submitted assignments fell into a number of clusters. Once the clustering results were verified with the students, it was evident that the automatic cluster classification gave fairly good results.
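The TF-IDF similarity index mentioned in this abstract can be illustrated with a minimal sketch. It is not the author's implementation: the tokenizer, the smoothed IDF formula, the function names and the three sample submissions are all assumptions made for illustration, and only the Python standard library is used.

```python
import math
from collections import Counter


def tokenize(text):
    # Crude tokenizer (assumption): lowercase, keep alphabetic runs only.
    cleaned = "".join(c.lower() if c.isalpha() else " " for c in text)
    return cleaned.split()


def tfidf_vectors(docs):
    # Build one sparse TF-IDF vector (term -> weight) per document.
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed IDF (assumption): log((1 + N) / (1 + df)) + 1.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    vectors = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        vectors.append({t: (c / len(tokens)) * idf[t] for t, c in term_freq.items()})
    return vectors


def cosine(u, v):
    # Cosine similarity between two sparse vectors.
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def similarity_matrix(docs):
    # Pairwise similarity index between all submissions.
    vectors = tfidf_vectors(docs)
    return [[cosine(a, b) for b in vectors] for a in vectors]


# Hypothetical submissions, invented for this example.
submissions = [
    "A stack is a last in first out data structure.",
    "A stack is a last in first out structure for data.",
    "Agile methods favour short iterations and customer feedback.",
]
sim = similarity_matrix(submissions)
for row in sim:
    print(["%.2f" % value for value in row])
```

Each row of `sim` could then be passed, alongside the other extracted features, to a clustering step such as the K-means stage the abstract describes; that stage is not reproduced here.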
  • Item
    Question paper analysis with Natural Language Processing
    (Department of Zoology and Environmental Management, University of Kelaniya, Kelaniya, Sri Lanka, 2016) Jayakody, J.R.K.C.; Perera, P.L.M.
    “Art of Paper Setting” is a very popular term in the educational examination process. As it is an “Art”, teachers should be passionate enough to prepare a good question paper that reflects the educational objectives. Paper setting involves a few steps, and analysis of the paper is the most important of them, as it is the only indicator of how well the questions align with the intended objectives. Humans can analyze questions easily, but implementing similar intelligence in a computer system is a real challenge. Therefore, the purpose of this research is to build an intelligent computer system that can analyze and classify questions. Bloom’s Taxonomy is a world-recognized standard for classifying cognitive skills, so it was used as the guide for categorizing the questions in question papers. In the analysis phase, natural language processing techniques were used to analyze the raw text. With these techniques, the raw texts were first processed, and then meaningful features of the questions, such as verb similarity, stem pattern similarity and stem meaning similarity, were extracted. Next, using machine learning techniques, a model (the brain of the system) was trained on the extracted question features. For model training, several classification algorithms were used: Multinomial Naive Bayes Classifier, Bernoulli Naive Bayes Classifier, Logistic Regression Classifier, Stochastic Gradient Descent Classifier, C-Support Vector Classifier and Linear Support Vector Classifier. The accuracy of each classification algorithm was measured while varying the size of the training data set, and the best-performing algorithm was selected for model training. Finally, the model was trained with that algorithm and used to classify unseen questions. The final model was fine-tuned to reach 80% classification accuracy.
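To make the classification step in this abstract concrete, here is a minimal sketch of one of the listed algorithms, a multinomial Naive Bayes classifier with Laplace smoothing, written with only the Python standard library. It is an illustration under stated assumptions rather than the authors' system: it uses raw word counts instead of the verb and stem similarity features the abstract mentions, and the class name `QuestionNB`, the training questions and their Bloom's Taxonomy labels are all invented for the example.

```python
import math
from collections import Counter, defaultdict


class QuestionNB:
    """Multinomial Naive Bayes over whitespace tokens (illustrative only)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing constant

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        # Words never seen in training are ignored.
        tokens = [w for w in text.lower().split() if w in self.vocab]
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            # Log prior plus smoothed log likelihood of each known token.
            score = math.log(n_docs / total_docs)
            for w in tokens:
                score += math.log((counts[w] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical labelled questions, invented for this example.
training = [
    ("define the term software process", "Remember"),
    ("list the phases of the waterfall model", "Remember"),
    ("explain why testing is needed", "Understand"),
    ("describe the difference between verification and validation", "Understand"),
    ("design a class diagram for a library system", "Create"),
    ("develop a test plan for the given module", "Create"),
]
clf = QuestionNB().fit([q for q, _ in training], [lbl for _, lbl in training])

print(clf.predict("define the term use case"))
print(clf.predict("design a deployment diagram"))
```

Comparing several algorithms, as the abstract describes, would amount to repeating this fit-and-predict cycle for each classifier on held-out questions at different training set sizes and recording the accuracy of each.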