Science

Permanent URI for this community: http://repository.kln.ac.lk/handle/123456789/1


Search Results

Now showing 1 - 4 of 4
  • Item
    A data mining approach for the analysis of undergraduate examination question papers
    (International Research Conference on Smart Computing and Systems Engineering - SCSE 2018, 2018) Brahmana, A.; Kumara, B.T.G.S.; Liyanage, A.L.C.J.
    Examinations play a major role in the teaching, learning and assessment process. Questions are used to obtain information and to assess the knowledge and competence of students. Academics involved in teaching in higher education mostly use final examination papers to assess the retention capability and application skills of students. Questions used to evaluate the different cognitive levels of students may be categorized as higher-order, intermediate-order and lower-order questions. This research work derives a suitable methodology for categorizing final examination question papers based on Bloom’s Taxonomy. The analysis was performed on computer-science-related end-semester examination papers of the Department of Computing and Information Systems of the Sabaragamuwa University of Sri Lanka. Bloom’s Taxonomy identifies six levels in the cognitive domain. The study was conducted to check whether examination questions comply with the requirements of Bloom’s Taxonomy at the various cognitive levels, and the appropriate category of the questions in each examination paper was determined accordingly. Over 900 questions obtained from 30 question papers were used for the analysis. Natural language processing techniques were used to identify the significant keywords and verbs that are useful in determining the suitable cognitive level. A rule-based approach was used to determine the level of each question paper in light of Bloom’s Taxonomy. An effective model that enables determining the level of an examination paper can be derived as the final outcome.
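    The verb-matching idea in this abstract can be sketched as a simple rule table. The verb lists and the `classify_question` helper below are illustrative assumptions, not the authors' actual rules:

    ```python
    # Hypothetical sketch of a rule-based mapper from question verbs to
    # Bloom's cognitive levels, ordered from lowest to highest level.
    BLOOM_VERBS = {
        "remember":   {"define", "list", "state", "name"},
        "understand": {"explain", "describe", "summarize"},
        "apply":      {"calculate", "implement", "demonstrate"},
        "analyze":    {"compare", "differentiate", "analyze"},
        "evaluate":   {"justify", "critique", "assess"},
        "create":     {"design", "develop", "propose"},
    }

    def classify_question(question: str) -> str:
        """Return the highest Bloom level whose keyword verbs appear."""
        words = {w.strip(".,?!").lower() for w in question.split()}
        matched = "unclassified"
        for level, verbs in BLOOM_VERBS.items():  # dicts preserve order
            if words & verbs:
                matched = level  # later (higher) levels overwrite earlier ones
        return matched
    ```

    A real system, as the abstract notes, would first extract keywords and verbs with NLP techniques rather than by naive whitespace splitting.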
  • Item
    Incident crowdsourcing, tracking and ranking application for environmental problems and issues in Sri Lanka using natural language processing
    (Faculty of Science, University of Kelaniya, Sri Lanka, 2016) Lakshika, M.V.P.T.; Senanayake, S.H.D.
    Sri Lanka faces numerous critical environmental problems and issues such as deforestation, pollution of water bodies, natural disasters and many other urban problems. Many of the communities who suffer from such environmental problems do not obtain solutions to their vital problems due to the lack of awareness, inefficiency and carelessness of responsible parties such as environment-related authorities and ministries in Sri Lanka. Within the past few years, virtual communities in Sri Lanka have used social media to highlight numerous forms of social problems. The intention of this research is to turn the awareness of the virtual community into pressure on the responsible parties in Sri Lanka, which can work as a driving force for stimulating reasonable solutions to environmental problems. The web-based application discussed in this research has been designed to obtain content on such environmental problems by soliciting contributions through crowdsourcing. The online community can report environmental problems using text and images. Users of the application can vote and comment on the problems and issues posted in the application. Each problem receives points based on the up or down votes and comments it has received, and the application then ranks genuine, high-quality environmental problems while allocating points to each user in the system. Text categorization, a subtask of information retrieval used in this application, is very effective for filtering environment-related information before it is posted to the system. Further, this application uses the semantic information, or polarity, of user comments (positive, negative or neutral), which has not yet been exploited for the most important natural language applications. This research also discusses the interaction between natural language processing and text categorization. Based on users' negative or positive comments and up or down votes, the application calculates the points for each post according to a predefined criterion and highlights genuine, high-quality environmental problems and issues in Sri Lanka.
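    The "predefined criterion" is not specified in the abstract; one minimal sketch of such a scoring rule, with entirely assumed weights, might look like this:

    ```python
    # Illustrative post-scoring rule: combine vote balance and comment
    # polarity balance. The weights are assumptions for the sketch only.
    def score_post(upvotes: int, downvotes: int,
                   positive_comments: int, negative_comments: int) -> int:
        VOTE_WEIGHT = 2      # assumed weight per net vote
        COMMENT_WEIGHT = 1   # assumed weight per net polarized comment
        return (VOTE_WEIGHT * (upvotes - downvotes)
                + COMMENT_WEIGHT * (positive_comments - negative_comments))
    ```

    Ranking then reduces to sorting posts by this score, with comment polarity supplied by the sentiment step described above.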
  • Item
    An algorithm for plagiarism detection in Sinhala language
    (Faculty of Science, University of Kelaniya, Sri Lanka, 2016) Basnayake, S.F.; Wijekoon, H.; Wijayasiriwardhane, T.K.
    According to the Merriam-Webster dictionary, the simple definition of the verb plagiarize is “to use the words or ideas of another person as if they were your own words or ideas”. Many software tools to aid in detecting plagiarism are available for the English language, but equivalent tools are not yet available specifically for the Sinhala language. Though language-independent tools that work on many languages are available, they generally give poor results as they do not consider language-specific features. There are some detection methods proposed for Asian languages such as Hindi, Malayalam, Arabic and Persian, which have close relationships with and properties similar to the Sinhala language. All of those methods use language-specific rules, and they even outperform the commercially available tools. These findings are evidence that language-specific plagiarism detection is more effective than language-independent plagiarism detection, since some paraphrasing techniques can be used to mislead language-independent systems. The Sinhala language is constitutionally recognized as an official language of Sri Lanka, along with Tamil. Due to the complexity of the language structure and rules of grammar, language-independent tools seem to provide poor results when used for plagiarism detection in Sinhala documents. In this research, we propose a novel plagiarism detection algorithm built around content-based methods specific to the Sinhala language. The methodology of this study follows both experimental and build approaches. The proposed plagiarism detection system has two modules, namely a text pre-processing module and a similarity detection module. The text pre-processing module pre-processes the text files to standardize the text sources using techniques such as stop-word removal, number replacement, lemmatization, synonym recognition and creating n-grams. The similarity detection module then analyses the pre-processed text using the Jaccard coefficient and the cosine similarity coefficient to measure the similarity between two documents. A prototype Sinhala-language plagiarism detection system will be implemented using the proposed method, and several combinations of the above techniques will be used to discover the best combination. Testing and statistical performance evaluation will be carried out using a sample of source text files and plagiarized text files in the Sinhala language, taking expert judgements into consideration. The final outcome of this research study is an effective software application for plagiarism detection in Sinhala-language documents.
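    The two similarity measures named above are standard; a minimal sketch of n-gram extraction plus Jaccard and cosine similarity (the helper names are ours, and real input would be pre-processed Sinhala tokens, not the toy tokens shown) could be:

    ```python
    # Sketch of the similarity-detection step: compare two documents as
    # bags/sets of token n-grams using Jaccard and cosine coefficients.
    from collections import Counter
    from math import sqrt

    def ngrams(tokens, n=3):
        """All contiguous n-grams of a token list."""
        return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

    def jaccard(a, b):
        """|A ∩ B| / |A ∪ B| over the sets of n-grams."""
        sa, sb = set(a), set(b)
        return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

    def cosine(a, b):
        """Cosine of the angle between n-gram count vectors."""
        ca, cb = Counter(a), Counter(b)
        dot = sum(ca[k] * cb[k] for k in ca)
        na = sqrt(sum(v * v for v in ca.values()))
        nb = sqrt(sum(v * v for v in cb.values()))
        return dot / (na * nb) if na and nb else 0.0
    ```

    Jaccard ignores repetition while cosine weights repeated n-grams, which is one reason the study evaluates both.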
  • Item
    Natural language processing framework: WordNet based sentimental analyzer
    (Faculty of Science, University of Kelaniya, Sri Lanka, 2016) Jayakody, J.R.K.C.
    Sentiment analysis is a technique used to classify different types of documents as positive, negative or neutral. Handwritten forms, mails, telephone surveys and online feedback forms are used to collect customer feedback about products and services. In fact, sentiment analysis is the technique used to mine online and offline customer feedback data to gain insights into products and services automatically. Since business types differ, it is quite challenging to develop a generic sentiment analyzer. Therefore, this ongoing research focuses on developing a generic framework that can be extended further in the future to develop the best generic sentiment analyzer. Several online customer feedback forms were used as the dataset. A web-page scraping module was developed to extract the reviews from web pages, and chunk and chink rules were developed to extract the comparative and superlative adverbs to build the knowledge base. The website Thesaurus.com was used to build the test data with synonyms of good, bad and neutral. Next, the WordNet database was used with different semantic similarity algorithms, such as path similarity, Leacock-Chodorow similarity, Wu-Palmer similarity and Jiang-Conrath similarity, to test the sentiments. The accuracy of this framework was further improved with a vector model built with natural language processing techniques. A labelled dataset of Amazon product reviews provided by the University of Pennsylvania was used to test the accuracy. The framework was developed to change the multiplier value based on the domain. The accuracy of the final sentiment value was given as a percentage of the positive or negative type. This framework gave fairly accurate results, which are useful for generating good insights from user reviews.
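    Path similarity, the first of the WordNet measures listed above, scores two concepts as 1 / (1 + length of the shortest is-a path between them). The miniature hand-made taxonomy below is purely illustrative; the actual framework queries WordNet itself (e.g. via NLTK's WordNet interface):

    ```python
    # Toy illustration of WordNet-style path similarity over an assumed
    # miniature is-a hierarchy (child -> parent).
    HYPERNYM = {
        "good": "quality", "excellent": "quality",
        "bad": "quality", "quality": "attribute",
    }

    def path_to_root(word):
        """Chain of hypernyms from a word up to the hierarchy root."""
        path = [word]
        while path[-1] in HYPERNYM:
            path.append(HYPERNYM[path[-1]])
        return path

    def path_similarity(a, b):
        """1 / (1 + shortest is-a path between a and b); 0 if unrelated."""
        pa, pb = path_to_root(a), path_to_root(b)
        common = [node for node in pa if node in pb]
        if not common:
            return 0.0
        lca = common[0]  # lowest common ancestor
        dist = pa.index(lca) + pb.index(lca)
        return 1.0 / (1.0 + dist)
    ```

    The other measures (Leacock-Chodorow, Wu-Palmer, Jiang-Conrath) refine this idea by normalizing for taxonomy depth or information content.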