Symposia & Conferences
Permanent URI for this community: http://repository.kln.ac.lk/handle/123456789/10213
Search Results (9 results)
Item Data Mining Approach for Identifying Suitable Sport for Beginners (IEEE International Research Conference on Smart Computing & Systems Engineering (SCSE) 2019, Department of Industrial Management, Faculty of Science, University of Kelaniya, Sri Lanka, 2019) Amarasena, P.T.; Kumara, B. T. G. S.; Jointion, S.
Anthropometric measurements are generally used to determine and predict achievement in different sports. An athlete's anthropometric and physical characteristics may constitute important preconditions for successful participation in any given sport. Further, anthropometric profiles indicate whether a player would be suitable for competition at the highest level in a specific sport. Recently, more research has been carried out on sports data mining. In this study, we propose an approach to identify the most suitable sport for beginners using data mining and anthropometric profiles. We propose a clustering-based approach, applying a spatial clustering technique called the Spherical Associated Keyword Space, which projects the clustering result from a three-dimensional sphere onto a two-dimensional (2D) spherical surface for 2D visualization. An empirical study of our approach has demonstrated the effectiveness of the clustering results.

Item A data mining approach for the analysis of undergraduate examination question papers (International Research Conference on Smart Computing and Systems Engineering - SCSE 2018, 2018) Brahmana, A.; Kumara, B.T.G.S.; Liyanage, A.L.C.J.
Examinations play a major role in the teaching, learning and assessment process. Questions are used to obtain information and to assess the knowledge and competence of students. Academics involved in the teaching process in higher education mostly use final examination papers to assess the retention capability and application skills of students. Questions used to evaluate different cognitive levels of students may be categorized as higher-order, intermediate-order and lower-order questions.
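The higher/intermediate/lower-order categorization above can be sketched as a rule-based verb match. The verb lists below are illustrative assumptions, not the study's actual keyword sets:

```python
# Hypothetical verb lists per cognitive-level group; the study's real
# keyword sets are not given in the abstract.
LEVELS = {
    "higher": {"design", "evaluate", "justify", "critique", "propose"},
    "intermediate": {"apply", "solve", "demonstrate", "classify"},
    "lower": {"define", "list", "name", "state", "recall"},
}

def classify_question(question: str) -> str:
    """Assign a question to a cognitive-level group by matching its verbs."""
    words = {w.strip(".,?").lower() for w in question.split()}
    for level in ("higher", "intermediate", "lower"):  # prefer higher levels
        if words & LEVELS[level]:
            return level
    return "unclassified"
```

Checking higher levels first means a question mixing verbs is credited with the most demanding cognitive level it reaches.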
This research work derives a methodology to categorize final examination question papers based on Bloom's Taxonomy. The analysis was performed on computer-science-related end-semester examination papers in the Department of Computing and Information Systems of Sabaragamuwa University of Sri Lanka. Bloom's Taxonomy identifies six levels in the cognitive domain. The study was conducted to check whether examination questions comply with the requirements of Bloom's Taxonomy at the various cognitive levels, and the appropriate category of the questions in each examination paper was determined accordingly. Over 900 questions obtained from 30 question papers were used in the analysis. Natural language processing techniques were used to identify the significant keywords and verbs that help determine the appropriate cognitive level, and a rule-based approach was used to determine the level of each question paper in the light of Bloom's Taxonomy. An effective model for determining the level of an examination paper was derived as the final outcome.

Item Analysis of historical accident data to determine accident prone locations and cause of accidents (International Research Conference on Smart Computing and Systems Engineering - SCSE 2018, 2018) Ifthikar, A.; Hettiarachchi, S.
Road traffic accidents cause great distress and destroy the lives of many individuals. In spite of various attempts to solve this problem, it remains a major cause of death. This paper proposes a system to analyse historical accident data and subsequently identify accident-prone areas and their relevant causes by clustering accident location coordinates.
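The clustering of accident location coordinates could, for instance, use a simple greedy distance-threshold scheme; the abstract does not name the algorithm used, so this sketch and its 0.5 km radius are assumptions for illustration:

```python
import math

def haversine_km(p, q):
    """Approximate great-circle distance between two (lat, lon) points in km."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 6371 * 2 * math.asin(math.sqrt(a))

def cluster_accidents(points, radius_km=0.5):
    """Greedy leader clustering: each point joins the first cluster whose
    leader is within radius_km, otherwise it starts a new cluster."""
    clusters = []  # list of (leader, members)
    for p in points:
        for leader, members in clusters:
            if haversine_km(leader, p) <= radius_km:
                members.append(p)
                break
        else:
            clusters.append((p, [p]))
    return clusters
```

Each resulting cluster marks an accident-prone spot whose member records can then be inspected for common causes.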
This system, once developed, can be used to warn drivers and also to aid fully autonomous automobiles in taking precautions at accident-prone areas.

Item Applicability of crowdsourcing for traffic-less travelling in Sri Lankan context (International Research Conference on Smart Computing and Systems Engineering - SCSE 2018, 2018) Senanayake, J.M.D.; Wijayanayake, J.
Traffic is one of the most significant problems in Sri Lanka. Valuable time can be saved if there is a proper way to predict traffic and recommend the best route, considering the time factor and people's satisfaction with various transportation methods. Therefore, in this research, data related to user mobility were collected and studied using crowdsourcing together with data mining techniques, and based on the observations an algorithm was developed to overcome the problem. Using the developed techniques, the best transportation method can be predicted, so people can choose the best time slots and transportation methods when planning journeys. The algorithm correctly predicts the best traffic-less travelling method for the studied area for each given day and time. This research has shown that data mining concepts together with crowdsourcing can be applied to determine the best transportation method in the Sri Lankan context. A thorough analysis with an extended data set showed that this research can be extended to predict the best transportation method, taking existing traffic into account, in all areas.

Item Social media mining for post-disaster management - A case study on Twitter and news (International Research Conference on Smart Computing and Systems Engineering - SCSE 2018, 2018) Banujan, K.; Kumara, B.T.G.S.; Incheon Paik
A natural disaster is a natural event which can cause damage to both lives and property. Social media are capable of sharing information on a real-time basis.
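The day-and-time-slot recommendation in "Applicability of crowdsourcing for traffic-less travelling" above can be illustrated by aggregating crowdsourced travel-time reports; the record layout and the minimum-average rule are assumptions for illustration, not the paper's actual algorithm:

```python
from collections import defaultdict

def best_method(reports, day, slot):
    """Pick the transport method with the lowest average reported travel
    time for a given day and time slot.
    reports: iterable of (day, slot, method, minutes) crowdsourced records."""
    totals = defaultdict(lambda: [0.0, 0])  # method -> [sum, count]
    for d, s, method, minutes in reports:
        if (d, s) == (day, slot):
            totals[method][0] += minutes
            totals[method][1] += 1
    if not totals:
        return None  # no crowdsourced data for this day/slot
    return min(totals, key=lambda m: totals[m][0] / totals[m][1])
```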
Post-disaster management can be improved to a great extent if social media are mined properly. After identifying the need and the possibility of addressing it through social media, we chose Twitter for mining and news for validating the Twitter posts. In the first stage, we fetched Twitter posts and news articles from the Twitter API and the News API, respectively, using predefined keywords relating to the disaster. In the second stage, those posts were cleaned and the noise was reduced. In the third stage, we extracted the disaster type and geolocation of the posts using a Named Entity Recognizer library API. In the final stage, we compared the Twitter data with the news data to rate the trueness of each Twitter post. The final integrated results show that 80% of the Twitter posts obtained a rating of "3" and 15% obtained a rating of "2". We believe that by using our model we can alert organizations to carry out their disaster-management activities. Our future development is twofold. Firstly, we plan to integrate other social media, e.g. Instagram and YouTube, to fetch data. Secondly, we plan to integrate weather data into the system in order to improve the precision and accuracy of finding the trueness of the disaster and its location.

Item Data mining model for identifying high-quality journals (International Research Conference on Smart Computing and Systems Engineering - SCSE 2018, 2018) Jayaneththi, J.K.D.B.G.; Kumara, B.T.G.S.
The focus in local universities over the last decade has shifted from teaching at undergraduate and postgraduate levels to conducting research and publishing in reputed local and international journals. Such publications enhance the reputation of the individual and the university. The last two decades have seen a rapid rise in open access journals. This has led to quality issues, and hence choosing journals for publication has become an issue.
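The tweet-trueness rating in "Social media mining for post-disaster management" above might be sketched as follows; the matching and rating rules here are assumptions inferred only loosely from the 1-3 ratings the abstract reports:

```python
def rate_tweet(tweet, news_items):
    """Rate a tweet's trueness against news reports: 3 if a news item
    matches both disaster type and location, 2 if only the type matches,
    1 otherwise. (This exact scheme is an illustrative assumption.)
    tweet and news_items entries are dicts with 'type' and 'location' keys,
    e.g. as extracted by a named-entity recognizer."""
    best = 1
    for news in news_items:
        if news["type"] == tweet["type"]:
            if news["location"] == tweet["location"]:
                return 3  # fully corroborated by news
            best = max(best, 2)  # disaster type corroborated, location not
    return best
```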
Many of these journals focus on the monetary aspect and will publish articles that previously may not have been accepted; typical issues include the design of the study, the methodology and the rigour of the analysis. This has serious consequences, as some of these papers are cited and used as a basis for further studies. Another cause for concern is that honest researchers are sometimes duped into believing that such journals are legitimate and may end up publishing good material in them. In addition, at present it is very difficult to distinguish fake journals from legitimate ones. Therefore, the objective of this research was to introduce a data mining model which helps researchers identify the highest-quality and most suitable journals in which to publish their findings. The study focused on journals in the field of Computer Science. The Journal Impact Factor, H-index, Scientific Journal Rankings, Eigenfactor Score, Article Influence Score and Source Normalized Impact per Paper journal metrics were used to build the data mining model. Journals were clustered into five clusters using the K-Means clustering algorithm, and the clusters were interpreted as excellent, good, fair, poor and very poor based on the results.

Item Predicting landslides in hill country of Sri Lanka using data mining techniques (Faculty of Science, University of Kelaniya, Sri Lanka, 2016) Karunanayake, K.B.A.A.M.; Wijayanayake, W.M.J.I.
A landslide is the movement of rock, debris or earth down a slope. Landslides result from the failure of the materials which make up the hill slope and are driven by the force of gravity. In the Sri Lankan context, landslides are the major natural disaster in the hill country, causing economic and ecological damage while endangering human lives. Therefore, fast detection plays an important role in avoiding or minimizing the hazards.
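The journal grouping in "Data mining model for identifying high-quality journals" above rests on K-Means over journal-metric vectors. A plain, self-contained K-Means sketch follows; the toy data and k=2 in the usage below are illustrative, not the study's five clusters or real metric values:

```python
import random

def kmeans(points, k, iters=50, seed=0):
    """Plain k-means over numeric metric vectors (e.g. impact factor, H-index).
    Returns the final centroids and the grouping of points."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialize from the data
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            # assign each point to its nearest centroid (squared distance)
            i = min(range(k),
                    key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            groups[i].append(p)
        # recompute each centroid as the mean of its group
        centroids = [tuple(sum(col) / len(g) for col in zip(*g)) if g else centroids[i]
                     for i, g in enumerate(groups)]
    return centroids, groups
```

With five clusters over the six metrics named in the abstract, the resulting groups would then be labelled excellent through very poor by inspecting their centroids.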
Currently in Sri Lanka, the National Building Research Organization (NBRO), under the Ministry of Disaster Management, issues landslide early-warning messages based on the Landslide Hazard Zonation Map and readings from automated rain gauges. However, the map covers only a specific point in time and does not take current weather and geographical conditions into account, and although current rainfall is collected using automated rain gauges, this facility is not established everywhere. As the hill country is a rapidly developing area, some causative factors can change from time to time due to human intervention or natural incidents. Therefore, there is a clear problem in predicting landslides based on the current situation; moreover, the current approach requires an expert. The main objective of this study is to develop a model which can be embedded in a user-friendly and efficient computer program, usable by any ordinary person living in a landslide-prone area to determine "Am I safe in the current place with regard to the current geological and weather conditions?" using data on the current situation rather than living blindly until NBRO issues warnings. Landslides often occur at specific locations under certain topographic and geologic conditions within the country, so it is important to utilize existing data to predict landslides. Data mining techniques can be used to develop prediction models from existing data, and the Plan-Do-Check-Act data mining methodology has been selected for this study. Initially, the study is limited to homogeneous areas of the Badulla and Nuwara-Eliya districts, which are already identified as landslide-prone areas. Based on the homogeneity of these areas, models will be developed incorporating only three causative factors that vary due to human intervention and natural incidents, namely slope, surface overburden and land use, together with one triggering factor, rainfall.
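A decision-tree-style prediction over the three causative factors and the triggering factor above might look like the following; every threshold and label here is a hypothetical placeholder, not a value from the study:

```python
def landslide_risk(slope_deg, rainfall_mm_24h, overburden_m, land_use):
    """Decision-tree-style risk rules over slope, 24-hour rainfall,
    surface overburden depth and land use. All thresholds are
    hypothetical placeholders, not values learned in the study."""
    if slope_deg < 11:
        return "low"  # gentle slopes rarely fail
    if rainfall_mm_24h >= 150 and overburden_m >= 3:
        return "high"  # heavy rain on deep overburden
    if land_use in {"cleared", "cultivated"} and rainfall_mm_24h >= 75:
        return "medium"  # disturbed land with moderate rain
    return "low"
```

In the study itself such rules would be learned from the historical maps by a decision tree algorithm rather than written by hand.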
The historical data are collected using contours, the land-use map, the overburden map and the map of landslides. From among the predictive data mining techniques, a decision tree algorithm and a neural network will be used to develop the prediction models. Cross-validation will be used to evaluate the models and ultimately to select the better of the decision tree model and the neural network model.

Item Analysing mobility patterns of people to determine the best transportation method (Faculty of Science, University of Kelaniya, Sri Lanka, 2016) Senanayake, J.M.D.; Wijayanayake, W.M.J.I.
With the technological enhancements related to the Internet, wireless communication, big data analytics, sensor-based data and machine learning, new paradigms have been enabled for processing the large amounts of data collected from various sources. In past decades, both coarse- and fine-grained sensor data were used to perform location-driven activity inference. In recent years, GPS phones and GPS-enabled PDAs have become prevalent in people's daily lives; with such devices, people are more capable than ever of tracing their outdoor mobility and using location-based applications. The data collected from these GPS-enabled devices, with the help of the IoT, have opened up many research areas related to user mobility. In this research, data on user locations during any outdoor movement are collected using mobile devices connected to the Internet and mined using data mining techniques, leading to an algorithm that models and analyses these big data to identify mobility patterns, predict traffic, assess satisfaction with transportation methods, and so on. The data for this research will be collected using a mobile application installed on smart devices such as smartphones and tablet PCs.
In this application, the user has to enter the activity that he or she is currently doing, and, when travelling, the method of transportation and the user's opinion of it. The GPS coordinates (longitude and latitude), forming GPS trajectories, along with the time stamp and the date, will be automatically acquired from the user's IoT device. Cloud-based storage will be used to store the collected data. Since the dataset will be huge, it can contain outlier values due to the uncertainty of mobile-network and GPS coverage. These data should therefore be properly cleaned before data mining; otherwise they will lead to incorrect results, such as wrong traffic predictions in certain places if several users are stuck at the same GPS coordinates for a while. Likewise, user-satisfaction results may be incorrect if users in the sample do not enter their satisfaction accurately. This can be mitigated by comparing users cluster-wise, taking the location and the transportation method into consideration: the average opinion of the users in a cluster is taken as the satisfaction with the transportation method in that cluster. Using the final results of this research, the government can also benefit, provided the sample users are selected to mix all types of people, by obtaining information necessary for planning smart cities.

Item Natural language processing framework: WordNet based sentimental analyzer (Faculty of Science, University of Kelaniya, Sri Lanka, 2016) Jayakody, J.R.K.C.
Sentiment analysis is a technique used to classify documents as positive, negative or neutral. Handwritten forms, emails, telephone surveys and online feedback forms are used to collect customer feedback about products and services.
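The cluster-wise satisfaction averaging in "Analysing mobility patterns of people" above can be sketched as follows; the record layout and cluster identifiers are assumptions for illustration:

```python
from collections import defaultdict
from statistics import mean

def cluster_satisfaction(records):
    """Average user satisfaction per (location cluster, transport method),
    damping the effect of individual inaccurate entries.
    records: iterable of (cluster_id, method, satisfaction score)."""
    groups = defaultdict(list)
    for cluster_id, method, score in records:
        groups[(cluster_id, method)].append(score)
    return {key: mean(scores) for key, scores in groups.items()}
```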
Sentiment analysis is the technique used to mine online and offline customer feedback data to gain insight into products and services automatically. Since business types differ, it is quite challenging to develop a generic sentiment analyzer. Therefore, this ongoing research focuses on developing a generic framework that can be extended in future to build the best generic sentiment analyzer. Several online customer feedback forms were used as the dataset. A web-page scraping module was developed to extract the reviews from web pages, and chunk and chink rules were developed to extract the comparative and superlative adverbs used to build the knowledge base. The website Thesaurus.com was used to build the test data with synonyms of good, bad and neutral terms. Next, the WordNet database was used with different semantic-similarity algorithms, such as path similarity, Leacock-Chodorow similarity, Wu-Palmer similarity and Jiang-Conrath similarity, to test the sentiments. The accuracy of this framework was improved further with a vector model built using natural language processing techniques. A labelled dataset of Amazon product reviews provided by the University of Pennsylvania was used to test the accuracy. The framework was designed so that the multiplier value can be changed based on the domain. The accuracy of the final sentiment value was given as a percentage of the positive or negative type. This framework gave fairly accurate results, which are useful for generating good insights from user reviews.
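The path-similarity measure used above can be illustrated with a toy is-a hierarchy standing in for WordNet (the real framework queries the WordNet database itself); the taxonomy and the good/bad seed words below are illustrative assumptions:

```python
# A toy is-a hierarchy standing in for WordNet.
PARENT = {
    "excellent": "good", "good": "quality",
    "poor": "bad", "bad": "quality",
    "quality": None,  # root
}

def path_to_root(word):
    """Chain of hypernyms from a word up to the taxonomy root."""
    path = [word]
    while PARENT[path[-1]] is not None:
        path.append(PARENT[path[-1]])
    return path

def path_similarity(w1, w2):
    """1 / (1 + length of the shortest path through a common ancestor),
    mirroring WordNet's path-similarity measure."""
    p1, p2 = path_to_root(w1), path_to_root(w2)
    for depth1, node in enumerate(p1):
        if node in p2:
            return 1 / (1 + depth1 + p2.index(node))
    return 0.0

def polarity(word):
    """Classify by whichever seed word ('good' or 'bad') is more similar."""
    return ("positive"
            if path_similarity(word, "good") >= path_similarity(word, "bad")
            else "negative")
```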