Bar chart plagiarism detection

Plagiarism can be regarded as a form of electronic crime and intellectual theft, and it has become one of the educational challenges facing research institutions. Charts such as line and bar charts are one way to represent quantitative information in infographic form. T...


Bibliographic Details
Main Author: Mohammed Salih, Mohammed Mumtaz
Format: Thesis
Language: English
Published: 2013
Subjects: TK Electrical engineering. Electronics Nuclear engineering
Online Access: http://eprints.utm.my/39164/1/MohammedMumtazMohammedSalihMFSKSM2013.pdf
author Mohammed Salih, Mohammed Mumtaz
description Plagiarism can be regarded as a form of electronic crime and intellectual theft, and it has become one of the educational challenges facing research institutions. Charts such as line and bar charts are one way to represent quantitative information in infographic form. Extracting the features of a bar chart is an essential step in recovering the data from an image. Some techniques presented by researchers, such as the Hough transform and learning-based methods, focus on the graphical part rather than the text itself. In this study, ten features of bar chart images are used to detect plagiarism and to find the proportion of similarity between charts. Some of these features can be extracted directly by OCR, while others require relating the text part to the graphic part in order to extract data such as the real value of each bar. The new technique introduced in this research extracts three values for each bar, namely the Start, End and Exact values, based on the horizontal and vertical lines of the bar chart image. In addition, the word 2-gram and Euclidean distance methods are used to detect plagiarism. Experimental results show that the system can detect plagiarism for ten possible patterns of bar chart plagiarism. The performance of the system is evaluated in terms of overlapping features, precision and recall. The results show that the system detects not only copied and pasted bar data, but also restructured and summarized image captions and modifications to the bar chart data, such as swapping bars, changing colors and changing scales.
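The two comparison methods named in the abstract, word 2-grams and Euclidean distance, can be illustrated with a minimal Python sketch. This is not the thesis implementation: the function names, the use of Jaccard overlap as the 2-gram score, and the sorting of bar values before measuring distance are all assumptions made here for illustration.

```python
import math


def word_bigrams(text):
    """Return the set of adjacent word pairs in a lowercased caption."""
    words = text.lower().split()
    return set(zip(words, words[1:]))


def bigram_similarity(caption_a, caption_b):
    """Jaccard overlap of word 2-grams, in [0, 1].

    Jaccard is an assumed choice; the abstract does not name the coefficient.
    """
    a, b = word_bigrams(caption_a), word_bigrams(caption_b)
    if not a and not b:
        return 0.0  # captions too short to form any 2-gram
    return len(a & b) / len(a | b)


def bar_distance(values_a, values_b):
    """Euclidean distance between the bar values of two charts.

    Values are sorted first so that reordering the bars does not change
    the result (an assumption of this sketch, not a documented step).
    """
    if len(values_a) != len(values_b):
        raise ValueError("charts must have the same number of bars")
    pairs = zip(sorted(values_a), sorted(values_b))
    return math.sqrt(sum((x - y) ** 2 for x, y in pairs))
```

Under these assumptions, two identical captions score 1.0, and two charts whose bars hold the same values in a different order have a distance of 0.0, which is how a bar swap would still be flagged as a match.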
format Thesis
id uthm-39164
institution Universiti Teknologi Malaysia
language English
publishDate 2013
record_format eprints
spelling uthm-39164 2017-09-13T06:44:46Z http://eprints.utm.my/39164/ 2013-01 Thesis NonPeerReviewed application/pdf en Mohammed Salih, Mohammed Mumtaz (2013) Bar chart plagiarism detection. Masters thesis, Universiti Teknologi Malaysia, Faculty of Computer Science and Information Systems.
title Bar chart plagiarism detection
topic TK Electrical engineering. Electronics Nuclear engineering
url http://eprints.utm.my/39164/1/MohammedMumtazMohammedSalihMFSKSM2013.pdf
url-record http://eprints.utm.my/39164/