Log Mining Using Generalized Association Rules

Explosive growth in the size and usage of the World Wide Web has made it necessary for Web site administrators to track and analyze the navigation patterns of Web site visitors. To achieve this goal, a web mining tool is necessary. Web mining can be defined as the use of data mining technique...


Bibliographic Details
Main Author: Mohd. Helmy, Abd. Wahab
Format: Thesis
Language: English
Published: 2004
Subjects: QA76 Computer software
Online Access:https://etd.uum.edu.my/1324/1/MOHD._HELMY_B._ABD._WAHAB.pdf
https://etd.uum.edu.my/1324/2/1.MOHD._HELMY_B._ABD._WAHAB.pdf
https://etd.uum.edu.my/1324/
author Mohd. Helmy, Abd. Wahab
description Explosive growth in the size and usage of the World Wide Web has made it necessary for Web site administrators to track and analyze the navigation patterns of Web site visitors. To achieve this goal, a web mining tool is necessary. Web mining can be defined as the use of data mining techniques to automatically discover and extract information from web documents. Since Data Mining is primarily concerned with the discovery of knowledge and aims to provide answers to questions that people do not know how to ask, it is not an automatic process. Rather, one has to exhaustively explore very large volumes of data to determine otherwise hidden relationships. The process extracts high-quality information that can be used to draw conclusions based on relationships or patterns within the data. However, data mining techniques are not easily applicable to Web data because of problems related both to the technology underlying the Web and to the lack of standards in the design and implementation of Web pages. Information collected by Web servers and kept in the server log is the main source of data for analyzing user navigation patterns. Once logs have been pre-processed and sessions have been obtained, several kinds of access pattern mining can be performed, depending on the needs of the analyst. Because the method used in this study relies on relatively simple techniques, the information gathered is adequate for real user profile data, although the noise in the data has to be tackled first. In this study, a Data Mining technique known as generalized association rules was used to gain some insight into website usage patterns. For the purpose of this study, server logs from the Tutor.com portal were retrieved, pre-processed and analyzed. An important finding from this study is that the Mathematics subject is generally popular at the UPSR, PMR and UPSR levels. In contrast, arts subjects are not popular with Tutor.com users. The system administrator may consider evaluating the content and the links for such subjects so that the real problem can be identified.
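The abstract names generalized association rules but does not show the procedure. As a rough illustration only, the sketch below extends each session with the ancestors of its pages in a category taxonomy and then scores single-item rules by support and confidence. The page names, the taxonomy, the sample sessions and the thresholds are all hypothetical and are not taken from the thesis or from the Tutor.com logs; restricting rules to one item on each side is a simplification of the general technique.

```python
from itertools import combinations

# Hypothetical taxonomy (child -> parent); not from the thesis.
TAXONOMY = {
    "upsr_math": "math",
    "pmr_math": "math",
    "upsr_art": "arts",
    "pmr_art": "arts",
    "math": None,
    "arts": None,
}

# Hypothetical sessions: the set of pages visited in each session.
SESSIONS = [
    {"upsr_art", "upsr_math"},
    {"pmr_art", "pmr_math"},
    {"upsr_math", "pmr_math"},
    {"upsr_math"},
]


def ancestors(item, taxonomy):
    """Return every ancestor of an item in the taxonomy."""
    result = set()
    parent = taxonomy.get(item)
    while parent is not None:
        result.add(parent)
        parent = taxonomy.get(parent)
    return result


def extend_session(session, taxonomy):
    """Add the ancestors of each visited page to the session."""
    extended = set(session)
    for item in session:
        extended |= ancestors(item, taxonomy)
    return frozenset(extended)


def support(itemset, sessions):
    """Fraction of sessions that contain every item in itemset."""
    return sum(1 for s in sessions if itemset <= s) / len(sessions)


def generalized_rules(sessions, taxonomy, min_support, min_confidence):
    """Return (antecedent, consequent, support, confidence) tuples."""
    extended = [extend_session(s, taxonomy) for s in sessions]
    items = sorted(set().union(*extended))
    rules = []
    for a, b in combinations(items, 2):
        pair_support = support(frozenset({a, b}), extended)
        if pair_support < min_support:
            continue
        # A rule between an item and its own ancestor is trivially true.
        if a in ancestors(b, taxonomy) or b in ancestors(a, taxonomy):
            continue
        for x, y in ((a, b), (b, a)):
            confidence = pair_support / support(frozenset({x}), extended)
            if confidence >= min_confidence:
                rules.append((x, y, pair_support, confidence))
    return rules


rules = generalized_rules(SESSIONS, TAXONOMY, min_support=0.5, min_confidence=0.8)
for x, y, sup, conf in rules:
    print(f"{x} -> {y} (support={sup:.2f}, confidence={conf:.2f})")
```

With this toy data no pair of individual pages reaches the support threshold, but the category-level rule between `arts` and `math` does, which is the kind of pattern that generalizing over a taxonomy is meant to surface.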
format Thesis
id oai:etd.uum.edu.my:1324
institution Universiti Utara Malaysia
language English
publishDate 2004
record_format EPrints
spelling oai:etd.uum.edu.my:1324 2013-07-24T12:11:27Z https://etd.uum.edu.my/1324/ Log Mining Using Generalized Association Rules Mohd. Helmy, Abd. Wahab QA76 Computer software 2004 Thesis NonPeerReviewed application/pdf en https://etd.uum.edu.my/1324/1/MOHD._HELMY_B._ABD._WAHAB.pdf application/pdf en https://etd.uum.edu.my/1324/2/1.MOHD._HELMY_B._ABD._WAHAB.pdf Mohd. Helmy, Abd. Wahab (2004) Log Mining Using Generalized Association Rules. Masters thesis, Universiti Utara Malaysia.
thesis_level Master
title Log Mining Using Generalized Association Rules
topic QA76 Computer software