THESIS
2003
xi, 91 leaves : ill. ; 30 cm
Abstract
As Web sites continue to grow in size and complexity, the results of Web usage mining have become critical for a number of applications such as site design, site personalization, business and marketing decision support, usability studies, and network traffic analysis. A major challenge is to construct statistical models of user behavior from Web logs that are both understandable to users and able to make accurate predictions. One important data source for Web usage mining is log data consisting of traces of users' browsing behavior.
In this work, our objective is to discover user access patterns from large quantities of Web logs. These patterns can be used to enhance understanding of users' browsing behavior or to make accurate predictions. We make two linked contributions. The first is to discover user behavior models that can be summarized as several state machines, using k-mixture Markov models and an EM algorithm, where each model describes one transitional behavior. These models allow us to cluster user traces into groups with different browsing patterns, even in the presence of new pages, to identify a user's browsing class, and to predict the next access. Our second contribution is to predict the time interval in which a future access will occur. We extend traditional association rules by including temporal information explicitly in each rule, and we reason about the confidence of each prediction in terms of its temporal region. Our tests show that such models can accurately predict which pages will be visited in the future and when.
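The first contribution can be illustrated with a minimal sketch of a mixture of first-order Markov chains fitted by EM. This is not the thesis's implementation: the function names (`fit_markov_mixture`, `classify`, `predict_next`), the random soft initialization, the additive smoothing constant `alpha`, and the fixed iteration count are all assumptions made for illustration. Sessions are assumed to be non-empty lists of integer page ids.

```python
import math
import random


def _normalize(values):
    total = sum(values)
    return [v / total for v in values]


def _posterior(session, model):
    """P(component | session), computed in log space for stability."""
    weights, starts, trans = model
    logs = []
    for w, start, t in zip(weights, starts, trans):
        ll = math.log(w) + math.log(start[session[0]])
        for a, b in zip(session, session[1:]):
            ll += math.log(t[a][b])
        logs.append(ll)
    top = max(logs)
    return _normalize([math.exp(ll - top) for ll in logs])


def fit_markov_mixture(sessions, n_pages, k, n_iter=30, alpha=0.01, seed=0):
    """EM for a mixture of k first-order Markov chains (illustrative).

    sessions: non-empty lists of page ids in range(n_pages).
    alpha: additive smoothing (an assumption) so no probability is zero.
    """
    rng = random.Random(seed)
    # Start from random soft assignments of sessions to components.
    resp = [_normalize([rng.random() + 0.1 for _ in range(k)])
            for _ in sessions]
    model = None
    for _ in range(n_iter):
        # M-step: responsibility-weighted counts, then normalize.
        weights = _normalize(
            [sum(r[c] for r in resp) + alpha for c in range(k)])
        starts = [[alpha] * n_pages for _ in range(k)]
        trans = [[[alpha] * n_pages for _ in range(n_pages)]
                 for _ in range(k)]
        for session, r in zip(sessions, resp):
            for c in range(k):
                starts[c][session[0]] += r[c]
                for a, b in zip(session, session[1:]):
                    trans[c][a][b] += r[c]
        starts = [_normalize(s) for s in starts]
        trans = [[_normalize(row) for row in t] for t in trans]
        model = (weights, starts, trans)
        # E-step: recompute each session's posterior over components.
        resp = [_posterior(s, model) for s in sessions]
    return model


def classify(session, model):
    """Index of the most probable browsing class for a session."""
    post = _posterior(session, model)
    return max(range(len(post)), key=post.__getitem__)


def predict_next(prefix, model):
    """Most probable next page, mixing each chain's transition row
    by the posterior class probabilities of the observed prefix."""
    post = _posterior(prefix, model)
    trans = model[2]
    last = prefix[-1]
    n_pages = len(trans[0])
    scores = [sum(p * t[last][j] for p, t in zip(post, trans))
              for j in range(n_pages)]
    return max(range(n_pages), key=scores.__getitem__)
```

Under these assumptions, calling `fit_markov_mixture(sessions, n_pages, k)` returns the mixing weights, start distributions, and transition matrices; `classify` then assigns a trace to a browsing class, and `predict_next` predicts the next access from a partial trace. Because of the smoothing, pages that never appeared in training still receive a small nonzero probability.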