Evaluating the Accuracy of Session Identification in Web Usage Analysis

Measuring the Accuracy of Sessionizers for Web Usage Analysis

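  • Abstract (translated from the Chinese): Two main classes of methods are currently used to identify user sessions: timeout-based session identification, and session identification that also uses the site topology (hyperlinks). Both build on user identification and then make guesses about user activity. This paper proposes an evaluation system that quantifies the accuracy of the data produced by these heuristics; the different measures reflect the needs of different data mining applications. Finally, data from a real site is used to show that the evaluation system gives accurate results.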

     

    Abstract: This paper describes the timeout-based sessionizing mechanisms and topology-aware heuristics that are now used to identify user sessions. Sessionizing tools are based on heuristic rules and on assumptions about the site's usage, and are therefore prone to error. The paper proposes a formal framework composed of a set of measures for evaluating the accuracy of sessionizing tools. The different measures reflect the requirements of different web usage analysis applications. An experiment using the log data of a real web site illustrates the use of the measures.
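
    To make the timeout-based heuristic concrete, the following is a minimal sketch and not the paper's implementation. It assumes each log request is a (user, timestamp in seconds, URL) triple and that a gap of more than 30 minutes starts a new session; the threshold, the function names, and the simple exact-match ratio are illustrative assumptions, not the measures defined in the paper.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical request record: (user_id, timestamp_in_seconds, url)
Request = Tuple[str, float, str]


def sessionize_by_timeout(requests: Sequence[Request],
                          max_gap: float = 1800.0) -> List[List[str]]:
    """Split each user's requests into sessions wherever the idle time
    between consecutive requests exceeds max_gap (assumed: 30 minutes)."""
    sessions: List[List[str]] = []
    # For each user: timestamp of the last request and the session it joined.
    last_seen: Dict[str, Tuple[float, List[str]]] = {}
    for user, ts, url in sorted(requests, key=lambda r: (r[0], r[1])):
        prev = last_seen.get(user)
        if prev is None or ts - prev[0] > max_gap:
            current: List[str] = []      # start a new session
            sessions.append(current)
        else:
            current = prev[1]            # continue the open session
        current.append(url)
        last_seen[user] = (ts, current)
    return sessions


def exact_match_ratio(real: Sequence[Sequence[str]],
                      reconstructed: Sequence[Sequence[str]]) -> float:
    """Illustrative measure (an assumption, not the paper's definition):
    the fraction of real sessions that the sessionizer rebuilt exactly."""
    if not real:
        return 0.0
    rebuilt = {tuple(s) for s in reconstructed}
    return sum(1 for s in real if tuple(s) in rebuilt) / len(real)


if __name__ == "__main__":
    log = [("u1", 0, "/a"), ("u1", 100, "/b"),
           ("u1", 5000, "/c"), ("u2", 50, "/a")]
    found = sessionize_by_timeout(log)
    print(found)
    print(exact_match_ratio([["/a", "/b", "/c"], ["/a"]], found))
```

    In this toy log the 4900-second pause splits user u1's visit in two, so a real session that spanned the pause would count as not reconstructed. This is the kind of error that an accuracy measure for sessionizers is meant to quantify.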

     
