Method for mining representative sequence patterns from Web click stream data

A technology for click stream data and sequence patterns, applied in data mining, network data retrieval, electronic digital data processing, etc. It addresses three problems: exponentially growing sequence data is difficult to handle, users cannot readily understand the results or use them for decisions, and computational efficiency is low. Its effect is to reduce the number of results and avoid output that is difficult to understand directly.

Active Publication Date: 2021-05-07
NORTHEASTERN UNIV LIAONING

AI Technical Summary

Problems solved by technology

However, existing sequential pattern mining methods have the following problems. On the one hand, most of them work in two steps, "mining" and "selection", and may scan the database multiple times during both steps. This makes it difficult to handle exponentially growing sequence data and lowers computational efficiency. On the other hand, they produce a large number of mining results containing many redundant sequences and sequence patterns that contain one another, so users cannot understand the results, use them, or make decisions in a short time.



Embodiment Construction

[0062] Specific embodiments of the present invention are described in further detail below with reference to the accompanying drawings and examples. The following examples illustrate the present invention but are not intended to limit its scope.

[0063] This embodiment takes the Web click stream data of multiple users over a period of time as an example, and uses the method of the present invention for mining representative sequence patterns from Web click stream data to mine the site access habits of these users.

[0064] In this embodiment, the method for mining representative sequence patterns from Web click stream data, as shown in Figure 1, includes the following steps:

[0065] Step 1. Determine the seed sequences: input the Web click stream data as the sequence data set, together with the user-defined minimum support and maximum coverage parameters; traverse the sequence data set once, ...
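
The single pass in Step 1 can be illustrated with a minimal sketch. It assumes that each click stream sequence is a list of site names and that minimum support is an absolute count of sequences containing a site; the patent text shown here does not say whether support is a count or a ratio, and the function name `find_seed_sites` and the sample sessions are hypothetical.

```python
from collections import Counter


def find_seed_sites(sequences, min_support):
    """Return the sites contained in at least `min_support` sequences.

    Assumption: support is an absolute count of sequences, not a ratio.
    """
    counts = Counter()
    for seq in sequences:
        # Count each site at most once per sequence, however often it repeats.
        counts.update(set(seq))
    return sorted(site for site, n in counts.items() if n >= min_support)


# Hypothetical click stream sessions, one list of visited sites per user.
sessions = [
    ["home", "search", "item", "cart"],
    ["home", "item", "item", "cart"],
    ["home", "search", "help"],
]

seeds = find_seed_sites(sessions, min_support=2)
print(seeds)
```

In this sketch "help" appears in only one session, so it is dropped and never used as a sequence generation seed.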


Abstract

The invention provides a method for mining representative sequence patterns from Web click stream data, and relates to the technical field of sequence pattern mining. The method comprises the following steps: first, input a Web click stream sequence data set, a minimum support and a maximum coverage, traverse the data set once, and keep all frequent sites whose support is not less than the minimum support as sequence generation seeds; for each sequence generation seed, use a gap expansion enumeration tree, combined with a gap scanning pruning strategy and a closure check, to obtain all frequent closed supersequences of the seed; then select all representative sequences of the seed using a local representative sequence screening technique; finally, traverse all sequence generation seeds and output the representative sequences of each seed to obtain all representative sequence patterns of the Web click stream data. The advantage of the method is that representative sequence patterns resolve the contradiction between the large number of frequent sequence patterns and their low usability, which makes the results more usable; the method also provides a reference for applications such as online user behavior analysis, information recommendation and engine optimization based on Web click streams.
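
The closure check mentioned in the abstract can be illustrated with the standard definition of a closed pattern: a frequent sequence is closed if no supersequence has the same support. The sketch below applies that definition to a fixed list of candidate patterns. It is not the patent's procedure, since the gap expansion enumeration tree and the gap scanning pruning strategy are not reproduced; the function names and the sample data are hypothetical.

```python
def is_subsequence(pattern, sequence):
    """True if `pattern` occurs in `sequence` in order, with gaps allowed."""
    it = iter(sequence)
    # `item in it` advances the iterator, so matches must appear in order.
    return all(item in it for item in pattern)


def support(pattern, sequences):
    """Number of sequences that contain `pattern` as a subsequence."""
    return sum(is_subsequence(pattern, seq) for seq in sequences)


def closed_patterns(candidates, sequences):
    """Keep candidates not absorbed by a longer candidate with equal support.

    Assumption: closure is checked only against the given candidates.
    """
    sup = {p: support(p, sequences) for p in candidates}
    closed = []
    for p in candidates:
        absorbed = any(
            len(q) > len(p) and sup[q] == sup[p] and is_subsequence(p, q)
            for q in candidates
        )
        if not absorbed:
            closed.append(p)
    return closed


# Hypothetical sessions and candidate patterns grown from the seed "home".
sessions = [
    ["home", "search", "item", "cart"],
    ["home", "item", "item", "cart"],
    ["home", "search", "help"],
]
candidates = [
    ("home",),
    ("home", "item"),
    ("home", "cart"),
    ("home", "item", "cart"),
    ("home", "search"),
]

result = closed_patterns(candidates, sessions)
print(result)
```

Here ("home", "item") and ("home", "cart") are dropped because ("home", "item", "cart") contains them and has the same support, which is the kind of containment redundancy the method aims to remove.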

Description

Technical field

[0001] The invention relates to the technical field of sequence mining, in particular to a method for mining representative sequence patterns from Web click stream data.

Background

[0002] Frequent sequences in sequence data are widely used in commercial fields such as Web click stream data analysis, customer shopping habit analysis and log data analysis. For example, many users often buy "Coke" after buying "Potato Chips", so "Coke" can be recommended to customers who have purchased "Potato Chips" to increase sales of the product. Sequential pattern mining is applied not only in the commercial field but also in many other fields, such as traffic travel pattern analysis, scientific experiment process analysis, natural disaster prediction analysis, disease drug diagnosis analysis and bioinformatics data analysis. Among them, Web click stream data analysis plays an important role in the fields of online user behavior analysis...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/9535, G06F16/957, G06F16/901
CPC: G06F16/9535, G06F16/957, G06F16/9027, G06F2216/03
Inventor: 赵宇海, 汪嗣尧, 王若飞, 马生俊, 印莹
Owner: NORTHEASTERN UNIV LIAONING