The invention discloses a website classification catalogue optimization analysis method based on log mining. According to the method, website log data is first preprocessed, wherein the log data refers to the webpage access records stored on a server; through preprocessing, the catalogue paths through which users obtain information via a specific website are extracted from the
log data; then, a method based on the browsing path sequence (VOB) is used to calculate the similarity between every pair of catalogue paths, so that a catalogue path similarity matrix is constructed; then, a divisive hierarchical clustering (NHC) algorithm based on matrix transformation is used to cluster the catalogue path similarity matrix, so that the users corresponding to the catalogue paths are clustered into different categories; and finally, the website classification catalogue system expected by each category of users is mined out and compared with the original classification catalogue
system. Through these steps, a website classification catalogue system that conforms to the users' expectations can be mined out, providing quantitative decision support for website optimization.
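
The first three steps (path extraction, pairwise similarity, clustering) can be illustrated with a minimal Python sketch. It is not the disclosed implementation: the abstract does not define the VOB similarity or the matrix-transformation-based NHC algorithm, so the sketch substitutes a normalized longest-common-subsequence score for VOB and a simple top-down bisecting split for NHC. The record format, the sample data, the function names, and the 0.5 threshold are all hypothetical. The final step, mining the expected catalogue system of each user category, is not covered.

```python
from itertools import combinations

# Hypothetical preprocessed log records: (user_id, catalogue node), in visit order.
RECORDS = [
    ("u1", "/home"), ("u1", "/news"), ("u1", "/news/sports"),
    ("u2", "/home"), ("u2", "/news"), ("u2", "/news/world"),
    ("u3", "/home"), ("u3", "/shop"), ("u3", "/shop/books"),
    ("u4", "/shop"), ("u4", "/shop/books"), ("u4", "/shop/cart"),
]


def extract_paths(records):
    """Group visited catalogue nodes by user, preserving visit order."""
    paths = {}
    for user, node in records:
        paths.setdefault(user, []).append(node)
    return paths


def lcs_length(a, b):
    """Length of the longest common subsequence of two sequences."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, start=1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def path_similarity(a, b):
    """Stand-in for VOB (assumption): LCS length divided by the longer path length."""
    if not a or not b:
        return 0.0
    return lcs_length(a, b) / max(len(a), len(b))


def similarity_matrix(paths):
    """Return the sorted user list and the symmetric path similarity matrix."""
    users = sorted(paths)
    n = len(users)
    sim = [[1.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        sim[i][j] = sim[j][i] = path_similarity(paths[users[i]], paths[users[j]])
    return users, sim


def split_cluster(cluster, sim):
    """Seed two groups with the least similar pair, then assign the other members."""
    a, b = min(combinations(cluster, 2), key=lambda p: sim[p[0]][p[1]])
    left, right = [a], [b]
    for k in cluster:
        if k not in (a, b):
            (left if sim[k][a] >= sim[k][b] else right).append(k)
    return left, right


def divisive_clustering(sim, threshold):
    """Stand-in for NHC (assumption): split top-down until every pair inside
    a cluster has similarity of at least `threshold`."""
    pending, done = [list(range(len(sim)))], []
    while pending:
        cluster = pending.pop()
        pairs = list(combinations(cluster, 2))
        if not pairs or min(sim[i][j] for i, j in pairs) >= threshold:
            done.append(sorted(cluster))
        else:
            pending.extend(split_cluster(cluster, sim))
    return sorted(done)


users, sim = similarity_matrix(extract_paths(RECORDS))
for cluster in divisive_clustering(sim, threshold=0.5):
    print([users[i] for i in cluster])
```

On the sample data, the two users who browse the news branch end up in one cluster and the two who browse the shop branch in the other. A real log would first need session splitting and URL-to-catalogue mapping, which this sketch assumes has already been done.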