Self-organized mapping network based document clustering method

A document clustering technique based on self-organizing maps, applicable to neural learning methods, biological neural network models, and special-purpose data processing. It addresses problems such as edge effects, difficulty adapting to the input document data, and underutilized neurons, and it improves the clustering F value.

Inactive Publication Date: 2006-07-26
HARBIN INST OF TECH
Cites: 0; cited by: 5

AI Technical Summary

Problems solved by technology

[0005] The present invention provides a document clustering method based on a self-organizing map network to overcome the difficulty in adapting the input d


Examples


Embodiment Construction

[0018] This embodiment is described below with reference to Figures 1 to 5. The method of the present invention is realized through the following steps:

1. Use the search terms to find all selected documents within the scope specified by the searcher.
2. Initialize the output layer of the self-organizing map network as a ring structure, and divide the ring equally into at least two sectors, each sector serving as one neuron.
3. Input the selected documents, train the self-organizing map network, and calculate the R² clustering criterion coefficient of the current output layer.
4. Determine whether the R² clustering criterion coefficient is greater than the threshold μ.
5. If the result of step 4 is yes, terminate the training of the self-organizing map network and classify the selected documents according to the output-layer neurons of the current network.
6. End.
7. If the result of step 4 is no...
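Steps 2 and 3 (a ring-shaped output layer and a training pass over it) can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names `init_ring`, `ring_distance`, `winner`, and `train_pass`, the random initialization, the Gaussian neighbourhood, and the `lr` and `radius` parameters are all assumptions made for illustration, and documents are assumed to be already encoded as equal-length numeric vectors.

```python
import math
import random


def init_ring(n_neurons, dim, seed=0):
    """Create n_neurons random weight vectors; list order is ring order."""
    if n_neurons < 2:
        raise ValueError("the ring needs at least two neurons")
    rng = random.Random(seed)
    return [[rng.random() for _ in range(dim)] for _ in range(n_neurons)]


def ring_distance(i, j, n):
    """Steps between neurons i and j on a ring of n neurons (wraps around)."""
    d = abs(i - j) % n
    return min(d, n - d)


def winner(weights, x):
    """Index of the neuron whose weight vector is closest to document x."""
    return min(
        range(len(weights)),
        key=lambda k: sum((w - v) ** 2 for w, v in zip(weights[k], x)),
    )


def train_pass(weights, docs, lr=0.5, radius=1.0):
    """One pass over docs, updating the weights in place."""
    n = len(weights)
    for x in docs:
        win = winner(weights, x)
        for k in range(n):
            # Gaussian neighbourhood measured along the ring (an assumption).
            h = math.exp(-ring_distance(win, k, n) ** 2 / (2.0 * radius ** 2))
            weights[k] = [w + lr * h * (v - w) for w, v in zip(weights[k], x)]


# Hypothetical usage: four toy document vectors, a ring of two neurons.
docs = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 0.0, 1.0], [0.0, 0.1, 0.9]]
ring = init_ring(2, dim=3)
for _ in range(20):
    train_pass(ring, docs, lr=0.3, radius=0.5)
```

Because the neighbourhood distance wraps around, the first and last neurons are neighbours, so no neuron sits on a border of the map. That is one plausible reading of how a ring layout avoids the edge effects mentioned in the summary.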


Abstract

The document clustering method based on a self-organizing map network comprises: finding all selected documents; initializing the output layer of the network as a ring structure divided equally into at least two sectors, each sector serving as a neuron; calculating the R² clustering criterion coefficient of the current output layer; and determining whether that coefficient is greater than the threshold μ. If it is, training stops and the documents are classified according to the current self-organizing map network. Otherwise, the method finds the neuron with the largest within-class RSS, interpolates a new neuron there, and then trains all neurons of the current output layer. The invention adapts to the input document data and avoids the problems that come with a fixed network structure.
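The stopping test and the growth step can be sketched in Python as follows. The text here does not define the R² coefficient or say exactly where the new neuron goes, so this sketch assumes R² = 1 - (sum of within-class RSS) / (total sum of squares), measures RSS around each class centroid, and places the new neuron midway between the worst neuron and its next ring neighbour. All function names and the value of `mu` are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def assign(weights, docs):
    """Group document indices by their nearest neuron."""
    classes = [[] for _ in weights]
    for idx, x in enumerate(docs):
        k = min(range(len(weights)), key=lambda j: sq_dist(weights[j], x))
        classes[k].append(idx)
    return classes


def class_rss(weights, docs):
    """Within-class residual sum of squares per neuron (0.0 if empty)."""
    out = []
    for members in assign(weights, docs):
        if not members:
            out.append(0.0)
            continue
        vecs = [docs[i] for i in members]
        c = centroid(vecs)
        out.append(sum(sq_dist(v, c) for v in vecs))
    return out


def r_squared(weights, docs):
    """Assumed criterion: 1 - within-class RSS / total sum of squares."""
    mean = centroid(docs)
    total = sum(sq_dist(x, mean) for x in docs)
    if total == 0:
        return 1.0
    return 1.0 - sum(class_rss(weights, docs)) / total


def grow(weights, docs):
    """Return a new ring with one neuron added next to the max-RSS neuron."""
    rss = class_rss(weights, docs)
    worst = max(range(len(weights)), key=lambda k: rss[k])
    nxt = (worst + 1) % len(weights)  # next neighbour on the ring
    # Midpoint interpolation is an assumption, not taken from the patent text.
    new = [(a + b) / 2 for a, b in zip(weights[worst], weights[nxt])]
    return weights[: worst + 1] + [new] + weights[worst + 1 :]


# Hypothetical usage: grow the ring only while R² is not above the threshold.
docs = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
ring = [[0.0, 0.5], [10.0, 10.5]]
mu = 0.9  # illustrative threshold value
if r_squared(ring, docs) <= mu:
    ring = grow(ring, docs)
```

In a full loop, each call to `grow` would be followed by retraining all neurons of the output layer before the criterion is checked again, as the abstract describes.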

Description

Technical field

[0001] The invention relates to a document clustering method.

Background

[0002] As an unsupervised machine learning method, clustering offers a high degree of automation and has become an important means of organizing, summarizing, and navigating text information. Document clustering aims to uncover the structure of a document collection by organizing it automatically, which makes browsing easier for users and improves the efficiency of information access. Its main applications include digital library services, automatic organization of the results returned by search engines, and mining of user interests. Among the many document clustering methods, the Self-Organizing Map (SOM) proposed by T. Kohonen has attracted particular attention from researchers. Document data are high-dimensional and semantically related, and SOM can better realize the order-preserving mapping from high-dim...


Application Information

IPC(8): G06N3/08, G06F15/18, G06F17/30
CPC: G06K9/6251, G06F18/2137
Inventors: 刘远超, 关毅, 徐志明, 刘秉权, 林磊
Owner: HARBIN INST OF TECH