System and method for clustering documents

A document clustering technology, applied in fields such as instruments, computing, and electrical digital data processing, that addresses the reduced accuracy of retrieved information and the difficulty of providing information in real time.

Inactive Publication Date: 2007-10-17
LG ELECTRONICS INC

AI Technical Summary

Problems solved by technology

The accuracy of retrieved information is reduced, and it is difficult to provide information in real time.



Embodiment Construction

[0020] Reference will now be made in detail to the preferred embodiments of the invention, examples of which are illustrated in the accompanying drawings.

[0021] A document clustering system and method according to an embodiment of the present invention will now be described in detail with reference to the accompanying drawings.

[0022] FIG. 1 is a block diagram for describing a document clustering system according to an embodiment of the present invention.

[0023] Referring to FIG. 1, a document clustering system according to an embodiment of the present invention includes: a client 200, into which a user inputs a query for document retrieval or on which the document retrieval result for the input query is displayed; and a clustering system 100, which is connected to the client 200 through a network 210, performs document retrieval according to the query, and clusters the retrieved documents.

[0024] The client 200 includes an input unit by which a user transmits a pre...


Abstract

Provided are a system and method for clustering documents. The system includes a document DB, a document feature writing unit, a document retrieving unit, a clustering unit, and a cluster DB. The document DB stores documents. The document feature writing unit extracts attribute information from the documents stored in the document DB and writes an index for each document on the basis of that attribute information. The document retrieving unit uses the indexes to retrieve documents that include a query input by a user. The clustering unit includes a representative vector calculator, which calculates feature vectors and a representative vector of the retrieved documents, and a similarity calculator, which calculates similarities between the documents using the feature vectors and the representative vector. The cluster DB stores the documents clustered by the clustering unit.
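The clustering step in the abstract can be illustrated with a minimal sketch. The abstract does not say how the feature vectors, the representative vector, or the similarity are defined, so everything below is an assumption: term-frequency feature vectors, a representative vector taken as the mean of a cluster's feature vectors, cosine similarity, and a greedy single-pass assignment with a hypothetical `threshold` parameter. All function names are illustrative and do not come from the patent.

```python
import math
from collections import Counter


def feature_vector(text):
    """Term-frequency vector of one document (assumed feature definition)."""
    return Counter(text.lower().split())


def representative_vector(vectors):
    """Mean of the given feature vectors (assumed definition)."""
    total = Counter()
    for vec in vectors:
        total.update(vec)
    n = len(vectors)
    return {term: count / n for term, count in total.items()}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster(docs, threshold=0.5):
    """Greedy single-pass clustering (an assumed strategy, not the patent's).

    Each document joins the first cluster whose representative vector is at
    least `threshold` similar to it; otherwise it starts a new cluster.
    Returns a list of clusters, each a list of document indices.
    """
    vectors = [feature_vector(d) for d in docs]
    clusters = []
    for i, vec in enumerate(vectors):
        for members in clusters:
            rep = representative_vector([vectors[j] for j in members])
            if cosine(vec, rep) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


retrieved = [
    "apple banana apple",
    "banana apple",
    "car engine wheel",
    "engine car",
]
print(cluster(retrieved))
```

Recomputing the representative vector on every comparison keeps the sketch short; a real system would likely update it incrementally as documents are added to a cluster.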

Description

Technical field

[0001] The present invention relates to a document clustering system and method that determine the similarity between documents and cluster similar documents based on the determined similarity.

Background art

[0002] In recent years, document retrieval systems have come into wide use. Such systems process a large amount of document information, extract the information that corresponds to a user's needs, and provide the extracted information to the user.

[0003] That is, document retrieval or information retrieval refers to searching a large number of documents and pieces of information for the document or information a user wants. To retrieve documents or information, keyword processing is performed on natural language text, a weight is assigned to each keyword, and then retrieval and ranking are performed.

[0004] A document retrieval system in the prior art receives a query from a user, and outputs common results extracted by a common system to the user. ...
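The keyword weighting and ranking that paragraph [0003] attributes to conventional retrieval can be sketched as follows. The source does not name a weighting scheme, so this example assumes a simple TF-IDF style weight (term frequency times a smoothed inverse document frequency) and whitespace tokenization; `tokenize` and `rank` are hypothetical names used only for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenization (a stand-in for keyword processing)."""
    return text.lower().split()


def rank(query, docs):
    """Return indices of matching documents, best match first.

    Each query keyword found in a document contributes
    tf * log(1 + N / df), an assumed TF-IDF style weight.
    Documents that contain none of the query keywords are omitted.
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(docs)

    # Document frequency: number of documents containing each keyword.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    scored = []
    for index, tokens in enumerate(tokenized):
        tf = Counter(tokens)
        score = sum(
            tf[term] * math.log(1 + n_docs / df[term])
            for term in tokenize(query)
            if term in tf
        )
        if score > 0:
            scored.append((score, index))

    # Highest score first; ties broken by original document order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [index for _, index in scored]


corpus = [
    "document clustering system",
    "document retrieval system retrieval",
    "weather report today",
]
print(rank("document retrieval", corpus))
```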


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F17/30
CPC: G06F17/40
Inventor: 车完奎, 金晶中, 安汉峻
Owner: LG ELECTRONICS INC