Online clustering visualization method of text

A text clustering technology, applied in special data processing applications, instruments, and electrical digital data processing, that addresses problems such as limited data capacity and the inability to process text data streams online.

Inactive Publication Date: 2013-02-13
中国人民解放军总参谋部第五十七研究所
Cites: 3; Cited by: 29

AI Technical Summary

Problems solved by technology

The t-SNE algorithm proposed by L and G et al. can only process data in batches, has limited data capacity, and cannot support online processing of text data streams.

Examples

Embodiment Construction

[0051] In order to make the objects and advantages of the present invention clearer, the present invention will be further described below in conjunction with the accompanying drawings and specific embodiments.

[0052] Figure 1 is a schematic diagram of a system that applies the present invention to online clustering and visualization of texts. The system first collects a certain amount of historical data as the initial text data; these texts do not need category labels. The system clusters the initial data to obtain the text category distribution vector parameter and the vocabulary category distribution frequency parameter of the initial text data. The former provides the data source for the initial layout produced by the high-dimensional data dimensionality-reduction layout method, from which the layout model parameters are calculated; these parameters are kept as model data and act as constraint parameters for online text clustering. When processing online text data, the system conducts online...
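The initialization and online steps described in [0052] can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patent's algorithm: the class `OnlineTextClusterer`, the whitespace tokenizer, the additive smoothing, and the hard assignment of each new text to its most likely category are all hypothetical choices, and the initial category labels are assumed to come from some batch clustering of the unlabeled initial data.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace splitting stands in for real word segmentation.
    return text.lower().split()


class OnlineTextClusterer:
    """Hypothetical sketch: word-category frequency counts updated one text at a time."""

    def __init__(self, num_categories, smoothing=1.0):
        self.num_categories = num_categories
        self.smoothing = smoothing  # additive smoothing, must be > 0
        self.word_counts = [Counter() for _ in range(num_categories)]
        self.totals = [0] * num_categories  # token count per category
        self.vocab = set()

    def _add(self, tokens, category):
        # Update the word-category frequency parameter with one text.
        self.word_counts[category].update(tokens)
        self.totals[category] += len(tokens)
        self.vocab.update(tokens)

    def fit_initial(self, texts, labels):
        # labels: cluster ids from a batch clustering of the initial data (assumed given).
        for text, category in zip(texts, labels):
            self._add(tokenize(text), category)

    def category_distribution(self, text):
        # Score each category with smoothed log frequencies, then normalize (softmax).
        tokens = tokenize(text)
        vocab_size = max(len(self.vocab), 1)
        scores = []
        for c in range(self.num_categories):
            denom = self.totals[c] + self.smoothing * vocab_size
            scores.append(sum(
                math.log((self.word_counts[c][w] + self.smoothing) / denom)
                for w in tokens
            ))
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]
        norm = sum(weights)
        return [w / norm for w in weights]

    def process(self, text):
        # Online step: compute the distribution, then fold the text into the model.
        dist = self.category_distribution(text)
        best = max(range(self.num_categories), key=dist.__getitem__)
        self._add(tokenize(text), best)
        return dist


model = OnlineTextClusterer(num_categories=2)
model.fit_initial(
    ["stock market shares", "market trading stock",
     "football match goal", "goal team football"],
    [0, 0, 1, 1],
)
distribution = model.process("stock trading")
```

Because each incoming text only increments counts, the existing model is never recomputed from scratch, which is the property an online method needs; a real system would likely use soft updates and a more careful model.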

Abstract

The invention provides an online text clustering visualization method and belongs to the field of intelligent information processing in computer science. The method introduces user-provided category feature word labels to constrain and optimize the clustering process and to improve the clarity and intelligibility of the text clustering structure. An online text clustering technique is designed to cluster a text data stream incrementally, keep the overall clustering structure stable, and update the model adaptively. The invention also designs an online high-dimensional data dimensionality-reduction and layout method suited to large-scale data or data stream environments. Dimensionality reduction and layout are applied to the clustered text category distribution vectors, achieving incremental visualization of the text data and visual display of the text data and category structure in two- or three-dimensional Euclidean space.
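To make the idea of incremental layout concrete, the sketch below places each new text in the plane without moving texts that are already displayed. It is an assumption-laden illustration and not the layout method of the invention: instead of a t-SNE-style optimization, it uses a much simpler substitute in which every category gets a fixed 2-D anchor on a circle and a text is drawn at the weighted mean of those anchors, with its category distribution vector as the weights. The names `circle_anchors`, `place_text`, and `IncrementalLayout` are hypothetical.

```python
import math


def circle_anchors(num_categories, radius=1.0):
    # Illustrative choice: one fixed anchor per category, evenly spaced on a circle.
    return [
        (radius * math.cos(2 * math.pi * i / num_categories),
         radius * math.sin(2 * math.pi * i / num_categories))
        for i in range(num_categories)
    ]


def place_text(distribution, anchors):
    # Map a category distribution vector to 2-D as the weighted mean of the anchors.
    if len(distribution) != len(anchors):
        raise ValueError("distribution and anchors must have the same length")
    total = sum(distribution)
    if total <= 0:
        raise ValueError("distribution must have positive mass")
    x = sum(p * ax for p, (ax, _) in zip(distribution, anchors)) / total
    y = sum(p * ay for p, (_, ay) in zip(distribution, anchors)) / total
    return (x, y)


class IncrementalLayout:
    """Hypothetical sketch: new points are appended; existing points never move."""

    def __init__(self, num_categories):
        self.anchors = circle_anchors(num_categories)
        self.points = []

    def add(self, distribution):
        point = place_text(distribution, self.anchors)
        self.points.append(point)
        return point


layout = IncrementalLayout(num_categories=3)
first = layout.add([1.0, 0.0, 0.0])      # a text belonging purely to category 0
second = layout.add([1 / 3, 1 / 3, 1 / 3])  # a text spread evenly over all categories
```

Fixing the anchors is what keeps the picture stable as the stream grows: a text dominated by one category lands near that category's anchor, and a mixed text lands between anchors.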

Description

Technical field

[0001] The invention belongs to text intelligent information processing technology within computer science, and in particular relates to an online text clustering visualization method.

Background

[0002] Text data is one of the most important information carriers, and browsing and processing text information is a common working scenario. With the surge of information, users urgently need computer technology that can automatically classify and manage incoming data, so that they can browse and query by category. If the amount of data increases further, a traditional text queue can no longer adequately display the text information. At that point, the clustering results need to be displayed visually in two or three dimensions, so that users can more conveniently understand how the information is distributed and acquire information accurately...

Application Information

IPC(8): G06F17/30
Inventors: 金烨, 徐诗恒
Owner: 中国人民解放军总参谋部第五十七研究所