Automatic mining method and system for new topics

An automatic new-topic mining technology, applied in the field of text processing, that addresses the time and cost of discovering new topics manually.

Active Publication Date: 2020-06-26
ALIPAY (HANGZHOU) INFORMATION TECH CO LTD

AI Technical Summary

Problems solved by technology

Manually discovering new topics in text mining is time-consuming and costly because the volume of text data is large and manpower is limited.


Embodiment Construction

[0019] To illustrate the technical solutions of the embodiments of this specification more clearly, the drawings used in the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description are only some examples or embodiments of this specification, and those skilled in the art can also apply this specification to other similar scenarios. Unless apparent from context or otherwise indicated, like reference numerals in the figures represent like structures or operations.

[0020] It should be understood that "system", "device", "unit" and/or "module" as used herein are a way of distinguishing different components, elements, parts, or assemblies at different levels. However, these words may be replaced by other expressions if those expressions achieve the same purpose.

[0021] As indicated in the specification and claims, the terms "a", "an", "an" and / or "the...


Abstract

One aspect of the invention provides an automatic mining method and system for new topics. The method comprises the steps of obtaining historical text data, and determining a first semantic distance between historical texts in the historical text data; determining at least one first cluster based on the first semantic distance; determining a first topic of each cluster in the at least one first cluster, wherein the first topic reflects the central content of the first cluster; obtaining text collection data containing historical texts and newly added texts, and determining a second semantic distance between texts in the text collection data; determining at least one second cluster based on the second semantic distance; determining a second topic of each cluster in the at least one second cluster, wherein the second topic reflects the central content of the second cluster; and when the third semantic distance between the second topic and any one of the first topics is greater than a preset distance threshold, determining that the second topic is a new topic.
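The steps in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the patented implementation: the source does not say how semantic distance, clustering, or topic extraction are computed, so the sketch substitutes bag-of-words cosine distance, single-linkage threshold clustering, and the cluster medoid as the "topic". It also reads "any one of the first topics" as "every first topic". All function names and threshold values are hypothetical.

```python
import math
from collections import Counter


def cosine_distance(a, b):
    """Assumed stand-in for 'semantic distance': 1 - cosine similarity of word counts."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return 1.0 - dot / norm if norm else 1.0


def cluster(texts, link_threshold):
    """Assumed clustering: texts closer than link_threshold share a cluster (union-find)."""
    parent = list(range(len(texts)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(texts)):
        for j in range(i + 1, len(texts)):
            if cosine_distance(texts[i], texts[j]) <= link_threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i, text in enumerate(texts):
        groups.setdefault(find(i), []).append(text)
    return list(groups.values())


def topic_of(cluster_texts):
    """Assumed topic: the medoid, i.e. the text closest on average to the rest of its cluster."""
    return min(
        cluster_texts,
        key=lambda t: sum(cosine_distance(t, other) for other in cluster_texts),
    )


def mine_new_topics(historical, added, link_threshold=0.6, new_topic_threshold=0.7):
    """Return second topics that are farther than the threshold from every first topic."""
    first_topics = [topic_of(c) for c in cluster(historical, link_threshold)]
    second_topics = [topic_of(c) for c in cluster(historical + added, link_threshold)]
    return [
        t
        for t in second_topics
        if all(cosine_distance(t, f) > new_topic_threshold for f in first_topics)
    ]


if __name__ == "__main__":
    historical = [
        "cannot log in to my account",
        "cannot log in after password reset",
        "refund for my order is delayed",
        "refund for cancelled order is delayed",
    ]
    added = [
        "face scan payment fails at checkout",
        "face scan payment fails in store",
    ]
    print(mine_new_topics(historical, added))
```

In this toy data the two historical clusters (login problems, delayed refunds) reappear in the second clustering and are filtered out, while the cluster formed by the newly added texts has no nearby first topic and is reported as new. A production system would presumably replace the word-count distance with a learned sentence embedding; that choice is outside what the source specifies.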

Description

Technical field

[0001] This specification relates to the field of text processing, in particular to an automatic mining method and system for new topics.

Background technique

[0002] With the rapid expansion of Internet information, the amount of information, especially text data, is increasing exponentially, so mining the value of text data is very important. New topic mining can guide users to improve their products and discover the latest hot spots. It is time-sensitive and valuable, and is an important part of text mining. Manually discovering new topics in text mining is time-consuming and costly because the volume of text data is large and manpower is limited. Therefore, it is desirable to provide an automatic mining method for new topics.

Contents of the invention

[0003] One aspect of this specification provides a method for automatic mining of new topics, the method comprising: acquiring historical text data, determining a first semantic distance between histo...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F40/30, G06F16/35
CPC: G06F16/355
Inventor: 谢杨易
Owner: ALIPAY (HANGZHOU) INFORMATION TECH CO LTD