Hot word processing method and apparatus

A hot word processing method based on word segmentation, applied in the computer field, that addresses the low efficiency of manually deleting hot words with a low feature level.

Inactive Publication Date: 2017-07-04
BEIJING GRIDSUM TECH CO LTD

AI Technical Summary

Problems solved by technology

[0004] The embodiments of the present application provide a hot word processing method and apparatus to at least solve the technical problem in the prior art that manually deleting hot words with a low feature level is inefficient.

Embodiment Construction

[0021] To enable those skilled in the art to better understand the solution of the present application, the technical solutions in the embodiments of the application are described clearly and completely below in conjunction with the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of the application, not all of them. All other embodiments obtained by persons of ordinary skill in the art based on the embodiments in this application, without creative effort, shall fall within the scope of protection of this application.

[0022] It should be noted that the terms "first" and "second" in the description and claims of the present application and in the above drawings are used to distinguish similar objects, and are not necessarily used to describe a specific order or sequence. It is to be understood that the data so used are interchangeable under appropriate circumstances such...

Abstract

The present application discloses a hot word processing method and apparatus. The method comprises: performing word segmentation on a target corpus obtained from the text of the target tags to obtain a plurality of segmented words; clustering the plurality of segmented words to obtain a plurality of clustering result sets; calculating an entropy value of each of the plurality of segmented words in a clustering set, wherein the entropy value characterizes the feature level of the segmented word, and the clustering set is the set formed by the plurality of clustering result sets; selecting, from the plurality of segmented words, those whose entropy value is greater than a preset threshold to obtain target segmented words; and deleting the target segmented words from the statistical hot words associated with the target tags. The method and apparatus solve the technical problem in the prior art that manually deleting hot words with a low feature level is inefficient.
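The entropy-based filtering step in the abstract can be illustrated with a minimal Python sketch. It rests on assumptions that the abstract does not spell out: each clustering result set is represented as a list of tokens, the entropy is taken to be the Shannon entropy of a word's occurrence distribution across those sets, and the clustering itself is assumed to have been done already. The names `word_entropies`, `filter_hot_words`, and `threshold` are hypothetical.

```python
import math
from collections import Counter


def word_entropies(clusters):
    """Return {word: entropy} over a list of clusters (each a list of tokens).

    Assumption: entropy is the Shannon entropy (base 2) of how a word's
    occurrences are spread across the clusters. A word spread evenly over
    many clusters gets a high value, which is read as a low feature level.
    """
    counts = [Counter(tokens) for tokens in clusters]
    vocab = set().union(*counts)
    entropies = {}
    for word in vocab:
        freqs = [c[word] for c in counts if c[word] > 0]
        total = sum(freqs)
        entropies[word] = sum(
            (f / total) * math.log2(total / f) for f in freqs
        )
    return entropies


def filter_hot_words(hot_words, clusters, threshold):
    """Drop hot words whose entropy exceeds the (hypothetical) threshold."""
    entropies = word_entropies(clusters)
    return [w for w in hot_words if entropies.get(w, 0.0) <= threshold]


# Toy clusters standing in for divorce, traffic accident, and civil dispute texts.
clusters = [
    ["plaintiff", "defendant", "divorce", "custody"],
    ["plaintiff", "defendant", "collision", "vehicle"],
    ["plaintiff", "defendant", "contract", "debt"],
]
remaining = filter_hot_words(["plaintiff", "defendant", "divorce"], clusters, 1.0)
```

In this toy input, "plaintiff" and "defendant" occur equally in all three clusters, so their entropy exceeds the threshold of 1.0 and they are removed, while "divorce" occurs in only one cluster and is kept.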

Description

Technical field

[0001] The present application relates to the field of computers, and in particular to a method and apparatus for processing hot words.

Background

[0002] When analyzing a topic, it is usually necessary to count the hot words in the topic, where hot words are the N words that appear in a certain percentage of the texts related to the topic. When counting hot words, it is common to find that the hot words of different topics in the same field are highly similar, and that some words common to the field appear as hot words in almost every topic in it. For example, when analyzing legal cases, words such as "plaintiff" and "defendant" appear in every topic, whether the topic is a divorce case, a traffic accident case, or a civil dispute case. Such hot words have a low feature level and are non-feature words. Therefore, it is not helpful to express the characteristics of...
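The definition of hot words given in the background (the N words that appear in a certain percentage of a topic's texts) could be sketched roughly as follows. This is only an illustration under assumptions: the texts are already segmented into token lists, and the function name `count_hot_words` and the parameters `min_ratio` and `top_n` are hypothetical, not taken from the application.

```python
from collections import Counter


def count_hot_words(documents, min_ratio, top_n):
    """Return up to top_n words that appear in at least min_ratio of documents.

    documents: list of token lists (one list per text of the topic).
    Words are ranked by document frequency, with ties broken alphabetically.
    """
    if not documents:
        return []
    doc_freq = Counter()
    for tokens in documents:
        # Count each word once per document, however often it repeats.
        doc_freq.update(set(tokens))
    qualifying = [
        (word, count)
        for word, count in doc_freq.items()
        if count / len(documents) >= min_ratio
    ]
    qualifying.sort(key=lambda item: (-item[1], item[0]))
    return [word for word, _ in qualifying[:top_n]]


documents = [
    ["plaintiff", "defendant", "divorce"],
    ["plaintiff", "defendant", "custody"],
    ["plaintiff", "divorce", "divorce"],
]
hot_words = count_hot_words(documents, min_ratio=0.6, top_n=2)
```

With these toy texts the field-wide words "plaintiff" and "defendant" crowd out the topic-specific "divorce", which is the situation the background describes.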

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F16/355
Inventor: 李新国
Owner: BEIJING GRIDSUM TECH CO LTD