Method and device for processing a set of related words

A method and device for processing a set of related words, applied in the Internet field, that address the problem of an overly small vocabulary and improve the collection of related words.

Active Publication Date: 2020-09-15
BEIJING GRIDSUM TECH CO LTD
9 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

[0006] The embodiments of the present application provide a method and device for processing a set of related words, so as to at least solve the technical problem that existing word-bag accumulation methods yield too small a vocabulary.

Examples

Embodiment 1

[0032] According to an embodiment of the present application, an embodiment of a method for processing a set of related words is provided. It should be noted that the steps shown in the flowcharts of the accompanying drawings can be executed in a computer system, for example as a set of computer-executable instructions. Also, although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in an order different from that shown or described herein.

[0033] Figure 1 is a flowchart of a method for processing a set of related words according to an embodiment of the present application. As shown in Figure 1, the processing method includes the following steps:

[0034] Step S102, crawling web texts from the target data source based on the related words in the set of related words of the object to be analyzed.

[0035] Step S104, performing word segmentation on the web text to obtain a plurality of text vocabulary items, and obtainin...

Embodiment 2

[0096] According to an embodiment of the present application, an embodiment of a device for processing a set of related words is also provided. As shown in Figure 3, the processing device includes: a crawling unit 10, a processing unit 30, a screening unit 50 and an updating unit 70.

[0097] Wherein, the crawling unit 10 is configured to crawl web texts from the target data source based on related words in the set of related words of the object to be analyzed.

[0098] The processing unit 30 is configured to perform word segmentation on the web text to obtain a plurality of text vocabulary items, and to obtain vocabulary information for each text vocabulary item. The vocabulary information includes association index data for each text vocabulary item and/or part-of-speech information for each text vocabulary item, and the association index data indicates the degree of association between each text vocabulary item and the related word.
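
The vocabulary information described above can be illustrated with a minimal sketch. The excerpt does not say how the association index is calculated, so the sketch assumes one simple possibility: the share of texts containing a word that also contain the related word. The names `VocabInfo`, `segment`, `vocabulary_info` and the toy lexicon `TOY_POS` are hypothetical, and the regular-expression tokenizer is only a stand-in for a real word segmenter and part-of-speech tagger.

```python
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class VocabInfo:
    word: str
    association_index: float  # assumed metric: co-occurrence share, see lead-in
    part_of_speech: str       # from the toy lexicon, "unknown" if absent


# Hypothetical toy lexicon; a real system would use a part-of-speech tagger.
TOY_POS = {"battery": "noun", "recall": "noun", "charges": "verb"}


def segment(text):
    """Stand-in for word segmentation: lowercase alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def vocabulary_info(texts, related_word):
    """Compute assumed association index and part of speech for each word."""
    doc_freq = Counter()  # number of texts containing each word
    co_freq = Counter()   # number of texts containing the word and the related word
    for text in texts:
        words = set(segment(text))
        doc_freq.update(words)
        if related_word in words:
            co_freq.update(words)
    return [
        VocabInfo(w, co_freq[w] / doc_freq[w], TOY_POS.get(w, "unknown"))
        for w in sorted(doc_freq)
        if w != related_word
    ]


texts = [
    "Phone battery recall announced",
    "The phone charges quickly",
    "Battery recall expands",
]
info = {v.word: v for v in vocabulary_info(texts, "phone")}
```

With this toy input, "battery" appears in two texts, one of which also mentions "phone", so its assumed index is 0.5, while "expands" never co-occurs with "phone" and scores 0.0.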

[0099] The screening unit 50 is config...

Abstract

The invention discloses a method and device for processing a set of related words. The processing method comprises the steps of: crawling web text from a target data source on the basis of the related words in a set of related words of an object to be analyzed; performing word segmentation on the web text to obtain a plurality of text vocabulary items, and obtaining vocabulary information for each text vocabulary item, wherein the vocabulary information includes association index data for each text vocabulary item and/or part-of-speech information for each text vocabulary item, and the association index data indicates the degree of association between each text vocabulary item and the related words; screening on the basis of the association index data and/or the part-of-speech information of the plurality of text vocabulary items to obtain the screened related vocabulary; and updating the set of related words with the screened related vocabulary. The method and device provided by the invention solve the technical problem that existing word-bag accumulation methods yield a small vocabulary.
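
The four steps in the abstract (crawl, segment and score, screen, update) can be sketched end to end as follows. This is an illustration under stated assumptions, not the patented implementation: `DATA_SOURCE` is a hypothetical in-memory stand-in for a crawled data source, the association index is assumed to be the fraction of crawled texts containing a word, the 0.5 threshold is arbitrary, and `TOY_POS` is a hypothetical part-of-speech lexicon. Either screening criterion can be switched off to mirror the "and/or" in the text.

```python
from collections import Counter

# Hypothetical in-memory data source; a real crawler would fetch web pages.
DATA_SOURCE = [
    "phone battery recall announced",
    "phone battery overheating reported",
    "weather forecast sunny",
    "recall notice for phone charger",
]

# Hypothetical toy part-of-speech lexicon.
TOY_POS = {"battery": "noun", "recall": "noun", "charger": "noun",
           "announced": "verb", "reported": "verb"}


def crawl(related_words):
    """Stand-in for crawling: keep texts that mention any related word."""
    return [t for t in DATA_SOURCE if any(w in t.split() for w in related_words)]


def screen(candidates, min_index=None, allowed_pos=None):
    """Keep words passing the index threshold and/or the part-of-speech filter."""
    kept = []
    for word, (index, pos) in candidates.items():
        if min_index is not None and index < min_index:
            continue
        if allowed_pos is not None and pos not in allowed_pos:
            continue
        kept.append(word)
    return kept


def process(related_words, min_index=0.5, allowed_pos=("noun",)):
    """Run one crawl, score, screen, update round and return the updated set."""
    texts = crawl(related_words)
    if not texts:
        return set(related_words)
    counts = Counter()
    for text in texts:
        counts.update(set(text.split()))  # whitespace split stands in for segmentation
    # Assumed association index: fraction of crawled texts containing the word.
    candidates = {
        w: (c / len(texts), TOY_POS.get(w, "unknown"))
        for w, c in counts.items()
        if w not in related_words
    }
    return set(related_words) | set(screen(candidates, min_index, allowed_pos))


updated = process({"phone"})
```

Starting from the single related word "phone", three texts are crawled; "battery" and "recall" each appear in two of them and are nouns, so they pass both filters and are added to the set. Calling `process` again with the updated set would repeat the round with the larger vocabulary.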

Description

Technical field

[0001] The present application relates to the Internet field, and in particular to a method and device for processing a set of related words.

Background

[0002] When an enterprise releases a product or service, a government department promulgates a policy, or a breaking event that attracts public attention occurs, online media will inevitably report related news on the Internet, and this news will attract the attention and discussion of netizens. When collecting Internet public opinion content (i.e., web texts related to the object) for an analysis object (such as a current event, product, person, or policy), a web crawler may be used to crawl web texts related to the analysis object. Since crawling does not distinguish whether the content is related to the analysis object, the crawled web text needs to be filtered to filter out the conte...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/335, G06F16/35
CPC: G06F16/335, G06F16/355
Inventor: 梁梦溪, 何鑫
Owner: BEIJING GRIDSUM TECH CO LTD