Information processing method and related equipment

A technique based on preset thresholds and keyword groups, applied in the field of information processing, that addresses problems such as the limited meaning expressed by short keywords.

Pending Publication Date: 2020-05-08
BEIJING GRIDSUM TECH CO LTD

AI Technical Summary

Problems solved by technology

[0003] Methods based on text keyword extraction can only extract relatively short words (generally two characters), and methods based on topic clustering can only present a topic's content as relatively short words (generally two characters), so both express limited meaning.



Embodiment Construction

[0064] The embodiment of the present invention provides an information processing method and related equipment, which can not only determine the subject of the text but also obtain relatively long keywords and phrases corresponding to the text, with richer meaning and higher readability, which is more helpful for data analysis.

[0065] The terms "first", "second", "third", "fourth", etc. (if any) in the description and claims of the present invention and the above drawings are used to distinguish similar objects, and are not necessarily used to describe a specific order or sequence. It is to be understood that the terms so used are interchangeable under appropriate circumstances such that the embodiments described herein can be practiced in sequences other than those illustrated or described herein. Furthermore, the terms "comprising" and "having", as well as any variations thereof, are intended to cover a non-exclusive inclusion, for example, a process, method, system, pro...


Abstract

The embodiment of the invention provides an information processing method and related equipment that can not only determine the topic of a text but also obtain relatively long keywords and phrases corresponding to the text, which are richer in meaning, more readable, and more helpful for data analysis. The method comprises: obtaining a target text; preprocessing the target text to obtain a target corpus set; inputting the target corpus set into a preset topic model to determine the topic corresponding to each word in the target corpus set; determining each topic whose word frequency in the target corpus set is greater than a second preset threshold as a topic of the target text; determining a target sub-tree according to the phrase syntax tree corresponding to the target text; combining the nouns in the first sub-tree to obtain keyword groups corresponding to the target text; and determining each keyword group whose word frequency is greater than a third preset threshold, among the keyword groups corresponding to the target text, as a keyword group of the target text.
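The two thresholding steps and the noun-merging step in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation. The excerpt does not say how the target (or first) sub-tree is selected, so the sketch assumes it is a noun-phrase node whose children are all leaves. The tuple encoding of the parse tree, the Penn Treebank-style tags (`NP`, `NN`), the function names, and the sample data are all hypothetical, and the per-word topic labels stand in for the output of the preset topic model.

```python
from collections import Counter

# Hypothetical parse-tree encoding (an assumption, not the patent's format):
# internal node = (label, child, child, ...), leaf = (pos_tag, word).

def is_leaf(node):
    return len(node) == 2 and isinstance(node[1], str)

def noun_groups(tree, phrase_label="NP", noun_prefix="NN"):
    """Yield merged nouns from each `phrase_label` node whose children are all
    leaves (an assumed stand-in for the sub-tree named in the abstract)."""
    if is_leaf(tree):
        return
    children = tree[1:]
    if tree[0] == phrase_label and all(is_leaf(c) for c in children):
        nouns = [word for tag, word in children if tag.startswith(noun_prefix)]
        if len(nouns) >= 2:  # a keyword group needs at least two nouns
            yield " ".join(nouns)
        return
    for child in children:
        yield from noun_groups(child, phrase_label, noun_prefix)

def frequent(items, threshold):
    """Return the items whose frequency is strictly greater than `threshold`."""
    return {item for item, n in Counter(items).items() if n > threshold}

def text_topics(word_topics, threshold):
    """word_topics: (word, topic_id) pairs, as a topic model might label them."""
    return frequent((topic for _, topic in word_topics), threshold)

def keyword_groups(trees, threshold):
    """Keep noun groups that occur more than `threshold` times across trees."""
    return frequent((g for t in trees for g in noun_groups(t)), threshold)

# Made-up sample data for illustration only.
sample_trees = [
    ("S", ("NP", ("NN", "topic"), ("NN", "model")),
          ("VP", ("VBZ", "assigns"), ("NP", ("NNS", "labels")))),
    ("S", ("NP", ("DT", "the"), ("NN", "topic"), ("NN", "model")),
          ("VP", ("VBZ", "converges"))),
]
sample_word_topics = [("topic", 0), ("model", 0), ("assigns", 1), ("labels", 0)]

topics = text_topics(sample_word_topics, threshold=2)
groups = keyword_groups(sample_trees, threshold=1)
```

Both filters use a strict "greater than" comparison, matching the wording of the abstract; the threshold values here are arbitrary.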

Description

Technical field

[0001] The invention relates to the field of information processing, in particular to an information processing method and related equipment.

Background technique

[0002] Text keyword extraction based on TextRank: first, the words in the text are used as vertices and the adjacency relationships between words are used as edges to form a graph; the frequency of co-occurrence is used to calculate the weight transfer frequency; then a random walk algorithm iteratively calculates the score of each node in the graph until convergence. Finally, the words are sorted by node score, and the top N words with the highest scores are selected as keywords.

Methods based on topic clustering: topic models establish the corresponding frequencies between articles, topics, and words. For a text, the topic model can give the topic category of each word it contains. Words are divided into topic categories, and the higher the weight, the grea...
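The TextRank procedure summarized in the background can be illustrated with a short sketch. It is written under stated assumptions rather than taken from the patent: edges link words that appear within a sliding window, edge weights are raw co-occurrence counts, and scores follow the usual damped PageRank-style update. The function name, the window size, the damping factor of 0.85, and the convergence tolerance are all choices made for this example.

```python
from collections import defaultdict

def textrank_keywords(words, top_n=3, window=2, damping=0.85,
                      tol=1e-6, max_iter=100):
    """Rank words with a weighted, undirected co-occurrence graph."""
    # Build the graph: each pair of distinct words within `window` positions
    # of each other adds 1 to the weight of the edge between them.
    weights = defaultdict(lambda: defaultdict(float))
    for i, w in enumerate(words):
        for j in range(i + 1, min(i + window, len(words))):
            v = words[j]
            if v != w:
                weights[w][v] += 1.0
                weights[v][w] += 1.0

    nodes = list(weights)
    if not nodes:
        return []

    scores = {n: 1.0 for n in nodes}
    out_sum = {n: sum(weights[n].values()) for n in nodes}

    # Iterate the score update until the largest change falls below `tol`.
    for _ in range(max_iter):
        new_scores = {}
        for n in nodes:
            rank = sum(weights[m][n] / out_sum[m] * scores[m]
                       for m in weights[n])
            new_scores[n] = (1 - damping) + damping * rank
        delta = max(abs(new_scores[n] - scores[n]) for n in nodes)
        scores = new_scores
        if delta < tol:
            break

    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(nodes, key=lambda n: (-scores[n], n))[:top_n]
```

Because each vertex is a single word, the output is a list of single words, which illustrates the limitation noted in paragraph [0003].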


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/35, G06F16/332
Inventor: 陈万礼
Owner: BEIJING GRIDSUM TECH CO LTD