Efficient and accurate noisy word processing method for text based on knowledge graph

A knowledge-graph-based method for efficient and accurate noise word processing. It addresses problems such as the lack of any ability to filter noise words from speech-to-text output, and the improper deletion or missed deletion of homophones and synonyms, which affects sentence meaning. It achieves efficient and accurate noise word filtering, overcomes this interference, and supports flexible expansion and real-time modification.

Pending Publication Date: 2021-03-12
上海适享文化传播有限公司

AI Technical Summary

Problems solved by technology

[0003] However, current noise word processing methods filter out sensitive words by matching the words of a text against a complete sensitive word lexicon. If a sensitive word is found in the text, it is removed. These methods focus only on sensitive words. In spoken language, many auxiliary (filler) words are noise words, yet there is no ability to filter noise words from speech-to-text output. Homophones and synonyms are easily mishandled during filtering, being deleted by mistake or missed, which affects the meaning of the sentence.

Examples

Embodiment 1

[0026] As shown in Figure 1, the present invention provides a technical solution: an efficient and accurate noise word processing method for text based on a knowledge graph, including the following steps:

[0027] S1. Build a lexicon of words to be filtered, build a business knowledge graph, and add the various homophones of business-related words;

[0028] S2. Add a weight to each of the above words;

[0029] S3. Segment the text with a word segmentation tool;

[0030] S4. First correct homophones in the text into business words through the business knowledge graph, and record all business words that appear in the text;

[0031] S5. Match the corrected text against the filter words; the recorded business words are not affected by the filter;

[0032] S6. Output the filtered text.
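
Steps S4 to S6 can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes the text has already been segmented into tokens (S3), uses a plain dictionary in place of the business knowledge graph, and all names and sample words (`process`, `lyke`, `like`, and so on) are hypothetical.

```python
def correct_homophones(tokens, homophone_to_business):
    """S4: map homophone variants to business words and record the business words seen.

    `homophone_to_business` is a hypothetical stand-in for the knowledge graph lookup.
    """
    business_words = set(homophone_to_business.values())
    corrected, recorded = [], set()
    for token in tokens:
        word = homophone_to_business.get(token, token)
        if word in business_words:
            recorded.add(word)
        corrected.append(word)
    return corrected, recorded


def filter_noise(tokens, filter_words, protected):
    """S5: drop tokens found in the filter lexicon, except recorded business words."""
    return [t for t in tokens if t in protected or t not in filter_words]


def process(tokens, homophone_to_business, filter_words):
    """S4 -> S5 -> S6 on pre-segmented text."""
    corrected, recorded = correct_homophones(tokens, homophone_to_business)
    return filter_noise(corrected, filter_words, recorded)


# Hypothetical data: "like" is both a spoken filler and a business word.
tokens = ["um", "press", "lyke", "yeah"]
homophones = {"lyke": "like"}
filter_words = {"um", "yeah", "like"}
print(process(tokens, homophones, filter_words))
```

In this toy run the filler tokens are dropped, while "like" survives because S4 recorded it as a business word before the filter in S5 was applied.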

[0033] According to the above technical solution, the filter lexicon in S1 is connected to network data, and the lexicon is classified into categories including political power, pornography, violence...

Embodiment 2

[0042] The invention provides a technical solution: an efficient and accurate noise word processing method for text based on a knowledge graph:

[0043] S1. Build a lexicon of words to be filtered, build a business knowledge graph, and add the various homophones of business-related words. Classifying the filter lexicon well improves its reusability;

[0044] S2. Add a weight to each of the above words;

[0045] S3. Use the word segmentation tool to segment the text and adjust the segmentation result:

[0046] Word weight: ["collect" 200, "concentrate" 100]

[0047] Text: "Like collecting Chinese independent innovation products"

[0048] Segmentation results: "Like-collection-China-independent-innovation-of-product";
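
One way word weights can steer an ambiguous segmentation is sketched below. This is an assumption-laden illustration, not the tool used in the patent: the source does not name its segmentation tool or algorithm, so a simple dynamic program that maximizes the total weight of dictionary words is swapped in. The Chinese strings are our own back-translation of the English example (收集 "collect", 集中 "concentrate"), and every weight other than 200 and 100 is hypothetical.

```python
def segment(text, weights, max_word_len=4):
    """Split `text` so that the summed weight of dictionary words is maximal.

    Single characters are always allowed and score 0 unless listed in `weights`.
    """
    # best[i] holds (score, tokens) for the best segmentation of text[:i]
    best = [(0, [])]
    for end in range(1, len(text) + 1):
        candidates = []
        for start in range(max(0, end - max_word_len), end):
            piece = text[start:end]
            if len(piece) == 1 or piece in weights:
                score, tokens = best[start]
                candidates.append((score + weights.get(piece, 0), tokens + [piece]))
        best.append(max(candidates, key=lambda c: c[0]))
    return best[-1][1]


# Assumed back-translation of the example sentence; "收集中国" is ambiguous:
# 收集/中国 (collect / China) versus 收/集中/国 (… / concentrate / …).
weights = {"收集": 200, "集中": 100,
           "喜欢": 50, "中国": 50, "自主": 50, "创新": 50, "产品": 50}
print("-".join(segment("喜欢收集中国自主创新的产品", weights)))
# 喜欢-收集-中国-自主-创新-的-产品
```

With the higher weight on 收集, the split matches the segmentation result quoted above; giving 集中 a much higher weight than 收集 would flip the split to 收/集中/国.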

[0049] S4. First correct homophones in the text into business words through the business knowledge graph, and record all business words that appear in the text:

[0050] Text: "Yeah, I like the warmth of freshman autumn in Harbin"

[0051] Result: ...
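
The lookup behind S4 might be organized as below. This sketch does not reproduce the truncated result above; it only illustrates, with hypothetical names (`BusinessGraph`, `acme`, `akme`), how homophone variants could be attached to business words and extended at run time. A flat lookup table stands in for the knowledge graph, which is an assumption on our part.

```python
class BusinessGraph:
    """Toy stand-in for the business knowledge graph: surface form -> business word."""

    def __init__(self):
        self._canonical = {}

    def add_business_word(self, word, homophones=()):
        """Register a business word and any homophone variants; may be called at any time."""
        self._canonical[word] = word
        for variant in homophones:
            self._canonical[variant] = word

    def correct(self, tokens):
        """Return (corrected tokens, business words seen, in order of first appearance)."""
        corrected = [self._canonical.get(t, t) for t in tokens]
        recorded = list(dict.fromkeys(t for t in corrected if t in self._canonical))
        return corrected, recorded


graph = BusinessGraph()
graph.add_business_word("acme", homophones=["akme"])
print(graph.correct(["i", "use", "akme"]))

# A variant added later takes effect on the next call, with no rebuild step.
graph.add_business_word("acme", homophones=["ackme"])
print(graph.correct(["ackme", "and", "akme"]))
```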

Abstract

The invention discloses an efficient and accurate noise word processing method for text based on a knowledge graph. The method comprises the following steps: S1, building a lexicon of words to be filtered, building a business knowledge graph, and adding the various homophones of business-related words; S2, adding a weight to each word; S3, segmenting the text with a word segmentation tool; S4, correcting homophones in the text into business words through the business knowledge graph, and recording all business words appearing in the text; S5, matching the corrected text against the filter words, wherein the recorded business words are not affected by filtering; S6, outputting the filtered text. The method filters text data accurately and stably, supports flexible expansion and real-time modification, overcomes interference from homophones and synonyms in the text, preserves sentence meaning while filtering noise words, and provides efficient and accurate noise word filtering for speech-to-text scenarios.

Description

Technical field

[0001] The invention relates to the technical field of knowledge graphs, in particular to an efficient and accurate noise word processing method based on knowledge graphs.

Background technique

[0002] A knowledge graph, known in the library and information industry as knowledge domain visualization or knowledge domain mapping, is a series of graphics showing the development process and structural relationships of knowledge. It uses visualization technology to describe knowledge resources and their carriers, and to mine, analyze, construct, draw and display knowledge and its interrelationships. A knowledge graph combines the theories and methods of applied mathematics, graphics, information visualization technology, information science and other disciplines with bibliometric methods such as citation analysis and co-occurrence analysis, and uses a visualized map to display the core structure, development history, fron...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F40/284, G06F16/335, G06F16/338, G06F16/36, G06F16/33
CPC: G06F40/284, G06F16/335, G06F16/367, G06F16/338, G06F16/3334
Inventor: 李抒雁沙涛
Owner: 上海适享文化传播有限公司