Risk phrase recognition method and device, electronic equipment and storage medium
A recognition method and recognition algorithm technology, applied in the fields of electrical digital data processing, special data processing applications, instruments, etc. It addresses the poor recognition of risk phrases and achieves comprehensive, rapid and accurate phrase recognition, with the recognized phrases carrying a large amount of information.
Examples
Embodiment 1
[0069] The embodiment of this application provides a risk phrase identification method. As shown in Figure 1, the method includes:
[0070] S100. Perform phrase recognition on the risk description text by using a predetermined phrase recognition algorithm to obtain a first risk phrase list.
[0071] S200. Process the risk description text by using a predetermined word segmentation tool to obtain a second risk phrase list.
[0072] S300. Combine the first risk phrase list and the second risk phrase list to determine a risk phrase list including multiple risk phrases. Specifically, when merging, duplicate risk phrases are removed.
[0073] In the above embodiment, the predetermined phrase recognition algorithm on its own yields too few risk phrases to characterize the information expressed by the original risk text. Processing the risk description text with the word segmentation tool as well expands the set of risk phrases, the accura...
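The merge in step S300 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `merge_risk_phrases` and the sample phrases are hypothetical, and keeping first-seen order is an assumption, since the text only states that duplicates are removed.

```python
def merge_risk_phrases(first_list, second_list):
    """Merge two risk phrase lists, removing duplicates (S300).

    Order of first appearance is preserved; this ordering is an
    assumption made for the sketch, not something the source specifies.
    """
    seen = set()
    merged = []
    for phrase in list(first_list) + list(second_list):
        if phrase not in seen:
            seen.add(phrase)
            merged.append(phrase)
    return merged


# Hypothetical outputs of S100 and S200.
first = ["credit default", "liquidity shortage"]
second = ["liquidity shortage", "exchange rate fluctuation"]
merged = merge_risk_phrases(first, second)
```

Here `merged` holds three phrases, because "liquidity shortage" appears in both inputs and is kept only once.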
Embodiment 2
[0075] As shown in Figure 2, the embodiment of the present invention provides a possible implementation manner. On the basis of Embodiment 1, step S100 includes the following steps:
[0076] S101. Filter the risk description text based on predetermined filtering rules. The filtering rule is: remove stop words according to a predetermined stop word list, and retain punctuation marks, nouns, verbs, adjectives and degree adverbs.
[0077] S102. Perform part-of-speech tagging on the filtered risk description text, and screen for words with predetermined parts of speech to form the text to be recognized.
[0078] Further, screening for words with predetermined parts of speech includes: selecting nouns, verbs, adjectives and degree adverbs from the filtered risk description text. Retaining punctuation marks, nouns, verbs, adjectives and degree adverbs prevents content that was originally separated by punctuation from being joined together once the punctuation marks are removed...
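One way to read S101 and S102 is sketched below. It assumes the text has already been tokenized and tagged by some part-of-speech tagger, which is not shown. The tag names (`NOUN`, `VERB`, `ADJ`, `DEG_ADV`, `PUNCT`), the stop word list, and both function names are hypothetical. Splitting the result at punctuation is an illustration of why punctuation is retained, not a step the source spells out.

```python
# Hypothetical stop word list and tag set; the source does not give either.
STOP_WORDS = {"the", "a", "of", "to", "is"}
KEPT_TAGS = {"PUNCT", "NOUN", "VERB", "ADJ", "DEG_ADV"}


def filter_tokens(tagged_tokens, stop_words=STOP_WORDS, kept_tags=KEPT_TAGS):
    """Drop stop words and keep only tokens whose tag is in kept_tags."""
    return [
        (word, tag)
        for word, tag in tagged_tokens
        if tag in kept_tags and word.lower() not in stop_words
    ]


def split_on_punctuation(tokens):
    """Group retained words into segments, breaking at punctuation.

    Because punctuation survives filtering, words from either side of
    a punctuation mark never end up in the same segment.
    """
    segments, current = [], []
    for word, tag in tokens:
        if tag == "PUNCT":
            if current:
                segments.append(current)
                current = []
        else:
            current.append(word)
    if current:
        segments.append(current)
    return segments


# Hypothetical pre-tagged input.
tagged = [
    ("The", "DET"), ("company", "NOUN"), ("faces", "VERB"),
    ("very", "DEG_ADV"), ("high", "ADJ"), ("debt", "NOUN"),
    (",", "PUNCT"), ("and", "CONJ"), ("sales", "NOUN"),
    ("declined", "VERB"), (".", "PUNCT"),
]
segments = split_on_punctuation(filter_tokens(tagged))
```

In this example "debt" and "sales" land in different segments because the comma between them was retained during filtering.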
Embodiment 3
[0132] As shown in Figure 3, the embodiment of the present invention provides a possible implementation manner. On the basis of Embodiment 1, step S200 includes the following steps:
[0133] S201. Combine the words in the risk description text to form phrases to be matched.
[0134] S202. Look up the phrases to be matched in the predetermined vocabulary of the word segmentation tool, and determine which phrases match entries in that vocabulary.
[0135] S203. Filter the matching phrases based on a predetermined filtering rule, and use the filtered phrases as risk phrases in the second risk phrase list. The predetermined filtering rules include at least one of the following: filtering out single words; filtering out numbers; filtering out phrases whose number of words is less than a predetermined number. Furthermore, phrases with fewer constituent words than the predetermined number may be common words with a length of less than 3, which furt...
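Steps S201 to S203 can be sketched as follows. The sketch assumes that "combining words" means joining contiguous words into n-grams up to a maximum length, and it models the tool's vocabulary as a plain set. Both are assumptions, as are the function names, the parameters `max_len` and `min_words`, and the sample data.

```python
def candidate_phrases(words, max_len=3):
    """S201: join contiguous words into candidate phrases (n-grams)."""
    candidates = []
    for start in range(len(words)):
        for length in range(1, max_len + 1):
            if start + length <= len(words):
                candidates.append(" ".join(words[start:start + length]))
    return candidates


def second_risk_phrase_list(words, vocabulary, min_words=2, max_len=3):
    """S202 and S203: keep vocabulary matches, then apply filter rules."""
    result = []
    for phrase in candidate_phrases(words, max_len):
        if phrase not in vocabulary:        # S202: vocabulary lookup
            continue
        parts = phrase.split()
        if all(p.isdigit() for p in parts):  # rule: filter out numbers
            continue
        if len(parts) < min_words:           # rule: single or too-short phrases
            continue
        if phrase not in result:
            result.append(phrase)
    return result


# Hypothetical tokenized text and vocabulary.
words = ["interest", "rate", "risk", "rose", "in", "2023"]
vocabulary = {"interest rate", "interest rate risk", "risk", "2023"}
second_list = second_risk_phrase_list(words, vocabulary)
```

With `min_words=2`, the matches "risk" and "2023" are discarded, leaving only the multi-word phrases in `second_list`.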