New word discovery method and system

A new word discovery technology, applied in the field of new word discovery methods and systems, that addresses the problems that some words cannot be found and that words with a low frequency of occurrence are not identified as new words.

Active Publication Date: 2019-11-08
THE FOURTH PARADIGM BEIJING TECH CO LTD
Cites: 7; Cited by: 0

AI Technical Summary

Problems solved by technology

[0004] Exemplary embodiments of the present invention provide a new word discovery method and system to solve at least one of the following problems: some words cannot be found, the results obtained with existing methods are difficult to target at specific fields, and some words with a low frequency of occurrence are not identified as new words.

Embodiment Construction

[0035] Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like numerals refer to like parts throughout. The embodiments are described below in order to explain the present invention by referring to the figures.

[0036] Figure 1 is a flowchart showing a new word discovery method according to an exemplary embodiment of the present invention. As an example, the method may be implemented by a computer program, or may be performed by a dedicated hardware device or by a collection of hardware and software resources for new word discovery, big data computation, an artificial intelligence platform, or data analysis. For example, the method may be executed by a natural language processing platform that provides services related to finding new words.

[0037] Referring to Figure 1, in step S10, a first set of candidate words including a plurality of ca...

Abstract

A new word discovery method and system are provided. The method comprises the following steps: obtaining a first candidate word set by segmenting each sentence in the text; screening the first candidate word set according to part-of-speech rules of a first dictionary to obtain a second candidate word set; segmenting each sentence with a second dictionary to obtain segmented words, and determining a candidate word coefficient according to the relationship between the boundary words of each candidate word in the second candidate word set and the boundary words of the segmented words; and adjusting the candidate word coefficients according to the internal cohesion degree and the boundary freedom degree of the candidate words in the second candidate word set, then selecting new words from the second candidate word set according to the adjusted coefficients. The first dictionary may be the same as or different from the second dictionary. The method and system reduce the influence of specific parts of speech and of the dictionary on the discovery result, and because the candidate word coefficient is derived from the boundary relationship, the internal cohesion degree, and the boundary freedom degree, the new word discovery result is more accurate.
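The final step of the abstract, adjusting a candidate word coefficient by internal cohesion and boundary freedom, can be illustrated with a minimal Python sketch. The abstract does not say how either quantity is computed, so everything specific here is an assumption: cohesion is taken as the minimum pointwise mutual information over all binary splits of a candidate, boundary freedom as the smaller of the left and right neighbour entropies (both are measures commonly used in new word discovery), candidates are simple n-grams of adjacent tokens, and the adjustment rule (multiplying the three values) is hypothetical. The names `ngram_stats`, `cohesion`, `score_candidates`, and `base_coeff` are illustrative and do not come from the patent; the part-of-speech screening and dictionary boundary steps are not modelled.

```python
import math
from collections import Counter, defaultdict


def ngram_stats(sentences, max_n=3):
    """Count n-grams (as tuples) and record each n-gram's left/right neighbours."""
    counts = Counter()
    left = defaultdict(Counter)
    right = defaultdict(Counter)
    for tokens in sentences:
        for n in range(1, max_n + 1):
            for i in range(len(tokens) - n + 1):
                gram = tuple(tokens[i:i + n])
                counts[gram] += 1
                if i > 0:
                    left[gram][tokens[i - 1]] += 1
                if i + n < len(tokens):
                    right[gram][tokens[i + n]] += 1
    return counts, left, right


def entropy(counter):
    """Shannon entropy (natural log) of a neighbour distribution; 0.0 if empty."""
    total = sum(counter.values())
    if total == 0:
        return 0.0
    return -sum(c / total * math.log(c / total) for c in counter.values())


def cohesion(gram, counts, total):
    """Assumed cohesion measure: minimum PMI over all binary splits of the candidate.

    Probabilities are approximated as count / total number of single tokens.
    """
    p_gram = counts[gram] / total
    return min(
        math.log(p_gram / ((counts[gram[:k]] / total) * (counts[gram[k:]] / total)))
        for k in range(1, len(gram))
    )


def score_candidates(sentences, base_coeff=None, max_n=3):
    """Adjust each candidate's coefficient by cohesion and boundary freedom.

    base_coeff is a hypothetical mapping from candidate to the coefficient
    produced by the boundary-relation step; candidates missing from it get 1.0.
    The multiplicative adjustment is an illustrative choice, not the patent's rule.
    """
    base_coeff = base_coeff or {}
    counts, left, right = ngram_stats(sentences, max_n)
    total = sum(c for g, c in counts.items() if len(g) == 1)
    scores = {}
    for gram in counts:
        if len(gram) < 2:
            continue  # single tokens are not multi-token candidates
        coh = cohesion(gram, counts, total)
        freedom = min(entropy(left[gram]), entropy(right[gram]))
        scores[gram] = base_coeff.get(gram, 1.0) * max(coh, 0.0) * freedom
    return scores
```

Calling `score_candidates` on a list of tokenized sentences returns a score per multi-token candidate. Under these assumptions, a candidate that appears in varied left and right contexts and whose parts rarely occur apart ranks higher than one that only ever appears at a sentence boundary, even when both are infrequent.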

Description

Technical field

[0001] The present invention generally relates to natural language processing, and more specifically relates to a new word discovery method and system.

Background

[0002] Existing new word discovery methods combine the results of word segmentation into candidate words and then screen new words from those candidates, but word segmentation itself relies on a dictionary. For texts in a new field, where the new words are not yet known, the word segmentation may be wrong, so that the new word is never recognized.

[0003] On the other hand, the results obtained by threshold filtering based on the characteristics of candidate words are difficult to target at specific fields. For example, candidate words are usually sorted by frequency of occurrence, and most of the top results are common words in general fields, such as "the first time", "every time", "years", etc., while special words such as "insured pers...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F17/27
CPC: G06F40/289
Inventor: 赵汉光, 王珵, 戴文渊
Owner: THE FOURTH PARADIGM BEIJING TECH CO LTD