Word forming determination model generation method, and new word discovery method and device

A word forming determination model and new word discovery technology, applied in the field of computer networks, that addresses problems affecting the accuracy of new word recognition and improves that accuracy.

Inactive Publication Date: 2017-12-26
CAINIAO SMART LOGISTICS HLDG LTD
1 Cites, 7 Cited by
AI Technical Summary

Problems solved by technology

This appears to be a vicious circle: the accuracy of word segmentation itself depends on the completeness of the existing lexicon. If a word is not in the lexicon, how can the segmentation result be trusted? Consequently, with existing new word discovery methods, a large influx of new words seriously degrades the accuracy of new word recognition.

Examples

Embodiment Construction

[0064] In order to make the purpose, technical solution and advantages of the application clearer, the embodiments of the application will be described in detail below in conjunction with the accompanying drawings. It should be noted that, in the case of no conflict, the embodiments in the present application and the features in the embodiments can be combined arbitrarily with each other.

[0065] In a typical configuration of the present application, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.

[0066] Memory may include volatile memory in computer-readable media, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash RAM. Memory is an example of computer-readable media.

[0067] Computer-readable media, including both permanent and non-permanent, removable and non-removable media, can be implemented by any method or technology for storage of inf...

Abstract

The invention discloses a method for generating a word forming determination model, and a new word discovery method and device. The new word discovery method includes: pre-processing a text to extract a plurality of text blocks; acquiring the word frequency, cohesion degree, and coupling degree of each text block as its word forming feature information; and classifying each text block using a pre-generated word forming determination model and the word forming feature information, so as to recognize new words. New words can thus be discovered automatically, and because the word forming feature information of each text block includes the word frequency, cohesion degree, and coupling degree, the accuracy of new word recognition can be improved.
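The feature-extraction and classification steps in the abstract can be sketched as follows. This is only an illustration under stated assumptions: the excerpt does not define cohesion degree or coupling degree, so the sketch assumes cohesion is the smallest ratio P(block) / (P(left part) * P(right part)) over all split points, and coupling is the smaller of the left-neighbour and right-neighbour entropies. The names `extract_features` and `classify` are hypothetical, sliding character n-grams stand in for the pre-processing step, and a simple threshold rule stands in for the pre-generated word forming determination model, whose form is not given here.

```python
import math
from collections import Counter, defaultdict


def extract_features(text, max_len=3):
    """Return {block: (frequency, cohesion, coupling)} for blocks of 2..max_len chars.

    Assumed definitions (not taken from the patent text):
      cohesion = min over split points of P(block) / (P(left) * P(right))
      coupling = min(left-neighbour entropy, right-neighbour entropy)
    """
    n = len(text)
    counts = Counter()
    left = defaultdict(Counter)   # characters seen immediately before each block
    right = defaultdict(Counter)  # characters seen immediately after each block

    # Sliding n-grams stand in for the "text blocks" produced by pre-processing.
    for size in range(1, max_len + 1):
        for i in range(n - size + 1):
            block = text[i:i + size]
            counts[block] += 1
            if i > 0:
                left[block][text[i - 1]] += 1
            if i + size < n:
                right[block][text[i + size]] += 1

    def prob(block):
        # Relative frequency among all blocks of the same length.
        return counts[block] / (n - len(block) + 1)

    def entropy(neighbours):
        total = sum(neighbours.values())
        if total == 0:
            return 0.0
        return sum((c / total) * math.log(total / c) for c in neighbours.values())

    features = {}
    for block, freq in counts.items():
        if len(block) < 2:
            continue  # single characters are only used as parts of longer blocks
        cohesion = min(
            prob(block) / (prob(block[:k]) * prob(block[k:]))
            for k in range(1, len(block))
        )
        coupling = min(entropy(left[block]), entropy(right[block]))
        features[block] = (freq, cohesion, coupling)
    return features


def classify(features, min_freq, min_cohesion, min_coupling):
    """Threshold rule standing in for the pre-generated determination model."""
    return sorted(
        block
        for block, (freq, cohesion, coupling) in features.items()
        if freq >= min_freq and cohesion >= min_cohesion and coupling >= min_coupling
    )
```

A trained classifier could replace `classify` while consuming the same three-value feature tuples; the thresholds above are placeholders, not values from the source.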

Description

Technical field

[0001] The present application relates to computer network technology, and in particular to a method for generating a word forming determination model, and a method and device for discovering new words.

Background technique

[0002] Processing Chinese text raises difficulties that are uncommon in other languages, such as Chinese word segmentation. Chinese text is a sequence of Chinese characters with no explicit boundary between one word and the next. Word segmentation adds word-boundary marks to the text so that the resulting word string fully reflects the original meaning of the sentence. So, how does the computer know whether the word segmentation result of "combined into molecules" is "combined / synthesized / molecule", or "combined / formed / molecule", or "combined / component / sub"? This is the problem of ambiguity in Chinese word segmentation, and many word segmentation m...
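The segmentation ambiguity described in the background can be illustrated with a small sketch. It is a toy example, not the method of the application: the function name `segmentations`, the word list `TOY_LEXICON`, and the unspaced English string are all hypothetical, chosen only to show how one character sequence can admit several dictionary-consistent splits.

```python
def segmentations(text, lexicon):
    """Return every way to split `text` into a sequence of lexicon entries."""
    if not text:
        return [[]]  # one way to segment the empty string: no words
    results = []
    for end in range(1, len(text) + 1):
        head = text[:end]
        if head in lexicon:
            # Keep this word and segment the remainder recursively.
            for tail in segmentations(text[end:], lexicon):
                results.append([head] + tail)
    return results


# Hypothetical toy word list; a real segmenter would use a full lexicon.
TOY_LEXICON = {"no", "now", "where", "here", "nowhere"}

ambiguous = segmentations("nowhere", TOY_LEXICON)
unknown = segmentations("nowhereish", TOY_LEXICON)
```

Here `ambiguous` holds three competing splits of the same string, while `unknown` is empty because the trailing characters match no entry in the word list, which shows how a purely dictionary-based segmenter depends on the word list already being complete.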

Application Information

IPC(8): G06F17/27
CPC: G06F40/284
Inventor: 王国印, 郑恒
Owner: CAINIAO SMART LOGISTICS HLDG LTD