New word recognition method, device, computer device and storage medium

A new word recognition technology, applied in computing, special data processing applications, instruments, etc., that can address the poor effectiveness of new word discovery.

Active Publication Date: 2019-03-01
PING AN TECH (SHENZHEN) CO LTD
Cites: 4. Cited by: 22.

AI Technical Summary

Problems solved by technology

[0003] The embodiment of the present application provides a new word recognition method, device, computer equipment and co

Embodiment Construction

[0016] The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, not all, of the embodiments of the present application. All other embodiments obtained by persons of ordinary skill in the art based on these embodiments, without creative effort, fall within the scope of protection of this application.

[0017] It should be understood that, when used in this specification and the appended claims, the terms "comprising" and "comprises" indicate the presence of the described features, integers, steps, operations, elements and/or components, but do not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or collections thereof.

[0018] It should also be understood that the terminology used in the specificati...

Abstract

Embodiments of the present application provide a new word recognition method, apparatus, computer device, and computer-readable storage medium. The method comprises the following steps: acquiring a text corpus; according to preset sentence endpoints, dividing the text corpus into candidate words with a length of 2 to N, where N is a natural number and N >= 2; judging whether a candidate word satisfies a preset condition; determining the candidate word as a candidate new word if it satisfies the preset condition; judging whether the candidate new word is included in a preset vocabulary; and determining the candidate new word as a new word if it is not included in the preset vocabulary. Embodiments of the present application are based on natural language processing. The text corpus is segmented at the preset sentence endpoints to obtain candidate words, which improves segmentation accuracy, and new words are identified by screening candidate words and then candidate new words, which improves the accuracy and efficiency of new word discovery.
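
The steps in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the set of sentence endpoints, the function names, and in particular the "preset condition" (here a simple minimum occurrence count) are hypothetical, since the abstract does not specify them.

```python
import re
from collections import Counter

# Assumption: the abstract only says the endpoints are "preset";
# this punctuation set is a hypothetical choice.
SENTENCE_ENDPOINTS = "。！？；，,.!?;"


def split_sentences(corpus, endpoints=SENTENCE_ENDPOINTS):
    """Cut the corpus at the preset sentence endpoints."""
    pattern = "[" + re.escape(endpoints) + "]+"
    return [s.strip() for s in re.split(pattern, corpus) if s.strip()]


def candidate_words(sentences, max_len):
    """Yield every substring of length 2..max_len inside each sentence."""
    for sentence in sentences:
        for n in range(2, max_len + 1):
            for i in range(len(sentence) - n + 1):
                yield sentence[i:i + n]


def find_new_words(corpus, vocabulary, max_len=4, min_count=2):
    """Return candidates that pass the condition and are not in the vocabulary.

    Assumption: the "preset condition" is modelled as count >= min_count.
    The source does not say what the real condition is.
    """
    if max_len < 2:
        raise ValueError("max_len (N) must be at least 2")
    counts = Counter(candidate_words(split_sentences(corpus), max_len))
    candidate_new_words = [w for w, c in counts.items() if c >= min_count]
    return sorted(w for w in candidate_new_words if w not in vocabulary)


if __name__ == "__main__":
    corpus = "区块链很火。区块链技术。"
    print(find_new_words(corpus, vocabulary={"技术", "区块"}, max_len=3))
```

Because candidates never span a sentence endpoint, no candidate word mixes characters from two different sentences. The frequency threshold is only a placeholder for whatever screening condition an implementation actually uses.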

Description

Technical field

[0001] The present application relates to the technical field of natural language processing, and in particular to a new word recognition method, device, computer equipment and computer-readable storage medium.

Background

[0002] Chinese word segmentation is a basic technology of current NLP (Natural Language Processing) projects, and its accuracy directly affects the final performance of such projects. The discovery of new words in turn has a direct impact on the accuracy of a word segmentation system. In traditional new word discovery, the text is usually segmented first, and the remaining fragments that fail to match are then guessed to be new words. However, the accuracy of word segmentation depends on the completeness of the lexicon, so the effect of new word discovery is poor.

Summary of the invention

[0003] The embodiment of the present application provides a new word reco...
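
The limitation of the traditional approach described in the background can be illustrated with a small sketch. It assumes forward maximum matching as the lexicon-based segmenter, which is a common choice but is not named in the source; the function names and the toy lexicons are hypothetical.

```python
def forward_max_match(text, lexicon, max_len=4):
    """Greedy lexicon segmentation: take the longest lexicon entry at each
    position, falling back to a single character when nothing matches."""
    tokens, i = [], 0
    while i < len(text):
        for n in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + n]
            if n == 1 or piece in lexicon:
                tokens.append(piece)
                i += n
                break
    return tokens


def unmatched_fragments(text, lexicon, max_len=4):
    """Join consecutive tokens that are not in the lexicon; the traditional
    method guesses that these leftover fragments are new words."""
    fragments, run = [], ""
    for token in forward_max_match(text, lexicon, max_len):
        if token in lexicon:
            if run:
                fragments.append(run)
                run = ""
        else:
            run += token
    if run:
        fragments.append(run)
    return fragments


if __name__ == "__main__":
    text = "区块链技术很火"
    print(unmatched_fragments(text, {"技术", "很", "火"}))
    print(unmatched_fragments(text, {"技术", "很", "火", "区块"}))
```

With the first lexicon the leftover fragment is the whole unknown term, but once the lexicon contains a partial entry such as "区块", only a single stray character is left over. The guessed new word therefore changes with the lexicon contents, which is the dependence the background section criticizes.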

Application Information

IPC(8): G06F17/27
CPC: G06F40/289, G06F40/30
Inventors: 马骏, 王少军
Owner: PING AN TECH (SHENZHEN) CO LTD