Natural language syntactic analysis method

A natural language syntactic analysis technology, applied in special data processing applications, instruments, electrical digital data processing, etc. It addresses the problems that the PCFG method has insufficient countermeasures for some structural features of natural language, that these features are processed inappropriately, and that the accuracy of syntactic analysis results suffers as a consequence.

Active Publication Date: 2019-07-16
BEIJING YU ZI CHENG SCI & TECH CO LTD
7 Cites, 2 Cited by

AI Technical Summary

Problems solved by technology

From this, a conflict arises between the randomness of the corpus and some fundamental syntactic functions and definitions inherent in natural language itself.
[0122] [2] For some important structural features in natural language, the PCFG method (including the lexicalized PCFG method) has insufficient countermeasures and cannot deal with them properly.
However, such an arrangement is likely to seriously affect the accuracy of the syntactic analysis results: if the computer makes a misjudgment in the lexical analysis stage, that misjudgment cannot be corrected or constrained by the later stages of the syntactic analysis, which lowers the accuracy of the results.



Examples


Example 1

[0272] Example 1: That men who were appointed didn't bother the liberals wasn't marked upon by the press.

[0273] This example sentence is preprocessed to generate a word list (i-a) and a word list (i-b). The word that in the example sentence is structurally ambiguous: it may be a subordinate associative word unit or a finite word unit. Two word lists (i) are therefore generated, and the two are labeled differently.

[0274] When a sentence contains structural ambiguity, multiple word lists (i) must be made for it; the number of word lists (i) follows from the number of structural ambiguities by the multiplication principle of combinatorics. This example sentence contains a further structural ambiguity: upon may be either a particle or a preposition, but due to space limitations it is not analyzed here.
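
The counting rule in [0274] can be illustrated with a minimal sketch. The table of ambiguous words, the reading labels, and the function names below are hypothetical and chosen only for illustration; the patent's actual word-list format is not reproduced in this excerpt. Under those assumptions, each ambiguous word contributes one factor to the product, so the example sentence with two readings for that and two for upon would yield 2 x 2 = 4 candidate word lists.

```python
from itertools import product
from math import prod

# Hypothetical table of structurally ambiguous words and their candidate
# readings. The labels echo the wording of the text; they are not the
# patent's internal encoding.
AMBIGUOUS = {
    "that": ["subordinate associative word", "finite word"],
    "upon": ["particle", "preposition"],
}


def count_word_lists(tokens, ambiguous=AMBIGUOUS):
    """Multiplication principle: multiply the number of readings of each
    ambiguous token; unambiguous tokens contribute a factor of 1."""
    return prod(len(ambiguous[t.lower()]) for t in tokens if t.lower() in ambiguous)


def enumerate_word_lists(tokens, ambiguous=AMBIGUOUS):
    """Return every combination of readings as a list of (token, reading)
    pairs; reading is None for tokens that are not in the table."""
    options = [
        [(t, r) for r in ambiguous[t.lower()]] if t.lower() in ambiguous else [(t, None)]
        for t in tokens
    ]
    return [list(combo) for combo in product(*options)]


sentence = ("That men who were appointed didn't bother the liberals "
            "wasn't marked upon by the press")
tokens = sentence.split()
word_lists = enumerate_word_lists(tokens)
print(count_word_lists(tokens), len(word_lists))
```

Enumerating every combination keeps each reading available to later analysis steps instead of committing to one at the lexical stage, at the cost of a number of word lists that grows multiplicatively with the number of ambiguities.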

[0275] Word list (i-a):

[0276]

[0277] Word list (i-b):

[0278] ...

Example 2

[0444] Example 2: That something you learned is wrong is known to the public.

[0445] The word That in this example sentence produces structural ambiguity. However, due to limited space, only the word list (ii) in which That is preprocessed as a subordinate associative word unit is given, as shown below.

[0446] The adjective wrong in this example sentence acts as the predicative of the clause and is a main component of the sentence. However, to facilitate computer processing, the scheme of the present application temporarily removes the adjective wrong in the preprocessing stage. The predicative wrong of the clause can be restored in the subsequent syntactic structure repair stage.

[0447]

[0448] According to the scheme of the present application, for example sentence 2, an A-B-C joint system as follows can be generated:

[0449]

[0450] B 1 ={to+

[0451] Through the above A1-B1-C1 joint system, the basic framework of the syntactic struct...

Example 3

[0455] Example 3: That that men were appointed didn't bother the liberals wasn't marked upon by the press.

[0456] The two occurrences of that in this example sentence both produce structural ambiguity. However, due to limited space, only the word list (ii) in which both occurrences of that are preprocessed as subordinate associative word units is given, as follows:

[0457]

[0458] According to the scheme of the present application, for example sentence 3, an A-B-C joint system as follows can be generated:

[0459]

[0460] B 1 ={g[PREP](u)=by+

[0461] Through the above A1-B1-C1 joint system, the basic framework of the syntactic structure of example sentence 3 is obtained, as shown in Figure 33.

[0462] The complete syntactic analysis result of example sentence 3 is expressed in string form as follows [see Figure 8]:

[0463] (ROOT(S(SBAR(IN That)(S(SBAR(IN that)(S(NP(NNS men))(VP(VBD were)(VBN appointed))))(VP(VBD did)(RB n't)(VP(VB bother)(NP(DT the)(NNS liberals)))))(...
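
The string in [0463] uses the bracketed label-and-children notation common to treebank-style parser output. As a minimal sketch of how such a string can be read back into a tree, the following reader assumes that every node has the form (LABEL child ...) and that a tag and its word are separated by whitespace. The function names are hypothetical, and because the string above is truncated in this excerpt, the sketch is applied only to the complete inner SBAR fragment.

```python
def parse_bracketed(text):
    """Parse a bracketed tree string into nested (label, children) tuples.
    Children are either nested tuples or plain word strings."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read_node():
        nonlocal pos
        if pos >= len(tokens) or tokens[pos] != "(":
            raise ValueError("expected '('")
        pos += 1
        if pos >= len(tokens) or tokens[pos] in ("(", ")"):
            raise ValueError("expected a node label")
        label = tokens[pos]
        pos += 1
        children = []
        while pos < len(tokens) and tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read_node())
            else:
                children.append(tokens[pos])
                pos += 1
        if pos >= len(tokens):
            raise ValueError("unbalanced brackets")
        pos += 1  # consume the closing ')'
        return (label, children)

    tree = read_node()
    if pos != len(tokens):
        raise ValueError("trailing tokens after the root node")
    return tree


def leaves(tree):
    """Collect the words of the tree from left to right."""
    _, children = tree
    words = []
    for child in children:
        if isinstance(child, tuple):
            words.extend(leaves(child))
        else:
            words.append(child)
    return words


fragment = "(SBAR(IN that)(S(NP(NNS men))(VP(VBD were)(VBN appointed))))"
tree = parse_bracketed(fragment)
print(tree[0], leaves(tree))
```

A truncated string such as the one in [0463] is rejected with a ValueError rather than silently completed, which seems the safer behavior when the source text is incomplete.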



Abstract

The invention discloses a natural language syntactic analysis method. The invention points out that the Berkeley Parser and the Stanford Parser, two internationally leading natural language parsers that are currently widely recognized in computer science, have some serious technical flaws, and it provides a technical scheme to address those flaws. A brand-new mathematical model for describing sentences is established, and a computer syntactic analysis method is provided on the basis of that model. By these technical means, the lexical analysis, syntactic analysis and semantic analysis of computer natural language processing are organically unified and the mutual constraints among the three are strengthened, which improves the computer's ability to eliminate structural ambiguity. The natural language syntactic analysis method is technically difficult, highly comprehensive, widely applicable and very computationally intensive; it conforms to the natural laws of mathematics and computer science and helps improve the accuracy of computer syntactic analysis.

Description

[0001] This application claims priority to the Chinese patent application filed on March 22, 2019, with application number 201910224013.X and the title "A Method for Natural Language Syntax Analysis", the entire content of which is incorporated in this application by reference.

Technical field

[0002] The invention relates to the field of computer data processing, in particular to a method for syntactic analysis of natural language.

Background art

[0003] Natural language processing (NLP) is a very important direction in the field of computer science and artificial intelligence. It studies various theories and methods that can realize effective communication between humans and computers using natural language.

[0004] Syntactic parsing is one of the key tasks in natural language processing (NLP). The basic task of syntactic analysis is to determine the syntactic structure of a sentence or the interdependence of words within a sentenc...


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F17/27
CPC: G06F40/216, G06F40/211, G06F40/289, G06F40/30
Inventor: 秦一男, 朱江
Owner: BEIJING YU ZI CHENG SCI & TECH CO LTD