A Method of Natural Language Syntax Analysis

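A natural language syntactic analysis technique, applicable to natural language data processing, semantic analysis, instruments, and related fields. It addresses problems such as lexical misjudgments that later analysis stages cannot correct or constrain, inadequate handling of important structural features, and conflicts between the randomness of the corpus and the syntactic functions and definitions of natural language.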

Active Publication Date: 2021-02-12
BEIJING YU ZI CHENG SCI & TECH CO LTD

AI Technical Summary

Problems solved by technology

As a result, a conflict arises between the randomness of the corpus and some fundamental syntactic functions and definitions inherent in natural language itself.
[0122] [2] For some important structural features of natural language, the PCFG method (including the lexicalized PCFG method) has no adequate countermeasures and cannot handle them properly.
However, such an arrangement is likely to seriously reduce the accuracy of the syntactic analysis results: if the computer makes a misjudgment in the lexical analysis stage, that misjudgment cannot be corrected or constrained in any of the subsequent stages of syntactic analysis.



Examples


Example 1

[0272] Example 1: That men who were appointed didn't bother the liberals wasn't marked upon by the press.

[0273] This example sentence is preprocessed to generate a word list (i-a) and a word list (i-b). The word That in the example sentence is structurally ambiguous: it may be a subordinate associative word unit or a finite word unit. Two word lists (i) are therefore generated, and each is given a different identifier.

[0274] When a sentence contains structural ambiguity, multiple word lists (i) must be made for it. The number of word lists (i) is obtained by applying the multiplication principle of combinatorics to the number of structural ambiguities. This example sentence contains a second structural ambiguity: upon may be either a particle or a preposition. Due to space limitations, it is not analyzed here.
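The multiplication principle mentioned in [0274] can be illustrated with a minimal sketch. It assumes each ambiguous word has a fixed set of candidate readings; the dictionary `ambiguous_readings` and the function `enumerate_word_list_variants` are hypothetical names introduced for illustration and are not part of the application. The sketch only counts and enumerates combinations; it does not build the word lists themselves.

```python
from itertools import product

# Hypothetical readings for the two ambiguous words of example sentence 1.
# The labels repeat the wording of [0273] and [0274]; the data layout is an assumption.
ambiguous_readings = {
    "That": ["subordinate associative word unit", "finite word unit"],
    "upon": ["particle", "preposition"],
}


def enumerate_word_list_variants(readings):
    """Return one dict per combination of readings, one per candidate word list."""
    words = list(readings)
    combos = product(*(readings[w] for w in words))
    return [dict(zip(words, combo)) for combo in combos]


variants = enumerate_word_list_variants(ambiguous_readings)
print(len(variants))  # 2 readings x 2 readings = 4 combinations
for variant in variants:
    print(variant)
```

If both ambiguities were analyzed, four word lists would be needed; the text gives only two because the ambiguity of upon is set aside.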

[0275] Word list (i-a):

[0276]

[0277] Word list (i-b):

[0278] ...

Example 2

[0446] Example 2: That something you learned is wrong is known to the public.

[0447] The word That in this example sentence is structurally ambiguous. However, due to limited space, only the word list (ii) in which That is preprocessed as a subordinate associative word unit is given, as shown below.

[0448] The adjective wrong in this example sentence serves as the predicative of the subordinate clause and is a main component of the sentence. However, to facilitate computer processing, the scheme of the present application temporarily removes the adjective wrong in the preprocessing stage. The predicative wrong of the clause is restored in the subsequent syntactic structure repair stage.

[0449]

[0450]

[0451] According to the scheme of the present application, the following A-B-C joint system can be generated for example sentence 2:

[0452]

[0453] B 1 ={to+

[0454] Through the above A1-B1-C1 joint system, the basic framewo...

Example 3

[0458] Example 3: That that men were appointed didn't bother the liberals wasn't marked upon by the press.

[0459] Both occurrences of that in this example sentence are structurally ambiguous. However, due to limited space, only the word list (ii) in which both are preprocessed as subordinate associative word units is given, as follows:

[0460]

[0461] According to the scheme of the present application, the following A-B-C joint system can be generated for example sentence 3:

[0462]

[0463] B 1 ={g[PREP](u)=by+

[0464] Through the above A1-B1-C1 joint system, the basic framework of the syntactic structure of example sentence 3 is obtained, as shown in Figure 33.

[0465] The complete syntactic analysis result of example sentence 3 is expressed as a character string as follows (see Figure 8):

[0466] (ROOT(S(SBAR(IN That)(S(SBAR(IN that)(S(NP(NNS men))(VP(VBD were)(VBN appointed))))(VP(VBD did)(RB n't)(VP(VB bother)(NP(DT the)(NNS liberals)))))(...
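To make the bracketed notation in [0466] easier to read, here is a minimal sketch of a reader for such strings. It is not the application's method and not any parser's API; `parse_bracketed` and `leaves` are hypothetical helpers written for illustration. Because the string in [0466] is truncated, the sketch is applied only to one complete sub-tree taken from it (the inner SBAR, with a space between VBN and appointed); it would fail on an unbalanced string.

```python
def parse_bracketed(text):
    """Parse a balanced bracketed string into nested (label, children) tuples."""
    pos = 0

    def node():
        nonlocal pos
        assert text[pos] == "(", "a node must start with '('"
        pos += 1
        start = pos
        while text[pos] not in " ()":
            pos += 1
        label = text[start:pos]
        children = []
        while text[pos] != ")":
            if text[pos] == " ":
                pos += 1  # skip the separator between label and word
            elif text[pos] == "(":
                children.append(node())  # nested constituent
            else:
                start = pos
                while text[pos] not in " ()":
                    pos += 1
                children.append(text[start:pos])  # terminal word
        pos += 1  # consume the closing ')'
        return (label, children)

    return node()


def leaves(tree):
    """Collect the terminal words of a parsed tree from left to right."""
    _, children = tree
    words = []
    for child in children:
        words.extend(leaves(child) if isinstance(child, tuple) else [child])
    return words


# A complete sub-tree from the string in [0466].
fragment = "(SBAR(IN that)(S(NP(NNS men))(VP(VBD were)(VBN appointed))))"
tree = parse_bracketed(fragment)
print(tree[0])       # SBAR
print(leaves(tree))  # ['that', 'men', 'were', 'appointed']
```

Read this way, the fragment shows the second that tagged IN and heading its own SBAR, the reading in which it introduces the clause men were appointed.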



Abstract

A method for syntactic analysis of natural language is disclosed. The present invention points out some serious technical loopholes in the two internationally leading natural language syntax analysis tools recognized by the computer science community today, the Berkeley Parser and the Stanford Parser, and gives a technical solution aimed at these loopholes. The invention establishes a brand-new set of mathematical models for characterizing sentences and, on that basis, proposes a set of computer syntax analysis methods. The invention unifies lexical analysis, syntactic analysis and semantic analysis in computer natural language processing through technical means and strengthens the mutual constraints among these three aspects, thereby improving the computer's ability to resolve structural ambiguity. The invention has high technical difficulty, strong comprehensiveness, a wide application range and a very large amount of computation, is consistent with the natural laws of mathematics and computer science, and helps to improve the accuracy of computer syntax analysis.

Description

[0001] This application claims priority to the Chinese patent application No. 201910224013.X, filed on March 22, 2019 and entitled "A Method for Natural Language Syntax Analysis", the entire content of which is incorporated herein by reference.

Technical field

[0002] The invention relates to the field of computer data processing, in particular to a method for syntactic analysis of natural language.

Background

[0003] Natural language processing (NLP) is a very important direction in the field of computer science and artificial intelligence. It studies the theories and methods that enable effective communication between humans and computers in natural language.

[0004] Syntactic parsing is one of the key tasks in natural language processing (NLP). The basic task of syntactic analysis is to determine the syntactic structure of a sentence or the interdependence of words within a sentenc...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F40/211, G06F40/216, G06F40/289, G06F40/30
CPC: G06F40/216, G06F40/211, G06F40/289, G06F40/30
Inventors: 秦一男, 朱江
Owner: BEIJING YU ZI CHENG SCI & TECH CO LTD