
Chinese automatic syntactic analyzer based on sentence pattern structure

A syntactic analysis technology based on sentence pattern structure, applied in unstructured text data retrieval, instrumentation, natural language data processing, etc., which improves the construction efficiency of large-scale syntax treebanks

Active Publication Date: 2021-06-11
北京汉雅天诚教育科技有限公司
Cites: 4 · Cited by: 2

AI Technical Summary

Problems solved by technology

At present, the field of natural language processing still lacks an automatic Chinese syntactic analyzer based on sentence pattern structure.



Examples


Specific Embodiment 1

[0313] Take the sentence "At this time, the hardworking workers are beside the road and have prepared the materials for building this railway." as an example. The results of punctuation-based segmentation and forward maximum matching word segmentation are as follows:

[0314] Punctuation sentence 1: At this time,

[0315] Punctuation sentence 2: the hardworking workers are beside the road,

[0316] Punctuation sentence 3: have prepared the materials for building this railway.
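
The two preprocessing steps named above, splitting at punctuation and forward maximum matching, can be sketched roughly as follows. This is a minimal illustration rather than the patent's implementation: the function names, the toy lexicon, the `max_len` limit, and the Chinese sample strings are all assumptions made for the example and are not taken from the original text.

```python
import re

# Punctuation marks treated as sentence-internal or sentence-final delimiters (assumed set).
DELIMS = ",;.!?，；。！？"

def split_punctuation_sentences(text):
    """Split text into punctuation sentences, keeping each delimiter attached."""
    pattern = rf"[^{DELIMS}]+[{DELIMS}]?"
    return [p.strip() for p in re.findall(pattern, text) if p.strip()]

def forward_maximum_match(sentence, lexicon, max_len=4):
    """Greedy left-to-right segmentation: take the longest lexicon word at each position.

    Falls back to a single character when no lexicon word starts at that position.
    """
    words, i = [], 0
    while i < len(sentence):
        for size in range(min(max_len, len(sentence) - i), 0, -1):
            candidate = sentence[i:i + size]
            if size == 1 or candidate in lexicon:
                words.append(candidate)
                i += size
                break
    return words

# Hypothetical toy input, not the patent's original sentence.
toy_text = "这时候，工人们在路旁，准备好了材料。"
toy_lexicon = {"修建", "这条", "铁路", "材料"}

punctuation_sentences = split_punctuation_sentences(toy_text)
segmented = forward_maximum_match("修建这条铁路的材料", toy_lexicon)
```

Forward maximum matching is greedy, so it produces exactly one segmentation; the later steps in this embodiment work on a graph of alternative segmentations instead.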

[0317] Taking punctuation sentence 3 as an example, the directed acyclic graph (DAG) after dynamic word recognition in S403 is shown in Figure 2. The dynamic words involved and their patterns are as follows:

[0318] 1) Ready: v:v←a

[0319] 2) OK: v:a-u

[0320] 3) Ready: v:v←a-u

[0321] 4) This line: m:m-q
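
One way to picture pattern-driven dynamic word recognition is sketched below. It assumes that in a pattern such as `v:v←a-u` the part before the colon is the part of speech of the resulting dynamic word and the part after it is the sequence of component parts of speech. The source does not define the connectors `←` and `-`, so the sketch treats them only as separators. The function names, the toy tokens, and their tags are hypothetical.

```python
import re

def parse_pattern(pattern):
    """Split 'v:v←a-u' into ('v', ['v', 'a', 'u']); connector meaning is ignored here."""
    result_pos, body = pattern.split(":", 1)
    return result_pos, re.split(r"[←-]", body)

def find_dynamic_words(tokens, patterns):
    """Return (start, end, merged_word, result_pos, pattern) for every pattern match.

    tokens is a list of (word, pos) pairs; start/end are token indices, end exclusive.
    """
    found = []
    for pattern in patterns:
        result_pos, parts = parse_pattern(pattern)
        n = len(parts)
        for start in range(len(tokens) - n + 1):
            window = tokens[start:start + n]
            if [pos for _, pos in window] == parts:
                merged = "".join(word for word, _ in window)
                found.append((start, start + n, merged, result_pos, pattern))
    return found

# Hypothetical tokens and tags chosen to exercise the four patterns listed above.
toy_tokens = [("准备", "v"), ("好", "a"), ("了", "u"), ("这", "m"), ("条", "q")]
toy_patterns = ["v:v←a", "v:a-u", "v:v←a-u", "m:m-q"]
candidates = find_dynamic_words(toy_tokens, toy_patterns)
```

Each candidate spans a range of token indices, so overlapping candidates can be added as extra edges to a segmentation graph rather than replacing the basic tokens.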

[0322] The word segmentation path and its weight output by S405 are as follows:

[0323] already ready to build this Railroad material. # Lexical weight: 5.89

[0324] already pr...
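
As a rough illustration of how segmentation paths and their weights might be read off a word DAG, the sketch below enumerates every path and sums per-edge weights. The source does not say how the lexical weight is computed or whether higher or lower is better, so the additive scoring, the edge weights, and the ascending sort order are all assumptions, as are the function name and the toy graph.

```python
def enumerate_paths(edges, start, end):
    """Yield (words, total_weight) for every path from start to end.

    edges maps a start node to a list of (next_node, word, weight) tuples.
    Nodes are token boundaries, so each edge covers one candidate word.
    """
    if start == end:
        yield [], 0.0
        return
    for next_node, word, weight in edges.get(start, []):
        for words, total in enumerate_paths(edges, next_node, end):
            yield [word] + words, weight + total

# Hypothetical DAG over three basic tokens, with dynamic words as longer edges.
toy_edges = {
    0: [(1, "准备", 1.0), (2, "准备好", 1.5), (3, "准备好了", 2.0)],
    1: [(2, "好", 1.0), (3, "好了", 1.5)],
    2: [(3, "了", 1.0)],
}

# Sort by total weight (ascending is an arbitrary choice for this sketch).
ranked_paths = sorted(enumerate_paths(toy_edges, 0, 3), key=lambda item: item[1])
```

Exhaustive enumeration is only practical for short punctuation sentences; a dynamic programming pass over the DAG would be the usual replacement when only the best few paths are needed.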

Specific Embodiment 2

[0337] Take the sentence "Like Jia's Grand View Garden, you can live in aunt Lin Daiyu, aunt Xue Baochai, and later more, such as Baoqin, Xiuyun, and anyone who can attract relatives can be accommodated." as an example. The internal analysis of each punctuation sentence is omitted here; only the implementation process of S407 is shown, as follows:

[0338]

[0339]

[0340] Step 1) The result of merging np-type punctuation sentences into compound structures is as follows:

[0341]

[0342] Step 2) The result of merging punctuation sentences into xj is as follows:

[0343]

[0344]

[0345] At this point, no np-type punctuation sentence remains in the sequence, so the final result is 3 clauses, and the sentence structure expression is as follows:

[0346] ﹝Like ∧﹙Jia Jia In the ﹹGrand View Garden□, ﹞×║ can be: live│﹙Auntie﹙Lin Daiyu, ...﹙Auntie﹙Xue Baochai,

[0347] ×║﹝later﹞﹝more﹞more▽,

[0348] ﹙What?Baoqin,...Xiuyun,:==:﹛﹛﹝everyth...


Abstract

The invention provides an automatic Chinese syntactic analyzer based on sentence pattern structure. The method comprises the following steps:

S1, extending the grammar of regular expressions to obtain an extended regular expression grammar based on multivariate word feature sequences;
S2, constructing a syntactic rule base using the extended regular expression grammar obtained in S1;
S3, constructing a vocabulary knowledge base and a lexical knowledge base matched with the syntactic rule base constructed in S2;
S4, based on the vocabulary knowledge base and the lexical knowledge base constructed in S3, performing automatic Chinese syntactic analysis of sentence pattern structure using an integrated lexical and syntactic analysis algorithm.

The method realizes automatic Chinese syntactic analysis under the sentence pattern structure system, improves the construction efficiency of large-scale sentence-based syntax treebanks, and paves the way for connecting formalized diagrammatic sentence analysis with downstream Chinese information processing applications.
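
The abstract does not show the extended regular expression grammar itself. As a loose illustration of the general idea of matching regular expressions against sequences of word features rather than raw characters, the sketch below serializes each word as a `word/pos` unit and expands a hypothetical `<tag>` placeholder into an ordinary Python regex. The rule syntax, the function names, and the toy tokens are assumptions for this example and are not the patent's notation.

```python
import re

def encode(tokens):
    """Serialize (word, pos) pairs as ' word/pos' units so a regex can address either feature."""
    return "".join(f" {word}/{pos}" for word, pos in tokens)

def compile_rule(rule):
    """Compile a rule such as '<m><q><n>+' where '<tag>' stands for one word with that tag."""
    def expand(match):
        tag = re.escape(match.group(1))
        # One unit: a space, the word, a slash, the tag, then a unit boundary.
        return rf"(?: [^ /]+/{tag}(?= |$))"
    return re.compile(re.sub(r"<(\w+)>", expand, rule))

def find_spans(rule, tokens):
    """Return the word lists covered by each non-overlapping match of the rule."""
    compiled = compile_rule(rule)
    return [
        [unit.split("/")[0] for unit in m.group(0).split()]
        for m in compiled.finditer(encode(tokens))
    ]

# Hypothetical tagged tokens: numeral + classifier + noun forms a nominal span.
toy_tagged = [("修建", "v"), ("这", "m"), ("条", "q"), ("铁路", "n"), ("的", "u"), ("材料", "n")]
nominal_spans = find_spans("<m><q><n>+", toy_tagged)
```

Because the placeholder expands to a normal regex group, standard operators such as `+`, `?`, and `|` apply to whole words; a real rule base would need richer feature tests than a single part-of-speech tag.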

Description

Technical field

[0001] The invention belongs to the technical field of natural language processing, and in particular relates to an automatic Chinese syntactic analyzer based on sentence pattern structure.

Background

[0002] Automatic Chinese syntactic analysis means automatically deriving the grammatical structure of a sentence according to a given grammatical system, and identifying the grammatical units contained in the sentence and the relationships between them. Syntactic analysis is one of the key technologies in the field of natural language processing. On the one hand, it provides technical support for subsequent semantic analysis; on the other hand, it supports many upper-level applications such as machine translation, information extraction, question answering systems, and automatic corpus processing.

[0003] The accuracy rate of the mainstream automatic syntax analysis algorithm based on the phrase structure grammar system and the depende...


Application Information

IPC(8): G06F40/211, G06F40/216, G06F40/289, G06F40/253, G06F16/33
CPC: G06F40/211, G06F40/216, G06F40/289, G06F40/253, G06F16/33
Inventor: 赵敏, 彭炜明, 宋继华, 王宁, 陈晨, 管世昱
Owner: 北京汉雅天诚教育科技有限公司