39 results about "Treebank" patented technology

In linguistics, a treebank is a parsed text corpus that annotates syntactic or semantic sentence structure. The construction of parsed corpora in the early 1990s revolutionized computational linguistics, which benefitted from large-scale empirical data. Treebank data has been heavily exploited ever since the first large-scale treebank, the Penn Treebank, was published. Although treebanks originated in computational linguistics, their value is becoming more widely appreciated in linguistics research as a whole. For example, annotated treebank data has been crucial in syntactic research for testing linguistic theories of sentence structure against large quantities of naturally occurring examples.
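
To make the idea of a parsed corpus concrete, the sketch below reads one sentence written in the bracketed notation commonly used for Penn Treebank-style annotation and recovers its words. The example sentence, the tuple representation and the function names are illustrative assumptions, not part of any listed patent.

```python
def parse_bracketed(text):
    """Parse a bracketed tree string into nested (label, children) tuples."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read_node():
        nonlocal pos
        if tokens[pos] != "(":
            raise ValueError("expected '('")
        pos += 1
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read_node())   # nested constituent
            else:
                children.append(tokens[pos])   # leaf word
                pos += 1
        pos += 1  # consume ")"
        return (label, children)

    tree = read_node()
    if pos != len(tokens):
        raise ValueError("trailing tokens after the first tree")
    return tree


def leaves(tree):
    """Return the words of the tree in sentence order."""
    _, children = tree
    words = []
    for child in children:
        if isinstance(child, tuple):
            words.extend(leaves(child))
        else:
            words.append(child)
    return words


# Hypothetical annotated sentence, for illustration only.
example = "(S (NP (DT The) (NN cat)) (VP (VBD sat)))"
tree = parse_bracketed(example)
print(tree[0])       # root label of the sentence
print(leaves(tree))  # the words, left to right
```

A treebank is, in essence, a large collection of such annotated trees, which is what makes statistical estimation and evaluation of parsers possible.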

Module for creating a language neutral syntax representation using a language particular syntax tree

Inactive | US20060004563A1 | Semantic analysis | Special data processing applications | Dependency structure | Application software

A method or module for creating a Language Neutral Syntax (LNS) representation of a sentence from a language particular syntax representation, such as that found in the Penn Treebank, for use by different applications. The method or module includes a node generator configured to create hierarchical and dependent nodes using phrasal and constituent nodes of the language particular syntax. A node dependency generator is configured to create an unordered hierarchical dependency structure for the hierarchical and dependent nodes using a semantic relation between the hierarchical and dependent nodes derived from the language particular syntax.

Owner:MICROSOFT TECH LICENSING LLC

Pointer sentinel mixture architecture

Active | US20180082171A1 | Natural language data processing | Probabilistic networks | Data set | Algorithm

The technology disclosed provides a so-called “pointer sentinel mixture architecture” for neural network sequence models that has the ability to either reproduce a token from a recent context or produce a token from a predefined vocabulary. In one implementation, a pointer sentinel-LSTM architecture achieves state of the art language modeling performance of 70.9 perplexity on the Penn Treebank dataset, while using far fewer parameters than a standard softmax LSTM.

Owner:SALESFORCE COM INC
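
The choice between reproducing a recent token and producing a vocabulary token can be illustrated with a small numeric sketch. It assumes the gate is the sentinel's share of a joint softmax over the context scores and one sentinel score, which follows the published description of the architecture rather than anything stated in the abstract; all scores and probabilities below are made up.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def pointer_sentinel_mixture(vocab_probs, context_tokens, context_scores,
                             sentinel_score):
    """Mix a vocabulary distribution with a pointer over recent tokens.

    Assumption: the gate g is the probability mass the joint softmax
    assigns to the sentinel; the remaining mass goes directly to the
    context positions.
    """
    joint = softmax(list(context_scores) + [sentinel_score])
    gate = joint[-1]
    # Vocabulary component, scaled by the gate.
    mixed = {tok: gate * p for tok, p in vocab_probs.items()}
    # Pointer component: attention mass is added to the pointed-at token.
    for tok, mass in zip(context_tokens, joint[:-1]):
        mixed[tok] = mixed.get(tok, 0.0) + mass
    return mixed, gate


# Illustrative inputs (hypothetical values).
vocab_probs = {"the": 0.5, "cat": 0.3, "dog": 0.2}
mixed, gate = pointer_sentinel_mixture(
    vocab_probs,
    context_tokens=["cat", "zebra"],
    context_scores=[2.0, 1.0],
    sentinel_score=0.5,
)
print(round(sum(mixed.values()), 6))  # the mixture is still a distribution
```

Note that "zebra" receives probability even though it is absent from the vocabulary distribution, which is the point of letting the model copy from recent context.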

Chinese implicit discourse relation identification method

Active | CN105955956A | Proof of validity | Justify | Natural language translation | Semantic analysis | Discourse relation | Network model

The invention discloses a method for identifying Chinese implicit discourse relations. The method comprises the following steps: step 1, performing automatic word segmentation on a pair of Chinese implicit discourse relation arguments to obtain a segmentation result; step 2, learning feature representations of the arguments from the segmentation result; step 3, modelling the implicit discourse relation between the arguments with a maximum-margin-based neural network model built on the learned feature representations; and step 4, using the resulting neural network model to identify Chinese implicit discourse relations. The method identifies Chinese implicit discourse relations more accurately. In experiments on a Chinese discourse treebank, it achieves higher accuracy on Chinese implicit discourse relation identification than an existing English implicit discourse relation identification method.

Owner:INST OF AUTOMATION CHINESE ACAD OF SCI
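
The abstract does not spell out its maximum-margin objective. A common form, shown here purely as an assumed sketch, is a multi-class hinge loss that penalizes any wrong relation whose score comes within a margin of the gold relation's score. The relation labels and scores are hypothetical.

```python
def max_margin_loss(scores, gold, margin=1.0):
    """Multi-class hinge loss over relation scores.

    scores: dict mapping relation label -> model score (assumed to come
            from a neural network over the two arguments).
    gold:   the annotated relation label.
    """
    gold_score = scores[gold]
    return sum(
        max(0.0, margin - gold_score + score)
        for label, score in scores.items()
        if label != gold
    )


# Hypothetical scores for one argument pair.
scores = {"Causal": 2.0, "Contrast": 1.5, "Expansion": -0.5}
loss = max_margin_loss(scores, gold="Causal")
print(loss)
```

Only "Contrast" contributes here, because it is the one wrong relation scored within the margin of the gold relation.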

Pointer sentinel mixture architecture

Active | US20200065651A1 | Natural language data processing | Probabilistic networks | Data set | Algorithm

The technology disclosed provides a so-called “pointer sentinel mixture architecture” for neural network sequence models that has the ability to either reproduce a token from a recent context or produce a token from a predefined vocabulary. In one implementation, a pointer sentinel-LSTM architecture achieves state of the art language modeling performance of 70.9 perplexity on the Penn Treebank dataset, while using far fewer parameters than a standard softmax LSTM.

Owner:SALESFORCE COM INC

Module for creating a language neutral syntax representation using a language particular syntax tree

Inactive | US7596485B2 | Semantic analysis | Special data processing applications | Dependency structure | Application software

A method or module for creating a Language Neutral Syntax (LNS) representation of a sentence from a language particular syntax representation, such as that found in the Penn Treebank, for use by different applications. The method or module includes a node generator configured to create hierarchical and dependent nodes using phrasal and constituent nodes of the language particular syntax. A node dependency generator is configured to create an unordered hierarchical dependency structure for the hierarchical and dependent nodes using a semantic relation between the hierarchical and dependent nodes derived from the language particular syntax.

Owner:MICROSOFT TECH LICENSING LLC

Automatic analysis method Chinese syntax based on corpus and tree type structural pattern match

Inactive | CN101329666A | Large granularity | Improve efficiency | Special data processing applications | Collocation | Pattern matching

The invention discloses an automatic analysis method for Chinese syntax based on corpora and tree-structure pattern matching. Building on deep analysis and complete segmentation of an annotated Chinese corpus, and using syntactic patterns extracted from the corpus together with the corresponding semantic collocation relationships, the method applies pattern matching and conversion to the sentences to be processed and obtains an optimal syntactic analysis through semantic disambiguation. The automatic syntax analysis system of the invention comprises a module for extracting, storing and retrieving syntactic patterns from a syntactic treebank, a sentence pattern statistics module, a syntactic pattern matching module, a local conversion module for approximate patterns, and a semantic disambiguation module. Experiments show that, compared with traditional syntactic analysis, the method places more emphasis on combining overall matching with local conversion of syntactic patterns, processes at a larger granularity with higher efficiency, and raises average accuracy and recall by about 10 percent.

Owner:NANJING UNIV

Method for processing unknown words in Chinese-language dependency tree banks

Inactive | CN103678272A | Special data processing applications | Chinese characters | Granularity

The invention belongs to the field of natural language processing in computational linguistics, and discloses a method for processing unknown words in Chinese dependency treebanks. The method includes the steps of: A, looking up all synonyms of an unknown word in a synonym forest; B, computing the character pattern similarity between the unknown word and each of its synonyms from the character pattern features of Chinese characters; C, when the unknown word has high character pattern similarity to several synonyms, extracting the mapped words and the information content of their word classes, and refining the character pattern similarity model accordingly; D, selecting the word with the highest character pattern similarity as the optimal mapped word and using it to stand in for the unknown word in the treebank. The advantage of the method is that unit pairs (word class, word class) in dependency parsing can be restored to unit pairs (word class, word) or (word, word class) without enlarging the treebank, so that the information granularity is refined, data sparseness is alleviated, and dependency parsing performance is improved.

Owner:BEIJING INFORMATION SCI & TECH UNIV
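
Steps A, B and D of the abstract above can be sketched as a lookup followed by an argmax. The abstract does not define its character pattern similarity, so the Dice coefficient over shared characters used here is only a placeholder; the synonym table, the restriction to words already in the treebank vocabulary, and all names are assumptions made for illustration.

```python
def char_similarity(a, b):
    """Placeholder similarity: Dice coefficient over the character sets.

    Assumption: stands in for the patent's character pattern similarity,
    whose actual definition is not given in the abstract.
    """
    sa, sb = set(a), set(b)
    if not sa or not sb:
        return 0.0
    return 2 * len(sa & sb) / (len(sa) + len(sb))


def map_unknown_word(unknown, synonym_table, treebank_vocab):
    """Pick the known synonym most similar to the unknown word.

    Returns None when no synonym of the unknown word occurs in the
    treebank vocabulary.
    """
    # Step A: look up synonyms; keep only words the treebank already knows.
    candidates = [w for w in synonym_table.get(unknown, [])
                  if w in treebank_vocab]
    if not candidates:
        return None
    # Steps B and D: score each candidate and take the best one.
    return max(candidates, key=lambda w: char_similarity(unknown, w))


# Hypothetical synonym table and treebank vocabulary.
synonym_table = {"欣喜": ["高兴", "欢喜", "愉快"]}
treebank_vocab = {"高兴", "欢喜", "愉快"}
best = map_unknown_word("欣喜", synonym_table, treebank_vocab)
print(best)
```

Step C, which refines the similarity model using the information content of word classes, is omitted because the abstract gives too little detail to sketch it faithfully.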

Formalized scheme for constructing Chinese tree bank based on sentence-based grammar

Inactive | CN106708800A | Improve accuracy | Improve efficiency | Semantic analysis | Special data processing applications | Information processing | Chinese traditional

The invention discloses a formalized scheme for constructing a Chinese treebank based on sentence-based grammar, and relates to corpus linguistics and natural language processing. The scheme takes the sentence-based grammar of traditional Chinese teaching grammar as its prototype and incorporates linguists' research results on "dynamic words" into its design. Adopting the scheme improves the accuracy and efficiency of treebank construction, and promotes communication and integration among the three fields of information processing, grammar study and teaching practice.

Owner:彭炜明 +4

Method for searching and matching articles by way of tree graph

Active | CN104598647A | Achieve precise positioning | Improve retrieval efficiency | Special data processing applications | Product stewardship | Ambiguity

The invention discloses a method for constructing, searching, matching and presenting article, commodity and service information by way of a tree graph, and aims to improve the efficiency of precisely retrieving and managing required articles, commodities and services in fields such as electronic commerce, Internet shopping and product management. The invention includes a client, a background system that exchanges information with the client, and a method for constructing, searching, matching and presenting an article set tree graph between the client and the background system. The background system comprises a tree graph matching and searching engine, an article type attribute tree library and an article library. Compared with the prior art, the method lets the user locate required articles precisely, avoids the ambiguity and uncertainty of determining search conditions by character recognition, reduces the number of irrelevant articles presented to the user, and improves article retrieval efficiency.

Owner:李剑

Event relationship graph generation method and apparatus

Active | CN106095748A | Save time reading | Natural language data processing | Special data processing applications | Graph generation | Data mining

The invention provides an event relationship graph generation method and apparatus. The method comprises the steps of: splitting a manuscript into statements at preset punctuation marks; extracting the characters mentioned in the split statements and keeping the statements that contain characters as candidate statements, where a character is at least one of a personal name, a role and a personal pronoun; performing syntactic analysis on the candidate statements using a previously obtained syntactic treebank to derive the associations among the characters; and generating an event relationship graph from the characters and their associations. With this method, the events that take place among the characters can be viewed visually, so that a user can grasp the substance of the events in a relatively short time and reading time is reduced.

Owner:NEUSOFT CORP
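
The pipeline in the abstract above (split at punctuation, keep statements that mention characters, relate the characters, build a graph) can be sketched as follows. Plain co-occurrence within a statement stands in for the treebank-based syntactic analysis, which the abstract does not detail; the punctuation set, the character list and the function names are assumptions.

```python
import re
from collections import Counter
from itertools import combinations


def split_statements(manuscript, marks="。！？.!?"):
    """Split text into statements at the preset punctuation marks."""
    parts = re.split("[" + re.escape(marks) + "]", manuscript)
    return [p.strip() for p in parts if p.strip()]


def build_event_graph(manuscript, characters):
    """Return weighted edges between characters that share a statement.

    Assumption: co-occurrence replaces the syntactic analysis step, so an
    edge only says "mentioned together", not what happened between them.
    """
    edges = Counter()
    for statement in split_statements(manuscript):
        present = sorted(c for c in characters if c in statement)
        # Statements with fewer than two characters contribute no edge.
        for a, b in combinations(present, 2):
            edges[(a, b)] += 1
    return edges


# Hypothetical manuscript and character list.
text = "Alice met Bob. Bob called Carol! Dave slept."
graph = build_event_graph(text, ["Alice", "Bob", "Carol", "Dave"])
print(dict(graph))
```

A fuller version would label each edge with the verb or event linking the two characters, which is where a parser trained on a treebank would come in.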

Syntax analysis method and device for layering Chinese long sentences based on punctuation treatment

Inactive | CN1928854A | Improve accuracy | Improve recall | Special data processing applications | Ambiguity | Recall rate

Unlike traditional methods, the new hierarchical syntactic analysis method for long Chinese sentences comprises: 1. using the special functions of certain punctuation marks to divide a complex long sentence into a sequence of sub-sentences; 2. extracting grammar rules and the corresponding probability distribution information from a large-scale database to analyze the sentences and resolve ambiguity. Extensive experiments show that the invention reduces time consumption and improves analysis accuracy and recall by about 7%.

Owner:INST OF AUTOMATION CHINESE ACAD OF SCI
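
Extracting grammar rules with their probabilities from annotated trees is usually done by relative-frequency counting. The sketch below assumes that approach and a simple (label, children) tuple representation; neither is confirmed by the abstract, and the two toy trees are invented.

```python
from collections import Counter


def rule_probabilities(trees):
    """Estimate rule probabilities as count(rule) / count(left-hand side)."""
    counts = Counter()

    def visit(node):
        label, children = node
        # Right-hand side: child labels for subtrees, the word for leaves.
        rhs = tuple(c[0] if isinstance(c, tuple) else c for c in children)
        counts[(label, rhs)] += 1
        for child in children:
            if isinstance(child, tuple):
                visit(child)

    for tree in trees:
        visit(tree)

    totals = Counter()
    for (lhs, _), n in counts.items():
        totals[lhs] += n
    return {rule: n / totals[rule[0]] for rule, n in counts.items()}


# Two hypothetical annotated sentences.
toy_treebank = [
    ("S", [("NP", ["she"]), ("VP", ["runs"])]),
    ("S", [("NP", ["dogs"]), ("VP", ["bark"])]),
]
probs = rule_probabilities(toy_treebank)
print(probs[("S", ("NP", "VP"))])
print(probs[("NP", ("she",))])
```

A parser can then use these probabilities to rank competing analyses of the same sub-sentence, which is one way to resolve the ambiguity the abstract mentions.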

Method and system for simplifying implicit rhetorical relation prediction in large scale annotated corpus

Active | US20150039294A1 | Improve performance | Kernel methods | Natural language data processing | Feature set | Surface level

The present invention provides a method and system directed to predicting implicit rhetorical relations between two spans of text, e.g., in a large annotated corpus such as the Penn Discourse Treebank (“PDTB”), the Rhetorical Structure Theory corpus, or the Discourse Graph Bank, and particularly directed to determining a rhetorical relation in the absence of an explicit discourse marker. Surface level features may be used to capture pragmatic information encoded in the absent marker. In one manner, a simplified feature set based only on raw text and semantic dependencies is used to improve performance for all relations. By using surface level features to predict implicit rhetorical relations for the large annotated corpus, the invention approaches a theoretical maximum performance, suggesting that more data will not necessarily improve performance based on these and similarly situated features.

Owner:THOMSON REUTERS ENTERPRISE CENT GMBH

Tendency text automatic classification system and achieving method of the same

Inactive | CN102930042A | Improve accuracy | High sensitivity | Special data processing applications | Structure analysis | Chinese word

The invention provides a tendency text automatic classification system and a method for implementing it, and relates to natural language processing, text data mining and automatic text classification. The system comprises a dependency relation analysis module, a Chinese word segmentation module, a sentence structure analysis module and a multi-layer emotion classification sentence module library. The dependency relation analysis module analyzes the dependency relations of Chinese sentences; the Chinese word segmentation module segments Chinese sentences into words; the sentence structure analysis module analyzes the sentence structure of the segmented Chinese sentences; and the multi-layer emotion classification sentence module library manages business-related knowledge. The system is characterized in that the multi-layer emotion classification sentence module library is divided into 3 large classes and 120 small classes, the three large classes being attitude grammar, feeling grammar and thought grammar, and is compiled manually according to Chinese usage rules and the business-related knowledge. Sentence structure analysis of all sentence models in the library is performed to build a sentence structure treebank, and dependency relation analysis of all sentence models in the library is performed to form a dependency relation gallery.

Owner:WUYI UNIV

Phrase tree to dependency tree transformation method capable of combining Vietnamese grammatical features

Active | CN105740235A | Improve accuracy | Shorten the time | Natural language data processing | Special data processing applications | NODAL | Three level

The invention relates to a phrase tree to dependency tree transformation method that incorporates Vietnamese grammatical features, and belongs to the technical field of natural language processing. The method comprises the following steps: first, constructing a Vietnamese phrase tree library; using a center subnode filter table that incorporates Vietnamese grammatical features, together with a dependency relationship annotator, to transform the phrase trees in the Vietnamese phrase tree library into dependency trees and obtain a first-level Vietnamese dependency tree library; training an MSTParser model on the corpus of the manually annotated first-level Vietnamese dependency tree library, and using the MSTParser model to expand the first-level library into an expanded second-level Vietnamese dependency tree library; and using a dependency relationship corrector to correct the corpus of the expanded second-level library and obtain a final third-level Vietnamese dependency tree library. The method avoids manually collecting and annotating the Vietnamese dependency tree library, saves the manpower and time needed to construct the tree library, and markedly improves accuracy.

Owner:KUNMING UNIV OF SCI & TECH
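
Phrase-to-dependency conversion is commonly done with a head table: for each phrase, one child is chosen as the head and the other children's head words become its dependents. The sketch below reads the abstract's "center subnode filter table" as such a head table, which is an interpretation rather than something the abstract states. The labels (S, NP, VP, P, V, N), the table entries and the rightmost-child fallback are hypothetical.

```python
# Hypothetical head table: phrase label -> preferred head child labels.
HEAD_TABLE = {"S": ["VP"], "VP": ["V"], "NP": ["N", "P"]}


def to_dependencies(tree, head_table=HEAD_TABLE):
    """Convert a (label, children) phrase tree into dependency arcs.

    Returns (root, arcs), where each word is a (position, word) pair and
    each arc is a (head, dependent) pair.
    """
    arcs = []
    position = [0]

    def visit(node):
        label, children = node
        # Preterminal: a tag over a single word.
        if len(children) == 1 and isinstance(children[0], str):
            position[0] += 1
            return (position[0], children[0])
        heads = [visit(child) for child in children]
        labels = [child[0] for child in children]
        head_idx = len(children) - 1  # assumed fallback: rightmost child
        for wanted in head_table.get(label, []):
            if wanted in labels:
                head_idx = labels.index(wanted)
                break
        head = heads[head_idx]
        for i, dependent in enumerate(heads):
            if i != head_idx:
                arcs.append((head, dependent))
        return head

    root = visit(tree)
    return root, arcs


# "Tôi ăn cơm" (I eat rice), with illustrative labels.
phrase_tree = ("S", [
    ("NP", [("P", ["Tôi"])]),
    ("VP", [("V", ["ăn"]), ("NP", [("N", ["cơm"])])]),
])
root, arcs = to_dependencies(phrase_tree)
print(root)
print(arcs)
```

The verb ends up as the root with the subject and object as its dependents. Labelling each arc with a relation type would be the job of the dependency relationship annotator mentioned in the abstract.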

Chunk-based Vietnamese phrase tree construction method

Active | CN106202037A | Improve accuracy | Scale up | Natural language translation | Special data processing applications | Feature set | Theoretical computer science

The invention relates to a chunk-based Vietnamese phrase tree construction method, and belongs to the technical field of natural language processing. The method comprises the following steps: first, carrying out upper-layer chunk labeling and basic-layer chunk labeling on a Vietnamese phrase tree label set; selecting feature sets for the upper-layer and basic-layer chunks, and constructing a chunk-based Vietnamese phrase tree library construction model; carrying out chunk analysis on word-segmented Vietnamese sentences with a chunk analysis tool to obtain a primary chunk-based Vietnamese phrase tree library; and correcting the primary library with a phrase tree library corrector to obtain the final corrected Vietnamese phrase tree library. The method avoids manually collecting and labelling Vietnamese phrase tree libraries, and saves the manpower and time needed to construct them; compared with constructing Vietnamese phrase tree libraries using context-free grammars and maximum entropy, it markedly improves accuracy.

Owner:KUNMING UNIV OF SCI & TECH

A judicial case discrimination system and method based on event tree analysis

Inactive | CN109949185A | Quick case | Quick Logic Visualization | Data processing applications | Text database querying | Data mining | Computer science

The invention provides a judicial case discrimination system based on event tree analysis. The system comprises a legal text collection module, an event tree construction module, an automatic criminal name discrimination module and an automatic penalty discrimination module. The legal text collection module converts legal statement text submitted by a litigation user into noiseless text data; the event tree construction module uses natural language processing to convert the legal text data into triples relating two entities, and forms time-ordered legal event tree information through a treebank; the automatic criminal name discrimination module gives an automatic criminal name discrimination result through an event tree similarity matching algorithm; and the automatic penalty discrimination module provides an automatic penalty discrimination result through an event tree similarity matching algorithm once the criminal name has been discriminated. The advantages are that a litigation user can quickly learn the preliminary outcome of a case judgment, and that the system and method can help a judge reach a final case discrimination.

Owner:NANJING UNIV OF POSTS & TELECOMM

Syntax tree library construction system

Active | CN110362691A | Lower the threshold for labeling operations | Easy to build | Semantic analysis | Energy efficient computing | Multi field | Theoretical computer science

The invention provides a syntax tree library construction system. The system mainly comprises a word segmentation annotation module, a word meaning annotation module, a chunk connection module, a component identification and component relationship annotation module, and a syntax tree proofreading module. More people can take part in syntax tree construction, so that a large-scale, multi-field and high-quality syntax tree library can be built. The system addresses the high cost, low efficiency, poor consistency, small scale, narrow domain coverage and slow updating of traditional syntax tree construction methods, as well as the limitation that labeling can only be carried out on a large screen.

Owner:大连语智星科技有限公司

MST algorithm based Vietnamese dependency tree library construction method

Inactive | CN105740234A | Solve time-consuming problems | Make up for scarcity | Natural language data processing | Special data processing applications | Library training | Theoretical computer science

The invention relates to an MST algorithm based Vietnamese dependency tree library construction method and belongs to the technical field of natural language processing. The method comprises the steps of: first, constructing a training corpus for the Vietnamese dependency tree library; second, training an MST model on that corpus and then processing Vietnamese sentences with the MST model to obtain a Vietnamese dependency tree library; and finally, correcting the resulting Vietnamese dependency tree library corpus. The Vietnamese dependency tree library constructed with the method can provide strong support for upper-layer applications such as syntactic analysis, machine translation and information acquisition for Vietnamese; a library of one hundred thousand Vietnamese sentences can be constructed; the method avoids manually collecting and annotating the Vietnamese dependency tree library, and reduces the labor and time needed to construct it; and compared with constructing a Vietnamese dependency tree library using a CRFParser and Chinese-Vietnamese bilingual word-aligned corpora, the method markedly improves accuracy.

Owner:KUNMING UNIV OF SCI & TECH

Method for establishing Vietnamese dependency tree bank based on improved Nivre algorithm

Active | CN106250367A | Promote mutual learning | To learn from each other | Natural language data processing | Special data processing applications | Material resources | Machine translation

The invention relates to a method for establishing a Vietnamese dependency treebank based on an improved Nivre algorithm, and belongs to the technical field of natural language processing. The method comprises the steps of: first, establishing an initial training corpus, an expansion corpus and a test corpus; second, using the initial training corpus to train two weak dependency parsing learners, S1 and S2, based on the improved Nivre algorithm, to serve as two fully redundant views; third, performing dependency parsing on the expansion corpus with the two trained weak learners S1 and S2 and building a Vietnamese dependency treebank model; and finally, running a dependency parsing test on the test corpus and establishing the Vietnamese dependency treebank. The method can provide strong support for upper-layer applications such as syntactic analysis, machine translation and information acquisition for Vietnamese; it avoids manually annotating the dependency relations of Vietnamese sentences, saving time, manpower and material resources; and it makes effective use of large amounts of unannotated Vietnamese sentence-level corpora to improve dependency parsing accuracy.

Owner:KUNMING UNIV OF SCI & TECH

Chinese automatic syntactic analyzer based on sentence pattern structure

Active | CN112949286A | Realize the analysis function | Improve build efficiency | Natural language data processing | Text database querying | Information processing | Engineering

The invention provides an automatic Chinese syntactic analyzer based on a sentence pattern structure, whose operation comprises the following steps: S1, extending the grammar of regular expressions to obtain an extended regular expression grammar based on multivariate word feature sequences; S2, constructing a syntactic rule library using the extended regular expression grammar obtained in step S1; S3, constructing a vocabulary knowledge base and a lexical knowledge base matched with the syntactic rule library constructed in S2; and S4, based on the vocabulary knowledge base and the lexical knowledge base constructed in S3, performing automatic Chinese syntactic analysis of the sentence pattern structure using an integrated lexical and syntactic analysis algorithm. The advantages are that automatic Chinese syntactic analysis based on a sentence pattern structure system is achieved, the construction efficiency of a large-scale sentence-based syntax treebank is improved, and the way is paved for connecting formalized diagram-based sentence analysis with downstream Chinese information processing applications.

Owner:北京汉雅天诚教育科技有限公司

A bilingual unsupervised syntax analysis method and system

Inactive | CN104281564B | Special data processing applications | Analysis method | Treebank

The invention discloses a bilingual unsupervised syntactic analysis method and system. The method comprises the following steps: step 1, building random syntactic analysis treebanks on the source side and the target side of a bilingual corpus; step 2, training a monolingual unsupervised syntactic analysis model on each side by computing the probability of monolingual unsupervised syntactic analysis trees over the random treebanks; step 3, performing bilingual syntactic analysis with the monolingual models by computing the relaxation isomorphism similarity and applying a bilingual syntactic analysis algorithm, to obtain a bilingual syntactic analysis treebank that meets the relaxation isomorphism bilingual syntactic analysis target and replaces the random treebanks; step 4, repeating steps 1 to 3 until the bilingual unsupervised syntactic analysis model converges. A better monolingual unsupervised syntactic analysis model is thereby obtained and can be applied to all downstream applications that require syntactic analysis.

Owner:INST OF COMPUTING TECH CHINESE ACAD OF SCI +1

A method and system for obtaining a dependency structure tree bank

Active | CN106598951B | Convenience to merge | Scale up | Natural language data processing | Special data processing applications | Dependency structure | Algorithm

The invention discloses a method and system for obtaining a dependency structure treebank. The method comprises: loading a first treebank and converting its phrase structures into dependency structures with the first treebank's conversion tool; converting the flat phrase structures in the first treebank into dependency structures with a syntactic parser; and converting the dependency relations of the resulting dependency structures with a trained dependency relation mapping model, to obtain a dependency structure treebank of a second treebank type. The converted treebank can be merged with the original dependency structure treebank, which increases the treebank size and improves parser performance.

Owner:BEIJING KINGSOFT OFFICE SOFTWARE INC +1
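
The last step of the abstract above maps dependency relations from one treebank scheme to another with a trained model. As a rough sketch of that step only, the code below learns a label mapping by majority vote over aligned label pairs. The majority-vote rule stands in for the patent's mapping model, whose form the abstract does not give, and the label names are hypothetical.

```python
from collections import Counter, defaultdict


def train_label_map(aligned_pairs):
    """Learn source-label -> target-label by majority vote over aligned examples."""
    counts = defaultdict(Counter)
    for source_label, target_label in aligned_pairs:
        counts[source_label][target_label] += 1
    return {src: votes.most_common(1)[0][0] for src, votes in counts.items()}


def convert_tree(arcs, label_map):
    """Relabel (head, dependent, label) arcs; unseen labels are kept unchanged."""
    return [(head, dep, label_map.get(label, label)) for head, dep, label in arcs]


# Hypothetical aligned labels from sentences annotated under both schemes.
pairs = [("SBJ", "nsubj"), ("SBJ", "nsubj"), ("SBJ", "csubj"), ("OBJ", "obj")]
label_map = train_label_map(pairs)
converted = convert_tree([(2, 1, "SBJ"), (2, 3, "OBJ"), (0, 2, "ROOT")], label_map)
```

A context-free lookup like this cannot resolve labels that map differently depending on the surrounding structure, which is presumably why the abstract speaks of a trained model.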

Method and system for simplifying implicit rhetorical relation prediction in large scale annotated corpus

Active | US9355372B2 | Kernel methods | Natural language data processing | Data mining | Relational table

The present invention provides a method and system for predicting implicit rhetorical relations between two spans of text, e.g., in a large annotated corpus such as the Penn Discourse Treebank (“PDTB”), the Rhetorical Structure Theory corpus, or the Discourse Graph Bank, and in particular for determining a rhetorical relation in the absence of an explicit discourse marker. Surface-level features may be used to capture the pragmatic information encoded in the absent marker. In one embodiment, a simplified feature set based only on raw text and semantic dependencies is used to improve performance for all relations. By using surface-level features to predict implicit rhetorical relations in the large annotated corpus, the invention approaches a theoretical maximum performance, which suggests that more data will not necessarily improve performance with these and similar features.

Owner:THOMSON REUTERS ENTERPRISE CENT GMBH

Processing method of unregistered words in Chinese dependency tree bank

Inactive | CN103678272B | Special data processing applications | Chinese characters | Granularity

The invention belongs to the field of natural language processing in computational linguistics and discloses a method for processing unknown (unregistered) words in Chinese dependency treebanks. The method comprises: A, looking up all synonyms of an unknown word in a synonym forest (thesaurus); B, computing the character-form similarity between the unknown word and each of its synonyms from the character-form features of Chinese characters; C, when several synonyms have high character-form similarity to the unknown word, extracting the mapped words and the information content of their word classes, and refining the character-form similarity model; D, selecting the word with the highest character-form similarity as the optimal mapped word, which stands in for the unknown word in the treebank. Without enlarging the treebank, the method allows (word class, word class) unit pairs in dependency parsing to be restored to (word class, word) or (word, word class) pairs, which refines the information granularity, alleviates data sparseness, and improves dependency parsing performance.

Owner:BEIJING INFORMATION SCI & TECH UNIV
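
Steps A, B and D of the abstract above amount to choosing, among the synonyms of an unknown word, the in-vocabulary word that looks most like it. The sketch below uses shared-character overlap (a Dice coefficient) as a simple stand-in for the patent's character-form similarity, which the abstract does not define; the thesaurus entries and vocabulary are hypothetical.

```python
def char_similarity(a, b):
    """Dice coefficient over the sets of characters in two words."""
    set_a, set_b = set(a), set(b)
    if not set_a or not set_b:
        return 0.0
    return 2 * len(set_a & set_b) / (len(set_a) + len(set_b))


def best_mapping(unknown, synonyms, vocabulary):
    """Pick the in-vocabulary synonym most similar to the unknown word, or None."""
    candidates = [s for s in synonyms if s in vocabulary]
    if not candidates:
        return None
    return max(candidates, key=lambda s: char_similarity(unknown, s))


# Hypothetical data: "电脑" is missing from the treebank vocabulary.
synonyms = ["计算机", "电子计算机", "微机"]
vocabulary = {"计算机", "电子计算机"}
mapped = best_mapping("电脑", synonyms, vocabulary)
```

Here the unknown word shares one character with "电子计算机" and none with "计算机", so the longer synonym is selected.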

A Method of Constructing Vietnamese Dependency Treebank Based on Improved Nivre Algorithm

Active | CN106250367B | Avoid cumbersome | Shorten the time | Natural language data processing | Special data processing applications | Material resources | Machine translation

The invention relates to a method for constructing a Vietnamese dependency treebank based on an improved Nivre algorithm, and belongs to the technical field of natural language processing. The method first builds an initial training corpus, an expansion corpus and a test corpus; second, it uses the initial training corpus to train two weak dependency parsing learners, S1 and S2, based on the improved Nivre algorithm, which serve as two fully redundant views; third, it parses the expansion corpus with the two trained weak learners and builds a Vietnamese dependency treebank model; finally, it runs dependency parsing tests on the test corpus and produces the Vietnamese dependency treebank. The method supports higher-level Vietnamese applications such as syntactic analysis, machine translation and information acquisition; it avoids manual annotation of dependency relations in Vietnamese sentences, saving time, manpower and material resources; and it makes effective use of large amounts of unannotated Vietnamese sentence-level corpora to improve dependency parsing accuracy.

Owner:KUNMING UNIV OF SCI & TECH
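
For readers unfamiliar with Nivre-style parsing, the following is a minimal sketch of the standard arc-eager transition system, applied to a transition sequence that is given rather than predicted. It does not reproduce the patent's improvement, which the abstract does not describe; the example sentence and its transitions are illustrative assumptions.

```python
def arc_eager(n_words, transitions):
    """Apply arc-eager transitions; words are numbered 1..n and 0 is the root.

    Returns a dict mapping each attached word to its head.
    """
    stack, buffer, heads = [0], list(range(1, n_words + 1)), {}
    for action in transitions:
        if action == "SHIFT":
            stack.append(buffer.pop(0))
        elif action == "LEFT-ARC":
            # Stack top becomes a dependent of the next buffer word.
            if stack[-1] == 0 or stack[-1] in heads:
                raise ValueError("LEFT-ARC not allowed in this configuration")
            heads[stack.pop()] = buffer[0]
        elif action == "RIGHT-ARC":
            # Next buffer word becomes a dependent of the stack top.
            dependent = buffer.pop(0)
            heads[dependent] = stack[-1]
            stack.append(dependent)
        elif action == "REDUCE":
            if stack[-1] not in heads:
                raise ValueError("REDUCE requires the stack top to have a head")
            stack.pop()
        else:
            raise ValueError(f"unknown transition: {action}")
    return heads


# "She eats fish": She <- eats, root -> eats, eats -> fish.
heads = arc_eager(3, ["SHIFT", "LEFT-ARC", "RIGHT-ARC", "RIGHT-ARC"])
```

In a trained parser a classifier chooses each transition from the current stack and buffer; the two weak learners in the abstract would differ in how that choice is made.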

A Phrase Tree to Dependency Tree Conversion Method Integrating Vietnamese Grammatical Features

Active | CN105740235B | Improve accuracy | Shorten the time | Natural language data processing | Special data processing applications | Three level | Theoretical computer science

The invention relates to a phrase tree to dependency tree conversion method that incorporates Vietnamese grammatical features, and belongs to the technical field of natural language processing. The method comprises: first, constructing a Vietnamese phrase treebank; converting the phrase trees in it into dependency trees with a head (center) child node filter table that incorporates Vietnamese grammatical features and a dependency relation annotator, yielding a first-level Vietnamese dependency treebank; training an MSTParser model on the manually annotated first-level treebank and using the model to expand it into a second-level Vietnamese dependency treebank; and correcting the expanded second-level treebank with a dependency relation corrector to obtain the final third-level Vietnamese dependency treebank. The method avoids manually collecting and annotating the whole Vietnamese dependency treebank, saves manpower and time in treebank construction, and improves accuracy.

Owner:KUNMING UNIV OF SCI & TECH
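
The conversion step in the abstract above selects a head child for each phrase and attaches the other children to it. A minimal sketch of that idea follows, assuming phrase trees are nested tuples and using a tiny priority table; `HEAD_TABLE`, the category labels and the example sentence are hypothetical, not the patent's filter table, and dependency labels are omitted.

```python
# Hypothetical head table: for each phrase label, child labels in priority order.
HEAD_TABLE = {"S": ["VP", "NP"], "VP": ["V", "VP"], "NP": ["N", "NP"]}


def to_dependencies(tree):
    """Convert a phrase tree into (words, arcs) with arcs as (head, dependent).

    Leaves are (pos, word); inner nodes are (label, child, child, ...).
    Words are numbered from 1 and 0 is the artificial root.
    """
    words, arcs = [], []

    def visit(node):
        label, children = node[0], node[1:]
        if len(children) == 1 and isinstance(children[0], str):
            words.append(children[0])
            return len(words)  # index of this word
        child_heads = [(child[0], visit(child)) for child in children]
        head = None
        for wanted in HEAD_TABLE.get(label, []):
            head = next((idx for lab, idx in child_heads if lab == wanted), None)
            if head is not None:
                break
        if head is None:
            head = child_heads[0][1]  # fallback: leftmost child is the head
        arcs.extend((head, idx) for _, idx in child_heads if idx != head)
        return head

    arcs.append((0, visit(tree)))
    return words, sorted(arcs, key=lambda arc: arc[1])


tree = ("S", ("NP", ("N", "Lan")), ("VP", ("V", "đọc"), ("NP", ("N", "sách"))))
words, arcs = to_dependencies(tree)
```

For this sentence the verb heads both nouns and is itself attached to the root.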

A dispute focus detection method and device based on deep learning hybrid model

Active | CN112613582B | Quick check | Accurate detection | Character and pattern recognition | Natural language data processing | Data set | Algorithm

The invention relates to a dispute focus detection method and device based on a deep learning hybrid model, belonging to the field of natural language processing. The method comprises the following steps: ① constructing a dispute focus treebank; ② completing data labeling to obtain a dataset; ③ obtaining a complete, trainable dataset; ④ applying Chinese-language preprocessing to the dataset obtained in step ③; ⑤ obtaining the word vector matrix of the text with the BERT-wwm model; ⑥ extracting global semantic features of the text with an LSTM network model, extracting local semantic features at different granularities with the multiple convolution kernels of a TextCNN model, averaging the probability outputs of the two models, setting a threshold, and outputting the dispute focuses whose probability exceeds the threshold. Because a single model cannot capture and use multi-level semantic features at the same time, the invention provides a hybrid model for predicting dispute focuses, which greatly improves prediction accuracy.

Owner:CHONGQING UNIV OF POSTS & TELECOMM
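
The final combination step in the abstract above, averaging the two models' probabilities and keeping labels over a threshold, can be sketched as follows. The probabilities, label names and the 0.5 threshold are made-up values for illustration; the LSTM and TextCNN models themselves are not implemented here.

```python
def predict_focuses(p_lstm, p_cnn, labels, threshold=0.5):
    """Average per-label probabilities from two models and keep labels above the threshold."""
    if not (len(p_lstm) == len(p_cnn) == len(labels)):
        raise ValueError("both probability lists must match the label list in length")
    averaged = [(a + b) / 2 for a, b in zip(p_lstm, p_cnn)]
    return [label for label, p in zip(labels, averaged) if p > threshold]


# Hypothetical dispute focus labels and model outputs.
labels = ["loan amount", "interest rate", "repayment date"]
predicted = predict_focuses([0.9, 0.2, 0.6], [0.7, 0.4, 0.3], labels)
```

Each label is thresholded independently, so a document can be assigned several dispute focuses or none.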

Method and system for automatic treebank transformation based on pattern embedding

Active | CN108647254B | Natural language data processing | Text database indexing | Algorithm | Theoretical computer science

The invention relates to a method and system for automatic treebank conversion based on pattern embedding, designed to obtain an accurate supervised conversion model. The method determines the pattern of words w_i and w_j in the source tree and transforms it into a corresponding pattern embedding vector; transforms the dependency labels of word w_i, word w_j and their lowest common ancestor node w_a in the source tree into three dependency embedding vectors; concatenates the pattern embedding vector and the three dependency embedding vectors as the representation vector of the structural information of w_i and w_j in the source tree; concatenates the top-layer outputs of a recurrent neural network with this representation vector and uses the results as input to the perceptron (MLP); and computes the target-side dependency arc score of w_i and w_j with a biaffine operation. The invention makes full use of the source-side syntax tree to capture the correspondences between the two annotation guidelines, and thereby achieves high-quality treebank conversion.

Owner:SUZHOU UNIV
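
The abstract above relies on the lowest common ancestor node w_a of two words in the source dependency tree. A minimal sketch of finding it from a head map is shown below; the representation (a dict from word index to head index, with 0 as the root) and the example tree are assumptions for illustration.

```python
def path_to_root(heads, node):
    """Return the chain node, head(node), ..., 0 in a dependency tree."""
    path = [node]
    while node != 0:
        node = heads[node]
        path.append(node)
    return path


def lowest_common_ancestor(heads, i, j):
    """Deepest node lying on both root paths; a node counts as its own ancestor."""
    ancestors_of_i = set(path_to_root(heads, i))
    for node in path_to_root(heads, j):
        if node in ancestors_of_i:
            return node
    return 0  # not reached for a well-formed tree, since both paths end at the root


# Hypothetical source tree: word 2 is the root word, 1 and 3 depend on 2, 4 depends on 3.
heads = {1: 2, 2: 0, 3: 2, 4: 3}
w_a = lowest_common_ancestor(heads, 1, 4)
```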

Method and system for automatic treebank conversion based on tree-shaped recurrent neural network

Active | CN108628829B | Dependent on arc minutes | Accurate Supervised Transformation Model | Natural language data processing | Neural architectures | Pattern recognition | Hidden layer

The invention relates to a method and system for automatic treebank conversion based on a tree-shaped recurrent neural network, designed to obtain an accurate supervised conversion model. In the method, the hidden-layer output vectors of word w_i, word w_j and word w_a are obtained with a bidirectional tree-shaped recurrent neural network (TreeLSTM) and concatenated as the representation vector of w_i and w_j in the source tree; the top-layer output vectors of the recurrent neural network BiSeqLSTM are each concatenated with this representation vector and used as input to the perceptron (MLP), which extracts syntactic information; and the target-side dependency arc score of w_i and w_j is computed with a biaffine operation. The invention makes full use of the source-side syntax tree to capture the correspondences between the two annotation guidelines, and provides the data support needed to build a high-quality supervised treebank conversion model.

Owner:SUZHOU UNIV
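
The biaffine arc scoring mentioned in the abstract above can be sketched in plain Python. The sketch assumes one common form of biaffine scoring, a bilinear term between the dependent and head vectors plus a linear term on the head vector; the abstract does not give the patent's exact parameterisation, and the vectors and weights below are made-up numbers.

```python
def biaffine_score(dep_vec, head_vec, weight, head_bias):
    """Score an arc as dep^T W head + bias . head."""
    bilinear = sum(
        dep_vec[a] * weight[a][b] * head_vec[b]
        for a in range(len(dep_vec))
        for b in range(len(head_vec))
    )
    linear = sum(w * h for w, h in zip(head_bias, head_vec))
    return bilinear + linear


# Hypothetical 2-dimensional MLP outputs for a dependent w_i and a candidate head w_j.
score = biaffine_score([1.0, 2.0], [3.0, 4.0], [[1.0, 0.0], [0.0, 1.0]], [0.5, -0.5])
```

In a parser, this score is computed for every candidate head of each word and the highest-scoring head (or a tree decoded from all scores) is selected.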

A method and device for generating an event relationship graph

Active | CN106095748B | Save time reading | Natural language data processing | Special data processing applications | Graph spectra | Data mining

The invention provides a method and apparatus for generating an event relationship graph. The method comprises: splitting a manuscript into sentences at preset punctuation marks; extracting the persons mentioned in the split sentences and keeping the sentences that contain persons as candidate sentences, where a person mention is at least one of a personal name, a role and a personal pronoun; performing syntactic analysis on the candidate sentences with a previously obtained syntactic treebank to generate the association relationships among the persons; and generating an event relationship graph from the persons and their association relationships. With this method, the events occurring among the persons can be viewed visually, so that a user can grasp the substance of the events in a relatively short time and reading time is reduced.

Owner:NEUSOFT CORP
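
The pipeline in the abstract above (split at punctuation, keep sentences that mention persons, relate the persons) can be sketched as below. Sentence-level co-occurrence is used here as a simple stand-in for the syntactic analysis step, which the abstract does not detail; the punctuation set, person list and text are hypothetical.

```python
import re
from itertools import combinations


def split_sentences(text, marks="。！？.!?"):
    """Split text at any of the preset punctuation marks, dropping empty pieces."""
    pieces = re.split(f"[{re.escape(marks)}]", text)
    return [piece.strip() for piece in pieces if piece.strip()]


def relation_edges(text, persons):
    """Map each pair of persons to the sentences in which both are mentioned."""
    edges = {}
    for sentence in split_sentences(text):
        present = sorted(p for p in persons if p in sentence)
        for a, b in combinations(present, 2):
            edges.setdefault((a, b), []).append(sentence)
    return edges


text = "Alice met Bob in Paris. Carol stayed home. Bob called Carol!"
edges = relation_edges(text, ["Alice", "Bob", "Carol"])
```

The resulting dictionary is an edge list for the graph: keys are person pairs and values are the sentences that would label or justify each edge.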
