
386 results about "Sentence pair" patented technology


Training-corpus quality evaluation and selection method orienting to statistical-machine translation

Active | CN102945232A | Enriching sentence pair quality evaluation features | Realize automatic learning | Special data processing applications | Sentence pair | Machine translation system
The invention relates to a training-corpus quality evaluation and selection method oriented to statistical machine translation. The method comprises the following steps: automatic weight acquisition, in which a small-scale corpus is used to train an automatic weight acquisition model to obtain feature weights and a classification threshold; sentence-pair quality evaluation, in which the weights, the classification threshold, and the original large-scale parallel corpus are taken as input, the large-scale parallel corpus is classified using a linear model for sentence-pair quality evaluation, and all corpus subsets are generated; and high-quality corpus subset selection, in which, on the basis of all the corpus subsets and taking coverage into account, high-quality corpora are selected as training data for a statistical machine translation system. The method provides richer sentence-pair quality evaluation features and learns the feature weights automatically; when the subset size reaches 30%, the performance can reach 100% or even better; and any input sentence pair can be assigned a class, which helps tasks such as the selection of high-quality corpus data.
Owner:沈阳雅译网络技术有限公司
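
The sentence-pair quality evaluation step described above (a linear model applied with learned feature weights and a classification threshold) can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: the two features, the weight values, and the threshold are hypothetical placeholders, and the coverage-aware subset selection step is not modeled.

```python
from typing import Dict

def length_ratio(src: str, tgt: str) -> float:
    """Hypothetical feature: ratio of the shorter to the longer token count."""
    s, t = len(src.split()), len(tgt.split())
    if s == 0 or t == 0:
        return 0.0
    return min(s, t) / max(s, t)

def extract_features(src: str, tgt: str) -> Dict[str, float]:
    """Hypothetical feature set; the abstract does not list the real features."""
    return {
        "length_ratio": length_ratio(src, tgt),
        "non_empty": 1.0 if src.strip() and tgt.strip() else 0.0,
    }

def quality_score(features: Dict[str, float], weights: Dict[str, float]) -> float:
    """Linear model: weighted sum of the feature values."""
    return sum(weights[name] * value for name, value in features.items())

def is_high_quality(src: str, tgt: str,
                    weights: Dict[str, float], threshold: float) -> bool:
    """Classify a sentence pair by comparing its score with the threshold."""
    return quality_score(extract_features(src, tgt), weights) >= threshold

# Assumed values; in the described method these would be learned
# from a small-scale corpus rather than set by hand.
WEIGHTS = {"length_ratio": 0.7, "non_empty": 0.3}
THRESHOLD = 0.6
```

With these assumed weights, a pair whose two sides have similar lengths scores above the threshold, while a pair with a one-word source and a ten-word target falls below it.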

Question sentence recommendation method and system

The invention provides a question sentence recommendation method. The method includes the following steps: S1, receiving corpus data, wherein the corpus data are multi-round question-and-answer data; S2, transforming the corpus data to generate positive example pairs, and generating counter-example pairs by random sampling and combination with the corpus data; S3, vectorizing the words of the positive example pairs and the counter-example pairs with a word2vec model to obtain the respective sentence vector matrices; S4, inputting the sentence vector matrices into a hidden layer and computing the dot product of the sentence vector matrices and a weight matrix to obtain new sentence vector matrices; S5, inputting the sentence vector matrices into a convolutional neural network and applying convolution and pooling operations to obtain the semantic vectors of the sentences; and S6, applying a non-linear transformation to the semantic vectors of the sentences, calculating the cosine similarity of the semantic vectors of the positive example sentence pairs and the cosine similarity of the counter-example sentence pairs, and finally obtaining a prediction model. The invention also provides a question sentence recommendation system for implementing the above method.
Owner:GUANGZHOU DUOYI NETWORK TECH +2
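
Step S6 above compares semantic vectors by cosine similarity for positive and counter-example pairs. The sketch below shows that comparison on plain Python lists. The margin-based loss is an assumption added for illustration, since the abstract does not say how the two similarities are combined during training, and the function names are hypothetical.

```python
import math
from typing import Sequence

def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def margin_loss(pos_sim: float, neg_sim: float, margin: float = 0.5) -> float:
    """Assumed training signal (not stated in the abstract): penalize the model
    unless the positive pair is more similar than the counter-example pair
    by at least `margin`."""
    return max(0.0, margin - pos_sim + neg_sim)
```

A loss of zero means the positive pair already outranks the counter-example pair by the chosen margin, so that training example would contribute no gradient.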

Word prediction method and system based on a neural machine translation system

Active | CN106844352A | Accurately obtain predicted probabilities | Natural language translation | Neural architectures | Sentence pair | Prediction probability
The invention relates to a word prediction method and system based on a neural machine translation system. The word prediction method includes the following steps: a model is trained on parallel corpora, and a phrase translation table is extracted from the training result; the source language sentence of any parallel sentence pair is searched for matches, and all source language phrases contained in the source language sentence are determined; the target phrase translation candidate set corresponding to each source language phrase is looked up in the phrase translation table; the set of target words to be encouraged is obtained from the target phrase translation candidate sets and the partial translation already produced by the neural machine translation system; the encouragement value of each target word in the set is determined from the attention probabilities of the neural machine translation system and the target phrase translation candidate sets; and the prediction probability of each target word is obtained according to its encouragement value. Because the encouragement values are obtained by introducing the phrase translation table and are added into the neural translation model, the prediction probability of the target words can be increased.
Owner:INST OF AUTOMATION CHINESE ACAD OF SCI
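
The last step above (turning encouragement values into higher prediction probabilities) can be illustrated with a small sketch. The abstract does not give the formula, so this assumes one plausible combination: the encouragement value is added to the model's unnormalized score before a softmax. The names `encourage` and `strength` are hypothetical.

```python
import math
from typing import Dict

def softmax(scores: Dict[str, float]) -> Dict[str, float]:
    """Numerically stable softmax over a word-to-score mapping."""
    peak = max(scores.values())
    exps = {word: math.exp(score - peak) for word, score in scores.items()}
    total = sum(exps.values())
    return {word: value / total for word, value in exps.items()}

def encourage(logits: Dict[str, float],
              encouragement: Dict[str, float],
              strength: float = 1.0) -> Dict[str, float]:
    """Assumed combination rule: add each word's encouragement value
    (0.0 if the word is not in the encouraged set) to its score,
    then renormalize with a softmax."""
    adjusted = {word: score + strength * encouragement.get(word, 0.0)
                for word, score in logits.items()}
    return softmax(adjusted)
```

Under this assumption, two words that start with equal scores end up with different probabilities as soon as one of them appears in the encouraged set.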

Neural machine translation method by introducing source language block information to encode

The present invention relates to a neural machine translation method that introduces source language block information into the encoding. The method comprises: inputting bilingual sentence-level parallel data and performing word segmentation on the source language and the target language respectively to obtain word-segmented bilingual parallel sentence pairs; encoding the source sentence of each word-segmented bilingual parallel sentence pair in time-step order, obtaining the state at each time step in the last hidden layer, and segmenting the input source sentence into blocks; obtaining the block encoding information of the source sentence from the state at each time step and the segmentation information of the source sentence; combining the time-step encoding information with the block encoding information to obtain the final source sentence memory information; and, by dynamically querying the source sentence memory information, using an attention mechanism to generate a context vector at each moment through a decoder network and extracting feature vectors for word prediction. According to the method provided by the present invention, block segmentation is carried out automatically on the source sentence without any pre-segmented sentences taking part in training, and the method can capture the latest and best block segmentation of the source sentence.
Owner:沈阳雅译网络技术有限公司

An ancient Chinese automatic translation method based on multi-feature fusion

Active | CN109684648A | Increased accuracy | Solve the problem of unregistered words | Natural language translation | Special data processing applications | Word list | Sentence pair
The invention discloses an ancient Chinese automatic translation method based on multi-feature fusion. The method comprises the following steps: 1) collecting texts, modern-text translation data for the texts, a text word list, and modern Chinese monolingual corpus data; 2) cleaning the data and constructing an ancient Chinese parallel corpus by using a sentence alignment method; 3) performing word segmentation on the modern text and the ancient text by using a Chinese word segmentation tool; 4) performing topic modeling on the ancient text corpus to generate a topic-word distribution and a word-topic conditional probability distribution; 5) training on the modern Chinese monolingual corpus to obtain a modern Chinese language model, and obtaining an alignment dictionary from the ancient Chinese parallel corpus; 6) on the basis of an attention-based recurrent neural network translation model, fusing statistical machine translation features such as the language model and the alignment dictionary, and training the model with ancient Chinese parallel sentence pairs and word topic sequences; and 7) receiving a to-be-translated text input by a user and obtaining a modern-text translation by using the model trained in step 6).
Owner:ZHEJIANG UNIV

Method for constructing Vietnamese dependency tree bank on basis of Chinese-Vietnamese vocabulary alignment corpora

The present invention relates to a method for constructing a Vietnamese dependency treebank on the basis of Chinese-Vietnamese word-aligned corpora and belongs to the technical field of natural language processing. First, a library of Chinese-Vietnamese word-aligned sentence pairs is constructed; then a Chinese dependency tree corpus is constructed; and from the word-aligned sentence pair library and the Chinese dependency tree corpus, a Vietnamese dependency tree corpus is constructed. The Vietnamese dependency treebank constructed by the method can provide strong support for upper-layer applications such as syntactic analysis, machine translation, and information acquisition, and a bilingual parallel dependency tree corpus is obtained as well. The disclosed method simplifies the process of manually collecting and labeling a Vietnamese dependency treebank and saves the labor and time of constructing the treebank; compared with a machine-learning approach, its accuracy is clearly improved.
Owner:KUNMING UNIV OF SCI & TECH
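
The idea of deriving a Vietnamese dependency tree from a Chinese tree and word alignments can be sketched as a simple projection. This is an illustrative assumption about how such a transfer might work, not the patent's procedure: it handles only one-to-one alignments, and `project_dependencies` and its conventions (head index `-1` for the root, `None` for unresolved words) are hypothetical.

```python
from typing import Dict, List, Optional

def project_dependencies(zh_heads: List[int],
                         alignment: Dict[int, int],
                         vi_length: int) -> List[Optional[int]]:
    """Project head indices from a Chinese tree onto the Vietnamese sentence.

    zh_heads[i] is the head index of Chinese word i (-1 marks the root).
    alignment maps a Chinese word index to a Vietnamese word index.
    Vietnamese words whose head cannot be projected stay None.
    """
    vi_heads: List[Optional[int]] = [None] * vi_length
    for dep, head in enumerate(zh_heads):
        if dep not in alignment:
            continue  # unaligned dependent: nothing to project
        vi_dep = alignment[dep]
        if head == -1:
            vi_heads[vi_dep] = -1  # the root stays the root
        elif head in alignment and alignment[head] != vi_dep:
            vi_heads[vi_dep] = alignment[head]
    return vi_heads

# Chinese "红 花" (red flower): 红 depends on 花, which is the root.
ZH_HEADS = [1, -1]
# Vietnamese "hoa đỏ" reverses the order: 红 -> đỏ (index 1), 花 -> hoa (index 0).
ALIGNMENT = {0: 1, 1: 0}
VI_HEADS = project_dependencies(ZH_HEADS, ALIGNMENT, vi_length=2)
```

In this example the projected tree makes "hoa" the root and attaches "đỏ" to it, mirroring the Chinese structure despite the reversed word order.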

Method and system for determining intertranslation relationship of bilingual sentence pairs

The invention discloses a method and system for determining the intertranslation relationship of bilingual sentence pairs. The method comprises determining the matching feature values of the bilingual sentence pairs, filtering and classifying the bilingual sentence pairs with a pre-established trained classification model according to the weights of the matching feature values in the intertranslation relationship, and determining whether the bilingual sentence pairs satisfy the requirements of the intertranslation relationship. With the method provided by the embodiment of the invention, a bilingual corpus with a huge data size can therefore be processed quickly and conveniently. The problem of determining the intertranslation relationship of bilingual sentence pairs is converted into a binary classification problem by using the trained classification model, so that the weights of the matching features of the bilingual corpus can be determined more scientifically and reasonably; compared with existing experience-based methods, the approach generalizes better, and precision and recall are improved accordingly.
Owner:BEIJING KINGSOFT OFFICE SOFTWARE INC +1
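
Treating intertranslation as binary classification over matching features, as described above, might look like the following sketch. The single lexicon-coverage feature, the toy lexicon, and the logistic parameters are all assumptions for illustration; the abstract does not name the features or the classifier type.

```python
import math
from typing import Dict, Set

def lexicon_coverage(src: str, tgt: str, lexicon: Dict[str, Set[str]]) -> float:
    """Hypothetical matching feature: the fraction of source words that have
    at least one dictionary translation present in the target sentence."""
    src_words = src.lower().split()
    tgt_words = set(tgt.lower().split())
    if not src_words:
        return 0.0
    matched = sum(1 for word in src_words
                  if lexicon.get(word, set()) & tgt_words)
    return matched / len(src_words)

def translation_probability(coverage: float, weight: float, bias: float) -> float:
    """Logistic model over the feature; weight and bias stand in for
    parameters that a trained classifier would supply."""
    return 1.0 / (1.0 + math.exp(-(weight * coverage + bias)))

def is_translation_pair(src: str, tgt: str, lexicon: Dict[str, Set[str]],
                        weight: float = 6.0, bias: float = -3.0) -> bool:
    """Binary decision: accept the pair if the probability is at least 0.5."""
    coverage = lexicon_coverage(src, tgt, lexicon)
    return translation_probability(coverage, weight, bias) >= 0.5

# Toy English-French lexicon, assumed for the example only.
LEXICON = {"black": {"noir"}, "cat": {"chat"}}
```

With the assumed parameters, a pair is accepted once at least half of its source words find a dictionary match in the target sentence.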