132 results about "Sentence similarity" patented technology

Sentence similarity is typically calculated by first embedding the sentences and then taking the cosine similarity between them. ... Sentence similarity is a complex phenomenon: the meaning of a sentence depends not only on the words in it, but also on the way they are combined.
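
As a minimal sketch of the embed-then-compare recipe described above, the following uses a toy bag-of-words count vector as the "embedding". That choice is an assumption made only for illustration; a real system would use a learned sentence encoder. The function names `embed` and `cosine_similarity` are hypothetical.

```python
import math
from collections import Counter


def embed(sentence):
    """Toy embedding: word counts. A stand-in for a learned sentence encoder."""
    return Counter(sentence.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty sentence has no direction to compare
    return dot / (norm_a * norm_b)


# Same words, different order: a bag-of-words embedding cannot tell them apart.
same_words = cosine_similarity(embed("dog bites man"), embed("man bites dog"))
# No shared words: the vectors are orthogonal.
no_overlap = cosine_similarity(embed("dog bites man"), embed("stocks fell today"))
```

The first comparison scores as a near-perfect match even though the two sentences mean different things, which illustrates why the way words are combined matters and why order-insensitive embeddings are only a starting point.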

Full text query and search systems and methods of use

Inactive | US20060212441A1 | High score | Web data indexing | Special data processing applications | Internet content | Ranking

The invention is a method for textual searching of text-based databases including databases of compiled internet content, scientific literature, abstracts for books and articles, newspapers, journals, and the like. Specifically, the algorithm supports searches using full-text or webpage as query and keyword searches allowing multiple entries and an information-content based ranking system (Shannon Information score) that uses p-values to represent the likelihood that a hit is due to random matches. Additionally, users can specify the parameters that determine hits and their ranking with scoring based on phrase matches and sentence similarities.

Owner:INFOVELL
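
The abstract above mentions an information-content based ranking (a Shannon Information score). The sketch below shows one plausible reading of that idea, not the patented algorithm: each term shared by query and document contributes its self-information, -log2 p(term), so rare terms count for more than common ones. The term probabilities are estimated from a tiny hypothetical corpus, and the p-value step of the abstract is not modelled.

```python
import math
from collections import Counter


def term_probabilities(corpus):
    """Estimate p(term) as relative frequency over all tokens in the corpus."""
    counts = Counter(tok for doc in corpus for tok in doc.lower().split())
    total = sum(counts.values())
    return {term: count / total for term, count in counts.items()}


def information_score(query, doc, probs):
    """Sum of self-information (in bits) of the terms shared by query and doc."""
    shared = set(query.lower().split()) & set(doc.lower().split())
    return sum(-math.log2(probs[t]) for t in shared if t in probs)


# Hypothetical three-document corpus, used only to estimate term frequencies.
corpus = ["the cat sat", "the dog ran", "the rare aardvark"]
probs = term_probabilities(corpus)

common_hit = information_score("the", "the rare aardvark", probs)
rare_hit = information_score("aardvark", "the rare aardvark", probs)
```

Here a match on "aardvark" scores higher than a match on "the", because the rarer term is less likely to match by chance.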

Intelligent Chinese request-answering system based on concept

Inactive | CN101286161A | Improve precision | Improve recall | Special data processing applications | User input | Questions and answers

The invention discloses a concept-based Chinese question answering system, which mainly comprises a data server, a question preprocessing module, a candidate question set extraction module, and a question sentence similarity calculation module. The invention aims to provide a concept-based question answering system that performs synonym expansion on the keywords extracted from the user's question, understands the question better, and improves the recall of the search. Furthermore, the system uses a concept-based Chinese sentence similarity calculation method that considers three aspects (word form, word order, and word length) and improves search precision. Meanwhile, the system adopts an efficient retrieval technology to rapidly extract the candidate question set, calculates question sentence similarity, quickly sorts the question set, and returns the sorted questions and answers to the user. The question answering system thus understands the concepts in the user's question more precisely and finds accurate answers. Experiments show that the system achieves high recall and precision.

Owner:HUAZHONG UNIV OF SCI & TECH
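
The abstract above combines word form, word order, and word length into one similarity. It does not give formulas, so the sketch below uses simple stand-in definitions and hypothetical weights (0.6 / 0.2 / 0.2), and assumes the sentences have already been segmented into word lists.

```python
def word_form_sim(a, b):
    """Share of distinct words the two token lists have in common."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 0.0
    return 2 * len(sa & sb) / (len(sa) + len(sb))


def word_length_sim(a, b):
    """1.0 when both sentences have the same number of words."""
    total = len(a) + len(b)
    return 1 - abs(len(a) - len(b)) / total if total else 0.0


def word_order_sim(a, b):
    """Fraction of adjacent shared words (in a's order) that keep their order in b."""
    shared = [w for w in dict.fromkeys(a) if w in b]
    if len(shared) < 2:
        return float(len(shared))  # 0 shared -> 0.0, exactly 1 shared -> 1.0
    positions = [b.index(w) for w in shared]
    in_order = sum(1 for p, q in zip(positions, positions[1:]) if p < q)
    return in_order / (len(positions) - 1)


def sentence_sim(a, b, w_form=0.6, w_order=0.2, w_length=0.2):
    """Weighted sum of the three aspects; the weights are illustrative only."""
    return (w_form * word_form_sim(a, b)
            + w_order * word_order_sim(a, b)
            + w_length * word_length_sim(a, b))


q1 = ["how", "to", "reset", "password"]
q2 = ["password", "reset", "how", "to"]
identical = sentence_sim(q1, q1)
reordered = sentence_sim(q1, q2)
```

With these definitions, reordering the same words lowers only the word-order term, so the reordered pair scores below the identical pair but well above unrelated sentences.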

Machine translation apparatus and machine translation computer program

Active | US20050049851A1 | Quality improvement | Raise the possibility | Natural language translation | Special data processing applications | Sentence similarity | High likelihood

A method of machine translation, using a bilingual corpus containing translation pairs each consisting of a sentence of a first language and a sentence of a second language, for translating an input sentence of the first language to the second language, including the steps of: receiving the input sentence of the first language and extracting, from the bilingual corpus, a sentence of the second language forming a pair with a sentence of the first language with highest similarity to the input sentence; applying an arbitrary modification among a plurality of predetermined modifications to the extracted sentence of the second language, and computing likelihood of sentences resulting from the modification; selecting a prescribed number of sentences having high likelihood from among the sentences resulting from the modification; repeating, on each of the sentences selected in the step of selecting, the steps of extracting, computing and selecting, until the likelihood no longer improves; and outputting, as a translation of the input sentence, a sentence having the highest likelihood among the sentences of the second language left at the end of the step of repeating.

Owner:ATR ADVANCED TELECOMM RES INST INT

Multiple-document automatic abstracting method based on frequent itemset

Inactive | CN102043851A | Simple and easy to implement | Improve simplicity | Special data processing applications | Document preparation | Sentence similarity

The invention discloses a multiple-document automatic abstracting method based on frequent itemsets. The method borrows the idea of frequent itemset mining from association rules: an association method is used to mine the frequent itemsets of the effective itemsets, and these serve as sub-topics. Sentences are clustered directly into the different sub-topics without computing sentence similarity, and multiple-document automatic abstracting is then carried out on the basis of an SFI (sub-topics based on frequent item sets) method. Because no sentence similarity computation is needed, the method is characterized by high simplicity, high legibility, and high practicability.

Owner:SICHUAN UNIV

Voice answering method for combining intelligent answer with artificial answer

Inactive | CN107315766A | Improve experience | Solve questions in time | Natural language data processing | Speech recognition | Natural language processing | Sentence segmentation

The invention relates to the technical field of human-computer interaction and discloses a voice answering method that combines intelligent answering with manual answering. The core idea is as follows: first, a speech recognition algorithm converts the spoken question into question text; then, the question text is segmented into sentences to obtain the user's interrogative sentence; finally, on the basis of interrogative sentence similarity, the method determines whether the standard interrogative sentence most similar to the user's interrogative sentence, together with its answer, can be found in a QA library, or whether a manual service desk must be accessed to obtain the answer. Voice answering thus makes it convenient for users to input questions and improves question answering efficiency and user experience. Meanwhile, when no proper answer can be found in the QA library, the method switches to manual answering so that users' questions are resolved in time.

Owner:JIANGMEN POWER SUPPLY BUREAU OF GUANGDONG POWER GRID
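
The routing decision described in the abstract above (answer from the QA library when a sufficiently similar standard question exists, otherwise hand over to a human) can be sketched as follows. The similarity measure here is `difflib.SequenceMatcher` from the Python standard library, used purely as a stand-in for the patent's interrogative sentence similarity. The library contents, the 0.75 threshold, and the `answer` function are hypothetical, and speech recognition and sentence segmentation are assumed to have happened already.

```python
from difflib import SequenceMatcher

# Hypothetical QA library: standard question -> stored answer.
QA_LIBRARY = {
    "how do i pay my electricity bill": "You can pay online or at a service hall.",
    "how do i report a power outage": "Call the outage hotline.",
}


def similarity(a, b):
    """Character-level similarity in [0, 1]; a stand-in for the real measure."""
    return SequenceMatcher(None, a, b).ratio()


def answer(question, qa_library, threshold=0.75):
    """Return (source, answer): the library answer, or a hand-off to a human."""
    if not qa_library:
        return ("human_agent", None)
    best = max(qa_library, key=lambda q: similarity(question, q))
    if similarity(question, best) >= threshold:
        return ("qa_library", qa_library[best])
    return ("human_agent", None)  # no close enough standard question
```

The threshold controls the trade-off: a higher value sends more questions to the manual service desk, a lower one risks returning an answer to the wrong standard question.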

Information processing method and device for realizing intelligent question answering

Inactive | CN107273350A | Improve experience | Improve targeting | Semantic analysis | Special data processing applications | Information processing | Natural language understanding

The invention relates to the technical field of man-machine interaction and discloses an information processing method and device for realizing intelligent question answering. The information processing method comprises the following steps: segmenting the question text into sentences to obtain a user question; and, on the basis of question similarity, searching a QA library for the standard question most similar to the user question and its corresponding answer. Compared with existing keyword-retrieval-based question answering methods, the disclosed method does not require users to be able to decompose a question into keywords, is fully automatic, and can greatly enhance the user experience and improve the search results and the pertinence and effectiveness of answers. Meanwhile, by fusing natural language understanding technologies such as sentence model analysis, lexical analysis, and lexical meaning extension, and by comprehensively calculating a multi-dimensional similarity, the method can improve the correctness of the final sentence similarity in Chinese automatic question answering, making a Chinese intelligent question answering system feasible.

Owner:JIANGMEN POWER SUPPLY BUREAU OF GUANGDONG POWER GRID

Sentence similarity calculation method and system

Active | CN106021223A | Accurate classification | Reduce manual intervention | Natural language data processing | Special data processing applications | Algorithm | Sentence similarity

The invention provides a sentence similarity calculation method and system. The method comprises: training a word2vec model on a pre-built corpus to obtain vectors for all words in the corpus; performing intelligent word segmentation on the two sentences whose similarity is to be calculated; looking up, among the trained vectors, the vector for each segmented word of the first and second sentences; calculating, in turn, the similarity between each segmented word of the first sentence and each segmented word of the second sentence; obtaining the two sets of segmented words whose similarity exceeds a preset threshold; calculating the similarity contribution of each group of segmented words to the whole sentence according to the offset of their positions in the sentence; and adding the contribution values of the segmented words in the two sentences to obtain the similarity between the sentences. The semantic similarity of words is calculated with word2vec, and automatic training on a large corpus facilitates accurate information retrieval, document classification, and question answering.

Owner:TCL CORPORATION
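
The steps in the abstract above (pairwise word similarity, a threshold, a position-dependent contribution, and a final sum) can be sketched as below. This is an illustration under stated assumptions, not the patented formula: the two-dimensional `TOY_VECTORS` stand in for word2vec output, the discount `1 - offset` and the normalisation by sentence length are hypothetical choices, and every word is assumed to have a vector.

```python
import math

# Hypothetical word vectors standing in for trained word2vec embeddings.
TOY_VECTORS = {
    "cheap": (0.9, 0.1), "inexpensive": (0.85, 0.2),
    "phone": (0.1, 0.9), "handset": (0.2, 0.85),
}


def cosine(u, v):
    """Cosine similarity of two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def sentence_similarity(s1, s2, vectors, threshold=0.8):
    """Sum position-discounted similarities of word pairs above the threshold."""
    total = 0.0
    for i, w1 in enumerate(s1):
        for j, w2 in enumerate(s2):
            sim = cosine(vectors[w1], vectors[w2])
            if sim > threshold:
                # Offset between the relative positions of the two words.
                offset = abs(i / len(s1) - j / len(s2))
                total += sim * (1 - offset)
    return total / max(len(s1), len(s2))


aligned = sentence_similarity(["cheap", "phone"], ["inexpensive", "handset"], TOY_VECTORS)
swapped = sentence_similarity(["cheap", "phone"], ["handset", "inexpensive"], TOY_VECTORS)
```

Both sentence pairs contain the same near-synonyms, but the pair whose matching words sit at the same positions scores higher, which is the effect the position offset is meant to capture.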

Full text query and search systems and methods of use

Inactive | US20090024612A1 | Web data indexing | Special data processing applications | Internet content | Ranking

The invention is a method for textual searching of text-based databases including databases of compiled internet content, scientific literature, abstracts for books and articles, newspapers, journals, and the like. Specifically, the algorithm supports searches using full-text or webpage as query and keyword searches allowing multiple entries and an information-content based ranking system (Shannon Information score) that uses p-values to represent the likelihood that a hit is due to random matches. Additionally, users can specify the parameters that determine hits and their ranking with scoring based on phrase matches and sentence similarities.

Owner:INFOVELL

Question and answer sentence similarity calculation method based on multilevel characteristics

Active | CN106997376A | Special data processing applications | Pattern recognition | Questions and answers

The invention discloses a question and answer sentence similarity calculation method based on multilevel characteristics, mainly applied to automatic question-answering systems. The method comprises five characteristic functions, each of which independently measures the similarity between two sentences from a different angle: a word-level characteristic measures the similarity of the two sentences in terms of words; a phrase-level characteristic measures the similarity between idioms and commonly used phrases; a sentence semantic characteristic measures the similarity of the two sentences in terms of meaning; a sentence structure characteristic measures the similarity of the two sentences in terms of grammar and syntax; and an answer type characteristic measures whether the answer sentence contains the required answer type. Finally, a linear function combines the five characteristic functions by weighted summation to form the question and answer sentence similarity algorithm. The method comprehensively measures the similarity between a question sentence and an answer sentence.

Owner:ZHEJIANG UNIV
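
The final step in the abstract above, a linear weighted sum over five feature scores, is small enough to show directly. The feature names, the example scores, and the weights below are all hypothetical; the abstract does not say how the weights are chosen, and the five feature functions themselves are not implemented here.

```python
def combined_similarity(scores, weights):
    """Weighted linear combination of per-feature similarity scores."""
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"missing feature scores: {sorted(missing)}")
    return sum(weights[name] * scores[name] for name in weights)


# Hypothetical weights for the five feature levels (chosen to sum to 1).
WEIGHTS = {
    "word": 0.2, "phrase": 0.1, "semantic": 0.4,
    "structure": 0.1, "answer_type": 0.2,
}

# Hypothetical outputs of the five feature functions for one question/answer pair.
example_scores = {
    "word": 0.8, "phrase": 0.5, "semantic": 0.9,
    "structure": 0.6, "answer_type": 1.0,
}

overall = combined_similarity(example_scores, WEIGHTS)
```

Keeping the features separate until this last step means each one can be tuned or replaced independently, with only the weights needing to be re-fitted.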

Chinese short text clustering method

Active | CN106599029A | Improve accuracy | Solve the characteristics | Semantic analysis | Special data processing applications | Peak value | Sentence similarity

The invention relates to a Chinese short text clustering method, and in particular to a Chinese short text clustering method based on word vectors and their similarity calculation. The method comprises the following steps: obtaining the needed word vectors with a Word2Vec word vector training model; obtaining the weights of all words in the short text set with a word weight calculation algorithm; calculating, from the word vectors and word weights, the similarity between every two texts in the short text set with a short text similarity algorithm; and clustering the short texts according to these pairwise similarity values. The invention provides an optimized short text similarity calculation method that addresses problems such as sparse grammatical features and semantic loss in short texts; the word weights are calculated iteratively on the basis of a graph model, which increases the accuracy of the sentence similarity calculation; and a density peak clustering method is applied to short text clustering, which effectively increases the efficiency of the clustering.

Owner:FOCUS TECH +1

Sentence similarity calculation method based on sentence meaning structure characteristics

Inactive | CN106445920A | Characterize inner connection | Reduce loss | Semantic analysis | Character and pattern recognition | Context based | Sentence similarity

The invention provides a sentence similarity calculation method based on sentence meaning structure characteristics, aiming to solve the problem of feature sparsity in sentence similarity calculation for social short texts. The method analyzes the meaning of a sentence according to a sentence meaning structure model, mines latent topic knowledge with a topic model, and expands the sentence features according to the topic-word distribution to obtain a sentence vector based on sentence features. It also introduces the Paragraph Vector deep learning model to learn the context features of the sentence and obtain a sentence vector based on context information. The similarities calculated from the two sentence vectors are then weighted to give the sentence similarity. Because the semantic information and context information of the sentence are mined deeply, the internal relations among sentences are described comprehensively and accurately, and the accuracy of similarity calculation is improved.

Owner:BEIJING INSTITUTE OF TECHNOLOGY

A sentence similarity recognition method for voice answer system

Inactive | CN101086843A | Exact similarity value | Accurate intention | Speech recognition | Spoken language | Speech identification

The invention relates to a sentence similarity recognition method for a voice answering system. The method comprises: representing the frequently asked questions in a knowledge base as keyword combinations; extracting the keywords from the speech recognition result; matching them against the knowledge base to obtain candidate questions; and determining the similarity formula and the procedure for matching the recognition result. It addresses the accuracy problem of dialect speech recognition and uses the sentence similarity value to identify the user's intent relatively accurately.

Owner:INST OF AUTOMATION CHINESE ACAD OF SCI

Sentence similarity calculation method and apparatus

Inactive | CN105183714A | Calculate approximation accurately | Accurate calculation | Special data processing applications | Natural language processing | Semantics

The present invention discloses a sentence similarity calculation method and apparatus and relates to the technical field of automatic correction. The method comprises: acquiring a vector corresponding to each word; performing syntax analysis on the two sentences to be compared so as to acquire the words forming the constituents of the two sentences; calculating a first cosine distance between the vectors of the words forming corresponding constituents of the two sentences; and determining the similarity between the two sentences according to the first cosine distance. By performing syntax analysis on the sentences and capturing their semantics structurally, the method calculates the similarity between sentences more accurately; in addition, words are represented by word vectors based on a neural network model, so that the similarity between words is calculated more accurately and the restrictions of a near-synonym dictionary are removed.

Owner:BEIJING FOCUSEDU INT EDUCATION CONSULTATION

News sentence clustering method based on semantic similarity, device and storage medium

Active | CN107679144A | Accurate clustering | Efficient clustering | Semantic analysis | Special data processing applications | Computational semantics | Semantic vector

The invention provides a news sentence clustering method based on semantic similarity. The method includes the following steps: preprocessing news sentences of a corpus, and extracting available words; utilizing the available words to train a continuous bag-of-words model to obtain an initial word vector of each available word; utilizing an initial sentence vector of each news sentence and the initial word vectors of the left and right adjoining available words of a certain available word in the news sentence to train the continuous bag-of-words model in an iterative manner to obtain a current word vector of each available word in the news sentence and a final sentence vector of the news sentence; merging an average value of the word vectors of all the available words, one-hot vectors of high-frequency words and the final sentence vector of each news sentence to obtain a semantic vector of the news sentence; and calculating distances between the semantic vectors to obtain the semantic similarity between the different news sentences, and clustering the news sentences of the corpus in accordance therewith. The invention also provides an electronic device and a computer-readable storage medium.

Owner:PING AN TECH (SHENZHEN) CO LTD

A method for grading subjective questions

Active | CN109213999A | Small amount of calculation | Easy to use | Semantic analysis | Energy efficient computing | Feature extraction algorithm | Score method

A method for grading subjective questions includes sentence preprocessing, feature extraction, feature fusion, similarity calculation and comprehensive scoring. The sentence preprocessing is used for clause splitting, word segmentation, keyword detection, part-of-speech tagging and sentence emotion analysis of the target paragraph. The feature extraction algorithm is used for extracting a word vector, a sentence vector, a word structure and a syntactic structure. The feature fusion is used for fusing a target paragraph containing M sentences into a contrast template containing N templates (N (M). The similarity calculation is used for calculating word similarity and sentence similarity. The comprehensive score is used for constructing a weight model according to the word similarity, sentence similarity, word structure similarity, syntactic structure similarity, keyword score and affective score between the student answers and the comparative template, and then grading the student answers. The invention adapts to the scoring requirements of subjective questions of various disciplines, and good scoring effect can be obtained through training of a small number of samples.

Owner:成都佳发安泰教育科技股份有限公司

Vertical domain-oriented intelligent question and answer system

Active | CN105843897A | Accurate understanding | Improve accuracy | Semantic analysis | Special data processing applications | Questions and answers | Sentence similarity

The invention discloses a vertical domain-oriented intelligent question and answer system. The system comprises a question asking module (1), a preprocessing module (2), a word segmentation and vocabulary standardization module (3), a word purification module (4), a synonym expansion module (5), a vocabulary expansion or deletion module (6), a sentence similarity calculation module (7) and an answer output module (8). The system calculates the similarity of question sentences of a user through domain ontology construction and depends on a word segmentation technology, domain ontology construction and ontology similarity calculation. The system has the advantages that a question asking intention of the user can be understood more accurately by applying a domain ontology technology through a sentence similarity algorithm, the sentence similarity can be calculated, and the accuracy of the question and answer system can be improved.

Owner:QINGDAO PENGHAI SOFT CO LTD

Device and method for judging subjective questions in test paper based on natural language

Inactive | CN108959261A | Reduce the burden on | Ensure fairness and justice | Natural language data processing | Special data processing applications | Part of speech | Paper based

The invention relates to a device and a method for judging subjective questions in a test paper based on natural language. The method includes: a clause processing module splitting the text of an examinee's answer into a plurality of clauses; a part-of-speech tagging module carrying out word segmentation and part-of-speech tagging on the clauses produced by the clause processing module; extracting keywords from the words tagged by the part-of-speech tagging module; calling a syntactic analysis module to analyze the clauses produced by the clause processing module and obtain their syntactic structure information; an obtaining module matching the keywords of the examinee's answer with the scoring keywords provided by the reference answer, and obtaining the similarity between them; and matching the clauses of the examinee's answer with the clauses provided by the reference answer, and obtaining the sentence similarity between them. The invention realizes machine marking of subjective questions, lightens the burden on the marking teacher, improves marking efficiency, and guarantees the fairness and justice of the examinees' scores.

Owner:京工博创(北京)科技有限公司

Generation method of professor intention answers in man-machine conversations

Active | CN106095950A | Improve user experience | Special data processing applications | Man machine | Sentence similarity

The invention discloses a generation method of professor intention answers in man-machine conversations. The method comprises the steps that according to received conversations, sentence similarities are calculated, and then dialogue intention recognition is performed on the current dialogue sentences; if the current dialogue intention is a small talk, the current dialogue and a corresponding inquiry intention value corresponding to the current dialogue are added into historical records, and answers are directly returned by searching a knowledge base or a network; if the current dialogue intention is a professor, the next step is executed; the historical records are searched for questions corresponding to the current dialogue; the current dialogue and network information are combined, and a plurality of simulation internal dialogues are performed to obtain a related answer set; the related answer set is filtered; on the basis of each answer weight, an abstract is extracted, the answer with the highest weight is adopted as an abstract extraction result and the current dialogue answer to be returned. According to the method, the professor content of a user can be well fed back under the professor intention in man-machine conversations, and the satisfaction of the man-machine conversations is improved.

Owner:中科极限元(杭州)智能科技股份有限公司

Personalized summary system based on user interest model

Inactive | CN101373486A | Special data processing applications | Personalization | Generative process

The invention discloses a personalized summary system based on a user interest model. The system consists of a Web information retrieval unit, a user interest unit, and a personalized summary unit. The user interest model, described with a hierarchical conceptual structure, is established and/or updated by a conceptual clustering method through analysis of the user's retrieval logs. The similarity between the user's interests and the sentences in the retrieval results is then analyzed according to the user interest model and the retrieval results, yielding personalized summaries that meet the user's requirements. Because the personalized summaries are obtained by scoring and processing individual sentences, they take full account of the user's interests and characteristics, so that the summary generation process is matched to the user's interests, improving the effectiveness of the summaries and user satisfaction.

Owner:BEIHANG UNIV

An intelligent operation and maintenance statement similarity matching method based on natural language processing

Inactive | CN109902159A | Avoid the phenomenon of "curse of dimensionality" | Improve accuracy | Special data processing applications | Semantic tool creation | Curse of dimensionality | Sentence similarity

The invention discloses an intelligent operation and maintenance statement similarity matching method based on natural language processing technology. The method mainly comprises two parts: data processing in knowledge base construction, and sentence similarity matching based on deep learning. Compared with the prior art, the method has the following advantages: (1) operation and maintenance management knowledge is segmented into words using a domain-specific word library and an HMM-based new word discovery model, which improves word segmentation accuracy and yields a more complete word library; (2) word vectors are trained by a deep learning method, which avoids the "curse of dimensionality" in word vector representation, fully mines the information in word contexts, and captures the relations between words; and (3) sentence vectors are built from weighted word vectors, so that the importance of each word is measured and the sentence vectors carry richer information, and matching accuracy on these sentence vectors is ensured by a cosine similarity matching algorithm.

Owner:华融融通(北京)科技有限公司
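
Point (3) of the abstract above, sentence vectors built from weighted word vectors and compared by cosine similarity, can be sketched as follows. The two-dimensional vectors and the per-word weights are hypothetical placeholders; the abstract does not state how the weights are derived, so a low weight for a filler word is simply assumed here.

```python
import math

# Hypothetical word vectors and importance weights (unknown words are skipped).
VECTORS = {"restart": (1.0, 0.0), "server": (0.0, 1.0), "the": (0.5, 0.5)}
WEIGHTS = {"restart": 1.0, "server": 1.0, "the": 0.1}


def sentence_vector(tokens, vectors, weights):
    """Weighted average of the word vectors of the known tokens."""
    dim = len(next(iter(vectors.values())))
    acc = [0.0] * dim
    total_weight = 0.0
    for tok in tokens:
        if tok in vectors:
            w = weights.get(tok, 1.0)  # default weight for unweighted words
            total_weight += w
            for k, x in enumerate(vectors[tok]):
                acc[k] += w * x
    return [x / total_weight for x in acc] if total_weight else acc


def cosine(u, v):
    """Cosine similarity; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


a = sentence_vector(["restart", "the", "server"], VECTORS, WEIGHTS)
b = sentence_vector(["server", "restart"], VECTORS, WEIGHTS)
match = cosine(a, b)
```

Because the sentence vector is an average, it ignores word order; any order-sensitive signal has to come from the word vectors or the weights themselves.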

Interrogative sentence similarity measurement method based on deep convolutional neural network

Inactive | CN108021555A | Avoid the impact of subsequent analysis | Avoiding Sentence Fragmentation Problems | Natural language data processing | Neural architectures | Character analysis | Chinese characters

The invention provides an interrogative sentence similarity measurement method based on a deep convolutional neural network. The method comprises the following steps: S1: through a knowledge domain related page, generating a raw corpus, crawling Chinese characters which appear in raw language materials, and generating a word vector corresponding to each Chinese character; S2: replacing each Chinese character in an interrogative sentence by the corresponding word vector to obtain a word vector set corresponding to the interrogative sentence, wherein the word vector set carries out calculation through the convolutional neural network to obtain a corresponding sentence meaning vector; and S3: carrying out pairwise combination on the interrogative sentences, and calculating the cosine function absolute values of the sentence meaning vectors corresponding to two interrogative sentences to obtain a similarity between two interrogative sentences. By use of the method, an individual-character analysis method is adopted to avoid an influence on subsequent analysis by word segmentation errors, the convolutional neural network takes the whole interrogative sentence as a whole to extract whole sentence characteristics, and a sentence meaning splitting problem brought by using word similarity matrixes is avoided.

Owner:INSPUR FINANCIAL INFORMATION TECH CO LTD
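
Step S3 of the abstract above amounts to a short formula. The symbols are assumptions introduced here: q_1 and q_2 are the two interrogative sentences, and v_1 and v_2 are the sentence meaning vectors produced for them by the convolutional neural network.

```latex
\mathrm{sim}(q_1, q_2)
  = \left| \cos(\mathbf{v}_1, \mathbf{v}_2) \right|
  = \frac{\left| \mathbf{v}_1 \cdot \mathbf{v}_2 \right|}
         {\lVert \mathbf{v}_1 \rVert \, \lVert \mathbf{v}_2 \rVert}
```

Taking the absolute value maps the similarity into the range [0, 1].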

Sentence similarity calculating method and device as well as system

Active | CN108268441A | Improve accuracy | Semantic analysis | Character and pattern recognition | Natural language processing | Sentence pair

The invention discloses a sentence similarity calculating method, device, and system. The method comprises the following steps: acquiring a sentence pair whose similarity is to be calculated; building a dependency syntax tree for each sentence in the pair; and calculating the similarity between the two sentences according to a pre-built sentence similarity calculation model and the dependency syntax tree of each sentence. The method can improve the accuracy of sentence similarity calculation.

Owner:IFLYTEK CO LTD

Method and device for calculating sentence similarity and method and device for machine translation

Active | CN103034627A | Reflect the degree of matching | Quality improvement | Special data processing applications | Co-occurrence | Machine translation

The invention provides a method and a device for calculating sentence similarity, and a method and a device for machine translation. The method for calculating sentence similarity comprises the following steps: comparing a first sentence with a second sentence to determine the differing word pairs; scoring each differing word using its matching probability with the other words in the sentence (first or second) that contains it, where the matching probability of two words is looked up in a matching probability model that derives it from the co-occurrence frequency of the two words in a preset corpus; scoring each differing word pair from the scores of its words; and determining the similarity of the first and second sentences from the scores of the differing word pairs. The method and the device reflect the degree of matching between two sentences more accurately, which improves the quality of applications such as machine translation.

Owner:BEIJING BAIDU NETCOM SCI & TECH CO LTD
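
The co-occurrence based matching probability in the abstract above can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented model: co-occurrence is counted per sentence, the probability is taken as the pair count divided by the count of the first word, and a differing word is scored (the abstract's "marked") by its mean matching probability with the other words in its sentence. The names `build_matching_model`, `matching_probability`, and `score_word` are hypothetical.

```python
from collections import Counter
from itertools import combinations

def build_matching_model(corpus):
    """Count, per sentence, how often each word and each word pair occurs."""
    word_counts = Counter()
    pair_counts = Counter()
    for sentence in corpus:
        words = set(sentence)
        word_counts.update(words)
        pair_counts.update(frozenset(p) for p in combinations(sorted(words), 2))
    return word_counts, pair_counts

def matching_probability(a, b, model):
    """Assumed estimate: sentences containing both a and b / sentences containing a."""
    word_counts, pair_counts = model
    if a == b or word_counts[a] == 0:
        return 0.0
    return pair_counts[frozenset((a, b))] / word_counts[a]

def score_word(word, sentence, model):
    """Mean matching probability of `word` with the other words of its sentence."""
    others = [w for w in sentence if w != word]
    if not others:
        return 0.0
    return sum(matching_probability(word, w, model) for w in others) / len(others)

# Toy corpus, for illustration only.
corpus = [
    ["drink", "hot", "tea"],
    ["drink", "cold", "water"],
    ["drink", "hot", "coffee"],
    ["read", "a", "book"],
]
model = build_matching_model(corpus)

first = ["drink", "hot", "tea"]
second = ["drink", "hot", "book"]
# The differing word pair is ("tea", "book"); each word is scored in its own sentence.
tea_score = score_word("tea", first, model)
book_score = score_word("book", second, model)
print(tea_score, book_score)
```

Here "tea" fits its sentence well and "book" does not, so the pair would pull the sentence similarity down. How the word scores are combined into pair scores and a final similarity is not specified in the abstract and is left out of the sketch.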

Clustering method for topic views based on sentence similarity

Active | CN106372208A | Well represented | Easy extraction | Character and pattern recognition | Special data processing applications | View based | The Internet

The invention discloses a clustering method for topic views based on sentence similarity. The method can be used to cluster the main views on a given topic on the internet. It comprises the following steps: first, constructing a view lexicon for the topic through human-computer cooperation; second, extracting all view sentences on the topic and clustering the views by the similarity of the view sentences; finally, selecting a representative view sentence for each view class according to the average similarity of the sentences. The method makes the clustering result more diverse and fine-grained, lets a user see the views and details of the various parties on the topic more clearly, and effectively avoids vagueness and one-sidedness in view clustering and description.

Owner:SOUTHEAST UNIV
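
The clustering and representative-selection steps in the abstract above can be sketched as below. Several choices are assumptions, since the abstract does not fix them: token-set Jaccard overlap stands in for the sentence similarity, clustering is a greedy single pass with a hypothetical `threshold`, and the view lexicon step is skipped entirely.

```python
def jaccard(a, b):
    """Token-set overlap, used here as a stand-in sentence similarity."""
    set_a, set_b = set(a.split()), set(b.split())
    if not set_a and not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)

def cluster_views(sentences, threshold=0.3):
    """Greedy single pass: join the first cluster whose members are, on
    average, at least `threshold` similar; otherwise start a new cluster."""
    clusters = []
    for s in sentences:
        for cluster in clusters:
            avg = sum(jaccard(s, m) for m in cluster) / len(cluster)
            if avg >= threshold:
                cluster.append(s)
                break
        else:
            clusters.append([s])
    return clusters

def representative(cluster):
    """Pick the sentence with the highest average similarity to the others."""
    if len(cluster) == 1:
        return cluster[0]

    def avg_sim(i):
        total = sum(jaccard(cluster[i], m) for j, m in enumerate(cluster) if j != i)
        return total / (len(cluster) - 1)

    best = max(range(len(cluster)), key=avg_sim)
    return cluster[best]

# Toy view sentences, for illustration only.
views = [
    "the new policy is good for users",
    "the new policy is good",
    "the policy is good for all users",
    "prices are too high",
    "prices are far too high now",
]
clusters = cluster_views(views)
for c in clusters:
    print(representative(c), "<-", len(c), "sentences")
```

Choosing the member with the highest average similarity gives each view class a sentence that is central to the class rather than an outlier.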

Chinese sentence similarity calculation method based on Word2Vec

Inactive | CN109062892A | Improve accuracy | Good calculation result | Semantic analysis | Special data processing applications | Part of speech | User input

The invention discloses a Chinese sentence similarity calculation method based on Word2Vec. The method trains a word vector model on a large-scale corpus and represents each sentence as a syntactic component tree using the LTP syntactic parser. The calculation comprises the following steps: accepting a question Q input by a user; performing word segmentation, part-of-speech analysis, and syntactic analysis on Q; matching Q against each question A in the question template to calculate a similarity adjustment coefficient score1 and a semantic similarity score score2 between Q and A; and calculating the sentence similarity score between Q and A from score1 and score2. By adding sentence structure information to the similarity calculation and taking the syntactic relations between words into account, the invention effectively improves the accuracy of similarity calculation.

Owner:NORTHEASTERN UNIV

Chinese patent text similarity calculation method

Inactive | CN108549634A | Speed up the review | High accuracy and recall | Semantic analysis | Character and pattern recognition | Weight value | Sentence similarity

The invention relates to a Chinese patent text similarity calculation method. The method comprises the steps of: performing word segmentation on the texts; calculating TF-IDF values for the segmentation results, taking the results with relatively high TF-IDF values as keywords, taking the sentences that contain the keywords as key sentences, and taking the maximum keyword weight in each key sentence as the weight of that key sentence, thereby obtaining a keyword set for each text; and calculating the weights of the key sentences of the comparison texts, selecting the key sentences of the to-be-compared texts and the comparison texts in sequence, and calculating the similarity of the texts based on the sentence similarity of the key sentences. Semantic relationships in the patent texts are analyzed using existing patent domain ontologies, and patent text similarity is calculated using a vector space model together with the domain ontologies. The precision and recall of the calculation result are relatively high, the similarity between patents can be described more accurately, patent examination can be sped up, and the needs of practical application can be met.

Owner:BEIJING INFORMATION SCI & TECH UNIV +1
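
The keyword and key-sentence steps in the abstract above can be sketched as follows. This is a simplified illustration: the texts are assumed to be already segmented into tokens (English tokens are used for readability), the TF-IDF variant and the `top_k` cutoff are assumptions, and the ontology-based semantic analysis and the final text similarity step are omitted.

```python
import math
from collections import Counter

def tf_idf(docs):
    """docs: list of token lists. Returns one {term: tf-idf} dict per document."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        if not doc:
            scores.append({})
            continue
        tf = Counter(doc)
        scores.append({t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()})
    return scores

def key_sentences(sentences, term_scores, top_k=2):
    """Keep the sentences that contain a top-k keyword; the weight of a key
    sentence is the largest keyword weight found in it."""
    ranked = sorted(term_scores.items(), key=lambda kv: kv[1], reverse=True)
    keywords = dict(ranked[:top_k])
    result = []
    for sent in sentences:
        hits = [keywords[t] for t in sent if t in keywords]
        if hits:
            result.append((sent, max(hits)))
    return result

# Two toy "patent texts", each a list of tokenized sentences.
text_a = [
    ["battery", "cell", "stores", "energy", "in", "the", "battery"],
    ["the", "housing", "is", "plastic"],
]
text_b = [
    ["the", "antenna", "receives", "a", "signal"],
    ["the", "housing", "is", "metal"],
]
docs = [[t for s in text for t in s] for text in (text_a, text_b)]
scores = tf_idf(docs)
key_a = key_sentences(text_a, scores[0], top_k=1)
print(key_a)
```

With these toy texts, "battery" is the top keyword of the first text, so only its first sentence is kept as a key sentence; terms that occur in both texts receive a TF-IDF value of zero.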

Full text query and search systems and methods of use

Inactive | CN101088082A | Special data processing applications | Internet content | Ranking

The invention is a method for textual searching of text-based databases including databases of compiled internet content, scientific literature, abstracts for books and articles, newspapers, journals, and the like. Specifically, the algorithm supports searches using full-text or webpage as query and keyword searches allowing multiple entries and an information-content based ranking system (Shannon Information score) that uses p-values to represent the likelihood that a hit is due to random matches. Additionally, users can specify the parameters that determine hits and their ranking with scoring based on phrase matches and sentence similarities.

Owner:英孚威尔公司
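
An information-content based ranking of the kind the abstract above mentions can be illustrated with a small sketch. The abstract does not describe how the Shannon Information score is computed, so the scheme below is an assumption: each term carries self-information -log2(p), with p estimated from term frequencies in the collection, and a hit is scored by summing the information of the distinct terms it shares with the query. The p-value step is not covered.

```python
import math
from collections import Counter

def term_information(documents):
    """Self-information -log2(p) of each term, with p estimated from the
    term frequencies over the whole collection."""
    counts = Counter(t for doc in documents for t in doc)
    total = sum(counts.values())
    return {t: -math.log2(c / total) for t, c in counts.items()}

def information_score(query, doc, info):
    """Sum the information of the distinct terms shared by query and document."""
    return sum(info[t] for t in set(query) & set(doc) if t in info)

def rank(query, documents):
    """Return (score, document index) pairs, highest score first."""
    info = term_information(documents)
    scored = [(information_score(query, d, info), i) for i, d in enumerate(documents)]
    return sorted(scored, key=lambda s: s[0], reverse=True)

# Toy collection of tokenized documents, for illustration only.
documents = [
    ["the", "protein", "binds", "the", "receptor"],
    ["the", "market", "fell", "the", "index", "fell"],
    ["the", "receptor", "is", "the", "target"],
]
query = ["the", "protein", "receptor"]
ranking = rank(query, documents)
print([i for _, i in ranking])
```

Rare shared terms such as "protein" contribute far more to the score than frequent ones such as "the", which is the intuition behind information-content ranking.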

Online corpus alignment method and system

Active | CN106126506A | Exact Alignment Results | Improve alignment efficiency | Natural language translation | Special data processing applications | Natural language processing | Sentence segmentation

The invention discloses an online corpus alignment method and system. The method comprises the steps of analyzing a bilingual inter-translated file to obtain a result file; performing paragraph adjustment on the result file to enable paragraphs between an original text and a translated text to correspond; automatically performing sentence segmentation on the original text and the translated text through a preset sentence segmentation rule to obtain original text sentences and translated text sentences, and performing calculation according to a preset arrangement rule to obtain arrangement combinations of the original text sentences and the translated text sentences; and calculating sentence similarity corresponding to each arrangement combination of the original text sentences and the translated text sentences, and selecting the arrangement combination with the maximum similarity as a final sentence-sentence alignment result. According to the method and the system, the accuracy of alignment can be improved.

Owner:上海一者信息科技有限公司

Successive principal axes filter method of multi-document automatic summarization

Inactive | CN101008941A | Good effect | High precision | Special data processing applications | Pattern recognition | Sentence similarity

The invention relates to a successive principal axes filter method for multi-document automatic summarization, in the field of text information processing. Based on the OR rotation axis method, it comprises the steps of computing sentence similarity, analyzing the principal axes, and handling redundancy among the abstract sentences.

Owner:FUDAN UNIV

Computing method, search processing method, computing device and search processing device for sentence similarity

Active | CN104462327A | Accurate Sentence Similarity Data | Search results are accurate | Special data processing applications | Algorithm | Sentence similarity

The invention provides a computer-implemented computing method, search processing method, computing device, and search processing device for sentence similarity. The computing method comprises the following steps: obtaining a first sentence and a second sentence; performing dependency analysis on each of them to obtain a first dependency tree and a second dependency tree; and computing the similarity of the first sentence and the second sentence from the two dependency trees. The search processing method comprises the following steps: receiving a query sentence; obtaining at least one search result item for the query sentence; computing the semantic similarity between the query sentence and each search result item using the computing method above; ordering the search result items by the computed semantic similarity; and sending the ordered search result items. Sentence similarity is computed more accurately because it reflects the semantics of the sentences, and the search results are therefore more accurate.

Owner:BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD
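
The dependency-tree comparison and result ordering described in the abstract above can be sketched as follows. The representation and the measure are assumptions: each dependency tree is written by hand as a set of (head, relation, dependent) triples instead of being produced by a parser, and similarity is the Jaccard overlap of the triple sets, which is only one plausible way to compare two trees.

```python
def tree_similarity(tree_a, tree_b):
    """Jaccard overlap of (head, relation, dependent) triples."""
    a, b = set(tree_a), set(tree_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def rank_results(query_tree, result_trees):
    """result_trees: {result_id: triples}. Returns ids, most similar first."""
    return sorted(
        result_trees,
        key=lambda rid: tree_similarity(query_tree, result_trees[rid]),
        reverse=True,
    )

# Hand-written dependency triples, for illustration only.
query_tree = {("bites", "nsubj", "dog"), ("bites", "obj", "man")}  # "dog bites man"
result_trees = {
    "r1": {("bites", "nsubj", "man"), ("bites", "obj", "dog")},    # "man bites dog"
    "r2": {                                                        # "dog bites man today"
        ("bites", "nsubj", "dog"),
        ("bites", "obj", "man"),
        ("bites", "advmod", "today"),
    },
}
order = rank_results(query_tree, result_trees)
print(order)
```

A bag-of-words measure would treat "man bites dog" as identical to the query, whereas the triple comparison scores it at zero because subject and object are swapped.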
