
35 results about "Longest common substring problem" patented technology

In computer science, the longest common substring problem is to find the longest string (or strings) that is a substring (or are substrings) of two or more strings.
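
The definition above can be illustrated with a minimal sketch of the standard dynamic-programming approach for the two-string case. The function name `longest_common_substring` is chosen here for illustration and does not come from any of the listed patents; generalizing to more than two strings is usually done with other structures (for example, a generalized suffix tree) and is not shown.

```python
def longest_common_substring(a: str, b: str) -> str:
    """Return one longest string that occurs contiguously in both a and b."""
    best_len = 0
    best_end = 0  # end index (exclusive) of the best match inside a
    # prev[j] = length of the longest common suffix of a[:i-1] and b[:j]
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the common suffix that ended one character earlier.
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len = curr[j]
                    best_end = i
        prev = curr
    return a[best_end - best_len:best_end]


print(longest_common_substring("ABABC", "BABCA"))
```

This runs in O(len(a) * len(b)) time and keeps only two rows of the table. When several substrings tie for the longest length, the sketch returns the first one found in `a`.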

Log event extraction method and system based on log tree and parse tree

The invention discloses a log event extraction method and system based on a log tree and a parse tree. The method has two steps, preprocessing and log content parsing. Specifically, it comprises: providing and maintaining a rule base composed of regular expressions and heuristic rules, and sampling a small portion of the logs to generate a log format automatically; splitting each log online into a log header and log content based on the log format; searching the parse tree, and calculating the similarity between the static fields and between the dynamic parameters in the log tree and the event tree using the longest common substring and the longest common subvector, respectively; and matching the log tree against the event tree using a clustering technique to extract events and their corresponding parameters. To cope with the complexity of log content, the preprocessing and log content parsing steps of the online event extraction method are improved. The workload of manually recognizing log formats is reduced, the difficulty existing methods have in identifying events with an uncertain number of parameters is resolved, and log events are extracted more accurately.
Owner:NANJING UNIV OF SCI & TECH

Semantic similarity calculation method and device based on CTW and KM algorithms

The invention provides a semantic similarity calculation method and device based on the CTW and KM algorithms. It aims to overcome a defect of prior-art semantic similarity methods, which ignore the important influence of word order on semantics, by accounting for the effect of word order on sentences while keeping a single semantic judgment rule. The method comprises: using the Word2Vec deep learning platform to convert the text into word vectors in a multi-dimensional space; obtaining a plurality of text similarity values and mapping them to the multi-dimensional vector space; connecting the vectors to form a curve in that space; comparing the similarity of a plurality of texts through their word vector curves by means of a relatively new time warping distance taken from curve similarity measurement in images; and adopting the KM algorithm to reduce the scale of the calculation. Compared with traditional methods such as the longest common substring and word frequency statistics, the method is more robust, is clearly effective on sentences that contain the same segmented words in different orders, which traditional methods cannot handle, and improves calculation accuracy.
Owner:HUBEI UNIV OF TECH

Method for detecting similarity of string matching codes

The invention discloses a method for detecting code similarity based on string matching. The method comprises: preprocessing the program code and standardizing the source code; comparing the obtained feature vectors against the to-be-compared code row by row and generating binary feature values, where a feature value of 0 indicates that the row does not contain the feature vector value and a feature value of 1 indicates that it does; dynamically generating code structure fingerprints; and extracting identical feature vectors from the to-be-compared code, looking up the corresponding generated structure fingerprints according to those identical feature vectors, and forming the structure fingerprints of the code features. The structural similarity of the code is then obtained from the structure feature fingerprints of the to-be-compared code by longest common substring matching. Building on existing string-matching code similarity detection, the method can additionally detect structural similarity and thereby improve the accuracy of code similarity detection.
Owner:NANJING UNIV OF POSTS & TELECOMM
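
As a rough illustration of the idea in this abstract (per-row binary feature values compared by longest common substring matching), the sketch below builds a bit string for each code row and measures the longest run of consecutive rows whose bit strings match. Everything specific here is an assumption made for illustration: the `FEATURES` tuple, the function names, and the normalization by the longer fingerprint are not taken from the patent, which does not disclose its feature vectors or scoring in the abstract.

```python
from typing import List, Sequence

# Hypothetical feature set; the abstract does not list the real feature vectors.
# Plain substring checks are crude (e.g. "if" also matches inside "elif").
FEATURES = ("for", "while", "if", "return", "=")


def row_fingerprint(row: str) -> str:
    """One bit per feature: '1' if the row contains it, '0' otherwise."""
    return "".join("1" if feature in row else "0" for feature in FEATURES)


def structure_fingerprint(source: str) -> List[str]:
    """Fingerprints of all non-blank rows, in source order."""
    return [row_fingerprint(row) for row in source.splitlines() if row.strip()]


def longest_common_run(x: Sequence[str], y: Sequence[str]) -> int:
    """Length of the longest contiguous run of equal elements shared by x and y."""
    best = 0
    prev = [0] * (len(y) + 1)
    for xi in x:
        curr = [0]
        for j, yj in enumerate(y, 1):
            curr.append(prev[j - 1] + 1 if xi == yj else 0)
            best = max(best, curr[j])
        prev = curr
    return best


def structure_similarity(src_a: str, src_b: str) -> float:
    """Longest shared run of row fingerprints, normalized by the longer input."""
    fa, fb = structure_fingerprint(src_a), structure_fingerprint(src_b)
    if not fa or not fb:
        return 0.0
    return longest_common_run(fa, fb) / max(len(fa), len(fb))


a = "for i in range(3):\n    x = i\n    return x\n"
b = "for k in range(9):\n    total = k\n    return total\n"
print(structure_similarity(a, b))
```

Because the comparison operates on feature bits rather than on identifiers, renaming variables (as in `a` versus `b`) leaves the fingerprint unchanged in this sketch.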

Candidate word evaluation method and device, computer equipment and storage medium

The invention relates to a candidate word evaluation method and device, computer equipment and a storage medium, which are applied to the field of data processing. The method comprises the following steps: detecting an error word and obtaining a plurality of candidate words corresponding to the error word; determining an edit distance between each candidate word and the error word; determining the similarity between each candidate word and the error word, wherein the similarity is obtained according to the longest common subsequence and/or longest common substring of each candidate word and the error word; replacing the error word with each candidate word to obtain a candidate sentence, and determining an evaluation probability of the corresponding candidate word according to the candidate sentence; obtaining error information of the error word relative to each candidate word; and determining the evaluation score corresponding to each candidate word according to the edit distance, similarity, evaluation probability and error information. The embodiment of the invention solves the problem of low reliability of candidate word evaluation and helps improve the reliability of the candidate word evaluation result.
Owner:GUANGZHOU SHIYUAN ELECTRONICS CO LTD
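
The string measures named in this abstract (edit distance, longest common subsequence, longest common substring) can be sketched as follows. The abstract does not say how the similarity or the final evaluation score is computed, so the `similarity` function below, which averages the two lengths after normalizing by the longer word, is purely an illustrative assumption; the evaluation probability and error information are not modeled at all.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def lcs_subsequence_len(a: str, b: str) -> int:
    """Length of the longest common subsequence (characters need not be adjacent)."""
    prev = [0] * (len(b) + 1)
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, 1):
            curr.append(prev[j - 1] + 1 if ca == cb else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_substring_len(a: str, b: str) -> int:
    """Length of the longest common substring (characters must be adjacent)."""
    best = 0
    prev = [0] * (len(b) + 1)
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, 1):
            curr.append(prev[j - 1] + 1 if ca == cb else 0)
            best = max(best, curr[j])
        prev = curr
    return best


def similarity(candidate: str, error_word: str) -> float:
    """Hypothetical similarity: mean of the two normalized common lengths."""
    longest = max(len(candidate), len(error_word))
    if longest == 0:
        return 1.0
    subseq = lcs_subsequence_len(candidate, error_word) / longest
    substr = lcs_substring_len(candidate, error_word) / longest
    return (subseq + substr) / 2


for cand in ("receive", "relieve", "recipe"):
    print(cand, edit_distance(cand, "recieve"), round(similarity(cand, "recieve"), 3))
```

The subsequence measure tolerates transposed or inserted letters, while the substring measure rewards an unbroken shared stem, which may be why the abstract allows either or both.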

Candidate word evaluation method and apparatus, computer device and storage medium

The present invention relates to a candidate word evaluation method and apparatus, a computer device and a storage medium, which are applied to the field of data processing. The method comprises: when detecting a wrong word, acquiring a plurality of candidate words corresponding to the wrong word; determining similarity between each candidate word and the wrong word, wherein the similarity is obtained according to a longest common subsequence and/or a longest common substring between each candidate word and the wrong word; respectively replacing the wrong word with each candidate word to obtain a candidate sentence, and determining an evaluation probability of the corresponding candidate word according to the candidate sentence, wherein the evaluation probability is obtained according to a locale probability of the candidate word in the candidate sentence and a locale probability of the adjacent words of the candidate word; acquiring error information of the wrong word with respect to each candidate word; and determining an evaluation score corresponding to each candidate word according to the similarity, the evaluation probability, and the error information. According to embodiments of the present invention, the problem of low reliability of candidate word evaluation is solved, which helps improve the reliability of the candidate word evaluation result.
Owner:GUANGZHOU SHIYUAN ELECTRONICS CO LTD

Candidate word evaluation method, apparatus, computer equipment and storage medium

The invention relates to a candidate word evaluation method, device, computer equipment and storage medium, which are applied to the field of data processing. The method includes: detecting a wrong word and obtaining a plurality of candidate words corresponding to the wrong word; determining the similarity between each candidate word and the wrong word, wherein the similarity is based on the longest common subsequence and/or the longest common substring of each candidate word and the wrong word; replacing the wrong word with each candidate word respectively to obtain a candidate sentence, and determining the evaluation probability of the corresponding candidate word according to the candidate sentence, wherein the evaluation probability is based on the language environment probability of the candidate word in the candidate sentence and the language environment probability of the adjacent words of the candidate word; obtaining the error information of the wrong word relative to each candidate word; and determining the evaluation score corresponding to each candidate word according to the similarity, the evaluation probability and the error information. The embodiment of the present invention solves the problem of low reliability of candidate word evaluation and helps improve the reliability of the candidate word evaluation result.
Owner:GUANGZHOU SHIYUAN ELECTRONICS CO LTD

Candidate word evaluation method and apparatus, computer equipment and storage medium

The invention relates to a candidate word evaluation method and apparatus, computer equipment and a storage medium, which are applied to the field of data processing. The method comprises the steps of: detecting a wrong word and acquiring a plurality of candidate words corresponding to the wrong word; determining an edit distance between each candidate word and the wrong word; determining the similarity between each candidate word and the wrong word, wherein the similarity is obtained according to the longest common subsequence and/or the longest common substring of each candidate word and the wrong word; determining a language environment probability of each candidate word in the position of the wrong word; acquiring error information of the wrong word relative to each candidate word; and determining an evaluation score corresponding to each candidate word according to the edit distance, the similarity, the language environment probability and the error information. According to the candidate word evaluation method and apparatus, the problem of relatively low reliability of candidate word evaluation is solved, so that the reliability of the candidate word evaluation result is improved.
Owner:GUANGZHOU SHIYUAN ELECTRONICS CO LTD