35 results about "Longest common substring problem" patented technology

In computer science, the longest common substring problem is to find the longest string (or strings) that is a substring (or are substrings) of two or more strings.
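
As a minimal illustration of the definition above for the two-string case, the following Python sketch uses the textbook dynamic-programming approach. The function name `longest_common_substring` is chosen here for illustration and does not come from any of the listed patents.

```python
def longest_common_substring(a: str, b: str) -> str:
    """Return one longest string that occurs contiguously in both a and b."""
    best_len = 0
    best_end = 0  # exclusive end index of the best match inside a
    # prev[j] holds the length of the common suffix of a[:i-1] and b[:j]
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # extend the common suffix that ended one character earlier
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len = curr[j]
                    best_end = i
        prev = curr
    return a[best_end - best_len:best_end]


print(longest_common_substring("ABABC", "BABCA"))
```

The table costs O(len(a) * len(b)) time and O(len(b)) memory. When several substrings tie for the maximum length, this sketch returns the one that ends earliest in `a`.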

Systems and methods to control work progress for content transformation based on natural language processing and/or machine learning

Status: Inactive | Publication: US20140108103A1 | Tags: Resources; Longest common substring problem; Documentation procedure

Systems and methods are provided to compute indicators of the completeness of the work output of a transformation of text-based content, of worker capacity in performing the transformation, and/or of the degree of matching between a unit of work and a worker. The indicators are based on information collected about the complexity of works, the times and throughput of workers, and the ratings of work outputs, and they use natural language processing and machine learning techniques such as language detection, longest common substring, length ratio, and document similarity. The indicators are used to optimize job pickup and output submission for online crowdsourcing tasks related to the transformation of text-based content, such as transcription, translation, and proofreading.

Owner:GENGO

Electronic official document trace reserving method based on file comparison

Status: Inactive | Publication: CN105589838A | Tags: Solve the problem of over-labeling; Reflect changes; Natural language data processing; Special data processing applications; Longest common substring problem; Operational system

The invention relates to the technical field of e-government affairs, and in particular to a method for retaining revision traces in electronic official documents based on file comparison, in which texts are compared using longest common substring matching. The method effectively solves the problem of over-marking, uses a simple algorithm, is relatively easy to implement in various programming languages, and is applicable to various operating systems and software environments. The method first compares the original text with the modified text to determine, relative to the original text, which character strings of the modified text were inserted and which were deleted, and then marks the inserted and deleted character strings respectively, thereby retaining the traces. The method is mainly applied to the modification of electronic texts.

Owner:NO 33 RES INST OF CHINA ELECTRONICS TECHNOOGY GRP
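
The comparison-and-marking flow described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented procedure: the recursion on the longest common block, the function name `mark_changes`, and the `[-...-]` / `{+...+}` markers are all chosen here for the example. The longest common block is located with Python's standard `difflib.SequenceMatcher.find_longest_match`.

```python
from difflib import SequenceMatcher


def mark_changes(original: str, modified: str) -> str:
    """Mark deletions as [-...-] and insertions as {+...+} (hypothetical markers)."""
    matcher = SequenceMatcher(None, original, modified, autojunk=False)
    m = matcher.find_longest_match(0, len(original), 0, len(modified))
    if m.size == 0:
        # Nothing in common: the whole original span was deleted,
        # and the whole modified span was inserted.
        out = ""
        if original:
            out += "[-" + original + "-]"
        if modified:
            out += "{+" + modified + "+}"
        return out
    # Keep the longest common block and recurse on the text to its left and right.
    left = mark_changes(original[:m.a], modified[:m.b])
    right = mark_changes(original[m.a + m.size:], modified[m.b + m.size:])
    return left + original[m.a:m.a + m.size] + right


print(mark_changes("the quick fox", "the slow fox"))
```

Anchoring on the longest unchanged block first tends to keep the marked regions small, which is one plausible reading of how such a method avoids over-marking.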

Text similarity detection method

Status: Active | Publication: CN107562824A | Tags: Improve accuracy; Improve reliability; Special data processing applications; Longest common substring problem; Generation process

The invention relates to a text similarity detection method and belongs to the technical field of natural language processing. The method comprises the steps of: first, calculating the similarity of the texts with the conventional Simhash algorithm; second, introducing an N-Gram language model to combine text keywords so that the keywords carry contextual relationships, and calculating the similarity of the texts with the Simhash algorithm again; third, introducing the longest common substring as one of the similarity criteria and calculating the similarity of the texts with it; and finally, assigning a weight to each calculated similarity and combining them into a final weighted similarity. Compared with the prior art, the method mainly mitigates the Simhash algorithm's poor support for short texts and the loss of useful information during fingerprint generation, and it improves the accuracy and reliability of text similarity detection.

Owner:KUNMING UNIV OF SCI & TECH

Method and system for obtaining word pair translation from bilingual sentence

Status: Inactive | Publication: CN101187924A | Tags: Reduce workload; Improve the efficiency of obtaining translations; Special data processing applications; Resource pool; Longest common substring problem

The invention provides a method for obtaining a word-pair translation from bilingual sentence pairs. The method includes the following steps: A. a lemma to be processed is received; B. candidate bilingual sentence pairs are retrieved from an index resource pool according to the lemma; C. two groups of bilingual sentence pairs are selected from the retrieval result, and the longest common substring of the sentences that are in the same language as the lemma is obtained from the two groups; D. it is judged whether this substring is consistent with the lemma; if it is not, another two groups of bilingual sentence pairs are selected from the retrieval result and step C is repeated; if it is, then E. the longest common substring of the corresponding sentences in the two groups of bilingual sentence pairs is obtained. Using an index reduces the data processing workload and improves the efficiency of obtaining the translation. The invention also provides a system for obtaining a word-pair translation from bilingual sentence pairs.

Owner:BEIJING KINGSOFT SOFTWARE +2

Log event extraction method and system based on log tree and parse tree

Status: Pending | Publication: CN113626400A | Tags: Reduce workload; Accurate identification; File metadata searching; Special data processing applications; Longest common substring problem; Algorithm

The invention discloses a log event extraction method and system based on a log tree and a parse tree. The method is divided into two stages, preprocessing and log content parsing, and specifically comprises the steps of: providing and maintaining a rule base composed of regular expressions and heuristic rules, and extracting a small sample of logs to automatically generate a log format; splitting each log online into a log head and log content based on the log format; searching the parse tree, and calculating the similarity of the static fields and of the dynamic parameters in the log tree and the event tree using the longest common substring and the longest common subvector, respectively; and matching the log tree and the event tree with a clustering technique, and extracting events and their corresponding parameters. To cope with the complexity of log content, the preprocessing and log content parsing steps of the online event extraction method are improved. The workload of manually recognizing log formats is reduced, the difficulty existing methods have in identifying events with an uncertain number of parameters is addressed, and log events are extracted more accurately.

Owner:NANJING UNIV OF SCI & TECH

Text similarity calculation method and device and electronic device

Status: Active | Publication: CN109271641A | Tags: The similarity is close to reality; Improve accuracy; Natural language data processing; Text database querying; Longest common substring problem; Theoretical computer science

The embodiment of the invention discloses a text similarity calculation method and device and an electronic device. The method comprises the following steps: obtaining an original text and a target text; calculating the edit distance between the original text and the target text; determining the longest common substring of the original text and the target text, and obtaining the starting position of the longest common substring in the original text; and calculating the text similarity between the original text and the target text based on the starting position of the longest common substring in the original text. By combining the edit distance between the two texts with their longest common substring, the embodiment produces a text similarity that is closer to reality and improves the accuracy of the text similarity calculation.

Owner:广西三方大供应链技术服务有限公司

Semantic similarity calculation method and device based on CTW and KM algorithms

Status: Active | Publication: CN109858015A | Tags: Improve accuracy; Improve robustness; Special data processing applications; Longest common substring problem; Semantics

The invention provides a semantic similarity calculation method and device based on the CTW and KM algorithms. It aims to overcome a defect of prior-art semantic similarity calculation methods, which do not consider the important influence of word order on semantics; the invention takes the influence of word order on sentences into account while keeping a single semantic judgment rule. The method comprises: using the Word2Vec deep learning platform to map the segmented words of a text to word vectors in a multi-dimensional space; obtaining a plurality of text similarity values, mapping them to the multi-dimensional vector space, and connecting the vectors to form a curve in that space; comparing the similarity of a plurality of texts through their word vector curves by means of a relatively new time warping distance used for curve similarity in images; and adopting the KM algorithm to reduce the scale of the calculation. Compared with traditional methods such as longest common substring and word frequency statistics, the method is more robust, is clearly effective on sentences that contain the same segmented words in different word orders, which traditional methods cannot handle, and improves calculation accuracy.

Owner:HUBEI UNIV OF TECH

Method for obtaining longest common substring of alphabetic strings

Status: Active | Publication: CN102222093A | Tags: Reduce workload; Convenient query; Special data processing applications; Longest common substring problem; Byte

The invention relates to a method for obtaining the longest common substring among alphabetic strings. To improve the efficiency of obtaining the longest common substring, the method comprises the following steps: first, a bidirectional comparison is carried out on both sides of a matching byte to obtain initial common substrings and calculate their lengths; second, starting from the longest common substring found so far, repeated attempts are made to find a longer common substring by combining multiple trans-mechanisms, until all alphabetic strings have been processed. The invention improves the calculation efficiency of obtaining the longest common substring and reduces resource overhead.

Owner:COMP APPL RES INST CHINA ACAD OF ENG PHYSICS
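
The first step of this abstract, comparing in both directions around a matching byte, might look like the sketch below. The abstract gives no further detail, so the function name `extend_match`, its signature, and the choice to return start offsets plus a length are assumptions made for illustration. The second step (the "trans-mechanisms") is not specified in the abstract and is not modeled here.

```python
def extend_match(a: str, b: str, i: int, j: int):
    """Grow the single-character match a[i] == b[j] leftwards and rightwards.

    Returns (start_in_a, start_in_b, length) of the resulting common substring.
    """
    if a[i] != b[j]:
        raise ValueError("a[i] and b[j] must match")
    left = 0
    # Walk left while both strings still have characters and they agree.
    while i - left - 1 >= 0 and j - left - 1 >= 0 and a[i - left - 1] == b[j - left - 1]:
        left += 1
    right = 0
    # Walk right under the same conditions.
    while i + right + 1 < len(a) and j + right + 1 < len(b) and a[i + right + 1] == b[j + right + 1]:
        right += 1
    return i - left, j - left, left + right + 1


start_a, start_b, length = extend_match("xxhelloyy", "zhelloq", 4, 3)
print("xxhelloyy"[start_a:start_a + length])
```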

Method for detecting similarity of string matching codes

Status: Active | Publication: CN108920361A | Tags: Easy to operate; Avoid difficulties; Software testing/debugging; Feature vector; Longest common substring problem

The invention discloses a method for detecting code similarity by string matching. The method includes the steps of: preprocessing the program code and normalizing the source code; comparing the obtained feature vectors against the code to be compared row by row and generating binary feature values, where a feature value of 0 indicates that the row does not contain the feature vector value and a feature value of 1 indicates that it does; dynamically generating code structure fingerprints; and extracting the identical feature vectors from the codes to be compared, looking up the corresponding generated structure fingerprints according to those feature vectors, and forming the structure fingerprints of the code features. The structural similarity of the codes is then obtained from the structure feature fingerprints of the codes to be compared by matching their longest common substrings. The method detects the structural similarity of codes in addition to the similarity detected by existing string-matching methods, and it improves the accuracy of code similarity detection.

Owner:NANJING UNIV OF POSTS & TELECOMM

Military equipment knowledge graph-oriented key information query method

Status: Pending | Publication: CN114357143A | Tags: Natural language data processing; Special data processing applications; Longest common substring problem; Entity linking

The invention discloses a military equipment knowledge graph-oriented key information query method, which comprises the following steps of: acquiring a natural language query statement, and performing entity linking on entities involved in the natural language query statement and existing entities in a military equipment knowledge graph based on longest common substring matching to obtain a key information query statement; obtaining a plurality of entity linking results; sorting the plurality of entity linking results based on longest prefix matching, and selecting an optimal entity linking result; creating a query template library according to element types in the natural language query statements; on the basis of template matching, the elements in the natural language query statement are matched with the query template library, the corresponding elements in the natural language query statement are filled into the corresponding atlas query statement templates in the query template library, a complete atlas query statement is formed, and a query result is obtained after atlas query. And four typical query statements of military equipment can be answered without data set and algorithm training.

Owner:中国人民解放军军事科学院战争研究院

Emotion feature identification method and apparatus

Status: Inactive | Publication: CN108021548A | Tags: Automatic recognition of emotional features; Reduce labor costs; Natural language data processing; Special data processing applications; Pattern recognition; Longest common substring problem

The invention discloses an emotion feature identification method and apparatus, relates to the technical field of information, and solves the problems of relatively low efficiency and relatively low accuracy of emotion feature identification in the prior art. According to the main technical scheme, the method comprises the steps of firstly obtaining comment data of a target product, wherein the product comment data includes product comment text data; secondly according to a preset processing rule, processing the product comment text data to obtain multiple sentences containing same emotion words; and finally determining a longest common substring, containing the emotion words, among the multiple sentences as an emotion feature of the target product. The emotion feature identification method and apparatus is suitable for emotion feature identification.

Owner:BEIJING GRIDSUM TECH CO LTD

Method and device of analyzing search keyword frequency

Status: Active | Publication: CN107203570A | Tags: Fix and make up for errors; Improve the efficiency of similarity calculation; Special data processing applications; Text database clustering/classification; Longest common substring problem; Neighbor algorithm

The invention provides a method and device of analyzing search keyword frequency based on HLSA. In the method, keyword aggregation is conducted by introducing the LSA space model which contains a theme, the deficiency that the Euclidean distance model based on VSM vector does not take into account the semantic information of a word per se is overcome and the error caused by the order changes of keywords based on an edit distance model is remedied. Additionally the method further combines with Hamming keywords to make computations on the similarity of eigenvectors between the keywords, new HLSA algorithm is formed, the computation efficiency of similarity is increased; the K-nearest neighbor algorithm is utilized to classify and statistically measure the frequency of keywords, aggregation is conducted on keywords of different granularities, and misjudgments due to too small particle size by the longest common substring model are effectively avoided.

Owner:BEIJING JINGDONG SHANGKE INFORMATION TECH CO LTD +1

Classic track similar track identification method

Status: Active | Publication: CN108470146A | Tags: High similarity; High similar track recognition rate; Character and pattern recognition; Longest common substring problem; Engineering

The invention discloses a classic track similar track identification method, and aims to provide a classic track identification method with a high similar track identification rate and capable of processing unstable tracks; the method comprises the following steps: reading classic tracks from a classic track knowledge base, and reading real time tracks from a real time track database; using a Douglas-Peucker algorithm to compress the real time track; primarily determining track similarity according to track features; if the primary determination succeeds, using the distance between points of the classic track and a line segment of the real time track to calculate the multi-to-1 longest common substring distance; using the multi-to-1 longest common substring distance as the multi-to-1 longest common substring distance between points and line of the classic track and the real time track; using the ratio between the point-to-line multi-to-1 longest common substring distance and the classic track length as the track similarity; precisely determining track similarity according to the obtained track similarity, and outputting a result if the track similarity precise determination succeeds.

Owner:10TH RES INST OF CETC
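
The track-compression step named in this abstract, the Douglas-Peucker algorithm, can be sketched as below. This is the standard recursive form of that algorithm rather than anything taken from the patent; representing tracks as 2D `(x, y)` tuples, measuring distance to the chord as a segment rather than an infinite line, and the names `douglas_peucker` and `epsilon` are assumptions for illustration. The later many-to-one longest common substring distance is not specified in enough detail to sketch.

```python
import math


def _point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b (all points are (x, y) tuples)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the projection to its endpoints.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def douglas_peucker(points, epsilon):
    """Drop points that lie within epsilon of the simplified polyline."""
    if len(points) < 3:
        return list(points)
    idx, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = _point_segment_distance(points[i], points[0], points[-1])
        if d > dmax:
            idx, dmax = i, d
    if dmax <= epsilon:
        # Every interior point is close enough to the chord: keep the endpoints only.
        return [points[0], points[-1]]
    # Otherwise keep the farthest point and simplify both halves around it.
    left = douglas_peucker(points[:idx + 1], epsilon)
    right = douglas_peucker(points[idx:], epsilon)
    return left[:-1] + right


track = [(0, 0), (1, 0.1), (2, 0), (3, 5), (4, 0)]
print(douglas_peucker(track, 1.0))
```

A larger `epsilon` discards more points, so it trades track fidelity for a shorter sequence to compare.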

Method and device of data deduplication

Status: Inactive | Publication: CN108280085A | Tags: Improve comparison efficiency; Reduce storage; Special data processing applications; Longest common substring problem; File comparison

The invention belongs to the technical field of data statistics, and particularly relates to a method and a device for data deduplication. The method includes: constructing a longest-common-substring table from the acquired target data; extracting the longest common substring of the two pieces of data on which a deduplication judgment is to be made, and comparing it with the substrings in the longest-common-substring table; and carrying out deduplication processing on the two pieces of data if no substring identical to that longest common substring exists in the table. With this method and device, the data in the table does not need frequent updating, the amount of stored data is reduced, and the efficiency of data comparison during deduplication is improved.

Owner:CHINA ACADEMY OF INFORMATION & COMM

Similar track recognition method for classic tracks

Status: Inactive | Publication: CN108537258A | Tags: High similarity; High similar track recognition rate; Character and pattern recognition; Longest common substring problem; Computer science

The invention discloses a similar track recognition method for classic tracks and aims to provide a classic track recognition method that has a high similar-track recognition rate and can process unstable tracks. The technical scheme comprises the following steps: a classic track is read from a classic track knowledge base, and a real-time track is read from a real-time track base; the Douglas-Peucker algorithm is used to compress the real-time track; track features are used for an initial track similarity judgment; if the initial judgment succeeds, the distance between a point of the classic track and a line segment of the real-time track is used to calculate the many-to-one longest common substring distance, which is taken as the point-to-line many-to-one longest common substring distance between the classic track and the real-time track; the ratio of this distance to the length of the classic track is used as the track similarity; a precise track similarity judgment is then carried out according to the track similarity, and if it succeeds, the result is output.

Owner:10TH RES INST OF CETC

Method and device for generating traffic detection rule

Status: Active | Publication: CN106506507A | Tags: Transmission; Longest common substring problem; Time-Consuming

Embodiments of the invention provide a method and a device for generating a traffic detection rule, which are applied to electronic equipment. The method comprises the following steps of obtaining traffic files of at least two attack traffics for a preset loophole, wherein the traffic files at least include load data in the attack traffics; determining a requester and an answer party of each attack traffic according to a protocol type of the attack traffic; determining loophole information of the preset loophole as an information guide item; extracting the first load data of all requesters from all traffic files; taking all first load data and the information guide item as a first input source and computing to obtain a first longest common substring of all first load data; determining the first longest common substring as a first characteristic; and generating a first traffic detection rule according to the first characteristic. Through application of the embodiments of the invention, the time consumed by generation of the traffic detection rule is reduced.

Owner:NEW H3C TECH CO LTD
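
The core of the feature-extraction step in this abstract, finding a substring shared by every request payload, can be illustrated with the brute-force sketch below. It is an assumption-laden illustration rather than the patented procedure: the function name `common_signature` is hypothetical, the "information guide item" mentioned in the abstract is ignored, and candidate substrings are simply enumerated from the shortest payload, longest first.

```python
from typing import Sequence


def common_signature(payloads: Sequence[bytes]) -> bytes:
    """Return one longest byte string that occurs in every payload (b"" if none)."""
    if not payloads:
        return b""
    # Any common substring must fit inside the shortest payload.
    shortest = min(payloads, key=len)
    for length in range(len(shortest), 0, -1):
        for start in range(len(shortest) - length + 1):
            candidate = shortest[start:start + length]
            if all(candidate in p for p in payloads):
                return candidate
    return b""


samples = [b"AAA/etc/passwdBBB", b"xx/etc/passwd", b"/etc/passwd?q"]
print(common_signature(samples))
```

This enumeration is cubic in the worst case, which is acceptable for a handful of short payloads; a generalized suffix tree or suffix automaton would be the usual choice at scale.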

Candidate word evaluation method and device, computer device and storage medium

Status: Inactive | Publication: CN108681534A | Tags: Accurate assessment; Improve error correction performance; Natural language data processing; Special data processing applications; Evaluation result; Longest common subsequence problem

The invention relates to a candidate word evaluation method and device, a computer device and a storage medium, which are applied to the field of data processing. The method comprises the following steps: upon detecting an error word, obtaining a plurality of candidate words corresponding to the error word; determining the similarity between each candidate word and the error word, wherein the similarity is obtained from the longest common subsequence and/or the longest common substring of each candidate word and the error word; obtaining error information of the error word relative to each candidate word; and determining the evaluation score corresponding to each candidate word according to the similarity and the error information. The method, the device, the computer device and the storage medium of the embodiment help improve the reliability of the candidate word evaluation result.

Owner:GUANGZHOU SHIYUAN ELECTRONICS CO LTD
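
To make the distinction between the two measures in this abstract concrete, the sketch below computes both the longest common subsequence length (gaps allowed) and the longest common substring length (contiguous only) for a candidate word and an error word. The abstract does not say how the two are combined, so the weighted average normalized by the longer word in `similarity`, and the weights `w_seq` and `w_str`, are purely illustrative assumptions.

```python
def lcs_subsequence_len(a: str, b: str) -> int:
    """Length of the longest common subsequence (characters need not be adjacent)."""
    prev = [0] * (len(b) + 1)
    for ch in a:
        curr = [0]
        for j, bj in enumerate(b, 1):
            curr.append(prev[j - 1] + 1 if ch == bj else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_substring_len(a: str, b: str) -> int:
    """Length of the longest common substring (characters must be adjacent)."""
    best = 0
    prev = [0] * (len(b) + 1)
    for ch in a:
        curr = [0]
        for j, bj in enumerate(b, 1):
            curr.append(prev[j - 1] + 1 if ch == bj else 0)
            best = max(best, curr[j])
        prev = curr
    return best


def similarity(candidate: str, error_word: str, w_seq: float = 0.5, w_str: float = 0.5) -> float:
    """Hypothetical combination: weighted average of both lengths over the longer word."""
    longest = max(len(candidate), len(error_word))
    if longest == 0:
        return 1.0
    seq = lcs_subsequence_len(candidate, error_word)
    sub = lcs_substring_len(candidate, error_word)
    return (w_seq * seq + w_str * sub) / longest


print(lcs_subsequence_len("believe", "beleive"), lcs_substring_len("believe", "beleive"))
```

For the transposition typo "beleive", the subsequence measure stays high while the substring measure drops, which suggests why a scoring scheme might want both signals.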

Candidate word evaluation method and device, computer equipment and storage medium

Status: Active | Publication: CN108681533A | Tags: Improve accuracy; Natural language data processing; Special data processing applications; Evaluation result; Longest common subsequence problem

The invention relates to a candidate word evaluation method and device, computer equipment and a storage medium, which are applied to the field of data processing. The method comprises the following steps: detecting an error word and obtaining a plurality of candidate words corresponding to the error word; determining the edit distance between each candidate word and the error word; determining the similarity between each candidate word and the error word, wherein the similarity is obtained from the longest common subsequence and/or the longest common substring of each candidate word and the error word; replacing the error word with each candidate word to obtain a candidate sentence, and determining the evaluation probability of the corresponding candidate word according to the candidate sentence; obtaining error information of the error word relative to each candidate word; and determining the evaluation score corresponding to each candidate word according to the edit distance, similarity, evaluation probability and error information. The embodiment addresses the low reliability of candidate word evaluation and helps improve the reliability of the candidate word evaluation result.

Owner:GUANGZHOU SHIYUAN ELECTRONICS CO LTD

Candidate word assessment method and apparatus, and candidate word sorting method and apparatus

Status: Active | Publication: CN108595419A | Tags: Improve reliability; Natural language data processing; Special data processing applications; Longest common substring problem; Longest common subsequence problem

The invention relates to a candidate word assessment method and apparatus, and a candidate word sorting method and apparatus, which are applied to the field of data processing. The method comprises the steps of: detecting a wrong word, and obtaining multiple candidate words corresponding to the wrong word; determining the similarity between each candidate word and the wrong word, wherein the similarity is obtained from the longest common subsequence and/or the longest common substring of each candidate word and the wrong word; replacing the wrong word with each candidate word to obtain candidate statements, and determining the assessment probabilities corresponding to the candidate words according to the candidate statements, wherein the assessment probabilities are obtained from the language environment probabilities of the candidate words in the candidate statements and the language environment probabilities of the neighboring words of the candidate words; and determining the assessment scores of the candidate words according to the similarity and the assessment probabilities. The reliability of candidate word assessment results can be improved.

Owner:GUANGZHOU SHIYUAN ELECTRONICS CO LTD

Method for extracting longest common substring of time series data

Status: Pending | Publication: CN110232076A | Tags: Quick extraction; Digital data information retrieval; Special data processing applications; Longest common substring problem; Slide window

The invention discloses a method for extracting the longest common substring of time series data. The method comprises the following steps: step 1, reading the time series data to be compared; step 2, selecting a time characteristic parameter of the time series data as the calculation parameter, and applying differential transformation, quantization and symbolization to the time series data to obtain a symbolized sequence; step 3, establishing an equivalent character table of the time series data according to the symbolized sequence; step 4, searching for and storing the common substrings of the time series data according to the equivalent character table, using a dynamic sliding window; and step 5, extracting the longest common substring by comparing the lengths of the common substrings. The method can quickly extract the longest common substring of time series data, and it remains effective when part of the time series data is missing.

Owner:SOUTHWEST CHINA RES INST OF ELECTRONICS EQUIP
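
Steps 2 to 5 of this abstract can be illustrated with a deliberately simplified sketch. Several choices here are assumptions rather than details from the patent: the three-symbol alphabet (`U` for up, `D` for down, `S` for steady), the tolerance `tol`, and the function names are hypothetical. The equivalent character table and the dynamic sliding window are not described in enough detail to reproduce, so a plain dynamic-programming search stands in for them.

```python
def symbolize(series, tol=0.5):
    """Difference the series, then quantize each step into U (up), D (down) or S (steady)."""
    symbols = []
    for prev, curr in zip(series, series[1:]):
        delta = curr - prev
        symbols.append("U" if delta > tol else "D" if delta < -tol else "S")
    return "".join(symbols)


def longest_common_run(a: str, b: str) -> str:
    """Longest common substring of two symbol strings (plain DP, not the patent's window search)."""
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return a[best_end - best_len:best_end]


x = symbolize([1, 2, 3, 3, 2, 1])
y = symbolize([5, 4, 5, 6, 6, 5, 7])
print(x, y, longest_common_run(x, y))
```

Working on differences rather than raw values means two series with the same shape but different offsets map to the same symbols, which is presumably part of the motivation for the differential transformation.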

Candidate word evaluation method, candidate word sorting method and device

Status: Active | Publication: CN108595419B | Tags: Improve reliability; Natural language data processing; Text database querying; Longest common substring problem; Evaluation result

The invention relates to a candidate word evaluation method, a candidate word sorting method and a device, which are applied in the field of data processing. The method includes: detecting a wrong word and obtaining a plurality of candidate words corresponding to the wrong word; determining the similarity between each candidate word and the wrong word, the similarity being obtained from the longest common subsequence and/or the longest common substring of each candidate word and the wrong word; replacing the wrong word with each candidate word in turn to obtain candidate sentences, and determining the evaluation probability of the corresponding candidate word according to each candidate sentence, the evaluation probability being obtained from the language environment probability of the candidate word in the candidate sentence and the language environment probabilities of the words adjacent to the candidate word; and determining the evaluation score of each candidate word according to the similarity and the evaluation probability. The embodiment of the present invention helps improve the reliability of the evaluation result of candidate words.

Owner:GUANGZHOU SHIYUAN ELECTRONICS CO LTD
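
The entry above derives a similarity from the longest common subsequence and/or the longest common substring, without giving a formula. The Python sketch below shows one way such a similarity could be computed. The normalization by the longer word and the equal weights `w_seq` and `w_str` are assumptions made for illustration, not values from the patent.

```python
def lcs_subsequence_len(a: str, b: str) -> int:
    """Length of the longest common subsequence (characters need not be adjacent)."""
    prev = [0] * (len(b) + 1)
    for ch in a:
        curr = [0]
        for j, bj in enumerate(b, start=1):
            if ch == bj:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_substring_len(a: str, b: str) -> int:
    """Length of the longest common substring (characters must be adjacent)."""
    best = 0
    prev = [0] * (len(b) + 1)
    for ch in a:
        curr = [0] * (len(b) + 1)
        for j, bj in enumerate(b, start=1):
            if ch == bj:
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def similarity(candidate: str, wrong: str, w_seq: float = 0.5, w_str: float = 0.5) -> float:
    """Weighted mix of both measures, normalized by the longer word (hypothetical formula)."""
    longest = max(len(candidate), len(wrong))
    if longest == 0:
        return 1.0
    seq = lcs_subsequence_len(candidate, wrong) / longest
    sub = lcs_substring_len(candidate, wrong) / longest
    return w_seq * seq + w_str * sub
```

For the misspelling `recieve` and the candidate `receive`, the subsequence measure stays high because only two letters are swapped, while the substring measure drops to the shared prefix `rec`; combining the two rewards candidates that preserve both order and contiguity.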

A Method for Retaining Traces of Electronic Official Documents Based on File Comparison

Inactive | CN105589838B | Solve the problem of over-labeling | Reflect changes | Natural language data processing | Special data processing applications | Electronic document | Longest common substring problem

The invention relates to the technical field of e-government, and in particular to a method for retaining traces of electronic official documents based on file comparison, in which texts are compared by longest common substring matching. The method effectively solves the problem of over-marking; the algorithm is simple, relatively easy to implement in a variety of programming languages, and applicable to a variety of operating systems and software environments. The method first compares the original text with the modified text to determine which character strings were inserted into and which were deleted from the original text, and then marks the inserted and deleted character strings separately, thereby retaining the traces. The method is mainly applied to the modification of electronic texts.

Owner:NO 33 RES INST OF CHINA ELECTRONICS TECHNOLOGY GRP
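
The compare-then-mark flow in the entry above can be sketched in Python as a recursive split around the longest common substring. This is an illustration under assumptions, not the patented algorithm: it relies on the standard library's `difflib.SequenceMatcher.find_longest_match`, and the function names and the `[-...-]` / `{+...+}` marker syntax are hypothetical.

```python
from difflib import SequenceMatcher
from typing import List, Tuple


def mark_changes(original: str, modified: str) -> List[Tuple[str, str]]:
    """Return (tag, text) pieces, where tag is 'keep', 'delete' or 'insert'."""
    if not original and not modified:
        return []
    matcher = SequenceMatcher(None, original, modified, autojunk=False)
    m = matcher.find_longest_match(0, len(original), 0, len(modified))
    if m.size == 0:
        # Nothing in common: the whole original span was deleted,
        # and the whole modified span was inserted.
        pieces = []
        if original:
            pieces.append(("delete", original))
        if modified:
            pieces.append(("insert", modified))
        return pieces
    # Keep the longest common substring and recurse on both sides of it.
    left = mark_changes(original[:m.a], modified[:m.b])
    right = mark_changes(original[m.a + m.size:], modified[m.b + m.size:])
    return left + [("keep", original[m.a:m.a + m.size])] + right


def render(pieces: List[Tuple[str, str]]) -> str:
    """Render the pieces with inline markers for deletions and insertions."""
    fmt = {"keep": "{}", "delete": "[-{}-]", "insert": "{{+{}+}}"}
    return "".join(fmt[tag].format(text) for tag, text in pieces)


marked = render(mark_changes("the quick fox", "the slow fox"))
```

The recursion depth grows with the number of changed regions, so a production version would likely iterate or split the document into paragraphs first.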

Candidate word evaluation method and apparatus, computer device and storage medium

Active | CN108664466A | Improve accuracy | Natural language data processing | Special data processing applications | Evaluation result | Longest common subsequence problem

The present invention relates to a candidate word evaluation method and apparatus, a computer device and a storage medium, applied in the field of data processing. The method comprises: when a wrong word is detected, acquiring a plurality of candidate words corresponding to the wrong word; determining the similarity between each candidate word and the wrong word, wherein the similarity is obtained from the longest common subsequence and/or the longest common substring of the candidate word and the wrong word; replacing the wrong word with each candidate word in turn to obtain candidate sentences, and determining the evaluation probability of each candidate word from its candidate sentence, wherein the evaluation probability is obtained from the language environment probability of the candidate word in the candidate sentence and the language environment probabilities of the words adjacent to the candidate word; acquiring error information of the wrong word with respect to each candidate word; and determining the evaluation score of each candidate word from the similarity, the evaluation probability and the error information. The embodiments of the invention address the low reliability of candidate word evaluation and help improve the reliability of candidate word evaluation results.

Owner:GUANGZHOU SHIYUAN ELECTRONICS CO LTD

Method for determining malicious software characteristics and malicious software detection method and device

Active | CN111737693A | Improve detection efficiency | Web data indexing | Character and pattern recognition | Longest common substring problem | Theoretical computer science

The embodiments of the invention disclose a method for determining malicious software characteristics, and a malicious software detection method and device. One of the methods comprises the following steps: determining, with a longest common substring algorithm, the longest common substring of each string pair among one or more string pairs, and determining the characteristics of the malicious software from the one or more longest common substrings so determined. The features of the malicious software can thus be extracted automatically, which greatly improves working efficiency.

Owner:BEIJING VENUS INFORMATION SECURITY TECH +1
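
The pairwise step in the entry above can be sketched in Python as follows. The abstract does not say how the string pairs are formed or how the resulting substrings are filtered, so this sketch pairs every two samples and keeps only substrings of at least a hypothetical `min_len` characters. The sample strings in the usage lines are invented for illustration.

```python
from itertools import combinations
from typing import Iterable, Set


def longest_common_substring(a: str, b: str) -> str:
    """Return one longest common substring of a and b (standard DP)."""
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return a[best_end - best_len:best_end]


def candidate_features(samples: Iterable[str], min_len: int = 4) -> Set[str]:
    """Collect the longest common substring of every pair of samples.

    `min_len` is a hypothetical filter that drops short, uninformative matches.
    """
    features = set()
    for a, b in combinations(list(samples), 2):
        common = longest_common_substring(a, b)
        if len(common) >= min_len:
            features.add(common)
    return features


# Invented sample strings, not data from the patent.
features = candidate_features(
    ["xxCreateRemoteThreadyy", "abCreateRemoteThreadcd", "zzzz"]
)
```

Pairing every two samples costs quadratic time in the number of samples, so a real pipeline would probably cluster the samples first and compare only within clusters.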

Candidate word evaluation method and device, computer equipment and storage medium

Active | CN108681535A | Improve accuracy | Natural language data processing | Special data processing applications | Evaluation result | Longest common subsequence problem

The invention relates to a candidate word evaluation method and device, computer equipment and a storage medium, applied in the field of data processing. The method comprises the following steps: detecting a wrong word and obtaining a plurality of candidate words corresponding to the wrong word; determining the edit distance between each candidate word and the wrong word; determining the similarity between each candidate word and the wrong word, wherein the similarity is obtained from the longest common subsequence and/or the longest common substring of the candidate word and the wrong word; obtaining error information of the wrong word relative to each candidate word; and determining the evaluation score of each candidate word from the edit distance, the similarity and the error information. The embodiments of the invention address the low reliability of candidate word evaluation and help improve the reliability of candidate word evaluation results.

Owner:GUANGZHOU SHIYUAN ELECTRONICS CO LTD
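
The entry above adds an edit distance to the scoring inputs but does not define it. Assuming the common Levenshtein definition (insertions, deletions and substitutions each cost 1), a minimal Python sketch looks like this; the function name is an assumption, and the patent may use a different variant.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between a and b, computed row by row."""
    prev = list(range(len(b) + 1))  # distance from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute or keep
            ))
        prev = curr
    return prev[-1]
```

Under this definition a swap of two adjacent letters, as in `recieve` versus `receive`, counts as two edits, which is one reason a scorer might combine the distance with a subsequence-based similarity rather than use it alone.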

Candidate word evaluation method, apparatus, computer equipment and storage medium

Active | CN108664466B | Improve accuracy | Natural language data processing | Text database querying | Longest common substring problem | Evaluation result

The invention relates to a candidate word evaluation method, device, computer equipment and storage medium, applied in the field of data processing. The method includes: detecting a wrong word and obtaining a plurality of candidate words corresponding to the wrong word; determining the similarity between each candidate word and the wrong word, where the similarity is obtained from the longest common subsequence and/or the longest common substring of the candidate word and the wrong word; replacing the wrong word with each candidate word in turn to obtain candidate sentences, and determining the evaluation probability of each candidate word from its candidate sentence, where the evaluation probability is obtained from the language environment probability of the candidate word in the candidate sentence and the language environment probabilities of the words adjacent to the candidate word; obtaining error information of the wrong word relative to each candidate word; and determining the evaluation score of each candidate word from the similarity, the evaluation probability and the error information. The embodiments of the invention address the low reliability of candidate word evaluation and help improve the reliability of candidate word evaluation results.

Owner:GUANGZHOU SHIYUAN ELECTRONICS CO LTD

Method for obtaining longest common substring of alphabetic strings

Active | CN102222093B | Reduce workload | Convenient query | Special data processing applications | Longest common substring problem | Byte

The invention relates to a method for obtaining the longest common substring among alphabetic strings. To improve the efficiency of obtaining the longest common substring, the method comprises the following steps: first, a bidirectional comparison is carried out on both sides of a matching byte to obtain initial common substrings and calculate their lengths; second, starting from the longest common substring found so far, the method repeatedly tries to find a longer common substring by combining multiple trans-mechanisms, until all alphabetic strings have been processed. The invention improves the calculation efficiency of obtaining the longest common substring and reduces resource overhead.

Owner:COMP APPL RES INST CHINA ACAD OF ENG PHYSICS
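
The bidirectional comparison around a matching byte, described in the entry above, can be sketched in Python. The abstract does not detail its mechanisms for moving on to the next comparison, so the stride used here is an assumption of this sketch rather than the patented technique: once a common substring of length `best_len` is known, any longer one must cover at least one of every `best_len + 1` consecutive positions, so the seed position can jump ahead by that amount. All names are hypothetical.

```python
from typing import Dict, List, Tuple


def expand_match(a: str, b: str, i: int, j: int) -> Tuple[int, int, int]:
    """Given a[i] == b[j], extend the match left and right.

    Returns (start_in_a, start_in_b, length) of the maximal run through (i, j).
    """
    left = 0
    while i - left - 1 >= 0 and j - left - 1 >= 0 and a[i - left - 1] == b[j - left - 1]:
        left += 1
    right = 0
    while i + right + 1 < len(a) and j + right + 1 < len(b) and a[i + right + 1] == b[j + right + 1]:
        right += 1
    return i - left, j - left, left + right + 1


def longest_common_substring(a: str, b: str) -> str:
    """Seed at sparse positions of a, expand in both directions, keep the best run."""
    positions: Dict[str, List[int]] = {}
    for j, ch in enumerate(b):
        positions.setdefault(ch, []).append(j)

    best_start, best_len = 0, 0
    i = 0
    while i < len(a):
        for j in positions.get(a[i], []):
            start_a, _, length = expand_match(a, b, i, j)
            if length > best_len:
                best_start, best_len = start_a, length
        # A run longer than best_len cannot fit strictly between two seeds
        # that are best_len + 1 apart, so the positions in between are skipped.
        i += best_len + 1
    return a[best_start:best_start + best_len]
```

The expansion has to run in both directions because a seed chosen by the stride usually lands in the middle of a common run rather than at its start.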

Candidate word evaluation method, device, computer equipment and storage medium

Active | CN108694166B | Improve accuracy | Natural language data processing | Longest common substring problem | Evaluation result

The invention relates to a candidate word evaluation method, device, computer equipment and storage medium, applied in the field of data processing. The method includes: detecting a wrong word and obtaining a plurality of candidate words corresponding to the wrong word; determining the similarity between each candidate word and the wrong word, where the similarity is obtained from the longest common subsequence and/or the longest common substring of the candidate word and the wrong word; determining the language environment probability of each candidate word at the position of the wrong word; obtaining error information of the wrong word relative to each candidate word; and determining the evaluation score of each candidate word from the similarity, the language environment probability and the error information. The embodiments of the invention address the low reliability of candidate word evaluation and help improve the reliability of candidate word evaluation results.

Owner:GUANGZHOU SHIYUAN ELECTRONICS CO LTD

Method and system for obtaining word pair translation from bilingual sentence

Inactive | CN100524293C | Reduce workload | Improve the efficiency of obtaining translations | Special data processing applications | Resource pool | Longest common substring problem

The invention provides a method for obtaining a word-pair translation from bilingual sentence pairs. The method includes the following steps: A. receiving a lemma to be processed; B. searching an index resource pool for candidate bilingual sentence pairs according to the lemma; C. selecting two bilingual sentence pairs from the search results and obtaining the longest common substring of the two sentences that are in the same language as the lemma; D. judging whether this substring is consistent with the lemma; if not, selecting another two bilingual sentence pairs from the search results and repeating step C; if so, E. obtaining the longest common substring of the corresponding sentences, in the other language, of the two bilingual sentence pairs. Using an index reduces the data processing workload and improves the efficiency of obtaining translations. The invention also provides a system for obtaining word-pair translations from bilingual sentence pairs.

Owner:BEIJING KINGSOFT SOFTWARE +2
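
Steps C to E of the entry above can be sketched in Python. Several things here are assumptions for illustration: the sentence pairs are passed in directly instead of being fetched from an index resource pool, the comparison works on whitespace-separated tokens rather than characters, and the function names and example sentences are invented. The lemma is called `term` in the code.

```python
from itertools import combinations
from typing import List, Optional, Sequence, Tuple


def longest_common_run(a: Sequence[str], b: Sequence[str]) -> List[str]:
    """Longest run of consecutive tokens shared by a and b (standard DP)."""
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return list(a[best_end - best_len:best_end])


def extract_translation(term: str, pairs: Sequence[Tuple[str, str]]) -> Optional[str]:
    """Try every two (source, target) pairs until the source sides share exactly `term`."""
    term_tokens = term.split()
    for (src1, tgt1), (src2, tgt2) in combinations(pairs, 2):
        # Steps C and D: the common part of the source sentences must be the term.
        if longest_common_run(src1.split(), src2.split()) != term_tokens:
            continue
        # Step E: the common part of the target sentences is the candidate translation.
        common_target = longest_common_run(tgt1.split(), tgt2.split())
        if common_target:
            return " ".join(common_target)
    return None


# Invented example sentences, not data from the patent.
pairs = [
    ("I bought a red car yesterday", "ayer compré un coche rojo"),
    ("the red car is fast", "el coche rojo es rápido"),
]
translation = extract_translation("red car", pairs)
```

The check in step D matters because two sentences can share more than the term itself; requiring the common source part to equal the term makes it more likely that the common target part is its translation and nothing else.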

Candidate word evaluation method and apparatus, computer equipment and storage medium

Inactive | CN108733645A | Improve accuracy | Natural language data processing | Special data processing applications | Longest common subsequence problem | Longest common substring problem

The invention relates to a candidate word evaluation method and apparatus, computer equipment and a storage medium, applied in the field of data processing. The method comprises the steps of: detecting a wrong word and acquiring a plurality of candidate words corresponding to the wrong word; determining the edit distance between each candidate word and the wrong word; determining the similarity between each candidate word and the wrong word, wherein the similarity is obtained from the longest common subsequence and/or the longest common substring of the candidate word and the wrong word; determining the language environment probability of each candidate word at the position of the wrong word; acquiring error information of the wrong word relative to each candidate word; and determining the evaluation score of each candidate word from the edit distance, the similarity, the language environment probability and the error information. The method and apparatus address the relatively low reliability of candidate word evaluation and thereby improve the reliability of candidate word evaluation results.

Owner:GUANGZHOU SHIYUAN ELECTRONICS CO LTD

© 2025 PatSnap. All rights reserved.