44 results about "Sequence alignment algorithm" patented technology

Determination of optimal local sequence alignment similarity score

Inactive | US7917302B2 | Low cost | Shorten the time | Microbiological testing/measurement | Recombinant DNA-technology | Local sequence alignment | Protein function prediction

Sequence alignment and sequence database similarity searching are among the most important and challenging tasks in bioinformatics, and are used for several purposes, including protein function prediction. An efficient parallelisation of the Smith-Waterman sequence alignment algorithm using parallel processing in the form of SIMD (Single-Instruction, Multiple-Data) technology is presented. The method has been implemented using the MMX (MultiMedia eXtensions) and SSE (Streaming SIMD Extensions) technology that is embedded in Intel's latest microprocessors, but the method can also be implemented using similar technology existing in other modern microprocessors. An optimised eight-way parallel processing approach achieved a near eight-fold speed-up relative to the fastest previously known non-parallel Smith-Waterman implementation on the same hardware. A speed of about 200 million cell updates per second has been obtained on a single Intel Pentium III 500 MHz microprocessor.

Owner:SEEBERG ERLING CHRISTEN +1
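
The abstract above describes accelerating the Smith-Waterman recurrence rather than changing it. As a point of reference, here is a minimal scalar sketch of that recurrence in Python. It assumes a linear gap penalty and the illustrative scores `match=2`, `mismatch=-1`, `gap=-2`; the function name and these values are assumptions, not taken from the patent, and the SIMD parallelisation itself is not reproduced.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the optimal local alignment score of sequences a and b.

    Scoring values are illustrative assumptions (linear gap penalty).
    Only two rows of the score matrix are kept in memory.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            # One "cell update": the score never drops below zero,
            # which is what makes the alignment local.
            curr[j] = max(0,
                          prev[j - 1] + sub,   # align a[i-1] with b[j-1]
                          prev[j] + gap,       # gap in b
                          curr[j - 1] + gap)   # gap in a
            best = max(best, curr[j])
        prev = curr
    return best
```

Each inner-loop iteration is one cell update, the unit in which the abstract reports throughput.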

Industrial control private agreement-based fuzzy test method

Active | CN107241226A | Data switching networks | Sequence alignment algorithm | Recognition algorithm

The invention discloses a fuzz testing method for private industrial control protocols. A protocol tree for the private protocol is constructed from private protocol data flows captured in a normal industrial control network environment, using a private protocol tree construction algorithm, so that request messages and response messages are effectively classified. Basic protocol information is learned, and protocol characteristics are learned by counting the data sequences of the individual classes and applying probability statistics, a length-field recognition algorithm, the Apriori association rule algorithm and the Needleman-Wunsch pairwise sequence alignment algorithm. The different protocol characteristics are mutated according to mutation rules to generate test cases. During testing, the connection to the device under test is monitored, and the device's response data are checked against the request and response characteristics. The method addresses the efficiency of fuzz testing private industrial control protocols and improves the effectiveness of the test cases. The method comprises a data preprocessing module, a protocol learning module, a fuzz testing module and an exception alarm module.

Owner:BEIJING UNIV OF TECH

Determination of optimal local sequence alignment similarity score

Inactive | US20040024536A1 | Microbiological testing/measurement | Recombinant DNA-technology | Local sequence alignment | Protein function prediction

Sequence alignment and sequence database similarity searching are among the most important and challenging tasks in bioinformatics, and are used for several purposes, including protein function prediction. An efficient parallelisation of the Smith-Waterman sequence alignment algorithm using parallel processing in the form of SIMD (Single-Instruction, Multiple-Data) technology is presented. The method has been implemented using the MMX (MultiMedia eXtensions) and SSE (Streaming SIMD Extensions) technology that is embedded in Intel's latest microprocessors, but the method can also be implemented using similar technology existing in other modern microprocessors. An optimised eight-way parallel processing approach achieved a near eight-fold speed-up relative to the fastest previously known non-parallel Smith-Waterman implementation on the same hardware. A speed of about 200 million cell updates per second has been obtained on a single Intel Pentium III 500 MHz microprocessor.

Owner:SEEBERG ERLING CHRISTEN +1

Method of application classification in Tor anonymous communication flow

Active | CN104135385A | Reduce load | Implement application classification | Data switching networks | Traffic capacity | Sequence alignment algorithm

The invention discloses a method of application classification in Tor anonymous communication flow. It mainly solves the problem of acquiring upper-layer application type information from Tor anonymous communication flow, and involves related techniques such as feature selection, sampling preprocessing and flow modeling. The method comprises the following steps: firstly, defining the concept of a flow burst section based on Tor's data packet scheduling mechanism, and using the volume value and direction of the flow burst section as classification features; secondly, preprocessing the data samples with a K-means clustering algorithm and a multiple sequence alignment algorithm, which addresses over-fitting and inconsistent sample lengths through value symbolization and gap insertion; and lastly, modeling the uplink and downlink Tor anonymous communication flows of different applications separately with profile hidden Markov models, with a heuristic algorithm provided to build the profile hidden Markov models quickly. During classification, the features of the network flow to be classified are fed into the profile hidden Markov models of the different applications, the probabilities under the uplink flow model and the downlink flow model are computed, and the upper-layer application type carried by the Tor anonymous communication flow is decided by the maximum joint probability.

Owner:NANJING MUNICIPAL PUBLIC SECURITY BUREAU (南京市公安局)

Ransomware variation detection method based on sequence alignment algorithm

Active | CN107679403A | Achieve exact alignment | Reduced sample size | Character and pattern recognition | Platform integrity maintenance | Sequence alignment algorithm | Sequence processing

The invention provides a ransomware variation detection method based on a sequence alignment algorithm. The method comprises the specific steps of inputting a ransomware sample, extracting a sample feature sequence, processing the sample feature sequence into a gene sequence, and detecting a ransomware variation. The step of variation detection specifically comprises the sub-steps of clustering each gene sequence in a sample set and extracting clustering result information to acquire various ransomware families; using the Needleman-Wunsch sequence alignment algorithm to compute the similarity between a sample to be detected and the cluster center sample of each ransomware family, screening out clusters with a similarity above a preset threshold, and using the screened clusters to form a new ransomware training sample set; and determining the ransomware family to which the sample to be detected belongs by using the newly screened training sample set in combination with the sequence alignment algorithm and a KNN classification algorithm, thereby achieving variation detection. By combining the sequence alignment algorithm with an existing classification algorithm, the method detects ransomware variations quickly.

Owner:BEIJING INSTITUTE OF TECHNOLOGY +1
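
The screening step in the abstract above (score a sample against each family's cluster center, keep the families above a threshold) can be sketched as follows. This is only an illustration under stated assumptions: feature sequences are plain strings, the scores `match=1`, `mismatch=-1`, `gap=-1`, the length normalisation and the threshold value are all invented for the example, and the clustering and KNN steps are not covered.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-1):
    """Return the optimal global alignment score of a and b (linear gaps)."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(prev[j - 1] + sub, prev[j] + gap, curr[j - 1] + gap)
        prev = curr
    return prev[len(b)]


def similarity(a, b):
    """Score divided by the longer length: 1.0 for identical sequences.

    The normalisation is an assumption and relies on match=1.
    """
    longest = max(len(a), len(b))
    return needleman_wunsch_score(a, b) / longest if longest else 1.0


def screen_families(sample, centers, threshold=0.5):
    """Return the family names whose center sequence exceeds the threshold.

    `centers` maps a (hypothetical) family name to its cluster-center sequence.
    """
    return [name for name, center in centers.items()
            if similarity(sample, center) > threshold]
```

For instance, `screen_families("ABCD", {"fam1": "ABCE", "fam2": "WXYZ"}, threshold=0.4)` keeps only `"fam1"`, since one mismatch in four positions gives a similarity of 0.5 while the unrelated sequence scores -1.0.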

Test-case selection method based on user sessions and hierarchical clustering algorithm

Inactive | CN108388508A | Good choice | Improve production efficiency | Character and pattern recognition | Software testing/debugging | Sequence alignment algorithm | Hierarchical cluster algorithm

The invention discloses a test-case selection method based on user sessions and a hierarchical clustering algorithm. The method includes the following steps: acquiring server access logs and sorting them by time; preprocessing and clustering the logs to form a set of user session sequences; calculating similarity distances among all user session sequences using an improved user-session-sequence comparison algorithm; clustering the user session sequences with an improved agglomerative hierarchical clustering algorithm and outputting the final clustering results as test cases; and optimizing the selection of test cases by deleting redundant ones. With this method, representative user operation sequences can be quickly mined from a large number of server access logs for use as test cases, test-case generation is automated and test-case selection is optimized, and subsequent work such as automated functional testing of the server, performance testing and user behavior analysis is facilitated.

Owner:SOUTH CHINA UNIV OF TECH
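
A rough sketch of the pipeline in the abstract above (pairwise session distances, agglomerative clustering, one representative per cluster) is given below. The patent's "improved" comparison and clustering algorithms are not specified in the abstract, so this example swaps in stand-ins: `difflib.SequenceMatcher` for the distance and plain average linkage for the clustering. All function names, the threshold and the sample URLs are hypothetical.

```python
from difflib import SequenceMatcher


def session_distance(s, t):
    """Distance in [0, 1] between two sessions (lists of requested URLs).

    Stand-in for the patent's unspecified comparison algorithm.
    """
    return 1.0 - SequenceMatcher(None, s, t).ratio()


def agglomerative_clusters(sessions, max_distance):
    """Merge the two closest clusters (average linkage) until none is close enough.

    Returns clusters as lists of indices into `sessions`.
    """
    clusters = [[i] for i in range(len(sessions))]

    def linkage(c1, c2):
        total = sum(session_distance(sessions[i], sessions[j])
                    for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        d, x, y = min((linkage(clusters[x], clusters[y]), x, y)
                      for x in range(len(clusters))
                      for y in range(x + 1, len(clusters)))
        if d > max_distance:
            break
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


def select_test_cases(sessions, max_distance=0.5):
    """Keep the first session of each cluster and drop the rest as redundant."""
    return [sessions[c[0]] for c in agglomerative_clusters(sessions, max_distance)]
```

With sessions such as `["/login", "/cart", "/pay"]` and `["/login", "/cart", "/pay", "/logout"]`, the two near-identical sessions fall into one cluster and only the first is kept as a test case.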

Genomic sequence alignment method and genomic sequence alignment device

Active | CN106682393A | Shorten comparison time | High speed | Proteomics | Genomics | Reference genome sequence | Sequence alignment algorithm

The invention discloses a genomic sequence alignment method and a genomic sequence alignment device. The method includes: reading part of the genomic sequences from the to-be-aligned genomic sequence files; aligning that part against a reference genomic sequence according to a two-way BWT alignment algorithm, a single-end dynamic programming alignment algorithm and a double-end dynamic programming alignment algorithm; after alignment is finished according to any of the alignment algorithms, if no sequence in that part has failed to align, reading a new part of the genomic sequences from the to-be-aligned genomic sequence files and aligning it according to the same steps; repeating the steps until all of the to-be-aligned genomic sequence files have been aligned, and outputting the alignment results. The method and device address the high time consumption, low processing speed and high resource consumption of genomic sequence alignment algorithms.

Owner:UNITED ELECTRONICS
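
The abstract above names a two-way BWT alignment algorithm without detailing it. The sketch below shows only the basic one-directional idea that such methods build on: exact matching by backward search over a Burrows-Wheeler transform. The index is built naively by sorting suffixes, the names `bwt_index` and `backward_search` are hypothetical, and the text is assumed not to contain the sentinel `$`.

```python
def bwt_index(text):
    """Build a toy BWT index: suffix array, first-column offsets, occurrence table."""
    text += "$"                                   # sentinel, sorts before A/C/G/T
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    bwt = "".join(text[i - 1] for i in sa)        # character preceding each suffix

    counts = {}
    for ch in text:
        counts[ch] = counts.get(ch, 0) + 1
    first, total = {}, 0                          # first[c]: rank of first suffix starting with c
    for ch in sorted(counts):
        first[ch] = total
        total += counts[ch]

    occ = {ch: [0] * (len(bwt) + 1) for ch in counts}   # occ[c][k]: count of c in bwt[:k]
    for k, ch in enumerate(bwt):
        for c in occ:
            occ[c][k + 1] = occ[c][k] + (1 if c == ch else 0)
    return sa, first, occ


def backward_search(pattern, index):
    """Return sorted start positions of exact occurrences of pattern."""
    sa, first, occ = index
    lo, hi = 0, len(sa)                           # half-open suffix-array interval
    for ch in reversed(pattern):
        if ch not in first:
            return []
        lo = first[ch] + occ[ch][lo]
        hi = first[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(sa[lo:hi])
```

The pattern is consumed from its last character to its first, and the interval `[lo, hi)` narrows at each step, so the cost of a query depends on the pattern length rather than on the reference length.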

Social network association searching method based on graphics processing unit (GPU) multiple sequence alignment algorithm

Inactive | CN102651030A | Solve the large amount of data | Solve complexity | Special data processing applications | Search problem | Distance matrix

The invention discloses a social network association searching method based on a graphics processing unit (GPU) multiple sequence alignment algorithm. The method comprises the following steps: a central processing unit (CPU) crawls individual web pages to extract individual characteristic vectors from a social network; the CPU filters redundant characteristic information from the individual characteristic vectors to generate a uniform individual characteristic information vector base; a GPU calculates an individual distance matrix and a correction distance matrix of the social network from the uniform individual characteristic information vector base; the GPU builds a social network association route guidance tree from the correction distance matrix; and the GPU traverses the guidance tree to search for the optimal association route. Because the GPU is well suited to processing large amounts of dense data, the association searching problems solved by the multiple sequence alignment algorithm are parallelized: complex and time-consuming operations, such as building and traversing the matrices and the association route guidance tree, are performed by the GPU, which addresses the long running time caused by the large volume of social network data and the complexity of the operations.

Owner:HUAZHONG UNIV OF SCI & TECH

Unknown protocol message format deduction method

Active | CN104935567A | Inferred valid | Reduce workload | Transmission | Sequence alignment algorithm | Prior information

The present invention provides an unknown protocol message format deduction method. The method comprises the steps of capturing original data packets in the network; building a sequence alignment binary tree according to the lengths of the data packets; performing sequence alignment upward from the leaf nodes of the binary tree, using a sequence alignment algorithm based on dynamic programming; obtaining, once all nodes have been aligned, a result in which the leaf nodes are aligned to the same length; and searching that result for the parts that are identical, thereby automatically deducing and outputting the unknown protocol message format. Compared with existing unknown data packet format deduction methods that require manual participation, the automatic method based on data packet sequence alignment reduces the manual workload once the number of captured data packets has been determined, and can effectively deduce an unknown protocol data packet format without any prior information about that format.

Owner:SOUTHWEST CHINA RES INST OF ELECTRONICS EQUIP
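
To illustrate the "align, then look for the identical parts" idea in the abstract above, the sketch below aligns just two packets with a dynamic-programming global alignment and marks the columns on which they agree. It is a simplification under stated assumptions: the patent aligns many packets progressively along a binary tree, whereas this example handles one pair; the scores, the function names and the `??` placeholder for variable bytes are all hypothetical.

```python
GAP = None  # marker for an inserted gap in an aligned column


def align_bytes(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align two byte strings and return the aligned columns.

    Each column is a pair (byte_from_a, byte_from_b); one side may be GAP.
    """
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    columns, i, j = [], n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                columns.append((a[i - 1], b[j - 1]))
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            columns.append((a[i - 1], GAP))
            i -= 1
        else:
            columns.append((GAP, b[j - 1]))
            j -= 1
    columns.reverse()
    return columns


def infer_format(a, b):
    """Show shared bytes as hex and differing or gapped positions as '??'."""
    return " ".join(f"{x:02x}" if x == y else "??" for x, y in align_bytes(a, b))
```

For two packets `01 02 0a ff` and `01 02 0b 0c ff`, the result is `01 02 ?? ?? ff`: a constant two-byte prefix, a variable-length middle and a constant trailer.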

Large-scale ontology mapping method for Chinese languages

Inactive | CN104699767A | Special data processing applications | Sequence alignment algorithm | Degree of similarity

The invention provides a mapping method for large-scale Chinese ontologies. The method comprises the following steps: initializing a concept correlation degree computation that integrates a Chinese thesaurus with an edit distance similarity algorithm; compressing the scale of the large-scale ontology mapping on the basis of a pseudo-nuclear-force field potential function that integrates concept similarity and dissimilarity and is improved by the initial correlation degree; and measuring the similarity of complex concepts in the Chinese ontology by introducing a global sequence alignment algorithm. Chinese words exhibit polysemy and sensitivity to word order, and the computing cost of large-scale ontology mapping is high. Firstly, the method improves the existing pseudo-nuclear-force field potential function, so that the measurement of similarity among concepts and the scale compression of the ontology to be mapped are more reasonable. Secondly, it adopts a global sequence alignment technique to map complex Chinese concepts, which overcomes further defects of traditional Chinese ontology mapping systems. As a result, the mapping efficiency of the system is improved, and the precision and recall are increased.

Owner:CAPITAL UNIV OF ECONOMICS & BUSINESS

Method and system for sensing abnormal signs in daily activities

Inactive | US7847682B2 | Efficiently sensed | Data processing applications | Medical automated diagnosis | Sequence alignment algorithm | Older people

There are provided a method and system for sensing abnormal signs in daily activities, the method comprising, at the system, sensing the daily activities, reading previously stored daily activity information, generating a daily activity sequence based thereon, sensing the abnormal signs from the daily activity sequence by using a preset sequence alignment algorithm, and providing the sensed abnormal signs to a user. As described above, the abnormal signs, which should be checked to provide care services, are sensed via changes in a daily activity pattern and added to a care service system that will be installed in welfare facilities for the aged or a home of a solitary old person, thereby effectively sensing the abnormal signs in daily activities of the aged.

Owner:ELECTRONICS & TELECOMM RES INST

Prediction method for signal peptide and cleavage site thereof on the basis of layered mixture model

Active | CN106951735A | Reduce false positives | High sensitivity | Biostatistics | Sequence analysis | Protein insertion | Predictive methods

The invention discloses a prediction method for signal peptides and their cleavage sites on the basis of a layered mixture model. The prediction method comprises the following steps: firstly, in a first layer, applying an SVM (Support Vector Machine) classifier based on amino acid residue features to identify whether a protein sequence contains an N-terminal hydrophobic fragment; then, in a second layer, applying Naive Bayes and SVM classifiers based on amino acid residue features and functional domain features to identify whether the hydrophobic fragment is a signal peptide or an N-terminal transmembrane helix; and finally, in a third layer, screening candidate cleavage sites according to a statistical learning rule, calculating a statistical credit score, calculating the similarity score of the signal peptide sequence with the Needleman-Wunsch sequence alignment algorithm, and determining the predicted signal peptide cleavage site by integrating the statistical credit score and the sequence similarity score.

Owner:SHANGHAI JIAO TONG UNIV

Multiple sequence alignment visualization method based on image processing

Inactive | CN108052799A | Eliminate noise interference | The segmentation result is accurate | Image enhancement | Image analysis | Pattern recognition | Color transformation

The invention relates to a multiple sequence alignment visualization method based on image processing. The method includes the following steps: S1, taking multiple amino acid sequences generated by a multiple sequence alignment algorithm as input; S2, defining a different color for each type of amino acid, and performing color conversion on the amino acid sequences; S3, applying image conversion so that each amino acid in the amino acid sequences corresponds to one pixel in the image and the color of each pixel corresponds to that of its amino acid, which converts multiple one-dimensional amino acid sequences into a two-dimensional colored image; S4, segmenting the converted image with an image segmentation method based on edge detection, and presenting the segmented image to a user.

Owner:SUN YAT SEN UNIV
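
Steps S2 and S3 of the abstract above (one color per amino acid type, one pixel per residue) can be sketched in a few lines. The grouping of residues into four chemical classes and the RGB values are assumptions chosen for illustration; the patent's actual palette is not given in the abstract, and the edge-detection segmentation of step S4 is not covered.

```python
# Hypothetical palette: one RGB color per broad class of amino acid.
CLASS_COLORS = {
    "AVLIMFWPG": (255, 165, 0),   # nonpolar
    "STCYNQ": (0, 160, 0),        # polar, uncharged
    "KRH": (0, 0, 255),           # basic
    "DE": (255, 0, 0),            # acidic
    "-": (255, 255, 255),         # alignment gap
}
UNKNOWN = (128, 128, 128)         # any symbol not listed above

PALETTE = {residue: rgb
           for group, rgb in CLASS_COLORS.items()
           for residue in group}


def alignment_to_image(rows):
    """Convert aligned sequences into a 2-D grid of RGB pixels.

    Row i of the result corresponds to sequence i, and column j to
    alignment column j, so each residue maps to exactly one pixel.
    """
    if len({len(row) for row in rows}) > 1:
        raise ValueError("aligned rows must all have the same length")
    return [[PALETTE.get(residue.upper(), UNKNOWN) for residue in row]
            for row in rows]
```

The nested list can be handed to any imaging library for display; conserved alignment columns then appear as vertical stripes of a single color.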

Method and system for sensing abnormal signs in daily activities

Inactive | US20090066501A1 | Efficiently sensed | Data processing applications | Medical automated diagnosis | Sequence alignment algorithm | Older people

There are provided a method and system for sensing abnormal signs in daily activities, the method comprising, at the system, sensing the daily activities, reading previously stored daily activity information, generating a daily activity sequence based thereon, sensing the abnormal signs from the daily activity sequence by using a preset sequence alignment algorithm, and providing the sensed abnormal signs to a user. As described above, the abnormal signs, which should be checked to provide care services, are sensed via changes in a daily activity pattern and added to a care service system that will be installed in welfare facilities for the aged or a home of a solitary old person, thereby effectively sensing the abnormal signs in daily activities of the aged.

Owner:ELECTRONICS & TELECOMM RES INST

Gene sequence alignment method and system

Pending | CN112735528A | High speed | High precision | Biostatistics | Sequence analysis | Reference genome sequence | Data set

The invention discloses a gene sequence alignment method and system. The method comprises the following steps: storing a reference genome sequence and a query genome sequence in a distributed storage system; under a Spark heterogeneous distributed computing platform framework, segmenting a reference genome sequence according to row offset, and preprocessing to obtain a plurality of preprocessed reference data sets; establishing an index for each preprocessing reference data set by adopting a suffix array algorithm, and combining all the preprocessing reference data sets after the index is established to obtain a reference sequence index file; carrying out CUDA fine-grained sequence comparison on each fragment in the query genome sequence and a reference sequence index file by adopting a seed extension algorithm, and determining position information of each fragment in the reference sequence index file; and combining the position information of all the fragments in the reference sequence index file to obtain a gene sequence comparison result. According to the invention, the calculation speed and precision of a large-scale sequence alignment algorithm are improved.

Owner:HUAZHONG AGRI UNIV
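
The indexing and seed-extension steps in the abstract above can be illustrated on a single machine as follows. This is a minimal sketch, not the patented pipeline: the suffix array is built by naive sorting, extension is gapless with a mismatch budget, and the Spark partitioning and CUDA kernels are left out. The function names, `seed_len` and `max_mismatches` are hypothetical.

```python
def build_suffix_array(ref):
    """Return the start positions of all suffixes of ref in sorted order."""
    return sorted(range(len(ref)), key=lambda i: ref[i:])


def find_seed(ref, sa, seed):
    """Return sorted reference positions where seed occurs exactly."""
    k = len(seed)
    lo, hi = 0, len(sa)
    while lo < hi:                       # binary search for the first suffix >= seed
        mid = (lo + hi) // 2
        if ref[sa[mid]:sa[mid] + k] < seed:
            lo = mid + 1
        else:
            hi = mid
    hits = []
    while lo < len(sa) and ref[sa[lo]:sa[lo] + k] == seed:
        hits.append(sa[lo])
        lo += 1
    return sorted(hits)


def map_read(ref, sa, read, seed_len=4, max_mismatches=1):
    """Seed with the read's first seed_len bases, then extend without gaps.

    Returns (position, mismatches) for every seed hit whose full-length
    extension stays within the mismatch budget.
    """
    results = []
    for pos in find_seed(ref, sa, read[:seed_len]):
        window = ref[pos:pos + len(read)]
        if len(window) < len(read):
            continue                     # read would run off the reference
        mismatches = sum(1 for x, y in zip(window, read) if x != y)
        if mismatches <= max_mismatches:
            results.append((pos, mismatches))
    return results
```

Only the positions returned by the seed lookup are examined in the extension step, which is what keeps seed-and-extend much cheaper than aligning the read against every reference position.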

Gesture identity authentication system and method based on sensor on mobile phone

Inactive | CN105530357A | Simple and convenient registration | Ease of use | Unauthorised/fraudulent call prevention | Sequence alignment algorithm | Environment effect

The invention provides a gesture identity authentication system and method based on a sensor on a mobile phone, and relates to the field of identity authentication based on sensors on mobile phones. The gesture identity authentication system comprises an acceleration sensor used for recording real-time acceleration information of a user gesture in a moving process; a direction sensor used for recording azimuth angle information of the user gesture in the moving process; a preprocessing module used for carrying out filtering denoising and equal frequency sampling on the information recorded in the acceleration sensor and the direction sensor; a calculation module used for respectively calculating matching scores of the acceleration information and the azimuth angle information via a global sequence alignment algorithm, calculating a threshold through the matching scores and gesture information made by the user again, and then comparing the user gesture information input at each time with the threshold; and a template base module used for storing original samples of all user gestures and storing the matching scores and the threshold calculated by the calculation module. The gesture identity authentication system provided by the invention adopts no additional device to serve as support, is scarcely influenced by environmental factors and is safe and convenient to carry out identity authentication of the user on the mobile phone.

Owner:WUHAN UNIV OF TECH +1

Flexible distributed sequence alignment system and method based on Spark and SIMD

Inactive | CN107358061A | Improve performance | Solve the problem of limited scalability | Special data processing applications | Bioinformatics | Sequence alignment algorithm | Distributed memory

The invention discloses a flexible distributed sequence alignment system based on Spark and SIMD. The system includes a master node and multiple working nodes connected to the master node. The master node is used for management of metadata and clusters, and includes the master node of the distributed computing framework Spark, the master node of a distributed in-memory file system and the master node of the Hadoop distributed file system. The working nodes are used for data storage and calculation, and each includes a storage layer and a calculation layer; the storage layer includes Alluxio and HDFS, the calculation layer includes Spark and a SIMD instruction set, and Spark calls a SIMD-based sequence alignment algorithm through a mediation module to perform sequence alignment. Alluxio and HDFS are used for distributed storage of data, Spark is used for distributed calculation, and SIMD technology is used at each node for sequence alignment, which improves performance.

Owner:UNIV OF SCI & TECH OF CHINA

Industrial control protocol reverse analysis method based on active learning

Pending | CN111723181A | Improve accuracy | Increase coverage | Machine learning | Text database querying | Reverse analysis | Sequence alignment algorithm

The invention discloses an industrial control protocol reverse analysis method based on active learning. The method comprises the steps of importing, preliminary analysis, variation, matching and merging. An industrial control protocol pcap message sample is first subjected to preliminary analysis, which yields a partial message format and state machine of the industrial control protocol; the result is then used for interactive active learning with the industrial personal computer to continuously obtain new messages, so that the lexical rules and grammar of the protocol can be deduced more accurately and completely. A Needleman-Wunsch sequence alignment algorithm is adopted for the reverse analysis of the protocol; the algorithm deduces the format and state machine of the protocol through similarity scoring and optimal backtracking steps. The method effectively guarantees the accuracy of the analysis result: in combination with the active learning process, each response message is matched against the protocol formats in the preliminary analysis result to determine whether it conforms to them, matching is repeated as needed, and the accuracy and coverage of the reverse analysis of the industrial control protocol are substantially improved.

Owner:NAT COMP NETWORK & INFORMATION SECURITY MANAGEMENT CENT +1

Industrial control protocol reverse analysis method based on semantic pre-mining

Inactive | CN111585832A | Improve recognition rate | Avoid failure | Semantic analysis | Data switching networks | Reverse analysis | Timestamp

The invention discloses an industrial control protocol reverse analysis method based on semantic pre-mining, which optimizes the reverse analysis result for an industrial control data sample by pre-mining semantics such as timestamps, lengths and serial numbers and then dividing fields before the protocol format reverse analysis is carried out. The basic idea of the method is as follows: when protocol format analysis is performed on a target industrial control data sample, the sample set to be analyzed is clustered according to message length, the different types of messages are analyzed for fields such as timestamps, lengths and serial numbers, and the discovered semantic fields are replaced with wildcard characters; after the semantic pre-analysis is completed, a Needleman-Wunsch sequence alignment algorithm is adopted to analyze the data sample; and finally, the semantic results obtained by the pre-analysis are substituted back into the analysis result, so that the accuracy of the analysis result is improved. The method has the advantages of accurate analysis results, a high semantic recognition rate and the like.

Owner:ZHEJIANG SHUREN COLLEGE ZHEJIANG SHUREN UNIV

Mobile code decoding fault recovery via history data analysis

Inactive | US8640952B2 | Code conversion | Character and pattern recognition | Sequence alignment algorithm | Mobile code

An apparatus and method for mobile code decoding fault recovery are provided. The method includes scanning a mobile code, decoding the mobile code, when the mobile code is not decoded successfully, storing a decoded portion of the mobile code as a partially decoded mobile code, and decoding the mobile code based on a sequence alignment algorithm and a predetermined number of partially decoded mobile codes, and providing the decoded mobile code to the mobile device.

Owner:SAMSUNG ELECTRONICS CO LTD

Re-sequencing sequence alignment method based on Spark framework

Inactive | CN110136777A | Improve analysis efficiency | Sequence analysis | Instruments | Re sequencing | Sequence alignment algorithm

The invention relates to the technical field of computer science and bioinformatics, and particularly to a re-sequencing sequence alignment method based on the Spark framework. The method comprises three steps: an RDD creation step, a Map step and a Reduce step. The corresponding RDDs are created from the FASTQ file and stored in HDFS. The BWA sequence alignment algorithm is then applied to each RDD, with the RDDs mapped across multiple nodes. Finally, whether to execute a final combining step is determined according to the processing requirement. The method integrates BWA, the sequence aligner used in the re-sequencing step, into the Spark big data processing framework and optimizes the re-sequencing procedure through distributed computation, thereby effectively improving the efficiency of re-sequencing data analysis.

Owner:SHENZHEN INST OF ADVANCED TECH

Method for extracting information from error OCR result

Pending | CN113221936A | Good compatibility | Reduce Typo Penalties | Character and pattern recognition | Pattern recognition | Sequence alignment algorithm

The invention applies to the technical field of image text processing and provides a method for extracting information from erroneous OCR results. The method comprises the following steps: obtaining the result of extracting image text through OCR; post-processing the OCR results and merging them into lines; defining an extraction template according to the information extraction target; fuzzy-matching the template against all OCR lines using an optimized global sequence alignment algorithm; refining the matching alignment result using a library of similar-shaped characters; and extracting the target information according to the matching alignment result. The invention also provides a method for generating the similar-character library with a neural network recognition model; with this library, the information carried by wrongly recognized characters can be used more effectively, improving extraction precision. Compared with the prior art, the method effectively handles OCR errors and greatly improves information extraction when characters are missing, extra, or wrong.
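A rough illustration of template-to-line fuzzy matching with a reduced penalty for look-alike characters follows. It uses the textbook Needleman-Wunsch global alignment score, not the patent's "optimized" variant, and the scoring values, the `similar` pair set, and the function names are all assumptions made for the example.

```python
def global_align_score(template, line, similar=frozenset(),
                       match=2, near=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment score between two strings."""
    def sub(a, b):
        if a == b:
            return match
        if (a, b) in similar or (b, a) in similar:
            return near  # visually confusable pair: small reward instead of a penalty
        return mismatch

    prev = [j * gap for j in range(len(line) + 1)]
    for i, a in enumerate(template, start=1):
        curr = [i * gap]
        for j, b in enumerate(line, start=1):
            curr.append(max(prev[j - 1] + sub(a, b),  # match or substitution
                            prev[j] + gap,            # template char aligned to a gap
                            curr[j - 1] + gap))       # line char aligned to a gap
        prev = curr
    return prev[-1]

def best_line(template, lines, similar=frozenset()):
    """Return the OCR line that aligns best with the template."""
    return max(lines, key=lambda ln: global_align_score(template, ln, similar))

SIMILAR = frozenset({("o", "0"), ("l", "1")})  # hypothetical similar-shape library
print(best_line("Total", ["Date 2021", "T0tal 42", "Name"], SIMILAR))
```

Treating a confusable pair as a near-match is one simple way to let a misrecognized character still contribute evidence, which is the role the abstract assigns to the similar-character library.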

Owner:上海兑观信息科技技术有限公司

High-concurrency sequence alignment calculation acceleration method based on CPU + GPU heterogeneous computing

Active | CN114064551A | Improve performance | Make full use of concurrent computing capabilities | Program initiation/switching | Multiple digital computer combinations | Computational science | Sequence alignment algorithm

The invention discloses a high-concurrency sequence alignment acceleration method based on a heterogeneous CPU + GPU architecture. The method comprises the following steps: refactoring the BWA-MEM algorithm code; task-level concurrency on the CPU, in which the sequence set is divided to form a first set of concurrent tasks; running the refactored BWA-MEM algorithm and processing the data concurrently on the GPU; and task-level concurrency on the GPU, in which, for the seed sets and chains generated during sequence alignment, seed sets with the same or adjacent length, position, and quantity are assigned to the same data block and chain and processed identically, completing the division of seed sets and chains and forming a second set of concurrent tasks. By combining task parallelism with data parallelism, the method closely matches the characteristics of the BWA-MEM algorithm to those of GPU accelerators, makes full use of the GPU's concurrent computing capability, and improves both the performance of the sequence alignment algorithm and the efficiency of high-concurrency processing.
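The idea of grouping work items of similar size so that each GPU block does uniform work can be sketched on the host side. The example below buckets seeds by length only; the abstract also mentions position and quantity, which are omitted here. The bucket width, batch size, and function name are assumptions, and no GPU code is involved.

```python
from collections import defaultdict

def batch_seeds_by_length(seeds, bucket_width=8, max_batch=4):
    """Group (read_id, seed_len) pairs into batches of similar seed length."""
    buckets = defaultdict(list)
    for read_id, seed_len in seeds:
        # Seeds whose lengths fall in the same width-sized range share a bucket.
        buckets[seed_len // bucket_width].append((read_id, seed_len))
    batches = []
    for key in sorted(buckets):
        group = buckets[key]
        # Cap each batch so one bucket can be spread over several blocks.
        for start in range(0, len(group), max_batch):
            batches.append(group[start:start + max_batch])
    return batches

seeds = [("a", 19), ("b", 33), ("c", 21), ("d", 35), ("e", 20)]
for batch in batch_seeds_by_length(seeds, bucket_width=8, max_batch=2):
    print(batch)
```

Batches with near-equal seed lengths tend to finish at about the same time, which is the usual motivation for this kind of grouping on SIMT hardware.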

Owner:GUANGZHOU JIAJIAN MEDICAL TESTING CO LTD

Third generation sequencing alignment algorithm

Inactive | CN108699601A | Microbiological testing/measurement | Data visualisation | Sequence alignment algorithm | Algorithm

Methods, software, and systems for aligning a read sequence to a reference sequence are disclosed. In certain embodiments, the methods, software, and systems involve determining similarity of distribution of k-mers between a region of the read sequence and a region of the reference sequence in order to determine whether the region of the read sequence maps to the region of the reference sequence.

Owner:THE BOARD OF TRUSTEES OF THE LELAND STANFORD JUNIOR UNIV

Third Generation Sequencing Alignment Algorithm

Inactive | US20190042696A1 | Massive throughput | Low cost | Microbiological testing/measurement | Data visualisation | Sequence alignment algorithm | Third generation

Methods, software, and systems for aligning a read sequence to a reference sequence are disclosed. In certain embodiments, the methods, software, and systems involve determining similarity of distribution of k-mers between a region of the read sequence and a region of the reference sequence in order to determine whether the region of the read sequence maps to the region of the reference sequence.

Owner:THE BOARD OF TRUSTEES OF THE LELAND STANFORD JUNIOR UNIV

Method and system for optimizing multiple sequence alignment algorithms, and storage medium

Active | CN109949867A | Short time | Reduce consumption | Biostatistics | Sequence analysis | Sequence alignment algorithm | Distance matrix

The invention relates to a method and a system for optimizing multiple sequence alignment algorithms, and a storage medium. The method comprises: selecting a core sequence from the multiple sequences; aligning the core sequence pairwise with each of the other sequences to obtain the number of common fragments; constructing a first guide tree from the number of common fragments of each sequence pair; running a progressive alignment algorithm on the first guide tree to obtain a first multiple sequence alignment result; calculating the distance between each pair of sequences from the first result to obtain a distance matrix; constructing a second guide tree from the distance matrix, comparing it with the first guide tree, and re-aligning the sequences that correspond to the changed part to obtain a second result; and repeating the construction and comparison of the second guide tree until the number of comparisons exceeds a threshold. The method shortens the time spent on sequence alignment, speeds up processing, and reduces resource consumption.
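The step of deriving a distance matrix from an existing alignment can be sketched briefly. The example assumes the p-distance (fraction of differing columns, skipping gapped columns) as the distance measure, which the abstract does not specify, and shows only how the closest pair, typically the first join when building a guide tree, is picked. Function names are hypothetical.

```python
def p_distance(a, b):
    """Fraction of differing columns, ignoring columns where either row has a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 1.0  # no comparable columns: treat as maximally distant
    return sum(x != y for x, y in pairs) / len(pairs)

def distance_matrix(aligned):
    """Symmetric matrix of pairwise p-distances for aligned, equal-length rows."""
    n = len(aligned)
    return [[p_distance(aligned[i], aligned[j]) for j in range(n)] for i in range(n)]

def closest_pair(matrix):
    """Return the (i, j) pair with the smallest distance."""
    n = len(matrix)
    return min(((i, j) for i in range(n) for j in range(i + 1, n)),
               key=lambda ij: matrix[ij[0]][ij[1]])

aligned = ["ACGT-A", "ACGTTA", "AGGTTC"]
matrix = distance_matrix(aligned)
print(closest_pair(matrix))
```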

Owner:INST OF SPECIAL ANIMAL & PLANT SCI OF CAAS

Sequence alignment Seed processing method, system and device and readable storage medium

Active | CN110942809A | Avoid redundancy | Small amount of calculation | Sequence analysis | Instruments | Sequence alignment algorithm | Algorithm

The invention discloses a sequence alignment Seed processing method, system, and device, and a computer-readable storage medium. The method comprises: determining, from each Seed's position on the sequence to be aligned and its candidate alignment position on the reference sequence, the linear Seeds, i.e. those for which the relative relationship between the two positions is consistent; splicing the linear Seeds to obtain new spliced Seeds; screening, from a Seed set comprising the spliced Seeds and the non-linear Seeds, the longest Seeds, i.e. those covering the most bases of the same base fragment of the sequence to be aligned; further screening from the Seed set, for each target base fragment on the sequence to be aligned, the Seeds that cover the target base fragment and whose end position is greater than that of the invalid Seed; and combining the target Seeds of each target base fragment into a target Seed set, which does not include the Seeds in the longest Seed set. This reduces the number of Seeds used when the subsequent sequence alignment algorithm performs extension, which lowers the computational load of the alignment system and improves the matching precision and processing performance of gene sequence alignment.
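The splicing of linear Seeds can be illustrated under one common interpretation: two Seeds are "linear" when they share the same diagonal (reference position minus read position), and they are spliced when they overlap or touch. That interpretation, the `(read_pos, ref_pos, length)` tuple layout, and the function name are assumptions; the later screening steps of the abstract are not covered.

```python
from collections import defaultdict

def merge_colinear_seeds(seeds):
    """Merge seeds (read_pos, ref_pos, length) that share a diagonal and touch."""
    by_diagonal = defaultdict(list)
    for read_pos, ref_pos, length in seeds:
        by_diagonal[ref_pos - read_pos].append((read_pos, ref_pos, length))

    merged = []
    for diagonal in sorted(by_diagonal):
        current = None
        for read_pos, ref_pos, length in sorted(by_diagonal[diagonal]):
            if current and read_pos <= current[0] + current[2]:
                # Overlapping or adjacent on the same diagonal: extend the seed.
                end = max(current[0] + current[2], read_pos + length)
                current = (current[0], current[1], end - current[0])
            else:
                if current:
                    merged.append(current)
                current = (read_pos, ref_pos, length)
        merged.append(current)
    return merged

seeds = [(0, 100, 10), (8, 108, 10), (30, 130, 5), (5, 200, 12)]
print(merge_colinear_seeds(seeds))
```

Fewer, longer seeds mean fewer extension calls in the later alignment stage, which is the saving the abstract describes.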

Owner:LANGCHAO ELECTRONIC INFORMATION IND CO LTD

A Large-Scale Ontology Mapping Method for Chinese Language

Inactive | CN104699767B | Special data processing applications | Nuclear fusion | Degree of association

The invention provides a mapping method for large-scale Chinese ontologies. The method includes: calculating the initial correlation degree between concepts by fusing synonym matching with an edit-distance similarity algorithm; compressing the scale of the large-scale ontology mapping, based on the initial correlation degree, with an improved quasi-nuclear force field potential function that fuses concept similarity and dissimilarity; and measuring the similarity of complex concepts in Chinese ontologies by introducing a global sequence alignment algorithm. Chinese words exhibit polysemy and word-order sensitivity, and the computational overhead of large-scale ontology mapping is very high. The invention therefore first improves the existing quasi-nuclear force field potential function so that both the measurement of similarity between concepts and the reduction of the scale of the ontologies to be mapped are more reasonable. It then uses global sequence alignment to map complex Chinese concepts, addressing the defects of existing Chinese ontology mapping systems and ultimately improving the mapping efficiency, precision, and recall of the system.
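The first step, an initial correlation degree that fuses a synonym lookup with edit-distance similarity, might look roughly like the sketch below. The fusion rule (synonyms score 1.0, otherwise one minus the normalized Levenshtein distance), the synonym set, and the function names are assumptions; the abstract does not give the actual formula.

```python
def edit_distance(a, b):
    """Levenshtein distance computed with a rolling row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution or match
        prev = curr
    return prev[-1]

def initial_correlation(a, b, synonyms=frozenset()):
    """Return 1.0 for identical or synonymous labels, else edit-distance similarity."""
    if a == b or (a, b) in synonyms or (b, a) in synonyms:
        return 1.0
    return 1.0 - edit_distance(a, b) / max(len(a), len(b))

SYNONYMS = frozenset({("计算机", "电脑")})  # hypothetical synonym dictionary
print(initial_correlation("计算机", "电脑", SYNONYMS))
print(initial_correlation("计算机科学", "计算机技术", SYNONYMS))
```

Python strings are sequences of Unicode code points, so each Chinese character counts as one edit unit here.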

Owner:CAPITAL UNIV OF ECONOMICS & BUSINESS

Method for realizing trajectory data release k-anonymity based on point density segmentation trajectory

Active | CN112818402A | Efficient integration | Reduce losses | Digital data protection | Geographical information databases | Cluster algorithm | Data set

The invention discloses a method for achieving k-anonymity in trajectory data release by segmenting trajectories according to point density. The method comprises the following steps: 1) acquiring basic trajectory data and establishing a trajectory data set model; 2) establishing a DGH tree as the trajectory loss model; 3) adding virtual points to the trajectory data set model, generating a trajectory data set model containing the virtual points and a virtual point mark data set model; 4) clustering the trajectory data set model containing the virtual points, marking the cluster center to which each point belongs, and generating a mark data set model; 5) traversing the trajectory data set model, segmenting the trajectories using the mark data set model, and generating a segmented trajectory data set model; and 6) for the segmented data set model, calculating the loss with a dynamic sequence alignment algorithm and then clustering based on information loss with an iterative trajectory k-anonymity clustering algorithm. Because the trajectories are segmented according to the point density of the trajectory data set, the information loss incurred during k-anonymization is reduced.

Owner:SOUTH CHINA UNIV OF TECH

A Method for Classifying Tor Anonymous Communication Traffic Applications

Active | CN104135385B | Solving Consistency Issues | Reduce load | Data switching networks | Data pack | Algorithm

The invention discloses a method for classifying applications in Tor anonymous communication traffic. It mainly addresses the problem of obtaining upper-layer application type information from Tor anonymous communication traffic and involves related techniques such as feature selection, sample preprocessing, and traffic modeling. The method comprises the following steps: first, defining the concept of a flow burst section based on Tor's data packet scheduling mechanism, and using the volume and direction of each flow burst section as classification features; second, preprocessing the data samples with a K-means clustering algorithm and a multiple sequence alignment algorithm, solving the problems of over-fitting and inconsistent sample length through value symbolization and gap insertion; and finally, modeling the uplink and downlink Tor anonymous communication traffic of each application separately with profile hidden Markov models, using a heuristic algorithm to build the models quickly. During classification, the features of the traffic to be classified are fed into the profile hidden Markov models of the different applications, the probabilities under the uplink and downlink models are computed, and the upper-layer application type is decided by the maximum joint probability.
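The feature extraction step can be illustrated under a simplifying assumption: a flow burst section is taken to be a maximal run of consecutive packets travelling in the same direction, with its volume being the total bytes in that run. The patent derives its definition from Tor's packet scheduling, which the abstract does not spell out, so this is only an approximation. Packet sizes are signed (positive for uplink, negative for downlink) and the function name is hypothetical.

```python
from itertools import groupby

def burst_features(packets):
    """Collapse signed packet sizes into (direction, volume) bursts."""
    bursts = []
    # groupby only merges consecutive items, so each group is one same-direction run.
    for direction, group in groupby(packets, key=lambda size: 1 if size > 0 else -1):
        bursts.append((direction, sum(abs(size) for size in group)))
    return bursts

trace = [512, 512, -512, -1024, 512]
print(burst_features(trace))
```

Traces of different lengths yield burst sequences of different lengths, which is why the abstract follows this step with symbolization and gap insertion via multiple sequence alignment.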

Owner:南京市公安局

© 2025 PatSnap. All rights reserved.