401 results about "Edit distance" patented technology

In computational linguistics and computer science, edit distance is a way of quantifying how dissimilar two strings (e.g., words) are to one another by counting the minimum number of operations required to transform one string into the other. Edit distances find applications in natural language processing, where automatic spelling correction can determine candidate corrections for a misspelled word by selecting words from a dictionary that have a low distance to the word in question. In bioinformatics, it can be used to quantify the similarity of DNA sequences, which can be viewed as strings of the letters A, C, G and T.
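The definition above can be made concrete with a minimal sketch. It assumes the Levenshtein variant of edit distance (unit-cost insertions, deletions and substitutions), which is only one of several possible operation sets; the function names `edit_distance` and `suggest` are illustrative and do not come from any of the listed patents.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: fewest insertions, deletions and substitutions."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def suggest(word, dictionary, max_distance=2):
    """Return dictionary words within max_distance of word, closest first."""
    scored = sorted((edit_distance(word, w), w) for w in dictionary)
    return [w for d, w in scored if d <= max_distance]


if __name__ == "__main__":
    print(edit_distance("kitten", "sitting"))
    print(edit_distance("ACGT", "AGT"))
    print(suggest("helo", ["hello", "help", "world"]))
```

Keeping only two rows of the dynamic-programming table holds memory at O(len(b)) while the running time stays O(len(a) * len(b)).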

End-to-end identification method for scene text with random shape

The invention discloses an end-to-end identification method for scene text of arbitrary shape. The method comprises the steps of: extracting text features through a feature pyramid network, from which a region proposal network generates candidate text boxes; adjusting the positions of the candidate text boxes through a fast region classification and regression branch to obtain a more accurate text bounding box; inputting the position information of the bounding box into a segmentation branch and obtaining a predicted character sequence through a pixel voting algorithm; and finally processing the predicted character sequence with a weighted edit distance algorithm to find the best-matching word in a given dictionary, thereby obtaining the final text identification result. The method can detect and identify scene text of arbitrary shape simultaneously, including horizontal, multidirectional and curved text, and can be trained completely end to end. Compared with the prior art, the method offers advantages in accuracy and versatility and has high application value.
Owner:HUAZHONG UNIV OF SCI & TECH
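The final dictionary-matching step of the scene text entry above can be sketched as follows. The abstract does not say how the edit distance is weighted, so this sketch assumes a cheaper substitution cost for visually confusable characters; the `CONFUSABLE` table, the 0.2 cost and all function names are hypothetical choices for illustration, not the patented scheme.

```python
# Hypothetical pairs that a text recogniser might confuse (assumption).
CONFUSABLE = {frozenset("o0"), frozenset("l1"), frozenset("s5")}


def sub_cost(a: str, b: str) -> float:
    """Substitution cost: free for a match, cheap for confusable pairs."""
    if a == b:
        return 0.0
    if frozenset((a, b)) in CONFUSABLE:
        return 0.2  # assumed weight, chosen only for illustration
    return 1.0


def weighted_edit_distance(pred: str, word: str, indel_cost: float = 1.0) -> float:
    """Edit distance whose substitution cost depends on the character pair."""
    prev = [j * indel_cost for j in range(len(word) + 1)]
    for i, cp in enumerate(pred, start=1):
        curr = [i * indel_cost]
        for j, cw in enumerate(word, start=1):
            curr.append(min(prev[j] + indel_cost,
                            curr[j - 1] + indel_cost,
                            prev[j - 1] + sub_cost(cp, cw)))
        prev = curr
    return prev[-1]


def best_match(pred: str, lexicon):
    """Pick the lexicon word with the lowest weighted distance to pred."""
    return min(lexicon, key=lambda w: weighted_edit_distance(pred, w))


if __name__ == "__main__":
    print(best_match("c0ffee", ["coffee", "toffee", "office"]))
```

A real system would more plausibly derive the weights from the recogniser's per-character confidence than from a fixed table; the fixed table simply keeps the sketch self-contained.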

Document Scanning and Data Derivation Architecture.

Inactive | US20070033118A1 | Reduce and eliminate manual typing | Eliminate or reduce common typographical errors | Complete banking machines | Finance | Feature vector | Imaging analysis
Proprietary suite of underlying document image analysis capabilities, including a novel forms enhancement, segmentation and modeling component, forms recognition and optical character recognition. Future versions of the system will include form reasoning to detect and classify fields on forms with varying layout. The product provides acquisition, modeling, recognition and processing components, and can verify recognized data against the image with a line-by-line comparison. The key enabling technologies center on the recognition and processing of the scanned forms. The system learns the positions of lines and the location of text on the pre-printed form, and associates various regions of the form with specific required fields in the electronic version. Once the form is recognized, the preprinted material is removed and individual regions are passed to an optical character recognition component. The current proprietary OCR engine is trained with a variety of Roman text fonts and has a back-end dictionary that can be customized to take advantage of the fact that the system knows which field it is recognizing. The engine performs segmentation to obtain isolated characters and computes a structure-based feature vector. The characters are normalized and classified using a cluster-centric classifier, which responds well to variations in the symbol's contour. An efficient dictionary lookup scheme provides exact and edit distance lookup using a TRIE structure. An edit distance is computed and a collection of near misses can be output in a lattice to enhance the final recognition result. The current classification rate can exceed 99% with context. The ultimate goal of this system is to enable the processing of all tax forms, including forms with handwritten material.
Owner:TAXSCAN TECH
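The TRIE-based exact and edit distance lookup mentioned in the entry above can be illustrated with a common textbook technique: walk the trie while carrying one row of the Levenshtein table per node, and prune any branch whose row minimum already exceeds the threshold. This is a sketch of that general technique, not the proprietary engine; `TrieNode`, `build_trie`, `exact_lookup` and `fuzzy_lookup` are hypothetical names.

```python
class TrieNode:
    def __init__(self):
        self.children = {}
        self.word = None  # set to the full word where a dictionary entry ends


def build_trie(words):
    root = TrieNode()
    for w in words:
        node = root
        for ch in w:
            node = node.children.setdefault(ch, TrieNode())
        node.word = w
    return root


def exact_lookup(root, query):
    """True only if query is stored as a complete word."""
    node = root
    for ch in query:
        node = node.children.get(ch)
        if node is None:
            return False
    return node.word is not None


def fuzzy_lookup(root, query, max_dist):
    """Return sorted (distance, word) pairs within max_dist of query."""
    results = []

    def walk(node, ch, prev_row):
        # Extend the Levenshtein table by one row for the trie character ch.
        row = [prev_row[0] + 1]
        for j in range(1, len(query) + 1):
            cost = 0 if query[j - 1] == ch else 1
            row.append(min(row[j - 1] + 1, prev_row[j] + 1, prev_row[j - 1] + cost))
        if node.word is not None and row[-1] <= max_dist:
            results.append((row[-1], node.word))
        if min(row) <= max_dist:  # otherwise no descendant can qualify
            for c, child in node.children.items():
                walk(child, c, row)

    first_row = list(range(len(query) + 1))
    for c, child in root.children.items():
        walk(child, c, first_row)
    return sorted(results)


if __name__ == "__main__":
    trie = build_trie(["form", "from", "farm", "fork", "tax"])
    print(exact_lookup(trie, "tax"))
    print(fuzzy_lookup(trie, "fprm", 1))
```

The returned list of near misses plays the role of the candidate collection that the abstract says is passed on to refine the final recognition result.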

Music retrieval system based on audio fingerprint features

The invention belongs to the technical field of information retrieval, and particularly relates to a music retrieval system based on audio fingerprint features. The system is composed of a preprocessing module, a feature extraction module, an inverted index module and a fine matching module. The preprocessing module mainly carries out audio signal conversion, resampling and filtering. The feature extraction module represents audio files: the audio fingerprint features select the most stable points in the frequency spectrum as feature points through two rounds of screening based on dynamic thresholds, and each feature is represented by a pair of points. The inverted index module uses the features as keywords, builds inverted indexes from the features of the song library, and returns index results ranked by the number of matching keywords. The fine matching module takes the sequential relationship of the audio features into account and adopts an improved edit distance as the similarity of two feature sequences, thereby refining the index result. The music retrieval system based on audio fingerprint features is suitable for retrieval over a large number of songs, and is particularly effective for query segments captured by recording.
Owner:FUDAN UNIV
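The two-stage retrieval in the entry above (index lookup by shared keywords, then re-ranking by sequence distance) can be sketched as follows. The abstract does not define its improved edit distance, so plain Levenshtein distance over feature sequences stands in for it here; the integer feature values and every name in the block are hypothetical.

```python
from collections import Counter, defaultdict


def sequence_edit_distance(a, b):
    """Levenshtein distance over arbitrary sequences (stand-in metric)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def build_index(library):
    """Map each fingerprint feature to the set of songs containing it."""
    index = defaultdict(set)
    for song_id, features in library.items():
        for f in features:
            index[f].add(song_id)
    return index


def retrieve(query, library, index, top_k=3):
    """Stage 1: vote by shared features. Stage 2: re-rank by edit distance."""
    votes = Counter()
    for f in set(query):
        for song_id in index.get(f, ()):
            votes[song_id] += 1
    candidates = [song_id for song_id, _ in votes.most_common(top_k)]
    return sorted(candidates,
                  key=lambda s: sequence_edit_distance(query, library[s]))


if __name__ == "__main__":
    # Toy fingerprints: each integer stands for one hashed point pair.
    library = {
        "song_a": [1, 2, 3, 4, 5],
        "song_b": [5, 4, 3, 2, 1],
        "song_c": [9, 8, 7, 6, 0],
    }
    index = build_index(library)
    print(retrieve([1, 2, 3, 9, 5], library, index))
```

In the toy data, song_a and song_b share the same four features with the query, so the vote count alone cannot separate them; only the order-aware second stage does. A query that covers a short excerpt of a long song would need a local alignment rather than this global distance.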

Character string updated degree evaluation program

A character string updated degree evaluation program is provided that enables quantitative assessment of the amount of intellectual work involved in editing and updating character strings. A text subjected to comparison is divided into common part character strings, each having a length greater than or equal to a threshold value, and non-common part character strings. The number of edited points relative to the original text and a context edit distance are calculated based on the rate of the common part character strings and their occurrence pattern. The number of edited points is acquired from the number of elements contained in the common part character string set, and the context edit distance is acquired from the change in the order of occurrence of the common part character strings. Calculation of a new creation percentage and analysis by an N-gram are performed on the non-common part character strings. The new creation percentage is acquired from the total length of the elements contained in the non-common part character string set, and a new creation novelty degree is acquired from a non-partial matching rate between the non-common part character string set and an element contained in that set. The calculations for the common part character string set and for the non-common part character string set are combined to calculate a text updated degree.
Owner:NAT UNIV CORP NAGAOKA UNIV TECH

Method and device for calculating similarity of Chinese character strings on the basis of edit distance

An embodiment of the invention provides a method for calculating the similarity of Chinese character strings on the basis of edit distance. The method includes: calculating the similarity of Chinese characters in character strings to be compared; calculating the similarity of the Chinese character strings to be compared. According to the method, the Chinese characters in the character strings are converted into four-corner codes by the four-corner code method; the similarity of the Chinese characters is accordingly calculated on the basis of edit distance; on this basis, the weight of edit distance is replaced with the similarity of the Chinese characters to calculate the similarity of the character strings. The Chinese characters are converted into numeric strings for comparison, thus matching of the Chinese characters is more precise; the weight of the edit distance is replaced with the similarity of the Chinese characters to calculate the similarity of the character strings, thus the edit distance algorithm is applied to matching of the Chinese character strings under the Chinese language environment and matching results are more accurate. In addition, another embodiment of the invention provides a device for calculating the similarity of Chinese character strings on the basis of edit distance.
Owner:SHENZHEN AUDAQUE DATA TECH
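One way to read the Chinese string similarity entry above is sketched below: character similarity is the normalised edit distance between two four-corner codes, and the substitution cost in the string-level edit distance becomes one minus that similarity. This reading is an assumption, as are both normalisation formulas. The `PLACEHOLDER_CODES` digits are invented for illustration and are not verified four-corner codes; a real implementation would load an actual code table.

```python
def levenshtein(a, b, sub_cost=lambda x, y: 0.0 if x == y else 1.0):
    """Edit distance with a pluggable substitution cost in [0, 1]."""
    prev = [float(j) for j in range(len(b) + 1)]
    for i, x in enumerate(a, start=1):
        curr = [float(i)]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1.0,
                            curr[j - 1] + 1.0,
                            prev[j - 1] + sub_cost(x, y)))
        prev = curr
    return prev[-1]


def char_similarity(c1, c2, codes):
    """Similarity of two characters from the edit distance of their codes."""
    if c1 == c2:
        return 1.0
    k1, k2 = codes.get(c1), codes.get(c2)
    if k1 is None or k2 is None:
        return 0.0  # unknown character: treat as completely different
    return 1.0 - levenshtein(k1, k2) / max(len(k1), len(k2))


def string_similarity(s1, s2, codes):
    """String similarity where substituting similar characters is cheap."""
    if not s1 and not s2:
        return 1.0
    dist = levenshtein(s1, s2, lambda x, y: 1.0 - char_similarity(x, y, codes))
    return 1.0 - dist / max(len(s1), len(s2))


# PLACEHOLDER digits, NOT real four-corner codes (illustration only).
PLACEHOLDER_CODES = {"未": "5090", "末": "5091"}

if __name__ == "__main__":
    print(string_similarity("未来", "末来", PLACEHOLDER_CODES))  # code-aware
    print(string_similarity("未来", "末来", {}))                 # plain edit distance
```

With the placeholder table, the two visually close characters differ in one of four code digits, so swapping them costs 0.25 instead of 1 and the strings score higher than under plain edit distance.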

Method and device for identifying human face through double models

The invention discloses a method for identifying human faces through double models, and mainly solves the problem that traditional identification methods depend heavily on the texture of the face image. The method comprises the following steps: dividing a face image sample set into a test image set and a training image set, and learning an eigenface subspace and an active appearance model from the training images; projecting the test and training images onto the eigenface subspace to obtain texture models, and calculating the distance between the test and training image texture models; automatically locating feature points in the test and training images according to the active appearance model, constructing shape models, and taking an image edit distance as the distance between the test and training image shape models; and determining the identity of the test image through weighted fusion of the two distances. Compared with identification methods based on texture alone or on structural information alone, the method achieves a higher identification rate on face images with changes in expression, illumination and size, particularly on face images acquired under changed illumination, and can be used for authentication under the influence of multiple factors.
Owner:XIDIAN UNIV
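The last step of the face identification entry above, weighted fusion of the texture distance and the shape distance, can be sketched as follows. The abstract does not give the fusion formula, so the min-max normalisation and the single weight `w_texture` are assumptions; the distances are taken as precomputed inputs, and `fuse_and_identify` is a hypothetical name.

```python
def normalise(distances):
    """Rescale distances to [0, 1] so the two models are comparable."""
    lo, hi = min(distances), max(distances)
    if hi == lo:
        return [0.0 for _ in distances]
    return [(d - lo) / (hi - lo) for d in distances]


def fuse_and_identify(texture_dist, shape_dist, labels, w_texture=0.5):
    """Return the label whose fused distance to the test image is smallest.

    texture_dist[i] and shape_dist[i] are the distances between the test
    image and training image i under the texture and shape models.
    """
    t = normalise(texture_dist)
    s = normalise(shape_dist)
    fused = [w_texture * a + (1.0 - w_texture) * b for a, b in zip(t, s)]
    best = min(range(len(fused)), key=fused.__getitem__)
    return labels[best], fused


if __name__ == "__main__":
    labels = ["alice", "bob", "carol"]
    texture = [0.2, 0.5, 0.9]  # e.g. distances in the projected subspace
    shape = [3.0, 1.0, 4.0]    # e.g. edit distances between shape models
    print(fuse_and_identify(texture, shape, labels)[0])
    print(fuse_and_identify(texture, shape, labels, w_texture=0.9)[0])
```

Changing the weight changes the decision in this toy example, which is why the weight would normally be tuned on held-out data.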