7339 results about "Degree of similarity" patented technology

Document similarity detection and classification system

A document similarity detection and classification system is presented. The system employs a case-based method of classifying electronically distributed documents in which content chunks of an unclassified document are compared to the sets of content chunks comprising each of a set of previously classified sample documents in order to determine a highest level of resemblance between an unclassified document and any of a set of previously classified documents. The sample documents have been manually reviewed and annotated to distinguish document classifications and to distinguish significant content chunks from insignificant content chunks. These annotations are used in the similarity comparison process. If a significant resemblance level exceeding a predetermined threshold is detected, the classification of the most significantly resembling sample document is assigned to the unclassified document. Sample documents may be acquired to build and maintain a repository of sample documents by detecting unclassified documents that are similar to other unclassified documents and subjecting at least some similar documents to a manual review and classification process. In a preferred embodiment the invention may be used to classify email messages in support of a message filtering or classification objective.
Owner:GLASS JEFFREY B MR
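
The chunk comparison described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented method: the word-trigram chunking, the `chunks` and `classify` names, the representation of each sample as a label plus its annotated significant text, and the 0.5 threshold are all assumptions.

```python
def chunks(text, size=3):
    """Split text into overlapping word n-grams (assumed chunking scheme)."""
    words = text.lower().split()
    if len(words) < size:
        return {" ".join(words)} if words else set()
    return {" ".join(words[i:i + size]) for i in range(len(words) - size + 1)}


def classify(text, samples, threshold=0.5, size=3):
    """Return the label of the most resembling sample, or None.

    samples: list of (label, significant_text) pairs, where significant_text
    stands in for the chunks a reviewer marked as significant.
    """
    doc = chunks(text, size)
    best_label, best_score = None, 0.0
    for label, significant_text in samples:
        sig = chunks(significant_text, size)
        # Resemblance: share of the sample's significant chunks found in the document.
        score = len(doc & sig) / len(sig) if sig else 0.0
        if score > best_score:
            best_label, best_score = label, score
    # Assign a classification only when the resemblance exceeds the threshold.
    return best_label if best_score > threshold else None
```

A document that matches no sample above the threshold stays unclassified (`None`), which corresponds to the case where it would be routed to manual review.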

Detecting query-specific duplicate documents

An improved duplicate detection technique that uses query-relevant information to limit the portion(s) of documents to be compared for similarity is described. Before comparing two documents for similarity, the content of these documents may be condensed based on the query. In one embodiment, query-relevant information or text (also referred to as "snippets") is extracted from the documents and only the extracted snippets, rather than the entire documents, are compared for purposes of determining similarity.
Owner:GOOGLE LLC
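
A minimal sketch of the snippet-based comparison, under stated assumptions: a snippet is taken to be the set of words within a fixed window of any query term, and similarity is Jaccard overlap. The function names, the window size, and the choice of Jaccard are illustrative and are not taken from the patent.

```python
import re


def tokenize(text):
    """Lowercase a string and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def query_snippet(document, query, window=3):
    """Return the set of tokens within `window` positions of any query term."""
    tokens = tokenize(document)
    query_terms = set(tokenize(query))
    kept = set()
    for i, tok in enumerate(tokens):
        if tok in query_terms:
            kept.update(tokens[max(0, i - window): i + window + 1])
    return kept


def snippet_similarity(doc_a, doc_b, query, window=3):
    """Jaccard similarity of the two query-relevant snippets."""
    a = query_snippet(doc_a, query, window)
    b = query_snippet(doc_b, query, window)
    if not a and not b:
        return 0.0  # neither document mentions the query
    return len(a & b) / len(a | b)
```

Two pages that differ only in headers and footers score as duplicates for a query whose terms fall in the shared body, even though a whole-document comparison would score them lower.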

Method and system for exploring similarities

A method and computer readable medium for exploring similar users and items of a media service. In one aspect, a user can explore for similar users iteratively. In one aspect, a user interface is generated that displays a user selectable indicia representing a similar member function for allowing a user to search a media service for at least one other user which has a degree of similarity with respect to the searching user. In another aspect, a method facilitates the search of such a similar user within a media service.
Owner:HUAWEI TECH CO LTD

Method, a device and computer program products for protecting privacy of users from web-trackers

A method, a device and computer program products for protecting privacy of users from web-trackers. The method comprises: capturing and removing a public unique identifier set by a Website (300) in a computing device (100D) of a user (100); monitoring, during a first time-period, web-requests the user (100) makes to obtain a web-behavioral profile of the user (300), and storing the obtained web-behavioral profile as a first vector; tracking, during a second time-period, the web-requests to examine the effect each web-request has on assisting the de-anonymization of the user (100), obtaining a second vector; classifying the obtained second vector taking into account a computed similarity score parameter; creating and mapping a corresponding private unique identifier for said captured public identifier; and executing, based on said mapping between the private and the public unique identifiers, an intervention algorithm for said web-tracker that considers a configured intervention policy.
Owner:TELEFONICA SA

Aural similarity measuring system for text

The aural similarity measuring system and method provides a measure of the aural similarity between a target text (10) and one or more reference texts (11). Both the target text (10) and the reference texts (11) are converted into a string of phonemes (15) and then one or other of the phoneme strings are adjusted (16) so that both are equal in length. The phoneme strings are compared (12) and a score generated representative of the degree of similarity of the two phoneme strings. Finally, where there is a plurality of reference texts the similarity scores for each of the reference texts are ranked (13). With this aural similarity measuring system the analysis is automated thereby reducing risks of errors and omissions. Moreover, the system provides an objective measure of aural similarity enabling consistency of comparison in results and reproducibility of results.
Owner:MONGOOSE VENTURES
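
The compare-and-rank stage can be sketched as below, assuming the texts have already been converted to phoneme lists by some external step. The abstract does not say how the strings are made equal in length, so padding the shorter one with a filler symbol is an assumption, as are the position-by-position match score and all names used here.

```python
PAD = "_"  # hypothetical filler phoneme used to equalize lengths


def equalize(a, b):
    """Pad the shorter phoneme list so both have the same length."""
    n = max(len(a), len(b))
    return a + [PAD] * (n - len(a)), b + [PAD] * (n - len(b))


def aural_score(target, reference):
    """Fraction of aligned positions at which the phonemes agree."""
    a, b = equalize(list(target), list(reference))
    if not a:
        return 0.0
    matches = sum(1 for x, y in zip(a, b) if x == y and x != PAD)
    return matches / len(a)


def rank_references(target, references):
    """Score every reference against the target and sort best-first.

    references: dict mapping a reference name to its phoneme list.
    """
    scored = [(name, aural_score(target, ph)) for name, ph in references.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```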

Nomination engine

A system and method for nominating candidate enterprises for inclusion in a competitive set assembled for the purpose of competitively analyzing a subject enterprise. Characteristics in which the subject enterprise is comparable to a plurality of candidate enterprises are identified. The characteristics may include location, size of the physical presence, dollar volume of revenue, classification of the business engaged in, market share, average purchase size, purchase frequency of customers, size of the customer base, demographic characteristics of the customer base, location of customers, degree of customer loyalty, and share of the customer's wallet. A list of candidate enterprises is compiled based upon a predetermined degree of similarity between the subject enterprise and / or an identified competitor enterprise on the one hand, and the candidate enterprise on the other. A plurality of nominee enterprises are selected from the list of candidates to populate the competitive set.
Owner:MASTERCARD INT INC

Automatic match tuning

Methods and apparatus, including computer program products, for identifying matches between disparate schemas. A degree of similarity between elements of two schemas is calculated using each of multiple matching processes. The calculated degrees of similarity are combined using a first weighting vector to produce first combined degrees of similarity. The first weighting vector includes multiple weighting coefficients, and each weighting coefficient corresponds to one of the matching processes. The weighting coefficients are tuned using information relating to a predicted degree of matching accuracy associated with the first weighting vector.
Owner:SAP AG
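
The combination step can be illustrated with a weighted average. This sketch covers only the combining of per-matcher scores; the tuning of the coefficients is not shown because the abstract gives no procedure for it. The function name and the normalization by the weight total are assumptions.

```python
def combine_similarities(scores, weights):
    """Combine per-matcher similarity scores for one element pair.

    scores[i] is the degree of similarity reported by matching process i,
    weights[i] is the weighting coefficient for that process.
    """
    if len(scores) != len(weights):
        raise ValueError("one weight is needed per matching process")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    # Weighted average, so the result stays in the same range as the inputs.
    return sum(w * s for w, s in zip(weights, scores)) / total
```

For example, with three matchers reporting 1.0, 0.5 and 0.0 and weights 2, 1 and 1, the combined degree of similarity is 0.625.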

Apparatus and methods for improving detection of watermarks in content that has undergone a lossy transformation

Techniques for improving detection of watermarks in content that has undergone a lossy transformation. One of the techniques is used when the message that is contained in a watermark belonging to a digital representation that is derived from an original watermarked digital representation cannot be decoded. The technique obtains information about the watermark by comparing the watermark vector for the watermark that cannot be decoded with a replica of the watermark vector from the original watermarked digital representation. The replica is made using the message. Depending on the degree of similarity, the watermark's presence and some of its characteristics may be determined. Another technique improves the robustness of watermarks that are used for authentication by employing a short (even single-bit) watermark vector to make the watermark and using the message needed for the authentication to determine where the watermark is located in the digital representation. Authentication of a digital representation is done by determining whether the watermark is present in the digital representation. In another technique, detection of the presence of a watermark is used to determine what areas of a digital representation have been subject to alteration. Techniques for synchronizing digital representations for watermark detection and other purposes include adding marks whose locations can be automatically detected only with the help of information that is external to the digital representation, such as a key, and adding marks to a sequence of digital representations and detecting the marks by summing the sequence.
Owner:THOMSON LICENSING SA

Environment identification device, environment identification method, and robot device

An environment identifying apparatus (400) is adapted to be mounted in a robot apparatus that moves in an identifiable unique environment in which a plurality of landmarks are located so as to identify the current environment by means of a plurality of registered environments. The environment identifying apparatus comprises an environment map building section (402) for recognizing the landmarks in the current environment, computing the movement / state quantity of the robot apparatus itself and building an environment map of the current environment containing information on the positions of the landmarks in the current environment on the basis of the landmarks and the movement / state quantity, an environment map storage section (403) having a data base of registered environment maps containing positional information on the landmarks and environment IDs, an environment identifying section (404) for identifying the current environment on the basis of the degree of similarity between the environment map of the current environment and each of the registered environment maps and an environment exploring section (405) for exploring a new environment.
Owner:SONY CORP

Human face similarity degree matching method and device

The embodiment of the invention provides a human face similarity matching method comprising the following steps: obtaining a first human face image; extracting the feature data of a plurality of key points in the first human face image and in a stored second human face image; for each key point of the first human face image, searching for a matching key point among the key points of the second human face image; calculating a similarity score between each key point of the first human face image and its matched key point of the second human face image; and fusing the similarity scores of all key points of the first human face image to judge whether the two images match.
Owner:HUAWEI TECH CO LTD +1

Chinese question-answering system based on neural network

The invention discloses a Chinese question-answering system based on a neural network, which comprises a user interface module, a question word pre-segmentation module, a neuron pre-tagging module, a learning and training module, a neuron knowledge base module, a semantic block identification module, a question set index module and an answer reasoning module. The system operates as follows: first, an SIE encoding mode is adopted to encode the in-vocabulary words of each semantic block according to their position; next, the problem of identifying the semantic blocks of a question is converted into a tagging classification problem; then, a classification model based on the neural network determines the semantic structure of the question; finally, the semantic structure of the question is combined with neural-network-based question similarity computation, and the weights of the various semantic features of the question are compared by extracting its tagged semantic features, thereby providing a basis for final answer reasoning. The Chinese question-answering system integrates the syntax, the semantics and the contextual knowledge of the question and can simulate the way human beings process a sentence.
Owner:HUAZHONG NORMAL UNIV

Image processing method for direction dependent low pass filtering

Inactive | US7016549B1 | Without losing fine structure of image | Television system details | Image enhancement | Imaging processing | Jaggies
First similarity values along at least four directions are ascertained within a local area containing a target pixel and weighted averaging is performed by adding the pixel values of pixels around the target pixel value to the pixel value of the target pixel, adding weight along a direction having a small first similarity value (along a direction manifesting a high degree of similarity). By incorporating the pixel value level differences among a plurality of pixels on adjacent lines extending adjacent to the target pixel into the first similarity values, it becomes possible to effectively remove jaggies that are difficult to eliminate in the prior art. Furthermore, by making a judgment on degrees of similarity by incorporating color information such as characteristics differences among different color pixels, a more accurate judgment can be made with regard to the image structure to enable very accurate direction-dependent low-pass filtering.
Owner:NIKON CORP

Real-time tracking of non-rigid objects using mean shift

A method and apparatus for real-time tracking of a non-rigid target. The tracking is based on visual features, such as color and / or texture, where statistical distributions of those features characterize the target. A degree of similarity (rho(y0)) is computed between a given target (at y0) in a first frame and a candidate target (at y1) in a successive frame, the degree being expressed by a metric derived from the Bhattacharyya coefficient. A gradient vector corresponding to a maximization of the Bhattacharyya coefficient is used to derive the most probable location of the candidate target in the successive frame.
Owner:SIEMENS MEDICAL SOLUTIONS USA INC
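
For reference, the Bhattacharyya coefficient between two discrete distributions p and q is the sum over bins of sqrt(p_i * q_i). The sketch below computes it for two feature histograms. The derived metric sqrt(1 - rho) is one common choice and is an assumption here, since the abstract does not state which metric is used; the function names are illustrative, and the mean-shift location update is not shown.

```python
import math


def normalize(hist):
    """Scale a histogram of non-negative counts so that it sums to 1."""
    total = sum(hist)
    if total <= 0:
        raise ValueError("histogram must have positive mass")
    return [h / total for h in hist]


def bhattacharyya_coefficient(p, q):
    """Sum over bins of sqrt(p_i * q_i); 1.0 means identical distributions."""
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    p, q = normalize(p), normalize(q)
    return sum(math.sqrt(a * b) for a, b in zip(p, q))


def bhattacharyya_metric(p, q):
    """Distance sqrt(1 - rho), an assumed choice of derived metric."""
    rho = bhattacharyya_coefficient(p, q)
    return math.sqrt(max(0.0, 1.0 - rho))  # clamp guards against rounding
```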

Method and system for filtering of information entities

A system and method are provided for eliciting interesting structure from a collection of entities or resources with explicit and / or implicit, static and / or dynamic relations, called “affinities,” between them. Interesting structure includes (1) notions of quality, authority, or definitiveness of information, (2) notions of relevance to a user's information need, (3) notions of similarity among the plurality of resources retrieved from a universe of resources by a query process, and (4) notions of similarity among the usages of resources by different users / servers. Similarities between entities are computed based on similarities between the affinity values for the entities. That is, where the affinity values for two entities resemble each other, the two entities have a high degree of similarity. Using the similarities, the entities are ranked, clustered, etc., based on a significance derived from the similarities. The ranking, clustering, etc., makes up the interesting structure which is sought.
Owner:IBM CORP

System, apparatus, program and method for data aggregation

Embodiments include a method, apparatus, program, and system for distributing data items among a plurality of data storage units, the data items being an aggregation of data from a plurality of data sources. The method comprises generating a semantic description of each of the plurality of data sources; calculating, for each pair of data sources from among the plurality of data sources, a degree of similarity between the semantic descriptions of the pair of data sources; and allocating data items to data storage units in dependence upon the calculated degree of similarity between the data source of a data item being allocated and the or each data source of data items already allocated to the data storage units.
Owner:FUJITSU LTD

Document similarity scoring and ranking method, device and computer program product

Inactive | US7689559B2 | Avoids large and wasted effort | Small similarity score | Data processing applications | Web data indexing | Document similarity | Collation
A device, computer program product and a method for searching, navigating or retrieving documents in a set of electronic documents, including performing a link analysis of the set of electronic documents. The link analysis includes one of analyzing at least two of the set of documents with at least a portion of a similarity graph constructed among the set of documents and analyzing the at least two of the set of documents with the at least a portion of the similarity graph and at least a portion of a hyperlink graph constructed from hyperlinks between the set of documents. Also described is a method for building a similarity matrix.
Owner:TELENOR AS

Method, system and application system for calculating text similarity and word-sense similarity

The invention discloses a method, a system and an application system for calculating text similarity and word-sense similarity. The method comprises the following steps: initializing on the basis of a vocabulary database and calculating initial word-sense similarities among the words in the vocabulary database; calculating initial semantic similarities among texts on the basis of the initial word-sense similarities; iterating the semantic similarities among texts and the word-sense similarities among words until convergence; constructing a final word-sense similarity matrix from the final word-sense similarities; transforming the term-frequency vector of each initial text into a new term-frequency vector; and calculating the text similarities within the text collection. The invention can improve the relevance of text similarity results, especially for short texts.
Owner:蒙圣光 +1

Method, article, apparatus and computer system for inputting a graphical object

Method, article, apparatus and computer system facilitating the easy and intuitive inputting of a desired graphical object into an electronic system from a large plurality of predetermined graphical objects. In one example embodiment, this is achieved by assigning each of said graphical objects into one of a plurality of groups in accordance with a predetermined similarity criterion, associating respective base shapes to each of said groups, wherein said base shapes having a certain degree of similarity to the objects assigned to the associated group according to said similarity criterion and associating in each of said groups at least one gesture to each of said graphical objects, so that the associated gestures are distinguishable from each other. In order to input the desired graphical object, one of the groups is selected by selecting its base shape and then the desired graphical object is identified by drawing the respective gesture associated thereto.
Owner:HEWLETT PACKARD DEV CO LP

Searching images

A database of visual images includes metadata having, for a particular image, at least one entry specifying: a part of that image, another stored image, and a measure Sabi of the degree of similarity between that specified part and the specified other image. The searching method comprises displaying one or more images; receiving input from a user (for example by using a gaze tracker) indicative of part of the displayed images; determining measures of interest for each of a plurality of non-displayed stored images specified by the metadata for the displayed image(s), as a function of the similarity measure(s) and the relationship between the user input and the part specified; and, on the basis of these measures, selecting, from those non-displayed stored images, further images for display.
Owner:BRITISH TELECOMM PLC

Identifying similarities within large collections of unstructured data

A technique for determining when documents stored in digital format in a data processing system are similar. A method compares a sparse representation of two or more documents by breaking the documents into “chunks” of data of predefined sizes. Selected subsets of the chunks are determined as being representative of data in the documents and coefficients are developed to represent such chunks. Coefficients are then combined into coefficient clusters containing coefficients that are similar according to a predetermined similarity metric. The degree of similarity between documents is then evaluated by counting clusters into which chunks of similar documents fall.
Owner:DIGITAL GUARDIAN LLC

Recommendation method for personalized resource information based on scene information

The invention discloses a recommendation method for personalized resource information based on scene information. The method comprises: pre-processing the web pages of a collaborative tagging system and storing the information of all of a user's tagging behaviors, wherein the information is extracted for a specific user and comprises the tagged resource information, the tag information used and the tagging time information; generating rating data expressing user preferences according to the tag information the user applied to resources in a database and the time at which each resource was tagged; calculating the similarity among users based on the generated rating data so as to determine the neighbors of each user, that is, the users with similar interests; and recommending to the user resources that the user has not tagged, according to the preference information of the user's neighbors, thereby completing a collaborative-filtering recommendation of personalized resources. Experiments show that a better personalized recommendation service can be provided for users by integrating the scene information.
Owner:INST OF AUTOMATION CHINESE ACAD OF SCI
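
The neighbor-based recommendation step can be sketched as follows. This is a generic user-based collaborative filter, not the patented method: it takes the rating data as given and does not model how ratings are derived from tags and tagging times. Cosine similarity, the neighbor count `k`, and the function names are assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse rating dicts (resource -> score)."""
    dot = sum(u[r] * v[r] for r in set(u) & set(v))
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recommend(ratings, user, k=2, top_n=3):
    """Suggest resources the user has not rated, using the k most similar users."""
    neighbours = sorted(
        ((cosine(ratings[user], r), name) for name, r in ratings.items() if name != user),
        reverse=True,
    )[:k]
    scores = {}
    for sim, name in neighbours:
        if sim <= 0:
            continue  # ignore users with no overlapping interests
        for resource, score in ratings[name].items():
            if resource not in ratings[user]:
                scores[resource] = scores.get(resource, 0.0) + sim * score
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [resource for resource, _ in ranked][:top_n]
```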

Biometric information processing apparatus and biometric information processing method

In order to acquire a suitable fingerprint image by correcting an elongated fingerprint image, a line sensor acquires a fingerprint image as a plurality of line-shaped images. A computation unit computes a similarity value by use of an evaluation function for evaluating the degree of similarity between the line-shaped images. The similarity value represents the degree of similarity between a first line-shaped image and a second line-shaped image, which serve as similarity evaluation targets and are included in the plurality of line-shaped images. A compression unit compresses the first line-shaped image and the second line-shaped image when the similarity value is equal to or larger than a predetermined threshold value to generate a new line-shaped image. A generation unit generates the entire fingerprint image by combining the new line-shaped image with the other line-shaped images.
Owner:SANYO ELECTRIC CO LTD

Realization method and system for electronic medical record post-structuring and auxiliary diagnosis

Inactive | CN106383853A | Good effect | Special data processing applications | Data set | Jaro–Winkler distance
The invention relates to a realization method and system for electronic medical record post-structuring and auxiliary diagnosis. A combination of multiple types of distance measurement is used: a character string edit distance is the minimum number of replacement, insertion and deletion operations required to convert one character string into another; a Jaro-Winkler distance measures the similarity between two character strings and is used for duplicate record detection; a geometric mean value of a Chinese character distance and a Chinese character input method is adopted as a comprehensive similarity measurement for measuring similarity between characteristic texts. Characteristic ranking is realized by using a TF-IDF method, which assesses the importance of characteristic terms relative to documents in a file set or corpus: the importance of a characteristic term is directly proportional to its occurrence frequency in a document and inversely proportional to the number of documents in the corpus in which it occurs. Files are converted, according to the generated characteristic terms, into the file format used for PU learning from a positive example data set and an unlabelled data set, and through the PU learning the system automatically recommends related diagnoses for clinical medical personnel to refer to.
Owner:刘勇
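
The string edit distance defined in this abstract is the standard Levenshtein distance, which can be computed with a short dynamic program. The sketch below is a generic implementation rather than code from the patent; the normalized `edit_similarity` helper is an added assumption. It works on Unicode code points, so Chinese strings are compared character by character.

```python
def levenshtein(a, b):
    """Minimum number of replacements, insertions and deletions turning a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # replacement or match
            ))
        previous = current
    return previous[-1]


def edit_similarity(a, b):
    """Map the distance to a 0..1 similarity (assumed normalization)."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest
```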

Text joins for data cleansing and integration in a relational database management system

An organization's data records are often noisy because of transcription errors, incomplete information, and lack of standard formats for textual data. A fundamental task during data cleansing and integration is matching strings (perhaps across multiple relations) that refer to the same entity (e.g., organization name or address). Furthermore, it is desirable to perform this matching within an RDBMS, which is where the data is likely to reside. In this paper, we adapt the widely used and established cosine similarity metric from the information retrieval field to the relational database context in order to identify potential string matches across relations. We then use this similarity metric to characterize this key aspect of data cleansing and integration as a join between relations on textual attributes, where the similarity of matches exceeds a specified threshold. Computing an exact answer to the text join can be expensive. For query processing efficiency, we propose an approximate, sampling-based approach to the join problem that can be easily and efficiently executed in a standard, unmodified RDBMS. Therefore the present invention includes a system for string matching across multiple relations in a relational database management system comprising generating a set of strings from a set of characters, decomposing each string into a subset of tokens, establishing at least two relations within the strings, establishing a similarity threshold for the relations, sampling the at least two relations, correlating the relations for the similarity threshold and returning all of the tokens which meet the criteria of the similarity threshold.
Owner:AMERICAN TELEPHONE & TELEGRAPH CO +1
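
A minimal in-memory sketch of a thresholded cosine text join follows. It computes the exact join with a nested loop, not the sampling-based approximation the abstract proposes, and it runs in Python rather than inside an RDBMS. The character q-gram tokens, the particular TF-IDF weighting, and the function names are assumptions.

```python
import math
from collections import Counter


def qgrams(s, q=3):
    """Decompose a string into overlapping character q-grams (assumed tokens)."""
    padded = "#" * (q - 1) + s.lower() + "#" * (q - 1)
    return [padded[i:i + q] for i in range(len(padded) - q + 1)]


def tfidf_vectors(strings, q=3):
    """Unit-length TF-IDF vectors, one sparse dict per input string."""
    token_lists = [qgrams(s, q) for s in strings]
    n = len(strings)
    df = Counter(tok for toks in token_lists for tok in set(toks))
    vectors = []
    for toks in token_lists:
        vec = {t: c * math.log(1 + n / df[t]) for t, c in Counter(toks).items()}
        norm = math.sqrt(sum(w * w for w in vec.values()))
        vectors.append({t: w / norm for t, w in vec.items()} if norm else {})
    return vectors


def text_join(left, right, threshold=0.5, q=3):
    """Return (left_string, right_string, similarity) for pairs above threshold."""
    vecs = tfidf_vectors(list(left) + list(right), q)
    left_vecs, right_vecs = vecs[:len(left)], vecs[len(left):]
    matches = []
    for i, a in enumerate(left_vecs):
        for j, b in enumerate(right_vecs):
            # Vectors are unit length, so the dot product is the cosine similarity.
            sim = sum(w * b.get(t, 0.0) for t, w in a.items())
            if sim > threshold:
                matches.append((left[i], right[j], sim))
    return matches
```

The nested loop costs one comparison per pair of rows, which illustrates why the abstract calls the exact answer expensive.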

Acquisition and application of contextual role knowledge for coreference resolution

Coreference resolution is the process of identifying when two noun phrases (NP) refer to the same entity. Two main contributions to computational coreference resolution are made. First, this work contributes a new method for recognizing when an NP is anaphoric. Second, traditional approaches to coreference resolution typically select the most appropriate antecedent by recognizing word similarity, proximity, and agreement in number, gender, and semantic class. This work contributes a new source of evidence that focuses on the roles that an anaphor and antecedent play in particular events or relationships. I show that using contextual role knowledge as part of the coreference resolution process increases the number of anaphors that can be resolved, and I demonstrate an unsupervised method for acquiring contextual role knowledge that does not require an annotated training corpus. A probabilistic model based on the Dempster-Shafer model of evidence is used to incorporate contextual role knowledge with traditional evidence sources.
Owner:UNIV OF UTAH RES FOUND

Mobile robot, and control method and program for the same

Inactive | US20070276541A1 | Readily and accurately estimate | Promote generation | Computer control | Simulator control | Turn angle | Program planning
A path planning unit plans a travel path to a destination based on an estimated current travel position and outputs a travel command to a travel control unit to perform travel control so as to follow the travel path. A travel position prediction unit accumulates a travel distance, which is detected by a wheel turning-angle sensor, to the estimated current travel position so as to predict the current travel position. A predictive image generating unit generates a plurality of predictive edge images which are composed of edge information and captured when a camera is virtually disposed at the predicted current travel position and candidate positions in the vicinity of it based on layout information of the environment, and an edge image generating unit generates an actual edge image from the actual image captured by the camera. A position estimation unit compares the edge image with the plurality of predictive edge images, estimates the candidate position of the predictive edge image at which the degree of similarity is the maximum, and updates the travel position of the path planning unit and the travel position prediction unit.
Owner:FUJITSU LTD

Intelligent Chinese question-answering system based on concept

The invention discloses a Chinese question-answering system based on concept, which mainly comprises a data server, a question pre-treatment module, a candidate question set extracting module and a question sentence similarity calculation module. The invention aims at providing a concept-based question-answering system that performs synonym expansion of the keywords extracted from the question sentences input by the user, understands question sentences better, carries out searching and improves the recall ratio of the question-answering system. Furthermore, the system uses a concept-based Chinese sentence similarity calculation method that considers three aspects (word form, word order and word length) and improves the search precision ratio. Meanwhile, the system adopts a high-efficiency retrieval technology to extract the candidate question set rapidly, calculates question sentence similarity, sorts the question set quickly and returns the sorted questions and answers to the user. The question-answering system of the invention understands the concepts in the question sentences input by the user more precisely and searches for accurate answers. Experiments show that the question-answering system of the invention achieves a high recall ratio and precision ratio.
Owner:HUAZHONG UNIV OF SCI & TECH