70 results about "Latent semantic analysis" patented technology

Latent semantic analysis (LSA) is a technique in natural language processing, in particular distributional semantics, of analyzing relationships between a set of documents and the terms they contain by producing a set of concepts related to the documents and terms. LSA assumes that words that are close in meaning will occur in similar pieces of text (the distributional hypothesis). A matrix containing word counts per paragraph (rows represent unique words and columns represent each paragraph) is constructed from a large piece of text and a mathematical technique called singular value decomposition (SVD) is used to reduce the number of rows while preserving the similarity structure among columns. Paragraphs are then compared by taking the cosine of the angle between the two vectors (or the dot product between the normalizations of the two vectors) formed by any two columns. Values close to 1 represent very similar paragraphs while values close to 0 represent very dissimilar paragraphs.
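
The pipeline described above can be sketched in a few lines. This is a minimal illustration, assuming NumPy is available; the toy word-by-paragraph counts and the helper names `lsa_paragraph_vectors` and `cosine` are invented for the example and do not come from any of the patents listed here.

```python
import numpy as np

def lsa_paragraph_vectors(counts, k):
    """Map each column (paragraph) of a word-by-paragraph count matrix
    to a k-dimensional vector using a truncated SVD."""
    _, s, vt = np.linalg.svd(np.asarray(counts, dtype=float), full_matrices=False)
    # S_k @ V_k^T has k rows instead of one row per word; transpose so that
    # each row of the result is one paragraph.
    return (s[:k, None] * vt[:k]).T

def cosine(a, b):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

# Hypothetical counts. Rows: ship, boat, ocean, tree, wood, forest.
# Columns: paragraphs P0..P3 (P0, P1 about the sea; P2, P3 about woodland).
counts = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 1],
]

raw = np.asarray(counts, dtype=float).T        # raw paragraph vectors
reduced = lsa_paragraph_vectors(counts, k=2)   # latent paragraph vectors

raw_sim = cosine(raw[0], raw[1])               # P0 vs P1 on raw counts
same_topic = cosine(reduced[0], reduced[1])    # P0 vs P1 after reduction
cross_topic = cosine(reduced[0], reduced[2])   # P0 vs P2 after reduction
print(raw_sim, same_topic, cross_topic)
```

On these counts the raw cosine between P0 and P1 is 0.5, because they share only the word "ocean". After reduction to two dimensions the two sea paragraphs become nearly parallel, while P0 and P2 remain nearly orthogonal.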

Automatic organization of browsing histories

An automatic organization into topics for a browsing history. In one embodiment, a system identifies groups of browsing actions as related, and clusters the browsing history (e.g. a web browsing history) into sessions based on heuristics used to determine relationships. Latent semantic analysis can be used to determine the relationships which can be considered topics. User interfaces for displaying or otherwise presenting these sessions can include icons representative of topics, and these icons can have different sizes depending on a frequency of web page visits within a topic. The topics can be displayed in time ranges or in a cover flow view or both time ranges and cover flow view.
Owner:APPLE INC

Consistency modeling of healthcare claims to detect fraud and abuse

Transaction-based behavioral profiling, whereby the entity to be profiled is represented by a stream of transactions, is required in a variety of data mining and predictive modeling applications. An approach is described for assessing inconsistency in the activity of an entity, as a way of detecting fraud and abuse, using service-code information available on each transaction. Inconsistency is based on the concept that certain service-codes naturally co-occur more than do others. An assessment is made of activity consistency looking at the overall activity of an individual entity, as well as looking at the interaction of entities. Several approaches for measuring consistency are provided, including one inspired by latent semantic analysis as used in text analysis. While the description is in the context of fraud detection in healthcare, the techniques are relevant to application in other industries and for purposes other than fraud detection.
Owner:FAIR ISAAC & CO INC

System and method of structuring data for search using latent semantic analysis techniques

The disclosed embodiments provide a system and method for using modified Latent Semantic Analysis techniques to structure data for efficient search and display. The present invention creates a hierarchy of clustered documents, representing the topics of a domain corpus, through a process of optimal agglomerative clustering. The output from a search query is displayed in a fisheye view corresponding to the hierarchy of clustered documents. The fisheye view may link to a two-dimensional self-organizing map that represents semantic relationships between documents.
Owner:SYNTORG INC

Assistive call center interface

Unstructured voice information from an incoming caller is processed by an automatic speech recognition and semantic categorization system to convert the information into structured data that may then be used to access one or more databases to retrieve associated supplemental data. The structured data and associated supplemental data are then made available through a presentation system that provides information to the call center agent and, optionally, to the incoming caller. The system thus allows a call center information processing system to handle unstructured voice input for use by the live agent in handling the incoming call and for storage and retrieval at a later time. The semantic analysis system may be implemented by a global parser or by an information retrieval technique, such as latent semantic analysis. Co-occurrence of keywords may be used to associate prior calls with an incoming call to assist in understanding the purpose of the incoming call.
Owner:PANASONIC CORP

Ontology mapper

Systems, methods and computer-readable media are provided for facilitating patient health care by providing discovery, validation, and quality assurance of nomenclatural linkages between pairs of terms or combinations of terms in databases extant on multiple different health information systems that do not share a set of unified codesets, nomenclatures, or ontologies, or that may in part rely upon unstructured free-text narrative content instead of codes or standardized tags. Embodiments discover semantic structures existing naturally in documents and records, including relationships of synonymy and polysemy between terms arising from disparate processes, and maintained by different information systems. In some embodiments, this process is facilitated by applying Latent Semantic Analysis in concert with decision-tree induction and similarity metrics. In some embodiments, data is re-mined and regression testing is applied to new mappings against an existing mapping base, thereby permitting these embodiments to “learn” ontology mappings as clinical, operational, or financial patterns evolve.
Owner:CERNER INNOVATION

Latent semantic analysis for application in a question answer system

Active | US20140229163A1 | Improves obtaining similarity measure | Semantic analysis | Special data processing applications | Graphics | Question answer
A system and method that improve the similarity measure obtained between concepts with Latent Semantic Analysis by taking into account the graph structure derived from knowledge bases, using a vector propagation algorithm, within a context domain such as a medical domain. Concepts contained in a corpus of documents are expressed in a graph wherein each node is a concept and the edges between nodes express relations between concepts, weighted by the number of semantic relations determined from the corpus. A vector of neighbors is created and assigned to each concept, thereby providing an improved similarity measure between documents, i.e., corpus and query against corpus.
Owner:IBM CORP

Object-oriented high-resolution remote-sensing image classification method

The invention provides an object-oriented high-resolution remote-sensing image classification method. The method comprises the steps of S1, segmenting the images to be processed to obtain a plurality of subimage objects; S2, obtaining feature information of the subimage objects; and S3, classifying the subimage objects according to the obtained feature information, wherein the images to be processed are high-resolution remote-sensing images and the feature information of the subimage objects comprises their spectral, shape and texture information. On the basis of object-oriented classification, the method introduces a classification approach combining probabilistic latent semantic analysis and a support vector machine. This addresses the low identification rate in the prior art for 'the same features with different classifications' and 'the same classifications with different features', greatly improves the classification precision of high-resolution remote-sensing images, combines the advantages of latent semantic analysis (LSA) and probabilistic latent semantic analysis (PLSA), and effectively solves the problems of overfitting and local optima caused by random initialization.
Owner:UNIV OF ELECTRONIC SCI & TECH OF CHINA

Decisions with Big Data

This invention presents a framework for applying artificial intelligence to aid with product design, mission or retail planning. The invention outlines a novel approach for applying predictive analytics to the training of a system model for product design, assimilates the definition of meta-data for design containers to that of labels for books in a library, and represents customers, requirements, components and assemblies in the form of database objects with relational dependence. Design information can be harvested, for the purpose of improving decision fidelity for new designs, by providing such database representation of the design content. Further, a retrieval model, that operates on the archived design containers, and yields results that are likely to satisfy user queries, is presented. This model, which is based on latent semantic analysis, predicts the degree of relevance between accessible design information and a query, and presents the most relevant previous design information to the user.
Owner:IMAGARS

Space trajectory big data analysis-based person management and control method and system

The invention relates to the technical field of person behavior pattern analysis, and in particular to a person management and control method and system based on big data analysis of space trajectories. The method comprises the following steps: trajectory data of important persons are extracted, in that, for the important persons, space region or time range specified by a user, the trajectory address information of each person is queried from multiple databases by ID number; the geographic latitude and longitude coordinates corresponding to each trajectory address name are looked up in the address database, so that each important person is finally represented as a geographic coordinate sequence; the trajectory data are vectorized; latent semantic analysis is carried out on the trajectory patterns, in that singular value decomposition is applied to a matrix and the matrix is rebuilt through dimension reduction, the rebuilt matrix being the latent semantic matrix of the important persons' trajectory patterns; the important persons are clustered, and management and control tasks are assigned according to the clustering result. Compared with the prior art, latent connections among the important persons are mined from massive person trajectory data, and management and control tasks are assigned reasonably.
Owner:WEIHAI BEIYANG ELECTRIC GRP CO LTD BEIJING BRANCH
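
The "rebuild the matrix through dimension reduction" step in the abstract above corresponds to a rank-k reconstruction from the SVD. The sketch below is only an illustration, assuming NumPy and assuming that the vectorized trajectories form a person-by-place visit-count matrix; the abstract does not say how trajectories are vectorized, and the name `rebuild_low_rank` and the toy counts are hypothetical.

```python
import numpy as np

def rebuild_low_rank(matrix, k):
    """Rank-k reconstruction of a matrix from its k largest singular values."""
    u, s, vt = np.linalg.svd(np.asarray(matrix, dtype=float), full_matrices=False)
    return (u[:, :k] * s[:k]) @ vt[:k]

# Hypothetical visit counts: rows are persons, columns are places.
visits = [
    [3, 2, 0, 0],
    [2, 3, 0, 0],
    [0, 0, 1, 2],
    [0, 0, 2, 1],
]

latent = rebuild_low_rank(visits, k=2)
print(np.round(latent, 3))
```

In this toy case the rebuilt rows for the first two persons coincide, as do the rows for the last two, so a clustering step applied to the rows of `latent` would group them in pairs.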

Generalized latent semantic analysis

Inactive | US20070067281A1 | Facilitate document | Facilitate word-level processing operation | Digital data information retrieval | Digital data processing details | Algorithm | Paper document
One embodiment of the present invention provides a system that builds an association tensor (such as a matrix) to facilitate document and word-level processing operations. During operation, the system uses terms from a collection of documents to build an association tensor, which contains values representing pair-wise similarities between terms in the collection of documents. During this process, if a given value in the association tensor is calculated based on an insufficient number of samples, the system determines a corresponding value from a reference document collection, and then substitutes the corresponding value for the given value in the association tensor. After the association tensor is obtained, a dimensionality reduction method is applied to compute a low-dimensional vector space representation for the vocabulary terms. Document vectors are computed as linear combinations of term vectors.
Owner:PALO ALTO RES CENT INC
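
The three steps in the abstract above (substituting reference values for poorly sampled entries, reducing dimensionality, and forming document vectors as linear combinations of term vectors) can be sketched as follows. This is a hedged illustration, assuming NumPy: the abstract names only "a dimensionality reduction method", so the eigendecomposition used here is an assumption, and the function names, the `min_samples` threshold and all numbers are hypothetical.

```python
import numpy as np

def fill_sparse_entries(assoc, sample_counts, reference, min_samples):
    """Use the reference value wherever an association value was
    estimated from fewer than min_samples samples."""
    too_few = np.asarray(sample_counts) < min_samples
    return np.where(too_few, np.asarray(reference, dtype=float),
                    np.asarray(assoc, dtype=float))

def term_vectors(assoc, k):
    """k-dimensional term vectors from a symmetric association matrix.
    Assumption: eigendecomposition stands in for the unspecified
    dimensionality reduction method."""
    vals, vecs = np.linalg.eigh(assoc)
    top = np.argsort(vals)[::-1][:k]
    return vecs[:, top] * np.sqrt(np.clip(vals[top], 0.0, None))

def document_vector(term_weights, vectors):
    """Document vector as a weighted sum (linear combination) of term vectors."""
    return np.asarray(term_weights, dtype=float) @ vectors

# Hypothetical pairwise term similarities for a three-term vocabulary.
assoc = [[1.0, 0.8, 0.1], [0.8, 1.0, 0.0], [0.1, 0.0, 1.0]]
sample_counts = [[50, 40, 2], [40, 50, 1], [2, 1, 50]]
reference = [[1.0, 0.7, 0.2], [0.7, 1.0, 0.3], [0.2, 0.3, 1.0]]

filled = fill_sparse_entries(assoc, sample_counts, reference, min_samples=5)
vectors = term_vectors(filled, k=2)
doc = document_vector([2, 0, 1], vectors)  # term 0 twice, term 2 once
```

Only the entries backed by one or two samples are replaced; well-sampled entries keep their original values.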

Method and apparatus for preparing a document to be read by a text-to-speech reader

There is disclosed a method and system for preparing a document to be read by a text-to-speech reader. The method can include identifying two or more voice types available to the text-to-speech reader, identifying the text elements within the document, grouping related text elements together, and classifying the text elements according to voice types available to the text-to-speech reader. The method of grouping the related text elements together can include syntactic and intelligent clustering. The classification of text elements can include performing latent semantic analysis on the text elements and characteristics of the available voice types.
Owner:CERENCE OPERATING CO

Method for recommending problem based on probability latent semantic analysis

The invention discloses a question recommendation method based on probabilistic latent semantic analysis. The method describes the interest of a user through a state model from probabilistic latent semantic analysis and provides suitable question recommendations in a user-interactive question-answering system. The method adopts a ternary state model, combines the advantages of content-based and collaborative-filtering recommendation, recommends questions according to the personalized information of the user, and achieves high accuracy and good applicability in user-interactive question-answering systems.
Owner:ZHEJIANG UNIV

Text message extracting method and system

The invention provides a text message extracting method. The method comprises the steps of determining a target object; preprocessing the target object; constructing an LSA (latent semantic analysis) model according to the preprocessing result and digitizing the target object; clustering the digitized target object with the k-means clustering algorithm to obtain at least one cluster; extracting the message of each cluster with an LSA-based algorithm; and combining the extracted messages, so as to accurately extract the summary of a microblog. The invention also provides a text message extracting system which can accurately extract the summary of the microblog.
Owner:NAT UNIV OF DEFENSE TECH
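
The sequence of steps in the abstract above (LSA embedding, k-means clustering, one extracted message per cluster) can be sketched as below. This is a rough illustration, assuming NumPy. The abstract does not describe its "LSA-based algorithm" for extraction, so picking the post nearest each cluster centroid is an assumption, as are the function names, the fixed initial centroids and the toy post-by-term counts.

```python
import numpy as np

def lsa_embed(doc_term, k):
    """Rows are posts, columns are terms; returns k-dimensional post vectors."""
    u, s, _ = np.linalg.svd(np.asarray(doc_term, dtype=float), full_matrices=False)
    return u[:, :k] * s[:k]

def kmeans(points, init_indices, iterations=20):
    """Plain Lloyd's k-means, started from the points at init_indices."""
    centroids = points[list(init_indices)].copy()
    for _ in range(iterations):
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for c in range(len(centroids)):
            members = points[labels == c]
            if len(members):
                centroids[c] = members.mean(axis=0)
    return labels, centroids

def representatives(points, labels, centroids):
    """Assumed extraction rule: the post closest to each cluster centroid."""
    reps = {}
    for c, centroid in enumerate(centroids):
        idx = np.flatnonzero(labels == c)
        if len(idx):
            nearest = np.linalg.norm(points[idx] - centroid, axis=1).argmin()
            reps[c] = int(idx[nearest])
    return reps

# Hypothetical term counts: posts 0-2 share one vocabulary, posts 3-4 another.
posts = [
    [2, 1, 0, 0],
    [1, 1, 0, 0],
    [2, 2, 0, 0],
    [0, 0, 1, 2],
    [0, 0, 2, 2],
]

points = lsa_embed(posts, k=2)
labels, centroids = kmeans(points, init_indices=(0, 3))
reps = representatives(points, labels, centroids)
print(labels, reps)
```

The representative posts, one per cluster, would then be combined into the summary.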

Image scene classification method based on target and space relationship characteristics

The invention discloses an image scene classification method based on target and space relationship characteristics and relates to image scene classification technologies. The method comprises the steps of: defining a space relationship histogram, conducting representation on the space relationship between targets, comprising left, right, top, bottom, far, near, including and excluding, and giving a calculation method; labeling a target in a sample image, assigning the membership degree of the space relationship between any two targets, counting mathematical features of the membership degree of the space relationship between any two targets in the scene, classifying the space relationship histogram between the targets by using a fuzzy K neighbor classifier according to test images, and calculating the membership degree of the space relationship; establishing an image model by employing a probability latent semantic analysis model of the space relationship characteristics between fusion themes; and classifying the scene images by using a support vector machine. According to the method, the image is modeled by employing the probability latent semantic analysis model of the space relationship characteristics between fusion themes, and the scene images are classified through input of the support vector machine.
Owner:INST OF ELECTRONICS CHINESE ACAD OF SCI

Social advertising facing Twitter feasibility analysis method

Inactive | CN104268130A | Solve bottlenecks | Overcoming barriers posed by semantic analysis | Special data processing applications | Marketing | Topic analysis | Analysis method
A social advertising facing Twitter feasibility analysis method includes the steps of building a multi-source Twitter corpus by innovatively combining corpus information of different sources of Twitter users and effectively expanding Twitter short text to infer the potential advertising value of the content published by the users to further achieve precise advertising audience targeting; proposing a multi-source Twitter corpus theme analysis model for latent semantic analysis of the content published by the users; based on semantic analysis results, designing feature selection, filtering and presentation algorithms, constructing a logistic regression classifier, and classifying advertising feasibility used as the basis for decision making of advertising recommendation. The social advertising facing Twitter feasibility analysis method takes full advantage of characteristics of information published by the users and can accurately infer the potential advertising value. By means of the social advertising facing Twitter feasibility analysis method, inferred results conforming to the intent of the users can be obtained. The social advertising facing Twitter feasibility analysis method is applicable to advertising recommendation of social networking services, such as Twitter.
Owner:NANKAI UNIV

Greedy Active Learning for Reducing User Interaction

A method, system and computer-usable medium are disclosed for reducing user interaction when training an active learning system. Source input containing unlabeled instances and an input category are received. A Latent Semantic Analysis (LSA) similarity score and a search engine score are generated for each unlabeled instance, and these scores are in turn used with the input category to rank the unlabeled instances. If a first threshold for negative instances has been met, a first unlabeled instance, having the highest ranking, is selected for annotation from the ranked collection of unlabeled instances and provided to a user for annotation with a positive label. If a second threshold for positive instances has been met, then a second unlabeled instance, having the lowest ranking, is selected for annotation from the ranked collection of unannotated instances and automatically annotated with a negative label. The annotated instances are then used to train an active learning system.
Owner:IBM CORP

Cross-media information analysis and retrieval method

Inactive | CN102693321A | Alleviate the problem of excessive space complexity | Solve the problem of feature heterogeneity | Special data processing applications | Feature vector | Information analysis
The invention provides a cross-media information analysis and retrieval method, which comprises the following steps of: performing semantic integration processing on multimode information; performing expansion according to a probability latent semantic analysis model to obtain a multilayer continuous probability latent semantic analysis model for processing a continuous feature vector; learning the multilayer-continuous probability latent semantic analysis model by adopting an asymmetric learning method, and calculating the visual feature vector distribution of an image, the visual feature vector distribution of an audio and topic probability distribution; submitting a training set and a tested media object which serves as a retrieval example by a user, and calculating intra-mode and inter-mode initial similarity values of the image and the audio in the retrieval sample; constructing a propagation model, and updating the intra-mode and inter-mode similarity values according to the propagation model; and performing secondary retrieval according to the updated similarity values.
Owner:CHANGZHOU HIGH TECH RES INST OF NANJING UNIV

Latent semantic feature extraction method in aged user multi-biometric identity authentication

The invention relates to a latent semantic feature extraction method in aged user multi-biometric identity authentication. Identity authentication is performed by applying multi-mode latent semantic analysis and data mining mapping to aged user multi-biometric images and extracting the latent semantic features of the images. According to the method, multiple local bottom features can be acquired by extracting multi-biometric images of the face, multiple fingerprints, palm prints and the like; the extracted features can be processed with a multi-mode latent semantic analysis algorithm in three respects: bottom feature-image matrix construction, two-dimensional matrix decomposition, and clustering; the processed features are further mined and mapped through an 'intelligent black box model', so that the latent semantic features of the images can be effectively acquired; and the system is automatically adjusted by introducing an adaptive feedback structure with a genetic algorithm (GA), so that the latent semantic features of the images can be modified.
Owner:JIANGXI UNIVERSITY OF FINANCE AND ECONOMICS +1

Transfer learning method based on latent semantic analysis

The invention provides a transfer learning method based on latent semantic analysis. The method includes the following steps: carrying out stop word removal and stemming on training data; calculating the vocabulary weight in a source domain and the vocabulary weight in a target domain respectively to acquire a vocabulary-text matrix M; carrying out singular value decomposition on the matrix M; mapping vocabulary and text in the matrix M to a lower dimension latent semantic space; removing synonymy noise effect from the source domain; adjusting the matrix M structure; finding vocabulary with a high association degree with target domain text in the source domain as transfer terms; adjusting the matrix M structure again; analyzing target domain vocabulary in the adjusted matrix M structure to acquire new character representation of target domain data; acquiring a final classifier in a training dataset; and classifying a testing dataset S.
Owner:HARBIN ENG UNIV

Method for dynamic context scope selection in hybrid N-gram language modeling

A method and system for dynamic language modeling of a document are described. In one embodiment, a number of local probabilities of a current document are computed and a vector representation of the current document in a latent semantic analysis (LSA) space is determined. In addition, a number of global probabilities based upon the vector representation of the current document in an LSA space is computed. Further, the local probabilities and the global probabilities are combined to produce the language modeling.
Owner:APPLE INC
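
The abstract above says that local probabilities and LSA-based global probabilities are combined but does not say how. The sketch below assumes simple linear interpolation purely for illustration; it is not taken from the patent, and the function name `interpolate`, the `local_weight` parameter and the toy distributions are hypothetical.

```python
def interpolate(local_probs, global_probs, local_weight=0.5):
    """Linearly interpolate two next-word distributions over a shared vocabulary.
    Assumption: linear interpolation stands in for the unspecified combination."""
    if not 0.0 <= local_weight <= 1.0:
        raise ValueError("local_weight must lie in [0, 1]")
    vocabulary = set(local_probs) | set(global_probs)
    return {
        word: local_weight * local_probs.get(word, 0.0)
        + (1.0 - local_weight) * global_probs.get(word, 0.0)
        for word in vocabulary
    }

# Hypothetical next-word distributions.
local_probs = {"stock": 0.2, "bank": 0.5, "river": 0.3}   # from the n-gram context
global_probs = {"stock": 0.6, "bank": 0.3, "river": 0.1}  # from the LSA document vector

combined = interpolate(local_probs, global_probs, local_weight=0.5)
print(combined)
```

Because both inputs sum to one, the interpolated values also sum to one, so the result remains a valid distribution without renormalization.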

Text classification method based on WordNet and latent semantic analysis

A text classification method based on WordNet and latent semantic analysis relates to the field of computers. The method considers the synonyms, hypernyms and hyponyms of the words in a text and increases their word frequencies according to similarity, so that the influence of synonymy among words on classification is reduced. Unlike the common approach of extracting features from a feature matrix by a single method, the method obtains a plurality of feature matrices by adjusting the WordNet invocation parameters and uses a genetic algorithm (GA) together with latent semantic analysis (LSA) to complete feature extraction, so as to obtain better feature matrices and thereby improve classification.
Owner:BEIJING UNIV OF TECH

Image quality blind evaluation method

The invention provides an image quality blind evaluation method. The method comprises the steps of 1) extracting a characteristic of a quality-reduced image block in a training image, and estimating the offset between that characteristic and the characteristic of a non-quality-reduced image block; 2) analyzing different types of quality reduction with a probabilistic latent semantic analysis method, and mapping them to different theme distribution characteristics, wherein the types of quality reduction comprise single and hybrid quality reduction; 3) establishing a relation between the image theme distribution characteristic and image quality on the image training set with a machine learning method, thereby forming a quality blind evaluation model for hybrid quality-reduced images; and 4) evaluating the quality of quality-reduced images outside the training set by means of the quality blind evaluation model. The method improves accuracy in no-parameter quality evaluation, addresses the engineering problem of evaluating hybrid quality-reduced images, and is well suited to comprehensive evaluation of image acquisition, compression and transmission performance in a multimedia system when the original image cannot be acquired.
Owner:SHANGHAI ADVANCED RES INST CHINESE ACADEMY OF SCI