134 results about "Latent Dirichlet allocation" patented technology

In natural language processing, latent Dirichlet allocation (LDA) is a generative statistical model that allows sets of observations to be explained by unobserved groups that explain why some parts of the data are similar. For example, if observations are words collected into documents, it posits that each document is a mixture of a small number of topics and that each word's presence is attributable to one of the document's topics. LDA is an example of a topic model.
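The generative story described above can be sketched in a few lines of Python. This is only an illustrative sampler, not an inference routine: the function names, the symmetric priors `alpha` and `beta`, and the toy vocabulary are assumptions introduced here.

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet sample by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def sample_categorical(probs, rng):
    """Return an index drawn according to the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

def generate_corpus(vocab, num_topics, num_docs, doc_len,
                    alpha=0.5, beta=0.5, seed=0):
    """Sample documents from the LDA generative process (hypothetical helper)."""
    rng = random.Random(seed)
    # One word distribution per topic.
    topic_word = [sample_dirichlet([beta] * len(vocab), rng)
                  for _ in range(num_topics)]
    corpus = []
    for _ in range(num_docs):
        # One topic mixture per document; a small alpha favours few topics.
        doc_topic = sample_dirichlet([alpha] * num_topics, rng)
        words = []
        for _ in range(doc_len):
            z = sample_categorical(doc_topic, rng)      # topic for this word
            w = sample_categorical(topic_word[z], rng)  # word from that topic
            words.append(vocab[w])
        corpus.append(words)
    return corpus
```

Fitting LDA means running this process in reverse: given only the words, estimate the per-document topic mixtures and per-topic word distributions.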

Academic resource recommendation service system and method

The invention provides an academic resource recommendation service system and method. The method comprises the following steps: crawling academic resources on the internet with an LDA (Latent Dirichlet Allocation)-based focused crawler, classifying the academic resources into A preset types with an LDA-based text classification model, and storing them in a local academic resource database. The system further comprises an academic resource model, a resource quality value calculation module and a user interest module. A tracking software module is implanted at the user terminal. Combining the user's subjects of interest and historical browsing behavior data, the academic resource model and the user interest module are each modeled along four dimensions: academic resource type, subject theme distribution, keyword distribution and LDA latent theme distribution. The similarity between the academic resource model and the user interest preference module is calculated and combined with the resource quality value to obtain the recommendation degree, and Top-N academic resource recommendation is finally performed for the user according to the recommendation degree. The method performs personalized, accurate recommendation of academic resources according to the identity, interests and browsing behavior of users, and improves the working efficiency of scientific research personnel.
Owner:NINGBO UNIV
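One way to picture the recommendation degree in the abstract above is the sketch below. The abstract does not give the formula, so the per-dimension weights, the linear blend with the quality value, and all names (`recommendation_degree`, `top_n`, `quality_weight`) are assumptions made purely for illustration.

```python
import math

# The four dimensions named in the abstract (key names are hypothetical).
DIMENSIONS = ("resource_type", "subject", "keywords", "lda_topics")

def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def recommendation_degree(resource, user, quality, dim_weights,
                          quality_weight=0.3):
    """Blend weighted per-dimension similarity with the quality value.

    The linear blend is an assumption, not the patent's formula.
    """
    sim = sum(dim_weights[d] * cosine(resource[d], user[d])
              for d in DIMENSIONS)
    return (1 - quality_weight) * sim + quality_weight * quality

def top_n(resources, user, n, dim_weights):
    """Return the ids of the n resources with the highest degree."""
    scored = [(recommendation_degree(r["model"], user, r["quality"],
                                     dim_weights), r["id"])
              for r in resources]
    scored.sort(reverse=True)
    return [rid for _, rid in scored[:n]]
```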

Multi-level-point-set characteristic extraction method applicable to ground laser radar point cloud classification

The invention relates to a multi-level-point-set characteristic extraction method applicable to ground laser radar point cloud classification. Based on point set characteristics, it achieves high-precision classification of four kinds of common ground objects in a scene: pedestrians, trees, buildings and automobiles. First, point sets are constructed: the point cloud is re-sampled at different scales and clustered to form point sets of different sizes with a layered structure, and the characteristics of each point in the point sets are obtained. Next, an LDA (Latent Dirichlet Allocation) method is used to synthesize the point-based characteristics of all points in each point set into shape characteristics of that point set. Finally, based on the shape characteristics of the point sets, an AdaBoost classifier is trained on the point sets of different levels to obtain a classification result for the whole point cloud. The method achieves higher classification precision overall and, for pedestrians and vehicles, a precision far higher than that of point-based characteristics, Bag-of-Words-based characteristics and characteristics based on probabilistic latent semantic analysis (PLSA).
Owner:BEIJING NORMAL UNIVERSITY

Clustering method for network behavior habits based on K-means and LDA (Latent Dirichlet Allocation) two-way authentication

The invention discloses a clustering method for network behavior habits based on K-means and LDA (Latent Dirichlet Allocation) two-way authentication. The method combines the webpage properties, keywords and frequencies in people's internet browsing records with a K-means algorithm, an LDA document topic extraction model and an annealing algorithm. The clustering method comprises the following steps: first, performing K-means clustering and LDA document topic extraction model generation on a staff-label-frequency set and a person browsing record-person-keyword set; second, storing and calculating an intermediate result, and then performing K-means and LDA two-way authentication by using the annealing algorithm; finally, calculating a global best topic-classification label sequence and optimizing the network behavior habit clustering result by taking that sequence as a reference. The K-means and LDA two-way authentication improves sensitivity to person-classification labels, and the annealing algorithm improves the efficiency of optimizing the clustering result, which in turn improves clustering accuracy.
Owner:HUAIYIN INSTITUTE OF TECHNOLOGY

LDA (latent dirichlet allocation) and VSM (vector space model) based similar Chinese herb literature recommendation method

Active | CN103823848A | Fast and efficient similar recommendation | Robust | Special data processing applications | Lexical item | Vector space model
The invention discloses an LDA (latent Dirichlet allocation) and VSM (vector space model) based similar Chinese herb literature recommendation method. The method includes: using IKAnalyzer to perform word segmentation on the topics and summary information of the literature on the basis of a terminological dictionary for Chinese herbs; constructing a vector space and reducing its dimensionality; constructing a semantic dictionary and numbering all lexical items in it in sequence; vectorizing each document on the basis of the semantic dictionary to construct its term vector; training LDA with a Gibbs sampling algorithm to obtain the probability distribution of each document over themes; computing the similarity between every two documents from the KL divergence; computing the cosine similarity of the documents' term vectors on the basis of term frequency; jointly weighting the two kinds of similarity before sorting by similarity; and then making recommendations. With this method, literature that is similar in both content and theme can be recommended to users, and the recommendation results are closer to user requirements.
Owner:ZHEJIANG UNIV
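The joint weighting of a KL-based theme similarity and a term-vector cosine similarity could look roughly like the following sketch. The abstract does not say how the divergence is turned into a similarity or how the two scores are weighted, so the symmetrized KL, the `1 / (1 + d)` mapping and the `weight` parameter are assumptions for illustration.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions, smoothed to avoid log(0)."""
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q))

def topic_similarity(p, q):
    """Map the symmetrized KL divergence into (0, 1] (assumed mapping)."""
    sym = 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))
    return 1.0 / (1.0 + sym)

def cosine_similarity(u, v):
    """Cosine similarity of two term-frequency vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def joint_similarity(topics_a, topics_b, terms_a, terms_b, weight=0.5):
    """Weighted combination of theme-level and term-level similarity."""
    return (weight * topic_similarity(topics_a, topics_b)
            + (1 - weight) * cosine_similarity(terms_a, terms_b))
```

Sorting candidate documents by `joint_similarity` against a query document, highest first, would then give the recommendation order.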

Method for identifying place image on the basis of improved probabilistic topic model

The invention discloses a method for identifying a place image on the basis of an improved probabilistic topic model, belonging to the technical field of image identification. The method addresses the uncertainty in image identification caused by different viewing angles, illumination, and highly dynamic changes of people and objects. The method comprises the following steps: an image acquiring step, an image preprocessing step, a feature extraction step, a feature clustering step, a feature distribution step and a potential topic modeling step. In the feature extraction step, the features of the image are extracted with the SIFT (scale invariant feature transform) algorithm; in the feature clustering step, all the features are clustered to obtain a plurality of clustering centers; in the feature distribution step, the features of each image vote for the clustering centers to obtain a frequency vector over the clustering centers; in the potential topic modeling step, the potential topic distribution of the image is learned with the improved probabilistic topic model; and a classifier is then used to identify images of unknown places. According to the invention, a quantization function is added to an LDA (latent Dirichlet allocation) model, and the potential topics of the image are learned through the improved probabilistic topic model, so that identification performance is effectively improved while real-time operation is preserved.
Owner:BEIJING UNIV OF TECH
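The feature distribution step above, in which each image's features vote for clustering centers, can be sketched as a nearest-center histogram. This is a minimal illustration assuming squared Euclidean distance and hard assignment; the function names are hypothetical and the patent's own quantization function is not reproduced here.

```python
def nearest_center(feature, centers):
    """Index of the clustering center closest to the feature vector."""
    def squared_distance(center):
        return sum((f - c) ** 2 for f, c in zip(feature, center))
    return min(range(len(centers)), key=lambda i: squared_distance(centers[i]))

def frequency_vector(features, centers):
    """Count how many of an image's features vote for each center."""
    counts = [0] * len(centers)
    for feature in features:
        counts[nearest_center(feature, centers)] += 1
    return counts
```

The resulting count vector plays the role of a word-count vector for the image, which is the kind of input a topic model such as LDA expects.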

Academic resource acquisition method based on LDA (latent Dirichlet allocation)

The invention provides an academic resource acquisition method based on LDA (latent Dirichlet allocation). The method uses a topical crawler together with an LDA topic model. First, a training corpus is provided to the LDA topic model, which is trained to obtain a topical document. On top of a general web crawler, the topical crawler adds a topic determination module, a similarity calculation module and a URL (Uniform Resource Locator) priority ranking module. During crawling, the topical document guides the calculation of topical similarity, and URLs whose topical similarity is greater than a set threshold are selected. The topical crawler maintains a URL queue of webpages that have not been accessed, accesses the webpage of each URL in the ranked order of the queue, crawls the corresponding academic resources, and stores them in a database after classification labeling, continuing until the queue of unaccessed webpages is empty. An open API (Application Program Interface) to the academic resource database is provided for display and calling. By incorporating machine learning into academic resource acquisition, the method improves acquisition quality and efficiency.
Owner:NINGBO UNIV
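The crawl loop described above (threshold filter, priority-ranked queue of unvisited URLs, stop when the queue is empty) might be sketched as follows. `fetch_links` and `similarity` are hypothetical callables standing in for page fetching and the LDA-guided topical similarity, neither of which is specified in the abstract.

```python
import heapq

def crawl(seed_urls, fetch_links, similarity, threshold):
    """Visit URLs in descending topical-similarity order.

    fetch_links(url) -> iterable of outgoing URLs (hypothetical callable)
    similarity(url)  -> topical similarity score   (hypothetical callable)
    """
    # heapq is a min-heap, so scores are negated to pop the best URL first.
    # Seeds get the top score 1.0 so they are visited before anything else.
    frontier = [(-1.0, url) for url in seed_urls]
    heapq.heapify(frontier)
    seen = set(seed_urls)
    visited = []
    while frontier:  # stop once no unvisited URL remains
        _, url = heapq.heappop(frontier)
        visited.append(url)  # a real crawler would fetch and store here
        for link in fetch_links(url):
            if link in seen:
                continue
            seen.add(link)
            score = similarity(link)
            if score > threshold:  # keep only sufficiently on-topic URLs
                heapq.heappush(frontier, (-score, link))
    return visited
```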

Legal provision recommendation method based on LDA (Latent Dirichlet Allocation) topic model

The invention discloses a legal provision recommendation method based on an LDA (Latent Dirichlet Allocation) topic model. The method comprises the following steps: extracting a judgment document set to construct a training corpus; preprocessing the judgment documents; preprocessing the case situation; training the LDA topic model to extract the set of judgment documents similar to the case situation; extracting a recommended legal provision set, designing a legal provision association degree scoring mechanism to calculate the association degree between each legal provision and the case, and combining it with frequent item set mining to find associated legal provisions; and outputting a recommended legal provision list. Preprocessing a judgment document comprises the following steps: extracting the basic situation paragraph and the quoted legal provision list of the case, carrying out Chinese word segmentation on the basic situation of the case, removing legal-domain stop words, and standardizing legal provision names. The method simulates the real practice in which a judge looks up similar judgment documents to decide which legal provisions to quote. It measures the similarity of judgment documents at the semantic level, so similar judgment documents can be obtained accurately, associated legal provisions can be recommended, and recommendation accuracy is improved.
Owner:NANJING UNIV
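The abstract does not define its association degree scoring mechanism. As a purely hypothetical illustration of the idea, the sketch below scores each provision by the share of total similarity contributed by the similar judgment documents that quote it; the function name, the scoring rule and the input shape are all assumptions, and frequent item set mining is not covered.

```python
from collections import defaultdict

def rank_provisions(similar_docs, top_k=3):
    """Rank provisions quoted by similar judgment documents.

    similar_docs: list of (similarity, [provision names]) pairs.
    Returns up to top_k (provision, score) pairs, best first.
    """
    total = sum(sim for sim, _ in similar_docs)
    if total == 0:
        return []
    scores = defaultdict(float)
    for sim, provisions in similar_docs:
        for provision in set(provisions):  # count each document once
            scores[provision] += sim
    # Sort by score descending, then by name for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(name, score / total) for name, score in ranked[:top_k]]
```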

Aurora image classification method based on latent theme combining with saliency information

Inactive | CN103632166A | Improve uniformity | Avoid the pitfall of extracting its features | Character and pattern recognition | Support vector machine | Documentation procedure
The invention discloses an aurora image classification method based on a latent theme combined with saliency information, and mainly solves the problems that existing classification techniques have low accuracy, low efficiency and a narrow range of application. The implementation steps are: (1) preprocessing an aurora image, extracting visual words from the preprocessed image and generating a visual document; (2) using a spectral residual algorithm to obtain an aurora saliency map of the input aurora image, extracting visual words from the saliency map and generating a visual document of the saliency map; (3) concatenating the visual documents from steps (1) and (2) to generate a semantically enhanced document of the aurora image, and inputting that document into a Latent Dirichlet Allocation model to obtain the saliency-information latent semantic distribution characteristics (SM-LDA) of the aurora image; (4) inputting the SM-LDA characteristics into a support vector machine to obtain the final classification result. The method is applicable to scene classification and target recognition; it maintains high classification accuracy while shortening classification time and improving classification efficiency.
Owner:XIDIAN UNIV

Financial public opinion perception method based on weighted LDA (latent Dirichlet allocation) topic model

The invention discloses a financial public opinion perception method based on a weighted LDA (latent Dirichlet allocation) topic model, belonging to the technical fields of natural language understanding and processing and of network public opinion. Everyday financial public opinion is perceived from microblog data related to everyday finance and quantified by an 'everyday financial public opinion comprehensive index'. The index is a weighted average of the emotion values of all finance-related blog posts on a given day, and each emotion value is the result of text emotion classification of the post's content. Text emotion classification uses an SVM (support vector machine) classification model based on weighted LDA, where the weighted LDA builds the hidden topic space used to represent text. Objective data that indirectly reflects investor sentiment and subjective data that directly reflects it are combined through a new lexical item weight calculation method, which improves understanding of texts at the semantic level and yields better text emotion classification.
Owner:BEIJING INSTITUTE OF TECHNOLOGY
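The comprehensive index described above is a weighted average of per-post emotion values. A minimal sketch follows; the abstract does not say what the weights are, so the `(emotion_value, weight)` input shape and the function name are assumptions.

```python
def daily_sentiment_index(posts):
    """Weighted average of emotion values for one day.

    posts: list of (emotion_value, weight) pairs; how a post's weight is
    derived (for example from reposts or reach) is an assumption here.
    """
    total_weight = sum(weight for _, weight in posts)
    if total_weight == 0:
        return 0.0
    return sum(value * weight for value, weight in posts) / total_weight
```

For example, one positive post with weight 3 and one negative post with weight 1 give `(3 - 1) / 4`, a mildly positive index.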

Personalized travel package recommendation method based on demand classification and subject analysis

The invention relates to a personalized travel package recommendation method based on demand classification and subject analysis. The method includes: analyzing the demands a user enters in natural language, using word segmentation, demand classification and other natural language processing techniques to process and classify them into the user's rigid demands, flexible demands and negative demands; then using an LDA (latent Dirichlet allocation) document theme generation model to cluster individual travel services into different service fields by theme similarity, and matching them against the user demands by similarity to obtain the service list that best matches user expectations; and finally designing and recommending a travel package with a travel package optimized recommendation algorithm: first acquiring the set of scenic spots for the package according to the user's time demand and service priority information; then combining location information, preference information and the like to determine the hotel service; then calculating the optimal journey for each day according to the distance function L and ranking the travel packages by travel package recommendation index; and finally selecting catering services according to scenic spot location and user preference. Through this integrated processing, a personalized travel package that best meets the user's demands can be designed and recommended.
Owner:TSINGHUA UNIV