14 results about "Text cluster" patented technology


Methods, apparatus, devices, and storage media for aggregating cross-page test questions

Pending | CN122290156A | Image extraction | Theoretical computer science

This application relates to the field of computer technology and discloses a method, apparatus, device, and storage medium for cross-page test question aggregation. The method includes acquiring multiple test question images, each image including test question elements; extracting text blocks from the test question images and grouping text blocks belonging to the same test question and the same test question element into a text cluster, obtaining a text cluster set; searching for a baseline text cluster and multiple candidate text clusters in the text cluster set, where candidate text clusters are those related to the baseline text cluster; generating label codes for each text cluster in the baseline and candidate text clusters based on the test question element to which the text cluster belongs, the text blocks within the text cluster, and the position of the text blocks in the test question images; and searching for text clusters belonging to the same test question as the baseline text cluster in the candidate text clusters based on the label codes, obtaining a test question text cluster set. The method of this application can achieve automatic clustering of test question elements with high clustering accuracy.


Owner:HANGZHOU ZHIJUAN PLANET TECHNOLOGY CO LTD

A keyword reverse propagation algorithm based on citation network structure

Active | CN116680397B | Natural language data processing | Other databases indexing | Algorithm | Theoretical computer science

The application provides a keyword reverse propagation algorithm based on a citation network structure, and comprises the following steps: a spring charge model is established, force-directed layout processing is performed, and a force-directed layout graph is established; a keyword propagation model is established by using a reverse propagation algorithm, and a keyword weight change contrast curve is obtained; a citation network model is constructed by using the force-directed layout graph; iterative calculation is performed on the force-directed layout graph until the energy state in the force-directed layout graph reaches a minimum value; while the citation network is iteratively calculated, the keyword weight in the citation network model is adjusted, and a converged citation network layout graph is calculated. The application has the beneficial effects that the entire network is taken as a main body, keywords owned by cited documents are selected at a certain probability and are propagated backward along the network to the citing documents, the weight of the keywords is increased while the text clustering idea is retained, and clear visual data effects are obtained through force-directed layout.


Owner:UNICLOUD TECH CO LTD

A cross-modal retrieval method based on prototype regularization learning

Pending | CN122346555A | Pattern recognition | Information processing

The application provides a cross-modal retrieval method based on prototype regularization learning, which belongs to the field of artificial intelligence and multi-modal information processing and comprises the following steps: extracting visual features and text features through a visual encoder and a text encoder and calculating a basic contrastive loss; dividing visual clusters and text clusters through a clustering algorithm and calculating visual prototypes and text prototypes, selecting the text features most similar to the visual prototypes as anchor points, and constructing cross-modal prototypes; calculating a prototype-level discriminant loss from the visual prototypes and the text prototypes, an instance-level discriminant loss from the visual clusters and the text clusters, and a prototype projection loss from the cross-modal semantic prototypes; constructing a joint optimization objective function by combining all the losses, training the visual encoder and the text encoder, and performing cross-modal retrieval after training is completed. The application addresses the excessive intra-class variance and excessively high inter-class similarity that appearance bias causes in existing cross-modal retrieval methods, which in turn lead to insufficient retrieval accuracy and robustness.


Owner:SHANGHAI EYE DISEASE PREVENTION & TREATMENT CENTER

A clustering-based text supervision checking method, device and computer equipment

Active | CN115269839B | Guaranteed accuracy | Save computing resources | Digital data information retrieval | Natural language data processing | Cluster algorithm | Algorithm

The application relates to a clustering-based text supervision checking method and device and a computer device. Feature word sets corresponding to a plurality of question description texts in a question list are obtained by respectively extracting feature words from the question description texts. The word frequency of each feature word in the question description text, the first inverse document frequency in all question description texts and the second inverse document frequency in all checked units are respectively calculated to obtain a feature word weight set of the question description text of each checked unit. A plurality of question description text clusters and a plurality of initial high-correlation-degree feature word sets corresponding to the question description text clusters are obtained through clustering. Finally, the initial high-correlation-degree feature word set is optimized according to a chi-square statistic to obtain an accurate high-correlation-degree feature word set. The method saves the calculation resources while ensuring the accuracy of the results by optimizing the feature word weight calculation, adopting a clustering algorithm for preliminary clustering and optimizing the clustering results through chi-square statistics.


Owner:NAT UNIV OF DEFENSE TECH
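
The feature word weighting step in the abstract above (CN115269839B) combines a word frequency with two inverse document frequencies, one over all question description texts and one over all checked units. The abstract does not give the formula, so the following is only a minimal sketch under assumptions: the three factors are multiplied, the IDF uses the common smoothed form `log((1 + N) / (1 + df)) + 1`, and the function name `feature_word_weights` and the input layout are hypothetical.

```python
import math
from collections import Counter

def feature_word_weights(texts_by_unit):
    """texts_by_unit maps a checked unit to its tokenized question texts.

    Returns, per unit, one {word: weight} dict per text. The product
    TF * IDF(texts) * IDF(units) is an assumed combination, not the
    patent's stated formula.
    """
    all_texts = [t for texts in texts_by_unit.values() for t in texts]
    n_texts, n_units = len(all_texts), len(texts_by_unit)

    # First document frequency: number of texts containing the word.
    text_df = Counter(w for t in all_texts for w in set(t))
    # Second document frequency: number of checked units containing the word.
    unit_df = Counter()
    for texts in texts_by_unit.values():
        unit_df.update({w for t in texts for w in t})

    def idf(n, df):
        # Smoothed inverse document frequency (an assumption).
        return math.log((1 + n) / (1 + df)) + 1.0

    weights = {}
    for unit, texts in texts_by_unit.items():
        weights[unit] = []
        for t in texts:
            tf = Counter(t)
            weights[unit].append({
                w: (count / len(t))
                   * idf(n_texts, text_df[w])
                   * idf(n_units, unit_df[w])
                for w, count in tf.items()
            })
    return weights
```

Under this weighting, a word that appears in every text of every unit keeps only its term frequency, while a word concentrated in one unit is boosted by both IDF factors.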

A cross-domain technology fusion opportunity finding method based on science and technology fusion interaction

Pending | CN122388920A | Technology fusion | Degree of similarity

The application belongs to the technical field of data mining, and specifically discloses a cross-field technology fusion opportunity finding method based on science and technology fusion interaction, which comprises the following steps: obtaining scientific paper and technology patent data, identifying cross-field scientific fusion and technology fusion based on semantic anchor points and a similarity threshold; using a trained deep text clustering model to respectively perform deep clustering on scientific fusion and technology fusion, and obtaining corresponding clustering sets; obtaining text representation corresponding to each cluster, filtering through a threshold, and establishing association between two types of clusters through bipartite graph maximum weight matching to identify shared topics and exclusive topics; using a dynamic time warping method to align the time series of scientific and technological trends, judging topics with scientific convergence preceding technological convergence, and identifying technology fusion opportunities with growth potential based on the development trend of scientific papers. The application can deeply mine cross-field science and technology semantics, improve clustering accuracy, quantify science and technology association, and dynamically find forward-looking technology fusion opportunities.


Owner:SOUTHWEST JIAOTONG UNIV
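
The abstract above (CN122388920A) uses dynamic time warping to align the time series of scientific and technological trends and to judge whether scientific convergence precedes technological convergence. As a rough illustration only, the sketch below implements textbook dynamic time warping with an absolute-difference cost and returns the warping path; the function name `dtw_path` and the idea of reading the lead or lag from the path are assumptions, not details taken from the patent.

```python
def dtw_path(a, b):
    """Classic dynamic time warping between two numeric series.

    Returns (total_cost, path), where path is a list of index pairs (i, j)
    aligning a[i] with b[j].
    """
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j - 1],  # match
                                 cost[i - 1][j],      # a advances
                                 cost[i][j - 1])      # b advances

    # Backtrack from the end along the cheapest predecessors.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        best = min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])
        if best == cost[i - 1][j - 1]:
            i, j = i - 1, j - 1
        elif best == cost[i - 1][j]:
            i -= 1
        else:
            j -= 1
    path.reverse()
    return cost[n][m], path
```

For a science series `a` and a technology series `b`, the offsets `j - i` along the path indicate how far the technology trend trails the science trend at each aligned point; a consistently positive offset would suggest that science leads.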

Short text clustering and fuzzy recognition algorithm based on large-scale network online subgraph sampling

Active | CN121958560B | Digital data information retrieval | Semantic analysis | Pattern recognition | Linguistic model

This invention provides a short text clustering and fuzzy recognition algorithm for large-scale online subgraph sampling, comprising the following steps: Step S1, extraction and preprocessing of training samples; Step S2, construction of the neural network; Step S3, overall clustering prediction; Step S4, fuzzy sample recognition; Step S5, retraining of the neural network. This invention combines short text clustering with a large language model, which not only improves clustering accuracy but also enables the handling of clustering tasks with different themes and classification requirements, significantly reducing the manual cost of data annotation. Furthermore, this invention can annotate fuzzy samples for classification, using K-nearest neighbors combined with minimum spanning trees to assist subgraph sampling in the selection scheme. This utilizes sparse structure to reduce computational costs and exposes the fluctuations of boundary samples through spectral clustering, providing a more comprehensive perspective for fuzzy sample selection and improving the accuracy and interpretability of the clustering results.


Owner:RENMIN UNIVERSITY OF CHINA +1

Deep Learning-Based Text Key Feature Extraction System and Method

Active | CN120913219B | Feature extraction | External data

This invention relates to the field of text extraction, specifically to a deep learning-based system and method for extracting key text features. The system includes a text filtering module, a graph construction module, a deep learning module, a cluster feature module, and an input fusion module. The text filtering module merges text clusters, the graph construction module generates a stereo model, the deep learning module expands the model using clustered text, the cluster feature module prunes the stereo model, and the input fusion module captures new text features to generate fused text. This invention provides a visual interface for training complex text, enhances the machine's ability to analyze the coherence of long texts, improves the generalization ability of machine learning, is suitable for processing large amounts of complex, high-dimensional text data, reduces external data requirements, adapts to changes in data distribution, reduces the computational burden on the model, and achieves stable language frameworks and precise fusion of text features.


Owner:BEIJING CETEN EDUCATION TECH GRP CO LTD

Public opinion dynamic monitoring and early warning system based on text clustering analysis

Active | CN122173656B | Early warning system | Event evolution

The present application relates to the technical field of public opinion monitoring, and in particular to a public opinion dynamic monitoring and early warning system based on text clustering analysis. In the clustering process, the semantic centers of historical event clusters are introduced as constraint conditions, so that the semantic continuity of existing events is taken into account when public opinion texts are classified, and event misjudgment or duplicate event generation caused by short-term changes in expression is avoided. A drift suppression mechanism based on the change amplitude of the semantic center adjusts unstable event clusters, improving the stability and accuracy of event evolution modeling. Event clusters are associated across time windows to form stable event clusters, which enables tracking and management of public opinion events. Public opinion event risk is modeled from the text scale, the emotional mean, the emotional heterogeneity, and the evolution characteristics of scale and emotion, combined with a low-sample compensation mechanism, so that high-risk events are not ignored and the risk of mature events is not overstated, improving the timeliness and accuracy of public opinion early warning.


Owner:KUNMING UNIV OF SCI & TECH

A label acquisition method and device based on an LLM model and a medium

Active | CN121303112B | Avoid missing | Avoid situations | Data mining | Processing

The application relates to the technical field of data processing, and in particular to a label acquisition method and device based on an LLM model, and a medium. The method comprises the following steps: processing an original text set provided by a target server to obtain an original text cluster set; acquiring, from the original text cluster set, a first label set and a second label set corresponding to it; and processing the first label set and the second label set to obtain an original target label set. The first label set consists of labels produced by a large model, and the second label set consists of labels produced by rules. On the one hand, labels can be generated quickly through the rules and the large model, which avoids manual label allocation and improves processing efficiency. On the other hand, generating labels in two different ways avoids missing or inaccurate labels, and also avoids inconsistent label allocation caused by differences in the subjective judgment of the same operator.


Owner:BEIJING SHOUFA INTELLIGENT TECHNOLOGY CO LTD

A cross-language text clustering method and system based on high-dimensional vector space manifold alignment

Pending | CN122364969A | Theoretical computer science | Cross lingual

The application provides a cross-language text clustering method and system based on high-dimensional vector space manifold alignment, and relates to the technical field of natural language processing. The method comprises the following steps: constructing source language and target language feature matrices through feature extraction, generating a pseudo-anchor point matrix by unsupervised topological matching, performing space centering, and executing orthogonal Procrustes analysis; based on the resulting orthogonal transformation matrix, scaling factor, and translation vector, performing a rigid body transformation on the source language feature matrix, fusing it with the target language feature matrix to construct a unified manifold space, and performing clustering to obtain the semantic cluster division of the cross-language text. The application addresses a problem in the prior art, where forced space mapping that depends on large-scale parallel corpora destroys the local topological structure of the source language semantic space, which reduces cross-language text clustering accuracy and causes clustering drift. It achieves topological coincidence of texts in different languages within a unified space and improves cross-language text clustering accuracy.


Owner:DOCUMENT & INFORMATION CENT OF CHINESE ACAD OF SCI
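
The alignment step in the abstract above (CN122364969A) centers both feature sets and then solves a Procrustes problem for a rotation, a scaling factor, and a translation vector. A high-dimensional solution would normally use a singular value decomposition; to stay dependency-free, the sketch below covers only the two-dimensional, rotation-only special case, where the optimum has a closed form. The function names and the 2D restriction are assumptions made for illustration and are not taken from the patent.

```python
import math

def fit_similarity_2d(src, dst):
    """Least-squares scale, rotation angle, and shift mapping src onto dst.

    src and dst are equal-length lists of (x, y) anchor pairs. Reflections
    are not considered in this simplified 2D case.
    """
    n = len(src)
    sx = sum(p[0] for p in src) / n
    sy = sum(p[1] for p in src) / n
    dx = sum(p[0] for p in dst) / n
    dy = sum(p[1] for p in dst) / n

    dot = cross = norm = 0.0
    for (x, y), (u, v) in zip(src, dst):
        # Space centering: subtract each set's mean.
        x, y, u, v = x - sx, y - sy, u - dx, v - dy
        dot += x * u + y * v
        cross += x * v - y * u
        norm += x * x + y * y
    if norm == 0.0:
        raise ValueError("source points are all identical")

    theta = math.atan2(cross, dot)            # optimal rotation angle
    scale = math.hypot(dot, cross) / norm     # optimal scaling factor
    c, s = math.cos(theta), math.sin(theta)
    shift = (dx - scale * (c * sx - s * sy),  # translation vector
             dy - scale * (s * sx + c * sy))
    return scale, theta, shift

def apply_similarity_2d(points, scale, theta, shift):
    """Apply x -> scale * R(theta) * x + shift to each point."""
    c, s = math.cos(theta), math.sin(theta)
    return [(scale * (c * x - s * y) + shift[0],
             scale * (s * x + c * y) + shift[1]) for x, y in points]
```

Once the source points are mapped with `apply_similarity_2d`, they share a coordinate frame with the target points, and any ordinary clustering routine can be run on the combined set.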

A short text clustering method based on multi-view alignment and optimal transmission

Pending | CN122153059A | Digital data information retrieval | Semantic analysis | Pattern recognition | Cluster algorithm

The application provides a short text clustering method based on multi-view alignment and optimal transmission. The method first acquires short text samples and applies data enhancement to obtain enhanced short text samples. Both kinds of samples are input into an encoder, which outputs original text embeddings and enhanced text embeddings, from which an original semantic view, an explicit enhanced view, and a negative sample view are generated. Initial cluster centers are obtained through a k-means clustering algorithm, and a sentence-level structure view and a cluster-level structure view are derived from the original text embeddings and the initial cluster centers. Contrastive learning is performed on the obtained views, and clustering of the original text embeddings, the enhanced text embeddings, and the initial cluster centers is optimized to obtain a total loss function. The encoder is trained with the total loss function, and after training, real-time short text samples are input to cluster the short text. The application obtains a richer multi-granularity semantic representation in semantically sparse short text scenarios and improves the separability and stability of cluster boundaries.


Owner:SUN YAT SEN UNIV

Deep semantic diversification training data generation method and system based on large language model

Pending | CN122346546A | Feature vector | Linguistic model

The application discloses a deep semantic diversification training data generation method and system based on a large language model. The method comprises the following steps: obtaining a reference text set, mapping it to a vector space to obtain semantic feature vectors, and clustering them to obtain at least one text cluster; for each text cluster, extracting its deep semantic features in the dimensions of theme, sentiment, and language style through a large language model to generate semantic guidance information; building a prompt word template containing a task description part, a label type explanation part, and a sample demonstration part; embedding the semantic guidance information into the prompt word template to form the target prompt word, inputting it into the large language model to generate training samples, and, after effectiveness verification, updating the prompt word template by appending the valid samples to the sample demonstration part; and repeating until the number of samples reaches the preset requirement, then aggregating the samples into a training dataset. The training data generated by the present application has deep semantic diversity and can effectively improve the performance of downstream natural language processing models.


Owner:NANJING UNIV OF AERONAUTICS & ASTRONAUTICS
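
The generation loop in the abstract above (CN122346546A) can be pictured as a prompt template with three parts that grows its demonstration section as valid samples arrive. The following is a hedged sketch only: `PromptTemplate`, `generate_dataset`, the guidance dictionary keys, and the `generate` and `is_valid` callables (stand-ins for the large language model call and the effectiveness check) are all hypothetical names, and the prompt wording is invented for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class PromptTemplate:
    task_description: str
    label_explanation: str
    demonstrations: list = field(default_factory=list)

    def render(self, guidance):
        """Embed one cluster's semantic guidance into the template."""
        demos = "\n".join(f"- {d}" for d in self.demonstrations) or "- (none yet)"
        return (
            f"Task: {self.task_description}\n"
            f"Labels: {self.label_explanation}\n"
            f"Theme: {guidance['theme']}\n"
            f"Sentiment: {guidance['sentiment']}\n"
            f"Style: {guidance['style']}\n"
            f"Examples:\n{demos}\n"
        )

def generate_dataset(template, guidance_per_cluster, generate, is_valid,
                     target_size, max_rounds=100):
    """Cycle over clusters until target_size valid samples are collected.

    generate(prompt) stands in for the language model; is_valid(sample)
    stands in for the effectiveness verification. max_rounds is a safety
    cap added for this sketch.
    """
    dataset = []
    for _ in range(max_rounds):
        for guidance in guidance_per_cluster:
            if len(dataset) >= target_size:
                return dataset
            sample = generate(template.render(guidance))
            if is_valid(sample):
                dataset.append(sample)
                # Valid samples become demonstrations for later prompts.
                template.demonstrations.append(sample)
    return dataset
```

In practice the demonstration list would likely need a size limit so that prompts stay within the model's context window; the sketch leaves it unbounded for brevity.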

Methods for determining video tags, methods for displaying video list information, and related devices

Active | CN116958971B | Metadata video data retrieval | Biological models | Computer graphics (images) | Cluster labeling

This invention provides a method for determining video tags, a method for displaying video list information, and related apparatus. The method includes: acquiring interactive text posted for a target video; clustering the interactive text based on the text feature similarity to obtain one or more interactive text clusters and clustering tags for each interactive text cluster; and determining topic tags for the target video based on the clustering tags of the interactive text clusters. This method can determine more intelligent, detailed, and diverse video tags.


Owner:BEIJING QIYI CENTURY SCI & TECH CO LTD
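
The abstract above (CN116958971B) clusters the interactive text posted for a video by text feature similarity, derives a tag per cluster, and picks topic tags from those cluster tags. The patent does not name a similarity measure or tagging rule, so the sketch below substitutes simple stand-ins: Jaccard similarity over word sets, greedy single-pass clustering, and the most frequent non-stopword as the cluster tag. All function names and the threshold value are assumptions.

```python
from collections import Counter

def jaccard(a, b):
    """Jaccard similarity between two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cluster_comments(comments, threshold=0.3):
    """Greedy single-pass clustering: join the most similar cluster or open a new one."""
    clusters = []
    for text in comments:
        tokens = set(text.lower().split())
        best = max(clusters, key=lambda c: jaccard(tokens, c["vocab"]), default=None)
        if best is not None and jaccard(tokens, best["vocab"]) >= threshold:
            best["members"].append(text)
            best["vocab"] |= tokens
        else:
            clusters.append({"members": [text], "vocab": set(tokens)})
    return clusters

def cluster_tag(cluster, stopwords):
    """Most frequent non-stopword in the cluster, or '' if none remains."""
    counts = Counter(
        w for text in cluster["members"]
        for w in text.lower().split() if w not in stopwords
    )
    return counts.most_common(1)[0][0] if counts else ""

def topic_tags(comments, stopwords, top_k=2, threshold=0.3):
    """Tags of the top_k largest comment clusters, largest first."""
    clusters = cluster_comments(comments, threshold)
    clusters.sort(key=lambda c: len(c["members"]), reverse=True)
    return [cluster_tag(c, stopwords) for c in clusters[:top_k]]
```

A production system would more plausibly compare embedding vectors than raw word overlap; the word-set version is used here only because it needs no external model.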
