37 results about "Cybertext" patented technology

Cybertext is the organization of text in order to analyze the influence of the medium as an integral part of the literary dynamic, as defined by Espen Aarseth in 1997. Aarseth defined it as a type of ergodic literature.

Multi-label text classification method based on BiGRU and attention mechanism

The invention provides a multi-label text classification method based on BiGRU and an attention mechanism. The method comprises the following steps: S1, acquiring a plurality of web texts; S2, preprocessing the plurality of web texts; S3, extracting deep information features of the web text by using the pre-trained word vector; S4, adding corresponding weights to the deep information features according to an attention mechanism; S5, performing probability classification of different types of labels on the data obtained in step S4 by using BiGRU; and S6, outputting the probability of each web text on different types of labels. The pre-trained word vectors shorten the training time, and the attention mechanism lets the neural network focus on important information to improve the classification effect; compared with the prior art, fusing the BiGRU and the attention mechanism enables the method to reach the same high accuracy with less training time.
Owner:SHANGHAI MUNICIPAL ELECTRIC POWER CO
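
The attention weighting in step S4 can be pictured as a softmax over per-token relevance scores followed by a weighted sum. The sketch below is only an illustration in plain Python: the dot-product scoring, the `query` vector and the toy features are assumptions, since the abstract does not say how the weights are computed.

```python
import math

def attention_pool(features, query):
    """Weight each token feature by softmax(dot(feature, query)) and sum them."""
    # Assumed scoring function: dot product with a query vector.
    scores = [sum(f * q for f, q in zip(feat, query)) for feat in features]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(features[0])
    pooled = [sum(w * feat[i] for w, feat in zip(weights, features))
              for i in range(dim)]
    return weights, pooled

# Hypothetical 2-dimensional features for three tokens.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [1.0, 1.0]
weights, pooled = attention_pool(features, query)
```

The third token scores highest against the query, so it receives the largest weight and dominates the pooled vector.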

Chinese short text classification method based on graph attention network

The invention discloses a Chinese short text classification method based on a graph attention network, and the method comprises the following steps: preprocessing text data, and obtaining a word list set corresponding to a text; text feature extraction: carrying out word embedding processing on the word list set corresponding to the text by adopting a feature embedding tool to obtain a corresponding word vector; carrying out mapping by adopting a graph structure, and establishing a heterogeneous graph by taking the text and words in the text as graph nodes; establishing a graph attention network text classification model; adopting an open-source Chinese short text data set with category annotation as the training corpus, and training the graph attention network text classification model on the heterogeneous graph; processing the node features through a softmax classification layer to obtain the final classification category; and outputting the category to which the text belongs. According to the invention, the text features can be fully extracted even when the short text carries little information, the model focuses on information with high value for text classification, and the classification accuracy is effectively improved.
Owner:JINAN UNIVERSITY

Text classification method for open network questions in specific field

The invention belongs to the technical field of text classification processing, and particularly relates to a text classification method for open network questions in a specific field. The method addresses the lack of sufficient corpus sets with category marks, the low information content of network text and the high noise encountered when executing open network text classification tasks in certain specific fields, and provides a new method for hierarchical classification of open network questions in those fields. According to the method, open network questions and written texts in a specific domain are utilized to make the word embedding representation in the domain better conform to domain knowledge features, and meanwhile, a semi-supervised method is used for accelerating classification model training and reducing the required marked samples; in addition, category classification at a multi-granularity level is realized in combination with conditional probability. The method can assist the extraction, discrimination and construction of data in fields such as question answering systems, sentiment analysis and domain knowledge bases.
Owner:HARBIN ENG UNIV

Text keyword automatic extraction method and device based on co-occurrence language network

The invention discloses a text keyword automatic extraction method and device based on a co-occurrence language network. The method and device overcome the need of supervised machine learning for a large amount of manually annotated data, overcome the weak generalization ability of language analysis methods, and avoid the tendency of statistical methods to ignore low-frequency but very important keywords. The method comprises the steps of preprocessing a webpage, constructing a language network graph model, jointly extracting candidate keyword features, comprehensively sorting the candidate keyword features and outputting the keywords. According to the method, web text preprocessing, co-occurrence language network model construction, keyword feature joint extraction and candidate keyword sorting optimization are carried out, so that the extracted keywords have good readability, coherence and correlation, and can be widely applied to the fields of natural language processing, information retrieval, text mining, sentiment analysis, multi-mode human-computer interaction and the like.
Owner:SICHUAN JIUZHOU ELECTRIC GROUP

Entity attribute value extraction method based on bidirectional long-short-term memory network

The invention belongs to the technical field of network text data processing, and particularly relates to an entity attribute value extraction method based on a bidirectional long-short-term memory network, which comprises the following steps: step 1, preprocessing a document set; step 2, adopting category mapping to identify attribute values from statements containing entities; step 3, performing deep syntactic analysis on the sentences containing the entities and the attribute values, and extracting related sentence components as training corpora; and step 4, performing vector conversion on the training corpus by adopting a word vector model, training BLSTM model parameters in combination with syntactic features, and classifying the entities and the attribute values into a given attribute name category. According to the method, a bidirectional long-short-term memory network is adopted, so that the relationship among entities, attribute names and attribute values can be accurately judged.
Owner:INST OF ELECTRONICS & INFORMATION ENG OF UESTC IN GUANGDONG

Domain adaptation-based word division method of network text

The invention discloses a domain adaptation-based word division method of social network text. Through building an integrated neural network and using a self-training learning method, cross-domain news corpus and labeled data and unlabeled data in a social network are utilized to train an integrated neural network model. The method specifically comprises: dividing the social network text into labeled and unlabeled datasets, and using the datasets as input; using the news domain corpus as source corpus, and pre-training source classifiers on the news source corpus; integrating the source classifiers through a manner of assigning weights to the source classifiers; using the social network corpus to train the integrated neural network model; and utilizing the well-trained integrated neural network model to carry out prediction, and thus improving an effect of word division of the social network. The method can be used to solve the problem of a poor effect caused by very insufficient data in the social network, and can effectively improve the effect of word division of the social network text.
Owner:PEKING UNIV

Entity relationship classification model construction method and device and storage medium

The invention relates to the field of data processing, and discloses an entity relationship classification model construction method and device and a storage medium. The method comprises the steps of: obtaining web text data; as preprocessing, performing word segmentation and dependency syntactic analysis on text sentences and extracting a plurality of features; inputting the plurality of features, together with the trained word vector vocabulary, into an Embedding layer for training, and mapping the features into a low-dimensional vector representation; inputting the low-dimensional vector representation of the sentence into a Bi-GRU layer for training; carrying out multiple self-attention calculations on the output vectors of all time steps of the Bi-GRU layer by utilizing a Multi-head attention layer to obtain features of more layers in different representation subspaces, so as to obtain more context information of sentences; and extracting text information and position information of keywords in the entity relationship classification task through the capsule network layer, and classifying the entity relationship through the length of a capsule vector. By combining Multi-head attention with a capsule network, the entity relationship classification model makes more effective use of entity and position information, improving both the relationship classification effect and the sensitivity to position information.
Owner:TAIYUAN UNIV OF TECH

Portrayal generation method and system based on guide network text classification and medium

The invention discloses a portrait generation method and system based on guide network text classification and a medium, and can be widely applied to the technical field of computers. According to the method, the features extracted by the feature extractor and the text label are input into the guide network together, so that the semantic association relationship between the student text and the text label is obtained through the guide network, and the parameters of the feature extractor can be adjusted according to the semantic association relationship; after the parameters of the feature extractor meet a first preset requirement, the parameters of a sorting classifier are adjusted according to the current text features and the text labels of the feature extractor; and after the parameters of the sorting classifier meet a second preset requirement, the target label of the current student text is obtained through the feature extractor meeting the first preset requirement and the sorting classifier meeting the second preset requirement, so that a more accurate student portrait can be obtained according to the target label.
Owner:ZHEJIANG NORMAL UNIVERSITY

Network hot topic discovery method and system

The present invention proposes a network hot topic discovery method and system based on sparse matrix decomposition and a topic model. The method comprises: constructing a word co-occurrence matrix for web texts, for massive web texts, removing low-frequency lexical terms when the document size reaches a certain level, and using the words to construct the word co-occurrence matrix; carrying out sparse non-negative matrix decomposition, carrying out decomposition on the word co-occurrence matrix to obtain a lexical term-topic matrix and a document topic matrix, and taking the matrices as the next input; and taking the two matrices obtained by matrix decomposition as the initial condition to input into the topic model, obtaining the optimal solution through the expectation maximization algorithm, and carrying out steps such as sorting the number of documents with specific topics, and the like to finally realize the discovery of network hot topics.
Owner:THE 28TH RES INST OF CHINA ELECTRONICS TECH GROUP CORP
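
The first stage above, building a word co-occurrence matrix after dropping low-frequency terms, might look like the following sketch. The document-level co-occurrence window, the `min_count` threshold and the toy documents are assumptions for illustration; the abstract does not fix any of them, and the later matrix decomposition and topic model stages are not shown.

```python
from collections import Counter
from itertools import combinations

def cooccurrence(docs, min_count=2):
    """Count document-level co-occurrence of terms, ignoring rare terms."""
    freq = Counter(term for doc in docs for term in doc)
    # Remove low-frequency terms before building the matrix.
    vocab = sorted(t for t, c in freq.items() if c >= min_count)
    index = {t: i for i, t in enumerate(vocab)}
    matrix = [[0] * len(vocab) for _ in vocab]
    for doc in docs:
        kept = sorted({t for t in doc if t in index})
        for a, b in combinations(kept, 2):
            matrix[index[a]][index[b]] += 1
            matrix[index[b]][index[a]] += 1  # keep the matrix symmetric
    return vocab, matrix

# Hypothetical tokenized web texts.
docs = [["flood", "rescue", "city"], ["flood", "rescue"], ["city", "election"]]
vocab, matrix = cooccurrence(docs)
```

Here "election" appears only once and is dropped, while "flood" and "rescue" co-occur in two documents.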

Positive and negative emotion analysis method, terminal equipment and storage medium

The invention relates to a positive and negative emotion analysis method, terminal equipment and a storage medium. The method comprises the steps of S1, constructing and maintaining industry keyword rules and industry emotion dictionaries corresponding to different industries; S2, judging whether the to-be-analyzed text data contains keywords contained in the industry keyword rules, and if so, entering S3; otherwise, entering S4; S3, after all the industries to which the text data belongs are determined according to the industries to which the keywords belong and the industry keyword rules corresponding to those industries, calculating the emotion score of each industry according to its industry emotion dictionary, and then obtaining the positive and negative emotion analysis result of the text data; and S4, obtaining the positive and negative emotion analysis result of the text data through the trained machine learning model. By fusing industry sentiment words with a machine learning model, the method handles web text by divide and conquer and improves the analysis effect.
Owner:XIAMEN MEIYA PICO INFORMATION
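
The S1 to S4 routing can be sketched as a dictionary lookup with a model fallback. Everything concrete below is hypothetical: the rule and lexicon contents, the whitespace tokenizer, the "neutral" outcome for a zero score, and `stub_model`, which merely stands in for the trained machine learning model.

```python
def analyze(text, keyword_rules, sentiment_dicts, fallback_model):
    """Score text with industry lexicons when a keyword matches, else use the model."""
    tokens = text.lower().split()
    # S2: find every industry whose keyword rule matches the text.
    industries = [ind for ind, keywords in keyword_rules.items()
                  if any(k in tokens for k in keywords)]
    if not industries:
        return fallback_model(text)  # S4: no industry keyword found
    # S3: sum lexicon scores over all matched industries.
    score = sum(sentiment_dicts[ind].get(tok, 0)
                for ind in industries for tok in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"  # assumption: the abstract only names positive and negative

# S1: hypothetical rules and lexicons for a single industry.
keyword_rules = {"finance": {"stock", "loan"}}
sentiment_dicts = {"finance": {"surge": 1, "default": -1}}

def stub_model(text):
    return "positive"  # placeholder for a trained classifier

result_up = analyze("stock prices surge", keyword_rules, sentiment_dicts, stub_model)
result_down = analyze("loan default risk", keyword_rules, sentiment_dicts, stub_model)
result_other = analyze("terrible weather", keyword_rules, sentiment_dicts, stub_model)
```

The last call contains no industry keyword, so its label comes from the stub rather than from any lexicon.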

Public opinion data analysis model based on deep learning

The invention relates to a multitask text analysis method that combines CNN-LSTM text sentiment analysis with automatic abstract extraction by textrank based on word2vector. The method comprises the steps of obtaining massive network text data to be tested; first, preprocessing the network text data and inputting it into an LSTM-CNN neural network, in which the long short-term memory network, a classical method for processing text sequences, produces a vector representing the context, and the CNN further extracts higher-dimensional, effective features; then sending the features into softmax for multi-classification, so that the positive or negative sentiment direction of a text is obtained; second, segmenting the input text data into sentences, constructing a graph model with a textrank algorithm based on word embedding, and calculating the similarity between the sentences to serve as the weights of the edges; calculating sentence scores, sorting them in descending order, and extracting the several most important sentences as candidate abstract sentences; and finally, displaying the analysis result in the form of a report. The multi-task text data processing model gives public opinion monitoring high accuracy and high efficiency, and text analysis precision is improved by training two neural networks.
Owner:SUN YAT SEN UNIV
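
The abstract-extraction half of this entry ranks sentences on a similarity graph. The sketch below shows that idea in plain Python under a stated substitution: it uses the word-overlap similarity from the original TextRank formulation rather than the word-embedding similarity the abstract describes, and the sentences, damping factor and iteration count are illustrative assumptions.

```python
import math

def similarity(a, b):
    """Word-overlap similarity between two sentences (swapped in for embeddings)."""
    wa, wb = set(a.split()), set(b.split())
    if len(wa) < 2 or len(wb) < 2:
        return 0.0  # avoids a zero denominator from log(1)
    return len(wa & wb) / (math.log(len(wa)) + math.log(len(wb)))

def textrank(sentences, damping=0.85, iterations=50):
    """Iterate weighted PageRank over the sentence similarity graph."""
    n = len(sentences)
    weights = [[similarity(sentences[i], sentences[j]) if i != j else 0.0
                for j in range(n)] for i in range(n)]
    out_sum = [sum(row) for row in weights]
    scores = [1.0] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) + damping * sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n) if out_sum[j] > 0)
            for i in range(n)
        ]
    return scores

sentences = [
    "the flood damaged the city center",
    "rescue teams reached the city center after the flood",
    "the election was postponed",
]
scores = textrank(sentences)
# Sort sentence indices by score, highest first, to pick abstract candidates.
ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
```

The sentence about the election shares almost no vocabulary with the other two, so it ends up last in `ranked`.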

Network text sentiment analysis method based on Bi-GRU neural network and self-attention mechanism

The invention relates to a network text sentiment analysis method based on a BiGRU neural network and a self-attention mechanism, and belongs to the technical field of information. The method comprises the following steps: S1, acquiring network text information, and encoding a text by utilizing a word embedding vector; S2, summarizing forward and backward information of sentences through a Bi-GRU network layer, and then combining the information from the two directions to obtain a final implicit vector; S3, inputting the obtained implicit vector into a multi-layer perceptron to obtain a new implicit representation, then calculating the importance of each word with a word-level context vector, which is randomly initialized and learned jointly during training; and S4, multiplying the implicit vector of each word by a corresponding weight obtained through a self-attention layer, and then performing text sentiment classification through an improved softmax layer. According to the method, the web text sentiment classification accuracy can be effectively improved.
Owner:CHONGQING UNIV OF POSTS & TELECOMM

Joint detection method for drug name and adverse drug reaction in web text

The invention relates to the technical field of text processing, in particular to a joint detection method for drug names and adverse drug reactions in web texts, and the method comprises the following steps: extracting local context information of the web texts to obtain local context representations of words in the web texts; extracting global context information of the web text to obtain global context representation of the web text; and based on the local context representation and the global context representation, utilizing a pre-trained classification model to identify an actual category of the web text, and obtaining a detection effect of the drug name and the adverse drug reaction in the web text according to the actual category. Therefore, the detection effect of the drug name and the adverse drug reaction in the web text is effectively improved.
Owner:TSINGHUA UNIV

Specific domain-oriented network user group division method and device

The invention relates to a specific domain-oriented network user group division method and device, and the method comprises the steps: extracting semantic clue information corresponding to one or more domains from collected network text data; determining an account sequence corresponding to each account in the account set according to the target semantic clue information of the target domain corresponding to the account set and the association relationship between the accounts in the account set, the account sequence comprising a plurality of accounts taking each account as a starting account; generating a second feature vector corresponding to each account according to the first feature vector of the account in the account sequence; and dividing accounts included in the account set into a plurality of network user groups according to the second feature vector. According to the method and the apparatus, the technical problem of relatively low accuracy when the network user groups are divided is solved.
Owner:北京市公安局 +1

Operation method, device and equipment of neural network text translation model and medium

Embodiments of the invention disclose an operation method and device of a neural network text translation model, an electronic device and a storage medium. The neural network text translation model comprises an encoder layer, an attention mechanism layer and a decoder layer, and the method comprises the steps: inputting a source language vocabulary sequence into the encoder layer for processing to form an implicit structure vector; controlling an attention mechanism layer to generate a vocabulary alignment table; inputting the implicit structure vector and a context vector when each vocabulary is translated into a decoder layer for processing to generate a target language vocabulary sequence; acquiring unknown characters in the target language vocabulary sequence, and determining source language vocabularies in the source language vocabulary sequence corresponding to the unknown characters according to the vocabulary alignment table; translating the source language vocabulary to obtain a target language vocabulary; and replacing the unknown characters in the target language vocabulary sequence by the target language vocabularies, so the unknown characters in the translation result can be reduced or even completely eliminated.
Owner:JIANGSU SUNYU INFORMATION TECH CO LTD
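
The unknown-character replacement described above can be sketched as a post-processing pass over the decoder output. This is an illustration only: taking the argmax of each attention row as the alignment, the `<unk>` marker, the bilingual `dictionary` lookup and the fallback of copying the source word are all assumptions not stated in the abstract.

```python
def alignment_from_attention(attention):
    """Map each target position to the source position with the highest weight."""
    return {t: max(range(len(row)), key=row.__getitem__)
            for t, row in enumerate(attention)}

def replace_unknowns(source, target, alignment, dictionary, unk="<unk>"):
    """Replace each unknown target token with a translation of its aligned source word."""
    result = []
    for pos, token in enumerate(target):
        if token == unk and pos in alignment:
            src_word = source[alignment[pos]]
            # Assumed fallback: copy the source word if no translation is known.
            result.append(dictionary.get(src_word, src_word))
        else:
            result.append(token)
    return result

# Hypothetical romanized source sentence, decoder output and attention weights.
source = ["wo", "xihuan", "liulian"]
target = ["I", "like", "<unk>"]
attention = [[0.8, 0.1, 0.1], [0.1, 0.7, 0.2], [0.05, 0.15, 0.8]]
dictionary = {"liulian": "durian"}

alignment = alignment_from_attention(attention)
fixed = replace_unknowns(source, target, alignment, dictionary)
```

The third target position aligns to "liulian", so the unknown marker is replaced by its dictionary translation.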

Method and device for processing a set of related words

Active · CN106649334B · Effect: perfecting the collection · Tags: Special data processing applications; Text database clustering/classification; Cybertext; Part of speech
The invention discloses a conjunction word set processing method and device, wherein the processing method comprises the steps of crawling a web text from a target data source on the basis of conjunction words in a conjunction word set of an object to be analyzed; performing word segmentation on the web text to obtain a plurality of text vocabularies, and obtaining the vocabulary information of each text vocabulary, wherein the vocabulary information includes conjunction index data of each text vocabulary and / or information of part of speech of each text vocabulary, and the conjunction index data is used for indicating the conjunction degree of each text vocabulary and the conjunction words; screening the conjunction index data of a plurality of text vocabularies and / or information of part of speech of a plurality of text vocabularies, and obtaining the screened conjunction vocabularies; and updating the conjunction word set by using the screened conjunction vocabularies. The method and the device provided by the invention solve the technical problem of small vocabulary quantity of the existing word bag accumulating method.
Owner:BEIJING GRIDSUM TECH CO LTD

Network text segmenting method based on genetic algorithm

The invention discloses a network text segmenting method based on the genetic algorithm, used for segmenting short network texts. The method comprises the following steps of: evaluating a Latent Dirichlet allocation (LDA) model corresponding to a corpus by using a Gibbs sampling method, inferring latent topic information using the model, representing texts by using the latent topic information; then transforming a text-segmenting process into a multi-target optimum process by using a parallel genetic algorithm, and calculating the coherency of segmented units, the divergence among the segmented units and fitness functions by using deeper semantic information; and carrying out the genetic iteration of the text segmenting process, and determining whether the segmenting process terminates based on the similarity among multi-iteration results or the upper limit of iterations to obtain the global optimal solution for segmenting the texts. Therefore, the invention improves the accuracy for segmenting the short network texts.
Owner:NANTONG LONGXIANG ELECTRICAL APPLIANCE EQUIP +1

A public opinion calculation and deduction method and system for social event network texts

The invention discloses a public opinion calculation and deduction method and system for social event network texts, and relates to the technical field of network text processing. The method comprises: acquiring social event network texts; preprocessing the social event network texts to obtain character features, word features and implicit features of the network social event text; inputting the character features, word features and implicit features into a trained social emotion computing model and a trained text emotion computing model, respectively, for prediction to obtain six emotion probabilities of the social event network text; and determining the emotional orientation of the social event network text from the six emotion probabilities by a voting mechanism. The method and system provided by the invention can determine the final emotional orientation of a social event network text by analyzing its multiple emotions.
Owner:SHANGHAI UNIV

Geographic text corpus labeling method based on feature evaluation and keyword similarity

The invention provides a geographic text corpus tagging method based on feature evaluation and keyword similarity, and high-quality geographic field tagging corpus is obtained. The method comprises the following steps: crawling web texts by utilizing a crawler technology to obtain a knowledge base and a corpus; preprocessing the corpus to obtain cleaned corpora; aligning the knowledge base and the corpus according to the entity pairs in the text; calculating sentence feature words; calculating weights of the words in the geographic entity pairs; selecting a word with the maximum weight as a relational word; generating a word vector by using a Word2Vec model; calculating the similarity between the relational words in the sentences and the relational words in the knowledge base; and finding out the relational word with the maximum similarity and carrying out corpus tagging to finally obtain a statement tagged with an entity and a relation type.
Owner:UNIV OF ELECTRONICS SCI & TECH OF CHINA +1
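
The final matching step, comparing a sentence's relational word with the relational words in the knowledge base, comes down to a nearest-neighbour search over word vectors. A minimal sketch, assuming cosine similarity as the measure and using made-up 2-dimensional vectors and relation names in place of real Word2Vec output:

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def best_relation(candidate_vec, kb_vectors):
    """Return the knowledge-base relation whose vector is most similar."""
    return max(kb_vectors, key=lambda name: cosine(candidate_vec, kb_vectors[name]))

# Hypothetical vectors for knowledge-base relational words.
kb_vectors = {"located_in": [1.0, 0.0], "flows_through": [0.0, 1.0]}
# Hypothetical vector for the relational word picked from a sentence.
candidate = [0.9, 0.1]
label = best_relation(candidate, kb_vectors)
```

The sentence would then be tagged with its entity pair and the relation type held in `label`.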

A text deduplication method and system

The invention provides a text duplicate removal method. The method comprises the following steps: a step of preprocessing data of a target text, a step of generating a local sensitive hash value of a main body of the target text and a local sensitive hash value of a title of the target text, and a duplicate removal step. For unique characteristics of a network text, a policy applying a SimHash algorithm is adjusted, so that the better effect and the higher robustness are achieved when the duplicate removal is performed by taking an event behind a news text as a subject.
Owner:重庆电信系统集成有限公司 +1
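
A basic SimHash with separate fingerprints for title and body can be sketched as below. This is the textbook unweighted SimHash, not the adjusted policy the abstract refers to; the SHA-256 token hash, the 64-bit width, the whitespace tokenizer and the distance threshold are assumptions.

```python
import hashlib

def simhash(tokens, bits=64):
    """Build a locality-sensitive fingerprint by voting on each bit position."""
    counts = [0] * bits
    for tok in tokens:
        digest = hashlib.sha256(tok.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")  # first 64 bits of the token hash
        for i in range(bits):
            counts[i] += 1 if (h >> i) & 1 else -1
    return sum(1 << i for i in range(bits) if counts[i] > 0)

def hamming(a, b):
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")

def is_near_duplicate(doc_a, doc_b, max_distance=3):
    """Treat two texts as duplicates when both title and body fingerprints are close."""
    title_gap = hamming(simhash(doc_a["title"].split()), simhash(doc_b["title"].split()))
    body_gap = hamming(simhash(doc_a["body"].split()), simhash(doc_b["body"].split()))
    return title_gap <= max_distance and body_gap <= max_distance

# Hypothetical news item.
doc = {"title": "city flood update", "body": "rescue teams reached the city center"}
same = is_near_duplicate(doc, dict(doc))
```

Requiring both fingerprints to match is one possible policy; a looser rule could accept a match on either the title or the body.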

Mass data-based causal group extraction method and system, and computer readable storage medium

The invention discloses a causal group extraction method and system based on mass data and a computer readable storage medium. The method comprises the following steps: acquiring a web text and storing the web text according to a time period; uniformly sampling the obtained web text to obtain a sample set, and pre-labeling the sample set; performing event labeling in a BIO format and causal relationship labeling on the pre-labeled text set; training the BERT + CRF model by using the data obtained by marking; carrying out causal extraction on the stored web text by utilizing a BERT + CRF model, and forming a triple in a preset format; clustering the triple through a clustering algorithm to obtain a causal group; and performing selection and reduction processing on the obtained causal group, and storing the reduced causal group. According to the method, the causality extraction accuracy is improved, noise data, redundant data and isolated data in an extraction result are reduced, and the method has relatively high reliability.
Owner:GUANGZHOU DATASTORY INFORMATION TECH CO LTD
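
Once a tagger has produced BIO labels, turning them into a cause and effect triple is a small decoding step. The sketch below covers only that step; the `CAUSE` and `EFFECT` label names, the example sentence and the triple layout are hypothetical, and the BERT + CRF model that would produce the tags is not shown.

```python
def decode_bio(tokens, tags):
    """Group tokens into (label, text) spans according to their BIO tags."""
    spans, current, label = [], [], None
    for tok, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if current:
                spans.append((label, " ".join(current)))
            current, label = [tok], tag[2:]
        elif tag.startswith("I-") and current and tag[2:] == label:
            current.append(tok)
        else:
            # An "O" tag or an inconsistent "I-" tag closes the open span.
            if current:
                spans.append((label, " ".join(current)))
            current, label = [], None
    if current:
        spans.append((label, " ".join(current)))
    return spans

# Hypothetical tagger output for one sentence.
tokens = ["heavy", "rain", "caused", "severe", "flooding"]
tags = ["B-CAUSE", "I-CAUSE", "O", "B-EFFECT", "I-EFFECT"]

spans = decode_bio(tokens, tags)
by_label = dict(spans)
triple = (by_label["CAUSE"], "causes", by_label["EFFECT"])
```

Triples of this shape are what a clustering step could then group into causal groups.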

Public opinion calculation and deduction method and system for social event web text

The invention discloses a public opinion calculation and deduction method and system for a social event web text, and relates to the technical field of web text processing. The method comprises the following steps: obtaining the social event web text; preprocessing the social event web text to obtain web social event text character features, web social event text word features and web social event text implicit features; respectively inputting the web social event text character features, the web social event text word features and the web social event text implicit features into a trained social emotion calculation model and a trained text emotion calculation model for prediction to obtain six emotion probabilities of the social event web text; and determining emotion orientation of the social event web text by adopting a voting mechanism method according to the six emotion probabilities of the social event web text. According to the method and the system provided by the invention, the final emotional orientation of the social event web text can be determined by analyzing its various emotions.
Owner:SHANGHAI UNIV
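
The abstract does not say which voting rule combines the model outputs. One common choice is soft voting, which averages the probability vectors and picks the highest-scoring emotion; the sketch below assumes that rule, and the six emotion names and the two probability vectors are hypothetical.

```python
# Hypothetical label set; the abstract only says there are six emotions.
EMOTIONS = ["joy", "anger", "sadness", "fear", "surprise", "disgust"]

def soft_vote(prob_lists):
    """Average the probability vectors and return the winning emotion."""
    n = len(prob_lists)
    avg = [sum(p[i] for p in prob_lists) / n for i in range(len(EMOTIONS))]
    best = max(range(len(EMOTIONS)), key=avg.__getitem__)
    return EMOTIONS[best], avg

# Made-up outputs of the social emotion model and the text emotion model.
social_probs = [0.10, 0.50, 0.20, 0.10, 0.05, 0.05]
text_probs = [0.30, 0.30, 0.20, 0.10, 0.05, 0.05]
orientation, averaged = soft_vote([social_probs, text_probs])
```

Hard voting, where each model casts one vote for its top emotion, would be an equally plausible reading and needs a tie-breaking rule when only two models vote.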

Text Fine-grained Sentiment Generation Method Based on Probabilistic Reasoning and Sentiment Cognition

The present invention relates to a text fine-grained emotion generation method based on probabilistic reasoning and emotional cognition, comprising the following steps: Step 1: prepare the text data set required by the training method; Step 2: process the text data set; Step 3: extract and construct the sentiment evaluation variables used by the Bayesian network; Step 4: according to the characteristics of network text, add emotional evaluation variables based on emoji and word frequency; Step 5: build an emotional knowledge base; Step 6: build a common sense knowledge base; Step 7: assign values to the emotion evaluation variables; Step 8: learn the network structure of the Bayesian network for emotion generation; Step 9: learn the parameters; Step 10: complete the construction of the emotion generation method. The invention uses emotion recognition to address the hidden emotions that other emotion generation methods ignore, uses the Bayesian network to calculate the probability of each emotion category, compares these probabilities, and generates one or more emotions.
Owner:福昕鲲鹏(北京)信息科技有限公司

Answer selection method and device based on web text, server and storage medium

The embodiment of the invention relates to the field of artificial intelligence, and discloses an answer selection method and device based on a web text, a server and a storage medium. The method comprises the following steps: selecting M candidate paragraphs related to a question from a pre-stored document; respectively splicing and encoding M first features to obtain M vectors corresponding to the M first features, each first feature comprising a candidate paragraph and the question; and interacting the M vectors through a self-attention mechanism to obtain M updated vectors. The self-attention mechanism attends to the internal correlation between the data; using it to update the M vectors takes the mutual influence among the vectors corresponding to the features into account, so that the updated vectors are more accurate and N golden paragraphs can be selected more accurately from the M candidate paragraphs corresponding to the M updated vectors; and according to the N golden paragraphs, the position information of the answer is acquired more accurately.
Owner:EAST CHINA UNIV OF SCI & TECH

Topic web crawler method and system based on text classification

The invention provides a topic web crawler method based on text classification. The method comprises the following steps: S1, receiving a topic, and initializing a URL task queue; S2, taking out a crawler task from the URL task queue, and obtaining network document content; S3, obtaining the classification of the network document content; S4, judging whether the classification is the same as the topic; if so, extracting the URLs in the network document and adding them to the URL task queue; if not, and tasks remain in the URL task queue, returning to step S2; and S5, repeating steps S2-S4 until no task remains in the URL task queue. According to the invention, the classification model can learn the features of the web text, and the classification accuracy of the classification task can be effectively improved.
Owner:BEIJING INSTITUTE OF TECHNOLOGY +1
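
Steps S1 to S5 describe a queue-driven crawl that only expands pages whose predicted class matches the topic. A minimal sketch over an in-memory page graph follows; the `fetch`, `classify` and `extract_links` callables, the `seen` set, the `max_pages` limit and the toy pages are all assumptions standing in for real HTTP fetching and a trained text classifier.

```python
from collections import deque

def crawl(topic, seeds, fetch, classify, extract_links, max_pages=100):
    """Expand only pages classified as the topic, until the queue is empty."""
    queue, seen, matched = deque(seeds), set(seeds), []  # S1
    visited = 0
    while queue and visited < max_pages:                  # S5
        url = queue.popleft()                             # S2
        visited += 1
        content = fetch(url)
        if classify(content) != topic:                    # S3, S4
            continue  # off-topic page: its links are not followed
        matched.append(url)
        for link in extract_links(content):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return matched

# Hypothetical page graph replacing real network documents.
pages = {
    "a": {"label": "sports", "links": ["b", "c"]},
    "b": {"label": "politics", "links": ["d"]},
    "c": {"label": "sports", "links": ["a"]},
    "d": {"label": "sports", "links": []},
}
matched = crawl("sports", ["a"], fetch=pages.__getitem__,
                classify=lambda page: page["label"],
                extract_links=lambda page: page["links"])
```

Page "d" is on topic but is never reached, because the only link to it sits on an off-topic page.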

A positive and negative sentiment analysis method, terminal device and storage medium

The invention relates to a positive and negative sentiment analysis method, a terminal device and a storage medium. The method includes: S1: constructing and maintaining industry keyword rules and industry sentiment dictionaries corresponding to different industries; S2: judging whether the text data to be analyzed contains the keywords contained in the industry keyword rules; if so, going to S3; otherwise, going to S4; S3: after all industries to which the text data belongs are determined according to the industries to which the keywords belong and the corresponding industry keyword rules, calculating the sentiment score of each industry according to its industry sentiment dictionary, and then obtaining the positive and negative sentiment analysis result of the text data; S4: obtaining the positive and negative sentiment analysis result of the text data through the trained machine learning model. The invention fuses industry sentiment words with a machine learning model, handles network texts by divide and conquer, and improves the analysis effect.
Owner:XIAMEN MEIYA PICO INFORMATION