52 results about "Sequence annotation" patented technology

Sequence annotation. Sequence annotation is the "process of marking specific features in a DNA, RNA or protein sequence with descriptive information about structure or function".

Program for microarray design and analysis

Inactive | US20030033290A1 | Facilitate identification and analysis and comparison | Designing can be facilitated | Data processing applications | Digital data processing details | Computer tools | Microarray design

The invention relates to computer-based systems and methods for the design, comparison and analysis of genetic and proteomic databases. In a particular embodiment, the recited systems and methods have been implemented in a computer tool called ARROGANT. ARROGANT, in the analysis mode, is a comprehensive tool for providing annotation to large gene and protein collections. ARROGANT takes in a large collection of sequence identifiers and associates it with other information collected from many sources like sequence annotations, pathways, homology, polymorphisms, artifacts, etc. The simultaneous annotation for a large assembly of genes makes the collection of genomic / EST sequences truly informative.

Owner:BOARD OF RGT THE UNIV OF TEXAS SYST

Text emotion analysis method and device and storage medium

Active | CN108717406A | Improve accuracy | Small amount of calculation | Semantic analysis | Character and pattern recognition | Sequence annotation | Analysis method

The invention provides a text emotion analysis method. The method comprises the following steps: a text emotion analysis request carrying a target text is received, the target text is preprocessed, and a preset sequence annotation method is used to segment the preprocessed target text into words, yielding an available word set corresponding to the target text; a to-be-analyzed sentence of the target text is determined, the available word set corresponding to that sentence is acquired, and sentence vectors of the to-be-analyzed sentence are calculated according to preset calculation rules, wherein the sentence vectors include sentence-level vectors and word-level vectors; and the sentence vectors of the to-be-analyzed sentence are input into a pre-trained emotion judgment model, and the emotion polarity of the target text is judged according to the model output. The invention further provides an electronic device and a computer storage medium. The method, the electronic device and the computer storage medium can improve the accuracy of emotion analysis of the target text.

Owner:PING AN TECH (SHENZHEN) CO LTD

Named entity recognition method and device based on semi-supervised learning training

Pending | CN111062215A | Reduce labeling costs | Natural language data processing | Neural architectures | Semantic vector | Named-entity recognition

The invention relates to a named entity recognition method and device based on semi-supervised learning training, computer equipment and a storage medium. The method comprises the steps of obtaining annotation data and non-annotation data; performing supervised training on a sequence labeling model by utilizing the labeling data; calculating semantic vectors corresponding to the annotation data and the unannotated data through a trained sequence annotation model, and identifying the unannotated data in the same distribution as the annotation data according to the semantic vectors; calling a semi-supervised learning model, wherein the semi-supervised learning model is composed of the trained sequence labeling model and an auxiliary prediction network with a limited input view angle; and training the semi-supervised learning model through unlabeled data in the same distribution, and outputting a corresponding named entity recognition result through Viterbi decoding. By adopting the method, the data annotation cost can be effectively reduced, and the named entity identification accuracy can be effectively improved.

Owner:KINGDEE SOFTWARE(CHINA) CO LTD
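The named entity recognition abstract above ends with Viterbi decoding without spelling it out. As a rough illustration only, the sketch below picks the highest-scoring tag path from per-token emission scores and tag-to-tag transition scores. The function name `viterbi`, the tag set, and every score are assumptions made up for this example; none of them come from the patent.

```python
from typing import Dict, List


def viterbi(emissions: List[Dict[str, float]],
            transitions: Dict[str, Dict[str, float]],
            tags: List[str]) -> List[str]:
    """Return the highest-scoring tag path; all scores are additive (log-space)."""
    # best[t][tag] holds the best score of any path that ends in `tag` at step t.
    best = [{tag: emissions[0][tag] for tag in tags}]
    back = []  # back[t - 1][tag] is the best previous tag for `tag` at step t
    for t in range(1, len(emissions)):
        scores, pointers = {}, {}
        for cur in tags:
            prev = max(tags, key=lambda p: best[-1][p] + transitions[p][cur])
            scores[cur] = best[-1][prev] + transitions[prev][cur] + emissions[t][cur]
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final tag to recover the path.
    path = [max(tags, key=lambda tag: best[-1][tag])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Hypothetical toy scores: the O -> I transition is heavily penalised, so an
# entity has to start with B even though O looks best locally on the first token.
TAGS = ["B", "I", "O"]
TRANSITIONS = {
    "B": {"B": -1.0, "I": 0.0, "O": -0.5},
    "I": {"B": -1.0, "I": 0.0, "O": -0.5},
    "O": {"B": 0.0, "I": -100.0, "O": 0.0},
}
EMISSIONS = [
    {"B": -1.0, "I": -5.0, "O": -0.5},
    {"B": -3.0, "I": -0.2, "O": -2.0},
    {"B": -3.0, "I": -0.3, "O": -1.0},
]
print(viterbi(EMISSIONS, TRANSITIONS, TAGS))  # ['B', 'I', 'I']
```

In a real tagger the emission scores would come from the trained sequence labeling model and the transition scores from a learned layer such as a CRF; here they are fixed by hand so the effect of the transition penalty is easy to see.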

A method and a device for acquiring document information

Active | CN109685056A | Character and pattern recognition | Document structuring | Algorithm

The invention relates to a document information extraction method and device based on sequence labeling and a learning model. The method comprises the following steps: training at least one sequence labeling algorithm model to obtain at least one offline sequence labeling algorithm model; determining the accuracy of the annotation information in each of the offline sequence labeling algorithm models, and converting a to-be-processed document into a text document; obtaining document structure format property information from the to-be-processed document; and inputting the text document and the structure format property information into the offline sequence labeling algorithm model to obtain labeling information corresponding to the document information in the document. With this method, the key information of the document can be extracted using sequence labeling, and, through multi-model fusion, different key information in the document can be extracted by the optimal model for each. In addition, business rule reasoning and calculation are carried out on a typeface extraction result, which widens the application range.

Owner:DATAGRAND TECH INC

Computer-based method for creating collections of sequences from a dataset of sequence identifiers corresponding to natural complex biopolymer sequences and linked to corresponding annotations

Inactive | US7065451B2 | Facilitate identification and analysis and comparison | Designing can be facilitated | Data processing applications | Computer control | Data set | Biopolymer

The invention relates to computer-based systems and methods for the design, comparison and analysis of genetic and proteomic databases. In a particular embodiment, the recited systems and methods have been implemented in a computer tool called ARROGANT. ARROGANT, in the analysis mode, is a comprehensive tool for providing annotation to large gene and protein collections. ARROGANT takes in a large collection of sequence identifiers and associates it with other information collected from many sources like sequence annotations, pathways, homology, polymorphisms, artifacts, etc. The simultaneous annotation for a large assembly of genes makes the collection of genomic / EST sequences truly informative.

Owner:BOARD OF RGT THE UNIV OF TEXAS SYST

Signature information extraction method and device

Active | CN109460551A | Accurate extraction | Overcome limitations | Character and pattern recognition | Natural language data processing | Granularity | Sequence annotation

The embodiment of the present application provides a signature information extraction method and device. Structured information in each statement is extracted separately with a regular expression, so signature information that follows a rule can be extracted quickly and conveniently. Unstructured information is extracted with a machine learning classification model and character-granularity sequence annotation, which overcomes the limitation of the traditional approach of aligning mail templates to obtain the extracted information. In the implementation process, the TF-IDF word frequency features and tagged sequence features are extracted and input into an address binary classification model and a character-granularity sequence annotation model respectively, and the name information and address information in each sentence are obtained. The TF-IDF word frequency features allow the address information to be identified completely, and the tagged sequence features greatly reduce the negative impact of wrong word segmentation on the identification of name information, so that the mail signature information is extracted accurately.

Owner:BEIJING KNOWNSEC INFORMATION TECH
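The signature extraction abstract above handles structured fields with regular expressions. A minimal sketch of that step might look like the following, assuming the structured fields are e-mail addresses and phone numbers. The two patterns, the field names, and the function `extract_structured` are illustrative assumptions; the patent does not publish its expressions.

```python
import re
from typing import Dict, List

# Hypothetical patterns, deliberately simple; production rules would be stricter.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s-]{6,}\d")


def extract_structured(lines: List[str]) -> Dict[str, List[str]]:
    """Scan each signature line separately and collect the structured fields."""
    found: Dict[str, List[str]] = {"email": [], "phone": []}
    for line in lines:
        found["email"].extend(EMAIL_RE.findall(line))
        found["phone"].extend(match.strip() for match in PHONE_RE.findall(line))
    return found


SIGNATURE = [
    "Jane Doe",
    "Tel: +86 10-5555-0100",
    "Email: jane.doe@example.com",
]
print(extract_structured(SIGNATURE))
```

The name line is left untouched by both patterns, which is the kind of unstructured field the abstract routes to the classification and sequence annotation models instead.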

Mammary gland electronic medical record entity recognition system based on multi-standard active learning

Active | CN111222340A | Improve execution efficiency | Increase chance of cure | Medical data mining | Natural language data processing | Medical record | Disease

The invention relates to a mammary gland electronic medical record entity recognition system based on multi-standard active learning. The system comprises a preprocessing module, an entity identification module, and an active learning module. The active learning selection strategy for text sequence annotation is designed by considering three aspects, namely annotation data volume, sentence annotation cost and data sampling balance, so the total annotation workload is reduced. On the one hand, the system can be used for constructing systems such as breast disease risk patient identification marks, disease medicine recommendation and auxiliary decision diagnosis, helping doctors improve the execution efficiency of standardized breast disease diagnosis and treatment and providing scientific bases and suggested schemes; on the other hand, it can assist doctors in finding potential abnormal conditions in the diagnosis and treatment process, reducing the misdiagnosis and missed diagnosis rate and increasing the curing probability of breast disease patients, which is of important value for the intelligent development of breast disease research.

Owner:DONGHUA UNIV +1

Sequence labeling method and system, computer equipment and computer readable storage medium

Active | CN110866115A | Sequence labeling is accurate | Character and pattern recognition | Neural architectures | Data pack | Feature vector

The embodiment of the invention discloses a sequence labeling method, which comprises the following steps: obtaining a training sample set, wherein the training sample set comprises a plurality of pieces of training sample data, and each piece of training sample data comprises an input text sequence and a label corresponding to the input text sequence; preprocessing each piece of training sample data to obtain vector data corresponding to each piece of training sample data; inputting vector data corresponding to each piece of training sample data into a first-order hidden Markov model to construct a feature vector matched with each piece of training sample data; inputting the feature vectors corresponding to the sample data into a neural network model for training to generate a sequence labeling model; and inputting the to-be-labeled sequence into the sequence labeling model to obtain a target label sequence corresponding to the to-be-labeled sequence. The embodiment of the invention further discloses a sequence labeling system, computer equipment and a readable storage medium. The embodiment of the invention has the beneficial effect that the sequence annotation is more accurate.

Owner:PING AN TECH (SHENZHEN) CO LTD

Judicial judgment case information structural processing system

Active | CN109344187A | Targeted optimization | Well structured features | Data processing applications | Database management systems | Sequence annotation | Library science

The invention discloses a judicial judgment case information structural processing system, which is applicable to the fields of information extraction and natural language processing. The system includes a module for the structured presentation of judicial judgment case information, a module for establishing a judicial judgment case information sequence annotation model, an attribute trigger word management module, and a module for generating structured judicial judgment case information. According to the case types given by the users, the model-establishing module constructs a training set for judicial judgment case information sequence tagging and trains a sequence tagging model, and combines the attribute trigger word set to generate structured judicial judgment case information according to the method of generating structured judicial judgment case information. The system realizes the structured processing of judicial judgment case information according to the case type and the case information provided by the user, and aims at providing an effective way to extract structured information from unstructured judicial judgment text.

Owner:HEFEI UNIV OF TECH

Data benchmarking method and device and storage device

Active | CN110795482A | Increase credibility | Reduce false match rate | Database management systems | Text database querying | Datasheet | Original data

The invention discloses a data benchmarking method and device and a storage device. The data benchmarking method comprises the following steps: original data information is extracted from a data table to be benchmarked, the original data information comprising field names and the field annotations corresponding to the field names; the field annotations are identified by a deep-learning-based sequence annotation model to obtain a characteristic word corresponding to each field name; first text matching is carried out between the characteristic words corresponding to the field names and the standard data elements in a standard library; and the result output by the first text matching is verified. Because text matching is conducted only after the characteristic words have been recognized, the credibility of the text matching result is improved and the mismatching rate in the benchmarking process is reduced.

Owner:ZHEJIANG DAHUA TECH CO LTD
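The data benchmarking abstract above matches recognized characteristic words against standard data elements but does not say which text matching measure it uses. One plausible stand-in, shown here purely as an assumption, is fuzzy string matching with Python's standard `difflib`. The helper `match_standard_element`, the cutoff value, and the element names are all hypothetical.

```python
import difflib
from typing import List, Optional


def match_standard_element(feature_word: str,
                           standard_elements: List[str],
                           cutoff: float = 0.6) -> Optional[str]:
    """Return the closest standard data element, or None if nothing is similar enough."""
    matches = difflib.get_close_matches(
        feature_word, standard_elements, n=1, cutoff=cutoff
    )
    return matches[0] if matches else None


# Hypothetical standard library of data elements.
STANDARD_ELEMENTS = ["user_name", "phone_number", "home_address"]
print(match_standard_element("user name", STANDARD_ELEMENTS))
print(match_standard_element("zzz", STANDARD_ELEMENTS))
```

Returning `None` below the cutoff leaves room for the separate verification step the abstract mentions, rather than forcing every field onto some standard element.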

Method for constructing antibiotic resistance genbank

Active | CN108491692A | Special data processing applications | Bioinformatics | Sequence annotation | Antibiotic resistance genes

The invention discloses a bioinformatics method for constructing an antibiotic resistance genbank in the biotechnology field. The method comprises the following steps: searching GenBank for the protein sequence of a resistance gene; selecting a sequence with high accuracy as an initial sequence; aligning sequences with the Clustalw method; constructing a hidden Markov model and searching the GenBank protein database to obtain all sequences that contain protein conserved sites; according to the E values of the sequences and their annotation information in the GenBank database, removing the sequences which are highly homologous and do not conform to requirements; after repeated sequences are removed, adding species annotation information; and integrating all protein sequences to finish database construction. The method comprehensively weighs the annotation information and the alignment similarity of a sequence, and can improve the speed and accuracy of sequence collection. With this method, the construction of the antibiotic resistance genbank can be completed, and basic data are provided for research on primer design, data analysis and sequence annotation of the resistance gene.

Owner:RES CENT FOR ECO ENVIRONMENTAL SCI THE CHINESE ACAD OF SCI +1
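The filtering stage in the genbank abstract above (screen search hits by E value, then drop repeated sequences) can be sketched roughly as follows. The record layout, the `max_evalue` threshold of 1e-5, and the function `filter_hits` are assumptions for illustration; the patent's actual criteria also use GenBank annotation text, which this sketch does not model.

```python
from typing import Dict, List


def filter_hits(hits: List[Dict], max_evalue: float = 1e-5) -> List[Dict]:
    """Keep hits at or below the E-value threshold, dropping exact duplicate sequences."""
    seen = set()
    kept = []
    for hit in hits:
        if hit["evalue"] > max_evalue:
            continue  # weak match to the model: discard
        if hit["sequence"] in seen:
            continue  # the same protein sequence was already collected
        seen.add(hit["sequence"])
        kept.append(hit)
    return kept


# Hypothetical search hits; identifiers and sequences are invented.
HITS = [
    {"id": "a", "evalue": 1e-30, "sequence": "MKT"},
    {"id": "b", "evalue": 0.5, "sequence": "MKV"},
    {"id": "c", "evalue": 1e-12, "sequence": "MKT"},
    {"id": "d", "evalue": 1e-8, "sequence": "MAA"},
]
print([hit["id"] for hit in filter_hits(HITS)])  # ['a', 'd']
```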

Medical field entity classification method fusing entity keyword features

Pending | CN112507717A | Implement classification | Improve accuracy | Natural language data processing | Neural architectures | Feature extraction | Medicine

The invention discloses a medical field entity classification method fusing entity keyword features. The method comprises the steps of text vectorization, feature extraction, and sequence annotation. In the method provided by the embodiment of the invention, TF-IDF is used to assist in constructing a keyword table, and the keywords are used as feature inputs to the model; a BERT model performs text vectorization to generate word vectors; the word vectors are input into a BiLSTM-CNN hybrid model to learn features; and sequence labeling is then performed through a CRF layer. In this way medical field entity classification can be realized, and the accuracy, recall rate and F1 value of medical field entity classification can be greatly improved.

Owner:BEIJING INFORMATION SCI & TECH UNIV

Chinese rhythm boundary prediction method based on graph-to-sequence

Pending | CN111951781A | Improve accuracy | Speech synthesis | Graphics | Feature extraction

The invention discloses a Chinese rhythm boundary prediction method based on graph-to-sequence, which specifically comprises the following four parts: (1) word embedding representation characteristics: converting the characteristics into digital representation, so that the technology of mapping words into real number field vectors is called word embedding; (2) a text time sequence feature extraction model, wherein annotation of rhythm boundaries is sequence annotation in the time dimension; (3) text space information: inputting a text sequence, processing the text sequence into a graphic structure, and processing a dependency relationship between rhythm boundaries by adding space information; and (4) combining spatial and temporal features. Time sequence information and space information of the text are combined together to serve as a new feature, and the accuracy of the final rhythm boundary is improved.

Owner:TIANJIN UNIV

Sequence labeling method and device, computer equipment and storage medium

Active | CN110688853A | Improve accuracy | Natural language data processing | Algorithm | Engineering

The invention relates to a sequence labeling method and device based on a neural network, computer equipment and a storage medium. The method comprises the steps of performing vector conversion on each character in a to-be-labeled sequence to obtain a corresponding feature word vector; inputting the feature word vectors into a preset sequence annotation neural network to segment words of the to-be-annotated sequence to obtain candidate words and word tags corresponding to the candidate words; and combining the word label with the position of each character in the candidate word to obtain a character label to which the character belongs in the candidate word; calculating a first pairing index of the candidate word based on the weight vector of the character tag to which each character in the candidate word belongs; calculating a second pairing index of the candidate labeling sequence based on the first pairing index corresponding to each group of candidate words; and identifying the candidate labeling sequence corresponding to the second pairing index with the maximum numerical value as the first labeling sequence. By adopting the method, the labeling accuracy can be improved.

Owner:PING AN TECH (SHENZHEN) CO LTD
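One step in the abstract above combines each candidate word's tag with the position of every character inside the word to obtain character-level labels. A small sketch of that step follows, assuming the common B/M/E/S position scheme (begin, middle, end, single). The patent does not name its scheme, and the function `char_labels` and the example tags are hypothetical.

```python
from typing import List, Tuple


def char_labels(words: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Expand (word, tag) pairs into per-character (char, position-tag) labels."""
    labels = []
    for word, tag in words:
        if len(word) == 1:
            labels.append((word, f"S-{tag}"))  # single-character word
            continue
        for i, ch in enumerate(word):
            if i == 0:
                pos = "B"  # first character of the word
            elif i == len(word) - 1:
                pos = "E"  # last character of the word
            else:
                pos = "M"  # any character in between
            labels.append((ch, f"{pos}-{tag}"))
    return labels


# Invented segmentation result: two multi-character words and one single character.
SEGMENTED = [("北京", "LOC"), ("去", "V"), ("图书馆", "N")]
for ch, label in char_labels(SEGMENTED):
    print(ch, label)
```

Scoring candidate label sequences (the "pairing index" in the abstract) would then operate on these character labels; that scoring is not reproduced here because the abstract gives no formula for it.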

Named entity identification method based on time convolution network

Pending | CN110442860A | Solve the defect that timing information cannot be obtained | Solve the problem of not being able to remember long-term information | Neural architectures | Neural learning methods | Named-entity recognition | One-hot

The invention relates to a named entity identification method based on a time convolution network. The method comprises the following steps: first, constructing a feature representation layer, which mainly consists of a word vector layer and a character feature layer, wherein the word vector layer and the character vector layer respectively accept words and characters as input and respectively map discrete one-hot representations to their own continuous, dense, low-dimensional feature spaces; splicing the word vectors and the character-level vectors to represent the features of the words in a particular semantic space; second, taking the spliced features as input of a time convolution network, extracting different features through the time convolution network with different fused convolution kernel sizes, and obtaining the final features h1, h2, ..., hn; finally, taking the obtained features as input of a CRF layer; and, after the CRF further constrains the context annotation, outputting the sequence annotation results y1, y2, ..., yn. Compared with an existing LSTM network, the TCN network slightly improves recognition precision, and its training time is only about 1/3 of that of the LSTM network.

Owner:DALIAN UNIV
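The abstract above does not describe the internals of its time convolution network. Temporal convolutional networks are commonly built from causal, dilated 1-D convolutions, so the sketch below shows that building block on a plain list of numbers. Treating the patent's network as dilated is an assumption, and the names `causal_dilated_conv` and `receptive_field` are invented for this example.

```python
from typing import List, Sequence


def causal_dilated_conv(x: Sequence[float],
                        kernel: Sequence[float],
                        dilation: int) -> List[float]:
    """1-D causal convolution: output[t] depends only on x[t], x[t-d], x[t-2d], ..."""
    out = []
    for t in range(len(x)):
        acc = 0.0
        for j, weight in enumerate(kernel):
            idx = t - j * dilation
            if idx >= 0:  # positions before the start act as zero padding
                acc += weight * x[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size: int, dilations: Sequence[int]) -> int:
    """Number of input steps visible to one output after stacking one conv per dilation."""
    return 1 + (kernel_size - 1) * sum(dilations)


print(causal_dilated_conv([1, 2, 3, 4, 5], kernel=[1, 1], dilation=2))
# [1.0, 2.0, 4.0, 6.0, 8.0]
print(receptive_field(3, [1, 2, 4]))  # 15
```

Because each output only looks backwards, all positions can be computed in parallel, which is one common explanation for TCNs training faster than recurrent networks such as the LSTM the abstract compares against.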

Sequence labeling method and device, storage medium and computer equipment

Pending | CN111353295A | Improve accuracy | Auxiliary generation | Natural language data processing | Part of speech | Engineering

The invention provides a sequence labeling method and device, a storage medium and computer equipment, and a sequence comprises a to-be-labeled word and a labeled word, and the sequence is used for generating a text. The method comprises the steps: obtaining the sequence; identifying context information of words to be labeled in the sequence; according to the context information, determining a second part-of-speech of the to-be-annotated word in combination with the first part-of-speech of the annotated word adjacent to the to-be-annotated word, wherein the second part-of-speech is used for annotating the to-be-annotated word. According to the method, the part-of-speech of the to-be-annotated word is annotated according to the context information of the to-be-annotated word in the sequence, the part-of-speech annotation accuracy can be improved, the sequence annotation effect is improved, and therefore text generation is effectively assisted.

Owner:GUANGDONG BOZHILIN ROBOT CO LTD

Method for recognizing text segments by using sequence annotation

Active | CN111191456A | Semantic analysis | Patient-specific data | Feature vector | Semantics

The invention provides a method for recognizing text segments by using sequence annotation, which comprises the following steps: A, segmenting the different text segments of a sample set into clause sets, and annotating the clause sets with semantic feature vectors to form semantic feature vector sets; B, performing clustering training on the semantic feature vector set to obtain a clustering model, and numbering the cluster of each object of the clustering model to form a sequence model; C, establishing a mapping between the sequence model and the different text segments, and training a sequence labeling model on the mapped cluster sequence; and D, applying the sequence model and the sequence labeling model in turn to segment the text to be segmented. The method performs standardized modeling by taking the sample set as a database template. During subsequent text segmentation recognition, the sentence patterns in the to-be-segmented text are standardized and the standardized sentences are mapped to sentence features according to the model, so that different expressions of the same semantics are represented alike and text segmentation recognition can be completed.

Owner:零氪科技(天津)有限公司

Antibody calculation optimization method based on genetic algorithm

Pending | CN112365919A | Fully automated | Reduce dependence | Biostatistics | Systems biology | Sequence design | Epitope

The invention provides an antibody calculation optimization method based on a genetic algorithm. The method covers algorithms such as peptide chain processing, epitope recognition, sequence annotation, CDR H3 sequence design, antibody modeling, molecular docking and antibody property evaluation, and provides a full-process automatic antibody design function. Based on known antibody sequence data, a genetic algorithm targeting the heavy-chain highly variable H3 section (CDR H3) of the antibody iteratively generates and evaluates variant antibody sequences formed by combining random sites and random residues, and compares them with the original antibody by comprehensive scoring, so that an optimized antibody is obtained or a low-quality antibody is removed. Finally, a candidate antibody sequence library is generated, and the biophysical properties of the candidate antibodies are predicted. The invention integrates the basic elements of an antibody calculation optimization process and automates the process on a single platform.

Owner:北京迈迪培尔信息技术有限公司
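The iterative "random site plus random residue" variant generation in the antibody abstract above can be illustrated with a very small evolutionary loop. This is a mutation-only sketch under strong assumptions: the scoring function `toy_score` is a placeholder that merely counts two residue types and has no biophysical meaning, the starting sequence is invented, and the population settings are arbitrary. The patent's scoring relies on modeling and docking, which are not reproduced here.

```python
import random
from typing import Callable, List, Optional

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def mutate(seq: str, rng: random.Random) -> str:
    """Replace one randomly chosen site with a randomly chosen residue."""
    pos = rng.randrange(len(seq))
    return seq[:pos] + rng.choice(AMINO_ACIDS) + seq[pos + 1:]


def optimize(seed_seq: str,
             score: Callable[[str], float],
             generations: int = 50,
             pop_size: int = 20,
             rng: Optional[random.Random] = None) -> List[str]:
    """Evolve variants of seed_seq, keeping the pop_size best-scoring sequences."""
    rng = rng or random.Random(0)
    population = [seed_seq]
    for _ in range(generations):
        children = [mutate(rng.choice(population), rng) for _ in range(pop_size)]
        # Elitist selection: parents compete with children, so the best score never drops.
        ranked = sorted(set(population + children), key=score, reverse=True)
        population = ranked[:pop_size]
    return population


def toy_score(seq: str) -> float:
    """Placeholder fitness (counts Y and W); stands in for a real antibody score."""
    return float(seq.count("Y") + seq.count("W"))


START = "ARDYYGSSW"  # invented CDR H3-like string
candidates = optimize(START, toy_score, generations=10, pop_size=8,
                      rng=random.Random(42))
print(candidates[0], toy_score(candidates[0]))
```

The returned list plays the role of the candidate sequence library in the abstract: the first entry is the best-scoring variant, and the original sequence survives only if nothing scored better.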

Text entity detection method and system and related components

Active | CN110348017A | Increase the number of | Quality improvement | Natural language data processing | Data mining | Seed entity | Algorithm

The invention discloses a text entity detection method, which comprises the following steps: matching each statement instance in a target statement against a seed entity set to obtain a matching result, and generating annotation data corresponding to the target statement according to the matching result; querying the statement instances in the target statement that match an unlabeled corpus word frequency table, and modifying the annotation data according to the query result to obtain local annotation data; training a sequence annotation neural model with the local annotation data; and performing sequence annotation on the unannotated corpus in the target statement with the trained sequence annotation neural model to obtain an entity set of the target statement. With this method, high-quality entity mining can be realized without being limited by the quality and scale of the unlabeled corpus. The invention further discloses a text entity detection system, a computer readable storage medium and electronic equipment, which have the above beneficial effects.

Owner:SUZHOU UNIV
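The first step of the text entity detection abstract above, matching statements against a seed entity set to produce annotation data, resembles dictionary-based distant labeling. The sketch below shows one way this could work, assuming BIO labels and greedy longest-match lookup. Both choices, the function `annotate_with_seeds`, and the seed entries are assumptions not stated in the patent.

```python
from typing import Dict, List, Tuple


def annotate_with_seeds(tokens: List[str],
                        seeds: Dict[Tuple[str, ...], str]) -> List[str]:
    """Label tokens with BIO tags by greedy longest match against seed entities."""
    labels = ["O"] * len(tokens)
    max_len = max((len(seed) for seed in seeds), default=0)
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest possible span first so longer entities win over shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            span = tuple(tokens[i:i + n])
            if span in seeds:
                entity_type = seeds[span]
                labels[i] = f"B-{entity_type}"
                for j in range(i + 1, i + n):
                    labels[j] = f"I-{entity_type}"
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return labels


# Hypothetical seed entity set: token tuples mapped to entity types.
SEEDS = {("Suzhou", "University"): "ORG", ("Suzhou",): "LOC"}
TOKENS = ["Suzhou", "University", "is", "in", "Suzhou"]
print(annotate_with_seeds(TOKENS, SEEDS))
# ['B-ORG', 'I-ORG', 'O', 'O', 'B-LOC']
```

Labels produced this way are incomplete by construction, since any entity missing from the seed set is tagged `O`; the later steps of the abstract (word frequency correction and training on local annotation data) address that gap.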

Sample expansion method, terminal, device and readable storage medium

Active | CN111291560A | Improve expansion efficiency | Improve robustness | Natural language data processing | Special data processing applications | Data set | Data expansion

The invention discloses a sample expansion method and device, a terminal and a readable storage medium. The method comprises the following steps: selecting sample data from a preset labeled sample data set as seed data; selecting word data based on the seed data; obtaining the word type of the word data; determining an expansion mode for the labeled sample data set based on the word type; updating the word data in the seed data based on the expansion mode; and taking the updated seed data as expansion sample data to expand the labeled sample data set. Because the labeled sample data are expanded through different expansion modes, the method reduces the cost of obtaining annotated samples and improves sample expansion efficiency. The generated expansion sample data obey the same data distribution as the annotated sample data, so a sequence annotation model trained with the expanded samples can have very high robustness and accuracy.

Owner:WEBANK (CHINA)
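The sample expansion abstract above updates words in a seed sample according to an expansion mode chosen by word type, without listing the modes. The sketch below assumes a single mode, replacing a typed word with another word of the same type from a lexicon, and enumerates the resulting samples. The function `expand`, the type names, and the lexicon are hypothetical.

```python
from typing import Dict, List, Tuple

Sample = List[Tuple[str, str]]  # (word, word type); "O" marks an untyped word


def expand(sample: Sample, lexicon: Dict[str, List[str]]) -> List[Sample]:
    """Create new labeled samples by swapping each typed word for a same-type alternative."""
    expanded = []
    for i, (word, word_type) in enumerate(sample):
        for alternative in lexicon.get(word_type, []):
            if alternative == word:
                continue  # skip the unchanged sample
            new_sample = list(sample)
            new_sample[i] = (alternative, word_type)  # the label is carried over
            expanded.append(new_sample)
    return expanded


# Invented seed sample and same-type lexicon.
SEED = [("fly", "O"), ("to", "O"), ("Shenzhen", "CITY")]
LEXICON = {"CITY": ["Shenzhen", "Beijing", "Shanghai"]}
for new_sample in expand(SEED, LEXICON):
    print(" ".join(word for word, _ in new_sample))
```

Because only same-type words are substituted and the labels are copied unchanged, the new samples need no manual annotation, which is the cost saving the abstract points to.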

Method and device for extracting relations from texts

Pending | CN111859858A | Resolve accuracy | Improve versatility | Natural language data processing | Neural architectures | Text annotation | Relationship extraction

The present disclosure provides a method of extracting a relationship from a text, comprising: generating a sequence annotation of the text using a text annotation model, the sequence annotation comprising an annotation for a word in the text, the annotation comprising an entity annotation of the word and a relationship role of the word, the relationship role comprising one of a subject, a predicate, and an object; generating an entity relationship sequence of the text according to the sequence label; and extracting a relationship result set of the text according to the entity relationship sequence. According to the method and device for extracting the relationship from the text, the problems that in the prior art, a relationship extraction method is low in accuracy, poor in universality and low in extraction efficiency can be effectively solved.

Owner:ZHIZHESIHAIBEIJINGTECH CO LTD

Training file generation and evaluation method and device, computer system and storage medium

Pending | CN111582497A | Guaranteed Build Quality | Guaranteed generation speed | Character and pattern recognition | Machine learning | Natural language understanding | Engineering

The invention discloses a training file generation and evaluation method and device, a computer system and a storage medium. The method comprises the following steps: receiving an original file, obtaining the domain information and training entities of the original file, and processing the original file according to the domain information and training entities to obtain a labeled file; identifying the semantics of the labeled file through a preset natural language understanding model and performing sequence annotation on the labeled file to obtain a training file; and inputting the training file into an intelligent search model corresponding to the domain information to obtain a training result, calculating a hit rate from the training result through a hit analysis algorithm, and summarizing the training file and the hit rate to generate a hit analysis report. Training files are thus obtained automatically, their generation quality and speed are ensured, and the method addresses the problem that the labeling quality of training samples cannot be guaranteed when their real hit rate is unavailable.

Owner:深圳平安医疗健康科技服务有限公司

Sequence labeling method and system, computer readable storage medium and computer equipment

Pending | CN112270181A | Reduce error accumulation | Solve the problem of multiple meanings | Natural language data processing | Neural architectures | Original data | Theoretical computer science

The invention discloses a sequence labeling method and system, a computer readable storage medium and computer equipment. The sequence labeling method is based on a joint training mode and specifically comprises: a data processing step of labeling an original data text in a double-pointer mode to obtain a labeled data text; a sequence annotation model construction step of training a dynamic pre-training model on the labeled data text to obtain a sequence annotation model; and a sequence labeling step of labeling real-time data text through the sequence annotation model. According to the method and the system, more accurate information can be obtained, and the model can infer from the context of an entity, so that the problem of one word having multiple meanings can be addressed and inference is faster.
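The abstract does not define the "double-pointer mode". One common reading, assumed here, is that each entity type gets a start vector and an end vector over the tokens. The Python sketch below encodes spans that way and decodes them by pairing each start with the nearest end at or after it; function names and the decoding rule are illustrative.

```python
def to_pointer_labels(n_tokens, spans, types):
    """Encode (start, end_inclusive, type) spans as per-type start/end 0-1 vectors."""
    labels = {t: ([0] * n_tokens, [0] * n_tokens) for t in types}
    for start, end, t in spans:
        labels[t][0][start] = 1  # start pointer
        labels[t][1][end] = 1    # end pointer
    return labels

def decode_pointers(labels):
    """Pair each start with the nearest end at or after it (a simple heuristic)."""
    spans = []
    for t, (starts, ends) in labels.items():
        for i, is_start in enumerate(starts):
            if not is_start:
                continue
            for j in range(i, len(ends)):
                if ends[j]:
                    spans.append((i, j, t))
                    break
    return sorted(spans)

tokens = ["Bank", "of", "China", "opened", "in", "Beijing"]
spans = [(0, 2, "ORG"), (5, 5, "LOC")]
labels = to_pointer_labels(len(tokens), spans, ["ORG", "LOC"])
decoded = decode_pointers(labels)
```

Keeping separate vectors per type lets spans of different types overlap, which a single tag per token cannot express.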

Owner:BEIJING MININGLAMP SOFTWARE SYST CO LTD

Method for automatically extracting subject of argumentative article

Inactive | CN106933795A | Improve applicability | Improve accuracy | Character and pattern recognition | Natural language data processing | Conditional random field | Content retrieval

The present invention relates to a method for automatically extracting the subject of an argumentative article, and belongs to the field of natural language processing applications. The method is based on a sequence annotation strategy using statistical conditional random fields and comprises: analyzing the semantic and positional features of the subject in the title of the argumentative article and, combined with performance on the training corpus, establishing a common word dictionary and an important word dictionary; using the dictionaries together with word and position information to annotate sequence features on the title of the argumentative article; and training a model on the annotated corpus, so that unseen data can be predicted with relatively high accuracy and the algorithm is applicable across different scenarios. The method allows a computer to extract the subject of an argumentative article automatically, presents the main object of the article in an intuitive form, helps the reader quickly grasp information about that object, facilitates retrieval and comparison of related content, and supplies automatically extracted phrases for follow-up computer analysis.
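The feature annotation step might look like the following Python sketch, which builds one feature dictionary per title token from word, position and dictionary membership. The feature names and the two example dictionaries are assumptions for illustration; the abstract does not list the actual features. Feature dictionaries of this shape are a common input format for conditional random field toolkits.

```python
def title_features(tokens, common_words, important_words):
    """Build one feature dict per title token (word, position, dictionary hits)."""
    n = len(tokens)
    feats = []
    for i, tok in enumerate(tokens):
        word = tok.lower()
        feats.append({
            "word": word,
            "position": i,
            "is_first": i == 0,
            "is_last": i == n - 1,
            "in_common_dict": word in common_words,        # hypothetical dictionary
            "in_important_dict": word in important_words,  # hypothetical dictionary
            "prev_word": tokens[i - 1].lower() if i > 0 else "<BOS>",
            "next_word": tokens[i + 1].lower() if i < n - 1 else "<EOS>",
        })
    return feats

features = title_features(["On", "Honesty"], {"on"}, {"honesty"})
```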

Owner:贺惠新

Adversarial interpolation sequence-based annotation data enhancement method and device, equipment and medium

Pending | CN113297355A | Good effect | Solve the problem that less affects the accuracy of the model | Semantic analysis | Text database querying | Linguistic model | Algorithm

The invention discloses an adversarial interpolation sequence-based annotation data enhancement method and device, equipment and a medium. The method comprises the following steps: acquiring first sample data containing sequence labels; inputting the first sample data into a preset language model, outputting candidate word vectors that conform to context semantic constraints, and forming enhanced second sample data from the candidate word vectors; and interpolating the first sample data and the second sample data with an adversarial interpolation method to obtain interpolated enhanced sample data. In this sequence annotation data enhancement method, the language model supplies candidate word vectors that conform to the context constraint, and the adversarial interpolation takes the task characteristics into account, so that harder samples, on which a machine learning algorithm is more likely to err, are generated and the effectiveness of the sequence model under low-resource conditions is improved. The method addresses the loss of model accuracy caused by scarce annotation data.

Owner:CHINA PING AN LIFE INSURANCE CO LTD

Speech recognition result processing method and a related device

Pending | CN109829163A | Improve adding efficiency | Improve experience | Speech recognition | Special data processing applications | Semantic feature | Sequence annotation

The invention discloses a speech recognition result processing method, which comprises the following steps: performing semantic feature labeling on a speech recognition result to obtain a semantic labeling result; processing the semantic labeling result with a sequence labeling model to obtain labeled punctuation data, wherein the sequence labeling model is obtained by deep learning training on training data with labeled punctuation; and arranging the labeled punctuation data to obtain the final punctuation result. Adding punctuation to the speech recognition result through the sequence labeling model improves punctuation efficiency and achieves good real-time performance. The invention also discloses a speech recognition result processing system, a computer device and a computer readable storage medium, which have the same beneficial effects.
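The final arrangement step can be sketched in Python as follows, assuming the sequence labeling model emits one punctuation label per recognized token. The label names (`O`, `COMMA`, `PERIOD`, `QUESTION`) are hypothetical; the abstract does not specify the label set.

```python
# Hypothetical label set: the punctuation mark (if any) that follows each token.
PUNCT = {"O": "", "COMMA": ",", "PERIOD": ".", "QUESTION": "?"}

def add_punctuation(tokens, labels):
    """Attach the predicted punctuation mark to each token and join the result."""
    if len(tokens) != len(labels):
        raise ValueError("tokens and labels must have the same length")
    return " ".join(tok + PUNCT[lab] for tok, lab in zip(tokens, labels))

tokens = ["hello", "how", "are", "you"]
labels = ["COMMA", "O", "O", "QUESTION"]
text = add_punctuation(tokens, labels)  # "hello, how are you?"
```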

Owner:HITHINK ROYALFLUSH INFORMATION NETWORK CO LTD

Language sequence labeling method and device, storage medium and computer equipment

Active | CN111274813A | Solve the technical problems of inaccurate and incomplete labeling | Natural language data processing | Neural architectures | Programming language | Theoretical computer science

The invention discloses a language sequence labeling method and device, a storage medium and computer equipment. The method comprises the steps of: generating a cross-language vector based on a source language vector and a target language vector; generating a language correspondence relationship according to the cross-language vector, the language correspondence relationship comprising a correspondence between the source language and the target language; converting source language sequence labeling data into conversion data according to the language correspondence relationship; training on the source language sequence labeling data and the conversion data to obtain a cross-language sequence labeling model; and performing sequence labeling on the target language based on the cross-language sequence labeling model. The method and the device address the inaccurate and incomplete annotation that, in related-art language sequence labeling methods, results from a lack of annotation resources for the target language.

Owner:ALIBABA GRP HLDG LTD

Resume information extraction method based on cascading sequence annotation

Active | CN111966785A | Solve extraction | Resolve subsections | Character and pattern recognition | Natural language data processing | Engineering | Sequence annotation

The invention provides a resume information extraction method based on cascading sequence annotation. The method comprises the following steps: step 1, parsing a PDF resume with pdfminer and converting the original PDF into a multi-line text representation, which mainly resolves disordered text sequence and incorrect line breaks; step 2, labeling the training data by back-labeling with remotely supervised data and merging similar items during labeling; step 3, dividing the resume into information blocks by classifying each sentence obtained through pdfminer and determining the block in which it is located; and step 4, extracting information at the sentence level and the short text fragment level with a double-layer sequence labeling model. The advantage of the method is that the resume block information is subsequently used for filtering, so the recall rate is effectively improved while accuracy is not greatly reduced. Through these four stages, resume information can be extracted effectively.

Owner:THE 28TH RES INST OF CHINA ELECTRONICS TECH GROUP CORP

Method and system for extracting dynamic information of smart home industry

Pending | CN112464668A | Speed up circulation | Improve learning effect | Natural language data processing | Neural architectures | Engineering | Predicting performance

The invention provides a method and a system for extracting dynamic information of the smart home industry. Based on industry dynamic data capture and extraction tasks in the smart home field, it provides a way to capture industry trends automatically and to generate a report automatically. Building on the smart home industry background and on extraction of structural information from articles, the method provides an industry dynamic data extraction mode that combines industry prior knowledge with natural language processing sequence annotation. An industry research report is then generated automatically by combining a deep-learning text classification model with paragraph abstract extraction over multiple types of indexes. The method closely combines a machine learning algorithm with the business features of the smart home industry, and the natural language analysis process was developed through extensive practice and shows a good prediction effect. The algorithm is efficient and highly targeted, the process flow closely matches data analysis business practice, and the success rate of data extraction and report generation is relatively high.

Owner:南京数脉动力信息技术有限公司

Sequence labeling method and device in natural language processing, equipment and storage medium

Pending | CN109885702A | Good effect | Improve universality | Multimedia data querying | Character and pattern recognition | Algorithm | Sequence model

The invention relates to a sequence labeling method in natural language processing. The method comprises the steps of: obtaining a text sequence; inputting the text sequence into a sequence labeling model to obtain a target path, each node in the target path being a label in a preset label set, and the preset label set comprising the labels corresponding to m coding modes; and arranging the nodes in order of their position in the target path to obtain the labeling sequence corresponding to the text sequence. According to the invention, the sequence labeling model is trained on the results of encoding text sequence samples under a plurality of coding modes. The input text sequence is processed by the sequence labeling model, which outputs the corresponding labeling sequence. Because annotation of the text sequence is not limited to a single coding mode, the universality of the sequence labeling model across different inputs is improved, and the effect of sequence labeling is improved.
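The abstract does not say how the target path is found. A standard choice, assumed here, is Viterbi decoding over per-position label scores and label-to-label transition scores. The Python sketch below uses a small B/I/O label set for readability; in the patent's setting the label set would be the union of labels from several coding modes. All scores are made up for illustration.

```python
def viterbi(emissions, transitions, labels):
    """Return the highest-scoring label path.

    emissions: list of dicts (label -> score), one per position.
    transitions: dict ((prev_label, cur_label) -> score); missing pairs score 0.
    """
    # best[t][label] = (best score of a path ending in label at t, previous label)
    best = [{lab: (emissions[0][lab], None) for lab in labels}]
    for t in range(1, len(emissions)):
        column = {}
        for cur in labels:
            prev = max(labels,
                       key=lambda p: best[t - 1][p][0] + transitions.get((p, cur), 0.0))
            score = (best[t - 1][prev][0]
                     + transitions.get((prev, cur), 0.0)
                     + emissions[t][cur])
            column[cur] = (score, prev)
        best.append(column)
    # Backtrack from the best final label.
    path = [max(labels, key=lambda lab: best[-1][lab][0])]
    for t in range(len(emissions) - 1, 0, -1):
        path.append(best[t][path[-1]][1])
    return path[::-1]

labels = ["B", "I", "O"]
emissions = [  # illustrative scores for the tokens "New", "York", "is"
    {"B": 2.0, "I": 0.5, "O": 1.0},
    {"B": 0.2, "I": 1.5, "O": 1.4},
    {"B": 0.1, "I": 0.3, "O": 2.0},
]
transitions = {("O", "I"): -100.0}  # an I label should not follow O
path = viterbi(emissions, transitions, labels)
```

Reading the nodes of `path` in order gives the labeling sequence for the input tokens.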

Owner:HARBIN INST OF TECH SHENZHEN GRADUATE SCHOOL +1
