631 results about "Participle" patented technology

A participle (PTCP) is a form of a verb that is used in a sentence to modify a noun, noun phrase, verb, or verb phrase, and plays a role similar to an adjective or adverb. It is one of the types of nonfinite verb forms. Its name comes from the Latin participium, a calque of Greek μετοχή (metokhḗ) "partaking" or "sharing"; it is so named because the Ancient Greek and Latin participles "share" some of the categories of the adjective or noun (gender, number, case) and some of those of the verb (tense and voice).

Word boundary probability estimating, probabilistic language model building, kana-kanji converting, and unknown word model building

Inactive | US20060015326A1 | Improve recognition accuracy | Improve capability | Natural language translation | Special data processing applications | Text corpus | Probabilistic estimation

Calculates word n-gram probabilities with high accuracy when two training corpora (stores of large quantities of sample sentences) are given: a first corpus, which is relatively small and contains manually segmented word information, and a second corpus, which is relatively large and unsegmented. The vocabulary, including contextual information, is expanded from the words occurring in the small first corpus to the words occurring in the large second corpus by using a word n-gram probability estimated from an unknown word model and the raw corpus. The first corpus (word-segmented) is used for calculating n-grams and the probability that the position between two adjacent characters is a boundary between two words (the segmentation probability). The second corpus (word-unsegmented), in which probabilistic word boundaries are assigned based on information in the first corpus, is used for calculating word n-grams.
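
The segmentation probability mentioned in this abstract can be illustrated with a small sketch. It assumes, purely for illustration, that the probability is estimated per pair of adjacent characters by counting how often a word boundary falls between them in the segmented corpus; the abstract does not state the actual estimator, and the function name `boundary_probabilities` is hypothetical.

```python
from collections import Counter

def boundary_probabilities(segmented_sentences):
    """Estimate P(boundary | adjacent character pair) from a word-segmented corpus.

    Assumption: conditioning on the character pair is an illustrative choice,
    not the estimator described in the patent.
    """
    seen, cut = Counter(), Counter()
    for words in segmented_sentences:
        chars = "".join(words)
        # Character offsets at which one word ends and the next begins.
        boundaries, pos = set(), 0
        for word in words[:-1]:
            pos += len(word)
            boundaries.add(pos)
        for i in range(1, len(chars)):
            pair = (chars[i - 1], chars[i])
            seen[pair] += 1
            if i in boundaries:
                cut[pair] += 1
    return {pair: cut[pair] / seen[pair] for pair in seen}

# Toy segmented corpus: each sentence is a list of words.
corpus = [["ab", "c"], ["ab", "cd"], ["a", "bc"]]
probs = boundary_probabilities(corpus)
```

In the toy corpus, a boundary falls between "b" and "c" in two of three occurrences, so that pair gets a higher boundary probability than "a" and "b".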

Owner:IBM CORP

Method for extracting text-oriented field term and term relationship

Inactive | CN102360383A | Efficient removal | Improve term recognition | Special data processing applications | NODAL | Conditional random field

The invention discloses a method for extracting a text-oriented field term and term relationship. The method is characterized by comprising the following steps of: firstly, preprocessing original linguistic data to obtain a candidate word set including clauses, participles and part of speech tagging, and filtering noise words; secondly, extracting term characteristics from the original linguistic data and the Internet, and separating terms from candidate words by combining with a dual-model structure algorithm; thirdly, constructing a term dictionary by adopting an inverted index method, and tagging the terms in a text to be identified by using a longest match algorithm; and finally, carrying out multilevel sign sequence tagging through a conditional random field model according to a multi-dimensional node signing rule to obtain a relationship among the terms in the text to be identified.
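
The longest-match tagging step in this abstract can be sketched as follows. This is a minimal illustration, assuming the text is already tokenized and the term dictionary is a plain set of token tuples (the abstract builds its dictionary with an inverted index, which is not reproduced here); `tag_terms` and the `TERM`/`O` labels are hypothetical names.

```python
def tag_terms(tokens, term_dict, max_len=4):
    """Forward longest-match tagging: prefer the longest dictionary term at each position."""
    tagged, i = [], 0
    while i < len(tokens):
        # Try the longest candidate first, then shrink.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            candidate = tuple(tokens[i:i + n])
            if candidate in term_dict:
                tagged.append((" ".join(candidate), "TERM"))
                i += n
                break
        else:
            # No dictionary term starts here; emit the token as a non-term.
            tagged.append((tokens[i], "O"))
            i += 1
    return tagged

terms = {("conditional", "random", "field"), ("random", "field"), ("inverted", "index")}
sentence = "a conditional random field uses an inverted index".split()
result = tag_terms(sentence, terms)
```

Because the longest candidate is tried first, "conditional random field" is tagged as one term even though "random field" is also in the dictionary.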

Owner:XI AN JIAOTONG UNIV

Speech synthetic text processing method based on rhythm structure

Inactive | CN101000764A | Effective syncopation | Natural sound | Speech synthesis | Structure analysis | Speech synthesis

A method for processing speech-synthesis text based on rhythm structure includes: comparing the input text with a preset special symbol table to output a legal pronunciation character string; processing the legal pronunciation character string according to participle rules and rhythm structure analysis rules to output a labeled character string carrying rhythm structure information; and comparing the labeled character string with preset rhythm rules and a phonetic table word by word to output a phonetic code string labeled with rhythm information.

Owner:HEILONGJIANG UNIV

Text classification method of Chinese web page based on stream clustering

Inactive | CN101727500A | Wide coverage | Word segmentation method is simple and easy | Special data processing applications | Feature vector | The Internet

The invention relates to a text classification method for Chinese web pages based on stream clustering, belonging to the technical field of Internet data mining. The text classification method comprises the following steps: acquiring a web page in real time; removing unprocessed tags from the web page format and analyzing the characteristic information of the web page text; segmenting the text content into n-gram participles to form a plurality of word strings; computing the weight value of each word string; extracting the word strings with high weight values and using them, together with their corresponding weight values, as characteristic vectors; computing the similarity between the characteristic vectors and characteristic information and a known class; computing the total similarity obtained, and classifying the text into the known class or establishing a new class; judging whether the known class should be divided into two subclasses according to the number of characteristic items of the known class; and storing the processed text records and the information of the known class. The method thoroughly mines the effective information of web page texts by targeting their characteristics, and is incremental, fast, effective, and practical.

Owner:TSINGHUA UNIV

Natural language-based robot deep interacting and reasoning method and device

Active | CN106056207A | Improve recognition rate | Narrow search | Semantic analysis | Character and pattern recognition | Work task | Speech identification

The present invention discloses a natural language-based robot deep interacting and reasoning method and device. The method comprises: 1) a speech identification step of receiving the user's speech input and processing the input signal to obtain text information; 2) a case attribute obtaining step of carrying out participle processing on the text obtained in step 1), and then carrying out similarity matching between the segmented text and the cases in a case base to extract the attributes of the cases; 3) a deep dialogue and three-dimensional scene interaction step of, if the user intention obtained from the case attributes extracted in step 2) is not complete, repeatedly guiding the user by combining a real-time map file obtained by a Kinect sensor until the complete intention is obtained, and then generating a solution scheme for the working task expressed by the user's complete intention; and 4) a speech synthesis step of displaying the obtained solution scheme in a text format, synthesizing the speech, and feeding it back to the user via a sound device. Throughout the interaction, the robot and the user both use natural language.

Owner:WUHAN UNIV OF SCI & TECH

Method and system for cutting index participle

Active | CN101071420A | Resolve accuracy | Solve a certain amount of redundant words | Special data processing applications | Chinese characters | English characters

The invention discloses an index word segmentation method, including the following steps: reading a character stream; identifying the Chinese characters, English characters, and digits or other characters in the character stream; comparing the identified Chinese characters, English characters, or digits with a pre-built 1.1 tree to match words from the word set; applying generic fuzzy matching on ASCII codes to the English characters or digits to determine English strings or digit strings; and arranging the matched words, the English strings or digit strings, and the unrecognized characters in the order in which they appear in the character stream. The invention also discloses an index word segmentation system. The index word segmentation method and system provided by the invention can simultaneously address precise words, a certain amount of redundant words, and word-term problems, and enhance the user experience.

Owner:SHENZHEN TENCENT COMP SYST CO LTD

SVM based micro-blog emotion classification method fusing various kinds of emotion resources

Pending | CN106503049A | Improve accuracy | Accurate acquisition | Natural language data processing | Special data processing applications | Part of speech | Structure analysis

The invention discloses an SVM based micro-blog emotion classification method fusing various kinds of emotion resources. The method includes the following steps: constructing relevant dictionaries including an emotion dictionary, a negation dictionary, and a degree adverb dictionary; preprocessing different corpora, performing word segmentation and part-of-speech tagging on the corpora, and performing sentence structure analysis; comparing the segmented words with the positive and negative dictionaries to acquire the initial word polarity, comparing the words ahead of emotion words with the degree adverb dictionary and the negation dictionary to acquire the modifier weight, and multiplying the initial word polarity by the modifier weight to acquire the emotion score of each micro-blog; extracting features such as nouns, verbs, adjectives, positive and negative emotion words, degree adverb weights, emotion scores, negation words, and specific symbols from part-of-speech features, emotion features, sentence pattern features, and semantic features; and inputting the extracted features into LIBSVM to perform model training so as to acquire a trained model. The method can achieve 5-grade emotion classification of micro-blogs, and can accurately and comprehensively acquire the emotion tendency of netizens.
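
The emotion-score step (initial polarity multiplied by a modifier weight taken from the words ahead of the emotion word) can be sketched as below. The dictionaries, the two-word look-back window, and the rule that a negation word flips the sign are all illustrative assumptions; the abstract does not give these details.

```python
# Hypothetical toy dictionaries; a real system would load full lexicons.
POLARITY = {"good": 1.0, "happy": 1.0, "bad": -1.0, "awful": -1.0}
DEGREE = {"very": 2.0, "slightly": 0.5}
NEGATION = {"not", "never"}

def emotion_score(tokens, window=2):
    """Sum polarity * modifier weight over all emotion words in a segmented text."""
    score = 0.0
    for i, tok in enumerate(tokens):
        if tok not in POLARITY:
            continue
        weight = 1.0
        # Inspect the words immediately ahead of (before) the emotion word.
        for prev in tokens[max(0, i - window):i]:
            if prev in DEGREE:
                weight *= DEGREE[prev]
            elif prev in NEGATION:
                weight *= -1.0  # assumption: negation flips the sign
        score += POLARITY[tok] * weight
    return score

score_negated = emotion_score("not very good".split())
score_mixed = emotion_score("slightly bad but very happy".split())
```

In the abstract, this score is one feature among several passed to the SVM, not the final classification.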

Owner:NANJING UNIV OF SCI & TECH

Data classification method and device

Active | CN102193936A | Reduce in quantity | Improve classification | Digital data information retrieval | Commerce | Classification methods | Data mining

The invention relates to the field of data processing and discloses a commodity classification method and device, which are used for increasing the executing efficiency of a commodity classification flow. The method comprises the following steps of: acquiring relevant data of commodities to be classified and extracting commodity titles from the data; dividing the commodity titles into participles and determining the weight of each participle, wherein the weight of each participle represents the historical occurrence rate of the participle; for each commodity, selecting the participles whose weight values satisfy a preset condition to constitute a participle sequence; and comparing the participle sequences selected for the commodities and combining the relevant data of commodities having the same participle sequence. By adopting the method and the device, the quantity of relevant data of commodities needing to be processed is reduced greatly, commodity classification can be realized quickly and accurately in a short period of time, the executing efficiency of the commodity classification flow is increased effectively, the management complexity of relevant data of the commodities is lowered, and the operation load of a system is lowered.
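
The grouping idea in this abstract can be sketched as follows. "Participle" here appears to mean a segmented word from the title. The sketch assumes whitespace tokenization, uses the share of titles containing a token as a stand-in for its historical occurrence rate, and treats "weight at or above a threshold" as the preset condition; none of these choices is confirmed by the abstract, and `group_titles` is a hypothetical name.

```python
from collections import Counter, defaultdict

def group_titles(titles, min_rate=0.5):
    """Group items whose titles reduce to the same set of high-weight tokens."""
    tokenized = {item: title.lower().split() for item, title in titles.items()}
    # Weight: fraction of titles in which the token occurs (illustrative stand-in).
    doc_freq = Counter(tok for toks in tokenized.values() for tok in set(toks))
    n = len(titles)
    groups = defaultdict(list)
    for item, toks in tokenized.items():
        # Sorting makes the key independent of word order in the title.
        key = tuple(sorted({t for t in toks if doc_freq[t] / n >= min_rate}))
        groups[key].append(item)
    return dict(groups)

titles = {
    "a1": "Red cotton shirt large",
    "a2": "cotton shirt red XL",
    "b1": "Steel water bottle",
    "b2": "water bottle steel 500ml",
}
groups = group_titles(titles)
```

Items sharing a key can then be merged and classified once, which is where the abstract's reduction in data volume comes from.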

Owner:ALIBABA GRP HLDG LTD

Semantization service generation system and method based on graph mining technique

Active | CN103631882A | Shield dependencies | Close to and meet the needs | Web data indexing | Special data processing applications | System integration | Web service

The invention provides a semantization service generation system and method based on a graph mining technique. The system is built based on a traditional server and comprises multiple techniques including natural language participles, graph mining, clustering, semantization analysis, service procedure generation and service execution. After a user collects and analyzes application requirements with a natural language or text description, key words are naturally extracted, service requirements are analyzed, usable services and combination modes thereof are mined in a built Web service tree graph, and ultimately, in a system integration operation environment, the services are automatically carried out and execution results are fed back. The system and method have the advantages that operations are directly carried out on the requirements of the user for the natural language or the text description, semantization features are emphasized, the service execution environment is integrated, the service operation results are directly obtained, use habits and requirements of the user are met, widening of a user range is facilitated, automatic operation and maintenance of the system are achieved, and the system and method are suitable for the distributive execution environment.

Owner:BEIJING UNIV OF POSTS & TELECOMM

Recognition text error correction method and device

Active | CN107977356A | Implement error correction | Improve accuracy | Natural language translation | Speech recognition | Confidence measures | Identification error

The invention provides a recognition text error correction method and device, and belongs to the technical field of language processing. The method comprises the following steps of: determining an error correction word corresponding to each erroneously recognized participle in a recognition text; replacing the corresponding error participle in the recognition text with each error correction word so as to obtain error correction candidate texts corresponding to the recognition text; and determining an error correction credibility for each error correction candidate text and taking the error correction candidate texts whose error correction credibility is greater than a first preset threshold value as the error-corrected recognition texts. After the recognition text is obtained through voice recognition, the error correction candidate texts with relatively high credibility can be selected as error-corrected recognition texts, so as to correct the recognition text and improve the correctness of subsequent translation.

Owner:新疆声谷融创数字产业发展有限公司

Named entity identification method, device, medium and equipment

Active | CN109145303A | Improve accuracy | Simple structure | Natural language data processing | Special data processing applications | Pattern recognition | Named-entity recognition

Embodiments of the present application disclose a named entity recognition method, device, equipment, and medium. The method includes: obtaining a text to be recognized; performing word segmentation on the text to obtain a word segmentation sequence; inputting the word segmentation sequence into a named entity recognition model and obtaining the attribute identifier of the named entity corresponding to each participle output by the model; and determining the named entities in the text according to those attribute identifiers. The named entity recognition model used in this method is based on a feedforward neural network with a simple structure and few parameters, which makes the model easy to maintain and update. In addition, the model determines the attribute identifier for each participle from multi-dimensional segmentation features that fully and comprehensively express the participle's semantic information, which ensures the accuracy of named entity recognition. The present application also provides a method and apparatus for training a named entity recognition model.

Owner:TENCENT TECH (SHENZHEN) CO LTD

Speech recognition method and system

Active | CN107204184A | High precision | Low cost | Natural language data processing | Speech recognition | Data source | Speech identification

The present invention discloses a speech recognition method and system. The method comprises: obtaining information text of a specific type from a predetermined data source; splitting each obtained information text into statements, segmenting each statement to obtain its participles, and forming a first mapping corpus from each statement and its participles; and training a first language model of a preset type on the obtained first mapping corpora and performing speech recognition based on the trained first language model. The precision of speech recognition is effectively improved, and the cost of speech recognition is effectively reduced.

Owner:PING AN TECH (SHENZHEN) CO LTD

Product review attribute-level emotion classification method based on rules and neural networks

Active | CN107862343A | Promote amplification | Efficient use of | Character and pattern recognition | Neural learning methods | Classification methods | Business forecasting

The invention discloses a product review attribute-level emotion classification method based on rules and neural networks. The method includes the steps: firstly, acquiring review data and filtering Chinese participles and stop words from a review text; secondly, screening a product attribute set by the aid of a rule template, building a <attribute and review> sample set, performing emotion tagging on the attribute of each review, and building a <attribute, review and emotion> training set; building a neural network emotion classification model based on bilateral attention, and training the model by the aid of the training set; finally, filtering Chinese participles and stop words from testing data, screening a product attribute set, building a <attribute and review> testing set, and performing emotion classification by the aid of an emotion classification model. According to the method, attribute emotion category forecasting accuracy can be greatly and effectively improved by the aid of context information of attributes in the reviews.

Owner:NANJING UNIV OF SCI & TECH

Text matching method and device

Active | CN105893533A | Improve accuracy | Special data processing applications | Text database clustering/classification | User input | Text matching

The embodiment of the invention provides a text matching method and device applied to electronic equipment. The method comprises the steps that: search terms input by a user are received and subjected to participle processing to obtain at least one participle; the importance weight of each participle is determined according to a pre-trained classification model and at least one of the term property, semantic attribute, independent search term probability, and click rate of each participle; the term frequency and inverse document frequency of each participle are determined according to the text to be matched and the text set in which that text is located; and the matching degree between the search terms and the text to be matched is calculated according to the importance weight, term frequency, and inverse document frequency of each participle. With this method and device, the importance of each participle can be measured more accurately by means of its importance weight and inverse document frequency, and text matching accuracy therefore improves when matching is carried out according to the importance of each participle.
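
The matching-degree calculation can be sketched as a weighted TF-IDF sum. The abstract does not state how the three quantities are combined, so the product `importance * tf * idf` summed over query participles is an assumption, as are the smoothed IDF formula and the hand-set `importance` table (the abstract obtains importance weights from a pre-trained classification model, which is not reproduced here).

```python
import math

def match_score(query_tokens, doc_tokens, corpus, importance):
    """Score a document against a query as sum(importance * tf * idf) per query token."""
    if not doc_tokens:
        return 0.0
    n_docs = len(corpus)
    score = 0.0
    for tok in query_tokens:
        tf = doc_tokens.count(tok) / len(doc_tokens)
        df = sum(1 for doc in corpus if tok in doc)
        idf = math.log((n_docs + 1) / (df + 1)) + 1.0  # smoothed IDF (assumption)
        score += importance.get(tok, 1.0) * tf * idf
    return score

corpus = [
    ["cheap", "flights", "to", "paris"],
    ["paris", "hotel", "deals"],
    ["cheap", "laptop", "deals"],
]
query = ["cheap", "paris", "flights"]
importance = {"paris": 2.0, "flights": 1.5, "cheap": 0.5}  # hypothetical weights
scores = [match_score(query, doc, corpus, importance) for doc in corpus]
best = max(range(len(corpus)), key=scores.__getitem__)
```

With these toy weights, a document matching only "paris" outranks one matching only "cheap", which is the effect the importance weight is meant to add on top of plain TF-IDF.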

Owner:BEIJING QIYI CENTURY SCI & TECH CO LTD

Generation system and method for Chinese similar problem

Active | CN108287822A | Improve match | Improve rationality | Natural language translation | Semantic analysis | Question generation | Algorithm

The invention discloses a generation system and method for a Chinese similar problem. The system comprises a preprocessing module, a named entity identification module, a problem classification module, and a similar semantic problem generation module. The preprocessing module receives a given problem and preprocesses it, wherein the preprocessing comprises word segmentation, stop word removal, and part-of-speech tagging of the segmented words. The named entity identification module identifies the named entities in the given problem. The problem classification module classifies given problems according to semantics. The similar semantic problem generation module generates semantically similar problems for the given problems, and comprises a rule-based generation submodule and a machine-learning-based generation submodule. By use of the system, the matching degree between the generated problem and the original problem, and the rationality of the generated problem, can be effectively improved.

Owner:BEIJING RONGLIAN YITONG INFORMATION TECH CO LTD

Query suggestion method based on query semantics and click-through data

Inactive | CN102253982A | Improve usability | Improve interactivity | Special data processing applications | User input | Ambiguity

The invention relates to a query suggestion method based on query semantics and click-through data, which comprises the following steps of: 1, preprocessing collected query log data; 2, preprocessing participles and filtering stop words of query data input by a user; 3, calculating similarity of log information in a user query data string and a query log library one by one; 4, calculating semantic relativity of the log information in the user query data string and the query log library one by one on the basis of a word concept relevancy calculation method in the HowNet; 5, fusing the similarity and the semantic relativity, and calculating query semantic relativity of each piece of log information in the user query data string and the query log library; and 6, taking Top-N out and recommending to the user according to a descending relativity sequence in the step 5. By the method, query ambiguity can be effectively eliminated, an input error can be reminded, and usability and interactivity of an information retrieval system are improved.
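
Steps 3 to 6 of this abstract (literal similarity, semantic relatedness, fusion, and Top-N selection) can be sketched as below. The sketch assumes Jaccard overlap for the literal similarity, a hand-filled lookup table in place of the HowNet-based relatedness calculation, and a linear fusion with weight `alpha`; the abstract specifies none of these, and all names are hypothetical.

```python
def jaccard(a, b):
    """Token-set overlap, used here as a stand-in for the literal similarity."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def suggest(query, log_queries, semantic, alpha=0.5, top_n=2):
    """Rank logged queries by a linear fusion of literal and semantic scores."""
    scored = []
    for cand in log_queries:
        literal = jaccard(query.split(), cand.split())
        related = semantic.get((query, cand), 0.0)  # placeholder for HowNet relatedness
        scored.append((alpha * literal + (1 - alpha) * related, cand))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [cand for _, cand in scored[:top_n]]

logs = ["apple phone price", "apple fruit nutrition", "phone case"]
semantic = {
    ("apple phone", "apple phone price"): 0.9,
    ("apple phone", "phone case"): 0.6,
    ("apple phone", "apple fruit nutrition"): 0.1,
}
suggestions = suggest("apple phone", logs, semantic)
```

Here "phone case" outranks "apple fruit nutrition" despite sharing the same number of words with the query, because the semantic term separates the two senses of "apple".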

Owner:BEIJING INSTITUTE OF TECHNOLOGY

Method for calculating language structure, executing participle, machine translation and speech recognition using HMM

Inactive | CN101201818A | Complete and expressive | Complete and efficient computing | Speech recognition | Special data processing applications | Hidden Markov model | Machine translation system

The invention discloses a method of modeling and calculating language structure by adopting hidden Markov models (HMM); the method can express and calculate the grammar structure of natural language effectively and completely, which more particularly expresses and calculates recursion and juxtaposition. The technical proposal is that: the GBBG grammar of the language is established; the HMM grammar of the language is established and the topological structure of LHMM is designed; the topological structure and parameters of the LHMM are trained and tested according to the HMM grammar; the grammar structure of sentences required to be analyzed is calculated by adopting the HMM theory and the topological structure of the LHMM and the parameters are regulated continuously according to the need. The invention is applied to the processing field of natural language.

Owner:李萍

Method and device for translating natural languages into commands and navigation application of method and device

Inactive | CN104766606A | Accurate identification | Quick identification | Speech recognition | Programming language | Speech identification

The invention discloses a method and device for translating natural languages into commands and navigation application of the method and device, and belongs to the technical field of speech recognition. The method includes the steps of entering statements and marking instruction classifications of the statements; conducting word segmentation on the statements; calculating first probabilities of all segmentation words, and storing the segmentation words, the first probabilities and the sequence of the segmentation words in the statements into a first data sheet; calculating second probabilities of all segmentation words, and storing the segmentation words, the second probabilities and the instruction classifications into a second data sheet; calculating the first matching degrees between similar studying statements and a conjecturing statement, and judging that the studying statement with the highest first matching degree is more similar to the conjecturing statement; calculating the second matching degrees between all similar command classifications and the conjecturing statement, and judging that the command classification with the highest matching degree is the command classification of the conjecturing statement. By means of the method and device and the navigation application, the natural languages can be translated into commands readable by a machine more accurately and rapidly, and the expandability is good.

Owner:SHANGHAI XIUYUAN NETWORK TECH

An intelligent operation and maintenance statement similarity matching method based on natural language processing

Inactive | CN109902159A | Avoid the phenomenon of "curse of dimensionality" | Improve accuracy | Special data processing applications | Semantic tool creation | Curse of dimensionality | Sentence similarity

The invention discloses an intelligent operation and maintenance statement similarity matching method based on natural language processing technology. The method mainly comprises two parts: data processing during knowledge base construction, and sentence similarity matching based on deep learning. Compared with the prior art, the method has the following advantages: (1) the operation and maintenance management knowledge is segmented using a specific word library and an HMM new-word discovery model, so that text word segmentation accuracy is improved and a more complete text word library is established; (2) word vectors are trained through a deep learning method, so that the "curse of dimensionality" in word vector representation is avoided, the information in vocabulary contexts is fully mined, and the relations between words are obtained; and (3) with sentence vectors built from weighted word vectors, the importance of each word is captured, the sentence vectors carry richer information through the combination of word vectors, and matching accuracy on the resulting sentence vectors is ensured through a cosine similarity matching algorithm.
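
Point (3) of this abstract, weighted sentence vectors compared by cosine similarity, can be sketched as follows. The toy three-dimensional word vectors and the per-word weights are invented for illustration; in the abstract the vectors come from deep learning training, and the weighting scheme is not specified.

```python
import math

def sentence_vector(tokens, vectors, weights):
    """Weighted sum of word vectors; unknown words are skipped."""
    dim = len(next(iter(vectors.values())))
    total = [0.0] * dim
    for tok in tokens:
        if tok in vectors:
            w = weights.get(tok, 1.0)
            total = [t + w * v for t, v in zip(total, vectors[tok])]
    return total

def cosine(u, v):
    """Cosine similarity, returning 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Hypothetical toy embeddings and word weights.
vectors = {
    "restart": [1.0, 0.0, 0.0],
    "reboot": [0.9, 0.1, 0.0],
    "server": [0.0, 1.0, 0.0],
    "disk": [0.0, 0.0, 1.0],
    "full": [0.0, 0.1, 0.9],
}
weights = {"server": 0.5}

s1 = sentence_vector("restart server".split(), vectors, weights)
s2 = sentence_vector("reboot server".split(), vectors, weights)
s3 = sentence_vector("disk full".split(), vectors, weights)
sim_close = cosine(s1, s2)
sim_far = cosine(s1, s3)
```

Down-weighting the shared word "server" lets the similarity depend mostly on the verbs, which is the kind of effect a per-word importance weight is meant to give.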

Owner:华融融通(北京)科技有限公司

Document similarity calculation method and near-duplicate document detection method and device

Inactive | CN104252445A | Improve detection accuracy | Efficient identification | Natural language data processing | Special data processing applications | Computation complexity | Document similarity

The invention relates to a document similarity calculation method and a near-duplicate document detection method and device. The calculation method comprises the following steps: performing word segmentation on the two documents to be detected to obtain a participle set for each document; calculating the edit similarity of all participle pairs across the two participle sets, wherein the two participles in each pair come from the two sets respectively; establishing edges between the participle pairs whose edit similarity meets a certain requirement to obtain a weighted bipartite graph, wherein the edit similarity of each pair is the weight of the corresponding edge; calculating the maximum weighted matching value of the weighted bipartite graph; and calculating the similarity between the documents to be detected by using the maximum weighted matching value. The method achieves high accuracy, effectively identifies near-duplicate texts whose participle sets contain editing errors, increases near-duplicate detection accuracy, lowers calculation complexity, and improves calculation efficiency.
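
The pipeline in this abstract can be sketched on tiny inputs as below. Several choices are assumptions made for illustration only: whitespace tokenization, `difflib.SequenceMatcher.ratio()` as a stand-in for the edit similarity, a fixed threshold for creating edges, brute-force search over permutations for the maximum weighted matching (practical only for a handful of tokens), and normalizing by the larger set size.

```python
from difflib import SequenceMatcher
from itertools import permutations

def edit_similarity(a, b):
    """Stand-in for edit similarity; SequenceMatcher.ratio() is not edit distance."""
    return SequenceMatcher(None, a, b).ratio()

def max_weight_matching(left, right, threshold):
    """Brute-force maximum weighted bipartite matching value for small token sets."""
    if len(left) > len(right):
        left, right = right, left
    weights = []
    for a in left:
        row = []
        for b in right:
            s = edit_similarity(a, b)
            row.append(s if s >= threshold else 0.0)  # no edge below the threshold
        weights.append(row)
    best = 0.0
    for perm in permutations(range(len(right)), len(left)):
        best = max(best, sum(weights[i][j] for i, j in enumerate(perm)))
    return best

def document_similarity(doc_a, doc_b, threshold=0.6):
    a, b = sorted(set(doc_a.split())), sorted(set(doc_b.split()))
    if not a or not b:
        return 0.0
    return max_weight_matching(a, b, threshold) / max(len(a), len(b))

original = "near duplicate document detection"
typo = "near duplicate documant detection"
sim_same = document_similarity(original, original)
sim_typo = document_similarity(original, typo)
sim_other = document_similarity(original, "alpha beta gamma")
```

Because "documant" still forms a high-weight edge with "document", the misspelled copy scores close to 1, whereas an exact set comparison would count it as a mismatch. A production version would replace the permutation search with a polynomial-time assignment algorithm.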

Owner:HUAWEI TECH CO LTD +1

Related resource address push method and device based on video retrieval

Active | CN103491205A | Improve digging efficiency | Improve recall | Transmission | Video retrieval | Co-occurrence

The invention discloses a related resource address push method and device based on video retrieval. The method comprises the steps of: obtaining the characteristic text information of first video resource data when a request to load or play the first video resource data is received; mapping the characteristic text information to one or more first participles; searching for related second participles whose co-occurrence rate with the one or more first participles is higher than a preset threshold value, where the co-occurrence rate is the probability that the first participles and the second participles appear together in the same video resource data; obtaining the network link addresses of second video resource data matched with the one or more first participles and the related second participles; and pushing the network link addresses of the second video resource data. The method and device mine high-quality resources deep in a video database and improve the efficiency of resource mining. In addition, the index table can grow continuously as video content accumulates on the internet, which helps improve the recall rate.
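The co-occurrence lookup described above can be sketched roughly as follows. The abstract does not give a formula, so this sketch assumes the co-occurrence rate is the share of videos containing the first participle that also contain the second one, and it assumes that "matched" means a video contains any of the selected participles. The names `build_index`, `related_tokens`, and `recommend_videos` and the video IDs are hypothetical.

```python
from collections import defaultdict


def build_index(videos):
    """Inverted index: participle -> set of video IDs whose text contains it."""
    index = defaultdict(set)
    for video_id, tokens in videos.items():
        for token in tokens:
            index[token].add(video_id)
    return index


def related_tokens(index, first_token, threshold):
    """Second participles whose co-occurrence rate with first_token exceeds threshold."""
    base = index.get(first_token, set())
    if not base:
        return {}
    related = {}
    for token, video_ids in index.items():
        if token == first_token:
            continue
        rate = len(base & video_ids) / len(base)  # assumed definition of the rate
        if rate > threshold:
            related[token] = rate
    return related


def recommend_videos(index, first_token, threshold, exclude=None):
    """IDs of videos matching the first participle or any related second participle."""
    tokens = {first_token, *related_tokens(index, first_token, threshold)}
    matched = set()
    for token in tokens:
        matched |= index.get(token, set())
    matched.discard(exclude)  # do not push the video that is already playing
    return sorted(matched)


videos = {
    "v1": {"cat", "funny"},
    "v2": {"cat", "funny", "dog"},
    "v3": {"cat", "news"},
    "v4": {"funny", "dog"},
}
index = build_index(videos)
```

In a real system the returned IDs would be resolved to link addresses before being pushed; that lookup is omitted here.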

Owner:BEIJING QIHOO TECH CO LTD

Text sensitive information recognition method based on semi-supervised learning

Inactive | CN106897459A | Improve acceleration performance | Improve practicality | Character and pattern recognition | Natural language data processing | Feature vector | Supervised learning

The invention relates to the technical field of information security and discloses a text sensitive information recognition method based on semi-supervised learning. The method comprises the steps of: 1, conducting semi-supervised learning on the texts in a labeled sensitive text set and an unlabeled unknown text set to obtain a classification strategy knowledge base; 2, conducting Chinese participle (word segmentation) and stop-word processing on a text to be detected to obtain the characteristic element data in the text; 3, representing the characteristic element data with a characteristic vector and extracting a characteristic value; 4, using the classification strategy knowledge base to judge the characteristic value and outputting a result of either sensitive text or safe text. Because only a small number of sensitive texts need to be labeled while a large text set is learned in a semi-supervised manner, the method improves the scalability and practicability of sensitive information recognition.

Owner:NO 30 INST OF CHINA ELECTRONIC TECH GRP CORP

Voice recognition text error correction method in specific field

Active | CN111369996A | Improve the detection rate | Avoid being mishandled | Natural language data processing | Speech recognition | Word list | Speech sound

The invention relates to a voice recognition text error correction method in a specific field. The method comprises the following steps: first, collecting statistics from correct in-domain corpora to obtain character-level and word-level language models and a pinyin language model; then, receiving a text sequence to be corrected and splitting it into clauses when it contains more than one sentence; determining suspected wrong words using the character, word, and pinyin language models; determining a candidate word list for the suspected wrong words according to the language model vocabulary and a pronunciation-prone dictionary; and finally, substituting the candidate words into the original text sequence and selecting and outputting the most reasonable sentence based on combined macroscopic and microscopic scores. Basic units of different granularities and dimensions, such as characters, words, pinyin, and initials and finals, are used to build the language models, which reduces the word segmentation errors caused by wrong characters. Isolated character disorder is handled by the word language model, and continuous recognition errors caused by pronunciation deviation are identified by the pinyin language model. Candidate sentences with the wrong words replaced are evaluated with both macroscopic and microscopic scores to measure the fluency of the replaced sentences.

Owner:网经科技(苏州)有限公司

Comment text aspect-level sentiment classification method and system based on deep learning

Pending | CN111858945A | High precision | Fast prediction | Semantic analysis | Character and pattern recognition | Pattern recognition | Emotion classification

The invention provides a comment text aspect-level sentiment classification method based on deep learning. The method comprises the following steps: preprocessing a comment text, including word segmentation and stop-word removal; balancing aspect words and their corresponding tags to generate a balanced sample; vectorizing the balanced sample and the Chinese words in the original sample to obtain the word vectors of the balanced sample; and inputting the word vectors into the model to predict a comment result. The model is a deep learning model built on a deep neural network; it computes the similarity between the word vectors of the aspect words and those of the other words in a sentence and generates an aspect emotion semantic matrix for the balanced sample. Through the balance processing and the construction of the Attn-Bi-LCNN model, the method can effectively output the emotion semantic matrix and improves both the accuracy of the model and its prediction speed in practical application, so it is suitable for aspect-level fine-grained emotion classification of texts.

Owner:上海哈蜂信息科技有限公司

Figure information disambiguation treatment method based on social network and name context

Inactive | CN102054029A | Implement disambiguation | Special data processing applications | Documentation procedure | Name context

The invention discloses a figure information disambiguation method based on a social network and name context. It relates to the disambiguation of Internet figure information and addresses the problem in the prior art that, when a search engine retrieves a specified name, the results mix web pages about different figures who share that name. The method is used for retrieving Internet figure information and comprises the following steps: first, the user inputs a name to be retrieved, completes the retrieval with a search engine, and downloads the retrieved web pages to a local computer with downloading software; second, text extraction, participle processing, and part-of-speech tagging are carried out on each web page to form a document; third, the documents are classified using figure field information, the figure field information is clustered using the social network and the context information, and finally the correspondence between the figure field information and each entity figure is displayed, together with the social network to which each entity figure belongs.

Owner:HARBIN INST OF TECH

Method for inputting spacing participle

Inactive | CN101382844A | Resolve input ambiguity | Improve input efficiency | Input/output processes for data processing | User input | Algorithm

The invention relates to an interval-based word segmentation input method that automatically segments words according to the user's pauses while typing. The method records and calculates the time interval between two successively input characters or codes and compares it with a threshold value: if the interval is greater than the threshold, it is a long interval; if it is less than or equal to the threshold, it is a short interval. A short interval indicates a pause between input characters or codes that belong to the same word or phrase; a long interval indicates a pause between words, meaning that the input of the characters or codes of the current word or phrase is complete. The method resolves the ambiguity of code input found in other input methods and the resulting limit on input efficiency, thus making up for the shortcomings of the prior art.
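The threshold rule described above is simple enough to sketch directly. The following Python example is only an illustration: the function name `segment_by_pause`, the `(character, timestamp)` input format, and the millisecond unit are assumptions, not details from the patent.

```python
def segment_by_pause(keystrokes, threshold_ms):
    """Split a keystroke stream into words at long pauses.

    keystrokes: list of (character, timestamp_ms) pairs in input order.
    A gap greater than threshold_ms is a long interval (word boundary);
    a gap less than or equal to it is a short interval (same word).
    """
    words = []
    current = ""
    previous_time = None
    for char, timestamp in keystrokes:
        if previous_time is not None and timestamp - previous_time > threshold_ms:
            words.append(current)  # long interval: the current word is complete
            current = ""
        current += char
        previous_time = timestamp
    if current:
        words.append(current)
    return words


typed = [("n", 0), ("i", 100), ("h", 900), ("a", 1000), ("o", 1100)]
print(segment_by_pause(typed, threshold_ms=500))
```

With a 500 ms threshold, the 800 ms pause between "i" and "h" is the only long interval, so the stream splits into "ni" and "hao".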

Owner:上海埃帕信息科技有限公司

Error correction method and device thereof

Active | CN103914444A | Improve accuracy | Special data processing applications | Algorithm | Correction method

The invention discloses an error correction method and a device thereof. The error correction method comprises the steps of: performing word segmentation on keywords input by a user to obtain the participles of the keywords; checking the validity of each participle and treating the participles that fail the validity check as invalid participles; correcting the invalid participles according to their outer codes to obtain corrected participles; and replacing the corresponding invalid participles in the keywords with the corrected participles to obtain the corrected keywords. Because the participles are corrected through their outer codes, the cause of a participle's input error can be determined, which improves the accuracy of correcting the participles and, in turn, the accuracy of correcting the keywords.

Owner:ALIBABA (CHINA) CO LTD

Text message processing method and device

Inactive | CN106547740A | Lower update requirements | Improve accuracy | Semantic analysis | Special data processing applications | Participle | Natural language

The invention provides a text message processing method and device, and belongs to the technical field of natural language processing and data mining. The method comprises the following steps: acquiring a text message; carrying out participle processing on the text message to obtain a plurality of undetermined terms; acquiring the term vector corresponding to each undetermined term; calculating the similarity between the term vector of each undetermined term and the term vector of each emotion term in a preset emotion dictionary; and judging the emotion attribute of the text message according to these similarities. Compared with existing methods, the method and device reduce the requirement on how quickly the emotion dictionary must be updated, avoid the poor emotion analysis results caused by an emotion dictionary that is not updated in time, and effectively improve the accuracy of the analysis results.
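A rough Python sketch of the similarity step follows. The abstract does not say which similarity measure or decision rule is used, so this sketch assumes cosine similarity and a simple vote: each term contributes the polarity of its most similar emotion term, weighted by that similarity, provided it reaches `min_similarity`. The toy three-dimensional vectors and the two-word emotion dictionary are invented for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 for zero vectors)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 0.0 if norm_u == 0 or norm_v == 0 else dot / (norm_u * norm_v)


def classify_emotion(terms, vectors, emotion_dict, min_similarity=0.6):
    """Label a segmented message as positive, negative, or neutral."""
    score = 0.0
    for term in terms:
        if term not in vectors:
            continue  # no vector available for this term
        best_sim, best_polarity = max(
            ((cosine(vectors[term], vectors[word]), polarity)
             for word, polarity in emotion_dict.items() if word in vectors),
            default=(0.0, 0),
        )
        if best_sim >= min_similarity:
            score += best_sim * best_polarity
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Toy data: hypothetical vectors and emotion dictionary (+1 positive, -1 negative).
vectors = {
    "good": (1.0, 0.0, 0.0),
    "bad": (0.0, 1.0, 0.0),
    "great": (0.9, 0.1, 0.1),
    "awful": (0.1, 0.9, 0.1),
    "table": (0.0, 0.0, 1.0),
}
emotion_dict = {"good": 1, "bad": -1}
```

Here "great" is not in the emotion dictionary, yet it is still scored as positive because its vector lies close to that of "good", which is the benefit the abstract attributes to comparing vectors rather than looking up dictionary entries.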

Owner:四川无声信息技术有限公司

Prosodic hierarchy annotation method and device

Active | CN105185374A | Solve the problem of limited extension range of contextual features | Precise delivery | Speech recognition | Natural language processing | Short-term memory

The invention discloses a prosodic hierarchy annotation method and a prosodic hierarchy annotation device. The method comprises the steps of: S1, acquiring a text sequence; S2, segmenting the text sequence into a plurality of participles and extracting features of the participles; S3, taking the features as input and obtaining the corresponding output results from a bidirectional long short-term memory (LSTM) model; and S4, annotating the prosodic hierarchies of the text sequence according to the output results. Because the prosodic hierarchies are annotated with the bidirectional LSTM model, the method and device effectively solve the problem of the limited extension range of the contextual features of the participles in the text sequence, and because all prosodic hierarchies are annotated in a single pass, error propagation between annotation steps is avoided.

Owner:BEIJING BAIDU NETCOM SCI & TECH CO LTD

Computer processes for analyzing and improving document readability by identifying passive voice

Active | US8014996B1 | Improve readability | Natural language translation | Special data processing applications | Documentation | Speech sound

Disclosed are systems and methods for analyzing and improving document readability. For example, a computer-implementable method of improving authored text is disclosed that can identify passive voice text within authored text by performing the following steps: scanning for a “to be” verb in the text; identifying from the scan an occurrence of a “to be” verb; scanning for a past participle in the text; identifying from the scan an occurrence of a past participle in the text; and providing information to a user regarding a proposed edit to improve the text. Various other rules for improving text are also disclosed.
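The scan described above can be approximated with a short Python heuristic. This is a sketch under stated assumptions, not the patented rule set: it only flags a "to be" verb immediately followed by a past participle, and it recognizes past participles by an "-ed" ending or membership in a small, hand-picked irregular list (`IRREGULAR_PARTICIPLES` is illustrative and far from complete).

```python
import re

BE_FORMS = {"am", "is", "are", "was", "were", "be", "been", "being"}
# Small illustrative list; a real tool would need a much fuller lexicon.
IRREGULAR_PARTICIPLES = {
    "written", "done", "made", "taken", "given",
    "seen", "known", "found", "built", "sent",
}


def is_past_participle(word):
    """Crude test: irregular list or a regular '-ed' ending."""
    return word in IRREGULAR_PARTICIPLES or (len(word) > 3 and word.endswith("ed"))


def find_passive_voice(text):
    """Return each 'to be' + past participle pair found in the text."""
    words = re.findall(r"[A-Za-z']+", text.lower())
    hits = []
    for current, following in zip(words, words[1:]):
        if current in BE_FORMS and is_past_participle(following):
            hits.append(f"{current} {following}")
    return hits


def suggest_edits(text):
    """Produce one advisory message per suspected passive construction."""
    return [
        f'Possible passive voice: "{hit}". Consider an active construction.'
        for hit in find_passive_voice(text)
    ]
```

A heuristic this simple misses passives with an intervening adverb ("was quickly written") and can flag adjectives ending in "-ed", so its output is best treated as a suggestion for the author to review.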

Owner:WORDRAKE HLDG
