
65 results about "Lexical set" patented technology

A lexical set is a group of words that all fall under a single category based on some shared phonological feature.

Text similarity detection device

The invention discloses a text similarity detection device. The text similarity detection device comprises the following steps: constructing a thesaurus according to classification labels of Baidu Encyclopedia entries; inputting two Chinese documents needing to be compared, and pre-processing the two Chinese documents respectively; filtering words in the two Chinese documents and removing repeated words to generate a word item set; dividing word items in the word item set into a specialized word set and a common word set; aligning specialized words in two sentences in the two Chinese documents and aligning common words in the two sentences; calculating the similarity, relative to the word with the corresponding property, of each word respectively; and calculating the similarity of each sentence in the two Chinese documents. According to the method, manpower resources are saved to the greatest extent, and the judgment accuracy and the judgment speed of a computer network system to Chinese are improved.
Owner:CHINA AGRI UNIV

Speech recognition method and system

A speech recognition method comprising the steps of: storing multiple recognition models for a vocabulary set, each model distinguished from the other models in response to a Lombard characteristic, detecting at least one speaker utterance in a motor vehicle, selecting one of the multiple recognition models in response to a Lombard characteristic of the at least one speaker utterance, utilizing the selected recognition model to recognize the at least one speaker utterance; and providing a signal in response to the recognition.
Owner:GENERAL MOTORS LLC

Comment analysis method based on word vectors and syntactic features and visual interactive interface

The invention provides a comment analysis method based on word vectors and syntactic characteristics in the field of data analysis. The comment analysis method comprises the steps of obtaining commodity page comment data of an e-commerce website; preprocessing the acquired target data set; extracting an appendix lexical set provided by Hownet and NTU to form a basic emotion dictionary; carrying out word vector training on the obtained preprocessed data set through a Word2Vec tool; establishing a probability transfer matrix by using the semantic similarity matrix; carrying out core sentence rule-based processing on the obtained commodity comment text; carrying out preprocessing on the obtained text without the redundancy; performing part-of-speech extraction (commodity attributes, negative words, degree words and sentiment words) evaluation matching on the obtained dependency relationship pairs; combining the evaluation matching pair with an emotion dictionary, subjecting evaluation objects to appraisal value calculation and quality sorting, and finally, realizing the evaluation objects through a visual interaction interface, so that accurate, real-time, automatic and convenient processing and analysis on commodity comment data are realized, and the method can be used in an e-commerce platform.
Owner:NANJING UNIV OF POSTS & TELECOMM

Limited domain-oriented knowledge graph updating method and system

The invention provides a limited domain-oriented knowledge graph updating method and system, and the method comprises the steps: inputting limited domain question and answer corpora, extracting candidate entities of sentences in the corpora through word segmentation, screening common functional words in a word segmentation result through a word frequency dictionary, and obtaining a candidate entity set; constructing an inverted index dictionary according to the limited domain knowledge graph to obtain a similar vocabulary set of each candidate entity; training the candidate entities and the corresponding similar vocabulary sets into word vectors, and calculating cosine similarity so as to judge the types of the candidate entities; obtaining the relationship between every two candidate entities in the candidate entity set by using the trained Bert text classification model; and updating the candidate entity type and the relationship between the candidate entities into the knowledge graph according to the judgment. The knowledge graph updating method provided by the invention is higher in efficiency, can recognize the newly appearing entity type according to the existing entities in the graph, and effectively improves the knowledge graph updating speed and accuracy.
Owner:HUAZHONG NORMAL UNIV
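
As a rough illustration of the cosine-similarity step in the knowledge graph abstract above, the Python sketch below compares a candidate entity's vector with one reference vector per entity type. The function names, the 0.8 threshold and the toy vectors are assumptions made for illustration; the abstract does not specify them, and word-vector training and the Bert relation classifier are left out.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def classify_entity(candidate_vec, type_vectors, threshold=0.8):
    """Return the best-matching entity type, or None if nothing reaches the threshold.

    `type_vectors` maps a type name to a reference vector (hypothetical format).
    A None result can be read as "possibly a new entity type".
    """
    best_type, best_score = None, threshold
    for type_name, vec in type_vectors.items():
        score = cosine(candidate_vec, vec)
        if score >= best_score:
            best_type, best_score = type_name, score
    return best_type


types = {"person": [0.9, 0.1], "place": [0.0, 1.0]}
print(classify_entity([1.0, 0.0], types))
```

Returning None instead of forcing a match is one simple way to flag a candidate whose type is not yet in the graph.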

Test corpus generation method and device and electronic equipment

The invention provides a test corpus generation method and device and electronic equipment. The method comprises the steps of obtaining historical test corpora on a service function point line; determining a grammatical structure and a vocabulary set corresponding to the service function points in the historical test corpus; judging whether a grammar structure white list corresponding to the service function point contains the grammar structure or not; if so, querying synonym sets in one-to-one correspondence with the vocabularies in the vocabulary set from a synonym library of the service function points; establishing a plurality of synonym phrases based on the synonym sets and the syntax structure; and respectively replacing the vocabulary set in the historical test corpus with each synonym phrase to obtain a plurality of test corpuses of the service function point. A plurality of test corpora are extended from an online real historical test corpus, so that the test corpus is greatly enriched.
Owner:BANK OF CHINA
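
The synonym-substitution step in the test corpus abstract above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the corpus is already tokenized, `vocabulary` and `synonyms` are hypothetical inputs, and the grammar-structure white list check is omitted.

```python
from itertools import product


def expand_corpus(tokens, vocabulary, synonyms):
    """Return new sentences with vocabulary words swapped for their synonyms.

    tokens:     the historical test sentence as a list of words
    vocabulary: the words in the sentence that may be replaced
    synonyms:   {word: [synonym, ...]} (hypothetical synonym library)
    """
    # One option list per vocabulary word: the word itself plus its synonyms.
    options = [[word] + synonyms.get(word, []) for word in vocabulary]
    variants = []
    for combo in product(*options):
        mapping = dict(zip(vocabulary, combo))
        variant = [mapping.get(token, token) for token in tokens]
        if variant != tokens:  # skip the unchanged original sentence
            variants.append(" ".join(variant))
    return variants


print(expand_corpus(
    ["check", "my", "balance"],
    ["check", "balance"],
    {"check": ["query"], "balance": ["funds"]},
))
```

Replacing whole tokens through a mapping, rather than chaining string replacements, avoids a synonym being replaced a second time by a later substitution.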

Method for inputting special words through voice for mobile phone

The invention provides a voice input method for cell phones, characterized in that: first, voice input is based on words rather than sentences. Second, a finite group of carefully selected common words is preset in the cell phone as the input limit; because only these pre-selected words are input by voice, the cell phone's recognition system can correctly recognize the user's voice and transform it into the corresponding text. Third, the selected word group is preset before the cell phone leaves the factory, so users do not need to create it, but they can add their own common words. The voice input method of the invention facilitates information input for users and improves the overall input speed.
Owner:飞图科技(北京)有限公司

Text keyword recognition method and device, computer equipment and readable storage medium

The invention relates to the technical field of intelligent decision making of artificial intelligence, and discloses a text keyword recognition method. The method comprises the steps of obtaining text information, and performing word segmentation on the text information to obtain a vocabulary set; calculating the word frequency of each vocabulary in the vocabulary set, splitting the vocabulary set to obtain a sub-vocabulary set and an association relationship among the vocabularies in the sub-vocabulary set, and obtaining a total vocabulary table with characteristic values according to the word frequency of each vocabulary in the sub-vocabulary set and the association relationship among the vocabularies; and arranging the vocabularies in the total vocabulary table according to the characteristic values, and setting the vocabularies of which the characteristic values exceed a preset characteristic threshold value as keywords. The invention also relates to a blockchain technology, and information can be stored in the blockchain node. The key degree of the vocabulary is evaluated from two dimensions of the word frequency of each vocabulary in the vocabulary set and the degree of dependence of any vocabulary in the vocabulary set by other vocabularies, so that the accuracy of obtaining the keyword capable of reflecting the core meaning of the text information is improved.
Owner:ONE CONNECT SMART TECH CO LTD SHENZHEN

Method and device for extracting data in medical information, equipment and storage medium

The embodiment of the invention provides a method and device for extracting data in medical information, equipment and a storage medium. The method comprises the steps of determining a positioning word of the medical information; generating an adjacent vocabulary set corresponding to the positioning word based on the position of the positioning word in the medical information, wherein the adjacent vocabulary set comprises vocabularies in the medical information, and the vocabularies and the positioning word are located within a set distance; extracting target data having a specified attachment relationship with the positioning word from the adjacent vocabulary set by using a data extraction rule pre-configured for the positioning word; and positioning and extracting target data by using the positioning word and the data extraction rule pre-configured by the positioning word so as to realize extraction of specified data in different medical data.
Owner:SHANGHAI TAIMEI DIGITAL TECH CO LTD
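
To make the positioning-word idea in the medical extraction abstract above concrete, here is a hedged Python sketch. It assumes the text is already tokenized, treats the "set distance" as a number of token positions, and models the pre-configured "data extraction rule" as a regular expression; all three are illustrative assumptions, not details from the abstract.

```python
import re


def adjacent_vocabulary(tokens, anchor, distance):
    """Collect the tokens within `distance` positions of each occurrence of `anchor`."""
    neighbours = []
    for i, token in enumerate(tokens):
        if token == anchor:
            start = max(0, i - distance)
            neighbours.extend(tokens[start:i] + tokens[i + 1:i + 1 + distance])
    return neighbours


def extract_target(tokens, anchor, distance, rule):
    """Return the first neighbouring token that fully matches the regex `rule`, else None."""
    for token in adjacent_vocabulary(tokens, anchor, distance):
        if re.fullmatch(rule, token):
            return token
    return None


record = "patient weight 72kg measured at admission height 180cm".split()
print(extract_target(record, "weight", 2, r"\d+kg"))
print(extract_target(record, "height", 1, r"\d+cm"))
```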

Cloud-based vocabulary learning system and method

A cloud-based vocabulary learning system includes a cloud database and a learning server. The cloud database stores multiple vocabulary sets associated with different levels, and is connected with the learning server. The learning server includes a processor and a memory. The processor executes instructions stored on the memory to receive a user level from a client device. One of the vocabulary sets is selected as a user vocabulary set according to the user level, and an electronic document is compared with the user vocabulary set to extract new words in the electronic document. The new words are provided to the client device for learning, and are added to the user vocabulary set after learning.
Owner:INVENTEC PUDONG TECH CORPORATION +1
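
The comparison between a document and the user vocabulary set described above amounts to a set difference. The Python sketch below shows one possible reading; the tokenizing regex and the function names are assumptions, and the cloud database and client device are not modeled.

```python
import re


def new_words(document, user_vocabulary):
    """Return words in `document` that are not in `user_vocabulary`, in first-seen order."""
    seen = set()
    result = []
    for word in re.findall(r"[a-z']+", document.lower()):
        if word not in user_vocabulary and word not in seen:
            seen.add(word)
            result.append(word)
    return result


def mark_learned(user_vocabulary, learned):
    """Return a new vocabulary set that also contains the learned words."""
    return set(user_vocabulary) | set(learned)


vocabulary = {"the", "cat", "sat", "on"}
fresh = new_words("The cat sat on the veranda. The veranda creaked.", vocabulary)
print(fresh)
vocabulary = mark_learned(vocabulary, fresh)
```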

Cheating video identification method and system

The embodiment of the invention discloses a cheating video identification method and system, and the method comprises the steps: obtaining the title information of a target video, and extracting the feature words in the title information; dividing the feature vocabularies into at least one feature vocabulary set according to the types of the feature vocabularies, wherein the types of the feature words in the same feature word set are the same; obtaining an identification threshold value associated with the current feature word set, and judging whether the current feature word set belongs to an abnormal word set or not based on the identification threshold value; and if the current feature vocabulary set belongs to an abnormal vocabulary set, judging that the target video is a cheating video. According to the technical scheme provided by the invention, the identification accuracy of the cheating video can be improved.
Owner:ALIBABA (CHINA) CO LTD
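
A minimal Python sketch of the grouping and threshold test in the cheating video abstract above follows. The abstract does not say which quantity is compared with the identification threshold; this sketch assumes it is the number of distinct feature words of one type, and the type names and threshold values are invented for illustration.

```python
from collections import defaultdict


def group_by_type(feature_words):
    """Group (word, type) pairs into {type: set of words}."""
    groups = defaultdict(set)
    for word, kind in feature_words:
        groups[kind].add(word)
    return dict(groups)


def is_cheating(feature_words, thresholds):
    """Flag the title if any typed word set reaches its threshold (assumed criterion)."""
    groups = group_by_type(feature_words)
    return any(
        len(words) >= thresholds[kind]
        for kind, words in groups.items()
        if kind in thresholds
    )


title_words = [("free", "bait"), ("win", "bait"), ("movie", "genre")]
print(is_cheating(title_words, {"bait": 2, "genre": 3}))
```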

Error correction dictionary creation method and device, terminal and computer storage medium

Pending | CN110738042A | Problems such as few solutions and limited construction efficiency | Improve build efficiency | Natural language data processing | Special data processing applications | Engineering | Parallel corpora
The embodiment of the invention provides an error correction dictionary construction method and device, a terminal and a computer storage medium, and the method comprises the steps: obtaining retrieval data and parallel corpora corresponding to the retrieval data; expanding the retrieval data and the parallel corpus to obtain a retrieval vocabulary set and a corpus vocabulary set; querying vocabulary pairs with a mapping relationship from the retrieval vocabulary set and the corpus vocabulary set; and constructing an error correction dictionary corresponding to the retrieval data according to the vocabulary pairs. By adopting the embodiment of the invention, the problems of fewer vocabulary pairs, lower construction efficiency and the like in an error correction dictionary in the prior art can be solved.
Owner:TENCENT MUSIC ENTERTAINMENT TECH SHENZHEN CO LTD

Broadcast receiving method, broadcast receiving system, recording medium, and program

A broadcast receiving system includes a broadcast receiving part for receiving a broadcast in which additional information that corresponds to an object appearing in broadcast contents and that contains keyword information for specifying the object is broadcasted simultaneously with the broadcast contents; a recognition vocabulary generating section for generating a recognition vocabulary set in a manner corresponding to the additional information by using a synonym dictionary; a speech recognition section for performing the speech recognition of a voice uttered by a viewing person, and for thereby specifying keyword information corresponding to a recognition vocabulary set when a word recognized as the speech recognition result is contained in the recognition vocabulary set; and a displaying section for displaying additional information corresponding to the specified keyword information.
Owner:SOVEREIGN PEAK VENTURES LLC

Disease data processing method, apparatus, electronic device and computer readable medium

The present disclosure relates to a disease data processing method, device, electronic equipment and computer readable medium, in the field of medical big data processing. The method includes: acquiring disease data, the disease data including at least one disease symptom label; performing word segmentation processing on the disease data to generate a vocabulary set; constructing a symptom set from the vocabulary set, the symptom set including at least one disease symptom label; and inputting the symptom set into a diagnosis model to obtain a disease classification identifier, the diagnosis model being an artificial neural network model. The disease data processing method, device, electronic equipment, and computer-readable medium involved in the present disclosure can improve the accuracy of disease prediction and make better auxiliary decisions for clinicians.
Owner:GOLDEN PANDA LTD

Image-text data expansion method and device and electronic equipment

The embodiment of the invention provides an image-text data expansion method and device and electronic equipment, and belongs to the technical field of image processing, and the method comprises the steps: carrying out the coding vectorization of vocabularies in a corpus, so as to obtain vocabulary codes corresponding to the vocabularies; clustering the vocabulary codes to obtain a plurality of word aggregation class sets; obtaining an image set corresponding to each word aggregation class set in the plurality of word aggregation class sets; rejecting unqualified vocabulary sets according to the image distribution condition in the image set to obtain qualified vocabulary sets; and combining any element in the qualified vocabulary set with any element in the image set to obtain expanded image-text data. By means of the processing scheme, the number of high-confidence image-text data is increased, the problem that weak label data related to images and texts is insufficient is solved, and the data collected through the scheme can be used for subsequent model training, data analysis, algorithm adjustment and other links.
Owner:BEIJING BYTEDANCE NETWORK TECH CO LTD

Construction method and device of user knowledge concept network and evaluation method of user knowledge

The invention discloses a construction method and device of a user knowledge concept network and an evaluation method of user knowledge. The construction method of the user knowledge concept network comprises the steps that firstly, each text contained in a text set containing m independent texts is preprocessed, and then each vocabulary of corpus serves as a concept subject term; all sentences and vocabularies are traversed, vocabularies appearing together with the concept subject terms in the same sentence are included into vocabulary sets corresponding to the concept subject terms, then vocabulary element screening is conducted on each vocabulary set, and a concept library is constructed; the field division is performed on concepts contained in the concept library by adopting a hierarchical clustering method; then, according to the matching condition of vocabularies contained in the user text data and a concept library, concepts contained in the user text data are obtained; and finally, a user knowledge concept network is constructed according to the concepts contained in the user text data and the divided concept fields. According to the method, the accuracy and objectivity of evaluation can be improved.
Owner:武汉渔见晚科技有限责任公司

Reply information processing method and system

The invention relates to the technical field of the Internet, in particular to a reply information processing method and system. The method comprises the steps of when a to-be-processed problem of a current user is acquired, screening similar users based on user attribute data of the current user to acquire a similar user set; preprocessing the to-be-processed problem to extract a corresponding effective vocabulary set, and screening a historical problem set based on the effective vocabulary set to obtain an alternative problem set; and determining the historical question with the highest relevancy with the to-be-processed question in the alternative question set, and configuring the corresponding reply information as alternative reply information. According to the method, similar questions are searched and alternative replies are provided by utilizing the characteristics of the questions, so that the accuracy of automatic reply is effectively improved, and the reply efficiency is further improved.
Owner:HANGZHOU BROADLINK ELECTRONICS TECH

Method, apparatus and device for data processing

The invention discloses a data processing method and device, which determine the vocabulary set corresponding to each permission in a service system from the result of segmenting the text content of the permission. Then, by classifying each word and determining the word frequency of each type of vocabulary in each permission, the attributes of the permission are determined. Finally, the permissions with determined attributes are used as training samples, and a classification model is trained so that the types of permissions in the service system can be determined according to the classification model.
Owner:ADVANCED NEW TECH CO LTD

Text information processing method and system

The invention provides a text information processing method and system, and the method comprises the steps: carrying out the word segmentation of a to-be-approved text, and obtaining a vocabulary set comprising a plurality of vocabularies; extracting features of each vocabulary in the vocabulary set to obtain a vocabulary feature set; inputting the vocabulary feature set into a preset classification model for vocabulary classification, and determining whether the to-be-approved text contains sensitive words or not; if the sensitive words are contained, outputting text information used for indicating that the to-be-approved text does not pass the approval; and if the sensitive words are not included, outputting text information used for indicating that the to-be-approved text passes approval. According to the scheme, vocabulary classification is carried out on the to-be-approved text by utilizing the pre-trained classification model, and whether the to-be-approved text contains the sensitive words or not is determined. Text information indicating whether the to-be-approved text passes approval is then output according to the determination result without manual approval, so that manpower and approval cost are saved, and approval speed and approval efficiency are improved.
Owner:BANK OF CHINA

System for predicting mood of user by using web content, and method therefor

A system for predicting an emotion of a user by using a web content includes a URL collection unit for collecting a URL of a web page; a representative URL selection unit for selecting a category-specific representative URL, a basic emotion-specific representative URL, and a dimensional emotion-specific representative URL according to contents included in a plurality of collected URLs; a representative vocabulary set creation unit for creating vocabulary sets representing a category, a basic emotion, and a dimensional emotion, respectively, on the basis of the selected representative URLs; a vocabulary extraction unit for crawling a plurality of texts; and a selection unit for comparing document similarities between the plurality of extracted vocabularies and the vocabulary sets.
Owner:SANGMYUNG UNIV IND ACAD COOP FOUND

Method and device for intelligent distribution of electronic files

The invention provides a method and device for intelligently distributing electronic files, and relates to the technical field of electronic file processing. The method includes: obtaining the content of each historical electronic file as a machine learning sample; performing word segmentation processing on each historical electronic file content through a natural language processing word segmentation method to obtain a vocabulary set; determining high-frequency words from the vocabulary set as machine learning features; using the information retrieval weighting algorithm on each historical electronic file content to calculate the information retrieval weighted value of the historical electronic file content on each machine learning feature; determining the electronic file type according to the receiver information of each historical electronic file; forming a learning matrix from each machine learning feature, each electronic file type and each information retrieval weighted value; according to the learning matrix, using a machine learning algorithm to perform fitting training to generate a distribution model; and obtaining the electronic file to be processed, and performing distribution processing through the distribution model.
Owner:BANK OF CHINA
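
The abstract above does not name its "information retrieval weighting algorithm". Assuming it is something like TF-IDF, which is a common choice but only a guess here, the weighted values for the learning matrix could be computed as in this Python sketch. The function name, the natural-log IDF variant and the toy documents are all illustrative.

```python
import math
from collections import Counter


def tfidf_matrix(documents, features):
    """Return one row per document with a TF-IDF weight for each feature word.

    documents: list of token lists (already word-segmented)
    features:  list of high-frequency words chosen as machine learning features
    """
    n_docs = len(documents)
    # Document frequency: in how many documents each feature appears.
    doc_freq = {f: sum(1 for doc in documents if f in doc) for f in features}
    rows = []
    for doc in documents:
        counts = Counter(doc)
        row = []
        for f in features:
            tf = counts[f] / len(doc) if doc else 0.0
            idf = math.log(n_docs / doc_freq[f]) if doc_freq[f] else 0.0
            row.append(tf * idf)
        rows.append(row)
    return rows


docs = [["loan", "loan", "rate"], ["rate", "card"]]
for row in tfidf_matrix(docs, ["loan", "rate"]):
    print(row)
```

Each row could then be paired with the file type derived from the receiver information to form one training example.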

Interactive story system using four-valued logic

A language learning system that teaches users a personalized vocabulary set through interactive stories. The interactive story is modeled by probabilistic rules in a semantic network with objects and relationships. Dialogue and narrative are dynamically generated based on the state of the model of the interactive story using phrase rewriting rules evaluated using a four-valued logic system in which the truth values of objects and relations are stored in parallel storage structures. The encoded truth values are true, false, defined and undefined.
Owner:罗杰密德茂尔

Dictionary-based word vector generation method and system

Pending | CN112163422A | Adequate vocabulary | Sufficient lexical relations | Semantic analysis | Paraphrase | Lexical set
The invention relates to a dictionary-based word vector generation method and system, and the method comprises the steps: enabling vocabularies contained in the dictionary to form a vocabulary set, carrying out the statistics of the occurrence frequency of each vocabulary in the vocabulary set in vocabulary paraphrases contained in the dictionary, carrying out the word segmentation of each vocabulary paraphrase according to the frequency, and obtaining a paraphrase vocabulary sequence; taking the vocabularies as nodes, connecting the nodes according to the corresponding relation between the vocabularies and the paraphrasing vocabulary sequences to form directed edges, and determining the weight of each directed edge to obtain a directed graph based on a dictionary; and calculating the directed graph based on a depth walk algorithm to obtain a word vector. According to the method, the vocabulary information provided by the dictionary is fused into the word vector, so that a high-quality data basis can be provided for word vector training, word meanings can be better mined, and a natural language processing task is supported.
Owner:WORKWAY SHENZHENINFORMATION TECH CO LTD
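
The directed graph and walk steps in the dictionary-based abstract above can be sketched in Python as below. This is an illustration under assumptions: the edge weight is taken to be how often a headword appears in another headword's paraphrase, the "depth walk algorithm" is read as DeepWalk-style weighted random walks, and the later step of training word vectors on the walks is omitted.

```python
import random
from collections import Counter


def build_graph(dictionary):
    """Build {headword: {paraphrase word: weight}} from {headword: [paraphrase tokens]}."""
    vocab = set(dictionary)
    graph = {}
    for head, paraphrase in dictionary.items():
        # Keep only paraphrase tokens that are themselves headwords.
        counts = Counter(t for t in paraphrase if t in vocab and t != head)
        graph[head] = dict(counts)
    return graph


def random_walk(graph, start, length, rng):
    """Follow weighted outgoing edges from `start` for at most `length` nodes."""
    walk = [start]
    while len(walk) < length:
        edges = graph.get(walk[-1], {})
        if not edges:
            break  # dead end: the current word has no outgoing edge
        nodes = list(edges)
        weights = [edges[n] for n in nodes]
        walk.append(rng.choices(nodes, weights=weights, k=1)[0])
    return walk


toy = {"cat": ["small", "animal"], "animal": ["living", "thing"], "thing": []}
graph = build_graph(toy)
print(random_walk(graph, "cat", 5, random.Random(0)))
```

The resulting walks play the role of sentences and could be fed to a skip-gram style trainer to obtain the word vectors.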

Method for screening security subset of security-critical software modeling language

The invention discloses a method for screening a security subset of a security-critical software modeling language. The method comprises the following steps: (1) establishing a complete vocabulary set of the modeling language; (2) removing unsafe elements from the complete vocabulary set, and establishing a preliminary screening version of a safe subset; (3) establishing a target domain programming element set; (4) carrying out adequacy analysis of the security subsets; and (5) carrying out safety verification of the safety subset. The method provided by the invention ensures the sufficiency, necessity and security of the security subset, so that the security subset not only can meet the requirements of security critical software development, but also avoids the problems caused by repeated element functions in the security subset, thereby generating a modeling language with extremely high security.
Owner:中国航发控制系统研究所

Element extraction method and device, electronic equipment and storage medium

The invention provides an element extraction method and device, electronic equipment and a storage medium. The method comprises the steps of obtaining a to-be-extracted text and a vocabulary set of the to-be-extracted text; based on a matching result between character strings corresponding to every two characters in the to-be-extracted text and the vocabulary set, the relevancy between every two characters is determined, and the character strings are obtained by being intercepted from the to-be-extracted text with the two corresponding characters as starting points and ending points; coding each character in the to-be-extracted text on the basis of the relevancy between every two characters to obtain an element boundary feature of each character; and determining an element extraction result of the to-be-extracted text based on the element boundary features of the characters. According to the element extraction method and device, the electronic equipment and the storage medium provided by the invention, the matched vocabularies and the original sentences do not need to be spliced, and the original input length is not changed, so that the coding efficiency is improved. In addition, compared with an existing vocabulary splicing method, the storage space is saved.
Owner:IFLYTEK (SUZHOU) TECH CO LTD
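
One simple reading of the character-pair relevancy in the element extraction abstract above is a binary matrix whose entry (i, j) records whether the substring from character i to character j is in the vocabulary set. The Python sketch below follows that reading; the binary scoring and the function name are assumptions, and the subsequent character encoding step is not shown.

```python
def relevancy_matrix(text, vocabulary):
    """Return an n x n matrix where entry [i][j] is 1 if text[i..j] is a vocabulary word."""
    n = len(text)
    matrix = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            # The substring starts at character i and ends at character j inclusive.
            if text[i:j + 1] in vocabulary:
                matrix[i][j] = 1
    return matrix


for row in relevancy_matrix("abcd", {"ab", "bcd"}):
    print(row)
```

Because the matrix has one row and column per input character, the input length is unchanged, which matches the abstract's point about not splicing matched words onto the sentence.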

Voice tag judgment method and system, storage medium and electronic equipment

The invention relates to the field of audio recognition, and in particular, relates to a voice tag judgment method and system, a storage medium and electronic equipment. The method comprises the steps: acquiring open source vocabularies to form an open source vocabulary set; performing word segmentation processing on the text in the related scene to obtain a word segmentation set; obtaining an audio file, and processing the audio file to obtain a high-frequency vocabulary set; obtaining a preset list, and processing the preset list to obtain a related vocabulary set; performing union processing on the open source vocabulary set, the word segmentation set, the high-frequency vocabulary set and the related vocabulary set, and obtaining a vocabulary list; and performing tag processing on the voice content according to the vocabulary list. The method is high in operability and suitable for the cold start stage; and the ASR recognition accuracy in the content risk control field and the downstream NLP classification task and tag effect can be effectively improved, and the method can be quickly applied to related fields.
Owner:北京数美时代科技有限公司

A guessing method of microblog user's location

The invention provides a microblog user location estimation method which includes a location feature word learning process and a microblog user location estimation process. The location feature word learning process includes acquiring microblog corpus of users who have filled in information about locations, extracting nouns in the corpus, extracting location feature words on the basis of a feature extraction method and calculating the corresponding weight. The microblog user location estimation process includes acquiring location information of microblog users to be estimated and their fans as input, extracting to obtain a user microblog geography word set and a fan geography word set, calculating the weight corresponding to the user microblog geography word representative location and the weight corresponding to the fan geography word representative location, performing weighting summation on the weights of the location words, and outputting the location words with the maximum weight as estimated locations of the users. Microblog user location estimation can be more targeted, and locations of microblog users can be estimated more accurately by the microblog user location estimation method.
Owner:HANGZHOU DIANZI UNIV
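
The weighted summation at the end of the location estimation process above can be sketched in Python as follows. The mixing coefficients 0.7 and 0.3, the function name and the sample weights are assumptions for illustration; the abstract does not give them.

```python
from collections import defaultdict


def estimate_location(user_words, fan_words, user_weight=0.7, fan_weight=0.3):
    """Combine per-location weights from the user's posts and the fans' posts.

    user_words / fan_words: {location: weight} (hypothetical input format)
    Returns the location with the highest combined weight, or None if both are empty.
    """
    scores = defaultdict(float)
    for location, weight in user_words.items():
        scores[location] += user_weight * weight
    for location, weight in fan_words.items():
        scores[location] += fan_weight * weight
    return max(scores, key=scores.get) if scores else None


print(estimate_location({"Hangzhou": 0.6, "Beijing": 0.4}, {"Beijing": 0.9}))
```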

Title information optimization method, device, equipment and system

A title information optimization method, apparatus, device, and system, wherein said method comprises: determining a target service object set to which a service object to be optimized belongs (S201); acquiring a popular word set corresponding to said target service object set, and user features corresponding to the service object to be optimized (S202); determining a target popular word within said popular word set according to said user features (S203); optimizing title information of said service object to be optimized according to said target popular word (S204). The method is used to improve the accuracy of recommending a service object to a user.
Owner:ALIBABA GRP HLDG LTD

Multilingual speech recognition and theme semantic analysis method and device

The invention relates to a multilingual speech recognition and theme semantic analysis method, which comprises the following steps: acquiring a pinyin character string corresponding to a speech input signal according to a speech comparison table, judging that the pinyin character string corresponds to a plurality of original words according to multilingual word collection, and forming a statement according to the multi-word collection and the plurality of original words through a speech recognizer; executing the following steps through a semantic analyzer: selectively executing a correction process, executing an analysis state judgment process or outputting the statement according to the statement and subject vocabulary semantic relationship data set; outputting the corrected statement when the correction process is judged to be successful, and when the correction process is judged to be failed, executing an analysis state judgment process to selectively output a judgment result.
Owner:卢文祥