36 results about "Function word" patented technology

In linguistics, function words (also called functors) are words that have little lexical meaning or have ambiguous meaning and express grammatical relationships among other words within a sentence, or specify the attitude or mood of the speaker. They signal the structural relationships that words have to one another and are the glue that holds sentences together. Thus they form important elements in the structures of sentences.

Clustering hypertext with applications to WEB searching

A method and structure for searching a database of documents. A query is run against the database to produce query result documents. A word dictionary is constructed from the words in the query result documents, function words are pruned from it, and first vectors are formed for the words that remain. An out-link dictionary is constructed from the documents in the database that the query result documents point to; the query result documents are added to it, documents pointed to by fewer than a first predetermined number of query result documents are pruned, and second vectors are formed for the documents that remain. An in-link dictionary is constructed from the documents in the database that point to the query result documents; the query result documents are added to it, documents that point to fewer than a second predetermined number of query result documents are pruned, and third vectors are formed for the documents that remain. The first, second, and third vectors are normalized to create vector triplets for the documents remaining in the in-link and out-link dictionaries. The vector triplets are clustered using the toric k-means process, and the resulting clusters are annotated and summarized using nuggets of information: summary, breakthrough, review, keyword, citation, and reference.
Owner:INT BUSINESS MASCH CORP
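
The dictionary-pruning and normalization steps of this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the function-word list, the function names, and the choice of one unit-length count vector per document are all assumptions made for illustration, and the clustering step itself is left out.

```python
import math
from collections import Counter

# Hypothetical, deliberately tiny function-word list (assumption, not from the patent).
FUNCTION_WORDS = {"the", "a", "an", "of", "to", "and", "in", "is", "for", "on"}

def build_dictionary(docs):
    """Collect the distinct words of all documents, then prune function words."""
    vocab = set()
    for doc in docs:
        vocab.update(w for w in doc.lower().split() if w not in FUNCTION_WORDS)
    return sorted(vocab)

def unit_vector(doc, vocab):
    """Count vector over the pruned dictionary, scaled to unit Euclidean length."""
    counts = Counter(doc.lower().split())
    vec = [float(counts[w]) for w in vocab]
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec

def prune_links(links_per_doc, min_count):
    """Keep link targets pointed to by at least min_count of the result documents."""
    counts = Counter(t for targets in links_per_doc for t in set(targets))
    return sorted(t for t, c in counts.items() if c >= min_count)

docs = ["the clustering of hypertext", "a search of the web", "clustering web documents"]
vocab = build_dictionary(docs)
vectors = [unit_vector(d, vocab) for d in docs]
kept_links = prune_links([["x", "y"], ["x"], ["y", "z"]], min_count=2)
```

The same thresholding idea applies to the in-link dictionary, with the link direction reversed.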

System for automation of business knowledge in natural language using rete algorithm

The present invention is directed to a system for managing business knowledge expressed as statements, preferably sentences using a vocabulary, where such statements may be automated by the generation of programming language source code or computer program instructions. As such, the present invention also manages software design specifications that define, describe, or constrain the programming code it generates or programs with which it or the code it generates is to integrate. All information managed within the present invention is maintained within a relational database that is encapsulated within an object-oriented model. Each object in this model is subject to version control and administration using permissions. Each user of the system is an object and belongs to one or more groups. Users and groups may be granted privileges. Objects may be created, examined, used, modified, deleted, or otherwise operated upon only if corresponding permission or privilege has been granted. The vocabulary managed by the present invention consists of the function words commonly used in a language, such as the auxiliary verbs, prepositions, articles, conjunctions, and other essentially closed parts of speech in English, as well as open parts of speech, such as nouns, verbs, adjectives, and adverbs.
Owner:ORACLE INT CORP +1

Chinese author identification method based on double-layer classification model, and device for realizing Chinese author identification method

The invention relates to a Chinese author identification method based on a double-layer classification model and a device for realizing the method, belonging to the field of information security. To address the low identification accuracy that results when there are many candidate authors, an author grouping layer is added to the author identification model: each author is represented as an author vector, and the authors are grouped by a clustering algorithm. The second layer is an author identification layer, which extracts dependency relations, function words, punctuation marks and part-of-speech tags as features and carries out author identification within the group. The method and device thereby reduce the loss of accuracy caused by a large number of authors. In addition, a proposed feature dimensionality reduction and optimization method based on principal component analysis addresses the loss of accuracy caused by noise in high-dimensional feature vectors. The method can be applied to authorship attribution in literary research and to information security tasks such as copyright protection.
Owner:HUNAN UNIV
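
As a rough illustration of how function words and punctuation marks can serve as stylistic features, the sketch below turns a pre-segmented token list into relative frequencies. The word lists, the function name, and the sample sentence are assumptions for illustration; the dependency and part-of-speech features, the grouping layer, and the dimensionality reduction described in the abstract are not shown.

```python
from collections import Counter

# Hypothetical feature inventories (assumptions, not taken from the patent).
FUNCTION_WORDS = ["的", "了", "和", "在", "也"]
PUNCTUATION = ["，", "。", "！", "？"]
FEATURES = FUNCTION_WORDS + PUNCTUATION

def style_features(tokens):
    """Relative frequency of each function word and punctuation mark in the text."""
    total = len(tokens)
    if total == 0:
        return [0.0] * len(FEATURES)
    counts = Counter(tokens)
    return [counts[f] / total for f in FEATURES]

# The text is assumed to be word-segmented already.
tokens = ["我", "的", "书", "在", "桌子", "上", "。"]
features = style_features(tokens)
```

Each text yields a vector of the same length, so the vectors can be fed to any standard classifier or clustering routine.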

Document classifying method based on network measure index

The invention relates to a document classifying method based on network measure indices. The method comprises a sample training phase and a document classifying phase. The sample training phase comprises twelve steps: (1) collecting samples, (2) segmenting the text, (3) analyzing word classes, (4) removing function words and names, (5) counting word frequencies, (6) establishing the feature set Vd, (7) establishing the vertices of the feature network, (8) establishing the edges of the feature network, (9) calculating the average degree, (10) calculating the clustering coefficient, (11) calculating the characteristic path length, and (12) obtaining the interval of each network measure index. The document classifying phase comprises two steps: (1) processing the document to be classified and (2) judging its class. According to the abstract, the method classifies accurately and efficiently, solves the problem that existing classification methods cannot distinguish scientific and technical literature, novels and prose, and lays a methodological and theoretical foundation for distinguishing these three kinds of text automatically.
Owner:INFORMATION RES INST OF SHANDONG ACAD OF SCI
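
The three network measures named in the training phase (average degree, clustering coefficient, characteristic path length) can be computed as in the following sketch. How the patent derives edges from the feature set is not stated in the abstract, so the edge list here is a hypothetical input and the function names are illustrative.

```python
from collections import deque
from itertools import combinations

def build_graph(edges):
    """Undirected graph stored as a dict mapping each vertex to its neighbour set."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph

def average_degree(graph):
    return sum(len(nbrs) for nbrs in graph.values()) / len(graph)

def clustering_coefficient(graph):
    """Mean over vertices of the fraction of neighbour pairs that are linked."""
    total = 0.0
    for nbrs in graph.values():
        k = len(nbrs)
        if k < 2:
            continue  # vertices with fewer than two neighbours contribute 0
        links = sum(1 for a, b in combinations(nbrs, 2) if b in graph[a])
        total += 2 * links / (k * (k - 1))
    return total / len(graph)

def characteristic_path_length(graph):
    """Mean shortest-path length over all connected ordered vertex pairs (BFS)."""
    total, pairs = 0, 0
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in graph[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0

# Hypothetical feature-word network.
graph = build_graph([("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")])
metrics = (average_degree(graph), clustering_coefficient(graph),
           characteristic_path_length(graph))
```

A classifier in the spirit of the abstract would then check whether each measure of a new document falls inside the interval learned for a class.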

Man-machine interaction intention analysis method and device, computer equipment and storage medium

The embodiment of the invention discloses a man-machine interaction intention analysis method and device, computer equipment and a storage medium. The method comprises: picking up semantic interaction voice and converting it into a semantic text; performing syntactic dependency analysis to obtain an analysis result; judging whether punctuation marks exist in the analysis result; if a first punctuation mark exists, cutting the analysis result at the position of the first punctuation mark to obtain two clauses; determining a core relationship and a relationship between the advertent of the clause where the core relationship is located and the head word; determining whether the semantic text contains effective information; if not, judging whether the semantic text ends with a function word; if yes, deleting the function words at the end of the semantic text; if not, retrieving a core relationship; and judging whether the semantic text contains effective information by combining the subscript length of the core relationship. By implementing the method provided by the embodiment of the invention, the problem that current semantic services cannot accurately judge the real intention of the user is addressed, so that the semantic service understands the user more accurately.
Owner:深圳科卫机器人科技有限公司
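
Two of the simpler steps in this abstract, cutting at the first punctuation mark and deleting function words (虚词) at the end of the text, can be sketched as follows. The particle list, the punctuation set, and both function names are assumptions for illustration, and the sketch works on raw text and token lists rather than on a dependency analysis result.

```python
# Hypothetical sentence-final particles treated as function words (assumption).
TRAILING_PARTICLES = {"吗", "呢", "吧", "啊", "了"}
PUNCTUATION_MARKS = "，。！？,.!?"

def split_at_first_punctuation(text, marks=PUNCTUATION_MARKS):
    """Cut the text at its first punctuation mark, returning two clauses."""
    for i, ch in enumerate(text):
        if ch in marks:
            return text[:i], text[i + 1:]
    return text, ""

def strip_trailing_function_words(tokens):
    """Drop every function word found at the end of a segmented text."""
    end = len(tokens)
    while end > 0 and tokens[end - 1] in TRAILING_PARTICLES:
        end -= 1
    return tokens[:end]

clauses = split_at_first_punctuation("打开灯，好吗")
core = strip_trailing_function_words(["打开", "灯", "吧", "啊"])
```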

Method and device for enhancing grammar error correction data based on real error mode

The invention discloses a method and a device for augmenting grammatical error correction data based on real error patterns. The method comprises the following steps: acquiring a sentence to which noise is to be added and a set of noise-adding strategies; determining the noise-adding probability of each word in the sentence; randomly selecting a strategy from the set according to that probability and applying it to the word to be noised; and constructing parallel sentence pairs from the erroneous sentences produced by noise addition and the correct sentences before noise addition. The strategy set comprises a replacement strategy based on real error patterns, a synonym replacement strategy, a function word replacement strategy, a similar-spelling replacement strategy and an inflection replacement strategy. According to the embodiment of the invention, introducing real errors and simulating various kinds of real errors makes it possible to generate high-quality artificial error data that is closer to the errors learners actually make. The various noise schemes can produce many types of grammatical errors, and the method and device can be widely applied in the technical field of data processing.
Owner:GUANGDONG UNIVERSITY OF FOREIGN STUDIES
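
A minimal sketch of the noise-adding loop is given below, with two toy strategies standing in for the five listed in the abstract. Everything here is an assumption for illustration: the function-word list, the letter-swap stand-in for similar-spelling replacement, and the single uniform probability used for every word (the abstract determines a probability per word but does not say how).

```python
import random

# Hypothetical function-word inventory (assumption, not from the patent).
FUNCTION_WORDS = ["in", "on", "at", "of", "to", "for"]

def replace_function_word(word, rng):
    """Swap a function word for a different one; leave other words unchanged."""
    if word not in FUNCTION_WORDS:
        return word
    return rng.choice([w for w in FUNCTION_WORDS if w != word])

def swap_adjacent_letters(word, rng):
    """Crude stand-in for a similar-spelling error: transpose two neighbouring letters."""
    if len(word) < 2:
        return word
    i = rng.randrange(len(word) - 1)
    return word[:i] + word[i + 1] + word[i] + word[i + 2:]

STRATEGIES = [replace_function_word, swap_adjacent_letters]

def add_noise(sentence, prob, rng, strategies=STRATEGIES):
    """With probability prob per word, apply one randomly chosen strategy."""
    noisy = []
    for word in sentence.split():
        if rng.random() < prob:
            word = rng.choice(strategies)(word, rng)
        noisy.append(word)
    return " ".join(noisy)

def make_pair(sentence, prob, seed=0):
    """Return an (erroneous, correct) parallel sentence pair."""
    rng = random.Random(seed)
    return add_noise(sentence, prob, rng), sentence

pair = make_pair("she is good at math", prob=0.3)
```

Passing a seeded `random.Random` instance keeps the generated pairs reproducible, which helps when comparing augmentation settings.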