122 results about "Document level" patented technology

Document Level. Also sometimes referred to as Code Behind. Consists of a single assembly associated with a single workbook, document or template. The code is inside an assembly that is then linked to the particular office file.

Method and system for semantic search and retrieval of electronic documents

A system and method for semantic search of electronic documents stored on a computer-readable medium, and for providing a search result in response to a query. The system includes a corpus including a plurality of electronic documents that are domain tagged at a document level and analyzed based on the tags to identify word usage patterns. An index of word usage patterns is provided that indexes the plurality of documents in the corpus according to their word usage patterns. The system also includes a query pre-processing module that receives a query from a user, and analyzes the query to determine probable word usage patterns in the query. The system further includes a processor that uses the index to identify documents having word usage patterns that match the probable word usage patterns in the query as candidate electronic documents, and retrieves the candidate electronic documents.
Owner:TEXTDIGGER

Web based data collaboration tool

A web based data collaboration tool includes a dynamic international collaborative environment in which system partners, including customers, technology partners and suppliers, can exchange information between one another in a truly collaborative environment. The web based data collaboration tool includes unique “fine grain” security at both the document and sub-document level. This allows one source document to be shared between the system partners, including partners from different companies and those located in different countries, based upon an individual document / sub-document security profile. Further, the web based data collaboration tool includes a secure “Sandbox” for peer-to-peer sharing of sensitive documents and electronically incorporates a business area export representative (BAER) approval process that includes the required retention of International Traffic in Arms Regulations (ITAR) documents making the web based data collaboration tool fully ITAR compliant.
Owner:UNITED TECH CORP

Comment text emotion classification model training and emotion classification method and device and equipment

Active | CN108363753A | Achieving Context Semantic Robust Awareness | Realize semantic expression | Semantic analysis | Special data processing applications | Classification methods | Network model
The invention discloses a comment text emotion classification model training and emotion classification method and device and equipment and belongs to the field of text emotion classification in natural language processing. Model training comprises the steps that a comment text and associated subject and object information are acquired; a comment subject and object attention mechanism is fused based on a first-layer Bi-LSTM network to extract sentence-level feature representation; the comment subject and object attention mechanism is fused based on a second-layer Bi-LSTM network to extract document-level feature representation; and a hyperbolic tangent non-linear mapping function is adopted to map document-level features to an emotion category space, softmax classification is adopted to train parameters in a model, and an optimal text emotion classification model is obtained. According to the method, the hierarchical bidirectional Bi-LSTM network model and the attention mechanism are adopted, context semantic robust perception and semantic expression of the text can be realized, the robustness of text emotion classification can be remarkably improved, and the correct rate of classification is increased.
Owner:NANJING UNIV OF POSTS & TELECOMM

Data document generator

A data management system for generating customized versions of data documents. Initially the document is stored in the form of raw data, which is subsequently parsed into an internal representation of the document. In one embodiment, raw data is stored in XML form and is parsed by an XML parser. Upon the initial request for a customized version of the document, a sequence of transforms is applied to the internal representation and to subsequently transformed documents in order to create hierarchical, customized document levels. In one embodiment, transforms are implemented as XSL stylesheets, although Java classes may also be employed. The document versions are written to cache, and subsequent requests for existing versions of the document are referred to cache. In the event that any document dependencies change, a cached version will be denoted invalid, and subsequent requests will result in the re-generation of a customized version. The data management system is implemented in the form of a document manager and a database that includes a document table and a transform table. The document manager reads raw documents from a raw-document database and reads transforms from a transform database. Requested customized documents are written to cache.
Owner:VERSATA SOFTWARE INC
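
The transform-and-cache flow described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the `DocumentManager` class, its method names, and the use of plain dicts and Python callables in place of parsed XML and XSL stylesheets are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for an XSL stylesheet or Java class: a function on a document.
Transform = Callable[[dict], dict]


class DocumentManager:
    """Illustrative manager that applies a transform chain and caches each version."""

    def __init__(self, raw_documents: Dict[str, dict], transforms: Dict[str, Transform]):
        self.raw_documents = raw_documents
        self.transforms = transforms
        self.cache: Dict[Tuple[str, Tuple[str, ...]], dict] = {}
        self.generations = 0  # how many times a version was actually (re)built

    def get(self, doc_id: str, chain: List[str]) -> dict:
        key = (doc_id, tuple(chain))
        if key in self.cache:            # later requests are served from cache
            return self.cache[key]
        doc = dict(self.raw_documents[doc_id])
        for name in chain:               # apply the transforms in sequence
            doc = self.transforms[name](doc)
        self.cache[key] = doc
        self.generations += 1
        return doc

    def invalidate(self, doc_id: str) -> None:
        """Drop every cached version of a document whose dependencies changed."""
        for key in [k for k in self.cache if k[0] == doc_id]:
            del self.cache[key]
```

A second request for the same chain returns the cached version; after `invalidate`, the next request regenerates it.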

Secure search of private documents in an enterprise content management system

An enterprise content management system such as an electronic contract system manages a large number of secure documents for many organizations. The search of these private documents for different organizational users with role-based access control is a challenging task. A content-based extensible mark-up language (XML)-annotated secure-index search mechanism is provided that provides an effective search and retrieval of private documents with document-level security. The search mechanism includes a document analysis framework for text analysis and annotation, a search indexer to build and incorporate document access control information directly into a search index, an XML-based search engine, and a compound query generation technique to join user role and organization information into search query. By incorporating document access information directly into the search index and combining user information in the search query, search and retrieval of private contract documents can be achieved very effectively and securely with high performance.
Owner:IBM CORP
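
One way to picture access control stored directly in the search index is to index each permitted (organization, role) pair as an extra term and intersect it with the content term at query time. The sketch below assumes that simplified design; the `SecureIndex` class and the `__acl__` term prefix are hypothetical and are not taken from the patent.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


class SecureIndex:
    """Toy inverted index whose postings also carry access-control terms."""

    def __init__(self) -> None:
        self.postings: Dict[str, Set[str]] = defaultdict(set)

    def add(self, doc_id: str, text: str, allowed: Iterable[Tuple[str, str]]) -> None:
        for term in text.lower().split():
            self.postings[term].add(doc_id)
        # Access information is indexed alongside the content terms.
        for organization, role in allowed:
            self.postings[f"__acl__:{organization}:{role}"].add(doc_id)

    def search(self, term: str, organization: str, role: str) -> List[str]:
        # Compound query: content term AND the caller's organization/role term.
        content_hits = self.postings.get(term.lower(), set())
        allowed_hits = self.postings.get(f"__acl__:{organization}:{role}", set())
        return sorted(content_hits & allowed_hits)
```

Because the filter is part of the query itself, documents the caller may not see never enter the result set.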

Document processing apparatus and a method for controlling a document processing apparatus

The present invention enables page-level or document-level print setup of an XPS document via a user interface. The print ticket of a page of interest is obtained by merging a job-level print ticket 1804, to which the page of interest belongs, a document-level print ticket 1805, to which the page of interest belongs, and a page-level print ticket 1806 of the page of interest. The obtained individual page print tickets are compared with the job-level print ticket and if there are differences, it is determined that this particular page has exception settings and the exception settings are saved and displayed.
Owner:CANON KK
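
The merge-and-compare step in this abstract can be illustrated with plain dictionaries. The sketch assumes that a more specific level overrides a less specific one (page over document over job); the abstract only says the tickets are merged, so that precedence, along with the function and setting names, is an assumption.

```python
from typing import Dict

Ticket = Dict[str, str]


def effective_page_ticket(job: Ticket, document: Ticket, page: Ticket) -> Ticket:
    """Merge the three levels; later (more specific) levels win on conflicts."""
    merged = dict(job)
    merged.update(document)
    merged.update(page)
    return merged


def exception_settings(job: Ticket, effective: Ticket) -> Ticket:
    """Return the settings of a page that differ from the job-level ticket."""
    return {key: value for key, value in effective.items() if job.get(key) != value}
```

A page whose `exception_settings` result is non-empty would be the one flagged as having exception settings.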

Composition scoring method based on attention mechanism

The invention provides a composition scoring method based on an attention mechanism; the method comprises: using a neural network attention frame of word-sentence-document trilayer structure in a composition scoring system, using artificially extracted features to perform fusing in sentence and document levels of the frame, and setting attention weights of the sentence and document levels. The influences of factors, such as local features and global features of a language, completeness of sentences, accuracy of word usage, diversity of expressions, coherency of sentences and digressing or zero digressing from the subject, upon a scoring task are comprehensively considered herein, and composition scoring effect is maximally improved.
Owner:RENMIN UNIVERSITY OF CHINA +2

Knowledge discovery from citation networks

In a corpus of scientific articles such as a digital library, documents are connected by citations and one document plays two different roles in the corpus: document itself and a citation of other documents. A Bernoulli Process Topic (BPT) model is provided which models the corpus at two levels: document level and citation level. In the BPT model, each document has two different representations in the latent topic space associated with its roles. Moreover, the multi-level hierarchical structure of the citation network is captured by a generative process involving a Bernoulli process. The distribution parameters of the BPT model are estimated by a variational approximation approach.
Owner:THE RES FOUND OF STATE UNIV OF NEW YORK

Event trigger word extraction method based on document level attention mechanism

The invention relates to an event trigger word extraction method, in particular to an event trigger word extraction method based on a document level attention mechanism, comprising the following steps: (1) preprocessing training corpus; (2) performing word vector training by using PubMed database corpus; (3) constructing a distributed representation way of a sample; (4) constructing a characteristic representation way based on BiLSTM-Attention; (5) adopting CRF learning, and acquiring an optimal sequence labeling result of the current document sequence; and (6) extracting event trigger words. The method provided by the invention has the advantages that firstly a BIO tag labeling way is adopted, and recognition including multi-word trigger word recognition is realized; secondly a corresponding simple word and characteristic distributed representation way is constructed for a trigger word recognition task; and thirdly, a BiLSTM-Attention model is proposed, a distributed representation structure specific to the currently input document level information is realized by introducing an Attention mechanism, and trigger word recognition effect is improved.
Owner:DALIAN UNIV OF TECH

Document level indexes for efficient processing in multiple tiers of a computer system

To improve the performance of XML operations performed on an XML document by a client tier, the client generates an index that indexes the nodes of the XML document. The index may be generated, for example, by and during parsing of the XML document. The index contains similar structures to those maintained by a database server to perform XML operations on collections of XML documents. In lieu of parsing the XML document to generate an index, the client may generate indexes based on data retrieved from the indexes at the database server.
Owner:ORACLE INT CORP
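
As a rough illustration of building a node index while parsing, the sketch below streams through an XML document once and records the text of each node under its path. The path-keyed dictionary is an assumed, simplified index shape; the abstract does not specify the actual index structures.

```python
import io
import xml.etree.ElementTree as ET
from collections import defaultdict
from typing import Dict, List


def build_path_index(xml_text: str) -> Dict[str, List[str]]:
    """Index node text by element path, populated during a single parse pass."""
    index: Dict[str, List[str]] = defaultdict(list)
    stack: List[str] = []
    source = io.BytesIO(xml_text.encode("utf-8"))
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            stack.append(elem.tag)       # entering a node: extend the current path
        else:
            path = "/" + "/".join(stack)
            index[path].append((elem.text or "").strip())
            stack.pop()                  # leaving the node
    return dict(index)
```

Later lookups such as `index["/order/item"]` can then be answered without walking the document again.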

Attention dual-layer LSTM-based long text emotional tendency analysis method

Inactive | CN108446275A | Improve the accuracy of sentiment classification | Avoiding the pitfalls of RNNs | Semantic analysis | Character and pattern recognition | Semantics | Documentation
The invention relates to an attention dual-layer LSTM-based long text emotional tendency analysis method, belongs to the field of natural language processing and machine learning, and mainly aims to solve the problem of difficulty in accurately judging an emotional tendency of a full text due to long comment length of the long text, discrete distribution of positive and negative emotional features and different emotional semantic contribution degrees of sentences. The method comprises the steps of firstly learning sentence-level emotional vector representation by utilizing LSTM; secondly coding semantic relationships between emotional semantics of all the sentences in a document and the sentences by adopting bidirectional LSTM, and based on an attention mechanism, performing weight allocation on the sentences with different emotional semantic contribution degrees; and finally, weighting the sentence-level emotional vector representation to obtain document-level emotional vector representation of the long text, and through a Softmax layer, obtaining the emotional tendency of the long text. An experiment is performed in Yelp2015 and IMDb film comment corpora; and a result shows that a relatively good classification effect can be achieved, so that the emotional classification correctness is further improved.
Owner:BEIJING INSTITUTE OF TECHNOLOGY
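
The final weighting step, turning sentence-level vectors into one document-level vector, can be sketched in plain Python. The attention scores are passed in as given numbers here; in the method described they would come from the bidirectional LSTM and attention layers, which this example does not model. Function names are illustrative.

```python
import math
from typing import List


def softmax(scores: List[float]) -> List[float]:
    """Normalize raw attention scores into weights that sum to 1."""
    largest = max(scores)
    exps = [math.exp(score - largest) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


def document_vector(sentence_vectors: List[List[float]],
                    attention_scores: List[float]) -> List[float]:
    """Attention-weighted sum of sentence vectors (all of equal dimension)."""
    weights = softmax(attention_scores)
    dimension = len(sentence_vectors[0])
    return [
        sum(weight * vector[i] for weight, vector in zip(weights, sentence_vectors))
        for i in range(dimension)
    ]
```

With equal scores the result is simply the mean of the sentence vectors; a higher score pulls the document vector toward that sentence.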

Data caching system and data caching method

The invention provides a data caching system and a data caching method, wherein the data caching system comprises a cache dividing unit and a storage unit; the cache dividing unit is used for querying the type of the data to be stored, setting a corresponding logo for each type of data and respectively dividing a corresponding caching area for each type of data; the storage unit is used for storing the data in each caching area into the memory of an operation system and storing the metadata, logo and / or the attribution information, which correspond to the data respectively in each caching area, into the memory of a virtual machine. By the technical scheme disclosed by the invention, a KEY set and the actual data can be stored in a separating mode under the condition of performing a large-memory caching, the version control on caching data can be realized, audio monitors with a document level can be automatically updated, and the monitoring strength can be improved.
Owner:YONYOU NETWORK TECH

Artificial intelligence-based multi-label classification method and system of multi-level text

The invention relates to an artificial intelligence-based multi-label classification method and system of multi-level text. The method includes: 1) utilizing a neural network to construct a multi-label classification model of the multi-level text, and obtaining text class prediction results of training text according to the model; 2) carrying out learning on parameters of the multi-label classification model of the multi-level text according to existing text class labeling information in the training text and the text class prediction results, which are of the training text and are obtained in step 1), to obtain a multi-label classification model of the multi-level text with determined parameters; and 3) utilizing the multi-label classification model of the multi-level text with the determined parameters to classify to-be-classified text. The method infers labels of the formed text simply through the document-level labeling information, and can be well applied to scenes where labels of formed text are difficult to collect; compared with traditional multi-instance learning (MIL) methods, the method of the invention introduces minimal assumptions, and can better fit actual data; and the method of the invention has good scalability.
Owner:INST OF INFORMATION ENG CHINESE ACAD OF SCI

Document-level sentiment analysis method based on specific domain sentiment words

Active | CN108804417A | Make up for the lack of domain specific words | Versatility | Semantic analysis | Character and pattern recognition | Data set | Algorithm
The invention provides a document-level sentiment analysis method based on specific domain sentiment words. The method is implemented by the following steps of collecting a document data set, training a set of prototype words by using a Skip-gram word vector model to obtain a word vector corresponding to each prototype word, recombining the word vectors by utilizing an attention mechanism, and capturing a relation between non-continuous words in the word vectors; synthesizing the words and sentences by using an asymmetric convolutional neural network and a bidirectional gate recurrent neural network based on the attention mechanism respectively, thereby forming document vector characteristics; generating sentiment eigenvectors by utilizing a domain sentiment dictionary of the Skip-gram word vector model; and finally, combining the document vector characteristics and the sentiment eigenvectors by utilizing a linear combination layer to form document characteristics beneficial to document classification. The sentiment analysis is widely applied to the product analysis, the commodity recommendation, the stock price trend prediction and the like; and the method provided by the invention can accurately and efficiently carry out sentiment analysis on documents, and has great commercial values.
Owner:SHANDONG UNIV OF SCI & TECH

Methods and systems for providing automated actions on recognized text strings in a computer-generated document

Methods and systems provide for automatically performing actions on or in association with text or data strings that are recognized as belonging to certain semantic categories. Text entered by a user is passed to a recognizer application. If a given text or data string is recognized as belonging to a given semantic category, the recognizer application passes data corresponding to the recognized string back to a host application. In response to recognized text or data, a pointer to the object model of the host application may be passed to the recognizer application to allow the recognizer application to perform any function of the host application in response to the recognized string. Alternatively, after the recognizer application passes data corresponding to the recognized string back to the host application, the host application may fire an application level or document level event for causing an action component to perform desired actions on recognized strings. Alternatively, after a string is recognized by the recognizer application, the recognizer application may set a property associated with a desired action to be performed on or in association with the recognized string. The host application may call an action component identified by the property for automatically performing the desired action on or in association with the recognized string.
Owner:MICROSOFT TECH LICENSING LLC

System and method for multithreaded text indexing for next generation multi-core architectures

A system and method for indexing documents in a data storage system includes generating a single document hash table in storage memory for a single document using an index construction in a multithreaded and scalable configuration wherein multiple threads are each assigned work to reduce synchronization between threads. The single document hash table includes partitioning the single document and indexing strings of partitioned portions of the single document to create a minor hash table for each document sub-part; generating a document level hash table from the minor hash tables; updating a stream level hash table for the strings which maps every string to a global identifier; and generating a term reordered array from the document level hash table.
Owner:IBM CORP
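
The partition, merge, and global-identifier steps in this abstract can be approximated as below. This is a simplified sketch under several assumptions: whitespace tokenization, `Counter` objects standing in for the hash tables, a thread pool in place of the patent's thread scheduling, and a "term reordered array" interpreted as (global id, count) pairs sorted by id.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple


def partition(tokens: List[str], parts: int) -> List[List[str]]:
    """Split the token list into at most `parts` contiguous sub-parts."""
    size = max(1, -(-len(tokens) // parts))  # ceiling division
    return [tokens[i:i + size] for i in range(0, len(tokens), size)]


def index_document(text: str, stream_table: Dict[str, int],
                   parts: int = 4) -> List[Tuple[int, int]]:
    tokens = text.lower().split()
    # Each worker builds a minor table for its own sub-part, with no shared state.
    with ThreadPoolExecutor(max_workers=parts) as pool:
        minor_tables = list(pool.map(Counter, partition(tokens, parts)))
    # Merge the minor tables into one document-level table.
    document_table: Counter = Counter()
    for table in minor_tables:
        document_table.update(table)
    # The stream-level table maps every string to a global identifier.
    for term in document_table:
        stream_table.setdefault(term, len(stream_table))
    return sorted((stream_table[term], count) for term, count in document_table.items())
```

Keeping the minor tables private to each worker is what limits synchronization to the single merge step.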

Cryptograph index structure based on blocking organization and management method thereof

Inactive | CN101655858A | Efficient security structure | Efficient security maintenance mechanism | Digital data protection | Special data processing applications | Internal memory | Timestamp
The invention discloses a cryptograph index structure based on a blocking organization and a management method thereof. Aiming at a blocking cryptograph index structure, an index establishing mode based on combination is firstly adopted to establish a plain text index when the index is established, and then the plain text index is blocked and encrypted in a unitive way. A maintenance mechanism based on a cryptograph index is divided into the addition, the deletion and the modification of a document in the index. The addition of the document is mainly divided into two conditions of batch addition and littleness addition; in the batch addition, a temporary index is established on a disc; and in the littleness addition, an internal memory index is established. In the deletion of the document, a deletion mark is firstly made on the document to be deleted, and the document is deleted in a unitive way until a proper opportunity. In the modification of the index, an original document is firstly deleted, and then a novel document is anew added. In a key management strategy, stratification management is carried out on an index encryption key, and the update of the key is realized by a timestamp mechanism. In an access control strategy based on the index, access control information is integrated into the index, and the access control of document level granularity is realized.
Owner:HUAZHONG UNIV OF SCI & TECH

Methods for automatically determining workflow for print jobs

A method that automatically determines a workflow for a print job via analysis of PDF documents is disclosed. A job synopsis comprising hashed information of a historical print job's document content along with PDF document-level and object-level metadata and information about workflow nodes and parameters can be stored in a database. The job synopsis of a new incoming print job can then be compared to the historical print job synopses database. If the new print job matches the historical print job within a certain predefined similarity limit, then workflow and parameter information associated with the historical job can be utilized to augment an initial workflow for the incoming print job.
Owner:XEROX CORP

Method for searching and sequencing personalized web pages based on user retention time analysis

Inactive | CN102231165A | Capture interest in reading | Predict Potential Attraction | Special data processing applications | Personalization | Web browser
The invention discloses a method for searching and sequencing personalized web pages based on user retention time analysis. The method comprises the following steps of: firstly, obtaining document-level user retention time through a custom web browser; accordingly, predicting concept word-level user retention time; then, according to the predicted concept word-level user retention time, further predicting personalized reading interests of a user to each web page in any web page searching result; and finally, according to the personalized reading interests of the user, generating a personalized web page searching result facing to the user. In the method disclosed by the invention, by using an artificial intelligent related technology and methods for searching web pages, processing texts and the like, reading interests of users to different concepts can be estimated; therefore, personal reading habits and requirements are considered in the process for searching and sequencing the web pages; and the sequencing of the web page searching results is closer to the user personalized prediction result, therefore, better network search and browser support are provided for users.
Owner:ZHEJIANG UNIV

Methods and systems for providing automated actions on recognized text strings in a computer-generated document

Methods and systems provide for automatically performing actions on or in association with text or data strings that are recognized as belonging to certain semantic categories. Text entered by a user is passed to a recognizer application. If a given text or data string is recognized as belonging to a given semantic category, the recognizer application passes data corresponding to the recognized string back to a host application. In response to recognized text or data, a pointer to the object model of the host application may be passed to the recognizer application to allow the recognizer application to perform any function of the host application in response to the recognized string. Alternatively, after the recognizer application passes data corresponding to the recognized string back to the host application, the host application may fire an application level or document level event for causing an action component to perform desired actions on recognized strings. Alternatively, after a string is recognized by the recognizer application, the recognizer application may set a property associated with a desired action to be performed on or in association with the recognized string. The host application may call an action component identified by the property for automatically performing the desired action on or in association with the recognized string.
Owner:MICROSOFT CORP

Method and system for generating large coded data set of text from textual documents using high resolution labeling

Inactive | US20170270096A1 | High resolution labeling | High score | Semantic analysis | Machine learning | Data set | Documentation
A method and a system for generating coded dataset of sentences with a high resolution labeling are provided herein. The method may include: obtaining a plurality of textual documents that are pre-classified on a whole document level, into topics; training one or more mixed-membership model unsupervised algorithms, implemented by a computer processor, based on said topics, to yield a distribution of sub topics for each of the textual documents; and applying a transformation, implemented by a computer processor, to said distribution of sub topics for each of the textual documents, to yield a topic tagging score for said sub topics on a text-portion level.
Owner:YISSUM RES DEV CO OF THE HEBREW UNIVERSITY OF JERUSALEM LTD

Method, system and device for enhancing business information security

The present invention provides a method for creating an electronic document file comprising monitoring creation and changes of an electronic document file, receiving a policy file including document level set-up information and security policy, searching for words associated with business information from the text data retrieved from the electronic document file, computing an exposure score of the electronic document file based on the number of times for words associated with business information being searched and document level set-up information, assigning a document level to the electronic document file based on the exposure score, and inserting a watermark to text of the electronic document file to be displayed on the client device based on the user's personal information received from the server. Accordingly, leakage of business documents for electronic document files including business information can be prevented by providing pre-security and post-security measures stronger than conventional measures.
Owner:MARKANY
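
The scoring and level-assignment steps can be pictured with a small sketch. The keyword weights, thresholds, and level names below are hypothetical examples of what a policy file's document level set-up information might hold; the abstract does not give the actual scoring formula.

```python
from typing import Dict, List, Tuple


def exposure_score(text: str, keyword_weights: Dict[str, int]) -> int:
    """Sum of (occurrences of each business keyword) times its policy weight."""
    words = text.lower().split()
    return sum(weight * words.count(keyword)
               for keyword, weight in keyword_weights.items())


def assign_level(score: int, thresholds: List[Tuple[int, str]],
                 default: str = "public") -> str:
    """Pick the level with the highest minimum score that the document reaches."""
    level = default
    for minimum, name in sorted(thresholds):
        if score >= minimum:
            level = name
    return level
```

For example, with weights `{"revenue": 2, "contract": 3}`, a text that mentions "revenue" twice and "contract" once scores 7.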

Twitter viewpoint classification-oriented sentiment-enriched word embedding learning method

The invention provides a Twitter viewpoint classification-oriented sentiment-enriched word embedding learning method, and relates to the technical field of computers. When word level n-gram and polarity information are modeled at the same time, sentiment polarity information of tweet document level is modeled, sentiment information of the word level is integrated, and the word level is naturally input to convolution to serve as an input of the tweet level. When learnt words are embedded in a Twitter viewpoint polarity classification task, an experimental result on a standard data set shows that the method provided by the invention is superior to existing similar methods.
Owner:PINGDINGSHAN UNIVERSITY

Scope-based xps filter driver configuration to perform dynamic operations

A filter pipeline framework is provided for a printer driver including a plurality of filters. The framework includes a logical page filter configured to perform operations on logical pages within a first thread; a document level filter configured to perform document level operations within a second thread; a job level and physical page filter configured to perform job level operations and physical page operations within a third thread; and a command managing unit configured to generate and manage commands compiled in command lists for the filters. Print ticket processing is performed at the beginning of the filter pipeline in the logical page filter. Further, each filter runs simultaneously within its own thread to perform a specific scope of operations defined for that filter.
Owner:CANON KK