Cross-modal document processing method and device and electronic equipment

A cross-modal document processing technology, applied in the field of computer data processing, that addresses problems such as the inability to extract document metadata and construct knowledge, which hinders document analysis and understanding and the processing of downstream tasks.

Active Publication Date: 2020-09-18
SOUTHEAST UNIV
5 Cites, 8 Cited by

AI Technical Summary

Problems solved by technology

Most existing document processing technologies only perform Optical Character Recognition (OCR) on documents, and the layout information of a document is easily lost during processing, which hinders full-dimensional analysis and understanding of documents by machines.
Existing natural language processing relies mainly on elements such as semantics and cannot extract document metadata or construct knowledge from documents, which hinders downstream tasks such as building knowledge bases or knowledge networks.

Embodiment Construction

[0055] The technical solutions in the embodiments of the present application will be described below with reference to the drawings in the embodiments of the present application. It should be noted that terms such as "first" and "second" are only used for distinguishing descriptions, and should not be understood as indicating or implying relative importance.

[0056] The embodiments of the present application will be described in detail below in conjunction with the accompanying drawings. In the case of no conflict, the following embodiments and features in the embodiments can be combined with each other.

[0057] Referring to Figure 1, the cross-modal document processing method provided by the embodiment of the present application can be applied to an electronic device, which executes or implements each step of the method. The cross-modal document processing method of the embodiment of the present application can convert the text content of the ...

Abstract

The invention provides a cross-modal document processing method, a device, and electronic equipment. The method comprises: obtaining text modal data and image modal data of a first document; converting the text modal data into word and sentence feature embedding vectors based on a natural language processing model, and extracting first text element features from the text modal data; based on computer vision, using a target detection algorithm and an optical character recognition algorithm to locate target positions and recognize the text content of the first document, extracting second text element features from the image modal data, and performing element alignment to obtain structural features of the first document; and then combining the structural features with the embedded representation of the multi-dimensional features of the document to obtain a meta-knowledge graph model that includes a representation of the first document. In this way, the text content of the document is converted into the meta-knowledge graph model, so that the electronic equipment can use the model to identify and understand the document content more completely.
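The element alignment and graph construction steps described in the abstract can be illustrated with a minimal Python sketch. The visible text does not say how alignment is performed, so this sketch assumes a simple greedy match on string similarity between text-modality elements and OCR-recognized image elements. The names `TextElement`, `ImageElement`, `align_elements`, `build_graph`, the 0.6 threshold, and the triple vocabulary are all hypothetical and are not taken from the patent. The embedding step is omitted.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class TextElement:
    """Element extracted from the text modality (hypothetical structure)."""
    text: str
    role: str  # e.g. "title" or "paragraph"


@dataclass
class ImageElement:
    """Element located in the page image by detection plus OCR (hypothetical)."""
    text: str   # OCR output, possibly noisy
    box: tuple  # (x0, y0, x1, y1) in pixels


def similarity(a, b):
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def align_elements(text_elems, image_elems, threshold=0.6):
    """Greedily pair each text element with its most similar unused image element.

    Assumption: alignment by string similarity stands in for whatever
    alignment procedure the patent actually uses.
    """
    pairs, used = [], set()
    for t in text_elems:
        best_idx, best_score = None, threshold
        for i, im in enumerate(image_elems):
            if i in used:
                continue
            score = similarity(t.text, im.text)
            if score >= best_score:
                best_idx, best_score = i, score
        if best_idx is not None:
            used.add(best_idx)
            pairs.append((t, image_elems[best_idx]))
    return pairs


def build_graph(doc_id, pairs):
    """Represent the aligned elements as (subject, predicate, object) triples."""
    triples = []
    for n, (t, im) in enumerate(pairs):
        node = f"{doc_id}/element/{n}"
        triples.append((doc_id, "has_element", node))
        triples.append((node, "role", t.role))
        triples.append((node, "text", t.text))
        triples.append((node, "box", im.box))
    return triples


# Toy data: clean text-modality elements and noisy OCR elements in a different order.
text_elems = [
    TextElement("Cross-modal document processing", "title"),
    TextElement("The invention provides a method.", "paragraph"),
]
image_elems = [
    ImageElement("The inventlon provides a rnethod.", (40, 120, 560, 150)),
    ImageElement("Cross-modal docurnent processing", (40, 40, 560, 80)),
]

pairs = align_elements(text_elems, image_elems)
graph = build_graph("doc1", pairs)
```

Each aligned pair carries the clean text and its role from the text modality together with the bounding box from the image modality, which is the layout information that OCR-only processing tends to discard.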

Description

Technical field

[0001] The present invention relates to the technical field of computer data processing, in particular to a cross-modal document processing method, device and electronic equipment.

Background art

[0002] A document is usually a file type formed from human natural language plus format information. Processing documents enables machines to make better use of human data. Most existing document processing technologies only perform Optical Character Recognition (OCR) on documents, and the format information of a document is easily lost during processing, which hinders full-dimensional analysis and understanding of documents by machines. Existing natural language processing relies mainly on elements such as semantics and cannot extract document metadata or construct knowledge from documents, which hinders the processing of downstream tasks, for example, it is not conducive to the construction of knowledge bases ...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F40/194, G06F40/30, G06K9/20, G06K9/62, G06N3/04, G06N3/08
CPC: G06F40/194, G06F40/30, G06N3/049, G06N3/08, G06V10/22, G06N3/045, G06F18/2411
Inventor: 刘树衎
Owner: SOUTHEAST UNIV