Information processing method and device, electronic equipment and readable medium
By performing data identification and layout analysis on tabular data and using a pre-trained tabular processing model for data extraction, the problem of insufficient tabular data processing capabilities in existing technologies is solved, achieving more efficient information extraction and more accurate analysis results.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- YANGTZE DELTA REGION INST OF TSINGHUA UNIV ZHEJIANG
- Filing Date
- 2023-12-27
- Publication Date
- 2026-06-26
AI Technical Summary
In existing technologies, the ability of computers to process tabular data is limited by the processing capabilities and formats of the database, resulting in insufficient precision and flexibility in tabular data analysis, which reduces the ability to extract effective information and the accuracy of analysis results.
By acquiring table information from the document to be processed, data identification and layout analysis are performed. A pre-trained table processing model is used to extract data and generate the target processing result. The model is trained based on information from multiple tasks to improve processing capabilities and flexibility.
It improves the ability to extract effective information from tabular data and the accuracy of analysis results. The data processing process is no longer limited by the database, enhancing the understanding and flexibility of tabular information.
Figure CN117933209B_ABST
Description
Technical Field
[0001] This application relates to the field of computer technology, and more particularly to an information processing method, apparatus, electronic device, and readable medium.
Background Technology
[0002] With the development of computer technology, various question-answering systems, translation systems, knowledge graphs, and other technologies have been widely applied. Among these, there is a significant demand for using computers to process data in tables. For tabular data, computers extract target information from a given table based on the user's question. Therefore, the ability to extract useful information from tables is crucial.
[0003] In related technologies, computers import tabular data into a database and generate corresponding database language tasks as needed to execute within the database, thereby obtaining the required information.
[0004] However, in such technologies, data analysis capability is limited by the database's processing capability and processing format for tabular data. Tabular data is processed with insufficient granularity and flexibility, which reduces the ability to extract effective information from the tabular data and harms the accuracy of the tabular data analysis results.
Summary of the Invention
[0005] To address the aforementioned technical issues, this application provides an information processing method, apparatus, electronic device, and readable medium to improve the ability to extract effective information from tabular data and the accuracy of tabular data analysis results.
[0006] Other features and advantages of this application will become apparent from the following detailed description, or may be learned in part from practice of this application.
[0007] According to one aspect of the embodiments of this application, an information processing method is provided, including:
[0008] Obtain a document to be processed that contains table information to be processed;
[0009] Perform data recognition on the table information to be processed in the document to be processed to obtain the table data in the table information to be processed;
[0010] Obtain the target task information corresponding to the table data from a plurality of pieces of task information, wherein each piece of task information is used to indicate the processing result and data type of the table information to be processed;
[0011] The table data and the target task information are input into a pre-trained table processing model for data extraction, generating a target processing result corresponding to the target task information. The pre-trained table processing model is a model trained based on the multiple task information.
[0012] In some embodiments of this application, based on the above technical solutions, data recognition is performed on the table information to be processed in the document to be processed to obtain the table data in the table information to be processed, including:
[0013] Determine the data type of the document elements in the document to be processed, wherein the document elements contain at least table elements;
[0014] According to the data processing strategy corresponding to the data type, data is extracted from the document to be processed to obtain the character content and coordinates of the text blocks in the document to be processed.
[0015] Perform layout analysis on the document to be processed to obtain the layout analysis results of each document element;
[0016] Based on the position information of the table elements and the text block coordinates of the text block in the layout analysis results, the character content and position information of the table elements are obtained as table data in the table information to be processed.
[0017] In some embodiments of this application, based on the above technical solutions, the character content and position information of the table elements are obtained according to the position information of the table elements and the text block coordinates of the text block in the layout analysis results, including:
[0018] Based on the position information of each cell of the table element in the layout analysis results, determine the text block whose coordinates correspond to the position information;
[0019] Based on the relative position of each cell in the table information to be processed and its position in the document to be processed, the text blocks corresponding to the cells are combined to obtain the character content and position information of the table elements.
[0020] In some embodiments of this application, based on the above technical solutions, the table data and the target task information are input into a pre-trained table processing model for data extraction, generating a target processing result corresponding to the target task information, including:
[0021] For each cell in the table data, determine the cells to be merged based on the cell's relative position in the table information to be processed;
[0022] Based on the row and column layout information of the table in the layout analysis results of the table data, determine the multiple standard cells corresponding to the merged cells in the table data;
[0023] The character content in the merged cell is copied to the multiple standard cells respectively;
[0024] The merged cells in the table data are replaced with the corresponding multiple standard cells to obtain the information to be input.
[0025] The input information and the target task information are input into a pre-trained table processing model for data extraction, generating a target processing result corresponding to the target task information.
[0026] In some embodiments of this application, based on the above technical solutions, the method further includes:
[0027] Obtain training table data;
[0028] Based on the training table data and the multiple task information, determine the training task result corresponding to each task information;
[0029] The pre-trained model is trained based on the training table data, the multiple task information, and the corresponding training task results to obtain the pre-trained table processing model.
[0030] In some embodiments of this application, based on the above technical solutions, the plurality of task information includes atlas task information; the step of determining the training task result corresponding to each task information according to the training table data and the plurality of task information includes:
[0031] Based on the atlas task information, determine the target cell in the table data;
[0032] Obtain the character content, row label, column label, cell type, and key value label of the target cell as the description information of the target cell;
[0033] The description information of the target cell is used as the training result of the atlas task information.
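The description information listed above (character content, row label, column label, cell type, key value label) can be illustrated with a minimal sketch. The source does not define how these fields are computed, so everything here is an assumption made for illustration: the function names `cell_type` and `describe_cell` are hypothetical, the first row and first column are assumed to hold the labels, the cell type is assumed to be either "number" or "text", and the key value label is assumed to mark header cells as keys and all other cells as values.

```python
from typing import Dict, List


def cell_type(text: str) -> str:
    """Assumed rule: a cell is a number if it parses as a float, else text."""
    try:
        float(text.replace(",", ""))
        return "number"
    except ValueError:
        return "text"


def describe_cell(grid: List[List[str]], row: int, col: int) -> Dict[str, str]:
    """Build the description of one target cell (hypothetical field names)."""
    # Assumption: row 0 holds column labels and column 0 holds row labels.
    is_header = row == 0 or col == 0
    return {
        "content": grid[row][col],
        "row_label": grid[row][0],
        "column_label": grid[0][col],
        "cell_type": cell_type(grid[row][col]),
        "key_value": "key" if is_header else "value",
    }
```

Under these assumptions, the returned dictionary would serve as the training task result for the atlas task on that cell.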
[0034] In some embodiments of this application, based on the above technical solutions, after inputting the table data and the target task information into a pre-trained table processing model for data extraction and generating a target processing result corresponding to the target task information, the method further includes:
[0035] Based on the task type information in the target task information, obtain specified data from the target processing result, wherein the task type information corresponds to the data and data types contained in the target processing result;
[0036] Perform format conversion on the specified data to obtain the processing result of the table information to be processed.
[0037] According to one aspect of the embodiments of this application, an information processing apparatus is provided, comprising:
[0038] The document acquisition module is used to acquire documents containing information about tables to be processed.
[0039] The data recognition module is used to recognize the table information to be processed in the document to be processed, and to obtain the table data in the table information to be processed.
[0040] The task acquisition module is used to acquire the target task information corresponding to the table data from multiple task information, wherein each task information is used to indicate the processing result and data type of the table information to be processed;
[0041] The data extraction module is used to input the table data and the target task information into a pre-trained table processing model for data extraction, and generate a target processing result corresponding to the target task information. The pre-trained table processing model is a model trained based on the multiple task information.
[0042] According to one aspect of the embodiments of this application, an electronic device is provided, the electronic device comprising: a processor; and a memory for storing executable instructions of the processor; wherein the processor is configured to perform the information processing method as described above by executing the executable instructions.
[0043] According to one aspect of the embodiments of this application, a computer-readable storage medium is provided, on which a computer program is stored, which, when executed by a processor, implements the information processing method as described in the above technical solutions.
[0044] In the embodiments of this application, the solution performs data recognition on the table information to be processed in the document to be processed, obtains the table data in the table information, and then obtains the target task information corresponding to the table data from multiple task information. Each task information indicates the processing result and data type of the table information to be processed. The table data and target task information are then input into a pre-trained table processing model for data extraction, generating a target processing result corresponding to the target task information. The pre-trained table processing model is a model trained based on multiple task information. By using a table processing model pre-trained with multiple task information to extract information from the table to be processed, the data processing process is no longer limited by the database, increasing the processing process's ability to understand and flexibly handle table information, thereby improving the ability to extract effective information from the table data and the accuracy of the table data analysis results.
[0045] It should be understood that the above general description and the following detailed description are exemplary and explanatory only, and do not limit this application.
Attached Figure Description
[0046] The accompanying drawings, which are incorporated in and form part of this specification, illustrate embodiments consistent with this application and, together with the description, serve to explain the principles of this application. It is obvious that the drawings described below are merely some embodiments of this application, and those skilled in the art can obtain other drawings based on these drawings without any inventive effort.
[0047] Figure 1 is a flowchart of the overall process of processing table information in an embodiment of this application.
[0048] Figure 2 is a flowchart of an information processing method according to an embodiment of this application.
[0049] Figure 3 is a schematic flowchart of the data recognition process in an embodiment of this application.
[0050] Figure 4 is a schematic flowchart of the model training process in an embodiment of this application.
[0051] Figure 5 is a schematic block diagram of the information processing apparatus in an embodiment of this application.
[0052] Figure 6 is a schematic structural diagram of a computer system suitable for implementing the electronic device of an embodiment of this application.
Detailed Implementation
[0053] Exemplary embodiments will now be described more fully with reference to the accompanying drawings. However, these exemplary embodiments can be implemented in many forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided to make this application more comprehensive and complete, and to fully convey the concept of the exemplary embodiments to those skilled in the art.
[0054] Furthermore, the described features, structures, or characteristics can be combined in any suitable manner in one or more embodiments. Numerous specific details are provided in the following description to give a thorough understanding of embodiments of this application. However, those skilled in the art will recognize that the technical solutions of this application can be practiced without one or more of the specific details, or other methods, components, apparatuses, steps, etc., can be employed. In other instances, well-known methods, apparatuses, implementations, or operations are not shown or described in detail to avoid obscuring various aspects of this application.
[0055] In this application embodiment, the terms "module" or "unit" refer to a computer program or part of a computer program that has a predetermined function and works with other related parts to achieve a predetermined goal, and can be implemented wholly or partially using software, hardware (such as processing circuitry or memory), or a combination thereof. Similarly, a processor (or multiple processors or memory) can be used to implement one or more modules or units. Furthermore, each module or unit can be part of an overall module or unit that includes the functionality of that module or unit.
[0056] The block diagrams shown in the accompanying drawings are merely functional entities and do not necessarily correspond to physically independent entities. That is, these functional entities can be implemented in software, in one or more hardware modules or integrated circuits, or in different network and / or processor devices and / or microcontroller devices.
[0057] The flowcharts shown in the accompanying drawings are merely illustrative and do not necessarily include all content and operations / steps, nor do they necessarily have to be performed in the described order. For example, some operations / steps can be broken down, while others can be combined or partially combined; therefore, the actual execution order may change depending on the specific circumstances.
[0058] It should be understood that the solution presented in this application can be applied to information extraction scenarios, specifically to scenarios involving information extraction from tabular data. Refer to Figure 1, which is a flowchart of the overall process of processing table information in an embodiment of this application. As shown in Figure 1, the overall process consists mainly of four modules: a text-based table data generation module A, a prompt generation module B, a table pre-trained model C, and a post-processing module D. The solution receives a document as input; the document contains at least one table. The text-based table data generation module A extracts the text content and the table portions from the document. The prompt generation module B then generates a prompt based on the extracted text information. The table pre-trained model C makes a prediction based on the table text content and the prompt. Finally, the post-processing module D processes and transforms the model output to obtain the final processing result. The data in the final processing result is usually specified by the prompt; it can be data taken directly from the table or a value calculated from the data in the table. For example, if the table contains a company's annual revenue and expenditure data, the processing result could be the revenue and expenditure for a specific month, or a calculated value such as average monthly expenditure, monthly profit and loss, or the growth rate of turnover.
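The four-module flow described above can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the implementation of this application: the function names, the dictionary layout of the parsed document, the prompt format, and the percentage-style model output are all hypothetical, and `stub_model` merely stands in for the table pre-trained model C.

```python
from typing import Callable, List

Table = List[List[str]]  # rows of cell strings


def extract_tables(document: dict) -> List[Table]:
    """Module A (hypothetical): return the text-form tables of a parsed document."""
    return document.get("tables", [])


def build_prompt(table: Table, question: str) -> str:
    """Module B (hypothetical): serialize the table and append the task text."""
    rows = [" | ".join(row) for row in table]
    return "\n".join(rows) + "\nTask: " + question


def post_process(raw_output: str) -> float:
    """Module D (hypothetical): convert a percentage-like string to a number."""
    return float(raw_output.strip().rstrip("%"))


def run_pipeline(document: dict, question: str,
                 model: Callable[[str], str]) -> List[float]:
    """Run modules A to D; `model` plays the role of module C."""
    results = []
    for table in extract_tables(document):  # each table is handled separately
        prompt = build_prompt(table, question)
        results.append(post_process(model(prompt)))
    return results


def stub_model(prompt: str) -> str:
    """Stand-in for the table pre-trained model C; returns a fixed string."""
    return " 12.5% "
```

Passing the model as a callable keeps the sketch independent of any particular model library; a real deployment would replace `stub_model` with a call to the trained table processing model.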
[0059] With the development of computer technology, various question-answering systems, translation systems, knowledge graphs, and other technologies have been widely applied. Among these, there is a significant demand for using computers to process data in tables. For tabular data, computers extract target information from a given table based on the user's question. Therefore, the ability to extract useful information from tables is crucial. In related technologies, computers import tabular data into a database and generate corresponding database language tasks to execute within the database, thereby obtaining the required information. However, in this type of technology, data analysis capabilities are limited by the database's processing capabilities and formats for tabular data. The granularity and flexibility of tabular data processing are insufficient, reducing the ability to extract effective information from the tabular data and negatively impacting the accuracy of the analysis results.
[0060] Based on this, this application proposes an information processing method. Figure 2 shows a flowchart of an information processing method according to an embodiment of this application. The method can be executed by a terminal or server on which the information processing apparatus of this application is deployed. As shown in Figure 2, the information processing method includes at least steps S210 to S240, which are described in detail below:
[0061] Step S210: Obtain the document to be processed containing the information of the table to be processed.
[0062] A document to be processed is typically a target file provided by a user to an information processing device. Documents to be processed can take various forms, such as PDF or text. Similarly, tables to be processed can also be in image or text format. A document to be processed can contain multiple tables, and the solution will process each table separately. The table information can include information contained within the table itself, as well as information describing the table, such as its location, size, and format.
[0063] Step S220: Perform data recognition on the table information to be processed in the document to be processed to obtain the table data in the table information to be processed.
[0064] The information processing device performs data recognition on the table information to be processed in the received document, thereby extracting the table data from the table information. Specifically, the information processing device typically recognizes all information in the document to be processed, such as figures, tables, text paragraphs, or multimedia content. After recognizing the tables in the document, the device extracts the table data from the table information. The table data includes the text content of the table itself, typically including row information, column information, and table data. In some embodiments, the table data may also include additional supplementary information for rows or columns, such as the storage address of the information corresponding to the table text information.
[0065] Step S230: Obtain the target task information corresponding to the table data from multiple task information sources, wherein each task information is used to indicate the processing result and data type of the table information to be processed.
[0066] After obtaining the table data, the information processing device retrieves the target task information corresponding to the table data from the multiple pieces of task information. Task information is pre-defined information that indicates the processing result and data type of the table information to be processed. The processing result may specify requirements such as calculating averages or totals, extracting data, performing comprehensive calculations over multiple rows or columns, or combining data from multiple tables. The data type refers to the type of data in the processing result, which may include formats such as numbers, text, images, links, and dates. Each piece of task information has a corresponding scope of applicable tables; for example, calculating averages requires the table data to contain numerical or textual information. The information processing device can also determine the target task information from information in the table data, for example by checking, based on the table name, row names, or column names, whether the table contains data matching a specific task, and then identifying that task as the target task information. In some embodiments, the correspondence between table data and target task information is pre-stored in the information processing device, allowing the device to determine the target task information directly from table identification information such as the table name. Task information is typically expressed as a textual description, such as "What percentage of operating revenue is net profit?".
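One way to select target task information from column names, as described above, is sketched below. This is a minimal sketch under assumptions: the `TaskInfo` class, its `required_headers` field, and the rule "the first task whose required headers all appear in the table is chosen" are hypothetical and are not specified by this application.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class TaskInfo:
    prompt: str                 # textual task description
    result_type: str            # data type of the processing result
    required_headers: Set[str]  # assumed: lower-case headers the task needs


def select_target_task(headers: List[str],
                       tasks: List[TaskInfo]) -> Optional[TaskInfo]:
    """Return the first task whose required headers all occur in the table."""
    present = {h.strip().lower() for h in headers}
    for task in tasks:
        if task.required_headers <= present:  # subset test
            return task
    return None  # no pre-defined task applies to this table
```

A pre-stored mapping from table name to task, which the text also mentions, could replace the header test with a simple dictionary lookup.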
[0067] Step S240: Input the table data and the target task information into the pre-trained table processing model for data extraction, and generate the target processing result corresponding to the target task information. The pre-trained table processing model is a model trained based on the multiple task information.
[0068] The information processing device inputs tabular data and target task information into a pre-trained tabular processing model for data extraction, generating a target processing result corresponding to the target task information. The pre-trained tabular processing model is a pre-trained model trained on multiple task information sets. This model calculates the target processing result based on the tabular data, according to the processing objective specified in the target task information. For example, if the table contains operating data, and the target task information requires calculating the ratio of net profit to operating revenue, then the target processing result is the calculated ratio.
[0069] In the embodiments of this application, the solution performs data recognition on the table information to be processed in the document to be processed, obtains the table data in the table information, and then obtains the target task information corresponding to the table data from multiple task information. Each task information indicates the processing result and data type of the table information to be processed. The table data and target task information are then input into a pre-trained table processing model for data extraction, generating a target processing result corresponding to the target task information. The pre-trained table processing model is a model trained based on multiple task information. By using a table processing model pre-trained with multiple task information to extract information from the table to be processed, the data processing process is no longer limited by the database, increasing the processing process's ability to understand and flexibly handle table information, thereby improving the ability to extract effective information from the table data and the accuracy of the table data analysis results.
[0070] In some embodiments of this application, based on the above technical solutions and steps, data recognition is performed on the table information to be processed in the document to be processed to obtain the table data in the table information to be processed. Specifically, the steps include the following:
[0071] Determine the data type of the document elements in the document to be processed, wherein the document elements contain at least table elements;
[0072] According to the data processing strategy corresponding to the data type, data is extracted from the document to be processed to obtain the character content and coordinates of the text blocks in the document to be processed.
[0073] Perform layout analysis on the document to be processed to obtain the layout analysis results of each document element;
[0074] Based on the position information of the table elements and the text block coordinates of the text block in the layout analysis results, the character content and position information of the table elements are obtained as table data in the table information to be processed.
[0075] Specifically, in this embodiment, the information processing device first determines the type of each document element in the document, then extracts data according to the data processing strategy corresponding to that type, thereby obtaining the character content and coordinates of the text blocks. It then obtains the character content and position information of the table element based on the correspondence between the text block coordinates and the position of the table element. Refer to Figure 3, which is a schematic flowchart of the data recognition process in an embodiment of this application. As shown in Figure 3, the document type determination module classifies the input file by type and sends text-type files to the document parser and image-type files to the OCR module. After processing, the document parser or OCR module outputs the character content and corresponding coordinates of the text blocks in the document. The subsequent layout analysis module aggregates the text blocks belonging to each document element and outputs the document element category (table / legend / paragraph). Finally, the table information output module processes and converts the document elements of type table and outputs them in a specific format. In the data output at this stage, the table data structure is defined by a number of cell elements. In addition to the string corresponding to the cell content, each cell element includes the relative position of the cell in the original document (row / column start position and number of cells occupied) and its absolute position (coordinate values).
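The cell element described above, together with the routing of files to the document parser or the OCR module, can be sketched as follows. The field names, the `(x0, y0, x1, y1)` coordinate layout, and the rule of routing by file suffix are assumptions made for illustration; the source only states that text-type and image-type files go to different modules.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Cell:
    text: str        # string content of the cell
    row: int         # relative position: starting row index
    col: int         # relative position: starting column index
    row_span: int    # number of rows occupied
    col_span: int    # number of columns occupied
    bbox: Tuple[float, float, float, float]  # absolute position (x0, y0, x1, y1)


def choose_extractor(filename: str) -> str:
    """Assumed routing rule: image suffixes go to OCR, everything else to the parser."""
    image_suffixes = (".png", ".jpg", ".jpeg", ".tif", ".tiff", ".bmp")
    return "ocr" if filename.lower().endswith(image_suffixes) else "parser"
```

Routing by suffix is a simplification: a scanned PDF, for instance, would need OCR despite its suffix, so a practical type determination module would likely inspect the file content as well.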
[0076] In some embodiments of this application, based on the above technical solution and the above steps, obtaining the character content and position information of the table element according to the position information of the table element and the text block coordinates of the text block in the layout analysis result specifically includes the following steps:
[0077] Based on the position information of each cell of the table element in the layout analysis results, determine the text block whose coordinates correspond to the position information;
[0078] Based on the relative position of each cell in the table information to be processed and its position in the document to be processed, the text blocks corresponding to the cells are combined to obtain the character content and position information of the table elements.
[0079] Specifically, the layout analysis results include the position information of each cell in the table element, which typically refers to the cell's location within the document. Based on this position information and text block coordinates, the text block falling within the cell can be determined, thus establishing the correspondence between cells and text blocks. Based on this correspondence, the text block corresponding to each cell can be identified. In this embodiment, one cell can correspond to multiple text blocks. By combining the text blocks corresponding to each cell, the character content and position information of the table element can be obtained. By determining the cell content based on the cell position and text block coordinates, even when cells in the table have inconsistent sizes, the content of heterogeneous cells composed of multiple cells can be combined, improving the ability to recognize table content.
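The matching of text blocks to cells by position can be sketched as below. The containment test used here (the center point of the text block must fall inside the cell rectangle) and the top-to-bottom, left-to-right joining order are assumptions; the source only states that text blocks falling within a cell are determined from the positions and then combined.

```python
from typing import Dict, List, Tuple

BBox = Tuple[float, float, float, float]  # (x0, y0, x1, y1), y grows downward
CellKey = Tuple[int, int]                 # (row, column) of the cell


def assign_blocks(cells: Dict[CellKey, BBox],
                  blocks: List[Tuple[str, BBox]]) -> Dict[CellKey, str]:
    """Attach each text block to the cell containing its center, then join them."""
    grouped: Dict[CellKey, List[Tuple[float, float, str]]] = {k: [] for k in cells}
    for text, box in blocks:
        cx = (box[0] + box[2]) / 2
        cy = (box[1] + box[3]) / 2
        for key, cell_box in cells.items():
            if cell_box[0] <= cx <= cell_box[2] and cell_box[1] <= cy <= cell_box[3]:
                # Store (top, left, text) so sorting yields reading order.
                grouped[key].append((box[1], box[0], text))
                break  # a block belongs to at most one cell
    return {key: " ".join(t for _, _, t in sorted(items))
            for key, items in grouped.items()}
```

Because the test uses the cell rectangle rather than a fixed grid, a large cell that spans several rows or columns collects all of its text blocks in the same way as a small one.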
[0080] In some embodiments of this application, based on the above technical solution and the above steps, the table data and the target task information are input into a pre-trained table processing model for data extraction to generate a target processing result corresponding to the target task information, specifically including the following steps:
[0081] For each cell in the table data, determine the cells to be merged based on the cell's relative position in the table information to be processed;
[0082] Based on the row and column layout information of the table in the layout analysis results of the table data, determine the multiple standard cells corresponding to the merged cells in the table data;
[0083] The character content in the merged cell is copied to the multiple standard cells respectively;
[0084] The merged cells in the table data are replaced with the corresponding multiple standard cells to obtain the information to be input.
[0085] The input information and the target task information are input into a pre-trained table processing model for data extraction, generating a target processing result corresponding to the target task information.
[0086] In this embodiment, the information processing device reconstructs the merged cells in the table data. For each cell in the table data, the device determines whether it is a merged cell based on the cell's relative position in the table information to be processed. A merged cell relates to the rows and columns of the table differently from a single cell: it typically spans multiple rows or columns. Cells that span multiple rows or columns can therefore be selected by their relative positions and identified as merged cells. The device then determines the multiple standard cells corresponding to each merged cell based on the row and column layout information of the table in the layout analysis results. Standard cells are the basic cells of the table, and a merged cell is obtained by merging several of them; the number of rows and columns a merged cell spans determines how many standard cells it covers. The device splits each merged cell into its standard cells and copies the character content of the merged cell into each of them. These standard cells then replace the original merged cell in the table data to obtain the information to be input. Finally, the information to be input and the target task information are fed into the pre-trained table processing model for data extraction, generating a target processing result corresponding to the target task information.
In some embodiments, the information processing device describes the two-dimensional table structure information in the form of a graph and performs text preprocessing such as word segmentation on the text cells; the vectorized table data is used as the input information for the pre-trained table processing model. By reconstructing merged cells, the table structure in the table data is adjusted to a uniform form, avoiding the additional computation caused by heterogeneous tables and improving the efficiency of table processing.
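The merged-cell reconstruction described above can be illustrated with a minimal Python sketch. The `Cell` record, its span fields, and the `expand_merged_cells` function are hypothetical names introduced only for this illustration; the application does not specify a data structure, and the sketch assumes the layout analysis has already produced each cell's starting row and column and the number of rows and columns it spans.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    # Hypothetical cell record: top-left grid position, span, and text.
    row: int
    col: int
    row_span: int
    col_span: int
    text: str


def expand_merged_cells(cells, n_rows, n_cols):
    """Return an n_rows x n_cols grid of standard cells.

    A cell whose span exceeds 1 in either direction is treated as a merged
    cell; its text is copied into every standard cell it covers.
    """
    grid = [["" for _ in range(n_cols)] for _ in range(n_rows)]
    for cell in cells:
        for r in range(cell.row, cell.row + cell.row_span):
            for c in range(cell.col, cell.col + cell.col_span):
                grid[r][c] = cell.text
    return grid


cells = [
    Cell(0, 0, 2, 1, "Region"),  # merged cell spanning two rows
    Cell(0, 1, 1, 2, "Sales"),   # merged cell spanning two columns
    Cell(1, 1, 1, 1, "Q1"),
    Cell(1, 2, 1, 1, "Q2"),
]
grid = expand_merged_cells(cells, n_rows=2, n_cols=3)
```

After expansion every row has the same number of standard cells, which is the uniform form the paragraph above refers to.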
[0087] In some embodiments of this application, based on the above technical solutions, the method of this application further includes:
[0088] Acquire training table data;
[0089] Based on the training table data and the multiple task information, determine the training task result corresponding to each task information;
[0090] The pre-trained model is trained based on the training table data, the multiple task information, and the corresponding training task results to obtain the pre-trained table processing model.
[0091] In this application, the server fine-tunes a pre-trained model on multiple tasks to obtain the pre-trained table processing model. Figure 4 is a schematic flowchart of the model training process in an embodiment of this application. As shown in Figure 4, training table data with prompts is input during training. The training data is generated in the same way as the table data described above. The server determines the training task result corresponding to each task information based on the training table data and the multiple task information. The training task result is the correct result that should be obtained when the training table data is processed according to the task information. For example, for a task that requires calculating an average value, the server takes the actual average value in the training table data as the training task result. The input training table data passes through a preprocessing and table encoding module, which performs steps such as merged-cell reconstruction, text preprocessing, and vectorization, and is then input into the pre-trained model. The pre-trained model is trained on the multiple task information; during training, the model parameters are adjusted based on the model's outputs and the training task results, yielding a pre-trained table processing model that covers the multiple task information. This training method improves the model's ability to process table data and task information and improves the accuracy of table data processing.
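As a rough illustration of how training task results might be derived, the following Python sketch pairs one training table with several tasks and computes the ground-truth result for each. The dictionary layout, the field names (`prompt`, `type`, `column`), and the `max` task are assumptions made for this example; the application itself only names the average-value task.

```python
from statistics import mean


def build_training_samples(table, tasks):
    """Pair one training table with each task and its ground-truth result."""
    samples = []
    for task in tasks:
        # Read the numeric column the task refers to.
        values = [float(row[task["column"]]) for row in table["rows"]]
        if task["type"] == "average":
            result = mean(values)
        elif task["type"] == "max":  # hypothetical second task
            result = max(values)
        else:
            raise ValueError(f"unsupported task type: {task['type']}")
        samples.append(
            {"prompt": task["prompt"], "table": table, "label": result}
        )
    return samples


table = {"header": ["Name", "Score"], "rows": [["A", "80"], ["B", "90"]]}
tasks = [
    {"prompt": "What is the average score?", "type": "average", "column": 1},
    {"prompt": "What is the highest score?", "type": "max", "column": 1},
]
samples = build_training_samples(table, tasks)
```

Each sample carries the prompt, the table, and the label against which the model output would be compared during fine-tuning.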
[0092] In some embodiments of this application, based on the above technical solutions, the multiple task information includes graph task information; the above step, determining the training task result corresponding to each task information based on the training table data and the multiple task information, includes:
[0093] Based on the graph task information, determine the target cell in the table data;
[0094] Obtain the character content, row label, column label, cell type, and key value label of the target cell as the description information of the target cell;
[0095] The description information of the target cell is used as the training task result of the graph task information.
[0096] In this embodiment, the graph task information requires the model to extract and describe tabular data in the form of tuples. Specifically, the graph task information specifies the attributes of the cells to be extracted, such as which row and column rules they must conform to and the information they represent. For a target cell, the information processing module obtains the character content, row label, column label, cell type, and key value label of the target cell as its description information. The row label can be the name of a row, used to describe the meaning of the data in that row; the column label can be the name of a column, used to describe the meaning of the data in that column. The cell type refers to the format of the data in the cell, such as number, text, or date. The key value label indicates whether the cell itself is a row label or a column label. It can be understood that for a cell that itself holds a row label or column label, its own character content is that label, so the corresponding row label attribute or column label attribute can be left blank.
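The tuple description of a target cell can be sketched as follows. This is only an illustration under stated assumptions: the `CellDescription` type, the `describe_cell` and `infer_cell_type` helpers, and the convention that the first row holds column labels and the first column holds row labels are all hypothetical and not taken from the application.

```python
from datetime import datetime
from typing import NamedTuple


class CellDescription(NamedTuple):
    content: str
    row_label: str
    column_label: str
    cell_type: str   # "number", "text", or "date"
    key_label: str   # "row_label", "column_label", or "" for a data cell


def infer_cell_type(text: str) -> str:
    """Classify cell text as number, date (YYYY-MM-DD only), or text."""
    try:
        float(text)
        return "number"
    except ValueError:
        pass
    try:
        datetime.strptime(text, "%Y-%m-%d")
        return "date"
    except ValueError:
        return "text"


def describe_cell(grid, r, c):
    """Describe grid[r][c]; assumes row 0 = column labels, column 0 = row labels."""
    text = grid[r][c]
    cell_type = infer_cell_type(text)
    if r == 0:
        # The cell is itself a column label, so its label attributes stay blank.
        return CellDescription(text, "", "", cell_type, "column_label")
    if c == 0:
        # The cell is itself a row label, so its row label attribute stays blank.
        return CellDescription(text, "", grid[0][0], cell_type, "row_label")
    return CellDescription(text, grid[r][0], grid[0][c], cell_type, "")


grid = [
    ["Name", "Joined", "Score"],
    ["Alice", "2023-01-05", "91.5"],
]
score = describe_cell(grid, 1, 2)
joined = describe_cell(grid, 1, 1)
header = describe_cell(grid, 0, 2)
```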
[0097] In some embodiments of this application, based on the above technical solutions, after inputting the table data and the target task information into a pre-trained table processing model for data extraction and generating a target processing result corresponding to the target task information, the method further includes:
[0098] Based on the task type information in the target task information, obtain specified data from the target processing result, wherein there is a correspondence between the task type information and the data and data types contained in the target processing result;
[0099] Perform data format conversion on the specified data to obtain the processing result of the table information to be processed.
[0100] In this embodiment, the information processing module retrieves specified data from the target processing result based on the task type information in the target task information. There is a correspondence between the task type information and the data and data types contained in the target processing result. For example, if the task type information requires extracting a date, the result data will contain date data. However, the date data can be in date format, text format, or directly in numeric format. The information processing module performs data format conversion on the specified data to obtain the processing result of the table information to be processed. Specifically, the information processing module converts all specified data into a unified format; for example, for dates, the information processing module converts both text and date formats into numeric format, thereby normalizing the data.
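The date normalization described above might look like the following Python sketch. The numeric target format (an integer of the form YYYYMMDD), the accepted text patterns, and the function name `normalize_date` are assumptions chosen for illustration; the application only states that text and date formats are converted into a numeric format.

```python
from datetime import date, datetime

# Text patterns accepted by this sketch (an assumption, not from the source).
_TEXT_FORMATS = ("%Y-%m-%d", "%Y/%m/%d", "%Y%m%d")


def normalize_date(value):
    """Convert a date given as a date object, text, or number to int YYYYMMDD."""
    if isinstance(value, datetime):
        value = value.date()
    if isinstance(value, date):
        return value.year * 10000 + value.month * 100 + value.day
    if isinstance(value, int):
        # Assumed to be in YYYYMMDD form already.
        return value
    if isinstance(value, str):
        for fmt in _TEXT_FORMATS:
            try:
                parsed = datetime.strptime(value.strip(), fmt).date()
            except ValueError:
                continue
            return normalize_date(parsed)
    raise ValueError(f"cannot normalize date value: {value!r}")


normalized = [
    normalize_date(date(2023, 12, 27)),
    normalize_date("2023/12/27"),
    normalize_date(20231227),
]
```

All three inputs map to the same integer, so downstream comparisons no longer depend on how the model happened to emit the date.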
[0101] It should be noted that although the steps of the method in this application are described in a specific order in the accompanying drawings, this does not require or imply that the steps must be performed in that specific order, or that all the steps shown must be performed to achieve the desired result. Additionally or alternatively, some steps may be omitted, multiple steps may be combined into one step, and/or one step may be broken down into multiple steps.
[0102] The following describes the apparatus embodiments of this application, which can be used to execute the information processing method in the above embodiments of this application. Figure 5 shows a schematic block diagram of the composition of an information processing device according to an embodiment of this application. As shown in Figure 5, the information processing device 500 mainly includes:
[0103] The document acquisition module 510 is used to acquire a document to be processed that contains table information to be processed;
[0104] The data recognition module 520 is used to perform data recognition on the table information to be processed in the document to be processed, and obtain the table data in the table information to be processed;
[0105] The task acquisition module 530 is used to acquire target task information corresponding to the table data from multiple task information, wherein each task information is used to indicate the processing result and data type of the table information to be processed;
[0106] The data extraction module 540 is used to input the table data and the target task information into a pre-trained table processing model for data extraction, and generate a target processing result corresponding to the target task information. The pre-trained table processing model is a model trained based on the multiple task information.
[0107] In some embodiments of this application, based on the above technical solutions, the data recognition module 520 is specifically used for:
[0108] Determine the data type of the document elements in the document to be processed, wherein the document elements contain at least table elements;
[0109] According to the data processing strategy corresponding to the data type, data is extracted from the document to be processed to obtain the character content and coordinates of the text blocks in the document to be processed.
[0110] Perform layout analysis on the document to be processed to obtain the layout analysis results of each document element;
[0111] Based on the position information of the table elements and the text block coordinates of the text block in the layout analysis results, the character content and position information of the table elements are obtained as table data in the table information to be processed.
[0112] In some embodiments of this application, based on the above technical solutions, the data recognition module 520 is specifically used for:
[0113] Based on the position information of each cell of the table element in the layout analysis results, determine the text block whose coordinates correspond to the position information;
[0114] Based on the relative position of each cell in the table information to be processed and its position in the document to be processed, the text blocks corresponding to the cells are combined to obtain the character content and position information of the table elements.
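The matching of text blocks to cells in the two steps above can be sketched in Python. This is a minimal illustration under assumptions not stated in the application: boxes are `(x0, y0, x1, y1)` tuples in page coordinates, a text block belongs to the cell that contains its center point, and blocks within a cell are joined top to bottom, then left to right. The names `fill_cells`, `center`, and `contains` are hypothetical.

```python
def center(box):
    """Center point of an (x0, y0, x1, y1) box."""
    x0, y0, x1, y1 = box
    return ((x0 + x1) / 2, (y0 + y1) / 2)


def contains(box, point):
    """True if the point lies inside the box (borders included)."""
    x0, y0, x1, y1 = box
    x, y = point
    return x0 <= x <= x1 and y0 <= y <= y1


def fill_cells(cell_boxes, text_blocks):
    """Assign text blocks to cells and join them in reading order.

    cell_boxes: {(row, col): box} from the layout analysis results.
    text_blocks: list of (box, text) pairs from data extraction.
    """
    contents = {key: [] for key in cell_boxes}
    for box, text in text_blocks:
        point = center(box)
        for key, cell_box in cell_boxes.items():
            if contains(cell_box, point):
                # Keep (top, left) so blocks can be sorted into reading order.
                contents[key].append((box[1], box[0], text))
                break
    return {
        key: " ".join(text for _, _, text in sorted(blocks))
        for key, blocks in contents.items()
    }


cell_boxes = {(0, 0): (0, 0, 100, 20), (0, 1): (100, 0, 200, 20)}
text_blocks = [
    ((110, 2, 150, 10), "Unit"),
    ((5, 2, 40, 10), "Total"),
    ((45, 2, 90, 10), "sales"),
]
table_text = fill_cells(cell_boxes, text_blocks)
```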
[0115] In some embodiments of this application, based on the above technical solutions, the data extraction module 540 is specifically used for:
[0116] For each cell in the table data, determine the merged cells based on the cell's relative position in the table information to be processed;
[0117] Based on the row and column layout information of the table in the layout analysis results of the table data, determine the multiple standard cells corresponding to the merged cells in the table data;
[0118] The character content in the merged cell is copied to the multiple standard cells respectively;
[0119] The merged cells in the table data are replaced with the corresponding multiple standard cells to obtain the information to be input.
[0120] The input information and the target task information are input into a pre-trained table processing model for data extraction, generating a target processing result corresponding to the target task information.
[0121] In some embodiments of this application, based on the above technical solutions, the information processing device 500 further includes:
[0122] The training data acquisition module is used to acquire training table data;
[0123] The result determination module is used to determine the training task result corresponding to each task information based on the training table data and the multiple task information.
[0124] The model training module is used to train the pre-trained model based on the training table data, the multiple task information and the corresponding training task results, to obtain the pre-trained table processing model.
[0125] In some embodiments of this application, based on the above technical solutions, the multiple task information includes graph task information; the result determination module is specifically used for:
[0126] Based on the graph task information, determine the target cell in the table data;
[0127] Obtain the character content, row label, column label, cell type, and key value label of the target cell as the description information of the target cell;
[0128] The description information of the target cell is used as the training task result of the graph task information.
[0129] In some embodiments of this application, based on the above technical solutions, the data extraction module 540 is further used for:
[0130] Based on the task type information in the target task information, obtain specified data from the target processing result, wherein there is a correspondence between the task type information and the data and data types contained in the target processing result;
[0131] Perform data format conversion on the specified data to obtain the processing result of the table information to be processed.
[0132] It should be noted that the apparatus provided in the above embodiments and the method provided in the above embodiments belong to the same concept, and the specific way in which each module performs the operation has been described in detail in the method embodiments, and will not be repeated here.
[0133] Figure 6 shows a schematic diagram of the structure of a computer system suitable for implementing the electronic device of the present application.
[0134] It should be noted that the computer system 600 of the electronic device shown in Figure 6 is merely an example and should not impose any limitation on the functionality and scope of use of the embodiments of this application.
[0135] As shown in Figure 6, the computer system 600 includes a central processing unit (CPU) 601, which can perform various appropriate actions and processes based on programs stored in read-only memory (ROM) 602 or programs loaded from storage section 608 into random access memory (RAM) 603. The RAM 603 also stores various programs and data required for system operation. The CPU 601, ROM 602, and RAM 603 are interconnected via a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
[0136] The following components are connected to I / O interface 605: an input section 606 including a keyboard, mouse, etc.; an output section 607 including a cathode ray tube (CRT), liquid crystal display (LCD), and speakers, etc.; a storage section 608 including a hard disk, etc.; and a communication section 609 including a network interface card such as a LAN (Local Area Network) card and a modem, etc. The communication section 609 performs communication processing via a network such as the Internet. A drive 610 is also connected to I / O interface 605 as needed. A removable medium 611, such as a disk, optical disk, magneto-optical disk, semiconductor memory, etc., is installed on drive 610 as needed so that computer programs read from it can be installed into storage section 608 as needed.
[0137] Specifically, according to embodiments of this application, the processes described in the various method flowcharts can be implemented as computer software programs. For example, embodiments of this application include a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for performing the methods shown in the flowcharts. In such embodiments, the computer program can be downloaded and installed from a network via communication section 609, and / or installed from removable medium 611. When the computer program is executed by central processing unit (CPU) 601, it performs various functions defined in the system of this application.
[0138] It should be noted that the computer-readable medium shown in the embodiments of this application can be a computer-readable signal medium or a computer-readable storage medium, or any combination of the two. A computer-readable storage medium can be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples of a computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM), flash memory, optical fiber, portable compact disk read-only memory (CD-ROM), optical storage device, magnetic storage device, or any suitable combination thereof. In this application, a computer-readable storage medium can be any tangible medium containing or storing a program that can be used by or in conjunction with an instruction execution system, apparatus, or device. In this application, a computer-readable signal medium can include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such propagated data signals can take various forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination thereof. The computer-readable signal medium can also be any computer-readable medium other than a computer-readable storage medium, which can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device. The program code contained on the computer-readable medium can be transmitted using any suitable medium, including but not limited to wireless, wired, etc., or any suitable combination thereof.
[0139] The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of this application. In this regard, each block in a flowchart or block diagram may represent a module, segment, or portion of code containing one or more executable instructions for implementing a specified logical function. It should also be noted that in some alternative implementations, the functions indicated in the blocks may occur in a different order than those indicated in the drawings. For example, two consecutively indicated blocks may actually be executed substantially in parallel, and they may sometimes be executed in reverse order, depending on the functions involved. It should also be noted that each block in a block diagram or flowchart, and combinations of blocks in a block diagram or flowchart, may be implemented using a dedicated hardware-based system that performs the specified function or operation, or using a combination of dedicated hardware and computer instructions.
[0140] It should be noted that although several modules or units for the device used to perform actions have been mentioned in the detailed description above, this division is not mandatory. In fact, according to the embodiments of this application, the features and functions of two or more modules or units described above can be embodied in one module or unit. Conversely, the features and functions of one module or unit described above can be further divided and embodied by multiple modules or units.
[0141] Through the above description of the embodiments, those skilled in the art will readily understand that the exemplary embodiments described herein can be implemented by software or by combining software with necessary hardware. Therefore, the technical solutions according to the embodiments of this application can be embodied in the form of a software product, which can be stored in a non-volatile storage medium (such as a CD-ROM, USB flash drive, external hard drive, etc.) or on a network, including several instructions to cause a computing device (such as a personal computer, server, touch terminal, or network device, etc.) to execute the method according to the embodiments of this application.
[0142] Other embodiments of this application will readily occur to those skilled in the art upon consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of this application that follow the general principles of this application and include common knowledge or customary techniques in the art not disclosed herein.
[0143] It should be understood that this application is not limited to the precise structure described above and shown in the accompanying drawings, and various modifications and changes can be made without departing from its scope. The scope of this application is limited only by the appended claims.
Claims
1. An information processing method, characterized by comprising:
acquiring a document to be processed that contains table information to be processed;
determining the data type of the document elements in the document to be processed, wherein the document elements include at least table elements;
extracting data from the document to be processed according to the data processing strategy corresponding to the data type, to obtain the character content and text block coordinates of the text blocks in the document to be processed;
performing layout analysis on the document to be processed to obtain the layout analysis results of each document element;
obtaining, based on the position information of the table elements and the text block coordinates of the text blocks in the layout analysis results, the character content and position information of the table elements as the table data in the table information to be processed;
acquiring target task information corresponding to the table data from multiple task information, wherein each task information is used to indicate the processing result and data type of the table information to be processed;
for each cell in the table data, determining the merged cells based on the cell's relative position in the table information to be processed;
determining, based on the row and column layout information of the table in the layout analysis results of the table data, the multiple standard cells corresponding to the merged cells in the table data;
copying the character content of the merged cells into the multiple standard cells respectively;
replacing the merged cells in the table data with the corresponding multiple standard cells to obtain the information to be input; and
inputting the information to be input and the target task information into a pre-trained table processing model for data extraction, to generate a target processing result corresponding to the target task information, wherein the pre-trained table processing model is a model trained based on the multiple task information.
2. The information processing method according to claim 1, characterized in that obtaining the character content and position information of the table elements based on the position information of the table elements and the text block coordinates of the text blocks in the layout analysis results comprises:
determining, based on the position information of each cell of the table element in the layout analysis results, the text block whose text block coordinates correspond to the position information; and
combining the text blocks corresponding to the cells, based on the relative position of each cell in the table information to be processed and its position in the document to be processed, to obtain the character content and position information of the table elements.
3. The information processing method according to claim 1, characterized in that the method further comprises:
acquiring training table data;
determining, based on the training table data and the multiple task information, the training task result corresponding to each task information; and
training a pre-trained model based on the training table data, the multiple task information, and the corresponding training task results, to obtain the pre-trained table processing model.
4. The information processing method according to claim 3, characterized in that the multiple task information includes graph task information, and determining the training task result corresponding to each task information based on the training table data and the multiple task information comprises:
determining the target cell in the table data based on the graph task information;
obtaining the character content, row label, column label, cell type, and key value label of the target cell as the description information of the target cell; and
using the description information of the target cell as the training task result of the graph task information.
5. The information processing method according to claim 1, characterized in that, after inputting the table data and the target task information into the pre-trained table processing model for data extraction and generating the target processing result corresponding to the target task information, the method further comprises:
obtaining specified data from the target processing result based on the task type information in the target task information, wherein there is a correspondence between the task type information and the data and data types contained in the target processing result; and
performing data format conversion on the specified data to obtain the processing result of the table information to be processed.
6. An information processing device, characterized by comprising:
a document acquisition module, used to acquire a document to be processed that contains table information to be processed;
a data recognition module, used to: determine the data type of the document elements in the document to be processed, wherein the document elements include at least table elements; extract data from the document to be processed according to the data processing strategy corresponding to the data type, to obtain the character content and text block coordinates of the text blocks in the document to be processed; perform layout analysis on the document to be processed to obtain the layout analysis results of each document element; and obtain, based on the position information of the table elements and the text block coordinates of the text blocks in the layout analysis results, the character content and position information of the table elements as the table data in the table information to be processed;
a task acquisition module, used to acquire target task information corresponding to the table data from multiple task information, wherein each task information is used to indicate the processing result and data type of the table information to be processed; and
a data extraction module, used to: determine, for each cell in the table data, the merged cells based on the relative position of the cell in the table information to be processed; determine the multiple standard cells corresponding to the merged cells in the table data based on the row and column layout information of the table in the layout analysis results of the table data; copy the character content of the merged cells into the multiple standard cells respectively; replace the merged cells in the table data with the corresponding multiple standard cells to obtain the information to be input; and input the information to be input and the target task information into a pre-trained table processing model for data extraction, to generate a target processing result corresponding to the target task information, wherein the pre-trained table processing model is a model trained based on the multiple task information.
7. An electronic device, characterized by comprising:
a processor; and
a memory for storing executable instructions of the processor;
wherein the processor is configured to execute the information processing method according to any one of claims 1 to 5 by executing the executable instructions.
8. A computer-readable medium having a computer program stored thereon, characterized in that, when the computer program is executed by a processor, it implements the information processing method according to any one of claims 1 to 5.