A method and system for automatic open identification of archives based on semantic analysis

An automatic open-identification technology based on semantic analysis, applied in semantic analysis, natural language data processing, instruments, etc. It addresses the low accuracy rate and high misjudgment rate of identification results, ensuring the accuracy of identification while improving its accuracy rate and coverage rate.

Active Publication Date: 2021-08-17
江苏联著实业股份有限公司
4 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

[0005] The embodiment of the present application provides a method and system for automatic open identification of files based on semantic analysis. It addresses the technical problem in the prior art that open identification of files relies on a single optical character recognition technology, so that the identification results have a high misjudgment rate and a low accuracy rate. The files to be identified are first converted into plain text files by the character recognition system. A first round of format retrieval is then performed based on the format semantic database, a second round of keyword retrieval based on the keyword database, and a third round of preset semantic retrieval based on the semantic knowledge base. Finally, the openable files are sent to the manual review terminal for a last line of manual review. This ensures the accuracy of open identification without increasing its cost, and further improves the accuracy and coverage of open file identification.
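The three retrieval rounds followed by manual review can be read as a cascade in which each round sets aside the files it flags and passes the rest on. The sketch below is only an illustration under stated assumptions: the function names, the regular expression, the keyword set, and the stand-in semantic check are all hypothetical, since the source does not describe the contents of the format semantic database, the keyword database, or the semantic knowledge base. It also assumes the input has already been converted to plain text.

```python
import re
from typing import Callable, Dict, List, Tuple

# Hypothetical placeholders: the source does not specify what the
# format semantic database or the keyword database actually contain.
FORMAT_PATTERNS = [re.compile(r"^\s*CONFIDENTIAL\b", re.IGNORECASE | re.MULTILINE)]
KEYWORDS = {"classified", "internal only"}


def format_hit(text: str) -> bool:
    """Round 1: does the text match any flagged document format?"""
    return any(p.search(text) for p in FORMAT_PATTERNS)


def keyword_hit(text: str) -> bool:
    """Round 2: does the text contain any flagged keyword?"""
    lowered = text.lower()
    return any(k in lowered for k in KEYWORDS)


def semantic_hit(text: str) -> bool:
    """Round 3: stand-in for a lookup against a semantic knowledge base."""
    return "personal medical record" in text.lower()


ROUNDS: List[Tuple[str, Callable[[str], bool]]] = [
    ("format", format_hit),
    ("keyword", keyword_hit),
    ("semantic", semantic_hit),
]


def run_rounds(batch: List[str]) -> Tuple[Dict[str, List[str]], List[str]]:
    """Run the rounds in order; return (flagged per round, files left for manual review)."""
    flagged: Dict[str, List[str]] = {name: [] for name, _ in ROUNDS}
    remaining = list(batch)
    for name, hit in ROUNDS:
        passed = []
        for text in remaining:
            # A flagged file is set aside; the rest move to the next round.
            (flagged[name] if hit(text) else passed).append(text)
        remaining = passed
    return flagged, remaining


batch = [
    "CONFIDENTIAL\nBudget memo",
    "Minutes marked internal only",
    "A personal medical record of a patient",
    "Public notice about library hours",
]
flagged, for_manual_review = run_rounds(batch)
```

Only the files that pass all three rounds reach `for_manual_review`, which corresponds to the openable files sent to the manual review terminal. Running the cheaper format and keyword checks first keeps the more expensive semantic lookup and the human reviewers for the smaller remainder.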



Examples


Embodiment 1

[0023] As shown in Figure 1, the embodiment of the present application provides a method for automatic open identification of files based on semantic analysis. The method is applied to an automatic file open-identification system that is communicatively connected to a character recognition system and a semantic recognition system. The method includes:

[0024] Step S100: Obtain the batch set of file information to be identified;

[0025] Specifically, the fundamental purpose of archives work is to integrate archives information resources for the convenience of the public, and opening archives is the most basic and important way for the public to obtain and use archives information. With the rapid development of science and technology, electronic information technology has brought great changes to archives work, and the concepts of "archive digitization" and "smart archives" have emerged accordingly. The introductio...

Embodiment 2

[0080] Based on the same inventive concept as the method for automatic open identification of files based on semantic analysis in the foregoing embodiment, the present invention also provides a system for automatic open identification of files based on semantic analysis. As shown in Figure 2, the system includes:

[0081] The first obtaining unit 11: the first obtaining unit 11 is used to obtain the batch set of file information to be identified;

[0082] The first conversion unit 12: the first conversion unit 12 is used to convert the batch set of file information to be identified into the batch set of plain text file information based on the character recognition system;

[0083] The first input unit 13: the first input unit 13 is used to input the batch set of plain text file information into the format semantic database for training, train the input information with the special format identified, and obtain the first training result and the second ...

Embodiment 3

[0124] An electronic device according to an embodiment of the present application is described below with reference to Figure 3.

[0125] Figure 3 shows a schematic structural diagram of an electronic device according to an embodiment of the present application.

[0126] Based on the inventive concept of the method for automatic open identification of files based on semantic analysis in the foregoing embodiments, the present invention also provides an automatic file open-identification system based on semantic analysis, on which a computer program is stored. When the program is executed by a processor, it implements the steps of any of the methods described above.

[0127] In Figure 3, the bus architecture is represented by bus 300. Bus 300 may include any number of interconnected buses and bridges, and bus 300 will include one or more processors represented by processor 302 and various types of memory rep...


Abstract

The invention discloses a method and system for automatic open identification of files based on semantic analysis. The method includes: obtaining a batch set of file information to be identified; converting the batch set of file information to be identified into a batch set of plain text file information; inputting the batch set of plain text file information into the format semantic library to obtain the first training result and the second training result; inputting the second batch of plain text file information into the keyword library to obtain the third training result and the fourth training result; inputting the batch of plain text file information from the fourth training result into the semantic knowledge base to obtain the fifth training result and the sixth training result; sending the sixth batch of plain text file information to the manual review terminal for content semantic review and generating the first review result; and obtaining the openable file information. The invention solves the technical problem in the prior art that open identification of files relies on a single optical character recognition technology, so that the identification results have a high misjudgment rate and a low accuracy rate.

Description

Technical field

[0001] The invention relates to the technical field of file open identification, and in particular to a method and system for automatic open identification of files based on semantic analysis.

Background technique

[0002] The fundamental purpose of archives work is to integrate archives information resources for the convenience of the public. Opening archives is the most basic and important way for the public to obtain and use archives information. With the rapid development of science and technology, electronic information technology has brought great changes to archives work. The introduction and application of new technologies will not change the fundamental purpose of archives work, but will serve the public more efficiently and conveniently.

[0003] However, in the process of realizing the technical solution of the invention in the embodiment of the present application, the inventor of the present application found that the above-mentioned technology...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/332, G06F16/335, G06F16/38, G06F40/30, G06F40/151, G06K9/62
CPC: G06F16/332, G06F16/335, G06F16/38, G06F40/30, G06F40/151, G06F18/214
Inventor: 王楠, 张宇, 顾凌峰, 常祖贤, 银思琪, 刘杰, 宋永生
Owner: 江苏联著实业股份有限公司