Print control method of OFD document, image forming apparatus, device and medium

By performing sensitive word detection and replacement on OFD documents in the image forming apparatus, the problem of missed or incorrect detection caused by manual inspection is solved, achieving efficient and accurate sensitive word management and meeting the confidentiality needs of enterprises.

CN119937948B · Active · Publication Date: 2026-06-12 · BEIJING PANTUM INFORMATION TECHNOLOGY CO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents (China)
Current Assignee / Owner: BEIJING PANTUM INFORMATION TECHNOLOGY CO LTD
Filing Date: 2024-12-20
Publication Date: 2026-06-12

AI Technical Summary

Technical Problem

In existing technologies, sensitive word detection in OFD documents relies on manual inspection, which is inefficient and prone to missed and false detections, especially in multi-layered documents, and therefore cannot meet confidentiality requirements.

Method used

The OFD document is parsed using an image forming device, and sensitive words are detected in text and image objects using a sensitive word database. Users can choose the processing method, and the detection strategy is adjusted according to the user's role to achieve automatic detection and replacement of sensitive words.

Benefits of technology

It improves the accuracy and efficiency of sensitive word detection, reduces missed detections, ensures that document output meets confidentiality requirements, and adapts to the needs of users with different confidentiality levels.


Abstract

Embodiments of the present application provide an OFD document printing control method, an image forming apparatus, an electronic device and a storage medium. The method comprises: obtaining an OFD document; parsing a text page of the OFD document; when the OFD document contains a text object and / or an image object, performing sensitive word detection on the text object and / or the image object contained in the OFD document; when a sensitive word is detected in the text object and / or the image object contained in the OFD document, outputting a prompt screen; and, in response to an image forming control instruction triggered on the prompt screen, performing a corresponding image forming control operation on the OFD document. Embodiments of the present application can detect sensitive words across multiple layers without manual inspection, thereby ensuring detection efficiency and comprehensiveness.

Description

Technical Field

[0001] This application relates to the field of image forming technology, and more specifically to a printing control method for OFD documents, an image forming apparatus, an electronic device, and a storage medium.

Background Technology

[0002] OFD (Open Fixed Layout Document) is an electronic document format that is independently controllable in China. Its layout is fixed and does not change, so what you see is what you get (WYSIWYG). It can therefore be regarded as the "digital paper" of the computer age and is an ideal document format for electronic document publishing, digital information dissemination and archiving.

[0003] Currently, in the field of image forming, OFD documents are mostly transmitted to image forming apparatuses, which parse them, perform image forming operations, and output paper documents. In some companies, these paper documents are kept as records for extended periods; therefore, the output content of image forming apparatuses requires extra attention to prevent the inclusion of inappropriate, unsuitable, or confidential sensitive words. However, some companies have very strict controls over external access devices, making it impossible to install image forming control systems on their control devices. As a result, sensitive words in OFD documents can only be checked manually. Manual checking is slow and prone to omissions and errors, especially when multiple layers exist simultaneously: information in overlapping or interleaved layers is easily overlooked, which affects detection efficiency and security. Furthermore, current methods for sensitive word detection in OFD documents are limited, and many documents must be printed for signing, meetings, etc., so the detection results do not meet confidentiality requirements.

Summary of the Invention

[0004] In view of this, this application provides an OFD document printing control method, an image forming apparatus, an electronic device, and a storage medium to solve the problems of missed detections, false detections, and slow efficiency that are very easy to occur when manually inspecting OFD documents, as well as the problem that the single method of detecting sensitive words in OFD documents results in the output documents failing to meet confidentiality requirements.

[0005] In a first aspect, embodiments of this application provide an OFD document printing control method, applied to an image forming apparatus, comprising:

[0006] Obtain an OFD document, wherein the OFD document includes at least one page of text;

[0007] The OFD document is parsed. When the OFD document contains text objects and / or image objects, sensitive word detection is performed on the text objects and / or image objects in the OFD document according to the user role. When sensitive words are detected in the text objects and / or image objects in the OFD document, a prompt screen is output for the user to choose from.

[0008] If the user chooses to continue the job, the sensitive words detected in the OFD document are deleted or replaced to construct a de-sensitized OFD document;

[0009] Print the de-sensitized OFD document and record the job information of the OFD document in the log;

[0010] If the user chooses to abandon the job, printing ends and the job information of the OFD document is deleted; wherein,

[0011] The user roles include administrators and general users. Administrators can assign user roles, access logs, and update data via the network or by setting them directly.
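The steps of the first aspect can be sketched as a small control flow. This is a minimal illustration under stated assumptions, not the claimed implementation: `PrintJob`, `Printer`, the `choose` callback standing in for the prompt screen, and the masking of each sensitive word with asterisks are all hypothetical, and plain substring search stands in for the font encoding and glyph comparison described later.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class PrintJob:
    name: str
    user: str
    text: str  # hypothetical stand-in for the parsed OFD content


@dataclass
class Printer:
    sensitive_words: List[str]
    log: List[dict] = field(default_factory=list)
    printed: List[str] = field(default_factory=list)

    def detect(self, job: PrintJob) -> List[str]:
        # Assumption: substring search replaces the real comparison step.
        return [w for w in self.sensitive_words if w in job.text]

    def handle(self, job: PrintJob, choose: Callable[[List[str]], str]) -> str:
        hits = self.detect(job)
        text = job.text
        if hits:
            # The prompt screen is modeled as a callback returning
            # "continue" or "abandon".
            if choose(hits) == "abandon":
                return "abandoned"  # nothing is printed or logged
            for word in hits:
                text = text.replace(word, "*" * len(word))
        self.printed.append(text)
        self.log.append({"job": job.name, "user": job.user, "hits": hits})
        return "printed"
```

A job that the user abandons leaves no entry in `log`, which mirrors the deletion of job information in [0010].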

[0012] In one possible implementation, the administrator can assign user roles via a network or by direct configuration, including:

[0013] The administrator can assign user roles through direct flashing, batch configuration of network, and / or mirroring of devices within the same network.

[0014] In one possible implementation, the data update includes: sensitive word update, sensitive word detection strategy adjustment, and user role definition.

[0015] In one possible implementation, the sensitive word detection strategy adjustment includes:

[0016] Adjust the sensitive word detection method corresponding to the user role. The sensitive word detection method includes a first detection method, a second detection method, and a third detection method. The first detection method is to perform sensitive word detection on text objects and image objects separately. The second detection method is to perform sensitive word detection on text objects and image objects at the same time. The third detection method is to perform sensitive word detection on all text objects and image objects at the same time.

[0017] In one possible implementation, the administrator can update data via a network or through direct settings, including:

[0018] The administrator can update data through direct flashing, batch configuration of the network, and / or mirroring of devices within the same network.

[0019] In one possible implementation, the job information includes: job name, job attribute data, user information, printing time, number of copies printed, and / or detection results.

[0020] In one possible implementation, the sensitive word detection of text objects and / or image objects contained in the OFD document based on user roles includes:

[0021] Determine the sensitive word detection method based on the user's role;

[0022] The text object is subjected to font encoding data sensitive word detection according to the determined sensitive word detection method, and / or, after extracting the text glyphs from the image object, sensitive word detection is performed on the text glyphs.

[0023] In one possible implementation, the step of performing font-encoded sensitive word detection on the text object, and / or, after extracting the glyphs of the image object, performing sensitive word detection on the glyphs, includes:

[0024] The text object is compared with a first database to obtain a first comparison result, wherein the first database is a font encoding database of the sensitive word; and / or

[0025] Extract the glyphs of the characters in the image object, and compare the glyphs with a second database to obtain a second comparison result. The second database is a glyph database of the sensitive words.

[0026] Based on the first comparison result and / or the second comparison result, determine whether the text object and / or image object contains sensitive words.

[0027] In one possible implementation, when the sensitive word detection method is the first detection method, determining whether the text object and / or image object contains sensitive words based on the first comparison result and / or the second comparison result includes:

[0028] Based on the first comparison result, determine whether the text object contains sensitive words; and / or

[0029] Based on the second comparison result, determine whether the image object contains sensitive words.

[0030] In one possible implementation, when the sensitive word detection method is the second detection method, determining whether the text object and / or image object contains sensitive words based on the first comparison result and / or the second comparison result includes:

[0031] Combine and analyze the first comparison result and / or the second comparison result;

[0032] Based on the combined analysis results, determine whether the text object and / or the image object contains sensitive words.

[0033] In one possible implementation, when the sensitive word detection method is a third detection method, determining whether the text object and / or image object contains sensitive words based on the first comparison result and / or the second comparison result includes:

[0034] Adjust the text order of the first comparison result and / or the second comparison result based on sensitive words;

[0035] Based on the result of the text order adjustment, determine whether the text object and / or the image object contain sensitive words.
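The difference between judging the two comparison results separately and judging them together can be illustrated with a toy example. This is only one possible reading of "combine and analyze": the function names are hypothetical, plain strings stand in for the comparison results, and concatenating the text content before the image content is an assumption about reading order.

```python
from typing import List


def detect_separately(text: str, image_text: str, words: List[str]) -> List[str]:
    # First-method style: each object type is judged on its own.
    return [w for w in words if w in text or w in image_text]


def detect_combined(text: str, image_text: str, words: List[str]) -> List[str]:
    # Second-method style (assumed): the two results are merged first,
    # so a word split across a text object and an image object is found.
    combined = text + image_text
    return [w for w in words if w in combined]


words = ["confidential"]
print(detect_separately("project conf", "idential", words))  # []
print(detect_combined("project conf", "idential", words))    # ['confidential']
```

The separate check misses a word whose halves sit in different object types, which is the kind of missed detection the combined analysis is meant to reduce.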

[0036] In one possible implementation, comparing the text object with a first database to obtain a first comparison result includes:

[0037] Based on the text object, obtain the collection of text strings of the OFD document;

[0038] Obtain the font encoding of each text string in the text string set;

[0039] The font encoding of each text string is compared with the font encoding of each sensitive word in the first database to obtain the first comparison result.
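A minimal sketch of the first comparison follows. The source does not fix a particular font encoding, so UTF-8 bytes are used here purely as a stand-in; `encode`, `build_first_database` and `first_comparison` are hypothetical names.

```python
from typing import Dict, List, Tuple


def encode(s: str) -> bytes:
    # Assumption: UTF-8 stands in for the unspecified font encoding.
    return s.encode("utf-8")


def build_first_database(words: List[str]) -> Dict[bytes, str]:
    # Maps the encoding of each sensitive word back to the word.
    return {encode(w): w for w in words}


def first_comparison(text_strings: List[str],
                     first_db: Dict[bytes, str]) -> List[Tuple[int, str]]:
    """Return (string index, sensitive word) for every match."""
    hits = []
    for i, s in enumerate(text_strings):
        code = encode(s)
        for word_code, word in first_db.items():
            if word_code in code:  # byte-level containment check
                hits.append((i, word))
    return hits
```

For example, `first_comparison(["本文件为机密", "公开信息"], build_first_database(["机密", "内部"]))` reports a match for "机密" in the first string only.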

[0040] In one possible implementation, obtaining the set of text strings of the OFD document based on the text object includes:

[0041] The text objects are divided according to the text page to which they belong, to obtain the set of text objects corresponding to each text page in the OFD document;

[0042] Sort the text objects contained in each of the aforementioned text object sets to obtain sorted text object sets;

[0043] The set of text strings of the OFD document is obtained based on the sorted set of text objects.

[0044] In one possible implementation, sorting the text objects contained in each of the text object sets includes:

[0045] The text objects in each set of text objects are sorted according to the row and column order of each text object in its respective text page.

[0046] In one possible implementation, sorting the text objects in each set of text objects according to their row and column order within their respective text pages includes:

[0047] Using any set of text objects as the target set of text objects, determine the row number and starting column number of each text object in the target set of text objects in its respective text page;

[0048] The text objects with the same row number are grouped together, and each group of text objects is arranged in rows according to the row number from smallest to largest.

[0049] The column order of the text objects in each group is adjusted according to the size of the starting column number of the text objects in each group.
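The row-then-column ordering described above can be sketched as follows. The `TextObject` fields are hypothetical simplifications of what a parsed OFD text object would carry; only the row number, starting column number and text are modeled.

```python
from dataclasses import dataclass
from itertools import groupby
from typing import List


@dataclass
class TextObject:
    row: int
    start_col: int
    text: str


def sort_text_objects(objs: List[TextObject]) -> List[List[TextObject]]:
    # Group objects that share a row number, rows in ascending order.
    by_row = sorted(objs, key=lambda o: o.row)
    rows = []
    for _, group in groupby(by_row, key=lambda o: o.row):
        # Within a row, order by starting column number.
        rows.append(sorted(group, key=lambda o: o.start_col))
    return rows


def row_strings(rows: List[List[TextObject]]) -> List[str]:
    # One text string per row, built from the sorted objects.
    return ["".join(o.text for o in row) for row in rows]
```

Joining the sorted objects row by row yields the text string set used in the subsequent comparison, so two adjacent text boxes on the same line are read as one string.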

[0050] In one possible implementation, obtaining the set of text strings of the OFD document based on the sorted set of text objects includes:

[0051] Based on the sorted set of text objects, obtain the text string set of each set of text objects;

[0052] The text string set of the OFD document is obtained based on each of the aforementioned text string sets.

[0053] In one possible implementation, after comparing the font encoding of each text string with the font encoding of each sensitive word in the first database, the method further includes:

[0054] When the font encoding of the text string at the end of the nth line is the same as the first half of the font encoding of the mth sensitive word in the first database, the font encoding of the text string at the beginning of the (n+1)th line is compared with the second half of the font encoding of the mth sensitive word.

[0055] If the font encoding of the text string at the beginning of the (n+1)th line is the same as the second half of the font encoding of the mth sensitive word, it is determined that there exists a text string whose font encoding matches the font encoding of a sensitive word in the first database.
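The line-boundary check can be sketched as below. The source speaks of the "first half" and "second half" of a sensitive word; this sketch tries every split point, which is an assumption, and it compares plain strings rather than font encodings. `cross_line_hits` is a hypothetical name.

```python
from typing import List, Tuple


def cross_line_hits(lines: List[str], words: List[str]) -> List[Tuple[int, str]]:
    """Return (n, word) when word is split between line n and line n + 1."""
    hits = []
    for n in range(len(lines) - 1):
        for word in words:
            # Try each way of cutting the word into a head and a tail.
            for cut in range(1, len(word)):
                head, tail = word[:cut], word[cut:]
                if lines[n].endswith(head) and lines[n + 1].startswith(tail):
                    hits.append((n, word))
                    break
    return hits
```

Passing the last line of one page followed by the first line of the next page to the same function covers the page-boundary case in the same way.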

[0056] In one possible implementation, after comparing the font encoding of each text string with the font encoding of each sensitive word in the first database, the method further includes:

[0057] When the font encoding of the text string at the end of the t-th text page is the same as the first half of the font encoding of the kth sensitive word in the first database, the font encoding of the first text string on the (t+1)th text page is compared with the second half of the font encoding of the kth sensitive word.

[0058] If the font encoding of the first text string on the (t+1)th text page is the same as the second half of the font encoding of the kth sensitive word, it is determined that there exists a text string whose font encoding matches the font encoding of a sensitive word in the first database.

[0059] In one possible implementation, comparing the character glyphs with a second database to obtain a second comparison result includes:

[0060] Based on the character glyphs, obtain the set of glyph strings of the OFD document;

[0061] Each glyph string in the set of glyph strings is compared with the glyph of each sensitive word in the second database to obtain a second comparison result.

[0062] In one possible implementation, obtaining the set of glyph strings of the OFD document based on the glyphs includes:

[0063] The text glyphs are divided according to the text page to which they belong, to obtain the set of text glyphs corresponding to each text page in the OFD document;

[0064] The glyphs contained in each of the aforementioned glyph sets are sorted to obtain sorted glyph sets;

[0065] The set of glyph strings of the OFD document is obtained based on the sorted set of glyphs of each character.

[0066] In one possible implementation, the sorting of the glyphs contained in each of the glyph sets includes:

[0067] The glyphs included in each set of glyphs are sorted according to the image object to which each glyph belongs.

[0068] In one possible implementation, the administrator includes two role definitions: ordinary administrator and super administrator, with different permissions for each role.

[0069] Regular administrators can assign general user roles, access logs, and update data via the network or by direct settings; super administrators have all the privileges of regular administrators and can also assign user roles to regular administrators.
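The permission split between the roles can be written down as a small lookup table. This is a sketch only; the enum members and the `can` helper are hypothetical names chosen to mirror the wording above.

```python
from enum import Enum, auto


class Role(Enum):
    GENERAL_USER = auto()
    REGULAR_ADMIN = auto()
    SUPER_ADMIN = auto()


class Permission(Enum):
    ASSIGN_GENERAL_ROLES = auto()
    ACCESS_LOGS = auto()
    UPDATE_DATA = auto()
    ASSIGN_ADMIN_ROLES = auto()


_REGULAR = frozenset({
    Permission.ASSIGN_GENERAL_ROLES,
    Permission.ACCESS_LOGS,
    Permission.UPDATE_DATA,
})

PERMISSIONS = {
    Role.GENERAL_USER: frozenset(),
    Role.REGULAR_ADMIN: _REGULAR,
    # A super administrator has every regular privilege plus one more.
    Role.SUPER_ADMIN: _REGULAR | {Permission.ASSIGN_ADMIN_ROLES},
}


def can(role: Role, permission: Permission) -> bool:
    return permission in PERMISSIONS[role]
```

Building the super administrator's set from the regular administrator's set keeps the "all privileges of regular administrators" relationship explicit.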

[0070] In one possible implementation, general users are defined by three roles: ordinary employees, ordinary personnel with access to classified information, and senior personnel with access to classified information. These three roles correspond to different levels of user security, different methods of detecting sensitive words, and different types of printable documents.

[0071] In one possible implementation, sorting the glyphs included in each set of glyphs according to the image object to which each glyph belongs includes:

[0072] Group the text glyphs belonging to the same image object into a group;

[0073] The glyphs in each group are sorted according to their positions in the corresponding image objects.
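Grouping glyphs by image object and ordering them by position might look like the sketch below. The `Glyph` fields are hypothetical, a short label stands in for the glyph bitmap, and sorting top to bottom and then left to right is an assumption about what "position" means.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Glyph:
    image_id: str  # image object the glyph was extracted from
    x: int
    y: int
    label: str     # stand-in for the extracted bitmap


def glyph_strings(glyphs: List[Glyph]) -> Dict[str, List[str]]:
    groups = defaultdict(list)
    for g in glyphs:
        groups[g.image_id].append(g)
    result = {}
    for image_id, members in groups.items():
        # Assumed reading order: by y first, then by x.
        members.sort(key=lambda g: (g.y, g.x))
        result[image_id] = [g.label for g in members]
    return result
```

Each value in the returned dictionary is one glyph string, ready to be compared with the glyphs in the second database.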

[0074] In one possible implementation, obtaining the set of glyph strings of the OFD document based on the sorted set of glyphs includes:

[0075] Based on the sorted sets of character glyphs, obtain the set of glyph strings for each set of character glyphs;

[0076] The set of glyph strings of the OFD document is obtained based on the set of glyph strings of each of the aforementioned glyph sets.

[0077] In one possible implementation, the method further includes:

[0078] When sensitive words are detected in the text objects and / or image objects contained in the OFD document, the level of the sensitive words in the text objects and / or image objects is determined;

[0079] The image formation strategy for the OFD document is determined based on the level of the sensitive words.

[0080] In a second aspect, embodiments of this application provide an image forming apparatus, comprising:

[0081] An acquisition unit is used to acquire an OFD document, wherein the OFD document includes at least one page of text.

[0082] The parsing unit is used to parse the text pages of the OFD document;

[0083] The detection unit is used to perform sensitive word detection on the text objects and / or image objects contained in the OFD document when the OFD document contains text objects and / or image objects;

[0084] The output unit is configured to output a prompt screen when sensitive words are detected in the text objects and / or image objects contained in the OFD document.

[0085] An execution unit is configured to perform corresponding image forming control operations on the OFD document in response to an image forming control command triggered on the prompt screen.

[0086] In a third aspect, embodiments of this application provide an electronic device, including:

[0087] A processor;

[0088] A memory;

[0089] The memory stores a computer program that, when executed, causes the electronic device to perform the method described in any of the first aspects.

[0090] In a fourth aspect, embodiments of this application provide a computer-readable storage medium including a stored program, wherein, when the program is executed, it controls the device where the computer-readable storage medium is located to perform the method described in any of the first aspects.

[0091] Compared with the prior art, the embodiments of this application can perform sensitive word detection on OFD documents during the printing process of OFD documents using an image forming apparatus, thereby improving the accuracy and efficiency of sensitive word detection. Sensitive word detection is performed on multiple layers existing in the OFD document, taking into account text objects, image objects, and the combined effect of text objects and image objects, greatly reducing the occurrence of missed detections and ensuring that the document detection results meet confidentiality requirements.

Description of the Drawings

[0092] To more clearly illustrate the technical solutions of the embodiments of this application, the drawings used in the embodiments will be briefly introduced below. Obviously, the drawings described below are only some embodiments of this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0093] Figure 1 is a schematic diagram of an application scenario provided by an embodiment of this application;

[0094] Figure 2 is a flowchart of a printing control method for OFD documents provided in an embodiment of this application;

[0095] Figure 3 is a schematic diagram of a control device issuing OFD documents, as provided in an embodiment of this application;

[0096] Figure 4 is a schematic diagram of sensitive word segmentation provided in an embodiment of this application;

[0097] Figure 5 is a flowchart of another OFD document printing control method provided in an embodiment of this application;

[0098] Figure 6 is a schematic diagram of the structure of an image forming apparatus provided in an embodiment of this application;

[0099] Figure 7 is a schematic diagram of the structure of an electronic device provided in an embodiment of this application.

Detailed Implementation

[0100] To better understand the technical solution of this application, the embodiments of this application will be described in detail below with reference to the accompanying drawings.

[0101] It should be understood that the described embodiments are merely some, not all, of the embodiments in this application. All other embodiments obtained by those skilled in the art based on the embodiments in this application without inventive effort are within the scope of protection of this application.

[0102] The terminology used in the embodiments of this application is for the purpose of describing particular embodiments only and is not intended to be limiting of this application. The singular forms “a,” “said,” and “the” used in the embodiments of this application and the appended claims are also intended to include the plural forms unless the context clearly indicates otherwise.

[0103] It should be understood that the term "and / or" used herein merely describes the relationship between related objects, indicating that three relationships can exist. For example, A and / or B can represent: A existing alone, A and B existing simultaneously, or B existing alone. Additionally, the character " / " herein generally indicates that the preceding and following related objects have an "or" relationship.

[0104] Refer to Figure 1, which is a schematic diagram of an application scenario provided by an embodiment of this application. As shown in Figure 1, this application scenario includes a control device 110 and an image forming apparatus 120. The control device 110 and the image forming apparatus 120 are interconnected via a wired or wireless communication network for information transmission. During the image forming process, the control device 110 converts the print data into an OFD document and sends the OFD document to the image forming apparatus 120, which then parses the OFD document and performs the image forming operation.

[0105] The communication network between the control device 110 and the image forming apparatus 120 can be a local area network (LAN) or a wide area network (WAN) relayed through a relay device. When the communication network is a LAN, for example, it can be a short-range communication network such as a Wi-Fi hotspot network, a Wi-Fi P2P network, a Bluetooth network, a Zigbee network, or a near field communication (NFC) network. When the communication network is a WAN, for example, it can be a 3G network, a 4G network, a 5G network, a future public land mobile network (PLMN), or the Internet.

[0106] It should be noted that Figure 1 is merely an illustrative description and should not be construed as limiting the scope of protection of this application. For example, the image forming apparatus 120 includes, but is not limited to, printers, copiers, fax machines, multi-function image making and copying apparatuses, electrostatic printing apparatuses, and any other similar apparatuses. The control device 110 includes, but is not limited to, electronic devices capable of communicating with the image forming apparatus 120, such as computers, mobile phones, tablet computers, enterprise internal servers, and cloud servers.

[0107] In some possible implementations, the control device 110 may also be referred to as a host computer, a main unit, a terminal device, etc.

[0108] OFD (Open Fixed Layout Document) is an electronic document format that is independently controllable in China. Its layout is fixed and does not change, so what you see is what you get (WYSIWYG). It can therefore be regarded as the "digital paper" of the computer age and is an ideal document format for electronic document publishing, digital information dissemination and archiving.

[0109] Currently, in the field of image forming, OFD documents are mostly transmitted to image forming apparatuses, which parse them, perform image forming operations, and output paper documents. In some companies, these paper documents are kept as records for a long time; therefore, extra care must be taken with the output content of the image forming apparatus to prevent the inclusion of inappropriate, unsuitable, or confidential sensitive words. However, because some companies strictly control external access devices, it is impossible to install an image forming control system on the control devices. As a result, sensitive words in OFD documents can only be checked manually. Manual checking is very slow and prone to missed or incorrect detections, which affects image forming quality and efficiency. Currently, the methods for detecting sensitive words in OFD documents are limited to the text portion. There are no corresponding detection methods for text combinations in multiple text boxes or modules within a text page, text in images, or text combinations created by overlapping text and / or images. As shown in Figure 4, the content of the two text boxes combines into continuous text because of their positional relationship. Conventional detection logic, which examines only a single object at a time, can easily miss such cases.

[0110] To address the aforementioned problems, this application proposes a printing control method, an image forming apparatus, an electronic device, and a storage medium for OFD documents. Embodiments of this application can perform sensitive word detection on OFD documents during the printing process using the image forming apparatus, thereby improving the accuracy and efficiency of sensitive word detection. A detailed description is provided below with reference to the accompanying drawings.

[0111] Refer to Figure 2, which is a flowchart of an OFD document printing control method provided in an embodiment of this application. The method can be applied to the image forming apparatus in the application scenario shown in Figure 1. As shown in Figure 2, it mainly includes the following steps.

[0112] S201: Obtain an OFD document, which includes at least one page of text.

[0113] In this embodiment, before the image forming apparatus acquires the OFD document, the strings to be controlled, i.e., the sensitive words, need to be transmitted to the image forming apparatus, and a sensitive word database needs to be established in the image forming apparatus for sensitive word detection on the acquired OFD document. Specifically, the sensitive word data can be sent to the image forming apparatus via an app or the driver that accompanies the image forming apparatus, configured by logging into the image forming apparatus's back-end web page, or sent to the image forming apparatus as a file. The sensitive word data includes the font encoding of the sensitive words and the commonly used glyphs of the sensitive words. As shown in Figure 3, upon receiving the font encoding of the sensitive words, the image forming apparatus stores it in the first database of the sensitive word storage module 121; that is, a sensitive word font encoding database is established in the sensitive word storage module 121 for sensitive word detection on text objects in OFD documents. Upon receiving the commonly used glyphs of the sensitive words, the image forming apparatus stores them in the second database of the sensitive word storage module 121; that is, a sensitive word glyph database is established in the sensitive word storage module 121 for sensitive word detection on image objects in OFD documents. In other words, the preset font encodings are stored in the first database (the sensitive word font encoding database), and the preset glyphs are stored in the second database (the sensitive word glyph database). Storing the preset font encodings and preset glyphs in different databases makes data management and retrieval easier.
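The two-database layout of the sensitive word storage module can be sketched as a small in-memory store. This is an illustration under assumptions: `SensitiveWordStore` and its methods are hypothetical, UTF-8 bytes stand in for the font encoding, and glyph bitmaps are represented as raw bytes.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List


@dataclass
class SensitiveWordStore:
    # First database: sensitive word -> font encoding.
    first_db: Dict[str, bytes] = field(default_factory=dict)
    # Second database: sensitive word -> glyph bitmaps (one per font/size).
    second_db: Dict[str, List[bytes]] = field(default_factory=dict)

    def add_word(self, word: str, glyph_bitmaps: Iterable[bytes]) -> None:
        self.first_db[word] = word.encode("utf-8")  # assumed encoding
        self.second_db[word] = list(glyph_bitmaps)

    def remove_word(self, word: str) -> None:
        # Removing a word clears it from both databases.
        self.first_db.pop(word, None)
        self.second_db.pop(word, None)
```

Keeping the two dictionaries separate means text-object detection only ever touches `first_db` and image-object detection only touches `second_db`.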

[0114] In practical implementation, the font encoding of sensitive words can use any common font encoding; this embodiment imposes no specific requirement in this regard. The glyphs of sensitive words can be extracted from common font libraries, such as bold, Song, and Kai typefaces, at common font sizes, such as Small Four (小四, 12 pt) or Five (五号, 10.5 pt), to form a target glyph set, in which the glyph data is saved in bitmap form. In practical applications, the glyphs of sensitive words can also be obtained by other methods; this embodiment imposes no specific requirement in this regard either.

[0115] After the first and second databases are established, as shown in Figure 3, when the control device 110 sends an OFD document, the image forming apparatus 120 obtains the OFD document sent by the control device 110 and performs sensitive word detection and replacement on the OFD document through the sensitive word processing module 122 (described in detail below). The OFD document includes at least one text page.

[0116] In one embodiment of this application, preset font codes and preset glyphs can be stored in the same database, and preset font codes or preset glyphs can be queried through specific instructions.

[0117] S202: Parse the text page of the OFD document. When the OFD document contains text objects and / or image objects, perform sensitive word detection on the text objects and / or image objects contained in the OFD document. When sensitive words are detected in the text objects and / or image objects contained in the OFD document, output a prompt screen for the user to choose from.

[0118] In an OFD document, information on the OFD page, such as text and image information, is recorded in the OFD's text page. The text page is in XML format. Each text object—TextObject—on the text page includes one or more characters. The description of a text object contains information about the character size, starting point (row and column number), glyph, and font encoding. Each image object—ImageObject or PathObject—on the text page includes an image. The description of an image object includes information about the image size, starting point (row and column number), image content, and color. Since the text page can contain various types of objects, and each type corresponds to a different database for sensitive word detection, after obtaining the OFD document, the sensitive word processing module 122 needs to parse the OFD document to determine whether the text page of the OFD document contains text objects and / or image objects. If the text page of the OFD document contains text objects and / or image objects, sensitive word detection is performed on the text objects and / or image objects contained in the OFD document's text page.
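Deciding whether a text page contains text objects and / or image objects can be sketched with Python's standard XML parser. The sample page, the namespace URI, and the `TextCode` and `ResourceID` names reflect a common reading of the OFD format and should be treated as assumptions here; `classify_page` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Assumed OFD namespace; verify against the specification in use.
OFD_NS = {"ofd": "http://www.ofdspec.org/2016"}

# Hypothetical, heavily simplified text page.
PAGE_XML = """
<ofd:Page xmlns:ofd="http://www.ofdspec.org/2016">
  <ofd:Content>
    <ofd:Layer ID="1">
      <ofd:TextObject ID="2" Boundary="10 10 50 5" Font="3" Size="4">
        <ofd:TextCode X="0" Y="4">Hello</ofd:TextCode>
      </ofd:TextObject>
      <ofd:ImageObject ID="4" Boundary="10 20 30 30" ResourceID="5"/>
    </ofd:Layer>
  </ofd:Content>
</ofd:Page>
"""


def classify_page(xml_text: str):
    """Return (texts, image resource ids) found on one text page."""
    root = ET.fromstring(xml_text.strip())
    texts = [
        "".join(tc.text or "" for tc in obj.findall("ofd:TextCode", OFD_NS))
        for obj in root.iterfind(".//ofd:TextObject", OFD_NS)
    ]
    images = [
        obj.get("ResourceID")
        for obj in root.iterfind(".//ofd:ImageObject", OFD_NS)
    ]
    return texts, images
```

A non-empty `texts` list would route the page to the font encoding comparison, and a non-empty `images` list to glyph extraction and the glyph comparison.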

[0119] In one embodiment of this application, a mapping relationship between user roles and sensitive word detection methods can be preset. The corresponding sensitive word detection method is determined based on the user role, and sensitive word detection is performed on the text objects and / or image objects contained in the OFD document according to the sensitive word detection method. User roles include administrators and general users. Administrators include at least one role definition, and general users include at least one role definition.

[0120] In one embodiment of this application, the administrator can assign user roles, access logs, and update data via network or direct configuration. Specifically, the administrator can assign user roles and update data through direct programming, batch network configuration, and / or mirroring of devices within the same network. Direct programming can be performed directly from the image forming device's system via wired or wireless networks, performing operations such as adding, deleting, modifying, and querying data. This method is suitable for performing user role assignment, log access, and data updates on a specific device. Batch network configuration involves performing operations such as adding, deleting, modifying, and querying data on the image forming device through a pre-established network. This method is suitable for scenarios such as document printing systems and local area network printing, allowing the administrator to synchronously configure multiple image forming devices via the network. Mirroring of devices within the same network refers to the ability for image forming devices within the same network to share data through network mirroring. This method is suitable for synchronizing user roles and data information when a new device is added.

[0121] In one possible implementation, administrators can access the logs by visiting the device's webpage, entering the administrator password to complete the authentication, and then accessing the log system. In the log system, local logs can be viewed, and the IP address, device number, or device name of the target device within the same network can be entered on the webpage to access the target device's logs.

[0122] In one possible implementation, multiple devices are networked to form a document printing system or a cloud printing system. After authentication, the administrator can access the logs of each networked device by accessing the backend system of the document printing system or the cloud printing system.

[0123] In one possible implementation, the administrator includes at least two role definitions: ordinary administrator and super administrator, with different permissions for each role.

[0124] Ordinary administrators can assign general user roles, access logs, and update data via the network or by direct settings; super administrators have all the privileges of ordinary administrators and can also assign user roles to ordinary administrators.

[0125] In one possible implementation, general users include at least three role definitions: ordinary employees, ordinary personnel with access to classified information, and senior personnel with access to classified information. These three role definitions correspond to different user security levels, sensitive word detection methods, and types of printable documents.

[0126] In one possible implementation, users (whether administrators or general users) can be divided into three security levels: Class I, Class II, and Class III. Sensitive word detection methods include a first detection method, a second detection method, and a third detection method. Class I users are assigned to the first detection method, which performs sensitive word detection on text and / or image objects in the OFD document individually. Class II users are assigned to the second detection method, which performs sensitive word detection on both text and image objects simultaneously. Class III users are assigned to the third detection method, which performs sensitive word detection on all text and image objects simultaneously.

[0127] Specifically, when a text page contains multiple layers, the corresponding sensitive word detection method is determined based on the user's role. If the user role is ordinary employee, it corresponds to the Class I security level and uses the first detection method, which detects each layer individually. If the user role is ordinary personnel with access to classified information, it corresponds to the Class II security level and uses the second detection method, which cross-checks all text objects and all image objects within the layer. If the user role is senior personnel with access to classified information, it corresponds to the Class III security level and uses the third detection method, which performs combined detection of all objects within the layer, i.e., a full cross-sorting detection of the text contained in all objects.
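The preset mapping between user roles, security levels, and detection methods can be sketched as a simple lookup. The role identifiers and the function name `method_for_role` below are hypothetical; the description only fixes the correspondence between the three general user roles, Classes I to III, and the three detection methods.

```python
from enum import Enum


class DetectionMethod(Enum):
    FIRST = "detect each layer individually"
    SECOND = "cross-check all text objects and all image objects"
    THIRD = "combined detection with full cross-sorting"


# Hypothetical role identifiers for the three general user roles.
ROLE_TO_LEVEL = {
    "ordinary_employee": "I",
    "ordinary_classified_personnel": "II",
    "senior_classified_personnel": "III",
}

LEVEL_TO_METHOD = {
    "I": DetectionMethod.FIRST,
    "II": DetectionMethod.SECOND,
    "III": DetectionMethod.THIRD,
}


def method_for_role(role: str) -> DetectionMethod:
    """Resolve a user role to its detection method via the security level."""
    return LEVEL_TO_METHOD[ROLE_TO_LEVEL[role]]
```

Keeping the two tables separate means an administrator could, in this sketch, adjust the detection strategy for a security level without touching the role definitions.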

[0128] It should be noted that the confidentiality level for the administrator role can be set according to specific needs.

[0129] It can be seen that the first detection method is a relatively lenient one, suitable for people with low confidentiality requirements; the second detection method is a normal one, suitable for most confidentiality personnel; and the third detection method is a relatively strict one, suitable for people with strict confidentiality requirements.

[0130] This application embodiment allows administrators to assign roles to users, so that users with different roles can perform different sensitive word detection methods, ensuring that users with different roles correspond to different levels of sensitivity and confidentiality.

[0131] Data updates include sensitive word updates, sensitive word detection strategy adjustments, and user role definitions.

[0132] Specifically, when sensitive words are updated, the font encoding of the newly added sensitive words can be entered into the first database of the image forming apparatus through the auxiliary program of the control device. If the image forming apparatus also supports an image recognition plugin, the control device can first obtain different font sizes and glyphs of the newly added sensitive words from different font libraries, and then write the different font sizes and glyphs into the second database of the image forming apparatus in image format. Simultaneously, embodiments of this application can also write images of seals, handwritten signatures, etc., into the second database.

[0133] In one possible implementation, where the application scenario includes several image forming devices and these devices are allowed to be deployed on an internal network, when the sensitive word database stored in one image forming device is updated, the updated image forming device can act as a server, search for image forming devices of the same series in the same network segment, and synchronously transmit the data to those devices.

[0134] In one possible implementation, the user has already deployed a sensitive word database on all image forming devices in the internal network. When the user adds an image forming device and also needs to deploy a sensitive word database on the new image forming device, after adding the new image forming device to the internal network, the already deployed image forming devices can be used as a mirror source. The data can be mirrored and copied to the new image forming device through the internal network, thus completing the deployment of the sensitive word database on the new device.

[0135] Specifically, the adjustment of the sensitive word detection strategy mainly involves adjusting the sensitive word detection method corresponding to each user role based on the user role definition. For example, the sensitive word detection methods corresponding to users with Class I security level, Class II security level, and Class III security level will be adjusted.

[0136] Specifically, user role definition mainly involves adding the role definition of a new user to the database when a new user is added. User role definition includes whether the user is a person with access to classified information, the level of classification, etc.

[0137] Compared to existing technologies, this application's embodiments allow administrators to customize different sensitive word detection strategies, classify user levels, update the sensitive word database, and "clone" the sensitive word database across different devices. This greatly facilitates the management of personnel and information handling confidential matters and ensures the security of output documents. Simultaneously, the management of the sensitive word database and user roles can be quickly synchronized and updated across networked devices, facilitating unified management and improving the efficiency of confidentiality management.

[0138] In one possible implementation, sensitive word detection is performed on the OFD document according to a determined sensitive word detection method. The sensitive word detection includes font encoding data sensitive word detection on the text objects of the OFD document, and / or, after extracting the text glyphs from the image objects of the OFD document, sensitive word detection is performed on the extracted text glyphs.

[0139] Specifically, when the OFD document contains text objects, the sensitive word processing module 122 can extract the text objects from the OFD document and compare them with the first database to obtain a first comparison result. That is, the text objects are compared with preset font encodings to obtain the first comparison result.

[0140] In one possible implementation, since a text object can include one or more characters, the set of text strings of the OFD document (a set of text strings composed of those characters) is first obtained based on the text objects. The font encoding of each text string in the set is then obtained and compared with the font encoding of each sensitive word in the first database to obtain the first comparison result.

[0141] In the specific implementation, the font encoding of each text string can be obtained in the code of the text page through ofd:TextCode. The specific content of ofd:TextCode can be found in the description in the relevant technology. For the sake of brevity, the embodiments of this application will not be described in detail here.

[0142] Understandably, in practical applications, a text index database of sensitive words can also be established in the image forming apparatus. After the set of text strings in the OFD document is obtained, the text index number of each text string in the set is obtained and compared with each text index number in the text index database to obtain the first comparison result. With this method, the text content to be compared can be obtained by locating the ofd:Glyphs node. However, establishing a text index database is cumbersome and less convenient than using font encoding. Therefore, this application embodiment preferably uses font encoding for sensitive word detection of text objects.

[0143] Theoretically, from the perspective of constructing OFD documents, a text object is generally constructed one line at a time. However, in practical applications, a line of text can be described by two or more text objects. Therefore, when retrieving a collection of text strings, the possibility of sensitive words being truncated needs to be considered. In one possible case, a sensitive word is split across two or more text objects: as shown in Figure 4, assuming the sensitive word is "smooth wood", the sensitive word is separated by text object a and text object b. In another possible case, the sensitive word is laid out across two adjacent lines: as shown in Figure 4, the sensitive word "the world" is partially placed at the end of the current line and partially placed at the beginning of the next line. That is, the first half is included in text object c and the second half is included in text object d.

[0144] To address the two scenarios described above, when retrieving the text string set of an OFD document based on text objects, the text objects can first be divided according to the text page to which they belong, resulting in a text object set corresponding to each text page in the OFD document. Then, the text objects contained in each text object set are sorted to obtain sorted text object sets. Finally, the text string set of the OFD document is retrieved based on the sorted text object sets.

[0145] In one possible implementation, the text objects in each text object set can be sorted according to their row and column order within their respective text pages. Specifically, taking any set of text objects as the target text object set, the row number and starting column number of each text object in the target set within its text page are determined. Text objects with the same row number are grouped together, and each group of text objects is arranged in rows according to their row numbers from smallest to largest. Then, the column order of the text objects in each group is adjusted according to their starting column numbers, such as adjusting the column order of the text objects in each group from smallest to largest starting column numbers, thus obtaining the sorted target text object set. Here, the starting column number refers to the column number of the first text in a text object, which is also the smallest column number among all texts in a text object. Specifically, when sorting text objects, you can take the top left corner of the text page as the origin, arrange each group of text objects from top to bottom according to the row number from smallest to largest, and then adjust the column order of the text objects in each group from left to right according to the starting column number from smallest to largest.

[0146] After sorting the text objects in each text object set, the text string set for each text object set can be obtained. Specifically, according to the descriptions of the text objects in each sorted text object set, all the text on each page can be obtained. Then, on a page-by-page basis, all the text on each page can be arranged into lines of strings according to the text object's sorting order. This gives the text string set for each page, and the text string set for the OFD document can be obtained from the text string set for each page.
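The page-wise grouping, row and column sorting, and concatenation into lines of strings can be sketched as follows. The `TextObj` record and its field names are hypothetical stand-ins for the information carried in a text object's description; the sketch only assumes that each object exposes a page, a row number, a starting column number, and its text.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass(frozen=True)
class TextObj:
    page: int
    row: int
    start_col: int
    text: str


def page_lines(objs):
    """Sort one page's objects by (row, starting column) and join each row."""
    ordered = sorted(objs, key=lambda o: (o.row, o.start_col))
    return [
        "".join(o.text for o in group)
        for _, group in groupby(ordered, key=lambda o: o.row)
    ]


def document_lines(objs):
    """Split objects by page, then build the text string set of every page."""
    pages = {}
    for o in objs:
        pages.setdefault(o.page, []).append(o)
    return {p: page_lines(pages[p]) for p in sorted(pages)}


objs = [
    TextObj(page=1, row=2, start_col=0, text="next line"),
    TextObj(page=1, row=1, start_col=7, text="wood"),     # text object b
    TextObj(page=1, row=1, start_col=0, text="smooth "),  # text object a
]
lines = document_lines(objs)
```

After sorting, the two objects on row 1 are joined into a single string, so a sensitive word such as "smooth wood" that was split across text objects a and b becomes visible to a plain string comparison.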

[0147] The set of text strings in the OFD document obtained using the above method can be used to identify sensitive words that are truncated by text objects within the same line. Furthermore, regarding the issue of sensitive words being formatted across lines, if, when comparing the font encoding of the text string with the font encoding of each sensitive word in the first database, it is found that the font encoding of the text string at the end of the nth line is the same as the first half of the font encoding of the mth sensitive word in the first database, then the font encoding of the text string at the beginning of the (n+1)th line is compared with the second half of the font encoding of the mth sensitive word. If the font encoding of the text string at the beginning of the (n+1)th line is the same as the second half of the font encoding of the mth sensitive word, then it is confirmed that the OFD document contains a text string with a font encoding consistent with the sensitive word in the first database, thus obtaining the first comparison result.

[0148] Similarly, if sensitive words are formatted across lines, it also involves page-crossing issues, meaning the first half of the sensitive word is formatted at the end of the current page and the second half at the beginning of the next page. The method for identifying sensitive words includes: when comparing the font encoding of the text string with the font encoding of each sensitive word in the first database, if the font encoding of the text string at the end of page t is found to be the same as the first half of the font encoding of the kth sensitive word in the first database, then the font encoding of the first text string on page t+1 is compared with the second half of the font encoding of the kth sensitive word. If the font encoding of the first text string on page t+1 is the same as the second half of the font encoding of the kth sensitive word, then it is confirmed that the OFD document contains a text string with a font encoding consistent with the sensitive word in the first database, thus obtaining the first comparison result.
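The boundary check for sensitive words that cross a line or a page can be sketched as below. Plain Python strings stand in for font encoding sequences, and the function name `boundary_hits` is hypothetical. The description speaks of a "first half" and "second half"; this sketch generalizes that to any split point, which is an assumption.

```python
def boundary_hits(units, sensitive_words):
    """Find sensitive words split across consecutive units (lines or pages).

    units: strings in reading order; for the page-crossing case, each entry
    can be the full text of one page. Returns (index_of_first_unit, word).
    """
    hits = []
    for n in range(len(units) - 1):
        for word in sensitive_words:
            # Try every way of cutting the word into a head and a tail.
            for k in range(1, len(word)):
                head, tail = word[:k], word[k:]
                if units[n].endswith(head) and units[n + 1].startswith(tail):
                    hits.append((n, word))
                    break
    return hits


lines = ["welcome to the wo", "rld of printing"]
found = boundary_hits(lines, ["the world"])
```

The same function covers both cases: passing the lines of one page detects line-crossing words, and passing one string per page detects words split between the end of page t and the start of page t+1.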

[0149] In a possible implementation, other methods can also be adopted to sort the text objects included in each set of text objects. It should be noted that Chinese characters are block-shaped and naturally separated from each other, whereas Western languages (taking English as an example) represent meaning alphabetically, with letters forming words through specific arrangements. Therefore, the sorting detection for Western-language sensitive words differs from that for Chinese: the letters must first be isolated and recombined to obtain new semantics, and sensitive word detection is then performed. If the OFD document involves Western-language text, sorting requires first removing the artificial separation between letters, then performing natural semantic processing to identify possible Western-language text compositions, and then detecting these possible compositions. For example, take the Western-language text "He is confident I already out", whose Chinese translation is "他相信我已经出去了" ("he believes I have already gone out"). If the artificial letter separation is removed, it becomes "HeisconfidentIalreadyout". After re-separating, "He is confidential ready out" can be obtained, whose Chinese semantics can be understood as "他已经做好了保密准备" ("he has made confidentiality preparations"). If an overlapping layer covers one or more letters at this point, new semantics will be generated. For example, if the letter 'e' in "He" is covered, the text becomes "HisconfidentIalreadyout". After re-separating, "His confidential ready out" can be obtained, whose Chinese semantics can be understood as "他的机密已经准备好了" ("his confidential material is ready"). Through the sorting mechanism for Western-language detection described above, each set of sorted text objects can be obtained.
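The effect of removing the artificial separation between letters can be shown with a short sketch. It replaces the natural semantic processing mentioned above with a plain substring search over the rejoined letters, which is a deliberate simplification; the function names are hypothetical.

```python
def strip_separators(text: str) -> str:
    """Drop everything except letters and normalize case."""
    return "".join(ch for ch in text if ch.isalpha()).lower()


def find_after_rejoin(text, sensitive_words):
    """Return the sensitive words that appear once the letters are rejoined."""
    joined = strip_separators(text)
    return [w for w in sensitive_words if strip_separators(w) in joined]


original = "He is confident I already out"
hits = find_after_rejoin(original, ["confidential"])
```

Here "confident", "I" and the start of "already" rejoin into "confidential", so the word is reported even though it never appears as a separate word in the original text.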

[0150] It should also be noted that based on the above sorting mechanism for Western language detection, in scenarios with specific confidentiality requirements, after isolating and reconstructing Chinese characters, the detection method in the present invention can also be executed for detection to improve the safety margin. For example, for the Chinese characters "木几禾必", after reconstruction, "机秘" can be obtained.
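The reconstruction of Chinese characters from adjacent components can be sketched with a small lookup table. The table `COMPOSE` is a hypothetical two-entry example, and the greedy left-to-right pairing is an assumption; a practical table would need to cover many more component combinations.

```python
# Hypothetical component table: (left part, right part) -> composed character.
COMPOSE = {
    ("木", "几"): "机",
    ("禾", "必"): "秘",
}


def reconstruct(chars: str) -> str:
    """Greedily merge adjacent component pairs that form a known character."""
    out = []
    i = 0
    while i < len(chars):
        pair = tuple(chars[i:i + 2])
        if pair in COMPOSE:
            out.append(COMPOSE[pair])
            i += 2
        else:
            out.append(chars[i])
            i += 1
    return "".join(out)


rebuilt = reconstruct("木几禾必")
```

The rebuilt string can then be passed to the same sensitive word comparison used for ordinary text strings.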

[0151] The above text reconstruction can also be applied to phonetic writing systems such as Chinese pinyin, Japanese, and Korean. Details are not elaborated here.

[0152] By sorting the text objects and then obtaining the text string of each text page in the OFD document in the embodiments of the present application, sensitive words truncated by text objects can be identified, the accuracy of sensitive word recognition is improved, and the compliance of the output document is ensured.

[0153] In the text page of an OFD document, some text is presented as images, meaning the text content is contained within ofd:ImageObject or ofd:PathObject. In this case, on the one hand, the text cannot easily be found directly; on the other hand, when multiple images overlap, images containing sensitive words may be stacked at the bottom, where they are difficult to see and easily overlooked. Therefore, when the types of object contained in the text page include image objects, this embodiment of the application uses the sensitive word processing module 122 to extract the image object and extract the text glyphs from the image object. Then, the text glyphs are compared with the text glyphs of sensitive words in the second database to obtain a second comparison result. That is, the text glyphs are extracted from the image object and compared with preset glyphs to obtain a second comparison result.

[0154] In practical implementation, image recognition technologies such as OCR or image fuzzy matching can be embedded as plug-in functions into the sensitive word processing module 122. After the sensitive word processing module 122 extracts the image object, it can use image recognition technologies such as OCR or image fuzzy matching to recognize and extract the text in the image object. Of course, if the user needs to perform additional detection on the image due to scenario requirements or other factors, image recognition technologies such as OCR or image fuzzy matching can be integrated into the image recognition module, and the image recognition module can be set in the image forming device, whereby the image recognition module performs text recognition and text extraction.

[0155] In one possible implementation, the glyphs of the text are compared with the glyphs of sensitive words in a second database to determine whether sensitive words exist in the image object. Specifically, this includes: obtaining a set of glyph strings from the OFD document based on the glyphs; comparing each glyph string in the set with the glyph of each sensitive word in the second database; and determining that a sensitive word exists in the image object when there is a glyph string whose similarity to the glyph of a sensitive word in the second database is greater than a preset threshold. The preset threshold can be set according to specific needs. For example, if the requirements for sensitive words are strict, the preset threshold can be set lower, such as 70%; if the requirements for sensitive words are more lenient, the preset threshold can be set higher, such as 90%. This application embodiment does not impose specific limitations on this.
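The role of the preset threshold can be illustrated with a toy similarity measure. The description does not specify how similarity is computed; the pixel-agreement ratio over small bitmaps below, and the names `similarity` and `matches_sensitive_glyph`, are assumptions chosen only to make the threshold behaviour concrete.

```python
def similarity(a, b):
    """Fraction of pixels that agree between two equally sized bitmaps."""
    if not a or len(a) != len(b):
        return 0.0
    if any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        return 0.0
    total = sum(len(row) for row in a)
    if total == 0:
        return 0.0
    same = sum(p == q for ra, rb in zip(a, b) for p, q in zip(ra, rb))
    return same / total


def matches_sensitive_glyph(candidate, database, threshold):
    """True if the candidate exceeds the threshold against any stored glyph."""
    return any(similarity(candidate, ref) > threshold for ref in database)


# Hypothetical 3x3 bitmaps: the candidate differs from the stored glyph
# in one pixel, so the similarity is 8/9.
REFERENCE = ("###", "#.#", "###")
CANDIDATE = ("###", "#.#", "##.")

strict = matches_sensitive_glyph(CANDIDATE, [REFERENCE], 0.70)
lenient = matches_sensitive_glyph(CANDIDATE, [REFERENCE], 0.90)
```

With the lower threshold of 70% the slightly degraded glyph is still flagged, while with 90% it is not, which is why a lower threshold corresponds to the stricter setting.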

[0156] In one possible implementation, since the obtained text glyphs are individual, they need to be combined into glyph strings for sensitive word recognition. Specifically, the text glyphs can be divided according to the text page to which they belong, to obtain the set of text glyphs corresponding to each text page in the OFD document. Then, the text glyphs contained in each set of text glyphs are sorted to obtain the sorted set of text glyphs. Based on the sorted set of text glyphs, the set of glyph strings of the OFD document is obtained.

[0157] In one possible implementation, the glyphs in each glyph set can be sorted according to the image object to which each glyph belongs. Specifically, the glyphs belonging to the same image object in each glyph set are grouped together, and the glyphs in each group are sorted according to their position in their respective image objects.

[0158] After sorting the glyphs contained in each glyph set, the glyph string set of each glyph set can be obtained based on the sorted glyph set, and the glyph string set of the OFD document can be obtained based on the glyph string set of each glyph set.

[0159] This application embodiment detects sensitive words in image objects by extracting the glyphs of characters from the image objects and comparing them with a second database. This prevents sensitive words from appearing in the output images and further ensures the compliance of the final output content.

[0160] In the embodiments of this application, after obtaining the first comparison result and / or the second comparison result, it can be determined whether the text object and / or image object contains sensitive words based on the first comparison result and / or the second comparison result.

[0161] In one possible implementation, a pre-defined traversal comparison program can be used. This program can recognize string data and obtain a first comparison result by comparing the string data of text objects in OFD with a first database.

[0162] In one possible implementation, an OCR comparison program can be preset, which can recognize the text in the image, compare the text content of the image object in OFD with a second database, and then perform sensitive word detection on the text content to obtain a second comparison result.

[0163] It should be noted that the first and second comparison results in this application embodiment include not only words that are completely identical to the sensitive word, but also characters or words that exist within the sensitive word. For example, if the sensitive word is "blockchain," then the comparison results for "zone," "block," "chain," "blockchain," etc., will all be included in the first and / or second comparison results. It's just that when the detection methods are different...

[0164] Specifically, when the sensitive word detection method is the first detection method, the first comparison result and the second comparison result contain the comparison results of whether there is a word that is completely identical to the sensitive word. At this time, determining whether the text object and / or image object contains sensitive words based on the first comparison result and / or the second comparison result specifically includes: determining whether the text object contains sensitive words based on the first comparison result; and / or, determining whether the image object contains sensitive words based on the second comparison result. That is, if there is a first comparison result, then determining whether the text object contains sensitive words based on whether there is a word that is completely identical to the sensitive word in the first comparison result; if the first comparison result shows that there is a word that is completely identical to the sensitive word, then determining that the text object contains sensitive words; if there is a second comparison result, then determining whether the image object contains sensitive words based on whether there is a word that is completely identical to the sensitive word in the second comparison result; if the second comparison result shows that there is a word that is completely identical to the sensitive word, then determining that the image object contains sensitive words.
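The decision rule of the first detection method can be sketched as follows. Each comparison result is modelled as a list of matched strings, with `None` meaning that the result does not exist; this data model and the function name `first_method` are hypothetical.

```python
def first_method(sensitive_word, first_result=None, second_result=None):
    """Judge text and image objects separately, using exact matches only."""
    verdict = {}
    if first_result is not None:
        # Text objects: only a complete match counts.
        verdict["text"] = sensitive_word in first_result
    if second_result is not None:
        # Image objects: judged independently of the text objects.
        verdict["image"] = sensitive_word in second_result
    return verdict


verdict = first_method(
    "blockchain",
    first_result=["block", "chain"],
    second_result=["blockchain"],
)
```

In this example the fragments found in the text objects do not trigger a detection on their own, whereas the complete word found in an image object does.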

[0165] Specifically, when the sensitive word detection method is the second detection method, the first comparison result and the second comparison result include not only the comparison result of whether there is a word that is completely identical to the sensitive word, but also the comparison result of whether there is a character or word that is completely identical to a character or word within the sensitive word. For example, if the sensitive word is "blockchain", then the comparison results of "zone", "block", "chain", "blockchain", etc. will be placed in the first comparison result and / or the second comparison result. In this case, determining whether the text object and / or image object contains the sensitive word based on the first comparison result and / or the second comparison result specifically includes: performing a combined analysis of the first comparison result and / or the second comparison result, and determining whether the text object and / or image object contains the sensitive word based on the combined analysis result. That is, if only the first comparison result exists, the combined analysis determines whether the text object contains a sensitive word based on whether there is a word in the first comparison result that is completely identical to the sensitive word; if there is, the text object is determined to contain a sensitive word. If only the second comparison result exists, the combined analysis determines whether the image object contains a sensitive word based on whether there is a word in the second comparison result that is completely identical to the sensitive word; if there is, the image object is determined to contain a sensitive word. If both the first and second comparison results exist, the combined analysis determines whether the text object contains a sensitive word based on the first comparison result, determines whether the image object contains a sensitive word based on the second comparison result, and determines whether the text object and the image object together contain a sensitive word based on the combination of the first and second comparison results. If the first comparison result shows a word that is completely identical to the sensitive word, the text object is determined to contain a sensitive word. If the second comparison result shows a word that is completely identical to the sensitive word, the image object is determined to contain a sensitive word. If the combined result shows a word that is completely identical to the sensitive word, both the text object and the image object are determined to contain sensitive words. For example, if the first comparison result shows the first half of the m-th sensitive word at position A on a text page of the OFD document, and the second comparison result shows the second half of the same sensitive word at the position following position A on the same text page, then combining the first and second comparison results yields a word that is completely identical to the m-th sensitive word, thus confirming the presence of a sensitive word in both the text object and the image object.
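The combined analysis of the second detection method can be sketched as below. Each comparison result is modelled as a list of (position, fragment) pairs, where the position is a character offset on the text page and "the position following position A" is taken to be A plus the fragment length; both the data model and the function name `second_method` are assumptions.

```python
def second_method(word, first_result, second_result):
    """Exact matches per object type, plus adjacent text/image fragments."""
    text_hit = any(frag == word for _, frag in first_result)
    image_hit = any(frag == word for _, frag in second_result)

    # Merge both results in page order, remembering where each came from.
    merged = sorted(
        [(pos, frag, "text") for pos, frag in first_result]
        + [(pos, frag, "image") for pos, frag in second_result]
    )
    combined_hit = False
    for (p1, f1, s1), (p2, f2, s2) in zip(merged, merged[1:]):
        adjacent = p2 == p1 + len(f1)
        if s1 != s2 and adjacent and f1 + f2 == word:
            combined_hit = True

    # A combined match means both object types carry part of the word.
    return {
        "text": text_hit or combined_hit,
        "image": image_hit or combined_hit,
    }


verdict = second_method("blockchain", [(5, "block")], [(10, "chain")])
```

In this example neither result contains the full word, but the text fragment at offset 5 and the image fragment at offset 10 join into "blockchain", so both object types are reported.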

[0166] Specifically, when the sensitive word detection method is the second detection method, the first and second comparison results include not only the comparison results of whether there is a word that is completely identical to the sensitive word, but also the comparison results of whether there is a word or phrase that is completely identical to the character or word in the sensitive word, and the comparison results of whether there is a word that is completely identical to the word after the order of the characters in the sensitive word is changed. For example, if the sensitive word is "blockchain", then the comparison results of "zone", "block", "chain", "blockchain", "block zone chain", "block chain zone", "chain block zone", "block zone", etc. will be placed in the first comparison result and / or the second comparison result. At this time, the determination of whether the text object and / or image object contains the sensitive word is based on the first comparison result and / or the second comparison result. Specifically, this includes: adjusting the text order of the first comparison result and / or the second comparison result according to the sensitive word, and determining whether the text object and / or image object contains the sensitive word based on the result of the text order adjustment. 
That is, if only a first comparison result exists, the text order of the first comparison result is adjusted, and the text object is determined to contain a sensitive word if the adjusted first comparison result contains a word that is exactly the same as a sensitive word. If only a second comparison result exists, the text order of the second comparison result is adjusted, and the image object is determined to contain a sensitive word if the adjusted second comparison result contains a word that is exactly the same as a sensitive word. If both a first comparison result and a second comparison result exist, the text order is adjusted for the first comparison result, for the second comparison result, and for the combined result of the two, and whether the text object and / or image object contains a sensitive word is determined from whether each adjusted result contains a word that exactly matches a sensitive word: if the adjusted first comparison result contains such a word, the text object contains a sensitive word; if the adjusted second comparison result contains such a word, the image object contains a sensitive word; and if the adjusted combined result contains such a word, both the text object and the image object contain a sensitive word. For example, if the sensitive word is "abcd", the first comparison result shows "ba", and the second comparison result shows "dc", then combining the first and second comparison results and adjusting the text order of the combined result yields the sensitive word "abcd", thus confirming the presence of a sensitive word in both the text object and the image object.
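
The text order adjustment in the "abcd" example can be illustrated with a minimal sketch. It is not the implementation of this application: it assumes each comparison result is a list of matched string fragments, approximates "adjusting the text order" with a character multiset test (which may over-match when extra characters are present), and assumes the combined result is consulted only when neither result alone yields the word. The names `covers` and `locate_sensitive_word` are hypothetical.

```python
from collections import Counter


def covers(fragments, sensitive_word):
    """True if the characters in `fragments` can be reordered to spell `sensitive_word`."""
    pool = Counter("".join(fragments))
    need = Counter(sensitive_word)
    return all(pool[ch] >= count for ch, count in need.items())


def locate_sensitive_word(first_result, second_result, sensitive_word):
    """Report which object types contain the word after text order adjustment.

    first_result  -- fragments matched in text objects (assumed representation)
    second_result -- fragments matched in image objects (assumed representation)
    """
    in_text = covers(first_result, sensitive_word)
    in_image = covers(second_result, sensitive_word)
    # Assumption: the combined result is decisive only when neither side
    # yields the word on its own; in that case both object types are flagged.
    if not in_text and not in_image:
        if covers(first_result + second_result, sensitive_word):
            in_text = in_image = True
    return {"text": in_text, "image": in_image}


print(locate_sensitive_word(["ba"], ["dc"], "abcd"))
```

With the fragments "ba" and "dc" from the example, neither result alone spells "abcd", but the combined result does, so both the text object and the image object are flagged.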

[0167] This application's embodiments employ different sensitive word detection methods based on different user roles, ensuring different levels of sensitivity and confidentiality for different user roles, thus enabling the output documents to meet different levels of confidentiality requirements.
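
A mapping between user roles and detection methods could be represented as a simple lookup table. The sketch below is only an illustration: the role names, the assignment of methods to roles, and the fallback value are all hypothetical, since the text does not specify which role receives which method.

```python
from enum import Enum


class DetectionMethod(Enum):
    FIRST = 1
    SECOND = 2
    THIRD = 3


# Hypothetical role names and assignments, chosen only for illustration.
ROLE_TO_METHOD = {
    "administrator": DetectionMethod.THIRD,
    "regular_user": DetectionMethod.FIRST,
}


def method_for_role(role, default=DetectionMethod.THIRD):
    """Look up the detection method for a role; unknown roles get an assumed default."""
    return ROLE_TO_METHOD.get(role, default)


print(method_for_role("regular_user").name)
```

Keeping the mapping in one table would let an administrator adjust the detection strategy for a role by editing a single entry.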

[0168] S200: In response to an image forming control command triggered on a prompt screen, perform the corresponding image forming control operation on the OFD document.

[0169] In this embodiment, when sensitive words are detected in the text objects and / or image objects contained in the OFD document, a prompt screen is displayed on the image forming apparatus for the user to select. In response to the image forming control command triggered by the user on the prompt screen, a corresponding image forming control operation is performed on the OFD document. When the image forming control command instructs continued execution of the image forming operation, the detected sensitive words in the OFD document are deleted or replaced to construct a desensitized OFD document; the desensitized OFD document is then printed, and the OFD document's job information is recorded; or, when the image forming control command instructs abandonment of the image forming operation, the current image forming process is terminated, and the OFD document's job information is deleted.

[0170] Step S200 above may further include steps S203, S204 and S205.

[0171] Specifically, the prompt screen includes two options: abandon the job and continue the job. The user can choose the appropriate option based on the actual situation. If the user chooses to abandon the job, the image forming device will directly end the current job and delete the relevant data; if the user chooses to continue the job, the current job will be desensitized before printing. When no sensitive words are detected in the text objects and / or image objects contained in the OFD document, the OFD document will be printed directly.

[0172] Step S203: If the user chooses to continue the job, the sensitive words detected in the OFD document are deleted or replaced to construct a desensitized OFD document.

[0173] In this embodiment of the application, for OFD documents containing sensitive words, the detected sensitive words in the OFD document will be deleted or replaced to construct a desensitized OFD document.

[0174] It should be noted that, for the deletion or replacement of sensitive words detected in OFD documents, reference may be made to the related art. For the sake of brevity, it is not described in detail in the embodiments of this application.

[0175] S204: Execute the printing of the desensitized OFD document and record the OFD document job information in the log.

[0176] In this embodiment of the application, the desensitized OFD document can be printed directly. Specifically, the image forming module 123 shown in Figure 3 can execute the printing of the desensitized OFD document.

[0177] In addition, this embodiment of the application will also record the job information of the OFD document in a log for administrator access. The job information includes: job name, user information, printing time, number of copies printed, and / or sensitive word detection results.
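
The job information listed above could be held in a small record type and serialized into the log. This is a minimal sketch under assumptions: the field names, the JSON line format, and the in-memory list standing in for the log are hypothetical and not taken from this application.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime


@dataclass
class JobRecord:
    job_name: str
    user: str
    copies: int
    # Sensitive words found in the job (assumed form of the detection results).
    detected_words: list = field(default_factory=list)
    # Printing time, recorded when the record is created.
    printed_at: str = field(
        default_factory=lambda: datetime.now().isoformat(timespec="seconds")
    )


def append_to_log(log, record):
    """Append one job record to the log as a JSON line and return the log."""
    log.append(json.dumps(asdict(record), ensure_ascii=False))
    return log


log = append_to_log([], JobRecord("report.ofd", "alice", 2, ["abcd"]))
print(log[0])
```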

[0178] S205: If the user chooses to abandon the job, printing ends and the job information of the OFD document is deleted.

[0179] In this embodiment of the application, if the user selects the option to abandon the job on the prompt screen, the printing operation of the OFD document will be ended directly, and the job information of the OFD document will be deleted.
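
Steps S203 to S205 amount to a two-way branch on the user's choice. The sketch below illustrates that branch under simplifying assumptions: the document is modeled as a plain string, replacement uses asterisks, and the job log is a dictionary keyed by job name. The function name `handle_prompt_choice` and the choice strings are hypothetical.

```python
def handle_prompt_choice(choice, job_name, text, detected_words, job_log):
    """Apply the option selected on the prompt screen.

    Returns the desensitized text to print, or None if the job is abandoned.
    """
    if choice == "continue":
        desensitized = text
        for word in detected_words:
            # Replace each detected sensitive word with a same-length mask.
            desensitized = desensitized.replace(word, "*" * len(word))
        # Record the job information for later administrator access.
        job_log[job_name] = {"detected": list(detected_words), "status": "printed"}
        return desensitized
    if choice == "abandon":
        # End the job and delete any job information already stored for it.
        job_log.pop(job_name, None)
        return None
    raise ValueError(f"unknown choice: {choice!r}")


job_log = {}
print(handle_prompt_choice("continue", "job-1", "the abcd plan", ["abcd"], job_log))
```

In an actual device the returned content would be passed to the image forming module rather than printed to a console.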

[0180] In one possible implementation, when the user selects to abandon the job on the prompt screen, the processor generates an instruction to end and delete the job, delete the detection results, and delete the cached job file. In this embodiment, when the OFD document contains sensitive words and the user selects to continue printing, the sensitive words are deleted or replaced to form a de-sensitized OFD document, ensuring the security of OFD document printing and making the output document meet confidentiality requirements.

[0181] In one possible implementation, when the OFD document contains sensitive words and the user chooses to continue printing, the processor generates a replacement or deletion command, retrieves pre-stored fill color blocks in memory to replace the layers containing sensitive words in the original document, or directly deletes the layers containing sensitive words in the original document.
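
The two layer-level options described here (replace with a fill color block, or delete) can be sketched as follows. The layer representation, the `FILL_BLOCK` stand-in for the pre-stored fill color block, and the function name `desensitize_layers` are assumptions made for illustration only.

```python
# Stand-in for the fill color block pre-stored in memory (assumed structure).
FILL_BLOCK = {"kind": "fill", "color": "#000000"}


def desensitize_layers(layers, flagged_indexes, mode="replace"):
    """Return a new layer list in which flagged layers are replaced or deleted.

    layers          -- list of layer objects (any type) in stacking order
    flagged_indexes -- indexes of layers that contain sensitive words
    mode            -- "replace" to substitute a fill block, "delete" to drop the layer
    """
    if mode not in ("replace", "delete"):
        raise ValueError(f"unknown mode: {mode!r}")
    flagged = set(flagged_indexes)
    result = []
    for index, layer in enumerate(layers):
        if index not in flagged:
            result.append(layer)
        elif mode == "replace":
            result.append(dict(FILL_BLOCK))
        # In "delete" mode the flagged layer is simply omitted.
    return result


print(desensitize_layers(["header", "secret text", "footer"], [1], mode="delete"))
```

Because a new list is returned, the original layer list is left untouched, in keeping with the statement that the source file content is not changed.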

[0182] In one possible implementation, embodiments of this application may further integrate an AI model (text semantic recognition after OCR recognition) into the image forming apparatus on the basis of sensitive word detection, so as to more intelligently detect sensitive words or recognize semantics.

[0183] In one possible implementation, if the image forming apparatus is allowed to access the network, this embodiment of the application can further perform a fuzzy search for synonyms on the text information in the first database and the image information in the second database to obtain the font codes and glyphs of similar words, and store them in the corresponding databases, thereby expanding the detection range and accuracy of sensitive words. Of course, if the memory and computing power of the image forming apparatus are sufficient, a further fuzzy search for synonyms can also be performed on the text and image information in the OFD document. After obtaining the font codes and glyphs of similar words, they can be compared again with the corresponding databases to confirm whether there are sensitive words among the similar words, thereby determining whether there are sensitive words in the OFD document.

[0184] Additionally, when an AI model is integrated, it can be used to detect issues such as grammatical errors and typos before the image forming operation is performed. If such issues are found, the user can be asked whether to print the original, the revised version, or both.

[0185] This application embodiment can perform intelligent and expanded detection of sensitive words in OFD documents during the printing process of OFD documents using an image forming apparatus, thereby improving the accuracy of sensitive word detection.

[0186] The method of this application embodiment will be further described below in conjunction with the printing process.

[0187] See Figure 5, which is a flowchart illustrating another OFD document printing control method provided in an embodiment of this application. As shown in Figure 5, after the control device issues an OFD job, the image forming apparatus (e.g., a printer) performs sensitive word detection on the OFD job to determine whether sensitive words exist. If no sensitive words are found in the OFD job, normal printing is performed. If sensitive words are found, a prompt screen is displayed. If the user chooses to continue the job, the detected sensitive words in the OFD document are deleted or replaced, and then normal printing is performed. If the user chooses to abandon the job, printing ends. For a detailed description of this embodiment, refer to the above method embodiments; for the sake of brevity, it is not repeated here.

[0188] In this embodiment of the application, when an OFD document is printed, text information inside the document is identified, and predefined sensitive words can be replaced with other characters or removed directly to avoid leakage of classified information. The aim is that, even in the absence of authorization, the printer can still print normally while necessary classified information is kept from being printed out, thereby better meeting the security requirements of classified units.

[0189] Furthermore, the sensitive word recognition in this embodiment is performed inside the printer when the data is received, and does not change the content of the source file. That is, because the technique is applied inside the printer, sensitive words are removed from the printed job regardless of which external application issued the job.

[0190] Corresponding to the above embodiments, this application also provides an image forming apparatus.

[0191] See Figure 6, which is a schematic diagram of the structure of an image forming apparatus provided in an embodiment of this application. As shown in Figure 6, the image forming apparatus 600 includes: an acquisition unit 601 for acquiring an OFD document, the OFD document including at least one page of text; a parsing unit 602 for parsing the text page of the OFD document; a detection unit 603 for detecting sensitive words in the text objects and / or image objects contained in the OFD document when the OFD document contains text objects and / or image objects; an output unit 604 for outputting a prompt screen when sensitive words are detected in the text objects and / or image objects contained in the OFD document; and an execution unit 605 for performing corresponding image forming control operations on the OFD document in response to an image forming control command triggered on the prompt screen.

[0192] Furthermore, the execution unit 605 may include a first execution unit and a second execution unit. The first execution unit is used, when the image forming control instruction indicates continuing the image forming operation, to delete or replace the sensitive words detected in the OFD document and construct a desensitized OFD document; specifically, if the user selects to continue the job on the prompt screen, the sensitive words detected in the OFD document are deleted or replaced, a desensitized OFD document is constructed, the desensitized OFD document is printed, and the job information of the OFD document is recorded in the log. The second execution unit is used, when the image forming control instruction indicates abandoning the image forming operation, to terminate the current image forming process and delete the job information of the OFD document; specifically, if the user selects to abandon the job, printing is terminated and the job information of the OFD document is deleted.

[0193] For details of the embodiments of this application, please refer to the description of the above method embodiments. For the sake of brevity, they will not be repeated here.

[0194] Corresponding to the above embodiments, this application also provides an electronic device.

[0195] See Figure 7, which is a schematic diagram of the structure of an electronic device provided in an embodiment of this application. The electronic device 700 may include a processor 710, a memory 720, and a communication unit 730. These components communicate through one or more buses. Those skilled in the art will understand that the structure of the electronic device shown in the figure does not constitute a limitation on the embodiment of this application. It may be a bus-shaped structure or a star-shaped structure, and may include more or fewer components than shown, or combine certain components, or have different component arrangements.

[0196] The communication unit 730 is used to establish a communication channel, enabling the electronic device to communicate with other devices. It receives user data from other devices or sends user data to other devices.

[0197] The processor 710 serves as the control center of the electronic device, connecting various parts of the device via various interfaces and lines. It executes software programs, instructions, and / or modules stored in the memory 720, and calls data stored in the memory to perform various functions and / or process data. The processor may be composed of integrated circuits (ICs), such as a single packaged IC or multiple packaged ICs with the same or different functions connected together. For example, the processor 710 may consist only of a central processing unit (CPU). In this embodiment, the CPU may have a single processing core or include multiple processing cores.

[0198] The memory 720 is used to store the execution instructions of the processor 710. The memory 720 can be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic storage, flash memory, magnetic disk or optical disk.

[0199] When the execution instructions in memory 720 are executed by processor 710, the electronic device 700 is able to perform some or all of the steps in the embodiment illustrated in Figure 2.

[0200] In a specific implementation, this application also provides a computer storage medium, which may store a program. When the program is executed, it may include some or all of the steps in the various embodiments of the OFD document printing control method provided in this application. The storage medium may be a magnetic disk, optical disk, read-only memory (ROM), or random access memory (RAM), etc.

[0201] In a specific implementation, this application also provides a computer program product, wherein the computer program product includes executable instructions, which, when executed on a computer, cause the computer to perform some or all of the steps in various embodiments of the OFD document printing control method provided in this application.

[0202] Those skilled in the art will clearly understand that the techniques in the embodiments of this application can be implemented using software plus necessary general-purpose hardware platforms. Based on this understanding, the technical solutions in the embodiments of this application, or the parts that contribute to the prior art, can be embodied in the form of a software product. This computer software product can be stored in a storage medium, such as ROM / RAM, magnetic disk, optical disk, etc., and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute the methods described in the various embodiments of this application or some parts of the embodiments.

[0203] For the same or similar parts among the various embodiments in this specification, reference may be made to one another. In particular, the device embodiments and terminal embodiments are basically similar to the method embodiments, so they are described relatively briefly; for the relevant parts, refer to the description of the method embodiments.

Claims

1. An OFD document printing control method, applied to an image forming apparatus, characterized in that it comprises: obtaining an OFD document, wherein the OFD document includes at least one page of text; parsing the text pages of the OFD document; when the OFD document contains text objects and / or image objects, performing sensitive word detection on the text objects and / or image objects contained in the OFD document according to the user role, wherein the sensitive word detection method is determined according to the user role, with different security levels corresponding to different sensitive word detection methods, and the sensitive word detection methods include: a first detection method that performs individual detection on each of multiple layers; a second detection method that performs cross-detection on all text objects and all image objects in a layer; and a third detection method that performs combined detection on all objects in a layer; when sensitive words are detected in the text objects and / or image objects contained in the OFD document, displaying a prompt screen; and in response to the image forming control command triggered on the prompt screen, performing the corresponding image forming control operation on the OFD document.

2. The printing control method according to claim 1, characterized in that the step of performing corresponding image forming control operations on the OFD document includes: when the image forming control instruction instructs continued execution of the image forming operation, deleting or replacing the sensitive words detected in the OFD document to construct a desensitized OFD document, printing the desensitized OFD document, and recording the job information of the OFD document; or, when the image forming control instruction instructs abandonment of the image forming operation, terminating the current image forming process and deleting the job information of the OFD document.

3. The printing control method according to claim 1, characterized in that, The sensitive word detection of the text objects and / or image objects contained in the OFD document includes: A mapping relationship between preset user roles and sensitive word detection methods is established, and the corresponding sensitive word detection method is determined based on the user role. Sensitive word detection is performed on the text objects and / or image objects contained in the OFD document according to the aforementioned sensitive word detection method; The user roles include administrators and regular users. Each administrator has at least one role definition, and each regular user has at least one role definition.

4. The printing control method according to claim 3, characterized in that, The administrator is used to assign user roles, access logs, and update data via network or direct settings. The data updates include at least one of the following: sensitive word updates, sensitive word detection strategy adjustments, and defining user roles.

5. The printing control method according to claim 4, characterized in that, The methods for assigning user roles or updating data include: burning, batch configuration of network, and / or mirroring of devices within the same network.

6. The printing control method according to claim 4, characterized in that, The adjustment to the sensitive word detection strategy includes: The sensitive word detection method corresponding to the user role is adjusted according to the user role definition. The sensitive word detection method includes a first detection method, a second detection method, and a third detection method. The first detection method is to perform sensitive word detection on text objects and image objects separately. The second detection method is to perform sensitive word detection on text objects and image objects at the same time. The third detection method is to perform sensitive word detection on text objects and image objects at the same time.

7. The printing control method according to claim 2, characterized in that the job information includes: job name, user information, printing time, number of copies printed, and / or sensitive word detection results.

8. The printing control method according to claim 3, characterized in that, The step of performing sensitive word detection on the text objects and / or image objects contained in the OFD document according to the sensitive word detection method includes: The text object is subjected to font encoding sensitive word detection according to the determined sensitive word detection method, and / or, after extracting the text glyphs from the image object, sensitive word detection is performed on the text glyphs.

9. The printing control method according to claim 8, characterized in that, The step of performing font encoding sensitive word detection on the text object, and / or, after extracting the text glyphs from the image object, performing sensitive word detection on the text glyphs, includes: The text object is compared with a preset font code to obtain a first comparison result; and / or Extract the character shapes from the image object, and compare the character shapes with preset character shapes to obtain a second comparison result; Based on the first comparison result and / or the second comparison result, determine whether the text object and / or image object contains sensitive words.

10. The printing control method according to claim 9, characterized in that, The preset font encoding is stored in a first database, which is a font encoding database for the sensitive words; the preset glyphs are stored in a second database, which is a glyph database for the sensitive words.

11. The printing control method according to claim 9, characterized in that, When the sensitive word detection method is the first detection method, determining whether the text object and / or image object contains sensitive words based on the first comparison result and / or the second comparison result includes: Based on the first comparison result, determine whether the text object contains sensitive words; and / or Based on the second comparison result, determine whether the image object contains sensitive words.

12. The printing control method according to claim 9, characterized in that, When the sensitive word detection method is the second detection method, determining whether the text object and / or image object contains sensitive words based on the first comparison result and / or the second comparison result includes: Combine and analyze the first comparison result and / or the second comparison result; Based on the results of the combined analysis, determine whether the text object and / or the image object contains sensitive words.

13. The printing control method according to claim 9, characterized in that, When the sensitive word detection method is the third detection method, determining whether the text object and / or image object contains sensitive words based on the first comparison result and / or the second comparison result includes: Adjust the text order of the first comparison result and / or the second comparison result based on sensitive words; Based on the result of the text order adjustment, determine whether the text object and / or the image object contain sensitive words.

14. The printing control method according to claim 9, characterized in that, The step of comparing the text object with a preset font code to obtain a first comparison result includes: Based on the text object, obtain the collection of text strings of the OFD document; Obtain the font encoding of each text string in the text string set; The font encoding of each text string is compared with the preset font encoding corresponding to each sensitive word to obtain the first comparison result.

15. The printing control method according to claim 14, characterized in that, The step of obtaining the set of text strings of the OFD document based on the text object includes: The text objects are divided according to the text page to which they belong, to obtain the set of text objects corresponding to each text page in the OFD document; Sort the text objects contained in each of the aforementioned text object sets to obtain sorted text object sets; Based on the sorted set of text objects, obtain the text string set of each set of text objects; The text string set of the OFD document is obtained based on each of the aforementioned text string sets.

16. The printing control method according to claim 15, characterized in that, The step of sorting the text objects contained in each of the aforementioned text object sets includes: The text objects in each set of text objects are sorted according to the row and column order of each text object in its respective text page.

17. The printing control method according to claim 16, characterized in that, The step of sorting the text objects in each set of text objects according to their row and column order in their respective text pages includes: Using any set of text objects as the target set of text objects, determine the row number and starting column number of each text object in the target set of text objects in its respective text page; The text objects with the same row number are grouped together, and each group of text objects is arranged in rows according to the row number from smallest to largest. The column order of the text objects in each group is adjusted according to the size of the starting column number of the text objects in each group.

18. The printing control method according to claim 14, characterized in that, After comparing the font encoding of each text string with the preset font encoding corresponding to each sensitive word, the method further includes: When the font code of the text string at the end of the nth line is the same as the first half of the font code of the mth sensitive word, the font code of the text string at the beginning of the (n+1)th line is compared with the second half of the font code of the mth sensitive word. If the font encoding of the text string at the beginning of the (n+1)th line is the same as the latter half of the font encoding of the mth sensitive word, then it is confirmed that there is a font encoding of the text string that is consistent with the font encoding of the sensitive word.

19. The printing control method according to claim 14, characterized in that, After comparing the font encoding of each text string with the preset font encoding corresponding to each sensitive word, the method further includes: When the font encoding of the text string at the end of the text page on page t is the same as the first half of the font encoding of the kth sensitive word, the font encoding of the first text string on page t+1 is compared with the second half of the font encoding of the kth sensitive word. If the font encoding of the first text string on the (t+1)th page is the same as the latter half of the font encoding of the kth sensitive word, then it is confirmed that there is a font encoding of the text string that is consistent with the font encoding of the sensitive word.

20. The printing control method according to claim 9, characterized in that, The step of comparing the character shape with a preset character shape to obtain a second comparison result includes: Based on the character glyphs, obtain the set of glyph strings of the OFD document; Each glyph string in the set of glyph strings is compared with the glyph of each sensitive word to obtain a second comparison result.

21. The printing control method according to claim 20, characterized in that, The process of obtaining the set of glyph strings of the OFD document based on the glyphs includes: The text glyphs are divided according to the text page to which they belong, to obtain the set of text glyphs corresponding to each text page in the OFD document; The glyphs contained in each of the aforementioned glyph sets are sorted to obtain sorted glyph sets; Based on the sorted sets of character glyphs, obtain the set of glyph strings for each set of character glyphs; The set of glyph strings of the OFD document is obtained based on the set of glyph strings of each of the aforementioned glyph sets.

22. The printing control method according to claim 21, characterized in that, The step of sorting the glyphs contained in each of the aforementioned glyph sets includes: The glyphs included in each set of glyphs are sorted according to the image object to which each glyph belongs.

23. The printing control method according to claim 22, characterized in that, The step of sorting the glyphs included in each set of glyphs according to the image object to which each glyph belongs includes: Group the text glyphs belonging to the same image object into a group; The glyphs in each group are sorted according to their positions in the corresponding image objects.

24. An image forming apparatus, characterized in that it comprises: an acquisition unit, used to acquire an OFD document, wherein the OFD document includes at least one page of text; a parsing unit, used to parse the text pages of the OFD document; a detection unit, configured to perform sensitive word detection on the text objects and / or image objects contained in the OFD document according to the user role when the OFD document contains text objects and / or image objects, wherein the sensitive word detection method is determined according to the user role, and different sensitive word detection methods correspond to different security levels of user roles, and the sensitive word detection methods include: a first detection method that performs individual detection on each of multiple layers; a second detection method that performs cross-detection on all text objects in a layer and cross-detection on all image objects in a layer; and a third detection method that performs combined detection on all objects in a layer; an output unit, used to output a prompt screen when sensitive words are detected in the text objects and / or image objects contained in the OFD document; and an execution unit, configured to perform corresponding image forming control operations on the OFD document in response to an image forming control command triggered on the prompt screen.

25. An electronic device, characterized in that it comprises: a processor; and a memory; wherein the memory stores a computer program that, when executed, causes the electronic device to perform the method described in any one of claims 1-23.

26. A computer-readable storage medium, characterized in that, The computer-readable storage medium includes a stored program, wherein, when the program is executed, it controls the device on which the computer-readable storage medium is located to perform the method according to any one of claims 1-23.