Content parsing method and apparatus, device, and storage medium

By responding to preset characters to determine the target parsing unit and dynamically updating intermediate variables during the streaming text parsing process, the problem of low efficiency in business form filling is solved, and efficient parsing of streaming content and rapid generation of structured data are achieved.

WO2026117995A1 · PCT designated stage · Publication Date: 2026-06-11 · BEIJING ZITIAO NETWORK TECH CO LTD

Patent Information

Authority / Receiving Office
WO · WO
Patent Type
Applications
Current Assignee / Owner
BEIJING ZITIAO NETWORK TECH CO LTD
Filing Date
2024-12-04
Publication Date
2026-06-11


Abstract

Embodiments of the present disclosure provide a content parsing method and apparatus, a device, a storage medium, and a program product. The method comprises: acquiring streaming text outputted by a model, the streaming text corresponding to structured data; in response to detecting that the current position of the streaming text corresponds to a preset character, determining a target parsing unit corresponding to the preset character; on the basis of the type of the target parsing unit, initializing an intermediate variable corresponding to a target level, the intermediate variable comprising a key part and a value part, and the target level indicating the position of said current position in the structured data; parsing the streaming text using the target parsing unit to dynamically update the intermediate variable; and in the process of parsing the streaming text, presenting the key part and / or the value part of the intermediate variable on the basis of the target level. Therefore, the embodiments of the present disclosure can support the parsing of structured data outputted in a streaming fashion, without waiting for completion of the output of the structured data, thereby improving the efficiency of content parsing.

Description

Methods, apparatus, devices and storage media for content parsing

Technical Field

[0001] The exemplary embodiments disclosed herein relate generally to the field of computers, and in particular to methods, apparatus, devices, and computer-readable storage media for content parsing.

Background Technology

[0002] Visual editing of business forms is a method for designing and organizing form filling processes through a graphical interface. It allows users to intuitively create, edit, and modify object types and values within business forms. Such business forms need to be filled correctly to generate accurate structured data.

Summary of the Invention

[0003] In a first aspect of this disclosure, a method for content parsing is provided. The method includes: acquiring streaming text output by a model, the streaming text corresponding to structured data; in response to detecting that the current position of the streaming text corresponds to a preset character, determining a target parsing unit corresponding to the preset character; initializing intermediate variables corresponding to a target level based on the type of the target parsing unit, the intermediate variables including a key portion and a value portion, the target level indicating the position of the current position in the structured data; parsing the streaming text using the target parsing unit to dynamically update the intermediate variables; and during the parsing of the streaming text, presenting the key portion and / or value portion of the intermediate variables based on the target level.

[0004] In a second aspect of this disclosure, an apparatus for content parsing is provided. The apparatus includes: an acquisition module configured to acquire streaming text output by a model, the streaming text corresponding to structured data; a first determination module configured to determine a target parsing unit corresponding to a preset character in response to detecting that the current position of the streaming text corresponds to a preset character; an initialization module configured to initialize intermediate variables corresponding to a target level based on the type of the target parsing unit, the intermediate variables including a key portion and a value portion, the target level indicating the position of the current position in the structured data; an update module configured to parse the streaming text using the target parsing unit to dynamically update the intermediate variables; and a presentation module configured to present the key portion and / or value portion of the intermediate variables based on the target level during the parsing of the streaming text.

[0005] In a third aspect of this disclosure, an electronic device is provided. The device includes at least one processing unit; and at least one memory coupled to the at least one processing unit and storing instructions for execution by the at least one processing unit. When executed by the at least one processing unit, the instructions cause the device to perform the method of the first aspect.

[0006] In a fourth aspect of this disclosure, a computer-readable storage medium is provided. The computer-readable storage medium stores a computer program that can be executed by a processor to implement the method of the first aspect.

[0007] In a fifth aspect of this disclosure, a computer program product is provided. The computer program product includes computer-executable instructions that, when executed by a processor, implement the method of the first aspect.

[0008] It should be understood that the content described in this Summary section is not intended to identify key or essential features of the embodiments of this disclosure, nor is it intended to limit the scope of this disclosure. Other features of this disclosure will become readily apparent from the following description.

Attached Figure Description

[0009] The above and other features, advantages, and aspects of the embodiments of this disclosure will become more apparent from the accompanying drawings and the following detailed description. In the drawings, the same or similar reference numerals denote the same or similar elements, wherein:

[0010] Figure 1 shows a schematic diagram of an example environment in which embodiments of the present disclosure can be implemented;

[0011] Figure 2 shows a schematic diagram of an example interactive interface according to some embodiments of the present disclosure;

[0012] Figures 3A to 3G illustrate schematic timing diagrams of the parsing process according to some embodiments of the present disclosure;

[0013] Figure 4 illustrates a flowchart of the content parsing process according to some embodiments of the present disclosure;

[0014] Figure 5 shows a schematic structural block diagram of an apparatus for content parsing according to certain embodiments of the present disclosure;

[0015] Figure 6 shows a block diagram of an electronic device capable of implementing several embodiments of the present disclosure.

Detailed Implementation

[0016] Embodiments of this disclosure will now be described in more detail with reference to the accompanying drawings. While some embodiments of this disclosure are shown in the drawings, it should be understood that this disclosure can be implemented in various forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided to provide a more thorough and complete understanding of this disclosure. It should be understood that the accompanying drawings and embodiments of this disclosure are for illustrative purposes only and are not intended to limit the scope of protection of this disclosure.

[0017] It should be noted that the headings of any section / subsection provided herein are not limiting. Various embodiments are described throughout this document, and embodiments of any type may be included under any section / subsection. Furthermore, embodiments described in any section / subsection may be combined in any way with any other embodiments described in the same section / subsection and / or different sections / subsections.

[0018] In the description of embodiments of this disclosure, the term "comprising" and similar terms should be understood as open-ended inclusion, i.e., "including but not limited to". The term "based on" should be understood as "at least partially based on". The term "one embodiment" or "the embodiment" should be understood as "at least one embodiment". The term "some embodiments" should be understood as "at least some embodiments". The terms "first", "second", etc., may refer to different or the same objects. Other explicit and implicit definitions may also be included below.

[0019] The embodiments of this disclosure may involve user data, data acquisition, and / or use. All of these aspects comply with applicable laws, regulations, and relevant provisions. In the embodiments of this disclosure, all data collection, acquisition, processing, manipulation, forwarding, and use are conducted with the user's knowledge and confirmation. Accordingly, in implementing the embodiments of this disclosure, the type, scope of use, and usage scenarios of any data or information that may be involved should be communicated to the user and their authorization obtained in accordance with relevant laws and regulations through appropriate means. The specific methods of notification and / or authorization may vary depending on the actual situation and application scenario, and the scope of this disclosure is not limited in this respect.

[0020] In this specification and the embodiments, any processing of personal information will be carried out only under the premise of legality (such as obtaining the consent of the personal information subject, or being necessary for the performance of a contract), and will only be carried out within the scope stipulated or agreed upon. A user's refusal to process personal information other than that necessary for basic functions will not affect the user's use of basic functions.

[0021] As described above, electronic devices can generate correct structured data based on correctly filled-in business forms. However, this method requires significant manpower and time, and form filling efficiency is low. Furthermore, this method can output correct structured data only after the structured data for the business form has been fully generated, resulting in long generation times and low efficiency.

[0022] Based on this, embodiments of the present disclosure provide a content parsing scheme. According to this scheme, streaming text output by a model can be obtained, the streaming text corresponding to structured data; further, in response to detecting that the current position of the streaming text corresponds to a preset character, a target parsing unit corresponding to the preset character can be determined; further, based on the type of the target parsing unit, intermediate variables corresponding to the target level can be initialized, the intermediate variables including a key part and a value part, the target level indicating the position of the current position in the structured data; further, the streaming text can be parsed using the target parsing unit to dynamically update the intermediate variables; additionally, during the parsing of the streaming text, based on the target level, the key part and / or value part of the intermediate variables can be presented.

[0023] Based on this approach, embodiments of this disclosure can further parse different types of characters in streaming text using corresponding target parsing units, thereby improving the efficiency of parsing streaming content. Furthermore, in response to the determination of the target level, embodiments of this disclosure can present intermediate variables associated with the target level, thus presenting such intermediate variables without waiting for all streaming text to be parsed, further improving the efficiency of parsing streaming content.

[0024] Therefore, the embodiments of this disclosure can support the parsing of structured data in streaming output without waiting for the structured data output to complete, thereby improving the efficiency of content parsing.

[0025] The following section provides a detailed description of various example implementations of this scheme, with reference to the accompanying drawings.

[0026] Example Environment

[0027] Figure 1 illustrates a schematic diagram of an example environment 100 in which embodiments of the present disclosure can be implemented. As shown in Figure 1, the example environment 100 may include an electronic device 110.

[0028] In this example environment 100, application 120 is installed on electronic device 110. User 140 can interact with application 120 via electronic device 110 and / or its attached devices. Application 120 can be an application that supports business form filling, such as a workflow editing application, or any other suitable application.

[0029] In environment 100 of Figure 1, if application 120 is active, application 120 can provide a presentation interface 150 for user 140.

[0030] In some embodiments, electronic device 110 communicates with server 130 to provide services to application 120. Electronic device 110 can be any type of mobile terminal, fixed terminal, or portable terminal, including mobile phones, desktop computers, laptop computers, notebook computers, netbook computers, tablet computers, media computers, multimedia tablets, personal communication system (PCS) devices, personal navigation devices, personal digital assistants (PDAs), audio / video players, digital cameras / camcorders, positioning devices, television receivers, radio receivers, e-book devices, gaming devices, or any combination thereof, including accessories and peripherals of these devices or any combination thereof. In some embodiments, electronic device 110 can also support any type of user-facing interface (such as "wearable" circuitry).

[0031] Server 130 can be a standalone physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, content delivery networks, and big data and artificial intelligence platforms. Server 130 may include, for example, computing systems / servers such as mainframes, edge computing nodes, computing devices in a cloud environment, etc. Server 130 can provide backend services for applications 120 supported by electronic devices 110.

[0032] A communication connection can be established between server 130 and electronic device 110. This communication connection can be established via wired or wireless means. The communication connection may include, but is not limited to, Bluetooth, mobile network, Universal Serial Bus, and Wi-Fi connections; the embodiments of this disclosure are not limited in this respect. In the embodiments of this disclosure, server 130 and electronic device 110 can achieve signaling interaction through the communication connection between them.

[0033] It should be understood that the structure and function of the various elements in environment 100 are described for illustrative purposes only and do not imply any limitation on the scope of this disclosure.

[0034] The following description will continue with reference to the accompanying drawings, which will provide some exemplary embodiments of this disclosure.

[0035] Example Interaction

[0036] The following description refers to Figure 2, which illustrates an example interaction process according to an embodiment of the present disclosure. Figure 2 shows a schematic diagram of an example interactive interface 200 according to some embodiments of the present disclosure, which may be provided, for example, by the electronic device 110 shown in Figure 1.

[0037] It should be understood that the embodiments of this disclosure can be applied to various types of business form filling scenarios. The following embodiments are described exemplarily using a workflow editing scenario as an example, but are not limited thereto.

[0038] The following description of an example interaction process according to an embodiment of the present disclosure will be made with reference to Figure 2. Figure 2 illustrates an example interface 200 according to some embodiments of the present disclosure, which may be provided, for example, by the electronic device 110 shown in Figure 1.

[0039] As shown in Figure 2, the electronic device 110 can, for example, present an interface 200 (e.g., a workflow editing interface). This interface 200 can present predefined workflows such as aaa, bbb, ccc, ddd, eee, and fff. In response to receiving an editing operation on any predefined workflow, the node diagram of that workflow can be presented, such as workflow aaa or workflow bbb. The workflow editing interface can display a list of node diagrams for switching between multiple workflows, allowing for quick switching. Taking the node diagram 210 of workflow bbb as an example, the editing interface 200 can be used to edit the node diagram 210 of workflow bbb. The node diagram 210 can include node windows corresponding to multiple nodes, such as form components 211, etc.

[0040] In node diagram 210, two related nodes can be connected via a target method, while two unrelated nodes can remain unconnected. The target method can be based on a predetermined line element, or any other suitable connection method. Line elements can be straight lines, polylines, curves, etc., which will not be elaborated upon here.

[0041] In some embodiments, two related nodes can be in a sequential calling relationship. In other embodiments, two related node windows can also represent a parent node and a child node relationship; that is, the two related nodes can have a hierarchical relationship, where the parent node is the level above the child node.

[0042] In some embodiments, in response to receiving a preset operation on a target node in node diagram 210, electronic device 110 may present a test window associated with the target node.

[0043] As examples, the target node can be any appropriate node, such as a process node, etc.

[0044] In some embodiments, the electronic device 110 may, in response to receiving a preset operation, present a node management interface corresponding to the target node. Further, the node management interface may include a canvas component and a test window 220. The canvas component may display the node window of the target node. Taking Figure 2 as an example, the canvas component may include multiple form components, such as form component 211, which includes editable key and value portions. The key portions may include, for example, “HH”, “G”, and “KKK” as shown in Figure 2, and the value portions may include the values corresponding to the key portions.

[0045] In some embodiments, the test window 220 may present code corresponding to the form component 211, such as code containing variables in a structured data format, or code containing variables in other formats. As an example, as shown in Figure 2, the test window 220 may present code corresponding to each instance in the form component 211. Thus, the electronic device 110 can parse such code into a structured data format file for use in subsequent project processes.

[0046] It should be understood that the form components of the workflow shown in the embodiments of this disclosure are merely exemplary, and this disclosure is not intended to limit them. In scenarios where other business forms are filled, such form components may also be form components of other business forms.

[0047] The following embodiments illustrate some ways in which the electronic device 110 may parse such code.

[0048] Example parsing process

[0049] [Correction 31.12.2024 according to Rule 91] For ease of description, the parsing flow 300 according to embodiments of the present disclosure is described below with reference to Figures 3A to 3G. Figures 3A to 3G show schematic timing diagrams of the parsing process according to some embodiments of the present disclosure, and any one or more of processes 300A to 300G can be implemented at the electronic device 110 shown in Figure 1.

[0050] As shown in Figure 3A, the electronic device 110 may include, but is not limited to, a parser 301, a listener 302, an index unit 303, a value parsing unit 304, an object parsing unit 305, an array parsing unit 306, a string parsing unit 307, a Boolean value parsing unit 308, a null value parsing unit 309, and a number parsing unit 310. The electronic device 110 can use the parser 301 to receive (311) streaming text from the model.

[0051] As examples, parser 301 can be deployed in electronic device 110. Such streaming text can be obtained by a model (e.g., a large model) recognizing code in test window 220. Such streaming text includes, for example, string streams, and as examples, can be generated by compiling the node windows in the aforementioned node graph 210. Such models can include large language models, etc.

[0052] In some embodiments, electronic device 110 may initialize (312) the intermediate variables in listener 302 to initial values (e.g., null values) so that listener 302 can be used subsequently to monitor whether the parsing of the current part has been completed. In response to the completion of parsing of such intermediate variables, electronic device 110 may generate a file in a structured data format corresponding to the code in the current scope, i.e., the intermediate variables.

[0053] Furthermore, the electronic device 110 can use the value parsing unit 304 to parse the received streaming text (313). As some examples, the electronic device 110 can use the value parsing unit 304 to perform parsing character by character sequentially.

[0054] In some embodiments, the electronic device 110 can use a pointer to traverse to the position index of the character to be parsed. Based on such position index, the electronic device 110 can determine the length of the text it has parsed, such as the number of characters. Therefore, the electronic device 110 can parse the text corresponding to the position index if the parsed text does not exceed a preset text length, and pause the current parsing process if the parsed text exceeds the preset text length, to avoid malfunctions such as stuttering that would reduce the parsing efficiency of the electronic device 110.

[0055] In some embodiments, as shown in A1 of Figure 3A, the electronic device 110 can update (314) such a position index to parse the next text in the code. Thus, the electronic device 110 can determine (315) whether the parsed text is too long, for example, exceeding the length of the streaming text. If the parsed text exceeds the length of the streaming text, that is, the position index exceeds the length of the streaming text, the electronic device 110 can execute an interrupt. Therefore, the electronic device 110 can control the length of the text parsed in a single step.

[0056] As examples, an interruption may include a task that interrupts the parsing process of the target parsing unit, thereby preventing a failure in the parsing process. During such an interruption, the electronic device 110 may retain the context information associated with the target parsing unit.
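The position index check and the interruption described above can be sketched as follows. This is a minimal illustration only, not the disclosed implementation: the names `IndexUnit`, `NeedMoreText`, and `read_available` are hypothetical, and the sketch assumes that an interruption simply stops consumption while the buffer and position index are retained as context.

```python
class NeedMoreText(Exception):
    """Raised when the position index runs past the text received so far."""


class IndexUnit:
    """Hypothetical stand-in for index unit 303: a buffer plus a position index."""

    def __init__(self):
        self.buffer = ""
        self.position = 0  # retained across interruptions

    def feed(self, chunk):
        # Append newly received streaming text; the position index is untouched.
        self.buffer += chunk

    def peek(self):
        # Compare the position index with the received length before reading.
        if self.position >= len(self.buffer):
            raise NeedMoreText(self.position)
        return self.buffer[self.position]

    def advance(self):
        self.position += 1


def read_available(index_unit):
    """Consume characters one by one until the received text is exhausted."""
    consumed = []
    try:
        while True:
            consumed.append(index_unit.peek())
            index_unit.advance()
    except NeedMoreText:
        pass  # interruption: the buffer and position index are kept as context
    return "".join(consumed)


unit = IndexUnit()
unit.feed('{"na')
first = read_available(unit)   # stops at the end of the first chunk
unit.feed('me": 1}')
second = read_available(unit)  # resumes from the retained position index
```

Because the position index survives the interruption, the second call continues exactly where the first one stopped rather than re-reading the earlier chunk.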

[0057] In some embodiments, in response to detecting that the current position of the streaming text corresponds to a preset character, the electronic device 110 can determine a target parsing unit corresponding to the preset character. As examples, the preset character may include “{”, “}”, “t”, “f”, “n”, “[”, “]”, etc. The target parsing unit may be an index unit 303, a value parsing unit 304, an object parsing unit 305, an array parsing unit 306, a string parsing unit 307, a Boolean value parsing unit 308, a null value parsing unit 309, and / or a number parsing unit 310.
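The selection of a target parsing unit from a preset character can be sketched as a lookup table. The table below is an assumption made for illustration: the function name `select_parsing_unit` and the unit labels are hypothetical, the double quote is assumed to open a string, digits and a minus sign are assumed to open a number, and closing characters such as “}” and “]” are treated as ending the current unit rather than selecting a new one.

```python
import string

# Hypothetical mapping from a leading character to a parsing unit label.
PRESET_CHARACTERS = {
    "{": "object",   # object parsing unit 305
    "[": "array",    # array parsing unit 306
    '"': "string",   # string parsing unit 307 (assumed trigger character)
    "t": "boolean",  # Boolean value parsing unit 308
    "f": "boolean",
    "n": "null",     # null value parsing unit 309
}


def select_parsing_unit(char):
    """Return the label of the parsing unit for char, or None if there is none."""
    if char in PRESET_CHARACTERS:
        return PRESET_CHARACTERS[char]
    if char == "-" or char in string.digits:
        return "number"  # number parsing unit 310 (assumed trigger characters)
    return None
```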

[0058] In some embodiments, as shown in A2 of Figure 3A, the character to be parsed at the current position of the streaming text can be “{”, and such “{” can be the beginning of an object’s content. Thus, after receiving the character “{” and before receiving the character “}”, the electronic device 110 can determine (316) that the received content is an object.

[0059] During the process of receiving such object content, the electronic device 110 can maintain such an object using the object parsing unit 305 corresponding to the object.

[0060] Furthermore, the electronic device 110 can parse the text in the object. To ensure accurate generation of a structured data format file, the electronic device 110 can define the key and value parts separately to meet the structured data format requirements.

[0061] In some embodiments, the electronic device 110 may determine (318) whether the text to be parsed at the next position is a key part.

[0062] In some embodiments, as shown at A3 in Figure 3A, in response to determining that the text to be parsed at the next position is a key portion, the electronic device 110 can parse such a key portion. Specifically, the electronic device 110 can use the object parsing unit 305 to obtain string text from the array parsing unit 306 to determine whether the character to be parsed is a double quote.

[0063] In some embodiments, as shown in A4 of Figure 3A, the electronic device 110 may continue to update (321) the position index in the index unit 303 and continue to determine (322) whether the text is too long, so that the parsing process proceeds normally and efficiently on the electronic device 110.

[0064] In some embodiments, in response to the character to be parsed being a double quote, the electronic device 110 can set (323) the value portion corresponding to such a key portion to the initial value (e.g., empty). Further, the electronic device 110 can monitor whether the parsing of the current portion has been completed. In response to the completion of the parsing of the current portion, the electronic device 110 can define (324) the value portion corresponding to such a key portion based on the path. Thus, the electronic device 110 can use the listener to return (325) the intermediate variable corresponding to the currently parsed text.

[0065] In some embodiments, the path may include the path corresponding to the target level. The target level may indicate the position of the current location in the structured data. Specifically, the electronic device 110 may determine such a target level based on the path from the root node of the structured data to the target node corresponding to the current location.
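One way to picture the relationship between the path and the target level is shown below. The representation is an assumption made for illustration: a path is modeled as a tuple of object keys (strings) and array indices (integers) from the root node, the target level is taken to be the number of steps in that tuple, and `target_level` and `format_path` are hypothetical helper names.

```python
def target_level(path):
    """Depth of the current position: the number of steps from the root node."""
    return len(path)


def format_path(path):
    """Render a path for display, with '$' standing for the root node."""
    parts = ["$"]
    for step in path:
        # Integers are array indices; anything else is treated as an object key.
        parts.append(f"[{step}]" if isinstance(step, int) else f".{step}")
    return "".join(parts)


# Hypothetical path: key "nodes", then array element 0, then key "name".
example_path = ("nodes", 0, "name")
level = target_level(example_path)
rendered = format_path(example_path)
```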

[0066] In some embodiments, the electronic device 110 can parse the value portion corresponding to such a key portion. As some examples, such a value portion can include various types, such as "{", "}", Boolean values, etc. Thus, the electronic device 110 can use the value parsing unit 304 to recursively call (326) the above parsing process (e.g., steps 316 to 319).

[0067] In some embodiments, as shown in A5 of Figure 3A, the electronic device 110 may continue to update (327) the position index and continue to determine (329) whether the text is too long, so that the parsing process proceeds normally and efficiently on the electronic device 110.

[0068] In some embodiments, the electronic device 110 may use the value parsing unit 304 to set (330) the value of the current key portion as the return value and pass it to the object parsing unit 305.

[0069] Furthermore, the electronic device 110 can monitor whether the parsing of the current part has been completed. In response to the completion of the parsing of the current part, the electronic device 110 can define (331) such a value based on the path. Thus, the electronic device 110 can use the listener to return (332) the intermediate variable corresponding to the currently parsed text.
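The handling of an intermediate variable by the listener can be sketched as follows. This is a simplified illustration under stated assumptions: the class name `Listener` and its method names are hypothetical, each intermediate variable is stored under its path, "presenting" is modeled as appending a snapshot to a list, and the value "abc" is an invented placeholder. The key "HH" is borrowed from the form component shown in Figure 2.

```python
class Listener:
    """Hypothetical stand-in for listener 302."""

    def __init__(self):
        self.intermediate = {}  # path -> {"key": ..., "value": ...}
        self.presented = []     # snapshots shown so far: (level, key, value)

    def key_parsed(self, path, key):
        # The value portion starts at its initial (empty) value.
        self.intermediate[path] = {"key": key, "value": None}
        self._present(path)

    def value_parsed(self, path, value):
        # Define the value portion at the location given by the path.
        self.intermediate[path]["value"] = value
        self._present(path)

    def _present(self, path):
        entry = self.intermediate[path]
        # The target level is modeled here as the length of the path.
        self.presented.append((len(path), entry["key"], entry["value"]))


listener = Listener()
listener.key_parsed(("HH",), "HH")         # key is shown before its value arrives
listener.value_parsed(("HH",), "abc")      # value is filled in once it is parsed
```

In this sketch the key portion is presented as soon as it is parsed, and the same entry is presented again once its value portion is defined, which mirrors presenting intermediate variables without waiting for the whole stream.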

[0070] In some embodiments, as shown in Figure 3B, the character to be parsed at the current position of the streaming text can be “[”. Such “[” can be the beginning of array content. Thus, after receiving the character “[” and before receiving the character “]”, the electronic device 110 can determine (333) that the received content is array content. During the reception of such array content, the electronic device 110 can use the array parsing unit 306 to maintain (334) such an array.

[0071] Furthermore, the electronic device 110 can use the value parsing unit 304 to recursively call (335) the above-described parsing process (e.g., steps 316 to 319) to parse the text in the array. Thus, the electronic device 110 can maintain (336) the results parsed by the value parsing unit 304 in such an array.

[0072] In some embodiments, the electronic device 110 may determine (337) that the character to be parsed at the current position of the streaming text is “]”. Based on this, the electronic device 110 may determine that such an array has been parsed to the end point, and the electronic device 110 may return (338) such an array to the value parsing unit 304, thereby completing the parsing of such an array.

[0073] In some embodiments, as shown in A6 of Figure 3B, the electronic device 110 may continue to update (339) the position index in the index unit 303 and continue to determine (340) whether the text is too long, so that the parsing process proceeds normally and efficiently on the electronic device 110.

[0074] In some embodiments, the electronic device 110 may continue to monitor whether the parsing of the current part has been completed. In response to the completion of the parsing of the current part, the electronic device 110 may define (341) such an array based on the path. Thus, the electronic device 110 may use the listener to return (342) the intermediate variable corresponding to the currently parsed text.
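The recursive handling of array content can be sketched as below. The sketch makes several simplifying assumptions that are not taken from the disclosure: it operates on text that has already been fully received, it supports only nested arrays and the literals true, false, and null, it skips commas and spaces permissively, and the names `parse_value`, `parse_array`, and `on_item` are hypothetical.

```python
def parse_value(text, pos, path, on_item):
    """Parse an array, true, false, or null starting at text[pos]."""
    if text[pos] == "[":
        return parse_array(text, pos, path, on_item)
    for literal, value in (("true", True), ("false", False), ("null", None)):
        if text.startswith(literal, pos):
            return value, pos + len(literal)
    raise ValueError(f"unexpected character at position {pos}")


def parse_array(text, pos, path, on_item):
    """Parse array content between '[' and ']', notifying on_item per element."""
    items = []
    pos += 1  # skip the opening "["
    while text[pos] != "]":
        if text[pos] in ", ":
            pos += 1  # separators are skipped without validation in this sketch
            continue
        item_path = path + (len(items),)  # the array index extends the path
        value, pos = parse_value(text, pos, item_path, on_item)
        items.append(value)               # maintain the parsed result in the array
        on_item(item_path, value)         # report the element with its path
    return items, pos + 1  # skip the closing "]"


seen = []
result, end = parse_array(
    "[true, [null, false]]", 0, (), lambda path, value: seen.append((path, value))
)
```

Each element is reported together with its path as soon as it is complete, so inner elements of a nested array are reported before the nested array itself.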

[0075] In some embodiments, as shown in Figure 3C, the character to be parsed at the current position of the streaming text can be a space character. In response to the character to be parsed being a space character, the electronic device 110 can determine that the characters following that character are a text string. Thus, the electronic device 110 can determine (343) that all text up to the next space character is content of the text string. Further, the electronic device 110 can use the string parsing unit 307 to parse the content of such a text string.

[0076] In some embodiments, as shown in A7 of Figure 3C, the electronic device 110 may continue to update (344) the position index in the index unit 303 and continue to determine (345) whether the text is too long, so that the parsing process proceeds normally and efficiently on the electronic device 110.

[0077] In some embodiments, in response to the text length not reaching a threshold, the electronic device 110 may use the string parsing unit 307 to return such a text string (346) to the value parsing unit 304.

[0078] In some embodiments, the electronic device 110 may continue to monitor whether the parsing of the current part has been completed. In response to the completion of the parsing of the current part, the electronic device 110 may define (347) such a text string based on the path. Thus, the electronic device 110 may use the listener to return (348) the intermediate variable corresponding to the currently parsed text.

[0079] In some embodiments, as shown in FIG3D, the character to be parsed at the current position of the streaming text can be "t" or "f". Such "t" can be the beginning part of "true", and such "f" can be the beginning part of "false". Thus, after receiving the character "t" or "f", the electronic device 110 can determine (349) that the received content is "true" or "false", and use the Boolean value parsing unit 308 to parse such a Boolean value.

[0080] In some embodiments, as shown in A8 of FIG3D, the electronic device 110 may continue to update (351) the position index to the index unit 303 and continue to determine (352) whether the text is too long, so as to ensure the normal execution of the parsing process of the electronic device and improve the efficiency of the parsing process of the electronic device 110.

[0081] In some embodiments, in response to the text length not reaching a threshold, the electronic device 110 may use the Boolean value parsing unit 308 to return such "true" or "false" (353) to the value parsing unit 304.

[0082] In some embodiments, the electronic device 110 may continue to listen for whether the parsing of the current part has been completed. In response to the completion of the parsing of the current part, the electronic device 110 may define a value such as (354) based on the path. Thus, the electronic device 110 may use the listener to return (355) the intermediate variable corresponding to the currently parsed text.

[0083] In some embodiments, as shown in FIG3E, the character to be parsed at the current position of the streaming text can be "n". Such "n" can be the beginning part of "null". Thus, after receiving the character "n", the electronic device 110 can determine (357) that the received content is "null" and use the null value parsing unit 309 to parse such a null value.

[0084] In some embodiments, as shown in A9 of FIG3E, the electronic device 110 may continue to update (358) the position index to the index unit 303 and continue to determine (359) whether the text is too long, so as to ensure the normal execution of the electronic device parsing process and improve the efficiency of the electronic device 110 parsing process.

[0085] In some embodiments, in response to the text length not reaching a threshold, the electronic device 110 may use the null value parsing unit 309 to return such a "null" (360) to the value parsing unit 304.

[0086] In some embodiments, the electronic device 110 may continue to listen for whether the parsing of the current part has been completed. In response to the completion of the parsing of the current part, the electronic device 110 may define a value such as (361) based on the path. Thus, the electronic device 110 may use the listener to return (362) the intermediate variable corresponding to the currently parsed text.
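
As a non-limiting illustration of the Boolean value and null value parsing described above, the following Python sketch matches a literal against the text received so far and signals when the stream has not yet delivered the whole literal. The names `LITERALS`, `NeedMoreText` and `parse_literal` are hypothetical and do not appear in the disclosure.

```python
from typing import Tuple

# Hypothetical table: first character -> (full literal, parsed value).
LITERALS = {"t": ("true", True), "f": ("false", False), "n": ("null", None)}


class NeedMoreText(Exception):
    """The buffered text ends before the literal is complete."""


def parse_literal(text: str, index: int) -> Tuple[object, int]:
    """Parse "true", "false" or "null" starting at text[index].

    Returns (value, next_index). Raises NeedMoreText when the stream has not
    yet delivered the whole literal, and ValueError on a mismatch.
    """
    word, value = LITERALS[text[index]]
    available = text[index:index + len(word)]
    if not word.startswith(available):
        raise ValueError(f"expected {word!r} at position {index}")
    if len(available) < len(word):
        raise NeedMoreText(word)
    return value, index + len(word)
```

In this sketch the first character alone selects the expected literal, which mirrors the determination made after receiving "t", "f" or "n"; the remaining characters are only checked for consistency.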

[0087] In some embodiments, as shown in FIG3F, the character to be parsed at the current position of the streaming text can be of numeric type. As some examples, numeric text may include text other than the types of text described above. Thus, the electronic device 110 can determine (363) that the received content is numeric. Further, the electronic device 110 can use the numeric parsing unit 310 to parse such numeric characters.

[0088] In some embodiments, as shown in A10 of FIG3F, the electronic device 110 may continue to update (364) the position index to the index unit 303 and continue to determine (365) whether the text is too long, so as to ensure the normal execution of the parsing process of the electronic device and improve the efficiency of the parsing process of the electronic device 110.

[0089] In some embodiments, in response to the text length not reaching a threshold, the electronic device 110 may use the numeric parsing unit 310 to return such a number (366) to the value parsing unit 304.

[0090] In some embodiments, the electronic device 110 may continue to listen for whether the parsing of the current part has been completed. In response to the completion of the parsing of the current part, the electronic device 110 may define a number such as (367) based on the path. Thus, the electronic device 110 may use the listener to return (368) the intermediate variable corresponding to the currently parsed text.
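
The numeric case differs from the literal cases in that the end of a number is only known once a following character arrives. A minimal Python sketch of this behavior is given below; the function name `parse_number`, the accepted character set and the return convention are assumptions made for illustration only.

```python
from typing import Optional, Tuple, Union

# Characters that may appear inside a JSON-style number (an assumption here).
NUMBER_CHARS = set("+-0123456789.eE")


def parse_number(text: str, index: int) -> Optional[Tuple[Union[int, float], int]]:
    """Parse a number starting at text[index].

    Returns (value, next_index) once a terminating character has been seen,
    or None when the buffer ends while the number may still be growing.
    """
    end = index
    while end < len(text) and text[end] in NUMBER_CHARS:
        end += 1
    if end == len(text):
        return None  # e.g. "12" may still become "123": wait for more text
    raw = text[index:end]
    value = float(raw) if any(c in raw for c in ".eE") else int(raw)
    return value, end
```

Returning `None` here plays the role of an interruption: the caller keeps the start index and calls the function again after more streaming text has arrived.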

[0091] In this way, electronic device 110 can generate and return intermediate structured data in a complete structured format corresponding to the currently parsed portion of the streaming text.

[0092] In some embodiments, as shown in FIG3G, the electronic device 110 can use the index unit 303 to determine the interrupted parsing unit in the electronic device 110. Specifically, the electronic device 110 can obtain (370) new text from the large model. Further, the electronic device 110 can use the parser 301 to obtain (371) the enumeration of the current index subscript from the index unit 303. The electronic device 110 can then use the parser 301 to determine whether the pointer corresponding to the position index of the current index subscript points to text, that is, whether there is a value at the pointer position. In this way, the interrupted parsing unit existing in the electronic device 110 is determined.

[0093] Therefore, when the pointer position has a value, the electronic device 110 can resume the parsing process (373) from the position of the last interruption, that is, resume the interrupted parsing unit. When the pointer position does not have a value, the electronic device 110 can continue to execute the waiting process (374).
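
The resume-or-wait decision described above can be sketched as follows. This is only an illustration: the class `IndexUnit` is a hypothetical stand-in for the index unit 303, and the string return values are chosen for readability rather than taken from the disclosure.

```python
from dataclasses import dataclass


@dataclass
class IndexUnit:
    """Hypothetical stand-in for index unit 303."""
    position: int = 0                 # position index into the buffered text
    interrupted_unit: str = "string"  # name of the suspended parsing unit


def on_new_text(buffer: str, index_unit: IndexUnit) -> str:
    """Decide what to do after new text from the model is appended to buffer."""
    has_value = index_unit.position < len(buffer)  # does the pointer point to text?
    if has_value:
        return f"resume {index_unit.interrupted_unit} at {index_unit.position}"
    return "wait"
```

For example, with a position index of 5, the buffer `{"a":` (five characters) yields "wait", whereas the longer buffer `{"a":"x` yields a resume at position 5.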

[0094] Based on this approach, embodiments of this disclosure can further parse different types of characters in streaming text using corresponding target parsing units, thereby improving the efficiency of parsing streaming content. Furthermore, in response to the determination of the target level, embodiments of this disclosure can present intermediate variables associated with the target level, thus presenting such intermediate variables without waiting for all streaming text to be parsed, further improving the efficiency of parsing streaming content.

[0095] The embodiments of this disclosure can support the parsing of structured data in streaming output without waiting for the structured data output to complete, thereby improving the efficiency of content parsing.

[0096] Example process

[0097] Figure 4 shows a flowchart of an example process 400 for content parsing according to some embodiments of the present disclosure. Process 400 can be implemented at electronic device 110. Process 400 is described below with reference to Figure 1.

[0098] At block 410, electronic device 110 acquires streaming text output by the model, the streaming text corresponding to structured data.

[0099] At block 420, electronic device 110, in response to detecting that the current position of the streaming text corresponds to a preset character, determines the target parsing unit corresponding to the preset character.

[0100] At block 430, electronic device 110 initializes an intermediate variable corresponding to the target level based on the type of the target parsing unit. The intermediate variable includes a key part and a value part. The target level indicates where the current position lies in the structured data.

[0101] At block 440, electronic device 110 uses the target parsing unit to parse the streaming text in order to dynamically update the intermediate variable.

[0102] At block 450, during the parsing of the streaming text, electronic device 110 presents the key part and / or the value part of the intermediate variable based on the target level.
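
For orientation only, the five steps of process 400 can be sketched in Python for a deliberately narrow case: a flat object whose keys and values are double-quoted strings without escape sequences. The function name `stream_flat_object`, the state names and the use of a double quote as the preset character are assumptions of this sketch; it uses a single state machine rather than separate parsing units.

```python
from typing import Iterable, Iterator, List, Tuple


def stream_flat_object(chunks: Iterable[str]) -> Iterator[Tuple[str, str, bool]]:
    """Yield (key, value_so_far, done) while a flat object is still streaming."""
    key: List[str] = []
    value: List[str] = []
    state = "seek_key"
    for chunk in chunks:                      # acquire streaming text
        for ch in chunk:
            if state == "seek_key":
                if ch == '"':                 # preset character detected
                    key, value = [], []       # initialize the intermediate variable
                    state = "in_key"
            elif state == "in_key":
                if ch == '"':
                    state = "seek_value"
                else:
                    key.append(ch)
            elif state == "seek_value":
                if ch == '"':
                    state = "in_value"
            elif state == "in_value":
                if ch == '"':
                    state = "seek_key"
                    yield "".join(key), "".join(value), True
                else:
                    value.append(ch)          # dynamically update the value part
        if state == "in_value":               # present the partial value
            yield "".join(key), "".join(value), False
```

Feeding the chunks `{"na`, `me": "Al`, `ice", "city": "Par`, `is"}` produces the pair ("name", "Al") before "Alice" is complete, which is the behavior the process aims for: the key part and value part become available without waiting for the end of the output.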

[0103] In some embodiments, process 400 further includes: determining a target level based on the path from the root node of the structured data to the target node corresponding to the current position.
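
One possible reading of this step, given purely as an illustration, represents the path as a list of object keys and array indices and takes the target level to be the length of that list. The helper names and the `$`-prefixed label format below are hypothetical.

```python
from typing import List, Union

PathSegment = Union[str, int]  # object key or array index


def target_level(path: List[PathSegment]) -> int:
    """Depth of the target node; the root has an empty path and level 0."""
    return len(path)


def path_label(path: List[PathSegment]) -> str:
    """Human-readable path such as $.user.tags[0]."""
    label = "$"
    for segment in path:
        label += f"[{segment}]" if isinstance(segment, int) else f".{segment}"
    return label
```

Under this assumption the path `["user", "tags", 0]` has level 3 and the label `$.user.tags[0]`.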

[0104] In some embodiments, initializing intermediate variables corresponding to the target level based on the type of the target parsing unit includes: in response to the type indicating a corresponding value type, initializing the intermediate variable corresponding to the target level so as to set an initial value of the value part based on the value type.
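
A minimal sketch of such type-dependent initialization is shown below. The particular initial values (an empty dictionary for an object, an empty list for an array, an empty string, and so on) and the dictionary layout of the intermediate variable are assumptions for illustration; the disclosure does not fix them.

```python
from typing import Any, Callable, Dict

# Hypothetical factories producing a fresh initial value per value type.
INITIAL_VALUES: Dict[str, Callable[[], Any]] = {
    "object": dict,
    "array": list,
    "string": str,
    "number": lambda: 0,
    "boolean": lambda: False,
    "null": lambda: None,
}


def init_intermediate(key: str, value_type: str, level: int) -> Dict[str, Any]:
    """Create an intermediate variable with a key part and an initial value part."""
    return {"key": key, "value": INITIAL_VALUES[value_type](), "level": level}
```

Factories are used instead of shared constants so that two array-typed intermediate variables never share the same list object.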

[0105] In some embodiments, presenting the key part and / or the value part of an intermediate variable based on the target level includes: in a form component, presenting a form item corresponding to the target level, the form item displaying the key part and the value part.
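
As a simple text-only stand-in for a form component, the following sketch renders one line per form item and indents it by the target level. The function names and the two-space indentation are hypothetical choices; an actual form component would typically be a user interface element.

```python
from typing import Iterable, Tuple


def render_form_item(key: str, value: object, level: int, indent: str = "  ") -> str:
    """One text line per form item, indented by the target level."""
    return f"{indent * level}{key}: {value}"


def render_form(items: Iterable[Tuple[str, object, int]]) -> str:
    """Render (key, value, level) triples as a multi-line form."""
    return "\n".join(render_form_item(k, v, lvl) for k, v, lvl in items)
```

Calling `render_form` again each time an intermediate variable is updated would refresh the displayed value part while the text is still streaming.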

[0106] In some embodiments, parsing streaming text using a target parsing unit to dynamically update intermediate variables includes: determining whether the position index parsed by the target parsing unit exceeds the length of the streaming text to be processed; and interrupting the target parsing unit in response to determining that the position index exceeds the length.

[0107] In some embodiments, context information associated with the target parsing unit is retained during the interruption of the target parsing unit.

[0108] In some embodiments, parsing streaming text using a target parsing unit to dynamically update intermediate variables includes: determining whether there is an interrupted parsing unit in response to obtaining updated content of the streaming text; resuming the target parsing unit in response to the presence of an interrupted target parsing unit; and parsing the updated content of the streaming text from the position index using the target parsing unit.

[0109] In some embodiments, parsing streaming text using a target parsing unit to dynamically update intermediate variables includes: in response to determining that a position index has not exceeded its length, parsing the next position in the streaming text using the target parsing unit.
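
The interruption, context retention and resumption described above map naturally onto a Python generator, whose local variables survive between suspensions. The sketch below is an assumption-laden illustration: `string_unit` is a hypothetical name, the delimiter defaults to a double quote, and escape sequences are not handled.

```python
from typing import Generator, List, Optional


def string_unit(text: str, start: int,
                delimiter: str = '"') -> Generator[str, Optional[str], str]:
    """Parse a delimited string whose opening delimiter is at text[start].

    Yields the partial value whenever the position index passes the end of
    the text (the unit is interrupted); send() supplies updated content.
    """
    index = start + 1
    chars: List[str] = []
    while True:
        if index >= len(text):             # position index exceeds the length
            more = yield "".join(chars)    # interrupt; index and chars are retained
            text += more or ""             # resumed with the updated content
            continue
        ch = text[index]
        if ch == delimiter:
            return "".join(chars)          # parsing of this part is complete
        chars.append(ch)                   # otherwise parse the next position
        index += 1
```

For instance, starting the unit on `{"name": "Al` at position 9 first yields "Al"; sending `ic` yields "Alic"; sending `e"}` finishes the unit with the value "Alice", carried by `StopIteration.value`.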

[0110] In some embodiments, the target parsing unit includes one of the following: an object parsing unit corresponding to the first character; an array parsing unit corresponding to the second character; a string parsing unit corresponding to the third character; a Boolean parsing unit corresponding to the fourth character; a null value parsing unit corresponding to the fifth character; or a numeric parsing unit corresponding to the sixth character.
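
The correspondence between preset characters and parsing units can be sketched as a lookup table. The characters "t", "f" and "n" follow the description above; the characters chosen here for objects, arrays and strings follow JSON conventions and are assumptions of this sketch, as are the names `PRESET_CHARACTERS` and `select_unit`.

```python
from typing import Optional

# Hypothetical table of preset characters and the unit each one selects.
PRESET_CHARACTERS = {
    "{": "object",
    "[": "array",
    '"': "string",
    "t": "boolean",
    "f": "boolean",
    "n": "null",
}


def select_unit(ch: str) -> Optional[str]:
    """Return the name of the parsing unit selected by ch, if any."""
    if ch in PRESET_CHARACTERS:
        return PRESET_CHARACTERS[ch]
    if ch.isdigit() or ch == "-":
        return "numeric"
    return None  # separators and whitespace select no unit in this sketch
```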

[0111] In some embodiments, process 400 further includes: in response to the existing content of the streaming text corresponding to a portion of the structured data, generating intermediate structured data corresponding to the complete structured format based on the parsing results of the streaming text.
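
One simple way to obtain intermediate structured data in a complete structured format, shown here only as an illustration and not as the disclosed implementation, is to close whatever strings, arrays and objects are still open in the text received so far. The sketch assumes JSON and only repairs truncation inside a string value or directly after a complete value; a dangling key, colon, comma, partial literal or truncated `\u` escape is not handled.

```python
import json
from typing import List


def complete_partial_json(text: str) -> str:
    """Close any open string, array or object so that text parses as JSON."""
    closers: List[str] = []   # stack of pending "}" / "]"
    in_string = False
    escaped = False
    for ch in text:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == "{":
            closers.append("}")
        elif ch == "[":
            closers.append("]")
        elif ch in "}]":
            closers.pop()
    if escaped:
        text = text[:-1]      # drop a dangling backslash
    if in_string:
        text += '"'
    return text + "".join(reversed(closers))
```

With this helper, `json.loads(complete_partial_json('{"name": "Al'))` returns a dictionary whose "name" entry holds the partial value "Al".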

[0112] Example devices and equipment

[0113] Embodiments of this disclosure also provide corresponding apparatus for implementing the methods or processes described above. FIG5 shows a schematic structural block diagram of an apparatus 500 for content parsing according to certain embodiments of this disclosure. Apparatus 500 may be implemented as or included in the electronic device 110 discussed above. The various modules / components in apparatus 500 may be implemented by hardware, software, firmware, or any combination thereof.

[0114] As shown in Figure 5, the apparatus 500 includes an acquisition module 510 configured to acquire streaming text output by a model, the streaming text corresponding to structured data; a first determination module 520 configured to determine a target parsing unit corresponding to a preset character in response to detecting that the current position of the streaming text corresponds to the preset character; an initialization module 530 configured to initialize an intermediate variable corresponding to a target level based on the type of the target parsing unit, the intermediate variable including a key part and a value part, the target level indicating the position of the current position in the structured data; an update module 540 configured to parse the streaming text using the target parsing unit to dynamically update the intermediate variable; and a presentation module 550 configured to present the key part and / or the value part of the intermediate variable based on the target level during the parsing of the streaming text.

[0115] In some embodiments, the apparatus 500 further includes a second determining module configured to determine a target level based on the path from the root node of the structured data to the target node corresponding to the current position.

[0116] In some embodiments, the initialization module 530 is further configured to: in response to the type indicating a corresponding value type, initialize the intermediate variable corresponding to the target level so as to set an initial value of the value part based on the value type.

[0117] In some embodiments, the presentation module 550 is further configured to: present form items corresponding to the target level in the form component, wherein the form items display key portions and value portions.

[0118] In some embodiments, the update module 540 is further configured to: determine whether the position index parsed by the target parsing unit exceeds the length of the streaming text to be processed; and interrupt the target parsing unit in response to determining that the position index exceeds the length.

[0119] In some embodiments, context information associated with the target parsing unit is retained during the interruption of the target parsing unit.

[0120] In some embodiments, the update module 540 is further configured to: determine whether there is an interrupted parsing unit in response to obtaining the updated content of the streaming text; resume the target parsing unit in response to the existence of an interrupted target parsing unit; and parse the updated content of the streaming text from the position index using the target parsing unit.

[0121] In some embodiments, the update module 540 is further configured to: in response to determining that the position index has not exceeded the length, parse the next position in the streaming text using the target parsing unit.

[0122] In some embodiments, the target parsing unit includes one of the following: an object parsing unit corresponding to the first character; an array parsing unit corresponding to the second character; a string parsing unit corresponding to the third character; a Boolean parsing unit corresponding to the fourth character; a null value parsing unit corresponding to the fifth character; or a numeric parsing unit corresponding to the sixth character.

[0123] [Correction 31.12.2024 according to Rule 91] In some embodiments, the apparatus 500 further includes a generation module configured to: in response to the existing content of the streaming text corresponding to a portion of the structured data, generate intermediate structured data corresponding to the complete structured format based on the parsing results of the streaming text.

[0124] The modules included in apparatus 500 can be implemented in various ways, including software, hardware, firmware, or any combination thereof. In some embodiments, one or more modules may be implemented using software and / or firmware, such as machine-executable instructions stored on a storage medium. In addition to or as an alternative to machine-executable instructions, some or all of the modules in apparatus 500 may be implemented at least partially by one or more hardware logic components. By way of example and not limitation, exemplary types of hardware logic components that may be used include field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), systems-on-chips (SoCs), complex programmable logic devices (CPLDs), and so on.

[0125] [Revised according to Rule 91, 31.12.2024] FIG6 illustrates a block diagram of an electronic device 600 in which one or more embodiments of the present disclosure may be implemented. It should be understood that the electronic device 600 shown in FIG6 is merely exemplary and should not constitute any limitation on the functionality and scope of the embodiments described herein. The electronic device 600 shown in FIG6 may be used to implement the electronic device 110 of FIG1 or the apparatus 500 of FIG5.

[0126] As shown in Figure 6, the electronic device 600 is in the form of a general-purpose electronic device. Components of the electronic device 600 may include, but are not limited to, one or more processors or processing units 610, memory 620, storage device 630, one or more communication units 640, one or more input devices 650, and one or more output devices 660. The processing unit 610 may be a physical or virtual processor and is capable of performing various processes according to programs stored in memory 620. In a multiprocessor system, multiple processing units execute computer-executable instructions in parallel to improve the parallel processing capability of the electronic device 600.

[0127] Electronic device 600 typically includes multiple computer storage media. Such media can be any available media accessible to electronic device 600, including but not limited to volatile and non-volatile media, removable and non-removable media. Memory 620 can be volatile memory (e.g., registers, cache, random access memory (RAM)), non-volatile memory (e.g., read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory), or some combination thereof. Storage device 630 can be removable or non-removable media and can include machine-readable media, such as flash drives, disks, or any other media capable of storing information and / or data and accessible within electronic device 600.

[0128] Electronic device 600 may further include additional removable / non-removable, volatile / non-volatile storage media. Although not shown in FIG. 6, disk drives for reading from or writing to removable, non-volatile disks (e.g., "floppy disks") and optical disk drives for reading from or writing to removable, non-volatile optical disks may be provided. In these cases, each drive may be connected to a bus (not shown) via one or more data media interfaces. Memory 620 may include computer program product 625 having one or more program modules configured to perform various methods or actions of various embodiments of the present disclosure.

[0129] The communication unit 640 enables communication with other electronic devices via a communication medium. Additionally, the functionality of the components of the electronic device 600 can be implemented using a single computing cluster or multiple computing machines capable of communicating via communication connections. Therefore, the electronic device 600 can operate in a networked environment using logical connections to one or more other servers, network personal computers (PCs), or another network node.

[0130] Input device 650 can be one or more input devices, such as a mouse, keyboard, trackball, etc. Output device 660 can be one or more output devices, such as a monitor, speaker, printer, etc. As needed, electronic device 600 can also communicate via communication unit 640 with one or more external devices (not shown), such as storage devices and display devices, with one or more devices that enable a user to interact with electronic device 600, or with any device (e.g., network card, modem, etc.) that enables electronic device 600 to communicate with one or more other electronic devices. Such communication can be performed via an input / output (I / O) interface (not shown).

[0131] According to an exemplary implementation of this disclosure, a computer-readable storage medium is provided that stores computer-executable instructions thereon, wherein the computer-executable instructions are executed by a processor to implement the methods described above. According to an exemplary implementation of this disclosure, a computer program product is also provided, which is tangibly stored on a non-transitory computer-readable medium and includes computer-executable instructions, which are executed by a processor to implement the methods described above.

[0132] Various aspects of this disclosure are described herein with reference to flowchart illustrations and / or block diagrams of methods, apparatuses, devices, and computer program products implemented according to this disclosure. It should be understood that each block of the flowchart illustrations and / or block diagrams, and combinations of blocks in the flowchart illustrations and / or block diagrams, can be implemented by computer-readable program instructions.

[0133] These computer-readable program instructions can be provided to a processing unit of a general-purpose computer, a special-purpose computer, or other programmable data processing apparatus to produce a machine such that, when executed by the processing unit of the computer or other programmable data processing apparatus, they create means for implementing the functions / actions specified in one or more blocks of the flowchart and / or block diagram. These computer-readable program instructions can also be stored in a computer-readable storage medium that causes a computer, programmable data processing apparatus, and / or other device to operate in a particular manner. Thus, the computer-readable medium storing the instructions comprises an article of manufacture that includes instructions for implementing aspects of the functions / actions specified in one or more blocks of the flowchart and / or block diagram.

[0134] Computer-readable program instructions can be loaded onto a computer, other programmable data processing apparatus, or other device to cause a series of operational steps to be performed on the computer, other programmable data processing apparatus, or other device to produce a computer-implemented process, thereby causing the instructions that execute on the computer, other programmable data processing apparatus, or other device to perform the functions / actions specified in one or more blocks of a flowchart and / or block diagram.

[0135] The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of this disclosure. In this regard, each block in a flowchart or block diagram may represent a module, segment, or portion of an instruction, which contains one or more executable instructions for implementing the specified logical function. In some alternative implementations, the functions indicated in the blocks may occur in a different order than those indicated in the drawings. For example, two consecutive blocks may actually be executed substantially in parallel, and they may sometimes be executed in reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and / or flowcharts, and combinations of blocks in the block diagrams and / or flowcharts, may be implemented using a dedicated hardware-based system that performs the specified function or action, or using a combination of dedicated hardware and computer instructions.

[0136] Various implementations of this disclosure have been described above. These descriptions are exemplary and not exhaustive, nor are they limited to the disclosed implementations. Many modifications and variations will be apparent to those skilled in the art without departing from the scope and spirit of the described implementations. The terminology used herein was chosen to best explain the principles of the implementations, their practical application, or their improvement over technologies in the market, or to enable others skilled in the art to understand the various implementations disclosed herein.

Claims

1. A content parsing method, comprising: acquiring streaming text output by a model, the streaming text corresponding to structured data; in response to detecting that a current position of the streaming text corresponds to a preset character, determining a target parsing unit corresponding to the preset character; initializing, based on a type of the target parsing unit, an intermediate variable corresponding to a target level, the intermediate variable comprising a key part and a value part, and the target level indicating the position of the current position in the structured data; parsing the streaming text using the target parsing unit to dynamically update the intermediate variable; and during the parsing of the streaming text, presenting the key part and / or the value part of the intermediate variable based on the target level.

2. The method according to claim 1, further comprising: determining the target level based on a path from a root node of the structured data to a target node corresponding to the current position.

3. The method according to claim 1, wherein initializing the intermediate variable corresponding to the target level based on the type of the target parsing unit comprises: in response to the type indicating a corresponding value type, initializing the intermediate variable corresponding to the target level so as to set an initial value of the value part based on the value type.

4. The method according to claim 1, wherein presenting the key part and / or the value part of the intermediate variable based on the target level comprises: presenting, in a form component, a form item corresponding to the target level, the form item displaying the key part and the value part.

5. The method according to claim 1, wherein parsing the streaming text using the target parsing unit to dynamically update the intermediate variable comprises: determining whether a position index parsed by the target parsing unit exceeds a length of the streaming text to be processed; and in response to determining that the position index exceeds the length, interrupting the target parsing unit.

6. The method of claim 5, wherein during the interruption of the target parsing unit, the context information associated with the target parsing unit is retained.

7. The method according to claim 5, wherein parsing the streaming text using the target parsing unit to dynamically update the intermediate variable comprises: in response to obtaining updated content of the streaming text, determining whether there is an interrupted parsing unit; in response to the presence of an interrupted target parsing unit, resuming the target parsing unit; and parsing the updated content of the streaming text from the position index using the target parsing unit.

8. The method according to claim 5, wherein parsing the streaming text using the target parsing unit to dynamically update the intermediate variable comprises: in response to determining that the position index has not exceeded the length, parsing the next position in the streaming text using the target parsing unit.

9. The method according to claim 1, wherein the target parsing unit comprises one of the following: an object parsing unit corresponding to a first character; an array parsing unit corresponding to a second character; a string parsing unit corresponding to a third character; a Boolean parsing unit corresponding to a fourth character; a null value parsing unit corresponding to a fifth character; or a numeric parsing unit corresponding to a sixth character.

10. The method according to claim 1, further comprising: in response to the existing content of the streaming text corresponding to a portion of the structured data, generating intermediate structured data corresponding to the complete structured format based on the parsing results of the streaming text.

11. An apparatus for content parsing, comprising: an acquisition module configured to acquire streaming text output by a model, the streaming text corresponding to structured data; a first determining module configured to determine, in response to detecting that a current position of the streaming text corresponds to a preset character, a target parsing unit corresponding to the preset character; an initialization module configured to initialize, based on a type of the target parsing unit, an intermediate variable corresponding to a target level, the intermediate variable comprising a key part and a value part, and the target level indicating the position of the current position in the structured data; an update module configured to parse the streaming text using the target parsing unit to dynamically update the intermediate variable; and a presentation module configured to present, during the parsing of the streaming text, the key part and / or the value part of the intermediate variable based on the target level.

12. An electronic device, comprising: at least one processing unit; and at least one memory coupled to the at least one processing unit and storing instructions for execution by the at least one processing unit, the instructions, when executed by the at least one processing unit, causing the electronic device to perform the method according to any one of claims 1 to 10.

13. A computer-readable storage medium having a computer program stored thereon, which, when executed by a processor, implements the method according to any one of claims 1 to 10.

14. A computer program product tangibly stored in a computer storage medium and comprising computer-executable instructions that, when executed by a device, cause the device to perform the method according to any one of claims 1 to 10.