Method, apparatus, electronic device, and product for processing an ordered text sequence

By using boundary control partitioning and persistent identity identifier management, combined with an intelligent processing module, the problem of unclear expression in ordered text sequence processing is solved, achieving efficient and flexible text sequence editing and data consistency.

CN122197827A · Pending · Publication Date: 2026-06-12 · SHAANXI MUFENG DIGITAL TECHNOLOGY CO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications(China)
Current Assignee / Owner: SHAANXI MUFENG DIGITAL TECHNOLOGY CO LTD
Filing Date: 2026-01-07
Publication Date: 2026-06-12

IPC: G06F40/166

Application Domain: Natural language data processing

AI Technical Summary

Technical Problem

Existing technologies often lead to unclear or inconsistent content when processing ordered text sequences, affecting the usability of digital content, while the demand for processing ordered text sequences continues to increase.

Method used

The system divides an ordered text sequence into multiple text units by displaying boundary controls, adjusts and migrates text units in response to user actions, manages text units using persistent identity identifiers, and performs formatting and synchronization processing through an intelligent processing module.

Benefits of technology

It enables efficient and flexible editing of ordered text sequences, ensuring data consistency and historical traceability, and improving the accuracy and operational consistency of text sequence processing.

✦ Generated by Eureka AI based on patent content.

Abstract

Embodiments of the present disclosure relate to a method, an apparatus, an electronic device, and a product for processing an ordered text sequence. The method includes displaying an ordered text sequence. The method also includes displaying one or more boundary controls. In addition, the method further includes adjusting first text included in a first text unit and second text included in a second text unit in response to a user operation on a first boundary control.

Description

Technical Field

[0001] This application relates to the field of computer technology, and more particularly to methods, apparatus, electronic devices, and computer program products for processing ordered text sequences.

Background Technology

[0002] As digital technology has developed, the application scope of digital content in social activities has kept expanding and now covers information dissemination, knowledge acquisition, business activities, cultural exchange, and other areas. As an important form of information expression, digital content plays a fundamental role in information interaction between individuals and industries. Through digitalization, information can be stored, transmitted, and used at different times and in different places, improving the efficiency of information processing and utilization.

[0003] In the field of digital content creation, the continuous increase in the amount of digital content places higher demands on the organization and processing of text content. Related content is typically expressed as a sequence of texts, with different texts having a certain order and correlation. Improper processing of text sequences can easily lead to unclear or inconsistent content, thus affecting the usability of the digital content. As the application scenarios for digital content continue to expand, the need for processing ordered text sequences is also increasing.

Summary of the Invention

[0004] Embodiments of this disclosure provide methods, apparatus, electronic devices, computer program products, and media for processing ordered sequences of text.

[0005] According to a first aspect of this disclosure, a method for processing an ordered text sequence is provided. The method includes displaying the ordered text sequence. The method further includes displaying one or more boundary controls, wherein the one or more boundary controls divide the ordered text sequence into a plurality of text units. The plurality of text units includes a first text unit and a second text unit that are adjacent to each other, and a first boundary control of the one or more boundary controls is displayed between the first text unit and the second text unit. The first text unit and the second text unit include a sub-text sequence of the ordered text sequence: the first text unit includes first text of the sub-text sequence located on a first side of the first boundary control, and the second text unit includes second text of the sub-text sequence located on a second side of the first boundary control. Furthermore, the method includes adjusting the first text included in the first text unit and the second text included in the second text unit in response to a user operation on the first boundary control.

[0006] According to a second aspect of this disclosure, a method for processing an ordered text sequence is provided, wherein the ordered text sequence is divided into a plurality of text units, the plurality of text units includes a first text unit and a second text unit that are adjacent to each other, the first text unit includes first text, and the second text unit includes second text. The method includes displaying the first text unit and the second text unit. The method further includes selecting a first side of the first text unit in response to a first selection operation for the first text unit. Furthermore, the method includes, in response to a first migration operation for the first text unit, migrating text that is located on the first side within the first text of the first text unit, and that corresponds to an operation amount of the first migration operation, to a second side of the second text unit, wherein after the migration the text in both the first text unit and the second text unit maintains the order of the ordered text sequence.

[0007] According to a third aspect of this disclosure, a method for managing an ordered sequence of text, the ordered sequence of text being divided into a plurality of text units, is provided. The method includes assigning a first persistent identifier to a first text unit among the plurality of text units, wherein the first persistent identifier remains unique and unchanged throughout the lifetime of the first text unit. The method further includes creating a first asset corresponding to the first text unit in response to a first operation performed on the first text unit among the plurality of text units. Furthermore, the method includes establishing a persistent reference of the first asset to the first text unit by associating the first asset with the first persistent identifier.
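
The persistent reference described in the third aspect can be illustrated with a minimal Python sketch. It assumes that a random UUID serves as the persistent identifier and that an asset is a small record such as a note; the names `TextUnit`, `Asset`, and `AssetStore` are hypothetical and do not come from the disclosure.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class TextUnit:
    text: str
    # Assumption: a random UUID stands in for the persistent identifier.
    unit_id: str = field(default_factory=lambda: uuid.uuid4().hex)


@dataclass
class Asset:
    kind: str      # hypothetical asset type, e.g. "note"
    payload: str
    unit_id: str   # persistent reference to the owning text unit


class AssetStore:
    def __init__(self) -> None:
        self._assets: list[Asset] = []

    def create_asset(self, unit: TextUnit, kind: str, payload: str) -> Asset:
        # The asset stores the unit's identifier, not its text or list position.
        asset = Asset(kind, payload, unit.unit_id)
        self._assets.append(asset)
        return asset

    def assets_for(self, unit: TextUnit) -> list[Asset]:
        return [a for a in self._assets if a.unit_id == unit.unit_id]


units = [TextUnit("Hello there"), TextUnit("general Kenobi")]
store = AssetStore()
note = store.create_asset(units[1], "note", "check spelling")

# Editing the text or reordering the list does not break the reference.
units[1].text = "General Kenobi"
units.reverse()
```

Because the asset is keyed by identifier rather than by index or content, the lookup still resolves after the unit's text is edited and the list order changes.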

[0008] According to a fourth aspect of this disclosure, an apparatus for processing an ordered text sequence is provided. The apparatus includes a text display module configured to display the ordered text sequence. The apparatus also includes a control display module configured to display one or more boundary controls, wherein the one or more boundary controls divide the ordered text sequence into a plurality of text units. The plurality of text units includes a first text unit and a second text unit that are adjacent to each other, and a first boundary control of the one or more boundary controls is displayed between the first text unit and the second text unit. The first text unit and the second text unit include a sub-text sequence of the ordered text sequence: the first text unit includes first text of the sub-text sequence located on a first side of the first boundary control, and the second text unit includes second text of the sub-text sequence located on a second side of the first boundary control. Furthermore, the apparatus includes a text adjustment module configured to adjust the first text included in the first text unit and the second text included in the second text unit in response to a user operation on the first boundary control.

[0009] According to a fifth aspect of this disclosure, an apparatus for processing an ordered text sequence is provided, wherein the ordered text sequence is divided into a plurality of text units, the plurality of text units includes a first text unit and a second text unit that are adjacent to each other, the first text unit includes first text, and the second text unit includes second text. The apparatus includes a unit display module configured to display the first text unit and the second text unit. The apparatus also includes a unit selection module configured to select a first side of the first text unit in response to a first selection operation for the first text unit. Furthermore, the apparatus includes a text migration module configured to migrate, in response to a first migration operation for the first text unit, text that is located on the first side within the first text unit, and that corresponds to an operation amount of the first migration operation, to a second side of the second text unit, wherein after the migration the text in both the first text unit and the second text unit maintains the order of the ordered text sequence.

[0010] According to a sixth aspect of this disclosure, an apparatus is provided for managing an ordered sequence of text, the ordered sequence of text being divided into a plurality of text units. The apparatus includes an identifier assignment module configured to assign a first persistent identifier to a first text unit among the plurality of text units, wherein the first persistent identifier remains unique and unchanged throughout the lifetime of the first text unit. The apparatus includes an asset creation module configured to create a first asset corresponding to the first text unit in response to a first operation performed on the first text unit among the plurality of text units. Furthermore, the apparatus includes a reference establishment module configured to establish a persistent reference of the first asset to the first text unit by associating the first asset with the first persistent identifier.

[0011] According to a seventh aspect of this disclosure, an electronic device is provided. The electronic device includes a processor and a memory coupled to the processor, the memory having instructions stored therein which, when executed by the processor, cause the electronic device to perform the steps of the methods described according to the first, second, and/or third aspects of this disclosure.

[0012] According to an eighth aspect of this disclosure, a computer-readable storage medium is provided. The computer-readable storage medium stores one or more computer instructions, wherein the one or more computer instructions are executed by a processor to perform the steps of the methods described according to the first, second, and/or third aspects of this disclosure.

[0013] According to a ninth aspect of this disclosure, a computer program product is provided. The computer program product is tangibly stored on a non-transitory computer-readable medium and includes computer-executable instructions that, when executed, cause a computer to perform the steps of the methods of the first, second, and/or third aspects of this disclosure.

[0014] It should be understood that the content described in this section is not intended to limit the key or essential features of the embodiments of this disclosure, nor is it intended to restrict the scope of this disclosure. Other features of this disclosure will become readily apparent from the following description.

Description of the Drawings

[0015] The above and other features, advantages, and aspects of the embodiments of this disclosure will become more apparent from the accompanying drawings and the following detailed description. In the drawings, the same or similar reference numerals denote the same or similar elements, wherein:

[0016] Figure 1 shows a schematic diagram of the architecture of an ordered text sequence processing system according to an embodiment of the present disclosure;

[0017] Figure 2 shows a flowchart of a process for processing an ordered text sequence according to an embodiment of the present disclosure;

[0018] Figure 3 shows a flowchart of a displacement operation process according to an embodiment of the present disclosure;

[0019] Figures 4A to 4D show schematic diagrams of a process for operating on an ordered text sequence according to an embodiment of the present disclosure;

[0020] Figures 5A and 5B illustrate the application of embodiments of the present disclosure in a language learning scenario;

[0021] Figure 6 shows a schematic diagram of the logical architecture of a data model according to an embodiment of the present disclosure;

[0022] Figure 7 shows a schematic diagram of a user interface for metadata inheritance according to an embodiment of the present disclosure;

[0023] Figures 8A to 8G illustrate a continuous operation process in a caption editing scenario with a vertical layout according to an embodiment of the present disclosure;

[0024] Figures 9A to 9G show schematic diagrams of a process of interacting with an ordered text sequence based on a context according to an embodiment of the present disclosure;

[0025] Figure 10 shows a flowchart of a method for processing an ordered text sequence according to an embodiment of the present disclosure;

[0026] Figure 11 shows a flowchart of another method for processing an ordered text sequence according to an embodiment of the present disclosure;

[0027] Figure 12 shows a flowchart of a method for managing an ordered text sequence according to an embodiment of the present disclosure;

[0028] Figure 13 shows a block diagram of an apparatus for processing an ordered text sequence according to an embodiment of the present disclosure;

[0029] Figure 14 shows a block diagram of another apparatus for processing an ordered text sequence according to an embodiment of the present disclosure;

[0030] Figure 15 shows a block diagram of an apparatus for managing an ordered text sequence according to an embodiment of the present disclosure;

[0031] Figure 16 shows a schematic block diagram of an example device that can be used to implement embodiments of the present disclosure.

[0032] In all the accompanying figures, the same or similar reference numerals denote the same or similar elements.

Detailed Implementation

[0033] It is understood that the data involved in this technical solution (including but not limited to the data itself, the acquisition or use of the data) shall comply with the requirements of relevant laws, regulations and related provisions.

[0034] It is understood that before using the technical solutions disclosed in the various embodiments of this disclosure, users should be informed of the types, scope of use, and usage scenarios of the personal information involved in this disclosure in an appropriate manner in accordance with relevant laws and regulations, and user authorization should be obtained.

[0035] For example, upon receiving a user's proactive request, a prompt message is sent to the user to explicitly inform them that the requested operation will require the acquisition and use of the user's personal information. This allows the user to independently choose whether to provide personal information to the software or hardware, such as the electronic device, application, server, or storage medium performing the operations of this disclosed technical solution, based on the prompt message.

[0036] As an optional but non-limiting implementation, in response to a user's active request, sending a prompt message to the user can be done via a pop-up window, where the prompt message can be presented in text format. Furthermore, the pop-up window can also include a selection control allowing the user to choose "agree" or "disagree" to provide personal information to the electronic device.

[0037] It is understood that the above notification and user authorization process are merely illustrative and do not constitute a limitation on the implementation of this disclosure. Other methods that comply with relevant laws and regulations may also be applied to the implementation of this disclosure.

[0038] Embodiments of this disclosure will now be described in more detail with reference to the accompanying drawings. While some embodiments of this disclosure are shown in the drawings, it should be understood that this disclosure can be implemented in various forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided to provide a more thorough and complete understanding of this disclosure. It should be understood that the accompanying drawings and embodiments of this disclosure are for illustrative purposes only and are not intended to limit the scope of protection of this disclosure.

[0039] In the description of embodiments of this disclosure, the term "comprising" and similar terms should be understood as open-ended inclusion, i.e., "including but not limited to". The term "based on" should be understood as "at least partially based on". The term "an embodiment" or "this embodiment" should be understood as "at least one embodiment". The terms "first", "second", etc., may refer to different or the same objects. Other explicit and implicit definitions may also be included below. The expression "displaying" a boundary control, as used in this disclosure, should be understood broadly as providing, activating, or defining, on the user interface, an interactive object with a boundary manipulation function. It does not necessarily require drawing a visible graphical entity on the interface. As long as the system logically defines a response area, that area falls within the scope of "displaying" a boundary control even if it remains visually invisible before, during, or after the operation (e.g., its existence is reflected only by a change in cursor shape or by haptic feedback, or it responds to the operation only at the logical level).

[0040] Figure 1 shows a schematic diagram of the architecture of an ordered text sequence processing system according to an embodiment of the present disclosure. As shown in Figure 1, the ordered text sequence processing system may include a user interface module 101, a boundary manipulation processing module 102, a data and history management module 103, and an intelligent processing module 104.

[0041] An ordered text sequence refers to a text sequence structure formed by arranging one or more text units in a predetermined order. Each text unit contains a continuous segment of text content, and there are logical boundaries between adjacent text units to delineate the content's ownership. In some embodiments, the ordered text sequence may be further associated with time information; for example, each text unit corresponds to a timestamp or time range for synchronization with time-series media such as audio or video. In some embodiments, the ordered text sequence may include video subtitles, audio lyrics, meeting transcripts, or code block sequences. In some embodiments, the ordered text sequence can be applied in scenarios such as subtitle editing, speech transcription, lyric creation, teaching document organization, and / or code editing. In the following description, subtitle sequences will be used primarily as examples of ordered text sequences, but this does not constitute a limitation on the scope of protection of this disclosure.
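
As a rough illustration of this definition, the Python sketch below models a subtitle sequence as a list of text units with optional time ranges. The field names `start_ms` and `end_ms` and the helper functions are assumptions made for the example, not structures named in the disclosure.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TextUnit:
    text: str
    start_ms: Optional[int] = None  # optional link to time-series media
    end_ms: Optional[int] = None


def full_text(units: list[TextUnit], sep: str = " ") -> str:
    """Concatenate unit texts in order; boundaries sit between neighbours."""
    return sep.join(u.text for u in units)


def boundary_count(units: list[TextUnit]) -> int:
    """N units arranged in order have N - 1 logical boundaries."""
    return max(len(units) - 1, 0)


subtitles = [
    TextUnit("The quick brown", 0, 1500),
    TextUnit("fox jumps over", 1500, 3000),
    TextUnit("the lazy dog", 3000, 4200),
]
```

Moving a boundary changes which unit owns a piece of text, but the value returned by `full_text` stays the same, which is the order-preserving property the later editing steps rely on.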

[0042] The user interface module 101 is used to display the current state of an ordered text sequence to the user and provide interface elements for interacting with the ordered text sequence. In some embodiments, the user interface module 101 can display multiple text units in a list, timeline, or other suitable layout, and supports receiving operation instructions input by the user on the text units or their boundaries, such as drag operations, click operations, key operations, touch gestures, or combined input operations. The user interface module 101 can also update the interface in real time or with a delay based on the system processing results to provide feedback to the user on changes in text content, boundary positions, or time information. In some embodiments, the user interface module 101 can display explicit boundary controls located between adjacent text units. In some embodiments, the user interface module 101 can activate implicit boundary interaction areas in response to cursor positions. In some embodiments, the user interface module 101 can integrate video players and audio waveforms, etc.

[0043] The boundary manipulation processing module 102 receives operation instructions from the user interface module 101 and parses them to determine whether the user's intent involves adjusting the logical boundary between adjacent text units. After determining the user's intent, the boundary manipulation processing module 102 generates corresponding boundary manipulation processing logic, such as boundary displacement, boundary removal, or the creation of a new boundary, and sends the corresponding data update request or transaction request to the data and history management module 103. Simultaneously, the boundary manipulation processing module 102 can also return processing results or intermediate states to the user interface module 101 to drive dynamic feedback display on the interface. During the editing operation, the boundary manipulation processing module 102 can call the intelligent processing module 104 as needed to obtain auxiliary processing results.

[0044] The data and history management module 103 manages the underlying data model of the ordered text sequence. For example, the data and history management module 103 can manage persistent identifiers, versioned state snapshots, and transactional operation logs for text units. In some embodiments, the data and history management module 103 can assign persistent identifiers to each text unit and record the state changes of text units at different editing stages based on a versioned snapshot mechanism, thereby achieving reliable storage and traceability of text content, time information, and evolutionary relationships. The data and history management module 103 can also respond to data update requests from the boundary manipulation processing module 102 in a transactional manner, ensuring the logical atomicity and consistency of content adjustments across text units, and providing other modules with an interface to read the latest or historical state.
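
One possible way to combine versioned snapshots with transactional updates is sketched below in Python. The `SequenceStore` class and its methods are hypothetical; the sketch assumes that a transaction works on a copy of the current state and that a snapshot is recorded only when the whole update succeeds.

```python
import copy


class SequenceStore:
    """Hypothetical store: unit id -> text, with snapshots and transactions."""

    def __init__(self, units: dict[str, str]) -> None:
        self._units = dict(units)
        self._history: list[dict[str, str]] = [dict(units)]  # version 0

    def transact(self, update) -> int:
        """Apply `update` to a working copy; commit only if it succeeds."""
        working = copy.deepcopy(self._units)
        update(working)                # may raise; committed state is untouched
        self._units = working
        self._history.append(dict(working))
        return len(self._history) - 1  # new version number

    def read(self, version: int = -1) -> dict[str, str]:
        """Return the latest state by default, or a historical version."""
        return dict(self._history[version])


store = SequenceStore({"u1": "The quick brown", "u2": "fox jumps"})


def move_last_word(units: dict[str, str]) -> None:
    head, word = units["u1"].rsplit(" ", 1)
    units["u1"] = head
    units["u2"] = f"{word} {units['u2']}"


def failing_update(units: dict[str, str]) -> None:
    units["u1"] = ""                 # first half of a cross-unit change
    raise ValueError("interrupted")  # second half never happens


v1 = store.transact(move_last_word)
try:
    store.transact(failing_update)
except ValueError:
    pass  # the partial change was discarded
```

The failed update leaves no trace: the latest readable state is still the one committed by `move_last_word`, and version 0 remains available for traceability.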

[0045] The intelligent processing module 104 is used to perform auxiliary processing related to text content or boundary adjustments after an ordered text sequence undergoes editing changes. In some embodiments, the intelligent processing module 104 may include a time synchronization processing function to automatically update the corresponding timestamp based on the text adjustment result in scenarios such as subtitles. In some embodiments, the intelligent processing module 104 may include a language-aware processing function to format the adjusted text according to text language rules. In addition, the intelligent processing module 104 can also read context data from the data and history management module 103 when needed to support more complex intelligent analysis or decision-making, and feed the processing results back to the boundary manipulation processing module 102 or the user interface module 101.

[0046] Through the collaborative work of the above modules, the ordered text sequence processing system can achieve efficient and flexible editing of ordered text sequences while ensuring data consistency and historical traceability. For ease of understanding, the specific data mentioned in the following description are exemplary and not intended to limit the scope of this disclosure. It is understood that the embodiments described below may also include additional actions not shown and/or may omit actions that are shown, and the scope of this disclosure is not limited in this respect.

[0047] Figure 2 shows a flowchart of a process for processing an ordered text sequence according to an embodiment of the present disclosure. This process can be performed by the aforementioned ordered text sequence processing system, for example, through the coordinated operation of the user interface module 101, the boundary manipulation processing module 102, the data and history management module 103, and the intelligent processing module 104.

[0048] At step S201, the boundary visualization and interaction steps are performed. For example, the system can present the text content of adjacent first and second text units in an ordered text sequence on the user interface, and display an interactive boundary control at the corresponding logical boundary position between them. Through this boundary control, the user can intuitively perceive the separation relationship between adjacent text units and initiate subsequent editing operations on the logical boundary. In some embodiments, the boundary control can be an explicitly displayed visual control. In other embodiments, the interactive presentation of the boundary can also be achieved through implicit interaction with the edge areas of the text units.

[0049] At step S202, a boundary operation response step is executed. The system can monitor in real time the editing operations performed by the user on the boundary control or the interactive area associated with the logical boundary, and respond to the operation. For example, editing operations may include dragging, clicking, key combinations, or gesture operations, used to express the user's intention to adjust, remove, or create the logical boundary.

[0050] At step S203, the operation type determination step is performed. Based on the user operation detected in step S202, the system can identify and determine the type of the operation to identify whether it is a displacement operation, a removal operation, or an insertion operation. Different types of operations will trigger different subsequent processing flows.
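
The type determination in step S203 amounts to a dispatch on the detected operation. The Python sketch below shows one way to express it; the event names and their mapping to operation types are illustrative assumptions, since the disclosure does not fix which input triggers which operation.

```python
from enum import Enum
from typing import Optional


class BoundaryOp(Enum):
    DISPLACE = "displace"  # handled at S204: move text across the boundary
    REMOVE = "remove"      # handled at S205: merge the adjacent units
    INSERT = "insert"      # handled at S206: split a unit at a position


# Assumption: these event names and this mapping are illustrative only.
EVENT_TO_OP = {
    "drag": BoundaryOp.DISPLACE,
    "double_click": BoundaryOp.REMOVE,
    "enter_key": BoundaryOp.INSERT,
}

NEXT_STEP = {
    BoundaryOp.DISPLACE: "S204",
    BoundaryOp.REMOVE: "S205",
    BoundaryOp.INSERT: "S206",
}


def classify(event: str) -> Optional[BoundaryOp]:
    """Return the operation type, or None if the event is not a boundary edit."""
    return EVENT_TO_OP.get(event)


def next_step(op: BoundaryOp) -> str:
    """Return the flowchart step that handles the given operation type."""
    return NEXT_STEP[op]
```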

[0051] When the determination result is a displacement operation, the process proceeds to S204. At S204, the system can execute an iterative migration process, achieving atomic migration of text content between adjacent text units as the logical boundary position is continuously adjusted. For example, the system can iteratively determine the text elements to be migrated based on changes in the spatial position of the boundary control, and gradually update the text content in the first and second text units while maintaining the overall order of the ordered text sequence. The migration process is described in detail below with reference to Figure 3.

[0052] When the determination result is a removal operation, the process proceeds to S205. At S205, the system can perform a merging process for adjacent text units. For example, the system can merge adjacent text units to combine the text content of the first text unit and the second text unit into a new text unit, thereby eliminating the logical boundary between them and updating the structure of the ordered text sequence.
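
A minimal Python sketch of the merge at S205 follows. It assumes each unit carries a time range and that the merged unit spans both ranges; the disclosure does not state how timestamps are combined, so that choice, along with the `merge_at` name, is an assumption.

```python
from dataclasses import dataclass


@dataclass
class TextUnit:
    text: str
    start_ms: int
    end_ms: int


def merge_at(units: list[TextUnit], boundary: int, sep: str = " ") -> list[TextUnit]:
    """Remove the boundary between units[boundary] and units[boundary + 1]."""
    if not 0 <= boundary < len(units) - 1:
        raise IndexError("no boundary at this index")
    first, second = units[boundary], units[boundary + 1]
    merged = TextUnit(
        text=f"{first.text}{sep}{second.text}",
        start_ms=first.start_ms,  # assumption: merged unit spans both ranges
        end_ms=second.end_ms,
    )
    # Return a new list so the previous structure stays available as history.
    return units[:boundary] + [merged] + units[boundary + 2:]
```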

[0053] When the determination result is an insertion operation, the process proceeds to S206. At S206, the system performs text content segmentation processing on the source text unit at the user-specified location, splitting the source text unit into two new text units, and forming a new logical boundary after the split to update the unit division of the ordered text sequence.
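
The split at S206 can be sketched in Python as follows. The example represents units as plain strings and assumes that whitespace at the split point is trimmed from both halves; the `split_at` helper is hypothetical.

```python
def split_at(units: list[str], index: int, offset: int) -> list[str]:
    """Split units[index] at character `offset`, creating a new boundary."""
    source = units[index]
    if not 0 < offset < len(source):
        raise ValueError("offset must fall strictly inside the unit")
    # Assumption: whitespace next to the new boundary is dropped.
    left, right = source[:offset].rstrip(), source[offset:].lstrip()
    return units[:index] + [left, right] + units[index + 1:]
```

For instance, `split_at(["The quick brown fox", "jumps"], 0, 9)` yields three units, with the new boundary falling between "The quick" and "brown fox".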

[0054] After any one of the atomic update operations in steps S204, S205, or S206 is completed, the process proceeds to S207. At S207, the system calls the intelligent processing module 104 to perform linkage processing on relevant attributes affected by changes in text content or logical boundaries. In some embodiments, the intelligent processing module may include an intelligent timing synchronization module for adjusting the timestamps corresponding to each text unit in the subtitle sequence. In some embodiments, the intelligent processing module may include a language-aware boundary formatting module for automatically formatting the adjusted text content according to the rules of the current text language, such as standardizing the spacing between words, the position of punctuation marks, or the formatting of mixed Chinese and Western text.
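
The two kinds of linkage processing mentioned for S207 can be approximated with the Python sketch below. Both rules are assumptions chosen for illustration: the formatter uses a few regular expressions for spacing and punctuation, and the timing helper places the shared timestamp in proportion to character counts. The disclosure does not specify either rule.

```python
import re

# Assumption: the basic CJK Unified Ideographs range stands in for Chinese text.
_CJK = r"\u4e00-\u9fff"


def format_mixed_text(text: str) -> str:
    """Collapse repeated spaces, drop spaces before punctuation, and
    put one space between CJK and Latin/digit runs."""
    text = re.sub(r"\s+", " ", text).strip()
    text = re.sub(r"\s+([,.;:!?])", r"\1", text)
    text = re.sub(rf"([{_CJK}])([A-Za-z0-9])", r"\1 \2", text)
    text = re.sub(rf"([A-Za-z0-9])([{_CJK}])", r"\1 \2", text)
    return text


def resync_boundary_time(start_ms: int, end_ms: int, first: str, second: str) -> int:
    """Place the timestamp shared by two adjacent units in proportion to
    their character counts (an assumed heuristic, not the disclosed method)."""
    total = len(first) + len(second)
    if total == 0:
        return start_ms
    return start_ms + round((end_ms - start_ms) * len(first) / total)
```

A production system would more plausibly derive timestamps from word-level alignment with the audio; the proportional rule is only a stand-in that keeps the example self-contained.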

[0055] Figure 3 shows a flowchart of a displacement operation process according to an embodiment of the present disclosure. As shown in Figure 3, at block 301, the current direction of the displacement operation performed by the user on the logical boundary can be detected. For example, while the user continuously performs a drag operation, the system obtains the direction information of the boundary manipulation input at the beginning of each loop iteration to determine whether the displacement is towards the first text unit or the second text unit, providing a basis for the subsequent text migration decision.

[0056] It should be understood that the above detection process of the displacement direction is described by taking the implementation manner in which the boundary control is explicitly displayed in the user interface as an example. In other embodiments, for example, in the case where the boundary control is not explicitly displayed on the interface, but the logical boundary is implicitly manipulated through context gestures, cursor position or keyboard operations, etc., the system can also determine the displacement direction based on the directional characteristics of the user input, and its basic processing logic corresponds to the implementation manner described here. The subsequent content will further illustrate the relevant implementation manners in combination with different user interface schematic diagrams.

[0057] At block 302, one or more candidate text elements adjacent to the current logical boundary can be identified on the side of the source text unit according to the displacement direction detected at block 301. In some embodiments, the system can also provide corresponding visual feedback to the user. The visual feedback may include highlighting the candidate text elements, or a preview indication predicting the potential insertion position in the target text unit, so as to intuitively reflect the possible changes in the text content if the migration operation occurs.

[0058] At block 303, it can be determined whether a preset migration trigger condition is met. The migration trigger condition can be determined based on the spatial positional relationship between the judgment point of the boundary manipulation input and the candidate text element. For example, when the judgment point crosses a preset spatial threshold associated with the candidate text element, the migration trigger condition is considered met; otherwise, it is considered not met. If the migration trigger condition is not met, the process returns to block 301 to continue responding to changes in the user's displacement direction and dynamically updating the visual feedback.

[0059] When the migration trigger condition is met at block 303, the process proceeds to block 304. At block 304, an atomic migration operation can be performed, whereby the candidate text element is removed from the source text unit and added to the target text unit as an indivisible whole. Performing the migration atomically guarantees consistency and integrity while the text content is adjusted, and avoids partial migrations or intermediate states.

[0060] After an atomic migration is completed, the process enters block 305. At block 305, it can be determined whether the current displacement operation has ended, for example, by detecting whether the user has released the input device used for dragging. If the displacement operation has not ended, the process returns to block 301 and enters the next loop iteration, which allows the user to continue migrating within the same displacement operation or to reverse the displacement direction without interrupting the operation. When the displacement operation is determined to have ended, the displacement operation process terminates.

[0061] Through the above loop structure, whether it is a preview process that has not triggered a migration or a subsequent process that has completed a migration, as long as the user's drag operation continues, the process will return to the starting point of the direction detection, thereby ensuring that the system can provide users with a highly flexible control experience that fully supports direction reversal during dragging.
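The loop of blocks 301 to 305 can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the `DragEvent` record and the `run_displacement` function are hypothetical names, each text unit is modeled as a list of single-character elements, and the trigger check of block 303 is assumed to have been evaluated already and delivered as a boolean on each event.

```python
from dataclasses import dataclass


@dataclass
class DragEvent:
    direction: int   # +1: toward the second text unit, -1: toward the first
    crossed: bool    # True if the judgment point crossed the candidate's threshold
    released: bool   # True if the user ended the drag with this event


def run_displacement(first: list[str], second: list[str],
                     events: list[DragEvent]) -> None:
    """Walk blocks 301-305 once per event; mutates both units in place."""
    for ev in events:
        # Block 301: detect the current direction and derive source/target.
        toward_second = ev.direction > 0
        source = second if toward_second else first
        # Block 302: the candidate is the element adjacent to the boundary
        # on the source side (visual feedback is omitted in this sketch).
        # Block 303: check the trigger condition.
        if source and ev.crossed:
            # Block 304: move one element as an indivisible step.
            if toward_second:
                first.append(second.pop(0))
            else:
                second.insert(0, first.pop())
        # Block 305: stop when the drag ends, otherwise return to block 301.
        if ev.released:
            break
```

Because the direction is read again on every iteration, a reversal in the middle of a drag simply swaps the source and target roles on the next pass.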

[0062] Figures 4A to 4D are schematic diagrams illustrating a process of operating on an ordered text sequence according to an embodiment of the present disclosure. Figures 4A to 4D illustrate the technical solution of this disclosure using a subtitle production scenario as an example. In this type of application scenario, the ordered text sequence corresponds to the subtitle text sequence in film and television works. The subtitle content usually needs to precisely match the rhythm and emotional pauses of the visuals, audio, and characters' speech. During editing, users frequently adjust sentence breaks and merge or split text units, which places high demands on editing efficiency, operational consistency, and the accuracy of the editing results. The operation process shown in Figures 4A to 4D demonstrates the specific application of the ordered text sequence editing scheme provided in this disclosure in such scenarios and the resulting technical effects.

[0063] It should be noted that the subtitle editing scenario shown in Figures 4A to 4D is merely a typical application example of this disclosure, used to aid in understanding the implementation principles of logical boundary manipulation, ordered text migration, and related interaction processing mechanisms. This disclosure is not limited to the field of subtitle production, but is also applicable to other application scenarios requiring fine-grained editing of ordered text sequences, such as speech-to-text proofreading, lyric creation, meeting minutes transcription, and language learning text processing. In different application environments, the specific meaning, display format, and interaction method of text units can be adjusted according to actual needs without affecting the overall concept and implementation effect of the technical solution disclosed herein.

[0064] To clearly illustrate the implementation of this disclosure in specific application scenarios, this embodiment uses an excerpt from the classic literary masterpiece "Ode to the Red Cliff" as an example text for an ordered text sequence, simulating the process of synchronously editing text content with time-series media in scenarios such as audiobooks, educational videos, or cultural programs. Referring to Figure 4A, in the focus editing area 405 of the tool, an ordered text sequence, "Only the clear breeze on the river and the bright moon in the mountains, can be heard and seen, and are inexhaustible," is displayed as the object of editing. The user interface also displays boundary controls 406a and 406b, both of which can be referred to as first boundary controls.

[0065] Boundary controls 406a and 406b divide the ordered text sequence into multiple text units, including text unit 410, text unit 420, and text unit 430, wherein text unit 410 is adjacent to text unit 420, and text unit 420 is adjacent to text unit 430. It should be understood that the two boundary controls and three text units shown in Figure 4A are for illustrative purposes only, and embodiments of this disclosure do not limit the specific number of boundary controls and text units displayed in the user interface.

[0066] In the description of embodiments of this disclosure, "first text unit" and "second text unit" are used to refer to any pair of adjacent text units. For example, text unit 410 can be referred to as the first text unit, and the text unit 420 adjacent to it can be referred to as the second text unit; or, for another example, text unit 420 can be referred to as the first text unit, and the text unit 410 or text unit 430 adjacent to it can be referred to as the second text unit.

[0067] As shown in Figure 4A, the text sequence included in text unit 410 is "Only the gentle breeze on the river", which forms a sub-sequence of the entire ordered text sequence and is on one side of the boundary control 406a. The text sequence included in text unit 420 is "And the bright moon among the mountains, what one hears becomes sound", which is on the other side of the boundary control 406a and is also on one side of the boundary control 406b. In this specification, the two directions divided by a boundary control can be called the first side and the second side, respectively, but this naming only describes the logical relationship and does not limit the specific spatial direction.

[0068] In addition, in the audio waveform graph area 402, timestamp marks corresponding to each text unit in the time dimension are also displayed. In this subtitle editing scenario, the timestamp mark 407a_end representing the end time of text unit 410 and the timestamp mark 407a_start representing the start time of text unit 420 together constitute a set of timestamp marks logically associated with the boundary control 406a. Similarly, the timestamp mark 407b_end and the timestamp mark 407b_start constitute another set of timestamp marks logically associated with the boundary control 406b. It should be understood that the display and adjustment of timestamps are mainly used in application scenarios related to temporal media such as subtitles. In other application scenarios, the ordered text sequence may not contain timestamp information.
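The association between a boundary control and its pair of timestamp marks can be modeled with a small data structure. The sketch below is illustrative only: the `TextUnit` class, its field names, and the `boundary_timestamps` function are hypothetical, and times are assumed to be stored in seconds.

```python
from dataclasses import dataclass


@dataclass
class TextUnit:
    text: str
    start: float  # start time in seconds (assumed unit)
    end: float    # end time in seconds (assumed unit)


def boundary_timestamps(units: list[TextUnit], i: int) -> tuple[float, float]:
    """Return the timestamp pair tied to the boundary between unit i and i+1.

    The pair is (end of the left unit, start of the right unit), matching
    the 407a_end / 407a_start grouping described for boundary control 406a.
    """
    return units[i].end, units[i + 1].start
```

In a scenario without temporal media, the two time fields would simply be absent and the boundary would carry no timestamp pair.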

[0069] In some embodiments, in response to the user operation being a displacement operation for moving the first boundary control, adjusting the first text content included in the first text unit and the second text content included in the second text unit includes: determining the text to be adjusted based on a comparison of the spatial position of the first boundary control during the displacement operation with the spatial ranges of one or more text elements in the first text unit or the second text unit.

[0070] As shown in Figure 4A, text unit 410 is adjacent to text unit 420, and the two are separated by the boundary control 406a. Text unit 410 currently contains the text "Only the gentle breeze on the river", and text unit 420 currently contains the text "And the bright moon among the mountains, what one hears becomes sound". When the user believes that the text boundary should be placed after "bright moon", the user can select the boundary control 406a with the pointer device and drag it to the right to trigger the displacement operation.

[0071] In some embodiments, the comparison and the determination of the text to be adjusted are performed iteratively for the duration of the displacement operation, and each iteration is triggered when the real-time spatial position of the decision point of the boundary control crosses a preset spatial position associated with one or more text elements to be migrated.

[0072] During the displacement operation, the system continuously obtains the real-time spatial position of the boundary control 406a in the interface and compares this spatial position with the spatial ranges of each text element to the right of the boundary control 406a in the text unit 420. In some embodiments, the text element can be a single character or punctuation mark, and its spatial range can be determined by the display area of the character in the interface. When it is detected that the determined position of the boundary control 406a crosses a preset spatial threshold of a certain text element, the system determines this text element as the text to be adjusted and performs the corresponding text migration operation. It should be understood that the "text element" described in the embodiments of the present disclosure not only includes single characters (such as letters, numbers, Chinese characters, punctuation), but also includes phrases, expressions recognized based on semantics, or any text segment selected by the user. The embodiments of the present disclosure do not limit this.

[0073] For example, in the state shown in Figure 4A, when the boundary control 406a moves to the right and successively crosses the spatial ranges of the characters "与", "山", "间", "之", "明", "月", "," (the phrase rendered above as "And the bright moon among the mountains,"), the system successively determines these characters as the text to be adjusted based on the above spatial position comparison, removes them from the text unit 420, and adds them to the end of the text unit 410. Through this process, the text content of the text unit 410 gradually expands, while the text content of the text unit 420 shrinks accordingly, thereby realizing the continuous movement of the logical boundary in the text.

[0074] When the user ends the displacement operation of the boundary control 406a, for example by releasing the pointer device, the system finalizes the text adjustment result corresponding to this displacement operation according to the final spatial position of the boundary control 406a and updates the display in the user interface. It should be understood that the above description does not limit the system to updating the display only when the user releases the pointer device. In other embodiments, the system can also update the display content of the text units step by step according to the real-time position of the boundary control during the displacement operation, or complete the update at one time after the displacement operation. As shown in Figure 4B, the text unit 410 is updated to include "惟江上之清风,与山间之明月," ("Only the gentle breeze on the river, and the bright moon among the mountains,"), the text unit 420 is updated to include "耳得之而为声," ("what one hears becomes sound,"), and the boundary control 406a is displayed at the new text demarcation position. Determining the text content to be adjusted by comparing the spatial position of the boundary control with the spatial range of the text elements allows fine and continuous control of the text boundary between adjacent text units, improving the intuitiveness and accuracy of editing an ordered text sequence.
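The transition from the state of Figure 4A to the state of Figure 4B can be reproduced as a short worked example. The function name `migrate_right` is hypothetical, text elements are assumed to be single characters, and the initial contents of the two units are inferred from the characters listed as migrated and the resulting contents given above.

```python
def migrate_right(first: str, second: str, count: int) -> tuple[str, str]:
    """Move `count` leading elements of `second` to the end of `first`."""
    count = max(0, min(count, len(second)))  # never move more than exists
    return first + second[:count], second[count:]


# Assumed initial state of text units 410 and 420 (Figure 4A).
unit_410 = "惟江上之清风,"
unit_420 = "与山间之明月,耳得之而为声,"

# The boundary crosses seven elements in turn: 与 山 间 之 明 月 ,
for _ in range(7):
    unit_410, unit_420 = migrate_right(unit_410, unit_420, 1)

# unit_410 is now "惟江上之清风,与山间之明月," and unit_420 is "耳得之而为声,"
```

Each pass moves exactly one element, so the concatenation of the two units is the same before and after every step.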

[0075] In some embodiments, the comparison and the determination of the text to be adjusted are performed iteratively for the duration of the displacement operation, and each iteration is triggered when the real-time spatial position of the boundary control's decision point crosses a preset spatial position associated with one or more text elements to be migrated. In some embodiments, the preset spatial position is the geometric centerline of the one or more text elements to be migrated.

[0076] In some embodiments, during the displacement operation of the first boundary control, the system does not determine the text to be adjusted only once after the displacement operation ends, but rather monitors and judges the displacement state in real time in a cyclical manner during the duration of the displacement operation. Specifically, as the user continuously drags the first boundary control, the system obtains the current spatial position of the first boundary control in each displacement detection cycle, and compares this spatial position with the spatial range of one or more text elements to be migrated near the logical boundary in the first or second text unit, thereby dynamically determining whether the triggering conditions for text migration are met.

[0077] During this process, the system can predefine a decision point for the first boundary control to reflect its effective position in the current displacement direction. For example, when the displacement direction is towards the second text unit, the right edge of the first boundary control in the interface can be used as the decision point; when the displacement direction is towards the first text unit, the left edge of the first boundary control in the interface can be used as the decision point. The system compares the real-time spatial position of this decision point with the preset spatial position of the text element to be migrated to determine whether to trigger a text migration.

[0078] In some embodiments, the preset spatial position can be set as the geometric center line of the text element to be migrated in the user interface. That is, when the judgment point of the first boundary control crosses the geometric center line of a text element to be migrated during the displacement process, it is considered that the migration trigger condition corresponding to the text element to be migrated has been met. Based on this judgment result, the system can remove the text element to be migrated from the source text unit and add it to the target text unit, thereby completing an atomic text migration.
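The direction-dependent decision point and the centerline trigger can be expressed compactly. The following is a sketch under stated assumptions: a horizontal, left-to-right layout in which only x coordinates matter, hypothetical names `Box`, `decision_point`, and `should_migrate`, and an inclusive comparison (reaching the centerline counts as crossing it), which the text does not specify.

```python
from dataclasses import dataclass


@dataclass
class Box:
    left: float
    right: float

    @property
    def center(self) -> float:
        """Geometric centerline of the box along the x axis."""
        return (self.left + self.right) / 2


def decision_point(control: Box, direction: int) -> float:
    """Right edge when moving toward the second unit, left edge otherwise."""
    return control.right if direction > 0 else control.left


def should_migrate(control: Box, candidate: Box, direction: int) -> bool:
    """True once the decision point has reached the candidate's centerline."""
    point = decision_point(control, direction)
    if direction > 0:
        return point >= candidate.center
    return point <= candidate.center
```

Using the leading edge of the control as the decision point means the trigger fires at the same visual moment in both directions, which is one plausible reason for the edge selection described above.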

[0079] Through the above method, the judgment and execution of text migration can be performed iteratively during the duration of the displacement operation, allowing the user to gradually and continuously adjust the text allocation relationship between adjacent text units while dragging the first boundary control. In some embodiments, the system can provide visual preview feedback before the judgment point crosses the preset spatial position; when the judgment point crosses the preset spatial position, the corresponding text migration operation is executed. The above text migration operation can adopt a real-time update method, that is, the text content displayed on the interface and the underlying data are updated synchronously immediately when each trigger condition is met; or a delayed submission method can be adopted, that is, during the user's dragging process, the text content displayed on the interface is updated in real time according to the migration judgment to provide smooth visual feedback, while the underlying data remains unchanged until the user ends the displacement operation, at which point the accumulated migration results determined during the displacement process are updated as a whole in a one-time data update. The above different processing methods can be selected according to the system performance requirements or specific application scenarios, without affecting the method of determining the migrated text based on spatial position comparison in the displacement operation.
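The difference between the real-time update method and the delayed submission method can be illustrated by keeping the displayed text and the underlying data as two separate copies. The `DragSession` class and its method names are hypothetical, and text elements are assumed to be single characters.

```python
class DragSession:
    """Tracks displayed text (view) and stored text (data) during one drag."""

    def __init__(self, first: str, second: str, deferred: bool):
        self.data = [first, second]   # underlying data
        self.view = [first, second]   # text shown in the interface
        self.deferred = deferred

    def on_trigger(self, direction: int) -> None:
        """Apply one migration after a trigger condition has been met."""
        first, second = self.view
        if direction > 0 and second:
            first, second = first + second[0], second[1:]
        elif direction < 0 and first:
            first, second = first[:-1], first[-1] + second
        self.view = [first, second]
        if not self.deferred:
            # Real-time method: data follows the view on every trigger.
            self.data = list(self.view)

    def on_release(self) -> None:
        """Delayed method: write the accumulated result in one data update."""
        self.data = list(self.view)
```

In both methods the view is updated identically, so the choice only affects how often the underlying data is written.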

[0080] In some embodiments, the comparison and determination of the text to be adjusted is performed after the displacement operation has ended, and the determination of the text to be adjusted includes: determining the text to be adjusted based on a comparison between the ending spatial position of the boundary control and the spatial range of the one or more text elements. For example, the displacement operation can employ a two-stage "preview-submit" update mechanism. In this embodiment, when the user performs a displacement operation on the first boundary control and continues to drag it, the user interface only presents the positional change of the boundary control in the ordered text sequence, while the text content in the first text unit and the second text unit remains unchanged. The system does not actually migrate the text content during the dragging process, but only provides indicative or preview interface feedback related to the position of the boundary control to assist the user in determining the target boundary position.

[0081] When the user ends the displacement operation, such as releasing the input device or ending the corresponding input command, the system obtains the final spatial position of the first boundary control at the end of the displacement operation and compares this final spatial position with the spatial range of each text element in the first or second text unit. Through the above comparison, the system determines the final segmentation position corresponding to the boundary control, thereby determining the text content that needs to be removed from or added to the first text unit, and the text content that needs to be added to or removed from the second text unit accordingly.

[0082] After the above determination is completed, the system updates the text content in a single step, removing the determined text from the corresponding text unit and adding it to the adjacent text unit, while simultaneously updating the user interface display and the underlying data. This text content update is performed as an indivisible operation to ensure data consistency and integrity before and after the text adjustment. In this way, the actual migration of text content only occurs after the displacement operation is completed, avoiding frequent modifications to the data state during dragging. This approach is suitable for application environments that need to reduce real-time data update overhead or to support complex collaborative editing scenarios.
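For the "preview-submit" mechanism, the final segmentation position can be computed once from the ending position of the boundary control. The sketch below rests on assumptions not stated for this embodiment: the elements lie on a single line in left-to-right order, each element is represented by the x coordinate of its centerline (borrowing the centerline rule from the iterative embodiment), and the names `split_index` and `commit` are hypothetical.

```python
from bisect import bisect_right


def split_index(final_x: float, centers: list[float]) -> int:
    """Count the elements whose centerline is at or left of the boundary.

    `centers` must be sorted in ascending order, one entry per element of
    the two adjacent units taken together.
    """
    return bisect_right(centers, final_x)


def commit(text: str, centers: list[float], final_x: float) -> tuple[str, str]:
    """Split the combined text of both units at the final boundary position."""
    k = split_index(final_x, centers)
    return text[:k], text[k:]
```

Because the split is computed from the combined text of both units, the result is the same regardless of the path the control took during the drag.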

[0083] For example, when the system identifies text to be migrated near a logical boundary in a first or second text unit, it can apply first visual feedback to that text to visually indicate the text content that may be adjusted. This first visual feedback may include, but is not limited to, highlighting the corresponding text segment, changing its color, blurring it, or other distinguishable display effects to clearly identify the candidate migration object for this displacement operation. Simultaneously, the system can also provide second visual feedback near a logical boundary in the target text unit to indicate the potential insertion position. For example, displaying an insertion preview cursor, placeholder marker, or a semi-transparent preview of the text to be migrated at the target position at the end of the first text unit or the beginning of the second text unit.

[0084] By combining one or more of the aforementioned visual feedback, users can perceive the range and direction of text migration in real time during displacement operations, thereby improving the intuitiveness and accuracy of editing operations. For example, a second visual feedback can be generated in the second text unit near the logical boundary (e.g., its starting position) to indicate the potential insertion position, aiming to provide users with a clear preview of the potential insertion position. The second visual feedback can be a preview form, such as a blinking cursor, a placeholder marker, or a semi-transparent shadow of the candidate element. The synergistic effect of dual visual feedback provides users with a clear, unambiguous real-time preview of the migration range and target position, thereby significantly reducing the error rate.

[0085] In some embodiments, when identifying text to be migrated, the system can determine the text segments constituting a single migration unit based on a predetermined pattern. This predetermined pattern defines the smallest text unit that is treated as a whole and adjusted during the displacement operation, and can be configured according to different application requirements. In one embodiment, the predetermined pattern may include a fixed-granularity pattern, in which the migration unit is a basic linguistic unit, such as a single character, a single punctuation mark, or a single word identified based on linguistic boundaries (such as spaces in English). This pattern is suitable for high-precision unit-by-unit fine-tuning.

[0086] In some embodiments, the basic unit of the text to be adjusted is a single character. In some embodiments, at least one visual feedback may be provided, wherein the visual feedback includes at least one of the following: a first visual feedback indicating the text to be migrated in real time; or a second visual feedback indicating the target insertion position in real time. In some embodiments, text segments constituting a single migration unit may be identified based on a predetermined pattern, wherein the predetermined pattern includes at least one of a fixed granularity pattern, an intelligent semantic pattern, and a custom pattern.

[0087] In another embodiment, the predetermined pattern includes an intelligent semantic pattern. The system can perform semantic analysis on the text content to identify phrases or words with relatively complete semantics as migration units, thereby preserving the integrity of the semantic structure as much as possible during the displacement process. For example, the intelligent semantic pattern can use Natural Language Processing (NLP) algorithms to analyze the text and identify a text fragment with relatively independent semantics (e.g., a term like "artificial intelligence" or a short phrase like "a cup of tea") as a migration unit. This pattern is suitable for rapid adjustments where semantic integrity is desired. As a specific implementation, the NLP algorithm can be based on a preloaded domain dictionary, can employ sequence labeling algorithms such as Hidden Markov Models (HMMs) or Conditional Random Fields (CRFs), or can use pre-trained language models based on the Transformer architecture (such as BERT) for word segmentation or phrase boundary detection.

[0088] Furthermore, in some embodiments, the predetermined pattern may also include a custom pattern, which defines the migration unit as any text segment specified by the user through real-time interaction (e.g., selecting text with the cursor). This pattern provides users with maximum operational flexibility for different editing habits or special editing needs. The above patterns can be used individually or in combination according to specific application scenarios without affecting the overall logic of the displacement operation, and the embodiments of this disclosure do not impose any limitations on this.
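The three patterns can be sketched as a single function that returns the migration unit at the start of the text adjacent to the boundary. This is an illustration only: the function `next_unit` and its parameters are hypothetical, the semantic pattern is reduced to a longest-prefix lookup in a preloaded dictionary (one of the options named above, standing in for HMM, CRF, or Transformer-based segmentation), and the custom pattern is assumed to receive the length of the user's selection.

```python
def next_unit(text: str, pattern: str,
              lexicon: frozenset[str] = frozenset(),
              selection: int = 0) -> str:
    """Return the migration unit at the start of `text` for a given pattern."""
    if pattern not in ("fixed", "semantic", "custom"):
        raise ValueError(f"unknown pattern: {pattern}")
    if not text:
        return ""
    if pattern == "custom":
        # The user-selected span, at least one element long.
        return text[:max(1, selection)]
    if pattern == "semantic":
        # Longest dictionary entry that is a prefix of the text.
        for length in range(len(text), 1, -1):
            if text[:length] in lexicon:
                return text[:length]
    # Fixed granularity, and the fallback when no dictionary entry matches.
    return text[0]
```

For example, with a dictionary containing "artificial intelligence", the semantic pattern moves that whole term in one step, whereas the fixed-granularity pattern would move only the letter "a".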

[0089] In some embodiments, the triggering condition for the migration is determined based on a comparison between the spatial position of the first boundary control during the displacement operation and the spatial range of one or more text elements in the first or second text unit. For example, during the user's displacement operation on the boundary control, the system continuously acquires the current position of the boundary control in the user interface and compares this position with the display range of text elements near the logical boundary in the first or second text unit. When this spatial relationship meets a preset condition, a text migration operation is considered to be triggered.

[0090] For example, in a graphical user interface-based embodiment, boundary controls have definite decision points within the interface, while text elements each have their own display area or spatial range. During the displacement operation, the system can determine in real time whether the decision point of the boundary control enters, crosses, or leaves the display range of a text element, or whether it crosses the spatial threshold corresponding to that text element. When the determination result indicates that the position of the boundary control has reached the trigger position associated with a text element, it is determined that the text element belongs to the object that needs to be migrated, thereby triggering the corresponding text migration process.

[0091] In some embodiments, the aforementioned spatial range can be defined by the character boundaries, character center lines, character bounding rectangles, or other spatial information used to characterize the text display position within the interface. By comparing the real-time position of the boundary control during the displacement operation with the aforementioned spatial range, the system can accurately determine the triggering timing of the migration operation and, with the aid of visual feedback, enable the user to intuitively perceive whether the current displacement operation is about to trigger or has already triggered text migration.

[0092] In this way, the triggering of text migration does not depend on a fixed number of operation steps or discrete commands, but is directly determined by the spatial positional relationship between the boundary control and the text element. This makes the editing process of ordered text sequences more continuous and intuitive in terms of interaction, and can adapt to different text display layouts and migration needs of different granularities.

[0093] In some embodiments, when the migration triggering condition is not met, the first visual feedback and the second visual feedback are maintained, and the text content of the first text unit and the second text unit remains unchanged; and when the migration triggering condition is met, an atomic migration operation is performed, wherein the atomic migration operation includes: removing the determined text that needs to be adjusted from the first text unit; and adding the determined text that needs to be adjusted to the second text unit.

[0094] For example, during a user's displacement operation on a boundary control, the system continuously determines whether the migration trigger condition is met based on a comparison between the real-time spatial position of the boundary control and the spatial range of one or more text elements in adjacent text units. When the boundary control has not yet moved to a position that meets the migration trigger condition, the system does not modify the actual text content in the text unit, but instead keeps the text content in the first and second text units unchanged, while continuing to present the first and second visual feedback corresponding to the current operation. The first visual feedback indicates the text range currently identified as a migration candidate, and the second visual feedback indicates the potential target insertion position, providing the user with a clear preview of the operation.

[0095] When the system detects that the boundary control's judgment point crosses the preset spatial position associated with the text element to be migrated, thus meeting the migration trigger condition, the system executes an atomic migration operation. In this atomic migration operation, the system removes the determined text that needs adjustment from the source text unit and adds it to the corresponding position in the target text unit. This migration process is executed as an indivisible operation to ensure the consistency and integrity of the text content adjustment. After the migration is completed, the user interface is updated synchronously to reflect the latest text content state in the first and second text units. Through this method, during the displacement operation, the system can provide only a visual preview without changing the actual text content when migration is not triggered, and then execute a deterministic text migration operation after the migration condition is triggered. This allows users to intuitively perceive the impact of boundary adjustments and obtain a stable and controllable editing experience during the interaction.

[0096] In some embodiments, in response to the user operation being a removal operation for removing the first boundary control, adjusting the first text content included in the first text unit and the second text content included in the second text unit includes: removing the first boundary control; and merging the first text unit and the second text unit into a third text unit, where the third text unit includes the first text and the second text. This removal operation can be triggered by various interaction methods. For example, the user double-clicks on the boundary control, or enters a delete instruction when the boundary control is in a selected state, etc. The present disclosure does not limit this.

[0097] For example, as shown in Figure 4B, the editor determines that the subtitle content "What one hears becomes sound" in the text unit 420 and the subtitle content "What one sees becomes color" in the adjacent text unit 430 have strong semantic continuity and are suitable for merging into the same text unit. The user then performs a removal operation on the boundary control 406b located between the two text units. After the system receives this user operation, it triggers the corresponding merging process and performs the following series of atomic operations. First, and most importantly, the system merges the text content of text unit 420 with the text content of text unit 430 to form a new, content-continuous merged text. Subsequently, as a linked response, the intelligent timing synchronization module described in the present disclosure is automatically called to merge the time identifiers associated with the respective text units, forming a new time range that covers the entire content of the merged text.

[0098] For example, the system first removes the boundary control 406b, so that the two adjacent text units originally separated by this boundary control no longer maintain a separated relationship logically. Subsequently, the text contents in the text unit 420 and the text unit 430 are merged to generate a new text unit 440. This text unit 440 includes the text sequence in the text unit 420 and the text sequence in the text unit 430, thus forming a content-continuous merged text.

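[0099] In some embodiments, while completing the text content merging, the system also invokes the intelligent processing module in a linked manner to synchronously update other attributes associated with the text units. For example, in a subtitle editing scenario, text units are usually associated with time identifiers. The system can merge the timestamp ranges corresponding to the original first text unit and second text unit to generate a new time range covering the merged text content, and update the display of the corresponding timestamp marks in the audio waveform graph area.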
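Merging two adjacent text units together with their time ranges can be sketched as follows. The `TextUnit` class and `merge_units` function are hypothetical names, the time values in any usage are illustrative rather than taken from the figures, and the merged range is assumed to span from the earliest start to the latest end.

```python
from dataclasses import dataclass


@dataclass
class TextUnit:
    text: str
    start: float  # seconds (assumed unit)
    end: float    # seconds (assumed unit)


def merge_units(a: TextUnit, b: TextUnit) -> TextUnit:
    """Concatenate the text and produce a time range covering both units."""
    return TextUnit(
        text=a.text + b.text,
        start=min(a.start, b.start),
        end=max(a.end, b.end),
    )
```

Taking the minimum start and maximum end keeps any silent gap between the two original units inside the merged range, so the new range covers the whole merged text.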

[0100] In some embodiments, the background system adopts a data and history management mechanism centered on persistent identifiers. When the above text unit merging operation is performed, the persistent identifiers corresponding to the original first text unit and second text unit are marked as merged in their version history, and an association is established with the persistent identifier of the newly generated third text unit, so that the evolution path of the text units is completely recorded. This serves not only the current merging operation, but also gives the subtitle editing workflow complete historical version tracing and room for future functional extensions (such as adding review comments), which the prior art lacks. For example, when performing the merging operation, the system background marks the persistent identifiers of text unit 420 and text unit 430 as "merged" in their version history and points them to a brand-new persistent identifier assigned to the newly generated text unit 440.
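The identity management step can be sketched with a small registry. Everything here is an assumption made for illustration: the `UnitRegistry` class, the record fields `status`, `merged_into`, and `merged_from`, and the use of random UUIDs as persistent identifiers are not specified by the disclosure.

```python
import uuid


class UnitRegistry:
    """Keeps text and version history keyed by persistent identifier."""

    def __init__(self) -> None:
        self.texts: dict[str, str] = {}
        self.history: dict[str, dict] = {}

    def create(self, text: str) -> str:
        uid = uuid.uuid4().hex
        self.texts[uid] = text
        self.history[uid] = {
            "status": "active", "merged_into": None, "merged_from": [],
        }
        return uid

    def merge(self, a: str, b: str) -> str:
        """Create a new unit and link both originals to it."""
        new = self.create(self.texts[a] + self.texts[b])
        self.history[new]["merged_from"] = [a, b]
        for old in (a, b):
            # Originals are kept, not deleted, so the path stays traceable.
            self.history[old].update(status="merged", merged_into=new)
        return new
```

Retaining the original records means a later feature, such as a review comment attached to an old identifier, can still be resolved to the unit that replaced it.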

[0101] In some embodiments, after completing the above text unit merging operation, the system can also call a language-aware boundary formatting module to format the merged text content. Through this module, corresponding typesetting rules can be automatically applied at the connection points of text content from different text units according to the current language environment. For example, in a Chinese context, the system can avoid wrongly inserting unnecessary spaces between characters such as "声," and "目", ensuring professional typesetting quality.

[0102] After the merging operation is completed, the user interface is updated accordingly. In some embodiments, the focus editing area can be designed to display three consecutive text units. After the original text unit 420 and text unit 430 are merged into text unit 440, the system loads the next adjacent unit 450 of the newly merged unit from the global subtitle list and displays it in the focus editing area. At the same time, the system generates and displays a new boundary control 406c between the merged text unit 440 and its adjacent text unit 450 for subsequent manipulation of the new logical boundary. Figure 4C shows the state of the focus editing area after the merging operation. Based on the time data updated in the background, the system automatically removes the second pair of timestamp markers (407b_end and 407b_start), originally associated with the removed boundary control 406b, from the audio waveform area 402, and displays the third pair of timestamp markers (407c_end and 407c_start) logically associated with the new boundary control 406c, so as to reflect the updated subtitle structure.

[0103] In the above manner, when the system responds to the removal operation of the boundary control, it realizes the merging of text units, the synchronous update of related attributes, and the traceable management of the editing history, making the structural adjustment of the ordered text sequence more efficient and reliable during the editing process.

[0104] In some embodiments, a second boundary control is inserted into the first text, wherein the second boundary control divides the first text unit into a fourth text unit and a fifth text unit; the fourth text unit includes the portion of the first text on the first side of the second boundary control, and the fifth text unit includes the portion of the first text on the second side of the second boundary control.

[0105] Taking the subtitle editing scenario shown in Figure 4D as an example, after completing the foregoing text merging operation, the editor considers the merged text unit 440 to be too long for precise matching with the recitation rhythm or the picture rhythm. Therefore, a new boundary control 406d needs to be introduced inside the text unit 440.

[0106] For example, the editor can position the text cursor at the target segmentation position in the text unit 440, such as the position between "The eyes encounter it and it takes on a color" and "Taking it without limit", and trigger the segmentation operation through a preset insertion instruction. After receiving this insertion instruction, the system performs a splitting process on the text unit 440 at the data level. Taking this segmentation position as the boundary, the text content of the original text unit is divided into the text part on the first side of the boundary control 406d and the text part on the second side of the boundary control 406d, thereby generating the text unit 460 and the text unit 470 respectively. In response to the insertion instruction, the system first divides the original second unit into two at the data level, and a new logical boundary is naturally formed between the newly generated two units. Subsequently, the system displays a newly created boundary control 406d at this logical boundary on the user interface. In the background implementation using the preferred data model, the persistent identity identifier of the original unit is marked as "split" and points to the persistent identity identifiers of the two newly generated units, thus ensuring the atomicity and traceability of the entire editing history.
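The data-level split described above can be illustrated with a short sketch. It assumes the segmentation position is given as a character offset; `TextUnit`, `split_unit`, and the history entry layout are hypothetical names chosen for this example and do not come from the disclosure.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class TextUnit:
    text: str
    # Persistent identifier assigned at creation (hypothetical field name).
    uid: str = field(default_factory=lambda: uuid.uuid4().hex)
    history: list = field(default_factory=list)


def split_unit(unit: TextUnit, offset: int) -> tuple[TextUnit, TextUnit]:
    """Split `unit` at a character offset into two new units."""
    if not 0 < offset < len(unit.text):
        raise ValueError("split offset must fall strictly inside the text")
    # Build both halves first, so a failure leaves the original untouched.
    left = TextUnit(text=unit.text[:offset])
    right = TextUnit(text=unit.text[offset:])
    # Only then mark the original as split and point it at its successors.
    unit.history.append({"status": "split", "into": [left.uid, right.uid]})
    return left, right
```

Validating the offset and constructing both halves before touching the original's history is one simple way to keep the operation all-or-nothing in this sketch.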

[0107] Furthermore, in subtitle application scenarios, after the segmentation operation is completed, the system can also automatically call the intelligent timing synchronization module. Using the proportional allocation mode, the module divides the time range of the original text unit according to the character count ratio of the segmented text units 460 and 470, thereby generating the timestamp intervals corresponding to text units 460 and 470 respectively. This greatly reduces the workload of manually adjusting the new timestamps. The new timestamp marker pair logically associated with the boundary control 406d is synchronously generated and displayed in the audio waveform area to intuitively reflect the correspondence between the segmented subtitles and the audio timing. Figure 4D shows the state of the focus editing area after the segmentation operation is completed. Based on the time data updated in the background, the system automatically creates and displays the fourth timestamp pair (407d_end and 407d_start), logically associated with the boundary control 406d, in the audio waveform area 402.

[0108] In some embodiments, after completing any of the aforementioned atomic editing operations and causing changes in text content or logical boundaries, the atomic update process can further trigger the execution of one or more intelligent processing modules to achieve automated collaborative processing of relevant data.

[0109] In some embodiments, the intelligent processing module includes an intelligent timing synchronization module. When the executed editing operation is a displacement, removal, or insertion operation, and causes a change in the text content or boundary position of one or more text units, the system can invoke the intelligent timing synchronization module to synchronize and adjust the time stamp corresponding to the affected text units. The intelligent timing synchronization module can adopt different time stamp adjustment strategies according to different operation types.

[0110] For example, when performing a removal operation, the intelligent timing synchronization module can use deterministic merging logic to set the starting point of the timestamp of the newly generated text unit after merging to the starting timestamp of the original first text unit, and set its ending point to the ending timestamp of the original second text unit, so as to form a continuous time range covering the merged text content.

[0111] For example, when performing displacement or insertion operations and needing to recalculate the time boundary, the intelligent timing synchronization module can use at least one of the following estimation modes to determine the new time marker: a proportional allocation mode, which linearly divides or adjusts the original time range based on the proportion of characters or words in the original text content after migration or segmentation; a rate estimation mode, which estimates the time required for adding or migrating text based on a preset or context-learned average text rendering rate, and calculates the new time marker accordingly; and an automatic alignment mode based on media features, which, in application scenarios synchronized with audio and video media, automatically aligns the new time marker boundary to the matching media feature position by analyzing media features near the time boundary (e.g., silent sections in audio waveforms).
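The timestamp strategies named above can be sketched as small functions. This is an illustration under stated assumptions: time ranges are `(start, end)` tuples in seconds, silent sections are already available as a list of ranges, and the function names and the "snap to the midpoint of the nearest silence within a window" rule are choices made for this example only.

```python
def merge_range(first, second):
    """Deterministic merge: start of the first unit, end of the second."""
    return (first[0], second[1])


def proportional_split(time_range, left_chars, right_chars):
    """Divide a time range linearly by the character counts of the two parts."""
    start, end = time_range
    total = left_chars + right_chars
    if total == 0:
        raise ValueError("cannot divide a time range between two empty units")
    cut = start + (end - start) * left_chars / total
    return (start, cut), (cut, end)


def rate_estimate_end(start, n_chars, chars_per_second):
    """Rate estimation: end time implied by an average text rate."""
    return start + n_chars / chars_per_second


def snap_to_silence(t, silences, window):
    """Media-feature alignment: move t to the midpoint of the closest
    silent section if that midpoint lies within `window` seconds of t."""
    best, best_dist = t, window
    for start, end in silences:
        mid = (start + end) / 2
        dist = abs(mid - t)
        if dist <= best_dist:
            best, best_dist = mid, dist
    return best
```

For instance, splitting a range of 0 to 10 seconds between parts of 2 and 8 characters places the new boundary at 2 seconds; a later alignment pass could then nudge that boundary toward a nearby silent section.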

[0112] In some embodiments, the intelligent processing module further includes a language-aware boundary formatting module. When the displacement, removal, or insertion operation is performed, resulting in a change in the text content of any text unit, the system can automatically invoke the language-aware boundary formatting module to format the updated text content.

[0113] In some embodiments, the language-aware boundary formatting module can first identify the linguistic environment to which the currently edited text unit belongs. This identification can be based on user-preset language metadata or on the results of automatic detection of the text content. Subsequently, the module can load and apply the corresponding boundary processing rule set from the rule base according to the identified language type to ensure that the layout and format of the text after content update conforms to the writing standards of that language.

[0114] In some embodiments, the rule set may include rules for handling word separators. For languages that use spaces as word segmentation boundaries, the rules can be used to standardize the use of spaces between words and between words and punctuation marks; for non-space-segmented languages, the rules can be used to maintain close arrangement between full-width characters and automatically apply a preset normalized spacing at the junction of full-width and half-width characters to improve the readability of mixed text.
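A word-separator rule of this kind can be sketched for the junction between two pieces of text. The sketch assumes that "full-width" can be approximated by the Unicode East Asian Width classes W and F, that the preset normalized spacing is a single space, and that full-width punctuation needs no extra gap; `join_at_boundary` and `gap_between` are hypothetical names.

```python
import unicodedata

# Half-width punctuation that should attach to the preceding text.
CLOSING_PUNCT = ",.;:!?)"


def is_wide(ch: str) -> bool:
    """Approximate 'full-width' by East Asian Width class W or F."""
    return unicodedata.east_asian_width(ch) in ("W", "F")


def gap_between(a: str, b: str, mixed_gap: str = " ") -> str:
    """Choose the separator between the last char `a` and first char `b`."""
    wide_a, wide_b = is_wide(a), is_wide(b)
    if wide_a and wide_b:
        return ""  # keep full-width characters tightly arranged
    if wide_a != wide_b:
        wide_ch = a if wide_a else b
        # Assumption: no extra gap next to full-width punctuation.
        if unicodedata.category(wide_ch).startswith("P"):
            return ""
        return mixed_gap
    # Both half-width: one space, unless punctuation follows directly.
    return "" if b in CLOSING_PUNCT else " "


def join_at_boundary(left: str, right: str, mixed_gap: str = " ") -> str:
    """Join two text fragments, normalizing spacing at the junction only."""
    left, right = left.rstrip(), right.lstrip()
    if not left or not right:
        return left + right
    return left + gap_between(left[-1], right[0], mixed_gap) + right
```

Under these assumptions, two Chinese fragments are joined without a space, two English fragments with exactly one space, and a Chinese fragment followed by a Latin word receives the preset gap.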

[0115] In some embodiments, the rule set may further include punctuation attributes and line break rules, used to handle the display restrictions of punctuation marks at the beginning or end of a line after text boundary changes, as well as the unification and conversion of full-width or half-width punctuation marks. Furthermore, the language-aware boundary formatting module may provide a custom rule interface, enabling users or developers to extend or replace existing rules according to specific typesetting needs.

[0116] Through the coordinated execution of the aforementioned intelligent processing modules, text content editing, time stamp adjustment, and multilingual typesetting can be achieved without increasing the user's workload, thereby improving the automation level and overall editing quality in the process of editing ordered text sequences.

[0117] As can be seen from the continuous operation process shown in Figures 4A to 4D, the ordered text sequence editing scheme provided in this disclosure unifies the scattered operations of traditional subtitle editing, such as sentence segmentation, text merging, and text splitting, into a single interactive process centered on logical boundaries. Users can adjust the content distribution relationship between text units by intuitively manipulating the state of the logical boundaries, which keeps the editing operation consistent with the user's editing intentions and significantly reduces the number of operation steps and the cognitive burden.

[0118] In this scheme, logical boundaries are treated as abstract editing objects. Their positional changes and existence states are expressed in the user interface through boundary controls. User interaction with these boundary controls triggers atomic migration, merging, or splitting operations on the underlying text content, thus achieving a continuous and predictable editing experience. Whether in fine-grained sentence segmentation or in larger-granularity merging and splitting scenarios, the consistency of operations and the determinism of results are maintained. Furthermore, the aforementioned interaction process is supported by a data model based on persistent identity identifiers and versioned states, and is linked to the intelligent timing synchronization and language-aware boundary formatting modules. While ensuring editing efficiency, it also addresses the consistency of timestamps, the standardization of text formatting, and the traceability of historical states, thereby improving the overall reliability and scalability of editing ordered text sequences.

[0119] Embodiments of this disclosure also provide a method and corresponding data model for managing ordered text sequences. In this method, the ordered text sequence is divided into multiple text units, and by assigning persistent identifiers to each text unit and maintaining a versioned state history, the text units can maintain stable identity references even when editing operations such as displacement, splitting, or merging occur. Based on the above data model, external metadata (also known as assets) can establish a long-term and reliable association with the corresponding text units through the persistent identifiers, thereby avoiding problems such as metadata loss, mismatch, or untraceability due to structural changes during the continuous evolution of text content.

[0120] External metadata can include, but is not limited to, user notes, favorites, highlighted annotations, tags, analysis results, or comment information. By associating external metadata with the persistent identifiers of text units, rather than just their instantaneous text content or display position, a complete association history can be preserved when editing operations trigger the evolution of text units, providing a foundation for subsequent inheritance decisions, intelligent analysis, or manual intervention. To more clearly illustrate the specific implementation and technical effects of the above method and data model, the following describes in detail, with reference to the embodiments illustrated in Figure 5A, Figure 5B, Figure 6 and Figure 7, a language learning system that supports stable metadata association. It should be understood that the language learning system is for illustrative purposes only, and the embodiments disclosed herein are not limited to specific application scenarios.

[0121] Figure 5A and Figure 5B illustrate an application of embodiments of the present disclosure in a language learning scenario, wherein Figure 5A is a schematic diagram of the user interface in its initial state, and Figure 5B is a schematic diagram of the user interface after the splitting operation is performed and metadata inheritance is completed. Referring to Figure 5A, the user interface of the language learning system may include a video playback area 501, a subtitle list area 502, a focus editing area 503, and a note area 504. The video playback area 501 is used to play the corresponding audio and video content; the subtitle list area 502 is used to display subtitle sequences and metadata operation entries related to subtitle units; the focus editing area 503 is used for fine-tuning several currently selected adjacent text units; and the note area 504 is used to input and display user notes associated with the text units.

[0122] Figure 6 shows a schematic diagram of the logical architecture of a data model according to an embodiment of the present disclosure. Referring to Figure 6, the backend system in this embodiment employs a data model centered on persistent identity identifiers. Each text unit is assigned a unique persistent identity identifier upon creation, which remains unchanged throughout its lifecycle. In this embodiment, this identifier is represented in UUID format. The data model can include an immutable entity layer 601, a versioned state history layer 602, a real-time state layer 603, and an external relationship and metadata layer 604. The layers are decoupled from one another and cooperate through explicit reference relationships.

[0123] The immutable entity layer 601 is used to define text unit entities in an ordered text sequence. Each text unit is assigned a unique persistent identifier upon creation, represented in UUID form in this embodiment, for example the text unit A (UUID: 7A1) and text unit B (UUID: 7A2) shown in Figure 6. This layer is only used to represent the identity of the text unit and does not directly carry variable state information, thus ensuring that the identity reference of the text unit remains stable after it has undergone operations such as splitting, merging, or content adjustment.

[0124] The versioned state history layer 602 records the state evolution process corresponding to each persistent identity identifier. For each state change of the same text unit, the system generates a new state snapshot and stores it in chronological or version order. As shown in Figure 6, for UUID: 7A1, there can be multiple consecutive state versions v1, v2, v3, and v4; for UUID: 7A2, there can also be multiple state versions. The state versions form a traceable state evolution chain through referencing relationships, supporting historical backtracking, version comparison, and source analysis of editing operations.

[0125] The real-time state layer 603 is used to maintain the latest active state of each text unit. This layer stores a reference to the corresponding latest state snapshot, indexed by a persistent identity identifier. For example, UUID: 7A1 corresponds to the latest state v4, and UUID: 7A2 corresponds to the latest state v3. Real-time rendering of the user interface, interactive feedback, and subsequent editing operations are all accessed directly based on this real-time state layer, thus avoiding traversing the entire version history during high-frequency interactions and improving system response efficiency.

[0126] The external relationship and metadata layer 604 is used to store external information associated with text units, such as user notes, favorites, or other extended metadata. As shown in Figure 6, Note 1 and Collection Record A are not directly bound to specific text content or state versions. Instead, they are associated by referencing the persistent identity identifier of the corresponding text unit (such as UUID: 7A1 or UUID: 7A2). With this design, even if the text unit undergoes content adjustments, version updates, merging, or splitting during subsequent editing, the aforementioned external metadata can still maintain a stable association through identity references, avoiding association failure due to structural changes.

[0127] Through the hierarchical data model design described above, the identity, state evolution, real-time access, and external associations of text units are decoupled from each other, thus providing basic support for the consistency maintenance, historical tracing, and stable management of external metadata of ordered text sequences during dynamic editing.
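The four layers can be sketched as four separate containers keyed by the persistent identifier. This is a minimal in-memory illustration, assuming plain dictionaries and short string identifiers such as "7A1"; the class and method names (`LayeredStore`, `create`, `update`, `attach`) are hypothetical and not taken from the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    version: int
    text: str


class LayeredStore:
    def __init__(self):
        self.entities = set()  # layer 601: identities only, no mutable state
        self.history = {}      # layer 602: uid -> list of Snapshot, oldest first
        self.live = {}         # layer 603: uid -> latest Snapshot
        self.metadata = {}     # layer 604: uid -> list of external records

    def create(self, uid: str, text: str) -> None:
        if uid in self.entities:
            raise KeyError(f"identifier {uid} already exists")
        self.entities.add(uid)
        self.history[uid] = []
        self.metadata[uid] = []
        self._record(uid, text)

    def update(self, uid: str, text: str) -> None:
        """Every state change appends a new snapshot; none is overwritten."""
        self._record(uid, text)

    def _record(self, uid: str, text: str) -> None:
        snap = Snapshot(version=len(self.history[uid]) + 1, text=text)
        self.history[uid].append(snap)
        self.live[uid] = snap  # rendering reads this, not the whole history

    def attach(self, uid: str, record) -> None:
        """Bind metadata to the identifier, not to any particular snapshot."""
        if uid not in self.entities:
            raise KeyError(f"unknown identifier {uid}")
        self.metadata[uid].append(record)
```

In this sketch, a note attached to "7A1" stays attached however many times the text of "7A1" is updated, because the note references the identifier rather than a snapshot.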

[0128] Taking the language learning process of a user watching the movie "Forrest Gump" as an example, as shown in Figure 5A, the focus editing area 503 displays three consecutive subtitle text units loaded from the subtitle file. These text units are separated from each other by boundary controls, specifically including: the first text unit 506, with a persistent identifier of UUID:7A1, whose text content is "My momma always said,"; the second text unit 507, with a persistent identifier of UUID:7A2, whose text content is "'Life was like a box of chocolates, you never know what you're gonna get.'"; and the third text unit 508, with a persistent identifier of UUID:7A3, whose text content is "Those must be comfortable shoes."

[0129] During the learning process, the user becomes interested in the expressions in the second text unit 507 and focuses on learning the colloquial usage of "gonna." The user can use the system's built-in functions to look up relevant definitions and enter related learning notes in the notes area 504. Simultaneously, the user activates the favorite button 505 corresponding to the second text unit 507 in the subtitle list area 502 to mark that subtitle content as a key learning focus.

[0130] During this process, the note content and favorite status are stably associated with the persistent identity identifier UUID:7A2 of the second text unit 507 through the data and history management module, thereby realizing the reference binding between metadata and text units. Subsequently, the user considers the second text unit 507 too long for comfortable reading and understanding, and decides to perform a segmentation operation. The user positions the text cursor between "chocolates," and "you" and executes the insertion command. In response to the command, the system executes the atomic segmentation process.

[0131] At the data level, the original second text unit 507 is marked as segmented, and two new text units are generated, each assigned a new persistent identity identifier. Specifically, as shown in Figure 5B, the system generates: a new second text unit 509 with a persistent identity identifier of UUID:8B1 and a text content of "'Life was like a box of chocolates,'"; and a new third text unit 510 with a persistent identity identifier of UUID:8B2 and a text content of "you never know what you're gonna get.".

[0132] On the user interface, the two segmented text units are displayed adjacent to each other, with a new boundary control generated between them. Simultaneously, the subtitle structure in the subtitle list area 502 is updated accordingly. In this embodiment, when performing the segmentation operation, the data and history management module records the evolution relationship between the original text unit and the new text unit in the versioned state history layer 602. Specifically, the system creates a final state snapshot for the original UUID:7A2 to identify that the text unit has been segmented into two new units corresponding to UUID:8B1 and UUID:8B2.

[0133] Based on the above evolutionary relationships, the system can identify the inheritance question that the original metadata now faces. Therefore, as shown in Figure 7, the system can trigger dialog box 701 to prompt the user about how to handle the existing notes and favorite status. This interface provides the user with several options, including associating the metadata with either one of the new text units after splitting, or associating it with both new text units separately, and receives the user's final selection through confirmation element 705.

[0134] For example, when the system identifies, based on the versioned state history, that an existing text unit has been segmented and that there is inheritance ambiguity in the external metadata associated with that original text unit, the system can automatically trigger dialog box 701 to guide the user in deciding how the metadata should subsequently be associated. Dialog box 701, presented as a dialog box or in a similar form, clearly explains to the user the origin of the current metadata and the available processing options.

[0135] Specifically, dialog box 701 may first display a prompt message indicating that the relevant metadata was originally associated with the text unit before the split, and prompting that the text unit has evolved into multiple new text units, thus requiring the user to confirm the whereabouts of the metadata. For example, the prompt message may summarize the original note content and its associated objects, and ask the user how they wish to handle the note or its saved status.

[0136] In dialog box 701, the system can further provide several optional operation options for the user to choose from. These options may include: a first option 702, instructing the system to associate the metadata with the first new text unit generated after segmentation; a second option 703, instructing the system to associate the metadata with the second new text unit generated after segmentation; and a third option 704, instructing the system to create copies of the metadata for each of the multiple new text units and associate the corresponding copies with each new text unit. Through these options, the user can explicitly specify the attribution of metadata after text evolution based on actual semantic understanding or personal preference.

[0137] In addition, dialog box 701 also includes a confirmation element 705, such as a confirmation button or an equivalent interactive control. When the user selects any of the above options in dialog box 701 and activates the confirmation element 705, the system performs the corresponding association operation according to the user's selection, updates the metadata to the corresponding new text unit, and records the association decision in the background as part of the versioned state history.
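The three options and the recording of the decision can be sketched as follows. This is an illustration under assumptions: metadata is held in a dictionary keyed by persistent identifier, the original association is moved rather than kept, and the decision is appended to a plain list standing in for the versioned state history. `InheritChoice` and `reassign_metadata` are hypothetical names.

```python
import copy
from enum import Enum


class InheritChoice(Enum):
    FIRST = 702         # attach to the first new text unit
    SECOND = 703        # attach to the second new text unit
    COPY_TO_BOTH = 704  # attach an independent copy to each new unit


def reassign_metadata(metadata, old_uid, new_uids, choice, decisions):
    """Move the records of `old_uid` according to the user's confirmed choice."""
    first, second = new_uids
    targets = {
        InheritChoice.FIRST: [first],
        InheritChoice.SECOND: [second],
        InheritChoice.COPY_TO_BOTH: [first, second],
    }[choice]
    records = metadata.pop(old_uid, [])
    for target in targets:
        # Deep copies keep the per-unit records independent of each other.
        metadata.setdefault(target, []).extend(copy.deepcopy(records))
    # Log the decision so the association history stays traceable.
    decisions.append({"from": old_uid, "to": targets, "choice": choice.name})
    return targets
```

In the Forrest Gump example, choosing the second option for the note on UUID:7A2 would leave the note attached to UUID:8B2 and add one entry to the decision log.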

[0138] In some embodiments, as the association operation is completed, the relevant display in the user interface can also be updated synchronously. For example, in the subtitle list area, the favorite status indicator associated with the selected new text unit can be updated to an active state to visually reflect that the metadata has been successfully inherited and re-associated. In this way, even when the text unit undergoes structural evolution, the system can still achieve accurate inheritance and consistent presentation of metadata under the control of the user.

[0139] In some embodiments, the system can also be configured to automatically invoke an intelligent context analysis mechanism after detecting structural changes such as splitting or merging of text units, in order to help determine the inheritance relationship of external metadata associated with the original text unit between the evolved text units.

[0140] In this implementation, the system can perform semantic analysis or behavioral correlation analysis on user notes, favorite status, or other metadata content associated with the original text unit to identify core information elements that are highly associated with the metadata. This analysis can be achieved through one or more of the following methods.

[0141] In one implementation, the system can perform text cross-reference-based analysis, comparing the metadata content with the text in the original text units to identify common words or phrases. Based on the frequency or positional relationship of these common occurrences, the system assigns corresponding association weights to the relevant words. When a word appears in both the metadata content and the original text content, that word can be identified as a candidate core word with a high degree of relevance.

[0142] In another implementation, the system can also analyze the user's historical interaction behavior. For example, if the user has used dictionary lookups, pronunciation playback, or other learning functions for a specific word before generating the notes, the system can determine the importance of that word in the notes and consider it as a candidate for core vocabulary.

[0143] In another implementation, the system can also call a natural language processing engine to perform syntactic analysis, keyword extraction, or topic modeling on the notes to determine keywords or concepts that can represent the main semantics of the notes, thus serving as a reference for judging inheritance relationships.

[0144] After completing the above analysis, the system can detect the attribution of the identified core words in multiple newly segmented text units. When the core word appears in only one of the new text units, the system can determine the priority association between the metadata and that new text unit.
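The cross-reference analysis and the attribution check can be sketched together. The sketch assumes English text, a simple regular-expression tokenizer, and a small hand-picked stopword list; `tokens`, `core_words`, and `preferred_unit` are hypothetical names, and a real system could substitute a natural language processing engine for the tokenizer.

```python
import re

# A tiny illustrative stopword list (assumption, not from the disclosure).
STOPWORDS = frozenset({"a", "an", "the", "is", "of", "for", "to", "was"})


def tokens(text: str) -> list[str]:
    """Lowercase words, keeping inner apostrophes as in "you're"."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def core_words(note: str, original_text: str, looked_up=()) -> set[str]:
    """Words shared by the note (or the user's lookups) and the original text."""
    original = set(tokens(original_text))
    candidates = set(tokens(note)) | {w.lower() for w in looked_up}
    return (candidates & original) - STOPWORDS


def preferred_unit(core: set[str], new_units: dict[str, str]):
    """Return the identifier of the only new unit containing a core word,
    or None when no unit or more than one unit matches."""
    hits = [uid for uid, text in new_units.items() if core & set(tokens(text))]
    return hits[0] if len(hits) == 1 else None
```

With a note about "gonna" and the split shown in Figure 5B, the only core word is "gonna", which occurs only in the unit UUID:8B2, so that unit would be proposed as the default target; when core words fall in both units, the function returns None and the choice stays with the user.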

[0145] Based on this, the system can optimize the user interaction flow. For example, when triggering the user interface shown in Figure 7, the system can set the association option corresponding to the text unit matching the core vocabulary to the default selected state to reduce the user's operational burden. In some embodiments, if the user has enabled the automatic processing mode in their preferences, the system can also directly perform the re-association operation of metadata when the inheritance relationship is clear and unambiguous, without displaying the interactive interface to the user, thereby achieving automated execution of inheritance processing.

[0146] Through the above methods, even when text units undergo structural evolution, the system can still reasonably determine the inheritance relationship of external metadata based on content semantics and user behavior clues, thereby maintaining a stable association between metadata and text content without affecting editing flexibility.

[0147] Figures 8A to 8G illustrate the continuous operation process of the present disclosure in a vertical-layout subtitle editing scenario. This example demonstrates that the logical boundary-based editing mechanism proposed in this disclosure has a high level of abstraction, and that its technical concept does not depend on a specific user interface layout. Therefore, this embodiment shows a preferred implementation suitable for a vertical list layout to further illustrate the adaptability of this disclosure under different interface paradigms.

[0148] In this implementation, boundary controls are no longer presented as independent visual controls that persist indefinitely. Instead, the logical boundaries are manipulated through functional linkage of the display edges of adjacent text units. The system employs a progressive interactive activation mechanism to accurately identify user intent and provide a smooth and efficient editing experience without altering the existing interface structure.

[0149] Referring to Figures 8A to 8G, a complete operation flow under this embodiment includes the following stages. The focus indication and boundary activation stage is shown in Figure 8A and Figure 8B, where Figure 8A is a schematic diagram of the initial user interface when the mouse hovers over the target subtitle, and Figure 8B is a schematic diagram of the user interface when the user places the cursor on the left edge of the text box, preparing to perform a displacement operation. In the vertically arranged list of text units, there are logically adjacent text units A above and B below. When the user's input device enters the display area of a text unit, the system provides a focus indication effect for that text unit, such as highlighting its background area, to clearly identify the current interactive object. During this stage, the system does not activate any boundary manipulation functions, thus maintaining the simplicity of the interface.

[0150] After a text unit gains focus, when the user moves the input device further into the preset edge trigger area of that text unit, the system enters a boundary-linked activation state. For example, as shown in Figure 8B, the system instantiates and displays the left boundary control 806a_L corresponding to the text unit B below on the interface, and simultaneously displays the right boundary control 806a_R corresponding to the text unit A above. In the audio waveform area 802, the system also displays the timestamp pair 805a_end and 805a_start associated with this logical boundary.

[0151] To enhance the perceptibility of linkage, the system can also change the cursor shape or display a temporary connection indicator between two boundary controls. For example, after the user moves the cursor to the boundary interaction trigger area and enters the boundary control state, the system can switch the cursor shape from a regular pointer to a double-headed arrow indicating horizontal adjustment capability, visually indicating that the position supports displacement operation along the text arrangement direction. Furthermore, the system can also display a temporary connection indicator, such as a dashed line, between the two linked boundary controls (e.g., 806a_L and 806a_R in Figure 8B) to clearly indicate that these two boundary controls belong to the same logical boundary in the current operation and will respond collaboratively to the user's input.

[0152] Through the aforementioned phased activation mechanism, the system only provides corresponding interactive capabilities when the user clearly expresses their intention to control the boundaries, thereby improving the accuracy of operations and the overall user experience.
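
The phased activation described above can be pictured as a small state resolution step. The following is a minimal sketch only, assuming a horizontal pointer coordinate and a fixed-width edge trigger area; the names `UiState`, `resolve_state`, and `EDGE_TRIGGER_PX` are hypothetical and do not come from this disclosure.

```python
from enum import Enum, auto


class UiState(Enum):
    IDLE = auto()             # pointer is outside the text unit
    FOCUSED = auto()          # pointer is inside the unit: focus highlight only
    BOUNDARY_ACTIVE = auto()  # pointer is in an edge trigger area: controls shown


# Hypothetical width of the edge trigger area, in pixels (an assumption).
EDGE_TRIGGER_PX = 12


def resolve_state(pointer_x: float, unit_left: float, unit_right: float) -> UiState:
    """Map a horizontal pointer position to an activation state for one unit."""
    if not (unit_left <= pointer_x <= unit_right):
        return UiState.IDLE
    near_left = pointer_x - unit_left <= EDGE_TRIGGER_PX
    near_right = unit_right - pointer_x <= EDGE_TRIGGER_PX
    if near_left or near_right:
        return UiState.BOUNDARY_ACTIVE
    return UiState.FOCUSED
```

In this sketch the boundary controls would be instantiated only when the result is `BOUNDARY_ACTIVE`, which mirrors the idea that boundary manipulation is offered only after the user signals that intent.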

[0153] The linked drag and displacement operation stage is shown in Figures 8C and 8D, where Figure 8C is a schematic diagram of the user interface while a displacement operation is in progress and the user has not yet released the mouse, and Figure 8D is a schematic diagram of the user interface after the user releases the mouse following the displacement operation. When the virtual boundary is activated, the user can drag either boundary control to adjust the position of the logical boundary. The system can perform the same displacement operation process as in the previous embodiments. Taking a rightward drag as an example, the system identifies one or more text elements to be migrated (e.g., "with the bright moon in the mountains,") from the starting position of text unit B below, atomically removes them from text unit B, and simultaneously adds them to the end of text unit A. On the interface, the user can intuitively observe the content of text unit A increase and the content of text unit B correspondingly decrease.

[0154] When the user finishes dragging, the system commits the entire adjustment process as a transaction. The changes to the text content are fixed, the temporarily displayed boundary controls and auxiliary visual feedback are removed, and the interface returns to its normal focus state. Simultaneously, the timestamp associated with this logical boundary is updated to the new time position to reflect the adjusted text boundary.
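
As an illustration of the rightward drag and its commit, the sketch below moves leading text elements of unit B to the end of unit A and then reads the updated timestamp pair for the boundary. It assumes that every text element carries its own start and end time, which this passage does not state; the names `Element`, `TextUnit`, `migrate_right`, and `boundary_timestamps` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    text: str
    start: float  # seconds; per-element timing is an assumption of this sketch
    end: float


@dataclass
class TextUnit:
    elements: list = field(default_factory=list)

    @property
    def text(self) -> str:
        return "".join(e.text for e in self.elements)


def migrate_right(a: TextUnit, b: TextUnit, count: int) -> tuple:
    """Move the first `count` elements of b to the end of a.

    New units are returned instead of mutating the inputs, so the caller can
    swap both in one step (one possible way to keep the update atomic).
    """
    if not 0 <= count <= len(b.elements):
        raise ValueError("count exceeds the elements available in the source unit")
    moved, kept = b.elements[:count], b.elements[count:]
    return TextUnit(a.elements + moved), TextUnit(kept)


def boundary_timestamps(a: TextUnit, b: TextUnit) -> tuple:
    """Return the (end of a, start of b) timestamp pair for the shared boundary."""
    return a.elements[-1].end, b.elements[0].start
```

Because the concatenation of the two units is the same before and after the call, the order of the overall text sequence is preserved; only the position of the logical boundary changes.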

[0155] The removal operation stage is shown in Figures 8E and 8F, where Figure 8E is a schematic diagram of the user interface when the user places the cursor on the right edge of the subtitle box in preparation for a removal (merge) operation, and Figure 8F is a schematic diagram of the user interface after the removal (merge) operation is performed. Referring to Figure 8E, the user can activate the boundary controls (e.g., 806b_L and 806b_R) in the same way as for the displacement operation, and can then input a command to remove the logical boundary. Such commands can be input through, but are not limited to, context menu selection, keyboard shortcuts, or gesture input.

[0156] For example, in one implementation, a user can input a removal command via a context menu on an activated boundary control. Specifically, when the user right-clicks the boundary control, the system displays an operation menu related to the logical boundary, from which the user can select an option such as "Remove Boundary" or "Merge Units," and the system then triggers the corresponding removal operation.

[0157] In another implementation, when the boundary control is active or selected, the user can directly input the removal command via keyboard shortcuts, such as pressing the delete key. After receiving the shortcut command, the system will perform the corresponding logical boundary removal and text unit merging process.

[0158] Furthermore, in devices or environments that support multimodal input, the removal command can also be input via gestures or voice. For example, on a touchscreen device, a preset pinch gesture can be performed on the boundary control, or a command such as "merge these two sentences" can be input via voice. The system will then recognize and parse this as a removal operation targeting the logical boundary. These various command input methods can be used individually or combined according to specific application scenarios to improve the flexibility and adaptability of the interaction.

[0159] Referring to Figure 8F, upon receiving the removal instruction, the system performs an atomic merging operation: it merges the text content and corresponding time markers of the two adjacent text units to generate a new continuous text unit, and removes the original logical boundary and its corresponding boundary controls from the interface.
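
The merge can be sketched as follows. This is only an illustration, assuming each text unit stores its text with one start time and one end time; `TextUnit` and `merge_units` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextUnit:
    text: str
    start: float  # seconds
    end: float


def merge_units(units: list, index: int) -> list:
    """Remove the boundary between units[index] and units[index + 1].

    Returns a new list and leaves the input untouched, so the caller can
    replace the whole sequence in a single assignment.
    """
    if not 0 <= index < len(units) - 1:
        raise IndexError("no boundary after this unit")
    first, second = units[index], units[index + 1]
    merged = TextUnit(first.text + second.text, first.start, second.end)
    return units[:index] + [merged] + units[index + 2:]
```

The merged unit keeps the start time of the first unit and the end time of the second, so the inner timestamp pair that belonged to the removed boundary simply disappears.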

[0160] The insertion operation stage is shown in Figures 8F and 8G, where Figure 8G is a schematic diagram of the user interface when the mouse hovers over the newly generated boundary controls after an insertion (splitting) operation. When a user executes a splitting command within a text unit, the system splits the text unit at the data level, generating two new text units and establishing a new logical boundary between them. The system then displays a pair of boundary controls, 806c_L and 806c_R, corresponding to this new logical boundary on the user interface, allowing the user to further fine-tune the boundary through displacement or removal operations. Commands used to trigger the splitting operation can take various forms, such as keyboard shortcuts and context menu selections.

[0161] For example, in one implementation, a user can first position the text input cursor at the desired segmentation location, such as between two adjacent characters or words within a text unit, and then initiate the segmentation operation by pressing a preset keyboard shortcut. The keyboard shortcut can be a single key or a combination of function keys and character keys. After detecting the shortcut input, the system identifies the cursor position as the segmentation point and performs segmentation processing on the current text unit at the data level accordingly.

[0162] In another implementation, the user can also initiate a segmentation operation via a context menu. For example, the user right-clicks the input device at the target segmentation location. Upon detecting this operation, the system displays a context menu associated with the current location, which includes at least one menu item indicating the segmentation operation, such as "Segment Here". When the user selects this menu item, the system uses the cursor position as the segmentation point and performs the segmentation operation on the corresponding text unit.

[0163] It should be understood that the above-described keyboard shortcuts and context menu interactions are merely exemplary implementations used to illustrate different ways of initiating segmentation commands, and this disclosure does not limit the specific command format. In other embodiments, touch gestures, voice input, or other human-computer interaction methods can also be used to trigger the segmentation operation, as long as the segmentation position can be clearly indicated and the corresponding text unit splitting processing can be triggered.
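
Whatever input triggers it, the data-level split reduces to cutting one unit at the indicated position. The sketch below is a simplified illustration that treats the segmentation point as a character offset and omits timing data; `TextUnit` and `split_unit` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextUnit:
    text: str


def split_unit(units: list, index: int, offset: int) -> list:
    """Split units[index] at character `offset`, creating a new logical boundary."""
    text = units[index].text
    if not 0 < offset < len(text):
        raise ValueError("split point must fall strictly inside the unit")
    left, right = TextUnit(text[:offset]), TextUnit(text[offset:])
    return units[:index] + [left, right] + units[index + 1:]
```

Rejecting offsets at either end avoids creating an empty text unit; whether an implementation allows that case is a design decision this sketch does not settle.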

[0164] As can be seen from the above embodiments, the logical boundary manipulation mechanism proposed in the embodiments of this disclosure can be flexibly implemented under different interface layout conditions, and can be integrated into the existing vertical list-style user interface in a non-intrusive manner, thereby expanding the scope of application of the embodiments of this disclosure.

[0165] Figures 9A to 9G show schematic diagrams of a process of creating an ordered text sequence based on contextual interactive operations according to an embodiment of the present disclosure. This embodiment offers a logical boundary manipulation method that does not rely on explicit boundary controls, used to adjust the logical boundaries between adjacent text units during the editing of an ordered text sequence. In this method, the system determines the user's editing intent by parsing the context information of the interactive operations performed by the user within the text unit content area, and accordingly performs operations such as displacement, merging, or splitting on the corresponding logical boundaries.

[0166] Specifically, in this embodiment, the logical boundaries are not presented as independent visual controls on the user interface, but rather exist as implicit editable objects. The system identifies the user's intent by combining the type of user input, triggering conditions, and direction of the operation, thus enabling precise manipulation of the logical boundaries without displaying boundary controls. Therefore, users can directly adjust the previous or next logical boundary within the text unit through interactive operations, without needing to precisely move the cursor to a specific boundary position.

[0167] The core of this model is that the system no longer relies on the user pointing precisely at a specific "hotspot" (such as a unit edge). Instead, it unambiguously determines whether the user wants to manipulate the "front boundary" or the "back boundary" by parsing the "type" (e.g., whether the left or right mouse button is used) and "direction" (e.g., whether dragging to the left or right) of the user's input gesture, and triggers the atomic content migration process described in this disclosure accordingly. A detailed description is provided below with reference to Figures 9A to 9G.

[0168] As shown in Figure 9A, in the initial state, the ordered text sequence includes the logically adjacent previous text unit and current text unit, which can be referred to as unit A and unit B, respectively. At this time, no explicit boundary controls are displayed in the user interface; the logical boundary exists as an implicit editing object between unit A and unit B.

[0169] As shown in Figures 9B and 9C, when a user places the cursor of the input device within the content display area of the current text unit B and performs a preset contextual interaction operation, the system parses the operation. In some embodiments, the contextual interaction operation includes pressing a specific input key within the text unit content area and dragging in a predetermined direction. For example, when the user presses and holds the left mouse button within unit B and drags to the right, the system parses the operation as an intention to displace the front logical boundary of unit B, that is, to move part of the text content at the beginning of unit B to the end of unit A.

[0170] During the drag-and-drop operation, the system provides focused visual feedback to the text unit currently being targeted, such as through background highlighting, outline display, or text style changes, to clearly indicate the scope of the current interaction. Simultaneously, the system initiates an iterative preview process, displaying potential migration results in real time without immediately modifying the underlying data. Specifically, at the beginning of unit B (the source text unit), the system applies first visual feedback to the text content to be migrated, indicating that this part of the content is in a state awaiting migration; simultaneously, at the end of unit A (the target text unit), a corresponding second visual feedback is generated to preview the insertion effect after the text content is migrated. As the user's drag-and-drop operation continues, the scope of the previewed content changes accordingly, thus visually representing the virtual movement process of the logical boundaries.

[0171] As shown in Figure 9D, when the user ends the aforementioned contextual interaction, such as by releasing the mouse button, the system treats this action as an edit submission command and performs atomic data update operations accordingly. Based on the final migration range determined in the preview phase, the system removes the corresponding text content from the underlying data of unit B and adds it to the underlying data of unit A, completing the actual displacement of the front boundary. Subsequently, the user interface is refreshed, the preview visual feedback is removed, and the changes to the text content are solidified into the final state. Simultaneously, the timestamp pair associated with this logical boundary is also synchronously updated to the new time position to maintain consistency between the text content and the timeline.
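
The preview and submission steps can be sketched as a small session object. This is a minimal illustration, assuming text elements are represented as plain strings; the class `DragPreview` and its methods are hypothetical names rather than part of this disclosure.

```python
from dataclasses import dataclass


@dataclass
class DragPreview:
    """Two-stage update: preview while dragging, one data update on release."""
    source: list      # elements of unit B (content leaves from its start)
    target: list      # elements of unit A (content arrives at its end)
    pending: int = 0  # number of leading source elements marked for migration

    def update(self, count: int) -> None:
        # Clamp to the available range; only the preview counter changes here.
        self.pending = max(0, min(count, len(self.source)))

    def preview(self) -> tuple:
        """Return (marked in source, ghost in target) for rendering feedback."""
        marked = self.source[: self.pending]
        return marked, list(marked)

    def commit(self) -> tuple:
        """Apply the final range once and return the new (target, source) lists."""
        moved = self.source[: self.pending]
        new_target = self.target + moved
        new_source = self.source[self.pending:]
        self.target, self.source, self.pending = new_target, new_source, 0
        return new_target, new_source
```

Calling `update` repeatedly during a drag never touches the lists that were passed in; only `commit`, called on release, produces the new contents of both units in one step.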

[0172] In another scenario, the system can also adjust the logical boundary between the current text unit and its successor text unit by parsing different contextual interaction operations. Figures 9E to 9G illustrate the back boundary displacement operation based on contextual intent recognition.

[0173] As shown in Figure 9E, the user places the cursor of the input device within the display area of text unit B, which is the current operation target, and performs a preset type of interaction operation for manipulating the back logical boundary. For example, the user can hold down the right mouse button and drag in a predetermined direction (such as to the right). After detecting this interaction operation, the system combines the operation type and the operation direction and interprets it as the user's intention to adjust the logical boundary between text unit B and the subsequent text unit C, that is, to migrate some of the text content at the start of the subsequent text unit C to the end of the current text unit B.

[0174] While the user continues this interaction operation, the system activates the linkage feedback mechanism of the preview stage. As shown in Figure 9F, the system does not immediately modify the underlying data but generates a non-destructive visual preview corresponding to the displacement of the logical boundary. On the one hand, at the start position of text unit C, which is the source text unit, the text content to be migrated is marked with the first visual feedback to indicate that this part of the content is pending migration. On the other hand, at the end position of text unit B, which is the target text unit, a corresponding second visual feedback is generated synchronously to preview the display result after the text content is migrated. During this process, the indication state representing the back logical boundary conceptually changes with the user's operation, and its position defines the text range of the current preview migration.

[0175] When the user ends this interaction operation, such as by releasing the mouse button, the system regards this behavior as a submission instruction for a complete displacement operation. As shown in Figure 9G, the system performs a one-time atomic data update according to the final state determined in the preview stage, removes the marked text content from text unit C, and adds it to text unit B, thus completing the displacement of the back logical boundary. At the same time, the user interface is refreshed to reflect the updated text content, and all temporary preview visual feedback is removed.
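
The two walkthroughs above (left button for the front boundary, right button for the back boundary, both dragged to the right) suggest a simple intent resolution step, sketched below. The behavior for a leftward drag is an assumption of this sketch, not something stated in the text, and the names `Button`, `Boundary`, `Intent`, and `parse_intent` are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Button(Enum):
    LEFT = "left"
    RIGHT = "right"


class Boundary(Enum):
    FRONT = "front"  # between the previous unit and the current unit
    BACK = "back"    # between the current unit and the next unit


@dataclass(frozen=True)
class Intent:
    boundary: Boundary
    source: str  # unit that gives up content: "previous", "current", or "next"
    target: str  # unit that receives content


def parse_intent(button: Button, dx: float) -> Optional[Intent]:
    """Resolve which boundary a drag targets and which way content flows."""
    if dx == 0:
        return None  # no direction yet, so no intent can be determined
    boundary = Boundary.FRONT if button is Button.LEFT else Boundary.BACK
    if boundary is Boundary.FRONT:
        before, after = "previous", "current"
    else:
        before, after = "current", "next"
    if dx > 0:
        # Rightward drag: the start of the later unit moves to the end of the earlier one.
        return Intent(boundary, source=after, target=before)
    # Leftward drag: assumed here to be the reverse migration.
    return Intent(boundary, source=before, target=after)
```

With this mapping, a left-button drag to the right inside unit B yields content flowing from unit B to unit A, and a right-button drag to the right yields content flowing from unit C to unit B, matching the two cases described.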

[0176] In some embodiments, the system can also update, in a linked manner, other metadata associated with this logical boundary. For example, in subtitle or audio-video synchronization scenarios, the timestamp pair corresponding to this back logical boundary (such as 905b_end and 905b_start) can be synchronously adjusted to the new time position to ensure consistency between the text content change and the timeline.

[0177] It should be noted that the "preview-submit" two-stage update mechanism described above is only one implementation method in this embodiment. In this implementation, during the user's drag-and-drop operation, the system only generates a visual preview to indicate the potential migration results, without immediately modifying the underlying data. Only after the user finishes the operation is an atomic data update performed all at once. This approach helps reduce computational overhead during the interaction process and improves data consistency and overall performance in multi-user collaborative or cross-network editing scenarios.

[0178] In other implementations, the system can also update the underlying data model in real time and iteratively during the user's drag-and-drop operation, keeping the changes in text content synchronized with the user's operation process, thereby achieving an instant feedback editing effect. The different update strategies mentioned above can be selected according to the specific application environment, performance requirements, or user preferences, but their core lies in achieving atomic content migration between adjacent text units by manipulating logical boundaries.

[0179] Furthermore, the contextual interaction method used to distinguish the user's intention to manipulate the previous or subsequent logical boundary is not limited to the distinction based on the left and right mouse buttons. In other implementations, a combination of keyboard modifiers and drag-and-drop operations can be used, for example, different combinations of modifier keys and drag-and-drop operations can express different boundary manipulation intentions; on touchscreen devices or devices that support multi-point input, the distinction can also be made through different numbers of touch points or different types of gestures.

[0180] Furthermore, the contextual dimension used to parse user control intentions is not limited to the input method itself. In an alternative implementation, the system can also determine the logical boundary targeted by the user's drag operation based on the starting spatial position of the drag operation within the target text unit display area. Specifically, when the user begins to perform a drag operation within a preset first half of the text unit display area, the system parses the operation as a control instruction targeting the preceding logical boundary of the text unit; correspondingly, when the user begins to perform a drag operation within a preset second half of the text unit display area, the system parses the operation as a control instruction targeting the following logical boundary of the text unit and triggers the corresponding content migration process.

[0181] The aforementioned spatial context-based intent parsing mechanism can be used alone or in combination with gesture-type or input-method-based parsing mechanisms to provide users with a more intuitive and unambiguous logical boundary control method in different device environments and interaction scenarios.
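
The spatial variant can be sketched as a single function. It is an illustration only, assuming the position is measured along one axis of the display area and that the preset halves meet at a configurable ratio; `boundary_from_drag_start` and `split_ratio` are hypothetical names.

```python
def boundary_from_drag_start(
    start_pos: float,
    unit_origin: float,
    unit_extent: float,
    split_ratio: float = 0.5,
) -> str:
    """Choose the targeted logical boundary from where a drag begins.

    Positions are measured along one axis of the text unit display area.
    Returns "preceding" for the first part of the area and "following"
    for the rest.
    """
    if unit_extent <= 0:
        raise ValueError("unit_extent must be positive")
    rel = (start_pos - unit_origin) / unit_extent
    if not 0.0 <= rel <= 1.0:
        raise ValueError("drag must start inside the text unit display area")
    return "preceding" if rel < split_ratio else "following"
```

A combined scheme could call this function first and fall back to a button-based or gesture-based rule when the result needs to be overridden.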

[0182] Through the above-described context-interaction-based non-explicit logical boundary manipulation method, it can be seen that this embodiment achieves precise adjustment of the logical boundary between adjacent text units without relying on explicit boundary controls. Because this method interprets the user's editing intent as an operation on the logical boundary, rather than an operation on specific interface controls, its technical implementation has a high level of abstraction, making the related methods unrestricted by specific user interface layouts or control forms, thus adapting to implementation methods under different interaction paradigms. It should be noted that although this embodiment emphasizes that the solution can achieve manipulation without relying on explicit boundary controls, the implementation of this solution does not exclude the existence of explicit boundary controls. In certain application scenarios, the context-interaction-based non-explicit manipulation method described in this embodiment can coexist or be integrated with any form of explicit boundary control (such as the explicitly existing boundary controls described in the previous embodiments, or boundary controls generated based on linkage activation). In this hybrid mode, explicit boundary controls can exist only as auxiliary visual indicators of the current position of the logical boundary, or be provided as another parallel, redundant manipulation method.

[0183] At the interaction level, users can directly adjust the boundaries within the text content area using preset gestures or operational context, without needing to precisely position the cursor at a specific boundary. This helps reduce operational steps and improve editing efficiency. Simultaneously, this method avoids introducing additional visual controls during the editing process, thereby reducing interface complexity and minimizing distractions for the user.

[0184] Furthermore, since this solution can be implemented without significantly altering the existing user interface structure, it possesses good technical compatibility and integrability, making it suitable for application in existing text editing, subtitle editing, or content management systems. Through the aforementioned technical means, this embodiment improves the system's flexibility and scalability while ensuring editing accuracy and operational consistency, facilitating its widespread use in various application scenarios.

[0185] Figure 10 shows a flowchart of a method for processing an ordered text sequence according to an embodiment of the present disclosure. At block 1002, the ordered text sequence is displayed. At block 1004, one or more boundary controls are displayed, wherein the one or more boundary controls divide the ordered text sequence into a plurality of text units, the plurality of text units including an adjacent first text unit and second text unit, a first boundary control of the one or more boundary controls being displayed between the first text unit and the second text unit, the first text unit and the second text unit including sub-text sequences in the ordered text sequence, the first text unit including first text in the sub-text sequence located on a first side of the first boundary control, and the second text unit including second text in the sub-text sequence located on a second side of the first boundary control. At block 1006, in response to a user operation on the first boundary control, the first text included in the first text unit and the second text included in the second text unit are adjusted.

[0186] Figure 11 shows a flowchart of another method for processing an ordered text sequence according to an embodiment of the present disclosure. In this method, the ordered text sequence is divided into a plurality of text units, the plurality of text units including an adjacent first text unit and second text unit, wherein the first text unit includes first text and the second text unit includes second text. At block 1102, the first text unit and the second text unit are displayed. At block 1104, in response to a first selection operation for the first text unit, a first side of the first text unit is selected. At block 1106, in response to a first migration operation for the first text unit, text located on the first side in the first text of the first text unit, in an amount corresponding to the operation amount of the first migration operation, is migrated to a second side of the second text unit, wherein the text in both the first text unit and the second text unit after migration maintains the order of the ordered text sequence.

[0187] Figure 12 shows a flowchart of a method for managing an ordered text sequence according to an embodiment of the present disclosure. In this method, the ordered text sequence is divided into a plurality of text units. At block 1202, a first persistent identifier is assigned to a first text unit among the plurality of text units, wherein the first persistent identifier remains unique and unchanged throughout the lifetime of the first text unit. At block 1204, in response to a first operation for the first text unit among the plurality of text units, a first asset corresponding to the first text unit is created. At block 1206, a persistent reference of the first asset to the first text unit is established by associating the first asset with the first persistent identifier.
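
Blocks 1202 to 1206 can be illustrated with the sketch below. It assumes a random UUID as the persistent identifier and a generic asset record; the identifier scheme, the asset fields, and the names `TextUnit`, `Asset`, and `Sequence` are all hypothetical choices made for illustration.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class TextUnit:
    text: str
    # Assigned once at creation and never reassigned afterwards (block 1202).
    unit_id: str = field(default_factory=lambda: uuid.uuid4().hex)


@dataclass
class Asset:
    kind: str     # hypothetical asset type label
    payload: str
    unit_id: str  # persistent reference to the owning text unit (block 1206)


class Sequence:
    def __init__(self, units):
        self.units = list(units)
        self.assets = []

    def create_asset(self, unit: TextUnit, kind: str, payload: str) -> Asset:
        """Create an asset for a unit and bind it to the unit's identifier (block 1204)."""
        asset = Asset(kind, payload, unit.unit_id)
        self.assets.append(asset)
        return asset

    def resolve(self, asset: Asset) -> TextUnit:
        """Find the unit an asset refers to by identifier, not by position."""
        for unit in self.units:
            if unit.unit_id == asset.unit_id:
                return unit
        raise KeyError(asset.unit_id)
```

Because the lookup goes through the identifier, the reference still resolves after the unit's text is edited or after other units are inserted before it, which a position-based reference would not guarantee.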

[0188] Figure 13 shows a block diagram of an apparatus for processing an ordered text sequence according to an embodiment of the present disclosure. The apparatus includes a text display module 1302 configured to display the ordered text sequence. The apparatus includes a control display module 1304 configured to display one or more boundary controls, wherein the one or more boundary controls divide the ordered text sequence into a plurality of text units, the plurality of text units including an adjacent first text unit and second text unit, a first boundary control of the one or more boundary controls being displayed between the first and second text units, the first and second text units including sub-text sequences in the ordered text sequence, the first text unit including first text in the sub-text sequence located on a first side of the first boundary control, and the second text unit including second text in the sub-text sequence located on a second side of the first boundary control. The apparatus includes a text adjustment module 1306 configured to adjust the first text included in the first text unit and the second text included in the second text unit in response to a user operation on the first boundary control.

[0189] Figure 14 shows a block diagram of another apparatus for processing an ordered text sequence according to an embodiment of the present disclosure. In this apparatus, the ordered text sequence is divided into a plurality of text units, the plurality of text units including an adjacent first text unit and second text unit, wherein the first text unit includes first text and the second text unit includes second text. The apparatus includes a unit display module 1402 configured to display the first text unit and the second text unit. The apparatus includes a unit selection module 1404 configured to select a first side of the first text unit in response to a first selection operation for the first text unit. The apparatus includes a text migration module 1406 configured to migrate, in response to a first migration operation for the first text unit, text located on the first side in the first text of the first text unit, in an amount corresponding to the operation amount of the first migration operation, to a second side of the second text unit, wherein the text in both the first text unit and the second text unit after migration maintains the order of the ordered text sequence.

[0190] It should be noted that in the embodiments of this disclosure, the "first selection operation" and the "first migration operation" do not necessarily correspond to two user input actions that are independent of and separated from each other in time. In some implementations, a single continuous user interaction can simultaneously trigger and identify both types of operations. For example, a complete drag operation performed by the user (including the continuous process of pressing the input key, holding it while generating displacement, and releasing it) can be parsed by the system as a composite operation containing multiple logical stages. In the initial stage, the system identifies and completes the side selection of the text unit based on the input starting position or input context, thereby constituting the first selection operation. In the subsequent displacement stage, the system performs the corresponding text migration processing based on the continuously detected operation amount, thereby constituting the first migration operation. In other words, the "first selection operation" and the "first migration operation" are an abstract division of user interaction from the perspective of system function and processing logic, rather than a limitation on the number of physical inputs by the user. Without departing from the technical solution of this disclosure, these two types of operations can be triggered by the same continuous user action, or they can be triggered separately by different forms of user input in other implementations. The term "operation amount" should not be narrowly interpreted as the physical displacement of the mouse. It can be reflected as the distance (in pixels) the cursor moves on the interface, the number of pulses of mouse wheel scrolling, the number of times a specific key on the keyboard is triggered, or even the angle of a knob rotation or the displacement of a slider, as long as the input can be logically quantified by the system and mapped to the granularity of text migration.
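
One way to picture the mapping from an "operation amount" to a migration granularity is the sketch below. The conversion factors are invented purely for illustration, and the names `UNITS_PER_ELEMENT` and `elements_to_migrate` are hypothetical.

```python
# Hypothetical conversion factors: how much raw input corresponds to one
# migrated text element. Real values would be tuned per device and layout.
UNITS_PER_ELEMENT = {
    "pixels": 40.0,
    "wheel_pulses": 1.0,
    "key_presses": 1.0,
    "knob_degrees": 15.0,
}


def elements_to_migrate(kind: str, amount: float, available: int) -> int:
    """Quantize a raw operation amount into a count of text elements.

    `available` caps the result at the number of elements the source unit
    can still give up.
    """
    steps = int(abs(amount) / UNITS_PER_ELEMENT[kind])
    return min(steps, available)
```

For example, under these assumed factors a 95-pixel drag and two key presses would both resolve to two text elements, so downstream migration logic does not need to know which input device produced the amount.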

[0191] Figure 15 shows a block diagram of an apparatus for managing an ordered text sequence according to an embodiment of the present disclosure. In this apparatus, the ordered text sequence is divided into a plurality of text units. The apparatus includes an identifier allocation module 1502 configured to assign a first persistent identifier to a first text unit among the plurality of text units, wherein the first persistent identifier remains unique and unchanged throughout the lifetime of the first text unit. The apparatus includes an asset creation module 1504 configured to create a first asset corresponding to the first text unit in response to a first operation performed on the first text unit among the plurality of text units. The apparatus includes a reference establishment module 1506 configured to establish a persistent reference of the first asset to the first text unit by associating the first asset with the first persistent identifier.

[0192] Figure 16 shows a schematic block diagram of an example device 1600 that can be used to implement embodiments of the present disclosure. As shown, the device 1600 includes a central processing unit (CPU) 1601, which can perform various appropriate actions and processes according to computer program instructions stored in read-only memory (ROM) 1602 or loaded from storage unit 1608 into random access memory (RAM) 1603. Various programs and data required for the operation of the device 1600 may also be stored in RAM 1603. The CPU 1601, ROM 1602, and RAM 1603 are interconnected via bus 1604. An input/output (I/O) interface 1605 is also connected to bus 1604.

[0193] Multiple components in the device 1600 are connected to the I/O interface 1605, including: an input unit 1606, such as a keyboard or mouse; an output unit 1607, such as various types of displays and speakers; a storage unit 1608, such as a magnetic disk or optical disk; and a communication unit 1609, such as a network card, modem, or wireless transceiver. The communication unit 1609 allows the device 1600 to exchange information/data with other devices through computer networks such as the Internet and/or various telecommunications networks.

[0194] In some embodiments, the method for processing ordered text sequences may be implemented as a computer software program tangibly contained in a machine-readable medium, such as storage unit 1608. In some embodiments, part or all of the computer program may be loaded and/or installed on device 1600 via ROM 1602 and/or communication unit 1609. When the computer program is loaded into RAM 1603 and executed by CPU 1601, one or more actions of the methods for processing ordered text sequences described above may be performed.

[0195] This disclosure can be a method, apparatus, system, and/or computer program product. A computer program product may include a computer-readable storage medium having computer-readable program instructions loaded thereon for performing various aspects of this disclosure.

[0196] Computer-readable storage media can be tangible devices capable of holding and storing instructions for use by an instruction execution device. Computer-readable storage media can be, for example (but not limited to), electrical storage devices, magnetic storage devices, optical storage devices, electromagnetic storage devices, semiconductor storage devices, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of computer-readable storage media include: portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), portable compact disc read-only memory (CD-ROM), digital versatile disc (DVD), memory sticks, floppy disks, mechanically encoded devices, such as punch cards or raised structures in a groove having instructions stored thereon, and any suitable combination of the foregoing. The computer-readable storage media used herein are not to be construed as transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through waveguides or other transmission media (e.g., light pulses through fiber optic cables), or electrical signals transmitted through wires.

[0197] The computer-readable program instructions described herein can be downloaded from computer-readable storage media to various computing / processing devices, or downloaded via a network, such as the Internet, local area network, wide area network, and / or wireless network, to an external computer or external storage device. The network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers, and / or edge servers. A network adapter card or network interface in each computing / processing device receives the computer-readable program instructions from the network and forwards them to the computer-readable storage media in the respective computing / processing device.

[0198] The computer program instructions used to perform the operations of this disclosure may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk and C++, and conventional procedural programming languages such as the "C" language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partially on the user's computer, as a standalone software package, partially on the user's computer and partially on a remote computer, or entirely on a remote computer or server. In cases involving a remote computer, the remote computer may be connected to the user's computer via any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (e.g., via the Internet using an Internet service provider). In some embodiments, electronic circuitry, such as programmable logic circuitry, field-programmable gate arrays (FPGAs), or programmable logic arrays (PLAs), is personalized by utilizing state information of the computer-readable program instructions to implement various aspects of this disclosure.

[0199] Various aspects of this disclosure are described herein with reference to flowchart illustrations and / or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of this disclosure. It should be understood that each block of the flowchart illustrations and / or block diagrams, and combinations of blocks in the flowchart illustrations and / or block diagrams, can be implemented by computer-readable program instructions.

[0200] These computer-readable program instructions can be provided to a processing unit of a general-purpose computer, a special-purpose computer, or other programmable data processing apparatus to produce a machine such that, when executed by the processing unit of the computer or other programmable data processing apparatus, they create means for implementing the functions / actions specified in one or more blocks of the flowchart and / or block diagram. These computer-readable program instructions can also be stored in a computer-readable storage medium that causes a computer, programmable data processing apparatus, and / or other device to operate in a particular manner. Thus, the computer-readable medium storing the instructions comprises an article of manufacture that includes instructions for implementing aspects of the functions / actions specified in one or more blocks of the flowchart and / or block diagram.

[0201] The computer-readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other device to cause a series of operational steps to be performed on the computer, other programmable data processing apparatus, or other device to produce a computer-implemented process, such that the instructions executed on the computer, other programmable data processing apparatus, or other device implement the functions / actions specified in one or more blocks of the flowchart and / or block diagram.

[0202] The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, segment, or portion of instructions that contains one or more executable instructions for implementing a specified logical function. In some alternative implementations, the functions noted in the blocks may occur in an order different from that shown in the drawings. For example, two consecutive blocks may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and / or flowcharts, and combinations of blocks in the block diagrams and / or flowcharts, may be implemented using a dedicated hardware-based system that performs the specified function or action, or using a combination of dedicated hardware and computer instructions.

[0203] Various embodiments of this disclosure have been described above. The foregoing description is exemplary and not exhaustive, and is not limited to the disclosed embodiments. Many modifications and variations will be apparent to those skilled in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen to best explain the principles of the embodiments, their practical application, or their technical improvements over technologies found in the marketplace, or to enable others skilled in the art to understand the embodiments disclosed herein.

Claims

1. A method for processing an ordered text sequence, comprising: displaying the ordered text sequence; displaying one or more boundary controls, wherein the one or more boundary controls divide the ordered text sequence into a plurality of text units, the plurality of text units include a first text unit and a second text unit that are adjacent, a first boundary control of the one or more boundary controls is displayed between the first text unit and the second text unit, the first text unit and the second text unit include a sub-text sequence of the ordered text sequence, the first text unit includes first text of the sub-text sequence located on a first side of the first boundary control, and the second text unit includes second text of the sub-text sequence located on a second side of the first boundary control; and in response to a user operation on the first boundary control, adjusting the first text included in the first text unit and the second text included in the second text unit.

2. The method of claim 1, wherein, in response to the user operation being a displacement operation for moving the first boundary control, adjusting the first text included in the first text unit and the second text included in the second text unit comprises: determining text to be adjusted by comparing a spatial position of the first boundary control during the displacement operation with a spatial range of one or more text elements in the first text unit or the second text unit.

3. The method of claim 2, wherein the comparing and the determining of the text to be adjusted are performed iteratively for the duration of the displacement operation, and a triggering condition for each iteration is that a real-time spatial position of a decision point of the first boundary control crosses a preset spatial position associated with one or more text elements to be migrated.

4. The method according to claim 3, wherein the preset spatial position is the geometric center line of the one or more text elements to be migrated.
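
By way of non-limiting illustration, the comparison recited in claims 2 to 4 might be sketched as follows. The sketch assumes a single-line horizontal layout in which the first text unit lies to the left of the boundary control and each text element occupies an x-range. The names `TextElement` and `elements_to_migrate`, and the use of the horizontal midpoint as the geometric center line, are hypothetical choices made for this example and are not taken from the disclosure.

```python
from dataclasses import dataclass


@dataclass
class TextElement:
    text: str
    left: float   # x-coordinate of the element's left edge (assumed layout)
    right: float  # x-coordinate of the element's right edge (assumed layout)

    @property
    def center(self) -> float:
        # Horizontal midpoint, used here as the "geometric center line".
        return (self.left + self.right) / 2.0


def elements_to_migrate(first_unit, second_unit, boundary_x):
    """Compare the boundary's decision point with each element's center line.

    Returns a (direction, elements) pair. An element migrates only once the
    decision point has crossed its center line.
    """
    # Boundary dragged left: elements of the first unit whose center line is
    # now on the right of the decision point belong to the second unit.
    crossed_left = [e for e in first_unit if e.center > boundary_x]
    if crossed_left:
        return "first_to_second", crossed_left
    # Boundary dragged right: the mirror case for the second unit.
    crossed_right = [e for e in second_unit if e.center < boundary_x]
    if crossed_right:
        return "second_to_first", crossed_right
    return "none", []
```

Calling the function on every pointer-move event would correspond to the iterative variant of claim 3, while calling it once with the final position would correspond to the variant of claim 5.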

5. The method of claim 2, wherein the comparing and the determining of the text to be adjusted are performed after the displacement operation has ended, and determining the text to be adjusted comprises: determining the text to be adjusted by comparing an end position of the first boundary control with the spatial range of the one or more text elements.

6. The method according to claim 2, wherein the basic unit of the text to be adjusted is a single character.

7. The method according to claim 2, further comprising: identifying, based on a predetermined pattern, a text fragment constituting a single migration unit, wherein the predetermined pattern includes at least one of a fixed-granularity pattern, an intelligent semantic pattern, and a custom pattern.

8. The method according to claim 2, further comprising: providing at least one visual feedback, wherein the at least one visual feedback includes at least one of the following: a first visual feedback indicating, in real time, the text to be migrated; or a second visual feedback indicating, in real time, a target insertion position.

9. The method of claim 8, wherein a triggering condition for the migration is determined based on a comparison of the spatial position of the first boundary control during the displacement operation with the spatial range of the one or more text elements in the first text unit or the second text unit.

10. The method according to claim 9, further comprising: when the triggering condition for the migration is not met, maintaining the first visual feedback and the second visual feedback while keeping the text content of the first text unit and the second text unit unchanged; and when the triggering condition for the migration is met, performing an atomic migration operation, wherein the atomic migration operation comprises: removing the determined text to be adjusted from the first text unit; and adding the determined text to be adjusted to the second text unit.
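
As a non-limiting sketch of the atomic migration operation of claim 10, the removal and the addition can be computed together and returned as one new pair of texts, so that no intermediate state in which the text exists in neither unit or in both units is ever exposed. The function name `migrate_atomically` and the convention that the migrated text is the tail of the first text unit are assumptions made for this example.

```python
def migrate_atomically(first_text: str, second_text: str,
                       count: int, trigger_met: bool):
    """Move the last `count` characters of first_text to the front of second_text.

    If the trigger condition is not met, both texts are returned unchanged.
    """
    if not trigger_met or count <= 0:
        return first_text, second_text
    count = min(count, len(first_text))
    cut = len(first_text) - count
    moved = first_text[cut:]
    new_first = first_text[:cut]
    new_second = moved + second_text
    # The overall ordered text sequence must be preserved by the migration.
    assert new_first + new_second == first_text + second_text
    return new_first, new_second
```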

11. The method of claim 1, wherein, in response to the user operation being a removal operation for removing the first boundary control, adjusting the first text included in the first text unit and the second text included in the second text unit comprises: removing the first boundary control; and merging the first text unit and the second text unit into a third text unit, wherein the third text unit includes the first text and the second text.

12. The method according to claim 1, further comprising: inserting a second boundary control into the first text, wherein the second boundary control divides the first text unit into a fourth text unit and a fifth text unit, the fourth text unit includes fourth text of the first text located on a first side of the second boundary control, and the fifth text unit includes fifth text of the first text located on a second side of the second boundary control.
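
The removal of a boundary control (claim 11) and the insertion of a boundary control (claim 12) can be illustrated, purely as an assumed model, by representing the ordered text sequence as a list of strings in which a boundary sits between each pair of neighbouring entries. The helper names `merge_units` and `split_unit` are hypothetical.

```python
def merge_units(units: list, boundary_index: int) -> list:
    """Remove the boundary between units[boundary_index] and the next unit."""
    if not 0 <= boundary_index < len(units) - 1:
        raise IndexError("no boundary control at that index")
    merged = units[boundary_index] + units[boundary_index + 1]
    return units[:boundary_index] + [merged] + units[boundary_index + 2:]


def split_unit(units: list, unit_index: int, offset: int) -> list:
    """Insert a boundary inside units[unit_index] at character `offset`."""
    text = units[unit_index]
    if not 0 < offset < len(text):
        raise ValueError("offset must fall strictly inside the text unit")
    return (units[:unit_index]
            + [text[:offset], text[offset:]]
            + units[unit_index + 1:])
```

Both helpers return a new list, and joining the entries of the result always yields the same overall text as joining the entries of the input.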

13. The method of claim 1, wherein the ordered text sequence is a subtitle sequence, and each of the plurality of text units has a corresponding timestamp.

14. The method of claim 13, wherein, in response to the user operation being a displacement operation, the method further comprises: adjusting a first timestamp of the first text unit and a second timestamp of the second text unit, wherein the adjusting uses at least one of a proportional allocation mode, a rate estimation mode, and an automatic alignment mode based on media characteristics.
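
The claims do not define the proportional allocation mode in detail. One plausible reading, offered only as an assumption, is that the time span jointly covered by the two adjacent text units is divided at a point proportional to the number of characters each unit holds after the migration. The function name `reallocate_boundary_time` and its parameters are hypothetical.

```python
def reallocate_boundary_time(start: float, end: float,
                             first_len: int, second_len: int) -> float:
    """Return the time at which the first unit ends and the second begins.

    `start` is the start time of the first unit and `end` the end time of the
    second unit; the lengths are character counts after the migration.
    """
    total = first_len + second_len
    if total == 0:
        # Nothing to weigh by; fall back to the midpoint (assumed behaviour).
        return (start + end) / 2.0
    return start + (end - start) * first_len / total
```

For example, with a joint span from 0.0 s to 10.0 s and character counts of 3 and 7, the shared boundary time would be placed at 3.0 s.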

15. The method according to claim 1, further comprising: after the adjusting is completed, invoking a language-aware boundary formatting module to format the adjusted text according to boundary rules of the current text language of the ordered text sequence, wherein the formatting includes automatically adjusting at least one of the following: word spacing, punctuation marks, and spacing between mixed Chinese and Western text.
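
As one narrow, non-limiting example of the formatting named in claim 15, the spacing between mixed Chinese and Western text could be normalized with two regular-expression substitutions. The sketch assumes that "Chinese" characters are those in the CJK Unified Ideographs block (U+4E00 to U+9FFF) and that "Western" characters are ASCII letters and digits; the function name `space_mixed_text` is hypothetical, and a real formatting module would likely cover more scripts and punctuation rules.

```python
import re

# Assumed character classes for this sketch only.
_CJK = r"\u4e00-\u9fff"
_CJK_THEN_WESTERN = re.compile(rf"([{_CJK}])([A-Za-z0-9])")
_WESTERN_THEN_CJK = re.compile(rf"([A-Za-z0-9])([{_CJK}])")


def space_mixed_text(text: str) -> str:
    """Insert one space wherever a CJK ideograph directly touches ASCII text."""
    text = _CJK_THEN_WESTERN.sub(r"\1 \2", text)
    text = _WESTERN_THEN_CJK.sub(r"\1 \2", text)
    return text
```

Text that already contains a space at such a junction is left unchanged, because the patterns only match directly adjacent characters.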

16. A method for processing an ordered text sequence, wherein the ordered text sequence is divided into a plurality of text units, the plurality of text units include a first text unit and a second text unit that are adjacent, the first text unit includes first text, and the second text unit includes second text, the method comprising: displaying the first text unit and the second text unit; in response to a first selection operation on the first text unit, selecting a first side of the first text unit; and in response to a first migration operation for the first text unit, migrating text in the first text of the first text unit that is on the first side and corresponds to an operation amount of the first migration operation to a second side of the second text unit, wherein, after the migration, the text in both the first text unit and the second text unit maintains the order of the ordered text sequence.

17. The method of claim 16, further comprising: in response to a second migration operation for the first text unit, migrating text in the second text of the second text unit that is on the second side and corresponds to an operation amount of the second migration operation to the first side of the first text unit.

18. The method of claim 16, further comprising: in response to a second selection operation for the second text unit, selecting the second side of the second text unit; and in response to a third migration operation for the second text unit, migrating text in the second text of the second text unit that is on the second side and corresponds to an operation amount of the third migration operation to the first side of the first text unit.

19. The method of claim 18, further comprising: in response to a fourth migration operation for the second text unit, migrating text in the first text of the first text unit that is on the first side and corresponds to an operation amount of the fourth migration operation to the second side of the second text unit.

20. A method for managing an ordered text sequence, the ordered text sequence being divided into a plurality of text units, the method comprising: assigning a first persistent identity identifier to a first text unit among the plurality of text units, wherein the first persistent identity identifier remains unique and unchanged throughout the lifetime of the first text unit; in response to a first operation on the first text unit, creating a first asset corresponding to the first text unit; and establishing a persistent reference of the first asset to the first text unit by associating the first asset with the first persistent identity identifier.

21. The method of claim 20, further comprising: creating, in a snapshot record set, a first snapshot record corresponding to the first persistent identity identifier of the first text unit, wherein the snapshot record set includes a plurality of snapshot records associated with the plurality of text units; in response to the first operation, creating a corresponding first snapshot version in the first snapshot record; and in response to a second operation on the first text unit, creating, in the first snapshot record, a corresponding second snapshot version and a reference relationship from the first snapshot version to the second snapshot version.

22. The method of claim 21, further comprising: after generating the second snapshot version, updating a real-time state index so that the reference in the real-time state index corresponding to the first persistent identity identifier points to the second snapshot version; and displaying the content of the first text unit on a user interface based on the real-time state index.
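
The data model implied by claims 20 to 22 might be sketched, under stated assumptions, as an in-memory store keyed by a persistent identity identifier. In this sketch the identifier is a random UUID, the snapshot record is a list of versions, and the real-time state index maps each identifier to its current version number. The class and attribute names (`TextStore`, `SnapshotVersion`, `live_index`, and so on) are hypothetical, and creating version 0 at unit creation is a simplification of the claimed sequence of operations.

```python
import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass
class SnapshotVersion:
    version: int
    text: str
    next_version: Optional[int] = None  # reference to the succeeding version


class TextStore:
    def __init__(self):
        self.snapshots = {}   # persistent id -> list of SnapshotVersion
        self.live_index = {}  # persistent id -> current version number
        self.assets = {}      # persistent id -> list of assets (e.g. notes)

    def create_unit(self, text: str) -> str:
        pid = uuid.uuid4().hex  # never changes for the lifetime of the unit
        self.snapshots[pid] = [SnapshotVersion(0, text)]
        self.live_index[pid] = 0
        self.assets[pid] = []
        return pid

    def attach_asset(self, pid: str, asset: str) -> None:
        # The asset refers to the identifier, not to the unit's position or text.
        self.assets[pid].append(asset)

    def edit(self, pid: str, new_text: str) -> int:
        record = self.snapshots[pid]
        new = SnapshotVersion(len(record), new_text)
        record[-1].next_version = new.version  # link old version to new one
        record.append(new)
        self.live_index[pid] = new.version     # index now points at newest
        return new.version

    def current_text(self, pid: str) -> str:
        return self.snapshots[pid][self.live_index[pid]].text
```

Because assets are keyed by the identifier rather than by position or content, a note attached before an edit remains attached after the text changes.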

23. The method of claim 21, further comprising: in response to a splitting operation on the first text unit, splitting the first text unit into a third text unit and a fourth text unit; creating a third persistent identity identifier and a corresponding third snapshot record for the third text unit, and creating a fourth persistent identity identifier and a corresponding fourth snapshot record for the fourth text unit; and generating, in the first snapshot record, a final state snapshot associated with the first persistent identity identifier, wherein the final state snapshot indicates that the first snapshot record has been split into the third snapshot record and the fourth snapshot record.

24. The method of claim 23, further comprising: in response to the splitting operation on the first text unit, displaying an inheritance control on a user interface, wherein the inheritance control is configured to prompt the user to select an inheritance method of the first asset between the third text unit and the fourth text unit after the splitting operation.

25. The method of claim 24, wherein the inheritance method includes at least one of the following: associating the first asset with the third text unit; associating the first asset with the fourth text unit; or associating the first asset with each of the third text unit and the fourth text unit.

26. The method of claim 23, further comprising: in response to the splitting operation on the first text unit, determining a keyword of the first asset; and determining, based on the similarity between the keyword and each of the text in the third text unit and the text in the fourth text unit, an inheritance method of the first asset between the third text unit and the fourth text unit.
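
Claim 26 does not specify how the similarity is measured. A deliberately simple stand-in, assumed only for illustration, counts how many of the asset's keywords occur in each of the two text units and assigns the asset to the unit with the higher count, falling back to both units on a tie. The function name `choose_inheritance` and the returned labels are hypothetical.

```python
def choose_inheritance(keywords, third_text: str, fourth_text: str) -> str:
    """Pick where an asset goes after a split: 'third', 'fourth', or 'both'."""
    def score(text: str) -> int:
        lowered = text.lower()
        # Case-insensitive substring match stands in for a similarity measure.
        return sum(1 for k in keywords if k.lower() in lowered)

    third_score, fourth_score = score(third_text), score(fourth_text)
    if third_score > fourth_score:
        return "third"
    if fourth_score > third_score:
        return "fourth"
    return "both"
```

The three possible results correspond to the three inheritance options listed in claim 25.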

27. The method according to any one of claims 20-26, further comprising: aggregating a plurality of atomic operation instructions targeting the plurality of text units into an edit intent unit; and submitting the edit intent unit as an indivisible transaction.
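
A minimal sketch of submitting an edit intent unit as an indivisible transaction, assuming the text units are held in a dictionary keyed by identifier, is to apply every atomic operation to a working copy and adopt the copy only if all operations succeed. The names `commit_edit_intent` and `move_tail` are hypothetical and serve only this example.

```python
from typing import Callable, Dict, List


def move_tail(src: str, dst: str, count: int) -> Callable[[Dict[str, str]], None]:
    """Build an atomic operation moving the last `count` chars of src to dst."""
    def op(state: Dict[str, str]) -> None:
        if count > len(state[src]):
            raise ValueError("not enough text to move")
        cut = len(state[src]) - count
        moved = state[src][cut:]
        state[src] = state[src][:cut]
        state[dst] = moved + state[dst]
    return op


def commit_edit_intent(units: Dict[str, str],
                       operations: List[Callable[[Dict[str, str]], None]]
                       ) -> Dict[str, str]:
    """Apply all operations or none: the input mapping is never modified."""
    working = dict(units)  # shallow copy is enough because values are strings
    for op in operations:
        op(working)        # any exception aborts before the copy is returned
    return working
```

If any operation raises, the caller still holds the original mapping, so a partially applied edit is never observed.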

28. The method according to any one of claims 20-26, wherein the ordered text sequence is a subtitle sequence, each of the plurality of text units has a corresponding timestamp, and the first asset includes at least one of the following: notes, favorites, highlighted annotations, tags, analysis, or comments.

29. An apparatus for processing an ordered text sequence, comprising: a text display module configured to display the ordered text sequence; a control display module configured to display one or more boundary controls, wherein the one or more boundary controls divide the ordered text sequence into a plurality of text units, the plurality of text units include a first text unit and a second text unit that are adjacent, a first boundary control of the one or more boundary controls is displayed between the first text unit and the second text unit, the first text unit and the second text unit include a sub-text sequence of the ordered text sequence, the first text unit includes first text of the sub-text sequence located on a first side of the first boundary control, and the second text unit includes second text of the sub-text sequence located on a second side of the first boundary control; and a text adjustment module configured to adjust the first text included in the first text unit and the second text included in the second text unit in response to a user operation on the first boundary control.

30. An apparatus for processing an ordered text sequence, wherein the ordered text sequence is divided into a plurality of text units, the plurality of text units include a first text unit and a second text unit that are adjacent, the first text unit includes first text, and the second text unit includes second text, the apparatus comprising: a unit display module configured to display the first text unit and the second text unit; a unit selection module configured to select a first side of the first text unit in response to a first selection operation for the first text unit; and a text migration module configured to, in response to a first migration operation for the first text unit, migrate text in the first text of the first text unit that is on the first side and corresponds to an operation amount of the first migration operation to a second side of the second text unit, wherein, after the migration, the text in both the first text unit and the second text unit maintains the order of the ordered text sequence.

31. An apparatus for managing an ordered text sequence, the ordered text sequence being divided into a plurality of text units, the apparatus comprising: an identifier allocation module configured to assign a first persistent identity identifier to a first text unit among the plurality of text units, wherein the first persistent identity identifier remains unique and unchanged throughout the lifetime of the first text unit; an asset creation module configured to create a first asset corresponding to the first text unit in response to a first operation on the first text unit; and a reference establishment module configured to establish a persistent reference of the first asset to the first text unit by associating the first asset with the first persistent identity identifier.

32. An electronic device, comprising: at least one processor; and a storage device for storing at least one program which, when executed by the at least one processor, causes the at least one processor to implement the method according to any one of claims 1-28.

33. A computer-readable storage medium having a computer program stored thereon, the computer program implementing the method according to any one of claims 1-28 when executed by a processor.

34. A computer program product comprising a computer program that, when executed by a processor, implements the method according to any one of claims 1-28.