Dynamic layout optimization method for internet accessibility reading

By acquiring the semantic tension feature sequence of the text sequence, the text layout is optimized to avoid semantic fragmentation, which solves the cognitive load problem caused by physical size cutting in digital reading and achieves barrier-free and smooth reading and visual continuity.

CN122242443A · Pending · Publication Date: 2026-06-19 · COMMUNICATION UNIVERSITY OF CHINA

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Application (China)
Current Assignee / Owner: COMMUNICATION UNIVERSITY OF CHINA
Filing Date: 2026-03-20
Publication Date: 2026-06-19

AI Technical Summary

Technical Problem

Existing digital reading technologies rely on physical dimensions for text layout, which forcibly cuts highly cohesive semantic blocks in long and complex sentences. This causes cognitive load to accumulate for users, reducing reading efficiency and inducing visual fatigue.

Method used

By acquiring the semantic tension feature sequence of the text sequence, and combining virtual space accumulation and semantic parsing, the text layout is optimized to avoid semantic fragmentation. A one-dimensional nonlinear space compensation mechanism and word weight compensation parameters are adopted to ensure that the text line breaks are aligned with human cognitive units.

Benefits of technology

It enables barrier-free and smooth reading, reduces cognitive load, maintains the aesthetic appeal and visual continuity of the layout, and avoids visual fatigue and decreased reading efficiency.


[Figure CN122242443A_ABST]

Abstract

This invention discloses a dynamic typesetting optimization method for accessible reading on the Internet, relating to the field of computer graphics and text information processing technology. The method includes: obtaining the basic physical rendering dimensions of the target text sequence and extracting tension feature values through semantic parsing to construct a semantic tension feature sequence; determining the first truncation anchor point based on virtual dimension accumulation; if a collision condition is met, determining the second truncation anchor point by comprehensively considering width and tension within the candidate backtracking interval, and obtaining the physical white space margin of the target truncated line; allocating margin to specific gaps based on the tension sequence to obtain one-dimensional spatial compensation parameters, and converting the overflow residue into lateral deformation parameters of text primitives and orthogonally coupled character weight compensation parameters; finally, recombining the output rendering object. This invention eliminates semantic fragmentation and visual discontinuity, significantly reducing the cognitive load of reading.

Description

Technical Field

[0001] This invention relates to the field of computer graphic information processing technology, specifically to a dynamic typesetting optimization method for accessible reading on the Internet.

Background Technology

[0002] With the popularization of mobile internet and smart terminals, digital reading has become the main way for people to obtain information. In digital reading scenarios, front-end rendering engines need to dynamically arrange a large amount of text streams to adapt to screen display ranges of different sizes, thereby providing users with a smooth visual experience.

[0003] Currently, the mainstream text typesetting and rendering mechanisms in the industry generally rely on the physical size of the display container for automatic line wrapping. During operation, the typesetting engine will accumulate the physical pixel width of each character. Once the total accumulated width touches the physical boundary of the screen or display container, the system will trigger a line wrapping operation.

[0004] However, human language possesses strong inherent logic and semantic cohesion. Many phrase structures in long and complex sentences are indivisible minimum cognitive units in the brain's information processing. Under normal continuous reading conditions, physical line breaks can easily forcibly sever these highly cohesive semantic blocks in the middle. This uncontrollable physical fragmentation forces users to consume additional working memory to recall and reassemble the incomplete semantics when performing cross-line visual scanning, thus subconsciously triggering frequent micro-cognitive pauses. As reading time increases, this micro-level cognitive load accumulates exponentially, eventually leading to severe visual fatigue, line skipping errors, and a significant decrease in reading efficiency.

[0005] Therefore, how to break through the underlying limitation of using physical size as the standard for line breaks and eliminate the problem of cognitive load accumulation caused by typesetting fragmentation in conventional reading scenarios has become a technical bottleneck that urgently needs to be overcome in this field.

Summary of the Invention

[0006] To address the shortcomings of existing technologies, this invention provides a dynamic typesetting optimization method for accessible reading on the Internet.

[0007] To achieve the above objectives, the technical solution of the present invention is as follows:

[0008] In a first aspect, this invention discloses a dynamic typesetting optimization method for accessible reading on the Internet, comprising the following steps:

[0009] Obtain the target text sequence, which consists of multiple text primitives and the primitive gaps between adjacent text primitives, and obtain the basic physical rendering dimension of each text primitive;

[0010] Semantic parsing is performed on the target text sequence to extract tension feature values that characterize the degree of semantic cohesion of adjacent text primitives, so as to construct a semantic tension feature sequence corresponding to the gaps between each primitive;

[0011] Based on the basic physical rendering dimensions, the target text sequence is sequentially accumulated in virtual space to obtain the accumulated width, and the first truncation anchor point that satisfies the preset out-of-bounds condition is determined.

[0012] When the tension feature value corresponding to the first truncation anchor point meets the preset collision condition, the candidate backtracking interval is determined in the historical text primitives before the first truncation anchor point.

[0013] By combining the cumulative width and tension feature values of each node in the candidate backtracking interval, the second truncation anchor point is determined. The part of the target text sequence ending at the second truncation anchor point is divided into target truncation lines, and the physical white space of the target truncation lines is obtained.

[0014] Based on the semantic tension feature sequence, the physical white space is allocated to the gaps of primitives in the target truncation line that satisfy the first tension condition, and the one-dimensional space compensation parameter and the overflow space residual amount are obtained.

[0015] When the residual amount of overflow space is greater than zero, the residual amount of overflow space is converted into the lateral deformation parameter of the text primitive that satisfies the second tension condition in the target truncated line, and the word weight compensation parameter orthogonally coupled with the lateral deformation parameter is generated.

[0016] Based on the second truncation anchor point, one-dimensional space compensation parameters, lateral deformation parameters, and word weight compensation parameters, the target text sequence is reorganized and output as a rendering object.

[0017] In a second aspect, this invention discloses a dynamic typesetting optimization system for accessible reading on the Internet, comprising:

[0018] The data acquisition module is used to acquire the target text sequence, which consists of multiple text primitives and the gaps between adjacent text primitives; and to acquire the basic physical rendering dimensions of each text primitive.

[0019] The semantic parsing module is used to perform semantic parsing on the target text sequence and extract tension feature values that characterize the degree of semantic cohesion of adjacent text primitives, so as to construct a semantic tension feature sequence corresponding to the gaps between each primitive.

[0020] The virtual accumulation module is used to perform virtual space accumulation on the target text sequence sequentially based on the basic physical rendering dimension to obtain the accumulation width, and determine the first truncation anchor point where the accumulation width meets the preset out-of-bounds condition;

[0021] The backtracking interval determination module is used to determine candidate backtracking intervals in the historical text primitives before the first truncation anchor point when the tension feature value corresponding to the first truncation anchor point meets the preset collision conditions.

[0022] The truncation line segmentation module is used to comprehensively analyze the cumulative width and tension feature values of each node in the candidate backtracking interval, determine the second truncation anchor point, divide the part of the target text sequence ending at the second truncation anchor point into target truncation lines, and obtain the physical white space of the target truncation lines.

[0023] The spatial compensation allocation module is used to allocate the physical white space margin to the gaps between primitives that meet the first tension condition within the target truncation line based on the semantic tension feature sequence, thereby obtaining a one-dimensional spatial compensation parameter and the overflow space residual amount.

[0024] The deformation coupling compensation module is used to convert the residual amount of the overflow space into the lateral deformation parameters of the text primitives that satisfy the second tension condition within the target truncated line when the residual amount of the overflow space is greater than zero, and to generate the word weight compensation parameters that are orthogonally coupled with the lateral deformation parameters.

[0025] The rendering output module is used to reconstruct and output the rendering object of the target text sequence based on the second truncation anchor point, one-dimensional space compensation parameters, lateral deformation parameters, and word weight compensation parameters.

[0026] Compared with the prior art, the beneficial effects of the present invention are as follows:

[0027] 1. The semantic tension feature sequence of the target text sequence is acquired, and when the physical boundary reached by virtual space accumulation (the first truncation anchor point) satisfies the collision condition, the optimal second truncation anchor point is determined by combining spatial white space and semantic tension within the candidate backtracking interval. This scheme breaks the limitations of purely geometrically driven typesetting, aligning the physical line break boundary of the text closely with the natural cognitive units of the human brain and avoiding the forced separation of core semantics, thereby fundamentally reducing the cognitive load of reading long digital texts and achieving barrier-free and fluent reading;

[0028] 2. A one-dimensional nonlinear spatial compensation mechanism based on semantic tension is proposed. The system accurately and proportionally allocates the physical white space generated by line breaks to the gaps between primitives in the target truncation line that meet the first tension condition (i.e., weak semantic association and low tension), and is subject to strict constraints of the maximum stretchable absolute physical limit value. This mechanism ensures that white space is preferentially allocated between punctuation marks or independent phrases, and will never break up highly cohesive proper nouns. It ensures the alignment of the right edge, maintains the continuity of visual scanning, and eliminates the layout river effect that destroys the aesthetics of the page.

[0029] 3. The system transforms excess white space into lateral deformation parameters for high-tension core words. While widening the characters, it generates orthogonally coupled character weight (stroke density) compensation parameters based on the Weber-Fechner nonlinear perception law. This not only perfectly absorbs the extremely difficult-to-digest excess white space at the three-dimensional physical level, but also ensures the absolute constancy of the local optical grayscale of the page after the characters are widened, avoiding visual noise and focusing interference, while achieving a subconscious typographical effect that emphasizes the core semantics.

Attached Figure Description

[0030] To more clearly illustrate the technical solutions in the embodiments of the present invention or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0031] Figure 1 is an overall block diagram of the method in Embodiment 1 of the present invention;

[0032] Figure 2 is a schematic diagram illustrating the principle of dynamic layout truncation and multi-dimensional parameter compensation in Embodiment 1 of the present invention;

[0033] Figure 3 is an overall block diagram of the system in Embodiment 2 of the present invention.

Detailed Implementation

[0034] The technical solution of the present invention will be clearly and completely described below with reference to the embodiments. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.

[0035] In the traditional field of digital reading, especially in the process of acquiring long text information on mobile devices, the fluency and immersion of reading are regarded as key indicators for measuring the quality of typesetting. The generation of this high-quality reading experience is essentially a process of efficient information decoding and reorganization at the cognitive psychology level. That is, through continuous and rhythmic visual saccades of the eyes, the retina is used as an information collector to transmit the text stimuli on the physical page to the semantic center of the brain without loss, thereby forming a coherent flow of understanding at the cognitive level.

[0036] However, existing typesetting technologies lack a mechanism to verify the consistency between the physical rendering boundaries of the front end and the semantic cohesion of the underlying text. This results in an inability to accurately identify the problems of "forced semantic separation" and "disorientation of visual coordinates" encountered by users during continuous scanning. Forced semantic separation manifests as the typesetting engine achieving physical alignment of container edges, but actually relying on the mechanical cutting of tightly packed phrases or core components of long and complex sentences to meet spatial constraints. Disorientation of visual coordinates manifests as the system using a globally uniform stretching alignment strategy to compensate for line endings, resulting in abnormally amplified spacing between low-related words, forming visual faults (typesetting rivers) that disrupt the continuity of retinal saccades. As a result, a strict mapping relationship cannot be established between the physical presentation of text and the cognitive rhythm of the brain, causing the system to falsely reflect the typesetting quality, thereby affecting the effective control of the user's cognitive load and the achievement of an accessible reading experience.

[0037] For example, in deep reading scenarios on smartphones, when users read long articles, the rendering engine can only ensure that the physical pixels of a single line of text do not exceed the boundaries through the underlying graphics interface, but it cannot distinguish whether the line break point is accompanied by a destructive cut of the core attributive phrase. Furthermore, when the system blindly distributes the inline white space to enforce justified (two-sided) alignment, it only presents a neat appearance and fails to detect the sudden change in saccade distance and the sharp drop in visual focusing efficiency when the user's gaze crosses these abnormally wide spacings. Specifically, the system misjudges semantically disruptive mechanical line breaks as standard typesetting and displays them directly on the screen, or incorrectly classifies the excessive stretching of character spacing as a normal cost of typesetting optimization. As a result, the user's brain is constantly in a high-energy-consuming visual splicing mode, and cannot form an immersive reading state that conforms to the laws of cognitive science.

[0038] If the above problems are not resolved, the front-end rendering system will continue to lose its ability to objectively adjust for cognitive friendliness. Specifically, the failure to recognize semantic fragmentation will cause the brain to over-rely on prefrontal working memory to reconstruct the information, leading to a deviation of the cognitive resource transmission path from the principle of fluent decoding. This weakens the depth of understanding of long texts and induces implicit micro-pauses. Simultaneously, the failure to correct visual coordinate loss will exacerbate the internal friction between the ciliary and extraocular muscles, making it impossible for the gaze trajectory to exhibit stable inertial characteristics, ultimately causing the reading process to lose its proper rhythm. Therefore, the inaccuracy of mechanical typesetting will systematically hinder users' core desire to escape visual and mental fatigue, seriously affecting the ultimate realization of the original intention of accessible reading on the internet.

[0039] Embodiment 1:

[0040] As shown in Figures 1-2, the dynamic typesetting optimization method for accessible reading on the Internet includes the following steps:

[0041] Step S1: Obtain the target text sequence, which consists of multiple text primitives and the primitive gaps between adjacent text primitives, and obtain the basic physical rendering dimension of each text primitive;

[0042] To facilitate understanding of the implementation logic of this invention, we will break down the specific execution process of step S1 in this embodiment using an example of an elderly user with easily fatigued eyes browsing an in-depth article on "cutting-edge artificial intelligence technology" on a smartphone (assuming the screen's physical viewport width is 1080 pixels) in reading mode with the built-in browser. The moment the user clicks to enter reading mode, the system first needs to acquire the target text sequence. This sequence is not a string of concatenated chaotic characters, but a structured data stream composed of multiple text primitives and the gaps between adjacent text primitives.

[0043] Specifically, to extract clean reading content from the complex webpage environment, the system's underlying Document Object Model (DOM) parser first traverses the node tree of the current webpage, filtering out all redundant non-text nodes (such as script tags, style sheets, nested ad image frames, etc.), thereby extracting a clean, raw plain text stream. It is worth noting that because human language has discrete lexical features, the system does not treat the entire text as a whole for typesetting calculations. Instead, a lightweight front-end word segmentation engine performs atomic segmentation on the raw text stream, deconstructing it into a sequence of text primitives. In practical applications, the deconstruction logic of this word segmentation engine varies depending on the typesetting conventions of the language: for text containing Chinese, Japanese, and Korean (CJK) characters, the engine typically uses a built-in trie or maximum matching algorithm to identify semantically independent phrases or single Chinese characters as independent text primitives; while for Indo-European language texts such as English, it directly relies on spaces or hyphens as natural segmentation boundaries to extract words as text primitives.

[0044] During this deconstruction process, the topologically isolated region that inevitably exists between two adjacent text primitives is defined by the system as a primitive gap. For example, in a mixed Chinese-English sentence translated as "Deep Learning Technology," there is an implicit Chinese character gap, whose default width is very small or zero, between the Chinese text primitives for "deep" and "learning," while the English primitives "Deep" and "Learning" are separated by a primitive gap consisting of an explicit space. Through the above atomic segmentation, the system constructs an ordered text primitive sequence in memory, denoted as a sequence containing n elements, $S = \{c_1, c_2, \dots, c_n\}$.
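The atomic segmentation described above can be sketched roughly as follows. This is a minimal illustration only, not the patent's segmentation engine: it assumes every CJK ideograph is its own primitive (the trie or maximum matching step that would group characters into phrases is omitted), and the function name `split_primitives` and the sample string are hypothetical.

```python
import re
from typing import List, Tuple

# One CJK ideograph, or a run of Latin letters/digits, or any other single
# non-space symbol (punctuation, hyphen). Assumption: no phrase-level grouping.
_PRIMITIVE = re.compile(r"[\u4e00-\u9fff]|[A-Za-z0-9]+|[^\sA-Za-z0-9\u4e00-\u9fff]")


def split_primitives(text: str) -> Tuple[List[str], List[str]]:
    """Return (primitives, gaps); gaps[i] is the literal text between
    primitive i and primitive i + 1."""
    primitives: List[str] = []
    gaps: List[str] = []
    prev_end = None
    for match in _PRIMITIVE.finditer(text):
        if prev_end is not None:
            # "" is an implicit (zero-width) gap, " " is an explicit space.
            gaps.append(text[prev_end:match.start()])
        primitives.append(match.group())
        prev_end = match.end()
    return primitives, gaps


# Hypothetical mixed Chinese-English sample.
primitives, gaps = split_primitives("深度学习 Deep Learning 技术")
```

With this sample, adjacent Chinese characters are separated by empty (implicit) gaps, while "Deep" and "Learning" are separated by an explicit space, and there is always exactly one gap fewer than there are primitives.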

[0045] After completing the logical deconstruction of the text content, the system then needs to obtain the basic physical rendering dimensions of each text primitive, which is the absolute physical scale for mapping abstract text to the real device screen. To obtain this key data, the system must not directly trigger real pixel redrawing on the screen (which would cause serious stuttering and flickering), but instead needs to call the underlying graphics rendering pipeline interface of the terminal system at the virtual memory level. Specifically, in this embodiment, the system calls the measureText() application programming interface of the Canvas 2D rendering context in the HTML5 standard, or directly reads the font metric matrix (Font Metrics) pre-loaded at the underlying layer of the operating system.

[0046] The reason for using the underlying graphics interface for pre-check is that the physical pixel area occupied by the same text primitive varies significantly under different device resolutions (DPI) and different global typesetting parameters. The system will capture the basic typesetting parameter set in the current reading environment in real time, which strictly includes the reference font size (e.g., 18px), the reference font family (e.g., the system default sans-serif font PingFang SC), and the basic letter spacing that the user may preset. Subsequently, the system will send each text primitive ci in the sequence S together with the default primitive gap attached to it into the above-mentioned underlying graphics interface one by one for high-precision geometric bounding box measurement.

[0047] For example, when the reference font size is set to 18px, through interface measurement, the Chinese character primitive "深" may return a square reference physical width of exactly 18 pixels, while the English letter combination primitive "Deep" may return a width of 34.5 pixels, accurate to the decimal point, based on the kerning rules of its specific font. The system performs the above precise measurement on each element in the sequence, thus transforming the logical text primitive sequence S into a one-dimensional array of physical widths $W = \{w_1, w_2, \dots, w_n\}$ that corresponds to it index by index, where $w_i$ represents the absolute horizontal pixel span of the i-th text primitive. Based on this, the system completes the conversion from a character stream without specific dimensions to a rendering data stream carrying an absolute physical scale, laying a data foundation for subsequent processing.
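As a rough sketch of this step, the width array can be built by passing each primitive through a measurement callback. The callback here is a stand-in, not a real graphics API: in a browser its role would be played by the Canvas 2D `measureText()` call named in the text. `build_width_array`, `fake_measure`, `SAMPLE_WIDTHS`, and the one-em fallback are all assumptions made for illustration; only the two sample widths (18 px and 34.5 px) come from the description.

```python
from typing import Callable, Dict, List


def build_width_array(primitives: List[str],
                      measure: Callable[[str], float]) -> List[float]:
    """Map each primitive to its measured width, preserving index order."""
    return [measure(p) for p in primitives]


# The two sample measurements quoted in the text for an 18px reference font.
SAMPLE_WIDTHS: Dict[str, float] = {"深": 18.0, "Deep": 34.5}


def fake_measure(primitive: str) -> float:
    # Assumption: anything not in the sample table is one em (18 px) per character.
    return SAMPLE_WIDTHS.get(primitive, 18.0 * len(primitive))


widths = build_width_array(["深", "度", "Deep"], fake_measure)
```

Keeping the measurement behind a callback means the same accumulation logic can be exercised off-screen with a stub, which matches the text's point that no real redraw should be triggered.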

[0048] Step S2: Perform semantic parsing on the target text sequence, extract the tension eigenvalue characterizing the semantic cohesion degree of adjacent text primitives, and form a semantic tension feature sequence corresponding to each primitive gap;

[0049] After measuring the physical dimensions of text primitives, relying solely on these geometric data for typesetting will still lead the system into the traps of traditional typesetting. To eliminate cognitive micro-pauses caused by typesetting fragmenting highly cohesive semantic blocks, the system must understand the inherent linguistic logic of the text. Therefore, in this embodiment, the system needs to further perform deep semantic parsing on the target text sequence to extract tension feature values that characterize the degree of semantic cohesion between adjacent text primitives, and use these to construct a semantic tension feature sequence that corresponds one-to-one with the gaps between each primitive.

[0050] To achieve this cross-modal feature extraction, the system integrates and configures a pre-defined dependency parsing model. In actual engineering deployment, this model is not a simple mapping tool, but rather employs a deep neural network architecture combining a Bi-LSTM or Transformer encoder with a Biaffine Attention mechanism. This specific network architecture is chosen because long-distance contextual dependencies are common in natural language text (e.g., subjects and predicates separated by a dozen characters). The Transformer or Bi-LSTM encoder layer can fully extract the global semantic feature representation of the entire sentence; while the subsequent Biaffine Attention layer is specifically used to cross-score the probability of dependency edges between any two text primitives and to perform multi-classification of dependency relationship types. In operation, the system uses the target text sequence obtained in step S1 as the model's input sequence. After forward propagation parsing by the deep network, the model generates a directed acyclic syntax tree containing rich linguistic information at the output. In this syntax tree, the nodes of the graph represent individual text primitives and are accompanied by accurate part-of-speech tags, while the edges of the graph represent the dependency relationships between primitives.

[0051] To ensure the feasibility of the aforementioned dependency parsing model in practical engineering, this embodiment describes the construction and training process of the preset model. Specifically, the input layer of the model is first configured with a word embedding matrix derived from a pre-trained language model or word embedding model (such as BERT or Word2Vec), used to transform discrete text primitives into dense feature vectors of fixed dimensions (such as 768 dimensions). During the model training phase, the model uses an open-source Chinese dependency treebank (such as Chinese Treebank, CTB) as the training sample set. During training, the feature vector sequence is used as input, and the head node prediction probability and dependency label classification probability output by the Biaffine Attention layer are used as prediction results. The system uses the cross-entropy loss function to calculate the structural prediction loss and label classification loss respectively, and iteratively updates the network weights of the Bi-LSTM and the Biaffine Attention layer through backpropagation with an optimizer such as Adam. After a finite number of iterations, once the loss function converges, the model parameters are frozen, thereby ensuring that the model can stably and accurately map the input text sequence into a directed acyclic syntax tree.

[0052] After successfully constructing a high-dimensional linguistic topology, the system begins to reduce the dimensionality of macroscopic syntactic relations into microscopic typographic constraints. Specifically, the system iterates through each pair of adjacent text primitives in the target text sequence in its underlying logic. It's important to note that "adjacent" here refers to linear adjacency in the physical text sequence, but they may be far apart in deep grammatical logic. For each pair of adjacent text primitives, the system uses a breadth-first search (BFS) algorithm in the generated directed acyclic syntactic tree to find the shortest connected topological path between the two primitive nodes and extracts the number of edges in this shortest connected topological path, defining it as the path distance feature. In cognitive linguistics, this path distance feature has a clear physical meaning: the shorter the topological distance between two words in the syntactic tree, the higher the requirement for coherence in the brain's thinking when parsing them; once physically truncated by line breaks in typography, the resulting loss in reading comprehension is more severe. In actual graph structure traversal, if two adjacent text primitives do not have any connected path in the syntax tree due to crossing independent sentence boundaries or syntactic parsing anomalies, the system will directly mark the path distance feature of the pair of primitives as the system's maximum preset constant (or regard it as mathematical infinity), so that the distance inverse decay term generated in the subsequent tension calculation formula is safely reduced to zero, which is consistent with the physical fact that it has no semantic stickiness.
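The path distance extraction can be illustrated with a small breadth-first search. This sketch assumes the parser output is available as a head-index list (one head per primitive, `None` for a root) and treats dependency edges as undirected when searching for a connecting path; both are assumptions, as are the names `build_adjacency` and `path_distance` and the toy sentence. Unreachable pairs return `math.inf`, which is one of the two options the text mentions.

```python
import math
from collections import deque
from typing import Dict, List, Optional


def build_adjacency(heads: List[Optional[int]]) -> Dict[int, List[int]]:
    """heads[i] is the index of primitive i's head, or None for a root."""
    adj: Dict[int, List[int]] = {i: [] for i in range(len(heads))}
    for child, head in enumerate(heads):
        if head is not None:
            adj[child].append(head)
            adj[head].append(child)
    return adj


def path_distance(adj: Dict[int, List[int]], src: int, dst: int) -> float:
    """Number of edges on the shortest path, or math.inf if not connected."""
    if src == dst:
        return 0
    seen = {src}
    queue = deque([(src, 0)])
    while queue:
        node, dist = queue.popleft()
        for nxt in adj[node]:
            if nxt == dst:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return math.inf


# Toy parse of ["the", "old", "user", "reads", "Next"]; "Next" starts a new
# sentence, so it is a separate root with no path to the first sentence.
heads = [2, 2, 3, None, None]
adj = build_adjacency(heads)
distances = [path_distance(adj, i, i + 1) for i in range(len(heads) - 1)]
```

Here "the" and "old" are linearly adjacent but both attach to "user", so their distance is 2, while the pair that crosses the sentence boundary has no connecting path.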

[0053] To accurately quantify the cognitive loss that may be caused by physical truncation, the system calculates a quantified tension feature value using a rigorous mathematical model based on extracted path distance features and specific dependency relationships between adjacent text primitives. In this embodiment, the tension feature value is calculated using the following tension quantization mapping formula:

[0054] $$T_{i,i+1} = \frac{k}{d_{i,i+1}} + \lambda \cdot \mathbb{I}(c_i, c_{i+1})$$

[0055] In this core formula, $T_{i,i+1}$ represents the tension feature value between the i-th text primitive $c_i$ and the (i+1)-th text primitive $c_{i+1}$. The first term on the right side of the formula is the inverse distance decay term, where $d_{i,i+1}$ is the path distance feature extracted earlier and k is a system-preset normalization scaling factor (used to map the reciprocal value to a reasonable range that conforms to the ranking weight). The second term on the right side of the formula is the core syntactic penalty term, where $\lambda$ is a dependency penalty constant preset by the system and $\mathbb{I}(c_i, c_{i+1})$ is a Boolean indicator function. In practical applications, when the system determines that the i-th text primitive and the (i+1)-th text primitive form a pre-defined core dependency relationship (such as a verb-object relationship, a subject-verb relationship, or a close modifier-head relationship) in the syntax tree, this indicator function is activated and assigned a value of 1; otherwise, it is assigned a value of 0.

[0056] To facilitate an intuitive understanding of this computational logic, take the adjacent text primitives "visual" ($c_i$) and "fatigue" ($c_{i+1}$) as an example. In the syntactic tree, "visual" directly modifies the head noun "fatigue" as an attributive, and there is a directly connected topological edge between them. Therefore, the path distance feature is $d_{i,i+1} = 1$. At the same time, this direct modification relationship is identified as a core dependency relationship by the system's internal knowledge base, hence the indicator function $\mathbb{I}(c_i, c_{i+1}) = 1$. Assuming the normalization scaling factor k is set to 1.0 in the system configuration and the dependency penalty constant $\lambda$ is set to 0.5, substituting into the formula gives the tension feature value $T_{i,i+1} = 1.0/1 + 0.5 \times 1 = 1.5$. This very high tension value sends a strong data constraint signal to the typesetting engine, indicating that there is very strong semantic stickiness between these two primitives and that they should not be easily severed. Conversely, for a period and the first primitive of the next sentence, the distance may be extremely large and the indicator function is 0, so the tension feature value approaches 0.
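The calculation described above can be sketched in Python. The sketch assumes the tension is the sum of the two terms the text describes (the scaling factor k divided by the path distance, plus the penalty constant multiplied by the indicator); the names `tension`, `tension_sequence`, `K`, and `LAMBDA` are illustrative, and the sample distances and flags are made up apart from the "visual"/"fatigue" case.

```python
import math
from typing import List

K = 1.0        # normalization scaling factor k (value from the worked example)
LAMBDA = 0.5   # dependency penalty constant (value from the worked example)


def tension(distance: float, is_core: bool,
            k: float = K, lam: float = LAMBDA) -> float:
    """Tension of one primitive gap: inverse-distance term plus core penalty."""
    if distance <= 0:
        raise ValueError("adjacent primitives must have a path distance >= 1")
    # An unreachable pair (infinite distance) contributes no decay term.
    decay = 0.0 if math.isinf(distance) else k / distance
    return decay + lam * (1.0 if is_core else 0.0)


def tension_sequence(distances: List[float], core_flags: List[bool]) -> List[float]:
    """One tension value per primitive gap, in text order."""
    return [tension(d, c) for d, c in zip(distances, core_flags)]


visual_fatigue = tension(1, True)             # direct core dependency
sentence_boundary = tension(math.inf, False)  # no path, no core relation
sequence = tension_sequence([2, 1, 1, math.inf], [False, True, True, False])
```

Under these assumptions the "visual"/"fatigue" gap evaluates to 1.5, and a gap across a sentence boundary evaluates to 0, matching the two cases discussed in the text.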

[0057] After traversing and calculating all sequence gaps, the system combines the tension feature values corresponding to each primitive gap in the physical order of the original text sequence, thus constructing in memory a semantic tension feature sequence parallel to the original text. At this point, the system has completed the mathematical transformation from linguistic semantics to scalar typographic constraints. This sequence serves as the key decision constraint in subsequent processing.

[0058] Step S3: Based on the basic physical rendering dimensions, perform virtual space accumulation on the target text sequence to obtain the accumulated width, and determine the first truncation anchor point that satisfies the preset boundary conditions for the accumulated width;

[0059] After successfully constructing a semantic tension feature sequence representing deep linguistic logic, the system needs to predict the default physical line break behavior of the native typesetting engine without interfering with the actual interactive interface of the terminal device. If styled text nodes are directly mounted to the real Document Object Model (DOM) tree for line break testing in the system front end, it will frequently trigger browser reflow and repaint, leading to serious system performance bottlenecks and frame drops / stuttering. Based on this, this embodiment constructs a lightweight isolation sandbox mechanism in memory, and performs virtual space accumulation operations on the target text sequence sequentially based on the basic physical rendering dimensions obtained in step S1.

[0060] Specifically, the system first needs to establish an absolute physical reference boundary, that is, to determine the upper limit of the effective rendering physical width of the target rendering environment. In actual engineering scenarios, this upper limit is not simply equivalent to the screen resolution of the hardware device, but rather the net width obtained after the system dynamically reads the horizontal physical pixel width of the current viewport and strictly deducts the horizontal space occupied by the left and right padding, margin, and scrollbars of the outer container. Taking a smartphone with a horizontal screen resolution of 1080 pixels as an example, assuming that the reader application interface sets a safe reading padding of 40 pixels on each side to avoid text touching the edges, the upper limit of the effective rendering physical width calculated by the underlying system will be strictly locked to 1000 pixels.

[0061] After anchoring this physical space constraint boundary, the system initializes a floating-point accumulator variable in memory to record the accumulated width in real time, with its initial value set to zero. Subsequently, the system synchronously traverses the basic physical rendering dimensions of each text primitive, calculated previously, according to the natural reading linear order of the target text sequence. In each iteration, the system directly adds the physical pixel width of the currently pointed-to text primitive (including its own pixel width and the accompanying primitive gap width) to the accumulated width. After each addition operation, the system immediately performs an out-of-bounds detection on the current accumulated width state. In the specific implementation of this embodiment, the triggering logic of the preset out-of-bounds condition is strictly defined as: determining whether the current accumulated width is strictly greater than the aforementioned determined effective rendering physical width upper limit of the target rendering environment.

[0062] As long as the accumulated width remains less than or equal to the upper limit, the system assumes that the current text primitive can still be safely contained within the physical space of the current line, and the memory cursor advances to the next text primitive. However, once a single accumulation step pushes the accumulated width beyond the upper limit of the effective rendering physical width, the system immediately interrupts the current virtual space accumulation loop. The system then records the index value of the text primitive that caused the accumulated width to exceed the upper limit as the first truncation anchor point.

[0063] To illustrate this logical flow, we continue with the example of a long article on AI technology. Assume the system is performing virtual accumulation on the first line of text in a paragraph. After the memory cursor advances and completes the accumulation of the 25th text primitive (e.g., the word "deep"), the accumulated width reaches 960 pixels. Next, the system reads the basic physical rendering dimension of the 26th text primitive (e.g., the word "learning") and finds its width to be 60 pixels. The system performs virtual space accumulation, updating the accumulated width to 1020 pixels (i.e., 960 pixels plus 60 pixels). Since 1020 pixels is greater than the effective rendering physical width limit of 1000 pixels, the preset out-of-bounds condition is triggered. At this point, the system does not allow the native typesetting engine to perform a mechanical line wrap. Instead, it marks the position in the original target text sequence of the text primitive "learning" that caused the overflow, defining its index value "26" as the first truncation anchor point.
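The accumulation loop of step S3 can be sketched as below. This is a minimal in-memory illustration; the function name `first_truncation_anchor` and the individual per-primitive widths are hypothetical, chosen only so that the first 25 primitives total 960 pixels and the 26th is 60 pixels wide, as in the example.

```python
from typing import Optional, Sequence


def first_truncation_anchor(widths: Sequence[float],
                            max_width: float) -> Optional[int]:
    """Return the 1-based index of the primitive that pushes the running
    width strictly past max_width, or None if the whole sequence fits."""
    accumulated = 0.0
    for index, width in enumerate(widths, start=1):
        accumulated += width          # primitive width including its gap
        if accumulated > max_width:   # strict out-of-bounds condition
            return index
    return None


# Effective width from the example: 1080 px viewport minus 40 px padding per side.
effective_width = 1080 - 2 * 40

# Hypothetical widths: primitives 1..25 total 960 px, primitive 26 is 60 px.
widths = [38] * 24 + [48, 60]
anchor = first_truncation_anchor(widths, effective_width)
```

The loop touches each primitive once, which matches the linear time cost stated for this step.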

[0064] Through this purely in-memory deduction and out-of-bounds pre-detection mechanism, the system predicts, at a computational cost of only O(N) time complexity, the mechanical cutting position that the native typesetting engine would execute under purely geometric rules. The first truncation anchor point is not only an objective geometric marker of the exhaustion of single-line physical space, but also a checkpoint at a potentially high-risk position within the text data stream. It provides the positioning reference from which the system subsequently extracts the semantic tension feature at that position to assess whether a line break there would cause a serious semantic collision.

[0065] Step S4: When the tension feature value corresponding to the first truncation anchor point meets the preset collision condition, determine the candidate backtracking interval in the historical text primitives before the first truncation anchor point;

[0066] After locating the first truncation anchor point through virtual space accumulation, the system does not directly use it as an instruction for the typesetting engine to execute line break rendering. Instead, it enters the semantic collision evaluation and anti-fragmentation optimization stage. Specifically, the system first extracts the tension feature value corresponding to the position of the first truncation anchor point and compares it with the preset collision condition in the underlying rule base. In this embodiment, the preset collision condition is whether the tension feature value at this position is greater than a preset safe cognitive tension threshold. This threshold is not an arbitrarily chosen empirical constant, but a baseline determined from eye-tracking experimental data in cognitive psychology and reading comprehension test results. When the tension feature value is lower than or equal to the threshold (for example, the cut-off point falls exactly after a comma, period, or other punctuation mark with no semantic cohesion), a physical line break at that point will not impose the cognitive burden of cross-line splicing, and the system allows it to proceed. However, once the tension feature value is greater than the safe cognitive tension threshold (for example, the cut-off point splits the highly cohesive term "deep learning"), the system determines that a serious semantic break has occurred and that the preset collision condition is satisfied.

[0067] When the preset collision conditions are met, the system must abandon the physical default solution of mechanically wrapping the text at the first truncation anchor point and instead search for a better alternative cut-off point to the left (i.e., in the historical text primitives before the first truncation anchor point). This presents a new engineering dilemma: if the system indefinitely backtracks to the left to find a gap with extremely low tension to perfectly protect semantics, it will inevitably result in too little actual rendered content for the current line, leaving a large physical blank gap at the end of the line. This not only severely disrupts the visual balance of the layout but also induces the so-called "typography rivers," increasing visual resistance to reading. Therefore, the system introduces an extremely strict spatial baseline constraint mechanism when determining feasible candidate backtracking intervals.

[0068] In the specific implementation, the system's underlying layer includes a parameter mapping module for calculating the absolute physical backoff baseline. The system first obtains the upper limit of the effective physical width of the current target rendering environment and retrieves the maximum allowed physical whitespace ratio preset in the system configuration properties. This whitespace ratio is typically strictly limited to a reasonable tolerance range that ensures visual aesthetics. Based on these two parameters, the system calculates the safe boundary physical width using an internal formula. The calculation logic is: upper limit of the effective physical width multiplied by (1 minus the maximum allowed physical whitespace ratio). Assuming the system captures an upper limit of the effective physical width of 1000 pixels and sets the maximum allowed physical whitespace ratio to 15%, the calculated safe boundary physical width is 850 pixels. This value has an extremely rigid physical meaning: regardless of the level of semantic integrity being protected, the actual pixel width of the current line after final layout is absolutely not allowed to shrink below 850 pixels.

[0069] After anchoring the aforementioned safety boundary, the system performs a reverse lookup in the accumulated width trajectory recorded in memory. The system searches the accumulated width values of historical nodes in reverse order to find the earliest historical text primitive whose accumulated width is still greater than or equal to the safe boundary physical width (i.e., 850 pixels), and marks its location as the limit search index. Subsequently, the system defines the entire continuous segment of historical text primitives, starting from this limit search index and ending at the text primitive preceding the first truncation anchor point, as the candidate backtracking interval.

[0070] Continuing the earlier example of the long article on artificial intelligence, suppose the index of the first truncation anchor point that caused the single-line space overflow is 26 (at which point the accumulated width reaches 1020 pixels, exceeding the upper limit of 1000 pixels). The system backtracks to the left and finds that the accumulated width at the 22nd text primitive is 860 pixels, just above the safety line of 850 pixels. Therefore, the limit search index is set to 22. The four consecutive text primitives from index 22 to index 25 thus constitute the closed candidate backtracking interval for this typesetting optimization. In this way, the system not only intercepts the physical cut that would disrupt cognition, but also bounds a controllable optimization range using the white space limit, providing clear data support for the subsequent comprehensive cost assessment and the determination of the final truncation anchor point.
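The derivation of the candidate backtracking interval in step S4 can be sketched as follows. This is a minimal illustration under stated assumptions: accumulated widths are stored in a list indexed from primitive 1, the list is non-decreasing, and the function name and the sample width values other than 860, 960 and 1020 are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def candidate_backtracking_interval(
    cumulative: Sequence[float],   # cumulative[i - 1] = width through primitive i
    first_anchor: int,             # 1-based index of the first truncation anchor
    max_width: float,
    max_blank_ratio: float,
) -> Optional[Tuple[int, int]]:
    """Return (limit_index, last_index), both inclusive and 1-based,
    or None if no primitive before the anchor reaches the safe width."""
    safe_width = max_width * (1.0 - max_blank_ratio)
    limit = None
    # Walk left from the primitive just before the anchor.
    for index in range(first_anchor - 1, 0, -1):
        if cumulative[index - 1] >= safe_width:
            limit = index
        else:
            break  # widths only shrink further to the left
    if limit is None:
        return None
    return limit, first_anchor - 1


# Hypothetical trajectory: primitive 21 ends at 840 px, 22 at 860 px,
# 25 at 960 px, and the overflowing primitive 26 at 1020 px.
cumulative = [40 * i for i in range(1, 22)] + [860, 900, 930, 960, 1020]
interval = candidate_backtracking_interval(cumulative, 26, 1000, 0.15)
```

With a 15% maximum white space ratio the safe width is 850 pixels, so the interval covers primitives 22 through 25.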

[0071] Step S5: Combine the cumulative width and tension feature value of each node in the candidate backtracking interval to determine the second truncation anchor point, divide the part of the target text sequence ending at the second truncation anchor point into target truncation lines, and obtain the physical white space of the target truncation lines.

[0072] After successfully defining the candidate backtracking interval, constructed by the safety physical baseline and the first truncation anchor point, the system faces the core typesetting decision task: how to accurately locate the true optimal line break position within this interval. To find the optimal balance between the inherent engineering contradiction of "protecting the semantic integrity of the language" and "maintaining typesetting space utilization," the system employs a cost evaluation algorithm based on multi-dimensional objective optimization. Specifically, the system performs a traversal calculation for each candidate truncation node within the candidate backtracking interval, quantifying the comprehensive spatial semantic cost of forcibly truncating it at that point using an internally pre-built comprehensive cost function.

[0073] In the specific implementation of this embodiment, the comprehensive cost function is not a simple linear weighting, but a composite mathematical model that incorporates a nonlinear penalty mechanism. Its specific calculation formula is as follows:

[0074] $C_j = \alpha \cdot T_j + \beta \cdot \left( \dfrac{W_{\max} - W_j}{W_{\max}} \right)^2$;

[0075] In this equation, $C_j$ represents the comprehensive spatial semantic cost corresponding to the j-th candidate truncation node. The right side of the equation consists of two independent penalty terms. The first term is the semantic destruction cost term, where $T_j$ is the tension feature value extracted at the j-th candidate truncation node, representing the risk of cognitive dissonance if the text is cut at this point, and $\alpha$ is the first preset adjustment weight constant, used to control the priority the typesetting engine gives to protecting semantic coherence. The second term is the space waste penalty term, where $W_{\max}$ is the upper limit of the effective rendering physical width of the target rendering environment and $W_j$ is the accumulated width obtained by virtual accumulation up to the j-th candidate truncation node. The white space ratio (i.e., the ratio of remaining physical space to total physical space) is squared because the human eye tolerates small white space gaps well, whereas the impact of a gap on layout aesthetics and visual tracking continuity grows disproportionately as the gap widens. A nonlinear amplification penalty is therefore applied through the quadratic term. In the formula, $\beta$ is the second preset adjustment weight constant, used to control the system's basic preference for typesetting density.

[0076] To illustrate the decision logic of this algorithm, assume an effective rendering physical width limit of $W_{\max} = 1000$ pixels, with the first preset adjustment weight constant set to $\alpha = 1.0$ and the second preset adjustment weight constant set to $\beta = 10000$. There are two candidate truncation nodes within the current candidate backtracking interval. Node A has a tension feature value of $T_A = 0.8$ and an accumulated width of $W_A = 950$ pixels. Node B has a very low tension feature value of only $T_B = 0.1$ (for example, it falls exactly after a punctuation mark), but its accumulated width is only $W_B = 880$ pixels. Substituting into the formula, the comprehensive spatial semantic cost of node A is $C_A = 1.0 \times 0.8 + 10000 \times (50 / 1000)^2 = 25.8$, and that of node B is $C_B = 1.0 \times 0.1 + 10000 \times (120 / 1000)^2 = 144.1$. The results show that although node B offers excellent semantic protection, it sacrifices too much typesetting space and leaves a large right margin, triggering a severe quadratic space penalty, so that its overall cost is far greater than that of node A.

[0077] Through the above calculations, the system completes the cost evaluation of all nodes within the candidate backtracking interval and sorts and compares the resulting values. Subsequently, the system formally determines the candidate truncation node whose comprehensive spatial semantic cost reaches the global minimum as the second truncation anchor point. This operation signifies that the system has found the optimal solution at the mathematical level that balances severe semantic fragmentation with an excessively empty layout.

[0078] After locking the second truncation anchor point, the system performs the split at the logical level: the portion of the originally continuous one-dimensional target text sequence up to the second truncation anchor point is divided off as an independent target truncation line. At the same time, to provide a physical reference input for the subsequent spatial reorganization steps, the system retrieves the upper limit of the effective rendering physical width and subtracts the accumulated width at the second truncation anchor point. The resulting pixel difference is taken as the physical margin of the current target truncation line. For example, if the accumulated width at the second truncation anchor point is 950 pixels, the physical margin is 50 pixels. Once this physical margin is generated, the typesetting engine has finished the passive stage of predicting line break positions and enters the parameter reconstruction stage, in which it actively absorbs typesetting flaws through dynamic compensation.
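Step S5 can be sketched as below. This is a minimal illustration that assumes the cost has the form "first weight times tension, plus second weight times the squared white space ratio", consistent with the description of its two terms. The function name `composite_cost`, the node labels, and the weight values 1.0 and 10000 as defaults are assumptions drawn from the worked example rather than a definitive implementation.

```python
def composite_cost(tension: float, cumulative_width: float, max_width: float,
                   alpha: float = 1.0, beta: float = 10000.0) -> float:
    """Semantic destruction cost plus quadratic space waste penalty."""
    blank_ratio = (max_width - cumulative_width) / max_width
    return alpha * tension + beta * blank_ratio ** 2


MAX_WIDTH = 1000.0

# Candidate nodes from the example: (tension, accumulated width in px).
candidates = {"A": (0.8, 950.0), "B": (0.1, 880.0)}

costs = {name: composite_cost(t, w, MAX_WIDTH)
         for name, (t, w) in candidates.items()}

# The candidate with the minimum cost becomes the second truncation anchor.
second_anchor = min(costs, key=costs.get)

# Physical margin of the target truncation line.
physical_margin = MAX_WIDTH - candidates[second_anchor][1]
```

With these inputs node A is selected and the physical margin is 50 pixels; with a much smaller second weight the balance would shift toward the low-tension node B.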

[0079] Step S6: Based on the semantic tension feature sequence, the physical white space is allocated to the gaps between primitives that satisfy the first tension condition within the target truncation line to obtain the one-dimensional space compensation parameter and the overflow space residual amount.

[0080] After determining the target truncation line and obtaining the physical margin caused by the early line break, the typesetting engine must reallocate this margin within the line to achieve right-edge justification. Traditional typesetting algorithms typically employ a uniform allocation strategy, but this indiscriminately widens all primitive gaps and easily disrupts the highly cohesive semantic blocks just protected by the cost function. Therefore, this embodiment constructs a nonlinear spatial allocation mechanism based on the semantic tension feature sequence, which allocates the physical margin along the one-dimensional width to the primitive gaps within the target truncation line that satisfy the first tension condition.

[0081] Specifically, the system first needs to evaluate the spatial absorption priority of each gap within a line. For each primitive gap within the target truncated line, the system calculates a primary spatial allocation weight. At the underlying physical logic level, this primary spatial allocation weight is strictly inversely proportional to the tension characteristic value corresponding to the primitive gap; that is, gaps with stronger semantic stickiness and greater indivisibility (such as those within compound proper nouns) receive smaller weights, even approaching zero; while gaps with weaker semantic stickiness (such as those between prepositions or after punctuation) receive larger weights, thus being able to be scheduled to absorb more white space. This tension-based inverse mapping relationship is precisely quantified by the following formula:

[0082] $w_i = \dfrac{1}{T_i + \epsilon}$;

[0083] In this formula, $w_i$ represents the primary space allocation weight of the i-th primitive gap, and $T_i$ represents the tension feature value corresponding to the i-th primitive gap, extracted from the semantic tension feature sequence. Because $T_i$ may be zero in some extremely low-tension scenarios, a preset division-by-zero smoothing constant $\epsilon$ (for example, usually set to 0.001) is introduced to prevent a program crash or floating-point overflow caused by a zero denominator. Assume the line currently being processed contains two gaps: gap A lies inside a tight attributive phrase and has a tension feature value as high as $T_A = 0.99$, while gap B lies after a comma and has a tension feature value of only $T_B = 0.01$. Substituting into the formula, the weight assigned to gap A is $1 / (0.99 + 0.001) \approx 1.01$, while the weight assigned to gap B rises to $1 / (0.01 + 0.001) \approx 90.9$. This indicates that the typesetting system concentrates space absorption on low-tension gaps. In practice, the primitive gaps favored by this weighting are the gaps that satisfy the first tension condition.

[0084] After obtaining the weights of all gaps within the line, the system calculates the proportion of each primitive gap's primary space allocation weight in the total weight. The system then divides the previously obtained physical margin according to these proportions, thereby deriving the theoretical compensation width corresponding to each primitive gap.
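The weighting and proportional division described above can be sketched as follows. This is a minimal illustration that assumes the weight is the reciprocal of "tension plus smoothing constant"; the function names and the 50-pixel margin are illustrative assumptions.

```python
from typing import List, Sequence


def allocation_weights(tensions: Sequence[float],
                       epsilon: float = 0.001) -> List[float]:
    """Primary space allocation weight per gap: inverse of tension,
    smoothed by epsilon so that a zero tension cannot divide by zero."""
    return [1.0 / (t + epsilon) for t in tensions]


def theoretical_compensation(tensions: Sequence[float], margin: float,
                             epsilon: float = 0.001) -> List[float]:
    """Split the physical margin across gaps in proportion to their weights."""
    weights = allocation_weights(tensions, epsilon)
    total = sum(weights)
    return [margin * w / total for w in weights]


# Gap A inside a tight attributive phrase, gap B after a comma.
tensions = [0.99, 0.01]
weights = allocation_weights(tensions)
shares = theoretical_compensation(tensions, margin=50.0)
```

The shares always sum to the margin, and almost all of it lands on the low-tension gap, which is why the subsequent per-gap limit is needed.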

[0085] It is worth noting that if the system relied solely on the aforementioned theoretical calculation and injected large amounts of white space into a few extremely low-tension gaps without limit, it would open abrupt visual breaks on the page, a phenomenon known in typography as "rivers". To prevent this degradation of the reading experience, the system incorporates a set of basic typography constraints built on visual physiology and typographic aesthetics. The system reads the preset basic typography parameter set of the current rendering environment in real time, obtains the current baseline spacing (usually the standard word spacing or letter spacing setting under normal typography conditions) and the baseline font size, and determines from them the maximum stretchable absolute physical limit value of a single primitive gap. The calculation of this limit value combines an upper bound on the relative spacing stretch ratio with an absolute floor based on font size, and its core formula is as follows:

[0086] $G_{\max} = \max\left( \mu_1 \cdot S_{\text{base}},\ \mu_2 \cdot F_{\text{base}} \right)$;

[0087] In the formula, $G_{\max}$ is the maximum stretchable absolute physical limit allowed by the system without causing a visual break; $S_{\text{base}}$ is the baseline spacing configured for the current system; $F_{\text{base}}$ is the base font size of the current text; and $\mu_1$ and $\mu_2$ are constant coefficients preset by the system. In engineering practice, $\mu_1$ is usually calibrated as 2.5 (that is, the visual physiological limit allows the spacing to be stretched to at most 2.5 times its original size), while $\mu_2$, the fallback coefficient that guards against zero spacing, is often set to 0.2. For example, in some Chinese reading scenarios, the default baseline spacing between Chinese characters $S_{\text{base}}$ is often 0 pixels; without intervention, the product $\mu_1 \cdot S_{\text{base}}$ would be zero and the gap would lose all stretch elasticity. In this case, because $\mu_1 \cdot S_{\text{base}} = 0$ is smaller than $\mu_2 \cdot F_{\text{base}}$, the system triggers the fallback mechanism. Assuming a base font size of $F_{\text{base}} = 18$ pixels, the system takes $0.2 \times 18 = 3.6$ pixels as the absolute physical limit for the width absorbed by a gap between Chinese characters.
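The limit calculation can be sketched as below. This is a minimal illustration that assumes the limit is the larger of "ratio cap times baseline spacing" and "floor coefficient times font size", which matches the fallback behaviour described for zero baseline spacing. The function and parameter names are hypothetical, and the 4-pixel spacing in the second call is an invented illustrative input.

```python
def max_gap_stretch(base_spacing: float, base_font_size: float,
                    ratio_cap: float = 2.5, font_floor: float = 0.2) -> float:
    """Maximum stretch allowed for a single primitive gap, in pixels."""
    return max(ratio_cap * base_spacing, font_floor * base_font_size)


# Chinese text with zero default spacing: the font-size floor applies.
cjk_limit = max_gap_stretch(base_spacing=0.0, base_font_size=18.0)

# Hypothetical Latin text with 4 px word spacing: the ratio cap applies.
latin_limit = max_gap_stretch(base_spacing=4.0, base_font_size=18.0)
```

Taking the maximum keeps some elasticity even when the baseline spacing is zero, while still bounding the stretch by the spacing ratio whenever that bound is the larger one.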

[0088] After establishing this safety boundary, the system performs numerical truncation matching for each gap. Specifically, it compares the theoretical compensation width of each primitive gap calculated previously with the derived maximum stretchable absolute physical limit value one by one. The system rigorously extracts the minimum value between the two and formally establishes it as the actual allocated compensation amount. This truncation operation ensures that the width extension of any gap will not exceed the safety threshold for human visual perception. After completing the truncation matching of all gaps within the current line, the system collects and integrates all the actual allocated compensation amounts within the target truncated line, forming a complete set of one-dimensional spatial compensation parameters. These parameters are essentially a set of micro-typography instructions accurate to the sub-pixel level, used to guide the downstream rendering pipeline to increase the physical distance between text nodes along the horizontal one-dimensional axis.

[0089] However, precisely because of this physical limit truncation mechanism, the initially acquired physical white space often cannot be fully absorbed by the in-line spacing. Therefore, the system must perform a final calculation: subtracting the sum of all actual allocated compensation amounts within the line from the original physical white space. This final difference is defined by the system as the overflow space residual. When the calculated overflow space residual is greater than zero, it means that relying solely on one-dimensional primitive spacing stretching has reached the physical limit of page layout aesthetics, and there is still a blank gap at the end of the current line that has not been effectively filled. This remaining data variable provides crucial activation signals and parameter support for the system to subsequently break through conventional thinking and trigger the joint compensation of two-dimensional font elastic deformation and optical grayscale orthogonality.
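The truncation matching and residual calculation of step S6 can be sketched as follows. This is a minimal illustration; the function name and the sample theoretical widths are hypothetical inputs chosen only to show one gap being clamped.

```python
from typing import List, Sequence, Tuple


def clamp_and_residual(theoretical: Sequence[float], limit: float,
                       margin: float) -> Tuple[List[float], float]:
    """Clamp each gap's theoretical compensation to the stretch limit and
    return (actual compensation per gap, overflow space residual)."""
    actual = [min(width, limit) for width in theoretical]
    residual = margin - sum(actual)
    return actual, residual


# Hypothetical line: 50 px margin, two gaps, 3.6 px limit per gap.
actual, residual = clamp_and_residual([0.5, 49.5], limit=3.6, margin=50.0)
```

Here the second gap is clamped to 3.6 pixels, leaving a positive residual that would trigger the step S7 deformation path.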

[0090] Step S7: When the residual amount of overflow space is greater than zero, convert the residual amount of overflow space into the lateral deformation parameter of the text primitive that satisfies the second tension condition in the target truncated line, and generate the word weight compensation parameter orthogonally coupled with the lateral deformation parameter.

[0091] After the typesetting system completes the spacing stretching and truncation matching in one-dimensional space, if the system calculates that the residual amount of overflow space is strictly greater than zero, it means that simply expanding the gaps between text primitives has already reached the physical and physiological limit of human visual continuity. At this point, if these remaining blank gaps are forcibly distributed into the gaps, severe visual breaks will inevitably be torn into the page. Based on this, the typesetting engine in this embodiment completely breaks through the underlying assumption of "rigid and immutable physical size of characters" in traditional typesetting, transforming the residual amount of overflow space into a three-dimensional deformation instruction for the text primitives themselves. In practical application scenarios, this process relies on the deep coupling between the system's underlying preset visual perception model and the variable fonts rendering engine. Specifically, this visual perception model is not a simple linear mapping table, but an optical grayscale compensation algorithm network built based on the Weber-Fechner Law in cognitive psychology. The reason for employing this specific non-linear perceptual structure is an inherent physical causal relationship in optical typesetting principles: when a character is horizontally widened to absorb typesetting white space, its original strokes inevitably appear thinner within the wider viewport, causing the overall typographic color of that area to become lighter. If orthogonal dimension weight (stroke thickness) compensation is not performed during the two-dimensional character stretching, this grayscale distortion will act like visual noise, significantly interfering with foveal focusing and thus negating the accessibility benefits of dynamic typesetting.

[0092] To execute the aforementioned deformation compensation logic, the system first needs to define which objects within the target truncation line are suitable to be stretched. The system re-traverses the target truncation line and extracts continuous text primitives whose corresponding tension feature values are greater than the preset core tension isolation threshold. These continuous high-tension text primitives constitute the set of text primitives that satisfy the second tension condition. In linguistic terms, these primitives are often proper nouns or core terms that are tightly bound internally and must not be pulled apart by letter spacing. Applying deformation to these highly cohesive primitives not only absorbs excess white space at the physical level but also, at the cognitive level, subtly guides and emphasizes the user's visual focus through slight glyph expansion.

[0093] After defining the target primitive set, the system performs the deformation parameter conversion. First, the system counts the total number of characters contained in the text primitives that satisfy the second tension condition. Then, the system divides the overflow space residual left over from the previous steps by this total number of characters, obtaining the absolute physical stretch pixel increment that a single character must bear. In real typesetting scenarios, there may be extreme cases where the tension feature values of all primitives in the current target truncation line are lower than the core tension isolation threshold, so that the total number of extracted characters is zero. To handle this boundary case robustly, the system has a dynamic degradation fault tolerance rule: when it detects that the total number of characters satisfying the second tension condition is zero, the system triggers a degradation strategy, either expanding the recipients of the lateral deformation parameter to all text primitives in the current line (i.e., uniform stretching across the entire line) or appropriately lowering the core tension isolation threshold and redefining the recipients, thereby avoiding a program crash caused by division by zero.

[0094] However, modern browsers' underlying typesetting pipelines and variable font APIs cannot directly recognize absolute pixel increment instructions; they typically require vector graphics rendering based on scalar parameters of relative proportions. Therefore, the system uses a font metrics interface to obtain the baseline pixel width of the target rendered font corresponding to the text primitives that satisfy the second tension condition. Based on this, the system calculates the percentage ratio of the aforementioned physical stretch pixel increment relative to this baseline pixel width, directly converting this percentage ratio into a relative percentage scalar increment, ultimately determining it as the lateral deformation parameter required by the system.

[0095] For example, suppose a 6-pixel overflow space residual remains at the end of the current target truncation line. The system identifies a core phrase within the line that satisfies the second tension condition, "深度学习" ("deep learning"), and counts that it contains 4 Chinese characters. After division, the physical stretch pixel increment required for a single character is 1.5 pixels. If the baseline pixel width of the target Chinese font currently rendering this phrase is 15 pixels, the ratio of 1.5 pixels to 15 pixels is 10% (i.e., 0.1). The system therefore sets 10% as the lateral deformation parameter of this phrase, which means that the phrase will be stretched by 10% along the horizontal width axis (wdth axis) during rendering.
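The conversion from residual pixels to a relative deformation scalar, including the degradation rule for an empty recipient set, can be sketched as follows. This is a minimal illustration; the function name, the parameter names, and the choice of the "stretch the whole line" fallback (rather than lowering the threshold) are assumptions made for the sketch.

```python
def lateral_deformation(residual_px: float, high_tension_chars: int,
                        line_chars: int, base_char_width_px: float) -> float:
    """Relative horizontal stretch (0.1 means 10%) needed to absorb the residual.

    Falls back to all characters in the line when no character satisfies
    the second tension condition, so the divisor is never zero.
    """
    recipients = high_tension_chars if high_tension_chars > 0 else line_chars
    if recipients == 0 or base_char_width_px <= 0:
        return 0.0  # nothing to stretch
    per_char_px = residual_px / recipients
    return per_char_px / base_char_width_px


# Example: 6 px residual, a 4-character phrase, 15 px baseline width.
stretch = lateral_deformation(6.0, high_tension_chars=4,
                              line_chars=12, base_char_width_px=15.0)

# Degraded case: no high-tension characters, so all 12 characters share the load.
fallback_stretch = lateral_deformation(6.0, high_tension_chars=0,
                                       line_chars=12, base_char_width_px=15.0)
```

Spreading the same residual across more characters yields a smaller per-character stretch, which is why the fallback is visually gentler but less focused.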

[0096] After obtaining the lateral deformation parameter, the system calls the aforementioned preset visual perception model to generate a character weight compensation parameter orthogonally coupled to it. To keep the optical grayscale constant while the characters are widened, this embodiment computes the character weight compensation with the following formula:

$$\Delta W = W_0 \cdot \ln\left(1 + \gamma \cdot \delta\right)$$

[0097] In this formula, $\Delta W$ is the resulting character weight compensation parameter; $\delta$ is the lateral deformation parameter obtained in the previous step; $W_0$ is the preset base stroke density constant of the current target character family, which is set by the font designer when building the font library and characterizes the inherent stroke thickness of that family; $\gamma$ is the preset optical visual attenuation coefficient, which characterizes how quickly the glyph appears to fade as it is expanded horizontally; and $\ln$ is the natural logarithm, reflecting the nonlinear law that human vision responds logarithmically to changes in brightness and grayscale. Continuing the example above, assume the base stroke density constant $W_0$ of the current font is 200, the optical visual attenuation coefficient $\gamma$ is 1.5, and the lateral deformation parameter $\delta$ is 0.1. Substituting these values gives $\Delta W = 200 \cdot \ln(1 + 1.5 \times 0.1) = 200 \cdot \ln(1.15)$. Because $\ln(1.15)$ is approximately 0.14, the character weight compensation parameter is approximately 28. The system thus issues not only a deformation command that widens the glyphs by 10% on the width axis, but also a compensation command that increases the character weight by 28 units on the orthogonal weight axis (wght axis), coupling deformation and grayscale on two axes and supplying all parameters needed for the final rendering reconstruction.
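As a quick check of the worked example, the following Python sketch evaluates the compensation under the assumption that it has the form base stroke density × ln(1 + attenuation coefficient × lateral deformation), which is the form that reproduces the stated values (200, 1.5, 0.1 giving about 28). The function and parameter names are illustrative only.

```python
import math


def weight_compensation(lateral_ratio: float,
                        base_stroke_density: float,
                        attenuation: float) -> float:
    """Assumed form: W0 * ln(1 + gamma * delta)."""
    return base_stroke_density * math.log(1.0 + attenuation * lateral_ratio)


delta_w = weight_compensation(lateral_ratio=0.1,
                              base_stroke_density=200.0,
                              attenuation=1.5)
print(round(delta_w, 2))  # a little under 28
```

Because the logarithm grows more slowly than its argument, doubling the stretch under this assumed form adds less than double the weight, which matches the stated intent of modeling logarithmic visual sensitivity.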

[0098] Step S8: Based on the second truncation anchor point, the one-dimensional space compensation parameters, the lateral deformation parameters, and the character weight compensation parameters, reconstruct and output the rendering object for the target text sequence;

[0099] After semantic tension assessment, spatial limit truncation, and multi-dimensional parameter compensation calculations, the system has transformed the abstract cognitive protection strategy into specific typesetting control values. However, these one-dimensional spatial compensation parameters, lateral deformation parameters, and character weight compensation parameters currently reside only in the system's memory computing module. To ensure these parameters truly impact the user's reading screen, the system must reconstruct the rendered object and perform the final output of the target text sequence without disrupting the underlying document structure. This process requires the system to deeply intrude into and take over the underlying rendering pipeline of the terminal device (such as the browser kernel or native UI framework).

[0100] Specifically, modern front-end architectures and native reading applications typically rely on the Document Object Model (DOM) tree or a similar hierarchical rendering tree to manage visual elements. Frequently modifying styles directly on all text nodes would trigger global reflow and repaint, increasing device power consumption and causing screen stutter. The system therefore uses a dirty data isolation and reorganization mechanism based on an in-memory virtual tree (a Virtual DOM or offline render buffer). At the start of the reorganization phase, the system locates in memory the index of the previously determined second truncation anchor point. At this position it explicitly inserts a forced physical line break node into the document structure (for example, the <br> tag of the HTML5 specification, or a pseudo-element given the display:block property). This node acts as a hard barrier in the typesetting flow: the native typesetting engine can no longer wrap that line freely, so the actual rendered break point coincides with the safe cognitive boundary calculated by the system.

[0101] With the physical line boundary anchored, the system must apply different parameters to different text elements within the same line. Because one-dimensional spacing stretching and two-dimensional glyph deformation target entirely different text primitives, the system must split the original, undifferentiated text nodes (TextNodes). It therefore performs fine-grained container wrapping on the target truncated line, wrapping each text primitive in the line in its own container tag node (such as an inline element). This reorganization gives each text primitive and its adjacent gap an independent style control handle, so that injected parameters do not affect neighboring primitives.

[0102] The system then starts a dual-track parameter injection process. On one track, for the container tag nodes corresponding to text primitives that meet the first tension condition (loose semantic association, suitable for absorbing white space), the system injects their one-dimensional space compensation parameters into the inline CSS properties of the node. In a concrete typesetting engine this usually means assigning sub-pixel absolute values to the word-spacing or letter-spacing properties, widening the horizontal distance between low-tension primitives.

[0103] On the other track, for container tag nodes corresponding to text primitives that meet the second tension condition (semantically highly cohesive core words such as "deep learning"), the system issues variable font rendering instructions. It combines the previously calculated lateral deformation parameter with the character weight compensation parameter and injects them into the inline font style attributes of the container tag. In a standard web rendering pipeline this takes the form of a font-variation-settings declaration. For example, if the lateral deformation parameter (relative scalar increment) of a core phrase is 10%, the character weight compensation parameter is 28, and the base weight of the font is 400, the system injects control code such as font-variation-settings: 'wdth' 110, 'wght' 428 into the wrapper node of the phrase. This drives the variable font's glyph interpolation at the rendering level, stretching the glyphs continuously along the width axis and thickening them along the orthogonal weight axis.
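The two injection tracks can be illustrated with a short Python sketch that builds the inline style strings and wraps a primitive in a container. The helper names are hypothetical, the use of a `<span>` wrapper and of `word-spacing` (rather than `letter-spacing`) are assumptions, and the width axis value assumes the usual convention that 100 means normal width.

```python
import html


def spacing_style(compensation_px: float) -> str:
    """Track 1: one-dimensional space compensation for a low-tension primitive."""
    return f"word-spacing: {compensation_px:.2f}px"


def variation_style(lateral_ratio: float, weight_delta: float,
                    base_weight: float = 400.0) -> str:
    """Track 2: coupled width and weight axes for a high-tension primitive."""
    wdth = 100.0 * (1.0 + lateral_ratio)  # 100 is assumed to mean normal width
    wght = base_weight + weight_delta
    return f"font-variation-settings: 'wdth' {wdth:g}, 'wght' {wght:g}"


def wrap(text: str, style: str) -> str:
    """Wrap one text primitive in its own container (a span is assumed here)."""
    return f'<span style="{style}">{html.escape(text)}</span>'


print(wrap("深度学习", variation_style(0.1, 28)))
print(wrap("的 ", spacing_style(1.25)))
```

For the values in the example (10%, 28, base weight 400) the first call yields a declaration with 'wdth' 110 and 'wght' 428, matching the control code quoted above.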

[0104] Once all parameters have been attached to the corresponding container tag nodes, the rendering object for the target truncated line is fully reconstructed. The system then submits this dirty tree, which carries the typographic compensation instructions, to the underlying graphics rendering pipeline. The rasterization engine of the graphics processing unit (GPU) performs pixel blending and on-screen rendering in a single pass based on the injected absolute values and deformation parameters, producing a line of text that is semantically intact, aligned to the right edge, and uniform in optical grayscale. Reading typography is a continuous streaming process. After rendering the current line, the system advances the read cursor of the text stream, takes the remaining text after the second truncation anchor point as the new target text sequence, and returns to the initial step of sequentially accumulating virtual space based on the basic physical rendering dimensions. This loop repeats until the entire article has been analyzed and rendered, giving older users and other readers whose eyes tire easily a continuous reading experience with low cognitive load.
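The line-by-line iteration described above can be sketched as a simple loop in Python. The `choose_break` callback stands in for the whole anchor selection procedure of the method; the `greedy_break` function supplied here is only a placeholder that ignores semantic tension, and all names are illustrative.

```python
from typing import Callable, List, Sequence


def layout_lines(
    widths: Sequence[float],
    max_width: float,
    choose_break: Callable[[Sequence[float], float], int],
) -> List[range]:
    """Repeatedly cut one line off the front of the remaining primitives."""
    lines: List[range] = []
    start = 0
    while start < len(widths):
        remaining = widths[start:]
        count = choose_break(remaining, max_width)
        # Always consume at least one primitive so the loop terminates.
        count = max(1, min(count, len(remaining)))
        lines.append(range(start, start + count))
        start += count  # advance the read cursor past the break
    return lines


def greedy_break(widths: Sequence[float], max_width: float) -> int:
    """Placeholder policy: take as many primitives as fit in max_width."""
    total = 0.0
    for i, w in enumerate(widths):
        if total + w > max_width:
            return i
        total += w
    return len(widths)


for line in layout_lines([30, 30, 30, 30, 30], 70, greedy_break):
    print(list(line))
```

Keeping the break policy behind a callback mirrors the structure of the method: the outer loop only manages the cursor, while the tension-aware decision is made per line.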

[0105] In summary, this embodiment breaks through the limitations of traditional typesetting engines that rely solely on physical container boundaries for mechanical line breaks by introducing a semantic tension feature sequence representing the inherent linguistic logic of the text into the underlying rendering pipeline. Based on this feature sequence, the system performs virtual space accumulation and safety boundary backtracking, effectively intercepting destructive cuts to highly cohesive semantic blocks. Furthermore, when absorbing the physical white space generated by line breaks, the system not only implements nonlinear constraint allocation of one-dimensional primitive spacing according to the inverse tension weight, but also triggers orthogonal coupling compensation of two-dimensional font horizontal deformation and optical weight for space overflow residue. This series of data flows and parameter reconstructions eliminates the cognitive micro-pauses caused by the brain's frequent backtracking and splicing due to semantic fragmentation, while suppressing visual coordinate loss and grayscale distortion caused by excessive character spacing and relatively thinner strokes. This establishes a strict mapping relationship between the physical rendering form of the text and the cognitive decoding rhythm of continuous retinal scanning, ultimately eliminating visual resistance and cognitive friction during the reading of long digital texts.

[0106] Example 2:

[0107] As shown in Figure 3, the dynamic typesetting optimization system for accessible reading on the Internet includes:

[0108] The data acquisition module is used to acquire the target text sequence, which consists of multiple text primitives and the gaps between adjacent text primitives; and to acquire the basic physical rendering dimensions of each text primitive.

[0109] The semantic parsing module is used to perform semantic parsing on the target text sequence and extract tension feature values that characterize the degree of semantic cohesion of adjacent text primitives, so as to construct a semantic tension feature sequence corresponding to the gaps between each primitive.

[0110] The virtual accumulation module is used to perform virtual space accumulation on the target text sequence sequentially based on the basic physical rendering dimension to obtain the accumulation width, and determine the first truncation anchor point where the accumulation width meets the preset out-of-bounds condition;

[0111] The backtracking interval determination module is used to determine candidate backtracking intervals in the historical text primitives before the first truncation anchor point when the tension feature value corresponding to the first truncation anchor point meets the preset collision conditions.

[0112] The truncation line segmentation module is used to comprehensively analyze the cumulative width and tension feature values ​​of each node in the candidate backtracking interval, determine the second truncation anchor point, divide the part of the target text sequence ending at the second truncation anchor point into target truncation lines, and obtain the physical white space of the target truncation lines.

[0113] The spatial compensation allocation module is used to allocate the physical white space margin to the gaps between primitives that meet the first tension condition within the target truncation line based on the semantic tension feature sequence, thereby obtaining a one-dimensional spatial compensation parameter and the overflow space residual amount.

[0114] The deformation coupling compensation module is used to convert the overflow space residual into the lateral deformation parameters of the text primitives that satisfy the second tension condition within the target truncated line when the overflow space residual is greater than zero, and to generate the character weight compensation parameters that are orthogonally coupled with the lateral deformation parameters.

[0115] The rendering output module is used to reconstruct and output the rendering object of the target text sequence based on the second truncation anchor point, the one-dimensional space compensation parameters, the lateral deformation parameters, and the character weight compensation parameters.
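As a rough illustration of the virtual accumulation and backtracking interval determination modules listed above, the following Python sketch finds the first truncation anchor point and a candidate backtracking interval. It assumes that the safety boundary width equals the width upper limit multiplied by (1 − maximum allowable white space ratio), and that the limit search index is the first primitive after which the line already reaches that width; both are interpretations, and the function names are hypothetical.

```python
from itertools import accumulate
from typing import Optional, Sequence


def first_anchor(widths: Sequence[float], max_width: float) -> Optional[int]:
    """Index of the primitive that pushes the accumulated width past max_width."""
    for i, total in enumerate(accumulate(widths)):
        if total > max_width:
            return i
    return None  # the whole sequence fits on one line


def candidate_interval(widths: Sequence[float], max_width: float,
                       max_blank_ratio: float, anchor: int) -> range:
    """Indices after which a break leaves at most the allowed white space."""
    safe_width = max_width * (1.0 - max_blank_ratio)  # assumed safety boundary
    cumulative = list(accumulate(widths))
    # Limit search index: earliest primitive reaching the safety boundary.
    limit = next((i for i in range(anchor) if cumulative[i] >= safe_width), anchor)
    return range(limit, anchor)  # up to the primitive before the anchor


widths = [20.0] * 6
anchor = first_anchor(widths, max_width=90.0)
print(anchor, list(candidate_interval(widths, 90.0, 0.4, anchor)))
```

An empty interval means no earlier break keeps the white space within the allowed ratio, in which case a caller would have to fall back to the first anchor.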

[0116] The above description is merely an example and illustration of the structure of the present invention. Those skilled in the art can make various modifications or additions to the specific embodiments described, or use similar methods to replace them, as long as they do not deviate from the structure of the invention or exceed the scope defined in the claims, all of which should fall within the protection scope of the present invention.

[0117] In the description of this specification, references to terms such as "an embodiment," "example," and "specific example" indicate that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the invention. In this specification, illustrative expressions of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in one or more embodiments or examples.

[0118] The preferred embodiments of the present invention disclosed above are merely illustrative of the invention. These preferred embodiments do not exhaustively describe all details, nor do they limit the invention to any specific implementation. Clearly, many modifications and variations can be made based on the content of this specification. This specification selects and specifically describes these embodiments to better explain the principles and practical applications of the invention, thereby enabling those skilled in the art to better understand and utilize the invention. The invention is limited only by the claims and their full scope and equivalents.

Claims

1. A dynamic typesetting optimization method for accessible reading on the Internet, characterized in that it comprises the following steps: obtaining a target text sequence, which consists of multiple text primitives and the primitive gaps between adjacent text primitives, and obtaining the basic physical rendering dimension of each text primitive; performing semantic parsing on the target text sequence to extract tension feature values that characterize the semantic cohesion of adjacent text primitives, so as to construct a semantic tension feature sequence corresponding to the primitive gaps; sequentially accumulating virtual space over the target text sequence based on the basic physical rendering dimensions to obtain an accumulated width, and determining a first truncation anchor point at which the accumulated width satisfies a preset out-of-bounds condition; when the tension feature value corresponding to the first truncation anchor point satisfies a preset collision condition, determining a candidate backtracking interval among the historical text primitives before the first truncation anchor point; determining a second truncation anchor point by combining the accumulated width and the tension feature value of each node in the candidate backtracking interval, dividing the portion of the target text sequence ending at the second truncation anchor point into a target truncated line, and obtaining the physical white space margin of the target truncated line; based on the semantic tension feature sequence, allocating the physical white space margin to the primitive gaps within the target truncated line that satisfy a first tension condition, to obtain a one-dimensional space compensation parameter and an overflow space residual; when the overflow space residual is greater than zero, converting the overflow space residual into a lateral deformation parameter for the text primitives in the target truncated line that satisfy a second tension condition, and generating a character weight compensation parameter orthogonally coupled with the lateral deformation parameter; and, based on the second truncation anchor point, the one-dimensional space compensation parameter, the lateral deformation parameter, and the character weight compensation parameter, reorganizing and outputting the rendering object for the target text sequence.

2. The dynamic typesetting optimization method for accessible reading on the Internet according to claim 1, characterized in that the construction of the semantic tension feature sequence includes: inputting the target text sequence into a preset dependency parsing model to generate a directed acyclic syntax tree containing part-of-speech tags and dependency relations; traversing each pair of adjacent text primitives in the target text sequence, obtaining the shortest connected topological path between the adjacent text primitives in the directed acyclic syntax tree, and taking the number of edges of that path as the path distance feature; calculating the tension feature value based on the path distance feature and the dependency relation between the adjacent text primitives; and combining the tension feature values corresponding to the primitive gaps in order to form the semantic tension feature sequence.

3. The dynamic typesetting optimization method for accessible reading on the Internet according to claim 2, characterized in that the tension feature value is calculated using a tension quantization mapping formula defined over the following quantities: the tension feature value between the i-th text primitive and the (i+1)-th text primitive; the path distance feature; a preset normalization scaling factor; a preset dependency penalty constant; and an indicator function that equals 1 when the i-th text primitive and the (i+1)-th text primitive are determined to form a preset core dependency relationship, and 0 otherwise.

4. The dynamic typesetting optimization method for accessible reading on the Internet according to claim 1, characterized in that determining the first truncation anchor point at which the accumulated width satisfies the preset out-of-bounds condition includes: determining, as the first truncation anchor point, the index value of the text primitive that causes the accumulated width to exceed the effective rendering physical width upper limit of the target rendering environment; the preset collision condition includes: the tension feature value corresponding to the first truncation anchor point is greater than a preset safety perception tension threshold; and determining the candidate backtracking interval among the historical text primitives before the first truncation anchor point includes: calculating a safety boundary physical width based on the effective rendering physical width upper limit and a preset maximum allowable physical white space ratio; and determining a limit search index based on the safety boundary physical width, and defining the continuous historical text primitives between the limit search index and the text primitive immediately preceding the first truncation anchor point as the candidate backtracking interval.

5. The dynamic typesetting optimization method for accessible reading on the Internet according to claim 4, characterized in that determining the second truncation anchor point, dividing the portion of the target text sequence ending at the second truncation anchor point into the target truncated line, and obtaining the physical white space margin of the target truncated line includes: for each candidate truncation node within the candidate backtracking interval, calculating a comprehensive spatial semantic cost value from the tension feature value corresponding to the candidate truncation node and the accumulated width at the candidate truncation node, using a comprehensive cost function defined over the following quantities: the comprehensive spatial semantic cost value of the j-th candidate truncation node; the tension feature value at the j-th candidate truncation node; the effective rendering physical width upper limit; the accumulated width at the j-th candidate truncation node; a first preset adjustment weight constant; and a second preset adjustment weight constant; determining the candidate truncation node whose comprehensive spatial semantic cost value reaches the global minimum as the second truncation anchor point; and determining the difference between the effective rendering physical width upper limit and the accumulated width at the second truncation anchor point as the physical white space margin of the target truncated line.

6. The dynamic typesetting optimization method for accessible reading on the Internet according to claim 1, characterized in that allocating the physical white space margin to the primitive gaps within the target truncated line that satisfy the first tension condition, to obtain the one-dimensional space compensation parameter and the overflow space residual, includes: for each primitive gap within the target truncated line, calculating a primary space allocation weight, wherein the primary space allocation weight is inversely proportional to the tension feature value corresponding to the primitive gap; dividing the physical white space margin in proportion to each primitive gap's share of the total weight to obtain the theoretical compensation width of each primitive gap; obtaining the current reference spacing and reference font size from a preset basic typesetting parameter set, and determining the maximum stretchable absolute physical limit value of a single primitive gap; comparing the theoretical compensation width of each primitive gap with the maximum stretchable absolute physical limit value, and taking the smaller of the two as the actual allocated compensation amount; forming the one-dimensional space compensation parameter from all the actual allocated compensation amounts within the target truncated line; and calculating the difference between the physical white space margin and the sum of all actual allocated compensation amounts to obtain the overflow space residual.

7. The dynamic typesetting optimization method for accessible reading on the Internet according to claim 6, characterized in that the primary space allocation weight is calculated using the following formula: $w_i = \frac{1}{T_i + \varepsilon}$; where $w_i$ is the primary space allocation weight of the i-th primitive gap, $T_i$ is the tension feature value corresponding to the i-th primitive gap, and $\varepsilon$ is a preset smoothing constant that prevents division by zero; and the maximum stretchable absolute physical limit value is calculated by a formula defined over the following quantities: the maximum stretchable absolute physical limit value; the reference spacing; the reference font size; and two preset constant coefficients.

8. The dynamic typesetting optimization method for accessible reading on the Internet according to claim 1, characterized in that converting the overflow space residual into the lateral deformation parameter for the text primitives within the target truncated line that satisfy the second tension condition includes: determining, as the text primitives that satisfy the second tension condition, the continuous text primitives in the target truncated line whose corresponding tension feature value is greater than a preset core tension isolation threshold; counting the total number of characters contained in the text primitives that satisfy the second tension condition; dividing the overflow space residual by the total number of characters to obtain the physical stretch pixel increment that each character must absorb; and obtaining the baseline pixel width of the target rendered font corresponding to the text primitives that satisfy the second tension condition, and calculating the ratio of the physical stretch pixel increment to the baseline pixel width as the lateral deformation parameter.

9. The dynamic typesetting optimization method for accessible reading on the Internet according to claim 8, characterized in that the character weight compensation parameter is calculated using the following formula: $\Delta W = W_0 \cdot \ln\left(1 + \gamma \cdot \delta\right)$; where $\Delta W$ is the character weight compensation parameter, $W_0$ is the preset base stroke density constant of the current target character family, $\gamma$ is the preset optical visual attenuation coefficient, and $\delta$ is the lateral deformation parameter.

10. A dynamic typesetting optimization system for accessible reading on the Internet, characterized in that it uses the dynamic typesetting optimization method for accessible reading on the Internet according to any one of claims 1-9 and comprises: a data acquisition module, used to acquire a target text sequence, which consists of multiple text primitives and the primitive gaps between adjacent text primitives, and to acquire the basic physical rendering dimension of each text primitive; a semantic parsing module, used to perform semantic parsing on the target text sequence and extract tension feature values that characterize the semantic cohesion of adjacent text primitives, so as to construct a semantic tension feature sequence corresponding to the primitive gaps; a virtual accumulation module, used to sequentially accumulate virtual space over the target text sequence based on the basic physical rendering dimensions to obtain an accumulated width, and to determine a first truncation anchor point at which the accumulated width satisfies a preset out-of-bounds condition; a backtracking interval determination module, used to determine a candidate backtracking interval among the historical text primitives before the first truncation anchor point when the tension feature value corresponding to the first truncation anchor point satisfies a preset collision condition; a truncated line segmentation module, used to determine a second truncation anchor point by combining the accumulated width and the tension feature value of each node in the candidate backtracking interval, to divide the portion of the target text sequence ending at the second truncation anchor point into a target truncated line, and to obtain the physical white space margin of the target truncated line; a space compensation allocation module, used to allocate the physical white space margin, based on the semantic tension feature sequence, to the primitive gaps within the target truncated line that satisfy a first tension condition, so as to obtain a one-dimensional space compensation parameter and an overflow space residual; a deformation coupling compensation module, used to convert the overflow space residual, when it is greater than zero, into a lateral deformation parameter for the text primitives in the target truncated line that satisfy a second tension condition, and to generate a character weight compensation parameter orthogonally coupled with the lateral deformation parameter; and a rendering output module, used to reorganize and output the rendering object of the target text sequence based on the second truncation anchor point, the one-dimensional space compensation parameter, the lateral deformation parameter, and the character weight compensation parameter.
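The white space allocation recited in claims 6 and 7 can be illustrated with the following Python sketch. It assumes the primary weight takes the form 1 / (tension + smoothing constant), treats the maximum stretchable limit as a given input because its formula is not reproduced here, and uses illustrative names throughout.

```python
from typing import List, Sequence, Tuple


def allocate_blank(
    tensions: Sequence[float],
    blank_px: float,
    max_stretch_px: float,
    eps: float = 1e-6,
) -> Tuple[List[float], float]:
    """Split blank_px across gaps, inversely to tension, with a per-gap cap."""
    if not tensions:
        return [], blank_px  # nothing can absorb the margin
    # Primary weights: low-tension gaps receive more space (assumed form).
    weights = [1.0 / (t + eps) for t in tensions]
    total = sum(weights)
    theoretical = [blank_px * w / total for w in weights]
    # Clamp each gap to the maximum stretchable limit.
    actual = [min(x, max_stretch_px) for x in theoretical]
    residual = blank_px - sum(actual)  # overflow space residual
    return actual, residual


actual, residual = allocate_blank([1.0, 3.0], blank_px=8.0, max_stretch_px=4.0)
print([round(a, 3) for a in actual], round(residual, 3))
```

In this example the low-tension gap would theoretically receive about 6 pixels but is capped at 4, so roughly 2 pixels remain as residual for the lateral deformation step.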