A picture book reading guidance method for autistic children based on emotional segmentation and visual silent cues

By constructing an emotional arc and an individual perceptual sensitivity profile, and combining them with cumulative emotional risk prediction and a self-correction mechanism, the method addresses the cross-page emotional accumulation effect and individual perceptual heterogeneity in picture book reading systems for children with ASD, achieving preventive intervention and system self-optimization.

CN122245641A · Pending · Publication Date: 2026-06-19 · ZHEJIANG NORMAL UNIV

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
ZHEJIANG NORMAL UNIV
Filing Date
2026-03-27
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

Existing picture book reading intervention systems for children with ASD use a post-event response model, do not model the cross-page emotional accumulation effect of picture book narratives, and cannot adapt intervention prompts to individual perceptual sensitivity, which results in low recognition rates and stimulus amplification effects.

Method used

By employing emotion segmentation and visual silent cues, and constructing an emotional arc, an individual perception sensitivity profile, an emotion accumulation risk prediction system, real-time emotion monitoring, and a self-correction mechanism, preventive intervention can be achieved.

Benefits of technology

The method achieves forward-looking quantitative prediction of the emotional impact of future pages, improves prompt accessibility and system accuracy, reduces secondary stimulation for children with sensory hypersensitivity, and gives the system continuous self-optimization capability.

✦ Generated by Eureka AI based on patent content.

Patent Text Reader

Abstract

This invention discloses a method for guiding picture book reading for children with autism based on emotional segmentation and visual silent cues. The method includes: extracting and fusing emotional features from the picture book text and illustrations to construct an emotional arc and segmentation boundaries; collecting the child's responses to establish a perceptual sensitivity vector; on each page turn, calculating the cumulative load value of subsequent pages with an emotional aftershock model and generating a risk timeline; collecting emotional observation values in real time, tracking the baseline with Kalman filtering, and calculating the deviation value; evaluating trigger rules from the deviation value and the risk timeline, generating cue parameters according to the intervention urgency, and executing visual silent cues; and, after the session, bidirectionally correcting the model by updating the emotional annotation values and by fitting the aftershock sequence with the least squares method to update the individual attenuation coefficient. The invention achieves preventive intervention by modeling the cumulative effect of emotion, and its bidirectional self-correcting closed loop improves accuracy, making it suitable for assisted intervention through picture book reading for children with autism.

Description

Technical Field

[0001] This invention relates to the field of human-computer interaction and child assistance intervention technology, specifically to a method for guiding picture book reading for autistic children based on emotion segmentation and visual silent cues.

Background Technology

[0002] Autism Spectrum Disorder (ASD) is a neurodevelopmental disorder characterized by impaired social interaction, delayed language development, and narrow range of interests. Epidemiological data shows that the prevalence of ASD in children worldwide is approximately 1% to 2%. Picture book reading, as an important means of intervention and training for children with ASD, can effectively promote their language comprehension, emotional cognition, and social imitation abilities. However, while the rich emotional content in picture books provides cognitive stimulation, it may also trigger emotional overload in children with ASD, leading to adverse reactions such as anxiety, self-stimulatory behaviors, or refusal to read, negatively impacting the intervention's effectiveness.

[0003] Existing assistive intervention technologies for children with ASD mainly fall into two categories. The first is the pure emotion recognition system, which triggers audio or visual alarms after a camera detects abnormal facial expressions in the child. The second is the digital picture book system, which provides page-turning animations and voice reading functions. Both types of system have inherent limitations. Existing emotion recognition systems are trained on standard emotion datasets, whereas children with ASD show subdued facial expressions with lower facial action unit intensity than typically developing children, so recognition rates decline significantly. Digital picture book systems lack emotion monitoring capabilities and cannot perform targeted interventions. Most importantly, both types of system respond only after the child's emotional peak has occurred; this reactive mode misses the optimal intervention window.

[0004] Furthermore, existing technologies have fundamental flaws in modeling picture book content. Emotional transmission in picture books relies on cross-page narrative context, and the effect of intense emotional stimulation on earlier pages can carry over to subsequent pages, creating an emotional aftershock effect. Existing solutions nevertheless analyze the emotion of each page in isolation, ignoring the cumulative cross-page effect; they cannot predict the potential impact of future pages on the child's emotions and therefore lack preventive intervention capability. In addition, the intervention prompt parameters in existing systems are fixed configurations that are not adapted to the highly heterogeneous perceptual sensitivities of children with ASD, which amplifies the stimulus for children with perceptual hypersensitivity.

[0005] In summary, the core shortcomings of existing technologies lie in three dimensions: first, the intervention model is reactive rather than preventive; second, the modeling of picture book content does not consider the cumulative effect of emotions across pages; and third, the intervention prompts lack adaptive mechanisms tailored to individual perceptual sensitivities. A fundamental solution to these problems requires the organic integration of narrative emotion cumulative effect modeling, adaptive prompts based on individual perceptual sensitivity, and self-correcting learning mechanisms to construct a preventive, proactive guidance system.

Summary of the Invention

[0006] Technical Objective: To address the problems of existing ASD children's picture book reading intervention systems that adopt a post-event response model, fail to model the cumulative effect of cross-page emotions in picture book narratives, lack individual perceptual sensitivity adaptation ability in intervention prompts, and fail to improve system accuracy autonomously over time, this invention discloses a picture book reading guidance method for autistic children based on emotional segmentation and visual silent cues.

[0007] Technical solution: To achieve the above technical objectives, the present invention adopts the following technical solution:

[0008] A method for guiding picture book reading for autistic children based on emotion segmentation and visual silent cues includes the following steps:

[0009] S1. Constructing the Emotional Arc: Extract emotional feature vectors from the images on each page of the picture book, perform emotional intensity regression on the text of each page, apply a fusion strategy based on the difference between the image emotional value and the text emotional value to obtain the page emotional value of each page, smooth the page emotional value sequence of all pages and detect its extrema to determine the segmentation boundaries, and generate the emotional arc data structure.

[0010] S2. Establish an individual perceptual sensitivity profile: Present the child with a sequence of perceptual probe stimuli, collect data on the child's gaze duration, avoidance actions, and facial muscle response, and calculate and store the individual perceptual sensitivity vector.

[0011] S3. Predicting the risk of accumulated emotions: When children turn pages, load the emotional value sequence of the current page and the next K pages. Based on the individual attenuation coefficient and the emotional aftershock propagation attenuation model, calculate the accumulated emotional load value of each subsequent page and generate a risk timeline.

[0012] S4. Real-time monitoring of emotional state: Collect real-time emotional observations of children, calculate the baseline estimate of emotion based on Kalman filtering, and use the difference between the real-time emotional observation and the baseline estimate of emotion as the emotional deviation value.

[0013] S5. Jointly Triggered Intervention: Based on the emotional deviation value and the risk timeline, the triggering rules are determined to ascertain the urgency of the intervention.

[0014] S6. Execute visual silent cue: Generate a cue parameter package through a ternary joint mapping of the intervention urgency, the individual perception sensitivity vector, and the current gaze coordinates, then render and display the visual silent cue.

[0015] S7. Perform self-correction: After the reading session ends, update the sentiment label value based on the difference between the measured sentiment observation value and the sentiment value on the page, and update the individual attenuation coefficient based on the sentiment aftershock observation sequence.

[0016] This invention also provides a picture book reading guidance system for autistic children based on emotion segmentation and visual silent cues, comprising:

[0017] The emotional arc construction module is used to extract emotional feature vectors from the images on each page of the picture book, perform emotional intensity regression on the text of each page, apply a fusion strategy based on the difference between the image emotional value and the text emotional value to obtain the page emotional value of each page, smooth the page emotional value sequence of all pages and detect its extrema to determine the segmentation boundaries, and generate the emotional arc data structure.

[0018] The individual profile building module is used to present a sequence of sensory probe stimulations to children, collect data on children's gaze duration, avoidance actions, and facial muscle action responses, and calculate and store the individual sensory sensitivity vector, which includes a color sensitivity component, a motion sensitivity component, and a spatial attention range component.

[0019] The cumulative risk prediction module is used to load the emotional value sequence of the current page and the next K pages when the child turns the page. Based on the individual attenuation coefficient and the emotional aftershock propagation attenuation model, it calculates the cumulative emotional load value of each subsequent page and generates a risk timeline.

[0020] The emotion state monitoring module is used to collect real-time emotion observations of children, calculate the emotion baseline estimate based on Kalman filtering, and use the difference between the real-time emotion observation and the emotion baseline estimate as the emotion deviation value.

[0021] The trigger decision module is used to determine the trigger rules based on the emotional deviation value and the risk timeline, and to calculate the urgency of intervention based on the trigger rules.

[0022] The prompt execution module is used to generate a prompt parameter package through a ternary joint mapping of the intervention urgency, the individual perceptual sensitivity vector, and the current gaze coordinates, and to render and display visual silent prompts.

[0023] The self-calibration module is used to update the sentiment label value based on the difference between the mean of the measured sentiment observation value and the sentiment value of the page after the reading session ends, and to update the individual attenuation coefficient based on the sentiment aftershock observation sequence.

[0024] Beneficial Effects: The picture book reading guidance method for autistic children based on emotion segmentation and visual silent cues provided by this invention has the following beneficial effects:

[0025] (1) This invention is the first to transfer the power-law aftershock attenuation law in seismology to the emotional transmission model of picture book narratives, and proposes an emotional aftershock propagation attenuation model to achieve a forward-looking quantitative prediction of the emotional impact of future pages, and upgrade the intervention mode from post-event response to pre-event prevention, effectively controlling the peak intensity of children's emotions.

[0026] (2) This invention proposes to drive the generation of cue parameters by a three-element joint mapping of intervention urgency, individual perceptual sensitivity vector and current gaze coordinate, which solves the problem of fixed cue scheme adaptation failure caused by perceptual heterogeneity in children with ASD, significantly improves cue accessibility, and does not produce secondary stimulation for children with perceptual hypersensitivity.

[0027] (3) This invention constructs a two-way self-correction closed loop of emotion annotation value and emotion aftershock observation sequence: annotation correction improves the accuracy of picture book content modeling, individual attenuation coefficient calibration improves cross-page prediction accuracy, and the two work together to make the overall system accuracy monotonically increase with the number of times it is used, and have continuous self-optimization capability.

[0028] (4) This invention proposes a least squares fitting method for the emotional aftershock observation sequence to achieve continuous calibration of the individual attenuation coefficient, so that the personalized accuracy of the cumulative risk prediction model can be steadily improved as the number of sessions increases.

Attached Figure Description

[0029] To more clearly illustrate the technical solutions in the embodiments of the present invention or the prior art, the accompanying drawings used in the description of the embodiments or the prior art will be briefly introduced below.

[0030] Figure 1 is a schematic diagram of the overall architecture of the picture book reading guidance system for children with autism of the present invention.

[0031] Figure 2 is a schematic diagram of the processing flow of the emotional arc construction module of the present invention.

[0032] Figure 3 is a schematic diagram of the emotional aftershock propagation attenuation model and cumulative risk prediction engine of the present invention.

[0033] Figure 4 is a schematic diagram of the process for generating a prompt parameter package using the ternary joint mapping of the present invention.

[0034] Figure 5 is a schematic diagram of the data flow of the bidirectional self-correcting closed loop of the present invention.

Detailed Implementation

[0035] The present invention will now be described more clearly and completely by way of a preferred embodiment in conjunction with the accompanying drawings, but this does not limit the invention to the scope of the described embodiment.

[0036] As shown in Figure 1, the autism picture book reading guidance system of this invention adopts a four-layer architecture: a content perception layer (offline preprocessing), a state perception layer (real-time operation), an intervention execution layer (real-time operation), and a self-correction learning layer (asynchronous operation). The four layers exchange data through a unified personal profile data structure, PersonalProfile, and a risk timeline data structure, RiskTimeline, forming a complete functional loop. PersonalProfile includes the individual perception sensitivity vector, the individual attenuation coefficient, the intra-session emotional baseline drift rate, a page sentiment annotation bias dictionary, and a historical session count field. RiskTimeline includes the current page number, an array of cumulative emotional load values for the next 5 pages, the trigger rule enumeration value, the intervention urgency, and a description of the suggested intervention action.
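The two shared data structures can be sketched as Python dataclasses. This is a minimal illustration, assuming field names, types, and defaults that the source does not specify; only the list of fields and the initial attenuation coefficient of 1.2 come from the description.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class TriggerRule(Enum):
    # Rule names follow the trigger rules described in the text; the string
    # values are hypothetical.
    NONE = "none"
    PREVENTIVE_WARNING = "preventive_warning"
    IMMEDIATE_ADJUSTMENT = "immediate_adjustment"
    ENHANCED_COMBINED = "enhanced_combined"
    SIMPLIFIED_PLOT = "simplified_plot"


@dataclass
class PersonalProfile:
    color_sensitivity: float             # color sensitivity component
    motion_sensitivity: float            # motion sensitivity component
    attention_range_px: float            # spatial attention range, in pixels
    attenuation_coefficient: float = 1.2  # individual attenuation coefficient p
    baseline_drift_rate: float = 0.0     # intra-session baseline drift rate
    annotation_bias: Dict[int, float] = field(default_factory=dict)  # page -> bias
    session_count: int = 0               # historical session count


@dataclass
class RiskTimeline:
    current_page: int
    cumulative_loads: List[float]        # loads for the next 5 pages
    trigger_rule: TriggerRule = TriggerRule.NONE
    urgency: float = 0.0                 # intervention urgency in [0, 1]
    suggested_action: str = ""
```

A profile for the child in the worked example would be `PersonalProfile(color_sensitivity=0.8, motion_sensitivity=0.7, attention_range_px=120.0)`.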

[0037] As shown in Figure 2, the emotional arc construction module is executed when the picture book is first imported, and includes four sub-steps: image channel processing, text channel processing, image-text fusion verification, and arc generation.

[0038] Image channel processing unit: An EfficientNet-B4 backbone network, fine-tuned on a children's picture book corpus, extracts multi-scale semantic features from the RGB data of each page of the picture book and outputs three types of sentiment indicators: a color sentiment tensor (the proportion of warm-toned areas, the mean image saturation, and the brightness variance), scene sentiment density (the product of the number of characters, the facial action unit intensity, and the spatial crowding), and threatening element density (the dark area ratio and the mean sharp-contour gradient). These three indicators are linearly mapped to the image sentiment valence, which ranges from -1 (negative emotion) to +1 (positive emotion), and to the image arousal level, which ranges from 0 to 1.

[0039] Text processing unit: Optical character recognition is performed on the text of each page of the picture book, and the recognized text is fed into a Chinese sentiment intensity regression model that is based on the pre-trained RoBERTa-wwm-ext and fine-tuned on a children's picture book corpus and ASD intervention scripts. The model outputs the sentiment intensity of the text, which ranges from -1 to +1.

[0040] Image-text fusion verification unit: For children with ASD, images are the primary information acquisition channel, so this invention assigns a higher weight to the image channel. Specifically, the unit calculates the absolute value of the difference between the image sentiment valence $V_{\text{img}}$ (range [-1, 1]) and the text sentiment intensity $I_{\text{txt}}$ (range [-1, 1]), $\Delta = |V_{\text{img}} - I_{\text{txt}}|$, and applies one of three branches. When $\Delta$ does not exceed 0.3, the page sentiment value is calculated with a weight of 0.75 for the image and 0.25 for the text: $E = 0.75\,V_{\text{img}} + 0.25\,I_{\text{txt}}$. When $\Delta$ exceeds 0.3 but does not exceed 0.6, the page sentiment value is assigned by a second expression (not legible in this text) and a minor conflict tag is set for the page. When $\Delta$ exceeds 0.6, the page sentiment value is assigned by a third expression (likewise not legible) and an ambiguity warning label is set for the page, prompting therapists to review it manually. This ambiguity warning label gives the picture book content modeling a self-checking capability for consistency between text and images. The output of the three-branch fusion strategy is the page sentiment value sequence.
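The three-branch fusion can be sketched as follows. Only the first branch (0.75/0.25 weighting when the gap is at most 0.3) and the two tags are taken from the description. The values returned in the second and third branches are not recoverable from the text, so this sketch falls back to the image valence in both, which is purely an assumption motivated by the stated priority of the image channel. The function name and tag strings are hypothetical.

```python
from typing import Tuple


def fuse_page_sentiment(image_valence: float, text_intensity: float) -> Tuple[float, str]:
    """Return (page sentiment value, consistency tag) for one page."""
    gap = abs(image_valence - text_intensity)
    if gap <= 0.3:
        # Branch 1 (from the description): weighted image/text fusion.
        return 0.75 * image_valence + 0.25 * text_intensity, "consistent"
    if gap <= 0.6:
        # Branch 2: ASSUMPTION, the source expression is lost; use image valence.
        return image_valence, "minor_conflict"
    # Branch 3: ASSUMPTION, the source expression is lost; flag for manual review.
    return image_valence, "ambiguity_warning"
```

For example, `fuse_page_sentiment(0.4, 0.2)` takes the first branch, while `fuse_page_sentiment(0.8, -0.2)` returns the `ambiguity_warning` tag so that a therapist can review the page.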

[0041] Arc generation and segmentation: The smoothing unit performs Savitzky-Golay filter smoothing on the sentiment value sequence of N pages in the book, with a window length of 5 pages and a polynomial order of 2. The first-order difference is calculated for the smoothed sequence, and the extremum detection unit detects local extrema. Extrema points satisfying an absolute sentiment difference exceeding 0.25 are selected as candidate segmentation boundaries, and constraints are imposed: a minimum segment length of 3 pages and a sentiment variance within each segment not exceeding 0.15. Finally, the segmentation boundary sequence is determined, and the sentiment arc data structure is generated.
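The smoothing and boundary selection can be sketched in plain Python. The smoothing uses the standard Savitzky-Golay weights for a window of 5 and polynomial order 2. The boundary rule is one possible reading of the description: a boundary is placed where the first-order difference is a local extremum in magnitude and exceeds 0.25, subject to the 3-page minimum segment length. The within-segment variance constraint is omitted for brevity, the first and last two pages are left unsmoothed as a simplification, and all function names are hypothetical.

```python
from typing import List, Sequence

# Savitzky-Golay weights for window length 5, polynomial order 2 (sum = 35).
SG_WEIGHTS = (-3.0, 12.0, 17.0, 12.0, -3.0)
SG_NORM = 35.0


def smooth_arc(values: Sequence[float]) -> List[float]:
    """Smooth interior pages; the two pages at each edge are left unchanged."""
    smoothed = list(values)
    for i in range(2, len(values) - 2):
        window = values[i - 2 : i + 3]
        smoothed[i] = sum(w * v for w, v in zip(SG_WEIGHTS, window)) / SG_NORM
    return smoothed


def candidate_boundaries(
    smoothed: Sequence[float], min_jump: float = 0.25, min_segment: int = 3
) -> List[int]:
    """Return 0-based page indices at which a new segment starts."""
    diffs = [b - a for a, b in zip(smoothed, smoothed[1:])]
    boundaries: List[int] = []
    last_start = 0
    for i, d in enumerate(diffs):
        left = abs(diffs[i - 1]) if i > 0 else 0.0
        right = abs(diffs[i + 1]) if i + 1 < len(diffs) else 0.0
        is_extremum = abs(d) >= left and abs(d) >= right
        start = i + 1  # the jump lies between page i and page i + 1
        long_enough = (
            start - last_start >= min_segment
            and len(smoothed) - start >= min_segment
        )
        if is_extremum and abs(d) > min_jump and long_enough:
            boundaries.append(start)
            last_start = start
    return boundaries
```

A typical call would be `candidate_boundaries(smooth_arc(page_values))`, followed by a variance check on each resulting segment.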

[0042] When a child first uses the system, a 5-minute perception probe test is performed to establish the individual perception sensitivity profile. During the test, the system presents the child with a sequence of static color blocks at different color saturation levels (low, medium, and high) and a sequence of dynamic elements at different motion rates (5, 15, and 30 frames per second), while a camera simultaneously captures the child's gaze duration, facial avoidance movements, and the amplitude of brow-furrowing movements. From the collected data, the system calculates three components: the color sensitivity component $S_c$ (the shorter the fixation time and the more frequent the avoidance actions, the higher $S_c$), the motion sensitivity component $S_m$ (the more often high-speed dynamic elements trigger brow furrowing, the higher $S_m$), and the spatial attention range component $S_r$ (the furthest distance, in pixels from the gaze center, at which the child can stably perceive peripheral cues). Together these three components constitute the individual perceptual sensitivity vector S, which is stored in PersonalProfile.

[0043] As shown in Figure 3, the cumulative risk prediction module includes an emotional cumulative risk prediction engine that executes each time the child turns a page. Its core is the emotional aftershock propagation attenuation model. The model rests on the observation that, when children with ASD read, a page whose emotional intensity is high (the absolute value of its page sentiment value exceeds the high-intensity threshold) acts as a high-intensity emotional page: its impact on the child's emotions is not limited to the current page but extends over several subsequent pages with power-law decay, an emotional transmission effect similar to aftershocks in seismology.

[0044] The cumulative sentiment load value $L(P+k)$ of the k-th page after the current page P is calculated according to the following formula (reconstructed from the symbol definitions in [0046] and the power-law decay described in [0043]):

[0045] $$L(P+k) = E(P+k) + \sum_{\substack{j < P+k \\ |E(j)| > \theta_{\text{high}}}} E(j)\,(P+k-j)^{-p}$$

[0046] Here, $E(P+k)$ is the sentiment value of page P+k; the summation covers every page j that lies before page P+k and whose sentiment value has an absolute value exceeding the high-intensity threshold $\theta_{\text{high}}$; $E(j)$ is the sentiment value of page j; and p is the individual attenuation coefficient (initially set to 1.2). The cumulative sentiment load is calculated for each of the K = 5 pages after the current page P, forming the risk timeline array, which is stored in the RiskTimeline data structure.
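A minimal sketch of the cumulative load calculation follows. It assumes that the load of a target page is its own sentiment value plus the contribution of every earlier high-intensity page, each attenuated by the page distance raised to the power -p. This functional form is inferred from the power-law description and the symbol definitions, not copied from the source formula. The default high-intensity threshold of 0.6 is the value used in the worked example, and the function names are hypothetical.

```python
from typing import List, Sequence


def cumulative_load(
    page_values: Sequence[float],
    target_page: int,
    p: float = 1.2,
    high_threshold: float = 0.6,
) -> float:
    """Load of a 1-based target page; page_values[0] is page 1."""
    load = page_values[target_page - 1]
    for j in range(1, target_page):
        source_value = page_values[j - 1]
        if abs(source_value) > high_threshold:
            # ASSUMPTION: power-law attenuation over the page distance.
            load += source_value * (target_page - j) ** (-p)
    return load


def risk_timeline(
    page_values: Sequence[float],
    current_page: int,
    k: int = 5,
    p: float = 1.2,
    high_threshold: float = 0.6,
) -> List[float]:
    """Loads of the next k pages (fewer near the end of the book)."""
    last_page = min(current_page + k, len(page_values))
    return [
        cumulative_load(page_values, page, p, high_threshold)
        for page in range(current_page + 1, last_page + 1)
    ]
```

Under this form, a page with no high-intensity predecessor has a load equal to its own sentiment value, which matches the worked example in which page 12 carries a load of -0.78.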

[0047] As shown in Figure 4, the emotion state monitoring module acquires the child's emotional signals through a dual-channel acquisition system that uses both a webcam and a depth camera. In the facial channel, a lightweight face alignment network locates the child's facial region in the webcam frame stream and extracts a 64-dimensional temporal difference vector of facial micro-action units. This vector primarily encodes the rate of glabellar contraction, the frequency of tremors at the corners of the mouth, and the amplitude of orbicularis oculi muscle contraction, in order to capture the subtle facial expressions characteristic of children with ASD. In the skeleton channel, the depth camera combined with a skeleton estimation algorithm extracts the coordinates of 17 joints in the upper body and calculates the trunk tilt angle, scapular adduction, and head rotation angular velocity. When the detection confidence of the facial image falls below a confidence threshold, the fusion weight of the skeleton channel is increased to 0.8 to compensate for signal degradation when the face is occluded. The two feature vectors are input into a temporally aligned bidirectional long short-term memory network, which outputs an emotional state probability distribution covering six states (calm, focused, confused, anxious, joyful, and overloaded) that serves as the real-time emotion observation.

[0048] For emotion baseline tracking, a Kalman filter performs rolling estimation of the emotion baseline. The state variable of the filter is the current emotion baseline estimate. The prediction equation assumes that the baseline changes as a slow random walk, and the update equation incorporates the latest emotion observation, with the gain adjusted dynamically by the baseline change rate and the observation noise covariance. The difference between the real-time emotion observation $E_{\text{obs}}(t)$ and the baseline estimate baseline(t) is used as the emotional deviation value δ(t), which removes the interference of intra-session baseline drift from the intervention trigger decision. The calculation formula is as follows:

[0049] $$\delta(t) = E_{\text{obs}}(t) - \text{baseline}(t)$$

[0050] Where δ(t) is the emotional deviation value at time t; $E_{\text{obs}}(t)$ is the real-time emotion observation at time t; and baseline(t) is the Kalman filter's rolling estimate of the emotion baseline at time t. Using the relative deviation δ(t) rather than the absolute emotion value as the basis for intervention triggering automatically removes the effect of overall drift in the child's emotion baseline caused by fatigue, adaptation, and similar factors.
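The baseline tracking can be illustrated with a scalar Kalman filter under a random-walk state model. This is a sketch under stated assumptions: the observation is treated as a single scalar emotion score, the noise parameters are hypothetical values chosen for illustration, and the deviation is measured against the baseline after the current observation has been absorbed. The class and method names are not from the source.

```python
from dataclasses import dataclass


@dataclass
class BaselineTracker:
    baseline: float = 0.0            # current baseline estimate
    variance: float = 1.0            # uncertainty of the estimate
    process_noise: float = 1e-4      # hypothetical random-walk noise (slow drift)
    observation_noise: float = 0.05  # hypothetical observation noise covariance

    def step(self, observation: float) -> float:
        """Absorb one observation and return the deviation delta(t)."""
        # Predict: a random walk keeps the mean and inflates the variance.
        prior_variance = self.variance + self.process_noise
        # Update: the gain balances prior uncertainty against observation noise.
        gain = prior_variance / (prior_variance + self.observation_noise)
        self.baseline += gain * (observation - self.baseline)
        self.variance = (1.0 - gain) * prior_variance
        # Deviation of the observation from the updated baseline.
        return observation - self.baseline
```

With a small process noise the baseline follows slow drift but barely moves on a sudden spike, so the spike shows up almost entirely in the returned deviation.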

[0051] As shown in Figure 4, the trigger decision module evaluates the following trigger rules based on the emotional deviation value δ(t) and the risk timeline:

[0052] Preventive warning trigger rule: If the maximum cumulative emotional load over the first to third pages after the current page in the risk timeline exceeds the risk threshold, and the emotional deviation value δ(t) is below the stability threshold, the child is currently emotionally stable but is about to face high-risk emotional content, and the system triggers a transitional warning in advance at the end of the current page. This rule is the core manifestation of the preventive intervention capability of this invention.

[0053] Immediate adjustment trigger rule: If the emotional deviation value δ(t) exceeds the deviation threshold, the child's current emotions have fluctuated significantly, and the system immediately triggers an emotion regulation prompt.

[0054] Enhanced combined trigger rule: If the conditions of both the preventive warning trigger rule and the immediate adjustment trigger rule are met at the same time, the system first executes the prompt corresponding to the preventive warning and, after a 30-second interval, executes the prompt corresponding to the immediate adjustment. The two-stage prompts keep a unified color theme.

[0055] Simplified plot trigger rule: If the cumulative emotional load value of the first page after the current page in the risk timeline exceeds the extremely high risk threshold, and the emotional deviation value δ(t) exceeds the mild anxiety threshold, the system overlays a semi-transparent white mask over the non-core image area of the next page to reduce the visual impact of threatening visual elements.
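The four trigger rules can be sketched as a single decision function. The risk threshold (0.6) and stability threshold (0.25) are the values from the worked example; the deviation, extremely high risk, and mild anxiety thresholds are not given in the text, so the defaults below are hypothetical placeholders. Comparing magnitudes rather than signed values is also an assumption, made because the worked example treats a load of -0.78 as exceeding 0.6.

```python
from typing import List, Sequence


def matched_rules(
    loads: Sequence[float],
    deviation: float,
    *,
    risk_threshold: float = 0.6,
    stability_threshold: float = 0.25,
    deviation_threshold: float = 0.4,      # hypothetical value
    extreme_risk_threshold: float = 0.9,   # hypothetical value
    mild_anxiety_threshold: float = 0.15,  # hypothetical value
) -> List[str]:
    """Return the names of the trigger rules that fire for one page turn."""
    magnitudes = [abs(v) for v in loads]
    dev = abs(deviation)
    near = magnitudes[:3]  # first to third pages after the current page
    preventive = bool(near) and max(near) > risk_threshold and dev < stability_threshold
    immediate = dev > deviation_threshold

    rules: List[str] = []
    if preventive and immediate:
        rules.append("enhanced_combined")
    elif preventive:
        rules.append("preventive_warning")
    elif immediate:
        rules.append("immediate_adjustment")
    if magnitudes and magnitudes[0] > extreme_risk_threshold and dev > mild_anxiety_threshold:
        rules.append("simplified_plot")
    return rules
```

Note that with a deviation threshold at or above the stability threshold, as in these placeholder defaults, the preventive and immediate conditions cannot hold together, so the combined rule only becomes reachable if the deviation threshold is set below the stability threshold.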

[0056] For prompt parameter generation, the prompt execution module employs a ternary joint mapping mechanism. The ternary mapping engine takes the intervention urgency, the individual perceptual sensitivity vector S, and the current gaze coordinates as inputs: the trigger rule enumeration value determines the prompt semantic template, the intervention urgency determines the prompt intensity baseline value, and the individual perceptual sensitivity vector S applies constraints in three dimensions. Specifically, when the color sensitivity component $S_c$ exceeds the color sensitivity threshold, the prompt color parameters are forcibly converted to a grayscale gradient, forming a color constraint; when the motion sensitivity component $S_m$ exceeds the motion sensitivity threshold, the frame rate of the prompt animation is limited to less than 5 frames per second, forming a motion constraint; when the spatial attention range component $S_r$ falls below the attention range threshold, the prompt location is constrained to within 100 pixels of the current gaze coordinates, forming a positional constraint. When multiple constraints are triggered simultaneously, the constraint intersection calculation unit takes the intersection of all constraints, prioritizing safety, and outputs the final prompt parameter package. The current gaze coordinates are obtained in real time by a camera-calibrated gaze estimation method, so no additional hardware eye tracker is required. For the immediate adjustment trigger rule, the offset of the prompt location relative to the gaze center (in pixels) is calculated from the intervention urgency using the following formula:

[0057] (the formula is not reproduced in this text)

[0058] Here, urgency denotes the intervention urgency; it ranges over [0, 1] and is calculated by combining the emotional deviation value with the maximum load value of the risk timeline. The higher the urgency, the closer the cue is placed to the gaze center, ensuring that it is accessible when intervention is urgent; the lower the urgency, the further toward the periphery of the gaze the cue appears, providing gentle guidance. When the spatial attention range component $S_r$ is below the attention range threshold, the offset calculated above is capped at 100 pixels, that is, the smaller of the two values is used.
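The constraint logic can be sketched as follows. The thresholds (0.6 for color, 0.5 for motion, 100 pixels for attention range) are the values from the worked example. The offset formula itself is not recoverable from the text, so a linear decrease with urgency up to a hypothetical maximum offset is assumed here; the maximum offset, the default frame rate, and all names are illustrative only.

```python
from dataclasses import dataclass


@dataclass
class PromptParams:
    grayscale: bool    # color constraint active
    max_fps: float     # upper bound on animation frame rate
    offset_px: float   # distance of the prompt from the gaze center


def build_prompt_params(
    urgency: float,
    color_sensitivity: float,
    motion_sensitivity: float,
    attention_range_px: float,
    *,
    color_threshold: float = 0.6,
    motion_threshold: float = 0.5,
    attention_threshold_px: float = 100.0,
    max_offset_px: float = 300.0,  # hypothetical value
    default_fps: float = 30.0,     # hypothetical value
) -> PromptParams:
    urgency = min(1.0, max(0.0, urgency))
    # ASSUMPTION: linear mapping, higher urgency places the cue nearer the gaze.
    offset = max_offset_px * (1.0 - urgency)
    if attention_range_px < attention_threshold_px:
        # Positional constraint: keep the cue within 100 px of the gaze center.
        offset = min(offset, 100.0)
    grayscale = color_sensitivity > color_threshold
    # The text asks for a rate below 5 fps; 5.0 is used here as the upper bound.
    max_fps = 5.0 if motion_sensitivity > motion_threshold else default_fps
    return PromptParams(grayscale=grayscale, max_fps=max_fps, offset_px=offset)
```

Because each constraint only tightens one parameter, applying them all in sequence yields the intersection of the active constraints.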

[0059] As shown in Figure 5, the self-correction learning layer is executed asynchronously after each reading session. The self-correction module contains two data streams, handled by a sentiment annotation correction submodule and an individual attenuation coefficient calibration submodule.

[0060] Sentiment annotation correction data stream (sentiment annotation correction submodule): For each page i, the measured emotion observations accumulated across multiple reading sessions are compared with the page sentiment value $E_i$ to obtain the difference values Δ(i). When the page has accumulated 5 or more reading sessions, the absolute value of the mean difference exceeds 0.2, and the standard deviation of the differences is less than 0.3, the page is judged, after cross-session aggregation and a significance test, to have a systematic annotation bias, and its sentiment annotation value is updated according to $E_i^{\text{new}} = E_i + 0.3\,\bar{\Delta}(i)$, where $E_i$ is the original page sentiment value and $\bar{\Delta}(i)$ is the mean of the differences; the update step size of 0.3 suppresses the influence of isolated outliers. When the update magnitude of the sentiment annotation values of multiple pages within a segment exceeds 0.15, segment boundary detection is re-executed, and the updated results are written back to the sentiment arc database to keep the sentiment arc data structure consistent.
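The annotation correction rule can be sketched as a small function. The thresholds (at least 5 sessions, mean difference magnitude above 0.2, standard deviation below 0.3) and the step size of 0.3 come from the description. The use of the sample standard deviation and the omission of the separate significance test are simplifying assumptions, and the function name is hypothetical.

```python
from statistics import mean, stdev
from typing import Sequence


def corrected_label(
    page_value: float, observations: Sequence[float], step: float = 0.3
) -> float:
    """Return the updated annotation, or the original value if no bias is found."""
    if len(observations) < 5:
        return page_value  # not enough sessions accumulated for this page
    diffs = [obs - page_value for obs in observations]
    mean_diff = mean(diffs)
    # ASSUMPTION: sample standard deviation; the source does not say which.
    if abs(mean_diff) > 0.2 and stdev(diffs) < 0.3:
        return page_value + step * mean_diff
    return page_value
```

For a page annotated at 0.5 whose five observed values cluster around 0.15, the label moves about a third of the way toward the observations rather than jumping to them.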

[0061] Individual attenuation coefficient calibration data stream (individual attenuation coefficient calibration submodule): The emotional aftershock observation sequence is extracted as follows: whenever the absolute value of the sentiment value of a page in the picture book exceeds the high-intensity threshold, the measured emotion observations from the next five pages form the emotional aftershock observation sequence. This sequence is fitted by the least squares method with a decay function $A(d)$ to solve for the optimal individual attenuation coefficient. The original formula images are not legible; consistent with the power-law model in [0043], the fit can be written as $\hat{p} = \arg\min_p \sum_{d=1}^{5} \left(A(d) - y_d\right)^2$ with $A(d) = d^{-p}$, where d is the page-turning distance from the aftershock source page to the target page (d ≥ 1), $A(d)$ is the aftershock intensity at distance d from the source, which decreases monotonically as d increases, and $y_d$ is the measured emotional aftershock observation on the d-th page after the source page. The individual attenuation coefficient is then updated with an exponential moving average, $p_{\text{new}} = 0.7\,p_{\text{old}} + 0.3\,\hat{p}$; the weights 0.7 and 0.3 smooth out the random error of a single estimate and prevent noise in a single dataset from causing drastic fluctuations in the coefficient. The calibrated individual attenuation coefficient is written back to the PersonalProfile data structure and used in the cumulative sentiment load calculation of subsequent reading sessions, so that the prediction accuracy increases monotonically with the number of sessions. The sentiment arc database is generated offline by the content perception layer and is loaded by the state perception layer during real-time reading.
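The calibration step can be sketched as a one-dimensional least squares fit followed by the 0.7/0.3 moving average. The decay function is assumed to be d to the power -p with the observations already normalized to the source page's intensity; this form is an inference from the power-law description, not the source formula. A simple grid search stands in for whatever solver the original uses, and the search range and function names are hypothetical.

```python
from typing import Sequence


def fit_attenuation(
    observed: Sequence[float],
    p_min: float = 0.1,
    p_max: float = 3.0,
    step: float = 0.01,
) -> float:
    """Grid-search the p that minimizes the squared error of d**(-p) vs. observed."""
    best_p, best_error = p_min, float("inf")
    n_steps = int(round((p_max - p_min) / step))
    for i in range(n_steps + 1):
        p = p_min + i * step
        # observed[0] is the page at distance d = 1 from the source page.
        error = sum((d ** (-p) - y) ** 2 for d, y in enumerate(observed, start=1))
        if error < best_error:
            best_p, best_error = p, error
    return best_p


def update_attenuation(p_old: float, p_fit: float, keep: float = 0.7) -> float:
    """Exponential moving average with weights 0.7 (old) and 0.3 (new fit)."""
    return keep * p_old + (1.0 - keep) * p_fit
```

Starting from the initial coefficient 1.2, a session whose fit returns 1.5 moves the stored value to about 1.29, so a single noisy session cannot swing the profile sharply.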

[0062] The complete workflow of this invention is explained below, taking as an example a 6-year-old boy with ASD (perceptual sensitivity test results: color sensitivity component equal to 0.8, motion sensitivity component equal to 0.7, spatial attention range component equal to 120 pixels) reading the children's picture book "City Little Fox".

[0063] Offline preprocessing stage: "City Little Fox" has 28 pages. After processing by the emotional arc construction module, 3 segmentation boundaries were detected, dividing the book into 4 segments: pages 1 to 7 (calm introduction, mean page sentiment value 0.35); pages 8 to 13 (conflict development, mean -0.42, with page 12 at -0.78 being the lowest emotional point of the entire book); pages 14 to 21 (turning point and resolution, mean 0.15); and pages 22 to 28 (warm ending, mean 0.65). With the high-intensity threshold set to 0.6, page 12 is marked as a high-intensity emotion page.

[0064] Real-time reading phase: when the child turns to page 11, the cumulative risk prediction module calculates the risk timeline. The cumulative emotional load value of page 12 is -0.78, which in absolute value exceeds the risk threshold of 0.6, and the child's emotional deviation value δ(t) is 0.05, which is below the stability threshold of 0.25, so the preventive warning trigger rule is triggered. In the prompt parameter generation phase, because the color sensitivity component of 0.8 exceeds the color sensitivity threshold of 0.6, the color parameter is converted to a grayscale gradient; because the motion sensitivity component of 0.7 exceeds the motion sensitivity threshold of 0.5, the frame rate of motion effects is limited to below 5 frames per second; because the spatial attention range component of 120 pixels exceeds the attention range threshold of 100 pixels, the prompt is distributed evenly around the perimeter of the page. At the end of page 11, the system displays a static grayscale-gradient halo at the page edge for 5 seconds, warning the child in advance that a plot change is imminent.
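The decisions in this example can be traced with a small sketch. It uses the thresholds quoted in the paragraph; the type, function, and label names are hypothetical, the load is compared by absolute value (an assumption, since -0.78 is said to exceed 0.6), and the default warm palette is taken from the halo described in claim 5.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

# Thresholds quoted in paragraph [0064]; names are illustrative only.
RISK_THRESHOLD = 0.6
STABILITY_THRESHOLD = 0.25
COLOR_THRESHOLD = 0.6
MOTION_THRESHOLD = 0.5
ATTENTION_THRESHOLD_PX = 100.0


@dataclass
class SensitivityProfile:
    color: float
    motion: float
    attention_range_px: float


def is_preventive_warning(upcoming_loads: Sequence[float], deviation: float) -> bool:
    """True when a high cumulative load is predicted within the next 1-3 pages
    while the child's current deviation is still below the stability threshold."""
    # Assumption: loads are compared by magnitude.
    peak = max((abs(v) for v in upcoming_loads[:3]), default=0.0)
    return peak > RISK_THRESHOLD and abs(deviation) < STABILITY_THRESHOLD


def constrain_prompt(profile: SensitivityProfile) -> dict:
    """Apply the perception sensitivity constraints to the prompt parameters."""
    fps_upper_bound: Optional[float] = 5.0 if profile.motion > MOTION_THRESHOLD else None
    return {
        "palette": "grayscale_gradient" if profile.color > COLOR_THRESHOLD else "warm_gradient",
        # Exclusive upper bound on animation frame rate; None means unrestricted.
        "fps_upper_bound": fps_upper_bound,
        "placement": (
            "within_100px_of_gaze"
            if profile.attention_range_px < ATTENTION_THRESHOLD_PX
            else "page_perimeter"
        ),
    }


child = SensitivityProfile(color=0.8, motion=0.7, attention_range_px=120.0)
warn = is_preventive_warning([-0.78], deviation=0.05)
params = constrain_prompt(child)
```

For the child in this example, the sketch yields a grayscale palette, a frame-rate bound of 5, and perimeter placement, matching the static grayscale halo described above.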

[0065] After the session ends, the self-calibration module performs least-squares fitting on the emotional aftershock observation sequence following page 12 and obtains an optimal attenuation coefficient of 1.45. Applying the exponential moving average update, the individual attenuation coefficient is updated to 1.275. After 5 reading sessions, the individual attenuation coefficient stabilized at around 1.38, significantly improving the accuracy of emotion accumulation risk prediction.
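The arithmetic of this update can be checked under the assumption that the moving average places the weight 0.7 on the previous coefficient and 0.3 on the newly fitted one. The symbols below are illustrative, and the implied pre-session coefficient of 1.2 is an inference from the stated numbers, not a value given in the text.

```latex
\lambda_{\text{new}} = 0.7\,\lambda_{\text{old}} + 0.3\,\lambda^{*}
\quad\Longrightarrow\quad
1.275 = 0.7\,\lambda_{\text{old}} + 0.3 \times 1.45
\quad\Longrightarrow\quad
\lambda_{\text{old}} = \frac{1.275 - 0.435}{0.7} = 1.2
```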

[0066] The above description is only a preferred embodiment of the present invention. It should be noted that for those skilled in the art, several improvements and modifications can be made without departing from the principle of the present invention, and these improvements and modifications should also be considered within the scope of protection of the present invention.

Claims

1. A method for guiding picture book reading for autistic children based on emotion segmentation and visual silent cues, characterized in that it comprises the following steps:
S1. Extract emotional feature vectors from the images on each page of the picture book, perform emotional intensity regression on the text of each page, apply a fusion strategy based on the difference between the image emotional value and the text emotional value to obtain the page emotional value of each page, perform smoothing and extreme value detection on the page emotional value sequence of all pages, determine the segment boundaries, and generate an emotional arc data structure;
S2. Present a sensory probe stimulation sequence to the child, collect the child's gaze duration, avoidance behavior and facial muscle response data, and calculate and store the individual perception sensitivity vector, which includes a color sensitivity component, a motion sensitivity component and a spatial attention range component;
S3. When the child turns the page, load the page emotional value sequence of the current page and the next K pages, calculate the cumulative emotional load value of each subsequent page based on the individual attenuation coefficient and the page emotional value of each page, and generate a risk timeline, the cumulative emotional load value being calculated with the emotional aftershock propagation attenuation model;
S4. Collect real-time emotional observations of the child, calculate the emotional baseline estimate by Kalman filtering, and take the difference between the real-time emotional observation and the emotional baseline estimate as the emotional deviation value;
S5. Determine the trigger rule based on the emotional deviation value and the risk timeline, and determine the intervention urgency based on the trigger rule;
S6. Generate a prompt parameter package based on the ternary joint mapping of the intervention urgency, the individual perception sensitivity vector and the current gaze coordinates, and render and display the visual silent prompt;
S7. After the reading session ends, calculate the difference between the mean of the measured emotional observations of each page and the page emotional value, update the emotion label value based on the difference, and update the individual attenuation coefficient based on the emotional aftershock observation sequence of multiple consecutive pages following a high-intensity emotion page.

2. The method for guiding picture book reading for autistic children based on emotion segmentation and visual silent cues as described in claim 1, characterized in that the emotional aftershock propagation attenuation model calculates the cumulative emotional load value of page P+k from the page emotional value of page P+k, the peak emotional value of each page j, and the individual attenuation coefficient, where P is the current page number and k is the page-turn distance after the current page; the summation runs over all pages j whose page emotional value exceeds the high-intensity emotional threshold.

3. The method for guiding picture book reading for autistic children based on emotion segmentation and visual silent cues as described in claim 2, characterized in that the specific steps for fitting and updating the individual attenuation coefficient are as follows: extract the sequence of measured emotional observations from the five consecutive pages following a page of the picture book whose emotional value exceeds the high-intensity threshold, and use this sequence as the emotional aftershock observation sequence; fit the attenuation function by the least squares method to determine the optimal individual attenuation coefficient $\lambda^{*}$, where the value of the attenuation function is the aftershock intensity at page-turn distance d from the aftershock source page and depends on the emotional value of that source page; update the individual attenuation coefficient according to $\lambda_{\text{new}} = 0.7\,\lambda_{\text{old}} + 0.3\,\lambda^{*}$, where $\lambda_{\text{old}}$ is the individual attenuation coefficient before the update.

4. The method for guiding picture book reading for autistic children based on emotion segmentation and visual silent cues as described in claim 1, characterized in that the specific steps for determining the trigger rule based on the emotional deviation value and the risk timeline are as follows: if the maximum cumulative emotional load value over the 1st to 3rd pages after the current page in the risk timeline exceeds the risk threshold, and the emotional deviation value is below the stability threshold, the trigger rule is determined to be the preventive warning trigger rule; if the emotional deviation value exceeds the deviation threshold, the trigger rule is determined to be the immediate adjustment trigger rule; if the conditions of both the preventive warning trigger rule and the immediate adjustment trigger rule are met simultaneously, the trigger rule is determined to be the enhanced combined trigger rule; if the cumulative emotional load value of the first page after the current page in the risk timeline exceeds the extremely high risk threshold, and the emotional deviation value exceeds the mild anxiety threshold, the trigger rule is determined to be the plot simplification trigger rule.

5. The method for guiding picture book reading for autistic children based on emotion segmentation and visual silent cues as described in claim 4, characterized in that the specific steps for generating the prompt parameter package based on the trigger rule are as follows: when the trigger rule is the preventive warning trigger rule, a warm-colored gradient halo prompt is generated at the edge of the page, the halo spreads for 5 seconds, and the prompt is evenly distributed around the perimeter of the page; when the trigger rule is the immediate adjustment trigger rule, a low-saturation blue-green breathing guide graphic is generated at a position whose offset from the current gaze coordinates is the product of the intervention urgency and 50 pixels; when the trigger rule is the enhanced combined trigger rule, the prompt corresponding to the preventive warning trigger rule is executed first, and after a 30-second interval the prompt corresponding to the immediate adjustment trigger rule is executed; when the trigger rule is the plot simplification trigger rule, a semi-transparent white overlay is applied to the non-core image area of the picture book page.

6. The method for guiding picture book reading for autistic children based on emotion segmentation and visual silent cues as described in claim 1, characterized in that, when the prompt parameter package is generated, perception sensitivity constraints are applied to the prompt parameters: when the color sensitivity component exceeds the color sensitivity threshold, the prompt color parameters are converted to a grayscale gradient; when the motion sensitivity component exceeds the motion sensitivity threshold, the frame rate of the prompt animation is limited to less than 5 frames per second; when the spatial attention range component is below the attention range threshold, the prompt position is constrained to within 100 pixels of the current gaze coordinates; when multiple perception sensitivity constraints are triggered simultaneously, the intersection of all constraints is applied.

7. The method for guiding picture book reading for autistic children based on emotion segmentation and visual silent cues as described in claim 1, characterized in that the specific steps for applying the fusion strategy based on the difference between the image sentiment value and the text sentiment value are as follows: calculate the absolute value of the difference between the image sentiment value and the text sentiment value; when the absolute value of the difference does not exceed 0.3, take the weighted sum of the image sentiment value and the text sentiment value, with an image weight of 0.75 and a text weight of 0.25, as the page sentiment value; when the absolute value of the difference exceeds 0.3 but does not exceed 0.6, use the image sentiment value as the page sentiment value and set a mild conflict label for the page; when the absolute value of the difference exceeds 0.6, use the image sentiment value as the page sentiment value and set an ambiguity warning label for the page.

8. The method for guiding picture book reading for autistic children based on emotion segmentation and visual silent cues as described in claim 1, characterized in that the specific steps for updating the sentiment label are as follows: for each page, the difference between the measured sentiment observation and the page sentiment value is accumulated across multiple reading sessions; when a page has accumulated 5 or more reading sessions, the absolute value of the mean difference exceeds 0.2, and the standard deviation of the difference is less than 0.3, the sentiment label value of that page is updated according to $E'(i) = E(i) + 0.3\,\bar{\Delta}(i)$, where $E(i)$ is the original page sentiment value and $\bar{\Delta}(i)$ is the mean of the differences; if the sentiment label values of multiple pages within a segment are updated by more than 0.15, segment boundary detection is re-executed.

9. The method for guiding picture book reading for autistic children based on emotion segmentation and visual silent cues as described in claim 1, characterized in that the specific steps for collecting real-time emotional observations of the child are as follows: a sequence of facial images of the child is captured by a camera, and the temporal difference vector of facial micro-motion units is extracted; a sequence of upper-body skeleton joint points of the child is captured by a depth camera, and the forward tilt angle of the trunk, the degree of scapular adduction and the angular velocity of head rotation are extracted; when the detection confidence of the facial image is lower than the confidence threshold, the fusion weight of the skeleton channel is increased to 0.8; the temporal difference vector of facial micro-motion units and the skeleton feature vector are input into a bidirectional long short-term memory network, and the output emotional state probability distribution is used as the real-time emotional observation.

10. A picture book reading guidance system for autistic children based on emotion segmentation and visual silent cues, characterized in that the system implements the method for guiding picture book reading for autistic children based on emotion segmentation and visual silent cues as described in any one of claims 1-9, and comprises:
an emotional arc construction module, used to extract emotional feature vectors from the images on each page of the picture book, perform emotional intensity regression on the text of each page, apply a fusion strategy based on the difference between the image emotional value and the text emotional value to obtain the page emotional value of each page, perform smoothing and extreme value detection on the page emotional value sequence of all pages to determine the segment boundaries, and generate the emotional arc data structure;
an individual profile building module, used to present a sensory probe stimulation sequence to the child, collect the child's gaze duration, avoidance behavior and facial muscle response data, and calculate and store the individual perception sensitivity vector, which includes a color sensitivity component, a motion sensitivity component and a spatial attention range component;
a cumulative risk prediction module, used to load the page emotional value sequence of the current page and the next K pages when the child turns the page, calculate the cumulative emotional load value of each subsequent page based on the individual attenuation coefficient and the emotional aftershock propagation attenuation model, and generate a risk timeline;
an emotional state monitoring module, used to collect real-time emotional observations of the child, calculate the emotional baseline estimate by Kalman filtering, and take the difference between the real-time emotional observation and the emotional baseline estimate as the emotional deviation value;
a trigger decision module, used to determine the trigger rule based on the emotional deviation value and the risk timeline, and to calculate the intervention urgency based on the trigger rule;
a prompt execution module, used to generate a prompt parameter package based on the ternary joint mapping of the intervention urgency, the individual perception sensitivity vector and the current gaze coordinates, and to render and display the visual silent prompt;
a self-calibration module, used, after the reading session ends, to update the sentiment label value based on the difference between the mean of the measured sentiment observations and the page sentiment value, and to update the individual attenuation coefficient based on the emotional aftershock observation sequence.
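
As an illustration of the image-text fusion strategy recited in claim 7, the three cases can be sketched as follows. The thresholds and weights are those stated in the claim; the function name and label strings are hypothetical.

```python
def fuse_page_sentiment(image_value, text_value):
    """Fuse image and text sentiment into a page sentiment value.

    Returns (page_value, label), where label is None when the two
    channels agree closely enough to be averaged.
    """
    gap = abs(image_value - text_value)
    if gap <= 0.3:
        # Channels agree: weighted sum, image 0.75 / text 0.25.
        return 0.75 * image_value + 0.25 * text_value, None
    if gap <= 0.6:
        # Moderate disagreement: trust the image and flag the page.
        return image_value, "mild_conflict"
    # Strong disagreement: trust the image and raise an ambiguity warning.
    return image_value, "ambiguity_warning"


# Hypothetical inputs covering the three cases.
agree = fuse_page_sentiment(0.4, 0.2)
conflict = fuse_page_sentiment(0.5, 0.0)
ambiguous = fuse_page_sentiment(-0.8, 0.1)
```

In every case where the channels disagree by more than 0.3, the image value alone is used, so the labels serve only to mark pages for later review.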