An artificial intelligence-based non-coding genetic regulatory feature-driven brain region function susceptibility evaluation method

By using a deep neural network model to perform multi-scale mapping of individual non-coding genetic regulatory features, the shortcomings of existing technologies in individualized neural regulation are addressed, and quantitative assessment of brain region functional responses is achieved, thereby improving the individualized accuracy and scientific rigor of neural regulation.

CN122245428A · Pending · Publication Date: 2026-06-19 · 恒燊中医科技(上海)有限公司

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
恒燊中医科技(上海)有限公司
Filing Date
2026-02-23
Publication Date
2026-06-19


Abstract

This invention discloses a brain region functional susceptibility assessment method driven by non-coding genetic regulatory features based on artificial intelligence, belonging to the field of bioinformatics technology. The invention first collects individual non-coding genetic and epigenetic feature data, preprocesses them to construct genetic regulatory feature vectors, then extracts the nonlinear coupling relationship between non-coding regulation and brain region function using a deep neural network model, generating a regulatory influence value matrix; finally, it combines external stimulus response transmission operators to quantitatively calculate the functional susceptibility score of each brain region. This application achieves a precise mapping from molecular-level regulatory logic to system-level brain region response potential, significantly improving the predictability of personalized and precise neural regulation.

Description

Technical Field

[0001] This invention belongs to the field of bioinformatics technology, specifically relating to a brain region functional susceptibility assessment method driven by non-coding genetic regulatory features based on artificial intelligence.

Background Technology

[0002] Against the backdrop of the current technological evolution that highly integrates neuroscience, brain-computer interfaces, and precision medicine, neuromodulation technology, as a key means of intervening in neural circuit function and improving symptoms of neuropsychiatric diseases, has moved from early theoretical exploration to comprehensive clinical engineering applications. Currently, technological systems represented by transcranial magnetic stimulation (TMS), transcranial electrical stimulation (tES), deep brain stimulation (DBS), and emerging ultrasound and optogenetic modulation all operate on the core logic of altering the transmembrane potential of neurons or regulating synaptic plasticity through the intervention of specific physical fields, thereby reshaping the functional state of specific brain regions. In long-term clinical practice, to ensure the standardization and reproducibility of the procedure, existing technologies are typically based on standard anatomical atlases, that is, selecting stimulation targets based on population-averaged anatomical coordinates. This approach, based on "common characteristics," lowers the threshold for clinical implementation to some extent and has demonstrated a certain degree of effectiveness at the population level.

[0003] However, as precision medicine demands ever greater accuracy in personalized responses, the limitations of the aforementioned empirical or standardized brain region selection strategies are becoming increasingly apparent. The underlying contradiction lies in the severe disconnect between the high similarity of macroscopic anatomical structures and the significant heterogeneity of microscopic functional responses. Fundamentally, the brain region's response to external regulatory stimuli depends not only on the distribution of physical energy deposition but also on the complex molecular and genetic regulatory landscape within that region. Modern systems biology research has clearly indicated that approximately 98% of the human genome consists of non-coding regions that do not directly encode proteins. These sequences, through the construction of complex enhancers, silencers, non-coding RNA networks, and chromatin three-dimensional conformations, rigorously regulate the excitability threshold and synaptic transmission efficiency of neurons at a spatiotemporally specific level.

[0004] Specifically, individuals exhibit extensive polymorphism and epigenetic differences in these non-coding genetic regulatory elements, such as fluctuations in DNA methylation levels or varying open states of chromatin accessibility. This microscopic heterogeneity directly determines that external physical stimuli of the same intensity may elicit drastically different or even completely opposite functional responses in different individuals, and even in different brain regions of the same individual. This leads to the frequent occurrence of "same disease, same target, same parameters" yet vastly different therapeutic effects within the current technological framework—the so-called distinction between "responders" and "non-responders." Existing assessment methods are often limited to macroscopic functional imaging (such as fMRI) or electrophysiological signals (such as EEG). While these signals reflect real-time brain activity, they fail to reveal the underlying regulatory logic that determines the brain's response potential. This lack of a forward-looking, quantitative genetic regulation assessment system forces clinicians to rely primarily on retrospective trials or empirical exploration when developing regulatory plans, significantly limiting the refined development of neuromodulation techniques.

[0005] Furthermore, a deeper, less obvious bottleneck exists in current technologies: how to transform highly fragmented, multidimensional non-coding genetic information into brain region functional susceptibility indicators that can guide engineering practice. Current bioinformatics research largely focuses on gene association analysis, lacking a quantitative modeling method that can bridge the multi-scale gap between "molecular regulation - cell phenotype - systemic circuits." This lack of cross-scale mapping renders massive amounts of non-coding genetic data an "unreadable" information asset, difficult to transform into a priori evidence for target brain region selection in neural regulation. Therefore, how to systematically reveal the differences in functional susceptibility of various brain regions to external stimuli through scientific quantitative modeling, based on a full consideration of individual non-coding genetic regulatory characteristics and epigenetic status, has become a key scientific proposition urgently needing to be deciphered in the fields of bioengineering and neuroscience. This precise quantitative assessment is not only a prerequisite for achieving individualized precise regulation but also a core pathway to improving the scientific rigor and interpretability of neural intervention technologies.

Summary of the Invention

[0006] This invention provides a brain region functional susceptibility assessment method driven by non-coding genetic regulatory features based on artificial intelligence, in order to solve the technical problems in the prior art where the selection of target brain regions for neural regulation relies too much on group experience and fails to fully consider the heterogeneity of individual non-coding genetic regulation, resulting in biased regulatory effects.

[0007] In a first aspect, the present invention provides an artificial intelligence-based method for assessing brain region functional susceptibility driven by non-coding genetic regulatory features, comprising the following steps:

(1) Individual genetic regulation multimodal data acquisition: obtain non-coding genetic regulatory feature data and epigenetic status data of the test individual through a high-throughput sequencing data interface. The non-coding genetic regulatory feature data includes information on regulatory-element-associated sites distributed in non-coding regions of the whole genome and functional annotation features of non-coding regions; the epigenetic status data includes DNA methylation features and chromatin accessibility features.

(2) Multidimensional genetic regulation feature vectorization: standardize and preprocess the raw non-coding genetic regulation feature data and epigenetic state data. The preprocessing includes dimensionality compression based on principal component analysis and numerical normalization based on range standardization. Categorical regulatory element information is encoded with a one-hot encoding operator, and methylation level and accessibility intensity are quantified with a continuous variable mapping operator. Finally, the vectors are concatenated in a unified feature space to generate a genetic regulation feature vector $V_g$ that characterizes individual differences in genetic regulation.

(3) AI-based gene regulation-brain region function association mapping: construct and train a deep neural network model, then input the genetic regulation feature vector $V_g$ into it. The model is a hybrid architecture of a multilayer perceptron and a graph convolutional network. A feature embedding layer maps the highly sparse genetic regulatory feature vector to a low-dimensional dense vector space. A topological association mapping layer, built on a pre-constructed functional connectivity map of human brain regions, implements the logical association through a pre-constructed gene-brain region association matrix $M$, where the element $M_{ij}$ denotes the functional influence weight of the $i$-th regulatory element on the $j$-th target brain region. The weight $M_{ij}$ is the absolute value of the Z-score of the mRNA expression level, in brain region $j$, of the gene containing the regulatory element, determined from the PsychENCODE database. If no direct data is available, the value is obtained by interpolation using the Hi-C interaction frequency and the expression data of the nearest-neighbor brain regions. The model then generates a regulatory influence value matrix corresponding to each target brain region.

(4) Quantitative calculation of the brain region functional susceptibility score: for each target brain region, calculate the functional susceptibility score of the brain region under external regulatory stimulation conditions from the corresponding components of the regulatory influence value matrix. The functional susceptibility score comprehensively reflects the response sensitivity and state transfer potential of the brain region, based on the underlying genetic regulatory network, when receiving external energy intervention.

(5) Evaluation result output: the evaluation results, containing the functional susceptibility scores of multiple target brain regions, are output in a structured manner. The results are presented as a susceptibility heatmap or a score list that identifies the high-susceptibility response regions under the constraints of the individual's genetic regulatory characteristics.

[0008] Preferably, the regulatory-element-associated site information in the non-coding genetic regulatory feature data described in step (1) includes expression quantitative trait loci (eQTL), protein quantitative trait loci (pQTL), and splicing quantitative trait loci (sQTL) obtained from databases such as GTEx and PsychENCODE, which are used to characterize the regulatory effect of single nucleotide polymorphisms on gene expression and translation processes. The non-coding region functional annotation features include: enhancer and silencer sequences identified using HOMER software, together with their activity scores (based on the fold enrichment of the H3K27ac ChIP-seq signal); promoter regions defined by the FANTOM5 project; and chromatin interaction frequency matrices extracted from Hi-C sequencing data via the Juicer and Fit-Hi-C workflows, which are used to identify the physical interaction frequency between non-coding enhancers and distant target gene promoters.

[0009] Preferably, the genetic regulatory feature vector $V_g$ described in step (2) is constructed as follows: an autoencoder-based unsupervised feature extraction operator is used. By minimizing reconstruction error, the autoencoder extracts regulatory representation features with high information entropy from the non-coding region site information, thereby eliminating random noise and redundant biological information. The genetic regulatory feature vector is $V_g = [f_1, f_2, \ldots, f_n]$, where $f_i$ is the biophysical property weight of regulatory element $i$, assigned based on its genomic conservation score (e.g., GERP++ score) and tissue specificity (e.g., the tissue specificity index from the ENCODE project).

[0010] Preferably, the topological association mapping layer of the deep neural network model described in step (3) treats the target brain region as a node in a graph structure through multi-layer graph convolution operations, and uses the edge weights between nodes to represent the anatomical or functional synergy between brain regions; the target brain region is divided based on standard anatomical atlases (such as AAL, Desikan-Killiany atlas), and at least covers the dorsolateral prefrontal cortex, primary motor cortex, anterior cingulate cortex and hippocampus; each element in the regulatory influence value matrix is used to characterize the degree of influence of non-coding regulatory networks on the baseline excitability, upper limit of synaptic plasticity and neurotransmitter receptor expression level of each brain region under a specific genetic background of an individual.

[0011] More preferably, the deep neural network model adopts a transformer architecture with an attention mechanism during the training phase; the attention mechanism automatically identifies a set of key sites in the genetic regulatory feature vector that have a significant impact on the function of a specific brain region and assigns differentiated weight coefficients; the loss function of the deep neural network model training is composed of a mean squared error term and a sparsity regularization term to ensure that the calculation results of the regulatory influence value conform to the biological sparse distribution characteristics.

[0012] Furthermore, the construction and training process of the deep neural network model includes the following steps:

(a) Training data preparation: integrate sample data from large brain science cohorts (such as UK Biobank and the ABCD study). Each training sample includes the individual's whole-genome sequencing and epigenetic data (used to construct the genetic regulatory feature vector $V_g$) together with the corresponding brain region functional state proxy labels for that individual. The proxy labels are derived from: 1) the low-frequency fluctuation amplitude or regional homogeneity of each brain region obtained from resting-state functional magnetic resonance imaging; 2) the effect size of specific task activation in task-state fMRI; and 3) physiological response indicators induced after stimulation of specific brain regions in historical transcranial magnetic stimulation intervention studies (such as changes in the amplitude of motor evoked potentials, MEP). The data are divided into training, validation, and test sets in a 7:1:2 ratio.

(b) Model architecture: the deep neural network model adopts a cascaded hybrid architecture. First, the genetic regulation feature vector $V_g$ is encoded and reduced in dimensionality by a three-layer multilayer perceptron with an input dimension of $n$ and hidden-layer dimensions of 512 and 256, using the ReLU activation function. The resulting dense vector is copied to each node of a pre-constructed brain region graph. The graph is built from population-average diffusion tensor imaging data from the Human Connectome Project and is represented by a symmetric adjacency matrix $A$ whose elements are the numbers of structural connection fibers between brain regions. The node features and the adjacency matrix $A$ are then input into a two-layer graph convolutional network (GCN); each GCN layer is followed by batch normalization and ReLU activation. Finally, a fully connected layer outputs the regulatory influence value of each brain region.

(c) Model training: the Adam optimizer is used with an initial learning rate of 0.001 and a batch size of 32. The loss function is defined as $\mathrm{Loss} = \mathrm{MSE}(Y_{pred}, Y_{label}) + \alpha \lVert W \rVert_1$, where MSE is the mean squared error; $Y_{pred}$ and $Y_{label}$ are the predicted regulatory influence value and the proxy label, respectively; $Y_{label}$ is obtained by standardizing and weighting the data from the three sources listed in step (a); $\lVert W \rVert_1$ is the sparsity regularization term on the model weight parameters; and $\alpha$ is the regularization coefficient, set to 0.0001. Training is stopped early when the loss on the validation set has not decreased for 10 consecutive epochs.
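The cascaded architecture in step (b) can be illustrated with a minimal forward-pass sketch. It is not the applicant's implementation: it uses only the Python standard library, random untrained weights, and tiny layer widths (8 and 4 instead of 512 and 256). Bias terms and batch normalization are omitted, and the symmetric normalization of $A$ with self-loops is an assumption borrowed from common GCN practice, since the text does not say how $A$ is normalized. The names `forward`, `params`, and `normalized_adjacency` are hypothetical.

```python
import math
import random

def matmul(a, b):
    """Plain nested-list matrix product (rows of a times columns of b)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def relu(m):
    return [[max(0.0, x) for x in row] for row in m]

def normalized_adjacency(adj):
    """Assumed normalization: D^-1/2 (A + I) D^-1/2 with self-loops."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def forward(v_g, adj, params):
    """MLP embedding -> copy to every brain-region node -> 2 GCN layers -> linear head."""
    h = [list(v_g)]                          # shape 1 x n
    for w in params["mlp"]:                  # n -> hidden1 -> hidden2
        h = relu(matmul(h, w))
    nodes = [list(h[0]) for _ in adj]        # same embedding on each node
    a_norm = normalized_adjacency(adj)
    for w in params["gcn"]:                  # propagate along structural edges
        nodes = relu(matmul(matmul(a_norm, nodes), w))
    out = matmul(nodes, params["fc"])        # one influence value per region
    return [row[0] for row in out]

rng = random.Random(0)
n_features, hidden1, hidden2 = 6, 8, 4       # stand-ins for n, 512, 256
params = {
    "mlp": [random_matrix(n_features, hidden1, rng), random_matrix(hidden1, hidden2, rng)],
    "gcn": [random_matrix(hidden2, hidden2, rng), random_matrix(hidden2, hidden2, rng)],
    "fc": random_matrix(hidden2, 1, rng),
}
# Toy symmetric fiber-count matrix for four regions (illustrative values only).
adjacency = [
    [0.0, 12.0, 3.0, 0.0],
    [12.0, 0.0, 5.0, 1.0],
    [3.0, 5.0, 0.0, 7.0],
    [0.0, 1.0, 7.0, 0.0],
]
v_g = [0.2, 0.0, 1.0, 0.5, 0.0, 0.7]
influence = forward(v_g, adjacency, params)
```

Because every node starts from the same embedding, the per-region outputs differ only through the graph propagation, which is why the choice of adjacency normalization matters in this design.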

[0013] Preferably, the formula for calculating the functional susceptibility score in step (4) is: [formula not reproduced in the source text]. In this formula, $S$ is the functional susceptibility score; the influence term is the regulatory influence value of the $j$-th regulatory element on the brain region; its corresponding weight coefficient is obtained by global average pooling of the output weights of the model's graph attention layer; $\lambda$ is the environmental noise attenuation factor (determined by optimizing the correlation with real response data on the validation set); and $D$ represents the degree of homeostasis of the brain region under its current epigenetic state. The homeostasis degree $D$ is the absolute z-score of the current overall methylation level of the brain region (e.g., measured using the Illumina EPIC array) relative to a reference value in a healthy population, and is used to modulate the reduction effect of external non-genetic factors on response potential. If tissue-specific methylation data for the subject's target brain region $j$ is available, $D_j = |Z_j|$, where $Z_j$ is the z-score of the overall methylation level of that brain region relative to the healthy control group. Otherwise, peripheral blood methylation data is used as a proxy: a pre-trained cross-tissue methylation prediction model (such as one based on elastic net regression, with blood CpG site methylation β-values as input and the predicted methylation level of brain region $j$ as output) estimates the methylation level of brain region $j$, and the absolute z-score of that estimate gives $D_j$. The training data for the prediction model comes from publicly available multi-tissue methylation databases (such as GEO accession GSE111629).
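The scoring formula itself does not survive in the text, so the following is only a sketch of one form consistent with the description: a weighted sum of regulatory influence values multiplied by an exponential decay in $D$, that is, $S = (\sum_j w_j I_j)\, e^{-\lambda D}$. The symbols $w_j$ and $I_j$, the function names, and the reference methylation mean and standard deviation are all assumptions introduced for illustration. The value $\lambda = 0.15$ is the example value given in the detailed description.

```python
import math

def homeostasis_deviation(methylation, ref_mean, ref_sd):
    """D_j = |Z_j|: absolute z-score of a region's overall methylation level
    relative to a healthy-population reference (reference values are inputs)."""
    if ref_sd <= 0:
        raise ValueError("reference standard deviation must be positive")
    return abs((methylation - ref_mean) / ref_sd)

def susceptibility_score(influences, weights, lam, d):
    """Assumed form: S = (sum_j w_j * I_j) * exp(-lam * D)."""
    if len(influences) != len(weights):
        raise ValueError("influences and weights must have the same length")
    weighted_sum = sum(w * i for w, i in zip(weights, influences))
    return weighted_sum * math.exp(-lam * d)

# Illustrative numbers only; none of these values come from the source.
d_region = homeostasis_deviation(methylation=0.62, ref_mean=0.50, ref_sd=0.06)
score = susceptibility_score(
    influences=[0.8, 0.2],   # hypothetical influence values I_j
    weights=[0.5, 0.5],      # hypothetical pooled attention weights w_j
    lam=0.15,                # example value of the attenuation factor
    d=d_region,
)
```

Under this assumed form, a region whose methylation sits exactly at the reference mean has $D = 0$ and its score equals the plain weighted sum. Larger deviations shrink the score smoothly rather than cutting it off.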

[0014] More preferably, the calculation process of the functional susceptibility score also incorporates the basal metabolic level of the brain region as a covariate; by integrating local cerebral blood flow and oxygen metabolism data obtained from functional near-infrared spectroscopy or positron emission tomography, the genetic regulatory influence value is dynamically corrected; at the same time, the score calculation also incorporates cortical thickness data based on anatomical structure, and for brain regions with larger cortical thickness, the functional susceptibility score is corrected by gain compensation based on the attenuation characteristics of stimulus energy in spatial transmission.

[0015] Preferably, step (5) also includes brain region ranking suggestions based on functional susceptibility scores; the system automatically ranks each target brain region from high to low according to its susceptibility score, and selects the optimal target brain region candidate set for the individual to undergo neuromodulation intervention in combination with a preset regulation safety threshold; the evaluation results are pushed to the control end of the neuromodulation device through a standardized API interface as prior input data for the positioning of stimulation coils, pulse frequency and intensity parameters.
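The ranking and structured output described here might look like the sketch below. The text does not define how the preset safety threshold is applied, so the per-region `risk` value and its comparison against `safety_threshold` are hypothetical stand-ins, as are the field names in the JSON payload. The payload is only serialized, not sent to any device.

```python
import json

def rank_regions(scores, risk, safety_threshold, top_k=2):
    """Sort regions by susceptibility score (high to low) and keep the top_k
    regions whose hypothetical risk value does not exceed the threshold."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    candidates = [
        region for region, _ in ranked
        if risk.get(region, float("inf")) <= safety_threshold
    ][:top_k]
    return {
        "ranking": [{"region": region, "score": value} for region, value in ranked],
        "candidates": candidates,
    }

# Illustrative scores for the four regions named in the text; values are made up.
scores = {"DLPFC": 0.82, "M1": 0.64, "ACC": 0.91, "HIP": 0.30}
risk = {"DLPFC": 0.2, "M1": 0.1, "ACC": 0.7, "HIP": 0.3}
result = rank_regions(scores, risk, safety_threshold=0.5)
payload = json.dumps(result)  # structured output that an API could carry
```

In this toy case the highest-scoring region is excluded from the candidate set because it fails the assumed safety check, which shows why ranking and safety filtering are kept as separate steps.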

[0016] Compared with the prior art, the present invention has the following advantages and beneficial effects: (1) This invention achieves a systematic and quantitative mapping of non-coding genetic regulatory information to brain region functional response characteristics. This invention breaks through the limitations of traditional neural regulation that relies solely on macroscopic anatomy or population statistics, and for the first time, it delves into the non-coding regions that occupy the majority of the genome, revealing the underlying genetic mechanisms of heterogeneity in individual brain region functional responses. By incorporating enhancers, silencers, and epigenetic states into the evaluation system, brain region selection has a rigorous molecular biological basis.

[0017] (2) This invention enhances the scientific rigor and predictability of the neural modulation strategy formulation process. By extracting complex nonlinear modulation relationships through artificial intelligence models, this invention can accurately predict the response potential of different individuals in specific brain regions when linear or empirical rules fail. This technical approach, based on "a priori assessment" rather than "posterior trial," significantly improves the response rate of neural modulation and reduces the waste of medical resources and potential side effects caused by ineffective interventions.

[0018] (3) This invention constructs an evaluation framework with high scalability and engineering applicability. The modular design and universal feature encoding method adopted in this invention enable it to be seamlessly compatible with constantly updated bioinformatics databases and neuroscience discoveries. At the same time, the susceptibility score output by the system, as a standardized digital indicator, can be directly interfaced into the control logic of existing neuromodulation devices, and has great potential for clinical translation.

[0019] (4) This invention ensures the compliance and objectivity of the technical solution. The evaluation process of this invention focuses on the susceptibility of brain regions to the biophysical response to physical stimuli, and does not involve the direct diagnosis of an individual's disease state or the assessment of consciousness content or personality traits. The evaluation is based on objectively collected molecular genetic data and validated computational models, ensuring the professionalism and objectivity of the evaluation results.

Attached Figure Description

[0020] Figure 1 is a flowchart of the artificial intelligence-based brain region functional susceptibility assessment method driven by non-coding genetic regulatory features provided by the present invention; Figure 2 is a heatmap showing the functional susceptibility of brain regions of the subject in Example 1.

Detailed Implementation

[0021] The present invention will be described in detail below with reference to specific embodiments and examples, thereby making the advantages and various effects of the present invention more clearly apparent. Those skilled in the art should understand that these specific embodiments and examples are for illustrative purposes only and are not intended to limit the present invention.

[0022] Figure 1 shows the overall flowchart of the evaluation method provided by this invention. The physical basis of the evaluation system lies in the digital representation of the structural and functional state of non-coding regions across the entire genome of the test individual, and in the use of the nonlinear fitting capability of deep learning models to quantify the contribution of these regulatory elements to the functional plasticity of specific neuroanatomical regions. The processing includes the following steps: (1) Individual genetic regulation multimodal data acquisition: obtain non-coding genetic regulatory feature data and epigenetic status data of the test individual through a high-throughput sequencing data interface; the non-coding genetic regulatory feature data includes information on regulatory-element-associated sites distributed throughout the non-coding regions of the genome and functional annotation features of non-coding regions; the epigenetic status data includes DNA methylation features and chromatin accessibility features. In this stage, the system obtains multidimensional raw data of the test individual from bioinformatics databases (such as GTEx, ENCODE, PsychENCODE) or laboratory sequencing terminals through a standardized high-throughput sequencing data interface. These data cover not only static sequence difference information but also dynamic epigenetic modification status. Specifically, the non-coding genetic regulatory feature data includes information on various regulatory-element-associated sites distributed throughout the genome. These sites are not limited to traditional single nucleotide polymorphism sites, but focus on expression quantitative trait loci, protein quantitative trait loci, and splicing quantitative trait loci with significant regulatory effects. Each site corresponds to one or more candidate target genes, and its position on the genome is precisely marked using a coordinate system. The functional annotation features of non-coding regions further incorporate information on enhancers, silencers, insulators, and non-coding RNA promoter regions. The activity weights of these regions are assigned based on conservation scores and tissue specificity using predefined bioinformatics operators (for example, using HOMER for identification and scoring). Epigenetic state data, as an important component of the regulatory background, are characterized by DNA methylation features (e.g., using Bismark to process whole-genome bisulfite sequencing data to obtain the methylation ratio of CpG sites) and chromatin accessibility features (e.g., using MACS2 to analyze ATAC-seq data to obtain peak regions and openness intensities).

[0023] (2) Multidimensional genetic regulation feature vectorization: standardize and preprocess the raw non-coding genetic regulation feature data and epigenetic state data. The preprocessing includes dimensionality compression based on principal component analysis and numerical normalization based on range standardization. Categorical regulatory element information is encoded with a one-hot encoding operator, and methylation level and accessibility intensity are quantified with a continuous variable mapping operator. Finally, the vectors are concatenated in a unified feature space to generate a genetic regulation feature vector $V_g$ that characterizes individual differences in genetic regulation. This stage aims to transform the heterogeneous and highly sparse raw biological data described above into numerical entities that can be processed by computational models. The raw data first undergoes rigorous standardization preprocessing. Because different types of sequencing data differ significantly in scale and distribution, the system employs a dimensionality compression operator based on principal component analysis to map tens of millions of locus records into a principal component space while retaining over 95% of the variance. Range standardization is then used to scale the values uniformly to the interval [0, 1], eliminating magnitude bias between features. For categorical regulatory element information such as enhancer type and sequence conservation level, the system uses a one-hot encoding operator to convert them into high-dimensional sparse binary vectors. For continuous biophysical variables such as methylation level and accessibility strength, continuous variable mapping operators embed them into specific feature channels. Through this series of operations, a genetic regulatory feature vector representing individual differences in genetic regulation is finally generated within a unified feature space: $V_g = [f_1, f_2, \ldots, f_n]$. In this vector, each dimension $f_i$ carries a specific biological meaning, such as the tissue-specific activity value of a particular enhancer or the variant effect value of a key eQTL site.
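The encoding and concatenation part of this step can be sketched in a few lines of standard-library Python. The PCA compression stage is left out for brevity. The element-type vocabulary, the record fields, the cohort-level accessibility range, and the clipping of out-of-range values are all assumptions made for the example.

```python
from typing import Dict, List, Sequence, Tuple

# Illustrative vocabulary of categorical regulatory-element types (assumption).
ELEMENT_TYPES = ["enhancer", "silencer", "insulator", "promoter"]

def one_hot(category: str, vocabulary: Sequence[str]) -> List[float]:
    """One-hot encode a categorical regulatory-element type."""
    if category not in vocabulary:
        raise ValueError(f"unknown category: {category}")
    return [1.0 if category == c else 0.0 for c in vocabulary]

def range_standardize(value: float, lo: float, hi: float) -> float:
    """Min-max scale onto [0, 1]; values outside [lo, hi] are clipped (assumption)."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    return min(1.0, max(0.0, (value - lo) / (hi - lo)))

def build_feature_vector(elements: Sequence[Dict], accessibility_range: Tuple[float, float]) -> List[float]:
    """Concatenate, per element: one-hot type, methylation ratio, scaled accessibility."""
    lo, hi = accessibility_range
    v_g: List[float] = []
    for element in elements:
        v_g.extend(one_hot(element["type"], ELEMENT_TYPES))
        v_g.append(element["methylation"])  # CpG methylation ratio, already in [0, 1]
        v_g.append(range_standardize(element["accessibility"], lo, hi))
    return v_g

# Two hypothetical regulatory elements for one individual.
elements = [
    {"type": "enhancer", "methylation": 0.25, "accessibility": 30.0},
    {"type": "silencer", "methylation": 0.75, "accessibility": 10.0},
]
v_g = build_feature_vector(elements, accessibility_range=(10.0, 50.0))
```

Each element contributes a fixed-width block of six values here, so position in the vector identifies both the element and the feature channel.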

[0024] (3) AI-based gene regulation-brain region function association mapping: input the genetic regulation feature vector $V_g$ into a pre-trained deep neural network model. This model is a cascaded hybrid architecture of a multilayer perceptron and a graph convolutional network. The feature embedding layer (the MLP part) maps the highly sparse genetic regulatory feature vector to a low-dimensional dense vector space (e.g., 256-dimensional). The topological association mapping layer (the GCN part) then implements the logical association through a pre-constructed gene-brain region association matrix $M$, where the element $M_{ij}$ denotes the functional influence weight of the $i$-th regulatory element on the $j$-th target brain region. The weight $M_{ij}$ is determined as follows. First, for the $i$-th regulatory element, the genes whose physical interaction strength with the element ranks in the top 5% of the Hi-C interaction frequency matrix are selected as the candidate target gene set $G_i$. If $G_i$ is not empty, the gene $g \in G_i$ with the highest absolute Z-score of mRNA expression level in brain region $j$ is selected as the representative gene of the regulatory element; if $G_i$ is empty, the protein-coding gene nearest to the regulatory element (limited to a linear genomic distance of <1 Mb) is used as the representative gene. Then $M_{ij} = |Z_{(g,j)}|$, where $Z_{(g,j)}$ is the Z-score of the mRNA expression of gene $g$ in brain region $j$ in PsychENCODE. If PsychENCODE contains no $Z_{(g,j)}$ data, the structural connection strength $A_{(j,k)}$ between brain region $j$ and every other brain region $k$ is computed (from HCP DTI data), and $M_{ij}$ is calculated by weighted K-nearest-neighbor interpolation: [formula not reproduced in the source text], where $N(j)$ is the set of brain regions with the highest connectivity strength to brain region $j$. A pre-constructed human brain region functional connectivity map (based on the structural connectivity matrix of HCP) is loaded, and the embedded feature vectors are assigned to each node (brain region) of the map. Through two layers of graph convolution, the feature information is propagated and aggregated along the connectivity edges of the brain network, ultimately generating a regulatory influence value matrix corresponding to each target brain region. Each element in this matrix represents the regulatory efficacy of the underlying non-coding regulatory network on the baseline excitability, upper limit of synaptic plasticity, and neurotransmitter receptor expression level of that brain region under a specific individual genetic background.
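The rule for filling in $M_{ij}$ can be sketched as follows. The interpolation formula is not reproduced in the text, so this example assumes the usual connectivity-weighted average over the $K$ most strongly connected regions that have data, $\hat{Z}_{(g,j)} = \sum_{k \in N(j)} A_{(j,k)} Z_{(g,k)} / \sum_{k \in N(j)} A_{(j,k)}$, and assumes the signed Z-score is interpolated before the absolute value is taken. Gene names, region data, and function names are hypothetical.

```python
from typing import Dict, List, Optional

ZTable = Dict[str, Dict[str, Optional[float]]]  # gene -> region -> Z-score or None

def interpolate_z(region: str, z_by_region: Dict[str, Optional[float]],
                  connectivity: Dict[str, Dict[str, float]], k: int = 2) -> float:
    """Assumed weighted K-nearest-neighbour estimate of a missing Z-score."""
    neighbours = sorted(
        ((strength, other) for other, strength in connectivity[region].items()
         if other != region and strength > 0 and z_by_region.get(other) is not None),
        reverse=True,
    )[:k]
    total = sum(strength for strength, _ in neighbours)
    if total == 0:
        raise ValueError(f"no connected region with data for {region}")
    return sum(strength * z_by_region[other] for strength, other in neighbours) / total

def association_weight(candidate_genes: List[str], nearest_gene: str, region: str,
                       z_scores: ZTable, connectivity: Dict[str, Dict[str, float]],
                       k: int = 2) -> float:
    """M_ij = |Z(g, j)| for the representative gene g of regulatory element i."""
    genes = candidate_genes or [nearest_gene]  # fall back when G_i is empty

    def z_value(gene: str) -> float:
        value = z_scores[gene].get(region)
        if value is None:
            value = interpolate_z(region, z_scores[gene], connectivity, k)
        return value

    # Picking the gene with the highest |Z| and then taking |Z| is a max over |Z|.
    return max(abs(z_value(gene)) for gene in genes)

# Toy data: structural connection strengths and expression Z-scores (made up).
connectivity = {
    "DLPFC": {"M1": 12.0, "ACC": 3.0, "HIP": 0.0},
    "M1": {"DLPFC": 12.0, "ACC": 1.0, "HIP": 1.0},
    "ACC": {"DLPFC": 3.0, "M1": 1.0, "HIP": 0.0},
    "HIP": {"DLPFC": 0.0, "M1": 1.0, "ACC": 0.0},
}
z_scores = {
    "GENE_A": {"DLPFC": 1.5, "M1": -2.0, "ACC": None, "HIP": 0.5},
    "GENE_B": {"DLPFC": 0.2, "M1": 0.1, "ACC": 0.3, "HIP": 0.4},
}
m_direct = association_weight(["GENE_A"], "GENE_B", "M1", z_scores, connectivity)
m_interpolated = association_weight(["GENE_A"], "GENE_B", "ACC", z_scores, connectivity)
m_fallback = association_weight([], "GENE_B", "DLPFC", z_scores, connectivity)
```

For the interpolated case, the estimate for ACC is (3.0 × 1.5 + 1.0 × (−2.0)) / 4.0 = 0.625, drawn from its two connected neighbours with data.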

[0025] The construction and training process of the deep neural network model is as follows:
(a) Training data preparation: Sample data from large brain science cohorts (such as UK Biobank and the ABCD Study) are integrated. Each training sample includes the individual's whole-genome sequencing and epigenetic data (used to construct the genetic regulatory feature vector $V_g$) and the corresponding brain region functional state proxy labels for that individual. The proxy labels are derived from: 1) the low-frequency oscillation amplitude or local consistency of each brain region derived from resting-state functional magnetic resonance imaging (fMRI); 2) the effect value of specific task activation in task-state fMRI; and 3) physiological response indicators induced after stimulation of specific brain regions in historical transcranial magnetic stimulation (TMS) intervention studies (such as changes in the amplitude of motor evoked potentials, MEP). The data are divided into a training set, a validation set, and a test set in a 7:1:2 ratio.
(b) Model architecture: The deep neural network model adopts a cascaded hybrid architecture. First, the genetic regulatory feature vector $V_g$ is encoded and reduced in dimension by a three-layer multilayer perceptron with an input dimension of n and hidden layer dimensions of 512 and 256, using the ReLU activation function. The dimensionality-reduced dense vector is copied and assigned to each node of a pre-constructed brain region map. The brain region map is constructed from population-average diffusion tensor imaging data of the Human Connectome Project and is represented by a symmetric adjacency matrix A, whose elements are the numbers of structural connection fibers between brain regions. The node features and the adjacency matrix A are then input into a two-layer graph convolutional network (GCN). Each GCN layer is followed by Batch Normalization and ReLU activation. Finally, a fully connected layer outputs the regulatory influence value of each brain region.
(c) Model training: The Adam optimizer is used with an initial learning rate of 0.001 and a batch size of 32. The loss function is defined as $Loss = MSE(Y_{pred}, Y_{label}) + \alpha \lVert W \rVert_1$, where MSE is the mean squared error, and $Y_{pred}$ and $Y_{label}$ are the predicted regulatory influence value and the proxy label, respectively. $Y_{label}$ is obtained by standardizing and weighting the data from the three sources described in step (a): first, each data category is standardized as a z-score using the mean and standard deviation of the healthy population; then weights are assigned based on data reliability, with $\alpha_1 = 0.4$ (resting-state fMRI), $\alpha_2 = 0.3$ (task-state fMRI), and $\alpha_3 = 0.3$ (TMS response); the final label is $Y_{label} = \alpha_1 Z_1 + \alpha_2 Z_2 + \alpha_3 Z_3$, which ensures that all sample labels share the same scale and distribution range. $\lVert W \rVert_1$ is the sparsity regularization term on the model weight parameters, and $\alpha$ is the regularization coefficient, set to 0.0001. Training stops early when the validation set loss has not decreased for 10 consecutive rounds.
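The label construction, loss function, and early-stopping rule of step (c) can be sketched as follows. This is a minimal plain-Python illustration rather than the training code of the invention; the function names are hypothetical, and the sparsity term is assumed to be the L1 norm of a flat list of model weights.

```python
def zscore(value, healthy_mean, healthy_std):
    """Standardize a measurement against the healthy-population mean and SD."""
    return (value - healthy_mean) / healthy_std

def proxy_label(z_rest, z_task, z_tms, weights=(0.4, 0.3, 0.3)):
    """Weighted label Y_label = a1*Z1 + a2*Z2 + a3*Z3 with the weights from step (c)."""
    a1, a2, a3 = weights
    return a1 * z_rest + a2 * z_task + a3 * z_tms

def loss(y_pred, y_label, model_weights, alpha=1e-4):
    """MSE(Y_pred, Y_label) plus alpha times an assumed L1 sparsity term."""
    mse = sum((p - t) ** 2 for p, t in zip(y_pred, y_label)) / len(y_pred)
    l1 = sum(abs(w) for w in model_weights)
    return mse + alpha * l1

def should_stop(val_losses, patience=10):
    """True once the best validation loss is at least `patience` rounds old."""
    if len(val_losses) <= patience:
        return False
    best_round = min(range(len(val_losses)), key=val_losses.__getitem__)
    return len(val_losses) - 1 - best_round >= patience
```

Because `min` returns the first occurrence of the lowest loss, a later round that only ties the best value is not counted as an improvement.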

[0026] (4) Quantitative calculation of the brain region functional susceptibility score: For each target brain region, the functional susceptibility score of that brain region under external regulatory stimulation is calculated from the corresponding components of the regulatory influence value matrix. The functional susceptibility score comprehensively reflects the response sensitivity and state transition potential of the brain region, as determined by its underlying genetic regulatory network, when it receives an external energy intervention. This step converts abstract regulatory influence values into quantitative indicators that can guide engineering practice. For each selected target brain region, the calculation combines the components of the regulatory influence value matrix with a preset external stimulus response transmission operator. To simulate the intervention effects of different neuromodulation methods (such as transcranial magnetic stimulation or transcranial electrical stimulation) on brain regions, this invention defines the calculation logic of a functional susceptibility score $S$. The calculation formula for $S$ is: [formula image not reproduced]. The formula combines the regulatory influence value of the $j$-th regulatory element on the brain region with a weight coefficient $\omega_j$; the weight coefficients are learned through the model's graph attention mechanism and reflect the robustness of the regulatory pathway. The formula also incorporates an exponential decay term to account for epigenetic environmental constraints, where $\lambda$ is the environmental noise attenuation factor (example value 0.15), used to modulate the reduction effect of external non-genetic factors (such as stress and sleep deprivation) on response potential, and $D$ is the degree to which the current epigenetic state of the brain region deviates from the steady-state value of a healthy population.
By default, this invention uses the overall methylation level offset for $D$, calculated as the z-score of the deviation between the individual's peripheral blood overall methylation level and the healthy reference value. If tissue-specific methylation data are available for the subject's target brain region $j$, then $D_j = |Z_j|$, where $Z_j$ is the z-score of the overall methylation level of that brain region relative to the healthy control group. Otherwise, peripheral blood methylation data are used as a proxy: a pre-trained cross-tissue methylation prediction model (for example, one based on elastic net regression, with blood CpG site methylation β-values as input and the predicted methylation level of brain region $j$ as output) estimates the methylation level of brain region $j$, and the absolute value of its z-score is taken as $D_j$. The training data for the prediction model come from publicly available multi-tissue methylation databases (such as GEO accession GSE111629). This computational model covers the path from genetic background to current physiological status, so that the susceptibility score reflects the brain region's responsiveness to external energy interventions.
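The two quantities described in this step can be sketched in plain Python. The exact scoring formula is not reproduced in this text, so the form used below, S = scale × Σ_j(ω_j · r_j) × exp(−λ · D), is only an assumption that is consistent with the verbal description (a weighted sum of regulatory influences with an exponential decay term); the symbol r_j, the `scale` parameter, and all numbers are hypothetical, and the sketch is not meant to reproduce the scores reported in the embodiments.

```python
import math

def steady_state_deviation(methylation, healthy_mean, healthy_std):
    """D_j: absolute z-score of a methylation level against the healthy reference."""
    return abs((methylation - healthy_mean) / healthy_std)

def susceptibility_score(influences, weights, lam=0.15, deviation=0.0, scale=100.0):
    """ASSUMED form: scale * sum_j(w_j * r_j) * exp(-lam * D)."""
    weighted = sum(w * r for w, r in zip(weights, influences))
    return scale * weighted * math.exp(-lam * deviation)

# Hypothetical inputs: three regulatory elements acting on one brain region.
d = steady_state_deviation(0.62, 0.60, 0.25)
s = susceptibility_score([0.9, 0.7, 0.8], [0.5, 0.3, 0.2], lam=0.15, deviation=d)
```

Under this assumed form, a larger deviation D always lowers the score, which matches the stated role of the decay term as a reduction effect on response potential.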

[0027] (5) Outputting Assessment Results: The generated assessment results, which include functional susceptibility scores for multiple target brain regions, are output in a structured format. The assessment results are presented as a susceptibility heatmap or a score list, identifying high-susceptibility response regions under the constraints of individual genetic regulatory characteristics. This process encapsulates the generated functional susceptibility scores for each brain region in a structured manner, and the output results are presented as a high-resolution three-dimensional brain heatmap, where the intensity of the color directly corresponds to the level of the susceptibility score. High-susceptibility regions are marked as optimal intervention sites for neural regulation, while low-susceptibility or inert regions are suggested as non-core intervention targets. In addition, the system also generates a structured report containing suggestions for the ranking of each brain region, clearly indicating the expected response probability when performing interventions at a specific frequency or intensity under the constraints of the current subject's genetic regulatory characteristics.
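The structured output described above can be sketched as a ranked list serialized to JSON. The field names and the cutoff of 70 points for a high-susceptibility region are hypothetical choices for illustration; this description does not fix either.

```python
import json

def build_report(scores, high_threshold=70.0):
    """Rank regions by score (descending) and flag those at or above the cutoff."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [
        {
            "rank": position,
            "region": region,
            "score": score,
            "high_susceptibility": score >= high_threshold,
        }
        for position, (region, score) in enumerate(ranked, start=1)
    ]

# Scores taken from Example 1 of this description.
report = build_report({"M1": 28.4, "DLPFC": 84.5})
report_json = json.dumps(report, indent=2)
```

A list of records keeps the ranking explicit and can feed either a score list or the color scale of a heatmap renderer.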

[0028] The technical solution of the present invention will be further described below with reference to embodiments and comparative examples.

[0029] Example 1. To further demonstrate the specific engineering implementation of the present invention, this embodiment provides a case analysis based on a real subject background. The subject is an individual with a specific need for improvement in cognitive function.

[0030] First, whole-genome sequencing data (30x depth) and ATAC-seq chromatin accessibility data of the subject were obtained through high-throughput sequencing. The proportion of Q30 bases in the raw data was 92.5%. After adapter sequences were filtered out, approximately 300 GB of valid analysis data was retained.

[0031] Subsequently, a feature extraction operator extracted 12,450 eQTL sites located in the DLPFC enhancer region, and a homozygous variant was identified in the subject at a key non-coding enhancer site upstream of the BDNF gene. A one-hot encoder converted these categorical features into a sparse tensor of dimension 1 × 50,000, and a normalization processor normalized the individual's overall methylation level to 0.62.
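The encoding step in this paragraph can be illustrated with a small plain-Python sketch that one-hot encodes categorical site information, range-normalizes a continuous value, and concatenates both into one vector. The three genotype categories, the function names, and the input numbers are hypothetical; the actual category scheme that yields the 1 × 50,000 tensor is not given in this description.

```python
# Hypothetical genotype categories for a biallelic regulatory site (an assumption).
GENOTYPES = ("ref/ref", "ref/alt", "alt/alt")

def one_hot(genotype):
    """Return a one-hot list marking the position of `genotype` in GENOTYPES."""
    return [1.0 if g == genotype else 0.0 for g in GENOTYPES]

def range_normalize(value, lower, upper):
    """Min-max (range) normalization of a continuous value into [0, 1]."""
    return (value - lower) / (upper - lower)

def build_feature_vector(site_genotypes, continuous_features):
    """Concatenate one-hot site encodings with normalized continuous features."""
    vector = []
    for genotype in site_genotypes:
        vector.extend(one_hot(genotype))
    for value, lower, upper in continuous_features:
        vector.append(range_normalize(value, lower, upper))
    return vector

# Illustrative call: two sites plus one continuous value (made-up numbers).
v_g = build_feature_vector(["alt/alt", "ref/alt"], [(0.3, 0.0, 1.0)])
```

With thousands of sites, most entries of such a vector are zero, which is why the description refers to it as a sparse tensor.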

[0032] Next, the high-dimensional vector was reduced to a 512-dimensional dense vector. Using a map based on 86 anatomical regions, the regulatory influence value of the subject's DLPFC region was calculated to be 0.88, while that of the primary motor cortex (M1) was only 0.32. The environmental noise attenuation factor $\lambda$ was then set to 0.15, and the subject's current epigenetic steady-state deviation $D$ was 0.08. Substituting these values into the calculation formula ([formula image not reproduced]) gave a susceptibility score $S$ of 84.5 (out of 100) for the DLPFC, while the M1 area scored only 28.4.

[0033] Finally, the results output module generated a susceptibility profile for the subject. As shown in Figure 2, the left DLPFC is highlighted in red and designated a "highly susceptible area," with suggested stimulation parameters of 10 Hz frequency and an intensity of 80% of threshold. In a prospective exploratory study involving 20 subjects with similar backgrounds, the group (n=10) that received intervention at the highly susceptible DLPFC target identified by this invention showed a greater improvement in subsequent working memory tasks (average improvement +35%) than the control group (n=10, average improvement +10%) that received intervention at the traditional anatomical coordinate target (M1 area). The difference between the two groups was statistically significant (p<0.01), preliminarily validating the engineering application value of this assessment method.

[0034] Example 2. This embodiment aims to demonstrate the application potential of the present invention in the precise intervention of mental and neurological diseases.

[0035] Subject: A patient with recurrent depression who met the DSM-5 diagnostic criteria.

[0036] Data and features: In addition to basic whole-genome and epigenome data, data on depression risk-related eQTL sites and prefrontal cortex-specific histone modifications from the PsychENCODE database were specifically included to enrich the feature vector.

[0037] Procedure and Results: The method of this invention was performed. The results showed that the patient's functional susceptibility score in the left subgenual anterior cingulate cortex (sgACC) was 89.2 points, markedly higher than that in the dorsolateral prefrontal cortex (DLPFC, 65.4 points), which is the current FDA-approved standard target for TMS treatment of depression. The susceptibility score of the right amygdala was very low (15.1 points), indicating a low-response area.

[0038] Significance of the validation: This result reveals that the patient may possess a unique emotional circuit regulatory network. Based on this assessment, a personalized stimulation program centered on the highly susceptible sgACC, rather than the standard DLPFC, can be designed for this patient, providing a new intervention approach for "treatment-resistant" patients. This case demonstrates the ability of this invention to explain and predict the heterogeneity of individual therapeutic targets at the molecular mechanism level.

[0039] Example 3. This embodiment demonstrates the universality of the present invention in evaluating different neuromodulation modalities.

[0040] Subject: The same healthy individual as in Example 1.

[0041] Procedure: Two modes were simulated: "high-frequency (10 Hz) excitatory stimulation" and "low-frequency (1 Hz) inhibitory stimulation." The parameters of the external stimulus response transmission operator in the functional susceptibility score calculation were adjusted accordingly: for inhibitory stimulation, the weights $\omega_j$ of regulatory elements related to GABAergic receptor expression were increased by a factor of 1.5; for excitatory stimulation, the weights of glutamatergic receptor-related elements were increased.
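The mode-specific weight adjustment can be sketched as follows. The element-type tags and function name are hypothetical. The factor of 1.5 is stated in this embodiment only for inhibitory stimulation; reusing the same default for excitatory stimulation is an assumption made here for illustration.

```python
def adjust_weights(weights, element_types, mode, factor=1.5):
    """Scale the weights of the receptor class targeted by the stimulation mode."""
    target = {"inhibitory": "GABAergic", "excitatory": "glutamatergic"}[mode]
    return [w * factor if kind == target else w
            for w, kind in zip(weights, element_types)]

# Hypothetical regulatory elements, each tagged with the receptor class it affects.
base_weights = [0.2, 0.4, 0.1]
types = ["GABAergic", "glutamatergic", "other"]

inhibitory_weights = adjust_weights(base_weights, types, "inhibitory")
excitatory_weights = adjust_weights(base_weights, types, "excitatory")
```

Running the score calculation once per adjusted weight set yields one susceptibility profile per stimulation mode for the same subject.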

[0042] Results: Two susceptibility scores were calculated. Interestingly, the subject's primary motor cortex (M1) showed moderate susceptibility to excitatory stimuli (score 58.3) but very low susceptibility to inhibitory stimuli (score 22.7). Conversely, certain areas of the prefrontal cortex showed high susceptibility to both modes. This result suggests that the same brain region may have entirely different response potentials to different stimulus modes.

[0043] Significance of the verification: This invention can output mode-specific susceptibility profiles, giving clinicians prior decision support for selecting the most appropriate stimulation frequency and mode (excitatory or inhibitory), and is expected to help avoid ineffective interventions or adverse reactions caused by improper mode selection.

[0044] Comparative Example 1. Current standard clinical methods were used, and individual genetic information was not considered at all. For all subjects in Examples 1 and 2, the international 10-20 EEG system or MNI standard space coordinates were uniformly used, with the DLPFC (F3 position) or M1 area as the fixed stimulation target.

[0045] Comparative Results: In simulated or retrospective data analyses, the target points selected using the method in Comparative Example 1 showed an average spatial overlap of less than 30% with the "high-susceptibility areas" assessed by this invention. Historical data shows that using this fixed-target strategy, the effective response rate for TMS treatment of depression is typically around 50%-60%, while the predicted response rate for candidates selected using personalized targets according to this invention (such as the sgACC high-susceptibility patients in Example 2) could be increased to over 80% in a small pilot study (requires large-scale trial validation). This indicates that ignoring individual genetic heterogeneity is one of the important reasons for the emergence of "non-responders."

[0046] Comparative Example 2. A comparative model was constructed that uses only exon-region SNP data from the subjects' protein-coding genes (from the same sequencing data) as input features, and that is trained and used for prediction with a neural network architecture similar to the main model of this invention.

[0047] Comparison Results: On an independent test set (n=100), the magnitude of change in motor evoked potentials (MEPs) after actual TMS intervention was used as the gold standard for response, and the Pearson correlation coefficient (r) between predicted susceptibility and actual response was calculated. The complete model of this invention (including non-coding and epigenetic features) achieved r = 0.72 (p < 0.001). In contrast, the model of Comparative Example 2 (coding genes only) achieved r = 0.41 (p < 0.05). The difference between the two was significant (p < 0.01). This indicates that non-coding regulatory information contributes substantially to predicting brain region functional response potential, and that ignoring it markedly reduces the accuracy of the assessment.
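For reference, the Pearson correlation coefficient used in this comparison can be computed as in the following plain-Python sketch. The function name and the sample values are hypothetical and are not data from the test set.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)

# Made-up predicted susceptibility scores and observed MEP amplitude changes.
predicted = [20.0, 35.0, 50.0, 65.0, 80.0]
observed = [0.1, 0.3, 0.2, 0.5, 0.6]
r = pearson_r(predicted, observed)
```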

[0048] Finally, it should be noted that the terms "comprising," "including," or any other variations are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Although preferred embodiments of the invention have been described, those skilled in the art, upon learning the basic inventive concept, can make further changes and modifications to these embodiments. Therefore, the appended claims are intended to be interpreted as including both the preferred embodiments and all changes and modifications falling within the scope of the invention.

[0049] The embodiments described above merely illustrate specific implementation methods of this application, and while the descriptions are detailed and specific, they should not be construed as limiting the scope of protection of this application. It should be noted that those skilled in the art can make various modifications and improvements without departing from the concept of the technical solution of this application, and these modifications and improvements all fall within the scope of protection of this application.

Claims

1. A method for assessing brain region functional susceptibility driven by non-coding genetic regulatory features based on artificial intelligence, characterized in that it includes the following steps:
(1) Perform the individual genetic regulation multimodal data acquisition step: obtain non-coding genetic regulatory feature data and epigenetic state data of the tested individual through a high-throughput sequencing data interface; the non-coding genetic regulatory feature data include information on regulatory element associated sites distributed in non-coding regions of the whole genome and non-coding region functional annotation features; the epigenetic state data include DNA methylation features and chromatin accessibility features;
(2) Perform the multidimensional genetic regulatory feature vectorization step: apply standardized preprocessing to the obtained raw non-coding genetic regulatory feature data and epigenetic state data; the preprocessing includes dimensionality compression based on principal component analysis and numerical normalization based on range standardization; one-hot encoding operators are used to encode categorical regulatory element information, and continuous variable mapping operators are used to quantify methylation levels and accessibility strength; finally, these are concatenated in a unified feature space to generate a genetic regulatory feature vector $V_g$ that characterizes individual differences in genetic regulation;
(3) Perform the artificial intelligence-based gene regulation-brain region function association mapping step: construct and train a deep neural network model, and input the genetic regulatory feature vector $V_g$ into the deep neural network model, which is a hybrid architecture of a multilayer perceptron and a graph convolutional network; the model maps the highly sparse genetic regulatory feature vector to a low-dimensional dense vector space through a feature embedding layer, and applies a topological association mapping layer together with a pre-constructed functional connectivity map of human brain regions; the topological association mapping layer achieves logical association through a pre-constructed gene-brain region association matrix $M$, where the matrix element $M_{ij}$ denotes the functional influence weight of the $i$-th regulatory element on the $j$-th target brain region; the weight $M_{ij}$ is determined, based on the PsychENCODE database, from the absolute value of the z-score of the mRNA expression level, in the $j$-th brain region, of the gene in which the regulatory element is located; if no direct data are available, it is obtained by interpolation using the Hi-C interaction frequency and the expression data of the nearest neighboring brain regions; a regulatory influence value matrix corresponding to each target brain region is thereby generated;
(4) Perform the quantitative calculation step for the brain region functional susceptibility score: for each target brain region, calculate the functional susceptibility score of the brain region under external regulatory stimulation conditions based on the corresponding components of the regulatory influence value matrix; the functional susceptibility score comprehensively reflects the response sensitivity and state transition potential of the brain region, based on the underlying genetic regulatory network, when receiving an external energy intervention;
(5) Perform the evaluation result output step: output, in a structured manner, the generated evaluation results containing the functional susceptibility scores of multiple target brain regions; the evaluation results are presented as a susceptibility heatmap or a score list that identifies the high-susceptibility response regions under the constraints of the individual's genetic regulatory characteristics.

2. The brain region functional susceptibility assessment method based on non-coding genetic regulatory features driven by artificial intelligence as described in claim 1, characterized in that the regulatory element associated site information in the non-coding genetic regulatory feature data described in step (1) includes expression quantitative trait loci (eQTL), protein quantitative trait loci (pQTL), and splicing quantitative trait loci (sQTL) obtained from databases, which are used to characterize the regulatory efficacy of single nucleotide polymorphisms on gene expression and translation; the non-coding region functional annotation features in the non-coding genetic regulatory feature data include enhancer and silencer sequences and their activity scores identified using HOMER software, promoter regions defined by the FANTOM5 project, and chromatin interaction frequency matrices extracted from Hi-C sequencing data by the Juicer and Fit-Hi-C pipelines, which are used to identify the physical interaction frequency between non-coding enhancers and distant target gene promoters.

3. The brain region functional susceptibility assessment method based on non-coding genetic regulatory features driven by artificial intelligence as described in claim 1, characterized in that the genetic regulatory feature vector $V_g$ described in step (2) is constructed as follows: an unsupervised feature extraction operator based on an autoencoder is used; by minimizing reconstruction error, the autoencoder extracts regulatory representation features with high information entropy from non-coding region site information, thereby eliminating random noise and redundant biological information; the genetic regulatory feature vector is $V_g = [f_1, f_2, \ldots, f_n]$, where $f_i$ is the biophysical property weight of regulatory element $i$, assigned based on its conservation score in the genome and its tissue specificity.

4. The method for assessing brain region functional susceptibility based on non-coding genetic regulatory features driven by artificial intelligence as described in claim 1, characterized in that the topological association mapping layer of the deep neural network model described in step (3) treats each target brain region as a node in a graph structure through multi-layer graph convolution operations, and uses the edge weights between nodes to represent the anatomical or functional synergy between brain regions; the target brain regions are delineated based on standard anatomical atlases and cover at least the dorsolateral prefrontal cortex, primary motor cortex, anterior cingulate cortex, and hippocampus; each element in the regulatory influence value matrix characterizes the degree of influence of non-coding regulatory networks on the baseline excitability, upper limit of synaptic plasticity, and neurotransmitter receptor expression level of each brain region under the specific genetic background of an individual.

5. The method for assessing brain region functional susceptibility based on non-coding genetic regulatory features driven by artificial intelligence as described in claim 4, characterized in that the deep neural network model employs a transformer architecture with an attention mechanism during the training phase; the attention mechanism automatically identifies a set of key sites in the genetic regulatory feature vector that significantly affect the function of a specific brain region and assigns differentiated weight coefficients; the loss function for training the deep neural network model consists of a mean squared error term and a sparsity regularization term, which are used to ensure that the calculated regulatory influence values conform to the sparse distribution characteristic of biological regulation.

6. The method for assessing brain region functional susceptibility based on non-coding genetic regulatory features driven by artificial intelligence as described in claim 5, characterized in that the construction and training process of the deep neural network model includes the following steps:
(a) Training data preparation: sample data from large brain science cohorts are integrated; each training sample includes the whole-genome sequencing and epigenetic data of an individual and the proxy labels for the functional state of the corresponding brain regions of that individual; the proxy labels are derived from: 1) the low-frequency oscillation amplitude or local consistency of each brain region derived from resting-state functional magnetic resonance imaging; 2) the effect value of specific task activation in task-state fMRI; 3) physiological response indicators induced after stimulation of specific brain regions in historical transcranial magnetic stimulation intervention studies; the data are divided into a training set, a validation set, and a test set in a ratio of 7:1:2;
(b) Model architecture: the deep neural network model adopts a cascaded hybrid architecture; first, the genetic regulatory feature vector $V_g$ is encoded and reduced in dimension by a three-layer multilayer perceptron with an input dimension of n and hidden layer dimensions of 512 and 256, using the ReLU activation function; the dimensionality-reduced dense vector is copied and assigned to each node of a pre-constructed brain region map; the brain region map is constructed from population-average diffusion tensor imaging data of the Human Connectome Project and is represented by a symmetric adjacency matrix A, whose elements are the numbers of structural connection fibers between brain regions; the node features and the adjacency matrix A are then input into a two-layer graph convolutional network (GCN); each GCN layer is followed by Batch Normalization and ReLU activation; finally, a fully connected layer outputs the regulatory influence value of each brain region;
(c) Model training: the Adam optimizer is used with an initial learning rate of 0.001 and a batch size of 32; the loss function is defined as $Loss = MSE(Y_{pred}, Y_{label}) + \alpha \lVert W \rVert_1$, where MSE is the mean squared error, $Y_{pred}$ and $Y_{label}$ are the predicted regulatory influence value and the proxy label, respectively, and $Y_{label}$ is obtained by standardizing and weighting the data from the three sources described in step (a); $\lVert W \rVert_1$ is the sparsity regularization term on the model weight parameters, and $\alpha$ is the regularization coefficient, set to 0.0001; training is stopped early when the loss on the validation set has not decreased for 10 consecutive rounds.

7. The method for assessing brain region functional susceptibility based on non-coding genetic regulatory features driven by artificial intelligence as described in claim 1, characterized in that the formula for calculating the functional susceptibility score described in step (4) is as follows: [formula image not reproduced]; where $S$ is the functional susceptibility score; the formula combines the regulatory influence value of the $j$-th regulatory element on the brain region with the corresponding weight coefficient $\omega_j$; $\lambda$ is the environmental noise attenuation factor; and $D$ is the steady-state deviation of the brain region under its current epigenetic state; the steady-state deviation $D$ is determined from the absolute value of the z-score of the brain region's current overall methylation level relative to the reference value of the healthy population, and is used to modulate the reduction effect of external non-genetic factors on response potential.

8. The method for assessing brain region functional susceptibility based on non-coding genetic regulatory features driven by artificial intelligence as described in claim 7, characterized in that, The calculation process of the functional susceptibility score also incorporates the basal metabolic level of the brain region as a covariate; by integrating local cerebral blood flow and oxygen metabolism data obtained from functional near-infrared spectroscopy or positron emission tomography, the genetic regulatory influence value is dynamically corrected; at the same time, the score calculation also incorporates cortical thickness data based on anatomical structure, and for brain regions with larger cortical thickness, the functional susceptibility score is corrected by gain compensation based on the attenuation characteristics of stimulus energy in spatial transmission.

9. The brain region functional susceptibility assessment method based on non-coding genetic regulatory features driven by artificial intelligence as described in claim 1, characterized in that step (5) further includes brain region ranking suggestions based on functional susceptibility scores; the system automatically ranks the target brain regions from high to low according to their susceptibility scores and, in combination with a preset regulatory safety threshold, selects the optimal candidate set of target brain regions for the individual's neuromodulation intervention; the assessment results are pushed to the control end of the neuromodulation device through a standardized API interface as prior input data for stimulation coil positioning, pulse frequency, and intensity parameters.