An active block ground motion response spectrum prediction method and system based on an XGBoost algorithm

By constructing input feature vectors and applying event-level grouping with the XGBoost algorithm, the method addresses the regional adaptability, accuracy, and interpretability problems of ground motion prediction models in active block regions, achieving more accurate and more robust response spectrum prediction.

CN122196436A · Pending · Publication Date: 2026-06-12 · JIANGHAN UNIVERSITY +1

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
JIANGHAN UNIVERSITY
Filing Date
2026-03-16
Publication Date
2026-06-12


Abstract

The present application relates to the technical field of earthquake analysis, and in particular to an active block ground motion response spectrum prediction method based on the XGBoost algorithm, comprising the following steps: S1, data acquisition and screening: acquiring strong-motion records and event information for an active block region; S2, strong-motion preprocessing and response spectrum calculation: obtaining the peak ground acceleration PGA and the acceleration response spectrum Sa(T) at the target periods; S3, feature construction and logarithmic modeling: constructing an input feature vector whose features include at least the magnitude M_s, the focal depth Depth, the epicentral distance Repi, and the site shear-wave velocity V_S30; S4, XGBoost regression model construction and training objective: training an XGBoost regression model with PGA and Sa(T) as output targets; S5, event-level grouping and hyperparameter optimization, comprising step one, event ID extraction and grouping, and step two, period-by-period training and key hyperparameter set. The method has strong regional adaptability and stable fitting, effectively avoids event leakage, and generalizes and predicts more reliably.

Description

Technical Field

[0001] This invention relates to the field of earthquake analysis technology, and in particular to a method and system for predicting the ground motion response spectrum of active blocks based on the XGBoost algorithm.

Background Technology

[0002] The ground motion acceleration response spectrum is a key input parameter in seismic design, structural dynamic analysis, and seismic hazard assessment. However, existing ground motion prediction models mostly rely on empirical regression with pre-defined functional forms, which cannot fully characterize the strongly nonlinear coupling between source, path, and site. Active tectonic block regions exhibit significant spatial differences in tectonic activity, medium attenuation, and site effects. Traditional models applied there often suffer from regional systematic bias and poor spectral shape fitting, so prediction methods with greater regional adaptability and higher accuracy are needed. Existing research has shown significant regional differences in attenuation patterns and response spectrum shape between active tectonic blocks and adjacent areas, which drives the demand for regionalized models. At the same time, with the accumulation of strong-motion records and improved computational capabilities, machine learning methods can learn complex mappings from data without pre-defined functional forms, and have the potential to improve prediction accuracy and transferability. In ground motion prediction tasks, however, several problems still stand in the way of engineering application: data leakage caused by intra-event correlation, insufficient cross-event generalization, and inadequate verification of model interpretability and physical consistency. Therefore, a ground motion response spectrum prediction solution that targets active tectonic blocks and balances accuracy, robustness, and interpretability is urgently needed.

Summary of the Invention

[0003] The purpose of this application is to provide a method for predicting the ground motion response spectrum of active blocks based on the XGBoost algorithm, aiming to solve the above problems in the prior art.

[0004] This application provides a method for predicting the ground motion response spectrum of active blocks based on the XGBoost algorithm, including the following steps:

[0005] S1. Data Acquisition and Screening: Acquire strong-motion records and event information for the active block region;

[0006] S2. Strong-motion preprocessing and response spectrum calculation: Obtain the peak ground acceleration PGA and the acceleration response spectrum Sa(T) at the target periods;

[0007] S3. Feature Construction and Logarithmic Modeling: Construct an input feature vector, wherein the input features include at least the magnitude M_s, the focal depth (Depth), the epicentral distance (Repi), and the site shear-wave velocity V_S30;

[0008] S4. XGBoost regression model construction and training objective: Train the XGBoost regression model with PGA and Sa(T) as output targets;

[0009] S5. Event-level grouping and hyperparameter optimization, including step one: event ID extraction and grouping; step two: period-by-period training and key hyperparameter set.

[0010] S6. Output and Application: Input the input features of the earthquake event to be predicted into the trained model, and output PGA and Sa(T).

[0011] Preferably, the data filtering in S1 specifically involves limiting the magnitude and distance range of the acquired earthquake data, and filtering valid samples according to the signal-to-noise ratio threshold to form a set of records that can be used for modeling.

[0012] Preferably, the preprocessing in S2 includes baseline correction of the seismic record, determination of the available frequency band based on the signal-to-noise ratio and filtering; calculation of RotD50 for the two horizontal components, and calculation of the 5% damped acceleration response spectrum Sa(T) for different periods.

[0013] Preferably, the site shear-wave velocity V_S30 in S3 is taken preferentially from the station soil profile database and site classification results; when station measurements are missing, V_S30 is filled in using site condition maps or site parameter databases.

[0014] Preferably, step one in S5 specifically involves dividing the training set and the test set based on the earthquake event number, so that all records of the same earthquake event are only included in either the training set or the test set.

[0015] Preferably, step two in S5 specifically involves establishing a corresponding XGBRegressor regression model for each period T, and using GridSearchCV to optimize the learning rate, number of trees, maximum depth, subsample ratio, column sampling ratio, and regularization coefficient to obtain the optimal model parameters for each period.

[0016] A system applicable to the above-described seismic response spectrum prediction method is characterized by comprising a data acquisition module, a preprocessing and spectral value calculation module, a feature construction module, a grouping and partitioning module, an XGBoost training module, and a prediction output module.

[0017] The beneficial effects of this invention are: 1. Strong regional adaptability and stable fitting: trained on a large sample from active blocks, the model better captures the overall attenuation of PGA and the response spectrum with epicentral distance, and outputs reasonable, smooth response spectrum curves for different epicentral distances and magnitudes.

[0018] 2. Avoids event leakage and makes generalization more reliable: an event-ID grouping strategy prevents records of the same event from entering both the training and test sets, which would otherwise lead to an overly optimistic evaluation, and thereby improves the credibility of cross-event prediction at the process level.

[0019] 3. Better interpretability and physical consistency: feature importance analysis reveals how the main controlling factors differ across periods, and verifies that Sa decreases overall as V_S30 increases and is more sensitive at short periods, which is consistent with the basic characteristics of site effects.

Description of the Drawings

[0020] Figure 1 is a two-dimensional distribution map of magnitude M_s versus epicentral distance Repi.

[0021] Figure 2 is a spatial distribution map of magnitude for the training and test sets.

[0022] Figure 3 is a spatial distribution map of epicentral distance for the training and test sets.

[0023] Figure 4 is a scatter plot comparing the observed and predicted values for the training and test sets.

[0024] Figure 5 shows the distribution of inter-event residuals with magnitude at different periods.

[0025] Figure 6 shows the distribution trend of residuals with epicentral distance at different periods.

[0026] Figure 7 shows the distribution trend of residuals with V_S30 at different periods.

[0027] Figure 8 shows the distribution trend of residuals with focal depth at different periods.

[0028] Figure 9 is a curve of the predicted PGA as a function of epicentral distance.

[0029] Figure 10 shows the response spectrum curves for different magnitudes at fixed epicentral distances.

[0030] Figure 11 shows how the importance of M_s, Depth, Repi, and V_S30 changes with period.

[0031] Figure 12 shows the predicted curves of Sa versus V_S30 at different periods.

[0032] Figure 13 is a comparison of the prediction results of the model of this invention with those of existing models.

Detailed Implementation

[0033] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.

[0034] A method for predicting the ground motion response spectrum of active blocks based on the XGBoost algorithm includes the following steps:

[0035] 1. Data Acquisition and Screening: Acquire strong-motion records and event information for the active block region, limit the magnitude and distance range, and screen valid samples according to the signal-to-noise ratio threshold to form a set of records that can be used for modeling.

[0036] 2. Strong-motion preprocessing and response spectrum calculation: Baseline correction is performed on each record, and the usable frequency band is determined from the signal-to-noise ratio and the record is filtered accordingly; RotD50 is calculated from the two horizontal components, and the 5% damped acceleration response spectrum Sa(T) is calculated for the different periods.

[0037] 3. Site parameter completion: V_S30 values from the station soil profile database and site classification results are used preferentially. For stations lacking measurements, high-resolution site condition maps or site parameter databases are used to fill the gaps.

[0038] 4. Feature Construction and Logarithmic Modeling: An input feature vector x = [M_s, Depth, Repi, V_S30] is constructed for each record, and the target spectral value Sa(T) at a given period is used as the output. Because spectral values are positive and approximately log-normally distributed, a logarithmic transformation is applied to the output to define the learning target and ensure that predictions are positive. Here M_s is the surface-wave magnitude, Depth is the focal depth, Repi is the epicentral distance, and V_S30 is the site shear-wave velocity.

[0039] 5. Event-level grouping to avoid leakage: Extract the event number from the record file name as the event ID, and randomly divide the training set and test set by event as the grouping unit to ensure that all records of the same event appear only in one of the training set or the test set.

[0040] 6. XGBoost Training and Hyperparameter Optimization: For each period T, a corresponding XGBRegressor regression model is established. The learning rate, number of trees, maximum depth, subsample ratio, column sampling ratio, and regularization coefficient are optimized using GridSearchCV to obtain the optimal model parameters for each period.

[0041] 7. Output and Application: The M_s, Depth, Repi, and V_S30 of the event to be predicted are input, and the model outputs the predicted ln Sa(T), which is inverse-transformed (exponentiated) to obtain Sa(T). PGA can be output as a specific period point (or as the special case T≈0) for engineering seismic design and regional hazard analysis.

[0042] 8. Performance evaluation and robustness verification:

[0043] Overall fit is checked using a scatter plot of predicted versus observed values;

[0044] The total residual is decomposed into inter-event and intra-event residuals, which are examined for systematic bias with magnitude, epicentral distance, V_S30, and focal depth;

[0045] Feature contributions are evaluated using methods such as permutation importance, and the monotonicity and period dependence of Sa with respect to V_S30 are verified with the source and path held fixed;

[0046] The model is compared with and validated against existing regional models;

[0047] Independent events outside the region are selected as an external test set, and the generalization ability is verified using the K-S normality test and accuracy curves.

[0048] A system applicable to the aforementioned seismic response spectrum prediction method includes a data acquisition module, a preprocessing and spectral value calculation module, a feature construction module, a grouping module, an XGBoost training module, and a prediction output module. These modules work together to realize the seismic response spectrum prediction method provided by this invention.

[0049] The following detailed description is provided in conjunction with specific embodiments.

[0050] Example 1

[0051] Active block tectonic regions are characterized by strong geological activity; this embodiment therefore takes the Sichuan-Yunnan region as an example. Based on strong-motion records from the Sichuan-Yunnan region, a prediction model for PGA and the multi-period acceleration response spectrum Sa(T) is established using the XGBoost regression algorithm. The input features are the surface-wave magnitude M_s, the focal depth (Depth), the epicentral distance (Repi), and the site parameter V_S30. The output is the spectral value Sa(T) (or its logarithm) at the target period, which can be used in engineering scenarios such as regional seismic design and seismic hazard analysis.

[0052] (I) Data Acquisition and Sample Screening

[0053] 1. Data Sources and Scope

[0054] Obtain a dataset of strong-motion records in the Sichuan-Yunnan region from 2007 to 2019 (example: 1006 earthquake events, 7132 records), with magnitude M_s of approximately 4.0 to 7.0 and epicentral distance Repi no greater than 300 km. To ensure data quality while maintaining a sufficient sample size, SNR≥2 is used as the screening threshold.
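The screening rule above can be sketched in a few lines of Python. This is a minimal illustration only: the `Record` class, its field names, and the sample values are hypothetical, and the thresholds are simply the ones quoted in the paragraph above (M_s 4.0 to 7.0, Repi ≤ 300 km, SNR ≥ 2).

```python
from dataclasses import dataclass


@dataclass
class Record:
    """Hypothetical container for the metadata of one strong-motion record."""
    event_id: str
    magnitude: float   # surface-wave magnitude M_s
    repi_km: float     # epicentral distance in km
    snr: float         # signal-to-noise ratio


def screen_records(records, m_min=4.0, m_max=7.0, repi_max_km=300.0, snr_min=2.0):
    """Keep records inside the magnitude/distance window that meet the SNR threshold."""
    return [
        r for r in records
        if m_min <= r.magnitude <= m_max
        and r.repi_km <= repi_max_km
        and r.snr >= snr_min
    ]


records = [
    Record("E001", 5.2, 120.0, 3.5),   # kept
    Record("E001", 5.2, 340.0, 4.0),   # dropped: beyond 300 km
    Record("E002", 3.6, 80.0, 5.0),    # dropped: magnitude below 4.0
    Record("E003", 6.1, 45.0, 1.4),    # dropped: SNR below 2
]
kept = screen_records(records)
print(len(kept), "of", len(records), "records kept")
```

In practice the SNR would usually be evaluated per frequency band rather than as a single number per record; the single-value form here is a simplification.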

[0055] 2. Statistical analysis of sample sizes available for different periods

[0056] For each period in the target period set, the number of usable records remaining after screening is counted. This shows how well the data support each period and provides the basis for the later discussion of "bias caused by insufficient long-period samples".

[0057] 3. Magnitude-Distance Coverage Check

[0058] To verify whether the training data coverage is reasonable, a two-dimensional distribution map of magnitude M_s versus epicentral distance Repi is plotted, as shown in Figure 1, to show the distribution density and coverage of the data across magnitude and distance ranges.

[0059] 4. V_S30 Binning and Sample Size Statistics

[0060] V_S30 is binned and the sample count in each bin is tallied, to check site parameter coverage and avoid extreme imbalance of site types during model training.

[0061] (II) Strong-motion preprocessing and response spectrum calculation

[0062] 1. Baseline Correction and Frequency Band Determination

[0063] To reduce low-frequency drift errors introduced by environmental noise and instrument tilt, the ground motion acceleration time history was first baseline corrected; then, the usable frequency range of each record was calculated based on the signal-to-noise ratio, and non-causal filtering was performed within this range.
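As a rough illustration of the baseline-correction step, the sketch below removes a least-squares polynomial trend from an acceleration trace using NumPy. The polynomial order, the synthetic signal, and the function name `baseline_correct` are assumptions made for this example; the text above does not specify the correction scheme. The SNR-based band selection and the non-causal (zero-phase) filter are not shown; a forward-backward filter such as `scipy.signal.filtfilt` is one common way to obtain zero phase.

```python
import numpy as np


def baseline_correct(acc, dt, order=1):
    """Subtract a least-squares polynomial trend (order 1 = straight line)."""
    t = np.arange(len(acc)) * dt
    coeffs = np.polyfit(t, acc, order)
    return acc - np.polyval(coeffs, t)


# Synthetic trace: a decaying 1 Hz oscillation plus an artificial linear drift.
dt = 0.01
t = np.arange(0.0, 20.0, dt)
signal = np.exp(-0.2 * t) * np.sin(2.0 * np.pi * 1.0 * t)
drift = 0.05 + 0.01 * t
corrected = baseline_correct(signal + drift, dt)

# After correction the trace has (numerically) zero mean and zero linear slope.
print(abs(corrected.mean()), abs(np.polyfit(t, corrected, 1)[0]))
```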

[0064] 2. Direction-independent intensity metric RotD50

[0065] RotD50 is calculated for the two filtered horizontal components to unify the influence of recordings from different directions, and the acceleration response spectrum is calculated accordingly.

[0066] 3. Calculation of the 5% damped acceleration response spectrum Sa(T)

[0067] With the damping ratio set to 5%, Sa(T) is calculated over the target period set to form the supervised learning labels for subsequent model training.
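A minimal sketch of how a 5% damped spectral ordinate and its RotD50 value could be computed is given below. It is an assumption-laden illustration, not the procedure of this embodiment: the oscillator is integrated with the Newmark average-acceleration method, Sa is taken as the pseudo-spectral acceleration ω²·max|u|, and RotD50 is taken as the median of the peak response over rotation angles 0° to 179°. Because the oscillator is linear, the rotated response is obtained by rotating the two component responses. The input traces are synthetic.

```python
import numpy as np


def sdof_displacement(acc_g, dt, period, damping=0.05):
    """Relative displacement of a unit-mass SDOF oscillator under ground
    acceleration acc_g (Newmark average acceleration: beta=1/4, gamma=1/2)."""
    omega = 2.0 * np.pi / period
    k, c = omega ** 2, 2.0 * damping * omega
    beta, gamma = 0.25, 0.5
    k_hat = k + gamma * c / (beta * dt) + 1.0 / (beta * dt ** 2)
    a = 1.0 / (beta * dt) + gamma * c / beta
    b = 1.0 / (2.0 * beta) + dt * (gamma / (2.0 * beta) - 1.0) * c
    p = -np.asarray(acc_g, dtype=float)          # effective force for unit mass
    u = np.zeros(len(p))
    vel, acc = 0.0, p[0]                         # at-rest initial conditions
    for i in range(len(p) - 1):
        dp_hat = (p[i + 1] - p[i]) + a * vel + b * acc
        du = dp_hat / k_hat
        dv = (gamma * du / (beta * dt) - gamma * vel / beta
              + dt * (1.0 - gamma / (2.0 * beta)) * acc)
        da = du / (beta * dt ** 2) - vel / (beta * dt) - acc / (2.0 * beta)
        u[i + 1] = u[i] + du
        vel += dv
        acc += da
    return u


def rotd50_sa(acc_h1, acc_h2, dt, period, damping=0.05):
    """Median over rotation angles 0..179 degrees of the pseudo-spectral acceleration."""
    omega = 2.0 * np.pi / period
    u1 = sdof_displacement(acc_h1, dt, period, damping)
    u2 = sdof_displacement(acc_h2, dt, period, damping)
    theta = np.deg2rad(np.arange(180))
    rotated = np.outer(np.cos(theta), u1) + np.outer(np.sin(theta), u2)
    return omega ** 2 * np.median(np.abs(rotated).max(axis=1))


# Synthetic horizontal components (not real records).
dt = 0.01
t = np.arange(0.0, 20.0, dt)
envelope = np.exp(-0.3 * t)
h1 = envelope * np.sin(2.0 * np.pi * 2.0 * t)
h2 = 0.6 * envelope * np.sin(2.0 * np.pi * 1.3 * t + 0.5)
spectrum = {T: rotd50_sa(h1, h2, dt, T) for T in (0.1, 0.5, 1.0)}
for T, sa in spectrum.items():
    print(f"T = {T:>4} s  RotD50 Sa = {sa:.4f}")
```

The result is in the same units as the input acceleration. Production workflows usually rely on a validated strong-motion processing package rather than a hand-written integrator.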

[0068] (III) Acquisition of site parameter V_S30 and completion of missing values

[0069] 1. V_S30 priority source

[0070] V_S30 is taken preferentially from the "Western National Strong Earthquake Station Soil Profile Database and Site Classification Results".

[0071] 2. Completion method for stations with missing values

[0072] For stations missing from the aforementioned database, V_S30 is filled in using the "100 m resolution site condition map of China based on surface geology and bedrock depth".
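The priority rule described above (measured value first, map-derived value as a fallback) can be sketched as follows. The station codes, the two lookup dictionaries, and the source labels are hypothetical placeholders for the profile database and the site condition map.

```python
def complete_vs30(station_ids, measured, map_lookup):
    """Return {station: (vs30, source)}, preferring measured values over map values."""
    completed = {}
    for sid in station_ids:
        if measured.get(sid) is not None:
            completed[sid] = (measured[sid], "profile_database")
        elif sid in map_lookup:
            completed[sid] = (map_lookup[sid], "site_condition_map")
        else:
            completed[sid] = (None, "missing")
    return completed


# Hypothetical inputs: V_S30 in m/s keyed by station code.
measured = {"ST01": 412.0, "ST02": None}
map_lookup = {"ST02": 305.0, "ST03": 540.0}
vs30 = complete_vs30(["ST01", "ST02", "ST03", "ST04"], measured, map_lookup)
for sid, (value, source) in vs30.items():
    print(sid, value, source)
```

Keeping the source label alongside each value makes it possible to check later whether residuals differ between measured and map-derived sites.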

[0073] (IV) Model input and output definition and logarithmic modeling

[0074] 1. Definition of input feature vector

[0075] For each strong-motion record i, the input feature vector x_i is constructed from four source, path, and site features: M_s, Depth, Repi, and V_S30. That is,

$$\mathbf{x}_i = [M_s,\ \mathrm{Depth},\ R_{epi},\ V_{S30}]$$

[0076] 2. Output target and logarithmic transformation

[0077] The output target is the acceleration response spectrum Sa(T) under a given period T. Considering that the seismic intensity index has a log-normal characteristic and must be positive, a logarithmic transformation is performed on the output, and the learning target is defined as lnSa(T), thereby ensuring that the prediction result is positive and improving the regression stability.

[0078] $$y_i(T) = \ln Sa_i(T)$$
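A small sketch of the feature matrix and the log-transformed target is shown below. The record values and dictionary keys are made up for illustration; the point is only that training on ln Sa(T) and exponentiating the prediction always returns a positive spectral value.

```python
import numpy as np

# Hypothetical records: features plus the observed Sa at one period.
records = [
    {"Ms": 5.1, "Depth": 10.0, "Repi": 42.0, "VS30": 350.0, "Sa": 0.081},
    {"Ms": 6.3, "Depth": 15.0, "Repi": 120.0, "VS30": 520.0, "Sa": 0.034},
    {"Ms": 4.4, "Depth": 8.0, "Repi": 25.0, "VS30": 280.0, "Sa": 0.052},
]
FEATURES = ("Ms", "Depth", "Repi", "VS30")

X = np.array([[r[name] for name in FEATURES] for r in records])   # shape (n, 4)
y = np.log([r["Sa"] for r in records])                            # ln Sa(T)


def to_sa(ln_sa_pred):
    """Inverse transform: any real-valued prediction maps to a positive Sa."""
    return np.exp(ln_sa_pred)


print(X.shape, y.round(3), to_sa(np.array([-50.0, 0.0, 3.0])))
```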

[0079] 3. Period-by-period modeling strategy

[0080] For each period T, an independent regressor f_T is trained to learn the mapping x → ln Sa(T); T = 0 corresponds to PGA prediction.

[0081] $$f_T:\ \mathbf{x}_i \mapsto \ln Sa_i(T)$$

[0082] (V) Construction and Training Objectives of XGBoost Regression Model

[0083] 1. Additive Tree Ensemble Prediction Form

[0084] The XGBoost regressor is used; its core idea is to combine multiple regression trees in an additive model to approximate the target function:

[0085] $$\hat{y}_i = \sum_{k=1}^{K} f_k(\mathbf{x}_i), \qquad f_k \in \mathcal{F}$$

[0086] where $\mathcal{F}$ denotes the regression-tree function space and K is the number of trees. Each tree maps a sample to a leaf node and outputs that leaf's weight, which allows nonlinear fitting of the input variables and their interaction effects.

[0087] 2. Training objective: Error term + complexity penalty

[0088] During training, an objective function of the form "error term + complexity penalty" is minimized to balance the model's fitting ability against its generalization ability:

[0089] $$\mathcal{L} = \sum_{i} l(y_i, \hat{y}_i) + \sum_{k=1}^{K} \Omega(f_k)$$

[0090] 3. Log-space squared loss and regularization term

[0091] Squared loss in logarithmic space is used, and overfitting is suppressed by constraining tree complexity through regularization terms (e.g., the number of leaf nodes and the norm of the leaf-node weights):

[0092] $$l(y_i, \hat{y}_i) = (y_i - \hat{y}_i)^2$$

[0093] $$\Omega(f) = \gamma J + \alpha \lVert w \rVert_1 + \tfrac{1}{2} \lambda \lVert w \rVert_2^2$$

[0094] where J is the number of leaf nodes, w is the leaf-node weight vector, α and λ are the L1 and L2 regularization coefficients, respectively, and γ is the penalty for each split.
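To make the "error term + complexity penalty" idea concrete, the sketch below evaluates such an objective by hand for a toy case. It assumes the penalty form commonly documented for XGBoost, Ω(f) = γJ + α‖w‖₁ + ½λ‖w‖₂², with each tree represented only by its list of leaf weights; the numbers are arbitrary.

```python
def tree_penalty(leaf_weights, gamma, alpha, lam):
    """Complexity penalty of one tree: gamma*J + alpha*|w|_1 + 0.5*lam*|w|_2^2."""
    n_leaves = len(leaf_weights)
    l1 = sum(abs(w) for w in leaf_weights)
    l2_sq = sum(w * w for w in leaf_weights)
    return gamma * n_leaves + alpha * l1 + 0.5 * lam * l2_sq


def objective(y_true, y_pred, trees, gamma=1.0, alpha=0.1, lam=1.0):
    """Squared loss in log space plus the summed penalties of all trees."""
    loss = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    return loss + sum(tree_penalty(w, gamma, alpha, lam) for w in trees)


# Two samples (targets are ln Sa values) and two small trees given by leaf weights.
y_true = [1.0, 2.0]
y_pred = [1.5, 1.5]
trees = [[0.5, -0.5], [1.0]]
value = objective(y_true, y_pred, trees)
print(round(value, 6))  # loss 0.5 + penalties 2.35 and 1.6 = 4.45
```

Raising γ discourages extra leaves, while α and λ shrink the leaf weights; in the XGBoost scikit-learn interface these correspond to the `gamma`, `reg_alpha`, and `reg_lambda` parameters.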

[0095] (VI) Event-level grouping and hyperparameter optimization

[0096] 1. Event ID Extraction and Grouping Principles

[0097] Strong ground motion records exhibit significant "correlation within the same event". To avoid the problem of the model remembering event characteristics and resulting in artificially high test accuracy due to the same earthquake event being included in both the training and test sets, the event number is extracted from the record file name as the event ID, and the records are randomly divided into groups based on the event ID, ensuring that all records of the same event are included in only one of the training or test sets.

[0098] 2. Verification of the rationality of the training / test ratio and division

[0099] The example uses GroupShuffleSplit to assign 80% of the data to the training set and 20% to the test set, and plots the distributions of magnitude and epicentral distance for the training and test sets, as shown in Figures 2-3, to verify that the split did not cause a distribution imbalance.
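The event-level split can be sketched with scikit-learn's `GroupShuffleSplit`. The file-naming scheme (`<eventID>_<station>.dat`) and the helper `event_id_from_filename` are hypothetical, since the actual file name format is not given above. Note that `test_size=0.2` here applies to events (groups), so the record-level ratio is only approximately 80/20.

```python
import re

import numpy as np
from sklearn.model_selection import GroupShuffleSplit


def event_id_from_filename(name):
    """Extract the event ID, assuming a hypothetical '<eventID>_<station>.dat' scheme."""
    match = re.match(r"([^_]+)_", name)
    if match is None:
        raise ValueError(f"cannot parse event ID from {name!r}")
    return match.group(1)


# 10 synthetic events with 3 records each.
file_names = [f"EV{e:03d}_ST{s:02d}.dat" for e in range(10) for s in range(3)]
groups = np.array([event_id_from_filename(n) for n in file_names])
X = np.zeros((len(file_names), 4))          # placeholder feature matrix

splitter = GroupShuffleSplit(n_splits=1, test_size=0.2, random_state=0)
train_idx, test_idx = next(splitter.split(X, groups=groups))

train_events = set(groups[train_idx])
test_events = set(groups[test_idx])
print(len(train_events), "training events,", len(test_events), "test events")
print("shared events:", train_events & test_events)
```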

[0100] 3. Period-by-period training and key hyperparameter set

[0101] For each period T, (x_i, ln Sa_i(T)) is used as the training data and XGBRegressor is used for fitting. Hyperparameters such as learning rate, number of trees, maximum depth, min_child_weight, subsample, colsample_bytree, and L1 / L2 regularization are tuned. The names and abbreviations of these hyperparameters are listed in the table below.

[0102]
No.  Hyperparameter
1    Learning rate (learning_rate)
2    Number of trees (n_estimators)
3    Maximum tree depth (max_depth)
4    Minimum child weight (min_child_weight)
5    Subsample ratio (subsample)
6    Column sampling ratio per tree (colsample_bytree)
7    L1 regularization (alpha)
8    L2 regularization (lambda)

[0103] The table below shows the tuned value of each hyperparameter and the resulting coefficient of determination R² at each period.

[0104]
Period T (s)  n_estimators  max_depth  min_child_weight  subsample  colsample_bytree  L1 (alpha)  L2 (lambda)  learning_rate  R²
PGA           450           6          0.5               0.8        0.8               0.1         1            0.02           0.6104
0.05          350           6          0.5               0.8        0.8               0.1         1            0.03           0.5983
0.1           450           6          0.5               0.8        0.8               0.1         1            0.05           0.5954
0.2           450           6          0.5               0.8        0.8               0.1         1            0.04           0.6344
0.3           400           6          0.5               0.8        0.8               0.1         1            0.06           0.6717
0.4           450           6          0.5               0.8        0.8               0.1         1            0.06           0.7012
0.5           450           6          0.5               0.8        0.8               0.1         1            0.06           0.7156
1             400           6          0.5               0.8        0.8               0.1         1            0.06           0.7391
1.5           450           6          0.5               0.8        0.8               0.1         1            0.06           0.7760
2             450           6          0.5               0.8        0.8               0.1         1            0.06           0.8040
3             450           6          0.5               0.8        0.8               0.1         1            0.06           0.8223
4             450           6          0.5               0.8        0.8               0.1         1            0.06           0.8220
5             450           6          0.5               0.8        0.8               0.1         1            0.05           0.8092
7.5           450           6          0.5               0.8        0.8               0.1         1            0.06           0.8099
10            450           6          0.5               0.8        0.8               0.1         1            0.06           0.7838

[0105] (VII) Presentation of model fitting results and residual analysis

[0106] 1. Scatter comparison of observed and predicted values

[0107] After training, a scatter plot comparing the observed and predicted values on the training and test sets is drawn, as shown in Figure 4, to show visually whether the fitted points are distributed around y=x.

[0108] 2. Definition of Total Residual

[0109] To quantitatively evaluate the prediction bias of the model, the total residual is defined in log space:

[0110] $$\Delta_{e,i} = \ln Y_{e,i}^{obs} - \ln Y_{e,i}^{pred}$$

[0111] where $Y_{e,i}^{obs}$ is the observed intensity measure for event e at record i, and $Y_{e,i}^{pred}$ is the model's predicted value.

[0112] 3. Residual decomposition: Inter-event residuals + Intra-event residuals

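[0113] The total residual is decomposed into the sum of the inter-event residual and the intra-event residual, which distinguishes "overall event bias" from "dispersion within the same event".

[0114] $$\Delta_{e,i} = \delta B_e + \delta W_{e,i}$$

[0115]

[0116] where $\Delta_{e,i}$ is the total residual of record i in event e, $\delta B_e$ is the inter-event residual of event e, which reflects whether the event as a whole is systematically stronger or weaker than the model prediction, and $\delta W_{e,i}$ is the intra-event residual of record i within that event.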
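One simple way to carry out this decomposition is sketched below. It assumes the total residual is taken as observed minus predicted in log space and that the inter-event term is estimated as the per-event mean of the total residuals; a mixed-effects regression is a common alternative, and the text above does not state which estimator is used. The numbers are invented.

```python
import numpy as np


def decompose_residuals(ln_obs, ln_pred, event_ids):
    """Split total log residuals into per-event means and within-event remainders."""
    total = np.asarray(ln_obs, dtype=float) - np.asarray(ln_pred, dtype=float)
    event_ids = np.asarray(event_ids)
    inter = {str(e): total[event_ids == e].mean() for e in np.unique(event_ids)}
    intra = total - np.array([inter[str(e)] for e in event_ids])
    return total, inter, intra


ln_obs = [1.0, 1.2, 0.4, 0.6, 0.5]
ln_pred = [0.8, 0.8, 0.6, 0.6, 0.6]
events = ["A", "A", "B", "B", "B"]
total, inter, intra = decompose_residuals(ln_obs, ln_pred, events)

print("inter-event:", {e: round(v, 3) for e, v in inter.items()})
print("intra-event:", intra.round(3))
```

With this estimator the intra-event residuals of each event average to zero by construction, so any remaining trend with distance, V_S30, or depth shows up only in the within-event scatter.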

[0117] 4. Checking inter-event residuals for systematic trends with magnitude

[0118] The inter-event residual of each event is calculated, and its distribution with magnitude is plotted for different periods, as shown in Figure 5, to check whether the model has an event-level systematic bias that varies with magnitude.

[0119] 5. Checking intra-event residuals for systematic trends with key variables

[0120] After the event term is obtained, the intra-event residuals are defined, and their distribution trends with epicentral distance, V_S30, and focal depth are plotted, as shown in Figures 6-8, to examine whether there is a systematic bias that varies with the input variables.

[0121] (VIII) Attenuation law and response spectrum shape

[0122] 1. PGA attenuation trend with epicentral distance

[0123] The predicted PGA is plotted against epicentral distance in log-log coordinates, as shown in Figure 9, with the observed values overlaid, to test the model's overall ability to characterize path attenuation.

[0124] 2. Response spectrum curves under different distances and magnitude scenarios

[0125] To demonstrate that the spectra output by the model are reasonable, epicentral distances are fixed (e.g., 25, 75, 150, 250 km) and the response spectrum curves for different magnitudes are plotted, as shown in Figure 10, to examine how the spectral shape varies with magnitude and distance.

[0126] (IX) Model interpretability analysis and physical consistency verification

[0127] 1. Assessing feature contributions with permutation importance

[0128] To improve interpretability, permutation importance is applied on the test set: the values of one feature at a time are randomly shuffled and the resulting increase in model error is measured; the larger the increase, the more critical the feature. The importance of M_s, Depth, Repi, and V_S30 is plotted for each period, as shown in Figure 11, to identify the controlling factors at different periods.
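The shuffling procedure described above can be written model-agnostically in a few lines. This sketch uses mean squared error as the error measure and a toy linear function in place of a trained regressor; both are assumptions for illustration. scikit-learn also offers `sklearn.inspection.permutation_importance` for fitted estimators.

```python
import numpy as np


def permutation_importance_mse(predict, X, y, n_repeats=10, seed=0):
    """Mean increase in MSE when each column of X is shuffled in turn."""
    rng = np.random.default_rng(seed)
    base_mse = np.mean((predict(X) - y) ** 2)
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        increases = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            X_perm[:, j] = rng.permutation(X_perm[:, j])   # break the link to y
            increases.append(np.mean((predict(X_perm) - y) ** 2) - base_mse)
        importances[j] = np.mean(increases)
    return importances


def toy_predict(X):
    """Stand-in model: depends strongly on column 0, weakly on column 2 only."""
    return 2.0 * X[:, 0] + 0.5 * X[:, 2]


names = ["Ms", "Depth", "Repi", "VS30"]
X_test = np.random.default_rng(1).normal(size=(500, 4))
y_test = toy_predict(X_test)
imp = permutation_importance_mse(toy_predict, X_test, y_test)
for name, value in zip(names, imp):
    print(f"{name:>5}: {value:.3f}")
```

Features the model never uses (columns 1 and 3 in this toy case) receive an importance of exactly zero, because shuffling them leaves the predictions unchanged.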

[0129] 2. Verification of the physical consistency of site effects

[0130] With magnitude, epicentral distance, and focal depth held fixed, V_S30 is varied as the sole independent variable of the model to obtain predicted curves of Sa versus V_S30 at different periods, as shown in Figure 12. These curves are checked for the period-dependent characteristics of site effects, namely "Sa decreases overall as V_S30 increases" and "more sensitive at short periods, more gradual at medium and long periods".
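A sketch of this consistency check is given below: the source and path features are held fixed, V_S30 is swept over a grid, and the predicted Sa curve is tested for a non-increasing trend. The function names, the grid, and the toy predictor standing in for a trained per-period model are assumptions for illustration.

```python
import numpy as np


def vs30_response_curve(predict_ln_sa, ms, depth_km, repi_km, vs30_values):
    """Predicted Sa over a V_S30 grid with magnitude, depth and distance fixed."""
    vs30_values = np.asarray(vs30_values, dtype=float)
    n = vs30_values.size
    X = np.column_stack([np.full(n, ms), np.full(n, depth_km),
                         np.full(n, repi_km), vs30_values])
    return np.exp(predict_ln_sa(X))


def is_non_increasing(values, tol=0.0):
    """True if the sequence never rises by more than tol between neighbours."""
    return bool(np.all(np.diff(values) <= tol))


def toy_predict_ln_sa(X):
    """Stand-in for a trained model; columns are [Ms, Depth, Repi, VS30]."""
    return 1.0 * X[:, 0] - 1.3 * np.log(X[:, 2] + 10.0) - 0.5 * np.log(X[:, 3] / 500.0)


vs30_grid = np.linspace(150.0, 1000.0, 50)
sa_curve = vs30_response_curve(toy_predict_ln_sa, 6.0, 10.0, 50.0, vs30_grid)
print("Sa decreases with V_S30:", is_non_increasing(sa_curve))
```

Tree ensembles produce piecewise-constant curves with small local wiggles, so a nonzero `tol` or a check on the overall trend may be more appropriate for a real trained model.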

[0131] (X) Comparison and verification with existing regional models

[0132] To verify the reasonableness of the model of this invention within the common range of magnitudes and distances, an existing ground motion prediction model for the Sichuan-Yunnan region is selected as a baseline, and the predictions of the two models are compared under scenarios of different magnitude M_s and epicentral distance R, to determine the consistency of the spectral values and attenuation trends and the sources of any differences. The results are shown in Figure 13.

[0133] The above embodiments are not intended to limit the present invention. Unless otherwise explicitly specified and limited, the terms "set," "install," "connect," and "link" should be interpreted broadly. For example, they can refer to a fixed connection, a detachable connection, or an integral connection; a mechanical connection or an electrical connection; a direct connection or an indirect connection through an intermediate medium; or a connection within two components. Those skilled in the art can understand the specific meaning of the above terms in this application based on the specific circumstances. The present invention is not limited to the above examples. Changes, modifications, additions, or substitutions made by those skilled in the art within the scope of the technical solutions of the present invention are also within the protection scope of the present invention. Furthermore, the technical features involved in the different embodiments of the present application described above can be combined with each other as long as they do not conflict with each other.

[0134] It will be apparent to those skilled in the art that the present invention is not limited to the details of the exemplary embodiments described above, and that the invention can be implemented in other specific forms without departing from its spirit or essential characteristics. Therefore, the embodiments should be considered in all respects as exemplary and non-limiting, and the scope of the invention is defined by the appended claims rather than the foregoing description. Thus, all variations falling within the meaning and scope of equivalents of the claims are intended to be included within the present invention. No reference numerals in the claims should be construed as limiting the scope of the claims.

Claims

1. A method for predicting the ground motion response spectrum of active blocks based on the XGBoost algorithm, characterized in that it includes the following steps: S1. Data Acquisition and Screening: acquire strong-motion records and event information for the active block region; S2. Strong-motion preprocessing and response spectrum calculation: obtain the peak ground acceleration PGA and the acceleration response spectrum Sa(T) at the target periods; S3. Feature Construction and Logarithmic Modeling: construct an input feature vector, wherein the input features include at least the magnitude M_s, the focal depth (Depth), the epicentral distance (Repi), and the site shear-wave velocity V_S30; S4. XGBoost regression model construction and training objective: train the XGBoost regression model with PGA and Sa(T) as output targets; S5. Event-level grouping and hyperparameter optimization, including: Step 1: event ID extraction and grouping; Step 2: period-by-period training and key hyperparameter set; S6. Output and Application: input the features of the earthquake event to be predicted into the trained model, and output PGA and Sa(T).

2. The method for predicting the ground motion response spectrum of active blocks based on the XGBoost algorithm according to claim 1, characterized in that the data screening in S1 specifically involves limiting the magnitude and distance range of the acquired earthquake data, and screening valid samples according to the signal-to-noise ratio threshold to form a set of records that can be used for modeling.

3. The method for predicting the ground motion response spectrum of active blocks based on the XGBoost algorithm according to claim 1, characterized in that the preprocessing in S2 includes baseline correction of the strong-motion record, determination of the usable frequency band based on the signal-to-noise ratio, and filtering; calculation of RotD50 from the two horizontal components; and calculation of the 5% damped acceleration response spectrum Sa(T) for different periods.

4. The method for predicting the ground motion response spectrum of active blocks based on the XGBoost algorithm according to claim 1, characterized in that the site shear-wave velocity V_S30 in S3 is taken preferentially from the station soil profile database and site classification results, and when station measurements are missing, V_S30 is filled in using site condition maps or site parameter databases.

5. The method for predicting the ground motion response spectrum of active blocks based on the XGBoost algorithm according to claim 1, characterized in that step one in S5 specifically involves dividing the training set and the test set based on the earthquake event number, so that all records of the same earthquake event are included in only one of the training set or the test set.

6. The method for predicting the ground motion response spectrum of active blocks based on the XGBoost algorithm according to claim 1, characterized in that step two in S5 specifically involves establishing a corresponding XGBRegressor regression model for each period T, and using GridSearchCV to optimize the learning rate, number of trees, maximum depth, subsample ratio, column sampling ratio, and regularization coefficient to obtain the optimal model parameters for each period.

7. A system applicable to the ground motion response spectrum prediction method according to any one of claims 1-6, characterized in that it includes a data acquisition module, a preprocessing and spectral value calculation module, a feature construction module, a grouping and partitioning module, an XGBoost training module, and a prediction output module.