A method for managing used car transaction information
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- SHANGHAI SOCHENG INFORMATION TECH CO LTD
- Filing Date
- 2025-07-02
- Publication Date
- 2026-06-19
Figure CN120807010B_ABST
Abstract
Description
Technical Field
[0001] This application relates to the field of image data processing, and in particular to a method for managing used car transaction information.
Background Technology
[0002] With the continuous maturation of the automobile consumer market and the accelerated pace of new car replacement, the used car market is experiencing rapid growth. Used car price assessment, as a core link in the transaction, directly impacts market transaction efficiency and the protection of participants' interests. Accurate price forecasting not only provides reasonable pricing references for both buyers and sellers but also reduces transaction risks and enhances market transparency and liquidity. Traditional used car price assessment methods mainly rely on the experience and judgment of professional appraisers or linear regression analysis based on simple statistical models. However, these methods have significant limitations: on the one hand, manual assessment is influenced by the appraiser's personal experience and subjective factors, making it difficult to guarantee the consistency and objectivity of the assessment results; on the other hand, simple statistical models cannot effectively handle the complex nonlinear relationships and multi-dimensional influencing factors in the used car market, especially when facing complex factors such as brand effects, model differences, and configuration changes, where the prediction accuracy is often insufficient.
[0003] In practical applications of used car price prediction, the problem of data sparsity is particularly prominent. The automotive market exhibits a typical long-tail distribution, with popular models experiencing frequent transactions and abundant data, while transaction records for a large number of less popular models or specific configuration versions are relatively scarce. This data imbalance leads to significant technical challenges: for mainstream models with abundant transaction records, existing machine learning methods can achieve good predictive results; however, for models with sparse data, the models often suffer from overfitting or low prediction accuracy due to insufficient training samples.
[0004] Existing technologies attempt to alleviate sparsity issues by adding similar vehicle data or using collaborative filtering, but these methods have the following shortcomings: First, the definition of similar vehicles is often too coarse, aggregating only based on brand or model series, ignoring the differences between models in key dimensions such as configuration and performance; second, there is a lack of effective hierarchical information utilization mechanisms, making it impossible to fully explore the inherent patterns contained in the vehicle classification system; finally, the information aggregation method is singular, making it difficult to balance the weight distribution of information on vehicles of the same type and information on similar vehicles across categories.
[0005] Therefore, there is an urgent need for a technical solution to improve the accuracy of sparse data vehicle price prediction.
Summary of the Invention
[0006] To address the issue of low accuracy in predicting vehicle prices using sparse data, this application provides a method for managing used car transaction information. By employing a hierarchical graph neural network and multi-source information aggregation, the accuracy of used car price prediction is improved.
[0007] This application provides a method for managing used car transaction information, including:
S1, acquiring historical used car transaction data and performing hierarchical processing on the acquired data to obtain a hierarchical dataset; the hierarchical dataset adopts a three-level brand-series-model data structure, where the brand node is the root node, the series node is the intermediate node, and the model node is the leaf node, and each node stores the corresponding transaction records and price data;
S2, establishing a hierarchical graph neural network model based on the hierarchical dataset through transfer learning;
S3, establishing a residual-value-rate-based evaluation model from the historical used car transaction data;
S4, mapping the data of the vehicle to be evaluated to the corresponding node in the hierarchical dataset and calculating the data sufficiency coefficient of the vehicle to be evaluated; selecting either the hierarchical graph neural network model or the evaluation model to predict the used car price according to the data sufficiency coefficient;
S5, managing the used car transaction information based on the predicted used car price.
[0008] Furthermore, obtaining the hierarchical dataset includes: collecting historical transaction data of used cars, which includes fields such as basic vehicle information, transaction price, transaction time, vehicle configuration parameters, mileage, and vehicle condition level; classifying the historical transaction data by brand, grouping all transaction records of the same brand under the corresponding brand node; under each brand node, classifying the historical transaction data by vehicle series, establishing vehicle series intermediate nodes, and grouping transaction records of the same vehicle series under the corresponding vehicle series node; under each vehicle series node, classifying transaction records by vehicle model, establishing vehicle model leaf nodes, and storing transaction records of the same vehicle model in the corresponding vehicle model node; calculating statistical parameters for brand nodes, vehicle series nodes, and vehicle model nodes, including the number of transaction records, average price, price variance, and transaction time; wherein the statistical parameters of a brand node are calculated by aggregating the data of all subordinate vehicle series nodes, and the statistical parameters of a vehicle series node are calculated by aggregating the data of all subordinate vehicle model nodes, thereby generating the three-level brand-series-model data structure.
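As an illustration only, the grouping and bottom-up aggregation described in this paragraph could be sketched as follows. The record keys (`brand`, `series`, `model`, `price`, `time`) and the choice of the latest transaction time as the "transaction time" statistic are assumptions made for the sketch and are not specified in the application.

```python
from collections import defaultdict
from statistics import mean, pvariance

def build_hierarchy(records):
    """Group records into brand -> series -> model (record keys are hypothetical)."""
    tree = defaultdict(lambda: defaultdict(lambda: defaultdict(list)))
    for r in records:
        tree[r["brand"]][r["series"]][r["model"]].append(r)
    return tree

def node_stats(records):
    """Statistical parameters of one node computed from its (aggregated) records."""
    prices = [r["price"] for r in records]
    return {
        "count": len(prices),
        "avg_price": mean(prices),
        "price_var": pvariance(prices),
        # ASSUMPTION: keep the most recent transaction time as the time statistic.
        "latest_time": max(r["time"] for r in records),
    }

def series_records(series_node):
    """Aggregate the records of all model leaves under one series node."""
    return [r for recs in series_node.values() for r in recs]

def brand_records(brand_node):
    """Aggregate the records of all series nodes under one brand node."""
    return [r for s in brand_node.values() for r in series_records(s)]

records = [
    {"brand": "A", "series": "3", "model": "1.5T", "price": 20.0, "time": "2024-01"},
    {"brand": "A", "series": "3", "model": "1.5T", "price": 22.0, "time": "2024-03"},
    {"brand": "A", "series": "3", "model": "2.0T", "price": 30.0, "time": "2024-02"},
    {"brand": "A", "series": "X3", "model": "2.0T", "price": 36.0, "time": "2024-05"},
]
tree = build_hierarchy(records)
leaf = node_stats(tree["A"]["3"]["1.5T"])
series = node_stats(series_records(tree["A"]["3"]))
brand = node_stats(brand_records(tree["A"]))
```

Series and brand statistics are recomputed from the pooled leaf records rather than by averaging child averages, so nodes with many transactions carry proportionally more weight.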
[0009] Furthermore, establishing the hierarchical graph neural network model through transfer learning includes: constructing a hybrid graph structure containing hierarchical edges and similarity edges based on the three-level brand-series-model data structure; extracting node features from the hybrid graph structure and performing multi-dimensional feature vectorization to obtain a node feature matrix containing statistical features, configuration features, and temporal features; training a graph convolutional neural network on the brand layer data in the node feature matrix to obtain the brand layer graph neural network weight parameters; based on the brand layer weight parameters, performing transfer learning training from the brand layer to the vehicle series layer to obtain the vehicle series layer graph neural network weight parameters; based on the vehicle series layer weight parameters, performing transfer learning training from the vehicle series layer to the vehicle model layer to obtain the vehicle model layer graph neural network weight parameters; identifying sparse vehicle model nodes whose number of transaction records in the statistical features is below a preset threshold; performing multi-source information aggregation on the sparse vehicle model nodes based on the hierarchical edges and similarity edges to obtain the sparse node enhanced feature matrix; and constructing the hierarchical graph neural network model from the weight parameters of the three layers and the sparse node enhanced feature matrix.
[0010] In particular, in existing approaches a sparse model can obtain information only from models within the same series, yet models within the same series may differ greatly in configuration (such as the 1.4T and 3.0T versions of the brand B-A4); moreover, there is no connection between similar models across series and brands (such as the competitive relationship between the brand A-3 series and the brand B-A4).
[0011] On the one hand, this application adds a similarity calculation mechanism based on vehicle configuration parameters. For example, it can identify the similarity between the brand B-A4 1.4T and the brand A-3 series 1.5T, instead of forcibly aggregating the former with the B-A4 3.0T, thereby improving the accuracy of information aggregation.
[0012] On the other hand, the original single-vehicle-series information aggregation is expanded to multi-source information aggregation; sparse models can not only obtain information from the same vehicle series, but also from cross-brand models with similar configurations, which significantly improves the sparsity processing capability.
[0013] Furthermore, constructing the hybrid graph structure containing hierarchical edges and similarity edges includes: extracting node relationships from the three-level data structure, marking brand nodes, vehicle series nodes, and vehicle model nodes as different types of graph vertices; establishing hierarchical edge connections by creating directed edges between brand nodes and their subordinate vehicle series nodes, and between vehicle series nodes and their subordinate vehicle model nodes, forming a hierarchical graph structure; extracting vehicle configuration parameters from the transaction records stored in the vehicle model nodes, including numerical configuration data such as engine displacement, maximum power, body length, body width, body height, and official guide price; and performing Z-score standardization on the configuration data to eliminate the influence of different parameter dimensions and numerical ranges.
[0014] The configuration similarity between vehicle model nodes is calculated using the weighted Euclidean distance algorithm, in which the distance between vehicle model nodes i and j over the standardized configuration parameters is $d_{ij} = \sqrt{\sum_{k} w_k (x_{ik} - x_{jk})^2}$, where $x_{ik}$ is the k-th standardized configuration parameter of node i and $w_k$ is the weight coefficient for the k-th configuration parameter; a similarity threshold θ is set, and when the similarity between two vehicle model nodes meets the threshold θ, an undirected similarity edge is established between the corresponding vehicle model nodes; the hierarchical edges and similarity edges are merged to construct a hybrid graph topology containing hierarchical and similarity relationships, forming the adjacency matrix representation of the graph.
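The standardization and similarity-edge steps of paragraphs [0013] and [0014] might look like the sketch below. The mapping from distance to similarity, `1 / (1 + d)`, and the comparison `>= theta` are assumptions chosen for illustration, since the formula in the source text is not legible; the sample configuration values and weights are likewise hypothetical.

```python
import math
from statistics import mean, pstdev

def zscore_columns(rows):
    """Z-score standardize each configuration column across all vehicle models."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sigmas = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [[(v - m) / s for v, m, s in zip(row, mus, sigmas)] for row in rows]

def weighted_distance(x, y, weights):
    """Weighted Euclidean distance between two configuration vectors."""
    return math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, y)))

def similarity(x, y, weights):
    """ASSUMPTION: map the distance into (0, 1] with 1 / (1 + d)."""
    return 1.0 / (1.0 + weighted_distance(x, y, weights))

def similarity_edges(features, weights, theta):
    """Undirected edges (i, j) with i < j whose similarity is at least theta."""
    n = len(features)
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if similarity(features[i], features[j], weights) >= theta]

# Hypothetical configurations: displacement (L), max power (kW), guide price.
raw = [[1.4, 110.0, 30.0], [1.5, 115.0, 31.0], [3.0, 250.0, 60.0]]
weights = [0.4, 0.4, 0.2]
z = zscore_columns(raw)
edges = similarity_edges(z, weights, theta=0.5)
```

With these sample values only the 1.4 L and 1.5 L models are linked, which mirrors the document's example of pairing similarly configured models rather than a 1.4T with a 3.0T.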
[0015] Furthermore, obtaining the weight parameters of the brand layer graph neural network includes: extracting brand-level node data from the node feature matrix by node type, and obtaining brand node statistical feature data calculated from the statistical parameters; performing data aggregation on the vehicle configuration features under each brand node, calculating the mean and standard deviation of the configuration parameters, and generating brand-level configuration feature data; and concatenating the statistical feature data and configuration feature data column by column to form the brand layer feature matrix;
[0016] Statistical feature columns are extracted from the brand layer feature matrix, and the overlap of average price ranges, the price variance similarity, and the transaction activity similarity among brands are calculated; the three similarity indicators are weighted and summed to obtain the overall similarity value among brands; the overall similarity value is binarized according to a preset similarity threshold to generate inter-brand connection relationship data; this connection relationship data is then converted into adjacency matrix format, and self-connection markers are added at the diagonal positions to form the brand layer adjacency matrix;
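A minimal sketch of the weighted-sum, binarization, and self-connection steps is given below. It takes the three pairwise similarity indicators as already-computed matrices, because the application does not state how each indicator is computed; the indicator weights, the threshold, and the `>=` comparison are hypothetical.

```python
def build_adjacency(indicator_matrices, indicator_weights, threshold):
    """Weighted-sum pairwise similarity matrices, binarize, and add self-connections."""
    n = len(indicator_matrices[0])
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                adj[i][j] = 1  # self-connection marker on the diagonal
                continue
            score = sum(w * m[i][j]
                        for w, m in zip(indicator_weights, indicator_matrices))
            # ASSUMPTION: connect when the overall similarity is at least the threshold.
            adj[i][j] = 1 if score >= threshold else 0
    return adj

# Hypothetical indicator matrices for three brands.
price_overlap = [[1.0, 0.8, 0.1], [0.8, 1.0, 0.2], [0.1, 0.2, 1.0]]
variance_sim = [[1.0, 0.7, 0.3], [0.7, 1.0, 0.1], [0.3, 0.1, 1.0]]
activity_sim = [[1.0, 0.9, 0.2], [0.9, 1.0, 0.4], [0.2, 0.4, 1.0]]

brand_adj = build_adjacency(
    [price_overlap, variance_sim, activity_sim],
    indicator_weights=[0.5, 0.25, 0.25],
    threshold=0.6,
)
```

Here the first two brands are connected and the third stays isolated apart from its self-connection.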
[0017] The non-zero elements in each row of the brand layer adjacency matrix are counted to generate a degree vector; the degree vector is converted into a diagonal degree matrix; matrix operations are performed according to the normalization formula to obtain the normalized adjacency matrix data;
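The normalization formula itself is not legible in the source text. The sketch below assumes the symmetric form commonly used with graph convolutional networks, in which the adjacency matrix is multiplied on both sides by the inverse square root of the diagonal degree matrix; this choice is an assumption rather than the application's stated formula.

```python
import math

def normalize_adjacency(adj):
    """ASSUMED form: D^{-1/2} A D^{-1/2}, with D built from per-row non-zero counts."""
    degree = [sum(1 for v in row if v != 0) for row in adj]  # degree vector
    inv_sqrt = [1.0 / math.sqrt(d) if d > 0 else 0.0 for d in degree]
    n = len(adj)
    return [[inv_sqrt[i] * adj[i][j] * inv_sqrt[j] for j in range(n)]
            for i in range(n)]

adj = [[1, 1, 0], [1, 1, 0], [0, 0, 1]]
norm = normalize_adjacency(adj)
```

Because self-connections are already on the diagonal, every row has a degree of at least one and the zero-degree guard is only a safety net.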
[0018] The first-layer weight matrix is initialized as a random numerical matrix of size [feature dimension × 64], and the first-layer bias vector is initialized as a 64-dimensional zero vector; the second-layer weight matrix is initialized as a random numerical matrix of size [64 × 32], and the second-layer bias vector is initialized as a 32-dimensional zero vector;
[0019] The first-layer graph convolution matrix operation is performed on the normalized adjacency matrix, the brand layer feature matrix, and the first-layer weight matrix and bias vector to obtain the first-layer linear output data; the first-layer linear output data is processed by the ReLU activation function to generate the first-layer hidden feature matrix; the second-layer graph convolution matrix operation is then performed on the first-layer hidden feature matrix with the second-layer weight matrix and bias vector to obtain the brand layer prediction output data;
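The two graph convolution operations can be illustrated with the usual layer form, normalized adjacency times features times weights plus bias, with ReLU between the layers. That layer form is an assumption (the formulas are missing from the source text), and the tiny matrix sizes below replace the 64 and 32 hidden dimensions purely for readability.

```python
def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def add_bias(m, bias):
    """Add a bias vector to every row of a matrix."""
    return [[v + b for v, b in zip(row, bias)] for row in m]

def relu(m):
    """Element-wise ReLU."""
    return [[max(0.0, v) for v in row] for row in m]

def gcn_forward(a_norm, x, w1, b1, w2, b2):
    """ASSUMED layer form: Z1 = A X W1 + b1, H1 = ReLU(Z1), Y = A H1 W2 + b2."""
    z1 = add_bias(matmul(matmul(a_norm, x), w1), b1)
    h1 = relu(z1)
    return add_bias(matmul(matmul(a_norm, h1), w2), b2)

a_norm = [[0.5, 0.5], [0.5, 0.5]]   # normalized adjacency of two connected nodes
x = [[1.0, -1.0], [2.0, 0.0]]       # two nodes, two features each
w1 = [[1.0, 0.0], [0.0, 1.0]]       # stands in for the [feature dimension x 64] matrix
b1 = [0.0, 0.0]
w2 = [[1.0], [1.0]]                 # stands in for the [64 x 32] matrix
b2 = [0.5]
pred = gcn_forward(a_norm, x, w1, b1, w2, b2)
```

Both nodes receive the same prediction here because the normalized adjacency averages their features before each layer.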
[0020] Average price data is extracted from the statistical parameters and used as the true label vector; the mean square error between the predicted output data and the true label vector is calculated as the loss value; the Adam optimizer is used to calculate the gradient update amounts of the weight matrices and bias vectors based on the loss value; these gradient update amounts are then applied to the weight matrices and bias vectors to complete the iterative update of the parameter data;
[0021] After each training round, the loss value on the validation set data is calculated and compared with the historical minimum loss value; when the validation set loss value is less than the historical minimum, the current parameter data is saved as the optimal parameters; the optimal parameter data is then organized into the brand layer weight parameter set, which serves as the initial parameter data for vehicle series-level transfer learning.
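The loss and the keep-the-best-parameters rule from paragraphs [0020] and [0021] are sketched below. The optimizer step is deliberately left out; the sketch only shows the mean square error and how the optimal parameter set could be tracked across rounds, with a hypothetical list of per-round validation losses and parameter snapshots standing in for real training.

```python
import copy

def mse(pred, target):
    """Mean square error between predictions and true labels."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)

def track_best(rounds):
    """Return the lowest validation loss and a copy of the parameters that produced it.

    `rounds` yields (validation_loss, params) pairs, one per training round.
    """
    best_loss, best_params = float("inf"), None
    for loss, params in rounds:
        if loss < best_loss:  # strictly less than the historical minimum
            best_loss, best_params = loss, copy.deepcopy(params)
    return best_loss, best_params

# Hypothetical per-round validation losses and parameter snapshots.
history = [(0.9, {"W1": [[0.1]]}), (0.4, {"W1": [[0.2]]}), (0.6, {"W1": [[0.3]]})]
best_loss, best_params = track_best(history)
```

Copying the parameters when they are saved keeps the stored optimum from being overwritten by later in-place updates.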
[0022] Specifically, brand nodes, as root nodes, are independent of each other and lack a natural hierarchical relationship. Directly extracting them for graph convolution would yield an adjacency matrix with almost zero connectivity, interrupting information propagation. By "binarizing the comprehensive similarity value according to a preset similarity threshold," this application transforms the originally isolated brand nodes into a graph structure with stable connections. The similarity threshold mechanism ensures that connections are established only between truly similar brands (such as brand H and brand A, or brand D and brand E), avoiding unreasonable brand associations (such as brand F and brand G). The reconstructed brand layer adjacency matrix has a suitable degree of sparsity, maintaining brand differentiation while ensuring effective information propagation during graph convolution. Through the graph convolution computation, each brand node can learn the pricing patterns and market characteristics of similar brands, which improves the generalization ability of the brand layer weight parameters. This technical path of "isolated nodes → similarity connections → effective graph structure" fundamentally solves the feasibility problem of training the brand layer graph neural network.
[0023] Furthermore, obtaining the weight parameters of the vehicle series layer graph neural network includes: extracting vehicle series-level node data from the node feature matrix by node type, and obtaining vehicle series statistical feature data calculated from the statistical parameters of the vehicle series nodes; performing data aggregation on the configuration features of the vehicle model nodes under each vehicle series node, calculating the mean and standard deviation of the configuration parameters, and generating vehicle series-level configuration feature data; and concatenating the vehicle series statistical feature data and the vehicle series-level configuration feature data column by column to form the vehicle series layer feature matrix;
[0024] Statistical feature columns are extracted from the vehicle series layer feature matrix, and the overlap of average price ranges, the price variance similarity, and the transaction activity similarity among vehicle series are calculated; taking into account the brand affiliation of each vehicle series, the weight of similarity within the same brand is increased, while the weight of similarity across brands is adjusted; the adjusted similarity indicators are weighted and summed to obtain the overall similarity value among vehicle series; the overall similarity value is binarized according to a preset vehicle series similarity threshold to generate vehicle series connection relationship data; this connection relationship data is converted into adjacency matrix format, and self-connection markers are added at the diagonal positions to form the vehicle series layer adjacency matrix;
[0025] The non-zero elements in each row of the vehicle series layer adjacency matrix are counted to generate a vehicle series degree vector; the vehicle series degree vector is converted into a diagonal degree matrix; matrix operations are performed according to the normalization formula to obtain the vehicle series layer normalized adjacency matrix data;
[0026] The first-layer weight matrix and bias vector are extracted from the brand layer graph neural network weight parameters as the initial first-layer weight and initial first-layer bias of the vehicle series layer; the second-layer weight matrix and bias vector are likewise extracted from the brand layer weight parameters as the initial second-layer weight and initial second-layer bias of the vehicle series layer; based on the feature dimension of the vehicle series layer feature matrix, the dimensions of the initial weights are adapted to ensure compatibility of matrix operations.
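The application does not say how the transferred weights are adapted when the feature dimension changes between layers. One possible approach, shown purely as an assumption, is to copy the parent parameters and truncate or zero-pad the rows of the first-layer weight matrix so that it accepts the child layer's feature dimension; the parameter dictionary keys are hypothetical.

```python
def adapt_rows(weight, target_rows):
    """ASSUMED adaptation: truncate or zero-pad rows to match the input feature dimension."""
    cols = len(weight[0])
    kept = [row[:] for row in weight[:target_rows]]
    kept += [[0.0] * cols for _ in range(target_rows - len(kept))]
    return kept

def transfer_init(parent_params, child_feature_dim):
    """Initialize a child hierarchy level from its parent's trained parameters."""
    return {
        "W1": adapt_rows(parent_params["W1"], child_feature_dim),
        "b1": parent_params["b1"][:],
        "W2": [row[:] for row in parent_params["W2"]],
        "b2": parent_params["b2"][:],
    }

brand_params = {
    "W1": [[1.0, 2.0], [3.0, 4.0]],  # 2 input features, 2 hidden units (toy sizes)
    "b1": [0.0, 0.0],
    "W2": [[1.0], [1.0]],
    "b2": [0.0],
}
series_init = transfer_init(brand_params, child_feature_dim=3)
```

Only the first-layer weight matrix depends on the input feature dimension, so the remaining parameters are copied unchanged before fine-tuning begins.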
[0027] The first-layer graph convolution matrix operation is performed on the vehicle series layer normalized adjacency matrix, the vehicle series layer feature matrix, and the initial first-layer weight and bias to obtain the first-layer linear output data of the vehicle series layer; this output is processed by the ReLU activation function to generate the first-layer hidden feature matrix of the vehicle series layer; the second-layer graph convolution matrix operation is then performed on the hidden feature matrix with the second-layer weight and bias to obtain the vehicle series layer prediction output data;
[0028] Average price data is extracted from the statistical parameters of the vehicle series nodes and used as the true label vector of the vehicle series layer; the mean square error between the vehicle series layer prediction output data and the vehicle series layer true label vector is calculated as the vehicle series layer loss value; fine-tuning training is performed using a small learning rate, and the Adam optimizer is used to calculate the gradient updates of the weight matrices and bias vectors based on the vehicle series layer loss value; the gradient updates are then applied with a small magnitude to the vehicle series layer weight matrices and bias vectors, completing the fine-tuning update of the vehicle series layer parameters;
[0029] After each round of fine-tuning training, the loss value on the vehicle series layer validation set data is calculated and compared with the historical minimum loss value; when the vehicle series layer validation set loss value is less than the historical minimum, the current parameter data is saved as the optimal vehicle series layer parameters; after the preset number of fine-tuning rounds is completed, the optimal vehicle series layer parameter data is organized into the vehicle series layer weight parameter set, which serves as the initial parameter data for vehicle model-level transfer learning.
[0030] In particular, the vehicle series layer has finer data granularity and more complex relationships than the brand layer. The number of vehicle series nodes is far greater than that of brand nodes, and the similarity judgment between vehicle series is more complex (e.g., the competitive relationship between the brand A-3 series and the brand H-C class versus the same-brand relationship between the brand A-3 series and the brand A-X3). This application incorporates brand DNA factors into the similarity calculation by "combining the brand affiliation relationship between vehicle series, enhancing the weight of similarity within the same brand, and adjusting the weight of similarity across brands." Vehicle series within the same brand naturally share similar design concepts, manufacturing processes, and market positioning, and therefore require higher connection weights; cross-brand similarity is reflected more in functionality and requires moderate adjustment to avoid over-connection. This design ensures that the vehicle series layer adjacency matrix reflects market competition while preserving the internal product system logic of each brand, forming a more reasonable graph topology.
[0031] Furthermore, obtaining the weight parameters of the vehicle model layer graph neural network includes: extracting vehicle model-level node data from the node feature matrix by node type, and obtaining vehicle model statistical feature data calculated from the statistical parameters of the vehicle model nodes; directly extracting vehicle configuration parameters, including engine displacement, maximum power, body size, and official guide price, from the transaction records stored in the vehicle model nodes, and generating vehicle model configuration feature data after standardization; and concatenating the vehicle model statistical feature data and vehicle model configuration feature data column by column to form the vehicle model layer feature matrix;
[0032] Statistical feature columns are extracted from the vehicle model layer feature matrix, and the overlap of average price ranges, the price variance similarity, and the transaction activity similarity among vehicle models are calculated; configuration feature columns are extracted from the vehicle model layer feature matrix, and the configuration similarity between vehicle models is calculated using the weighted Euclidean distance algorithm; taking into account vehicle series and brand affiliation, the similarity of models within the same series is given a higher weight, the similarity of models across series within the same brand is given a moderate weight, and the similarity of models across brands is given a lower weight; the statistical similarity and configuration similarity are then weighted and fused to obtain the comprehensive similarity value between vehicle models; the comprehensive similarity value is binarized according to a preset vehicle model similarity threshold to generate connection relationship data between vehicle models; this connection relationship data is converted into adjacency matrix format, and self-connection markers are added at the diagonal positions to form the vehicle model layer adjacency matrix;
[0033] The non-zero elements in each row of the vehicle model layer adjacency matrix are counted to generate a vehicle model layer degree vector; the vehicle model layer degree vector is converted into a diagonal degree matrix; matrix operations are performed according to the normalization formula to obtain the vehicle model layer normalized adjacency matrix data;
[0034] The first-layer weight matrix and bias vector are extracted from the vehicle series layer graph neural network weight parameters as the initial first-layer weight and initial first-layer bias of the vehicle model layer; the second-layer weight matrix and bias vector are likewise extracted from the vehicle series layer weight parameters as the initial second-layer weight and initial second-layer bias of the vehicle model layer; based on the feature dimension of the vehicle model layer feature matrix, the dimensions of the initial weights are adapted to ensure compatibility of matrix operations.
[0035] The first-layer graph convolution matrix operation is performed on the vehicle model layer normalized adjacency matrix, the vehicle model layer feature matrix, and the initial first-layer weight and bias to obtain the first-layer linear output data of the vehicle model layer; this output is processed by the ReLU activation function to generate the first-layer hidden feature matrix of the vehicle model layer; the second-layer graph convolution matrix operation is then performed on the hidden feature matrix with the second-layer weight and bias to obtain the vehicle model layer prediction output data;
[0036] Average price data is extracted from the statistical parameters of the vehicle model nodes and used as the true label vector of the vehicle model layer; the mean square error between the vehicle model layer prediction output data and the vehicle model layer true label vector is calculated as the vehicle model layer loss value; fine-tuning training is performed using a smaller learning rate than that of the vehicle series layer; the Adam optimizer is used to calculate the gradient updates of the weight matrices and bias vectors based on the vehicle model layer loss value; the gradient updates are then applied with a smaller magnitude to the vehicle model layer weight matrices and bias vectors, completing the fine-tuning update of the vehicle model layer parameters;
[0037] After each round of fine-tuning training, the loss value on the vehicle model layer validation set data is calculated and compared with the historical minimum loss value; when the vehicle model layer validation set loss value is less than the historical minimum, the current parameter data is saved as the optimal vehicle model layer parameters; after the preset number of fine-tuning rounds is completed, the optimal vehicle model layer parameter data is organized into the vehicle model layer weight parameter set, which serves as the final vehicle model layer prediction parameters of the hierarchical graph neural network model.
[0038] Furthermore, obtaining the sparse node enhanced feature matrix includes: extracting the transaction record count of each vehicle model node from the vehicle model node statistical parameters; calculating the distribution statistics of the transaction record counts over all vehicle model nodes, including the mean, median, and quartiles; marking vehicle model nodes whose transaction record count is below a preset threshold as sparse vehicle model nodes; generating a sparse vehicle model node index list that records the positions of the sparse nodes in the vehicle model layer feature matrix; finding the vehicle series to which each sparse vehicle model node belongs based on the hierarchical edge relationship data of the hybrid graph structure; extracting the other vehicle model nodes under the same vehicle series node, excluding the sparse vehicle model node itself, to construct a hierarchical neighbor node set; extracting the feature vector data of the hierarchical neighbor nodes from the vehicle model layer feature matrix; and arranging the feature vectors of the hierarchical neighbor nodes by node number to form the hierarchical neighbor feature matrix;
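The sparse-node marking and hierarchical neighbor lookup could be sketched as follows. The threshold value, the series labels, and the reading that a node's hierarchical neighbors are all other models under the same series are assumptions made for this sketch.

```python
from statistics import mean, median

def count_distribution(counts):
    """Distribution statistics of per-model transaction counts (quartiles omitted)."""
    return {"mean": mean(counts), "median": median(counts)}

def sparse_indices(counts, threshold):
    """Indices of vehicle model nodes with fewer records than the preset threshold."""
    return [i for i, c in enumerate(counts) if c < threshold]

def hierarchical_neighbors(idx, series_of):
    """Other model nodes that share the series of node `idx` (assumed reading)."""
    return [j for j, s in enumerate(series_of) if s == series_of[idx] and j != idx]

# Hypothetical data: transaction counts and series membership of five model nodes.
counts = [120, 3, 45, 2, 80]
series_of = ["A-3", "A-3", "B-A4", "B-A4", "B-A4"]

stats = count_distribution(counts)
sparse = sparse_indices(counts, threshold=5)
neighbors = {i: hierarchical_neighbors(i, series_of) for i in sparse}
```

The index list doubles as the row positions of the sparse nodes in the vehicle model layer feature matrix, which is what the later replacement step needs.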
[0039] Based on the similarity edge relationship data of the hybrid graph structure, the similar vehicle model nodes directly connected to each sparse vehicle model node are found; from these similar vehicle model nodes, the nodes that belong to a different vehicle series or a different brand are selected to construct a similarity neighbor node set; the feature vector data of the similarity neighbor nodes is extracted from the vehicle model layer feature matrix; and the feature vectors of the similarity neighbor nodes are arranged in descending order of similarity to form the similarity neighbor feature matrix;
[0040] The similarity score between the feature vector of the sparse vehicle model node and the feature vector of each neighbor in the hierarchical neighbor feature matrix is calculated; the similarity scores are normalized using the softmax function to obtain the hierarchical attention weight vector; the similarity score between the feature vector of the sparse vehicle model node and the feature vector of each neighbor in the similarity neighbor feature matrix is calculated; the similarity scores are normalized using the softmax function to obtain the similarity attention weight vector;
[0041] A weighted aggregation of the hierarchical neighbor features is performed with the hierarchical attention weight vector to obtain the hierarchical aggregated feature vector; a weighted aggregation of the similarity neighbor features is performed with the similarity attention weight vector to obtain the similarity aggregated feature vector; the two fusion weights are dynamically calculated based on the number of hierarchical neighbor nodes and the number of similarity neighbor nodes; and a dual feature fusion operation is performed on the two aggregated feature vectors with the fusion weights to generate the enhanced feature vector of the sparse vehicle model node.
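A sketch of the attention-weighted aggregation and dual fusion follows. Two details are assumptions, since the formulas are missing from the source text: similarity scores are taken as dot products between feature vectors, and the fusion weights are set in proportion to the number of neighbors in each set.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def attend(target, neighbors):
    """Softmax-weighted sum of neighbor vectors; scores are dot products (assumed)."""
    weights = softmax([dot(target, n) for n in neighbors])
    return [sum(w * n[k] for w, n in zip(weights, neighbors))
            for k in range(len(target))]

def enhance(target, hier_neighbors, sim_neighbors):
    """Fuse both aggregates; fusion weights proportional to neighbor counts (assumed)."""
    n_h, n_s = len(hier_neighbors), len(sim_neighbors)
    if n_h + n_s == 0:
        return target[:]  # nothing to aggregate, keep the original features
    alpha = n_h / (n_h + n_s)
    h = attend(target, hier_neighbors) if n_h else [0.0] * len(target)
    s = attend(target, sim_neighbors) if n_s else [0.0] * len(target)
    return [alpha * a + (1.0 - alpha) * b for a, b in zip(h, s)]

sparse_vec = [1.0, 0.0]
hier = [[2.0, 0.0]]               # one same-series neighbor
sim = [[0.0, 4.0], [0.0, 4.0]]    # two cross-series or cross-brand neighbors
enhanced = enhance(sparse_vec, hier, sim)
```

With one hierarchical neighbor and two similarity neighbors the fusion weights are 1/3 and 2/3, so the result leans toward the similarity aggregate.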
[0042] Based on the sparse vehicle model node index list, the row position of each sparse node in the vehicle model layer feature matrix is located, and the feature vector data of the original sparse node is replaced with the corresponding enhanced feature vector;
[0043] Specifically, in this application, hierarchical neighbors (other models within the same vehicle series) ensure consistency in brand DNA and vehicle series positioning, but may have issues with excessively large configuration differences (such as the 1.4T and 3.0T versions of the B-A4); similarity neighbors (similar models across vehicle series and brands) provide more accurate reference information based on functional similarity in configuration parameters, but may lack factors such as brand premium; by "filtering out nodes across vehicle series and brands from similar model nodes," the system can break through the traditional brand-vehicle series boundary limitations and find truly valuable data sources for sparse models.
[0044] Furthermore, constructing a hierarchical graph neural network model includes: based on the weight parameters of the brand-layer graph neural network. Vehicle series layer graph neural network weight parameters Vehicle model layer diagram neural network weight parameters A three-layer cascaded graph neural network architecture is constructed. The three-layer cascaded architecture is connected in series according to the hierarchical order of brand layer → series layer → model layer to form a top-down hierarchical prediction pipeline. A feature dimension transformation module is set between each layer to ensure that the dimensions of the output features of the upper layer match the input features of the lower layer. A hierarchical feature transfer mechanism is constructed to pass the abstract features learned by the upper layer to the lower layer for refinement.
[0045] The brand layer weight parameters are decomposed into the first-layer weight matrix and second-layer weight matrix of the brand layer and the corresponding bias vectors; the vehicle series layer weight parameters are decomposed into the first-layer weight matrix and second-layer weight matrix of the vehicle series layer and the corresponding bias vectors; the vehicle model layer weight parameters are decomposed into the first-layer weight matrix and second-layer weight matrix of the vehicle model layer and the corresponding bias vectors. A weight parameter management dictionary is constructed that organizes all weight matrices and bias vectors by hierarchy level and network layer, to facilitate model calling and updating;
[0046] The original vehicle model layer feature matrix is replaced with the sparse node enhanced feature matrix. Data integrity checks are performed on the enhanced feature matrix to ensure that all sparse vehicle model nodes have undergone feature enhancement; the statistical differences between the feature matrices before and after enhancement are calculated, including changes in the mean, variance, and distribution of the feature vectors, to verify the effectiveness of the enhancement; and a dimensional compatibility check is performed between the enhanced feature matrix and the vehicle model layer weight parameters to ensure the correctness of subsequent graph convolution calculations;
[0047] A brand layer prediction submodule is constructed, which performs graph convolution calculations using the brand layer feature matrix, the brand layer adjacency matrix, and the brand layer weight parameters to obtain the brand layer prediction result. A vehicle series layer prediction submodule is constructed, which performs graph convolution calculations using the vehicle series layer feature matrix, the vehicle series layer adjacency matrix, and the vehicle series layer weight parameters to obtain the vehicle series layer prediction result. A vehicle model layer prediction submodule is constructed, which performs graph convolution calculations using the enhanced feature matrix, the vehicle model layer adjacency matrix, and the vehicle model layer weight parameters to obtain the vehicle model layer prediction result. A hierarchical weighted fusion strategy is designed to dynamically adjust the fusion weights of the three levels of prediction results based on the data sufficiency of the vehicle to be predicted.
[0048] The data sufficiency coefficients of the vehicle to be predicted are calculated at the brand, vehicle series, and vehicle model levels. The data sufficiency coefficients are normalized to obtain the fusion weight vector, and a weighted fusion of the three levels of prediction results is calculated to obtain the final predicted price. The fused prediction result is then post-processed, including price range reasonableness checks and outlier filtering;
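The weighted fusion described above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the function names, the level keys, and the clamping bounds are hypothetical, and normalization is assumed to mean dividing each sufficiency coefficient by their sum.

```python
from typing import Dict

# Hypothetical level keys used only for this illustration.
LEVELS = ("brand", "series", "model")


def fusion_weights(sufficiency: Dict[str, float]) -> Dict[str, float]:
    """Normalize per-level data sufficiency coefficients so they sum to 1."""
    total = sum(sufficiency[level] for level in LEVELS)
    if total <= 0:
        raise ValueError("at least one sufficiency coefficient must be positive")
    return {level: sufficiency[level] / total for level in LEVELS}


def fuse_predictions(predictions: Dict[str, float],
                     sufficiency: Dict[str, float]) -> float:
    """Weighted sum of the three per-level price predictions."""
    weights = fusion_weights(sufficiency)
    return sum(weights[level] * predictions[level] for level in LEVELS)


def clamp_price(price: float, low: float, high: float) -> float:
    """Post-processing: keep the fused price inside an assumed plausible range."""
    return max(low, min(high, price))
```

With this scheme a level with little data (a small coefficient) contributes proportionally less to the final price, which is the behaviour the paragraph describes.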
[0049] Furthermore, establishing an evaluation model based on residual value rate includes: obtaining historical transaction data of used cars from a hierarchical dataset, extracting key data fields such as basic vehicle information, transaction price, transaction time, mileage, and vehicle condition level; querying the corresponding official guide price of new cars based on the basic vehicle information, and establishing a dataset on the correspondence between new car prices and used car transaction prices; calculating the vehicle age for each transaction record, where vehicle age = transaction time - vehicle manufacturing time, generating vehicle age feature data; and cleaning and standardizing the mileage data, removing abnormal mileage records, and generating standardized mileage feature data.
[0050] The residual value rate for each transaction is calculated based on the ratio of the used car transaction price to the corresponding new car official guide price: Residual Value Rate = Used Car Transaction Price / New Car Official Guide Price. The calculated residual value rate is then validated for reasonableness, with upper and lower thresholds set to filter out abnormal data records. The residual value rate data is grouped according to vehicle age, including age ranges such as less than 1 year, 1-3 years, 3-5 years, 5-8 years, and over 8 years. The distribution characteristics of the residual value rate within each age range are statistically analyzed, including the mean, median, quantiles, and standard deviation.
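The residual value rate calculation and age grouping can be sketched as below. The filter thresholds (0.05 and 1.2) and the half-open treatment of bucket boundaries are assumptions made for illustration; the text only states that upper and lower thresholds are set and names the age ranges.

```python
from statistics import mean, median, pstdev
from typing import Dict, Iterable, List, Tuple

# Age ranges from the text; boundaries are treated as [low, high) by assumption.
AGE_BUCKETS = [
    (0.0, 1.0, "<1y"),
    (1.0, 3.0, "1-3y"),
    (3.0, 5.0, "3-5y"),
    (5.0, 8.0, "5-8y"),
    (8.0, float("inf"), ">8y"),
]


def residual_value_rate(used_price: float, new_guide_price: float) -> float:
    """Residual value rate = used car transaction price / new car guide price."""
    if new_guide_price <= 0:
        raise ValueError("new car guide price must be positive")
    return used_price / new_guide_price


def age_bucket(age_years: float) -> str:
    """Return the label of the age range that contains age_years."""
    for low, high, label in AGE_BUCKETS:
        if low <= age_years < high:
            return label
    raise ValueError("vehicle age must be non-negative")


def summarize(records: Iterable[Tuple[float, float, float]],
              lower: float = 0.05, upper: float = 1.2) -> Dict[str, Dict[str, float]]:
    """Group valid rates by age range and compute basic distribution statistics.

    Each record is (used_price, new_guide_price, age_years). Rates outside
    [lower, upper] are dropped as abnormal (threshold values are hypothetical).
    """
    groups: Dict[str, List[float]] = {}
    for used_price, guide_price, age in records:
        rate = residual_value_rate(used_price, guide_price)
        if lower <= rate <= upper:
            groups.setdefault(age_bucket(age), []).append(rate)
    return {
        label: {"mean": mean(rates), "median": median(rates), "std": pstdev(rates)}
        for label, rates in groups.items()
    }
```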
[0051] A feature vector of factors influencing residual value rate is constructed, including feature dimensions such as vehicle age, mileage, vehicle condition level, brand influence factor, and model influence factor. Among them, the brand influence factor and model influence factor are calculated based on the statistical parameters of the brand node and model node. A residual value rate prediction model is established using a multiple linear regression algorithm: residual value rate = β0 + β1 × vehicle age + β2 × mileage + β3 × vehicle condition + β4 × brand factor + β5 × model factor + ε. The least squares method is used to estimate the parameters of the regression coefficients β0 to β5, and significance tests and model fit evaluations are performed.
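As an illustration of the least squares estimation step, the sketch below fits the coefficients β0 to βp by solving the normal equations with plain Python. It is only a sketch under stated assumptions: the helper names are hypothetical, the significance tests are omitted, and a production system would more likely rely on a numerical library.

```python
from typing import List, Sequence


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: features may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x


def fit_ols(features: Sequence[Sequence[float]], targets: Sequence[float]) -> List[float]:
    """Return [b0, b1, ..., bp] minimizing the sum of squared residuals."""
    design = [[1.0] + list(row) for row in features]  # prepend intercept column
    p = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(design, targets)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def predict(coefs: Sequence[float], row: Sequence[float]) -> float:
    """Predicted residual value rate for one feature row."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], row))
```

Each feature row would hold, in order, vehicle age, mileage, vehicle condition, brand factor, and model factor; the error term ε is what remains after subtracting the prediction from the observed rate.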
[0052] Based on the three-level data structure of brand-series-model, residual value rate models are constructed at the brand level, series level, and model level respectively. The brand-level residual value rate model is trained using average residual value rate data at the brand level and is suitable for brand-level value assessment. The series-level residual value rate model is trained using residual value rate data at the series level and adds series feature variables to the brand-level model. The model-level residual value rate model is trained using residual value rate data of specific models and includes complete model configuration feature information.
[0053] Historical transaction data is divided into training and test sets in an 8:2 ratio. The training set is used to estimate model parameters. The mean absolute error (MAE), root mean square error (RMSE), and coefficient of determination (R²) of residual rate prediction are calculated on the test set. Cross-validation is used to evaluate the generalization ability of the model and prevent overfitting. The model parameters are tuned based on the validation results, including feature selection, regularization coefficient settings, and outlier handling strategies.
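The 8:2 split and the three evaluation metrics can be sketched as follows. The function names and the fixed random seed are illustrative assumptions; cross-validation and parameter tuning are not shown.

```python
import math
import random
from typing import List, Sequence, Tuple


def train_test_split(records: Sequence, train_ratio: float = 0.8,
                     seed: int = 0) -> Tuple[List, List]:
    """Shuffle the records and split them into training and test sets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    avg = sum(y_true) / len(y_true)
    ss_tot = sum((t - avg) ** 2 for t in y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return 1.0 - ss_res / ss_tot
```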
[0054] The optimized hierarchical residual value rate model is organized into a unified evaluation model system, including sub-models at three levels: brand level, vehicle series level, and vehicle model level. A model selection strategy is constructed to automatically select the appropriate level of residual value rate model based on the data sufficiency of the vehicle to be evaluated. A prediction interface for the residual value rate model is established, which takes the vehicle's basic information and feature data as input and outputs the predicted residual value rate and confidence interval. The residual value rate evaluation model is used as an alternative prediction model to complement the hierarchical graph neural network model.
[0055] Furthermore, calculating the data sufficiency coefficient α of the vehicle to be evaluated includes: obtaining basic information about the vehicle to be evaluated, including brand, series, model, age, mileage, condition level, and configuration parameters; mapping the vehicle to be evaluated to the corresponding brand node, series node, and model node in the hierarchical dataset based on the vehicle's brand, series, and model information; and extracting the number of transaction records, latest transaction time, and price statistics parameters of the vehicle to be evaluated at each level node.
[0056] The data sufficiency coefficient α of the vehicle to be evaluated is calculated at the brand, vehicle series, and vehicle model levels respectively, using the number of valid transaction records N, the adequacy benchmark, and the sensitivity adjustment parameter defined below. A data sufficiency threshold is set: when α reaches the threshold, the hierarchical graph neural network model is selected for prediction; when α is below the threshold, the residual value rate-based evaluation model is selected. Price prediction is performed at the level with the highest data sufficiency that meets the threshold requirement, following the priority order of vehicle model layer → vehicle series layer → brand layer.
[0057]
[0058] α (data sufficiency coefficient): α=0.5 indicates that the data sufficiency has reached the critical state; α≥0.8 is generally considered to be sufficient data and suitable for graph neural network models; α<0.5 is generally considered to be sparse data and it is recommended to use residual rate evaluation models.
[0059] N (Number of valid transaction records): Statistical time window: Transaction records of the past 24 months; Layered calculation: Brand layer N: Total number of transaction records for all models under this brand; Series layer N: Total number of transaction records for all models under this series; Model layer N: Total number of transaction records for this specific model.
[0060] Adequacy benchmark: preferably, a large-sample benchmark is used for the brand layer, a medium-sample benchmark for the vehicle series layer, and a small-sample benchmark for the vehicle model layer.
[0061] Sensitivity adjustment parameter: preferably set separately for the brand layer, the vehicle series layer, and the vehicle model layer. Setting principle: ensure a reasonable transition interval.
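A possible reading of the data sufficiency calculation and model selection is sketched below. The logistic form α = 1 / (1 + exp(−k·(N − N0))) is an assumption that is merely consistent with the definitions above (α = 0.5 at the benchmark, larger N giving larger α); the symbols N0 and k, and every numeric value in PARAMS, are hypothetical and do not come from the source text. The model-layer pair was chosen only so that N = 23 gives roughly the 0.272 reported in the worked example.

```python
import math
from typing import Dict, Optional, Tuple

# Hypothetical (benchmark N0, sensitivity k) per level; not from the source.
PARAMS: Dict[str, Tuple[float, float]] = {
    "brand": (5000.0, 0.001),
    "series": (500.0, 0.01),
    "model": (50.0, 0.0365),
}
PRIORITY = ("model", "series", "brand")  # priority order stated in the text


def data_sufficiency(n_records: float, benchmark: float, sensitivity: float) -> float:
    """Assumed logistic form of alpha, evaluated in a numerically stable way."""
    x = sensitivity * (n_records - benchmark)
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def select_model(record_counts: Dict[str, float],
                 threshold: float = 0.8) -> Tuple[str, Optional[str]]:
    """Return ("gnn", level) for the first level meeting the threshold,
    otherwise ("residual", None) to fall back to the residual value rate model."""
    for level in PRIORITY:
        benchmark, sensitivity = PARAMS[level]
        if data_sufficiency(record_counts[level], benchmark, sensitivity) >= threshold:
            return "gnn", level
    return "residual", None
```

Walking the levels in priority order means the finest level with enough data wins, and the residual value rate model is used only when no level reaches the threshold.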
[0062] Compared to existing technologies, the advantages of this application are:
[0063] This application, on the one hand, leverages the inherent hierarchical structure of the automotive market through a transfer learning mechanism that extends from the brand layer to the series layer and then to the model layer. The brand layer possesses the richest transaction data. By using the average price of the brand node's statistical parameters as a supervisory label for backpropagation training, the weight parameters W1 obtained can learn price patterns and market characteristics at the brand level. This knowledge learned from the upper layers is then passed down level by level, allowing the data-sparse model layer to inherit the rich experience from the brand and series layers. This "knowledge distillation" mechanism fundamentally changes the predicament of "isolated learning" for sparse models. Even if a specific model has few transaction records, a stable and reliable price prediction foundation can be obtained through hierarchical weight inheritance, thereby significantly improving prediction accuracy and model robustness in sparse data scenarios.
[0064] On the other hand, this application overcomes the limitations of traditional single-dimensional information aggregation by using a hybrid graph structure that combines hierarchical edges with similarity edges. Hierarchical edges ensure the inheritance of brand characteristics (such as the brand positioning of brand B), while similarity edges establish cross-brand functional associations based on configuration parameters (such as the performance characteristics of the 1.4T engine). Using an attention mechanism to calculate the weight coefficients of the hierarchical neighbor feature set and the similarity neighbor feature set separately makes it possible to dynamically adjust the contribution of different information sources according to the specific characteristics of the vehicle model. When a model differs significantly from other models in the same series, the system automatically increases the weight of similarity neighbors; otherwise, it relies more on hierarchical neighbor information. This adaptive multi-dimensional information fusion mechanism ensures that sparse vehicle models can obtain information from the most relevant data sources, thereby achieving more accurate feature representation and price prediction.
Attached Figure Description
[0065] This application will be further described by way of exemplary embodiments, which will be described in detail with reference to the accompanying drawings. These embodiments are not limiting; in these embodiments, the same reference numerals denote the same structures, wherein:
[0066] Figure 1 is an exemplary flowchart of a used car transaction information management method according to some embodiments of this application;
[0067] Figure 2 is an exemplary flowchart of a hierarchical graph neural network model according to some embodiments of this application;
[0068] Figure 3 is a schematic diagram of the sparse vehicle model distribution in this embodiment.
Detailed Implementation
[0069] The methods and systems provided in the embodiments of this application will now be described in detail with reference to the accompanying drawings.
[0070] As shown in Figure 1, historical transaction data of used cars is acquired and processed hierarchically to obtain a hierarchical dataset. The hierarchical dataset adopts a three-level data structure of brand-series-model, where the brand node is the root node, the series node is the intermediate node, and the model node is the leaf node. Each node stores corresponding transaction records and price data. Based on the hierarchical dataset, a hierarchical graph neural network model is established through transfer learning. An evaluation model based on residual value rate is established based on the historical transaction data of used cars. The data of the vehicle to be evaluated is mapped to the corresponding nodes in the hierarchical dataset, and the data sufficiency coefficient of the vehicle to be evaluated is calculated. Based on the data sufficiency coefficient, either the hierarchical graph neural network model or the evaluation model is selected for used car price prediction. Used car transaction information is managed based on the predicted used car prices.
[0071] Transaction records containing complete configuration information were extracted from the trading platform's database, with data field coverage exceeding 95%. Key data fields included: basic vehicle information (brand, model, series), transaction price, transaction time, vehicle configuration parameters (engine displacement, maximum power, body dimensions, official guide price), mileage, and vehicle condition rating.
[0072] Brand B contains 23,456 transaction records, including 3,892 for the A4 series, 4,123 for the A6 series, and 2,567 for the Q5 series. Each record details specifications such as engine displacement (1.4L–6.0L), maximum power (90kW–450kW), vehicle length (4200mm–5300mm), vehicle width (1750mm–2100mm), vehicle height (1400mm–1900mm), and official guide price (180,000–1,500,000 RMB).
[0073] The 458,623 transaction records were distributed across 68 brand nodes based on brand affiliation. Brand B accounted for 23,456 records, Brand A for 21,789 records, and Brand H for 25,234 records, together representing 15.3% of the total data. In the R-series group, brands D and E and the R-series brand accounted for 18,967 records, 16,234 records, and 14,123 records respectively, together representing 10.8% of the total data. See Table 1 for detailed statistical parameters of the main brands.
[0074] Table 1 Statistical parameters of major brands
[0075]
[0076] Specifically, the statistical parameters of the Brand B node are: 23,456 transaction records, an average transaction price of 248,000 yuan, a price variance of 8.7 × 10¹⁰, and a latest transaction date of June 15, 2025. The statistical parameters of the Brand A node are: 21,789 transaction records, an average transaction price of 272,000 yuan, a price variance of 9.4 × 10¹⁰, and a latest transaction date of June 16, 2025.
[0077] Brand node B is further subdivided into 21 car series, including the A3 series (1,234 records), A4 series (3,892 records), A6 series (4,123 records), Q3 series (967 records), Q5 series (2,567 records), and Q7 series (1,845 records). The data density varies significantly among car series, with ample data for mainstream series and relatively sparse data for niche series.
[0078] Brand B-A4 series: 3,892 transaction records, an average transaction price of 235,000 yuan, and a price variance of 4.2 × 10¹⁰; it covers a variety of powertrain configurations, including 1.4T, 2.0T, and 3.0T. Brand A-3 Series: 4,156 transaction records, an average transaction price of 248,000 yuan, and a price variance of 4.6 × 10¹⁰; it is a direct competitor of the Brand B-A4 series.
[0079] The Brand B-A4 series is further subdivided into 32 specific model configurations. Models with abundant data: A-4L 2.0T Brand Edition (658 records), A-4L 2.0T Sport Edition (543 records). Models with sparse data: A-4L 1.4T Fashion Edition (only 23 records), A-4L 3.0T quattro Flagship Edition (only 8 records).
[0080] With a threshold of 50 transaction records, 2,847 sparse car models were identified, accounting for 22.1% of the total number of car models. These sparse car models are mainly distributed in categories such as high-end configurations, entry-level configurations, and discontinued models, for which traditional prediction methods struggle to produce reliable results.
[0081] As shown in Figure 2, six core configuration parameters were extracted from the 12,847 vehicle model nodes: engine displacement, maximum power, vehicle length, vehicle width, vehicle height, and official guide price. The distribution of configuration parameters across the entire sample was statistically analyzed: mean engine displacement 2.1L, standard deviation 0.8L; mean maximum power 156kW, standard deviation 78kW; mean vehicle length 4567mm, standard deviation 245mm.
[0082] All configuration parameters are Z-score standardized to eliminate the influence of dimensions. Taking the BA-4L 1.4T Fashion model as an example: the original configuration [1.4L, 110kW, 4818mm, 1843mm, 1432mm, 269800 yuan] is standardized to [-0.875, -0.590, 1.024, 0.356, -0.234, -0.412].
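The Z-score step can be checked against the figures given in the text. The sketch below uses only the three parameters whose mean and standard deviation are stated (displacement, power, length); the function names are illustrative.

```python
from typing import List, Sequence, Tuple


def z_score(value: float, mean: float, std: float) -> float:
    """Standardize one value: (value - mean) / standard deviation."""
    if std <= 0:
        raise ValueError("standard deviation must be positive")
    return (value - mean) / std


def standardize(raw: Sequence[float],
                stats: Sequence[Tuple[float, float]]) -> List[float]:
    """Standardize a configuration vector given per-parameter (mean, std)."""
    return [z_score(v, m, s) for v, (m, s) in zip(raw, stats)]


# (mean, std) for displacement in L, power in kW, and length in mm, as stated above.
STATS = [(2.1, 0.8), (156.0, 78.0), (4567.0, 245.0)]

# First three raw values of the BA-4L 1.4T Fashion example.
example = standardize([1.4, 110.0, 4818.0], STATS)
```

Rounded to three decimals, `example` reproduces the first three standardized values quoted for the example model (-0.875, -0.590, 1.024).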
[0083] Based on the analysis of factors influencing the value of used cars, the following configuration parameters were assigned weights: engine displacement 0.30, maximum power 0.25, body length 0.15, body width 0.10, body height 0.10, and official guide price 0.10. This weighting reflects the dominant role of power performance in the value of used cars.
[0084] The weighted Euclidean distance algorithm was used to calculate the configuration similarity between vehicle models. Key findings: The similarity between the BA-4L 1.4T Fashion model and the A-320Li Fashion model is 0.847, while the similarity with the BA-4L 3.0T quattro in the same series is only 0.623. This verifies the rationality of cross-brand configuration matching and breaks the traditional brand boundary limitations.
[0085] A similarity threshold of 0.75 was set, and undirected similarity edges were established between vehicle models whose configuration similarity exceeded the threshold. Statistical results: A total of 8967 cross-brand similarity edges and 15234 cross-vehicle series similarity edges were established, providing a rich source of information for sparse vehicle models.
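The weighted distance and similarity-edge construction can be sketched as below. The weights are those listed in the text; the mapping from distance to similarity, 1 / (1 + d), is an assumption, since the text does not state how the distance is converted into a similarity score. The node names in the usage are hypothetical.

```python
import math
from itertools import combinations
from typing import Dict, List, Sequence, Tuple

# Weights for displacement, power, length, width, height, guide price (from the text).
WEIGHTS = (0.30, 0.25, 0.15, 0.10, 0.10, 0.10)


def weighted_distance(a: Sequence[float], b: Sequence[float],
                      weights: Sequence[float] = WEIGHTS) -> float:
    """Weighted Euclidean distance between two standardized configuration vectors."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))


def similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Assumed distance-to-similarity mapping: 1 / (1 + d), in (0, 1]."""
    return 1.0 / (1.0 + weighted_distance(a, b))


def similarity_edges(nodes: Dict[str, Sequence[float]],
                     threshold: float = 0.75) -> List[Tuple[str, str, float]]:
    """Undirected edges between every pair whose similarity exceeds the threshold."""
    edges = []
    for (name_a, vec_a), (name_b, vec_b) in combinations(sorted(nodes.items()), 2):
        score = similarity(vec_a, vec_b)
        if score > threshold:
            edges.append((name_a, name_b, score))
    return edges
```

Because the comparison ignores brand and series, two models from different brands are linked whenever their configurations are close enough, which is how cross-brand edges arise.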
[0086] The final constructed hybrid graph contains 25,694 hierarchical edges (12,847 brand-vehicle series and 12,847 vehicle series-model) and 24,201 similarity edges. The graph has an average degree of 3.9 and good connectivity, laying a solid foundation for subsequent graph neural network training.
[0087] The brand layer feature matrix has dimensions [68×128], comprising 64 dimensions of statistical features (transaction volume, average price, price variance, time features, etc.) and 64 dimensions of configuration features (statistical aggregation of configuration parameters of subordinate models). The Brand B feature vector integrates statistical information from 23,456 transaction records and the configuration distribution features of subordinate models.
[0088] The overall similarity between brands was calculated: Brand B - Brand A: 0.834, Brand B - Brand H: 0.812, Brand A - Brand H: 0.856. Similarity within the R-series group: Brand D - Brand E: 0.789, Brand E - R series: 0.756. Similarity between brands from different groups is generally low, consistent with differences in market positioning (see Table 2 for details).
[0089] Table 2 Brand Similarity
[0090]
[0091] A two-layer graph convolutional neural network was constructed, with the first layer mapping from 128 to 64 dimensions and the second layer from 64 to 32 dimensions. The ReLU activation function was used, and training used the Adam optimizer with a learning rate of 0.001. After 200 training epochs, the validation set loss converged to 0.0023, and the brand layer prediction accuracy reached 92.3%.
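The forward pass of a two-layer graph convolutional network can be sketched in plain Python as follows. This assumes the widely used symmetric normalization of the adjacency matrix with self-connections and applies ReLU only after the first layer; neither detail is confirmed by the text, and training (loss, Adam updates) is not shown. In the embodiment the weight shapes would be 128×64 and 64×32; the helper names are hypothetical.

```python
import math
from typing import List, Sequence

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of a (n x k) and b (k x m)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalize_adjacency(adj: Matrix) -> Matrix:
    """Set self-connections to 1, then scale by D^(-1/2) on both sides."""
    n = len(adj)
    a_hat = [[1.0 if i == j else adj[i][j] for j in range(n)] for i in range(n)]
    inv_sqrt = [1.0 / math.sqrt(sum(row)) for row in a_hat]
    return [[inv_sqrt[i] * a_hat[i][j] * inv_sqrt[j] for j in range(n)] for i in range(n)]


def gcn_layer(a_norm: Matrix, h: Matrix, w: Matrix, b: Sequence[float],
              relu: bool = True) -> Matrix:
    """One graph convolution: A_norm @ H @ W + b, optionally followed by ReLU."""
    z = matmul(matmul(a_norm, h), w)
    out = [[v + bias for v, bias in zip(row, b)] for row in z]
    if relu:
        out = [[max(0.0, v) for v in row] for row in out]
    return out


def gcn_forward(adj: Matrix, x: Matrix, w1: Matrix, b1: Sequence[float],
                w2: Matrix, b2: Sequence[float]) -> Matrix:
    """Two-layer forward pass over node features x."""
    a_norm = normalize_adjacency(adj)
    hidden = gcn_layer(a_norm, x, w1, b1, relu=True)
    return gcn_layer(a_norm, hidden, w2, b2, relu=False)
```

Because the weight shapes depend only on the feature dimension and not on the number of nodes, the same matrices can initialize the series layer ([356×128]) and model layer ([12847×128]) networks, which is what makes the layer-to-layer transfer possible.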
[0092] The weight parameters obtained from brand layer training are used as the initial parameters for the vehicle series layer, which effectively reuses the abstract features learned at the brand layer. The vehicle series layer feature matrix has dimensions [356×128], and its feature granularity is finer than that of the brand layer.
[0093] At the vehicle series level, brand affiliation is emphasized: the similarity weight of vehicle series within the same brand is multiplied by 1.2, and the similarity weight of vehicle series across brands is multiplied by 0.8. Core competitor relationships are identified: the similarity between Brand B-A4 and the Brand A-3 series is 0.891, and the similarity between Brand B-Q5 and Brand A-X3 is 0.876.
[0094] Fine-tuning was performed using a small learning rate of 0.0005. After 150 epochs, the validation set loss decreased to 0.0019, and the prediction accuracy of the vehicle series layer improved to 94.1%. Transfer learning reduced the training time by 60% compared to random initialization.
[0095] The vehicle model layer feature matrix has dimensions [12847×128] and directly contains the original configuration parameters and statistical features. Its feature granularity is the finest, containing detailed configuration information such as engine displacement, power, and size.
[0096] At the vehicle model level, both statistical similarity and configuration similarity are considered, with a weighted fusion ratio of 0.4:0.6. The higher weighting of configuration similarity reflects the emphasis placed on technical parameters at the vehicle model level. The weighting of vehicles within the same series is increased by 1.5 times, the weighting of vehicles within the same brand but different series is increased by 1.2 times, while the weighting of vehicles across brands remains unchanged.
[0097] Fine-tuning was performed using a smaller learning rate of 0.0002. After 100 epochs, the validation set loss decreased to 0.0015, and the prediction accuracy of the vehicle model layer reached 95.7%. The layer-by-layer progressive transfer learning strategy significantly improved the model performance.
[0098] As shown in Figure 3, 2,847 sparse models (fewer than 50 transaction records) were identified out of 12,847 models, accounting for 22.1%. The sparse models are mainly distributed as follows: 1,234 high-end models (43.4%), 856 entry-level models (30.1%), 542 discontinued models (19.0%), and 215 niche models (7.5%).
[0099] The BA-4L 1.4T Fashion model has only 23 transaction records, the A-320Li Fashion model has only 27 records, and the H-C180L Fashion model has only 19 records. These entry-level models have limited market acceptance, resulting in sparse historical transaction data and significant errors in traditional prediction methods.
[0100] Taking the BA-4L 1.4T Fashion model as an example, 31 neighboring models within the same vehicle series were identified. Neighbors with sufficient data include: A-4L 2.0T Brand model (658 records), A-4L 2.0T Sport model (543 records), and A-4L 2.5T quattro model (412 records). On average, each sparse model yielded 28.6 neighboring features within the same vehicle series.
[0101] By aggregating information from the same vehicle series, the feature vector of the BA-4L 1.4T Fashion model was enhanced from its original sparse state (based on 23 records) to an aggregated feature that incorporates 1,613 records from the same vehicle series, expanding the data base by 70 times.
[0102] Based on configuration similarity, the following cross-brand similar neighbors were identified for the BA-4L 1.4T Fashion model: A-320Li Fashion model (similarity 0.847), H-C180L Fashion model (similarity 0.823), K-1.4T Comfort model (similarity 0.798), and M-ATS-L 1.5T Fashion model (similarity 0.772).
[0103] By aggregating cross-brand similarity data, 145 similar vehicle transaction records were further integrated. Compared to traditional methods that only use data from the same vehicle series, multi-source aggregation expands the effective data base to 1,758 records, resulting in significant data enhancement.
[0104] Calculate the attention weights for sparse car models and their siblings within the same model family. The hierarchical attention distribution for the BA-4L 1.4T Fashion model is as follows: A-4L 2.0T Brand model: 0.342; A-4L 2.0T Sport model: 0.298; A-4L 1.8T Comfort model: 0.256; other models: 0.104. This weighting distribution reflects the higher correlation between models with similar configurations.
[0105] Similarity attention weight distribution: Brand A-320Li Fashion model weight 0.386, Brand H-C180L Fashion model weight 0.294, Volkswagen Passat 1.4T weight 0.213, Brand M-ATS-L weight 0.107. The cross-brand attention mechanism effectively identified the most relevant reference models.
[0106] The fusion weights are dynamically calculated based on the number of neighbors. For the BA-4L 1.4T Fashion model, there are 31 hierarchical neighbors and 4 similarity neighbors, 35 neighbors in total, from which the hierarchical weight and the similarity weight are determined. The norm of the feature vector was 2.134 before enhancement and 3.672 after enhancement, representing a 72.1% improvement in feature representation capability. Through multi-source information aggregation, sparse vehicle models obtained richer and more accurate feature representations, laying a solid foundation for subsequent price prediction.
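The attention-based aggregation and neighbor-count fusion can be sketched as below. Two points are assumptions rather than statements from the text: attention scores are taken as dot products followed by a softmax, and the fusion weights are taken as proportional to the neighbor counts (31/35 and 4/35 in the example), since the actual weight values are not reproduced in the text.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax; the outputs sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_aggregate(target: Sequence[float],
                        neighbors: Sequence[Sequence[float]]) -> List[float]:
    """Attention-weighted average of neighbor feature vectors.

    Scores are dot products between the target and each neighbor (assumed)."""
    scores = [sum(t * n for t, n in zip(target, vec)) for vec in neighbors]
    weights = softmax(scores)
    return [sum(w * vec[d] for w, vec in zip(weights, neighbors))
            for d in range(len(target))]


def count_based_fusion(hier_vec: Sequence[float], sim_vec: Sequence[float],
                       n_hier: int, n_sim: int) -> List[float]:
    """Fuse the two aggregated vectors with weights proportional to neighbor counts."""
    total = n_hier + n_sim
    w_hier, w_sim = n_hier / total, n_sim / total
    return [w_hier * h + w_sim * s for h, s in zip(hier_vec, sim_vec)]
```

The enhanced vector for a sparse node would then be `count_based_fusion(attention_aggregate(x, hier_neighbors), attention_aggregate(x, sim_neighbors), 31, 4)` for the example model.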
[0107] Vehicle to be evaluated: 2021 BA-4L 1.4T Fashion model, mileage 32,000 km, good condition. Data mapping results: Brand layer (Brand B) 23,456 records, Series layer (A4) 3,892 records, Model layer (A-4L 1.4T Fashion model) 23 records.
[0108] The sigmoid function is used to calculate data sufficiency at the brand level, the vehicle series level, and the vehicle model level. A data sufficiency threshold of 0.8 was set as the selection boundary between the graph neural network model and the residual value rate model. The data sufficiency at the vehicle model level (0.272) was far below the threshold, while the data sufficiency at the vehicle series level (1.000) was far above it. Following the priority order of vehicle model level → vehicle series level → brand level, and because data at the vehicle model level was insufficient, the graph neural network model at the vehicle series level was automatically selected for prediction. This strategy ensured both prediction accuracy and the reliability of the data foundation.
[0109] When predicting the price of the BA-4L 1.4T Fashion model, traditional same-series aggregation methods can only use data from the 31 other models within the same vehicle series. However, these models include high-end configurations such as the 3.0T quattro, which differ significantly from the 1.4T Fashion model's configuration, leading to substantial prediction bias. The multi-source aggregation method in this embodiment can identify similar models across brands, such as the A-320Li and H-C180L, resulting in closer configuration matching and significantly improved prediction accuracy. The mean absolute error of sparse model predictions was reduced from 8500 yuan to 5200 yuan, an improvement of 38.8%.
[0110] The system successfully identified 89.2% of the market-recognized competitive relationships, such as brand B-A4 and brand A-3 series (similarity 0.891), brand H-C class and brand A-3 series (similarity 0.878), and brand N-ES and brand B-A6 (similarity 0.845). This accuracy far exceeds that of traditional brand-based classification methods. Through configuration similarity analysis, potential competitive relationships beyond traditional perceptions were discovered, such as brand P-2.0T and brand BA-4L 2.0T (similarity 0.756), providing a new perspective for market analysis. This embodiment breaks through traditional brand and vehicle series boundaries by establishing vehicle similarity based on actual configuration parameters, which addresses the problem of information silos for sparse models. By identifying the high similarity between brand BA-4L 1.4T and brand A-320Li, more suitable reference samples were provided for sparse model prediction. A dual aggregation mechanism of hierarchical neighbor features and similarity neighbor features was constructed, enabling sparse models to obtain information both from models within the same vehicle series and from similar models across brands. The data base was expanded from 23 records to 1,758 records. A model selection mechanism based on data sufficiency was established, which automatically selects the prediction strategy according to the data conditions of each vehicle model. This maintains prediction accuracy for models with sufficient data while addressing the prediction challenge for sparse models.
[0111] The foregoing illustrative description of the present application and its embodiments is not restrictive and can be implemented in other specific forms without departing from the spirit or essential characteristics of the present application. The accompanying drawings are only one embodiment of the present application, and the actual structure is not limited thereto. Therefore, if those skilled in the art are inspired by this description and design similar structures and embodiments without departing from the spirit of the present application, such designs should fall within the scope of protection of this application. Furthermore, the word "comprising" does not exclude other elements or steps, and the word "a" preceding an element does not exclude the inclusion of "a plurality" of that element. Terms such as "first," "second," etc., are used to indicate names and do not indicate any specific order.
Claims
1. A method for managing used car transaction information, characterized in that it includes: Historical transaction data of used cars is obtained and processed hierarchically to obtain a hierarchical dataset. The hierarchical dataset adopts a three-level data structure of brand-car series-model, where the brand node is the root node, the car series node is the intermediate node, and the model node is the leaf node. Each node stores the corresponding transaction records and price data. Based on the hierarchical dataset, a hierarchical graph neural network model is established through transfer learning; Based on the historical used car transaction data, an evaluation model based on residual value rate is established; The data of the vehicle to be evaluated is mapped to the corresponding nodes in the hierarchical dataset, and the data sufficiency coefficient of the vehicle to be evaluated is calculated. The hierarchical graph neural network model or the evaluation model is selected to predict the used car price based on the data sufficiency coefficient; Used car transaction information is managed based on the predicted used car prices; Wherein, establishing the hierarchical graph neural network model through transfer learning includes: Based on the three-level data structure of brand-car series-model, a hybrid graph structure containing hierarchical edges and similarity edges is constructed. Based on the hybrid graph structure, node features are extracted and multidimensional feature vectorization is performed to obtain a node feature matrix containing statistical features, configuration features, and temporal features.
The brand layer data in the node feature matrix is subjected to graph convolutional neural network training to obtain the brand layer graph neural network weight parameters; According to the brand layer graph neural network weight parameters, transfer learning training from the brand layer to the vehicle series layer is performed to obtain the vehicle series layer graph neural network weight parameters; According to the vehicle series layer graph neural network weight parameters, transfer learning training from the vehicle series layer to the vehicle model layer is performed to obtain the vehicle model layer graph neural network weight parameters; Sparse vehicle model nodes whose number of transaction records in the statistical features is below a preset threshold are obtained, and multi-source information aggregation processing based on hierarchical edges and similarity edges is performed on the sparse vehicle model nodes to obtain the sparse node enhanced feature matrix; The hierarchical graph neural network model is constructed according to the weight parameters of the three layers and the sparse node enhanced feature matrix; Obtaining the sparse node enhanced feature matrix includes: Vehicle model nodes whose number of transaction records in the vehicle model node statistical parameters is below the threshold are identified and marked as sparse vehicle model nodes; Based on the hierarchical edge relationships of the hybrid graph structure, the other model nodes under the vehicle series to which each sparse model node belongs are found, the features of neighboring models in the same vehicle series are extracted, and a hierarchical neighbor feature set is formed.
Based on the similarity edge relationships of the hybrid graph structure, the similar vehicle model nodes connected to each sparse vehicle model node are found, the features of similar neighbor models across vehicle series and brands are extracted, and a similarity neighbor feature set is formed; The weight coefficients of the hierarchical neighbor feature set and the similarity neighbor feature set are calculated using an attention mechanism to obtain the hierarchical aggregated feature vector and the similarity aggregated feature vector, respectively. The hierarchical aggregated feature vector and the similarity aggregated feature vector are weighted and fused to obtain the enhanced feature vector of the sparse vehicle model node; The enhanced feature vectors are written to the corresponding sparse node positions of the vehicle model layer feature matrix to obtain the sparse node enhanced feature matrix. Wherein, establishing the evaluation model based on residual value rate includes: Historical transaction data is obtained from the hierarchical dataset to extract basic vehicle information, transaction price, vehicle age, mileage, and vehicle condition level, and the corresponding official guide price for new cars is obtained. The residual value rate is calculated based on the transaction price of the used car and the corresponding official guide price of the new car; A multi-factor residual value rate regression model is constructed, which includes vehicle age, mileage, vehicle condition, brand influence factor, and model influence factor. The brand influence factor and model influence factor are calculated based on the statistical parameters of the corresponding nodes. Based on the three-level data structure of brand-series-model, residual value rate sub-models are constructed at the brand level, series level, and model level, respectively, and combined to form the evaluation model based on residual value rate.
2. The used car transaction information management method according to claim 1, characterized in that: obtaining the hierarchical dataset includes:
Collect historical transaction data for used cars, including fields such as basic vehicle information, transaction price, transaction time, vehicle configuration parameters, mileage, and vehicle condition level;
Classify the historical transaction data at the first level by brand, grouping all transaction records of the same brand under the corresponding brand node;
Under each brand node, classify the historical transaction data at the second level by vehicle series, establishing intermediate vehicle series nodes that aggregate the transaction records of the same vehicle series under the corresponding vehicle series node;
Under each vehicle series node, classify the transaction records at the third level by vehicle model, creating vehicle model leaf nodes that store the transaction records of the same vehicle model in the corresponding vehicle model node;
Calculate the statistical parameters of the brand nodes, vehicle series nodes, and vehicle model nodes respectively, the statistical parameters including the number of transaction records, average price, price variance, and transaction time, wherein the brand node statistical parameters are calculated by aggregating the node data of all subordinate vehicle series and the vehicle series node statistical parameters are calculated by aggregating the node data of all subordinate vehicle models, thereby generating the three-level brand-vehicle series-vehicle model data structure.
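The three-level grouping and node statistics of claim 2 can be sketched with the standard library alone. The record field names (`brand`, `series`, `model`, `price`, `time`) are hypothetical, prices use the population variance, and the "transaction time" statistic is summarized as the latest transaction date. All of these are assumptions, since the claim does not fix them.

```python
from collections import defaultdict
from statistics import mean, pvariance


def node_stats(records):
    """Statistical parameters for one node, computed from its records."""
    prices = [r["price"] for r in records]
    return {
        "count": len(prices),
        "avg_price": mean(prices),
        "price_var": pvariance(prices),
        # ISO-8601 date strings sort chronologically, so max() is the latest.
        "latest_time": max(r["time"] for r in records),
    }


def build_hierarchy(records):
    """Group records into brand / series / model nodes and compute statistics."""
    by_model = defaultdict(list)
    for r in records:
        by_model[(r["brand"], r["series"], r["model"])].append(r)

    # Parent nodes aggregate the data of all their subordinate nodes.
    by_series = defaultdict(list)
    by_brand = defaultdict(list)
    for (brand, series, _model), recs in by_model.items():
        by_series[(brand, series)].extend(recs)
        by_brand[brand].extend(recs)

    return {
        "brand": {k: node_stats(v) for k, v in by_brand.items()},
        "series": {k: node_stats(v) for k, v in by_series.items()},
        "model": {k: node_stats(v) for k, v in by_model.items()},
    }
```

Keys are tuples such as `("A", "S1", "M1")`, so two brands can reuse the same series or model name without colliding.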
3. The used car transaction information management method according to claim 1, characterized in that: obtaining the brand layer graph neural network weight parameters includes:
Extract the brand layer node data from the node feature matrix, and combine the statistical features generated from the brand node statistical parameters with the configuration features aggregated from the subordinate vehicle model nodes to form the brand layer feature matrix;
Based on the brand node statistical parameters in the brand layer feature matrix, calculate the similarity between brands, establish the connection relationships between brands according to the preset threshold, and generate a brand layer adjacency matrix containing self-connections;
Normalize the brand layer adjacency matrix to obtain the brand layer normalized adjacency matrix;
Construct a multi-layer graph convolutional neural network, and perform graph convolution calculation using the brand layer feature matrix and the brand layer normalized adjacency matrix to obtain the brand layer prediction output;
Using the average price in the brand node statistical parameters as the supervision label, calculate the prediction error and perform backpropagation training to obtain the brand layer graph neural network weight parameters.
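The adjacency construction, normalization, and graph convolution steps of claim 3 can be sketched as follows. The claim does not name the similarity measure, the normalization, or the activation, so cosine similarity, the symmetric normalization D^(-1/2) A D^(-1/2), and `tanh` are assumptions, as are the function names.

```python
import numpy as np


def similarity_adjacency(features, threshold):
    """Connect nodes whose cosine similarity reaches `threshold`; add self-loops."""
    norms = np.linalg.norm(features, axis=1, keepdims=True) + 1e-12
    unit = features / norms
    adj = (unit @ unit.T >= threshold).astype(float)
    np.fill_diagonal(adj, 1.0)  # self-connections
    return adj


def normalize_adjacency(adj):
    """Symmetric normalization D^(-1/2) A D^(-1/2); degrees are >= 1 via self-loops."""
    d_inv_sqrt = 1.0 / np.sqrt(adj.sum(axis=1))
    return d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]


def gcn_forward(x, a_hat, weights):
    """Multi-layer graph convolution: tanh on hidden layers, linear output layer."""
    h = x
    for w in weights[:-1]:
        h = np.tanh(a_hat @ h @ w)
    return (a_hat @ h @ weights[-1]).ravel()
```

With three nodes whose feature rows are `[1, 0]`, `[1, 0.1]`, and `[0, 1]` and a threshold of 0.9, only the first two nodes are connected, so each of them averages over the pair while the third keeps only its self-loop.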
4. The used car transaction information management method according to claim 1, characterized in that: obtaining the vehicle series layer graph neural network weight parameters includes:
Extract the vehicle series layer node data from the node feature matrix, and combine the statistical features generated from the vehicle series node statistical parameters with the configuration features aggregated from the subordinate vehicle model nodes to form the vehicle series layer feature matrix;
Calculate the similarity between vehicle series based on the statistical features in the vehicle series layer feature matrix, establish the connection relationships between vehicle series nodes according to the preset threshold, and generate a vehicle series layer adjacency matrix containing self-connections;
Normalize the vehicle series layer adjacency matrix to obtain the vehicle series layer normalized adjacency matrix;
Use the brand layer graph neural network weight parameters as the initialization parameters of the vehicle series layer graph neural network;
Perform graph convolution calculation using the vehicle series layer feature matrix and the vehicle series layer normalized adjacency matrix to obtain the vehicle series layer prediction output;
Using the average price in the vehicle series node statistical parameters as the supervision label, calculate the prediction error and perform backpropagation training to obtain the vehicle series layer graph neural network weight parameters.
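The warm-start idea in claim 4, where the brand layer weights initialize the vehicle series layer network, can be sketched with a small two-layer graph convolutional network trained by plain gradient descent. The `tanh` activation, mean squared error loss, learning rate, and function names are assumptions. The sketch also assumes that both layers share the same feature dimension, which weight reuse requires.

```python
import numpy as np


def init_weights(n_features, hidden, seed=0):
    """Small random weights for a two-layer GCN (cold start)."""
    rng = np.random.default_rng(seed)
    return [rng.normal(0.0, 0.1, (n_features, hidden)),
            rng.normal(0.0, 0.1, (hidden, 1))]


def train_gcn(x, a_hat, y, weights, lr=0.01, epochs=300):
    """Train pred = A_hat tanh(A_hat X W1) W2 against labels y with MSE.

    `weights` is copied, so the caller's (e.g. brand layer) weights are
    left untouched. Returns the trained weights and the loss history.
    """
    w1, w2 = (w.copy() for w in weights)
    losses = []
    for _ in range(epochs):
        # Forward pass.
        ax = a_hat @ x
        h = np.tanh(ax @ w1)
        ah = a_hat @ h
        err = (ah @ w2).ravel() - y
        losses.append(float(np.mean(err ** 2)))
        # Backward pass (manual backpropagation of the MSE loss).
        g_pred = (2.0 / len(y)) * err[:, None]
        g_w2 = ah.T @ g_pred
        g_h = a_hat.T @ (g_pred @ w2.T)
        g_w1 = ax.T @ (g_h * (1.0 - h ** 2))
        w1 -= lr * g_w1
        w2 -= lr * g_w2
    return [w1, w2], losses
```

A typical use would be to train first on brand data with `init_weights(...)`, then pass the returned brand weights as `weights` when training on vehicle series data. The same pattern would repeat one level down for the vehicle model layer.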
5. The used car transaction information management method according to claim 1, characterized in that: obtaining the vehicle model layer graph neural network weight parameters includes:
Extract the vehicle model layer node data from the node feature matrix, and concatenate the statistical features generated from the vehicle model node statistical parameters with the configuration features aggregated from the subordinate vehicle model nodes to form the vehicle model layer feature matrix;
Calculate the similarity between vehicle models based on the statistical features and configuration features in the vehicle model layer feature matrix, establish the connection relationships between vehicle model nodes according to the preset threshold, and generate a vehicle model layer adjacency matrix containing self-connections;
Normalize the vehicle model layer adjacency matrix to obtain the vehicle model layer normalized adjacency matrix;
Use the vehicle series layer graph neural network weight parameters as the initialization parameters of the vehicle model layer graph neural network;
Perform graph convolution calculation using the vehicle model layer feature matrix and the vehicle model layer normalized adjacency matrix to obtain the vehicle model layer prediction output;
Using the average price in the vehicle model node statistical parameters as the supervision label, calculate the prediction error and perform backpropagation training to obtain the vehicle model layer graph neural network weight parameters.
6. The used car transaction information management method according to claim 1, characterized in that: constructing the hierarchical graph neural network model includes:
According to the brand layer, vehicle series layer, and vehicle model layer graph neural network weight parameters, construct a three-layer cascaded graph neural network structure that includes a brand layer, a vehicle series layer, and a vehicle model layer;
Integrate the sparse node enhanced feature matrix into the vehicle model layer, replacing the original vehicle model layer feature matrix;
Perform graph convolution calculation on the brand layer, the vehicle series layer, and the vehicle model layer, respectively, to obtain the prediction results of each layer.
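The assembly described in claim 6 can be sketched as a per-layer forward pass in which the vehicle model layer reads the sparse node enhanced feature matrix instead of its original features. The claim does not state how information flows between the cascaded layers at inference time, so this sketch simply evaluates the three layers independently. The `levels` dictionary layout, the `tanh` activation, and the function names are assumptions.

```python
import numpy as np


def gcn_predict(x, a_hat, weights):
    """Forward pass of a multi-layer GCN with a linear output layer."""
    h = x
    for w in weights[:-1]:
        h = np.tanh(a_hat @ h @ w)
    return (a_hat @ h @ weights[-1]).ravel()


def hierarchical_predict(levels, enhanced_model_features):
    """Run the brand, series, and model layers and collect their predictions.

    levels : dict with keys "brand", "series", "model", each mapping to a
             tuple (feature_matrix, normalized_adjacency, weights)
    enhanced_model_features : replaces the model layer feature matrix
    """
    outputs = {}
    for name in ("brand", "series", "model"):
        x, a_hat, weights = levels[name]
        if name == "model":
            if enhanced_model_features.shape != x.shape:
                raise ValueError("enhanced matrix must match model layer shape")
            x = enhanced_model_features  # substitute the enhanced features
        outputs[name] = gcn_predict(x, a_hat, weights)
    return outputs
```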
7. The used car transaction information management method according to claim 1, characterized in that: calculating the data sufficiency coefficient of the vehicle to be evaluated includes computing the coefficient from the following quantities: N, the number of valid transaction records for the corresponding node within the preset time window; a sufficiency baseline value; and a sensitivity adjustment parameter.
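The formula of claim 7 is not reproduced in this text, so its exact form is unknown. Purely as an illustration of how a record count, a baseline, and a sensitivity parameter could combine into a coefficient between 0 and 1, the sketch below assumes a logistic function. That choice, the function name, and the parameter names are all hypothetical and should not be read as the claimed formula.

```python
import math


def data_sufficiency(n_records, baseline, sensitivity):
    """Map a record count to (0, 1) with a logistic curve (assumed form).

    n_records   : valid transaction records for the node in the time window
    baseline    : record count at which the coefficient equals 0.5
    sensitivity : how sharply the coefficient rises around the baseline
    """
    return 1.0 / (1.0 + math.exp(-sensitivity * (n_records - baseline)))
```

Under this assumed form, a node with exactly `baseline` records scores 0.5, sparse nodes score below 0.5, and well-covered nodes approach 1.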