Intelligent decision-making method for community renewal planning and design based on multimodal data fusion
By using multimodal data fusion and graph neural network modeling, combined with collaborative optimization of the needs of multiple stakeholders and closed-loop feedback adjustment, the method addresses the problems of data correlation and integration of multi-party needs in community renewal planning and enables more efficient community renewal planning decisions.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- 中国城市和小城镇改革发展中心
- Filing Date
- 2026-03-13
- Publication Date
- 2026-06-19
AI Technical Summary
Existing technologies in community renewal planning have failed to fully explore the deep relationships between multimodal data, ignored spatial topological and functional relationships, failed to systematically integrate the needs of diverse stakeholders, and lacked a closed-loop feedback mechanism, resulting in insufficient scientific rigor, accuracy, and coordination in planning decisions.
By acquiring spatiotemporally heterogeneous multimodal data, merging cross-modal features, constructing community spatial relationship graphs, conducting collaborative analysis of the needs of multiple stakeholders, and dynamically adjusting through closed-loop feedback, the system employs a multi-head cross-attention mechanism, graph neural networks, and game theory framework to achieve deep integration of multimodal data and collaborative optimization of the needs of multiple parties, forming an end-to-end closed-loop optimization system.
It significantly improves the representation ability of multimodal features, enhances the spatial rationality of planning schemes and the satisfaction of the needs of multiple parties, improves the adaptive optimization ability of decision-making models, and enhances the scientific nature and synergy of planning decisions.
Figure CN122241074A_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the fields of urban planning and artificial intelligence technology, specifically to an intelligent decision-making method for community renewal planning and design based on multimodal data fusion.
Background Technology
[0002] As China's new urbanization process enters a high-quality development stage, the construction of embedded service facilities in urban communities has become an important task in ensuring people's well-being and improving social governance. The "Implementation Plan for the Construction of Embedded Service Facilities in Urban Communities," forwarded by the General Office of the State Council in 2023, clearly proposed selecting pilot cities nationwide to promote the construction of embedded service facilities and effectively integrate and extend high-quality, inclusive public service resources to the community level. However, how to scientifically and efficiently identify community spaces requiring priority renovation, how to reach consensus among diverse stakeholders, and how to generate feasible planning solutions have become pressing technical challenges.
[0003] With China's urban development gradually shifting towards stock renewal and high-quality development, optimizing the living circle by combining the renovation and upgrading of old residential areas, the smart upgrading of communities, and the improvement of surrounding service resources has become a key task in urban community planning and governance. At the community level, the main challenges at present are how to accurately identify the shortcomings of old community spaces and service facilities, how to coordinate the demands of various stakeholders, and how to maintain the sustainability of subsequent operations.
[0004] The construction of embedded service facilities in urban communities involves multiple stages, including property rights collection, resource coordination, planning approval, platform construction, and multi-party sharing of costs. It faces challenges such as complex community governance, difficulties in policy implementation, cumbersome engineering implementation, and economic operation challenges. In terms of community governance, difficulties include fragmented property rights, complex management entities, and significant NIMBY (Not In My Backyard) effects. Regarding policy implementation, there are issues such as a lack of practical policies, weak representativeness of existing cases, and a wide range of policy-related departments. In terms of engineering implementation, challenges include complex approval procedures, complex technical systems, and cumbersome construction processes. In terms of economic operation, there are difficulties such as insufficient financial support, insufficient confidence in socialized finance, and difficulties in cooperation between state-owned and private enterprises.
[0005] In the prior art, Chinese Patent Publication No. CN 118313676 A discloses a method and apparatus for intelligent auxiliary decision-making in participatory planning of smart communities. This method spatially rasterizes a map of a target city to obtain multiple spatial grids, acquires multi-source data from each spatial grid, and constructs multiple decision indicators reflecting factors related to urban renewal and smart community planning and construction. It establishes an intelligent auxiliary decision-making model for participatory planning of smart communities, including a target layer, a criterion layer, and a scheme layer. The performance score of each potential smart community is calculated using the analytic hierarchy process (AHP), and a comprehensive evaluation score is obtained by combining the subjective evaluation tendency coefficient input by the user. Finally, communities with higher scores are displayed on the map to assist in decision-making.
[0006] However, the aforementioned existing technologies have the following shortcomings: First, the multi-source data fusion of existing technologies only adopts a simple fusion calculation method, directly splicing or weighting the statistical characteristics of each data source, failing to fully explore the deep correlations and complementary information between different modalities of data, resulting in limited representational ability of fused features and difficulty in accurately reflecting the overall state of the community; Second, existing technologies use the traditional analytic hierarchy process (AHP) for decision modeling, which, although having a certain degree of interpretability, cannot effectively model the complex spatial topological relationships and functional relationships between communities, and the evaluation of each community unit is relatively independent, ignoring the impact of spatial context on planning decisions; Third, existing technologies only consider the subjective input of a single user, failing to systematically integrate the differentiated needs of multiple stakeholders such as residents, government, and operators, making it difficult to achieve collaborative optimization when there are conflicts in the needs of multiple parties, affecting the social acceptance and feasibility of the planning scheme; Fourth, existing technologies lack a closed-loop feedback mechanism, unable to dynamically adjust the parameters of the pre-processing module based on the scheme evaluation results, limiting the adaptive optimization capability of the decision model, resulting in difficulty in continuously improving the quality of the output scheme.
[0007] Therefore, there is an urgent need for an intelligent decision-making method for community renewal planning and design that can deeply integrate multimodal heterogeneous data, effectively model community spatial relationships, collaboratively optimize the needs of multiple parties, and have closed-loop adaptive adjustment capabilities, so as to improve the scientific nature, accuracy and collaboration of planning decisions.
Summary of the Invention
[0008] The purpose of this invention is to overcome the shortcomings of existing technologies and provide an intelligent decision-making method for community renewal planning and design based on multimodal data fusion. This invention achieves intelligent, precise, and collaborative decision-making for community renewal planning by organically coupling four core technologies: deep fusion of spatiotemporal heterogeneous multimodal data, neural network modeling of community spatial relationship graphs, collaborative optimization of multi-stakeholder needs, and closed-loop feedback dynamic adjustment.
[0009] To achieve the above objectives, the technical solution adopted by the present invention is as follows:
[0010] The intelligent decision-making method for community renewal planning and design based on multimodal data fusion includes six core steps: spatiotemporal heterogeneous multimodal data collection, cross-modal feature alignment and fusion, construction of community spatial relationship graph, collaborative analysis of the needs of multiple stakeholders, intelligent generation and optimization of planning schemes, and closed-loop feedback dynamic adjustment.
[0011] The spatiotemporal heterogeneous multimodal data acquisition process obtains remote sensing imagery, point-of-interest (POI) data, population flow trajectory data, infrastructure status data, and policy text data for the target community area. Spatiotemporal alignment and standardization preprocessing are performed on these data types to generate multimodal feature vectors. This step addresses the issues of spatiotemporal inconsistency and inconsistent formats of multi-source data in existing technologies, laying a data foundation for subsequent deep fusion.
[0012] The cross-modal feature alignment and fusion method maps visual modal features, spatial modal features, temporal modal features, and semantic modal features to a unified representation space based on a multi-head cross-attention mechanism. It then determines the fusion weight coefficients of the multimodal feature vectors through adaptive weight learning. Unlike the simple fusion calculations of existing technologies, this invention employs deep neural networks to learn the complex relationships between different modalities, significantly improving the representational capability of the fused features.
[0013] The community spatial relationship graph construction method constructs a community spatial relationship graph with spatial grid units as nodes and spatial adjacency and functional association relationships as edges based on the fused representation vector. A graph convolutional network is then used to propagate and aggregate features from the community spatial relationship graph. Unlike the hierarchical analysis method in existing technologies, this invention uses graph neural networks to effectively model the spatial topological relationships between communities, enabling the features of each community unit to incorporate the contextual information of its spatial neighborhood.
[0014] The multi-stakeholder demand collaborative analysis collects resident demand data, government planning target data, and operator constraint data, encodes the demands of each party into demand vectors, and performs conflict detection and collaborative optimization of multi-stakeholder demands based on the Nash equilibrium principle. Unlike existing technologies that only consider single user input, this invention systematically integrates the differentiated demands of multiple stakeholders and seeks Pareto optimal solutions when demand conflicts occur.
[0015] The intelligent generation and optimization of the planning scheme inputs the enhanced community features, the multi-party demand vector, and the collaborative demand weights into a multi-objective constrained optimization model. Subject to land use planning constraints, facility configuration standard constraints, and investment budget constraints, a set of candidate planning schemes is generated and a comprehensive evaluation score is calculated.
[0016] The closed-loop feedback dynamic adjustment calculates parameter adjustment signals based on the difference between the comprehensive evaluation score and the preset target score. These signals are then fed back to cross-modal feature alignment and fusion, community spatial relationship graph construction, and multi-stakeholder demand collaborative analysis, updating the learnable parameters of the corresponding modules and forming an end-to-end closed-loop optimization. Unlike the open-loop decision-making process of existing technologies, the closed-loop feedback mechanism of this invention enables the entire decision-making system to have adaptive adjustment capabilities.
[0017] The technical effects of this invention include the following. Through multimodal deep fusion technology, heterogeneous data such as remote sensing images, points of interest, population trajectories, current infrastructure status, and policy texts are organically integrated, improving feature representation capabilities by approximately 15% to 25% compared to simple fusion methods. By modeling community spatial relationships using graph neural networks, spatial dependencies and functional similarities between communities can be captured, improving the spatial rationality score of planning schemes by approximately 10% to 20%. Through collaborative optimization of multi-stakeholder needs, a balanced solution acceptable to all parties can be achieved even in the presence of conflicting needs, improving overall demand satisfaction by approximately 8% to 15%. Through closed-loop feedback and dynamic adjustment, the decision-making model can adaptively optimize based on evaluation results, with the overall evaluation score continuously improving by approximately 5% to 12% during the iteration process.
Attached Figure Description
[0018] Figure 1 is an overall flowchart of the intelligent decision-making method for community renewal planning and design based on multimodal data fusion, as proposed in this invention.
[0019] Figure 2 is a schematic diagram of the cross-modal feature alignment and fusion structure of the present invention.
Detailed Implementation
[0020] The present invention is described in further detail below with reference to Figures 1-2 and specific embodiments. The technical solution provided by the present invention aims to solve the technical problems faced in planning and decision-making in the construction of embedded service facilities in urban communities. Through the organic coupling of deep fusion of multimodal data, spatial modeling with graph neural networks, collaborative optimization of multi-party needs, and closed-loop feedback adjustment, intelligent community renewal planning and design decision-making is achieved.
[0021] Referring to Figure 1, the intelligent decision-making method for community renewal planning and design based on multimodal data fusion of this invention includes the following six core steps.
[0022] Step 1: Spatiotemporal Heterogeneous Multimodal Data Acquisition. This step is the data input layer for the entire decision-making method, responsible for acquiring and preprocessing multi-source heterogeneous data from the target community area. In this embodiment, the target area is a community within an area of approximately 50 square kilometers in the central urban area of a city.
[0023] First, the target community area is spatially rasterized. Balancing the need for refined community planning with efficient data processing, the community space is divided into square spatial raster units with sides of 100 meters. Assuming the target area covers N spatial raster units, in this embodiment, N is approximately 5000. Each spatial raster unit serves as the basic spatial unit for subsequent analysis, possessing a unique spatial identifier and geographic coordinates of its center point.
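The rasterization arithmetic above can be checked with a short sketch. The function names `grid_cell_count` and `cell_center`, and the row-major identifier scheme, are illustrative assumptions and are not specified by the method.

```python
def grid_cell_count(area_km2: float, side_m: float) -> int:
    """Number of square cells of side `side_m` needed to tile `area_km2`."""
    area_m2 = area_km2 * 1_000_000.0
    return round(area_m2 / (side_m * side_m))


def cell_center(cell_id: int, n_cols: int, side_m: float) -> tuple[float, float]:
    """Center (x, y) in meters of a cell in a row-major grid (hypothetical ID scheme)."""
    row, col = divmod(cell_id, n_cols)
    return ((col + 0.5) * side_m, (row + 0.5) * side_m)


# 50 km^2 tiled by 100 m x 100 m cells, as in the embodiment.
n_cells = grid_cell_count(50, 100)
center_0 = cell_center(0, n_cols=100, side_m=100.0)
```

With the embodiment's figures, the count matches the stated value of roughly 5000 raster units.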
[0024] For the acquisition and processing of remote sensing image data, multispectral satellite remote sensing images or aerial orthophotos with a resolution of at least 2 meters are obtained for the target area. Preprocessing operations such as radiometric correction, geometric correction, and image enhancement are performed on the remote sensing images. For each spatial grid cell, image slices corresponding to the area are cropped, with a slice size of 100×100 pixels (corresponding to a 100m×100m actual area) and 3 or more channels (RGB bands). A pre-trained convolutional neural network (such as ResNet-50) is used to extract features from the image slices, resulting in a 2048-dimensional visual feature vector. The building density index (calculated from the percentage of building pixels), green space coverage (calculated from the vegetation index NDVI), and road network density (calculated from the percentage of road pixels) are further calculated for each grid cell. These three statistical features are then concatenated with the visual feature vector to form a 2051-dimensional remote sensing feature vector $x_i^{rs} \in \mathbb{R}^{2051}$, where the subscript $i$ denotes the $i$-th spatial grid cell.
[0025] For the collection and processing of Points of Interest (POI) data, data on POIs within the target area is obtained through map service open interfaces, including major categories such as catering services, shopping, lifestyle services, healthcare, education and training, culture and sports, and transportation facilities. In this embodiment, approximately 120,000 POI records were collected, each containing attributes such as name, category, latitude and longitude coordinates, and address. For each spatial grid cell, the number of POIs of each category falling within that grid area is counted, forming a category quantity distribution vector. The spatial density (number per unit area) of each category of POIs and the category diversity index (calculated using Shannon entropy) are further calculated. The quantity distribution vector, density vector, and diversity index are concatenated to form a 30-dimensional POI feature vector $x_i^{poi} \in \mathbb{R}^{30}$.
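The per-cell POI features described above (category counts, density per unit area, Shannon-entropy diversity) can be sketched as follows. The function name `poi_features`, the natural-log entropy base, and the four-category toy input are assumptions for illustration; the text does not state the category breakdown that yields 30 dimensions.

```python
import numpy as np

CELL_AREA_KM2 = 0.01  # a 100 m x 100 m cell


def poi_features(counts, cell_area_km2: float = CELL_AREA_KM2) -> np.ndarray:
    """Concatenate [counts, densities, diversity] for one grid cell.

    counts: number of POIs per category inside the cell (length C).
    """
    counts = np.asarray(counts, dtype=float)
    density = counts / cell_area_km2              # POIs per km^2, per category
    total = counts.sum()
    if total > 0:
        p = counts[counts > 0] / total            # category shares, zeros dropped
        diversity = float(-(p * np.log(p)).sum()) # Shannon entropy (natural log assumed)
    else:
        diversity = 0.0                           # empty cell: no diversity
    return np.concatenate([counts, density, [diversity]])


features = poi_features([2, 2, 0, 0])
```

For C categories this layout gives 2C + 1 values per cell.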
[0026] For the collection and processing of population flow trajectory data, anonymized mobile signaling data or GPS trajectory data for 30 consecutive days is obtained within the target area. The mobile signaling data includes user identifiers (de-identified), timestamps, base station locations, and other information. The trajectory data is cleaned to remove outliers and noise. For each spatial grid cell, population inflow, outflow, and resident numbers are counted for different time periods (morning peak 6:00-9:00, daytime 9:00-17:00, evening peak 17:00-20:00, and nighttime 20:00-6:00). The differences in population distribution between weekdays and rest days, and the tidal coefficient of population flow, are calculated. These time-series statistical features are concatenated to form a 48-dimensional population flow feature vector $x_i^{flow} \in \mathbb{R}^{48}$.
[0027] To collect and process data on the current status of facilities, the distribution of existing embedded service facilities within the target area was obtained through publicly available government data, on-site surveys, and questionnaires. The statistics included: elderly care service facilities (including community elderly care service stations, day care centers, and activity rooms for the elderly), childcare service facilities (including childcare institutions and infant care points), community canteens / meal assistance points, childcare service facilities, domestic service points, community health service stations, cultural activity centers, and sports and fitness facilities. For each spatial grid unit, the walking distance from the grid center to the nearest facility of each type was calculated, as well as the compliance status of the number of facilities of each type per thousand people (number of facilities / service population × 1000) in the community where the grid is located. The facility accessibility indicators and the per-thousand-people compliance status were encoded to form a 24-dimensional facility status feature vector $x_i^{fac} \in \mathbb{R}^{24}$.
[0028] For the collection and processing of policy text data, national and local policy documents related to the construction of embedded service facilities in urban communities were collected, including the "Implementation Plan for the Construction of Embedded Service Facilities in Urban Communities," the "Guidelines for the Construction of Embedded Service Facilities in Urban Communities (Trial)," and supporting implementation rules from various provinces and cities. Pre-trained language models (such as BERT) were used to encode the policy texts, extracting semantic features related to facility construction priorities, service function configurations, and construction standard requirements. These policy text features were then encoded into 768-dimensional policy feature vectors. Since the policy feature is at the region level rather than the raster level, the vector is copied N times and distributed to all spatial raster cells.
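[0029] After completing the data collection and feature extraction for the above data types, the multimodal feature vectors are standardized. The Z-score standardization method is used: the mean and standard deviation of each feature dimension are calculated across all raster cells, and the feature values are transformed so that each dimension has a mean of 0 and a standard deviation of 1.
[0030] $\hat{x}_{i,k}^{(m)} = \dfrac{x_{i,k}^{(m)} - \mu_k^{(m)}}{\sigma_k^{(m)}}$,
[0031] where: $x_{i,k}^{(m)}$ is the original value of the $k$-th dimension feature of the $m$-th modality for the $i$-th grid cell; $\mu_k^{(m)}$ is the mean of this feature dimension across all raster cells; $\sigma_k^{(m)}$ is the standard deviation of this feature dimension across all raster cells; $\hat{x}_{i,k}^{(m)}$ is the standardized feature value.
[0032] The standardized feature vectors of each modality are concatenated to form the multimodal feature vector of each spatial grid cell:
[0033] $x_i = [\hat{x}_i^{rs}; \hat{x}_i^{poi}; \hat{x}_i^{flow}; \hat{x}_i^{fac}; \hat{x}_i^{pol}] \in \mathbb{R}^{D}$,
[0034] where: the superscripts rs, poi, flow, fac, and pol denote the remote sensing, POI, population flow, facility status, and policy text modalities; $D$ is the original feature dimension, equal to the sum of the five modality dimensions (2051 + 30 + 48 + 24 + 768 = 2921); the semicolon indicates the vector concatenation operation.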
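A minimal NumPy sketch of the standardization and concatenation step follows. The random inputs, the helper name `zscore`, and the small `eps` added to the denominator are assumptions for illustration; `eps` is there because a column that is identical in every cell (such as a region-level vector copied to all cells) has zero standard deviation and would otherwise divide by zero.

```python
import numpy as np


def zscore(features: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Standardize each column (feature dimension) across rows (grid cells)."""
    mu = features.mean(axis=0)
    sigma = features.std(axis=0)
    return (features - mu) / (sigma + eps)  # eps guards constant columns


rng = np.random.default_rng(0)
n_cells = 8  # toy size; the embodiment uses about 5000 cells
dims = {"rs": 2051, "poi": 30, "flow": 48, "fac": 24}
raw = {m: rng.normal(5.0, 2.0, size=(n_cells, d)) for m, d in dims.items()}
# Region-level policy vector copied to every cell, as described in the text.
raw["pol"] = np.tile(rng.normal(size=768), (n_cells, 1))

order = ("rs", "poi", "flow", "fac", "pol")
x = np.concatenate([zscore(raw[m]) for m in order], axis=1)  # shape (n_cells, 2921)
```

Under this sketch the copied policy columns standardize to approximately zero, so an implementation may prefer to standardize that modality differently; the source does not say how this case is handled.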
[0035] Step Two: Cross-modal Feature Alignment and Fusion. Cross-modal feature alignment and fusion is one of the core innovative steps of this invention. It aims to map heterogeneous features from different modalities to a unified semantic space and mine complementary information between modalities through deep learning. Referring to Figure 2, this step employs a multi-head cross-attention mechanism to achieve deep fusion of cross-modal features.
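[0036] First, the multimodal feature vectors are projected onto a unified representation space. Let the dimension of the unified representation space be $d$. A linear projection layer is applied to the feature vector of each modality (rs: remote sensing, poi: POI, flow: population flow, fac: facility status, pol: policy text):
[0037] $h_i^{rs} = W^{rs} \hat{x}_i^{rs} + b^{rs}$,
[0038] $h_i^{poi} = W^{poi} \hat{x}_i^{poi} + b^{poi}$,
[0039] $h_i^{flow} = W^{flow} \hat{x}_i^{flow} + b^{flow}$,
[0040] $h_i^{fac} = W^{fac} \hat{x}_i^{fac} + b^{fac}$,
[0041] $h_i^{pol} = W^{pol} \hat{x}_i^{pol} + b^{pol}$,
[0042] where: $W^{m} \in \mathbb{R}^{d \times D_m}$ is the projection weight matrix for the $m$-th modality; $b^{m} \in \mathbb{R}^{d}$ is the corresponding bias vector; $D_m$ is the original feature dimension of the $m$-th modality; $h_i^{m} \in \mathbb{R}^{d}$ is the projected modal feature vector.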
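[0043] Next, a multi-head cross-attention mechanism is used to learn the relationships between modalities. Let the number of attention heads be $H$ and the dimension of each head be $d_k = d / H$. The visual modality (remote sensing features) is used as the query modality and the other modalities as key-value modalities:
[0044] $Q_i^{(h)} = W_Q^{(h)} h_i^{rs}$,
[0045] $K_i^{(h)} = [W_K^{(h)} h_i^{poi}; W_K^{(h)} h_i^{flow}; W_K^{(h)} h_i^{fac}; W_K^{(h)} h_i^{pol}]$,
[0046] $V_i^{(h)} = [W_V^{(h)} h_i^{poi}; W_V^{(h)} h_i^{flow}; W_V^{(h)} h_i^{fac}; W_V^{(h)} h_i^{pol}]$,
[0047] where: $W_Q^{(h)}, W_K^{(h)}, W_V^{(h)} \in \mathbb{R}^{d_k \times d}$ are the query, key, and value projection matrices of the $h$-th attention head; $h_i^{rs}$, $h_i^{poi}$, $h_i^{flow}$, $h_i^{fac}$, and $h_i^{pol}$ are the projected remote sensing, POI, population flow, facility status, and policy text features; the semicolon indicates concatenation, with one row per key-value modality; the concatenated keys and values have dimension $4 \times d_k$.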
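[0048] Calculate the attention weights and output of the $h$-th attention head:
[0049] $\alpha_i^{(h)} = \mathrm{softmax}\left( \dfrac{K_i^{(h)} Q_i^{(h)}}{\sqrt{d_k}} \right)$,
[0050] $O_i^{(h)} = (V_i^{(h)})^{\top} \alpha_i^{(h)}$,
[0051] where: $Q_i^{(h)}$, $K_i^{(h)}$, and $V_i^{(h)}$ are the query, keys, and values of the $h$-th head, with the keys and values stacked as four rows, one per key-value modality; $\alpha_i^{(h)} \in \mathbb{R}^{4}$ is the attention weight vector, representing the degree of attention the visual modality pays to the other four modalities; $O_i^{(h)} \in \mathbb{R}^{d_k}$ is the output vector of the $h$-th head; $\sqrt{d_k}$ is a scaling factor used to stabilize training.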
[0052] The outputs of the attention heads are concatenated and linearly projected to obtain the multi-head attention output:
[0053] $M_i = W_O \, [O_i^{(1)}; O_i^{(2)}; \dots; O_i^{(H)}] + b_O$,
[0054] where: $O_i^{(h)}$ is the output vector of the $h$-th head; $W_O$ is the output projection matrix; $b_O$ is the output bias vector; $M_i$ is the output of multi-head cross-attention.
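[0055] A residual connection and layer normalization are used to improve the training stability of the model:
[0056] $r_i = \mathrm{LayerNorm}(h_i^{rs} + M_i)$,
[0057] where: $h_i^{rs}$ is the projected visual feature used as the query; $M_i$ is the multi-head cross-attention output; $\mathrm{LayerNorm}(\cdot)$ is the layer normalization operation; $r_i$ is the feature vector after the residual connection.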
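The cross-attention path up to the residual connection can be sketched in NumPy as below. This is an illustrative reading, not the patented implementation: the per-head dimension `d_k = d // n_heads`, the use of the visual feature as the residual input, the layer normalization without learnable gain and bias, and the random weights are all assumptions.

```python
import numpy as np


def softmax(x: np.ndarray) -> np.ndarray:
    e = np.exp(x - x.max())  # subtract max for numerical stability
    return e / e.sum()


def layer_norm(x: np.ndarray, eps: float = 1e-5) -> np.ndarray:
    # Normalization only; learnable gain and bias are omitted in this sketch.
    return (x - x.mean()) / np.sqrt(x.var() + eps)


def cross_attention(h_vis, h_ctx, W_q, W_k, W_v, W_o, b_o):
    """h_vis: (d,) query modality; h_ctx: (4, d) key-value modalities.

    W_q, W_k, W_v: (H, d_k, d); W_o: (d, H * d_k); b_o: (d,).
    """
    n_heads, d_k, _ = W_q.shape
    heads, weights = [], []
    for h in range(n_heads):
        q = W_q[h] @ h_vis                      # (d_k,)
        k = h_ctx @ W_k[h].T                    # (4, d_k)
        v = h_ctx @ W_v[h].T                    # (4, d_k)
        alpha = softmax(k @ q / np.sqrt(d_k))   # (4,) weights over the four modalities
        heads.append(alpha @ v)                 # (d_k,) head output
        weights.append(alpha)
    m = W_o @ np.concatenate(heads) + b_o       # multi-head output, (d,)
    r = layer_norm(h_vis + m)                   # residual connection + layer norm
    return r, np.stack(weights)


rng = np.random.default_rng(0)
d, n_heads = 16, 4
d_k = d // n_heads
h_vis, h_ctx = rng.normal(size=d), rng.normal(size=(4, d))
W_q, W_k, W_v = (rng.normal(scale=0.1, size=(n_heads, d_k, d)) for _ in range(3))
W_o, b_o = rng.normal(scale=0.1, size=(d, d)), np.zeros(d)
r, attn = cross_attention(h_vis, h_ctx, W_q, W_k, W_v, W_o, b_o)
```

Each row of `attn` is one head's distribution over the four key-value modalities, which is the quantity the text interprets as how much the visual modality attends to each of the others.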
[0058] Further nonlinear transformation is performed using a feedforward neural network:
[0059] ,
[0060] ,
[0061] where: $\mathrm{FFN}(\cdot)$ is a two-layer feedforward neural network; $W_1$ and $W_2$ are the weight matrices of the feedforward network; $\mathrm{ReLU}(\cdot)$ is the ReLU activation function.
[0062] To achieve adaptive adjustment of the fusion weights for each modality, a gating mechanism is introduced. First, the information entropy of the multimodal feature vectors is calculated:
[0063] ,
[0064] ,
[0065] where: $E^{(m)}$ is the information entropy of the $m$-th modality; $p_k^{(m)}$ is the normalized amplitude of the $k$-th dimension feature of the $m$-th modality.
[0066] Adaptive gating weights for each modality are calculated based on information entropy:
[0067] ,
[0068] where: $g^{(m)}$ is the gating weight of the $m$-th modality; $\tau$ is the temperature parameter, set to 1.0 in this embodiment.
[0069] The final fusion representation vector is obtained by gated weighted summation:
[0070] ,
[0071] where: $z_i$ is the fused representation vector of the $i$-th spatial raster cell.
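One possible reading of the entropy-based gating is sketched below. The exact formulas are not recoverable from the text, so every concrete choice here is an assumption: amplitudes are normalized by their absolute-value sum, the gates are a temperature softmax over the entropies (higher entropy receiving a larger gate), and the fused vector is the gate-weighted sum of the modality features. The function names are hypothetical.

```python
import numpy as np


def modality_entropy(h: np.ndarray, eps: float = 1e-12) -> float:
    """Shannon entropy of the normalized absolute amplitudes of one feature vector."""
    p = np.abs(h) / (np.abs(h).sum() + eps)
    return float(-(p * np.log(p + eps)).sum())


def entropy_gates(features: np.ndarray, tau: float = 1.0) -> np.ndarray:
    """features: (M, d), one row per modality. ASSUMPTION: softmax(entropy / tau)."""
    ent = np.array([modality_entropy(h) for h in features])
    z = ent / tau
    e = np.exp(z - z.max())
    return e / e.sum()


def fuse(features: np.ndarray, tau: float = 1.0):
    """Gate-weighted sum of modality features (assumed fusion form)."""
    g = entropy_gates(features, tau)
    return g @ features, g


# Toy input: a flat vector (high entropy) and a one-hot-like vector (low entropy).
toy = np.array([[1.0, 1.0, 1.0, 1.0],
                [4.0, 0.0, 0.0, 0.0]])
fused, gates = fuse(toy, tau=1.0)
```

If the intended design instead favors low-entropy (more concentrated) modalities, the sign inside the softmax would flip; the source does not settle this.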
[0072] Step 3: Construction of Community Spatial Relationship Graph. The construction of community spatial relationship graph is another core innovative step of this invention, which aims to explicitly model the topological relationships and functional associations between community spatial units through graph structure. This step first constructs the community spatial relationship graph, and then uses graph convolutional networks for feature propagation and enhancement.
[0073] Define the community spatial relationship graph $G = (V, E, A)$, where $V$ is the set of nodes, with each node corresponding to a spatial grid cell; $E$ is the set of edges, representing the relationships between grid cells; and $A$ is the adjacency matrix.
[0074] Construct spatial adjacency edges. If two spatial raster cells are geographically adjacent (eight-neighbor adjacency, i.e., top, bottom, left, right, top-left, top-right, bottom-left, bottom-right), then establish a spatial adjacency edge between them. The weight of the spatial adjacency edge is determined based on the Euclidean distance between the center points of the two raster cells:
[0075] ,
[0076] where: $d_{ij}$ is the Euclidean distance between the center points of grid cells $i$ and $j$; $\sigma$ is the distance attenuation parameter, set to 1.5 times the grid side length, i.e., 150 meters, in this embodiment; $w_{ij}^{spa}$ is the weight of the spatial adjacency edge.
[0077] Construct functional association edges. Functional association edges are established based on the functional similarity between two spatial grid cells, calculated as the cosine similarity of their POI category distribution vectors $p_i$ and $p_j$:
[0078] $s_{ij} = \dfrac{p_i \cdot p_j}{\lVert p_i \rVert \, \lVert p_j \rVert}$,
[0079] When the cosine similarity $s_{ij}$ exceeds a set threshold $\theta$, a functional association edge is established between the two grid cells, with the edge weight being:
[0080] ,
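[0081] The spatial adjacency edges and functional association edges are combined to construct the elements of the adjacency matrix $A$:
[0082] $A_{ij} = \beta_s \, w_{ij}^{spa} + \beta_f \, w_{ij}^{fun}$,
[0083] where: $w_{ij}^{spa}$ and $w_{ij}^{fun}$ are the weights of the spatial adjacency edge and the functional association edge between grid cells $i$ and $j$ (0 where no such edge exists); $\beta_s$ and $\beta_f$ are the balance coefficients for the spatial edges and the functional edges, set to 0.6 and 0.4 respectively in this embodiment.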
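[0084] Perform symmetric normalization on the adjacency matrix:
[0085] $\hat{A} = \tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2}$,
[0086] where: $\tilde{A} = A + I$ is the adjacency matrix with self-loops added, $I$ being the identity matrix; $\tilde{D}$ is the degree matrix, $\tilde{D}_{ii} = \sum_j \tilde{A}_{ij}$; $\hat{A}$ is the normalized adjacency matrix.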
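The graph construction can be sketched on a 3×3 toy grid as follows. The 0.6 / 0.4 balance coefficients, the 150 m attenuation parameter, and the eight-neighbor rule come from the text; the Gaussian-style distance kernel, the use of the cosine similarity itself as the functional edge weight, and the threshold value `theta=0.8` are assumptions, since those formulas and values are not recoverable from the source.

```python
import numpy as np


def build_adjacency(centers, poi_dist, side=100.0, sigma=150.0, theta=0.8,
                    beta_s=0.6, beta_f=0.4):
    """Combine spatial and functional edges into one weighted adjacency matrix."""
    diff = centers[:, None, :] - centers[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    # Eight-neighbor adjacency: at most one cell apart in x and y, excluding self.
    cheb = np.abs(diff).max(axis=-1)
    neighbor = (cheb <= side + 1e-9) & (dist > 0)
    w_spa = np.where(neighbor, np.exp(-dist**2 / sigma**2), 0.0)  # ASSUMED kernel
    # Cosine similarity of POI category distributions.
    norms = np.linalg.norm(poi_dist, axis=1, keepdims=True)
    unit = poi_dist / np.maximum(norms, 1e-12)
    sim = unit @ unit.T
    np.fill_diagonal(sim, 0.0)
    w_fun = np.where(sim > theta, sim, 0.0)                       # ASSUMED edge weight
    return beta_s * w_spa + beta_f * w_fun


def sym_normalize(A):
    """D^{-1/2} (A + I) D^{-1/2} with D the degree matrix of A + I."""
    A_tilde = A + np.eye(len(A))
    d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))
    return d_inv_sqrt[:, None] * A_tilde * d_inv_sqrt[None, :]


side = 100.0
centers = np.array([[(c + 0.5) * side, (r + 0.5) * side]
                    for r in range(3) for c in range(3)])
poi = np.eye(9)
poi[8] = poi[0]          # cells 0 and 8 share the same functional profile
A = build_adjacency(centers, poi, side=side)
A_hat = sym_normalize(A)
```

In this toy case cells 0 and 8 are not spatial neighbors, so their connection comes only from the functional edge, which is the situation the functional edges are meant to capture.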
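[0087] The fused representation vector $z_i$ is taken as the initial feature of node $v_i$, and the node feature matrix $H^{(0)} = [z_1, z_2, \dots, z_N]^{\top}$ is constructed.
[0088] An adaptive graph convolutional network is used to perform $L$ layers of feature propagation on the community spatial relationship graph. The graph convolution of layer $l$ is calculated as:
[0089] $H^{(l+1)} = \sigma\left( \hat{A} H^{(l)} W^{(l)} + H^{(l)} W_s^{(l)} \right)$,
[0090] where: $\hat{A}$ is the normalized adjacency matrix; $W^{(l)}$ is the graph convolution weight matrix of the $l$-th layer; $W_s^{(l)}$ is the weight matrix for the skip connection; $\sigma$ is the activation function, for which the LeakyReLU activation function is used in this embodiment; $l = 0, 1, \dots, L-1$; the value of $L$ ranges from 2 to 4, and in this embodiment it is taken as...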
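A layer-wise propagation with a skip connection can be sketched as below, assuming the common form in which each layer adds a linear skip term to the aggregated neighborhood features before a LeakyReLU. The function names, the toy three-node graph, the LeakyReLU slope of 0.2, and the random weights are illustrative assumptions.

```python
import numpy as np


def leaky_relu(x: np.ndarray, slope: float = 0.2) -> np.ndarray:
    return np.where(x > 0, x, slope * x)


def gcn_forward(A_hat, H0, layers):
    """layers: list of (W, W_skip) pairs, one per graph convolution layer."""
    H = H0
    for W, W_skip in layers:
        # Neighborhood aggregation plus a skip connection from the layer input.
        H = leaky_relu(A_hat @ H @ W + H @ W_skip)
    return H


# Toy graph: three cells in a row, self-loops added, then symmetric normalization.
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
A_tilde = A + np.eye(3)
d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))
A_hat = d_inv_sqrt[:, None] * A_tilde * d_inv_sqrt[None, :]

rng = np.random.default_rng(0)
d = 4
H0 = rng.normal(size=(3, d))
layers = [(rng.normal(scale=0.5, size=(d, d)), np.eye(d)) for _ in range(2)]
H_out = gcn_forward(A_hat, H0, layers)
```

Two layers are used here because the text gives a range of 2 to 4; after two layers each node's feature already depends on cells up to two hops away.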
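[0091] An attention mechanism is introduced to make the graph convolution adaptive. During propagation at each layer, the attention weights of the edges are dynamically calculated from the node features:
[0092] $e_{ij}^{(l)} = \mathrm{LeakyReLU}\left( (a^{(l)})^{\top} [W^{(l)} h_i^{(l)} \,\Vert\, W^{(l)} h_j^{(l)}] \right)$,
[0093] $\alpha_{ij}^{(l)} = \dfrac{\exp(e_{ij}^{(l)})}{\sum_{k \in \mathcal{N}(i)} \exp(e_{ik}^{(l)})}$,
[0094] where: $a^{(l)}$ is the attention parameter vector; $\Vert$ indicates vector concatenation; $h_i^{(l)}$ is the feature of node $i$ at the $l$-th layer and $W^{(l)}$ is the graph convolution weight matrix of that layer; $\mathcal{N}(i)$ is the set of neighboring nodes of node $i$; $\alpha_{ij}^{(l)}$ is the attention weight of node $i$ for node $j$ at the $l$-th layer.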
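The edge attention weights can be sketched in the style of a graph attention layer: score each edge from the concatenated transformed features of its two endpoints, then normalize the scores over each node's neighbors. The function name `edge_attention`, the dictionary-based neighbor lists, and the random parameters are assumptions for illustration.

```python
import numpy as np


def edge_attention(H, W, a, neighbors, slope=0.2):
    """Return {i: {j: alpha_ij}} with weights normalized over each node's neighbors."""
    Z = H @ W                                   # transformed node features
    alpha = {}
    for i, nbrs in neighbors.items():
        scores = np.array([a @ np.concatenate([Z[i], Z[j]]) for j in nbrs])
        scores = np.where(scores > 0, scores, slope * scores)   # LeakyReLU
        e = np.exp(scores - scores.max())                        # stable softmax
        alpha[i] = dict(zip(nbrs, e / e.sum()))
    return alpha


rng = np.random.default_rng(1)
H = rng.normal(size=(3, 4))        # three nodes, four features each
W = rng.normal(size=(4, 4))
a = rng.normal(size=8)             # attention vector over the concatenated pair
neighbors = {0: [1], 1: [0, 2], 2: [1]}
alpha = edge_attention(H, W, a, neighbors)
```

A node with a single neighbor always assigns it weight 1, so the attention only changes the result for nodes with several neighbors.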
[0095] The final graph convolution propagation formula is modified as follows:
[0096] ,
[0097] After $L$ layers of graph convolution propagation, the output is the enhanced community feature matrix $H^{(L)}$ carrying spatial context information; the enhanced feature of the $i$-th grid cell is $h_i^{(L)}$.
[0098] Step Four: Collaborative Analysis of Needs from Multiple Stakeholders. This step is crucial for resolving conflicts of interest among multiple stakeholders in planning decisions. It involves collecting and integrating the needs of three types of stakeholders—residents, government, and operators—and using a game theory framework to achieve collaborative optimization of those needs.
[0099] Collect resident demand data. Through a combination of online questionnaires and offline interviews, the intensity of residents' demand for various embedded service facilities was collected. The questionnaire covered six major categories of services: elderly care (including day care, home services, rehabilitation nursing, etc.), childcare services (including full-day care, half-day care, temporary care, etc.), community catering (including elderly meal assistance, student lunches, general catering, etc.), fitness and leisure (including indoor fitness, outdoor sports, cultural activities, etc.), household services (including repair services, housekeeping services, and express delivery), and health services (including health checkups, chronic disease management, psychological counseling, etc.). Residents rated the intensity of their demand for each service category on a 5-point Likert scale (1-5). The demand scores of residents within each spatial grid unit were then compiled to form the resident demand matrix $D^{res}$.
[0100] Collect government planning target data. Extract quantifiable planning targets from the "Guidelines for the Construction of Embedded Service Facilities in Urban Communities" and local supporting documents, including: no less than 40 square meters per 1,000 people for elderly care service facilities, no less than 4.5 childcare places per 1,000 people for childcare service facilities, a community canteen service coverage rate of no less than 80%, and a fitness facility area of no less than 2 square meters per person. Simultaneously, extract service radius standards, such as a service radius of no more than 500 meters for elderly care service facilities and no more than 300 meters for childcare facilities. Encode the above targets into the government demand vector $d^{gov}$.
[0101] Collect data on operator constraints. Communicate with potential operating companies and social organizations to obtain their constraints for participating in the operation of community service facilities, including: minimum operating scale requirements (e.g., no less than 20 beds in elderly care facilities), minimum return on investment requirements (e.g., no less than 8%), human resource availability constraints (e.g., difficulty in recruiting caregivers), and operating cost constraints (e.g., rent caps). Encode these operator constraints into the operational demand vector $d^{op}$.
[0102] The needs of all parties are transformed into demand vectors of a unified dimension. The resident demand matrix is aggregated and normalized by facility category to obtain the resident demand vector, and the government demand vector and the operational demand vector are each projected to the same dimension, yielding $d_1, d_2, d_3 \in \mathbb{R}^{K}$ for residents, government, and operators respectively, where $K$ is the unified demand dimension.
[0103] Calculate the conflict level matrix between demand vectors. Define the conflict measurement function:
[0104] ,
[0105] where the function value is the degree of demand conflict between the i-th party and the j-th party, and the diagonal elements of the conflict level matrix are 0.
[0106] The collaborative demand weights are solved based on the Nash equilibrium principle. The demand coordination problem is modeled as a three-party non-cooperative game in which each party's strategy is the weight coefficient of its own demand, and the utility function of each party is:
[0107] ,
[0108] where the formula uses the utility benchmark value obtained when the demand of the i-th party is met and a conflict penalty coefficient, which is set to 0.5 in this embodiment.
[0109] The Nash equilibrium condition requires each party to maximize its own utility given the strategies of the other parties:
[0110] ,
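The condition in paragraph [0110] is not legible in this text. In standard game-theoretic notation (the symbols $w_i$, $w_{-i}$, and $U_i$ are assumed here and are not taken from the source), a weight profile $(w_1^*, w_2^*, w_3^*)$ is a Nash equilibrium when no party can raise its own utility by changing only its own weight:

```latex
U_i\left(w_i^{*},\, w_{-i}^{*}\right) \;\ge\; U_i\left(w_i,\, w_{-i}^{*}\right)
\quad \text{for every feasible } w_i \text{ and every party } i \in \{1, 2, 3\},
```

where $w_{-i}^{*}$ denotes the equilibrium weights of the other two parties.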
[0111] The Nash equilibrium is solved with an iterative best-response algorithm:
[0112] ,
[0113] where the iteration index counts the iterations; the iteration terminates when the weight change is less than a threshold or the number of iterations reaches the upper limit.
[0114] Finally, the collaborative demand weight vector and the integrated multi-party demand vector are obtained:
[0115] ,
[0116] In a typical scenario of this embodiment, the calculated collaborative demand weights are approximately …, …, and …, which reflects the collaborative principle of prioritizing residents' needs, taking into account government objectives, and ensuring operational constraints.
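The conflict matrix, the iterative best-response solution, and the integrated demand vector can be sketched as follows. The conflict measurement function and the utility function are not legible in this text, so the sketch substitutes hypothetical ones: conflict is taken as one minus cosine similarity (which gives a zero diagonal, as the description requires), and each party's utility is assumed to be `b_i*w_i - penalty*w_i*sum_j(C_ij*w_j) - 0.5*w_i**2`, whose best response has a closed form. The final rescaling of the weights to sum to 1 and the weighted-sum integration are also assumptions. Only the penalty coefficient 0.5 and the termination rule come from the description.

```python
import math


def cosine_conflict(u, v):
    """Assumed conflict measure: 1 - cosine similarity of two demand vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return 1.0 - dot / (nu * nv)


def conflict_matrix(vectors):
    """Pairwise conflict levels; diagonal elements are 0."""
    n = len(vectors)
    return [
        [0.0 if i == j else cosine_conflict(vectors[i], vectors[j]) for j in range(n)]
        for i in range(n)
    ]


def best_response_weights(benchmarks, conflict, penalty=0.5, tol=1e-6, max_iter=200):
    """Iterative best response under the assumed quadratic utility.

    Maximizing b_i*w_i - penalty*w_i*pressure - 0.5*w_i**2 over [0, 1] gives
    w_i = clip(b_i - penalty*pressure, 0, 1), where pressure = sum_j C_ij * w_j.
    """
    n = len(benchmarks)
    w = [1.0 / n] * n
    for _ in range(max_iter):
        max_change = 0.0
        for i in range(n):
            pressure = sum(conflict[i][j] * w[j] for j in range(n) if j != i)
            new_wi = min(1.0, max(0.0, benchmarks[i] - penalty * pressure))
            max_change = max(max_change, abs(new_wi - w[i]))
            w[i] = new_wi
        if max_change < tol:  # weight change below threshold: stop
            break
    total = sum(w)
    # Rescale so the weights sum to 1 (assumed post-processing step).
    return [x / total for x in w] if total > 0 else [1.0 / n] * n


def integrate_demand(vectors, weights):
    """Assumed integration rule: weighted sum of the party demand vectors."""
    dim = len(vectors[0])
    return [sum(weights[i] * vectors[i][k] for i in range(len(vectors))) for k in range(dim)]


if __name__ == "__main__":
    demands = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # residents, government, operators
    C = conflict_matrix(demands)
    w = best_response_weights([1.0, 0.8, 0.6], C)
    print(w, integrate_demand(demands, w))
```

With a different utility function the best response would generally need a one-dimensional numerical maximization in place of the closed-form update.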
[0117] Step Five: Intelligent Generation and Optimization of Planning Schemes
[0118] Intelligent generation and optimization of planning schemes is a crucial step in transforming the aforementioned analysis results into implementable planning solutions. This step constructs a multi-objective constrained optimization model and employs a hybrid optimization algorithm to generate and evaluate candidate schemes.
[0119] Define decision variables. Suppose that a number of candidate sites for adding or modifying embedded service facilities exist within the target area. For each candidate site, the decision variables are: whether the site is selected; the type of facility configured at the site (an index ranging over the number of facility types); and the facility scale at the site (unit: square meters).
[0120] Construct the objective function. The optimization objective is to maximize the overall benefits of facility configuration.
[0121] ,
[0122] where the three terms are the service coverage benefit, which measures the extent to which facilities meet the needs of residents; the spatial accessibility benefit, which measures the spatial accessibility of facilities; and the social equity benefit, which measures the spatial balance of facility allocation. The weighting coefficients of the benefit terms are determined from the collaborative demand weights.
[0123] The formula for calculating service coverage benefits is:
[0124] ,
[0125] where the formula uses the integrated demand intensity of the i-th grid cell for the k-th facility type, the distance from grid cell i to facility site j, the service radius standard for the k-th facility type, and an indicator function.
[0126] The formula for calculating spatial accessibility benefits is:
[0127] ,
[0128] The formula for calculating social equity benefits is:
[0129] ,
[0130] That is, the variance of service coverage across grid cells is minimized to promote equitable spatial allocation.
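The service coverage and social equity terms can be illustrated with a short sketch. The exact formulas are not legible in this text, so the code assumes that a grid cell's demand for a facility type counts as covered when at least one facility of that type lies within the service radius, and that equity is the negative variance of per-cell coverage. The accessibility term is omitted because its form cannot be inferred. All names and data layouts are hypothetical.

```python
import math
from statistics import pvariance


def coverage_benefit(grids, facilities, radius):
    """Return (total covered demand, per-grid covered demand).

    grids:      list of {"xy": (x, y), "demand": {facility_type: intensity}}
    facilities: list of (x, y, facility_type)
    radius:     {facility_type: service radius in meters}
    """
    per_grid = []
    for g in grids:
        covered = 0.0
        for ftype, intensity in g["demand"].items():
            # Indicator: 1 if any facility of this type is within its service radius.
            in_range = any(
                f[2] == ftype and math.dist(g["xy"], f[:2]) <= radius[ftype]
                for f in facilities
            )
            if in_range:
                covered += intensity
        per_grid.append(covered)
    return sum(per_grid), per_grid


def equity_benefit(per_grid):
    """Negative variance of per-grid coverage: higher means more even coverage."""
    return -pvariance(per_grid)


if __name__ == "__main__":
    grids = [
        {"xy": (0.0, 0.0), "demand": {"elderly_care": 1.0}},
        {"xy": (1000.0, 0.0), "demand": {"elderly_care": 0.5}},
    ]
    total, per_grid = coverage_benefit(grids, [(300.0, 0.0, "elderly_care")], {"elderly_care": 500.0})
    print(total, per_grid, equity_benefit(per_grid))
```

The example uses the 500-meter elderly care radius quoted in the government targets; `math.dist` requires Python 3.8 or later.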
[0131] Set constraints. Land use planning constraints:
[0132] ,
[0133] ,
[0134] where the symbols denote the total land area available for embedded service facilities and, for each candidate site, the set of permitted facility types (determined by land use).
[0135] Facility configuration standard constraints:
[0136] ,
[0137] where the symbols denote the per-thousand-population standard for the k-th facility type and the total population served in the target area.
[0138] Investment budget constraints:
[0139] ,
[0140] where the symbols denote the construction cost per unit area for the k-th facility type, the equipment configuration cost for the k-th facility type, and the upper limit of the total investment budget.
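The three groups of constraints can be read as a feasibility check on a candidate layout. The sketch below is one possible interpretation, since the constraint formulas are not legible in this text: total selected area must not exceed the available land, each facility type must be permitted at its site, supplied area per type must reach the per-thousand standard, and construction plus equipment cost must stay within budget. Treating every standard as an area per 1,000 people is a simplification (the description also uses places per 1,000 people), and all names are hypothetical.

```python
def is_feasible(sites, available_area, allowed_types, standards, population,
                unit_cost, equip_cost, budget):
    """Check a candidate layout against land, standard, and budget constraints.

    sites: list of {"id", "selected", "type", "area"} dictionaries.
    allowed_types: {site_id: set of permitted facility types}.
    standards: {facility_type: required area per 1,000 people} (assumed unit).
    """
    chosen = [s for s in sites if s["selected"]]

    # Land use: total facility scale within the available land area.
    if sum(s["area"] for s in chosen) > available_area:
        return False
    # Land use: facility type must be permitted at the site.
    if any(s["type"] not in allowed_types[s["id"]] for s in chosen):
        return False
    # Configuration standards: supply per type must meet the per-thousand standard.
    for ftype, per_thousand in standards.items():
        supplied = sum(s["area"] for s in chosen if s["type"] == ftype)
        if supplied < per_thousand * population / 1000.0:
            return False
    # Budget: construction cost per unit area plus equipment cost per facility.
    cost = sum(unit_cost[s["type"]] * s["area"] + equip_cost[s["type"]] for s in chosen)
    return cost <= budget


if __name__ == "__main__":
    layout = [{"id": "a", "selected": True, "type": "elderly_care", "area": 100.0}]
    print(is_feasible(layout, 150.0, {"a": {"elderly_care"}}, {"elderly_care": 40.0},
                      2000, {"elderly_care": 1000.0}, {"elderly_care": 50000.0}, 200000.0))
```

In an optimizer, a check like this is typically used either to reject infeasible candidates or to drive a penalty term in the fitness function.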
[0141] A hybrid optimization strategy combining a genetic algorithm and gradient optimization is employed to solve the problem. In the genetic algorithm part, binary encoding represents the site selection decisions and real-number encoding represents the scale decisions; the population size is set to 100, the crossover probability to 0.8, the mutation probability to 0.05, and the number of iterations to 200. In the gradient optimization part, the continuous decision variables are locally optimized with the L-BFGS algorithm, starting from the site selection scheme obtained by the genetic algorithm.
[0142] After hybrid optimization, a set of candidate planning schemes is obtained, and the top-ranked schemes by fitness are retained as candidates.
[0143] Calculate the comprehensive evaluation score for each candidate solution. The technical feasibility score is determined by the extent to which the scheme meets the facility configuration standards:
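The genetic-algorithm half of the hybrid strategy can be sketched in plain Python using the stated parameters (population 100, crossover probability 0.8, mutation probability 0.05, 200 generations). The selection scheme, crossover operator, and elitism are not specified in the source; binary tournament selection, one-point crossover, and single-elite carry-over are assumptions made here. The sketch covers only the binary site-selection encoding; the real-valued scale variables and the L-BFGS refinement are not shown, and `fitness` is a hypothetical user-supplied function.

```python
import random


def tournament(pop, fitness, rng):
    """Binary tournament selection (assumed selection scheme)."""
    a, b = rng.sample(pop, 2)
    return a if fitness(a) >= fitness(b) else b


def genetic_search(fitness, n_sites, pop_size=100, p_cross=0.8, p_mut=0.05,
                   generations=200, seed=0):
    """Search binary site-selection vectors that maximize `fitness`."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_sites)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        nxt = [best[:]]  # elitism: carry the best individual forward unchanged
        while len(nxt) < pop_size:
            p1 = tournament(pop, fitness, rng)
            p2 = tournament(pop, fitness, rng)
            child = p1[:]
            if n_sites > 1 and rng.random() < p_cross:
                cut = rng.randint(1, n_sites - 1)  # one-point crossover
                child = p1[:cut] + p2[cut:]
            # Bit-flip mutation applied independently to each gene.
            child = [1 - g if rng.random() < p_mut else g for g in child]
            nxt.append(child)
        pop = nxt
        best = max(pop, key=fitness)
    return best


if __name__ == "__main__":
    # Toy fitness: number of positions that match a fixed pattern.
    pattern = [1, 0, 1, 1, 0, 0, 1, 0]
    score = lambda bits: sum(int(a == b) for a, b in zip(bits, pattern))
    print(genetic_search(score, n_sites=len(pattern)))
```

In a full implementation, the fitness function would evaluate the weighted objective for a layout, with infeasible layouts penalized or repaired.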
[0144] ,
[0145] The economic feasibility score is determined by the estimated rate of return on investment of the scheme:
[0146] ,
[0147] where the formula uses the estimated rate of return on investment.
[0148] The social acceptance score is calculated from the degree of matching between the scheme and residents' needs:
[0149] ,
[0150] where the formula uses the service supply vector that the scheme provides to grid cell i.
[0151] The overall evaluation score is:
[0152] ,
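The formula in paragraph [0152] is not legible in this text. Based on the weighted-sum definition and the weights 0.3, 0.3, and 0.4 stated in claim 7, it can be written as follows, where the symbol names $S$, $S_{\text{tech}}$, $S_{\text{econ}}$, and $S_{\text{soc}}$ are notation assumed here for the overall, technical feasibility, economic feasibility, and social acceptance scores:

```latex
S \;=\; 0.3\, S_{\text{tech}} \;+\; 0.3\, S_{\text{econ}} \;+\; 0.4\, S_{\text{soc}}
```

For example, sub-scores of 0.8, 0.7, and 0.9 give $S = 0.24 + 0.21 + 0.36 = 0.81$.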
[0153] Step Six: Closed-Loop Feedback Dynamic Adjustment. Closed-loop feedback dynamic adjustment is the key mechanism for achieving end-to-end adaptive optimization in this invention. This step adjusts the parameters of the preceding modules in reverse based on the scheme evaluation results, improving decision quality through iterative optimization.
[0154] Define the feedback objective function. Given a preset target score (0.85 in this embodiment), calculate the difference between the comprehensive evaluation score of the current optimal solution and the target score:
[0155] ,
[0156] Determine whether convergence has occurred. When the absolute value of the difference is less than the convergence threshold (ranging from 0.01 to 0.05, and 0.02 in this embodiment), convergence is declared, and the current optimal solution is output as the final result.
[0157] When convergence has not been reached, the parameter adjustment signals are calculated. The overall evaluation score is decomposed into the contributions of its sub-scores, and the main sources of the score gap are analyzed:
[0158] if the technical feasibility score is low, compliance with the facility configuration standards is insufficient, and the adjustment signal is directed toward strengthening constraint satisfaction; if the economic feasibility score is low, economic feasibility is insufficient, and the adjustment signal is directed toward optimizing cost-effectiveness; if the social acceptance score is low, social acceptance is insufficient, and the adjustment signal is directed toward better matching demand.
[0159] The fusion weight adjustment signal is calculated as:
[0160] ,
[0161] where the learning rate for the fusion weights ranges from 0.01 to 0.1 and is set to 0.05 in this embodiment; the derivative term in the formula is approximated by numerical differences.
[0162] The fusion weight adjustment signal is fed back to the cross-modal feature alignment and fusion step to update the gating mechanism parameters:
[0163] ,
[0164] ,
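[0165] The graph convolution layer-number adjustment signal is calculated case by case. In the first case, the monitored indicator differs from its expected value by more than 0.1 and the current number of layers still allows an increase, so the number of propagation layers is increased (to capture a wider range of spatial context). In the second case, the monitored indicator differs from its expected value by more than 0.1 and the current number of layers still allows a decrease, so the number of propagation layers is reduced (to preserve local features). Otherwise, the number of layers is left unchanged.
[0166] The demand weight adjustment signal is calculated as: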
[0167] ,
[0168] where the learning rate for the demand weights ranges from 0.02 to 0.08 and is 0.05 in this embodiment.
[0169] The demand weight adjustment signal is fed back to the multi-stakeholder demand collaborative analysis step, where the collaborative demand weights are adjusted and the integrated demand vector is recalculated.
[0170] Set the maximum number of iterations, which ranges from 50 to 200 and is 100 in this embodiment. The iteration process is as follows: (1) initialize the parameters of each module; (2) execute steps one to five to generate candidate solutions and calculate the comprehensive evaluation score; (3) determine whether convergence has been achieved or the maximum number of iterations has been reached; (4) if the termination condition is not met, calculate the parameter adjustment signals, update the parameters of each module, and return to (2); (5) output the optimal planning solution.
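The iteration process above can be sketched as a generic control loop. The target score 0.85, convergence threshold 0.02, and iteration cap 100 are the embodiment's values; the `evaluate` and `adjust` callables are hypothetical placeholders standing in for "run steps one to five and score the best scheme" and "compute adjustment signals and update module parameters". The toy update rule in the usage example is invented purely to exercise the loop.

```python
def closed_loop(evaluate, adjust, params, target=0.85, threshold=0.02, max_iter=100):
    """Iterate until the score is within `threshold` of `target` or the cap is hit.

    evaluate(params) -> comprehensive evaluation score of the best scheme.
    adjust(params, gap) -> updated parameters given the gap to the target.
    Returns (final parameters, final score, score history).
    """
    score = evaluate(params)          # steps one to five with initial parameters
    history = [score]
    for _ in range(max_iter):
        gap = target - score          # difference from the preset target score
        if abs(gap) < threshold:      # convergence test
            break
        params = adjust(params, gap)  # feed adjustment signals back to the modules
        score = evaluate(params)      # regenerate and re-score candidate schemes
        history.append(score)
    return params, score, history


if __name__ == "__main__":
    # Toy stand-ins: the "score" is stored directly in the parameters and each
    # adjustment closes half of the remaining gap.
    evaluate = lambda p: p["score"]
    adjust = lambda p, gap: {"score": p["score"] + 0.5 * gap}
    _, final, trace = closed_loop(evaluate, adjust, {"score": 0.72})
    print(final, len(trace))
```

Passing the pipeline in as callables keeps the loop independent of how the fusion, graph, and demand modules are implemented.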
[0171] Through closed-loop feedback iterative optimization, the comprehensive evaluation score continuously improves during the iteration process. In a typical scenario of this embodiment, the initial comprehensive evaluation score of the solution is approximately 0.72, which converges to 0.86 after about 60 iterations, representing an improvement of approximately 19.4%.
[0172] This invention also provides a system for implementing the above method. The system includes a spatiotemporal heterogeneous multimodal data acquisition module, a cross-modal feature alignment and fusion module, a community spatial relationship graph construction module, a multi-stakeholder demand collaborative analysis module, a planning scheme intelligent generation and optimization module, and a closed-loop feedback dynamic adjustment module.
[0173] The spatiotemporal heterogeneous multimodal data acquisition module is used to acquire remote sensing imagery, point-of-interest (POI) data, population flow trajectory data, infrastructure status data, and policy text data for the target community area. It performs spatiotemporal alignment and standardization preprocessing on these data types to generate multimodal feature vectors. This module interfaces with remote sensing satellite data platforms, map service platforms, mobile operator data platforms, and government data open platforms to achieve automatic acquisition and updating of multi-source data. The data preprocessing submodule is responsible for data cleaning, format conversion, coordinate unification, and feature extraction.
[0174] The cross-modal feature alignment and fusion module maps the multimodal feature vectors to a unified representation space based on a multi-head cross-attention mechanism. It determines the fusion weight coefficients of the multimodal feature vectors through adaptive weight learning, generating a fused representation vector. This module is implemented based on a deep learning framework and includes a modality projection layer, a multi-head cross-attention layer, a feedforward neural network layer, and a gated fusion layer. The module supports GPU-accelerated computation and processes 5,000 grid cells in a single fusion operation in approximately 2 seconds.
[0175] The community spatial relationship graph construction module is used to construct a community spatial relationship graph based on fused representation vectors. It then uses a graph convolutional network to propagate and aggregate features from the graph, outputting enhanced community features carrying spatial context information. This module includes a graph construction submodule and a graph convolutional computation submodule. The graph construction submodule automatically constructs the adjacency matrix of the graph based on spatial adjacency and functional association relationships. The graph convolutional computation submodule employs sparse matrix operations to optimize the feature propagation efficiency of large-scale graphs.
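The graph construction and feature propagation described for this module can be illustrated with a simplified sketch. It builds edges from spatial adjacency (4-neighbour grid cells) and functional association (cosine similarity above a threshold), then performs a fixed number of rounds of neighbourhood averaging with self-loops. This is a deliberately reduced stand-in for a graph convolutional network: there are no learned weight matrices or nonlinearities, and the similarity threshold 0.9 and all names are assumptions made here.

```python
import math


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def build_graph(coords, features, sim_threshold=0.9):
    """coords: {node: (row, col)}; features: {node: fused representation vector}."""
    edges = {n: set() for n in coords}
    for a in coords:
        for b in coords:
            if a == b:
                continue
            (ra, ca), (rb, cb) = coords[a], coords[b]
            adjacent = abs(ra - rb) + abs(ca - cb) == 1                   # spatial adjacency edge
            similar = cosine(features[a], features[b]) >= sim_threshold  # functional edge
            if adjacent or similar:
                edges[a].add(b)
    return edges


def propagate(edges, features, layers=2):
    """Average each node's features with its neighbours', repeated `layers` times."""
    h = {n: list(v) for n, v in features.items()}
    for _ in range(layers):
        nxt = {}
        for n, nbrs in edges.items():
            group = [h[n]] + [h[m] for m in nbrs]  # self-loop plus neighbours
            nxt[n] = [sum(col) / len(group) for col in zip(*group)]
        h = nxt
    return h


if __name__ == "__main__":
    coords = {"a": (0, 0), "b": (0, 1), "c": (0, 2)}
    feats = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 0.0]}
    g = build_graph(coords, feats)
    print(g, propagate(g, feats, layers=1))
```

The nested loop is quadratic in the number of grid cells; for thousands of cells a sparse adjacency structure, as the module description suggests, would be the practical choice.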
[0176] The multi-stakeholder demand collaborative analysis module collects data on resident demands, government planning objectives, and operator constraints. It encodes each party's demands into a demand vector and performs conflict detection and collaborative optimization based on the Nash equilibrium principle, outputting collaborative demand weights and the integrated multi-stakeholder demand vector. This module includes a demand acquisition interface, a demand encoder, and a game-theoretic solver. The demand acquisition interface supports questionnaire data import, document parsing, and manual data entry. The game-theoretic solver implements an iterative best-response algorithm, with a typical solution time of less than 1 second.
[0177] The intelligent planning scheme generation and optimization module is used to input enhanced community features, multi-party demand vectors, and collaborative demand weights into a multi-objective constrained optimization model, generate a set of candidate planning schemes, and calculate the comprehensive evaluation score of each candidate scheme. This module includes an optimization model builder, a hybrid optimization solver, and a scheme evaluator. The hybrid optimization solver integrates a genetic algorithm library and a gradient optimization library, supporting parallel computation. In a typical scenario, the total time to generate 10 candidate schemes is approximately 5 minutes.
[0178] The closed-loop feedback dynamic adjustment module calculates parameter adjustment signals based on the difference between the comprehensive evaluation score and the preset target score. These signals are then fed back to the cross-modal feature alignment and fusion module, the community spatial relationship graph construction module, and the multi-stakeholder demand collaborative analysis module, updating the learnable parameters of the corresponding modules. This module includes a convergence judge, a gradient calculator, and a parameter updater. The convergence judge monitors changes in the evaluation score in real time. The parameter updater enables joint optimization across modules through parameter interfaces.
[0179] The data flow and control flow between the above modules are organized according to the process described in the method embodiment. The system as a whole adopts a microservice architecture, and each module can be independently expanded. The system provides a visual interface on both web and mobile devices, supporting interactive display and adjustment of planning schemes.
[0180] The method and system of this invention are applicable to planning and decision-making scenarios for the construction of embedded service facilities in urban communities. Embedded service facilities include, but are not limited to, community elderly care service facilities, infant and toddler care facilities, community meal assistance facilities, childcare facilities, housekeeping and convenience service facilities, health service facilities, cultural activity facilities, and sports and fitness facilities.
[0181] This invention supports two application scenarios: newly built communities and existing community upgrades. For newly built communities, the method focuses on the rational layout and standardization of facilities, ensuring that the new community has complete embedded service functions from the planning stage. For existing community upgrades, the method focuses on the integrated utilization of existing resources and the filling of gaps in new facilities. By accurately identifying areas with facility deficiencies, optimizing the functional configuration of existing facilities, and rationally planning the site selection of new facilities, the overall community service capabilities are improved.
[0182] Typical users of this invention include urban planning and management departments, development and reform departments, housing construction departments, civil affairs departments, and investment, construction, and operation management entities of community service facilities. Through the intelligent decision support of this invention, the scientific rigor, efficiency, and satisfaction of all parties involved in planning decisions can be significantly improved.
[0183] To verify the technical effectiveness of this invention, a comparative experiment was conducted in two administrative districts of a first-tier city. Experimental area A was a concentrated area of old communities in the central urban area, covering an area of approximately 35 square kilometers, containing approximately 3,500 spatial grid units, with a resident population of approximately 450,000; Experimental area B was a newly built community area in the new urban area, covering an area of approximately 28 square kilometers, containing approximately 2,800 spatial grid units, with a resident population of approximately 300,000.
[0184] The experiment compared the performance of the method of this invention with that of existing technologies on multiple evaluation indicators. The existing technology employs the participatory planning and decision-making support method for smart communities based on the analytic hierarchy process (AHP) disclosed in Chinese invention patent CN 118313676 A. Both methods use the same raw data input, and the same team of planning experts evaluates the output solutions.
[0185] In terms of feature representation capabilities, the cross-modal deep fusion method of this invention improves the accuracy of the downstream task (facility demand prediction) from 78.3% to 91.6% compared with the simple fusion method of the prior art, a relative improvement of 17.0%. This shows that the multi-head cross-attention mechanism can effectively mine complementary information between different modalities of data.
[0186] In terms of spatial relationship modeling, the graph neural network method of this invention improves the spatial rationality score (scored by experts, out of 100 points) of the generated scheme from 72.5 points to 86.8 points compared with the analytic hierarchy process (AHP) method of the prior art, a relative improvement of 19.7%. This shows that graph convolutional networks can effectively capture the spatial dependencies between communities.
[0187] Regarding the satisfaction of multiple needs, the game-theoretic collaborative optimization method of this invention, compared with the existing single-user input method, improves the overall demand satisfaction score from 0.68 to 0.79 in simulated demand conflict scenarios, a relative improvement of 16.2%. In actual questionnaire surveys, residents' satisfaction with the planning scheme increased from 65% to 82%, the government's approval of the scheme's compliance increased from 70% to 88%, and the operators' evaluation of the scheme's feasibility increased from 60% to 75%.
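The percentages quoted in paragraphs [0185] to [0187] are relative improvements over the baseline values rather than percentage-point differences, as the following arithmetic on the stated figures shows:

```latex
\frac{91.6 - 78.3}{78.3} \approx 0.170 \;(17.0\%), \qquad
\frac{86.8 - 72.5}{72.5} \approx 0.197 \;(19.7\%), \qquad
\frac{0.79 - 0.68}{0.68} \approx 0.162 \;(16.2\%)
```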
[0188] Regarding the closed-loop optimization effect, the feedback adjustment mechanism of this invention improves the overall evaluation score by an average of about 11.5% during the iteration process. In experimental region A, the initial solution score was 0.71, which converged to 0.84 after 78 iterations; in experimental region B, the initial solution score was 0.74, which converged to 0.87 after 52 iterations.
[0189] In terms of computational efficiency, the method of this invention takes about 8 minutes (including 100 closed-loop iterations) to process 5,000 grid cells on a server configured with an Intel Xeon E5-2680 v4 CPU (28 cores), an NVIDIA Tesla V100 GPU (32GB VRAM), and 256GB of RAM, which meets the timeliness requirements of actual planning work.
[0190] When applying this invention in practice, the following points should be noted.
[0191] Regarding data quality, the collection of multimodal data should ensure spatiotemporal consistency and data integrity. Remote sensing imagery data should be selected during time periods with cloud cover of less than 10% and a resolution of no less than 2 meters; point-of-interest (POI) data should be updated regularly to reflect the latest facility distribution; population flow data should ensure that anonymization processing complies with personal privacy protection requirements; and facility status data should be verified on-site to ensure accuracy.
[0192] Regarding parameter settings, the unified representation dimension D for cross-modal fusion is recommended to be set between 128 and 512: too small a value may lead to information loss, while too large a value increases computational overhead. The number of graph convolution propagation layers K is recommended to be set to 2 to 4: too few layers cannot capture long-distance spatial relationships, while too many layers may lead to over-smoothing. The convergence threshold of the closed-loop iteration is recommended to be set between 0.01 and 0.05: too small a threshold may lead to too many iterations, while too large a threshold may lead to insufficient optimization.
[0193] Regarding computing resources, the deep learning module of this invention is recommended to run on a server equipped with a GPU to ensure computing efficiency; for ultra-large-scale regions (more than 10,000 grid cells), a partitioned parallel computing strategy is recommended.
[0194] Regarding the interpretation of results, the optimal solution output by the closed-loop optimization should be used as a reference for planning decisions. The final decision still needs to be adjusted in conjunction with the professional judgment of planning experts and actual constraints. The weight coefficients of each evaluation score can be adjusted according to the policy orientation and actual situation of the specific project.
[0195] The embodiments described above are merely illustrative of specific implementations of the present invention, and while the descriptions are detailed, they should not be construed as limiting the scope of the present invention. It should be noted that those skilled in the art can make various modifications and improvements without departing from the concept of the present invention, and these modifications and improvements all fall within the scope of protection of the present invention.
Claims
1. A community renewal planning and design intelligent decision-making method based on multi-modal data fusion, characterized in that it comprises: spatiotemporal heterogeneous multimodal data acquisition: acquiring remote sensing images, points of interest, population trajectories, facility status, and policy text data of the target community, and generating multimodal feature vectors; cross-modal feature alignment and fusion: mapping the multimodal feature vectors to a unified representation space based on a multi-head cross-attention mechanism to generate a fused representation vector; community spatial relationship graph construction: constructing the community spatial relationship graph with spatial grid cells as nodes, and outputting enhanced community features through a graph convolutional network; multi-stakeholder demand collaborative analysis: collaboratively optimizing the demands of residents, government, and operators based on the Nash equilibrium principle, and outputting collaborative demand weights; intelligent generation and optimization of planning schemes: inputting the enhanced community features and the collaborative demand weights into a constrained optimization model to generate candidate schemes and calculate their comprehensive evaluation scores; and closed-loop feedback dynamic adjustment: feeding parameter adjustment signals back to the preceding steps based on the evaluation scores, and outputting the optimal planning scheme through iterative optimization.
2. The method according to claim 1, characterized in that the spatiotemporal heterogeneous multimodal data acquisition includes: dividing the community space into grid units with sides of 50 to 200 meters; extracting building density and green space coverage characteristics from remote sensing images; statistically analyzing the category distribution of points of interest within each grid; calculating population inflow, outflow, and residence duration; and extracting the coverage rates of elderly care, childcare, catering, and fitness facilities.
3. The method according to claim 1, characterized in that the cross-modal feature alignment and fusion includes: projecting the multimodal feature vectors onto a unified representation space of dimension D; using visual features as queries and calculating intermodal association weights through multi-head cross-attention; performing weighted aggregation based on the association weights and generating a fused representation vector through residual connections; and adaptively adjusting the fusion ratio according to the information entropy of each modality through a gating mechanism.
4. The method according to claim 1, characterized in that the construction of the community spatial relationship graph includes: using grid cells as nodes and the fused representation vectors as initial features; constructing spatial adjacency edges and functional association edges based on geographical adjacency and functional similarity, respectively; and propagating features through K layers of graph convolution, where K ranges from 2 to 4.
5. The method according to claim 1, characterized in that the multi-stakeholder demand collaborative analysis includes: obtaining resident demand scores, government planning indicators, and operator constraints; encoding the demands of all parties into vectors of a unified dimension; calculating the demand conflict degree matrix; and solving for the collaborative demand weights that satisfy Pareto optimality based on Nash equilibrium.
6. The method according to claim 1, characterized in that the intelligent generation and optimization of the planning scheme includes: constructing an optimization function with the objectives of service coverage, spatial accessibility, and social equity benefits; setting land use planning, facility configuration standard, and investment budget constraints; solving the problem using a hybrid strategy of genetic algorithm and gradient optimization; and calculating the comprehensive evaluation score of each candidate scheme.
7. The method according to claim 1, characterized in that the comprehensive evaluation score is a weighted sum of the technical feasibility score, economic feasibility score, and social acceptance score, with weights of 0.3, 0.3, and 0.4, respectively.
8. The method according to claim 1, characterized in that the closed-loop feedback dynamic adjustment includes: determining convergence when the absolute value of the difference between the evaluation score and the target score is less than the convergence threshold; when convergence has not been reached, calculating the adjustment signals for the fusion weights, the number of graph convolution layers, and the demand weights and updating them accordingly; wherein the maximum number of iterations is 50 to 200.
9. The method according to claim 1, characterized in that the method further includes: overlaying the optimal facility layout onto an electronic map; and generating a service coverage heat map and a demand satisfaction radar map.
10. The method according to claim 1, characterized in that the method is applied to the planning and decision-making of community-embedded service facilities, which include facilities for elderly care, infant and toddler care, meal assistance, childcare, housekeeping, and health services.
Citation Information
Patent Citations
Intelligent auxiliary decision-making method and device for intelligent community participatory planning
CN118313676A