Method for analyzing pore structure of organic matter molecular model
By combining the grid method and a preset search algorithm with the convex hull method and KD tree structure, the accuracy and efficiency problems of porosity calculation for porous materials in the prior art are solved, realizing efficient and accurate analysis of the pore structure of organic molecular models, and supporting material design and optimization.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- CHINA UNIV OF PETROLEUM (BEIJING)
- Filing Date
- 2025-07-03
- Publication Date
- 2026-06-16
AI Technical Summary
Existing molecular simulation methods struggle to accurately extract pore information when calculating the porosity and pore volume of porous materials, making it difficult to identify connected and closed pores. Furthermore, their computational algorithms are inefficient and lack precision, especially in modeling complex systems.
Using a grid-based method and a pre-defined search algorithm, the total atomic volume and model volume are calculated by inputting the model structure file of an organic molecule model. The connectivity of the pore structure is analyzed by combining the convex hull method and KD tree structure, quantifying the volume of connected pores and closed pores, and outputting the analysis results.
It enables accurate calculation of porosity and volume in complex structures, improves computational efficiency and accuracy, provides an efficient porosity analysis tool, and supports material design and optimization.
Figure CN120913718B_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of molecular simulation in oil and gas geology, specifically to the calculation and analysis of structural volume and pore volume based on the structural files of organic molecular models derived from molecular dynamics simulations, as well as methods for analyzing pore structure.
Background Technology
[0002] Porosity analysis is a crucial tool for studying the properties of porous materials, and it plays a significant role in material development and optimization. Porosity and pore volume, as key parameters of porous materials, directly affect their physical and chemical properties, such as adsorption capacity, gas storage, ion transport, and catalytic efficiency. In molecular simulations, porosity calculations not only comprehensively analyze the pore volume of materials but also further analyze the distribution characteristics of connected and closed pores. Therefore, accurately calculating the porosity characteristics of models in molecular simulations is a critical step in evaluating material performance and guiding design optimization.
[0003] Traditional experimental methods, such as the Brunauer-Emmett-Teller (BET) method or X-ray computed tomography (CT) scans, provide relatively accurate pore data but struggle to resolve the microstructure of materials at the atomic scale. Quantitative analysis of pore connectivity remains extremely difficult, especially in dynamic simulations or early material design stages. Molecular simulation techniques process atomic-level structural information, such as coordinate-based structural files generated from molecular dynamics, and can therefore extract pore characteristics directly from simulation data, offering a more efficient and lower-cost analytical approach.
[0004] However, existing molecular simulation methods face numerous challenges in the computational process, such as how to accurately extract porosity information from complex molecular structures, how to efficiently identify connected and closed pores in high-dimensional space, and how to optimize computational algorithms to handle large-scale atomic coordinate data. These problems limit the efficiency and accuracy of existing methods, especially in modeling complex porous material systems.
Summary of the Invention
[0005] This invention provides a method for analyzing the pore structure of organic matter molecular models, which overcomes the shortcomings of existing technologies. It can accurately calculate the porosity and volume in complex structures, and has an efficient calculation method and an easy-to-use interface, which can help researchers conduct pore analysis of 3D structures in multiple fields.
[0006] To achieve the above objectives, the present invention adopts the following technical solution:
[0007] In a first aspect, this application provides a method for analyzing the pore structure of an organic matter molecular model, comprising:
[0008] S1. Input the model structure file of the organic matter molecular model, which contains the atom type and three-dimensional coordinates of each atom in the model;
[0009] S2. Calculate the total atomic volume of the organic matter molecular model based on the model structure file;
[0010] S3. Determine the convex hull enclosing all atoms of the organic matter molecular model in order to calculate the model volume;
[0011] S4. Based on the model volume and the total atomic volume, calculate the pore volume and porosity of the organic matter molecular model;
[0012] S5. Based on the grid method, establish a spatial index structure of the organic matter molecular model, use a preset search algorithm to analyze the connectivity of the pore structure, and quantify the volume of connected pores and closed pores;
[0013] S6. Output the analysis results.
[0014] In one implementation, the model structure file of the input organic matter molecular model is a file in pdb or gro format, and step S1 further includes parsing the simulation box vector information in the input data.
[0015] In one implementation, S2 includes:
[0016] The volume of a single atom is estimated using van der Waals radii, and volume corrections are applied based on the identified interatomic chemical bonds. The correction calculates the total atomic volume of the organic molecular model from the atom types and interatomic connections, using preset elemental volume contribution values and chemical bond volume contribution values.
[0017] In one implementation, the specific calculation process includes:
[0018] The atomic volume of each atom is calculated using the preset van der Waals radius, based on the three-dimensional coordinates and atom type of each atom.
[0019] The total element volume contribution is obtained by multiplying the number of atoms of each element determined for each atom type by its preset element volume contribution value.
[0020] Identify and count the chemical bonds in the organic molecule model, and multiply the count of each type of chemical bond by its preset chemical bond volume contribution value to obtain the total chemical bond volume contribution.
[0021] The total atomic volume is obtained by adding the total elemental volume contribution to the total chemical bond volume contribution.
[0022] Specifically, it may include:
[0023] ① Calculate the atomic volume: assign a radius (nm) to each atom read, according to the preset van der Waals radius database, and approximate the atomic volume as the volume of a sphere whose radius is the van der Waals radius.
[0024] ② Chemical bond identification and counting; prioritize the use of acquired explicit chemical bond connection information (such as the CONECT record in the pdb file). For each explicit connection, determine its corresponding chemical bond type (e.g., based on the element type of the bonding atoms).
[0025] If explicit connection information is lacking, a distance-based bond lookup method is used: A spatial index structure of atomic coordinates (Kd tree) is constructed. Neighboring atoms within a predefined maximum bond length search radius are queried for each atom. For found atom pairs, their interatomic distances are calculated. The calculated distances are compared to a predefined database of typical bond lengths (containing approximate bond lengths and tolerance ranges for different element pairs and bond orders). The most likely bond type (e.g., single, double, triple bond) is determined based on the best match (minimum distance error within the tolerance range).
[0026] ③ Bond Type Mapping and Counting: Identified chemical bonds (whether from explicit information or distance judgment) are mapped to a predefined chemical bond contribution type database, which contains identifiers for acyclic bonds, cyclic bonds, and special functional groups (such as benzene rings and nitro groups). Each identified chemical bond type is counted. The volume contribution value of each chemical bond type (usually negative; in units of...) is looked up, and the values are summed to calculate the total chemical bond volume contribution.
[0027] Preferably, for cyclic structures and special functional groups, more complex graph theory algorithms or cheminformatics databases can be integrated for more accurate identification and counting, thereby improving the accuracy of bond type determination.
[0028] ④ Volume Contribution Summation: Calculate the total atomic volume contribution based on ①. Calculate the total chemical bond volume contribution based on the volume contribution of each chemical bond obtained in ③. Add the total elemental volume contribution to the total chemical bond volume contribution to obtain the total atomic volume of the material structure, in units of... Ensure the calculation result is not negative.
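The summation in ① to ④ can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the function names are hypothetical, the radii are the commonly tabulated Bondi values, and the entries in `BOND_VOLUME_CONTRIBUTIONS` are placeholder numbers chosen only to show the sign convention, since the source does not list its preset values.

```python
import math
import warnings
from collections import Counter

# Van der Waals radii in nm (Bondi values), standing in for the preset database.
VDW_RADII_NM = {"H": 0.120, "C": 0.170, "N": 0.155, "O": 0.152, "S": 0.180}

# HYPOTHETICAL bond volume contributions in nm^3 per bond. These numbers are
# placeholders; the source only states that such values are usually negative.
BOND_VOLUME_CONTRIBUTIONS = {"C-C": -0.002, "C-H": -0.001}


def sphere_volume(radius_nm):
    """Volume of a sphere with the given radius."""
    return 4.0 / 3.0 * math.pi * radius_nm ** 3


def total_atomic_volume(elements, bond_types):
    """Sum the atomic sphere volumes and the per-bond corrections."""
    v_spheres = sum(sphere_volume(VDW_RADII_NM[e]) for e in elements)
    counts = Counter(bond_types)
    v_bonds = sum(BOND_VOLUME_CONTRIBUTIONS[b] * n for b, n in counts.items())
    total = v_spheres + v_bonds
    if total < 0:
        # The text asks for a non-negative result, so clamp and warn.
        warnings.warn("negative total atomic volume, set to 0")
        total = 0.0
    return total
```

For an ethane-like input, `total_atomic_volume(["C", "C"] + ["H"] * 6, ["C-C"] + ["C-H"] * 6)` adds eight sphere volumes and seven bond corrections.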
[0029] In one implementation, identifying chemical bonds in an organic molecular model includes:
[0030] Obtain the explicit chemical bond connection information from the model structure file and determine the corresponding chemical bonds;
[0031] If explicit connection information is missing, a distance-based chemical bond lookup method is used to search for neighboring atoms of each atom within a preset maximum bond length search radius. For the found atom pairs, the interatomic distance is calculated. The calculated distance is compared with a preset database of typical chemical bond lengths, and the chemical bond is determined based on the error range of the comparison.
[0032] In one implementation, in step S5, the spatial index structure is a Kd-tree structure.
[0033] In one implementation, in step S5, the preset search algorithm is a breadth-first search algorithm.
[0034] Specifically, step S5 above includes:
[0035] ① Gridding: Define the boundary of the analysis region, preferably using the simulation box boundary. If no box exists, use the minimum / maximum range of atomic coordinates with added buffers. Generate a 3D uniform grid within the analysis region according to the preset grid resolution (here, the size of a methane molecule is used as the standard, which can be modified). Record the total number of grid points.
[0036] ② Spatial index construction: Obtain the coordinates of all atoms. If the system has periodic boundary conditions (especially orthogonal boxes), map the atomic coordinates into the main simulation box to handle atoms crossing the boundary. Construct a Kd tree using the atomic coordinates.
[0037] ③ Pore lattice point identification: For each lattice point, use a Kd-tree to query the indices of all neighboring atoms within a certain search radius. Calculate the distance between the lattice point and each neighboring atom. Determine whether the lattice point is located inside a sphere defined by the van der Waals radius of any neighboring atom. If the lattice point is not located inside any van der Waals sphere of neighboring atoms, mark it as a "pore lattice point". Record the total number of pore lattice points.
[0038] More preferably, when calculating pores and grid boundaries, to avoid large pore errors caused by grid points being divided outside the model, the boundary of the convex hull is used as the boundary for dividing the grid points.
[0039] ④ Connectivity Analysis (Breadth-First Search): Initialize a queue and add all "pore grid points" located on the model boundary to the queue, marking these points as "visited" and "connected". While the queue is not empty, remove a grid point (the current point) from the queue and check its six direct neighbors (up, down, left, right, front, and back). For each neighbor: if the neighbor is a "pore grid point" and has not yet been marked as "visited," mark it as "visited" and "connected" and add it to the queue. After BFS, all grid points marked as "connected" represent the pore portions connected to the outside world. Record the total number of connected grid points; pore grid points that were never visited are marked as "closed" and represent the closed pore portions.
[0040] ⑤ Calculation of connected and closed pore volume: Multiply the number of connected / closed grid points by the volume of a single grid point to obtain an estimated value of the connected / closed pore volume, in units of... The total pore volume should equal the sum of the connected pore volume and the closed pore volume, and this relationship is used as a correction to ensure the accuracy of the result. If the estimated connected volume exceeds the total pore volume, the closed volume is set to 0 and the connected volume is set equal to the total pore volume. Dividing the connected pore volume by the total pore volume yields the connected porosity.
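The volume estimate and correction described in the last step can be sketched as follows. The function name is hypothetical. The text only specifies the case where the connected estimate exceeds the total pore volume; assigning the remainder of the total to the closed pores in all other cases is an assumption made here so that the two parts always sum to the total.

```python
def corrected_pore_volumes(n_connected, n_closed, cell_volume, v_pore_total):
    """Estimate connected/closed pore volumes from grid counts and correct them.

    Returns (v_connected, v_closed, connected_fraction).
    """
    v_connected = n_connected * cell_volume
    v_closed = n_closed * cell_volume  # raw grid estimate

    if v_connected > v_pore_total:
        # Case stated in the text: cap the connected volume at the total.
        v_connected, v_closed = v_pore_total, 0.0
    else:
        # ASSUMPTION: enforce v_connected + v_closed == v_pore_total by
        # replacing the raw closed estimate with the remainder.
        v_closed = v_pore_total - v_connected

    fraction = v_connected / v_pore_total if v_pore_total > 0 else 0.0
    return v_connected, v_closed, fraction
```

The raw closed-pore estimate is still computed so that it can be logged and compared against the corrected value if desired.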
[0041] In a second aspect, a computer-readable storage medium is provided, wherein at least one computer program is stored in the computer-readable storage medium, the at least one computer program being loaded and executed by a processor to enable a computer to implement the organic molecular model pore structure analysis method as described in the first aspect.
[0042] In a third aspect, a computer device is provided, including a memory and a processor, the memory storing computer program instructions, and the processor being configured to execute the instructions to implement the method of the first aspect.
[0043] The method provided in this invention relies entirely on the structural files from molecular simulations under equilibrium conditions. By parsing three-dimensional atomic coordinate data and combining slicing analysis, chemical bond correction, and efficient spatial search algorithms, it can accurately extract and classify porosity information. Its core technologies include calculating the total pore volume of the model based on the convex hull method, using breadth-first search (BFS) and KD-tree data structures to identify connected and closed pores, and presenting the results visually. This method significantly improves the efficiency and accuracy of model porosity calculations, while providing reliable data support for simulation and optimization in materials design, representing a novel analytical tool for molecular simulations.
Attached Figure Description
[0044] Figure 1 is a schematic diagram of the calculation model for porosity and porosity ratio provided in an embodiment of the present invention.
[0045] Figure 2 is a schematic diagram of the molecular structure of the irregular organic kerogen model in Example 1 of the present invention.
[0046] Figure 3 shows the molecular calculation results of the irregular organic kerogen model in Example 1 of the present invention.
[0047] Figure 4 is a schematic diagram of the molecular structure of the regular silicon dioxide model in Example 2 of the present invention.
[0048] Figure 5 shows the molecular calculation results of the regular silicon dioxide model in Example 2 of the present invention.
Detailed Implementation
[0049] To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the described embodiments of the present invention are within the scope of protection of the present invention.
[0050] To address the shortcomings and problems of existing technologies, this application provides a method for analyzing the pore structure of an organic matter molecular model, comprising:
[0051] S1. Input the model structure file of the organic matter molecular model, which contains the atom type and three-dimensional coordinates of each atom in the model;
[0052] S2. Calculate the total atomic volume of the organic matter molecular model based on the model structure file;
[0053] S3. Determine the convex hull enclosing all atoms of the organic matter molecular model in order to calculate the model volume;
[0054] S4. Based on the model volume and the total atomic volume, calculate the pore volume and porosity of the organic matter molecular model;
[0055] S5. Based on the grid method, establish a spatial index structure of the organic matter molecular model, use a preset search algorithm to analyze the connectivity of the pore structure, and quantify the volume of connected pores and closed pores;
[0056] S6. Output the analysis results.
[0057] The above method is described in more detail below with reference to the accompanying drawings, one or more detailed embodiments, and application examples.
[0058] Detailed Method Example 1
[0059] Referring to Figure 1, an embodiment of the present invention provides a method flow, wherein the method involves the following key technical contents:
[0060] Step S1 Data Input and Preprocessing: The calculation process of this invention first requires the user to select an input file, which must be a coordinate file in .gro or .pdb format. This file contains key information such as the atomic composition and three-dimensional coordinates of the molecular model, which is the basis for subsequent analysis.
[0061] When reading a file, the system automatically calls functions that recognize the GRO or PDB format based on the file extension, parsing the file to obtain the type and coordinates of each atom. For example, based on the fixed format of GRO files, the system automatically skips the header and footer data, retaining only the valid information in the middle, including the coordinates and category of each atom, and ensuring that all length units are consistent in nm. This extracted data provides fundamental support for subsequent porosity and volume calculations.
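The GRO reading described above can be sketched as follows. The function name `parse_gro_text` is hypothetical and is not the program's own reader; the column positions follow the fixed-width GRO layout (title line, atom count line, one line per atom, box line), and coordinates are already in nm.

```python
def parse_gro_text(text):
    """Parse GRO-format text into (atoms, box).

    atoms: list of (atom_name, (x, y, z)) with coordinates in nm.
    box:   tuple of the box vector values from the last line, in nm.
    """
    lines = text.splitlines()
    n_atoms = int(lines[1])              # line 2 holds the atom count
    atoms = []
    for line in lines[2:2 + n_atoms]:    # skip the two header lines
        name = line[10:15].strip()       # columns 11-15: atom name
        x = float(line[20:28])           # columns 21-28: x in nm
        y = float(line[28:36])           # columns 29-36: y in nm
        z = float(line[36:44])           # columns 37-44: z in nm
        atoms.append((name, (x, y, z)))
    # The footer line directly after the atom block holds the box vectors.
    box = tuple(float(v) for v in lines[2 + n_atoms].split())
    return atoms, box
```

Velocity columns, if present, lie to the right of column 44 and are ignored by this sketch.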
[0062] Step S2: Calculate the total atomic volume: To calculate the total volume of atoms in the model, this invention estimates the volume of each atom as a sphere with the van der Waals radius and sums the volumes of all atoms to obtain the total atomic volume in the model. A volume correction based on chemical bond information is then applied to this initially calculated atomic volume.
[0063] Step S2 specifically includes:
[0064] Step S21 Calculate the atomic volume: First, extract the type of each atom recorded in the coordinate file, and find the corresponding van der Waals radius value according to its type. Calculate the volume of the atomic sphere based on the van der Waals radius.
[0065] Step S22: Chemical Bond Identification and Counting: The chemical bond identification module is invoked. If the input data contains explicit atomic connection records, chemical bonds between atoms are identified primarily based on these records. If explicit connection information is lacking, a chemical bond lookup based on inter-atomic distance is performed, constructing a spatial index structure (e.g., a Kd-tree) using the coordinates (in nm) of all atoms. For each atom, its neighboring atoms within a preset search radius (based on the maximum possible bond length, in nm) are queried. The distance between atom pairs is calculated (in nm). The calculated distance is compared to a predefined database of typical chemical bond lengths, BOND_LENGTHS (containing bond length ranges for different bond orders and element pairs, in nm). The most likely chemical bond and its type (e.g., C-C single bond, C=O double bond, etc.) are determined based on the best degree of distance matching (e.g., minimum error and within tolerance). It should be noted that this distance-based method has limitations in determining bond order (single / double / triple), ring properties, and special functional groups; more advanced algorithms or libraries may be needed for more accurate determination.
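A minimal sketch of the distance-based lookup in Step S22, using `scipy.spatial.cKDTree`. The function name `find_bonds_by_distance` is hypothetical, and the bond lengths, tolerance, and search radius below are approximate textbook values chosen for illustration; they are not the contents of the program's own BOND_LENGTHS database.

```python
import numpy as np
from scipy.spatial import cKDTree

# Approximate typical bond lengths in nm (illustrative values only).
BOND_LENGTHS = {
    ("C", "C"): [("single", 0.154), ("double", 0.134), ("triple", 0.120)],
    ("C", "H"): [("single", 0.109)],
    ("C", "O"): [("single", 0.143), ("double", 0.123)],
}
TOLERANCE_NM = 0.008   # assumed matching tolerance
MAX_BOND_NM = 0.20     # assumed maximum bond length search radius


def find_bonds_by_distance(elements, coords_nm):
    """Return a list of (i, j, bond_order) for atom pairs that match a bond."""
    coords = np.asarray(coords_nm, dtype=float)
    tree = cKDTree(coords)
    bonds = []
    # query_pairs yields every pair (i, j), i < j, closer than the radius.
    for i, j in sorted(tree.query_pairs(r=MAX_BOND_NM)):
        key = tuple(sorted((elements[i], elements[j])))
        candidates = BOND_LENGTHS.get(key)
        if not candidates:
            continue  # element pair not in the table
        dist = float(np.linalg.norm(coords[i] - coords[j]))
        # Best match = smallest deviation from a reference length.
        order, ref = min(candidates, key=lambda c: abs(c[1] - dist))
        if abs(ref - dist) <= TOLERANCE_NM:
            bonds.append((i, j, order))
    return bonds
```

As the text notes, matching on distance alone cannot reliably separate bond orders whose reference lengths are close, so the result is best treated as an estimate.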
[0066] Step S23 Bond Type Mapping and Counting: Map each chemical bond identified in Step S22 to a predefined list of chemical bond contribution types, count the occurrences of each successfully mapped chemical bond type, and store the counts in a dictionary. For bonds that cannot be explicitly mapped or identified, record them or issue a warning.
[0067] Step S24: Volume Contribution Summation: Take the count of each chemical bond type obtained in Step S23, match it with the database, multiply it by its corresponding unit volume contribution value, and then sum the contribution values of all chemical bonds to obtain the total chemical bond volume contribution V_bond (note that this value is usually negative). The total atomic sphere volume calculated in step S21, V_sphere, and the total chemical bond volume contribution obtained in this step, V_bond, are summed algebraically, that is: V_atom = V_sphere + V_bond. The obtained value V_atom represents the actual space volume occupied by the atoms. Ensure the final calculated total atomic volume is not negative; if a negative value is found, set it to 0 and issue a warning.
[0068] Step S3: Calculate the model volume: Call the convex hull calculation function, taking the 3D coordinate array of all atoms obtained in Step S1 as input. The convex hull algorithm finds the minimal convex polyhedron that completely encloses the N input atomic coordinate points. Similar to quicksort, it first finds the extreme points of the coordinates (e.g., the points with minimum and maximum x, y, and z), which must lie on the convex hull. Then, starting from the initial polyhedron (e.g., a tetrahedron) formed by these points, it recursively assigns the remaining points to the "outside" set of each face and adds the point farthest from the face to the convex hull, continuously expanding and updating the convex hull facets until all points are inside or on the boundary of the convex hull. The volume value is then extracted from the return result of the convex hull calculation function. This volume represents the outer boundary volume enclosing the entire atomic structure. This method is particularly suitable for structures where the atomic arrangement is irregular, the shape is complex, and the structure contains voids, channels, or complex surface irregularities.
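One way to obtain the convex hull volume of Step S3 in Python is `scipy.spatial.ConvexHull`, which wraps the Qhull library. The sketch below assumes that library is an acceptable stand-in for the convex hull calculation function mentioned in the text; the wrapper name `convex_hull_volume` is hypothetical.

```python
import numpy as np
from scipy.spatial import ConvexHull


def convex_hull_volume(coords_nm):
    """Return (volume, vertex_indices) of the convex hull of the atoms.

    coords_nm: (N, 3) array-like of atomic coordinates in nm.
    volume is in nm^3; vertex_indices index into coords_nm.
    """
    hull = ConvexHull(np.asarray(coords_nm, dtype=float))
    return hull.volume, hull.vertices
```

For the eight corners of a unit cube plus one interior point, the returned volume is 1.0 and the interior point does not appear among the hull vertices. Note that the hull is computed from atomic centers, so the van der Waals radii of surface atoms are not included in this volume.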
[0069] Step S4: Calculate Pore Volume: Pore volume is the portion of the total model volume not occupied by atoms, representing the porosity of the material. In this step, the pore volume is obtained by calculating the difference between the model volume and the atomic volume. Specifically, the system first retrieves the model volume and total atomic volume data calculated in previous steps and performs the calculation using the formula: V_pore_total = V_model - V_atom. This calculation yields the total pore volume of the model, laying the foundation for further analysis of the type and distribution characteristics of pore regions. Furthermore, the results are used to evaluate the porosity of the material, defined as the ratio of the total pore volume to the model volume, usually expressed as a percentage, which is an important parameter for measuring the performance of porous materials.
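The arithmetic of Step S4 is small enough to state directly. The helper name below is hypothetical; the symbols follow the document's own V_model, V_atom, and V_pore_total.

```python
def pore_volume_and_porosity(v_model, v_atom):
    """Return (V_pore_total, porosity in percent) from V_model and V_atom."""
    if v_model <= 0:
        raise ValueError("model volume must be positive")
    v_pore_total = v_model - v_atom
    porosity_percent = 100.0 * v_pore_total / v_model
    return v_pore_total, porosity_percent
```

With V_model = 10.0 nm^3 and V_atom = 6.0 nm^3, the total pore volume is 4.0 nm^3 and the porosity is 40 %.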
[0070] Step S5: Calculation of Closed / Connected Pore Volume: To deeply analyze the geometric characteristics and connectivity of pore regions, this invention proposes a method for identifying closed and connected pores based on spatial data structures and algorithms. A Kd-tree (K-dimensional tree) is used as the spatial index structure to efficiently search and segment the extracted pore regions. A breadth-first search (BFS) algorithm using a KD-tree is employed to analyze the connectivity of the pore regions, identifying which pore regions are connected to each other and which pores are connected to the outside of the model, marking them as connected pores. The remaining pores are closed pores. This step not only quantifies the type and distribution of pores in the material but also provides detailed references for application scenarios. The above results can be further combined with parameters such as porosity and pore ratio to provide comprehensive data support for the optimized design and functional development of materials.
[0071] Step S5 specifically includes:
[0072] Step S51: Spatial Gridding and KD Tree Construction: Determine the three-dimensional spatial analysis region based on the minimum / maximum range of atomic coordinates of the material structure. Within the determined spatial range, generate uniformly distributed three-dimensional grid points according to a preset grid resolution. Construct a KD tree from the atomic coordinates so that the atoms near each grid point can be queried efficiently.
[0073] Step S52: Pore lattice point identification: Traverse all generated lattice points. For each lattice point, use the Kd tree constructed in step S51 to query all its neighboring atoms within a predetermined search radius. Calculate the distance between the lattice point and each neighboring atom to determine whether the lattice point is located inside a sphere defined by the van der Waals radius of any neighboring atom. If a lattice point is not located inside the van der Waals sphere of any neighboring atom, then mark the lattice point as a "pore lattice point".
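Steps S51 and S52 can be sketched together as follows. The function name and parameters are hypothetical, the grid is a plain axis-aligned box (the convex hull clipping mentioned earlier in the text is not included), and the search radius is assumed to be the largest van der Waals radius present, which is sufficient to find every atom whose sphere could contain a grid point.

```python
import numpy as np
from scipy.spatial import cKDTree


def find_pore_grid_points(coords_nm, radii_nm, lower, upper, resolution_nm):
    """Return a 3D boolean array that is True at pore grid points.

    coords_nm: (N, 3) atomic coordinates; radii_nm: (N,) vdW radii.
    lower/upper: corners of the analysis box; resolution_nm: grid spacing.
    """
    coords = np.asarray(coords_nm, dtype=float)
    radii = np.asarray(radii_nm, dtype=float)

    # Uniform grid covering [lower, upper] on each axis.
    axes = [np.arange(lo, hi + 0.5 * resolution_nm, resolution_nm)
            for lo, hi in zip(lower, upper)]
    gx, gy, gz = np.meshgrid(*axes, indexing="ij")
    points = np.column_stack([gx.ravel(), gy.ravel(), gz.ravel()])

    # KD tree over the atoms; one neighbour list per grid point.
    tree = cKDTree(coords)
    neighbours = tree.query_ball_point(points, r=float(radii.max()))

    is_pore = np.ones(len(points), dtype=bool)
    for p, idx in enumerate(neighbours):
        if idx:
            dist = np.linalg.norm(coords[idx] - points[p], axis=1)
            # Inside any neighbouring vdW sphere means the point is occupied.
            is_pore[p] = not np.any(dist < radii[idx])
    return is_pore.reshape(gx.shape)
```

The Python loop over grid points keeps the logic close to the text; for large grids a vectorized nearest-atom query per element type would be faster.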
[0074] Step S53 Connectivity Analysis: Create a queue to store the grid points to be visited, and create a state array to mark the state of all grid points (0 - non-pore, 1 - pore unvisited, 2 - pore visited / connected, 3 - pore visited / closed). Mark all "pore grid points" identified in Step S52 as state 1 in the array. Find all pore grid points (points with state 1) located on the gridded spatial boundary. Update the state of these boundary pore grid points to 2 (indicating visited and connected) and add them to the queue. While the queue is not empty, remove a grid point from the queue and check its six direct spatial neighbors (up, down, left, right, front, back). For each neighbor, if the neighbor grid point is a "pore grid point" (state 1), update its state to 2 (marked as visited / connected) and add it to the queue. After the search is complete, all grid points with state 2 represent the pore parts connected to the outside world; pore grid points still in state 1 were never reached from the boundary and are set to state 3 (closed).
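The breadth-first search of Step S53 can be sketched as below, using the same four state codes. The function name is hypothetical, and setting unreached pore points to state 3 after the search is the reading assumed here, since Step S54 counts points in that state.

```python
from collections import deque

import numpy as np

NON_PORE, PORE_UNVISITED, PORE_CONNECTED, PORE_CLOSED = 0, 1, 2, 3


def classify_pore_grid(pore_mask):
    """Return a state array (0..3) for a 3D boolean pore mask."""
    mask = np.asarray(pore_mask, dtype=bool)
    state = np.where(mask, PORE_UNVISITED, NON_PORE).astype(np.int8)
    nx, ny, nz = state.shape
    queue = deque()

    # Seed the search with pore points on the outer faces of the grid.
    for i, j, k in zip(*np.nonzero(mask)):
        if i in (0, nx - 1) or j in (0, ny - 1) or k in (0, nz - 1):
            state[i, j, k] = PORE_CONNECTED
            queue.append((i, j, k))

    steps = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    while queue:
        i, j, k = queue.popleft()
        for di, dj, dk in steps:  # six face-sharing neighbours
            a, b, c = i + di, j + dj, k + dk
            if (0 <= a < nx and 0 <= b < ny and 0 <= c < nz
                    and state[a, b, c] == PORE_UNVISITED):
                state[a, b, c] = PORE_CONNECTED
                queue.append((a, b, c))

    # ASSUMPTION: pore points never reached from the boundary are closed.
    state[state == PORE_UNVISITED] = PORE_CLOSED
    return state
```

Marking a point as connected at the moment it is enqueued, rather than when it is dequeued, guarantees that each grid point enters the queue at most once.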
[0075] Step S54 Calculation of connected and closed pore volume: Count all "connected pore grid points" (state 2) and multiply the count by the volume of a single grid cell to obtain the connected pore volume. Likewise, count all "closed pore grid points" (state 3) and multiply the count by the volume of a single grid cell to obtain the closed pore volume. The results are then verified and corrected against the pore volume calculated in step S4.
[0076] Step S6 Result Display: Based on the above calculation results, the output results include volume, porosity, void ratio, etc.
[0077] Application Example 1
[0078] This embodiment provides an application example of implementing the above algorithm using the Python language, describing the process of analyzing the pore structure of an irregular kerogen material stored in a PDB format file using the method of this invention. Example 1 uses an irregular organic kerogen model containing 6660 C, H, and S atoms. The molecular configuration is shown in Figure 2, and the calculation results are shown in Figure 3.
[0079] Specifically, it includes:
[0080] 1. Import necessary modules: NumPy: for numerical computation and array operations. scipy.spatial.KDTree: for efficient nearest-neighbor lookup during pore identification and connectivity analysis. collections.deque: a double-ended queue for breadth-first search (BFS). tkinter: for building graphical user interfaces.
[0081] 2. Data Reading: In this embodiment of the invention, a PDB file (e.g., '22.pdb') is first selected and read via a graphical interface. The 'read_pdb_file' function in the program is responsible for parsing the PDB file, extracting the atom numbers, atom names, and atom coordinates, and converting them to nm units for storage. Simultaneously, the program infers the atom element type based on the atom names or element columns in the file and finds the corresponding van der Waals radii. If the PDB file contains CONECT records, the program will also read these records for subsequent chemical bond identification.
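A minimal sketch of the PDB reading step is shown below. The function name `parse_pdb_text` is hypothetical and is not the program's `read_pdb_file`; the column positions follow the fixed-width PDB layout, coordinates are converted from angstroms to nm, and falling back to the first letter of the atom name when the element column is empty is a crude assumption that fails for two-letter elements.

```python
def parse_pdb_text(text):
    """Parse PDB text into (atoms, bonds).

    atoms: dict serial -> (atom_name, element, (x, y, z) in nm)
    bonds: set of (serial_a, serial_b) pairs taken from CONECT records
    """
    atoms = {}
    bonds = set()
    for line in text.splitlines():
        record = line[:6].strip()
        if record in ("ATOM", "HETATM"):
            serial = int(line[6:11])
            name = line[12:16].strip()
            # Columns 31-54 hold x, y, z in angstroms; 10 angstroms = 1 nm.
            xyz = tuple(float(line[a:a + 8]) / 10.0 for a in (30, 38, 46))
            # Columns 77-78 hold the element symbol when present.
            element = line[76:78].strip() or name[:1]  # ASSUMED fallback
            atoms[serial] = (name, element.upper(), xyz)
        elif record == "CONECT":
            # Five-character fields: the atom itself, then bonded atoms.
            fields = [line[c:c + 5].strip() for c in range(6, 31, 5)]
            serials = [int(f) for f in fields if f]
            for other in serials[1:]:
                bonds.add(tuple(sorted((serials[0], other))))
    return atoms, bonds
```

Storing each bond as a sorted pair means that the two mirrored CONECT entries for the same bond collapse into one.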
[0082] 3. Chemical Bond Identification and Total Atomic Volume Calculation: The program iterates through all atoms, looks up the corresponding van der Waals radius based on their element type, calculates the volume of each atom as a sphere, and sums them to obtain the uncorrected total volume of atomic spheres. The program calls the 'find_bonds_and_count_types' function. Since PDB files typically contain CONECT records, the program prioritizes using these records to identify connections between atoms. By comparing the atomic pairs in the CONECT records with their spatial distances and referring to the built-in typical bond length information ('BOND_LENGTHS'), the program attempts to determine the types of chemical bonds and counts them. Based on the large number of identified skeletal chemical bonds, the program finds the corresponding volume contribution ('BOND_VOLUME_CONTRIBUTIONS') and calculates the total correction. The corrected total atomic volume (V_atom) is calculated. This volume represents the space occupied by the model's own atoms.
[0083] 4. Model Volume Calculation: The program calls the function 'calculate_model_volume_convex_hull' to calculate the convex hull volume (V_model) using all the read atomic coordinates. The convex hull volume represents the total external spatial extent of this segment. Simultaneously, the convex hull vertices are extracted as the basis for gridding the extent. This volume is usually much larger than the total atomic volume; the difference represents the potential pore space. Total Pore Volume Calculation: The program calculates the total pore volume (V_pore_total) as the model volume minus the total atomic volume (V_pore_total = V_model - V_atom), representing the total porosity space inside the material.
[0084] 5. Pore Connectivity Analysis: The program determines the gridded range based on the model boundary (convex hull vertices). A 3D regular grid covering this range is generated according to 'GRID_RESOLUTION'. A Kd tree containing atomic coordinates is constructed. All generated grid points are traversed. A large number of grid points located between model atoms, since they are not within any van der Waals spheres of nearby atoms, are identified as "pore grid points". These points constitute the mesh representation of the pore space inside the material. The program performs a breadth-first search (BFS) to analyze the connectivity of the pore grid points. First, the BFS begins with pore grid points located on the boundary of the gridded region. If the pores of the material fragment are open and extend to the model boundary, then all pore grid points along these pores will be marked as "connected pore points". These points represent pore spaces connected to the external environment. Next, the program traverses all remaining (unconnected) pore grid points. A new BFS begins from each unvisited pore point, finding an independent cluster of pore grid points. These clusters are entirely contained within the material and not connected to the outside; therefore, they are labeled "closed pore points." These points represent isolated cavities within the material or pore portions that fail to extend to segment boundaries. The connected pore volume and the closed pore volume based on BFS counts are calculated by multiplying the number of connected and closed pore points found by the lattice cell volume, respectively, and then verified against the total pore volume. The connected volume represents the connected pore space of the material, and the closed volume represents the closed pore space.
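The grouping of pore grid points into independent clusters can also be obtained with connected-component labeling. The sketch below swaps the repeated BFS described above for `scipy.ndimage.label`, whose default structuring element in 3D joins only face-sharing neighbors, matching the six-neighbor rule. The function name is hypothetical, and this is an alternative illustration rather than the program's own routine.

```python
import numpy as np
from scipy import ndimage


def split_pore_clusters(pore_mask):
    """Return (connected_mask, closed_mask, n_closed_clusters)."""
    mask = np.asarray(pore_mask, dtype=bool)
    labels, n_clusters = ndimage.label(mask)  # 6-connectivity by default

    # Mark the outer faces of the grid.
    boundary = np.zeros_like(mask)
    boundary[[0, -1], :, :] = True
    boundary[:, [0, -1], :] = True
    boundary[:, :, [0, -1]] = True

    # Clusters that touch a face are open to the outside.
    open_ids = np.unique(labels[boundary & mask])
    connected = np.isin(labels, open_ids) & mask
    closed = mask & ~connected
    closed_ids = np.setdiff1d(np.arange(1, n_clusters + 1), open_ids)
    return connected, closed, len(closed_ids)
```

Besides the connected and closed masks, the number of closed clusters gives a count of isolated cavities, which the per-point totals alone do not reveal.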
[0085] 6. User Interface and Result Display: After the calculation, the program displays the results through a graphical user interface (GUI). Users can see the calculated total atomic volume, model volume, total pore volume, connected pore volume, closed pore volume, and the corresponding porosity. For this embodiment, the total pore volume is expected to be a large positive value. The connected pore volume and closed pore volume will also be positive; their relative magnitudes depend on the specific pore structure of the material (whether it consists predominantly of open channels or of isolated cavities). These results quantify the porosity characteristics of the material.
[0086] Application Example 2
[0087] This embodiment provides an application example of implementing the above algorithm in the Python language, describing the process of analyzing the structure of a regular silica material stored in a GRO file using the method of this invention. Example 2 uses a regular silica material model containing 65,968 Si and O atoms. The molecular configuration, shown in Figure 5, is a hollow cylinder within a cuboid. The calculation results are also shown in Figure 5.
[0088] Specifically, it includes:
[0089] 1. Import necessary modules: Same as in Example 1, import NumPy, KDTree, deque, tkinter, etc.
[0090] 2. Data Reading: The program first selects and reads a GRO file (e.g., 'd14.gro') through a graphical interface. The 'read_gro_file' function in the program is responsible for parsing the GRO file, extracting atom names and atom coordinates (in nm), inferring element types from the atom names, and looking up the van der Waals radii. GRO files typically contain box vectors (in nm), which the program reads and uses for subsequent periodic boundary condition processing (although the periodic processing in the code is simplified and mainly affects Kd tree lookups and grid coordinate mapping). Chemical Bond Identification and Total Atomic Volume Calculation: The program iterates through all atoms, calculating and accumulating their van der Waals sphere volumes. The program then calls the 'find_bonds_and_count_types' function. Since GRO files typically do not contain CONECT records, the program relies mainly on the spatial distances between atoms, compared against the built-in typical bond lengths ('BOND_LENGTHS'), to identify and count chemical bonds. This step is crucial for accurately calculating the chemical bond volume correction. For the large number of skeletal chemical bonds identified, the program looks up the corresponding volume contributions ('BOND_VOLUME_CONTRIBUTIONS') and calculates the total correction. It then calculates the corrected total atomic volume (V_atom). This volume represents the space occupied by the model's own atoms.
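A minimal sketch of the distance-based bond identification and the corrected atomic volume follows. All numeric values are illustrative placeholders, not the program's tables: the radii and the Si-O bond length are typical literature values in nm, while the bond volume contribution and the tolerance are hypothetical. The brute-force pair loop stands in for the neighbor search; for a large model a Kd tree radius query would replace it.

```python
import math
from collections import Counter
from itertools import combinations

# Illustrative values only (nm and nm^3); the program's own tables are not given in the text.
VDW_RADII = {"Si": 0.210, "O": 0.152}
BOND_LENGTHS = {("O", "Si"): 0.161}
BOND_VOLUME_CONTRIBUTIONS = {("O", "Si"): -0.002}  # hypothetical overlap correction
TOLERANCE = 0.02  # hypothetical allowed deviation from the typical bond length


def find_bonds(elements, coords):
    """Count bonds per element pair by comparing distances with typical bond lengths."""
    counts = Counter()
    for i, j in combinations(range(len(coords)), 2):
        key = tuple(sorted((elements[i], elements[j])))
        typical = BOND_LENGTHS.get(key)
        if typical is None:
            continue  # no typical bond length known for this element pair
        if abs(math.dist(coords[i], coords[j]) - typical) <= TOLERANCE:
            counts[key] += 1
    return counts


def corrected_atomic_volume(elements, coords):
    """Return (V_atom, bond_counts): van der Waals sphere volumes plus bond corrections."""
    spheres = sum(4.0 / 3.0 * math.pi * VDW_RADII[e] ** 3 for e in elements)
    bonds = find_bonds(elements, coords)
    correction = sum(n * BOND_VOLUME_CONTRIBUTIONS[k] for k, n in bonds.items())
    return spheres + correction, bonds
```

Sorting each element pair before the lookup means a single table entry covers both the Si-O and O-Si orderings.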
[0091] 3. Model Volume Calculation: The program calls the function 'calculate_model_volume_convex_hull' to calculate the 3D convex hull from all the atomic coordinates that were read. The convex hull is the smallest convex polyhedron containing all atoms, and its volume represents the total volume (V_model) enclosed by the outer envelope of the model. This step also extracts the vertices of the convex hull as the basis for defining the subsequent grid range. For isolated small molecules, this convex hull volume represents the approximate spatial range occupied by the molecule; for the regular model used here it can be calculated directly and should be about 2460.
[0092] 4. Total Pore Volume Calculation: Internally, the program calculates the total pore volume (V_pore_total) as the model volume minus the total atomic volume (V_pore_total = V_model - V_atom). For a model with an internal cavity but an otherwise compact material structure, V_model is expected to be larger than V_atom, and the total pore volume should be slightly larger than the volume of the internal cavity, that is, slightly greater than 1526.
[0093] 5. Pore Connectivity Analysis: Based on the model boundary defined by the convex hull vertices, the program generates a 3D regular grid within this region according to a preset grid resolution 'GRID_RESOLUTION' (e.g., 0.38 nm). A Kd tree containing the coordinates of all atoms is constructed for fast atom lookup. The program traverses all generated grid points and, for each grid point, uses the Kd tree to efficiently look up the atoms within a surrounding search radius. If a grid point is not within the van der Waals sphere of any nearby atom (taking the atomic radius into account), it is marked as a "pore grid point". For a tightly connected molecule, only a very small number of grid points are expected to be identified as pore grid points, because there are no voids inside the molecule and the external space is delimited by the model boundary. If pore grid points are identified, the program performs a breadth-first search (BFS). First, the BFS starts from the pore grid points located on the boundary of the gridded region. All pore grid points reachable from the boundary grid points through adjacent pore grid points are marked as "connected pore points". Next, the program iterates through all remaining pore grid points that have not been marked as connected. A new BFS is started from each unvisited pore point to find an independent cluster of pore grid points, which are marked as "closed pore points". The connected pore volume and the closed pore volume are calculated by multiplying the number of connected pore points and the number of closed pore points, respectively, by the volume of a grid cell. For regular, dense small molecules, the number of closed pore points is expected to be relatively small because the number of pore grid points within the molecule is extremely limited.
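The grid generation and Kd tree lookup can be sketched as below, assuming SciPy's `cKDTree` as the Kd tree implementation. The function name `find_pore_grid_points` and its parameters are hypothetical, the grid is bounded by the axis-aligned box around the supplied boundary points, and periodic boundary conditions are ignored for simplicity.

```python
import numpy as np
from scipy.spatial import cKDTree


def find_pore_grid_points(coords, radii, boundary_points, resolution):
    """Return (grid, is_pore): grid is (M, 3); is_pore[m] is True when grid
    point m lies outside every atom's van der Waals sphere."""
    coords = np.asarray(coords, dtype=float)
    radii = np.asarray(radii, dtype=float)
    boundary_points = np.asarray(boundary_points, dtype=float)
    lo, hi = boundary_points.min(axis=0), boundary_points.max(axis=0)

    # Regular grid covering the bounding box of the boundary points.
    axes = [np.arange(lo[d], hi[d] + 0.5 * resolution, resolution) for d in range(3)]
    grid = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1).reshape(-1, 3)

    tree = cKDTree(coords)
    r_max = radii.max()  # no atom farther away than this can contain the point
    is_pore = np.ones(len(grid), dtype=bool)
    for g, neighbors in enumerate(tree.query_ball_point(grid, r=r_max)):
        if neighbors:
            dist = np.linalg.norm(coords[neighbors] - grid[g], axis=1)
            # Occupied if the point is inside the sphere of at least one candidate atom.
            if np.any(dist <= radii[neighbors]):
                is_pore[g] = False
    return grid, is_pore
```

Querying with the largest radius first and then comparing each candidate against its own radius keeps the lookup correct when elements have different van der Waals radii.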
[0094] 6. User Interface and Result Display: After the calculation is completed, the program displays the results through the Tkinter GUI it has constructed. Users can see the calculated total atomic volume, model volume, total pore volume, connected pore volume, closed pore volume, and the corresponding porosity (connected porosity = connected volume / total pore volume, closed porosity = closed volume / total pore volume). For this embodiment, the closed pore volume and closed porosity are expected to be very low, close to zero.
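The reported quantities follow from simple arithmetic on the grid counts, which can be sketched as below. The function name `summarize_pores` and the dictionary keys are hypothetical; the ratios use the definitions given in the text (porosity relative to the model volume, connected and closed porosity relative to the total pore volume).

```python
def summarize_pores(n_connected, n_closed, grid_resolution, v_model, v_atom):
    """Combine BFS point counts with V_model and V_atom into the displayed quantities."""
    cell_volume = grid_resolution ** 3
    v_connected = n_connected * cell_volume
    v_closed = n_closed * cell_volume
    v_pore_total = v_model - v_atom
    if v_model <= 0 or v_pore_total <= 0:
        raise ValueError("model volume and total pore volume must be positive")
    return {
        "V_pore_total": v_pore_total,
        "V_connected": v_connected,
        "V_closed": v_closed,
        "porosity": v_pore_total / v_model,
        "connected_porosity": v_connected / v_pore_total,
        "closed_porosity": v_closed / v_pore_total,
        # Difference between the grid-based pore volume and V_model - V_atom,
        # useful for the consistency check against the total pore volume.
        "grid_minus_total": (v_connected + v_closed) - v_pore_total,
    }
```

A large value of `grid_minus_total` relative to `V_pore_total` would suggest that the grid resolution is too coarse for the model.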
[0095] The above are merely preferred embodiments of the present invention and are not intended to limit the present invention. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of the present invention should be included within the scope of protection of the present invention.
Claims
1. A method for analyzing the pore structure of an organic matter molecular model, characterized in that it includes:
S1, input the model structure file of the organic matter molecular model, which contains the atom type and three-dimensional coordinates of each atom in the organic matter molecular model;
S2, calculate the total atomic volume of the organic matter molecular model based on the model structure file; in this step, the volume of a single atom is estimated using van der Waals radii, and volume corrections are applied based on the identified interatomic chemical bonds; the correction uses, according to the atom type and interatomic connections, preset element volume contribution values and chemical bond volume contribution values to calculate the total atomic volume of the organic matter molecular model;
S3, determine the convex hull enclosing all atoms of the organic matter molecular model in order to calculate the model volume of the organic matter molecular model; this step involves calling a convex hull calculation function, taking the three-dimensional coordinate array of all atoms as input, finding the minimal convex polyhedron containing all atomic coordinate points, and extracting the calculated volume value from the return result of the convex hull calculation function as the model volume;
S4, based on the model volume and the total atomic volume, analyze and calculate the pore volume and porosity of the organic matter molecular model, where pore volume = model volume - total atomic volume, and porosity is defined as the ratio of pore volume to model volume;
S5, based on the grid method, establish a spatial index structure of the organic matter molecular model, and use a preset search algorithm to analyze the connectivity of the pore structure in the organic matter molecular model and to quantify the volumes of connected pores and closed pores;
S6, output the analysis results.
2. The method for analyzing the pore structure of organic matter molecular models according to claim 1, characterized in that the specific calculation process of the total atomic volume includes: calculating the atomic volume of each atom using the preset van der Waals radius, based on the three-dimensional coordinates and atom type of each atom; obtaining the total element volume contribution by multiplying the number of atoms of each element, as determined from the atom types, by its preset element volume contribution value; identifying and counting the chemical bonds in the organic matter molecular model, and multiplying the count of each type of chemical bond by its preset chemical bond volume contribution value to obtain the total chemical bond volume contribution; and obtaining the total atomic volume by adding the total element volume contribution to the total chemical bond volume contribution.
3. The method for analyzing the pore structure of organic matter molecular models according to claim 2, characterized in that identifying the chemical bonds in the organic matter molecular model includes: obtaining explicit chemical bond connection information from the model structure file and determining the corresponding chemical bonds; if explicit connection information is missing, using a distance-based chemical bond lookup method to search for the neighboring atoms of each atom within a preset maximum bond length search radius; for each atom pair found, calculating the interatomic distance, comparing the calculated distance with a preset database of typical chemical bond lengths, and determining the chemical bond based on the error range of the comparison.
4. The method for analyzing the pore structure of organic matter molecular models according to claim 1, characterized in that, in step S5, the spatial index structure is a Kd-tree structure.
5. The method for analyzing the pore structure of organic matter molecular models according to claim 1, characterized in that, in step S5, the preset search algorithm is a breadth-first search algorithm.
6. A computer-readable storage medium, characterized in that the computer-readable storage medium stores at least one computer program, which is loaded and executed by a processor to enable the computer to implement the method for analyzing the pore structure of organic matter molecular models as described in any one of claims 1 to 5.
7. A computer device, characterized in that the device includes a memory and a processor, wherein the memory stores computer program instructions and the processor is configured to execute the instructions to implement the method as described in any one of claims 1 to 5.