An intelligent database and risk early warning system for blockage-causing marine organisms
By constructing an intelligent database and risk early warning system for blockage-causing marine organisms, the invention addresses the lack of a systematic database in existing technologies, provides a multi-dimensional information system and accurate risk early warning, and improves the initiative and foresight of water intake safety management at coastal nuclear power plants.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- HAINAN NUCLEAR POWER CO LTD
- Filing Date
- 2026-02-04
- Publication Date
- 2026-06-19
AI Technical Summary
The lack of a systematic and intelligent database of marine organisms that cause blockages in existing technologies means that risk assessments of blockages at coastal nuclear power plant intakes rely on historical experience and qualitative observations, lacking quantitative and dynamic prediction and early warning models.
An intelligent database and risk early warning system for blockage-causing marine organisms is constructed, including a data processing module, a database module, an automatic retrieval module, and a risk prediction module. A convolutional neural network model is used for data processing and prediction, integrating multi-source data and real-time environmental parameters to provide one-stop information acquisition and dynamic updates.
The system realizes a multi-dimensional, cross-sea-area information system on blockage-causing organisms. It provides accurate data support and risk warning, reduces the risk of water intake blockage, improves the initiative and foresight of water intake safety management, and provides a scientific basis for decision-making.
Figure CN122240586A_ABST
Description
Technical Field
[0001] This invention relates to the field of marine safety in nuclear power plants, and in particular to an intelligent database and risk early warning system for marine organisms that can cause blockages.
Background Technology
[0002] In recent years, nuclear power unit shutdowns due to cold source incidents have occurred frequently both in China and abroad. Intake blockage has become a global problem affecting the safe operation of coastal nuclear power plants. China is vigorously developing nuclear power, and as the number of nuclear power units continues to grow, the frequency and probability of intake blockage incidents are rising sharply. Some nuclear power plant sites have been threatened multiple times by the same or different types of blockages, causing not only high economic losses but also varying degrees of impact on fouling screens, filters, or heat exchange systems. Understanding the species composition and spatiotemporal variation of blockage-causing marine organisms is a prerequisite for scientific early warning and comprehensive prevention of blockages. In particular, mastering the occurrence and development patterns of key blockage-causing marine organisms is the foundation for preventing and controlling blockages.
[0003] In current technologies, information on blockage-causing marine organisms is scattered across various scientific papers, government gazettes, yearbook reports, and expert opinions, lacking a systematic, standardized, and intelligent centralized database. Researchers or engineering managers need to spend a significant amount of time sifting through massive amounts of data, making it difficult to quickly obtain comprehensive information on a specific blockage-causing organism. Current assessments of the risk of blockage at coastal nuclear power plant intakes mainly rely on historical experience, qualitative observations, and simple statistics, representing a passive response model. A quantitative, dynamic, and spatially explicit predictive and early warning model that integrates historical outbreak data, real-time environmental parameters, and biological ecological characteristics has not yet been established.
Summary of the Invention
[0004] This invention provides a system for constructing an intelligent database of marine organisms that cause blockages and for risk early warning, which addresses the problems of fragmented knowledge, static data management, and experience-based risk early warning in existing technologies.
[0005] The technical solution of the present invention is as follows. This invention proposes an intelligent database and risk warning system for blockage-causing marine organisms. The system includes a data processing module, a blockage-causing marine organism database, a database retrieval module, an automatic retrieval module, a data update module, and a risk prediction module. The data processing module processes the raw data on blockage-causing marine organisms and includes a data deduplication submodule, a format unification submodule, and a missing value supplementation submodule. The blockage-causing marine organism database stores the data processed by the data processing module according to the types of blockage-causing marine organisms and the indicator system. The database retrieval module lets users search the blockage-causing marine organism database by keyword. The automatic retrieval module periodically retrieves data related to blockage-causing marine organisms; after review through the data update module, the retrieved data is added to the database. The risk prediction module trains a convolutional neural network model on data from the database to predict outbreaks of blockage-causing marine organisms in target areas. The data update module updates both the data in the database and the convolutional neural network model in the risk prediction module, and also maintains the system.
[0006] In some embodiments, the raw data collection of blockage-causing marine organisms is obtained by screening and integrating multi-source data. The multi-source data is collected according to the types of blockage-causing marine organisms and the indicator system. The types of blockage-causing marine organisms include six categories: red tide organisms, large and medium-sized zooplankton, large algae or seagrass, large benthic animals, fouling organisms, and schooling fish, totaling 125 species. The indicator system consists of 16 categories, specifically including Chinese name, scientific name, category, taxonomic rank, distribution area, habitat type, migration mode, individual or group size, phototaxis, morphological characteristics, reproductive habits, ecological habits, interception net aperture size, outbreak record, treatment method, data source, biological atlas, and high-risk calendar.
[0007] In some embodiments, multi-source data includes first-hand survey data and publicly available information. The screening and organization of multi-source data includes removing data that lacks authenticity, authority, relevance, and completeness.
[0008] In some embodiments, the data deduplication submodule of the data processing module uses the DISTINCT statement in MySQL and a deduplication tool to identify and remove duplicate data from the original data of marine organisms that cause blockages. Duplicate data is sorted according to priority, and data with higher priority is retained. The data priority from high to low is: survey data, expert verification data, core journal literature data, authoritative bulletin data, and general public information data. For data of the same priority, the average value is taken for numerical data, and the decision for descriptive data is made by manual review.
[0009] In some embodiments, the format unification submodule of the data processing module unifies the format of taxonomic rank data according to the biological classification method; unifies distribution area data as the Bohai Sea, Yellow Sea, East China Sea, and South China Sea, individually or in combination; unifies the units of numerical data; and unifies the format of time data, expressing high-risk calendar data by year and month and outbreak record times by year, month, and day. Risk levels in the high-risk calendar are unified into high risk, medium risk, and low risk: high risk corresponds to a blockage probability greater than 70%, medium risk to a blockage probability from 30% to 70% inclusive, and low risk to a blockage probability less than 30%. The missing value supplementation submodule of the data processing module supplements missing basic information, ecological habits, outbreak records, and treatment methods in the raw data on blockage-causing marine organisms. Basic information, outbreak records, and treatment methods are supplemented by consulting the literature, and ecological habits are supplemented through expert review. Missing data that cannot be supplemented by these methods is marked as data to be supplemented.
[0010] In some embodiments, the blockage-causing marine organism database includes five data tables: a basic biological information table, a habitat and distribution table, a morphology and habit table, an outbreak and control table, and a data source table. The basic biological information table includes the biological unique identifier (ID), Chinese name, scientific name, category, taxonomic rank, biological atlas storage path, and remarks. The habitat and distribution table includes the biological unique identifier (ID), distribution area, habitat type, migration method, high-risk calendar, and risk level. The morphology and habit table includes the biological unique identifier (ID), individual or group size, phototaxis, morphological characteristics, reproductive habits, and ecological habits. The outbreak and control table includes the biological unique identifier (ID), outbreak record, control method, and interception net aperture size. The data source table includes the biological unique identifier (ID), data name, source type, publication time, source link, original document storage path, and review status. The data tables are linked and queried through the biological unique identifier (ID). The database is built on MySQL: the data processed by the data processing module is converted to CSV format and batch-imported into the corresponding data tables with MySQL's LOAD DATA INFILE statement. After import, the data is verified. SQL queries check that the number of rows in each table matches the expected count, and JOIN statements check the accuracy of multi-table joins. Twenty species are randomly selected for full-field queries to ensure no data loss, field misalignment, or join errors. Data integrity verification tools are used to check for null values and outliers, and abnormal data is corrected promptly.
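The post-import checks described in [0010] (row counts per table and join consistency on the shared ID) can be sketched as follows. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 as a stand-in for MySQL, it models three of the five tables, and the table names, column names, and demo rows are hypothetical, since the patent names the tables and fields but not their identifiers.

```python
import sqlite3

# Hypothetical, simplified identifiers; sqlite3 stands in for MySQL here.
SCHEMA = """
CREATE TABLE basic_info (
    bio_id INTEGER PRIMARY KEY, chinese_name TEXT,
    scientific_name TEXT, category TEXT);
CREATE TABLE habitat_distribution (
    bio_id INTEGER, sea_area TEXT, risk_level TEXT);
CREATE TABLE outbreak_control (
    bio_id INTEGER, outbreak_record TEXT,
    control_method TEXT, net_aperture_mm REAL);
"""
TABLES = ("basic_info", "habitat_distribution", "outbreak_control")


def build_demo_db():
    """Create an in-memory database with one made-up demo species."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO basic_info VALUES "
                 "(1, 'demo name', 'Genus species', 'fouling organisms')")
    conn.execute("INSERT INTO habitat_distribution VALUES "
                 "(1, 'East China Sea', 'high risk')")
    conn.execute("INSERT INTO outbreak_control VALUES "
                 "(1, 'demo outbreak record', 'demo control method', 5.0)")
    conn.commit()
    return conn


def row_counts(conn):
    """Row count per table, to compare against the expected import counts."""
    return {t: conn.execute(f"SELECT COUNT(*) FROM {t}").fetchone()[0]
            for t in TABLES}


def orphan_ids(conn, child_table):
    """IDs in a child table that have no matching row in basic_info."""
    rows = conn.execute(
        f"SELECT c.bio_id FROM {child_table} AS c "
        "LEFT JOIN basic_info AS b ON b.bio_id = c.bio_id "
        "WHERE b.bio_id IS NULL").fetchall()
    return [r[0] for r in rows]
```

An empty list from `orphan_ids` for every child table means each record joins back to a species in the basic information table; any returned ID points to a join error of the kind the verification step is meant to catch.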
[0011] In some embodiments, the database retrieval module lets users search by Chinese name or scientific name and supports switching between fuzzy and exact search. Based on the keywords entered by the user, the database retrieval module joins the five data tables through the biological unique identifier (ID), outputs the 16 categories of indicator information for the blockage-causing marine organism, and automatically retrieves the corresponding entries from the data source table. The database retrieval module supports switching between a list view and a detail view of the search results. The list view displays the Chinese name, scientific name, category, distribution area, high-risk level, and core treatment methods of each organism. The detail view displays all indicator information: the biological atlas is shown as high-definition images, the high-risk calendar shows changes in risk level as a line graph, and the treatment methods are broken down and displayed step by step.
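The fuzzy and exact search modes in [0011] can be illustrated with a small in-memory sketch. The function name, record layout, and the two demo records are assumptions made for illustration; the species shown are real organisms but are not confirmed to be among the patent's 125 species, and a production system would run the equivalent query against the MySQL tables.

```python
# Illustrative records only; not taken from the patent's species list.
DEMO_RECORDS = [
    {"bio_id": 1, "chinese_name": "海月水母", "scientific_name": "Aurelia aurita"},
    {"bio_id": 2, "chinese_name": "夜光藻", "scientific_name": "Noctiluca scintillans"},
]


def search_species(records, keyword, fuzzy=True):
    """Match a keyword against Chinese and scientific names.

    fuzzy=True  -> substring match (comparable to SQL LIKE '%keyword%')
    fuzzy=False -> the whole name must equal the keyword
    Matching is case-insensitive in both modes.
    """
    key = keyword.strip().lower()
    if not key:
        return []
    hits = []
    for rec in records:
        names = (rec["chinese_name"].lower(), rec["scientific_name"].lower())
        if fuzzy:
            matched = any(key in name for name in names)
        else:
            matched = key in names
        if matched:
            hits.append(rec)
    return hits
```

The returned records carry the ID that would then be used to join the remaining tables for the list or detail view.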
[0012] In some embodiments, the automatic retrieval module constructs a keyword dictionary using the Chinese names, scientific names, synonyms, and common names of 125 blockage-causing marine organisms. Each month, the automatic retrieval module uses Python to automatically access and crawl data from various platforms based on the keyword dictionary, extracting relevant literature and report titles, authors, publication dates, source links, and original text information. This information is then entered into the source table of the corresponding blockage-causing marine organism database and marked as pending review. The retrieval platforms include Chinese core journal databases, foreign core databases, government open information platforms, international authoritative institution platforms, and industry databases.
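A minimal sketch of the keyword dictionary in [0012] and of how a retrieved title could be tagged and queued for review is given below. It deliberately omits the crawling itself, because the patent does not specify platform interfaces; all function names and the record layout are hypothetical.

```python
def build_keyword_dictionary(species):
    """Map every lower-cased name variant to the IDs of matching species."""
    dictionary = {}
    for sp in species:
        terms = [sp["chinese_name"], sp["scientific_name"],
                 *sp.get("synonyms", []), *sp.get("common_names", [])]
        for term in terms:
            term = term.strip().lower()
            if term:
                dictionary.setdefault(term, set()).add(sp["bio_id"])
    return dictionary


def tag_title(title, dictionary):
    """Return the sorted IDs of all species whose names occur in the title."""
    text = title.lower()
    ids = set()
    for term, bio_ids in dictionary.items():
        if term in text:
            ids |= bio_ids
    return sorted(ids)


def to_pending_records(title, link, dictionary):
    """Build data source table rows, marked as pending review."""
    return [{"bio_id": bio_id, "data_name": title, "source_link": link,
             "review_status": "pending review"}
            for bio_id in tag_title(title, dictionary)]
```

Keeping the review status on each row matches the workflow in which an administrator approves retrieved items before they reach the database.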
[0013] In some embodiments, the risk prediction module extracts input features and output labels from a database of marine organisms that cause blockages. The input features are standardized and divided into training and validation sets in a 7:3 ratio. Input features include environmental influencing factors, the organisms' ecological habits, and the intensity of human activities in the marine area. Output labels include outbreak time, outbreak range, and risk level. The risk prediction module constructs and trains a convolutional neural network model using the extracted input features and output labels. Within a user-input target region and time range, the risk prediction module retrieves historical environmental data and basic data of the target organisms from the database of marine organisms that cause blockages, inputs them into the convolutional neural network model, and outputs the types and labels of marine organisms that may cause blockages in the region, generating a prediction report.
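The preprocessing in [0013], standardizing input features and splitting them 7:3 into training and validation sets, can be sketched with the standard library alone. This is an assumption-laden illustration: the patent does not state the standardization formula (z-score is assumed here), whether the split is shuffled (a seeded shuffle is assumed), or the network architecture, which is not shown.

```python
import random
import statistics


def standardize(rows):
    """Z-score each feature column; constant columns become 0.0."""
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    stds = [statistics.pstdev(col) for col in columns]
    return [[(x - m) / s if s > 0 else 0.0
             for x, m, s in zip(row, means, stds)]
            for row in rows]


def split_7_3(features, labels, seed=0):
    """Shuffle sample indices with a fixed seed and split them 7:3."""
    indices = list(range(len(features)))
    random.Random(seed).shuffle(indices)
    cut = len(indices) * 7 // 10  # integer arithmetic avoids float rounding
    train, val = indices[:cut], indices[cut:]
    train_set = ([features[i] for i in train], [labels[i] for i in train])
    val_set = ([features[i] for i in val], [labels[i] for i in val])
    return train_set, val_set
```

The order follows the text (standardize, then split). A common alternative is to compute the mean and standard deviation on the training set only, so that validation data does not influence the scaling.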
[0014] In some embodiments, the data update module allows administrators to review the periodic retrieval results of the automatic retrieval module and update the reviewed retrieval results to the blockage-causing marine organism database. The data update module records an update log. Every six months, the data update module retrains the convolutional neural network model of the risk prediction module based on the newly added data in the blockage-causing marine organism database. The data update module also regularly backs up the blockage-causing marine organism intelligent database, regularly checks the operating status of the blockage-causing marine organism intelligent database, and collects user feedback.
[0015] The implementation of this invention has the following beneficial effects: 1. This invention proposes an intelligent database and risk early warning system for blockage-causing marine organisms. The system integrates survey data from the four major sea areas with authoritative publicly available information and covers the core indicators of various high-risk blockage-causing marine organisms. It forms a "full-chain, multi-dimensional, cross-sea area" information system for blockage-causing organisms, filling the gap left by the lack of a dedicated, systematic database of blockage-causing marine organisms in existing technology, and providing accurate and comprehensive data support for the prevention of water intake blockage.
[0016] 2. This invention proposes an intelligent database and risk early warning system for blockage-causing marine organisms. By setting up a database retrieval module and an automatic retrieval module, this invention integrates data storage, automatic retrieval, and advanced query, realizing one-stop access to information on blockage-causing organisms. Users can obtain basic biological information, outbreak patterns, treatment methods, and the latest research progress through a single keyword; the automatic retrieval function enables dynamic data supplementation, eliminating the need for manual literature retrieval.
[0017] 3. This invention proposes an intelligent database and risk early warning system for blockage-causing marine organisms. By setting up multiple tables in the database, this invention is specifically designed to address the water intake blockage problem in coastal nuclear power plants. The database provides information such as suggested interceptor mesh apertures, disposal methods, and high-risk calendars, which can be directly applied to the design, operation, maintenance, and emergency response of the water intake system. This effectively reduces the risk of water intake blockage, ensures the safe and stable operation of nuclear power units, and has significant economic and social benefits. The database in this invention is built on MySQL, supports data expansion and function upgrades, and can be further customized by adding blockage-causing organisms, expanding the indicator system, optimizing the search scope, and improving the prediction model according to actual needs. It is suitable for the water intake safety needs of other coastal industrial enterprises such as desalination plants and coastal chemical plants, and has broad application prospects.
[0018] 4. This invention proposes an intelligent database and risk early warning system for blockage-causing marine organisms. The invention includes a risk prediction module that integrates a CNN-based machine learning model to achieve quantitative and spatial prediction of biological outbreaks in complex marine environments. This enables water intake safety assurance to shift from reactive emergency response to proactive early warning, significantly improving the initiative and foresight of risk prevention and providing a scientific basis for formulating seasonal prevention and control strategies and initiating emergency responses.
Attached Figure Description
[0019] Figure 1 is a diagram illustrating the composition of blockage-causing marine organisms in the intelligent database and risk warning system for blockage-causing marine organisms proposed in an embodiment of the present invention. Figure 2 is a flowchart illustrating the database construction process of the intelligent database and risk warning system for blockage-causing marine organisms proposed in an embodiment of the present invention.
Detailed Implementation
[0020] The technical solution of the present invention will be clearly and completely described below with reference to the accompanying drawings and specific embodiments. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0021] As shown in Figure 1, cold source blockage-causing agents are a class of hazardous marine organisms or abiotic entities that, when they proliferate explosively or accumulate in large numbers, can block or damage water intake and filtration facilities such as pipes, canals, tunnels, screens, and debris nets, thus affecting the normal flow of water. Based on the design requirements of the intake and cooling systems of China's coastal nuclear power plants, agents that affect water intake safety generally have the following characteristics: they are relatively large (adults are generally larger than 2-5 mm); relatively light (easily disturbed by water flow and floating or suspended in the water, except for fouling organisms); large in number (generally having an absolute advantage in density or biomass during outbreaks); gregarious or easily agglomerated (prone to explosive proliferation or to dense aggregation driven by water flow); and generally also characterized by rapid growth, strong reproductive capacity, a short life cycle, and weak swimming ability.
[0022] As shown in Figures 1 and 2, this invention proposes an intelligent database and risk early warning system for marine organisms that cause blockages in coastal nuclear power plant cold sources. This invention studies key parameters of these organisms through marine surveys, field investigations, and social interviews, constructing a database containing information on 125 species, including their Chinese names, categories, habitat types, migration methods, individual or group size, phototaxis, and the interception net aperture size. The species list was selected considering the geographical location of China's coastal nuclear power plants and the spatiotemporal distribution of marine organisms in nearshore waters. Based on the biological characteristics and ecological habits of marine organisms, it covers both the different biological groups within the marine environment and the potential disaster-causing species that pose a high risk to water intake safety in each sea area. Some species are typical narrow-range species of a particular sea area, while others are widely distributed species shared by multiple sea areas.
[0023] As shown in Figures 1 and 2, this invention proposes an intelligent database and risk early warning system for marine organisms that cause blockages. The system includes a data processing module, a database of marine organisms that cause blockages, a database retrieval module, an automatic retrieval module, a data update module, and a risk prediction module.
[0024] The data processing module processes the raw data on blockage-causing marine organisms after it has been screened and organized. This module includes sub-modules for data deduplication, format unification, and missing value supplementation. The raw data on blockage-causing marine organisms is obtained by screening and integrating multi-source data, which is collected according to the types of blockage-causing marine organisms and the indicator system. The types and the indicator system are constructed as follows:
Step 1: Conduct a multi-dimensional, cross-sea-area survey of water intake blockage issues in the four major sea areas of the Bohai Sea, Yellow Sea, East China Sea, and South China Sea. Through fixed-point marine surveys, on-site investigations, social surveys, and expert consultations, comprehensively identify the common and typical blockage problems encountered in cooling water intake in the four sea areas. Common blockage problems include blockage caused by summer plankton outbreaks and blockage caused by fish migration in spring and autumn. Typical blockage problems include blockage caused by the breeding season of large benthic animals in the Bohai Sea, blockage caused by the outbreak and accumulation of seaweed in the East China Sea, and blockage caused by fouling organisms in the South China Sea.
[0025] The marine fixed-point survey mainly includes setting up more than 30 monitoring points around typical water intakes in the four major sea areas and conducting quarterly sampling and monitoring for 12 consecutive months; the field survey mainly includes visiting more than 15 coastal nuclear power plant sites and more than 5 seawater desalination plants in China to collect records of water intake blockage incidents, disposal plans, operational data and prevention and control pain points; the social survey mainly includes conducting questionnaire surveys among coastal fishery departments, marine environmental monitoring institutions and nuclear power operation and maintenance companies, and collecting more than 200 valid questionnaires; the expert consultation mainly includes inviting more than 8 authoritative experts in marine ecology, nuclear power safety engineering and marine environmental science to conduct more than 2 rounds of consultation and demonstration.
[0026] Step 2: Screen 125 blockage-causing marine organisms. Based on the survey results on marine biological community structure in the four major sea areas, the focus is on six categories of organisms with potential blockage risk: red tide organisms, large and medium-sized zooplankton, large algae or seagrass, large benthic animals, fouling organisms, and schooling fish. Existing international cases of water intake blockage (such as the diatom outbreak at the California coastal nuclear power plant and the barnacle attachment blockage at the Fukushima coastal power plant in Japan) and the associated prevention and control experience are systematically reviewed. Combined with the ecological characteristics of the four major sea areas in China, and following the principles of "high frequency of blockage, wide range of impact, severe harm, and strong representativeness," 125 blockage-causing marine organisms are selected as the core research objects for the database. These include 25 red tide organisms, 20 large and medium-sized zooplankton, 20 large algae or seagrass, 20 large benthic animals, 20 fouling organisms, and 20 schooling fish.
[0027] Step 3: Construct a 16-category core indicator system. Based on the full-process requirements for water intake blockage prevention and control at the coastal nuclear power plant, construct 16 core indicators covering blockage-causing biological identification, risk prediction, prevention and control, and data traceability. These indicators include: Chinese name, scientific name, category, taxonomic rank, distribution area, habitat type, migration mode, individual or group size, phototaxis, morphological characteristics, reproductive habits, ecological habits, interception net aperture size, outbreak records, treatment methods, data source, biological atlas, and high-risk calendar. This indicator system comprehensively covers the entire information chain from "what it is (basic biological information) - where it is (distribution area) - when it breaks out (high-risk calendar) - why it causes blockage (morphological / habitual characteristics) - how to prevent and control it (interception recommendations / treatment methods) - where the data comes from (data traceability)."
[0028] The specific methods for multi-source data screening and integration are as follows: Multi-source data on blockage-causing marine organisms are collected and screened, including first-hand survey data and publicly available information. First-hand survey data collection includes: 1. Marine survey data: Raw data such as species identification, individual size (minimum cross-section), cluster density, and distribution range of 125 blockage-causing organisms are obtained through trawling, plankton net sampling, and on-site monitoring at water intakes; 2. Enterprise survey data: Records of water intake blockage events at coastal nuclear power plants over the past 10 years are collected, including the time and location of the events, the species causing the blockage, the scale of the outbreak, the degree of impact, and the measures and effects of the responses; data on the operation and maintenance of interceptor nets and the operating parameters of the cooling water system are also collected; 3. Expert consultation data: Experience data from experts on risk level assessments of blockage-causing organisms, suggestions for optimizing treatment technologies, and supplementary opinions on missing data are obtained through expert interviews, the Delphi method, and other methods.
[0029] The collection of publicly available data includes: 1. Official bulletins and yearbooks: Retrieving marine environmental bulletins, fisheries yearbooks, and marine ecological status bulletins issued by the State Oceanic Administration and the marine and fisheries departments of coastal provinces from 1990 to the present; 2. Academic literature: Retrieving relevant academic papers and dissertations from 1990 to the present from core Chinese databases such as CNKI, Wanfang, and VIP, as well as core foreign databases such as Web of Science and Scopus, with keywords including "marine organism blockage," "water intake blockage," "water intake from coastal nuclear power plants," "red tide outbreaks," and "fouling organism attachment"; 3. International data: Collecting the coastal nuclear power plant water intake safety guidelines issued by the International Atomic Energy Agency (IAEA), statistical data on blockage events issued by the Marine Ecosystem Conservation Organization (MEPC), and case reports on water intake blockage prevention and control of foreign coastal industrial enterprises; 4. Authoritative directories and databases: Consulting authoritative materials such as the "China Marine Organism Directory," the "Global Marine Organism Database (WoRMS)," and the "China Marine Disaster Bulletin."
[0030] The screening and organization of multi-source data covers both first-hand survey data and publicly available data, and eliminates data that lacks authenticity, authority, relevance, or completeness. Authenticity requires eliminating data from unknown sources or unverified informal channels; first-hand survey data must be confirmed by both on-site monitoring personnel and data reviewers. Authority requires prioritizing publications in core journals, official announcements from authoritative institutions, operational data from industry benchmark enterprises, and expert consensus data. Relevance requires that each data point clearly correspond to one of the 125 blockage-causing organisms, excluding data with ambiguous attribution or overly general meaning. Completeness requires that outbreak record data include key information such as time, location, outbreak scale, and impact, and that treatment method data include operational information such as technical principles, operating procedures, and application effects. Data lacking key information is not included.
[0031] The data deduplication submodule of the data processing module works as follows: the MySQL DISTINCT statement and deduplication tools are used to identify and remove duplicate data from the raw data on blockage-causing marine organisms. Where multiple sources provide data on the same organism and the same indicator, such as different individual sizes for an organism reported in different publications, the data is ranked by priority and the higher-priority data is retained. The data priority from high to low is: survey data, expert-verified data, core journal literature data, authoritative bulletin data, and general publicly available data. For data of the same priority, numerical data is averaged, and descriptive data is decided by expert vote.
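The priority rule in [0031] can be sketched as a small resolver. The source-type labels, record layout, and the sentinel used for cases that need an expert vote are illustrative assumptions, not identifiers from the patent.

```python
from statistics import fmean

# Priority order from [0031], highest first; the labels are hypothetical.
PRIORITY = ["survey", "expert_verified", "core_journal",
            "authoritative_bulletin", "general_public"]
NEEDS_EXPERT_VOTE = "NEEDS_EXPERT_VOTE"


def resolve_duplicates(records):
    """Pick one value per (bio_id, indicator) pair.

    Each record is a dict with keys bio_id, indicator, value, source_type.
    Only values from the highest-priority source type present are kept.
    Exact duplicates collapse to one value; differing numeric values are
    averaged; differing descriptive values are flagged for an expert vote.
    """
    groups = {}
    for rec in records:
        groups.setdefault((rec["bio_id"], rec["indicator"]), []).append(rec)

    resolved = {}
    for key, recs in groups.items():
        best = min(PRIORITY.index(r["source_type"]) for r in recs)
        top = [r["value"] for r in recs
               if PRIORITY.index(r["source_type"]) == best]
        unique = list(dict.fromkeys(top))  # drop exact duplicates, keep order
        if len(unique) == 1:
            resolved[key] = unique[0]
        elif all(isinstance(v, (int, float)) for v in unique):
            resolved[key] = fmean(unique)
        else:
            resolved[key] = NEEDS_EXPERT_VOTE
    return resolved
```

For example, two survey measurements of 3.0 mm and 5.0 mm outrank a journal value of 10.0 mm and resolve to their mean, 4.0 mm.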
[0032] The data processing module's format unification submodule specifically includes: For classification hierarchy data: strictly follow the seven-level classification standard of the biological classification method of "Kingdom-Phylum-Class-Order-Family-Genus-Species" for unified expression. For example, reticulate barnacles are uniformly labeled as "Kingdom of Animals-Phylum of Arthropoda-Class of Crustacea-Order of Styloidea-Family of Barnacles-Genus of Barnacles-Reticulate Barnacles".
[0033] For data on the distribution of sea areas, uniformly label them as Bohai Sea, Yellow Sea, East China Sea, South China Sea or a combination thereof (such as Yellow Sea-East China Sea, Bohai Sea-Yellow Sea-East China Sea), and avoid using vague descriptions such as "Northern Sea Area" or "Southern Sea Area".
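A minimal sketch of the sea-area labelling rule in [0033] follows. It assumes the raw text already uses the English sea names and that combinations are written north to south, as in the examples; the function name is hypothetical.

```python
# The four permitted sea areas, ordered north to south as in the examples.
SEA_ORDER = ["Bohai Sea", "Yellow Sea", "East China Sea", "South China Sea"]


def normalize_sea_areas(raw):
    """Return a canonical label such as 'Yellow Sea-East China Sea'.

    Raises ValueError for vague descriptions that name none of the four
    sea areas, so that such records can be sent back for correction.
    """
    text = raw.lower()
    found = [sea for sea in SEA_ORDER if sea.lower() in text]
    if not found:
        raise ValueError(f"no recognised sea area in {raw!r}")
    return "-".join(found)
```

Rejecting unrecognised text instead of guessing keeps vague labels such as "Northern Sea Area" out of the database.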
[0034] For numerical indicators, the size of an individual / group (minimum cross-section) is uniformly expressed in millimeters (mm), and the group size is expressed as the average minimum cross-section of an individual × the cluster density; the size of the interceptor mesh aperture (recommended size) is uniformly expressed in millimeters (mm).
[0035] For time data formats, high-risk calendar data should uniformly use the YYYY-MM to YYYY-MM format to mark the risk period, such as 2025-06 to 2025-08, and outbreak record time should uniformly use the YYYY-MM-DD format.
[0036] For risk level data, the risk levels in the high-risk calendar are uniformly labeled as high risk, medium risk, and low risk. The classification standards were jointly formulated by experts in marine ecology and nuclear power safety. Specifically, high risk means a blockage probability greater than 70%, which may lead to a reduction in unit power; medium risk means a blockage probability from 30% to 70% inclusive, which may affect water intake efficiency; and low risk means a blockage probability below 30%, which has minimal impact on water intake.
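The time formats and risk thresholds described above can be sketched as simple checks. This is a minimal sketch; the function names are hypothetical, and the treatment of the 30% and 70% boundaries as medium risk follows the wording of claim 5.

```python
import re

# "YYYY-MM to YYYY-MM" for risk periods, "YYYY-MM-DD" for outbreak records.
PERIOD_RE = re.compile(r"^\d{4}-(0[1-9]|1[0-2]) to \d{4}-(0[1-9]|1[0-2])$")
# Format check only: it does not reject impossible dates such as 2025-02-31.
DATE_RE = re.compile(r"^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$")


def risk_level(blockage_probability):
    """Map a blockage probability in [0, 1] to one of the three risk labels."""
    if not 0.0 <= blockage_probability <= 1.0:
        raise ValueError("probability must be within [0, 1]")
    if blockage_probability > 0.70:
        return "high risk"
    if blockage_probability >= 0.30:
        return "medium risk"
    return "low risk"


def is_valid_risk_period(text):
    return PERIOD_RE.match(text) is not None


def is_valid_outbreak_date(text):
    return DATE_RE.match(text) is not None
```

For example, is_valid_risk_period("2025-06 to 2025-08") accepts the sample period given in the text, while a loosely written date such as "2025-6-15" is rejected by is_valid_outbreak_date.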
[0037] The missing value supplementation submodule of the data processing module works as follows. Missing basic information, such as taxonomic rank and distribution area, is supplemented by consulting authoritative lists such as the "List of Marine Organisms of China" and the World Register of Marine Species (WoRMS).
[0038] Missing ecological information, such as habitat type, reproductive habits, and ecological habits, is supplemented only after at least three authoritative experts in marine ecology have reviewed it and reached a consensus.
[0039] Missing outbreak records are supplemented by searching the annual marine environmental bulletins for the corresponding sea area, relevant academic literature, and enterprise survey records. For missing treatment solutions, priority is given to industry standards and mature technical solutions from benchmark companies. If no relevant reference is available, experts are consulted to develop a suggested solution.
[0040] Missing data that cannot be supplemented by any of the above methods are uniformly marked as "to be supplemented" and included in the scope of automatic database retrieval and subsequent updates.
[0041] The database for marine organisms causing blockages stores data processed by the data processing module according to the species and indicator system of these organisms. The database's overall architecture is based on MySQL. It employs a multi-table join structure, with five core tables. The functions and fields of each table are as follows: The core fields of the biological basic information table include the biological unique identifier (ID), Chinese name, scientific name, category, taxonomic rank, biological atlas storage path, and remarks. Among them, the "biological unique identifier (ID)" is the global primary key, which consists of "category code + two-digit serial number". The category codes for the six major categories of organisms are as follows: red tide organisms are coded as CC, large and medium-sized zooplankton are coded as ZP, large algae or seagrass are coded as SA, large benthic animals are coded as MA, fouling organisms are coded as FB, and schooling fish are coded as SF. For example, the ID of the 10th organism in red tide organisms is CC10, ensuring that each ID uniquely corresponds to a blockage-causing organism.
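The identifier scheme described above can be sketched as follows. This is a minimal illustration; the function name is hypothetical, and zero-padding serial numbers below 10 (for example SF03) is an assumption drawn from the phrase "two-digit serial number".

```python
# Category codes as listed in the paragraph above.
CATEGORY_CODES = {
    "red tide organisms": "CC",
    "large and medium-sized zooplankton": "ZP",
    "large algae or seagrass": "SA",
    "large benthic animals": "MA",
    "fouling organisms": "FB",
    "schooling fish": "SF",
}


def make_organism_id(category, serial):
    """Build an ID of the form 'category code + two-digit serial number'."""
    if not 1 <= serial <= 99:
        raise ValueError("a two-digit serial number must be between 1 and 99")
    # Assumption: serial numbers below 10 are zero-padded to two digits.
    return f"{CATEGORY_CODES[category]}{serial:02d}"
```

Calling make_organism_id("red tide organisms", 10) reproduces the CC10 example from the text.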
[0042] The core fields of the habitat and distribution table include the unique biological identifier (ID), distribution area, habitat type (e.g., shallow sea, estuary, intertidal zone, etc.), migration mode (e.g., active migration, passive drifting, sessile growth, etc.), high-risk calendar, and risk level.
[0043] The core fields of the morphology and habit table include the biological unique identifier (ID), individual or group size (e.g., minimum cross-section), phototaxis (e.g., present / absent / strong / weak), morphological characteristics (e.g., body size, body surface structure, gregarious morphology, etc.), reproductive habits (e.g., reproductive method, breeding season, reproductive capacity, etc.), and ecological habits (e.g., suitable temperature range, suitable salinity range, diet, habitat depth, etc.).
[0044] The core fields of the outbreak and control table include the biological unique identifier (ID), outbreak record (e.g., time, location, outbreak size, impact), treatment method (e.g., specific plans for physical, chemical, and biological treatment), and recommended mesh aperture of the interception net.
[0045] The core fields of the data source table include the biological unique identifier (ID), data name, source type (e.g., first-hand data, publicly available data), publication time, source link, original document storage path, and review status (e.g., reviewed, pending review).
[0046] The database of marine organisms employs a relational design, with each table linked by a foreign key using a unique biological identifier (ID), enabling joint queries across multiple tables. For example, when a user searches for a specific organism, the system can simultaneously retrieve the name and illustrations from the basic information table, the high-risk calendar from the habitat and distribution table, the individual size from the morphology and habits table, the treatment methods from the outbreak and control table, and relevant literature from the data source table, achieving one-stop information retrieval.
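The five-table layout and the joint query described above can be sketched as follows. This is a sketch under stated assumptions: Python's built-in sqlite3 module stands in for MySQL, all table and column names are hypothetical, and the inserted rows are placeholders rather than real data.

```python
import sqlite3

# Hypothetical table and column names; SQLite stands in for MySQL here.
SCHEMA = """
CREATE TABLE basic_info (
    organism_id TEXT PRIMARY KEY, chinese_name TEXT, scientific_name TEXT,
    category TEXT, taxonomic_rank TEXT, atlas_path TEXT, remarks TEXT);
CREATE TABLE habitat_distribution (
    organism_id TEXT REFERENCES basic_info(organism_id), distribution_area TEXT,
    habitat_type TEXT, migration_mode TEXT, high_risk_calendar TEXT, risk_level TEXT);
CREATE TABLE morphology_habit (
    organism_id TEXT REFERENCES basic_info(organism_id), size_mm REAL,
    phototaxis TEXT, morphology TEXT, reproductive_habits TEXT, ecological_habits TEXT);
CREATE TABLE outbreak_control (
    organism_id TEXT REFERENCES basic_info(organism_id), outbreak_record TEXT,
    control_method TEXT, mesh_aperture_mm REAL);
CREATE TABLE data_source (
    organism_id TEXT REFERENCES basic_info(organism_id), data_name TEXT,
    source_type TEXT, publication_time TEXT, source_link TEXT,
    original_path TEXT, review_status TEXT);
"""


def one_stop_lookup(conn, organism_id):
    """Join the per-organism tables and list sources, newest first."""
    summary = conn.execute(
        """SELECT b.chinese_name, b.atlas_path, h.high_risk_calendar,
                  m.size_mm, o.control_method
           FROM basic_info b
           LEFT JOIN habitat_distribution h ON h.organism_id = b.organism_id
           LEFT JOIN morphology_habit m ON m.organism_id = b.organism_id
           LEFT JOIN outbreak_control o ON o.organism_id = b.organism_id
           WHERE b.organism_id = ?""",
        (organism_id,),
    ).fetchone()
    sources = conn.execute(
        "SELECT data_name FROM data_source WHERE organism_id = ? "
        "ORDER BY publication_time DESC",
        (organism_id,),
    ).fetchall()
    return summary, [row[0] for row in sources]


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
# Placeholder rows for illustration only.
conn.execute("INSERT INTO basic_info VALUES (?,?,?,?,?,?,?)",
             ("FB01", "name-zh", "Genus species", "fouling organisms", "rank",
              "/atlas/FB01.png", ""))
conn.execute("INSERT INTO habitat_distribution VALUES (?,?,?,?,?,?)",
             ("FB01", "East China Sea", "intertidal zone", "sessile growth",
              "2025-06 to 2025-08", "high risk"))
conn.execute("INSERT INTO morphology_habit VALUES (?,?,?,?,?,?)",
             ("FB01", 12.0, "absent", "", "", ""))
conn.execute("INSERT INTO outbreak_control VALUES (?,?,?,?)",
             ("FB01", "", "physical removal", 10.0))
conn.executemany("INSERT INTO data_source VALUES (?,?,?,?,?,?,?)", [
    ("FB01", "older report", "publicly available data", "2020-01-15", "", "", "reviewed"),
    ("FB01", "newer report", "publicly available data", "2024-03-02", "", "", "reviewed"),
])
summary, sources = one_stop_lookup(conn, "FB01")
```

The source list is queried separately from the four-way join so that an organism with several source rows does not multiply the summary row.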
[0047] The database of blockage-causing marine organisms imports and validates the data processed by the data processing module. The processed data are first converted from Excel to CSV format to ensure consistent field separators and no formatting errors. For batch import, the MySQL LOAD DATA INFILE statement is used to load the CSV data into the corresponding data tables. During the import, field mapping relationships are set so that the data match the table fields accurately.
[0048] After the data are imported into the database of blockage-causing marine organisms, data verification is performed. MySQL queries are used to verify that the number of rows in each table matches expectations, and MySQL JOIN statements are used to verify the accuracy of multi-table joins. Twenty species are randomly selected for full-field queries to ensure there is no data loss, field misalignment, or join error. Data integrity verification tools are used to check for null values and outliers, and abnormal data are corrected promptly.
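The post-import checks described above can be sketched as follows. It is only an illustration: sqlite3 again stands in for MySQL, the table and column names are hypothetical, and the deliberately faulty sample rows exist only to trigger each check.

```python
import sqlite3


def verify_import(conn, expected_rows):
    """Return a list of human-readable problems found after an import."""
    problems = []
    for table, expected in expected_rows.items():
        # Table names come from a trusted, hard-coded mapping, not user input.
        actual = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
        if actual != expected:
            problems.append(f"{table}: expected {expected} rows, found {actual}")

    # Join check: child rows whose ID has no match in the basic information table.
    orphans = conn.execute(
        """SELECT COUNT(*) FROM habitat_distribution h
           LEFT JOIN basic_info b ON b.organism_id = h.organism_id
           WHERE b.organism_id IS NULL"""
    ).fetchone()[0]
    if orphans:
        problems.append(f"habitat_distribution: {orphans} row(s) with unknown ID")

    # Null check on one required field.
    nulls = conn.execute(
        "SELECT COUNT(*) FROM basic_info WHERE scientific_name IS NULL"
    ).fetchone()[0]
    if nulls:
        problems.append(f"basic_info: {nulls} row(s) missing scientific_name")
    return problems


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE basic_info (organism_id TEXT PRIMARY KEY, scientific_name TEXT);
CREATE TABLE habitat_distribution (organism_id TEXT, distribution_area TEXT);
INSERT INTO basic_info VALUES ('FB01', 'Genus species'), ('FB02', NULL);
INSERT INTO habitat_distribution VALUES ('FB01', 'Yellow Sea'), ('ZZ99', 'Bohai Sea');
""")
problems = verify_import(conn, {"basic_info": 2, "habitat_distribution": 3})
```

Here the sample data produce three findings: a row-count mismatch, one orphaned habitat row, and one missing scientific name.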
[0049] The database retrieval module allows users to search for data within the database of blockage-causing marine organisms using keywords. The search entry point is located on the database's front-end interface and offers two independent entries, "Chinese Name Search" and "Scientific Name Search," each supporting switching between fuzzy and precise search. Fuzzy search accepts a partial name (e.g., entering "barnacle" matches all blockage-causing organisms in the barnacle genus), while precise search requires the complete name (e.g., entering "reticulated barnacle" matches only that species), catering to different users' search habits and precision requirements.
[0050] The database retrieval module integrates and outputs search results. After the user enters keywords and confirms the search, the system links the five core data tables through the biological unique identifier (ID) and outputs the 16 core indicators of the blockage-causing marine organism in one go. At the same time, it automatically retrieves the latest relevant research literature and report links from the data source table and arranges them in reverse chronological order of publication time to display the latest research progress on the organism (such as new treatment technologies, new discoveries on outbreak patterns, and the basis for adjusting risk levels).
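The difference between the two search modes can be sketched in a few lines. This is a minimal in-memory sketch with a hypothetical function name and placeholder organism names; in a MySQL back end the same distinction would typically map to a LIKE pattern versus an equality comparison.

```python
def search_names(names, keyword, fuzzy=True):
    """Fuzzy search matches substrings; precise search needs the full name."""
    needle = keyword.strip().casefold()
    if not needle:
        return []
    if fuzzy:
        return [name for name in names if needle in name.casefold()]
    return [name for name in names if name.casefold() == needle]


# Placeholder names for illustration only.
names = ["reticulated barnacle", "placeholder barnacle", "placeholder jellyfish"]
fuzzy_hits = search_names(names, "barnacle", fuzzy=True)
precise_hits = search_names(names, "reticulated barnacle", fuzzy=False)
```

With these placeholder names, the fuzzy query returns both barnacle entries and the precise query returns only the reticulated barnacle.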
[0051] The database retrieval module's result display has been optimized, supporting switching between list view and detail view. Specifically, the list view concisely displays key information such as the organism's Chinese name, scientific name, category, distribution area, high-risk level, and core treatment methods, making it easy for users to quickly filter target organisms.
[0052] The detailed view comprehensively displays all indicator information. The biological atlas is presented in high-definition image format and supports zooming in. The high-risk calendar visualizes the changes in risk level in the form of a line chart, intuitively presenting the risk distribution throughout the year. The treatment methods are broken down and displayed in the steps of "prevention measures - emergency response - follow-up cleanup" and are accompanied by links to typical application cases for users to refer to directly.
[0053] The automatic retrieval module periodically retrieves data related to marine organisms that cause blockages, and updates this data to the database of marine organisms that cause blockages after it has been reviewed by the data update module.
[0054] The automatic retrieval module's keyword dictionary is built around the Chinese and scientific names of the 125 blockage-causing marine organisms. It also includes synonyms and common names for each organism (for example, a seaweed commonly known as sea lettuce), to avoid omissions in the search due to name differences. The keyword dictionary supports manual updates, so that new scientific names can be added or outdated ones corrected as biological taxonomy research progresses.
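One possible shape for such a keyword dictionary is sketched below. The structure, the function names, and the ID SA01 are hypothetical, and the Ulva lactuca entry is an illustrative example chosen here rather than an entry confirmed by the source.

```python
def build_alias_index(dictionary):
    """Map every known name (case-insensitive) to its organism ID."""
    index = {}
    for organism_id, entry in dictionary.items():
        all_names = (entry["chinese_name"], entry["scientific_name"], *entry["aliases"])
        for name in all_names:
            index[name.casefold()] = organism_id
    return index


def add_alias(dictionary, organism_id, alias):
    """Manual update: register a new synonym or common name."""
    if alias not in dictionary[organism_id]["aliases"]:
        dictionary[organism_id]["aliases"].append(alias)


# Illustrative entry; the ID and names are assumptions, not data from the source.
keyword_dictionary = {
    "SA01": {
        "chinese_name": "石莼",
        "scientific_name": "Ulva lactuca",
        "aliases": ["sea lettuce"],
    },
}
add_alias(keyword_dictionary, "SA01", "green laver")
index = build_alias_index(keyword_dictionary)
```

Any of the registered names then resolves to the same organism, so a crawler query built from a common name still lands on the right record.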
[0055] The automatic retrieval module performs data searches automatically based on the keyword dictionary. It uses a web crawler written in Python that integrates the Selenium automation tool and the Requests library to access and scrape data from the various platforms automatically. For databases requiring access permissions (such as some foreign-language journal databases), legitimate access accounts are configured to ensure compliant retrieval. The search scope covers Chinese core journal databases (e.g., CNKI, Wanfang, VIP), foreign core databases (e.g., Web of Science, Scopus), government public information platforms (e.g., the State Oceanic Administration website and the marine environmental bulletin release platforms of coastal provinces), platforms of authoritative international institutions (e.g., the IAEA website, the MEPC report platform), and industry databases (e.g., the China Marine Ecological Environment Monitoring Data Sharing Platform).
[0056] The default retrieval cycle of the automatic retrieval module is once a month. Database administrators can also trigger an immediate retrieval manually through the backend to meet urgent data replenishment needs. After a retrieval is completed, the automatic retrieval module automatically extracts the titles, authors, publication dates, source links, and original or full-text access channels of the relevant literature and reports, links them to the corresponding organism in the data source table of the database of blockage-causing marine organisms, and marks them as "pending review." The system notifies administrators via in-system messages to review the data in the data update module. Approved data are automatically written to the database and synchronously linked to the corresponding organism's information page. Data that fail the review (e.g., because of insufficient relevance, duplication, or unreliable sources) are marked with the reason for rejection and archived for future reference.
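The review workflow described above can be sketched as a small state transition. The class, field, and function names are hypothetical, and the "rejected" status label is an assumption, since the source names only "reviewed" and "pending review".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RetrievedItem:
    organism_id: str
    title: str
    source_link: str
    review_status: str = "pending review"  # every new item starts here
    rejection_reason: Optional[str] = None


def review(item, approved, reason=None):
    """Move an item out of 'pending review'; a rejection must carry a reason."""
    if item.review_status != "pending review":
        raise ValueError("item has already been reviewed")
    if approved:
        item.review_status = "reviewed"
    else:
        if not reason:
            raise ValueError("a rejection needs a reason")
        # "rejected" is an assumed label; the reason is kept for the archive.
        item.review_status = "rejected"
        item.rejection_reason = reason
    return item


# Placeholder items for illustration only.
accepted = review(RetrievedItem("FB01", "placeholder title", "placeholder link"), True)
declined = review(
    RetrievedItem("FB01", "another placeholder", "placeholder link"),
    False,
    reason="insufficient relevance",
)
```

Only items that reach the "reviewed" state would be written to the data source table; rejected items keep their reason for later reference.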
[0057] The risk prediction module uses a convolutional neural network to train on a database of blockage-causing marine organisms to predict information on outbreaks of blockage-causing marine organisms in target areas. First, the risk prediction module constructs a training dataset, extracting historical outbreak records of 125 blockage-causing organisms from 1990 to the present, collecting over 5000 valid outbreak records. Each record corresponds to 12 input features and 3 output labels. Input features include environmental influencing factors (including water temperature, salinity, nutrient concentration, light intensity, ocean current velocity, and marine chlorophyll a concentration), the organism's own ecological habits (including reproductive cycle, migration patterns, phototaxis, and suitable temperature range), and the intensity of human activities in the marine area (including water intake and coastal sewage discharge). Output labels include outbreak time (year and month required), outbreak range (i.e., marine area zoning code, e.g., BH-XB for the western Bohai Sea), and risk level. The risk level is numerically quantified: high risk corresponds to 3, medium risk to 2, and low risk to 1. The input features are standardized (e.g., normalized to the [0, 1] interval) and divided into training and validation sets in a 7:3 ratio.
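The preprocessing steps named above (risk-level encoding, scaling to [0, 1], and the 7:3 split) can be sketched in plain Python. The function names, the fixed shuffle seed, and the sample values are assumptions made for illustration.

```python
import random

# Numeric encoding of risk levels as stated in the paragraph above.
RISK_CODE = {"high risk": 3, "medium risk": 2, "low risk": 1}


def min_max_normalize(rows):
    """Scale each feature column to [0, 1]; constant columns become 0.0."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def split_7_3(samples, seed=0):
    """Shuffle reproducibly, then split 70% training / 30% validation."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = (len(shuffled) * 7) // 10
    return shuffled[:cut], shuffled[cut:]


# Made-up feature rows (two features instead of twelve, for brevity).
features = [[10.0, 30.0], [20.0, 30.0], [30.0, 30.0]]
scaled = min_max_normalize(features)
train, val = split_7_3(range(10))
```

In practice the scaling minima and maxima are usually computed on the training portion only and then reused for the validation portion, so that validation data do not influence the scaling.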
[0058] The risk prediction module then builds and trains the convolutional neural network (CNN) model. The network is constructed on the TensorFlow framework with the following structure: The input layer receives a 12-dimensional feature vector, corresponding to the 12 input features. Convolutional layer 1 consists of 32 3x1 convolutional kernels with the ReLU activation function, used to extract low-level features (such as the basic association between a single environmental factor and an outbreak). Pooling layer 1 uses 2x1 max pooling with a stride of 2 to reduce the number of parameters and avoid overfitting. Convolutional layer 2 consists of 64 5x1 convolutional kernels with the ReLU activation function, used to extract mid-level features (such as the interaction between combinations of environmental factors and biological habits). Pooling layer 2 uses 2x1 max pooling with a stride of 2 to further compress the feature dimension. Convolutional layer 3 consists of 128 3x1 convolutional kernels with the ReLU activation function, used to extract high-level features (such as outbreak patterns under multi-factor coupling). Fully connected layer 1 consists of 128 neurons with the ReLU activation function and integrates the features extracted by the convolutional layers. Fully connected layer 2 consists of 64 neurons with the ReLU activation function and refines the feature mapping. The output layer has 3 neurons, corresponding to outbreak time, outbreak range, and risk level, respectively. The outbreak time and outbreak range outputs use linear activation functions, while the risk level output uses the Softmax activation function.
[0059] The specific parameter settings for model training in the risk prediction module are as follows: the optimizer is Adam, the learning rate is set to 0.001, and the regularization parameter is set to 0.001 to avoid overfitting; the model is trained iteratively for 500 epochs, and 10-fold cross-validation is used to verify model performance. The R² coefficient and the root mean square error (RMSE) between predicted and actual values are used as performance evaluation indicators. The final training results show an R² coefficient of 0.88 and an RMSE of 0.12, which meets the accuracy requirements for blockage prediction.
[0060] The risk prediction module automatically retrieves historical environmental data and basic data on the target organisms from the database of blockage-causing marine organisms and feeds them into the trained CNN model. The module outputs the types of marine organisms that may cause blockages in the region, the outbreak time window (accurate to the month), the outbreak range (a specific marine area or a radius of 5 km or 10 km around the water intake), and the risk level. The module also generates a prediction report that includes the basis for the prediction together with prevention and control recommendations, and the report can be downloaded in PDF format.
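The source does not state which padding mode the convolutional layers use, so the following sketch only traces how the feature length would evolve through the convolution and pooling stack under two common assumptions ("same" and "valid" padding, convolution stride 1). It is shape arithmetic in plain Python, not the patented model, and the function names are hypothetical.

```python
def conv_out(length, kernel, padding):
    """Output length of a stride-1 1-D convolution."""
    if padding == "same":
        return length
    return max(length - kernel + 1, 0)  # "valid": no padding


def pool_out(length, size=2, stride=2):
    """Output length of unpadded max pooling."""
    if length < size:
        return 0
    return (length - size) // stride + 1


def trace_lengths(input_len=12, padding="same"):
    """Feature length after each of conv1, pool1, conv2, pool2, conv3."""
    layers = [("conv", 3), ("pool", 2), ("conv", 5), ("pool", 2), ("conv", 3)]
    lengths = [input_len]
    for kind, kernel in layers:
        current = lengths[-1]
        if kind == "conv":
            lengths.append(conv_out(current, kernel, padding))
        else:
            lengths.append(pool_out(current, size=kernel, stride=2))
    return lengths


same_lengths = trace_lengths(padding="same")
valid_lengths = trace_lengths(padding="valid")
```

Under the "same" assumption the length goes 12, 12, 6, 6, 3, 3, leaving 3 positions x 128 channels = 384 values for the first fully connected layer. Under the "valid" assumption the length reaches zero before the third convolution, which suggests that a 12-feature input needs padded convolutions for this layer stack to be workable.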
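The two evaluation indicators named above follow their standard definitions, sketched here in plain Python. The function names and the sample numbers are illustrative only and are unrelated to the reported results.

```python
from math import sqrt


def rmse(actual, predicted):
    """Root mean square error between actual and predicted values."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sqrt(sum(squared) / len(squared))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Made-up values: predicting the mean everywhere gives R² = 0.
actual = [1.0, 2.0, 3.0]
baseline = [2.0, 2.0, 2.0]
baseline_r2 = r_squared(actual, baseline)
baseline_rmse = rmse(actual, baseline)
```

An R² of 1 with an RMSE of 0 corresponds to perfect prediction, while a model that always predicts the mean of the actual values scores an R² of 0.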
[0061] The data update module updates both the data in the database of blockage-causing marine organisms and the convolutional neural network model in the risk prediction module. The data update module also maintains the system.
[0062] The data in the marine organism database is updated using both automatic and manual methods. Manual updates involve managers inputting new marine survey data, expert consultation results, and the latest records of blockage-causing events from enterprises into the data processing module. After processing by the data processing module, the data is then entered into the marine organism database.
[0063] Automatic updates are based on the periodic search results from the automatic retrieval module. After administrators have reviewed these results in the data update module, the data update module batch-updates the corresponding data tables in the database of blockage-causing marine organisms. The updates include new outbreak records, new treatment technologies, and supplementary literature. At the same time, the data update module automatically records the time, content, and operator of each update, forming an update log for easy traceability and auditing.
[0064] Every six months, the data update module retrains the CNN model on the outbreak records newly added to the database of blockage-causing marine organisms (no fewer than 50 records). It adjusts parameters such as the number of convolutional kernels, the learning rate, and the number of iterations to improve the model's adaptability to new environmental conditions and new outbreak patterns, so that prediction accuracy remains stable.
[0065] Meanwhile, the data update module performs system maintenance. It uses a MySQL database backup tool to perform a full backup once a month and an incremental backup once a week. The backup data is stored on both a local server and a cloud server to prevent data loss.
[0066] The data update module periodically checks the database's operating status, including response speed, retrieval success rate, and model prediction efficiency, and promptly handles faults such as lag and errors to ensure stable system operation.
[0067] The data update module collects user feedback (e.g., through front-end feedback forms, user interviews, etc.). Based on user feedback, the operations and maintenance personnel upgrade the system every quarter, optimizing the search function, result display format, and prediction report content, and adding functional modules requested by users (e.g., multi-condition combined search, data export function, etc.).
[0068] The database of marine organisms that cause cold-source (cooling water intake) blockages at coastal nuclear power plants established in this invention focuses on the taxonomic rank, morphological characteristics, biological illustrations, habitat types, individual or group size, migration patterns, distribution areas, phototaxis, ecological habits, reproductive habits, high-risk calendars, and outbreak records of blockage-causing marine organisms.
[0069] This invention provides fundamental data for the construction of a monitoring-early warning-prevention technology system for water intake blockages in coastal nuclear power plants, and offers a scientific basis for cooling water system engineering design / optimization, water intake safety management, and emergency response. Simultaneously, the coastal nuclear power plant water intake blockage information retrieval database will be continuously updated and optimized based on the latest experience feedback and ongoing monitoring results, keeping pace with the times and playing a significant role in improving the safety management and risk response level of water intake at coastal nuclear power plants.
[0070] The above embodiments merely illustrate several implementation methods of the present invention, and their descriptions are relatively specific and detailed, but they should not be construed as limiting the scope of the present invention. It should be noted that those skilled in the art can make various modifications and improvements without departing from the concept of the present invention, and these all fall within the protection scope of the present invention. Therefore, the protection scope of this patent should be determined by the appended claims.
Claims
1. A smart database and risk early warning system for marine organisms that cause blockages, characterized in that the system includes a data processing module, a database of marine organisms that cause blockages, a database retrieval module, an automatic retrieval module, a data update module, and a risk prediction module. The data processing module processes the raw data on marine organisms that cause blockages and includes a data deduplication submodule, a format unification submodule, and a missing value supplementation submodule. The database of marine organisms that cause blockages reads the data processed by the data processing module and stores it according to the types of marine organisms that cause blockages and the indicator system. The database retrieval module is used by users to search for data in the database of marine organisms that cause blockages using keywords. The automatic retrieval module periodically and automatically retrieves data related to marine organisms that cause blockages, and this data is updated to the database of marine organisms that cause blockages after it has been reviewed in the data update module. The risk prediction module trains a convolutional neural network model on the data in the database of marine organisms that cause blockages to predict outbreak information for blockage-causing marine organisms in the target area. The data update module updates the data in the database of marine organisms that cause blockages and the convolutional neural network model in the risk prediction module, and the data update module also maintains the system.
2. The method for constructing a smart database and risk early warning system for marine organisms that cause blockages as described in claim 1, characterized in that the raw data on blockage-causing marine organisms is obtained through screening and integration of multi-source data. The multi-source data is collected based on the types of blockage-causing marine organisms and an indicator system. The types of blockage-causing marine organisms include six categories: red tide organisms, large and medium-sized zooplankton, large algae or seagrass, large benthic animals, fouling organisms, and schooling fish, totaling 125 species. The indicator system consists of 16 categories, specifically including Chinese name, scientific name, category, taxonomic rank, distribution area, habitat type, migration mode, individual or group size, phototaxis, morphological characteristics, reproductive habits, ecological habits, interception net aperture size, outbreak record, treatment method, data source, biological atlas, and high-risk calendar.
3. The method for constructing a smart database and risk early warning system for marine organisms that cause blockages as described in claim 2, characterized in that the multi-source data includes first-hand survey data and publicly available information. The screening and organization of the multi-source data includes removing data that lacks authenticity, authority, relevance, or completeness.
4. The method for constructing a smart database and risk early warning system for marine organisms that cause blockages as described in claim 3, characterized in that the data deduplication submodule of the data processing module uses the DISTINCT statement in MySQL and a deduplication tool to identify and remove duplicate data from the raw data on marine organisms that cause blockages. Duplicate data is sorted according to priority, and data with higher priority is retained. The data priority from high to low is: survey data, expert-verified data, core journal literature data, authoritative bulletin data, and general public information data. For data of the same priority, the average value is taken for numerical data, and the decision for descriptive data is made by manual review.
5. The method for constructing a smart database and risk early warning system for marine organisms that cause blockages as described in claim 4, characterized in that the format unification submodule of the data processing module unifies the format of classification level data according to the biological classification method, unifies the distribution sea area data as the four sea areas of Bohai Sea, Yellow Sea, East China Sea, and South China Sea individually or in combination, unifies the format of numerical data according to the unit, and unifies the format of time data, with high-risk calendar data expressed by year and month and outbreak record times by year, month, and day. For risk level data, the risk levels in the high-risk calendar are uniformly formatted as high risk, medium risk, and low risk. High risk corresponds to a blockage probability greater than 70%, medium risk corresponds to a blockage probability less than or equal to 70% but greater than or equal to 30%, and low risk corresponds to a blockage probability less than 30%. The missing value supplementation submodule of the data processing module supplements the missing basic information, ecological habits, outbreak records, and treatment methods data of the blockage-causing marine organisms in the raw data. The basic information, outbreak records, and treatment methods data are supplemented by consulting literature, and the ecological habits data are supplemented by expert demonstration. Missing data that cannot be supplemented in the above ways is marked as data to be supplemented.
6. The method for constructing a smart database and risk early warning system for marine organisms that cause blockages as described in claim 5, characterized in that the database of marine organisms that cause blockages includes five tables: a basic information table, a habitat and distribution table, a morphology and habit table, an outbreak and control table, and a data source table. The basic information table includes the organism's unique identifier (ID), Chinese name, scientific name, category, taxonomic rank, biological illustration storage path, and remarks. The habitat and distribution table includes the organism's unique identifier (ID), distribution area, habitat type, migration method, high-risk calendar, and risk level. The morphology and habit table includes the organism's unique identifier (ID), individual or group size, phototaxis, morphological characteristics, reproductive habits, and ecological habits. The outbreak and control table includes the organism's unique identifier (ID), outbreak records, control methods, and interception net aperture size. The data source table includes the organism's unique identifier (ID), data name, source type, publication time, source link, original document storage path, and review status. The database tables are linked and queried using the organism's unique identifier (ID). The database is built on MySQL; the processed data is converted into CSV format, and the MySQL LOAD DATA INFILE statement is used to import the CSV data into the corresponding data tables in batches. After the data is imported into the database of marine organisms that cause blockages, the data is verified. MySQL queries are used to verify that the number of data rows in each table matches expectations. MySQL JOIN statements are used to verify the accuracy of multi-table joins. Twenty biological species are randomly selected for full-field queries to ensure no data loss, field misalignment, or join errors. A data integrity verification tool is used to check for null values and outliers, and abnormal data is corrected promptly.
7. The method for constructing a smart database and risk early warning system for marine organisms that cause blockages as described in claim 6, characterized in that the database retrieval module supports searching by Chinese name and by scientific name, and supports switching between fuzzy and precise searches. Based on user-input keywords, the module associates the five data tables using the organism's unique identifier (ID), outputs the 16 categories of indicators for the marine organisms that cause blockages, and automatically retrieves data from the data source table. The module supports switching between list and detail views for search results. The list view displays the organism's Chinese name, scientific name, category, distribution area, high-risk level, and core treatment methods. The detail view displays all indicator information, with the biological atlas presented as high-resolution images, the high-risk calendar visualized as a line graph showing changes in risk levels, and the treatment methods broken down step by step.
8. The method for constructing a smart database and risk early warning system for marine organisms that cause blockages as described in claim 7, characterized in that the automatic retrieval module constructs a keyword dictionary using the Chinese names, scientific names, synonyms, and common names of the 125 blockage-causing marine organisms. Each month, the automatic retrieval module uses Python to automatically access and crawl data from various platforms based on the keyword dictionary, extracting the titles, authors, publication dates, source links, and original text information of relevant literature and reports. This information is then entered into the data source table for the corresponding organism in the database of marine organisms that cause blockages and marked as pending review. The retrieval platforms include Chinese core journal databases, foreign core databases, government open information platforms, international authoritative institution platforms, and industry databases.
9. The method for constructing a smart database and risk early warning system for marine organisms that cause blockages as described in claim 8, characterized in that the risk prediction module extracts input features and output labels from the database of marine organisms that cause blockages. The input features are standardized and divided into training and validation sets in a 7:3 ratio. The input features include environmental influencing factors, the organisms' ecological habits, and the intensity of human activities in the marine area. The output labels include outbreak time, outbreak range, and risk level. The risk prediction module constructs and trains a convolutional neural network model using the extracted input features and output labels. When the user inputs a target region and time range, the risk prediction module retrieves historical environmental data and basic data on the target organisms from the database of marine organisms that cause blockages, inputs this data into the convolutional neural network model, and outputs the types of marine organisms that may cause blockages in the region, along with their output labels, generating a prediction report.
10. The method for constructing a smart database and risk early warning system for marine organisms that cause blockages as described in claim 9, characterized in that the data update module allows administrators to review the periodic retrieval results of the automatic retrieval module and updates the reviewed retrieval results to the database of marine organisms that cause blockages. The data update module records an update log. Every six months, the data update module retrains the convolutional neural network model of the risk prediction module based on the newly added data in the database of marine organisms that cause blockages. The data update module also regularly backs up the database of marine organisms that cause blockages, periodically checks the operating status of the database, and collects user feedback.