A method and system for rapid positioning and information extraction of intelligent medical images
By combining a three-level cascaded localization and dynamic processing strategy, the problems of high computational complexity and poor cross-device adaptability in existing medical image processing systems are solved, achieving efficient and accurate lesion localization and information extraction.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- SUZHOU HENGYIXIN INTELLIGENT TECH CO LTD
- Filing Date
- 2026-04-02
- Publication Date
- 2026-06-26
AI Technical Summary
Existing medical image processing systems suffer from high computational complexity and low efficiency in lesion localization, making it difficult to effectively integrate multi-dimensional features and exhibiting poor cross-device adaptability, resulting in insufficient recognition capabilities and unstable diagnostic results.
A three-level cascaded localization module localizes lesions step by step, a dynamic processing strategy selects a differentiated processing path for each region, and a multimodal feature fusion module performs weighted fusion of multi-dimensional features, from which a structured report is generated.
It improves the processing efficiency of lesion localization and information extraction, enhances the system's adaptability to different devices and complex scenarios, and ensures the accuracy and stability of diagnostic results.
Figure CN122290909A_ABST
Description
Technical Field
[0001] This invention relates to the fields of medical image processing and artificial intelligence, and in particular to a method and system for rapid localization and information extraction of smart medical images, specifically involving medical image analysis technology based on multi-level localization mechanisms, dynamic processing strategies, and multi-modal feature fusion.
Background Technology
[0002] With the development of medical imaging technology, computer-aided medical image analysis methods have been widely used in disease screening, assisted diagnosis, and efficacy evaluation. Existing medical image processing systems typically incorporate deep learning models to locate, segment, and extract features from medical image data, thereby generating corresponding diagnostic results or structured reports. These systems generally include multiple processing stages such as image preprocessing, target detection or localization, region segmentation, and feature analysis, with each module working collaboratively to complete the automatic analysis of medical images.
[0003] In existing technologies, common medical image analysis methods often employ convolutional neural network-based target detection models (such as those based on sliding windows or candidate region generation mechanisms) to perform global scanning of images to identify suspected lesion areas. Subsequently, segmentation networks are used to finely segment the candidate regions, and morphological, textural, or radiomics features are further extracted. Finally, diagnostic results are output by combining rules or models. While these methods have improved the automation level of medical image analysis to some extent, they still have significant shortcomings.
[0004] First, existing technologies typically require high-intensity calculations on the entire image or all slices during lesion localization, especially in three-dimensional medical images (such as CT or MRI), where the large amount of data and high dimensionality result in high overall computational complexity and low processing efficiency, making it difficult to meet the real-time or rapid response requirements of clinical scenarios.
[0005] Secondly, existing methods often use a single model or limited features for analysis, making it difficult to effectively integrate multi-dimensional features such as structural information, texture information, and spatial context. This results in insufficient ability to identify complex or small-scale lesions, and problems such as incomplete information extraction or high false negative rates.
[0006] In addition, medical images generated by different medical device manufacturers vary in format, resolution and imaging parameters. Existing models are quite sensitive to changes in data distribution and lack good cross-device adaptability, which can easily lead to a decline in model performance or unstable diagnostic results.
[0007] Furthermore, in existing technologies, each processing stage typically employs a fixed computational process, treating different image regions the same and lacking an adaptive computational mechanism for differences in image content. This results in low efficiency in utilizing computational resources and a large amount of redundant computation.
[0008] Therefore, how to improve the processing efficiency of lesion localization and information extraction, enhance the comprehensive utilization of multi-dimensional features, and improve the system's adaptability in different devices and complex scenarios while ensuring the accuracy of medical image analysis has become a technical problem that urgently needs to be solved in this field.
Summary of the Invention
[0009] In view of this, the purpose of this invention is to provide a method and system for rapid localization and information extraction of smart medical images that solve the technical problems of existing medical image processing technology, such as low localization efficiency, incomplete information extraction, low utilization of computing resources, and poor cross-device adaptability, and that improve the efficiency and stability of medical image processing while ensuring analysis accuracy.
To achieve the above objectives, this invention provides the following technical solution. In one embodiment of the present invention, a rapid localization and information extraction system for smart medical images is provided, comprising:
- an image preprocessing module for standardizing and enhancing the input raw medical image;
- a three-level cascaded localization module, connected to the image preprocessing module, for sequentially performing coarse localization, fine localization, and fine-tuning localization on the medical image, wherein the coarse localization quickly generates a set of candidate regions, the fine localization refines the boundaries of the candidate regions, and the fine-tuning localization achieves pixel-level localization correction;
- a dynamic processing strategy module for selecting differentiated processing paths (skip processing, simplified processing, or fine processing) for different regions based on the complexity of the medical image or the importance of the candidate regions, and for determining the processing path for fine localization and fine-tuning localization based on the candidate regions generated by the coarse localization;
- a precise segmentation and information deconstruction module, connected to the three-level cascaded localization module, for performing pixel-level segmentation only on the target regions filtered by the dynamic processing strategy while simultaneously extracting morphological features, texture features, and spatial relationship features of the lesion;
- a multimodal feature fusion module for performing weighted fusion of the morphological features, texture features, context features, and medical metadata to obtain a unified feature representation; and
- a structured report generation module, connected to the multimodal feature fusion module, for generating a structured report containing localization information, quantification parameters, and diagnostic prompts based on the unified feature representation.
[0010] Furthermore, in one embodiment of the present invention, the dynamic processing strategy module controls different regions to be processed using models with different resolutions or different computational depths by calculating image complexity indicators or regional importance scores, so as to achieve adaptive allocation of computing resources.
[0011] Preferably, in one embodiment of the present invention, the dynamic processing strategy module includes an attention guidance mechanism for suppressing computation in low-importance regions and prioritizing the allocation of computing resources to high-importance regions, thereby reducing redundant processing in non-critical regions.
[0012] Furthermore, in one embodiment of the present invention, the localization networks at each level of the three-level cascaded localization module share some features to reduce redundant calculations and improve overall processing efficiency.
[0013] Preferably, in one embodiment of the present invention, the multimodal feature fusion module uses an adaptive weighting method to fuse different features, wherein the weights of each feature are automatically learned through model training to improve the accuracy and robustness of the fused feature representation.
[0014] Alternatively, in one embodiment of the present invention, the precise segmentation and information deconstruction module extracts the area, perimeter, shape parameters and texture statistical features of the lesion through parallel branches during the segmentation process, so as to achieve the collaborative execution of the segmentation task and the feature extraction task.
[0015] Furthermore, in one embodiment of the present invention, the system further includes a feedback learning module, used to incrementally train the three-level cascaded localization module or the precise segmentation and information deconstruction module based on the user's corrections to the structured report, so as to achieve continuous optimization of model performance.
[0016] In one embodiment of the present invention, a method for rapid localization and information extraction of smart medical images based on the above system is provided, comprising: preprocessing the original medical image; generating candidate regions and refining the localization results step by step through a three-level cascaded localization module; filtering target regions based on a dynamic processing strategy; finely segmenting the filtered target regions and extracting multi-dimensional features; fusing the multi-dimensional features and generating a structured diagnostic report.
[0017] Furthermore, in one embodiment of the present invention, the dynamic processing strategy includes: selecting a processing resolution based on image complexity; and determining whether to perform fine segmentation based on region importance.
[0018] Preferably, in one embodiment of the present invention, the feature fusion includes a weighted combination of structural features, texture features, contextual features, and medical metadata.
[0019] Furthermore, in one embodiment of the present invention, the above-described method further includes: receiving correction information from the user regarding the structured diagnostic results; generating annotated training data based on the correction information; and updating the model parameters in the three-level cascaded localization module and/or the precise segmentation and information deconstruction module through online learning or periodic training, so as to achieve adaptive optimization of the model.
[0020] Alternatively, in one embodiment of the present invention, an electronic device is provided, including a processor and a memory, wherein the processor executes a program to implement the above-described method.
[0021] Alternatively, in one embodiment of the present invention, a computer-readable storage medium is provided having a computer program stored thereon, which, when executed, implements the above-described method.
[0022] Based on the above technical solutions, the rapid localization and information extraction system for smart medical images of the present invention, by constructing a three-level cascaded localization module, realizes a progressive processing from coarse localization, fine localization to fine-tuning localization, thereby significantly reducing the need for high-complexity calculations across the entire image range while ensuring localization accuracy, thus effectively reducing the overall computational overhead; and by introducing a dynamic processing strategy module, it adaptively selects skip processing, simplified processing, or fine processing paths for different regions according to the complexity of the medical image and the importance of candidate regions, realizing on-demand allocation of computing resources, avoiding redundant calculations for non-critical regions, and further improving processing efficiency.
[0023] Meanwhile, the precise segmentation and information deconstruction module performs pixel-level fine segmentation only on the selected target areas, and simultaneously extracts the morphological features, texture features and spatial relationship features of the lesions during the segmentation process. This allows for deep integration of the information extraction process and the localization process, ensuring the integrity of feature extraction and avoiding redundant calculations. Combined with the multimodal feature fusion module, structural features, texture features, contextual features and medical metadata are adaptively weighted and fused to construct a more comprehensive and robust feature representation, thereby improving the system's generalization ability under different devices and imaging conditions.
[0024] Furthermore, through the structured report generation module, the fused features are associated and matched with medical knowledge to achieve automatic conversion from image data to diagnostic information, thereby improving the consistency and interpretability of diagnostic results. The model can also be continuously optimized through a feedback learning mechanism, enabling the system to continuously improve its performance in practical applications.
[0025] Therefore, this invention achieves an effective balance between localization speed and extraction accuracy. While ensuring high-precision detection and information extraction, it significantly shortens processing time and reduces computational resource consumption. It solves the technical problems of low localization efficiency, incomplete information extraction, and poor cross-device adaptability in medical image processing in the prior art, and has outstanding substantive features and significant progress.
Attached Figure Description
[0026] To more clearly illustrate the technical solutions in the embodiments of the present invention, the accompanying drawings used in the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0027] Figure 1 is a schematic diagram of the overall structure of a rapid localization and information extraction system for smart medical images according to the present invention; Figure 2 is a schematic diagram of the processing flow combining three-level cascaded localization and dynamic processing strategies in an embodiment of the present invention; Figure 3 is a schematic diagram of the process of multimodal feature fusion and structured report generation in an embodiment of the present invention.
Detailed Implementation
[0028] To enable those skilled in the art to more clearly understand the technical solution of the present invention, specific embodiments of the present invention will be described in detail below with reference to the accompanying drawings. It should be noted that the following embodiments are only used to illustrate the technical solution of the present invention and are not intended to limit the scope of protection of the present invention.
[0029] In a specific embodiment of the present invention, the system structure, the core processing flow, and the feature extraction and result output process are described in turn. First, the overall system structure and each functional module are explained with reference to Figure 1; then the core processing flow based on the three-level cascaded localization and dynamic processing strategy is explained with reference to Figure 2; finally, the process of feature extraction, multimodal fusion, and structured output is explained with reference to Figure 3.
[0030] Furthermore, this invention combines a "three-level cascaded localization mechanism" with a "dynamic processing strategy" to achieve hierarchical processing and adaptive computation path selection of medical images, and uses "multimodal feature fusion" to achieve multidimensional expression of lesion information, thereby improving processing efficiency while ensuring analytical accuracy. The above technical ideas will be elaborated in detail in subsequent embodiments.
[0031] I. General Description
In one embodiment of the present invention, a method and system for rapid localization and information extraction from smart medical images are described in detail with reference to Figures 1 to 3. It should be noted that this embodiment is only used to explain the technical solution of this invention and is not intended to limit the scope of protection of this invention.
[0032] In this embodiment, the "medical image" refers to image data acquired through medical imaging equipment, including but not limited to computed tomography (CT) images, magnetic resonance imaging (MRI) images, ultrasound images, or other types of medical images; the "raw medical image" is unprocessed input image data; the "candidate region" refers to a suspected lesion region obtained through preliminary localization; and the "target region" refers to a region that needs further detailed analysis after being screened by a dynamic processing strategy.
[0033] Furthermore, in this embodiment, the "three-level cascaded localization module" refers to a processing structure that sequentially performs coarse localization, fine localization, and fine-tuning localization in a coarse-to-fine order; the "dynamic processing strategy module" refers to a control mechanism used to adaptively select the processing path based on image complexity or regional importance; and the "multimodal feature fusion module" refers to a module that performs unified representation and fusion processing of features from different sources or of different types. The above terms will be used consistently throughout the following sections of this specification.
[0034] In one embodiment of the present invention, as shown in Figure 1, the system as a whole includes an image preprocessing module, a three-level cascaded localization module, a dynamic processing strategy module, a precise segmentation and information deconstruction module, a multimodal feature fusion module, and a structured report generation module. The modules are connected and work together through data flow to achieve rapid localization and information extraction of lesion areas in medical images.
[0035] Furthermore, during system operation, the original medical images are first input to the image preprocessing module for standardization and enhancement to reduce differences caused by different devices or imaging conditions. Subsequently, the processed images are input to the three-level cascaded localization module, which generates candidate regions and refines the localization results step by step through hierarchical processing. Based on this, the dynamic processing strategy module evaluates the importance of each candidate region according to the image content features and selectively determines whether to perform subsequent fine processing on it.
[0036] Preferably, only the target region filtered by the dynamic processing strategy is input into the precise segmentation and information deconstruction module for pixel-level segmentation and feature extraction, thereby avoiding performing a uniform high-complexity calculation on the entire image; after obtaining multi-dimensional features, the structural features, texture features, context features and medical metadata are uniformly modeled through the multi-modal feature fusion module, and finally the structured report generation module outputs the results containing localization information and diagnostic prompts.
[0037] Furthermore, through the coordinated operation of the above modules, this embodiment achieves a progressive analysis from coarse to fine and content-based adaptive calculation path selection in the processing flow, enabling the system to reduce redundant calculations and improve processing efficiency while ensuring analysis accuracy.
[0038] Optionally, in some implementations, the system may also include a feedback learning mechanism for updating the relevant model based on user corrections to the output results, thereby achieving continuous optimization of system performance.
[0039] It should be noted that the system structure and processing flow described in this embodiment can be implemented by software, hardware or a combination of software and hardware, and its specific implementation does not constitute a limitation on the scope of protection of this invention.
[0040] II. System Structure and Implementation
In one embodiment of the present invention, as shown in Figure 1, the rapid localization and information extraction system for smart medical images includes an image preprocessing module 10, a three-level cascaded localization module 20, a dynamic processing strategy module 30, a precise segmentation and information deconstruction module 40, a multimodal feature fusion module 50, and a structured report generation module 60. The modules are connected through data streams to collaboratively complete the medical image analysis task.
[0041] 1. Image preprocessing module 10
In one embodiment of the present invention, the image preprocessing module 10 is used to receive raw medical images and perform standardization and enhancement processing on them to generate input data suitable for subsequent analysis.
[0042] Furthermore, the image preprocessing module 10 includes:
- a size uniformity unit, used to scale medical images of different resolutions to a preset size;
- a contrast enhancement unit, used to enhance the contrast between the lesion area and normal tissue;
- a data normalization unit, used to normalize the pixel values of the image.
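The three preprocessing units could be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation disclosed here: the function names, the nearest-neighbour resampling, the percentile-window contrast stretch, and the 256×256 default size are all hypothetical choices, since the text does not specify the algorithm used by each unit.

```python
import numpy as np


def resize_nearest(image: np.ndarray, size: tuple[int, int]) -> np.ndarray:
    """Size uniformity: scale a 2-D image to `size` (rows, cols) by nearest-neighbour sampling."""
    rows = np.arange(size[0]) * image.shape[0] // size[0]
    cols = np.arange(size[1]) * image.shape[1] // size[1]
    return image[np.ix_(rows, cols)]


def stretch_contrast(image: np.ndarray, low_pct: float = 1.0, high_pct: float = 99.0) -> np.ndarray:
    """Contrast enhancement (assumed method): clip intensities to a percentile window."""
    low, high = np.percentile(image, [low_pct, high_pct])
    return np.clip(image, low, high)


def normalize(image: np.ndarray) -> np.ndarray:
    """Data normalization: min-max scale to [0, 1]; a constant image maps to zeros."""
    image = image.astype(np.float64)
    span = image.max() - image.min()
    if span == 0:
        return np.zeros_like(image)
    return (image - image.min()) / span


def preprocess(image: np.ndarray, size: tuple[int, int] = (256, 256)) -> np.ndarray:
    """Apply the three units in order: resize, enhance contrast, normalize."""
    return normalize(stretch_contrast(resize_nearest(image, size)))
```

For example, `preprocess(raw_slice, size=(512, 512))` would return a 512×512 array with values in [0, 1], where `raw_slice` stands for any 2-D array of pixel intensities.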
[0043] Preferably, by uniformly processing the original medical images, the data differences caused by different devices and imaging conditions can be reduced, providing a consistent data foundation for subsequent multi-level localization and feature extraction, thereby improving the overall stability of the system.
[0044] 2. Three-level cascaded localization module 20
In one embodiment of the present invention, the three-level cascaded localization module 20 is used to perform coarse-to-fine layered localization processing on the input image to generate progressively refined localization results.
[0045] Furthermore, the three-level cascaded localization module 20 includes:
- a coarse localization unit 21, used to quickly scan the input image and generate multiple candidate regions;
- a fine localization unit 22, used to refine the boundaries of the candidate regions;
- a fine-tuning localization unit 23, used to perform pixel-level localization correction on the refined regions.
[0046] Furthermore, the candidate regions output by the coarse localization unit 21 are used as the input of the fine localization unit 22, and the output of the fine localization unit 22 is further input to the fine-tuning localization unit 23, thereby realizing a progressive processing flow.
[0047] Preferably, by using the above-mentioned cascaded structure, high-complexity calculations are concentrated in a small number of candidate regions, avoiding the need to perform uniform high-precision calculations on the entire image, thereby reducing the overall computational complexity while ensuring localization accuracy.
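The data flow of the cascade (coarse candidates feed the fine stage, whose output feeds the fine-tuning stage) can be illustrated with a toy example. The sketch below is an assumption-laden stand-in: each stage uses simple intensity thresholding where the described system would use a localization network, and the names `coarse_localize`, `fine_localize`, `fine_tune`, the block size, and the threshold are hypothetical. Adjacent candidate blocks are not merged in this sketch.

```python
import numpy as np

Box = tuple[int, int, int, int]  # (row0, col0, row1, col1), end-exclusive


def coarse_localize(image: np.ndarray, block: int = 8, threshold: float = 0.5) -> list[Box]:
    """Stage 1: scan a coarse grid; every block containing a bright pixel is a candidate."""
    boxes = []
    for r in range(0, image.shape[0], block):
        for c in range(0, image.shape[1], block):
            r1, c1 = min(r + block, image.shape[0]), min(c + block, image.shape[1])
            if image[r:r1, c:c1].max() >= threshold:
                boxes.append((r, c, r1, c1))
    return boxes


def fine_localize(image: np.ndarray, box: Box, threshold: float = 0.5) -> Box:
    """Stage 2: tighten the candidate box to the extent of its above-threshold pixels."""
    r0, c0, r1, c1 = box
    rows, cols = np.nonzero(image[r0:r1, c0:c1] >= threshold)
    if rows.size == 0:
        return box
    return (r0 + int(rows.min()), c0 + int(cols.min()),
            r0 + int(rows.max()) + 1, c0 + int(cols.max()) + 1)


def fine_tune(image: np.ndarray, box: Box, threshold: float = 0.5) -> np.ndarray:
    """Stage 3: pixel-level correction, returned as a boolean mask inside the refined box."""
    r0, c0, r1, c1 = box
    mask = np.zeros(image.shape, dtype=bool)
    mask[r0:r1, c0:c1] = image[r0:r1, c0:c1] >= threshold
    return mask


def cascade(image: np.ndarray) -> list[tuple[Box, np.ndarray]]:
    """Run the three stages in order; later stages only touch the candidate regions."""
    results = []
    for candidate in coarse_localize(image):
        refined = fine_localize(image, candidate)
        results.append((refined, fine_tune(image, refined)))
    return results
```

The point of the structure is that the second and third stages only ever read pixels inside a candidate box, so their cost scales with the number and size of candidates rather than with the full image.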
[0048] 3. Dynamic Processing Strategy Module 30
In one embodiment of the present invention, the dynamic processing strategy module 30 is used to select differentiated processing paths for different regions according to the image content, so as to achieve adaptive allocation of computing resources.
[0049] Furthermore, the dynamic processing strategy module 30 includes:
- a complexity evaluation unit 31, used to calculate the overall complexity of medical images;
- a regional importance assessment unit 32, used to score candidate regions;
- a strategy decision-making unit 33, used to determine the processing path based on the evaluation results.
[0050] Furthermore, the strategy decision-making unit 33 divides the candidate regions into different categories based on their regional importance and executes the following processing strategy for each category:
- low-importance regions are skipped;
- simplified processing is applied to regions of moderate importance;
- fine processing is applied to high-importance regions.
[0051] Preferably, the dynamic processing strategy module 30 works in conjunction with the three-level cascaded localization module 20, so that only some candidate regions enter the high-complexity processing flow, thereby reducing redundant calculations and improving overall processing efficiency.
[0052] 4. Precise Segmentation and Information Deconstruction Module 40
In one embodiment of the present invention, the precise segmentation and information deconstruction module 40 is used to perform pixel-level segmentation of the target region and extract multi-dimensional feature information.
[0053] Furthermore, the precise segmentation and information deconstruction module 40 includes:
- a segmentation unit 41, used to generate precise boundaries of the lesion region;
- a feature extraction unit 42, used to extract the morphological features, texture features and spatial relationship features of lesions.
[0054] Furthermore, the feature extraction unit 42 and the segmentation unit 41 operate in parallel, and the feature results are output synchronously during the segmentation process.
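As an illustration of the kind of quantities the feature extraction unit might output, the sketch below computes a few morphological and texture statistics from a binary lesion mask. It is a simplified assumption: it works on a finished mask rather than in parallel with segmentation, and the function name `lesion_features`, the edge-counting perimeter, and the choice of statistics are hypothetical rather than taken from the text.

```python
import numpy as np


def lesion_features(image: np.ndarray, mask: np.ndarray) -> dict:
    """Morphological and simple texture statistics for the region where `mask` is True."""
    mask = mask.astype(bool)
    area = int(mask.sum())  # number of lesion pixels

    # Perimeter: count pixel edges where lesion meets background (4-connectivity).
    padded = np.pad(mask, 1, constant_values=False)
    perimeter = int((padded[1:, :] != padded[:-1, :]).sum()
                    + (padded[:, 1:] != padded[:, :-1]).sum())

    values = image[mask]  # intensities inside the lesion
    return {
        "area": area,
        "perimeter": perimeter,
        # Shape parameter: 4*pi*area / perimeter^2 (largest for compact shapes).
        "compactness": float(4 * np.pi * area / perimeter**2) if perimeter else 0.0,
        # First-order texture statistics of the gray levels.
        "mean_intensity": float(values.mean()) if area else 0.0,
        "std_intensity": float(values.std()) if area else 0.0,
    }
```

A 3×3 square mask, for instance, has an area of 9 pixels and a perimeter of 12 pixel edges under this definition.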
[0055] Preferably, the above processing is performed only on the target area filtered by the dynamic processing strategy, thereby avoiding high-complexity calculations on non-critical areas and improving the overall efficiency of the system.
[0056] 5. Multimodal feature fusion module 50
In one embodiment of the present invention, the multimodal feature fusion module 50 is used to perform unified modeling and fusion processing on multiple types of features.
[0057] Furthermore, the multimodal feature fusion module 50 includes:
- a structural feature input unit;
- a texture feature input unit;
- a contextual feature input unit;
- a medical metadata input unit;
- a feature fusion unit, used to perform weighted fusion of the above features.
[0058] Preferably, the feature fusion unit weights different features using adaptive weights, thereby improving the expressive power and robustness of the fusion result.
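One common way to realize adaptive weighting is to keep one learnable logit per feature source and turn the logits into positive weights that sum to one with a softmax. The sketch below shows only the forward computation and is an assumption, not the disclosed method: it presumes every feature source has already been projected to a vector of the same length, and the names `softmax` and `fuse` are hypothetical.

```python
import numpy as np


def softmax(x: np.ndarray) -> np.ndarray:
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()


def fuse(features: list[np.ndarray], logits: np.ndarray) -> np.ndarray:
    """Weighted sum of equally sized feature vectors, with weights = softmax(logits)."""
    stacked = np.stack(features)                              # shape (k, d)
    weights = softmax(np.asarray(logits, dtype=np.float64))   # shape (k,)
    return weights @ stacked                                  # shape (d,)
```

With all logits equal, the fusion reduces to a plain average; during training the logits would be updated together with the rest of the model so that more informative sources receive larger weights.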
[0059] 6. Structured Report Generation Module 60
In one embodiment of the present invention, the structured report generation module 60 is used to generate structured diagnostic results based on the fused features.
[0060] Furthermore, the structured report generation module 60 includes:
- a feature parsing unit, used for semantic parsing of the fused features;
- a results generation unit, used to generate a structured report that includes localization information, quantification parameters, and diagnostic prompts.
[0061] Optionally, the structured report generation module 60 can also be combined with a medical knowledge base or knowledge graph to improve the accuracy and interpretability of diagnostic results.
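The text does not define a report schema. Purely as an illustration of what a report with localization information, quantification parameters, and diagnostic prompts might look like in code, the sketch below uses hypothetical field names (`bbox`, `area_px`, `importance`, `prompt`, `study_id`) and serializes the result to JSON.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Finding:
    bbox: tuple[int, int, int, int]  # localization information (row0, col0, row1, col1)
    area_px: int                     # quantification parameter
    importance: float                # region importance score
    prompt: str                      # diagnostic prompt shown to the clinician


@dataclass
class StructuredReport:
    study_id: str
    findings: list[Finding] = field(default_factory=list)

    def to_json(self) -> str:
        """Serialize the report; tuples become JSON arrays."""
        return json.dumps(asdict(self), indent=2)
```

A report would then be built as `StructuredReport("study-001", [Finding((10, 18, 14, 22), 16, 0.9, "review recommended")])`, where all values are made up for the example.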
[0062] III. Implementation Methods of Core Processes
In one embodiment of the present invention, the rapid medical image localization and information extraction process is described in detail with reference to Figure 2. This process is based on the synergistic effect of the three-level cascaded localization module and the dynamic processing strategy module, enabling hierarchical processing of medical images and adaptive computational path selection.
[0063] 1. Input and Preprocessing Stage
In one embodiment of the present invention, an original medical image is first acquired and then input into the image preprocessing module for processing to obtain standardized image data.
[0064] Furthermore, the preprocessing includes size unification, contrast enhancement, and normalization to ensure that the input data meets the processing requirements of the subsequent localization model.
[0065] 2. Coarse localization stage (candidate region generation)
In one embodiment of the present invention, the preprocessed image is input to the coarse localization unit in the three-level cascaded localization module for rapid localization processing.
[0066] Furthermore, the coarse localization unit performs low-complexity analysis on the image to generate multiple candidate regions, which are used to represent the range of areas where lesions may exist.
[0067] Preferably, the coarse localization stage uses a low-resolution or lightweight model to improve the candidate region generation speed while ensuring recall.
[0068] 3. Dynamic processing strategy evaluation phase (key step)
In one embodiment of the present invention, the candidate regions are input to the dynamic processing strategy module to evaluate the importance of each candidate region.
[0069] Furthermore, the evaluation process includes:
- calculating the image complexity index;
- calculating the importance score of each candidate region;
- classifying the candidate regions based on the scoring results.
[0070] Furthermore, based on the evaluation results, the candidate regions are divided into different processing categories, including:
- Category 1: low-importance regions;
- Category 2: regions of moderate importance;
- Category 3: high-importance regions.
[0071] Preferably, the above evaluation process enables differentiated processing decisions for candidate regions, thereby avoiding the application of a uniform processing procedure to all regions.
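The evaluation steps refer to an image complexity index without defining it. One plausible choice, shown here only as an assumption, is the Shannon entropy of the intensity histogram, normalized to the range [0, 1]. The function name `complexity_index`, the bin count, and the requirement that pixel values already lie in [0, 1] are hypothetical.

```python
import numpy as np


def complexity_index(image: np.ndarray, bins: int = 32) -> float:
    """Normalized Shannon entropy of the intensity histogram (0 = flat image, 1 = uniform histogram).

    Assumes pixel values have already been normalized to [0, 1].
    """
    hist, _ = np.histogram(image, bins=bins, range=(0.0, 1.0))
    p = hist[hist > 0] / hist.sum()       # probabilities of the occupied bins
    entropy = -np.sum(p * np.log2(p))     # entropy in bits
    return float(entropy / np.log2(bins))
```

A constant image scores 0, while an image whose intensities are spread evenly over all bins scores 1; an index of this kind could then drive the choice of processing resolution.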
[0072] Furthermore, in one embodiment of the present invention, the regional importance assessment process in the dynamic processing strategy module adopts a regional importance scoring algorithm based on multi-factor weighting to achieve adaptive screening and processing path decision for candidate regions.
[0073] Specifically, for each candidate region, a region importance score is calculated by weighting at least two evaluation factors, including but not limited to:
- an abnormality probability factor, characterizing the probability that the candidate region belongs to a lesion;
- an edge response factor, characterizing the boundary saliency of the candidate region;
- a texture complexity factor, reflecting the gray-level changes and texture distribution within the candidate region;
- an anatomical deviation factor, characterizing the degree of deviation of the candidate region from the preset anatomical structure;
- a context consistency factor, reflecting the relationship between the candidate region and the surrounding tissue structure.
[0074] Furthermore, the regional importance score can be expressed as:
Score = w1·S1 + w2·S2 + w3·S3 + … + wn·Sn
where Si represents the score value corresponding to the i-th evaluation factor, and wi represents the corresponding weight, which can be obtained through model training or set based on experience.
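The weighted sum above translates directly into code. In the sketch below the factor names and numeric values are made up for illustration, and the function name `region_importance` is hypothetical; only the formula itself comes from the text.

```python
def region_importance(factors: dict[str, float], weights: dict[str, float]) -> float:
    """Score = sum over i of w_i * S_i; both mappings must name the same factors."""
    mismatch = set(factors) ^ set(weights)
    if mismatch:
        raise KeyError(f"factors and weights disagree on: {sorted(mismatch)}")
    return sum(weights[name] * factors[name] for name in factors)


# Illustrative values only (not from the source).
example_factors = {"abnormality": 0.8, "edge_response": 0.5}
example_weights = {"abnormality": 0.75, "edge_response": 0.25}
example_score = region_importance(example_factors, example_weights)  # 0.75*0.8 + 0.25*0.5
```

Requiring the two mappings to name the same factors guards against silently dropping a factor whose weight was never set.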
[0075] Furthermore, after obtaining the regional importance score, the dynamic processing strategy module classifies the candidate regions according to preset thresholds and determines the corresponding processing path, specifically: when the score is less than the first threshold T1, the candidate region is determined to be a low-importance region and is skipped; when the score is between the first threshold T1 and the second threshold T2, the candidate region is determined to be a moderate-importance region, and fine localization is performed; when the score is greater than the second threshold T2, the candidate region is determined to be a high-importance region, and fine localization followed by fine-tuning localization is performed.
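The weighted scoring of paragraph [0074] and the threshold classification of paragraph [0075] can be sketched as follows. This is a minimal illustration rather than the implementation of the invention: the factor names, weight values, thresholds, and the function names `importance_score` and `select_path` are all hypothetical. The handling of scores exactly equal to T1 or T2 (treated here as moderate importance) is an assumption, since the text does not specify it. The high-importance path is read as fine localization followed by fine-tuning localization, in line with the three-level cascade.

```python
from enum import Enum


class Path(Enum):
    SKIP = "skip"
    FINE = "fine localization"
    FINE_AND_FINE_TUNING = "fine localization + fine-tuning localization"


def importance_score(factors, weights):
    """Weighted sum Score = w1*S1 + ... + wn*Sn over matching factor names."""
    return sum(weights[name] * value for name, value in factors.items())


def select_path(score, t1, t2):
    """Map a score to a processing path using thresholds T1 < T2."""
    if t1 >= t2:
        raise ValueError("expected T1 < T2")
    if score < t1:
        return Path.SKIP
    if score <= t2:  # assumption: boundary values count as moderate importance
        return Path.FINE
    return Path.FINE_AND_FINE_TUNING


# Hypothetical weights and factor scores, all assumed to lie in [0, 1].
weights = {"abnormality": 0.4, "edge": 0.2, "texture": 0.2,
           "anatomy": 0.1, "context": 0.1}
candidate = {"abnormality": 0.9, "edge": 0.7, "texture": 0.6,
             "anatomy": 0.5, "context": 0.4}

score = importance_score(candidate, weights)
path = select_path(score, t1=0.3, t2=0.6)
print(round(score, 2), path.value)
```

With these placeholder values the candidate scores above T2 and is routed through both refinement stages. In practice the weights could come from training, as paragraph [0074] notes.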
[0076] Preferably, the regional importance score can also be jointly calculated by combining the candidate region confidence score and regional scale information output from the coarse localization stage to improve the accuracy of the processing path decision.
[0077] Alternatively, in some implementations, the weights wi are obtained by learning from historical training samples, thereby enabling the scoring algorithm to be adaptive.
[0078] By introducing the aforementioned regional importance scoring algorithm, the system can adaptively allocate computing resources based on the feature differences of different candidate regions, thereby avoiding performing uniform high-complexity calculations on all regions, improving overall processing efficiency and reducing computational overhead while ensuring positioning accuracy.
[0079] Furthermore, the regional importance score is combined with the three-level cascaded localization process to dynamically control the execution path of subsequent fine localization and fine-tuning localization based on the coarse localization result, thereby realizing hierarchical processing and adaptive calculation path selection.
4. Layered Processing Stage
[0080] In one embodiment of the present invention, different processing paths are executed for different categories of regions based on the evaluation results of the dynamic processing strategy:
(1) Low-importance regions: these regions are skipped and do not proceed to the subsequent fine localization or segmentation process.
[0081] (2) Moderate-importance regions: fine localization is performed on these regions to further refine the boundary information.
[0082] (3) High-importance regions: these regions are sequentially subjected to fine localization and fine-tuning localization to obtain pixel-level localization results.
[0083] Furthermore, fine localization and fine-tuning localization are executed sequentially in a cascade manner, with the output of the previous stage serving as the input of the next stage.
[0084] Preferably, the above-mentioned hierarchical processing mechanism concentrates computing resources on high-importance areas, thereby significantly reducing the overall computational complexity while ensuring positioning accuracy.
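The layered processing described above can be sketched as a small dispatch loop. This is an illustrative sketch under stated assumptions: the two stage functions are placeholders that only record which stage ran, and the region dictionaries, the `category` labels, and the name `layered_processing` are hypothetical.

```python
def fine_localization(region):
    # Placeholder: a real unit would refine the region boundary.
    return {**region, "stages": region["stages"] + ["fine"]}


def fine_tuning_localization(region):
    # Placeholder: a real unit would apply pixel-level correction.
    return {**region, "stages": region["stages"] + ["fine-tuning"]}


def layered_processing(candidates):
    """Skip low-importance regions and cascade the rest through the stages."""
    targets = []
    for region in candidates:
        category = region["category"]
        if category == "low":
            continue  # skipped: never reaches fine localization or segmentation
        refined = fine_localization(region)
        if category == "high":
            # The output of the previous stage is the input of the next stage.
            refined = fine_tuning_localization(refined)
        targets.append(refined)
    return targets


candidates = [
    {"id": 1, "category": "low", "stages": ["coarse"]},
    {"id": 2, "category": "moderate", "stages": ["coarse"]},
    {"id": 3, "category": "high", "stages": ["coarse"]},
]
targets = layered_processing(candidates)
for t in targets:
    print(t["id"], t["stages"])
```

Only regions 2 and 3 remain in `targets`, which corresponds to the target regions passed on to segmentation.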
5. Target Region Determination Stage
[0085] In one embodiment of the present invention, the regions remaining after the above-described layered processing are output as the target regions.
[0086] Furthermore, the target region includes only the regions that require detailed analysis, excluding low-importance regions that have been filtered out.
[0087] Preferably, by filtering the target region, the computational scope of subsequent high-complexity processing can be significantly reduced.
6. Integration with Subsequent Processing
[0088] In one embodiment of the present invention, the target region is input to the precise segmentation and information deconstruction module for subsequent processing, including pixel-level segmentation and multi-dimensional feature extraction, which finally generates structured analysis results.
[0089] Furthermore, the process only performs high-complexity calculations on the target region, while the parts that do not enter the target region do not participate in subsequent processing, thereby achieving computational optimization of the overall process.
[0090] Through the above process, the present invention achieves the following technical effects: the three-level cascaded localization achieves progressive processing from coarse to fine, thereby improving localization accuracy; the dynamic processing strategy applies differentiated processing to different regions, avoiding redundant calculations; and the cooperation of the two allows computing resources to be allocated on demand, thereby significantly improving processing efficiency while ensuring analytical accuracy.
[0091] Compared to existing technologies that uniformly perform high-complexity calculations on the entire image, this invention achieves a better balance between computational efficiency and analytical accuracy through layered processing and dynamic filtering mechanisms.
IV. Feature Extraction and Output Process
[0092] In one embodiment of the present invention, as shown in Figure 3, the target region obtained through the core process is subjected to feature extraction, multimodal fusion, and result output processing to complete the structured analysis of medical images.
1. Target Region Input Stage
[0093] In one embodiment of the present invention, the target region obtained by the three-level cascaded positioning module and the dynamic processing strategy module is input to the precise segmentation and information deconstruction module.
[0094] Furthermore, the target region is a set of regions that require detailed analysis, excluding low-importance regions that were filtered out in the aforementioned process.
[0095] Preferably, by performing subsequent processing only on the target region, the scale of data involved in high-complexity calculations can be significantly reduced, thereby improving overall processing efficiency.
2. Pixel-Level Segmentation Stage
[0096] In one embodiment of the present invention, the precise segmentation and information deconstruction module performs pixel-level segmentation on the target region to obtain the precise boundary of the lesion.
[0097] Furthermore, the segmentation process is implemented based on a deep learning model, which extracts and reconstructs features of the target region through an encoder-decoder structure, thereby generating a corresponding segmentation mask.
[0098] Preferably, the segmentation process is performed only on the target region, thereby avoiding high-complexity segmentation calculations on the entire image.
3. Multidimensional Feature Extraction Stage
[0099] In one embodiment of the present invention, multidimensional feature extraction is performed on the target region while segmentation is being completed.
[0100] Furthermore, the features include: Morphological features: used to describe the geometry of the lesion, including area, perimeter, and shape parameters; Texture features: used to describe grayscale distribution and texture variation characteristics; Spatial relationship characteristics: used to describe the relative positional relationship between the lesion and surrounding tissues or structures.
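As an illustration of the morphological features named above, the following sketch computes area, perimeter, and one possible shape parameter from a binary segmentation mask. It is a simplified example, not the method of the invention: the mask is a plain nested list, the perimeter is counted as the number of exposed pixel edges (4-connectivity), and circularity 4πA/P² is used as a hypothetical choice of shape parameter. All values are in pixel units.

```python
import math


def morphological_features(mask):
    """Area, perimeter and circularity of a binary mask (list of 0/1 rows)."""
    rows, cols = len(mask), len(mask[0])
    area = 0
    perimeter = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c]:
                continue
            area += 1
            # Count each pixel edge that touches background or the image border.
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                inside = 0 <= nr < rows and 0 <= nc < cols
                if not inside or not mask[nr][nc]:
                    perimeter += 1
    circularity = 4 * math.pi * area / perimeter ** 2 if perimeter else 0.0
    return {"area": area, "perimeter": perimeter, "circularity": circularity}


# Toy 2x2 lesion inside a 4x4 mask.
mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
features = morphological_features(mask)
print(features)
```

For the 2×2 square the area is 4, the perimeter is 8, and the circularity is π/4. Because the perimeter is measured along pixel edges, a rasterized disc also scores well below 1 under this measure.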
[0101] Furthermore, the feature extraction and segmentation processes are executed in parallel, sharing some feature representations, thereby reducing redundant computation.
[0102] Preferably, by extracting features simultaneously during the segmentation process, integrated localization and analysis are achieved, thereby improving overall processing efficiency.
4. Multimodal Feature Fusion Stage (key step)
[0103] In one embodiment of the present invention, the extracted multidimensional features are input into a multimodal feature fusion module for fusion processing.
[0104] Furthermore, the fusion includes the following features: structural features; texture features; contextual features; and medical metadata.
[0105] Furthermore, the multimodal feature fusion module combines different features using an adaptive weighting method to generate a unified feature representation.
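One common way to realize adaptive weighted fusion is to normalize per-feature logits with a softmax and take the weighted sum of the feature vectors. The sketch below assumes this approach; the text does not specify the weighting scheme. It also assumes that every feature group has already been projected to the same dimension, and the logit values stand in for parameters that would normally be learned. The names `softmax` and `fuse` are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(features, logits):
    """Weighted sum of equal-length feature vectors; returns (fused, weights)."""
    names = list(features)
    weights = softmax([logits[name] for name in names])
    dim = len(features[names[0]])
    fused = [0.0] * dim
    for name, weight in zip(names, weights):
        vector = features[name]
        if len(vector) != dim:
            raise ValueError(f"feature '{name}' has a different dimension")
        for i, value in enumerate(vector):
            fused[i] += weight * value
    return fused, dict(zip(names, weights))


# Hypothetical 2-dimensional embeddings for the four feature groups.
features = {
    "structural": [1.0, 0.0],
    "texture": [0.0, 1.0],
    "contextual": [1.0, 1.0],
    "metadata": [0.0, 0.0],
}
equal_logits = {name: 0.0 for name in features}  # placeholder for learned values
fused, weights = fuse(features, equal_logits)
print(fused, weights)
```

With equal logits each group receives weight 0.25 and the fused vector is the plain average. Raising one logit shifts the representation toward that feature group.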
[0106] Preferably, by fusing information from multiple sources, the ability to express complex lesions can be improved, and the robustness of the system under different devices and imaging conditions can be enhanced.
5. Structured Result Generation Stage
[0107] In one embodiment of the present invention, the fused features are input into a structured report generation module to generate the final analysis results.
[0108] Furthermore, the structured report includes: lesion location information; quantitative feature parameters; and diagnostic prompts.
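The three parts of the structured report could be carried in a simple record and serialized for a downstream terminal or medical system. The sketch below is one hypothetical layout: the class name, the field names, the JSON encoding, and all values are placeholders and are not taken from the invention.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class StructuredReport:
    lesion_location: tuple          # hypothetical (x, y, slice) coordinates
    quantitative_parameters: dict   # e.g. area and perimeter in pixels
    diagnostic_prompts: list = field(default_factory=list)

    def to_json(self):
        """Serialize the report; tuples become JSON arrays."""
        return json.dumps(asdict(self), sort_keys=True)


report = StructuredReport(
    lesion_location=(120, 244, 87),
    quantitative_parameters={"area_px": 4, "perimeter_px": 8},
    diagnostic_prompts=["placeholder prompt"],
)
payload = json.loads(report.to_json())
print(payload)
```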
[0109] Optionally, the structured report generation module can combine a medical knowledge base or knowledge graph to perform semantic mapping on features, thereby outputting diagnostic results with clinical reference value.
6. Output and Feedback Stage
[0110] In one embodiment of the present invention, the generated structured report is output to the user terminal or a medical system.
[0111] Furthermore, user feedback on the report can be fed back to the system for subsequent model optimization.
[0112] Preferably, by introducing a feedback mechanism, the system performance can be continuously improved.
[0113] Through the above feature extraction and output process, this invention realizes a complete processing chain from the target region to structured diagnostic information, wherein: segmenting and extracting features only from the target region reduces the computational scope; performing segmentation and feature extraction in parallel improves processing efficiency; and fusing multimodal features improves the completeness and accuracy of information expression. This ensures analytical accuracy while improving overall processing efficiency and result stability.
V. Method and Implementation
[0114] In one embodiment of the present invention, a method for rapid localization and information extraction of smart medical images based on the above-described system is provided, and is described below with reference to the processes shown in Figure 2 and Figure 3.
1. Overall Method Flow
[0115] In one embodiment of the present invention, the method includes the following steps:
S1: acquiring the original medical image and preprocessing it;
S2: performing three-level cascaded localization on the preprocessed image to generate candidate regions;
S3: filtering the candidate regions based on a dynamic processing strategy to obtain the target region;
S4: performing pixel-level segmentation and multi-dimensional feature extraction on the target region;
S5: fusing the multidimensional features to generate structured diagnostic results.
[0116] Furthermore, the above steps are executed sequentially according to the data flow, and the optimal allocation of computing resources is achieved through hierarchical processing and adaptive strategies.
2. Step S1: Image Preprocessing
[0117] In one embodiment of the present invention, preprocessing of the original medical image includes: standardizing the image size; performing contrast enhancement on the image; and normalizing the image.
[0118] Preferably, the preprocessing step can reduce the differences caused by different devices and imaging conditions, and improve the stability of subsequent processing.
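The three preprocessing operations of step S1 can be sketched in pure Python as follows. The concrete techniques are assumptions, since the text names only the operations: nearest-neighbour resampling stands in for size standardization, clipping to an intensity window stands in for contrast enhancement, and scaling to [0, 1] stands in for normalization. The window limits and target size are hypothetical values.

```python
def resize_nearest(image, out_h, out_w):
    """Size standardization by nearest-neighbour resampling."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def apply_window(image, low, high):
    """Contrast enhancement by clipping intensities to [low, high]."""
    return [[min(max(pixel, low), high) for pixel in row] for row in image]


def normalize(image, low, high):
    """Scale windowed intensities to the range [0, 1]."""
    span = high - low
    return [[(pixel - low) / span for pixel in row] for row in image]


def preprocess(image, size, window):
    low, high = window
    resized = resize_nearest(image, *size)
    windowed = apply_window(resized, low, high)
    return normalize(windowed, low, high)


# Toy 2x2 image with hypothetical intensity values.
raw = [[-1200, -600], [0, 400]]
out = preprocess(raw, size=(4, 4), window=(-1000, 400))
for row in out:
    print([round(v, 3) for v in row])
```

Applying the same window and output size to every input is one way to reduce differences between devices before localization.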
3. Step S2: Three-Level Cascaded Localization
[0119] In one embodiment of the present invention, three-level cascaded localization is performed on the preprocessed image, including:
S21: performing coarse localization to generate a set of candidate regions;
S22: performing fine localization on the candidate regions to refine the boundaries;
S23: performing fine-tuning localization on the fine localization results to obtain pixel-level localization results.
[0120] Furthermore, the coarse positioning, fine positioning, and fine-tuning positioning are executed sequentially in a cascade manner, with the output of the previous stage serving as the input of the next stage.
[0121] Preferably, by using a step-by-step processing method, high-complexity calculations are concentrated in the candidate region, thereby reducing the overall computational burden.
4. Step S3: Dynamic Processing Strategy Selection
[0122] In one embodiment of the present invention, a dynamic processing strategy is applied to filter candidate regions, including:
S31: calculating the image complexity index;
S32: calculating the importance score of each candidate region;
S33: classifying the candidate regions based on the scoring results.
[0123] Furthermore, different processing paths are executed based on the classification results: low-importance regions are skipped; simplified processing is applied to moderate-importance regions; and fine processing is applied to high-importance regions.
[0124] Preferably, the above dynamic filtering process enables differentiated processing for different regions, thereby avoiding the need to perform uniform high-complexity calculations on all regions.
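The text does not define how the image complexity index of step S31 is computed. Purely as an assumption, the sketch below uses the Shannon entropy of a binned intensity histogram and a hypothetical rule that processes low-complexity images at half resolution. The bin count, the 0 to 255 integer intensity range, the threshold, and both function names are illustrative choices.

```python
import math
from collections import Counter


def complexity_index(image, bins=16, max_value=255):
    """Shannon entropy (in bits) of the binned intensity histogram."""
    counts = Counter(
        min(pixel * bins // (max_value + 1), bins - 1)
        for row in image
        for pixel in row
    )
    total = sum(counts.values())
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def choose_downsample_factor(index, threshold=1.0):
    """Hypothetical rule: simple images are processed at half resolution."""
    return 2 if index < threshold else 1


flat = [[10, 10], [10, 10]]     # uniform image: a single histogram bin
mixed = [[0, 255], [0, 255]]    # two equally populated bins

for name, image in (("flat", flat), ("mixed", mixed)):
    index = complexity_index(image)
    print(name, index, choose_downsample_factor(index))
```

The uniform image has entropy 0 and is downsampled, while the two-level image has entropy 1 bit and keeps full resolution under this placeholder threshold.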
5. Step S4: Segmentation and Feature Extraction
[0125] In one embodiment of the present invention, performing fine analysis on the target region includes:
S41: performing pixel-level segmentation on the target region;
S42: simultaneously extracting multidimensional features during the segmentation process.
[0126] Furthermore, the features include morphological features, texture features, and spatial relationship features.
[0127] Preferably, the segmentation and feature extraction processes share a common computational process, thereby reducing redundant computations and improving processing efficiency.
6. Step S5: Feature Fusion and Result Generation
[0128] In one embodiment of the present invention, the extracted multidimensional features are fused to generate a result, including:
S51: fusing structural features, texture features, contextual features, and medical metadata;
S52: generating structured diagnostic results based on the fused features.
[0129] Furthermore, the structured diagnostic results include lesion localization information, quantitative parameters, and diagnostic prompts.
[0130] Optionally, the result generation process can be combined with semantic reasoning using a medical knowledge base.
7. Optional Step: Feedback Learning
[0131] In one embodiment of the present invention, the method further includes:
S6: receiving correction information from the user regarding the structured results, and updating the model based on the correction information.
[0132] Preferably, by introducing a feedback learning mechanism, continuous optimization of the model can be achieved.
[0133] Through the above method, the present invention achieves the following technical effects: the three-level cascaded localization achieves layered processing from coarse to fine, thereby improving localization accuracy; the dynamic processing strategy applies differentiated processing to different regions, reducing redundant calculations; performing high-complexity processing only on the target region improves overall computational efficiency; and fusing multimodal features improves the completeness and accuracy of information extraction.
[0134] Therefore, this invention achieves an effective improvement in processing efficiency and resource utilization while ensuring analytical accuracy.
VI. Specific Application Example
[0135] In one embodiment of the present invention, the method for rapid localization and information extraction of smart medical images is described in detail, taking the analysis of CT images of lung nodules as an example.
[0136] In this embodiment, the input is a sequence of low-dose chest CT images, with an image size of 512×512 pixels and 200 to 400 slices. First, the raw medical images are input to the image preprocessing module for processing, including size unification, contrast enhancement, and normalization, to obtain standardized image data.
[0137] Furthermore, the preprocessed image is input into a three-level cascaded localization module. The coarse localization unit uses a lightweight convolutional neural network to quickly scan the image and generate multiple candidate regions. Subsequently, the fine localization unit refines the boundaries of the candidate regions. For highly important regions, pixel-level localization correction is further performed by the fine-tuning localization unit to obtain the precise location of the lesion.
[0138] In the dynamic processing strategy stage, the importance of candidate regions is evaluated, and the importance score can be determined based on at least one of the following: region grayscale change, edge response, and anomaly probability. Based on the score results, low-importance regions are not further processed, moderate-importance regions undergo fine localization, and high-importance regions undergo both fine localization and fine-tuning localization, thereby selecting the target regions.
[0139] Furthermore, the target region is input into the precise segmentation and information deconstruction module, where pixel-level segmentation is performed and multi-dimensional features are extracted simultaneously, including nodule area, perimeter, roundness, and grayscale texture features.
[0140] Subsequently, the multidimensional features are input into the multimodal feature fusion module and fused with contextual features and medical metadata (including scanning equipment parameters, slice thickness information, etc.) to obtain a unified feature representation.
[0141] Finally, the analysis results are output through the structured report generation module, including the nodule's three-dimensional location coordinates, diameter, malignancy probability assessment, and diagnostic recommendations.
[0142] Preferably, in this embodiment, through the combined effect of the three-level cascaded localization and the dynamic processing strategy, high-complexity calculations are performed only on the high-importance regions, so that the processing time for a single CT data set is kept within the range of 1 to 2 seconds while maintaining high detection accuracy.
[0143] The model structure, parameter settings, and processing time described in this embodiment are merely illustrative examples, and the present invention is not limited to the specific values or implementation methods mentioned above.
[0144] As can be seen from this embodiment, the present invention can significantly improve processing efficiency while ensuring detection accuracy in actual medical image analysis scenarios, and has good clinical application value.
[0145] It should be noted that the above embodiments of the present invention are merely preferred embodiments of the present invention, used to illustrate the technical solutions of the present invention, and not to limit the scope of protection of the present invention. For those skilled in the art, various modifications, substitutions, or equivalent transformations can be made to the above embodiments without departing from the technical concept and essence of the present invention, and all such modifications, substitutions, or equivalent transformations should fall within the scope of protection of the present invention.
[0146] Furthermore, the technical features involved in the various embodiments described in this specification can be combined with each other to form new implementation methods without conflict, and such combinations should also be regarded as the disclosure of this invention.
[0147] Furthermore, the division of modules, units, or steps in this specification is only a logical division of functions. In practical applications, different implementation methods can be adopted as needed. For example, multiple modules can be integrated into one module, or one module can be split into multiple sub-modules. The specific implementation method should not be construed as a limitation on the scope of protection of this invention.
[0148] Finally, it should be understood that the present invention can be implemented in hardware, software, or a combination of both. For the parts implemented in software, they can be stored in a computer-readable storage medium and executed by a processor to implement the technical solution of the present invention. All such implementations are within the protection scope of the present invention.
Claims
1. A rapid localization and information extraction system for smart medical images, characterized in that it comprises:
an image preprocessing module, used to standardize and enhance the input raw medical images;
a three-level cascaded positioning module, connected to the image preprocessing module and used to sequentially perform coarse positioning, fine positioning, and fine-tuning positioning on the medical image, wherein the coarse positioning is used to quickly generate a set of candidate regions, the fine positioning is used to refine the boundaries of the candidate regions, and the fine-tuning positioning is used to achieve pixel-level positioning correction;
a dynamic processing strategy module, used to select differentiated processing paths for different regions based on the complexity of the medical image or the importance of the candidate region, the processing paths including skipping, simplified processing, or fine processing;
a precise segmentation and information deconstruction module, connected to the three-level cascaded positioning module and used to perform pixel-level segmentation only on the target region filtered by the dynamic processing strategy, and to simultaneously extract the morphological features, texture features, and spatial relationship features of the lesion;
a multimodal feature fusion module, used to perform weighted fusion of the morphological features, texture features, context features, and medical metadata to obtain a unified feature representation; and
a structured report generation module, connected to the multimodal feature fusion module and used to generate a structured report containing location information, quantitative parameters, and diagnostic prompts based on the unified feature representation;
wherein the dynamic processing strategy module is further used to determine the processing path for fine positioning and fine-tuning positioning based on the candidate regions generated by the coarse positioning.
2. The system according to claim 1, characterized in that the dynamic processing strategy module, by calculating image complexity indicators or region importance scores, controls different regions to be processed using models with different resolutions or computational depths.
3. The system according to claim 1, characterized in that the dynamic processing strategy module includes an attention guidance mechanism to suppress computation in low-importance regions and prioritize the allocation of computing resources to high-importance regions.
4. The system according to claim 1, characterized in that the three-level cascaded positioning module shares some features among its positioning networks to reduce redundant calculations and improve processing efficiency.
5. The system according to claim 1, characterized in that the multimodal feature fusion module uses an adaptive weighting method to fuse different features, where the weight of each feature is automatically learned through model training.
6. The system according to claim 1, characterized in that the precise segmentation and information deconstruction module extracts the area, perimeter, shape parameters, and texture statistical features of the lesions through parallel branches during the segmentation process.
7. The system according to claim 1, characterized in that it further comprises: a feedback learning module, used to perform incremental learning on the three-level cascaded positioning module and/or the precise segmentation and information deconstruction module based on the user's correction information for the structured report.
8. A method for rapid localization and information extraction of smart medical images based on the system according to any one of claims 1-7, characterized in that it comprises: preprocessing the raw medical images; generating candidate regions through the three-level cascaded positioning module and refining the positioning results level by level; filtering target regions based on the dynamic processing strategy; performing fine segmentation and multidimensional feature extraction on the selected target regions; and fusing the multidimensional features to generate a structured diagnostic report; wherein the dynamic processing strategy includes: choosing the processing resolution based on the image complexity; and determining whether to perform fine-grained segmentation based on the importance of the region.
9. The method according to claim 8, characterized in that the feature fusion includes a weighted combination of structural features, texture features, contextual features, and medical metadata.
10. The method according to claim 8, characterized in that it further comprises: receiving correction information from the user regarding the structured diagnostic results; generating annotated training data based on the correction information; and updating the model parameters in the three-level cascaded positioning module and/or the precise segmentation and information deconstruction module by online learning or periodic training to achieve adaptive optimization of the model.