Method for detecting structural anomalies in a medical image and associated system
The method uses neural networks to convert UTE-MRI images into synthetic CT scans, employing deep reinforcement learning for accurate detection and quantification of structural airway anomalies, addressing the limitations of existing UTE-MRI segmentation methods and enhancing diagnostic precision.
Patent Information
- Authority / Receiving Office
- WO · WO
- Patent Type
- Applications
- Current Assignee / Owner
- UNIVERSITE DE BORDEAUX
- Filing Date
- 2025-12-11
- Publication Date
- 2026-06-18
Abstract
Description
[0001] METHOD FOR THE DETECTION OF STRUCTURAL ANOMALIES ON A MEDICAL IMAGE AND ASSOCIATED SYSTEM
[0002] Scope of the invention
[0003] The invention relates to a method for detecting a structural anomaly on a functional image of an organ of a subject.
[0004] The invention also relates to a system, a computer program product, and a data carrier for implementing said method according to the invention.
[0005] State of the art
[0006] Chronic airway diseases represent a major public health problem. Among them, chronic obstructive pulmonary disease (COPD), asthma, and cystic fibrosis, one of the most common genetic diseases in Caucasians, cause progressive lung damage through chronic inflammation and infections. Computed tomography (CT scans) has long been the gold standard for monitoring structural changes in the lungs, but it poses problems due to cumulative exposure to ionizing radiation, particularly for pediatric patients.
[0007] Ultrashort echo time MRI (UTE-MRI) sequences offer a promising, radiation-free alternative for lung imaging. However, UTE-MRI presents specific challenges, including a lower signal-to-noise ratio and reduced spatial resolution compared to CT.
[0008] Currently, there is no automated solution for segmenting and quantifying the volume of structural airway lesions such as bronchiectasis, bronchial wall thickening, bronchial mucus, bronchiolar mucus, or areas of consolidation / atelectasis on UTE-MRI due to these technical limitations. These volumetric segmentations can be used to derive a severity score for structural airway damage.
[0009] The invention therefore aims to improve existing methods for segmenting structural abnormalities of an organ with sufficient accuracy while reducing radiation exposure, in order to quantify a holistic severity score of structural alterations in the airways of patients with chronic airway disease.
Summary of the invention
[0010] According to a first aspect, the invention relates to a computer-implemented method for detecting a structural anomaly in a subject's organ from a functional image. The method comprises a step of receiving an image acquired by a non-irradiating image acquisition device. The method further comprises a step of converting said acquired image into at least one synthetic anatomical image using a conversion module. The method further comprises a step of detecting one or more structural anomalies in a subject's organ on at least one synthetic anatomical image using a detection module.
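By way of non-limiting illustration, the three steps of the first aspect (receiving, converting, detecting) can be sketched as a simple composition of two modules. The names `detect_structural_anomalies`, `toy_convert`, and `toy_detect`, the intensity mapping, and the threshold are hypothetical stand-ins chosen for this sketch; they are not the trained learning functions of the invention.

```python
from typing import Callable, Dict, List

Image = List[List[float]]  # hypothetical 2D image type: rows of intensities


def detect_structural_anomalies(
    acquired_image: Image,
    convert: Callable[[Image], Image],
    detect: Callable[[Image], Dict[str, float]],
) -> Dict[str, float]:
    """Apply the conversion module, then the detection module."""
    synthetic_image = convert(acquired_image)  # e.g. UTE-MRI -> synthetic CT
    return detect(synthetic_image)             # one score per anomaly


# Toy stand-ins for the trained modules (assumptions, not the real networks).
def toy_convert(image: Image) -> Image:
    # Linearly map intensities in [0, 1] to a CT-like range [-1000, 0].
    return [[1000.0 * v - 1000.0 for v in row] for row in image]


def toy_detect(image: Image) -> Dict[str, float]:
    # Count pixels above an arbitrary threshold as "consolidation".
    count = sum(1 for row in image for v in row if v > -300.0)
    return {"consolidation": float(count)}


scores = detect_structural_anomalies(
    [[0.1, 0.9], [0.8, 0.2]], toy_convert, toy_detect
)
print(scores)
```

Passing the modules as callables keeps the sketch independent of any particular network architecture.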
[0011] One benefit of such detection is to allow the calculation of a holistic segmentation score of one or more structural anomalies.
[0012] In one execution mode, the conversion step is implemented by a first trained learning function configured to receive the acquired image as input and generate the synthetic anatomical image as output.
[0013] In one execution mode, the detection step is implemented by a second trained learning function configured to receive as input the synthetic anatomical image and generate as output at least one presence score of at least one predetermined structural anomaly.
[0014] In one execution mode, the training of the first learning function is implemented using data produced by a loss function implemented during the execution of the second trained learning function.
[0015] In one execution mode, the first and / or second learning function is trained using a loss function that incorporates a reinforcement learning agent acting on deep supervision.
[0016] In one execution mode, the data produced includes segmentation information extracted from the loss function of the second trained learning function.
[0017] In one execution mode, the first learning function is trained using a discriminator that acts on a latent space of said first learning function to evaluate whether the features generated by the first learning function satisfy predetermined properties extracted from the second trained learning function. In another execution mode, the first learning function is trained using a loss function that incorporates an adversarial or corrective term.
[0018] In one execution mode, the adversarial or corrective term guides the latent space of the first learning function to produce representations aligned with target information extracted from the second trained learning function.
[0019] In one execution mode, the process includes the generation of a presence score by the second trained learning function for each anomaly present on the synthetic anatomical image, said score being a function of the integral of the number of voxels and / or pixels classified for each predetermined anomaly.
[0020] In one embodiment, the generated presence score is a holistic presence score.
[0021] In one embodiment, the method includes the extraction of a bronchial volume and / or a vascular volume from the synthetic anatomical image.
[0022] In one embodiment, the process includes a step of normalizing each presence score generated by the extracted bronchial and / or vascular volume to generate, for each predetermined anomaly, a normalized presence score.
[0023] In one execution mode, the conversion module includes a neural network comprising:
[0024] • an encoder configured to compress the acquired image into a reduced representation in a latent space,
[0025] • a decoder configured to generate a synthetic anatomical image from the reduced representation in latent space;
the method further comprising:
[0026] • training a learning function to classify at least two types of anomalies on a synthetic anatomical image of a patient's organ,
[0027] • the extraction of specific knowledge from said trained learning function,
• the integration of said extracted specific knowledge within the latent space of the neural network of the conversion module to guide the conversion of the functional image into a synthetic anatomical image, said guidance allowing the reconstruction of the scan image to be focused on the regions presenting anomalies.
[0028] In one execution mode, the detection module includes a learning function trained to receive as input at least one portion of a synthetic anatomical image and to generate as output a classification of each elementary part of said portion of the synthetic anatomical image.
[0029] In one execution mode, the first learning function is implemented by a neural network such as an nnU-Net network.
[0030] In one execution mode, the second learning function is implemented by a neural network such as an nnU-Net network.
[0031] In one execution mode, the acquired image is an image of a subject's lung. In one execution mode, the predetermined structural abnormality detected by the detection module includes at least one of the following abnormalities: bronchiectasis, bronchial wall thickening, bronchial mucus accumulation, bronchiolar mucus accumulation, and / or consolidation / atelectasis.
[0032] According to another aspect, the invention also relates to a structural anomaly detection system comprising software and / or hardware means for implementing the method according to the invention.
[0033] According to another aspect, the invention relates to a computer program product comprising code instructions which, when implemented by a computer, cause the system according to the invention to carry out the steps of the process according to the invention.
[0034] In another aspect, the invention relates to a computer-readable data carrier on which the computer program product according to the invention is stored.
Brief description of the figures
[0035] Other features and advantages of the invention will become apparent from the detailed description that follows, with reference to the attached figures, which illustrate:
[0036] Figure 1: A flowchart representing the steps of a method according to one embodiment of the invention.
[0037] Figure 2: A flowchart representing the steps of a method for training the first and second learning functions according to an execution mode of the invention.
[0038] Figure 3: A schematic representation of a system according to one embodiment of the invention.
[0039] Description of the invention
[0040] The present invention relates to a computer-implemented method 1000 for the automatic detection of a structural anomaly of an organ from a functional image of said organ of a subject. The invention also relates to an associated device 1 for implementing said method.
[0041] A structural anomaly should be understood as a structural change in an organ compared to a healthy state, caused by anatomical alterations of the organ that may result from disease, injury, or the aggravation of an injury. In one embodiment, structural anomalies affect the morphology and function of the airways in a lasting and / or significant manner.
[0042] In one embodiment, a structural anomaly refers to anatomical alterations of the lungs caused by cystic fibrosis, such as bronchiectasis (abnormal dilations of the bronchi), bronchial wall thickening (increased thickness of the airway walls due to chronic inflammation or infections), or bronchial mucus accumulation (airway obstructions caused by excessive mucus production).
[0043] These changes are key markers of disease progression and are studied using imaging techniques to assess lung condition. An example of a method according to an embodiment of the invention is now described with reference to Figure 1.
[0044] Image acquisition
[0045] Method 1000 involves receiving a medical image of at least one organ of a subject. The received medical image preferably comprises an image acquired by a non-ionizing imaging device.
[0046] In the context of the present invention, the term "image acquired by a non-ionizing imaging device" specifically refers to a visual representation, in the form of digital or analog data, obtained from a medical imaging modality that does not involve the use of ionizing radiation to generate said image.
[0047] This term encompasses, but is not limited to, images produced by devices using technologies based on magnetic fields, radio waves, ultrasound, or photons in the visible or infrared spectrum. Examples include images obtained by magnetic resonance imaging (MRI), ultrasound, optical coherence tomography (OCT), or any other equivalent method that guarantees the absence of ionizing radiation.
[0048] The images thus generated allow anatomical, functional or metabolic visualization of the subject's tissues and organs without the risk associated with cumulative radiation exposure, making them particularly suitable for vulnerable populations or those requiring frequent examinations.
[0049] In one aspect of the invention, a method includes a step 100 of receiving, by a data processing device, an image obtained by non-ionizing imaging such as magnetic resonance imaging (MRI) and storing it in memory.
[0050] In one execution mode, a data acquisition module receives an MRI image file as a data stream or as a digital file generated by an MRI scanner (e.g., an anatomical or functional MRI sequence). The data can be received via a wired connection (e.g., Ethernet, USB) or wirelessly (e.g., Wi-Fi, Bluetooth) from the imaging device or an intermediate server. The received image is then stored in a dedicated memory space, comprising volatile memory (such as RAM) for immediate processing and non-volatile memory (hard drive, SSD, or secure cloud) for long-term archiving.
[0051] In one execution mode, the metadata associated with the MRI image (such as acquisition parameters, resolution, or patient identifier) is checked to ensure data compatibility and integrity.
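By way of non-limiting illustration, the metadata check can be sketched as follows. The field names (`patient_id`, `modality`, `voxel_spacing_mm`, `shape`) and the rules applied are assumptions made for this sketch; an actual implementation would follow the metadata schema of the imaging device or archive in use.

```python
from typing import Any, Dict, List

# Hypothetical required fields; a real system would follow its own schema.
REQUIRED_FIELDS = ("patient_id", "modality", "voxel_spacing_mm", "shape")


def validate_metadata(metadata: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the checks passed."""
    problems = [
        f"missing field: {name}"
        for name in REQUIRED_FIELDS
        if name not in metadata
    ]
    if problems:
        return problems  # further checks need every field to be present

    spacing = metadata["voxel_spacing_mm"]
    shape = metadata["shape"]
    if metadata["modality"] != "MR":
        problems.append("modality is not MR")
    if len(shape) not in (2, 3):
        problems.append("image must be 2D or 3D")
    if len(spacing) != len(shape):
        problems.append("spacing and shape dimensions differ")
    if any(s <= 0 for s in spacing):
        problems.append("voxel spacing must be positive")
    return problems


example = {
    "patient_id": "anon-001",
    "modality": "MR",
    "voxel_spacing_mm": (1.0, 1.0, 1.0),
    "shape": (64, 64, 64),
}
print(validate_metadata(example))
```

Returning a list of problems rather than raising on the first one lets the caller log every incompatibility before rejecting the file.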
[0052] The image file can include a single two-dimensional MRI image or a three-dimensional MRI image.
[0053] In the context of the present invention, the term "acquired image" refers to a visual representation obtained from a medical imaging modality, such as magnetic resonance imaging (MRI), and may include two-dimensional ("2D image") or three-dimensional ("3D image") images.
[0054] A 2D image corresponds to a planar projection representing a single cross-section or section of an anatomical volume of the subject, each point of the image being associated with an intensity value indicating the physical or chemical properties of the corresponding region.
[0055] A 3D image, on the other hand, corresponds to a volumetric set made up of a series of successive 2D images or a reconstructed volume, each voxel (volumetric element) representing a specific spatial unit and containing information relating to the characteristics of the tissue or organ in the volume concerned.
[0056] In one embodiment, the acquired images 11 are images obtained using an ultrashort echo time magnetic resonance imaging (UTE-MRI) sequence. These images are particularly well-suited for capturing anatomical structures with low proton signal, such as the lungs, due to their ability to reduce motion artifacts and enhance the contrast of low-density soft tissues.
[0057] In one embodiment, the acquired images 11 are images obtained by respiratory-gated magnetic resonance imaging (MRI) or UTE-MRI. This technique allows for the capture of anatomical images at specific phases of the respiratory cycle, thereby reducing motion artifacts associated with respiration. The images thus acquired provide an accurate representation of thoracic structures, such as the lungs, airways, and surrounding tissues, under consistent and reproducible conditions. This embodiment is particularly well-suited for applications where respiratory variability could compromise the analysis, such as the evaluation of pulmonary abnormalities, the study of airway dynamics, or the mapping of soft tissue movements during respiration. A further benefit is increased spatial and contrast resolution and reduced motion artifacts.
[0058] Conversion of the acquired image into a synthetic image
[0059] In one execution mode, the acquired image 11 is converted 200 into a synthetic image 12.
[0060] In the context of the present invention, the generated synthetic image 12 specifically refers to a visual representation obtained from medical data, enabling the description of the morphology, shape, and physical organization of tissues, organs, or internal structures of the human or animal body. These images are intended to capture static and structural characteristics, providing a precise map of anatomical elements such as bones, muscles, organs, blood vessels, or respiratory tracts. In one embodiment, the generated synthetic image is a synthetic image of irradiating and / or ionizing imaging.
[0061] One benefit of this step is converting the acquired image into a type of image on which it is easier to automatically detect one or more structural abnormalities of the organ. Preferably, the acquired image is converted into a synthetic 2D or 3D CT scan image, depending on whether the acquired image was 2D or 3D. The remainder of the description will refer to the converted synthetic image as a CT scan, but other types of anatomical imaging can also be considered.
[0062] In one execution mode, this step is performed by a first learning function 26.
[0063] The first trained learning function 26 can include any type of neural network. The first learning function 26 is configured to receive as input a medical image 11 as described above and generate as output a synthetic anatomical image 12. In one embodiment, the first learning function 26 is implemented by an nnU-Net type neural network.
[0064] In one aspect of the invention, a convolutional neural network model, based on an nnU-Net architecture, is configured to convert an image 11 obtained by magnetic resonance imaging into a synthetic anatomical image 12 such as an image corresponding to a computed tomography (CT) scan. This nnU-Net model is optimized to receive as input an MRI image, including information on the magnetic properties and contrasts of the tissues, and to produce as output a synthetic image simulating the Hounsfield unit intensities characteristic of a CT scan, while preserving anatomical details.
[0065] The nnU-Net model is trained on a database of matched MRI and CT images, where each pair of images is aligned to ensure precise spatial correspondence. In one execution mode, the training method includes registration between a computed tomography (CT) image and a magnetic resonance imaging (MRI) image, the images being matched to represent the same anatomical structures. This step aims to spatially align the two imaging modalities to enable combined analysis or fusion of anatomical and functional information.
[0066] The nnU-Net model architecture includes an encoder, which extracts relevant features from MRI images, a bottleneck to model the complex relationships between MRI and synthetic image properties, and a decoder configured to reconstruct a synthetic image from the extracted features.
[0067] In a convolutional neural network architecture, the bottleneck is a layer or set of layers located between the encoder and the decoder. This region plays a key role in transforming and compressing the features extracted by the encoder, allowing essential information to be represented in a smaller space while eliminating redundancies.
[0068] Training 110 of the first learning function 26 is an iterative process aimed at adjusting the weights and connections between neurons to minimize a loss function and improve the model's performance on a given task. Training an artificial neural network allows for the optimization of weighted connections between the network layers based on a loss function.
[0069] Training step 110 of the first learning function includes, in particular, receiving a data set 21, 22 (for example, the acquired image or data from said acquired image) as input to the network, which is propagated through the network layers. At the network output, a predictive result is generated.
[0070] The training process involves calculating a loss function to measure the difference between the predicted output and the CT scan image matched to the acquired image received as input to the network. The loss function may include, but is not limited to, the root mean square error (RMSE), cross-entropy, and / or Dice Loss. A learning agent is then implemented to adjust the weights by minimizing the loss function while preserving the network's ability to generalize to new data. In one embodiment, the loss function comprises a combination of both Dice Loss and cross-entropy. A benefit of this combination is that it leverages the complementary strengths of both loss functions.
[0071] Within the framework of the present invention, Dice Loss is a loss function used to evaluate and optimize the correspondence between segments predicted by a neural network and a ground truth in image segmentation tasks.
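By way of non-limiting illustration, a combination of Dice Loss and cross-entropy can be sketched for a binary segmentation flattened into a list of voxels. The soft Dice formulation, the smoothing term `eps`, and the equal weighting of the two terms are assumptions made for this sketch, not values stated in the description.

```python
import math
from typing import Sequence


def dice_loss(pred: Sequence[float], target: Sequence[int],
              eps: float = 1e-6) -> float:
    """Soft Dice loss: 1 - 2*sum(p*g) / (sum(p) + sum(g)).

    eps keeps the ratio defined when both masks are empty.
    """
    intersection = sum(p * t for p, t in zip(pred, target))
    denominator = sum(pred) + sum(target)
    return 1.0 - (2.0 * intersection + eps) / (denominator + eps)


def binary_cross_entropy(pred: Sequence[float], target: Sequence[int],
                         eps: float = 1e-7) -> float:
    """Mean binary cross-entropy; predictions are clamped to avoid log(0)."""
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(pred)


def combined_loss(pred: Sequence[float], target: Sequence[int],
                  dice_weight: float = 0.5, ce_weight: float = 0.5) -> float:
    """Weighted sum of Dice loss and cross-entropy (weights are assumed)."""
    return (dice_weight * dice_loss(pred, target)
            + ce_weight * binary_cross_entropy(pred, target))


ground_truth = [1, 0, 1, 0]
good = combined_loss([0.9, 0.1, 0.8, 0.2], ground_truth)
poor = combined_loss([0.2, 0.8, 0.3, 0.7], ground_truth)
print(good < poor)
```

The Dice term measures region overlap and is less sensitive to class imbalance, while the cross-entropy term penalizes each voxel individually, which is why the two are often combined.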
[0072] Detection of the presence of an anomaly
[0073] In one execution mode, a detection step 300 is performed by a detection module.
[0074] The detection module is configured to receive as input a synthetic anatomical image 12 and generate as output at least one presence score 14A, 14B, 14C of at least one predetermined structural anomaly of an organ of the subject.
[0075] In one embodiment, the presence score 14A, 14B, 14C is a holistic presence score.
[0076] A "holistic" score refers to an approach that comprehensively considers all available information without simplification or excessive hierarchical prioritization. Unlike methods that simplify or summarize data (for example, by focusing solely on a dominant anomaly or specific slices), a holistic approach aims to analyze all voxels or pixels of an image or structure to produce a complete and representative result.
[0077] A holistic presence score implies that each anomaly is evaluated across the entire analyzed volume, taking into account all anomalies present in the image without excluding any.
[0078] In one execution mode, the score is calculated from the integral of all voxels or pixels classified as belonging to a given anomaly over the entire volume studied (e.g., the entire lung in the case of bronchi).
[0079] In one execution mode, the holistic score takes into account all anomalies simultaneously, whether major or minor, unlike hierarchical scores, where a dominant anomaly can "mask" the others.
[0080] In one execution mode, the holistic score is based on an exact quantification of the volume occupied by each anomaly, rather than on visual simplifications or approximate averages.
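By way of non-limiting illustration, a holistic presence score based on the volume occupied by each anomaly can be sketched by counting every labelled voxel of a segmentation. The label map `LABELS`, the nested-list volume layout, and the function name `presence_scores` are hypothetical choices made for this sketch.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical label map: 0 is background, other integers are anomalies.
LABELS = {
    1: "bronchiectasis",
    2: "wall_thickening",
    3: "mucus",
    4: "consolidation",
}

Volume = List[List[List[int]]]  # segmentation indexed [slice][row][column]


def presence_scores(segmentation: Volume,
                    voxel_volume_mm3: float) -> Dict[str, float]:
    """Volume in mm^3 occupied by each anomaly over the whole segmentation.

    Every voxel of every slice is counted, so no anomaly masks another.
    """
    counts = Counter(v for sl in segmentation for row in sl for v in row)
    return {
        name: counts.get(label, 0) * voxel_volume_mm3
        for label, name in LABELS.items()
    }


segmentation = [
    [[0, 1], [1, 4]],
    [[0, 0], [3, 4]],
]
print(presence_scores(segmentation, voxel_volume_mm3=2.0))
```

Because the count runs over the entire volume and reports every label, minor anomalies are quantified alongside dominant ones.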
[0081] One benefit of such a holistic score is that it avoids biases associated with common simplifications (for example, considering only the most severe abnormalities or analyzing only a few image slices). It thus offers a robust and comprehensive quantification of the abnormalities present, improving the accuracy and reproducibility of diagnoses.
[0082] In one execution mode, this step of calculating a presence score is carried out by a second trained learning function 25.
[0083] The second trained learning function 25 is configured to receive as input anatomical images or synthetic anatomical images of an organ of a subject such as CT-scan images or synthetic CT-scan images and to generate as output a score of presence of at least one predetermined structural anomaly on an organ of a subject.
[0084] In a first embodiment, the second learning function 25 includes a first sub-function configured to generate a segmentation of the anatomical image and / or the synthetic anatomical image and a second sub-function configured to generate, from the generated segmentation, a presence score for each predetermined anatomical anomaly.
[0085] In a second alternative embodiment, the second learning function 25 comprises a plurality of second sub-functions, each configured to generate, from the segmentation generated by the first sub-function, a score for the presence of a single predetermined anatomical anomaly different from the other second sub-functions.
[0086] In one embodiment, the second learning function 25 is configured to generate directly from each voxel of the anatomical image or synthetic anatomical image, a probability of presence of each predetermined anatomical anomaly.
[0087] The presence score 14A, 14B, 14C generated from an anomaly may include the integral of the probabilities of each voxel of the anatomical image and / or synthetic anatomical image belonging to said anomaly.
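By way of non-limiting illustration, a presence score computed as the integral of per-voxel probabilities can be sketched as a sum over voxels, optionally scaled by the voxel volume. The dictionary layout and the name `soft_presence_scores` are assumptions made for this sketch.

```python
import math
from typing import Dict, Sequence


def soft_presence_scores(
    probabilities: Dict[str, Sequence[float]],
    voxel_volume_mm3: float = 1.0,
) -> Dict[str, float]:
    """Discrete integral of per-voxel probabilities for each anomaly.

    probabilities maps an anomaly name to the flattened list of voxel
    probabilities for that anomaly.
    """
    return {
        name: math.fsum(values) * voxel_volume_mm3
        for name, values in probabilities.items()
    }


probs = {
    "bronchiectasis": [0.5, 0.25, 0.25],
    "mucus": [0.0, 1.0, 0.0],
}
print(soft_presence_scores(probs, voxel_volume_mm3=2.0))
```

Unlike a hard voxel count, this variant lets uncertain voxels contribute fractionally to the score.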
[0088] In one embodiment, the second learning function 25 includes a neural network, preferably of type nnU-Net.
[0089] In one embodiment, the second trained learning function 25 is configured to produce at least one series of presence scores 14A, 14B, 14C, each presence score being representative of the presence on the acquired image 11 or the synthetic anatomical image 12 of a predetermined structural anomaly different from the other presence scores.
[0090] In one embodiment illustrated in Figure 2, the training 110 of the second learning function 25 is carried out by a database of acquired medical images (such as MRI images) 21 and / or synthetic anatomical images matched to a series of presence scores 22 for each predetermined structural anomaly.
[0091] In one embodiment, training 110 includes a preprocessing step of anatomical and / or synthetic anatomical images, comprising the extraction of the organ's envelope and the training itself. In one embodiment, the training is performed using pairs of CT scan images simultaneously, with and without the extraction of said envelope. In one embodiment, the second learning function 25 is configured to output the total volume 15 of the organ, such as a lung volume or a vascular volume.
[0092] Training the second learning function (110) can include, in particular, receiving a data set as input to the network (for example, the anatomical image with or without the organ's envelope), which is propagated through the network layers. A predictive result is generated at the network's output. Training the second learning function includes calculating a loss function to measure the difference between the predicted output and the presence scores for each anomaly and / or the organ volume received as input to the network. In one embodiment, the loss function can include, but is not limited to, the mean squared error (MSE) and / or the Dice Loss. A learning agent is then implemented to adjust the weights by minimizing the loss function while preserving the network's ability to generalize to new data.
[0093] In one embodiment of training the second learning function, the Dice loss can be optimized to retrain specifically on the proportion of training data having the worst similarity performance between the ground truth and the prediction issued by the second learning function.
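By way of non-limiting illustration, selecting the proportion of training data with the worst similarity between ground truth and prediction can be sketched as a ranking by Dice score. The function name `select_hardest_cases` and the default fraction of 20% are assumptions made for this sketch; the description does not specify the proportion.

```python
import math
from typing import List, Sequence


def select_hardest_cases(case_ids: Sequence[str],
                         dice_scores: Sequence[float],
                         fraction: float = 0.2) -> List[str]:
    """Return the ids of the cases with the lowest Dice scores.

    fraction is the share of the training set to retrain on (assumed).
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    if len(case_ids) != len(dice_scores):
        raise ValueError("case_ids and dice_scores must have equal length")
    ranked = sorted(zip(dice_scores, case_ids))  # ascending: worst first
    keep = max(1, math.ceil(fraction * len(ranked)))
    return [case_id for _, case_id in ranked[:keep]]


hardest = select_hardest_cases(
    ["a", "b", "c", "d"], [0.9, 0.4, 0.7, 0.2], fraction=0.5
)
print(hardest)
```

The selected cases can then be fed back into additional training iterations so that the Dice loss is optimized where the model currently performs worst.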
[0094] In another embodiment of training the second learning function, a reinforcement learning (RL) agent is proposed for deep supervision. This agent uses a one-layer long short-term memory (LSTM) network followed by a fully connected layer to dynamically optimize the deep supervision weights. The LSTM input is a sequence containing losses and Dice scores for all N supervision levels. The RL agent integrates into the learning function training process as follows: the learning function is trained during one iteration, and then the losses and Dice scores are computed and fed into the RL agent. The agent generates weights w, which are applied to the losses in deep supervision. A reward is computed based on the improvement in losses and Dice scores. The RL agent is updated using a dedicated loss function. This formulation encourages the agent to adjust the weights to maximize the reward, which represents an improvement in both loss and Dice scores at all levels of supervision.
[0095] This deep supervision reinforcement agent uses a reward function that can be formulated in terms of the following quantities:
[0096] - N is the number of deep supervision levels
[0097] - Lprev is the loss value calculated during the previous iteration for each deep supervision level (i). This value represents the model error before the supervision level weights are adjusted by the reinforcement agent.
[0098] - Lcurr is the current value of the loss calculated after the dynamic adjustment of the supervision level weights by the agent. This value reflects the new error after applying the changes made by the agent.
[0099] - Diceprev is the value of the Dice score calculated in the previous iteration for each level of deep supervision (i). This value represents the model's performance in terms of overlap between the prediction and the ground truth before the agent adjusts the weights of the supervision levels.
[0100] - Dicecurr is the current value of the Dice score, calculated after the reinforcement agent has dynamically adjusted the weights of the deep supervision levels. This value reflects the new performance of the model after the agent's intervention.
[0101] The deep supervision reinforcement agent is updated with the following loss function (Lagent): $L_{agent} = -\sum_{i=1}^{N} \log(w_i) \cdot \mathrm{reward}$
[0102] Where $w_i$ is the importance weight assigned by the learning agent to level (i) of deep supervision during training.
[0103] This formulation encourages the agent to adjust the weights to maximize the reward, resulting in improved losses and Dice scores at all levels of supervision.
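By way of non-limiting illustration, the interaction between the deep supervision weights, the reward, and the agent loss can be sketched as follows. The reward form used here (mean over the N levels of the loss decrease plus the Dice increase) is an assumption consistent with the quantities defined above, not a formula taken from the description. The LSTM policy that produces the weights is omitted, and all function names are hypothetical.

```python
import math
from typing import Sequence


def deep_supervision_loss(level_losses: Sequence[float],
                          weights: Sequence[float]) -> float:
    """Weighted sum of the losses of the N deep supervision levels."""
    return sum(w * loss for w, loss in zip(weights, level_losses))


def compute_reward(prev_losses: Sequence[float],
                   curr_losses: Sequence[float],
                   prev_dice: Sequence[float],
                   curr_dice: Sequence[float]) -> float:
    """ASSUMED reward: mean per-level improvement in loss and Dice.

    A loss that decreases and a Dice score that increases both
    contribute positively.
    """
    n = len(prev_losses)
    total = 0.0
    for i in range(n):
        total += prev_losses[i] - curr_losses[i]  # loss improvement
        total += curr_dice[i] - prev_dice[i]      # Dice improvement
    return total / n


def agent_loss(weights: Sequence[float], reward: float) -> float:
    """L_agent = -(sum_i log(w_i)) * reward, with every w_i > 0."""
    return -sum(math.log(w) for w in weights) * reward


weights = [0.5, 0.5]  # in practice produced by the agent's policy
r = compute_reward([1.0, 2.0], [0.5, 1.5], [0.5, 0.5], [0.75, 0.75])
print(deep_supervision_loss([0.5, 1.5], weights), r, agent_loss(weights, r))
```

Only the bookkeeping around the agent is shown; in a full implementation the weights would be the output of the LSTM and fully connected layer, and the agent loss would be back-propagated through that network.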
[0104] This dynamic approach, guided by the reinforcement learning (RL) agent, offers several significant technical advantages over static methods. A primary advantage is that the system learns to personalize the training process. For example, it can learn to initially assign higher weights to lower-resolution (deeper) layers to quickly assimilate the overall semantic features of anatomical structures, and then gradually shift the weights towards higher-resolution (shallower) layers to focus on refining fine contours and small anomalies.
[0105] A second advantage is that the optimal weighting strategy can vary depending on the specific anomaly to be segmented. Segmenting fine, branching bronchial structures does not present the same challenges as segmenting large areas of consolidation / atelectasis. The RL agent learns the best strategy directly from the data, making it highly task-specific.
[0106] Finally, this process advantageously automates the difficult and often suboptimal process of manually tuning deep supervision hyperparameters. The RL agent can discover a complex and non-intuitive weighting policy, potentially more effective than any heuristic devised by a human.
[0107] By employing this advanced embodiment for training the second learning function 25, the resulting anomaly detector is rendered exceptionally accurate and robust. Consequently, the "knowledge" extracted from this superior detector to guide the training of the first learning function is of increased quality. This, in turn, ensures that the generated synthetic anatomical images are optimally structured for identifying target anomalies, thus improving the performance and clinical utility of the entire inventive system.
[0108] The example described below more precisely describes an embodiment of the invention where the organ comprises the lungs of a subject and includes: processing the CT scan image to extract the lung envelope, receiving said processed image by the second trained learning function, and automatically generating by the trained learning function a presence score on at least three labels: bronchiectasis, peribronchial thickening, the presence of bronchial or bronchiolar mucus, consolidations / atelectasis, and lung volume. Preferably, the method includes a step of normalizing each presence score by the total lung volume. This normalization beneficially allows for comparison of values between different subjects with varying lung volumes.
[0109] In one embodiment, the method further includes the generation of a bronchial tree and / or a vascular tree of the organ from synthetic anatomical images.
[0110] In one embodiment, the method includes generating a graphical representation that allows for a detailed and usable visualization of structural abnormalities on the generated bronchial tree and / or vascular tree. In one execution mode, said graphical representation is stored in memory and / or displayed on an AFF display.
[0111] From the generated synthetic anatomical image (e.g., a synthetic CT scan image from the previous analysis), a bronchial volume and / or a vascular volume is extracted. These volumes are determined by dedicated segmentation algorithms based on anatomical characteristics specific to bronchial and vascular structures, such as their density, shape, and spatial distribution. The accurate extraction of these volumes allows for the contextualization of abnormality scores and the normalization of their interpretation. Bronchial volumes can, for example, represent the analyzed airway region, while vascular volumes provide a reference for adjacent areas or correlated abnormalities.
[0112] In one embodiment, the anomalies include airway anomalies. In one embodiment, the structural anomalies include the following anomalies:
[0113] ■ Bronchiectasis,
[0114] ■ Thickening of the bronchial walls, and / or
[0115] ■ An accumulation of bronchial mucus, and / or
[0116] ■ An accumulation of bronchiolar mucus, and / or
[0117] ■ Consolidation, and / or
[0118] ■ Atelectasis.
[0119] In one embodiment, the second learning function 25 is configured to generate, for each or some of the anomalies listed above, a presence score 14A, 14B, 14C or a normalized presence score 16A, 16B, 16C.
[0120] In an alternative embodiment illustrated in Figure 1, for each presence score 14A, 14B, 14C generated by the second learning function 25, a normalization step 400 is performed by dividing the presence score 14A, 14B, 14C by the extracted bronchial and / or vascular volume 15. This normalization 400 makes it possible to generate normalized presence scores 16A, 16B, 16C that are comparable between different patients or images, regardless of inter-individual anatomical or physiological variations. Thus, this step ensures that the detected abnormalities are assessed proportionally to the size of the surrounding structures, providing a relative and clinically relevant measurement.
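By way of illustration only, the normalization step 400 can be sketched as follows. The sketch assumes that a presence score is the volume of the voxels assigned to an anomaly label (voxel count multiplied by voxel volume) and that the reference volume is already available; the function names, the label encoding, and the numeric values are hypothetical and are not taken from the description.

```python
from collections import Counter
from typing import Dict, Iterable


def presence_scores(labels: Iterable[int], voxel_volume_mm3: float) -> Dict[int, float]:
    """Volume (mm^3) of the voxels assigned to each anomaly label.

    Assumption: label 0 is background and is ignored.
    """
    counts = Counter(label for label in labels if label != 0)
    return {label: n * voxel_volume_mm3 for label, n in counts.items()}


def normalize_scores(scores: Dict[int, float], reference_volume_mm3: float) -> Dict[int, float]:
    """Divide each presence score by the extracted bronchial and/or vascular volume."""
    if reference_volume_mm3 <= 0:
        raise ValueError("reference volume must be positive")
    return {label: score / reference_volume_mm3 for label, score in scores.items()}


# Hypothetical flattened label map: 0 = background, 1 = bronchiectasis, 2 = mucus.
example_labels = [0, 1, 1, 2, 0, 1, 2, 0]
example_scores = presence_scores(example_labels, voxel_volume_mm3=2.0)
example_normalized = normalize_scores(example_scores, reference_volume_mm3=100.0)
```

Because the normalized values are dimensionless ratios, they can be compared between subjects whose airway or vascular volumes differ.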
[0121] Although the use of a reinforcement learning (RL) agent to orchestrate deep supervision is a preferred embodiment, it is understood that other mechanisms for dynamically adjusting the loss weighting of deep supervision may be envisaged within the scope of the invention. These alternative mechanisms may also be used to enhance the training of the first learning function (26) and / or the second learning function (25).
[0122] In an alternative embodiment, training is approached as a multi-task learning problem, where each level of deep supervision is considered a separate task. Dynamic task weighting methods can be used to adjust the relative importance of each intermediate loss. One approach is to weight the losses according to the model's uncertainty; the network thus focuses more on the supervision levels for which its prediction is least certain, accelerating convergence to a robust solution. Another approach aims to balance the magnitude of the gradients from each task (each level of supervision) to ensure that no single task dominates the learning process and disrupts the stability of the training. These mechanisms ensure more balanced training of the different layers of the learning function's decoder.
In another embodiment, attention mechanisms are integrated not into the main network architecture, but directly into the calculation of the deep supervision loss weighting. The network thus autonomously learns which scales (i.e., which decoder levels) are most relevant for a given image or image region, and dynamically assigns them greater importance ("attention") by adjusting the weights of their respective losses. This process is typically implemented via differentiable attention modules that are integrated into the overall loss calculation, allowing the network to modulate the importance of intermediate error signals based on the input image context.
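As a non-limiting illustration, the gradient-balancing variant mentioned above can be sketched in a simplified form. The sketch assumes that one gradient norm per supervision level has already been measured; it rescales each level so that all weighted gradient magnitudes are equal. This is a deliberately simplified scheme, not a specific published algorithm, and the function names are hypothetical.

```python
from typing import List, Sequence


def balance_gradient_weights(grad_norms: Sequence[float], eps: float = 1e-12) -> List[float]:
    """Return one weight per supervision level so that weight * grad_norm is equal across levels."""
    if not grad_norms:
        raise ValueError("at least one gradient norm is required")
    # Common magnitude to which every level is rescaled (the mean of the measured norms).
    target = sum(grad_norms) / len(grad_norms)
    return [target / max(g, eps) for g in grad_norms]


def weighted_total_loss(losses: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of the intermediate deep supervision losses."""
    if len(losses) != len(weights):
        raise ValueError("losses and weights must have the same length")
    return sum(w * loss for w, loss in zip(weights, losses))


# Hypothetical gradient norms for three decoder levels: the first level dominates.
example_weights = balance_gradient_weights([4.0, 1.0, 1.0])
example_total = weighted_total_loss([0.8, 0.3, 0.2], example_weights)
```

In this example the dominant level is down-weighted and the two others are up-weighted, so that no single supervision level drives the parameter update.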
[0123] According to another alternative embodiment, the dynamic adjustment of the deep supervision weights can follow a curriculum-based learning strategy. Curriculum-based learning is a training strategy where the model is first exposed to "easy" examples or tasks, and then progressively to more "difficult" examples or tasks. In the context of the invention, this can be implemented by adjusting the deep supervision weights according to a predefined or adaptive curriculum. For example, the training could initially focus on the deep layers of the decoder, which correspond to "easy" global semantic tasks, assigning them higher weights. Progressively, the weighting would shift toward the finer, higher-resolution layers, which correspond to "difficult" tasks of precisely delineating anomaly outlines.
In this context, the reinforcement learning agent of the preferred embodiment can be seen as an advanced and automated form of this strategy, where the agent itself learns the optimal training curriculum in real time.
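Purely as an example, a predefined curriculum of this kind can be sketched as a schedule that moves the deep supervision weights linearly from the deep decoder levels to the fine ones. The linear interpolation, the weight profiles, and the function name are assumptions chosen for illustration; the description does not specify a particular schedule.

```python
from typing import List


def curriculum_weights(step: int, total_steps: int, num_levels: int) -> List[float]:
    """Deep supervision weights for decoder levels ordered from deepest (index 0) to finest.

    Early in training the deepest levels receive the largest weights; late in
    training the finest levels do. The weights always sum to 1.
    """
    if total_steps <= 0 or num_levels <= 0:
        raise ValueError("total_steps and num_levels must be positive")
    progress = min(max(step / total_steps, 0.0), 1.0)
    early = [float(num_levels - i) for i in range(num_levels)]  # favours deep levels
    late = [float(i + 1) for i in range(num_levels)]            # favours fine levels
    raw = [(1.0 - progress) * e + progress * l for e, l in zip(early, late)]
    total = sum(raw)
    return [r / total for r in raw]


# Hypothetical three-level decoder, sampled at the start, middle, and end of training.
start = curriculum_weights(0, 100, 3)
middle = curriculum_weights(50, 100, 3)
end = curriculum_weights(100, 100, 3)
```

An adaptive curriculum would replace the fixed `progress` value with a quantity derived from the training state, for instance a validation metric.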
[0124] Providing the latent space with predefined knowledge
[0125] A particular aspect of the invention is to allow the use of specific knowledge of the second trained learning function 25 to improve the training of the first learning function 26.
[0126] In particular, one aspect of the invention aims to execute a training step 120 of the first learning function 26 using data 30 produced by the loss function implemented during the execution of the second learning function 25.
[0127] One benefit of this feature is to enable the training of the first learning function 26 in such a way as to orient it to generate synthetic anatomical images 12 allowing efficient identification of predetermined structural anomalies.
[0128] In one execution mode, additional information is provided to the latent space of an nnU-Net to guide its loss function. Several approaches can be used for this purpose. They consist of integrating additional information or constraints into the model training, either by modifying the architecture or by adding regularization terms to the optimization.
[0129] In one embodiment, this additional information 30 constitutes specific knowledge enabling supervised training to improve the generation of synthetic anatomical images 12 towards images enabling the identification of anomalies.
[0130] In one execution mode, the first learning function 26 is implemented using data produced by the loss function implemented during the execution of the second learning function 25. In one embodiment, said produced data includes segmentation information extracted from the loss function of the second learning function.
[0131] For example, the loss function of the first learning function can be oriented so as to guide the latent space of the neural network implementing the first learning function to produce representations of anomalies aligned with the representation of said anomalies on CT-scan images.
[0132] In one embodiment, the various coefficients of the loss function implemented to execute the first learning function are fixed or generated based on data extracted from the loss function implemented by the second learning function. These coefficients may include the coefficients of the mathematical formula of the loss function, particularly when the latter includes a combination of both dice loss and cross-entropy loss. One benefit is enabling the generation of synthetic anatomical images whose performance has been optimized to enhance the performance of presence scores by the second learning function.
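For illustration, a loss combining Dice loss and cross-entropy loss with adjustable coefficients can be sketched as below for a binary case. The coefficient names `w_dice` and `w_ce` are hypothetical, and the way their values would be derived from the loss function of the second learning function is not specified here; they are simply passed in as inputs.

```python
import math
from typing import Sequence


def dice_loss(probs: Sequence[float], targets: Sequence[float], eps: float = 1e-6) -> float:
    """Soft Dice loss: 1 - 2|P.T| / (|P| + |T|), smoothed by eps."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    denominator = sum(probs) + sum(targets)
    return 1.0 - (2.0 * intersection + eps) / (denominator + eps)


def cross_entropy_loss(probs: Sequence[float], targets: Sequence[float], eps: float = 1e-7) -> float:
    """Mean binary cross-entropy, with probabilities clamped away from 0 and 1."""
    if not probs:
        raise ValueError("at least one prediction is required")
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)
        total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total / len(probs)


def combined_loss(probs: Sequence[float], targets: Sequence[float],
                  w_dice: float, w_ce: float) -> float:
    """Weighted sum of the two terms; w_dice and w_ce are the adjustable coefficients."""
    return w_dice * dice_loss(probs, targets) + w_ce * cross_entropy_loss(probs, targets)


# Hypothetical per-voxel probabilities and binary ground truth.
example = combined_loss([0.9, 0.2, 0.8], [1.0, 0.0, 1.0], w_dice=1.0, w_ce=1.0)
```

Raising `w_dice` relative to `w_ce` emphasizes region overlap over per-voxel classification, which is one way such coefficients could steer the first learning function.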
[0133] In a particular embodiment, the process includes a training step 120 of the first learning function 26 using a discriminator 30. This discriminator operates in the latent space generated by said first learning function 26, evaluating whether the features extracted by the latter respect predetermined properties, these properties being derived from the second learning function 25 previously trained.
[0134] More specifically, the discriminator is configured to receive as input the latent representations generated by the first learning function and to compare these representations to a target distribution or to criteria extracted from the latent features of the second learning function. These predetermined properties may include, but are not limited to, spatial structures, contextual relationships, or statistical distributions related to the learning task. The discriminator thus acts as a regularization mechanism to guide the learning of the first learning function toward optimization consistent with these properties.
[0135] During training, an adversarial loss function is used to improve the quality of the generated latent representations. The loss includes a term corresponding to the error calculated by the discriminator, which evaluates whether the generated features conform to the target properties.
[0136] This adversarial loss function $L_{\text{adversarial}}$ can be formulated as follows:
$$L_{\text{adversarial}} = \mathbb{E}_{z \sim \text{first learning function}}[\log D(z)] + \mathbb{E}_{z \sim \text{second learning function}}[\log(1 - D(z))]$$
[0137] where $D(z)$ is the output of the discriminator applied to the latent representation $z$,
[0138] and $\mathbb{E}$ is the expected value of the evaluated values for the latent representations $z$ generated by the first or the second learning function.
[0139] The first learning function is then optimized to maximize the similarity between its latent representations and those of the second learning function, while the discriminator is simultaneously trained to distinguish the representations of the two sources.
[0140] This mechanism promotes convergence where the features generated by the first learning function capture properties relevant to the target task, while respecting the constraints imposed by the second learning function. This process ensures greater consistency of latent representations and improved overall performance of the first learning function in its respective application.
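As a minimal sketch, the adversarial loss given above can be estimated empirically from discriminator outputs by replacing each expected value with a mean over a batch. The sketch assumes that the discriminator outputs are probabilities in the interval (0, 1) and follows the formula as written; the function name and the sample values are hypothetical.

```python
import math
from typing import Sequence


def adversarial_loss(d_first: Sequence[float], d_second: Sequence[float],
                     eps: float = 1e-7) -> float:
    """Batch estimate of E_{z~first}[log D(z)] + E_{z~second}[log(1 - D(z))].

    d_first:  discriminator outputs D(z) for latents of the first learning function.
    d_second: discriminator outputs D(z) for latents of the second learning function.
    """
    if not d_first or not d_second:
        raise ValueError("both batches must be non-empty")

    def clamp(value: float) -> float:
        # Keep the logarithms finite when D(z) is exactly 0 or 1.
        return min(max(value, eps), 1.0 - eps)

    term_first = sum(math.log(clamp(d)) for d in d_first) / len(d_first)
    term_second = sum(math.log(1.0 - clamp(d)) for d in d_second) / len(d_second)
    return term_first + term_second


# Hypothetical discriminator outputs for two small batches of latent representations.
example_value = adversarial_loss([0.7, 0.6], [0.3, 0.4])
```

When the discriminator cannot separate the two sources and outputs 0.5 for every latent representation, this estimate equals 2 log(0.5), which corresponds to the situation where the latent representations of the first learning function have become indistinguishable from those of the second.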
[0141] Within the scope of the invention, the concept of structural or anatomical image may also include three-dimensional or multi-dimensional representations obtained by processing raw imaging data, allowing detailed and usable visualization of internal structures, often used as a reference for the analysis of functional or combined images.
[0142] System
[0143] According to one aspect, system 1 according to the invention comprises software and hardware means for implementing the process as described above.
[0144] An embodiment of system 1 according to the invention is now described with reference to figure 3.
[0145] The system includes a REC receiver. In one embodiment, the REC receiver is intended to be connected to a medical imaging acquisition device such as an MRI device, so as to receive images acquired by said acquisition device.
[0146] The REC receiver can be connected to the acquisition device by a wired or wireless connection, for example by a Bluetooth connection, a Wi-Fi connection, or any other data exchange protocol known to those skilled in the art.
[0147] The REC receiver may include or be associated with one or more memory units for temporarily storing received images. The REC receiver is directly or indirectly connected to the AFF display to transmit the acquired images to the AFF display.
[0148] The REC receiver may include an input processor configured to preprocess received images, including steps such as intensity normalization, artifact correction, and resampling to ensure compatibility with learning functions.
[0149] System 1 also includes a CALC calculation device.
[0150] The CALC calculation device includes software and / or hardware means to implement the conversion 200 and detection 300 steps of the invention described above.
[0151] In one embodiment, the CALC calculation device preferably includes a conversion module M1 for implementing the conversion step 200 of the acquired image 11 into a synthetic anatomical image 12, and a detection module M2 for detecting a predetermined structural anomaly and / or calculating a presence score for at least one predetermined structural anomaly. The two modules M1 and M2 may be independent or integrated into a single module.
[0152] Alternatively, the learning functions 25 and 26 are implemented by remote electronic equipment, such as a remote server. In this scenario, the CALC calculation device includes an interface for exchanging data with the remote equipment in order to transmit data and retrieve the results of the processed data, for example, the generated presence score(s).
[0153] Finally, in the present invention, it is understood that when the learning functions 25, 26 are implemented wholly or partly by remote equipment, the CALC calculation device can be interpreted as the system comprising, on the one hand, the local device described in this application and, on the other hand, the remote means enabling the implementation of the learning functions.
[0154] The CALC computing device further includes at least one processor or computer associated with a memory module (MEM) to perform at least some of the steps of the method according to the invention. For example, a first processor can be configured to perform the steps of conversion, detection, and display generation. In one embodiment, the at least one processor or computer includes means for transmitting and receiving information with a remote device, for example, via an internet network, enabling the implementation of the steps of the method according to the invention. In some cases, the processor or computer can communicate with one or more external devices via the network. The processor or computer can be connected to the network via a wired connection (for example, via an Ethernet cable) and / or a wireless connection (for example, via a Wi-Fi network).
These external devices can include servers, workstations, and / or databases. The processor or computer can communicate with these devices to, for example, offload computationally intensive tasks. For instance, the processor or computer can send a medical image acquired over the network to the server for analysis and receive the results of the server's analysis. In addition (or alternatively), the processor or computer can communicate with these devices to access information that is not available locally and / or update a central information repository.
[0155] Device 1 may also include a plurality of processors, each associated with one or more memories, and configured to perform such steps together. In one embodiment, the processor(s) may be remote and connected to the display via a data network.
[0156] Device 1 further comprises one or more memories for storing or recording the sequences generated by the method according to the invention and / or for storing the computer programs that, when executed by one or more processors, implement the method according to the invention. In one embodiment, the device further comprises an EMM transmitter connected to said MEM memory for transmitting data from the second learning function, such as presence scores or normalized presence scores, from said MEM memory to a data network.
[0157] Device 1 may further include means of communication such as EMM transmitters and REC receivers for the exchange of information with an ACQ acquisition device and / or a remote device.
[0158] The AFF display may include means for receiving the different information received by the different means REC, CALC, MEM, of device 1 to generate a final image to be displayed.
Claims
1. A computer-implemented method (1000) for detecting a structural anomaly of a subject's organ from a functional image comprising the following steps: ■ the reception (100) of an acquired image (11) acquired by a non-irradiating image acquisition device (ACQ); ■ the conversion (200) of said acquired image (11) into at least one synthetic anatomical image (12) by a conversion module; ■ the detection (300) of one or more structural anomalies of an organ of a subject on the synthetic anatomical image (12) by a detection module.
2. A method according to claim 1, wherein: ■ the conversion step (200) is implemented by a first trained learning function (26) configured to receive the acquired image (11) as input and generate the synthetic anatomical image (12) as output; and ■ the detection step (300) is implemented by a second trained learning function (25) configured to receive as input the synthetic anatomical image (12) and generate as output at least one presence score (14A, 14B, 14C) of at least one predetermined structural anomaly; and wherein the training (120) of the first learning function (26) is implemented using data produced (22) by a loss function implemented during the execution of the second trained learning function (25).
3. Method according to claim 2 wherein the data produced (22) include segmentation information extracted from the loss function of the second trained learning function (25).
4. A method according to claim 2, wherein the first learning function (26) is trained using a discriminator that acts on a latent space of said first learning function to evaluate whether the features generated by the first learning function (26) respect predetermined properties extracted from the second trained learning function (25).
5. A method according to any one of claims 2 to 4, wherein the first learning function is trained using a loss function that incorporates a reinforcement learning agent orchestrating deep supervision.
6. A method according to claim 2 or claim 3, wherein the first learning function (26) is trained using a loss function that incorporates an adversarial or corrective term; and wherein the adversarial or corrective term guides the latent space of the first learning function (26) to produce representations aligned with target information extracted from the second trained learning function (25).
7. A method according to any one of claims 1 to 6, characterized in that it comprises: ■ the generation of a presence score (14A, 14B, 14C) by the second trained learning function (25) for each anomaly present on the synthetic anatomical image (12), said score being a function of the integral of the number of voxels and / or pixels classified for each predetermined anomaly; ■ the extraction of a bronchial volume (15) and / or a vascular volume from the synthetic anatomical image; and ■ the normalization (400) of each generated presence score by the extracted bronchial and / or vascular volume to generate, for each predetermined anomaly, a normalized presence score (16A, 16B, 16C).
8. System (1) for detecting structural anomalies comprising software and / or hardware means for implementing the method according to any one of claims 1 to 7.
9. Computer program product comprising code instructions which, when executed by a computer (PRO), cause the system (1) according to claim 8 to carry out the steps of the method (1000) according to any one of claims 1 to 7.
10. Computer-readable data carrier (MEM) on which the computer program product according to claim 9 is stored.