Method for training a neural network, and corresponding classification method and computer program

The method enhances neural network training by using distance-transformed matrices to modulate errors based on spatial proximity, improving robustness and accuracy in image classification tasks, particularly for autonomous vehicles.

WO2026139482A1 · PCT designated stage · Publication Date: 2026-07-02 · THALES SA

Patent Information

Authority / Receiving Office
WO · WO
Patent Type
Applications
Current Assignee / Owner
THALES SA
Filing Date
2025-12-22
Publication Date
2026-07-02

AI Technical Summary

Technical Problem

Existing neural network training methods for image classification are not robust to annotation errors and inaccuracies, particularly in segmentation tasks, leading to potential functional failures in applications like autonomous vehicles due to varying impacts of errors based on their location within the image.

Method used

A method for training neural networks that includes iterative training with distance-transformed matrices to modulate errors based on their distance from class boundaries, using a modulation function to penalize errors differently based on their spatial proximity to class instances, enhancing the neural network's robustness and accuracy.

Benefits of technology

The method improves the neural network's ability to discriminate between different types of errors, resulting in a more accurate and robust classification model suitable for inference tasks, reducing the risk of functional failures in applications like autonomous driving.

✦ Generated by Eureka AI based on patent content.


Abstract

The invention relates to a method (10) for training a neural network (15), which method comprises: - initialising (110) a neural network (15_in); - supplying (120) pairs of matrices (51) comprising an input matrix (51A) and a target label matrix (51B), each term of which is a class number (numj) of a term of the input matrix, for forming (130) training sets (65), intermediate validation sets (66) and final validation sets (67); and - training the neural network using said sets. For a selected class (Ci.sel), for each target label matrix and a relevant estimated label matrix (51C_Ent, 51C_VI, 51C_VF) from the current neural network in response to the relevant input matrix, a distance transform matrix is determined so as to apply a modulation function (f_mod) thereto in order to estimate (144, 154) a performance indicator of the current neural network.

Description

[0001] DESCRIPTION

[0002] TITLE: METHOD FOR TRAINING A NEURAL NETWORK, CLASSIFICATION METHOD AND CORRESPONDING COMPUTER PROGRAM

[0003] The present invention relates to a method for training a neural network configured for the classification of input data, in particular images, a method for classifying input data using such a neural network, and the corresponding computer program.

[0004] In the field of artificial intelligence, and in particular the automatic analysis of data such as image files, it is known to train neural networks by deep learning in order to generate a classification model.

[0005] As is known, the classification model is implemented on a neural network having an initial architecture chosen in terms of the number of layers, the number of neurons per layer, and the types of activation functions.

[0006] The values of the neural network parameters, including synaptic weights and any biases, are adjusted during an adaptation phase of the neural network, also called the learning phase.

[0007] The learning phase is carried out using a set of digital input files and respective annotations, including information relating to the classes to be assigned to these files.

[0008] The annotation of the input files provided for training is carried out by an operator, based on a set of classes that the operator chooses or that is provided to him.

[0009] The classification model obtained at the end of the learning phase allows the automatic classification of new input files during a subsequent use phase of the classification model, also called the inference phase.

[0010] Input file annotation can be performed at different scales.

[0011] It is therefore possible to assign a single class to an entire input file to be classified, for example, to each image provided as input to the neural network. Typically, in the field of image analysis, any image containing a car will be assigned the class "car".

[0012] It is also possible to process each element of the input file by assigning a class to each of these elements. If the input file is an image, a class would be assigned to each pixel of the image. In this case, the annotation process is called segmentation.

[0013] Segmentation is called "semantic" if no distinction is made between different instances of the same class. Segmentation is called "instance" if, on the contrary, it allows the identification of individual components belonging to a single instance of a given class.

[0014] For example, if the input files are images and the classes to be assigned are the road class and the car class, instance segmentation will separate the pixels corresponding to a given car from the pixels belonging to other cars, unlike semantic segmentation.

[0015] The evolution of the neural network during the learning phase is monitored by means of one or more metrics allowing control of the performance of the neural network, i.e. the ability of the classification model learned on a training dataset to generalize to new data.

[0016] These metrics are notably used to determine a performance indicator of the neural network being trained, obtained at each iteration of the training phase with the respective set of input files and annotations for training, called the training set, as well as for the final validation of the neural network for its use in a subsequent inference phase.

[0017] The metrics commonly used are based on the concepts of:

[0018] - precision, that is to say the ratio of the number of true predictions of a class to the total number of predictions of that class, and / or

[0019] - recall, that is to say the ratio of the number of true predictions of a class to the total number of true occurrences expected for that class.

[0020] These metrics therefore take into account whether a prediction is true or false, in an essentially binary way. A metric commonly used by those skilled in the art is the Intersection over Union metric (abbreviated as IoU).
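As an illustration of the binary metrics described in paragraphs [0017] to [0020], the following minimal sketch computes precision, recall and IoU for one class from a target label matrix and an estimated label matrix. The function names, the example matrices and the convention of returning 0.0 when a denominator is zero are assumptions made for this example, not part of the described method.

```python
from typing import List, Tuple

LabelMatrix = List[List[int]]


def confusion_counts(target: LabelMatrix, estimated: LabelMatrix,
                     class_number: int) -> Tuple[int, int, int]:
    """Count true positives, false positives and false negatives for one class."""
    tp = fp = fn = 0
    for target_row, estimated_row in zip(target, estimated):
        for t, e in zip(target_row, estimated_row):
            if e == class_number and t == class_number:
                tp += 1  # the class was predicted and expected
            elif e == class_number:
                fp += 1  # the class was predicted but another class was expected
            elif t == class_number:
                fn += 1  # the class was expected but another class was predicted
    return tp, fp, fn


def precision(tp: int, fp: int) -> float:
    # Assumed convention: 0.0 when the class is never predicted.
    return tp / (tp + fp) if tp + fp else 0.0


def recall(tp: int, fn: int) -> float:
    # Assumed convention: 0.0 when the class is never expected.
    return tp / (tp + fn) if tp + fn else 0.0


def iou(tp: int, fp: int, fn: int) -> float:
    # Intersection (tp) over union (tp + fp + fn) of the two binary masks.
    return tp / (tp + fp + fn) if tp + fp + fn else 0.0


# Hypothetical 2x3 label matrices; class number 1 is the class of interest.
target = [[1, 1, 0], [0, 0, 0]]
estimated = [[1, 0, 0], [1, 0, 0]]
tp, fp, fn = confusion_counts(target, estimated, class_number=1)
```

Each misclassified term counts for exactly one error here, whatever its position in the matrix, which is the binary behavior that the following paragraphs call into question.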

[0021] When the annotation process is a segmentation type process, the usual metrics do not necessarily allow us to obtain a relevant model, or the most relevant one, with respect to a function to be performed by means of the neural network during an inference phase of this neural network.

[0022] It is important to bear in mind that annotations are performed by an operator, whose performance is inherently limited both in terms of annotation speed and in terms of class detection threshold in the observed input files, and that the accuracy of the input files observed by the operator is inherently limited by the resolution and accuracy of the sensor with which these files are obtained, as well as by those of the device enabling their observation.

[0023] Initial annotation errors and / or inaccuracies may be reinforced by learning, so that the neural network obtained at the end of the learning phase is not always fully satisfactory.

[0024] For example, a neural network trained for image classification can be implemented in an autonomous vehicle or one equipped with a driver assistance system.

[0025] The control function to which the neural network contributes can then include one or more priorities, for example, ensuring pedestrian safety, ensuring the safety of all vehicles, following the shortest path, etc. To perform this function, images provided, for example, by a vehicle camera can be annotated using segmentation. As an example, the classes for annotation could include the classes road, sidewalk, and house.

[0026] An error in the positioning of the border of a class instance can have a significant impact on that class's function, whereas an error made far from the class's borders would have a minor impact. In the example of driving a vehicle, classifying a pixel on a section of sidewalk as a road poses a greater risk to pedestrian safety when that misclassified pixel is far from any other road than when it is immediately adjacent to one.

[0027] It is therefore understood that the initial annotation errors and inaccuracies and the annotation errors generated during training and evaluated for updating the model parameters do not necessarily all have the same influence on a function to be performed during an inference phase of the neural network.

[0028] One aim of the invention is therefore to propose a method for training a neural network for file classification by segmentation which is more robust to annotation errors and which makes it possible to discriminate certain errors from others.

[0029] To this end, the invention relates to a method for training a neural network configured for the classification of matrix input data, comprising:

a) the initialization of a neural network,

[0030] b) the provision of a learning set comprising a plurality of matrix pairs, each matrix pair comprising an input matrix containing input data and a respective target label matrix of the same dimensions as the input matrix and each term of which is a class number representing a class of a respective term of the input matrix, said class being selected from a set of predetermined classes,

c) the formation, from the learning set, of a training set, an intermediate validation set and a final validation set,

[0031] d) iterative training of the neural network using the training set and the intermediate validation set, and

[0032] e) the final validation of the trained neural network using the final validation set,

[0033] characterized in that:

[0034] a plurality of iterations of the iterative training and / or respectively the final validation include an estimation of a performance indicator of the neural network, said estimation comprising:

[0035] i) the selection of at least one class from the set of predetermined classes,

ii) for each of the selected classes, for each target label matrix of the training and intermediate validation sets, respectively of the final validation set, as well as for each respective estimated label matrix obtained with the current neural network in response to the respective input matrix, the determination of a target, respectively estimated, distance transform matrix, each term of which represents a distance between the position of a corresponding term in the respective label matrix and the position of the nearest term of said selected class in the respective label matrix,

iii) the application of a modulation function to each of the terms of the target, respectively estimated, distance transform matrices thus determined, to form target, respectively estimated, modulated matrices, the modulation function being a function of the distance, and

[0036] iv) the estimation of the performance indicator of the current neural network from a performance measurement function taking as arguments said target and estimated modulated matrices.

[0037] Distance-transformed matrices allow for the expansion of data within label matrices. The data in label matrices relates to a class assigned to any element of an input matrix (that is, to any term in that input matrix). The class assigned to an element is supplemented in distance-transformed matrices by information about the position of that element relative to elements that have been assigned the same and / or a different class.

[0038] The formation of modulated matrices then makes it possible to weight any error made in assigning a class according to the distance at which the corresponding term of the estimated matrix is located from its first neighbor of the class. Depending on the choice of the direction of variation of the modulation function, decreasing or increasing with distance, errors made near a boundary of an instance of a given class will be more heavily penalized than errors made far from these boundaries for the updating of the model parameters and / or its final validation, or vice versa.

[0039] The choice of the shape of the modulation function allows, for example, the definition of a characteristic distance beyond and / or below which the penalty for errors no longer varies significantly.

[0040] Thus, the performance of the model during training and / or the final model can be evaluated in a way that is adapted to a function to be performed during an inference phase of the neural network, so that the function is performed with better accuracy than with prior art training methods.

[0041] According to other advantageous aspects of the invention, the training method comprises one or more of the following features, taken individually or in all technically possible combinations:

[0042] - the modulation function is a monotonic function of the distance;

[0043] - the modulation function takes the value one when said distance is equal to zero, and tends towards the value zero when said distance tends towards infinity;

[0044] - the modulation function is chosen from the function f_mod(distance) = exp(-(distance/α)^p), where α represents a predetermined positive scaling factor and p represents a predetermined positive adjustment parameter, the function f_mod(distance) = 1 - tanh(distance), and the function f_mod(distance) = γ/(distance + γ), where γ represents a predetermined positive adjustment parameter;

[0045] - estimation iv) of the neural network performance indicator includes the calculation of the intersection and union of each estimated modulated matrix and the respective target modulated matrix for each of said selected classes;

[0046] - all classes from the predetermined set of classes are selected in step i);

[0047] - one or more of the performance indicators thus estimated are used to make a decision on whether to continue the training iterations and / or qualify the neural network for a later inference phase and / or for the initialization a) of a new neural network during a reiteration of the training process;

[0048] - the input matrices are previously obtained using a digital image sensor.

The invention also relates to a method of data classification by segmentation using a neural network obtained at the end of the training process as described above.

[0049] The invention finally relates to a computer program product comprising software instructions which, when executed by a computer, implement the training method as defined above.
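As an illustration of the candidate modulation functions of paragraphs [0042] to [0044], the sketch below implements three monotonic decreasing functions of the distance. Only the form 1 - tanh(distance) is taken directly from the text. The exponential form exp(-(distance/alpha)^p) and the rational form gamma/(distance + gamma) are assumed here so that each function equals one at zero distance and tends to zero at infinity, as paragraph [0043] requires. The function names and default parameter values are hypothetical.

```python
import math


def f_mod_exp(distance: float, alpha: float = 2.0, p: float = 1.0) -> float:
    """Assumed exponential form: exp(-(distance / alpha) ** p)."""
    return math.exp(-((distance / alpha) ** p))


def f_mod_tanh(distance: float) -> float:
    """Form stated in the text: 1 - tanh(distance)."""
    return 1.0 - math.tanh(distance)


def f_mod_rational(distance: float, gamma: float = 1.0) -> float:
    """Assumed rational form: gamma / (distance + gamma)."""
    return gamma / (distance + gamma)


# Values at a few distances, for comparing how fast each function decays.
samples = {
    f.__name__: [f(float(d)) for d in (0, 1, 2, 5)]
    for f in (f_mod_exp, f_mod_tanh, f_mod_rational)
}
```

With these forms, alpha and gamma play the role of the characteristic distance mentioned in paragraph [0039]: a larger value spreads the penalty over a wider band around each class instance.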

[0050] The invention will become clearer upon reading the following description, given solely by way of non-limiting example, and made with reference to the drawings in which:

[0051] Figure 1 schematically represents an electronic classification device on which the process according to the invention can be implemented;

[0052] Figure 2 schematically represents a particular example of a neural network implemented by the electronic classification device of Figure 1;

[0053] Figure 3 represents in the form of a logic diagram the training method according to the invention for training the neural network of Figure 2 and the classification method implemented at the end of the training method;

[0054] Figure 4 represents in the form of a logic diagram the intermediate estimation step of the training process of Figure 3;

[0055] Figure 5 represents in the form of a logic diagram the final estimation step of the training process of Figure 3.

[0056] The invention relates to a method for training 10 a neural network 15 implemented by means of an electronic classification device 20.

[0057] The electronic classification device 20 is described with reference to Figure 1. The electronic classification device 20 includes, for example, at least one processor 25 exchanging data with at least one mass storage 30 and at least one random access memory 35.

[0058] The processor 25 is for example chosen from a central processing unit (CPU) and a graphics processing unit (GPU).

[0059] Mass memory 30 is a non-volatile memory, with a storage capacity suitable for implementing the training process 10 and optionally for the subsequent classification of at least one matrix input data to be classified 45.

[0060] Mass memory 30 is configured to receive and store software instructions to be executed by the processor 25 for the implementation of the training process 10 which will be described later, and optionally for the classification of at least one matrix input data to be classified 45.

[0061] Mass memory 30 is configured to receive and store a learning set 50 of training data and optionally at least one matrix input data to be classified 45.

[0062] The matrix input data to be classified 45 includes information relating to a physical object obtained by means of a sensor and stored electronically by means of a matrix data structure.

[0063] Each matrix input data to be classified 45 is a matrix of dimension N1*N2, where N1 and N2 are integers greater than or equal to 1 and such that at most one of N1 and N2 is equal to 1. Each term of the matrix comprises C sub-terms, where C denotes a number of sensor channels. Each term is therefore a singlet when C is equal to 1 and a multiplet when C is strictly greater than 1.

[0064] Each matrix input data to be classified 45 is obtained by means of a suitable digital sensor, for example a digital image sensor.

[0065] As an example, the matrix input data to be classified 45 is stored in mass memory 30 in a matrix image file of dimension N1*N2, each term of the matrix being a multiplet of C sub-terms if C is strictly greater than 1.

[0066] N1, respectively N2, represents in this case an integer number of pixels along a longitudinal direction of the digital image sensor by means of which the file was obtained, N1 and N2 being strictly greater than 1.

[0067] Each term in the corresponding input matrix then encodes, for example as a floating-point number, a color level on one or more channels.

[0068] In this example, the number of channels C of the image sensor is, for instance, equal to 1 for a grayscale image, 3 for an RGB color image, or even greater than three for a multispectral, hyperspectral, or ultraspectral sensor. In the case of an RGB color image, each pixel therefore comprises three subpixels, so that each term of the matrix input data to be classified is a triplet of floating-point numbers.

[0069] The raster format is chosen, for example, from the JPEG, PNG, GIF, TIF and BMP formats.

[0070] The matrix data to be classified 45 is for example a medical image obtained on a patient using a medical imaging device such as a 2D digital radiography device, a scanner or a magnetic resonance imaging device.

[0071] In another embodiment, the matrix data to be classified 45 is, for example, an image file of a portion of space intended to be traversed by an autonomous vehicle or one equipped with a driver assistance system.

[0072] The learning set 50 comprises a plurality of matrix pairs 51. Each matrix pair 51 comprises an input matrix 51A containing input data and a respective target label matrix 51B of the same dimension as the input matrix 51A.

[0073] The data format used to store the input matrices 51A in mass memory 30 is the same as that of the matrix input data to be classified 45.

[0074] Thus, in the case of image classification, the input matrices 51A are stored in memory 30 as image files of the same format as that used for the matrix input data to be classified 45.

[0075] Each target label matrix 51B has the same dimension N1*N2 as the respective input matrix 51A.

[0076] Each term in the target label matrix 51B is a class number numj, representing a respective class of the corresponding term in the input matrix 51A, the class being selected from a set {Ci} of N predetermined classes Ci.

[0077] The classes Ci are chosen a priori by an operator who annotates the input matrices 51A upstream of the training process 10 to generate the target label matrices 51B.

[0078] Each target label matrix 51B is therefore formed at the end of a segmentation-type annotation step applied to a respective input matrix 51A, that is to say by assigning a class Ci to each term of the respective input matrix 51A, the annotation step being carried out by an operator upstream of the implementation of the training process 10.

[0079] Each class Ci is represented by a respective class number numj, preferably a positive integer.

[0080] The target label matrices 51B are stored in mass memory 30 by means of an appropriate matrix data structure.
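One possible in-memory representation of a matrix pair 51, as described in paragraphs [0072] to [0080], is sketched below. The `MatrixPair` type, the `validate_pair` helper and the class numbers assigned to the road, sidewalk and house classes are hypothetical and serve only to illustrate the constraints stated in the text: same dimension N1*N2 for both matrices, C sub-terms per input term, and class numbers drawn from the predetermined set.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical class numbers for the example classes of paragraph [0025].
CLASS_NUMBERS: Dict[int, str] = {1: "road", 2: "sidewalk", 3: "house"}


@dataclass
class MatrixPair:
    input_matrix: List[List[Tuple[float, ...]]]  # N1 x N2 terms of C sub-terms each
    target_labels: List[List[int]]               # N1 x N2 class numbers


def validate_pair(pair: MatrixPair, channels: int) -> Tuple[int, int]:
    """Check dimensions, channel count and class numbers; return (N1, N2)."""
    n1, n2 = len(pair.input_matrix), len(pair.input_matrix[0])
    if any(len(row) != n2 for row in pair.input_matrix):
        raise ValueError("input matrix rows have different lengths")
    if len(pair.target_labels) != n1 or any(len(r) != n2 for r in pair.target_labels):
        raise ValueError("label matrix and input matrix dimensions differ")
    if any(len(term) != channels for row in pair.input_matrix for term in row):
        raise ValueError("a term does not have C sub-terms")
    if any(n not in CLASS_NUMBERS for row in pair.target_labels for n in row):
        raise ValueError("unknown class number")
    return n1, n2


# A 2x3 RGB input matrix (C = 3) and its target label matrix.
pair = MatrixPair(
    input_matrix=[[(0.2, 0.2, 0.2)] * 3, [(0.8, 0.8, 0.8)] * 3],
    target_labels=[[1, 1, 2], [2, 2, 3]],
)
dimensions = validate_pair(pair, channels=3)
```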

[0081] The RAM 35 is a volatile memory configured to temporarily store data from the mass memory 30 for the implementation of the training method 10 and optionally for the subsequent classification of at least one input data to be classified 45.

The classification device 20 is configured to implement the training method 10 of the neural network 15, an example of which is shown in Figure 2.

[0082] The neural network 15 comprises an ordered succession of layers 55-i (i being an integer between 1 and k, with k greater than or equal to three) of neurons 60, each layer taking its inputs from the outputs of the previous layer.

[0083] More specifically, each layer 55-i comprises respective neurons 60, taking their inputs from the outputs of the neurons 60 of the previous layer 55-(i-1) where applicable, or receiving the input data in the case of the first layer 55-1.

[0084] In the example in Figure 2, the neural network 15 includes an input layer 55-1, two hidden layers 55-2 and 55-3 and an output layer 55-4.

[0085] Alternatively, more complex neural network structures can be considered. In this case, a given layer 55-i can be linked to a layer 55-j further away than the immediately preceding layer 55-(i-1).

[0086] Each neuron 60 is also associated with an operation, that is to say a type of processing, to be carried out by said neuron within the corresponding processing layer.

[0087] Each layer 55-i is connected to the other layers 55-j by one or more synapses 65.

[0088] A synaptic weight is associated with each synapse 65, and each synapse forms a link between two respective neurons 60. Each synaptic weight is, for example, a real number or a complex number.

[0089] Each neuron 60 is configured to:

[0090] - perform a weighted sum of the value(s) received from the neurons 60 of the previous layer, each value then being multiplied by the respective synaptic weight of the corresponding synapse, then

[0091] - apply an activation function, typically a non-linear function, to said weighted sum, and

[0092] - deliver at the output of said neuron 60 a value resulting from the application of the activation function.

[0093] The activation function allows for the introduction of non-linearity in the processing carried out by each neuron 60. The sigmoid function, the hyperbolic tangent function, the Heaviside function are examples of activation functions.

[0094] As an optional complement, each neuron 60 is also capable of applying an additional factor, also called a bias, to the output of the activation function. The value delivered at the output of said neuron 60 is then the sum of the bias value and the value resulting from the activation function.

The training method 10 aims to adjust the synaptic weights, including any biases, so that the neural network 15 performs the classification of the matrix input data 45 with the lowest possible error rate.
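The operation of a neuron 60 described in paragraphs [0089] to [0094] can be sketched as follows. This is a minimal illustration with hypothetical function names and example weights. It follows the text in adding the bias to the output of the activation function, whereas many common formulations add the bias to the weighted sum before activation.

```python
import math
from typing import Callable, List, Sequence


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def heaviside(x: float) -> float:
    # Assumed convention: the step function returns 1.0 at x == 0.
    return 1.0 if x >= 0.0 else 0.0


def neuron_output(inputs: Sequence[float], weights: Sequence[float],
                  activation: Callable[[float], float], bias: float = 0.0) -> float:
    """Weighted sum of the inputs, activation, then optional bias as in [0094]."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return activation(weighted_sum) + bias


def layer_output(inputs: Sequence[float], weight_rows: Sequence[Sequence[float]],
                 activation: Callable[[float], float],
                 biases: Sequence[float]) -> List[float]:
    """Outputs of one layer: one neuron per row of synaptic weights."""
    return [neuron_output(inputs, weights, activation, bias)
            for weights, bias in zip(weight_rows, biases)]


# A hypothetical layer of two neurons fed by two input values.
hidden = layer_output([1.0, 2.0], [[0.5, -0.25], [1.0, 1.0]], math.tanh, [0.1, 0.0])
```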

[0095] The training method 10, described with reference to Figure 3, comprises:

a) the initialization 110 of the neural network 15,

[0096] b) the provision 120 of a learning set 50 comprising a plurality of pairs of matrices 51, each pair of matrices 51 comprising an input matrix 51A comprising input data and a respective target label matrix 51B of the same dimension as the input matrix 51A and of which each term is the class number numj representing the class Ci of the respective term of the input matrix 51A, said class Ci being selected from the set {Ci} of predetermined classes,

[0097] c) the formation 130, from the learning set 50, of a training set 65, an intermediate validation set 66 and a final validation set 67,

[0098] d) the iterative training 140 of the neural network 15 using the training set 65 and the intermediate validation set 66, and

[0099] e) the final validation 150 of the trained neural network 15 using the final validation set 67.

[0100] Initialization 110 includes:

[0101] - the choice of a type of neural network 15, including the choice of the different layers 55-i and the synapses 65 that link their respective neurons 60,

[0102] - the choice of activation functions for the different neurons 60, and

[0103] - the initialization of the synaptic weights, including any biases, of the neural network 15, for example in a random manner.

[0104] Synaptic weights are called neural network parameters and are intended to be adjusted during iterative training 140.

[0105] The other features of the neural network 15 chosen during initialization 110 are represented by quantities called hyperparameters of the neural network 15, which are not modifiable during iterative training 140.

[0106] At the end of the initialization step 110, an initial neural network 15_in is thus formed, whose representative information is stored in the RAM 35 for the purpose of subsequent steps.

[0107] The provision 120 of the learning set 50 includes storing the learning set 50 in the mass memory 30.

[0108] The initialization 110 and the provision 120 can be carried out in any order or simultaneously.

The formation 130 of the training set 65, the intermediate validation set 66, and the final validation set 67 is carried out by the processor 25 in a manner known to the person skilled in the art.

[0109] In the following, the reference signs of the different matrices relating to the training set 65 end with "_Ent", the reference signs of the different matrices relating to the intermediate validation set 66 end with "_VI" and the reference signs of the different matrices relating to the final validation set 67 end with "_VF".

[0110] The training set 65 and intermediate validation set 66 are preferably disjoint.

[0111] Preferably, the training set 65 and the intermediate validation set 66 have similar probability distributions of the different classes Ci.

[0112] The training set 65 and the final validation set 67 are preferably disjoint.

[0113] Preferably, the training set 65 and the final validation set 67 have similar probability distributions of the different classes Ci.

[0114] In one particular embodiment, the final validation set 67 and the intermediate validation set 66 are identical.
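The text leaves the formation 130 of the three sets to the person skilled in the art. One simple possibility, sketched below, is a seeded random split of the learning set 50 into three disjoint subsets. The function name, the split fractions and the seed are assumptions made for this example. The sketch does not enforce similar class distributions between the sets, which would require a stratified split.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def form_sets(learning_set: Sequence[T], train_fraction: float = 0.7,
              intermediate_fraction: float = 0.15,
              seed: int = 0) -> Tuple[List[T], List[T], List[T]]:
    """Split the learning set into disjoint training, intermediate and final sets."""
    indices = list(range(len(learning_set)))
    random.Random(seed).shuffle(indices)  # seeded so the split is reproducible
    n_train = round(len(indices) * train_fraction)
    n_intermediate = round(len(indices) * intermediate_fraction)
    training = [learning_set[i] for i in indices[:n_train]]
    intermediate = [learning_set[i] for i in indices[n_train:n_train + n_intermediate]]
    final = [learning_set[i] for i in indices[n_train + n_intermediate:]]
    return training, intermediate, final


# Hypothetical learning set of 20 matrix pairs, identified here by name only.
learning_set = [f"pair_{i}" for i in range(20)]
training_set, intermediate_set, final_set = form_sets(learning_set)
```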

[0115] As is known, iterative training 140 comprises a plurality of iterations, called epochs E_p. In the following, NE denotes the number of epochs E_p, where p is an integer ranging from 1 to NE.

[0116] At each epoch E_p, the input matrices 51A_Ent of the training set 65 are provided as input to the neural network 15 in its configuration for the current epoch E_p, so as to generate for each of them an estimated label matrix 51C_Ent at the output of the neural network 15.

[0117] Each epoch E_p optionally includes a step of dividing the training set 65 into K batches of B input matrices 51A_Ent. In this case, the batches of input matrices 51A_Ent are provided successively to the neural network 15, each batch once per epoch E_p, each epoch E_p including K intermediate iterations, so that at the end of an epoch E_p, each input matrix 51A_Ent of the training set 65 has been received as input to the neural network 15 once.

[0118] In one particular embodiment, the division step is performed only once for all epochs E_p.

[0119] Alternatively, the division step is performed with replacement at each epoch E_p, so that at the end of a given epoch E_p, some input matrices 51A_Ent may have been selected in multiple batches, and / or some input matrices 51A_Ent may never have been selected.
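The two batching variants of paragraphs [0117] to [0119] can be illustrated as follows. This is a sketch under stated assumptions: the function names are hypothetical, and the variant with replacement draws every element of every batch independently, so that a matrix may also appear twice within one batch, a detail the text does not specify.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def batches_without_replacement(items: Sequence[T], batch_size: int,
                                rng: random.Random) -> List[List[T]]:
    """Partition the items into batches: each item is seen exactly once per epoch."""
    shuffled = list(items)
    rng.shuffle(shuffled)
    return [shuffled[i:i + batch_size] for i in range(0, len(shuffled), batch_size)]


def batches_with_replacement(items: Sequence[T], batch_size: int, n_batches: int,
                             rng: random.Random) -> List[List[T]]:
    """Draw each batch independently: items may repeat or never be selected."""
    population = list(items)
    return [rng.choices(population, k=batch_size) for _ in range(n_batches)]


rng = random.Random(0)
inputs = [f"51A_Ent_{i}" for i in range(8)]
# K = 2 batches of B = 4 input matrices for one epoch.
epoch_batches = batches_without_replacement(inputs, batch_size=4, rng=rng)
resampled_batches = batches_with_replacement(inputs, batch_size=4, n_batches=2, rng=rng)
```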

[0120] The terms of an estimated training label matrix 51C_Ent are the class numbers numj estimated by the neural network 15 in its configuration for the current epoch E_p, in response to the provision of the input matrix 51A_Ent on the input layer 55-1.

[0121] At the end of each epoch E_p, a first distance d1(51B_Ent, 51C_Ent) between each of the estimated label matrices 51C_Ent and the respective target label matrix 51B_Ent is evaluated by the processor 25, and the value of a cost function, configured to measure a distance between the set of estimated label matrices 51C and the respective target label matrices 51B, is calculated on the basis of the first distances d1(51B_Ent, 51C_Ent) thus evaluated.

[0122] Finally, at the end of each epoch E_p, the synaptic weights, and optionally the biases, are adjusted according to the value taken by the cost function.

[0123] The adjustment of the synaptic weights and optionally of the biases is carried out in a manner known to the person skilled in the art, for example according to the gradient backpropagation method, to provide an updated neural network 15.

[0124] In a first embodiment, for each of a plurality of epochs E_p, advantageously for each epoch E_p, the training process 10 includes an intermediate estimation step 140-A of a first performance indicator IP1 of the neural network 15, described with reference to Figure 4.

[0125] The first performance indicator IP1 of the neural network 15 is configured to allow tracking of the evolution of the neural network's classification performance over successive epochs E_p, notably so as to allow the transition to the next epoch E_(p+1), according to the principle of early stopping.
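The text does not specify the early stopping rule applied to the first performance indicator IP1. A common patience-based rule is sketched below as one possibility. The function name, the `patience` and `min_delta` parameters, and the assumption that a higher indicator value means better performance are all hypothetical.

```python
from typing import Sequence


def should_stop(indicator_history: Sequence[float], patience: int = 3,
                min_delta: float = 0.0) -> bool:
    """Return True when the last `patience` epochs brought no improvement.

    Assumes that a higher indicator value means better performance.
    """
    if len(indicator_history) <= patience:
        return False  # not enough epochs yet to judge
    best_before = max(indicator_history[:-patience])
    recent_best = max(indicator_history[-patience:])
    return recent_best <= best_before + min_delta


# Hypothetical IP1 values recorded at the end of successive epochs.
improving = [0.50, 0.60, 0.70, 0.80]
stalled = [0.50, 0.60, 0.70, 0.69, 0.68, 0.70]
```

In this sketch, training would proceed to the next epoch while `should_stop` returns False and would end as soon as it returns True.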

[0126] The intermediate estimation step 140-A includes:

[0127] i) the selection 141 of at least one class Ci.sel from the set {Ci} of predetermined classes,

[0128] ii) for each of the selected classes Ci.sel:

[0129] * for each target label matrix 51B_Ent in the training set 65, as well as for each respective estimated label matrix 51C_Ent obtained with the current neural network 15 in response to the respective input matrix 51A_Ent, the determination 142_Ent of a target distance transformed matrix 52B_Ent, respectively estimated 52C_Ent, each term of a target distance transformed matrix 52B_Ent, respectively estimated 52C_Ent, representing an intra-matrix distance d_intra/Ci.sel between the position of a corresponding term in the target label matrix 51B_Ent, respectively estimated label matrix 51C_Ent, and the position of the nearest term of said selected class Ci.sel in the respective target label matrix 51B_Ent, respectively estimated label matrix 51C_Ent;

* on the same principle, for each target label matrix 51B_VI of the intermediate validation set 66, as well as for each respective estimated label matrix 51C_VI obtained with the current neural network 15 in response to the respective input matrix 51A_VI, the determination 142_VI of a target distance transformed matrix 52B_VI, respectively estimated 52C_VI;

[0130] iii) the application 143 of a modulation function f_mod to each of the terms of the target distance transformed matrices 52B_Ent, 52B_VI, respectively estimated 52C_Ent, 52C_VI, thus determined, to form target modulated matrices 53B_Ent, 53B_VI, respectively estimated modulated matrices 53C_Ent, 53C_VI, the modulation function f_mod being a function, in particular a monotonic function, of the intra-matrix distance d_intra/Ci.sel, and

iv) the estimation 144 of the first performance indicator IP1 of the current neural network from a performance measurement function taking as arguments said target and estimated modulated matrices.
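The text does not give the performance measurement function of step iv). One possible choice, in line with the intersection and union mentioned in paragraph [0045], is sketched below. It assumes that the intersection and union of two real-valued modulated matrices are taken as the term-wise minimum and maximum, that an empty union scores 1.0, and that the indicator is the mean score over all pairs of matrices. These are assumptions made for this example, and the function names are hypothetical.

```python
from typing import Sequence, Tuple

Matrix = Sequence[Sequence[float]]


def modulated_iou(target_modulated: Matrix, estimated_modulated: Matrix) -> float:
    """Soft IoU: sum of term-wise minima over sum of term-wise maxima."""
    intersection = 0.0
    union = 0.0
    for target_row, estimated_row in zip(target_modulated, estimated_modulated):
        for t, e in zip(target_row, estimated_row):
            intersection += min(t, e)
            union += max(t, e)
    # Assumed convention: two all-zero matrices agree perfectly.
    return intersection / union if union > 0.0 else 1.0


def performance_indicator(pairs: Sequence[Tuple[Matrix, Matrix]]) -> float:
    """Mean soft IoU over (target, estimated) modulated matrix pairs."""
    scores = [modulated_iou(target, estimated) for target, estimated in pairs]
    return sum(scores) / len(scores)


# Hypothetical 2x2 modulated matrices for one selected class.
target_53b = [[1.0, 0.5], [0.0, 0.0]]
estimated_53c = [[1.0, 0.0], [0.5, 0.0]]
ip1 = performance_indicator([(target_53b, estimated_53c), (target_53b, target_53b)])
```

With a decreasing modulation function, terms far from any instance of the selected class have values close to zero in both matrices and contribute little to either sum, so disagreements near class instances dominate the score.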

[0131] Selection 141 includes the provision by an operator to the processor 25 of at least one class Ci of interest, selected from the set {C} of predetermined classes Ci.

[0132] Selection 141 is for example carried out in relation to a function to be performed during an inference phase of the trained neural network 15.

[0133] In a particular embodiment, several classes Ci, or even all of the classes Ci, are selected.

[0134] Determination 142 includes, for each of the selected classes Ci,sel, and for each target label matrix 51B and estimated label matrix 51C of the intermediate validation set 66 and of the training set 65:

[0135] the evaluation, for each term of the label matrix concerned, of an intra-matrix distance d_intra/Ci,sel, defined as the distance from that term to its nearest neighbor of said selected class Ci,sel.

[0136] As an example, the intra-matrix distance is chosen from an L1 distance or an L2 distance.

[0137] If B[m][p] denotes the term of a target label matrix 51B positioned in row m and column p of this matrix, the intra-matrix distance d_intra between any two terms B[m][p] and B[m'][p'] of a given target label matrix 51B is for example defined by the relation d_intra(B[m][p], B[m'][p']) = abs(m'-m) + abs(p'-p), where abs denotes the absolute value function; this is the L1 distance. For example, if the input data are images, all pixels belonging to a given instance of a selected class Ci,sel will be associated with an intra-matrix distance of zero for this class. Pixels immediately bordering this instance will be associated with an intra-matrix distance of 1.
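By way of illustration only, the determination of a distance transformed matrix for one selected class can be sketched as follows in Python. The sketch assumes the L1 distance of the preceding paragraph and uses a multi-source breadth-first search, which is one possible way of computing it and not necessarily the one used by the processor 25. The function name `l1_distance_transform` and the example matrix are hypothetical.

```python
from collections import deque


def l1_distance_transform(labels, selected_class):
    """Return a matrix of L1 distances to the nearest term of selected_class.

    Terms stay at float('inf') if the class is absent from the label matrix.
    """
    rows, cols = len(labels), len(labels[0])
    inf = float("inf")
    dist = [[inf] * cols for _ in range(rows)]
    queue = deque()
    # Terms of the selected class are at distance zero.
    for m in range(rows):
        for p in range(cols):
            if labels[m][p] == selected_class:
                dist[m][p] = 0
                queue.append((m, p))
    # Breadth-first search over the 4 neighbours yields the L1 distance.
    while queue:
        m, p = queue.popleft()
        for dm, dp in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            m2, p2 = m + dm, p + dp
            if 0 <= m2 < rows and 0 <= p2 < cols and dist[m2][p2] == inf:
                dist[m2][p2] = dist[m][p] + 1
                queue.append((m2, p2))
    return dist


# Hypothetical 3x4 target label matrix with one instance of class 1.
target = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
transformed = l1_distance_transform(target, selected_class=1)
```

In this example the two terms of class 1 receive a distance of zero and their direct neighbours a distance of 1, in line with the image example given above.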

[0138] If N_sel denotes the number of selected classes Ci,sel, then at the end of the determination step 142 each processed target label matrix 51B or estimated label matrix 51C is associated with N_sel target distance transformed matrices 52B, respectively estimated distance transformed matrices 52C, each obtained relative to a respective selected class Ci,sel.

[0139] Processor 25 then applies the modulation function f_mod to each term of the target distance transformed matrices 52B_Ent, 52B_VI and of the estimated distance transformed matrices 52C_Ent, 52C_VI thus determined, to form the target modulated matrices 53B_Ent, 53B_VI and the estimated modulated matrices 53C_Ent, 53C_VI, respectively. The modulation function f_mod is a function, in particular a monotonic function, of the intra-matrix distance d_intra/Ci,sel. At the end of the application step 143 of the modulation function f_mod, each processed target label matrix 51B or estimated label matrix 51C is associated with N_sel target modulated matrices 53B, respectively estimated modulated matrices 53C, each obtained relative to a respective selected class Ci,sel.

[0140] The modulation function f_mod is chosen for example in relation to a function to be performed in an inference phase of the neural network 15.

[0141] The modulation function f_mod can be configured to decrease from an upper value, for example 1, to zero when the intra-matrix distance d_intra/Ci,sel tends towards infinity.

[0142] The modulation function f_mod can be continuous or piecewise continuous. In a particular embodiment, the modulation function f_mod takes the form:

[0143]

[0144] in which α represents a predetermined positive scaling factor and p represents a predetermined positive adjustment parameter.

[0145] In another embodiment, the modulation function f_mod is of the form f_mod(distance) = 1 - tanh(distance). This choice notably allows defining a boundary zone around the class instances, within which the variation in the position of an annotation error is taken into account, and beyond which the influence of the annotation error will be approximately constant regardless of its position. In another embodiment, the modulation function f_mod is of the form f_mod(distance) = γ / (distance + γ), where γ represents a predetermined positive adjustment parameter. This choice likewise allows defining a boundary zone around the class instances, within which the variation in the position of an annotation error is taken into account, and beyond which the influence of the annotation error will be approximately constant regardless of its position.
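As a minimal sketch, the application of a modulation function to each term of a distance transformed matrix may look as follows. The form 1 - tanh(distance) is taken from the description. The rational form is written here as gamma / (distance + gamma), which is an assumed reading chosen so that the function equals 1 at zero distance and tends towards zero at infinity. The function names and the default value of `gamma` are hypothetical.

```python
import math


def f_mod_tanh(distance):
    """Modulation of the form 1 - tanh(distance): 1 at zero, tends to 0 far away."""
    return 1.0 - math.tanh(distance)


def f_mod_rational(distance, gamma=2.0):
    """Assumed rational form gamma / (distance + gamma), with gamma > 0."""
    return gamma / (distance + gamma)


def apply_modulation(distance_matrix, f_mod):
    """Apply f_mod to every term of a distance transformed matrix."""
    return [[f_mod(d) for d in row] for row in distance_matrix]


# Hypothetical distance transformed matrix for one selected class.
distances = [[2, 1, 1, 2], [1, 0, 0, 1]]
modulated = apply_modulation(distances, f_mod_tanh)
```

Both functions are monotonically decreasing, so terms inside a class instance receive the value 1 and terms far from any instance receive a value close to zero.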

[0146] Once the modulated matrices are obtained, the processor 25 implements the estimation step 144. The estimation step 144 includes the estimation of the first performance indicator IP1 of the neural network 15 on the basis of a first performance measurement function f1_perf taking as arguments the target modulated matrices 53B and the estimated modulated matrices 53C of the set considered, among the training set 65 and the intermediate validation set 66.

[0147] The value of the first performance indicator IP1 for a given epoch E_p and for the training set 65 is therefore defined by the relation IP1(E_p, 65) = f1_perf({53B_Ent, 53C_Ent} / E_p).

[0148] The value of the first performance indicator IP1 for a given epoch E_p and for the intermediate validation set 66 is therefore defined by the relation IP1(E_p, 66) = f1_perf({53B_VI, 53C_VI} / E_p).

[0149] The first performance measurement function f1_perf can be any function known to the person skilled in the art that is configured to measure a distance between two sets of matrices, with the difference that it receives as arguments not the target label matrices 51B and the estimated label matrices 51C, as in prior art methods, but the target modulated matrices 53B and the estimated modulated matrices 53C.

[0150] As an example, the first performance measurement function f1_perf can be the sum, possibly weighted, over all the selected classes Ci,sel, of the Jaccard indices, also called Intersection over Union (IoU) coefficients, obtained for each pair of matrices comprising a target modulated matrix 53B and the respective estimated modulated matrix 53C.
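The following sketch illustrates one way such a weighted sum of Jaccard indices could be computed on modulated matrices. Since the modulated matrices hold real values rather than binary masks, the intersection and the union are taken here as the term-by-term minimum and maximum; this generalised Jaccard index is an assumption, as is the averaging over the matrix pairs of each class. The names `soft_jaccard` and `weighted_iou` are hypothetical.

```python
def soft_jaccard(target_mod, estimated_mod):
    """Generalised Jaccard index: sum of minima divided by sum of maxima."""
    inter = 0.0
    union = 0.0
    for row_t, row_e in zip(target_mod, estimated_mod):
        for t, e in zip(row_t, row_e):
            inter += min(t, e)
            union += max(t, e)
    # Convention: two all-zero matrices count as a perfect match.
    return 1.0 if union == 0.0 else inter / union


def weighted_iou(pairs_by_class, weights):
    """Weighted sum, over the selected classes, of the mean index per class."""
    total = 0.0
    for cls, pairs in pairs_by_class.items():
        scores = [soft_jaccard(t, e) for t, e in pairs]
        total += weights[cls] * sum(scores) / len(scores)
    return total


# Hypothetical modulated matrices for a single selected class numbered 1.
target = [[1.0, 0.5], [0.0, 0.0]]
estimate = [[1.0, 0.0], [0.5, 0.0]]
score = weighted_iou({1: [(target, estimate)]}, {1: 1.0})
```

Here the minima sum to 1.0 and the maxima to 2.0, so the index for the pair is 0.5.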

[0151] In a particular embodiment, the first performance measurement function f1_perf includes the weighting of elements from each selected class Ci,sel by weights λi,sel, optionally reconfigurable.

[0152] At the end of the intermediate estimation step 140-A, the training process 10 optionally includes a decision on whether to continue the iterative training 140, based on one or more first performance indicators IP1 already estimated.

[0153] The decision can be chosen from the transition to a final validation step 150 of the neural network thus trained 15_ent, the launch of a new learning epoch E_p, and/or the initialization 110 of a new neural network 15 with a modified set of hyperparameters with a view to repeating the other steps of the training method 10.

[0154] In particular, if it is observed that the first performance indicator IP1 grows over the intermediate validation set 66 after passing through a minimum, it may be possible to terminate the iterations of the learning epochs.

[0155] Alternatively or in addition, if it is observed that the values of the first performance indicator IP1 for the training set 65 and for the intermediate validation set 66 diverge, the training iterations can be terminated and a new neural network 15 can be initialized.
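The two stopping criteria described above can be sketched as follows, assuming that a lower value of the indicator is better, as suggested by the reference to a minimum. The `patience` and `max_gap` thresholds, the function names and the example histories are hypothetical; the description does not specify how the rise after a minimum or the divergence is detected.

```python
def should_stop(validation_history, patience=2):
    """True if the indicator has stayed above its minimum for `patience` epochs."""
    if len(validation_history) <= patience:
        return False
    best_epoch = min(
        range(len(validation_history)), key=validation_history.__getitem__
    )
    return len(validation_history) - 1 - best_epoch >= patience


def has_diverged(train_history, validation_history, max_gap=0.2):
    """True if the latest training and validation values differ by more than max_gap."""
    return abs(validation_history[-1] - train_history[-1]) > max_gap


# Hypothetical indicator values per epoch.
val = [0.9, 0.6, 0.5, 0.55, 0.6]
train = [0.9, 0.55, 0.4, 0.3, 0.2]
stop_now = should_stop(val)
diverged = has_diverged(train, val)
```

With these values the validation indicator reaches its minimum at the third epoch and rises for two epochs, so both criteria are met at the fifth epoch.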

[0156] It is understood that calculating the first performance indicator IP1 on the basis of the target modulated matrices 53B and the estimated modulated matrices 53C may lead to a decision on the continuation of training that differs from the one that would have been made on the basis of the respective label matrices, since the decision in the method according to the invention penalizes certain errors more than others according to the position of the error with respect to the instances of the different selected classes Ci,sel. At the end of the iterative training 140, the processor 25 proceeds to the final validation 150 of the trained neural network 15_ent by means of the final validation set 67.

[0157] In a second embodiment, alternative to or combined with the first embodiment, the final validation 150 includes a final estimation step 150-A of a second performance indicator IP2 of the trained neural network 15_ent, described with reference to Figure 5.

[0158] The second performance indicator IP2 is configured to evaluate the classification performance of the trained neural network 15_ent on new data.

[0159] The final estimation step 150-A, described with reference to Figure 5, comprises: i) the selection 151 of at least one class Ci,sel from the set {Ci} of predetermined classes,

[0160] ii) for each of the selected classes Ci,sel:

[0161] * for each target label matrix 51B_VF of the final validation set 67, and for each respective estimated label matrix 51C_VF obtained with the trained neural network 15_ent in response to the respective input matrix 51A_VF, the determination 152 of a target distance transformed matrix 52B_VF and of an estimated distance transformed matrix 52C_VF, respectively, each term of which represents the intra-matrix distance d_intra/Ci,sel between the position of the corresponding term in the target label matrix 51B_VF, respectively the estimated label matrix 51C_VF, and the position of the closest term of said selected class Ci,sel in that same label matrix;

[0162] iii) the application 153 of a modulation function f_mod to each of the terms of the target distance transformed matrices 52B_VF and of the estimated distance transformed matrices 52C_VF thus determined, to form target modulated matrices 53B_VF and estimated modulated matrices 53C_VF, respectively, the modulation function f_mod being a function, in particular a monotonic function, of the intra-matrix distance d_intra/Ci,sel, and

[0163] iv) the estimation 154 of the second performance indicator IP2 of the trained neural network 15_ent from a second performance measurement function taking as arguments said target modulated matrices 53B_VF and estimated modulated matrices 53C_VF.

[0164] It is therefore understood that the final estimation step 150-A is constructed in a manner analogous to the intermediate estimation steps 140-A mutatis mutandis.

[0165] The selected classes Ci,sel and/or the modulation function f_mod for the final estimation step 150-A and the intermediate estimation steps 140-A may be the same or different.

[0166] The first and second performance measurement functions may be identical or different.

[0167] The training method 10 according to the invention optionally includes a decision step following the estimation 154 of the second performance indicator IP2. The decision can be chosen from qualifying the trained neural network 15_ent for a subsequent inference phase or disqualifying the trained neural network 15_ent for a subsequent inference phase and optionally initializing 110 a new neural network 15 with a modified set of hyperparameters.

[0168] Optionally, the training method 10 according to the invention includes a step of enlarging (padding) each of the input matrices 51A of the training set 68, by adding dummy terms, for example null terms, around the periphery of the terms already present.

[0169] Typically, if the input data are images comprising N1*N2 pixels, the corresponding enlarged matrices are of dimension (N1+2d1)*(N2+2d2), with d1 dummy pixels being added on either side of the first dimension and d2 dummy pixels being added on either side of the second dimension of the image.

[0170] The enlargement step includes enlarging the respective label matrices 51B by assigning an additional dummy class to the terms added to the input matrices 51A. The enlargement step allows possible undesirable edge effects to be controlled.
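The enlargement step can be illustrated by the sketch below, which pads an input matrix with null terms and the respective label matrix with a dummy class. The class number -1 used for the dummy class, the function name `pad_matrix` and the example values are hypothetical choices made for the illustration.

```python
def pad_matrix(matrix, d1, d2, fill):
    """Surround `matrix` with d1 rows and d2 columns of `fill` on each side."""
    width = len(matrix[0]) + 2 * d2
    top = [[fill] * width for _ in range(d1)]
    bottom = [[fill] * width for _ in range(d1)]
    body = [[fill] * d2 + list(row) + [fill] * d2 for row in matrix]
    return top + body + bottom


DUMMY_CLASS = -1  # hypothetical class number reserved for the added terms

# Hypothetical 2x3 input matrix and its target label matrix.
image = [[5, 6, 7], [8, 9, 10]]
labels = [[0, 1, 1], [0, 0, 1]]
padded_image = pad_matrix(image, d1=1, d2=2, fill=0)
padded_labels = pad_matrix(labels, d1=1, d2=2, fill=DUMMY_CLASS)
```

With N1 = 2, N2 = 3, d1 = 1 and d2 = 2, the enlarged matrices have dimension (2+2)*(3+4), that is 4 rows and 7 columns.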

[0171] The invention also relates to a method 170 for classifying data by segmentation, which uses the trained neural network 15_ent obtained at the end of the training method 10 as described above to classify at least one item of matrix input data 45.

[0172] As seen in Figure 2, matrix data 45 is provided on the input layer 55-1 of the trained neural network 15_ent and a response label matrix 70 is obtained at the output 55-p of the trained neural network 15_ent.

[0173] The invention finally relates to a computer program product comprising software instructions which, when executed by a computer, implement the training method 10 or the classification method 170 as described above.

Claims

1. Method (10) for training a neural network (15) configured for classifying matrix input data (45), comprising:
a) the initialization (110) of a neural network (15_in),
b) the provision (120) of a training set (50) comprising a plurality of matrix pairs (51), each matrix pair comprising an input matrix (51A) comprising input data and a respective target label matrix (51B) of the same dimension as the input matrix (51A), each term of which is a class number (numj) representing a class (Ci) of a respective term of the input matrix (51A), said class (Ci) being selected from a set ({Ci}) of predetermined classes,
c) forming (130), from the training set (50), a training set (65), an intermediate validation set (66) and a final validation set (67),
d) iterative training (140) of the neural network using the training set (65) and the intermediate validation set (66), and
e) the final validation (150) of the trained neural network (15_ent) using the final validation set (67),
characterized in that a plurality of iterations of the iterative training and/or, respectively, the final validation include an estimation (140-A, 150-A) of a performance indicator (IP1, IP2) of the neural network (15), said estimation (140-A, 150-A) comprising:
i) the selection (141, 151) of at least one class (Ci,sel) from the set ({Ci}) of predetermined classes,
ii) for each selected class (Ci,sel), for each target label matrix (51B_Ent, 51B_VI, 51B_VF) of the training (65) and intermediate validation (66) sets, respectively of the final validation set (67), and for each respective estimated label matrix (51C_Ent, 51C_VI, 51C_VF) obtained with the current neural network in response to the respective input matrix (51A_Ent, 51A_VI, 51A_VF), the determination (142_Ent, 142_VI, 152_VF) of a target distance transformed matrix (52B_Ent, 52B_VI, 52B_VF), respectively an estimated distance transformed matrix (52C_Ent, 52C_VI, 52C_VF), each term of a target, respectively estimated, distance transformed matrix representing a distance (d_intra/Ci,sel) between the position of a corresponding term in the respective label matrix and the position of the closest term of said selected class (Ci,sel) in the respective label matrix,
iii) the application (143, 153) of a modulation function (f_mod) to each of the terms of the target distance transformed matrices (52B_Ent, 52B_VI, 52B_VF), respectively the estimated distance transformed matrices (52C_Ent, 52C_VI, 52C_VF), thus determined, to form target modulated matrices (53B_Ent, 53B_VI, 53B_VF), respectively estimated modulated matrices (53C_Ent, 53C_VI, 53C_VF), the modulation function (f_mod) being a function of the distance, and
iv) the estimation (144, 154) of the performance indicator (IP1, IP2) of the current neural network (15) from a performance measurement function taking as arguments said target and estimated modulated matrices.

2. A training method (10) according to claim 1, wherein the modulation function (f_mod) is a monotonic function of the distance.

3. A training method (10) according to claim 2, wherein the modulation function (f_mod) takes the value one when said distance (d_intra/Ci,sel) is equal to zero, and tends towards the value zero when said distance tends towards infinity.

4. A training method (10) according to claim 3, wherein the modulation function (f_mod) is chosen from an exponential function of the distance in which α represents a predetermined positive scaling factor and p represents a predetermined positive adjustment parameter, the function f_mod(distance) = 1 - tanh(distance), and the function f_mod(distance) = γ / (distance + γ), where γ represents a predetermined positive adjustment parameter.

5. Training method (10) according to any one of the preceding claims, wherein estimation iv) of the performance indicator (IP1, IP2) of the neural network (15) comprises calculating the intersection and union of each estimated modulated matrix (53C) and the respective target modulated matrix (53B) for each of said selected classes (Ci,sel).

6. Training method (10) according to any one of the preceding claims, wherein all classes (Ci) of the set ({Ci}) of predetermined classes are selected in step i).

7. Training method (10) according to any one of the preceding claims, wherein one or more of the performance indicators (IP1, IP2) thus estimated are used to make a decision on whether to continue the training iterations and / or qualify the neural network (15) for a subsequent inference phase and / or for the initialization a) of a new neural network (15) during a reiteration of the training method (10).

8. Training method (10) according to any one of the preceding claims, wherein the input matrices (51A) are previously obtained using a digital image sensor.

9. Data classification method (170) by segmentation using a neural network (15_ent) obtained at the end of the training method (10) according to any one of the preceding claims.

10. Computer program product comprising software instructions which, when executed by a computer, implement the training method (10) according to any one of claims 1 to 8.