Robust neural network learning system

By introducing intermediate concepts, data, and feature constraints into deep neural network training, the problems of adversarial attacks and training data dependence are solved, enabling more robust and interpretable neural network training.

CN116894471B · Active · Publication Date: 2026-06-23 · GM GLOBAL TECHNOLOGY OPERATIONS LLC

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
GM GLOBAL TECHNOLOGY OPERATIONS LLC
Filing Date
2022-10-20
Publication Date
2026-06-23

AI Technical Summary

Technical Problem

Deep neural networks are vulnerable to adversarial attacks, and training them requires a large number of labeled images. Furthermore, existing technologies struggle to withstand domain changes and to generalize to scenarios not encountered during training.

Method used

The neural network is trained by using intermediate concept constraints, data constraints, and feature constraints, including defining relevant information for individual components and objects of interest, training to identify adversarial features using perturbation data, and combining intrinsic and non-intrinsic image parameters from sensor data.

Benefits of technology

It improves the robustness of neural networks, enabling them to resist adversarial attacks, reduces dependence on training data, and enhances their generalization ability and interpretability across different domains.

✦ Generated by Eureka AI based on patent content.


Abstract

A system comprising a computer comprising a processor and a memory. The memory comprises instructions causing the processor to be programmed to: receive an intermediate concept constraint at a neural network; and train the neural network with training data, training labels, and at least one of a data constraint, a feature constraint, or the intermediate concept constraint.

Description

Technical Field

[0001] This disclosure relates to a robust neural network learning system that uses intermediate concept constraints and reasoning during training.

Background Technology

[0002] Deep neural networks (DNNs) can be used to perform many image understanding tasks, including classification, segmentation, and annotation. Typically, DNNs require a large number of training images (tens of thousands to millions). Furthermore, these training images usually need to be labeled (e.g., marked) for training and prediction purposes.

[0003] In addition, conventional DNNs may be vulnerable to adversarial attacks, in which noisy inputs cause the DNN to perform abnormally, such as generating inaccurate predictions and / or classifications.

Summary of the Invention

[0004] A system includes a computer comprising a processor and a memory. The memory includes instructions that program the processor to: receive intermediate concept constraints at a neural network, and train the neural network using training data, training labels, and at least one of data constraints, feature constraints, or intermediate concept constraints.

[0005] In other aspects, the neural network is trained using intermediate concept constraints, which include at least one concept parameter that defines individual components and relevant information about the object of interest.

[0006] In other aspects, data constraints are used to train the neural network, where the data constraints include perturbation data, so that the neural network is trained to identify adversarial features.

[0007] In other aspects, feature constraints are used to train the neural network, where the feature constraints include style parameters corresponding to the input data.

[0008] In other aspects, the feature constraints include at least one of the following: intrinsic image parameters corresponding to sensor data, non-intrinsic image parameters corresponding to sensor data, or physical sensor characteristics of the sensor that captures the sensor data.

[0009] In other aspects, the processor is further programmed to receive training data and training labels.

[0010] In other aspects, the training data includes images depicting objects located within the field of view of the vehicle's sensors.

[0011] In other aspects, neural networks include deep neural networks.

[0012] In other aspects, deep neural networks include at least one of convolutional neural networks or generative adversarial neural networks.

[0013] In terms of other features, the neural network is trained using at least two of the following: data constraints, feature constraints, or intermediate concept constraints.

[0014] A method comprising receiving intermediate concept constraints at a neural network, and training the neural network using training data, training labels, and at least one of data constraints, feature constraints, or intermediate concept constraints.

[0015] In other aspects, the method includes training the neural network using intermediate concept constraints, which include at least one concept parameter that defines individual components and relevant information about the object of interest.

[0016] In other aspects, the method includes training the neural network using data constraints, where the data constraints include perturbation data, so that the neural network is trained to identify adversarial features.

[0017] In other aspects, the method includes training the neural network using feature constraints, where the feature constraints include style parameters corresponding to the input data.

[0018] In other aspects, the feature constraints include at least one of the following: intrinsic image parameters corresponding to sensor data, non-intrinsic image parameters corresponding to sensor data, or physical sensor characteristics of the sensor that captures the sensor data.

[0019] Other features of the method include receiving training data and training labels.

[0020] In other aspects, the training data includes images depicting objects located within the field of view of the vehicle's sensors.

[0021] In other aspects, neural networks include deep neural networks.

[0022] In other aspects, deep neural networks include at least one of convolutional neural networks or generative adversarial neural networks.

[0023] In other aspects, the method includes training the neural network using at least two of data constraints, feature constraints, or intermediate concept constraints.

[0024] Further applicability will become apparent from the description provided herein. It should be understood that the description and specific examples are intended for illustrative purposes only and are not intended to limit the scope of this disclosure.

[0025] In addition, the present invention also includes the following solutions.

[0026] Option 1. A system comprising a computer, the computer including a processor and a memory, the memory including instructions that program the processor to:

[0027] At the neural network, at least one of data constraints, feature constraints, or intermediate concept constraints is received; and

[0028] The neural network is trained using training data, training labels, and at least one of the data constraints, feature constraints, or intermediate concept constraints.

[0029] Option 2. The system according to Option 1, wherein the neural network is trained using the intermediate concept constraints, the intermediate concept constraints including at least one concept parameter defining individual components and relevant information relating to the object of interest.

[0030] Option 3. The system according to Option 1, wherein the neural network is trained using the data constraints, wherein the data constraints include perturbation data, such that the neural network is trained to identify adversarial features.

[0031] Option 4. The system according to Option 1, wherein the neural network is trained using the feature constraints, wherein the feature constraints include style parameters corresponding to the input data.

[0032] Option 5. The system according to Option 4, wherein the feature constraints include at least one of intrinsic image parameters corresponding to sensor data, non-intrinsic image parameters corresponding to the sensor data, or physical sensor characteristics of the sensor that captures the sensor data.

[0033] Option 6. The system according to Option 1, wherein the processor is further programmed to receive the training data and the training labels.

[0034] Option 7. The system according to Option 1, wherein the training data includes images depicting objects located within the field of view of the vehicle sensors.

[0035] Option 8. The system according to Option 1, wherein the neural network includes a deep neural network.

[0036] Option 9. The system according to Option 8, wherein the deep neural network includes at least one of a convolutional neural network or a generative adversarial neural network.

[0037] Option 10. The system according to Option 1, wherein the neural network is trained using at least two of the data constraints, the feature constraints, or the intermediate concept constraints.

[0038] Option 11. A method comprising:

[0039] At the neural network, at least one of data constraints, feature constraints, or intermediate concept constraints is received; and

[0040] The neural network is trained using training data, training labels, and at least one of the data constraints, feature constraints, or intermediate concept constraints.

[0041] Option 12. The method according to Option 11 further includes training the neural network using the intermediate concept constraints, the intermediate concept constraints including at least one concept parameter defining individual components and relevant information relating to the object of interest.

[0042] Option 13. The method according to Option 11 further includes training the neural network using the data constraints, wherein the data constraints include perturbation data, such that the neural network is trained to identify adversarial features.

[0043] Option 14. The method according to Option 11 further includes training the neural network using the feature constraints, wherein the feature constraints include style parameters corresponding to the input data.

[0044] Option 15. The method according to Option 14, wherein the feature constraint includes at least one of intrinsic image parameters corresponding to sensor data, non-intrinsic image parameters corresponding to the sensor data, or physical sensor characteristics of the sensor that captures the sensor data.

[0045] Option 16. The method according to Option 11 further includes receiving the training data and the training labels.

[0046] Option 17. The method according to Option 11, wherein the training data includes images depicting objects located within the field of view of the vehicle sensors.

[0047] Option 18. The method according to Option 11, wherein the neural network includes a deep neural network.

[0048] Option 19. The method according to Option 18, wherein the deep neural network includes at least one of a convolutional neural network or a generative adversarial neural network.

[0049] Option 20. The method according to Option 11 further includes training the neural network using at least two of the data constraints, the feature constraints, or the intermediate concept constraints.

Attached Figure Description

[0050] The accompanying drawings described herein are for illustrative purposes only and are not intended to limit the scope of this disclosure in any way. In the drawings:

[0051] Figure 1 is a block diagram of an example system including a vehicle;

[0052] Figure 2 is a block diagram of an example server within the system;

[0053] Figure 3 is a block diagram of an example computing device;

[0054] Figure 4 is a diagram of an example neural network;

[0055] Figures 5A to 5C are block diagrams illustrating an example process for training one or more neural networks;

[0056] Figure 6 is a block diagram illustrating a neural network arranged in an encoder-decoder architecture; and

[0057] Figure 7 is a flowchart illustrating an example process for training a neural network.

Detailed Implementation

[0058] The following description is exemplary in nature and is not intended to limit this disclosure, application or use.

[0059] This disclosure describes one or more implementations of a data and knowledge-driven learning architecture. As discussed herein, neural networks can be trained using intermediate concept constraints in addition to conventional training data. Intermediate concept constraints can be based on human knowledge that provides additional context about the training data. For example, intermediate concept constraints may include additional definitions and / or relationships relating to the objects depicted within the training data. A neural network trained using intermediate concept constraints can be resistant to adversarial attacks, insensitive to domain changes, able to generalize to situations not seen during training, and interpretable.

[0060] Figure 1 is a block diagram of an example vehicle system 100. System 100 includes a vehicle 105, which is a land-based vehicle, such as a car or truck. Vehicle 105 includes a computer 110, vehicle sensors 115, actuators 120 for actuating various vehicle components 125, and a vehicle communication module 130. The communication module 130 allows the computer 110 to communicate with a server 145 via a network 135.

[0061] Computer 110 can operate vehicle 105 in autonomous mode, semi-autonomous mode, or non-autonomous (manual) mode. For the purposes of this disclosure, autonomous mode is defined as a mode in which each of the propulsion, braking, and steering of vehicle 105 is controlled by computer 110; in semi-autonomous mode, computer 110 controls one or both of the propulsion, braking, and steering of vehicle 105; and in non-autonomous mode, a human operator controls each of the propulsion, braking, and steering of vehicle 105.
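The three modes defined above depend only on how many of propulsion, braking, and steering the computer controls. A minimal sketch of that rule follows; the names `Mode`, `FUNCTIONS`, and `driving_mode` are hypothetical and do not appear in the patent.

```python
from enum import Enum


class Mode(Enum):
    AUTONOMOUS = "autonomous"
    SEMI_AUTONOMOUS = "semi-autonomous"
    NON_AUTONOMOUS = "non-autonomous"


# The three functions named in paragraph [0061].
FUNCTIONS = frozenset({"propulsion", "braking", "steering"})


def driving_mode(computer_controlled):
    """Classify the mode from the set of functions the computer controls."""
    controlled = set(computer_controlled)
    unknown = controlled - FUNCTIONS
    if unknown:
        raise ValueError(f"unknown vehicle functions: {sorted(unknown)}")
    if controlled == FUNCTIONS:
        return Mode.AUTONOMOUS       # computer controls all three
    if not controlled:
        return Mode.NON_AUTONOMOUS   # human operator controls all three
    return Mode.SEMI_AUTONOMOUS      # computer controls one or two


print(driving_mode({"braking", "steering"}).value)
```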

[0062] Computer 110 may include programs for: operating vehicle 105 braking, propulsion (e.g., controlling vehicle acceleration by controlling one or more of an internal combustion engine, electric motor, hybrid engine, etc.), steering, air conditioning control, interior and / or exterior lighting, etc., and determining whether and when computer 110 (relative to a human operator) will control such operations. Additionally, computer 110 may be programmed to determine whether and when a human operator will control such operations.

[0063] Computer 110 may include, or be communicatively coupled to (e.g., via the vehicle 105 communication module 130, as further described below), more than one processor, such as processors included in electronic control units (ECUs) in vehicle 105 for monitoring and / or controlling various vehicle components 125 (e.g., powertrain controller, brake controller, steering controller, etc.). Further, computer 110 may communicate with a navigation system using a Global Positioning System (GPS) via vehicle 105 communication module 130. As an example, computer 110 may request and receive location data of vehicle 105. The location data may be in a known form, such as geographic coordinates (latitude and longitude coordinates).

[0064] Computer 110 is typically arranged to communicate with vehicle 105 communication module 130 and also with wired and / or wireless networks within vehicle 105 (such as buses in vehicle 105, such as controller area network (CAN) and / or other wired and / or wireless mechanisms).

[0065] Via the vehicle 105 communication network, the computer 110 can transmit messages to and / or receive messages from various devices within the vehicle 105, such as vehicle sensors 115, actuators 120, vehicle components 125, human-machine interfaces (HMIs), etc. Alternatively or additionally, where the computer 110 actually comprises multiple devices, the vehicle 105 communication network can be used for communication between devices represented herein as computer 110. Further, as mentioned below, various controllers and / or vehicle sensors 115 can provide data to the computer 110. The vehicle 105 communication network may include one or more gateway modules (such as protocol translators, impedance matchers, rate converters, etc.) that provide interoperability between various networks and devices within the vehicle 105.

[0066] Vehicle sensors 115 may include a variety of devices known for providing data to computer 110. For example, vehicle sensors 115 may include one or more light detection and ranging (LiDAR) sensors 115 disposed on the top of vehicle 105, behind the windshield of vehicle 105, around vehicle 105, etc., providing information about the relative position, size, and shape of objects and / or the situation around vehicle 105. As another example, one or more radar sensors 115 fixed to the bumper of vehicle 105 may provide data for measuring the velocity of an object (potentially including a second vehicle 106) relative to vehicle 105. Vehicle sensors 115 may further include one or more camera sensors 115 (e.g., front view, side view, rear view, etc.) to provide images from the inner and / or outer fields of view of vehicle 105.

[0067] The actuator 120 of vehicle 105 is implemented via circuitry, chips, motors, or other electronic and / or mechanical components capable of actuating various vehicle subsystems according to known appropriate control signals. The actuator 120 can be used to control components 125, including braking, acceleration, and steering of vehicle 105.

[0068] Within the scope of this disclosure, vehicle component 125 is one or more hardware components adapted to perform mechanical or electromechanical functions or operations, such as moving vehicle 105, decelerating or stopping vehicle 105, steering vehicle 105, etc. Non-limiting examples of component 125 include propulsion components (including, for example, internal combustion engines and / or electric motors), transmission components, steering components (e.g., may include one or more of a steering wheel, steering tie rods, etc.), braking components (as described below), parking assist components, adaptive cruise control components, adaptive steering components, movable seats, etc.

[0069] Additionally, computer 110 can be configured to communicate with devices outside vehicle 105 via vehicle-to-vehicle communication module or interface 130, such as with another vehicle via vehicle-to-vehicle (V2V) or vehicle-to-infrastructure (V2X) wireless communication, and with a remote server 145 (typically via network 135). Module 130 may include one or more mechanisms by which computer 110 can communicate, including wireless (e.g., cellular, wireless, satellite, microwave, and radio frequency) communication mechanisms, any desired combination thereof, and any desired network topology (or topologies when using multiple communication mechanisms). Exemplary communications provided via module 130 include any desired combination of cellular, Bluetooth®, IEEE 802.11, Dedicated Short Range Communication (DSRC), and / or Wide Area Network (WAN) (including the Internet) to provide data communication services.

[0070] Network 135 can be one or more of a variety of wired or wireless communication mechanisms, including wired (e.g., cable and fiber optic) and / or wireless (e.g., cellular, wireless, satellite, microwave, and radio frequency) communication mechanisms and any desired network topology (or any desired combination of topologies when using multiple communication mechanisms). Exemplary communication networks include wireless communication networks (e.g., using Bluetooth, Bluetooth Low Energy (BLE), IEEE 802.11, vehicle-to-vehicle (V2V) (such as Dedicated Short Range Communication (DSRC)), etc.), local area networks (LANs), and / or wide area networks (WANs) (including the Internet) to provide data communication services.

[0071] Computer 110 can receive and analyze data from sensor 115 substantially continuously, periodically, and / or as instructed by server 145, etc. Furthermore, object classification or recognition techniques can be used in computer 110, for example, based on lidar sensor 115, camera sensor 115, etc., to identify object types (e.g., vehicles, people, rocks, dents, bicycles, motorcycles, etc.) and object physical characteristics.

[0072] Figure 2 shows an example server 145 comprising a data and knowledge neural network training system 205. As shown, the data and knowledge neural network training system 205 may include a neural network module 210, a neural network training module 215, and a storage module 220.

[0073] As just mentioned, the data and knowledge neural network training system 205 may include a neural network module 210. Specifically, the neural network module 210 may manage, maintain, train, implement, utilize, or communicate with one or more neural networks. For example, the neural network module 210 may communicate with the storage module 220 to access neural networks, such as neural network 400 stored in the database 225. Additionally, the data and knowledge neural network training system 205 may communicate with the neural network training module 215 to train and implement neural networks to classify digital images or generate predictions for other feasible domains.

[0074] As described in more detail herein, the neural network training module 215 can train and implement the neural network using training data and at least one of data constraints, feature constraints, or intermediate concept constraints.

[0075] Intermediate concept constraints may include one or more concept parameters. Concept parameters may include indications conveying the concept to be learned from one or more digital media items (e.g., digital images, digital audio files, etc.). A concept can refer to an opinion, adjective, verb, noun, abstract concept, and / or any other learnable information. In some instances, a concept may include human-provided indications relating to the object of interest.

[0076] For example, the concept of a "vehicle" can be learned from one or more images of the vehicle, from one or more descriptions of the vehicle, and / or from other information related to the vehicle. Concept parameters may include wheels, body, doors, windows, vehicle color, etc., which provide specific concepts (e.g., physical characteristics) of the vehicle. In some example implementations, training data and training labels may be provided to train a neural network to encode features corresponding to individual concepts of the vehicle (i.e., details and related location information) such as doors, door positions, wheels, wheel positions, windows, window positions, etc. In other words, intermediate concept constraints can be used to train the neural network to identify the vehicle based on a deduced structured concept (i.e., the individual components of vehicle 105 and the relationships between these components) obtained from images and concept representations.

[0077] Data constraints, or perturbation data constraints, may include perturbation data (i.e., adversarial data) that enable neural networks to "learn" the concept of transformation invariance present in human perception. In this context, neural networks can be trained to identify adversarial features introduced into sensor data and ignore adversarial features for classification and / or detection purposes.
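The patent does not specify how perturbation data is produced. As one illustrative possibility, the sketch below uses a fast-gradient-sign style perturbation on a toy logistic model and appends the perturbed copies, with their original labels, to the training set. The technique choice, the model, and all names (`perturb`, `augment`, `epsilon`) are assumptions made for illustration only.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(x, weights, bias):
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def loss(x, y, weights, bias):
    """Binary cross-entropy for one sample."""
    p = predict(x, weights, bias)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


def perturb(x, y, weights, bias, epsilon=0.1):
    """Shift each input by +/- epsilon in the direction that increases the loss."""
    p = predict(x, weights, bias)
    grad_x = [(p - y) * w for w in weights]  # d(loss)/d(x_i) for this model
    sign = lambda g: (g > 0) - (g < 0)
    return [xi + epsilon * sign(g) for xi, g in zip(x, grad_x)]


def augment(dataset, weights, bias, epsilon=0.1):
    """Return the dataset plus perturbed copies that keep the original labels."""
    extra = [(perturb(x, y, weights, bias, epsilon), y) for x, y in dataset]
    return dataset + extra


weights, bias = [2.0, -1.0], 0.0
clean = [([1.0, 1.0], 1)]
print(augment(clean, weights, bias))
```

Training on the augmented set asks the model to give the same label to a sample and its perturbed copy, which is one way to encourage the invariance the paragraph describes.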

[0078] Feature constraints may include style parameters corresponding to the input data. For example, feature constraints may include physical parameters, such as intrinsic image parameters corresponding to the sensor data, non-intrinsic image parameters corresponding to the sensor data, and / or physical characteristics of the sensor 115 that generates the sensor data (such as lens type, sensor hardware, etc.).
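The three constraint types described above (data, feature, and intermediate concept) could be grouped into one container that checks the "at least one of" and "at least two of" conditions from the summary. This is a hypothetical sketch; the class `TrainingConstraints` and its field contents are illustrative and are not defined in the patent.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TrainingConstraints:
    data: Optional[list] = None      # e.g. perturbed samples
    feature: Optional[dict] = None   # e.g. {"lens_type": "wide-angle"}
    concept: Optional[dict] = None   # e.g. {"vehicle": ["wheels", "doors"]}

    def active(self):
        """Names of the constraint types that were supplied."""
        names = ("data", "feature", "concept")
        return [n for n in names if getattr(self, n) is not None]

    def validate(self, minimum=1):
        """Raise if fewer than `minimum` constraint types are present."""
        if len(self.active()) < minimum:
            raise ValueError(
                f"need at least {minimum} constraint type(s), got {self.active()}"
            )


constraints = TrainingConstraints(concept={"vehicle": ["wheels", "doors"]})
constraints.validate()        # passes: one constraint type is present
print(constraints.active())
```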

[0079] Using at least one of data constraints, feature constraints, or intermediate concept constraints, the neural network training module 215 can determine gradient losses associated with classification labels for multiple neurons within the neural network.

[0080] Figure 3 shows an example computing device 300, i.e., the computer 110 and / or one or more servers 145, which can be configured to perform one or more of the processes described herein. As shown, the computing device may include a processor 305, a memory 310, a storage device 315, an I / O interface 320, and a communication interface 325. Furthermore, the computing device 300 may include input devices such as a touchscreen, a mouse, a keyboard, etc. In some embodiments, the computing device 300 may include fewer or more components than those shown in Figure 3.

[0081] In a particular implementation, processor(s) 305 includes hardware for executing instructions, such as those constituting a computer program. By way of example and not limitation, in order to execute instructions, processor(s) 305 may retrieve (or read) instructions from internal registers, internal caches, memory 310, or storage device 315, and decode and execute those instructions.

[0082] Computing device 300 includes memory 310 coupled to processor(s) 305. Memory 310 can be used to store data, metadata, and programs to be executed by processor(s). Memory 310 may include one or more of volatile and non-volatile memory, such as random access memory (“RAM”), read-only memory (“ROM”), solid-state drive (“SSD”), flash memory, phase-change memory (“PCM”), or other types of data storage devices. Memory 310 may be internal or distributed memory.

[0083] Computing device 300 includes storage device 315 for storing data or instructions. By way of example and not limitation, storage device 315 may include the non-transitory storage media described above. Storage device 315 may include hard disk drive (HDD), flash memory, universal serial bus (USB) drive, or combinations thereof or other storage devices.

[0084] The computing device 300 also includes one or more input or output (“I / O”) devices / interfaces 320, which are provided to allow a user to provide input (such as user strokes) to the computing device 300 and otherwise transfer data to and from the computing device 300. These I / O devices / interfaces 320 may include a mouse, keypad or keyboard, touchscreen, camera, optical scanner, network interface, modem, other known I / O devices, or combinations of such I / O devices / interfaces 320. The touchscreen may be activated using a writing device or a finger.

[0085] I / O device / interface 320 may include one or more devices for presenting output to a user, including but not limited to a graphics engine, a display (e.g., a screen), one or more output drivers (e.g., display drivers), one or more audio speakers, and one or more audio drivers. In some embodiments, device / interface 320 is configured to provide graphical data to the display for presentation to a user. The graphical data may represent one or more graphical user interfaces and / or any other graphical content that may be used in a particular embodiment.

[0086] The computing device 300 may further include a communication interface 325. The communication interface 325 may include hardware, software, or both. The communication interface 325 may provide one or more interfaces for communication (such as, for example, packet-based communication) between the computing device and one or more other computing devices 300 or one or more networks. By way of example and not limitation, the communication interface 325 may include a network interface controller (NIC) or network adapter for communicating with Ethernet or other wired networks, or a wireless NIC (WNIC) or wireless adapter for communicating with wireless networks (such as Wi-Fi). The computing device 300 may further include a bus 330. The bus 330 may include hardware, software, or both for coupling components of the computing device 300 to each other.

[0087] Figure 4 is a diagram of an example deep neural network (DNN) 400 that may be used herein. The DNN 400 includes multiple nodes 405, arranged such that the DNN 400 includes an input layer 410, one or more hidden layers 415, and an output layer 420. Each layer in the DNN 400 may include multiple nodes 405. Although Figure 4 shows three (3) hidden layers 415, it should be understood that the DNN 400 may include additional or fewer hidden layers. The input layer 410 and the output layer 420 may also include more than one (1) node 405.

[0088] Nodes 405 are sometimes referred to as artificial neurons because they are designed to mimic biological (e.g., human) neurons. Each input in the set of inputs to a node 405 (indicated by arrows) is multiplied by its respective weight. The weighted inputs are then summed in an input function to provide a net input (possibly adjusted by a bias). This net input is then fed to an activation function, which in turn provides the output to the connected nodes 405. The activation function can be one of several suitable functions, typically chosen based on empirical analysis. As indicated by the arrows in Figure 4, the output of a node 405 can then be included in the set of inputs to one or more neurons 405 in the next layer.
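The node computation described above (weighted sum, optional bias, activation) can be sketched in a few lines. The function names and the choice of `tanh` as the default activation are assumptions; the patent leaves the activation function open.

```python
import math


def node_output(inputs, weights, bias=0.0, activation=math.tanh):
    """Weighted sum of the inputs plus bias, passed through an activation."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    net_input = sum(x * w for x, w in zip(inputs, weights)) + bias
    return activation(net_input)


def layer_output(inputs, weight_rows, biases, activation=math.tanh):
    """Apply node_output once per node; the result feeds the next layer."""
    return [
        node_output(inputs, weights, bias, activation)
        for weights, bias in zip(weight_rows, biases)
    ]


hidden = layer_output([1.0, 2.0], [[0.5, -0.25], [0.1, 0.1]], [0.0, 0.0])
output = layer_output(hidden, [[1.0, 1.0]], [0.0])
print(output)
```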

[0089] The DNN 400 can be trained to take data as input and generate output based on that input. In one example, the DNN 400 can be trained using ground truth data (i.e., data about real-world conditions or states). For example, the DNN 400 can be trained using ground truth data or updated by a processor using additional data. The weights can be initialized, for example, using a Gaussian distribution, and the bias for each node 405 can be set to zero. Training the DNN 400 can include updating the weights and biases via appropriate techniques, such as optimized backpropagation. Ground truth data can include, but is not limited to, data specifying objects within an image or data specifying physical parameters (e.g., angle, velocity, distance, color, hue, or the angle of an object relative to other objects). For example, ground truth data could be data representing objects and object labels.
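As a rough illustration of the initialization and update steps described above, the sketch below draws weights from a Gaussian distribution, sets the bias to zero, and updates both by gradient descent on a single sigmoid node. A single node is the simplest case of backpropagation; the toy data set, learning rate, and stopping threshold are all assumed values chosen for the example.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_params(n_inputs, std=0.1, seed=0):
    """Gaussian-initialized weights and a zero bias."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, std) for _ in range(n_inputs)], 0.0


def forward(x, weights, bias):
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def mean_loss(data, weights, bias):
    """Mean binary cross-entropy over the ground truth pairs."""
    eps = 1e-12
    total = 0.0
    for x, y in data:
        p = forward(x, weights, bias)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(data)


def train(data, lr=0.5, max_epochs=5000, target_loss=0.05, seed=0):
    weights, bias = init_params(len(data[0][0]), seed=seed)
    for _ in range(max_epochs):
        grad_w, grad_b = [0.0] * len(weights), 0.0
        for x, y in data:
            err = forward(x, weights, bias) - y  # d(loss)/d(net input)
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
            grad_b += err
        weights = [w - lr * g / len(data) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(data)
        if mean_loss(data, weights, bias) < target_loss:
            break  # stop once the loss is low enough
    return weights, bias


# Toy ground truth: output 1 only when both inputs are 1.
data = [([0.0, 0.0], 0), ([0.0, 1.0], 0), ([1.0, 0.0], 0), ([1.0, 1.0], 1)]
weights, bias = train(data)
print([round(forward(x, weights, bias), 2) for x, _ in data])
```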

[0090] Machine learning services such as those based on recurrent neural networks (RNNs), convolutional neural networks (CNNs), generative adversarial networks (GANs), long short-term memory (LSTM) neural networks, or gated recurrent units (GRUs) can be implemented using the DNN 400 described in this disclosure.

[0091] It should be understood that DNN 400 may include an encoder-decoder architecture (see Figure 6). For example, a DNN 400 may include one or more encoders that generate an encoded representation of the received input. This encoded representation may be referred to as a latent embedding layer, from which one or more decoders generate an estimated reconstruction of the data.
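To make the encoder-decoder idea concrete, the toy sketch below uses fixed, hand-written mappings rather than learned ones: the encoder compresses four values into a two-value latent representation, and the decoder produces an estimated reconstruction. The specific mappings and function names are assumptions for illustration; a real network would learn both.

```python
def encode(x):
    """Compress 4 values into a 2-value latent representation (pairwise means)."""
    if len(x) != 4:
        raise ValueError("this toy encoder expects exactly 4 inputs")
    return [(x[0] + x[1]) / 2.0, (x[2] + x[3]) / 2.0]


def decode(z):
    """Estimate the original 4 values from the 2-value latent representation."""
    return [z[0], z[0], z[1], z[1]]


def reconstruction_error(x):
    """Mean squared error between the input and its reconstruction."""
    x_hat = decode(encode(x))
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


sample = [0.0, 2.0, 4.0, 4.0]
print(encode(sample), decode(encode(sample)), reconstruction_error(sample))
```

The reconstruction error is the quantity a training procedure would typically drive down when the mappings are learned instead of fixed.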

[0092] Figures 5A and 5B show an example procedure for training a DNN 400 according to one or more embodiments of this disclosure. As shown in Figure 5A, during the initial training phase, the DNN 400 receives training data 505, training labels 510, and constraints 515, such as one or more of data constraints, feature constraints, or intermediate concept constraints. Training data 505 may include images depicting objects located within the field of view (FOV) of the vehicle sensor 115. Training labels 510 may include object labels, object type labels, domain type, and / or the distance of the object relative to the image source. Constraints 515 may include perturbation data, concept parameters, and / or physical parameters (such as physical parameters of sensor 115). It is envisioned that multiple types of constraints 515 can be used to train the DNN 400.

[0093] For example, a first constraint 515 may include perturbation data, while a second constraint 515 may include concept parameters of the perturbation data. In this example, an image of a stop sign may be perturbed such that some DNNs would classify the object as a speed limit sign because of the perturbation. The perturbed image of the stop sign may be provided to the DNN 400 for training along with concept parameters (such as red color, octagonal shape, etc.) that define the concept corresponding to the stop sign.
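
One way to picture the pairing of a perturbed image with its concept parameters is the following sketch. The `ConstrainedSample` container, the `perturb` helper, the mismatch-based `concept_penalty`, and the weighted sum in `total_loss` are all hypothetical constructs introduced here for illustration; this disclosure does not specify how the constraints enter the training objective.

```python
from dataclasses import dataclass, field


@dataclass
class ConstrainedSample:
    """Hypothetical training sample: perturbed image, label, concept parameters."""
    image: list                                   # pixel values in [0, 1]
    label: str                                    # training label, e.g. "stop_sign"
    concepts: dict = field(default_factory=dict)  # e.g. {"color": "red"}


def perturb(image, delta):
    """Add a per-pixel perturbation and clamp the result to [0, 1]."""
    return [min(1.0, max(0.0, p + d)) for p, d in zip(image, delta)]


def concept_penalty(predicted, required):
    """Fraction of required concept parameters that the prediction misses."""
    if not required:
        return 0.0
    wrong = sum(1 for key, value in required.items() if predicted.get(key) != value)
    return wrong / len(required)


def total_loss(task_loss, penalty, weight=1.0):
    """Assumed objective: task loss plus a weighted concept penalty."""
    return task_loss + weight * penalty


sample = ConstrainedSample(
    image=perturb([0.2, 0.95], [0.1, 0.1]),
    label="stop_sign",
    concepts={"color": "red", "shape": "octagon"},
)
# A prediction that gets the color right but the shape wrong is penalized.
penalty = concept_penalty({"color": "red", "shape": "circle"}, sample.concepts)
loss = total_loss(task_loss=0.2, penalty=penalty, weight=2.0)
```

Under this assumed objective, a network that labels the perturbed image as a speed limit sign while its concept outputs disagree with "red" and "octagon" incurs an additional penalty beyond the ordinary classification loss.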

[0094] Following the initial training phase, during the supervised training phase, a set of N training data points 520 is input into the DNN 400. The DNN 400 generates output data for each of the N training data points 520. Figure 5B shows an example of generating an output based on the N training data points 520 (e.g., unlabeled training images). Based on the initial training, the DNN 400 outputs a vector representation 525 of the output data, such as a latent representation of the training data. The vector representation 525 is compared with the ground truth data 530.

[0095] The DNN 400 updates its network parameters based on the comparison with the ground truth data 530, which may include intermediate concept constraints. For example, network parameters (e.g., weights associated with neurons) may be updated via backpropagation. The DNN 400 may be trained at server 145 and provided to vehicle 105 via communication network 135. Vehicle 105 may also provide server 145 with data captured by the systems of vehicle 105 for further training purposes.

[0096] The process can occur multiple times. For example, the process can continue until the desired accuracy or the desired loss convergence is achieved. Once training is complete, the DNN 400 can be provided to the vehicle 105. The computer 110 can employ the DNN 400 to use images captured by the sensor 115 to perform object classification and / or object recognition. For example, as shown in Figure 5C, sensor data 535 is received at the DNN 400. Based on sensor data 535, the DNN 400 generates output 540, which may include, but is not limited to, object recognition, object classification, etc.

[0097] Using object classification and / or object recognition, computer 110 can operate the vehicle based on one or more vehicle operation protocols, such as transitioning from autonomous operation mode to semi-autonomous operation mode, changing vehicle speed and / or vehicle direction of travel, etc.

[0098] Figure 6 shows an example environment 600 including a DNN 400. As illustrated, the DNN 400 includes an encoder 605 and a decoder 610. In the example implementation, the encoder 605 includes a variational autoencoder (VAE) neural network that receives one or more images as input and encodes the images into a latent representation space (e.g., latent features). The encoder 605 may be implemented as one or more hidden convolutional layers and a fully connected output layer. The encoded representation may be referred to as a latent representation or latent vector 615.

[0099] Decoder 610 may receive latent representation 615 and generate a reconstructed image based on latent representation 615. Decoder 610 may include a VAE decoder, which includes a fully connected input layer and one or more hidden deconvolution layers.
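
The encoder-to-latent-to-decoder data flow can be sketched as follows. This is a simplified illustration that replaces the convolutional and deconvolution layers with single linear maps; the function names and weight matrices are hypothetical. The sampling step and the KL term follow the standard VAE formulation, which this disclosure does not spell out.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def encode(x, w_mu, w_logvar):
    """Map an input to the mean and log-variance of its latent distribution."""
    return matvec(w_mu, x), matvec(w_logvar, x)


def sample_latent(mu, logvar, rng):
    """Reparameterization: z = mu + sigma * eps, with eps drawn from N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, logvar)]


def decode(z, w_dec):
    """Map a latent vector back to a reconstruction of the input."""
    return matvec(w_dec, z)


def kl_divergence(mu, logvar):
    """KL divergence between N(mu, sigma^2) and the standard normal prior."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, logvar))


rng = random.Random(0)
x = [1.0, 2.0]
mu, logvar = encode(x, [[1.0, 0.0], [0.0, 1.0]], [[0.0, 0.0], [0.0, 0.0]])
z = sample_latent(mu, logvar, rng)                 # latent vector
reconstruction = decode(z, [[1.0, 0.0], [0.0, 1.0]])
```

In a full VAE the training loss would typically combine a reconstruction error between `x` and `reconstruction` with the KL term; how the constraints of this disclosure are added to that loss is left open here.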

[0100] As shown, during the initial training phase, one or more of data constraints 620, feature constraints 625, or intermediate concept constraints 630 are provided to the DNN 400 for training purposes. As discussed above, the training data for data constraints 620, feature constraints 625, and / or intermediate concept constraints 630 may further include training data and training labels corresponding to perturbation data, concept parameters, and / or style parameters.

[0101] Figure 7 is a flowchart of an example process 700 for training a DNN 400 according to the techniques described herein. The boxes of process 700 can be executed by server 145. Process 700 begins at box 705, where training data, training labels, and data constraints, feature constraints, and / or intermediate concept constraints are received.

[0102] At box 710, the DNN 400 generates an encoded representation of the training data using the training labels. At box 715, the DNN 400 generates reconstructed data based on the encoded representation and the data constraints, feature constraints, and / or intermediate concept constraints. At box 720, it is determined whether a current period equals a predetermined period value. The predetermined period value may include the number of iterations that the DNN 400 is to perform during training using the training data, training labels, and intermediate concept constraints. If the current period equals the predetermined period value, process 700 ends. Otherwise, the current period is incremented at box 725, and process 700 returns to box 710.
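
The control flow of process 700 can be sketched as a simple loop. The sketch assumes that the period counter starts at 1 and that, after the increment at box 725, control returns to box 710; the callables `encode_step` and `reconstruct_step` are hypothetical stand-ins for the work done at boxes 710 and 715.

```python
def run_training_periods(encode_step, reconstruct_step, max_periods):
    """Run boxes 710 to 725 until the current period equals max_periods."""
    if max_periods < 1:
        raise ValueError("max_periods must be at least 1")
    period = 1                                 # assumed starting value
    history = []
    while True:
        encoded = encode_step()                # box 710: encoded representation
        history.append(reconstruct_step(encoded))  # box 715: reconstructed data
        if period == max_periods:              # box 720: compare with threshold
            break                              # process 700 ends
        period += 1                            # box 725: increment the period
    return period, history


calls = []


def encode_step():
    calls.append("710")
    return len(calls)


def reconstruct_step(encoded):
    calls.append("715")
    return encoded


final_period, history = run_training_periods(encode_step, reconstruct_step, 3)
```

With these assumptions, a predetermined period value of 3 executes boxes 710 and 715 exactly three times before the process ends.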

[0103] The description in this disclosure is exemplary in nature only, and variations thereof without departing from the spirit and scope of this disclosure are intended to fall within its scope. Such variations are not considered to depart from the spirit and scope of this disclosure.

[0104] Generally, the described computing systems and / or devices may employ any of a variety of computer operating systems, including but not limited to versions and / or variations of the Microsoft Automotive® operating system, the Microsoft Windows® operating system, the Unix operating system (e.g., the Solaris® operating system released by Oracle Corporation of Redwood Shores, California), the AIX UNIX operating system released by International Business Machines Corporation of Armonk, New York, the Linux operating system, the Mac OSX and iOS operating systems released by Apple Inc. of Cupertino, California, the BlackBerry OS released by BlackBerry Limited of Waterloo, Canada, the Android operating system developed by Google and the Open Handset Alliance, or the QNX® automotive infotainment platform provided by QNX Software Systems. Examples of computing devices include, but are not limited to, in-vehicle computers, computer workstations, servers, desktop computers, laptops, portable computers, or handheld computers, or some other computing systems and / or devices.

[0105] Computers and computing devices typically include computer-executable instructions, which can be executed by one or more computing devices (such as those listed above). Computer-executable instructions can be compiled or interpreted from computer programs created using a variety of programming languages and / or technologies, including but not limited to, either alone or in combination, Java™, C, C++, Matlab, Simulink, Stateflow, Visual Basic, JavaScript, Perl, HTML, etc. Some of these applications can be compiled and executed on virtual machines (such as the Java Virtual Machine, Dalvik Virtual Machine, etc.). Generally, a processor (e.g., a microprocessor) receives instructions from memory, computer-readable media, etc., and executes those instructions to perform one or more processes, including one or more of the processes described herein. Such instructions and other data can be stored and transferred using a variety of computer-readable media. Files in computing devices are typically collections of data stored on computer-readable media (such as storage media, random access memory, etc.).

[0106] Memory may include computer-readable media (also known as processor-readable media), which includes any non-transitory (e.g., tangible) medium involved in providing data (e.g., instructions) that can be read by a computer (e.g., by the computer's processor). Such media can take many forms, including but not limited to non-volatile and volatile media. Non-volatile media may include, for example, optical discs or magnetic disks, and other persistent storage. Volatile media may include, for example, dynamic random access memory (DRAM), which typically constitutes main memory. Such instructions may be transmitted by one or more transmission media, including coaxial cables, copper wires, and optical fibers, including the wiring that makes up a system bus coupled to a processor of an ECU. Common forms of computer-readable media include, for example, floppy disks, flexible disks, hard disks, magnetic tape, any other magnetic media, CD-ROMs, DVDs, any other optical media, punched cards, paper tape, any other physical media with patterns of holes, RAM, PROM, EPROM, FLASH EEPROM, any other memory chip or cartridge, or any other computer-readable medium.

[0107] The databases, data repositories, or other data storage devices described herein can include various mechanisms for storing, accessing, and retrieving various types of data, including hierarchical databases, sets of files in file systems, application databases in proprietary formats, relational database management systems (RDBMS), etc. Each such data storage device is typically contained within a computing device employing a computer operating system (such as those mentioned above) and is accessed via a network in any one or more of a variety of ways. File systems are accessible from the computer operating system and can include files stored in various formats. An RDBMS typically employs the Structured Query Language (SQL) in addition to a language for creating, storing, editing, and executing stored procedures, such as the PL/SQL language.

[0108] In some examples, system elements may be implemented as computer-readable instructions (e.g., software) on one or more computing devices (e.g., servers, personal computers, etc.) and may be stored on associated computer-readable media (e.g., disks, storage, etc.). Computer program products may include such instructions stored on computer-readable media for performing the functions described herein.

[0109] In this application, including the definitions below, the term "module" or the term "controller" may be replaced by the term "circuit." The term "module" may refer to, be part of, or include: application-specific integrated circuits (ASICs); digital discrete circuits, analog discrete circuits, or mixed analog / digital discrete circuits; digital integrated circuits, analog integrated circuits, or mixed analog / digital integrated circuits; combinational logic circuits; field-programmable gate arrays (FPGAs); processor circuits (shared, dedicated, or grouped) that execute code; memory circuits (shared, dedicated, or grouped) that store code executed by the processor circuits; other suitable hardware components that provide the described functionality; or combinations of some or all of the above, such as in a system-on-a-chip.

[0110] A module may include one or more interface circuits. In some examples, the interface circuits may include wired or wireless interfaces that connect to a local area network (LAN), the Internet, a wide area network (WAN), or a combination thereof. The functionality of any given module disclosed herein may be distributed among multiple modules connected via the interface circuits. For example, multiple modules may allow load balancing. In a further example, a server module (also referred to as a remote or cloud module) may perform some functions on behalf of a client module.

[0111] Regarding the media, processes, systems, methods, inferences, etc., described herein, it should be understood that although the steps of such processes are described as occurring according to an ordered sequence, such processes can be practiced using steps performed in an order different from that described herein. Furthermore, it should be understood that some steps may be performed simultaneously, other steps may be added, or some steps described herein may be omitted. In other words, the description of processes herein is provided only for illustrative purposes and should not be construed in any way as limiting the scope of the claims.

[0112] Accordingly, it should be understood that the foregoing description is intended to be illustrative rather than restrictive. Many embodiments and applications beyond the provided examples will be apparent to those skilled in the art upon reading the foregoing description. The scope of the invention should not be determined by reference to the foregoing description, but rather by reference to the appended claims together with their equivalents. Future developments are anticipated and expected to occur within the technical field discussed herein, and the disclosed systems and methods will be incorporated into such future embodiments. In summary, it should be understood that the invention is capable of modifications and variations and is limited only by the following claims.

[0113] All terms used in the claims are intended to be given their ordinary and common meaning as understood by one of those skilled in the art, unless otherwise expressly indicated herein. In particular, the use of singular articles (such as “an,” “a,” “the,” etc.) should be regarded as one or more of the indicated elements, unless the claims set forth an explicit limitation to the contrary.

Claims

1. A system comprising a computer, the computer including a processor and a memory, the memory including instructions that program the processor to: receive, at a neural network, data constraints, feature constraints, and intermediate concept constraints, wherein the data constraints include perturbation images of objects, the feature constraints include physical characteristics of sensors, and the intermediate concept constraints are based on human knowledge providing additional context about the training data and include additional definitions and / or relationships related to the objects depicted within the training data; and train the neural network using the training data, training labels, the data constraints, the feature constraints, and the intermediate concept constraints.

2. The system of claim 1, wherein the intermediate concept constraint further includes at least one concept parameter that defines individual components and related information concerning the object of interest.

3. The system of claim 1, wherein the neural network is trained using the data constraints such that the neural network is trained to identify adversarial features.

4. The system according to claim 1, wherein the feature constraint further includes a style parameter corresponding to the input data.

5. The system of claim 4, wherein the feature constraint further includes at least one of intrinsic image parameters corresponding to the sensor data or non-intrinsic image parameters corresponding to the sensor data.

6. The system of claim 1, wherein the processor is further programmed to receive the training data and the training labels.

7. The system of claim 1, wherein the training data comprises images depicting the object located within the field of view of the sensor.

8. The system of claim 1, wherein the neural network comprises a deep neural network.

9. The system of claim 8, wherein the deep neural network comprises at least one of a convolutional neural network or a generative adversarial neural network.

10. A method comprising: receiving, at a neural network, data constraints, feature constraints, and intermediate concept constraints, wherein the data constraints include a perturbation image of an object, the feature constraints include physical characteristics of a sensor, and the intermediate concept constraints are based on human knowledge that provides additional context about the training data and include additional definitions and / or relationships related to the objects depicted within the training data; and training the neural network using training data, training labels, the data constraints, the feature constraints, and the intermediate concept constraints.

11. The method of claim 10, wherein the intermediate concept constraints further include at least one concept parameter that defines individual components and related information about the object of interest.

12. The method of claim 10, further comprising training the neural network using the data constraints such that the neural network is trained to identify adversarial features.

13. The method of claim 10, wherein the feature constraint further includes a style parameter corresponding to the input data.

14. The method of claim 13, wherein the feature constraint further includes at least one of intrinsic image parameters corresponding to the sensor data or non-intrinsic image parameters corresponding to the sensor data.

15. The method of claim 10, further comprising receiving the training data and the training labels.

16. The method of claim 10, wherein the training data comprises an image depicting the object located within the field of view of the sensor.

17. The method of claim 10, wherein the neural network comprises a deep neural network.

18. The method of claim 17, wherein the deep neural network comprises at least one of a convolutional neural network or a generative adversarial neural network.