A smart cockpit sun visor automatic adjusting system based on facial expression recognition

A smart cockpit system recognizes the driver's facial expression with the MTCNN and MobileNetV3_small networks and adjusts the sun visor automatically. This addresses the inaccurate sun visor adjustment of existing technologies and thereby improves driving safety and intelligence.

CN117261786B · Active · Publication Date: 2026-06-23 · CHINA UNIV OF MINING & TECH

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
CHINA UNIV OF MINING & TECH
Filing Date
2023-08-17
Publication Date
2026-06-23

AI Technical Summary

Technical Problem

Existing car sun visor adjustment systems cannot accurately identify whether a driver is being disturbed by strong sunlight, leading to reduced safety.

Method used

The system employs a smart cockpit system based on facial expression recognition. It utilizes a multi-task convolutional neural network (MTCNN) to detect facial regions in real time and combines it with a lightweight network, MobileNetV3_small, to recognize the driver's facial expressions and automatically adjust the sun visor to mitigate strong light interference.

Benefits of technology

It enables automatic recognition of the driver's facial expressions while driving and automatic adjustment of the sun visor to avoid driver distraction, thereby improving driving safety and intelligence.

✦ Generated by Eureka AI based on patent content.


Abstract

The application discloses a smart cockpit sun visor automatic adjustment system based on facial expression recognition, in which a multi-task convolutional neural network (MTCNN) and the lightweight network MobileNetV3 together perform the facial expression recognition for the system. MTCNN detects and crops the face region in real time and marks five feature points (the eyes, the nose, and the corners of the mouth) for face alignment. MobileNetV3 is a lightweight network suitable for in-vehicle use and recognizes the driver's facial expression. The application detects the face in real time with the MTCNN model, recognizes facial expressions during driving with the MobileNetV3_small model, and automatically adjusts the car sun visor in a timely manner, relieving the interference of strong light on the driver and achieving safe, intelligent driving.

Description

Technical Field

[0001] This invention relates to an automatic sun visor adjustment system, specifically an automatic sun visor adjustment system for a smart cockpit based on facial expression recognition, belonging to the field of facial recognition technology for automotive smart cockpits.

Background Technology

[0002] Currently, most car sun visors on the market are manually adjustable. When drivers are exposed to strong sunlight or other intense ambient light, they may temporarily lose their vision. They must then divert their attention to adjust the sun visor by hand, which greatly reduces driving safety and may even lead to traffic accidents.

[0003] Machine vision is an emerging technology for industrial intelligence. While its technological maturity still needs continuous improvement, this does not hinder its powerful productivity. The combination of machine vision and deep learning has led to an unprecedented increase in the level of intelligence in people's lives, and it boasts higher recognition accuracy than other methods in the field of facial recognition.

[0004] The "14th Five-Year Plan for the Development and Construction of the Automotive Industry" points out that "intelligent connected vehicles involve complex environmental perception, intelligent decision-making and control, human-machine interaction and co-driving, vehicle-road interaction, big data platforms, intelligent computing platforms, high-precision spatiotemporal reference services, and basic maps, among other forward-looking and common cross-disciplinary technologies." China's automotive industry is currently entering a strategic period of transformation and upgrading, moving from large to strong. Research on intelligent cockpits therefore provides an effective impetus for the rapid and healthy development of China's automotive electronics industry.

[0005] In existing technologies, such as the automatic sun visor adjustment system for automobiles based on image recognition disclosed in CN110435400A, the control unit reads the brightness value V-sensor of the pixel at the location of the light intensity sensor on the image captured by the camera, and the average brightness value V-eye of the area near the driver's eyes. It then calculates the actual light intensity at the driver's eyes, L-eye = (Lx * V-eye) / V-sensor, based on the proportional relationship, and adjusts the degree of shading of the shading element accordingly. However, in actual use, since each driver's tolerance to light intensity is different, simply monitoring the light intensity with a sensor cannot accurately reflect whether the light intensity affects the driver, and therefore cannot accurately identify whether the driver is disturbed by strong sunlight.

Summary of the Invention

[0006] The purpose of this invention is to provide an intelligent cockpit sun visor automatic adjustment system based on facial expression recognition to solve at least one of the above-mentioned technical problems. This system can identify whether the driver is being disturbed by strong sunlight and achieve safe driving by automatically adjusting the sun visor.

[0007] The present invention achieves the above objectives through the following technical solution: an intelligent cockpit sun visor automatic adjustment system based on facial expression recognition, comprising a camera module, a face detection module, and an expression recognition module, characterized in that: the camera module acquires an image and inputs it into the face detection module; the face image captured by the face detection module is input into the expression recognition module; the expression recognition module judges the driver's facial expression and generates a corresponding instruction based on the judgment result; the cockpit control unit receives the instruction to automatically adjust the car sun visor.

[0008] The camera module includes an industrial camera;

[0009] The face detection module includes a trained multi-task convolutional neural network (MTCNN);

[0010] The facial expression recognition module includes a trained MobileNetV3_small network.
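The data flow between the modules described above can be sketched as follows. This is a minimal illustration, not the patented implementation: `VisorController`, `run_pipeline`, the step size, and the angle limits are hypothetical names and values, and the two callables stand in for the trained MTCNN and MobileNetV3_small networks.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass
class VisorController:
    """Hypothetical stand-in for the cockpit control unit driving the visor."""
    angle_deg: float = 0.0   # 0 = fully stowed (assumed convention)
    step_deg: float = 5.0    # assumed adjustment step per instruction
    max_deg: float = 90.0    # assumed mechanical limit

    def lower(self) -> None:
        # Move the visor one step further down, clamped to its travel limit.
        self.angle_deg = min(self.angle_deg + self.step_deg, self.max_deg)


def run_pipeline(
    frames: Iterable[object],
    detect_face: Callable[[object], Optional[object]],
    is_glare_expression: Callable[[object], bool],
    visor: VisorController,
) -> float:
    """Process frames in order and return the final visor angle."""
    for frame in frames:
        face = detect_face(frame)          # face detection module
        if face is None:
            continue                       # no face found: issue no instruction
        if is_glare_expression(face):      # expression recognition module
            visor.lower()                  # instruction to the control unit
    return visor.angle_deg


# Usage with trivial stubs in place of the trained networks.
frames = ["squint", "squint", "neutral", None]
final_angle = run_pipeline(
    frames,
    detect_face=lambda frame: frame,                  # stub: frame doubles as face
    is_glare_expression=lambda face: face == "squint",
    visor=VisorController(),
)
print(final_angle)
```

Because the visor moves one step per frame classified as glare, repeated frames keep lowering it until the expression is no longer classified as disturbed.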

[0011] As a further aspect of the present invention, the training steps of the Multi-Task Convolutional Neural Network (MTCNN) are as follows:

[0012] 1) Select a certain number of images from the public dataset WIDER_FACE as the training set for MTCNN, and label the face bounding boxes and five feature points;

[0013] 2) Obtain training sets of images with three different resolutions: 12×12, 24×24, and 48×48 from the WIDER_FACE training set;

[0014] 3) Train the three cascaded networks P-Net, R-Net, and O-Net, each on the training set of the corresponding resolution.
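As a rough illustration of step 2), the sketch below resizes square face crops to the three input resolutions. It assumes the usual MTCNN pairing of 12×12 with P-Net, 24×24 with R-Net, and 48×48 with O-Net; the function names are hypothetical, images are plain 2-D lists of pixel values, and how the crops are sampled from WIDER_FACE is not shown.

```python
# Input side length per network (assumed pairing, following common MTCNN practice).
NET_INPUT_SIZE = {"P-Net": 12, "R-Net": 24, "O-Net": 48}


def resize_nearest(image, size):
    """Resize a 2-D list of pixels to size x size by nearest-neighbour sampling."""
    src_h, src_w = len(image), len(image[0])
    return [
        [image[r * src_h // size][c * src_w // size] for c in range(size)]
        for r in range(size)
    ]


def build_training_sets(face_crops):
    """Return one resized copy of every crop for each of the three networks."""
    return {
        net: [resize_nearest(crop, size) for crop in face_crops]
        for net, size in NET_INPUT_SIZE.items()
    }


# Usage: a single synthetic 96x96 crop.
crop = [[(r + c) % 256 for c in range(96)] for r in range(96)]
training_sets = build_training_sets([crop])
print({net: len(samples[0]) for net, samples in training_sets.items()})
```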

[0015] As a further aspect of the present invention: the networks are trained with a loss function that comprises the following parts:

[0016] 1) Face detection uses the cross-entropy loss function:

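[0017] $$L_i^{det} = -\left(y_i^{det}\log(p_i) + (1 - y_i^{det})\log(1 - p_i)\right)$$

[0018] where $p_i$ is the probability, predicted by the network, that sample $i$ is a face, and $y_i^{det} \in \{0, 1\}$ is the ground-truth label of the region, indicating whether a face is present (1 for yes, 0 for no).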
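The face detection loss above is a binary cross-entropy. A minimal sketch for a single sample follows; the function name and the clamping constant `eps` are illustrative choices, not taken from the patent.

```python
import math


def detection_loss(p, y, eps=1e-12):
    """Binary cross-entropy for one sample.

    p: predicted probability that the region is a face.
    y: ground-truth label, 1 for face and 0 for non-face.
    """
    p = min(max(p, eps), 1.0 - eps)   # clamp so that log() stays finite
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


# A confident correct prediction costs little; a confident wrong one costs a lot.
print(detection_loss(0.9, 1), detection_loss(0.1, 1))
```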

[0019] 2) Face bounding box regression uses the sum of squares loss function:

[0020] $$L_i^{box} = \left\| \hat{y}_i^{box} - y_i^{box} \right\|_2^2$$

[0021] where $\hat{y}_i^{box}$ are the bounding box coordinates predicted by the network and $y_i^{box}$ are the ground-truth bounding box coordinates. A face bounding box is represented by 4 coordinates.

[0022] 3) Facial feature point localization uses the sum-of-squares loss function:

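[0023] $$L_i^{landmark} = \left\| \hat{y}_i^{landmark} - y_i^{landmark} \right\|_2^2$$

[0024] where $\hat{y}_i^{landmark}$ are the keypoint coordinates predicted by the network and $y_i^{landmark}$ are the ground-truth keypoint locations. Since 5 facial keypoints are predicted, each with 2 coordinates, $y_i^{landmark}$ is a 10-tuple.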

[0025] 4) The three loss functions are weighted and summed to form the final training objective function:

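[0026] $$\min \sum_{i=1}^{N} \sum_{j \in \{det,\, box,\, landmark\}} \alpha_j \, \beta_i^j \, L_i^j$$

[0027] where $N$ is the number of training samples, $\alpha_j$ indicates the importance of task $j$, $\beta_i^j \in \{0, 1\}$ is the sample type indicator derived from the sample label, and $L_i^j$ is the corresponding loss function defined above. In P-Net and R-Net, $\alpha_{det} = 1$, $\alpha_{box} = 0.6$, $\alpha_{landmark} = 0.4$; in O-Net, $\alpha_{det} = 1$, $\alpha_{box} = 0.4$, $\alpha_{landmark} = 0.6$.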
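The weighted objective can be sketched as below, using the task weights stated above. It assumes, as in the usual MTCNN formulation, that each sample contributes only the losses for the tasks it is labelled for; here that indicator is modelled by whether a task key is present in the sample's dictionary. All names are hypothetical.

```python
def squared_error(pred, target):
    """Sum of squared differences, as used for the box and landmark losses."""
    return sum((a - b) ** 2 for a, b in zip(pred, target))


# Task weights per network, as given in the text.
TASK_WEIGHTS = {
    "P-Net": {"det": 1.0, "box": 0.6, "landmark": 0.4},
    "R-Net": {"det": 1.0, "box": 0.6, "landmark": 0.4},
    "O-Net": {"det": 1.0, "box": 0.4, "landmark": 0.6},
}


def joint_loss(samples, net):
    """Weighted sum of per-sample task losses.

    samples: list of dicts mapping a task name to that sample's loss value.
    A task missing from a dict is treated as not applicable (indicator = 0).
    """
    weights = TASK_WEIGHTS[net]
    total = 0.0
    for losses in samples:
        for task, alpha in weights.items():
            beta = 1.0 if task in losses else 0.0   # sample type indicator
            total += alpha * beta * losses.get(task, 0.0)
    return total


# Usage: one face sample with a box, one negative sample, one landmark sample.
box_loss = squared_error([0.1, 0.2, 0.9, 0.8], [0.1, 0.2, 0.9, 1.8])
samples = [{"det": 0.5, "box": box_loss}, {"det": 0.2}, {"landmark": 2.0}]
print(joint_loss(samples, "P-Net"), joint_loss(samples, "O-Net"))
```

The same batch is weighted differently per network: O-Net places more weight on landmarks, P-Net and R-Net on bounding boxes.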

[0028] As a further step in this invention, the specific training steps for the lightweight network MobileNetV3_small are as follows:

[0029] 1) In the car cabin environment, take a number of photos of the driver's facial expression when disturbed by light, as well as of other expressions, and divide them into a training set and a test set in a 7:3 ratio;

[0030] 2) Pre-train the MobileNetV3 network model on the publicly available dataset Labeled Faces in the Wild (LFW) to recognize faces;

[0031] 3) Use the collected dataset to fine-tune the pre-trained MobileNetV3_small network model to recognize facial expression changes while driving.
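The 7:3 division in step 1) can be sketched as a seeded shuffle followed by a cut. The function name and the fixed seed are illustrative, and the second subset is assumed here to be a held-out set for evaluation.

```python
import random


def split_dataset(items, train_fraction=0.7, seed=0):
    """Shuffle items reproducibly and split them into two subsets."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)     # seeded for a repeatable split
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Usage: ten image indices standing in for the collected cabin photos.
train, held_out = split_dataset(range(10))
print(len(train), len(held_out))
```

With few photos per expression, splitting each expression class separately would keep the class proportions equal in both subsets; the sketch does not do this.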

[0032] As a further step in this invention: the MobileNetV3_small network model uses depthwise separable convolutions instead of traditional convolutions, achieving a reduction in the number of parameters and a lightweight model. The ratio of its number of parameters is:

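[0033] $$\frac{D_k \cdot D_k \cdot M \cdot H \cdot W + M \cdot N \cdot H \cdot W}{D_k \cdot D_k \cdot M \cdot N \cdot H \cdot W} = \frac{1}{N} + \frac{1}{D_k^2}$$

[0034] where $D_k$ is the kernel size, $N$ is the number of kernels, $M$ is the number of input channels, and $H$ and $W$ are the height and width of the input image.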
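To make the saving concrete, the sketch below compares a standard convolution with a depthwise separable one using the usual cost counts, with the symbols defined above. The function names and the example layer sizes are illustrative assumptions, not values from the patent.

```python
def standard_conv_cost(dk, m, n, h, w):
    """Multiply-accumulate count of a standard dk x dk convolution."""
    return dk * dk * m * n * h * w


def depthwise_separable_cost(dk, m, n, h, w):
    """Cost of a depthwise dk x dk convolution followed by a 1x1 pointwise one."""
    depthwise = dk * dk * m * h * w   # one dk x dk filter per input channel
    pointwise = m * n * h * w         # 1x1 convolution mixing channels
    return depthwise + pointwise


def cost_ratio(dk, m, n, h, w):
    """Separable cost divided by standard cost; h and w cancel out."""
    return depthwise_separable_cost(dk, m, n, h, w) / standard_conv_cost(dk, m, n, h, w)


# Usage: 3x3 kernels, 16 input channels, 64 kernels, 56x56 input (example sizes).
ratio = cost_ratio(3, 16, 64, 56, 56)
print(ratio, 1 / 64 + 1 / 3 ** 2)
```

Since the spatial size cancels, the same ratio also applies to the weight counts, which is why it depends only on the number of kernels and the kernel size.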

[0035] The beneficial effects of this invention are:

[0036] This system can automatically recognize the driver's facial expressions while driving. When the driver's vision is obstructed by strong light, it automatically adjusts the sun visor to prevent the driver from taking their hands off the steering wheel, thus ensuring safe driving. Based on deep learning, this system uses a multi-task convolutional neural network (MTCNN) and a lightweight network, MobileNetV3. The MTCNN model detects faces in real time, while the MobileNetV3_small model recognizes facial expressions during driving and automatically adjusts the sun visor in a timely manner to reduce the interference of strong light on the driver, achieving safe and intelligent driving. It can achieve higher accuracy recognition than traditional image processing methods, bringing a better user driving experience. This system relies on the facial recognition system of the car's smart cockpit and does not require the installation of additional cameras or light intensity sensors, which can improve the intelligence level of the smart cockpit.

Attached Figure Description

[0037] Figure 1 is a schematic diagram of the actual product of the present invention.

[0038] Figure 2 is the overall flowchart of the present invention.

[0039] Figure 3 is a flowchart of the MTCNN algorithm of the present invention.

[0040] Figure 4 is a flowchart of the training process for the MobileNetV3_small model of the present invention.

Detailed Implementation

[0041] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative effort are within the scope of protection of the present invention.

[0042] Example 1: As shown in Figures 1 to 4, a smart cockpit sun visor automatic adjustment system based on facial expression recognition includes a camera module, a face detection module, and an expression recognition module. The camera module acquires images and inputs them into the face detection module. The face images captured by the face detection module are input into the expression recognition module. The expression recognition module judges the driver's facial expression and generates corresponding instructions based on the judgment results. The cockpit control unit receives the instructions to automatically adjust the car sun visor.

[0043] The camera module includes an industrial camera that can capture real-time images of the driver while the vehicle is in motion.

[0044] The face detection module includes a trained multi-task convolutional neural network (MTCNN), which can detect and capture the driver's facial image in real time.

[0045] The facial expression recognition module includes a trained MobileNetV3_small network, which is a lightweight neural network suitable for fast and accurate inference on mobile GPUs in in-vehicle conditions.

[0046] Example 2: In addition to all the technical features included in Example 1, this example also includes:

[0047] The specific training steps for a multi-task convolutional neural network (MTCNN) are as follows:

[0048] 1) Select a certain number of images from the public dataset WIDER_FACE as the training set for MTCNN, and label the face bounding boxes and five feature points;

[0049] 2) Obtain training sets of images with three different resolutions: 12×12, 24×24, and 48×48 from the WIDER_FACE training set;

[0050] 3) Train the three cascaded networks P-Net, R-Net, and O-Net, each on the training set of the corresponding resolution.

[0051] The networks are trained with a loss function that comprises the following components:

[0052] 1) Face detection uses the cross-entropy loss function:

[0053] $$L_i^{det} = -\left(y_i^{det}\log(p_i) + (1 - y_i^{det})\log(1 - p_i)\right)$$

[0054] where $p_i$ is the probability, predicted by the network, that sample $i$ is a face, and $y_i^{det} \in \{0, 1\}$ is the ground-truth label of the region, indicating whether a face is present (1 for yes, 0 for no).

[0055] 2) Face bounding box regression uses the sum of squares loss function:

[0056] $$L_i^{box} = \left\| \hat{y}_i^{box} - y_i^{box} \right\|_2^2$$

[0057] where $\hat{y}_i^{box}$ are the bounding box coordinates predicted by the network and $y_i^{box}$ are the ground-truth bounding box coordinates. A face bounding box is represented by 4 coordinates.

[0058] 3) Facial feature point localization uses the sum-of-squares loss function:

[0059] $$L_i^{landmark} = \left\| \hat{y}_i^{landmark} - y_i^{landmark} \right\|_2^2$$

[0060] where $\hat{y}_i^{landmark}$ are the keypoint coordinates predicted by the network and $y_i^{landmark}$ are the ground-truth keypoint locations. Since 5 facial keypoints are predicted, each with 2 coordinates, $y_i^{landmark}$ is a 10-tuple.

[0061] 4) The three loss functions are weighted and summed to form the final training objective function:

[0062] $$\min \sum_{i=1}^{N} \sum_{j \in \{det,\, box,\, landmark\}} \alpha_j \, \beta_i^j \, L_i^j$$

[0063] where $N$ is the number of training samples, $\alpha_j$ indicates the importance of task $j$, $\beta_i^j \in \{0, 1\}$ is the sample type indicator derived from the sample label, and $L_i^j$ is the corresponding loss function defined above. In P-Net and R-Net, $\alpha_{det} = 1$, $\alpha_{box} = 0.6$, $\alpha_{landmark} = 0.4$; in O-Net, $\alpha_{det} = 1$, $\alpha_{box} = 0.4$, $\alpha_{landmark} = 0.6$.

[0064] Example 3: In addition to all the technical features included in Example 1, this example also includes:

[0065] The specific training steps for the lightweight network MobileNetV3_small are as follows:

[0066] 1) In the car cabin environment, take a number of photos of the driver's facial expression when disturbed by light, as well as of other expressions, and divide them into a training set and a test set in a 7:3 ratio;

[0067] 2) Pre-train the MobileNetV3 network model on the publicly available dataset Labeled Faces in the Wild (LFW) to recognize faces;

[0068] 3) Use the collected dataset to fine-tune the pre-trained MobileNetV3_small network model to recognize facial expression changes while driving.

[0069] The MobileNetV3_small network model uses depthwise separable convolutions instead of traditional convolutions, achieving a reduction in parameters and a lightweight model. Its parameter ratio is:

[0070] $$\frac{D_k \cdot D_k \cdot M \cdot H \cdot W + M \cdot N \cdot H \cdot W}{D_k \cdot D_k \cdot M \cdot N \cdot H \cdot W} = \frac{1}{N} + \frac{1}{D_k^2}$$

[0071] where $D_k$ is the kernel size, $N$ is the number of kernels, $M$ is the number of input channels, and $H$ and $W$ are the height and width of the input image.

[0072] Example 4: An automatic adjustment system for a smart cockpit sun visor based on facial expression recognition. The adjustment method of this system includes the following steps:

[0073] Step A: First, turn on the vehicle camera to obtain real-time footage of the driver driving;

[0074] Step B: Preprocess the image by using histogram equalization to enhance the image.

[0075] Step C: Input the processed image into the trained multi-task convolutional neural network (MTCNN). MTCNN can detect and extract the face region in the image and mark the five feature points of the face.

[0076] Step D: Use OpenCV to rotate and align the face; then input the aligned face image into the MobileNetV3_small model to recognize facial expressions. If it is determined that there is no interference from strong light, then there is no need to adjust the sun visor. If it is determined that there is interference from strong light, then the sun visor angle is automatically adjusted until the driver's facial expression returns to normal.
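The histogram equalization in Step B can be sketched in plain Python for an 8-bit grayscale image stored as a 2-D list. This illustrates the technique only and is not the patent's implementation; in practice a library routine such as OpenCV's `cv2.equalizeHist` would typically be used, and the function name below is hypothetical.

```python
def equalize_histogram(image):
    """Histogram-equalize an 8-bit grayscale image given as a 2-D list of ints."""
    flat = [px for row in image for px in row]

    # Histogram of the 256 possible intensity levels.
    hist = [0] * 256
    for px in flat:
        hist[px] += 1

    # Cumulative distribution of the intensities.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)

    cdf_min = next(c for c in cdf if c > 0)   # CDF at the darkest level present
    total = len(flat)
    if total == cdf_min:                      # constant image: nothing to stretch
        return [row[:] for row in image]

    # Lookup table spreading the occupied levels over the full 0..255 range.
    lut = [max(0, round((c - cdf_min) / (total - cdf_min) * 255)) for c in cdf]
    return [[lut[px] for px in row] for row in image]


# Usage: a low-contrast 2x2 patch is stretched to the full range.
patch = [[100, 100], [110, 120]]
equalized = equalize_histogram(patch)
print(equalized)
```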

[0077] A multi-task convolutional neural network (MTCNN) and a lightweight network, MobileNetV3, are used together to form a smart cockpit automatic sun visor adjustment system for facial expression recognition. MTCNN is used to detect and crop the face region in real time and mark five feature points—eyes, nose, and corners of the mouth—for face alignment. MobileNetV3 is a lightweight network applicable to in-vehicle systems used to recognize the driver's facial expressions. While driving, when drivers are exposed to strong sunlight, they may exhibit unpleasant facial expressions such as constricted pupils, squinting, frowning, and pursing lips.
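Face alignment from the eye landmarks can be sketched as computing the tilt of the line through the two eyes and rotating by the opposite angle. Only the geometry on landmark coordinates is shown, with hypothetical function names; the same rotation would then be applied to the image pixels, for example with OpenCV's `cv2.getRotationMatrix2D` and `cv2.warpAffine`, whose angle sign convention should be checked against the OpenCV documentation.

```python
import math


def eye_tilt_deg(left_eye, right_eye):
    """Angle of the line through the eyes relative to the image x-axis, in degrees."""
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    return math.degrees(math.atan2(dy, dx))


def rotate_point(point, center, angle_deg):
    """Rotate a point about a center by angle_deg in the image coordinate frame."""
    theta = math.radians(angle_deg)
    x, y = point[0] - center[0], point[1] - center[1]
    return (
        center[0] + x * math.cos(theta) - y * math.sin(theta),
        center[1] + x * math.sin(theta) + y * math.cos(theta),
    )


# Usage: level the eyes by rotating about their midpoint by the negative tilt.
left_eye, right_eye = (30.0, 40.0), (70.0, 60.0)
center = ((left_eye[0] + right_eye[0]) / 2, (left_eye[1] + right_eye[1]) / 2)
tilt = eye_tilt_deg(left_eye, right_eye)
aligned_left = rotate_point(left_eye, center, -tilt)
aligned_right = rotate_point(right_eye, center, -tilt)
print(tilt, aligned_left, aligned_right)
```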

[0078] It will be apparent to those skilled in the art that the present invention is not limited to the details of the exemplary embodiments described above, and that the invention can be implemented in other specific forms without departing from its spirit or essential characteristics. Therefore, the embodiments should be considered in all respects as exemplary and non-limiting, and the scope of the invention is defined by the appended claims rather than the foregoing description. Thus, all variations falling within the meaning and scope of equivalents of the claims are intended to be included within the present invention. No reference numerals in the claims should be construed as limiting the scope of the claims.

[0079] Furthermore, it should be understood that although this specification describes embodiments, not every embodiment contains only one independent technical solution. This narrative style is merely for clarity. Those skilled in the art should consider the specification as a whole, and the technical solutions in each embodiment can also be appropriately combined to form other embodiments that can be understood by those skilled in the art.

Claims

1. A smart cockpit sun visor automatic adjustment system based on facial expression recognition, comprising a camera module, a face detection module, and an expression recognition module, characterized in that:
The camera module acquires images and sends them to the face detection module; the face image captured by the face detection module is input to the expression recognition module; the expression recognition module judges the driver's facial expression and generates corresponding instructions based on the judgment results; the cockpit control unit receives the instructions to automatically adjust the car's sun visor;
The camera module includes an industrial camera;
The face detection module includes a pre-trained multi-task convolutional neural network;
The facial expression recognition module includes a pre-trained MobileNetV3_small network;
The specific training steps for the multi-task convolutional neural network are as follows:
1) Select images from the public dataset WIDER_FACE as the training set for MTCNN, and annotate the face bounding boxes and five feature points;
2) Obtain training sets of images with three different resolutions: 12×12, 24×24, and 48×48 from the WIDER_FACE training set;
3) Train the three cascaded networks P-Net, R-Net, and O-Net, each on the training set of the corresponding resolution;
The networks are trained with a loss function that comprises the following components:
1) Face detection uses the cross-entropy loss function: $L_i^{det} = -\left(y_i^{det}\log(p_i) + (1 - y_i^{det})\log(1 - p_i)\right)$; where $p_i$ is the probability that a face is present and $y_i^{det}$ is the ground-truth label of the region, i.e. whether a face exists, 1 for yes and 0 for no;
2) Face bounding box regression uses the sum of squares loss function: $L_i^{box} = \left\| \hat{y}_i^{box} - y_i^{box} \right\|_2^2$; where $\hat{y}_i^{box}$ are the bounding box coordinates predicted by the network and $y_i^{box}$ are the ground-truth bounding box coordinates; a face bounding box is represented by 4 coordinates;
3) Facial feature point localization uses the sum-of-squares loss function: $L_i^{landmark} = \left\| \hat{y}_i^{landmark} - y_i^{landmark} \right\|_2^2$; where $\hat{y}_i^{landmark}$ are the predicted results and $y_i^{landmark}$ are the ground-truth keypoint locations;
4) The three loss functions above are combined in a weighted sum to form the final training objective function: $\min \sum_{i=1}^{N} \sum_{j \in \{det,\, box,\, landmark\}} \alpha_j \, \beta_i^j \, L_i^j$; where $N$ is the number of training samples, $\alpha_j$ indicates the importance of the task, $\beta_i^j \in \{0, 1\}$ is the sample type indicator derived from the sample label, and $L_i^j$ is the corresponding loss function above; in P-Net and R-Net, $\alpha_{det} = 1$, $\alpha_{box} = 0.6$, $\alpha_{landmark} = 0.4$; in O-Net, $\alpha_{det} = 1$, $\alpha_{box} = 0.4$, $\alpha_{landmark} = 0.6$;
The specific training steps for the MobileNetV3_small network are as follows:
1) In the car cabin environment, take a number of photos of the driver's facial expression when disturbed by light, as well as of other expressions, and divide them into a training set and a test set in a 7:3 ratio;
2) Pre-train the MobileNetV3 network model on the publicly available dataset Labeled Faces in the Wild to recognize faces;
3) Use the collected dataset to fine-tune the pre-trained MobileNetV3_small network model to recognize facial expression changes while driving.