A method and system for image emotion recognition based on machine vision

CN121789262B · Active · Publication Date: 2026-06-19 · ANHUI MEDICAL UNIV

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
ANHUI MEDICAL UNIV
Filing Date
2025-12-29
Publication Date
2026-06-19


Abstract

This invention relates to the field of image recognition technology and discloses a machine vision-based image emotion recognition method and system. The method includes the following steps: Step S101, determining the outer canthus point; Step S102, determining the orbitozygomatic connected region; Step S103, calculating the orbitozygomatic isoperimetric bulging index; Step S104, generating a disk image; Step S105, constructing a feature vector using scalar modulation; Step S106, outputting the emotion prediction result. The invention obtains the monocular yaw angle and the outer canthus point through ellipse fitting of the eye region, and locks the orbitozygomatic connected region by combining the lower eyelid margin curve with the zygomatic superior margin curvature ridge line, which effectively eliminates the projection distortion caused by head yaw. Combined with neural network feature modulation and angular statistics extraction, emotion-related features are extracted stably, avoiding confusion between posture changes and emotion features.

Description

Technical Field

[0001] This invention relates to the field of image recognition technology, and more specifically, to an image emotion recognition method and system based on machine vision.

Background Technology

[0002] Image emotion recognition technology is widely used in fields such as human-computer interaction, intelligent monitoring, and psychological assessment. Its core is to achieve recognition by extracting facial features and associating them with emotional states. Traditional emotion recognition methods are mostly based on frontal facial images, relying on complete and symmetrical facial contours, the layout of facial features, and other global features, or the complete shape of local features such as the corners of the eyes and mouth. The design logic is highly adapted to frontal shooting scenarios, and has high requirements for the completeness and symmetry of features.

[0003] However, in practical applications, users frequently turn their heads naturally, and side profiles are extremely common. When the head turns, key facial expression areas exhibit significant projection distortion: facial features that were intact in the frontal view are compressed, making it difficult for traditional methods to accurately locate core feature points such as the outer canthus. The outer canthus, a crucial landmark connecting the eye and the cheek, directly affects the subsequent segmentation of emotion-related regions. In side profiles, this point is easily confused with the surrounding skin texture or is mislocated because of viewpoint occlusion. Furthermore, in side profiles, key emotion-related regions such as the orbitozygomatic region undergo geometric changes, with the apparent degree of bulging stretched or compressed by the viewpoint tilt. Traditional region segmentation algorithms are not optimized for these dynamic changes; they fail to separate the target regions and eliminate irrelevant interference, which makes it difficult to extract stable emotion-related features.

[0004] Furthermore, traditional feature extraction processes do not consider the projection effects caused by yaw angles and lack targeted distortion correction mechanisms. This easily leads to confusion between posture changes and emotional changes, so the extracted features are insufficiently robust and cannot accurately reflect the true emotional state. These problems cause a significant drop in the accuracy of traditional methods in profile scenes. Yet profile scenes are frequently encountered in real-world scenarios such as intelligent monitoring, psychological assessment, and human-computer interaction, making it difficult for traditional methods to meet the demand for accurate facial emotion recognition under different postures.

Summary of the Invention

[0005] This invention provides a machine vision-based image emotion recognition method and system, which solves the technical problems mentioned in the background.

[0006] This invention provides a machine vision-based image emotion recognition method, comprising the following steps:

[0007] Step S101: Extract the eye fissure edge point set within the candidate eye region, obtain the major and minor axis parameters of the ellipse through geometric fitting, obtain the estimated value of the monocular yaw angle based on this, and determine the outer canthus point.

[0008] Step S102: Extract the lower eyelid margin curve and the zygomatic superior margin curvature ridge line. Based on the outer canthus point, extract the area enclosed by the lower eyelid margin curve and the zygomatic superior margin curvature ridge line to obtain the orbitozygomatic connected region.

[0009] Step S103: Extract the area and boundary arc length of the orbitozygomatic connected region and calculate the isoperimetric quotient. Use the estimated monocular yaw angle to apply a projection correction to the isoperimetric quotient, obtaining the orbitozygomatic isoperimetric bulging index.

[0010] Step S104: Extract the boundary distance field and centroid of the orbitozygomatic connected region, and monotonically map the pixels in the region to the unit disk coordinate system to generate a disk image.

[0011] Step S105: Input the disk image into the neural network, use the orbitozygomatic isoperimetric bulging index to perform scalar modulation on an intermediate feature map of the network, extract the angular statistics obtained by accumulating the modulated feature map along the radial direction, and concatenate the angular statistics with the orbitozygomatic isoperimetric bulging index to form a feature vector.

[0012] Step S106: Perform linear mapping on the feature vector to obtain the log odds of each emotion category, and output the emotion prediction results and probability distribution after normalization calculation.
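The isoperimetric quotient used in Step S103 can be illustrated with a short sketch. It assumes the standard definition Q = 4πA / L², where A is the enclosed area and L the boundary length, and it approximates the region boundary by a closed polygon. The function name and the polygon input are illustrative assumptions; the yaw-based projection correction is not reproduced here because its formula is not given in this passage.

```python
import math

def isoperimetric_quotient(polygon):
    """Return 4*pi*A / L**2 for a closed polygon given as [(x, y), ...].

    The value is 1 for a circle and smaller for less compact shapes.
    """
    n = len(polygon)
    twice_area = 0.0
    perimeter = 0.0
    for i in range(n):
        x0, y0 = polygon[i]
        x1, y1 = polygon[(i + 1) % n]          # wrap around to close the boundary
        twice_area += x0 * y1 - x1 * y0        # shoelace formula term
        perimeter += math.hypot(x1 - x0, y1 - y0)
    area = abs(twice_area) / 2.0
    return 4.0 * math.pi * area / perimeter ** 2

# Hypothetical boundary: a 2 x 2 square, whose quotient is pi / 4.
square = [(0, 0), (2, 0), (2, 2), (0, 2)]
print(isoperimetric_quotient(square))
```

A region whose outline bulges toward a circular shape yields a quotient closer to 1, which is presumably why the quantity serves as a bulging measure.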

[0013] This invention provides an image emotion recognition system based on machine vision, comprising:

[0014] The outer canthus point determination module extracts the eye fissure edge point set within the candidate eye region, obtains the major and minor axis parameters of the ellipse through geometric fitting, obtains the monocular yaw angle estimate based on this, and determines the outer canthus point.

[0015] The orbitozygomatic connectivity region determination module extracts the lower eyelid margin curve and the ridge line of curvature of the upper zygomatic margin. Based on the outer canthus point, it extracts the region enclosed by the lower eyelid margin curve and the ridge line of curvature of the upper zygomatic margin to obtain the orbitozygomatic connectivity region.

[0016] The orbitozygomatic isoperimetric bulging index calculation module extracts the area and boundary arc length of the orbitozygomatic connected region and calculates the isoperimetric quotient. The isoperimetric quotient is then projection-corrected using the estimated monocular yaw angle to obtain the orbitozygomatic isoperimetric bulging index.

[0017] The disk image generation module extracts the boundary distance field and centroid of the orbitozygomatic connected region, and monotonically maps the pixels in the region to the unit disk coordinate system to generate a disk image.

[0018] The feature extraction module inputs the disk image into the neural network, uses the orbitozygomatic isoperimetric bulging index to perform scalar modulation on an intermediate feature map of the network, extracts the angular statistics obtained by accumulating the modulated feature map along the radial direction, and concatenates the angular statistics with the orbitozygomatic isoperimetric bulging index to form a feature vector.

[0019] The emotion prediction module performs linear mapping on the feature vectors to obtain the log odds of each emotion category, and outputs the emotion prediction results and probability distribution after normalization.

[0020] The beneficial effects of this invention are as follows: the invention obtains the monocular yaw angle and the outer canthus point through ellipse fitting of the eye region, and locks the orbitozygomatic connected region by combining the lower eyelid margin curve with the zygomatic superior margin curvature ridge line. The orbitozygomatic isoperimetric bulging index is then obtained after yaw angle correction, which effectively eliminates the projection distortion caused by head yaw. In addition, mapping the orbitozygomatic region to a unit disk image, combined with neural network feature modulation and angular statistics extraction, achieves stable extraction of emotion-related features and avoids confusion between posture changes and emotional features. The invention does not rely on frontal facial images and significantly improves the accuracy and robustness of emotion recognition in side-view scenarios. It adapts to the diverse head postures found in practical applications such as human-computer interaction and intelligent monitoring, meeting the need for accurate emotion recognition in different scenarios.

Attached Figure Description

[0021] Figure 1 is a flowchart of an image emotion recognition method based on machine vision according to the present invention;

[0022] Figure 2 is a schematic diagram of an image emotion recognition system based on machine vision according to the present invention;

[0023] Figure 3 is a schematic diagram of the eye region projection geometry model and yaw angle calculation of the present invention;

[0024] Figure 4 is a schematic diagram of the double ridge line locking of the orbitozygomatic region of the present invention;

[0025] Figure 5 is a schematic diagram of the isoperimetric bulging index modulation and disk mapping of the present invention.

[0026] In the figures: 201, outer canthus point determination module; 202, orbitozygomatic connected region determination module; 203, orbitozygomatic isoperimetric bulging index calculation module; 204, disk image generation module; 205, feature extraction module; 206, emotion prediction module.

Detailed Implementation

[0027] The subject matter described herein will now be discussed with reference to exemplary embodiments. It should be understood that these embodiments are discussed only to enable those skilled in the art to better understand and implement the subject matter described herein, and changes may be made to the function and arrangement of the elements discussed without departing from the scope of this specification. Various processes or components may be omitted, substituted, or added as needed in the examples. Furthermore, features described in some examples may be combined in other examples.

[0028] It should be noted that, unless otherwise defined, the technical or scientific terms used in one or more embodiments of the present invention should have the ordinary meaning understood by one of ordinary skill in the art to which this invention pertains. The terms "first," "second," and similar terms used in one or more embodiments of the present invention do not indicate any order, quantity, or importance, but are merely used to distinguish different components. Terms such as "comprising" or "including" indicate that the element or object preceding the term encompasses the elements or objects listed following the term and their equivalents, without excluding other elements or objects. Terms such as "connected" or "linked" are not limited to physical or mechanical connections, but can include electrical connections, whether direct or indirect. Terms such as "upper," "lower," "left," and "right" are used only to indicate relative positional relationships; when the absolute position of the described object changes, the relative positional relationship may also change accordingly.

[0029] As shown in Figures 1-5, an image emotion recognition method based on machine vision includes the following steps:

[0030] Step S101: Extract the eye fissure edge point set within the candidate eye region, obtain the major and minor axis parameters of the ellipse through geometric fitting, obtain the estimated value of the monocular yaw angle based on this, and determine the outer canthus point.

[0031] Step S102: Extract the lower eyelid margin curve and the zygomatic superior margin curvature ridge line. Based on the outer canthus point, extract the area enclosed by the lower eyelid margin curve and the zygomatic superior margin curvature ridge line to obtain the orbitozygomatic connected region.

[0032] Step S103: Extract the area and boundary arc length of the orbitozygomatic connected region and calculate the isoperimetric quotient. Use the estimated monocular yaw angle to apply a projection correction to the isoperimetric quotient, obtaining the orbitozygomatic isoperimetric bulging index.

[0033] Step S104: Extract the boundary distance field and centroid of the orbitozygomatic connected region, and monotonically map the pixels in the region to the unit disk coordinate system to generate a disk image.

[0034] Step S105: Input the disk image into the neural network, use the orbitozygomatic isoperimetric bulging index to perform scalar modulation on an intermediate feature map of the network, extract the angular statistics obtained by accumulating the modulated feature map along the radial direction, and concatenate the angular statistics with the orbitozygomatic isoperimetric bulging index to form a feature vector.

[0035] Step S106: Perform linear mapping on the feature vector to obtain the log odds of each emotion category, and output the emotion prediction results and probability distribution after normalization calculation.
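Step S106 can be sketched as a linear layer followed by a normalization. The sketch below assumes that the "normalization calculation" is a softmax over the per-category log odds, which is a common choice but is not stated explicitly in the text. The weights, biases, and category labels are placeholders invented for illustration.

```python
import math

def predict_emotion(features, weights, biases, labels):
    """Map a feature vector to (predicted label, probability distribution).

    weights -- one row of coefficients per emotion category (hypothetical values)
    biases  -- one offset per emotion category (hypothetical values)
    """
    # Linear mapping: one log-odds score per category.
    logits = [sum(w * f for w, f in zip(row, features)) + b
              for row, b in zip(weights, biases)]
    # Softmax normalization (assumed); subtracting the maximum avoids overflow.
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs

# Placeholder inputs: a 2-element feature vector and three unnamed categories.
label, probs = predict_emotion(
    [1.0, 2.0],
    [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]],
    [0.0, 0.0, 0.0],
    ["category_0", "category_1", "category_2"],
)
print(label, probs)
```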

[0036] In one embodiment of the present invention, extracting the eye fissure edge point set within the candidate eye region, obtaining the major and minor axis parameters of the ellipse through geometric fitting, obtaining the estimated value of the monocular yaw angle on this basis, and determining the outer canthus point includes: calculating the gradient $\nabla I(x,y) = (I_x, I_y)$, where $I(x,y)$ is the brightness value of the image at coordinates $(x,y)$, $I_x = \partial I/\partial x$ is the first-order partial derivative of the image grayscale function in the x-direction, and $I_y = \partial I/\partial y$ is the first-order partial derivative of the image grayscale function in the y-direction; calculating the gradient magnitude $G(x,y) = \sqrt{I_x^2 + I_y^2}$, where $G$ is the magnitude of the gradient vector; selecting, within the candidate eye region, the pixels at which $G$ is a local maximum along the gradient normal and $G$ is non-zero, which constitute the eye fissure edge point set $P = \{(x_i, y_i)\}_{i=1}^{N}$, where $(x_i, y_i)$ are the coordinates of the i-th point in the eye fissure edge point set and $N$ is the number of points in the set.

[0037] With quadratic curve coefficients $(A, B, C, D, E, F)$, fit the eye fissure edge point set $P$ by minimizing the objective function $J = \sum_{i=1}^{N} \left(A x_i^2 + B x_i y_i + C y_i^2 + D x_i + E y_i + F\right)^2$, where $A, B, C, D, E, F$ are all quadratic curve fitting coefficients; apply the constraint $4AC - B^2 = 1$ to ensure that the fitted curve is an ellipse; calculate the coordinates of the ellipse center $(x_c, y_c)$, where $x_c = \frac{BE - 2CD}{4AC - B^2}$, $y_c = \frac{BD - 2AE}{4AC - B^2}$, $x_c$ is the x-coordinate of the center of the ellipse, and $y_c$ is the y-coordinate of the center of the ellipse; calculate the rotation angle of the ellipse $\theta = \frac{1}{2}\arctan\frac{B}{A - C}$, where $\theta$ is the angle between the principal axis of the ellipse and the x-axis; calculate $F_c = A x_c^2 + B x_c y_c + C y_c^2 + D x_c + E y_c + F$, where $F_c$ is the coefficient calculated from the coordinates of the ellipse center; calculate $\lambda_1 = \frac{A + C - \sqrt{(A - C)^2 + B^2}}{2}$ and $\lambda_2 = \frac{A + C + \sqrt{(A - C)^2 + B^2}}{2}$, where $\lambda_1$ and $\lambda_2$ are derived values of the fitting coefficients, with the sign of the coefficients chosen so that $A + C > 0$; calculate the length of the major axis of the ellipse $a = \sqrt{-F_c/\lambda_1}$ and the length of the minor axis of the ellipse $b = \sqrt{-F_c/\lambda_2}$, where $a$ is the length of the major (semi-)axis, $b$ is the length of the minor (semi-)axis, and $a \ge b > 0$;

[0038] Calculate the estimated value of the monocular yaw angle $\psi = \arccos\left(\frac{b}{a}\right)$, where the ratio satisfies $0 < \frac{b}{a} \le 1$, and the value of $\psi$ ranges from 0 to $\frac{\pi}{2}$ (radians);

[0039] Define the principal axis unit vector $\mathbf{u} = (\cos\theta, \sin\theta)$, where $\cos\theta$ is the cosine of the rotation angle $\theta$ of the ellipse and $\sin\theta$ is its sine; for each point in the eye fissure edge point set $P$, calculate the relative displacement vector with respect to the center of the ellipse, $\mathbf{d}_i = (x_i - x_c,\; y_i - y_c)$; calculate the dot product of each relative displacement vector with the principal axis unit vector, $s_i = \mathbf{d}_i \cdot \mathbf{u}$. The point with the largest dot product value is selected as the outer canthus point, and its coordinates are denoted as $(x_e, y_e)$.
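The outer canthus selection just described reduces to a projection onto the principal axis followed by an argmax. A minimal sketch follows, assuming the edge points, ellipse center, and rotation angle have already been computed; the function name and the sample coordinates are hypothetical. Note that the text does not say how the axis direction is oriented toward the temporal side, so the sketch simply uses the given angle as is.

```python
import math

def outer_canthus(points, center, theta):
    """Return the edge point with the largest projection onto the principal axis.

    points -- iterable of (x, y) eye fissure edge points
    center -- (xc, yc) ellipse center
    theta  -- rotation angle of the ellipse principal axis, in radians
    """
    ux, uy = math.cos(theta), math.sin(theta)   # principal axis unit vector
    xc, yc = center

    def projection(p):
        # Dot product of the relative displacement vector with the unit vector.
        return (p[0] - xc) * ux + (p[1] - yc) * uy

    return max(points, key=projection)

# Hypothetical edge points around a center at (1, 1), axis along +x.
edge_points = [(0, 0), (4, 1), (2, 3), (-1, 0)]
print(outer_canthus(edge_points, (1, 1), 0.0))
```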

[0040] It should be noted that the image brightness function gives the grayscale value of each pixel, reflecting its brightness level. The gradient components are the rates of change of the image brightness function in the horizontal and vertical directions, reflecting the intensity of brightness change in each direction. The gradient magnitude combines the gradient components, reflecting how sharply brightness changes at the pixel overall. The eye fissure edge point set is the set of pixel locations that meet the gradient conditions, reflecting the geometric distribution of the eye fissure edge. The quadratic curve equation is the mathematical expression used to fit the eye fissure edge point set, reflecting the curve distribution pattern of the point set. The elliptic geometric constraint is the condition that ensures the quadratic curve is an ellipse. The fitting coefficients are the coefficients of the terms of the quadratic curve equation; they determine how closely the fitted curve matches the eye fissure edge point set. The ellipse center coordinates give the center position of the fitted ellipse, reflecting where the ellipse lies in the image. The rotation angle is the angle between the principal axis of the ellipse and the horizontal direction of the image, reflecting the rotation of the ellipse. The major axis length is the size of the ellipse along its principal axis, and the minor axis length is its size along the minor axis. The estimated monocular yaw angle is the left-right rotation angle estimated from a single eye, reflecting the projection change caused by head rotation. The principal axis direction vector is the direction vector of the major axis of the ellipse, reflecting its spatial orientation. The relative displacement vector is the offset of an eye fissure edge point from the center of the ellipse. The projection length is the size of the projection of the relative displacement vector onto the principal axis direction vector, reflecting the distance of the point from the center along the principal axis. The outer canthus point is the point of the eye fissure edge point set farthest from the ellipse center along the principal axis, giving the image coordinates of the outer canthus.

[0041] It should be noted that the gradient normal refers to the direction perpendicular to the gradient direction. The local maximum test requires that the gradient magnitude of a pixel peak among its neighboring pixels along the gradient normal. For example, for a pixel whose gradient is horizontal, the normal direction is vertical; if the pixel's gradient magnitude is greater than that of its two vertically adjacent pixels and is greater than 0, the pixel is included in the eye fissure edge point set. This screening locks onto pixels at the eye fissure edge and removes background noise. The elliptical geometric constraint is expressed as four times the first coefficient of the quadratic curve multiplied by the third coefficient, minus the square of the second coefficient, equal to 1. This constraint forces the fitted curve to be an ellipse rather than another type of quadratic curve. The objective function of the least-squares fit is the sum of the squares of the distances from all eye fissure edge points to the quadratic curve, and the fitting coefficients are obtained by minimizing it. Substituting these coefficients into the formulas for the ellipse center coordinates, rotation angle, and axis lengths yields the ellipse parameters, so that they closely match the actual shape of the eye fissure. Furthermore, the major axis of the ellipse corresponds to the length of the eye fissure, while the minor axis corresponds to the direction perpendicular to it. When the face is turned to the side, the eye fissure appears to shorten horizontally in the image, resulting in a smaller ratio of the minor axis to the major axis and a larger yaw angle. Applying the inverse cosine to this ratio converts the geometric proportion into an angle, giving a quantitative estimate of the yaw angle. This mapping is derived from the principles of projection geometry.
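The chain from fitted coefficients to yaw angle can be sketched as follows. The sketch assumes the general conic form A x² + B xy + C y² + D x + E y + F = 0 and uses textbook expressions for the center, axes, and orientation of an ellipse; these may differ in form from the formulas of the original filing. The function names are illustrative, and `atan2` is used so that the returned angle refers to the major axis. The fitting step itself (least squares under 4AC - B² = 1) is not shown.

```python
import math

def ellipse_from_conic(A, B, C, D, E, F):
    """Return ((xc, yc), theta, a, b) for an ellipse given by conic coefficients.

    theta is the major-axis angle in radians; a and b are semi-axis lengths.
    """
    det = 4.0 * A * C - B * B
    if det <= 0.0:
        raise ValueError("coefficients do not describe an ellipse")
    if A + C < 0.0:
        # Flip the overall sign so the quadratic part is positive definite.
        A, B, C, D, E, F = -A, -B, -C, -D, -E, -F
    xc = (B * E - 2.0 * C * D) / det
    yc = (B * D - 2.0 * A * E) / det
    fc = F + 0.5 * (D * xc + E * yc)            # conic value at the center
    if fc >= 0.0:
        raise ValueError("degenerate or imaginary ellipse")
    root = math.hypot(A - C, B)
    lam_small = 0.5 * (A + C - root)            # eigenvalue along the major axis
    lam_large = 0.5 * (A + C + root)            # eigenvalue along the minor axis
    a = math.sqrt(-fc / lam_small)
    b = math.sqrt(-fc / lam_large)
    theta = 0.5 * math.atan2(-B, C - A)
    return (xc, yc), theta, a, b

def yaw_from_axes(a, b):
    """Yaw estimate arccos(b / a), in [0, pi/2] radians."""
    return math.acos(min(1.0, b / a))

# Hypothetical coefficients: (x - 3)^2 / 4 + (y + 1)^2 = 1, expanded and scaled by 4.
center, theta, a, b = ellipse_from_conic(1.0, 0.0, 4.0, -6.0, 8.0, 9.0)
print(center, theta, a, b, yaw_from_axes(a, b))
```

Because every quantity depends only on ratios of the coefficients, the result does not change if the coefficients are rescaled, so the normalization 4AC - B² = 1 only matters during fitting.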

[0042] It should be noted that the local maxima of the gradient normal are determined using a 3x3 neighborhood window. For a target pixel, the gradient magnitudes of its two adjacent pixels (pixels before and after it along the normal within the window) are calculated. If the gradient magnitude of the target pixel is greater than the gradient magnitudes of these two adjacent pixels, and the difference is greater than a set small threshold (e.g., 0.5, in units of gradient magnitude quantization value), then it is determined to be a local maximum.
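The 3x3 local maximum test can be sketched as below. The comparison direction is passed in by the caller, so the sketch makes no claim about how that direction is derived from the gradient; quantizing it to one of the eight neighbor offsets is an assumption made for illustration, as is the function name. The default margin of 0.5 is the example threshold quoted in the text.

```python
import math

def is_local_maximum(mag, row, col, direction, margin=0.5):
    """Check whether mag[row][col] exceeds both neighbors along `direction`.

    mag       -- 2-D list of gradient magnitudes
    direction -- (dx, dy) vector giving the comparison direction
    margin    -- minimum required difference to each neighbor
    """
    dx, dy = direction
    norm = math.hypot(dx, dy)
    if norm == 0.0 or mag[row][col] <= 0.0:
        return False
    # Quantize the direction to a neighbor offset inside the 3x3 window.
    step_col = round(dx / norm)
    step_row = round(dy / norm)
    r1, c1 = row + step_row, col + step_col
    r2, c2 = row - step_row, col - step_col
    rows, cols = len(mag), len(mag[0])
    if not (0 <= r1 < rows and 0 <= r2 < rows and 0 <= c1 < cols and 0 <= c2 < cols):
        return False                              # window leaves the image
    centre = mag[row][col]
    return centre - mag[r1][c1] > margin and centre - mag[r2][c2] > margin

# Hypothetical magnitudes: a vertical line of strong responses.
grid = [[1.0, 5.0, 1.0],
        [1.0, 5.0, 1.0],
        [1.0, 5.0, 1.0]]
print(is_local_maximum(grid, 1, 1, (1.0, 0.0)))
```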

[0043] It should be noted that this step filters the eye fissure edge point set by gradient calculations on the image brightness function, performs least-squares fitting under the elliptical geometric constraint, solves for the ellipse parameters, derives the monocular yaw angle, and determines the outer canthus point from the principal axis direction vector and the projection length. The yaw angle and the outer canthus point are thus obtained solely from the geometry of the single eye nearer the camera, which suits side-face scenes, avoids cross-regional coupling, and provides the geometric pole and the projection correction basis for subsequent calculations.

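[0044] In one embodiment of the present invention, extracting the lower eyelid margin curve and the zygomatic superior margin curvature ridge line, and extracting, based on the outer canthus point, the region enclosed by the two to obtain the orbitozygomatic connected region includes: within the candidate eye region, performing Gaussian smoothing on $I(x,y)$ to obtain the smoothed image $I_\sigma = K_\sigma * I$, where $K_\sigma$ is the Gaussian kernel, $I(x,y)$ is the brightness value of the image at coordinates $(x,y)$, and $I_\sigma(x,y)$ is the brightness value of the smoothed image; calculating the gradient of the smoothed image $\nabla I_\sigma = (\partial_x I_\sigma, \partial_y I_\sigma)$, where $\partial_x I_\sigma$ is the first-order partial derivative of the smoothed image in the x-direction and $\partial_y I_\sigma$ is the first-order partial derivative of the smoothed image in the y-direction; calculating the gradient magnitude $M = \lVert \nabla I_\sigma \rVert$, where $M$ is the magnitude of the gradient vector of the smoothed image; when $M > 0$, calculating the unit normal $\mathbf{n} = \frac{(-\partial_y I_\sigma,\; \partial_x I_\sigma)}{\lVert \nabla I_\sigma \rVert}$, where $\lVert \nabla I_\sigma \rVert$ is the norm of the smoothed image gradient vector and $\mathbf{n}$ is the unit normal vector; calculating $M_1 = \nabla M \cdot \mathbf{n}$, where $\nabla M$ is the gradient vector of the gradient magnitude; calculating $M_2 = \mathbf{n}^{\mathsf T} H_M \mathbf{n}$, where $H_M$ is the second derivative matrix of the gradient magnitude; selecting the pixels that satisfy $M_1 = 0$, $M_2 < 0$, and $y \ge y_c$, where $y$ is the ordinate of the pixel and $y_c$ is the ordinate of the ellipse center; the selected pixels are connected to obtain the lower eyelid margin curve $\Gamma_L$;

[0045] Calculate the second derivative matrix of the smoothed image $H = \begin{pmatrix} \partial_{xx} I_\sigma & \partial_{xy} I_\sigma \\ \partial_{xy} I_\sigma & \partial_{yy} I_\sigma \end{pmatrix}$, where $H$ is the second derivative matrix of the smoothed image; perform eigenvalue decomposition on $H$ to obtain $H\mathbf{e}_{\max} = \lambda_{\max}\mathbf{e}_{\max}$ and $H\mathbf{e}_{\min} = \lambda_{\min}\mathbf{e}_{\min}$, where $\lambda_{\max}$ and $\lambda_{\min}$ are eigenvalues satisfying $\lambda_{\max} \ge \lambda_{\min}$, and $\mathbf{e}_{\max}$, $\mathbf{e}_{\min}$ are the corresponding eigenvectors; define the ridge strength $R = -\lambda_{\min}$; calculate $R_1 = \nabla R \cdot \mathbf{e}_{\min}$, where $\nabla R$ is the gradient vector of the ridge strength; calculate $R_2 = \mathbf{e}_{\min}^{\mathsf T} H_R\, \mathbf{e}_{\min}$, where $H_R$ is the second derivative matrix of the ridge strength; select the pixels that satisfy $R > 0$, $R_1 = 0$, and $R_2 < 0$; the selected pixels are then connected to obtain the zygomatic superior margin curvature ridge line $\Gamma_Z$;

[0046] Define the ray $\mathbf{r}_\varphi(t) = (x_e + t\cos\varphi,\; y_e + t\sin\varphi)$, $t \ge 0$, where $(x_e, y_e)$ are the coordinates of the outer canthus point, $\varphi$ is the angle between the ray and the x-axis, $t$ is the length along the ray measured from the outer canthus point, $\cos\varphi$ is the cosine of the angle $\varphi$, and $\sin\varphi$ is its sine; define the unit direction vector $\mathbf{v}_\varphi = (\cos\varphi, \sin\varphi)$. Determine the angle set $\Phi = \{\varphi : t_L(\varphi) \text{ and } t_Z(\varphi) \text{ exist and } t_L(\varphi) < t_Z(\varphi)\}$, where $t_L(\varphi)$ is the distance from the outer canthus point to the intersection of the ray with the lower eyelid margin curve $\Gamma_L$, and $t_Z(\varphi)$ is the distance from the outer canthus point to the intersection of the ray with the zygomatic superior margin curvature ridge line $\Gamma_Z$; define the orbitozygomatic connected region $\Omega = \bigcup_{\varphi \in \Phi} \{\mathbf{r}_\varphi(t) : t_L(\varphi) \le t \le t_Z(\varphi)\}$.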

[0047] It should be noted that the candidate eye region represents a preset area in the image containing the eyes, reflecting the approximate distribution of the eyes in the image. Gaussian smoothing refers to the operation of blurring the image using a Gaussian kernel. The smoothed image represents the image obtained after Gaussian smoothing, reflecting the grayscale distribution of the image after removing noise interference. The gradient vector represents the combination of the rates of change of pixels in the horizontal and vertical directions in the smoothed image, reflecting the direction and intensity of pixel brightness changes. The gradient magnitude represents the magnitude of the gradient vector, reflecting the overall drasticness of pixel brightness changes. The unit normal vector represents a vector perpendicular to the gradient vector direction with a magnitude of 1, reflecting the vertical direction of the gradient. The first derivative of the gradient magnitude along the unit normal vector represents the rate of change of the gradient magnitude along the unit normal vector direction, reflecting the increasing or decreasing trend of the gradient magnitude along the direction perpendicular to the gradient. The second derivative of the gradient magnitude along the unit normal vector represents the rate of change of the first derivative, reflecting the curvature characteristics of the gradient magnitude along the unit normal vector direction. The pixel set represents a group of pixels that meet specific screening criteria, reflecting the pixel clustering that conforms to the characteristics of the lower eyelid margin. Pixel connectivity processing refers to the operation of merging adjacent pixels that meet connectivity conditions into a continuous region. The lower eyelid margin curve represents the curve formed by connectivity processing of pixels that meet the conditions, reflecting the geometric shape of the lower eyelid margin.

[0048] It should be noted that the kernel size (scale parameter) of Gaussian smoothing needs to be adjusted according to the image resolution and eye noise level, with an optimal range of 3x3 to 7x7. For example, when the image resolution is 640x480, a 5x5 Gaussian kernel is used, which can suppress high-frequency noise while preserving the edge structure of the lower eyelid margin. If the kernel is too large (e.g., 9x9), it will lead to edge blurring; if the kernel is too small (e.g., 1x1), it will not effectively remove noise. This selection is based on the spatial scale of the eye structure to achieve a balance between noise suppression and edge preservation. A first derivative of zero indicates that the gradient magnitude reaches an extreme point in the unit normal direction, and a second derivative less than zero indicates that the extreme point is a maximum point. The combination of these two can pinpoint the edge pixels of the lower eyelid margin. For example, if the gradient direction of a pixel at the lower eyelid margin is perpendicular to the edge, and the unit normal vector is along the edge tangent direction, the gradient magnitude reaches a maximum in this direction, satisfying the condition that the first derivative is zero and the second derivative is less than zero, thus effectively distinguishing the lower eyelid margin from background pixels. In addition, the constraint that the ordinate is not less than the ordinate of the ellipse center can eliminate interfering pixels in the upper eyelid and above the eye area, and focus on the lower eyelid area; the pixel connectivity processing adopts 8-neighbor connectivity (that is, the adjacent pixels of a pixel in the top, bottom, left, right and four diagonal directions are considered connected), ensuring that even if there are a small number of discrete pixels that meet the conditions, they can be merged into a continuous lower eyelid margin curve, avoiding curve breakage.
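For reference, a normalized Gaussian kernel of the sizes discussed above can be built in a few lines. This is a sketch only: the text specifies the kernel size (for example 5x5) but not the standard deviation, so `sigma=1.0` is an assumed value, and the function name is illustrative.

```python
import math

def gaussian_kernel(size=5, sigma=1.0):
    """Return a size x size Gaussian kernel whose weights sum to 1.

    size  -- odd window width (3, 5 or 7 in the range discussed above)
    sigma -- standard deviation in pixels (assumed value, not from the text)
    """
    if size % 2 == 0 or size < 1:
        raise ValueError("size must be a positive odd integer")
    half = size // 2
    weights = [[math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))
                for dx in range(-half, half + 1)]
               for dy in range(-half, half + 1)]
    total = sum(sum(row) for row in weights)
    return [[w / total for w in row] for row in weights]

kernel = gaussian_kernel(5, 1.0)
print(kernel[2][2])   # center weight, the largest entry of the kernel
```

A 1x1 kernel built this way is the single weight 1.0, which leaves the image unchanged and matches the remark that such a kernel removes no noise.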

[0049] It should be noted that the second derivative matrix is the matrix of second-order partial derivatives of the smoothed image at a pixel, reflecting the curvature of the gray-level changes around that pixel. Eigenvalue decomposition solves for the eigenvalues and eigenvectors of the second derivative matrix to extract its principal feature information. The largest eigenvalue is the larger of the eigenvalues obtained from the decomposition, reflecting the maximum change intensity of the second derivative matrix in the corresponding direction. The smallest eigenvalue is the smaller of the eigenvalues, reflecting the minimum change intensity in the corresponding direction. The eigenvector corresponding to the smallest eigenvalue is the direction vector associated with that eigenvalue, reflecting the spatial orientation of the minimum change intensity. The ridge strength is the negative of the smallest eigenvalue, reflecting the salience of the structural ridge line in the zygomatic region. The first derivative of the ridge strength along the eigenvector corresponding to the smallest eigenvalue is the rate of change of the ridge strength in that direction, reflecting whether the ridge strength is increasing or decreasing along it. The second derivative of the ridge strength along the same eigenvector is the rate of change of the first derivative, reflecting the curvature of the ridge strength along that direction. A ridge strength greater than zero indicates that a structural ridge line is actually present.
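For a 2x2 symmetric second derivative matrix the eigenvalues have a closed form, so the ridge strength can be sketched without a linear algebra library. The function name and argument names below are illustrative; the three inputs are assumed to be the second-order partial derivatives of the smoothed image at one pixel.

```python
import math

def ridge_strength(hxx, hxy, hyy):
    """Return (R, e_min) for the matrix [[hxx, hxy], [hxy, hyy]].

    R is the negative of the smallest eigenvalue; e_min is the unit
    eigenvector belonging to that smallest eigenvalue.
    """
    mean = 0.5 * (hxx + hyy)
    root = math.hypot(0.5 * (hxx - hyy), hxy)
    lam_min = mean - root                        # smallest eigenvalue
    # The eigenvector of the largest eigenvalue lies at half the atan2 angle;
    # the eigenvector of the smallest eigenvalue is perpendicular to it.
    angle = 0.5 * math.atan2(2.0 * hxy, hxx - hyy) + 0.5 * math.pi
    e_min = (math.cos(angle), math.sin(angle))
    return -lam_min, e_min

# Hypothetical pixel whose smallest eigenvalue is -3, giving R = 3.
strength, direction = ridge_strength(0.0, 0.0, -3.0)
print(strength, direction)
```

Pixels with a strength near zero, as in flat background, would fail the R > 0 screening, while the derivative conditions along `e_min` would still need to be checked separately.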

[0050] It should be noted that the zygomatic superior margin curvature ridge line represents a curve formed by connecting pixels that meet certain conditions, reflecting the arc ridge shape of the bony or soft tissue of the zygomatic superior margin. A ray represents a straight line segment extending from the outer canthus, used to define the boundary of the orbitozygomatic region. The first intersection point represents the point where the ray intersects with the lower eyelid margin curve, reflecting the spatial overlap between the ray and the lower eyelid margin. The second intersection point represents the point where the ray intersects with the zygomatic superior margin curvature ridge line, reflecting the spatial overlap between the ray and the zygomatic superior margin ridge line. The distance from the outer canthus to the first intersection point represents the straight line length between the outer canthus and the first intersection point, reflecting the spatial span between the two points. The distance from the outer canthus to the second intersection point represents the straight line length between the outer canthus and the second intersection point, reflecting the spatial span between the two points. The angle set represents the set of angles between rays satisfying the intersection point distance condition and the horizontal direction, reflecting the angular range of the effective rays. A pixel segment represents the pixel sequence of the ray between the first and second intersection points, reflecting the image pixel distribution between the two intersection points. Union processing represents merging multiple pixel segments into a contiguous region to construct a complete connected region. The orbitozygomatic connected region represents a contiguous region formed by the union of pixel segments, reflecting the core metric region related to emotion in the proximal outer canthus.

[0051] It should be noted that, in the second derivative matrix of a smoothed image, the smallest eigenvalue typically corresponds to the direction perpendicular to linear structures in the image. The smaller (more negative) the value, the more significant the structure in that direction. Taking its negative as the ridge intensity converts structural salience into a positive value, which facilitates subsequent thresholding (ridge intensity greater than zero), locks onto the curvature ridge structure of the zygomatic superior margin, and avoids confusion with background noise. For example, if the smallest eigenvalue in the ridge region of the zygomatic superior margin is -3, the ridge intensity is 3, satisfying the condition of being greater than zero, whereas the smallest eigenvalue of the background region is close to zero, so its ridge intensity is close to zero and it is excluded. A first derivative of zero indicates that the ridge intensity reaches an extreme point along the eigenvector direction, and a second derivative less than zero indicates that this extreme point is a maximum. Combining these two conditions locates the core pixels of the ridge. Because the extremum is evaluated along this eigenvector direction, the maximum points trace a continuous ridge line, ensuring the integrity and accuracy of the curvature ridge line of the zygomatic superior margin. Furthermore, rays starting from the outer canthus are screened using the existence of both intersection points and a distance constraint (the distance from the outer canthus to the first intersection point is less than the distance from the outer canthus to the second intersection point), ensuring that each retained ray only covers the area between the lower eyelid margin and the zygomatic superior margin ridge line. For example, if a ray intersects the lower eyelid margin 5 pixels from the outer canthus and intersects the zygomatic superior margin ridge line 10 pixels from the outer canthus, the distance constraint is satisfied and the segment of that ray from 5 to 10 pixels is included; if the distance relationship is reversed, the ray is excluded. All qualifying pixel segments are merged (union processing) to form a continuous orbitozygomatic connected region.
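The ridge-intensity and ray-screening logic described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names are hypothetical, and the eigenvalues of the 2×2 symmetric second derivative matrix are computed in closed form here instead of by Jacobi iteration.

```python
import math

def hessian_eigen(hxx, hxy, hyy):
    """Return (largest eigenvalue, smallest eigenvalue, unit eigenvector of the
    smallest eigenvalue) for the symmetric matrix [[hxx, hxy], [hxy, hyy]]."""
    mean = 0.5 * (hxx + hyy)
    radius = math.hypot(0.5 * (hxx - hyy), hxy)
    lam_max, lam_min = mean + radius, mean - radius
    if hxy != 0.0:
        # Solves (hxx - lam_min) * vx + hxy * vy = 0.
        vx, vy = hxy, lam_min - hxx
    elif hxx <= hyy:
        vx, vy = 1.0, 0.0   # diagonal matrix, smallest eigenvalue is hxx
    else:
        vx, vy = 0.0, 1.0   # diagonal matrix, smallest eigenvalue is hyy
    norm = math.hypot(vx, vy)
    return lam_max, lam_min, (vx / norm, vy / norm)

def ridge_intensity(hxx, hxy, hyy):
    """Ridge intensity is the negative of the smallest eigenvalue."""
    _, lam_min, _ = hessian_eigen(hxx, hxy, hyy)
    return -lam_min

def select_ray_segment(d_eyelid, d_ridge):
    """Keep a ray only if both intersections exist and the eyelid intersection
    is nearer to the outer canthus than the ridge intersection.
    Distances are measured from the outer canthus; None means no intersection."""
    if d_eyelid is None or d_ridge is None:
        return None
    if d_eyelid < d_ridge:
        return (d_eyelid, d_ridge)
    return None

# Examples from the text: smallest eigenvalue -3 gives ridge intensity 3;
# intersections at 5 and 10 pixels keep the 5-10 pixel segment.
print(ridge_intensity(-3.0, 0.0, 0.0))
print(select_ray_segment(5.0, 10.0), select_ray_segment(10.0, 5.0))
```

A full pipeline would additionally apply the first-derivative-zero and second-derivative-negative tests along the returned eigenvector before accepting a pixel as a ridge pixel.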

[0052] It should be noted that the Jacobi iteration method is used to perform eigenvalue decomposition of the second derivative matrix. This method is suited to finding the eigenvalues of symmetric matrices and offers fast convergence and high accuracy. Derivatives are computed by the finite difference method with a 3×3 neighborhood window, and results are retained to three decimal places so that the derivative conditions are evaluated accurately and ridge pixels are not misjudged due to insufficient precision. Intersection points are solved by iterative approximation between a straight line and a curve: the ray equation is combined with the equation of the lower eyelid margin curve or of the zygomatic superior margin curvature ridge line, and the solution is approached by the bisection method. The number of iterations is set to 10, and the intersection point coordinates are accurate to within 0.1 pixels. The angle range is set to 0 degrees to 180 degrees with a discretization step of 1 degree, that is, one ray is cast every 1 degree, to ensure coverage of the entire orbitozygomatic region and avoid missing key pixel segments. 8-neighborhood connectivity is adopted: when a pixel of one pixel segment is adjacent to a pixel of another pixel segment in the vertical, horizontal, or diagonal direction, the two segments are judged to be connected and are merged into a continuous region, ensuring that there are no breaks in the orbitozygomatic connected region.
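The bisection step for ray-curve intersection can be illustrated with a short sketch. It assumes, purely for illustration, that the curve is available as an implicit function whose sign changes across the curve; the function name and the bracket arguments are hypothetical. With 10 halvings the bracket shrinks by a factor of 1024, so an initial bracket of up to about 100 pixels ends within 0.1 pixel.

```python
import math

def ray_curve_intersection(origin, angle_deg, curve_fn, s_lo, s_hi, iterations=10):
    """Approximate the ray length s at which origin + s*(cos a, sin a) crosses
    the curve curve_fn(x, y) = 0, by bisection on the bracket [s_lo, s_hi].
    Returns None when the bracket does not contain a sign change."""
    ox, oy = origin
    ux = math.cos(math.radians(angle_deg))
    uy = math.sin(math.radians(angle_deg))

    def f(s):
        return curve_fn(ox + s * ux, oy + s * uy)

    f_lo, f_hi = f(s_lo), f(s_hi)
    if f_lo * f_hi > 0:
        return None  # no crossing detected inside the bracket
    for _ in range(iterations):
        mid = 0.5 * (s_lo + s_hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            s_hi = mid               # crossing lies in the lower half
        else:
            s_lo, f_lo = mid, f_mid  # crossing lies in the upper half
    return 0.5 * (s_lo + s_hi)

# Hypothetical curve: the horizontal line y = 5, probed by a 90-degree ray.
s = ray_curve_intersection((0.0, 0.0), 90.0, lambda x, y: y - 5.0, 0.0, 64.0)
print(s)
```

Casting one such ray per degree from 0 to 180 degrees and keeping the segments that satisfy the distance constraint yields the pixel segments that are later merged.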

[0053] In one embodiment of the present invention, the area and boundary arc length of the orbitozygomatic connected region are extracted and the isoperimetric quotient is calculated. The isoperimetric quotient is then projection-corrected using the monocular yaw angle estimate to obtain the orbitozygomatic isoperimetric bulging index. This includes: calculating the projected area of the orbitozygomatic connected region using the following formula: $A = \iint_{\Omega} 1 \, \mathrm{d}\sigma$, where $A$ is the projected area, $\Omega$ is the orbitozygomatic connected region, $\mathrm{d}\sigma$ is the area element, and the integrand is the constant function with a value equal to 1;

[0054] The boundary $\partial\Omega$ of the orbitozygomatic connected region is parameterized to obtain the parametric curve $\gamma(t) = (x(t), y(t))$, $t \in [0, 1]$, where $[0, 1]$ is the closed parameter interval, $x(t)$ is the coordinate function of the parametric curve in the horizontal direction, and $y(t)$ is the coordinate function of the parametric curve in the vertical direction; the boundary arc length is calculated using the following formula: $L = \int_{0}^{1} \sqrt{x'(t)^{2} + y'(t)^{2}} \, \mathrm{d}t$, where $L$ is the arc length of the boundary of the orbitozygomatic connected region, $x'(t)$ is the rate of change of the horizontal coordinate function with respect to the parameter $t$, and $y'(t)$ is the rate of change of the vertical coordinate function with respect to the parameter $t$; the isoperimetric quotient is calculated using the following formula: $Q = 4\pi A / L^{2}$, where $A$ is the projected area and $L$ is the boundary arc length;

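[0055] Define the projection correction factor, calculated using the following formula: $\kappa = \sec\hat{\psi} = 1 / \cos\hat{\psi}$, where $\hat{\psi}$ is the monocular yaw angle estimate and $\cos\hat{\psi}$ is the cosine of the monocular yaw angle estimate; the orbitozygomatic isoperimetric bulging index is calculated using the following formula: $B = \kappa \, Q$, where $B$ is the orbitozygomatic isoperimetric bulging index and $Q$ is the isoperimetric quotient.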

[0056] It should be noted that the constant function with value 1 is a function that takes the value 1 at every pixel position within the orbitozygomatic connected region. The double integral operation is the cumulative summation of a function over the planar region, used here to calculate the region's area. The projected area is the size of the orbitozygomatic connected region as projected onto the image plane, reflecting the region's two-dimensional spatial extent. Parametric processing transforms the region boundary into a mathematical expression in a single parameter. The parametric curve is the boundary curve obtained after parametric processing, reflecting the continuous geometric shape of the boundary. The horizontal rate of change is the derivative of the parametric curve's horizontal coordinate with respect to the parameter, reflecting how fast the boundary changes in the horizontal direction. The vertical rate of change is the derivative of the parametric curve's vertical coordinate with respect to the parameter, reflecting how fast the boundary changes in the vertical direction. The square root operation takes the square root of the sum of the squares of the two rates of change, giving the instantaneous intensity of boundary change. The boundary arc length is the total length of the boundary of the orbitozygomatic connected region, reflecting the one-dimensional extent of the boundary. The isoperimetric quotient is the ratio of the projected area to the square of the boundary arc length, reflecting the degree to which the region's shape approximates a circle. The secant value is the reciprocal of the cosine of the monocular yaw angle estimate, reflecting the correction coefficient for the foreshortening of the horizontal projection of the face. The orbitozygomatic isoperimetric bulging index is the isoperimetric quotient after correction by the secant value, reflecting the true degree of bulging in the orbitozygomatic region.

[0057] It should be noted that the constant function with value 1 ensures that each pixel contributes equally to the area, avoiding interference from irrelevant factors such as grayscale values. For example, if an orbitozygomatic connected region contains 200 complete pixels and 50 half-pixels, its projected area is 200 plus 25, which equals 225 square pixels, consistent with the discrete evaluation of the double integral. This method is suitable for pixel-level region measurement in images. Parametric processing transforms an irregular region boundary into a continuously differentiable parametric curve. Typically, points on the boundary are selected and assigned parameter values in sequence (e.g., from 0 to 1), and the curve is then completed through linear interpolation or spline interpolation. The square root of the sum of the squares of the horizontal and vertical rates of change is the magnitude of the tangent vector of the parametric curve, and integrating it (summing, in the discrete case) gives the total arc length. This combination accurately characterizes the bending and extension of the boundary and avoids the errors of direct measurement. The isoperimetric quotient naturally reflects the degree to which the shape is close to a circle, while the secant of the monocular yaw angle compensates for the foreshortening of the horizontal projection caused by a profile view. For example, when the monocular yaw angle is 30 degrees, the cosine value is 0.866 and the secant value is approximately 1.1547. If the isoperimetric quotient is 0.7, then the orbitozygomatic isoperimetric bulging index is 0.7 multiplied by 1.1547, which is approximately 0.808. This effectively eliminates the influence of the side-face projection on shape measurement. This association mechanism is suited to extracting physiological deformation features in side-face scenes.
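The worked example above can be reproduced with a brief sketch. The function names are hypothetical, and the 4π normalization of the quotient is an assumption of this sketch (the standard isoperimetric quotient, equal to 1 for a circle); the text itself only describes a ratio of area to squared arc length.

```python
import math

def isoperimetric_quotient(area, arc_length):
    """Assumed standard form 4*pi*A / L^2, which equals 1 for a circle."""
    return 4.0 * math.pi * area / arc_length ** 2

def bulging_index(quotient, yaw_rad):
    """Multiply the quotient by the secant of the monocular yaw angle estimate."""
    if not 0.0 <= yaw_rad < math.pi / 2:
        raise ValueError("yaw estimate must lie in [0, pi/2)")
    return quotient / math.cos(yaw_rad)

# Example from the text: yaw 30 degrees, quotient 0.7.
index = bulging_index(0.7, math.radians(30.0))
print(round(index, 3))
```

Because the secant grows without bound as the yaw approaches 90 degrees, a practical system would presumably cap the usable yaw range; the text does not specify such a limit.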

[0058] It should be noted that the double integral is discretized by pixel-level summation: all pixels within the orbitozygomatic connected region are traversed, and the total number of pixels is the projected area (each pixel has a default area of 1 square pixel). For higher precision, incomplete pixels at the region boundary can be subdivided at the sub-pixel level (e.g., dividing each pixel into 16 sub-pixels); the number of sub-pixels contained within the region is counted, divided by 16, and added to the number of complete pixels to obtain the projected area. The parameter range is set to 0 to 1, and the number of segments is determined by the number of boundary pixels, ensuring that each segment contains 1 to 2 pixels. For example, when the boundary contains 100 pixels, the number of segments is 100, and the parameter increases sequentially from 0.01 to 1. Linear interpolation is used to complete the curve shape of each segment, ensuring that the parametric curve fits the original boundary. The number of segments used for the boundary arc length is the same as the number of boundary pixels, i.e., each pixel corresponds to one segment. This precision balances computational efficiency and accuracy. For example, when the boundary contains 80 pixels, the number of segments is 80, and the calculation error of the length of each straight line segment can be controlled within 0.1 pixels.
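The discrete area and arc-length computations described above can be sketched as follows. The function names and input layout are hypothetical: boundary pixels are represented by the count of their sub-pixels lying inside the region, and the boundary is a closed sequence of points joined by straight segments.

```python
import math

def projected_area(full_pixels, boundary_subpixel_counts, subdivisions=16):
    """Complete pixels count as 1 each; each boundary pixel contributes the
    fraction of its sub-pixels that fall inside the region."""
    return full_pixels + sum(boundary_subpixel_counts) / subdivisions

def boundary_arc_length(points):
    """Sum of straight-segment lengths around a closed boundary polyline."""
    n = len(points)
    return sum(math.dist(points[i], points[(i + 1) % n]) for i in range(n))

# 200 complete pixels plus 50 boundary pixels that are each half covered
# (8 of 16 sub-pixels) give an area of 225, as in the earlier example.
print(projected_area(200, [8] * 50))
print(boundary_arc_length([(0, 0), (1, 0), (1, 1), (0, 1)]))
```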

[0059] In one embodiment of the present invention, the boundary distance field and centroid of the orbitozygomatic connected region are extracted, and the pixels within this region are monotonically mapped to the unit disk coordinate system to generate a disk image. This includes calculating the centroid of the region using the following formula: $x_c = \frac{1}{A} \iint_{\Omega} x \, \mathrm{d}\sigma$, $y_c = \frac{1}{A} \iint_{\Omega} y \, \mathrm{d}\sigma$, where $x_c$ is the horizontal coordinate of the region's centroid, $y_c$ is the vertical coordinate of the region's centroid, $A$ is the projected area of the orbitozygomatic connected region, $\Omega$ is the orbitozygomatic connected region, $x$ is the horizontal coordinate of a pixel, $y$ is the vertical coordinate of a pixel, $\mathrm{d}\sigma$ is the area element, and the two integrals are the double integrals of the horizontal coordinate and of the vertical coordinate over the orbitozygomatic connected region, respectively.

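[0060] The boundary distance field is calculated using the following formula: $d(x, y) = \inf_{(u, v) \in \partial\Omega} \sqrt{(x - u)^{2} + (y - v)^{2}}$, where $d(x, y)$ is the shortest Euclidean distance from the pixel at coordinates $(x, y)$ within the orbitozygomatic connected region to the region boundary $\partial\Omega$, $(u, v)$ are the coordinates of any point on the boundary $\partial\Omega$, and $\inf$ denotes the infimum; the boundary distance field satisfies the eikonal equation $\lVert \nabla d \rVert = 1$ and the boundary condition $d|_{\partial\Omega} = 0$, where $\nabla d$ is the gradient vector of the boundary distance field and the boundary condition states that the distance field value at all points on the boundary is 0; define $d_{\max} = \max_{(x, y) \in \Omega} d(x, y)$, where $d_{\max}$ is the maximum value of the boundary distance field;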

[0061] Calculate the normalized radius and polar angle. The normalized radius is calculated using the following formula: $\rho = d(x, y) / d_{\max}$, where $\rho$ is the normalized radius with a range of values $[0, 1]$, $d(x, y)$ is the boundary distance field value, and $d_{\max}$ is the maximum value of the boundary distance field; the polar angle is calculated using the following formula: $\theta = \operatorname{atan2}(y - y_c,\; x - x_c)$, where $\theta$ is the polar angle with a range of values $(-\pi, \pi]$, $(x_c, y_c)$ is the region's centroid, $y - y_c$ is the difference between the pixel's vertical coordinate and the centroid's vertical coordinate, and $x - x_c$ is the difference between the pixel's horizontal coordinate and the centroid's horizontal coordinate; a unit disk coordinate system is constructed using the normalized radius and polar angle;
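To make the distance field and disk coordinates concrete, here is a brute-force sketch over a small pixel set. Several points are assumptions of this sketch rather than statements of the source: boundary pixels are taken to be region pixels with a 4-neighbor outside the region, the normalized radius is taken as the distance value divided by the maximum distance value, and all function names are hypothetical.

```python
import math

def boundary_pixels(region):
    """Region pixels that have at least one 4-neighbor outside the region."""
    steps = ((1, 0), (-1, 0), (0, 1), (0, -1))
    return {(x, y) for (x, y) in region
            if any((x + dx, y + dy) not in region for dx, dy in steps)}

def distance_field(region):
    """Shortest Euclidean distance from every region pixel to the boundary.
    Brute force, quadratic in region size; adequate for small regions only."""
    border = boundary_pixels(region)
    return {p: min(math.dist(p, q) for q in border) for p in region}

def disk_coordinates(region, centroid):
    """Map each pixel to (normalized radius, polar angle about the centroid)."""
    field = distance_field(region)
    d_max = max(field.values())
    cx, cy = centroid
    coords = {}
    for (x, y), d in field.items():
        rho = d / d_max if d_max > 0 else 0.0   # assumed normalization
        theta = math.atan2(y - cy, x - cx)       # lies in [-pi, pi]
        coords[(x, y)] = (rho, theta)
    return coords

# A 5x5 square region: the center pixel is farthest from the boundary.
region = {(x, y) for x in range(5) for y in range(5)}
coords = disk_coordinates(region, (2.0, 2.0))
print(coords[(2, 2)], coords[(3, 2)])
```

For regions of realistic size, a dedicated distance-transform routine would replace the brute-force minimum.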

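[0062] Define the maximum length of the ray along the polar angle direction, calculated using the following formula: $R(\theta) = \sup\{\, s \ge 0 : (x_c, y_c) + s\,\mathbf{u}(\theta) \in \Omega \,\}$, $\mathbf{u}(\theta) = (\cos\theta, \sin\theta)$, where $R(\theta)$ is the maximum length of the ray extending from the region's centroid $(x_c, y_c)$ to the boundary along the direction of polar angle $\theta$, $s$ is the ray length parameter, $\mathbf{u}(\theta)$ is the unit direction vector corresponding to polar angle $\theta$, $\cos\theta$ is the cosine of the polar angle, $\sin\theta$ is the sine of the polar angle, and $\sup$ denotes the supremum; define the inverse mapping, calculated as follows: $(x, y) = (x_c, y_c) + (1 - \rho)\, R(\theta)\, \mathbf{u}(\theta)$, where $(x, y)$ are the coordinates of the original image to which the unit disk coordinates $(\rho, \theta)$ are mapped back and $(1 - \rho)$ is the inverse scaling factor; the disk image is generated using the following formula: $D(\rho, \theta) = I(x, y)$, where $D$ is the disk image, $I$ is the grayscale function of the original image, and $(x, y)$ are the coordinates of the original image obtained through the inverse mapping.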

[0063] It should be noted that the cumulative value of the horizontal coordinate represents the result of the double integral of the horizontal coordinate within the orbitozygomatic region, reflecting the overall cumulative distribution of the horizontal coordinate within the region. The cumulative value of the vertical coordinate represents the result of the double integral of the vertical coordinate within the orbitozygomatic region, reflecting the overall cumulative distribution of the vertical coordinate within the region. The centroid of the region represents the geometric center coordinates of the orbitozygomatic region, reflecting the overall spatial center position of the region. The boundary distance field represents the field distribution formed by the minimum distances from each point within the region to the boundary, reflecting the spatial relationship between each point within the region and the boundary. The minimum Euclidean distance represents the minimum value of the Euclidean distances from a point within the region to all points on the boundary, reflecting the shortest spatial span from the point to the boundary. The maximum value represents the element with the largest value in the boundary distance field, reflecting the distance to the point farthest from the boundary within the region. The maximum distance value represents the specific value corresponding to the maximum value of the boundary distance field, reflecting the upper limit of the maximum distance from a point within the region to the boundary.

[0064] It should be noted that accumulating the horizontal and vertical coordinates of all pixels within the region using double integrals and then dividing by the projected area amounts to an equal-weight average over the pixels, ensuring that the centroid accurately reflects the region's geometric center. For example, if an orbitozygomatic connected region contains 200 pixels, so that its projected area is 200, and the cumulative horizontal coordinate is 400 and the cumulative vertical coordinate is 300, then the centroid's horizontal coordinate is 400 divided by 200, which equals 2, and its vertical coordinate is 300 divided by 200, which equals 1.5. This centroid accurately represents the region's spatial center. Choosing the minimum Euclidean distance ensures the smoothness and monotonicity of the distance field, avoiding abrupt distance changes caused by irregular boundaries.

[0065] It should be noted that the double integral is discretized by pixel-level accumulation, transforming the double integral into a summation over the coordinates of all pixels within the region. This eliminates the need for complex continuous integral calculations and suits the discrete nature of image processing. For incomplete pixels at the region boundary (e.g., pixels only partially contained within the region), the coordinates are weighted by the inclusion ratio before being added to the cumulative value. For example, if an incomplete pixel has an inclusion ratio of 0.5 and a horizontal coordinate of 5, its cumulative contribution is 5 multiplied by 0.5, which equals 2.5. A global maximum extraction method is used: the minimum Euclidean distance to the boundary is traversed for every point in the field, and the largest of these distances is selected directly as the maximum distance value. If several points attain the same maximum, the distance corresponding to any one of them can be taken without affecting the subsequent mapping. Furthermore, the inclusion ratio of an incomplete pixel is the proportion of the pixel's area that intersects the region. For example, if the pixel area is 1 square pixel and its intersection area with the region is 0.3 square pixels, the inclusion ratio is 0.3, and its horizontal and vertical coordinates are included in the cumulative values according to this ratio.
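The inclusion-ratio weighting of the centroid can be illustrated with a small sketch. The function name and the `(x, y, ratio)` tuple layout are assumptions chosen for illustration.

```python
def weighted_centroid(pixels):
    """pixels: iterable of (x, y, inclusion_ratio); complete pixels use ratio 1.0.
    Returns (centroid x, centroid y, projected area)."""
    pixels = list(pixels)
    area = sum(w for _, _, w in pixels)
    sum_x = sum(x * w for x, _, w in pixels)
    sum_y = sum(y * w for _, y, w in pixels)
    return sum_x / area, sum_y / area, area

# Two complete pixels and one boundary pixel with inclusion ratio 0.5.
# The boundary pixel at x = 5 contributes 5 * 0.5 = 2.5 to the x accumulation.
cx, cy, area = weighted_centroid([(0, 0, 1.0), (2, 0, 1.0), (5, 3, 0.5)])
print(cx, cy, area)
```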

[0066] It should be noted that the direction pointed to by the polar angle is the spatial direction corresponding to the polar angle, reflecting the direction in which the ray extends from the centroid. The ray maximum length is the length of the line segment extending from the centroid of the region along the direction pointed to by the polar angle to the boundary of the region, reflecting the actual span from the centroid to the boundary in that direction. The inverse scaling factor is the difference between the unit radius and the normalized radius, reflecting the scale adjustment coefficient used during inverse mapping. The composition operation combines the inverse scaling factor, the ray maximum length, and the direction vector to calculate the offset of the inverse mapping. The inverse mapping coordinate point is the pixel coordinate in the original image obtained by mapping a unit disk coordinate back to the original image. The disk image is the image composed of the pixel brightness values of all inverse mapping coordinate points, reflecting the normalized texture and grayscale distribution of the orbitozygomatic connected region.

[0067] It should be noted that sub-pixel level extraction precision is employed. Intersection points are solved using an iterative approximation method between rays and region boundaries, with 8 iterations. The accuracy of the intersection point coordinates is controlled within 0.01 pixels, ensuring that the calculation error of the maximum ray length is less than 0.05 pixels. When the inverse mapping coordinate point is a non-integer pixel, bilinear interpolation is used to calculate the pixel brightness value. This involves selecting four integer pixels surrounding the coordinate point, assigning weights based on the distance between the coordinate point and the four pixels, and then weighted summing to obtain the interpolated brightness value, thus avoiding the jagged or blurry phenomena seen in disk images.
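The bilinear interpolation step can be sketched as follows. The function name, the row-major list-of-lists image layout, and the clamping of coordinates to the image border are assumptions of this sketch.

```python
import math

def bilinear_sample(image, x, y):
    """Sample image (list of rows, indexed image[row][col]) at a non-integer
    position; x is the column coordinate and y is the row coordinate."""
    height, width = len(image), len(image[0])
    # Clamp to the valid range so the four neighbors always exist.
    x = min(max(x, 0.0), width - 1.0)
    y = min(max(y, 0.0), height - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    fx, fy = x - x0, y - y0
    # Interpolate along x on the two rows, then along y between the rows.
    top = (1.0 - fx) * image[y0][x0] + fx * image[y0][x1]
    bottom = (1.0 - fx) * image[y1][x0] + fx * image[y1][x1]
    return (1.0 - fy) * top + fy * bottom

image = [[0.0, 10.0],
         [20.0, 30.0]]
print(bilinear_sample(image, 0.5, 0.5))
```

Each of the four surrounding pixels is weighted by its proximity to the sample point, which is what suppresses jagged edges in the resulting disk image.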

[0068] In one embodiment of the present invention, the disk image is input into a neural network, and the orbitozygomatic isoperimetric bulging index is used to perform scalar modulation on the intermediate feature map of the network. The angular statistics generated by accumulating the modulated feature map along the radial direction are extracted, and the angular statistics are concatenated with the orbitozygomatic isoperimetric bulging index to form a feature vector. This includes: let the intermediate feature extraction operator of the neural network be $\mathcal{F}$; the disk image $D$ is input to generate the intermediate feature map of the network, calculated as follows: $F(\rho, \theta) = \mathcal{F}[D](\rho, \theta)$, where $F$ is the intermediate feature map of the network, $\rho$ is the normalized radius, $\theta$ is the polar angle, and $D$ is the disk image;

[0069] Define the scaling factor and translation factor. The scaling factor is calculated using the following formula: $\alpha = a_1 B + a_2$, where $B$ is the orbitozygomatic isoperimetric bulging index and $a_1$, $a_2$ are fixed constants; the translation factor is calculated using the following formula: $\beta = a_3 B + a_4$, where $a_3$, $a_4$ are fixed constants; scalar modulation is applied to the intermediate feature map of the network, calculated as follows: $\tilde{F}(\rho, \theta) = \alpha \, F(\rho, \theta) + \beta$, where $F$ is the intermediate feature map and $\tilde{F}$ is the modulated feature map;

[0070] The modulated feature map is discretized: the radial direction is discretized into $N_\rho$ sampling points and the angular direction into $N_\theta$ sampling points, where $N_\rho$ is the number of sampling points in the radial direction and $N_\theta$ is the number of sampling points in the angular direction; the discrete modulated feature map is denoted $\tilde{F}_{i,j} = \tilde{F}(\rho_i, \theta_j)$, where $\rho_i$ is the normalized radius value of the $i$-th radial sampling point and $\theta_j$ is the polar angle value of the $j$-th angular sampling point; the angular statistics are extracted, calculated as follows: $s_j = \sum_{i=1}^{N_\rho} \tilde{F}_{i,j}$, where $s_j$ is the feature value accumulated along the radial direction at the $j$-th angular sampling position, and these values form the angular statistic sequence $(s_1, s_2, \ldots, s_{N_\theta})$;

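[0071] The angular statistic sequence is concatenated with the orbitozygomatic isoperimetric bulging index to form a feature vector, calculated using the following formula: $\mathbf{v} = [\, s_1, s_2, \ldots, s_{N_\theta}, B \,]^{\top}$, where $\mathbf{v}$ is the feature vector, $(s_1, s_2, \ldots, s_{N_\theta})$ is the angular statistic sequence, $B$ is the orbitozygomatic isoperimetric bulging index, and $\top$ denotes the vector transpose operation.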

[0072] It should be noted that the intermediate feature map is the intermediate data matrix generated during neural network feature extraction, reflecting the hierarchical feature information of the disk image. The polar angle is the angular coordinate in the unit disk coordinate system, reflecting the angular position of features within the disk image. The scaling factor is a coefficient calculated from the orbitozygomatic isoperimetric bulging index, reflecting how much the intermediate feature map is magnified or reduced. The translation factor is an offset calculated from the orbitozygomatic isoperimetric bulging index, reflecting the overall offset applied to the intermediate feature map. The modulated feature map is the feature map after scaling and offset adjustment, reflecting optimized features focused on the core geometric features. The sampling angle positions are the preset feature acquisition points along the angular direction, reflecting the discretized positions of the angular statistics.

[0073] It should be noted that the neural network adopts a 3-layer convolutional neural network structure. The first layer has a 3×3 kernel and 32 output channels, the second layer has a 3×3 kernel and 64 output channels, and the third layer has a 2×2 kernel and 128 output channels. The intermediate feature extraction operator yields the output feature map of each convolutional layer; that is, the output of the first convolutional layer is the first-level intermediate feature map, the output of the second layer is the second-level intermediate feature map, and the output of the third layer is the third-level intermediate feature map. The third-level intermediate feature map is selected for subsequent modulation. The fixed constants in the linear multiplication and constant addition operations are preset values, ranging from 0.01 to 2.0. Among them, for calculating the scaling factor, the first fixed constant ranges from 0.5 to 2.0 and the second from 0 to 0.5; for calculating the translation factor, the third fixed constant ranges from 0.1 to 1.0 and the fourth from 0 to 0.3. These values can be tuned to their optimal values on the training dataset. The number of sampling points in the radial direction is set to 32, evenly distributed over the normalized radius range of 0 to 1; the number of sampling points in the angular direction is set to 64, evenly distributed over the polar angle range from -π to π, ensuring that the sampling covers the complete disk image features.
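The modulation and radial accumulation steps can be sketched on a tiny feature map. This sketch assumes a single-channel feature map already sampled on a radial-by-angular grid (rows are radial samples, columns are angular samples); how the 128 channels of the third convolutional layer are reduced to one map is not specified in the text. The function names and default constants are hypothetical, with the constants chosen inside the ranges stated above.

```python
def modulate(feature_map, index, a1=1.0, a2=0.2, a3=0.5, a4=0.1):
    """Scale and shift every feature value using factors derived linearly
    from the bulging index: scale = a1*index + a2, shift = a3*index + a4."""
    scale = a1 * index + a2
    shift = a3 * index + a4
    return [[scale * value + shift for value in row] for row in feature_map]

def angular_statistics(modulated):
    """Accumulate along the radial direction (over rows) for each angle column."""
    return [sum(column) for column in zip(*modulated)]

def feature_vector(feature_map, index, a1=1.0, a2=0.2, a3=0.5, a4=0.1):
    """Angular statistic sequence followed by the bulging index itself."""
    modulated = modulate(feature_map, index, a1, a2, a3, a4)
    return angular_statistics(modulated) + [index]

# Toy grid with 2 radial samples and 2 angular samples.
vector = feature_vector([[1.0, 2.0], [3.0, 4.0]], 0.8)
print(vector)
```

With 64 angular samples the resulting vector has 65 entries: 64 angular statistics plus the index.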

[0074] In one embodiment of the present invention, the feature vector is linearly mapped to obtain the log-odds of each emotion category, and the emotion prediction result and probability distribution are output after normalization, including: let the set of emotion categories be $\mathcal{E} = \{ e_1, e_2, \ldots, e_K \}$, where $\mathcal{E}$ is the set of all emotion categories and $e_k$ is a single emotion category; let the feature vector be $\mathbf{v} = [\, s_1, s_2, \ldots, s_{N_\theta}, B \,]^{\top}$, where $(s_1, s_2, \ldots, s_{N_\theta})$ is the angular statistic sequence, $B$ is the orbitozygomatic isoperimetric bulging index, $N_\theta$ is the number of discrete samples in the angular direction, and $\top$ denotes the vector transpose operation; define the weight matrix $W \in \mathbb{R}^{K \times (N_\theta + 1)}$, where $K$ is the number of emotion categories and $\mathbb{R}$ denotes the real number field, the weight matrix being a real matrix; define the bias vector $\mathbf{b} \in \mathbb{R}^{K}$, the bias vector being a real vector; a linear mapping is performed, calculated as follows: $\mathbf{z} = W \mathbf{v} + \mathbf{b}$, where $\mathbf{z}$ is the log-odds vector, whose components are obtained from the matrix multiplication of the weight matrix and the feature vector plus the bias vector, and $z_k$ is the log-odds of emotion category $e_k$;

[0075] Perform an exponential operation on the log-odds of each emotion category, using the following formula: $u_k = \exp(z_k)$, where $\exp$ is the exponential function with the natural constant as its base and $u_k$ is the individual exponential term corresponding to emotion category $e_k$; calculate the sum of the individual exponential terms over all emotion categories, using the following formula: $S = \sum_{k=1}^{K} u_k$, where the index $k$ runs over the set of emotion categories; the probability component of each emotion category is calculated using the following formula: $p_k = u_k / S$, where $p_k$ is the probability component of emotion category $e_k$, and the probability components satisfy $\sum_{k=1}^{K} p_k = 1$; a probability distribution $(p_1, p_2, \ldots, p_K)$ is constructed from the probability components of all emotion categories. Extract the emotion category with the largest probability component from the probability distribution, calculated as follows: $\hat{e} = e_{k^{*}}$, $k^{*} = \arg\max_{k} p_k$, where $\hat{e}$ is the emotion prediction result and $\arg\max$ denotes taking the index corresponding to the maximum value.

[0076] It should be noted that the number of rows in the weight matrix equals the number of emotion categories, and the number of columns equals the dimension of the feature vector. The log-odds themselves can be positive or negative; after natural exponentiation, each becomes a positive individual exponential term, so negative values do not interfere with the probability calculation. Normalization is then achieved by dividing by the sum, ensuring that all probability components sum to 1, in keeping with the definition of a probability distribution. Assuming there are 3 emotion categories (e.g., positive, neutral, negative) and the feature vector dimension is the number of angle samples plus 1 (e.g., when the number of angle samples is 64, the feature vector dimension is 65), the weight matrix dimension is 3×65 and the bias vector dimension is 3×1. The weight matrix and bias vector are real-valued and are obtained from the training dataset through gradient descent optimization, with the cross-entropy loss function used during training. The default set of emotion categories includes 3 categories: positive emotions, neutral emotions, and negative emotions. If a finer classification is needed, it can be expanded to 6 categories (such as joy, anger, sorrow, fear, surprise, and neutral). In this case, the number of rows in the weight matrix is adjusted to 6, and the dimension of the bias vector is adjusted to 6×1.
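The linear mapping and normalization described above correspond to a standard softmax classifier, which can be sketched as follows. The function name and the toy weights are hypothetical; subtracting the maximum log-odds before exponentiation is a numerical-stability measure added in this sketch, and it leaves the resulting probabilities unchanged.

```python
import math

def predict_emotion(features, weights, bias, labels):
    """weights has one row per emotion category; returns (label, probabilities)."""
    log_odds = [sum(w * f for w, f in zip(row, features)) + b
                for row, b in zip(weights, bias)]
    peak = max(log_odds)  # shift for numerical stability (sketch-only addition)
    exp_terms = [math.exp(z - peak) for z in log_odds]
    total = sum(exp_terms)
    probabilities = [term / total for term in exp_terms]
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return labels[best], probabilities

# Toy example with a 2-dimensional feature vector and 3 categories.
label, probs = predict_emotion(
    [1.0, 2.0],
    [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]],
    [0.0, 0.0, 0.0],
    ["positive", "neutral", "negative"],
)
print(label, probs)
```

In the configuration described in the text, `features` would have 65 entries and `weights` would be a 3×65 matrix learned with a cross-entropy loss.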

[0077] In one embodiment of the present invention, as shown in Figure 2, an image emotion recognition system based on machine vision includes:

[0078] The outer canthus point determination module 201 extracts the eye fissure edge point set within the candidate range of the eye region, obtains the major axis and minor axis parameters of the ellipse through geometric fitting, obtains the estimated value of the monocular yaw angle based on this, and determines the outer canthus point.

[0079] The orbitozygomatic connectivity region determination module 202 extracts the lower eyelid margin curve and the zygomatic superior margin curvature ridge line, and extracts the region enclosed by the lower eyelid margin curve and the zygomatic superior margin curvature ridge line based on the outer canthus point to obtain the orbitozygomatic connectivity region.

[0080] The orbitozygomatic isoperimetric bulging index calculation module 203 extracts the area and boundary arc length of the orbitozygomatic connected region and calculates the isoperimetric quotient. It then uses the monocular yaw angle estimate to perform projection correction on the isoperimetric quotient to obtain the orbitozygomatic isoperimetric bulging index.

[0081] The disk image generation module 204 extracts the boundary distance field and the centroid of the orbitozygomatic connected region, and monotonically maps the pixels in the region to the unit disk coordinate system to generate a disk image.

[0082] The feature extraction module 205 inputs the disk image into the neural network, uses the orbitozygomatic isoperimetric bulging index to perform scalar modulation on the intermediate feature map of the network, extracts the angular statistics generated by accumulating the modulated feature map along the radial direction, and concatenates the angular statistics with the orbitozygomatic isoperimetric bulging index to form a feature vector.

[0083] The emotion prediction module 206 performs linear mapping on the feature vectors to obtain the log odds of each emotion category, and outputs the emotion prediction results and probability distribution after normalization calculation.

[0084] It should be noted that the interval and threshold sizes are set for ease of comparison. The threshold size depends on the amount of sample data and on the base number that those skilled in the art set for each group of sample data, provided that the proportional relationship between the parameter and the quantized value is not affected. Furthermore, the above formulas are all evaluated on dimensionless values; they were obtained by software simulation on a large amount of collected data so as to approximate the real situation as closely as possible. The preset parameters in the formulas are set by those skilled in the art according to the actual situation.

[0085] The embodiments of this example have been described above. However, this example is not limited to the specific implementation methods described above. The specific implementation methods described above are merely illustrative and not restrictive. Those skilled in the art can make many other forms based on the guidance of this example, and all of them are within the protection scope of this example.

Claims

1. A machine vision-based image emotion recognition method, characterized in that it includes the following steps:
Step S101: Extract the eye fissure edge point set within the candidate eye region, obtain the major and minor axis parameters of an ellipse through geometric fitting, obtain the monocular yaw angle estimate from these parameters, and determine the outer canthus point.
Step S102: Extract the lower eyelid margin curve and the zygomatic superior margin curvature ridge line. Based on the outer canthus point, extract the area enclosed by the lower eyelid margin curve and the zygomatic superior margin curvature ridge line to obtain the orbitozygomatic connected region.
Step S103: Extract the area and boundary arc length of the orbitozygomatic connected region and calculate the isoperimeter quotient. Use the monocular yaw angle estimate to apply a projection correction to the isoperimeter quotient to obtain the orbitozygomatic isoperimeter bulging index.
Step S104: Extract the boundary distance field and centroid of the orbitozygomatic connected region, and monotonically map the pixels in the region to the unit disk coordinate system to generate a disk image.
Step S105: Input the disk image into the neural network, use the orbitozygomatic isoperimeter bulging index to perform scalar modulation on the intermediate feature map of the network, extract the angular statistics obtained by accumulating the modulated feature map along the radial direction, and concatenate the angular statistics with the orbitozygomatic isoperimeter bulging index to form a feature vector.
Step S106: Perform linear mapping on the feature vector to obtain the log odds of each emotion category, and output the emotion prediction result and probability distribution after normalization calculation.
Gradient calculation is performed on the image brightness function within the candidate eye region to obtain the gradient components and gradient magnitude. The set of pixel positions at which the gradient magnitude is non-zero and is a local maximum along the gradient normal is extracted to construct the eye fissure edge point set. A quadratic curve equation is constructed and elliptical geometric constraints are applied. Least squares fitting is performed on the eye fissure edge point set to obtain the fitting coefficients. Based on the fitting coefficients, the ellipse center coordinates, rotation angle, major axis length, and minor axis length are calculated, where the major axis length is greater than or equal to the minor axis length and both are greater than zero. The ratio of the minor axis length to the major axis length is calculated, and the inverse cosine of the ratio is taken as the monocular yaw angle estimate, which lies between zero and half of pi. The principal axis direction vector is determined based on the rotation angle. The relative displacement vector of each pixel position in the eye fissure edge point set with respect to the ellipse center is calculated, together with its projection length on the principal axis direction vector. The pixel position with the largest projection length is selected as the outer canthus point.
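
The last part of claim 1 (yaw angle from the axis ratio, outer canthus point from the largest projection on the principal axis) can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `yaw_and_outer_canthus` and its parameters are hypothetical, the ellipse parameters are assumed to come from a fitting step that is not shown, and the projection is treated as a signed dot product, which is only one possible reading of "projection length".

```python
import math

def yaw_and_outer_canthus(edge_points, center, rotation_angle, major_len, minor_len):
    """Return (yaw_estimate, outer_canthus_point) from fitted ellipse parameters.

    edge_points    : iterable of (x, y) pixel positions on the eye fissure edge
    center         : (cx, cy) ellipse center
    rotation_angle : ellipse rotation angle in radians
    major_len, minor_len : axis lengths, with major_len >= minor_len > 0
    """
    if not (major_len >= minor_len > 0):
        raise ValueError("expected major_len >= minor_len > 0")

    # Yaw estimate: inverse cosine of the minor/major axis ratio, in [0, pi/2).
    yaw = math.acos(minor_len / major_len)

    # Principal axis direction vector derived from the rotation angle.
    ux, uy = math.cos(rotation_angle), math.sin(rotation_angle)
    cx, cy = center

    def projection(point):
        # Signed projection of the displacement from the center (assumption).
        return (point[0] - cx) * ux + (point[1] - cy) * uy

    outer_canthus = max(edge_points, key=projection)
    return yaw, outer_canthus
```

An axis ratio of 0.5, for instance, gives a yaw estimate of about pi/3.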

2. The image emotion recognition method based on machine vision according to claim 1, characterized in that: Gaussian smoothing is performed on the image brightness function within the candidate eye region to obtain a smoothed image. The gradient vector and gradient magnitude of the smoothed image are calculated. The unit normal vector is determined based on the gradient vector. The pixel positions in the smoothed image where the first derivative of the gradient magnitude along the unit normal vector is zero and the second derivative is less than zero are extracted. Within the set of pixels whose ordinate is not less than the ordinate of the ellipse center, the lower eyelid margin curve is constructed through pixel connectivity processing.

3. The image emotion recognition method based on machine vision according to claim 2, characterized in that: The second derivative matrix of the smoothed image is calculated and eigenvalue decomposition is performed to obtain the maximum eigenvalue, the minimum eigenvalue, and the eigenvector corresponding to the minimum eigenvalue. The negative of the minimum eigenvalue is taken as the ridge intensity. The pixel positions where the first derivative along the eigenvector corresponding to the minimum eigenvalue is zero and the second derivative is less than zero are extracted. Within the pixel set where the ridge intensity is greater than zero and the ordinate is not less than the ordinate of the ellipse center, the zygomatic superior margin curvature ridge line is constructed through pixel connectivity processing. Rays are cast from the outer canthus point. Over the set of angles for which the ray meets the lower eyelid margin curve at a first intersection point and the zygomatic superior margin curvature ridge line at a second intersection point, with the distance from the outer canthus point to the first intersection point less than the distance to the second intersection point, the pixel segment of each ray between the first and second intersection points is extracted, and the union of these segments forms the orbitozygomatic connected region.
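
The eigenvalue decomposition and ridge intensity described in claim 3 can be illustrated for a single pixel with the closed-form solution for a symmetric 2x2 matrix. This is a sketch under stated assumptions: the function name `ridge_intensity` is hypothetical, and the second derivatives `hxx`, `hxy`, `hyy` are assumed to have been computed from the smoothed image beforehand.

```python
import math

def ridge_intensity(hxx, hxy, hyy):
    """Eigen-decompose the symmetric 2x2 matrix [[hxx, hxy], [hxy, hyy]].

    Returns (lambda_max, lambda_min, v_min, ridge), where v_min is a unit
    eigenvector for lambda_min and ridge = -lambda_min.
    """
    mean = 0.5 * (hxx + hyy)
    half_diff = 0.5 * (hxx - hyy)
    radius = math.hypot(half_diff, hxy)

    lambda_max = mean + radius
    lambda_min = mean - radius

    if radius == 0.0:
        # Isotropic case: every direction is an eigenvector; pick one.
        v_min = (1.0, 0.0)
    else:
        # theta is the orientation of the lambda_max eigenvector;
        # the lambda_min eigenvector is perpendicular to it.
        theta = 0.5 * math.atan2(2.0 * hxy, hxx - hyy)
        v_min = (-math.sin(theta), math.cos(theta))

    return lambda_max, lambda_min, v_min, -lambda_min
```

A pixel qualifies as a ridge candidate in this reading only when the returned ridge value is greater than zero, that is, when the minimum eigenvalue is negative.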

4. The image emotion recognition method based on machine vision according to claim 1, characterized in that: A double integral of the constant function one is taken over the orbitozygomatic connected region to obtain the projected area of the orbitozygomatic connected region on the image plane. The boundary of the orbitozygomatic connected region is parameterized to construct a parametric curve. The rates of change of the parametric curve in the horizontal and vertical directions are calculated, their squares are added together, and the square root of the sum is taken. The result of the square root operation is integrated to obtain the arc length of the boundary of the orbitozygomatic connected region. The product of the projected area and four times pi is calculated and divided by the square of the arc length to obtain the isoperimeter quotient. The secant of the monocular yaw angle estimate is calculated, and the isoperimeter quotient is multiplied by the secant to obtain the orbitozygomatic isoperimeter bulging index.
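
The arithmetic of claim 4 can be sketched on a polygonal boundary. This is an illustration only: approximating the area and arc-length integrals with the shoelace formula and summed edge lengths is an assumption made here, and the function name `bulging_index` is hypothetical.

```python
import math

def bulging_index(boundary, yaw):
    """Return (isoperimeter_quotient, bulging_index) for a closed polygon.

    boundary : list of (x, y) vertices in order around the region
    yaw      : monocular yaw angle estimate in radians, in [0, pi/2)
    """
    n = len(boundary)
    twice_area = 0.0
    arc_length = 0.0
    for i in range(n):
        x0, y0 = boundary[i]
        x1, y1 = boundary[(i + 1) % n]
        twice_area += x0 * y1 - x1 * y0              # shoelace term
        arc_length += math.hypot(x1 - x0, y1 - y0)   # edge length

    area = abs(twice_area) / 2.0
    quotient = 4.0 * math.pi * area / arc_length ** 2

    # Projection correction: multiply by the secant of the yaw estimate.
    return quotient, quotient / math.cos(yaw)
```

For a unit square the quotient is pi/4. A circle gives the maximum value of 1, so the quotient measures how close the region is to a disk before the yaw correction is applied.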

5. The image emotion recognition method based on machine vision according to claim 1, characterized in that: The cumulative value of the x-coordinate is obtained by performing a double integral of the x-coordinate over the orbitozygomatic connected region, and the cumulative value of the y-coordinate is obtained by performing a double integral of the y-coordinate over the orbitozygomatic connected region. The cumulative values of the x-coordinate and y-coordinate are then divided by the projected area to obtain the centroid of the region. The minimum Euclidean distance from each point within the orbitozygomatic connected region to the boundary of the region is calculated to construct a boundary distance field, and the maximum value in the boundary distance field is extracted as the maximum distance value.
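
A discrete version of claim 5 can be sketched on a set of pixel coordinates. Several choices here are assumptions made for illustration: the integrals are replaced by sums over pixels, boundary pixels are taken to be region pixels with a 4-neighbour outside the region, and the distance field is computed by brute force. The function name `centroid_and_distance_field` is hypothetical.

```python
import math

def centroid_and_distance_field(region):
    """Return (centroid, distance_field, max_distance) for a pixel region.

    region : iterable of integer (x, y) pixel coordinates inside the region
    distance_field maps each pixel to its minimum Euclidean distance
    to the boundary pixels of the region.
    """
    region = set(region)
    area = len(region)  # discrete stand-in for the projected area

    # Centroid: accumulated coordinates divided by the area.
    cx = sum(x for x, _ in region) / area
    cy = sum(y for _, y in region) / area

    # Boundary pixels: region pixels with at least one 4-neighbour outside.
    neighbours = ((1, 0), (-1, 0), (0, 1), (0, -1))
    boundary = [
        (x, y) for (x, y) in region
        if any((x + dx, y + dy) not in region for dx, dy in neighbours)
    ]

    distance_field = {
        (x, y): min(math.hypot(x - bx, y - by) for bx, by in boundary)
        for (x, y) in region
    }
    return (cx, cy), distance_field, max(distance_field.values())
```

The brute-force search costs one pass over the boundary per pixel, which is acceptable for a small region but would normally be replaced by a dedicated distance transform.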

6. The image emotion recognition method based on machine vision according to claim 5, characterized in that: The boundary distance field is divided by the maximum distance value to obtain the distance ratio value, and the distance ratio value is subtracted from one to obtain the normalized radius. The horizontal and vertical displacements of each point relative to the centroid of the region are calculated and an arctangent operation is performed to obtain the polar angle. The normalized radius and polar angle are used to construct the unit disk coordinate system. The length of the line segment extending from the centroid of the region to the boundary of the orbitozygomatic connected region along the direction of the polar angle is extracted as the ray maximum length. The unit radius is subtracted from the normalized radius to obtain the inverse scaling factor. The inverse scaling factor, the ray maximum length, and the direction vector along the polar angle are combined and superimposed on the centroid of the region to obtain the inverse mapping coordinate point. The pixel brightness value corresponding to the inverse mapping coordinate point in the image brightness function is extracted to obtain the disk image.
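
The forward part of the mapping in claim 6 (normalized radius and polar angle) can be sketched for a single point. The function name `to_unit_disk` is hypothetical, and the boundary distance and maximum distance are assumed to be available from the distance field of claim 5. The inverse mapping and brightness lookup are not sketched here.

```python
import math

def to_unit_disk(point, centroid, distance, max_distance):
    """Map one region point to (normalized_radius, polar_angle).

    point        : (x, y) position inside the region
    centroid     : (cx, cy) centroid of the region
    distance     : boundary distance field value at the point
    max_distance : maximum value of the boundary distance field (> 0)
    """
    # Normalized radius: one minus the distance ratio, so boundary points
    # (distance 0) land on the unit circle and the deepest point maps to 0.
    radius = 1.0 - distance / max_distance

    # Polar angle from the displacement relative to the centroid.
    angle = math.atan2(point[1] - centroid[1], point[0] - centroid[0])
    return radius, angle
```

Because the radius decreases as the boundary distance grows, the mapping is monotonic in the distance field, which matches the monotonic mapping required in Step S104.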

7. The image emotion recognition method based on machine vision according to claim 1, characterized in that: The disk image is input into a neural network to perform feature extraction, and an intermediate feature map with normalized radius coordinates and polar angle coordinates is constructed. Using the orbitozygomatic isoperimeter bulging index as the independent variable, linear multiplication and constant addition are performed to obtain the scaling coefficient and the translation coefficient, respectively. The intermediate feature map is multiplied element-wise by the scaling coefficient and the translation coefficient is added to obtain the modulated feature map. The modulated feature map is sampled discretely in both the radial and angular directions. At each sampled angular position, the feature sample values distributed along the radial direction are summed to obtain the angular statistics sequence. The angular statistics sequence is concatenated with the orbitozygomatic isoperimeter bulging index to obtain the feature vector.
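
The modulation and radial accumulation of claim 7 can be sketched on a small single-channel feature map. This is one possible reading, not the patented network: the scaling and translation coefficients are each assumed to be an affine function of the bulging index, the four parameters `w_gamma`, `b_gamma`, `w_beta`, `b_beta` are hypothetical names, and the feature map is assumed to be already sampled on a radial-by-angular grid.

```python
def modulated_angular_features(feature_map, index, w_gamma, b_gamma, w_beta, b_beta):
    """Return the angular statistics concatenated with the bulging index.

    feature_map : list of rows; feature_map[i][j] is the sample at radial
                  position i and angular position j
    index       : orbitozygomatic isoperimeter bulging index (scalar)
    """
    # Scaling and translation coefficients derived from the scalar index
    # (affine form is an assumption of this sketch).
    gamma = w_gamma * index + b_gamma
    beta = w_beta * index + b_beta

    # Scalar modulation: element-wise scale, then shift.
    modulated = [[gamma * value + beta for value in row] for row in feature_map]

    # Angular statistics: sum along the radial direction at each angle.
    n_angles = len(feature_map[0])
    angular = [sum(row[j] for row in modulated) for j in range(n_angles)]

    # Feature vector: angular statistics followed by the index itself.
    return angular + [index]
```

Summing along the radius leaves one value per angular position, so the length of the feature vector is the number of angular samples plus one.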

8. The image emotion recognition method based on machine vision according to claim 1, characterized in that: The feature vector is multiplied by the weight matrix and the bias vector is added to obtain the log odds of each emotion category, forming the log odds vector. An exponential operation with the natural constant as the base is performed on each component of the log odds vector to obtain the exponential terms. The sum of all exponential terms is calculated. Each exponential term is divided by the sum to obtain the probability component of each emotion category. A probability distribution is constructed from the probability components of the emotion categories. The emotion category with the largest probability component in the probability distribution is extracted to obtain the emotion prediction result.
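
The linear mapping and normalization in claim 8 amount to a softmax classifier, which can be sketched as follows. The function name `predict_emotion` and the category labels are hypothetical, and subtracting the largest log odds before exponentiation is a numerical-stability step added here; it does not change the resulting probabilities.

```python
import math

def predict_emotion(features, weights, bias, labels):
    """Return (predicted_label, probabilities) for one feature vector.

    features : list of floats (the feature vector)
    weights  : one row of weights per emotion category
    bias     : one bias value per emotion category
    labels   : category names, in the same order as the weight rows
    """
    # Linear mapping: log odds of each emotion category.
    log_odds = [
        sum(w * f for w, f in zip(row, features)) + b
        for row, b in zip(weights, bias)
    ]

    # Exponential terms (shifted by the maximum for stability) and their sum.
    shift = max(log_odds)
    exp_terms = [math.exp(z - shift) for z in log_odds]
    total = sum(exp_terms)

    # Probability component of each category.
    probabilities = [term / total for term in exp_terms]

    # Prediction: the category with the largest probability component.
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return labels[best], probabilities
```

With two categories whose log odds are 0 and ln 3, the probabilities come out as 0.25 and 0.75, and the second category is returned.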

9. A machine vision-based image emotion recognition system, characterized in that it performs the machine vision-based image emotion recognition method according to any one of claims 1 to 8 and includes the following modules:
The outer canthus point determination module extracts the eye fissure edge point set within the candidate eye region, obtains the major and minor axis parameters of an ellipse through geometric fitting, obtains the monocular yaw angle estimate from these parameters, and determines the outer canthus point.
The orbitozygomatic connected region determination module extracts the lower eyelid margin curve and the zygomatic superior margin curvature ridge line. Based on the outer canthus point, it extracts the region enclosed by the lower eyelid margin curve and the zygomatic superior margin curvature ridge line to obtain the orbitozygomatic connected region.
The orbitozygomatic isoperimeter bulging index calculation module extracts the area and boundary arc length of the orbitozygomatic connected region and calculates the isoperimeter quotient. It then applies a projection correction to the isoperimeter quotient using the monocular yaw angle estimate to obtain the orbitozygomatic isoperimeter bulging index.
The disk image generation module extracts the boundary distance field and centroid of the orbitozygomatic connected region, and monotonically maps the pixels in the region to the unit disk coordinate system to generate a disk image.
The feature extraction module inputs the disk image into the neural network, uses the orbitozygomatic isoperimeter bulging index to perform scalar modulation on the intermediate feature map of the network, extracts the angular statistics obtained by accumulating the modulated feature map along the radial direction, and concatenates the angular statistics with the orbitozygomatic isoperimeter bulging index to form a feature vector.
The emotion prediction module performs linear mapping on the feature vector to obtain the log odds of each emotion category, and outputs the emotion prediction result and probability distribution after normalization calculation.