An eye image generation system and method
By finely segmenting the eye region and accounting for gaze direction, the invention addresses the unnatural-looking eye regions produced by existing technologies and generates realistic, natural eye images suitable for face swapping and digital human generation.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- LANZHOU FUTURE NEW FILM CULTURE & TECH GRP CO LTD
- Filing Date
- 2022-10-18
- Publication Date
- 2026-06-23
AI Technical Summary
Existing eye image generation technologies produce unnatural-looking results with significant texture differences between the eye and face areas, and they fail to accurately express the emotional information conveyed by the gaze.
Through image preprocessing, feature point detection, region extraction, transformation, eyeball detection, and fusion, the eye region is finely divided to generate eye images with high realism and naturalness. Because gaze direction is taken into account, the generated eye region is high-definition and its gaze is realistic.
The generated eye areas are realistic, natural, and sharp, and can express any gaze direction. They are suitable for face swapping and digital human generation.
Figure CN115497147B_ABST
Description
TECHNICAL FIELD
[0001] The present application relates to the technical field of face swapping and digital human generation, in particular to an eye image generation system and method.
BACKGROUND
[0002] Eye image generation refers to generating an eye image or texture map with realistic eye features in the eye region of an image, a video, or a three-dimensional face model texture. It is an important part of face image generation. With the development of computer and network technology and rising living standards, consumer entertainment applications continue to multiply, and face image generation has been widely used in face swapping, digital human generation, and other fields. Eyes are an important part of the face, and the quality of the generated eye region directly affects the realism and naturalness of the entire face region.
[0003] Existing eye image generation techniques are mainly based on image region segmentation and deep learning methods. The image region segmentation-based method identifies the eye region using image segmentation, object detection, or semantic segmentation, extracts the eye region, and pastes it into the eye region of the target face image or target texture map. The deep learning-based method mainly processes the entire face region and uses a generative adversarial network to generate a natural eye image consistent with the appearance of the target face.
[0004] The image segmentation-based method produces an unnatural transition between the eye region and the face region, with a large texture difference between them. The deep learning-based method can alleviate the unnatural eye region problem to some extent, but the generated eye region generally differs noticeably from the real target eye, so realism is poor. Moreover, neither method considers the gaze of the real target eye region, so neither can accurately express the emotional information conveyed through the gaze.
SUMMARY
[0005] The purpose of the present application is to provide an eye image generation system and method to solve the problems raised in the background.
[0006] To achieve the above purpose, the present application provides the following technical scheme: an eye image generation system, comprising:
[0007] an image preprocessing module configured to obtain a first eye image and a second eye image, and to obtain a third eye image and a fourth eye image by preprocessing the first eye image and the second eye image;
[0008] an eye feature point detection module configured to perform feature point detection on the third eye image and the fourth eye image, and extract eye feature point sets of the third eye image and the fourth eye image;
[0009] an eye region extraction module configured to generate a third eye region image and a fourth eye region image according to the eye feature point set of the third eye image and the eye feature point set of the fourth eye image;
[0010] an eye region transformation module configured to transform the fourth eye region image and the eye feature point set of the fourth eye image into the first eye image coordinate space, and generate a fifth eye transformation image and an eye feature point set of the fifth eye transformation image;
[0011] an eyeball detection module configured to extract eyeball regions and eyeball boundary feature points of the fifth eye transformation image and the first eye image;
[0012] an eyeball generation module configured to generate an eyeball region of the fifth eye transformation image according to the eyeball region of the fifth eye transformation image and the eyeball boundary feature points;
[0013] an eye fusion module configured to fuse the eye region of the first eye image and the eye region of the fifth eye transformation image to obtain a sixth eye fusion image;
[0014] an eye region generation module configured to perform texture fusion and boundary processing on the sixth eye fusion image and the first eye image to generate a seventh eye generation image.
[0015] Preferably, the eye region extraction module determines eye boundary candidate regions according to the eye feature point set, and determines eye boundary regions by using an image processing method.
[0016] Preferably, the eye region extraction module determines eye boundary candidates according to different physical meanings of eye feature points, and optimizes the candidate boundaries according to image gray scale features and geometric features.
[0017] Preferably, the eye region transformation module transforms the fourth eye region image and the eye feature point set of the fourth eye region image into the first eye image according to the eye feature point sets of the third eye image and the fourth eye image and the eye boundary regions.
[0018] Preferably, the eye region transformation module performs the transformation based on matching relationships between feature points, and adopts either an overall transformation or a feature-point-based block transformation according to the characteristics of the eye region.
[0019] Preferably, the eyeball detection module performs eyeball region detection on the fifth eye transformation image and the first eye image, and determines the accurate boundary region and boundary points of the eyeball based on the eye feature point set and an image processing method.
[0020] Preferably, the eyeball detection module uses image grayscale information, image geometric information, deep learning methods, or a combination of these to determine the boundary points of the eyeball.
[0021] Preferably, the eyeball generation module uses the eyeball boundary points to perform circle fitting, and performs pixel interpolation on the missing pixels in the eyeball portion of the fifth eye transformation image to generate the eyeball region of the fifth eye transformation image.
[0022] A method for generating an eye image includes the following steps:
[0023] 1) Obtain the first eye image and the second eye image, and perform image preprocessing on the first eye image and the second eye image to obtain the third eye image and the fourth eye image;
[0024] 2) Perform eye feature point detection on the third eye image and the fourth eye image respectively to obtain the eye feature point sets of the third eye image and the fourth eye image;
[0025] 3) Extract the eye regions from the third eye image and the fourth eye image respectively to form the third eye region image and the fourth eye region image;
[0026] 4) Transform the eye feature point set of the fourth eye image to the coordinate space of the first eye image to form the fifth eye feature point set; transform the fourth eye region image to the coordinate space of the first eye image to form the fifth eye transformed image;
[0027] 5) Perform eyeball region detection on the fifth eye transformation image and the first eye image to obtain the eyeball region and its boundary feature points for each image;
[0028] 6) Generate the eyeball region of the fifth eye transformation image based on the eyeball regions and boundary feature points of the fifth eye transformation image and the first eye image;
[0029] 7) Fuse the eyeball region and non-eyeball region generated from the fifth eye transformation image with the first eye image to obtain the sixth eye fused image;
[0030] 8) Perform image fusion and boundary processing on the sixth eye fusion image and the first eye image to obtain the seventh eye generated image.
[0031] Preferably, the image preprocessing includes image quality enhancement, image resolution enhancement, and orthographic projection of the eye texture map. The eye feature points include inner and outer eye contour feature points, eyeball feature points, pupil feature points, and gaze tracking points. The gaze tracking points are a feature representation that describes the gaze direction of the human eye.
[0032] Compared with the prior art, the beneficial effects of the present invention are:
[0033] The invention provides a system and method for generating eye images. The eye region is finely divided according to its physiological characteristics and each sub-region is generated separately, which improves the realism and naturalness of the generated eye images and makes the system suitable for applications such as face swapping and digital human generation. The generated eye regions and gaze directions are realistic, natural, and sharp. An eye image with any gaze direction can be generated from a real human eye, without requiring large amounts of data or computation. Different regions of the eye are processed separately, and important eye features receive dedicated processing to ensure a realistic overall result.
BRIEF DESCRIPTION OF THE DRAWINGS
[0034] Figure 1 is a flowchart of the generation method of the present invention.
DETAILED DESCRIPTION
[0035] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0036] In the description of this invention, it should be understood that the terms "upper", "lower", "front", "rear", "left", "right", "top", "bottom", "inner", "outer", etc., indicate the orientation or positional relationship based on the orientation or positional relationship shown in the accompanying drawings. They are only for the convenience of describing this invention and simplifying the description, and do not indicate or imply that the device or element referred to must have a specific orientation, or be constructed and operated in a specific orientation. Therefore, they should not be construed as limitations on this invention.
[0037] Example:
[0038] Referring to Figure 1, the present invention provides a technical solution whose specific operation steps are as follows:
[0039] Acquire the first eye image and the second eye image;
[0040] The first eye image and the second eye image are preprocessed respectively. The resolution and clarity of the second eye image are improved to obtain the fourth eye image. If the first eye image is an eye texture of a 3D face model, the first eye image is preprocessed into an orthographic projection image to obtain the third eye image.
[0041] Eye feature point detection is performed on the third and fourth eye images respectively, using contour detection or deep learning methods to detect key points such as eye feature points and gaze direction key points. Deep learning methods can obtain the eye feature points from facial landmark detection results and use gaze tracking results as the gaze direction key points. Contour detection methods can extract eye feature points and gaze direction features based on binarization, contour detection, and gradient features. Each approach has a weakness on its own: eye feature points from deep learning-based facial landmark detection are imprecise, indicating the approximate location of the eye area but not the true eye contour, while contour detection generates many candidate contours, making it difficult to identify the true eye contour. Combining the two gives better results: the deep learning-based eye feature points define the initial region, and contour detection is then applied within it, which reduces both the processing area and the number of candidates while producing a better eye contour boundary.
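The combined strategy in this paragraph, where the landmark result bounds the search and contour detection refines it, can be sketched as follows. This is a minimal illustration and not the patented implementation: the function names, the `margin` and `min_fraction` parameters, and the rule of keeping the longest surviving contour are all assumptions made for the example, and plain Python lists stand in for the contour arrays an image library would return.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]


def bounding_box(points: List[Point], margin: float) -> Box:
    """Axis-aligned box around the landmarks, grown by `margin` pixels."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return min(xs) - margin, min(ys) - margin, max(xs) + margin, max(ys) + margin


def inside_fraction(contour: List[Point], box: Box) -> float:
    """Fraction of contour points that fall inside the box."""
    x0, y0, x1, y1 = box
    hits = sum(1 for x, y in contour if x0 <= x <= x1 and y0 <= y <= y1)
    return hits / len(contour)


def select_eye_contour(
    landmarks: List[Point],
    candidates: List[List[Point]],
    margin: float = 5.0,        # assumed tolerance around the landmarks
    min_fraction: float = 0.9,  # assumed share of points that must be inside
) -> Optional[List[Point]]:
    """Keep candidates lying inside the landmark region; return the longest."""
    box = bounding_box(landmarks, margin)
    kept = [c for c in candidates if c and inside_fraction(c, box) >= min_fraction]
    if not kept:
        return None
    # Assumed tie-break: the contour with the most points is the eye outline.
    return max(kept, key=len)


landmarks = [(10, 10), (30, 10), (20, 5), (20, 15)]
eyebrow = [(10, -20), (20, -22), (30, -20)]
eye = [(11, 10), (20, 6), (29, 10), (20, 14)]
speck = [(20, 10)]
chosen = select_eye_contour(landmarks, [eyebrow, eye, speck])
```

Here the eyebrow contour is rejected because it lies outside the landmark region, and the single-point speck loses to the longer eye contour.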
[0042] Eye regions are extracted from the third and fourth eye images respectively. For each image, the feature points on the eye boundary are determined from the image and its eye feature points. Either the polygon connecting these boundary feature points is used directly as the eye contour boundary, or the feature points define an initial candidate boundary region within which the precise eye boundary contour is then located, and the eye region image is determined accordingly. Representing a region boundary with a polygon is common practice in this field. Here the polygon is sampled from the contour obtained in the previous step so that its vertices are evenly distributed along the eye contour. When the interior of the eye is later divided into triangular sub-regions, the vertices of this polygon serve as the outer contour vertices.
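Sampling polygon vertices evenly along the eye contour can be illustrated with a short arc-length resampling routine. The sketch below is an assumption about how such sampling might be done; the function name `resample_closed_contour` and the choice of linear interpolation along contour segments are illustrative and are not specified in the source.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def resample_closed_contour(contour: List[Point], n: int) -> List[Point]:
    """Return n points spaced evenly by arc length along a closed polyline."""
    if n < 1 or len(contour) < 2:
        raise ValueError("need n >= 1 and at least two contour points")
    # Segment i runs from contour[i] to contour[(i + 1) % len(contour)].
    segments = []
    for i, a in enumerate(contour):
        b = contour[(i + 1) % len(contour)]
        segments.append((a, b, math.dist(a, b)))
    perimeter = sum(length for _, _, length in segments)
    if perimeter == 0:
        raise ValueError("contour has zero length")
    step = perimeter / n
    samples = []
    index = 0        # segment currently being walked
    start = 0.0      # arc length at the start of that segment
    for k in range(n):
        target = k * step
        # Advance to the segment that contains the target arc length.
        while index < len(segments) - 1 and start + segments[index][2] <= target:
            start += segments[index][2]
            index += 1
        a, b, length = segments[index]
        t = 0.0 if length == 0 else (target - start) / length
        samples.append((a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1])))
    return samples


square = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
vertices = resample_closed_contour(square, 8)
```

The resulting vertices could then serve as the outer contour vertices when the eye interior is triangulated.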
[0043] Based on the eye feature points of the fourth eye region image and those of the third eye image, the fourth eye region image and its eye feature points are transformed into the third eye image space. The transformation is a block transformation driven by the eye feature points, and the blocks can be chosen according to the selected feature points. The result in the third eye image space, together with its eye feature points, is then transformed into the first eye image space according to the preprocessing transformation, yielding the fifth eye transformed image.
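One common way to realize a block transformation driven by matched feature points is a piecewise affine mapping over triangles, where each point keeps its barycentric coordinates when moved from a source triangle to the matching target triangle. The source does not say which block scheme is used, so the triangle-based approach and the names `barycentric`, `map_point`, and `block_transform` below are assumptions for illustration only.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]
Triangle = Tuple[Point, Point, Point]


def barycentric(p: Point, tri: Triangle) -> Tuple[float, float, float]:
    """Barycentric coordinates of p with respect to triangle tri."""
    (x1, y1), (x2, y2), (x3, y3) = tri
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    if det == 0:
        raise ValueError("degenerate triangle")
    w1 = ((y2 - y3) * (p[0] - x3) + (x3 - x2) * (p[1] - y3)) / det
    w2 = ((y3 - y1) * (p[0] - x3) + (x1 - x3) * (p[1] - y3)) / det
    return w1, w2, 1.0 - w1 - w2


def map_point(p: Point, src: Triangle, dst: Triangle) -> Point:
    """Affine-map p from src to dst by reusing its barycentric weights."""
    w = barycentric(p, src)
    x = sum(wi * v[0] for wi, v in zip(w, dst))
    y = sum(wi * v[1] for wi, v in zip(w, dst))
    return x, y


def block_transform(
    p: Point, src_tris: List[Triangle], dst_tris: List[Triangle], eps: float = 1e-9
) -> Optional[Point]:
    """Map p with the affine transform of the first block that contains it."""
    for src, dst in zip(src_tris, dst_tris):
        if all(wi >= -eps for wi in barycentric(p, src)):
            return map_point(p, src, dst)
    return None  # p lies outside every block


src_blocks = [((0, 0), (2, 0), (0, 2)), ((2, 0), (2, 2), (0, 2))]
dst_blocks = [((10, 10), (14, 10), (10, 14)), ((14, 10), (14, 16), (10, 14))]
a = block_transform((1.0, 0.5), src_blocks, dst_blocks)
b = block_transform((1.5, 1.5), src_blocks, dst_blocks)
```

Because the two blocks share an edge whose endpoints map to the same targets, the mapping stays continuous across the block boundary even though each block has its own affine transform.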
[0044] Eyeball detection is performed on the first eye image and the fifth eye transformed image to obtain the eyeball region contour and boundary points. The detection is based on image processing methods such as binarization, contour detection, and Hough circle detection. Edge points are determined from eyeball features and used to fit the boundary of the eyeball region, which determines the eyeball region. For standard unwrapped eye textures, Hough circle detection balances efficiency and accuracy. For the eye region of a frontal face texture captured by a user, Hough circle detection or contour detection alone is ineffective because the eyeball contour is incomplete, the resolution is low, and the image is blurry. In that case, binarization is applied first to reduce the impact of image quality and unclear boundary contours. Morphological operations then clean up the poor binarization result that low resolution produces in the eye region. Finally, contour detection yields candidate eyeball boundaries. Using the polygonal eye region, the points of the candidate boundaries that are likely to lie on the eyeball contour are roughly selected, and a circle is fitted to these points by least squares; the fitted circle is taken as the eyeball contour boundary. This approach gives the best results, detecting the complete eyeball boundary and thereby determining the eyeball region.
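The least-squares circle fit at the end of this step can be sketched with the algebraic formulation, which solves a small linear system. The source names only "least squares fitting", so the specific algebraic variant and the helper names `solve3` and `fit_circle` are assumptions. The example fits points from the lower arc only, mimicking an eyeball partly covered by the eyelid.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def solve3(m: Sequence[Sequence[float]], v: Sequence[float]) -> List[float]:
    """Solve a 3x3 linear system by Gauss-Jordan elimination with pivoting."""
    a = [list(row) + [rhs] for row, rhs in zip(m, v)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("points are collinear or too few")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(3):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [x - f * y for x, y in zip(a[r], a[col])]
    return [a[i][3] / a[i][i] for i in range(3)]


def fit_circle(points: List[Point]) -> Tuple[float, float, float]:
    """Fit x^2 + y^2 = A*x + B*y + C in the least-squares sense.

    Returns (cx, cy, r) with cx = A/2, cy = B/2, r = sqrt(C + cx^2 + cy^2).
    """
    if len(points) < 3:
        raise ValueError("need at least three points")
    n = float(len(points))
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    syy = sum(y * y for _, y in points)
    sxy = sum(x * y for x, y in points)
    z = [x * x + y * y for x, y in points]
    sxz = sum(x * zi for (x, _), zi in zip(points, z))
    syz = sum(y * zi for (_, y), zi in zip(points, z))
    sz = sum(z)
    # Normal equations of the linear least-squares problem.
    big_a, big_b, big_c = solve3(
        [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, n]], [sxz, syz, sz]
    )
    cx, cy = big_a / 2.0, big_b / 2.0
    r_squared = big_c + cx * cx + cy * cy
    if r_squared < 0:
        raise ValueError("no real circle fits these points")
    return cx, cy, math.sqrt(r_squared)


# Boundary points from a partial arc of a circle centred at (3, -2), radius 5.
arc = [
    (3 + 5 * math.cos(math.radians(d)), -2 + 5 * math.sin(math.radians(d)))
    for d in range(200, 341, 20)
]
cx, cy, r = fit_circle(arc)
```

Fitting from a partial arc is what makes this step useful when the candidate contour points cover only the visible part of the eyeball boundary.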
[0045] An eyeball generation operation is performed on the fifth eye transformation image. Pixel interpolation is performed on the missing eyeball pixels in the region. The pixel interpolation can be performed using the bilinear interpolation method to generate the eyeball region.
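Bilinear interpolation estimates a pixel value at a fractional coordinate from its four nearest neighbours. The sketch below shows only that sampling step on a grayscale image stored as nested lists; how each missing eyeball pixel is mapped to a source coordinate is not specified in the source, and the function name `bilinear_sample` is an assumption.

```python
from typing import List


def bilinear_sample(img: List[List[float]], x: float, y: float) -> float:
    """Sample a grayscale image at fractional (x, y) by bilinear interpolation."""
    height, width = len(img), len(img[0])
    # Clamp to the image so border samples reuse the edge pixels.
    x = min(max(x, 0.0), width - 1)
    y = min(max(y, 0.0), height - 1)
    x0, y0 = int(x), int(y)  # floor, since x and y are non-negative here
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    fx, fy = x - x0, y - y0
    top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
    bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


patch = [[0.0, 10.0], [20.0, 30.0]]
centre = bilinear_sample(patch, 0.5, 0.5)
```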
[0046] Image texture fusion is performed between the eye region of the first eye image and the eye region of the fifth eye transformation image. The two eye regions can be aligned and fused as a whole, or the eyeball and non-eyeball regions can be fused separately, to obtain the sixth eye fusion image. Fusing the eyeball and non-eyeball regions separately yields better results. The eyeball region should resemble the real user's eye, whereas for the non-eyeball region the first eye image usually blends the white of the eye with the eye boundary better; the non-eyeball region of a standard texture is also more realistic and usually contains details such as blood vessels. Fusing by region, with a different fusion method for each region, therefore makes the eyes more expressive while preserving the user's likeness.
[0047] The fusion formula is:
[0048] [formula not reproduced in the source text]
[0049] [formula not reproduced in the source text]
[0050] [formula not reproduced in the source text]
[0051] where the symbols denote, in order: the first eye image; the transformation from the first eye image to the fifth eye image; the scaling factor; the transformed first eye image; the mask matrix of the eyeball region, in which eyeball pixels are 1 and all other pixels are 0; the operator that multiplies corresponding elements of two matrices; the fusion weight parameters for the eyeball region and the non-eyeball region; the eyeball region and non-eyeball region of the transformed first eye image; and the sixth eye fusion image, which is the final fused result.
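The definitions in this paragraph describe a mask-weighted fusion with separate weights for the eyeball and non-eyeball regions. Since the formulas themselves are not available in this text, the following is only one assumed form consistent with those definitions: a per-pixel convex blend whose weight is selected by the eyeball mask. The function name `fuse_regions`, the weight names, and the blend itself are illustrative assumptions and may differ from the patented formula.

```python
from typing import List

Image = List[List[float]]


def fuse_regions(
    first: Image,      # first eye image, already aligned with `fifth`
    fifth: Image,      # fifth eye transformation image
    mask: List[List[int]],  # 1 for eyeball pixels, 0 elsewhere
    w_eyeball: float,  # assumed weight of `fifth` inside the eyeball
    w_other: float,    # assumed weight of `fifth` outside the eyeball
) -> Image:
    """Blend two aligned grayscale images with a region-dependent weight."""
    fused = []
    for row1, row5, row_m in zip(first, fifth, mask):
        fused_row = []
        for p1, p5, m in zip(row1, row5, row_m):
            w = w_eyeball if m else w_other
            fused_row.append(w * p5 + (1 - w) * p1)
        fused.append(fused_row)
    return fused


first = [[100.0, 100.0]]
fifth = [[200.0, 200.0]]
mask = [[1, 0]]
sixth = fuse_regions(first, fifth, mask, w_eyeball=0.75, w_other=0.25)
```

A high eyeball weight keeps the user's eyeball appearance, while a low weight outside it keeps the sclera detail of the first eye image, matching the rationale given for region-wise fusion.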
[0052] The sixth eye fusion image and the first eye image are fused together. The pixels around the eyes in the sixth eye fusion image are then subjected to texture fusion and image filtering to obtain the seventh eye generated image, which is the result generated on the first eye image based on the second eye image.
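Boundary processing of this kind can be illustrated by filtering only the pixels near the edge of the fused region, leaving the rest untouched. The source does not specify the filter, so the mean filter, the `radius` parameter, and the function name `smooth_boundary` below are assumptions chosen for simplicity.

```python
from typing import List


def smooth_boundary(
    img: List[List[float]], mask: List[List[int]], radius: int = 1
) -> List[List[float]]:
    """Mean-filter only pixels whose neighbourhood straddles the mask edge."""
    height, width = len(img), len(img[0])
    out = [list(row) for row in img]
    for y in range(height):
        for x in range(width):
            rows = range(max(0, y - radius), min(height, y + radius + 1))
            cols = range(max(0, x - radius), min(width, x + radius + 1))
            labels = {mask[j][i] for j in rows for i in cols}
            if len(labels) > 1:  # both mask values nearby: a boundary pixel
                window = [img[j][i] for j in rows for i in cols]
                out[y][x] = sum(window) / len(window)
    return out


image = [[0.0, 0.0, 90.0, 90.0]]
region = [[0, 0, 1, 1]]
smoothed = smooth_boundary(image, region)
```

The hard step between the two regions becomes a gradual ramp, while pixels away from the seam keep their original values.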
[0053] The foregoing has shown and described the basic principles, main features, and advantages of the present invention. It will be apparent to those skilled in the art that the present invention is not limited to the details of the exemplary embodiments described above, and that the invention can be implemented in other specific forms without departing from its spirit or basic characteristics. Therefore, the embodiments should be considered exemplary and non-limiting in all respects. The scope of the invention is defined by the appended claims rather than the foregoing description. Therefore, all variations falling within the meaning and scope of equivalents of the claims are intended to be included within the present invention, and no reference numerals in the claims should be construed as limiting the scope of the claims.
[0054] Although embodiments of the invention have been shown and described, it will be understood by those skilled in the art that various changes, modifications, substitutions and alterations can be made to these embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the appended claims and their equivalents.
Claims
1. An eye image generation system, characterized in that it comprises: an image preprocessing module configured to obtain a first eye image and a second eye image, and to preprocess the first eye image and the second eye image to obtain a third eye image and a fourth eye image; an eye feature point detection module configured to perform feature point detection on the third eye image and the fourth eye image, and to extract the eye feature point sets of the third eye image and the fourth eye image; an eye region extraction module configured to generate a third eye region image and a fourth eye region image based on the eye feature point set of the third eye image and the eye feature point set of the fourth eye image; an eye region transformation module configured to transform the fourth eye region image and the eye feature point set of the fourth eye image into the coordinate space of the first eye image, and to generate a fifth eye transformation image and the eye feature point set of the fifth eye transformation image; an eyeball detection module configured to extract the eyeball regions and eyeball boundary feature points of the fifth eye transformation image and the first eye image; an eyeball generation module configured to perform circle fitting using the eyeball boundary feature points, and to perform pixel interpolation on the missing pixels in the eyeball part of the fifth eye transformation image to generate the eyeball region of the fifth eye transformation image; an eye fusion module configured to fuse the eyeball region and non-eyeball region generated from the fifth eye transformation image with the eyeball region and non-eyeball region of the first eye image to obtain a sixth eye fusion image; and an eye region generation module configured to perform texture fusion and boundary processing on the sixth eye fusion image and the first eye image to generate a seventh eye generation image.
2. The eye image generation system according to claim 1, characterized in that the eye region extraction module determines candidate regions for the eye boundary based on the eye feature point set, and uses image processing methods to determine the eye boundary region.
3. The eye image generation system according to claim 1, characterized in that the eye region extraction module determines the candidate boundaries of the eye based on the different physical meanings of the eye feature points, and optimizes the candidate boundaries based on the grayscale and geometric features of the image.
4. The eye image generation system according to claim 1, characterized in that the eye region transformation module transforms the fourth eye region image and the eye feature points of the fourth eye region image to the first eye image based on the eye feature point sets and eye boundary regions of the third eye image and the fourth eye image.
5. The eye image generation system according to claim 1, characterized in that the eye region transformation module performs the transformation based on the matching relationship between feature points, and adopts either an overall transformation or a feature-point-based block transformation according to the characteristics of the eye region.
6. The eye image generation system according to claim 1, characterized in that the eyeball detection module performs eyeball region detection on the fifth eye transformation image and the first eye image, and determines the accurate boundary region and boundary points of the eyeball based on the eye feature point set and an image processing method.
7. The eye image generation system according to claim 1, characterized in that the eyeball detection module uses image grayscale information, image geometric information, deep learning methods, or a combination of these to determine the boundary points of the eyeball.
8. A method for generating an eye image, characterized in that it includes the following steps: 1) obtain the first eye image and the second eye image, and perform image preprocessing on the first eye image and the second eye image to obtain the third eye image and the fourth eye image; 2) perform eye feature point detection on the third eye image and the fourth eye image respectively to obtain the eye feature point sets of the third eye image and the fourth eye image; 3) extract the eye regions from the third eye image and the fourth eye image respectively to form the third eye region image and the fourth eye region image; 4) transform the eye feature point set of the fourth eye image to the coordinate space of the first eye image to form the fifth eye feature point set, and transform the fourth eye region image to the coordinate space of the first eye image to form the fifth eye transformation image; 5) perform eyeball region detection on the fifth eye transformation image and the first eye image respectively to obtain the eyeball region and its boundary feature points for each image; 6) generate the eyeball region of the fifth eye transformation image based on the eyeball regions and boundary feature points of the fifth eye transformation image and the first eye image; 7) fuse the eyeball region and non-eyeball region generated from the fifth eye transformation image with the eyeball region and non-eyeball region of the first eye image, respectively, to obtain the sixth eye fusion image; 8) perform image fusion and boundary processing on the sixth eye fusion image and the first eye image to obtain the seventh eye generation image.
9. The method for generating an eye image according to claim 8, characterized in that the image preprocessing includes image quality enhancement, image resolution enhancement, and orthographic projection of the eye texture map; the eye feature points include inner and outer eye contour feature points, eyeball feature points, pupil feature points, and gaze tracking points; and the gaze tracking points are a feature representation that describes the gaze direction of the human eye.