Multi-dimensional oral image screening method

By employing a multi-dimensional oral image screening method, combined with multi-source image acquisition and a lightweight deep learning model, the problem of low accuracy in early screening of oral cancer has been solved, achieving efficient and accurate identification and early detection of abnormal oral locations.

CN117132975B · Active · Publication Date: 2026-06-19 · WUHAN LANDING INTELLIGENCE MEDICAL CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
WUHAN LANDING INTELLIGENCE MEDICAL CO LTD
Filing Date
2023-08-26
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

Current technologies for early screening of oral cancer suffer from low image recognition accuracy and significant differences in sampling results from different locations, affecting the efficiency and accuracy of early detection.

Method used

A multi-dimensional oral image screening method is adopted, including swab collection of oral mucosal cell samples, HE staining, microscopic scanning, multi-source image acquisition, lightweight deep learning model CNN recognition and support vector machine denoising processing, combined with multi-dimensional artificial intelligence image recognition, to accurately locate abnormal locations through multi-source and multi-mode image acquisition.

Benefits of technology

It improves the accuracy and efficiency of oral image screening, reduces the workload of professional pathologists, ensures the reliability and consistency of identification results, and provides an effective means of early cancer screening.

✦ Generated by Eureka AI based on patent content.


Abstract

This invention provides a multi-dimensional oral cavity image screening method. The method comprises the following steps: S1, collecting oral mucosa samples via swab to obtain oral epithelial cell samples; S2, performing HE staining; S3, spreading the stained oral epithelial cell samples onto a glass slide; S4, microscopically scanning the oral epithelial cell samples to obtain oral epithelial cell images; S5, preprocessing the oral epithelial cell images, adjusting their size to a uniform 224×224×3 format, and performing normalization; S6, inputting the processed oral epithelial cell image data into a deep learning model (CNN); and S7, outputting the classification and recognition results of the oral epithelial cells. By employing artificial intelligence to assist in the recognition of oral cavity images, the workload of professional pathologists is significantly reduced. The use of a lightweight deep learning model (CNN) improves the efficiency of recognition. The use of a support vector machine beforehand removes adhering molecules and unstained substances, thus ensuring the accuracy of the lightweight model.

Description

Technical Field

[0001] This invention relates to the field of oral image screening, and in particular to a multi-dimensional oral image screening method.

Background Technology

[0002] Oral cell carcinoma is a common malignant tumor, and early detection and treatment significantly affect patients' survival rate and quality of life. Image screening is a rapid, non-invasive detection method that can assist doctors in diagnosing oral cancer. Chinese patent document CN 111563887 A describes a vision-based intelligent analysis method and device for oral images, but that scheme mainly analyzes conventional visual images and does not extend to microscopic tissue and cell images. Jiao Long's "Research on Oral Cancer Image Recognition Based on Deep Learning" describes a scheme for recognizing oral cancer pathological images with convolutional neural networks and introduces various models; however, these models are usually quite large and slow to run. Furthermore, they are trained on high-quality image data and perform poorly in practice, because many of the images collected in clinical settings are of unsatisfactory quality.

A further technical problem identified is as follows. Oral epithelial cells are polygonal or rectangular and approximately 20-30 micrometers in size. The cytoplasm contains many small granular structures called organelles, including mitochondria, endoplasmic reticulum, and Golgi apparatus. The outer surface of the cell membrane carries numerous glycoprotein molecules called adhesion molecules, which mediate interaction with neighboring cells or tissues. In addition, unstained substances, such as cell nuclei or cytoplasmic fragments, can also be observed. All of these can interfere with the model during image recognition and reduce recognition accuracy. Furthermore, the oral cavity is a large area, and tissue sampled from different locations can yield significantly different recognition results. Clinically, there have been cases where samples taken from different locations in the same patient's oral cavity gave completely different results. These technical challenges hinder the development of oral pathology recognition and make early detection of oral cancer more difficult. It should be noted that the description in the background section is not an admission of prior art; the inventor's identification of new technical problems is also part of the inventor's innovative contribution.

Summary of the Invention

[0003] The technical problem to be solved by the present invention is to provide a multi-dimensional oral image screening method that improves the accuracy of oral image screening and helps physicians detect oral image abnormalities as early as possible, together with an oral scanning device that helps improve the efficiency and accuracy of oral image screening.

[0004] To solve the above-mentioned technical problems, the technical solution adopted by the present invention is: a multi-dimensional oral cavity image screening method, comprising the following steps:

[0005] S1. Collect oral mucosa samples via swab to obtain oral epithelial cell samples;

[0006] S2. Perform HE staining;

[0007] S3. Spread the stained oral epithelial cell sample onto a glass slide;

[0008] S4. Obtain images of oral epithelial cells by microscopic scanning of oral epithelial cell samples;

[0009] S5. Preprocess the oral epithelial cell image, adjust the size to a uniform format of 224 × 224 × 3, and perform normalization.

[0010] S6. Input the processed oral epithelial cell image data into the deep learning model CNN;

[0011] S7. Output the classification and identification results of oral epithelial cells.
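The preprocessing in step S5 can be illustrated with a minimal sketch. It assumes that "normalization" means scaling 8-bit pixel values to the range [0, 1] and that a nearest-neighbour resize is acceptable; the patent specifies neither, and the function names are hypothetical.

```python
import numpy as np

TARGET_SIZE = 224  # side length required by step S5


def resize_nearest(image: np.ndarray, size: int = TARGET_SIZE) -> np.ndarray:
    """Resize an H x W x 3 image to size x size x 3 by nearest-neighbour sampling."""
    height, width = image.shape[:2]
    # Source row/column that each target row/column samples from.
    rows = np.arange(size) * height // size
    cols = np.arange(size) * width // size
    return image[rows][:, cols]


def preprocess(image: np.ndarray) -> np.ndarray:
    """Return a float32 array of shape (224, 224, 3) with values in [0, 1]."""
    if image.ndim != 3 or image.shape[2] != 3:
        raise ValueError("expected an H x W x 3 RGB image")
    resized = resize_nearest(image)
    # Assumption: normalization = scaling 8-bit pixel values to [0, 1].
    return resized.astype(np.float32) / 255.0


# Usage with a synthetic 8-bit tile standing in for a real microscope scan.
tile = np.random.default_rng(0).integers(0, 256, size=(600, 800, 3), dtype=np.uint8)
batch = preprocess(tile)[np.newaxis, ...]  # add a batch axis for the CNN input
```

A production pipeline would more likely use an interpolating resize from an imaging library; the index-based version here only keeps the sketch dependency-light.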

[0012] In the preferred embodiment, the step of acquiring oral cavity images is also included:

[0013] S01. Acquire intraoral images using a multi-light source method;

[0014] S02. Input the images acquired by the multi-source method into the deep learning recognition model;

[0015] S03. Assess abnormal oral cavity locations;

[0016] S04. Based on the output results, swabs are collected from the oral mucosa at abnormal locations in the oral cavity.

[0017] In the preferred embodiment, in step S01, an intraoral image is acquired using an oral scanning device in a conventional scanning manner. During the acquisition process, intraoral images are acquired using blue laser and white light respectively to obtain a fluorescence image a and a white light image b.

[0018] In the preferred embodiment, in step S02, the fluorescence image a and the white light image b are input into different recognition models A and B, respectively, and the output results are input into a decision tree, random forest tree, or gradient boosting tree model to output the location of oral abnormalities.

[0019] In the preferred embodiment, after the conventional scanning in step S01, the method further includes rinsing the mouth with an indicative fluorescent agent and then acquiring images of the oral cavity with the oral scanning device. During the acquisition process, a co-focused laser light source scans the inside of the oral cavity to obtain an oral tissue image c excited by the laser.

[0020] In the preferred embodiment, in step S02, the fluorescence image a and the white light image b are input into different recognition models A and B, respectively, the oral histology image c is input into the recognition model C, and the output results are input into the decision tree, random forest tree, or gradient boosting tree model to output the location of oral abnormalities.

[0021] Before the mouth is rinsed, a swab is used to collect oral mucosa from abnormal locations in the oral cavity;

[0022] Based on the output of the recognition model C, the sample collected by the swab is selected.

[0023] In the preferred embodiment, step S5 includes image preprocessing, which involves denoising the image using a support vector machine (SVM) to remove adhering molecules and unstained material impurities.

[0024] In the preferred embodiment, the deep learning model CNN in step S6 adopts a lightweight CNN model with a 5-layer architecture: the first layer is the input layer; the second layer is a convolutional and pooling layer with EfficientNet-B0; the third layer is a global average pooling layer; the fourth layer is a fully connected layer with 1024 neurons; and the fifth layer is an output layer with a sigmoid activation function. Based on a statistical sample review and the distribution of class samples, the original dataset is randomly divided into two subsets, with 80% used as the training set and the remaining 20% as the test set. The model is trained on the training set, and its performance is evaluated on the test set.
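The 80/20 division can be sketched as follows. The sketch assumes that taking "the distribution of class samples" into account means a stratified split, so that each class keeps the same proportion in both subsets; the function name, seed, and file names are hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(samples, labels, train_fraction=0.8, seed=42):
    """Split (sample, label) pairs into train/test sets, class by class."""
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, test = [], []
    for label in sorted(by_class):
        items = by_class[label]
        rng.shuffle(items)
        cut = round(len(items) * train_fraction)
        train += [(item, label) for item in items[:cut]]
        test += [(item, label) for item in items[cut:]]
    return train, test


# Usage with hypothetical file names: 50 normal (0) and 30 abnormal (1) images.
names = [f"cell_{i:03d}.png" for i in range(80)]
classes = [0] * 50 + [1] * 30
train_set, test_set = stratified_split(names, classes)
```

Splitting per class keeps a rare abnormal class from ending up almost entirely in one subset, which a plain random shuffle could do on a small dataset.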

[0025] The deep learning model CNN uses manually labeled images as the training dataset for training, and the current output results are manually labeled and added to the training dataset for expansion.

[0026] The outputs added to the training set include a positive feedback training set, which contains the correct results identified by the deep learning model CNN; and a negative feedback training set, which contains the incorrect results identified by the deep learning model CNN.
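The feedback loop described above can be sketched in a few lines. The record type, field names, and the 0/1 label encoding are assumptions made for illustration; the patent only states that manually confirmed outputs are divided into a positive and a negative feedback set and added to the training data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReviewedResult:
    image_id: str
    predicted: int  # label output by the CNN (0 = normal, 1 = abnormal)
    confirmed: int  # label assigned by the human reviewer


def split_feedback(results):
    """Separate reviewed outputs into positive and negative feedback sets."""
    positive = [r for r in results if r.predicted == r.confirmed]
    negative = [r for r in results if r.predicted != r.confirmed]
    return positive, negative


def expand_training_set(training_set, results):
    """Append every reviewed image with its human-confirmed label."""
    return training_set + [(r.image_id, r.confirmed) for r in results]


# Usage with made-up review records.
reviewed = [
    ReviewedResult("slide_01", predicted=1, confirmed=1),
    ReviewedResult("slide_02", predicted=0, confirmed=1),
    ReviewedResult("slide_03", predicted=0, confirmed=0),
]
positive_set, negative_set = split_feedback(reviewed)
training_set = expand_training_set([("slide_00", 0)], reviewed)
```

Both sets enter the training data with the reviewer's label, so a misclassified image contributes a corrected example rather than reinforcing the model's error.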

[0027] In a preferred embodiment, the oral scanning device includes a glass cover with a cavity inside. The glass cover extends along the length direction. A motor is provided on one end face of the glass cover. The output shaft of the motor is connected to a rod. The rod is arranged along the length direction of the glass cover. An image acquisition head, a blue laser light source, and a white LED are provided on the surface of the rod.

[0028] It also includes a main control unit, which contains a memory and / or a wireless transmission device.

[0029] The main control unit is electrically connected to the motor, image acquisition head, blue laser light source, and white LED;

[0030] It also has a power supply for powering the system.

[0031] In a preferred embodiment, a co-focusing laser source is also provided on the rod along its length direction. The co-focusing laser source is an array of laser sources, with columns along the length direction and rows along the circumference direction, and at least multiple lasers in the same row are focused on the same focal point.

[0032] This invention provides a multi-dimensional oral image screening method that significantly reduces the workload of professional pathologists by employing artificial intelligence-assisted oral image recognition. Furthermore, this invention utilizes a lightweight deep learning model (CNN) to improve recognition efficiency. To overcome the issue of reduced accuracy with lightweight models, a support vector machine (SVM) is used beforehand to remove adhering molecules and unstained substances, thus ensuring the accuracy of the lightweight model. Before swab collection, multi-source, multi-mode image acquisition, combined with multi-dimensional AI image recognition, accurately identifies abnormal locations in the oral cavity, improving the accuracy of swab collection and consequently, the accuracy of oral image screening. In particular, the co-focused laser light source scanning method can excite local histological images, providing effective assistance for early head and neck cell carcinoma screening. The oral scanning device of this invention can quickly acquire intraoral images in multiple modes, providing guidance for subsequent acquisition and recognition of oral pathological tissue images. Moreover, the multi-dimensional oral images also serve as weight parameters for subsequent oral pathological tissue image recognition, further improving the accuracy of the final results. The present invention also expands the training dataset by using positive feedback datasets and negative feedback datasets respectively after the detection results are manually confirmed, so as to realize the iteration of the training dataset.

Attached Figure Description

[0033] The present invention will be further described below with reference to the accompanying drawings and embodiments:

[0034] Figure 1 is a flowchart of the present invention.

[0035] Figure 2 is a flowchart illustrating the oral cavity image scanning process of the present invention.

[0036] Figure 3 is a schematic diagram of the oral cavity scanning device of the present invention.

[0037] Figure 4 is an image of oral epithelial cells in which abnormalities were identified.

[0038] Figure 5 is another image of oral epithelial cells in which abnormalities were identified.

[0039] Figure 6 is another image of oral epithelial cells in which abnormalities were identified.

[0040] In the drawings: 1. Motor; 2. Transparent cover; 3. Image acquisition head; 4. Co-focused laser light source; 5. Blue laser light source; 6. White LED; 7. Rod.

Detailed Implementation

[0041] Example 1:

[0042] As shown in Figure 1, a multi-dimensional oral cavity image screening method comprises the following steps:

[0043] S1. Collect oral mucosa samples via swab to obtain oral epithelial cell samples;

[0044] S2. Perform HE staining;

[0045] S3. Spread the stained oral epithelial cell sample onto a glass slide;

[0046] S4. Obtain images of oral epithelial cells by microscopic scanning of oral epithelial cell samples;

[0047] S5. Preprocess the oral epithelial cell image, adjust the size to a uniform format of 224 × 224 × 3, and perform normalization.

[0048] S6. Input the processed oral epithelial cell image data into the deep learning model CNN;

[0049] S7. Output the classification and identification results of oral epithelial cells.

[0050] As shown in Figure 2, in the preferred embodiment, the step of acquiring oral cavity images is also included:

[0051] S01. Acquire intraoral images using a multi-light source method;

[0052] S02. Input the images acquired by the multi-source method into the deep learning recognition model;

[0053] S03. Assess abnormal oral cavity locations;

[0054] S04. Based on the output results, swabs are collected from the oral mucosa at abnormal locations in the oral cavity.

[0055] In the preferred embodiment, in step S01, an intraoral image is acquired using an oral scanning device in a conventional scanning manner. During the acquisition process, intraoral images are acquired using both blue laser and white light to obtain a fluorescence image a and a white light image b. The wavelength of the blue laser is 415 nm to 530 nm.

[0056] In the preferred embodiment, in step S02, the fluorescence image a and the white light image b are input into different recognition models A and B, respectively. The output results are then input into a decision tree, random forest tree, or gradient boosting tree model to output the location of oral abnormalities. Different recognition models are used to identify images from different light sources. The results are then processed by the decision tree, random forest tree, or gradient boosting tree model according to weights to obtain the output result, which points to the location of oral abnormalities. Using the location of oral abnormalities to guide swab collection can significantly improve the hit rate of collection and enable early detection of clues to oral cancer.
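The fusion stage of step S02 can be sketched as a small stacking model. The sketch assumes that recognition models A and B each emit one abnormality score per oral region and that the fusion model is trained on regions with clinician-confirmed labels; all scores, region names, and the function name are made up for illustration. A scikit-learn decision tree is used here; `RandomForestClassifier` or `GradientBoostingClassifier` from `sklearn.ensemble` could be substituted with the same `fit` and `predict_proba` calls.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Hypothetical per-region scores: column 0 from model A (fluorescence image a),
# column 1 from model B (white light image b). Label 1 marks regions that a
# clinician confirmed as abnormal. All values are invented for illustration.
scores = np.array([
    [0.05, 0.10], [0.10, 0.20], [0.20, 0.15], [0.15, 0.05],
    [0.90, 0.80], [0.85, 0.95], [0.80, 0.70], [0.95, 0.90],
])
confirmed = np.array([0, 0, 0, 0, 1, 1, 1, 1])

fusion = DecisionTreeClassifier(max_depth=2, random_state=0)
fusion.fit(scores, confirmed)


def abnormal_regions(region_ids, score_a, score_b, threshold=0.5):
    """Return ids of regions whose fused abnormality probability exceeds threshold."""
    stacked = np.column_stack([score_a, score_b])
    fused = fusion.predict_proba(stacked)[:, 1]  # probability of class 1
    return [rid for rid, p in zip(region_ids, fused) if p > threshold]


flagged = abnormal_regions(
    ["left cheek", "right cheek", "under tongue"],
    [0.08, 0.92, 0.12],
    [0.12, 0.88, 0.18],
)
```

The regions returned by `abnormal_regions` would then guide where the swab is taken in step S04.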

[0057] In a preferred embodiment, after the conventional scanning in step S01, the method further includes rinsing the mouth with an indicative fluorescent agent. An intraoral image is then acquired using the oral scanning device; during acquisition, a co-focused laser light source scans the oral cavity to produce a laser-excited oral tissue image c. Optionally, the indicative fluorescent agent is, for example, a c-Met-targeting cMBP peptide fluorescent imaging agent for oral squamous cell carcinoma, tongue squamous cell carcinoma, pharyngeal squamous cell carcinoma, or laryngeal squamous cell carcinoma, with the amino acid sequence KSLSRHDHIHHHK. The co-focused laser light source has a wavelength of 765-785 nm.

[0058] The oral scanning device acquires images by placing it on the left and right sides of the mouth and under the tongue, respectively, and then automatically acquiring the images. When the co-focused laser light source is activated, multiple light source units simultaneously focus on a single point to excite the fluorescence image of the indicative fluorescent agent.

[0059] In the preferred embodiment, in step S02, the fluorescence image a and the white light image b are input into different recognition models A and B, respectively, the oral histology image c is input into the recognition model C, and the output results are input into the decision tree, random forest tree, or gradient boosting tree model to output the location of oral abnormalities.

[0060] Before the mouth is rinsed, a swab is used to collect oral mucosa from abnormal locations in the oral cavity;

[0061] Based on the output of the recognition model C, the sample collected by the swab is selected.

[0062] In the preferred embodiment, step S5 includes image preprocessing, which involves denoising the image using a support vector machine (SVM) to remove adhering molecules and unstained material impurities.
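The patent does not describe how the SVM is applied, so the following is only one possible reading: a pixel-level SVM trained on hand-labelled colour samples, with pixels classified as impurity replaced by a background value. The colour values, function names, and the choice of a linear kernel are all assumptions.

```python
import numpy as np
from sklearn.svm import SVC


def train_pixel_svm(cell_pixels, impurity_pixels):
    """Fit an SVM separating stained-cell colours (0) from impurity colours (1)."""
    features = np.vstack([cell_pixels, impurity_pixels])
    labels = np.array([0] * len(cell_pixels) + [1] * len(impurity_pixels))
    return SVC(kernel="linear").fit(features, labels)


def remove_impurities(image, svm, background=1.0):
    """Replace pixels classified as impurity with the background value."""
    flat = image.reshape(-1, 3)
    is_impurity = svm.predict(flat).astype(bool)
    cleaned = flat.copy()
    cleaned[is_impurity] = background
    return cleaned.reshape(image.shape)


# Usage with synthetic colours in [0, 1]; both reference colours are invented.
rng = np.random.default_rng(0)
cell_colour = np.array([0.70, 0.30, 0.60])      # stand-in for HE-stained cell
impurity_colour = np.array([0.85, 0.85, 0.80])  # stand-in for unstained debris
svm = train_pixel_svm(
    cell_colour + rng.normal(0, 0.02, (50, 3)),
    impurity_colour + rng.normal(0, 0.02, (50, 3)),
)

image = np.tile(cell_colour, (4, 4, 1))  # 4 x 4 image of cell-coloured pixels
image[1, 2] = impurity_colour            # one impurity pixel
cleaned = remove_impurities(image, svm)
```

A patch-level classifier using texture features would be an equally plausible reading; the pixel-level version is shown only because it is the shortest to state.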

[0063] In the preferred embodiment, feature extraction is performed. The feature values of the oral epithelial cell image include cell morphology, size, nuclear shape, cytoplasmic color, and texture. The feature extraction method used is SIFT, SURF, or ORB.

[0064] Multiple feature values are selected and fused to obtain a comprehensive feature vector. Feature fusion methods include simple weighted average, principal component analysis (PCA), and linear discriminant analysis (LDA). This scheme employs independent feature extraction and feature fusion steps, which further reduces the computational load of the CNN model. Furthermore, because of their single-branch structure, the independent feature extraction and feature fusion steps are themselves more efficient. For large-scale image recognition tasks, the efficiency improvement is significant.
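Of the fusion options listed, PCA is the simplest to sketch. The example below assumes that each feature group (for instance morphology and texture) has already been extracted into a numeric block with one row per cell; the random blocks and function names are placeholders, and the SIFT/SURF/ORB extraction itself is not shown.

```python
import numpy as np


def zscore(block):
    """Standardise each column so feature groups are on a comparable scale."""
    std = block.std(axis=0)
    return (block - block.mean(axis=0)) / np.where(std == 0, 1.0, std)


def fuse_pca(blocks, n_components):
    """Concatenate feature blocks and project onto the top principal components."""
    stacked = np.hstack([zscore(b) for b in blocks])
    centred = stacked - stacked.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    return centred @ vt[:n_components].T


# Usage with placeholder features for 20 cells.
rng = np.random.default_rng(1)
morphology = rng.normal(size=(20, 4))  # e.g. size and shape descriptors
texture = rng.normal(size=(20, 6))     # e.g. texture descriptors
fused = fuse_pca([morphology, texture], n_components=3)
```

Standardising each block before concatenation keeps a feature group with large numeric values from dominating the principal components.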

[0065] In the preferred embodiment, the deep learning model CNN in step S6 adopts a lightweight CNN model with a 5-layer architecture: the first layer is the input layer; the second layer is a convolutional and pooling layer with EfficientNet-B0; the third layer is a global average pooling layer; the fourth layer is a fully connected layer with 1024 neurons, preferably with rectified linear units and a dropout rate of 0.5; the fifth layer is an output layer with a sigmoid activation function. With the optimized lightweight CNN model, the present invention significantly improves the efficiency of image recognition.
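The five-layer architecture can be sketched with Keras. This is an illustration under assumptions, not the patented implementation: it assumes a single sigmoid unit (binary normal/abnormal output), randomly initialised backbone weights (`weights=None`), and the Adam optimizer with binary cross-entropy, none of which the patent specifies.

```python
import tensorflow as tf


def build_screening_cnn(input_shape=(224, 224, 3)):
    """Assemble a five-stage lightweight classifier around EfficientNet-B0."""
    inputs = tf.keras.Input(shape=input_shape)                  # layer 1: input
    backbone = tf.keras.applications.EfficientNetB0(
        include_top=False, weights=None, input_shape=input_shape
    )                                                           # layer 2: conv + pooling
    x = backbone(inputs)
    x = tf.keras.layers.GlobalAveragePooling2D()(x)             # layer 3
    x = tf.keras.layers.Dense(1024, activation="relu")(x)       # layer 4
    x = tf.keras.layers.Dropout(0.5)(x)                         # dropout rate 0.5
    outputs = tf.keras.layers.Dense(1, activation="sigmoid")(x)  # layer 5
    return tf.keras.Model(inputs, outputs)


model = build_screening_cnn()
# Assumed training configuration; the patent does not name a loss or optimizer.
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```

As far as the Keras documentation describes, its bundled EfficientNet models rescale inputs internally and expect pixel values in the 0-255 range, so the normalization of step S5 may need to be adapted when this particular backbone implementation is used.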

[0066] The deep learning model CNN uses manually labeled images as the training dataset for training, and the current output results are manually labeled and added to the training dataset for expansion.

[0067] The outputs added to the training set include a positive feedback training set, containing correctly identified results from the CNN deep learning model, and a negative feedback training set, containing incorrectly identified results from the CNN deep learning model. By iterating the training dataset in this way, the recognition accuracy is further improved. Figures 4-6 show images in which abnormal oral epithelial cells are identified.

[0068] Example 2:

[0069] As shown in Figure 3, in the above-mentioned multi-dimensional oral image screening method, the oral scanning device includes a glass cover 2 with a cavity inside. The glass cover 2 extends along its length direction. A motor 1 is provided on one end face of the glass cover 2. The output shaft of the motor 1 is connected to a rod 7. The rod 7 is arranged along the length direction of the glass cover 2. An image acquisition head 3, a blue laser light source 5, and a white LED 6 are provided on the surface of the rod 7. Preferably, the motor 1 is a stepper motor or a servo motor.

[0070] It also includes a main control unit, which contains a memory and / or a wireless transmission device.

[0071] The main control unit is electrically connected to motor 1, image acquisition head 3, blue laser light source 5 and white LED 6;

[0072] It also has a power supply for powering the system.

[0073] In a preferred embodiment, a co-focusing laser source 4 is also provided on the rod 7 along the length direction. The co-focusing laser source 4 is an array of laser sources, with columns along the length direction and rows along the circumference direction, and at least multiple lasers in the same row are focused on the same focal point.

[0074] The oral scanning device acquires images by placing it on the left and right sides of the mouth and under the tongue. The main control unit then drives motor 1 to rotate, illuminating the corresponding light sources. The image acquisition head 3 captures and stores the images or transmits them wirelessly to a terminal device, such as a computer or mobile phone, for processing. When the co-focused laser light source is activated, multiple lasers in the same row focus on the same focal point to excite the fluorescent image of the indicative fluorescent agent.

[0075] A three-dimensional oral cavity model is established. The oral cavity scanning device scans images obtained at preset positions and maps these images onto the three-dimensional oral cavity model based on the rotation and position parameters of motor 1. Then, a planar image is obtained using a projection method. The planar image is input into the corresponding recognition model to identify abnormal locations in the oral cavity. The oral cavity scanning device of this invention is easy to use and has good recognition results.
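The mapping from motor parameters to a planar image is not detailed in the patent. As a rough illustration only, the sketch below assumes a stepper motor, a cylindrical model around the rod, and a simple "unrolled cylinder" projection in which the rotation step selects the image column and the axial offset along the rod selects the row. Every name and dimension is hypothetical.

```python
def unwrap_position(step_index, steps_per_rev, offset_mm, rod_length_mm, width, height):
    """Map a motor step and an axial offset to (column, row) in the planar image."""
    # Rotation angle, expressed in whole steps, selects the column.
    column = (step_index % steps_per_rev) * width // steps_per_rev
    # Position along the rod selects the row, clamped to the last row.
    row = min(int(offset_mm / rod_length_mm * height), height - 1)
    return column, row


# Usage: a 200-step motor, a 40 mm rod, and a 1024 x 256 planar image.
quarter_turn = unwrap_position(50, 200, 20.0, 40.0, 1024, 256)
wrapped_turn = unwrap_position(250, 200, 40.0, 40.0, 1024, 256)
```

A real device would need a calibrated oral cavity model rather than a cylinder; the sketch only shows how rotation and position parameters can index a planar projection.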

[0076] Preferably, a heating device is also provided inside the transparent cover 2. The heating device is electrically connected to the main control device to heat the oral scanning device to human body temperature so as to prevent water vapor from being generated in the transparent cover 2 and affecting the field of vision.

[0077] The above embodiments are merely preferred technical solutions of the present invention and should not be considered as limitations on the present invention. The embodiments and features described in these embodiments can be arbitrarily combined without conflict. The scope of protection of the present invention should be limited to the technical solutions described in the claims, including equivalent substitutions of the technical features described in the claims. That is, equivalent substitutions and improvements within this scope are also within the scope of protection of the present invention.

Claims

1. A multi-dimensional oral cavity image screening method, comprising the following steps:
Steps for acquiring oral cavity images:
S01. Acquire intraoral images using a multi-light source method. In step S01, an intraoral image is acquired using a conventional scanning method with an oral scanning device. During the acquisition process, intraoral images are acquired using blue laser and white light respectively to obtain fluorescence image a and white light image b, and a fluorescence image set and a white light image set are established respectively. After the conventional scanning, the mouth is rinsed with an indicative fluorescent agent, and images of the oral cavity are then acquired using the oral scanning device. During the acquisition process, a co-focused laser light source scans the inside of the oral cavity to obtain the histological images of the oral cavity excited by the laser.
The oral scanning device includes a glass cover (2), which has a cavity inside. The glass cover (2) extends along the length direction. A motor (1) is provided on one end face of the glass cover (2). The output shaft of the motor (1) is connected to a rod (7). The rod (7) is arranged along the length direction of the glass cover (2). An image acquisition head (3), a blue laser light source (5), and a white LED (6) are provided on the surface of the rod (7). It is also equipped with a main control unit, which contains a memory and / or a wireless transmission device; the main control unit is electrically connected to the motor (1), the image acquisition head (3), the blue laser light source (5), and the white LED (6); it is also equipped with a power supply for power supply. A co-focusing laser source (4) is also provided on the rod (7) along the length direction. The co-focusing laser source (4) is an array of laser sources, with columns along the length direction and rows along the circumference direction. At least multiple lasers in the same row are focused on the same focal point.
S02. Input the images acquired by the multi-source method into the deep learning recognition model;
S03. Assess abnormal oral cavity locations;
S04. Based on the output results, swabs are collected from the oral mucosa at abnormal locations in the oral cavity.
S1. Collect oral mucosa samples via swab to obtain oral epithelial cell samples;
S2. Perform HE staining;
S3. Spread the stained oral epithelial cell sample onto a glass slide;
S4. Obtain images of oral epithelial cells by microscopic scanning of oral epithelial cell samples;
S5. Preprocess the oral epithelial cell image, adjust the size to a uniform format of 224 × 224 × 3, and perform normalization;
S6. Input the processed oral epithelial cell image data into the deep learning model CNN;
S7. Output the classification and identification results of oral epithelial cells.

2. The multi-dimensional oral image screening method according to claim 1, characterized in that: in step S02, the fluorescence image a and the white light image b are input into different trained recognition models A and B, respectively. The output results are input into decision tree, random forest tree, or gradient boosting tree models to output the location of oral abnormalities.

3. The multi-dimensional oral image screening method according to claim 1, characterized in that: in step S02, the fluorescence image a and the white light image b are input into different recognition models A and B, respectively, and the oral histology image c is input into the recognition model C. The output results are input into the decision tree, random forest tree, or gradient boosting tree model to output the location of oral abnormalities. Before the mouth is rinsed, a swab is used to collect oral mucosa samples from abnormal locations in the oral cavity; based on the output of the recognition model C, the sample collected by the swab is selected.

4. The multi-dimensional oral image screening method according to claim 1, characterized in that: Step S5 includes image preprocessing, which involves denoising the image using a support vector machine to remove adhering molecules and unstained material impurities.

5. The multi-dimensional oral image screening method according to claim 1, characterized in that: The deep learning model CNN in step S6 adopts a lightweight CNN model with a 5-layer architecture: the first layer is the input layer; the second layer is a convolutional and pooling layer with EfficientNet-B0; the third layer is a global average pooling layer; the fourth layer is a fully connected layer with 1024 neurons; and the fifth layer is an output layer with a sigmoid activation function. Based on the statistical sample review and the distribution of class samples, the original dataset is randomly divided into two subsets, with 80% used as the training set and the remaining 20% as the test set. The model is trained using the training set and tested using the test set. The deep learning model CNN uses manually labeled images as the training dataset for training, and the current output results are manually labeled and added to the training set for expansion. The outputs added to the training set include a positive feedback training dataset, which contains the correct results identified by the deep learning model CNN; and a negative feedback training dataset, which contains the incorrect results identified by the deep learning model CNN.