Ultrasound image text annotation adding method and device
Candidate sample images that match the ultrasound image to be annotated are identified from annotated sample images, and their text information is used to add text annotations to the ultrasound image automatically. This addresses the low efficiency of traditional manual annotation and makes adding text annotations efficient and flexible.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- WUHAN UNITED IMAGING HEALTHCARE CO LTD
- Filing Date
- 2022-11-30
- Publication Date
- 2026-06-19
Description
Technical Field
[0001] This application relates to the field of image processing technology, and in particular to a method and apparatus for adding text annotations to ultrasound images.
Background Technology
[0002] With the development of medical imaging technology, ultrasound images are playing an increasingly important role in medical diagnosis and auxiliary treatment. In clinical applications, doctors usually need to add text annotations to the ultrasound images after the ultrasound examination. For example, for an ultrasound examination of the kidneys, "left kidney" and "right kidney" need to be added to the ultrasound images.
[0003] In traditional techniques, doctors usually rely on personal experience to manually add text annotations, which is inefficient. Moreover, the content of the annotations varies depending on the clinical application scenario. With the increasing prominence of the role of ultrasound examination and the growing number of ultrasound examinations, the traditional method of manually adding text annotations can no longer meet the needs of practical applications.
[0004] Therefore, current methods for adding text annotations in the field of ultrasound imaging technology suffer from low efficiency.
Summary of the Invention
[0005] Therefore, it is necessary to provide an efficient method, apparatus, computer device, and computer-readable storage medium for adding text annotations to ultrasound images to address the aforementioned technical problems.
[0006] Firstly, this application provides a method for adding text annotations to ultrasound images. The method includes:
[0007] Obtain a sample image containing annotated text, and identify the text information of the annotated text in the sample image; the text information includes the number of texts, the text position, and the text content;
[0008] From the sample images, candidate sample images that match the ultrasound images to be annotated are determined;
[0009] The text information of the annotated text in the candidate sample image is determined as candidate text information;
[0010] Based on the candidate text information, text annotations are added to the ultrasound image to be annotated.
[0011] In one embodiment, adding text annotations to the ultrasound image to be annotated based on the candidate text information includes:
[0012] If the number of candidate sample images exceeds a preset threshold, target text information is selected from the candidate text information;
[0013] Text annotations are added to the ultrasound image to be annotated based on the number of texts, the text position, and the text content of the target text information.
[0014] In one embodiment, selecting target text information from the candidate text information includes:
[0015] The maximum value is selected from the number of texts corresponding to the candidate text information to obtain the target text number;
[0016] The candidate text information corresponding to the target text quantity is determined as the target text information.
[0017] In one embodiment, selecting target text information from the candidate text information further includes:
[0018] The target information weight is obtained by selecting the maximum value from the information weights corresponding to the candidate text information.
[0019] The candidate text information corresponding to the target information weight is determined as the target text information.
[0020] In one embodiment, adding text annotations to the ultrasound image to be annotated based on the candidate text information further includes:
[0021] Based on the information weights, the text quantity, text position, and text content type identifier corresponding to the candidate text information are weighted and summed to obtain the weighted text quantity, weighted text position, and weighted text content type identifier.
[0022] Text annotations are added to the ultrasound image to be annotated based on the weighted text quantity, the weighted text position, and the weighted text content type identifier.
[0023] In one embodiment, the step of identifying the text information of the annotation text in the sample image includes:
[0024] The sample image is input into a pre-trained text information recognition model to obtain the number of texts, text positions, and text content of the annotated text in the sample image;
[0025] The number of texts, the position of the texts, and the content of the texts are used as the text information of the annotation text in the sample image.
[0026] In one embodiment, the method further includes:
[0027] Select a target training sample from the training samples and obtain the sample identifier corresponding to the target training sample;
[0028] The target training sample is input into the text information recognition model to be trained to obtain the recognition result of the target training sample;
[0029] Based on the difference between the recognition result of the target training sample and the sample identifier, the text information recognition model to be trained is trained to obtain the pre-trained text information recognition model.
[0030] In one embodiment, determining candidate sample images that match the ultrasound image to be annotated from the sample images includes:
[0031] Determine the matching degree between the ultrasound image to be annotated and the sample image;
[0032] If the matching degree exceeds a preset threshold, the sample image corresponding to the matching degree is determined as the candidate sample image.
[0033] In one embodiment, selecting target text information from the candidate text information further includes:
[0034] Select a target update time from the update times corresponding to the candidate text information; the target update time is closest to the current time.
[0035] The candidate text information corresponding to the target update time is determined as the target text information.
[0036] Secondly, this application also provides a device for adding text annotations to ultrasound images. The device includes:
[0037] The acquisition and recognition module is used to acquire a sample image containing annotated text and to recognize the text information of the annotated text in the sample image; the text information includes the number of texts, the text position, and the text content.
[0038] The sample determination module is used to determine candidate sample images that match the ultrasound image to be annotated from the sample images;
[0039] The information determination module is used to determine the text information of the annotation text in the candidate sample image as candidate text information;
[0040] The annotation addition module is used to add text annotations to the ultrasound image to be annotated based on the candidate text information.
[0041] Thirdly, this application also provides a computer device. The computer device includes a memory and a processor, the memory storing a computer program, and the processor executing the computer program to perform the following steps:
[0042] Obtain a sample image containing annotated text, and identify the text information of the annotated text in the sample image; the text information includes the number of texts, the text position, and the text content;
[0043] From the sample images, candidate sample images that match the ultrasound images to be annotated are determined;
[0044] The text information of the annotated text in the candidate sample image is determined as candidate text information;
[0045] Based on the candidate text information, text annotations are added to the ultrasound image to be annotated.
[0046] Fourthly, this application also provides a computer-readable storage medium. The computer-readable storage medium stores a computer program thereon, which, when executed by a processor, performs the following steps:
[0047] Obtain a sample image containing annotated text, and identify the text information of the annotated text in the sample image; the text information includes the number of texts, the text position, and the text content;
[0048] From the sample images, candidate sample images that match the ultrasound images to be annotated are determined;
[0049] The text information of the annotated text in the candidate sample image is determined as candidate text information;
[0050] Based on the candidate text information, text annotations are added to the ultrasound image to be annotated.
[0051] The aforementioned method, apparatus, computer device, and storage medium for adding text annotations to ultrasound images acquire sample images containing annotation text, identify the text information of the annotation text in the sample images, determine candidate sample images that match the ultrasound image to be annotated from the sample images, determine the text information of the annotation text in the candidate sample images as candidate text information, and add text annotations to the ultrasound image to be annotated based on the candidate text information. For any ultrasound image to be annotated, a matching candidate sample image can be selected from the sample images. Since the text information of the annotation text in the candidate sample images is known, text annotations can be automatically added to the ultrasound image to be annotated based on that text information, thus improving the efficiency of adding text annotations to ultrasound images.
Attached Figure Description
[0052] Figure 1 is a flowchart illustrating a method for adding text annotations to ultrasound images in one embodiment;
[0053] Figure 2 is a flowchart illustrating a method for adding text annotations to ultrasound images in another embodiment;
[0054] Figure 3 is a structural block diagram of an ultrasound image text annotation adding device in one embodiment;
[0055] Figure 4 is an internal structural diagram of a computer device in one embodiment.
Detailed Implementation
[0056] To make the objectives, technical solutions, and advantages of this application clearer, the following detailed description is provided in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative and not intended to limit the scope of this application.
[0057] In one embodiment, as shown in Figure 1, a method for adding text annotations to ultrasound images is provided. This embodiment illustrates the method using a terminal as an example. It is understood that this method can also be applied to a server, or to a system including both a terminal and a server, in which case it is implemented through interaction between the terminal and the server. In this embodiment, the method includes the following steps:
[0058] Step S110: Obtain a sample image containing the annotation text and identify the text information of the annotation text in the sample image; the text information includes the number of texts, the text position, and the text content.
[0059] The sample images can be ultrasound images with manually added annotations, for example, multiple ultrasound images collected by a hospital in various clinical scenarios and with added text annotations.
[0060] The annotation text can be text manually annotated in the sample image.
[0061] The text quantity can be the number of annotation texts. The text position can be the coordinate position of each annotation text. The text content can be the content of each annotation text.
[0062] In practice, at least one sample image is input to the terminal, and annotation text is added to each sample image. The terminal can recognize the annotation text in each sample image, obtain the text information of the annotation text, and store the sample image and the text information in correspondence.
[0063] In practical applications, multiple ultrasound images can be pre-acquired and input into the terminal. Regions of interest (ROIs) in each ultrasound image are manually identified, and annotation text is added to these ROIs on the terminal, including the number of ROIs, their coordinates, and the corresponding content. Alternatively, a text recognition model can be pre-trained. This pre-trained model is then used on the terminal to recognize the annotated text in each ultrasound image, obtaining the text information of the annotations, including the number, location, and content of the ROIs. The ultrasound images and the corresponding annotation texts can then be stored in the terminal's database.
[0064] Step S120: From the sample images, identify candidate sample images that match the ultrasound images to be annotated.
[0065] The ultrasound image to be annotated can be an ultrasound image that requires text annotation.
[0066] In practice, the ultrasound image to be annotated can be compared with each sample image containing annotation text, and a sample image that matches the ultrasound image to be annotated can be selected from at least one sample image and used as a candidate sample image.
[0067] In practical applications, for a specified region of interest, the RGB (Red, Green, Blue) pixel values RGB0 of the ultrasound image to be annotated can be obtained, as well as the RGB pixel values RGBi of the sample image i. The similarity between RGB0 and RGBi is calculated. If the similarity exceeds a preset threshold, the sample image i is determined to match the ultrasound image to be annotated, and the sample image i is a candidate sample image. Otherwise, if the similarity does not exceed the preset threshold, the sample image i is not a candidate sample image.
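The matching step in the preceding paragraph can be sketched as follows. The source does not specify how the similarity between RGB0 and RGBi is computed, so this sketch assumes a simple measure (1 minus the normalized mean absolute channel difference). The function names, the flat pixel-list representation of a region of interest, and the 0.9 threshold are all illustrative assumptions.

```python
from typing import Dict, List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # one (R, G, B) value, each channel in 0..255


def rgb_similarity(roi_a: Sequence[Pixel], roi_b: Sequence[Pixel]) -> float:
    """Assumed measure: 1 minus the normalized mean absolute channel difference."""
    if not roi_a or len(roi_a) != len(roi_b):
        raise ValueError("regions must be non-empty and equally sized")
    total_diff = sum(
        abs(ca - cb) for pa, pb in zip(roi_a, roi_b) for ca, cb in zip(pa, pb)
    )
    return 1.0 - total_diff / (255 * 3 * len(roi_a))


def find_candidate_samples(
    target_roi: Sequence[Pixel],
    sample_rois: Dict[str, Sequence[Pixel]],
    threshold: float = 0.9,  # hypothetical preset threshold
) -> List[str]:
    """Return the ids of sample images whose similarity exceeds the threshold."""
    return [
        sample_id
        for sample_id, roi in sample_rois.items()
        if rgb_similarity(target_roi, roi) > threshold
    ]
```

Any other similarity measure could be substituted for `rgb_similarity` without changing the thresholding logic.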
[0068] Step S130: The text information of the annotation text in the candidate sample image is determined as candidate text information.
[0069] The candidate text information can be the number of texts, the text position, and the text content of the annotation text in the candidate sample image.
[0070] In practice, after determining the candidate sample image, the text information corresponding to the candidate sample image can be obtained, and the text information corresponding to the candidate sample image can be determined as the candidate text information.
[0071] In practical applications, the number, location, and content of the annotation text corresponding to the candidate sample image can be found in the terminal database. The number, location, and content of the found annotation text are then used as the candidate text information for the candidate sample image.
[0072] Step S140: Add text annotations to the ultrasound image to be annotated based on the candidate text information.
[0073] In practice, candidate text information can include the number of texts, text positions, and text content of the annotation texts in the candidate sample images. Annotation text can be added to the ultrasound images to be annotated based on the number of texts, text positions, and text content of the annotation texts in the candidate sample images.
[0074] In practical applications, the number of annotation texts in the ultrasound image to be annotated can be determined first based on the number of texts. Then, according to the number of annotation texts, the corresponding text content can be added at each text position to achieve the addition of annotation texts in the ultrasound image to be annotated.
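As a minimal sketch of step S140, the text information can be held in a small record and expanded into one (position, content) pair per annotation. The `TextInfo` class and `build_annotations` function are hypothetical names introduced here for illustration. Rendering the text onto the image is left out, since the source does not describe it.

```python
from dataclasses import dataclass
from typing import List, Tuple

Position = Tuple[float, float]


@dataclass
class TextInfo:
    """Text information of one sample image: quantity, positions, contents."""
    count: int
    positions: List[Position]
    contents: List[str]


def build_annotations(info: TextInfo) -> List[Tuple[Position, str]]:
    """Pair each text position with its content, one pair per annotation."""
    if not (info.count == len(info.positions) == len(info.contents)):
        raise ValueError("text quantity must match the positions and contents")
    return list(zip(info.positions, info.contents))
```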
[0075] The above-described method for adding text annotations to ultrasound images involves acquiring sample images containing annotation text, identifying the text information of the annotation text in the sample images, determining candidate sample images that match the ultrasound image to be annotated from the sample images, identifying the text information of the annotation text in the candidate sample images as candidate text information, and adding text annotations to the ultrasound image to be annotated based on the candidate text information. This method can select matching candidate sample images from the sample images for any ultrasound image to be annotated, and since the text information of the annotation text in the candidate sample images is known, text annotations can be automatically added to the ultrasound image to be annotated based on the text information of the annotation text in the candidate sample images, thus improving the efficiency of adding text annotations to ultrasound images.
[0076] In one embodiment, step S140 may specifically include: selecting target text information from candidate text information when the number of candidate sample images exceeds a preset threshold; and adding text annotations to the ultrasound image to be annotated based on the text quantity, text position, and text content of the target text information.
[0077] The preset threshold can be 1.
[0078] In practice, candidate sample images that match the ultrasound image to be annotated are determined from the sample images. This can result in one or more candidate sample images. If only one candidate sample image is obtained, meaning the number of candidate sample images does not exceed a preset threshold, text annotations can be added directly to the ultrasound image to be annotated based on the candidate text information of that candidate sample image. Otherwise, if more than one candidate sample image is obtained, meaning the number of candidate sample images exceeds a preset threshold, one set of candidate text information corresponding to the multiple candidate sample images can be selected as the target text information. Text annotations can then be added to the ultrasound image to be annotated based on the text quantity, text position, and text content of the target text information.
[0079] In practical applications, the proportional relationship between the candidate sample image and the ultrasound image to be annotated can also be determined. Based on the proportional relationship, the target text information is transformed to obtain the transformed target text information that matches the ultrasound image to be annotated. Based on the transformed target text information, text annotations are added to the ultrasound image to be annotated.
[0080] For example, two candidate sample images are obtained: candidate sample image C and candidate sample image D. The candidate text information c corresponding to candidate sample image C includes: the number of texts is 1, the text position is (20, 40), and the text content is lesion 1; the candidate text information d corresponding to candidate sample image D includes: the number of texts is 1, the text position is (30, 60), and the text content is left kidney. If the ratio between the candidate sample image and the ultrasound image to be annotated is 2:1, and the candidate text information d is determined to be the target text information, then the number of texts in the ultrasound image to be annotated can be determined to be 1, the text position is (30 / 2, 60 / 2) = (15, 30), and the text content is left kidney. At this time, according to the font and font size set in the ultrasound device, "left kidney" can be displayed at the position (15, 30).
[0081] It should be noted that after determining the target text information, the type of text content in the target text information can also be identified. The text content type includes special annotations and general annotations. If it is identified as a special annotation, there is no need to process the text position, and the text annotation can be generated directly based on the target text information. Otherwise, if the text content type is identified as a general annotation, the above method can be used to determine the text position in the ultrasound image to be annotated based on the ratio between the candidate sample image and the ultrasound image to be annotated, and the text annotation can be generated at the obtained text position.
[0082] Special annotations can be annotation text related to lesions, such as "lesion 1" and "lesion 2" used to mark lesion areas.
[0083] General annotations can be annotation text that is not related to lesions, such as "left kidney" and "right kidney", which only indicate the physiological structure of the human body.
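The position handling for special and general annotations can be sketched as below. The function name and the `is_special` flag are assumptions. How the text content type is identified is not specified in the source, so the caller supplies it here. The ratio is taken as candidate sample image size divided by the size of the ultrasound image to be annotated, as in the 2:1 example.

```python
from typing import Tuple

Position = Tuple[float, float]


def target_position(
    sample_position: Position, ratio: float, is_special: bool
) -> Position:
    """Map a text position from the candidate sample image to the target image.

    ratio is (sample image size) / (target image size), e.g. 2.0 for 2:1.
    Special annotations keep their position; general ones are rescaled.
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    if is_special:
        return sample_position
    x, y = sample_position
    return (x / ratio, y / ratio)
```

With a ratio of 2.0, the general annotation "left kidney" at (30, 60) maps to (15, 30), matching the example above.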
[0084] In this embodiment, when the number of candidate sample images exceeds a preset threshold, target text information is selected from the candidate text information. Based on the text quantity, text position, and text content of the target text information, text annotations are added to the ultrasound image to be annotated. When multiple candidate sample images are matched, a set of target text information can be selected from the candidate text information corresponding to the multiple candidate sample images. Text annotations are added to the ultrasound image to be annotated based on the target text information, eliminating the need for manual selection and further improving the efficiency of annotation addition.
[0085] In one embodiment, the step of selecting target text information from candidate text information may specifically include: selecting the maximum value from the text quantity corresponding to the candidate text information to obtain the target text quantity; and determining the candidate text information corresponding to the target text quantity as the target text information.
[0086] In the specific implementation, if more than one candidate sample image is obtained, that is, the number of candidate sample images exceeds the preset threshold, the corresponding candidate text information is also more than one group. Each group of candidate text information contains the text quantity, text position and text content. The maximum value is selected from the text quantity corresponding to the multiple groups of candidate text information as the target text quantity. Then the candidate text information corresponding to the target text quantity can be used as the target text information.
[0087] For example, there are two candidate sample images that match the ultrasound image to be annotated: candidate sample image A and candidate sample image B. The candidate text information 'a' corresponding to candidate sample image A includes: a text quantity of 2, text positions of (20, 40), (30, 60), and text content of lesion 1 and lesion 2. The candidate text information 'b' corresponding to candidate sample image B includes: a text quantity of 3, text positions of (20, 40), (30, 60), (60, 120), and text content of lesion 1, lesion 2, and lesion 3. Since the text quantity in candidate text information 'b' is the largest, the target text quantity is 3. The candidate text information 'b' corresponding to the target text quantity can be determined as the target text information. That is, the target text information includes: a text quantity of 3, text positions of (20, 40), (30, 60), (60, 120), and text content of lesion 1, lesion 2, and lesion 3. Text annotation can be added to the ultrasound image to be annotated based on the target text information.
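The selection in this example amounts to taking a maximum over the text quantities. The sketch below uses a hypothetical `Candidate` record that carries only the fields needed for the selection.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    name: str        # label of the candidate text information, e.g. "a" or "b"
    text_count: int  # text quantity of that candidate


def select_by_text_count(candidates: List[Candidate]) -> Candidate:
    """Return the candidate text information with the largest text quantity."""
    if not candidates:
        raise ValueError("at least one candidate is required")
    return max(candidates, key=lambda c: c.text_count)


candidates = [Candidate("a", 2), Candidate("b", 3)]
target = select_by_text_count(candidates)  # candidate "b", text quantity 3
```

Note that Python's `max` returns the first maximal element, so a tie on text quantity would be resolved by list order unless a further criterion is added.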
[0088] In this embodiment, the target text quantity is obtained by selecting the maximum value from the text quantity corresponding to the candidate text information; the candidate text information corresponding to the target text quantity is determined as the target text information. When multiple sets of candidate text information are filtered out, the one with the most text quantity can be selected as the target text information. Text annotations are added to the ultrasound image to be annotated according to the target text information, so that the number of added text annotations is large and the content is rich.
[0089] In one embodiment, the step of selecting target text information from candidate text information may further include: selecting a target update time from the update times corresponding to the candidate text information; and determining the candidate text information corresponding to the target update time as the target text information.
[0090] The update time can be either the time when the candidate text information was generated or the time when the candidate text information was updated.
[0091] The target update time can be the update time that is closest to the current time among all update times.
[0092] In practice, if more than one candidate sample image is obtained (i.e., the number of candidate sample images exceeds the preset threshold), there is also more than one group of candidate text information, and each group corresponds to an update time. The update time closest to the current time is selected as the target update time, and the candidate text information corresponding to the target update time can be used as the target text information. Alternatively, the text quantities of the candidate text information can be compared first. If two or more groups share the maximum text quantity, so that no single group with the largest text quantity can be selected, the update times of those groups can be compared further. The candidate text information with the largest text quantity and the most recent update time can then be determined as the target text information.
[0093] For example, candidate text information b has 3 texts and was updated at 10:00. Candidate text information c also has 3 texts and was updated at 12:00. Since the two have the same number of texts, it is difficult to select a target text information based on the number of texts. Therefore, the most recently updated candidate text information c can be determined as the target text information.
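Both variants described above (most recent update only, or text quantity first with update time as the tie-breaker) can be sketched with a tuple sort key. The `Candidate` record is hypothetical, and the calendar date is an arbitrary placeholder, since the example only gives times of day.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass
class Candidate:
    name: str
    text_count: int
    updated_at: datetime


def select_most_recent(candidates: List[Candidate]) -> Candidate:
    """Variant 1: pick the candidate whose update time is the latest."""
    return max(candidates, key=lambda c: c.updated_at)


def select_by_count_then_time(candidates: List[Candidate]) -> Candidate:
    """Variant 2: largest text quantity first, latest update time on ties."""
    return max(candidates, key=lambda c: (c.text_count, c.updated_at))


day = (2022, 1, 1)  # arbitrary placeholder date
b = Candidate("b", 3, datetime(*day, 10, 0))
c = Candidate("c", 3, datetime(*day, 12, 0))
chosen = select_by_count_then_time([b, c])  # candidate "c"
```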
[0094] In this embodiment, by selecting the target update time from the update times corresponding to the candidate text information, and determining the candidate text information corresponding to the target update time as the target text information, when the number of texts corresponding to multiple candidate text information is the same and it is difficult to select the target text information, the target text information is selected according to the update time of the candidate text information, ensuring that the ultrasound image is annotated according to the latest updated text information, thus guaranteeing the effectiveness of the text annotation.
[0095] In one embodiment, the step of selecting target text information from candidate text information may further include: selecting the maximum value from the information weights corresponding to the candidate text information to obtain the target information weight; and determining the candidate text information corresponding to the target information weight as the target text information.
[0096] The information weight can be the weight corresponding to the text information.
[0097] In the specific implementation, information weights can be set for the text information. If more than one candidate sample image is obtained, that is, the number of candidate sample images exceeds the preset threshold, there will also be more than one group of candidate text information. Each group of candidate text information can be associated with an information weight. The maximum value is selected from the information weights corresponding to the multiple groups of candidate text information as the target information weight. Then, the candidate text information corresponding to the target information weight can be used as the target text information.
[0098] For example, there are two candidate sample images that match the ultrasound image to be annotated: candidate sample image A and candidate sample image B. The candidate text information 'a' corresponding to candidate sample image A includes: 2 texts, text positions (20, 40), (30, 60), text content "lesion 1", "lesion 2", and an information weight of 0.8. The candidate text information 'b' corresponding to candidate sample image B includes: 3 texts, text positions (20, 40), (30, 60), (60, 120), text content "lesion 1", "lesion 2", "lesion 3", and an information weight of 0.2. The information weight of candidate text information 'a' is the largest, i.e., the target information weight is 0.8. The candidate text information 'a' corresponding to the target information weight can be determined as the target text information. That is, the target text information includes: 2 texts, text positions (20, 40), (30, 60), and text content "lesion 1", "lesion 2". Text annotation can be added to the ultrasound image to be annotated based on the target text information.
[0099] It should be noted that if multiple groups of candidate text information have the same information weight, making it impossible to select the target information weight, then the update time of each group can be obtained and compared. The candidate text information with the latest update time can be determined as the target text information. For example, if candidate text information b has an information weight of 0.2 and an update time of 10:00, and candidate text information c has an information weight of 0.2 and an update time of 12:00, their information weights are the same, so the most recently updated candidate text information c can be determined as the target text information.
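Weight-based selection with the update-time fallback might look like the following sketch. It makes the tie handling explicit by first collecting every candidate that shares the maximum weight. The record fields, function name, and the placeholder dates in the usage are assumptions.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass
class Candidate:
    name: str
    weight: float         # information weight of the candidate text information
    updated_at: datetime  # generation or last update time


def select_by_weight(candidates: List[Candidate]) -> Candidate:
    """Pick the highest information weight; on ties, the latest update wins."""
    if not candidates:
        raise ValueError("at least one candidate is required")
    top_weight = max(c.weight for c in candidates)
    tied = [c for c in candidates if c.weight == top_weight]
    return max(tied, key=lambda c: c.updated_at)
```

For instance, candidates with weights 0.8 and 0.2 yield the 0.8 candidate, while two candidates that both have weight 0.2 and update times of 10:00 and 12:00 yield the one updated at 12:00.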
[0100] In this embodiment, the target information weight is obtained by selecting the maximum value from the information weights corresponding to the candidate text information; the candidate text information corresponding to the target information weight is determined as the target text information. Since the information weight can reflect the importance of the text information, when multiple sets of candidate text information are selected, the one with the highest importance can be selected as the target text information. Text annotations are added to the ultrasound image to be annotated according to the target text information, and the most important text annotations can be added to the ultrasound image to be annotated.
[0101] In one embodiment, step S140 may further include: weighting and summing the text quantity, text position, and text content type identifier corresponding to the candidate text information according to the information weights to obtain the weighted text quantity, weighted text position, and weighted text content type identifier; and adding text annotations to the ultrasound image to be annotated based on the weighted text quantity, weighted text position, and weighted text content type identifier.
[0102] The text content type identifier can be the identifier corresponding to the type of text content. For example, the type of text content can be a special annotation or a general annotation. The identifier corresponding to a special annotation can be 0, and the identifier corresponding to a general annotation can be 1.
[0103] In practice, when there is more than one group of candidate text information, and each group includes text quantity, text position, and text content, the text quantities can be weighted and summed according to the information weights corresponding to the candidate text information to obtain a weighted text quantity. The text positions can be weighted and summed to obtain a weighted text position. The identifiers corresponding to the types of text content can be weighted and summed to obtain a weighted text content type identifier. Then, text annotations can be added to the ultrasound image to be annotated based on the weighted text quantity, weighted text position, and weighted text content type identifier.
[0104] For example, consider two sets of candidate text information. Candidate text information c includes: a text quantity of 1, a text position of (20, 40), a text content of "lesion 1", a text content type of special annotation (identifier 0), and a corresponding information weight of 0.8. Candidate text information d includes: a text quantity of 1, a text position of (30, 60), a text content of "left kidney", a text content type of general annotation (identifier 1), and a corresponding information weight of 0.2. By weighted summation, the weighted text quantity is 1×0.8+1×0.2=1, the weighted text position is (20×0.8+30×0.2, 40×0.8+60×0.2)=(22, 44), and the weighted text content type identifier is 0×0.8+1×0.2=0.2. With a threshold of 0.5, the weighted identifier 0.2 is below the threshold, so a special annotation is used. Therefore, one text annotation can be added at the coordinate position (22, 44) of the ultrasound image to be annotated, and its text content is "lesion 1".
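The arithmetic of this example can be reproduced with a short sketch. The dictionary keys, the rounding of the weighted quantity, and the rule for picking the text content (the highest-weight candidate of the chosen type) are assumptions for illustration; the description only gives the weighted sums and the 0.5 threshold.

```python
from typing import Dict, List

SPECIAL, GENERAL = 0, 1  # type identifiers used in the example


def fuse_candidates(candidates: List[Dict], type_threshold: float = 0.5) -> Dict:
    """Weighted summation of quantity, position, and content type identifier."""
    quantity = sum(c["quantity"] * c["weight"] for c in candidates)
    x = sum(c["position"][0] * c["weight"] for c in candidates)
    y = sum(c["position"][1] * c["weight"] for c in candidates)
    type_score = sum(c["type_id"] * c["weight"] for c in candidates)

    # Below the threshold the result is treated as a special annotation.
    chosen_type = SPECIAL if type_score < type_threshold else GENERAL

    # Assumption: take the content of the highest-weight candidate of that type,
    # falling back to all candidates if none has the chosen type.
    same_type = [c for c in candidates if c["type_id"] == chosen_type] or candidates
    content = max(same_type, key=lambda c: c["weight"])["content"]

    return {
        "quantity": round(quantity),  # assumption: a whole number of annotations
        "position": (x, y),
        "type_score": type_score,
        "type_id": chosen_type,
        "content": content,
    }


candidates = [
    {"quantity": 1, "position": (20, 40), "content": "lesion 1",
     "type_id": SPECIAL, "weight": 0.8},
    {"quantity": 1, "position": (30, 60), "content": "left kidney",
     "type_id": GENERAL, "weight": 0.2},
]
fused = fuse_candidates(candidates)
```

The sketch assumes the information weights sum to 1, as they do in the example; otherwise the weighted position would no longer be an average of the candidate positions.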
[0105] In this embodiment, by weighting and summing the text quantity, text position, and text content type identifier corresponding to the candidate text information according to information weights, weighted text quantity, weighted text position, and weighted text content type identifier are obtained. Text annotations are added to the ultrasound image to be annotated based on the weighted text quantity, weighted text position, and weighted text content type identifier. When multiple sets of candidate text information are selected, the weighted summation of multiple sets of candidate text information according to their importance ensures that the text annotations added to the ultrasound image to be annotated are the result of weighing multiple sets of candidate text information and comprehensively considering each set of candidate text information, thereby increasing the reliability of text annotation addition.
[0106] In one embodiment, step S110 may specifically include: inputting the sample image into a pre-trained text information recognition model to obtain the text quantity, text position, and text content of the annotation text in the sample image; and using the text quantity, text position, and text content as the text information of the annotation text in the sample image.
[0107] In a specific implementation, a text information recognition model can be pre-trained. The pre-trained text information recognition model can be, but is not limited to, a machine learning model, a deep learning model, a neural network model, etc. In order to recognize the text information of the annotation text in the sample image, the sample image can be input into the pre-trained text information recognition model. The pre-trained text information recognition model recognizes the sample image and outputs the text quantity, text position, and text content of the annotation text in the sample image. The text quantity, text position, and text content of the annotation text in the sample image can be used as the text information of the annotation text in the sample image.
[0108] In this embodiment, by inputting the sample image into a pre-trained text information recognition model, the text quantity, text position, and text content of the annotation text in the sample image are obtained and used as the text information of the annotation text in the sample image. These can be recognized automatically, without manual recognition of the annotation text, thus improving the recognition efficiency of the annotation text.
[0109] In one embodiment, the above-mentioned method for adding text annotations to ultrasound images may further include: selecting a target training sample from the training samples and obtaining the sample identifier corresponding to the target training sample; inputting the target training sample into the text information recognition model to be trained to obtain the recognition result of the target training sample; and training the text information recognition model to be trained based on the difference between the recognition result of the target training sample and the sample identifier to obtain a pre-trained text information recognition model.
[0110] The training sample can be at least one ultrasound image. The sample identifier can be the number of texts, the position of the text, and the content of the text in the annotation text of each training sample.
[0111] The target training samples can be training samples that enable the text information recognition model to meet the preset requirements.
[0112] In practice, target training samples can be selected from the training samples. For example, training samples with a large number of annotation texts, training samples with clearer images, or training samples whose previous recognition results were closest to the true results can be selected as target training samples. After the target training samples are selected, their sample identifiers can be obtained; these can be the text quantity, text position, and text content of the annotation text. The target training samples are then input into the text information recognition model to be trained, which recognizes them and outputs recognition results. The recognition results are compared with the sample identifiers, for example by calculating the difference between the text quantity in the recognition results and the text quantity in the sample identifiers. If the difference is less than a preset threshold, the current text information recognition model can be used as the pre-trained text information recognition model. Otherwise, if the difference is greater than or equal to the preset threshold, the parameters of the current text information recognition model need to be adjusted; for example, if the model is a machine learning model, its model parameters can be adjusted. After the adjustment, the target training samples are input into the adjusted model and the above process is repeated, training the model based on the difference between the recognition results and the sample identifiers until the difference is less than the preset threshold, which yields the pre-trained text information recognition model.
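The recognize, compare, and adjust loop described above can be sketched as follows. `ToyCountModel` is a hypothetical one-parameter stand-in for the text information recognition model, the scalar "features" stand in for images, and gradient descent is only one possible way of adjusting parameters; none of these details come from the description.

```python
from typing import List


class ToyCountModel:
    """Hypothetical stand-in: predicts a text quantity as weight * feature."""

    def __init__(self, weight: float = 0.0, learning_rate: float = 0.05):
        self.weight = weight
        self.learning_rate = learning_rate

    def predict(self, feature: float) -> float:
        return self.weight * feature

    def adjust(self, features: List[float], labels: List[float]) -> None:
        """One gradient step on the mean squared error (an assumed update rule)."""
        grad = sum(2 * (self.predict(x) - y) * x
                   for x, y in zip(features, labels)) / len(features)
        self.weight -= self.learning_rate * grad


def mean_difference(model: ToyCountModel, features: List[float],
                    labels: List[float]) -> float:
    """Mean absolute difference between recognition results and sample identifiers."""
    return sum(abs(model.predict(x) - y)
               for x, y in zip(features, labels)) / len(features)


def train(model: ToyCountModel, features: List[float], labels: List[float],
          threshold: float, max_rounds: int = 1000) -> int:
    """Adjust the model until the difference is below the threshold.

    Returns the number of adjustment rounds that were performed.
    """
    for rounds in range(max_rounds):
        if mean_difference(model, features, labels) < threshold:
            return rounds
        model.adjust(features, labels)
    raise RuntimeError("difference did not fall below the threshold")


features = [1.0, 2.0, 3.0]   # stand-ins for target training samples
labels = [2.0, 4.0, 6.0]     # stand-ins for sample identifiers (text quantities)
model = ToyCountModel()
rounds = train(model, features, labels, threshold=0.01)
```

The `max_rounds` guard is an addition of the sketch so that the loop cannot run forever if the threshold is never reached.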
[0113] In this embodiment, a target training sample is selected from the training samples, and the sample identifier corresponding to the target training sample is obtained; the target training sample is input into the text information recognition model to be trained to obtain the recognition result of the target training sample; based on the difference between the recognition result of the target training sample and the sample identifier, the text information recognition model to be trained is trained to obtain a pre-trained text information recognition model. The text information recognition model can be trained to recognize the text information of the annotation text in the sample image, thereby improving the efficiency of text information recognition.
[0114] In one embodiment, step S120 may specifically include: determining the matching degree between the ultrasound image to be annotated and the sample image; and determining the sample image corresponding to the matching degree as a candidate sample image if the matching degree exceeds a preset threshold.
[0115] In the specific implementation, during the process of determining candidate sample images from sample images, the matching degree between the ultrasound image to be annotated and each sample image can be calculated. Each matching degree is compared with a preset threshold. If the matching degree does not exceed the preset threshold, the sample image corresponding to that matching degree cannot be used as a candidate sample image. Otherwise, if the matching degree exceeds the preset threshold, the sample image corresponding to that matching degree can be used as a candidate sample image.
[0116] In practical applications, correlation operations can be performed between the ultrasound image to be annotated and the sample image to obtain the matching degree between the two. Alternatively, the probability of pixel overlap between the ultrasound image to be annotated and the sample image can be used as the matching degree between the two. Finally, the candidate sample image can be one or multiple images.
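One possible reading of the correlation-based matching degree is a zero-mean normalized correlation over pixel values, followed by the threshold test of step S120. The sketch below assumes equal-sized grayscale images given as nested lists; the function names, the sample names, and the 0.9 threshold are hypothetical.

```python
import math
from typing import Dict, List, Sequence

Image = Sequence[Sequence[float]]


def matching_degree(a: Image, b: Image) -> float:
    """Zero-mean normalized correlation between two equal-sized images."""
    flat_a = [p for row in a for p in row]
    flat_b = [p for row in b for p in row]
    if not flat_a or len(flat_a) != len(flat_b):
        raise ValueError("images must be non-empty and of equal size")
    mean_a = sum(flat_a) / len(flat_a)
    mean_b = sum(flat_b) / len(flat_b)
    cov = sum((pa - mean_a) * (pb - mean_b) for pa, pb in zip(flat_a, flat_b))
    var_a = sum((pa - mean_a) ** 2 for pa in flat_a)
    var_b = sum((pb - mean_b) ** 2 for pb in flat_b)
    denom = math.sqrt(var_a * var_b)
    # A constant image has no structure to correlate with; treat it as no match.
    return cov / denom if denom else 0.0


def select_candidates(target: Image, samples: Dict[str, Image],
                      threshold: float) -> List[str]:
    """Keep the sample images whose matching degree exceeds the threshold."""
    return [name for name, image in samples.items()
            if matching_degree(target, image) > threshold]


target = [[0, 10], [20, 30]]
samples = {
    "liver_E": [[30, 20], [10, 0]],    # inverted pattern, correlation -1
    "kidney_F": [[1, 11], [21, 31]],   # same pattern, brighter by 1
}
candidates = select_candidates(target, samples, threshold=0.9)
```

Because the measure is zero-mean and normalized, a uniform brightness offset does not reduce the matching degree, which is why `kidney_F` is selected here.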
[0117] In this embodiment, the matching degree between the ultrasound image to be annotated and the sample image is determined. If the matching degree exceeds a preset threshold, the sample image corresponding to the matching degree is determined as a candidate sample image. The sample image that matches the ultrasound image to be annotated can be selected, and then the ultrasound image to be annotated can be annotated according to the text information of the annotation text in the sample image. There is no need for manual recognition of the ultrasound image to be annotated, which improves the efficiency of text annotation.
[0118] In one embodiment, the sample image containing annotation text obtained in the above method can also be an ultrasound image to which text annotations were added within a preset historical time period. The historical time period can be a past time period relative to the current time. For example, the sample images can be the ultrasound images to which text annotations were added on the N most recent occasions in the same clinical setting and in the same department.
[0119] In practice, after each ultrasound scan, text annotations can be manually added to the obtained ultrasound images, resulting in sample images containing the annotated text. The number, location, and content of the text annotations in the sample images are used as text information. The sample images and corresponding text information are stored in a database. Subsequently, after obtaining an ultrasound image to be annotated, candidate sample images can be searched among the sample images. Based on the text information corresponding to the candidate sample images, text annotations are added to the ultrasound image to be annotated. Alternatively, multiple candidate sample images can be found using the above method. The matching degree between the ultrasound image to be annotated and each candidate sample image is calculated, and the calculated matching degree is compared with a preset threshold. If the matching degree does not exceed the preset threshold, the matching degree of the next candidate sample image is compared. Otherwise, if the matching degree exceeds the preset threshold, text annotations are added to the ultrasound image to be annotated based on the text information corresponding to the current candidate sample image.
[0120] For example, an ultrasound scan yields an ultrasound image E of the liver, an ultrasound image F of the kidney, and an ultrasound image G of the stomach. By manually identifying the region of interest and adding text annotations, three sample images containing annotated text are obtained. The sample images and corresponding text information are stored in a database. If the same doctor performs an ultrasound scan on the kidney in the next scan, obtaining an ultrasound image H of the kidney, then the kidney ultrasound image F can be selected from the sample images, and text annotations can be added to the kidney ultrasound image H based on the text information of the kidney ultrasound image F.
[0121] In this embodiment, the obtained sample images containing annotation text are ultrasound images to which text annotations were added within a preset historical time period, so text annotations can be added to the current ultrasound image to be annotated based on the text information of the historical annotations, thereby improving the efficiency of text annotation addition.
[0122] To facilitate a deeper understanding of the embodiments of this application by those skilled in the art, a specific example will be used for illustration below.
[0123] This application proposes a method for automatically adding text annotations to ultrasound images, which may specifically include:
[0124] 1. Examine the patient. After the examination is completed, display the ultrasound image of the examined area on the interface, and click "Start Adding Annotations".
[0125] 2. Upon entering the annotation module, the system detects the current clinical scene, automatically identifies the region of interest in the ultrasound image, and adds text annotations. Furthermore, text annotation labels can be set, including a recently added label and a default label.
[0126] The recently added label can be a label that instructs the addition of text annotations to the current ultrasound image based on historically added text annotations. Specifically, the system can automatically record the location and content of recently added text annotations in different clinical scenarios. Based on the relative coordinates between the region of interest in the tissue anatomy in the historical records and the region of interest in the current ultrasound image, the system creates a text annotation for the current ultrasound image using the location and content of the recently added text annotations. For example, if the left kidney was labeled at coordinates (30, 60) in the previous kidney ultrasound examination, and the image scale is reduced to half of the previous size in this kidney ultrasound examination, then the left kidney can be labeled directly at coordinates (15, 30).
[0127] When adding text annotations to the current ultrasound image based on previously added text annotations, multiple sets of text annotations can be recorded for the same historical ultrasound image for different clinical scenarios. For example, two text annotations can be recorded for a general scenario for a historical cardiac ultrasound image E, and four text annotations can be recorded for a special scenario for a historical cardiac ultrasound image E. Scene labels can also be set for general and special scenarios, and the annotation set can be switched according to the scene label. For a currently acquired cardiac ultrasound image F, if the general scene label is selected, two text annotations will be automatically generated for cardiac ultrasound image F. If the general scene label is switched to the special scene label, the two text annotations in cardiac ultrasound image F will be automatically changed to four text annotations.
[0128] The default label can be a label that instructs the system to use a predictive complementarity model (e.g., the text information recognition model in the aforementioned embodiments) to quickly locate the position where text annotations need to be added and automatically add the text annotations. Specifically, image recognition technology can be used to obtain the region of interest of the tissue anatomy, text annotations can be performed on the region of interest, and the relative coordinate position of the text annotations can be recorded. The predictive complementarity model can output the text annotation information at that position. By training the predictive complementarity model with a large amount of sample information, the accuracy of the predictive complementarity model can be improved. The predictive complementarity model can analyze and calculate annotation data of the same tissue anatomy and automatically add text annotations. For example, in cardiac clinical practice, the region of interest and color code of the cardiac ultrasound image can be obtained first. The cardiac ultrasound image can be matched with sample images to obtain the sample image with the highest matching degree. The annotation position coordinates and the number of annotations can be calculated based on the sample image with the highest matching degree. If the sample image with the highest matching degree contains two text annotations, then the two annotations are added to the corresponding coordinates of the cardiac ultrasound image, and so on.
[0129] It should be noted that the above method can also record the coordinate positions and operating habits that a user frequently uses for adding text annotations while logged in. Text annotations can then be added based on these frequently used coordinate positions and operating habits the next time the user logs in. Furthermore, different operating needs can be served for multiple users by switching login accounts.
[0130] 3. Users can selectively fine-tune the automatically added text annotations. For example, they can edit, replace, move, or delete the automatically added text annotations.
[0131] The above-mentioned method for automatically adding text annotations to ultrasound images can automatically add text annotations to ultrasound images according to different clinical scenarios, thereby improving the efficiency of ultrasound image examination.
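The behaviour of the recently added label in step 2 (reusing recorded positions under a changed image scale, and switching between annotation sets by scene label) can be sketched as follows. The sketch assumes uniform scaling about the image origin, and the recorded positions and contents are placeholders rather than values from the description, apart from the (30, 60) to (15, 30) example.

```python
from typing import Dict, List, Tuple

Position = Tuple[float, float]
Annotation = Tuple[Position, str]


def rescale_position(position: Position, scale: float) -> Position:
    """Map a recorded position to the current image under a uniform scale factor."""
    x, y = position
    return (x * scale, y * scale)


# Annotation sets recorded for one historical image, keyed by scene label.
# Positions and contents are placeholders for illustration only.
history: Dict[str, List[Annotation]] = {
    "general": [((30, 60), "annotation 1"), ((50, 20), "annotation 2")],
    "special": [((30, 60), "annotation 1"), ((50, 20), "annotation 2"),
                ((10, 40), "annotation 3"), ((70, 80), "annotation 4")],
}


def annotations_for(scene_label: str, scale: float = 1.0) -> List[Annotation]:
    """Return the recorded annotation set for a scene, rescaled to the current image."""
    if scene_label not in history:
        raise ValueError(f"no annotation set recorded for scene {scene_label!r}")
    return [(rescale_position(pos, scale), text)
            for pos, text in history[scene_label]]


half_scale = rescale_position((30, 60), 0.5)
general = annotations_for("general", scale=0.5)
special = annotations_for("special", scale=0.5)
```

Switching from the general to the special scene label simply swaps the returned set, so the current image goes from two annotations to four.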
[0132] In one embodiment, as shown in Figure 2, a method for adding text annotations to ultrasound images is provided, including the following steps:
[0133] Step S201: Obtain a sample image containing the annotated text;
[0134] Step S202: Input the sample image into the pre-trained text information recognition model to obtain the number of texts, text positions and text content of the annotated text in the sample image;
[0135] Step S203: The number of texts, the position of the texts, and the content of the texts are used as the text information of the annotation text in the sample image;
[0136] Step S204: From the sample images, determine the candidate sample images that match the ultrasound image to be annotated;
[0137] Step S205: Determine the text information of the annotation text in the candidate sample image as candidate text information;
[0138] Step S206: Add text annotations to the ultrasound image to be annotated based on the candidate text information.
[0139] In this embodiment, by acquiring sample images containing annotation text, and inputting the sample images into a pre-trained text information recognition model, the number, position, and content of the annotation text in the sample images are obtained. These three elements are used as the text information of the annotation text in the sample images. The pre-trained text information recognition model can automatically identify the number, position, and content of the annotation text in the sample images, improving the efficiency of acquiring the text information of the annotation text in the sample images. From the sample images, candidate sample images matching the ultrasound image to be annotated are determined. The text information of the annotation text in the candidate sample images is determined as candidate text information. Based on the candidate text information, text annotations are added to the ultrasound image to be annotated. For any ultrasound image to be annotated, a matching candidate sample image can be selected from the sample images. Since the text information of the annotation text in the candidate sample images is known, text annotations can be automatically added to the ultrasound image to be annotated based on the text information of the annotation text in the candidate sample images, improving the efficiency of adding text annotations to ultrasound images.
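Steps S201 to S206 can be outlined as a small pipeline in which the recognition model and the matching function are passed in as callables. The `TextInfo` class, the stub `recognize` and `match` functions, and the choice of the first candidate in S206 are assumptions for illustration; other embodiments select the target text information differently.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Position = Tuple[float, float]


@dataclass
class TextInfo:
    quantity: int
    positions: List[Position]
    contents: List[str]


def add_annotations(target: Dict, samples: List[Dict],
                    recognize: Callable[[Dict], TextInfo],
                    match: Callable[[Dict, Dict], float],
                    threshold: float) -> List[Tuple[Position, str]]:
    # S201-S203: obtain the text information of every sample image.
    infos = [recognize(sample) for sample in samples]
    # S204-S205: keep the text information of the samples that match the target.
    candidate_infos = [info for sample, info in zip(samples, infos)
                       if match(target, sample) > threshold]
    if not candidate_infos:
        return []
    # S206: assumption of this sketch, use the first candidate.
    chosen = candidate_infos[0]
    return list(zip(chosen.positions, chosen.contents))


# Stubs standing in for the pre-trained model and the matching computation.
def recognize(sample: Dict) -> TextInfo:
    labels = sample["labels"]
    return TextInfo(len(labels), [p for p, _ in labels], [t for _, t in labels])


def match(target: Dict, sample: Dict) -> float:
    return 1.0 if target["organ"] == sample["organ"] else 0.0


samples = [
    {"organ": "liver", "labels": [((10, 20), "liver")]},
    {"organ": "kidney", "labels": [((30, 60), "left kidney")]},
    {"organ": "stomach", "labels": [((40, 50), "stomach")]},
]
result = add_annotations({"organ": "kidney"}, samples, recognize, match, threshold=0.5)
```

With these stubs, a new kidney image receives the annotation recorded for the kidney sample, and an image with no matching sample receives none.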
[0140] It should be understood that although the steps in the flowcharts of the embodiments described above are shown sequentially according to the arrows, these steps are not necessarily executed in the order indicated by the arrows. Unless explicitly stated herein, there is no strict order restriction on the execution of these steps, and they can be executed in other orders. Moreover, at least some steps in the flowcharts of the embodiments described above may include multiple steps or multiple stages. These steps or stages are not necessarily completed at the same time, but can be executed at different times. The execution order of these steps or stages is not necessarily sequential, but can be performed alternately or in turn with other steps or at least some of the steps or stages of other steps.
[0141] Based on the same inventive concept, this application also provides an ultrasound image text annotation adding device for implementing the ultrasound image text annotation adding method described above. The solution provided by this device is similar to the solution described in the above method. Therefore, the specific limitations of one or more ultrasound image text annotation adding device embodiments provided below can be found in the limitations of the ultrasound image text annotation adding method above, and will not be repeated here.
[0142] In one embodiment, as shown in Figure 3, an ultrasound image text annotation device is provided, comprising: an acquisition and recognition module 310, a sample determination module 320, an information determination module 330, and an annotation addition module 340, wherein:
[0143] The acquisition and recognition module 310 is used to acquire a sample image containing annotated text and recognize the text information of the annotated text in the sample image; the text information includes the number of texts, the text position, and the text content;
[0144] The sample determination module 320 is used to determine candidate sample images that match the ultrasound image to be annotated from the sample images;
[0145] The information determination module 330 is used to determine the text information of the annotation text in the candidate sample image as candidate text information;
[0146] The annotation adding module 340 is used to add text annotations to the ultrasound image to be annotated based on the candidate text information.
[0147] In one embodiment, the annotation adding module 340 is further configured to select target text information from the candidate text information when the number of candidate sample images exceeds a preset threshold; and add text annotations to the ultrasound image to be annotated according to the text quantity, text position and text content of the target text information.
[0148] In one embodiment, the annotation adding module 340 is further configured to select the maximum value from the number of texts corresponding to the candidate text information to obtain the target text number; and determine the candidate text information corresponding to the target text number as the target text information.
[0149] In one embodiment, the annotation adding module 340 is further configured to select the maximum value from the information weights corresponding to the candidate text information to obtain the target information weight; and determine the candidate text information corresponding to the target information weight as the target text information.
[0150] In one embodiment, the annotation adding module 340 is further configured to perform weighted summation on the text quantity, text position, and text content type identifier corresponding to the candidate text information according to the information weight, to obtain the weighted text quantity, weighted text position, and weighted text content type identifier; and add text annotations to the ultrasound image to be annotated according to the weighted text quantity, the weighted text position, and the weighted text content type identifier.
[0151] In one embodiment, the acquisition and recognition module 310 is further configured to input the sample image into a pre-trained text information recognition model to obtain the text quantity, text position, and text content of the annotation text in the sample image; and to use the text quantity, text position, and text content as the text information of the annotation text in the sample image.
[0152] In one embodiment, the ultrasound image text annotation adding device further includes:
[0153] The sample acquisition module is used to select a target training sample from the training samples and obtain the sample identifier corresponding to the target training sample;
[0154] The model recognition module is used to input the target training sample into the text information recognition model to be trained, and obtain the recognition result of the target training sample;
[0155] The model training module is used to train the text information recognition model to be trained based on the difference between the recognition result of the target training sample and the sample identifier, so as to obtain the pre-trained text information recognition model.
[0156] In one embodiment, the sample determination module 320 is further configured to determine the matching degree between the ultrasound image to be annotated and the sample image; and if the matching degree exceeds a preset threshold, determine the sample image corresponding to the matching degree as the candidate sample image.
[0157] In one embodiment, the annotation adding module 340 is further configured to select a target update time from the update times corresponding to the candidate text information; the target update time is closest to the current time; and the candidate text information corresponding to the target update time is determined as the target text information.
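The update-time rule in the preceding paragraph can be sketched with the standard `datetime` module. The dictionary keys and the example timestamps are hypothetical; the description only states that the target update time is the one closest to the current time.

```python
from datetime import datetime
from typing import Dict, List


def select_most_recent(candidates: List[Dict], now: datetime) -> Dict:
    """Return the candidate whose update time is closest to the current time."""
    if not candidates:
        raise ValueError("at least one candidate is required")
    return min(candidates, key=lambda c: abs(now - c["updated_at"]))


candidates = [
    {"content": "left kidney", "updated_at": datetime(2022, 11, 1, 9, 0)},
    {"content": "right kidney", "updated_at": datetime(2022, 11, 28, 14, 30)},
    {"content": "lesion 1", "updated_at": datetime(2022, 10, 15, 8, 0)},
]
target = select_most_recent(candidates, now=datetime(2022, 11, 30, 12, 0))
```

Passing `now` explicitly, rather than calling `datetime.now()` inside the function, is a choice of the sketch that keeps the selection reproducible.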
[0158] The modules in the aforementioned ultrasound image text annotation device can be implemented entirely or partially through software, hardware, or a combination thereof. These modules can be embedded in or independent of the processor in a computer device, or stored in the memory of a computer device as software, so that the processor can call and execute the corresponding operations of each module.
[0159] In one embodiment, a computer device is provided, which may be a terminal, and its internal structure may be as shown in Figure 4. The computer device includes a processor, memory, input/output interface, communication interface, display unit, and input device. The processor, memory, and input/output interface are connected via a system bus, and the communication interface, display unit, and input device are also connected to the system bus via the input/output interface. The processor provides computing and control capabilities. The memory includes a non-volatile storage medium and internal memory. The non-volatile storage medium stores the operating system and computer programs. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage medium. The input/output interface is used for exchanging information between the processor and external devices. The communication interface is used for wired or wireless communication with external terminals; wireless communication can be achieved through Wi-Fi, mobile cellular networks, NFC (Near Field Communication), or other technologies. When executed by the processor, the computer program implements a method for adding text annotations to ultrasound images. The display unit is used to form a visible image and can be a display screen, a projection device, or a virtual reality imaging device. The display screen can be an LCD screen or an e-ink screen. The input device of the computer device can be a touch layer covering the display screen, or buttons, trackballs, or touchpads set on the casing of the computer device, or external keyboards, touchpads, or mice, etc.
[0160] Those skilled in the art will understand that the structure shown in Figure 4 is merely a block diagram of a portion of the structure related to the present application and does not constitute a limitation on the computer device to which the present application is applied. Specific computer devices may include more or fewer components than those shown in the figure, or combine certain components, or have different component arrangements.
[0161] In one embodiment, a computer device is also provided, including a memory and a processor, wherein the memory stores a computer program, and the processor executes the computer program to implement the steps in the above method embodiments.
[0162] In one embodiment, a computer-readable storage medium is provided having a computer program stored thereon that, when executed by a processor, implements the steps in the above method embodiments.
[0163] It should be noted that the user information (including but not limited to user device information, user personal information, etc.) and data (including but not limited to data used for analysis, data stored, data displayed, etc.) involved in this application are all information and data authorized by the user or fully authorized by all parties, and the collection, use and processing of the relevant data shall comply with the relevant laws, regulations and standards of the relevant countries and regions.
[0164] Those skilled in the art will understand that all or part of the processes in the methods of the above embodiments can be implemented by a computer program instructing related hardware. The computer program can be stored in a non-volatile computer-readable storage medium, and when executed, it can include the processes of the embodiments of the above methods. Any references to memory, databases, or other media used in the embodiments provided in this application can include at least one of non-volatile and volatile memory. Non-volatile memory can include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, resistive random access memory (ReRAM), magnetic random access memory (MRAM), ferroelectric random access memory (FRAM), phase change memory (PCM), graphene memory, etc. Volatile memory can include random access memory (RAM) or external cache memory, etc. By way of illustration and not limitation, RAM can take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM). The databases involved in the embodiments provided in this application may include at least one type of relational database and non-relational database. Non-relational databases may include, but are not limited to, blockchain-based distributed databases. The processors involved in the embodiments provided in this application may be general-purpose processors, central processing units, graphics processing units, digital signal processors, programmable logic devices, quantum computing-based data processing logic devices, etc., and are not limited to these.
[0165] The technical features of the above embodiments can be combined in any way. For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described. However, as long as there is no contradiction in the combination of these technical features, they should be considered to be within the scope of this specification.
[0166] The embodiments described above are merely illustrative of several implementation methods of this application, and while the descriptions are specific and detailed, they should not be construed as limiting the scope of this patent application. It should be noted that those skilled in the art can make various modifications and improvements without departing from the concept of this application, and these all fall within the protection scope of this application. Therefore, the protection scope of this application should be determined by the appended claims.
Claims
1. A method for adding text annotations to ultrasound images, characterized in that, The method includes: Obtain a sample image containing annotated text, and identify the text information of the annotated text in the sample image; the text information includes the number of texts, the text position, and the text content; From the sample images, candidate sample images that match the ultrasound images to be annotated are determined; The text information of the annotated text in the candidate sample image is determined as candidate text information; Based on the candidate text information, text annotations are added to the ultrasound image to be annotated.
2. The method according to claim 1, characterized in that the step of adding text annotations to the ultrasound image to be annotated based on the candidate text information includes: if the number of candidate sample images exceeds a preset threshold, selecting target text information from the candidate text information; and adding text annotations to the ultrasound image to be annotated based on the number of texts, the text position, and the text content of the target text information.
3. The method according to claim 2, characterized in that the step of selecting target text information from the candidate text information includes: selecting the maximum value from the numbers of texts corresponding to the candidate text information to obtain a target text number; and determining the candidate text information corresponding to the target text number as the target text information.
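As an illustration of the selection rule in claim 3, the following Python sketch picks the candidate with the largest number of texts. The dictionary layout and the function name `select_by_text_count` are assumptions made for the example, and the claim does not say how ties are resolved; here the first candidate with the maximum count is kept.

```python
from typing import Dict, List


def select_by_text_count(candidates: List[Dict]) -> Dict:
    """Return the candidate whose 'count' (number of texts) is largest.

    Assumption: on ties, the first candidate encountered wins,
    which is how Python's built-in max() behaves.
    """
    if not candidates:
        raise ValueError("no candidate text information to select from")
    return max(candidates, key=lambda c: c["count"])
```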
4. The method according to claim 2, characterized in that the step of selecting target text information from the candidate text information further includes: selecting the maximum value from the information weights corresponding to the candidate text information to obtain a target information weight; and determining the candidate text information corresponding to the target information weight as the target text information.
5. The method according to claim 4, characterized in that the step of adding text annotations to the ultrasound image to be annotated based on the candidate text information further includes: performing, based on the information weights, a weighted summation of the text quantity, text position, and text content type identifier corresponding to the candidate text information to obtain a weighted text quantity, a weighted text position, and a weighted text content type identifier; and adding text annotations to the ultrasound image to be annotated based on the weighted text quantity, the weighted text position, and the weighted text content type identifier.
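The weighted summation in claim 5 can be sketched in Python as follows. Several details are assumptions not stated in the claim: each candidate is reduced to a single representative (x, y) position, the weights are normalized by their sum so that the result stays on the same scale as the inputs, and the function name `weighted_text_info` is hypothetical. How the weighted quantity and type identifier are mapped back to integers is left open.

```python
from typing import List, Tuple

# (information weight, text quantity, (x, y) position, content type identifier)
Candidate = Tuple[float, int, Tuple[float, float], int]


def weighted_text_info(candidates: List[Candidate]) -> Tuple[float, Tuple[float, float], float]:
    """Weighted sum of quantity, position, and type identifier over all candidates."""
    total = sum(w for w, _, _, _ in candidates)
    if total <= 0:
        raise ValueError("information weights must sum to a positive value")
    # Assumption: divide by the weight total; a no-op when weights already sum to 1.
    quantity = sum(w * n for w, n, _, _ in candidates) / total
    x = sum(w * p[0] for w, _, p, _ in candidates) / total
    y = sum(w * p[1] for w, _, p, _ in candidates) / total
    type_id = sum(w * t for w, _, _, t in candidates) / total
    return quantity, (x, y), type_id
```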
6. The method according to claim 1, characterized in that the step of identifying the text information of the annotated text in the sample image includes: inputting the sample image into a pre-trained text information recognition model to obtain the number of texts, the text position, and the text content of the annotated text in the sample image; and using the number of texts, the text position, and the text content as the text information of the annotated text in the sample image.
7. The method according to claim 6, characterized in that the method further includes: selecting a target training sample from the training samples and obtaining the sample identifier corresponding to the target training sample; inputting the target training sample into the text information recognition model to be trained to obtain the recognition result of the target training sample; and training, based on the difference between the recognition result of the target training sample and the sample identifier, the text information recognition model to be trained to obtain the pre-trained text information recognition model.
8. The method according to claim 1, characterized in that the step of determining candidate sample images that match the ultrasound image to be annotated from the sample images includes: determining the matching degree between the ultrasound image to be annotated and the sample image; and if the matching degree exceeds a preset threshold, determining the sample image corresponding to the matching degree as the candidate sample image.
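A minimal Python sketch of the thresholding in claim 8 is given below. The claim does not define how the matching degree is computed; purely for illustration, it is assumed here to be the cosine similarity between feature vectors that have already been extracted from the images. The names `cosine_similarity` and `select_candidates` are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Assumed matching degree: cosine of the angle between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def select_candidates(query: Sequence[float],
                      samples: Dict[str, Sequence[float]],
                      threshold: float) -> List[str]:
    """Return ids of sample images whose matching degree exceeds the threshold."""
    return [sample_id for sample_id, features in samples.items()
            if cosine_similarity(query, features) > threshold]
```

The comparison is strict (`>`) to mirror the wording "exceeds a preset threshold".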
9. The method according to claim 2, characterized in that the step of selecting target text information from the candidate text information further includes: selecting a target update time from the update times corresponding to the candidate text information, the target update time being the update time closest to the current time; and determining the candidate text information corresponding to the target update time as the target text information.
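The recency rule in claim 9 can be sketched in Python as follows. The pairing of each candidate with a `datetime` update time, the function name `select_most_recent`, and the optional `now` parameter (added so the behavior can be checked deterministically) are assumptions for the example.

```python
from datetime import datetime
from typing import List, Optional, Tuple


def select_most_recent(candidates: List[Tuple[datetime, str]],
                       now: Optional[datetime] = None) -> Tuple[datetime, str]:
    """Return the (update_time, text_info) pair whose update time is closest to now."""
    if not candidates:
        raise ValueError("no candidate text information to select from")
    reference = now if now is not None else datetime.now()
    # abs() of a timedelta gives the distance in time regardless of direction.
    return min(candidates, key=lambda c: abs(reference - c[0]))
```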
10. A device for adding text annotations to ultrasound images, characterized in that the device includes: an acquisition and recognition module, used to acquire a sample image containing annotated text and to recognize the text information of the annotated text in the sample image, the text information including the number of texts, the text position, and the text content; a sample determination module, used to determine candidate sample images that match the ultrasound image to be annotated from the sample images; an information determination module, used to determine the text information of the annotated text in the candidate sample images as candidate text information; and an annotation addition module, used to add text annotations to the ultrasound image to be annotated based on the candidate text information.