UI detection method and apparatus, electronic device, and storage medium

By using an element detection model trained on positive sample images, and converting UI element positions into vectors and clustering them, the problem of existing UI detection methods being unable to determine whether the UI display meets design expectations is solved, thus achieving automated and accurate UI anomaly detection.

CN116302254B · Active · Publication Date: 2026-06-16 · BEIJING QIYI CENTURY SCI & TECH CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
BEIJING QIYI CENTURY SCI & TECH CO LTD
Filing Date
2023-02-09
Publication Date
2026-06-16

AI Technical Summary

Technical Problem

Existing UI inspection methods cannot effectively determine whether the UI display meets design expectations, and the collection of abnormal materials is limited, resulting in low accuracy of model judgment.

Method used

An element detection model trained on positive sample images is used. By obtaining the position of UI elements, the image to be detected is converted into a vector, and then clustered with the vector of positive sample images to automatically detect whether the UI page is normal.

Benefits of technology

UI elements can be located without manual XPath coding, and abnormal images that do not meet design expectations can be automatically detected, eliminating the problems of difficulty in collecting abnormal materials and the inability of the model to converge.



Abstract

Embodiments of the present application provide a UI detection method and device, electronic equipment, and a storage medium. The method comprises: obtaining an image to be detected that contains UI elements; inputting the image to be detected into an element detection model to obtain a detection result containing element positions, wherein the element detection model is trained on positive sample images; converting the image to be detected into an image vector using the element positions; and clustering the image vector to be detected with the positive sample image vectors. If the clustering result yields no matching positive sample image, the image to be detected is detected as an abnormal image; if the clustering result yields a matching positive sample image, the image to be detected is detected as a normal image. Embodiments of the present application can solve the problems of abnormal materials being difficult to collect and the model failing to converge, can locate UI elements without manual XPath coding, and can detect abnormal images that do not meet design expectations without manual image comparison.

Description

Technical Field

[0001] This invention relates to the field of image processing technology, and in particular to a UI detection method, a UI detection device, an electronic device, and a computer-readable storage medium.

Background Technology

[0002] UI (User Interface) is a visual entry point for obtaining information, accessing device resources, and controlling device operation. Before a new version is released, it is necessary to check whether the UI is working properly.

[0003] Current UI inspection methods include those based on XPath (XML Path Language) location, those based on image analysis and comparison, and those based on UI anomaly detection models. However, XPath-based methods can only detect the presence of elements on the UI page; image analysis and comparison methods require manual comparison of images; and UI anomaly detection models are mainly used for common anomaly detection, such as empty images, missing elements, and overlapping errors. None of these methods can determine whether the UI display meets design expectations. Furthermore, the anomaly data that can be collected for UI anomaly detection models is limited, so the accuracy of model judgments is difficult to guarantee.

Summary of the Invention

[0004] In view of the above problems, embodiments of the present invention are proposed to provide a UI detection method and a corresponding UI detection device, an electronic device, and a computer-readable storage medium to overcome or at least partially solve the above problems.

[0005] In a first aspect of this invention, a UI detection method is provided, the method comprising:

[0006] Obtain the image to be detected; the image to be detected includes multiple UI elements;

[0007] The image to be detected is input into a preset element detection model to obtain the detection result; wherein, the element detection model is trained based on positive sample images, and the detection result includes the element position of each UI element;

[0008] The image to be detected is converted into an image vector by using the element positions of each UI element;

[0009] Obtain a positive sample image vector; wherein the positive sample image vector is obtained by transforming the positive sample image;

[0010] The vector of the image to be detected is clustered with the vector of the positive sample image to obtain the clustering result;

[0011] If the clustering result is that no positive sample image vector is obtained that matches the vector of the image to be detected, then the image to be detected is detected as an abnormal image; if the clustering result is that a positive sample image vector is obtained that matches the vector of the image to be detected, then the image to be detected is detected as a normal image.
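The steps of the first aspect can be sketched as a small pipeline. This is a minimal illustration only, not the patented implementation: the function name `detect_ui_image` and the three pluggable callables (`detect_elements`, `vectorize`, `matches`) are hypothetical names introduced here, and the stub detector, flattening vectorizer, and exact-match rule in the usage lines stand in for the trained model, the layout vectorization, and the clustering described later.

```python
from typing import Callable, List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)
Vector = List[float]


def detect_ui_image(
    image: object,
    detect_elements: Callable[[object], List[Box]],
    vectorize: Callable[[List[Box]], Vector],
    positive_vectors: Sequence[Vector],
    matches: Callable[[Vector, Sequence[Vector]], bool],
) -> str:
    """Run the claimed steps in order and return 'normal' or 'abnormal'."""
    boxes = detect_elements(image)   # element positions from the detection model
    vector = vectorize(boxes)        # image to be detected -> image vector
    # A match against any positive sample vector means the page is normal.
    return "normal" if matches(vector, positive_vectors) else "abnormal"


# Hypothetical stand-ins used only to exercise the pipeline.
def stub_detector(image: object) -> List[Box]:
    return [(0, 0, 10, 10)]


def flatten(boxes: List[Box]) -> Vector:
    return [coord for box in boxes for coord in box]


def exact_match(vector: Vector, library: Sequence[Vector]) -> bool:
    return vector in library


result_normal = detect_ui_image(None, stub_detector, flatten, [[0, 0, 10, 10]], exact_match)
result_abnormal = detect_ui_image(None, stub_detector, flatten, [[1, 1, 1, 1]], exact_match)
```

Passing the three stages in as callables keeps the control flow of the method separate from the choice of model and clustering rule.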

[0012] In a second aspect of the invention, a UI detection device is also provided, the device comprising:

[0013] The image acquisition module is used to acquire the image to be detected; the image to be detected includes multiple UI elements;

[0014] The detection result output module is used to input the image to be detected into a preset element detection model to obtain the detection result; wherein, the element detection model is trained based on positive sample images, and the detection result includes the element position of each UI element;

[0015] The image vector conversion module is used to convert the image to be detected into an image vector by using the element positions of each UI element;

[0016] The positive sample image vector acquisition module is used to acquire positive sample image vectors; wherein, the positive sample image vectors are obtained by transforming the positive sample images;

[0017] The clustering module is used to cluster the vector of the image to be detected with the vector of the positive sample image to obtain the clustering result;

[0018] The detection module is configured to detect the image to be detected as an abnormal image if the clustering result does not yield a positive sample image vector that matches the image vector to be detected; and to detect the image to be detected as a normal image if the clustering result yields a positive sample image vector that matches the image vector to be detected.

[0019] In another aspect of the present invention, an electronic device is also provided, comprising: a processor, a memory, and a computer program stored in the memory and capable of running on the processor, wherein the computer program, when executed by the processor, implements the steps of the UI detection method as described above.

[0020] In another aspect of the present invention, a computer-readable storage medium is also provided, on which a computer program is stored, which, when executed by a processor, implements the steps of the UI detection method as described above.

[0021] Compared with the prior art, the embodiments of the present invention have the following advantages:

[0022] In this embodiment of the invention, an image to be detected containing multiple UI elements is obtained. The image to be detected is input into a preset element detection model to obtain detection results containing element positions. The element detection model is trained based on positive sample images. Then, using the element positions of the UI elements, the image to be detected is converted into an image vector to be detected, and a positive sample image vector is obtained. The positive sample image vector is obtained by converting a positive sample image. The image vector to be detected and the positive sample image vector are clustered to obtain a clustering result. If the clustering result does not yield a positive sample image vector that matches the image vector to be detected, the image to be detected is detected as an abnormal image; if the clustering result yields a positive sample image vector that matches the image vector to be detected, the image to be detected is detected as a normal image. The embodiment of this invention trains an element detection model based on positive sample images. The positive sample images can be enumerated, thereby overcoming the problems of difficult collection of abnormal materials and model non-convergence. Moreover, the element detection model can locate UI elements without manual XPath coding, and by automatically clustering the image to be detected and the positive sample images, abnormal images that do not meet the design expectations can be detected without manual comparison of images.

Attached Figure Description

[0023] To more clearly illustrate the technical solutions in the embodiments of the present invention, the accompanying drawings used in the description of the embodiments will be briefly introduced below. Obviously, the accompanying drawings described below are only some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0024] Figure 1 is a flowchart of the steps of Embodiment 1 of the UI detection method of the present invention;

[0025] Figure 2 is a flowchart of the steps of Embodiment 2 of the UI detection method of the present invention;

[0026] Figure 3 is a graphical representation of the detection results provided in an embodiment of the present invention;

[0027] Figure 4 is a flowchart of the sub-steps of Embodiment 2 of the UI detection method of the present invention;

[0028] Figure 5 is a flowchart of the sub-steps of Embodiment 2 of the UI detection method of the present invention;

[0029] Figure 6 is a schematic diagram of the block division provided in an embodiment of the present invention;

[0030] Figure 7 is a flowchart of the sub-steps of Embodiment 2 of the UI detection method of the present invention;

[0031] Figure 8 is a graphical representation of the relative positions provided in an embodiment of the present invention;

[0032] Figure 9 is a flowchart of the sub-steps of Embodiment 2 of the UI detection method of the present invention;

[0033] Figure 10 is a flowchart of the sub-steps of Embodiment 2 of the UI detection method of the present invention;

[0034] Figure 11 is a flowchart of the sub-steps of Embodiment 2 of the UI detection method of the present invention;

[0035] Figure 12 is a UI detection flowchart provided in an embodiment of the present invention;

[0036] Figure 13 is a structural block diagram of a UI detection device provided in an embodiment of the present invention;

[0037] Figure 14 is a structural block diagram of the electronic device provided in an embodiment of the present invention.

Detailed Implementation

[0038] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.

[0039] Current UI detection methods include XPath-based detection, image processing and image comparison-based detection, and UI anomaly detection model-based detection.

[0040] The detection method based on XPath location is the mainstream method. It requires manually writing XPath and following development specifications when developing the code. However, it can only detect the existence of elements on the UI page.

[0041] The detection method based on image processing and image comparison performs image processing (removing color and drawing outlines). However, it requires manual comparison between the image to be detected and the correct image, and it cannot correctly compare complex UIs (such as those containing animations and overlapping elements).

[0042] The detection method based on the UI anomaly detection model uses deep learning to derive a discrimination model for abnormal UI images. The image to be detected is then input into the UI anomaly detection model to determine the probability that it is an abnormal image. It is mainly used for common anomaly detection (such as empty images, missing elements, and incorrect overlap). However, it cannot determine whether the UI display meets the design expectations. At the same time, the collection of abnormal materials required by the UI anomaly detection model is limited, and the accuracy of the model's judgment is not easy to guarantee.

[0043] To address the aforementioned issues, this invention proposes a UI detection method. This invention trains an element detection model based on positive sample images, which can be enumerated, thus overcoming the problems of difficulty in collecting abnormal materials and the inability of the model to converge. Furthermore, the element detection model can locate UI elements without manual XPath coding, and by automatically clustering the image to be detected and the positive sample images, abnormal images that do not meet design expectations can be detected without manual image comparison.

[0044] Referring to Figure 1, a flowchart of the steps of Embodiment 1 of the UI detection method of the present invention is shown. The method may specifically include the following steps:

[0045] Step 101: Obtain the image to be detected; the image to be detected includes multiple UI elements.

[0046] This invention is primarily used to automatically detect whether a UI page is functioning correctly, and is a solution for automated front-end page detection. Abnormal UI pages include empty images, missing elements, overlapping elements, and swapped element positions.

[0047] Typically, a UI page consists of page elements, page layout, and element styles and colors. The front-end code mainly focuses on programming the page elements and page layout, while styles and colors are more related to graphic design. Since the probability of problems with styles and colors and their impact on user experience are relatively small, this embodiment of the invention mainly focuses on the front-end's automated detection of whether the page elements and page layout are normal.

[0048] In this embodiment of the invention, an image to be detected can be obtained, which is an image of a UI page. The UI page may include multiple UI elements. These UI elements may include buttons, input boxes, players, progress bars, icons, and other elements.

[0049] Step 102: Input the image to be detected into a preset element detection model to obtain the detection result; wherein, the element detection model is trained based on positive sample images, and the detection result includes the element position of each UI element.

[0050] After obtaining the image to be detected, a preset element detection model can be called. The element detection model is a pre-trained model. Therefore, the image to be detected can be input into the element detection model, which will detect the UI elements contained in the image and the position of the UI elements on the UI page. The element detection model will output the detection results, which may include the position of each UI element.

[0051] Step 103: Using the element positions of each UI element, convert the image to be detected into an image vector.

[0052] Since clustering algorithms require a large number of arrays for clustering, the images to be detected need to be represented by vectorization to facilitate the description of the images as arrays for clustering. Therefore, in this embodiment of the invention, the element position of each UI element in the detection result can be used to first convert the images to be detected into vectors.

[0053] Step 104: Obtain the positive sample image vector; wherein the positive sample image vector is obtained by transforming the positive sample image.

[0054] Since clustering algorithms require a large number of arrays for clustering, positive sample images need to be represented by vectors to facilitate the description of images as arrays for clustering. Therefore, in this embodiment of the invention, positive sample images can be converted into positive sample image vectors in advance, and then the positive sample image vectors can be stored in a positive sample vector library. In this way, after the UI detection process is carried out, the positive sample vector library can be directly called to obtain the positive sample image vectors from the positive sample vector library.

[0055] Step 105: Cluster the image vector to be detected and the positive sample image vector to obtain the clustering result.

[0056] In this embodiment of the invention, the vector of the image to be detected and the vector of the positive sample image can be clustered based on density. The purpose is to group similar vectors into one class. In a specific implementation, regions with sufficient density can be divided into clusters, and clusters of arbitrary shapes can be found in a noisy spatial database to obtain the clustering results. Regions with sufficient density can be determined based on the Euclidean distance between vectors being less than a preset threshold. A cluster can be defined as the largest set of density-connected points; vectors in the same cluster are similar vectors.
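The density-based grouping described above resembles the well-known DBSCAN algorithm, although the source does not name a specific algorithm. The following is a minimal pure-Python sketch under that assumption. The function names and the `eps` and `min_pts` parameters are illustrative, and the strict "less than" comparison follows the wording of the text rather than the usual DBSCAN convention of "less than or equal to".

```python
import math
from typing import List, Optional, Sequence

NOISE = -1  # label for vectors that belong to no sufficiently dense region


def region_query(points: Sequence[Sequence[float]], idx: int, eps: float) -> List[int]:
    """Indices of all points whose Euclidean distance to points[idx] is below eps."""
    return [j for j, q in enumerate(points) if math.dist(points[idx], q) < eps]


def density_cluster(points: Sequence[Sequence[float]], eps: float, min_pts: int) -> List[int]:
    """Assign a cluster id to each point; NOISE marks points in no dense region."""
    labels: List[Optional[int]] = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be absorbed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins the cluster, not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = region_query(points, j, eps)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # core point: keep growing the cluster
    return [NOISE if label is None else label for label in labels]


# Hypothetical 2-D vectors: two dense groups and one isolated point.
sample_points = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [50, 50]]
sample_labels = density_cluster(sample_points, eps=1.5, min_pts=2)
```

In this reading, an image vector to be detected that ends up labeled `NOISE`, or in a cluster containing no positive sample vector, would be the "no match" case.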

[0057] In one example, if the Euclidean distance between the image vector to be detected and a positive sample image vector is less than a preset threshold, it can be determined that the two vectors lie in a region of sufficient density, that is, they are in the same cluster and are similar vectors. If the Euclidean distance between the image vector to be detected and the positive sample image vector is greater than or equal to the preset threshold, it can be determined that the two vectors do not lie in a region of sufficient density, that is, they are not in the same cluster and are not similar vectors.
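The threshold rule in this example can be sketched directly. This is a simplified illustration, assuming the vectors are plain lists of floats of equal length; `find_match` and `classify` are hypothetical names, and the threshold value would have to be tuned on real data.

```python
import math
from typing import Optional, Sequence


def find_match(
    query: Sequence[float],
    positives: Sequence[Sequence[float]],
    threshold: float,
) -> Optional[int]:
    """Index of the closest positive vector strictly within threshold, else None."""
    best_index, best_dist = None, math.inf
    for i, vec in enumerate(positives):
        if len(vec) != len(query):
            continue  # vectors of different dimension cannot be compared
        dist = math.dist(query, vec)
        # "Less than" the threshold means same cluster; equal or greater does not.
        if dist < threshold and dist < best_dist:
            best_index, best_dist = i, dist
    return best_index


def classify(
    query: Sequence[float],
    positives: Sequence[Sequence[float]],
    threshold: float,
) -> str:
    """'normal' if some positive sample vector matches, otherwise 'abnormal'."""
    return "normal" if find_match(query, positives, threshold) is not None else "abnormal"


# Hypothetical positive sample vectors.
library = [[1.0, 2.0], [5.0, 5.0]]
```

For instance, `classify([1.0, 2.5], library, 1.0)` falls within distance 0.5 of the first positive vector, while a query at distance exactly equal to the threshold is not a match.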

[0058] Step 106: If the clustering result is that no positive sample image vector matching the vector of the image to be detected is obtained, then the image to be detected is detected as an abnormal image; if the clustering result is that a positive sample image vector matching the vector of the image to be detected is obtained, then the image to be detected is detected as a normal image.

[0059] The positive sample images can comprise many images, for example 5000. A single kind of positive sample image can itself comprise multiple images, for example 100. If images of the same kind are treated as one category, the positive sample images fall into multiple categories, for example 50.

[0060] Positive sample images are normal image materials that meet design expectations. Therefore, after clustering the vector of the image to be detected with the vector of positive sample images, the clustering results can be used to determine whether the image to be detected is a normal or abnormal image. If the clustering result yields a positive sample image vector that matches the vector of the image to be detected, meaning the image to be detected can be clustered into one of the positive sample images, then the image to be detected can be determined to be a normal image that meets design expectations. If the clustering result does not yield a positive sample image vector that matches the vector of the image to be detected, meaning the image to be detected cannot be clustered into any of the positive sample images, then the image to be detected can be determined to be an abnormal image that does not meet design expectations.

[0061] In summary, in this embodiment of the invention, an image to be detected containing multiple UI elements is obtained. This image is then input into a preset element detection model to obtain detection results containing element positions. The element detection model is trained based on positive sample images. The image to be detected is then converted into an image vector using the element positions of the UI elements, and a positive sample image vector is obtained. This positive sample image vector is obtained by converting a positive sample image. The image vector to be detected is clustered with the positive sample image vector to obtain clustering results. If the clustering result does not yield a matching positive sample image vector, the image to be detected is identified as an abnormal image; if the clustering result yields a matching positive sample image vector, the image to be detected is identified as a normal image. The embodiment of this invention trains an element detection model based on positive sample images. The positive sample images can be enumerated, thereby overcoming the problems of difficult collection of abnormal materials and model non-convergence. Moreover, the element detection model can locate UI elements without manual XPath coding, and by automatically clustering the image to be detected and the positive sample images, abnormal images that do not meet the design expectations can be detected without manual comparison of images.

[0062] Referring to Figure 2, a flowchart of the steps of Embodiment 2 of the UI detection method of the present invention is shown. The method may specifically include the following steps:

[0063] Step 201: Obtain the image to be detected; the image to be detected includes multiple UI elements.

[0064] Before a new version is released, you can check if the UI is working correctly. This check can include whether the UI page is empty, whether any UI elements are missing, whether UI elements overlap, and whether the positions of two elements have been swapped, etc.

[0065] In this embodiment of the invention, an image to be detected can be obtained. The image to be detected is an image of the UI page, which may include UI elements such as buttons, input boxes, players, progress bars, and icons.

[0066] Step 202: Input the image to be detected into a preset element detection model to obtain the detection result; wherein, the element detection model is trained based on positive sample images, and the detection result includes the element position of each UI element.

[0067] After obtaining the image to be detected, the page elements can be detected first. Specifically, the image to be detected can be input into a pre-trained element detection model. The element detection model is used to detect UI elements and their positions on the UI page. Therefore, after the element detection model completes its detection, it can output detection results containing the position of each UI element.

[0068] The structure of the detection result can be {element category, [x1,y1,x2,y2], confidence score}. Here, [x1,y1,x2,y2] is the element position; further, [x1,y1] is the coordinate of the top-left corner of the bounding rectangle of the UI element, and [x2,y2] is the coordinate of the bottom-right corner of the bounding rectangle of the UI element.
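The result structure above maps naturally onto a small record type. The sketch below is one possible representation, not a format prescribed by the source: the class name `Detection`, its field names, and the derived `width`, `height`, and `center` helpers are assumptions, and the coordinates in the usage line are made-up values.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Detection:
    """One entry of the detection result: {element category, [x1,y1,x2,y2], confidence}."""

    category: str
    box: Tuple[float, float, float, float]  # (x1, y1, x2, y2) of the bounding rectangle
    confidence: float                        # 0.0 to 1.0

    @property
    def top_left(self) -> Tuple[float, float]:
        return (self.box[0], self.box[1])

    @property
    def bottom_right(self) -> Tuple[float, float]:
        return (self.box[2], self.box[3])

    @property
    def width(self) -> float:
        return self.box[2] - self.box[0]

    @property
    def height(self) -> float:
        return self.box[3] - self.box[1]

    @property
    def center(self) -> Tuple[float, float]:
        return (self.box[0] + self.width / 2, self.box[1] + self.height / 2)


# Hypothetical coordinates for illustration only.
example = Detection("btn_atc_search", (10, 20, 110, 60), 0.99)
```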

[0069] In one example, refer to Figure 3, which shows a graphical representation of the detection results provided in an embodiment of the present invention. Figure 3 shows the detection result obtained after the image to be detected is processed by the element detection model. The detection result includes the element category, element position, and confidence score corresponding to elements a to h, as follows:

[0070] Element a: {btn_ptc_input box, [x_a1, y_a1, x_a2, y_a2], 99%};

[0071] Element b: {btn_atc_search, [x_b1, y_b1, x_b2, y_b2], 99%};

[0072] Element c: {btn_img_rect, [x_c1, y_c1, x_c2, y_c2], 97%};

[0073] Element d: {btn_atc_star, [x_d1, y_d1, x_d2, y_d2], 97%};

[0074] Element e: {btn_label_dubo, [x_e1, y_e1, x_e2, y_e2], 99%};

[0075] Element f: {btn_img_rect, [x_f1, y_f1, x_f2, y_f2], 97%};

[0076] Element g: {btn_func_logo, [x_g1, y_g1, x_g2, y_g2], 99%};

[0077] Element h: {btn_atc_star, [x_h1, y_h1, x_h2, y_h2], 97%}.

[0078] It should be noted that btn_xx is the character representation of the element category, and 97% and 99% are confidence levels. Confidence level indicates the credibility of the detected element category; the higher the confidence level, the more credible it is.

[0079] Based on the element categories, we know that element c and element f are elements of the same category, and element d and element h are elements of the same category.
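Grouping the detected elements by category, as done informally above, can be sketched as follows. Only the element names and category strings come from the example; the function name `group_by_category` is hypothetical, and the symbolic coordinates are omitted because the source gives no numeric values.

```python
from collections import defaultdict
from typing import Dict, List

# Element name -> detected category, copied from the example detection result.
detected: Dict[str, str] = {
    "a": "btn_ptc_input box",
    "b": "btn_atc_search",
    "c": "btn_img_rect",
    "d": "btn_atc_star",
    "e": "btn_label_dubo",
    "f": "btn_img_rect",
    "g": "btn_func_logo",
    "h": "btn_atc_star",
}


def group_by_category(elements: Dict[str, str]) -> Dict[str, List[str]]:
    """Collect element names under their detected category, keeping input order."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for name, category in elements.items():
        groups[category].append(name)
    return dict(groups)


groups = group_by_category(detected)
# Categories that occur more than once on the page.
shared = {cat: names for cat, names in groups.items() if len(names) > 1}
```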

[0080] In this embodiment of the invention, the element detection model is trained based on positive sample images. Since the positive sample images not only meet the expected UI display effect, but also can enumerate a large number of correct display images, this embodiment of the invention can get rid of the problems of abnormal material collection being difficult and the model being unable to converge.

[0081] In an optional embodiment of the present invention, the element detection model can be trained in the following manner:

[0082] Obtain the positive sample image used for training; the positive sample image includes multiple sample UI elements, each sample UI element is labeled with a rectangle;

[0083] An object detection algorithm is used to perform deep learning on multiple sample UI elements labeled by the rectangle to obtain an element detection model.

[0084] In the specific implementation, multiple initial positive sample images (5000+ images) can be collected. Each initial positive sample image includes multiple sample UI elements. Then, each sample UI element can be manually labeled. Specifically, a rectangle can be used to label each sample UI element. This yields positive sample images for training. The labeled positive sample images can then be input into the object detection algorithm for deep learning, thereby obtaining the element detection model.

[0085] In this embodiment of the invention, Faster R-CNN can be used as a deep learning object detection algorithm. Faster R-CNN is an improvement on R-CNN (Region-CNN). Faster R-CNN aims to find the target of interest in an image and determine its category and location.

[0086] Specifically, the training process of the object detection algorithm is as follows. A positive sample image is input, and the entire positive sample image is fed into a CNN for feature extraction. A Region Proposal Network (RPN) produces multiple anchor boxes in the positive sample image; anchor boxes are rectangular boxes used to frame sample UI elements. A normalized exponential function (softmax) is used to determine whether each anchor belongs to the foreground or the background. Simultaneously, another branch, bounding box regression, corrects the anchor boxes to form more accurate proposals. These proposals are mapped onto the last convolutional feature map of the CNN, and an RoI (Region of Interest) pooling layer generates a fixed-size feature map for each RoI. Finally, the detection classification probability and the detection bounding box regression are jointly trained to obtain the element detection model.
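Two of the building blocks named above, the normalized exponential function and the bounding box correction of an anchor, can be illustrated in isolation. This is a didactic sketch, not a training loop: the function names are hypothetical, and the box update uses the standard Faster R-CNN parameterization (center offsets scaled by anchor size, log-scale width and height), which the source does not spell out.

```python
import math
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def softmax(scores: Sequence[float]) -> List[float]:
    """Normalized exponential function; subtracting the max keeps exp() stable."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def apply_deltas(anchor: Box, deltas: Tuple[float, float, float, float]) -> Box:
    """Correct an anchor box with regression outputs (dx, dy, dw, dh)."""
    x1, y1, x2, y2 = anchor
    wa, ha = x2 - x1, y2 - y1
    cxa, cya = x1 + wa / 2, y1 + ha / 2
    dx, dy, dw, dh = deltas
    cx, cy = cxa + dx * wa, cya + dy * ha      # shift the center
    w, h = wa * math.exp(dw), ha * math.exp(dh)  # rescale width and height
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


# Hypothetical scores for one anchor, ordered as [background, foreground].
probs = softmax([-1.0, 2.0])
is_foreground = probs[1] > probs[0]
```

With all-zero deltas the anchor is returned unchanged, which is a quick sanity check on the parameterization.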

[0087] Step 203: Using the element positions of each UI element, convert the image to be detected into an image vector.

[0088] After detecting the page elements, the page layout can be detected. Specifically, since clustering algorithms require a large number of arrays for clustering, the image to be detected needs to be represented by a vectorized method to facilitate describing the image as an array for clustering. Therefore, this embodiment of the invention can use the element position of each UI element output by the element detection model to first convert the image to be detected into an image vector.

[0089] In an optional embodiment of the present invention, referring to Figure 4, a flowchart of the sub-steps of Embodiment 2 of the UI detection method of the present invention is shown. Step 203 may include the following sub-steps:

[0090] Sub-step S11: Determine the adjacent UI elements of each UI element;

[0091] Sub-step S12: Using the element positions of each UI element and the element positions of adjacent UI elements, determine the relative positions of each UI element in the image to be detected.

[0092] Sub-step S13: Using the relative positions of each UI element in the image to be detected, the image to be detected is converted into an image vector.

[0093] Page layout can be represented by the relative positions of UI elements within the image to be inspected. Specifically, for each UI element in the image, all adjacent UI elements within a certain radius can be scanned in a preset direction (counter-clockwise / clockwise) to determine the adjacent UI elements for each UI element. Then, using the element position of each UI element and the element positions of its corresponding adjacent UI elements, the relative position of each UI element in the image can be determined. The relative position of each UI element can be used as a data point in the entire image vector. By finding the relative positions of all UI elements, the layout of a UI page can be defined. In other words, by using the relative positions of each UI element in the image, the image to be inspected is converted into an image vector, and the image vector can represent the page layout within the image.

[0094] In an optional embodiment of the present invention, referring to Figure 5, a flowchart of the sub-steps of Embodiment 2 of the UI detection method of the present invention is shown. Sub-step S11 may include the following sub-steps:

[0095] Sub-step S111: The image to be detected is segmented to obtain multiple blocks;

[0096] Sub-step S112: For each block, the UI elements within the same block are designated as adjacent UI elements to other elements within the block.

[0097] To increase the dimension of the vector, the image to be detected can be uniformly divided into multiple blocks of equal size. Then, for each block, the adjacent UI elements of each UI element within that block are identified. Specifically, using the top-left corner of each block as the scanning point, all UI elements within the block are scanned in a preset direction (counter-clockwise / clockwise). Then, for each block, the UI elements within the same block are considered as the adjacent UI elements of other elements within that block.

[0098] It should be noted that if a UI element is located in two or more blocks, it can be determined which block it belongs to based on the coordinates of the top left corner of the UI element's bounding rectangle.

[0099] In one example, refer to Figure 6, which shows a schematic diagram of the block division provided in an embodiment of the present invention. In Figure 6, the image to be detected is evenly divided into nine blocks. The first block contains elements i and j, the seventh block contains element n, the eighth block contains element p, and the ninth block contains elements l, m, and q. Element k is assigned to the fourth block because the top-left corner of its bounding rectangle is located in the fourth block. The remaining blocks contain no elements.

[0100] After dividing the data into blocks, the top-left corner of each block is used as the scanning point. All UI elements within the block are scanned in a preset direction (counter-clockwise / clockwise). This allows each UI element within the same block to be identified as an adjacent UI element. For example, as shown... Figure 6 As shown, using the top-left corner of the first block as the scanning point, elements i and j are scanned, indicating that elements i and j are in the same block. Therefore, the adjacent element of element i is element j, and the adjacent element of element j is element i. Similarly, using the top-left corner of the ninth block as the scanning point, elements l, m, and q are scanned, indicating that elements l, m, and q are in the same block. Therefore, the adjacent elements of element l are elements m and q, the adjacent elements of element m are elements l and q, and the adjacent elements of element q are elements l and m.
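The block assignment and adjacency rules above can be sketched as follows. This is a sketch under stated assumptions: the 900 x 900 image size, the bounding-box coordinates, the row-major block numbering, and the function names `assign_blocks` and `adjacent_elements` are hypothetical and chosen only to reproduce the arrangement described for Figure 6.

```python
from collections import defaultdict

def assign_blocks(boxes, width, height, rows=3, cols=3):
    """Map block index (row-major, starting at 1) to the elements it owns.

    An element belongs to the block containing the top-left corner of its
    bounding rectangle, even if the rectangle extends into other blocks.
    """
    block_w, block_h = width / cols, height / rows
    blocks = defaultdict(list)
    for name, (x1, y1, _x2, _y2) in boxes.items():
        col = min(int(x1 // block_w), cols - 1)
        row = min(int(y1 // block_h), rows - 1)
        blocks[row * cols + col + 1].append(name)
    return dict(blocks)

def adjacent_elements(blocks):
    """Elements sharing a block are adjacent UI elements of one another."""
    return {name: [other for other in members if other != name]
            for members in blocks.values() for name in members}

# Hypothetical bounding boxes (x1, y1, x2, y2) on a 900 x 900 image.
boxes = {
    "i": (200, 50, 280, 100), "j": (50, 200, 130, 250),
    "k": (250, 550, 400, 650),  # spans several blocks, owned by block 4
    "l": (650, 650, 700, 690), "m": (750, 700, 800, 740),
    "n": (50, 700, 120, 760), "p": (350, 700, 420, 760),
    "q": (700, 800, 760, 850),
}
blocks = assign_blocks(boxes, 900, 900)
adjacency = adjacent_elements(blocks)
print(blocks)
print(adjacency["l"])
```

Clamping the row and column with `min` keeps an element whose corner sits exactly on the right or bottom image edge inside the last block.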

[0101] In an optional embodiment of the present invention, refer to Figure 7, which shows a sub-step flowchart of a UI detection method according to a second embodiment of the present invention. Sub-step S12 may include the following sub-steps:

[0102] Sub-step S121: Determine the vertices of each block as reference UI elements;

[0103] Sub-step S122: Using the element position of each UI element and the element position of the reference UI element, calculate the reference distance between each UI element and the corresponding reference UI element; and using the element position of each UI element and the element position of the adjacent UI element, calculate the adjacent distance between each UI element and the corresponding adjacent UI element.

[0104] Sub-step S123: For each UI element, calculate the average distance between the reference distance and the adjacent distance;

[0105] Sub-step S124: The average distance of each UI element is determined as the relative position of each UI element in the image to be detected.

[0106] The relative position of a UI element in the image to be detected can be represented by the average distance between elements. Specifically, in this embodiment of the invention, the vertices of each block can be used as reference UI elements. Then, using the element position of each UI element and the element position of its corresponding reference UI element, the reference distance between each UI element and its corresponding reference UI element is calculated. Additionally, using the element position of each UI element and the element positions of its corresponding adjacent UI elements, the adjacent distance between each UI element and its corresponding adjacent UI element is calculated. Furthermore, for each UI element, the average distance between the reference distance and the adjacent distance is calculated, thereby using the average distance of each UI element as the relative position of each UI element in the image to be detected.

[0107] It should be noted that the vertex of each block is used as a reference UI element in this embodiment of the invention in order to avoid the mirroring problem that arises when only one adjacent UI element is used as a reference point. For example, as shown in Figure 6, the neighboring element of element i is element j, and the neighboring element of element j is element i. Therefore, when the adjacent distance $D_{i1}$ between element i and its neighboring element j and the adjacent distance $D_{j1}$ between element j and its neighboring element i are calculated, $D_{i1}$ and $D_{j1}$ are in fact the same. If element i and element j are swapped, element i, originally at the upper right of element j, is at the lower left of element j after the swap. However, since the distance is unchanged, the relative positions before and after the swap are the same even though the page layout has actually changed, which makes it impossible to detect whether the page layout meets the design expectations.

[0108] Therefore, in this embodiment of the invention, the vertices of each block are used as reference UI elements. For example, as shown in Figure 6, elements i and j are within the same block and therefore share the same reference UI element, namely the vertex of the first block. The reference distance $D_{i2}$ between element i and the vertex of the first block differs from the reference distance $D_{j2}$ between element j and the vertex of the first block. So, after elements i and j are swapped, the relative positions before and after the swap are different because the reference distance has changed. This can be used to detect that the page layout has changed.
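The mirroring problem and its remedy can be checked numerically with a small sketch. The coordinates below are hypothetical; the block vertex is placed at the origin for simplicity, and the helper name `distances` is an assumption.

```python
import math

def distances(element, neighbor, vertex):
    """Return (adjacent distance, reference distance) for one element."""
    return math.dist(element, neighbor), math.dist(element, vertex)

vertex = (0, 0)                       # top-left vertex of the block (assumed)
pos_i, pos_j = (200, 50), (80, 160)   # hypothetical top-left corners

adj_before, ref_before = distances(pos_i, pos_j, vertex)
pos_i, pos_j = pos_j, pos_i           # swap the two elements
adj_after, ref_after = distances(pos_i, pos_j, vertex)

print("adjacent distance unchanged:", math.isclose(adj_before, adj_after))
print("reference distance changed:", not math.isclose(ref_before, ref_after))
```

The adjacent distance alone cannot distinguish the two layouts, while the reference distance of element i changes after the swap, which is the effect the block vertex is meant to provide.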

[0109] In this embodiment of the invention, both the reference distance and the adjacent distance can be calculated using the Euclidean distance formula, which is as follows:

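[0110] $D = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$ (Equation 1), where $[x_1, y_1]$ and $[x_2, y_2]$ are the coordinates of the two points between which the distance is calculated.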

[0111] Since the element position is the coordinates of the top-left corner and the bottom-right corner of the outer rectangle of the UI element, one of the coordinates can be used as a parameter for calculation. The following example uses the coordinates of the top-left corner of the outer rectangle of the UI element as the parameter.

[0112] In one example, refer to Figure 8, which illustrates the relative positions provided in an embodiment of the present invention. Figure 8 is an enlarged view of the first block in Figure 6. In Figure 8, vertex O is the top-left corner of the first block and can be used as a reference UI element. For element i, the top-left corner coordinates $[x_{i1}, y_{i1}]$ of the bounding rectangle of element i and the coordinates $[x_{O1}, y_{O1}]$ of the reference UI element can be substituted into the Euclidean distance formula (Equation 1) above to calculate the reference distance $D_X$. Likewise, the top-left corner coordinates $[x_{i1}, y_{i1}]$ of the bounding rectangle of element i and the coordinates $[x_{j1}, y_{j1}]$ of the adjacent UI element j can be substituted into the Euclidean distance formula above to calculate the adjacent distance $D_Z$.

[0113] Similarly, for element j, the top-left corner coordinates $[x_{j1}, y_{j1}]$ of the bounding rectangle of element j and the coordinates $[x_{O1}, y_{O1}]$ of the reference UI element can be substituted into the Euclidean distance formula to calculate the reference distance $D_Y$. Likewise, the top-left corner coordinates $[x_{j1}, y_{j1}]$ of the bounding rectangle of element j and the coordinates $[x_{i1}, y_{i1}]$ of the adjacent UI element i can be substituted into the Euclidean distance formula to calculate the adjacent distance $D_Z$.

[0114] After the reference distance $D_X$ and adjacent distance $D_Z$ of element i and the reference distance $D_Y$ and adjacent distance $D_Z$ of element j have been calculated, the mean of $D_X$ and $D_Z$ and the mean of $D_Y$ and $D_Z$ can be further calculated. The mean distance can be calculated using the following formula:

$\bar{D} = \frac{1}{n} \sum \mathrm{dis}(x_j - x_i)$ (Equation 2)

[0115] Where n is the total number of reference UI elements and adjacent UI elements, and $\mathrm{dis}(x_j - x_i)$ denotes each distance obtained from the aforementioned calculation, namely the reference distance $D_X$, the reference distance $D_Y$, and the adjacent distance $D_Z$.

[0116] In one example, for element i, substituting the reference distance $D_X$ and the adjacent distance $D_Z$ into Equation 2 above gives the relative position of element i in the image to be detected: $(D_X + D_Z) / 2$.

[0117] In one example, for element j, substituting the reference distance $D_Y$ and the adjacent distance $D_Z$ into Equation 2 above gives the relative position of element j in the image to be detected: $(D_Y + D_Z) / 2$.
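The relative-position calculation of sub-steps S121 to S124 can be sketched as follows, assuming top-left corners are used as element positions as in the example above. The function name `relative_positions` and the coordinates are hypothetical; the coordinates are chosen so that the distances come out as round numbers.

```python
import math

def relative_positions(block_vertex, elements):
    """Average each element's reference distance and adjacent distances.

    `elements` maps names to top-left (x, y) corners of the elements in
    ONE block; `block_vertex` is the block's top-left vertex.
    """
    result = {}
    for name, pos in elements.items():
        # Reference distance to the block vertex.
        dists = [math.dist(pos, block_vertex)]
        # Adjacent distances to every other element in the same block.
        dists += [math.dist(pos, other)
                  for other_name, other in elements.items()
                  if other_name != name]
        result[name] = sum(dists) / len(dists)  # mean over n distances
    return result

# Hypothetical first block: vertex O at the origin, elements i and j.
rel = relative_positions((0, 0), {"i": (30, 40), "j": (60, 80)})
print(rel)
```

Here the reference distances are 50 for i and 100 for j, and the shared adjacent distance is 50, so the means are (50 + 50) / 2 and (100 + 50) / 2. In practice the coordinates might be normalized by the image size, which is an assumption not stated in the text.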

[0118] In an optional embodiment of the present invention, refer to Figure 9, which shows a sub-step flowchart of a UI detection method according to a second embodiment of the present invention. Sub-step S13 may include the following sub-steps:

[0119] Sub-step S131: For each block, construct an N-dimensional block vector; where N is a positive integer.

[0120] Sub-step S132: Assign the relative position value of each UI element in the image to be detected to the block vector of its block to obtain multiple target block vectors;

[0121] Sub-step S133: Concatenate the multiple target block vectors sequentially according to a preset direction to obtain an M-dimensional image vector to be detected; where M is a positive integer and M > N.

[0122] Typically, the number of UI elements contained in a UI page is limited, while clustering algorithms require a large number of arrays for clustering. Therefore, in order to increase the dimension of the vector, an N-dimensional block vector can be constructed for each block. An N-dimensional block vector is equivalent to an ordered array of N elements, where N is a positive integer.

[0123] Based on experience, each block generally has no more than 10 elements. Therefore, a 10-dimensional block vector can be constructed for each block, that is, each block is set to have 10 elements. However, in reality, each block may not have 10 elements. Therefore, after assigning the value of the relative position of each UI element in the image to be detected to the block vector of its block, the unassigned data points in the block vector are all taken as default values, which can be 0.

[0124] In one example, suppose the relative position value of element i is 0.1839 and the relative position value of element j is 0.1415. Since the first block only contains elements i and j, after assigning the relative position values of element i (0.1839) and element j (0.1415) to the block vector of the first block, the remaining 8 data points in the block vector of the first block will all be 0. Thus, the target block vector of the first block is obtained as: [0.1839, 0.1415, 0, 0, 0, 0, 0, 0, 0, 0].

[0125] After obtaining multiple target block vectors, the multiple target block vectors can be connected sequentially in a preset direction (counterclockwise / clockwise) to obtain an M-dimensional image vector to be detected, where M is a positive integer and M > N.

[0126] In one example, as can be seen from Figure 6, the image to be detected is divided into nine blocks, so the target block vectors of the nine blocks can be obtained. These nine target block vectors can then be connected in sequence according to a preset direction (counterclockwise / clockwise) to obtain a 90-dimensional image vector to be detected. The 90-dimensional image vector to be detected is an ordered array of 90 elements.
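The construction of the 90-dimensional image vector can be sketched as follows. The function names `block_vector` and `image_vector` are hypothetical, and concatenating blocks in ascending block index stands in for the preset direction, which the text does not pin down further.

```python
def block_vector(values, n=10, default=0.0):
    """Pad a block's relative-position values to an N-dimensional vector."""
    if len(values) > n:
        raise ValueError("block holds more elements than the block vector")
    return list(values) + [default] * (n - len(values))

def image_vector(block_values, num_blocks=9, n=10):
    """Concatenate the target block vectors into one M-dimensional vector.

    `block_values` maps a block index (1-based) to the relative positions
    of the elements in that block; empty blocks contribute only defaults.
    """
    vector = []
    for index in range(1, num_blocks + 1):
        vector.extend(block_vector(block_values.get(index, []), n))
    return vector

# Only the first block is filled, using the values from the example above.
vec = image_vector({1: [0.1839, 0.1415]})
print(len(vec), vec[:10])
```

With nine blocks of ten entries each, M = 9 x 10 = 90, and every unassigned data point keeps the default value 0.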

[0127] Step 204: Obtain the positive sample image vector; wherein the positive sample image vector is obtained by transforming the positive sample image.

[0128] Since clustering algorithms require a large number of arrays for clustering, positive sample images need to be represented by vectors so that the images can be described as arrays for clustering. Therefore, in this embodiment of the invention, positive sample images can be converted into positive sample image vectors in advance, and the positive sample image vectors can then be stored in a positive sample vector library. In this way, when the UI detection process is carried out, the positive sample vector library can be called directly to obtain the positive sample image vectors.

[0129] Since the process of converting a positive sample image into a positive sample image vector is the same as the process of converting a detection image into a detection image vector, it will not be described again here.

[0130] Step 205: Cluster the image vector to be detected with the positive sample image vector to obtain the clustering result.

[0131] After the positive sample image and the image to be detected have been converted into the positive sample image vector and the image vector to be detected, respectively, the two vectors can be clustered based on density to obtain clustering results. Based on the clustering results, it can be determined whether the page layout of the image to be detected is normal.

[0132] In an optional embodiment of the present invention, the positive sample image vector includes positive sample image vectors of different categories, each of which is M-dimensional. Refer to Figure 10, which shows a sub-step flowchart of a UI detection method according to a second embodiment of the present invention. Step 205 may include the following sub-steps:

[0133] Sub-step S21: Cluster the M-dimensional image vector to be detected with the M-dimensional positive sample image vectors of different categories to obtain clustering results; wherein, the clustering results include multiple clusters, and the positive sample image vectors of different categories correspond to one cluster respectively;

[0134] Sub-step S22: Determine whether the image vector to be detected is in the same cluster as the positive sample image vector of any category;

[0135] Sub-step S23: If the image vector to be detected is in the same cluster as the positive sample image vector of any category, then the clustering result is determined to be a positive sample image vector that matches the image vector to be detected.

[0136] Sub-step S24: If the image vector to be detected is not in the same cluster as any positive sample image vector of any category, then the clustering result is determined to be that no positive sample image vector matching the image vector to be detected has been obtained.

[0137] In this embodiment of the invention, the positive sample image vector may include positive sample image vectors of different categories. The positive sample image vectors of different categories can represent different positive sample images, and the positive sample image vectors of the same category can represent the same positive sample image. For example, if there are 5,000 positive sample images and every 100 of them show the same positive sample image, then there are 50 categories.

[0138] Since the dimension of the image vector to be detected is M, in order to ensure that there is a common standard during clustering, the dimension of the positive sample image vectors of different categories is M, that is, the dimension of each positive sample image vector is M.

[0139] In this embodiment of the invention, the M-dimensional image vector to be detected can be clustered with M-dimensional positive sample image vectors of different categories to obtain a clustering result containing the largest set of data points with multiple density connections, that is, a clustering result containing multiple clusters. Each category of positive sample image vector can correspond to a cluster; for example, if there are 50 categories of positive sample image vectors, then there are 50 clusters of positive sample image vectors.

[0140] Specifically, one M-dimensional image vector to be detected can represent one sample parameter, and M-dimensional positive sample image vectors of different categories can each represent one sample parameter. In the clustering process, assuming there are 90-dimensional positive sample image vectors of 50 categories and one 90-dimensional image vector to be detected, these 51 vectors can form a sample set $D = (x_1, x_2, \ldots, x_{51})$. The density of the sample set can then be described by the following definitions:

[0141] 1) ε-neighborhood: for $x_j \in D$, its ε-neighborhood contains the subset of samples in the sample set D whose distance from $x_j$ is not greater than ε, i.e., $N_\epsilon(x_j) = \{x_i \in D \mid \mathrm{distance}(x_i, x_j) \le \epsilon\}$; the number of samples in this subset is denoted as $|N_\epsilon(x_j)|$;

[0142] 2) Core object: for any $x_j \in D$, if its ε-neighborhood $N_\epsilon(x_j)$ contains at least MinPts sample parameters, that is, if $|N_\epsilon(x_j)| \ge MinPts$, then $x_j$ is a core object;

[0143] 3) Directly density-reachable: if $x_i$ is located in the ε-neighborhood of $x_j$, and $x_j$ is a core object, then $x_i$ is said to be directly density-reachable from $x_j$.
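The three density definitions can be expressed as small predicates. This sketch uses 2-dimensional toy points instead of 90-dimensional image vectors so the distances are easy to verify by hand; the function names and the parameter values are assumptions.

```python
import math

def eps_neighborhood(samples, j, eps):
    """Indices of samples whose Euclidean distance to samples[j] is <= eps.

    The neighborhood includes samples[j] itself, as in the definition.
    """
    return [i for i, x in enumerate(samples)
            if math.dist(x, samples[j]) <= eps]

def is_core(samples, j, eps, min_pts):
    """samples[j] is a core object if its neighborhood has >= min_pts samples."""
    return len(eps_neighborhood(samples, j, eps)) >= min_pts

def directly_density_reachable(samples, i, j, eps, min_pts):
    """samples[i] is directly density-reachable from core object samples[j]."""
    return (is_core(samples, j, eps, min_pts)
            and i in eps_neighborhood(samples, j, eps))

samples = [(0, 0), (0, 1), (1, 0), (5, 5)]  # toy data, not image vectors
print(eps_neighborhood(samples, 0, 1.0))
print(directly_density_reachable(samples, 1, 0, 1.0, 3))
print(directly_density_reachable(samples, 0, 1, 1.0, 3))
```

The relation is not symmetric: sample 1 is directly density-reachable from the core object 0, but sample 0 is not directly density-reachable from sample 1, because sample 1 has only two samples in its neighborhood and is therefore not a core object.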

[0144] The 51 vectors mentioned above are input into the clustering algorithm. The clustering process is as follows:

[0145] Input: sample set $D = (x_1, x_2, \ldots, x_{51})$, neighborhood parameters (ε, MinPts), sample distance metric (Euclidean distance);

[0146] Output: Cluster partition C;

[0147] 1) Initialize the core object set $\Omega = \emptyset$, the number of clusters k = 0, the set of unvisited samples $\Gamma = D$, and the cluster partition $C = \emptyset$.

[0148] 2) For j = 1, 2, ..., 51, find all core objects using the following steps:

[0149] a) Using the sample distance metric, find the subset $N_\epsilon(x_j)$ of samples in the ε-neighborhood of the sample parameter $x_j$;

[0150] b) If the number of sample parameters in the subset satisfies $|N_\epsilon(x_j)| \ge MinPts$, add the sample parameter $x_j$ to the core object set: $\Omega = \Omega \cup \{x_j\}$;

[0151] 3) If the core object set $\Omega = \emptyset$, the clustering algorithm ends; otherwise, proceed to step 4.

[0152] 4) Randomly select a core object O from the core object set Ω, initialize the current cluster core object queue $\Omega_{cur} = \{O\}$, update the category index k = k + 1, initialize the current cluster sample set $C_k = \{O\}$, and update the unvisited sample set $\Gamma = \Gamma - \{O\}$;

[0153] 5) If the current cluster core object queue $\Omega_{cur} = \emptyset$, the current cluster $C_k$ is complete; update the cluster partition $C = \{C_1, C_2, \ldots, C_k\}$, update the core object set $\Omega = \Omega - C_k$, and return to step 3; otherwise, proceed to step 6;

[0154] 6) Take a core object O′ from the current cluster core object queue $\Omega_{cur}$, find its ε-neighborhood subset $N_\epsilon(O')$ using the neighborhood distance threshold ε, let $\Delta = N_\epsilon(O') \cap \Gamma$, update the current cluster sample set $C_k = C_k \cup \Delta$, update the unvisited sample set $\Gamma = \Gamma - \Delta$, update $\Omega_{cur} = \Omega_{cur} \cup (\Delta \cap \Omega) - O'$, and return to step 5;

[0155] The output is: cluster partition $C = \{C_1, C_2, \ldots, C_k\}$.
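The clustering procedure above follows the usual density-based (DBSCAN-style) scheme and can be sketched in plain Python. This is an illustrative sketch rather than the embodiment's implementation: the function name `dbscan`, the seeded random generator, the toy 2-dimensional data, and the parameter values are assumptions, and samples are handled by index rather than by value.

```python
import math
import random

def dbscan(samples, eps, min_pts, seed=0):
    """Return (clusters, noise) as sets of sample indices."""
    rng = random.Random(seed)
    n = len(samples)
    # Step 2: eps-neighborhoods and the core object set.
    neigh = [{i for i in range(n) if math.dist(samples[i], samples[j]) <= eps}
             for j in range(n)]
    core = {j for j in range(n) if len(neigh[j]) >= min_pts}
    unvisited = set(range(n))            # step 1: all samples unvisited
    clusters = []
    while core:                          # step 3: stop when no core object left
        start = rng.choice(sorted(core)) # step 4: pick a core object at random
        queue = [start]
        cluster = {start}
        unvisited.discard(start)
        while queue:                     # step 5: cluster done when queue empty
            current = queue.pop(0)       # step 6: expand from a core object
            delta = neigh[current] & unvisited
            cluster |= delta
            unvisited -= delta
            queue.extend(sorted(delta & core))
        clusters.append(cluster)
        core -= cluster
    return clusters, unvisited           # leftover samples belong to no cluster

# Toy data: two dense groups and one isolated point.
samples = [(0, 0), (0, 1), (1, 0), (1, 1),
           (10, 10), (10, 11), (11, 10), (11, 11),
           (50, 50)]
clusters, noise = dbscan(samples, eps=1.5, min_pts=3)
print(sorted(sorted(c) for c in clusters), noise)
```

In the terms of this embodiment, an image vector to be detected that ends up in `noise`, like the isolated point here, would correspond to the case where no matching positive sample image vector is obtained.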

[0156] In an optional embodiment of the present invention, the detection result further includes the element category of each UI element, and each sample UI element in the positive sample image vector has a sample element category and a sample relative position. Refer to Figure 11, which shows a sub-step flowchart of a UI detection method according to a second embodiment of the present invention. Sub-step S22 may include the following sub-steps:

[0157] Sub-step S221: Determine whether the element category and relative position in the image vector to be detected match the sample element category and sample relative position in the positive sample image vector of any category;

[0158] Sub-step S222: If the element category and relative position in the image vector to be detected match the sample element category and sample relative position in the positive sample image vector of any category, then it is determined that the image vector to be detected and the positive sample image vector of any category are in the same cluster.

[0159] Sub-step S223: If the element category and relative position in the image vector to be detected do not match the sample element category and sample relative position in the positive sample image vector of any category, then it is determined that the image vector to be detected is not in the same cluster as the positive sample image vector of any category.

[0160] In this embodiment of the invention, the detection result for the image to be detected includes not only the element position of each UI element, but also the element category of each UI element.

[0161] During training based on positive sample images, training results can be obtained for the positive sample images. The training results include the sample element category, sample element position, and sample confidence of the sample UI elements. Furthermore, during the process of converting the positive sample images into positive sample image vectors, the conversion results can be obtained for the positive sample image vectors. The conversion results include the relative positions of the sample UI elements in the positive sample images.

[0162] The same element category indicates the presence of the same page elements, and the same relative position indicates the presence of the same page layout. Page elements and page layout define a UI page. Therefore, embodiments of the present invention can determine whether the image vector to be detected is in the same cluster as any category of positive sample image vectors by judging whether the element category and relative position in the image vector to be detected match the sample element category and sample relative position in any category of positive sample image vectors. In other words, it can be determined whether the image vector to be detected is in the same cluster as any category of positive sample image vectors by judging whether the element category and relative position in the image vector to be detected are the same as the sample element category and sample relative position in any category of positive sample image vectors.

[0163] If the element category and relative position in the vector of the image to be detected are the same as the sample element category and relative position in the vector of the positive sample image of any category, then it can be determined that the vector of the image to be detected and the vector of the positive sample image of any category are in the same cluster, that is, the image to be detected is clustered into one of the classes of the positive sample images.

[0164] If the element category and relative position in the vector of the image to be detected are not the same as the sample element category and relative position in the vector of the positive sample image of any category, then it can be determined that the vector of the image to be detected is not in the same cluster as the vector of the positive sample image of any category, that is, the image to be detected is not clustered into any category of the positive sample images.

[0165] In one example, if the element category in the vector of the image to be detected is the same as the sample element category in the vector of positive sample images of any category, but the relative position in the vector of the image to be detected is different from the relative position of the sample in the vector of positive sample images of that category, it means that the page elements of the image to be detected meet the design expectations, but the page layout of the image to be detected does not meet the design expectations. Therefore, it can be determined that the vector of the image to be detected is not in the same cluster as the vector of positive sample images of any category, that is, the image to be detected is not clustered into any category of positive sample images.

[0166] In one example, if the relative position in the vector of the image to be detected is the same as the relative position of the sample in the vector of the positive sample image of any category, but the category of the element in the vector of the image to be detected is different from the category of the sample element in the vector of the positive sample image of that category, it means that the page layout of the image to be detected meets the design expectations, but the page elements of the image to be detected do not meet the design expectations. Therefore, it can be determined that the vector of the image to be detected is not in the same cluster as the vector of the positive sample image of any category, that is, the image to be detected is not clustered into any category of the positive sample images.

[0167] In one example, if the element category in the vector of the image to be detected is different from the sample element category in the vector of positive sample images of any category, and the relative position in the vector of the image to be detected is also different from the relative position of the sample in the vector of positive sample images of that category, it indicates that the page elements and page layout of the image to be detected do not meet the design expectations. Therefore, it can be determined that the vector of the image to be detected is not in the same cluster as the vector of positive sample images of any category, that is, the image to be detected has not been clustered into any category of positive sample images.
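The matching rule of sub-steps S221 to S223 and the three mismatch cases above can be sketched as follows. The representation of a vector as a list of (category, relative position) pairs, the function names, the category labels, and the numeric tolerance `tol` are assumptions; the text itself only requires the categories and relative positions to be the same.

```python
import math

def matches(candidate, sample, tol=1e-3):
    """True if every element agrees in category and relative position.

    Both arguments are lists of (category, relative_position) pairs in
    vector order. `tol` is an assumed tolerance for comparing floats.
    """
    if len(candidate) != len(sample):
        return False
    return all(c_cat == s_cat and math.isclose(c_pos, s_pos, abs_tol=tol)
               for (c_cat, c_pos), (s_cat, s_pos) in zip(candidate, sample))

def find_matching_category(candidate, library, tol=1e-3):
    """Return the name of the first matching category, or None if abnormal."""
    for name, sample in library.items():
        if matches(candidate, sample, tol):
            return name
    return None

# Hypothetical positive sample library with a single category.
library = {"home": [("button", 0.1839), ("image", 0.1415)]}

same = [("button", 0.1839), ("image", 0.1415)]
moved = [("button", 0.2500), ("image", 0.1415)]     # layout differs
replaced = [("text", 0.1839), ("image", 0.1415)]    # element differs
print(find_matching_category(same, library),
      find_matching_category(moved, library),
      find_matching_category(replaced, library))
```

Only the first candidate matches a category; the other two differ in relative position or in element category and are therefore not placed in any cluster.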

[0168] Step 206: If the clustering result is that no positive sample image vector matching the vector of the image to be detected is obtained, then the image to be detected is detected as an abnormal image; if the clustering result is that a positive sample image vector matching the vector of the image to be detected is obtained, then the image to be detected is detected as a normal image.

[0169] If no positive sample image vector is clustered to match the vector of the image to be detected, that is, if the image to be detected is not clustered to any class of positive sample images, then it can be determined that the image to be detected is an abnormal image that does not meet the design expectations.

[0170] Images to be detected that are empty, or that have missing, overlapping, or swapped elements, cannot be clustered into any class of the positive sample images. They are all abnormal images that do not meet the design expectations.

[0171] Therefore, the embodiments of the present invention provide a feasible solution for detecting the correctness of front-end UI page display. Based on the element detection model, it can support the location of UI elements without manual coding of XPath. Through vectorization and automatic clustering, it can detect abnormal images that do not meet the design expectations without manual comparison of images. The embodiments of the present invention can be applied to the anomaly detection of UI pages with relatively stable element types and iterative layout or material.

[0172] Unlike anomaly detection approaches that train a deep learning model on abnormal interface displays, this invention avoids the problems of hard-to-obtain abnormal materials and model non-convergence (abnormal page displays cannot be enumerated). Instead, this invention classifies a large number of enumerable correct page displays (positive sample images) and uses clustering to determine whether an image to be detected is abnormal, greatly improving the accuracy of the judgment. The cost of iterating the automated tool after subsequent UI page design upgrades is also controllable.

[0173] It should be noted that during the UI detection process in this embodiment of the invention, the following situations may occur: the image to be detected is a normal image that meets the design expectations, but due to errors in the detection results output by the element detection model, the image to be detected is ultimately determined to be an abnormal image that does not meet the design expectations. Alternatively, the UI page may be redesigned, and the image to be detected may have a new style, such as new UI elements appearing in the image, or old UI elements being repositioned or deleted. Because the element detection model is not iterated in time, the image to be detected may ultimately be determined to be an abnormal image that does not meet the design expectations.

[0174] To avoid the two situations mentioned above, in this embodiment of the invention, if no positive sample image matching the image to be detected is clustered, the image to be detected can be further reviewed manually to see if a new category or element detection error has occurred.

[0175] If the manual review identifies it as a new category, it indicates a UI page redesign and a new style for the image to be tested. In this case, the image to be tested can be judged as a normal image that meets the design expectations, and the vector of the image to be tested can be used as a positive sample image vector for the new category.

[0176] If the manual review indicates an element detection error, it means that the detection results output by the element detection model are incorrect. In this case, the sample UI elements in the positive sample images can be re-annotated, and the element detection model can be iterated.

[0177] In addition, if the manual review finds a page bug, it means that there is no error in the UI detection process of this embodiment of the invention, that is, the image to be detected is an abnormal image that does not meet the design expectations, and the process can be terminated.

[0178] In summary, in this embodiment of the invention, an image to be detected containing multiple UI elements is obtained. This image is then input into a preset element detection model to obtain detection results containing element positions. The element detection model is trained based on positive sample images. The image to be detected is then converted into an image vector using the element positions of the UI elements. A positive sample image vector, which is obtained by converting a positive sample image, is also acquired. The image vector to be detected is clustered with the positive sample image vector to obtain clustering results. If the clustering result does not yield a matching positive sample image vector, the image to be detected is identified as an abnormal image; otherwise, if the clustering result yields a matching positive sample image vector, the image to be detected is identified as a normal image. The embodiment of this invention trains an element detection model based on positive sample images. The positive sample images can be enumerated, thereby overcoming the problems of difficult collection of abnormal materials and model non-convergence. Moreover, the element detection model can locate UI elements without manual XPath coding, and by automatically clustering the image to be detected and the positive sample images, abnormal images that do not meet the design expectations can be detected without manual comparison of images.

[0179] To enable those skilled in the art to better understand the embodiments of the present invention, the embodiments of the present invention are illustrated below through the following examples:

[0180] Refer to Figure 12, which shows a UI detection flowchart provided in an embodiment of the present invention. The UI detection process is as follows:

[0181] 1. The image to be detected is input into the element detection model;

[0182] 2. Element Detection: The element detection model detects the UI elements contained in the image to be detected and the position of the UI elements on the UI page, and outputs the detection results; the detection results include element category, element position and confidence score;

[0183] 3. Feature vectorization: The image to be detected is converted into a vector by using the element positions;

[0184] 4. Clustering Processing: Call the positive sample vector library, extract positive sample image vectors from the library, and cluster the image vector to be detected with positive sample image vectors of different categories;

[0185] 5. Outlier Status: If the element category and relative position in the vector of the image to be detected match the sample element category and relative position in the vector of any positive sample image of any category, then the image to be detected is determined not to be an outlier, that is, the image to be detected is clustered into one of the positive sample images, and the process proceeds to step 7; if the element category and relative position in the vector of the image to be detected do not match the sample element category and relative position in the vector of any positive sample image of any category, then the image to be detected is determined to be an outlier, that is, the image to be detected is not clustered into any of the positive sample images, and the process proceeds to step 6.

[0186] 6. Manual review: If the review finds a new category, proceed to step 7; if the review finds an element detection error, proceed to step 8; if the review finds a page bug, the process ends.

[0187] 7. Add the vector of the image to be detected as a positive sample image vector for the new category to the positive sample vector library, and the process ends;

[0188] 8. Re-label the UI elements in the positive sample images to iterate the element detection model. The process ends here.
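
The branching in steps 5 to 8 above can be sketched as a small routing function. This is only an illustrative sketch: the names `ReviewOutcome` and `route_after_clustering` are hypothetical and do not appear in the embodiment, and the step numbers simply mirror the list above as written.

```python
from enum import Enum
from typing import Optional


class ReviewOutcome(Enum):
    """Possible findings of the manual review in step 6 (hypothetical names)."""
    NEW_CATEGORY = 1
    DETECTION_ERROR = 2
    PAGE_BUG = 3


def route_after_clustering(is_outlier: bool,
                           review: Optional[ReviewOutcome] = None) -> Optional[int]:
    """Return the next step number of the flow, or None when the flow ends."""
    if not is_outlier:
        # Step 5: the image was clustered with a positive sample; go to step 7.
        return 7
    if review is None:
        # Step 5: the image is an outlier and has not been reviewed; go to step 6.
        return 6
    if review is ReviewOutcome.NEW_CATEGORY:
        # Step 6: a new page category was found; store its vector (step 7).
        return 7
    if review is ReviewOutcome.DETECTION_ERROR:
        # Step 6: the detector was wrong; re-label and iterate the model (step 8).
        return 8
    # Step 6: a page bug was found; the flow ends.
    return None
```

Keeping the routing separate from detection and clustering makes it straightforward to test the flow without a trained model.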

[0189] Referring to Figure 13, a structural block diagram of a UI detection device provided in an embodiment of the present invention is shown, which may specifically include the following modules:

[0190] The image acquisition module 1301 is used to acquire the image to be detected; the image to be detected includes multiple UI elements.

[0191] The detection result output module 1302 is used to input the image to be detected into a preset element detection model to obtain the detection result; wherein, the element detection model is trained based on positive sample images, and the detection result includes the element position of each UI element;

[0192] The image vector conversion module 1303 is used to convert the image to be detected into an image vector by using the element positions of each UI element;

[0193] The positive sample image vector acquisition module 1304 is used to acquire a positive sample image vector; wherein, the positive sample image vector is obtained by converting the positive sample image;

[0194] Clustering module 1305 is used to cluster the image vector to be detected with the positive sample image vector to obtain clustering results;

[0195] The detection module 1306 is configured to detect the image to be detected as an abnormal image if the clustering result is that no positive sample image vector matching the image vector to be detected is obtained; and to detect the image to be detected as a normal image if the clustering result is that a positive sample image vector matching the image vector to be detected is obtained.

[0196] In an optional embodiment of the present invention, the image vector conversion module 1303 may include:

[0197] The adjacent UI element determination submodule is used to determine the adjacent UI elements of each UI element respectively;

[0198] The relative position determination submodule is used to determine the relative position of each UI element in the image to be detected by using the element position of each UI element and the element position of the adjacent UI elements.

[0199] The image vector conversion submodule is used to convert the image to be detected into an image vector by using the relative positions of the various UI elements in the image to be detected.

[0200] In an optional embodiment of the present invention, the adjacent UI element determination submodule may include:

[0201] The segmentation unit is used to segment the image to be detected into multiple blocks;

[0202] The adjacent UI element determination unit is used to determine, for each block, UI elements within the same block as adjacent UI elements of other elements within the block.
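
As an illustration of the segmentation unit and the adjacent UI element determination unit, the following sketch splits the image into a uniform grid and treats elements that fall into the same block as adjacent to each other. The uniform grid, the use of the bounding-box center to decide which block an element belongs to, and the function names are assumptions made for this example; the embodiment does not prescribe them.

```python
from collections import defaultdict


def assign_blocks(boxes, image_w, image_h, rows, cols):
    """Map each (row, col) block to the indices of the elements whose center lies in it.

    boxes: list of (x1, y1, x2, y2) element positions in pixels.
    """
    blocks = defaultdict(list)
    for idx, (x1, y1, x2, y2) in enumerate(boxes):
        cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
        # Clamp so that a center on the right or bottom edge stays in the last block.
        col = min(int(cx * cols / image_w), cols - 1)
        row = min(int(cy * rows / image_h), rows - 1)
        blocks[(row, col)].append(idx)
    return dict(blocks)


def adjacent_elements(blocks):
    """For every element, list the other elements that share its block."""
    adjacency = {}
    for members in blocks.values():
        for i in members:
            adjacency[i] = [j for j in members if j != i]
    return adjacency


boxes = [(0, 0, 10, 10), (20, 20, 30, 30), (150, 150, 160, 160)]
blocks = assign_blocks(boxes, image_w=200, image_h=200, rows=2, cols=2)
adjacency = adjacent_elements(blocks)
```

With a 2 x 2 grid over a 200 x 200 image, the first two boxes share the top-left block and are adjacent to each other, while the third box is alone in the bottom-right block.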

[0203] In an optional embodiment of the present invention, the relative position determination submodule may include:

[0204] The reference UI element determination unit is used to determine the vertices of each block as reference UI elements.

[0205] The first calculation unit is used to calculate the reference distance between each UI element and the corresponding reference UI element by using the element position of each UI element and the element position of the reference UI element, and to calculate the adjacent distance between each UI element and the corresponding adjacent UI element by using the element position of each UI element and the element position of the adjacent UI element.

[0206] The second calculation unit is used to calculate the average distance between the reference distance and the adjacent distance for each UI element;

[0207] The relative position determination unit is used to determine the average distance of each UI element as the relative position of each UI element in the image to be detected.
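
The calculation performed by the first and second calculation units can be sketched as follows. Several details are assumptions for the sake of a runnable example: distances are Euclidean distances between bounding-box centers, a single block vertex (for instance the top-left one) is used as the reference point, the adjacent distance is the mean distance to all adjacent elements, and an element without adjacent elements falls back to its reference distance. None of these choices is stated in the embodiment.

```python
import math


def center(box):
    """Center point of an (x1, y1, x2, y2) bounding box."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2)


def relative_position(box, block_vertex, adjacent_boxes):
    """Average of the reference distance and the adjacent distance for one element."""
    c = center(box)
    # Reference distance: element center to the reference vertex of its block.
    reference = math.dist(c, block_vertex)
    if not adjacent_boxes:
        # Assumption: with no adjacent elements, use the reference distance alone.
        return reference
    # Adjacent distance: mean distance from this element to its adjacent elements.
    adjacent = sum(math.dist(c, center(b)) for b in adjacent_boxes) / len(adjacent_boxes)
    return (reference + adjacent) / 2


value = relative_position((0, 0, 6, 8), (0, 0), [(0, 10, 6, 18)])
```

In this example the element center is (3, 4), so the reference distance to the vertex (0, 0) is 5; the adjacent element's center is (3, 14), so the adjacent distance is 10, and the resulting relative position is 7.5.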

[0208] In an optional embodiment of the present invention, the image vector conversion submodule may include:

[0209] The block vector construction unit is used to construct an N-dimensional block vector for each block; where N is a positive integer.

[0210] The assignment unit is used to assign the relative position values of each UI element in the image to be detected to the block vector of the block, so as to obtain multiple target block vectors;

[0211] The stitching unit is used to stitch together the multiple target block vectors sequentially according to a preset direction to obtain an M-dimensional image vector to be detected; where M is a positive integer and M > N.
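
The block vector construction, assignment, and stitching units can be illustrated with the sketch below. It assumes that each block vector is filled in element order and zero-padded to N entries, and that the preset direction is row by row from the top-left block; both are assumptions for this example, as is the function name `build_image_vector`. Under these assumptions M equals N multiplied by the number of blocks.

```python
def build_image_vector(block_values, rows, cols, n):
    """Concatenate zero-padded N-dimensional block vectors in row-major order.

    block_values: dict mapping (row, col) to the list of relative position
    values of the UI elements in that block.
    """
    vector = []
    for row in range(rows):
        for col in range(cols):
            values = list(block_values.get((row, col), []))
            if len(values) > n:
                raise ValueError(f"block {(row, col)} holds more than {n} elements")
            # Target block vector: the assigned values followed by zero padding.
            vector.extend(values + [0.0] * (n - len(values)))
    return vector


image_vector = build_image_vector({(0, 0): [7.5, 3.0], (1, 1): [2.0]}, rows=2, cols=2, n=3)
```

Here four blocks with N = 3 give an image vector with M = 12 entries; empty blocks contribute all-zero block vectors, so images of the same size always map to vectors of the same dimension.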

[0212] In an optional embodiment of the present invention, the positive sample image vector includes positive sample image vectors of different categories, and the dimension of the positive sample image vectors of different categories is M-dimensional; the clustering module 1305 may include:

[0213] The clustering submodule is used to cluster the M-dimensional image vector to be detected with the M-dimensional positive sample image vectors of different categories to obtain a clustering result; wherein, the clustering result includes multiple clusters, and the positive sample image vectors of different categories correspond to one cluster respectively;

[0214] The judgment submodule is used to determine whether the image vector to be detected is in the same cluster as the positive sample image vector of any category;

[0215] The matching positive sample image vector determination submodule is used to determine the clustering result as obtaining a positive sample image vector that matches the image vector to be detected if the image vector to be detected is in the same cluster as any category of positive sample image vector.

[0216] The submodule for determining mismatched positive sample image vectors is used to determine that if the image vector to be detected is not in the same cluster as any category of positive sample image vectors, the clustering result is that no positive sample image vector matching the image vector to be detected has been obtained.

[0217] In an optional embodiment of the present invention, the detection result further includes the element category of each UI element, and each sample UI element in the positive sample image vector has a sample element category and a sample relative position; the judgment submodule may include:

[0218] The judgment unit is used to determine whether the element category and relative position in the image vector to be detected match the sample element category and sample relative position in the positive sample image vector of any category;

[0219] The unit for determining the same cluster is used to determine that the image vector to be detected and the positive sample image vector of any category are in the same cluster if the element category and relative position in the image vector to be detected match the sample element category and sample relative position in the positive sample image vector of any category.

[0220] The determination unit for not being in the same cluster is used to determine that the image vector to be detected is not in the same cluster as any positive sample image vector if the element category and relative position in the image vector to be detected do not match the sample element category and sample relative position in any positive sample image vector of any category.
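
The judgment described above can be sketched as a comparison of element categories and relative positions. For readability the sketch represents each vector as a list of (element category, relative position) pairs and treats two relative positions as matching when they differ by no more than a tolerance `tol`; this representation, the tolerance, and the names `matches` and `find_cluster` are assumptions for this example, since the embodiment does not specify how a match is measured.

```python
import math


def matches(candidate, sample, tol):
    """True if both vectors hold the same categories at matching relative positions."""
    if len(candidate) != len(sample):
        return False
    return all(
        c_cat == s_cat and math.isclose(c_pos, s_pos, abs_tol=tol)
        for (c_cat, c_pos), (s_cat, s_pos) in zip(candidate, sample)
    )


def find_cluster(candidate, library, tol):
    """Return the category whose positive sample vectors the candidate matches, else None."""
    for category, samples in library.items():
        if any(matches(candidate, sample, tol) for sample in samples):
            return category
    # No category matched: the image to be detected is treated as an outlier.
    return None


library = {"home": [[("button", 7.5), ("image", 3.0)]]}
normal = find_cluster([("button", 7.6), ("image", 3.0)], library, tol=0.5)
abnormal = find_cluster([("button", 20.0), ("image", 3.0)], library, tol=0.5)
```

In this example `normal` is `"home"` because both elements match within the tolerance, while `abnormal` is `None` because the button's relative position differs too much from every positive sample.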

[0221] In an optional embodiment of the present invention, the element detection model can be trained by the following modules:

[0222] A positive sample image acquisition module is used to acquire the positive sample images used for training; the positive sample images include multiple sample UI elements, and each sample UI element is labeled with a rectangle.

[0223] The deep learning module is used to perform deep learning on multiple sample UI elements labeled by the rectangle using object detection algorithms to obtain an element detection model.

[0224] In summary, in this embodiment of the invention, an image to be detected containing multiple UI elements is obtained and input into a preset element detection model to obtain a detection result containing the element positions. The element detection model is trained on positive sample images. The image to be detected is then converted into an image vector to be detected using the element positions of the UI elements, and a positive sample image vector, obtained by converting a positive sample image, is acquired. The image vector to be detected is clustered with the positive sample image vector to obtain a clustering result. If the clustering result yields no matching positive sample image vector, the image to be detected is identified as an abnormal image; if it yields a matching positive sample image vector, the image to be detected is identified as a normal image. Because the element detection model is trained on positive sample images, which can be enumerated, the embodiment avoids the problems of abnormal material being difficult to collect and the model failing to converge. Moreover, the element detection model can locate UI elements without manually coded XPath, and by automatically clustering the image to be detected with the positive sample images, abnormal images that do not meet design expectations can be detected without manual image comparison.

[0225] As the device embodiment is basically similar to the method embodiment, the description is relatively simple, and relevant parts can be found in the description of the method embodiment.

[0226] This invention also provides an electronic device which, as shown in Figure 14, includes a processor 1401, a communication interface 1402, a memory 1403, and a communication bus 1404, wherein the processor 1401, the communication interface 1402, and the memory 1403 communicate with each other through the communication bus 1404.

[0227] Memory 1403 is used to store computer programs;

[0228] When the processor 1401 executes the program stored in the memory 1403, it implements the various processes of the above-described UI detection method embodiment.

[0229] The communication bus mentioned above can be a Peripheral Component Interconnect (PCI) bus or an Extended Industry Standard Architecture (EISA) bus, etc. This communication bus can be divided into address bus, data bus, control bus, etc. For ease of illustration, only one thick line is used to represent it in the diagram, but this does not mean that there is only one bus or one type of bus.

[0230] The communication interface is used for communication between the aforementioned terminal and other devices.

[0231] The memory may include random access memory (RAM) or non-volatile memory, such as at least one disk storage device. Optionally, the memory may also be at least one storage device located remotely from the aforementioned processor.

[0232] The processors mentioned above can be general-purpose processors, including central processing units (CPUs), network processors (NPs), etc.; they can also be digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), or other programmable logic devices, discrete gate or transistor logic devices, or discrete hardware components.

[0233] This invention also provides a computer-readable storage medium storing instructions that, when executed on a computer, cause the computer to perform any of the UI detection methods described in the above embodiments.

[0234] This invention also provides a computer program product containing instructions that, when run on a computer, cause the computer to execute any of the UI detection methods described in the above embodiments.

[0235] In the above embodiments, implementation can be achieved entirely or partially through software, hardware, firmware, or any combination thereof. When implemented using software, it can be implemented entirely or partially in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, all or part of the processes or functions described in the embodiments of the present invention are generated. The computer can be a general-purpose computer, a special-purpose computer, a computer network, or other programmable device. The computer instructions can be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, the computer instructions can be transmitted from one website, computer, server, or data center to another website, computer, server, or data center via wired (e.g., coaxial cable, fiber optic, digital subscriber line (DSL)) or wireless (e.g., infrared, wireless, microwave, etc.) means. The computer-readable storage medium can be any available medium that a computer can access or a data storage device such as a server or data center that integrates one or more available media. The available medium can be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., solid state disk (SSD)).

[0236] It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Furthermore, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitations, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes said element.

[0237] The embodiments in this specification are described in a related manner; for identical or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, because the system embodiments are basically similar to the method embodiments, they are described relatively briefly; for relevant details, refer to the description of the method embodiments.

[0238] The above description is merely a preferred embodiment of the present invention and is not intended to limit the scope of protection of the present invention. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of the present invention are included within the scope of protection of the present invention.

Claims

1. A user interface (UI) detection method, characterized in that the method comprises: obtaining an image to be detected, the image to be detected including multiple UI elements; inputting the image to be detected into a preset element detection model to obtain a detection result, wherein the element detection model is trained based on positive sample images and the detection result includes the element position of each UI element; segmenting the image to be detected into multiple blocks; for each block, determining the UI elements within the same block as adjacent UI elements of the other elements within the block; determining the vertices of each block as reference UI elements; calculating, using the element position of each UI element and the element position of the reference UI element, the reference distance between each UI element and the corresponding reference UI element; calculating, using the element position of each UI element and the element position of the adjacent UI element, the adjacent distance between each UI element and the corresponding adjacent UI element; for each UI element, calculating the average distance of the reference distance and the adjacent distance; determining the average distance of each UI element as the relative position of that UI element in the image to be detected; converting the image to be detected into an image vector to be detected using the relative positions of the UI elements in the image to be detected; obtaining a positive sample image vector, wherein the positive sample image vector is obtained by converting a positive sample image; clustering the image vector to be detected with the positive sample image vector to obtain a clustering result; if the clustering result is that no positive sample image vector matching the image vector to be detected is obtained, detecting the image to be detected as an abnormal image; and if the clustering result is that a positive sample image vector matching the image vector to be detected is obtained, detecting the image to be detected as a normal image.

2. The method according to claim 1, characterized in that the step of converting the image to be detected into an image vector to be detected using the relative positions of the UI elements in the image to be detected comprises: for each block, constructing an N-dimensional block vector, where N is a positive integer; assigning the relative position values of the UI elements in the image to be detected to the block vector of the block to obtain multiple target block vectors; and concatenating the multiple target block vectors sequentially according to a preset direction to obtain an M-dimensional image vector to be detected, where M is a positive integer and M > N.

3. The method according to claim 2, characterized in that the positive sample image vector includes positive sample image vectors of different categories, each of which is M-dimensional; and the step of clustering the image vector to be detected with the positive sample image vector to obtain a clustering result comprises: clustering the M-dimensional image vector to be detected with the M-dimensional positive sample image vectors of different categories to obtain a clustering result, wherein the clustering result includes multiple clusters and the positive sample image vectors of each category correspond to one cluster; determining whether the image vector to be detected is in the same cluster as the positive sample image vector of any category; if the image vector to be detected is in the same cluster as the positive sample image vector of any category, determining the clustering result to be that a positive sample image vector matching the image vector to be detected is obtained; and if the image vector to be detected is not in the same cluster as the positive sample image vector of any category, determining the clustering result to be that no positive sample image vector matching the image vector to be detected is obtained.

4. The method according to claim 3, characterized in that the detection result further includes the element category of each UI element, and each sample UI element in the positive sample image vector has a sample element category and a sample relative position; and the step of determining whether the image vector to be detected is in the same cluster as the positive sample image vector of any category comprises: determining whether the element category and relative position in the image vector to be detected match the sample element category and sample relative position in the positive sample image vector of any category; if they match, determining that the image vector to be detected and the positive sample image vector of that category are in the same cluster; and if the element category and relative position in the image vector to be detected do not match the sample element category and sample relative position in the positive sample image vector of any category, determining that the image vector to be detected is not in the same cluster as the positive sample image vector of any category.

5. The method according to claim 1, characterized in that the element detection model is trained in the following manner: obtaining positive sample images used for training, wherein each positive sample image includes multiple sample UI elements and each sample UI element is labeled with a rectangle; and performing deep learning, using an object detection algorithm, on the multiple sample UI elements labeled with rectangles to obtain the element detection model.

6. A UI detection device, characterized in that the device comprises: an image acquisition module, used to acquire an image to be detected, the image to be detected including multiple UI elements; a detection result output module, used to input the image to be detected into a preset element detection model to obtain a detection result, wherein the element detection model is trained based on positive sample images and the detection result includes the element position of each UI element; an image vector conversion module, used to convert the image to be detected into an image vector to be detected using the element position of each UI element; a positive sample image vector acquisition module, used to acquire a positive sample image vector, wherein the positive sample image vector is obtained by converting a positive sample image; a clustering module, used to cluster the image vector to be detected with the positive sample image vector to obtain a clustering result; and a detection module, configured to detect the image to be detected as an abnormal image if the clustering result is that no positive sample image vector matching the image vector to be detected is obtained, and to detect the image to be detected as a normal image if the clustering result is that a positive sample image vector matching the image vector to be detected is obtained; wherein the image vector conversion module includes: an adjacent UI element determination submodule, used to determine the adjacent UI elements of each UI element respectively; and a relative position determination submodule, used to determine the relative position of each UI element in the image to be detected using the element position of each UI element and the element position of the adjacent UI elements; the adjacent UI element determination submodule includes: a segmentation unit, used to segment the image to be detected into multiple blocks; and an adjacent UI element determination unit, used to determine, for each block, the UI elements within the same block as adjacent UI elements of the other elements within the block; and the relative position determination submodule includes: a reference UI element determination unit, used to determine the vertices of each block as reference UI elements; a first calculation unit, used to calculate the reference distance between each UI element and the corresponding reference UI element using the element position of each UI element and the element position of the reference UI element, and to calculate the adjacent distance between each UI element and the corresponding adjacent UI element using the element position of each UI element and the element position of the adjacent UI element; a second calculation unit, used to calculate, for each UI element, the average distance of the reference distance and the adjacent distance; and a relative position determination unit, used to determine the average distance of each UI element as the relative position of that UI element in the image to be detected.

7. An electronic device, characterized in that it comprises a processor, a communication interface, a memory, and a communication bus, wherein the processor, the communication interface, and the memory communicate with each other through the communication bus; the memory is used to store a computer program; and the processor, when executing the program stored in the memory, implements the UI detection method according to any one of claims 1-5.

8. A computer-readable storage medium having a computer program stored thereon, characterized in that the program, when executed by a processor, implements the UI detection method according to any one of claims 1-5.