A method and a terminal for automatically generating training data for a target detection model

By parsing the application installation package to obtain the control layout and attributes, multi-resolution simulated images and annotation files are generated, solving the problems of low efficiency and high cost in generating training data in existing technologies. This enables efficient and batch generation of high-quality training data, improving the efficiency and generalization ability of model training.

CN122244579A · Pending · Publication Date: 2026-06-19 · FUJIAN TQ DIGITAL

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications(China)
Current Assignee / Owner: FUJIAN TQ DIGITAL
Filing Date: 2026-01-28
Publication Date: 2026-06-19

Application Information

IPC: G06V10/774; G06V10/25; G06T11/60; G06F8/61; G06F8/75; G06V10/82; G06F11/3668

AI Tagging

Application Domain

Error detection/correction; Character and pattern recognition

Abstract

This invention relates to the fields of computer vision and object detection technology, and particularly to a method and terminal for automatically generating training data for an object detection model. The method for automatically generating training data for an object detection model includes: acquiring the installation package file of a target application; parsing the installation package file to obtain a layout file corresponding to the target interface in the target application and resources of at least one target control; calculating the geometric information of the target control at a preset design resolution based on the layout file; generating simulated images of the target interface at each preset target resolution based on the geometric information and multiple preset target resolutions; generating a corresponding annotation file for each simulated image, the annotation file including the position annotation information of the target control in the corresponding simulated image; and storing a dataset consisting of the simulated images and their corresponding annotation files. This invention can efficiently and in batches generate high-quality, multi-resolution annotation data.

Description

Technical Field

[0001] This invention relates to the field of computer vision and object detection technology, and in particular to a method and terminal for automatically generating training data for object detection models.

Background Technology

[0002] In mobile application testing, automated script recording, and accessibility technologies, accurate identification of UI controls within application interfaces is crucial for automation. Currently, the mainstream approach relies on deep learning-based object detection models (such as YOLO and SSD) to accomplish this task. Training these models depends on a large amount of precisely labeled simulated image data. Currently, obtaining training data primarily depends on manual annotation of control positions and categories on screenshots. While manual annotation provides data, it suffers from inherent drawbacks such as low efficiency, high cost, poor annotation consistency, and difficulty in covering the entire interface and device resolution, resulting in long model training cycles and limited generalization ability. To overcome the shortcomings of manual annotation, related technologies have proposed computer vision methods based on screenshots or layout inference methods based on application runtime accessibility trees. Among these, image recognition-based methods are easily affected by visual interference from interface lighting, transparency, and dynamic effects, and struggle to completely and accurately identify the semantic type of controls (e.g., misidentifying a "text box" as a "button"). While the runtime interface tree-based method can obtain control type information, the screen coordinates it extracts are easily affected by system components such as the status bar and navigation bar. Furthermore, it requires the application to be actually installed and running, making the process cumbersome and unable to support large-scale, offline automated data generation.

[0003] Both of the above-mentioned automatic data acquisition schemes have limitations in accuracy, stability, or implementation conditions, and have failed to fundamentally solve the problem of generating large-scale, high-quality labeled data efficiently, at low cost, and with high consistency, thus restricting the training effect and application scope of UI control recognition models.

Summary of the Invention

[0004] The technical problem to be solved by the present invention is to provide a method for automatically generating training data for a target detection model, so as to achieve automatic, accurate and batch generation of training data.

[0005] A method for automatically generating training data for an object detection model, the method comprising:

[0006] Obtaining the installation package file of the target application; parsing the installation package file to obtain the layout file corresponding to the target interface in the target application and the resources of at least one target control; calculating, based on the layout file, the geometric information of the target control at a preset design resolution; generating, based on the geometric information and multiple preset target resolutions, a simulated image of the target interface at each target resolution; generating a corresponding annotation file for each of the simulated images, the annotation file including the position annotation information of the target control in the corresponding simulated image; and storing a dataset consisting of the simulated images and their corresponding annotation files.

[0007] To solve the above-mentioned technical problems, another technical solution adopted by the present invention is as follows: A terminal for automatically generating training data for an object detection model includes a memory, a processor, and a computer program stored in the memory and running on the processor. When the processor executes the computer program, it performs the following steps: obtaining the installation package file of the target application; parsing the installation package file to obtain the layout file corresponding to the target interface in the target application and the resources of at least one target control; calculating, based on the layout file, the geometric information of the target control at a preset design resolution; generating, based on the geometric information and multiple preset target resolutions, a simulated image of the target interface at each target resolution; generating a corresponding annotation file for each of the simulated images, the annotation file including the position annotation information of the target control in the corresponding simulated image; and storing a dataset consisting of the simulated images and their corresponding annotation files.

[0008] The beneficial effects of this invention are as follows: This invention obtains the installation package file of a target application; parses the installation package file to obtain the layout file corresponding to the target interface and resources of at least one target control in the target application; calculates the geometric information of the target control at a preset design resolution based on the layout file; generates simulated images of the target interface at each target resolution based on the geometric information and multiple preset target resolutions; generates a corresponding annotation file for each simulated image, the annotation file including the position annotation information of the target control in the corresponding simulated image; and stores a dataset consisting of simulated images and their corresponding annotation files. By directly obtaining the layout file and control resources of the application installation package through parsing, and calculating the geometric information of the control based on a preset design resolution, this invention automatically generates simulated interface images and corresponding annotation files at multiple resolutions. Compared to related technologies that rely on manual screenshots, manual annotation, or automated methods based on pixel differences, this invention directly obtains accurate control layouts and attributes from the application installation package, ensuring high precision and consistency in the geometric positions of the generated training data; at the same time, by automatically generating simulated images adapted to different screens through multiple preset target resolutions, it significantly expands the diversity of the dataset. This method avoids the cost and error of manual annotation, as well as the noise that may be introduced by traditional automated methods. 
It can efficiently and in batches generate high-quality, multi-resolution labeled data, thereby providing sufficient and accurate data support for the training of object detection models and improving the efficiency and generalization ability of model training.

Attached Figure Description

[0009] Figure 1 is a flowchart of the steps of a method for automatically generating training data for an object detection model, provided in an embodiment of the present invention;
Figure 2 is a schematic diagram of the structure of a terminal for automatically generating training data for an object detection model, provided in an embodiment of the present invention;
Figure 3 is a schematic diagram of the system structure for automatically generating training data for an object detection model, provided in an embodiment of the present invention;
Figure 4 is an application flowchart of a method for automatically generating training data for an object detection model, provided in an embodiment of the present invention.

Label Explanation: 1. Terminal for automatically generating training data for an object detection model; 2. Processor; 3. Memory.

Detailed Implementation

[0010] To explain in detail the technical content, objectives, and effects of the present invention, the following description is provided in conjunction with the embodiments and accompanying drawings.

[0011] Referring to Figure 1, a method for automatically generating training data for an object detection model according to the present invention includes:
Step 110: Obtain the installation package file of the target application;
Step 120: Parse the installation package file to obtain the layout file corresponding to the target interface in the target application and the resources of at least one target control;
Step 130: Based on the layout file, calculate the geometric information of the target control at the preset design resolution, where the preset design resolution is the original design resolution (e.g., 1080×1920 pixels), and the geometric information includes the position and size information of the control.

[0012] Step 140: Based on geometric information and multiple preset target resolutions, generate a simulated image of the target interface at each target resolution; wherein, the multiple target resolutions are a predefined set of target resolutions to be simulated, such as 720×1280, 1080×1920 and 1440×2560, etc.

[0013] Step 150: Generate a corresponding annotation file for each simulated image. The annotation file includes the position annotation information of the target control in the corresponding simulated image. Step 160: Store the dataset consisting of the simulated images and their corresponding annotation files. As can be seen from the above description, the beneficial effects of the present invention are as follows: This invention obtains the installation package file of the target application; parses the installation package file to obtain the layout file corresponding to the target interface and the resources of at least one target control in the target application; calculates the geometric information of the target control at a preset design resolution based on the layout file; generates a simulated image of the target interface at each target resolution based on the geometric information and multiple preset target resolutions; generates a corresponding annotation file for each simulated image, the annotation file including the position annotation information of the target control in the corresponding simulated image; and stores a dataset consisting of the simulated images and their corresponding annotation files. By parsing the application installation package to directly obtain its layout file and control resources, and calculating the geometric information of the controls based on a preset design resolution, this invention automatically generates simulated interface images and corresponding annotation files at multiple resolutions.
Compared to related technologies that rely on manual screenshots, manual annotation, or automated methods based on pixel differences, this invention directly obtains accurate control layouts and attributes from the application installation package, ensuring high precision and consistency in the geometric positions of the generated training data; simultaneously, by automatically generating simulated images adapted to different screens through multiple preset target resolutions, it significantly expands the diversity of the dataset. This method avoids the cost and error of manual annotation, as well as the noise that may be introduced by traditional automated methods. It can efficiently and in batches generate high-quality, multi-resolution labeled data, thereby providing sufficient and accurate data support for the training of object detection models and improving the efficiency and generalization ability of model training.
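As a rough illustration of step 120, the sketch below treats an APK as the ZIP archive that it is and lists candidate layout and image resource entries. The function names and the demo archive contents are hypothetical, and the path prefixes assume a conventional, non-obfuscated resource layout. Note that layout XML inside an APK is stored in a compiled binary form, so a separate decoding step (not shown) would be needed before it can be read as text.

```python
import io
import zipfile


def list_layout_and_drawables(apk_file):
    """Return (layout_entries, image_entries) found in an APK, which is a ZIP archive."""
    with zipfile.ZipFile(apk_file) as apk:
        names = apk.namelist()
    # Layout files live under res/layout or qualified variants such as res/layout-sw600dp.
    layouts = sorted(n for n in names
                     if n.startswith("res/layout") and n.endswith(".xml"))
    # Image resources are assumed to live under res/drawable* or res/mipmap*.
    images = sorted(n for n in names
                    if n.startswith(("res/drawable", "res/mipmap")))
    return layouts, images


def build_demo_apk():
    """Build a tiny in-memory archive that mimics an APK's directory layout (demo only)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as z:
        for name in ("AndroidManifest.xml",
                     "classes.dex",
                     "res/layout/activity_main.xml",
                     "res/layout-sw600dp/activity_main.xml",
                     "res/drawable-xxhdpi/btn_cart.png"):
            z.writestr(name, b"")
    buf.seek(0)
    return buf


layouts, images = list_layout_and_drawables(build_demo_apk())
print(layouts)
print(images)
```

Because the function accepts either a path or a file-like object, the same call would work on a real `.apk` file on disk.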

[0014] In one optional implementation, step 130 involves calculating the geometric information of the target control at a preset design resolution based on the layout file, including: Step 210: Parse the layout file and build a node tree; the layout file includes an XML layout file. Use the Document Object Model (DOM) parser to parse the layout file and build a node tree representing the parent-child relationship of controls.

[0015] Step 220: Traverse the node tree, identify the control nodes in the node tree, and extract the attribute information of the control nodes. The attribute information includes the relative unit values and constraint attributes used by the control nodes; specifically, identify each UI control node based on its tag name and extract its key attribute information. Tag names are, for example, "<Button>" or "<ImageView>".
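Steps 210 and 220 can be sketched as follows, assuming the layout file has already been decoded to plain-text XML. The sketch uses Python's `xml.etree.ElementTree` rather than a DOM parser, which is a substitution made for brevity. The sample layout, the `CONTROL_TAGS` set, and the helper names are illustrative assumptions, not taken from the patent.

```python
import xml.etree.ElementTree as ET

ANDROID_NS = "http://schemas.android.com/apk/res/android"
# Assumed preset list of tag names that count as controls.
CONTROL_TAGS = {"Button", "ImageView", "TextView", "EditText"}

# Hypothetical decoded layout file used only for this example.
LAYOUT_XML = """<LinearLayout xmlns:android="http://schemas.android.com/apk/res/android"
    android:layout_width="match_parent"
    android:layout_height="match_parent"
    android:orientation="vertical">
    <ImageView android:id="@+id/product_image"
        android:layout_width="match_parent"
        android:layout_height="200dp" />
    <Button android:id="@+id/add_to_cart"
        android:layout_width="120dp"
        android:layout_height="48dp"
        android:layout_marginTop="16dp" />
</LinearLayout>
"""


def android_attrs(element):
    """Return the element's android:* attributes with the namespace prefix stripped."""
    prefix = "{" + ANDROID_NS + "}"
    return {key[len(prefix):]: value
            for key, value in element.attrib.items()
            if key.startswith(prefix)}


def collect_controls(element, depth=0):
    """Depth-first traversal that records every node whose tag is a known control type."""
    found = []
    if element.tag in CONTROL_TAGS:
        found.append({"tag": element.tag, "depth": depth,
                      "attrs": android_attrs(element)})
    for child in element:
        found.extend(collect_controls(child, depth + 1))
    return found


root = ET.fromstring(LAYOUT_XML)
controls = collect_controls(root)
for control in controls:
    print(control["tag"], control["attrs"])
```

The container node itself (`LinearLayout`) is traversed but not reported, since only tags in the preset control list are treated as control nodes.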

[0016] Step 230: Convert the relative unit values and constraint attributes used by the control node into absolute pixel coordinates at the preset design resolution, as the geometric information of the target control at the preset design resolution. The absolute pixel coordinates include position information and size information. As described above, this embodiment provides a method for directly and automatically calculating the precise geometric information of controls at a specific design resolution from a layout file. Compared to related technologies that rely on manual annotation or runtime to obtain control size and position, this embodiment can complete the calculation based on a static layout file without launching the application. This avoids errors caused by different devices or screen sizes, enabling the subsequently generated training data to cover different mobile device resolutions.

[0017] In an optional implementation, step 230, converting the relative unit values and constraint attributes used by the control nodes into absolute pixel coordinates at a preset design resolution, includes: Step 310: Obtain the screen density value corresponding to the preset design resolution, and obtain the preset baseline screen density value. Step 320: Convert the relative unit values used by the control nodes to absolute pixel values using the screen density value and the preset baseline screen density value; specifically, convert the relative unit values and constraint attributes to absolute pixel coordinates at the preset design resolution using a unit conversion formula. The unit conversion formula is as follows:

Absolute pixel value = relative unit value × (screen density value / baseline density value)

Here, the relative unit values include density-independent pixel values ("dp"), that is, numerical values in "dp" parsed from the XML layout file.

[0018] The absolute pixel value is a calculated value in pixels.

[0019] The screen density value is the target screen density for the original design, and the unit is "dpi".

[0020] Baseline density value: The default value is "160 dpi" (corresponding to Android's "mdpi" baseline screen).
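The unit conversion in step 320 reduces to a one-line function. The sketch below is a minimal illustration; the 480 dpi density used in the example is an assumption (the text does not state which density corresponds to the 1080×1920 design resolution), and the `parse_dimension` helper is hypothetical. Results are left as floats; any rounding policy is a separate choice.

```python
BASELINE_DPI = 160  # baseline density value from the text (Android "mdpi")


def dp_to_px(dp, screen_dpi, baseline_dpi=BASELINE_DPI):
    """Absolute pixel value = relative unit value * (screen density / baseline density)."""
    return dp * (screen_dpi / baseline_dpi)


def parse_dimension(value, screen_dpi):
    """Convert a layout dimension such as '48dp' or '10px' to pixels.

    Returns None for symbolic sizes such as 'match_parent' or 'wrap_content',
    which need layout rules rather than unit conversion.
    """
    for suffix in ("dip", "dp"):
        if value.endswith(suffix):
            return dp_to_px(float(value[:-len(suffix)]), screen_dpi)
    if value.endswith("px"):
        return float(value[:-2])
    return None


# Assumed example: a 48dp-high button on a 480 dpi design screen.
print(dp_to_px(48, 480))              # 144.0
print(parse_dimension("120dp", 320))  # 240.0
print(parse_dimension("match_parent", 480))
```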

[0021] Step 330: Combine the absolute pixel values with the constraint properties of the control node to obtain the absolute pixel coordinates; By parsing the above constraint attributes and following the official Android layout rules, the final boundary (left, top, right, bottom) of each control within its parent container is recursively calculated and then converted into absolute pixel coordinates (x, y, w, h).

[0022] As described above, this embodiment directly generates the absolute pixel coordinates of controls from the layout definition by performing deterministic unit conversion and coordinate calculation based on a preset design resolution and screen density. This embodiment calculates based on source code and design parameters, eliminating runtime environment uncertainties and generating accurate and consistent coordinate data. This improves the quality of training data. It avoids errors caused by runtime screenshots and image recognition, theoretically achieving pixel-level precision annotation with high semantic type accuracy.

[0023] In one alternative implementation, the method further includes: Step 410: If the control node has a parent container, recursively calculate the final pixel coordinates of the control node within the parent container based on the type of the parent container (e.g., LinearLayout, RelativeLayout, ConstraintLayout) and the layout constraint properties of the control node itself (such as layout_gravity, layout_alignParentTop, layout_constraintTop_toTopOf, layout_margin, etc.). By parsing these constraint attributes and following Android's official layout rules, the engine recursively calculates the final boundaries (left, top, right, bottom) of each control within its parent container, and then converts them into absolute pixel coordinates (x, y, w, h). For example, for ConstraintLayout, the engine needs to parse its complex anchor constraints and chained relationships, then construct and solve a linear constraint system; for LinearLayout, it needs to accumulate the control sizes and margins in directional order.
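To make the LinearLayout case concrete, here is a deliberately simplified recursive sketch. It assumes every node already has a fixed pixel width and height and supports only orientation and margins; weights, gravity, padding, `wrap_content`, RelativeLayout, and ConstraintLayout are all out of scope. The node dictionary format and the example tree are hypothetical.

```python
def layout_node(node, origin_x, origin_y):
    """Return {node_id: (x, y, w, h)} for a node and all of its descendants.

    (origin_x, origin_y) is where the node's margin box starts in screen pixels.
    Children are stacked in order along the parent's orientation.
    """
    x = origin_x + node.get("margin_left", 0)
    y = origin_y + node.get("margin_top", 0)
    boxes = {node["id"]: (x, y, node["width"], node["height"])}

    cursor_x, cursor_y = x, y
    for child in node.get("children", []):
        child_boxes = layout_node(child, cursor_x, cursor_y)
        boxes.update(child_boxes)
        cx, cy, cw, ch = child_boxes[child["id"]]
        # Advance the cursor past the child and its trailing margin.
        if node.get("orientation") == "horizontal":
            cursor_x = cx + cw + child.get("margin_right", 0)
        else:
            cursor_y = cy + ch + child.get("margin_bottom", 0)
    return boxes


# Hypothetical product page at a 1080x1920 design resolution (all values in px).
tree = {
    "id": "root", "width": 1080, "height": 1920, "orientation": "vertical",
    "children": [
        {"id": "product_image", "width": 1080, "height": 600},
        {"id": "action_row", "width": 1080, "height": 144, "margin_top": 48,
         "orientation": "horizontal",
         "children": [
             {"id": "price", "width": 400, "height": 144},
             {"id": "add_to_cart", "width": 360, "height": 144, "margin_left": 24},
         ]},
    ],
}

boxes = layout_node(tree, 0, 0)
print(boxes["add_to_cart"])  # (424, 648, 360, 144)
```

The button's position follows from accumulation: the image occupies 600 px, the row adds a 48 px top margin (y = 648), and inside the row the 400 px price label plus a 24 px left margin puts the button at x = 424.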

[0024] As described above, this embodiment processes nested layouts through recursive calculation. Related techniques may overlook complex layout relationships, leading to incorrect control positioning. This method recursively calculates the final pixel coordinates within the parent container, ensuring the correctness of coordinate calculations in nested structures. The generated coordinate data matches the actual interface layout, improving the accuracy of data on complex interfaces.

[0025] In an optional implementation, step 220, identifying control nodes in the node tree and extracting the attribute information of the control nodes, includes: Step 510: Identify nodes in the node tree whose label name is a preset control type label name as control nodes, and extract the attribute information of the control nodes; for example, based on the label name (such as "<Button>" or "<ImageView>"), the system identifies each UI control node and extracts its attribute information, including the relative unit values and constraint properties used by the control node. The relative unit values include density-independent pixel values for width and height.

[0026] As described above, this embodiment directly identifies controls and extracts attributes through label names. Related technologies rely on image recognition, which struggles to distinguish the semantic types of controls. These technologies require the application to be launched. This embodiment directly obtains information from the XML in the application installation package, eliminating the need to run the application. This improves the efficiency of data generation while ensuring the accuracy of control type identification. It also provides geometric data with precise semantic labels for training.

[0027] In an optional implementation, step 140 involves generating a simulated image of the target interface at each target resolution based on geometric information and multiple preset target resolutions, including: Step 610: Obtain multiple preset target resolutions and the geometric information of the target control at the preset design resolution; wherein, the multiple target resolutions are a predefined set of target resolutions to be simulated, such as 720×1280, 1080×1920, 1440×2560, etc. The geometric information includes position information and size information.

[0028] Step 620: For each target resolution, based on a preset resource selection strategy, select a rendering resource file from the resource files corresponding to the target control. The resource selection strategy includes prioritizing the selection of specific resource files that match the target resolution as rendering resource files. Specifically, first, check if there is a specific resource matching the current target resolution in the resource files obtained from the resource extraction module. If it exists, use that specific resource for layout parsing and rendering. If it does not exist, adopt a fallback strategy and use the basic layout resource for parsing and linear scaling.

[0029] Step 630: Determine the interface scaling parameters based on the target resolution and the preset design resolution. The interface scaling parameters include a width scaling factor and a height scaling factor, calculated using the following formulas:

Width scaling factor = Target resolution width / Original design resolution width
Height scaling factor = Target resolution height / Original design resolution height

Step 640: Adjust the geometric information of the target control at the preset design resolution according to the interface scaling parameters to obtain the adjusted geometric information of the target control at the target resolution. The adjustment includes scaling the control's geometric information; that is, for each control in the original control list, apply the scaling factors to calculate its position and size information at the target resolution. The calculation formulas are as follows:

Scaled x-coordinate = Original x-coordinate × Width scaling factor
Scaled y-coordinate = Original y-coordinate × Height scaling factor
Scaled width = Original width × Width scaling factor
Scaled height = Original height × Height scaling factor
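Steps 630 and 640 amount to two divisions and four multiplications per control. A minimal sketch, with an assumed example box of (424, 648, 360, 144) pixels at the 1080×1920 design resolution; the function names are illustrative.

```python
def scale_factors(design_res, target_res):
    """Return (width scaling factor, height scaling factor)."""
    design_w, design_h = design_res
    target_w, target_h = target_res
    return target_w / design_w, target_h / design_h


def scale_box(box, sx, sy):
    """Scale an (x, y, w, h) box: x and w by the width factor, y and h by the height factor."""
    x, y, w, h = box
    return (x * sx, y * sy, w * sx, h * sy)


DESIGN = (1080, 1920)
TARGETS = [(720, 1280), (1080, 1920), (1440, 2560)]
button = (424, 648, 360, 144)  # assumed box at the design resolution

for target in TARGETS:
    sx, sy = scale_factors(DESIGN, target)
    scaled = scale_box(button, sx, sy)
    print(target, tuple(round(v, 2) for v in scaled))
```

Keeping the scaled values as floats until the final rendering or normalization step avoids accumulating rounding error across the pipeline.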

[0030] Step 650: Based on the rendering resource file and the adjusted geometric information, generate a simulated image of the target interface at the target resolution through image rendering; As described above, this embodiment generates multi-resolution simulated images in batches through a process of preset resolution, resource selection, scaling calculation, and rendering. Through procedural resource adaptation and geometric scaling, a series of interface images at target resolutions can be generated efficiently and automatically. This ensures the diversity and integrity of the training dataset, providing the model with the training samples needed for cross-resolution generalization, and significantly improving the efficiency and controllability of data generation. By procedurally simulating multiple resolutions and extending to simulate different themes and languages, training data covering various screen sizes and aspect ratios can be easily generated, effectively enhancing the generalization ability of the trained model.

[0031] In one alternative implementation, the method further includes: Step 710: If no specific resource file matching the target resolution is found, select the preset basic layout resource file as the rendering resource file. The "preset basic layout resource file" typically refers to the default layout file (e.g., layout/activity_main.xml) located in the application's resource directory (e.g., res/layout/) that does not contain screen density or size qualifiers. When no matching qualifier resource is found, the system will fall back to using this base file for parsing and scaling.
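The selection-with-fallback logic of steps 620 and 710 might look like the sketch below. How a target resolution maps to resource qualifiers is not specified in the text, so the caller supplies an ordered list of preferred qualifiers; that mapping, the function name, and the example entries are all assumptions. This is far simpler than Android's own best-match resource resolution, which weighs many more qualifier types.

```python
def select_layout(entries, layout_name, preferred_qualifiers):
    """Pick a qualified layout if one exists, else fall back to the base layout.

    entries: collection of archive paths such as 'res/layout/activity_main.xml'.
    preferred_qualifiers: qualifiers to try in priority order, e.g. ['sw600dp'].
    Returns (path, is_specific). is_specific is False when the base layout is
    used, in which case linear scaling would be applied afterwards.
    """
    for qualifier in preferred_qualifiers:
        candidate = f"res/layout-{qualifier}/{layout_name}"
        if candidate in entries:
            return candidate, True
    base = f"res/layout/{layout_name}"
    if base in entries:
        return base, False
    raise FileNotFoundError(f"no layout named {layout_name}")


entries = {
    "res/layout/activity_main.xml",
    "res/layout-sw600dp/activity_main.xml",
}

print(select_layout(entries, "activity_main.xml", ["sw600dp"]))   # specific match
print(select_layout(entries, "activity_main.xml", ["sw720dp"]))   # falls back to base
```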

[0032] In step 630, the interface scaling parameters are determined based on the target resolution and the preset design resolution, including: determining the interface scaling parameters based on the basic layout resource file and the linear scaling method; As described above, this embodiment employs a fallback mechanism and linear scaling for resolution adaptation. This ensures that even with incomplete resources, effective scaling parameters can still be generated based on a preset base layout resource file. This improves the robustness of the method across diverse device resolutions.

[0033] In an optional implementation, step 150, generating a corresponding annotation file for each simulated image, the annotation file including the position annotation information of the target control in the corresponding simulated image, includes: Step 810: Normalize the pixel coordinates in the adjusted geometric information of the target control at the target resolution to proportional coordinates relative to the size of the simulated image. The scaled pixel coordinates are normalized to proportional values (range 0-1) relative to the width and height of the simulated image. The normalization formulas are as follows:

Normalized center x-coordinate = (scaled x-coordinate + scaled width / 2) / simulated image width
Normalized center y-coordinate = (scaled y-coordinate + scaled height / 2) / simulated image height
Normalized width = scaled width / simulated image width
Normalized height = scaled height / simulated image height

[0034] Step 820: Based on the simulated image of the target interface at the target resolution and the proportional coordinates, generate an annotation file corresponding to the simulated image. Specifically, the annotation file is a YOLO-format annotation file with the file extension .txt. For example, each line of the annotation file has the following format: <category number> <normalized center x-coordinate> <normalized center y-coordinate> <normalized width> <normalized height>.
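The normalization formulas and the line format from steps 810 and 820 can be combined into one small function. This is a sketch: the six-decimal formatting and the example box are assumptions, and the function name is illustrative.

```python
def to_yolo_line(class_id, box, image_size):
    """Format one control as '<class> <cx> <cy> <w> <h>' with values normalized to 0-1.

    box is (x, y, w, h) in pixels with (x, y) at the top-left corner;
    image_size is (width, height) of the simulated image.
    """
    x, y, w, h = box
    img_w, img_h = image_size
    cx = (x + w / 2) / img_w
    cy = (y + h / 2) / img_h
    return f"{class_id} {cx:.6f} {cy:.6f} {w / img_w:.6f} {h / img_h:.6f}"


# Assumed example: class 0 at (424, 648, 360, 144) on a 1080x1920 image.
line = to_yolo_line(0, (424, 648, 360, 144), (1080, 1920))
print(line)  # 0 0.559259 0.375000 0.333333 0.075000
```

Here the center is ((424 + 180) / 1080, (648 + 72) / 1920) = (0.559259, 0.375), and the size is (360 / 1080, 144 / 1920) = (0.333333, 0.075).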

[0035] As described above, this embodiment generates standard-format annotation files through coordinate normalization. Through calculation, precise pixel coordinates are automatically converted into proportional coordinates independent of image size. This ensures the accuracy and format consistency of the annotation data, allowing it to be directly used for training mainstream object detection models, thus improving the efficiency and annotation quality of the data preparation stage. The entire process can be fully automated offline, enabling the generation of massive amounts of annotation data in a short time, achieving an order-of-magnitude improvement in efficiency compared to manual annotation.

[0036] In an optional implementation, step 650, generating a simulated image of the target interface at the target resolution through image rendering based on the rendering resource file and the adjusted geometric information, includes: Step 910: Create a canvas of the specified size using an image library. "Image library" refers to a programming library that can create and draw on bitmap canvases, such as the Pillow library in Python, or the BufferedImage and Graphics2D classes in Java. When creating the canvas, its size is set to the current target resolution (target_width, target_height).

[0037] Step 920: Scale the rendering resource file to obtain the scaled resource file; Step 930: Draw the scaled resource file on the canvas according to the adjusted geometric information to obtain a simulated image of the target interface at the target resolution. As described above, this embodiment generates simulated images through a process of creating a canvas, scaling resources, and drawing. Through procedural rendering, this embodiment can efficiently generate interface images in batches at any target resolution. This ensures the comprehensiveness of the training dataset and the controllability of the generation process, providing rich image samples for model training.
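The create-canvas, scale, and draw sequence of steps 910 to 930 is sketched below. To stay dependency-free, this sketch models a bitmap as a nested list of RGB tuples and uses nearest-neighbor scaling; that representation is an assumption for illustration only. With an image library such as Pillow, the corresponding operations would be `Image.new`, `Image.resize`, and `Image.paste`. The tiny 72×128 target and the 2×2 button bitmap are made-up values that keep the example fast.

```python
WHITE = (255, 255, 255)
ORANGE = (255, 140, 0)


def new_canvas(width, height, color=WHITE):
    """Step 910: create a blank canvas as rows of RGB tuples."""
    return [[color] * width for _ in range(height)]


def resize_nearest(bitmap, new_w, new_h):
    """Step 920: scale a bitmap to (new_w, new_h) with nearest-neighbor sampling."""
    src_h, src_w = len(bitmap), len(bitmap[0])
    return [[bitmap[r * src_h // new_h][c * src_w // new_w]
             for c in range(new_w)]
            for r in range(new_h)]


def paste(canvas, bitmap, x, y):
    """Step 930: draw a bitmap with its top-left corner at (x, y), clipping at the edges."""
    for r, row in enumerate(bitmap):
        for c, pixel in enumerate(row):
            if 0 <= y + r < len(canvas) and 0 <= x + c < len(canvas[0]):
                canvas[y + r][x + c] = pixel


def render(target_size, placements):
    """Render (bitmap, (x, y, w, h)) pairs onto a canvas of target_size."""
    width, height = target_size
    canvas = new_canvas(width, height)
    for bitmap, (x, y, w, h) in placements:
        scaled = resize_nearest(bitmap, round(w), round(h))
        paste(canvas, scaled, round(x), round(y))
    return canvas


button_bitmap = [[ORANGE, ORANGE], [ORANGE, ORANGE]]  # stand-in for a button image
image = render((72, 128), [(button_bitmap, (28, 43, 24, 10))])
print(len(image[0]), len(image), image[43][28], image[0][0])
```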

[0038] The method and terminal for automatically generating target detection model training data described above are applicable to the preparation of automated test data for mobile applications, especially for the development of UI automated test scripts for e-commerce applications (such as "shopping mall APP"). The following is a detailed description of the specific implementation method.

[0039] Referring to Figure 3 and Figure 4, the method described above for automatically generating training data for object detection models can be applied to real-world scenarios. For example, in one scenario, a testing team needs to develop UI automation test scripts for an e-commerce application (such as a "shopping mall APP"), requiring the ability to accurately identify the "add to cart" button on different mobile phone resolutions. This includes steps A-1 to A-7.

[0040] Step A-1: Obtain the APK installation package for the "Shopping Mall APP". This corresponds to step 110 above.

[0041] Step A-2: Use the resource extraction module of this system to parse the APK and extract the layout file (such as "activity_product_detail.XML") and button image resources of the product details page. This corresponds to step 120 above.

[0042] Step A-3: The layout parsing module analyzes the XML file, locates the "Add to Cart" button control, and calculates its precise position and size at the design resolution (e.g., 1080×1920). This corresponds to step 130 above.

[0043] Step A-4: The system automatically generates simulated screenshots of the product details page for each predefined test resolution list (e.g., 720×1280; 1080×1920; 1440×2560). This corresponds to step 140 above.

[0044] Step A-5: For each simulated screenshot, the annotation file generation module automatically generates a corresponding YOLO annotation file. The category number of the "Add to Cart" button is 0, and its location information has been precisely calculated and normalized. This corresponds to step 150 above.

[0045] Step A-6: The system outputs a complete dataset containing multi-resolution simulated images and corresponding annotation files. This corresponds to step 160 above.

[0046] Step A-7: The testing team uses this dataset to train a YOLO model. The trained model can then be integrated into the automated testing framework to identify the location of the "Add to Cart" button on different devices in real time, enabling precise clicks.

[0047] A terminal 1 for automatically generating training data for an object detection model includes a memory 3, a processor 2, and a computer program stored in the memory 3 and running on the processor 2. When the processor 2 executes the computer program, it implements each step of the above-described method for automatically generating training data for an object detection model.

[0048] In summary, this invention provides a method and terminal for automatically generating training data for an object detection model. The method comprises: obtaining the installation package file of a target application; parsing the installation package file to obtain the layout file corresponding to the target interface and resources of at least one target control within the target application; calculating the geometric information of the target control at a preset design resolution based on the layout file; generating simulated images of the target interface at each target resolution based on the geometric information and multiple preset target resolutions; generating a corresponding annotation file for each simulated image, the annotation file including the position annotation information of the target control in the corresponding simulated image; and storing a dataset consisting of the simulated images and their corresponding annotation files.

By parsing the installation package to obtain the layout file and control resources directly, and by calculating the geometric information of the controls at the preset design resolution, the invention automatically generates simulated interface images and corresponding annotation files at multiple resolutions. Compared to related technologies that rely on manual screenshots, manual annotation, or automated methods based on pixel differences, it obtains accurate control layouts and attributes from the installation package itself, which keeps the geometric positions in the generated training data precise and consistent. Generating simulated images for multiple preset target resolutions adapts the data to different screens and significantly expands the diversity of the dataset. The method avoids the cost and errors of manual annotation, as well as the noise that traditional automated methods may introduce. It can therefore generate high-quality, multi-resolution labeled data efficiently and in batches, providing sufficient and accurate data for training object detection models and improving both training efficiency and model generalization. Compared with the original approach of manually drawing bounding boxes, which typically covers controls at only a single resolution, this invention reduces the manual annotation workload and saves the time spent on bounding box work. It also expands the training data through combinations of controls, backgrounds, and resolutions, so that a model trained on this data generalizes well and achieves better recognition results than a model trained on data obtained from single-resolution bounding box annotation. Once developed, the system can be reused to generate data for different applications at extremely low marginal cost. Furthermore, procedural generation eliminates the subjective differences of manual annotation, ensuring high consistency of the dataset.

[0049] The above description is merely an embodiment of the present invention and does not limit the patent scope of the present invention. Any equivalent modifications made based on the content of the present invention specification and drawings, or direct or indirect applications in related technical fields, are similarly included within the patent protection scope of the present invention.

Claims

1. A method for automatically generating training data for an object detection model, characterized in that the method comprises: obtaining an installation package file of a target application; parsing the installation package file to obtain a layout file corresponding to a target interface in the target application and resources of at least one target control; calculating, based on the layout file, geometric information of the target control at a preset design resolution; generating, based on the geometric information and multiple preset target resolutions, a simulated image of the target interface at each target resolution; generating a corresponding annotation file for each simulated image, the annotation file including position annotation information of the target control in the corresponding simulated image; and storing a dataset consisting of the simulated images and their corresponding annotation files.

2. The method for automatically generating training data for an object detection model according to claim 1, characterized in that the step of calculating the geometric information of the target control at a preset design resolution based on the layout file comprises: parsing the layout file and constructing a node tree; traversing the node tree, identifying the control nodes in the node tree, and extracting attribute information of the control nodes, the attribute information including the relative unit values and constraint attributes used by the control nodes; and converting the relative unit values and constraint attributes used by the control nodes into absolute pixel coordinates at the preset design resolution, which serve as the geometric information of the target control at the preset design resolution, the absolute pixel coordinates including position information and size information.

3. The method for automatically generating training data for an object detection model according to claim 2, characterized in that the step of converting the relative unit values and constraint attributes used by the control nodes into absolute pixel coordinates at the preset design resolution comprises: obtaining the screen density value corresponding to the preset design resolution and a preset reference screen density value; converting the relative unit values used by the control nodes into absolute pixel values using the screen density value and the preset reference screen density value; and combining the absolute pixel values with the constraint attributes of the control nodes to obtain the absolute pixel coordinates.

4. The method for automatically generating training data for an object detection model according to claim 2, characterized in that the method further comprises: if the control node has a parent container, recursively calculating the final pixel coordinates of the control node within the parent container based on the type of the parent container and the constraint attributes of the control node.

5. The method for automatically generating training data for an object detection model according to claim 2, characterized in that the step of identifying the control nodes in the node tree and extracting the attribute information of the control nodes comprises: identifying, as control nodes, the nodes in the node tree whose labels are preset control type labels, and extracting the attribute information of the control nodes.

6. The method for automatically generating training data for an object detection model according to claim 1, characterized in that the step of generating, based on the geometric information and multiple preset target resolutions, a simulated image of the target interface at each target resolution comprises: obtaining the multiple preset target resolutions and the geometric information of the target control at the preset design resolution; for each target resolution, selecting, based on a preset resource selection strategy, a rendering resource file from the resource files corresponding to the target control, the resource selection strategy including preferentially selecting, as the rendering resource file, a specific resource file among the resource files that matches the target resolution; determining interface scaling parameters based on the target resolution and the preset design resolution; adjusting the geometric information of the target control at the preset design resolution according to the interface scaling parameters to obtain adjusted geometric information of the target control at the target resolution; and generating, through image rendering, a simulated image of the target interface at the target resolution based on the rendering resource file and the adjusted geometric information.

7. The method for automatically generating training data for an object detection model according to claim 6, characterized in that the method further comprises: if no specific resource file matching the target resolution is found, selecting a preset basic layout resource file as the rendering resource file; wherein the step of determining the interface scaling parameters based on the target resolution and the preset design resolution comprises: determining the interface scaling parameters based on the basic layout resource file and a linear scaling method.

8. The method for automatically generating training data for an object detection model according to claim 1, characterized in that the step of generating a corresponding annotation file for each simulated image, the annotation file including the position annotation information of the target control in the corresponding simulated image, comprises: normalizing the pixel coordinates of the target control in the adjusted geometric information at the target resolution into proportional coordinates relative to the size of the simulated image; and generating the annotation file corresponding to the simulated image based on the simulated image of the target interface at the target resolution and the proportional coordinates.

9. The method for automatically generating training data for an object detection model according to claim 6, characterized in that the step of generating, through image rendering, a simulated image of the target interface at the target resolution based on the rendering resource file and the adjusted geometric information comprises: creating a canvas; scaling the rendering resource file to obtain a scaled resource file; and drawing the scaled resource file on the canvas based on the adjusted geometric information to obtain the simulated image of the target interface at the target resolution.

10. A terminal for automatically generating training data for an object detection model, comprising a memory and a processor, the memory storing a computer program, characterized in that, when the processor executes the computer program, it implements the method for automatically generating training data for an object detection model according to any one of claims 1 to 9.