A non-intrusive software operation agent method based on adaptive image reconstruction

Adaptive image reconstruction combined with semantic-level matching addresses the response latency and low recognition rates seen on high-resolution, complex software interfaces, enabling a fast, accurate, non-intrusive software operation agent that adapts to software version updates and accepts natural-language commands.

CN122308677A · Pending · Publication Date: 2026-06-30 · WUXI CHANGSHENG VISION TECHNOLOGY CO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications (China)
Current Assignee / Owner: WUXI CHANGSHENG VISION TECHNOLOGY CO LTD
Filing Date: 2026-03-17
Publication Date: 2026-06-30

AI Technical Summary

Technical Problem

Existing non-intrusive interaction technologies suffer from high response latency and low recognition rates on high-resolution screens and complex professional software interfaces. They also handle high pixel density and low-contrast interfaces poorly, which limits the use of software automation in these scenarios.

Method Used

An adaptive image reconstruction method is adopted, comprising image compression, adaptive binarization, and pseudo-color three-channel data reconstruction. Combined with semantic-level matching, it streamlines image processing and recognition, reducing computation while improving recognition accuracy.
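The binarization and three-channel reconstruction steps can be illustrated with a minimal sketch. The summary does not say which thresholding rule or channel mapping the patent uses, so everything below is an assumption made for illustration: `adaptive_binarize` applies local-mean thresholding (each pixel is compared with the mean of its neighbourhood), and `to_three_channel` simply replicates the binary map into three identical channels, the shape OCR engines typically expect. The function names and the `radius` and `offset` parameters are hypothetical, and the image compression step is omitted.

```python
from typing import List

Gray = List[List[int]]        # rows of 0..255 intensities
RGB = List[List[List[int]]]   # rows of [r, g, b] pixels


def adaptive_binarize(img: Gray, radius: int = 1, offset: int = 0) -> Gray:
    """Local-mean thresholding (an assumed rule, not the patent's).

    A pixel becomes 255 when it is brighter than the mean of its
    (2*radius+1) x (2*radius+1) neighbourhood minus `offset`, else 0.
    """
    h, w = len(img), len(img[0])
    # Integral image with a zero border: integ[y][x] = sum of img[0:y][0:x].
    integ = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            integ[y + 1][x + 1] = integ[y][x + 1] + row_sum

    out = [[0] * w for _ in range(h)]
    for y in range(h):
        y0, y1 = max(0, y - radius), min(h, y + radius + 1)
        for x in range(w):
            x0, x1 = max(0, x - radius), min(w, x + radius + 1)
            total = integ[y1][x1] - integ[y0][x1] - integ[y1][x0] + integ[y0][x0]
            area = (y1 - y0) * (x1 - x0)
            # pixel > mean - offset, rewritten without division.
            if img[y][x] * area > total - offset * area:
                out[y][x] = 255
    return out


def to_three_channel(binary: Gray) -> RGB:
    """Replicate a single-channel map into three identical channels."""
    return [[[v, v, v] for v in row] for row in binary]


# Low-contrast example: one "text" pixel of 110 on a background of 100.
img = [
    [100, 100, 100],
    [100, 110, 100],
    [100, 100, 100],
]
binary = adaptive_binarize(img)
rgb = to_three_channel(binary)
print(binary)  # [[0, 0, 0], [0, 255, 0], [0, 0, 0]]
```

A fixed global threshold of 128 would map every pixel in this example to 0 and lose the text pixel, which is why a locally adaptive rule suits low-contrast interfaces. The integral image keeps the cost per pixel constant regardless of `radius`.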

Benefits of Technology

The method achieves high-precision text recognition with millisecond-level response times while reducing hardware costs. It improves the accuracy, response speed, and robustness of non-intrusive software operation agents, handles complex low-contrast interfaces, and adapts to software version updates and natural-language commands.

✦ Generated by Eureka AI based on patent content.

Abstract

This application discloses a non-intrusive software operation agent method based on adaptive image reconstruction, belonging to the field of non-intrusive interaction technology. The method applies a preprocessing mechanism that combines image compression with adaptive grayscale binarization to the original high-resolution image, significantly reducing image information entropy while preserving high-resolution binarized features. It then reconstructs the single-channel binarized feature map into high-resolution pseudo-color three-channel data that is cheap to compute on and matches the input format required by OCR inference engines. The method greatly reduces hardware deployment costs, and the adaptive binarization segmentation significantly improves text recognition accuracy in complex, low-contrast, and even semi-transparent user interfaces, enhancing the agent's overall accuracy, response speed, reliability, and robustness.
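The entropy claim can be made concrete with a small worked example. The abstract does not state how entropy is measured, so the sketch below assumes the common proxy of Shannon entropy over the intensity histogram, in bits per pixel. The helper name `shannon_entropy` is hypothetical. Under this measure an 8-bit grayscale image can carry up to 8 bits per pixel, while any two-level (binarized) image carries at most 1 bit per pixel.

```python
import math
from collections import Counter
from typing import Iterable


def shannon_entropy(pixels: Iterable[int]) -> float:
    """Shannon entropy of the intensity histogram, in bits per pixel."""
    counts = Counter(pixels)
    n = sum(counts.values())
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# Worst case for 8-bit grayscale: every intensity equally likely.
gray = list(range(256))
# Simple two-level version of the same pixels (threshold chosen for illustration).
binary = [255 if v >= 128 else 0 for v in gray]

print(shannon_entropy(gray))    # 8.0
print(shannon_entropy(binary))  # 1.0
```

Real screenshots fall well below the 8-bit worst case, so the reduction in practice depends on the interface content. The 1 bit per pixel ceiling after binarization holds regardless.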
