Affordance generalization reasoning method, system, computer device and medium based on reinforcement learning framework
By employing an affordance generalization reasoning method based on a reinforcement learning framework, together with a high-precision binocular stereo vision camera and a chain-of-thought reasoning mechanism, the invention addresses the insufficient out-of-domain generalization ability of multimodal large language models, thereby improving the robot's ability to reason about object affordances and its operational reliability in unstructured environments.
CN122021940B · Active · Publication Date: 2026-06-23 · HONG KONG UNIV OF SCI & TECH (GUANGZHOU)
Patent Information
- Authority / Receiving Office: CN · China
- Patent Type: Patents(China)
- Current Assignee / Owner: HONG KONG UNIV OF SCI & TECH (GUANGZHOU)
- Filing Date: 2026-04-08
- Publication Date: 2026-06-23
Figure CN122021940B_ABST
Abstract
The application relates to the technical field of intelligent robot control, and in particular to an affordance generalization reasoning method and system based on a reinforcement learning framework, a computer device, and a medium. The method comprises the following steps: acquiring multimodal environment data collected by a perception system and preprocessing it; based on the preprocessed multimodal environment data, introducing affordance prior knowledge to construct an input representation for the affordance reasoning task; performing affordance reasoning with a large language model to generate an initial affordance region prediction; and, based on a reinforcement learning framework, using a chain-of-thought reasoning mechanism to iteratively optimize the initial affordance region prediction and enhance its generalization, so as to output target affordance region information. In this way, the insufficient out-of-domain generalization capability of prior-art affordance reasoning supported by multimodal large language models is addressed, and the robot's reliability, interpretability, and generalized manipulation capability in unstructured environments are improved.
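The four steps named in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: every name in it (`preprocess`, `build_input`, `refine`, `Candidate`, the bounding-box region format, and the IoU reward against an annotated reference region) is a hypothetical assumption introduced for illustration. In particular, the reward-guided selection loop is a simple stand-in for the reinforcement learning optimization, which the abstract does not specify, and the toy policy replaces the large language model.

```python
import itertools
from dataclasses import dataclass
from typing import Callable, Tuple

# Hypothetical region format: (x_min, y_min, x_max, y_max).
Box = Tuple[float, float, float, float]

@dataclass
class Candidate:
    reasoning: str  # chain-of-thought text produced alongside the answer
    region: Box     # predicted affordance region

def preprocess(raw: dict) -> dict:
    """Step 1 (placeholder): clean the instruction text of the raw observation."""
    return {**raw, "instruction": raw["instruction"].strip().lower()}

def build_input(obs: dict, prior: str) -> str:
    """Step 2: combine the observation with affordance prior knowledge."""
    return (f"Prior knowledge: {prior}\n"
            f"Task: {obs['instruction']}\n"
            "Reason step by step, then output the affordance region.")

def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def refine(prompt: str,
           policy: Callable[[str], Candidate],
           reward_fn: Callable[[Candidate], float],
           rounds: int = 2,
           samples: int = 3) -> Tuple[Candidate, float]:
    """Steps 3-4: take an initial prediction, then keep the best-rewarded sample.

    Assumption: this selection loop only stands in for an RL update;
    no policy parameters are trained here.
    """
    best = policy(prompt)  # initial affordance region prediction
    best_reward = reward_fn(best)
    for _ in range(rounds):
        for _ in range(samples):
            cand = policy(prompt)
            reward = reward_fn(cand)
            if reward > best_reward:
                best, best_reward = cand, reward
    return best, best_reward

def make_toy_policy(boxes):
    """Deterministic stand-in for the language model: cycles through fixed boxes."""
    it = itertools.cycle(boxes)
    def policy(prompt: str) -> Candidate:
        box = next(it)
        return Candidate(reasoning=f"The handle is graspable, so propose {box}.", region=box)
    return policy

# Usage with made-up data: the reference region is assumed to be annotated.
reference: Box = (2, 2, 6, 6)
obs = preprocess({"instruction": "  Pick up the mug  ", "rgb": None, "depth": None})
prompt = build_input(obs, prior="mugs are grasped by the handle")
policy = make_toy_policy([(0, 0, 4, 4), (1, 1, 5, 5), (2, 2, 6, 6)])
result, reward = refine(prompt, policy, lambda c: iou(c.region, reference))
print(result.region, round(reward, 3))
```

In a real system the reward would more likely combine several signals, and the policy would be updated from the rewards rather than merely resampled; the sketch only shows how the steps connect.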