Affordance generalization reasoning method, system, computer device and medium based on a reinforcement learning framework

By employing an affordance generalization reasoning method based on a reinforcement learning framework, together with a high-precision binocular stereo vision camera and a chain-of-thought reasoning mechanism, the invention addresses the insufficient out-of-domain generalization ability of multimodal large language models, thereby improving the robot's ability to reason about object affordances and its operational reliability in unstructured environments.

CN122021940B · Active · Publication Date: 2026-06-23 · HONG KONG UNIV OF SCI & TECH (GUANGZHOU)

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
HONG KONG UNIV OF SCI & TECH (GUANGZHOU)
Filing Date
2026-04-08
Publication Date
2026-06-23

Figure CN122021940B_ABST

Abstract

The application relates to the technical field of intelligent robot control, and in particular to an affordance generalization reasoning method and system based on a reinforcement learning framework, a computer device and a medium. The method comprises the following steps: acquiring multimodal environment data collected by a perception system and preprocessing it; based on the preprocessed multimodal environment data, introducing affordance prior knowledge to construct an input representation for an affordance reasoning task; performing affordance reasoning through a large language model to generate an initial affordance region prediction; and, based on a reinforcement learning framework, using a chain-of-thought reasoning mechanism to iteratively optimize the initial affordance region prediction and enhance its generalization, so as to output target affordance region information. This addresses the insufficient out-of-domain generalization capability of prior-art methods built on multimodal large language models, and improves the reliability, interpretability and generalized manipulation capability of the robot in unstructured environments.
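
The four steps of the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and none of it is taken from the patent text: the names (`TaskInput`, `preprocess`, `build_input`, `refine`, `run_pipeline`), the use of a single grayscale image in place of binocular multimodal data, the bounding-box form of an affordance region, and the intersection-over-union reward. The refinement loop is a simple greedy reward-guided search standing in for the reinforcement learning and chain-of-thought stage. It is not a policy-gradient update and should not be read as the claimed algorithm.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical region format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


@dataclass
class TaskInput:
    """Hypothetical input representation for the affordance reasoning task."""
    image: List[List[float]]  # preprocessed image, values in [0, 1]
    instruction: str          # e.g. "grasp the mug"
    prior: str                # affordance prior knowledge as free text


def preprocess(image: List[List[int]]) -> List[List[float]]:
    """Step 1 (assumed form): scale 8-bit pixel values to [0, 1]."""
    return [[px / 255.0 for px in row] for row in image]


def build_input(image: List[List[float]], instruction: str, prior: str) -> TaskInput:
    """Step 2: combine preprocessed data with affordance prior knowledge."""
    return TaskInput(image=image, instruction=instruction, prior=prior)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two boxes, used here as a stand-in reward."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def refine(initial: Box,
           propose: Callable[[Box, int], Box],
           reward: Callable[[Box], float],
           steps: int) -> Box:
    """Step 4 (stand-in): keep a proposed region only if its reward improves."""
    best, best_reward = initial, reward(initial)
    for t in range(steps):
        candidate = propose(best, t)
        r = reward(candidate)
        if r > best_reward:
            best, best_reward = candidate, r
    return best


def run_pipeline(raw_image: List[List[int]], instruction: str, prior: str,
                 model: Callable[[TaskInput], Box],
                 propose: Callable[[Box, int], Box],
                 reward: Callable[[Box], float],
                 steps: int = 3) -> Box:
    """Chain the four steps: preprocess, build input, predict, refine."""
    task = build_input(preprocess(raw_image), instruction, prior)
    initial = model(task)  # Step 3: initial affordance region prediction
    return refine(initial, propose, reward, steps)
```

In this sketch `model` and `propose` are placeholders for the language model and its revised predictions, and `reward` would only be computable against a reference region during training. Keeping them as plain callables makes the control flow of the four steps visible without committing to any particular model or training library.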