Modulated video token compression via causal compression module with positional information injection

WO2026124754A1 · PCT designated stage · Publication Date: 2026-06-18 · HUAWEI TECH CO LTD +1

Patent Information

Authority / Receiving Office: WO · WO
Patent Type: Applications
Current Assignee / Owner: HUAWEI TECH CO LTD
Filing Date: 2024-12-11
Publication Date: 2026-06-18

Application Information

Patent Timeline

11 Dec 2024: Application
18 Jun 2026: Publication (WO2026124754A1)

IPC: G06V10/62; G06V10/82; G06V20/40

AI Tagging

Application Domain

Character and pattern recognition

Figure: EP2024085629_18062026_PF_FP_ABST (image not reproduced)

Abstract

Described is a computer apparatus (900) configured to: obtain a plurality of latent frames (103) of a video; and compress (104) a plurality of tokens of the latent frames (103) to generate a plurality of compressed tokens (105) for input into a vision language model (VLM) (109), wherein the compression (104) comprises combining tokens along both a temporal dimension and a spatial dimension. Combining along both dimensions may yield greater compression than combining along a single dimension, which may reduce the workload of the VLM (109).
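
The abstract does not say how tokens are combined, so the following is only an illustrative sketch, not the claimed method. It assumes the latent frames form a (T, H, W, D) grid of D-dimensional tokens and that "combining" means plain average pooling over non-overlapping blocks; the function name `compress_tokens` and the parameters `t_factor` and `s_factor` are hypothetical. It makes no attempt to model the causal compression module or the positional information injection named in the title.

```python
import numpy as np


def compress_tokens(latent_tokens, t_factor=2, s_factor=2):
    """Average-pool tokens over non-overlapping temporal and spatial blocks.

    latent_tokens: array of shape (T, H, W, D), i.e. T latent frames, each an
    H x W grid of D-dimensional tokens (an assumed layout, not from the source).
    Returns an array of shape (T/t_factor * H/s_factor * W/s_factor, D).
    """
    T, H, W, D = latent_tokens.shape
    if T % t_factor or H % s_factor or W % s_factor:
        raise ValueError("T, H and W must be divisible by their pooling factors")

    # Split each pooled axis into (number of blocks, block size).
    blocks = latent_tokens.reshape(
        T // t_factor, t_factor,
        H // s_factor, s_factor,
        W // s_factor, s_factor,
        D,
    )
    # Average within each block: axis 1 is temporal, axes 3 and 5 are spatial.
    pooled = blocks.mean(axis=(1, 3, 5))
    # Flatten the remaining grid into a token sequence for the VLM.
    return pooled.reshape(-1, D)


# 4 latent frames, each a 4 x 4 grid of 8-dimensional tokens: 64 tokens in total.
frames = np.arange(4 * 4 * 4 * 8, dtype=np.float32).reshape(4, 4, 4, 8)
compressed = compress_tokens(frames)
print(frames.shape[0] * frames.shape[1] * frames.shape[2], "->", compressed.shape[0])  # 64 -> 8
```

With these assumed factors, pooling along time alone (`s_factor=1`) would leave 32 tokens, while pooling along both time and space leaves 8, which is the kind of gain over single-dimension compression that the abstract describes.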
