Scene character interactive understanding system for visually impaired people

An interactive scene-text technology, applied in electronic digital data processing, input/output processes of data processing, program control design, etc. It offers high scalability, ease of use, and high sensitivity and accuracy.

Pending Publication Date: 2022-03-11
HANGZHOU DIANZI UNIV

AI Technical Summary

Problems solved by technology

However, traditional visual question answering and visual description are limited to general descriptions: they cannot describe the specific text in an image, so they convey the text information in images poorly.



Embodiment

[0099] 1. Mobile APP

[0100] The front-end, that is, the mobile APP, is developed on the widely used and stable Vue framework, and the function of each component is implemented in code. The speech conversion used by the speech-to-text component and the text-to-speech component is provided through API calls to iFLYTEK's speech recognition SDK. This SDK is used because it provides a speech dictation interface that converts speech signals shorter than 60 seconds into the corresponding text, together with the iFLYOS service access platform, which lowers the technical threshold for reading voice information.
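Because the dictation interface is described as handling speech signals shorter than 60 seconds, a front-end may need to guard against longer recordings. The following is a minimal sketch of one way to do that, assuming the audio is available as a flat sequence of samples at a known sample rate. The names `split_for_dictation` and `MAX_DICTATION_SECONDS` are hypothetical and are not part of iFLYTEK's SDK; the source does not say how, or whether, the APP handles long recordings.

```python
MAX_DICTATION_SECONDS = 60.0  # limit stated in the text; everything else is assumed


def split_for_dictation(samples, sample_rate, max_seconds=MAX_DICTATION_SECONDS):
    """Split audio samples into chunks strictly shorter than max_seconds.

    Hypothetical helper for illustration; not a vendor SDK call.
    """
    if sample_rate <= 0 or max_seconds <= 0:
        raise ValueError("sample_rate and max_seconds must be positive")
    # Largest sample count whose duration stays strictly below the limit.
    chunk_len = int(sample_rate * max_seconds) - 1
    if chunk_len < 1:
        raise ValueError("max_seconds is too short for this sample rate")
    return [samples[i:i + chunk_len] for i in range(0, len(samples), chunk_len)]


def duration_seconds(chunk, sample_rate):
    """Duration of a chunk in seconds."""
    return len(chunk) / sample_rate
```

Each returned chunk could then be passed to whatever dictation call the APP uses, and the partial transcripts joined in order. Splitting at fixed offsets can cut a word in half, so a real implementation would more likely split at pauses.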

[0101] The APP usage flow is shown in Figure 1. First, the user wakes up the APP by voice. The APP starts, obtains photo storage permission and camera access permission, and enters the photo shooting interface. The APP then gives the voice prompt "Please take a photo", and the user taps any position in the effective area of the screen (equivalent to ...


Abstract

The invention discloses an interactive scene-text understanding system for visually impaired people, comprising a mobile phone APP and a back-end visual interaction computing platform. The mobile phone APP comprises a voice wake-up component, a visual scene shooting component, a voice question acquisition component, a speech-to-text conversion component, a logic judgment component, a data transmission and reception component, and a text-to-speech synthesis component. The back-end visual interaction computing platform comprises an input preprocessing module and a multi-head attention mechanism model. The system can recognize the text in pictures of different scenes, and the user acquires the scene information autonomously, so the system adapts well to different environments, is highly extensible, and recognizes scene text with high sensitivity and accuracy. It answers dynamically according to the user's questions, which improves practicality and real-time performance. The system can be installed on a mobile phone and interacts through voice, so it is convenient, low-cost, and easy to use.
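The abstract names a multi-head attention mechanism model but this excerpt gives no details of its architecture. As a generic illustration of the technique only, the sketch below implements standard scaled dot-product multi-head attention in pure Python. It is not the patent's model: the weight matrices, dimensions, and function names are all assumptions, and treating question tokens as queries and scene-text tokens as keys and values is likewise an assumption.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(q, k, v):
    """Scaled dot-product attention: softmax(q k^T / sqrt(d)) v."""
    d = len(q[0])
    k_t = [list(col) for col in zip(*k)]  # transpose of k
    scores = matmul(q, k_t)
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    return matmul(weights, v)


def multi_head_attention(x_q, x_kv, w_q, w_k, w_v, w_o, num_heads):
    """Project inputs, attend separately per head, concatenate, project out.

    x_q:  query tokens (assumed here to be question features)
    x_kv: key/value tokens (assumed here to be scene-text features)
    """
    q, k, v = matmul(x_q, w_q), matmul(x_kv, w_k), matmul(x_kv, w_v)
    d_model = len(q[0])
    if d_model % num_heads != 0:
        raise ValueError("d_model must be divisible by num_heads")
    d_head = d_model // num_heads
    heads = []
    for h in range(num_heads):
        cols = slice(h * d_head, (h + 1) * d_head)
        heads.append(attention([r[cols] for r in q],
                               [r[cols] for r in k],
                               [r[cols] for r in v]))
    # Concatenate the per-head outputs for each query token.
    concat = [sum((head[i] for head in heads), []) for i in range(len(x_q))]
    return matmul(concat, w_o)
```

In practice such a layer would come from a deep learning framework with learned weights; the pure-Python form is used here only to keep the example self-contained.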

Description

Technical field

[0001] The invention belongs to the fields of computer vision technology and Internet technology, and in particular relates to a method that, through voice interaction, helps visually impaired people obtain the text information in images, based on visual description technology and visual question answering technology.

Background

[0002] According to statistics from the Ministry of Health, there are as many as 14 million blind people in China, ranking first in the world, and the visually impaired population, which includes blind people, is still growing. The inconvenience and danger that visually impaired people face in daily life create a strong demand for facilities that can help them live normally, and various aids for the visually impaired have emerged in response. As far as assisted text recognition is concerned, most of the existing products cannot realize intelligent recognition of text...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F3/16, G06F9/4401, G06N3/02
CPC: G06F3/167, G06F9/4418, G06N3/02
Inventor: 余宙, 王璐瑶, 梁崴, 黄逸飞, 陈晨
Owner: HANGZHOU DIANZI UNIV