Monitoring device
The monitoring terminal addresses the challenges of size, durability, and operability in harsh environments by using a housing design with acoustic gaps and separators, ensuring clear sound quality and preventing ignored voice messages.
Patent Information
- Authority / Receiving Office
- JP · JP
- Patent Type
- Applications
- Current Assignee / Owner
- MIXI INC
- Filing Date
- 2025-10-07
- Publication Date
- 2026-07-01
AI Technical Summary
Existing monitored terminals for children face challenges in achieving a small form factor while ensuring clear voice communication, durability, and operability in harsh environments, and there are issues with voice message notifications being ignored, inappropriate routing of messages, and managing multiple guardians.
A monitoring terminal with a housing design that incorporates a button portion exposed through an opening, utilizing a gap between the housing and button for acoustic paths, and an acoustic isolation structure, along with an acoustic separator to enhance sound quality and durability, and a system to prevent voice messages from being ignored.
The design achieves miniaturization, improved waterproofing, dustproofing, and shock resistance with clear sound quality, and ensures that important voice messages are not ignored by appropriately routing them to multiple guardians.
Abstract
Description
Technical Field
[0001] The present invention relates to a monitored terminal.
Background Art
[0002] Conventionally, there has been known a monitored terminal that acquires the position information of a monitored person such as a child and notifies a monitor such as a guardian (see Patent Document 1). In such a monitored terminal, various functions such as transmission and reception of voice messages, photography, and recording are installed, and some of them display these operating states on a display screen to inform the user.
[0003] Such a monitored terminal has a shape and size suitable for a child to carry. For example, it is a terminal with a roughly cuboid housing measuring about 10 cm in length and width and about 3 cm in thickness. Miniaturizing the terminal in this way lets a child carry it by hanging it from the neck on a strap. On the other hand, because the terminal is small, the user interface through which the child executes functions such as sending and receiving voice messages, photography, and recording is restricted, and in practice only a very small number of very small buttons can be provided. Moreover, once space is set aside for the mounting holes for the strap, the sound collection ports of the built-in microphones, the sound emission ports of the speakers, and so on, the space available for user interface elements such as operation buttons is restricted even further.
Prior Art Documents
Patent Documents
[0004]
Patent Document 1
Summary of the Invention
Problems to be Solved by the Invention
[0005] Given the constraints described above, there is room for improvement in the user interface to realize a small, child-friendly monitoring device. In particular, use by children is expected to be in far harsher environments than use by adults, such as use in sandboxes in parks, outdoor use in rainy weather, and frequent drops. In such environments, it has been an extremely difficult challenge to achieve a high level of compatibility between multiple requirements that are in a trade-off relationship, such as maintaining a small form factor while enabling clear voice communication, durability such as waterproofing, dustproofing, and shock resistance, and operability that even children can operate intuitively. One of the objectives of the present invention is to provide a monitoring device that is small yet achieves excellent sound quality, durability, and operability even in harsh usage environments, in view of the above challenges.
[0006] Furthermore, there is room for improvement in the function of sending and receiving voice messages between children and guardians. Specifically, when a child sends a voice message to a guardian, the guardian's device receives the voice message and notifies them that a voice message has arrived from the child, but the guardian may not notice this notification. If the content of the voice message is a question that the child is asking the guardian to answer, the child will be waiting to receive a voice message in response from the guardian, but the guardian does not notice and leaves the voice message unattended, which can cause anxiety and frustration in the child. The reverse is also possible. That is, if a guardian sends a message with a question from their device to the child's device, but the child does not notice that they received the message and leaves it unattended, the guardian will feel anxious. One of the objectives of the present invention is to provide a monitoring system that can prevent voice messages that should not be ignored from being left unattended, in view of the above problems.
[0007] Furthermore, there is room for improvement in the voice message sending and receiving function between children and guardians from another perspective. That is, there may be multiple guardians watching over the same child, and each guardian may use their own device to monitor the child. For example, multiple guardians such as the child's father, mother, grandfather, and grandmother may cooperate in watching over the child. In this case, appropriate control is needed to determine which guardian a voice message sent from the child should be sent to. One objective of the present invention is to provide a monitoring system that can appropriately route voice messages sent from a child to multiple guardians, in view of the above problems.
Means for Solving the Problem
[0008] One embodiment of the present invention is a monitoring terminal comprising a housing, a button portion that receives a press operation from a user, a microphone for recording the user's speech when a voice message recording function is active while the button portion is pressed, and a speaker for outputting playback sound to the user when the voice message recording function is not active, wherein the button portion is arranged inside the housing so as to be pressable and exposed to the outside through an opening in the housing, and the microphone and speaker are arranged inside the housing so as to be able to pick up and emit sound, respectively, through a gap provided between the housing and the button portion.
Effects of the Invention
[0009] According to one aspect of the present invention, by using the gap between the housing and the button section as an acoustic path, and further providing an acoustic isolation structure inside, it is possible to simultaneously achieve miniaturization, improved waterproofing, dustproofing, and shock resistance, and ensure clear sound quality without increasing the number of parts.
Brief Description of the Drawings
[0010]
[Figure 1] This figure shows the overall configuration of the monitoring system according to this embodiment.
[Figure 2] This is a block diagram showing the hardware configuration of the monitored terminal.
[Figure 3] It is a functional block diagram of the monitored terminal.
[Figure 4] It is a diagram showing the external configuration of the monitored terminal.
[Figure 5A] It is a side cross-sectional view showing the internal configuration of the monitored terminal.
[Figure 5B] It is a side cross-sectional view of the main part showing the state of the gap (first interval) when not pressed.
[Figure 5C] It is a side cross-sectional view of the main part showing the state of the gap (second interval) when pressed.
[Figure 5D] It is an enlarged cross-sectional view of the main part showing the configuration of the acoustic separator.
[Figure 5E] It is a perspective view showing the configuration of the acoustic separator.
[Figure 6] It is a diagram showing an example of the stereo arrangement of the microphone and the speaker.
[Figure 7] It is a side cross-sectional view of the main part showing an example of the display speaker configuration.
[Figure 8] It is a block diagram showing the configuration of the guardian terminal 20.
[Figure 9] It is a diagram showing the function of preventing the abandonment of voice messages.
[Figure 10] It is a diagram showing an example of the display of a notification message.
[Figure 11] It is a diagram showing an example of the volume control of a notification sound.
[Figure 12] It is a flowchart showing the operation related to the function of preventing the abandonment of voice messages.
[Figure 13] It is a diagram showing an example of the database 70 for managing the association between the monitored terminal 10 and the guardian terminal 20.
[Figure 14] It is a diagram showing the routing function of voice messages.
Embodiments for Carrying Out the Invention
[0011] Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings.
[0012] (Overall System Configuration) Figure 1 is a diagram showing the overall configuration of the monitoring system according to this embodiment. The monitoring system 1 includes a monitored terminal 10 carried by a monitored person such as a child, a guardian terminal 20 used by a guardian (for example, a parent or guardian) who is in a position to monitor the monitored person, and a network 30 that connects these. The monitored terminal 10 acquires position information by means of GPS or the like and performs transmission and reception of voice messages with the guardian terminal 20. In addition, the monitored terminal 10 has a photographing function using a camera and a recording function using a microphone.
[0013] (Hardware Configuration of the Monitored Terminal) Figure 2 is a diagram showing the hardware configuration of the monitored terminal 10. The monitored terminal 10 includes a processor 11, a memory 12, a storage 13, a communication interface 14, a display 15, a button unit 16, a light-emitting element 17, a camera 18, a microphone 19, a speaker 21, a GPS receiver 22, and a bus 23 that connects these.
[0014] The processor 11 is a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), or a combination thereof, and executes a program stored in the storage 13 to realize various functions of the monitored terminal 10. The memory 12 is a main storage device such as a RAM (Random Access Memory), and temporarily stores data when the program is executed by the processor 11. The storage 13 is a non-volatile storage device such as a ROM (Read Only Memory), a flash memory, or an SSD, and permanently stores programs and various data.
[0015] In the hardware configuration shown in Figure 2, the processor 11 functions as the central processing unit of the monitored terminal 10, the memory 12 is responsible for temporary data storage, and the storage 13 is responsible for permanent program data storage. The communication interface 14 handles wireless communication with the outside, the display 15 displays visual information, and the button section 16 accepts user operations. The light-emitting element 17, a feature of the present invention, is positioned within the area of the button section 16, independently of the display area of the display 15. The camera 18, microphone 19, speaker 21, and GPS receiver 22 function as peripheral devices to realize the monitoring function. These components communicate data with each other via the bus 23.
[0016] The inertial sensor 24 is an inertial measurement device that includes an accelerometer, gyroscope, etc., and detects the posture, movement, vibration, etc., of the monitored terminal 10. The inertial sensor 24 can also be used to detect the terminal falling, violent movement, abnormal vibration, etc., and determine an emergency situation.
[0017] The communication interface 14 communicates with the monitoring server 40 and the guardian terminal 20 via the network 30 using a communication method such as wireless LAN or LTE. The display 15 is an LCD display, an organic EL display, etc., and displays various information. The button section 16 is a physical button that accepts user input, and pressing it allows operations such as starting voice message recording. The display 15 is provided in contact with a predetermined surface of the button section 16. For example, the display 15 may be attached and fixed to a predetermined surface of the button section 16 using adhesive or the like, or it may be fixed to a predetermined surface of the button section 16 using fixing members such as screws. The button portion 16 is made of a transparent, light-transmitting material, and the content displayed on the display 15, which is provided on a predetermined surface of the button portion 16, can be viewed from the back surface of the predetermined surface of the button portion 16. The area of the button portion 16 in which the content displayed on the display 15 can be seen through is called the display area. The button portion 16 has a larger area than the display 15, and as a result, the button portion 16 has a display area as well as a non-display area in which the content displayed on the display 15 cannot be seen through.
[0018] The light-emitting element 17 is a light-emitting device such as an LED (Light Emitting Diode), organic EL (Organic Electro-Luminescence), or inorganic EL (Inorganic Electro-Luminescence). The light-emitting element 17 is provided on the same predetermined surface as the display 15 in the button portion 16, and more specifically, it is provided around the display 15. This allows the light emitted by the light-emitting element 17 to pass through the non-display area of the button portion 16 and be visible. The light-emitting element 17 can be a single-color LED, a full-color LED, or a combination of multiple LEDs of different colors.
[0019] In this embodiment, the monitored terminal 10 is assumed to have two configurations: a basic configuration and an extended configuration. In the basic configuration, the core functions provided are location information acquisition, voice message transmission and reception, and status display (light-emitting element 17), but the camera 18 is not included. This configuration is suitable for providing basic monitoring functions at a low cost.
[0020] In the expanded configuration, a camera 18 is added to the basic configuration to provide photo / video messaging, monitoring photography, and various authentication and analysis functions. The camera 18 is an imaging device that captures still images and videos. The camera 18 can be used for a variety of purposes, such as sending photo and video messages, periodic monitoring photography, automatic photography in emergencies, reading information such as 2D codes, identity verification by facial recognition, emotion estimation by facial recognition, and recording the surrounding environment. Note that the camera 18 is not a mandatory configuration requirement and can be omitted depending on the application and cost requirements.
[0021] Microphone 19 is a sound-collecting device that records voice and is positioned around the button section 16 within the housing. Microphone 19 is used for recording voice messages based on the voice of the person being monitored, monitoring ambient sounds, and controlling operations by voice recognition. In this embodiment, multiple microphones 19a and 19b can be provided to improve sound quality and control directionality, but it is also possible to implement this with a single microphone 19. When multiple microphones are used, beamforming technology can be used to focus on picking up sound from a specific direction. The voice of the person being monitored is picked up through a gap 24 provided between the housing and the button section 16, as will be described in detail later. This gap 24 is formed as a fine opening with a width of about 0.5 to 2.0 mm in order to optimize acoustic characteristics.
[0022] The speaker 21 is an acoustic device that outputs audio such as received voice messages, and like the microphone 19, it is positioned inside the housing around the button section 16. The audio from the speaker 21 is output to the outside through a gap 24 provided between the housing and the button section 16, as will be described in detail later. This gap 24 is formed as a fine opening with a width of about 0.5 to 2.0 mm in order to optimize the acoustic characteristics. In this embodiment, multiple speakers 21a and 21b can be provided for stereo sound and echo cancellation functions, but it can also be implemented with a single speaker 21. The GPS receiver 22 receives signals from GPS satellites and acquires location information.
[0023] (Functional configuration of the monitored terminal) Figure 3 is a functional block diagram of the monitored terminal 10. The monitored terminal 10 includes a communication unit 31, a location information acquisition unit 32, a detection unit 33, a control unit 34, and a storage unit 35. These functional units are realized by the processor 11 executing a program stored in the storage 13.
(Function block details) The details of each functional unit shown in Figure 3 are described below. These functional units are functional modules realized by the processor 11 shown in Figure 2 executing a predetermined program loaded into the memory 12.
The communication unit 31 is responsible for exchanging data with external terminals. Its main functions include receiving voice messages and various instruction commands from the guardian terminal 20, and sending voice messages and location information generated by the monitored terminal 10 to the guardian terminal 20. As a concrete example, it is implemented as driver software that controls the communication interface 14 (LTE, Wi-Fi, or the like). Voice message data received from the guardian terminal 20 is passed to the control unit 34, and voice message data for transmission passed from the control unit 34 is transmitted via the network 30.
The location information acquisition unit 32 determines the current location of the terminal. Its main function is to receive signals from GPS satellites and calculate latitude and longitude information. As a concrete example, it is implemented as software that controls the GPS receiver 22 and processes the received signals with a positioning calculation algorithm. The calculated location information is passed to the control unit 34 periodically or in response to a request from the detection unit 33.
The detection unit 33 detects physical and internal state changes of the terminal. Its main functions include detecting various "operating states," such as pressing (long press, short press) of the button section 16, low battery level, entry into a dangerous area, and the characteristics of the audio signal collected by the microphone 19 (sound pressure, etc.). Specifically, detection relies on changes in electrical signals from the switch mechanism 41 (described later), voltage information from the battery management IC, comparison of location information from the location information acquisition unit 32 with dangerous area information stored in the storage unit 35, and digital analysis of the audio signal from the microphone 19. When a state change is detected, the type of state (e.g., "button long press detected") is notified to the control unit 34.
The control unit 34 functions as the brain that oversees and controls the operation of the entire monitored terminal 10. Its main function is to control the operation of the entire terminal based on the operating state notified by the detection unit 33. For example, when it receives a "button long press detected" notification, it activates the microphone 19 and starts recording, and when it receives a "voice message received" notification from the communication unit 31, it plays the ringtone from the speaker 21.
[0024] Example processing steps (when recording a voice message):
(1) The control unit 34 receives a "button long press detected" notification from the detection unit 33.
(2) The control unit 34 issues a control signal to enable (ON) the microphone 19 and disable (OFF) the speaker 21 (to prevent echo).
(3) Audio data is generated from the sound input from the microphone 19 and is sequentially recorded in the memory 12 (recording).
(4) When the control unit 34 receives a "button release detected" notification from the detection unit 33, the recording is terminated, and the audio data recorded from the start to the end of the recording is passed to the communication unit 31 with an instruction to transmit it.
The storage unit 35 permanently or temporarily stores various data and programs. Its main functions include holding the OS, control programs, received voice messages, and geographical information on dangerous areas. As a concrete example, it consists of the storage 13 (flash memory, etc.) and the memory 12 (RAM).
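The recording steps (1) to (4) above can be illustrated with a minimal Python sketch. It is not the implementation described in this publication: the class name `RecorderSketch`, its method names, and the use of in-memory lists in place of the memory 12 and the communication unit 31 are all assumptions made for illustration. Re-enabling the speaker on release is likewise an assumption, inferred from the statement that the speaker outputs sound when the recording function is not active.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RecorderSketch:
    """Hypothetical model of the control unit's recording flow (steps 1-4)."""

    mic_on: bool = False
    speaker_on: bool = True
    buffer: List[bytes] = field(default_factory=list)  # stands in for memory 12
    sent: List[bytes] = field(default_factory=list)    # stands in for communication unit 31

    def on_long_press(self) -> None:
        # Steps (1)-(2): enable the microphone and disable the speaker
        # so that playback cannot leak into the recording.
        self.mic_on = True
        self.speaker_on = False
        self.buffer.clear()

    def on_audio_chunk(self, chunk: bytes) -> None:
        # Step (3): store chunks sequentially, but only while recording.
        if self.mic_on:
            self.buffer.append(chunk)

    def on_release(self) -> None:
        # Step (4): stop recording and hand the complete message over
        # for transmission. Ignored if no recording is in progress.
        if not self.mic_on:
            return
        self.mic_on = False
        self.speaker_on = True  # assumption: speaker is available again
        self.sent.append(b"".join(self.buffer))
        self.buffer.clear()
```

In this sketch, audio that arrives before a long press is discarded, which mirrors the idea that the microphone is only active while the button is held.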
[0025] The communication unit 31 controls the communication interface 14 to send and receive various data such as voice messages and location information with the guardian terminal 20. The location information acquisition unit 32 controls the GPS receiver 22 to receive signals from GPS satellites and acquire location information indicating the current location of the monitored terminal 10.
[0026] The detection unit 33 detects various operating states of the monitored terminal 10. In this embodiment, "operating state" refers to a state classified based on the process being executed by the monitored terminal 10 or the circumstances in which the monitored terminal 10 is located, and specifically includes the following states: (1) Communication state: state of receiving a voice message from the guardian terminal 20 by the communication unit 31, state of transmitting location information, etc.; (2) Recording state: state of recording audio by the microphone 19; (3) Shooting state: state of shooting a still image or video by the camera 18; (4) Warning state: state of entering a pre-set danger area, low battery level, etc.; (5) Operation state: state of pressing the button unit 16; (6) Sound collection state: state of the audio signal, such as sound pressure and frequency characteristics, of the audio collected by the microphone 19. The danger area is pre-set by the guardian terminal 20 as an area other than safe places such as school or home.
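As a rough illustration of the warning state, the sketch below checks whether a position lies in a danger area. It takes one possible reading of the text, namely that every location outside all registered safe places counts as a danger area. The circular shape of a safe place, the `(latitude, longitude, radius)` tuple format, and the function names are assumptions; the source does not specify how areas are represented.

```python
import math
from typing import Iterable, Tuple

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres

# Hypothetical format: (latitude_deg, longitude_deg, radius_m)
SafePlace = Tuple[float, float, float]


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def in_danger_area(lat: float, lon: float, safe_places: Iterable[SafePlace]) -> bool:
    """True when the point lies outside every registered safe place.

    With no safe places registered, every point is treated as a danger area.
    """
    return all(
        haversine_m(lat, lon, s_lat, s_lon) > radius_m
        for s_lat, s_lon, radius_m in safe_places
    )
```

A real terminal would probably also smooth out GPS jitter near the boundary before raising a warning; that is omitted here.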
[0027] These operating states occur as a result of the monitoring terminal 10 performing its functions or the user operating buttons, etc. Each operating state is detected by signals from the corresponding components (communication unit 31, microphone 19, camera 18, button unit 16, processor 11, etc.). The detection unit 33 continuously monitors the start, continuation, and end of these operating states and notifies the control unit 34 of any changes in state.
[0028] The control unit 34 controls the overall operation of the monitored terminal 10. In particular, the control unit 34 controls the illumination of the light-emitting element 17 according to the operating state detected by the detection unit 33. As illumination control, the control unit 34 can turn the light-emitting element 17 on, off, blink, change the illumination color, change the illumination intensity, etc. The storage unit 35 stores various data using the memory 12 and storage 13.
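The illumination control described above can be pictured as a lookup from operating state to a lighting command. The sketch below is purely illustrative: the source lists the kinds of control available (on, off, blink, color, intensity) but does not say which state uses which pattern, so every entry in `LED_TABLE`, along with the names `OperatingState`, `LedCommand`, and `led_command_for`, is a hypothetical choice.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Dict, Optional


class OperatingState(Enum):
    COMMUNICATION = auto()
    RECORDING = auto()
    SHOOTING = auto()
    WARNING = auto()
    OPERATION = auto()
    SOUND_COLLECTION = auto()


@dataclass(frozen=True)
class LedCommand:
    on: bool
    blink: bool = False
    color: str = "white"
    intensity: float = 1.0  # 0.0 (dark) to 1.0 (full brightness)


LED_OFF = LedCommand(on=False)

# Hypothetical mapping; the colors and patterns are not taken from the source.
LED_TABLE: Dict[OperatingState, LedCommand] = {
    OperatingState.COMMUNICATION: LedCommand(on=True, blink=True, color="blue"),
    OperatingState.RECORDING: LedCommand(on=True, color="red"),
    OperatingState.SHOOTING: LedCommand(on=True, color="white", intensity=0.5),
    OperatingState.WARNING: LedCommand(on=True, blink=True, color="yellow"),
}


def led_command_for(state: Optional[OperatingState]) -> LedCommand:
    """Return the lighting command for a state; unmapped or idle means off."""
    if state is None:
        return LED_OFF
    return LED_TABLE.get(state, LED_OFF)
```

Keeping the mapping in a table rather than in branching code makes it easy to change patterns without touching the control logic.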
[0029] (Voice message sending and receiving function) The monitored terminal 10 has two functions for exchanging voice messages with the guardian terminal 20. The first is a voice message receiving function, which receives audio data (a voice message) recorded on the guardian terminal 20 and stores it in the storage 13. The voice message receiving function also sounds a ringtone to notify the monitored person that a voice message has been received. The sound source (audio data) of the ringtone is stored in the storage 13 in advance, and the ringtone is played and output (sounded) from the speaker 21. The ringtone is played automatically in response to the receipt of a voice message from the guardian terminal 20, without requiring any operation input from the monitored person. The second is a voice message sending function, which records audio collected by the microphone 19 with the processor 11 to generate audio data (a voice message) and sends it to the guardian terminal 20 for playback. The monitored terminal 10 activates the voice message sending and receiving function in response to the pressing of the button section 16. For example, the monitored terminal 10 may detect that the button section 16 has been pressed for a predetermined time or longer (long press), start recording the audio collected by the microphone 19, and then execute the voice message sending function. Specifically, on detecting a long press it may start recording the audio collected by the microphone 19, and then, when it detects that the long press has been released, stop recording and send the audio data recorded between the start and end of recording.
[0030] (Voice message playback function) The monitored terminal 10 has a voice message playback function that plays voice messages received and stored by the voice message receiving function described above. The voice message playback function plays audio data with the processor 11 and outputs the sound from the speaker 21. For example, the monitored terminal 10 may detect that the button section 16 has been pressed but released within a predetermined time (short press), and then start playing the audio data received from the guardian terminal 20 and stored in the memory 12 or storage 13, thereby executing the voice message playback function.
[0031] (External structure of the monitored device) Figure 4 shows the external configuration of the monitored terminal 10, and is a view of the monitored terminal 10 from the front (the side where the display 15 is visible). The monitored terminal 10 has a roughly square housing 26, and the display 15 is located in the center of the housing 26. The display area of the display 15 is clearly demarcated as a rectangular area (e.g., 30 mm x 24 mm) in the center of the housing 26. Below the display 15, a circular button section 16 is located. The button section 16 is configured as a pressable physical button, and most of its surface is a black movable part.
[0032] The button section 16 is a physically separate and independent component from the display area of the display 15. The area outside the boundary line of the display area of the display 15 is entirely a non-display area. The light-emitting element 17 is positioned within this non-display area. This enables light emission that is independent of the display.
[0033] (Detailed configuration of microphone, speaker, buttons, etc.) From here, the arrangement of the microphone 19, speaker 21, button section 16, etc., inside the housing 26 will be described in detail. Figure 5A is a diagram showing the internal configuration of the housing 26 of the monitored terminal 10, and is a view from the side of the monitored terminal 10 (the widthwise surface corresponding to the vertical direction of the front). As shown in Figure 5A, the housing 26 is provided with an opening 26a to ensure the visibility of the built-in display 15. Through the opening 26a, the content displayed on the internal display 15 can be viewed through the display area of the button section 16. Inside the housing 26, the button section 16 is positioned closest to the opening 26a, allowing the user to press the button section 16 through the opening 26a. The display 15 is positioned on the back side of the button section 16 (the side opposite the one facing the opening 26a). The display 15 is positioned close to the button section 16 and is practically in contact with it. The button section 16 and the display 15 may be integrally formed. The button section 16 is made of a transparent and light-transmitting material, and the content displayed on the display 15 located on its rear surface can be viewed through the button section 16. A circuit board 40 and a switch mechanism 41 are located behind the display 15.
[0034] The button portion 16, which is exposed to the outside through the opening 26a, can be pressed. When the button portion 16 is pressed, the button portion 16 and the display 15 move inward, acting on a switch mechanism 41 on the circuit board 40. The circuit board 40 is a board on which the hardware components shown in Figure 2 are arranged, and is provided with fixed contacts for detecting when the switch is ON. The switch mechanism 41 is provided with movable contacts that, when pressed, make electrical contact with the fixed contacts on the circuit board 40. As the button portion 16 and the display 15 move inward, the switch mechanism 41 is pressed, the movable contact of the switch mechanism 41 makes electrical contact with the fixed contact on the circuit board 40, and it is detected that the button portion 16 has been pressed.
[0035] If the movable contact of the switch mechanism 41 is detected to be in contact with a fixed contact on the circuit board 40 and conduction is maintained for a predetermined time or longer, the monitored terminal 10 detects that the button portion 16 has been long-pressed. If the movable contact of the switch mechanism 41 comes into contact with a fixed contact on the circuit board 40 and conducts, but is released and ceases to conduct in less than the predetermined time, the monitored terminal 10 detects that the button portion 16 has been short-pressed.
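The long-press and short-press rule above can be written as a small Python sketch. The threshold of 1.0 second is an assumed value (the source only says "a predetermined time"), and the names `Press`, `classify_while_held`, and `classify_on_release` are hypothetical.

```python
from enum import Enum
from typing import Optional

# Assumed value; the source only specifies "a predetermined time".
LONG_PRESS_THRESHOLD_S = 1.0


class Press(Enum):
    SHORT = "short"
    LONG = "long"


def classify_while_held(
    held_s: float, threshold_s: float = LONG_PRESS_THRESHOLD_S
) -> Optional[Press]:
    """While conduction continues: report a long press once the threshold is reached."""
    return Press.LONG if held_s >= threshold_s else None


def classify_on_release(
    held_s: float, threshold_s: float = LONG_PRESS_THRESHOLD_S
) -> Press:
    """When conduction stops: a hold shorter than the threshold is a short press."""
    return Press.SHORT if held_s < threshold_s else Press.LONG
```

Splitting the check in two reflects the text: a long press can be recognized while the button is still held (so recording can start immediately), whereas a short press can only be recognized at release.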
[0036] (Configuration Example 1) Figure 5B is a side view of the monitored terminal 10, focusing on the microphone 19, speaker 21, button section 16, display 15, and the opening 26a of the housing 26. As shown in Figure 5B, the button section 16 is positioned inside the housing 26 with a gap 24 intentionally left between it and the inner wall of the housing 26. This allows sound emitted from the speaker 21 to pass through the gap 24 and be output to the outside through the opening 26a. Furthermore, during recording, the voice spoken by the user to the monitored terminal 10 reaches the microphone 19 through the opening 26a and the gap 24, allowing for proper sound collection.
[0037] In Configuration Example 1, a gap 24 exists even when the button 16 is not pressed and the switch mechanism 41 is not pushed in. Therefore, when the ringtone is activated, for example, when the button 16 is not pressed, the ringtone emitted from the speaker 21 is reliably emitted from the opening 26a through the gap 24. Also, when the button 16 is briefly pressed to play a voice message, the sound emitted from the speaker 21 is reliably emitted from the opening 26a through the gap 24.
[0038] Figure 5C is a side view of the monitored terminal 10, showing the state after the button 16 has been pressed, compared to the state shown in Figure 5B. As shown in Figure 5C, when the button 16 is pressed and the movable contact of the switch mechanism 41 is in contact with the fixed contact on the circuit board 40, the button 16 moves further inward, causing the gap 24 between the opening 26a and the button 16 to widen to become a gap 25. In other words, when the voice message transmission function is executed by pressing and holding the button 16, the microphone 19 can pick up the voice spoken by the monitored person through the widened gap 25. In this way, as the gap expands when the button 16 is pressed, the acoustic impedance of the opening around the microphone 19 changes, making it particularly suitable for picking up the user's (monitored person's) speech. This makes it possible to dynamically optimize the acoustic characteristics according to user operation, from a first interval state suitable for sound emission during standby to a second interval state suitable for sound collection during recording.
[0039] (Acoustic transmission unit) In the present invention, the gaps 24 and 25 formed between the housing 26 and the button portion 16 are not merely clearances that allow the button portion 16 to move, but are actively utilized as an acoustic transmission unit that combines two functions: sound pickup by the microphone 19 and sound emission by the speaker 21.
• Higher-level concept: The acoustic transmission unit includes any path that acoustically connects the acoustic components inside the housing (microphone 19, speaker 21) to the external space.
• Intermediate concept (this embodiment): This is an annular gap formed between the periphery of the housing opening 26a and the outer circumference of the movable button portion 16 arranged within the opening 26a. The width of the gap is preferably set in the range of 0.5 mm to 2.0 mm, taking into consideration acoustic characteristics and dustproofness.
• Lower-level concept (dynamic change): Furthermore, the shape (gap width) of this acoustic transmission unit changes dynamically according to the pressing state of the button 16. When the button is not pressed, it forms a first gap (gap 24) suitable for sound emission, and when the button is pressed, it forms a second gap (gap 25) suitable for sound collection, thus optimizing for different acoustic modes with a single mechanism.
(Acoustic isolation unit) To prevent acoustic crosstalk (echo) that may occur due to the physical proximity of the speaker 21 and the microphone 19, the monitored terminal 10 may be equipped with an acoustic isolation unit. As mentioned above, if a single gap is used as both a sound collection path and a sound emission path, crosstalk, in which the sound from the speaker 21 leaks around to the microphone 19, becomes a problem. In this embodiment, to solve this problem, an acoustic isolation unit is provided inside the housing 26. Figures 5D and 5E show an acoustic separator 42, which is an example of the acoustic isolation unit.
The acoustic separator 42 is a component that separates the speaker 21 from the microphone 19 by closing a portion of the gap 24, so that the sound emitted from the speaker 21 does not travel through the gap 24 and get collected by the microphone 19. The acoustic separator 42 has a width 42a that is substantially the same as the width of the gap 24 and is positioned within the gap 24, interposed between the respective positions of the microphone 19 and the speaker 21. As shown in Figures 5D and 5E, the acoustic separator 42 is a substantially cubic member with a width 42a, fixed to the non-visible area of the button portion 16, and functions as a wall that prevents sound emitted from the speaker 21 from reaching the microphone 19.
• Higher-level concept: The acoustic isolation unit includes any partition that physically separates at least a portion of the sound emission path from at least a portion of the sound collection path within the housing.
• Intermediate concept (this embodiment): As shown in Figure 5D, the acoustic separator 42 is a wall-like member erected on the circuit board 40, separating the area where the speaker 21 is located from the area where the microphone 19 is located.
• Lower-level concept (specific configuration example): The material is preferably an elastic material such as silicone rubber or elastomer that suppresses sound transmission and also has excellent shock absorption properties. Its tip contacts or is close to the back surface of the button portion 16, blocking the acoustic bypass path, including the space between it and the inner wall of the housing 26. This effectively prevents sound from leaking around on the inside while still sharing an opening (gap 24) to the outside.
[0040] The acoustic separator 42 is made of an elastic material such as silicone rubber and is provided in a part of the gap 24 to physically separate the area where the speaker 21 is located from the area where the microphone 19 is located. As a result, the gap 24 functions as a single opening to the outside, while the sound emission path and sound collection path are separated internally, effectively suppressing acoustic crosstalk. The acoustic separator can also function as a buffer to absorb external shocks.
[0041] (Echo reduction) Echo countermeasures are taken to prevent echoes caused by a loop in which audio output from the speaker 21 is collected by the microphone 19, becomes audio data sent to the guardian terminal 20, is received again by the monitored terminal 10, and is output from the speaker 21. As one echo countermeasure, when the monitored terminal 10 enables the voice message transmission function, it turns the microphone 19 ON and the speaker 21 OFF. As another echo countermeasure, when the monitored terminal 10 enables the voice message playback function, it turns the speaker 21 ON and the microphone 19 OFF.
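The exclusive ON/OFF control described in this paragraph can be sketched in a few lines. This is only an illustration under stated assumptions, not the terminal's actual firmware: the names `Mode`, `AudioPathController`, and `set_mode` are hypothetical, and the initial flag values are an arbitrary choice.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    RECORDING = "recording"  # voice message transmission function enabled
    PLAYBACK = "playback"    # voice message playback function enabled


@dataclass
class AudioPathController:
    """Hypothetical holder for the microphone/speaker enable flags."""
    mic_on: bool = False
    speaker_on: bool = True

    def set_mode(self, mode: Mode) -> None:
        # Exactly one transducer is enabled at a time, so sound from the
        # speaker cannot be picked up by the microphone and looped back.
        self.mic_on = mode is Mode.RECORDING
        self.speaker_on = mode is Mode.PLAYBACK


controller = AudioPathController()
controller.set_mode(Mode.RECORDING)
print(controller.mic_on, controller.speaker_on)  # True False
controller.set_mode(Mode.PLAYBACK)
print(controller.mic_on, controller.speaker_on)  # False True
```

Deriving both flags from a single mode value, rather than toggling them separately, makes it impossible to reach a state in which the microphone and speaker are both ON.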
[0042] (Stereo arrangement) At least one of the microphones 19 and the speakers 21 may be arranged in stereo. As shown in Figure 6, when the microphones 19 are arranged in stereo, the two microphones 19 (microphones 19a and 19b) are positioned opposite each other with the display 15 in between. When the speakers 21 are arranged in stereo, the two speakers 21 (speakers 21a and 21b) are positioned opposite each other with the display 15 in between. When the microphones 19 are arranged in stereo, the left and right channels can be recorded separately using microphones 19a and 19b. When the speakers 21 are arranged in stereo, the left and right channels can be output separately using speakers 21a and 21b.
[0043] If two microphones 19 are arranged in a stereo configuration, beamforming may be performed with these microphones 19 to form a directional microphone directed towards the opening 26a. If two speakers 21 are arranged in a stereo configuration, beamforming may be performed with these speakers 21 to form a directional speaker directed towards the opening 26a.
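The description does not specify a beamforming method. As one possible illustration, the sketch below uses delay-and-sum, a common basic technique, on two microphone signals. The function name `delay_and_sum`, the integer sample delay, and the sample values are all assumptions made for this example.

```python
def delay_and_sum(left, right, delay_samples):
    """Average two microphone signals after delaying `left`.

    A positive `delay_samples` shifts `left` later in time. Samples shifted
    in from outside the recording are treated as silence (0.0).
    """
    if len(left) != len(right):
        raise ValueError("signals must have the same length")
    n = len(left)
    out = []
    for i in range(n):
        j = i - delay_samples
        delayed = left[j] if 0 <= j < n else 0.0
        out.append(0.5 * (delayed + right[i]))
    return out


# Hypothetical pulse that reaches microphone 19a one sample before 19b.
mic_a = [0.0, 0.0, 1.0, 0.0, 0.0]
mic_b = [0.0, 0.0, 0.0, 1.0, 0.0]

print(delay_and_sum(mic_a, mic_b, 1))  # [0.0, 0.0, 0.0, 1.0, 0.0]
print(delay_and_sum(mic_a, mic_b, 0))  # [0.0, 0.0, 0.5, 0.5, 0.0]
```

When the delay matches the arrival-time difference for the target direction, the pulse adds coherently. With a mismatched delay it is smeared and attenuated, which is what gives the pair its directivity.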
[0044] The microphone 19 may be mounted on the circuit board 40, or it may be fixed by being attached to the inner wall of the housing 26. Similarly, the speaker 21 may be mounted on the circuit board 40, or it may be fixed by being attached to the inner wall of the housing 26. The microphone 19 may be a MEMS microphone. The speaker 21 may be a MEMS speaker.
[0045] (Configuration Example 2: Display Speaker) In Configuration Example 1, the speaker 21 is placed inside the housing 26 separately from the button section 16, but the speaker may instead be integrated with the button section 16. As shown in Figure 7, in Configuration Example 2, an exciter 29 is provided on the button section 16, and the button section 16 is configured to also function as the speaker 21. The exciter 29 is a device that vibrates in response to an audio signal and produces sound by vibrating the material to which it is attached. When the voice message playback function is activated, the exciter 29 receives the audio signal being played back and vibrates the button section 16 to which it is attached. As a result, the button section 16 also functions as the speaker 21. The button section 16 is made of a material that can be appropriately vibrated by the exciter 29. Using the gap 24, the button section 16 can move back and forth (left and right in Figure 7) as it is vibrated by the exciter 29 without touching the housing 26.
[0046] (Modification) In the above embodiment, an example was shown in which a substantially cubic acoustic separator 42 is erected on the button portion 16, but the invention is not limited to this. For example, a rib-shaped partition wall may be integrally molded on the surface of the button portion 16 to achieve the acoustic separation function. Furthermore, a membrane material that provides both waterproofing and sound transmission (for example, expanded polytetrafluoroethylene (ePTFE)) may be placed to cover the inside of the gap 24. This makes it possible to achieve high waterproof and dustproof performance equivalent to IPX7 while maintaining acoustic performance.
[0047] (Configuration of monitoring server 40) As shown in Figure 1, the monitoring server 40 comprises a processor 41, memory 42, storage 43, and a communication interface 44. The processor 41 is a CPU, GPU, or a combination thereof, and executes programs stored in the storage 43 to perform various functions for the monitoring service. The memory 42 is a main memory such as RAM, and temporarily stores data when programs are executed by the processor 41. The storage 43 is a non-volatile memory such as ROM, flash memory, or SSD, and permanently stores programs and various data.
[0048] The processor 41 functions as the central processing unit of the monitoring server 40, the memory 42 is responsible for temporary data storage, and the storage 43 is responsible for persistent program data storage. The communication interface 44 communicates with the external environment. The communication interface 44 communicates with the monitored terminal 10 and the guardian terminal 20 via the network 30 using communication methods such as LAN and LTE.
[0049] (Configuration of guardian terminal 20) Figure 8 shows the configuration of the guardian terminal 20. The guardian terminal 20 is a device held by the guardian and can be a PC (personal computer), mobile phone, smartphone, etc. If there are multiple guardians, it is preferable to have a guardian terminal 20 for each guardian. Although not shown in the diagram, the guardian terminal 20 includes a processor 51, memory 52, storage 53, communication interface 54, display 55, microphone 59, and speaker 57. The processor 51 is a CPU, GPU, or a combination thereof, and executes programs stored in the storage 53 to realize the various functions of the guardian terminal 20. The memory 52 is a main memory device such as RAM, and temporarily stores data when the processor 51 executes a program. The storage 53 is a non-volatile memory device such as ROM, flash memory, or SSD, and permanently stores programs and various data. The display 55 is a display device, such as a liquid crystal or organic EL display, that displays various information and graphics based on instructions from the processor 51. The microphone 59 is a device that picks up sound and can collect the voice spoken by the guardian. The speaker 57 is an audio device that outputs the audio of received and played voice messages (voice data) and ringtones.
[0050] (Voice message neglect prevention function) The monitoring system according to this embodiment includes a "voice message neglect prevention function." This function determines the content of a voice message sent from the monitored terminal 10, and if the voice message contains content that the guardian should not ignore and that requires some kind of action, it causes the guardian terminal 20 to issue a notification to that effect.
[0051] Figure 9 shows the voice message neglect prevention function. The monitoring server 40 is equipped with an analysis unit 60 as a functional unit. The analysis unit 60 has the function of playing back voice messages (audio data) sent from the monitored terminal 10 by the voice message transmission function and analyzing the content of the audio data. For example, the analysis unit 60 converts the audio data into text using speech-to-text technology, analyzes the sentences in that text, and determines whether or not the content requests a response or some kind of action. For example, content is pre-classified into a category that requests a response or some kind of action and a category that does not, and the analysis unit 60 determines which category a message belongs to. In the example shown in Figure 9, voice messages are classified into three categories: "report," "request," and "question." The analysis unit 60 converts the audio data into text and determines that sentences that are neither questions nor requests, such as "I'm leaving" and "I'm coming back now," are "reports." The analysis unit 60 determines that sentences requesting something, such as "Tell me what time you'll be back," are "requests." The analysis unit 60 also determines that sentences ending in a question, such as "Where are the snacks?", are "questions."
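The three-way classification can be illustrated with a simple rule-based sketch that operates on already transcribed text. The description does not disclose how the analysis unit 60 actually analyzes sentences, so the function name `classify_message`, the cue list `REQUEST_CUES`, and the reliance on a trailing question mark in the transcript are all assumptions made for this example.

```python
# Hypothetical phrases that mark a sentence as requesting something.
REQUEST_CUES = ("tell me", "please", "come pick me up")


def classify_message(text: str) -> str:
    """Classify transcribed text as "question", "request", or "report"."""
    normalized = text.strip().lower()
    if normalized.endswith("?"):
        return "question"
    if any(cue in normalized for cue in REQUEST_CUES):
        return "request"
    # Anything that is neither a question nor a request is a report.
    return "report"


for sentence in (
    "I'm leaving",
    "Tell me what time you'll be back",
    "Where are the snacks?",
):
    print(sentence, "->", classify_message(sentence))
```

A production system would more likely use a trained language model than keyword cues. The point here is only the mapping from text to the three categories used in the description.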
[0052] The monitoring server 40 sends the voice message to the guardian terminal 20 with classification information indicating the result of the analysis added to it. For example, a voice message determined to be a "report" as a result of the analysis is sent with classification information indicating that it is a "report". A voice message determined to be a "request" is sent with classification information indicating that it is a "request". A voice message determined to be a "question" is sent with classification information indicating that it is a "question".
[0053] The guardian terminal 20 stores the received voice message in the storage 53 and also issues a notification to inform the guardian that a voice message has been received. For example, the guardian terminal 20 displays a predetermined notification message on the display 55. As another example, the guardian terminal 20 plays a predetermined notification sound from the speaker 57.
[0054] The guardian terminal 20 performs different notification actions depending on the classification information attached to the voice message. Specifically, if the voice message is labeled "report," it performs the first notification action, and if the voice message is labeled "request" or "question," it performs the second notification action. The first and second notification actions differ in their nature. When the voice message is a "report," the need for the guardian to respond to or take any action towards the monitored person is low, whereas when the voice message is a "request" or "question," such a need is expected to be high. Therefore, the second notification action includes actions that make the guardian aware that the voice message has been received more reliably than the first notification action does.
[0055] As an example, as shown in Figure 10, the content (wording) of the notification message displayed in the second notification action (second notification message) is designed to prompt the guardian to check it immediately, more so than the content (wording) of the notification message displayed in the first notification action (first notification message).
[0056] For example, the second notification message is displayed in a more prominent manner than the first notification message. For instance, as shown in Figure 9, the second notification message is displayed in a larger size than the first notification message.
[0057] As an example, as shown in Figure 11, the notification sound emitted from speaker 57 during the second notification operation (second notification sound) is emitted at a louder volume than the notification sound emitted from speaker 57 during the first notification operation (first notification sound).
[0058] For example, the second notification message is displayed for a longer period than the first notification message. For instance, the guardian terminal 20 may be configured so that the first notification message is displayed on the display 55 and then stops being displayed (for example, disappears) after a predetermined first period (e.g., 1 minute), and the second notification message is displayed on the display 55 and then stops being displayed (for example, disappears) after a predetermined second period (e.g., 15 minutes).
[0059] For example, the guardian terminal 20 plays the first notification sound only once, but plays the second notification sound repeatedly at predetermined intervals (e.g., 1 minute) until the voice message is played. In this case, the guardian terminal 20 may gradually increase the volume each time it repeats the second notification sound.
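The differences between the first and second notification actions can be collected into a small configuration sketch. The display periods (1 minute, 15 minutes) and the repeat interval (1 minute) are the example values given in the description. The names `NotificationAction`, `select_action`, and `escalating_volumes`, and the volume numbers, are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class NotificationAction:
    display_seconds: int          # how long the message stays on the display
    repeat_sound: bool            # repeat the sound until the message is played
    repeat_interval_seconds: int  # interval between repeats (0 if no repeat)


FIRST_ACTION = NotificationAction(60, False, 0)
SECOND_ACTION = NotificationAction(15 * 60, True, 60)


def select_action(category: str) -> NotificationAction:
    """Map classification information to a notification action."""
    if category == "report":
        return FIRST_ACTION
    if category in ("request", "question"):
        return SECOND_ACTION
    raise ValueError(f"unknown category: {category}")


def escalating_volumes(start: int, step: int, maximum: int, repeats: int) -> List[int]:
    """Volume for each repeat, rising by `step` and capped at `maximum`."""
    return [min(start + step * i, maximum) for i in range(repeats)]


print(select_action("question").display_seconds)  # 900
print(escalating_volumes(40, 20, 100, 5))         # [40, 60, 80, 100, 100]
```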
[0060] Figure 12 is a flowchart showing the operation of the voice message neglect prevention function.
S10: The monitoring server 40 receives a voice message from the monitored terminal 10.
S11: The monitoring server 40 analyzes and classifies the received voice message.
S12: The monitoring server 40 adds classification information to the voice message according to the classification.
S13: The monitoring server 40 sends (transfers) the voice message to the guardian terminal 20 along with the classification information.
S14: The guardian terminal 20 receives the voice message sent from the monitoring server 40.
S15: The guardian terminal 20 refers to the classification information attached to the voice message and performs the first notification action if it is a "report".
S16: The guardian terminal 20 refers to the classification information attached to the voice message and performs the second notification action if it is a "request" or a "question".
[0061] (Voice message routing function) The monitoring system according to this embodiment is equipped with a "voice message routing function." When multiple guardian terminals 20 are linked to the monitored terminal 10, the voice message routing function determines the content of a voice message sent from the monitored terminal 10, determines which guardian terminal 20 the voice message should be sent to, and sends the voice message to the appropriate guardian terminal 20.
[0062] Figure 13 shows an example of a database 70 for managing the association (correspondence) between the monitored terminal 10 and the guardian terminals 20. The database 70 is stored in the storage 43 of the monitoring server 40. The database 70 is updated according to the association between the monitored terminal 10 and the guardian terminals 20 made by the guardian. For each guardian terminal 20, the database holds information about the attributes and name of the guardian who owns it. In the example shown in Figure 13, three guardian terminals 20 are associated with the monitored terminal 10: the first guardian terminal 20, the second guardian terminal 20, and the third guardian terminal 20. The first guardian terminal 20 is owned by "Takeshi," the father of the monitored child; the second guardian terminal 20 is owned by "Misae," the mother of the monitored child; and the third guardian terminal 20 is owned by "Hanako," the grandmother of the monitored child.
[0063] Figure 14 shows the voice message routing function. The monitoring server 40 includes a determination unit 61 as a functional unit. The determination unit 61 has the function of playing back the voice message (audio data) sent from the monitored terminal 10 by the voice message transmission function, analyzing the content of the audio data, and determining (identifying) which of the first to third guardian terminals 20 associated with the monitored terminal 10 should receive the voice message. For example, the determination unit 61 converts the audio data into text using speech-to-text technology, and identifies the destination guardian terminal 20 by matching the character strings contained in that text against the attributes and names of the guardians who own the guardian terminals 20.
[0064] For example, the determination unit 61 determines the destination guardian terminal 20 based on the designation or name of the guardian who possesses the guardian terminal 20, as included in the text of the voice message. For example, if the transcribed text of the voice message is "It's starting to rain, Dad, come pick me up at the station," the unit identifies the first guardian terminal 20, which has the attribute "father," as the destination because the designation "Dad" is included in the text. If the transcribed text of the voice message is "Mom, where's the snack?", the unit identifies the second guardian terminal 20, which has the attribute "mother," as the destination because the designation "Mom" is included in the text. If the transcribed text of the voice message is "Someone, come pick me up," the unit identifies all of the first to third guardian terminals 20 as destinations because the designation "Someone," which does not represent a specific attribute, is included in the text. Furthermore, if the transcribed text of the voice message is "I'm going to a friend's house to play," the message does not contain any characters indicating a specific designation or name, so all of the first to third guardian terminals 20 are identified as the destinations.
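The routing rule in these examples can be sketched as a lookup against the guardian records of database 70. This is a minimal illustration that assumes the audio has already been transcribed. The `Guardian` record layout, the English designation lists, and the function name `route_message` are hypothetical and not taken from the description.

```python
import re
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Guardian:
    terminal_id: str
    attribute: str
    name: str
    designations: Tuple[str, ...]  # hypothetical words a child might use


GUARDIANS = [
    Guardian("first", "father", "Takeshi", ("dad", "father")),
    Guardian("second", "mother", "Misae", ("mom", "mother")),
    Guardian("third", "grandmother", "Hanako", ("grandma", "grandmother")),
]


def route_message(text: str, guardians: List[Guardian]) -> List[str]:
    """Return the terminal IDs that should receive the transcribed message."""
    # Match whole words so that "grandmother" does not also match "mother".
    words = set(re.findall(r"[a-z']+", text.lower()))
    matched = [
        g.terminal_id
        for g in guardians
        if words & ({g.name.lower()} | set(g.designations))
    ]
    # No specific designation or name: deliver to every linked terminal.
    return matched or [g.terminal_id for g in guardians]


print(route_message("It's starting to rain, Dad, come pick me up at the station", GUARDIANS))
print(route_message("Someone, come pick me up", GUARDIANS))
```

Falling back to all terminals when nothing matches mirrors the "Someone" and "friend's house" examples, where no specific guardian is addressed.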
[0065] The monitoring server 40 sends the voice message to the one or more guardian terminals 20 that have been identified as recipients by the determination unit 61. Because the determination unit 61 identifies the appropriate recipient before the voice message is sent, the sending of unnecessary voice messages is suppressed and the voice message is delivered to the appropriate guardian in line with the intentions of the monitored person.
[General tasks] One of the objectives of this invention is to improve the user interface in small electronic devices.
Issues corresponding to [Appendix 1] One of the objectives of the present invention is to provide a monitored terminal that can achieve a balance of sound quality, durability, and operability while maintaining a compact housing.
[Note 1] A monitored terminal comprising a housing, a button section that accepts press operations from a user, a microphone for recording user speech when a voice message recording function is enabled while the button section is pressed, and a speaker for outputting playback sound to the user when the voice message recording function is not enabled, wherein the button section is arranged inside the housing so as to be pressable and exposed to the outside through an opening in the housing, and the microphone and speaker are arranged inside the housing so as to be able to pick up and emit sound, respectively, through a gap provided between the housing and the button section.
[Effects of Appendix 1] According to the above-mentioned monitored terminal, there is no need to separately provide openings for the microphone and the speaker in the housing, which reduces the number of parts while improving durability such as water and dust resistance.
Issues corresponding to [Appendix 2] One of the objectives of this invention is to suppress the generation of echo caused by sound output from a speaker leaking into the microphone.
[Note 2] The monitored terminal as described in Appendix 1, wherein when the voice message recording function is active, the microphone is enabled to pick up sound and the speaker is disabled, and when the voice message recording function is not active, the microphone is disabled from picking up sound and the speaker is enabled.
[Effects of Appendix 2] This allows for the suppression of echo generation with a simple configuration by exclusively controlling the functions during recording and playback, while utilizing a single acoustic path.
Issues corresponding to [Appendix 3] One of the objectives of the present invention is to provide a specific configuration that reliably detects a button press operation by a user.
[Note 3] The monitored terminal as described in Appendix 1, comprising a switch mechanism that detects a button operation by causing a movable contact to move and make electrical contact with a fixed contact in response to the button being pressed through the opening.
[Effects of Appendix 3] This allows for the detection of operations through the conductivity of physical contacts, thus enabling a highly reliable user interface.
Issues corresponding to [Appendix 4] One of the objectives of the present invention is to dynamically change the acoustic characteristics in response to user operation and provide acoustic performance suitable for the usage scenario.
[Note 4] The monitored terminal as described in Appendix 1, wherein when the button portion is not pressed through the opening, a gap of a first interval is formed between the housing and the button portion, and when the button portion is pressed through the opening, a gap of a second interval, which is wider than the first interval, is formed between the housing and the button portion.
[Effects of Appendix 4] This allows for the optimization of acoustic characteristics according to user operation, for example, by setting the device to a state suitable for audio playback when not pressed (first interval) and a state suitable for audio recording when pressed (second interval).
Issues corresponding to [Appendix 5] One of the objectives of the present invention is to provide an alternative speaker configuration that further improves space efficiency inside the housing.
[Note 5] The monitored terminal as described in Appendix 1, wherein the speaker is an exciter provided in contact with the back of the surface of the button portion that faces the opening, and emits sound by vibrating the button portion through the gap with the exciter.
[Effects of Appendix 5] This allows the button itself to be used as the speaker's diaphragm, eliminating the need for a separate speaker unit. This enables further miniaturization of the device and the inclusion of a larger capacity battery.
Issues corresponding to [Appendix 6] One of the objectives of the present invention is to suppress acoustic crosstalk, in which sound output from a speaker leaks into the microphone, which is a concern when a single acoustic path is shared.
[Note 6] The monitored terminal according to Appendix 1, further comprising an acoustic separator for separating the microphone and the speaker in the aforementioned gap.
[Effects of Appendix 6] This allows for the physical separation of the sound emission path and the sound collection path within the housing, while still sharing an opening to the outside. This effectively suppresses acoustic crosstalk and ensures clear sound quality. Furthermore, by using an elastic material in the acoustic separator, it can also function as a shock absorber to absorb external impacts.
Issues corresponding to [Appendix 7] One of the objectives of this invention is to provide a monitoring system that can prevent voice messages that should not be ignored from being left unattended.
[Note 7] An information processing device for controlling the exchange of voice messages between a monitoring terminal and a monitored terminal, configured to transmit a voice message received from the monitored terminal to the monitoring terminal and to notify the monitoring terminal of the receipt of the voice message, comprising an analysis means for classifying the voice message transmitted from the monitored terminal, wherein if the voice message belongs to a first classification, the monitoring terminal is notified of the receipt of the voice message by a first notification operation, and if the voice message belongs to a second classification, the monitoring terminal is notified of the receipt of the voice message by a second notification operation which has a higher notification effect than the first notification operation.
[Effects of Appendix 7] This allows notification behavior to change depending on the content of the voice message, so that messages that are important to the guardian are highlighted, reducing the anxiety and frustration that arise between the guardian and the monitored person when messages are left unattended.
Issues corresponding to [Appendix 8] One of the objectives of the present invention is to provide specific criteria for appropriately classifying voice messages.
[Note 8] The information processing device described in Appendix 7, wherein the analysis means classifies whether the content of the voice message is a voice message belonging to a second category that requests a response or action, or a voice message belonging to a first category that does not request a response or action.
[Effects of Appendix 8] This allows for the appropriate classification of voice messages based on the need for the guardian's response, resulting in more effective notification control.
Issues corresponding to [Appendix 9] One of the objectives of this invention is to achieve classification of voice messages using more specific and practical criteria.
[Note 9] The information processing device according to Appendix 8, wherein the analysis means classifies the voice message into the second category if the content of the voice message is a question or a request, and classifies the voice message into the first category if the content of the voice message is neither a question nor a request.
[Effects of Appendix 9] This enables automatic classification using linguistic features such as questions and requests, improving the system's practicality.
Issues corresponding to [Appendix 10] One of the objectives of the present invention is to provide a monitoring system that can appropriately route voice messages sent by a child to multiple guardians.
[Note 10] An information processing device for controlling the exchange of voice messages between a monitoring terminal and a monitored terminal, the device being configured to transmit a voice message received from the monitored terminal to one of a plurality of monitoring terminals associated with the monitored terminal, and comprising a determination means for determining the destination monitoring terminal based on the characters in the string indicated by the voice message.
[Effects of Appendix 10] This allows messages to be delivered to the appropriate guardian according to the monitored person's intentions, even when there are multiple guardians, and suppresses the sending of unnecessary messages, resulting in an efficient monitoring system.
Issues corresponding to [Appendix 11] One of the objectives of the present invention is to provide specific criteria for achieving more accurate routing determination of voice messages.
[Note 11] The information processing device as described in Appendix 10, wherein the determination means makes the determination based on the characters in the string indicated by the voice message and the attributes or names of the persons who possess the respective monitoring terminals.
[Effects of Appendix 11] This allows for the identification of the recipient based on forms of address such as "Dad" or "Mom," or specific names, enabling more accurate routing that reflects the intentions of the person being monitored.
[Explanation of Symbols]
[0066]
1 Monitoring system
10 Monitored terminal
11 Processor
12 Memory
13 Storage
14 Communication interface
15 Display
16 Button section
16a Opening
17 Light-emitting element
18 Camera
19 Microphone
19a Microphone (first)
19b Microphone (second)
20 Guardian terminal
21 Speaker
21a Speaker (first)
21b Speaker (second)
22 GPS receiver
23 Bus
24 Inertial sensor
26 Housing
30 Network
31 Communication unit
32 Location information acquisition unit
33 Detection unit
34 Control unit
35 Storage unit
40 Monitoring server
41 Processor
42 Storage
43 Memory
44 Communication interface
51 Processor
52 Memory
53 Storage
54 Communication interface
55 Display
56 Input unit
57 Speaker
59 Microphone
60 Analysis unit
61 Determination unit
Claims
1. A monitored terminal comprising: a housing; a button section that accepts press operations from a user; a microphone for recording the user's speech when a voice message recording function, which is enabled while the button section is pressed, is active; and a speaker for outputting playback sound to the user when the voice message recording function is not active, wherein the button section is arranged inside the housing so as to be pressable and exposed to the outside through an opening in the housing, and the microphone and the speaker are arranged inside the housing so as to be able to pick up and emit sound, respectively, through a gap provided between the housing and the button section.
2. A monitoring terminal according to claim 1, When the voice message recording function is activated, the microphone is enabled to pick up sound, and the speaker is disabled to emit sound. A monitored terminal that, when not using the voice message recording function, disables the microphone for sound pickup and enables the speaker for sound emission.
3. The monitored terminal according to claim 1, further comprising a switch mechanism that detects operation of the button portion when, in response to the button portion being pressed through the opening, a movable contact moves and makes electrical contact with a fixed contact.
4. The monitored terminal according to claim 1, wherein, when the button portion is not pressed through the opening, a gap of a first interval is formed between the housing and the button portion, and, when the button portion is pressed through the opening, a gap of a second interval wider than the first interval is formed between the housing and the button portion.
5. The monitored terminal according to claim 1, wherein the speaker is an exciter provided in contact with the back side of the surface of the button portion that faces the opening, and the exciter vibrates the button portion so that sound is emitted through the gap.
6. The monitored terminal according to claim 1, further comprising an acoustic separator that separates the microphone and the speaker in the gap.
7. An information processing device that controls the exchange of voice messages between a monitoring terminal and a monitored terminal, the information processing device being configured to transmit a voice message received from the monitored terminal to the monitoring terminal and to notify the monitoring terminal of receipt of the voice message, and comprising an analysis means for classifying the voice message transmitted from the monitored terminal, wherein, if the voice message belongs to a first category, receipt of the voice message is notified by a first notification operation, and, if the voice message belongs to a second category, receipt of the voice message is notified by a second notification operation having a higher notification effect than the first notification operation.
8. The information processing device according to claim 7, wherein the analysis means classifies the voice message, based on its content, into the second category if it requests a response or action, and into the first category if it does not request a response or action.
9. The information processing device according to claim 8, wherein the analysis means classifies the voice message into the second category if the content of the voice message is a question or a request, and classifies the voice message into the first category if the content of the voice message is neither a question nor a request.
10. An information processing device that controls the exchange of voice messages between a monitoring terminal and a monitored terminal, the information processing device being configured to transmit a voice message received from the monitored terminal to one of a plurality of monitoring terminals associated with the monitored terminal, and comprising a determination means for determining the destination monitoring terminal based on the character string indicated by the voice message.
11. The information processing device according to claim 10, wherein the determination means makes the determination based on the character string indicated by the voice message and on the attribute or name of the person who possesses each monitoring terminal.
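For illustration only, the classification of claims 7 and 8 and the destination determination of claims 10 and 11 can be sketched as follows. This is a minimal sketch, not the claimed implementation. It assumes the voice message has already been transcribed into an English character string, and every name in it (`Category`, `classify`, `notification_for`, `Guardian`, `route`, the cue lists, and the notification fields) is hypothetical. A message that requests a response or action is placed in the second category, as in claim 8, and the destination is chosen by matching words in the transcript against each guardian's name or attributes, as in claim 11.

```python
import re
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional, Tuple


class Category(Enum):
    FIRST = 1   # no response or action is requested
    SECOND = 2  # a response or action is requested (question or request)


# Hypothetical cue lists, assumed for this sketch only.
QUESTION_WORDS = ("what", "when", "where", "who", "why", "how", "can", "could", "may", "will")
REQUEST_PHRASES = ("please", "come pick me up", "call me", "i need", "help")


def classify(transcript: str) -> Category:
    """Classify a transcribed voice message using simple keyword cues."""
    text = transcript.strip().lower()
    words = re.findall(r"[a-z']+", text)
    is_question = text.endswith("?") or (bool(words) and words[0] in QUESTION_WORDS)
    is_request = any(phrase in text for phrase in REQUEST_PHRASES)
    return Category.SECOND if (is_question or is_request) else Category.FIRST


def notification_for(category: Category) -> dict:
    """Pick a notification style; the second is assumed to be more noticeable."""
    if category is Category.SECOND:
        return {"banner": True, "sound": True, "repeat_until_opened": True}
    return {"banner": True, "sound": False, "repeat_until_opened": False}


@dataclass(frozen=True)
class Guardian:
    terminal_id: str              # identifier of the monitoring terminal
    name: str                     # name of the person who possesses it
    attributes: Tuple[str, ...]   # forms of address, e.g. ("mom", "mother")


def route(transcript: str, guardians: List[Guardian]) -> Optional[Guardian]:
    """Return the first guardian whose name or attribute occurs in the transcript.

    Returns None when nothing matches; the fallback policy is left to the caller.
    """
    words = set(re.findall(r"[a-z']+", transcript.lower()))
    for guardian in guardians:
        keys = {guardian.name.lower(), *(a.lower() for a in guardian.attributes)}
        if words & keys:
            return guardian
    return None
```

With two registered guardians whose attributes are ("dad", "father") and ("mom", "mother"), a transcript such as "Mom, can you pick me up?" would be routed to the second guardian and placed in the second category, while "I arrived at school." matches no guardian and falls into the first category. Keyword matching is used here only because it is easy to follow; the claims do not limit the analysis means or the determination means to any particular technique.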