system

A system that analyzes children's skeletal features and compares them with successful athletes' data to recommend suitable sports, addressing the challenge of selecting sports that match their physical characteristics and maintaining interest.

JP2026105525A · Pending · Publication Date: 2026-06-26 · SOFTBANK GROUP CORP

Patent Information

Authority / Receiving Office
JP · JP
Patent Type
Applications
Current Assignee / Owner
SOFTBANK GROUP CORP
Filing Date
2024-12-16
Publication Date
2026-06-26

AI Technical Summary

Technical Problem

Parents and coaches face challenges in selecting suitable sports for children without considering their physical characteristics, leading to potential burden and loss of interest.

Method used

A system that receives photographic data, extracts skeletal features, and compares them with data from past successful athletes to determine the most suitable sport, providing real-time recommendations.

Benefits of technology

Enables effective selection of sports based on children's physical characteristics, preventing loss of interest and ensuring appropriate instruction.



Abstract

[Problem] To provide a system. [Solution] A system including: means for receiving image data; means for extracting skeletal features of a target from the received image data; means for comparing the extracted skeletal features with data of past successful athletes; means for determining a sport suitable for the target based on the comparison result; means for outputting the determination result; means for proposing an appropriate exercise program based on the output determination result; and voice output means for guiding the proposed exercise program.

Description

Technical Field

[0001] The technology of the present disclosure relates to a system.

Background Art

[0002] Patent Document 1 discloses a persona chatbot control method performed by at least one processor, including steps of receiving a user utterance, adding the user utterance to a prompt including an instruction sentence related to an explanation of a chatbot character, encoding the prompt, and inputting the encoded prompt into a language model to generate a chatbot utterance in response to the user utterance.

Prior Art Documents

Patent Documents

[0003]

Patent Document 1

Summary of the Invention

Problems to be Solved by the Invention

[0004] Conventionally, it has been difficult for parents to find sports suitable for their children. Without reference information or prior experience to compare against, it is difficult to choose a sport that suits a child's physical characteristics, and starting a sport without assessing suitability may place a burden on the child and lead to loss of interest. Therefore, there is a need for a technology that effectively identifies suitable sports based on a child's physical characteristics, especially skeletal information.

Means for Solving the Problems

[0005] This invention provides a means for receiving photographic data and extracting the skeletal features of a subject from that data. Furthermore, the extracted skeletal features are compared with data from past successful athletes, and the invention includes a means for determining the most suitable sport for the subject based on the comparison results. This makes it easy to find a sport that is highly suitable for the child, enabling parents and coaches to provide appropriate instruction and support to prevent loss of interest.

[0006] "Image data" refers to data that represents visual information in a digital format and is stored in a format that can be processed by computers.

[0007] "Skeletal characteristics" refer to data about the subject's physical structure, specifically including the position, length, and other morphological features of the bones.

[0008] "Data on past successful athletes" refers to data collected on the physical characteristics and achievements of individuals who have achieved outstanding results in the field of sports.

[0009] A "sports discipline" refers to a specific type of exercise or sports activity, and that activity has its own rules and objectives.

[0010] "Determination" refers to drawing conclusions or results based on given data and information, and in particular, in this invention, it means the process of determining the appropriate type of exercise.

Brief Description of the Drawings

[0011] [Figure 1] This is a conceptual diagram showing an example of the configuration of a data processing system according to the first embodiment.

[Figure 2] This is a conceptual diagram showing an example of the essential functions of a data processing device and a smart device according to the first embodiment.

[Figure 3] This is a conceptual diagram showing an example of the configuration of a data processing system according to the second embodiment.

[Figure 4] This is a conceptual diagram showing an example of the main functions of a data processing device and smart glasses according to the second embodiment.

[Figure 5] This is a conceptual diagram showing an example of the configuration of a data processing system according to the third embodiment.

[Figure 6] This is a conceptual diagram showing an example of the main functions of a data processing device and a headset-type terminal according to the third embodiment.

[Figure 7] This is a conceptual diagram showing an example of the configuration of a data processing system according to the fourth embodiment.

[Figure 8] This is a conceptual diagram showing an example of the main functions of a data processing device and a robot according to the fourth embodiment.

[Figure 9] This shows an emotion map where multiple emotions are mapped.

[Figure 10] This shows an emotion map where multiple emotions are mapped.

[Figure 11] This is a sequence diagram showing the processing flow of the data processing system in Example 1.

[Figure 12] This is a sequence diagram showing the processing flow of the data processing system in Application Example 1.

[Figure 13] This is a sequence diagram showing the processing flow of the data processing system in Example 2, which incorporates an emotion engine.

[Figure 14] This is a sequence diagram showing the processing flow of the data processing system in Application Example 2, which combines an emotion engine.

Modes for Carrying Out the Invention

[0012] Hereinafter, an example of an embodiment of the system relating to the technology of this disclosure will be described with reference to the attached drawings.

[0013] First, the terminology used in the following description is explained.

[0014] In the following embodiments, a processor denoted by a reference numeral (hereinafter simply referred to as "processor") may be a single arithmetic unit or a combination of multiple arithmetic units. The processor may also be a single type of arithmetic unit or a combination of multiple types of arithmetic units. Examples of arithmetic units include a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), a GPGPU (General-Purpose computing on Graphics Processing Units), and an APU (Accelerated Processing Unit).

[0015] In the following embodiments, a RAM (Random Access Memory) denoted by a reference numeral is a memory in which information is temporarily stored and which is used as a work memory by the processor.

[0016] In the following embodiments, storage denoted by a reference numeral is one or more non-volatile storage devices that store various programs, various parameters, and the like. Examples of non-volatile storage devices include flash memory (e.g., an SSD (Solid State Drive)), magnetic disks (e.g., hard disks), and magnetic tapes.

[0017] In the following embodiments, a communication I/F (Interface) denoted by a reference numeral is an interface that includes a communication processor, an antenna, and the like. The communication I/F controls communication between multiple computers. Examples of communication standards applied to the communication I/F include wireless communication standards such as 5G (5th Generation Mobile Communication System), Wi-Fi (registered trademark), and Bluetooth (registered trademark).

[0018] In the following embodiments, "A and/or B" is synonymous with "at least one of A and B." That is, "A and/or B" means that it may be A alone, or B alone, or a combination of A and B. Furthermore, in this specification, the same concept as "A and/or B" applies when expressing three or more things linked by "and/or."

[0019] [First Embodiment]

[0020] Figure 1 shows an example of the configuration of the data processing system 10 according to the first embodiment.

[0021] As shown in Figure 1, the data processing system 10 includes a data processing device 12 and a smart device 14. An example of the data processing device 12 is a server.

[0022] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and/or a LAN (Local Area Network).

[0023] The smart device 14 comprises a computer 36, a reception device 38, an output device 40, a camera 42, and a communication interface 44. The computer 36 comprises a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The reception device 38, output device 40, and camera 42 are also connected to the bus 52.

[0024] The reception device 38 is equipped with a touch panel 38A and a microphone 38B, etc., and receives user input. The touch panel 38A receives user input by detecting contact with an object (e.g., a pen or finger). The microphone 38B receives user input by detecting the user's voice. The control unit 46A transmits data indicating the user input received by the touch panel 38A and microphone 38B to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the data indicating the user input.

[0025] The output device 40 includes a display 40A and a speaker 40B, and presents data to the user 20 by outputting the data in a form perceptible to the user 20 (e.g., audio and/or text). The display 40A displays visible information such as text and images according to instructions from the processor 46. The speaker 40B outputs audio according to instructions from the processor 46. The camera 42 is a small digital camera equipped with an optical system such as a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor.

[0026] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various types of information between processor 46 and processor 28 via network 54.

[0027] Figure 2 shows an example of the main functions of the data processing device 12 and the smart device 14.

[0028] As shown in Figure 2, in the data processing device 12, a specific processing is performed by the processor 28. A specific processing program 56 is stored in the storage 32. The specific processing program 56 is an example of a "program" related to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 according to the specific processing program 56 executed on the RAM 30.

[0029] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.

[0030] In the smart device 14, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The reception output program 60 is used in conjunction with a specific processing program 56 by the data processing system 10. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.

[0031] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the smart device 14 as the "terminal".

[0032] This invention relates to a skeletal analysis system for determining the most suitable type of exercise for a child. The user saves a high-resolution image of the child's entire body on their device and sends it to the server through the system's application. The server extracts skeletal features from the received image using image processing technology. Specifically, it automatically analyzes features such as joint positions and limb sizes using computer vision technology and machine learning algorithms.
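As an illustration of what "features such as joint positions and limb sizes" might look like as data, the following is a minimal sketch. It assumes that 2D joint coordinates have already been obtained from the image by some pose estimator, which is not shown here. The joint names, the coordinate values, and the choice to normalize by torso length are all assumptions made for this example and do not appear in the source.

```python
import math

# Hypothetical segment definitions: feature name -> (joint A, joint B).
SEGMENTS = {
    "upper_arm": ("shoulder", "elbow"),
    "forearm": ("elbow", "wrist"),
    "thigh": ("hip", "knee"),
    "shin": ("knee", "ankle"),
}

def skeletal_features(joints: dict[str, tuple[float, float]]) -> dict[str, float]:
    """Return limb lengths divided by torso length (neck to hip).

    Normalizing by torso length is an assumption of this sketch; it makes
    the features independent of image scale and camera distance.
    """
    torso = math.dist(joints["neck"], joints["hip"])
    if torso == 0:
        raise ValueError("torso length is zero; keypoints are degenerate")
    return {
        name: math.dist(joints[a], joints[b]) / torso
        for name, (a, b) in SEGMENTS.items()
    }

# Illustrative pixel coordinates (x, y) for one side of the body.
joints = {
    "neck": (100.0, 50.0),
    "shoulder": (100.0, 60.0),
    "elbow": (100.0, 100.0),
    "wrist": (130.0, 140.0),
    "hip": (100.0, 150.0),
    "knee": (100.0, 210.0),
    "ankle": (100.0, 270.0),
}
features = skeletal_features(joints)
```

Ratios rather than raw pixel lengths are used here so that photographs taken at different distances remain comparable; the source does not say which representation the system uses.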

[0033] The server then uses these skeletal features to compare and match them with data on past successful athletes pre-registered in its database. This database contains a wealth of information on successful athletes in various sports and the physical characteristics associated with them. Based on the matching results, the most suitable type of exercise for the child is determined.
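The comparison and matching described in this paragraph could, for example, be sketched as a nearest-neighbor style similarity search. The source does not specify the similarity measure or how athletes are aggregated per sport, so the distance-based similarity, the per-sport averaging, and the athlete records below are assumptions chosen only for illustration.

```python
import math
from collections import defaultdict

# Hypothetical athlete records: (sport, normalized skeletal feature vector).
ATHLETES = [
    ("basketball", [0.45, 0.50, 0.70, 0.68]),
    ("basketball", [0.44, 0.52, 0.72, 0.70]),
    ("gymnastics", [0.36, 0.40, 0.52, 0.50]),
    ("swimming", [0.46, 0.54, 0.60, 0.58]),
]

def similarity(a, b):
    """Map Euclidean distance to a similarity in (0, 1]; 1 means identical."""
    return 1.0 / (1.0 + math.dist(a, b))

def rank_sports(child, athletes):
    """Average the similarity to each sport's athletes, highest first."""
    per_sport = defaultdict(list)
    for sport, vector in athletes:
        per_sport[sport].append(similarity(child, vector))
    scores = {sport: sum(v) / len(v) for sport, v in per_sport.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

child = [0.44, 0.51, 0.71, 0.69]
ranking = rank_sports(child, ATHLETES)
best_sport = ranking[0][0]
```

With these made-up records the child's vector lies closest to the two basketball entries, so `best_sport` is `"basketball"`, mirroring the example given in the text.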

[0034] The assessment results are transmitted in real time from the server to the user's device, allowing the user to review the recommended sports. For example, if the server determines that "basketball is suitable" based on specific skeletal characteristics, this information, along with associated benefits and success stories, is sent to the user. This system helps users effectively select sports based on their child's aptitude.

[0035] The following describes the processing flow.

[0036] Step 1:

[0037] The user saves a high-resolution image of the child's entire body on the terminal. The terminal selects the image and prepares to upload it through the system's application.

[0038] Step 2:

[0039] The terminal sends the image file specified by the user to the server. The server receives the transmitted data and verifies that it is in the correct format and resolution. If necessary, it performs preprocessing to convert the image to a size and format suitable for analysis.

[0040] Step 3:

[0041] The server extracts skeletal features based on pre-processed image data. Specifically, it uses computer vision technology to execute algorithms that extract skeletal information such as joint positions and limb lengths from the images.

[0042] Step 4:

[0043] The server compares the extracted skeletal features with data on past successful athletes stored in a database. The matching process evaluates skeletal similarity and features that influence athletic suitability.

[0044] Step 5:

[0045] The server analyzes the matching results and determines which sports are suitable for the child. The determination uses an algorithm that takes into account sports in which athletes with similar skeletal structures have performed well, as well as their level of success.

[0046] Step 6:

[0047] The server sends information about the determined exercise type to the terminal. The terminal displays the recommended exercise type and related information on the screen so that the user can easily check the determination result.
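The format and resolution check in Step 2 above can be sketched as follows. This is only an illustration: the accepted formats (PNG and JPEG), the minimum resolution, and the function names are assumptions, since the source does not state which formats or thresholds the server enforces. For brevity the sketch reads dimensions only from PNG headers.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
JPEG_SIGNATURE = b"\xff\xd8\xff"

def detect_format(data: bytes):
    """Identify the image format from its leading magic bytes."""
    if data.startswith(PNG_SIGNATURE):
        return "png"
    if data.startswith(JPEG_SIGNATURE):
        return "jpeg"
    return None

def png_size(data: bytes):
    """Read (width, height) from the IHDR chunk, which the PNG format
    places immediately after the 8-byte signature."""
    if len(data) < 24 or data[12:16] != b"IHDR":
        raise ValueError("not a well-formed PNG header")
    return struct.unpack(">II", data[16:24])

def validate_upload(data: bytes, min_width=720, min_height=1280):
    """Return (ok, detail). The thresholds are hypothetical defaults."""
    fmt = detect_format(data)
    if fmt is None:
        return False, "unsupported format"
    if fmt == "png":
        width, height = png_size(data)
        if width < min_width or height < min_height:
            return False, f"resolution too low: {width}x{height}"
    # JPEG dimensions are not checked in this sketch.
    return True, fmt

# A synthetic PNG header (signature + IHDR length, tag, width, height).
header = PNG_SIGNATURE + struct.pack(">I4sII", 13, b"IHDR", 1080, 1920)
result = validate_upload(header)
```

A production server would typically decode the full image with an imaging library before resizing it; the header check here only shows where such a gate would sit in the flow.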

[0048] (Example 1)

[0049] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server," and the smart device 14 will be referred to as the "terminal."

[0050] Traditionally, methods for selecting the optimal exercise based on an individual's physical characteristics have been limited, and there is a lack of clear criteria and analytical techniques. As a result, athletes and their coaches are forced to rely on empirical rules when deciding on an exercise. To resolve this problem, there is a need for a system that objectively selects exercise based on reliable data.

[0051] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.

[0052] In this invention, the server includes means for acquiring image information, means for analyzing body structure features from the acquired image information, and means for comparing the analyzed body structure features with information from past successful exercise participants. This makes it possible to identify the optimal type of exercise based on an individual's physical characteristics.

[0053] "Image information" refers to photographs and videos acquired as visual data.

[0054] "Physical structural characteristics" refer to an individual's physical features, such as the arrangement of their skeleton and muscles, and the position and size of their joints.

[0055] "Machine learning technology" is a technique in which computers find patterns and rules from data and automatically learn from that experience.

[0056] "Type of exercise" refers to a category of specific exercise or sport within sports or fitness.

[0057] "Analysis" is the process of breaking down specific information or data into smaller parts to reveal its content and structure.

[0058] "Specification" refers to the act of clarifying something based on specific requirements or conditions.

[0059] "Comparison" is the act of comparing multiple things to examine their differences and characteristics.

[0060] "Provision" refers to the act of providing information or services to users.

[0061] This invention is a system for determining the most suitable type of exercise for a child. The user first takes a full-body photograph or high-resolution image of the child and saves it to their device. Then, the user uses the system's application to send the image to a server. In this process, the device must be equipped with communication technology to reliably transmit the captured image to the server.

[0062] The server first processes the received images using tools such as OpenCV and TensorFlow (registered trademark) to analyze body structure features. Specifically, it automatically extracts features such as joint positions and limb sizes. Computer vision and machine learning technologies play a crucial role at this stage.

[0063] Next, the server uses the extracted physical characteristics to compare them with information on successful exercise participants stored in its database, searching for past cases with similar physical characteristics. The database contains a wealth of successful cases across various exercise disciplines, and this information is used to determine eligibility for specific exercise disciplines.

[0064] The determined sport is notified to the user's device in real time, allowing the user to consider suitable sport options for their child based on this information. For example, if the server determines that "basketball is suitable," the user will also be presented with its benefits and related success stories.

[0065] Furthermore, by inputting a prompt to the generative AI model such as, "Please describe a system that determines the optimal sport for a child by analyzing their physical characteristics using computer vision and machine learning," it becomes possible to utilize the detailed processing content of the system for learning and tuning. This invention supports the objective and effective selection of sports based on individual physical characteristics.

[0066] The flow of the specific processing in Example 1 will be explained using Figure 11.

[0067] Step 1:

[0068] The user saves an image of their child's entire body to their device and sends it to the server via the app. Specifically, the user uses the app's "Send Image" function to select an image from their device folder and upload it. The input for this step is the image data stored on the user's device, and the output is the image data transferred to the server.

[0069] Step 2:

[0070] The server applies image processing techniques to analyze the received image data. Specifically, the server uses the OpenCV or TensorFlow library to perform image analysis to identify joint positions and limb sizes. The input for this step is image data stored on the server, and the output is anatomical features extracted from the images.

[0071] Step 3:

[0072] The server compares the extracted physical characteristics with information from past successful exercise participants in the database. Specifically, the server executes SQL queries to search for entries with similar physical characteristics. The input for this step is the extracted physical characteristics, and the output is a list of data on sports participants with high similarity.

[0073] Step 4:

[0074] The server determines the most suitable exercise based on the comparison results. Specifically, the server uses machine learning techniques (e.g., a random forest classifier) to evaluate the suitability score for each sport and selects the sport with the highest score. The input for this step is similar entries and their feature data, and the output is the determined exercise.

[0075] Step 5:

[0076] The server notifies the user's device in real time of the determined exercise type. Specifically, it pushes information using WebSocket or a notification service, and displays it in the app on the device. The input for this step is information about the determined exercise type, and the output is recommended exercise type information displayed on the user's device.

[0077] Step 6:

[0078] The user checks the judgment results and related information on the terminal screen and refers to the presented success stories and benefits of the sport. Specifically, the user accesses the judgment results by pressing the "Show Results" button. The input for this step is the judgment result data sent from the server, and the output is the information and instructions provided to the user.
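Steps 3 and 4 above can be sketched together using Python's built-in `sqlite3` module. The table schema, column names, and sample rows are hypothetical, since the source does not describe the database layout. Note also that a simple similarity-weighted vote stands in here for the random forest classifier mentioned in Step 4, so that the example stays self-contained; it is not the classifier the text names.

```python
import sqlite3

# In-memory database with a hypothetical schema and made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE athletes (name TEXT, sport TEXT, arm_ratio REAL, leg_ratio REAL)"
)
conn.executemany(
    "INSERT INTO athletes VALUES (?, ?, ?, ?)",
    [
        ("A", "basketball", 0.95, 1.30),
        ("B", "basketball", 0.93, 1.28),
        ("C", "gymnastics", 0.80, 1.05),
        ("D", "swimming", 0.97, 1.15),
    ],
)

def similar_athletes(conn, arm_ratio, leg_ratio, limit=3):
    """Step 3: return the rows closest in squared Euclidean distance."""
    sql = """
        SELECT sport,
               (arm_ratio - ?) * (arm_ratio - ?)
             + (leg_ratio - ?) * (leg_ratio - ?) AS dist
        FROM athletes
        ORDER BY dist ASC
        LIMIT ?
    """
    params = (arm_ratio, arm_ratio, leg_ratio, leg_ratio, limit)
    return conn.execute(sql, params).fetchall()

def best_sport(rows):
    """Step 4 stand-in: sum 1 / (1 + dist) per sport and pick the maximum."""
    scores = {}
    for sport, dist in rows:
        scores[sport] = scores.get(sport, 0.0) + 1.0 / (1.0 + dist)
    return max(scores, key=scores.get)

rows = similar_athletes(conn, arm_ratio=0.94, leg_ratio=1.29)
recommended = best_sport(rows)
```

Parameterized queries (`?` placeholders) are used so that feature values are never interpolated into the SQL string.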

[0079] (Application Example 1)

[0080] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server," and the smart device 14 will be referred to as the "terminal."

[0081] Finding the right type of exercise for children to promote physical activity and improve athletic ability is a challenge for many parents and educators. Traditional methods lack a detailed approach based on individual physical characteristics, making it difficult to select the appropriate exercise for each child. As a result, inefficient training and a lack of consistency can occur.

[0082] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.

[0083] In this invention, the server includes means for receiving image data, means for extracting the skeletal features of a target from the received image data, means for comparing the extracted skeletal features with data of past successful athletes, means for determining an appropriate exercise for the target based on the comparison results, means for outputting the determination results, means for proposing an appropriate exercise program based on the output determination results, and voice output means for guiding the proposed exercise program. This makes it possible to select the optimal exercise based on each individual's skeletal features and provide effective feedback.

[0084] "Image data" refers to digital data containing visual information, in a format that can be processed on computers and digital devices.

[0085] "Skeletal characteristics" refer to information about the skeleton of an object, including physical structural features such as the position of joints and the length of limbs, and are fundamental data for evaluating the motor skills of humans and animals.

[0086] A "means of comparison" is a technology that has the function of comparing multiple data sets against each other based on specific criteria, making it possible to evaluate similarities and differences.

[0087] "Output means" refers to a method for providing processed data or information to a user, and is a device or technology for displaying or transmitting information visually or audibly.

[0088] An "exercise program" is a plan that includes a series of exercises or training sessions aimed at achieving specific athletic abilities or health goals, and is designed according to individual needs and aptitudes.

[0089] "Audio output means" refers to a device or technology for transmitting information using sound, and can transmit information through speech synthesis or speakers.

[0090] This system is a platform for providing children with optimal exercise programs and activities via home robots and smart devices. The server has the functionality to receive and process image data from the user's smart device or robot.

[0091] The server first processes the image data using the OpenCV library and extracts skeletal features based on computer vision techniques. This process yields important structural information such as joint positions and limb lengths.

[0092] Next, the extracted skeletal features are analyzed using a machine learning algorithm powered by TensorFlow. The analysis then compares these features with a database of past successful athletes to determine the most suitable exercise for the user.

[0093] The determined exercise type and the corresponding exercise program are output from the server to the terminal or robot. The prompt used may be something like, "Please suggest an appropriate exercise type based on the skeletal characteristics of the person in the photograph."
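One possible way to assemble such a prompt is sketched below. The source gives only the prompt sentence itself; appending the extracted skeletal features as a text list, along with the feature names and number formatting, is an assumption of this example.

```python
def build_prompt(features: dict[str, float]) -> str:
    """Combine the prompt sentence from the text with a feature list.

    Appending the features as text is an assumption of this sketch.
    """
    lines = [f"- {name}: {value:.2f}" for name, value in sorted(features.items())]
    return (
        "Please suggest an appropriate exercise type based on the skeletal "
        "characteristics of the person in the photograph.\n"
        "Skeletal features (hypothetical normalized values):\n"
        + "\n".join(lines)
    )

prompt = build_prompt({"thigh": 0.6, "forearm": 0.5})
```

Sorting the feature names keeps the prompt text stable across runs, which can make model outputs easier to compare.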

[0094] The terminal or robot has the function of guiding the user through the assessment results visually and audibly. For example, the robot could suggest, "Jogging is suitable today. Let's run together in a nearby park," and actively encourage the user to exercise.

[0095] This system allows users to always access the latest and most personalized exercise programs, enabling them to train more healthily and efficiently.

[0096] The flow of a specific process in Application Example 1 will be explained using Figure 12.

[0097] Step 1:

[0098] The terminal captures a full-body image of the user's child with its camera and generates high-resolution image data. This image data is the input data to be sent to the server.

[0099] Step 2:

[0100] The server processes the image data received from the terminal using the OpenCV library to extract skeletal features from the image. The data processing performed here analyzes characteristics such as joint positions and limb lengths from the image, and the output is skeletal feature data.

[0101] Step 3:

[0102] The server inputs the extracted skeletal features into a machine learning model using TensorFlow and performs analysis. The prompt might be, "Suggest a suitable exercise based on the skeletal features of the person in the photograph," and the generative AI model determines the appropriate exercise. The output is the determined optimal exercise.

[0103] Step 4:

[0104] The server sends the judgment result to the terminal or robot. The terminal receives this information and processes the data to display it visually to the user (for example, on a display). If voice guidance is also provided, the output is converted into voice format using text-to-speech technology.

[0105] Step 5:

[0106] Users check the exercise program provided via display or audio from the device and then engage in the actual exercise. The device may also record the user's responses and feedback, and may output data that can be used to adjust future programs.

[0107] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the estimated emotions.

[0108] This invention combines an emotion engine with a system that determines the most suitable type of exercise for a child, enabling more precise sports instruction. The user first saves an image of the child's entire body on their device and sends it to the server via the system's application. The server receives this image and uses computer vision technology to extract the child's skeletal features.

[0109] The extracted skeletal information is compared with data on past successful athletes stored in a database on the server to determine the most suitable sport. In addition to the usual sport selection, this invention improves the accuracy of the selection by using an emotion engine. The server recognizes the child's real-time emotional state through sensors and cameras installed on the user's terminal. This emotional data is used as auxiliary information to adjust the suitability of the selected sport.

[0110] For example, even if the server determines that basketball is suitable based on a child's skeletal characteristics, the emotion engine will analyze the child's emotional state. If it detects no signs of strong interest or enjoyment, it will suggest another sport (such as track and field). The server sends this information to the terminal, which displays the determined sport along with supplementary information related to emotions. This allows the user to make a final decision on a sport that takes into account not only physical aptitude but also emotional interest and motivation.
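The adjustment described in this paragraph could be sketched as a simple rule. The per-sport interest scores, the threshold of 0.5, and the fallback to the skeletal top choice are assumptions for illustration; the source does not describe how the emotion engine's output is combined with the skeletal ranking.

```python
def adjust_recommendation(ranked_sports, emotion_scores, threshold=0.5):
    """Pick the first sport whose interest score meets the threshold.

    ranked_sports: sports ordered by skeletal suitability, best first.
    emotion_scores: sport -> estimated interest or enjoyment in [0, 1];
        a sport with no score is treated as 0 (an assumption).
    Falls back to the skeletal top choice if no sport qualifies.
    """
    for sport in ranked_sports:
        if emotion_scores.get(sport, 0.0) >= threshold:
            return sport
    return ranked_sports[0]

ranked = ["basketball", "track and field", "swimming"]
emotions = {"basketball": 0.2, "track and field": 0.8}
final = adjust_recommendation(ranked, emotions)
```

Here basketball ranks first on skeletal features but scores low on interest, so the sketch returns `"track and field"`, matching the example in the text.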

[0111] The following describes the processing flow.

[0112] Step 1:

[0113] The user saves a clear image of the child's entire body on the device. The device provides an interface that allows the user to easily check whether the child's posture is correct and prompts the user to upload the image to the system.

[0114] Step 2:

[0115] The terminal sends the image file selected by the user to the server. The server saves the received image data in the appropriate resolution and format, and ensures image quality by performing initial processing.

[0116] Step 3:

[0117] The server uses computer vision technology to extract skeletal features from images. This involves a process that utilizes image analysis algorithms to identify joint locations, limb lengths, and overall skeletal proportions.

[0118] Step 4:

[0119] The server uses the extracted skeletal features to match them against data of past successful athletes in an existing database. The matching algorithm calculates similarity and identifies the most suitable sport.

[0120] Step 5:

[0121] The device activates the emotion engine based on instructions from the user or the server. Cameras and sensors are used to measure the user's emotional state and to collect, in real time, the emotional patterns the user displays.

[0122] Step 6:

[0123] The server takes in the analysis results from the emotion engine and evaluates the emotional suitability for the determined sport. For example, it further evaluates whether the determined sport evokes joy and excitement in children and adjusts the judgment result as needed.

[0124] Step 7:

[0125] The server sends the final determined exercise category and additional information based on the emotional assessment to the device. The device provides visualized feedback so that the user can easily view this information. Based on this information, the user can make more appropriate choices regarding sports activities for their child.

[0126] (Example 2)

[0127] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server," and the smart device 14 will be referred to as the "terminal."

[0128] Conventional sports selection systems rely solely on physical characteristics, failing to consider the subject's emotions or interests, and therefore may not be able to suggest the most suitable sport. This highlights the challenge of developing highly accurate sports selection systems that consider both physical aptitude and emotional factors.

[0129] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.

[0130] In this invention, the server includes means for acquiring and transmitting image data, means for extracting the skeletal features of a target from the received image data using an analysis device, means for comparing the extracted skeletal features with data of past successful athletes in an information base, means for recognizing the target's emotional state using sensors and an analysis device and adjusting the suitability of the sport, and means for outputting the final proposed sport using an output device. This makes it possible to select the optimal sport considering the physical and emotional suitability of the target.

[0131] "Image data" refers to data that records the visual information of an object in digital format and is used to extract skeletal features.

[0132] An "analysis device" is a computer system used to analyze skeletal features from received image data and compare them with information base data.

[0133] An "information base" is a database used to hold data on past successful athletes and to cross-reference it with skeletal information.

[0134] A "sensor" is a device used to recognize the emotional state of an object in real time, and typically includes cameras and microphones.

[0135] An "output device" is a device used to display the final suggested exercise results to the user, and usually refers to a display screen.

[0136] This invention is a sports instruction support system that proposes the most suitable type of exercise for a child, making a determination by combining physical characteristics and emotional state. The user takes an image of the child's entire body into a terminal and sends the image data to the server using the system's application. This system uses computer vision technology to extract skeletal features and analyzes the emotional state in real time, thereby achieving a more precise determination.

[0137] Specifically, the server uses visual processing libraries such as "OpenCV" and "TensorFlow" to extract the child's skeletal information from the received image data. This information is then compared with data on past successful athletes stored in the server's database. The user's device is equipped with sensors such as a camera and microphone, and the server uses these to analyze the child's emotional state using technologies such as "EmotionAPI". The emotional data obtained from this analysis is used to improve the accuracy of the sport recommendations.

[0138] For example, even if the server determines from a child's skeletal structure that track and field is suitable, if sentiment analysis indicates that the child has little interest, it can suggest a different sport, such as soccer. This allows users to recommend more appropriate sports activities for their children.

[0139] As an example of a prompt when using a generative AI model, it is possible to input a message in the following format: "Based on an image of a 10-year-old child, determine the most suitable type of exercise for this child. Please also consider emotional data and provide a final suggestion."

[0140] Thus, the present invention is a system that enables the proposal of sports disciplines that comprehensively consider both physical aptitude and emotional interest and motivation.

[0141] The flow of the specific processing in Example 2 will be explained using Figure 13.

[0142] Step 1:

[0143] The user takes a full-body image of the child with their device. The image taken by the user is sent to the server through the system's application. The input is the full-body image data of the child, and the output is the transfer of the image data to the server.

[0144] Step 2:

[0145] The server processes the received image data using computer vision technology. Specifically, it uses "OpenCV" and "TensorFlow" to perform image analysis and extract the skeletal features of the child. The input is the received image data, and the output is generated skeletal feature information.

[0146] Step 3:

[0147] The server matches the extracted skeletal features against an internal database. This database contains physical data of past successful athletes. The input is the skeletal feature information, and the output is a list of suitable candidate sports.

[0148] Step 4:

[0149] The server collects real-time emotion data from sensors and cameras connected to the user's device. The server analyzes this data using technologies such as "EmotionAPI" to evaluate the child's emotional state. Real-time emotion data is the input, and emotion evaluation information is obtained as the output.

[0150] Step 5:

[0151] The server integrates candidate exercises based on skeletal structure with emotional evaluation information to propose a final exercise. The input consists of candidate exercises and emotional evaluation information, and the output is the optimal exercise suggested to the user.

[0152] Step 6:

[0153] The server sends the final suggestions to the terminal, which then provides this information to the user. The user can review the suggestions on the terminal screen and obtain details and advice on sports suitable for their child. The input is the final suggested sports activity, and the output is the presentation of information to the user.
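The integration in Step 5 can be sketched as a weighted combination of the two scores. This is a minimal Python illustration under stated assumptions, not the method of the disclosure: the function name fuse_scores, the 0-to-1 score ranges, the neutral default of 0.5, and the emotion weight of 0.4 are all hypothetical.

```python
from typing import Dict, List, Tuple


def fuse_scores(skeletal: Dict[str, float],
                emotional: Dict[str, float],
                emotion_weight: float = 0.4) -> List[Tuple[str, float]]:
    """Blend skeletal suitability with emotional interest for each candidate.

    Both inputs map sport name -> score in [0, 1]. Returns the candidates
    sorted from highest to lowest fused score.
    """
    if not 0.0 <= emotion_weight <= 1.0:
        raise ValueError("emotion_weight must be between 0 and 1")
    fused = {}
    for sport, skeletal_score in skeletal.items():
        # Candidates without an emotion reading get a neutral 0.5.
        emotion_score = emotional.get(sport, 0.5)
        fused[sport] = ((1.0 - emotion_weight) * skeletal_score
                        + emotion_weight * emotion_score)
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)


# Hypothetical scores mirroring the track and field / soccer example.
skeletal = {"track and field": 0.9, "soccer": 0.7}
emotional = {"track and field": 0.2, "soccer": 0.9}
ranking = fuse_scores(skeletal, emotional)
print(ranking[0][0])
```

A weighted sum keeps the skeletal result influential while letting a strong emotional signal change the order. A larger emotion_weight makes the final proposal follow the child's interest more closely.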

[0154] (Application Example 2)

[0155] Next, we will explain application example 2. In the following explanation, the data processing device 12 will be referred to as a "server" and the smart device 14 as a "terminal".

[0156] Traditional sports coaching systems are limited to determining the type of sport based on physical aptitude, making it difficult to provide precise instruction that takes into account individual emotional states. Furthermore, if the suggested sport does not interest the child, it may fail to cultivate sustained motivation, hindering the overall improvement of athletic ability.

[0157] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.

[0158] In this invention, the server includes means for receiving image information, means for extracting the skeletal features of the target, means for comparing with past data, means for evaluating the emotional state, and means for modifying the judgment result according to the emotional state. This makes it possible to suggest exercises that take into account not only physical aptitude but also emotional motivation and interest, enabling more precise sports instruction that encourages participation.

[0159] "Image information" refers to visual data that has been converted into a format that can be processed by computers and digital devices.

[0160] "Skeletal features" refer to information related to the shape, arrangement, and structure of the skeleton of an object.

[0161] The term "athlete" refers to a person who participates in a specific physical activity or sport.

[0162] "Data" refers to a set of facts, values, or instructions that have been formalized for processing by a computer or other system.

[0163] "Physical activity" refers to exercise and behavior that involves moving the body.

[0164] "Emotional state" refers to an index that quantifies the situation that represents an individual's emotional response or mood.

[0165] "Image analysis technology" refers to a series of methods and techniques used to extract meaningful information from image data.

[0166] The system for realizing this application consists of a program that performs the following processes. First, the user has a robot in the house acquire image information of the child using sensors and a camera. The robot captures video using, for example, an Intel RealSense camera, and this video information is sent to a terminal. The terminal transfers this information to a cloud server, which extracts skeletal features using image analysis technology. Rekognition from Amazon Web Services (AWS®) is used for image analysis.

[0167] Next, the server compares the extracted skeletal features with data on past athletes stored in a database. This comparison uses algorithms that perform pattern recognition on similar data. Furthermore, the robot evaluates the child's emotional state using a heart rate sensor and sends this data to the server. The Microsoft® Azure® sentiment analysis API is used for this emotion recognition.

[0168] The server combines this information to determine the optimal physical activity and outputs the result to the user's device. The device displays the determination result along with advice on the type of exercise and an assessment of the emotional state, providing suggestions for the final exercise plan. This system allows children to choose and participate in sports activities based not only on their physical aptitude but also on their emotional interests.

[0169] As a concrete example, a robot that has received movement information can provide feedback such as, "This child is suited to individual running events, but judging from their emotional state, they don't seem very interested. How about trying soccer?" An example of a prompt using a generative AI model is as follows:

[0170] "Please suggest the most suitable sport based on the child's interests and physical aptitude. Current emotional state data indicates a positive interest."

[0171] The flow of a specific process in Application Example 2 will be explained using Figure 14.

[0172] Step 1:

[0173] The user instructs the robot to acquire image information. The robot, equipped with an Intel RealSense camera, captures images of the child before exercise. The acquired image information is transmitted to the terminal in real time. The input is video data from the camera, and the output is raw data transferred to the terminal.

[0174] Step 2:

[0175] The terminal transfers the received image information to the cloud server. The data may be compressed for transmission, or it may be sent in its original format. The input is the video data acquired in step 1, and the output is a notification that the transfer to the server is complete.

[0176] Step 3:

[0177] The server processes the received image information and extracts the skeletal features of the target using image analysis technology. In this process, it utilizes tools such as AWS Rekognition to identify key skeletal features from the image data. The input is video data sent to the cloud server, and the output is the extracted skeletal information.

[0178] Step 4:

[0179] The server compares the extracted skeletal information with past athlete data. A pattern recognition algorithm is used to calculate similarity. The input is the skeletal information obtained in step 3 and the database data, and the output is the determination of the most similar type of exercise.

[0180] Step 5:

[0181] The user initiates data collection using a heart rate sensor so that the robot can analyze the child's emotional state. The robot sends the acquired heart rate data to the terminal, which then forwards it to the server. The input is the heart rate data from the robot, and the output is the emotional data sent to the server.

[0182] Step 6:

[0183] The server evaluates emotional states and reflects this in the sports activity selection results. It uses the Microsoft Azure Sentiment Analysis API to interpret emotional data and re-evaluate interest in and suitability for sports. Inputs are emotional data and the selection results from step 4, and output is a final sports activity suggestion modified based on emotions.

[0184] Step 7:

[0185] The server sends the final exercise suggestions to the terminal, which then displays this information to the user. Along with the assessment result, advice based on the user's emotional state is displayed. The input is the suggestion result from step 6, and the output is the information on the recommended exercises displayed on the user's screen.
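Step 2 above notes that the data may be sent either compressed or in its original format. The following Python sketch shows one way a terminal and server could agree on this, assuming a one-byte flag in front of the payload. The framing, the function names pack_frame and unpack_frame, and the use of lossless zlib compression are assumptions for illustration; the disclosure does not specify a compression scheme.

```python
import zlib

FLAG_RAW = b"\x00"         # payload follows unchanged
FLAG_COMPRESSED = b"\x01"  # payload follows as zlib-compressed bytes


def pack_frame(raw: bytes, compress: bool = True) -> bytes:
    """Terminal side: prefix a flag byte, optionally compressing the data."""
    if compress:
        return FLAG_COMPRESSED + zlib.compress(raw)
    return FLAG_RAW + raw


def unpack_frame(payload: bytes) -> bytes:
    """Server side: read the flag byte and restore the original bytes."""
    if not payload:
        raise ValueError("empty payload")
    flag, body = payload[:1], payload[1:]
    if flag == FLAG_COMPRESSED:
        return zlib.decompress(body)
    if flag == FLAG_RAW:
        return body
    raise ValueError("unknown frame flag")


# Stand-in bytes; real input would be the captured image or video data.
sample = b"frame-data " * 100
assert unpack_frame(pack_frame(sample)) == sample
assert unpack_frame(pack_frame(sample, compress=False)) == sample
```

Because zlib is lossless, the server recovers exactly the bytes the terminal sent in both modes.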

[0186] The specific processing unit 290 transmits the result of the specific processing to the smart device 14. In the smart device 14, the control unit 46A causes the output device 40 to output the result of the specific processing. The microphone 38B acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 38B to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.

[0187] Data generation model 58 is a so-called generative AI (Artificial Intelligence). Examples of data generation model 58 include ChatGPT (registered trademark) (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (registered trademark) (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 receives prompts containing instructions, together with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompts, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.

[0188] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the smart device 14.

[0189] [Second Embodiment]

[0190] Figure 3 shows an example of the configuration of the data processing system 210 according to the second embodiment.

[0191] As shown in Figure 3, the data processing system 210 includes a data processing device 12 and smart glasses 214. An example of the data processing device 12 is a server.

[0192] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).

[0193] The smart glasses 214 include a computer 36, a microphone 238, a speaker 240, a camera 42, and a communication interface 44. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, and camera 42 are also connected to the bus 52.

[0194] The microphone 238 receives voice signals from the user 20 and receives instructions from the user 20. The microphone 238 captures the voice signals from the user 20, converts the captured voice into audio data, and outputs it to the processor 46. The speaker 240 outputs audio according to the instructions from the processor 46.

[0195] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).

[0196] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.

[0197] Figure 4 shows an example of the main functions of the data processing device 12 and the smart glasses 214. As shown in Figure 4, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.

[0198] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.

[0199] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.

[0200] In the smart glasses 214, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.

[0201] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the smart glasses 214 will be referred to as the "terminal".

[0202] This invention relates to a skeletal analysis system for determining the most suitable type of exercise for a child. The user saves a high-resolution image of the child's entire body on their device and sends it to the server through the system's application. The server extracts skeletal features from the received image using image processing technology. Specifically, it automatically analyzes features such as joint positions and limb sizes using computer vision technology and machine learning algorithms.

[0203] The server then uses these skeletal features to compare and match them with data on past successful athletes pre-registered in its database. This database contains a wealth of information on successful athletes in various sports and the physical characteristics associated with them. Based on the matching results, the most suitable type of exercise for the child is determined.
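The comparison described above can be illustrated with a nearest-neighbor search over feature vectors. The Python sketch below is a minimal example under stated assumptions: the three body-proportion features, the conversion of Euclidean distance into a similarity value, and every number in the sample database are hypothetical and do not come from the disclosure or from real athletes.

```python
import math
from typing import List, Sequence, Tuple

AthleteRecord = Tuple[str, Sequence[float]]  # (sport, feature vector)


def similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Map Euclidean distance to a similarity in (0, 1]; 1 means identical."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return 1.0 / (1.0 + math.dist(a, b))


def best_match(child: Sequence[float],
               athletes: List[AthleteRecord]) -> Tuple[str, float]:
    """Return the sport of the most similar athlete and the similarity."""
    if not athletes:
        raise ValueError("athlete database is empty")
    sport, features = max(athletes, key=lambda rec: similarity(child, rec[1]))
    return sport, similarity(child, features)


# Assumed features: leg/height, arm span/height, shoulder/hip width ratio.
database = [
    ("basketball", [0.53, 1.06, 1.28]),
    ("swimming", [0.48, 1.08, 1.45]),
    ("gymnastics", [0.46, 0.98, 1.35]),
]
child_features = [0.52, 1.04, 1.30]
sport, score = best_match(child_features, database)
print(sport)
```

Using ratios rather than raw lengths keeps the comparison independent of image scale and of the child's current height.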

[0204] The assessment results are transmitted in real time from the server to the user's device, allowing the user to review the recommended sports. For example, if the server determines that "basketball is suitable" based on specific skeletal characteristics, this information, along with associated benefits and success stories, is sent to the user. This system helps users effectively select sports based on their child's aptitude.

[0205] The following describes the processing flow.

[0206] Step 1:

[0207] The user uploads a high-resolution image of the child's entire body to the device. The device selects the image and prepares to upload it through the system's application.

[0208] Step 2:

[0209] The terminal sends the image file specified by the user to the server. The server receives the transmitted data and verifies that it is in the correct format and resolution. If necessary, it performs preprocessing to convert the image to a size and format suitable for analysis.

[0210] Step 3:

[0211] The server extracts skeletal features based on pre-processed image data. Specifically, it uses computer vision technology to execute algorithms that extract skeletal information such as joint positions and limb lengths from the images.

[0212] Step 4:

[0213] The server compares the extracted skeletal features with data on past successful athletes stored in a database. The matching process evaluates skeletal similarity and features that influence athletic suitability.

[0214] Step 5:

[0215] The server analyzes the matching results and determines which sports are suitable for the child. The determination uses an algorithm that takes into account sports in which athletes with similar skeletal structures have performed well, as well as their level of success.

[0216] Step 6:

[0217] The server sends information about the determined exercise type to the terminal. The terminal displays the recommended exercise type and related information on the screen so that the user can easily check the determination result.
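Step 5 above states that the determination considers both the sports in which similar athletes performed well and their level of success. One way to sketch this in Python is a weighted vote, in which each matched athlete contributes similarity multiplied by success level to the total for their sport. The function name score_sports, the 0-to-1 scales, and the sample values are assumptions for illustration only.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Match = Tuple[str, float, float]  # (sport, similarity, success_level)


def score_sports(matches: Iterable[Match]) -> Tuple[str, Dict[str, float]]:
    """Sum similarity * success per sport and return the top sport."""
    totals: Dict[str, float] = defaultdict(float)
    for sport, similarity, success_level in matches:
        totals[sport] += similarity * success_level
    if not totals:
        raise ValueError("no matches to score")
    best = max(totals, key=totals.get)
    return best, dict(totals)


# Hypothetical matches returned by the comparison in Step 4.
matches = [
    ("basketball", 0.9, 1.0),
    ("basketball", 0.8, 0.5),
    ("volleyball", 0.95, 0.6),
    ("swimming", 0.4, 1.0),
]
best, totals = score_sports(matches)
print(best)
```

With this weighting, a single very similar athlete with modest success does not outweigh several similar and highly successful athletes in another sport.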

[0218] (Example 1)

[0219] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server" and the smart glasses 214 will be referred to as the "terminal".

[0220] Traditionally, methods for selecting the optimal exercise based on an individual's physical characteristics have been limited, and there is a lack of clear criteria and analytical techniques. As a result, athletes and their coaches are forced to rely on empirical rules when deciding on an exercise. To resolve this problem, there is a need for a system that objectively selects exercise based on reliable data.

[0221] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.

[0222] In this invention, the server includes means for acquiring image information, means for analyzing body structure features from the acquired image information, and means for comparing the analyzed body structure features with information from past successful exercise participants. This makes it possible to identify the optimal type of exercise based on an individual's physical characteristics.

[0223] "Image information" refers to photographs and videos acquired as visual data.

[0224] "Physical structural characteristics" refer to an individual's physical features, such as the arrangement of their skeleton and muscles, and the position and size of their joints.

[0225] "Machine learning technology" is a technique in which computers find patterns and rules from data and automatically learn from that experience.

[0226] "Type of exercise" refers to a category of specific exercise or sport within sports or fitness.

[0227] "Analysis" is the process of breaking down specific information or data into smaller parts to reveal its content and structure.

[0228] "Specification" refers to the act of clarifying something based on specific requirements or conditions.

[0229] "Comparison" is the act of comparing multiple things to examine their differences and characteristics.

[0230] "Provision" refers to the act of providing information or services to users.

[0231] This invention is a system for determining the most suitable type of exercise for a child. The user first takes a full-body photograph or high-resolution image of the child and saves it to their device. Then, the user uses the system's application to send the image to a server. In this process, the device must be equipped with communication technology to reliably transmit the captured image to the server.

[0232] The server first processes the received images using tools such as OpenCV or TensorFlow to analyze body structure features. Specifically, it automatically extracts features such as joint positions and limb sizes. Computer vision and machine learning technologies play a crucial role at this stage.
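Once joint positions are available, limb sizes follow from simple geometry. The Python sketch below assumes that an upstream pose estimator (not shown, and not specified here) has already produced 2D pixel coordinates for a handful of joints. The joint names, the function skeletal_features, and the sample coordinates are hypothetical.

```python
import math
from typing import Dict, Tuple

Point = Tuple[float, float]  # (x, y) in image pixels


def skeletal_features(joints: Dict[str, Point]) -> Dict[str, float]:
    """Compute arm and leg lengths as fractions of body height."""
    # Height is approximated by the vertical head-to-ankle distance.
    height = abs(joints["ankle"][1] - joints["head"][1])
    if height == 0:
        raise ValueError("head and ankle are at the same height")
    arm = (math.dist(joints["shoulder"], joints["elbow"])
           + math.dist(joints["elbow"], joints["wrist"]))
    leg = (math.dist(joints["hip"], joints["knee"])
           + math.dist(joints["knee"], joints["ankle"]))
    # Dividing by height makes the features independent of image scale.
    return {"arm_to_height": arm / height, "leg_to_height": leg / height}


# Made-up keypoints for one side of a standing child.
joints = {
    "head": (100.0, 20.0),
    "shoulder": (120.0, 60.0),
    "elbow": (120.0, 110.0),
    "wrist": (120.0, 155.0),
    "hip": (100.0, 150.0),
    "knee": (100.0, 210.0),
    "ankle": (100.0, 270.0),
}
features = skeletal_features(joints)
print(features)
```

Normalizing by height means that the same child photographed at different resolutions or distances yields the same feature values.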

[0233] Next, the server uses the extracted physical characteristics to compare them with information on successful exercise participants stored in its database, searching for past cases with similar physical characteristics. The database contains a wealth of successful cases across various exercise disciplines, and this information is used to determine eligibility for specific exercise disciplines.

[0234] The determined sport is notified to the user's device in real time, allowing the user to consider suitable sport options for their child based on this information. For example, if the server determines that "basketball is suitable," the user will also be presented with its benefits and related success stories.

[0235] Furthermore, by inputting a prompt for the generative AI model such as, "Please describe a system that determines the optimal sport for a child by analyzing their physical characteristics using computer vision and machine learning," it becomes possible to utilize the detailed processing content of the system for learning and tuning. This invention supports the objective and effective selection of sports based on individual physical characteristics.

[0236] The flow of the specific processing in Example 1 will be explained using Figure 11.

[0237] Step 1:

[0238] The user saves an image of their child's entire body to their device and sends it to the server via the app. Specifically, the user uses the app's "Send Image" function to select an image from their device folder and upload it. The input for this step is the image data stored on the user's device, and the output is the image data transferred to the server.

[0239] Step 2:

[0240] The server applies image processing techniques to analyze the received image data. Specifically, the server uses the OpenCV or TensorFlow library to perform image analysis to identify joint positions and limb sizes. The input for this step is image data stored on the server, and the output is anatomical features extracted from the images.

[0241] Step 3:

[0242] The server compares the extracted physical characteristics with information from past successful exercise participants in the database. Specifically, the server executes SQL queries to search for entries with similar physical characteristics. The input for this step is the extracted physical characteristics, and the output is a list of data on sports participants with high similarity.

[0243] Step 4:

[0244] The server determines the most suitable exercise based on the comparison results. Specifically, the server uses machine learning techniques (e.g., a random forest classifier) to evaluate the suitability score for each sport and selects the sport with the highest score. The input for this step is similar entries and their feature data, and the output is the determined exercise.

[0245] Step 5:

[0246] The server notifies the user's device in real time of the determined exercise type. Specifically, it pushes information using WebSocket or a notification service, and displays it in the app on the device. The input for this step is information about the determined exercise type, and the output is recommended exercise type information displayed on the user's device.

[0247] Step 6:

[0248] The user checks the judgment results and related information on the terminal screen and refers to the presented success stories and benefits of the sport. Specifically, the user accesses the judgment results by pressing the "Show Results" button. The input for this step is the judgment result data sent from the server, and the output is the information and instructions provided to the user.
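Step 3 above mentions SQL queries that search for entries with similar physical characteristics. The following Python sketch shows what such a query could look like using the standard sqlite3 module. The table name athletes, its columns, the tolerance of 0.03, and the inserted rows are all assumptions made for this example. The disclosure does not describe a schema.

```python
import sqlite3
from typing import List, Tuple


def find_similar(conn: sqlite3.Connection, leg_ratio: float,
                 arm_ratio: float, tolerance: float = 0.03) -> List[Tuple]:
    """Return athletes whose ratios lie within the tolerance, closest first."""
    query = """
        SELECT athlete_id, sport, leg_ratio, arm_ratio
        FROM athletes
        WHERE leg_ratio BETWEEN ? AND ?
          AND arm_ratio BETWEEN ? AND ?
        ORDER BY ABS(leg_ratio - ?) + ABS(arm_ratio - ?)
    """
    params = (leg_ratio - tolerance, leg_ratio + tolerance,
              arm_ratio - tolerance, arm_ratio + tolerance,
              leg_ratio, arm_ratio)
    return conn.execute(query, params).fetchall()


# In-memory database with made-up rows, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE athletes (
    athlete_id INTEGER PRIMARY KEY, sport TEXT,
    leg_ratio REAL, arm_ratio REAL)""")
conn.executemany(
    "INSERT INTO athletes (sport, leg_ratio, arm_ratio) VALUES (?, ?, ?)",
    [("basketball", 0.53, 0.40), ("soccer", 0.50, 0.37),
     ("gymnastics", 0.45, 0.36)],
)
rows = find_similar(conn, leg_ratio=0.52, arm_ratio=0.39)
print([row[1] for row in rows])
```

The parameterized query avoids building SQL from strings, and the resulting rows could serve as the "similar entries" passed to the classifier in Step 4.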

[0249] (Application Example 1)

[0250] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server," and the smart glasses 214 will be referred to as the "terminal."

[0251] Finding the right type of exercise for children to promote physical activity and improve athletic ability is a challenge for many parents and educators. Traditional methods lack a detailed approach based on individual physical characteristics, making it difficult to select the appropriate exercise for each child. As a result, inefficient training and a lack of consistency can occur.

[0252] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.

[0253] In this invention, the server includes means for receiving image data, means for extracting the skeletal features of a target from the received image data, means for comparing the extracted skeletal features with data of past successful athletes, means for determining an appropriate exercise for the target based on the comparison results, means for outputting the determination results, means for proposing an appropriate exercise program based on the output determination results, and voice output means for guiding the proposed exercise program. This makes it possible to select the optimal exercise based on each individual's skeletal features and provide effective feedback.

[0254] "Image data" refers to digital data containing visual information, in a format that can be processed on computers and digital devices.

[0255] "Skeletal characteristics" refer to information about the skeleton of an object, including physical structural features such as the position of joints and the length of limbs, and are fundamental data for evaluating the motor skills of humans and animals.

[0256] A "means of comparison" is a technology that has the function of comparing multiple data sets against each other based on specific criteria, making it possible to evaluate similarities and differences.

[0257] "Output means" refers to a method for providing processed data or information to a user, and is a device or technology for displaying or transmitting information visually or audibly.

[0258] An "exercise program" is a plan that includes a series of exercises or training sessions aimed at achieving specific athletic abilities or health goals, and is designed according to individual needs and aptitudes.

[0259] "Audio output means" refers to a device or technology for transmitting information using sound, and can transmit information through speech synthesis or speakers.

[0260] This system is a platform for providing children with optimal exercise programs and activities via home robots and smart devices. The server has the functionality to receive and process image data from the user's smart device or robot.

[0261] The server first processes the image data using the OpenCV library and extracts skeletal features based on computer vision techniques. This process yields important structural information such as joint positions and limb lengths.

[0262] Next, the extracted skeletal features are analyzed using a machine learning algorithm powered by TensorFlow. The analysis then compares these features with a database of past successful athletes to determine the most suitable exercise for the user.

[0263] The determined exercise type and the corresponding exercise program are output from the server to the terminal or robot. The prompt used may be something like, "Please suggest an appropriate exercise type based on the skeletal characteristics of the person in the photograph."
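For illustration, a prompt of this kind could be assembled in code before it is passed to a generative AI model. The Python sketch below only builds the text and does not call any model. Appending the extracted feature values to the prompt is an assumption of this example, as are the function name build_prompt and the feature names.

```python
from typing import Dict

BASE_PROMPT = ("Please suggest an appropriate exercise type based on the "
               "skeletal characteristics of the person in the photograph.")


def build_prompt(features: Dict[str, float]) -> str:
    """Combine the base instruction with a list of skeletal feature values."""
    if not features:
        # Without extracted features, fall back to the instruction alone.
        return BASE_PROMPT
    lines = [f"- {name}: {value:.2f}"
             for name, value in sorted(features.items())]
    return BASE_PROMPT + "\nSkeletal features:\n" + "\n".join(lines)


# Hypothetical feature values produced by the extraction step.
prompt = build_prompt({"leg_to_height": 0.48, "arm_to_height": 0.38})
print(prompt)
```

Sorting the feature names keeps the prompt text stable between runs, which makes the model's responses easier to compare.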

[0264] The terminal or robot has the function of guiding the user through the assessment results visually and audibly. For example, the robot could suggest, "Jogging is suitable today. Let's run together in a nearby park," and actively encourage the user to exercise.

[0265] This system allows users to always access the latest and most personalized exercise programs, enabling them to train more healthily and efficiently.

[0266] The flow of a specific process in Application Example 1 will be explained using Figure 12.

[0267] Step 1:

[0268] The device captures a full-body image of the user's child with its camera and generates high-resolution image data. This image data is the input sent to the server.

[0269] Step 2:

[0270] The server processes the image data received from the terminal using the OpenCV library to extract skeletal features from the image. The data processing performed here analyzes characteristics such as joint positions and limb lengths from the image, and the output is skeletal feature data.

[0271] Step 3:

[0272] The server inputs the extracted skeletal features into a machine learning model using TensorFlow and performs analysis. The prompt might be, "Suggest a suitable exercise based on the skeletal features of the person in the photograph," and the generative AI model determines the appropriate exercise. The output is the determined optimal exercise.

[0273] Step 4:

[0274] The server sends the judgment result to the terminal or robot. The terminal receives this information and processes the data to display it visually to the user (for example, on a display). If voice guidance is also provided, the output is converted into voice format using text-to-speech technology.

[0275] Step 5:

[0276] Users check the exercise program provided via display or audio from the device and then engage in the actual exercise. The device may also record the user's responses and feedback, and may output data that can be used to adjust future programs.

[0277] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the estimated emotions.

[0278] This invention combines an emotion engine with a system that determines the most suitable type of exercise for a child, enabling more precise sports instruction. The user first loads an image of the child's entire body onto their device and sends it to the server via the system's application. The server receives this image and uses computer vision technology to extract the child's skeletal features.

[0279] The extracted skeletal information is compared with data on past successful athletes stored in a database on the server to determine the most suitable sport. In addition to the usual sport selection, this invention improves the accuracy of the selection by using an emotion engine. The server recognizes the child's real-time emotional state through sensors and cameras installed on the user's terminal. This emotional data is used as auxiliary information to adjust the suitability of the selected sport.

[0280] For example, even if the server determines that basketball is suitable based on a child's skeletal characteristics, the emotion engine analyzes the child's emotional state. If it detects no signs of strong interest or enjoyment, the server suggests another sport (such as track and field). The server sends this information to the terminal, which displays the determined sport along with supplementary information related to emotions. This allows the user to make a final decision on a sport that takes into account not only physical aptitude but also emotional interest and motivation.

[0281] The following describes the processing flow.

[0282] Step 1:

[0283] The user imports an image in which the child's whole body is clearly displayed into the terminal. The terminal provides an interface that allows the user to easily check if the child's posture is correctly captured and prompts the user to upload the image to the system.

[0284] Step 2:

[0285] The terminal sends the image file selected by the user to the server. The server saves the received image data in an appropriate resolution and format and ensures the image quality by performing initial processing.

[0286] Step 3:

[0287] The server extracts skeletal features from the image using computer vision technology. This includes a process of utilizing image analysis algorithms to identify the positions of joints, the lengths of limbs, and the overall skeletal ratios.

[0288] Step 4:

[0289] The server uses the extracted skeletal features to match against the data of past successful athletes in the existing database. The matching algorithm calculates the similarity and identifies the most suitable sports category.

[0290] Step 5:

[0291] The terminal activates the emotion engine using an instruction from the user or the server. Cameras and sensors are involved in measuring the user's emotional state, and the pattern of emotions displayed by the user is collected in real-time.

[0292] Step 6:

[0293] The server takes in the analysis results of the emotion engine and evaluates the emotional suitability for the determined sports category. For example, it further evaluates whether the determined sport can elicit the child's joy or excitement and adjusts the determination result if necessary.

[0294] Step 7:

[0295] The server sends the final determined exercise category and additional information based on the emotional assessment to the device. The device provides visualized feedback so that the user can easily view this information. Based on this information, the user can make more appropriate choices regarding sports activities for their child.
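The adjustment in Steps 6 and 7 above can be sketched as a threshold-based fallback. This is only one possible reading: the interest scores in the range 0 to 1, the `threshold` value, and the function name `adjust_for_emotion` are assumptions introduced for illustration, and the disclosure does not specify how the emotion engine's output is encoded.

```python
def adjust_for_emotion(ranked_sports, interest, threshold=0.5):
    """Pick the highest-ranked sport whose interest score meets the threshold.

    ranked_sports: sport names ordered from best to worst skeletal match.
    interest: mapping from sport name to an interest score in [0, 1]
              (hypothetical output of the emotion engine).
    Sports without a score are treated as neutral (score equal to threshold).
    """
    if not ranked_sports:
        raise ValueError("ranked_sports must not be empty")
    for sport in ranked_sports:
        if interest.get(sport, threshold) >= threshold:
            return sport
    # No sport meets the threshold: keep the best skeletal match.
    return ranked_sports[0]


ranked = ["basketball", "track and field", "swimming"]
choice = adjust_for_emotion(ranked, {"basketball": 0.2, "track and field": 0.8})
```

In this example the skeletal match favors basketball, but the low interest score moves the suggestion to track and field, mirroring the case described in paragraph [0280].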

[0296] (Example 2)

[0297] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server" and the smart glasses 214 will be referred to as the "terminal".

[0298] Conventional sports selection systems rely solely on physical characteristics, failing to consider the subject's emotions or interests, and therefore may not be able to suggest the most suitable sport. This highlights the challenge of developing highly accurate sports selection systems that consider both physical aptitude and emotional factors.

[0299] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.

[0300] In this invention, the server includes means for acquiring and transmitting image data, means for extracting the skeletal features of a target from the received image data using an analysis device, means for comparing the extracted skeletal features with data of past successful athletes in an information base, means for recognizing the target's emotional state using sensors and an analysis device and adjusting the suitability of the sport, and means for outputting the final proposed sport using an output device. This makes it possible to select the optimal sport considering the physical and emotional suitability of the target.

[0301] "Image data" refers to data that records the visual information of an object in digital format and is used to extract skeletal features.

[0302] The "analysis device" is a computer system used to analyze skeletal features from received image data and compare them with information-based data.

[0303] The "information base" is a database that holds data on past successful athletes and is used to compare with skeletal information.

[0304] The "sensor" is a device used to recognize the emotional state of the subject in real time and usually includes a camera, microphone, etc.

[0305] The "output device" is a device used to display the proposed result of the final sports event to the user and usually refers to a display screen.

[0306] The present invention is a sports guidance support system that proposes the most suitable sports event for children and makes a determination by combining physical characteristics and emotional states. The user captures an image of the child's whole body with a terminal and transmits the image data to the server using the system's application. This system uses computer vision technology to extract skeletal features and analyze the emotional state in real time, thereby achieving a more precise determination.

[0307] Specifically, the server uses a visual processing library such as "OpenCV" or "TensorFlow" to extract the skeletal information of the child from the received image data. This information is compared with the data of past successful athletes stored in the database within the server. Sensors such as a camera and microphone are installed on the user's terminal, and the server uses these to analyze the emotional state of the child with technologies such as "EmotionAPI". The emotional data obtained from this analysis is used to improve the accuracy of the sports event proposal.

[0308] For example, even if the server determines from a child's skeletal structure that track and field is suitable, if sentiment analysis indicates that the child has little interest, it can suggest a different sport, such as soccer. This allows users to recommend more appropriate sports activities for their children.

[0309] As an example of a prompt when using a generative AI model, it is possible to input a message in the following format: "Based on an image of a 10-year-old child, determine the most suitable type of exercise for this child. Please also consider emotional data and provide a final suggestion."
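A prompt of this form could be assembled programmatically, for example as follows. This is a hedged sketch: the function name `build_prompt`, the feature names, and the way the skeletal and emotional data are serialized into text are assumptions, and no particular generative AI API is implied.

```python
def build_prompt(age, skeletal_features, emotion_summary):
    """Compose a text prompt from the child's age, skeletal features, and emotion data."""
    # Sort feature names so the prompt is deterministic for the same input.
    feature_text = ", ".join(
        f"{name}={value:.2f}" for name, value in sorted(skeletal_features.items())
    )
    return (
        f"Based on an image of a {age}-year-old child, determine the most suitable "
        "type of exercise for this child. "
        f"Skeletal features: {feature_text}. "
        f"Emotional data: {emotion_summary}. "
        "Please also consider emotional data and provide a final suggestion."
    )


# Illustrative values only.
prompt = build_prompt(10, {"thigh_ratio": 0.67, "arm_ratio": 0.5}, "low interest in running")
```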

[0310] Thus, the present invention is a system that enables the proposal of sports disciplines that comprehensively consider both physical aptitude and emotional interest and motivation.

[0311] The flow of the specific processing in Example 2 will be explained using Figure 13.

[0312] Step 1:

[0313] The user takes a full-body image of the child with their device. The image taken by the user is sent to the server through the system's application. The input is the full-body image data of the child, and the output is the transfer of the image data to the server.

[0314] Step 2:

[0315] The server processes the received image data using computer vision technology. Specifically, it uses "OpenCV" and "TensorFlow" to perform image analysis and extract the skeletal features of the child. The input is the received image data, and the output is generated skeletal feature information.

[0316] Step 3:

[0317] The server matches the extracted skeletal features against an internal database. This database contains physical data of past successful athletes. The input is the skeletal feature information, and the output is a list of candidate sports suited to the child.

[0318] Step 4:

[0319] The server collects real-time emotion data from sensors and cameras connected to the user's device. The server analyzes this data using technologies such as "EmotionAPI" to evaluate the child's emotional state. Real-time emotion data is the input, and emotion evaluation information is obtained as the output.

[0320] Step 5:

[0321] The server integrates candidate exercises based on skeletal structure with emotional evaluation information to propose a final exercise. The input consists of candidate exercises and emotional evaluation information, and the output is the optimal exercise suggested to the user.

[0322] Step 6:

[0323] The server sends the final suggestions to the terminal, which then provides this information to the user. The user can review the suggestions on the terminal screen and obtain details and advice on sports suitable for their child. The input is the final suggested sports activity, and the output is the presentation of information to the user.
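The integration in Step 5 above could be sketched as a weighted blend of a skeletal suitability score and an emotion score per sport. The score ranges, the `emotion_weight` parameter, and the neutral default of 0.5 are assumptions for illustration; the disclosure does not state how the two kinds of information are combined.

```python
def combine_scores(skeletal_scores, emotion_scores, emotion_weight=0.3):
    """Blend per-sport skeletal and emotion scores (both assumed to lie in [0, 1]).

    Returns (ranking, combined), where ranking lists sports from best to worst.
    Sports without an emotion score receive a neutral 0.5.
    """
    if not 0.0 <= emotion_weight <= 1.0:
        raise ValueError("emotion_weight must be between 0 and 1")
    combined = {}
    for sport, skeletal in skeletal_scores.items():
        emotion = emotion_scores.get(sport, 0.5)
        combined[sport] = (1.0 - emotion_weight) * skeletal + emotion_weight * emotion
    ranking = sorted(combined, key=lambda sport: combined[sport], reverse=True)
    return ranking, combined


# Illustrative scores only.
skeletal = {"track and field": 0.9, "soccer": 0.8}
emotion = {"track and field": 0.1, "soccer": 0.9}
ranking, combined = combine_scores(skeletal, emotion)
```

With these illustrative numbers, soccer overtakes track and field once emotion is considered, which corresponds to the case described in paragraph [0308]. Setting `emotion_weight` to 0 reduces the result to the purely skeletal ranking.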

[0324] (Application Example 2)

[0325] Next, we will explain application example 2. In the following explanation, the data processing device 12 will be referred to as the "server" and the smart glasses 214 as the "terminal".

[0326] Traditional sports coaching systems are limited to determining the type of sport based on physical aptitude, making it difficult to provide precise instruction that takes into account individual emotional states. Furthermore, if the suggested sport does not interest the child, it may fail to cultivate sustained motivation, hindering the overall improvement of athletic ability.

[0327] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.

[0328] In this invention, the server includes means for receiving image information, means for extracting the skeletal features of the target, means for comparing with past data, means for evaluating the emotional state, and means for modifying the judgment result according to the emotional state. This makes it possible to suggest exercises that take into account not only physical aptitude but also emotional motivation and interest, enabling more precise sports instruction that encourages participation.

[0329] "Image information" refers to visual data that has been converted into a format that can be processed by computers and digital devices.

[0330] "Skeletal features" refer to information related to the shape, arrangement, and structure of the skeleton of an object.

[0331] The term "athlete" refers to a person who participates in a specific physical activity or sport.

[0332] "Data" refers to a set of facts, values, or instructions that have been formalized for processing by a computer or other system.

[0333] "Physical activity" refers to exercise and behavior that involves moving the body.

[0334] "Emotional state" refers to an index that quantifies the situation that represents an individual's emotional response or mood.

[0335] "Image analysis technology" refers to a series of methods and techniques used to extract meaningful information from image data.

[0336] The system for realizing this application consists of a program that performs the following processes. First, the user has a robot in the house acquire image information of the child using sensors and a camera. The robot captures video using, for example, an Intel RealSense camera, and this video information is sent to a terminal. The terminal transfers this information to a cloud server, which extracts skeletal features using image analysis technology. Rekognition from Amazon Web Services (AWS) is used for image analysis.

[0337] Next, the server compares the extracted skeletal features with data on past athletes stored in a database. This comparison utilizes algorithms that perform pattern recognition on similar data. Furthermore, the robot assesses the child's emotional state using a heart rate sensor and sends this data to the server. Microsoft Azure's Sentiment Analysis API is used for this emotion recognition.

[0338] The server combines this information to determine the optimal physical activity and outputs the result to the user's device. The device displays the determination result along with advice on the type of exercise and an assessment of the emotional state, providing suggestions for the final exercise plan. This system allows children to choose and participate in sports activities based not only on their physical aptitude but also on their emotional interests.

[0339] As a concrete example, a robot that has received movement information can provide feedback such as, "This child is suited to individual running events, but judging from their emotional state, they don't seem very interested. How about trying soccer?" An example of a prompt using a generative AI model is as follows:

[0340] "Please suggest the most suitable sport based on the child's interests and physical aptitude. Current emotional state data indicates a positive interest."

[0341] The flow of a specific process in Application Example 2 will be explained using Figure 14.

[0342] Step 1:

[0343] The user instructs the robot to acquire image information. The robot, equipped with an Intel RealSense camera, captures images of the child before exercise. The acquired image information is transmitted to the terminal in real time. The input is video data from the camera, and the output is raw data transferred to the terminal.

[0344] Step 2:

[0345] The terminal transfers the received image information to the cloud server. Data compression may occur during transmission, but it may also be sent in its original format. The input is the video data acquired in step 1, and the output is a notification that the transfer to the server is complete.

[0346] Step 3:

[0347] The server processes the received image information and extracts the skeletal features of the target using image analysis technology. In this process, it utilizes tools such as AWS Rekognition to identify key skeletal features from the image data. The input is video data sent to the cloud server, and the output is the extracted skeletal information.

[0348] Step 4:

[0349] The server compares the extracted skeletal information with past athlete data. A pattern recognition algorithm is used to calculate similarity. The input is the skeletal information obtained in step 3 and the database data, and the output is the determination of the most similar type of exercise.

[0350] Step 5:

[0351] The user initiates data collection using a heart rate sensor so that the robot can analyze the child's emotional state. The robot sends the acquired heart rate data to the terminal, which then forwards it to the server. The input is the heart rate data from the robot, and the output is the emotional data sent to the server.

[0352] Step 6:

[0353] The server evaluates emotional states and reflects this in the sports activity selection results. It uses the Microsoft Azure Sentiment Analysis API to interpret emotional data and re-evaluate interest in and suitability for sports. Inputs are emotional data and the selection results from step 4, and output is a final sports activity suggestion modified based on emotions.

[0354] Step 7:

[0355] The server sends the final exercise suggestions to the terminal, which then displays this information to the user. Along with the assessment result, advice based on the user's emotional state is displayed. The input is the suggestion result from step 6, and the output is the information on the recommended exercises displayed on the user's screen.
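Step 2 of the flow above allows the image information to be sent either compressed or in its original format. One way to let the receiver handle both cases is an envelope that records whether compression was applied. The envelope fields and function names below are hypothetical, and zlib is used only as a stand-in for whatever compression the actual system would employ.

```python
import zlib


def pack_frame(raw, compress=True):
    """Wrap raw bytes in an envelope that records whether zlib compression was applied."""
    body = zlib.compress(raw) if compress else raw
    return {"compressed": compress, "original_size": len(raw), "body": body}


def unpack_frame(envelope):
    """Recover the raw bytes from an envelope produced by pack_frame."""
    body = envelope["body"]
    raw = zlib.decompress(body) if envelope["compressed"] else body
    if len(raw) != envelope["original_size"]:
        raise ValueError("size mismatch after unpacking")
    return raw


# Placeholder payload standing in for one frame of camera data.
frame = bytes(1000)
```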

[0356] The specific processing unit 290 transmits the result of the specific processing to the smart glasses 214. In the smart glasses 214, the control unit 46A causes the speaker 240 to output the result of the specific processing. The microphone 238 acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.

[0357] The data generation model 58 is a so-called generative AI (Artificial Intelligence). Examples of the data generation model 58 include generative AI such as ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 receives a prompt containing instructions together with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompt, and outputs the inference result in a data format such as audio data or text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.

[0358] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the smart glasses 214.

[0359] [Third Embodiment]

[0360] Figure 5 shows an example of the configuration of the data processing system 310 according to the third embodiment.

[0361] As shown in Figure 5, the data processing system 310 includes a data processing device 12 and a headset terminal 314. An example of the data processing device 12 is a server.

[0362] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).

[0363] The headset terminal 314 includes a computer 36, a microphone 238, a speaker 240, a camera 42, a communication interface 44, and a display 343. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, camera 42, and display 343 are also connected to the bus 52.

[0364] The microphone 238 receives voice signals from the user 20 and receives instructions from the user 20. The microphone 238 captures the voice signals from the user 20, converts the captured voice into audio data, and outputs it to the processor 46. The speaker 240 outputs audio according to the instructions from the processor 46.

[0365] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).

[0366] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.

[0367] Figure 6 shows an example of the main functions of the data processing device 12 and the headset terminal 314. As shown in Figure 6, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.

[0368] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.

[0369] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.

[0370] In the headset terminal 314, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.

[0371] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the headset terminal 314 will be referred to as the "terminal".

[0372] This invention relates to a skeletal analysis system for determining the most suitable type of exercise for a child. The user saves a high-resolution image of the child's entire body on their device and sends it to the server through the system's application. The server extracts skeletal features from the received image using image processing technology. Specifically, it automatically analyzes features such as joint positions and limb sizes using computer vision technology and machine learning algorithms.

[0373] The server then uses these skeletal features to compare and match them with data on past successful athletes pre-registered in its database. This database contains a wealth of information on successful athletes in various sports and the physical characteristics associated with them. Based on the matching results, the most suitable type of exercise for the child is determined.
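The matching described above can be sketched as a nearest-neighbor search followed by a vote weighted by similarity and by each athlete's level of success. This is an illustrative assumption rather than the disclosed algorithm: the feature vectors, the `1 / (1 + distance)` similarity, the `top_k` parameter, and the success levels are all hypothetical.

```python
import math
from collections import defaultdict


def similarity(a, b):
    """Map Euclidean distance between two feature vectors to a similarity in (0, 1]."""
    return 1.0 / (1.0 + math.dist(a, b))


def rank_sports(child, athletes, top_k=3):
    """Rank sports by votes from the top_k most similar athletes.

    athletes: list of (sport, feature_vector, success_level) tuples.
    Each vote is weighted by similarity multiplied by success_level.
    """
    scored = sorted(
        ((similarity(child, vec), sport, success) for sport, vec, success in athletes),
        reverse=True,
    )[:top_k]
    votes = defaultdict(float)
    for sim, sport, success in scored:
        votes[sport] += sim * success
    return sorted(votes.items(), key=lambda item: item[1], reverse=True)


# Made-up records for illustration; real data would come from the database.
athletes = [
    ("basketball", (0.55, 0.70, 0.65), 1.0),
    ("basketball", (0.54, 0.72, 0.66), 0.8),
    ("gymnastics", (0.45, 0.55, 0.50), 1.0),
    ("swimming", (0.52, 0.60, 0.58), 0.9),
]
child = (0.55, 0.71, 0.65)
ranking = rank_sports(child, athletes)
```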

[0374] The assessment results are transmitted in real time from the server to the user's device, allowing the user to review the recommended sports. For example, if the server determines that "basketball is suitable" based on specific skeletal characteristics, this information, along with associated benefits and success stories, is sent to the user. This system helps users effectively select sports based on their child's aptitude.

[0375] The following describes the processing flow.

[0376] Step 1:

[0377] The user uploads a high-resolution image of the child's entire body to the device. The device selects the image and prepares to upload it through the system's application.

[0378] Step 2:

[0379] The terminal sends the image file specified by the user to the server. The server receives the transmitted data and verifies that it is in the correct format and resolution. If necessary, it performs preprocessing to convert the image to a size and format suitable for analysis.

[0380] Step 3:

[0381] The server extracts skeletal features based on pre-processed image data. Specifically, it uses computer vision technology to execute algorithms that extract skeletal information such as joint positions and limb lengths from the images.

[0382] Step 4:

[0383] The server compares the extracted skeletal features with data on past successful athletes stored in a database. The matching process evaluates skeletal similarity and features that influence athletic suitability.

[0384] Step 5:

[0385] The server analyzes the matching results and determines which sports are suitable for the child. The determination uses an algorithm that takes into account sports in which athletes with similar skeletal structures have performed well, as well as their level of success.

[0386] Step 6:

[0387] The server sends information about the determined exercise type to the terminal. The terminal displays the recommended exercise type and related information on the screen so that the user can easily check the determination result.
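Step 2 of the flow above has the server verify the format and resolution of the received image. As a minimal sketch, assuming the image is a PNG file, the check can read the width and height from the IHDR chunk that follows the PNG signature. The minimum dimensions and the function names are assumptions for illustration; the disclosure does not name an image format or a resolution threshold.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_dimensions(data):
    """Return (width, height) read from the IHDR chunk of a PNG byte string."""
    # Layout: 8-byte signature, 4-byte chunk length, 4-byte type "IHDR",
    # then 4-byte big-endian width and 4-byte big-endian height.
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG image")
    width, height = struct.unpack(">II", data[16:24])
    return width, height


def is_acceptable(data, min_width=480, min_height=640):
    """Return True if data is a PNG whose dimensions meet the (assumed) minimums."""
    try:
        width, height = png_dimensions(data)
    except ValueError:
        return False
    return width >= min_width and height >= min_height


def fake_png_header(width, height):
    """Build only the leading bytes of a PNG, enough to exercise the check."""
    return PNG_SIGNATURE + struct.pack(">I", 13) + b"IHDR" + struct.pack(">II", width, height)
```

A production system would more likely rely on an image library to decode and resize the file; the header check here only illustrates the validation step.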

[0388] (Example 1)

[0389] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."

[0390] Traditionally, methods for selecting the optimal exercise based on an individual's physical characteristics have been limited, and there is a lack of clear criteria and analytical techniques. As a result, athletes and their coaches are forced to rely on empirical rules when deciding on an exercise. To resolve this problem, there is a need for a system that objectively selects exercise based on reliable data.

[0391] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.

[0392] In this invention, the server includes means for acquiring image information, means for analyzing body structure features from the acquired image information, and means for comparing the analyzed body structure features with information from past successful exercise participants. This makes it possible to identify the optimal type of exercise based on an individual's physical characteristics.

[0393] "Image information" refers to photographs and videos acquired as visual data.

[0394] "Body structure features" refer to an individual's physical features, such as the arrangement of their skeleton and muscles, and the position and size of their joints.

[0395] "Machine learning technology" is a technique in which computers find patterns and rules from data and automatically learn from that experience.

[0396] "Type of exercise" refers to a category of specific exercise or sport within sports or fitness.

[0397] "Analysis" is the process of breaking down specific information or data into smaller parts to reveal its content and structure.

[0398] "Specification" refers to the act of clarifying something based on specific requirements or conditions.

[0399] "Comparison" is the act of comparing multiple things to examine their differences and characteristics.

[0400] "Provision" refers to the act of providing information or services to users.

[0401] This invention is a system for determining the most suitable type of exercise for a child. The user first takes a full-body photograph or high-resolution image of the child and saves it to their device. Then, the user uses the system's application to send the image to a server. In this process, the device must be equipped with communication technology to reliably transmit the captured image to the server.

[0402] The server first processes the received images using tools such as OpenCV or TensorFlow to analyze body structure features. Specifically, it automatically extracts features such as joint positions and limb sizes. Computer vision and machine learning technologies play a crucial role at this stage.

[0403] Next, the server uses the extracted physical characteristics to compare them with information on successful exercise participants stored in its database, searching for past cases with similar physical characteristics. The database contains a wealth of successful cases across various exercise disciplines, and this information is used to determine eligibility for specific exercise disciplines.

[0404] The determined sport is notified to the user's device in real time, allowing the user to consider suitable sport options for their child based on this information. For example, if the server determines that "basketball is suitable," the user will also be presented with its benefits and related success stories.

[0405] Furthermore, by inputting a prompt for the generative AI model such as, "Please describe a system that determines the optimal sport for a child by analyzing their physical characteristics using computer vision and machine learning," it becomes possible to utilize the detailed processing content of the system for learning and tuning. This invention supports the objective and effective selection of sports based on individual physical characteristics.

[0406] The flow of the specific processing in Example 1 will be explained using Figure 11.

[0407] Step 1:

[0408] The user saves an image of their child's entire body to their device and sends it to the server via the app. Specifically, the user uses the app's "Send Image" function to select an image from their device folder and upload it. The input for this step is the image data stored on the user's device, and the output is the image data transferred to the server.

[0409] Step 2:

[0410] The server applies image processing techniques to analyze the received image data. Specifically, the server uses the OpenCV or TensorFlow library to perform image analysis to identify joint positions and limb sizes. The input for this step is image data stored on the server, and the output is anatomical features extracted from the images.

[0411] Step 3:

[0412] The server compares the extracted physical characteristics with information from past successful exercise participants in the database. Specifically, the server executes SQL queries to search for entries with similar physical characteristics. The input for this step is the extracted physical characteristics, and the output is a list of data on sports participants with high similarity.

[0413] Step 4:

[0414] The server determines the most suitable exercise based on the comparison results. Specifically, the server uses machine learning techniques (e.g., a random forest classifier) to evaluate the suitability score for each sport and selects the sport with the highest score. The input for this step is similar entries and their feature data, and the output is the determined exercise.

[0415] Step 5:

[0416] The server notifies the user's device in real time of the determined exercise type. Specifically, it pushes information using WebSocket or a notification service, and displays it in the app on the device. The input for this step is information about the determined exercise type, and the output is recommended exercise type information displayed on the user's device.

[0417] Step 6:

[0418] The user checks the judgment results and related information on the terminal screen and refers to the presented success stories and benefits of the sport. Specifically, the user accesses the judgment results by pressing the "Show Results" button. The input for this step is the judgment result data sent from the server, and the output is the information and instructions provided to the user.
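The database search in Step 3 above can be sketched as follows. This is only an illustration under stated assumptions: the disclosure says that SQL queries are used but gives no schema, so the `athletes` table, its `arm_ratio` and `leg_ratio` columns, the sample rows, and the helper names `build_demo_db` and `find_similar` are all hypothetical. Similarity is approximated here by squared Euclidean distance computed inside the query, which is one possible choice rather than the method of the disclosure.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory table of hypothetical athlete records."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE athletes ("
        "name TEXT, sport TEXT, arm_ratio REAL, leg_ratio REAL)"
    )
    rows = [
        ("A", "basketball", 0.46, 0.54),
        ("B", "swimming", 0.48, 0.49),
        ("C", "track and field", 0.42, 0.55),
    ]
    conn.executemany("INSERT INTO athletes VALUES (?, ?, ?, ?)", rows)
    return conn


def find_similar(conn, arm_ratio, leg_ratio, limit=2):
    """Return the closest entries, ordered by squared Euclidean distance."""
    query = (
        "SELECT name, sport, "
        "(arm_ratio - ?) * (arm_ratio - ?) "
        "+ (leg_ratio - ?) * (leg_ratio - ?) AS dist "
        "FROM athletes ORDER BY dist ASC LIMIT ?"
    )
    params = (arm_ratio, arm_ratio, leg_ratio, leg_ratio, limit)
    return conn.execute(query, params).fetchall()


if __name__ == "__main__":
    db = build_demo_db()
    for name, sport, dist in find_similar(db, 0.45, 0.54):
        print(name, sport, round(dist, 4))
```

Parameter placeholders (`?`) are used instead of string formatting so that the extracted feature values are never interpolated directly into the SQL text.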

[0419] (Application Example 1)

[0420] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."

[0421] Finding the right type of exercise for children to promote physical activity and improve athletic ability is a challenge for many parents and educators. Traditional methods lack a detailed approach based on individual physical characteristics, making it difficult to select the appropriate exercise for each child. As a result, inefficient training and a lack of consistency can occur.

[0422] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.

[0423] In this invention, the server includes means for receiving image data, means for extracting the skeletal features of a target from the received image data, means for comparing the extracted skeletal features with data of past successful athletes, means for determining an appropriate exercise for the target based on the comparison results, means for outputting the determination results, means for proposing an appropriate exercise program based on the output determination results, and voice output means for guiding the proposed exercise program. This makes it possible to select the optimal exercise based on each individual's skeletal features and provide effective feedback.

[0424] "Image data" refers to digital data containing visual information, in a format that can be processed on computers and digital devices.

[0425] "Skeletal characteristics" refer to information about the skeleton of an object, including physical structural features such as the position of joints and the length of limbs, and are fundamental data for evaluating the motor skills of humans and animals.

[0426] A "means of comparison" is a technology that has the function of comparing multiple data sets against each other based on specific criteria, making it possible to evaluate similarities and differences.

[0427] "Output means" refers to a method for providing processed data or information to a user, and is a device or technology for displaying or transmitting information visually or audibly.

[0428] An "exercise program" is a plan that includes a series of exercises or training sessions aimed at achieving specific athletic abilities or health goals, and is designed according to individual needs and aptitudes.

[0429] "Audio output means" refers to a device or technology for transmitting information using sound, and can transmit information through speech synthesis or speakers.

[0430] This system is a platform for providing children with optimal exercise programs and activities via home robots and smart devices. The server has the functionality to receive and process image data from the user's smart device or robot.

[0431] The server first processes the image data using the OpenCV library and extracts skeletal features based on computer vision techniques. This process yields important structural information such as joint positions and limb lengths.
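As a minimal sketch of how limb lengths might be derived once joint positions are known, the following Python assumes that a pose estimator (not shown) has already produced 2D pixel coordinates for named joints. The joint names, the `LIMBS` table, and the helpers `limb_lengths` and `normalize_by_height` are hypothetical and are not taken from the disclosure.

```python
import math

# Hypothetical limb definitions: each limb is a pair of joint names.
LIMBS = {
    "upper_arm": ("shoulder", "elbow"),
    "forearm": ("elbow", "wrist"),
    "thigh": ("hip", "knee"),
    "shin": ("knee", "ankle"),
}


def limb_lengths(joints):
    """Return the pixel length of each limb whose two joints were detected."""
    lengths = {}
    for limb, (start, end) in LIMBS.items():
        if start in joints and end in joints:
            lengths[limb] = math.dist(joints[start], joints[end])
    return lengths


def normalize_by_height(lengths, head, ankle):
    """Divide limb lengths by head-to-ankle distance to reduce scale effects."""
    height = math.dist(head, ankle)
    if height == 0:
        raise ValueError("head and ankle coincide; cannot normalize")
    return {limb: value / height for limb, value in lengths.items()}


if __name__ == "__main__":
    detected = {"shoulder": (0, 0), "elbow": (3, 4), "wrist": (3, 10)}
    raw = limb_lengths(detected)
    print(normalize_by_height(raw, head=(0, 0), ankle=(0, 10)))
```

Dividing by a body-height estimate is one way to make the features comparable across photographs taken at different distances; the disclosure does not state which normalization, if any, is applied.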

[0432] Next, the extracted skeletal features are analyzed using a machine learning algorithm powered by TensorFlow. The analysis then compares these features with a database of past successful athletes to determine the most suitable exercise for the user.

[0433] The determined exercise type and the corresponding exercise program are output from the server to the terminal or robot. The prompt used may be something like, "Please suggest an appropriate exercise type based on the skeletal characteristics of the person in the photograph."

[0434] The terminal or robot has the function of guiding the user through the assessment results visually and audibly. For example, the robot could suggest, "Jogging is suitable today. Let's run together in a nearby park," and actively encourage the user to exercise.

[0435] This system allows users to always access the latest and most personalized exercise programs, enabling them to train more healthily and efficiently.

[0436] The flow of a specific process in Application Example 1 will be explained using Figure 12.

[0437] Step 1:

[0438] The device uses its camera to capture a full-body image of the user's child and generates high-resolution image data. This image data is the input data to be sent to the server.

[0439] Step 2:

[0440] The server processes the image data received from the terminal using the OpenCV library to extract skeletal features from the image. The data processing performed here analyzes characteristics such as joint positions and limb lengths from the image, and the output is skeletal feature data.

[0441] Step 3:

[0442] The server inputs the extracted skeletal features into a machine learning model using TensorFlow and performs analysis. The prompt might be, "Suggest a suitable exercise based on the skeletal features of the person in the photograph," and the generative AI model determines the appropriate exercise. The output is the determined optimal exercise.

[0443] Step 4:

[0444] The server sends the judgment result to the terminal or robot. The terminal receives this information and processes the data to display it visually to the user (for example, on a display). If voice guidance is also provided, the output is converted into voice format using text-to-speech technology.

[0445] Step 5:

[0446] Users check the exercise program provided via display or audio from the device and then engage in the actual exercise. The device may also record the user's responses and feedback, and may output data that can be used to adjust future programs.

[0447] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the estimated emotions.

[0448] This invention combines an emotion engine with a system that determines the most suitable type of exercise for a child, enabling more precise sports instruction. The user first uploads an image of the child's entire body to their device and sends it to the server via the system's application. The server receives this image and uses computer vision technology to extract the child's skeletal features.

[0449] The extracted skeletal information is compared with data on past successful athletes stored in a database on the server to determine the most suitable sport. In addition to the usual sport selection, this invention improves the accuracy of the selection by using an emotion engine. The server recognizes the child's real-time emotional state through sensors and cameras installed on the user's terminal. This emotional data is used as auxiliary information to adjust the suitability of the selected sport.

[0450] For example, even if the server determines that basketball is suitable based on a child's skeletal characteristics, the emotion engine will analyze the child's emotional state. If it detects no signs of strong interest or enjoyment, it will suggest another sport (such as track and field). The server sends this information to the terminal, which displays the determined sport along with supplementary information related to emotions. This allows the user to make a final decision on a sport that takes into account not only physical aptitude but also emotional interest and motivation.
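The fallback behavior in this example can be sketched as a simple rule. The sketch assumes that the skeletal comparison yields a ranked list of sports and that the emotion engine yields a per-sport interest estimate between 0 and 1; neither data shape is specified in the disclosure, and the function name `choose_sport` and the 0.5 threshold are hypothetical.

```python
def choose_sport(ranked_sports, interest_scores, min_interest=0.5):
    """Pick the best skeletal match that the child also shows interest in.

    ranked_sports: sports ordered from best to worst skeletal match.
    interest_scores: hypothetical per-sport interest estimates in [0, 1].
    Returns a (sport, note) tuple.
    """
    if not ranked_sports:
        raise ValueError("ranked_sports must not be empty")
    for sport in ranked_sports:
        if interest_scores.get(sport, 0.0) >= min_interest:
            return sport, "matched on physique and interest"
    # No sport clears the threshold: keep the physical best and flag it.
    return ranked_sports[0], "low interest detected; treat as provisional"


if __name__ == "__main__":
    ranking = ["basketball", "track and field"]
    interest = {"basketball": 0.2, "track and field": 0.8}
    print(choose_sport(ranking, interest))
```

Returning a note alongside the sport mirrors the supplementary emotion-related information that the terminal displays with the determined sport.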

[0451] The following describes the processing flow.

[0452] Step 1:

[0453] The user uploads a clear image of the child's entire body to the device. The device provides an interface that allows the user to easily check if the child's posture is correct and prompts the user to upload the image to the system.

[0454] Step 2:

[0455] The terminal sends the image file selected by the user to the server. The server saves the received image data in the appropriate resolution and format, and ensures image quality by performing initial processing.

[0456] Step 3:

[0457] The server uses computer vision technology to extract skeletal features from images. This involves a process that utilizes image analysis algorithms to identify joint locations, limb lengths, and overall skeletal proportions.

[0458] Step 4:

[0459] The server uses the extracted skeletal features to match them against data of past successful athletes in an existing database. The matching algorithm calculates similarity and identifies the most suitable sport.

[0460] Step 5:

[0461] The device activates the emotion engine based on instructions from the user or the server. Cameras and sensors measure the user's emotional state and collect, in real time, the emotional patterns that the user displays.

[0462] Step 6:

[0463] The server takes in the analysis results from the emotion engine and evaluates the emotional suitability for the determined sport. For example, it further evaluates whether the determined sport evokes joy and excitement in children and adjusts the judgment result as needed.

[0464] Step 7:

[0465] The server sends the final determined exercise category and additional information based on the emotional assessment to the device. The device provides visualized feedback so that the user can easily view this information. Based on this information, the user can make more appropriate choices regarding sports activities for their child.

[0466] (Example 2)

[0467] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."

[0468] Conventional sports selection systems rely solely on physical characteristics, failing to consider the subject's emotions or interests, and therefore may not be able to suggest the most suitable sport. This highlights the challenge of developing highly accurate sports selection systems that consider both physical aptitude and emotional factors.

[0469] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.

[0470] In this invention, the server includes means for acquiring and transmitting image data, means for extracting the skeletal features of a target from the received image data using an analysis device, means for comparing the extracted skeletal features with data of past successful athletes in an information base, means for recognizing the target's emotional state using sensors and an analysis device and adjusting the suitability of the sport, and means for outputting the final proposed sport using an output device. This makes it possible to select the optimal sport considering the physical and emotional suitability of the target.

[0471] "Image data" refers to data that records the visual information of an object in digital format and is used to extract skeletal features.

[0472] An "analysis device" is a computer system used to analyze skeletal features from received image data and compare them with information base data.

[0473] An "information base" is a database used to hold data on past successful athletes and to cross-reference it with skeletal information.

[0474] A "sensor" is a device used to recognize the emotional state of an object in real time, and typically includes cameras and microphones.

[0475] An "output device" is a device used to display the final suggested exercise results to the user, and usually refers to a display screen.

[0476] This invention is a sports instruction support system that proposes the most suitable type of exercise for a child, making a determination by combining physical characteristics and emotional state. The user loads a full-body image of the child into a terminal and sends the image data to the server using the system's application. This system uses computer vision technology to extract skeletal features and analyzes the emotional state in real time, thereby achieving a more precise determination.

[0477] Specifically, the server uses visual processing libraries such as "OpenCV" and "TensorFlow" to extract skeletal information of children from received image data. This information is then compared with data on past successful athletes stored in the server's database. The user's device is equipped with sensors such as a camera and microphone, and the server uses these to analyze the child's emotional state using technologies such as "EmotionAPI". The emotional data obtained from this analysis is used to improve the accuracy of sport recommendations.

[0478] For example, even if the server determines from a child's skeletal structure that track and field is suitable, if sentiment analysis indicates that the child has little interest, it can suggest a different sport, such as soccer. This allows users to recommend more appropriate sports activities for their children.

[0479] As an example of a prompt when using a generative AI model, it is possible to input a message in the following format: "Based on an image of a 10-year-old child, determine the most suitable type of exercise for this child. Please also consider emotional data and provide a final suggestion."
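A prompt of this form could be assembled programmatically, for instance as below. This is a sketch only: the helper name `build_prompt` and the `emotion_summary` parameter are hypothetical, and how the prompt and the image are actually passed to the generative AI model is not shown because the disclosure does not specify it.

```python
def build_prompt(age, emotion_summary=None):
    """Assemble a prompt string modeled on the example wording in the text."""
    if not isinstance(age, int) or age <= 0:
        raise ValueError("age must be a positive integer")
    parts = [
        f"Based on an image of a {age}-year-old child, determine the most "
        "suitable type of exercise for this child."
    ]
    if emotion_summary:
        # Hypothetical extension: append a short textual emotion summary.
        parts.append(f"Emotional data: {emotion_summary}.")
        parts.append(
            "Please also consider emotional data and provide a final suggestion."
        )
    return " ".join(parts)


if __name__ == "__main__":
    print(build_prompt(10, "shows positive interest in team play"))
```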

[0480] Thus, the present invention is a system that enables the proposal of sports disciplines that comprehensively consider both physical aptitude and emotional interest and motivation.

[0481] The flow of the specific processing in Example 2 will be explained using Figure 13.

[0482] Step 1:

[0483] The user takes a full-body image of the child with their device. The image taken by the user is sent to the server through the system's application. The input is the full-body image data of the child, and the output is the transfer of the image data to the server.

[0484] Step 2:

[0485] The server processes the received image data using computer vision technology. Specifically, it uses "OpenCV" and "TensorFlow" to perform image analysis and extract the skeletal features of the child. The input is the received image data, and the output is generated skeletal feature information.

[0486] Step 3:

[0487] The server matches the extracted skeletal features against an internal database. This database contains physical data of past successful athletes. The input is skeletal feature information, and the output is a list of candidate sports suited to the child.

[0488] Step 4:

[0489] The server collects real-time emotion data from sensors and cameras connected to the user's device. The server analyzes this data using technologies such as "EmotionAPI" to evaluate the child's emotional state. Real-time emotion data is the input, and emotion evaluation information is obtained as the output.

[0490] Step 5:

[0491] The server integrates candidate exercises based on skeletal structure with emotional evaluation information to propose a final exercise. The input consists of candidate exercises and emotional evaluation information, and the output is the optimal exercise suggested to the user.

[0492] Step 6:

[0493] The server sends the final suggestions to the terminal, which then provides this information to the user. The user can review the suggestions on the terminal screen and obtain details and advice on sports suitable for their child. The input is the final suggested sports activity, and the output is the presentation of information to the user.
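One possible way to express the integration in Step 5 above is a weighted blend of a per-sport physical score and a per-sport emotional score, followed by re-ranking. The disclosure does not specify how the two kinds of information are combined, so the function name `fuse_scores`, the 0.7 weight, and the neutral default of 0.5 for sports without an emotional score are all assumptions made for illustration.

```python
def fuse_scores(physical, emotional, weight=0.7):
    """Blend per-sport physical and emotional scores and rank the sports.

    weight is the share given to the physical score; the remainder goes to
    the emotional score. Sports missing an emotional score get a neutral 0.5.
    Returns a list of (sport, combined_score) sorted from best to worst.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0 and 1")
    combined = {
        sport: weight * score + (1.0 - weight) * emotional.get(sport, 0.5)
        for sport, score in physical.items()
    }
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    physical = {"track and field": 0.9, "soccer": 0.8}
    emotional = {"track and field": 0.1, "soccer": 0.9}
    for sport, score in fuse_scores(physical, emotional):
        print(sport, round(score, 2))
```

With these sample values, soccer overtakes track and field once low interest is factored in, which corresponds to the track-and-field and soccer example given earlier for Example 2.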

[0494] (Application Example 2)

[0495] Next, we will explain Application Example 2. In the following explanation, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."

[0496] Traditional sports coaching systems are limited to determining the type of sport based on physical aptitude, making it difficult to provide precise instruction that takes into account individual emotional states. Furthermore, if the suggested sport does not interest the child, it may fail to cultivate sustained motivation, hindering the overall improvement of athletic ability.

[0497] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.

[0498] In this invention, the server includes means for receiving image information, means for extracting the skeletal features of the target, means for comparing with past data, means for evaluating the emotional state, and means for modifying the judgment result according to the emotional state. This makes it possible to suggest exercises that take into account not only physical aptitude but also emotional motivation and interest, enabling more precise sports instruction that encourages participation.

[0499] "Image information" refers to visual data that has been converted into a format that can be processed by computers and digital devices.

[0500] "Skeletal features" refer to information related to the shape, arrangement, and structure of the skeleton of an object.

[0501] The term "athlete" refers to a person who participates in a specific physical activity or sport.

[0502] "Data" refers to a set of facts, values, or instructions that have been formalized for processing by a computer or other system.

[0503] "Physical activity" refers to exercise and behavior that involves moving the body.

[0504] "Emotional state" refers to an index that quantifies the situation that represents an individual's emotional response or mood.

[0505] "Image analysis technology" refers to a series of methods and techniques used to extract meaningful information from image data.

[0506] The system for realizing this application consists of a program that performs the following processes. First, the user has a robot in the house acquire image information of the child using sensors and a camera. The robot captures video using, for example, an Intel RealSense camera, and this video information is sent to a terminal. The terminal transfers this information to a cloud server, which extracts skeletal features using image analysis technology. Rekognition from Amazon Web Services (AWS) is used for image analysis.

[0507] Next, the server compares the extracted skeletal features with data on past athletes stored in a database. This comparison utilizes algorithms that perform pattern recognition on similar data. Furthermore, the robot assesses the child's emotional state using a heart rate sensor and sends this data to the server. Microsoft Azure's Sentiment Analysis API is used for this emotion recognition.

[0508] The server combines this information to determine the optimal physical activity and outputs the result to the user's device. The device displays the determination result along with advice on the type of exercise and an assessment of the emotional state, providing suggestions for the final exercise plan. This system allows children to choose and participate in sports activities based not only on their physical aptitude but also on their emotional interests.

[0509] As a concrete example, a robot that has received movement information can provide feedback such as, "This child is suited to individual running events, but judging from their emotional state, they don't seem very interested. How about trying soccer?" An example of a prompt using a generative AI model is as follows:

[0510] "Please suggest the most suitable sport based on the child's interests and physical aptitude. Current emotional state data indicates a positive interest."

[0511] The flow of a specific process in Application Example 2 will be explained using Figure 14.

[0512] Step 1:

[0513] The user instructs the robot to acquire image information. The robot, equipped with an Intel RealSense camera, captures images of the child before exercise. The acquired image information is transmitted to the terminal in real time. The input is video data from the camera, and the output is raw data transferred to the terminal.

[0514] Step 2:

[0515] The terminal transfers the received image information to the cloud server. Data compression may occur during transmission, but it may also be sent in its original format. The input is the video data acquired in step 1, and the output is a notification that the transfer to the server is complete.

[0516] Step 3:

[0517] The server processes the received image information and extracts the skeletal features of the target using image analysis technology. In this process, it utilizes tools such as AWS Rekognition to identify key skeletal features from the image data. The input is video data sent to the cloud server, and the output is the extracted skeletal information.

[0518] Step 4:

[0519] The server compares the extracted skeletal information with past athlete data. A pattern recognition algorithm is used to calculate similarity. The input is the skeletal information obtained in step 3 and the database data, and the output is the determination of the most similar type of exercise.

[0520] Step 5:

[0521] The user initiates data collection using a heart rate sensor so that the robot can analyze their emotional state. The robot sends the acquired heart rate data to a terminal, which then forwards it to a server. The input is the heart rate data from the robot, and the output is the emotional data sent to the server.

[0522] Step 6:

[0523] The server evaluates emotional states and reflects this in the sports activity selection results. It uses the Microsoft Azure Sentiment Analysis API to interpret emotional data and re-evaluate interest in and suitability for sports. Inputs are emotional data and the selection results from step 4, and output is a final sports activity suggestion modified based on emotions.

[0524] Step 7:

[0525] The server sends the final exercise suggestions to the terminal, which then displays this information to the user. Along with the assessment result, advice based on the user's emotional state is displayed. The input is the suggestion result from step 6, and the output is the information on the recommended exercises displayed on the user's screen.

[0526] The specific processing unit 290 transmits the result of the specific processing to the headset terminal 314. In the headset terminal 314, the control unit 46A causes the speaker 240 and display 343 to output the result of the specific processing. The microphone 238 acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.

[0527] Data generation model 58 is a type of so-called generative AI (Artificial Intelligence). Examples of data generation model 58 include generative AI such as ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 receives as input prompts containing instructions, together with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompts, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.

[0528] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and specific processing may also be performed by the headset terminal 314.

[0529] [Fourth Embodiment]

[0530] Figure 7 shows an example of the configuration of the data processing system 410 according to the fourth embodiment.

[0531] As shown in Figure 7, the data processing system 410 includes a data processing device 12 and a robot 414. An example of the data processing device 12 is a server.

[0532] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).

[0533] The robot 414 includes a computer 36, a microphone 238, a speaker 240, a camera 42, a communication interface 44, and a controlled object 443. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, camera 42, and controlled object 443 are also connected to the bus 52.

[0534] The microphone 238 receives voice signals from the user 20 and receives instructions from the user 20. The microphone 238 captures the voice signals from the user 20, converts the captured voice into audio data, and outputs it to the processor 46. The speaker 240 outputs audio according to the instructions from the processor 46.

[0535] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).

[0536] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.

[0537] The controlled object 443 includes a display device, LEDs in the eyes, and motors that drive the arms, hands, and feet. The posture and gestures of the robot 414 are controlled by controlling the motors of the arms, hands, and feet. Some of the robot 414's emotions can be expressed by controlling these motors. Furthermore, the robot 414's facial expressions can also be expressed by controlling the illumination state of the LEDs in its eyes.

[0538] Figure 8 shows an example of the main functions of the data processing device 12 and the robot 414. As shown in Figure 8, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.

[0539] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.

[0540] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.

[0541] In robot 414, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.

[0542] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0543] This invention relates to a skeletal analysis system for determining the most suitable type of exercise for a child. The user saves a high-resolution image of the child's entire body on their device and sends it to the server through the system's application. The server extracts skeletal features from the received image using image processing technology. Specifically, it automatically analyzes features such as joint positions and limb sizes using computer vision technology and machine learning algorithms.

[0544] The server then uses these skeletal features to compare and match them with data on past successful athletes pre-registered in its database. This database contains a wealth of information on successful athletes in various sports and the physical characteristics associated with them. Based on the matching results, the most suitable type of exercise for the child is determined.

[0545] The assessment results are transmitted in real time from the server to the user's device, allowing the user to review the recommended sports. For example, if the server determines that "basketball is suitable" based on specific skeletal characteristics, this information, along with associated benefits and success stories, is sent to the user. This system helps users effectively select sports based on their child's aptitude.

[0546] The following describes the processing flow.

[0547] Step 1:

[0548] The user uploads a high-resolution image of the child's entire body to the device. The device selects the image and prepares to upload it through the system's application.

[0549] Step 2:

[0550] The terminal sends the image file specified by the user to the server. The server receives the transmitted data and verifies that it is in the correct format and resolution. If necessary, it performs preprocessing to convert the image to a size and format suitable for analysis.

[0551] Step 3:

[0552] The server extracts skeletal features based on pre-processed image data. Specifically, it uses computer vision technology to execute algorithms that extract skeletal information such as joint positions and limb lengths from the images.

[0553] Step 4:

[0554] The server compares the extracted skeletal features with data on past successful athletes stored in a database. The matching process evaluates skeletal similarity and features that influence athletic suitability.

[0555] Step 5:

[0556] The server analyzes the matching results and determines which sports are suitable for the child. The determination uses an algorithm that takes into account sports in which athletes with similar skeletal structures have performed well, as well as their level of success.

[0557] Step 6:

[0558] The server sends information about the determined exercise type to the terminal. The terminal displays the recommended exercise type and related information on the screen so that the user can easily check the determination result.
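As a minimal sketch of the feature extraction in Step 3, the following assumes that a pose estimator has already returned 2D joint coordinates in pixels. The joint names, the choice of segments, and the normalization by body height are assumptions made for illustration and are not specified in this disclosure.

```python
import math

# Hypothetical limb segments, each defined by a pair of joint names.
# One arm and one leg are used here to keep the sketch short.
SEGMENTS = {
    "upper_arm": ("shoulder", "elbow"),
    "forearm": ("elbow", "wrist"),
    "thigh": ("hip", "knee"),
    "shin": ("knee", "ankle"),
}


def distance(a, b):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def skeletal_features(joints):
    """Return each limb length divided by body height.

    Dividing by height makes the features independent of image scale,
    so photographs taken at different distances remain comparable.
    """
    height = distance(joints["head_top"], joints["ankle"])
    if height == 0:
        raise ValueError("head_top and ankle coincide; cannot normalize")
    return {
        name: distance(joints[a], joints[b]) / height
        for name, (a, b) in SEGMENTS.items()
    }
```

The resulting dictionary of ratios could then serve as the feature vector compared against athlete data in Step 4.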

[0559] (Example 1)

[0560] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0561] Traditionally, methods for selecting the optimal exercise based on an individual's physical characteristics have been limited, and there is a lack of clear criteria and analytical techniques. As a result, athletes and their coaches are forced to rely on empirical rules when deciding on an exercise. To resolve this problem, there is a need for a system that objectively selects exercise based on reliable data.

[0562] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.

[0563] In this invention, the server includes means for acquiring image information, means for analyzing body structure features from the acquired image information, and means for comparing the analyzed body structure features with information from past successful exercise participants. This makes it possible to identify the optimal type of exercise based on an individual's physical characteristics.

[0564] "Image information" refers to photographs and videos acquired as visual data.

[0565] "Physical structural characteristics" refer to an individual's physical features, such as the arrangement of their skeleton and muscles, and the position and size of their joints.

[0566] "Machine learning technology" is a technique in which computers find patterns and rules from data and automatically learn from that experience.

[0567] "Type of exercise" refers to a category of specific exercise or sport within sports or fitness.

[0568] "Analysis" is the process of breaking down specific information or data into smaller parts to reveal its content and structure.

[0569] "Specification" refers to the act of clarifying something based on specific requirements or conditions.

[0570] "Comparison" is the act of comparing multiple things to examine their differences and characteristics.

[0571] "Provision" refers to the act of providing information or services to users.

[0572] This invention is a system for determining the most suitable type of exercise for a child. The user first takes a full-body photograph or high-resolution image of the child and saves it to their device. Then, the user uses the system's application to send the image to a server. In this process, the device must be equipped with communication technology to reliably transmit the captured image to the server.

[0573] The server first processes the received images using tools such as OpenCV or TensorFlow to analyze body structure features. Specifically, it automatically extracts features such as joint positions and limb sizes. Computer vision and machine learning technologies play a crucial role at this stage.

[0574] Next, the server uses the extracted physical characteristics to compare them with information on successful exercise participants stored in its database, searching for past cases with similar physical characteristics. The database contains a wealth of successful cases across various exercise disciplines, and this information is used to determine eligibility for specific exercise disciplines.

[0575] The determined sport is notified to the user's device in real time, allowing the user to consider suitable sport options for their child based on this information. For example, if the server determines that "basketball is suitable," the user will also be presented with its benefits and related success stories.

[0576] Furthermore, by inputting a prompt to the generative AI model such as "Please describe a system that determines the optimal sport for a child by analyzing their physical characteristics using computer vision and machine learning," the detailed processing content of the system can be used for learning and tuning. This invention supports the objective and effective selection of sports based on individual physical characteristics.

[0577] The flow of the specific processing in Example 1 will be explained using Figure 11.

[0578] Step 1:

[0579] The user saves an image of their child's entire body to their device and sends it to the server via the app. Specifically, the user uses the app's "Send Image" function to select an image from their device folder and upload it. The input for this step is the image data stored on the user's device, and the output is the image data transferred to the server.

[0580] Step 2:

[0581] The server applies image processing techniques to analyze the received image data. Specifically, the server uses the OpenCV or TensorFlow library to perform image analysis to identify joint positions and limb sizes. The input for this step is image data stored on the server, and the output is anatomical features extracted from the images.

[0582] Step 3:

[0583] The server compares the extracted physical characteristics with information from past successful exercise participants in the database. Specifically, the server executes SQL queries to search for entries with similar physical characteristics. The input for this step is the extracted physical characteristics, and the output is a list of data on sports participants with high similarity.

[0584] Step 4:

[0585] The server determines the most suitable exercise based on the comparison results. Specifically, the server uses machine learning techniques (e.g., a random forest classifier) to evaluate the suitability score for each sport and selects the sport with the highest score. The input for this step is similar entries and their feature data, and the output is the determined exercise.

[0586] Step 5:

[0587] The server notifies the user's device in real time of the determined exercise type. Specifically, it pushes information using WebSocket or a notification service, and displays it in the app on the device. The input for this step is information about the determined exercise type, and the output is recommended exercise type information displayed on the user's device.

[0588] Step 6:

[0589] The user checks the judgment results and related information on the terminal screen and refers to the presented success stories and benefits of the sport. Specifically, the user accesses the judgment results by pressing the "Show Results" button. The input for this step is the judgment result data sent from the server, and the output is the information and instructions provided to the user.
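Steps 3 and 4 above can be sketched as follows. The table layout, column names, tolerance, and sample rows are hypothetical and exist only for illustration. In place of the random forest classifier mentioned in Step 4, this sketch substitutes a simpler rule that sums the success levels of the matched athletes per sport.

```python
import sqlite3
from collections import defaultdict


def build_demo_db():
    """Create an in-memory database with illustrative (invented) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE athletes "
        "(name TEXT, sport TEXT, leg_ratio REAL, arm_ratio REAL, success REAL)"
    )
    conn.executemany(
        "INSERT INTO athletes VALUES (?, ?, ?, ?, ?)",
        [
            ("A", "basketball", 0.52, 0.46, 0.9),
            ("B", "basketball", 0.51, 0.45, 0.7),
            ("C", "swimming", 0.47, 0.48, 0.8),
            ("D", "gymnastics", 0.44, 0.40, 0.9),
        ],
    )
    return conn


def similar_athletes(conn, leg_ratio, arm_ratio, tol=0.02):
    """Step 3: SQL query for entries whose features lie within a tolerance."""
    query = (
        "SELECT sport, success FROM athletes "
        "WHERE ABS(leg_ratio - ?) <= ? AND ABS(arm_ratio - ?) <= ?"
    )
    return conn.execute(query, (leg_ratio, tol, arm_ratio, tol)).fetchall()


def best_sport(rows):
    """Step 4 (simplified): pick the sport with the highest summed success."""
    scores = defaultdict(float)
    for sport, success in rows:
        scores[sport] += success
    if not scores:
        return None  # no sufficiently similar athlete was found
    return max(scores, key=scores.get)
```

For a child with `leg_ratio=0.515` and `arm_ratio=0.455`, only the two basketball entries fall within the tolerance, so `best_sport` returns `"basketball"`.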

[0590] (Application Example 1)

[0591] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0592] Finding the right type of exercise for children to promote physical activity and improve athletic ability is a challenge for many parents and educators. Traditional methods lack a detailed approach based on individual physical characteristics, making it difficult to select the appropriate exercise for each child. As a result, inefficient training and a lack of consistency can occur.

[0593] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.

[0594] In this invention, the server includes means for receiving image data, means for extracting the skeletal features of a target from the received image data, means for comparing the extracted skeletal features with data of past successful athletes, means for determining an appropriate exercise for the target based on the comparison results, means for outputting the determination results, means for proposing an appropriate exercise program based on the output determination results, and voice output means for guiding the proposed exercise program. This makes it possible to select the optimal exercise based on each individual's skeletal features and provide effective feedback.

[0595] "Image data" refers to digital data containing visual information, in a format that can be processed on computers and digital devices.

[0596] "Skeletal characteristics" refer to information about the skeleton of a target, including physical structural features such as the position of joints and the length of limbs, and are fundamental data for evaluating the motor skills of humans and animals.

[0597] A "means of comparison" is a technology that has the function of comparing multiple data sets against each other based on specific criteria, making it possible to evaluate similarities and differences.

[0598] "Output means" refers to a method for providing processed data or information to a user, and is a device or technology for displaying or transmitting information visually or audibly.

[0599] An "exercise program" is a plan that includes a series of exercises or training sessions aimed at achieving specific athletic abilities or health goals, and is designed according to individual needs and aptitudes.

[0600] "Audio output means" refers to a device or technology for transmitting information using sound, and can transmit information through speech synthesis or speakers.

[0601] This system is a platform for providing children with optimal exercise programs and activities via home robots and smart devices. The server has the functionality to receive and process image data from the user's smart device or robot.

[0602] The server first processes the image data using the OpenCV library and extracts skeletal features based on computer vision techniques. This process yields important structural information such as joint positions and limb lengths.

[0603] Next, the extracted skeletal features are analyzed using a machine learning algorithm powered by TensorFlow. The analysis then compares these features with a database of past successful athletes to determine the most suitable exercise for the user.

[0604] The determined exercise type and the corresponding exercise program are output from the server to the terminal or robot. The prompt used may be something like, "Please suggest an appropriate exercise type based on the skeletal characteristics of the person in the photograph."

[0605] The terminal or robot has the function of guiding the user through the assessment results visually and audibly. For example, the robot could suggest, "Jogging is suitable today. Let's run together in a nearby park," and actively encourage the user to exercise.

[0606] This system allows users to always access the latest and most personalized exercise programs, enabling them to train more healthily and efficiently.

[0607] The flow of a specific process in Application Example 1 will be explained using Figure 12.

[0608] Step 1:

[0609] At the user's instruction, the terminal captures a full-body image of the child using its camera and generates high-resolution image data. This image data is the input data to be sent to the server.

[0610] Step 2:

[0611] The server processes the image data received from the terminal using the OpenCV library to extract skeletal features from the image. The data processing performed here analyzes characteristics such as joint positions and limb lengths from the image, and the output is skeletal feature data.

[0612] Step 3:

[0613] The server inputs the extracted skeletal features into a machine learning model using TensorFlow and performs analysis. The prompt might be, "Suggest a suitable exercise based on the skeletal features of the person in the photograph," and the generative AI model determines the appropriate exercise. The output is the determined optimal exercise.

[0614] Step 4:

[0615] The server sends the judgment result to the terminal or robot. The terminal receives this information and processes the data to display it visually to the user (for example, on a display). If voice guidance is also provided, the output is converted into voice format using text-to-speech technology.

[0616] Step 5:

[0617] Users check the exercise program provided via display or audio from the device and then engage in the actual exercise. The device may also record the user's responses and feedback, and may output data that can be used to adjust future programs.
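Steps 4 and 5 above can be illustrated with the following sketch, which maps a determined sport to an exercise program, builds the sentence that would be passed to a text-to-speech component, and records user feedback. The program table, function names, and rating scale are hypothetical, and the speech synthesis itself is not shown.

```python
# Hypothetical program table; the entries are placeholders for illustration.
PROGRAMS = {
    "basketball": ["dribbling drills", "jump training"],
    "jogging": ["a warm-up walk", "an easy run in a nearby park"],
}


def propose_program(sport):
    """Return the program for a sport, or a generic fallback if none exists."""
    return PROGRAMS.get(sport, ["general stretching"])


def guidance_text(sport, program):
    """Build the sentence to be displayed or converted to speech."""
    steps = ", then ".join(program)
    return f"{sport.capitalize()} is suitable today. Let's start with {steps}."


def record_feedback(log, sport, rating):
    """Append a rating to the log and return the mean rating for that sport.

    The mean could later be used to adjust future programs.
    """
    log.append({"sport": sport, "rating": rating})
    ratings = [entry["rating"] for entry in log if entry["sport"] == sport]
    return sum(ratings) / len(ratings)
```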

[0618] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the estimated emotions.

[0619] This invention combines an emotion engine with a system that determines the most suitable type of exercise for a child, enabling more precise sports instruction. The user first saves an image of the child's entire body on their device and sends it to the server via the system's application. The server receives this image and uses computer vision technology to extract the child's skeletal features.

[0620] The extracted skeletal information is compared with data on past successful athletes stored in a database on the server to determine the most suitable sport. In addition to the usual sport selection, this invention improves the accuracy of the selection by using an emotion engine. The server recognizes the child's real-time emotional state through sensors and cameras installed on the user's terminal. This emotional data is used as auxiliary information to adjust the suitability of the selected sport.

[0621] For example, even if the server determines that basketball is suitable based on a child's skeletal characteristics, the emotion engine analyzes the child's emotional state. If it detects no signs of strong interest or enjoyment, the server suggests another sport (such as track and field). The server sends this information to the terminal, which displays the determined sport along with supplementary information related to emotions. This allows the user to make a final decision on a sport that takes into account not only physical aptitude but also emotional interest and motivation.

[0622] The following describes the processing flow.

[0623] Step 1:

[0624] The user uploads a clear image of the child's entire body to the device. The device provides an interface that allows the user to easily check if the child's posture is correct and prompts the user to upload the image to the system.

[0625] Step 2:

[0626] The terminal sends the image file selected by the user to the server. The server saves the received image data in the appropriate resolution and format, and ensures image quality by performing initial processing.

[0627] Step 3:

[0628] The server uses computer vision technology to extract skeletal features from images. This involves a process that utilizes image analysis algorithms to identify joint locations, limb lengths, and overall skeletal proportions.

[0629] Step 4:

[0630] The server uses the extracted skeletal features to match them against data of past successful athletes in an existing database. The matching algorithm calculates similarity and identifies the most suitable sport.

[0631] Step 5:

[0632] The device activates the emotion engine based on instructions from the user or the server. Cameras and sensors are used to measure the user's emotional state and to collect, in real time, the emotional patterns the user displays.

[0633] Step 6:

[0634] The server takes in the analysis results from the emotion engine and evaluates the emotional suitability for the determined sport. For example, it further evaluates whether the determined sport evokes joy and excitement in children and adjusts the judgment result as needed.

[0635] Step 7:

[0636] The server sends the final determined exercise category and additional information based on the emotional assessment to the device. The device provides visualized feedback so that the user can easily view this information. Based on this information, the user can make more appropriate choices regarding sports activities for their child.
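The adjustment in Step 6 can be sketched as a simple threshold rule. It is assumed here that the emotion engine outputs an interest value between 0 and 1 for each sport; that representation, the threshold value, and the function name are assumptions rather than details given in this disclosure.

```python
def adjust_for_emotion(ranked_sports, interest, threshold=0.5):
    """Return the best-ranked sport that the child also shows interest in.

    ranked_sports: sports ordered by skeletal suitability, best first.
    interest: maps sport -> estimated interest in [0, 1]
              (hypothetical output of the emotion engine).
    If no sport reaches the threshold, the top skeletal match is kept.
    """
    if not ranked_sports:
        raise ValueError("ranked_sports must not be empty")
    for sport in ranked_sports:
        if interest.get(sport, 0.0) >= threshold:
            return sport
    return ranked_sports[0]
```

With `ranked_sports=["basketball", "track and field"]` and an interest of 0.2 for basketball and 0.8 for track and field, the function returns `"track and field"`, matching the example in which a different sport is suggested when interest is lacking.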

[0637] (Example 2)

[0638] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0639] Conventional sports selection systems rely solely on physical characteristics, failing to consider the subject's emotions or interests, and therefore may not be able to suggest the most suitable sport. This highlights the challenge of developing highly accurate sports selection systems that consider both physical aptitude and emotional factors.

[0640] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.

[0641] In this invention, the server includes means for acquiring and transmitting image data, means for extracting the skeletal features of a target from the received image data using an analysis device, means for comparing the extracted skeletal features with data of past successful athletes in an information base, means for recognizing the target's emotional state using sensors and an analysis device and adjusting the suitability of the sport, and means for outputting the final proposed sport using an output device. This makes it possible to select the optimal sport considering the physical and emotional suitability of the target.

[0642] "Image data" refers to data that records the visual information of an object in digital format and is used to extract skeletal features.

[0643] An "analysis device" is a computer system used to analyze skeletal features from received image data and compare them with information base data.

[0644] An "information base" is a database used to hold data on past successful athletes and to cross-reference it with skeletal information.

[0645] A "sensor" is a device used to recognize the emotional state of a target in real time, and typically includes cameras and microphones.

[0646] An "output device" is a device used to display the final suggested exercise results to the user, and usually refers to a display screen.

[0647] This invention is a sports instruction support system that proposes the most suitable type of exercise for a child, making a determination by combining physical characteristics and emotional state. The user takes an image of the child's entire body into a terminal and sends the image data to the server using the system's application. This system uses computer vision technology to extract skeletal features and analyzes the emotional state in real time, thereby achieving a more precise determination.

[0648] Specifically, the server uses visual processing libraries such as "OpenCV" and "TensorFlow" to extract skeletal information of children from received image data. This information is then compared with data on past successful athletes stored in the server's database. The user's device is equipped with sensors such as a camera and microphone, and the server uses these to analyze the child's emotional state using technologies such as "EmotionAPI". The emotional data obtained from this analysis is used to improve the accuracy of sport recommendations.

[0649] For example, even if the server determines from a child's skeletal structure that track and field is suitable, if sentiment analysis indicates that the child has little interest, it can suggest a different sport, such as soccer. This allows users to recommend more appropriate sports activities for their children.

[0650] As an example of a prompt when using a generative AI model, it is possible to input a message in the following format: "Based on an image of a 10-year-old child, determine the most suitable type of exercise for this child. Please also consider emotional data and provide a final suggestion."
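A prompt of this form could be assembled from the analysis results as in the following sketch. The function name, the feature names, and the way features and emotional data are embedded in the text are assumptions. Sending the prompt to a generative AI model is not shown.

```python
def build_prompt(age, features, emotion_summary):
    """Assemble a prompt from skeletal features and an emotion summary.

    features: maps a feature name to a numeric value (hypothetical names).
    emotion_summary: short text describing the estimated emotional state.
    """
    feature_text = "; ".join(
        f"{name}: {value:.2f}" for name, value in sorted(features.items())
    )
    return (
        f"Based on an image of a {age}-year-old child, determine the most "
        "suitable type of exercise for this child. "
        f"Skeletal features: {feature_text}. "
        f"Emotional data: {emotion_summary}. "
        "Please also consider emotional data and provide a final suggestion."
    )
```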

[0651] Thus, the present invention is a system that enables the proposal of sports disciplines that comprehensively consider both physical aptitude and emotional interest and motivation.

[0652] The flow of the specific processing in Example 2 will be explained using Figure 13.

[0653] Step 1:

[0654] The user takes a full-body image of the child with their device. The image taken by the user is sent to the server through the system's application. The input is the full-body image data of the child, and the output is the transfer of the image data to the server.

[0655] Step 2:

[0656] The server processes the received image data using computer vision technology. Specifically, it uses "OpenCV" and "TensorFlow" to perform image analysis and extract the skeletal features of the child. The input is the received image data, and the output is generated skeletal feature information.

[0657] Step 3:

[0658] The server matches the extracted skeletal features against an internal database. This database contains physical data of past successful athletes. The input is skeletal feature information, and the output is a list of candidate sports.

[0659] Step 4:

[0660] The server collects real-time emotion data from sensors and cameras connected to the user's device. The server analyzes this data using technologies such as "EmotionAPI" to evaluate the child's emotional state. Real-time emotion data is the input, and emotion evaluation information is obtained as the output.

[0661] Step 5:

[0662] The server integrates candidate exercises based on skeletal structure with emotional evaluation information to propose a final exercise. The input consists of candidate exercises and emotional evaluation information, and the output is the optimal exercise suggested to the user.

[0663] Step 6:

[0664] The server sends the final suggestions to the terminal, which then provides this information to the user. The user can review the suggestions on the terminal screen and obtain details and advice on sports suitable for their child. The input is the final suggested sports activity, and the output is the presentation of information to the user.
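The integration in Step 5 can be sketched as a weighted combination of two scores per candidate sport. The weighting scheme, the default weight, and the neutral value used when emotion data is missing are assumptions made for illustration. This disclosure does not specify how the two kinds of information are combined.

```python
def integrate(skeletal_scores, emotion_scores, emotion_weight=0.4):
    """Rank candidate sports by a weighted blend of two scores in [0, 1].

    skeletal_scores: sport -> suitability from skeletal matching.
    emotion_scores: sport -> emotional evaluation (0.5 assumed if missing).
    Returns a list of (sport, combined_score), best first.
    """
    if not 0.0 <= emotion_weight <= 1.0:
        raise ValueError("emotion_weight must be between 0 and 1")
    combined = {}
    for sport, skeletal in skeletal_scores.items():
        emotion = emotion_scores.get(sport, 0.5)
        combined[sport] = (1.0 - emotion_weight) * skeletal + emotion_weight * emotion
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)
```

For example, if track and field scores 0.9 on skeletal fit but 0.1 on emotion, while soccer scores 0.7 and 0.9, the default weight ranks soccer first. Setting `emotion_weight=0` reproduces the purely skeletal ranking.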

[0665] (Application Example 2)

[0666] Next, we will explain application example 2. In the following explanation, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0667] Traditional sports coaching systems are limited to determining the type of sport based on physical aptitude, making it difficult to provide precise instruction that takes into account individual emotional states. Furthermore, if the suggested sport does not interest the child, it may fail to cultivate sustained motivation, hindering the overall improvement of athletic ability.

[0668] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.

[0669] In this invention, the server includes means for receiving image information, means for extracting the skeletal features of the target, means for comparing with past data, means for evaluating the emotional state, and means for modifying the judgment result according to the emotional state. This makes it possible to suggest exercises that take into account not only physical aptitude but also emotional motivation and interest, enabling more precise sports instruction that encourages participation.

[0670] "Image information" refers to visual data that has been converted into a format that can be processed by computers and digital devices.

[0671] "Skeletal features" refer to information related to the shape, arrangement, and structure of the skeleton of a target.

[0672] The term "athlete" refers to a person who participates in a specific physical activity or sport.

[0673] "Data" refers to a set of facts, values, or instructions that have been formalized for processing by a computer or other system.

[0674] "Physical activity" refers to exercise and behavior that involves moving the body.

[0675] "Emotional state" refers to an index that quantifies the situation that represents an individual's emotional response or mood.

[0676] "Image analysis technology" refers to a series of methods and techniques used to extract meaningful information from image data.

[0677] The system for realizing this application consists of a program that performs the following processes. First, the user has a robot in the house acquire image information of the child using sensors and a camera. The robot captures video using, for example, an Intel RealSense camera, and this video information is sent to a terminal. The terminal transfers this information to a cloud server, which extracts skeletal features using image analysis technology. Rekognition from Amazon Web Services (AWS) is used for image analysis.

[0678] Next, the server compares the extracted skeletal features with data on past athletes stored in a database. This comparison utilizes algorithms that perform pattern recognition on similar data. Furthermore, the robot assesses the child's emotional state using a heart rate sensor and sends this data to the server. Microsoft Azure's Sentiment Analysis API is used for this emotion recognition.

[0679] The server combines this information to determine the optimal physical activity and outputs the result to the user's device. The device displays the determination result along with advice on the type of exercise and an assessment of the emotional state, providing suggestions for the final exercise plan. This system allows children to choose and participate in sports activities based not only on their physical aptitude but also on their emotional interests.

[0680] As a concrete example, a robot that has received movement information can provide feedback such as, "This child is suited to individual running events, but judging from their emotional state, they don't seem very interested. How about trying soccer?" An example of a prompt using a generative AI model is as follows:

[0681] "Please suggest the most suitable sport based on the child's interests and physical aptitude. Current emotional state data indicates a positive interest."

[0682] The flow of a specific process in Application Example 2 will be explained using Figure 14.

[0683] Step 1:

[0684] The user instructs the robot to acquire image information. The robot, equipped with an Intel RealSense camera, captures images of the child before exercise. The acquired image information is transmitted to the terminal in real time. The input is video data from the camera, and the output is raw data transferred to the terminal.

[0685] Step 2:

[0686] The terminal transfers the received image information to the cloud server. Data compression may occur during transmission, but it may also be sent in its original format. The input is the video data acquired in step 1, and the output is a notification that the transfer to the server is complete.

[0687] Step 3:

[0688] The server processes the received image information and extracts the skeletal features of the target using image analysis technology. In this process, it utilizes tools such as AWS Rekognition to identify key skeletal features from the image data. The input is video data sent to the cloud server, and the output is the extracted skeletal information.

[0689] Step 4:

[0690] The server compares the extracted skeletal information with past athlete data. A pattern recognition algorithm is used to calculate similarity. The input is the skeletal information obtained in step 3 and the database data, and the output is the determination of the most similar type of exercise.

[0691] Step 5:

[0692] The user initiates data collection using a heart rate sensor so that the robot can analyze their emotional state. The robot sends the acquired heart rate data to a terminal, which then forwards it to a server. The input is the heart rate data from the robot, and the output is the emotional data sent to the server.

[0693] Step 6:

[0694] The server evaluates emotional states and reflects this in the sports activity selection results. It uses the Microsoft Azure Sentiment Analysis API to interpret emotional data and re-evaluate interest in and suitability for sports. Inputs are emotional data and the selection results from step 4, and output is a final sports activity suggestion modified based on emotions.

[0695] Step 7:

[0696] The server sends the final exercise suggestions to the terminal, which then displays this information to the user. Along with the assessment result, advice based on the user's emotional state is displayed. The input is the suggestion result from step 6, and the output is the information on the recommended exercises displayed on the user's screen.
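Step 2 above states that the image information may be compressed or sent in its original format. One way to support both cases is sketched below using the standard `zlib` module. The one-byte flag that tells the receiver which case applies is an assumption of this sketch, not a format defined in this disclosure.

```python
import zlib

COMPRESSED = b"\x01"  # hypothetical flag: body is zlib-compressed
RAW = b"\x00"         # hypothetical flag: body is sent as-is


def pack(payload: bytes, compress: bool = True) -> bytes:
    """Terminal side: prefix the payload with a flag, compressing if requested."""
    if compress:
        return COMPRESSED + zlib.compress(payload)
    return RAW + payload


def unpack(message: bytes) -> bytes:
    """Server side: restore the original payload according to the flag."""
    if not message:
        raise ValueError("empty message")
    flag, body = message[:1], message[1:]
    if flag == COMPRESSED:
        return zlib.decompress(body)
    if flag == RAW:
        return body
    raise ValueError("unknown flag byte")
```

Because `zlib` compression is lossless, `unpack(pack(data))` returns the original bytes in either mode. Already-compressed video formats may gain little from this step.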

[0697] The specific processing unit 290 transmits the result of the specific processing to the robot 414. In the robot 414, the control unit 46A causes the speaker 240 and the controlled object 443 to output the result of the specific processing. The microphone 238 acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.

[0698] Data generation model 58 is a type of so-called generative AI (Artificial Intelligence). Examples of the data generation model 58 include generative AI such as ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. Prompts containing instructions, together with inference data such as audio data representing speech, text data representing text, and image data representing images, are input to the data generation model 58. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompts, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.

[0699] In the above embodiment, an example was given in which the specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the robot 414.

[0700] Furthermore, the emotion identification model 59, acting as an emotion engine, may determine the user's emotion according to a specific mapping. Specifically, the emotion identification model 59 may determine the user's emotion according to an emotion map (see Figure 9), which is the specific mapping. Similarly, the emotion identification model 59 may also determine the robot's emotion, and the specific processing unit 290 may perform the specific processing using the robot's emotion.

[0701] Figure 9 shows an emotion map 400 in which multiple emotions are mapped. In the emotion map 400, emotions are arranged in concentric circles radiating from the center. The closer to the center of the concentric circles, the more primitive the emotions are located. Further out of the concentric circles, emotions representing states and actions arising from mental states are located. Emotion is a concept that includes feelings and mental states. On the left side of the concentric circles, emotions that are generally generated from reactions occurring in the brain are located. On the right side of the concentric circles, emotions that are generally induced by situational judgment are located. Above and below the concentric circles, emotions that are generally generated from reactions occurring in the brain and induced by situational judgment are located. In addition, the emotion of "pleasure" is located on the upper side of the concentric circles, and the emotion of "displeasure" is located on the lower side. Thus, in the emotion map 400, multiple emotions are mapped based on the structure in which emotions arise, and emotions that are likely to occur simultaneously are mapped close together.
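The layout described above can be sketched by treating a position on the emotion map 400 as a point (x, y) relative to the center of the concentric circles. The coordinate convention and the returned labels are assumptions introduced for illustration. The disclosure itself describes only the qualitative arrangement (left, right, upper, lower, inner, outer).

```python
import math


def describe_position(x, y):
    """Interpret a point on the emotion map relative to its center.

    x < 0: left side (emotions arising from reactions in the brain)
    x > 0: right side (emotions induced by situational judgment)
    x == 0: directly above or below the center (both origins)
    y > 0: upper side ("pleasure"); y < 0: lower side ("displeasure")
    radius: distance from the center; smaller means more primitive.
    """
    if x < 0:
        origin = "reaction"
    elif x > 0:
        origin = "situation"
    else:
        origin = "both"
    if y > 0:
        valence = "pleasure"
    elif y < 0:
        valence = "displeasure"
    else:
        valence = "neutral"
    return {"radius": math.hypot(x, y), "origin": origin, "valence": valence}
```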

[0702] These emotions are distributed at the 3 o'clock position on the emotion map 400, and usually fluctuate between feelings of security and anxiety. In the right half of the emotion map 400, situational awareness takes precedence over internal feelings, resulting in a calm impression.

[0703] The inside of the emotion map 400 represents inner thoughts, while the outside represents actions. Therefore, the further toward the outside of the emotion map 400 an emotion lies, the more visible it becomes (that is, the more it is expressed in actions).
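As an illustration only, the layout described in paragraphs [0701] to [0703] can be sketched as a point in polar coordinates. The class name `EmotionPoint`, the angle convention (0 degrees at the 3 o'clock position, counterclockwise), the radius normalized to the range 0 to 1, and the 0.5 threshold between inner and outer are all assumptions made for this sketch and do not appear in the disclosure.

```python
import math
from dataclasses import dataclass

EPS = 1e-9  # tolerance so points on an axis are not misclassified by rounding


@dataclass(frozen=True)
class EmotionPoint:
    """Hypothetical position of one emotion on the emotion map."""
    name: str
    angle_deg: float  # assumed: 0 = 3 o'clock, counterclockwise
    radius: float     # assumed: 0.0 = center, 1.0 = outer edge

    def xy(self):
        rad = math.radians(self.angle_deg)
        return self.radius * math.cos(rad), self.radius * math.sin(rad)


def describe(point):
    """Classify a point by side, vertical half, and depth of the map."""
    x, y = point.xy()
    # Right half: situational judgment. Left half: reactions in the brain.
    side = "situation" if x > EPS else "reaction" if x < -EPS else "both"
    # Upper half: pleasure. Lower half: displeasure.
    valence = "pleasure" if y > EPS else "displeasure" if y < -EPS else "neutral"
    # Inner region: primitive emotions. Outer region: expressed in actions.
    depth = "inner" if point.radius < 0.5 else "outer"  # assumed threshold
    return {"side": side, "valence": valence, "depth": depth}


sample = EmotionPoint("sample", angle_deg=135.0, radius=0.8)
print(describe(sample))
```

A point on the vertical axis is reported as "both", matching the description that emotions above and below the center arise from both reaction and situational judgment.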

[0704] Here, human emotions are based on various balances, such as posture and blood sugar levels. When these balances deviate from the ideal, the result is discomfort, and when they approach the ideal, the result is pleasure. Similarly, in robots, cars, motorcycles, etc., emotions can be created based on various balances, such as posture and battery level. When these balances deviate from the ideal, the result is discomfort, and when they approach the ideal, the result is pleasure. The emotion map can be generated, for example, based on Dr. Mitsuyoshi's emotion map (Research on speech emotion recognition and a system for analyzing brain physiological signals of emotion, Tokushima University, doctoral dissertation: https://ci.nii.ac.jp/naid/500000375379). The left half of the emotion map contains emotions belonging to a region called "response," where sensation is dominant. The right half of the emotion map contains emotions belonging to a region called "situation," where situational awareness is dominant.
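The balance-based idea in paragraph [0704] could be sketched as follows. This is a minimal illustration under stated assumptions: the function names `valence_change` and `label`, the choice of summing the change in absolute deviation, and the example quantities and values are all hypothetical and are not specified in the disclosure.

```python
def valence_change(previous, current, ideal):
    """Return a positive value when balances moved toward the ideal,
    and a negative value when they moved away from it."""
    total = 0.0
    for key, target in ideal.items():
        before = abs(previous[key] - target)  # earlier deviation from ideal
        after = abs(current[key] - target)    # present deviation from ideal
        total += before - after               # shrinkage of the deviation
    return total


def label(valence):
    """Map the signed change to the terms used in the text."""
    if valence > 0:
        return "pleasure"
    if valence < 0:
        return "discomfort"
    return "neutral"


# Hypothetical balances for a robot (values invented for illustration).
ideal = {"posture": 0.0, "battery": 1.0}
previous = {"posture": 0.2, "battery": 0.5}
current = {"posture": 0.1, "battery": 0.8}
print(label(valence_change(previous, current, ideal)))
```

Summing the deviations treats every balance as equally important; a weighted sum would be an equally plausible choice.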

[0705] The emotion map defines two emotions that promote learning. One is the emotion around the middle of the negative "repentance" and "reflection" on the situation side. In other words, it is when the robot experiences negative emotions such as "I never want to feel this way again" or "I don't want to be scolded again." The other is the emotion around the positive "desire" on the reaction side. In other words, it is when the robot has positive feelings such as "I want more" or "I want to know more."

[0706] The emotion identification model 59 inputs user input into a pre-trained neural network, obtains emotion values representing each emotion shown in the emotion map 400, and determines the user's emotion. This neural network is pre-trained based on multiple training data sets, which are combinations of user input and emotion values representing each emotion shown in the emotion map 400. Furthermore, this neural network is trained so that emotions located close together have similar values, as shown in the emotion map 900 in Figure 10. Figure 10 shows an example where multiple emotions such as "reassured," "calm," and "confident" have similar emotion values.
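The following sketch illustrates two ideas from paragraph [0706]: choosing an emotion from the emotion values, and one possible way to express the requirement that emotions located close together have similar values. The neural network itself is not shown. The function names, the exponential distance weighting, the map coordinates, and the numeric values are assumptions made for illustration; the disclosure does not state how the similarity constraint is implemented.

```python
import math


def determine_emotion(emotion_values):
    """Return the emotion whose value is largest."""
    return max(emotion_values, key=emotion_values.get)


def neighbor_penalty(emotion_values, positions, scale=1.0):
    """Penalty that stays small when emotions near each other on the
    map have similar values (an assumed regularizer, not from the source)."""
    names = sorted(emotion_values)
    total = 0.0
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            dist = math.dist(positions[a], positions[b])
            weight = math.exp(-dist / scale)  # nearby pairs weigh more
            total += weight * (emotion_values[a] - emotion_values[b]) ** 2
    return total


# Hypothetical map coordinates and network outputs.
positions = {"reassured": (1.0, 0.0), "calm": (1.0, 0.1), "anxious": (-1.0, 0.0)}
smooth = {"reassured": 0.8, "calm": 0.7, "anxious": 0.1}
rough = {"reassured": 0.8, "calm": 0.1, "anxious": 0.7}

print(determine_emotion(smooth))
print(neighbor_penalty(smooth, positions) < neighbor_penalty(rough, positions))
```

During training, a term of this kind could be added to the ordinary loss so that neighboring emotions are pulled toward similar values.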

[0707] The above description primarily focuses on the functions of the data processing device 12 in relation to this disclosure. However, the system related to this disclosure is not necessarily implemented on a server. The system related to this disclosure may be implemented as a general information processing system. This disclosure may be implemented, for example, as a software program that runs on a personal computer or as an application that runs on a smartphone. The method related to this disclosure may be provided to users in SaaS (Software as a Service) format.

[0708] In the above embodiment, an example was given in which a specific process is performed by a single computer 22. However, the technology of this disclosure is not limited thereto, and a distributed processing of the specific process may be performed by multiple computers, including computer 22. For example, a data generation model 58 may be provided in an external device of the data processing device 12, and the external device may generate data according to the input data.

[0709] In the above embodiment, an example was given in which the specific processing program 56 is stored in the storage 32, but the technology of this disclosure is not limited thereto. For example, the specific processing program 56 may be stored in a portable, computer-readable, non-transitory storage medium such as a USB (Universal Serial Bus) memory. The specific processing program 56 stored in the non-transitory storage medium is installed in the computer 22 of the data processing device 12. The processor 28 executes specific processing according to the specific processing program 56.

[0710] Alternatively, the specific processing program 56 may be stored in a storage device such as a server connected to the data processing device 12 via the network 54, and the specific processing program 56 may be downloaded and installed on the computer 22 in response to a request from the data processing device 12.

[0711] Furthermore, it is not necessary to store the entirety of the specific processing program 56 in a storage device such as a server connected to the data processing device 12 via the network 54, or to store the entirety of the specific processing program 56 in the storage 32; it is acceptable to store only a portion of the specific processing program 56.

[0712] The following types of processors can be used as hardware resources to perform specific processing. One example is a CPU, which is a general-purpose processor that functions as a hardware resource performing specific processing by executing software, i.e., a program. Other examples include dedicated electrical circuits, such as FPGAs (Field-Programmable Gate Arrays), PLDs (Programmable Logic Devices), or ASICs (Application Specific Integrated Circuits), which are processors having circuit configurations specifically designed to perform specific processing. Each of these processors has built-in or connected memory, and each performs specific processing by using that memory.

[0713] The hardware resource that performs a specific process may consist of one of these various processors, or it may consist of a combination of two or more processors of the same or different types (for example, a combination of multiple FPGAs, or a combination of a CPU and an FPGA). Alternatively, the hardware resource that performs a specific process may consist of a single processor.

[0714] Examples of configurations using a single processor include, firstly, a configuration in which one or more CPUs and software are combined to form a single processor, and this processor functions as a hardware resource that performs a specific process. Secondly, there is a configuration using a processor that realizes the functions of the entire system, including multiple hardware resources that perform a specific process, on a single IC chip, as exemplified by SoCs (System-on-a-chip). In this way, a specific process is realized using one or more of the above types of processors as hardware resources.

[0715] Furthermore, the hardware structure of these various processors can more specifically utilize electrical circuits that combine circuit elements such as semiconductor devices. Also, the specific processing described above is merely an example. Therefore, it goes without saying that unnecessary steps can be deleted, new steps added, or the processing order rearranged, as long as it does not deviate from the main purpose.

[0716] The descriptions and illustrations presented above are detailed explanations of the technical aspects of this disclosure and are merely examples of the technical aspects. For example, the above descriptions of the structure, function, operation, and effect are examples of the structure, function, operation, and effect of the technical aspects of this disclosure. Therefore, it goes without saying that you may delete unnecessary parts, add new elements, or replace elements in the descriptions and illustrations presented above, as long as you do not deviate from the essence of the technical aspects of this disclosure. Furthermore, in order to avoid confusion and facilitate understanding of the technical aspects of this disclosure, explanations of common technical knowledge and the like that do not require special explanation to enable the implementation of the technical aspects of this disclosure have been omitted from the descriptions and illustrations presented above.

[0717] All documents, patent applications, and technical standards described herein are incorporated by reference to the same extent as if each individual document, patent application, and technical standard were specifically and individually noted to be incorporated by reference.

[0718] The following is further disclosed regarding the embodiments described above.

[0719] (Claim 1)

[0720] Means for receiving image data,

[0721] A means for extracting the skeletal features of a target from received image data,

[0722] A means for comparing extracted skeletal features with data from past successful athletes,

[0723] A means for determining the appropriate type of exercise for the subject based on the comparison results,

[0724] A means for outputting the judgment result,

[0725] A system that includes this.

[0726] (Claim 2)

[0727] The system according to claim 1, which uses computer vision technology when extracting skeletal features from image data.

[0728] (Claim 3)

[0729] The system according to claim 1, which provides the user with information related to the determined type of exercise.
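The flow recited above (extract skeletal features, compare them with data of past successful athletes, determine a suitable type of exercise) could be sketched as follows. This is a hedged illustration only: the keypoint names, the two ratio features, the use of Euclidean distance, the sport labels, and the reference values are hypothetical placeholders, not data or methods taken from the disclosure. Keypoint detection from the image is assumed to have been done already by some computer vision component.

```python
import math


def skeletal_features(kp):
    """Turn 2D keypoints into scale-free ratios (hypothetical feature set)."""
    height = math.dist(kp["head"], kp["ankle"])
    arm = math.dist(kp["shoulder"], kp["wrist"])
    leg = math.dist(kp["hip"], kp["ankle"])
    return (arm / height, leg / height)


def rank_sports(features, profiles):
    """Order sports from closest to farthest reference profile."""
    return sorted(profiles, key=lambda s: math.dist(features, profiles[s]))


# Placeholder reference profiles; the numbers are invented for illustration
# and do not describe real athletes.
PROFILES = {"sport_a": (0.45, 0.50), "sport_b": (0.38, 0.45)}

# Placeholder keypoints in pixel coordinates (x, y).
keypoints = {
    "head": (0, 0),
    "shoulder": (0, 30),
    "wrist": (0, 100),
    "hip": (0, 85),
    "ankle": (0, 170),
}

ranking = rank_sports(skeletal_features(keypoints), PROFILES)
print(ranking[0])  # the closest profile is the determination result
```

Using ratios rather than raw pixel lengths keeps the comparison independent of how far the subject stood from the camera.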

[0730] "Example 1"

[0731] (Claim 1)

[0732] Means for acquiring image information,

[0733] A means of analyzing body structure features from acquired image information,

[0734] A means of comparing the analyzed physical structural characteristics with information from past successful exercise participants,

[0735] A means for identifying the type of exercise suitable for the subject based on the comparison results,

[0736] Means for providing specific results,

[0737] A system that includes this.

[0738] (Claim 2)

[0739] The system according to claim 1, which uses machine learning techniques when analyzing body structure features from image information.

[0740] (Claim 3)

[0741] The system according to claim 1, which provides the user with information related to the identified type of exercise.

[0742] "Application Example 1"

[0743] (Claim 1)

[0744] Means for receiving image data,

[0745] A means for extracting the skeletal features of a target from received image data,

[0746] A means for comparing extracted skeletal features with data from past successful athletes,

[0747] A means for determining the appropriate type of exercise for the subject based on the comparison results,

[0748] A means for outputting the judgment result,

[0749] A means for proposing an appropriate exercise program based on the output judgment result,

[0750] A voice output means for guiding the proposed exercise program,

[0751] A system that includes this.

[0752] (Claim 2)

[0753] The system according to claim 1, which uses computer vision technology when extracting skeletal features from image data.

[0754] (Claim 3)

[0755] The system according to claim 1, which provides the user with information related to the determined exercise type and an exercise program based thereon.

[0756] "Example 2 of combining an emotion engine"

[0757] (Claim 1)

[0758] A means of acquiring and transmitting image data,

[0759] A means for extracting the skeletal features of a target from received image data using an analysis device,

[0760] A means for matching extracted skeletal features with data on past successful athletes in an information base,

[0761] A means for identifying the most suitable type of exercise for the subject based on the matching results,

[0762] A means for recognizing the emotional state of a subject using sensors and analysis devices, and adjusting the suitability of the type of exercise,

[0763] A means for outputting the final proposed results of the exercise category using an output device,

[0764] A system that includes this.

[0765] (Claim 2)

[0766] The system according to claim 1, which uses visual information analysis technology when extracting skeletal features from image data.

[0767] (Claim 3)

[0768] The system according to claim 1, comprising an output device for providing information related to a specified type of exercise.
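The adjustment of suitability according to the recognized emotional state could be sketched as follows. The disclosure does not specify an adjustment formula, so the multiplicative form, the `weight` parameter, the per-sport emotion valence in the range -1 to 1, and all values below are assumptions for illustration.

```python
def adjust_suitability(base_scores, emotion_by_sport, weight=0.3):
    """Scale each sport's suitability by the subject's emotional response.

    base_scores: suitability from the skeletal comparison (higher is better).
    emotion_by_sport: assumed valence in [-1, 1] observed for each sport.
    weight: assumed strength of the emotional adjustment.
    """
    adjusted = {}
    for sport, score in base_scores.items():
        valence = emotion_by_sport.get(sport, 0.0)  # no data: leave unchanged
        adjusted[sport] = score * (1.0 + weight * valence)
    return adjusted


# Hypothetical inputs.
base = {"sport_a": 0.80, "sport_b": 0.75}
emotion = {"sport_a": -0.5, "sport_b": 0.6}

adjusted = adjust_suitability(base, emotion)
best = max(adjusted, key=adjusted.get)
print(best)
```

In this example the sport with the slightly lower skeletal match becomes the final proposal because the subject responds to it more positively.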

[0769] "Application example 2 when combining with an emotional engine"

[0770] (Claim 1)

[0771] Means for receiving image information,

[0772] A means for extracting the skeletal features of a target from received image information,

[0773] A means for comparing extracted skeletal features with data from past successful athletes,

[0774] A means for determining appropriate physical activity for a subject based on the comparison results,

[0775] A means for outputting the judgment result,

[0776] Means for evaluating emotional states,

[0777] A means of modifying the judgment result according to the emotional state,

[0778] A system that includes this.

[0779] (Claim 2)

[0780] The system according to claim 1, which uses image analysis technology to extract skeletal features from image information.

[0781] (Claim 3)

[0782] The system according to claim 1, which provides the user with data related to the determined physical activity.

[Explanation of symbols]

[0783]
10, 210, 310, 410 Data processing system
12 Data processing device
14 Smart device
214 Smart glasses
314 Headset-type terminal
414 Robot

Claims

1. A system including: means for receiving image data; means for extracting skeletal features of a target from the received image data; means for comparing the extracted skeletal features with data of past successful athletes; means for determining a type of exercise suitable for the target based on the comparison result; means for outputting the determination result; means for proposing an appropriate exercise program based on the output determination result; and voice output means for guiding the proposed exercise program.

2. The system according to claim 1, which uses computer vision technology when extracting skeletal features from image data.

3. The system according to claim 1, which provides the user with information related to the determined exercise type and an exercise program based thereon.