Work machine, information processing apparatus, and work machine operation system

The work machine system improves operational effectiveness by integrating natural language input and construction information to execute instructions beyond basic movements, adapting to the working environment.

WO2026134145A1 · PCT designated stage · Publication Date: 2026-06-25 · SUMITOMO HEAVY IND LTD

Patent Information

Authority / Receiving Office: WO · WO
Patent Type: Applications
Current Assignee / Owner: SUMITOMO HEAVY IND LTD
Filing Date: 2025-12-12
Publication Date: 2026-06-25

AI Technical Summary

Technical Problem

Existing work machine operation systems are limited to specific instructions like moving forward, backward, left, or right, and excavation, failing to account for the working environment.

Method used

A work machine equipped with an instruction acquisition unit for natural language input, a construction information acquisition unit, and a control unit using a language model to interpret and execute instructions based on both operator input and construction information.

Benefits of technology

Enhances the effectiveness of work machine operations by allowing instructions tailored to the working environment, improving operational efficiency and adaptability.

✦ Generated by Eureka AI based on patent content.


Abstract

A work machine according to one embodiment of the present disclosure is provided with an instruction acquisition unit that acquires an instruction in natural language from an operator, a construction information acquisition unit that acquires linguistically expressed construction information, and a control unit that controls the operation of the work machine on the basis of the result of causing a language model to interpret the instruction acquired by the instruction acquisition unit and the information acquired by the construction information acquisition unit.

Description

Work machine, information processing device, and operating system of work machine

[0001] The present disclosure relates to a work machine, an information processing device, and an operating system of the work machine.

[0002] Conventionally, a technique for operating a work machine based on an instruction in a natural language such as voice or text input has been known (see Patent Document 1).

[0003] In Patent Document 1, it is possible to perform voice recognition on an instruction from an operator and cause the work machine to perform an operation corresponding to the content of the voice recognition.

[0004] Patent Document 1: Japanese Patent Application Laid-Open No. 2000-056827

[0005] However, in Patent Document 1, it is only possible to instruct specific operations of the work machine, such as instructions for moving forward, backward, left, or right and instructions for excavation, and it is not possible to give instructions for operations according to the working environment, for example.

[0006] Therefore, in view of the above problems, an object is to provide a technique capable of improving the effectiveness of operating a work machine according to an instruction in the natural language of an operator.

[0007] To achieve the above object, in one embodiment of the present disclosure, there is provided a work machine including an instruction acquisition unit that acquires an instruction in the natural language from an operator, a construction information acquisition unit that acquires information obtained by verbalizing construction information, and a control unit that controls the operation of the work machine based on the result of interpreting the instruction acquired by the instruction acquisition unit and the information acquired by the construction information acquisition unit using a language model.
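The data flow in paragraph [0007] can be illustrated with a minimal sketch. It is not the disclosed implementation: the class name, the prompt wording, the operation names, and the `fake_model` stand-in for a language model are all hypothetical and exist only so the example runs offline.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical stand-in type for a language model: prompt text in, reply text out.
LanguageModel = Callable[[str], str]


@dataclass
class WorkMachineController:
    """Toy control unit that combines both inputs into one prompt (assumption)."""

    language_model: LanguageModel

    def build_prompt(self, instruction: str, construction_info: str) -> str:
        # The operator instruction and the verbalized construction information
        # are interpreted together, as the paragraph above describes.
        return (
            "Construction information:\n" + construction_info + "\n\n"
            "Operator instruction:\n" + instruction + "\n\n"
            "Reply with one operation name."
        )

    def decide_operation(self, instruction: str, construction_info: str) -> str:
        prompt = self.build_prompt(instruction, construction_info)
        return self.language_model(prompt).strip()


def fake_model(prompt: str) -> str:
    # Toy keyword rule standing in for a real model; operation names are invented.
    if "slope" in prompt.lower():
        return "finish_slope"
    return "excavate"


controller = WorkMachineController(fake_model)
operation = controller.decide_operation(
    "Dig here", "Flat ground, target depth 1 m"
)
```

In this sketch the same instruction can lead to a different operation when the construction information changes, which is the point the paragraph makes about adapting to the working environment.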

[0008] According to the above-described embodiment, it is possible to improve the effectiveness of operating the work machine according to an instruction in the natural language of the operator.

[0009] Figure 1 is a side view showing a work machine according to the first embodiment.
Figure 2 is a diagram showing an example of the configuration of the drive control system of the work machine according to the first embodiment.
Figure 3 is a functional block diagram showing an example of the configuration of the controller according to the first embodiment.
Figure 4 is a diagram showing an example of a language model.
Figure 5 is a diagram showing an example of an explanation included in the reference information.
Figure 6 is a diagram showing an example of an example problem included in the reference information.
Figure 7 is a diagram showing an example of the display of the display device.
Figure 8 is the first diagram for explaining a specific example of operation support processing.
Figure 9 is the second diagram for explaining a specific example of operation support processing.
Figure 10 is the third diagram for explaining a specific example of operation support processing.
Figure 11 is a flowchart showing an example of operation support processing according to the first embodiment.
Figure 12 is a flowchart showing an example of operation support processing according to a modified example.
Figure 13 is a diagram showing an example of reference information according to a modified example.
Figure 14 is a schematic diagram showing an example of an operation support system according to the second embodiment.
Figure 15 is a diagram showing an example of the hardware configuration of a server according to the second embodiment.
Figure 16 is a sequence diagram showing an example of operation support processing according to the second embodiment.
Figure 17 is a schematic diagram showing an example of a remote control system according to the third embodiment.
Figure 18 is a sequence diagram showing an example of operation support processing according to the third embodiment.

[0010] Embodiments of this disclosure will be described below with reference to the drawings. The embodiments described below are illustrative and do not limit the invention, and not all features or combinations thereof described in the embodiments are necessarily essential to the invention. In each drawing, identical or corresponding components are denoted by the same or corresponding reference numerals, and their descriptions may be omitted.

[0011] The work machine 100 according to the embodiment of this disclosure is a shovel. The work machine 100 may be a machine other than a shovel, such as a crane, an asphalt finisher, or a forklift. In the illustrated example, the shovel serving as the work machine 100 is an excavator equipped with a bucket 6 as an end attachment, but it may be an applied machine, such as a forestry machine, equipped with an end attachment other than the bucket 6.

[0012] (First Embodiment) An overview of the work machine 100 according to the first embodiment will be described with reference to Figure 1. Figure 1 is a side view showing the work machine 100 according to the first embodiment.

[0013] <Example of the configuration of the work machine> The upper rotating body 3 is mounted on the lower traveling body 1 of the work machine 100 via a slewing mechanism 2 so as to be rotatable. A boom 4 is attached to the upper rotating body 3. An arm 5 is attached to the tip of the boom 4, and a bucket 6 as an end attachment is attached to the tip of the arm 5. The end attachment may be a slope bucket or a dredging bucket, etc.

[0014] In the example shown in Figure 1, the direction of movement (forward and backward direction) of the work machine 100 is indicated by the X-axis, the width direction of the work machine 100 is indicated by the Y-axis, and the height direction of the work machine 100 is indicated by the Z-axis.

[0015] Furthermore, the work machine 100 may have all or part of its driven parts, such as the lower traveling body 1, upper slewing body 3, boom 4, arm 5, and bucket 6, electrically driven. In other words, the work machine 100 may be a hybrid excavator or electric excavator, in which all or part of its driven parts are driven by electric actuators.

[0016] Figure 1 illustrates a lower traveling body 1 having a pair of left and right crawlers, but the work machine 100 is not limited to a crawler-type excavator. The work machine 100 may also be a wheeled excavator including a lower traveling body 1 with multiple tires.

[0017] The boom 4, arm 5, and bucket 6 constitute an excavation attachment, which is an example of an attachment, and are hydraulically driven by the boom cylinder 7, arm cylinder 8, and bucket cylinder 9, respectively. A boom angle sensor S1 is attached to the boom 4, an arm angle sensor S2 is attached to the arm 5, and a bucket angle sensor S3 is attached to the bucket 6. The excavation attachment may also be provided with a bucket tilt mechanism.

[0018] The boom angle sensor S1 detects the rotation angle of the boom 4. In this embodiment, the boom angle sensor S1 is an acceleration sensor and can detect the boom angle, which is the rotation angle of the boom 4 relative to the upper slewing body 3. The boom angle is smallest when the boom 4 is lowered to its lowest position, and increases as the boom 4 is raised. The detection signal corresponding to the boom angle from the boom angle sensor S1 is input to the controller 30.

[0019] The boom angle sensor S1 may include, for example, a rotary encoder, an acceleration sensor, a 6-axis sensor, an IMU (Inertial Measurement Unit), etc. The boom angle sensor S1 may also include a potentiometer using a variable resistor, a cylinder stroke sensor for detecting the stroke amount of the hydraulic cylinder (boom cylinder 7) corresponding to the boom angle, etc. The same applies to the arm angle sensor S2, bucket angle sensor S3, and machine body tilt sensor S4.

[0020] The arm angle sensor S2 detects the rotation angle of the arm 5. In this embodiment, the arm angle sensor S2 is an acceleration sensor and can detect the arm angle, which is the rotation angle of the arm 5 relative to the boom 4. The arm angle is smallest when the arm 5 is fully closed, and increases as the arm 5 is opened. The detection signal corresponding to the arm angle from the arm angle sensor S2 is input to the controller 30.

[0021] The bucket angle sensor S3 detects the rotation angle of the bucket 6. In this embodiment, the bucket angle sensor S3 is an acceleration sensor and can detect the bucket angle, which is the rotation angle of the bucket 6 relative to the arm 5. The bucket angle is smallest when the bucket 6 is fully closed, and increases as the bucket 6 is opened. The detection signal corresponding to the bucket angle from the bucket angle sensor S3 is input to the controller 30.

[0022] The boom angle sensor S1, arm angle sensor S2, and bucket angle sensor S3 may be a potentiometer using a variable resistor, a stroke sensor that detects the stroke amount of the corresponding hydraulic cylinder, or a rotary encoder that detects the rotation angle around the connecting pin. The boom angle sensor S1, arm angle sensor S2, and bucket angle sensor S3 constitute an attitude sensor that detects the attitude of the excavation attachment.
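As an illustration of how the three joint angles describe the attitude of the excavation attachment, the following sketch computes the bucket tip position with planar forward kinematics. The link lengths and the angle convention (each angle is the counterclockwise rotation of a link relative to the previous one, with the boom measured from the horizontal) are assumptions for this example. The sensors S1 to S3 define their own zero points, so offsets would be needed in practice.

```python
import math


def bucket_tip_position(boom_deg, arm_deg, bucket_deg,
                        boom_len=5.7, arm_len=2.9, bucket_len=1.4):
    """Return (x, z) of the bucket tip relative to the boom foot pin, in meters.

    Link lengths are hypothetical defaults. Angles are relative joint angles
    in degrees, counterclockwise positive (an assumed convention).
    """
    # Accumulate the relative joint angles into absolute link angles.
    boom_abs = math.radians(boom_deg)
    arm_abs = boom_abs + math.radians(arm_deg)
    bucket_abs = arm_abs + math.radians(bucket_deg)

    # Sum the horizontal (x) and vertical (z) projections of each link.
    x = (boom_len * math.cos(boom_abs)
         + arm_len * math.cos(arm_abs)
         + bucket_len * math.cos(bucket_abs))
    z = (boom_len * math.sin(boom_abs)
         + arm_len * math.sin(arm_abs)
         + bucket_len * math.sin(bucket_abs))
    return x, z


# Boom raised 45 degrees, arm folded 90 degrees down, bucket curled 30 degrees.
tip_x, tip_z = bucket_tip_position(45.0, -90.0, -30.0)
```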

[0023] The upper rotating body 3 is equipped with a cabin 10, which serves as the operator's cab, and is also fitted with a power source such as an engine 11. Furthermore, the upper rotating body 3 is fitted with a machine tilt sensor S4, a rotation angle sensor S5, and an imaging device S6. Additionally, the upper rotating body 3 is fitted with a communication device T1 and a positioning device PS.

[0024] A controller 30 is installed inside the cabin 10. The cabin 10 also contains a driver's seat and operating devices. Furthermore, a display device D1 and an input device D2 are installed inside the cabin 10. The display device D1 is positioned so that it can be seen by the operator seated in the driver's seat. The input device D2 is positioned within reach of the operator, or within range of the operator's spoken voice. The input device D2 may also be provided on the display device D1.

[0025] The controller 30 is an arithmetic unit that performs various calculations. The controller 30 is installed, for example, in the cabin 10 and controls the drive of the work machine 100. The functions of the controller 30 may be realized by any hardware, software, or a combination thereof. For example, the controller 30 is mainly composed of a microcomputer that includes a CPU (Central Processing Unit), a memory device such as RAM (Random Access Memory), a non-volatile auxiliary storage device such as ROM (Read Only Memory), and various input / output interface devices. The controller 30 realizes various functions by executing various programs installed in the non-volatile auxiliary storage device on the CPU.

[0026] The machine tilt sensor S4 is configured to detect the tilt of the upper rotating body 3 with respect to a predetermined plane. In this embodiment, the machine tilt sensor S4 is an acceleration sensor that detects the tilt angle of the upper rotating body 3 around the longitudinal axis and the tilt angle around the left-right axis with respect to the horizontal plane. The longitudinal axis and left-right axis of the upper rotating body 3 are, for example, orthogonal to each other and pass through a center point which is a point on the rotation axis of the work machine 100. The detection signal corresponding to the tilt angle from the machine tilt sensor S4 is received by the controller 30.
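One common way to obtain two tilt angles from a three-axis acceleration sensor is to use the direction of gravity while the machine is stationary. The sketch below assumes the axes of paragraph [0014] (X longitudinal, Y left-right, Z height) and one particular sign convention. Neither the function name nor the convention comes from the disclosure.

```python
import math


def tilt_from_gravity(ax, ay, az):
    """Return (roll_deg, pitch_deg) from a static accelerometer reading.

    roll  : tilt about the longitudinal (X) axis
    pitch : tilt about the left-right (Y) axis
    Valid only when gravity is the sole acceleration acting on the sensor.
    """
    roll = math.atan2(ay, az)
    pitch = math.atan2(-ax, math.hypot(ay, az))
    return math.degrees(roll), math.degrees(pitch)


# Example reading for a machine leaning 10 degrees about its longitudinal axis.
g = 9.81
roll_deg, pitch_deg = tilt_from_gravity(
    0.0, g * math.sin(math.radians(10.0)), g * math.cos(math.radians(10.0))
)
```

While the machine is working, accelerations other than gravity disturb this estimate, which is one reason paragraph [0019] also lists sensors such as an IMU.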

[0027] The rotation angle sensor S5 is configured to detect the rotational angular velocity of the upper rotating body 3. In this embodiment, the rotation angle sensor S5 is a gyro sensor. The rotation angle sensor S5 may also be a resolver or a rotary encoder, etc. The rotation angle sensor S5 may also detect the rotational speed. The rotational speed may be calculated from the rotational angular velocity. The detection signal corresponding to the rotational angle or rotational angular velocity of the upper rotating body 3 detected by the rotation angle sensor S5 is received by the controller 30.
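The relationship between rotational angular velocity, rotational speed, and rotation angle can be sketched as follows. The sketch assumes the gyro output is in radians per second and is sampled at a fixed interval. The function names are hypothetical.

```python
import math


def swing_speed_rpm(omega_rad_s):
    """Convert an angular velocity in rad/s to revolutions per minute."""
    return omega_rad_s * 60.0 / (2.0 * math.pi)


def integrate_swing_angle(omega_samples, dt, initial_deg=0.0):
    """Estimate the rotation angle in degrees by summing omega * dt.

    omega_samples: angular velocities in rad/s, one per sample period dt (s).
    The result is wrapped into the range [0, 360).
    """
    angle = initial_deg
    for omega in omega_samples:
        angle += math.degrees(omega) * dt
    return angle % 360.0


# Ten samples of pi/2 rad/s at 0.1 s each correspond to a quarter turn.
angle_deg = integrate_swing_angle([math.pi / 2.0] * 10, 0.1)
```

Simple integration of this kind accumulates gyro drift over time, so it is only a rough illustration of the calculation.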

[0028] Furthermore, if the machine tilt sensor S4 includes a gyro sensor, a 6-axis sensor, an IMU, etc., capable of detecting angular velocity around three axes, the rotation state of the upper rotating body 3 (for example, rotational angular velocity) may be detected based on the detection signal from the machine tilt sensor S4. In this case, the rotation angle sensor S5 may be omitted.

[0029] The imaging device S6 is mounted on the upper rotating body 3 or the cabin 10 and captures images of the area around the work machine 100, acquiring surrounding image information representing the area around the work machine 100. In the illustrated example, the imaging device S6 includes a front camera S6F, a left camera S6L, a right camera S6R, and a rear camera S6B. The imaging device S6 is an example of a detection device.

[0030] The front camera S6F is a camera that captures images in front of the work machine 100 and is mounted on the outside of the cabin 10, such as on the roof of the cabin 10 or on the side of the boom 4. The front camera S6F may also be mounted inside the cabin 10, such as on the ceiling of the cabin 10. The left camera S6L is a camera that captures images to the left of the work machine 100 and is mounted on the upper left end of the upper surface of the upper slewing body 3. The right camera S6R is a camera that captures images to the right of the work machine 100 and is mounted on the upper right end of the upper surface of the upper slewing body 3. The rear camera S6B is a camera that captures images behind the work machine 100 and is mounted on the upper rear end of the upper surface of the upper slewing body 3.

[0031] The front camera S6F, left camera S6L, right camera S6R, and rear camera S6B are all monocular wide-angle cameras equipped with image sensors such as a CCD (Charge-Coupled Device) or CMOS (Complementary Metal Oxide Semiconductor) sensor, and they output the captured images to the display device D1. Images captured by the front camera S6F, left camera S6L, right camera S6R, and rear camera S6B are also input to the controller 30.

[0032] The front camera S6F, rear camera S6B, left camera S6L, and right camera S6R are all mounted on the upper rotating body 3 such that their optical axes are pointed diagonally downwards and a portion of the upper rotating body 3 is included in their imaging range. Therefore, the imaging ranges of the front camera S6F, rear camera S6B, left camera S6L, and right camera S6R each have a field of view of approximately 180 degrees when viewed from above.

[0033] In this embodiment, by arranging the imaging device S6 in the configuration described above, objects present around the work machine 100 can be imaged. The imaging device S6 may also use a camera capable of recognizing the distance to the object to be photographed (for example, an RGBD camera or a stereo camera).

[0034] The imaging device S6 may constitute a detection device for detecting objects to be monitored in the vicinity of the work machine 100. The detection device may consist of devices other than a camera. For example, the detection device may be a LiDAR (Light Detection And Ranging). A LiDAR is, for example, a device capable of measuring the distance from the LiDAR (laser source) to each of one million or more points within the monitoring range. Alternatively, the detection device may be another device capable of measuring the distance to an object, such as a stereo camera, a depth image camera, or a millimeter-wave radar. When a millimeter-wave radar or the like is used as the detection device, the detection device may derive the distance and direction of the object by transmitting a large number of signals (such as laser light) toward the object and receiving the reflected signals. Alternatively, the detection device may be a combination of two or more types of devices. For example, the detection device may be a combination of an imaging device and a LiDAR, a combination of an imaging device and a millimeter-wave radar, or a combination of an imaging device and a stereo camera.

[0035] The imaging device S6 may consist of at least one of the following: a monocular camera, a stereo camera, a depth image camera, a LiDAR, or a millimeter-wave radar. For example, the imaging device S6 may consist of any combination of the following: a monocular camera only, a stereo camera only, a depth image camera only, a combination of LiDAR and a monocular camera, a combination of LiDAR and a stereo camera, a combination of LiDAR and a depth image camera, a combination of millimeter-wave radar and a monocular camera, a combination of millimeter-wave radar and a stereo camera, a combination of millimeter-wave radar and a depth image camera, etc.

[0036] The input device D2 receives operation input from the operator and outputs it to the controller 30. The input device D2 includes, for example, any hardware operation means such as a touch panel, touch pad, button, toggle, or rotary knob. The input device D2 may also include software operation means that can be operated through the hardware operation means, such as virtual button icons on an operation screen displayed on the display device D1. The input device D2 may also include, for example, a microphone that picks up the operator's voice and voice input means that accepts operation input from the operator by performing voice recognition and natural language processing on the voice signal picked up by the microphone.

[0037] The positioning device PS is configured to acquire information regarding the position of the work machine 100. In this embodiment, the positioning device PS is configured to measure the position and orientation of the work machine 100 in a reference coordinate system. Specifically, the positioning device PS is a GNSS (Global Navigation Satellite System) receiver incorporating an electronic compass, and measures the latitude, longitude, and altitude of the current position of the work machine 100, as well as the orientation of the work machine 100. The reference coordinate system in this embodiment is, for example, the World Geodetic System. The World Geodetic System is a three-dimensional orthogonal XYZ coordinate system with its origin at the center of gravity of the Earth, the X-axis pointing in the direction of the intersection of the Greenwich Meridian and the equator, the Y-axis pointing in the direction of 90 degrees east longitude, and the Z-axis pointing in the direction of the North Pole.
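The link between the measured latitude, longitude, and altitude and the Earth-centered XYZ coordinate system described above can be illustrated with the standard geodetic-to-ECEF conversion. The sketch assumes the WGS 84 ellipsoid parameters. The disclosure only says "World Geodetic System", so the specific realization and the function name are assumptions.

```python
import math

# WGS 84 ellipsoid parameters (assumed realization of the World Geodetic System).
WGS84_A = 6378137.0            # semi-major axis in meters
WGS84_F = 1.0 / 298.257223563  # flattening


def geodetic_to_ecef(lat_deg, lon_deg, alt_m):
    """Convert latitude, longitude (degrees) and ellipsoidal height (m) to X, Y, Z (m)."""
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    e2 = WGS84_F * (2.0 - WGS84_F)  # first eccentricity squared
    # Prime vertical radius of curvature at this latitude.
    n = WGS84_A / math.sqrt(1.0 - e2 * math.sin(lat) ** 2)
    x = (n + alt_m) * math.cos(lat) * math.cos(lon)
    y = (n + alt_m) * math.cos(lat) * math.sin(lon)
    z = (n * (1.0 - e2) + alt_m) * math.sin(lat)
    return x, y, z


# A point on the equator at the Greenwich Meridian lies on the X-axis.
x, y, z = geodetic_to_ecef(0.0, 0.0, 0.0)
```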

[0038] The communication device T1 is connected to an external communication line and is configured to control communication with equipment located outside the work machine 100. The communication device T1 may also communicate with equipment provided separately from the work machine 100. Equipment provided separately from the work machine 100 may include not only equipment located outside the work machine 100, but also portable terminal devices (mobile terminals) brought into the cabin 10 by the operator of the work machine 100. In this embodiment, the communication device T1 is configured to control communication between the communication device T1 and equipment located outside the work machine 100 via a wireless communication network. The communication device T1 may include, for example, a mobile communication module that supports mobile communication standards such as LTE (Long Term Evolution), 4G (4th Generation), and 5G (5th Generation). The communication device T1 may also include, for example, a satellite communication module for connecting to a satellite communication network. Furthermore, the communication device T1 may include, for example, a Wi-Fi communication module or a Bluetooth® communication module. Furthermore, if there are multiple connectable communication lines, the communication device T1 may include multiple communication devices T1, corresponding to the types of communication lines.

[0039] For example, the communication device T1 communicates with external devices, such as a remote control room within the work site, via a local communication line established at the work site. The local communication line is, for example, a fifth-generation mobile communication line dedicated to the work site (so-called local 5G) or a local Wi-Fi network established at the work site. Furthermore, the communication device T1 is configured to send and receive information with communication devices installed in the remote control room via a wide-area communication line that covers the work site, i.e., a wide-area network.

[0040] <Drive Control System of the Work Machine> Figure 2 shows an example of the configuration of the drive control system of the work machine 100 shown in Figure 1. In Figure 2, the mechanical power transmission system is shown by double lines, the hydraulic fluid lines by thick solid lines, the pilot lines by dashed lines, and the electric drive and control system by dotted lines.

[0041] The drive system of the work machine 100 according to this embodiment includes an engine 11, a regulator 13, a main pump 14, a pilot pump 15, and a control valve unit 17. Further, the hydraulic drive system of the work machine 100 according to this embodiment includes hydraulic actuators such as travel hydraulic motors 1L and 1R, a swing hydraulic motor 2A, a boom cylinder 7, an arm cylinder 8, and a bucket cylinder 9 that hydraulically drive the lower traveling body 1, the upper swing body 3, the boom 4, the arm 5, and the bucket 6, respectively.

[0042] The engine 11 is the main power source in the hydraulic drive system and is mounted, for example, at the rear of the upper swing body 3. Specifically, the engine 11 rotates at a constant speed at a preset target rotational speed under the direct or indirect control of the controller 30, and drives the main pump 14 and the pilot pump 15. The engine 11 is, for example, a diesel engine that runs on diesel fuel.

[0043] The rotary shafts of the main pump 14 and the pilot pump 15 as hydraulic pumps are connected to the rotary shaft of the engine 11. The control valve unit 17 is connected to the main pump 14 via a hydraulic oil line.

[0044] The regulator 13 controls the discharge amount of the main pump 14. For example, the regulator 13 adjusts the angle (tilt angle) of the swash plate of the main pump 14 according to a control command from the controller 30.

[0045] The main pump 14 is mounted, for example, at the rear of the upper swing body 3 like the engine 11, and supplies hydraulic oil to the control valve unit 17 through a high-pressure hydraulic line. The main pump 14 is driven by the engine 11. The main pump 14 is, for example, a variable-displacement hydraulic pump, and under the control of the controller 30, the stroke length of the piston is adjusted by adjusting the tilt angle of the swash plate by the regulator 13, and the discharge flow rate (discharge pressure) is controlled.
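The dependence of the discharge flow rate on the swash plate tilt angle can be illustrated with the textbook geometry of a swash-plate axial piston pump, in which the piston stroke equals the piston pitch-circle diameter times the tangent of the tilt angle. Every dimension below is a hypothetical value, and the pump shaft is assumed to turn at engine speed. None of this is taken from the disclosure.

```python
import math


def pump_flow_l_per_min(tilt_deg, engine_rpm, pistons=9,
                        piston_diameter_m=0.02, pitch_diameter_m=0.08,
                        volumetric_efficiency=1.0):
    """Ideal discharge flow of a swash-plate axial piston pump in L/min.

    All geometric defaults are made-up illustration values.
    """
    # Piston stroke grows with the swash plate tilt angle.
    stroke_m = pitch_diameter_m * math.tan(math.radians(tilt_deg))
    piston_area_m2 = math.pi * (piston_diameter_m / 2.0) ** 2
    # Volume displaced per shaft revolution, in cubic meters.
    displacement_m3 = pistons * piston_area_m2 * stroke_m
    # 1 cubic meter = 1000 liters.
    return displacement_m3 * engine_rpm * volumetric_efficiency * 1000.0


# Flow at a constant engine speed for two swash plate angles.
low_flow = pump_flow_l_per_min(5.0, 1800.0)
high_flow = pump_flow_l_per_min(15.0, 1800.0)
```

Because the engine 11 runs at a constant target speed, changing the tilt angle is the means of varying the flow in this simplified model.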

[0046] The control valve unit 17 is a hydraulic control device that controls the hydraulic system in the work machine 100. In this embodiment, the control valve unit 17 includes control valves 171 to 176. The control valve unit 17 is configured to selectively supply hydraulic fluid discharged by the main pump 14 to one or more hydraulic actuators through the control valves 171 to 176. The control valves 171 to 176 control, for example, the flow rate of hydraulic fluid flowing from the main pump 14 to the hydraulic actuators, and the flow rate of hydraulic fluid flowing from the hydraulic actuators to the hydraulic fluid tank. The hydraulic actuators include a boom cylinder 7, an arm cylinder 8, a bucket cylinder 9, travel hydraulic motors 1L and 1R, and a slewing hydraulic motor 2A. More specifically, control valve 171 corresponds to the travel hydraulic motor 1L, control valve 172 corresponds to the travel hydraulic motor 1R, and control valve 173 corresponds to the slewing hydraulic motor 2A. Furthermore, control valve 174 corresponds to bucket cylinder 9, control valve 175 corresponds to boom cylinder 7, and control valve 176 corresponds to arm cylinder 8.
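The correspondence between control valves 171 to 176 and the hydraulic actuators stated above can be written as a simple lookup table. The dictionary and helper function names are hypothetical. Only the valve-to-actuator pairs come from the paragraph.

```python
# Valve-to-actuator correspondence as stated in paragraph [0046].
CONTROL_VALVE_TO_ACTUATOR = {
    171: "travel hydraulic motor 1L",
    172: "travel hydraulic motor 1R",
    173: "slewing hydraulic motor 2A",
    174: "bucket cylinder 9",
    175: "boom cylinder 7",
    176: "arm cylinder 8",
}


def valves_for(actuators):
    """Return the sorted control valve numbers that serve the given actuators."""
    inverse = {name: valve for valve, name in CONTROL_VALVE_TO_ACTUATOR.items()}
    return sorted(inverse[name] for name in actuators)


# Valves involved in a combined boom and arm movement.
selected = valves_for(["boom cylinder 7", "arm cylinder 8"])
```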

[0047] The pilot pump 15 is an example of a pilot pressure generating device and is configured to supply hydraulic fluid to hydraulic control equipment via a pilot line. In this embodiment, the pilot pump 15 is a fixed-displacement hydraulic pump. However, the pilot pressure generating device may be implemented by the main pump 14. That is, the main pump 14 may have the function of supplying hydraulic fluid to the control valve unit 17 via a hydraulic fluid line, as well as the function of supplying hydraulic fluid to various hydraulic control equipment via a pilot line. In this case, the pilot pump 15 may be omitted.

[0048] The operating device 26 is provided near the driver's seat of the cabin 10 and is used by the operator to operate the various driven elements. Specifically, the operating device 26 is used by the operator to operate hydraulic actuators such as the left and right travel hydraulic motors 1L and 1R, the boom cylinder 7, the arm cylinder 8, the bucket cylinder 9, and the swing hydraulic motor 2A. This allows the operator to operate the driven elements driven by those hydraulic actuators. The operating device 26 includes a pedal device and a lever device for operating each driven element.

[0049] The discharge pressure sensor 28 is configured to detect the discharge pressure of the main pump 14. In the present embodiment, the discharge pressure sensor 28 outputs the detected value to the controller 30.

[0050] The operation sensor 29 is configured to detect the operation content of the operator using the operating device 26. In the present embodiment, the operation sensor 29 detects the operation direction and operation amount of the operating device 26 corresponding to each hydraulic actuator, and outputs an electrical signal (hereinafter also referred to as an operation signal) corresponding to the detected value to the controller 30. In the present embodiment, the controller 30 controls the opening area of the proportional valve 31 according to the output of the operation sensor 29. Then, the controller 30 supplies the hydraulic oil discharged by the pilot pump 15 to the pilot port of the corresponding control valve in the control valve unit 17. The pressure (pilot pressure) of the hydraulic oil supplied to each pilot port is, in principle, a pressure corresponding to the operation direction and operation amount of the operating device 26 corresponding to each hydraulic actuator. Thus, the operating device 26 is configured to supply the hydraulic oil discharged by the pilot pump 15 to the pilot port of the corresponding control valve in the control valve unit 17.

[0051] Also, the direction switching valve for driving each hydraulic actuator built in the control valve unit 17 may be an electromagnetic solenoid type. In this case, the operation signal output from the operating device 26 may be directly input to the control valve unit 17 (that is, to the electromagnetic solenoid type direction switching valve).

[0052] The operating device 26 may also be a hydraulic pilot type. Specifically, the operating device 26 uses hydraulic fluid supplied from the pilot pump 15 through the pilot line to output a pilot pressure corresponding to the operation to the secondary pilot line. The secondary pilot line is then connected to the control valve unit 17. As a result, the control valve unit 17 can receive pilot pressure corresponding to the operation of various driven elements (hydraulic actuators) in the operating device 26. Therefore, the control valve unit 17 can drive each hydraulic actuator according to the operation performed on the operating device 26 by the operator or the like. In this case, an operation sensor 29 capable of acquiring information about the operating state of the operating device 26 is provided, and the output of the operation sensor 29 is input to the controller 30. As a result, the controller 30 can determine the operating state of the operating device 26. The operation sensor 29 is, for example, a pressure sensor that acquires information about the pilot pressure (operating pressure) of the secondary pilot line of the operating device 26.

[0053] Furthermore, some or all of the hydraulic actuators may be replaced with electric actuators. In this case, for example, the controller 30 may output operation commands to the electric actuator or a driver that drives the electric actuator, etc., according to the operation content of the operating device 26 or the content of the remote operation defined by the remote operation signal. Alternatively, the electric actuator may be configured to be operable by the operating device 26 when an operation signal is input from the operating device 26 to the electric actuator or driver, etc.

[0054] Furthermore, if the work machine 100 is operated exclusively by remote control or exclusively by a fully automatic operation function, the operating device 26 may be omitted.

[0055] The proportional valves 31 function as control valves for machine control and are provided for each driven element (hydraulic actuator) operated by the operating device 26, and for each direction of operation of the driven element (hydraulic actuator) (for example, the raising and lowering directions of the boom 4). For example, two proportional valves 31 are provided for each double-acting hydraulic actuator that drives the lower traveling body 1, upper slewing body 3, boom 4, arm 5, and bucket 6. The proportional valves 31 are provided, for example, in the pilot line between the pilot pump 15 and the control valve unit 17, and may be configured to change their flow area (i.e., the cross-sectional area through which hydraulic fluid can flow). As a result, the proportional valves 31 can use the hydraulic fluid from the pilot pump 15 supplied through the primary pilot line to output a predetermined pilot pressure to the secondary pilot line. The proportional valves 31 can thus apply a predetermined pilot pressure to the control valve unit 17 in response to an operation command from the controller 30. For example, the controller 30 can directly supply pilot pressure corresponding to the operation content (operation signal) of the operating device 26 from the proportional valve 31 to the control valve unit 17, thereby realizing the operation of the work machine 100 based on the operator's operation.
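The arrangement of one proportional valve per operating direction can be illustrated with a sketch that turns a signed operation signal into pilot pressure commands for a pair of valves. The normalized signal range, the deadband, the maximum pilot pressure, and the key names "raise" and "lower" are assumptions for this example, not values from the disclosure.

```python
def proportional_valve_commands(lever, max_pilot_mpa=3.5, deadband=0.05):
    """Map a lever signal in [-1, 1] to pilot pressures (MPa) for two valves.

    Positive lever values command the "raise" valve, negative values the
    "lower" valve. Only one of the two valves is pressurized at a time.
    """
    # Clamp the operation signal to the assumed valid range.
    lever = max(-1.0, min(1.0, lever))
    if abs(lever) <= deadband:
        return {"raise": 0.0, "lower": 0.0}
    # Scale the part of the stroke beyond the deadband to the full pressure range.
    pressure = (abs(lever) - deadband) / (1.0 - deadband) * max_pilot_mpa
    if lever > 0.0:
        return {"raise": pressure, "lower": 0.0}
    return {"raise": 0.0, "lower": pressure}


# Half lever stroke in the boom-raising direction.
commands = proportional_valve_commands(0.5)
```

The same mapping could be driven by an operation command from an automatic operation function or a remote operation signal instead of the lever, since in each case the controller 30 is what commands the proportional valve 31.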

[0056] Furthermore, the controller 30 may control the proportional valve 31 to realize an automatic operation function for the work machine 100. Specifically, the controller 30 outputs an operation command corresponding to the automatic operation function to the proportional valve 31. In this way, the controller 30 can realize the operation of the work machine 100 using the automatic operation function.

[0057] Furthermore, the controller 30 controls the proportional valve 31 to enable remote operation of the work machine 100. Specifically, the controller 30 outputs an operation command to the proportional valve 31 corresponding to the content of the remote operation specified by the remote operation signal received from the remote control room via the communication device T1. As a result, the controller 30 causes the proportional valve 31 to supply a pilot pressure corresponding to the content of the remote operation to the control valve unit 17, thereby enabling the operation of the work machine 100 based on the operator's remote operation.

[0058] This embodiment describes a case in which the engine 11 is used as the drive source and the hydraulic pump is driven by the power generated by the engine 11 to operate the attachment AT, rotate the upper rotating body 3, and travel. However, the drive source in this embodiment is not limited to the engine 11, and an electric motor may also be used as the drive source. In other words, the control described in this embodiment may be applied to a so-called electric excavator in which the motor serving as the drive source is driven by power supplied from a battery, or to a hybrid excavator equipped with multiple drive sources including the engine 11 and a motor.

[0059] <Controller Functional Configuration> Referring to Figure 3, the functional configuration related to operation support for the work machine 100 according to the first embodiment will be described. Figure 3 is a functional block diagram showing an example of the configuration of the controller 30 according to the first embodiment.

[0060] In this embodiment, we describe an example in which the controller 30 controls the work machine 100, but some of the functions of the controller 30 may be realized by other controllers (control devices). That is, the functions of the controller 30 may be realized in a distributed manner by multiple controllers mounted on the work machine 100.

[0061] Furthermore, each functional block within the controller 30 is conceptual and does not necessarily need to be physically configured as shown in the diagram. All or part of each functional block can be functionally or physically distributed or integrated in any unit. Each processing function performed by each functional block is implemented, in whole or in any part, by a program executed on the CPU. Alternatively, each functional block may be implemented as hardware using wired logic.

[0062] The controller 30 receives information output from the boom angle sensor S1, arm angle sensor S2, bucket angle sensor S3, machine tilt sensor S4, slewing angle sensor S5, imaging device S6, input device D2, communication device T1, positioning device PS, operation sensor 29, etc. Based on the received information and the information stored in the auxiliary storage device D3, it performs various calculations and outputs the calculation results to the display device D1 and proportional valve 31, etc.

[0063] The auxiliary storage device D3 stores the language model LM, construction information CON, and reference information REF.

[0064] The language model LM is a machine learning model trained to perform a predetermined language processing task. The language model LM may be a large language model (LLM) trained to perform various language processing tasks based on a large dataset. The language model LM may also be a small language model (SLM) trained to perform a specific language processing task based on a small dataset. The language model LM may be implemented in an external device (e.g., a server device) that is connected to the work machine 100 via a predetermined communication line in a way that allows for mutual communication. The language model LM may be, for example, GPT, BERT, or Llama.

[0065] Construction information CON is information about the construction site where the work machine 100 is performing work. Construction information CON may include topographic information of the construction site. Construction information CON may include information about the arrangement of objects at the construction site. Construction information CON may include attribute information of objects at the construction site. Construction information CON may include attribute information of the weather. Construction information CON may include attribute information of the operator. Construction information CON may include information about the arrangement of people at the construction site. Construction information CON may include attribute information of people present at the construction site.

[0066] Furthermore, the construction information CON is static information that can be acquired in advance at the construction site and is independent of the position of the work machine 100. The construction information CON can be acquired by means other than those provided by the work machine 100 (for example, the imaging device S6). However, the construction information CON may include information acquired in advance using the means provided by the work machine 100.

[0067] Construction information CON may include, for example, construction drawings. Construction drawings are drawings that show the topography of the construction site, the arrangement of objects at the construction site, and the attributes of objects placed at the construction site. Construction drawings may show, for example, topography such as the ground surface, slopes, cliffs, or holes. Construction drawings may show, for example, the arrangement of pipes buried underground (e.g., water pipes or gas pipes) or structures formed on the ground surface (e.g., utility poles, fences, walls, or buildings). Construction drawings may also show, for example, information indicating whether or not objects placed at the construction site (pipes or structures, etc.) may be demolished.

[0068] Construction information CON may include not only construction drawings but also information on the topography of the construction site. For example, construction information CON may include point cloud data measured at the construction site using a LiDAR mounted on a drone before construction begins.

[0069] Construction Information CON may include information entered by the operator. Information entered by the operator may include, for example, the operator's attribute information or weather attribute information. Operator attribute information may include, for example, gender, age, height, weight, vision or hearing. Weather attribute information may include, for example, weather (sunny, cloudy, rainy or snowy, etc.), temperature, humidity, precipitation, atmospheric pressure or wind (direction and strength). Operator attribute information or weather attribute information may include information obtained from an external information processing system (for example, a personnel management system or a weather information system).

[0070] The construction information CON may include information about people detected at the construction site. The information about detected people may include, for example, the arrangement of people at the construction site, or attribute information of people present at the construction site. The attribute information of people may include, for example, the result of determining whether or not they are construction personnel. Whether or not a person is construction personnel may be determined, for example, from the person's clothing or equipment (safety vest, armband, helmet, etc.).

[0071] Construction information CON may include supplementary information described in the construction drawings. Supplementary information may include, for example, attribute information of construction personnel. Construction personnel may include, for example, safety managers or medical institutions. Attribute information may include names, company names, contact information (telephone numbers), etc.

[0072] Construction Information CON may include information that verbalizes information about the construction site. Construction Information CON may include information that verbalizes information described in construction drawings in natural language. Construction Information CON may include text that describes the topography of the construction site in natural language. Construction Information CON may include text that describes objects placed at the construction site in natural language. Construction Information CON may include text that describes attribute information about construction personnel in natural language.

[0073] Reference information REF is information that the language model LM refers to in order to generate control information for the work machine 100. Reference information REF may include information that the language model LM has not learned.

[0074] The reference information REF may include information about the functions that the work machine 100 can perform. The information about functions may include descriptions and examples. The descriptions are a set of text that explains one or more functions available to the work machine 100 in natural language. The examples are a set of text that illustrate the code for calling one or more functions available to the work machine 100. The examples may be defined as a combination of construction information and examples of operation instructions as preconditions (i.e., constraints) and the correct code to be output. By providing descriptions and examples as reference information, the language model LM can learn the information to be output in response to operation instructions.
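
One possible way to hold the descriptions and examples is sketched below in Python. The class names, the field names, the function name `set_height_limit`, and the wording of the description text are hypothetical and are introduced only for illustration; the example entry reuses the water pipe case given elsewhere in this description.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FunctionDescription:
    name: str   # hypothetical function name callable on the work machine
    text: str   # natural-language explanation of the function


@dataclass
class Example:
    construction_info: str  # precondition (constraint) in natural language
    instruction: str        # example of an operation instruction
    code: str               # correct code to be output


@dataclass
class ReferenceInfo:
    descriptions: List[FunctionDescription] = field(default_factory=list)
    examples: List[Example] = field(default_factory=list)

    def render(self) -> str:
        """Serialize the reference information as text for the language model."""
        lines = ["Available functions:"]
        for d in self.descriptions:
            lines.append(f"- {d.name}: {d.text}")
        lines.append("Examples:")
        for e in self.examples:
            lines.append(f"Construction information: {e.construction_info}")
            lines.append(f"Instruction: {e.instruction}")
            lines.append(f"Code: {e.code}")
        return "\n".join(lines)


ref = ReferenceInfo(
    descriptions=[FunctionDescription(
        "set_height_limit",  # hypothetical name
        "Sets the boundary line of the height limit function.")],
    examples=[Example(
        "There is a water pipe at a depth of 2m at position (0,5).",
        "Set to the height of the pipe in the ground.",
        "set_height_limit(depth_m=2.0)")],
)
```

Keeping each example as a triple of precondition, instruction, and expected code follows the definition of examples given above.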

[0075] As shown in Figure 3, the controller 30 includes, as functional units, an instruction acquisition unit 301, a construction information acquisition unit 302, a prompt generation unit 303, a generation instruction unit 304, an operation control unit 305, and a display control unit 306. These functions are realized, for example, by loading a program installed in an auxiliary storage device into a memory device and executing it with the CPU.

[0076] The instruction acquisition unit 301 acquires instructions (hereinafter also referred to as "operation instructions") related to the operation of the work machine 100. Operation instructions are input in natural language to the input device D2 by the operator of the work machine 100. Operation instructions may also be input to a portable terminal brought into the cabin 10 by the operator and then input to the input device D2 via the communication device T1.

[0077] Instructions in natural language may include, for example, instructions input in natural language by the operator's voice. In this case, the instruction acquisition unit 301 can acquire text data corresponding to the natural language instructions by applying known speech recognition technology based on the voice input data received by the input device D2.

[0078] Furthermore, instructions in natural language may also be instructions entered as text by the operator using an input device D2 capable of text input, such as a keyboard or touch panel. In this case, the instruction acquisition unit 301 can acquire the text data received by the input device D2 as instructions in natural language.

[0079] The construction information acquisition unit 302 acquires construction information expressed in language (hereinafter referred to as "verbalized construction information"). The construction information acquisition unit 302 may read construction information CON that has been verbalized in advance from the auxiliary storage device D3. The construction information acquisition unit 302 may also verbalize the construction information CON read from the auxiliary storage device D3, for example based on a predetermined template. For instance, the construction information acquisition unit 302 may verbalize the arrangement of objects by embedding the location information of the objects shown in the construction information into the template, or may verbalize the contact information of construction personnel shown in the construction information by embedding that contact information into the template.

[0080] A template for verbalizing the arrangement of objects may be defined in the format, for example, "There is a yyy at xxx." where "xxx" is the location information of the object and "yyy" is the type or name of the object. The location information of the object may be a three-dimensional position with the ground surface as the XY plane. The location information of the object may also be the relative position between the work machine 100 and the reference position of the object.
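
The template-based verbalization of an object's arrangement can be sketched as follows. This is a minimal Python illustration; the function name `verbalize_object` and its parameters are hypothetical, and the optional depth phrase is an assumption chosen so that the output matches the water pipe sentence used as an example in this description.

```python
from typing import Optional


def verbalize_object(kind: str, x: float, y: float,
                     depth_m: Optional[float] = None) -> str:
    """Fill the "There is a yyy at xxx." template for one object.

    kind:    type or name of the object ("yyy" in the template).
    x, y:    position on the ground surface taken as the XY plane.
    depth_m: optional burial depth in meters (assumed extra field).
    """
    location = f"position ({x:g},{y:g})"
    if depth_m is not None:
        location = f"a depth of {depth_m:g}m at {location}"
    return f"There is a {kind} at {location}."


print(verbalize_object("water pipe", 0, 5, depth_m=2))
# prints: There is a water pipe at a depth of 2m at position (0,5).
```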

[0081] The reference position of the object may be a position on the XY plane where the object and a part of the work machine 100 overlap. For example, the reference position of the object may be a position where the axis extending from the positioning device PS towards the bucket 6 intersects the object. Alternatively, the reference position of the object may be a position on the object where the distance between the work machine 100 and the object is shortest. Furthermore, the reference position of the object may be one or more positions within the rotational range of the work machine 100 that overlap with the object.

[0082] The construction information acquisition unit 302 may use the position information of the work machine 100 received from the positioning device PS to calculate the relative position between the work machine 100 and the reference position of the object. If the position information of the work machine 100 is unavailable, the construction information acquisition unit 302 may set the reference position of the object to the minimum value. The construction information acquisition unit 302 may also display the set reference position on the display device D1 to allow the operator to confirm the reference position. Furthermore, if the operator inputs an operation instruction that includes supplementary information such as "set it to the height in the soil near the building," the construction information acquisition unit 302 may search for the object indicated by the supplementary information (in this case, the building) from the construction information CON and set the reference position based on that object.
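
As one illustration of the reference position options above, the sketch below takes the reference position to be the point on the object closest to the work machine and then computes the relative position. It is written in Python under the assumptions that positions are 2D coordinates on the XY plane and that the object is modeled as a straight line segment (for example, a straight run of pipe); the function names are hypothetical.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def closest_point_on_segment(p: Point, a: Point, b: Point) -> Point:
    """Return the point on segment a-b that is nearest to p."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        return a  # degenerate segment: both ends coincide
    # Project p onto the line through a and b, then clamp to the segment.
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return (a[0] + t * dx, a[1] + t * dy)


def relative_position(machine: Point, reference: Point) -> Point:
    """Offset of the reference position as seen from the work machine."""
    return (reference[0] - machine[0], reference[1] - machine[1])


machine_xy = (0.0, 0.0)                 # assumed output of the positioning device
pipe_a, pipe_b = (-3.0, 5.0), (3.0, 5.0)  # assumed pipe end points
ref_xy = closest_point_on_segment(machine_xy, pipe_a, pipe_b)
offset = relative_position(machine_xy, ref_xy)
distance = math.hypot(*offset)
```

The other options mentioned above, such as the intersection with the axis toward the bucket 6, would require a different geometric test and are not covered by this sketch.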

[0083] A template for verbalizing the contact information of construction personnel may be defined in the following format, for example, "xxx is yyy, and the phone number is zzz." where "xxx" is the role of the construction personnel, "yyy" is the name or title of the construction personnel, and "zzz" is the telephone number of the construction personnel.

[0084] The prompt generation unit 303 generates prompts to be input to the language model LM. The prompt generation unit 303 may generate prompts based on the operation instructions acquired by the instruction acquisition unit 301 and the verbalized construction information acquired by the construction information acquisition unit 302.

[0085] The prompt includes information instructing the language model LM to generate control information for the work machine 100 corresponding to the operation instructions, based on the verbalized construction information. The control information for the work machine 100 may include control commands that the work machine 100 can execute. The control commands may be code, written in a predetermined programming language (e.g., Python), that calls functions.

[0086] In other words, a prompt is an example of information that indicates an instruction for the language model LM to interpret the operation instructions and verbalized construction information. Similarly, the control information of the work machine 100 is an example of information that indicates the result of the language model LM interpreting the operation instructions and verbalized construction information.
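
A prompt of this kind could be assembled as in the following Python sketch. The section headings, the wording of the instruction to the model, and the function name `build_prompt` are assumptions made for illustration; the disclosure does not fix a prompt format.

```python
from typing import Sequence


def build_prompt(instruction: str,
                 construction_info: Sequence[str],
                 reference: str = "") -> str:
    """Combine the operation instruction and verbalized construction information.

    reference: optional text rendering of descriptions and examples.
    """
    sections = [
        "Generate control information for the work machine.\n"
        "Output only Python code that calls the functions available "
        "to the work machine."
    ]
    if reference:
        sections.append("## Reference information\n" + reference)
    sections.append(
        "## Construction information\n"
        + "\n".join(f"- {sentence}" for sentence in construction_info)
    )
    # The operator's instruction goes last so that it is read in context.
    sections.append("## Operation instruction\n" + instruction)
    return "\n\n".join(sections)


prompt = build_prompt(
    "Set to the height of the pipe in the ground.",
    ["There is a water pipe at a depth of 2m at position (0,5)."],
)
```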

[0087] The input and output of the language model LM will be explained with reference to Figures 4 to 6. Figure 4 is a diagram showing an example of a language model. As shown in Figure 4, the language model LM is a machine learning model that takes operation instructions and construction information as input and outputs control information for the work machine 100. The language model LM may also accept reference information REF as input. Reference information REF includes description 401 and example 402.

[0088] In this embodiment, the operation instruction is an instruction entered by the operator in natural language, either by voice or text. The construction information is text that describes the arrangement of pipes in the construction drawing in natural language. The control information is a control command written in a predetermined programming language. As an example, Figure 4 shows an example where, in response to the operation instruction "Set to the height of the pipe in the ground," the control command is output to set the boundary line of the height limit function to a depth of 2m, based on the construction information "There is a water pipe at a depth of 2m at position (0,5)."
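
To make the effect of such a control command concrete, the Python sketch below models a height limit whose boundary line is set to a depth below the ground surface and which clamps a requested working height to that boundary. The class `HeightLimit`, its methods, and the convention that the ground surface is z = 0 with depths negative are all assumptions for illustration; the actual height limit function is not detailed in this passage.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HeightLimit:
    # Boundary line as a z coordinate in meters; None means no limit is set.
    boundary_z_m: Optional[float] = None

    def set_depth(self, depth_m: float) -> None:
        """Set the boundary line to a depth below the ground surface (z = 0)."""
        self.boundary_z_m = -abs(depth_m)

    def clamp(self, target_z_m: float) -> float:
        """Return the target height, limited so it never goes below the boundary."""
        if self.boundary_z_m is None:
            return target_z_m
        return max(target_z_m, self.boundary_z_m)


limit = HeightLimit()
limit.set_depth(2.0)            # boundary line at the depth of the water pipe
allowed = limit.clamp(-3.0)     # a request to dig to 3 m is limited to 2 m
```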

[0089] Figure 5 shows an example of a description included in the reference information. As shown in Figure 5, Description 401 is text that describes the functions that the work machine 100 can perform. Description 401 may contain multiple texts that describe each of the multiple functions. As an example, Figure 5 shows descriptions of the height limiting function and the machine control function.

[0090] Figure 6 shows an example of the examples included in the reference information. As shown in Figure 6, Example 402 is text illustrating control commands that invoke functions that the work machine 100 can perform. Example 402 may contain multiple texts corresponding to multiple functions. Example 402 may illustrate multiple control commands for one function, or it may illustrate a single control command that executes multiple functions. As an example, Figure 6 shows a control command for controlling an air conditioner and a control command for executing a height limit function.

[0091] The generation instruction unit 304 instructs the language model LM to generate control information for the work machine 100. The generation instruction unit 304 may also instruct the language model LM to generate control information for the work machine 100 by inputting a prompt generated by the prompt generation unit 303 to the language model LM. The generation instruction unit 304 may also call the language model LM through a predetermined API (Application Programming Interface). The generation instruction unit 304 acquires the control information for the work machine 100 output by the language model LM when a prompt is input to the language model LM.
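
The role of the generation instruction unit can be sketched in Python as a thin wrapper around a callable that stands in for the language model. No real model API is assumed here: `LanguageModel` is a hypothetical type alias, and `stub_language_model` is a deterministic stand-in that only recognizes a depth phrase so that the sketch can run without any external service. The function name `set_height_limit` in its output is likewise hypothetical.

```python
import re
from typing import Callable

# Any callable that maps a prompt to generated text (local model, remote API, ...).
LanguageModel = Callable[[str], str]


def stub_language_model(prompt: str) -> str:
    """Stand-in for the language model: NOT a real model, illustration only."""
    match = re.search(r"depth of (\d+(?:\.\d+)?)\s*m", prompt)
    if match is None:
        return ""
    return f"set_height_limit(depth_m={float(match.group(1))})"


def request_control_information(prompt: str, model: LanguageModel) -> str:
    """Input the prompt to the model and return the generated control command."""
    control_information = model(prompt).strip()
    if not control_information:
        raise ValueError("the language model returned no control information")
    return control_information


command = request_control_information(
    "There is a water pipe at a depth of 2m at position (0,5).\n"
    "Set to the height of the pipe in the ground.",
    stub_language_model,
)
```

Passing the model as a callable keeps the wrapper unchanged whether the language model runs on the work machine or on an external device.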

[0092] The operation control unit 305 controls the operation of the work machine 100. The operation control unit 305 may control the operation of the work machine 100 based on the control information of the work machine 100 acquired by the generation instruction unit 304. Specifically, in accordance with the control commands output from the language model LM, the operation control unit 305 outputs control commands to the proportional valve 31 to drive the hydraulic actuator of the work machine 100 and thereby control the operation of the work machine 100.
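
When the control commands are code that calls functions, one cautious way to act on them is to parse the generated text and dispatch only to a fixed set of known functions instead of executing it directly. The Python sketch below shows this idea with the standard `ast` module; it is an illustrative design choice rather than the method of this disclosure, and the registry contents and the function name `set_height_limit` are hypothetical.

```python
import ast
from typing import Any, Callable, Dict, List


def execute_control_command(code: str,
                            registry: Dict[str, Callable[..., Any]]) -> List[Any]:
    """Run generated code that consists only of calls to registered functions."""
    results = []
    for node in ast.parse(code).body:
        # Accept only statements of the form name(literal, key=literal, ...).
        if not (isinstance(node, ast.Expr) and isinstance(node.value, ast.Call)):
            raise ValueError("only bare function calls are allowed")
        call = node.value
        if not isinstance(call.func, ast.Name) or call.func.id not in registry:
            raise ValueError("call to an unregistered function")
        if any(keyword.arg is None for keyword in call.keywords):
            raise ValueError("argument unpacking is not allowed")
        # literal_eval rejects anything other than plain literals.
        args = [ast.literal_eval(arg) for arg in call.args]
        kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in call.keywords}
        results.append(registry[call.func.id](*args, **kwargs))
    return results


applied_depths = []


def set_height_limit(depth_m: float) -> float:
    """Hypothetical machine function: records the requested boundary depth."""
    applied_depths.append(depth_m)
    return depth_m


outputs = execute_control_command("set_height_limit(depth_m=2.0)",
                                  {"set_height_limit": set_height_limit})
```

In a real controller, the registered functions would in turn issue operation commands to the proportional valve 31; here they only record their arguments.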

[0093] The operation control unit 305 may control the operation of the work machine 100 based on environmental information representing the environment surrounding the work machine 100, in addition to control information for the work machine 100. The environmental information is information that can be dynamically acquired at the construction site and is dependent on the position of the work machine 100. The environmental information may include, for example, ambient image information captured by the imaging device S6.

[0094] The operation control unit 305 may detect objects present around the work machine 100 based on surrounding image information and control the operation of the work machine 100 based on the detection results. For example, the operation control unit 305 may detect people (e.g., workers), dump trucks, other work machines, the ground surface, slopes, cliffs, holes, buildings, power lines, cones, guardrails, etc., around the work machine 100.

[0095] The display control unit 306 controls the display on the display device D1. The display control unit 306 may output an image of the surroundings of the work machine 100, information regarding the operation of the work machine 100, and information regarding the status of the work machine 100 to the display device D1.

[0096] Display device D1 displays an image of the surroundings of the work machine 100 output from the display control unit 306. Display device D1 also displays at least one of the following information: information regarding the control of the operation of the work machine 100 or information regarding the state of the work machine 100, along with the image of the surroundings of the work machine 100. Information regarding the state of the work machine 100 may include, for example, operating state information, which represents the operating state of the work machine 100; setting state information, which represents the setting state of the work machine 100; or information representing the positional relationship with objects detected around the work machine 100. Information regarding the control of the operation of the work machine 100 may include, for example, operation instructions acquired by the instruction acquisition unit 301; verbalized construction information acquired by the construction information acquisition unit 302; or the results of the operation control unit 305 controlling the operation of the work machine 100.

[0097] Thus, in this embodiment, the controller 30 controls the operation of the work machine 100 based on the results of the language model LM interpreting the operator's instructions in natural language and the construction information expressed in natural language. As a result, the controller 30 can control the operation of the work machine 100 in accordance with the operator's natural language operation instructions that correspond to the construction information.

[0098] <Example of Display on Display Device> The display screen shown by display device D1 will be explained in detail with reference to Figure 7.

[0099] Figure 7 shows an example of the display of the display device D1. The display device D1 of the work machine 100 has an image display unit 142. The image display unit 142 displays a display screen 185 that includes a date and time display area 142a, a driving mode display area 142b, an attachment display area 142c, a fuel consumption display area 142d, an engine control status display area 142e, an engine operating time display area, a coolant temperature display area 142g, a fuel level display area 142h, a rotation speed level display area 142i, a urea solution level display area 142j, a hydraulic oil temperature display area 142k, a shovel status display area 421, a first image display area 422, a second image display area 423, and a notification display area 424, according to the control from the display control unit 306.

[0100] The driving mode display area 142b, the attachment display area 142c, the engine control status display area 142e, and the rotation speed level display area 142i are areas that display setting status information, which is information related to the setting status of the work machine 100. The fuel consumption display area 142d, the engine operating time display area, the coolant temperature display area 142g, the fuel level display area 142h, the urea solution level display area 142j, and the hydraulic oil temperature display area 142k are areas that display operating status information, which is information representing the operating status of the work machine 100 based on the detection results of various sensors.

[0101] The date and time display area 142a is an area that displays the current date and time. The driving mode display area 142b is an area that displays the current driving mode. The attachment display area 142c is an area that displays an image representing the attachment currently installed. The fuel consumption display area 142d is an area that displays fuel consumption information calculated by the controller 30. The fuel consumption display area 142d includes an average fuel consumption display area 142d1 that displays lifetime average fuel consumption or section average fuel consumption, and an instantaneous fuel consumption display area 142d2 that displays instantaneous fuel consumption.

[0102] The engine control status display area 142e is an area that displays the control status of the engine 11. The engine operating time display area is an area that displays the cumulative operating time of the engine 11. The coolant temperature display area 142g is an area that displays the current temperature status of the engine coolant. The fuel level display area 142h is an area that displays the remaining amount of fuel stored in the fuel tank.

[0103] The rotational speed level display area 142i is an area that displays, as an image, the current rotational speed level of the engine 11 set by the dial. The rotational speed level display area 142i displays a number indicating the selected level. A "1" displayed in the rotational speed level display area 142i indicates that the selected rotational speed level is "Level 1". A number "n" displayed in the rotational speed level display area 142i indicates that the selected rotational speed level is "Level n", where "n" is a natural number. When the operator rotates the dial, the number displayed in the rotational speed level display area 142i changes.

[0104] The urea solution remaining amount display area 142j is an area that displays the remaining amount of urea solution stored in the urea solution tank as an image. The hydraulic oil temperature display area 142k is an area that displays the temperature of the hydraulic oil in the hydraulic oil tank.

[0105] The shovel status display area 421 is an area that displays information representing the positional relationship between the work machine 100 and objects detected around the work machine 100.

[0106] The shovel status display area 421 is a display area that represents the real space centered on the work machine 100 at a predetermined scale. In the shovel status display area 421, a shovel icon 421b indicating the presence of the work machine 100 is placed at the center of the area.

[0107] In the shovel status display area 421, in addition to the shovel icon 421b representing the work machine 100, a direction indicator icon 421a indicating the direction in which the work machine 100 can move, and detection icons 421e, 421f, and 421g representing objects detected around the work machine 100 are displayed simultaneously. The area of the shovel status display area 421 other than the shovel icon 421b, the direction indicator icon 421a, and the detection icons 421e, 421f, and 421g (in other words, the background) may be represented by a single color (for example, black).

[0108] The shovel icon 421b is an icon that combines an image showing the upper rotating body 3 and an image showing the lower traveling body 1, according to the positional relationship between the upper rotating body 3 and the lower traveling body 1 based on the rotation angle.

[0109] The direction indicator icon 421a shows the direction in which the work machine 100 travels when the travel lever is tilted forward, in the shape of a triangle. Note that this embodiment shows an example of an icon that represents the direction in which the work machine 100 travels when the travel lever is tilted forward, and any shape is acceptable as long as it represents the direction in which the work machine 100 can move.

[0110] The detection icons 421e, 421f, and 421g represent objects detected from the surrounding image information captured by the imaging device S6. Specifically, the detection icons 421e, 421f, and 421g are positioned based on the object position information received from the controller 30. For example, each of the detection icons 421e, 421f, and 421g is placed, relative to the work machine 100, in the direction of the detected object and at a distance obtained by multiplying the distance to the detected object by a predetermined scale ratio.

[0111] Thus, the positional relationship between the shovel icon 421b and the detection icons 421e, 421f, and 421g corresponds to the positional relationship between the work machine 100 in real space and the objects present around the work machine 100.
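
The placement of a detection icon can be illustrated with the short Python sketch below, which converts the direction and distance of a detected object into screen coordinates around the shovel icon. The angle convention (clockwise from the front of the machine, with the front drawn toward the top of the screen), the pixel units, and the function name `icon_position` are assumptions made for illustration.

```python
import math
from typing import Tuple


def icon_position(center_px: Tuple[float, float],
                  bearing_deg: float,
                  distance_m: float,
                  px_per_m: float) -> Tuple[float, float]:
    """Screen position of a detection icon.

    center_px:   screen position of the shovel icon (x to the right, y downward).
    bearing_deg: direction of the object, clockwise from the machine front (assumed).
    distance_m:  distance from the work machine to the object.
    px_per_m:    predetermined scale ratio, in pixels per meter (assumed unit).
    """
    radius_px = distance_m * px_per_m
    theta = math.radians(bearing_deg)
    x = center_px[0] + radius_px * math.sin(theta)
    y = center_px[1] - radius_px * math.cos(theta)  # screen y grows downward
    return (x, y)


# An object 5 m straight ahead is drawn 50 px above the shovel icon at 10 px/m.
front_icon = icon_position((100.0, 100.0), 0.0, 5.0, 10.0)
```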

[0112] In the illustrated example, the shovel status display area 421 displays an icon image showing the work machine 100 and objects surrounding the work machine 100 as icons. However, the shovel status display area 421 may also display an overhead view image. The overhead view image may be, for example, an image generated by synthesizing surrounding image information captured by an imaging device S6 installed on the work machine 100, or it may be an image taken from above the work machine 100, for example, by a drone.

[0113] The shovel status display area 421 displays a first circular area 421c and a second circular area 421d, which are determined based on the distance from the work machine 100.

[0114] In the illustrated example, the first circular area 421c and the second circular area 421d are represented as circles that allow the operator to recognize the relative distance from the work machine 100, with the shovel icon 421b as the reference point.

[0115] The image display unit 142 displays the shovel status display area 421, as well as surrounding image information captured by the imaging device S6. By checking the surrounding image information along with the shovel status display area 421, the operator can recognize the specific conditions around the work machine 100. This improves safety.

[0116] In the illustrated example, the first image display area 422 and the second image display area 423 are areas that display ambient image information captured by the imaging device S6. The first image display area 422 displays the rightward image. The second image display area 423 displays the rearward image. The rightward image is an image that shows the space to the right of the work machine 100 and includes an image 422c of the upper right edge of the upper rotating body 3. The rightward image is a real viewpoint image generated by the display control unit 306 and is generated based on an image acquired by the camera S6R. The rearward image is an image that shows the space behind the work machine 100 and includes an image 423c of the counterweight. The rearward image is a real viewpoint image generated by the display control unit 306 and is generated based on an image acquired by the camera S6B.

[0117] The first image display area 422 is displayed to the right of the shovel status display area 421. The second image display area 423 is displayed below the shovel status display area 421. In this embodiment, the upper side of the image display unit 142 corresponds to the front of the upper rotating body 3. In other words, the second image display area 423 is displayed at a position corresponding to the rear with respect to the shovel status display area 421. That is, the image display unit 142 displays each piece of surrounding image information captured by the imaging device S6 at a position that corresponds, with the shovel status display area 421 as the reference, to the direction in which it was captured. Because of this arrangement, the operator can intuitively recognize which direction each piece of surrounding image information represents when referring to it. Therefore, safety can be improved.

[0118] Furthermore, if the controller 30 detects an object in one or both of the rightward and rearward images, the display control unit 306 superimposes a frame indicating the area where the object was detected onto the image in which the object was detected. In the illustrated example, frame 422b is displayed on the rightward image of the first image display area 422, and frame 423b is displayed on the rearward image of the second image display area 423. In addition, a detection icon 422a is displayed within frame 422b, and a detection icon 423a is displayed within frame 423b.

[0119] The notification display area 424 displays notifications indicating the control results of the work machine 100. The notification display area 424 shown in Figure 7 notifies the operator that, in response to an operation instruction such as "set to the height of the pipe in the ground," control was performed to "set the height limit function to 2m" to prevent collision with the water pipe located at a depth of 2m.

[0120] The image display unit 142 displays surrounding image information captured by the imaging device S6, along with the shovel status display area 421 and the notification display area 424. The operator can simultaneously check the control results or the status of the work machine 100 while confirming the surrounding image information. This improves safety.

[0121] Figure 7 shows a notification display area 424 that displays a message notifying the control result of the work machine 100, but this does not limit the form of the notification display area 424. The notification display area 424 may notify the control result of the work machine 100 using, for example, an icon, animation, a change in the color of a part of the screen, sound, etc.

[0122] <Specific Example of Operation Support Processing> Figure 8 is the first figure used to illustrate a specific example of operation support processing. Figure 8 shows a construction site where a water pipe 502 is buried at a depth of 2 m at position (0, 5) on the ground surface 501. Figure 8(A) is a cross-sectional view along the X direction (i.e., the direction of travel of the work machine 100), Figure 8(B) is a cross-sectional view along the Y direction (i.e., the direction perpendicular to the direction of travel of the work machine 100), and Figure 8(C) is a plan view in the XY plane.

[0123] Construction information CON is assumed to be a construction drawing that represents the construction site shown in Figure 8 using CAD drawings. The verbalized construction information is information that verbalizes all the piping, such as water pipes, described in the construction drawing using natural language. For example, the verbalized construction information may include sentences such as, "At position (0,5), there is a pipe located 2 [m] from the ground surface."

[0124] At this time, suppose the operator inputs an operation instruction by voice, such as "Set it to the position of the pipe in the ground." The controller 30 inputs the voice-recognized operation instruction as text and the pre-verbalized construction information into the language model LM. The language model LM interprets the operation instruction based on the pre-verbalized construction information and outputs a control command to set the boundary of the height limit function to a depth of 2m. The controller 30 executes the control command output from the language model LM. As a result, the work machine 100 will automatically stop operating if it tries to move the bucket 6 below a depth of 2m, because the boundary of the height limit function has been set to a depth of 2m.
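The effect of the control command in this example can be sketched in code. The following is a minimal illustration only: the command shape (`function`, `depth_m`) and the names `HeightLimit` and `apply_command` are hypothetical and do not appear in the disclosure, which does not specify how the control command is encoded.

```python
from dataclasses import dataclass


@dataclass
class HeightLimit:
    """Hypothetical lower boundary for the bucket, in metres below the ground surface."""
    max_depth_m: float

    def allows(self, bucket_depth_m: float) -> bool:
        # Movement is allowed while the bucket is not deeper than the boundary.
        return bucket_depth_m <= self.max_depth_m


def apply_command(command: dict) -> HeightLimit:
    # Assumed command shape: {"function": "height_limit", "depth_m": 2.0}
    if command.get("function") != "height_limit":
        raise ValueError("unsupported function")
    return HeightLimit(max_depth_m=float(command["depth_m"]))


# Command corresponding to the water pipe buried at a depth of 2 m.
limit = apply_command({"function": "height_limit", "depth_m": 2.0})
```

With this boundary, a requested bucket depth of 1.5 m would be allowed, while a requested depth of 2.1 m would be refused, which corresponds to the automatic stop described above.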

[0125] Figure 9 is the second figure illustrating a specific example of operation support processing. Figure 9 shows a construction site where a 30° slope 512 is formed over a width of 5m by excavating 2m from the ground surface 511 to the excavation surface 513. Figure 9(A) is a cross-sectional view along the X direction (i.e., the direction of travel of the work machine 100), Figure 9(B) is a cross-sectional view along the Y direction (i.e., the direction perpendicular to the direction of travel of the work machine 100), and Figure 9(C) is a plan view in the XY plane.

[0126] Construction information CON is assumed to be a construction drawing that represents the construction site shown in Figure 9 using CAD drawings. The verbalized construction information is information that verbalizes all the topographic features, such as slopes, described in the construction drawing using natural language. For example, the verbalized construction information may include sentences such as, "At position (x, y), there is a slope at a 30° angle from the ground surface, which extends for 5m in the direction of Θ."

[0127] Suppose the operator inputs an operation instruction by voice, such as "I want to create a slope 3 meters ahead." The controller 30 inputs the voice-recognized operation instruction as text and the pre-verbalized construction information into the language model LM. The language model LM interprets the operation instruction based on the pre-verbalized construction information and outputs a control command to form a 30° slope 3 meters ahead using the machine control function. The controller 30 executes the control command output from the language model LM. As a result, the work machine 100 is controlled to automatically form a 30° slope 3 meters ahead using its machine control function.
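As a worked illustration of the slope geometry, the horizontal extent of a slope can be computed from its angle and the excavation depth. This sketch assumes the 30° angle is measured from the horizontal and uses the 2 m excavation depth given for Figure 9; the function name `slope_horizontal_run` is hypothetical, and the disclosure does not state how the machine control function derives its target surface.

```python
import math


def slope_horizontal_run(depth_m: float, angle_deg: float) -> float:
    """Horizontal distance covered by a slope that descends depth_m
    at angle_deg measured from the horizontal (an assumption)."""
    if not 0.0 < angle_deg < 90.0:
        raise ValueError("angle must be between 0 and 90 degrees")
    return depth_m / math.tan(math.radians(angle_deg))


run_m = slope_horizontal_run(depth_m=2.0, angle_deg=30.0)
print(f"{run_m:.2f} m")  # prints "3.46 m"
```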

[0128] In the example shown in Figure 9, the verbalized construction information may include a statement indicating that point cloud data measured by a LiDAR or similar device mounted on a drone is used as the ground surface 511. For example, the verbalized construction information may include a sentence such as, "The ground surface in the construction drawings can utilize LiDAR point cloud data collected by a drone."

[0129] At this time, suppose the operator inputs an operation instruction by voice, such as "Tell me the volume of soil from the ground surface to the construction surface." The controller 30 inputs the voice-recognized operation instruction as text and the pre-verbalized construction information into the language model LM. The language model LM interprets the operation instruction based on the pre-verbalized construction information and outputs a control command to calculate, using the LiDAR point cloud as the ground surface, the volume of soil to be removed when excavating 2m. The controller 30 executes the control command output from the language model LM. As a result, the work machine 100 can calculate the volume of soil to be removed when excavating 2m from the ground surface.
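The disclosure does not specify how the soil volume is calculated. One common approximation, shown here purely as an assumption, is to resample the point cloud onto a regular grid of ground heights and sum the height difference to the construction surface over all cells. The function name `excavation_volume` and the sample grid are hypothetical.

```python
def excavation_volume(ground_heights, design_height_m, cell_area_m2):
    """Sum (ground - design) * cell area over all grid cells.

    Cells whose ground is already at or below the design surface contribute 0.
    """
    volume = 0.0
    for row in ground_heights:
        for z in row:
            volume += max(z - design_height_m, 0.0) * cell_area_m2
    return volume


# Hypothetical 2 x 3 grid of ground heights [m] with 1 m x 1 m cells,
# standing in for a resampled LiDAR point cloud.
grid = [
    [0.0, 0.1, 0.0],
    [-0.1, 0.0, 0.0],
]

# Construction surface 2 m below the nominal ground level.
vol = excavation_volume(grid, design_height_m=-2.0, cell_area_m2=1.0)
```

For this sample grid the result is approximately 12 cubic metres. Using measured heights rather than a flat nominal surface is what lets the uneven ground captured by the drone affect the result.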

[0130] Figure 10 is the third figure illustrating a specific example of operation support processing. Figure 10 shows an example of a construction drawing with supplementary information. The construction drawing 520 shown in Figure 10 includes the construction drawing itself 521, the contact information of the safety manager 522, and the contact information of the medical institution 523.

[0131] The construction information CON is assumed to be the construction drawing shown in Figure 10. The verbalized construction information is information that verbalizes the supplementary information described in the construction drawing using natural language. For example, the verbalized construction information may include sentences such as, "There is a hospital called yyyyy nearby, and its phone number is 03-1234-5678."

[0132] At this time, suppose the operator inputs an operation instruction by voice, such as "Call the hospital immediately." The controller 30 inputs the voice-recognized operation instruction as text and the pre-verbalized construction information into the language model LM. The language model LM interprets the operation instruction based on the pre-verbalized construction information and outputs a control command to dial the hospital's phone number. The controller 30 executes the control command output from the language model LM. As a result, the work machine 100 enters a communication state with the hospital, allowing the operator to communicate with the hospital by voice.

[0133] Further examples of operation support processing include the following. For example, suppose the construction information CON includes attribute information of objects at the construction site, and the attribute information includes information indicating that water pipes buried underground may be demolished because they have deteriorated. Suppose further that the operator inputs an operation instruction to excavate the construction site. In this case, the language model LM outputs a control command that excludes the area where the water pipes are buried from the protected area (barrier) of the machine control function. The protected area is an area that should be protected and is not included as a target for construction by the work machine 100. As a result, the work machine 100 is controlled to automatically perform the work of demolishing the deteriorated water pipes.

[0134] For example, suppose the construction information CON includes weather attribute information, and the weather attribute information includes information that the weather on that day is rainy. In this case, suppose the operator inputs an operation instruction to control the air conditioner. The language model LM interprets the operation instruction based on the verbalized construction information and outputs a control command to set the air conditioner's temperature higher. As a result, the work machine 100 is controlled to maintain a comfortable temperature in the cabin 10 according to the weather.

[0135] For example, suppose the construction information CON includes operator attribute information, and the operator attribute information includes information that the operator is elderly. In this case, suppose the operator inputs an operation instruction to control the air conditioner. The language model LM interprets the operation instruction based on the verbalized construction information and outputs a control command to set the air conditioner's temperature higher. As a result, the work machine 100 is controlled to maintain a comfortable temperature in the cabin 10 according to the operator's attributes.

[0136] For example, suppose the construction information CON includes information on the placement of people at the construction site, and this information indicates that people are present around the work machine 100. In this case, suppose the operator inputs an operation instruction to perform work around the work machine 100. The language model LM interprets the operation instruction based on the verbalized construction information and outputs a control command to include the people located around the work machine 100 in the protected area. As a result, the work machine 100 can improve the safety of people present at the construction site.

[0137] For example, suppose the construction information CON includes attribute information of people present at the construction site, and the attribute information indicates that the people present at the construction site are not construction personnel. In this case, suppose the operator inputs an operation instruction to perform work around the work machine 100. The language model LM interprets the operation instruction based on the verbalized construction information and outputs a control command to include, in the protected area, a wide area around the people who are not construction personnel. As a result, the work machine 100 can improve safety even if people other than construction personnel are present at the construction site.
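The protected-area examples above can be sketched as a simple geometric check. This is only an illustration under stated assumptions: the disclosure says the area is "wide" for people other than construction personnel but gives no dimensions, so the circular shape, the radii, and the names `Person`, `protected_radius`, and `in_protected_area` are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Person:
    x: float
    y: float
    is_construction_personnel: bool


# Assumed radii [m]; not taken from the disclosure.
RADIUS_PERSONNEL_M = 2.0
RADIUS_OTHER_M = 5.0


def protected_radius(person: Person) -> float:
    # People who are not construction personnel get the wider area.
    if person.is_construction_personnel:
        return RADIUS_PERSONNEL_M
    return RADIUS_OTHER_M


def in_protected_area(x: float, y: float, people: list) -> bool:
    # A work target is protected if it lies within any person's circle.
    return any(
        math.hypot(x - p.x, y - p.y) <= protected_radius(p) for p in people
    )


people = [Person(0.0, 0.0, True), Person(10.0, 0.0, False)]
```

With these sample positions, a target at (3, 0) lies outside both circles, while a target at (6, 0) falls inside the wider circle around the person who is not construction personnel.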

[0138] <Flow of Operation Support Processing> The operation support processing performed by the controller 30 of the work machine 100 will be explained below. Figure 11 is a flowchart showing an example of operation support processing according to the first embodiment.

[0139] In step S101, the instruction acquisition unit 301 of the controller 30 acquires an operation instruction input to the input device D2 from the operator in natural language. If the operation instruction is input by voice, the instruction acquisition unit 301 applies speech recognition technology to the voice signal of the acquired operation instruction to acquire text indicating the operation instruction. The instruction acquisition unit 301 sends the text indicating the operation instruction to the prompt generation unit 303.

[0140] In step S102, the construction information acquisition unit 302 of the controller 30 reads the construction information CON from the auxiliary storage device D3. In this embodiment, the construction information acquisition unit 302 reads all of the construction information CON stored in the auxiliary storage device D3.

[0141] In step S103, the construction information acquisition unit 302 of the controller 30 determines whether the construction information CON read in step S102 has been verbalized. If the construction information CON has been verbalized (YES), the construction information acquisition unit 302 sends the construction information CON read in step S102 to the prompt generation unit 303 as verbalized construction information and proceeds to step S105. On the other hand, if the construction information CON has not been verbalized (NO), the construction information acquisition unit 302 proceeds to step S104.

[0142] In step S104, the construction information acquisition unit 302 of the controller 30 verbalizes the construction information CON read in step S102. The construction information acquisition unit 302 may verbalize the construction information CON by, for example, embedding the information recognized from the construction information CON into a template. Once the construction information acquisition unit 302 has verbalized the construction information CON, it sends the verbalized construction information to the prompt generation unit 303 and proceeds to step S105.
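Template-based verbalization of the kind mentioned for step S104 can be sketched as follows. The template wording follows the pipe example given for Figure 8; the record layout (`x`, `y`, `depth_m`) and the names `PIPE_TEMPLATE` and `verbalize_pipes` are hypothetical, since the disclosure does not describe how recognized information is represented.

```python
# Template modelled on the example sentence for the buried pipe.
PIPE_TEMPLATE = (
    "At position ({x},{y}), there is a pipe located {depth_m} [m] "
    "from the ground surface."
)


def verbalize_pipes(pipes):
    """Fill the template once per recognized pipe record."""
    return [PIPE_TEMPLATE.format(**pipe) for pipe in pipes]


# Hypothetical record recognized from the construction drawing.
sentences = verbalize_pipes([{"x": 0, "y": 5, "depth_m": 2}])
print(sentences[0])
# prints "At position (0,5), there is a pipe located 2 [m] from the ground surface."
```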

[0143] In step S105, the prompt generation unit 303 of the controller 30 receives text indicating an operation instruction from the instruction acquisition unit 301. The prompt generation unit 303 also receives verbalized construction information from the construction information acquisition unit 302. Furthermore, the prompt generation unit 303 reads reference information from the auxiliary storage device D3. Then, based on the text indicating the operation instruction, the verbalized construction information, and the reference information, the prompt generation unit 303 generates a prompt to be input to the language model LM. The prompt generation unit 303 sends the prompt to the generation instruction unit 304.
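Prompt generation in step S105 could look roughly like the following sketch. The section headings and ordering inside the prompt are assumptions, as is the sample reference text; the disclosure only states that the prompt is generated from the operation instruction text, the verbalized construction information, and the reference information. The name `build_prompt` is hypothetical.

```python
def build_prompt(instruction, construction_sentences, reference_info):
    """Concatenate the three inputs of step S105 into one prompt string."""
    sections = [
        "# Reference information",
        reference_info,
        "# Construction information",
        "\n".join(construction_sentences),
        "# Operation instruction",
        instruction,
    ]
    return "\n".join(sections)


prompt = build_prompt(
    instruction="Set it to the position of the pipe in the ground.",
    construction_sentences=[
        "At position (0,5), there is a pipe located 2 [m] from the ground surface."
    ],
    # Hypothetical explanation text standing in for the reference information.
    reference_info="The height limit function stops the bucket at a set boundary.",
)
```

Placing the operation instruction last is a design choice made here for readability, not something the disclosure prescribes.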

[0144] In step S106, the generation instruction unit 304 of the controller 30 receives the prompt from the prompt generation unit 303. The generation instruction unit 304 inputs the prompt to the language model LM stored in the auxiliary storage device D3. The language model LM generates control information for the work machine 100 corresponding to the operation instruction, based on the verbalized construction information contained in the prompt. The generation instruction unit 304 acquires the control information for the work machine 100 output from the language model LM. The generation instruction unit 304 sends the control information for the work machine 100 to the operation control unit 305.
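The disclosure does not define the format of the control information output by the language model LM. If one assumes, for illustration, that the model is asked to answer in JSON, the output could be checked against a list of known functions before it is passed on. The function names in `ALLOWED_FUNCTIONS` and the helper `parse_control_info` are hypothetical.

```python
import json

# Hypothetical identifiers for functions the machine can perform.
ALLOWED_FUNCTIONS = {"height_limit", "slope_forming", "air_conditioner"}


def parse_control_info(model_output: str) -> dict:
    """Parse assumed-JSON model output and reject unknown functions."""
    try:
        command = json.loads(model_output)
    except json.JSONDecodeError as exc:
        raise ValueError("model output is not valid JSON") from exc
    if not isinstance(command, dict):
        raise ValueError("model output is not a JSON object")
    if command.get("function") not in ALLOWED_FUNCTIONS:
        raise ValueError("unknown or missing function")
    return command


command = parse_control_info('{"function": "height_limit", "depth_m": 2.0}')
```

Rejecting anything outside a fixed set is one way to keep free-form model output from reaching the operation control unit unchecked.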

[0145] In step S107, the operation control unit 305 of the controller 30 receives the control information for the work machine 100 from the generation instruction unit 304. The operation control unit 305 controls the operation of the work machine 100 according to the control information. As a result, the work machine 100 can perform operations in accordance with the operator's natural language operation instructions.

[0146] After controlling the operation of the work machine 100, the display control unit 306 of the controller 30 may display the result of controlling the operation of the work machine 100 in step S107 on the display device D1. Specifically, the display control unit 306 may display the operation instruction acquired in step S101 and the result of controlling the operation of the work machine 100 in step S107 in the notification display area 424 of the image display unit 142.
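The flow of steps S103 to S107 can be summarized in a short sketch with the language model and the machine-side executor passed in as callables. Everything here is illustrative: treating "already a string" as "already verbalized" is an assumption standing in for the determination of step S103, the prompt layout is arbitrary, and the lambda stands in for the language model LM.

```python
from typing import Callable, Union


def operation_support(
    instruction_text: str,
    construction_info: Union[str, dict],
    verbalize: Callable[[dict], str],
    language_model: Callable[[str], str],
    execute: Callable[[str], None],
    reference_info: str = "",
) -> str:
    # S103/S104: verbalize only when the information is not already text
    # (assumed test for "has been verbalized").
    if isinstance(construction_info, str):
        verbalized = construction_info
    else:
        verbalized = verbalize(construction_info)
    # S105: assemble the prompt (layout is an assumption).
    prompt = "\n".join([reference_info, verbalized, instruction_text])
    # S106: obtain control information from the language model.
    control_info = language_model(prompt)
    # S107: hand the control information to the machine-side executor.
    execute(control_info)
    return control_info


# Stub usage: the lambdas below replace the real components.
executed = []
result = operation_support(
    "Set it to the position of the pipe in the ground.",
    {"x": 0, "y": 5, "depth_m": 2},
    verbalize=lambda d: (
        f"At position ({d['x']},{d['y']}), there is a pipe located "
        f"{d['depth_m']} [m] from the ground surface."
    ),
    language_model=lambda prompt: "height_limit depth=2m",
    execute=executed.append,
)
```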

[0147] <Modification 1 of Operation Support Processing> In the first embodiment, a configuration was described in which all pre-prepared construction information and reference information are included in the prompt. The construction information or reference information to be included in the prompt may be narrowed down. For example, the construction information acquisition unit 302 may narrow down the verbalized construction information based on the content of the operation instruction. Also, for example, the prompt generation unit 303 may narrow down the reference information to be included in the prompt based on the content of the operation instruction. Below, a modification in which the verbalized construction information or reference information to be included in the prompt is narrowed down will be described, focusing on the differences from the first embodiment.

[0148] Figure 12 is a flowchart showing an example of the operation support process related to the modified example. The processes in steps S111 to S114 and S118 to S119 are the same as the processes in steps S101 to S104 and S106 to S107 of the operation support process according to the first embodiment (see Figure 11). That is, the process in steps S115 to S117 of the operation support process related to the modified example differs from the operation support process according to the first embodiment.

[0149] In step S115, the construction information acquisition unit 302 narrows down the verbalized construction information based on the text indicating the operation instruction. For example, the construction information acquisition unit 302 may classify the text indicating the operation instruction into predetermined categories and extract only the verbalized construction information related to the classified categories. The predetermined categories may be, for example, classifications related to the functions that the work machine 100 can perform. The construction information acquisition unit 302 may also classify the text indicating the operation instruction into predetermined categories based on the language model LM. For example, if the operator inputs an operation instruction such as "set to the height of the pipe in the ground", the construction information acquisition unit 302 may classify the operation instruction as "height limit function" and extract only the verbalized construction information related to "height limit function".
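The category-based narrowing of step S115 can be sketched with simple keyword matching. This is a stand-in only: the disclosure states that classification may be based on the language model LM, and the keyword table, the category "slope forming function", and the names `classify` and `narrow` are hypothetical.

```python
# Hypothetical keyword table mapping machine functions to trigger words.
CATEGORY_KEYWORDS = {
    "height limit function": ("height", "depth", "pipe"),
    "slope forming function": ("slope",),
}


def classify(instruction: str):
    """Return the first category whose keywords appear in the instruction."""
    text = instruction.lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return category
    return None


def narrow(sentences_by_category: dict, instruction: str) -> list:
    """Keep only the verbalized construction information for the category."""
    category = classify(instruction)
    if category is None:
        # No match: fall back to all information, as in the first embodiment.
        return [s for group in sentences_by_category.values() for s in group]
    return sentences_by_category.get(category, [])


info = {
    "height limit function": [
        "At position (0,5), there is a pipe located 2 [m] from the ground surface."
    ],
    "slope forming function": ["There is a slope at a 30° angle."],
}
selected = narrow(info, "set to the height of the pipe in the ground")
```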

[0150] As another example, the construction information acquisition unit 302 may narrow down the verbalized construction information based on the language model LM. For example, the construction information acquisition unit 302 may input the text indicating the operation instruction and the verbalized construction information into the language model LM together with a prompt such as "Please summarize the construction information related to the operation instruction." The language model LM extracts and outputs only the verbalized construction information related to the operation instruction. The construction information acquisition unit 302 only needs to acquire the verbalized construction information output from the language model LM.

[0151] In step S116, the prompt generation unit 303 of the controller 30 narrows down the reference information based on the text indicating the operation instruction. For example, the prompt generation unit 303 may classify the text indicating the operation instruction into predetermined categories and extract only the reference information related to the classified category. The classification of the text indicating the operation instruction may be the same as in step S115. For example, if the operator inputs an operation instruction classified as "height limit function", the prompt generation unit 303 may extract only the explanations and examples related to "height limit function".

[0152] In step S117, the prompt generation unit 303 of the controller 30 generates a prompt to be input to the language model LM based on the text indicating the operation instruction, the verbalized construction information, and the reference information. Here, the verbalized construction information is the verbalized construction information narrowed down in step S115, and the reference information is the reference information narrowed down in step S116. The prompt generation unit 303 sends the prompt to the generation instruction unit 304.

[0153] Figure 13 shows an example of reference information relating to a modified example. As shown in Figure 13, the reference information REF relating to the modified example is limited to only explanation 411 and example 412 related to a specific function (in this case, the height limiting function).

[0154] <Modification 2 of Operation Support Processing> In the first embodiment, a configuration in which the input and output of the language model LM are text data was described. The language model LM may include data other than text data in its input and output. For example, the language model LM may include audio signals in its input and output. Also, for example, the language model LM may include feature data in its input and output. The feature data may be, for example, an embedding vector extracted from text data, or an acoustic feature vector extracted from an audio signal.

[0155] As an example, the instruction acquisition unit 301 may acquire, as the operation instruction, an audio signal of the operation instruction spoken by the operator. In this case, the prompt generation unit 303 may generate a prompt that includes the audio signal and input it to the language model LM. Alternatively, the instruction acquisition unit 301 may extract an acoustic feature vector from the audio signal of the operation instruction spoken by the operator. In this case, the prompt generation unit 303 may generate a prompt that includes the acoustic feature vector and input it to the language model LM.

[0156] When the operator's spoken operation instructions are converted to text, some information may be lost. For example, the operator's voice may express emotions (e.g., anxiety), but this information may be lost in the conversion to text. By inputting the operator's voice into the language model LM as an audio signal or an acoustic feature vector, the language model LM can interpret the operation instruction using information that would otherwise be lost in text conversion.

[0157] On the other hand, the text obtained by speech recognition of operation instructions has the advantage of being easy for a human (e.g., the operator) to confirm the content. For example, the instruction acquisition unit 301 may generate both the text obtained by speech recognition of the operation instructions and an acoustic feature vector extracted from the audio signal of the operation instructions. In this case, the prompt generation unit 303 may input a prompt including the acoustic feature vector to the language model LM. The display control unit 306 may also display the text obtained by speech recognition of the operation instructions on the display device D1. This allows the operator to easily confirm the text obtained by speech recognition of the operation instructions, and the language model LM can appropriately generate control information for the work machine 100 by utilizing the information that would be lost if it were converted to text.

[0158] As another example, the construction information acquisition unit 302 may extract embedding vectors from information that has been verbalized in natural language from the construction information CON. The construction information acquisition unit 302 may also extract embedding vectors based on a deep learning model that has been trained to correspond with the language model LM. In this case, the prompt generation unit 303 may generate a prompt that includes the embedding vectors and input it to the language model LM.

[0159] When construction information CON, such as construction drawings, is verbalized, the amount of text data can become enormous. Converting the text representing the verbalized construction information CON into embedding vectors can reduce the amount of data. The language model LM can then generate control information for the work machine 100 at high speed and with high accuracy based on lighter and more detailed information.
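The data-reduction point can be illustrated with a toy example in which text of any length is mapped to a vector of fixed size. This is not the trained deep learning model the disclosure refers to: the hashed bag-of-words below is a deliberately simple stand-in, and the name `toy_embedding` and the dimension of 8 are arbitrary assumptions.

```python
import hashlib


def toy_embedding(text: str, dim: int = 8) -> list:
    """Map text of any length to a fixed-size count vector.

    Each whitespace-separated token is hashed into one of `dim` buckets.
    A real system would use a trained embedding model instead.
    """
    vector = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        vector[digest[0] % dim] += 1.0
    return vector


short_vec = toy_embedding("There is a pipe.")
long_vec = toy_embedding("There is a pipe. " * 1000)
```

Both vectors have the same length even though the second input is a thousand times longer, which is the property that keeps the prompt size bounded.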

[0160] On the other hand, the text representing the construction information CON has the advantage of being easy for a human (e.g., the operator) to understand. For example, the construction information acquisition unit 302 may generate both the text representing the construction information CON and the embedding vectors extracted from that text. In this case, the prompt generation unit 303 may input a prompt including the embedding vectors to the language model LM. The display control unit 306 may also display the text representing the construction information CON on the display device D1. This allows the operator to easily understand the content of the construction information CON, and enables the language model LM to generate control information for the work machine 100 quickly and accurately based on lightweight and detailed information.

[0161] <Effects of the First Embodiment> The work machine 100 according to this embodiment includes an instruction acquisition unit 301 that acquires instructions in natural language from the operator, a construction information acquisition unit 302 that acquires information that has been verbalized from construction information, and an operation control unit 305 that controls the operation of the work machine 100 based on the result of having the language model LM interpret the instructions acquired by the instruction acquisition unit 301 and the information acquired by the construction information acquisition unit 302. According to this embodiment, since the work machine 100 interprets instructions in natural language from the operator based on information that has been verbalized from construction information, the effectiveness of operating the work machine 100 with instructions in natural language from the operator can be improved.

[0162] Construction information may include topographic information of the construction site, information on the arrangement of objects at the construction site, or attribute information of construction personnel. According to this embodiment, since the natural language instructions from the operator are interpreted based on the topographic information of the construction site, information on the arrangement of objects at the construction site, or attribute information of construction personnel, the effectiveness of operating the work machine 100 based on the operator's natural language instructions can be improved.

[0163] The work machine 100 may be equipped with a display device D1 that displays an image of the area around the work machine 100. According to this embodiment, the operator can operate the work machine 100 while checking the image of the area around the work machine 100, thus improving safety.

[0164] The display device D1 may display information related to the control of the operation of the work machine 100, along with an image of the surroundings of the work machine 100. According to this embodiment, the operator can check the information related to the control of the operation of the work machine 100 while checking the image of the surroundings of the work machine 100, thereby improving safety.

[0165] The display device D1 may display information regarding the status of the work machine 100 along with an image of the surroundings of the work machine 100. According to this embodiment, the operator can check the information regarding the status of the work machine 100 while checking the image of the surroundings of the work machine 100, thereby improving safety.

[0166] The work machine 100 may narrow down the information that verbalizes the construction information based on the instructions acquired by the instruction acquisition unit 301. According to this embodiment, since only the construction information related to the instructions given by the operator is interpreted by the language model LM, the operation of the work machine 100 can be appropriately controlled.

[0167] The work machine 100 may acquire an audio signal obtained by capturing instructions spoken by the operator, text obtained by speech recognition of the audio signal, or an acoustic feature vector extracted from the audio signal. According to this embodiment, by having the language model LM interpret the audio signal or the acoustic feature vector, the operation of the work machine 100 can be appropriately controlled using information that would be lost in conversion to text. Furthermore, according to this embodiment, the operator can easily confirm the content of the operation instructions interpreted by the language model LM based on the text obtained by speech recognition of the audio signal.

[0168] The work machine 100 may acquire construction information as text expressed in natural language, or as embedding vectors extracted from the text. According to this embodiment, since the language model LM is made to interpret embedding vectors extracted from text expressed in natural language, the language model LM can control the operation of the work machine 100 at high speed and with high accuracy based on lighter and more detailed information.

[0169] (Second Embodiment) An overview of the operation support system SYS1 according to the second embodiment will be described with reference to Figure 14. Figure 14 is a schematic diagram showing an example of the operation support system SYS1 according to the second embodiment.

[0170] <Equipment comprising the operation support system> As shown in Figure 14, the operation support system SYS1 according to the second embodiment includes a work machine 100 and a server device SVR.

[0171] The work machine 100 and the server device SVR are connected to each other so that they can send and receive data via a communication network NW. Alternatively, the work machine 100 and the server device SVR may be connected to each other so that they can send and receive data directly without using the communication network NW.

[0172] The server device SVR is, for example, a server computer (a so-called cloud server) or an edge server. The server device SVR is typically a fixed terminal device, but it may also be a portable terminal device (for example, a laptop computer, tablet, or smartphone).

[0173] The server device SVR can be installed anywhere as long as it can communicate with the work machine 100. The server device SVR may be installed in the control room of the work site where the work machine 100 is operating. The server device SVR may also be installed at a location different from the work site (for example, a data center).

[0174] The operation support system SYS1 may include two or more work machines 100. This allows the operation support system SYS1 to support the operation of two or more work machines 100 using a single server device SVR.

[0175] [Server Device Configuration] The configuration of the server device SVR will be described with reference to Figure 15. Figure 15 is a block diagram showing an example of the configuration of the server device SVR.

[0176] The functions of the server device SVR are realized by any hardware or any combination of hardware and software. For example, as shown in Figure 15, the server device SVR includes an external interface 201, an auxiliary storage device 202, a memory device 203, a CPU 204, a high-speed arithmetic unit 205, a communication interface 206, an input device 207, a display device 208, and an audio output device 209. These are connected by bus BS2.

[0177] The external interface 201 functions as an interface for reading data from and writing data to a recording medium 201A. The recording medium 201A is, for example, a flexible disk, a CD (Compact Disc), a DVD (Digital Versatile Disc), a BD (Blu-ray® Disc), an SD memory card, or a USB memory. Through the recording medium 201A, the server device SVR can read various data used in processing, store the data in the auxiliary storage device 202, and install programs that realize various functions.

[0178] The server device SVR may acquire various data and programs for processing from external devices through the communication interface 206.

[0179] The auxiliary storage device 202 stores various installed programs, as well as files and data necessary for various processes. The auxiliary storage device 202 includes, for example, an HDD (Hard Disk Drive), an SSD (Solid State Drive), or flash memory.

[0180] When a program startup command is received, the memory device 203 reads the program from the auxiliary storage device 202 and stores it. The memory device 203 includes, for example, DRAM (Dynamic Random Access Memory) or SRAM (Static Random Access Memory).

[0181] The CPU 204 executes various programs loaded from the auxiliary storage device 202 into the memory device 203, and implements various functions related to the server device SVR according to the program.

[0182] The high-speed arithmetic unit 205 works in conjunction with the CPU 204 to perform calculations at a relatively high speed. The high-speed arithmetic unit 205 includes, for example, a GPU (Graphics Processing Unit), an ASIC (Application Specific Integrated Circuit), or an FPGA (Field-Programmable Gate Array). The high-speed arithmetic unit 205 may be omitted depending on the required speed of the calculations.

[0183] The communication interface 206 is used as an interface for communicating with external devices. The server device SVR can communicate with external devices, such as a work machine 100, through the communication interface 206. The communication interface 206 may also have multiple types of communication interfaces depending on the communication method between it and the connected devices.

[0184] The input device 207 receives various inputs from the user. The input device 207 may also include a remote control device for remotely operating the work machine 100.

[0185] The input device 207 includes, for example, an input device that accepts mechanical operation input from a user (hereinafter referred to as "operation input device"). An operation device for remote control may be an operation input device. An operation input device includes, for example, buttons, toggles, levers, keyboards, mice, touch panels mounted on the display device 208, touch pads provided separately from the display device 208, and the like.

[0186] Furthermore, the input device 207 may include a voice input device capable of receiving voice input from the user. The voice input device may include, for example, a microphone capable of collecting the user's voice.

[0187] Furthermore, the input device 207 may include a gesture input device capable of receiving gesture input from the user. The gesture input device may include, for example, a camera capable of capturing images of the user's gestures.

[0188] Furthermore, the input device 207 may include a biometric input device capable of receiving biometric input from the user. The biometric input device may include, for example, a camera capable of acquiring image data containing information about the user's fingerprints or iris.

[0189] The display device 208 displays information screens and operation screens to the user of the server device SVR. The display device 208 is, for example, a liquid crystal display or an organic EL (electroluminescence) display.

[0190] The audio output device 209 conveys various information to the user of the server device SVR by sound. The audio output device 209 may be, for example, a buzzer, alarm, or speaker.

[0191] In this embodiment, the server device SVR has the same functional configuration as the controller 30 according to the first embodiment. That is, the server device SVR includes, as functional units, an instruction acquisition unit 301, a construction information acquisition unit 302, a prompt generation unit 303, a generation instruction unit 304, an operation control unit 305, and a display control unit 306. These functions are realized, for example, by loading a program installed in the auxiliary storage device 202 into the memory device 203 and executing it with the CPU 204. The auxiliary storage device 202 of the server device SVR stores the language model LM, construction information CON, and reference information REF.
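The chain of functional units described above can be pictured as a small pipeline. The following is a minimal sketch only, not the implementation of this disclosure: the function names, the prompt layout, and the stand-in `fake_language_model` are assumptions introduced for illustration, and the text does not specify the actual language model LM or the format of its output.

```python
from typing import Callable, List


def generate_prompt(instruction: str, construction_info: List[str]) -> str:
    """Role of the prompt generation unit 303 (the layout is an assumption)."""
    info_lines = "\n".join(f"- {item}" for item in construction_info)
    return (
        "Construction information:\n"
        f"{info_lines}\n"
        "Operator instruction:\n"
        f"{instruction}"
    )


def request_generation(prompt: str, language_model: Callable[[str], str]) -> str:
    """Role of the generation instruction unit 304: hand the prompt to the
    language model LM and return its output as control information."""
    return language_model(prompt)


def support_operation(
    instruction: str,
    construction_info: List[str],
    language_model: Callable[[str], str],
) -> str:
    """Run the units in the order given in the text: 303, then 304."""
    prompt = generate_prompt(instruction, construction_info)
    return request_generation(prompt, language_model)


def fake_language_model(prompt: str) -> str:
    """Hypothetical stand-in; a real language model would interpret the prompt."""
    return f"control information derived from {len(prompt.splitlines())} prompt lines"
```

Passing the language model as a callable keeps the sketch independent of where the model runs, which matches the point that the same functional units may be hosted on the server device SVR or on the controller 30.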

[0192] <Flow of Operation Support Processing> The operation support processing performed by the operation support system SYS1 will be described below. Figure 16 is a sequence diagram showing an example of operation support processing according to the second embodiment. Note that the processing in steps S204 to S208 is the same as the processing in steps S102 to S106 of the operation support processing according to the first embodiment (see Figure 11).

[0193] In step S201, the controller 30 of the work machine 100 acquires operation instructions input by the operator in natural language to the input device D2. If the operation instruction is input by voice, the instruction acquisition unit 301 may acquire text indicating the operation instruction by applying voice recognition technology to the voice signal that has been captured.

[0194] In step S202, the controller 30 of the work machine 100 transmits the operation instruction acquired in step S201 to the server device SVR. The controller 30 may transmit an audio signal containing the operation instruction to the server device SVR, or it may transmit text indicating the operation instruction to the server device SVR.

[0195] In step S203, the server device SVR receives an operation instruction transmitted from the work machine 100. The instruction acquisition unit 301 of the server device SVR acquires the operation instruction received by the server device SVR. If the instruction acquisition unit 301 acquires an audio signal of the operation instruction, it applies speech recognition technology to the audio signal to acquire text indicating the operation instruction. The instruction acquisition unit 301 sends the text indicating the operation instruction to the prompt generation unit 303.
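The branch in steps S202 and S203, where the work machine 100 may send either an audio signal or text, can be sketched as follows. This is an illustrative sketch under assumptions: the two message classes and the `recognize_speech` callable are hypothetical, and no particular speech recognition technology is implied.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass
class AudioInstruction:
    samples: bytes  # audio signal containing the operation instruction


@dataclass
class TextInstruction:
    text: str  # text already obtained on the work machine side


Instruction = Union[AudioInstruction, TextInstruction]


def acquire_instruction_text(
    received: Instruction,
    recognize_speech: Callable[[bytes], str],
) -> str:
    """Role of the instruction acquisition unit 301 on the server device SVR.

    Audio is converted to text by the supplied recognizer (an assumed
    interface); text is passed through unchanged.
    """
    if isinstance(received, AudioInstruction):
        return recognize_speech(received.samples)
    return received.text
```

Either way, the result is plain text that can be handed to the prompt generation unit 303.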

[0196] In step S209, the operation control unit 305 of the server device SVR receives control information for the work machine 100 from the generation instruction unit 304. Based on the control information for the work machine 100, the operation control unit 305 controls the operation of the work machine 100. For example, the operation control unit 305 may generate an operation signal to control the operation of the work machine 100 based on the control information for the work machine 100 and transmit it to the work machine 100.

[0197] In step S210, the controller 30 of the work machine 100 receives an operation signal from the server device SVR. Based on the operation signal, the controller 30 of the work machine 100 controls the operation of the work machine 100. As a result, the work machine 100 can perform operations in accordance with the operator's natural language instructions.
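The exchange in steps S209 and S210 can be illustrated with a short sketch. The text does not define the format of the control information or of the operation signal, so everything below is an assumption: control information is modeled as a mapping from an actuator name to a normalized command value, the operation signal is a JSON byte string, and the set of allowed actuator names is hypothetical.

```python
import json
from typing import Dict, List

# Assumed actuator names, loosely following the reference signs of the work machine.
ALLOWED_ACTUATORS = {"boom", "arm", "bucket", "swivel", "travel"}


def make_operation_signal(control_info: Dict[str, float]) -> bytes:
    """Server side (role of the operation control unit 305).

    Rejects actuator names outside the assumed set, then serializes
    the commands into an operation signal.
    """
    unknown = set(control_info) - ALLOWED_ACTUATORS
    if unknown:
        raise ValueError(f"unknown actuators: {sorted(unknown)}")
    return json.dumps(control_info, sort_keys=True).encode("utf-8")


def apply_operation_signal(signal: bytes) -> List[str]:
    """Work machine side (role of the controller 30).

    Decodes the operation signal and returns the commands it would
    issue, as strings, instead of driving real actuators.
    """
    commands = json.loads(signal.decode("utf-8"))
    return [f"{name}:{value:+.2f}" for name, value in sorted(commands.items())]
```

Checking the language model's output against a fixed set before transmission is a design choice of this sketch, not something stated in the text.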

[0198] The allocation of processing between the work machine 100 and the server device SVR may be changed as desired. This embodiment described a configuration in which the work machine 100 acquires operation instructions from the operator, the server device SVR acquires verbalized construction information, and the server device SVR inputs the operation instructions and construction information to the language model LM to generate control information for the work machine 100. Alternatively, for example, the work machine 100 may acquire both the operation instructions and the verbalized construction information and send a prompt generated from them to the server device SVR. In that case, the server device SVR may generate control information for the work machine 100 by inputting the prompt received from the work machine 100 to the language model LM.

[0199] <Effects of the Second Embodiment> The server device SVR according to this embodiment includes an instruction acquisition unit 301 that acquires instructions in natural language from the operator to the work machine 100, a construction information acquisition unit 302 that acquires verbalized construction information, and an operation control unit 305 that controls the operation of the work machine 100 based on the result of having the language model LM interpret the instructions acquired by the instruction acquisition unit 301 and the information acquired by the construction information acquisition unit 302. According to this embodiment, the server device SVR interprets the operator's natural language instructions in light of the verbalized construction information, so the effectiveness of operating the work machine 100 through the operator's natural language instructions can be improved.

[0200] (Third Embodiment) The embodiments described above dealt with the case in which work is performed with an operator on board the work machine 100. However, the techniques of those embodiments are not limited to work performed with an operator on board. For example, the same control may be applied when the work machine 100 performs work under remote control. The third embodiment therefore describes the case in which the work machine 100 is remotely controlled.

[0201] Referring to Figure 17, an overview of the remote control system SYS2 according to the third embodiment will be described. Figure 17 is a schematic diagram showing an example of the remote control system SYS2 according to the third embodiment.

[0202] <Equipment comprising the remote control system> As shown in Figure 17, the remote control system SYS2 according to the third embodiment includes a work machine 100 and a remote control room RC.

[0203] The work machine 100 and the remote control room RC are connected to each other so that data can be sent and received via a communication network NW. Alternatively, the work machine 100 and the remote control room RC may be connected to each other so that data can be sent and received directly without using the communication network NW.

[0204] In the illustrated example, the work machine 100 transmits information about the work site to the remote control room RC. This allows the operator OP in the remote control room RC to understand the situation at the work site based on the information from the work machine 100. The remote control room RC transmits operation signals to the work machine 100 to control its operation. This allows the operator OP in the remote control room RC to operate the work machine 100 from the remote control room RC.

[0205] The remote control system SYS2 may include two or more work machines 100. This allows the remote control system SYS2 to provide information about the work site to the remote control room RC through two or more work machines 100.

[0206] <Example of Remote Control Room Configuration> The remote control room RC is equipped with a communication device T2, a remote controller 40, an operating device 42, an operation sensor 43, a display device D4, and an input device D5. The remote control room RC also has an operator's seat DS where the operator OP sits to remotely control the work machine 100.

[0207] The communication device T2 is configured to control communication with the communication device T1 attached to the work machine 100.

[0208] The remote controller 40 is an information processing device that performs various calculations. In this embodiment, the remote controller 40 is composed of a microcomputer including a CPU and memory. The various functions of the remote controller 40 are realized by the CPU executing a program stored in memory.

[0209] Display device D4 is a device capable of displaying various types of information. Display device D4 displays a screen based on information transmitted from the work machine 100 so that the operator OP in the remote control room RC can visually check the area around the work machine 100. By referring to display device D4, the operator OP can check the situation of the work site, including the area around the work machine 100, even though they are in the remote control room RC. In the illustrated example, display device D4 is a liquid crystal display that displays images captured by an imaging device mounted on the work machine 100. Note that display device D4 may also be a display or projector that enables naked-eye stereoscopic viewing, or it may be VR goggles or the like.

[0210] The input device D5 is positioned within reach of the operator OP, or within range of the operator OP's voice. The input device D5 may be provided on the operating device 42 or the display device D4. The input device D5 may be configured similarly to the input device D2 installed on the work machine 100.

[0211] The operating device 42 is equipped with an operation sensor 43 for detecting the operation of the operating device 42. The operation sensor 43 is, for example, a tilt sensor that detects the tilt angle of the operating lever, or an angle sensor that detects the oscillation angle of the operating lever around its pivot axis. The operation sensor 43 may also consist of other sensors such as a pressure sensor, a current sensor, a voltage sensor, or a distance sensor. The operation sensor 43 outputs information regarding the detected operation of the operating device 42 to the remote controller 40. The remote controller 40 generates an operation signal based on the information received from the operation sensor 43 and transmits the generated operation signal to the work machine 100. The operation sensor 43 may also be configured to generate an operation signal. In this case, the operation sensor 43 may output the operation signal to the communication device T2 without going through the remote controller 40. This enables remote control of the work machine 100 from the remote control room RC.
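As a rough illustration of how information from the operation sensor 43 might be turned into an operation signal value, the sketch below maps a lever tilt angle to a normalized command. The maximum tilt angle, the dead band around neutral, and the linear mapping are all assumptions introduced for this example; the text does not specify how the operation signal is computed.

```python
def lever_to_command(
    tilt_deg: float,
    max_tilt_deg: float = 25.0,   # assumed mechanical limit of the lever
    dead_band_deg: float = 2.0,   # assumed neutral zone that yields no command
) -> float:
    """Map a detected lever tilt angle to a command in the range [-1.0, 1.0]."""
    if abs(tilt_deg) <= dead_band_deg:
        return 0.0
    # Clip readings beyond the assumed mechanical limit.
    clipped = max(-max_tilt_deg, min(max_tilt_deg, tilt_deg))
    # Scale linearly from the edge of the dead band to the limit.
    magnitude = (abs(clipped) - dead_band_deg) / (max_tilt_deg - dead_band_deg)
    return magnitude if clipped > 0 else -magnitude
```

In this sketch the same function could run in the remote controller 40 or in the operation sensor 43 itself, corresponding to the two configurations described above.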

[0212] The remote controller 40 or operation sensor 43 may also transmit operation signals to the work machine 100 via the server device SVR. For example, the remote controller 40 or operation sensor 43 generates an operation signal to which identification information for the work machine 100 to be operated is added, and transmits it to the server device SVR. The server device SVR identifies the work machine 100 based on the identification information added to the operation signal and transmits the operation signal to the identified work machine 100.
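The relay described above, in which the server device SVR forwards an operation signal to the work machine 100 named by the attached identification information, can be sketched as a lookup table. The class and field names are hypothetical, and each communication link is modeled as a plain callable rather than a real network connection.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class OperationSignal:
    machine_id: str  # identification information added by the sender
    payload: bytes   # operation signal content (format not specified in the text)


class SignalRouter:
    """Role of the server device SVR when it relays operation signals."""

    def __init__(self) -> None:
        self._links: Dict[str, Callable[[bytes], None]] = {}

    def register(self, machine_id: str, send: Callable[[bytes], None]) -> None:
        """Associate a work machine's identification information with its link."""
        self._links[machine_id] = send

    def forward(self, signal: OperationSignal) -> None:
        """Identify the target work machine and transmit the payload to it."""
        try:
            send = self._links[signal.machine_id]
        except KeyError:
            raise LookupError(
                f"no work machine registered as {signal.machine_id!r}"
            ) from None
        send(signal.payload)
```

For example, registering two work machines and forwarding a signal tagged for the second delivers the payload only to that machine; a signal with unregistered identification information raises an error instead of being sent.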

[0213] In this embodiment, the remote controller 40 has the same functional configuration as the controller 30 in the first embodiment. That is, the remote controller 40 includes, as functional units, an instruction acquisition unit 301, a construction information acquisition unit 302, a prompt generation unit 303, a generation instruction unit 304, an operation control unit 305, and a display control unit 306. The remote controller 40 is also connected to an auxiliary storage device D3 in which the language model LM, construction information CON, and reference information REF are stored.

[0214] <Flow of Operation Support Processing> The operation support processing performed by the remote control system SYS2 will be described below. Figure 18 is a sequence diagram showing an example of operation support processing according to the third embodiment. Note that the processing in steps S302 to S306 is the same as the processing in steps S102 to S106 of the operation support processing according to the first embodiment (see Figure 11).

[0215] In step S301, the instruction acquisition unit 301 of the remote controller 40 acquires an operation instruction input to the input device D5 in natural language from the operator OP. If the operation instruction is input by voice, the instruction acquisition unit 301 applies speech recognition technology to the voice signal of the acquired operation instruction to acquire text indicating the operation instruction. The instruction acquisition unit 301 sends the text indicating the operation instruction to the prompt generation unit 303.

[0216] In step S307, the operation control unit 305 of the remote controller 40 receives control information for the work machine 100 from the generation instruction unit 304. Based on the control information for the work machine 100, the operation control unit 305 controls the operation of the work machine 100. For example, the operation control unit 305 may generate an operation signal to control the operation of the work machine 100 based on the control information for the work machine 100 and transmit it to the work machine 100.

[0217] In step S308, the controller 30 of the work machine 100 receives an operation signal from the remote control room RC. Based on the operation signal, the controller 30 of the work machine 100 controls the operation of the work machine 100. As a result, the work machine 100 can perform operations in accordance with the natural language operation instructions of the operator OP.

[0218] The remote control system SYS2 may also include the server device SVR according to the second embodiment. For example, the remote controller 40 may obtain operation instructions from the operator OP and transmit them to the server device SVR. The server device SVR may then obtain verbalized construction information and generate control information for the work machine 100 by inputting the operation instructions and construction information into the language model LM. Alternatively, for example, the remote controller 40 may obtain both the operation instructions and the verbalized construction information and transmit a prompt generated from them to the server device SVR. The server device SVR may then generate control information for the work machine 100 by inputting the prompt received from the remote controller 40 into the language model LM.

[0219] <Effects of the Third Embodiment> The remote controller 40 according to this embodiment includes a construction information acquisition unit 302 that acquires construction information in language form, an instruction acquisition unit 301 that acquires instructions from the operator in natural language, and an operation control unit 305 that controls the operation of the work machine 100 based on the result of having the language model LM interpret the instructions acquired by the instruction acquisition unit 301 and the information acquired by the construction information acquisition unit 302. According to this embodiment, the remote controller 40 interprets instructions from the operator in natural language form based on construction information in language form, so the effectiveness of operating the work machine 100 by the operator's instructions in natural language can be improved.

[0220] The embodiments and modifications described above use a shovel as an example of the work machine. However, the configurations shown in the embodiments and modifications are not limited to shovels and may also be applied to other work machines such as cranes and forklifts.

[0221] The embodiments of the work machine, the work machine operation support system, and the work machine remote control system according to the present invention have been described above, but the present invention is not limited to the above embodiments. Various changes, modifications, substitutions, additions, deletions, and combinations are possible within the scope described in the claims. These also naturally fall within the technical scope of the present invention.

[0222] This application claims priority to Japanese Patent Application No. 2024-225500, filed with the Japan Patent Office on December 20, 2024, the entire contents of which are incorporated herein by reference.

[0223]
100 Work Machinery
1 Lower Traveling Body
2 Swivel Mechanism
3 Upper Swivel Body
4 Boom
5 Arm
6 Bucket
S1 Boom Angle Sensor
S2 Arm Angle Sensor
S3 Bucket Angle Sensor
S4 Machine Body Tilt Sensor
S5 Swivel Angle Sensor
S6 Imaging Device
PS Positioning Device
T1 Communication Device
D1 Display Device
D2 Input Device
D3 Auxiliary Storage Device
LM Language Model
30 Controller
301 Instruction Acquisition Unit
302 Construction Information Acquisition Unit
303 Prompt Generation Unit
304 Generation Instruction Unit
305 Operation Control Unit
306 Display Control Unit
SYS1 Operation Support System
SVR Server Device
SYS2 Remote Control System
RC Remote Control Room
40 Remote Controller
T2 Communication Device
D4 Display Device
D5 Input Device

Claims

1. A work machine comprising: an instruction acquisition unit that acquires instructions in natural language from an operator; a construction information acquisition unit that acquires construction information expressed in language; and a control unit that controls the operation of the work machine based on the results of a language model interpreting the instructions acquired by the instruction acquisition unit and the information acquired by the construction information acquisition unit.

2. The work machine according to claim 1, wherein the construction information includes topographic information of the construction site, information on the arrangement of objects at the construction site, attribute information of construction personnel, attribute information of objects at the construction site, attribute information of the weather, attribute information of the operator, information on the arrangement of people at the construction site, or attribute information of people present at the construction site.

3. The work machine according to claim 1, further comprising a display device for displaying an image of the surroundings of the work machine.

4. The work machine according to claim 3, wherein the display device displays information relating to the control of the operation of the work machine together with the image.

5. The work machine according to claim 3, wherein the display device displays information relating to the status of the work machine together with the image.

6. The work machine according to any one of claims 1 to 5, wherein the construction information acquisition unit narrows down the information that verbalizes the construction information based on the instructions acquired by the instruction acquisition unit.

7. The work machine according to any one of claims 1 to 5, wherein the instruction acquisition unit acquires an audio signal obtained by recording the instruction spoken by the operator, text obtained by speech recognition of the audio signal, or an acoustic feature vector extracted from the audio signal.

8. The work machine according to any one of claims 1 to 5, wherein the construction information acquisition unit acquires the construction information as text expressed in natural language, or embedded vectors extracted from the text.

9. An information processing device for controlling a work machine, comprising: a construction information acquisition unit that acquires information that has been converted into language from construction information; an instruction acquisition unit that acquires instructions in natural language from an operator from the work machine; and a control unit that controls the operation of the work machine based on the result of a language model interpreting the instructions acquired by the instruction acquisition unit and the information acquired by the construction information acquisition unit.

10. An operation system for a work machine, comprising: a construction information acquisition unit that acquires information that has been converted into language from construction information; an instruction acquisition unit that acquires instructions in natural language from an operator from the work machine; and a control unit that controls the operation of the work machine based on the result of a language model interpreting the instructions acquired by the instruction acquisition unit and the information acquired by the construction information acquisition unit.