Communication method and apparatus

By obtaining the neighborhood matching degree of point cloud data, the real point cloud data and noisy data can be distinguished, which solves the problem of low noise removal efficiency in multimodal perception and improves perception performance.

WO2026045601A9 · PCT designated stage · Publication Date: 2026-06-18 · HUAWEI TECH CO LTD

Patent Information

Authority / Receiving Office
WO · WO
Patent Type
Applications
Current Assignee / Owner
HUAWEI TECH CO LTD
Filing Date
2025-06-28
Publication Date
2026-06-18

AI Technical Summary

Technical Problem

Existing multimodal sensing schemes suffer from low noise removal efficiency when fusing 2D and 3D sensing, which affects sensing performance.

Method used

The neighborhood matching degree of each piece of point cloud data is obtained and used to distinguish real point cloud data from noisy data, so that noise can be filtered out and multimodal perception performance is improved.

Benefits of technology

It effectively removes noisy data, improving the accuracy and efficiency of multimodal sensing.

✦ Generated by Eureka AI based on patent content.

Smart Images

  • Figure CN2025105114_18062026_PF_FP_ABST

Abstract

The present application relates to the technical field of communications, and provides a communication method and an apparatus, for use in improving multi-modal sensing performance. In the method, after acquiring the respective neighborhood matching degrees of N pieces of point cloud data, a first apparatus can determine noise in the N pieces of point cloud data on the basis of the respective neighborhood matching degrees of the N pieces of point cloud data, wherein the neighborhood matching degree of an i-th piece of point cloud data among the N pieces of point cloud data is determined on the basis of the image similarity between a first neighborhood i and a second neighborhood i, the first neighborhood i is a region obtained after the i-th piece of point cloud data is projected to a first image, the second neighborhood i is a region obtained after the i-th piece of point cloud data is projected to a second image, the first image and the second image are obtained by performing two-dimensional sensing on a target object, the first image and the second image are different from each other, N is an integer greater than 1, and i is an integer that traverses from 1 to N. In this way, noise in N pieces of point cloud data can be effectively removed, thereby improving multi-modal sensing performance.

Description

Communication method and apparatus

[0001] This application claims priority to Chinese Patent Application No. 202411224322.4, filed with the State Intellectual Property Office of China on September 2, 2024, entitled "Communication Method and Apparatus", the entire contents of which are incorporated herein by reference.

Technical Field

[0002] This application relates to the field of communication technology, and in particular to a communication method and apparatus.

Background Technology

[0003] Perception can be broadly categorized into two-dimensional (2D) and three-dimensional (3D) perception. 2D perception captures a scene onto a two-dimensional plane, generating a planar image. Common 2D perception methods include optical perception and radar perception. 3D perception captures the shape and structure of a scene in three-dimensional space, generating a stereoscopic image. Common 3D perception methods include radio frequency sensing and computed tomography (CT) scanning.

[0004] Currently, 2D and 3D sensing solutions can be integrated to improve sensing performance through multi-dimensional data, a process known as multimodal fusion sensing. For example, radio frequency sensing and optical sensing information can be fused to enhance sensing performance. Building on this, how to further improve multimodal sensing performance is a hot topic of discussion.

Summary of the Invention

[0005] This application provides a communication method and apparatus to improve multimodal sensing performance.

[0006] To achieve the above objectives, this application adopts the following technical solution:

[0007] In a first aspect, a communication method is provided. The method is applied to a first device: it can be executed by the first device itself, by a component of the first device (such as a processor, chip, chip system, or circuit), or by a logic module or software capable of implementing all or part of the functions of the first device. The following description takes execution by the first device as an example. The method includes: acquiring the neighborhood matching degree of each of N pieces of point cloud data, and determining the noise in the N pieces of point cloud data based on those neighborhood matching degrees. Here, N is an integer greater than 1, and i is an integer traversing from 1 to N. The N pieces of point cloud data are three-dimensional point cloud data obtained by perceiving a target object. The neighborhood matching degree of the i-th piece of point cloud data is determined based on the image similarity between a first neighborhood i and a second neighborhood i, where the first neighborhood i is the region obtained after the i-th piece of point cloud data is projected onto a first image, and the second neighborhood i is the region obtained after the i-th piece of point cloud data is projected onto a second image. The first image and the second image are obtained by two-dimensional perception of the target object, and the two images are different.

[0008] With the method of the first aspect, when real point cloud data (i.e., point cloud data that is not noise) among the N point cloud data is projected onto the first and second images, the first and second neighborhoods both image the target object, so their imaging results are similar (or identical). Such similar regions can be understood as regions that include the point cloud data and differ only slightly; that is, the first and second neighborhoods have a high degree of similarity. Conversely, when noisy point cloud data among the N point cloud data is projected onto the first and second images, the first and second neighborhoods image different regions; that is, the first and second neighborhoods have a low degree of similarity. Thus, based on the similarity between the first and second neighborhoods, it is possible to determine whether the i-th point cloud data is real point cloud data or noise. This effectively filters or removes noise from the N point cloud data, so that multimodal fusion sensing uses real point cloud data, thereby improving multimodal sensing performance.
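The decision rule described above can be illustrated with a minimal sketch. It assumes that a higher matching degree means more similar neighborhoods and that a fixed threshold separates real points from noise. The threshold value and the function names `flag_noise` and `keep_real_points` are hypothetical; the source does not specify how the decision is made.

```python
from typing import List, Sequence, Tuple

Point3D = Tuple[float, float, float]


def flag_noise(matching_degrees: Sequence[float], threshold: float = 0.5) -> List[bool]:
    """Return one flag per point; True means the point is treated as noise.

    Assumption: a higher matching degree means more similar neighborhoods.
    The threshold is a hypothetical tuning value, not taken from the source.
    """
    return [degree < threshold for degree in matching_degrees]


def keep_real_points(
    points: Sequence[Point3D],
    matching_degrees: Sequence[float],
    threshold: float = 0.5,
) -> List[Point3D]:
    """Drop the points flagged as noise and return the remaining ones."""
    flags = flag_noise(matching_degrees, threshold)
    return [p for p, is_noise in zip(points, flags) if not is_noise]


# Example: the third point has a low matching degree and is removed.
points = [(0.0, 0.0, 1.0), (0.1, 0.0, 1.0), (5.0, 5.0, 9.0)]
degrees = [0.92, 0.88, 0.12]
real_points = keep_real_points(points, degrees)
```

In practice the threshold would have to be chosen for the specific sensors and matching degree calculation in use.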

[0009] It can be understood that "two-dimensional" above can also be referred to as 2D, and "three-dimensional" above can also be referred to as 3D.

[0010] In one possible design, obtaining the neighborhood matching degree of each of N point cloud data includes: obtaining first information, which indicates the relevant information of pixels in a first neighborhood i and a second neighborhood i; and determining the neighborhood matching degree of the i-th point cloud data based on the first information. The relevant information of pixels in the first neighborhood i and the second neighborhood i can characterize the image features of the first neighborhood i and the second neighborhood i. Therefore, the first device can accurately determine the neighborhood matching degree of the i-th point cloud data through the relevant information of pixels in the first neighborhood i and the second neighborhood i.

[0011] Optionally, obtaining the first information includes: sending a first message to the second device, the first message requesting information about pixels in the region where N point cloud data are projected onto a two-dimensional image, the two-dimensional image being obtained through perception of a target object; and receiving the first information from the second device. In other words, the first device can request information about pixels in the region where N point cloud data are projected onto a two-dimensional image from the second device; that is, the first device can obtain this information from the second device. This avoids the first device calculating the first information, thereby reducing the computational overhead of the first device.

[0012] Furthermore, the first message includes neighborhood description information, which indicates the size and / or shape of the region where the N point cloud data are projected onto the two-dimensional image. That is, after projecting the i-th point cloud data onto the first and second images, the second device can determine the first neighborhood i and the second neighborhood i based on the neighborhood description information, and then obtain relevant information about the pixels in the first and second neighborhood i. It is understood that the first and second devices can also pre-agree or pre-define the size and / or shape of the region where the point cloud data is projected onto the two-dimensional image, without restriction.

[0013] Furthermore, the method in the first aspect further includes: receiving a fifth capability parameter from the second device, the fifth capability parameter being used to indicate at least one of the following information of the second device: information protection level information, sensing capability information, communication capability information, computing capability information, storage capability information, internal and external parameter transmission information, neighborhood extraction acquisition information, neighborhood feature acquisition information, or neighborhood matching degree acquisition information.

[0014] Among them, the information protection level information is used to indicate the information exposure level supported by the second device; the perception capability information is used to indicate the perception-related parameters of the second device; the communication capability information is used to describe whether the second device has the ability to share data, and the strength of this ability; the computing capability information is used to indicate whether the second device has the ability to analyze and calculate data, and the strength of this ability; the internal and external parameter transmission information is used to indicate whether the second device supports the transmission of internal and external parameters (i.e., internal parameters, external parameters) and two-dimensional images; the neighborhood extraction acquisition information is used to indicate whether the second device supports neighborhood extraction and the transmission of extracted neighborhood information; the neighborhood feature acquisition information is used to indicate whether the second device supports neighborhood feature extraction and the transmission of extracted neighborhood features; and the neighborhood matching degree acquisition information is used to indicate whether the second device supports providing neighborhood matching degrees to other devices (such as the first device). The second device indicates the capabilities of the second device (or the device in which the second device is located) to the first device through the fifth capability parameter, enabling the first device to determine the operation of the second device in multimodal perception based on the fifth capability parameter.
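One way to picture the capability parameter is as a small record of flags that the first device inspects before deciding what to request. The sketch below is purely illustrative: the class `CapabilityParameter`, its field names, and the helper `supported_requests` are hypothetical, the sensing and storage capability fields are omitted for brevity, and the source does not define any encoding for this parameter.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CapabilityParameter:
    """Hypothetical container for capabilities reported by the second device."""

    information_protection_level: int = 0            # exposure level the device supports
    supports_data_sharing: bool = False              # communication capability
    supports_computation: bool = False               # computing capability
    supports_param_and_image_transfer: bool = False  # internal/external parameters and 2D images
    supports_neighborhood_extraction: bool = False   # pixel values of projected regions
    supports_neighborhood_features: bool = False     # image features of projected regions
    supports_matching_degree: bool = False           # neighborhood matching degrees


def supported_requests(cap: CapabilityParameter) -> List[str]:
    """List the kinds of request the second device says it can serve."""
    options = []
    if cap.supports_matching_degree:
        options.append("matching_degree")
    if cap.supports_neighborhood_features:
        options.append("neighborhood_features")
    if cap.supports_neighborhood_extraction:
        options.append("neighborhood_pixels")
    if cap.supports_param_and_image_transfer:
        options.append("parameters_and_images")
    return options


# Example: a device that can extract neighborhoods and compute matching degrees.
cap = CapabilityParameter(supports_neighborhood_extraction=True, supports_matching_degree=True)
available = supported_requests(cap)
```

The order in which the options are listed is arbitrary here; the source does not state any preference among them.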

[0015] Optionally, the first information includes first neighborhood information i and second neighborhood information i, where first neighborhood information i includes the pixel value corresponding to the pixel in first neighborhood i, and second neighborhood information i includes the pixel value corresponding to the pixel in second neighborhood i; or, the first information includes first neighborhood feature i and second neighborhood feature i, where first neighborhood feature i indicates the image features of first neighborhood i, and second neighborhood feature i indicates the image features of second neighborhood i, and the image features are related to the pixel. The first information can also be other information, and can be flexibly set according to the actual situation without limitation.

[0016] Furthermore, the method in the first aspect further includes: receiving a fifth capability parameter from the second device, the fifth capability parameter being used to indicate at least one of the following information of the second device: information protection level information, sensing capability information, communication capability information, computing capability information, storage capability information, internal and external parameter transmission information, neighborhood extraction acquisition information, neighborhood feature acquisition information, or neighborhood matching degree acquisition information.

[0017] Furthermore, the method in the first aspect further includes: receiving a first capability parameter from the second device, the first capability parameter indicating that the second device supports providing the first device with the pixel values corresponding to pixels in a first region, the first region being the region where point cloud data is projected onto a two-dimensional image. In this case, sending the first message to the second device includes: sending the first message to the second device according to the first capability parameter. That is, the first device can determine, based on the first capability parameter, that the second device (or the device in which the second device is located) is able to provide the first device with the pixel values corresponding to pixels in the first region. This prevents the first device from requesting information about the region where the N point cloud data are projected onto the two-dimensional image from a device that cannot provide it, and thus avoids unnecessary communication overhead.

[0018] Furthermore, the first capability parameter is also used to indicate at least one of the following information of the second device: information protection level information, sensing capability information, communication capability information, computing capability information, storage capability information, internal and external parameter transmission information, neighborhood feature acquisition information, or neighborhood matching degree acquisition information. In this way, the first device can determine the capabilities of the second device (or the equipment in which the second device is located) based on the first capability parameter, and thereby determine the operation of the second device in multimodal sensing based on these capabilities.

[0019] Furthermore, when the first information includes first neighborhood information i and second neighborhood information i, determining the neighborhood matching degree of the i-th point cloud data based on the first information includes: determining the neighborhood matching degree of the i-th point cloud data based on the first neighborhood information i, the second neighborhood information i, and the matching degree calculation method, wherein the matching degree calculation method is any one of the following: normalized squared difference, cumulative density function, or neighborhood feature calculation. In this way, the first device can accurately determine the neighborhood matching degree of the i-th point cloud data. It is understood that the matching degree calculation method can also be other methods, without limitation.
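As an illustration of the first listed option, the sketch below computes a normalized squared difference between the pixel values of two neighborhoods. The source does not spell out the formula, so this follows the common template-matching definition (sum of squared differences divided by the geometric mean of the two energy sums), which is an assumption. With this definition a value of 0 means the neighborhoods are identical, and larger values mean they differ more. The function name is hypothetical.

```python
import math
from typing import Sequence


def normalized_squared_difference(a: Sequence[float], b: Sequence[float]) -> float:
    """Normalized squared difference between two equally sized pixel lists.

    Returns 0.0 for identical inputs; larger values mean less similar.
    """
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("neighborhoods must be non-empty and of equal size")
    numerator = sum((x - y) ** 2 for x, y in zip(a, b))
    denominator = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    if denominator == 0.0:
        # At least one neighborhood is all zeros: identical only if both are.
        return 0.0 if numerator == 0.0 else float("inf")
    return numerator / denominator


# Example: flattened pixel values of a first and a second neighborhood.
first_neighborhood = [10.0, 10.0]
second_neighborhood = [0.0, 10.0]
score = normalized_squared_difference(first_neighborhood, second_neighborhood)
```

Because a low score means high similarity here, a device using this measure would need to map it to a matching degree (or invert its comparison) before deciding which points are noise.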

[0020] Furthermore, when the first information includes a first neighborhood feature i and a second neighborhood feature i, the method in the first aspect further includes: receiving a second capability parameter from the second device, the second capability parameter indicating that the second device supports providing neighborhood features to the first device, the neighborhood features indicating image features of the region where the point cloud data is projected onto a two-dimensional image. In this case, sending the first message to the second device includes: sending the first message to the second device according to the second capability parameter. That is, the first device can determine, based on the second capability parameter, that the second device (or the device in which the second device is located) is able to provide neighborhood features to the first device. This prevents the first device from requesting the image features of the region where the N point cloud data are projected onto the two-dimensional image from a device that cannot provide them, and thus avoids unnecessary communication overhead.

[0021] Furthermore, the second capability parameter is also used to indicate at least one of the following information of the second device: information protection level information, sensing capability information, communication capability information, computing capability information, storage capability information, internal and external parameter transmission information, neighborhood extraction acquisition information, or neighborhood matching degree acquisition information. In this way, the first device can determine the capabilities possessed by the second device (or the equipment in which the second device is located) based on the second capability parameter, and thereby determine the operation of the second device in multimodal sensing based on these capabilities.

[0022] In one possible design, obtaining the neighborhood matching degree of each of the N point cloud data includes: sending a second message to a second device, the second message requesting the neighborhood matching degrees corresponding to the N point cloud data, where the neighborhood matching degree of the i-th point cloud data indicates the image similarity of the regions where the i-th point cloud data is projected onto multiple two-dimensional images, the multiple two-dimensional images being obtained by perceiving the target object; and receiving the neighborhood matching degree of each of the N point cloud data from the second device. That is, the first device can request the neighborhood matching degrees corresponding to the N point cloud data from the second device. This avoids the first device having to calculate the neighborhood matching degrees itself, thereby reducing the computational overhead of the first device.

[0023] Optionally, the second message includes neighborhood description information, which indicates the size and / or shape of the region where the N point cloud data are projected onto the two-dimensional image.

[0024] Optionally, before sending the second message to the second device, the method in the first aspect further includes: receiving a third capability parameter from the second device, the third capability parameter indicating that the second device supports providing the first device with neighborhood matching degrees corresponding to 3D point cloud data, where the neighborhood matching degree corresponding to 3D point cloud data indicates the image similarity of the regions where the 3D point cloud data is projected onto multiple 2D images. In this case, sending the second message to the second device includes: sending the second message to the second device according to the third capability parameter. That is, the first device can determine, based on the third capability parameter, that the second device (or the device in which the second device is located) is able to provide the first device with neighborhood matching degrees corresponding to 3D point cloud data. This prevents the first device from requesting the neighborhood matching degrees corresponding to the N point cloud data from a device that cannot provide them, and thus avoids unnecessary communication overhead.

[0025] Furthermore, the third capability parameter is also used to indicate at least one of the following information of the second device: information protection level information, sensing capability information, communication capability information, computing capability information, storage capability information, internal and external parameter transmission information, neighborhood extraction acquisition information, or neighborhood feature acquisition information. In this way, the first device can determine the capabilities of the second device (or the equipment in which the second device is located) based on the third capability parameter, and thereby determine the operation of the second device in multimodal sensing based on these capabilities.

[0026] Optionally, the method in the first aspect further includes: receiving a fifth capability parameter from the second device, the fifth capability parameter being used to indicate at least one of the following information of the second device: information protection level information, sensing capability information, communication capability information, computing capability information, storage capability information, internal and external parameter transmission information, neighborhood extraction acquisition information, neighborhood feature acquisition information, or neighborhood matching degree acquisition information.

[0027] In one possible design, obtaining the neighborhood matching degree of each of the N point cloud data includes: acquiring parameter information, a first image, and a second image, wherein the parameter information is used to project the N point cloud data onto the first and second images; and determining the neighborhood matching degree of each of the N point cloud data based on the parameter information, the first image, and the second image. This enables the first device to accurately obtain the neighborhood matching degree of each of the N point cloud data.

[0028] Optionally, acquiring parameter information, the first image, and the second image includes: receiving the parameter information, the first image, and the second image from the second device, where the parameter information indicates the relevant parameters used by the second device for two-dimensional perception of the target object. It can be understood that the second device can send these parameters, together with the two-dimensional images obtained from perceiving the target object, to the first device periodically, or upon receiving a request from the first device. The specific configuration can be set flexibly according to actual circumstances and is not limited.

[0029] Furthermore, acquiring parameter information, the first image, and the second image includes: receiving first parameter information and the first image from the second device, where the first parameter information indicates the relevant parameters used by the second device for two-dimensional perception of the target object and is part of the parameter information; and receiving second parameter information and the second image from a third device, where the second parameter information indicates the relevant parameters used by the third device for two-dimensional perception of the target object and is part of the parameter information. That is, the first device can acquire the corresponding information from the second device and the third device respectively. It can be understood that the second and third devices can send these parameters, together with the two-dimensional images obtained from perceiving the target object, to the first device periodically, or upon receiving a request from the first device. The specific configuration can be set flexibly according to actual circumstances and is not limited.

[0030] Furthermore, the method described in the first aspect further includes: sending a third message to the second device, the third message being used to request relevant parameters and at least one image of the target object for two-dimensional perception by the second device. This enables the first device to acquire parameter information, the first image, and the second image in a timely manner.

[0031] Furthermore, before receiving the parameter information, the first image, and the second image from the second device, the method in the first aspect further includes: receiving a fourth capability parameter from the second device, the fourth capability parameter indicating that the second device supports providing the first device with the relevant parameters and images for two-dimensional perception of the target object. In this case, sending the third message to the second device includes: sending the third message to the second device based on the fourth capability parameter. That is, the first device can determine, based on the fourth capability parameter, that the second device (or the device in which the second device is located) is able to provide the relevant parameters and images for two-dimensional perception of the target object, and can then request them from the second device. This prevents the first device from requesting these parameters and images from a device that cannot provide them, and thus avoids unnecessary communication overhead.

[0032] Furthermore, the fourth capability parameter is also used to indicate at least one of the following information of the second device: information protection level information, sensing capability information, communication capability information, computing capability information, storage capability information, neighborhood extraction acquisition information, neighborhood feature acquisition information, or neighborhood matching degree acquisition information. In this way, the first device can determine the capabilities possessed by the second device (or the device in which the second device is located) based on the fourth capability parameter, and thereby determine the operation of the second device in multimodal sensing based on these capabilities.

[0033] Furthermore, the relevant parameters include at least one of the following information of the second device: focal length, resolution, field of view, position, or pointing. The specific settings can be made according to the actual situation and are not limited.
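Parameters such as resolution and field of view are the kind of inputs a projection model needs. As a rough sketch, assuming an ideal pinhole camera with square pixels and a principal point at the image centre (none of which is stated in the source), intrinsic parameters can be derived from the resolution and the horizontal field of view. The names `Intrinsics` and `intrinsics_from_fov` are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Intrinsics:
    """Pinhole intrinsics in pixel units (hypothetical helper type)."""

    fx: float
    fy: float
    cx: float
    cy: float


def intrinsics_from_fov(width_px: int, height_px: int, horizontal_fov_deg: float) -> Intrinsics:
    """Derive pinhole intrinsics from resolution and horizontal field of view.

    Assumptions: square pixels (fy == fx) and principal point at the image centre.
    """
    if not 0.0 < horizontal_fov_deg < 180.0:
        raise ValueError("horizontal field of view must be between 0 and 180 degrees")
    half_fov = math.radians(horizontal_fov_deg) / 2.0
    fx = (width_px / 2.0) / math.tan(half_fov)
    return Intrinsics(fx=fx, fy=fx, cx=width_px / 2.0, cy=height_px / 2.0)


# Example: a 640 x 480 sensor with a 90 degree horizontal field of view.
camera = intrinsics_from_fov(640, 480, 90.0)
```

Position and pointing would correspond to the external parameters (a translation and a rotation), which are not covered by this sketch.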

[0034] Optionally, based on parameter information, the first image, and the second image, the neighborhood matching degree of each of the N point cloud data is obtained, including: projecting the i-th point cloud data onto the first image and the second image according to the parameter information; determining the first neighborhood i and the second neighborhood i based on the first image and the second image mapped with the i-th point cloud data; determining the first neighborhood information i and the second neighborhood information i based on the first neighborhood i and the second neighborhood i, where the first neighborhood information i includes the pixel value corresponding to the pixel in the first neighborhood i and the second neighborhood information i includes the pixel value corresponding to the pixel in the second neighborhood i; and determining the neighborhood matching degree of the i-th point cloud data based on the first neighborhood information i and the second neighborhood information i. In this way, the first device can accurately obtain the neighborhood matching degree of the i-th point cloud data.
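The steps above (project, extract the two neighborhoods, compare their pixel values) can be sketched end to end. This is a minimal illustration under several assumptions not taken from the source: a pinhole projection with a rotation matrix and translation vector as external parameters, a square neighborhood of radius r, grayscale images stored as nested lists, and a matching degree defined as 1 / (1 + normalized squared difference). All function names are hypothetical.

```python
import math
from typing import List, Optional, Sequence, Tuple

Image = List[List[float]]  # row-major grayscale image, indexed as image[v][u]
Camera = Tuple[Sequence[Sequence[float]], Sequence[float], float, float, float, float]


def project(point: Sequence[float], camera: Camera) -> Optional[Tuple[int, int]]:
    """Project a 3D point to integer pixel coordinates (u, v), or None if behind the camera."""
    rotation, translation, fx, fy, cx, cy = camera
    # Camera coordinates: Xc = R * X + t
    xc, yc, zc = (
        sum(rotation[row][col] * point[col] for col in range(3)) + translation[row]
        for row in range(3)
    )
    if zc <= 0:
        return None
    return round(fx * xc / zc + cx), round(fy * yc / zc + cy)


def neighborhood_values(image: Image, u: int, v: int, r: int) -> Optional[List[float]]:
    """Pixel values of the (2r+1) x (2r+1) square centred on (u, v); None if it leaves the image."""
    height, width = len(image), len(image[0])
    if u - r < 0 or v - r < 0 or u + r >= width or v + r >= height:
        return None
    return [image[v + y][u + x] for y in range(-r, r + 1) for x in range(-r, r + 1)]


def matching_degree(a: Sequence[float], b: Sequence[float]) -> float:
    """Map the normalized squared difference to (0, 1]; 1.0 means identical (assumed mapping)."""
    numerator = sum((x - y) ** 2 for x, y in zip(a, b))
    denominator = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    if denominator == 0.0:
        return 1.0 if numerator == 0.0 else 0.0
    return 1.0 / (1.0 + numerator / denominator)


def neighborhood_matching_degree(
    point: Sequence[float], cam1: Camera, cam2: Camera, image1: Image, image2: Image, r: int = 1
) -> Optional[float]:
    """Matching degree of one point across two images, or None if it cannot be evaluated."""
    uv1, uv2 = project(point, cam1), project(point, cam2)
    if uv1 is None or uv2 is None:
        return None
    n1 = neighborhood_values(image1, uv1[0], uv1[1], r)
    n2 = neighborhood_values(image2, uv2[0], uv2[1], r)
    if n1 is None or n2 is None:
        return None
    return matching_degree(n1, n2)
```

Points that project outside either image, or behind either camera, return None in this sketch; how such points should be treated is a design choice the source leaves open.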

[0035] Optionally, the method in the first aspect further includes: receiving a fifth capability parameter from the second device, the fifth capability parameter being used to indicate at least one of the following information of the second device: information protection level information, sensing capability information, communication capability information, computing capability information, storage capability information, internal and external parameter transmission information, neighborhood extraction acquisition information, neighborhood feature acquisition information, or neighborhood matching degree acquisition information.

[0036] In one possible design, the expression for the first neighborhood i is: $N_r(u,v) = \{(u+x,\ v+y) \;|\; -r \le x \le r,\ -k \le y \le k\}$, where (u, v) are the coordinates of the i-th point cloud data mapped to the first image, (2r+1) is greater than or equal to 1 and less than the horizontal resolution of the first image, and (2k+1) is greater than or equal to 1 and less than the vertical resolution of the first image; or, the expression for the first neighborhood i is: $N_r(u,v) = \{(u+x,\ v+y) \;|\; x^2 + y^2 \le r^2\}$, where (u, v) are the coordinates of the i-th point cloud data mapped to the first image, and r is less than a first value, which is the minimum of the horizontal and vertical resolutions of the first image. It can be understood that | is a separator: the (u+x, v+y) before | gives the pixel coordinates in the first neighborhood i, and the conditions after | give the range of x and y.
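The two neighborhood shapes described above, a rectangle with half-widths r and k and a disc of radius r, can be enumerated directly. The sketch below assumes integer pixel offsets x and y and does not clip against the image borders; the function names are hypothetical.

```python
from typing import List, Tuple

Pixel = Tuple[int, int]


def rectangular_neighborhood(u: int, v: int, r: int, k: int) -> List[Pixel]:
    """Pixels (u+x, v+y) with -r <= x <= r and -k <= y <= k."""
    return [(u + x, v + y) for x in range(-r, r + 1) for y in range(-k, k + 1)]


def circular_neighborhood(u: int, v: int, r: int) -> List[Pixel]:
    """Pixels (u+x, v+y) with x^2 + y^2 <= r^2."""
    return [
        (u + x, v + y)
        for x in range(-r, r + 1)
        for y in range(-r, r + 1)
        if x * x + y * y <= r * r
    ]


# Example: neighborhoods around the projected pixel (10, 10).
rect = rectangular_neighborhood(10, 10, r=1, k=2)  # (2r+1) * (2k+1) pixels
disc = circular_neighborhood(10, 10, r=2)
```

The rectangle contains (2r+1)(2k+1) pixels, so with r = k = 0 either shape reduces to the projected pixel itself.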

[0037] In a second aspect, a communication method is provided. The method is applied to a second device: it can be executed by the second device itself, by a component of the second device (such as a processor, chip, chip system, or circuit), or by a logic module or software capable of implementing all or part of the functions of the second device. The following description takes execution by the second device as an example. The method includes: receiving a first message from a first device, the first message requesting information about pixels in the region where N point cloud data are projected onto a two-dimensional image, where the N point cloud data are three-dimensional point cloud data obtained by perceiving a target object, N is an integer greater than 1, and the two-dimensional image is obtained by perceiving the target object; and, in response to the first message, sending first information to the first device, the first information indicating information about pixels in a first neighborhood i and a second neighborhood i, where the first neighborhood i is the region where the i-th point cloud data is projected onto a first image, the second neighborhood i is the region where the i-th point cloud data is projected onto a second image, the first image and the second image are obtained by two-dimensional perception of the target object, the first image and the second image are different, and i is an integer traversing from 1 to N.

[0038] In one possible design, the first message includes neighborhood description information, which indicates the size and / or shape of the region where the N point cloud data are projected onto the two-dimensional image.

[0039] In one possible design, the first information includes first neighborhood information i and second neighborhood information i, where the first neighborhood information i includes the pixel value corresponding to the pixel in the first neighborhood i, and the second neighborhood information i includes the pixel value corresponding to the pixel in the second neighborhood i; or, the first information includes first neighborhood feature i and second neighborhood feature i, where the first neighborhood feature i is used to indicate the image features of the first neighborhood i, and the second neighborhood feature i is used to indicate the image features of the second neighborhood i, and the image features are related to the pixel.

[0040] Optionally, if the first information includes the first neighborhood information i and the second neighborhood information i, the method in the second aspect further includes: sending a first capability parameter to the first device, the first capability parameter being used to indicate that the second device supports providing the first device with the pixel values corresponding to the pixels in a first region, the first region being the region where point cloud data is projected onto a two-dimensional image.

[0041] Furthermore, the first capability parameter is also used to indicate at least one of the following information of the second device: information protection level information, sensing capability information, communication capability information, computing capability information, storage capability information, internal and external parameter transmission information, neighborhood feature acquisition information, or neighborhood matching degree acquisition information.

[0042] Optionally, if the first information includes a first neighborhood feature i and a second neighborhood feature i, the method in the second aspect further includes: sending a second capability parameter to the first device, the second capability parameter being used to indicate that the second device supports providing neighborhood features to the first device, the neighborhood features being used to indicate the image features of the region where the point cloud data is projected onto the two-dimensional image.

[0043] Furthermore, the second capability parameter is also used to indicate at least one of the following information of the second device: information protection level information, sensing capability information, communication capability information, computing capability information, storage capability information, internal and external parameter transmission information, neighborhood extraction acquisition information, or neighborhood matching degree acquisition information.

[0044] In one possible design, the method described in the second aspect further includes: sending a fifth capability parameter to the first device, the fifth capability parameter being used to indicate at least one of the following information of the second device: information protection level information, sensing capability information, communication capability information, computing capability information, storage capability information, internal and external parameter transmission information, neighborhood extraction acquisition information, neighborhood feature acquisition information, or neighborhood matching degree acquisition information.

[0045] In addition, for the technical effects of the method described in the second aspect, refer to the technical effects of the method described in the first aspect; they are not repeated here.

[0046] In a third aspect, a communication method is provided, which is applied to a second device. The method can be executed by the second device, by a component of the second device such as a processor, chip, chip system, or circuit, or by a logic module or software capable of implementing all or part of the functions of the second device. The following description uses execution of the method by the second device as an example. The method includes: receiving a second message from a first device, the second message being used to request the neighborhood matching degrees corresponding to N point cloud data, where the N point cloud data are three-dimensional point cloud data obtained by perceiving a target object, N is an integer greater than 1, the neighborhood matching degrees corresponding to the N point cloud data indicate the image similarity of the regions where the i-th point cloud data is projected onto multiple two-dimensional images, the multiple two-dimensional images are obtained by perceiving the target object, and i is an integer from 1 to N; and, in response to the second message, sending the neighborhood matching degree of each of the N point cloud data to the first device, where the neighborhood matching degree of the i-th point cloud data is determined based on the image similarity between a first neighborhood i and a second neighborhood i, the first neighborhood i is the region where the i-th point cloud data is projected onto a first image, the second neighborhood i is the region where the i-th point cloud data is projected onto a second image, the first image and the second image are obtained by two-dimensional perception of the target object, and the first image and the second image are different.

[0047] In one possible design, before receiving the second message from the first device, the method of the third aspect further includes: sending a third capability parameter to the first device, the third capability parameter being used to indicate that the second device supports providing the neighborhood matching degree corresponding to the point cloud data to the first device, the neighborhood matching degree corresponding to the point cloud data being used to indicate the image similarity of the region where the point cloud data is projected onto multiple two-dimensional images.

[0048] Optionally, the third capability parameter is also used to indicate at least one of the following information of the second device: information protection level information, sensing capability information, communication capability information, computing capability information, storage capability information, internal and external parameter transmission information, neighborhood extraction acquisition information, or neighborhood feature acquisition information.

[0049] In one possible design, before receiving the second message from the first device, the method of the third aspect further includes: sending a fifth capability parameter to the first device, the fifth capability parameter being used to indicate at least one of the following information of the second device: information protection level information, sensing capability information, communication capability information, computing capability information, storage capability information, internal and external parameter transmission information, neighborhood extraction acquisition information, neighborhood feature acquisition information, or neighborhood matching degree acquisition information.

[0050] In one possible design, the second message includes neighborhood description information, which indicates the size and / or shape of the region where the N point cloud data are projected onto the two-dimensional image.

[0051] Optionally, the method in the third aspect further includes: projecting the i-th point cloud data onto a first image and a second image; determining a first neighborhood i and a second neighborhood i based on the first image and the second image mapped with the i-th point cloud data; determining first neighborhood information i and second neighborhood information i based on the first neighborhood i and the second neighborhood i, wherein the first neighborhood information i includes the pixel value corresponding to the pixel in the first neighborhood i, and the second neighborhood information i includes the pixel value corresponding to the pixel in the second neighborhood i; and determining the neighborhood matching degree of the i-th point cloud data based on the first neighborhood information i and the second neighborhood information i.

[0052] In addition, for the technical effects of the method described in the third aspect, refer to the technical effects of the method described in the first aspect; they are not repeated here.

[0053] In a fourth aspect, a communication method is provided, which is applied to a second device. The method can be executed by the second device, by a component of the second device such as a processor, chip, chip system, or circuit, or by a logic module or software capable of implementing all or part of the functions of the second device. The following description uses execution of the method by the second device as an example. The method includes: acquiring parameter information, a first image, and a second image, where the parameter information indicates the relevant parameters used by the second device to perform two-dimensional perception of a target object, the first image and the second image are obtained by the second device through two-dimensional perception of the target object, and the first image and the second image are different; and sending the parameter information, the first image, and the second image to a first device.

[0054] In one possible design, the method in the fourth aspect further includes: receiving a third message from the first device, the third message being used to request the relevant parameters used by the second device for two-dimensional perception of the target object and at least one image. In this case, sending the parameter information, the first image, and the second image to the first device includes: in response to the third message, sending the parameter information, the first image, and the second image to the first device.

[0055] In one possible design, before sending the parameter information, the first image, and the second image to the first device, the method in the fourth aspect further includes: sending a fourth capability parameter to the first device, the fourth capability parameter being used to indicate that the second device supports providing the first device with relevant parameters and images for two-dimensional perception of objects.

[0056] Optionally, the fourth capability parameter is also used to indicate at least one of the following information of the second device: information protection level information, sensing capability information, communication capability information, computing capability information, storage capability information, neighborhood extraction acquisition information, neighborhood feature acquisition information, or neighborhood matching degree acquisition information.

[0057] In one possible design, the relevant parameters include at least one of the following information about the second device: focal length, resolution, field of view, position, or pointing.

[0058] In one possible design, before receiving the third message from the first device, the method in the fourth aspect further includes: sending a fifth capability parameter to the first device, the fifth capability parameter being used to indicate at least one of the following information of the second device: information protection level information, sensing capability information, communication capability information, computing capability information, storage capability information, internal and external parameter transmission information, neighborhood extraction acquisition information, neighborhood feature acquisition information, or neighborhood matching degree acquisition information.

[0059] In addition, for the technical effects of the method described in the fourth aspect, refer to the technical effects of the method described in the first aspect; they are not repeated here.

[0060] In a fifth aspect, a communication method is provided, which is applied to a first device. The method can be executed by the first device, by a component of the first device such as a processor, chip, chip system, or circuit, or by a logic module or software capable of implementing all or part of the functions of the first device. The following description uses execution of the method by the first device as an example. The method includes: receiving capability parameters from a second device, and sending a first message to the second device according to the capability parameters, where the capability parameters indicate first information that the second device can report to the first device, the first information indicates information related to the second device's perception of an object or information obtained by processing such information, and the first message requests information related to the second device's perception of a target object.

[0061] Based on the method in the fifth aspect, after receiving the capability parameters from the second device, the first device can request from the second device information related to the second device's perception of the target object. In other words, the first device can determine, from the capabilities of the second device (or of the device containing the second device), what information the second device can provide, and then request that information. This prevents the first device from requesting information that the second device cannot provide, which avoids failed requests and the additional communication overhead they would cause.

[0062] In one possible design, the capability parameters include at least one of the following information of the second device: information protection level information, sensing capability information, communication capability information, computing capability information, storage capability information, internal and external parameter transmission information, neighborhood acquisition information, or neighborhood matching degree acquisition information.

[0063] Here, the information protection level information indicates the information exposure level supported by the second device; the sensing capability information indicates the perception-related parameters of the second device; the communication capability information indicates whether the second device has data sharing capabilities and how strong those capabilities are; the computing capability information indicates whether the second device has data analysis and computing capabilities and how strong those capabilities are; the internal and external parameter transmission information indicates whether the second device supports transmitting internal and external parameters (i.e., internal parameters and external parameters) and two-dimensional images; the neighborhood acquisition information indicates whether the second device supports providing the first device with relevant information about pixels in a first region, which is the region where point cloud data is projected onto a two-dimensional image; and the neighborhood matching degree acquisition information indicates whether the second device supports providing neighborhood matching degrees to other devices (such as the first device). Through the capability parameters, the second device indicates to the first device the capabilities that the second device (or the device on which the second device is located) possesses, enabling the first device to determine, based on these capability parameters, the operations that the second device performs in multimodal perception.
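The way a first device might map received capability parameters to a request can be sketched as follows. This is only an illustration: the class and field names are hypothetical, only three of the listed capability items are modeled, and the preference order (request the most processed result the second device can provide) is an assumption that the application does not state.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Request(Enum):
    """Hypothetical request types the first message could carry."""
    NEIGHBORHOOD_MATCHING_DEGREE = auto()
    NEIGHBORHOOD_PIXELS = auto()
    PARAMETERS_AND_IMAGES = auto()
    NONE = auto()


@dataclass(frozen=True)
class CapabilityParameters:
    """Hypothetical subset of the capability parameters, as booleans."""
    neighborhood_matching_degree_acquisition: bool = False
    neighborhood_acquisition: bool = False
    internal_external_parameter_transmission: bool = False


def choose_request(cap: CapabilityParameters) -> Request:
    """Pick a request that the second device has declared it can serve.

    Assumed preference: offload as much computation as possible to the
    second device, falling back to raw parameters and images.
    """
    if cap.neighborhood_matching_degree_acquisition:
        return Request.NEIGHBORHOOD_MATCHING_DEGREE
    if cap.neighborhood_acquisition:
        return Request.NEIGHBORHOOD_PIXELS
    if cap.internal_external_parameter_transmission:
        return Request.PARAMETERS_AND_IMAGES
    return Request.NONE
```

Because the request is chosen only from declared capabilities, the first device never sends a request that the second device cannot answer, which is the overhead saving described for the fifth aspect.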

[0064] Optionally, the capability parameters include neighborhood acquisition information, which indicates that the second device supports providing the first device with relevant information about pixels in a first region, where the first region is the region where point cloud data is projected onto a two-dimensional image. In this case, that the first message requests information related to the second device's perception of the target object includes: the first message requests relevant information about pixels in the region where N point cloud data are projected onto a two-dimensional image, where the N point cloud data are three-dimensional point cloud data obtained by perceiving the target object, N is an integer greater than 1, and the two-dimensional image is obtained by perceiving the target object. That is, when the second device (or the device containing the second device) is capable of providing the first device with relevant information about pixels in the first region, the first device can request from the second device relevant information about pixels in the region where the N point cloud data are projected onto the two-dimensional image. This ensures that the first device can obtain that information from the second device and avoids a failed request.

[0065] Furthermore, the first message includes neighborhood description information, which indicates the size and / or shape of the region where the N point cloud data are projected onto the two-dimensional image. Thus, after projecting the i-th point cloud data from the N point cloud data onto the first and second images, the second device can determine the first neighborhood i and the second neighborhood i based on the neighborhood description information, and then obtain relevant information about the pixels in the first and second neighborhoods i, where i is an integer from 1 to N. It is understood that the first and second devices can also pre-agree or pre-define the size and / or shape of the region where the point cloud data is projected onto the two-dimensional image, without restriction.

[0066] Furthermore, the method in the fifth aspect further includes: receiving second information from the second device, the second information being used to indicate relevant information about pixels in the first neighborhood i and the second neighborhood i, the first neighborhood i being the region where the i-th point cloud data in N point cloud data is projected onto the first image, the second neighborhood i being the region where the i-th point cloud data is projected onto the second image, the first image and the second image being images obtained by two-dimensional perception of the target object, the first image being different from the second image, and i being an integer from 1 to N; obtaining the neighborhood matching degree of the i-th point cloud data according to the second information; and determining the noise in the N point cloud data according to the neighborhood matching degree of the i-th point cloud data.

[0067] Furthermore, the second information includes first neighborhood information i and second neighborhood information i. First neighborhood information i includes the pixel value corresponding to the pixel in first neighborhood i, and second neighborhood information i includes the pixel value corresponding to the pixel in second neighborhood i. Alternatively, the second information includes first neighborhood feature i and second neighborhood feature i. First neighborhood feature i indicates the image features of first neighborhood i, and second neighborhood feature i indicates the image features of second neighborhood i. The image features are related to the pixel. The second information can also be other information, and can be flexibly set according to the actual situation without limitation.

[0068] Furthermore, when the second information includes the first neighborhood information i and the second neighborhood information i, the method in the fifth aspect further includes: obtaining the neighborhood matching degree of the i-th point cloud data based on the first neighborhood information i, the second neighborhood information i, and the matching degree calculation method, wherein the matching degree calculation method is any one of the following: normalized squared difference, cumulative density function, or neighborhood feature calculation. In this way, the first device can accurately determine the neighborhood matching degree of the i-th point cloud data. It is understood that the matching degree calculation method can also be other methods, without limitation.
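For illustration, the normalized squared difference between two neighborhoods can be sketched as below. The application names the method but gives no formula, so the definition used here (sum of squared pixel differences divided by the square root of the product of the two patches' energies, a common template-matching form) is an assumption, as are the function name and the handling of all-zero patches.

```python
import numpy as np


def normalized_squared_difference(patch_a, patch_b) -> float:
    """Assumed definition: sum((a - b)^2) / sqrt(sum(a^2) * sum(b^2)).

    Returns 0.0 for identical patches; larger values mean less similar.
    """
    a = np.asarray(patch_a, dtype=np.float64)
    b = np.asarray(patch_b, dtype=np.float64)
    if a.shape != b.shape:
        raise ValueError("patches must have the same shape")
    numerator = np.sum((a - b) ** 2)
    denominator = np.sqrt(np.sum(a * a) * np.sum(b * b))
    if denominator == 0.0:
        # At least one patch is all zeros, so the ratio is undefined.
        # Treat two all-zero patches as identical, otherwise as a mismatch.
        return 0.0 if numerator == 0.0 else float("inf")
    return float(numerator / denominator)
```

Under this convention a smaller value indicates higher image similarity between first neighborhood i and second neighborhood i. How the value is mapped to a neighborhood matching degree (for example, whether it is inverted so that larger means more similar) is not specified in the text and would need to be agreed between the devices.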

[0069] Furthermore, when the second information includes the first neighborhood information i and the second neighborhood information i, that the neighborhood acquisition information indicates that the second device supports providing the first device with relevant information about pixels in the first region includes: neighborhood extraction acquisition information indicates that the second device supports providing the first device with the pixel values corresponding to the pixels in the first region. In this case, the capability parameters may also include neighborhood feature acquisition information, which indicates whether the second device supports extracting neighborhood features and transmitting the extracted neighborhood features. The specific settings can be configured according to the actual situation and are not limited.

[0070] Furthermore, when the second information includes the first neighborhood feature i and the second neighborhood feature i, that the neighborhood acquisition information indicates that the second device supports providing the first device with relevant information about pixels in the first region includes: neighborhood feature acquisition information indicates that the second device supports providing the first device with the image features of the first region. In this case, the capability parameters may also include neighborhood extraction acquisition information, which indicates whether the second device supports extracting neighborhoods and transmitting the extracted neighborhood information. The specific settings can be configured according to the actual situation and are not limited.

[0071] Optionally, the capability parameters include neighborhood matching degree acquisition information, which indicates that the second device supports providing the first device with the neighborhood matching degree corresponding to point cloud data, where the neighborhood matching degree corresponding to point cloud data indicates the image similarity of the regions where the point cloud data is projected onto multiple two-dimensional images. In this case, that the first message requests information related to the second device's perception of the target object includes: the first message requests the neighborhood matching degrees corresponding to N point cloud data, where the N point cloud data are three-dimensional point cloud data obtained by perceiving the target object, N is an integer greater than 1, the neighborhood matching degrees corresponding to the N point cloud data indicate the image similarity of the regions where the i-th point cloud data among the N point cloud data is projected onto multiple two-dimensional images, i is an integer from 1 to N, and the multiple two-dimensional images are obtained by perceiving the target object. That is, when the second device (or the device in which the second device is located) is capable of providing the first device with the neighborhood matching degree corresponding to point cloud data, the first device can request from the second device the neighborhood matching degrees corresponding to the N point cloud data. This ensures that the first device can obtain those neighborhood matching degrees from the second device and avoids a failed request.

[0072] Furthermore, the first message includes neighborhood description information, which indicates the size and / or shape of the region where the N point cloud data are projected onto the two-dimensional image.

[0073] Furthermore, the method in the fifth aspect also includes: receiving the neighborhood matching degree of each of the N point cloud data from the second device, wherein the neighborhood matching degree of the i-th point cloud data is determined based on the image similarity between the first neighborhood i and the second neighborhood i, where the first neighborhood i is the region where the i-th point cloud data is projected onto the first image, and the second neighborhood i is the region where the i-th point cloud data is projected onto the second image, and the first image and the second image are images obtained by two-dimensional perception of the target object, and the first image and the second image are different; and determining the noise in the N point cloud data based on the neighborhood matching degree of each of the N point cloud data. It can be understood that after determining the noise in the N point cloud data, the first device can filter or remove the noise, thereby using the real point cloud data (the point cloud data in the N point cloud data excluding the noise) for multimodal perception, thereby improving the multimodal perception performance.
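A minimal sketch of the noise determination step is shown below. It assumes that a larger neighborhood matching degree means higher image similarity and that a point is treated as noise when its matching degree falls below a threshold; the text states only that noise is determined based on the matching degrees, so the decision rule, the function name, and the threshold value are hypothetical.

```python
import numpy as np


def split_noise(points, matching_degrees, threshold):
    """Split N points into (real_points, noise_points).

    points: array of shape (N, 3); matching_degrees: array of shape (N,).
    Assumed rule: matching degree < threshold marks the point as noise.
    """
    pts = np.asarray(points, dtype=np.float64)
    degrees = np.asarray(matching_degrees, dtype=np.float64)
    if pts.shape[0] != degrees.shape[0]:
        raise ValueError("one matching degree is required per point")
    is_noise = degrees < threshold
    return pts[~is_noise], pts[is_noise]


# Example usage with made-up values: the second point has a low matching
# degree and is separated out as noise.
real, noise = split_noise(
    points=[[0.0, 0.0, 2.0], [1.0, 1.0, 5.0], [0.5, 0.2, 3.0]],
    matching_degrees=[0.92, 0.10, 0.75],
    threshold=0.5,
)
```

The returned `real` array corresponds to the real point cloud data that remains available for multimodal perception after the noise is filtered out.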

[0074] Optionally, the capability parameters include internal and external parameter transmission information, which indicates that the second device supports providing the first device with relevant parameters and images for object perception. In this case, that the first message requests information related to the second device's perception of the target object includes: the first message requests the relevant parameters used by the second device for perceiving the target object and at least one image. That is, when the second device (or the device containing the second device) is capable of providing the first device with the relevant parameters and images for (two-dimensional) object perception, the first device can request them from the second device. This ensures that the first device can obtain the relevant parameters and images from the second device and avoids a failed request.

[0075] Furthermore, the relevant parameters include at least one of the following information of the second device: focal length, resolution, field of view, position, or pointing. The specific settings can be made according to the actual situation and are not limited.

[0076] Furthermore, the method in the fifth aspect further includes: receiving parameter information, a first image, and a second image from the second device, wherein the parameter information is used to project N point cloud data onto the first image and the second image, the N point cloud data being three-dimensional point cloud data obtained by perceiving the target object, N being an integer greater than 1, the first image and the second image being obtained by two-dimensional perception of the target object, and the first image being different from the second image; obtaining the neighborhood matching degree of each of the N point cloud data according to the parameter information, the first image, and the second image, wherein the neighborhood matching degree of the i-th point cloud data is determined based on the image similarity between the first neighborhood i and the second neighborhood i, the first neighborhood i being the region where the i-th point cloud data is projected onto the first image, the second neighborhood i being the region where the i-th point cloud data is projected onto the second image, i being an integer from 1 to N; and determining the noise in the N point cloud data according to the neighborhood matching degree of each of the N point cloud data.

[0077] Furthermore, based on the parameter information, the first image, and the second image, the neighborhood matching degree of each of the N point cloud data is obtained, including: projecting the i-th point cloud data onto the first image and the second image according to the parameter information; determining the first neighborhood i and the second neighborhood i based on the first image and the second image mapped with the i-th point cloud data; determining the first neighborhood information i and the second neighborhood information i based on the first neighborhood i and the second neighborhood i, where the first neighborhood information i includes the pixel value corresponding to the pixel in the first neighborhood i and the second neighborhood information i includes the pixel value corresponding to the pixel in the second neighborhood i; and determining the neighborhood matching degree of the i-th point cloud data based on the first neighborhood information i and the second neighborhood information i. In this way, the first device can accurately obtain the neighborhood matching degree of the i-th point cloud data.
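The projection and neighborhood extraction steps can be sketched as follows, assuming a standard pinhole camera model in which the parameter information yields a 3x3 internal parameter matrix `K` and external parameters `R` (rotation) and `t` (translation). The application does not prescribe a camera model; the function names and the square neighborhood are also assumptions.

```python
import numpy as np


def project_point(point_xyz, K, R, t):
    """Project a 3D point to integer pixel coordinates (u, v).

    Assumed pinhole model: pixel ~ K @ (R @ X + t). Returns None if the
    point lies behind the camera.
    """
    cam = np.asarray(R, dtype=np.float64) @ np.asarray(point_xyz, dtype=np.float64)
    cam = cam + np.asarray(t, dtype=np.float64)
    if cam[2] <= 0.0:
        return None
    uvw = np.asarray(K, dtype=np.float64) @ cam
    return int(np.rint(uvw[0] / uvw[2])), int(np.rint(uvw[1] / uvw[2]))


def extract_neighborhood(image, uv, half_size):
    """Return the square patch of pixel values centered on (u, v).

    u indexes columns and v indexes rows. Returns None if the patch
    would extend past the image border.
    """
    u, v = uv
    img = np.asarray(image)
    rows, cols = img.shape[:2]
    if (v - half_size < 0 or u - half_size < 0
            or v + half_size >= rows or u + half_size >= cols):
        return None
    return img[v - half_size:v + half_size + 1, u - half_size:u + half_size + 1]
```

Running `project_point` once per image (with that image's parameters) and then `extract_neighborhood` on each result gives the pixel values of first neighborhood i and second neighborhood i, which can then be compared with a matching degree calculation method to obtain the neighborhood matching degree of the i-th point cloud data.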

[0078] In a sixth aspect, a communication method is provided, which is applied to a second device. The method can be executed by the second device, by a component of the second device such as a processor, chip, chip system, or circuit, or by a logic module or software capable of implementing all or part of the functions of the second device. The following description uses execution of the method by the second device as an example. The method includes: acquiring capability parameters and sending the capability parameters to a first device, where the capability parameters indicate first information that the second device can report to the first device, and the first information indicates information related to the second device's perception of an object or information obtained by processing such information.

[0079] In one possible design, the capability parameters include at least one of the following information of the second device: information protection level information, sensing capability information, communication capability information, computing capability information, storage capability information, internal and external parameter transmission information, neighborhood acquisition information, or neighborhood matching degree acquisition information.

[0080] Optionally, the capability parameters include neighborhood acquisition information, which is used to indicate that the second device supports providing the first device with relevant information about pixels in the first region, where the first region is the region where the point cloud data is projected onto the two-dimensional image.

[0081] Furthermore, the method in the sixth aspect further includes: receiving a first message from the first device, the first message being used to request information about pixels in the region where N point cloud data are projected onto a two-dimensional image, where the N point cloud data are three-dimensional point cloud data obtained by perceiving the target object, N is an integer greater than 1, and the two-dimensional image is obtained by perceiving the target object; and, in response to the first message, sending second information to the first device, the second information being used to indicate information about pixels in a first neighborhood i and a second neighborhood i, where the first neighborhood i is the region where the i-th point cloud data among the N point cloud data is projected onto the first image, the second neighborhood i is the region where the i-th point cloud data is projected onto the second image, the first image and the second image are images obtained by two-dimensional perception of the target object, the first image is different from the second image, and i is an integer ranging from 1 to N.

[0082] Furthermore, the first message includes neighborhood description information, which indicates the size and / or shape of the region where the N point cloud data are projected onto the two-dimensional image.

[0083] Furthermore, the second information includes first neighborhood information i and second neighborhood information i, where first neighborhood information i includes the pixel value corresponding to the pixel in first neighborhood i, and second neighborhood information i includes the pixel value corresponding to the pixel in second neighborhood i; or, the second information includes first neighborhood feature i and second neighborhood feature i, where first neighborhood feature i is used to indicate the image features of first neighborhood i, and second neighborhood feature i is used to indicate the image features of second neighborhood i, and the image features are related to pixel values.

[0084] Furthermore, when the second information includes the first neighborhood information i and the second neighborhood information i, the neighborhood acquisition information is used to instruct the second device to support providing the first device with relevant information of the pixels in the first region, including: the neighborhood extraction acquisition information is used to instruct the second device to support providing the first device with the pixel values ​​corresponding to the pixels in the first region.

[0085] Furthermore, when the second information includes the first neighborhood feature i and the second neighborhood feature i, the neighborhood acquisition information being used to indicate that the second device supports providing the first device with relevant information about the pixels in the first region includes: the neighborhood acquisition information is used to indicate that the second device supports providing the first device with the image features of the first region.

[0086] Optionally, the capability parameters include neighborhood matching degree acquisition information, which is used to indicate that the second device supports providing the neighborhood matching degree corresponding to the point cloud data to the first device, and the neighborhood matching degree corresponding to the point cloud data is used to indicate the image similarity of the region where the point cloud data is projected onto multiple two-dimensional images.

[0087] Furthermore, the method in the sixth aspect further includes: receiving a first message from the first device, the first message being used to request the neighborhood matching degree corresponding to N point cloud data, the N point cloud data being three-dimensional point cloud data obtained by perceiving the target object, N being an integer greater than 1, the neighborhood matching degree corresponding to the N point cloud data being used to indicate the image similarity of the region where the i-th point cloud data in the N point cloud data is projected onto multiple two-dimensional images, i being an integer from 1 to N, and the multiple two-dimensional images being obtained by perceiving the target object; in response to the first message, sending the neighborhood matching degree of each of the N point cloud data to the first device, the neighborhood matching degree of the i-th point cloud data in the N point cloud data being determined based on the image similarity between the first neighborhood i and the second neighborhood i, the first neighborhood i being the region where the i-th point cloud data is projected onto the first image, the second neighborhood i being the region where the i-th point cloud data is projected onto the second image, the first image and the second image being images obtained by two-dimensional perception of the target object, and the first image and the second image being different.

[0088] Furthermore, the first message includes neighborhood description information, which indicates the size or shape of the region where the N point cloud data are projected onto the two-dimensional image.

[0089] Furthermore, the method in the sixth aspect further includes: projecting the i-th point cloud data onto a first image and a second image; determining a first neighborhood i and a second neighborhood i based on the first image and the second image mapped with the i-th point cloud data; determining first neighborhood information i and second neighborhood information i based on the first neighborhood i and the second neighborhood i, wherein the first neighborhood information i includes the pixel value corresponding to the pixel in the first neighborhood i, and the second neighborhood information i includes the pixel value corresponding to the pixel in the second neighborhood i; and determining the neighborhood matching degree of the i-th point cloud data based on the first neighborhood information i and the second neighborhood information i.
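The steps in the preceding paragraph can be illustrated with a minimal sketch. It assumes that the i-th point has already been projected to integer pixel coordinates (u, v) in each image, that the neighborhood is a square window, and that the image similarity is measured by zero-mean normalized cross-correlation. The window shape, the similarity measure, and the function names `extract_neighborhood` and `matching_degree` are illustrative assumptions and are not specified by this application.

```python
import math

def extract_neighborhood(image, u, v, half_size):
    """Return the pixel values in a square window centered at column u, row v.

    Returns None when the window would extend beyond the image border.
    (Assumption: square window; the application allows other sizes/shapes.)
    """
    rows, cols = len(image), len(image[0])
    if (v - half_size < 0 or v + half_size >= rows
            or u - half_size < 0 or u + half_size >= cols):
        return None
    return [image[r][c]
            for r in range(v - half_size, v + half_size + 1)
            for c in range(u - half_size, u + half_size + 1)]

def matching_degree(patch_a, patch_b):
    """Zero-mean normalized cross-correlation of two equal-size patches, in [-1, 1].

    (Assumption: this is one possible similarity measure, chosen for illustration.)
    """
    n = len(patch_a)
    mean_a = sum(patch_a) / n
    mean_b = sum(patch_b) / n
    dev_a = [a - mean_a for a in patch_a]
    dev_b = [b - mean_b for b in patch_b]
    denom = math.sqrt(sum(x * x for x in dev_a) * sum(y * y for y in dev_b))
    if denom == 0.0:
        return 0.0  # a flat patch carries no texture to compare
    return sum(x * y for x, y in zip(dev_a, dev_b)) / denom

# Two toy 5x5 "images": the second is the first with a brightness offset.
image_1 = [[r * 5 + c for c in range(5)] for r in range(5)]
image_2 = [[p + 10 for p in row] for row in image_1]

patch_1 = extract_neighborhood(image_1, 2, 2, 1)  # first neighborhood i
patch_2 = extract_neighborhood(image_2, 2, 2, 1)  # second neighborhood i
degree = matching_degree(patch_1, patch_2)
```

Subtracting the mean makes the measure insensitive to a uniform brightness difference between the two images, which is why it is used in this sketch.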

[0090] Optionally, the capability parameters include intrinsic and extrinsic parameter transmission information, which is used to indicate that the second device supports providing the first device with relevant parameters and images for object perception.

[0091] Furthermore, the relevant parameters include at least one of the following information about the second device: focal length, resolution, field of view, position, or pointing.

[0092] Furthermore, the method in the sixth aspect further includes: receiving a first message from the first device, the first message being used to request, from the second device, the relevant parameters and at least one image for perceiving a target object; in response to the first message, sending parameter information, a first image, and a second image to the first device, the parameter information being used to project N point cloud data onto the first image and the second image, the N point cloud data being three-dimensional point cloud data obtained by perceiving the target object, N being an integer greater than 1, the first image and the second image being obtained by two-dimensional perception of the target object, and the first image being different from the second image.

[0093] In addition, for the technical effects of the method described in the sixth aspect, refer to the technical effects of the method described in the fifth aspect; details are not repeated here.

[0094] A seventh aspect provides a communication method, the method comprising: a first device performing the method of the first aspect, and a second device performing the method of the second aspect; or, the first device performing the method of the first aspect, and the second device performing the method of the third aspect; or, the first device performing the method of the first aspect, and the second device performing the method of the fourth aspect.

[0095] An eighth aspect provides a communication method, the method comprising: a first device performing the method described in the fifth aspect, and a second device performing the method described in the sixth aspect.

[0096] A ninth aspect provides a communication device. The communication device includes modules or units (e.g., chips, chip systems, or circuits) for performing each of the methods / operations / steps / actions described in any one of the first to sixth aspects, such as a transceiver module and a processing module. For example, the transceiver module is used to implement the transceiver function of the communication device, and the processing module is used to perform functions of the communication device other than the transceiver function.

[0097] Optionally, the transceiver module may include a transmitting module and a receiving module. The transmitting module implements the transmitting function of the communication device described in the ninth aspect, and the receiving module implements the receiving function of the communication device described in the ninth aspect.

[0098] Optionally, the communication device according to the ninth aspect may further include a storage module storing a program or instructions. When the processing module executes the program or instructions, the communication device can perform the method described in any one of the first to sixth aspects.

[0099] It is understood that the communication device described in the ninth aspect may be a terminal device or a network device, or a chip (system) or other component or assembly that can be disposed in the terminal device or the network device, or a device that includes the terminal device or the network device. This application does not limit it in this regard.

[0100] In addition, for the technical effects of the communication device described in the ninth aspect, refer to the technical effects of the method described in any of the implementations of the first to sixth aspects; details are not repeated here.

[0101] A tenth aspect provides a communication device. The communication device includes a processor, which, when executing computer instructions, causes the communication device to perform the method described in any one of the possible implementations of the first to sixth aspects.

[0102] In one possible design, the communication device described in the tenth aspect may further include a transceiver. The transceiver may be a transceiver circuit or an interface circuit. The transceiver can be used for communication between the communication device described in the tenth aspect and other communication devices.

[0103] In one possible design, the communication device described in the tenth aspect may further include a memory. This memory may be integrated with the processor or disposed separately. The memory may be used to store computer programs and / or data relating to the methods described in any of the first to sixth aspects.

[0104] In the embodiments of this application, the communication device described in the tenth aspect may be a terminal device or network device described in any one of the first to sixth aspects, or may be a chip (system) or other component or assembly disposed in the terminal device or the network device, or may include the terminal device or the network device.

[0105] In addition, for the technical effects of the communication device described in the tenth aspect, refer to the technical effects of the method described in any of the implementations of the first to sixth aspects; details are not repeated here.

[0106] An eleventh aspect provides a communication device. The communication device includes a processor coupled to a memory, the processor being configured to execute a computer program stored in the memory, such that the communication device performs the method described in any one of the possible implementations of the first to sixth aspects.

[0107] In one possible design, the communication device described in the eleventh aspect may further include a transceiver. This transceiver may be a transceiver circuit or an interface circuit. The transceiver can be used for communication between the communication device described in the eleventh aspect and other communication devices.

[0108] In the embodiments of this application, the communication device described in the eleventh aspect may be a terminal device or network device described in any one of the first to sixth aspects, or may be a chip (system) or other component or assembly disposed in the terminal device or network device, or may include the terminal device or network device.

[0109] In addition, for the technical effects of the communication device described in the eleventh aspect, refer to the technical effects of the method described in any of the implementations of the first to sixth aspects; details are not repeated here.

[0110] In a twelfth aspect, a communication device is provided, comprising: a processor and a memory; the memory is configured to store a computer program, which, when executed by the processor, causes the communication device to perform the method described in any one of the first to sixth aspects.

[0111] In one possible design, the communication device described in the twelfth aspect may further include a transceiver. The transceiver may be a transceiver circuit or an interface circuit. The transceiver can be used for communication between the communication device described in the twelfth aspect and other communication devices.

[0112] In the embodiments of this application, the communication device described in the twelfth aspect may be a terminal device or network device described in any one of the first to sixth aspects, or may be a chip (system) or other component or assembly disposed in the terminal device or the network device, or may include the terminal device or the network device.

[0113] In addition, for the technical effects of the communication device described in the twelfth aspect, refer to the technical effects of the method described in any of the implementations of the first to sixth aspects; details are not repeated here.

[0114] In a thirteenth aspect, a communication apparatus is provided for implementing the method described in any of the possible implementations of the first to sixth aspects.

[0115] In a fourteenth aspect, a communication chip is provided, comprising: a logic circuit and a communication interface, the logic circuit being used to execute computer instructions, and the communication interface being used for the communication chip to communicate with other devices or chips, wherein when the logic circuit executes the computer instructions, the method described in any one of the first to sixth aspects is implemented.

[0116] In a fifteenth aspect, a communication system is provided, comprising at least one of the following: a first device for performing the method of the first aspect, or a second device for performing the method of the second aspect; or comprising at least one of the following: a first device for performing the method of the first aspect, or a second device for performing the method of the third aspect; or comprising at least one of the following: a first device for performing the method of the first aspect, or a second device for performing the method of the fourth aspect.

[0117] In a sixteenth aspect, a communication system is provided, comprising at least one of the following: a first device for performing the method of the fifth aspect, or a second device for performing the method of the sixth aspect.

[0118] In a seventeenth aspect, a computer-readable storage medium is provided, comprising a computer program or instructions that, when executed on a computer, cause the computer to perform the method described in any one of the possible implementations of the first to sixth aspects.

[0119] In an eighteenth aspect, a computer program product is provided, including a computer program or instructions that, when run on a computer, cause the computer to perform the method described in any one of the possible implementations of the first to sixth aspects.

Brief Description of the Drawings

[0120] Figure 1 is a schematic diagram of the optical sensing coordinate system provided in an embodiment of this application;

[0121] Figure 2 is a schematic diagram of the camera coordinate system to the image coordinate system provided in an embodiment of this application;

[0122] Figure 3 is a schematic diagram illustrating the relationship between the image coordinate system and the pixel coordinate system provided in an embodiment of this application;

[0123] Figure 4 is a schematic diagram of the relationship between the field of view and focal length provided in the embodiments of this application;

[0124] Figure 5 is a schematic diagram of the radio frequency sensing results provided in the embodiments of this application;

[0125] Figure 6 is a framework diagram of the multimodal fusion sensing scheme provided in the embodiments of this application;

[0126] Figure 7 is a schematic diagram of the communication method provided in an embodiment of this application;

[0127] Figure 8 is a schematic diagram of neighborhood matching provided in an embodiment of this application;

[0128] Figure 9 is a schematic diagram of the projection of real point cloud data and noisy point cloud data provided in the embodiments of this application;

[0129] Figure 10 is a schematic diagram of the simulation configuration and results provided in the embodiments of this application;

[0130] Figure 11 is a schematic diagram of the communication method provided in an embodiment of this application;

[0131] Figure 12 is a schematic diagram of the communication method provided in the embodiment of this application;

[0132] Figure 13 is a schematic diagram of the communication method provided in the embodiment of this application;

[0133] Figure 14 is a schematic diagram of the communication method provided in the embodiment of this application;

[0134] Figure 15 is a schematic diagram of the communication method provided in the embodiment of this application;

[0135] Figure 16 is a schematic diagram of the communication device provided in an embodiment of this application;

[0136] Figure 17 is a schematic diagram of the structure of the communication device provided in the embodiment of this application.

Detailed Description

[0137] For ease of understanding, the technical terms involved in the embodiments of this application will be introduced below.

[0138] 1. Perception

[0139] Perception encompasses various methods, broadly categorized into 2D and 3D perception. 2D perception captures a scene onto a two-dimensional plane, generating a planar image. This image includes width and height information, but not depth information. Common 2D perception methods include optical perception and radar perception. 3D perception captures the shape and structure of a scene in three-dimensional space, generating a stereo image. This image includes width, height, and depth information. Common 3D perception methods include radio frequency sensing and computed tomography (CT) scanning. The following sections will describe optical perception and radio frequency sensing respectively.

[0140] 1.1 Optical Sensing

[0141] Optical sensing uses image sensors to detect light waves and generate images. The parameters of an optical sensing node are divided into intrinsic parameters and extrinsic parameters. Intrinsic parameters are related to the camera's own characteristics, such as focal length, field of view, and resolution; extrinsic parameters refer to the camera's position, rotation direction, etc. The imaging process of optical sensing involves four coordinate systems (world, camera, image, and pixel coordinate systems) and the transformations between these four coordinate systems. These four coordinate systems are explained below with reference to Figure 1.

[0142] As shown in Figure 1, the optical sensing coordinate system involves the world coordinate system, camera coordinate system, image coordinate system, and pixel coordinate system.

[0143] The world coordinate system is a three-dimensional coordinate system of the objective world. When an optical sensing node is placed in three-dimensional space, the world coordinate system is used as the reference to describe the position of the optical sensing node, which is denoted as $P(X_W, Y_W, Z_W)$.

[0144] The camera coordinate system is established with the optical sensing node (such as the optical center of the camera) as the origin and the camera's optical axis as the Z-axis. In the camera coordinate system, the position of the optical sensing node can be represented as $(X_C, Y_C, Z_C)$. It is understood that the transformation from the world coordinate system to the camera coordinate system can be achieved through rotation and translation operations, and vice versa.

[0145] The image coordinate system has its origin at the center of the camera's image sensor, with the X and Y axes parallel to the two perpendicular sides of the image sensor, and its coordinate values are represented by (x, y). The image coordinate system typically uses physical units (such as millimeters (mm)) to represent the position of pixels in the image.

[0146] The pixel coordinate system has its origin at the top-left corner of the image sensor, with the X and Y axes parallel to the X and Y axes of the image coordinate system, respectively. Its coordinate values are represented by (u, v). Each image contains M rows and N columns of elements, each element being called a pixel. The pixel coordinate system uses pixels as its unit. Rows correspond to the horizontal direction of the image sensor, and columns correspond to the vertical direction.

[0147] The transformation process for the above four coordinate systems is as follows:

[0148] (1) World coordinate system → Camera coordinate system: This is a 3D to 3D projection, which transforms the object's coordinates from the world coordinate system to the camera coordinate system through rotation and translation. Camera position refers to the camera's position $O_c$ in the world coordinate system. Camera pointing refers to the directions $[X_{axis}, Y_{axis}, Z_{axis}]$ of the three axes of the camera coordinate system (usually the x, y, and z axes) in the world coordinate system. When the directions of any two coordinate axes in the camera pointing are determined, the direction of the remaining axis can be calculated, for example $Y_{axis} = Z_{axis} \times X_{axis}$. The rotation matrix $R$ from the world coordinate system to the camera coordinate system is a 3×3 orthogonal matrix formed from these axis directions, each normalized by its norm, where $\|\cdot\|$ denotes the norm of a vector. The translation vector is a 3×1 vector, represented as $t = -R O_c$. It can be seen that the camera's rotation and translation are determined by its extrinsic parameters (such as camera position and pointing), and together they define the extrinsic parameter matrix of the camera.
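A common form of the rotation matrix and the extrinsic parameter matrix, consistent with $t = -R O_c$ and with the normalization by $\|\cdot\|$ described above, is sketched below. The row-wise layout of $R$ and the 3×4 shape $[R \;\; t]$ are assumptions made for illustration and may differ from the notation used in the original formulas.

$$
R = \begin{bmatrix} X_{axis}^{T} / \|X_{axis}\| \\ Y_{axis}^{T} / \|Y_{axis}\| \\ Z_{axis}^{T} / \|Z_{axis}\| \end{bmatrix}, \qquad t = -R\,O_c
$$

$$
\begin{bmatrix} X_C \\ Y_C \\ Z_C \end{bmatrix} = R \begin{bmatrix} X_W \\ Y_W \\ Z_W \end{bmatrix} + t = \begin{bmatrix} R & t \end{bmatrix} \begin{bmatrix} X_W \\ Y_W \\ Z_W \\ 1 \end{bmatrix}
$$

With this layout, a point located at the camera position ($P_W = O_c$) maps to the origin of the camera coordinate system, as expected.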

[0149] (2) Camera coordinate system → Image coordinate system: This is a 3D to 2D projection, which can be calculated based on the principle of similar triangles, as shown in Figure 2, where f represents the camera's focal length; the projection can also be expressed in matrix form. During this projection process, depth information is lost. For example, as shown in Figure 2, points P, B, and A in the camera coordinate system are projected onto points p, C, and o in the image. Furthermore, other points on the ray Oc-P are also projected onto point p in the image. In other words, depth information cannot be recovered from the imaging results.
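For reference, the standard pinhole relations that follow from similar triangles are sketched below. They are written under the assumption that the image axes are aligned with the camera X and Y axes; they may differ in notation from the original formulas.

$$
x = f\,\frac{X_C}{Z_C}, \qquad y = f\,\frac{Y_C}{Z_C}
$$

$$
Z_C \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = \begin{bmatrix} f & 0 & 0 & 0 \\ 0 & f & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} X_C \\ Y_C \\ Z_C \\ 1 \end{bmatrix}
$$

Scaling $(X_C, Y_C, Z_C)$ by any positive factor leaves $(x, y)$ unchanged, which is the loss of depth information described above.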

[0150] (3) Image coordinate system → Pixel coordinate system: This converts physical units to pixel units. Both the pixel coordinate system and the image coordinate system lie on the imaging plane of the optical sensing node, but their origins and units of measurement differ. As shown in Figure 3, the image coordinate system (x, y) generally uses the center of the image sensor as its origin, and the unit is a physical unit, such as mm. The pixel coordinate system (u, v) generally uses the upper left corner of the image sensor as its origin, and the unit is pixels. The number of pixels determines the camera's resolution. For example, in Figure 3, the number of pixels in the camera is M*N, meaning that the number of pixels in each row is M, and the number of pixels in each column is N, corresponding to a resolution of M*N. The conversion relationship between the pixel coordinate system coordinates [u, v] and the image coordinate system coordinates [x, y] is expressed in terms of dx and dy, which represent the number of millimeters represented by each column and each row of pixels, respectively, and $u_m$ and $v_m$, which are the coordinates of the pixel corresponding to the center of the image sensor and are equal to half of the horizontal and vertical resolutions, respectively. The relationship can also be represented in matrix form.
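A standard form of this conversion, using the symbols $dx$, $dy$, $u_m$, and $v_m$ defined above, is sketched below. It assumes that the u and v axes point in the same directions as the x and y axes, and it may differ in notation from the original formulas.

$$
u = \frac{x}{dx} + u_m, \qquad v = \frac{y}{dy} + v_m
$$

$$
\begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = \begin{bmatrix} 1/dx & 0 & u_m \\ 0 & 1/dy & v_m \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}
$$

For example, the sensor center $(x, y) = (0, 0)$ maps to the pixel $(u_m, v_m)$.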

[0151] The above content introduces the transformations among the four coordinate systems involved in the optical sensing imaging process. It can be understood that, by combining the second transformation (i.e., (2) camera coordinate system → image coordinate system) and the third transformation (i.e., (3) image coordinate system → pixel coordinate system), the camera's intrinsic parameter matrix can be defined. Its entries include the number of pixels corresponding to the focal length in the horizontal direction and the number of pixels corresponding to the focal length in the vertical direction. With the horizontal field of view denoted as $\theta_u$, the relationship between the focal length and the field of view is shown in Figure 4. Similarly, $\theta_v$ represents the field of view in the vertical direction. Therefore, when the focal length and resolution are known, or when the field of view and resolution are known, the camera's intrinsic parameter matrix can be determined.
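As an illustration of determining the intrinsic parameter matrix from the field of view and the resolution, the following minimal sketch assumes the standard pinhole geometry, in which the focal length in pixels equals half the resolution divided by the tangent of half the field of view. The names `f_u`, `f_v`, `intrinsic_matrix`, and `project_to_pixel` are hypothetical and are introduced only for this example.

```python
import math

def intrinsic_matrix(width_px, height_px, fov_u_deg, fov_v_deg):
    """Build a 3x3 intrinsic matrix from resolution and field of view.

    Assumes tan(theta_u / 2) = u_m / f_u and tan(theta_v / 2) = v_m / f_v,
    where f_u and f_v are the focal length expressed in pixels.
    """
    u_m = width_px / 2.0   # pixel at the sensor center, horizontal
    v_m = height_px / 2.0  # pixel at the sensor center, vertical
    f_u = u_m / math.tan(math.radians(fov_u_deg) / 2.0)
    f_v = v_m / math.tan(math.radians(fov_v_deg) / 2.0)
    return [[f_u, 0.0, u_m],
            [0.0, f_v, v_m],
            [0.0, 0.0, 1.0]]

def project_to_pixel(K, point_cam):
    """Project a camera-frame point (X_C, Y_C, Z_C) to pixel coordinates (u, v)."""
    X, Y, Z = point_cam
    if Z <= 0:
        raise ValueError("point must lie in front of the camera (Z_C > 0)")
    u = K[0][0] * X / Z + K[0][2]
    v = K[1][1] * Y / Z + K[1][2]
    return u, v

# A 640x480 camera with a 90-degree field of view in both directions.
K = intrinsic_matrix(640, 480, 90.0, 90.0)
center_pixel = project_to_pixel(K, (0.0, 0.0, 5.0))  # a point on the optical axis
```

A point on the optical axis projects to the sensor center $(u_m, v_m)$ regardless of its depth, which is a quick sanity check for the matrix.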

[0152] 1.2 Radio Frequency Sensing

[0153] Radio frequency (RF) sensing can detect a target by receiving its echo. The target being sensed (or the sensing target) can reflect, diffract, or scatter signals. As shown in Figure 5, RF sensing can obtain multiple 3D scattering points. It is understood that in this embodiment, 3D scattering points can also be referred to as point cloud data, 3D point cloud data, or other possible names, and this embodiment does not impose any limitations on this.

[0154] 2. Multimodal fusion sensing

[0155] Multimodal fusion sensing can fuse radio frequency (RF) and optical multimodal information. For example, multimodal fusion sensing can determine the surface where a target is located based on the 3D scattering points from RF sensing, and then project the optical imaging results back onto the surface determined by RF sensing, using the intersection points to determine the target shape. As shown in Figure 6, the implementation idea of multimodal fusion sensing is described in detail below:

[0156] (1) Regarding optical sensing:

[0157] First, obtain the extrinsic and intrinsic parameter matrices of the camera; then, based on the intrinsic parameter matrix, project the 2D image back onto the camera coordinate system to obtain the set of ray equations for key points (such as boundary pixels) in the image, $R_E = \{\gamma_0 (X_0, Y_0, 1), \gamma_1 (X_1, Y_1, 1), \ldots, \gamma_K (X_K, Y_K, 1)\}$. The specific back projection process is represented as follows:

[0158] It is understood that, because the depth is unknown, the ray equation of a key point (such as a boundary pixel) in the image is $\gamma_0 (X_0, Y_0, 1)$, where $(u_0, v_0)$ represents the coordinates of the key point in the pixel coordinate system, $K$ is the camera intrinsic parameter matrix, and $\gamma_0$ represents the Z-axis coordinate in the camera coordinate system, i.e., the depth information, which is the parameter to be estimated.
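The back projection of a key point can be sketched as follows. The sketch assumes an intrinsic parameter matrix without skew, so that applying its inverse to $(u_0, v_0, 1)$ reduces to two subtractions and two divisions. The function names `backproject_pixel` and `point_on_ray` and the numeric values are hypothetical.

```python
def backproject_pixel(K, u, v):
    """Return the direction (X, Y, 1) of the camera-frame ray through pixel (u, v).

    Assumes K = [[f_u, 0, u_m], [0, f_v, v_m], [0, 0, 1]] (no skew term),
    so that (X, Y, 1) equals the inverse of K applied to (u, v, 1).
    """
    f_u, f_v = K[0][0], K[1][1]
    u_m, v_m = K[0][2], K[1][2]
    return ((u - u_m) / f_u, (v - v_m) / f_v, 1.0)

def point_on_ray(direction, gamma):
    """Return the camera-frame point gamma * (X, Y, 1) for a given depth gamma."""
    return tuple(gamma * d for d in direction)

# Illustrative intrinsic matrix (values are assumptions for the example).
K = [[500.0, 0.0, 320.0],
     [0.0, 500.0, 240.0],
     [0.0, 0.0, 1.0]]

ray = backproject_pixel(K, 420.0, 240.0)  # direction of the ray for one key point
candidate = point_on_ray(ray, 5.0)        # the point on that ray at depth 5
```

Every value of `gamma` gives a different 3D point that images to the same pixel, which is why the depth remains the parameter to be estimated.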

[0159] (2) Regarding radio frequency sensing:

[0160] Based on the 3D scattering points $P_S = \{(X_0, Y_0, Z_0), (X_1, Y_1, Z_1), \ldots, (X_N, Y_N, Z_N)\}$, determine the surface of the object $F(x, y, z) = 0$.

[0161] (3) Multimodal fusion:

[0162] Substitute the ray equations $R_E$ of the key points into the surface equation $F(x, y, z) = 0$ to calculate the depth $\gamma$, and thereby determine the coordinates of the key points and the shape of the object.

[0163] The above content introduces multimodal fusion sensing. However, how to improve the performance of multimodal sensing is a problem currently under active discussion.
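As a worked illustration of the fusion step, the sketch below assumes the simplest possible surface, a plane $a x + b y + c z + d = 0$, so that substituting the ray $\gamma (X, Y, 1)$ gives the closed form $\gamma = -d / (aX + bY + c)$. A general surface $F(x, y, z) = 0$ would normally require a numerical root search instead. The function name `depth_on_plane` is hypothetical.

```python
def depth_on_plane(direction, plane, eps=1e-12):
    """Solve for the depth gamma at which the ray gamma * (X, Y, 1) meets a plane.

    direction: (X, Y, 1), the back-projected ray of a key point.
    plane: (a, b, c, d) for the surface a*x + b*y + c*z + d = 0
           (assumption: the object surface is planar in this example).
    Returns None if the ray is parallel to the plane or meets it behind the camera.
    """
    X, Y, _ = direction
    a, b, c, d = plane
    denom = a * X + b * Y + c
    if abs(denom) < eps:
        return None
    gamma = -d / denom
    return gamma if gamma > 0 else None

# Plane z = 5 in the camera coordinate system, and one key-point ray.
plane = (0.0, 0.0, 1.0, -5.0)
direction = (0.2, 0.0, 1.0)

gamma = depth_on_plane(direction, plane)
key_point = tuple(gamma * d for d in direction)  # 3D coordinates of the key point
```

Repeating this for every ray in the set of key points yields their 3D coordinates, from which the object shape can be outlined.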

[0164] To address the aforementioned technical problems, this application proposes the following technical solutions to improve multimodal sensing performance.

[0165] The technical solutions in this application will now be described with reference to the accompanying drawings.

[0166] The technical solutions of this application can be applied to various communication systems, such as 4th generation (4G) mobile communication systems, such as long term evolution (LTE) systems, 5th generation (5G) mobile communication systems, such as new radio (NR) systems, and communication systems that evolve after 5G, such as future communication network systems. They can also be applied to wireless fidelity (WiFi) systems, vehicle to everything (V2X) communication systems, device-to-device (D2D) communication systems, vehicle networking communication systems, etc.

[0167] This application will present various aspects, embodiments, or features relating to systems that may include multiple devices, components, modules, etc. It should be understood and appreciated that individual systems may include additional devices, components, modules, etc., and / or may not include all the devices, components, modules, etc. discussed in conjunction with the accompanying drawings. Furthermore, combinations of these approaches are also possible.

[0168] Furthermore, in the embodiments of this application, the words "exemplary," "for example," etc., are used to indicate that they are examples, illustrations, or descriptions. Any embodiment or design scheme described as "exemplary" in this application should not be construed as being more preferred or advantageous than other embodiments or design schemes. Specifically, the use of the term "exemplary" is intended to present the concept in a concrete manner.

[0169] In the embodiments of this application, the terms "information," "signal," "message," "channel," and "signaling" may sometimes be used interchangeably. It should be noted that, without emphasizing their distinction, their intended meanings are consistent. Similarly, "of," "corresponding (relevant)," and "corresponding" may sometimes be used interchangeably. It should be noted that, without emphasizing their distinction, their intended meanings are consistent. Furthermore, the " / " mentioned in this application can be used to indicate an "or" relationship.

[0170] The network architecture and business scenarios described in the embodiments of this application are for the purpose of more clearly illustrating the technical solutions of the embodiments of this application, and do not constitute a limitation on the technical solutions provided in the embodiments of this application. As those skilled in the art will know, with the evolution of network architecture and the emergence of new business scenarios, the technical solutions provided in the embodiments of this application are also applicable to similar technical problems.

[0171] To facilitate understanding of the embodiments of this application, the communication system applicable to the embodiments of this application will be described first.

[0172] The communication system includes: a first device and a second device.

[0173] The first device can be used for communication and data processing. For example, the first device can receive three-dimensional point cloud data from other devices and process the three-dimensional point cloud data. The first device can also be used for radio frequency sensing of objects. The first device can be a terminal device, or a communication module, circuit with communication function, chip, chip system, or other component or assembly in a terminal device. The first device can also be a network device, or a communication module, circuit with communication function, chip, chip system, or other component or assembly in a network device.

[0174] The second device can be used for communication; for example, the second device can receive two-dimensional images from other devices. The second device can also be used for data processing; for example, the second device can process two-dimensional images. The second device can also be used for optical sensing of objects. The second device can be a terminal device, or a communication module, circuit with communication function, chip, chip system, or other component or assembly within a terminal device. The second device can also be a network device, or a communication module, circuit with communication function, chip, chip system, or other component or assembly within a network device.

[0175] The aforementioned terminal equipment can also be referred to as user equipment (UE), mobile station (MS), mobile terminal (MT), user device, access terminal, user unit, user station, mobile station, remote station, remote terminal, mobile device, user terminal, terminal, wireless communication equipment, user agent, or user device, etc., or equipment used to provide voice or data connectivity to users, and can also be Internet of Things (IoT) devices. For example, terminal equipment includes handheld devices with wireless connectivity, vehicle-mounted devices, etc. Currently, terminals can include: mobile phones, tablets, computers with wireless transceiver capabilities, laptops, handheld computers, mobile internet devices (MIDs), wearable devices (such as smartwatches, smart bracelets, pedometers, etc.), in-vehicle equipment (such as cars, bicycles, electric vehicles, airplanes, ships, trains, high-speed trains, etc.), satellite terminals, virtual reality (VR) devices, augmented reality (AR) devices, smart point-of-sale (POS) machines, customer-premises equipment (CPE), wireless terminals in industrial control, smart home devices (such as refrigerators, televisions, air conditioners, electricity meters, etc.), smart robots, robotic arms, workshop equipment, wireless terminals in autonomous driving, wireless terminals in telemedicine, wireless terminals in smart grids, wireless terminals in transportation safety, wireless terminals in smart cities, or wireless terminals in smart homes, and flying equipment (such as smart robots, hot air balloons, drones, airplanes), etc. Terminal devices can also be other devices with terminal functions. For example, a terminal device can also be a device that plays a terminal function in device-to-device (D2D) communication.

[0176] The aforementioned network devices can be base stations, evolved NodeBs (eNodeBs), transmitting and receiving points (TRPs), transmitting points (TPs), next-generation NodeBs (gNBs), next-generation base stations in future communication systems, base stations in future mobile communication systems, satellites, or access points (APs) in WiFi systems, such as home gateways, routers, servers, switches, and bridges. They can also be integrated access and backhaul (IAB) nodes, mobile switching centers, or network devices in non-terrestrial network (NTN) communication systems, that is, they can be deployed on high-altitude platforms or satellites. Network devices can be macro base stations, micro base stations or indoor stations, relay nodes or donor nodes, or wireless controllers in cloud-radio access network (C-RAN) scenarios. Network devices can also function as base stations in D2D communication, vehicle-to-everything (V2X) communication, drone communication, and machine-to-machine (M2M) communication. Optionally, network devices can also be servers, wearable devices, vehicles, or in-vehicle equipment. For example, the access network equipment in V2X technology can be a roadside unit (RSU). Furthermore, the aforementioned network devices can also be core network elements, such as policy control functions (PCF), access and mobility management functions (AMF), location management functions (LMF), sensing management functions (SMF), or sensing functions (SF), etc.

[0177] It is understood that the terms "first device" and "second device" in the embodiments of this application are merely exemplary expressions and can be replaced with any possible expression, such as "first device" can be replaced with "radio frequency sensing node" or "first node", and "second device" can be replaced with "optical sensing node" or "second node", etc. The embodiments of this application do not limit this.

[0178] In this communication system, the first device can obtain the neighborhood matching degree of each of the N point cloud data based on the information provided by the second device, and determine the noise in the N point cloud data based on the neighborhood matching degree of each of the N point cloud data. The neighborhood matching degree of the i-th point cloud data is determined based on the image similarity between the first neighborhood i and the second neighborhood i. The first neighborhood i is the region where the i-th point cloud data is projected onto the first image, and the second neighborhood i is the region where the i-th point cloud data is projected onto the second image. i is an integer from 1 to N, and N is an integer greater than 1. It can be understood that after projecting the real point cloud data (i.e., point cloud data excluding noise) from the N point cloud data onto the first and second images, the imaging results of the similar (or identical) regions of the target object corresponding to the first and second neighborhoods of that point cloud data are considered to have a high degree of similarity; conversely, the first and second neighborhoods corresponding to the noisy point cloud data in the N point cloud data have a low degree of similarity. In this way, noise in N point cloud data can be effectively filtered out based on the neighborhood matching degree of each of the N point cloud data. That is, the point cloud data other than noise can be used for multimodal fusion perception, thereby improving the performance of multimodal perception.
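The filtering principle described in the preceding paragraph can be sketched as follows. This is only an illustrative sketch under stated assumptions: the fixed threshold, the function name `split_noise`, and the convention that a higher matching degree means higher similarity are hypothetical and are not specified in this application.

```python
from typing import List, Sequence, Tuple

Point3D = Tuple[float, float, float]


def split_noise(points: Sequence[Point3D],
                matching_degrees: Sequence[float],
                threshold: float) -> Tuple[List[Point3D], List[Point3D]]:
    """Split points into (kept, noise) using one matching degree per point."""
    if len(points) != len(matching_degrees):
        raise ValueError("one matching degree is required per point")
    kept: List[Point3D] = []
    noise: List[Point3D] = []
    for point, degree in zip(points, matching_degrees):
        # Assumption: a higher degree means the two neighborhoods look more alike,
        # so points below the (hypothetical) threshold are treated as noise.
        if degree >= threshold:
            kept.append(point)
        else:
            noise.append(point)
    return kept, noise


# Illustrative values only; they are not taken from the application.
points = [(0.0, 0.0, 5.0), (1.0, 0.5, 4.0), (9.0, 9.0, 9.0)]
degrees = [0.92, 0.88, 0.15]
kept, noise = split_noise(points, degrees, threshold=0.5)
```

In this sketch only the points in `kept` would be passed on to multimodal fusion perception.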

[0179] In addition, the above communication system may also include other network devices and / or other terminal devices, which can be flexibly configured according to the actual situation without any restrictions.

[0180] For ease of understanding, the communication method provided in the embodiments of this application will be described in detail below with reference to Figure 7.

[0181] For example, Figure 7 is a schematic flowchart of a communication method provided in an embodiment of this application. This method can be applied to the interaction between the first device and the second device in the above-described communication system.

[0182] As shown in Figure 7, the flow of this communication method is as follows:

[0183] S701, the first device acquires the neighborhood matching degree of each of the N point cloud data.

[0184] S702, the first device determines the noise in the N point cloud data according to the neighborhood matching degree of each of the N point cloud data.

[0185] The steps described above will be explained in detail below.

[0186] For S701:

[0187] The N point cloud data are three-dimensional point cloud data obtained by sensing the target object, or in other words, N point cloud data are point cloud data obtained by three-dimensional sensing of the target object, where N is an integer greater than 1. For example, the N point cloud data are three-dimensional point cloud data obtained by radio frequency sensing of the target object. The N point cloud data can be used to generate a stereo image of the target object. Furthermore, the N point cloud data can be data obtained by the first device through three-dimensional sensing of the target object, or data acquired by the first device from other devices; there are no restrictions.

[0188] It is understood that the term "point cloud data" in the embodiments of this application is merely an exemplary expression, and "point cloud data" can be replaced with any possible expression, such as "three-dimensional scattering points" or "three-dimensional point cloud data", etc. The embodiments of this application do not limit this.

[0189] The neighborhood matching degree of the i-th point cloud data in N point cloud data is determined based on the image similarity between the first neighborhood i and the second neighborhood i, where i is an integer from 1 to N.

[0190] The first neighborhood i is the region where the i-th point cloud data is projected (or mapped) onto the first image (described below). In other words, the first neighborhood i is the region where the point (denoted as the first projection point i) of the i-th point cloud data is projected onto the first image. That is, the first neighborhood includes the first projection point i. The first neighborhood i can be a rectangular region centered on a certain pixel (such as the first projection point i). The size of this rectangular region can be determined by the width and height of the first neighborhood i.

[0191] For example, the expression for the first neighborhood i is: $N_r(u,v)=\{(u+x,v+y)\mid -r\le x\le r,\ -k\le y\le k\}$, where (u,v) are the coordinates of the point to which the i-th point cloud data is mapped on the first image, i.e., the coordinates of the first projection point i, (2r+1) is greater than or equal to 1, and (2k+1) is greater than or equal to 1; in this case, the size of the first neighborhood i is (2r+1)×(2k+1). It can be understood that | is a separator: (u+x,v+y) before | indicates the pixel coordinates of the first neighborhood i, while the conditions after | give the ranges of x and y. It can be understood that when r is greater than the horizontal resolution of the first image, the first neighborhood i can be filled with zeros (or other values); when k is greater than the vertical resolution of the first image, the first neighborhood i can be filled with zeros (or other values). The first neighborhood i can also be a circular region centered on a certain pixel (such as the first projection point i), where the pixels within this circular region are the set of all pixels whose Euclidean distance from the center pixel is less than or equal to the radius of the circle.
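As a minimal sketch of the rectangular neighborhood and its zero filling, the following code extracts a (2r+1)×(2k+1) region around a projection point. The image layout `image[v][u]` (row v, column u), the function name `rect_neighborhood`, and the choice to fill every out-of-image pixel with `fill` are assumptions made for illustration.

```python
from typing import List

Image = List[List[int]]  # assumed layout: image[v][u], v = row, u = column


def rect_neighborhood(image: Image, u: int, v: int,
                      r: int, k: int, fill: int = 0) -> Image:
    """Return the (2k+1) rows by (2r+1) columns region centered on (u, v)."""
    height = len(image)
    width = len(image[0]) if height else 0
    patch: Image = []
    for y in range(-k, k + 1):          # vertical offsets, -k <= y <= k
        row = []
        for x in range(-r, r + 1):      # horizontal offsets, -r <= x <= r
            uu, vv = u + x, v + y
            inside = 0 <= uu < width and 0 <= vv < height
            # Pixels that fall outside the image are filled (zeros by default).
            row.append(image[vv][uu] if inside else fill)
        patch.append(row)
    return patch


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
center = rect_neighborhood(image, u=1, v=1, r=1, k=0)
corner = rect_neighborhood(image, u=0, v=0, r=1, k=1)
```

Here `center` is a 1×3 region fully inside the image, while `corner` shows the filled pixels for a projection point at the image corner.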

[0192] For example, the expression for the first neighborhood i is: $N_r(u,v)=\{(u+x,v+y)\mid x^2+y^2\le r^2\}$, where (u, v) are the coordinates of the point to which the i-th point cloud data is mapped on the first image, i.e., the coordinates of the first projection point i, and r is greater than or equal to 1. It can be understood that | is a separator; (u+x, v+y) before | indicates the pixel coordinates of the first neighborhood i, and the condition after | gives the range of x and y. The first neighborhood i can also be a region of other shapes and/or sizes centered on a certain pixel (such as the first projection point i), and can be flexibly set according to actual conditions without restriction. It can be understood that when r is greater than the horizontal or vertical resolution of the first image, the first neighborhood i can be filled with zeros (or other values).
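The circular neighborhood can be sketched in the same way. The code below enumerates the integer offsets (x, y) that satisfy x² + y² ≤ r² and collects the corresponding pixel values. The function names, the `image[v][u]` layout, and the flat list output are assumptions made for illustration.

```python
from typing import List, Tuple

Image = List[List[int]]  # assumed layout: image[v][u], v = row, u = column


def circular_offsets(r: int) -> List[Tuple[int, int]]:
    """Integer offsets (x, y) with x^2 + y^2 <= r^2, in row-major order."""
    return [(x, y)
            for y in range(-r, r + 1)
            for x in range(-r, r + 1)
            if x * x + y * y <= r * r]


def circular_neighborhood(image: Image, u: int, v: int,
                          r: int, fill: int = 0) -> List[int]:
    """Pixel values of the circular region centered on (u, v)."""
    height = len(image)
    width = len(image[0]) if height else 0
    values = []
    for x, y in circular_offsets(r):
        uu, vv = u + x, v + y
        inside = 0 <= uu < width and 0 <= vv < height
        # Out-of-image pixels are filled, mirroring the rectangular case.
        values.append(image[vv][uu] if inside else fill)
    return values


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
disk = circular_neighborhood(image, u=1, v=1, r=1)
```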

[0193] The second neighborhood i is the region where the i-th point cloud data is projected (or mapped) onto the second image. In other words, the second neighborhood i is the region where the point (denoted as the second projection point i) of the i-th point cloud data is projected onto the second image. The method for determining the second neighborhood i is similar to that for determining the first neighborhood i, and can be understood by referring to the relevant introduction to the first neighborhood i above; it will not be repeated here.

[0194] The first and second images are obtained through two-dimensional perception of the target object. For example, the first and second images are obtained through optical perception of the target object. Furthermore, the first and second images are different. For instance, when the two-dimensional perception is optical, the first and second images are images obtained by perceiving (or capturing) the target object at different angles, i.e., the first and second images are captured from different angles; or, the first and second images are images obtained by perceiving the target object using two different devices located at different positions.

[0195] The following is a specific example to illustrate the above.

[0196] For example, as shown in Figure 8, the first image is a two-dimensional image obtained by camera C1 capturing the target object. The i-th point cloud data (point P in Figure 8) is projected onto the first image to obtain point P1, and the region where point P1 is located (i.e., the box containing point P1 in Figure 8) is the first neighborhood i. The second image is a two-dimensional image obtained by camera C2 capturing the target object. The i-th point cloud data is projected onto the second image to obtain point P2, and the region where point P2 is located (i.e., the box containing point P2 in Figure 8) is the second neighborhood i. The first and second images are captured from different angles.
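The projection of point P onto the two images can be illustrated with a simple pinhole camera model. This is a sketch under assumptions: the application does not state which projection model is used, and the class `PinholeCamera`, its fields (rotation, translation, focal lengths, principal point), and the numeric values are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class PinholeCamera:
    rotation: List[List[float]]   # 3x3 world-to-camera rotation (external parameter)
    translation: Vec3             # world-to-camera translation (external parameter)
    fx: float                     # focal lengths in pixels (internal parameters)
    fy: float
    cx: float                     # principal point in pixels
    cy: float

    def project(self, point: Vec3) -> Optional[Tuple[int, int]]:
        """Project a 3D point to integer pixel coordinates (u, v)."""
        xc, yc, zc = (
            sum(self.rotation[i][j] * point[j] for j in range(3)) + self.translation[i]
            for i in range(3)
        )
        if zc <= 0:
            return None  # the point is behind the camera and cannot be imaged
        u = self.fx * xc / zc + self.cx
        v = self.fy * yc / zc + self.cy
        return round(u), round(v)


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
c1 = PinholeCamera(identity, (0.0, 0.0, 0.0), 100.0, 100.0, 50.0, 50.0)
c2 = PinholeCamera(identity, (-1.0, 0.0, 0.0), 100.0, 100.0, 50.0, 50.0)

p = (0.0, 0.0, 5.0)      # the i-th point cloud data (point P)
p1 = c1.project(p)       # first projection point i on the first image
p2 = c2.project(p)       # second projection point i on the second image
```

Because the two cameras observe the target object from different positions, the same point P lands on different pixel coordinates in the two images, and a neighborhood is then taken around each projection point.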

[0197] The above content describes the neighborhood matching degree of each of the N point cloud data. It can be understood that the first device can obtain the neighborhood matching degree of each of the N point cloud data, i.e., the neighborhood matching degree of the i-th point cloud data, based on different information. The following explains the different cases.

[0198] Case 7.1: The first device determines the neighborhood matching degree of the i-th point cloud data according to the first information, which is used to indicate the relevant information of the pixels in the first neighborhood i and the second neighborhood i.

[0199] In other words, the first device acquiring the neighborhood matching degree of each of the N point cloud data can specifically include: the first device acquiring first information and determining the neighborhood matching degree of the i-th point cloud data based on the first information, wherein the first information is used to indicate the relevant information of the pixels in the first neighborhood i and the second neighborhood i. That is, the first device can determine the neighborhood matching degree of the i-th point cloud data based on the relevant information of the pixels in the first neighborhood i and the second neighborhood i, such as the pixel value corresponding to each pixel, or based on the image features determined by the pixels. Since the relevant information of the pixels can characterize the image features of the first neighborhood i and the second neighborhood i, the first device can accurately determine the neighborhood matching degree of the i-th point cloud data through the relevant information of the pixels in the first neighborhood i and the second neighborhood i.

[0200] The aforementioned first information may include first neighborhood information i and second neighborhood information i. The first neighborhood information i includes the pixel value corresponding to the pixel in the first neighborhood i. The second neighborhood information i includes the pixel value corresponding to the pixel in the second neighborhood i. That is, in this case, the relevant information about the pixel refers to the pixel value corresponding to the pixel. The first information may also include first neighborhood feature i and second neighborhood feature i. The first neighborhood feature i is used to indicate the image features of the first neighborhood i. The second neighborhood feature i is used to indicate the image features of the second neighborhood i. Image features are related to pixels; that is, the image features corresponding to a region can be determined through the pixels in the region. That is, in this case, the relevant information about the pixel refers to the image features. Of course, the first information can also be other information, and can be set according to the actual situation without limitation.

[0201] The aforementioned first information can be obtained from the second device. For example, the first device obtaining the first information may specifically include: the first device sending a first message to the second device, and correspondingly, the second device receiving the first message from the first device. The first message requests information about the pixels in the region where N point cloud data are projected onto a two-dimensional image, where the two-dimensional image is obtained by sensing a target object. In response to the first message, the second device sends first information to the first device, and correspondingly, the first device receives the first information from the second device. That is, the first device can request the second device to send information about the pixels in the region where N point cloud data are projected onto a two-dimensional image. In this case, the second device can project the i-th point cloud data onto the first image and the second image, and obtain information about the pixels in the first neighborhood i and the second neighborhood i.

[0202] The first message may include neighborhood description information, which indicates the size and / or shape of the region where the N point cloud data are projected onto the two-dimensional image. That is, after projecting the i-th point cloud data onto the first and second images, the second device can determine the first neighborhood i and the second neighborhood i based on the neighborhood description information, and then obtain relevant information about the pixels in the first and second neighborhoods i. It is understood that the first and second devices can also pre-agree or pre-define the size and / or shape of the region where the point cloud data is projected onto the two-dimensional image, without restriction.

[0203] For example, the neighborhood description information indicates that the region where the i-th point cloud data is projected onto the 2D image has a rectangular shape, and the size of this region is (2r+1)×(2k+1), where (2r+1) is greater than or equal to 1 and (2k+1) is greater than or equal to 1. After receiving the first message and projecting the i-th point cloud data onto the first image, the second device can determine the first neighborhood i based on the neighborhood description information, with the first projection point i as the center and the size of the region being (2r+1)×(2k+1). It can be understood that the specific implementation principle of the second device determining the second neighborhood i is similar to that of the second device determining the first neighborhood i, and can be understood by referring to the relevant introduction on the second device determining the first neighborhood i; it will not be repeated here.

[0204] For example, the first and second devices pre-agree that the region where the point cloud data is projected onto the two-dimensional image is circular, and the neighborhood description information is used to indicate that the size of the region where the i-th point cloud data is projected onto the two-dimensional image is $\pi r^2$. After receiving the first message and projecting the i-th point cloud data onto the first image, the second device can, based on the neighborhood description information, determine the first neighborhood i as the region centered on the first projection point i with a size of $\pi r^2$. It can be understood that the specific implementation principle of the second device determining the second neighborhood i is similar to that of the second device determining the first neighborhood i. For further understanding, please refer to the relevant introduction on the second device determining the first neighborhood i; it will not be elaborated here.

[0205] The first message can also include N point cloud data points, that is, the coordinates corresponding to the N point cloud data points. It can be understood that the N point cloud data points can be sent to the second device through other messages, such as the first device sending the first message and then sending the N point cloud data points to the second device. In other words, the N point cloud data points and the neighborhood description information do not necessarily need to be carried in the same message; this can be flexibly set according to the actual situation without any restrictions.
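One possible in-memory representation of the first message is sketched below. All class names, field names, and string values are hypothetical; the application does not define a message format. The optional fields reflect that the neighborhood description may be pre-agreed and that the N point cloud data may be sent in a separate message.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Point3D = Tuple[float, float, float]


@dataclass
class NeighborhoodDescription:
    shape: str                 # hypothetical values: "rectangle" or "circle"
    r: int                     # horizontal half-width, or radius for a circle
    k: Optional[int] = None    # vertical half-height, rectangles only

    def pixel_count(self) -> int:
        """Number of pixels in the described region (illustrative helper)."""
        if self.shape == "rectangle":
            if self.k is None:
                raise ValueError("a rectangle needs both r and k")
            return (2 * self.r + 1) * (2 * self.k + 1)
        if self.shape == "circle":
            rng = range(-self.r, self.r + 1)
            return sum(1 for x in rng for y in rng
                       if x * x + y * y <= self.r * self.r)
        raise ValueError(f"unknown shape: {self.shape}")


@dataclass
class FirstMessage:
    requested_info: str                                      # e.g. "pixel_values"
    neighborhood: Optional[NeighborhoodDescription] = None   # None if pre-agreed
    points: List[Point3D] = field(default_factory=list)      # may be sent separately


msg = FirstMessage(
    requested_info="pixel_values",
    neighborhood=NeighborhoodDescription(shape="rectangle", r=2, k=1),
    points=[(0.0, 0.0, 5.0)],
)
```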

[0206] It is understandable that the first device can request information about the pixels in the region where N point cloud data are projected onto the two-dimensional image, based on the reporting capabilities of the second device.

[0207] In a first possible implementation, when the first information includes first neighborhood information i and second neighborhood information i, and before the first device sends the first message to the second device, the communication method may further include: the second device sending a first capability parameter to the first device, and correspondingly, the first device receiving the first capability parameter from the second device, the first capability parameter being used to indicate that the second device supports providing the pixel values corresponding to pixels in a first region to the first device, the first region being the region where the point cloud data is projected onto the two-dimensional image; the first device sending the first message to the second device may specifically include: the first device sending the first message to the second device according to the first capability parameter.

[0208] The first capability parameter can also be understood as the second device's ability to provide the first device with the pixel values corresponding to the pixels in the first region, or the second device having the capability to provide the first device with the pixel values corresponding to the pixels in the first region. It can be understood that when the second device is a chip, the device containing the second device has the capability to provide the first device with the pixel values corresponding to the pixels in the first region.

[0209] The first device sending a first message to the second device based on the first capability parameter can be understood as the first device determining to send a first message to the second device based on the first capability parameter.

[0210] In this embodiment, after receiving the first capability parameter, the first device can determine that the second device (or the device containing the second device) has the capability to provide the first device with the pixel values corresponding to the pixels in the first region. At this time, the first device can send a first message to the second device, requesting the second device to send relevant information about the region where N point cloud data are projected onto the 2D image. This prevents the first device from sending a request that the second device cannot fulfill, thus avoiding additional communication overhead.

[0211] Furthermore, the aforementioned first message, used to request information about pixels in the region where the N point cloud data are projected onto the 2D image, can specifically include: the first message requesting pixel values corresponding to the pixels in the region where the N point cloud data are projected onto the 2D image. This avoids the second device providing incorrect information to the first device, such as providing image features corresponding to the region where the N point cloud data are projected onto the 2D image, thus preventing additional communication overhead.

[0212] Furthermore, the first capability parameter can also be used to indicate at least one of the following information of the second device: information protection level information, sensing capability information, communication capability information, computing capability information, storage capability information, internal and external parameter transmission information, neighborhood feature acquisition information, or neighborhood matching degree acquisition information. Each piece of information is described in detail below.

[0213] Information protection level information is used to indicate the degree of information exposure supported by the second device, such as whether the second device is willing to send information such as the internal and external parameters of the second device, the relevant information of the pixels in the area where the point cloud data is projected onto the two-dimensional image, to other devices (such as the first device).

[0214] The perception capability information is used to indicate the perception-related parameters of the second device, such as the resolution and field of view of the second device.

[0215] Communication capability information describes whether the second device has the ability to share data, and the strength of that ability. The ability to share data is related to factors such as bandwidth and signal-to-noise ratio, which affect the information transmitted by the second device. For example, it indicates whether the second device can transmit a large number of two-dimensional images, or whether the second device can transmit information about the pixels in the region where point cloud data is projected onto the two-dimensional image.

[0216] Computational capability information indicates whether the second device has data analysis and computation capabilities, and the strength of these capabilities. These capabilities are related to the configuration of the second device's central processing unit (CPU), graphics processing unit (GPU), and other units. These configurations affect the data ultimately transmitted by the second device, such as whether it transmits two-dimensional images, which have a low computational requirement, or information about the pixels in the region where point cloud data is projected onto the two-dimensional image, which has a high computational requirement.

[0217] Storage capacity information indicates the storage capacity of the second device. This storage capacity information is related to the configuration of the second device, such as its memory and disk. The storage capacity of the second device affects its participation in multimodal sensing. For example, if the second device caches a large amount of sensing data, the sensing cost required is reduced compared to the second device re-sensing and collecting sensing data.

[0218] The internal and external parameter transmission information is used to indicate whether the second device supports the transmission of internal and external parameters (i.e., internal parameters and external parameters) and two-dimensional images. The internal parameter can be at least one of the following parameters of the second device: focal length, resolution, or field of view. The external parameter can be at least one of the following parameters of the second device: position or orientation. The two-dimensional image is an image obtained by the second device through two-dimensional perception of the object (such as optical perception).

[0219] Neighborhood feature acquisition information can also be called neighborhood feature extraction and transmission information. This neighborhood feature acquisition information is used to indicate whether the second device supports neighborhood feature extraction and the transmission of the extracted neighborhood features. The neighborhood is the region where the point cloud data is projected onto the two-dimensional image. For details, please refer to the aforementioned introductions of "first neighborhood i" and "second neighborhood i" for further understanding, which will not be repeated here. Neighborhood features can be understood as the image features corresponding to the neighborhood. These image features are related to pixels, such as feature extraction based on the pixel values corresponding to the pixels in the neighborhood. It can be understood that sending neighborhood features to other devices by the second device, compared to sending neighborhood information to other devices (see the related introduction of neighborhood extraction acquisition information below), can further protect privacy and reduce data transmission volume.

[0220] The neighborhood matching degree acquisition information can also be called the matching degree calculation and transmission information. This neighborhood matching degree acquisition information is used to indicate whether the second device supports providing neighborhood matching degrees to other devices (such as the first device). That is, the second device can receive data (such as point cloud data) from other devices, and calculate and transmit the neighborhood matching degree based on this data.

[0221] The content indicated by the above information can be flexibly set according to the actual situation without restriction. Furthermore, the names of the above information are merely exemplary; the specific names can be flexibly set according to the actual situation without restriction.

[0222] For example, the first capability parameter is also used to indicate internal and external parameter transmission information, neighborhood feature acquisition information, and neighborhood matching degree acquisition information. The internal and external parameter transmission information indicates that the second device does not support transmitting internal and external parameters and two-dimensional images to other devices. The neighborhood feature acquisition information indicates that the second device supports neighborhood feature extraction and transmission of the extracted neighborhood features. The neighborhood matching degree acquisition information indicates that the second device does not support providing neighborhood matching degrees to other devices.

[0223] Continuing the example above, when the second device reports its capabilities to the first device, if the second device and the first device have pre-agreed on the capabilities that need to be reported, the second device can indicate the status of each capability corresponding to itself to the first device based on the pre-agreed capabilities. That is, the first capability parameter is the parameter indicating the status of each capability corresponding to the second device. For example, as shown in Table 1 below, the first device and the second device have pre-agreed on the capabilities of the optical sensing node to be reported, namely, internal and external parameter transmission information, neighborhood extraction information (described below), neighborhood feature acquisition information, and neighborhood matching degree acquisition information. The second device can send the status corresponding to each capability to the first device based on the capabilities of the optical sensing node and its own situation.

[0224] Table 1

[0225] It is understood that in this embodiment of the application (i.e., the first possible implementation), the state of the neighborhood extraction capability is "yes". The states of other capabilities in Table 1 besides the neighborhood extraction capability can be set according to the actual situation without restriction.

[0226] In this embodiment of the application, by indicating the capabilities of the second device to the first device through the first capability parameter, the first device can determine the capabilities of the second device (or the device in which the second device is located) based on the first capability parameter, and thus determine the operation of the second device in multimodal sensing based on the capabilities of the second device (or the device in which the second device is located).
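The capability-based decision described above can be sketched as follows. The class `CapabilityParameter`, its boolean fields, and the request kinds `"pixel_values"` and `"image_features"` are hypothetical names chosen for illustration; the application does not define an encoding for the capability parameter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CapabilityParameter:
    """Reported states of a few capabilities of the second device (assumed encoding)."""
    param_transmission: bool        # internal/external parameters and 2D images
    neighborhood_extraction: bool   # pixel values of the projected region
    neighborhood_feature: bool      # image features of the projected region
    matching_degree: bool           # neighborhood matching degree itself


def should_send_first_message(cap: CapabilityParameter, requested_info: str) -> bool:
    """Return True only if the second device reported support for the requested info."""
    supported = {
        "pixel_values": cap.neighborhood_extraction,
        "image_features": cap.neighborhood_feature,
    }
    if requested_info not in supported:
        raise ValueError(f"unknown request kind: {requested_info}")
    return supported[requested_info]


# Illustrative report: extraction supported, the other capabilities not supported.
cap = CapabilityParameter(
    param_transmission=False,
    neighborhood_extraction=True,
    neighborhood_feature=False,
    matching_degree=False,
)
send_pixels = should_send_first_message(cap, "pixel_values")
send_features = should_send_first_message(cap, "image_features")
```

Checking the reported state before sending means that a request the second device cannot serve is never transmitted, which is the overhead saving described above.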

[0227] Furthermore, when the first information includes first neighborhood information i and second neighborhood information i, the determination of the neighborhood matching degree of the i-th point cloud data by the first device based on the first information can specifically include: the first device determining the neighborhood matching degree of the i-th point cloud data based on the first neighborhood information i, the second neighborhood information i, and the matching degree calculation method, wherein the matching degree calculation method is any one of the following: normalized squared difference, cumulative density function, or neighborhood feature calculation (described below). It is understood that the matching degree calculation method can also be other methods, without limitation. In this way, the first device can accurately determine the neighborhood matching degree of the i-th point cloud data.
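As an illustration of one matching degree calculation method, the sketch below computes a normalized squared difference between two neighborhoods of equal size. The exact formula is not given in this section, so the normalization used here (sum of squared differences divided by the square root of the product of the two energy sums) and the mapping 1 / (1 + d) from difference to matching degree are assumptions.

```python
import math
from typing import List

Patch = List[List[float]]


def normalized_squared_difference(a: Patch, b: Patch) -> float:
    """Sum of squared differences, normalized by the energies of both patches."""
    flat_a = [p for row in a for p in row]
    flat_b = [p for row in b for p in row]
    if len(flat_a) != len(flat_b):
        raise ValueError("both neighborhoods must contain the same number of pixels")
    num = sum((x - y) ** 2 for x, y in zip(flat_a, flat_b))
    den = math.sqrt(sum(x * x for x in flat_a) * sum(y * y for y in flat_b))
    if den == 0:
        # At least one patch is all zeros: identical patches differ by 0,
        # otherwise the normalized difference is unbounded.
        return 0.0 if num == 0 else math.inf
    return num / den


def matching_degree(a: Patch, b: Patch) -> float:
    """Map the difference d >= 0 to a degree in [0, 1]; 1.0 means identical (assumed mapping)."""
    return 1.0 / (1.0 + normalized_squared_difference(a, b))


first_neighborhood = [[10.0, 10.0], [10.0, 10.0]]
similar = [[10.0, 10.0], [10.0, 10.0]]
different = [[0.0, 0.0], [0.0, 10.0]]

degree_same = matching_degree(first_neighborhood, similar)
degree_diff = matching_degree(first_neighborhood, different)
```

With this mapping, a real point whose two neighborhoods image the same region yields a degree near 1, while mismatched neighborhoods yield a lower degree.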

[0228] In a second possible implementation, where the first information includes a first neighborhood feature i and a second neighborhood feature i, and before the first device sends the first message to the second device, the communication method may further include: the second device sending a second capability parameter to the first device, and correspondingly, the first device receiving the second capability parameter from the second device, the second capability parameter indicating that the second device supports providing neighborhood features to the first device, the neighborhood features indicating the image features of the region where the point cloud data is projected onto the two-dimensional image; the first device sending the first message to the second device may specifically include: the first device sending the first message to the second device according to the second capability parameter; or, the first device determining to send the first message to the second device according to the second capability parameter.

[0229] The second capability parameter can also be understood as the second device's ability to provide neighborhood features to the first device, or the second device having the capability to provide neighborhood features to the first device. It can be understood that when the second device is a chip, the device containing the second device has the capability to provide neighborhood features to the first device.

[0230] In this embodiment, after receiving the second capability parameter, the first device can determine that the second device (or the device containing the second device) has the capability to provide neighborhood features to the first device. At this time, the first device can send (or determine to send) a first message to the second device, requesting the second device to send image features corresponding to the region where N point cloud data are projected onto the 2D image. This prevents the first device from sending a request that the second device cannot fulfill, thus avoiding additional communication overhead.

[0231] Furthermore, the aforementioned first message, used to request information about pixels in the region where the N point cloud data are projected onto the 2D image, can specifically include: the first message requesting image features corresponding to the region where the N point cloud data are projected onto the 2D image. This avoids the second device providing incorrect information to the first device, such as providing pixel values corresponding to pixels in the region where the N point cloud data are projected onto the 2D image, thus preventing additional communication overhead.

[0232] Furthermore, the second capability parameter is also used to indicate at least one of the following information of the second device: information protection level information, sensing capability information, communication capability information, computing capability information, storage capability information, internal and external parameter transmission information, neighborhood extraction acquisition information, or neighborhood matching degree acquisition information.

[0233] The neighborhood extraction acquisition information can also be called neighborhood extraction and transmission information, or other names, without limitation. It indicates whether the second device supports neighborhood extraction and transmission of the extracted neighborhood information. The neighborhood is the region where the point cloud data is projected onto the two-dimensional image. For details, please refer to the aforementioned introductions of "first neighborhood i" and "second neighborhood i," which will not be repeated here. Neighborhood extraction can be understood as obtaining relevant information about the neighborhood, such as the pixel values corresponding to the pixels in the neighborhood.

[0234] For other information, please refer to the aforementioned related introductions, which will not be repeated here.

[0235] For example, the second capability parameter is also used to indicate internal and external parameter transmission information, neighborhood extraction acquisition information, and neighborhood matching degree acquisition information. The internal and external parameter transmission information indicates that the second device does not support transmitting internal and external parameters and two-dimensional images to other devices. The neighborhood extraction acquisition information indicates that the second device does not support neighborhood extraction and transmission operations. The neighborhood matching degree acquisition information indicates that the second device does not support providing neighborhood matching degrees to other devices.

[0236] Continuing the example above, when the second device reports its capabilities to the first device, if the second device and the first device have pre-agreed on the capabilities that need to be reported, the second device can indicate the status of each capability corresponding to the second device to the first device according to the pre-agreed capabilities. For details, please refer to the aforementioned related descriptions; they will not be repeated here. It is understood that in this embodiment of the application (i.e., the second possible implementation), the status of the neighborhood feature acquisition capability is "yes," and the status of the optical sensing node capabilities other than the neighborhood feature acquisition capability can be set according to the actual situation without restriction.

[0237] In this embodiment of the application, by indicating the capabilities of the second device to the first device through the second capability parameters, the first device can determine the capabilities of the second device based on the second capability parameters, and thus determine the operation of the second device in multimodal sensing based on the capabilities of the second device.
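The pre-agreed capability report described above can be pictured as a fixed-order list of yes/no states. The following is a minimal sketch under stated assumptions: the capability names, their order, and the bitmap encoding are hypothetical illustrations and are not specified by the application.

```python
# Hypothetical pre-agreed capability order; the names paraphrase the text.
CAPABILITIES = (
    "param_and_image_transfer",  # internal and external parameters plus 2D images
    "neighborhood_extraction",   # pixel values of the projected region
    "neighborhood_feature",      # image features of the projected region
    "neighborhood_matching",     # neighborhood matching degree
)


def encode_report(status: dict[str, bool]) -> int:
    """Pack one yes/no state per pre-agreed capability into a bitmap."""
    bits = 0
    for index, name in enumerate(CAPABILITIES):
        if status.get(name, False):
            bits |= 1 << index
    return bits


def decode_report(bits: int) -> dict[str, bool]:
    """Recover the per-capability states from the bitmap."""
    return {name: bool((bits >> index) & 1) for index, name in enumerate(CAPABILITIES)}


# Second possible implementation: the neighborhood feature capability is "yes".
report = encode_report({"neighborhood_feature": True})
status = decode_report(report)

# The first device sends the first message only if the capability is reported.
may_send_first_message = status["neighborhood_feature"]
```

Because both sides share the same order, only the states need to be transmitted, and the first device can gate its request on the decoded state.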

[0238] Furthermore, when the first information includes the first neighborhood feature i and the second neighborhood feature i, the first device determining the neighborhood matching degree of the i-th point cloud data based on the first information can specifically include: the first device determining the neighborhood matching degree of the i-th point cloud data based on the first neighborhood feature i, the second neighborhood feature i, and the neighborhood feature calculation method (described below). In this way, the first device can accurately determine the neighborhood matching degree of the i-th point cloud data.
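The neighborhood feature calculation method itself is described later in the application. Purely as an illustration, the sketch below assumes that each neighborhood feature is a fixed-length numeric vector and uses cosine similarity as a stand-in measure; the function name, the vector form, and the choice of cosine similarity are all assumptions.

```python
import math
from typing import Sequence


def feature_matching_degree(feature_1: Sequence[float], feature_2: Sequence[float]) -> float:
    """Cosine similarity of two neighborhood feature vectors (assumed stand-in)."""
    if len(feature_1) != len(feature_2) or not feature_1:
        raise ValueError("features must be non-empty and of equal length")
    dot = sum(a * b for a, b in zip(feature_1, feature_2))
    norm = math.sqrt(sum(a * a for a in feature_1)) * math.sqrt(sum(b * b for b in feature_2))
    # An all-zero feature carries no information; treat it as no match.
    return dot / norm if norm > 0 else 0.0


# Hypothetical features of the first and second neighborhood of point i.
first_neighborhood_feature_i = [0.7, 0.2, 0.1, 0.0]
second_neighborhood_feature_i = [0.6, 0.3, 0.1, 0.0]
degree_i = feature_matching_degree(first_neighborhood_feature_i, second_neighborhood_feature_i)
```

A value close to 1 suggests the two projected regions look alike in feature space; a value near 0 suggests they do not.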

[0239] Case 7.2: The first device obtains the neighborhood matching degree of the i-th point cloud data from the second device.

[0240] That is, the first device acquiring the neighborhood matching degree of each of the N point cloud data can specifically include: the first device sending a second message to the second device, and correspondingly, the second device receiving the second message from the first device. The second message is used to request the neighborhood matching degrees corresponding to the N point cloud data. The neighborhood matching degree corresponding to the i-th point cloud data indicates the image similarity of the regions where the i-th point cloud data is projected onto multiple two-dimensional images, and the multiple two-dimensional images are obtained by perceiving the target object. In response to the second message, the second device sends the neighborhood matching degree of each of the N point cloud data to the first device, and correspondingly, the first device receives the neighborhood matching degree of each of the N point cloud data from the second device.

[0241] The second message mentioned above may include neighborhood description information; for details, refer to the relevant introduction in "Case 7.1" above, which will not be repeated here.

[0242] The second message can also include N point cloud data points, that is, the coordinates corresponding to the N point cloud data points. It can be understood that the N point cloud data points can be sent to the second device through other messages, such as the first device sending the second message and then sending the N point cloud data points to the second device. In other words, the N point cloud data points and the neighborhood description information do not necessarily need to be carried in the same message; this can be flexibly set according to the actual situation without any restrictions.

[0243] In this embodiment, the first device can request neighborhood matching degrees corresponding to N point cloud data from the second device. After receiving the request, the second device can calculate the neighborhood matching degrees corresponding to the N point cloud data based on the N point cloud data and multiple two-dimensional images, and send the calculated neighborhood matching degrees of each of the N point cloud data to the first device. This reduces the computational overhead of the first device.
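The request and response exchange of Case 7.2 can be sketched as follows. The class names, field names, and the placeholder scoring function are hypothetical; the application does not define a message layout, and the neighborhood description information is kept opaque here.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

Point = tuple[float, float, float]


@dataclass
class SecondMessage:
    """Hypothetical request for neighborhood matching degrees."""
    neighborhood_description: dict  # opaque here; its fields are not modeled
    points: Sequence[Point]         # coordinates of the N point cloud data


@dataclass
class MatchingDegreeResponse:
    """Hypothetical reply: one matching degree per requested point, in order."""
    degrees: list[float]


def handle_second_message(
    message: SecondMessage,
    images: Sequence[object],
    matching_degree: Callable[[Point, Sequence[object], dict], float],
) -> MatchingDegreeResponse:
    """Second-device side: compute the degrees so the first device does not have to."""
    return MatchingDegreeResponse(
        degrees=[
            matching_degree(point, images, message.neighborhood_description)
            for point in message.points
        ]
    )


# Placeholder scorer so the sketch runs; a real one would compare image regions.
def constant_scorer(point, images, description):
    return 0.5


request = SecondMessage(neighborhood_description={}, points=[(1.0, 2.0, 10.0), (0.0, 0.0, 5.0)])
response = handle_second_message(request, images=["image_1", "image_2"], matching_degree=constant_scorer)
```

The point coordinates are carried in the request here, but, as noted above, they could equally be sent in a separate message.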

[0244] It can be understood that the first device can request the second device to send the neighborhood matching degree of each of the N point cloud data based on the capability reported by the second device.

[0245] For example, before the first device sends the second message to the second device, the communication method may further include: the second device sending a third capability parameter to the first device, and correspondingly, the first device receiving the third capability parameter from the second device. The third capability parameter is used to indicate that the second device supports providing the first device with the neighborhood matching degree corresponding to the three-dimensional point cloud data. The neighborhood matching degree corresponding to the three-dimensional point cloud data is used to indicate the image similarity of the region where the three-dimensional point cloud data is projected onto multiple two-dimensional images. The first device sending the second message to the second device may specifically include: the first device sending the second message to the second device according to the third capability parameter.

[0246] The third capability parameter can also be understood as the second device being able to provide the first device with the neighborhood matching degree corresponding to the 3D point cloud data, or the second device having the capability to provide the first device with the neighborhood matching degree corresponding to the 3D point cloud data. It can be understood that when the second device is a chip, the device containing the second device has the capability to provide the first device with the neighborhood matching degree corresponding to the 3D point cloud data.

[0247] The first device sending the second message to the second device according to the third capability parameter can be understood as the first device determining, according to the third capability parameter, to send the second message to the second device.

[0248] In this embodiment, after receiving the third capability parameter, the first device can determine that the second device (or the device containing the second device) is capable of providing the first device with the neighborhood matching degrees corresponding to 3D point cloud data. The first device can then determine to send the second message to the second device, requesting the neighborhood matching degrees corresponding to the N point cloud data. Because the request is sent only when the capability is supported, this avoids a failed request and the additional communication overhead it would cause.

[0249] Furthermore, the third capability parameter is also used to indicate at least one of the following information of the second device: information protection level information, sensing capability information, communication capability information, computing capability information, storage capability information, internal and external parameter transmission information, neighborhood extraction acquisition information, or neighborhood feature acquisition information. For each piece of information, refer to the relevant description in "Case 7.1" above, which will not be repeated here.

[0250] For example, the third capability parameter is also used to indicate internal and external parameter transmission information, neighborhood extraction acquisition information, and neighborhood feature acquisition information. The internal and external parameter transmission information indicates that the second device does not support transmitting internal and external parameters and two-dimensional images to other devices. The neighborhood extraction acquisition information indicates that the second device does not support neighborhood extraction and transmission of the extracted neighborhood information. The neighborhood feature acquisition information indicates that the second device does not support neighborhood feature extraction and transmission of the extracted neighborhood features.

[0251] Continuing the example above, when the second device reports its capabilities to the first device, if the second device and the first device have pre-agreed on the capabilities that need to be reported, the second device can indicate the status of each capability to the first device according to the pre-agreed capabilities. For details, please refer to the relevant description in "Case 7.1" above; it will not be repeated here. It can be understood that in Case 7.2, the status of the neighborhood matching degree acquisition capability is "yes," and the status of the optical sensing node capabilities other than the neighborhood matching degree acquisition capability can be set according to the actual situation without restriction.

[0252] In this embodiment of the application, by indicating the capabilities of the second device to the first device through the third capability parameter, the first device can determine the capabilities of the second device based on the third capability parameter, and thus determine the operation of the second device in multimodal sensing based on the capabilities of the second device.

[0253] Optionally, the above communication method may further include: a second device projecting the i-th point cloud data onto a first image and a second image; the second device determining a first neighborhood i and a second neighborhood i based on the first image and the second image mapped with the i-th point cloud data; the second device determining first neighborhood information i and second neighborhood information i based on the first neighborhood i and the second neighborhood i, wherein the first neighborhood information i includes the pixel value corresponding to the pixel in the first neighborhood i, and the second neighborhood information i includes the pixel value corresponding to the pixel in the second neighborhood i; and the second device determining the neighborhood matching degree of the i-th point cloud data based on the first neighborhood information i and the second neighborhood information i (described below). In this way, the second device can accurately obtain the neighborhood matching degree of the i-th point cloud data.

[0254] Optionally, the above communication method may further include: a second device projecting the i-th point cloud data onto a first image and a second image; the second device determining a first neighborhood i and a second neighborhood i based on the first image and the second image mapped with the i-th point cloud data; the second device determining a first neighborhood feature i and a second neighborhood feature i based on the first neighborhood i and the second neighborhood i, where the first neighborhood feature i indicates the image features of the first neighborhood i and the second neighborhood feature i indicates the image features of the second neighborhood i; and the second device determining the neighborhood matching degree of the i-th point cloud data based on the first neighborhood feature i and the second neighborhood feature i (described below). In this way, the second device can accurately obtain the neighborhood matching degree of the i-th point cloud data.

[0255] Case 7.3: The first device determines the neighborhood matching degree of the i-th point cloud data based on the parameter information, the first image, and the second image. The parameter information is used to project N point cloud data onto the first image and the second image.

[0256] In other words, the first device's acquisition of the neighborhood matching degree of each of the N point cloud data points can specifically include: the first device acquiring parameter information, a first image, and a second image, and determining the neighborhood matching degree of each of the N point cloud data points based on the parameter information, the first image, and the second image. The parameter information is used to project the N point cloud data points onto the first image and the second image. That is, the first device can project the N point cloud data points onto the first image and the second image based on the parameter information, the first image, and the second image, and then determine the neighborhood matching degree of each of the N point cloud data points based on the projected first image and the second image. This enables the first device to accurately acquire the neighborhood matching degree of each of the N point cloud data points.

[0257] Furthermore, the first device acquiring the parameter information, the first image, and the second image may specifically include: the second device acquiring the parameter information, the first image, and the second image, where the parameter information indicates the relevant parameters with which the second device performs two-dimensional perception of the target object, the first image and the second image are obtained by the second device performing two-dimensional perception of the target object, and the first image and the second image are different; and the second device sending the parameter information, the first image, and the second image to the first device, and correspondingly, the first device receiving the parameter information, the first image, and the second image from the second device.

[0258] The aforementioned relevant parameters can be the intrinsic and extrinsic parameters of the second device (i.e., the internal and external parameters). These relevant parameters can be used to project the N point cloud data onto the first and second images. For details, please refer to the conversion process from the world coordinate system to the pixel coordinate system in "1.1 Optical Perception" above; it will not be elaborated upon here.

[0259] For example, the aforementioned parameters may include at least one of the following information about the second device: focal length, resolution, field of view, position, or pointing. For the focal length, resolution, and field of view, refer to the relevant descriptions in "1.1 Optical Sensing" above; they will not be repeated here. Position refers to the location of the second device in the world coordinate system, or the position of the second device relative to a certain device (such as the first device); pointing refers to the direction in which the second device is oriented, such as its orientation relative to a certain device (such as the first device). The position and/or pointing of the second device can be flexibly set according to the actual situation and are not limited.
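To show how focal length, resolution, and field of view relate to one another, the sketch below builds an intrinsic parameter matrix under the standard pinhole camera model. It assumes square pixels, a principal point at the image center, and a horizontal field of view; the function names are hypothetical, and the position and pointing (which determine the extrinsic parameters) are not modeled.

```python
import math


def focal_length_pixels(width_px: int, horizontal_fov_deg: float) -> float:
    """Focal length in pixels implied by image width and horizontal field of view."""
    return (width_px / 2) / math.tan(math.radians(horizontal_fov_deg) / 2)


def intrinsic_matrix(width_px: int, height_px: int, horizontal_fov_deg: float) -> list[list[float]]:
    """3x3 intrinsic matrix, assuming square pixels and a centered principal point."""
    f = focal_length_pixels(width_px, horizontal_fov_deg)
    return [
        [f, 0.0, width_px / 2],
        [0.0, f, height_px / 2],
        [0.0, 0.0, 1.0],
    ]


# A 640x480 image with a 90-degree horizontal field of view.
K = intrinsic_matrix(640, 480, 90.0)
```

With a 90-degree field of view the focal length in pixels equals half the image width, which is a convenient sanity check.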

[0260] In this embodiment, the first device can obtain parameter information, a first image, and a second image from the second device. It is understood that the second device can periodically send to the first device relevant parameters for two-dimensional perception of the target object, as well as two-dimensional images (such as the first and second images) obtained from the perception of the target object. The first device can also request relevant parameters for two-dimensional perception of the target object, as well as two-dimensional images obtained from the perception of the target object, from the second device to obtain parameter information, the first image, and the second image.

[0261] For example, before the first device receives the parameter information, the first image, and the second image from the second device, the communication method may further include: the first device sending a third message to the second device, and correspondingly, the second device receiving the third message from the first device. The third message is used to request, from the second device, the relevant parameters for two-dimensional perception of the target object and at least one image. Specifically, the second device sending the parameter information, the first image, and the second image to the first device may include: in response to the third message, the second device sending the parameter information, the first image, and the second image to the first device. This enables the first device to obtain the parameter information, the first image, and the second image in a timely manner.

[0262] It can be understood that the first device can request the second device to send the relevant parameters for two-dimensional perception of the target object and at least one image based on the capabilities reported by the second device.

[0263] For example, before the first device receives parameter information, the first image, and the second image from the second device, the above communication method may further include: the second device sending a fourth capability parameter to the first device, and correspondingly, the first device receiving the fourth capability parameter from the second device, the fourth capability parameter being used to indicate that the second device supports providing the first device with relevant parameters and images for two-dimensional perception of objects; the first device sending a third message to the second device may specifically include: the first device sending a third message to the second device according to the fourth capability parameter.

[0264] The fourth capability parameter can also be understood as the second device being able to provide the first device with relevant parameters and images for two-dimensional object perception, or the second device having the capability to provide the first device with relevant parameters and images for two-dimensional object perception. It can be understood that when the second device is a chip, the device containing the second device has the capability to provide the first device with relevant parameters and images for two-dimensional object perception.

[0265] The first device sending a third message to the second device based on the fourth capability parameter can be understood as the first device determining to send a third message to the second device based on the fourth capability parameter.

[0266] In this embodiment, after receiving the fourth capability parameter, the first device can determine that the second device (or the device containing the second device) is capable of providing the first device with relevant parameters and images for two-dimensional object perception. The first device can then determine to send the third message to the second device, requesting the relevant parameters and images for two-dimensional object perception. Because the request is sent only when the capability is supported, this avoids a failed request and the additional communication overhead it would cause.

[0267] Furthermore, the fourth capability parameter is also used to indicate at least one of the following information of the second device: information protection level information, sensing capability information, communication capability information, computing capability information, storage capability information, neighborhood extraction acquisition information, neighborhood feature acquisition information, or neighborhood matching degree acquisition information. For each piece of information, refer to the relevant description in "Case 7.1" above, which will not be repeated here.

[0268] For example, the fourth capability parameter is also used to indicate neighborhood extraction acquisition information, neighborhood feature acquisition information, and neighborhood matching degree acquisition information. The neighborhood extraction acquisition information indicates that the second device supports neighborhood extraction and transmission of the extracted neighborhood information. The neighborhood feature acquisition information indicates that the second device supports neighborhood feature extraction and transmission of the extracted neighborhood features. The neighborhood matching degree acquisition information indicates that the second device does not support providing neighborhood matching degrees to other devices.

[0269] Continuing the example above, when the second device reports its capabilities to the first device, if the second device and the first device have pre-agreed on the capabilities that need to be reported, the second device can indicate the status of each capability to the first device according to the pre-agreed capabilities. For details, please refer to the relevant description in "Case 7.1" above; it will not be repeated here. It can be understood that in Case 7.3, the status of the internal and external parameter transmission capability is "yes," and the status of the optical sensing node capabilities other than the internal and external parameter transmission capability can be set according to the actual situation without restriction.

[0270] In this embodiment of the application, by indicating the capabilities of the second device to the first device through the fourth capability parameter, the first device can determine the capabilities of the second device (or the device in which the second device is located) based on the fourth capability parameter, and thus determine the operation of the second device in multimodal sensing based on the capabilities of the second device (or the device in which the second device is located).

[0271] As can be understood, the above content describes how the first device acquires parameter information, the first image, and the second image. After acquiring the parameter information, the first image, and the second image, the first device can determine the neighborhood matching degree of the i-th point cloud data based on the parameter information, the first image, and the second image. These will be described in detail below.

[0272] In one possible implementation, the first device obtaining the neighborhood matching degree of each of the N point cloud data points based on parameter information, a first image, and a second image can specifically include: the first device projecting the i-th point cloud data point onto the first image and the second image based on the parameter information; the first device determining a first neighborhood i and a second neighborhood i based on the first image and the second image mapped with the i-th point cloud data point; the first device determining first neighborhood information i and second neighborhood information i based on the first neighborhood i and the second neighborhood i, where the first neighborhood information i includes the pixel value corresponding to the pixel point in the first neighborhood i, and the second neighborhood information i includes the pixel value corresponding to the pixel point in the second neighborhood i; and the first device determining the neighborhood matching degree of the i-th point cloud data point based on the first neighborhood information i and the second neighborhood information i (described below). In this way, the first device can accurately determine the neighborhood matching degree of the i-th point cloud data point.
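The calculation of the neighborhood matching degree from pixel values is described later in the application. As a hedged illustration only, the sketch below scores two equally sized neighborhoods with zero-mean normalized cross-correlation, a common patch similarity measure that is swapped in here as an assumption; the function and variable names are hypothetical.

```python
import math
from typing import Sequence

Patch = Sequence[Sequence[float]]


def pixel_matching_degree(neighborhood_1: Patch, neighborhood_2: Patch) -> float:
    """Zero-mean normalized cross-correlation of two equally sized patches."""
    a = [value for row in neighborhood_1 for value in row]
    b = [value for row in neighborhood_2 for value in row]
    if len(a) != len(b) or not a:
        raise ValueError("neighborhoods must be non-empty and of equal size")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denominator = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denominator == 0:
        return 0.0  # a flat patch has no texture to correlate
    return sum(x * y for x, y in zip(da, db)) / denominator


# Hypothetical pixel values of the first and second neighborhood of point i.
first_neighborhood_information_i = [[10, 20], [30, 40]]
second_neighborhood_information_i = [[12, 22], [32, 42]]  # same texture, slightly brighter
degree_i = pixel_matching_degree(first_neighborhood_information_i, second_neighborhood_information_i)
```

Subtracting the mean makes the score insensitive to a uniform brightness offset between the two images, which is why the example above scores as a near-perfect match.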

[0273] In another possible implementation, the first device obtaining the neighborhood matching degree of each of the N point cloud data based on parameter information, a first image, and a second image can specifically include: the first device projecting the i-th point cloud data onto the first image and the second image based on the parameter information; the first device determining a first neighborhood i and a second neighborhood i based on the first image and the second image mapped with the i-th point cloud data; the first device determining a first neighborhood feature i and a second neighborhood feature i based on the first neighborhood i and the second neighborhood i, where the first neighborhood feature i indicates the image features of the first neighborhood i and the second neighborhood feature i indicates the image features of the second neighborhood i; and the first device determining the neighborhood matching degree of the i-th point cloud data based on the first neighborhood feature i and the second neighborhood feature i (described below). In this way, the first device can accurately determine the neighborhood matching degree of the i-th point cloud data.

[0274] It can be understood that, as described above through Cases 7.1-7.3, the first device can obtain the neighborhood matching degree of each of the N point cloud data based on different information. It can also be understood that, before the first device obtains the neighborhood matching degree of each of the N point cloud data (i.e., S701), the second device can report its capability parameters to the first device, so that the first device can determine, based on these capability parameters, which information to obtain from the second device.

[0275] For example, the above communication method may further include: the second device sending a fifth capability parameter to the first device, and correspondingly, the first device receiving the fifth capability parameter from the second device. The fifth capability parameter is used to indicate at least one of the following information of the second device: information protection level information, sensing capability information, communication capability information, computing capability information, storage capability information, internal and external parameter transmission information, neighborhood extraction acquisition information, neighborhood feature acquisition information, or neighborhood matching degree acquisition information. For each piece of information, refer to the relevant description in "Case 7.1" above, which will not be repeated here.

[0276] After receiving the fifth capability parameter, the first device can determine the information to be requested from the second device based on the fifth capability parameter.

[0277] For example, the fifth capability parameter includes internal and external parameter transmission information, which indicates that the second device supports transmitting internal and external parameters and two-dimensional images to other devices. Based on this fifth capability parameter, the first device can request from the second device the relevant parameters for two-dimensional perception of the target object and at least one image. Furthermore, after receiving the parameter information, the first image, and the second image from the second device, the first device can obtain the neighborhood matching degree of each of the N point cloud data based on the parameter information, the first image, and the second image.

[0278] For example, the fifth capability parameter includes internal and external parameter transmission information, neighborhood extraction acquisition information, neighborhood feature acquisition information, and neighborhood matching degree acquisition information. The internal and external parameter transmission information indicates that the second device does not support transmitting internal and external parameters and 2D images to other devices. The neighborhood extraction acquisition information indicates that the second device supports neighborhood extraction and transmission of the extracted neighborhood information. The neighborhood feature acquisition information indicates that the second device supports neighborhood feature extraction and transmission of the extracted neighborhood features. The neighborhood matching degree acquisition information indicates that the second device does not support providing neighborhood matching degrees to other devices. Based on the fifth capability parameter and on the method the first device uses to acquire the neighborhood matching degree of point cloud data, the first device can determine whether to request neighborhood information or neighborhood features from the second device. If the first device calculates the neighborhood matching degree using neighborhood information, it can request from the second device the pixel values corresponding to the pixels in the regions where the N point cloud data are projected onto the 2D image. After receiving the neighborhood information, the first device can acquire the neighborhood matching degree of each of the N point cloud data based on that information.

[0279] It can be understood that the second device may not support information exposure at all; that is, it does not support transmitting internal and external parameters and 2D images, does not support neighborhood extraction and transmission of the extracted neighborhood information, does not support neighborhood feature extraction and transmission of the extracted neighborhood features, and does not support providing neighborhood matching degrees to other devices. In this case, any request from the first device fails, and the first device cannot obtain any of this information (such as internal and external parameter information, 2D images, or extracted neighborhood information) from the second device. Furthermore, if the first device requests the second device to report its capability parameters and the second device does not support information exposure, the second device may choose not to send a capability reporting response message, may set all the states in Table 1 above to "No" before reporting, or may directly send a "fail" command.
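The way the first device could map a reported fifth capability parameter to a request can be sketched as follows. The dictionary keys, the returned labels, and in particular the priority order among the cases are assumptions made for illustration; the application leaves the choice to the first device.

```python
from typing import Optional


def choose_request(capability: dict[str, bool], uses_features: bool) -> Optional[str]:
    """Pick what the first device asks for; None means nothing can be requested.

    The priority order below is an arbitrary illustrative choice.
    """
    if capability.get("neighborhood_matching"):
        return "second message: neighborhood matching degrees"   # Case 7.2
    if uses_features and capability.get("neighborhood_feature"):
        return "first message: neighborhood features"            # Case 7.1
    if not uses_features and capability.get("neighborhood_extraction"):
        return "first message: neighborhood information"         # Case 7.1
    if capability.get("param_and_image_transfer"):
        return "third message: parameters and images"            # Case 7.3
    return None  # no information exposure supported


# The example of paragraph [0278], with a first device that works on pixel values.
example = {
    "param_and_image_transfer": False,
    "neighborhood_extraction": True,
    "neighborhood_feature": True,
    "neighborhood_matching": False,
}
selected = choose_request(example, uses_features=False)

# A second device that reports every state as "No".
nothing = choose_request({name: False for name in example}, uses_features=False)
```

Returning `None` corresponds to the situation in which every state is "No" and no request can succeed.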

[0280] The following example uses one piece of point cloud data and two 2D images to illustrate a specific method for obtaining the neighborhood matching degree corresponding to that point cloud data. It can be understood that this method can be used to obtain the neighborhood matching degree of each of the aforementioned N point cloud data.

[0281] Step 1: Project the point cloud data onto images #1 and #2 to obtain projection point K1 on image #1 and projection point K2 on image #2.

[0282] That is, the point cloud data is projected from the world coordinate system to the pixel coordinate system; in other words, the coordinates of the point cloud data are transformed into the corresponding coordinates in the pixel coordinate system.

[0283] The point cloud data is projected onto the 2D image obtained by optical sensing. The projection process is shown in the following formula:

[0284] Here, $[X_P, Y_P, Z_P]$ represents the coordinates of the point cloud data, and $[u_p, v_p]$ represents the coordinates of the point cloud data projected onto the pixel coordinate system. The formula further uses the extrinsic parameter matrix of the camera and the intrinsic parameter matrix, which transforms the camera coordinate system to the pixel coordinate system.

[0285] It is understandable that the coordinate transformation process of point cloud data can be understood by referring to the transformation process from the world coordinate system to the pixel coordinate system in "1.1 Optical Perception" above, and will not be repeated here. It is also understandable that the coordinates after projecting the point cloud data onto images #1 and #2 can be determined by the above formula (1).
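As an illustration of Step 1, the following minimal sketch projects one point from the world coordinate system to the pixel coordinate system. It assumes the standard pinhole camera model; the names `R` (rotation), `t` (translation), and `K` (intrinsic matrix) are illustrative choices and are not symbols taken from this application.

```python
def project_point(point_w, R, t, K):
    """Project a 3D world point to pixel coordinates (u, v), or None if not visible."""
    # World -> camera coordinates using the extrinsic parameters: p_c = R * p_w + t
    pc = [sum(R[r][c] * point_w[c] for c in range(3)) + t[r] for r in range(3)]
    depth = pc[2]
    if depth <= 0:
        return None  # the point lies behind the camera
    # Camera -> pixel coordinates using the intrinsic matrix, then divide by depth
    u = (K[0][0] * pc[0] + K[0][1] * pc[1] + K[0][2] * pc[2]) / depth
    v = (K[1][1] * pc[1] + K[1][2] * pc[2]) / depth
    return u, v


# Assumed example values: camera at the world origin, focal length 100 px,
# principal point at (320, 240).
R_identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
t_zero = [0, 0, 0]
K_example = [[100, 0, 320], [0, 100, 240], [0, 0, 1]]
k1 = project_point([1, 2, 10], R_identity, t_zero, K_example)
```

Calling the same function with the parameters of the camera that produced image #2 would give projection point K2.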

[0286] Step 2: Perform neighborhood extraction on projection points K1 and K2 to obtain neighborhood #1 corresponding to projection point K1 and neighborhood #2 corresponding to projection point K2.

[0287] For example, a neighborhood is taken near the projection point K1 to obtain neighborhood #1; a neighborhood is taken near the projection point K2 to obtain neighborhood #2. The method for determining neighborhood #1 and neighborhood #2 can be understood by referring to the relevant introduction in the "first neighborhood" section above, and will not be repeated here.

[0288] It is understandable that both neighborhood #1 and neighborhood #2 include multiple pixels (u,v).
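Step 2 can be sketched as follows, assuming a square neighborhood centered on the projection point. The square shape, the `half` parameter, and the row-major image layout `image[v][u]` are assumptions made for illustration; the size and shape actually used follow the "first neighborhood" description above.

```python
def extract_neighborhood(image, u, v, half=1):
    """Return the (2*half+1) x (2*half+1) window around the pixel nearest to (u, v).

    Returns None when the window would extend past the image border.
    """
    col, row = int(round(u)), int(round(v))
    height, width = len(image), len(image[0])
    if not (half <= row < height - half and half <= col < width - half):
        return None
    return [r[col - half: col + half + 1] for r in image[row - half: row + half + 1]]


# Toy 4x4 image whose pixel value at row r, column c is r * 4 + c.
toy_image = [[r * 4 + c for c in range(4)] for r in range(4)]
neighborhood_1 = extract_neighborhood(toy_image, u=1.2, v=2.0)
```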

[0289] Step 3: Determine the neighborhood matching degree of the point cloud data based on neighborhood #1 and neighborhood #2.

[0290] In one possible implementation, the neighborhood matching degree is determined based on the pixel values corresponding to the pixels in neighborhood #1 and the pixel values corresponding to the pixels in neighborhood #2.

[0291] For example, the neighborhood matching degree can be determined by the normalized sum of squared differences (NSSD). In the NSSD expression for the neighborhood matching degree, $A_{u,v}$ represents the pixel value corresponding to pixel (u,v) in neighborhood #1, and $B_{u,v}$ represents the pixel value corresponding to pixel (u,v) in neighborhood #2.
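A minimal sketch of an NSSD-style score is given below. The normalization is an assumption chosen for illustration (the sum of squared differences divided by the total energy of both neighborhoods, subtracted from 1 so that identical neighborhoods score 1); it is not necessarily the exact expression used in this application. Pixel values are assumed to be non-negative, which keeps the score within [0, 1].

```python
def nssd_matching_degree(nbr_a, nbr_b):
    """Return 1 - SSD / (energy of A + energy of B) for two equally sized neighborhoods."""
    flat_a = [p for row in nbr_a for p in row]
    flat_b = [p for row in nbr_b for p in row]
    if len(flat_a) != len(flat_b):
        raise ValueError("neighborhoods must contain the same number of pixels")
    ssd = sum((a - b) ** 2 for a, b in zip(flat_a, flat_b))
    energy = sum(a * a for a in flat_a) + sum(b * b for b in flat_b)
    if energy == 0:
        return 1.0  # both neighborhoods are all zeros and therefore identical
    return 1.0 - ssd / energy
```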

[0292] As another example, the neighborhood matching degree can be determined by comparing the distributions of pixel values in neighborhood #1 and neighborhood #2 using the cumulative distribution function (CDF). In the CDF expression for the neighborhood matching degree, F is the density function, which is expressed in terms of h(x); h(x) represents the frequency distribution of pixel values, $h(x) = \sum_{u,v} \delta(A(u,v) - x)$, where δ is the Dirac delta function and x is the pixel value.

[0293] It is understandable that when calculating the neighborhood matching degree using the two methods mentioned above, the closer the neighborhood matching degree result is to 1, the higher the neighborhood matching degree is.
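The CDF-based comparison can be sketched as follows. The histogram and cumulative sum follow the definition of h(x) above; the final score, 1 minus the largest gap between the two CDFs, is an assumed choice for illustration that yields 1 for identical distributions, and is not necessarily the expression used in this application. Pixel values are assumed to be integers in the range 0 to `levels - 1`, and neighborhoods are assumed to be non-empty.

```python
def empirical_cdf(nbr, levels=256):
    """Return the cumulative distribution of pixel values over 0..levels-1."""
    hist = [0] * levels           # h(x): how often each pixel value x occurs
    count = 0
    for row in nbr:
        for p in row:
            hist[p] += 1
            count += 1
    cdf, running = [], 0
    for h in hist:
        running += h
        cdf.append(running / count)
    return cdf


def cdf_matching_degree(nbr_a, nbr_b, levels=256):
    """Return 1 - max |F_A(x) - F_B(x)|, so identical distributions score 1."""
    fa = empirical_cdf(nbr_a, levels)
    fb = empirical_cdf(nbr_b, levels)
    return 1.0 - max(abs(x - y) for x, y in zip(fa, fb))
```

Because only the distribution of pixel values is compared, two neighborhoods containing the same values in a different spatial arrangement receive the same score as identical neighborhoods.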

[0294] In another possible implementation, feature extraction is performed on the pixel values corresponding to the pixels in neighborhood #1 and neighborhood #2 to obtain neighborhood feature #1 and neighborhood feature #2, and the neighborhood matching degree is determined based on neighborhood feature #1 and neighborhood feature #2. It can be understood that in this case, the neighborhood feature calculation function takes the neighborhood pixel values as input and outputs a feature value, and should satisfy the following condition: the higher the matching degree of the input neighborhoods, the closer the calculated feature values are.

[0295] For example, the neighborhood matching degree is determined through difference hashing, which calculates the difference between adjacent pixels row by row. Specifically, within a neighborhood (neighborhood #1 or neighborhood #2), if the value of a pixel is greater than the value of the pixel to its right, a 1 is recorded; otherwise, a 0 is recorded. All the difference results in the neighborhood are concatenated to form a binary hash string, i.e., $H_i = b_{i,0} b_{i,1} \ldots b_{i,n-1}$, where $b_{i,j}$ represents the j-th hash value of the i-th neighborhood. Next, the Hamming distance between the hash strings corresponding to neighborhood #1 and neighborhood #2 is calculated by summing $f(b_{i,j} - b_{i-1,j})$ over j, where f is an indicator function: when $b_{i,j} - b_{i-1,j}$ is not equal to 0, the value of f is 1; otherwise, it is 0. It can be understood that the smaller the Hamming distance, the higher the image matching degree.
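The difference hashing procedure described above can be sketched as follows. The function names are illustrative; the hash is represented as a list of bits rather than a string.

```python
def difference_hash(nbr):
    """Build the binary hash of a neighborhood, row by row.

    Each bit is 1 if a pixel is greater than its right-hand neighbor, else 0.
    """
    bits = []
    for row in nbr:
        for left, right in zip(row, row[1:]):
            bits.append(1 if left > right else 0)
    return bits


def hamming_distance(hash_a, hash_b):
    """Count the positions at which two equal-length hashes differ."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    return sum(1 for a, b in zip(hash_a, hash_b) if a != b)


hash_1 = difference_hash([[3, 1, 2], [5, 5, 4]])
hash_2 = difference_hash([[1, 3, 2], [5, 6, 4]])
distance = hamming_distance(hash_1, hash_2)
```

Here the two hashes are `[1, 0, 0, 1]` and `[0, 1, 0, 1]`, which differ in two positions.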

[0296] It is understandable that steps 1-3 above use two 2D images as an example to determine the neighborhood matching degree of point cloud data. When there are more than two 2D images, the point cloud data can be projected onto these multiple 2D images to obtain multiple projection points; neighborhood extraction is performed on the multiple projection points to obtain multiple neighborhoods; finally, the neighborhood matching degree of the point cloud data is determined based on the multiple neighborhoods. When determining the neighborhood matching degree of point cloud data based on multiple neighborhoods, the neighborhood matching degree corresponding to each pair of neighborhoods can be calculated, and then the average value can be calculated based on the multiple neighborhood matching degrees to obtain the final neighborhood matching degree; or, the neighborhood matching degree corresponding to each pair of neighborhoods can be calculated, and the neighborhood matching degree with the largest or smallest value among the multiple neighborhood matching degrees can be used as the final neighborhood matching degree.

[0297] For example, when there are three 2D images, the point cloud data can be projected onto the three 2D images to obtain projected points #1a, #2a, and #3a. Neighborhood extraction is performed on projected points #1a, #2a, and #3a to obtain the neighborhoods #1a, #2a, and #3a corresponding to projected points #1a, #2a, and #3a, respectively. The neighborhood matching degree #1a is calculated based on neighborhoods #1a and #2a; the neighborhood matching degree #2a is calculated based on neighborhoods #1a and #3a; and the neighborhood matching degree #3a is calculated based on neighborhoods #2a and #3a. The average of the neighborhood matching degrees #1a, #2a, and #3a is calculated to obtain the final neighborhood matching degree.

[0298] For example, when there are four 2D images, the point cloud data can be projected onto the four 2D images to obtain projection points #1b, #2b, #3b, and #4b. Neighborhood extraction is then performed on projection points #1b, #2b, #3b, and #4b to obtain neighborhood #1b of projection point #1b, neighborhood #2b of projection point #2b, neighborhood #3b of projection point #3b, and neighborhood #4b of projection point #4b. Based on neighborhoods #1b and #2b, neighborhood matching degree #1b is calculated using NSSD; based on neighborhoods #1b and #3b, neighborhood matching degree #2b is calculated using NSSD; based on neighborhoods #1b and #4b, neighborhood matching degree #3b is calculated using NSSD; based on neighborhoods #2b and #3b, neighborhood matching degree #4b is calculated using NSSD; based on neighborhoods #2b and #4b, neighborhood matching degree #5b is calculated using NSSD; based on neighborhoods #3b and #4b, neighborhood matching degree #6b is calculated using NSSD. In this example, neighborhood matching degree #6b has the smallest value and is therefore taken as the final neighborhood matching degree.
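The combination of pairwise results for more than two 2D images can be sketched as follows. The function name and the `mode` strings are illustrative; `pair_score` stands for whichever pairwise matching degree calculation is used.

```python
from itertools import combinations
from statistics import mean


def aggregate_matching_degree(neighborhoods, pair_score, mode="mean"):
    """Combine the matching degrees of every pair of neighborhoods into one value."""
    scores = [pair_score(a, b) for a, b in combinations(neighborhoods, 2)]
    if not scores:
        raise ValueError("at least two neighborhoods are required")
    if mode == "mean":
        return mean(scores)
    if mode == "min":
        return min(scores)
    if mode == "max":
        return max(scores)
    raise ValueError("mode must be 'mean', 'min', or 'max'")


def toy_score(a, b):
    # Toy stand-in for a pairwise matching degree: each "neighborhood" is one number.
    return 1.0 - abs(a - b)
```

Three neighborhoods give three pairs and four neighborhoods give six pairs, matching the two examples above.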

[0299] It can also be understood that the neighborhood matching degree of the i-th point cloud data can be determined solely based on the image similarity between the first neighborhood i and the second neighborhood i. In this case, the neighborhood matching degree of the i-th point cloud data is the image similarity between the first neighborhood i and the second neighborhood i. Alternatively, the neighborhood matching degree of the i-th point cloud data can be determined based on the image similarity among the first neighborhood i, the second neighborhood i, and other neighborhoods i. The other neighborhoods i can be the regions obtained by projecting the i-th point cloud data onto images other than the first and second images. In this case, the neighborhood matching degree of the i-th point cloud data is the image similarity among the first neighborhood i, the second neighborhood i, and the other neighborhoods i. The specific settings can be flexibly configured according to actual circumstances and are not limited.

[0300] Furthermore, the method for obtaining the neighborhood matching degree described above can be executed by one or more devices. For example, steps 1-3 can be executed by either the first device or the second device to determine the neighborhood matching degree of the point cloud data. Alternatively, the first and second devices can cooperate to determine the neighborhood matching degree of the point cloud data. For instance, the second device can determine the neighborhood of the point cloud data and send relevant information about that neighborhood (such as the pixel values corresponding to the pixels in the neighborhood, or the features corresponding to the neighborhood) to the first device, which then determines the neighborhood matching degree of the point cloud data based on this information. The specific implementation of the cooperation between the first and second devices to obtain the neighborhood matching degree can be flexibly set according to actual conditions and is not limited.

[0301] For S702:

[0302] The first device, after acquiring the neighborhood matching degree of each of the N point cloud data, can determine the noise in the N point cloud data, or in other words, determine the noisy point cloud data in the N point cloud data, based on the neighborhood matching degree of each of the N point cloud data.

[0303] For example, after determining the neighborhood matching degree of each of the N point cloud data through the above NSSD or CDF, the point cloud data with a neighborhood matching degree less than the first matching degree threshold can be identified as noise or noisy point cloud data.

[0304] For example, after determining the neighborhood matching degree of each of the N point cloud data through the above difference hashing, the point cloud data with a neighborhood matching degree greater than the second matching degree threshold can be identified as noise or noisy point cloud data.
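The two threshold rules above can be sketched with a single helper. The function name, the flag `higher_is_better`, and the example values are illustrative; the thresholds themselves are not specified here.

```python
def find_noise(matching_degrees, threshold, higher_is_better=True):
    """Return the indices of the point cloud data identified as noise.

    higher_is_better=True  : NSSD/CDF-style degrees, noise if degree < threshold.
    higher_is_better=False : Hamming-distance-style degrees, noise if degree > threshold.
    """
    if higher_is_better:
        return [i for i, d in enumerate(matching_degrees) if d < threshold]
    return [i for i, d in enumerate(matching_degrees) if d > threshold]


# Assumed example values for three point cloud data.
nssd_noise = find_noise([0.9, 0.2, 0.75], threshold=0.5)
hash_noise = find_noise([1, 7, 3], threshold=4, higher_is_better=False)
```

In both calls the second point cloud data (index 1) is the one identified as noise.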

[0305] It is understood that the N point cloud data in the embodiments of this application include at least one real point cloud data and / or at least one noisy point cloud data. The real point cloud data can be understood as the point cloud data determined by the echo generated by the target object during the radio frequency sensing process; or, the real point cloud data can be understood as point cloud data other than the noisy point cloud data. The noisy point cloud data can also be called noise points, three-dimensional noise scattering points, or other possible names, without limitation.

[0306] It can also be understood that, as shown in Figure 9, after real point cloud data is projected onto multiple 2D images (the imaging planes in Figure 9), the regions in which its projection points fall correspond to the imaging results of similar (or identical) regions of the object. Therefore, the similarity between these regions is relatively high. Conversely, after noisy point cloud data is projected onto multiple 2D images, the regions in which its projection points fall correspond to the imaging results of different regions. Therefore, the similarity between these regions is relatively low. The aforementioned similar regions can be understood as two regions on the target object that include the real point cloud data and that overlap substantially, or in other words, differ only slightly. The aforementioned different regions can be understood as completely different regions, or as two regions with significant differences.

[0307] After the first device identifies the noise in N point cloud data, the noise can be removed or filtered; and the filtered point cloud data can be used for multimodal fusion sensing.

[0308] In summary, in this embodiment, the first device can accurately determine the noise in the N point cloud data by acquiring the neighborhood matching degree of each of the N point cloud data. Thus, the first device can use noise-filtered point cloud data in multimodal perception, thereby improving multimodal perception performance.

[0309] It is understandable that the above-described interaction between the first and second devices illustrates how the first device acquires the neighborhood matching degree of each of the N point cloud data sets. When there are multiple second devices, the first device can acquire different information from each of the multiple second devices and, based on this different information, acquire the neighborhood matching degree of each of the N point cloud data sets. The following example, using the second device #1 perceiving the target object to obtain the first image and the second device #2 perceiving the target object to obtain the second image, illustrates the interaction between the first device and the second devices #1 and #2.

[0310] In the first possible implementation, the first device can obtain the first parameter information and the first image from the second device #1, and the second parameter information and the second image from the second device #2, and determine the neighborhood matching degree of the i-th point cloud data based on the first parameter information, the second parameter information, the first image and the second image.

[0311] For example, the acquisition of the parameter information, the first image, and the second image by the first device may specifically include the following. The second device #1 sends the first parameter information and the first image to the first device, and correspondingly, the first device receives the first parameter information and the first image from the second device #1. The first parameter information indicates the relevant parameters with which the second device #1 performs two-dimensional perception of the target object, and the first parameter information belongs to the parameter information. The second device #2 sends the second parameter information and the second image to the first device, and correspondingly, the first device receives the second parameter information and the second image from the second device #2. The second parameter information indicates the relevant parameters with which the second device #2 performs two-dimensional perception of the target object, and the second parameter information belongs to the parameter information.

[0312] In the second possible implementation, the first device can obtain relevant information about the first neighborhood i from the second device #1 and relevant information about the second neighborhood i from the second device #2, and determine the neighborhood matching degree of the i-th point cloud data based on the relevant information about the first neighborhood i and the second neighborhood i.

[0313] For example, the first device requests information about the pixels in the region where N point cloud data are projected onto a two-dimensional image from the second device #1 and the second device #2. The two-dimensional image is obtained by sensing a target object. The first device receives information about the first neighborhood i from the second device #1 and information about the second neighborhood i from the second device #2.

[0314] It is understood that the aforementioned second device #1 and second device #2 can be replaced with second device and third device, or other possible names, without limitation. Furthermore, the interaction between the first device and second device #1 and second device #2 can be understood with reference to the aforementioned interaction between the first device and second device, and will not be repeated here.

[0315] It can also be understood that when the number of two-dimensional images used to determine the neighborhood matching degree increases, such as using 3 or 4 two-dimensional images to determine the neighborhood matching degree, the number of second devices can also be increased adaptively. For example, when using 3 two-dimensional images to determine the neighborhood matching degree, there can be 3 second devices, which can each provide a two-dimensional image or neighborhood information to the first device. The number of two-dimensional images and the number of second devices can be flexibly set according to the actual situation, and the specific implementation principle of having more than 2 two-dimensional images and multiple second devices can be understood by referring to the above content, and will not be repeated here.

[0316] The above, in conjunction with the method embodiments, provides an overall overview of the communication method provided in this application. A specific example is given below to illustrate the above method.

[0317] As shown in Figure 10(a), for a building, the three-dimensional coordinates (x, y, z) of a certain wall of the building range as follows: x = -57, y ranges from [-103, 103], and z ranges from [0, 20]. The wall is made of brick and has multiple windows, one every 10m horizontally and one every 4m vertically. The windows are 2m x 2m in size and made of glass. As shown in Figure 10(b), two second devices (optical sensing nodes) perceive the wall and obtain two two-dimensional images. As shown in Figure 10(c), the first device performs radio frequency sensing on the wall to obtain multiple point cloud data. The corresponding neighborhood matching degree is obtained for these multiple point cloud data, and based on the neighborhood matching degree, the noise in these multiple point cloud data is determined, resulting in the noise point cloud data situation in the multiple point cloud data shown in Figure 10(d), where the red points are noise point cloud data and the blue points are real point cloud data. Simulation results show that the method provided in the above embodiments can effectively filter noisy point cloud data from multiple point cloud datasets.

[0318] For example, Figure 11 is a schematic flowchart of a communication method provided in an embodiment of this application. This method can be applied to the interaction between the first device and the second device in the above-described communication system.

[0319] As shown in Figure 11, the flow of this communication method is as follows:

[0320] S1101, the second device acquires capability parameters.

[0321] The capability parameters include information related to the capabilities of the second device. These capability parameters are used to indicate the first information that the second device is capable of reporting to the first device.

[0322] The first information indicates information related to the second device's perception of an object, such as the internal and external parameters of the second device, or a two-dimensional image obtained by the second device perceiving the target object. Alternatively, the first information indicates information obtained by processing such perception-related information, such as information obtained by processing the two-dimensional image obtained by the second device perceiving the target object.

[0323] In one possible design, the capability parameters include at least one of the following information of the second device: information protection level information, sensing capability information, communication capability information, computing capability information, storage capability information, internal and external parameter transmission information, neighborhood acquisition information, or neighborhood matching degree acquisition information.

[0324] The neighborhood acquisition information indicates that the second device supports providing the first device with relevant information about pixels in a first region, which is the region where the point cloud data is projected onto the two-dimensional image.

[0325] Other information can be found in the relevant description in the aforementioned "S701", and will not be repeated here.

[0326] S1102, the second device sends capability parameters to the first device, and correspondingly, the first device receives capability parameters from the second device.

[0327] The second device can send the capability parameters by reusing an existing message or by using a newly defined message. The specific settings can be flexibly configured according to the actual situation and are not limited.

[0328] S1103, the first device sends a first message to the second device according to the capability parameters. Correspondingly, the second device receives the first message from the first device.

[0329] The first message is used to request information related to the second device's perception of the target object, such as the internal and external parameters of the second device and the two-dimensional image obtained by the second device perceiving the target object.

[0330] It can be understood that the first message differs depending on the information included in the capability parameters. The different cases are described below.

[0331] Case 11.1: Capability parameters include neighborhood acquisition information.

[0332] That is, the capability parameters include neighborhood acquisition information, which indicates that the second device supports providing the first device with relevant information about pixels in the first region. The first region is the region where the point cloud data is projected onto the two-dimensional image. In this case, the first message being used to request information related to the second device's perception of the target object may specifically mean that the first message is used to request relevant information about pixels in the regions where the N point cloud data are projected onto the two-dimensional image. The N point cloud data are three-dimensional point cloud data obtained by perceiving the target object, N is an integer greater than 1, and the two-dimensional image is obtained by perceiving the target object.

[0333] The first message may include neighborhood description information, which indicates the size and / or shape of the region where the N point cloud data are projected onto the 2D image. The first message may also include the N point cloud data. For details on the first message, please refer to the relevant introduction in "Case 7.1" above, which will not be repeated here.

[0334] Optionally, the above communication method may further include: in response to the first message, the second device sends second information to the first device, and correspondingly, the first device receives the second information from the second device, the second information being used to indicate the relevant information of pixels in the first neighborhood i and the second neighborhood i, the first neighborhood i being the region where the i-th point cloud data is projected onto the first image from the N point cloud data, the second neighborhood i being the region where the i-th point cloud data is projected onto the second image, the first image and the second image being images obtained by two-dimensional perception of the target object, the first image being different from the second image, and i being an integer from 1 to N; the first device obtains the neighborhood matching degree of the i-th point cloud data according to the second information, and determines the noise in the N point cloud data according to the neighborhood matching degree of the i-th point cloud data.

[0335] The second information may include first neighborhood information i and second neighborhood information i. The first neighborhood information i includes the pixel value corresponding to the pixel in the first neighborhood i, and the second neighborhood information i includes the pixel value corresponding to the pixel in the second neighborhood i. The second information may also include first neighborhood feature i and second neighborhood feature i. The first neighborhood feature i is used to indicate the image features of the first neighborhood i, and the second neighborhood feature i is used to indicate the image features of the second neighborhood i. The image features are related to the pixel.

[0336] It can be understood that, for the second information, reference can be made to the relevant introduction of the first information in "S701" above, which will not be repeated here.

[0337] Furthermore, when the second information includes the first neighborhood information i and the second neighborhood information i, the above communication method may further include: the first device obtains the neighborhood matching degree of the i-th point cloud data according to the first neighborhood information i, the second neighborhood information i, and a matching degree calculation method, where the matching degree calculation method is any one of the following: normalized sum of squared differences, cumulative distribution function, or neighborhood feature calculation. For details, please refer to the relevant introduction in "Case 7.1" above, which will not be repeated here.

[0338] Furthermore, when the second information includes the first neighborhood information i and the second neighborhood information i, the neighborhood acquisition information indicating that the second device supports providing the first device with relevant information about the pixels in the first region may specifically mean that the neighborhood acquisition information indicates that the second device supports providing the first device with the pixel values corresponding to the pixels in the first region. In this case, the neighborhood acquisition information can also be referred to as neighborhood extraction acquisition information; for details, please refer to the relevant introduction of neighborhood extraction acquisition information in "Case 7.1" above, which will not be repeated here. In addition, the capability parameters may also include neighborhood feature acquisition information; for details, please refer to the relevant introduction in "Case 7.1" above, which will not be repeated here.

[0339] Furthermore, when the second information includes the first neighborhood feature i and the second neighborhood feature i, the neighborhood acquisition information indicating that the second device supports providing the first device with relevant information about the pixels in the first region may specifically mean that the neighborhood acquisition information indicates that the second device supports providing the first device with image features of the first region. In this case, the neighborhood acquisition information can also be referred to as neighborhood feature acquisition information; for details, please refer to the relevant introduction of neighborhood feature acquisition information in "Case 7.1" above, which will not be repeated here. In addition, the capability parameters may also include neighborhood extraction acquisition information; for details, please refer to the relevant introduction in "Case 7.1" above, which will not be repeated here.

[0340] Case 11.2: Capability parameters include neighborhood matching degree acquisition information.

[0341] That is, the capability parameters include neighborhood matching degree acquisition information, which indicates that the second device supports providing the first device with the neighborhood matching degree corresponding to the point cloud data. The neighborhood matching degree corresponding to the point cloud data indicates the image similarity of the regions where the point cloud data is projected onto multiple two-dimensional images. In this case, the first message being used to request information related to the second device's perception of the target object may specifically mean that the first message is used to request the neighborhood matching degrees corresponding to the N point cloud data, where the N point cloud data are three-dimensional point cloud data obtained by perceiving the target object, and N is an integer greater than 1. The neighborhood matching degrees corresponding to the N point cloud data indicate the image similarity of the regions where the i-th point cloud data among the N point cloud data is projected onto the multiple two-dimensional images, where i is an integer from 1 to N. The multiple two-dimensional images are obtained by perceiving the target object.

[0342] The first message may include neighborhood description information, and the first message may also include N point cloud data. For details, please refer to the relevant introduction in "Case 11.1" above, which will not be repeated here.

[0343] Optionally, the above communication method may further include: in response to a first message, a second device sends the neighborhood matching degree of each of N point cloud data to a first device; correspondingly, the first device receives the neighborhood matching degree of each of the N point cloud data from the second device; the neighborhood matching degree of the i-th point cloud data is determined based on the image similarity between the first neighborhood i and the second neighborhood i; the first neighborhood i is the region where the i-th point cloud data is projected onto the first image, and the second neighborhood i is the region where the i-th point cloud data is projected onto the second image; the first image and the second image are images obtained by two-dimensional perception of the target object, and the first image and the second image are different; the first device determines the noise in the N point cloud data based on the neighborhood matching degree of each of the N point cloud data.

[0344] Furthermore, the above communication method may further include: a second device projecting the i-th point cloud data onto a first image and a second image; the second device determining a first neighborhood i and a second neighborhood i based on the first image and the second image mapped with the i-th point cloud data; the second device determining first neighborhood information i and second neighborhood information i based on the first neighborhood i and the second neighborhood i, wherein the first neighborhood information i includes the pixel value corresponding to the pixel in the first neighborhood i, and the second neighborhood information i includes the pixel value corresponding to the pixel in the second neighborhood i; and the second device determining the neighborhood matching degree of the i-th point cloud data based on the first neighborhood information i and the second neighborhood information i.

[0345] Furthermore, the above communication method may further include: a second device projecting the i-th point cloud data onto a first image and a second image; the second device determining a first neighborhood i and a second neighborhood i based on the first image and the second image mapped with the i-th point cloud data; the second device determining a first neighborhood feature i and a second neighborhood feature i based on the first neighborhood i and the second neighborhood i, wherein the first neighborhood feature i is used to indicate the image features of the first neighborhood i and the second neighborhood feature i is used to indicate the image features of the second neighborhood i; and the second device determining the neighborhood matching degree of the i-th point cloud data based on the first neighborhood feature i and the second neighborhood feature i.

[0346] For the content of Case 11.2, refer to the relevant description of Case 7.2 above; the details are not repeated here.

[0347] Case 11.3: Capability parameters include internal and external parameter transmission information.

[0348] That is, the capability parameters include internal and external parameter transmission information, which indicates that the second device supports providing the first device with the relevant parameters and images used for object perception. The first message is used to request information related to the second device's perception of the target object; specifically, the first message requests the relevant parameters and at least one image obtained by the second device's two-dimensional perception of the target object.

[0349] The relevant parameters may include at least one of the following information of the second device: focal length, resolution, field of view, position, or pointing. For details, please refer to the relevant introduction in "Case 7.3" above, which will not be repeated here.

[0350] Optionally, the above communication method may further include: in response to a first message, a second device sends parameter information, a first image, and a second image to a first device; correspondingly, the first device receives parameter information, the first image, and the second image from the second device. The parameter information is used to project N point cloud data onto the first image and the second image. The N point cloud data are three-dimensional point cloud data obtained by perceiving a target object, where N is an integer greater than 1. The first image and the second image are obtained by two-dimensional perception of the target object, and the first image and the second image are different. The first device obtains the neighborhood matching degree of each of the N point cloud data according to the parameter information, the first image, and the second image, and determines the noise in the N point cloud data according to the neighborhood matching degree of each of the N point cloud data. The neighborhood matching degree of the i-th point cloud data is determined based on the image similarity between the first neighborhood i and the second neighborhood i. The first neighborhood i is the region where the i-th point cloud data is projected onto the first image, and the second neighborhood i is the region where the i-th point cloud data is projected onto the second image, where i is an integer from 1 to N.
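The projection step described above can be illustrated with a minimal sketch. It assumes a pinhole camera model, in which the parameter information carries the focal length (in pixels), the principal point, the resolution, the position, and the pointing (expressed as a rotation matrix) of the sensor that captured each image. The application does not prescribe a projection model, and the names `CameraParams` and `project_point` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class CameraParams:
    """Hypothetical container for the parameter information of one image."""
    fx: float                              # focal length in pixels, horizontal
    fy: float                              # focal length in pixels, vertical
    cx: float                              # principal point, horizontal
    cy: float                              # principal point, vertical
    width: int                             # image resolution, horizontal
    height: int                            # image resolution, vertical
    rotation: Sequence[Sequence[float]]    # 3x3 world-to-camera rotation ("pointing")
    position: Vec3                         # camera position in world coordinates


def project_point(point: Vec3, cam: CameraParams) -> Optional[Tuple[int, int]]:
    """Project a 3D point to integer pixel coordinates (u, v), or None if not visible."""
    # Express the point relative to the camera, then rotate it into the camera frame.
    d = [point[k] - cam.position[k] for k in range(3)]
    xc, yc, zc = (sum(cam.rotation[r][k] * d[k] for k in range(3)) for r in range(3))
    if zc <= 0:
        return None  # the point lies behind the camera
    u = int(round(cam.fx * xc / zc + cam.cx))
    v = int(round(cam.fy * yc / zc + cam.cy))
    if 0 <= u < cam.width and 0 <= v < cam.height:
        return u, v
    return None  # the projection falls outside the image
```

Calling `project_point` once with the parameters of the first image and once with those of the second image yields the two projection points around which the first neighborhood i and the second neighborhood i are taken.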

[0351] Furthermore, the process by which the first device obtains the neighborhood matching degree of each of the N point cloud data points based on the parameter information, the first image, and the second image can specifically include: the first device projecting the i-th point cloud data point onto the first image and the second image based on the parameter information; the first device determining a first neighborhood i and a second neighborhood i based on the first image and the second image mapped with the i-th point cloud data point; the first device determining first neighborhood information i and second neighborhood information i based on the first neighborhood i and the second neighborhood i, where the first neighborhood information i includes the pixel value corresponding to the pixel point in the first neighborhood i, and the second neighborhood information i includes the pixel value corresponding to the pixel point in the second neighborhood i; and the first device determining the neighborhood matching degree of the i-th point cloud data point based on the first neighborhood information i and the second neighborhood information i.
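As an illustration of the pixel-value variant above, the following sketch extracts a square neighborhood around each projection point and scores the two neighborhoods with zero-mean normalized cross-correlation. The square window, the `radius` parameter, and the choice of normalized cross-correlation as the image-similarity measure are assumptions made for this example; the application leaves the similarity measure open.

```python
import math
from typing import List, Optional, Sequence, Tuple

Image = Sequence[Sequence[float]]  # grayscale image stored as rows of pixel values


def extract_neighborhood(image: Image, center: Tuple[int, int], radius: int) -> Optional[List[float]]:
    """Return the pixel values of the (2*radius+1)^2 window around center=(u, v), row by row.

    Returns None when the window would extend past the image border.
    """
    u, v = center
    height, width = len(image), len(image[0])
    if u - radius < 0 or v - radius < 0 or u + radius >= width or v + radius >= height:
        return None
    return [image[y][x]
            for y in range(v - radius, v + radius + 1)
            for x in range(u - radius, u + radius + 1)]


def matching_degree(info_1: Sequence[float], info_2: Sequence[float]) -> float:
    """Zero-mean normalized cross-correlation in [-1, 1]; higher means more similar."""
    n = len(info_1)
    if n == 0 or n != len(info_2):
        raise ValueError("neighborhoods must be non-empty and of equal size")
    mean_1 = sum(info_1) / n
    mean_2 = sum(info_2) / n
    a = [p - mean_1 for p in info_1]
    b = [p - mean_2 for p in info_2]
    norm = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    if norm == 0.0:
        return 0.0  # a flat neighborhood carries no texture to compare
    return sum(x * y for x, y in zip(a, b)) / norm
```

Because the mean is subtracted and the result is normalized, this particular measure is insensitive to uniform brightness and contrast differences between the two images, which is one reason it was chosen for the sketch.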

[0352] Furthermore, the process by which the first device obtains the neighborhood matching degree of each of the N point cloud data points based on parameter information, the first image, and the second image can specifically include: the first device projecting the i-th point cloud data point onto the first image and the second image based on the parameter information; the first device determining a first neighborhood i and a second neighborhood i based on the first image and the second image mapped with the i-th point cloud data point; the first device determining a first neighborhood feature i and a second neighborhood feature i based on the first neighborhood i and the second neighborhood i, where the first neighborhood feature i indicates the image features of the first neighborhood i and the second neighborhood feature i indicates the image features of the second neighborhood i; and the first device determining the neighborhood matching degree of the i-th point cloud data point based on the first neighborhood feature i and the second neighborhood feature i.
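The feature-based variant above can be sketched in the same way. Here the neighborhood feature is assumed to be a normalized histogram of pixel intensities, and the matching degree is assumed to be the cosine similarity of the two histograms. Both choices, as well as the function names, are hypothetical; any image feature and any similarity measure that fit the description could be substituted.

```python
import math
from typing import List, Sequence


def neighborhood_feature(pixels: Sequence[float], bins: int = 8, max_value: float = 255.0) -> List[float]:
    """Hypothetical neighborhood feature: a normalized histogram of pixel intensities."""
    if not pixels:
        raise ValueError("neighborhood must contain at least one pixel")
    hist = [0.0] * bins
    for p in pixels:
        clamped = min(max(p, 0.0), max_value)
        index = min(int(clamped / max_value * bins), bins - 1)
        hist[index] += 1.0
    return [count / len(pixels) for count in hist]


def feature_matching_degree(feature_1: Sequence[float], feature_2: Sequence[float]) -> float:
    """Cosine similarity of two features; for non-negative histograms it lies in [0, 1]."""
    if len(feature_1) != len(feature_2):
        raise ValueError("features must have the same length")
    dot = sum(a * b for a, b in zip(feature_1, feature_2))
    norm = math.sqrt(sum(a * a for a in feature_1)) * math.sqrt(sum(b * b for b in feature_2))
    return dot / norm if norm > 0.0 else 0.0
```

A feature is typically much smaller than the raw pixel values of the neighborhood, so exchanging features instead of pixel values reduces the amount of data that has to be processed or transmitted per point.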

[0353] In summary, in this embodiment, the first device can request relevant information from the second device regarding the second device's perception of the target object based on the capability parameters sent by the second device. This avoids the first device failing to request information from the second device, thus preventing additional communication overhead.
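The capability-driven choice of request can be sketched as follows. The enumeration values, the preference order, and the dictionary layout of the request are all assumptions made for illustration; the application only states that the first device requests information according to the capability parameters reported by the second device.

```python
from enum import Enum
from typing import Dict, Iterable, Optional


class Capability(Enum):
    """Hypothetical labels for what the second device can provide."""
    PARAMETERS_AND_IMAGES = "parameters_and_images"    # Case 11.3
    NEIGHBORHOOD_INFO = "neighborhood_info"            # Case 11.1, pixel values
    NEIGHBORHOOD_FEATURES = "neighborhood_features"    # Case 11.1, image features
    MATCHING_DEGREE = "matching_degree"                # Case 11.2


# Assumed preference: offload as much computation as the second device supports.
PREFERENCE = (
    Capability.MATCHING_DEGREE,
    Capability.NEIGHBORHOOD_FEATURES,
    Capability.NEIGHBORHOOD_INFO,
    Capability.PARAMETERS_AND_IMAGES,
)


def choose_request(reported: Iterable[Capability]) -> Optional[Dict[str, object]]:
    """Pick the request that matches the reported capabilities, or None if nothing matches."""
    supported = set(reported)
    for capability in PREFERENCE:
        if capability in supported:
            # Only when the second device does the projection itself may the
            # request carry the point cloud data or a neighborhood description.
            may_carry_points = capability is not Capability.PARAMETERS_AND_IMAGES
            return {"requested": capability.value, "may_carry_point_cloud": may_carry_points}
    return None  # no usable capability, so no request is sent
```

Returning `None` when no capability matches reflects the benefit stated above: the first device does not send a request that the second device cannot serve.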

[0354] It is understood that the embodiment shown in Figure 11 is similar to the embodiment shown in Figure 7, and the same parts can be understood by mutual reference, and will not be described again here. In addition, in the various embodiments of this application, unless otherwise specified or logically conflicting, the terminology and / or descriptions between different embodiments are consistent and can be referenced by each other, and the technical features in different embodiments can be combined to form new embodiments according to their inherent logical relationship.

[0355] It is also understood that in the various embodiments of this application, "two-dimensional" can also be referred to as "2D" and "three-dimensional" can also be referred to as "3D". Furthermore, in the various embodiments of this application, the names of each message, each piece of information, and each parameter are merely illustrative representations, and each message, each piece of information, and each parameter can be replaced with any possible representation without limitation.

[0356] The above, in conjunction with the method embodiments shown in Figures 7 and 11, provides an overall overview of the communication method provided by the embodiments of this application. For ease of understanding, the interaction flow between the first device and the second device will be specifically described below through method embodiments, with reference to Figures 12-15.

[0357] Scenario 1:

[0358] Figure 12 is a schematic flowchart of the communication method provided in this application embodiment. This communication method can be applied to the interaction between the first device and the second device in the above-described communication system. In scenario 1, the second device supports the transmission of internal and external parameters and two-dimensional images, that is, the second device (or the device in which the second device is located) has the ability to transmit internal and external parameters. At this time, the first device obtains the internal and external parameters of the second device and at least one two-dimensional image obtained from the perception of the target object from the second device, and obtains the neighborhood matching degree of each of N point cloud data based on the internal and external parameters and the at least one two-dimensional image, where N is an integer greater than 1.

[0359] As shown in Figure 12, the flow of this communication method is as follows:

[0360] S1201, the first device sends a multimodal sensing request message to the second device. Correspondingly, the second device receives the multimodal sensing request message from the first device.

[0361] The multimodal sensing request message is used to request the second device to participate in multimodal sensing. Alternatively, the multimodal sensing request message is used to request the second device to report its capability parameters.

[0362] S1202, the second device sends capability parameters to the first device. Correspondingly, the first device receives the capability parameters from the second device.

[0363] In response to the multimodal sensing request message, the second device sends capability parameters to the first device.

[0364] This capability parameter corresponds to the fourth capability parameter in the embodiment shown in Figure 7. Alternatively, the capability parameter corresponds to the capability parameter in "Case 11.3" in the embodiment shown in Figure 11.

[0365] S1203, the first device sends request message #1 to the second device. Correspondingly, the second device receives request message #1 from the first device.

[0366] Request message #1 corresponds to the third message in the embodiment shown in Figure 7. Alternatively, request message #1 corresponds to the first message in "Case 11.3" in the embodiment shown in Figure 11.

[0367] S1204, the second device sends internal and external parameters and a two-dimensional image to the first device. Correspondingly, the first device receives the internal and external parameters and the two-dimensional image from the second device.

[0368] In response to request message #1, the second device sends internal and external parameters and a two-dimensional image to the first device.

[0369] The internal and external parameters correspond to the parameter information in the embodiment shown in Figure 7; the two-dimensional image includes two two-dimensional images, which correspond to the first image and the second image in the embodiment shown in Figure 7.

[0370] Alternatively, the internal and external parameters correspond to the parameter information in “Case 11.3” of the embodiment shown in Figure 11; the two-dimensional image includes two two-dimensional images, which correspond to the first image and the second image in “Case 11.3” of the embodiment shown in Figure 11.

[0371] S1205, the first device calculates the neighborhood matching degree corresponding to N point cloud data based on internal and external parameters and two-dimensional images.

[0372] S1206, the first device determines the noise in the N point cloud data based on the neighborhood matching degree of each of the N point cloud data.

[0373] It is understood that, for S1201-S1206, reference can be made to the relevant descriptions in the embodiments shown in Figure 7 or Figure 11 above, and the details are not repeated here. In addition, S1201 and S1203 above are optional steps. For example, the second device can periodically report capability information to the first device, and / or the second device can periodically report internal and external parameters and two-dimensional images to the first device.
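Step S1206 can be illustrated with a minimal sketch. It assumes that a point whose neighborhood matching degree falls below a fixed threshold is treated as noise. The threshold rule, its default value, and the function name `split_noise` are assumptions made for this example and are not taken from the steps above.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def split_noise(points: Sequence[Point], degrees: Sequence[float],
                threshold: float = 0.5) -> Tuple[List[Point], List[Point]]:
    """Split N points into (kept, noise) using one matching degree per point."""
    if len(points) != len(degrees):
        raise ValueError("expected exactly one matching degree per point")
    kept: List[Point] = []
    noise: List[Point] = []
    for point, degree in zip(points, degrees):
        # Assumed rule: a low matching degree means the two projections disagree,
        # so the point is unlikely to lie on a real surface of the target object.
        (kept if degree >= threshold else noise).append(point)
    return kept, noise
```

For example, with matching degrees `[0.9, 0.1, 0.6]` and the default threshold, the second point is classified as noise and the other two are kept.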

[0374] Scenario 2:

[0375] Figure 13 is a schematic flowchart of the communication method provided in this application embodiment. This communication method can be applied to the interaction between the first device and the second device in the above-described communication system. In scenario 2, the second device supports neighborhood extraction and transmission of the extracted neighborhood information, that is, the second device (or the device in which the second device is located) has the ability to extract neighborhoods and transmit the extracted neighborhood information. At this time, the first device obtains the neighborhood information from the second device and, based on the neighborhood information, obtains the neighborhood matching degree of each of the N point cloud data, where N is an integer greater than 1.

[0376] As shown in Figure 13, the flow of this communication method is as follows:

[0377] S1301, the first device sends a multimodal sensing request message to the second device. Correspondingly, the second device receives the multimodal sensing request message from the first device.

[0378] The specific implementation principle of S1301 can be found in the aforementioned introduction of S1201, and will not be repeated here.

[0379] S1302, the second device sends capability parameters to the first device. Correspondingly, the first device receives the capability parameters from the second device.

[0380] In response to the multimodal sensing request message, the second device sends capability parameters to the first device.

[0381] This capability parameter corresponds to the first capability parameter in the embodiment shown in Figure 7. Alternatively, the capability parameter corresponds to the capability parameter in "Case 11.1" in the embodiment shown in Figure 11, which includes neighborhood acquisition information. This neighborhood acquisition information indicates that the second device supports providing the first device with the pixel values corresponding to the pixels in the first region.

[0382] S1303, the first device sends request message #1 to the second device. Correspondingly, the second device receives request message #1 from the first device.

[0383] Request message #1 includes at least one of the following: N point cloud data or neighborhood description information.

[0384] Request message #1 corresponds to the first message in the embodiment shown in Figure 7. Alternatively, request message #1 corresponds to the first message in “Case 11.1” in the embodiment shown in Figure 11.

[0385] S1304, the second device obtains multiple neighborhood information according to request message #1.

[0386] For example, according to request message #1, the second device projects the N point cloud data onto multiple two-dimensional images obtained by perceiving the target object, to obtain multiple projection points, and extracts the neighborhood information of these projection points, that is, the pixel values corresponding to the pixels in the region where each projection point is located.

[0387] S1305, the second device sends multiple neighborhood information to the first device. Correspondingly, the first device receives the neighborhood information from the second device.

[0388] Neighborhood information includes the pixel values of the pixels in the region where the projection point of the point cloud data is located. It can be understood that the second device can send the multiple pieces of neighborhood information in the order in which it received the N point cloud data; alternatively, the second device can send the multiple pieces of neighborhood information using an index. It can also be understood that the neighborhood information corresponds to the neighborhood information in the embodiment shown in Figure 7, or to the neighborhood information in "Case 11.1" of the embodiment shown in Figure 11.

[0389] Optionally, the second device sends multiple neighborhood information to the first device, including: the second device sending N point cloud data and multiple neighborhood information to the first device. The N point cloud data and the multiple neighborhood information have a corresponding relationship; for example, the point cloud data with the corresponding relationship and the multiple neighborhood information can be placed in a tuple. In this way, the first device can determine the correspondence between the N point cloud data and the multiple neighborhood information based on the received N point cloud data and multiple neighborhood information.

[0390] S1306, the first device calculates the neighborhood matching degree corresponding to N point cloud data based on multiple neighborhood information.

[0391] S1307, based on the neighborhood matching degree of each of the N point cloud data, determine the noise in the N point cloud data.

[0392] It is understood that, for S1301-S1307, reference can be made to the relevant descriptions in the embodiments shown in Figure 7 or Figure 11 above, and the details are not repeated here. In addition, S1301 above is an optional step. For example, the second device can periodically report capability information to the first device.
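The ways of relating returned neighborhood information to the N point cloud data (by order of reception, or by index) can be sketched as follows. The function names and the payload layout are hypothetical; the payload stands for whatever the second device returns for one point, for example a pair of neighborhoods.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")
Point = Tuple[float, float, float]


def associate_by_order(points: Sequence[Point], payloads: Sequence[T]) -> List[Tuple[Point, T]]:
    """Payload k belongs to point k; valid only if nothing is dropped or reordered."""
    if len(points) != len(payloads):
        raise ValueError("ordered association needs one payload per point")
    return list(zip(points, payloads))


def associate_by_index(points: Sequence[Point],
                       indexed_payloads: Sequence[Tuple[int, T]]) -> List[Tuple[Point, T]]:
    """Each payload carries the index of its point, so order and gaps do not matter."""
    ordered = sorted(indexed_payloads, key=lambda item: item[0])
    return [(points[index], payload) for index, payload in ordered]
```

In the tuple-based option, the second device returns the `(point, payload)` pairs directly, so the first device needs no association step, at the cost of sending the point cloud data back over the link.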

[0393] Scenario 3:

[0394] Figure 14 is a schematic flowchart of the communication method provided in this application embodiment. This communication method can be applied to the interaction between the first device and the second device in the above-described communication system. In scenario 3, the second device supports neighborhood feature extraction and transmission of the extracted neighborhood features, that is, the second device (or the device in which the second device is located) has the ability to extract neighborhood features and transmit the extracted neighborhood features. At this time, the first device obtains the neighborhood features of the second device from the second device, and based on the neighborhood features, obtains the neighborhood matching degree of each of the N point cloud data, where N is an integer greater than 1.

[0395] As shown in Figure 14, the flow of this communication method is as follows:

[0396] S1401, the first device sends a multimodal sensing request message to the second device. Correspondingly, the second device receives the multimodal sensing request message from the first device.

[0397] The specific implementation principle of S1401 can be found in the aforementioned introduction of S1201, and will not be repeated here.

[0398] S1402, the second device sends capability parameters to the first device. Correspondingly, the first device receives the capability parameters from the second device.

[0399] In response to the multimodal sensing request message, the second device sends capability parameters to the first device.

[0400] This capability parameter corresponds to the second capability parameter in the embodiment shown in Figure 7. Alternatively, the capability parameter corresponds to the capability parameter in "Case 11.1" in the embodiment shown in Figure 11, which includes neighborhood acquisition information. This neighborhood acquisition information indicates that the second device supports providing the first device with the image features of the first region.

[0401] S1403, the first device sends request message #1 to the second device. Correspondingly, the second device receives request message #1 from the first device.

[0402] Request message #1 includes at least one of the following: N point cloud data or neighborhood description information.

[0403] Request message #1 corresponds to the first message in the embodiment shown in Figure 7. Alternatively, request message #1 corresponds to the first message in “Case 11.1” in the embodiment shown in Figure 11.

[0404] S1404, the second device obtains multiple neighborhood information according to request message #1.

[0405] For example, according to request message #1, the second device projects the N point cloud data onto multiple two-dimensional images obtained by perceiving the target object, to obtain multiple projection points, and extracts the neighborhood information of these projection points, that is, the pixel values corresponding to the pixels in the region where each projection point is located.

[0406] S1405, the second device calculates multiple neighborhood features based on multiple neighborhood information.

[0407] Neighborhood information can be found in the relevant description in S1305 above, and will not be repeated here.

[0408] The multiple neighborhood features are features extracted from the multiple pieces of neighborhood information. It can be understood that the neighborhood features correspond to the neighborhood features in the embodiment shown in Figure 7, or to the neighborhood features in "Case 11.1" of the embodiment shown in Figure 11.

[0409] S1406, the second device sends multiple neighborhood features to the first device. Correspondingly, the first device receives multiple neighborhood features from the second device.

[0410] The second device can send the multiple neighborhood features in the order in which it received the N point cloud data; alternatively, the second device can send the multiple neighborhood features using an index. It can be understood that the neighborhood information corresponds to the neighborhood information in the embodiment shown in Figure 7, or to the neighborhood information in "Case 11.1" of the embodiment shown in Figure 11.

[0411] Optionally, the second device sends multiple neighborhood features to the first device, including: the second device sending N point cloud data and multiple neighborhood features to the first device. The N point cloud data and the multiple neighborhood features have a corresponding relationship; for example, the point cloud data with the corresponding relationship and the multiple neighborhood features can be placed in a tuple. In this way, the first device can determine the correspondence between the N point cloud data and the multiple neighborhood features based on the received N point cloud data and multiple neighborhood features.

[0412] S1407, the first device calculates the neighborhood matching degree corresponding to N point cloud data based on multiple neighborhood features.

[0413] S1408, the first device determines the noise in the N point cloud data based on the neighborhood matching degree of each of the N point cloud data.

[0414] It is understood that, for S1401-S1408, reference can be made to the relevant descriptions in the embodiments shown in Figure 7 or Figure 11 above, and the details are not repeated here. In addition, S1401 above is an optional step. For example, the second device can periodically report capability information to the first device.

[0415] Scenario 4:

[0416] Figure 15 is a schematic flowchart of the communication method provided in this application embodiment. This communication method can be applied to the interaction between the first device and the second device in the above-described communication system. In scenario 4, the second device supports providing neighborhood matching degrees to other devices, that is, the second device (or the device where the second device is located) has the ability to provide neighborhood matching degrees to other devices. At this time, the first device obtains the neighborhood matching degrees of each of the N point cloud data from the second device, where N is an integer greater than 1.

[0417] As shown in Figure 15, the flow of this communication method is as follows:

[0418] S1501, the first device sends a multimodal sensing request message to the second device. Correspondingly, the second device receives the multimodal sensing request message from the first device.

[0419] The specific implementation principle of S1501 can be found in the aforementioned introduction of S1201, and will not be repeated here.

[0420] S1502, the second device sends capability parameters to the first device. Correspondingly, the first device receives the capability parameters from the second device.

[0421] In response to the multimodal sensing request message, the second device sends capability parameters to the first device.

[0422] This capability parameter corresponds to the third capability parameter in the embodiment shown in Figure 7. Alternatively, the capability parameter corresponds to the capability parameter in "Case 11.2" in the embodiment shown in Figure 11.

[0423] S1503, the first device sends request message #1 to the second device. Correspondingly, the second device receives request message #1 from the first device.

[0424] Request message #1 includes at least one of the following: N point cloud data or neighborhood description information.

[0425] Request message #1 corresponds to the second message in the embodiment shown in Figure 7. Alternatively, request message #1 corresponds to the first message in “Case 11.2” in the embodiment shown in Figure 11.

[0426] Optionally, request message #1 may also include a two-dimensional image, which is an image obtained by a device other than the second device from sensing the target object.

[0427] S1504, the second device calculates the neighborhood matching degree corresponding to N point cloud data according to request message #1.

[0428] S1505, the second device sends the neighborhood matching degree of each of the N point cloud data to the first device. Correspondingly, the first device receives the neighborhood matching degree of each of the N point cloud data from the second device.

[0429] The second device can send multiple neighborhood matching degrees in the order of receiving N point cloud data; or, the second device can send multiple neighborhood matching degrees using an index.

[0430] Optionally, the second device sends multiple neighborhood matching degrees to the first device, including: the second device sending N point cloud data and multiple neighborhood matching degrees to the first device. The N point cloud data and the multiple neighborhood matching degrees have a corresponding relationship; for example, the point cloud data with the corresponding relationship and the multiple neighborhood matching degrees can be placed in a tuple. In this way, the first device can determine the correspondence between the N point cloud data and the multiple neighborhood matching degrees based on the received N point cloud data and multiple neighborhood matching degrees.

[0431] S1506, the first device determines the noise in the N point cloud data based on the neighborhood matching degree of each of the N point cloud data.

[0432] It is understood that, for S1501-S1506, reference can be made to the relevant descriptions in the embodiments shown in Figure 7 or Figure 11 above, and the details are not repeated here. In addition, S1501 above is an optional step. For example, the second device can periodically report capability information to the first device.

[0433] It can also be understood that in the above scenarios 1-4, the second device can report its capabilities to the first device through multiple messages. For example, the second device can first report information protection level information, sensing capability information, communication capability information, computing capability information, or storage capability information to the first device, and then report other capabilities to the first device.

[0434] It can also be understood that scenarios 1-3 above are all described for a single second device. When there are multiple second devices, the above process can be adapted accordingly. For example, when there are two second devices, the first device sends a multimodal sensing request message to each of the two second devices, which will not be elaborated here. Furthermore, when there are multiple second devices, after receiving the capabilities reported by the multiple second devices, the first device can select the second device with the strongest overall capabilities to perform subsequent operations.
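Selecting the second device with the strongest overall capabilities can be sketched as a weighted score over the reported capabilities. The capability names, the numeric scores, and the equal default weights are assumptions made for illustration; the application does not define how "overall capabilities" are measured.

```python
from typing import Mapping

# Assumed equal weighting of four capability dimensions.
DEFAULT_WEIGHTS: Mapping[str, float] = {
    "sensing": 1.0,
    "communication": 1.0,
    "computing": 1.0,
    "storage": 1.0,
}


def overall_score(capabilities: Mapping[str, float],
                  weights: Mapping[str, float] = DEFAULT_WEIGHTS) -> float:
    """Weighted sum of the reported capability scores; missing entries count as 0."""
    return sum(weight * capabilities.get(name, 0.0) for name, weight in weights.items())


def select_second_device(reports: Mapping[str, Mapping[str, float]],
                         weights: Mapping[str, float] = DEFAULT_WEIGHTS) -> str:
    """Return the identifier of the reporting device with the highest overall score."""
    if not reports:
        raise ValueError("no second device has reported its capabilities")
    return max(reports, key=lambda device_id: overall_score(reports[device_id], weights))
```

Changing the weights lets the first device favor, for instance, sensing quality over computing power without changing the selection logic.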

[0435] Furthermore, the embodiments shown in Figures 12-15 above are merely examples. In different situations, the steps in the embodiments shown in Figures 12-15 can be varied accordingly without limitation.

[0436] The communication method provided by the embodiments of this application has been described in detail above with reference to Figures 7-15. The communication apparatus used to perform the communication method provided by the embodiments of this application is described in detail below with reference to Figures 16-17.

[0437] Figure 16 is a schematic diagram of the structure of a communication device provided in an embodiment of this application. As exemplarily shown in Figure 16, the communication device 1600 includes a transceiver module 1601 and a processing module 1602. For ease of explanation, Figure 16 only shows the main components of the communication device.

[0438] The transceiver module 1601 is used to perform the transceiver function of the method shown in Figures 7-15 above, and the processing module 1602 is used to perform other functions of the method shown in Figures 7-15 above besides the transceiver function.
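The division of work between the transceiver module and the processing module can be sketched as follows. The class names, the in-memory message lists standing in for the communication link, the message layout, and the threshold rule are all hypothetical and serve only to show which module does what.

```python
from typing import Dict, List


class TransceiverModule:
    """Sends and receives messages; in-memory lists stand in for the actual link."""

    def __init__(self) -> None:
        self.inbox: List[Dict[str, List[float]]] = []
        self.outbox: List[Dict[str, List[float]]] = []

    def send(self, message: Dict[str, List[float]]) -> None:
        self.outbox.append(message)

    def receive(self) -> Dict[str, List[float]]:
        return self.inbox.pop(0)


class ProcessingModule:
    """Performs the functions other than sending and receiving."""

    def __init__(self, threshold: float = 0.5) -> None:
        self.threshold = threshold  # assumed decision rule for noise

    def find_noise(self, matching_degrees: List[float]) -> List[int]:
        return [i for i, degree in enumerate(matching_degrees) if degree < self.threshold]


class CommunicationDevice:
    """A first device composed of one transceiver module and one processing module."""

    def __init__(self) -> None:
        self.transceiver = TransceiverModule()
        self.processing = ProcessingModule()

    def handle_matching_degrees(self) -> List[int]:
        # Receiving is delegated to the transceiver, the decision to the processor.
        message = self.transceiver.receive()
        return self.processing.find_noise(message["matching_degrees"])
```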

[0439] Optionally, the transceiver module 1601 may include a transmitting module (not shown in Figure 16) and a receiving module (not shown in Figure 16). The transmitting module is used to implement the transmitting function of the communication device 1600, and the receiving module is used to implement the receiving function of the communication device 1600.

[0440] Optionally, the communication device 1600 may further include a storage module (not shown in Figure 16) that stores programs or instructions. When the processing module 1602 executes the programs or instructions, the communication device 1600 can perform the functions of the terminal device or network device (such as the first device, the second device, etc.) in the methods shown in Figures 7-15 above.

[0441] It is understood that the communication device 1600 may be a terminal device or a network device, or it may be a chip (system) or other component or assembly that can be disposed in the terminal device or the network device, or it may be a device that includes the terminal device or the network device. This application does not limit it in this respect.

[0442] Furthermore, for the technical effects of the communication device 1600, refer to the technical effects of the communication methods shown in Figures 7-15; they are not repeated here.

[0443] Figure 17 is a second schematic diagram of the structure of a communication device provided in an embodiment of this application. Exemplarily, the communication device can be a terminal device or a network device, or it can be a chip (system) or other component or assembly that can be disposed in the terminal device or the network device. As shown in Figure 17, the communication device 1700 may include a processor 1701. Optionally, the communication device 1700 may further include a memory 1702 and / or a transceiver 1703. The processor 1701 is coupled to the memory 1702 and the transceiver 1703; for example, they can be connected via a communication bus.

[0444] The following section, with reference to Figure 17, provides a detailed description of each component of the communication device 1700:

[0445] The processor 1701 is the control center of the communication device 1700. It can be a single processor or a collective term for multiple processing elements. For example, the processor 1701 can be one or more central processing units (CPUs), application-specific integrated circuits (ASICs), or one or more integrated circuits configured to implement the embodiments of this application, such as one or more digital signal processors (DSPs), or one or more field-programmable gate arrays (FPGAs).

[0446] Optionally, the processor 1701 can perform various functions of the communication device 1700, such as performing the communication method described above, by running or executing software programs stored in the memory 1702 and calling data stored in the memory 1702.

[0447] In a specific implementation, as one example, processor 1701 may include one or more CPUs, such as CPU0 and CPU1 shown in Figure 17.

[0448] In a specific implementation, as one embodiment, the communication device 1700 may also include multiple processors, such as processors 1701 and 1704 shown in Figure 17. Each of these processors may be a single-core processor (single-CPU) or a multi-core processor (multi-CPU). Here, a processor may refer to one or more devices, circuits, and / or processing cores for processing data (e.g., computer program instructions).

[0449] The memory 1702 is used to store the software program for executing the solution of this application, and the processor 1701 controls its execution. For the specific implementation, refer to the above method embodiments; details are not repeated here.

[0450] Optionally, the memory 1702 may be a read-only memory (ROM) or other type of static storage device capable of storing static information and instructions, a random access memory (RAM) or other type of dynamic storage device capable of storing information and instructions, an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage (including compact discs, laser discs, digital versatile discs, Blu-ray discs, etc.), magnetic disk storage media or other magnetic storage devices, or any other medium capable of carrying or storing desired program code in the form of instructions or data structures and accessible by a computer, but is not limited thereto. The memory 1702 may be integrated with the processor 1701, or may exist independently and be coupled to the processor 1701 through the interface circuit of the communication device 1700 (not shown in FIG. 17). This embodiment of the application does not specifically limit this.

[0451] Transceiver 1703 is used for communication with other communication devices. For example, if communication device 1700 is a terminal, transceiver 1703 can be used to communicate with a network device or with another terminal device. As another example, if communication device 1700 is a network device, transceiver 1703 can be used to communicate with a terminal or with another network device.

[0452] Optionally, transceiver 1703 may include a receiver and a transmitter (not shown separately in Figure 17). The receiver is used to implement the receiving function, and the transmitter is used to implement the transmitting function.

[0453] Optionally, the transceiver 1703 can be integrated with the processor 1701, or can exist independently and be coupled to the processor 1701 through the interface circuit of the communication device 1700 (not shown in FIG. 17). This embodiment of the application does not specifically limit this.

[0454] It is understood that the structure of the communication device 1700 shown in Figure 17 does not constitute a limitation on the communication device. Actual communication devices may include more or fewer components than shown, or combine certain components, or have different component arrangements.

[0455] Furthermore, for the technical effects of the communication device 1700, refer to the technical effects of the methods described in the above method embodiments; details are not repeated here.

[0456] It should be understood that the processor in the embodiments of this application can be a central processing unit (CPU), or it can be other general-purpose processors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, etc. The general-purpose processor can be a microprocessor or any conventional processor.

[0457] It should also be understood that the memory in the embodiments of this application can be volatile memory or non-volatile memory, or may include both volatile and non-volatile memory. The non-volatile memory can be read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), or flash memory. The volatile memory can be random access memory (RAM), which is used as an external cache. By way of example, but not limitation, many forms of RAM are available, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate synchronous DRAM (DDR SDRAM), enhanced synchronous DRAM (ESDRAM), synchlink DRAM (SLDRAM), and direct rambus RAM (DR RAM).

[0458] The above embodiments can be implemented, in whole or in part, by software, hardware (such as circuits), firmware, or any combination thereof. When implemented using software, the above embodiments can be implemented, in whole or in part, in the form of a computer program product. The computer program product includes one or more computer instructions or computer programs. When the computer instructions or computer programs are loaded or executed on a computer, all or part of the processes or functions described in the embodiments of this application are generated. The computer can be a general-purpose computer, a special-purpose computer, a computer network, or other programmable device. The computer instructions can be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, the computer instructions can be transmitted from one website, computer, server, or data center to another website, computer, server, or data center via wired or wireless (e.g., infrared, radio, microwave, etc.) means. The computer-readable storage medium can be any available medium that a computer can access, or a data storage device, such as a server or data center, that integrates one or more available media. The available medium can be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium. A semiconductor medium can be a solid-state drive.

[0459] It should be understood that the term "and / or" in this article is merely a description of the relationship between related objects, indicating that three relationships can exist. For example, A and / or B can represent: A existing alone, A and B existing simultaneously, or B existing alone. A and B can be singular or plural. Additionally, the character " / " in this article generally indicates an "or" relationship between the preceding and following related objects, but it can also represent an "and / or" relationship. Please refer to the context for a more accurate understanding.

[0460] In this application, "at least one" means one or more, and "multiple" means two or more. "At least one of the following" or similar expressions refer to any combination of these items, including any combination of single or multiple items. For example, at least one of a, b, or c can mean: a; b; c; a and b; a and c; b and c; or a, b, and c, where each of a, b, and c can be single or multiple.
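As a minimal illustration of the enumeration above, and not as part of the application itself, the following Python sketch lists every non-empty combination of three items. The function name `at_least_one_of` is hypothetical.

```python
from itertools import combinations


def at_least_one_of(items):
    """Return every non-empty combination of the items, smallest combinations first."""
    return [
        "".join(combo)
        for size in range(1, len(items) + 1)
        for combo in combinations(items, size)
    ]


# Lists the seven readings of "at least one of a, b, or c".
print(at_least_one_of(["a", "b", "c"]))
```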

[0461] It should be understood that in the various embodiments of this application, the order in which the above-mentioned processes are described does not imply the order of execution. The execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiments of this application.

[0462] Those skilled in the art will recognize that the units and algorithm steps of the various examples described in conjunction with the embodiments disclosed herein can be implemented in electronic hardware, or a combination of computer software and electronic hardware. Whether these functions are implemented in hardware or software depends on the specific application and design constraints of the technical solution. Those skilled in the art can use different methods to implement the described functions for each specific application, but such implementation should not be considered beyond the scope of this application.

[0463] Those skilled in the art will understand that, for the sake of convenience and brevity, for the specific working processes of the systems, devices, and units described above, reference can be made to the corresponding processes in the foregoing method embodiments; details are not repeated here.

[0464] In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods can be implemented in other ways. The apparatus embodiments described above are merely illustrative. For instance, the division of units is only a logical functional division, and in actual implementation there may be other division methods: multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the mutual coupling, direct coupling, or communication connection shown or discussed may be implemented through some interfaces, and the indirect coupling or communication connection between apparatuses or units may be electrical, mechanical, or in other forms.

[0465] The units described as separate components may or may not be physically separate. The components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units can be selected to achieve the purpose of this embodiment according to actual needs.

[0466] In addition, the functional units in the various embodiments of this application can be integrated into one processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit.

[0467] If the aforementioned functions are implemented as software functional units and sold or used as independent products, they can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application, in essence, or the part that contributes to the prior art, or a portion of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of this application. The aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.

[0468] The above description is merely a specific embodiment of this application, but the scope of protection of this application is not limited thereto. Any variations or substitutions that can be easily conceived by those skilled in the art within the scope of the technology disclosed in this application should be included within the scope of protection of this application. Therefore, the scope of protection of this application should be determined by the scope of the claims.

Claims

1. A communication method, characterized in that the method is applied to a first device and includes: obtaining the neighborhood matching degree of each of N point cloud data, where N is an integer greater than 1, and the N point cloud data are three-dimensional point cloud data obtained by perceiving a target object; wherein the neighborhood matching degree of the i-th point cloud data among the N point cloud data is determined based on the image similarity between a first neighborhood i and a second neighborhood i, the first neighborhood i is the region where the i-th point cloud data is projected onto a first image, the second neighborhood i is the region where the i-th point cloud data is projected onto a second image, the first image and the second image are obtained by two-dimensional perception of the target object, the first image and the second image are different, and i is an integer from 1 to N; and determining the noise in the N point cloud data based on the neighborhood matching degree of each of the N point cloud data.
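One possible reading of the method of claim 1 can be sketched as follows. The sketch assumes details that the claim does not specify: a 3x4 pinhole projection matrix for each image, a square neighborhood, normalized cross-correlation as the image similarity, a fixed threshold of 0.5, and treating a point whose neighborhood leaves either image as noise. All function names are hypothetical.

```python
import numpy as np


def project(point, P):
    """Project a 3D point with a 3x4 matrix P; assumes the point is in front of the camera (w > 0)."""
    u, v, w = P @ np.append(point, 1.0)
    return int(round(u / w)), int(round(v / w))  # (column, row)


def neighborhood(image, center, half):
    """Square patch of side 2*half+1 around center, or None if it leaves the image."""
    c, r = center
    rows, cols = image.shape
    if r - half < 0 or c - half < 0 or r + half >= rows or c + half >= cols:
        return None
    return image[r - half:r + half + 1, c - half:c + half + 1].astype(float)


def matching_degree(patch1, patch2):
    """Normalized cross-correlation in [-1, 1]; an assumed similarity measure."""
    a = patch1 - patch1.mean()
    b = patch2 - patch2.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom > 0 else 0.0


def find_noise(points, image1, P1, image2, P2, half=2, threshold=0.5):
    """Return one flag per point: True where the point is treated as noise."""
    flags = []
    for p in points:
        n1 = neighborhood(image1, project(p, P1), half)
        n2 = neighborhood(image2, project(p, P2), half)
        # Assumption: a point with no valid neighborhood gets matching degree 0.
        degree = matching_degree(n1, n2) if n1 is not None and n2 is not None else 0.0
        flags.append(degree < threshold)
    return flags
```

The intuition behind this sketch is that a real surface point should look alike in both images, so its two neighborhoods correlate strongly, whereas a spurious point projects onto unrelated image content.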

2. The method according to claim 1, characterized in that obtaining the neighborhood matching degree of each of the N point cloud data includes: obtaining first information, the first information being used to indicate relevant information of the pixels in the first neighborhood i and the second neighborhood i; and determining the neighborhood matching degree of the i-th point cloud data based on the first information.

3. The method according to claim 2, characterized in that obtaining the first information includes: sending a first message to a second device, the first message being used to request relevant information of the pixels in the region where the N point cloud data are projected onto a two-dimensional image, the two-dimensional image being obtained by sensing the target object; and receiving the first information from the second device.

4. The method according to claim 3, characterized in that the first message includes neighborhood description information, the neighborhood description information being used to indicate the size and / or shape of the region where the N point cloud data are projected onto the two-dimensional image.
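The neighborhood description information of claim 4 could, for illustration, be represented as a small record from which the pixel offsets of the neighborhood are derived. The field names, the two shapes, and the interpretation of size as a half-width or radius in pixels are assumptions; the claim only states that size and / or shape are indicated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NeighborhoodDescription:
    shape: str  # assumed values: "square" or "circle"
    size: int   # assumed meaning: half-width (square) or radius (circle), in pixels


def pixel_offsets(desc):
    """Return the (dx, dy) offsets, relative to the projected pixel, that form the neighborhood."""
    r = desc.size
    square = [(dx, dy) for dy in range(-r, r + 1) for dx in range(-r, r + 1)]
    if desc.shape == "square":
        return square
    if desc.shape == "circle":
        return [(dx, dy) for dx, dy in square if dx * dx + dy * dy <= r * r]
    raise ValueError(f"unsupported shape: {desc.shape}")
```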

5. The method according to any one of claims 2-4, characterized in that: the first information includes first neighborhood information i and second neighborhood information i, the first neighborhood information i including the pixel values corresponding to the pixels in the first neighborhood i, and the second neighborhood information i including the pixel values corresponding to the pixels in the second neighborhood i; or the first information includes a first neighborhood feature i and a second neighborhood feature i, the first neighborhood feature i being used to indicate the image features of the first neighborhood i, and the second neighborhood feature i being used to indicate the image features of the second neighborhood i, the image features being related to pixels.

6. The method according to claim 5, characterized in that, when the first information includes the first neighborhood information i and the second neighborhood information i, the method further includes: receiving a first capability parameter from a second device, the first capability parameter being used to indicate that the second device supports providing the first device with pixel values corresponding to pixels in a first region, the first region being the region where point cloud data is projected onto a two-dimensional image; and sending the first message to the second device includes: sending the first message to the second device based on the first capability parameter.

7. The method according to claim 5, characterized in that, when the first information includes the first neighborhood feature i and the second neighborhood feature i, the method further includes: receiving a second capability parameter from a second device, the second capability parameter being used to indicate that the second device supports providing neighborhood features to the first device, the neighborhood features being used to indicate the image features of the region where the point cloud data is projected onto a two-dimensional image; and sending the first message to the second device includes: sending the first message to the second device according to the second capability parameter.

8. The method according to claim 1, characterized in that obtaining the neighborhood matching degree of each of the N point cloud data includes: sending a second message to a second device, the second message being used to request the neighborhood matching degree corresponding to the N point cloud data, where the neighborhood matching degree corresponding to the N point cloud data is used to indicate the image similarity of the region where the i-th point cloud data is projected onto multiple two-dimensional images, and the multiple two-dimensional images are obtained by sensing the target object; and receiving the neighborhood matching degree of each of the N point cloud data from the second device.

9. The method according to claim 8, characterized in that, before sending the second message to the second device, the method further includes: receiving a third capability parameter from the second device, the third capability parameter being used to indicate that the second device supports providing the first device with the neighborhood matching degree corresponding to three-dimensional point cloud data, where the neighborhood matching degree corresponding to the three-dimensional point cloud data is used to indicate the image similarity of the region where the three-dimensional point cloud data is projected onto multiple two-dimensional images; and sending the second message to the second device includes: sending the second message to the second device according to the third capability parameter.

10. The method according to claim 1, characterized in that obtaining the neighborhood matching degree of each of the N point cloud data includes: obtaining parameter information, the first image, and the second image, the parameter information being used to project the N point cloud data onto the first image and the second image; and determining the neighborhood matching degree of each of the N point cloud data based on the parameter information, the first image, and the second image.
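The projection step of claim 10 can be illustrated with a standard pinhole camera model. This is a sketch under the assumption that the parameter information consists of an intrinsic matrix K and extrinsic parameters R and t for each image; the claim itself does not specify the content of the parameter information, and the function names are hypothetical.

```python
import numpy as np


def projection_matrix(K, R, t):
    """Compose a 3x4 pinhole projection matrix P = K [R | t]."""
    return K @ np.hstack([R, np.asarray(t, dtype=float).reshape(3, 1)])


def project_points(points, P):
    """Project an (N, 3) array of 3D points to (N, 2) pixel coordinates (u, v)."""
    homogeneous = np.hstack([points, np.ones((points.shape[0], 1))])
    uvw = homogeneous @ P.T
    # Assumes every point lies in front of the camera, so the third component is positive.
    return uvw[:, :2] / uvw[:, 2:3]
```

Calling `project_points` once with the matrix of the first image and once with the matrix of the second image yields the two pixel positions around which the first neighborhood i and the second neighborhood i would be taken.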

11. The method according to claim 10, characterized in that obtaining the parameter information, the first image, and the second image includes: receiving the parameter information, the first image, and the second image from a second device, wherein the parameter information is used to indicate relevant parameters for the second device to perform two-dimensional perception of the target object.

12. The method according to claim 11, characterized in that, before receiving the parameter information, the first image, and the second image from the second device, the method further includes: receiving a fourth capability parameter from the second device, the fourth capability parameter being used to indicate that the second device supports providing the first device with relevant parameters and images for two-dimensional perception of objects; and sending the third message to the second device includes: sending the third message to the second device according to the fourth capability parameter.
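The capability parameters of claims 6, 7, 9, and 12 can be read as a negotiation step: the first device learns what the second device supports and sends the corresponding request. The following sketch illustrates that reading only. The flag names, the encoding of capabilities as bit flags, and the preference order (requesting the least data first) are all assumptions that do not come from the claims.

```python
from enum import Flag, auto


class Capability(Flag):
    NONE = 0
    PIXEL_VALUES = auto()           # first capability parameter
    NEIGHBORHOOD_FEATURES = auto()  # second capability parameter
    MATCHING_DEGREE = auto()        # third capability parameter
    PARAMS_AND_IMAGES = auto()      # fourth capability parameter


# Assumed preference order: the option that moves the least data comes first.
REQUEST_FOR = [
    (Capability.MATCHING_DEGREE, "second message"),
    (Capability.NEIGHBORHOOD_FEATURES, "first message (neighborhood features)"),
    (Capability.PIXEL_VALUES, "first message (pixel values)"),
    (Capability.PARAMS_AND_IMAGES, "third message"),
]


def choose_request(reported):
    """Return the request the first device would send, or None if nothing is supported."""
    for capability, message in REQUEST_FOR:
        if capability in reported:
            return message
    return None
```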

13. A communication method, characterized in that the method is applied to a second device and includes: receiving a first message from a first device, the first message being used to request relevant information of the pixels in the region where N point cloud data are projected onto a two-dimensional image, where the N point cloud data are three-dimensional point cloud data obtained by perceiving a target object, N is an integer greater than 1, and the two-dimensional image is obtained by perceiving the target object; and in response to the first message, sending first information to the first device, the first information being used to indicate relevant information of the pixels in a first neighborhood i and a second neighborhood i, where the first neighborhood i is the region where the i-th point cloud data among the N point cloud data is projected onto a first image, the second neighborhood i is the region where the i-th point cloud data is projected onto a second image, the first image and the second image are obtained by two-dimensional perception of the target object, the first image and the second image are different, and i is an integer from 1 to N.

14. The method according to claim 13, characterized in that the first message includes neighborhood description information, the neighborhood description information being used to indicate the size and / or shape of the region where the N point cloud data are projected onto the two-dimensional image.

15. The method according to claim 13 or 14, characterized in that: the first information includes first neighborhood information i and second neighborhood information i, the first neighborhood information i including the pixel values corresponding to the pixels in the first neighborhood i, and the second neighborhood information i including the pixel values corresponding to the pixels in the second neighborhood i; or the first information includes a first neighborhood feature i and a second neighborhood feature i, the first neighborhood feature i being used to indicate the image features of the first neighborhood i, and the second neighborhood feature i being used to indicate the image features of the second neighborhood i, the image features being related to pixels.

16. The method according to claim 15, characterized in that, when the first information includes the first neighborhood information i and the second neighborhood information i, the method further includes: sending a first capability parameter to the first device, the first capability parameter being used to indicate that the second device supports providing the first device with pixel values corresponding to pixels in a first region, the first region being the region where point cloud data is projected onto a two-dimensional image.

17. The method according to claim 15, characterized in that, when the first information includes the first neighborhood feature i and the second neighborhood feature i, the method further includes: sending a second capability parameter to the first device, the second capability parameter being used to indicate that the second device supports providing neighborhood features to the first device, the neighborhood features being used to indicate the image features of the region where the point cloud data is projected onto the two-dimensional image.

18. A communication method, characterized in that the method is applied to a second device and includes: receiving a second message from a first device, the second message being used to request the neighborhood matching degree corresponding to N point cloud data, where the N point cloud data are three-dimensional point cloud data obtained by perceiving a target object, N is an integer greater than 1, the neighborhood matching degree corresponding to the N point cloud data is used to indicate the image similarity of the region where the i-th point cloud data is projected onto multiple two-dimensional images, the multiple two-dimensional images are obtained by perceiving the target object, and i is an integer from 1 to N; and in response to the second message, sending the neighborhood matching degree of each of the N point cloud data to the first device, where the neighborhood matching degree of the i-th point cloud data is determined based on the image similarity between a first neighborhood i and a second neighborhood i, the first neighborhood i is the region where the i-th point cloud data is projected onto a first image, the second neighborhood i is the region where the i-th point cloud data is projected onto a second image, the first image and the second image are obtained by two-dimensional perception of the target object, and the first image and the second image are different.

19. The method according to claim 18, characterized in that, before receiving the second message from the first device, the method further includes: sending a third capability parameter to the first device, the third capability parameter being used to indicate that the second device supports providing the first device with the neighborhood matching degree corresponding to point cloud data, where the neighborhood matching degree corresponding to the point cloud data is used to indicate the image similarity of the region where the point cloud data is projected onto multiple two-dimensional images.

20. A communication method, characterized in that the method is applied to a second device and includes: acquiring parameter information, a first image, and a second image, where the parameter information is used to indicate the relevant parameters for the second device to perform two-dimensional perception of a target object, the first image and the second image are obtained by the second device performing two-dimensional perception of the target object, and the first image and the second image are different; and sending the parameter information, the first image, and the second image to a first device.

21. The method according to claim 20, characterized in that the method further includes: receiving a third message from the first device, the third message being used to request the relevant parameters for the second device to perform two-dimensional perception of the target object and at least one image; and sending the parameter information, the first image, and the second image to the first device includes: in response to the third message, sending the parameter information, the first image, and the second image to the first device.

22. The method according to claim 20 or 21, characterized in that, before sending the parameter information, the first image, and the second image to the first device, the method further includes: sending a fourth capability parameter to the first device, the fourth capability parameter being used to indicate that the second device supports providing the first device with relevant parameters and images for two-dimensional perception of objects.

23. A communication device, characterized in that the communication device is used to perform the communication method as described in any one of claims 1-22.

24. A communication device, characterized in that the communication device includes a processor and a memory; the memory is used to store computer instructions which, when executed by the processor, cause the communication device to perform the communication method as described in any one of claims 1-22.

25. A computer-readable storage medium, characterized in that the computer-readable storage medium includes a computer program or instructions that, when executed on a communication device, cause the communication device to perform the method as described in any one of claims 1-22.

26. A computer program product, characterized in that the computer program product includes a computer program or instructions that, when executed by a communication device, cause the method of any one of claims 1-22 to be performed.