Array element processing method and apparatus, storage medium, and electronic device

The method divides array elements into data sub-blocks for parallel processing and uses GPU parallel computing and atomic operations to address the low efficiency of finding non-zero elements in sparse data structures, improving the efficiency and computation speed of sparse data processing.

CN119760180B | Active | Publication Date: 2026-06-12 | INSPUR SUZHOU INTELLIGENT TECH CO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents (China)
Current Assignee / Owner: INSPUR SUZHOU INTELLIGENT TECH CO LTD
Filing Date: 2024-12-20
Publication Date: 2026-06-12

AI Technical Summary

Technical Problem

In large-scale sparse data processing, existing technologies find non-zero elements in sparse data structures inefficiently, which slows down array element processing as a whole.

Method used

The target input array is divided into multiple data sub-blocks, and multiple threads run in parallel to find the position indexes of the non-zero elements. The GPU's parallel computing power and atomic operation management are then used to record those position indexes into the target output array.
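As a rough illustration of the approach described above, the following Python sketch emulates it on the CPU: the input is split into sub-blocks, each worker thread scans one sub-block, and a lock-protected counter stands in for a GPU atomic add that reserves the next free slot in the output. The names `AtomicCounter`, `find_nonzero_parallel`, `block_size`, and `workers` are hypothetical, and the worst-case preallocation of the output is an assumption. The patent's actual GPU kernel is not reproduced here.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class AtomicCounter:
    """Lock-protected counter; a CPU stand-in for a GPU atomic add (assumption)."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def fetch_add(self, n=1):
        # Return the old value and advance by n as one indivisible step.
        with self._lock:
            old = self._value
            self._value += n
            return old

    def load(self):
        with self._lock:
            return self._value


def find_nonzero_parallel(matrix, block_size=4, workers=4):
    """Return [row, col] position indexes of the non-zero elements of a 2-D list."""
    rows = len(matrix)
    cols = len(matrix[0]) if rows else 0
    flat = [v for row in matrix for v in row]   # row-major view of the input

    output = [None] * len(flat)                 # worst case: every element is non-zero
    counter = AtomicCounter()

    def scan_block(start):
        # Each call handles one data sub-block: flat[start : start + block_size].
        end = min(start + block_size, len(flat))
        for i in range(start, end):
            if flat[i] != 0:
                slot = counter.fetch_add(1)     # reserve a unique output slot
                output[slot] = [i // cols, i % cols]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(scan_block, range(0, len(flat), block_size)))

    # Slot order depends on thread scheduling, so sort for a stable result.
    return sorted(output[:counter.load()])


matrix = [[0, 3, 0],
          [0, 0, 0],
          [7, 0, 1]]
print(find_nonzero_parallel(matrix, block_size=2))  # [[0, 1], [2, 0], [2, 2]]
```

Because every thread obtains its slot through the counter, no two threads write to the same output position, which is the role the atomic operation plays in the description above.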

Benefits of technology

The method improves the efficiency of finding non-zero elements in large-scale arrays and of managing atomic operations, thereby increasing computation speed and resource utilization during model training.

✦ Generated by Eureka AI based on patent content.

Smart Images

  • Figure CN119760180B_ABST
Patent Text Reader

Abstract

Embodiments of the present application provide an array element processing method and device, a storage medium, and an electronic device, relating to the field of computers. The method comprises: obtaining a target input array, wherein the target input array comprises at least one non-zero element; dividing the elements of the target input array into a plurality of data sub-blocks, wherein each data sub-block comprises at least one element; searching the plurality of data sub-blocks for non-zero elements by executing a plurality of threads in parallel, to obtain position indexes of the non-zero elements, wherein the position indexes represent the row coordinate and column coordinate of each non-zero element in the target input array; recording the position indexes of the non-zero elements into a target output array, wherein the column dimension of the target output array is the same as the dimension of the target input array; and training a machine learning model by using the target input array and the target output array.
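The layout of the target output array can be illustrated with a small sequential Python sketch. It assumes one reading of the abstract: each output row holds the coordinates of one non-zero element, with one column per dimension of the input (two columns, row and column, for a 2-D input). The helper names `unravel` and `position_indexes` and the row-major storage order are assumptions made for illustration.

```python
def unravel(offset, shape):
    """Convert a flat row-major offset into one coordinate per dimension."""
    coords = []
    for size in reversed(shape):
        coords.append(offset % size)
        offset //= size
    return coords[::-1]


def position_indexes(flat, shape):
    """Build the output array: one row per non-zero element, one column per dimension."""
    return [unravel(i, shape) for i, v in enumerate(flat) if v != 0]


flat = [0, 5, 0,
        0, 0, 9]            # a 2 x 3 input stored in row-major order
out = position_indexes(flat, (2, 3))
print(out)                  # [[0, 1], [1, 2]]
```

Here the output has two rows (two non-zero elements) and two columns (the input is two-dimensional).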