A memory-compute integrated circuit, a computing system and a computing method
By combining optically and electrically controlled PCM arrays, the problem of separating storage and computation in traditional computing systems is solved, achieving mixed-precision computing, improving efficiency and reducing power consumption, and realizing single-chip integration.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- HUAWEI TECH CO LTD
- Filing Date
- 2021-07-20
- Publication Date
- 2026-06-23
AI Technical Summary
The separation of storage and computation in traditional computing systems leads to performance and power consumption bottlenecks. Existing in-memory computing architectures struggle to achieve mixed-precision computing and have poor flexibility.
By combining optically controlled computing modules and electrically controlled computing modules, the optically controlled PCM array is used for high-precision computing, and the electrically controlled PCM array is used for low-precision computing. They are integrated on the same silicon substrate to achieve mixed-precision computing.
It improves computing efficiency and flexibility, reduces power consumption, and achieves single-chip integration.
Figure CN115640839B_ABST
Abstract
Description
Technical Field
[0001] This application relates to the field of artificial intelligence technology, and in particular to an in-memory computing circuit, a computing system, and a computing method.
Background Technology
[0002] When artificial intelligence (AI) algorithms run on traditional computing systems, such as central processing units (CPUs) and graphics processing units (GPUs), their energy consumption and efficiency fall far short of those of the human brain. The core problem lies in the separation of storage and computation in traditional computing systems, which leads to performance and power consumption bottlenecks when processing neural networks. In-memory computing, developed based on non-volatile memories such as resistive random access memory (RRAM) or NOR flash memory, can effectively solve this problem.
[0003] When performing in-memory computations, using mixed-precision computation can further improve computational efficiency and reduce power consumption. For example, in a convolutional neural network (CNN), the first and last layers require high-precision computation, while the intermediate layers can use low-precision computation.
[0004] In existing technologies, an in-memory computing architecture can be as shown in Figure 1. Here x1, x2, ..., xn are the input data. After digital-to-analog conversion, the data is computed through different conductances (G11 to Gnm), and the results are stored in a trans-impedance amplifier (TIA). The TIA is equivalent to an integrator: it integrates the results of multiple calculations, which are then output after analog-to-digital conversion as the final calculation results y1, y2, ..., ym. With the architecture shown in Figure 1, the computational accuracy depends on the accuracy of the conductance values. Therefore, this architecture can only achieve calculations at a single precision (low precision) and is difficult to use for mixed-precision calculations, resulting in poor flexibility.
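The multiply-accumulate operation performed by such a conductance array can be sketched in a few lines of Python. This is only an illustrative model, not the circuit itself: the function name `crossbar_mac` and the numeric values are assumptions, and the digital-to-analog and analog-to-digital conversions are treated as ideal.

```python
def crossbar_mac(x, G):
    """Ideal crossbar model: y[j] = sum over i of x[i] * G[i][j].

    x is the list of n inputs; G is an n-by-m list of conductance values.
    DAC, TIA and ADC stages are assumed ideal and are not modelled.
    """
    n = len(x)
    m = len(G[0])
    if len(G) != n or any(len(row) != m for row in G):
        raise ValueError("G must be an n-by-m matrix matching len(x)")
    return [sum(x[i] * G[i][j] for i in range(n)) for j in range(m)]


# Hypothetical values chosen only for illustration.
x = [1.0, 2.0, 3.0]
G = [[0.5, 1.0],
     [0.25, 0.0],
     [1.0, 0.5]]
print(crossbar_mac(x, G))  # [4.0, 2.5]
```

Because every output is a weighted sum over the stored conductances, any error in a conductance value propagates directly into the result, which is why the precision of this architecture is tied to the precision of the conductances.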
[0005] Therefore, there is an urgent need for an in-memory computing architecture that enables mixed-precision computing, thereby improving computational efficiency and reducing power consumption.
Summary of the Invention
[0006] This application provides an in-memory computing circuit, computing system, and computing method to achieve mixed-precision in-memory computing, improve computing efficiency, and reduce power consumption.
[0007] In a first aspect, embodiments of this application provide an in-memory computing circuit. This in-memory computing circuit includes an optical control computing module, a photoelectric conversion module, and an electrical control computing module. The optical control computing module can perform a first neural network layer calculation, based on a first set of weighting coefficients, on a first set of optical signals that indicate a first set of data, thereby obtaining a second set of optical signals that carry a first set of calculation results. Then, the photoelectric conversion module receives the second set of optical signals and converts them into a first set of electrical signals. After receiving the first set of electrical signals, the electrical control computing module can perform a second neural network layer calculation on the first set of electrical signals based on a second set of weighting coefficients.
[0008] It should be understood that the in-memory computing circuit provided in the first aspect can be regarded as a type of neural network chip, used to realize neural network layer computation.
[0009] In the in-memory computing circuit provided in the first aspect, the optically controlled computing module is controlled by optical signals and therefore achieves high computational accuracy, making it suitable as a high-precision computing module. Furthermore, computation via optical signals improves computational efficiency. The electrically controlled computing module, by contrast, is controlled by electrical signals and has lower computational accuracy, so it can be used as a low-precision computing module to reduce power consumption. This approach enables mixed-precision computing, preserving the efficiency of high-precision operations while reducing power consumption.
[0010] In one possible design, the optically modulated computation module may include at least one optically modulated phase change material (PCM) array. This PCM array performs multiplication and accumulation calculations on the first set of data and the first set of weight coefficients to obtain the first set of calculation results. Each optically modulated PCM array may include multiple PCM units, and the crystallinity of the phase change material used in these units differs. The crystallinity of the phase change material used in each PCM unit is used to indicate the weight coefficients. Using this scheme, the multiplication and accumulation calculations of the first neural network layer can be achieved through optically modulated PCM arrays.
[0011] In one possible design, the electronically controlled calculation module may include an electronically controlled PCM array, which multiplies and accumulates the first set of calculation results with the second set of weight coefficients to obtain the second set of calculation results.
[0012] Using the above scheme, the multiplication and accumulation calculation of the second neural network layer can be realized through an electrically controlled PCM array. Simultaneously, both the optically controlled and electrically controlled calculation modules are implemented using PCM arrays, thus allowing them to be integrated onto the same silicon substrate. In other words, compared to existing technologies that use a CPU+PCM array for mixed-precision computing, the in-memory computing circuit provided in the first aspect can be integrated on the same silicon substrate, achieving single-chip integration.
[0013] In one possible design, the photoelectric conversion module may include a detector array and a conversion module. The detector array is used to detect the light intensity of the second set of optical signals to obtain a set of photocurrents, and the conversion module is used to convert the set of photocurrents into a first set of electrical signals, wherein the first set of electrical signals is a voltage signal.
[0014] In one possible design, the in-memory computing circuit provided in the first aspect may further include a routing module coupled to the photoelectric conversion module, used to route the first set of electrical signals to the electrical control computing module. Using the above scheme, the output of the optical control computing module can be matched to the input of the electrical control computing module through the routing module.
[0015] In one possible design, the in-memory computing circuit provided in the first aspect may further include a light-emitting array for sending a first set of optical signals based on multiple elements in the first set of data. Using the above scheme, the input electrical signal can be converted into an optical signal by the light-emitting array, which is then used by the optically controlled computing module to perform calculations for the first neural network layer.
[0016] In practical applications, light-emitting arrays can be implemented in various ways. The following are three examples illustrating these implementation methods.
[0017] Implementation Method 1
[0018] The light-emitting array includes a laser array, which contains multiple lasers. Each laser has a modulation function, so the laser array can modulate multiple elements in the first set of data onto multiple optical signals and send the multiple optical signals. The multiple optical signals form the first set of optical signals.
[0019] Implementation Method 2
[0020] The light-emitting array includes a laser array and a modulator array. The laser array includes multiple lasers, which are used to transmit multiple optical signals. The lasers do not have modulation capabilities and require an additional modulator array to modulate the optical signals. The modulator array includes multiple modulators, which correspond one-to-one with the lasers. The multiple modulators receive the optical signals emitted by their corresponding lasers and modulate multiple elements from the first set of data onto the multiple optical signals. The modulated multiple optical signals form the first set of optical signals.
[0021] Implementation Method 3
[0022] The light-emitting array includes an optical frequency comb source and a modulator array. The optical frequency comb source is used to generate an optical frequency comb (also called a frequency comb), which refers to a spectrum composed of a series of uniformly spaced frequency components with a coherent and stable phase relationship. Specifically, the optical frequency comb can be understood as including multiple optical signals. The modulator array includes multiple modulators, which correspond one-to-one with the multiple optical signals. The multiple modulators receive their corresponding optical signals and modulate multiple elements in the first set of data onto the multiple optical signals. The modulated multiple optical signals form the first set of optical signals.
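As a rough illustration of Implementation Method 3, the sketch below generates uniformly spaced comb lines and pairs each line with one data element, mirroring the one-to-one correspondence between modulators and optical signals. The function names, the starting frequency, and the line spacing are assumptions made for the example; phase coherence and the physical modulation format are not modelled.

```python
def comb_lines(f0_hz, spacing_hz, count):
    """Return `count` uniformly spaced frequencies: f0, f0 + spacing, ..."""
    return [f0_hz + k * spacing_hz for k in range(count)]


def modulate(lines_hz, data):
    """Pair each comb line with one data element (one modulator per line).

    Each returned tuple is (carrier frequency in Hz, modulated amplitude).
    """
    if len(lines_hz) != len(data):
        raise ValueError("one modulator per comb line is assumed")
    return list(zip(lines_hz, data))


# Hypothetical numbers: a 193.1 THz first line and 100 GHz line spacing.
lines = comb_lines(193_100_000_000_000, 100_000_000_000, 4)
first_optical_signals = modulate(lines, [0.2, 0.4, 0.6, 0.8])
for freq, amplitude in first_optical_signals:
    print(freq, amplitude)
```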
[0023] In a second aspect, embodiments of this application provide a computing system, which includes a processor and the in-memory computing circuit provided in the first aspect or any possible design thereof. The processor is used to send a first set of data to the in-memory computing circuit, and the in-memory computing circuit is used to perform neural network layer calculations based on the first set of data.
[0024] In a third aspect, embodiments of this application also provide a calculation method, which includes the following steps: performing a first neural network layer calculation on a first set of optical signals based on a first set of weight coefficients to obtain a second set of optical signals, wherein the first set of optical signals is used to indicate the first set of data, and the second set of optical signals is used to carry the first set of calculation results; converting the second set of optical signals into a first set of electrical signals; and performing a second neural network layer calculation on the first set of electrical signals based on the second set of weight coefficients.
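The three steps of the method can be sketched as a small numerical pipeline. This is a simplified model under stated assumptions: both neural network layer calculations are reduced to a plain multiply-accumulate, the photoelectric conversion is modelled as a unit-gain scaling, and all function names and values are hypothetical.

```python
def mac(inputs, weights):
    """Multiply-accumulate: out[j] = sum over i of inputs[i] * weights[i][j]."""
    return [sum(v * weights[i][j] for i, v in enumerate(inputs))
            for j in range(len(weights[0]))]


def photoelectric_convert(optical, gain=1.0):
    """Change the signal form only; the carried values are scaled by `gain`."""
    return [gain * p for p in optical]


def compute(first_data, first_weights, second_weights):
    # Step 1: first layer calculation in the optical domain.
    second_optical = mac(first_data, first_weights)
    # Step 2: convert the second set of optical signals to electrical signals.
    first_electrical = photoelectric_convert(second_optical)
    # Step 3: second layer calculation in the electrical domain.
    return mac(first_electrical, second_weights)


# Hypothetical example values.
result = compute([1.0, 2.0],
                 [[0.5, 0.25], [0.5, 0.75]],
                 [[2.0], [4.0]])
print(result)  # [10.0]
```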
[0025] In one possible design, performing the first neural network layer calculation on the first set of optical signals based on the first set of weight coefficients can be implemented as follows: multiplying and accumulating the first set of data with the first set of weight coefficients to obtain the first set of calculation results.
[0026] In one possible design, the second neural network layer calculation is performed on the first set of electrical signals based on the second set of weight coefficients. Specifically, this can be achieved by performing a multiply-accumulate calculation on the received first set of electrical signals based on the second set of weight coefficients to obtain the second set of electrical signals. The second set of electrical signals is used to indicate the result of the multiply-accumulate calculation of the first set of calculation results and the second set of weight coefficients.
[0027] Furthermore, it should be understood that, for the technical effects of the second and third aspects and any of their possible designs, reference can be made to the technical effects of the corresponding designs in the first aspect; they are not repeated here.
Attached Figure Description
[0028] Figure 1 is a schematic diagram of the structure of an in-memory computing unit provided in the prior art;
[0029] Figure 2 is a schematic diagram of the structure of a neural network system provided in an embodiment of this application;
[0030] Figure 3 is a schematic diagram of the structure of the first in-memory computing circuit provided in the embodiments of this application;
[0031] Figure 4 is a schematic diagram of the structure of the second in-memory computing circuit provided in the embodiments of this application;
[0032] Figure 5 is a schematic diagram of the structure of an optically controlled computing module provided in an embodiment of this application;
[0033] Figure 6 is a schematic diagram of the structure of an optically controlled PCM array provided in an embodiment of this application;
[0034] Figure 7 is a schematic diagram of the structure of an electrically controlled PCM array provided in an embodiment of this application;
[0035] Figure 8 is a schematic diagram of the structure of another electrically controlled PCM array provided in an embodiment of this application;
[0036] Figure 9 is a schematic diagram of the structure of the third in-memory computing circuit provided in the embodiments of this application;
[0037] Figure 10a is a schematic diagram of the structure of the first type of light-emitting array provided in the embodiments of this application;
[0038] Figure 10b is a schematic diagram of the structure of the second type of light-emitting array provided in the embodiments of this application;
[0039] Figure 10c is a schematic diagram of the structure of the third type of light-emitting array provided in the embodiments of this application;
[0040] Figure 11 is a schematic diagram of the structure of the fourth in-memory computing circuit provided in the embodiments of this application;
[0041] Figure 12 is a schematic diagram of the structure of a computing system provided in an embodiment of this application;
[0042] Figure 13 is a flowchart of a calculation method provided in an embodiment of this application.
Detailed Implementation
[0043] The embodiments of this application will now be described in further detail with reference to the accompanying drawings.
[0044] It should be noted that in the embodiments of this application, "multiple" refers to two or more. Furthermore, in the description of this application, terms such as "first" and "second" are used only for descriptive purposes and should not be construed as indicating or implying relative importance, nor as indicating or implying order. The "coupling" mentioned in the embodiments of this application refers to electrical connection, which can specifically include both direct and indirect connections.
[0045] First, a brief introduction to the application scenarios of the embodiments of this application will be given. The embodiments of this application can be applied to neural network systems. A neural network system may include multiple neural network layers. In the embodiments of this application, a neural network layer is a logical layer concept; one neural network layer performs one neural network operation. Each neural network computation is implemented by computation nodes. Neural network layers may include convolutional layers, pooling layers, etc.
[0046] As shown in Figure 2, a neural network system can include n neural network layers (also known as an n-layer neural network), where n is an integer greater than or equal to 2. Figure 2 shows a portion of the neural network layers in a neural network system. As shown in Figure 2, the neural network system may include a first layer 202, a second layer 204, a third layer 206, a fourth layer 208, a fifth layer 210, and an nth layer 212. The first layer 202 can perform convolution operations, the second layer 204 can perform pooling operations on the output data of the first layer 202, the third layer 206 can perform convolution operations on the output data of the second layer 204, the fourth layer 208 can perform convolution operations on the output data of the third layer 206, and the fifth layer 210 can perform summation operations on the output data of the second layer 204 and the output data of the fourth layer 208, and so on.
[0047] It can be understood that Figure 2 is just a simple example and explanation of the neural network layers in a neural network system, and it does not restrict the specific operations of each neural network layer. For example, the fourth layer 208 can also perform a pooling operation, and the fifth layer 210 can also perform a convolution operation, a pooling operation, or another neural network operation.
[0048] The in-memory computing circuit provided in this embodiment is mainly used to perform convolution operations in neural network layers. In the in-memory computing circuit, the optically controlled computing module can be regarded as a computing node used to perform calculations for the first neural network layer, and the electrically controlled computing module can be regarded as another computing node used to perform calculations for the second neural network layer.
[0049] Figure 3 shows an in-memory computing circuit provided in an embodiment of this application. As shown in Figure 3, the in-memory computing circuit 300 includes a light-controlled computing module 301, a photoelectric conversion module 302, and an electrical-controlled computing module 303.
[0050] The optical modulation calculation module 301 is used to perform a first neural network layer calculation on the first set of optical signals based on a first set of weight coefficients to obtain a second set of optical signals. The first set of optical signals is used to indicate the first set of data, and the second set of optical signals is used to carry the first set of calculation results. The photoelectric conversion module 302 is used to receive the second set of optical signals and convert the second set of optical signals into the first set of electrical signals. The electrical modulation calculation module 303 is used to perform a second neural network layer calculation on the first set of electrical signals based on a second set of weight coefficients.
[0051] Specifically, after the electrical control calculation module 303 performs the second neural network layer calculation on the first set of electrical signals based on the second set of weight coefficients, it can obtain the second set of electrical signals, which are used to carry the second set of calculation results.
[0052] It should be noted that in the embodiments of this application, the data or calculation results are carried by optical signals or electrical signals. When the signal undergoes photoelectric conversion or electro-optical conversion, only the form of the signal carrying the data or calculation results is converted, while the data or calculation results carried by the signal remain unchanged.
[0053] Furthermore, it should be understood that the first set of data includes multiple data points (multiple elements), which are usually different, but in some cases, the same data may exist among the multiple data points. Similarly, the first set of calculation results also includes multiple calculation results, which are usually different data points, and the same applies to the second set of calculation results.
[0054] In a neural network system, different neural network layers require different levels of computational precision. For example, in a CNN network, the first and last layers require high-precision computation, while the intermediate layers only require low-precision computation. Therefore, the computation of the neural network layer requiring high-precision computation (i.e., the first neural network layer) can be achieved through the optically controlled computation module 301, while the computation of the neural network layer requiring low-precision computation (i.e., the second neural network layer) can be achieved through the electrically controlled computation module 303.
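One way to picture this assignment is a simple mapping from layer position to computing module. The sketch below follows the CNN example in the text (first and last layers on the high-precision optical module, intermediate layers on the low-precision electrical module); the function name and the string labels are assumptions, and a real system could use any other precision policy.

```python
def assign_modules(num_layers):
    """Map each layer index to 'optical' (high precision) or 'electrical'.

    Follows the CNN example: only the first and last layers are treated
    as requiring high-precision computation.
    """
    if num_layers < 1:
        raise ValueError("num_layers must be at least 1")
    last = num_layers - 1
    return ["optical" if i in (0, last) else "electrical"
            for i in range(num_layers)]


print(assign_modules(5))
# ['optical', 'electrical', 'electrical', 'electrical', 'optical']
```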
[0055] Using the aforementioned in-memory computing circuit 300, the optical control computing module 301 uses optical signals for control, resulting in higher calculation accuracy, and can therefore be used as a high-precision computing module; meanwhile, the electrical control computing module 303 uses electrical signals for control, resulting in lower calculation accuracy, and can therefore be used as a low-precision computing module, thereby achieving mixed-precision computing.
[0056] It should be noted that, in this embodiment, the optical modulation calculation module 301 can be considered as one computing node for performing the calculation of the first neural network layer, and the electrical modulation calculation module 303 can be considered as another computing node for performing the calculation of the second neural network layer. In practical applications, the neural network system may also include other neural network layers besides the first and second neural network layers. The calculation of other neural network layers can be implemented by other computing nodes (such as another optical modulation calculation module or electrical modulation calculation module), or the calculation of other neural network layers can reuse the optical modulation calculation module 301 or the electrical modulation calculation module 303. For example, the second set of electrical signals output by the electrical modulation calculation module 303, after electro-optical conversion, can be used as the input of the optical modulation calculation module 301, which then performs the calculation of the third neural network layer.
[0057] Furthermore, the in-memory computing circuit 300 may also include a routing module. As shown in Figure 4, the routing module is coupled to the photoelectric conversion module 302 and is used to route the first set of electrical signals to the electrical control calculation module 303.
[0058] The following sections will provide a detailed introduction to each unit / module in the in-memory computing circuit 300.
[0059] I. Light Control Calculation Module 301
[0060] Specifically, the light-controlled calculation module 301 may include at least one light-controlled phase change material (PCM) array, which is used to multiply and accumulate the first set of data with the first set of weighting coefficients to obtain the first set of calculation results.
[0061] It should be understood that in this embodiment, the first set of data refers to all the data on which the first neural network layer calculation needs to be performed. After the first neural network layer calculation is performed on the first set of data, all the data needed for the second neural network layer calculation is obtained. In some cases, the first neural network layer calculation can be performed on the first set of data in multiple passes, with each pass processing a portion of the data in the first set of data. The photoelectric conversion module 302 can cache the results obtained from the multiple passes. After all the data in the first set of data has been processed, the results obtained from the multiple passes are integrated and sent to the electrical control calculation module 303 for the second neural network layer calculation. The sum of the results obtained from the multiple passes can be regarded as the aforementioned first set of calculation results.
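The multi-pass scheme can be sketched as follows. The example assumes that the first set of data is split along the input dimension, so that each pass yields a partial sum for every output and the cached partial results add up to the full result; the function names and values are hypothetical.

```python
def mac(inputs, weights):
    """Multiply-accumulate: out[j] = sum over i of inputs[i] * weights[i][j]."""
    return [sum(v * weights[i][j] for i, v in enumerate(inputs))
            for j in range(len(weights[0]))]


def multi_pass_mac(data, weights, chunk_size):
    """Process `data` in chunks, cache each partial result, then sum them."""
    m = len(weights[0])
    cached = []  # plays the role of the cache in the conversion module
    for start in range(0, len(data), chunk_size):
        part = data[start:start + chunk_size]
        rows = weights[start:start + chunk_size]
        cached.append(mac(part, rows))
    # Integrate the cached partial results into the first set of results.
    return [sum(partial[j] for partial in cached) for j in range(m)]


data = [1.0, 2.0, 3.0, 4.0]
weights = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0], [0.25, 0.25]]
print(multi_pass_mac(data, weights, chunk_size=2))  # [3.0, 5.0]
print(mac(data, weights))                           # [3.0, 5.0]
```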
[0062] Therefore, in practical applications, multiple optically controlled PCM arrays can be set up according to the needs of the application scenario, with each optically controlled PCM array used to perform calculations on a portion of the data. Alternatively, a single optically controlled PCM array can be reused for multiple calculations. In the subsequent description of the embodiments of this application, for the sake of simplicity, an example is provided where the optically controlled calculation module 301 includes one optically controlled PCM array.
[0063] Specifically, the optically controlled PCM array can adopt a crossbar structure. The optically controlled PCM array includes multiple PCM units, each of which uses a phase change material with a different degree of crystallinity. The crystallinity of the phase change material is used to indicate the different weighting coefficients in the first set of weighting coefficients.
[0064] Optically modulated PCM matrices achieve optical modulation based on the evanescent wave mechanism. Specifically, the refractive index of phase change alloy materials varies significantly under different crystal states, leading to large differences in the absorption rate of the optical signal. Therefore, when the first set of optical signals carrying the first set of data passes through PCM units with different crystallinities, the intensity of the optical pulse changes, which is equivalent to multiplying and summing the first set of data with the first set of weighting coefficients. In practical applications, the optically modulated PCM matrix can employ a crossbar structure composed of phase change alloy materials (such as Ge2Sb2Te5) sputtered onto materials such as Si or Si3N4.
[0065] The following uses Figure 5 to illustrate the process of multiplying and accumulating the first set of data with the first set of weight coefficients. For example, P1 and P2 represent two elements in the first set of optical signals. When P1 and P2 pass through PCM units with different crystallinities, this is equivalent to multiplying them by G11 and G12, respectively, where G11 and G12 are weight coefficients in the first set of weight coefficients. The output results are P1G11 and P2G12. Superimposing the outputs P1G11 and P2G12 is equivalent to accumulation.
[0066] For example, assume the optically modulated PCM array is an n*m array, meaning the first group of optical signals includes n optical signals and the first group of electrical signals includes m electrical signals (i.e., the first group of calculation results includes m calculation results). A possible structure of the optically modulated PCM array is shown in Figure 6. The optically modulated PCM array shown in Figure 6 employs a crossbar structure. PCM units are positioned on microrings, which are coupled to the transmission paths of the input signals (X1, X2, ..., Xn) and the output signals (Y1, Y2, ..., Ym). For example, for optical signal X1, most of X1 is transmitted through the coupled microrings, with only a negligible portion transmitted along the original input channel. Therefore, for the first group of optical signals (X1, X2, ..., Xn), each optical signal undergoes a change in intensity after passing through PCM units with different crystallinity, which is equivalent to multiplication with different weighting coefficients. Then, each column of output channels is used to accumulate the optical signals obtained after the multiplication operation.
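A toy model of the n*m optical array is given below. The linear mapping from crystallinity to transmitted fraction, the `max_absorption` parameter, and the function names are assumptions introduced only for illustration; the source states only that different crystallinities change the optical intensity by different amounts. Microring coupling losses are ignored.

```python
def transmission(crystallinity, max_absorption=0.8):
    """Hypothetical linear model: more crystalline means more absorption."""
    if not 0.0 <= crystallinity <= 1.0:
        raise ValueError("crystallinity must lie in [0, 1]")
    return 1.0 - max_absorption * crystallinity


def optical_mac(powers, crystallinity):
    """Y[j] = sum over i of X[i] * transmission(crystallinity[i][j])."""
    n = len(powers)
    m = len(crystallinity[0])
    if len(crystallinity) != n:
        raise ValueError("crystallinity must have one row per input signal")
    return [sum(powers[i] * transmission(crystallinity[i][j])
                for i in range(n))
            for j in range(m)]


# Hypothetical 2*2 array: rows are inputs X1, X2; columns are outputs Y1, Y2.
outputs = optical_mac([1.0, 2.0], [[0.0, 1.0], [0.5, 0.25]])
print(outputs)  # approximately [2.2, 1.8]
```

In this model the weight written into each PCM unit is its transmitted fraction, so weights are confined to a bounded positive range.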
[0067] II. Electrical Control Calculation Module 303
[0068] In this embodiment, the electrical control calculation module 303 is used to perform a second neural network layer calculation on the first group of electrical signals based on the second set of weight coefficients.
[0069] Specifically, the electrical control calculation module 303 may include an electrical control PCM array, used to multiply and accumulate the first set of calculation results with the second set of weight coefficients to obtain the second set of calculation results.
[0070] After the first set of electrical signals is input to the electrical control calculation module 303, it passes through conductances of different values, which is equivalent to multiplying the first set of calculation results by the second set of weighting coefficients to obtain multiple intermediate calculation results. Finally, the electrical control calculation module 303 outputs the second set of calculation results.
[0071] In practical applications, electrically controlled PCM arrays can adopt a structure with one transistor and one conductance (referred to as 1T1R), a structure with two transistors and two conductances (referred to as 2T2R), or a crossbar structure.
[0072] For example, Figure 7 shows an electrically controlled PCM array with a 1T1R structure. In this structure, signals (P1, P2, ..., Pb) are used to select transistors. When a transistor is selected, a first set of electrical signals (A1, A2, ..., Aa) are transmitted through the transistor to conductances (resistors) of different conductance values. Different conductance values are equivalent to storing different weighting coefficients, thereby realizing the multiplication of the first set of electrical signals with the second set of weighting coefficients. The signals output by different conductances in the same column are then accumulated, thereby realizing multiplication and accumulation calculation and outputting (B1, B2, ..., Bb).
[0073] As another example, Figure 8 shows an electrically controlled PCM array with a crossbar structure. In this structure, a first set of electrical signals (A1, A2, ..., Aa) are transmitted to conductances with different conductance values, thereby realizing the multiplication of the first set of electrical signals with the second set of weighting coefficients. The signals output by different conductances in the same column are then accumulated, thereby realizing multiplication and accumulation calculation and outputting (B1, B2, ..., Bb).
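The behaviour of the 1T1R structure can be approximated with the sketch below. It assumes that each select signal Pj gates one output column, so that an unselected column contributes no output; this gating granularity, the function name, and the values are assumptions, since the exact wiring is defined by Figure 7 rather than by the text.

```python
def one_t_one_r_mac(inputs, conductances, select):
    """B[j] = sum over i of A[i] * G[i][j] if column j is selected, else 0."""
    rows = len(inputs)
    cols = len(select)
    if len(conductances) != rows or any(len(r) != cols for r in conductances):
        raise ValueError("conductances must be len(inputs) by len(select)")
    outputs = []
    for j in range(cols):
        if select[j]:
            # Selected transistors pass the inputs on to the conductances.
            outputs.append(sum(inputs[i] * conductances[i][j]
                               for i in range(rows)))
        else:
            outputs.append(0.0)
    return outputs


A = [1.0, 2.0]                 # first set of electrical signals
G = [[0.5, 1.0], [0.25, 0.5]]  # stored weights (hypothetical values)
print(one_t_one_r_mac(A, G, [True, True]))   # [1.0, 2.0]
print(one_t_one_r_mac(A, G, [True, False]))  # [1.0, 0.0]
```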
[0074] It should be understood that the implementation of the electrically controlled PCM array is prior art. Structures of electrically controlled PCM arrays in the prior art are also applicable to the embodiments of this application and are not described again here.
[0075] The following section describes the computational accuracy of optically controlled PCM arrays and electrically controlled PCM arrays: In electrically controlled PCM arrays, different voltage values are applied to the array's input terminals before computation, thus writing different conductance values. Applying voltage is equivalent to heating the molecules or atoms in the PCM units, causing them to rearrange and change from an amorphous to a crystalline state, resulting in different degrees of crystallinity. Since changing the crystalline state of the material through heating can easily cause material defects, it affects the computational accuracy of electrically controlled PCM arrays. In optically controlled PCM arrays, different weighting coefficients are written by introducing light signals before computation. This process, under the influence of a light field, causes the PCM units to change from an amorphous to a crystalline state. Because this process does not involve heating the material, it is less likely to cause material defects, thus resulting in higher computational accuracy for optically controlled PCM arrays.
[0076] It should be noted that, in the embodiments of this application, the electrically controlled PCM array can also be reused. For example, after using the electrically controlled PCM array to perform the calculation of the second neural network layer, the electrically controlled PCM array can be used again to perform the calculation of other neural network layers. In this case, the electrically controlled calculation module 303 may also include a trans-impedance amplifier (TIA). The TIA is used to integrate the calculation results of the electrically controlled PCM array, thereby merging and outputting the multiple calculation results.
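The merging role of the TIA when the electrically controlled PCM array is reused can be sketched as follows. The sketch assumes that "integration" means summing the per-column output currents of several calculations and scaling by a single gain; the function name `integrate_batches` and the `gain` parameter are illustrative assumptions.

```python
def integrate_batches(batches, gain=1.0):
    """Merge several per-column output currents into one output.

    batches -- list of calculation results, each a list of b column currents
    gain    -- assumed current-to-voltage gain of the amplifier
    """
    if not batches:
        return []
    width = len(batches[0])
    if any(len(batch) != width for batch in batches):
        raise ValueError("every batch must have the same number of columns")
    # Sum column by column across batches, then convert current to voltage.
    return [gain * sum(batch[j] for batch in batches) for j in range(width)]
```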
[0077] Furthermore, in existing technologies, to achieve mixed-precision in-memory computing, a CPU+PCM array architecture is typically adopted, where the CPU performs high-precision computing tasks and the electrically controlled PCM array performs low-precision computing tasks. However, this approach results in a large amount of data exchange between high- and low-precision units, leading to low computational efficiency and high power consumption. In the embodiments of this application, if the optically controlled computing module 301 uses an optically controlled PCM array and the electrically controlled computing module 303 uses an electrically controlled PCM array, then the optically controlled PCM array and the electrically controlled PCM array can be integrated on the same silicon substrate. Compared with the existing technology that uses a CPU+PCM array to achieve mixed-precision computing, the in-memory computing circuit 300 can be integrated on the same silicon substrate, achieving single-chip integration.
[0078] III. Data Conversion Module 302
[0079] In this embodiment of the application, the data conversion module 302 is used to convert the second set of optical signals into the first set of electrical signals.
[0080] Specifically, the data conversion module 302 may include a detector array and a conversion module. The detector array is used to detect the light intensity of the second set of optical signals to obtain a set of photocurrents; the conversion module is used to convert the set of photocurrents into a first set of electrical signals, wherein the first set of electrical signals is a voltage signal.
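The two stages of the data conversion module can be sketched as follows. The sketch assumes a linear detector (photocurrent proportional to light intensity) and a linear current-to-voltage conversion; the `responsivity` and `gain` parameters and both function names are illustrative assumptions, since this application does not specify the conversion characteristics.

```python
def detect_photocurrents(intensities, responsivity=1.0):
    """Detector array: one photocurrent per optical signal.

    Assumes photocurrent = responsivity * light intensity.
    """
    return [responsivity * p for p in intensities]


def to_voltages(photocurrents, gain=1.0):
    """Conversion module: turn each photocurrent into a voltage signal.

    Assumes voltage = gain * photocurrent.
    """
    return [gain * i for i in photocurrents]
```

For example, `to_voltages(detect_photocurrents(second_optical_signals))` would yield the first set of electrical signals under these assumptions.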
[0081] Furthermore, as mentioned earlier, the calculation of the first neural network layer on the first set of data can be performed multiple times, with each calculation targeting only a portion of the data in the first set. The data conversion module 302 can store the results of each calculation. After all the calculations on the first set of data are completed, the results from the multiple calculations are integrated and sent to the electrical control calculation module 303 for the calculation of the second neural network layer. In practical applications, a threshold parameter can be set in the data conversion module 302 as a nonlinear activation function, triggering the next operation (the calculation process of the electrical control calculation module 303) only after the calculation results have accumulated to a certain level.
[0082] For example, suppose the second set of optical signals output in parallel by the optical control calculation module 301 includes 10 optical signals, while the input of the electrical control calculation module 303 has 50 channels. The data conversion module 302 can then trigger the calculation of the electrical control calculation module 303 once the number of optical signals output by the optical control calculation module 301 has accumulated to 50, that is, after the optical control calculation module 301 has performed 5 calculations.
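The store-and-trigger behaviour in this example can be sketched as a small buffer. The sketch models the threshold as a count of accumulated signals, matching the 10-per-calculation, 50-channel example; the class name `ConversionBuffer` and its interface are illustrative assumptions.

```python
class ConversionBuffer:
    """Stores converted results until enough have accumulated to trigger
    the next calculation (a count threshold is assumed here)."""

    def __init__(self, required_count):
        if required_count <= 0:
            raise ValueError("required_count must be positive")
        self.required_count = required_count
        self._stored = []

    def push(self, signals):
        """Store one batch of signals.

        Returns the accumulated inputs once the threshold is reached,
        otherwise None (the next calculation is not triggered yet).
        """
        self._stored.extend(signals)
        if len(self._stored) < self.required_count:
            return None
        ready = self._stored[:self.required_count]
        self._stored = self._stored[self.required_count:]
        return ready
```

With `ConversionBuffer(50)` and batches of 10 signals, the first four calls to `push` return `None` and the fifth returns the 50 accumulated inputs.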
[0083] IV. Light-emitting array
[0084] In this embodiment of the application, the in-memory computing circuit 300 may further include a light-emitting array, as shown in Figure 9. The light-emitting array is used to send a first set of light signals based on multiple elements in the first set of data, where each light signal in the first set of light signals indicates one element in the first set of data.
[0085] Specifically, the light-emitting array may include a laser array, as shown in Figure 10a. The laser array includes multiple laser diodes (LDs). These LDs transmit multiple optical signals based on multiple elements in the first set of data, forming the first set of optical signals. In practical applications, the optical signals emitted by lasers LD1 and LD2 have different intensities, which indicate different elements in the first set of data.
[0086] In another implementation, as shown in Figure 10b, the light-emitting array includes a laser array and a modulator array. The laser array includes multiple lasers for transmitting multiple optical signals (some of which may have the same wavelength). The modulator array includes multiple modulators, each corresponding to one laser, which receive the multiple optical signals and modulate multiple elements from the first set of data onto them. For example, after laser LD1 emits an optical signal, modulator MD1 can adjust the intensity of that optical signal according to the first element to be loaded, so as to load the first element onto the optical signal. Similarly, after laser LD2 emits an optical signal, modulator MD2 can adjust the intensity of that optical signal according to the second element to be loaded, so as to load the second element onto the optical signal. It can be understood that, in practical applications, the number of modulators in the light-emitting array can be the same as the number of lasers, or multiple lasers can share the same modulator. However, when multiple lasers share the same modulator, the modulation efficiency may be affected.
[0087] In another implementation, as shown in Figure 10c, the light-emitting array includes an optical frequency comb source and a modulator array. The optical frequency comb source is used to generate the optical frequency comb. An optical frequency comb, also known as an optical frequency distribution, refers to a spectrum composed of a series of uniformly spaced frequency components with a coherent and stable phase relationship. The modulator array includes multiple modulators.
[0088] Specifically, the optical frequency comb source is connected to multiple modulators. If the optical frequency comb source emits m optical signals, it can be connected to m modulators. The m modulators can modulate the intensity of the received optical signals according to m elements in the first set of data, thereby modulating the m elements onto the received optical signals. For example, modulator MD1 can adjust the intensity of its received optical signal according to the first element in the first set of data, thereby modulating the first element onto the optical signal. Similarly, modulator MD2 can adjust the intensity of its received optical signal according to the second element in the first set of data, thereby modulating the second element onto the optical signal.
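The intensity modulation performed by the modulator array can be sketched as follows. The sketch assumes that each data element has been normalised to the range 0 to 1 and that a modulator simply scales the intensity of its carrier by that element; the normalisation, the function name `modulate`, and the one-modulator-per-carrier pairing are illustrative assumptions.

```python
def modulate(carrier_intensities, elements):
    """Modulator array: load one data element onto each optical carrier.

    carrier_intensities -- intensity of each unmodulated optical signal
    elements            -- data elements, assumed normalised to [0, 1]
    """
    if len(carrier_intensities) != len(elements):
        raise ValueError("one element is needed per optical carrier")
    for element in elements:
        if not 0.0 <= element <= 1.0:
            raise ValueError("elements must be normalised to [0, 1]")
    # Each modulator attenuates its carrier in proportion to its element.
    return [carrier * element
            for carrier, element in zip(carrier_intensities, elements)]
```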
[0089] Furthermore, in this embodiment of the application, in order to achieve large-scale parallel transmission of optical signals, wavelength division multiplexing technology can be used when transmitting optical signals between the light-emitting array and the optical control computing module 301. That is, the light-emitting array combines the first group of optical signals into an optical beam, and then transmits the optical beam to the optical control computing module 301 through an optical waveguide.
[0090] By adopting the above scheme, parallel transmission of optical signals can be achieved. When transmitting optical signals between the light-emitting array and the optical control computing module 301, only one optical waveguide is needed.
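The idea of carrying the whole first group of optical signals over a single waveguide can be sketched with a simple data model. The sketch assumes that every signal occupies its own wavelength and represents the combined beam as a mapping from wavelength to intensity; the function names, the nanometre unit, and the separation step on the receiving side are illustrative assumptions.

```python
def multiplex(channels):
    """Combine (wavelength_nm, intensity) pairs into one beam.

    The beam is modelled as a dict keyed by wavelength, standing in
    for a single optical waveguide carrying all signals at once.
    """
    beam = {}
    for wavelength_nm, intensity in channels:
        if wavelength_nm in beam:
            raise ValueError("each signal needs its own wavelength")
        beam[wavelength_nm] = intensity
    return beam


def demultiplex(beam, wavelengths_nm):
    """Recover the individual signal intensities in the requested order."""
    return [beam[wavelength] for wavelength in wavelengths_nm]
```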
[0091] V. Routing Module
[0092] In this embodiment of the application, the routing module is used to route the first group of electrical signals to the input terminal of the electrical control calculation module 303.
[0093] As mentioned earlier, the electrical control calculation module 303 has a larger array than the optical control calculation module 301. The routing module is used to match the output signals of the smaller array to the larger array. For example, assuming the size of the optical control PCM array is n*m and the size of the electrical control PCM array is a*b, where m < a, the routing module can configure which of the a input terminals of the electrical control calculation module 303 receive the m calculation results obtained by the optical control calculation module 301 each time.
[0094] In practical applications, the routing module can be implemented by a multiplexer (MUX) that selects m of the a input terminals.
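The selection of m input terminals out of a can be sketched as follows. The sketch assumes that unselected terminals receive zero; the function name `route` and the explicit list of terminal positions are illustrative assumptions about how a routing configuration might be expressed.

```python
def route(results, input_positions, total_inputs):
    """Place m calculation results onto m of the a input terminals.

    results         -- the m results from the smaller array
    input_positions -- the m terminal indices (0-based) chosen for them
    total_inputs    -- a, the number of input terminals of the larger array
    """
    if len(results) != len(input_positions):
        raise ValueError("one position is needed per result")
    if len(set(input_positions)) != len(input_positions):
        raise ValueError("each terminal can receive only one result")
    inputs = [0.0] * total_inputs  # unselected terminals get no signal
    for value, position in zip(results, input_positions):
        if not 0 <= position < total_inputs:
            raise ValueError("position outside the input terminals")
        inputs[position] = value
    return inputs
```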
[0095] In summary, in the in-memory computing circuit 300 provided in this embodiment, the optically controlled computing module 301 uses light pulses to control the computing process and therefore achieves high computing accuracy, so it can serve as the high-precision computing module. Meanwhile, the electrically controlled computing module 303 can serve as the low-precision computing module, meeting the needs of low-precision computing and reducing power consumption. Together they achieve mixed-precision computing and improve computing efficiency.
[0096] Furthermore, compared to existing technologies that use a CPU+PCM array for mixed-precision computing, the in-memory computing circuit 300 can be integrated on the same silicon substrate, achieving single-chip integration and reducing fabrication complexity. Employing a fully analog-domain mixed-precision computing scheme, data transmission within the in-memory computing circuit 300 eliminates the need for digital-to-analog and analog-to-digital conversions, resulting in high computational efficiency.
[0097] The following describes in detail, through a specific example, an in-memory computing circuit provided in an embodiment of this application.
[0098] Figure 11 illustrates an in-memory computing circuit provided in an embodiment of this application, used to achieve mixed-precision computing. It includes a light-emitting array, a high-precision computing module, a low-precision computing module, a data conversion module, and a routing module.
[0099] Light-emitting array: The main function of the light-emitting array is to convert the input data into an optical signal and input it into the optically modulated PCM array.
[0100] Data conversion module: This module includes a detector array and a conversion module. Its main function is to convert multiple photodetector signals into voltage signals. Threshold parameters can be set in the data conversion module as a nonlinear activation function. When the voltage signal accumulates to a certain level, it triggers the next operation.
[0101] The high-precision computing module's core component is an optically controlled PCM array (n*m), used for matrix multiplication and accumulation calculations. Its key features are high precision and small array size. The optically controlled PCM array can employ a crossbar structure based on phase change materials and optical waveguides; wavelength division multiplexing (WDM) technology can be used to achieve large-scale parallel input. Furthermore, the optically controlled PCM array can be further expanded in size using methods such as beam splitting and mode division, achieving the functionality of a large array with a small array.
[0102] The core component of the low-precision computing module is an electrically controlled PCM array (a*b), used for matrix multiplication and accumulation calculations. Its characteristics include slightly lower precision but a large array size, mature technology, and simple fabrication; it can employ 1T1R, 2T2R, or crossbar structures based on phase change materials. Another component in the low-precision computing module is the trans-impedance amplifier (TIA), which can be designed as an integration module to aggregate multiple inputs into a single output.
[0103] Routing module: Includes signal routing and driving components, and can use a MUX to select m of the a input terminals. Its main function is to match the output signals of the smaller optically controlled PCM array to the larger electrically controlled PCM array as its input. Depending on the requirements of the task and algorithm, the m outputs of the small array can be used as m of the a inputs of the large array.
[0104] Specifically, the data calculation process can be described as follows.
[0105] Step 1: After data is input, it must be converted into optical signals. The light-emitting array modulates the input electrical signals into optical signals matched to the optically modulated PCM array (n*m), and these optical signals serve as the array's input. Depending on requirements, the light-emitting array can employ techniques such as optical frequency combs or mode division to achieve large-scale parallel input.
[0106] Step 2: The optical signal enters the optically modulated PCM array (n*m) and performs high-precision multiplication and accumulation calculations to obtain m calculation results.
[0107] Step 3: The m optical signals obtained by the optically controlled PCM array are converted into m electrical signals by the data conversion module. A threshold parameter is set in the data conversion module as a nonlinear activation function. When the electrical signals accumulate to a certain level (e.g., a signals), the next step is triggered.
[0108] Step 4: The a electrical signals pass through the routing module and are matched to the electrically controlled PCM array (a*b), serving as the input to the large a*b array.
[0109] Step 5: The signal passes through an electrically controlled PCM array (a*b) for low-precision multiplication and accumulation calculations, and is then converted using a TIA to obtain the final calculation result. The TIA can be designed as an integration module to integrate multiple batches of current and output the result in a single step.
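Steps 1 to 5 can be tied together in a compact end-to-end sketch. It assumes ideal components: data elements are used directly as optical intensities, the optical-to-electrical conversion and the TIA have unit gain, and the routing module maps the accumulated signals one-to-one onto the a inputs. The function names and these simplifications are illustrative assumptions, not the implementation of this application.

```python
def mac(inputs, weights):
    """Multiply-accumulate: out[j] = sum_i inputs[i] * weights[i][j]."""
    return [sum(x * row[j] for x, row in zip(inputs, weights))
            for j in range(len(weights[0]))]


def run_pipeline(data_batches, optical_weights, electrical_weights):
    """Run the five steps on batches of n data elements each.

    optical_weights    -- n x m matrix (high-precision array)
    electrical_weights -- a x b matrix (low-precision array)
    Returns the b final results, or None if fewer than a signals
    have accumulated and the next step is therefore not triggered.
    """
    a = len(electrical_weights)
    collected = []
    for batch in data_batches:
        optical_in = list(batch)                        # Step 1: data -> optical signals
        optical_out = mac(optical_in, optical_weights)  # Step 2: n*m array -> m results
        collected.extend(optical_out)                   # Step 3: convert (unit gain) and store
    if len(collected) < a:
        return None                                     # Step 3: threshold not reached
    electrical_in = collected[:a]                       # Step 4: route a signals to a inputs
    return mac(electrical_in, electrical_weights)       # Step 5: a*b array, unit-gain TIA
```

With a 2*2 optical array and a 4*1 electrical array, two batches are needed before the second stage runs, which mirrors the accumulate-then-trigger behaviour of Step 3.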
[0110] Based on the same inventive concept, embodiments of this application also provide a computing system. As shown in Figure 12, the computing system 1200 includes a processor 1201 and the aforementioned in-memory computing circuit 300. The processor 1201 is used to send a first set of data to the in-memory computing circuit 300, and the in-memory computing circuit 300 is used to perform neural network layer calculations based on the first set of data.
[0111] It should be noted that the implementation methods and technical effects not described in detail in the computing system 1200 can be found in the relevant descriptions in the in-memory computing circuit 300, and will not be repeated here.
[0112] Furthermore, embodiments of this application also provide a calculation method. As shown in Figure 13, the method includes the following steps.
[0113] S1301: Perform the first neural network layer calculation on the first set of optical signals based on the first set of weight coefficients to obtain the second set of optical signals.
[0114] The first set of optical signals is used to indicate the first set of data, and the second set of optical signals is used to carry the first set of calculation results.
[0115] S1302: Convert the second set of optical signals into the first set of electrical signals.
[0116] S1303: Perform the second neural network layer calculation on the first set of electrical signals based on the second set of weight coefficients.
[0117] Specifically, in S1301, the calculation of the first neural network layer on the first set of optical signals based on the first set of weight coefficients can be performed in the following way: the first set of data is multiplied and accumulated with the first set of weight coefficients respectively to obtain the first set of calculation results.
[0118] Specifically, in S1303, the second neural network layer calculation can be performed on the first set of electrical signals based on the second set of weight coefficients in the following way: the received first set of electrical signals is multiplied and accumulated based on the second set of weight coefficients to obtain a second set of electrical signals. The second set of electrical signals indicates the result of multiplying and accumulating the first set of calculation results with the second set of weight coefficients.
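The three steps of the method can be written compactly as equations. The notation is introduced here for illustration only: x denotes the first set of data with n elements, W^(1) the first set of weight coefficients (n by m), W^(2) the second set of weight coefficients, and g an assumed linear optical-to-electrical conversion gain. For simplicity the sketch takes the case where the m results feed all inputs of the second layer, so that W^(2) is m by b.

```latex
\begin{align}
y_j &= \sum_{i=1}^{n} x_i \, W^{(1)}_{ij}, & j &= 1, \dots, m && \text{(S1301)} \\
v_j &= g \, y_j, & j &= 1, \dots, m && \text{(S1302)} \\
z_k &= \sum_{j=1}^{m} v_j \, W^{(2)}_{jk}, & k &= 1, \dots, b && \text{(S1303)}
\end{align}
```

Here y is the first set of calculation results carried by the second set of optical signals, v is the first set of electrical signals, and z is the second set of electrical signals.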
[0119] It should be noted that the implementation methods and technical effects of the calculation method shown in Figure 13 that are not described in detail can be found in the relevant description of the in-memory computing circuit 300, and are not repeated here.
[0120] Obviously, those skilled in the art can make various modifications and variations to this application without departing from the scope of this application. Therefore, if such modifications and variations fall within the scope of the claims of this application and their equivalents, this application also intends to include such modifications and variations.
Claims
1. A memory computing integrated circuit, characterized in that it comprises: an optical modulation calculation module, used to perform a first neural network layer calculation on a first set of optical signals based on a first set of weight coefficients to obtain a second set of optical signals, wherein the first set of optical signals is used to indicate a first set of data, and the second set of optical signals is used to carry a first set of calculation results; a photoelectric conversion module, used to receive the second set of optical signals and convert the second set of optical signals into a first set of electrical signals; and an electrical control calculation module, used to perform a second neural network layer calculation on the first set of electrical signals based on a second set of weighting coefficients; wherein the calculation accuracy of the optical modulation calculation module for the first neural network layer is higher than that of the electrical control calculation module for the second neural network layer, and both the first neural network layer and the second neural network layer are neural network layers used to perform convolution operations.
2. The circuit as described in claim 1, characterized in that the optical modulation calculation module includes: at least one optically modulated phase change material (PCM) array, used to multiply and accumulate the first set of data with the first set of weighting coefficients to obtain the first set of calculation results.
3. The circuit as described in claim 1, characterized in that the electrical control calculation module includes: an electrically controlled PCM array, used to perform multiply-accumulate calculations on the received first set of electrical signals based on the second set of weighting coefficients to obtain a second set of electrical signals, wherein the second set of electrical signals is used to indicate the result of multiplying and accumulating the first set of calculation results with the second set of weighting coefficients.
4. The circuit as described in claim 2, characterized in that each of the at least one optically modulated PCM array includes multiple PCM units, wherein the multiple PCM units use phase change materials with different degrees of crystallinity, and the crystallinity of the phase change material used by a PCM unit is used to indicate a weighting coefficient.
5. The circuit as described in claim 1, characterized in that the photoelectric conversion module includes: a detector array, used to detect the light intensity of the second set of optical signals to obtain a set of photocurrents; and a conversion module, used to convert the set of photocurrents into the first set of electrical signals, wherein the first set of electrical signals are voltage signals.
6. The circuit as described in claim 1, characterized in that it further includes: a routing module, coupled to the photoelectric conversion module and used to route the first set of electrical signals to the electrical control calculation module.
7. The circuit according to any one of claims 1 to 6, characterized in that it further includes: a light-emitting array, used to send the first set of optical signals according to multiple elements in the first set of data.
8. The circuit as described in claim 7, characterized in that the light-emitting array includes: a laser array comprising multiple lasers, the multiple lasers being used to send multiple optical signals according to multiple elements in the first set of data, the multiple optical signals constituting the first set of optical signals.
9. The circuit as described in claim 7, characterized in that the light-emitting array includes: a laser array comprising multiple lasers for transmitting multiple optical signals; and a modulator array comprising multiple modulators, each corresponding to one of the multiple lasers, the multiple modulators being used to receive the optical signals emitted by the corresponding lasers and to modulate multiple elements in the first set of data onto the multiple optical signals, the modulated multiple optical signals constituting the first set of optical signals.
10. The circuit as described in claim 7, characterized in that the light-emitting array includes: an optical frequency comb source, used to generate an optical frequency comb, wherein the optical frequency comb includes multiple optical signals; and a modulator array comprising multiple modulators, each corresponding to one of the multiple optical signals, the multiple modulators being used to receive the corresponding optical signals and to modulate multiple elements in the first set of data onto the multiple optical signals, the modulated multiple optical signals constituting the first set of optical signals.
11. A computing system, characterized in that it includes a processor and a memory computing circuit as described in any one of claims 1 to 10, wherein the processor is configured to send a first set of data to the memory computing circuit, and the memory computing circuit is configured to perform neural network layer calculations based on the first set of data.
12. A calculation method, characterized in that it comprises: performing a first neural network layer calculation on a first set of optical signals based on a first set of weight coefficients to obtain a second set of optical signals, wherein the first set of optical signals is used to indicate a first set of data, and the second set of optical signals is used to carry a first set of calculation results; converting the second set of optical signals into a first set of electrical signals; and performing a second neural network layer calculation on the first set of electrical signals based on a second set of weight coefficients; wherein the calculation precision of the first neural network layer is higher than that of the second neural network layer, and both the first neural network layer and the second neural network layer are neural network layers used to perform convolution operations.
13. The method as described in claim 12, characterized in that performing the first neural network layer calculation on the first set of optical signals based on the first set of weight coefficients includes: multiplying and accumulating the first set of data with the first set of weight coefficients to obtain the first set of calculation results.
14. The method as described in claim 12 or 13, characterized in that performing the second neural network layer calculation on the first set of electrical signals based on the second set of weight coefficients includes: multiplying and accumulating the received first set of electrical signals based on the second set of weight coefficients to obtain a second set of electrical signals, wherein the second set of electrical signals is used to indicate the result of multiplying and accumulating the first set of calculation results with the second set of weight coefficients.