A standard cell memory optimization design method based on controlled layout

By using a controlled layout and optimized design of standard cell memory, the problems of high dynamic power consumption and complex clock trees are solved, achieving a low-power, high-efficiency memory design that can meet a variety of needs.

CN122242412A · Pending · Publication Date: 2026-06-19 · NORTHWESTERN POLYTECHNICAL UNIV


Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications(China)
Current Assignee / Owner: NORTHWESTERN POLYTECHNICAL UNIV
Filing Date: 2026-05-20
Publication Date: 2026-06-19


IPC: G06F30/337; G06F30/3315; G06F30/398

AI Tagging

Application Domain

Computer aided design Special data processing applications




Abstract

This invention discloses a controlled layout-based standard cell memory optimization design method, belonging to the field of memory design technology, to solve the technical problems of high dynamic power consumption and complex clock tree structure of existing standard cell memories, which seriously affect the allocation of global routing resources at the top level of the on-chip system. The controlled layout-based standard cell memory optimization design method of this invention adopts a memory array composed of latches in its overall architecture; the memory array is organized into a regular structure of row number × column number; in the regular structure, each column constitutes an independent data bit slice, and read and write paths are separated through a dual-port design; in the write path design, the input write address is converted into a one-hot code row selection signal by a write address decoder to control the update of the target row latch; in the read path design, the input read address is converted into a read row selection signal by a read address decoder, and the target row data is selected and output by a multiplexer.


Description

Technical Field

[0001] This invention relates to the field of memory design technology, and more specifically, to a method for optimizing the design of standard cell memory based on controlled layout.

Background Technology

[0002] In modern System-on-Chip (SoC) design, embedded memory significantly impacts chip silicon area and power consumption. Particularly in high-performance processors, digital signal processing modules, image processing devices, and IoT devices, memory often becomes a major source of system power consumption. Traditional embedded Static Random Access Memory (SRAM) performs excellently when handling large capacity and high-performance demands, but its performance is limited for small capacity, unconventional word lengths, or low-voltage operating environments. Especially in deep submicron processes, the minimum operating voltage of SRAM is typically between 600-700 millivolts, limiting the potential for further energy savings through voltage regulation. Furthermore, when the capacity is small, the proportion of peripheral circuitry in the SRAM is too high, leading to decreased area efficiency. At the same time, SRAM macrocells provided by manufacturers are mostly of fixed specifications, making it difficult to meet personalized requirements such as asymmetric word widths, multiple ports, or special masking modes.

[0003] In contrast, standard cell memories utilize registers or latches from the standard cell library to construct memory arrays and generate macrocells using digital synthesis and place-and-route techniques, exhibiting greater design flexibility. Standard cell memories can adjust capacity, word width, number of ports, and latency characteristics according to requirements and maintain stable operation under low voltage conditions. However, existing standard cell memories mostly employ general digital implementation processes that are not optimized for the regularity of memory arrays, resulting in a scattered layout, low cell utilization, increased signal interconnect lengths, and the need for more buffers, leading to higher dynamic power consumption. Simultaneously, the clock tree structure is complex, with significant clock skew, increasing timing convergence difficulty, extending tool runtime, and consuming more routing layers, severely impacting the allocation of global routing resources at the top level of the on-chip system.

Summary of the Invention

[0004] The purpose of this invention is to provide a standard cell memory optimization design method based on controlled layout, which addresses the technical problems of high dynamic power consumption and complex clock tree structure in existing standard cell memories, severely affecting the allocation of global routing resources at the top level of the on-chip system. In view of this, this invention is implemented through the following scheme.

[0005] This invention provides a standard cell memory optimization design method based on controlled layout, applicable to embedded memory arrays with various capacities, word widths, number of ports, and different latency requirements; The standard cell memory optimization design adopts a memory array composed of latches in its overall architecture; The storage array is organized into a regular structure with the number of rows × the number of columns; in the regular structure, each column constitutes an independent data bit slice, and the read and write paths are separated through a dual-port design; Write path design is performed, and the input write address is converted into a one-hot code row selection signal through a write address decoder to control the update of the target row latch; The read path is designed by converting the input read address into a read row selection signal through a read address decoder, and then selecting the target row data and outputting it through a multiplexer.
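The architecture in [0005] can be illustrated with a small behavioral model. This is only a sketch under stated assumptions, not the patented implementation: the names `one_hot_decode` and `LatchArrayModel` are hypothetical, timing and clocking are ignored, and each latch is modeled as a plain stored bit.

```python
from typing import List


def one_hot_decode(address: int, rows: int) -> List[int]:
    """Return a one-hot row-select vector: exactly one entry is 1."""
    if not 0 <= address < rows:
        raise ValueError("address out of range")
    return [1 if r == address else 0 for r in range(rows)]


class LatchArrayModel:
    """Behavioral model of an R x C latch array with separate read and write ports."""

    def __init__(self, rows: int, cols: int) -> None:
        self.rows = rows
        self.cols = cols
        # data[r][c] is the bit held by the latch in row r of bit slice (column) c.
        self.data = [[0] * cols for _ in range(rows)]

    def write(self, address: int, word: List[int], write_enable: bool = True) -> None:
        """Write port: the one-hot row select decides which row of latches updates."""
        if len(word) != self.cols:
            raise ValueError("word width must equal the number of columns")
        if not write_enable:
            return
        select = one_hot_decode(address, self.rows)
        for r, sel in enumerate(select):
            if sel:
                self.data[r] = list(word)

    def read(self, address: int) -> List[int]:
        """Read port: each bit slice selects the bit of the addressed row."""
        select = one_hot_decode(address, self.rows)
        return [
            int(any(select[r] and self.data[r][c] for r in range(self.rows)))
            for c in range(self.cols)
        ]
```

Because the read and write ports use independent decoders in this model, a read of one row and a write of another do not interact, which mirrors the separation of read and write paths described above.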

[0006] Compared with existing technologies, the standard cell memory optimization design method based on controlled layout of the present invention can be applied to embedded memory arrays with various capacities, word widths, numbers of ports, and latency requirements. In terms of overall architecture, it adopts a memory array composed of latches, organized into a regular R×C (number of rows × number of columns) structure. In this regular structure, each column constitutes an independent data bit slice, and a dual-port design completely separates the read and write paths, ensuring timing independence during simultaneous read and write operations. In the write path, a write address decoder converts the input write address into a one-hot row selection signal that controls the update of the target row latches. The input data is first sampled by a data register, which relaxes the setup time requirement. For clock control, the write path introduces a two-stage clock gating structure: the first stage performs global gating based on the write enable signal, and the second stage performs local gating of the target row clock based on the row selection signal, thereby avoiding unnecessary toggling of non-target rows and significantly reducing dynamic power consumption. In the read path, the read address decoder converts the input read address into a read row selection signal, and a multiplexer based on a NAND / NOR tree structure selects and outputs the target row data. This structure is superior to the tri-state buffer method in terms of both area and power consumption. Depending on system performance requirements, sampling registers can be added at the address input or data output of the read path to balance setup time and access time. Furthermore, latch structures and clock gating can be used to reduce storage area and power consumption. Compared with SRAM structures, the capacity and bit width of the standard cell memory optimized by this invention are easier to control, unnecessary storage redundancy is avoided, and the memory can operate at the same low voltage as the standard cells in near-threshold applications. Through the above technical solutions, the invention solves the technical problems of high dynamic power consumption and complex clock tree structures in existing standard cell memories, which seriously affect the allocation of global routing resources at the top level of the on-chip system.

[0007] Furthermore, in the controlled layout-based standard cell memory optimization design method of the present invention, in the write path design: Input data is sampled by a data register, reducing the requirements for setup time. During clock control, the write path introduces a two-level clock gating structure for global and local gating.

[0008] Furthermore, in the controlled layout-based standard cell memory optimization design method of the present invention, the write path introduces a two-level clock gating structure for global gating and local gating, including: The two-level clock gating structure includes a first-level clock gating structure and a second-level clock gating structure; The first-level clock gating structure performs global gating based on the write enable signal; The second-level clock gating structure performs local gating on the target row clock based on the row selection signal.
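The effect of the two-level gating in [0008] can be sketched with a simple logical model. The function names are hypothetical, and the model treats each gate as a plain AND; real integrated clock gating cells also latch the enable to avoid glitches, which is not modeled here. The ungated baseline, in which every row receives the write clock, is an assumption made only for comparison.

```python
from typing import List


def gated_row_clocks(clk: int, write_enable: int, row_select: List[int]) -> List[int]:
    """Return the clock value seen by each row after both gating levels."""
    # Level 1 (global): the clock passes only while write_enable is 1.
    global_gated = clk & write_enable
    # Level 2 (local): only the row whose select bit is 1 sees the pulse.
    return [global_gated & sel for sel in row_select]


def count_clocked_latches(rows: int, cols: int, write_cycles: int, two_level: bool) -> int:
    """Count latch clock pulses over a number of write cycles.

    With two-level gating one row (cols latches) is clocked per write.
    The baseline assumes all rows are clocked on every write.
    """
    rows_clocked_per_write = 1 if two_level else rows
    return rows_clocked_per_write * cols * write_cycles
```

For a 256-row, 32-bit array, this model clocks 32 latches per write with row-level gating, against 8192 in the assumed baseline, which is the intuition behind the dynamic power reduction claimed for the write path.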

[0009] Furthermore, in the controlled layout-based standard cell memory optimization design method of the present invention, the step of selecting and outputting the target row data through a multiplexer includes: selecting and outputting the target row data using a multiplexer based on a NAND / NOR tree, where NAND denotes the NOT-AND gate and NOR denotes the NOT-OR gate.
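One way such a NAND / NOR tree can realize the selection is sketched below for a single bit slice. The exact gate arrangement is an assumption (the text only names the tree structure): each latch output is first combined with its row select by a NAND gate, and the resulting terms are then reduced pairwise by alternating NAND and NOR levels. The function name `nand_nor_tree_mux` is hypothetical, and a one-hot select with a power-of-two row count is assumed.

```python
from typing import List


def nand(a: int, b: int) -> int:
    return 1 - (a & b)


def nor(a: int, b: int) -> int:
    return 1 - (a | b)


def nand_nor_tree_mux(latch_outputs: List[int], row_select: List[int]) -> int:
    """Select one bit of a column with a first-level NAND and a NAND/NOR reduction tree."""
    n = len(latch_outputs)
    if n != len(row_select) or n < 1 or n & (n - 1):
        raise ValueError("need equal-length inputs with a power-of-two row count")
    # First level: active-low terms, 0 only for the selected row holding a 1.
    level = [nand(q, s) for q, s in zip(latch_outputs, row_select)]
    active_low = True
    while len(level) > 1:
        # NAND of active-low terms yields an active-high OR;
        # NOR of active-high terms yields an active-low OR.
        gate = nand if active_low else nor
        level = [gate(level[i], level[i + 1]) for i in range(0, len(level), 2)]
        active_low = not active_low
    # Restore true polarity if the tree ends on an active-low level.
    return 1 - level[0] if active_low else level[0]
```

Alternating NAND and NOR levels avoids inserting an inverter at every stage, because each level naturally consumes the polarity produced by the previous one.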

[0010] Furthermore, in the standard cell memory optimization design method based on controlled layout of the present invention, a sampling register is set at the address input terminal and / or data output terminal in the read path design.

[0011] Furthermore, in the standard cell memory optimization design method based on controlled layout of the present invention, the standard cell memory optimization design adopts a controlled layout strategy in the layout stage, which improves the regularity and compactness of the array by fixing the cell positions.

[0012] Furthermore, in the controlled layout-based standard cell memory optimization design method of the present invention, the controlled layout strategy includes: The latches, first-level multiplexing gates, and buffer units inside each data bit slice are arranged in a fixed order, and multiple bit slices are spliced together along the column direction to form a storage array. The middle of the array is reserved for arranging the second-level clock gating unit and its driver for each row, and the clock signal is evenly distributed from the middle to the left and right sides to shorten the clock path.

[0013] Furthermore, in the controlled layout-based standard cell memory optimization design method of the present invention, the controlled layout strategy includes: The address decoder and global clock gating unit are located near the center of the array; Buffers are inserted at intervals along long signal paths, and well tap cells are arranged.

[0014] Furthermore, in the controlled layout-based standard cell memory optimization design method of the present invention, clock tree design is performed, including: Separate the clock networks for the write data register, memory cell array, and read path register; The write data register, memory cell array, and read path register are respectively clock tree synthesized.

[0015] Furthermore, in the standard cell memory optimization design method based on controlled layout of the present invention, the standard cell memory optimization design achieves controlled layout by pre-defining layout constraints and cell protection attributes.

Attached Figure Description

[0016] The accompanying drawings, which are included to provide a further understanding of the invention and form part of this invention, illustrate exemplary embodiments of the invention and are used to explain the invention, but do not constitute an undue limitation of the invention. In the drawings:

Figure 1 is a schematic diagram of the overall architecture of the standard cell memory designed in this invention;

Figure 2 is a schematic diagram of the dual-port read / write path structure in this invention;

Figure 3 is a schematic diagram of the implementation structure of the multiplexer based on the NAND / NOR tree structure in this invention;

Figure 4 is a schematic diagram of the controlled layout in this invention, showing the bit slice arrangement and the central driver distribution.

[0017] Figure labels: in Figure 1 and Figure 2, D, Q and Q̄ are the standard pin symbols of a latch, where D denotes the data input terminal, Q denotes the data output terminal, and Q̄ denotes the complementary output of Q.

Detailed Implementation

[0018] To make the technical problems to be solved, the technical solutions, and the beneficial effects of the present invention clearer, the present invention will be further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present invention and are not intended to limit the present invention.

[0019] It should be noted that when a component is referred to as being "fixed to" or "set on" another component, it can be directly on or indirectly on that other component. When a component is referred to as being "connected to" another component, it can be directly connected to or indirectly connected to that other component.

[0020] Furthermore, the terms "first" and "second" are used for descriptive purposes only and should not be construed as indicating or implying relative importance or implicitly specifying the number of indicated technical features. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of this invention, "a plurality of" means two or more, unless otherwise explicitly specified. "Several" means one or more, unless otherwise explicitly specified.

[0021] Standard cell memories (SCRMs) utilize registers or latches from a standard cell library to construct memory arrays and generate macrocells using digital synthesis and place-and-route techniques, exhibiting high design flexibility. SCRMs can adjust capacity, word width, number of ports, and latency characteristics according to requirements and maintain stable operation under low voltage conditions. However, existing SCRMs mostly employ general digital implementation flows that are not optimized for the regularity of memory arrays, resulting in a scattered layout, low cell utilization, increased signal interconnect lengths, and the need for more buffers, leading to higher dynamic power consumption. Simultaneously, the clock tree structure is complex, with significant clock skew, increasing timing convergence difficulty, extending tool runtime, and consuming more routing layers, severely impacting the allocation of global routing resources at the top level of the on-chip system.

[0022] To address the aforementioned technical problems, this invention provides a standard cell memory optimization design method based on controlled layout, applicable to embedded memory arrays with various capacities, word widths, number of ports, and different latency requirements; The standard cell memory optimization design adopts a memory array composed of latches in its overall architecture; The storage array is organized into a regular structure with the number of rows × the number of columns; in the regular structure, each column constitutes an independent data bit slice, and the read and write paths are separated through a dual-port design; Write path design is performed, and the input write address is converted into a one-hot code row selection signal through a write address decoder to control the update of the target row latch; The read path is designed by converting the input read address into a read row selection signal through a read address decoder, and then selecting the target row data and outputting it through a multiplexer.

[0023] With the above technical solution, the standard cell memory optimization design method based on controlled layout of the present invention can be applied to embedded memory arrays with various capacities, word widths, number of ports, and different latency requirements. The overall architecture employs a memory array composed of latches; the memory array is organized into a regular R×C (number of rows × number of columns) structure; in this regular structure, each column constitutes an independent data bit slice, and a dual-port design achieves complete separation of read and write paths, thereby ensuring timing independence during simultaneous read and write operations; furthermore, the write path design converts the input write address into a one-hot row selection signal using a write address decoder to control the update of the target row latch; the input data is first sampled by a data register, thereby reducing the setup time requirement; in terms of clock control, a two-stage clock gating structure is introduced for the write path, with the first stage performing global gating based on the write enable signal. The second stage uses row selection signals to locally control the target row clock, thereby avoiding invalid flips of non-target rows and significantly reducing dynamic power consumption. Furthermore, in the read path design, the read address decoder converts the input read address into a read row selection signal, which can be used by a multiplexer based on a NAND / NOR tree structure to select and output the target row data. This structure is superior to the tri-state buffer method in terms of both area and power consumption. Depending on system performance requirements, sampling registers can be added at the address input or data output end of the read path to balance setup time and access time. Furthermore, latch structures and clock gating can be used to reduce storage area and power consumption. 
Compared to SRAM structures, the capacity and bit width of the standard cell memory optimized by this invention are easier to control, avoiding unnecessary storage redundancy, and maintaining the same low voltage as standard cells in near-threshold applications. Through the above technical solutions of this invention, the technical problems of high dynamic power consumption and complex clock tree structures in existing standard cell memories, which seriously affect the allocation of global routing resources at the top level of the on-chip system, are solved.

[0024] To better understand the present invention, the following specific embodiments further illustrate the content of the present invention, but the content of the present invention is not limited to the following embodiments.

[0025] Example 1: This embodiment provides a standard cell memory optimization design method based on controlled layout, applicable to embedded memory arrays with various capacities (16-512 words), word widths (8-256 bits), numbers of ports (single / dual port), and latency requirements. The standard cell memory optimization design achieves controlled layout by pre-defining layout constraints and cell protection attributes. In its overall architecture, the design adopts a memory array composed of latches, organized into a regular R×C (number of rows × number of columns) structure. In the regular structure, each column constitutes an independent data bit slice, and the read and write paths are separated through a dual-port design, ensuring the timing independence of simultaneous read and write operations. In the write path, the input write address is converted into a one-hot row selection signal by a write address decoder to control the update of the target row latches. In the read path, sampling registers are set at the address input and / or data output terminals to balance setup time and access time. The input read address is converted into a read row selection signal by a read address decoder, and the target row data is selected and output by a multiplexer based on a NAND / NOR tree structure. This structure is superior to the tri-state buffer method in terms of area and power consumption. Here, NAND denotes the NOT-AND gate and NOR denotes the NOT-OR gate.

[0026] Furthermore, in the write path design: input data is first sampled by the data register, thereby reducing the requirements for setup time; during clock control, the write path introduces a two-level clock gating structure for global gating and local gating; the two-level clock gating structure includes a first-level clock gating structure and a second-level clock gating structure; the first-level clock gating structure performs global gating based on the write enable signal; the second-level clock gating structure performs local gating on the target row clock based on the row selection signal, thereby avoiding invalid flips of non-target rows and significantly reducing dynamic power consumption.

[0027] Furthermore, to avoid wiring chaos and wasted area caused by random cell placement in the general synthesis and placement process, the standard cell memory optimization design in this embodiment adopts a controlled placement strategy during the placement phase, achieving high regularity and compactness of the array by fixing cell positions; in the controlled placement strategy: The latches, first-level multiplexing gates, and buffer units inside each data bit slice are arranged in a fixed order, and multiple bit slices are spliced along the column direction to form a memory array; the middle of the array is reserved for arranging the second-level clock gating unit and its driver for each row, and the clock signal is evenly distributed from the middle to the left and right sides to shorten the clock path; the address decoder and global clock gating unit are arranged near the middle of the array to reduce control signal delay; buffers are inserted at intervals on the long interconnect signal path, and well tap units are arranged according to process requirements to ensure signal integrity and layout manufacturability.
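The fixed-position placement in [0027] can be pictured as a coordinate generator. The sketch below is illustrative only: the function names, the even column split, and all dimensions are hypothetical, and a real flow would express these positions as placement constraints in the implementation tool rather than in Python.

```python
from typing import Dict, Tuple


def place_bit_slices(
    rows: int, cols: int, slice_width: int, row_height: int, channel_width: int
) -> Dict[Tuple[int, int], Tuple[int, int]]:
    """Map (row, col) to the lower-left (x, y) of each bit-slice cell group.

    Columns are split evenly to the left and right of a central channel
    reserved for the per-row second-level clock gates and their drivers.
    """
    if cols % 2:
        raise ValueError("this sketch assumes an even number of columns")
    half = cols // 2
    placement = {}
    for r in range(rows):
        for c in range(cols):
            x = c * slice_width
            if c >= half:
                x += channel_width  # skip over the reserved central channel
            placement[(r, c)] = (x, r * row_height)
    return placement


def clock_gate_position(row: int, cols: int, slice_width: int, row_height: int) -> Tuple[int, int]:
    """Lower-left corner of a row's clock gate, at the start of the central channel."""
    return ((cols // 2) * slice_width, row * row_height)
```

Placing each row's clock gate in the middle means the gated clock travels at most half the array width in either direction, which is the reason given in the text for the shorter clock path.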

[0028] Furthermore, clock tree design is implemented: the clock networks of the write data register, memory cell array, and read path register are separated; clock tree synthesis is performed on the write data register, memory cell array, and read path register respectively to reduce buffer overhead and clock skew issues caused by cross-network balancing.

[0029] Furthermore, in this embodiment, controlled layout is achieved by pre-defining layout constraints and cell protection attributes in standard digital implementation tools. After the layout and routing are completed, static timing analysis, power consumption analysis and functional simulation are used to verify its performance advantages and the achievement of design goals.

[0030] Example 2: Referring to Figures 1 to 4, this embodiment provides a standard cell memory optimization design method based on controlled layout to obtain a target memory with 256 rows of 32 bits each, using a dual-port design to support independent read and write functions. The design process is as follows. First, in the RTL (Register Transfer Level) design phase, standard cells from the technology library are instantiated directly, such as latch cells, two-input NAND / NOR gates, clock gating units, and buffer cells. Protection attributes are set for these cells in the synthesis constraint file to prevent the synthesis tool from replacing them with functionally equivalent cells that do not match in timing or layout.
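For orientation, the basic sizes implied by a 256 × 32 target can be derived with a few lines of arithmetic. The helper name `array_parameters` is hypothetical, and the gate counts assume two-input gates, one second-level clock gate per row, and a power-of-two row count; none of these counts are stated in the source.

```python
from typing import Dict


def array_parameters(rows: int, word_width: int) -> Dict[str, int]:
    """Derive basic sizes for a rows x word_width latch array."""
    # Number of address bits needed to index every row.
    address_bits = (rows - 1).bit_length()
    return {
        "address_bits": address_bits,
        "latches": rows * word_width,
        # Assumed: one local (second-level) clock gate per row.
        "row_clock_gates": rows,
        # Assumed: two-input gates, so the per-column reduction tree after the
        # first-level NAND needs one level per address bit.
        "mux_reduction_levels": address_bits,
    }


params = array_parameters(256, 32)
print(params)
```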

[0031] Furthermore, a bit-slice template is constructed, with each bit-slice containing a latch, a first-level multiplexing gate, and a local buffer, maintaining a consistent physical width. Multiple bit-slices are arranged sequentially along the column direction to form a memory array. A second-level clock control unit and driver channel are reserved in the center of the array, and clock signals are symmetrically distributed from the center to both sides. The address decoder and global clock control unit are placed in the lower center of the array, close to the data bit-slices, to reduce control signal delay and reduce drive power consumption.

[0032] Furthermore, during the placement and routing phase, commercial P&R (Placement and Routing) tools are used. Floorplan scripts constrain bit positions and row / column spacing to ensure a regular array structure. Word lines and write clock lines use lower metal layers, while data lines and multiplexer signals are primarily distributed to higher metal layers to reduce cross-layer jumpers. To ensure the integrity of long-distance signal transmission, buffers are inserted at fixed intervals in horizontal traces, and well-contact cells are evenly distributed throughout the layout to comply with process specifications.
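The fixed-interval buffer insertion mentioned in [0032] amounts to simple position arithmetic along a wire. The sketch below uses a hypothetical function name and arbitrary integer length units; the actual interval would come from the process and timing requirements, which the source does not state.

```python
from typing import List


def buffer_positions(wire_length: int, interval: int) -> List[int]:
    """Return buffer positions at every `interval` units, strictly inside the wire."""
    if interval <= 0:
        raise ValueError("interval must be positive")
    count = wire_length // interval
    # Skip a position that would coincide with the far end of the wire.
    return [i * interval for i in range(1, count + 1) if i * interval < wire_length]
```

The same arithmetic could be applied to the even distribution of well tap cells, with the spacing taken from the process design rules.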

[0033] Furthermore, in the clock tree optimization stage: the clock networks of the write data register, memory array latch, and read path register are processed separately, generating independent clock trees for each, avoiding buffer increases and clock skew issues caused by cross-domain balancing. For two-level clock control, the first level is generated by the global clock and write enable signal, and the second level is driven by the row selection signal, ensuring that non-target row latches remain stationary during writing, significantly reducing dynamic power consumption.

[0034] Furthermore, in the read path design of this embodiment, sampling registers are set at the address input and / or data output terminals to balance setup time and access time. The input read address is converted into a read row selection signal by a read address decoder, and the target row data is selected and output by a multiplexer based on a NAND / NOR tree structure. This structure is superior to the tri-state buffer method in terms of both area and power consumption. Here, NAND denotes the NOT-AND gate, and NOR denotes the NOT-OR gate. In the write path design, the input write address is converted into a one-hot row selection signal by a write address decoder to control the update of the target row latches. The input data is first sampled by the data register, which relaxes the setup time requirement. For clock control, the write path introduces a two-stage clock gating structure for global and local gating, consisting of a first-stage and a second-stage clock gating structure. The first stage performs global gating based on the write enable signal; the second stage performs local gating of the target row clock based on the row selection signal, thereby avoiding unnecessary toggling of non-target rows and significantly reducing dynamic power consumption.

[0035] Furthermore, after the placement and routing are completed, static timing analysis, power consumption assessment, and functional simulation are performed. The test results show that, compared with standard cell memory of the same size, the standard cell memory optimization design method based on controlled placement in this embodiment reduces the area of the standard cell memory by about 23%, reduces the power consumption of read operations by 55%, reduces the power consumption of write operations by 15%, shortens the critical path delay by about 12%, reduces the minimum operating voltage to 0.3 volts, and has better stability than the custom static random access memory macrocell under the same conditions.

[0036] Furthermore, the controlled layout-based standard cell memory optimization design method of the present invention is described again with reference to Figures 1 to 4. In Figures 1 to 4, D denotes the data input terminal, Q denotes the data output terminal, and Q̄ denotes the complementary output of Q.

Figure 1 illustrates the overall architecture of the standard cell memory optimized by this invention. The memory array consists of row × column latch cells, with each column forming an independent data bit slice. A write address decoder converts the write address into a one-hot write row selection signal and, in conjunction with a write enable signal, controls the update of the target row. A read address decoder converts the read address into a read row selection signal, and the data of the target row is output via a multiplexer. The data input is sampled and written through a data register, and the data output is controlled by the read path. The system also includes a global clock and corresponding clock gating logic; these structures are common knowledge in the field and are not described in detail here.

Figure 2 illustrates the dual-port read / write path structure used in this invention. The write path consists of a write address input, a write data input, a write address decoder, and a two-stage clock gating structure. The first-stage clock gating controls the global clock based on the write enable signal, while the second-stage gating controls the target row clock based on the row selection signal, thereby reducing the switching power consumption of non-target rows. The read path consists of a read address input, a read address decoder, a read row selection signal, a multiplexer, and a data output. A sampling register can optionally be added at the read address input or data output for timing optimization.

Figure 3 shows the output multiplexer based on a NAND / NOR tree structure. For each memory cell in a column, the latch output and the corresponding read row selection signal are first combined by a NAND gate, and the results are then merged through a hierarchical NOR tree, so that the data of the selected target row appears at the output. This structure avoids the risks of tri-state buffering and has advantages in terms of area and power consumption. The logic gates at each level in Figure 3 are arranged in a tree-like hierarchy, ensuring a regular and compact implementation.

Figure 4 illustrates the controlled layout of the bit slices and the central driver distribution. Each data bit slice includes a latch, a first-level multiplexer gate, and a buffer unit, and the bit slices are assembled along the column direction to form an array. Space is reserved in the center of the array to accommodate the second-level clock gating units and their drivers for each row, so that the gated write clock signal is distributed evenly from the center to the left and right sides, shortening the clock path and reducing clock skew. Address decoders and global clock gating units are also positioned near the center of the array. Buffers are inserted at intervals along long interconnect signal paths, and well tap cells are arranged according to process requirements to ensure signal integrity and layout manufacturability. In Figure 4, the yellow area represents the vertical routing buffers, the blue area represents the data input registers, the black area represents the horizontal routing buffers, and the red area represents the second-level clock gating units and drivers.

[0037] In the description of the above embodiments, specific features, structures, materials, or characteristics may be combined in any suitable manner in one or more embodiments or examples.

[0038] The above description is merely a specific embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any variations or substitutions that can be easily conceived by those skilled in the art within the technical scope disclosed in the present invention should be included within the scope of protection of the present invention. Therefore, the scope of protection of the present invention should be determined by the scope of the claims.

Claims

1. A standard cell memory optimization design method based on controlled layout, characterized in that: the method is applicable to embedded storage arrays with various capacities, word widths, numbers of ports, and latency requirements; in its overall architecture, the standard cell memory optimization design adopts a storage array composed of latches; the storage array is organized into a regular structure of rows multiplied by columns; in this regular structure, each column constitutes an independent data bit slice, and the read and write paths are separated through a dual-port design; in the write path design, the input write address is converted into a one-hot row selection signal by a write address decoder to control the update of the target row latches; and in the read path design, the input read address is converted into a read row selection signal by a read address decoder, and the target row data is then selected and output through a multiplexer.

2. The standard cell memory optimization design method based on controlled layout according to claim 1, characterized in that, in the write path design: the input data is sampled by a data register, which relaxes the setup time requirement; and, for clock control, the write path introduces a two-level clock gating structure comprising global gating and local gating.

3. The standard cell memory optimization design method based on controlled layout according to claim 2, characterized in that the two-level clock gating structure for global and local gating introduced in the write path includes a first-level clock gating structure and a second-level clock gating structure; the first-level clock gating structure performs global gating based on the write enable signal; and the second-level clock gating structure performs local gating of the target row clock based on the row selection signal.

4. The standard cell memory optimization design method based on controlled layout according to claim 3, characterized in that the step of selecting and outputting the target row data through a multiplexer includes: selecting and outputting the target row data using a multiplexer based on a NAND/NOR tree, where NAND denotes the NOT-AND operation and NOR denotes the NOT-OR operation.

5. The standard cell memory optimization design method based on controlled layout according to claim 4, characterized in that, in the read path design, a sampling register is provided at the address input terminal and/or the data output terminal.

6. The standard cell memory optimization design method based on controlled layout according to claim 5, characterized in that the standard cell memory optimization design adopts a controlled layout strategy during the layout stage, which improves the regularity and compactness of the array by fixing the cell positions.

7. The standard cell memory optimization design method based on controlled layout according to claim 6, characterized in that, in the controlled layout strategy: the latches, first-level multiplexing gates, and buffer units inside each data bit slice are arranged in a fixed order, and multiple bit slices are spliced together along the column direction to form the storage array; and the middle of the array is reserved for the second-level clock gating unit of each row and its driver, and the clock signal is distributed evenly from the middle to the left and right sides to shorten the clock path.

8. The standard cell memory optimization design method based on controlled layout according to claim 7, characterized in that, in the controlled layout strategy: the address decoder and the global clock gating unit are located near the center of the array; and buffers are inserted at intervals along long signal paths, and well tap cells are placed.

9. The standard cell memory optimization design method based on controlled layout according to claim 8, characterized in that a clock tree is designed, including: separating the clock networks of the write data register, the memory cell array, and the read path register; and performing clock tree synthesis separately for the write data register, the memory cell array, and the read path register.

10. The standard cell memory optimization design method based on controlled layout according to claim 9, characterized in that the standard cell memory optimization design achieves the controlled layout through predefined layout constraints and cell protection attributes.