# **DesignCon 2020**

## Signal and Power Integrity Co-Simulation for High-Density Heterogenous Multi-Die Design

Ashkan Hashemi, Intel Corporation Ashkan.hashemi@intel.com

Guang Chen, Intel Corporation Guang.chen@intel.com

Hyo-Soon Kang, Intel Corporation Hyosoon.kang@intel.com

Wendem Beyene, Intel Corporation Wendem.beyene@intel.com

## Abstract

Heterogeneous packaging techniques are critical in enabling high-density I/O communication between two or more chips. In this paper, Intel's embedded multi-die interconnect bridge (EMIB) solution is used to realize a fabric-to-fabric connection between two monolithic FPGAs, forming an extremely high-capacity (>10M LEs) FPGA targeted at the ASIC prototyping and emulation markets. A comprehensive signal and power integrity co-simulation methodology is presented for the extremely high-density interface between the two high-performance FPGAs, using extracted and correlated on-die passive channel models. The link analysis characterizes the channel's intersymbol interference (ISI), crosstalk, and simultaneous switching noise (SSN) separately, and transient eye diagrams are generated to verify the interface performance.

## **Author Biographies**

**Ashkan Hashemi** is an SoC design engineer with Programmable Solutions Group (PSG) at Intel Corporation. His responsibilities cover both chip and system level signal/power integrity modeling and analysis of high-speed links and interfaces in FPGA products. His research interests include RF, microwave, and signal/power integrity modeling in multidie integration. He received the Ph.D. degree in electrical engineering from the Missouri University of Science and Technology in 2016.

**Hyo-Soon Kang** is a Signal and Power Integrity engineer at Intel Corporation. His responsibilities include the timing budgeting of high-speed links and the power supply noise induced jitter analysis on analog interface and core logic. From 2009 to 2017, he was with Package Development Team in Samsung Electronics Ltd., Korea. His research interests were in the area of signal integrity (SI) and power integrity (PI) of semiconductor packages. He received the B.S., M.S., and Ph.D. degrees in Electrical and Electronic Engineering from Yonsei University, Korea, in 2002, 2004, and 2009, respectively.

**Guang Chen** is a design engineer with Silicon Systems Development Department at the Programmable Solutions Group (PSG), Intel Corporation, where he is responsible for driving signal/power integrity co-design effort for the key FPGA products with focus on both silicon and system level power integrity solutions. Prior to this, he was with Technical Service Dept. at Altera Corporation, providing system SI/PI solutions for customer Altera FPGA applications. He holds a Ph.D. in Electrical Engineering from University of Arizona. His professional interests include power integrity and signal integrity, high-speed interconnects and channel modeling.

**Wendem Beyene** received his B.S. and M.S. degrees in Electrical Engineering from Columbia University, and his Ph.D. degree in Electrical and Computer Engineering from University of Illinois at Urbana-Champaign. In the past, he was employed by IBM, Hewlett-Packard, Agilent Technologies and Rambus Inc. He is currently with Intel Corp. working on modeling and analysis of core and I/O subsystems of FPGA chips.

### I. Introduction

In the era of Artificial Intelligence (AI), neuromorphic cognition, Internet-of-Things (IoT), and autonomous computing, next-generation platforms face multiple challenges in satisfying the ever-increasing need for higher bandwidth, lower power, smaller form factor, and scalability. Although monolithic integration served as an architectural remedy for a short period of time, the maturity of IP process node technology emerged as the next challenge: the different IPs required on a single chip mature at different process nodes, and not necessarily at the same time. These challenges are even more pivotal in high-end FPGAs, in which a variety of protocols/standards (e.g., PCIe, Ethernet, HBM) as well as modulation schemes (NRZ, PAM4, optical) are supported over a wide range of data rates and speeds, spanning from a few GHz to 112 Gbps in transceivers.

As such, to meet the demand for high-bandwidth and low-power computing, well-designed high-density interconnects are critical for the seamless integration of a heterogeneous multi-die system in a small footprint. For this purpose, in addition to the traditional multichip module (MCM), system-in-package (SiP) techniques, and silicon interposer interconnect, Intel's Co-EMIB (embedded multi-die interconnect bridge combined with 3D packaging technology), omni-directional interconnect (ODI), and Multi-Die I/O (MDIO) are the most recent heterogeneous integration techniques that facilitate high-density, low-latency, and low-impedance (for power supply transfer) interconnection [1]. For instance, to realize a high-density Input/Output (I/O) SoC, multiple standalone ASIC/FPGA cores can be integrated through these heterogeneous packaging technologies, which results in a modular design that enables faster design cycles. As an example, Intel's Stratix 10 is the first FPGA to incorporate EMIB, integrating a monolithic fabric core at 14 nm with transceiver tiles at different process nodes.

This paper demonstrates the implementation, modeling, and analysis of EMIB technology in the Intel Stratix 10 GX 10M FPGA to realize a logical and electrical interconnection between two monolithic FPGA fabric dies, enabling an extremely high-density FPGA (10.2M logic elements) manufactured on a 70 mm x 74 mm package [2]. The ASIC prototyping and emulation markets are the key drivers for developing such high-density FPGAs offering over 10M logic elements (LEs). By seamlessly stitching the two monolithic fabric cores together in one package, the EMIB solution plays a critical role in the fast-to-market realization of such high-capacity FPGAs as well as other similar SiP products. Figure 1 illustrates the Intel Stratix 10 GX 10M device, and Table 1 summarizes some of the key specifications of this FPGA.



Figure 1. EMIB-based high-density FPGA

| Features                    | Capability                   |
|-----------------------------|------------------------------|
| Logic Elements              | 10.2 M                       |
| Transistors                 | 43.3 B                       |
| Data Interface Bus via EMIB | Up to 25920 interconnections |
| Memory                      | 308 Mbits                    |
| DSP                         | 6912                         |
| GPIO                        | 2304                         |
| LVDS                        | 1152 @ 1.4 Gbps              |
| Transceivers                | 48 @ 17.4 Gbps               |
| Package                     | 70 mm x 74 mm                |

Table 1. High-density FPGA key specifications

In this heterogeneous multi-die design, ~25000 wires are routed across multiple EMIBs to achieve full I/O interconnection capacity between the two FPGA chips. To implement this short-range, high-density interconnection, a new data interface bus (DIB) is designed and characterized. The DIB allows FPGA core fabric-to-fabric interconnection with low latency and low power. Accurate signal and power integrity analysis of this high-density interface requires detailed on-chip models for both the signals and the power distribution network (PDN) in order to capture the link's intersymbol interference (ISI), crosstalk, simultaneous switching noise (SSN), and non-uniform current returns on the power and ground structures. Taking all of these channel imperfections and non-idealities into account is critical for a robust signal and power integrity analysis.

In this paper, a comprehensive signal and power integrity co-simulation methodology for the DIB interface is presented, and the link analysis techniques used to characterize the channel's ISI, crosstalk, and SSN are reviewed. Full transistor-level SPICE models are used for the victim and all the aggressor drivers, while the other adjacent drivers are modeled using a current-mirror technique. This methodology makes it possible to analyze hundreds of I/Os simultaneously and accurately captures all the coupling effects and SSN while maintaining a reasonable simulation time. Given the large number of wires, modeling the passive channels through the traditional approach of building transmission line models or using full-EM solvers is computationally prohibitive. Moreover, the characteristics of these interconnects differ from those of off-chip wires, so an efficient and accurate analysis of on-chip interconnects in heterogeneous systems is of great interest. To achieve both accuracy and efficiency, on-die channel models are extracted directly from the actual layout design database using a quasi-static tool. The extraction results are then correlated with the corresponding conventional W-element model, which is capable of accurately capturing the conductor and dielectric loss, to qualify the extraction methodology. Additionally, detailed power distribution network models are attached to the victim and all aggressor drivers to capture SSN. Finally, full-link SIPI co-simulations are performed, and each channel imperfection (i.e., ISI, crosstalk, SSN) is reported individually. Transient eye diagrams are then generated to verify the interface performance.

### II. EMIB technology and the need for SIPI co-simulation

Although the EMIB solution facilitates high-density on-chip interconnection, careful attention is required to address potential signal and power integrity issues caused by tight coupling and dense on-chip routing. As the number of on-chip interconnects increases, modeling such channels becomes increasingly challenging, mainly in terms of the accuracy and efficiency of the modeling methodology. On the one hand, brute-force 3D electromagnetic modeling yields accurate models for the on-chip tracks; on the other hand, such methodologies are complex, time-consuming, and computationally expensive, which makes them prohibitive for designs with dense and compact on-chip metal routing. Moreover, the proximity of power and signal tracks, especially in SiP designs, adds a further layer of complexity to the accurate modeling of on-chip interconnects. As the number of on-chip signals increases, the bump pitch shrinks, which leads to tighter coupling between signal and power nets. Hence signal and power integrity co-simulation (CO-SIPI) is needed to capture and accurately break down the channel's contributions to overall link performance, namely inter-symbol interference (ISI), crosstalk, and simultaneous switching noise (SSN).

Figure 2(a-c) shows the top, bump, and side views of the new FPGA chip and how the two FPGAs are bonded through three EMIBs in the middle of the chip. Four additional EMIBs are used to interface to four transceivers on the shorelines of the chip. Regarding the placement of the victim and aggressor drivers for channel analysis, one needs to recognize that although the channel's ISI manifests itself as energy loss at the end of each link, worst-case crosstalk is captured by considering the coupling of the surrounding drivers/bumps/pins into the victim line, together with the termination conditions. As shown in Figure 2(b), out of 24 neighbor channels, the nearest six interconnects are treated as aggressors to the victim line to capture crosstalk, and the remaining drivers are configured to stimulate SSN. Figure 2(c) illustrates the layer stack-up of the EMIB, where metal layers M1 and M3 are dedicated to signal routing, and ground and power tracks are placed on M2 and M4, respectively. Although power is not transferred through the EMIB in this case, such power transfer is feasible, as was shown previously for the Stratix 10 FPGA, in which one or more EMIB layers can be used for power distribution [3]. As shown in Figure 2, power and signal tracks run parallel to each other in both the horizontal and vertical paths, which highlights the need to consider both signal-to-power and signal-to-signal coupling in the link analysis.



Figure 2. a) Top view, b) bump view, and c) side view of EMIB-based SiP FPGA

Figure 3 shows the proposed SIPI co-simulation framework, which captures the contributions of both the signal network and the power distribution network (PDN) to the transient performance of the link. As indicated in the figure, PDN models are incorporated in the transistor-level driver models to capture SSN. Additionally, the horizontal and vertical on-die paths on the EMIB are extracted from bump to bump (outlined by the green box in Figure 3) with an EDA tool capable of distributed modeling of power grids and signal structures, accounting for mutual coupling (capacitive and inductive) as well as realistic current return paths [4]. The extraction results are then correlated with W-element models to verify the accuracy of the extracted models. This layout-based channel extraction method not only facilitates accurate modeling of EMIB channels but also allows extensive time-domain signal and power co-simulations in a reasonable amount of time. In the following sections, each component of the link path and its corresponding model are explained in more detail.



Figure 3. SIPI co-simulation framework for EMIB-based data interface bus

#### **Driver model:**

To accurately characterize the DIB link, a single tile of 24 channels is considered for the simulations. Of the 24 drivers, full transistor-level netlist models are used for the one victim channel and the six surrounding aggressors. The remaining 17 drivers in the same tile are toggled with a burst-idle-burst pattern at the PDN resonance frequency to stimulate worst-case SSN. These SSN drivers are modeled behaviorally with current mirrors: the current consumed by the actual drivers is measured and applied to the behavioral drivers. Table 2 summarizes the driver configurations and the corresponding stimulus patterns used in the transient simulations.

| Drivers/simulations        | ISI Only     | ISI+crosstalk | ISI+crosstalk+SSN |
|----------------------------|--------------|---------------|-------------------|
| Victim                     | PRBS_15 (x1) | PRBS_15 (x1)  | PRBS_15 (x1)      |
| Aggressor                  | Quiet (x1)   | PRBS_10 (x6)  | PRBS_10 (x6)      |
| SSN                        | Quiet (x1)   | Quiet (x17)   | Toggling (x17)    |

Table 2. Driver patterns and configurations
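The driver partitioning and stimulus assignment in Table 2 can be sketched in a few lines of Python. This is only an illustration: the names `prbs`, `ROLES`, `CASES`, and `stimulus_plan` are hypothetical, and the generator polynomials are the commonly used ones for PRBS15 and PRBS10, which the paper does not specify.

```python
from collections import Counter

def prbs(order, taps, nbits, seed=1):
    """Return nbits from a Fibonacci LFSR of the given order.

    taps are 1-indexed register stages that are XORed to form the feedback bit.
    """
    mask = (1 << order) - 1
    state = seed & mask
    if state == 0:
        raise ValueError("seed must be non-zero")
    bits = []
    for _ in range(nbits):
        feedback = 0
        for tap in taps:
            feedback ^= (state >> (tap - 1)) & 1
        bits.append((state >> (order - 1)) & 1)  # output the last stage
        state = ((state << 1) | feedback) & mask
    return bits

# Commonly used generator polynomials (assumed; not stated in the paper).
PRBS15_TAPS = (15, 14)  # x^15 + x^14 + 1
PRBS10_TAPS = (10, 7)   # x^10 + x^7 + 1

# One tile: 1 victim, 6 nearest aggressors, 17 SSN drivers (24 in total).
ROLES = ["victim"] + ["aggressor"] * 6 + ["ssn"] * 17

# Stimulus applied to each role in the three simulation cases of Table 2.
CASES = {
    "ISI only":          {"victim": "PRBS15", "aggressor": "quiet",  "ssn": "quiet"},
    "ISI+crosstalk":     {"victim": "PRBS15", "aggressor": "PRBS10", "ssn": "quiet"},
    "ISI+crosstalk+SSN": {"victim": "PRBS15", "aggressor": "PRBS10", "ssn": "toggling"},
}

def stimulus_plan(case):
    """List (driver index, role, stimulus) for all 24 drivers in a tile."""
    return [(i, role, CASES[case][role]) for i, role in enumerate(ROLES)]

if __name__ == "__main__":
    plan = stimulus_plan("ISI+crosstalk+SSN")
    print(Counter(stimulus for _, _, stimulus in plan))
    print(prbs(10, PRBS10_TAPS, 16))
```

Using a shorter, different-length pattern on the aggressors than on the victim keeps the aggressor data uncorrelated with the victim data over the simulated window.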

#### **Passive channel:**

The passive channel consists of a horizontal path (i.e., EMIB routing) and a vertical path (i.e., via, bump, pad). As mentioned earlier, the most accurate (and most time-consuming) way to model this path is full-wave 3D structural modeling of both the horizontal and vertical paths, followed by cascading the resulting models to represent the entire channel. This approach becomes even more prohibitive in dense routing environments such as EMIB, in which a multitude of signal and power tracks coexist. As an alternative, in this paper an EDA tool is used to extract layout-based signal and power/ground nets efficiently and in a reasonable amount of time [4]. With this methodology, it is possible to extract distributed signal and power/ground models (including coupling between all nets and skin effects), internal parasitics, inductive/capacitive couplings, and the correct current return paths. The extracted models are represented by distributed resistance, inductance, and capacitance in SPICE netlists, and the resulting models are then used in the 24-driver SIPI co-simulation.

To verify the accuracy of the extracted models, equivalent W-element models are developed and compared against the layout-based extracted models. The W-element models are developed for both the vertical and horizontal paths: the horizontal path is the direct path in the EMIB signal layers, and the vertical path comprises six cascaded models, starting from the top silicon via and ending at the bottom EMIB via. Lastly, the resulting W-element models are cascaded to obtain the final consolidated model representing the bump-to-bump passive channel. Figure 4(a-c) shows the layout-based extraction view and the two W-element models developed to benchmark the extracted model.




Figure 4. a) Bump-to-bump layout-based extraction top-view, b) equivalent W-element model for horizontal path, c) equivalent W-element model for vertical path

Figure 5(a-b) compares the insertion loss (IL) and far-end crosstalk (FEXT) of the two modeling techniques. As shown in Figure 5(a), the two IL curves agree closely: the W-element model shows higher loss starting at about 6 GHz, with a maximum difference of ~0.2 dB at 10 GHz. Although this difference is minimal and within the acceptable error tolerance of the methodologies, it may be attributed to differences in the material properties and to certain predefined assumptions embedded in the two methodologies. For crosstalk, the layout-based model shows higher coupling (~5 dB) than the W-element model. This difference can be attributed to the fact that the layout-based model considers all coupling mechanisms (i.e., signal-to-signal as well as power-to-signal), whereas the consolidated W-element model ignores power-to-signal coupling. The comparison of the IL and crosstalk results highlights the strength of the layout-based approach, in which real-world design imperfections and non-idealities that may not be captured by schematic-based tools are accounted for in the final channel performance and timing budget. Furthermore, the difference between the two modeling approaches shows how different methodologies (with different assumptions) can yield distinct results. Engineers need to be aware of each method's strengths and weaknesses and choose a suitable methodology with respect to the target application, the design environment, and the available modeling/simulation time.



Figure 5. Layout-based vs. W-element model comparison for a) insertion loss and b) far-end crosstalk
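To make the W-element side of this comparison more concrete, the sketch below computes the insertion loss of a single uniform line from per-unit-length RLGC parameters, using the commonly used W-element loss form R(f) = R0 + Rs·√f·(1+j) and G(f) = G0 + Gd·f. It is a minimal illustration, not the authors' model: the function name `line_s21_db`, the single-conductor simplification (no coupling, so no FEXT), and every numeric value are assumptions chosen only to exercise the formula.

```python
import cmath
import math

def line_s21_db(freq_hz, length_m, l_per_m, c_per_m,
                r0, rs, g0=0.0, gd=0.0, z_ref=50.0):
    """Insertion loss |S21| in dB of a uniform line in a z_ref system.

    R(f) = r0 + rs*sqrt(f)*(1+j) models conductor (skin-effect) loss.
    G(f) = g0 + gd*f models dielectric loss.
    """
    w = 2.0 * math.pi * freq_hz
    r = r0 + rs * math.sqrt(freq_hz) * (1 + 1j)
    g = g0 + gd * freq_hz
    z_series = r + 1j * w * l_per_m      # ohm/m
    y_shunt = g + 1j * w * c_per_m       # S/m
    gamma = cmath.sqrt(z_series * y_shunt)   # propagation constant
    z_c = cmath.sqrt(z_series / y_shunt)     # characteristic impedance
    gl = gamma * length_m
    # ABCD parameters of the line section
    a = cmath.cosh(gl)
    b = z_c * cmath.sinh(gl)
    c = cmath.sinh(gl) / z_c
    d = a
    s21 = 2.0 / (a + b / z_ref + c * z_ref + d)
    return 20.0 * math.log10(abs(s21))

if __name__ == "__main__":
    # Hypothetical values for a short, lossy on-die style trace.
    params = dict(length_m=2e-3, l_per_m=250e-9, c_per_m=100e-12,
                  r0=5e3, rs=0.05, gd=1e-12)
    for f_ghz in (1, 2, 5, 10):
        il = line_s21_db(f_ghz * 1e9, **params)
        print(f"{f_ghz:>2} GHz: {il:.2f} dB")
```

Cascading several such sections, as done for the six-segment vertical path, amounts to multiplying their ABCD matrices before converting to S21.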

#### **PDN model:**

The last critical component of the DIB link analysis is the system-level PDN model, which includes board, package, and die power delivery models. Figure 6(a) shows the detailed PDN model, which is built through rigorous on-chip and off-chip extraction and modeling to represent the system-level power delivery network. The PDN modeling techniques and the respective on-die extractions are beyond the scope of this paper and are not discussed here. An important piece of information that can be inferred from the PDN impedance profile in Figure 6(b) is the package-die resonance frequency, at which the impedance is at its maximum. This frequency (~50 MHz) is used as the modulation frequency for the SSN drivers, which toggle with an idle-burst-idle pattern at the resonance frequency. In this way, the worst-case SSN is considered in the channel simulations.





Figure 6. a) System-level PDN model, b) PDN impedance profile
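As a rough illustration of how the resonance frequency drives the SSN stimulus, the sketch below estimates a package-die resonance from a lumped package inductance and on-die capacitance, then builds a burst/idle bit pattern whose envelope repeats at that frequency. The function names, the L and C values (picked only so that the result lands near 50 MHz), and the 1 Gb/s bit rate are all assumptions; the paper reports only the ~50 MHz resonance.

```python
import math

def lc_resonance_hz(l_henry, c_farad):
    """Resonance frequency of a lumped LC tank: 1 / (2*pi*sqrt(L*C))."""
    return 1.0 / (2.0 * math.pi * math.sqrt(l_henry * c_farad))

def burst_idle_bits(bit_rate_hz, f_mod_hz, n_periods):
    """Bit pattern that toggles for half a modulation period, then idles.

    The burst/idle envelope repeats at f_mod_hz, so the switching current
    has a strong spectral component at the PDN resonance.
    """
    bits_per_period = round(bit_rate_hz / f_mod_hz)
    half = bits_per_period // 2
    burst = [i % 2 for i in range(half)]       # 0, 1, 0, 1, ...
    idle = [0] * (bits_per_period - half)      # driver held quiet
    return (burst + idle) * n_periods

if __name__ == "__main__":
    # Hypothetical lumped values: 20 pH package loop, 500 nF die capacitance.
    f_res = lc_resonance_hz(20e-12, 500e-9)
    print(f"estimated resonance: {f_res / 1e6:.1f} MHz")
    # Hypothetical 1 Gb/s bit rate, envelope modulated at 50 MHz.
    pattern = burst_idle_bits(1e9, 50e6, n_periods=3)
    print(pattern[:20])
```

In practice the modulation frequency would be read from the simulated impedance profile rather than from a two-element approximation.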

### III. System-level channel performance results

Once all passive and active components of the link are characterized, the system-level SIPI co-simulation model is built according to the framework shown in Figure 3, including both the signal models and the power delivery network. As the first step in the system-level simulations, the channel's step response is characterized for two mechanisms that couple noise onto the victim: capacitive and inductive coupling (crosstalk), and coupling through the power distribution network (SSN). This step response characterization is beneficial at an early design stage because it identifies the dominant coupling mechanism in the system, which can be fed back to the design team to optimize the pin/pitch/bump configuration and placement. As indicated in Figure 7 and summarized in Table 3, the SSN-only case with 17 drivers contributes induced noise (34.6 mV peak-to-peak) comparable to that of the crosstalk-only case with 6 drivers (40.1 mV peak-to-peak). In the third case, where both the SSN and FEXT aggressors couple to the victim channel at the same time, ~52 mV of peak-to-peak noise is coupled to the victim line, indicating that the maximum coupled noise is induced when crosstalk and SSN are considered simultaneously. This information can be further used to optimize the number of actual drivers needed in the simulation to ensure that worst-case channel budgeting is considered in timing closure.



Figure 7. Channel step response to different coupling mechanisms

| Noise              | Crosstalk only | SSN only | Crosstalk+SSN |
|--------------------|----------------|----------|---------------|
| # of aggressor IOs | 6              | 17       | 23            |
| Peak-to-peak (mV)  | 40.1           | 34.6     | 51.6          |

Table 3. Induced voltage noise for different coupling mechanisms
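One way to read Table 3 is to compare the combined noise with simple combinations of the two individual contributions. The short calculation below does this with the reported values. The root-sum-square (RSS) comparison is an illustrative assumption on our part and not a model used in the paper.

```python
import math

# Peak-to-peak induced noise from Table 3, in mV.
CROSSTALK_ONLY_MV = 40.1
SSN_ONLY_MV = 34.6
COMBINED_MV = 51.6

def combine_noise(a_mv, b_mv):
    """Return (linear sum, root-sum-square) of two noise amplitudes."""
    linear = a_mv + b_mv
    rss = math.sqrt(a_mv ** 2 + b_mv ** 2)
    return linear, rss

if __name__ == "__main__":
    linear, rss = combine_noise(CROSSTALK_ONLY_MV, SSN_ONLY_MV)
    print(f"linear sum: {linear:.1f} mV")
    print(f"RSS:        {rss:.1f} mV")
    print(f"simulated:  {COMBINED_MV:.1f} mV")
```

The simulated combined value is well below the linear sum and close to the RSS estimate, which is consistent with the two noise peaks not aligning in time. This is only an observation on three data points and should not be taken as a general budgeting rule.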

Lastly, the transient channel SIPI co-simulation results for each simulation case are shown in Figure 8(a-c). The eye diagram measurements at the receiver shown in these figures are based on a receiver sensitivity of ±100 mV AC and a 50% V<sub>REF</sub> mask. The ISI-only transient response (Figure 8a) represents the case in which only the victim driver is running PRBS15 and the other drivers are quiet; the result shows a clean, wide-open received eye diagram with adequate margin (6% UI jitter) at the predefined mask. In the second case (ISI and crosstalk), the six adjacent aggressor drivers run PRBS10 simultaneously with the victim driver, which results in an additional 2% UI of jitter due to crosstalk. The last case (Figure 8c) includes ISI, crosstalk, and SSN together and shows a maximum of 10% UI jitter at the mask; that is, SSN adds a further 2% UI, an incremental contribution similar to that of crosstalk. It is worth noting that the transient results corroborate the step responses discussed earlier (refer to Table 3), from which the same conclusion is drawn.

Although this specific set of simulations is conducted at a relatively low frequency due to the design intent of the DIB, crosstalk and SSN can have a more significant impact on channel degradation at higher data rates. The other key point is the need for an accurate and efficient methodology to enable SIPI co-simulation. Traditional full-wave 3D electromagnetic modeling of passive channels may be the most accurate modeling approach, but resource-efficient layout-based methodologies that consider all possible sources of signal-power interaction are more favorable for SIPI co-simulation. By employing such efficient methods, together with accurate characterization of both power and signal tracks and system-level PDN models, a comprehensive and realistic channel simulation can be conducted. The results then serve as a reliable source of information for system designers and for timing closure of the entire system or of a particular interface.

|                    | ISI only | ISI+crosstalk | ISI+crosstalk+SSN |
|--------------------|----------|---------------|-------------------|
| Jitter @ ±100mV-AC | 6% UI    | 8% UI         | 10% UI            |

Table 4. Jitter at predefined mask
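The per-mechanism breakdown implied by Table 4 can be obtained by differencing the cumulative jitter values. The sketch below does this and also reports the remaining horizontal opening at the mask. It assumes that the contributions accumulate in the order simulated and that the opening is simply one UI minus the reported jitter; both are simplifications for illustration, and the names are hypothetical.

```python
# Cumulative jitter at the ±100 mV AC mask from Table 4, in % UI.
JITTER_UI_PCT = {
    "ISI only": 6.0,
    "ISI+crosstalk": 8.0,
    "ISI+crosstalk+SSN": 10.0,
}

def incremental_jitter(cumulative):
    """Jitter added by each successive case, in the order given."""
    increments = {}
    previous = 0.0
    for case, total in cumulative.items():
        increments[case] = total - previous
        previous = total
    return increments

def eye_opening_pct(total_jitter_pct):
    """Remaining horizontal eye opening at the mask, in % UI."""
    return 100.0 - total_jitter_pct

if __name__ == "__main__":
    for case, delta in incremental_jitter(JITTER_UI_PCT).items():
        print(f"{case:<20} +{delta:.0f}% UI")
    worst = JITTER_UI_PCT["ISI+crosstalk+SSN"]
    print(f"opening at mask: {eye_opening_pct(worst):.0f}% UI")
```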



Figure 8. Eye diagrams for a) ISI only, b) ISI+crosstalk, and c) ISI+crosstalk+SSN

### IV. Conclusion

In this paper, a comprehensive SIPI co-simulation approach for the data interface bus in an EMIB-based high-capacity FPGA is presented. The data interface bus (DIB) provides a dense, seamless die-to-die connection of over 25000 wires between the two FPGAs, making more than 10M logic elements available in a single package. For channel analysis of the interface, transistor-level driver models are used together with distributed layout-based models for both the vertical and horizontal paths on the EMIB. The resulting models are correlated with W-element models to verify the accuracy of the extraction. Furthermore, system-level PDN models are incorporated in the link analysis to accurately capture the SSN contributions. Lastly, transient eye diagrams for ISI, crosstalk, and SSN are reported to show the proportional contribution of each mechanism to overall channel performance. In conclusion, this work shows that an efficient on-die extraction technique facilitates thorough SIPI co-simulation, especially in dense and compact routing environments such as EMIB and similar 2.5D interposer technologies.

## References

- [1] Ravi Mahajan, et al., "Embedded Multi-die Interconnect Bridge (EMIB) -- A High Density, High Bandwidth Packaging Interconnect," *2016 IEEE 66th Electronic Components and Technology Conference (ECTC)*, Las Vegas, US, May 31 – Jun. 3, 2016.
- [2] Intel Stratix 10 GX 10M FPGA: https://blogs.intel.com/psg/intel-announces-intelstratix-10-gx-10m-fpga-worlds-highest-capacity-with-10-2-million-logic-elementstargets-asic-prototyping-and-emulation-markets/. Last retrieved on 11/16/2019.
- [3] Changwook Yoon, et al., "In-depth SI and PI analysis of chip-to-chip interconnect using silicon bridge," presented at *DesignCon 2018*, Santa Clara, CA, January 2018.
- [4] Sigrity XcitePI Extraction: https://www.cadence.com/content/cadencewww/global/en\_US/home/tools/ic-package-design-and-analysis/si-pi-analysis-pointtools/sigrity-xcitepi-extraction.html. Last retrieved on 11/16/2019.