# DesignCon 2019

Analysis of Power Integrity Effects on Signal Integrity in FPGA DDR4 Memory Interfaces by Using PDN Resonance Peaks Based Worst Case Data Patterns

Cristian Filip, Product Architect High-Speed Products, Mentor, a Siemens Business Cristian\_Filip@mentor.com

- Cosmin Iorga, Principal R&D, PIScanner.com cosmin.iorga@noisecoupling.com
- Daniel N. de Araujo, Principal Product Architect, Mentor, a Siemens Business Daniel\_deAraujo@mentor.com
- Nitin Bhagwath, Product Architect, Mentor, a Siemens Business Nitin\_Bhagwath@mentor.com
- Hans Klos, Managing Director Sintecs BV, Sintecs hans.klos@sintecs.nl
- Tom Berends, Manager High-speed board design Sintecs BV, Sintecs tom.berends@sintecs.nl
- Arpad Muranyi, Senior Product Architect, Mentor, A Siemens Business Arpad\_Muranyi@mentor.com
- Chuck Ferry, Director of Systems Architecture, Mentor, A Siemens Business Chuck\_Ferry@mentor.com

Praveen Anmula, Hardware Engineer, Cisco Systems panmula@cisco.com

# Abstract

Power integrity effects on signal integrity in FPGA DDR4 memory interfaces are analyzed in pre-layout, post-layout, and system validation using data patterns created based on the resonance peaks of the power distribution network (PDN). The PDN impedance profile is measured with an FPGA-configured vector network analyzer (VNA). Multiple test data patterns are created to superimpose the frequency spectral components of the power supply current on the PDN resonance peaks and to exercise the build-up of multiple reflections on the transmission lines. These data patterns are then used to identify the dominant contributors to signal integrity degradation.

# **Authors' Biographies**

#### **Cristian Filip**

Cristian Filip joined Mentor Graphics Corporation in 2014, where he is a Product Architect for high-speed products. Cristian holds an M. Eng. in Electronics and Telecommunications from the Polytechnic University, Timisoara, Romania, and is a member of Professional Engineers Ontario. He has authored and co-authored several articles and papers and won the DesignCon Best Paper Award in 2016 and 2018.

# Cosmin Iorga

Cosmin Iorga earned his PhD in Electrical Engineering from Stanford University and has accumulated over 20 years of experience in high-speed analog and mixed-signal circuit design and troubleshooting at system, board, and integrated circuit levels. Cosmin is the author of the book "Noise Coupling in Integrated Circuits: A Practical Approach to Analysis, Modeling, and Suppression", and he teaches courses in analog circuit design, data converters, and power integrity at UCLA Extension.

# Daniel de Araujo

Daniel has 21 years of experience in board and chip design, simulation, and validation in high-end servers and high volume commercial desktops. Originally from Brazil, he graduated with a B.S. in electrical engineering from Michigan State University and joined IBM's Personal Systems Division in 1997 in Research Triangle Park, NC. He finished his master's degree in Computer Engineering at North Carolina State University in December of 2000. In 2001, he moved to Austin, TX and worked as a Senior Engineer and Team Lead in the IBM System X Electrical Interconnect and Packaging Design group. Then, in 2006, he joined Ansoft Corporation as an application engineer in the areas of High Frequency / Signal & Power Integrity. In 2010, he joined Physware as Director of Applications and the company name changed to Nimbic in June 2011. Mentor Graphics acquired Nimbic in 2014 and Siemens acquired Mentor Graphics in 2017, where he is a Principal Product Architect. Daniel has 14 patents issued, 11 filed, nine patent disclosure publications and 70 peer-reviewed publications in international IEEE conference proceedings, transactions, journals, and books.

# Nitin Bhagwath

Nitin Bhagwath is a Product Architect at Mentor Graphics. He has designed and architected high-speed systems for Hewlett Packard and Cisco for ten years. He has been with the high-speed simulation group at Mentor Graphics since 2012, where he advises on simulation architecture for DDR memory, power integrity, and multi-gigabit SerDes signals. Nitin represents Mentor Graphics at the JEDEC memory groups. Nitin has a bachelor's degree in Electronic Engineering from Bangalore University, an MS in EE from Purdue University and an MBA from the Indian Institute of Management, Bangalore.

#### Hans Klos

Hans Klos has more than 20 years of experience in performing high-speed board design and analysis. He worked as a consultant for several telecommunication companies in Europe, such as Alcatel, Ericsson, Lucent, Motorola, and Siemens, and has a broad range of experience with different EDA analysis tools. In the past, Hans worked with ViewLogic XTK, Mentor Graphics ICX and HyperLynx tooling, performing Signal Integrity, Power Integrity, DDRx timing and EMC analysis. In 2000, Hans founded Sintecs, a high-end embedded system design and services company in the Netherlands.

# **Tom Berends**

Tom has more than 20 years of experience in high speed electronics development. Tom joined Sintecs in 2008 and worked as hardware design engineer and High speed board analysis engineer on different high speed electronics designs. He has experience in high speed signal & power integrity, DDR and multi-gigabit SerDes signals.

Currently he manages the high speed board development team at Sintecs. His team is responsible for many high speed board design challenges. In 2000 Tom graduated from the Enschede University of Professional Education with a bachelor's degree in Electrical Engineering with a specialization in Telecommunication. Prior to joining Sintecs, Tom was a hardware design engineer at European companies in the telecom business, such as Ericsson and Nokia Siemens Networks.

# Arpad Muranyi

Arpad Muranyi joined Mentor Graphics Corporation in 2007. His work includes developing and testing advanced modeling and simulation technologies for Mentor's leading edge signal integrity simulation products. He also serves as the chairman of the IBIS Advanced Technology Modeling Task Group which is responsible for developing support for such new technologies in the IBIS specification.

# **Chuck Ferry**

Chuck Ferry manages the analysis product architecture group at Mentor, a Siemens Business, focused on architecting system level signal and power integrity analysis solutions. He has spent the last 21 years tackling a broad range of high speed digital design challenges. Chuck graduated magna cum laude from the University of Alabama with a B.S.E. in electrical engineering and continued graduate course work in the areas of signal processing and hardware description languages.

#### Praveen Anmula

Praveen Anmula is currently working at Cisco Systems as a Hardware Engineer. He is leading signal integrity work for next generation router PCB boards, designing high-speed SerDes interfaces and working on link optimization. Prior to joining Cisco, Praveen was a Product Architect in the high speed product group at Mentor Graphics Corporation. Praveen Anmula received his M.S. degree in Electrical Engineering from Missouri University of Science & Technology, Rolla, MO and his B.E. degree in Electronics and Communications Engineering from Jawaharlal Nehru Technological University, Hyderabad, India.

# Introduction

It is well known that power supply noise can generate signal distortion and timing violations in high-speed memory interface designs. In order to mitigate this issue, designers usually optimize the power distribution network (PDN) and the signal interconnects separately. Although this partitioning process is helpful for understanding how the individual contributions of signal and power integrity affect overall performance, this approach is based on the assumption that the system is linear and time invariant (LTI). However, in large parallel single-ended interfaces, like those in DDR4, the LTI assumption is typically violated and neglecting interactions between power and signal integrity effects makes the analysis too optimistic.

Our proposed optimization method starts in the pre-layout simulation environment with a simulation deck that includes both the power distribution network (PDN) and the transmission line connections between an FPGA DDR4 memory controller and the associated DRAM devices.

After the board is completely routed based on the defined constraints, a post-layout validation step consisting of a power-aware SI/PI co-simulation run is performed. In this step, a set of worst-case patterns is identified based on the ISI and the PDN impedance resonant frequencies for the DQ bus. The linearity of the DQ signals is assessed and the isolated contributions of the ISI, crosstalk and SSN effects on the eye opening are quantified. The methodology of isolating those effects is described in detail.

In the final step of this design methodology, post-layout simulation results are compared to lab measurements. First, the PDN impedance profile of the FPGA DDR4 memory interface I/O bank is measured using a tool that configures the FPGA to act like a vector network analyzer with ports connected to its own on-die power rails. Next, based on the frequencies of the PDN resonance peaks, a set of worst-case power integrity DDR4 data patterns is created so that the SSO power supply currents overlap the PDN resonance peaks. A set of worst-case signal integrity data patterns is created to exercise the build-up of multiple reflections on the transmission lines. Signal integrity and power integrity measurements using these data patterns are then performed to identify the dominant contributors to signal integrity degradation. An assessment of how well the measurement results correlate with simulations is made and explanations are provided.

Results for this methodology are quantified using the Xilinx Zynq UltraScale+ MPSoC ZCU102 Evaluation Kit and a VNA implemented in the FPGA. The HyperLynx suite is used for signal and power integrity simulations.

#### **Challenges of designing complex PCBs**

As PCBs increase in complexity and density, hardware development becomes more complicated and new effects need to be considered. A traditional post-layout signal integrity analysis flow is no longer enough to guarantee a well-functioning PCB. It is necessary to evaluate alternative strategies before board layout begins, considering allowable manufacturing tolerances. By using high-speed design, analysis and verification techniques early in the design cycle, it is possible to drastically reduce or eliminate layout iterations.

The design process starts with the selection of a multilayer stack-up for the PCBs, and the following aspects need to be considered:

1. High-speed signal layers should always be adjacent to a reference plane. This limits the number of signal layers embedded between planes to two, and the top and bottom (outer) layers to a single signal layer each.
2. Signal layers should be tightly coupled (<250 um, or ~10 mil) to their respective reference planes.
3. While both power and ground planes can be used for a signal's return, careful attention must be paid to the signal's return path.
4. Multi-gigabit routing should be constrained to specific routing layers with appropriate material properties.
5. A proper via technology should be chosen to minimize the risk of reflections for high-speed signals.

For complex designs, simply following the IC vendor design guidelines is no longer enough to guarantee that the product will work properly. High-speed interfaces, such as DDR4 running at 2666MT/s, require detailed signal and power integrity analysis to ensure design requirements are met.

PCB design is generally under the highest pressure to release a product on schedule. This is the last opportunity to implement development changes before fab out, and it is where all the key design tradeoffs need to come together. Success depends not only on the PCB designers and their knowledge of the layout tool, but also on knowledge of the specific interfaces in the design and their requirements, on the use of simulation tools to ensure those requirements are met, and on an understanding of specific physical phenomena and their accurate modeling in the simulation setup.

# Power Integrity effects on Signal Integrity

A typical serial data path from a transmitter to a receiver contains signal integrity and power integrity components, as shown in the drawing of Figure 1.



Figure 1. Signal integrity and power integrity components of a data transmission path

The typical signal integrity components are the characteristic impedance discontinuities at the die-to-package and package-to-PCB interfaces, crosstalk, split reference planes, reference plane changes at via transitions, losses (electromagnetic, dielectric...), and transmission line termination impedance. The typical power integrity components are the power distribution networks on the transmitter chip and receiver chip. The signal arriving at the receiver may have waveform aberrations like overshoot, undershoot, rise/fall time degradation, ringing, reflections, and crosstalk, as shown in Figure 2.



Figure 2. Typical waveform aberrations seen on transmission line signals

Besides these signal integrity waveform aberrations, power integrity effects may produce additional waveform degradation. The power supply noise on the transmitter chip "modulates" the starting level of rising and falling transitions, resulting in edge shift and amplitude reduction, as shown in Figure 3.



Figure 3. Power integrity effects on transmitter signal integrity

The power supply noise on the receiver die may couple into the reference voltage, VREF, of the receiver comparator, as illustrated in Figure 4. This happens for the case when VREF is generated externally on the PCB and for the case when VREF is generated internally on the receiver die. In both cases there is a low-pass filter that couples the on-die VREF node to the on-die VSS (or VDD) power rail. The input signal into the comparator comes from a low impedance path primarily coupled to the PCB VSS or VDD. So any on-die power supply noise coupled into VREF will be interpreted as differential noise at the input into the comparator and will "chop" the center portion of the sampled data eye resulting in a degraded eye opening.



Figure 4. Power integrity effects on receiver signal integrity

In simulations we would like to model all these power integrity and signal integrity elements of the transmission path, as illustrated in Figure 5.



Figure 5. Example simulation test bench including both signal integrity and power integrity elements on the die, package, and PCB.

The power distribution network is represented for simplicity as inductors in this figure; however, in simulation test benches the power distribution networks are typically represented as S-parameter models.

# Modeling requirements for SI/PI co-simulation

In order to be able to perform power-aware SI/PI co-simulations, we obviously need models which include the signal integrity (SI), as well as power integrity (PI) effects we want to see in our simulation results. This includes power-aware buffer models for the driver and receiver (Tx and Rx), package models that include both signals and power and ground traces, on-die and/or on-package power decoupling models (capacitance between power and ground), model(s) for the on board power delivery network (PDN) that involve the power and ground planes, board via models for both signal and power, models for the board's decoupling capacitors, and models for on-board voltage regulator modules (VRM). Ideally, these models should all include the interactions (coupling) between power and signals. The goal is to find out how the power delivery and/or signal layout imperfections degrade the overall system performance, and to identify the most effective changes for a working design.

A large portion of SI simulations are performed using behavioral IBIS models. A simple IBIS buffer model consists of several I-V and V-t tables, which describe the buffer's impedance and switching (transient) characteristics. However, to make these basic buffer models usable for SI/PI co-simulations, additional information is needed.

First, using the IBIS [Pin Mapping] keyword, the IBIS file needs to provide detailed information about which and how many power and ground pins provide the supply current for the various (groups of) buffer models.

Second, the buffer models need to make use of the IBIS power-aware keywords [ISSO PU] and [ISSO PD] to describe how the power supply voltage fluctuations modulate the buffer's impedance (or I-V curves). The buffer models should also contain the [Composite Current] keyword to provide information on how the output current is distributed between the power and ground supply rails of the buffer.

Third, IBIS has several keywords for package modeling with various levels of accuracy. The basic (and required) IBIS [Package] keyword contains only an overall typ/min/max range for the entire package of the device. This "bare minimum" information does not contain enough detail for the level of accuracy needed for SI/PI co-simulations. The IBIS [Pin] keyword provides a mechanism to describe the R/L/C package parasitics for each pin individually, including the power and ground pins. Even though the syntax of this keyword does not support coupling, it can be useful because it can provide a pin level detail for the package model. The [Package Model] / [Define Package Model] keyword pair allows the model maker to include coupling effects as well. Unfortunately none of these IBIS keywords support frequency dependent conductor and dielectric losses. However, most EDA vendors have (sometimes proprietary) solutions to incorporate more accurate package models in Touchstone and/or SPICE formats. The next version of the IBIS specification (IBIS v7.0) will include new package and on-die interconnect modeling syntax that eliminates these shortcomings and tool-specific methods will not be needed anymore.

Fourth, power-aware simulations require models for on-die and on-package decoupling capacitors, because they have a very strong influence on the supply rail noise for the I/O buffers. Here again, the current IBIS version is limited, but IBIS 7.0 will provide support for these types of models as well. In the meantime, EDA tools have their own mechanisms for this purpose.

How the various components are connected with the board is a key factor in keeping simulations manageable. Most devices have a large number of power and ground pins (this helps provide adequate return paths for all signals, in addition to supplying the device with power). Ideally, we would model every power and ground pin surrounding the signal pins so that we could account for the exact return paths for the signals in our simulations. However, this results in prohibitively large interconnect models for the component package and the board PDN model. In order to simplify models and speed up simulations, we must find ways to reduce the number of ports connecting the devices to the board. Different vendor models approach this challenge in different ways, so we as users of these models must pay close attention to how the models were created and intended to be used. One vendor, for example, uses the "merged pins" technique, which combines the parasitics of multiple power and/or ground pins into a single pin equivalent model (or a reduced model using only a few pins), leaving the remaining power and/or ground pins without a package model. When we use a model like this, we must be sure not to create a port for the board PDN for those power/ground pins which are effectively no-connect pins, or we need to apply a corresponding "pin-grouping" when we generate our board PDN model to match how the power pins are merged (i.e. grouped) on the device.

Of course, we need to make sure we have accurate models for all the decoupling capacitors which are "sprinkled" over the board. These are usually supplied by the capacitor vendors as Touchstone models.

Without these details the simulator is not going to be able to calculate the power supply currents accurately. Consequently, power-aware simulations might not run at all, or produce questionable results.

# DDR4 interface and technology

The DDR4 Bus is a high speed parallel bus consisting of one controller on one end, and one or more DRAMs on the other. The bus is functionally split between the unidirectional Address, Command and Control bus (henceforth called the "Address bus") and the bi-directional Data bus, as shown in the following diagram (Figure 6):



Figure 6. DDR Bus Overview

The Controller issues instructions to the DRAMs on the Address bus, which consists of a differential pair Clock signal, and several single ended signals with specific functions. The DRAMs latch in the Address signals using the Clock they receive.

The Controller can also write data to the DRAMs or read data from the DRAMs over the Data bus. The data bus is made up of one or more lanes, each of which has its own dedicated differential strobe, four or eight data bits, and an optional Data Mask signal. The data and mask signals use the strobe to latch in the signal at the controller (during a read) or the DRAMs (during a write).

For both the Address bus and the Data bus, the signal integrity (SI) quality of the signal arriving at the receiver can be determined by the ability of the receiver to correctly latch in the incoming signal. At the DRAM, the SI requirements to correctly latch in an address or data bit are provided by JEDEC, the industry standards body. Although there are numerous stated requirements, two of the most important requirements on the address bus are the setup and hold times, that is, the time the address signal is valid before the clocking event and the time the signal stays valid after the clocking event. For the data bus, DDR4 has introduced an eye mask, as shown in the diagram below (Figure 7). This eye mask can be seen as a combination of a setup and hold time centered around the strobe event.



Figure 7. DDR4 Data Compliance Mask

At the controller, although there isn't an industry standard defining every controller's requirements, one of the primary requirements at nearly every controller for read cycles is defined either as explicit setup and hold times, or as an eye mask, similar to the DRAM.

Furthermore, many DDR4 controllers have the ability to explicitly delay the bits in the data bus so as to optimize the setup and hold times for each data bit to compensate for routing flight skews. As an example, if a bit is routed too short with respect to the strobe, and therefore could arrive too early at the DRAM with respect to the strobe, then the setup time at the DRAM would be very large, but the hold time might be unacceptably short. To compensate for this situation, the controller could internally delay the data bit to balance out the setup and hold times. For such controllers, it is a combination of setup and hold – combined to be the eye width – which is important to be met rather than the individual setup or hold times.
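The deskew example above can be illustrated with a minimal sketch. It assumes that delaying a data bit by some amount reduces its setup time and increases its hold time by that same amount; the function name and the picosecond values are hypothetical and are not taken from any controller documentation.

```python
def balance_setup_hold(setup_ps, hold_ps):
    """Return (delay_ps, balanced_margin_ps, eye_width_ps) for one data bit.

    Assumption: delaying the bit by d lowers setup by d and raises hold by d,
    so both are equal when d = (setup - hold) / 2. A negative result means
    the bit would have to be advanced, which a delay-only circuit cannot do.
    """
    eye_width_ps = setup_ps + hold_ps          # unchanged by the delay
    delay_ps = (setup_ps - hold_ps) / 2.0      # shift that equalizes both
    balanced_margin_ps = eye_width_ps / 2.0    # setup == hold after the shift
    return delay_ps, balanced_margin_ps, eye_width_ps


# Hypothetical bit routed too short: large setup time, small hold time.
delay, margin, width = balance_setup_hold(setup_ps=300.0, hold_ps=100.0)
print(delay, margin, width)
```

The eye width is the same before and after the delay is applied, which is why it, rather than the individual setup or hold time, is the quantity of interest for controllers with per-bit deskew.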

Therefore, for this paper, the optimization algorithms are aimed towards maximizing the combined setup and hold times, or the eye-width at the receiver. For most designs, this would normally be the first parameter to be optimized. For specific situations, the procedures outlined in this paper can be used to optimize alternative parameters as needed.

# **PDN & Optimization**

#### Power Distribution Network and Decoupling hierarchy

Power integrity covers everything from the voltage regulator module (VRM) to the on-die core power rails, including the interconnects on the board and package, the discrete capacitors, and the on-die capacitance. It is all about the quality of the power as seen by the circuits on the die [12].

Various aspects of the design may or may not be accessible or changeable in the design depending on the system and the level of control across the various components in the PDN. A vertically integrated company may have full visibility from the die circuits and on-chip PDN characteristics, to the package design, layout and on-package decoupling selection as well as the board, the various capacitors on the PCB to the VRM. System integrators may control the PCB, but must use a package/die which has a fixed layout and decoupling scheme.

The goal is to reduce component count and cost while maintaining performance requirements to ensure robust operation across material and manufacturing variations.

#### Package and Die modeling with Measurement Based Models

Since the package layout, decoupling and parasitics were not available, a measurement-based model with the equivalent parasitics was created to represent the missing information. The equivalent circuit model is depicted in Figure 8 below:





The measured impedance profile was fitted as an average s-parameter and the values of the discrete components from the equivalent circuit were identified. The overlaid plots of the measured and fitted PDN self-impedances are shown in Figure 9.



Figure 9. Overlapped PDN Impedance plots

The impedance profiles can be identified as follows:

Blue - measurement (extraction done from the die side)

Red - Hybrid Solver extraction (does not include the VRM and bulk capacitor - extraction done at the PCB power pins of the FPGA)

Green - full PDN that includes VRM, bulk capacitor, PCB + FPGA PDNs (extraction from the equivalent lumped circuit)

# Simulation Flow description

The pre-layout simulation deck contains two sections, one dedicated to modeling the transmission line path, which includes the data signals (DQ, DM, and DQS) and the other needed for modeling the system power distribution network: PCB, FPGA package, FPGA die. The two sections are included in the same pre-layout schematic. The data bus topology is point-to-point, for the DDR4 interface, and connects the memory controller and the SDRAM as described in Figure 10:



Figure 10. DDR4 transmission line path

This simulation deck includes coupled transmission lines for crosstalk evaluation, as well as uncoupled transmission lines that allow for skew and timing budgeting. Total length, delay and skew constraints can be developed and the effect of layout parameters such as trace width, trace length and trace-to-trace spacing can be incorporated in the analysis.

The system power distribution network consists of an 18-port s-parameter model with ports placed at the VRM, FPGA, SDRAM and PCB decoupling capacitor power pins, as shown in Figure 11.



Figure 11. The system power distribution network: PCB, FPGA package, FPGA die

For a more realistic representation of a real life system, the s-parameter of the PCB PDN was extracted from the bare board of the Xilinx reference design, using an EM field solver. Port locations are depicted in Figure 12.



Figure 12. Port locations for PDN extraction

The decoupling capacitors and their associated equivalent series resistance (ESR) are modeled as discrete passive components attached to the PDN model. This allows analysis and optimization scripts to run in batch mode. The Z-parameters of the bare PCB are shown in Figure 13.



Figure 13. Self-impedances of the VCC1V2 power rail PDN, bare board

#### **PDN Optimization:**

Optimization of capacitor selection and placement of a power distribution network consists of finding a set of capacitors that meet the system performance (acceptability criteria) while minimizing the costs associated with it such as component cost, capacitor count reduction and capacitor type reduction (desirability criteria). This becomes a combinatorial optimization based on a cost function that takes into account both acceptability and desirability. [2]
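A minimal sketch of such a cost function is given below. It is not the method of [2]: it assumes each capacitor is a series R-L-C branch, ignores the planes, package and VRM, and uses a plain brute-force search instead of an optimizer. The capacitor library, prices, type penalty and target impedance are all hypothetical values chosen only for illustration.

```python
import itertools
import math

def cap_impedance(f_hz, c_f, esr_ohm, esl_h):
    """Complex impedance of one capacitor modeled as a series R-L-C branch."""
    w = 2.0 * math.pi * f_hz
    return complex(esr_ohm, w * esl_h - 1.0 / (w * c_f))

def pdn_impedance(f_hz, caps):
    """|Z| of all capacitors in parallel (planes and VRM ignored here)."""
    y = sum(1.0 / cap_impedance(f_hz, *cap) for cap in caps)
    return abs(1.0 / y)

def cost(counts, library, freqs_hz, z_target_ohm, type_penalty=0.5):
    """Acceptability: |Z| <= target at every frequency, else infinite cost.
    Desirability: part cost plus a penalty per distinct capacitor type."""
    caps = [lib["rlc"] for lib, n in zip(library, counts) for _ in range(n)]
    if not caps:
        return math.inf
    if any(pdn_impedance(f, caps) > z_target_ohm for f in freqs_hz):
        return math.inf
    parts = sum(lib["price"] * n for lib, n in zip(library, counts))
    types = sum(1 for n in counts if n > 0)
    return parts + type_penalty * types

# Hypothetical library: (capacitance F, ESR ohm, mounted ESL H) and unit price.
LIBRARY = [
    {"rlc": (10e-6, 0.005, 1.0e-9), "price": 0.10},
    {"rlc": (1e-6, 0.010, 0.8e-9), "price": 0.03},
    {"rlc": (0.1e-6, 0.020, 0.6e-9), "price": 0.01},
]
FREQS_HZ = [10 ** (5 + 0.05 * k) for k in range(55)]  # 100 kHz to ~50 MHz
Z_TARGET_OHM = 0.1                                     # assumed flat target

# Exhaustive search over 0..4 parts of each type (5**3 combinations).
best = min(itertools.product(range(5), repeat=3),
           key=lambda c: cost(c, LIBRARY, FREQS_HZ, Z_TARGET_OHM))
print(best, cost(best, LIBRARY, FREQS_HZ, Z_TARGET_OHM))
```

Exhaustive enumeration is only practical for a handful of capacitor types and positions; the number of combinations grows exponentially, which is what motivates the optimization methods discussed next.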

There are many approaches to PDN optimization, including general optimization methods such as genetic algorithms [13], particle swarm optimization [14], and simulated annealing, as well as synthesis approaches.

#### Genetic Algorithms

General genetic optimization algorithms can suffer from slow convergence [2], as the mutation 'directions' are often randomized and their rates are constant throughout the optimization. More efficient approaches can accelerate convergence by adapting how the design space is explored, ranging from not removing capacitors when a PDN does not meet the acceptability criteria to removing capacitors at a higher rate when there is excess margin.
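One way to picture a margin-dependent removal rate is the sketch below. It is an illustrative assumption, not the algorithm of [2]: the linear mapping, the function name and all constants are hypothetical, and the margin is assumed to be the relative headroom between the target impedance and the worst-case PDN impedance.

```python
def removal_rate(worst_margin, base_rate=0.05, gain=0.5, max_rate=0.5):
    """Fraction of capacitors a mutation step may remove.

    worst_margin is assumed to be (Z_target - Z_max) / Z_target, so it is
    positive when the PDN meets the target with headroom and zero or
    negative when the acceptability criterion is not met.
    """
    if worst_margin <= 0.0:
        return 0.0                              # failing PDN: remove nothing
    return min(max_rate, base_rate + gain * worst_margin)


for margin in (-0.1, 0.0, 0.2, 2.0):
    print(margin, removal_rate(margin))
```

A constant-rate mutation would correspond to returning `base_rate` for every margin; the sketch simply makes the rate grow with the available headroom and clamps it.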

#### Synthesis approaches

Algorithms can be applied to select capacitors based on a fixed layout as shown in [15], however, when multiple constraints are imposed such as individual part cost, total count reduction, capacitor type reduction (reduce the number of different caps to facilitate procurement, assembly costs, etc), then it becomes a combinatorial problem.

#### Post-layout analysis

The goal of the post-layout analysis is to predict the system margins under worst case conditions and evaluate whether the margin found provides enough confidence that the final product will work reliably in High Volume Manufacturing (HVM) applications. If errors are found at this stage of the design, the failure mechanism can be identified through simulations and the possible solutions can be evaluated.

A typical post-layout simulation flow addresses particular aspects of the design one at a time. The analysis can be performed in the time domain, the frequency domain, or both, and might include the following steps:

1. Timing-SI co-simulation of the interconnect only, assuming ideal power delivery and ideal return current paths for the system interconnect;
2. DC drop analysis;
3. Selection and optimization of decoupling capacitor values and locations, based on the target impedance requirements;
4. AC noise analysis of the main power rails;
5. Power-aware SI/PI co-simulation that includes the interactions between signals and the system PDN.
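The target impedance used in the decoupling step is commonly estimated as the allowed ripple voltage divided by the transient current. The sketch below applies that rule of thumb and flags frequency points that exceed the target. The 1.2 V rail value is inferred from the VCC1V2 rail name; the 5 % ripple budget, the 2 A transient current and the impedance samples are hypothetical.

```python
def target_impedance(v_rail, ripple_fraction, i_transient):
    """Target impedance in ohms: allowed ripple voltage / transient current."""
    if i_transient <= 0:
        raise ValueError("transient current must be positive")
    return v_rail * ripple_fraction / i_transient


def violations(freqs_hz, z_mag_ohm, z_target_ohm):
    """Return the frequencies at which |Z| exceeds the target impedance."""
    return [f for f, z in zip(freqs_hz, z_mag_ohm) if z > z_target_ohm]


# Hypothetical numbers: 1.2 V rail, 5 % ripple budget, 2 A transient current.
z_target = target_impedance(1.2, 0.05, 2.0)
bad = violations([1e6, 1e7, 1e8], [0.010, 0.045, 0.020], z_target)
print(z_target, bad)
```

A flat target is the simplest assumption; in practice the limit may be frequency dependent, in which case `z_target_ohm` would become a per-frequency list.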

Each of the first four steps described above helps quantify the quality and performance of various aspects of the design, allowing the designer to optimize each of them separately. This simplified approach considers the interactions between the signal network and the PDN to be linear: signal integrity analysis is performed assuming ideal voltage and current sources, and power integrity analysis does not include the effect of the power noise on the signals and their return paths. This approach has the advantage of keeping the analysis simple, and the individual contributions to the system are easily assessed.

However, the quality of the signals and that of their corresponding power supplies are tightly coupled, and the interaction between them is not linear. As a result, power-aware SI/PI co-simulation produces more accurate results at the cost of added complexity. This method permits incorporating power supply induced jitter (PSIJ) in the noise and timing margin calculations and evaluating the effect of the supply noise on signal distortion. It can identify subtle functional failures that cannot be caught by any of the previously described approaches. For the experiments described in this paper, a methodology that combines the two methods was used.

The SI-PI co-simulation process starts with a 3D electromagnetic (EM) extraction of the combined SI/PDN network of the routed PCB, as shown in Figure 14:



Figure 14. SI/PDN 3D model for EM extraction

This model captures all the forms of interaction between the signals and the PDN and includes all the decoupling capacitor models. The overlapped self-impedances at each port location are shown in Figure 15.



Figure 15. Self-impedances of the VCC1V2 power rail PDN with decoupling capacitors included

The simulated loop inductances to FPGA and SDRAM from each capacitor on the PCB are summarised in Figure 16 below:



Figure 16. Loop inductances to FPGA and SDRAM from each capacitor on PCB.

The lower loop inductance values show a closer proximity between the capacitors and the SDRAM compared to the BGA which matches physical distances and intuition.
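To give a feel for why loop inductance matters, the sketch below computes the series resonant frequency of a capacitor together with its loop inductance, using the standard relation f = 1/(2π√(LC)). The capacitance and inductance values are hypothetical and are not read from Figure 16.

```python
import math

def series_resonance_hz(c_farad, l_loop_henry):
    """Series resonant frequency of a capacitor with its loop inductance."""
    return 1.0 / (2.0 * math.pi * math.sqrt(l_loop_henry * c_farad))


# Hypothetical 100 nF capacitor seen through two different loop inductances.
f_near = series_resonance_hz(100e-9, 1e-9)   # close to the load: 1 nH loop
f_far = series_resonance_hz(100e-9, 2e-9)    # farther away: 2 nH loop
print(f_near, f_far)
```

Doubling the loop inductance lowers the resonant frequency by a factor of √2, so the same capacitor remains effective up to a higher frequency when it sits closer to the device it decouples.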

The VRM model is included in the transient simulations along with the buffer models, the FPGA package, and the on-package and on-die capacitances. Power-aware IBIS models and external circuits are used for this purpose. The simulation tool automatically links the extracted S-parameter models with the appropriate nodes in the netlist, thereby modeling the whole system. A wizard automates the tasks of simulating the DDR4 interface from a combined signal/power integrity and timing perspective. This methodology also allows toggling various types of coupling on and off and setting various coupling thresholds, which makes it possible to identify the individual contributions of ISI, crosstalk (Xtalk), and SSN effects to the eye opening.

# Measurements of PDN impedance and Signal Integrity on ZCU102 board

We have used PI Scanner, a Vector Network Analyzer (VNA) IP tool configured in the XCZU9 FPGA, to measure the PDN impedance frequency profile and to extract an S-parameter model of the XCZU9 FPGA on the ZCU102 board (www.piscanner.com). Figure 17 shows the PDN impedance frequency profile of FPGA I/O bank 64, which is the I/O bank used for the DDR4 memory interface to an on-board DRAM chip.



Figure 17. PDN impedance frequency profile of ZCU102 XCZU9 FPGA I/O bank 64

The measured impedance represents the PDN of the FPGA die, the package, the board, and the output impedance of the VRM. This impedance profile contains multiple resonance peaks. The data transmission activity on the DDR4 memory interface generates power supply transient currents through the PDN. Depending on how the spectral components of these transient currents align with the PDN impedance resonance peaks, more or less noise is generated on the on-die power supply rails. To evaluate this effect we set up the memory interface in DDR4-1866 mode, which sets the clock frequency at 933MHz, and we defined two simultaneously switching outputs (SSO) data patterns. The first pattern, "8x1\_8x0...", has frequency spectral components at 933MHz/8 ≈ 116MHz, which overlap with a PDN resonance peak as shown in Figure 18. The second pattern, "4x1\_4x0...", has frequency spectral components at 933MHz/4 ≈ 233MHz, which overlap with a low-impedance dip in the frequency profile.
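The two pattern frequencies quoted above follow from the pattern length and the data rate. The following is a minimal sketch of that arithmetic, assuming an ideal 50% duty-cycle square wave (so only odd harmonics appear); the function names and the harmonic count are illustrative and are not taken from the paper.

```python
# Sketch: fundamental and odd harmonics of an "Nx1_Nx0" data pattern.
# ASSUMPTION: ideal 50% duty-cycle square wave; all names are illustrative.

def pattern_fundamental_mhz(data_rate_mtps, run_length):
    """Fundamental frequency (MHz) of N ones followed by N zeros.

    One pattern period spans 2*N unit intervals, so
    f0 = data_rate / (2*N), which equals clock / N for DDR signalling.
    """
    if run_length <= 0:
        raise ValueError("run_length must be positive")
    return data_rate_mtps / (2 * run_length)


def odd_harmonics_mhz(f0_mhz, count=3):
    """First `count` odd harmonics of an ideal square wave: f0, 3*f0, 5*f0, ..."""
    return [f0_mhz * (2 * k + 1) for k in range(count)]


DATA_RATE_MTPS = 1866.0  # DDR4-1866: 933 MHz clock, two transfers per cycle

for name, n in (("8x1_8x0", 8), ("4x1_4x0", 4)):
    f0 = pattern_fundamental_mhz(DATA_RATE_MTPS, n)
    harmonics = ", ".join(f"{h:.1f}" for h in odd_harmonics_mhz(f0))
    print(f"{name}: f0 = {f0:.1f} MHz, odd harmonics = {harmonics} MHz")
```

The same helper can be used to choose a run length N that places the fundamental near any other resonance peak of interest, within the granularity that integer run lengths allow.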



Figure 18. SSO data patterns having frequency spectral components aligned with a resonance peak and an impedance dip
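The alignment illustrated in Figure 18 can be summarized with a first-order frequency-domain relation. This is only a sketch: it assumes the PDN responds linearly at each frequency, and the symbols $V_{noise}$, $Z_{PDN}$, $I_{sw}$, $f_0$, and $N$ are introduced here for illustration and do not appear in the original text.

$$
\left| V_{noise}(k f_0) \right| \approx \left| Z_{PDN}(k f_0) \right| \cdot \left| I_{sw}(k f_0) \right|, \qquad f_0 = \frac{f_{clk}}{N}, \quad k = 1, 3, 5, \ldots
$$

Here $f_{clk}$ is the 933MHz memory clock, $N$ is the run length of the pattern (8 or 4), and $I_{sw}$ is the switching current drawn by the output drivers. Under this assumption, a pattern whose fundamental lands on a peak of $|Z_{PDN}|$ produces more rail noise than one of similar current amplitude that lands on a minimum.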

A third data pattern, which toggled only DQ4 while keeping the other DQ lines quiet (SSO=OFF), was also used. Figure 19 shows the clock signal measured at the DRAM termination resistor with these three data patterns.



Figure 19. The clock signal measured at the DRAM termination with and without SSO data patterns

We notice that the clock jitter increases during SSO activity. We also notice that the "8x1\_8x0..." data pattern creates higher jitter than the "4x1\_4x0..." pattern. This difference results from the 116MHz spectral component of the "8x1\_8x0..." pattern overlapping with a resonance peak, while the 233MHz spectral component of the "4x1\_4x0..." pattern overlaps with a lower-impedance dip.



The effect of the SSO data patterns on the DQ4 signal is shown in Figure 20.

Figure 20. DDR4 DQ4 eye opening measured at the DRAM vias on the PCB with and without SSO data patterns

We notice that the vertical eye opening decreases with the SSO data patterns compared to the quiet mode (SSO=OFF). We also notice that the "8x1\_8x0..." data pattern reduces the eye opening more than the "4x1\_4x0..." pattern. As in the previous clock jitter measurement, this difference results from the 116MHz spectral component of the "8x1\_8x0..." pattern overlapping with a resonance peak, while the 233MHz spectral component of the "4x1\_4x0..." pattern overlaps with a lower-impedance dip.

Part of the vertical eye opening degradation is due to power supply noise generated by the simultaneously switching output drivers on the FPGA die, and part of it is due to crosstalk. To separate the contributions of these two mechanisms, we kept the adjacent DQ signals, DQ2 and DQ3, quiet. These signals are routed on the PCB as shown in Figure 21.



Figure 21. DQ2, DQ3, and DQ4 traces routed on the ZCU102 PCB

Figure 22 shows the measured DQ4 eye opening with and without crosstalk for the two SSO data patterns, "8x1\_8x0..." and "4x1\_4x0...".



Figure 22. DQ4 eye opening with and without crosstalk for the two SSO data patterns, "8x1\_8x0..." and "4x1\_4x0...".

The vertical eye opening is larger without crosstalk. To differentiate the contribution from power supply noise and from crosstalk we had to account for the fact that keeping DQ2 and DQ3 quiet also reduced the power supply noise generated on the FPGA power rails. The eye degradation contribution from power supply noise and from crosstalk is illustrated in Table 1.

| Analysis of DQ4 vertical eye opening (mV) | SSO=OFF (mV) | SSO 116MHz (mV) | SSO 233MHz (mV) | Eye degradation SSO 116MHz (mV) | Eye degradation SSO 233MHz (mV) | Approx. eye degradation per DQ SSO 116MHz (mV) | Approx. eye degradation per DQ SSO 233MHz (mV) | Xtalk contribution SSO 116MHz (mV) | Xtalk contribution SSO 233MHz (mV) | Supply noise contribution SSO 116MHz (mV) | Supply noise contribution SSO 233MHz (mV) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| With Xtalk | 485 | 350 | 374 | 135 | 111 | | | 84.23 | 87.92 | 50.77 | 23.08 |
| Without Xtalk | 485 | 441 | 465 | 44 | 20 | 3.38 | 1.54 | | | | |

Table 1. DQ4 eye degradation contribution from power supply noise and from crosstalk

The crosstalk contribution to the degradation of the DQ4 eye opening is around 86mV for both SSO patterns. The power supply noise contribution is about 51mV for the SSO pattern whose spectral component overlaps the resonance peak at 116MHz, and about 23mV for the SSO pattern whose spectral component overlaps the impedance dip at 233MHz. This example shows how "worst case" data patterns can be created to exercise specific resonance peaks of the PDN impedance. Other data patterns that exercise different resonance peaks can be created and used to investigate the power integrity effects on signal integrity.
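The split between supply noise and crosstalk can be reproduced from the measured eye openings. The sketch below is one plausible reading of how the Table 1 values were derived, not a procedure stated in the paper. It assumes that 13 DQ lines were still switching while DQ2 and DQ3 were quiet; that count is inferred only because it reproduces the published supply noise entries (50.77mV and 23.08mV) and crosstalk entries (84.23mV and 87.92mV). The function and parameter names are hypothetical.

```python
# Sketch of the Table 1 arithmetic (all values in mV).
# ASSUMPTION: 13 switching aggressors remain when DQ2 and DQ3 are quiet.
# The paper does not state this count; it is inferred from the table values.

def decompose(eye_quiet, eye_with_xtalk, eye_without_xtalk,
              switching_dqs=13, quieted_dqs=2):
    """Split DQ4 eye degradation into supply-noise and crosstalk parts."""
    # Total degradation with every aggressor switching.
    total = eye_quiet - eye_with_xtalk
    # Degradation with DQ2/DQ3 quiet: supply noise from the remaining DQs.
    supply_partial = eye_quiet - eye_without_xtalk
    # Approximate supply-noise degradation caused by one switching DQ.
    per_dq = supply_partial / switching_dqs
    # Add back the supply noise that DQ2 and DQ3 would have generated.
    supply_noise = supply_partial + quieted_dqs * per_dq
    # Whatever remains is attributed to crosstalk from DQ2 and DQ3.
    xtalk = total - supply_noise
    return {"total": total, "per_dq": per_dq,
            "supply_noise": supply_noise, "xtalk": xtalk}


measurements = {
    "SSO 116MHz": (485, 350, 441),
    "SSO 233MHz": (485, 374, 465),
}

for label, (quiet, with_x, without_x) in measurements.items():
    r = decompose(quiet, with_x, without_x)
    print(f"{label}: per DQ {r['per_dq']:.2f}, "
          f"supply noise {r['supply_noise']:.2f}, xtalk {r['xtalk']:.2f}")
```

The correction term matters because quieting DQ2 and DQ3 removes both their crosstalk and their share of the supply noise. Without adding that share back, the crosstalk contribution would be overestimated.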

# Summary

Power integrity effects on signal integrity in FPGA DDR4 memory interfaces were analyzed in pre-layout, post-layout, and system validation using data patterns created based on the resonance peaks of the power distribution network (PDN). The PDN impedance profile was measured with an FPGA-configured vector network analyzer (VNA) and served as the basis for the measurement-based models. Multiple test data patterns were created to superimpose the frequency spectral components of the power supply current on the PDN resonance peaks and to exercise the build-up of multiple reflections on the transmission lines. These data patterns were then used to identify the dominant contributors to signal integrity degradation.
