The training logic is implemented as a soft FPGA core that de-skews read data and center-aligns the clock, using the IOD capabilities in the read direction. A known training pattern generated by the QDR controller is used for training.
The QDR RAM uses the CQP and CQN clock pair to capture data in the PHY. IODs in the PHY only use CQP for sampling data. The Lane Controller adjusts delay taps to center CQP within the valid data eye. This tuning occurs during CQ alignment.
0xAA
0xA5
The FSM iterates across lanes and, within each lane, across each IOD data pad. Data framing, Q alignment, and CQ alignment all use the same methodology: a sequence is written into memory and read back, and the received data is compared with the transmitted data. Tuning actions are then taken based on the result of this comparison. What differentiates the three phases is the block in which these tuning actions are performed.
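The shared write, read back, and compare step can be sketched in Python as follows. This is only an illustrative software model, not the FSM itself: `ToyMemory`, its `flip_mask` fault injection, and `pattern_matches` are hypothetical names introduced here, and a single storage cell stands in for the QDR device.

```python
from typing import Callable


class ToyMemory:
    """Hypothetical one-cell memory model; flip_mask simulates read errors."""

    def __init__(self, flip_mask: int = 0) -> None:
        self.cell = 0
        self.flip_mask = flip_mask

    def write(self, value: int) -> None:
        self.cell = value & 0xFF

    def read(self) -> int:
        # XOR with flip_mask corrupts the selected bits on the read path.
        return self.cell ^ self.flip_mask


def pattern_matches(pattern: int,
                    write_fn: Callable[[int], None],
                    read_fn: Callable[[], int]) -> bool:
    """Write one byte, read it back, and compare it with what was sent."""
    write_fn(pattern)
    return read_fn() == (pattern & 0xFF)


good = ToyMemory()
bad = ToyMemory(flip_mask=0x02)  # bit 1 is corrupted on every read
print(pattern_matches(0x3C, good.write, good.read))
print(pattern_matches(0x3C, bad.write, bad.read))
```

In this model, a failed comparison is the trigger for a tuning action; which block is tuned depends on the phase, as described above.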
After the delay taps for all IODs in a lane are set during Q alignment, CQ alignment is performed. This step fine-tunes the delay taps of CQP. Because only CQP is used to capture the data, only the delay taps on the CQP IOD are adjusted. Only the bits sampled on the rising edge are used to compare written and read data.
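As a small illustration of an edge-selective comparison, the Python sketch below separates a byte into its two groups of bits. The Q alignment pseudocode identifies bits 7, 5, 3, 1 as the falling edge bits; treating the remaining bits 6, 4, 2, 0 as the rising edge bits is an assumption made here, and the function names are hypothetical.

```python
FALLING_BITS = (7, 5, 3, 1)  # from the Q alignment pseudocode
RISING_BITS = (6, 4, 2, 0)   # assumption: the complementary bit positions


def edge_bits(byte: int, positions: tuple) -> int:
    """Pack the selected bit positions of a byte into a 4-bit value, MSB first."""
    value = 0
    for pos in positions:
        value = (value << 1) | ((byte >> pos) & 1)
    return value


def rising_edge_equal(written: int, read_back: int) -> bool:
    """Compare written and read data on the (assumed) rising edge bits only."""
    return edge_bits(written, RISING_BITS) == edge_bits(read_back, RISING_BITS)


print(hex(edge_bits(0xA5, FALLING_BITS)))  # 0xc
print(hex(edge_bits(0xA5, RISING_BITS)))   # 0x3
```

With this split, an error confined to the falling edge bits does not affect the CQ alignment comparison.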
for k=1:num_lanes
  for g=1:num_IODs_in_lane(9)
    for j=1:delay_taps(256)
      for i=1:byte_len(8)
        // 1. Write a test pattern to memory generated from the PRBS8 generator.
        // 2. Read it back and compare bits 7,5,3,1 (falling edge bits)
        //    in the byte to the corresponding bits of the test pattern.
        // 3. Generate the next pattern from the PRBS8 generator.
      END for // byte_len
      // 1. Compare the eight generated patterns.
      // 2. Increment IOD delay taps as the training logic searches for all 8 patterns
      //    to match.
      // 3. Once found, it indicates the start of the valid window, and the delay
      //    value (delay left) is recorded.
      // 4. Continue to increment the delay taps as the patterns match. When there have
      //    been 7 (*8 patterns) consecutive matches and there is no longer a match, save
      //    this second delay value (delay right), calculate the mid value
      //    (left+right)/2, and exit the loop.
    END for // delay_taps
    // 1. After trying all delay values (or after exiting the loop):
    // 2. Once a solution is found, the IOD delay tap is set to the midpoint of the
    //    valid window.
    // 3. If a solution is not found on the first IOD of the lane, increase the offset
    //    (see data framing). Reload the default delay value, set j=1, and retry.
    // 4. If out of range and there is no solution, increase the fail number and try
    //    again from the beginning. If the number of failures is greater than 15,
    //    declare ERROR.
  END for // num_IODs_in_lane
  Perform CQ alignment
END for // num_lanes
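The inner delay tap sweep can be sketched in Python as a search for the left and right edges of the valid window. This is a simplified model under stated assumptions: `matches_at` is a hypothetical callback that reports whether all 8 patterns matched at a given tap, the right edge is taken to be the last matching tap, and a window that is still open at the final tap is treated as no solution. The offset adjustment and failure counting from the pseudocode are not modeled.

```python
from typing import Callable, Optional, Tuple


def find_window(matches_at: Callable[[int], bool],
                num_taps: int = 256,
                min_run: int = 7) -> Optional[Tuple[int, int, int]]:
    """Sweep the delay taps and return (left, right, mid), or None if no window."""
    left = 0
    run = 0  # number of consecutive matching taps seen so far
    for tap in range(num_taps):
        if matches_at(tap):
            if run == 0:
                left = tap  # first match: start of a candidate window
            run += 1
        else:
            if run >= min_run:
                right = tap - 1  # assumption: last matching tap is "delay right"
                return left, right, (left + right) // 2
            run = 0  # run was too short to count as a valid window; keep searching
    return None  # assumption: no closed window found within the tap range


# Simulated data eye that is valid for taps 40..99 inclusive.
print(find_window(lambda tap: 40 <= tap <= 99))  # (40, 99, 69)
```

Requiring a minimum run of consecutive matches keeps a short glitch of matching taps from being mistaken for the data eye.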
The timing relationships between CQP, CQN, data (Q), and data valid before training begins are shown in the following figure.
First, falling edge data alignment is done concurrently with data_valid alignment. As a result, data_valid properly frames the 8-bit burst coming from the QDR device, and CQN centers the falling edge data (f0, f1, f2, f3).
A high-level timing diagram of alignment is shown in the following figure.
Next, CQP is aligned with the rising edge data by writing and reading back 8 different patterns; a match is declared if and only if all 8 patterns match.
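The all-or-nothing match rule can be sketched in Python as below. Several details are assumptions for illustration only: the text does not give the PRBS8 polynomial or seed, so a common maximal-length choice (x^8 + x^6 + x^5 + x^4 + 1, seed 0xFF) is used; the rising edge bits are assumed to be bits 6, 4, 2, 0 (mask 0x55); and `write_read` is a hypothetical callback that writes one pattern and returns what was read back.

```python
from typing import Callable, List

RISING_EDGE_MASK = 0x55  # assumption: rising edge data occupies bits 6, 4, 2, 0


def prbs8_patterns(seed: int = 0xFF, count: int = 8) -> List[int]:
    """Generate byte patterns from an 8-bit LFSR (assumed polynomial, see above)."""
    state = seed & 0xFF
    if state == 0:
        raise ValueError("LFSR seed must be non-zero")
    patterns = []
    for _ in range(count):
        for _ in range(8):  # advance the register by one full byte
            fb = ((state >> 7) ^ (state >> 5) ^ (state >> 4) ^ (state >> 3)) & 1
            state = ((state << 1) | fb) & 0xFF
        patterns.append(state)
    return patterns


def all_patterns_match(patterns: List[int],
                       write_read: Callable[[int], int],
                       mask: int = RISING_EDGE_MASK) -> bool:
    """Declare a match only if every pattern reads back correctly under the mask."""
    return all((write_read(p) & mask) == (p & mask) for p in patterns)


patterns = prbs8_patterns()
print(all_patterns_match(patterns, lambda p: p))         # ideal loopback
print(all_patterns_match(patterns, lambda p: p ^ 0x01))  # bit 0 corrupted
```

Because a single failing pattern rejects the delay setting, a tap is accepted only when the sampled rising edge data is stable across the whole pattern set.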