LDPC Decoder Hardening Results
Run 1: 26_02_25_21_11 (Feb 25, 2026) — FAILED
- RTL: Original (unpipelined CN update)
- Config: CLOCK_PERIOD=20 (50 MHz), RUN_HEURISTIC_DIODE_INSERTION=true, HEURISTIC_ANTENNA_THRESHOLD=110
- Die area: 2800 x 1760 µm (4.93 mm²)
- Failure: GRT-0118 routing congestion after heuristic diode insertion (66,016 diodes added)
- Notes: Initial global routing passed (0 overflow, 39% routing utilization). Diode insertion nearly doubled cell count, causing re-routing congestion failure.
Run 2: reuse_synth (Feb 27, 2026) — COMPLETED (timing violations)
- RTL: Original (unpipelined CN update) — reused synthesis netlist from Run 1
- Config: CLOCK_PERIOD=20 (50 MHz), RUN_HEURISTIC_DIODE_INSERTION=false, RUN_ANTENNA_REPAIR=true
- Die area: 2800 x 1760 µm (4.93 mm²)
- Result: All 70 steps completed. GDS generated. Deferred timing errors.
Physical Results
| Metric | Result |
| --- | --- |
| Magic DRC | Clean |
| KLayout DRC | Clean |
| LVS | Clean (0 errors, 0 unmatched) |
| XOR (Magic vs KLayout) | Clean |
| Illegal overlap | Clean |
| Power grid violations | 0 |
| Antenna violating nets | 658 |
| Antenna violating pins | 905 |
Area & Utilization
| Metric | Value |
| --- | --- |
| Die area | 4,928,000 µm² (4.93 mm²) |
| Core area | 4,846,670 µm² |
| Instance count | 184,663 |
| Instance area | 1,303,260 µm² (1.30 mm²) |
| Core utilization | 26.9% |
| Sequential cells | 16,967 |
| Combinational cells | 61,366 |
| Timing repair buffers | 23,709 |
| Fill cells | 415,149 |
| Tap cells | 69,228 |
Timing (post-route, CLOCK_PERIOD = 20 ns / 50 MHz target)
| Corner | Setup WNS (ns) | Setup TNS (ns) | Hold WNS (ns) | Hold TNS (ns) | Setup Violations |
| --- | --- | --- | --- | --- | --- |
| nom_tt_025C_1v80 | -27.13 | -234.9 | -0.32 | -3.76 | 9 |
| nom_ss_100C_1v60 | -70.58 | -29,946.3 | 0.06 | 0 | 5,463 |
| nom_ff_n40C_1v95 | -10.18 | -86.3 | -0.26 | -12.4 | — |
| Worst across all | -71.40 | -34,329.1 | -0.47 | -26.4 | — |
Estimated Max Frequency
- TT corner: Critical path ~47 ns → ~21 MHz
- SS corner: Critical path ~91 ns → ~11 MHz
- FF corner: Critical path ~30 ns → ~33 MHz
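The frequency estimates above follow from the clock period and the setup WNS. The sketch below reproduces that arithmetic under the simplifying assumption that the critical path equals the clock period minus WNS, ignoring clock uncertainty and skew; the function names are illustrative and not part of the flow.

```python
def critical_path_ns(clock_period_ns, setup_wns_ns):
    """Approximate critical path: the period plus the amount by which setup is missed."""
    return clock_period_ns - setup_wns_ns


def max_freq_mhz(clock_period_ns, setup_wns_ns):
    """Approximate achievable frequency in MHz for a path measured in ns."""
    return 1000.0 / critical_path_ns(clock_period_ns, setup_wns_ns)


# Setup WNS values from the Run 2 timing table (CLOCK_PERIOD = 20 ns).
run2_wns = {
    "nom_tt_025C_1v80": -27.13,
    "nom_ss_100C_1v60": -70.58,
    "nom_ff_n40C_1v95": -10.18,
}

for corner, wns in run2_wns.items():
    path = critical_path_ns(20.0, wns)
    fmax = max_freq_mhz(20.0, wns)
    print(f"{corner}: path ~{path:.1f} ns, fmax ~{fmax:.1f} MHz")
```

The same two functions apply to the later runs by substituting their WNS values.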
Power (TT corner)
| Component | Power (W) |
| --- | --- |
| Internal | 0.0554 |
| Switching | 0.0273 |
| Leakage | ~0.000002 (0.002 mW) |
| Total | 0.0827 |
Key Observations
- Disabling heuristic diode insertion fixed the routing congestion failure from Run 1
- 658 antenna violations remain: iterative antenna repair alone was not sufficient. The next attempt may need to re-enable heuristic insertion with a higher threshold or use DIODE_ON_PORTS
- Setup timing is severely violated — critical path is ~47 ns at TT, far from 20 ns target
- This run used the unpipelined RTL (synthesis reused from Run 1 which predated the CN pipeline split)
- Next run should re-synthesize with pipelined CN update RTL to see if timing improves
Run 3: pipelined_pnr (Mar 1, 2026) — FAILED
- RTL: Pipelined CN update (CN_STAGE1 + CN_STAGE2)
- Config: CLOCK_PERIOD=20 (50 MHz), SYNTH_STRATEGY=AREA 0, RUN_HEURISTIC_DIODE_INSERTION=false, RUN_ANTENNA_REPAIR=true
- Die area: 2800 x 1760 µm (4.93 mm²)
- Failure: GRT-0118 routing congestion during iterative antenna repair (step 36), after 13+ hours of repair loops
- Notes: Iterative antenna repair kept inserting diodes and re-routing until congestion became too high. Same root cause as Run 1 but via different mechanism.
Run 3b: pipelined_synth (Feb 28, 2026) — STILL RUNNING
- RTL: Pipelined CN update
- Config: SYNTH_STRATEGY=AREA 2 (synthesis only)
- Status: ABC pass 2 (tech mapping) has been running for 20+ hours. AREA 2 is far too aggressive for this design size; do not use it for this design.
Run 4: pipelined_noantenna (Mar 2, 2026) — COMPLETED (timing violations)
- RTL: Pipelined CN update (CN_STAGE1 + CN_STAGE2)
- Config: CLOCK_PERIOD=20 (50 MHz), SYNTH_STRATEGY=AREA 0, RUN_HEURISTIC_DIODE_INSERTION=false, RUN_ANTENNA_REPAIR=false
- Die area: 2800 x 1760 µm (4.93 mm²)
- Result: All 69 steps completed. GDS generated. Deferred timing errors. No antenna repair attempted.
Physical Results
| Metric | Result |
| --- | --- |
| Magic DRC | Clean |
| KLayout DRC | Clean |
| LVS | Clean (0 errors, 0 unmatched) |
| XOR (Magic vs KLayout) | Clean |
| Illegal overlap | Clean |
| Antenna violating nets | 1,707 (no repair attempted) |
| Antenna violating pins | 3,319 (no repair attempted) |
Area & Utilization
| Metric | Value |
| --- | --- |
| Die area | 4,928,000 µm² (4.93 mm²) |
| Instance count | 183,774 |
| Instance area | 1,351,790 µm² (1.35 mm²) |
| Core utilization | 27.9% |
Timing (post-route, CLOCK_PERIOD = 20 ns / 50 MHz target)
| Corner | Setup WNS (ns) | Setup TNS (ns) | Hold WNS (ns) | Hold TNS (ns) |
| --- | --- | --- | --- | --- |
| nom_tt_025C_1v80 | -28.86 | -348.0 | -0.08 | -0.15 |
| nom_ss_100C_1v60 | -74.22 | -20,536.0 | -0.07 | -0.07 |
| nom_ff_n40C_1v95 | -11.04 | -93.8 | -0.12 | -2.15 |
| min_tt_025C_1v80 | -28.39 | -251.0 | 0 | 0 |
| max_tt_025C_1v80 | -29.36 | -725.1 | -0.24 | -2.15 |
Estimated Max Frequency
- TT corner: Critical path ~49 ns → ~20 MHz
- SS corner: Critical path ~94 ns → ~11 MHz
- FF corner: Critical path ~31 ns → ~32 MHz
Power (TT corner)
| Metric | Value |
| --- | --- |
| Total | 0.0858 W |
Key Observations
- Pipelined CN update did NOT improve timing — TT WNS is -28.86 ns vs -27.13 ns (unpipelined Run 2). Slightly worse, possibly due to AREA 0 vs AREA 2 synth strategy difference.
- Hold violations are much smaller than Run 2 (-0.08 vs -0.32 ns), nearly clean.
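- Antenna violations rose to 1,707 nets (vs 658 in Run 2) with no repair attempted. Run 2's count was measured after iterative repair, so the two figures are not directly comparable; AREA 0 may also produce a less antenna-friendly netlist.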
- The critical path is still ~47-49 ns, suggesting the bottleneck is NOT the CN update pipeline stage but something else (likely the large mux/barrel shifter or belief update logic).
- SYNTH_STRATEGY=AREA 2 takes 20+ hours for ABC tech mapping on this design; never use it. AREA 0 completed in reasonable time.
Summary Table
| Run | RTL | Synth | Antenna | Status | TT Setup WNS | Max Freq (TT) |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | Unpipelined | AREA 2 | Heuristic 110µm | FAILED (congestion) | — | — |
| 2 | Unpipelined | AREA 2 | Iterative | COMPLETED | -27.13 ns | ~21 MHz |
| 3 | Pipelined | AREA 0 | Iterative | FAILED (congestion) | — | — |
| 3b | Pipelined | AREA 2 | — (synth only) | Still running (20+ hrs) | — | — |
| 4 | Pipelined | AREA 0 | None | COMPLETED | -28.86 ns | ~20 MHz |
Critical Path Analysis (from Run 4, pipelined_noantenna)
Path Summary
| Item | Value |
| --- | --- |
| Startpoint | u_core.beliefs[0][5] (beliefs register, bit 5 of element 0) |
| Endpoint | syndrome_weight[7] (MSB of syndrome weight counter) |
| RTL location | SYNDROME state in ldpc_decoder_core.sv, lines 363-385 |
| Slack | -28.859 ns (VIOLATED) |
| Total combinational delay | 47.67 ns |
| Logic levels | 222 (171 XOR/XNOR + 51 adder/mux) |
| Logic vs wire delay | 99.7% logic / 0.3% wire |
All 8 worst setup violators fan out from beliefs[0][5] to syndrome_weight[7:0].
What the Critical Path Computes
The SYNDROME state computes the full syndrome check in a single clock cycle:
- Parity computation (171 XOR levels, 33.9 ns): XOR the sign bits of all beliefs connected to each check node. With 7 rows x 32 z-elements there are 224 parity bits, each combining up to 3 columns and together reading from 256 belief sign bits.
- Population count (51 adder levels, 13.6 ns): Sum all 224 parity results into an 8-bit syndrome_cnt.
The syndrome_cnt = syndrome_cnt + 1 accumulation pattern creates a carry chain dependency that serializes everything.
Delay Breakdown
| Segment | Delay (ns) | Cells | Description |
| --- | --- | --- | --- |
| Source CLK-to-Q | 0.795 | 1 (dfxtp_4) | beliefs[0][5] register output |
| Parity XOR chain | 33.888 | 171 (xor2/xnor2) | XOR reduction across belief sign bits |
| Popcount adder tree | 13.634 | 51 (and/or/aoi/oai) | 224-bit popcount to 8-bit count |
| State MUX | 0.148 | 1 (mux2_1) | FSM output mux |
| Wire (interconnect) | 0.149 | — | 0.3% of total, negligible |
| Total | 48.614 | 222 levels | |
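The 48.614 ns total in the breakdown and the 47.67 ns "total combinational delay" in the path summary can be reconciled with a quick sum. The sketch below assumes, as an inference from the numbers rather than something the timing report states, that the combinational figure excludes the source CLK-to-Q and the wire delay.

```python
# Segment delays (ns) copied from the delay breakdown table.
segments_ns = {
    "clk_to_q": 0.795,
    "parity_xor_chain": 33.888,
    "popcount_adders": 13.634,
    "state_mux": 0.148,
    "wire": 0.149,
}

total_ns = sum(segments_ns.values())

# Assumption: "combinational delay" counts only the logic cells between registers.
combinational_ns = total_ns - segments_ns["clk_to_q"] - segments_ns["wire"]

wire_pct = 100.0 * segments_ns["wire"] / total_ns

print(f"total         = {total_ns:.3f} ns")
print(f"combinational = {combinational_ns:.3f} ns")
print(f"wire share    = {wire_pct:.1f} %")
```

Under that assumption the two figures agree, and the XOR chain alone accounts for roughly 70% of the path.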
Proposed Fix: 2-3 Stage Syndrome Pipeline
SYNDROME_S1 (cycle 1, ~16 ns): Compute all 224 parity bits in parallel. Each parity is only 2-3 XOR operations deep (one per connected column). Register the 224-bit parity_vec.
SYNDROME_S2 (cycle 2, ~14 ns): Popcount the 224-bit parity vector via balanced adder tree. Register the 8-bit syndrome_weight and syndrome_ok flag.
SYNDROME_DONE (cycle 3): Already exists — reads syndrome_ok.
Estimated post-fix critical path: ~14-16 ns (comfortably under 20 ns / 50 MHz).
Latency impact: +1-2 cycles per iteration (negligible at 30 iterations).
Secondary Violations
Wishbone address input (wb_adr_i) has -2.47 ns setup violation. Fixable by registering the address at the decoder boundary.
Run 5: syndrome_pipeline (Mar 3, 2026) — COMPLETED (timing violations)
- RTL: Pipelined CN + syndrome pipeline (SYNDROME_S1 + SYNDROME_S2 with serial popcount)
- Config: CLOCK_PERIOD=20 (50 MHz), SYNTH_STRATEGY=AREA 0, RUN_ANTENNA_REPAIR=false
- Die area: 2800 x 1760 µm (4.93 mm²)
- Result: All 75 steps completed. DRC/LVS clean.
- TT Setup WNS: -28.98 ns — no improvement from Run 4
- Root cause: Yosys serializes the syndrome_cnt = syndrome_cnt + 1 loop-carried dependency into a ~48 ns chain
- Lesson: Splitting parity and popcount into 2 cycles does not help if the popcount itself is still serial
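The difference between a serial and a balanced popcount can be illustrated with a small behavioral model. This is a Python sketch, not the project's RTL: the function names are hypothetical, the depth figures count RTL-level adder stages rather than synthesized gate levels, and the group size of 4 is assumed to mirror the "4-wide" tree used in Run 6.

```python
import math


def popcount_serial(bits):
    """Accumulate one bit at a time; each addition depends on the previous one."""
    count = 0
    for b in bits:
        count = count + b
    return count


def popcount_tree(bits, fan_in=4):
    """Sum groups of fan_in values per level until a single total remains."""
    values = list(bits)
    if not values:
        return 0
    while len(values) > 1:
        values = [sum(values[i:i + fan_in]) for i in range(0, len(values), fan_in)]
    return values[0]


def serial_depth(n_bits):
    """Dependent adder stages in the serial form: one per input bit."""
    return n_bits


def tree_depth(n_bits, fan_in=4):
    """Adder levels in the balanced form."""
    levels = 0
    width = n_bits
    while width > 1:
        width = math.ceil(width / fan_in)
        levels += 1
    return levels


# 224 parity bits, as in the syndrome check; the pattern itself is arbitrary.
parity = [1 if i % 3 == 0 else 0 for i in range(224)]
print(popcount_serial(parity), popcount_tree(parity))
print(serial_depth(224), tree_depth(224, fan_in=4))
```

Both functions return the same count; only the dependency depth differs, and that depth is what sets the critical path once the logic is synthesized.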
Run 6: balanced_popcount (Mar 4, 2026) — COMPLETED (TT timing MET!)
- RTL: Pipelined CN + syndrome pipeline with balanced 4-wide adder tree popcount
- Config: CLOCK_PERIOD=20 (50 MHz), SYNTH_STRATEGY=AREA 0, RUN_ANTENNA_REPAIR=false
- Die area: 2800 x 1760 µm (4.93 mm²)
- Result: All 75 steps completed. DRC/LVS clean. TT timing met!
Physical Results
| Metric | Result |
| --- | --- |
| Magic DRC | Clean |
| KLayout DRC | Clean |
| LVS | Clean (0 errors, 0 unmatched) |
| Antenna violating nets | 1,687 (no repair attempted) |
Area & Utilization
| Metric | Value |
| --- | --- |
| Die area | 4,928,000 µm² (4.93 mm²) |
| Instance count | 186,915 |
| Instance area | 1,367,580 µm² (1.37 mm²) |
| Core utilization | 28.2% |
| Sequential cells | 18,056 |
| Timing repair buffers | 27,864 |
Timing (post-route, CLOCK_PERIOD = 20 ns / 50 MHz target)
| Corner | Setup WNS (ns) | Setup TNS (ns) | Hold WNS (ns) | Hold TNS (ns) |
| --- | --- | --- | --- | --- |
| nom_tt_025C_1v80 | 0.0 | 0 | -0.45 | -10.5 |
| nom_ss_100C_1v60 | -9.18 | -12,474.4 | -0.17 | -0.21 |
| nom_ff_n40C_1v95 | 0.0 | 0 | -0.37 | -38.6 |
| max_ss_100C_1v60 | -10.45 | -15,896.8 | -0.44 | -0.87 |
Estimated Max Frequency
- TT corner: 50 MHz — TIMING MET
- SS corner: Critical path ~40 ns → ~25 MHz (up from ~11 MHz)
- FF corner: 50 MHz — TIMING MET
New Critical Path (SS corner)
| Item | Value |
| --- | --- |
| Startpoint | u_core.col_idx[0] (column index register) |
| Endpoint | u_core.beliefs registers |
| Slack | -9.18 ns (nom_ss) |
| Data arrival time | 40.15 ns |
| Description | Belief update mux path during LAYER_READ/LAYER_WRITE |
The syndrome path is NO LONGER critical. The new bottleneck is the column-indexed mux/barrel-shifter path used during belief reads and writes.
Key Observations
- Balanced popcount tree eliminated the syndrome bottleneck — WNS improved from -28.98 ns to 0.0 ns at TT
- TT and FF corners now fully meet 50 MHz timing
- SS corner still fails (-9.18 ns) due to a different path: belief update mux indexed by col_idx
- Hold violations are minor (-0.45 ns) and can be fixed with post-route optimization
- 1,687 antenna violations need to be addressed (antenna repair was disabled)
Updated Summary Table
| Run | RTL | Key Change | Antenna | Status | TT Setup WNS | Max Freq (TT) |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | Unpipelined | — | Heuristic | FAILED | — | — |
| 2 | Unpipelined | — | Iterative | COMPLETED | -27.13 ns | ~21 MHz |
| 3 | Pipelined CN | CN pipeline | Iterative | FAILED | — | — |
| 4 | Pipelined CN | CN pipeline | None | COMPLETED | -28.86 ns | ~20 MHz |
| 5 | + Syndrome pipeline | Serial popcount | None | COMPLETED | -28.98 ns | ~20 MHz |
| 6 | + Balanced popcount | Adder tree | None | COMPLETED | 0.0 ns | 50 MHz |
Run 7a: pipelined_layer2 (Mar 9, 2026) — FAILED
- RTL: Run 6 + LAYER_WRITE split into LAYER_WRITE_ADDR + LAYER_WRITE_DATA
- Config: CLOCK_PERIOD=20, DIODE_ON_PORTS=in, HEURISTIC_ANTENNA_THRESHOLD=200
- Failure: GRT-0118 routing congestion; heuristic diode insertion on input ports added too many cells
- Lesson: Any heuristic diode insertion causes GRT failure on this design
Run 7b: pipelined_layer3 (Mar 9, 2026) — FAILED
- RTL: Same as 7a (LAYER_WRITE_ADDR/DATA split)
- Config: DIODE_ON_PORTS=none, RUN_HEURISTIC_DIODE_INSERTION=false
- Failure: Post-CTS resizer diverged — 2.5+ hours at 100% CPU, memory climbing linearly, never converging
- Lesson: LAYER_WRITE pipeline split creates too many paths for OpenROAD resizer
Run 7c: pre_shift (Mar 9, 2026) — FAILED
- RTL: Run 6 + pre-registered H_BASE shift lookahead (H_BASE[row_idx][col_idx+1])
- Config: Same as 7b
- Failure:
GPL-0302 placement density overflow — 150K cells at 41.3% exceeded 40% target
- Root cause: Yosys cannot fold H_BASE constants through registers → full 256:1 write mux explosion (~2x cell count vs Run 6's 83K)
- Lesson: Registering H_BASE shift values prevents Yosys constant folding
Run 7d: run6_baseline (Mar 9, 2026) — FAILED
- RTL: Reverted to Run 6 baseline (identical RTL)
- Config: DIODE_ON_PORTS=in (inadvertently left from earlier runs), RUN_HEURISTIC_DIODE_INSERTION=false
- Cells: 85,500
- Failure: GRT-0118 routing congestion
- Root cause: DIODE_ON_PORTS=in inserts diodes on input ports even when heuristic insertion is disabled
Run 7e: run6b_nodiode (Mar 10, 2026) — FAILED
- RTL: Run 6 baseline
- Config: DIODE_ON_PORTS=none, hold margins 0.5/0.3 (from config.json), reused run6_baseline synthesis
- Failure: Post-CTS resizer diverged (9+ GiB memory, 3+ hours, never converged)
- Root cause: Reusing synthesis from a run with different config (DIODE_ON_PORTS=in) produces a subtly different netlist that causes PnR divergence
Run 7f: run6_clean (Mar 10, 2026) — FAILED
- RTL: Run 6 baseline, clean full run from scratch
- Config: DIODE_ON_PORTS=none, hold margins 0.5/0.3
- Cells: 85,500
- Hold buffers inserted: 35,506
- Failure: GRT-0118 routing congestion
- Root cause: Higher hold slack margins (0.5/0.3 vs balanced_popcount's 0.4/0.2) caused 13K extra hold buffers (35K vs 22K), pushing routing congestion over GRT threshold
Run 7g: run6_fixhold (Mar 10, 2026) — FAILED
- RTL: Run 6 baseline, reused run6_clean synthesis
- Config: DIODE_ON_PORTS=none, hold margins 0.4/0.2 (matching balanced_popcount)
- Failure: Post-CTS resizer diverged (14+ GiB, 3.5+ hours)
- Root cause: Yosys non-determinism. The run6_clean synthesis produced a slightly different cell mix from balanced_popcount, and PnR did not converge on it despite the identical config
Run 7h: run6_reuse_bp (Mar 10, 2026) — COMPLETED (reproduces Run 6!)
- RTL: Run 6 baseline, reused balanced_popcount's actual synthesis netlist
- Config: DIODE_ON_PORTS=none, hold margins 0.4/0.2
- Result: All stages completed. DRC/LVS clean. TT timing met!
- Hold buffers: 22,095 (identical to balanced_popcount)
Physical Results
| Metric | Result |
| --- | --- |
| Magic DRC | Clean |
| KLayout DRC | Clean |
| LVS | Clean (circuits match uniquely) |
| Antenna violating nets | 1,687 (repair disabled) |
| Antenna violating pins | 3,416 (repair disabled) |
Area & Utilization
| Metric | Value |
| --- | --- |
| Die area | 4,928,000 µm² (4.93 mm²) |
| Instance count | 186,915 |
| Instance area | 1,367,580 µm² (1.37 mm²) |
| Core utilization | 28.2% |
Timing (post-route, CLOCK_PERIOD = 20 ns / 50 MHz target)
| Corner | Setup WNS (ns) | Setup TNS (ns) | Hold WNS (ns) | Hold TNS (ns) |
| --- | --- | --- | --- | --- |
| nom_tt_025C_1v80 | +3.28 | 0 | -0.45 | -10.5 |
| nom_ss_100C_1v60 | -9.18 | -12,474 | -0.17 | -0.21 |
| nom_ff_n40C_1v95 | +5.93 | 0 | -0.37 | -38.6 |
| max_ss_100C_1v60 | -10.45 | -15,897 | -0.44 | -0.87 |
| min_tt_025C_1v80 | +3.71 | 0 | -0.26 | -1.66 |
| max_tt_025C_1v80 | +2.90 | 0 | -0.62 | -29.5 |
Key Observations
- Results identical to Run 6 — confirms that the balanced_popcount synthesis netlist is the key ingredient
- Yosys non-determinism is significant: re-synthesizing the same RTL with same config produces netlists that fail PnR
- Hold violations (1,543 total) are all on input port paths (wb_dat_i, wb_adr_i), with zero reg-to-reg; fixable with input delay constraints
- Max slew violations (4,112) and max cap violations (655) concentrated in SS corner
Updated Summary Table
| Run | RTL | Key Change | Antenna | Status | TT Setup WNS | Max Freq (TT) |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | Unpipelined | — | Heuristic | FAILED | — | — |
| 2 | Unpipelined | — | Iterative | COMPLETED | -27.13 ns | ~21 MHz |
| 3 | Pipelined CN | CN pipeline | Iterative | FAILED | — | — |
| 4 | Pipelined CN | CN pipeline | None | COMPLETED | -28.86 ns | ~20 MHz |
| 5 | + Syndrome pipeline | Serial popcount | None | COMPLETED | -28.98 ns | ~20 MHz |
| 6 | + Balanced popcount | Adder tree | None | COMPLETED | 0.0 ns | 50 MHz |
| 7a | + LAYER_WRITE split | ADDR/DATA pipeline | Heuristic | FAILED | — | — |
| 7b | + LAYER_WRITE split | ADDR/DATA pipeline | None | FAILED (resizer) | — | — |
| 7c | + pre_shift | H_BASE lookahead | None | FAILED (GPL) | — | — |
| 7d | Run 6 baseline | DIODE_ON_PORTS=in | None | FAILED (GRT) | — | — |
| 7e | Run 6 baseline | Reuse wrong synth | None | FAILED (resizer) | — | — |
| 7f | Run 6 baseline | Hold margins 0.5/0.3 | None | FAILED (GRT) | — | — |
| 7g | Run 6 baseline | Reuse run6_clean synth | None | FAILED (resizer) | — | — |
| 7h | Run 6 baseline | Reuse BP synth | None | COMPLETED | +3.28 ns | 50 MHz |
Key Lessons Learned (Run 7 Series)
- LAYER_WRITE pipeline is not viable: Any register between col_idx and H_BASE causes either cell explosion (Yosys can't fold constants through registers) or PnR divergence (too many paths for resizer)
- Heuristic diode insertion always fails: Both RUN_HEURISTIC_DIODE_INSERTION=true and DIODE_ON_PORTS=in cause GRT-0118 congestion
- Hold slack margins matter: 0.5/0.3 inserts 35K hold buffers → GRT failure. 0.4/0.2 inserts 22K → passes
- Yosys synthesis is non-deterministic: Re-synthesizing identical RTL+config produces different netlists with different PnR outcomes. The balanced_popcount synthesis netlist is the only one proven to complete
- Config must be consistent: Reusing synthesis from a run with different config settings causes PnR divergence
- Run 6's balanced_popcount synthesis netlist is the golden reference — all future PnR runs should reuse it
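Some of these lessons can be turned into a pre-flight check on a config before launching a multi-hour run. The sketch below is illustrative only: check_config is a hypothetical helper, it covers only the settings named in this log, and treating a missing DIODE_ON_PORTS as "none" is an assumption.

```python
def check_config(cfg):
    """Return warnings for settings that this log associates with failed runs."""
    warnings = []
    if cfg.get("RUN_HEURISTIC_DIODE_INSERTION", False):
        warnings.append("RUN_HEURISTIC_DIODE_INSERTION=true led to GRT-0118 (Run 1)")
    # Assumption: an absent DIODE_ON_PORTS key is treated as "none".
    if cfg.get("DIODE_ON_PORTS", "none") != "none":
        warnings.append("DIODE_ON_PORTS other than none led to GRT-0118 (Runs 7a, 7d)")
    if cfg.get("SYNTH_STRATEGY") == "AREA 2":
        warnings.append("SYNTH_STRATEGY=AREA 2 ran 20+ hours in ABC (Run 3b)")
    return warnings


# Settings as recorded for Run 7h (completed) and Run 7d (failed).
run_7h = {
    "CLOCK_PERIOD": 20,
    "SYNTH_STRATEGY": "AREA 0",
    "DIODE_ON_PORTS": "none",
    "RUN_HEURISTIC_DIODE_INSERTION": False,
}
run_7d = dict(run_7h, DIODE_ON_PORTS="in")

print(check_config(run_7h))
print(check_config(run_7d))
```

A check like this would not catch the netlist-reuse and hold-margin issues, which depend on the run history rather than on a single config file.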
Wrapper Hardening (Mar 12-13, 2026)
wrapper_v2 — COMPLETED (LVS fail)
- Config: SYNTH_ELABORATE_ONLY=true, FP_PDN_ENABLE_RAILS=false
- Result: DRC clean, but LVS fails — 3 standard cells (inv_2 + 2x conb_1) have floating VPWR/VGND
- Root cause: Without power rails, wrapper std cells have no power connection
wrapper_v3 — ABORTED (208 LVS pin-match errors)
- Config: SYNTH_ELABORATE_ONLY=true, FP_PDN_ENABLE_RAILS=true, ERROR_ON_LVS_ERROR=true
- Result: DRC clean, XOR clean, power pins connected. Flow aborted at LVS check.
- LVS issue: 206 constant-tied output pins merged during Magic SPICE extraction
wrapper_v4 — COMPLETED (golden wrapper)
- Config: Same as v3 but ERROR_ON_LVS_ERROR=false
- Result: All 69 stages completed. DRC clean (Magic + KLayout). XOR clean.
- LVS: 208 pin-match errors (cosmetic — device classes equivalent)
- Pin merging: Magic SPICE extraction merges io_oeb[37:0], io_out[37:0], la_data_out[127:0], user_irq[2:1] into shared constant nets, losing individual pin labels
Precheck Results (Mar 13, 2026)
| # | Check | Result |
| --- | --- | --- |
| 1 | License | PASSED (SPDX sub-check: 1727 non-compliant venv files) |
| 2 | Makefile | PASSED |
| 3 | Default | PASSED |
| 4 | Documentation | PASSED |
| 5 | Top Cell | PASSED |
| 6 | Consistency | PASSED |
| 7 | GPIO-Defines | PASSED |
| 8 | XOR | PASSED |
| 9 | Magic DRC | PASSED |
| 10 | KLayout FEOL | FAILED (SIGSEGV crash, NOT real DRC) |
| 11 | KLayout BEOL | PASSED |
| 12 | KLayout Offgrid | PASSED |
| 13 | KLayout Metal Density | PASSED |
| 14 | KLayout Pin Labels | PASSED |
| 15 | KLayout ZeroArea | PASSED |
| 16 | Spike Check | PASSED |
| 17 | Illegal Cellname | PASSED |
| 18 | OEB | PASSED |
| 19 | LVS | FAILED (3 cosmetic pin mismatches) |
17 PASSED, 2 FAILED. Both failures are non-functional:
- KLayout FEOL: Tool crash (signal 11), not a DRC violation
- LVS: "Top level cell failed pin matching" with 3 cosmetic mismatches:
  - io_oeb[9] in layout only (Magic kept 1 label for merged constant net)
  - user_irq[2] in layout only (same issue)
  - vssd2 in netlist only (PDN power net not labeled as port)
- CVC: 0 errors. Device classes: equivalent.
Gate-Level Simulation Results (Mar 13, 2026)
All 5 cocotb tests passed in GL mode (iverilog + caravel_cocotb, no SDF annotation):
| Test | Status | Sim Time (ns) | Wall Time (s) | GPIO[7:0] | Errors |
| --- | --- | --- | --- | --- | --- |
| ldpc_basic | PASS | 854,225 | 1,814 | 0xAB | 0 |
| ldpc_noisy | PASS | 1,011,550 | 2,720 | 0xAB | 0 |
| ldpc_max_iter | PASS | 1,104,525 | 3,393 | 0xAB | 0 |
| ldpc_back_to_back | PASS | 1,140,375 | 3,371 | 0xAB | 0 |
| ldpc_demo | PASS | 1,251,050 | 3,612 | 0xAB | 0 |
- iverilog compilation: ~2h18m per test (1.1GB sim.vvp), 8.2GB RAM
- Simulation: ~30-60 min per test (5-9GB VCD waveform)
- All tests ran on snoke (247GB RAM), 4 tests in parallel
- GPIO[7:0] = 0xAB is the firmware success code for all tests
- No X-propagation or timing race issues observed
Wrapper Hardening Attempts (May 7-11, 2026) — Failed LVS Cosmetic-Fix Series
After the May 1 cf_wrapper_v5 golden run landed (commit 74ad20a to origin / 1fcdc1d to gitea) with 208 cosmetic LVS pin-match errors, a series of follow-up runs (v6 through v11 in the timeline below) tried to eliminate those errors. None succeeded. The errors are a Magic SPICE-extraction limitation, not a hardening defect, so no amount of RTL or placement tweaking will change Magic's behavior.
Timeline
| Run | Date | Strategy | Result |
| --- | --- | --- | --- |
| v6 | May 7 | First post-PDN-swap retry (commit 8cc8414 landed config changes); same wrapper RTL | Flow completed but KLayout crashed in final manufacturability step; same 208 LVS errors |
| v7 | May 7 | Same as v6, re-run | Aborted mid-routing on [DRT-0349] LEF58_ENCLOSURE warnings; routing never completed |
| v8 | May 8 | manual_tieoffs.vh with 206 per-pin conb_1 cells + manual_placements.json placing each cell adjacent to its target pin; mprj moved [60,15] → [60,200] to make room | Flow completed; same 208 LVS errors (Magic still merged all constant-tied outputs). STA failed on min_ss_100C_1v60 and nom_tt_025C_1v80 corners |
| v9 | May 9 | Same as v8 with ERROR_ON_TR_DRC=false to push through routing | 1780 routing DRC errors (deferred). Magic streamout completed but DRC was never clean |
| v10 | May 11 | Same family of placement tweaks | 1362 routing DRC errors (deferred); same failure mode as v9 |
| v11 | May 11 | One more attempt | Interrupted at step 01 (yosys-jsonheader); no harden process running |
Why every attempt failed
The 208 LVS errors all come from Magic SPICE extraction collapsing constant-tied nets:
- la_data_out[127:0]: all 128 bits tied to 1'b0 → Magic extracts them as a single GND net → 127 pin labels lost (only one kept arbitrarily, often none)
- io_out[37:0]: all 38 bits tied to 1'b0 → same merge
- io_oeb[37:0]: all 38 bits tied to 1'b1 → merged into VDD net (Magic keeps the label for io_oeb[9] for unknown reasons)
- user_irq[2:1]: tied to 2'b0 → merged into GND
The v8 attempt — putting each pin behind its own sky130_fd_sc_hd__conb_1 cell — does not break the merge because Magic's extractor still resolves each conb_1 output as the constant VPWR or VGND and collapses them onto the global power/ground nets at the extracted-SPICE level. Per-pin cells generate distinct logical nets in the Verilog netlist but not distinct extracted nets in the layout. Netgen itself reports "Device classes equivalent" and "Cell pin lists altered to match" — the failure is bookkeeping, not electrical.
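The bus widths above account for the 206 constant-tied pins mentioned throughout this section. The short tally below groups them by tie value; the widths and tie values are taken from the list above, and the variable names are illustrative.

```python
# (width, tie value) for each constant-tied wrapper output bus.
tied_buses = {
    "la_data_out": (128, 0),
    "io_out": (38, 0),
    "io_oeb": (38, 1),
    "user_irq": (2, 0),
}

# Count how many pins collapse onto each constant net after extraction.
pins_per_constant = {}
for name, (width, value) in tied_buses.items():
    pins_per_constant[value] = pins_per_constant.get(value, 0) + width

total_pins = sum(pins_per_constant.values())

print(pins_per_constant)
print(total_pins)
```

The tally covers only the tied output pins; it does not by itself explain why netgen reports 208 errors rather than 206.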
Approaches proven non-viable (don't try again)
- Per-pin conb_1 cells in the wrapper Verilog: v8 disproved this. Magic optimizes them onto the constant nets.
- Per-pin manual placement of tieoff cells — placement doesn't change extraction behavior.
- mprj location shifts to make room for tieoff rows — doesn't help; cosmetic LVS persists.
- Pushing routing-DRC tolerance up (v9, v10) — produces broken layouts (1300–1800 routing DRC errors), worse than starting state.
Approaches that could work but were not attempted (deferred — too risky pre-deadline)
- Drive 206 dummy zero outputs from inside ldpc_decoder_top: this would force each wrapper output to come from a distinct extracted macro pin instead of a constant-tied wrapper net. Requires a fresh macro re-harden, which risks breaking Run 6's golden timing on a non-deterministic Yosys run. 4–6 hour cost, high regression risk.
- Post-extraction .mag editing to add per-pin port labels: brittle and tool-specific; would not survive a re-harden.
- Formal LVS waiver (the chosen May 12 path) — document the cosmetic nature of the errors, cite netgen's own "Device classes equivalent" line, and submit alongside the submission packet.
Key lesson
The 208 LVS pin-match errors are not fixable with wrapper-only hardening. Magic SPICE-extraction behavior is the root cause. Future sessions should not re-litigate this — either fix it inside the macro (re-harden risk) or formally waive it.
Next Steps
- Submit with a formal LVS waiver (see chip_ignite/docs/LVS_WAIVER.md)
- Confirm cf precheck and cf verify ldpc_basic --sim gl still pass on the HEAD wrapper state
- cf push before 2026-05-13 deadline