ldpc_optical/docs/hardening-results.md
cah f22ee197ab docs(hardening): add wrapper attempt history through v8-v11 + LVS-fix lessons
Document the full wrapper hardening trail:
- Mar 12-13 wrapper_v2/v3/v4 results, mpw_precheck 17/19, and 5/5 GLS pass
- May 7-11 v6-v11 LVS-cosmetic-fix attempts (all seven failed)

The v6-v11 series tried to eliminate the 208 cosmetic LVS pin-match
errors via per-pin conb_1 tieoffs and placement tweaks. All failed
because the errors are a Magic SPICE-extraction limitation (constant-
tied output nets collapse into shared power/ground at extract time),
not a hardening defect. Documented so future sessions don't re-explore
this dead end.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 23:13:11 -06:00


LDPC Decoder Hardening Results

Run 1: 26_02_25_21_11 (Feb 25, 2026) — FAILED

  • RTL: Original (unpipelined CN update)
  • Config: CLOCK_PERIOD=20 (50 MHz), RUN_HEURISTIC_DIODE_INSERTION=true, HEURISTIC_ANTENNA_THRESHOLD=110
  • Die area: 2800 x 1760 µm (4.93 mm²)
  • Failure: GRT-0118 routing congestion after heuristic diode insertion (66,016 diodes added)
  • Notes: Initial global routing passed (0 overflow, 39% routing utilization). Diode insertion nearly doubled cell count, causing re-routing congestion failure.

Run 2: reuse_synth (Feb 27, 2026) — COMPLETED (timing violations)

  • RTL: Original (unpipelined CN update) — reused synthesis netlist from Run 1
  • Config: CLOCK_PERIOD=20 (50 MHz), RUN_HEURISTIC_DIODE_INSERTION=false, RUN_ANTENNA_REPAIR=true
  • Die area: 2800 x 1760 µm (4.93 mm²)
  • Result: All 70 steps completed. GDS generated. Deferred timing errors.

Physical Results

Metric Result
Magic DRC Clean
KLayout DRC Clean
LVS Clean (0 errors, 0 unmatched)
XOR (Magic vs KLayout) Clean
Illegal overlap Clean
Power grid violations 0
Antenna violating nets 658
Antenna violating pins 905

Area & Utilization

Metric Value
Die area 4,928,000 µm² (4.93 mm²)
Core area 4,846,670 µm²
Instance count 184,663
Instance area 1,303,260 µm² (1.30 mm²)
Core utilization 26.9%
Sequential cells 16,967
Combinational cells 61,366
Timing repair buffers 23,709
Fill cells 415,149
Tap cells 69,228

Timing (post-route, CLOCK_PERIOD = 20 ns / 50 MHz target)

| Corner | Setup WNS (ns) | Setup TNS (ns) | Hold WNS (ns) | Hold TNS (ns) | Setup Violations |
|---|---|---|---|---|---|
| nom_tt_025C_1v80 | -27.13 | -234.9 | -0.32 | -3.76 | 9 |
| nom_ss_100C_1v60 | -70.58 | -29,946.3 | 0.06 | 0 | 5,463 |
| nom_ff_n40C_1v95 | -10.18 | -86.3 | -0.26 | -12.4 | |
| Worst across all | -71.40 | -34,329.1 | -0.47 | -26.4 | |

Estimated Max Frequency

  • TT corner: Critical path ~47 ns → ~21 MHz
  • SS corner: Critical path ~91 ns → ~11 MHz
  • FF corner: Critical path ~30 ns → ~33 MHz
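
These estimates appear to follow from treating the critical path as the clock period minus the setup WNS. A minimal sketch under that assumption (the helper name `max_freq_mhz` is hypothetical):

```python
def max_freq_mhz(clock_period_ns, setup_wns_ns):
    """Return (critical_path_ns, fmax_mhz), assuming path = period - WNS."""
    critical_path_ns = clock_period_ns - setup_wns_ns
    return critical_path_ns, 1000.0 / critical_path_ns


# Setup WNS per corner, copied from the Run 2 timing table (ns).
run2_setup_wns = {"TT": -27.13, "SS": -70.58, "FF": -10.18}

for corner, wns in run2_setup_wns.items():
    path_ns, fmax = max_freq_mhz(20.0, wns)
    print(f"{corner}: critical path {path_ns:.2f} ns, fmax {fmax:.1f} MHz")
```

The same relation can be applied to any later run by substituting that run's WNS values.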

Power (TT corner)

Component Power (W)
Internal 0.0554
Switching 0.0273
Leakage ~0.000002 (~0.002 mW)
Total 0.0827

Key Observations

  1. Disabling heuristic diode insertion fixed the routing congestion failure from Run 1
  2. 658 antenna violations remain — iterative antenna repair was not sufficient. May need to re-enable heuristic insertion with a higher threshold or use DIODE_ON_PORTS
  3. Setup timing is severely violated — critical path is ~47 ns at TT, far from 20 ns target
  4. This run used the unpipelined RTL (synthesis reused from Run 1 which predated the CN pipeline split)
  5. Next run should re-synthesize with pipelined CN update RTL to see if timing improves

Run 3: pipelined_pnr (Mar 1, 2026) — FAILED

  • RTL: Pipelined CN update (CN_STAGE1 + CN_STAGE2)
  • Config: CLOCK_PERIOD=20 (50 MHz), SYNTH_STRATEGY=AREA 0, RUN_HEURISTIC_DIODE_INSERTION=false, RUN_ANTENNA_REPAIR=true
  • Die area: 2800 x 1760 µm (4.93 mm²)
  • Failure: GRT-0118 routing congestion during iterative antenna repair (step 36), after 13+ hours of repair loops
  • Notes: Iterative antenna repair kept inserting diodes and re-routing until congestion became too high. Same root cause as Run 1 but via different mechanism.

Run 3b: pipelined_synth (Feb 28, 2026) — STILL RUNNING

  • RTL: Pipelined CN update
  • Config: SYNTH_STRATEGY=AREA 2 — synthesis only
  • Status: ABC pass 2 (tech mapping) running 20+ hours. AREA 2 is far too aggressive for this design size. Do not use AREA 2 for this design.

Run 4: pipelined_noantenna (Mar 2, 2026) — COMPLETED (timing violations)

  • RTL: Pipelined CN update (CN_STAGE1 + CN_STAGE2)
  • Config: CLOCK_PERIOD=20 (50 MHz), SYNTH_STRATEGY=AREA 0, RUN_HEURISTIC_DIODE_INSERTION=false, RUN_ANTENNA_REPAIR=false
  • Die area: 2800 x 1760 µm (4.93 mm²)
  • Result: All 69 steps completed. GDS generated. Deferred timing errors. No antenna repair attempted.

Physical Results

Metric Result
Magic DRC Clean
KLayout DRC Clean
LVS Clean (0 errors, 0 unmatched)
XOR (Magic vs KLayout) Clean
Illegal overlap Clean
Antenna violating nets 1,707 (no repair attempted)
Antenna violating pins 3,319 (no repair attempted)

Area & Utilization

Metric Value
Die area 4,928,000 µm² (4.93 mm²)
Instance count 183,774
Instance area 1,351,790 µm² (1.35 mm²)
Core utilization 27.9%

Timing (post-route, CLOCK_PERIOD = 20 ns / 50 MHz target)

Corner Setup WNS (ns) Setup TNS (ns) Hold WNS (ns) Hold TNS (ns)
nom_tt_025C_1v80 -28.86 -348.0 -0.08 -0.15
nom_ss_100C_1v60 -74.22 -20,536.0 -0.07 -0.07
nom_ff_n40C_1v95 -11.04 -93.8 -0.12 -2.15
min_tt_025C_1v80 -28.39 -251.0 0 0
max_tt_025C_1v80 -29.36 -725.1 -0.24 -2.15

Estimated Max Frequency

  • TT corner: Critical path ~49 ns → ~20 MHz
  • SS corner: Critical path ~94 ns → ~11 MHz
  • FF corner: Critical path ~31 ns → ~32 MHz

Power (TT corner)

Metric Value
Total 0.0858 W

Key Observations

  1. Pipelined CN update did NOT improve timing — TT WNS is -28.86 ns vs -27.13 ns (unpipelined Run 2). Slightly worse, possibly due to AREA 0 vs AREA 2 synth strategy difference.
  2. Hold violations are much smaller than Run 2 (-0.08 vs -0.32 ns), nearly clean.
  3. Antenna violations increased to 1,707 nets (vs 658 in Run 2) without any repair — AREA 0 produces a less antenna-friendly netlist.
  4. The critical path is still ~47-49 ns, suggesting the bottleneck is NOT the CN update pipeline stage but something else (likely the large mux/barrel shifter or belief update logic).
  5. SYNTH_STRATEGY=AREA 2 takes 20+ hours for ABC tech mapping on this design — never use it. AREA 0 completed in reasonable time.

Summary Table

| Run | RTL | Synth | Antenna | Status | TT Setup WNS | Max Freq (TT) |
|---|---|---|---|---|---|---|
| 1 | Unpipelined | AREA 2 | Heuristic 110µm | FAILED (congestion) | | |
| 2 | Unpipelined | AREA 2 | Iterative | COMPLETED | -27.13 ns | ~21 MHz |
| 3 | Pipelined | AREA 0 | Iterative | FAILED (congestion) | | |
| 3b | Pipelined | AREA 2 | n/a (synth only) | Still running (20+ hrs) | | |
| 4 | Pipelined | AREA 0 | None | COMPLETED | -28.86 ns | ~20 MHz |

Critical Path Analysis (from Run 4, pipelined_noantenna)

Path Summary

Item Value
Startpoint u_core.beliefs[0][5] (beliefs register, bit 5 of element 0)
Endpoint syndrome_weight[7] (MSB of syndrome weight counter)
RTL location SYNDROME state in ldpc_decoder_core.sv, lines 363-385
Slack -28.859 ns (VIOLATED)
Total combinational delay 47.67 ns
Logic levels 222 (171 XOR/XNOR + 51 adder/mux)
Logic vs wire delay 99.7% logic / 0.3% wire

All 8 worst setup violators fan out from beliefs[0][5] to syndrome_weight[7:0].

What the Critical Path Computes

The SYNDROME state computes the full syndrome check in a single clock cycle:

  1. Parity computation (171 XOR levels, 33.9 ns): XOR the sign bits of all beliefs connected to each check node. 7 rows x 32 z-elements = 224 parity bits, each XORing up to 3 connected columns and together reading from 256 belief sign bits.
  2. Population count (51 adder levels, 13.6 ns): Sum all 224 parity results into an 8-bit syndrome_cnt.

The syndrome_cnt = syndrome_cnt + 1 accumulation pattern creates a carry chain dependency that serializes everything.

Delay Breakdown

| Segment | Delay (ns) | Cells | Description |
|---|---|---|---|
| Source CLK-to-Q | 0.795 | 1 (dfxtp_4) | beliefs[0][5] register output |
| Parity XOR chain | 33.888 | 171 (xor2/xnor2) | XOR reduction across belief sign bits |
| Popcount adder tree | 13.634 | 51 (and/or/aoi/oai) | 224-bit popcount to 8-bit count |
| State MUX | 0.148 | 1 (mux2_1) | FSM output mux |
| Wire (interconnect) | 0.149 | | 0.3% of total, negligible |
| Total | 48.614 | 222 levels | |
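
The breakdown can be sanity-checked by summing the segments. A small sketch, assuming the "total combinational delay" in the path summary excludes the CLK-to-Q and wire segments (the dictionary keys are illustrative labels, not report fields):

```python
# Segment delays in ns, copied from the delay breakdown table.
segments_ns = {
    "clk_to_q": 0.795,
    "parity_xor": 33.888,
    "popcount": 13.634,
    "state_mux": 0.148,
    "wire": 0.149,
}

total_ns = sum(segments_ns.values())

# Assumption: "combinational" means XOR chain + popcount + state mux only.
combinational_ns = (
    segments_ns["parity_xor"] + segments_ns["popcount"] + segments_ns["state_mux"]
)

wire_share = segments_ns["wire"] / total_ns
xor_share = segments_ns["parity_xor"] / total_ns

print(f"total: {total_ns:.3f} ns, combinational: {combinational_ns:.2f} ns")
print(f"wire share: {wire_share:.1%}, XOR chain share: {xor_share:.1%}")
```

With that reading, the segments add up to the 48.614 ns total and the 47.67 ns combinational figure quoted in the path summary.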

Proposed Fix: 2-3 Stage Syndrome Pipeline

SYNDROME_S1 (cycle 1, ~16 ns): Compute all 224 parity bits in parallel. Each parity is only 2-3 XOR operations deep (one per connected column). Register the 224-bit parity_vec.

SYNDROME_S2 (cycle 2, ~14 ns): Popcount the 224-bit parity vector via balanced adder tree. Register the 8-bit syndrome_weight and syndrome_ok flag.

SYNDROME_DONE (cycle 3): Already exists — reads syndrome_ok.

Estimated post-fix critical path: ~14-16 ns (comfortably under 20 ns / 50 MHz). Latency impact: +1-2 cycles per iteration (negligible at 30 iterations).
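
The proposed split can be sketched as a behavioral model in Python. This is not the RTL: the function names mirror the state names above, while the tiny connectivity list and sign bits are made-up illustration data, not the decoder's actual H matrix.

```python
from typing import List, Tuple


def syndrome_s1(sign_bits: List[int], checks: List[List[int]]) -> List[int]:
    """Stage 1: one parity bit per check, XORing only that check's connected bits."""
    parity_vec = []
    for connected in checks:
        parity = 0
        for idx in connected:
            parity ^= sign_bits[idx]
        parity_vec.append(parity)
    return parity_vec  # in hardware this vector would be registered


def syndrome_s2(parity_vec: List[int]) -> Tuple[int, bool]:
    """Stage 2: popcount the registered parity vector and derive the ok flag."""
    weight = sum(parity_vec)
    return weight, weight == 0


# Hypothetical example: 6 sign bits, 3 checks with 2-3 connections each.
sign_bits = [0, 1, 1, 0, 1, 0]
checks = [[0, 1, 2], [1, 3], [2, 4, 5]]

parity_vec = syndrome_s1(sign_bits, checks)
syndrome_weight, syndrome_ok = syndrome_s2(parity_vec)
print(parity_vec, syndrome_weight, syndrome_ok)
```

The point of the split is that stage 1 depth depends only on the per-check fan-in, while stage 2 depth depends on how the popcount is structured.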

Secondary Violations

Wishbone address input (wb_adr_i) has -2.47 ns setup violation. Fixable by registering the address at the decoder boundary.

Run 5: syndrome_pipeline (Mar 3, 2026) — COMPLETED (timing violations)

  • RTL: Pipelined CN + syndrome pipeline (SYNDROME_S1 + SYNDROME_S2 with serial popcount)
  • Config: CLOCK_PERIOD=20 (50 MHz), SYNTH_STRATEGY=AREA 0, RUN_ANTENNA_REPAIR=false
  • Die area: 2800 x 1760 µm (4.93 mm²)
  • Result: All 75 steps completed. DRC/LVS clean.
  • TT Setup WNS: -28.98 ns — no improvement from Run 4
  • Root cause: Yosys serializes syndrome_cnt = syndrome_cnt + 1 loop-carried dependency into ~48 ns chain
  • Lesson: Splitting parity + popcount into 2 cycles does not help if the popcount itself is still serial

Run 6: balanced_popcount (Mar 4, 2026) — COMPLETED (TT timing MET!)

  • RTL: Pipelined CN + syndrome pipeline with balanced 4-wide adder tree popcount
  • Config: CLOCK_PERIOD=20 (50 MHz), SYNTH_STRATEGY=AREA 0, RUN_ANTENNA_REPAIR=false
  • Die area: 2800 x 1760 µm (4.93 mm²)
  • Result: All 75 steps completed. DRC/LVS clean. TT timing met!
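
The difference between Run 5's serial accumulation and a balanced tree can be illustrated by counting adder stages. This is a rough model, assuming one adder stage per accumulation step or per tree level; it counts adder stages rather than gate levels, so the numbers are not directly comparable to the 51 levels reported by STA.

```python
def serial_stages(n_inputs):
    """A cnt = cnt + bit chain adds one input per stage, so depth grows linearly."""
    return n_inputs


def tree_stages(n_inputs, fan_in):
    """A balanced tree reduces the input count by fan_in per level."""
    stages = 0
    while n_inputs > 1:
        n_inputs = -(-n_inputs // fan_in)  # ceiling division
        stages += 1
    return stages


def tree_popcount(bits, fan_in=4):
    """Functional model of a fan_in-wide balanced adder tree (bits must be non-empty)."""
    level = list(bits)
    while len(level) > 1:
        level = [sum(level[i:i + fan_in]) for i in range(0, len(level), fan_in)]
    return level[0]


n = 224  # parity bits fed into the popcount
print("serial stages:", serial_stages(n))
print("4-wide tree stages:", tree_stages(n, 4))
print("2-wide tree stages:", tree_stages(n, 2))
```

Both structures compute the same count; only the depth of the dependency chain differs, which is what the timing results reflect.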

Physical Results

Metric Result
Magic DRC Clean
KLayout DRC Clean
LVS Clean (0 errors, 0 unmatched)
Antenna violating nets 1,687 (no repair attempted)

Area & Utilization

Metric Value
Die area 4,928,000 µm² (4.93 mm²)
Instance count 186,915
Instance area 1,367,580 µm² (1.37 mm²)
Core utilization 28.2%
Sequential cells 18,056
Timing repair buffers 27,864

Timing (post-route, CLOCK_PERIOD = 20 ns / 50 MHz target)

Corner Setup WNS (ns) Setup TNS (ns) Hold WNS (ns) Hold TNS (ns)
nom_tt_025C_1v80 0.0 0 -0.45 -10.5
nom_ss_100C_1v60 -9.18 -12,474.4 -0.17 -0.21
nom_ff_n40C_1v95 0.0 0 -0.37 -38.6
max_ss_100C_1v60 -10.45 -15,896.8 -0.44 -0.87

Estimated Max Frequency

  • TT corner: 50 MHz — TIMING MET
  • SS corner: Critical path ~40 ns → ~25 MHz (up from ~11 MHz)
  • FF corner: 50 MHz — TIMING MET

New Critical Path (SS corner)

Item Value
Startpoint u_core.col_idx[0] (column index register)
Endpoint u_core.beliefs registers
Slack -9.18 ns (nom_ss)
Data arrival time 40.15 ns
Description Belief update mux path during LAYER_READ/LAYER_WRITE

The syndrome path is NO LONGER critical. The new bottleneck is the column-indexed mux/barrel-shifter path used during belief reads and writes.

Key Observations

  1. Balanced popcount tree eliminated the syndrome bottleneck — WNS improved from -28.98 ns to 0.0 ns at TT
  2. TT and FF corners now fully meet 50 MHz timing
  3. SS corner still fails (-9.18 ns) due to a different path: belief update mux indexed by col_idx
  4. Hold violations are minor (-0.45 ns) and can be fixed with post-route optimization
  5. 1,687 antenna violations need to be addressed (antenna repair was disabled)

Updated Summary Table

| Run | RTL | Key Change | Antenna | Status | TT Setup WNS | Max Freq (TT) |
|---|---|---|---|---|---|---|
| 1 | Unpipelined | | Heuristic | FAILED | | |
| 2 | Unpipelined | | Iterative | COMPLETED | -27.13 ns | ~21 MHz |
| 3 | Pipelined CN | CN pipeline | Iterative | FAILED | | |
| 4 | Pipelined CN | CN pipeline | None | COMPLETED | -28.86 ns | ~20 MHz |
| 5 | + Syndrome pipeline | Serial popcount | None | COMPLETED | -28.98 ns | ~20 MHz |
| 6 | + Balanced popcount | Adder tree | None | COMPLETED | 0.0 ns | 50 MHz |

Run 7a: pipelined_layer2 (Mar 9, 2026) — FAILED

  • RTL: Run 6 + LAYER_WRITE split into LAYER_WRITE_ADDR + LAYER_WRITE_DATA
  • Config: CLOCK_PERIOD=20, DIODE_ON_PORTS=in, HEURISTIC_ANTENNA_THRESHOLD=200
  • Failure: GRT-0118 routing congestion — heuristic diode insertion on input ports added too many cells
  • Lesson: Any heuristic diode insertion causes GRT failure on this design

Run 7b: pipelined_layer3 (Mar 9, 2026) — FAILED

  • RTL: Same as 7a (LAYER_WRITE_ADDR/DATA split)
  • Config: DIODE_ON_PORTS=none, RUN_HEURISTIC_DIODE_INSERTION=false
  • Failure: Post-CTS resizer diverged — 2.5+ hours at 100% CPU, memory climbing linearly, never converging
  • Lesson: LAYER_WRITE pipeline split creates too many paths for OpenROAD resizer

Run 7c: pre_shift (Mar 9, 2026) — FAILED

  • RTL: Run 6 + pre-registered H_BASE shift lookahead (H_BASE[row_idx][col_idx+1])
  • Config: Same as 7b
  • Failure: GPL-0302 placement density overflow — 150K cells at 41.3% exceeded 40% target
  • Root cause: Yosys cannot fold H_BASE constants through registers → full 256:1 write mux explosion (~2x cell count vs Run 6's 83K)
  • Lesson: Registering H_BASE shift values prevents Yosys constant folding

Run 7d: run6_baseline (Mar 9, 2026) — FAILED

  • RTL: Reverted to Run 6 baseline (identical RTL)
  • Config: DIODE_ON_PORTS=in (inadvertently left from earlier runs), RUN_HEURISTIC_DIODE_INSERTION=false
  • Cells: 85,500
  • Failure: GRT-0118 routing congestion
  • Root cause: DIODE_ON_PORTS=in inserts diodes on input ports even when heuristic insertion is disabled

Run 7e: run6b_nodiode (Mar 10, 2026) — FAILED

  • RTL: Run 6 baseline
  • Config: DIODE_ON_PORTS=none, hold margins 0.5/0.3 (from config.json), reused run6_baseline synthesis
  • Failure: Post-CTS resizer diverged (9+ GiB memory, 3+ hours, never converged)
  • Root cause: Reusing synthesis from a run with different config (DIODE_ON_PORTS=in) produces a subtly different netlist that causes PnR divergence

Run 7f: run6_clean (Mar 10, 2026) — FAILED

  • RTL: Run 6 baseline, clean full run from scratch
  • Config: DIODE_ON_PORTS=none, hold margins 0.5/0.3
  • Cells: 85,500
  • Hold buffers inserted: 35,506
  • Failure: GRT-0118 routing congestion
  • Root cause: Higher hold slack margins (0.5/0.3 vs balanced_popcount's 0.4/0.2) caused 13K extra hold buffers (35K vs 22K), pushing routing congestion over GRT threshold

Run 7g: run6_fixhold (Mar 10, 2026) — FAILED

  • RTL: Run 6 baseline, reused run6_clean synthesis
  • Config: DIODE_ON_PORTS=none, hold margins 0.4/0.2 (matching balanced_popcount)
  • Failure: Post-CTS resizer diverged (14+ GiB, 3.5+ hours)
  • Root cause: Yosys non-determinism — run6_clean synthesis produced a slightly different cell mix that didn't route cleanly despite identical config

Run 7h: run6_reuse_bp (Mar 10, 2026) — COMPLETED (reproduces Run 6!)

  • RTL: Run 6 baseline, reused balanced_popcount's actual synthesis netlist
  • Config: DIODE_ON_PORTS=none, hold margins 0.4/0.2
  • Result: All stages completed. DRC/LVS clean. TT timing met!
  • Hold buffers: 22,095 (identical to balanced_popcount)

Physical Results

Metric Result
Magic DRC Clean
KLayout DRC Clean
LVS Clean (circuits match uniquely)
Antenna violating nets 1,687 (repair disabled)
Antenna violating pins 3,416 (repair disabled)

Area & Utilization

Metric Value
Die area 4,928,000 µm² (4.93 mm²)
Instance count 186,915
Instance area 1,367,580 µm² (1.37 mm²)
Core utilization 28.2%

Timing (post-route, CLOCK_PERIOD = 20 ns / 50 MHz target)

Corner Setup WNS (ns) Setup TNS (ns) Hold WNS (ns) Hold TNS (ns)
nom_tt_025C_1v80 +3.28 0 -0.45 -10.5
nom_ss_100C_1v60 -9.18 -12,474 -0.17 -0.21
nom_ff_n40C_1v95 +5.93 0 -0.37 -38.6
max_ss_100C_1v60 -10.45 -15,897 -0.44 -0.87
min_tt_025C_1v80 +3.71 0 -0.26 -1.66
max_tt_025C_1v80 +2.90 0 -0.62 -29.5

Key Observations

  1. Results identical to Run 6 — confirms that the balanced_popcount synthesis netlist is the key ingredient
  2. Yosys non-determinism is significant: re-synthesizing the same RTL with same config produces netlists that fail PnR
  3. Hold violations (1,543 total) are all on input port paths (wb_dat_i, wb_adr_i), zero reg-to-reg — fixable with input delay constraints
  4. Max slew violations (4,112) and max cap violations (655) concentrated in SS corner
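
Because only one synthesis netlist is known to complete PnR, it may be worth pinning it by checksum before each reuse. A minimal sketch; the function names are hypothetical, and the netlist path and expected digest would have to come from the actual balanced_popcount run directory.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large netlists do not need to fit in memory."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_golden_netlist(netlist_path, expected_sha256):
    """Raise if the netlist on disk is not the recorded golden one."""
    actual = sha256_of(netlist_path)
    if actual != expected_sha256:
        raise RuntimeError(
            f"{netlist_path} does not match the golden netlist "
            f"(got {actual}, expected {expected_sha256})"
        )
    return actual

# Usage (paths and digest are placeholders to be filled in by the user):
# check_golden_netlist("<path to balanced_popcount netlist>", "<recorded sha256>")
```

A check like this would have flagged the Run 7e and 7g cases, where a different synthesis result was fed into PnR.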

Updated Summary Table

| Run | RTL | Key Change | Antenna | Status | TT Setup WNS | Max Freq (TT) |
|---|---|---|---|---|---|---|
| 1 | Unpipelined | | Heuristic | FAILED | | |
| 2 | Unpipelined | | Iterative | COMPLETED | -27.13 ns | ~21 MHz |
| 3 | Pipelined CN | CN pipeline | Iterative | FAILED | | |
| 4 | Pipelined CN | CN pipeline | None | COMPLETED | -28.86 ns | ~20 MHz |
| 5 | + Syndrome pipeline | Serial popcount | None | COMPLETED | -28.98 ns | ~20 MHz |
| 6 | + Balanced popcount | Adder tree | None | COMPLETED | 0.0 ns | 50 MHz |
| 7a | + LAYER_WRITE split | ADDR/DATA pipeline | Heuristic | FAILED | | |
| 7b | + LAYER_WRITE split | ADDR/DATA pipeline | None | FAILED (resizer) | | |
| 7c | + pre_shift | H_BASE lookahead | None | FAILED (GPL) | | |
| 7d | Run 6 baseline | DIODE_ON_PORTS=in | None | FAILED (GRT) | | |
| 7e | Run 6 baseline | Reuse wrong synth | None | FAILED (resizer) | | |
| 7f | Run 6 baseline | Hold margins 0.5/0.3 | None | FAILED (GRT) | | |
| 7g | Run 6 baseline | Reuse run6_clean synth | None | FAILED (resizer) | | |
| 7h | Run 6 baseline | Reuse BP synth | None | COMPLETED | +3.28 ns | 50 MHz |

Key Lessons Learned (Run 7 Series)

  1. LAYER_WRITE pipeline is not viable: Any register between col_idx and H_BASE causes either cell explosion (Yosys can't fold constants through registers) or PnR divergence (too many paths for resizer)
  2. Heuristic diode insertion always fails: Both RUN_HEURISTIC_DIODE_INSERTION=true and DIODE_ON_PORTS=in cause GRT-0118 congestion
  3. Hold slack margins matter: 0.5/0.3 inserts 35K hold buffers → GRT failure. 0.4/0.2 inserts 22K → passes
  4. Yosys synthesis is non-deterministic: Re-synthesizing identical RTL+config produces different netlists with different PnR outcomes. The balanced_popcount synthesis netlist is the only one proven to complete
  5. Config must be consistent: Reusing synthesis from a run with different config settings causes PnR divergence
  6. Run 6's balanced_popcount synthesis netlist is the golden reference — all future PnR runs should reuse it
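
The lessons above lend themselves to a pre-flight check on the run configuration. The sketch below is an assumption-laden illustration: the three upper-case keys are the ones named in this document, while `hold_margins` is a hypothetical placeholder because the exact hold-margin variable names are not recorded here.

```python
def lint_config(cfg):
    """Return a list of warnings based on the Run 1-7 lessons; empty means no known issue."""
    problems = []
    if cfg.get("RUN_HEURISTIC_DIODE_INSERTION", False):
        problems.append("heuristic diode insertion led to GRT-0118 (Runs 1, 7a)")
    if cfg.get("DIODE_ON_PORTS", "none") != "none":
        problems.append("DIODE_ON_PORTS other than none led to GRT-0118 (Run 7d)")
    if cfg.get("SYNTH_STRATEGY") == "AREA 2":
        problems.append("AREA 2 ran 20+ hours in ABC tech mapping (Run 3b)")
    # Hypothetical key: a pair of hold slack margins, as in '0.4/0.2'.
    margins = cfg.get("hold_margins")
    if margins is not None and (margins[0] > 0.4 or margins[1] > 0.2):
        problems.append("hold margins above 0.4/0.2 added ~13K buffers (Run 7f)")
    return problems


run7f_like = {
    "SYNTH_STRATEGY": "AREA 0",
    "DIODE_ON_PORTS": "none",
    "RUN_HEURISTIC_DIODE_INSERTION": False,
    "hold_margins": (0.5, 0.3),
}
for warning in lint_config(run7f_like):
    print("WARNING:", warning)
```

A passing lint does not guarantee a clean run, since the synthesis non-determinism noted above is outside what a config check can see.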

Wrapper Hardening (Mar 12-13, 2026)

wrapper_v2 — COMPLETED (LVS fail)

  • Config: SYNTH_ELABORATE_ONLY=true, FP_PDN_ENABLE_RAILS=false
  • Result: DRC clean, but LVS fails — 3 standard cells (inv_2 + 2x conb_1) have floating VPWR/VGND
  • Root cause: Without power rails, wrapper std cells have no power connection

wrapper_v3 — ABORTED (208 LVS pin-match errors)

  • Config: SYNTH_ELABORATE_ONLY=true, FP_PDN_ENABLE_RAILS=true, ERROR_ON_LVS_ERROR=true
  • Result: DRC clean, XOR clean, power pins connected. Flow aborted at LVS check.
  • LVS issue: 206 constant-tied output pins merged during Magic SPICE extraction

wrapper_v4 — COMPLETED (golden wrapper)

  • Config: Same as v3 but ERROR_ON_LVS_ERROR=false
  • Result: All 69 stages completed. DRC clean (Magic + KLayout). XOR clean.
  • LVS: 208 pin-match errors (cosmetic — device classes equivalent)
  • Pin merging: Magic SPICE extraction merges io_oeb[37:0], io_out[37:0], la_data_out[127:0], user_irq[2:1] into shared constant nets, losing individual pin labels

Precheck Results (Mar 13, 2026)

# Check Result
1 License PASSED (SPDX sub-check: 1727 non-compliant venv files)
2 Makefile PASSED
3 Default PASSED
4 Documentation PASSED
5 Top Cell PASSED
6 Consistency PASSED
7 GPIO-Defines PASSED
8 XOR PASSED
9 Magic DRC PASSED
10 KLayout FEOL FAILED (SIGSEGV crash, NOT real DRC)
11 KLayout BEOL PASSED
12 KLayout Offgrid PASSED
13 KLayout Metal Density PASSED
14 KLayout Pin Labels PASSED
15 KLayout ZeroArea PASSED
16 Spike Check PASSED
17 Illegal Cellname PASSED
18 OEB PASSED
19 LVS FAILED (3 cosmetic pin mismatches)

17 PASSED, 2 FAILED. Both failures are non-functional:

  • KLayout FEOL: Tool crash (signal 11), not a DRC violation
  • LVS: "Top level cell failed pin matching", with 3 cosmetic mismatches:
    • io_oeb[9] in layout only (Magic kept 1 label for merged constant net)
    • user_irq[2] in layout only (same issue)
    • vssd2 in netlist only (PDN power net not labeled as port)
  • CVC: 0 errors. Device classes: equivalent.

Gate-Level Simulation Results (Mar 13, 2026)

All 5 cocotb tests passed in GL mode (iverilog + caravel_cocotb, no SDF annotation):

| Test | Status | Sim Time (ns) | Wall Time (s) | GPIO[7:0] | Errors |
|---|---|---|---|---|---|
| ldpc_basic | PASS | 854,225 | 1,814 | 0xAB | 0 |
| ldpc_noisy | PASS | 1,011,550 | 2,720 | 0xAB | 0 |
| ldpc_max_iter | PASS | 1,104,525 | 3,393 | 0xAB | 0 |
| ldpc_back_to_back | PASS | 1,140,375 | 3,371 | 0xAB | 0 |
| ldpc_demo | PASS | 1,251,050 | 3,612 | 0xAB | 0 |

  • iverilog compilation: ~2h18m per test (1.1GB sim.vvp), 8.2GB RAM
  • Simulation: ~30-60 min per test (5-9GB VCD waveform)
  • All tests ran on snoke (247GB RAM), 4 tests in parallel
  • GPIO[7:0] = 0xAB is the firmware success code for all tests
  • No X-propagation or timing race issues observed

Wrapper Hardening Attempts (May 7-11, 2026) — Failed LVS Cosmetic-Fix Series

After the May 1 cf_wrapper_v5 golden run landed (commit 74ad20a to origin / 1fcdc1d to gitea) with 208 cosmetic LVS pin-match errors, a series of follow-up runs (v6 through v11) tried to eliminate those errors. All of them failed. The errors are a Magic SPICE-extraction limitation, not a hardening defect, so no amount of RTL/placement tweaking will change Magic's behavior.

Timeline

| Run | Date | Strategy | Result |
|---|---|---|---|
| v6 | May 7 | First post-PDN-swap retry (commit 8cc8414 landed config changes); same wrapper RTL | Flow completed but KLayout crashed in final manufacturability step; same 208 LVS errors |
| v7 | May 7 | Same as v6, re-run | Aborted mid-routing on [DRT-0349] LEF58_ENCLOSURE warnings; routing never completed |
| v8 | May 8 | manual_tieoffs.vh with 206 per-pin conb_1 cells + manual_placements.json placing each cell adjacent to its target pin; mprj moved [60,15] → [60,200] to make room | Flow completed; same 208 LVS errors because Magic still merged all constant-tied outputs. STA failed on min_ss_100C_1v60 and nom_tt_025C_1v80 corners |
| v9 | May 9 | Same as v8 with ERROR_ON_TR_DRC=false to push through routing | 1780 routing DRC errors (deferred). Magic streamout completed but DRC was never clean |
| v10 | May 11 | Same family of placement tweaks | 1362 routing DRC errors (deferred); same failure mode as v9 |
| v11 | May 11 | One more attempt | Interrupted at step 01 (yosys-jsonheader); no harden process running |

Why every attempt failed

The 208 LVS errors all come from Magic SPICE extraction collapsing constant-tied nets:

  • la_data_out[127:0] — all 128 bits tied to 1'b0 → Magic extracts as a single GND net → 127 pin labels lost (only one kept arbitrarily, often none)
  • io_out[37:0] — all 38 bits tied to 1'b0 → same merge
  • io_oeb[37:0] — all 38 bits tied to 1'b1 → merged into VDD net (Magic keeps the label for io_oeb[9] for unknown reasons)
  • user_irq[2:1] — tied to 2'b0 → merged into GND

The v8 attempt — putting each pin behind its own sky130_fd_sc_hd__conb_1 cell — does not break the merge because Magic's extractor still resolves each conb_1 output as the constant VPWR or VGND and collapses them onto the global power/ground nets at the extracted-SPICE level. Per-pin cells generate distinct logical nets in the Verilog netlist but not distinct extracted nets in the layout. Netgen itself reports "Device classes equivalent" and "Cell pin lists altered to match" — the failure is bookkeeping, not electrical.
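
The bookkeeping nature of the failure can be shown with a toy model of the merge. This is only an illustration of the counting, not of Magic's extractor: the net names GND and VDD follow the bullets above, and the grouping step stands in for whatever the extractor does internally.

```python
from collections import defaultdict


def expand(bus, msb, lsb):
    """Expand a bus range such as io_out[37:0] into individual pin names."""
    return [f"{bus}[{i}]" for i in range(lsb, msb + 1)]


# Constant each wrapper output is tied to, per the list above.
tied_to = {}
for pin in expand("la_data_out", 127, 0) + expand("io_out", 37, 0) + expand("user_irq", 2, 1):
    tied_to[pin] = "GND"
for pin in expand("io_oeb", 37, 0):
    tied_to[pin] = "VDD"

# Toy extraction: every pin tied to the same constant lands on one shared net.
extracted_nets = defaultdict(list)
for pin, net in tied_to.items():
    extracted_nets[net].append(pin)

print("logical output pins:", len(tied_to))
print("distinct extracted nets:", len(extracted_nets))
for net, pins in extracted_nets.items():
    print(f"  {net}: {len(pins)} pins share one net")
```

In this model 206 logical pins collapse onto two nets, which matches the 206 per-pin tieoffs attempted in v8 and explains why adding more cells on the wrapper side does not create more extracted nets.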

Approaches proven non-viable (don't try again)

  1. Per-pin conb_1 cells in the wrapper Verilog — v8 disproved this. Magic optimizes them onto the constant nets.
  2. Per-pin manual placement of tieoff cells — placement doesn't change extraction behavior.
  3. mprj location shifts to make room for tieoff rows — doesn't help; cosmetic LVS persists.
  4. Pushing routing-DRC tolerance up (v9, v10): produces broken layouts (1300-1800 routing DRC errors), worse than the starting state.

Approaches that could work but were not attempted (deferred — too risky pre-deadline)

  1. Drive 206 dummy zero outputs from inside ldpc_decoder_top: would force each wrapper output to come from a distinct extracted macro pin instead of a constant-tied wrapper net. Requires a fresh macro re-harden, which risks breaking Run 6's golden timing on a non-deterministic Yosys run. 4-6 hour cost, high regression risk.
  2. Post-extraction .mag editing to add per-pin port labels — brittle and tool-specific; would not survive a re-harden.
  3. Formal LVS waiver (the chosen May 12 path) — document the cosmetic nature of the errors, cite netgen's own "Device classes equivalent" line, and submit alongside the submission packet.

Key lesson

The 208 LVS pin-match errors are not fixable with wrapper-only hardening. Magic SPICE-extraction behavior is the root cause. Future sessions should not re-litigate this — either fix it inside the macro (re-harden risk) or formally waive it.

Next Steps

  • Submit with a formal LVS waiver (see chip_ignite/docs/LVS_WAIVER.md)
  • Confirm cf precheck and cf verify ldpc_basic --sim gl still pass on the HEAD wrapper state
  • cf push before 2026-05-13 deadline