From f22ee197ab80246faa0b29f889ced6fe37bd3848 Mon Sep 17 00:00:00 2001
From: cah
Date: Tue, 12 May 2026 23:13:11 -0600
Subject: [PATCH] docs(hardening): add wrapper attempt history through v8-v11 + LVS-fix lessons

Document the full wrapper hardening trail:
- Mar 12-13 wrapper_v2/v3/v4 results, mpw_precheck 17/19, and 5/5 GLS pass
- May 7-11 v6-v11 LVS-cosmetic-fix attempts (all seven failed)

The v6-v11 series tried to eliminate the 208 cosmetic LVS pin-match
errors via per-pin conb_1 tieoffs and placement tweaks. All failed
because the errors are a Magic SPICE-extraction limitation
(constant-tied output nets collapse into shared power/ground at
extract time), not a hardening defect. Documented so future sessions
don't re-explore this dead end.

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 docs/hardening-results.md | 118 ++++++++++++++++++++++++++++++++++++--
 1 file changed, 114 insertions(+), 4 deletions(-)

diff --git a/docs/hardening-results.md b/docs/hardening-results.md
index f104e69..7f55d5c 100644
--- a/docs/hardening-results.md
+++ b/docs/hardening-results.md
@@ -377,8 +377,118 @@ The syndrome path is NO LONGER critical. The new bottleneck is the column-indexe
 5. **Config must be consistent**: Reusing synthesis from a run with different config settings causes PnR divergence
 6. **Run 6's balanced_popcount synthesis netlist is the golden reference** — all future PnR runs should reuse it
 
+## Wrapper Hardening (Mar 12-13, 2026)
+
+### wrapper_v2 — COMPLETED (LVS fail)
+- **Config**: `SYNTH_ELABORATE_ONLY=true`, `FP_PDN_ENABLE_RAILS=false`
+- **Result**: DRC clean, but LVS fails — 3 standard cells (inv_2 + 2x conb_1) have floating VPWR/VGND
+- **Root cause**: Without power rails, wrapper std cells have no power connection
+
+### wrapper_v3 — ABORTED (208 LVS pin-match errors)
+- **Config**: `SYNTH_ELABORATE_ONLY=true`, `FP_PDN_ENABLE_RAILS=true`, `ERROR_ON_LVS_ERROR=true`
+- **Result**: DRC clean, XOR clean, power pins connected. Flow aborted at LVS check.
+- **LVS issue**: 206 constant-tied output pins merged during Magic SPICE extraction
+
+### wrapper_v4 — COMPLETED (golden wrapper)
+- **Config**: Same as v3 but `ERROR_ON_LVS_ERROR=false`
+- **Result**: All 69 stages completed. DRC clean (Magic + KLayout). XOR clean.
+- **LVS**: 208 pin-match errors (cosmetic — device classes equivalent)
+- **Pin merging**: Magic SPICE extraction merges io_oeb[37:0], io_out[37:0], la_data_out[127:0], user_irq[2:1] into shared constant nets, losing individual pin labels
+
+## Precheck Results (Mar 13, 2026)
+
+| # | Check | Result |
+|---|-------|--------|
+| 1 | License | PASSED (SPDX sub-check: 1727 non-compliant venv files) |
+| 2 | Makefile | **PASSED** |
+| 3 | Default | **PASSED** |
+| 4 | Documentation | **PASSED** |
+| 5 | Top Cell | **PASSED** |
+| 6 | Consistency | **PASSED** |
+| 7 | GPIO-Defines | **PASSED** |
+| 8 | XOR | **PASSED** |
+| 9 | Magic DRC | **PASSED** |
+| 10 | KLayout FEOL | FAILED (SIGSEGV crash, NOT real DRC) |
+| 11 | KLayout BEOL | **PASSED** |
+| 12 | KLayout Offgrid | **PASSED** |
+| 13 | KLayout Metal Density | **PASSED** |
+| 14 | KLayout Pin Labels | **PASSED** |
+| 15 | KLayout ZeroArea | **PASSED** |
+| 16 | Spike Check | **PASSED** |
+| 17 | Illegal Cellname | **PASSED** |
+| 18 | OEB | **PASSED** |
+| 19 | LVS | FAILED (3 cosmetic pin mismatches) |
+
+**17 PASSED, 2 FAILED.** Both failures are non-functional:
+- KLayout FEOL: Tool crash (signal 11), not a DRC violation
+- LVS: "Top level cell failed pin matching" — 3 cosmetic mismatches:
+  - `io_oeb[9]` in layout only (Magic kept 1 label for merged constant net)
+  - `user_irq[2]` in layout only (same issue)
+  - `vssd2` in netlist only (PDN power net not labeled as port)
+  - CVC: 0 errors. Device classes: equivalent.
+
+## Gate-Level Simulation Results (Mar 13, 2026)
+
+All 5 cocotb tests passed in GL mode (iverilog + caravel_cocotb, no SDF annotation):
+
+| Test | Status | Sim Time (ns) | Wall Time (s) | GPIO[7:0] | Errors |
+|------|--------|---------------|----------------|-----------|--------|
+| ldpc_basic | **PASS** | 854,225 | 1,814 | 0xAB | 0 |
+| ldpc_noisy | **PASS** | 1,011,550 | 2,720 | 0xAB | 0 |
+| ldpc_max_iter | **PASS** | 1,104,525 | 3,393 | 0xAB | 0 |
+| ldpc_back_to_back | **PASS** | 1,140,375 | 3,371 | 0xAB | 0 |
+| ldpc_demo | **PASS** | 1,251,050 | 3,612 | 0xAB | 0 |
+
+- iverilog compilation: ~2h18m per test (1.1GB sim.vvp), 8.2GB RAM
+- Simulation: ~30-60 min per test (5-9GB VCD waveform)
+- All tests ran on snoke (247GB RAM), 4 tests in parallel
+- GPIO[7:0] = 0xAB is the firmware success code for all tests
+- No X-propagation or timing race issues observed
+
+## Wrapper Hardening Attempts (May 7-11, 2026) — Failed LVS Cosmetic-Fix Series
+
+After the May 1 `cf_wrapper_v5` golden run landed (commit `74ad20a` to origin / `1fcdc1d` to gitea) with 208 cosmetic LVS pin-match errors, a series of seven follow-up runs tried to eliminate those errors. **All seven failed.** The errors are a Magic SPICE-extraction limitation, not a hardening defect — no amount of RTL/placement tweaking will change Magic's behavior.
+
+### Timeline
+
+| Run | Date | Strategy | Result |
+|-----|------|----------|--------|
+| v6 | May 7 | First post-PDN-swap retry (commit `8cc8414` landed config changes); same wrapper RTL | Flow completed but KLayout crashed in final manufacturability step; same 208 LVS errors |
+| v7 | May 7 | Same as v6, re-run | Aborted mid-routing on `[DRT-0349]` LEF58_ENCLOSURE warnings — routing never completed |
+| v8 | May 8 | `manual_tieoffs.vh` with 206 per-pin `conb_1` cells + `manual_placements.json` placing each cell adjacent to its target pin; mprj moved `[60,15] → [60,200]` to make room | Flow completed; **same 208 LVS errors** — Magic still merged all constant-tied outputs. STA failed on `min_ss_100C_1v60` and `nom_tt_025C_1v80` corners |
+| v9 | May 9 | Same as v8 with `ERROR_ON_TR_DRC=false` to push through routing | **1780 routing DRC errors** (deferred). Magic streamout completed but DRC was never clean |
+| v10 | May 11 | Same family of placement tweaks | **1362 routing DRC errors** (deferred); same failure mode as v9 |
+| v11 | May 11 | One more attempt | Interrupted at step 01 (yosys-jsonheader); no harden process running |
+
+### Why every attempt failed
+
+The 208 LVS errors all come from **Magic SPICE extraction collapsing constant-tied nets**:
+
+- `la_data_out[127:0]` — all 128 bits tied to `1'b0` → Magic extracts as a single GND net → 127 pin labels lost (only one kept arbitrarily, often none)
+- `io_out[37:0]` — all 38 bits tied to `1'b0` → same merge
+- `io_oeb[37:0]` — all 38 bits tied to `1'b1` → merged into VDD net (Magic keeps the label for `io_oeb[9]` for unknown reasons)
+- `user_irq[2:1]` — tied to `2'b0` → merged into GND
+
+The v8 attempt — putting each pin behind its own `sky130_fd_sc_hd__conb_1` cell — does not break the merge because Magic's extractor still resolves each `conb_1` output as the constant `VPWR` or `VGND` and collapses them onto the global power/ground nets at the extracted-SPICE level. Per-pin cells generate distinct logical nets in the Verilog netlist but not distinct extracted nets in the layout. **Netgen itself reports "Device classes equivalent" and "Cell pin lists altered to match"** — the failure is bookkeeping, not electrical.
+
+### Approaches proven non-viable (don't try again)
+
+1. **Per-pin `conb_1` cells in the wrapper Verilog** — v8 disproved this. Magic optimizes them onto the constant nets.
+2. **Per-pin manual placement of tieoff cells** — placement doesn't change extraction behavior.
+3. **mprj location shifts** to make room for tieoff rows — doesn't help; cosmetic LVS persists.
+4. **Pushing routing-DRC tolerance up** (v9, v10) — produces broken layouts (1300–1800 routing DRC errors), worse than starting state.
+
+### Approaches that *could* work but were not attempted (deferred — too risky pre-deadline)
+
+1. **Drive 206 dummy zero outputs from inside `ldpc_decoder_top`** — would force each wrapper output to come from a distinct extracted macro pin instead of a constant-tied wrapper net. Requires a fresh macro re-harden, which risks breaking Run 6's golden timing on a non-deterministic Yosys run. 4–6 hour cost, high regression risk.
+2. **Post-extraction `.mag` editing** to add per-pin port labels — brittle and tool-specific; would not survive a re-harden.
+3. **Formal LVS waiver** (the chosen May 12 path) — document the cosmetic nature of the errors, cite netgen's own "Device classes equivalent" line, and submit alongside the submission packet.
+
+### Key lesson
+
+**The 208 LVS pin-match errors are not fixable with wrapper-only hardening.** Magic SPICE-extraction behavior is the root cause. Future sessions should not re-litigate this — either fix it inside the macro (re-harden risk) or formally waive it.
+
 ## Next Steps
-- Address antenna violations (1,687 nets) for tapeout — try `GRT_ANTENNA_ITERS` with reused BP synthesis
-- Fix hold violations via input delay constraints (all are input port paths)
-- Consider relaxing SS target or adding pipeline stage to belief update mux for SS corner improvement
-- Investigate making Yosys synthesis deterministic (fixed random seed, etc.) for reproducible builds
+- Submit with a formal LVS waiver (see `chip_ignite/docs/LVS_WAIVER.md`)
+- Confirm `cf precheck` and `cf verify ldpc_basic --sim gl` still pass on the HEAD wrapper state
+- `cf push` before 2026-05-13 deadline
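
Reviewer note on the pin arithmetic in "Why every attempt failed": the figure of 206 constant-tied outputs follows directly from the four bus widths listed there. The sketch below is only a sanity check of that tally; the dictionary layout and the names `TIED_BUSES` and `tally_tied_pins` are illustrative assumptions and not part of the project. It does not attempt to explain how the 206 merged pins map onto the 208 reported LVS errors, since the patch does not state that.

```python
from collections import Counter

# Bus widths and tie-off constants as listed in the hardening notes.
# The data layout and all names here are hypothetical, for illustration only.
TIED_BUSES = {
    "la_data_out[127:0]": (128, "GND"),  # tied to 1'b0
    "io_out[37:0]": (38, "GND"),         # tied to 1'b0
    "io_oeb[37:0]": (38, "VDD"),         # tied to 1'b1
    "user_irq[2:1]": (2, "GND"),         # tied to 2'b0
}


def tally_tied_pins(buses):
    """Return (total pin count, pin count per shared constant net)."""
    per_net = Counter()
    for width, net in buses.values():
        per_net[net] += width
    return sum(per_net.values()), dict(per_net)


total, per_net = tally_tied_pins(TIED_BUSES)
print(total)    # 206
print(per_net)  # {'GND': 168, 'VDD': 38}
```

Under the merge behavior the patch describes, all 168 ground-tied pins end up on one extracted net and all 38 `io_oeb` pins on another, which is why per-pin labels cannot survive extraction regardless of how the tieoff cells are placed.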