Compare commits: db06a8a481...master (10 commits)

| SHA1 |
|---|
| f2901c6366 |
| 3e797fd5ab |
| a83f05cf82 |
| 6cc13829c8 |
| ab9ef9ca30 |
| 1520f4da5b |
| b7449a6191 |
| 3372f84a3a |
| 74baf3cd05 |
| 9a28e30bed |
4
.gitignore
vendored
@@ -5,5 +5,9 @@
__pycache__/
*.pyc

# Simulation artifacts
tb/obj_dir/
tb/*.vcd

# Contest submission repo (separate GitHub repo)
chip_ignite/

22162
data/test_vectors.json
Normal file
File diff suppressed because it is too large
197
docs/hardening-results.md
Normal file
@@ -0,0 +1,197 @@
# LDPC Decoder Hardening Results

## Run 1: `26_02_25_21_11` (Feb 25, 2026) — FAILED
- **RTL**: Original (unpipelined CN update)
- **Config**: `CLOCK_PERIOD=20` (50 MHz), `RUN_HEURISTIC_DIODE_INSERTION=true`, `HEURISTIC_ANTENNA_THRESHOLD=110`
- **Die area**: 2800 x 1760 µm (4.93 mm²)
- **Failure**: `GRT-0118` routing congestion after heuristic diode insertion (66,016 diodes added)
- **Notes**: Initial global routing passed (0 overflow, 39% routing utilization). Diode insertion nearly doubled cell count, causing re-routing congestion failure.

## Run 2: `reuse_synth` (Feb 27, 2026) — COMPLETED (timing violations)
- **RTL**: Original (unpipelined CN update) — reused synthesis netlist from Run 1
- **Config**: `CLOCK_PERIOD=20` (50 MHz), `RUN_HEURISTIC_DIODE_INSERTION=false`, `RUN_ANTENNA_REPAIR=true`
- **Die area**: 2800 x 1760 µm (4.93 mm²)
- **Result**: All 70 steps completed. GDS generated. Deferred timing errors.

### Physical Results
| Metric | Result |
|--------|--------|
| Magic DRC | **Clean** |
| KLayout DRC | **Clean** |
| LVS | **Clean** (0 errors, 0 unmatched) |
| XOR (Magic vs KLayout) | **Clean** |
| Illegal overlap | **Clean** |
| Power grid violations | **0** |
| Antenna violating nets | 658 |
| Antenna violating pins | 905 |

### Area & Utilization
| Metric | Value |
|--------|-------|
| Die area | 4,928,000 µm² (4.93 mm²) |
| Core area | 4,846,670 µm² |
| Instance count | 184,663 |
| Instance area | 1,303,260 µm² (1.30 mm²) |
| Core utilization | 26.9% |
| Sequential cells | 16,967 |
| Combinational cells | 61,366 |
| Timing repair buffers | 23,709 |
| Fill cells | 415,149 |
| Tap cells | 69,228 |

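The die area and utilization figures in the table can be cross-checked with a few lines of arithmetic. This is a minimal sketch that assumes core utilization is simply instance area divided by core area; the flow's own definition may treat some cell types differently.

```python
# Values copied from the Run 2 table. The utilization formula is an
# assumption about how the flow derives "core utilization".
DIE_WIDTH_UM, DIE_HEIGHT_UM = 2800, 1760
CORE_AREA_UM2 = 4_846_670
INSTANCE_AREA_UM2 = 1_303_260

die_area_um2 = DIE_WIDTH_UM * DIE_HEIGHT_UM
core_utilization = INSTANCE_AREA_UM2 / CORE_AREA_UM2

print(f"die area: {die_area_um2:,} um^2 ({die_area_um2 / 1e6:.2f} mm^2)")
print(f"core utilization: {core_utilization:.1%}")
```

Under that assumption the computed values agree with the reported 4.93 mm² and 26.9%.
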
### Timing (post-route, CLOCK_PERIOD = 20 ns / 50 MHz target)
| Corner | Setup WNS (ns) | Setup TNS (ns) | Hold WNS (ns) | Hold TNS (ns) | Setup Violations |
|--------|----------------|-----------------|----------------|----------------|------------------|
| nom_tt_025C_1v80 | **-27.13** | -234.9 | -0.32 | -3.76 | 9 |
| nom_ss_100C_1v60 | **-70.58** | -29,946.3 | 0.06 | 0 | 5,463 |
| nom_ff_n40C_1v95 | **-10.18** | -86.3 | -0.26 | -12.4 | — |
| **Worst across all** | **-71.40** | -34,329.1 | -0.47 | -26.4 | — |

### Estimated Max Frequency
- **TT corner**: Critical path ~47 ns → **~21 MHz**
- **SS corner**: Critical path ~91 ns → **~11 MHz**
- **FF corner**: Critical path ~30 ns → **~33 MHz**

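The estimates above are consistent with taking the critical path as the clock period minus the setup WNS, and the maximum frequency as its reciprocal. A minimal sketch of that arithmetic follows; the function name is illustrative and the relationship is an assumption about how the figures were derived.

```python
def max_frequency_mhz(clock_period_ns: float, setup_wns_ns: float) -> float:
    """Estimate fmax from the target period and the worst setup slack."""
    # Negative slack means the path is longer than the clock period.
    critical_path_ns = clock_period_ns - setup_wns_ns
    return 1000.0 / critical_path_ns


CLOCK_PERIOD_NS = 20.0
run2_setup_wns = {"TT": -27.13, "SS": -70.58, "FF": -10.18}  # from the timing table

for corner, wns in run2_setup_wns.items():
    path_ns = CLOCK_PERIOD_NS - wns
    fmax = max_frequency_mhz(CLOCK_PERIOD_NS, wns)
    print(f"{corner}: critical path ~{path_ns:.0f} ns, ~{fmax:.0f} MHz")
```
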
### Power (TT corner)
| Component | Power (W) |
|-----------|-----------|
| Internal | 0.0554 |
| Switching | 0.0273 |
| Leakage | ~0.000002 (0.002 mW) |
| **Total** | **0.0827** |

### Key Observations
1. Disabling heuristic diode insertion fixed the routing congestion failure from Run 1
2. 658 antenna violations remain — iterative antenna repair was not sufficient. May need to re-enable heuristic insertion with a higher threshold or use `DIODE_ON_PORTS`
3. Setup timing is severely violated — critical path is ~47 ns at TT, far from 20 ns target
4. This run used the **unpipelined** RTL (synthesis reused from Run 1 which predated the CN pipeline split)
5. Next run should re-synthesize with pipelined CN update RTL to see if timing improves

## Run 3: `pipelined_pnr` (Mar 1, 2026) — FAILED
- **RTL**: Pipelined CN update (CN_STAGE1 + CN_STAGE2)
- **Config**: `CLOCK_PERIOD=20` (50 MHz), `SYNTH_STRATEGY=AREA 0`, `RUN_HEURISTIC_DIODE_INSERTION=false`, `RUN_ANTENNA_REPAIR=true`
- **Die area**: 2800 x 1760 µm (4.93 mm²)
- **Failure**: `GRT-0118` routing congestion during iterative antenna repair (step 36), after 13+ hours of repair loops
- **Notes**: Iterative antenna repair kept inserting diodes and re-routing until congestion became too high. Same root cause as Run 1 but via different mechanism.

## Run 3b: `pipelined_synth` (Feb 28, 2026) — STILL RUNNING
- **RTL**: Pipelined CN update
- **Config**: `SYNTH_STRATEGY=AREA 2` — synthesis only
- **Status**: ABC pass 2 (tech mapping) running 20+ hours. `AREA 2` is far too aggressive for this design size. **Do not use AREA 2 for this design.**

## Run 4: `pipelined_noantenna` (Mar 2, 2026) — COMPLETED (timing violations)
- **RTL**: Pipelined CN update (CN_STAGE1 + CN_STAGE2)
- **Config**: `CLOCK_PERIOD=20` (50 MHz), `SYNTH_STRATEGY=AREA 0`, `RUN_HEURISTIC_DIODE_INSERTION=false`, `RUN_ANTENNA_REPAIR=false`
- **Die area**: 2800 x 1760 µm (4.93 mm²)
- **Result**: All 69 steps completed. GDS generated. Deferred timing errors. No antenna repair attempted.

### Physical Results
| Metric | Result |
|--------|--------|
| Magic DRC | **Clean** |
| KLayout DRC | **Clean** |
| LVS | **Clean** (0 errors, 0 unmatched) |
| XOR (Magic vs KLayout) | **Clean** |
| Illegal overlap | **Clean** |
| Antenna violating nets | 1,707 (no repair attempted) |
| Antenna violating pins | 3,319 (no repair attempted) |

### Area & Utilization
| Metric | Value |
|--------|-------|
| Die area | 4,928,000 µm² (4.93 mm²) |
| Instance count | 183,774 |
| Instance area | 1,351,790 µm² (1.35 mm²) |
| Core utilization | 27.9% |

### Timing (post-route, CLOCK_PERIOD = 20 ns / 50 MHz target)
| Corner | Setup WNS (ns) | Setup TNS (ns) | Hold WNS (ns) | Hold TNS (ns) |
|--------|----------------|-----------------|----------------|----------------|
| nom_tt_025C_1v80 | **-28.86** | -348.0 | -0.08 | -0.15 |
| nom_ss_100C_1v60 | **-74.22** | -20,536.0 | -0.07 | -0.07 |
| nom_ff_n40C_1v95 | **-11.04** | -93.8 | -0.12 | -2.15 |
| min_tt_025C_1v80 | -28.39 | -251.0 | 0 | 0 |
| max_tt_025C_1v80 | -29.36 | -725.1 | -0.24 | -2.15 |

### Estimated Max Frequency
- **TT corner**: Critical path ~49 ns → **~20 MHz**
- **SS corner**: Critical path ~94 ns → **~11 MHz**
- **FF corner**: Critical path ~31 ns → **~32 MHz**

### Power (TT corner)
| Metric | Value |
|--------|-------|
| **Total** | **0.0858 W** |

### Key Observations
1. Pipelined CN update did NOT improve timing — TT WNS is -28.86 ns vs -27.13 ns (unpipelined Run 2). Slightly worse, possibly due to AREA 0 vs AREA 2 synth strategy difference.
2. Hold violations are much smaller than Run 2 (-0.08 vs -0.32 ns), nearly clean.
3. Antenna violations increased to 1,707 nets (vs 658 in Run 2) without any repair — AREA 0 produces a less antenna-friendly netlist.
4. The critical path is still ~47-49 ns, suggesting the bottleneck is NOT the CN update pipeline stage but something else (likely the large mux/barrel shifter or belief update logic).
5. `SYNTH_STRATEGY=AREA 2` takes 20+ hours for ABC tech mapping on this design — **never use it**. `AREA 0` completed in reasonable time.

## Summary Table

| Run | RTL | Synth | Antenna | Status | TT Setup WNS | Max Freq (TT) |
|-----|-----|-------|---------|--------|-------------|---------------|
| 1 | Unpipelined | AREA 2 | Heuristic 110µm | **FAILED** (congestion) | — | — |
| 2 | Unpipelined | AREA 2 | Iterative | **COMPLETED** | -27.13 ns | ~21 MHz |
| 3 | Pipelined | AREA 0 | Iterative | **FAILED** (congestion) | — | — |
| 3b | Pipelined | AREA 2 | — (synth only) | Still running (20+ hrs) | — | — |
| 4 | Pipelined | AREA 0 | None | **COMPLETED** | -28.86 ns | ~20 MHz |

## Critical Path Analysis (from Run 4, pipelined_noantenna)

### Path Summary
| Item | Value |
|------|-------|
| Startpoint | `u_core.beliefs[0][5]` (beliefs register, bit 5 of element 0) |
| Endpoint | `syndrome_weight[7]` (MSB of syndrome weight counter) |
| RTL location | `SYNDROME` state in `ldpc_decoder_core.sv`, lines 363-385 |
| Slack | **-28.859 ns** (VIOLATED) |
| Total combinational delay | **47.67 ns** |
| Logic levels | **222** (171 XOR/XNOR + 51 adder/mux) |
| Logic vs wire delay | 99.7% logic / 0.3% wire |

All 8 worst setup violators fan out from `beliefs[0][5]` to `syndrome_weight[7:0]`.

### What the Critical Path Computes

The `SYNDROME` state computes the full syndrome check in a **single clock cycle**:

1. **Parity computation** (171 XOR levels, 33.9 ns): XOR the sign bits of all beliefs connected to each check node. There are 7 rows x 32 z-elements = 224 parity bits, each combining up to 3 columns and reading from the 256 belief sign bits.
2. **Population count** (51 adder levels, 13.6 ns): Sum all 224 parity results into an 8-bit `syndrome_cnt`.

The `syndrome_cnt = syndrome_cnt + 1` accumulation pattern creates a carry chain dependency that serializes everything.

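The cost of that serialization can be illustrated by counting operator levels. This is a rough sketch under an idealized model in which every two-input operation counts as one level; real gate depth after synthesis differs (each popcount level is a multi-bit adder), and the function names are illustrative only.

```python
def serial_depth(n_inputs: int) -> int:
    """Levels when inputs are folded one at a time into a running result."""
    return max(n_inputs - 1, 0)


def tree_depth(n_inputs: int) -> int:
    """Levels when inputs are combined pairwise in a balanced tree."""
    # (n - 1).bit_length() equals ceil(log2(n)) for n > 1, without float rounding.
    return (n_inputs - 1).bit_length() if n_inputs > 1 else 0


N_PARITY = 224  # 7 rows x 32 z-elements, from the text above

print("serial accumulation:", serial_depth(N_PARITY), "levels")
print("balanced tree:      ", tree_depth(N_PARITY), "levels")
```

The gap between the two counts is the reason a balanced reduction is attractive for the popcount stage.
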
### Delay Breakdown
| Segment | Delay (ns) | Cells | Description |
|---------|-----------|-------|-------------|
| Source CLK-to-Q | 0.795 | 1 (dfxtp_4) | beliefs[0][5] register output |
| Parity XOR chain | 33.888 | 171 (xor2/xnor2) | XOR reduction across belief sign bits |
| Popcount adder tree | 13.634 | 51 (and/or/aoi/oai) | 224-bit popcount to 8-bit count |
| State MUX | 0.148 | 1 (mux2_1) | FSM output mux |
| Wire (interconnect) | 0.149 | — | 0.3% of total — negligible |
| **Total** | **48.614** | **222 levels** | |

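As a sanity check on the breakdown, the segment delays can be summed and expressed as shares of the total. A small sketch using only the numbers from the table; the dictionary keys are illustrative labels.

```python
# Delays in ns, copied from the breakdown table.
segments_ns = {
    "clk_to_q": 0.795,
    "parity_xor_chain": 33.888,
    "popcount_adders": 13.634,
    "state_mux": 0.148,
    "wire": 0.149,
}

total_ns = sum(segments_ns.values())
# Combinational logic only: everything except the launch flop and the wires.
logic_ns = total_ns - segments_ns["clk_to_q"] - segments_ns["wire"]

for name, delay in segments_ns.items():
    print(f"{name:18s} {delay:7.3f} ns  {delay / total_ns:6.1%}")
print(f"{'total':18s} {total_ns:7.3f} ns")
print(f"{'combinational':18s} {logic_ns:7.3f} ns")
```

The combinational subtotal corresponds to the 47.67 ns quoted in the path summary, while the 48.614 ns total also includes the clock-to-Q and wire delays.
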
### Proposed Fix: 2-3 Stage Syndrome Pipeline

**SYNDROME_S1** (cycle 1, ~16 ns): Compute all 224 parity bits in parallel. Each parity is only 2-3 XOR operations deep (one per connected column). Register the 224-bit `parity_vec`.

**SYNDROME_S2** (cycle 2, ~14 ns): Popcount the 224-bit parity vector via balanced adder tree. Register the 8-bit `syndrome_weight` and `syndrome_ok` flag.

**SYNDROME_DONE** (cycle 3): Already exists — reads `syndrome_ok`.

**Estimated post-fix critical path**: ~14-16 ns (comfortably under 20 ns / 50 MHz).
**Latency impact**: +1-2 cycles per iteration (negligible at 30 iterations).

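The two proposed stages can be modelled behaviourally to make the split concrete. This is a sketch only: the check connectivity below is random and hypothetical (it is not the decoder's real H matrix), and the function names simply mirror the state names above.

```python
import random

Z = 32        # z-elements per row, from the text
ROWS = 7      # check-node rows, from the text
N_BITS = 256  # belief sign bits, from the text


def make_hypothetical_checks(seed: int = 0) -> list[list[int]]:
    """Random connectivity, 3 inputs per check. NOT the decoder's real H matrix."""
    rng = random.Random(seed)
    return [rng.sample(range(N_BITS), 3) for _ in range(ROWS * Z)]


def syndrome_s1(sign_bits, checks):
    """Stage 1: each parity bit is an independent XOR of its own inputs."""
    parity_vec = []
    for inputs in checks:
        bit = 0
        for idx in inputs:
            bit ^= sign_bits[idx]
        parity_vec.append(bit)
    return parity_vec  # in hardware this vector would be registered


def syndrome_s2(parity_vec):
    """Stage 2: popcount of the registered parity vector."""
    weight = sum(parity_vec)
    return weight, weight == 0


checks = make_hypothetical_checks()
signs = [0] * N_BITS
signs[5] = 1  # flip a single sign bit
weight, ok = syndrome_s2(syndrome_s1(signs, checks))
print(f"syndrome_weight={weight}, syndrome_ok={ok}")
```

Because no parity bit depends on another, stage 1 has no long chain to serialize, and stage 2 reduces to a plain sum that a synthesis tool can map to a balanced tree.
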
### Secondary Violations

Wishbone address input (`wb_adr_i`) has -2.47 ns setup violation. Fixable by registering the address at the decoder boundary.

## Next Steps
- Implement syndrome pipeline (SYNDROME_S1 + SYNDROME_S2) to cut critical path from ~49 ns to ~16 ns
- Register Wishbone address input to fix secondary violation
- Re-synthesize with AREA 0 and run PnR to verify timing improvement
- Consider increasing die area for antenna repair headroom
- Consider `SYNTH_STRATEGY=AREA 1` as middle ground between AREA 0 and AREA 2
1282
docs/plans/2026-02-25-chipfoundry-contest-impl.md
Normal file
File diff suppressed because it is too large
288
model/gen_firmware_vectors.py
Normal file
@@ -0,0 +1,288 @@
#!/usr/bin/env python3
"""
Generate test vector files for cocotb and firmware from the Python model output.

Reads data/test_vectors.json and produces:
    1. chip_ignite/verilog/dv/cocotb/ldpc_tests/test_data.py (cocotb Python module)
    2. chip_ignite/firmware/ldpc_demo/test_vectors.h (C header for PicoRV32)

LLR packing format (matches wishbone_interface.sv):
    Each 32-bit word holds 5 LLRs, 6 bits each, in two's complement.
    Word[i] bits [5:0] = LLR[5*i+0]
    Word[i] bits [11:6] = LLR[5*i+1]
    Word[i] bits [17:12] = LLR[5*i+2]
    Word[i] bits [23:18] = LLR[5*i+3]
    Word[i] bits [29:24] = LLR[5*i+4]
    52 words cover 260 LLRs (256 used, last 4 are zero-padded).
"""

import json
import os
import sys

# Paths relative to this script's directory
SCRIPT_DIR = os.path.dirname(os.path.abspath(__file__))
PROJECT_DIR = os.path.dirname(SCRIPT_DIR)
INPUT_FILE = os.path.join(PROJECT_DIR, 'data', 'test_vectors.json')
COCOTB_OUTPUT = os.path.join(PROJECT_DIR, 'chip_ignite', 'verilog', 'dv', 'cocotb',
                             'ldpc_tests', 'test_data.py')
FIRMWARE_OUTPUT = os.path.join(PROJECT_DIR, 'chip_ignite', 'firmware', 'ldpc_demo',
                               'test_vectors.h')

Q_BITS = 6
LLRS_PER_WORD = 5
N_LLR = 256
N_WORDS = (N_LLR + LLRS_PER_WORD - 1) // LLRS_PER_WORD  # 52
K = 32


def signed_to_twos_complement(val, bits=Q_BITS):
    """Convert signed integer to two's complement unsigned representation."""
    if val < 0:
        return val + (1 << bits)
    return val & ((1 << bits) - 1)


def pack_llr_words(llr_quantized):
    """
    Pack 256 signed LLRs into 52 uint32 words.

    Each word contains 5 LLRs, 6 bits each:
        bits[5:0] = LLR[5*word + 0]
        bits[11:6] = LLR[5*word + 1]
        bits[17:12] = LLR[5*word + 2]
        bits[23:18] = LLR[5*word + 3]
        bits[29:24] = LLR[5*word + 4]
    """
    # Pad to 260 entries (52 * 5)
    padded = list(llr_quantized) + [0] * (N_WORDS * LLRS_PER_WORD - N_LLR)

    words = []
    for w in range(N_WORDS):
        word = 0
        for p in range(LLRS_PER_WORD):
            llr_idx = w * LLRS_PER_WORD + p
            tc = signed_to_twos_complement(padded[llr_idx])
            word |= (tc & 0x3F) << (p * Q_BITS)
        words.append(word)
    return words


def bits_to_uint32(bits):
    """Convert a list of 32 binary values to a single uint32 (bit 0 = LSB)."""
    val = 0
    for i, b in enumerate(bits):
        if b:
            val |= (1 << i)
    return val


def generate_cocotb_test_data(vectors):
    """Generate Python module for cocotb tests."""
    lines = []
    lines.append('"""')
    lines.append('Auto-generated test vector data for LDPC decoder cocotb tests.')
    lines.append('Generated by model/gen_firmware_vectors.py')
    lines.append('')
    lines.append('LLR packing: 5 LLRs per 32-bit word, 6 bits each (two\'s complement)')
    lines.append(' Word bits [5:0] = LLR[5*i+0]')
    lines.append(' Word bits [11:6] = LLR[5*i+1]')
    lines.append(' Word bits [17:12] = LLR[5*i+2]')
    lines.append(' Word bits [23:18] = LLR[5*i+3]')
    lines.append(' Word bits [29:24] = LLR[5*i+4]')
    lines.append('"""')
    lines.append('')
    lines.append(f'# Number of test vectors')
    lines.append(f'NUM_VECTORS = {len(vectors)}')
    lines.append(f'LLR_WORDS_PER_VECTOR = {N_WORDS}')
    lines.append('')
    lines.append('# Wishbone register offsets (byte-addressed)')
    lines.append('REG_CTRL = 0x00')
    lines.append('REG_STATUS = 0x04')
    lines.append('REG_LLR_BASE = 0x10 # 52 words: 0x10, 0x14, ..., 0xDC')
    lines.append('REG_DECODED = 0x50')
    lines.append('REG_VERSION = 0x54')
    lines.append('')
    lines.append('')
    lines.append('TEST_VECTORS = [')

    for vec in vectors:
        llr_words = pack_llr_words(vec['llr_quantized'])
        decoded_word = bits_to_uint32(vec['decoded_bits'])

        lines.append(' {')
        lines.append(f' \'index\': {vec["index"]},')

        # Format LLR words as hex, 8 per line
        lines.append(f' \'llr_words\': [')
        for chunk_start in range(0, len(llr_words), 8):
            chunk = llr_words[chunk_start:chunk_start + 8]
            hex_str = ', '.join(f'0x{w:08X}' for w in chunk)
            comma = ',' if chunk_start + 8 < len(llr_words) else ''
            lines.append(f' {hex_str}{comma}')
        lines.append(f' ],')

        lines.append(f' \'decoded_word\': 0x{decoded_word:08X},')
        lines.append(f' \'info_bits\': {vec["info_bits"]},')
        lines.append(f' \'converged\': {vec["converged"]},')
        lines.append(f' \'iterations\': {vec["iterations"]},')
        lines.append(f' \'syndrome_weight\': {vec["syndrome_weight"]},')
        lines.append(f' \'bit_errors\': {vec["bit_errors"]},')
        lines.append(' },')

    lines.append(']')
    lines.append('')
    lines.append('')
    lines.append('def get_converged_vectors():')
    lines.append(' """Return only vectors that converged (for positive testing)."""')
    lines.append(' return [v for v in TEST_VECTORS if v[\'converged\']]')
    lines.append('')
    lines.append('')
    lines.append('def get_failed_vectors():')
    lines.append(' """Return only vectors that did not converge (for negative testing)."""')
    lines.append(' return [v for v in TEST_VECTORS if not v[\'converged\']]')
    lines.append('')

    return '\n'.join(lines)


def generate_firmware_header(vectors):
    """Generate C header for PicoRV32 firmware."""
    lines = []
    lines.append('/*')
    lines.append(' * Auto-generated test vectors for LDPC decoder firmware')
    lines.append(' * Generated by model/gen_firmware_vectors.py')
    lines.append(' *')
    lines.append(' * LLR packing: 5 LLRs per 32-bit word, 6 bits each (two\'s complement)')
    lines.append(' * Word bits [5:0] = LLR[5*i+0]')
    lines.append(' * Word bits [11:6] = LLR[5*i+1]')
    lines.append(' * Word bits [17:12] = LLR[5*i+2]')
    lines.append(' * Word bits [23:18] = LLR[5*i+3]')
    lines.append(' * Word bits [29:24] = LLR[5*i+4]')
    lines.append(' */')
    lines.append('')
    lines.append('#ifndef TEST_VECTORS_H')
    lines.append('#define TEST_VECTORS_H')
    lines.append('')
    lines.append('#include <stdint.h>')
    lines.append('')
    lines.append(f'#define NUM_TEST_VECTORS {len(vectors)}')
    lines.append(f'#define LLR_WORDS_PER_VECTOR {N_WORDS}')
    lines.append('')

    # Generate per-vector arrays
    for vec in vectors:
        idx = vec['index']
        llr_words = pack_llr_words(vec['llr_quantized'])
        decoded_word = bits_to_uint32(vec['decoded_bits'])

        lines.append(f'/* Vector {idx}: converged={vec["converged"]}, '
                     f'iterations={vec["iterations"]}, '
                     f'syndrome_weight={vec["syndrome_weight"]}, '
                     f'bit_errors={vec["bit_errors"]} */')
        lines.append(f'static const uint32_t tv{idx}_llr[{N_WORDS}] = {{')
        for chunk_start in range(0, len(llr_words), 4):
            chunk = llr_words[chunk_start:chunk_start + 4]
            hex_str = ', '.join(f'0x{w:08X}' for w in chunk)
            comma = ',' if chunk_start + 4 < len(llr_words) else ''
            lines.append(f' {hex_str}{comma}')
        lines.append('};')
        lines.append(f'static const uint32_t tv{idx}_decoded = 0x{decoded_word:08X};')
        lines.append(f'static const int tv{idx}_converged = {1 if vec["converged"] else 0};')
        lines.append(f'static const int tv{idx}_iterations = {vec["iterations"]};')
        lines.append(f'static const int tv{idx}_syndrome_weight = {vec["syndrome_weight"]};')
        lines.append('')

    # Generate array-of-pointers for easy iteration
    lines.append('/* Array of LLR pointers for iteration */')
    lines.append(f'static const uint32_t * const tv_llr[NUM_TEST_VECTORS] = {{')
    for i, vec in enumerate(vectors):
        comma = ',' if i < len(vectors) - 1 else ''
        lines.append(f' tv{vec["index"]}_llr{comma}')
    lines.append('};')
    lines.append('')

    lines.append(f'static const uint32_t tv_decoded[NUM_TEST_VECTORS] = {{')
    for i, vec in enumerate(vectors):
        decoded_word = bits_to_uint32(vec['decoded_bits'])
        comma = ',' if i < len(vectors) - 1 else ''
        lines.append(f' 0x{decoded_word:08X}{comma} /* tv{vec["index"]} */')
    lines.append('};')
    lines.append('')

    lines.append(f'static const int tv_converged[NUM_TEST_VECTORS] = {{')
    vals = ', '.join(str(1 if v['converged'] else 0) for v in vectors)
    lines.append(f' {vals}')
    lines.append('};')
    lines.append('')

    lines.append(f'static const int tv_iterations[NUM_TEST_VECTORS] = {{')
    vals = ', '.join(str(v['iterations']) for v in vectors)
    lines.append(f' {vals}')
    lines.append('};')
    lines.append('')

    lines.append(f'static const int tv_syndrome_weight[NUM_TEST_VECTORS] = {{')
    vals = ', '.join(str(v['syndrome_weight']) for v in vectors)
    lines.append(f' {vals}')
    lines.append('};')
    lines.append('')

    lines.append('#endif /* TEST_VECTORS_H */')
    lines.append('')

    return '\n'.join(lines)


def main():
    # Load test vectors
    print(f'Reading {INPUT_FILE}...')
    with open(INPUT_FILE) as f:
        vectors = json.load(f)
    print(f' Loaded {len(vectors)} vectors')
    converged = sum(1 for v in vectors if v['converged'])
    print(f' Converged: {converged}/{len(vectors)}')

    # Generate cocotb test data
    cocotb_content = generate_cocotb_test_data(vectors)
    os.makedirs(os.path.dirname(COCOTB_OUTPUT), exist_ok=True)
    with open(COCOTB_OUTPUT, 'w') as f:
        f.write(cocotb_content)
    print(f' Wrote {COCOTB_OUTPUT}')

    # Generate firmware header
    firmware_content = generate_firmware_header(vectors)
    os.makedirs(os.path.dirname(FIRMWARE_OUTPUT), exist_ok=True)
    with open(FIRMWARE_OUTPUT, 'w') as f:
        f.write(firmware_content)
    print(f' Wrote {FIRMWARE_OUTPUT}')

    # Verify: check roundtrip of LLR packing
    print('\nVerifying LLR packing roundtrip...')
    for vec in vectors:
        llr_q = vec['llr_quantized']
        words = pack_llr_words(llr_q)
        # Unpack and compare
        for w_idx, word in enumerate(words):
            for p in range(LLRS_PER_WORD):
                llr_idx = w_idx * LLRS_PER_WORD + p
                if llr_idx >= N_LLR:
                    break
                tc_val = (word >> (p * Q_BITS)) & 0x3F
                # Convert back to signed
                if tc_val >= 32:
                    signed_val = tc_val - 64
                else:
                    signed_val = tc_val
                expected = llr_q[llr_idx]
                assert signed_val == expected, (
                    f'Vec {vec["index"]}, LLR[{llr_idx}]: '
                    f'packed={signed_val}, expected={expected}'
                )
    print(' LLR packing roundtrip OK for all vectors')

    print('\nDone.')


if __name__ == '__main__':
    main()
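The packing layout used by `pack_llr_words` and its roundtrip check can be shown on a single word. This is a standalone sketch: `pack_word` and `unpack_word` are illustrative helpers, not functions from the script, and they assume the same 6-bit, 5-per-word layout described in the docstring.

```python
Q_BITS = 6          # bits per LLR, as in the script
LLRS_PER_WORD = 5   # LLRs per 32-bit word, as in the script
MASK = (1 << Q_BITS) - 1


def pack_word(llrs):
    """Pack up to 5 signed 6-bit LLRs into one word, LLR 0 in bits [5:0]."""
    word = 0
    for p, val in enumerate(llrs):
        # Masking a negative Python int yields its two's complement field.
        word |= (val & MASK) << (p * Q_BITS)
    return word


def unpack_word(word):
    """Recover the 5 signed LLRs from one packed word."""
    llrs = []
    for p in range(LLRS_PER_WORD):
        field = (word >> (p * Q_BITS)) & MASK
        # A set top bit marks a negative value in two's complement.
        is_negative = field >> (Q_BITS - 1)
        llrs.append(field - (1 << Q_BITS) if is_negative else field)
    return llrs


sample = [-1, 0, 1, -32, 31]
word = pack_word(sample)
print(f"0x{word:08X}", unpack_word(word))
```

Bits [31:30] of each word stay zero, since five 6-bit fields occupy only 30 bits.
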
188
model/gen_verilator_vectors.py
Normal file
@@ -0,0 +1,188 @@
#!/usr/bin/env python3
"""
Generate hex files for Verilator $readmemh from Python model test vectors.

Reads data/test_vectors.json and produces:
    tb/vectors/llr_words.hex - LLR data packed as 32-bit hex words
    tb/vectors/expected.hex - Expected decode results
    tb/vectors/num_vectors.txt - Vector count

LLR packing format (matches wishbone_interface.sv):
    Each 32-bit word holds 5 LLRs, 6 bits each, in two's complement.
    Word[i] bits [5:0] = LLR[5*i+0]
    Word[i] bits [11:6] = LLR[5*i+1]
    Word[i] bits [17:12] = LLR[5*i+2]
    Word[i] bits [23:18] = LLR[5*i+3]
    Word[i] bits [29:24] = LLR[5*i+4]
    52 words cover 260 LLRs (256 used, last 4 are zero-padded).

Expected output format (per vector, 4 lines):
    Line 0: decoded_word (32-bit hex, info bits packed LSB-first)
    Line 1: converged (00000000 or 00000001)
    Line 2: iterations (32-bit hex)
    Line 3: syndrome_weight (32-bit hex)
"""

import json
import os
import sys

# Paths relative to this script's directory
SCRIPT_DIR = os.path.dirname(os.path.abspath(__file__))
PROJECT_DIR = os.path.dirname(SCRIPT_DIR)
INPUT_FILE = os.path.join(PROJECT_DIR, 'data', 'test_vectors.json')
OUTPUT_DIR = os.path.join(PROJECT_DIR, 'tb', 'vectors')

Q_BITS = 6
LLRS_PER_WORD = 5
N_LLR = 256
N_WORDS = (N_LLR + LLRS_PER_WORD - 1) // LLRS_PER_WORD  # 52
K = 32

LINES_PER_EXPECTED = 4  # decoded_word, converged, iterations, syndrome_weight


def signed_to_twos_complement(val, bits=Q_BITS):
    """Convert signed integer to two's complement unsigned representation."""
    if val < 0:
        return val + (1 << bits)
    return val & ((1 << bits) - 1)


def pack_llr_words(llr_quantized):
    """
    Pack 256 signed LLRs into 52 uint32 words.

    Each word contains 5 LLRs, 6 bits each:
        bits[5:0] = LLR[5*word + 0]
        bits[11:6] = LLR[5*word + 1]
        bits[17:12] = LLR[5*word + 2]
        bits[23:18] = LLR[5*word + 3]
        bits[29:24] = LLR[5*word + 4]
    """
    # Pad to 260 entries (52 * 5)
    padded = list(llr_quantized) + [0] * (N_WORDS * LLRS_PER_WORD - N_LLR)

    words = []
    for w in range(N_WORDS):
        word = 0
        for p in range(LLRS_PER_WORD):
            llr_idx = w * LLRS_PER_WORD + p
            tc = signed_to_twos_complement(padded[llr_idx])
            word |= (tc & 0x3F) << (p * Q_BITS)
        words.append(word)
    return words


def bits_to_uint32(bits):
    """Convert a list of 32 binary values to a single uint32 (bit 0 = LSB)."""
    val = 0
    for i, b in enumerate(bits):
        if b:
            val |= (1 << i)
    return val


def main():
    # Load test vectors
    print(f'Reading {INPUT_FILE}...')
    with open(INPUT_FILE) as f:
        vectors = json.load(f)
    num_vectors = len(vectors)
    converged_count = sum(1 for v in vectors if v['converged'])
    print(f' Loaded {num_vectors} vectors ({converged_count} converged, '
          f'{num_vectors - converged_count} non-converged)')

    # Create output directory
    os.makedirs(OUTPUT_DIR, exist_ok=True)

    # =========================================================================
    # Generate llr_words.hex
    # =========================================================================
    # Format: one 32-bit hex word per line, 52 words per vector
    # Total lines = 52 * num_vectors
    llr_lines = []
    for vec in vectors:
        llr_words = pack_llr_words(vec['llr_quantized'])
        assert len(llr_words) == N_WORDS
        for word in llr_words:
            llr_lines.append(f'{word:08X}')

    llr_path = os.path.join(OUTPUT_DIR, 'llr_words.hex')
    with open(llr_path, 'w') as f:
        f.write('\n'.join(llr_lines) + '\n')
    print(f' Wrote {llr_path} ({len(llr_lines)} lines, {N_WORDS} words/vector)')

    # =========================================================================
    # Generate expected.hex
    # =========================================================================
    # Format: 4 lines per vector (all 32-bit hex)
    # Line 0: decoded_word (info bits packed LSB-first)
    # Line 1: converged (00000000 or 00000001)
    # Line 2: iterations
    # Line 3: syndrome_weight
    expected_lines = []
    for vec in vectors:
        decoded_word = bits_to_uint32(vec['decoded_bits'])
        converged = 1 if vec['converged'] else 0
        iterations = vec['iterations']
        syndrome_weight = vec['syndrome_weight']

        expected_lines.append(f'{decoded_word:08X}')
        expected_lines.append(f'{converged:08X}')
        expected_lines.append(f'{iterations:08X}')
        expected_lines.append(f'{syndrome_weight:08X}')

    expected_path = os.path.join(OUTPUT_DIR, 'expected.hex')
    with open(expected_path, 'w') as f:
        f.write('\n'.join(expected_lines) + '\n')
    print(f' Wrote {expected_path} ({len(expected_lines)} lines, '
          f'{LINES_PER_EXPECTED} lines/vector)')

    # =========================================================================
    # Generate num_vectors.txt
    # =========================================================================
    num_path = os.path.join(OUTPUT_DIR, 'num_vectors.txt')
    with open(num_path, 'w') as f:
        f.write(f'{num_vectors}\n')
    print(f' Wrote {num_path} ({num_vectors})')

    # =========================================================================
    # Verify LLR packing roundtrip
    # =========================================================================
    print('\nVerifying LLR packing roundtrip...')
    for vec in vectors:
        llr_q = vec['llr_quantized']
        words = pack_llr_words(llr_q)
        for w_idx, word in enumerate(words):
            for p in range(LLRS_PER_WORD):
                llr_idx = w_idx * LLRS_PER_WORD + p
                if llr_idx >= N_LLR:
                    break
                tc_val = (word >> (p * Q_BITS)) & 0x3F
                # Convert back to signed
                if tc_val >= 32:
                    signed_val = tc_val - 64
                else:
                    signed_val = tc_val
                expected = llr_q[llr_idx]
                assert signed_val == expected, (
                    f'Vec {vec["index"]}, LLR[{llr_idx}]: '
                    f'packed={signed_val}, expected={expected}'
                )
    print(' LLR packing roundtrip OK for all vectors')

    # Print summary of expected results
    print('\nExpected results summary:')
    for vec in vectors:
        decoded_word = bits_to_uint32(vec['decoded_bits'])
        print(f' Vec {vec["index"]:2d}: decoded=0x{decoded_word:08X}, '
              f'converged={vec["converged"]}, '
              f'iter={vec["iterations"]}, '
              f'syn_wt={vec["syndrome_weight"]}')

    print('\nDone.')


if __name__ == '__main__':
    main()
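The four-lines-per-vector layout of `expected.hex` is easy to read back for inspection. The following is a hedged sketch of such a reader; `parse_expected_hex` is a hypothetical helper, not part of the repository, and it assumes the file contains only 32-bit hex words, one per line.

```python
LINES_PER_EXPECTED = 4  # decoded_word, converged, iterations, syndrome_weight


def parse_expected_hex(text):
    """Turn the contents of expected.hex into one dict per test vector."""
    words = [int(token, 16) for token in text.split()]
    if len(words) % LINES_PER_EXPECTED:
        raise ValueError("expected.hex must hold 4 words per vector")
    records = []
    for i in range(0, len(words), LINES_PER_EXPECTED):
        decoded, converged, iterations, syndrome_weight = words[i:i + LINES_PER_EXPECTED]
        records.append({
            "decoded_word": decoded,
            "converged": bool(converged),
            "iterations": iterations,
            "syndrome_weight": syndrome_weight,
        })
    return records


# Made-up sample contents, for illustration only.
sample_text = "DEADBEEF\n00000001\n00000003\n00000000\n"
records = parse_expected_hex(sample_text)
print(records)
```
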
@@ -397,14 +397,26 @@ def run_ber_simulation(lam_s_db_range, lam_b=0.1, n_frames=1000, max_iter=30):
 def generate_test_vectors(n_vectors=10, lam_s=2.0, lam_b=0.1, max_iter=30):
-    """Generate test vectors for RTL verification."""
+    """
+    Generate test vectors for RTL verification.
+
+    Uses a mix of signal levels to ensure we get both converged and
+    non-converged vectors. First half uses high SNR (lam_s * 3) for
+    reliable convergence, then uses the specified lam_s for realistic
+    channel conditions.
+    """
     H = build_full_h_matrix()
     vectors = []

+    # Use high SNR for first half to guarantee converged vectors
+    n_high_snr = n_vectors // 2
+    lam_schedule = [lam_s * 3.0] * n_high_snr + [lam_s] * (n_vectors - n_high_snr)
+
     for i in range(n_vectors):
         info = np.random.randint(0, 2, K)
         codeword = ldpc_encode(info, H)
-        llr_float, photons = poisson_channel(codeword, lam_s, lam_b)
+        cur_lam_s = lam_schedule[i]
+        llr_float, photons = poisson_channel(codeword, cur_lam_s, lam_b)
         llr_q = quantize_llr(llr_float)
         decoded, converged, iters, syn_wt = decode_layered_min_sum(llr_q, max_iter)

@@ -420,10 +432,11 @@ def generate_test_vectors(n_vectors=10, lam_s=2.0, lam_b=0.1, max_iter=30):
             'iterations': iters,
             'syndrome_weight': syn_wt,
             'bit_errors': int(np.sum(decoded != info)),
+            'lam_s': cur_lam_s,
         }
         vectors.append(vec)
         status = "PASS" if np.array_equal(decoded, info) else f"FAIL ({vec['bit_errors']} errs)"
-        print(f"  Vector {i}: {status} (iter={iters}, converged={converged})")
+        print(f"  Vector {i}: {status} (lam_s={cur_lam_s:.1f}, iter={iters}, converged={converged})")

     return vectors
||||
|
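The signal-level schedule introduced in `generate_test_vectors` can be examined in isolation. The sketch below pulls the two schedule lines into a standalone helper; `build_lam_schedule` and its `boost` parameter are hypothetical names used only for illustration.

```python
def build_lam_schedule(n_vectors, lam_s, boost=3.0):
    """Return one signal level per vector: boosted first half, nominal rest."""
    n_high_snr = n_vectors // 2
    return [lam_s * boost] * n_high_snr + [lam_s] * (n_vectors - n_high_snr)


schedule = build_lam_schedule(10, 2.0)
print(schedule)
```

Because of the integer division, an odd vector count gives the extra vector to the nominal level, and a single-vector run contains no boosted vector at all.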
||||
|
||||
@@ -30,8 +30,8 @@ module ldpc_decoder_core #(
|
||||
input logic early_term_en,
|
||||
input logic [4:0] max_iter,
|
||||
|
||||
// Channel LLRs (loaded before start)
|
||||
input logic signed [Q-1:0] llr_in [N],
|
||||
// Channel LLRs (loaded before start) - packed vector for Yosys compatibility
|
||||
input logic [N*Q-1:0] llr_in,
|
||||
|
||||
// Status
|
||||
output logic busy,
|
||||
@@ -112,13 +112,14 @@ module ldpc_decoder_core #(
|
||||
// Decoder FSM
|
||||
// =========================================================================
|
||||
|
||||
typedef enum logic [2:0] {
|
||||
typedef enum logic [3:0] {
|
||||
IDLE,
|
||||
INIT, // Initialize beliefs from channel LLRs, zero messages
|
||||
LAYER_READ, // Read Z beliefs for each of DC columns in current row
|
||||
CN_UPDATE, // Run min-sum CN update on gathered messages
|
||||
LAYER_WRITE, // Write updated beliefs and new CN->VN messages
|
||||
SYNDROME, // Check syndrome after full iteration
|
||||
SYNDROME_DONE, // Read registered syndrome result
|
||||
DONE
|
||||
} state_t;
|
||||
|
||||
@@ -167,7 +168,8 @@ module ldpc_decoder_core #(
|
||||
state_next = LAYER_READ; // next row
|
||||
end
|
||||
end
|
||||
SYNDROME: begin
|
||||
SYNDROME: state_next = SYNDROME_DONE;
|
||||
SYNDROME_DONE: begin
|
||||
if (syndrome_ok && early_term_en)
|
||||
state_next = DONE;
|
||||
else if (iter_cnt >= effective_max_iter)
|
||||
@@ -192,28 +194,34 @@ module ldpc_decoder_core #(
|
||||
converged <= 1'b0;
|
||||
iter_used <= '0;
|
||||
syndrome_weight <= '0;
|
||||
syndrome_ok <= 1'b0;
|
||||
end else begin
|
||||
case (state)
|
||||
IDLE: begin
|
||||
iter_cnt <= '0;
|
||||
row_idx <= '0;
|
||||
col_idx <= '0;
|
||||
converged <= 1'b0;
|
||||
// Note: converged, iter_used, syndrome_weight, decoded_bits
|
||||
// are NOT cleared here so the host can read them after decode.
|
||||
// They are cleared in INIT when a new decode starts.
|
||||
end
|
||||
|
||||
INIT: begin
|
||||
// Initialize beliefs from channel LLRs
|
||||
// Use blocking assignment for array in loop (Verilator requirement)
|
||||
for (int j = 0; j < N; j++) begin
|
||||
beliefs[j] <= llr_in[j];
|
||||
beliefs[j] = $signed(llr_in[j*Q +: Q]);
|
||||
end
|
||||
// Zero all CN->VN messages
|
||||
for (int r = 0; r < M_BASE; r++)
|
||||
for (int c = 0; c < N_BASE; c++)
|
||||
for (int z = 0; z < Z; z++)
|
||||
msg_cn2vn[r][c][z] <= '0;
|
||||
msg_cn2vn[r][c][z] = {Q{1'b0}};
|
||||
row_idx <= '0;
|
||||
col_idx <= '0;
|
||||
iter_cnt <= '0;
|
||||
converged <= 1'b0;
|
||||
syndrome_ok <= 1'b0;
|
||||
end
|
||||
|
||||
LAYER_READ: begin
|
||||
@@ -221,18 +229,30 @@ module ldpc_decoder_core #(
|
||||
// VN->CN = belief - old CN->VN message
|
||||
// (belief already contains the sum of ALL CN->VN messages,
|
||||
// so subtracting the current row's message gives the extrinsic)
|
||||
for (int z = 0; z < Z; z++) begin
|
||||
int bit_idx;
|
||||
int shifted_z;
|
||||
logic signed [Q-1:0] old_msg;
|
||||
logic signed [Q-1:0] belief_val;
|
||||
// Skip unconnected columns (H_BASE == -1)
|
||||
if (H_BASE[row_idx][col_idx] >= 0) begin
|
||||
for (int z = 0; z < Z; z++) begin
|
||||
int bit_idx;
|
||||
int shifted_z;
|
||||
logic signed [Q-1:0] old_msg;
|
||||
logic signed [Q-1:0] belief_val;
|
||||
|
||||
shifted_z = (z + H_BASE[row_idx][col_idx]) % Z;
|
||||
bit_idx = int'(col_idx) * Z + shifted_z;
|
||||
old_msg = msg_cn2vn[row_idx][col_idx][z];
|
||||
belief_val = beliefs[bit_idx];
|
||||
shifted_z = (z + H_BASE[row_idx][col_idx]) % Z;
|
||||
bit_idx = int'(col_idx) * Z + shifted_z;
|
||||
// On first iteration (iter_cnt==0), old messages are zero
|
||||
// since no CN update has run yet. Use 0 directly rather
|
||||
// than reading msg_cn2vn, which may not be reliably zeroed
|
||||
// by the INIT state in all simulation tools.
|
||||
old_msg = (iter_cnt == 0) ?
|
||||
{Q{1'b0}} : msg_cn2vn[row_idx][col_idx][z];
|
||||
belief_val = beliefs[bit_idx];
|
||||
|
||||
vn_to_cn[col_idx][z] <= sat_sub(belief_val, old_msg);
|
||||
vn_to_cn[col_idx][z] <= sat_sub(belief_val, old_msg);
|
||||
end
|
||||
end else begin
|
||||
// Unconnected: set to +MAX so magnitude doesn't affect min-sum
|
||||
for (int z = 0; z < Z; z++)
|
||||
vn_to_cn[col_idx][z] <= {1'b0, {(Q-1){1'b1}}}; // +31
|
||||
end
|
||||
|
||||
if (col_idx == N_BASE - 1)
|
||||
@@ -245,13 +265,12 @@ module ldpc_decoder_core #(
|
||||
// Min-sum update for all Z check nodes in current row
|
||||
// Each CN has DC=8 incoming messages (one per column)
|
||||
for (int z = 0; z < Z; z++) begin
|
||||
// Gather DC messages for check node z
|
||||
logic signed [Q-1:0] msgs [DC];
|
||||
for (int d = 0; d < DC; d++)
|
||||
msgs[d] = vn_to_cn[d][z];
|
||||
|
||||
// Min-sum: find min1, min2, sign product, min1 index
|
||||
cn_min_sum(msgs, cn_to_vn[0][z], cn_to_vn[1][z],
|
||||
// Min-sum: pass individual VN->CN messages directly
|
||||
cn_min_sum(vn_to_cn[0][z], vn_to_cn[1][z],
|
||||
vn_to_cn[2][z], vn_to_cn[3][z],
|
||||
vn_to_cn[4][z], vn_to_cn[5][z],
|
||||
vn_to_cn[6][z], vn_to_cn[7][z],
|
||||
cn_to_vn[0][z], cn_to_vn[1][z],
|
||||
cn_to_vn[2][z], cn_to_vn[3][z],
|
||||
cn_to_vn[4][z], cn_to_vn[5][z],
|
||||
cn_to_vn[6][z], cn_to_vn[7][z]);
|
||||
@@ -261,22 +280,25 @@ module ldpc_decoder_core #(
|
||||
|
||||
LAYER_WRITE: begin
|
||||
// Write back: update beliefs and store new CN->VN messages
|
||||
for (int z = 0; z < Z; z++) begin
|
||||
int bit_idx;
|
||||
int shifted_z;
|
||||
logic signed [Q-1:0] new_msg;
|
||||
logic signed [Q-1:0] old_extrinsic;
|
||||
// Skip unconnected columns (H_BASE == -1)
|
||||
if (H_BASE[row_idx][col_idx] >= 0) begin
|
||||
for (int z = 0; z < Z; z++) begin
|
||||
int bit_idx;
|
||||
int shifted_z;
|
||||
logic signed [Q-1:0] new_msg;
|
||||
logic signed [Q-1:0] old_extrinsic;
|
||||
|
||||
shifted_z = (z + H_BASE[row_idx][col_idx]) % Z;
|
||||
bit_idx = int'(col_idx) * Z + shifted_z;
|
||||
new_msg = cn_to_vn[col_idx][z];
|
||||
old_extrinsic = vn_to_cn[col_idx][z];
|
||||
shifted_z = (z + H_BASE[row_idx][col_idx]) % Z;
|
||||
bit_idx = int'(col_idx) * Z + shifted_z;
|
||||
new_msg = cn_to_vn[col_idx][z];
|
||||
old_extrinsic = vn_to_cn[col_idx][z];
|
||||
|
||||
// belief = extrinsic (VN->CN) + new CN->VN message
|
||||
beliefs[bit_idx] <= sat_add(old_extrinsic, new_msg);
|
||||
// belief = extrinsic (VN->CN) + new CN->VN message
|
||||
beliefs[bit_idx] <= sat_add(old_extrinsic, new_msg);
|
||||
|
||||
// Store new message for next iteration
|
||||
msg_cn2vn[row_idx][col_idx][z] <= new_msg;
|
||||
// Store new message for next iteration
|
||||
msg_cn2vn[row_idx][col_idx][z] <= new_msg;
|
||||
end
|
||||
end
|
||||
|
||||
if (col_idx == N_BASE - 1) begin
|
||||
@@ -292,25 +314,32 @@ module ldpc_decoder_core #(
|
||||
|
||||
SYNDROME: begin
|
||||
// Check H * c_hat == 0 (compute syndrome weight)
|
||||
// Only include connected columns (H_BASE >= 0)
|
||||
syndrome_cnt = '0;
|
||||
for (int r = 0; r < M_BASE; r++) begin
|
||||
for (int z = 0; z < Z; z++) begin
|
||||
logic parity;
|
||||
parity = 1'b0;
|
||||
for (int c = 0; c < N_BASE; c++) begin
|
||||
int shifted_z, bit_idx;
|
||||
shifted_z = (z + H_BASE[r][c]) % Z;
|
||||
bit_idx = c * Z + shifted_z;
|
||||
parity = parity ^ beliefs[bit_idx][Q-1]; // sign bit = hard decision
|
||||
if (H_BASE[r][c] >= 0) begin
|
||||
int shifted_z, bit_idx;
|
||||
shifted_z = (z + H_BASE[r][c]) % Z;
|
||||
bit_idx = c * Z + shifted_z;
|
||||
parity = parity ^ beliefs[bit_idx][Q-1];
|
||||
end
|
||||
end
|
||||
if (parity) syndrome_cnt = syndrome_cnt + 1;
|
||||
end
|
||||
end
|
||||
syndrome_weight <= syndrome_cnt;
|
||||
syndrome_ok = (syndrome_cnt == 0);
|
||||
syndrome_ok <= (syndrome_cnt == 0);
|
||||
|
||||
iter_cnt <= iter_cnt + 1;
|
||||
iter_used <= iter_cnt + 1;
|
||||
end
|
||||
|
||||
SYNDROME_DONE: begin
|
||||
// Check registered syndrome result
|
||||
if (syndrome_ok) converged <= 1'b1;
|
||||
end
|
||||
|
||||
@@ -327,13 +356,15 @@ module ldpc_decoder_core #(
|
||||
// Min-sum CN update function
|
||||
// =========================================================================
|
||||
|
||||
// Offset min-sum for DC=8 inputs
|
||||
// Offset min-sum for DC=8 inputs (individual ports for iverilog compatibility)
|
||||
// For each output j: sign = XOR of all other signs, magnitude = min of all other magnitudes - offset
|
||||
task automatic cn_min_sum(
|
||||
input logic signed [Q-1:0] in [DC],
|
||||
input logic signed [Q-1:0] in0, in1, in2, in3,
|
||||
in4, in5, in6, in7,
|
||||
output logic signed [Q-1:0] out0, out1, out2, out3,
|
||||
out4, out5, out6, out7
|
||||
);
|
||||
logic signed [Q-1:0] ins [DC];
|
||||
logic [DC-1:0] signs;
|
||||
logic [Q-2:0] mags [DC];
|
||||
logic sign_xor;
|
||||
@@ -341,11 +372,23 @@ module ldpc_decoder_core #(
|
||||
int min1_idx;
|
||||
logic signed [Q-1:0] outs [DC];
|
||||
|
||||
ins[0] = in0; ins[1] = in1; ins[2] = in2; ins[3] = in3;
|
||||
ins[4] = in4; ins[5] = in5; ins[6] = in6; ins[7] = in7;
|
||||
|
||||
// Extract signs and magnitudes
|
||||
// Note: -32 (100000) has magnitude 32 which overflows 5-bit field to 0.
|
||||
// Clamp to 31 (max representable magnitude) to avoid corruption.
|
||||
sign_xor = 1'b0;
|
||||
for (int i = 0; i < DC; i++) begin
|
||||
signs[i] = in[i][Q-1];
|
||||
mags[i] = in[i][Q-1] ? (~in[i][Q-2:0] + 1) : in[i][Q-2:0];
|
||||
logic [Q-1:0] abs_val;
|
||||
signs[i] = ins[i][Q-1];
|
||||
if (ins[i][Q-1]) begin
|
||||
abs_val = ~ins[i] + 1'b1;
|
||||
// If abs_val overflowed (input was most negative), clamp
|
||||
mags[i] = (abs_val[Q-1]) ? {(Q-1){1'b1}} : abs_val[Q-2:0];
|
||||
end else begin
|
||||
mags[i] = ins[i][Q-2:0];
|
||||
end
|
||||
sign_xor = sign_xor ^ signs[i];
|
||||
end
|
||||
|
||||
@@ -381,26 +424,32 @@ module ldpc_decoder_core #(
|
||||
endtask
|
||||
|
||||
// =========================================================================
|
||||
// Saturating arithmetic helpers
|
||||
// Saturating arithmetic helpers (Yosys-compatible: no return, no complex concat)
|
||||
// =========================================================================
|
||||
|
||||
function automatic logic signed [Q-1:0] sat_add(
|
||||
logic signed [Q-1:0] a, logic signed [Q-1:0] b
|
||||
input logic signed [Q-1:0] a,
|
||||
input logic signed [Q-1:0] b
|
||||
);
|
||||
logic signed [Q:0] sum;
|
||||
sum = {a[Q-1], a} + {b[Q-1], b}; // sign-extend and add
|
||||
if (sum > $signed({1'b0, {(Q-1){1'b1}}}))
|
||||
return {1'b0, {(Q-1){1'b1}}}; // +max
|
||||
else if (sum < $signed({1'b1, {(Q-1){1'b0}}}))
|
||||
return {1'b1, {(Q-1){1'b0}}}; // -max
|
||||
else
|
||||
return sum[Q-1:0];
|
||||
reg signed [Q:0] sum;
|
||||
begin
|
||||
sum = {a[Q-1], a} + {b[Q-1], b};
|
||||
if (!sum[Q] && sum[Q-1]) // positive overflow
|
||||
sat_add = {1'b0, {(Q-1){1'b1}}};
|
||||
else if (sum[Q] && !sum[Q-1]) // negative overflow
|
||||
sat_add = {1'b1, {(Q-1){1'b0}}};
|
||||
else
|
||||
sat_add = sum[Q-1:0];
|
||||
end
|
||||
endfunction
|
||||
|
||||
function automatic logic signed [Q-1:0] sat_sub(
|
||||
logic signed [Q-1:0] a, logic signed [Q-1:0] b
|
||||
input logic signed [Q-1:0] a,
|
||||
input logic signed [Q-1:0] b
|
||||
);
|
||||
return sat_add(a, -b);
|
||||
begin
|
||||
sat_sub = sat_add(a, -b);
|
||||
end
|
||||
endfunction
|
||||
|
||||
endmodule
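The rewritten `sat_add` detects overflow from the top two bits of a (Q+1)-bit sum instead of comparing against constants, and `cn_min_sum` now clamps the magnitude of the most negative input. The following Python sketch models both behaviours for Q = 6 so they can be checked exhaustively. It is a reference model written for this note, not code from the repository, and `to_signed` and `clamped_magnitude` are assumed helper names. The `sat_sub` model assumes the RTL's `-b` is evaluated at Q bits.

```python
Q = 6
MAX_POS = (1 << (Q - 1)) - 1   # +31
MIN_NEG = -(1 << (Q - 1))      # -32


def to_signed(value, bits=Q):
    """Interpret the low `bits` bits of value as two's complement."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >= (1 << (bits - 1)) else value


def sat_add(a, b):
    """Model of the RTL sat_add: inspect bits Q and Q-1 of the wide sum."""
    s = (a + b) & ((1 << (Q + 1)) - 1)   # (Q+1)-bit sum of sign-extended inputs
    top = (s >> Q) & 1
    nxt = (s >> (Q - 1)) & 1
    if not top and nxt:                  # positive overflow
        return MAX_POS
    if top and not nxt:                  # negative overflow
        return MIN_NEG
    return to_signed(s, Q)


def sat_sub(a, b):
    """Model of sat_sub = sat_add(a, -b), with -b wrapped to Q bits (assumption)."""
    return sat_add(a, to_signed(-b, Q))


def clamped_magnitude(x):
    """Magnitude as extracted in cn_min_sum: |-32| is clamped to 31."""
    return min(abs(x), MAX_POS)
```

Under the stated assumption, negating -32 at Q bits wraps back to -32, so this model gives `sat_sub(0, -32) == -32` rather than +31. Whether that corner case can be reached depends on the range of the stored CN->VN messages, which this excerpt does not show.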
|
||||
|
||||
@@ -15,7 +15,6 @@ module ldpc_decoder_top #(
|
||||
parameter Z = 32, // lifting factor
|
||||
parameter N = N_BASE * Z, // codeword length = 256
|
||||
parameter K = Z, // info bits = 32 (rate 1/8)
|
||||
parameter M = M_BASE * Z, // parity checks = 224
|
||||
parameter Q = 6, // LLR quantization bits (signed)
|
||||
parameter MAX_ITER = 30, // maximum decoding iterations
|
||||
parameter DC = 8, // check node degree (= N_BASE for regular)
|
||||
@@ -50,8 +49,8 @@ module ldpc_decoder_top #(
|
||||
logic stat_converged;
|
||||
logic [4:0] stat_iter_used;
|
||||
|
||||
// LLR input buffer (written by host before starting decode)
|
||||
logic signed [Q-1:0] llr_input [N];
|
||||
// LLR input buffer (packed vector for Yosys compatibility)
|
||||
logic [N*Q-1:0] llr_input_flat;
|
||||
|
||||
// Decoded output
|
||||
logic [K-1:0] decoded_bits;
|
||||
@@ -75,7 +74,7 @@ module ldpc_decoder_top #(
|
||||
.stat_busy (stat_busy),
|
||||
.stat_converged (stat_converged),
|
||||
.stat_iter_used (stat_iter_used),
|
||||
.llr_input (llr_input),
|
||||
.llr_input (llr_input_flat),
|
||||
.decoded_bits (decoded_bits),
|
||||
.syndrome_weight(syndrome_weight),
|
||||
.irq_o (irq_o)
|
||||
@@ -99,7 +98,7 @@ module ldpc_decoder_top #(
|
||||
.start (ctrl_start),
|
||||
.early_term_en (ctrl_early_term),
|
||||
.max_iter (ctrl_max_iter),
|
||||
.llr_in (llr_input),
|
||||
.llr_in (llr_input_flat),
|
||||
.busy (stat_busy),
|
||||
.converged (stat_converged),
|
||||
.iter_used (stat_iter_used),
|
||||
|
||||
@@ -32,7 +32,7 @@ module wishbone_interface #(
|
||||
input logic stat_busy,
|
||||
input logic stat_converged,
|
||||
input logic [4:0] stat_iter_used,
|
||||
output logic signed [Q-1:0] llr_input [N],
|
||||
output logic [N*Q-1:0] llr_input, // packed LLR vector
|
||||
input logic [K-1:0] decoded_bits,
|
||||
input logic [7:0] syndrome_weight,
|
||||
|
||||
@@ -40,7 +40,7 @@ module wishbone_interface #(
|
||||
output logic irq_o
|
||||
);
|
||||
|
||||
localparam VERSION_ID = 32'hLD01_0001; // LDPC v0.1 build 1
|
||||
localparam VERSION_ID = 32'h1D01_0001; // LDPC v0.1 build 1
|
||||
|
||||
// Wishbone handshake: ack on valid cycle
|
||||
logic wb_valid;
|
||||
@@ -99,7 +99,7 @@ module wishbone_interface #(
|
||||
int llr_idx;
|
||||
llr_idx = word_idx * 5 + p;
|
||||
if (llr_idx < N)
|
||||
llr_input[llr_idx] <= wb_dat_i[p*Q +: Q];
|
||||
llr_input[llr_idx*Q +: Q] <= wb_dat_i[p*Q +: Q];
|
||||
end
|
||||
end
|
||||
end
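With the LLR buffer changed from an unpacked array to a packed `N*Q`-bit vector, LLR `j` occupies bits `[j*Q +: Q]`, and the core recovers the signed value with `$signed(llr_in[j*Q +: Q])`. The Python sketch below models the write and read sides using an integer as the flat vector. `write_llr_word` and `read_llr` are illustrative names, and N = 256, Q = 6 are taken from the parameter comments in the top-level module.

```python
N = 256             # codeword length
Q = 6               # LLR width in bits
LLRS_PER_WORD = 5   # LLRs carried by one 32-bit Wishbone word
MASK = (1 << Q) - 1


def write_llr_word(flat, word_idx, wb_dat):
    """Return the flat vector after one Wishbone LLR word write."""
    for p in range(LLRS_PER_WORD):
        llr_idx = word_idx * LLRS_PER_WORD + p
        if llr_idx < N:                       # last word only fills one slot
            field = (wb_dat >> (p * Q)) & MASK
            flat &= ~(MASK << (llr_idx * Q))  # clear the old slot
            flat |= field << (llr_idx * Q)    # llr_input[llr_idx*Q +: Q]
    return flat


def read_llr(flat, j):
    """Model of $signed(llr_in[j*Q +: Q])."""
    raw = (flat >> (j * Q)) & MASK
    return raw - (1 << Q) if raw >= (1 << (Q - 1)) else raw
```

Word 51 covers LLR indices 255 to 259, so only its lowest field lands in the buffer; the bounds check keeps the remaining four fields from writing past bit `N*Q-1`.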
|
||||
|
||||
40
tb/Makefile
Normal file
@@ -0,0 +1,40 @@
RTL_DIR = ../rtl
RTL_FILES = $(RTL_DIR)/ldpc_decoder_top.sv \
            $(RTL_DIR)/ldpc_decoder_core.sv \
            $(RTL_DIR)/wishbone_interface.sv

.PHONY: lint sim sim_vectors clean

lint:
	verilator --lint-only -Wall \
		-Wno-WIDTHEXPAND -Wno-WIDTHTRUNC -Wno-CASEINCOMPLETE \
		-Wno-BLKSEQ -Wno-BLKLOOPINIT -Wno-UNUSEDSIGNAL -Wno-UNUSEDPARAM \
		--unroll-count 1024 \
		$(RTL_FILES) --top-module ldpc_decoder_top

sim: obj_dir/Vtb_ldpc_decoder
	./obj_dir/Vtb_ldpc_decoder

obj_dir/Vtb_ldpc_decoder: tb_ldpc_decoder.sv $(RTL_FILES)
	verilator --binary --timing --trace \
		-o Vtb_ldpc_decoder \
		-Wno-WIDTHEXPAND -Wno-WIDTHTRUNC -Wno-CASEINCOMPLETE \
		-Wno-BLKSEQ -Wno-BLKLOOPINIT -Wno-UNUSEDSIGNAL -Wno-UNUSEDPARAM \
		--unroll-count 1024 \
		tb_ldpc_decoder.sv $(RTL_FILES) \
		--top-module tb_ldpc_decoder

sim_vectors: obj_dir/Vtb_ldpc_vectors
	./obj_dir/Vtb_ldpc_vectors

obj_dir/Vtb_ldpc_vectors: tb_ldpc_vectors.sv $(RTL_FILES)
	verilator --binary --timing --trace \
		-o Vtb_ldpc_vectors \
		-Wno-WIDTHEXPAND -Wno-WIDTHTRUNC -Wno-CASEINCOMPLETE \
		-Wno-BLKSEQ -Wno-BLKLOOPINIT -Wno-UNUSEDSIGNAL -Wno-UNUSEDPARAM \
		--unroll-count 1024 \
		tb_ldpc_vectors.sv $(RTL_FILES) \
		--top-module tb_ldpc_vectors

clean:
	rm -rf obj_dir *.vcd
245
tb/tb_ldpc_decoder.sv
Normal file
@@ -0,0 +1,245 @@
|
||||
// Standalone Verilator testbench for LDPC decoder
|
||||
// Drives ldpc_decoder_top directly through its Wishbone port (no Caravel dependency)
|
||||
//
|
||||
// Test 1: Read VERSION register (expect 0x1D010001)
|
||||
// Test 2: Decode all-zero codeword with strong +31 LLRs
|
||||
|
||||
`timescale 1ns / 1ps
|
||||
|
||||
module tb_ldpc_decoder;
|
||||
|
||||
// =========================================================================
|
||||
// Clock and reset
|
||||
// =========================================================================
|
||||
|
||||
logic clk;
|
||||
logic rst_n;
|
||||
logic wb_cyc_i;
|
||||
logic wb_stb_i;
|
||||
logic wb_we_i;
|
||||
logic [7:0] wb_adr_i;
|
||||
logic [31:0] wb_dat_i;
|
||||
logic [31:0] wb_dat_o;
|
||||
logic wb_ack_o;
|
||||
logic irq_o;
|
||||
|
||||
// 50 MHz clock (20 ns period)
|
||||
initial clk = 0;
|
||||
always #10 clk = ~clk;
|
||||
|
||||
// =========================================================================
|
||||
// DUT instantiation
|
||||
// =========================================================================
|
||||
|
||||
ldpc_decoder_top dut (
|
||||
.clk (clk),
|
||||
.rst_n (rst_n),
|
||||
.wb_cyc_i (wb_cyc_i),
|
||||
.wb_stb_i (wb_stb_i),
|
||||
.wb_we_i (wb_we_i),
|
||||
.wb_adr_i (wb_adr_i),
|
||||
.wb_dat_i (wb_dat_i),
|
||||
.wb_dat_o (wb_dat_o),
|
||||
.wb_ack_o (wb_ack_o),
|
||||
.irq_o (irq_o)
|
||||
);
|
||||
|
||||
// =========================================================================
|
||||
// VCD dump
|
||||
// =========================================================================
|
||||
|
||||
initial begin
|
||||
$dumpfile("tb_ldpc_decoder.vcd");
|
||||
$dumpvars(0, tb_ldpc_decoder);
|
||||
end
|
||||
|
||||
// =========================================================================
|
||||
// Watchdog timeout
|
||||
// =========================================================================
|
||||
|
||||
int cycle_cnt;
|
||||
|
||||
initial begin
|
||||
cycle_cnt = 0;
|
||||
forever begin
|
||||
@(posedge clk);
|
||||
cycle_cnt++;
|
||||
if (cycle_cnt > 100000) begin
|
||||
$display("TIMEOUT: exceeded 100000 cycles");
|
||||
$finish;
|
||||
end
|
||||
end
|
||||
end
|
||||
|
||||
// =========================================================================
|
||||
// Wishbone tasks
|
||||
// =========================================================================
|
||||
|
||||
task automatic wb_write(input logic [7:0] addr, input logic [31:0] data);
|
||||
@(posedge clk);
|
||||
wb_cyc_i = 1'b1;
|
||||
wb_stb_i = 1'b1;
|
||||
wb_we_i = 1'b1;
|
||||
wb_adr_i = addr;
|
||||
wb_dat_i = data;
|
||||
|
||||
// Wait for ack
|
||||
do begin
|
||||
@(posedge clk);
|
||||
end while (!wb_ack_o);
|
||||
|
||||
// Deassert
|
||||
wb_cyc_i = 1'b0;
|
||||
wb_stb_i = 1'b0;
|
||||
wb_we_i = 1'b0;
|
||||
endtask
|
||||
|
||||
task automatic wb_read(input logic [7:0] addr, output logic [31:0] data);
|
||||
@(posedge clk);
|
||||
wb_cyc_i = 1'b1;
|
||||
wb_stb_i = 1'b1;
|
||||
wb_we_i = 1'b0;
|
||||
wb_adr_i = addr;
|
||||
|
||||
// Wait for ack
|
||||
do begin
|
||||
@(posedge clk);
|
||||
end while (!wb_ack_o);
|
||||
|
||||
data = wb_dat_o;
|
||||
|
||||
// Deassert
|
||||
wb_cyc_i = 1'b0;
|
||||
wb_stb_i = 1'b0;
|
||||
endtask
|
||||
|
||||
// =========================================================================
|
||||
// Test variables
|
||||
// =========================================================================
|
||||
|
||||
int pass_cnt;
|
||||
int fail_cnt;
|
||||
logic [31:0] rd_data;
|
||||
|
||||
// =========================================================================
|
||||
// Main test sequence
|
||||
// =========================================================================
|
||||
|
||||
initial begin
|
||||
pass_cnt = 0;
|
||||
fail_cnt = 0;
|
||||
|
||||
// Initialize Wishbone signals
|
||||
wb_cyc_i = 1'b0;
|
||||
wb_stb_i = 1'b0;
|
||||
wb_we_i = 1'b0;
|
||||
wb_adr_i = 8'h00;
|
||||
wb_dat_i = 32'h0;
|
||||
|
||||
// Reset
|
||||
rst_n = 1'b0;
|
||||
repeat (10) @(posedge clk);
|
||||
rst_n = 1'b1;
|
||||
repeat (5) @(posedge clk);
|
||||
|
||||
// =================================================================
|
||||
// TEST 1: Read VERSION register
|
||||
// =================================================================
|
||||
|
||||
$display("[TEST 1] Read VERSION register");
|
||||
wb_read(8'h54, rd_data);
|
||||
|
||||
if (rd_data === 32'h1D01_0001) begin
|
||||
$display(" PASS: VERSION = 0x%08X", rd_data);
|
||||
pass_cnt++;
|
||||
end else begin
|
||||
$display(" FAIL: VERSION = 0x%08X (expected 0x1D010001)", rd_data);
|
||||
fail_cnt++;
|
||||
end
|
||||
|
||||
// =================================================================
|
||||
// TEST 2: Decode clean all-zero codeword
|
||||
// =================================================================
|
||||
|
||||
$display("[TEST 2] Decode clean all-zero codeword");
|
||||
|
||||
// Write 52 LLR words at addresses 0x10..0xDC
|
||||
// Each word = 5x +31 packed: {6'h1F, 6'h1F, 6'h1F, 6'h1F, 6'h1F}
|
||||
// = 0x1F | (0x1F<<6) | (0x1F<<12) | (0x1F<<18) | (0x1F<<24)
|
||||
// = 0x1F7DF7DF
|
||||
begin
|
||||
int i;
|
||||
for (i = 0; i < 52; i++) begin
|
||||
wb_write(8'h10 + i * 4, 32'h1F7D_F7DF);
|
||||
end
|
||||
end
|
||||
|
||||
// Start decode: write CTRL
|
||||
// bit[0]=1 (start), bit[1]=1 (early_term), bits[12:8]=0x1E=30 (max_iter)
|
||||
// 0x00001E03
|
||||
wb_write(8'h00, 32'h0000_1E03);
|
||||
|
||||
// Poll STATUS (addr 0x04) until busy (bit[0]) = 0
|
||||
// Allow a few cycles for busy to assert first
|
||||
repeat (5) @(posedge clk);
|
||||
|
||||
begin
|
||||
int poll_cnt;
|
||||
poll_cnt = 0;
|
||||
do begin
|
||||
wb_read(8'h04, rd_data);
|
||||
poll_cnt++;
|
||||
if (poll_cnt > 10000) begin
|
||||
$display(" FAIL: decoder stuck busy after %0d polls", poll_cnt);
|
||||
fail_cnt++;
|
||||
$display("=== %0d PASSED, %0d FAILED ===", pass_cnt, fail_cnt);
|
||||
$finish;
|
||||
end
|
||||
end while (rd_data[0] == 1'b1);
|
||||
end
|
||||
|
||||
// Check convergence: bit[1] of STATUS
|
||||
if (rd_data[1] == 1'b1) begin
|
||||
$display(" converged=1 (OK)");
|
||||
end else begin
|
||||
$display(" FAIL: converged=0 (expected 1)");
|
||||
fail_cnt++;
|
||||
end
|
||||
|
||||
// Check syndrome weight: bits[23:16] of STATUS
|
||||
if (rd_data[23:16] == 8'd0) begin
|
||||
$display(" syndrome_weight=0 (OK)");
|
||||
end else begin
|
||||
$display(" FAIL: syndrome_weight=%0d (expected 0)", rd_data[23:16]);
|
||||
fail_cnt++;
|
||||
end
|
||||
|
||||
// Check iterations used: bits[12:8] of STATUS
|
||||
$display(" iterations_used=%0d", rd_data[12:8]);
|
||||
|
||||
// Read DECODED register (addr 0x50)
|
||||
wb_read(8'h50, rd_data);
|
||||
|
||||
if (rd_data === 32'h0000_0000) begin
|
||||
$display(" PASS: decoded=0x%08X", rd_data);
|
||||
pass_cnt++;
|
||||
end else begin
|
||||
$display(" FAIL: decoded=0x%08X (expected 0x00000000)", rd_data);
|
||||
fail_cnt++;
|
||||
end
|
||||
|
||||
// =================================================================
|
||||
// Summary
|
||||
// =================================================================
|
||||
|
||||
$display("");
|
||||
if (fail_cnt == 0) begin
|
||||
$display("=== ALL %0d TESTS PASSED ===", pass_cnt);
|
||||
end else begin
|
||||
$display("=== %0d PASSED, %0d FAILED ===", pass_cnt, fail_cnt);
|
||||
end
|
||||
|
||||
$finish;
|
||||
end
|
||||
|
||||
endmodule
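The testbench comments describe the register layout it relies on: CTRL carries start in bit 0, early termination in bit 1 and `max_iter` in bits [12:8]; STATUS carries busy in bit 0, converged in bit 1, iterations used in bits [12:8] and syndrome weight in bits [23:16]; LLR words start at byte address 0x10. A small Python sketch of those encodings, useful for cross-checking constants such as `0x00001E03`, is given below. The helper names are hypothetical, and any register bits not mentioned in the testbench are assumed to be zero.

```python
REG_LLR_BASE = 0x10  # byte address of the first LLR word


def make_ctrl(start, early_term, max_iter):
    """Build a CTRL word from its three documented fields."""
    if not 0 <= max_iter <= 31:
        raise ValueError("max_iter must fit in 5 bits")
    return int(bool(start)) | (int(bool(early_term)) << 1) | (max_iter << 8)


def parse_status(word):
    """Split a STATUS word into the fields the testbench checks."""
    return {
        "busy": word & 0x1,
        "converged": (word >> 1) & 0x1,
        "iter_used": (word >> 8) & 0x1F,
        "syndrome_weight": (word >> 16) & 0xFF,
    }


def llr_word_addr(i):
    """Byte address of LLR word i (word-aligned, 4 bytes apart)."""
    return REG_LLR_BASE + 4 * i


print(hex(make_ctrl(True, True, 30)))
```

For example, `parse_status(0x00000302)` describes an idle decoder that converged after three iterations with zero syndrome weight.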
|
||||
356
tb/tb_ldpc_vectors.sv
Normal file
@@ -0,0 +1,356 @@
|
||||
// Vector-driven Verilator testbench for LDPC decoder
|
||||
// Loads test vectors from hex files generated by model/gen_verilator_vectors.py
|
||||
// Verifies RTL decoder produces bit-exact results matching Python behavioral model
|
||||
//
|
||||
// Files loaded:
|
||||
// vectors/llr_words.hex - 52 words per vector, packed 5x6-bit LLRs
|
||||
// vectors/expected.hex - 4 lines per vector: decoded_word, converged, iterations, syndrome_weight
|
||||
// vectors/num_vectors.txt - single line with vector count (read at generation time)
|
||||
|
||||
`timescale 1ns / 1ps
|
||||
|
||||
module tb_ldpc_vectors;
|
||||
|
||||
// =========================================================================
|
||||
// Parameters
|
||||
// =========================================================================
|
||||
|
||||
localparam int NUM_VECTORS = 20;
|
||||
localparam int LLR_WORDS = 52; // 256 LLRs / 5 per word, rounded up
|
||||
localparam int EXPECTED_LINES = 4; // per vector: decoded, converged, iter, syn_wt
|
||||
|
||||
// Wishbone register addresses (byte-addressed)
|
||||
localparam logic [7:0] REG_CTRL = 8'h00;
|
||||
localparam logic [7:0] REG_STATUS = 8'h04;
|
||||
localparam logic [7:0] REG_LLR_BASE = 8'h10;
|
||||
localparam logic [7:0] REG_DECODED = 8'h50;
|
||||
localparam logic [7:0] REG_VERSION = 8'h54;
|
||||
|
||||
// CTRL register fields
|
||||
localparam int MAX_ITER = 30;
|
||||
|
||||
// =========================================================================
|
||||
// Clock and reset
|
||||
// =========================================================================
|
||||
|
||||
logic clk;
|
||||
logic rst_n;
|
||||
logic wb_cyc_i;
|
||||
logic wb_stb_i;
|
||||
logic wb_we_i;
|
||||
logic [7:0] wb_adr_i;
|
||||
logic [31:0] wb_dat_i;
|
||||
logic [31:0] wb_dat_o;
|
||||
logic wb_ack_o;
|
||||
logic irq_o;
|
||||
|
||||
// 50 MHz clock (20 ns period)
|
||||
initial clk = 0;
|
||||
always #10 clk = ~clk;
|
||||
|
||||
// =========================================================================
|
||||
// DUT instantiation
|
||||
// =========================================================================
|
||||
|
||||
ldpc_decoder_top dut (
|
||||
.clk (clk),
|
||||
.rst_n (rst_n),
|
||||
.wb_cyc_i (wb_cyc_i),
|
||||
.wb_stb_i (wb_stb_i),
|
||||
.wb_we_i (wb_we_i),
|
||||
.wb_adr_i (wb_adr_i),
|
||||
.wb_dat_i (wb_dat_i),
|
||||
.wb_dat_o (wb_dat_o),
|
||||
.wb_ack_o (wb_ack_o),
|
||||
.irq_o (irq_o)
|
||||
);
|
||||
|
||||
// =========================================================================
|
||||
// VCD dump
|
||||
// =========================================================================
|
||||
|
||||
initial begin
|
||||
$dumpfile("tb_ldpc_vectors.vcd");
|
||||
$dumpvars(0, tb_ldpc_vectors);
|
||||
end
|
||||
|
||||
// =========================================================================
|
||||
// Watchdog timeout (generous for 20 vectors * 30 iterations each)
|
||||
// =========================================================================
|
||||
|
||||
int cycle_cnt;
|
||||
|
||||
initial begin
|
||||
cycle_cnt = 0;
|
||||
forever begin
|
||||
@(posedge clk);
|
||||
cycle_cnt++;
|
||||
if (cycle_cnt > 2000000) begin
|
||||
$display("TIMEOUT: exceeded 2000000 cycles");
|
||||
$finish;
|
||||
end
|
||||
end
|
||||
end
|
||||
|
||||
// =========================================================================
|
||||
// Test vector memory
|
||||
// =========================================================================
|
||||
|
||||
// LLR words: 52 words per vector, total 52 * NUM_VECTORS = 1040
|
||||
logic [31:0] llr_mem [LLR_WORDS * NUM_VECTORS];
|
||||
|
||||
// Expected results: 4 words per vector, total 4 * NUM_VECTORS = 80
|
||||
logic [31:0] expected_mem [EXPECTED_LINES * NUM_VECTORS];
|
||||
|
||||
initial begin
|
||||
$readmemh("vectors/llr_words.hex", llr_mem);
|
||||
$readmemh("vectors/expected.hex", expected_mem);
|
||||
end
|
||||
|
||||
// =========================================================================
|
||||
// Wishbone tasks (same as standalone testbench)
|
||||
// =========================================================================
|
||||
|
||||
task automatic wb_write(input logic [7:0] addr, input logic [31:0] data);
|
||||
@(posedge clk);
|
||||
wb_cyc_i = 1'b1;
|
||||
wb_stb_i = 1'b1;
|
||||
wb_we_i = 1'b1;
|
||||
wb_adr_i = addr;
|
||||
wb_dat_i = data;
|
||||
|
||||
// Wait for ack
|
||||
do begin
|
||||
@(posedge clk);
|
||||
end while (!wb_ack_o);
|
||||
|
||||
// Deassert
|
||||
wb_cyc_i = 1'b0;
|
||||
wb_stb_i = 1'b0;
|
||||
wb_we_i = 1'b0;
|
||||
endtask
|
||||
|
||||
task automatic wb_read(input logic [7:0] addr, output logic [31:0] data);
|
||||
@(posedge clk);
|
||||
wb_cyc_i = 1'b1;
|
||||
wb_stb_i = 1'b1;
|
||||
wb_we_i = 1'b0;
|
||||
wb_adr_i = addr;
|
||||
|
||||
// Wait for ack
|
||||
do begin
|
||||
@(posedge clk);
|
||||
end while (!wb_ack_o);
|
||||
|
||||
data = wb_dat_o;
|
||||
|
||||
// Deassert
|
||||
wb_cyc_i = 1'b0;
|
||||
wb_stb_i = 1'b0;
|
||||
endtask
|
||||
|
||||
// =========================================================================
|
||||
// Test variables
|
||||
// =========================================================================
|
||||
|
||||
int pass_cnt;
|
||||
int fail_cnt;
|
||||
int vec_pass; // per-vector pass flag
|
||||
logic [31:0] rd_data;
|
||||
|
||||
// Expected values for current vector
|
||||
logic [31:0] exp_decoded;
|
||||
logic [31:0] exp_converged;
|
||||
logic [31:0] exp_iterations;
|
||||
logic [31:0] exp_syndrome_wt;
|
||||
|
||||
// Actual values from RTL
|
||||
logic [31:0] act_decoded;
|
||||
logic act_converged;
|
||||
logic [4:0] act_iter_used;
|
||||
logic [7:0] act_syndrome_wt;
|
||||
|
||||
// =========================================================================
|
||||
// Main test sequence
|
||||
// =========================================================================
|
||||
|
||||
initial begin
|
||||
pass_cnt = 0;
|
||||
fail_cnt = 0;
|
||||
|
||||
// Initialize Wishbone signals
|
||||
wb_cyc_i = 1'b0;
|
||||
wb_stb_i = 1'b0;
|
||||
wb_we_i = 1'b0;
|
||||
wb_adr_i = 8'h00;
|
||||
wb_dat_i = 32'h0;
|
||||
|
||||
// Reset
|
||||
rst_n = 1'b0;
|
||||
repeat (10) @(posedge clk);
|
||||
rst_n = 1'b1;
|
||||
repeat (5) @(posedge clk);
|
||||
|
||||
// =================================================================
|
||||
// Sanity check: Read VERSION register
|
||||
// =================================================================
|
||||
$display("=== LDPC Vector-Driven Testbench ===");
|
||||
$display("Vectors: %0d, LLR words/vector: %0d", NUM_VECTORS, LLR_WORDS);
|
||||
$display("");
|
||||
|
||||
wb_read(REG_VERSION, rd_data);
|
||||
if (rd_data === 32'h1D01_0001) begin
|
||||
$display("[SANITY] VERSION = 0x%08X (OK)", rd_data);
|
||||
end else begin
|
||||
$display("[SANITY] VERSION = 0x%08X (UNEXPECTED, expected 0x1D010001)", rd_data);
|
||||
end
|
||||
$display("");
|
||||
|
||||
// =================================================================
|
||||
// Process each test vector
|
||||
// =================================================================
|
||||
for (int v = 0; v < NUM_VECTORS; v++) begin
|
||||
vec_pass = 1;
|
||||
|
||||
// Load expected values
|
||||
exp_decoded = expected_mem[v * EXPECTED_LINES + 0];
|
||||
exp_converged = expected_mem[v * EXPECTED_LINES + 1];
|
||||
exp_iterations = expected_mem[v * EXPECTED_LINES + 2];
|
||||
exp_syndrome_wt = expected_mem[v * EXPECTED_LINES + 3];
|
||||
|
||||
$display("[VEC %0d] Expected: decoded=0x%08X, converged=%0d, iter=%0d, syn_wt=%0d",
|
||||
v, exp_decoded, exp_converged[0], exp_iterations, exp_syndrome_wt);
|
||||
|
||||
            // ---------------------------------------------------------
            // Step 1: Write 52 LLR words via Wishbone
            // ---------------------------------------------------------
            for (int w = 0; w < LLR_WORDS; w++) begin
                wb_write(REG_LLR_BASE + w * 4, llr_mem[v * LLR_WORDS + w]);
            end

            // ---------------------------------------------------------
            // Step 2: Start decode
            // CTRL: bit[0]=start, bit[1]=early_term, bits[12:8]=max_iter
            // max_iter=30 -> 0x1E, so CTRL = 0x00001E03
            // ---------------------------------------------------------
            wb_write(REG_CTRL, {19'b0, 5'(MAX_ITER), 6'b0, 1'b1, 1'b1});

            // Wait a few cycles for busy to assert
            repeat (5) @(posedge clk);

            // ---------------------------------------------------------
            // Step 3: Poll STATUS until busy=0
            // ---------------------------------------------------------
            begin
                int poll_cnt;
                poll_cnt = 0;
                do begin
                    wb_read(REG_STATUS, rd_data);
                    poll_cnt++;
                    if (poll_cnt > 50000) begin
                        $display("  FAIL: decoder stuck busy after %0d polls", poll_cnt);
                        fail_cnt++;
                        $display("");
                        $display("=== ABORTED: %0d PASSED, %0d FAILED ===", pass_cnt, fail_cnt);
                        $finish;
                    end
                end while (rd_data[0] == 1'b1);
            end

            // ---------------------------------------------------------
            // Step 4: Read results
            // ---------------------------------------------------------
            // STATUS fields (from last poll read)
            act_converged   = rd_data[1];
            act_iter_used   = rd_data[12:8];
            act_syndrome_wt = rd_data[23:16];

            // Read DECODED register
            wb_read(REG_DECODED, act_decoded);

            $display("  Actual:   decoded=0x%08X, converged=%0d, iter=%0d, syn_wt=%0d",
                     act_decoded, act_converged, act_iter_used, act_syndrome_wt);

            // ---------------------------------------------------------
            // Step 5: Compare results
            // ---------------------------------------------------------

            if (exp_converged[0]) begin
                // CONVERGED vector: decoded_word MUST match (bit-exact)
                if (act_decoded !== exp_decoded) begin
                    $display("  FAIL: decoded mismatch (expected 0x%08X, got 0x%08X)",
                             exp_decoded, act_decoded);
                    vec_pass = 0;
                end

                // Converged: RTL must also report converged
                if (!act_converged) begin
                    $display("  FAIL: RTL did not converge (Python model converged)");
                    vec_pass = 0;
                end

                // Converged: syndrome weight must be 0
                if (act_syndrome_wt !== 8'd0) begin
                    $display("  FAIL: syndrome_weight=%0d (expected 0 for converged)",
                             act_syndrome_wt);
                    vec_pass = 0;
                end

                // Iteration count: informational (allow +/- 2 tolerance)
                if (act_iter_used > exp_iterations[4:0] + 2 ||
                    (exp_iterations[4:0] > 2 && act_iter_used < exp_iterations[4:0] - 2)) begin
                    $display("  NOTE: iteration count differs (expected %0d, got %0d)",
                             exp_iterations, act_iter_used);
                end

            end else begin
                // NON-CONVERGED vector
                // Decoded word comparison is informational only
                if (act_decoded !== exp_decoded) begin
                    $display("  INFO: decoded differs from Python model (expected for non-converged)");
                end

                // Convergence status: RTL should also report non-converged
                if (act_converged) begin
                    // Interesting: RTL converged but Python didn't. Could happen with
                    // fixed-point vs floating-point differences. Report but don't fail.
                    $display("  NOTE: RTL converged but Python model did not");
                end

                // Syndrome weight should be non-zero for non-converged
                if (!act_converged && act_syndrome_wt == 8'd0) begin
                    $display("  FAIL: syndrome_weight=0 but converged=0 (inconsistent)");
                    vec_pass = 0;
                end
            end

            // ---------------------------------------------------------
            // Step 6: Record result
            // ---------------------------------------------------------
            if (vec_pass) begin
                $display("  PASS");
                pass_cnt++;
            end else begin
                $display("  FAIL");
                fail_cnt++;
            end
            $display("");

        end // for each vector

        // =================================================================
        // Summary
        // =================================================================
        $display("=== RESULTS: %0d PASSED, %0d FAILED out of %0d vectors ===",
                 pass_cnt, fail_cnt, NUM_VECTORS);

        if (fail_cnt == 0) begin
            $display("=== ALL VECTORS PASSED ===");
        end else begin
            $display("=== SOME VECTORS FAILED ===");
        end

        $finish;
    end

endmodule
80
tb/vectors/expected.hex
Normal file
@@ -0,0 +1,80 @@
3FD74222
00000001
00000001
00000000
09A5626C
00000001
00000001
00000000
2FFC25FC
00000001
00000001
00000000
5DABF50B
00000001
00000001
00000000
05D8EA33
00000001
00000001
00000000
19AF1473
00000001
00000001
00000000
34D925D3
00000001
00000001
00000000
45C1E650
00000001
00000001
00000000
A4CA7D49
00000001
00000001
00000000
D849EB80
00000001
00000001
00000000
9BCA9A40
00000001
00000001
00000000
79FFC352
00000000
0000001E
00000043
5D2534DC
00000000
0000001E
0000003B
F21718ED
00000000
0000001E
0000003D
7FE0197C
00000000
0000001E
00000041
9E869CC2
00000000
0000001E
0000004B
4E7507D9
00000000
0000001E
00000038
BB5F2BF1
00000000
0000001E
00000033
AA500741
00000000
0000001E
0000004C
F98E6EFE
00000000
0000001E
0000002A
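The 80 lines of `tb/vectors/expected.hex` appear to form 20 records of four 32-bit words each, in the order the testbench indexes them: decoded word, converged flag, iteration count, syndrome weight. The following is a minimal Python sketch for sanity-checking such a file, assuming that four-line layout. The names `ExpectedRecord` and `parse_expected` are illustrative and are not part of the repository.

```python
from typing import List, NamedTuple

# Assumed record size: the testbench reads offsets +0..+3 per vector.
WORDS_PER_VECTOR = 4


class ExpectedRecord(NamedTuple):
    decoded: int      # expected DECODED register value
    converged: bool   # bit 0 of the second word
    iterations: int   # iteration count from the reference model
    syndrome_wt: int  # expected syndrome weight (0 when converged)


def parse_expected(lines: List[str]) -> List[ExpectedRecord]:
    """Group hex words into per-vector records (layout is an assumption)."""
    words = [int(s, 16) for s in (ln.strip() for ln in lines) if s]
    if len(words) % WORDS_PER_VECTOR != 0:
        raise ValueError("word count is not a multiple of the record size")
    records = []
    for i in range(0, len(words), WORDS_PER_VECTOR):
        decoded, conv, iters, syn = words[i:i + WORDS_PER_VECTOR]
        records.append(ExpectedRecord(decoded, bool(conv & 1), iters, syn))
    return records


# Two records copied from the listing: one converged, one not.
sample = [
    "3FD74222", "00000001", "00000001", "00000000",
    "79FFC352", "00000000", "0000001E", "00000043",
]
for rec in parse_expected(sample):
    print(f"decoded=0x{rec.decoded:08X} converged={rec.converged} "
          f"iter={rec.iterations} syn_wt={rec.syndrome_wt}")
```

To check a whole file, pass `open(path).read().splitlines()` to `parse_expected`; a word count that is not a multiple of four raises `ValueError`.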
1040
tb/vectors/llr_words.hex
Normal file
File diff suppressed because it is too large
1
tb/vectors/num_vectors.txt
Normal file
@@ -0,0 +1 @@
20