docs: add frame synchronization section to project report

Adds section 7 covering preamble-less frame sync using syndrome screening, which was missing from the original report.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@@ -242,9 +242,54 @@ Each CN position connects to two adjacent VN positions via component matrices B0

---

## 7. What Needs to Be Built (RTL)
## 7. Frame Synchronization (No Preamble)

### 7.1 Phase 1: Basic Decoder (chipIgnite Target)
A critical receiver problem: the stream of photon counts is continuous. There are no headers, preambles, or synchronization markers. The receiver must find where each 256-bit codeword starts before it can decode anything.

Traditional approaches insert a known preamble sequence before each block. This wastes precious photons at low SNR. Instead, we exploit the LDPC code structure itself for synchronization.

### 7.1 The Insight: Syndrome as a Lock Detector

A valid codeword satisfies all parity checks (syndrome weight = 0). A random 256-bit window from the wrong alignment fails about half of them (expected syndrome weight ~M/2 = 112 out of 224). This large gap between "correct alignment" and "wrong alignment" is a free synchronization signal that needs no preamble overhead.
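
The gap can be checked numerically. The sketch below is a minimal illustration, not the project's model: `H` is a random sparse stand-in for the real parity-check matrix (only the 224 x 256 shape comes from the text), the aligned case uses the all-zero word, which is a codeword of any linear code, and a misaligned window is modeled as uniformly random bits.

```python
import numpy as np

M, N = 224, 256  # number of checks and codeword length quoted in the text

def syndrome_weight(H, bits):
    """Number of unsatisfied parity checks for a hard-decision word."""
    return int(((H @ bits) % 2).sum())

rng = np.random.default_rng(0)

# ASSUMPTION: random sparse stand-in, not the project's actual H matrix.
H = (rng.random((M, N)) < 0.03).astype(np.int64)
H[np.arange(M), rng.integers(0, N, size=M)] = 1  # make sure no check row is empty

# Correct alignment, noiseless: the all-zero word satisfies every check.
aligned = np.zeros(N, dtype=np.int64)
w_aligned = syndrome_weight(H, aligned)

# Wrong alignment, modeled as random bits: each check fails with probability 1/2.
misaligned = rng.integers(0, 2, size=(1000, N))
w_wrong = float(np.mean([syndrome_weight(H, w) for w in misaligned]))

print("aligned weight:", w_aligned)
print("mean misaligned weight:", w_wrong, "expected about", M / 2)
```

With channel noise the aligned weight is no longer exactly zero, which is why the acquisition step uses a threshold rather than an exact-zero test.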

### 7.2 Acquisition Algorithm

```
1. SYNDROME SCREENING (cheap)
   For each of 256 candidate offsets:
     - Extract 256-sample window from stream
     - Hard-decision: positive LLR → 0, negative → 1
     - Compute syndrome weight (XOR the bits selected by each row of H, count the 1s)
     - Cost: ~1/30th of a full decode per candidate

2. FULL DECODE (expensive, but rarely needed)
   For candidates with syndrome weight < 50:
     - Run full iterative min-sum decoding (up to 30 iterations)
     - If converged: candidate is promising

3. CONFIRMATION
     - Decode the next 2 consecutive frames at that offset
     - If both converge: LOCK ACQUIRED
     - Total cost: ~11.5 equivalent decodes (256/30 ≈ 8.5 for screening + 3 full decodes)
```
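
The three stages map onto a short search loop. The following is a sketch under stated assumptions rather than the contents of `frame_sync.py`: `acquire`, `syndrome_weight`, and the `decode` callback (standing in for the min-sum decoder and returning `True` on convergence) are hypothetical names, and `H` is whatever 224 x 256 parity-check matrix the decoder uses.

```python
import numpy as np

def syndrome_weight(H, llr_window):
    """Count unsatisfied checks after hard decision (positive LLR -> 0, negative -> 1)."""
    hard = (np.asarray(llr_window) < 0).astype(np.int64)
    return int(((H @ hard) % 2).sum())

def acquire(llr, H, decode, n=256, threshold=50, confirm=2):
    """Return the first offset that passes screening, decode, and confirmation, else None.

    decode(window) -> bool is a placeholder for the full iterative decoder.
    """
    frames_needed = 1 + confirm
    for offset in range(n):
        if offset + frames_needed * n > len(llr):
            break  # not enough buffered samples to confirm this offset
        # Stage 1: cheap syndrome screening on hard decisions
        if syndrome_weight(H, llr[offset:offset + n]) >= threshold:
            continue
        # Stages 2 and 3: decode this frame, then the next `confirm` frames
        starts = [offset + k * n for k in range(frames_needed)]
        if all(decode(llr[s:s + n]) for s in starts):
            return offset
    return None
```

Because `all` short-circuits, a candidate that fails its first full decode never triggers the two confirmation decodes, which keeps the expensive stage rare.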

### 7.3 Re-Synchronization

If the offset drifts (e.g., clock slip), the receiver first searches locally within +/-16 positions of the last known offset. Only if that fails does it fall back to full acquisition. Local re-sync is fast because it screens only 33 candidates (2 x 16 + 1) instead of 256, roughly an eighth of the screening work.
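
The local search window can be enumerated as follows. This is a sketch: `local_candidates` is a hypothetical helper, the nearest-first ordering is one reasonable choice rather than something the report specifies, and offsets are assumed to wrap modulo the 256-sample frame length.

```python
def local_candidates(last_offset, radius=16, n=256):
    """List candidate offsets within +/-radius of last_offset, nearest first.

    ASSUMPTION: offsets wrap modulo the frame length n.
    """
    candidates = [last_offset % n]
    for d in range(1, radius + 1):
        candidates.append((last_offset + d) % n)
        candidates.append((last_offset - d) % n)
    return candidates

# Example: a last known offset near the start of the frame wraps around.
print(local_candidates(3)[:5])
```

Each candidate would then go through the same syndrome screen used in full acquisition, and the receiver falls back to the full 256-offset search only if none passes.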

### 7.4 RTL Implications

The frame sync logic is simple hardware:
- **Syndrome screening:** reuses the same syndrome checker already in the decoder. Just feed it hard decisions from different offsets.
- **State machine:** ACQUIRE -> LOCKED -> RESYNC (on decode failure) -> local search -> fallback to ACQUIRE
- **No extra memory:** operates on the incoming LLR stream, one window at a time
- **Firmware option:** could also run entirely in PicoRV32 firmware if area is tight, since it's not time-critical (only runs once at startup or after link loss)
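
The state machine in the list above can be written down as a small behavioural model before committing it to RTL or firmware. This is only a sketch: the names `SyncState` and `next_state` and the three boolean inputs are illustrative assumptions, not signals from the actual design.

```python
from enum import Enum, auto

class SyncState(Enum):
    ACQUIRE = auto()  # full search over all candidate offsets
    LOCKED = auto()   # decoding frames at a known offset
    RESYNC = auto()   # local search around the last known offset

def next_state(state, *, lock_found=False, decode_ok=True, local_found=False):
    """Return the next sync state given this step's (hypothetical) status flags."""
    if state is SyncState.ACQUIRE:
        return SyncState.LOCKED if lock_found else SyncState.ACQUIRE
    if state is SyncState.LOCKED:
        return SyncState.LOCKED if decode_ok else SyncState.RESYNC
    # RESYNC: a successful local search re-locks, otherwise fall back to ACQUIRE
    return SyncState.LOCKED if local_found else SyncState.ACQUIRE
```

Keeping the transition function free of side effects makes it straightforward to compare against an RTL simulation state by state.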

---

## 8. What Needs to Be Built (RTL)

### 8.1 Phase 1: Basic Decoder (chipIgnite Target)

This is what goes on the ASIC. Conservative, proven architecture:

@@ -266,7 +311,7 @@ This is what goes on the ASIC. Conservative, proven architecture:

**Critical path:** CN update array (min-find across variable-degree check nodes)

### 7.2 Phase 2: Enhanced Decoder (Future / FPGA Prototype)
### 8.2 Phase 2: Enhanced Decoder (Future / FPGA Prototype)

For an FPGA prototype, we have more flexibility:

@@ -275,7 +320,7 @@ For an FPGA prototype, we have more flexibility:
- **Multiple H-matrices:** Store several base matrices in a small ROM, selectable at runtime
- **SC-LDPC windowed decoder:** Requires more memory (L positions x message storage) but same CN/VN units

### 7.3 Key RTL Design Decisions
### 8.3 Key RTL Design Decisions

**Memory architecture:**
- LLR RAM: single-port is fine (write during load, read during decode)
@@ -293,7 +338,7 @@ For an FPGA prototype, we have more flexibility:

---

## 8. File Map
## 9. File Map

```
ldpc_optical/
@@ -304,7 +349,9 @@ ldpc_optical/
ldpc_analysis.py # Code analysis tools (rate sweep, matrix compare)
density_evolution.py # DE optimizer + matrix construction
sc_ldpc.py # SC-LDPC construction + windowed decoder
frame_sync.py # Preamble-less frame synchronization
test_density_evolution.py # 24 tests for DE/optimization
test_frame_sync.py # Frame sync tests
test_sc_ldpc.py # 9 tests for SC-LDPC
test_ldpc.py # 19 tests for base decoder model
test_ldpc_analysis.py # 18 tests for analysis tools
@@ -321,7 +368,7 @@ ldpc_optical/

---

## 9. Running the Python Model
## 10. Running the Python Model

```bash
# Quick demo: encode, channel, decode at several SNR points
@@ -339,6 +386,11 @@ python3 model/density_evolution.py alpha-sweep
# SC-LDPC threshold + FER comparison
python3 model/sc_ldpc.py full

# Frame synchronization demo
python3 model/frame_sync.py # Quick demo at lam_s=5
python3 model/frame_sync.py --sweep # Acquisition sweep over SNR
python3 model/frame_sync.py --resync-test # Re-sync robustness test

# Generate all plots
python3 model/plot_de_results.py

@@ -348,7 +400,7 @@ python3 -m pytest model/ -v

---

## 10. Next Steps
## 11. Next Steps

1. **RTL implementation** -- Start with `cn_update_array` and `vn_update_array` (most critical blocks), validate against Python bit-exact model
2. **Verilator testbench** -- Use `ldpc_sim.py --gen-vectors` to create golden test vectors
