ldpc_optical/docs/plans/2026-02-23-frame-sync-and-code-analysis-design.md

# Frame Synchronization & Code Analysis Design

## Context

LDPC decoder for photon-starved optical communication (rate 1/8, n=256, k=32, Z=32). The receiver has no frame alignment — it must find codeword boundaries from a continuous stream of soft LLR values. Target operating point: 1-2 photons/slot (lambda_s).

## Goals

1. Prototype frame synchronization in Python (acquisition + re-sync)
2. Validate design decisions with four quantitative analyses:
   - Rate comparison (is 1/8 the right rate?)
   - Base matrix quality (how much performance is left on the table?)
   - Quantization sweep (is 6-bit enough?)
   - Shannon gap (how far from theoretical limits?)

## Frame Synchronization

### Stream Model

Concatenate N encoded codewords into a continuous stream. Generate Poisson channel LLRs for the entire stream. Insert a random unknown offset (0-255 bits) at the start. The sync algorithm sees only the shifted stream.
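
A minimal sketch of this stream model, assuming the offset is modelled by prepending random filler bits (for example the tail of an earlier frame) and that a positive LLR favours bit 0. The names `poisson_llr` and `make_shifted_stream` are hypothetical and not part of `ldpc_sim.py`; random bits stand in for real encoder output.

```python
import numpy as np

FRAME_LEN = 256  # n, codeword length in slots


def poisson_llr(counts, lambda_s, lambda_b):
    """LLR = log P(y | bit 0) / P(y | bit 1); positive favours bit 0."""
    # P(y|0) ~ Poisson(lambda_b), P(y|1) ~ Poisson(lambda_s + lambda_b)
    return lambda_s - counts * np.log1p(lambda_s / lambda_b)


def make_shifted_stream(codewords, lambda_s, lambda_b, rng):
    """Return (stream_llr, true_offset) for a list of 0/1 codeword arrays."""
    offset = int(rng.integers(0, FRAME_LEN))
    # Assumption: the unknown offset is `offset` random filler bits placed
    # in front of the first codeword.
    filler = rng.integers(0, 2, size=offset)
    bits = np.concatenate([filler, *codewords])
    counts = rng.poisson(bits * lambda_s + lambda_b)
    return poisson_llr(counts, lambda_s, lambda_b), offset


rng = np.random.default_rng(0)
# Placeholder codewords: random bits instead of encoder output.
codewords = [rng.integers(0, 2, size=FRAME_LEN) for _ in range(20)]
stream_llr, true_offset = make_shifted_stream(codewords, 1.5, 0.1, rng)
```

With this convention the first codeword starts at index `true_offset`, which is the value the acquisition search is expected to recover.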

### Acquisition Algorithm (Scenario A)

```
def acquire(stream_llr):
    # Try every candidate frame start, 0..255
    for offset in range(256):
        window = stream_llr[offset : offset + 256]
        # Hard decision: positive LLR -> bit 0
        hard_bits = [0 if llr > 0 else 1 for llr in window]
        syn_wt = compute_syndrome_weight(hard_bits)
        # Cheap screen first; run the full decoder only on promising offsets
        if syn_wt < SCREENING_THRESHOLD:
            decoded, converged, _, _ = decode(quantize(window))
            if converged:
                # Confirm: decode next 2 frames at this offset
                if confirm_sync(stream_llr, offset):
                    return offset  # LOCKED
    return SYNC_FAILED
```

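Screening threshold: ~50 unsatisfied checks (out of 224). At a wrong offset the hard bits are effectively random with respect to the code, so about half the checks fail (syndrome weight ~112). At the correct offset and operational SNR the syndrome weight should be much lower.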
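
The margin between the threshold and the wrong-offset syndrome weight can be estimated with a short sketch. It assumes an idealised model in which each of the 224 checks fails independently with probability 0.5 at a wrong offset. Here `compute_syndrome_weight` takes the parity-check matrix `H` explicitly, which is an assumption made for self-containment, and `false_screen_probability` is a hypothetical helper.

```python
import math
import numpy as np


def compute_syndrome_weight(H, hard_bits):
    """Number of unsatisfied parity checks for a 0/1 vector."""
    return int(((H @ np.asarray(hard_bits)) % 2).sum())


def false_screen_probability(n_checks=224, threshold=50):
    """P(Binomial(n_checks, 0.5) < threshold) under the idealised model."""
    hits = sum(math.comb(n_checks, k) for k in range(threshold))
    return hits / 2 ** n_checks


mean = 224 * 0.5             # 112 failed checks on average
std = math.sqrt(224 * 0.25)  # about 7.5
margin_sigmas = (mean - 50) / std
print(f"threshold is {margin_sigmas:.1f} sigma below the wrong-offset mean")
print(f"per-offset false-screen probability: {false_screen_probability():.2e}")
```

Under this model a threshold of 50 sits roughly 8 standard deviations below the wrong-offset mean, so almost no wrong offsets should reach the full decoder. Real checks are not independent, so the simulated false-screen rate is the number to trust.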

### Re-Sync (Scenario C)

During steady-state decoding, monitor syndrome weight. If N consecutive frames fail to converge (syndrome_weight > 0 after max iterations), trigger re-acquisition. Here N is a tunable failure count, not the stream length from the stream model:
1. Search offsets ±16 around last known good offset
2. If not found, full 0-255 search
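
The two-stage search above can be expressed as a single ordered candidate list. This is a sketch only: `resync_candidates` is a hypothetical helper, and wrapping offsets modulo 256 is an assumption about how the ±16 window behaves near 0 and 255.

```python
def resync_candidates(last_offset, radius=16, frame_len=256):
    """Offsets to try: nearest neighbours first, then the remaining offsets."""
    near = [last_offset % frame_len]
    for d in range(1, radius + 1):
        near.append((last_offset + d) % frame_len)
        near.append((last_offset - d) % frame_len)
    seen = set(near)
    rest = [o for o in range(frame_len) if o not in seen]
    return near + rest


order = resync_candidates(last_offset=250)
print(order[:5])  # starts at 250 and fans out on both sides
```

Trying the nearest offsets first keeps the cost of recovering from a small slip close to a handful of syndrome screenings.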

### Metrics

- Acquisition success rate vs lambda_s
- Average offsets screened before lock
- Total cost in equivalent decode cycles
- False lock probability
- Re-sync success rate after simulated offset slip

## Analysis 1: Rate Comparison

### Codes Under Test

All use Z=32, IRA staircase structure, same shift-value strategy.

| Rate | M_BASE | N_BASE | n   | k  |
|------|--------|--------|-----|----|
| 1/2  | 1      | 2      | 64  | 32 |
| 1/3  | 2      | 3      | 96  | 32 |
| 1/4  | 3      | 4      | 128 | 32 |
| 1/6  | 5      | 6      | 192 | 32 |
| 1/8  | 7      | 8      | 256 | 32 |
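
The table entries follow directly from the base-matrix dimensions and the lifting factor. A short sketch that reproduces them (the helper name `code_dims` is hypothetical):

```python
Z = 32  # lifting factor shared by all codes under test


def code_dims(m_base, n_base, z=Z):
    """Return (n, k, number_of_checks) for an m_base x n_base base matrix."""
    n = n_base * z
    k = (n_base - m_base) * z
    return n, k, m_base * z


for m_base, n_base in [(1, 2), (2, 3), (3, 4), (5, 6), (7, 8)]:
    n, k, checks = code_dims(m_base, n_base)
    print(f"rate 1/{n_base}: n={n}, k={k}, checks={checks}")
```

For the 7x8 case this gives 224 checks, the same count used for the syndrome screening threshold.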

### Method

For each rate, sweep lambda_s from 0.5 to 10 (step 0.5), 500 frames/point, lambda_b=0.1. Record FER and BER.

### Key Output

Threshold lambda_s (FER < 10%) for each rate. Directly answers whether rate 1/8 is necessary to reach 1-2 photons/slot.
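
One way to extract the threshold from a swept FER curve is sketched below. It assumes the threshold is the first grid point from which FER stays below the target, which guards against a noisy curve dipping below 10% once and rising again. The helper `fer_threshold` is hypothetical, and the FER values are synthetic placeholders, not simulation results.

```python
import numpy as np


def fer_threshold(lambda_grid, fer, target=0.10):
    """First lambda_s from which FER stays below target, or None if never."""
    fer = np.asarray(fer)
    bad = np.nonzero(fer >= target)[0]
    if bad.size == 0:
        return float(lambda_grid[0])
    idx = bad[-1] + 1
    return float(lambda_grid[idx]) if idx < len(lambda_grid) else None


lambda_grid = np.arange(0.5, 10.5, 0.5)  # 0.5 .. 10.0, 20 points
# Synthetic placeholder curve for illustration only.
fer = np.array([1.0, 0.9, 0.4, 0.08, 0.02] + [0.0] * 15)
print(fer_threshold(lambda_grid, fer))
```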

## Analysis 2: Base Matrix Quality

### Matrices Under Test

All rate 1/8 (7x8, Z=32):

1. **Current staircase** — existing H_BASE. Col 7 has dv=1 (weak).
2. **Improved staircase** — add 1-2 extra connections to low-degree columns. Maintain lower-triangular parity for sequential encoding.
3. **PEG-constructed** — Progressive Edge Growth algorithm to maximize girth. Better degree distribution but encoding requires back-substitution.

### Metrics

- FER vs lambda_s at target range (0.5-5 photons)
- Tanner graph girth for each matrix
- VN/CN degree distributions
- Encoding complexity comparison
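
The degree-distribution metric can be read straight off the base matrix, since QC lifting preserves node degrees. The sketch below assumes the common convention that `-1` marks an absent edge and values `>= 0` are shift amounts. `toy_staircase` builds a hypothetical 7x8 staircase with all shifts set to 0; it is not the project's actual H_BASE.

```python
import numpy as np


def degree_distributions(h_base):
    """Return (variable-node degrees, check-node degrees) of a base matrix."""
    mask = np.asarray(h_base) >= 0
    return mask.sum(axis=0), mask.sum(axis=1)


def toy_staircase(m_base=7):
    """Hypothetical staircase: info column 0 plus lower-bidiagonal parity."""
    h = -np.ones((m_base, m_base + 1), dtype=int)
    h[:, 0] = 0                # information column connects to every check
    for j in range(1, m_base + 1):
        h[j - 1, j] = 0        # diagonal entry of the parity part
        if j < m_base:
            h[j, j] = 0        # sub-diagonal entry (absent for last column)
    return h


dv, dc = degree_distributions(toy_staircase())
print("dv:", dv.tolist())
print("dc:", dc.tolist())
```

In this toy matrix the last parity column has degree 1, which mirrors the weakness noted for the current staircase.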

## Analysis 3: Quantization Sweep

### Method

Fix lambda_s near the decoding threshold (from analysis 1). Run the decoder at LLR bit widths of 4, 5, 6, 8, and 10 bits, plus float32 as a reference. Same code, same matrix, 500 frames per bit width.
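
A sketch of the kind of quantizer the sweep would vary. It assumes a uniform, symmetric, saturating quantizer with a fixed clipping range `max_abs`. Both the function `quantize_llr` and the clipping value are hypothetical, and the existing quantizer in `ldpc_sim.py` may differ.

```python
import numpy as np


def quantize_llr(llr, bits, max_abs):
    """Map float LLRs to signed integers in [-(2^(bits-1)-1), 2^(bits-1)-1]."""
    qmax = 2 ** (bits - 1) - 1
    scale = qmax / max_abs
    return np.clip(np.rint(np.asarray(llr) * scale), -qmax, qmax).astype(np.int32)


llr = np.array([100.0, -100.0, 0.0, 3.0])
for bits in (4, 5, 6, 8, 10):
    print(bits, quantize_llr(llr, bits, max_abs=10.0).tolist())
```

Holding `max_abs` fixed while changing `bits` isolates the effect of resolution; sweeping the clipping range as well would be a separate experiment.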

### Key Output

FER vs quantization bits. Identifies the knee where adding more bits stops helping. Validates or challenges the 6-bit design choice.

## Analysis 4: Shannon Gap

### Method

Compute Poisson channel capacity for binary-input OOK:

```
C = max_p [ H(Y) - p*H(Y|X=1) - (1-p)*H(Y|X=0) ]

where p = P(X=1)
      Y|X=x ~ Poisson(x*lambda_s + lambda_b)
      P(y) = p*P(y|1) + (1-p)*P(y|0)
      H(Y) = -sum_y P(y) * log2(P(y))
      H(Y|X=x) = -sum_y P(y|x) * log2(P(y|x))
```

Optimize over the input probability p = P(X=1). The Poisson OOK channel is not symmetric, so the optimum is generally not exactly p=0.5, though p=0.5 is expected to be near-optimal.

Find minimum lambda_s where C >= R for each rate tested in analysis 1.
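
A numpy-only sketch of this computation, assuming the photon count can be truncated at `y_max`, a grid search over p, and bisection over lambda_s (capacity is taken to increase with lambda_s). All function names are hypothetical. The planned scipy version could use `scipy.stats.poisson.pmf` and `scipy.optimize.brentq` in place of the hand-written pieces.

```python
import numpy as np


def poisson_pmf(mu, y_max):
    """Poisson PMF for y = 0..y_max, computed in the log domain."""
    y = np.arange(y_max + 1)
    log_fact = np.concatenate(([0.0], np.cumsum(np.log(np.arange(1, y_max + 1)))))
    return np.exp(y * np.log(mu) - mu - log_fact)


def entropy_bits(pmf):
    pmf = pmf[pmf > 0]
    return float(-(pmf * np.log2(pmf)).sum())


def mutual_information(p, lambda_s, lambda_b, y_max=200):
    """I(X;Y) in bits for P(X=1) = p."""
    p0 = poisson_pmf(lambda_b, y_max)
    p1 = poisson_pmf(lambda_s + lambda_b, y_max)
    mix = (1 - p) * p0 + p * p1
    return entropy_bits(mix) - p * entropy_bits(p1) - (1 - p) * entropy_bits(p0)


def capacity(lambda_s, lambda_b, grid=np.linspace(0.01, 0.99, 99)):
    """Maximise I(X;Y) over p with a simple grid search."""
    return max(mutual_information(p, lambda_s, lambda_b) for p in grid)


def shannon_limit(rate, lambda_b, lo=1e-3, hi=50.0, tol=1e-4):
    """Smallest lambda_s with capacity >= rate, found by bisection."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if capacity(mid, lambda_b) >= rate:
            hi = mid
        else:
            lo = mid
    return hi


limit = shannon_limit(rate=1 / 8, lambda_b=0.1)
print(f"Shannon-limit lambda_s for R=1/8, lambda_b=0.1: {limit:.3f}")
```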

### Key Output

Shannon limit lambda_s for rate 1/8 vs decoder operational threshold. The gap in dB, 10*log10(lambda_s_threshold / lambda_s_Shannon), tells us how much room for improvement exists.

## Implementation Structure

```
model/
  ldpc_sim.py          # existing (unchanged, provides encoder/decoder/channel)
  frame_sync.py        # NEW: frame sync simulation
  ldpc_analysis.py     # NEW: analyses 1-4 as subcommands
```

### frame_sync.py

- Imports encoder, decoder, channel, syndrome check from ldpc_sim
- `--n-frames`: number of codewords in stream (default 20)
- `--sweep`: sweep lambda_s for acquisition success rate curve
- `--resync-test`: simulate offset slip and test re-acquisition
- Prints summary table + per-offset screening results

### ldpc_analysis.py

- Imports encoder, decoder, channel from ldpc_sim
- Subcommands: `--rate-sweep`, `--matrix-compare`, `--quant-sweep`, `--shannon-gap`, `--all`
- Each analysis prints a summary table to stdout
- Results saved to `data/analysis_results.json`
- `--n-frames` controls simulation length (default 500, increase for publication-quality)
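
A possible command-line skeleton for these options, sketched with `argparse` from the standard library. The flag names come from the list above; treating each analysis flag as a boolean switch, and the helper names `build_parser` and `selected_analyses`, are assumptions.

```python
import argparse

ANALYSES = ["rate_sweep", "matrix_compare", "quant_sweep", "shannon_gap"]


def build_parser():
    parser = argparse.ArgumentParser(prog="ldpc_analysis.py")
    for name in ANALYSES + ["all"]:
        # argparse stores "--rate-sweep" as the attribute "rate_sweep"
        parser.add_argument("--" + name.replace("_", "-"), action="store_true")
    parser.add_argument("--n-frames", type=int, default=500,
                        help="frames per simulation point")
    return parser


def selected_analyses(args):
    """Names of the analyses to run; --all selects every one."""
    return [name for name in ANALYSES if args.all or getattr(args, name)]


args = build_parser().parse_args(["--quant-sweep", "--n-frames", "1000"])
print(selected_analyses(args), args.n_frames)
```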

### Dependencies

- numpy (already used by ldpc_sim.py)
- scipy (for Shannon gap — Poisson PMF, optimization) — new dependency
- No other external dependencies