# Cosmology Module Review: RadialIntegrator Implementation

**Date**: 2025-11-01
**Module**: `src/fftloggin/cosmology.py`
**Overall Grade**: **B+**

## Executive Summary

The `RadialIntegrator` implementation is a well-designed, flexible class with thorough documentation and thoughtful API choices. It computes line-of-sight integrals for angular power spectra using FFTLog, follows the project's architecture principles, and gives users several customization hooks.

**Key Strengths**: Comprehensive documentation, flexible API with deferred computation, robust property management, and smart defaults.

**Key Weaknesses**: Limited support for batch dimensions in input arrays, missing input validation (e.g., non-monotonic or non-positive `chi`), and an undocumented division by `k` that could surprise users.

---

## Pros

### 1. Excellent Documentation
- Comprehensive docstrings with LaTeX mathematical notation
- Clear parameter descriptions with type hints
- Multiple usage examples showing different patterns
- References to scientific papers (Hamilton 2000, Assassi 2017, Fang 2020)
- Well-explained design choices in inline comments

**Example**:
```python
# Lines 112-113: Clear documentation of convention
"""The kr parameter is automatically set to :math:`\ell + 1`, following the
pyccl convention."""
```

### 2. Flexible and User-Friendly API

**Deferred Computation**:
```python
# Users can set up without computing
integrator = RadialIntegrator(chi, s, ells, compute=False)
# ... do other setup ...
result = integrator.compute(workers=-1)  # Compute when ready
```

**Multiple Recentering Modes**:
```python
recenter=True    # Auto-center at source median
recenter=False   # Use geometric mean of endpoints
recenter=150.0   # Explicit r0 value
```

**Customizable Interpolation**:
```python
# Method 1: Forward keyword arguments (here bc_type) to build_interpolator
integrator = RadialIntegrator(chi, s, ells, bc_type='clamped')

# Method 2: Set interpolator directly
integrator.source_interpolator = custom_interpolator
```

### 3. Robust Input Handling
- Validates array lengths match (lines 159-162)
- Minimum array size checking (lines 164-165)
- Graceful fallback when `dlog` inference fails (lines 176-179)
- NaN handling prevents propagation (line 324)

```python
# Lines 159-162: Clear validation with helpful error
if nchi != ns:
    raise ValueError(
        f"chi and s must have the same length. Got chi: {nchi}, s: {ns}"
    )
```

### 4. Well-Designed Properties

**Lazy Evaluation with Validation**:
```python
# Lines 396-400: Helpful error if user forgets to compute
if self._s_resampled is None:
    raise ValueError(
        "Source function not yet resampled. Call compute() first or set "
        "compute=True in constructor."
    )
```

**Proper Cache Invalidation**:
```python
# Lines 376-379: Setting interpolator invalidates stale cache
self._source_interpolator = interpolator
self._result = None
self._s_resampled = None
```

### 5. Chi Mask Feature
- Provides `chi_mask` to identify valid vs extrapolated regions (lines 209-212)
- Useful for quality control and data analysis
- Helps users understand which parts of the grid are reliable

### 6. Follows Project Architecture
- Properly delegates to `FFTLog` for transforms
- Uses `Grid` for coordinate management via `create_grid()`
- Consistent with decoupled design philosophy (FFTLog = algorithm, Grid = coordinates)

### 7. Batch Transform Support
```python
# Lines 187-194: Proper reshaping for batch transforms
kr = self.ells + 1.0
kr = np.asarray(kr)
if kr.ndim > 0:
    kr = kr.reshape(-1, 1)  # Enables broadcasting in FFTLog
```
Result automatically adapts shape: `(n_ells, n_k)` for array ells, `(n_k,)` for scalar.
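
The broadcasting that this reshape enables can be checked with plain NumPy. The sketch below is not the module's code: `kr_for_ells` is a hypothetical helper that mirrors the quoted lines, and the 1-D `grid` array only stands in for the FFTLog grid.

```python
import numpy as np


def kr_for_ells(ells):
    """Return kr = ell + 1, as a column vector when ells is an array."""
    kr = np.asarray(ells, dtype=float) + 1.0
    if kr.ndim > 0:
        kr = kr.reshape(-1, 1)  # shape (n_ells, 1)
    return kr


grid = np.logspace(-3, 1, 8)  # stand-in for a 1-D grid, shape (8,)

batched = kr_for_ells([0, 2, 10]) * grid  # (3, 1) * (8,) broadcasts to (3, 8)
single = kr_for_ells(2) * grid            # 0-d * (8,) stays (8,)

print(batched.shape, single.shape)
```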

---

## Cons

### 1. Potential Broadcasting Issues with Batched Inputs

**Location**: Lines 210-212

```python
self._chi_mask = (self._grid.r >= self._chi_input[..., 0]) & (
    self._grid.r <= self._chi_input[..., -1]
)
```

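**Issue**: If `chi_input` has batch dimensions (e.g., multiple window functions), `self._chi_input[..., 0]` takes the batch shape, which NumPy then tries to broadcast against the last axis of `self._grid.r`. For a 1-D grid of length `n`, the comparison raises unless the batch size is 1 or `n`, and when the sizes happen to match it silently compares elementwise instead of per batch.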

**Impact**: Limits ability to process multiple source functions simultaneously.
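
The failure mode, and one possible fix, can be reproduced without the module. This is a minimal sketch under stated assumptions: `r` stands in for `self._grid.r` as a 1-D array, `chi` for a batched `chi_input`, and both function names are hypothetical.

```python
import numpy as np

r = np.logspace(0, 3, 64)  # stand-in for grid.r, shape (64,)
chi = np.stack([
    np.linspace(10.0, 500.0, 50),
    np.linspace(20.0, 800.0, 50),
    np.linspace(5.0, 300.0, 50),
])  # three batched chi arrays, shape (3, 50)


def naive_mask(r, chi):
    """Same indexing pattern as the quoted lines."""
    return (r >= chi[..., 0]) & (r <= chi[..., -1])


def batched_mask(r, chi):
    """Keep a trailing length-1 axis so the bounds broadcast against r."""
    lower = chi[..., :1]   # shape (..., 1)
    upper = chi[..., -1:]  # shape (..., 1)
    return (r >= lower) & (r <= upper)


try:
    naive_mask(r, chi)
    naive_failed = False
except ValueError:
    # (64,) cannot broadcast against (3,)
    naive_failed = True

mask = batched_mask(r, chi)
print(naive_failed, mask.shape)
```

Slicing with `:1` and `-1:` instead of indexing with `0` and `-1` leaves the 1-D case unchanged, so the same expression would serve both batched and unbatched inputs.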

### 2. Recentering Logic Assumes 1D Arrays

**Location**: Lines 247-252

```python
cdf = cumulative_trapezoid(self._s_input, self._chi_input, initial=0)
if cdf[-1] > 0:
    cdf = cdf / cdf[-1]
    median_idx = np.searchsorted(cdf, 0.5)
    median_idx = min(median_idx, self._chi_input.shape[-1] - 1)
    return float(self._chi_input[..., median_idx])
```

**Issues**:
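- `cumulative_trapezoid` integrates along the last axis by default, but `cdf[-1]` then indexes the first axis, so the normalization is wrong for batched inputs
- `np.searchsorted` expects a 1-D sorted array, so it cannot locate a separate median for each batch row
- `float()` raises if the indexed result has more than one element
- Batched source functions would need per-batch recentering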
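
A per-batch median could look like the sketch below. It is not the module's `_compute_r0()`: the function name is hypothetical, `np.argmax` on a boolean array replaces `np.searchsorted` to find the first crossing of 0.5 along the last axis, and the geometric-mean fallback for non-positive integrals is an assumption borrowed from the `recenter=False` mode.

```python
import numpy as np
from scipy.integrate import cumulative_trapezoid


def median_chi(chi, s):
    """Median of the source along the last axis, one value per batch row."""
    chi, s = np.broadcast_arrays(
        np.asarray(chi, dtype=float), np.asarray(s, dtype=float)
    )
    cdf = cumulative_trapezoid(s, chi, axis=-1, initial=0)
    total = np.asarray(cdf[..., -1])
    valid = total > 0
    # Normalize each row; rows with a non-positive integral are divided by 1
    cdf = cdf / np.where(valid, total, 1.0)[..., None]
    # First index where the normalized CDF reaches 0.5
    idx = np.asarray(np.argmax(cdf >= 0.5, axis=-1))
    median = np.take_along_axis(chi, idx[..., None], axis=-1)[..., 0]
    # Assumed fallback: geometric mean of the endpoints
    fallback = np.sqrt(chi[..., 0] * chi[..., -1])
    return np.where(valid, median, fallback)


chi = np.linspace(1.0, 200.0, 400)
centers = np.array([60.0, 140.0])
s = np.exp(-0.5 * ((chi - centers[:, None]) / 10.0) ** 2)  # shape (2, 400)

r0 = median_chi(chi, s)
print(r0.shape, r0)
```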

### 3. Manual Grid Construction

**Location**: Lines 203-207

```python
log_chi_center = np.log(chi_center)
log_chi_min = log_chi_center - (n - 1) / 2.0 * dlog
chi_fftlog = np.exp(log_chi_min + np.arange(n) * dlog)
self._grid = self._fftlog.create_grid(r=chi_fftlog)
```

**Issues**:
- Duplicates coordinate generation logic
- Could potentially use `Grid.from_r()` factory method if it supported centering
- Mixing manual coordinate construction with factory pattern
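
If the factory route is not available, the same arithmetic could at least live in one small helper. The sketch below uses a hypothetical function name and reproduces the quoted construction, with a check that the grid is centered geometrically on the requested value.

```python
import numpy as np


def centered_log_grid(center, n, dlog):
    """Log-spaced grid of n points, spacing dlog in ln(r), geometric mean `center`."""
    offsets = (np.arange(n) - (n - 1) / 2.0) * dlog
    return center * np.exp(offsets)


chi_fftlog = centered_log_grid(center=150.0, n=64, dlog=0.05)

# Geometric mean of the endpoints recovers the requested center
print(np.sqrt(chi_fftlog[0] * chi_fftlog[-1]))
```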

### 4. Incomplete Input Validation

**Missing Checks**:
- `chi` values are strictly increasing (required for interpolation)
- `chi` values are positive (required for log spacing)
- `ells` values are non-negative (physical requirement)
- Input arrays are not empty beyond length check

**Suggested checks**:
```python
if not np.all(np.diff(chi) > 0):
    raise ValueError("chi must be monotonically increasing")
if not np.all(chi > 0):
    raise ValueError("chi must contain only positive values")
```

### 5. Interpolation Boundary Behavior

**Location**: Line 294

```python
extrapolate = kwargs.pop("extrapolate", False)
```

**Issue**:
- With `extrapolate=False`, CubicSpline returns NaN outside input bounds
- Line 324 silently converts NaN to 0 with `nan_to_num`
- Users might not realize their chi range is insufficient
- Could lead to unexpected zero-padding in integration

**Better approach**: Either default to `extrapolate=True` or warn when extrapolation occurs.
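
The warning variant could be sketched as follows. The function name is hypothetical and the code only imitates the resampling step described above (a `CubicSpline` with `extrapolate=False` followed by `nan_to_num`); it is not taken from the module.

```python
import warnings

import numpy as np
from scipy.interpolate import CubicSpline


def resample_with_warning(chi, s, r):
    """Resample s(chi) onto r, zero-filling outside [chi[0], chi[-1]] with a warning."""
    spline = CubicSpline(chi, s, extrapolate=False)
    values = spline(r)  # NaN wherever r lies outside the input range
    outside = (r < chi[0]) | (r > chi[-1])
    if outside.any():
        warnings.warn(
            f"{int(outside.sum())} of {r.size} grid points lie outside "
            f"[{chi[0]:g}, {chi[-1]:g}] and are set to 0",
            RuntimeWarning,
            stacklevel=2,
        )
    return np.nan_to_num(values, nan=0.0)


chi = np.linspace(10.0, 100.0, 50)
s = np.exp(-0.5 * ((chi - 50.0) / 10.0) ** 2)
r = np.logspace(0, 3, 32)  # deliberately wider than the chi range

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    s_resampled = resample_with_warning(chi, s, r)

print(len(caught), np.isfinite(s_resampled).all())
```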

### 6. Undocumented Division Convention

**Location**: Lines 326-328

```python
result = self.fftlog.forward(self.s_resampled, **fft_kwargs)
# Divide forward transform by k because of our convention
# for the fht integration measure kdr
self._result = result / self.grid.k
```

**Issues**:
- Comment mentions "our convention" without explaining why
- Integration measure `kdr` should be documented in class docstring
- Mathematical formulation should be explicit about this factor
- Might surprise users expecting standard FFTLog output

**Recommendation**: Add to class docstring:
```
The integration measure includes a factor of k, so the forward transform
is divided by k to match the conventional definition of the radial integral.
```
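
Written out, the convention might read as below. This is a sketch under an assumption the quoted code does not confirm: that `forward` evaluates the transform with measure $k\,\mathrm{d}r$. The kernel is written generically as $K_\ell$ because its exact form is not shown in the reviewed lines.

```latex
\begin{aligned}
F_\ell(k) &= \int_0^\infty s(r)\, K_\ell(kr)\, k\, \mathrm{d}r
  && \text{(assumed output of \texttt{forward})} \\
I_\ell(k) &= \frac{F_\ell(k)}{k} = \int_0^\infty s(r)\, K_\ell(kr)\, \mathrm{d}r
  && \text{(stored in \texttt{\_result})}
\end{aligned}
```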

### 7. Memory Efficiency Concerns

**Issue**: Stores both `_s_resampled` and `_result` which could be large for many ells.

**Current behavior**:
- For 10 ells with n=1024: stores ~164 KB in total, assuming both arrays are float64 with shape `(10, 1024)`
- No option to discard intermediate results to save memory
- Could be problematic for high-resolution or many-ell calculations
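
The size estimate can be reproduced with a few lines of NumPy. The array shapes are assumptions: the first case takes both arrays as float64 with shape `(n_ells, n)`, the second takes `_s_resampled` as one-dimensional.

```python
import numpy as np

n_ells, n = 10, 1024

result = np.zeros((n_ells, n), dtype=np.float64)
s_resampled_2d = np.zeros((n_ells, n), dtype=np.float64)
s_resampled_1d = np.zeros(n, dtype=np.float64)

both_2d = result.nbytes + s_resampled_2d.nbytes  # 163840 bytes
with_1d = result.nbytes + s_resampled_1d.nbytes  # 90112 bytes

print(f"{both_2d / 1e3:.0f} kB, {with_1d / 1e3:.0f} kB")
```

Either way the footprint scales linearly with `n_ells * n`, so the concern only becomes practical at much larger grids or ell counts.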

**Suggestion**: Add a `keep_intermediate` option (default `True`) that discards `_s_resampled` after computation when set to `False`.

### 8. Property Setter Doesn't Auto-Recompute

**Location**: Lines 376-379

```python
self._source_interpolator = interpolator
# Invalidate cached results
self._result = None
self._s_resampled = None
```

**Issue**:
- Invalidates cache but doesn't automatically recompute
- User must remember to call `compute()` again
- Accessing `.result` after setting interpolator raises error

**Possible enhancement**: Add `auto_recompute=True` option or `recompute=True` parameter to setter.
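
A third option is to recompute lazily on access rather than eagerly in the setter. The toy class below illustrates the pattern only; it is hypothetical and shares no code with `RadialIntegrator`.

```python
class LazyIntegrator:
    """Toy example: the setter invalidates, and .result recomputes on demand."""

    def __init__(self, func):
        self._func = func
        self._result = None
        self.n_evaluations = 0

    @property
    def func(self):
        return self._func

    @func.setter
    def func(self, new_func):
        self._func = new_func
        self._result = None  # invalidate; nothing is computed yet

    def compute(self):
        self.n_evaluations += 1
        self._result = self._func()
        return self._result

    @property
    def result(self):
        if self._result is None:
            self.compute()  # recompute only when someone asks for the value
        return self._result


integrator = LazyIntegrator(lambda: 1.0)
first = integrator.result
integrator.func = lambda: 2.0  # cache dropped, no work done here
second = integrator.result
print(first, second, integrator.n_evaluations)
```

This keeps setters cheap while removing the need to remember a second `compute()` call, at the cost of making a property access potentially expensive.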

### 9. Error Messages Could Be Clearer

**Location**: Lines 397-400

```python
raise ValueError(
    "Source function not yet resampled. Call compute() first or set "
    "compute=True in constructor."
)
```

**Issue**: If the user has already constructed the object, they cannot set `compute=True` retroactively, so the message is misleading.

**Better message**: `"Source function not yet computed. Call compute() to calculate results."`

### 10. Missing Features

**Not Implemented**:
- No `backward()` or `inverse()` method (though an inverse may not be physically meaningful)
- No built-in validation against analytic test cases
- No method to update source function without recreating object
- No convenience method for common window functions (Gaussian, top-hat, etc.)
- No diagnostic plots or quality metrics

---

## Recommendations

### High Priority

#### 1. Fix Batching Support for Input Arrays
**Effort**: Medium
**Impact**: High

- Ensure `_compute_r0()` works with batch dimensions
- Handle masking correctly for batched grids
- Add tests for batched source functions

#### 2. Add Input Validation
**Effort**: Low
**Impact**: Medium

```python
def _validate_inputs(self, chi, s):
    """Validate chi and s arrays."""
    if not np.all(np.diff(chi, axis=-1) > 0):
        raise ValueError("chi must be monotonically increasing")
    if not np.all(chi > 0):
        raise ValueError("chi must contain only positive values")
    if np.any(~np.isfinite(chi)) or np.any(~np.isfinite(s)):
        raise ValueError("chi and s must contain only finite values")
```

#### 3. Document Integration Measure Convention
**Effort**: Low
**Impact**: High (user understanding)

Add to class docstring mathematical details:
- Why division by k is necessary
- Connection to integral definition
- Reference to FFTLog conventions

### Medium Priority

#### 4. Improve Interpolation Defaults
**Effort**: Low
**Impact**: Medium

- Consider `extrapolate=True` as default with warning
- Or add explicit warning when extrapolation occurs
- Document extrapolation behavior clearly

#### 5. Simplify Grid Creation
**Effort**: Medium
**Impact**: Low (code clarity)

- Investigate if `Grid` factory methods can be extended
- Reduce code duplication in coordinate generation

#### 6. Add Shape Assertions
**Effort**: Low
**Impact**: Medium (debugging)

```python
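# After FFTLog creation
assert self.grid.r.shape[-1] == n, f"Grid shape mismatch: {self.grid.r.shape}"
# After compute(): one row per ell for array ells, a flat array for scalar ell
expected_shape = (self.ells.size, n) if self.ells.ndim > 0 else (n,)
assert self._result.shape == expected_shape, f"Result shape mismatch: {self._result.shape}"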
```

### Low Priority

#### 7. Memory Optimization Options
**Effort**: Low
**Impact**: Low (unless high-resolution use cases)

```python
def compute(self, keep_intermediate=True, **fft_kwargs):
    """..."""
    self._s_resampled = self._source_interpolator(self.grid.r)
    # ... compute result ...
    if not keep_intermediate:
        self._s_resampled = None  # Free memory
```

#### 8. Add Update Method
**Effort**: Medium
**Impact**: Medium (user convenience)

```python
def update_source(self, s_new, recompute=True):
    """Update source function without recreating object."""
    s_new = np.asarray(s_new)
    if len(s_new) != len(self._s_input):
        raise ValueError("New source must have same length as original")
    self._s_input = s_new
    # Assign through the property setter so the cached _result and
    # _s_resampled are invalidated even when recompute=False.
    self.source_interpolator = self.build_interpolator(
        self._chi_input, self._s_input, **self._interp_kwargs
    )
    if recompute:
        return self.compute()
    return None
```

#### 9. Validation Utilities
**Effort**: Medium
**Impact**: Low (testing/validation)

Add a method to compare against known analytic solutions (e.g., Gaussian window with Gaussian transfer function).
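
As one candidate reference, a Gaussian source centered at the origin has a closed-form integral against the spherical Bessel function $j_0$: $\int_0^\infty j_0(kr)\, e^{-r^2/(2\sigma^2)}\, \mathrm{d}r = \frac{\pi}{2k}\,\mathrm{erf}(k\sigma/\sqrt{2})$. The sketch below checks that identity with SciPy quadrature. Whether it matches the kernel and normalization that `RadialIntegrator` uses is an assumption that would need confirming before it is turned into a test; the function names are hypothetical.

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import erf, spherical_jn


def analytic_j0_gaussian(k, sigma):
    """Closed form of int_0^inf j_0(k r) exp(-r^2 / (2 sigma^2)) dr."""
    return np.pi / (2.0 * k) * erf(k * sigma / np.sqrt(2.0))


def numeric_j0_gaussian(k, sigma):
    """Same integral by adaptive quadrature, truncated at 12 sigma."""
    def integrand(r):
        return spherical_jn(0, k * r) * np.exp(-0.5 * (r / sigma) ** 2)

    value, _ = quad(integrand, 0.0, 12.0 * sigma, limit=200)
    return value


for k in (0.1, 1.0, 5.0):
    print(k, numeric_j0_gaussian(k, 1.0), analytic_j0_gaussian(k, 1.0))
```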

---

## Specific Code Issues

### Line References

| Lines | Issue | Severity |
|-------|-------|----------|
| 210-212 | Chi mask broadcasting with batch dims | High |
| 247-252 | Recentering assumes 1D arrays | High |
| 294 | Extrapolate=False default with silent nan_to_num | Medium |
| 326-328 | Undocumented k division convention | Medium |
| 397-400 | Misleading error message | Low |
| 203-207 | Manual grid construction duplicates logic | Low |

---

## Testing Recommendations

Add test cases for:
1. **Batched inputs**: Multiple source functions, multiple chi arrays
2. **Edge cases**:
   - Minimum array size (n=2)
   - Single ell vs array of ells
   - Recenter modes (True/False/float)
3. **Interpolation**:
   - Extrapolation behavior
   - Custom interpolators
4. **Validation**:
   - Non-monotonic chi (should raise)
   - Negative chi values (should raise)
   - Mismatched array lengths (already tested)
5. **Memory**: Large arrays to verify no memory leaks
6. **Numerical accuracy**: Compare against known analytic cases
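
The validation cases in item 4 could start from a sketch like this one. `validate_chi` is a hypothetical stand-in for the proposed constructor checks, and the tests use plain assertions so they run with or without a test framework.

```python
import numpy as np


def validate_chi(chi):
    """Hypothetical stand-in for the proposed constructor checks."""
    chi = np.asarray(chi, dtype=float)
    if not np.all(np.isfinite(chi)):
        raise ValueError("chi must contain only finite values")
    if not np.all(chi > 0):
        raise ValueError("chi must contain only positive values")
    if not np.all(np.diff(chi, axis=-1) > 0):
        raise ValueError("chi must be strictly increasing")


def raises_value_error(func, *args):
    """Return True if func(*args) raises ValueError, False otherwise."""
    try:
        func(*args)
    except ValueError:
        return True
    return False


def test_valid_chi_passes():
    assert not raises_value_error(validate_chi, np.linspace(1.0, 100.0, 16))


def test_non_monotonic_chi_raises():
    assert raises_value_error(validate_chi, [1.0, 3.0, 2.0])


def test_negative_chi_raises():
    assert raises_value_error(validate_chi, [-1.0, 2.0, 3.0])


def test_non_finite_chi_raises():
    assert raises_value_error(validate_chi, [1.0, np.nan, 3.0])
```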

---

## Conclusion

The `RadialIntegrator` class is production-ready for use cases with a single source function and demonstrates excellent software engineering practices. The main areas for improvement are:

1. **Robustness**: Better input validation and error handling
2. **Generality**: Support for batched input arrays
3. **Documentation**: Clarify mathematical conventions
4. **Testing**: Expand test coverage for edge cases

**Recommended Next Steps**:
1. Implement high-priority fixes (input validation, batch support, documentation)
2. Add comprehensive test suite
3. Consider medium-priority enhancements based on user feedback
4. Document known limitations in module docstring

The code demonstrates strong understanding of the FFTLog algorithm and cosmological applications. With the recommended improvements, it would achieve an **A** grade.
