Synthesis Engine — `Synthesis` and `Loss`#

Module: foscat.Synthesis

The Synthesis class drives field synthesis, denoising, and component separation in FOSCAT. It wraps L-BFGS-B (scipy.optimize.fmin_l_bfgs_b) — a quasi-Newton method that builds a low-memory approximation of the inverse Hessian from the gradient history. L-BFGS-B converges much faster than first-order optimisers (Adam, SGD) for smooth, moderately high-dimensional problems like scattering-covariance synthesis.

Architecture overview#

  Initial field  x_0  (random Gaussian noise or a prior)
        │
        ▼
  ┌─────────────────────────────────────────────────────────┐
  │  scipy.optimize.fmin_l_bfgs_b                           │
  │                                                         │
  │   loop (up to NUM_EPOCHS iterations):                   │
  │   ┌─────────────────────────────────────────────────┐   │
  │   │  calc_grad(x)                                   │   │
  │   │   → FoCUS.eval(x)  →  scat_cov stats           │   │
  │   │   → user loss  L(stats, target_stats)           │   │
  │   │   → scalar loss  +  ∇ₓ L  (via autograd)       │   │
  │   └─────────────────────────────────────────────────┘   │
  │                                                         │
  │   L-BFGS-B step  →  x_{t+1}                            │
  │   (Hessian approximation updated from gradient history) │
  └─────────────────────────────────────────────────────────┘
        │
        ▼
  Converged field  x*  with  Φ(x*) ≈ Φ(d_ref)

`Loss` — wrapping a user loss function#

foscat.Synthesis.Loss(
    function,
    scat_operator,
    *param,
    name          = "",
    batch         = None,
    batch_data    = None,
    batch_update  = None,
    info_callback = False,
)

Parameters#

Parameter	Type	Default	Description
`function`	callable	—	Loss function. Signature: `f(x, scat_op, args) → scalar tensor` (no-batch mode) or `f(x, batch, scat_op, args) → scalar tensor` (batch mode). Must return a differentiable PyTorch scalar.
`scat_operator`	FoCUS	—	The scattering operator used inside `function`. Determines the backend and device.
`*param`	any	—	Extra arguments packed into a tuple and passed to `function` as `args`.
`name`	str	`""`	Optional label printed in the iteration log.
`batch`	callable\|None	None	`batch(batch_data, istep) → batch_indices`. Enables mini-batch gradient computation. If None, the full field is used at every step.
`batch_data`	any	None	Data passed to `batch` at each step.
`batch_update`	callable\|None	None	Called between L-BFGS-B restarts (see `NUM_STEP_BIAS`) to update `batch_data`.
`info_callback`	bool	False	If True, `function` must return `(loss, info_str)` and the string is printed at log frequency.

Loss function signature#

def my_loss(x, scat_op, args):
    """
    x       : torch.Tensor — current field, shape (npix,) or (batch, npix)
    scat_op : FoCUS instance
    args    : tuple of extra arguments
    returns : scalar torch.Tensor, differentiable w.r.t. x
    """
    target_stat = args[0]
    stat = scat_op.eval(x)
    return stat.reduce_mean_batch((stat - target_stat) ** 2)

Minimal example#

from foscat.Synthesis import Loss
import foscat.scat_cov as sc

scat_op     = sc.funct(KERNELSZ=5, NORIENT=4, all_type='float64')
target_stat = scat_op.eval(target_map)

def scattering_loss(x, scat_op, args):
    ref  = args[0]
    stat = scat_op.eval(x)
    return stat.reduce_mean_batch((stat - ref) ** 2)

loss = Loss(scattering_loss, scat_op, target_stat)

`Synthesis` — the optimisation loop#

foscat.Synthesis.Synthesis(loss_list)

The constructor only takes the list of Loss objects. Optimiser hyperparameters are passed to run().

Parameter	Type	Description
`loss_list`	list[Loss]	One or more `Loss` objects. The total gradient is the sum across all losses, enabling multi-component optimisation.

`run` — execute L-BFGS-B#

result = solver.run(
    in_x,
    NUM_EPOCHS           = 100,
    EVAL_FREQUENCY       = 100,
    factr                = 10.0,
    NUM_STEP_BIAS        = 1,
    LEARNING_RATE        = 0.03,
    DECAY_RATE           = 0.95,
    batchsz              = 1,
    totalsz              = 1,
    grd_mask             = None,
    idx_grd              = None,
    KEEP_TRACK           = None,
    SHOWGPU              = False,
    MESSAGE              = "",
    axis                 = 0,
)

Parameters

Parameter	Type	Default	Description
`in_x`	ndarray, shape `(npix,)`	—	Initial field. Typically Gaussian noise with power matched to the target.
`NUM_EPOCHS`	int	100	Maximum number of L-BFGS-B iterations (`maxiter`). The optimiser may stop earlier if the convergence criterion is met.
`EVAL_FREQUENCY`	int	100	Print the current loss every N iterations (via the L-BFGS-B callback).
`factr`	float	10.0	L-BFGS-B convergence tolerance: the iteration stops when `(f_k - f_{k+1}) / max(
`NUM_STEP_BIAS`	int	1	Number of L-BFGS-B restarts. At each restart the Hessian approximation is reset and `batch_update` is called on all losses. Useful for progressive noise schedules.
`LEARNING_RATE`	float	0.03	Vestigial — not used by L-BFGS-B.
`DECAY_RATE`	float	0.95	Vestigial — not used by L-BFGS-B.
`batchsz`	int	1	Number of noise realisations used per gradient evaluation (for stochastic losses).
`totalsz`	int	1	Total batch pool size. When `totalsz > batchsz`, the gradient is accumulated over `totalsz // batchsz` sub-batches.
`grd_mask`	ndarray\|None	None	Binary mask applied to the gradient before each L-BFGS-B step. Pixels with mask = 0 are frozen.
`idx_grd`	ndarray\|None	None	Index array selecting the pixels to optimise (partial inpainting).
`KEEP_TRACK`	callable\|None	None	Called at each log step with the current info dict for custom progress tracking.
`SHOWGPU`	bool	False	Print GPU memory usage alongside the loss.
`MESSAGE`	str	`""`	Prefix prepended to each log line.
`axis`	int	0	Axis along which to extract the output map (relevant for multi-component problems).

Returns: ndarray, same shape as in_x — the optimised field.

Why L-BFGS-B?#

L-BFGS-B uses second-order information (an approximation of the Hessian) to take much larger steps than a first-order method like Adam. For scattering-covariance synthesis:

The loss landscape is smooth (sum of squared differences of smooth statistics).
The number of free variables is npix (up to ~800 000 for nside=256) — within the range where L-BFGS-B is effective.
Typical convergence in 50–300 iterations vs. thousands for Adam.

The key parameter is factr: set it to a large value (1e6) for a quick approximate synthesis; reduce it (10 or 1.0) for high-fidelity results. Leave NUM_EPOCHS large enough that the factr criterion triggers before maxiter is hit.

Multiple losses#

from foscat.Synthesis import Loss, Synthesis

loss_scat  = Loss(scattering_loss,    scat_op, target_stat,  name="scat")
loss_power = Loss(power_spec_loss,    scat_op, target_Cl,    name="Cl")

solver = Synthesis([loss_scat, loss_power])
result = solver.run(x0, NUM_EPOCHS=200, factr=10.0)

The total gradient at each L-BFGS-B step is \(\nabla_x \mathcal{L}_\text{scat} + \nabla_x \mathcal{L}_{C_\ell}\).

Gradient masking#

grd_mask = survey_mask.astype(np.float64)   # 1 inside survey, 0 outside
result = solver.run(x0, NUM_EPOCHS=300, grd_mask=grd_mask)

Practical tips#

factr. Start with factr=1e7 for a quick test. For production synthesis use factr=10 or lower. With factr=1e-32 (effectively disabled), convergence is determined entirely by NUM_EPOCHS.

NUM_EPOCHS. Set it higher than you expect to need — L-BFGS-B stops automatically when factr is satisfied. A typical synthesis at nside=64 converges in 100–300 iterations.

Initialisation. A Gaussian white noise map works well. For faster convergence, pre-colour the noise with the target power spectrum (healpy.synfast).

Multi-resolution warm start. For large nside, synthesise first at low resolution and upsample the result as the initial guess for the full resolution. L-BFGS-B benefits significantly from a good initialisation.

get_history(). Retrieve the loss curve after run():

result = solver.run(x0, NUM_EPOCHS=300)
loss_curve = solver.get_history()   # ndarray, one value per logged iteration

Synthesis Engine — Synthesis and Loss#