Coverage for src / gwtransport / diffusion_fast.py: 90%

195 statements  

« prev     ^ index     » next       coverage.py v7.13.5, created at 2026-04-27 06:32 +0000

1""" 

2Fast Diffusive Transport Corrections via Gaussian Smoothing. 

3 

4This module provides a computationally efficient approximation of diffusion/dispersion 

5using Gaussian smoothing. It is much faster than :mod:`gwtransport.diffusion` but 

6less physically accurate, especially under variable flow conditions. 

7 

8Both ``diffusion_fast`` and :mod:`gwtransport.diffusion` add microdispersion and 

9molecular diffusion on top of macrodispersion captured by the APVD. 

10 

11**When to use diffusion_fast vs diffusion:** 

12 

13- Use ``diffusion_fast`` when: Speed is critical, flow and time steps are relatively 

14 constant, or you need real-time processing 

15- Use ``diffusion`` when: Physical accuracy is critical, flow varies significantly, 

16 or you're analyzing periods with changing conditions 

17 

18See :ref:`concept-dispersion` for background on macrodispersion and microdispersion. 

19 

20This module implements diffusion/dispersion processes that modify advective transport 

21in aquifer systems. Diffusion causes spreading and smoothing of concentration or 

22temperature fronts as they travel through the aquifer. While advection moves compounds 

23with water flow, diffusion causes spreading due to molecular diffusion, mechanical 

24dispersion, and thermal diffusion (for temperature). 

25 

26Limitation: When ``flow_out`` is not provided, this fast approximation works best when 

27flow and tedges are relatively constant. The underlying assumption is that dx (spatial 

28step between cells) remains approximately constant, which holds for steady flow but 

29breaks down under highly variable conditions. When input time bins are non-uniform 

30(e.g., from signal compression), provide ``flow_out`` to apply Gaussian smoothing on 

31the (typically uniform) output grid instead. For scenarios with significant flow 

32variability, consider using :mod:`gwtransport.diffusion` instead. 

33 

34Available functions: 

35 

36- :func:`infiltration_to_extraction` - Apply diffusion during infiltration to extraction 

37 transport. Combines advection (via residence time) with diffusion (via Gaussian smoothing). 

38 Computes position-dependent diffusion based on local residence time and returns concentration 

39 or temperature in extracted water on the extraction time grid. 

40 

41- :func:`extraction_to_infiltration` - Inverse diffusion via Tikhonov regularization. Builds 

42 the combined forward matrix (advection + Gaussian diffusion) and solves the inverse problem 

43 to reconstruct infiltration concentrations from extraction data. 

44 

45- :func:`gamma_infiltration_to_extraction` - Gamma-distributed pore volumes with fast 

46 Gaussian diffusion. Convenience wrapper that parameterizes the pore volume distribution 

47 as a gamma distribution. 

48 

49- :func:`gamma_extraction_to_infiltration` - Gamma-distributed pore volumes, inverse 

50 fast Gaussian diffusion. Symmetric inverse of gamma_infiltration_to_extraction. 

51 

52- :func:`convolve_diffusion` - Apply variable-sigma Gaussian filtering. Extends 

53 scipy.ndimage.gaussian_filter1d to position-dependent sigma using sparse matrix representation 

54 for efficiency. Handles boundary conditions via nearest-neighbor extrapolation. 

55 

56- :func:`create_example_data` - Generate test data for demonstrating diffusion effects with 

57 signals having varying time steps and corresponding sigma arrays. Useful for testing and 

58 validation. 

59 

60This file is part of gwtransport which is released under AGPL-3.0 license. 

61See the ./LICENSE file or go to https://github.com/gwtransport/gwtransport/blob/main/LICENSE for full license details. 

62""" 

63 

64import warnings 

65 

66import numpy as np 

67import numpy.typing as npt 

68import pandas as pd 

69from scipy import sparse 

70from scipy.special import erf 

71 

72from gwtransport import gamma 

73from gwtransport.advection_utils import _infiltration_to_extraction_weights 

74from gwtransport.residence_time import residence_time_mean 

75from gwtransport.utils import solve_inverse_transport 

76 

77# Minimum coefficient sum to consider a row valid 

78_EPSILON_COEFF_SUM = 1e-10 

79 

80 

81def infiltration_to_extraction( 

82 *, 

83 cin: npt.ArrayLike, 

84 flow: npt.ArrayLike, 

85 tedges: pd.DatetimeIndex, 

86 cout_tedges: pd.DatetimeIndex, 

87 aquifer_pore_volumes: npt.ArrayLike, 

88 mean_streamline_length: float, 

89 mean_molecular_diffusivity: float, 

90 mean_longitudinal_dispersivity: float, 

91 retardation_factor: float = 1.0, 

92 suppress_dispersion_warning: bool = False, 

93 asymptotic_cutoff_sigma: float | None = 3.0, 

94 flow_out: npt.ArrayLike | None = None, 

95 suppress_uniform_tedges_check: bool = False, 

96) -> npt.NDArray[np.floating]: 

97 """Compute diffusion during 1D transport using fast Gaussian smoothing. 

98 

99 Combines advection (via pore volume distribution) with diffusion (via Gaussian 

100 smoothing) to produce extraction concentrations on the ``cout_tedges`` time grid. 

101 This is the fast approximate counterpart of 

102 :func:`gwtransport.diffusion.infiltration_to_extraction`. 

103 

104 When ``flow_out`` is provided, diffusion is applied on the output grid 

105 (advect-then-smooth), which correctly handles non-uniform ``tedges`` 

106 (e.g., from signal compression). Without ``flow_out``, diffusion is applied 

107 on the input grid and ``tedges`` must have constant time steps. 

108 

109 Parameters 

110 ---------- 

111 cin : array-like 

112 Concentration or temperature of the compound in the infiltrating water. 

113 flow : array-like 

114 Flow rate of water in the aquifer [m3/day]. 

115 tedges : pandas.DatetimeIndex 

116 Time edges for cin and flow data. Has length len(cin) + 1. 

117 cout_tedges : pandas.DatetimeIndex 

118 Time edges for output data bins. Has length of desired output + 1. 

119 aquifer_pore_volumes : array-like 

120 Array of aquifer pore volumes [m3] representing the distribution of 

121 flow paths. 

122 mean_streamline_length : float 

123 Mean travel distance [m]. Must be positive. 

124 mean_molecular_diffusivity : float 

125 Mean effective molecular diffusivity [m2/day]. Must be non-negative. 

126 mean_longitudinal_dispersivity : float 

127 Mean longitudinal dispersivity [m]. Must be non-negative. 

128 retardation_factor : float, optional 

129 Retardation factor (default 1.0). Values > 1.0 indicate slower transport. 

130 suppress_dispersion_warning : bool, optional 

131 Suppress warning about combining multiple pore volumes with dispersivity. 

132 Default is False. 

133 asymptotic_cutoff_sigma : float or None, optional 

134 Truncate the Gaussian kernel at this many standard deviations. 

135 Default is 3.0. 

136 flow_out : array-like or None, optional 

137 Flow rate [m3/day] on the output grid (aligned to ``cout_tedges``). 

138 Required when ``tedges`` has non-uniform time steps. Length must 

139 equal ``len(cout_tedges) - 1``. Default is None. 

140 suppress_uniform_tedges_check : bool, optional 

141 Skip the constant-dt check on ``tedges`` when ``flow_out`` is None. 

142 Default is False. 

143 

144 Returns 

145 ------- 

146 numpy.ndarray 

147 Bin-averaged concentration in extracted water. Length equals 

148 len(cout_tedges) - 1. NaN values indicate time periods with no valid 

149 contributions from the infiltration data. 

150 

151 See Also 

152 -------- 

153 gwtransport.diffusion.infiltration_to_extraction : Physically rigorous analytical solution 

154 that supports per-pore-volume arrays for streamline_length, molecular_diffusivity, 

155 and longitudinal_dispersivity. 

156 extraction_to_infiltration : Inverse operation 

157 :ref:`concept-dispersion-scales` : Macrodispersion vs microdispersion 

158 

159 Notes 

160 ----- 

161 A single ``mean_streamline_length`` is shared across all pore volumes. This 

162 assumes streamline-length heterogeneity is captured solely through the 

163 pore-volume distribution: the effective cross-sectional area 

164 ``A_eff = V_p / L_mean`` varies with V_p while the path length L is held 

165 fixed. This is appropriate for many systems but breaks down for 

166 partially-penetrating wells or wedge-shaped capture zones, where path 

167 length itself varies between streamtubes. In those cases, use 

168 :func:`gwtransport.diffusion.infiltration_to_extraction` with a per-streamtube 

169 ``streamline_length`` array instead. 

170 """ 

171 # Convert inputs 

172 cout_tedges = pd.DatetimeIndex(cout_tedges) 

173 tedges = pd.DatetimeIndex(tedges) 

174 cin = np.asarray(cin, dtype=float) 

175 flow = np.asarray(flow, dtype=float) 

176 aquifer_pore_volumes = np.asarray(aquifer_pore_volumes, dtype=float) 

177 

178 if flow_out is not None: 

179 flow_out = np.asarray(flow_out, dtype=float) 

180 

181 n_pore_volumes = len(aquifer_pore_volumes) 

182 

183 # Input validation 

184 _validate_inputs( 

185 cin_or_cout=cin, 

186 flow=flow, 

187 tedges=tedges, 

188 cout_tedges=cout_tedges, 

189 aquifer_pore_volumes=aquifer_pore_volumes, 

190 mean_streamline_length=mean_streamline_length, 

191 mean_molecular_diffusivity=mean_molecular_diffusivity, 

192 mean_longitudinal_dispersivity=mean_longitudinal_dispersivity, 

193 is_forward=True, 

194 flow_out=flow_out, 

195 ) 

196 

197 # Dispersion warning 

198 if n_pore_volumes > 1 and mean_longitudinal_dispersivity > 0 and not suppress_dispersion_warning: 

199 _warn_dispersion() 

200 

201 # Build advection weight matrix (needed in both branches) 

202 w_adv, spinup_mask = _infiltration_to_extraction_weights( 

203 tedges=tedges, 

204 cout_tedges=cout_tedges, 

205 aquifer_pore_volumes=aquifer_pore_volumes, 

206 flow=flow, 

207 retardation_factor=retardation_factor, 

208 ) 

209 

210 if flow_out is None: 

211 # Original algorithm: smooth-then-advect (requires uniform tedges) 

212 _check_uniform_tedges(tedges=tedges, suppress=suppress_uniform_tedges_check) 

213 

214 sigma_array = _compute_sigma( 

215 flow=flow, 

216 tedges=tedges, 

217 aquifer_pore_volumes=aquifer_pore_volumes, 

218 streamline_length=mean_streamline_length, 

219 molecular_diffusivity=mean_molecular_diffusivity, 

220 longitudinal_dispersivity=mean_longitudinal_dispersivity, 

221 retardation_factor=retardation_factor, 

222 direction="infiltration_to_extraction", 

223 ) 

224 cin_diffused = convolve_diffusion( 

225 input_signal=cin, sigma_array=sigma_array, asymptotic_cutoff_sigma=asymptotic_cutoff_sigma 

226 ) 

227 cout = w_adv @ cin_diffused 

228 else: 

229 # New algorithm: advect-then-smooth (works with any tedges spacing) 

230 cout_unsmoothed = w_adv @ cin 

231 

232 # Rows with zero weight sum (spin-up or zero-flow traceback) carry no 

233 # signal. Flag them so the normalized convolution 

234 # smooth(signal * mask) / smooth(mask) properly renormalizes the kernel 

235 # instead of letting a near-zero value bleed into valid neighbors. 

236 total_coeff = np.sum(w_adv, axis=1) 

237 invalid_mask = total_coeff < _EPSILON_COEFF_SUM 

238 cout_unsmoothed[invalid_mask] = 0.0 

239 validity = np.where(invalid_mask, 0.0, 1.0) 

240 

241 sigma_out = _compute_sigma( 

242 flow=flow_out, 

243 tedges=cout_tedges, 

244 aquifer_pore_volumes=aquifer_pore_volumes, 

245 streamline_length=mean_streamline_length, 

246 molecular_diffusivity=mean_molecular_diffusivity, 

247 longitudinal_dispersivity=mean_longitudinal_dispersivity, 

248 retardation_factor=retardation_factor, 

249 direction="extraction_to_infiltration", 

250 ) 

251 smoothed = convolve_diffusion( 

252 input_signal=cout_unsmoothed, sigma_array=sigma_out, asymptotic_cutoff_sigma=asymptotic_cutoff_sigma 

253 ) 

254 smoothed_validity = convolve_diffusion( 

255 input_signal=validity, sigma_array=sigma_out, asymptotic_cutoff_sigma=asymptotic_cutoff_sigma 

256 ) 

257 with np.errstate(invalid="ignore"): 

258 cout = np.where(smoothed_validity > _EPSILON_COEFF_SUM, smoothed / smoothed_validity, np.nan) 

259 

260 # Mark spin-up rows (signal has not broken through yet) as NaN. Zero-flow 

261 # cout bins keep their 0 output from the zero weight row. 

262 cout[spinup_mask] = np.nan 

263 

264 return cout 

265 

266 

267def extraction_to_infiltration( 

268 *, 

269 cout: npt.ArrayLike, 

270 flow: npt.ArrayLike, 

271 tedges: pd.DatetimeIndex, 

272 cout_tedges: pd.DatetimeIndex, 

273 aquifer_pore_volumes: npt.ArrayLike, 

274 mean_streamline_length: float, 

275 mean_molecular_diffusivity: float, 

276 mean_longitudinal_dispersivity: float, 

277 retardation_factor: float = 1.0, 

278 suppress_dispersion_warning: bool = False, 

279 asymptotic_cutoff_sigma: float | None = 3.0, 

280 regularization_strength: float = 1e-10, 

281 flow_out: npt.ArrayLike | None = None, 

282 suppress_uniform_tedges_check: bool = False, 

283) -> npt.NDArray[np.floating]: 

284 """Reconstruct infiltration concentration from extracted water via Tikhonov inversion. 

285 

286 Inverts the forward transport model (Gaussian diffusion + advection) by building 

287 the combined forward matrix and solving via Tikhonov regularization. This is the 

288 fast approximate counterpart of 

289 :func:`gwtransport.diffusion.extraction_to_infiltration`. 

290 

291 When ``flow_out`` is provided, diffusion is applied on the output grid, 

292 which correctly handles non-uniform ``tedges``. Without ``flow_out``, 

293 diffusion is applied on the input grid and ``tedges`` must have constant 

294 time steps. 

295 

296 Parameters 

297 ---------- 

298 cout : array-like 

299 Concentration of the compound in extracted water. 

300 flow : array-like 

301 Flow rate of water in the aquifer [m3/day]. 

302 tedges : pandas.DatetimeIndex 

303 Time edges for cin (output) and flow data. Has length of len(flow) + 1. 

304 cout_tedges : pandas.DatetimeIndex 

305 Time edges for cout data bins. Has length of len(cout) + 1. 

306 aquifer_pore_volumes : array-like 

307 Array of aquifer pore volumes [m3]. 

308 mean_streamline_length : float 

309 Mean travel distance [m]. Must be positive. 

310 mean_molecular_diffusivity : float 

311 Mean effective molecular diffusivity [m2/day]. Must be non-negative. 

312 mean_longitudinal_dispersivity : float 

313 Mean longitudinal dispersivity [m]. Must be non-negative. 

314 retardation_factor : float, optional 

315 Retardation factor (default 1.0). 

316 suppress_dispersion_warning : bool, optional 

317 Suppress warning about combining multiple pore volumes with dispersivity. 

318 Default is False. 

319 asymptotic_cutoff_sigma : float or None, optional 

320 Truncate the Gaussian kernel at this many standard deviations. 

321 Default is 3.0. 

322 regularization_strength : float, optional 

323 Tikhonov regularization parameter. Default is 1e-10. 

324 flow_out : array-like or None, optional 

325 Flow rate [m3/day] on the output grid (aligned to ``cout_tedges``). 

326 Required when ``tedges`` has non-uniform time steps. Length must 

327 equal ``len(cout_tedges) - 1``. Default is None. 

328 suppress_uniform_tedges_check : bool, optional 

329 Skip the constant-dt check on ``tedges`` when ``flow_out`` is None. 

330 Default is False. 

331 

332 Returns 

333 ------- 

334 numpy.ndarray 

335 Bin-averaged concentration in infiltrating water. Length equals 

336 len(tedges) - 1. NaN values indicate time periods with no valid 

337 contributions from the extraction data. 

338 

339 Warns 

340 ----- 

341 UserWarning 

342 When the forward matrix is rank-deficient. This occurs with constant 

343 flow when the residence time is an integer multiple of the time step 

344 width. To fix, adjust ``aquifer_pore_volumes`` slightly (e.g., 

345 multiply by 1.001). 

346 

347 See Also 

348 -------- 

349 infiltration_to_extraction : Forward operation 

350 gwtransport.diffusion.extraction_to_infiltration : Analytically correct deconvolution 

351 that supports per-pore-volume arrays for streamline_length, molecular_diffusivity, 

352 and longitudinal_dispersivity. 

353 :ref:`concept-dispersion-scales` : Macrodispersion vs microdispersion 

354 

355 Notes 

356 ----- 

357 A single ``mean_streamline_length`` is shared across all pore volumes. This 

358 assumes streamline-length heterogeneity is captured solely through the 

359 pore-volume distribution: the effective cross-sectional area 

360 ``A_eff = V_p / L_mean`` varies with V_p while the path length L is held 

361 fixed. This is appropriate for many systems but breaks down for 

362 partially-penetrating wells or wedge-shaped capture zones, where path 

363 length itself varies between streamtubes. In those cases, use 

364 :func:`gwtransport.diffusion.extraction_to_infiltration` with a per-streamtube 

365 ``streamline_length`` array instead. 

366 """ 

367 # Convert inputs 

368 tedges = pd.DatetimeIndex(tedges) 

369 cout_tedges = pd.DatetimeIndex(cout_tedges) 

370 cout = np.asarray(cout, dtype=float) 

371 flow = np.asarray(flow, dtype=float) 

372 aquifer_pore_volumes = np.asarray(aquifer_pore_volumes, dtype=float) 

373 

374 if flow_out is not None: 

375 flow_out = np.asarray(flow_out, dtype=float) 

376 

377 n_pore_volumes = len(aquifer_pore_volumes) 

378 

379 # Input validation 

380 _validate_inputs( 

381 cin_or_cout=cout, 

382 flow=flow, 

383 tedges=tedges, 

384 cout_tedges=cout_tedges, 

385 aquifer_pore_volumes=aquifer_pore_volumes, 

386 mean_streamline_length=mean_streamline_length, 

387 mean_molecular_diffusivity=mean_molecular_diffusivity, 

388 mean_longitudinal_dispersivity=mean_longitudinal_dispersivity, 

389 is_forward=False, 

390 flow_out=flow_out, 

391 ) 

392 

393 # Dispersion warning 

394 if n_pore_volumes > 1 and mean_longitudinal_dispersivity > 0 and not suppress_dispersion_warning: 

395 _warn_dispersion() 

396 

397 n_cin = len(tedges) - 1 

398 n_cout = len(cout_tedges) - 1 

399 

400 # Build advection weight matrix 

401 w_adv, _ = _infiltration_to_extraction_weights( 

402 tedges=tedges, 

403 cout_tedges=cout_tedges, 

404 aquifer_pore_volumes=aquifer_pore_volumes, 

405 flow=flow, 

406 retardation_factor=retardation_factor, 

407 ) 

408 

409 if flow_out is None: 

410 # Original algorithm: G on input grid, W_forward = W_adv @ G 

411 _check_uniform_tedges(tedges=tedges, suppress=suppress_uniform_tedges_check) 

412 

413 sigma_array = _compute_sigma( 

414 flow=flow, 

415 tedges=tedges, 

416 aquifer_pore_volumes=aquifer_pore_volumes, 

417 streamline_length=mean_streamline_length, 

418 molecular_diffusivity=mean_molecular_diffusivity, 

419 longitudinal_dispersivity=mean_longitudinal_dispersivity, 

420 retardation_factor=retardation_factor, 

421 direction="infiltration_to_extraction", 

422 ) 

423 g_matrix = _build_gaussian_matrix( 

424 n=n_cin, sigma_array=sigma_array, asymptotic_cutoff_sigma=asymptotic_cutoff_sigma 

425 ) 

426 # Combined forward matrix: W_adv @ G via sparse @ dense (avoids dense G allocation) 

427 w_forward = (g_matrix.T @ w_adv.T).T 

428 else: 

429 # New algorithm: G on output grid, W_forward = G @ W_adv 

430 sigma_out = _compute_sigma( 

431 flow=flow_out, 

432 tedges=cout_tedges, 

433 aquifer_pore_volumes=aquifer_pore_volumes, 

434 streamline_length=mean_streamline_length, 

435 molecular_diffusivity=mean_molecular_diffusivity, 

436 longitudinal_dispersivity=mean_longitudinal_dispersivity, 

437 retardation_factor=retardation_factor, 

438 direction="extraction_to_infiltration", 

439 ) 

440 g_matrix = _build_gaussian_matrix( 

441 n=n_cout, sigma_array=sigma_out, asymptotic_cutoff_sigma=asymptotic_cutoff_sigma 

442 ) 

443 w_forward = g_matrix @ w_adv # sparse @ dense 

444 

445 return solve_inverse_transport( 

446 w_forward=w_forward, 

447 observed=cout, 

448 n_output=n_cin, 

449 regularization_strength=regularization_strength, 

450 warn_rank_deficient=True, 

451 ) 

452 

453 

454def gamma_infiltration_to_extraction( 

455 *, 

456 cin: npt.ArrayLike, 

457 flow: npt.ArrayLike, 

458 tedges: pd.DatetimeIndex, 

459 cout_tedges: pd.DatetimeIndex, 

460 mean: float | None = None, 

461 std: float | None = None, 

462 loc: float = 0.0, 

463 alpha: float | None = None, 

464 beta: float | None = None, 

465 n_bins: int = 100, 

466 mean_streamline_length: float, 

467 mean_molecular_diffusivity: float, 

468 mean_longitudinal_dispersivity: float, 

469 retardation_factor: float = 1.0, 

470 suppress_dispersion_warning: bool = False, 

471 asymptotic_cutoff_sigma: float | None = 3.0, 

472 flow_out: npt.ArrayLike | None = None, 

473 suppress_uniform_tedges_check: bool = False, 

474) -> npt.NDArray[np.floating]: 

475 """ 

476 Compute extracted concentration with fast Gaussian diffusion for gamma-distributed pore volumes. 

477 

478 Combines advective transport (based on gamma-distributed pore volumes) with 

479 fast Gaussian diffusion. This is a convenience wrapper around 

480 :func:`infiltration_to_extraction` that parameterizes the aquifer pore volume 

481 distribution as a (shifted) gamma distribution. 

482 

483 Provide either (mean, std) or (alpha, beta); ``loc`` is optional and defaults to 0. 

484 

485 Parameters 

486 ---------- 

487 cin : array-like 

488 Concentration of the compound in infiltrating water. 

489 flow : array-like 

490 Flow rate of water in the aquifer [m3/day]. 

491 tedges : pandas.DatetimeIndex 

492 Time edges for cin and flow data. Has length len(cin) + 1. 

493 cout_tedges : pandas.DatetimeIndex 

494 Time edges for output data bins. Has length of desired output + 1. 

495 mean : float, optional 

496 Mean of the gamma distribution of the aquifer pore volume. Must be strictly 

497 greater than ``loc``. 

498 std : float, optional 

499 Standard deviation of the gamma distribution of the aquifer pore volume 

500 (invariant under the ``loc`` shift). 

501 loc : float, optional 

502 Location (minimum pore volume) of the gamma distribution. Must satisfy 

503 ``0 <= loc < mean``. Default is ``0.0``. 

504 alpha : float, optional 

505 Shape parameter of gamma distribution of the aquifer pore volume (must be > 0). 

506 beta : float, optional 

507 Scale parameter of gamma distribution of the aquifer pore volume (must be > 0). 

508 n_bins : int, optional 

509 Number of bins to discretize the gamma distribution. Default is 100. 

510 mean_streamline_length : float 

511 Mean travel distance through the aquifer [m] averaged over all aquifer 

512 pore volumes. Must be positive. For per-pore-volume arrays, use 

513 :func:`gwtransport.diffusion.gamma_infiltration_to_extraction`. 

514 mean_molecular_diffusivity : float 

515 Mean effective molecular diffusivity [m2/day] averaged over all 

516 aquifer pore volumes. Must be non-negative. For per-pore-volume 

517 arrays, use :func:`gwtransport.diffusion.gamma_infiltration_to_extraction`. 

518 mean_longitudinal_dispersivity : float 

519 Mean longitudinal dispersivity [m] averaged over all aquifer pore 

520 volumes. Must be non-negative. For per-pore-volume arrays, use 

521 :func:`gwtransport.diffusion.gamma_infiltration_to_extraction`. 

522 retardation_factor : float, optional 

523 Retardation factor (default 1.0). Values > 1.0 indicate slower transport. 

524 suppress_dispersion_warning : bool, optional 

525 Suppress warning about combining multiple pore volumes with dispersivity. 

526 Default is False. 

527 asymptotic_cutoff_sigma : float or None, optional 

528 Truncate the Gaussian kernel at this many standard deviations. 

529 Default is 3.0. 

530 flow_out : array-like or None, optional 

531 Flow rate [m3/day] on the output grid (aligned to ``cout_tedges``). 

532 Required when ``tedges`` has non-uniform time steps. Length must 

533 equal ``len(cout_tedges) - 1``. Default is None. 

534 suppress_uniform_tedges_check : bool, optional 

535 Skip the constant-dt check on ``tedges`` when ``flow_out`` is None. 

536 Default is False. 

537 

538 Returns 

539 ------- 

540 numpy.ndarray 

541 Bin-averaged concentration in extracted water. Length equals 

542 len(cout_tedges) - 1. NaN values indicate time periods with no valid 

543 contributions from the infiltration data. 

544 

545 See Also 

546 -------- 

547 infiltration_to_extraction : Transport with explicit pore volume distribution 

548 gamma_extraction_to_infiltration : Reverse operation (deconvolution) 

549 gwtransport.gamma.bins : Create gamma distribution bins 

550 gwtransport.diffusion.gamma_infiltration_to_extraction : Physically rigorous analytical 

551 solution that supports per-pore-volume arrays for streamline_length, 

552 molecular_diffusivity, and longitudinal_dispersivity. 

553 gwtransport.advection.gamma_infiltration_to_extraction : Pure advection (no dispersion) 

554 :ref:`concept-gamma-distribution` : Two-parameter pore volume model 

555 :ref:`concept-dispersion-scales` : Macrodispersion vs microdispersion 

556 

557 Notes 

558 ----- 

559 The APVD is only time-invariant under the steady-streamlines assumption 

560 (see :ref:`assumption-steady-streamlines`). 

561 

562 The spreading from the gamma-distributed pore volumes represents macrodispersion 

563 (aquifer-scale heterogeneity). When ``std`` comes from calibration on measurements, 

564 it absorbs all mixing: macrodispersion, microdispersion, and an average molecular 

565 diffusion contribution. When ``std`` comes from streamline analysis, it represents 

566 macrodispersion only; microdispersion and molecular diffusion can be added via the 

567 dispersion parameters. 

568 See :ref:`concept-dispersion-scales` for guidance on when to add microdispersion. 

569 

570 Examples 

571 -------- 

572 >>> import pandas as pd 

573 >>> import numpy as np 

574 >>> from gwtransport.diffusion_fast import gamma_infiltration_to_extraction 

575 >>> 

576 >>> tedges = pd.date_range(start="2020-01-01", end="2020-01-20", freq="D") 

577 >>> cout_tedges = pd.date_range(start="2020-01-05", end="2020-01-25", freq="D") 

578 >>> cin = np.zeros(len(tedges) - 1) 

579 >>> cin[5:10] = 1.0 

580 >>> flow = np.ones(len(tedges) - 1) * 100.0 

581 >>> 

582 >>> cout = gamma_infiltration_to_extraction( 

583 ... cin=cin, 

584 ... flow=flow, 

585 ... tedges=tedges, 

586 ... cout_tedges=cout_tedges, 

587 ... mean=500.0, 

588 ... std=100.0, 

589 ... n_bins=5, 

590 ... mean_streamline_length=100.0, 

591 ... mean_molecular_diffusivity=1e-4, 

592 ... mean_longitudinal_dispersivity=1.0, 

593 ... ) 

594 """ 

595 bins = gamma.bins(mean=mean, std=std, loc=loc, alpha=alpha, beta=beta, n_bins=n_bins) 

596 return infiltration_to_extraction( 

597 cin=cin, 

598 flow=flow, 

599 tedges=tedges, 

600 cout_tedges=cout_tedges, 

601 aquifer_pore_volumes=bins["expected_values"], 

602 mean_streamline_length=mean_streamline_length, 

603 mean_molecular_diffusivity=mean_molecular_diffusivity, 

604 mean_longitudinal_dispersivity=mean_longitudinal_dispersivity, 

605 retardation_factor=retardation_factor, 

606 suppress_dispersion_warning=suppress_dispersion_warning, 

607 asymptotic_cutoff_sigma=asymptotic_cutoff_sigma, 

608 flow_out=flow_out, 

609 suppress_uniform_tedges_check=suppress_uniform_tedges_check, 

610 ) 

611 

612 

613def gamma_extraction_to_infiltration( 

614 *, 

615 cout: npt.ArrayLike, 

616 flow: npt.ArrayLike, 

617 tedges: pd.DatetimeIndex, 

618 cout_tedges: pd.DatetimeIndex, 

619 mean: float | None = None, 

620 std: float | None = None, 

621 loc: float = 0.0, 

622 alpha: float | None = None, 

623 beta: float | None = None, 

624 n_bins: int = 100, 

625 mean_streamline_length: float, 

626 mean_molecular_diffusivity: float, 

627 mean_longitudinal_dispersivity: float, 

628 retardation_factor: float = 1.0, 

629 suppress_dispersion_warning: bool = False, 

630 asymptotic_cutoff_sigma: float | None = 3.0, 

631 regularization_strength: float = 1e-10, 

632 flow_out: npt.ArrayLike | None = None, 

633 suppress_uniform_tedges_check: bool = False, 

634) -> npt.NDArray[np.floating]: 

635 """ 

636 Reconstruct infiltration concentration from extracted water for gamma-distributed pore volumes. 

637 

638 Inverts the forward transport model (fast Gaussian diffusion + advection with 

639 gamma-distributed pore volumes) via Tikhonov regularization. This is a convenience 

640 wrapper around :func:`extraction_to_infiltration` that parameterizes the aquifer 

641 pore volume distribution as a (shifted) gamma distribution. 

642 

643 Provide either (mean, std) or (alpha, beta); ``loc`` is optional and defaults to 0. 

644 

645 Parameters 

646 ---------- 

647 cout : array-like 

648 Concentration of the compound in extracted water. 

649 flow : array-like 

650 Flow rate of water in the aquifer [m3/day]. 

651 tedges : pandas.DatetimeIndex 

652 Time edges for cin (output) and flow data. Has length of len(flow) + 1. 

653 cout_tedges : pandas.DatetimeIndex 

654 Time edges for cout data bins. Has length of len(cout) + 1. 

655 mean : float, optional 

656 Mean of the gamma distribution of the aquifer pore volume. Must be strictly 

657 greater than ``loc``. 

658 std : float, optional 

659 Standard deviation of the gamma distribution of the aquifer pore volume 

660 (invariant under the ``loc`` shift). 

661 loc : float, optional 

662 Location (minimum pore volume) of the gamma distribution. Must satisfy 

663 ``0 <= loc < mean``. Default is ``0.0``. 

664 alpha : float, optional 

665 Shape parameter of gamma distribution of the aquifer pore volume (must be > 0). 

666 beta : float, optional 

667 Scale parameter of gamma distribution of the aquifer pore volume (must be > 0). 

668 n_bins : int, optional 

669 Number of bins to discretize the gamma distribution. Default is 100. 

670 mean_streamline_length : float 

671 Mean travel distance through the aquifer [m] averaged over all aquifer 

672 pore volumes. Must be positive. For per-pore-volume arrays, use 

673 :func:`gwtransport.diffusion.gamma_extraction_to_infiltration`. 

674 mean_molecular_diffusivity : float 

675 Mean effective molecular diffusivity [m2/day] averaged over all 

676 aquifer pore volumes. Must be non-negative. For per-pore-volume 

677 arrays, use :func:`gwtransport.diffusion.gamma_extraction_to_infiltration`. 

678 mean_longitudinal_dispersivity : float 

679 Mean longitudinal dispersivity [m] averaged over all aquifer pore 

680 volumes. Must be non-negative. For per-pore-volume arrays, use 

681 :func:`gwtransport.diffusion.gamma_extraction_to_infiltration`. 

682 retardation_factor : float, optional 

683 Retardation factor (default 1.0). Values > 1.0 indicate slower transport. 

684 suppress_dispersion_warning : bool, optional 

685 Suppress warning about combining multiple pore volumes with dispersivity. 

686 Default is False. 

687 asymptotic_cutoff_sigma : float or None, optional 

688 Truncate the Gaussian kernel at this many standard deviations. 

689 Default is 3.0. 

690 regularization_strength : float, optional 

691 Tikhonov regularization parameter. Default is 1e-10. 

692 flow_out : array-like or None, optional 

693 Flow rate [m3/day] aligned to ``cout_tedges``. When provided, the 

694 Gaussian matrix operates on the output grid. Length must equal 

695 ``len(cout_tedges) - 1``. Default is None. 

696 suppress_uniform_tedges_check : bool, optional 

697 When True, skip the check that ``tedges`` has constant time steps 

698 (only relevant when ``flow_out`` is None). Default is False. 

699 

700 Returns 

701 ------- 

702 numpy.ndarray 

703 Bin-averaged concentration in infiltrating water. Length equals 

704 len(tedges) - 1. NaN values indicate time periods with no valid 

705 contributions from the extraction data. 

706 

707 See Also 

708 -------- 

709 extraction_to_infiltration : Deconvolution with explicit pore volume distribution 

710 gamma_infiltration_to_extraction : Forward operation (convolution) 

711 gwtransport.gamma.bins : Create gamma distribution bins 

712 gwtransport.diffusion.gamma_extraction_to_infiltration : Physically rigorous analytical 

713 solution that supports per-pore-volume arrays for streamline_length, 

714 molecular_diffusivity, and longitudinal_dispersivity. 

715 gwtransport.advection.gamma_extraction_to_infiltration : Pure advection (no dispersion) 

716 :ref:`concept-gamma-distribution` : Two-parameter pore volume model 

717 :ref:`concept-dispersion-scales` : Macrodispersion vs microdispersion 

718 

719 Notes 

720 ----- 

721 The APVD is only time-invariant under the steady-streamlines assumption 

722 (see :ref:`assumption-steady-streamlines`). 

723 

724 The spreading from the gamma-distributed pore volumes represents macrodispersion 

725 (aquifer-scale heterogeneity). When ``std`` comes from calibration on measurements, 

726 it absorbs all mixing: macrodispersion, microdispersion, and an average molecular 

727 diffusion contribution. When ``std`` comes from streamline analysis, it represents 

728 macrodispersion only; microdispersion and molecular diffusion can be added via the 

729 dispersion parameters. 

730 See :ref:`concept-dispersion-scales` for guidance on when to add microdispersion. 

731 

732 Examples 

733 -------- 

734 >>> import pandas as pd 

735 >>> import numpy as np 

736 >>> from gwtransport.diffusion_fast import gamma_extraction_to_infiltration 

737 >>> 

738 >>> tedges = pd.date_range(start="2020-01-01", end="2020-01-20", freq="D") 

739 >>> cout_tedges = pd.date_range(start="2020-01-05", end="2020-01-25", freq="D") 

740 >>> cout = np.zeros(len(cout_tedges) - 1) 

741 >>> cout[5:10] = 1.0 

742 >>> flow = np.ones(len(tedges) - 1) * 100.0 

743 >>> 

744 >>> cin = gamma_extraction_to_infiltration( 

745 ... cout=cout, 

746 ... flow=flow, 

747 ... tedges=tedges, 

748 ... cout_tedges=cout_tedges, 

749 ... mean=500.0, 

750 ... std=100.0, 

751 ... n_bins=5, 

752 ... mean_streamline_length=100.0, 

753 ... mean_molecular_diffusivity=1e-4, 

754 ... mean_longitudinal_dispersivity=1.0, 

755 ... ) 

756 """ 

757 bins = gamma.bins(mean=mean, std=std, loc=loc, alpha=alpha, beta=beta, n_bins=n_bins) 

758 return extraction_to_infiltration( 

759 cout=cout, 

760 flow=flow, 

761 tedges=tedges, 

762 cout_tedges=cout_tedges, 

763 aquifer_pore_volumes=bins["expected_values"], 

764 mean_streamline_length=mean_streamline_length, 

765 mean_molecular_diffusivity=mean_molecular_diffusivity, 

766 mean_longitudinal_dispersivity=mean_longitudinal_dispersivity, 

767 retardation_factor=retardation_factor, 

768 suppress_dispersion_warning=suppress_dispersion_warning, 

769 asymptotic_cutoff_sigma=asymptotic_cutoff_sigma, 

770 regularization_strength=regularization_strength, 

771 flow_out=flow_out, 

772 suppress_uniform_tedges_check=suppress_uniform_tedges_check, 

773 ) 

774 

775 

776def _compute_sigma( 

777 *, 

778 flow: npt.ArrayLike, 

779 tedges: pd.DatetimeIndex, 

780 aquifer_pore_volumes: npt.ArrayLike, 

781 streamline_length: float, 

782 molecular_diffusivity: float, 

783 longitudinal_dispersivity: float, 

784 retardation_factor: float, 

785 direction: str, 

786) -> npt.NDArray[np.floating]: 

787 """Compute sigma in array-index units with moment-based averaging. 

788 

789 The per-streamtube variance of the Gaussian kernel (in index units) 

790 is sigma_idx^2(V) = L_diff^2(V) * V^2 * R^2 / (Q * dt * L)^2. 

791 L_diff^2 = 2*D_m*tau + 2*alpha_L*L is linear in V (via tau ~ V), 

792 so sigma_idx^2 is cubic + quadratic in V. The averaged variance 

793 over the pore volume distribution is (law of total variance): 

794 

795 E[sigma_idx^2] = R^2 / (Q*dt*L)^2 

796 * (2*D_m*tau_bar/Vbar * E[V^3] + 2*alpha_L*L * E[V^2]) 

797 

798 When a single pore volume is given this reduces to the classical formula. 

799 

800 See :ref:`concept-variance-components` for the derivation. 

801 

802 Parameters 

803 ---------- 

804 flow : array-like 

805 Flow rate [m3/day]. 

806 tedges : DatetimeIndex 

807 Time edges for flow data. 

808 aquifer_pore_volumes : array-like 

809 Pore volumes [m3]. Moments E[V^2] and E[V^3] are computed 

810 from this array. 

811 streamline_length : float 

812 Streamline length [m]. 

813 molecular_diffusivity : float 

814 Effective molecular diffusivity [m2/day]. 

815 longitudinal_dispersivity : float 

816 Longitudinal dispersivity [m]. 

817 retardation_factor : float 

818 Retardation factor [-]. 

819 direction : {'infiltration_to_extraction', 'extraction_to_infiltration'} 

820 Direction for residence time computation. 

821 

822 Returns 

823 ------- 

824 ndarray 

825 Sigma values in array-index units, clipped to [0, 100]. 

826 """ 

827 flow = np.asarray(flow, dtype=float) 

828 aquifer_pore_volumes = np.asarray(aquifer_pore_volumes, dtype=float) 

829 

830 mean_pv = float(np.mean(aquifer_pore_volumes)) 

831 ev2 = float(np.mean(aquifer_pore_volumes**2)) 

832 ev3 = float(np.mean(aquifer_pore_volumes**3)) 

833 

834 # Bin-mean residence time at the mean pore volume [days]. Using the bin-mean 

835 # (rather than the bin-center point sample) is consistent with the bin-edge 

836 # convention: per-bin output quantities are flow-weighted averages over the bin. 

837 rt_array = residence_time_mean( 

838 flow=flow, 

839 flow_tedges=tedges, 

840 tedges_out=tedges, 

841 aquifer_pore_volumes=mean_pv, 

842 retardation_factor=retardation_factor, 

843 direction=direction, 

844 )[0] 

845 

846 # Interpolate NaN values 

847 valid_mask = ~np.isnan(rt_array) 

848 if np.any(valid_mask): 

849 rt_array = np.interp(np.arange(len(rt_array)), np.where(valid_mask)[0], rt_array[valid_mask]) 

850 

851 # Two independent spatial variance components [m^2]: 

852 # molecular: 2 * D_m * tau (linear in V via tau) 

853 # dispersive: 2 * alpha_L * L (constant in V) 

854 mol_var_x = 2.0 * molecular_diffusivity * rt_array 

855 disp_var_x = 2.0 * longitudinal_dispersivity * streamline_length 

856 

857 # Averaged variance in index units (see docstring): 

858 var_numerator = mol_var_x / mean_pv * ev3 + disp_var_x * ev2 

859 

860 timedelta = np.diff(tedges) / pd.to_timedelta(1, unit="D") 

861 # q_dt_l_over_r = Q * dt * L / R has units m^4 (it is the squared scaling 

862 # factor that converts spatial variance [m^2] into index-space variance). 

863 q_dt_l_over_r = flow * timedelta * streamline_length / retardation_factor 

864 

865 with np.errstate(invalid="ignore", divide="ignore"): 

866 sigma_sq = np.where(q_dt_l_over_r > 0.0, var_numerator / q_dt_l_over_r**2, 0.0) 

867 

868 return np.clip(np.sqrt(np.maximum(sigma_sq, 0.0)), 0.0, 100.0) 

869 

870 

871def _check_uniform_tedges(*, tedges: pd.DatetimeIndex, suppress: bool) -> None: 

872 """Raise ValueError if tedges has non-constant time steps. 

873 

874 Raises 

875 ------ 

876 ValueError 

877 If tedges has non-constant time steps and suppress is False. 

878 """ 

879 if suppress: 

880 return 

881 dt = np.diff(tedges) 

882 if not np.all(dt == dt[0]): 

883 msg = ( 

884 "tedges must have constant time steps when flow_out is not provided. " 

885 "Either provide flow_out or set suppress_uniform_tedges_check=True." 

886 ) 

887 raise ValueError(msg) 

888 

889 

890def _build_gaussian_matrix( 

891 *, n: int, sigma_array: npt.ArrayLike, asymptotic_cutoff_sigma: float | None = 3.0 

892) -> sparse.csr_matrix: 

893 """Build sparse Gaussian convolution matrix with CDF-integrated bin weights. 

894 

895 Each row represents a position-specific Gaussian kernel where weights are 

896 computed by integrating the Gaussian CDF over each bin, rather than 

897 point-sampling the PDF. This is more accurate for small sigma values 

898 (< ~2 bins) where the PDF peak can fall between bin centers. 

899 

900 Parameters 

901 ---------- 

902 n : int 

903 Length of the signal (number of bins). 

904 sigma_array : array-like 

905 Array of sigma values of length n. 

906 asymptotic_cutoff_sigma : float or None, optional 

907 Truncate the kernel at this many standard deviations. Set to None to 

908 disable truncation (use full signal length). Default is 3.0. 

909 

910 Returns 

911 ------- 

912 scipy.sparse.csr_matrix 

913 Sparse convolution matrix of shape (n, n). 

914 """ 

915 sigma_array = np.asarray(sigma_array) 

916 

917 # Handle zero sigma values 

918 zero_mask = sigma_array == 0 

919 if np.all(zero_mask): 

920 return sparse.eye(n, format="csr", dtype=float) # pyright: ignore[reportReturnType] 

921 

922 # Get maximum kernel size and create position arrays 

923 max_sigma = np.max(sigma_array) 

924 truncate = asymptotic_cutoff_sigma if asymptotic_cutoff_sigma is not None else n / max(max_sigma, 1.0) 

925 max_radius = int(truncate * max_sigma + 0.5) 

926 

927 # Create arrays for all possible kernel positions 

928 positions = np.arange(-max_radius, max_radius + 1) 

929 

930 # Create a mask for valid sigma values 

931 valid_sigma = ~zero_mask 

932 valid_indices = np.where(valid_sigma)[0] 

933 

934 # Shape: (n_valid_points, 1) 

935 center_positions = valid_indices[:, np.newaxis] 

936 # Shape: (1, max_kernel_size) 

937 kernel_positions = positions[np.newaxis, :] 

938 

939 # Calculate CDF-integrated weights for proper bin averaging. 

940 # Compute erf at unique bin edges, then take differences. Adjacent bins 

941 # share an edge, so this halves the number of erf evaluations. 

942 sigmas = sigma_array[valid_sigma][:, np.newaxis] 

943 sqrt2 = np.sqrt(2) 

944 edges = np.arange(-max_radius - 0.5, max_radius + 1.5) # (2*max_radius + 2,) 

945 erf_at_edges = erf(edges[np.newaxis, :] / (sigmas * sqrt2)) # (n_valid, 2*max_radius+2) 

946 weights = 0.5 * np.diff(erf_at_edges, axis=1) # (n_valid, 2*max_radius+1) 

947 

948 # Normalize each kernel 

949 weights /= np.sum(weights, axis=1)[:, np.newaxis] 

950 

951 # Calculate absolute positions in the signal 

952 absolute_positions = center_positions + kernel_positions 

953 

954 # Handle boundary conditions (nearest-neighbor extrapolation) 

955 absolute_positions = np.clip(absolute_positions, 0, n - 1) 

956 

957 # Create coordinate arrays for sparse matrix 

958 rows = np.repeat(center_positions, weights.shape[1]) 

959 cols = absolute_positions.ravel() 

960 data = weights.ravel() 

961 

962 # Remove zero weights to save memory 

963 nonzero_mask = data != 0 

964 rows = rows[nonzero_mask] 

965 cols = cols[nonzero_mask] 

966 data = data[nonzero_mask] 

967 

968 # Add identity matrix elements for zero-sigma positions 

969 if np.any(zero_mask): 

970 zero_indices = np.where(zero_mask)[0] 

971 rows = np.concatenate([rows, zero_indices]) 

972 cols = np.concatenate([cols, zero_indices]) 

973 data = np.concatenate([data, np.ones(len(zero_indices))]) 

974 

975 return sparse.csr_matrix((data, (rows, cols)), shape=(n, n)) 

976 

977 

978def convolve_diffusion( 

979 *, input_signal: npt.ArrayLike, sigma_array: npt.ArrayLike, asymptotic_cutoff_sigma: float | None = 3.0 

980) -> npt.NDArray[np.floating]: 

981 """Apply Gaussian filter with position-dependent sigma values. 

982 

983 This function extends scipy.ndimage.gaussian_filter1d by allowing the standard 

984 deviation (sigma) of the Gaussian kernel to vary at each point in the signal. 

985 It implements the filter using a sparse convolution matrix where each row 

986 represents a Gaussian kernel with a locally-appropriate standard deviation. 

987 

988 Kernel weights are computed by integrating the Gaussian CDF over each bin, 

989 which is more accurate than point-sampling the PDF for small sigma values. 

990 

991 Parameters 

992 ---------- 

993 input_signal : array-like 

994 One-dimensional input array to be filtered. 

995 sigma_array : array-like 

996 One-dimensional array of standard deviation values, must have same length 

997 as input_signal. 

998 asymptotic_cutoff_sigma : float or None, optional 

999 Truncate the filter at this many standard deviations. Set to None to 

1000 disable truncation. Default is 3.0. 

1001 

1002 Returns 

1003 ------- 

1004 numpy.ndarray 

1005 The filtered input signal. Has the same shape as input_signal. 

1006 

1007 Raises 

1008 ------ 

1009 ValueError 

1010 If input_signal and sigma_array do not have the same length. 

1011 

1012 See Also 

1013 -------- 

1014 scipy.ndimage.gaussian_filter1d : Fixed-sigma Gaussian filtering 

1015 

1016 Notes 

1017 ----- 

1018 At the boundaries, the outer values are repeated to avoid edge effects 

1019 (equivalent to mode='nearest' in `scipy.ndimage.gaussian_filter1d`). 

1020 

1021 Examples 

1022 -------- 

1023 >>> import numpy as np 

1024 >>> from gwtransport.diffusion_fast import convolve_diffusion 

1025 >>> # Create a sample signal 

1026 >>> x = np.linspace(0, 10, 1000) 

1027 >>> signal = np.exp(-((x - 3) ** 2)) + 0.5 * np.exp(-((x - 7) ** 2) / 0.5) 

1028 

1029 >>> # Create position-dependent sigma values 

1030 >>> diffusivity = 0.1 

1031 >>> dt = 0.001 * (1 + np.sin(2 * np.pi * x / 10)) 

1032 >>> dx = x[1] - x[0] 

1033 >>> sigma_array = np.sqrt(2 * diffusivity * dt) / dx 

1034 

1035 >>> # Apply the filter 

1036 >>> filtered = convolve_diffusion(input_signal=signal, sigma_array=sigma_array) 

1037 """ 

1038 input_signal = np.asarray(input_signal) 

1039 sigma_array = np.asarray(sigma_array) 

1040 

1041 if len(input_signal) != len(sigma_array): 

1042 msg = "Input signal and sigma array must have the same length" 

1043 raise ValueError(msg) 

1044 

1045 n = len(input_signal) 

1046 

1047 # Handle zero sigma values 

1048 if np.all(sigma_array == 0): 

1049 return input_signal.copy() 

1050 

1051 g_matrix = _build_gaussian_matrix(n=n, sigma_array=sigma_array, asymptotic_cutoff_sigma=asymptotic_cutoff_sigma) 

1052 return g_matrix.dot(input_signal) 

1053 

1054 

1055def create_example_data( 

1056 *, 

1057 nx: int = 1000, 

1058 domain_length: float = 10.0, 

1059 diffusivity: float = 0.1, 

1060 seed: int | None = None, 

1061) -> tuple[npt.NDArray[np.floating], npt.NDArray[np.floating], npt.NDArray[np.floating], npt.NDArray[np.floating]]: 

1062 """Create example data for demonstrating variable-sigma diffusion. 

1063 

1064 Parameters 

1065 ---------- 

1066 nx : int, optional 

1067 Number of spatial points. Default is 1000. 

1068 domain_length : float, optional 

1069 Domain length. Default is 10.0. 

1070 diffusivity : float, optional 

1071 Diffusivity. Default is 0.1. 

1072 seed : int or None, optional 

1073 Random seed for reproducibility. Default is None. 

1074 

1075 Returns 

1076 ------- 

1077 x : numpy.ndarray 

1078 Spatial coordinates. 

1079 signal : numpy.ndarray 

1080 Initial signal (sum of two Gaussians with noise). 

1081 sigma_array : numpy.ndarray 

1082 Array of sigma values varying in space. 

1083 dt : numpy.ndarray 

1084 Array of time steps varying in space. 

1085 """ 

1086 rng = np.random.default_rng(seed) 

1087 

1088 # Create spatial grid 

1089 x = np.linspace(0, domain_length, nx) 

1090 dx = x[1] - x[0] 

1091 

1092 # Create initial signal (two Gaussians) 

1093 signal = np.exp(-((x - 3) ** 2)) + 0.5 * np.exp(-((x - 7) ** 2) / 0.5) + 0.1 * rng.standard_normal(nx) 

1094 

1095 # Create varying time steps 

1096 dt = 0.001 * (1 + np.sin(2 * np.pi * x / domain_length)) 

1097 

1098 # Calculate corresponding sigma values 

1099 sigma_array = np.sqrt(2 * diffusivity * dt) / dx 

1100 

1101 return x, signal, sigma_array, dt 

1102 

1103 

1104def _validate_inputs( 

1105 *, 

1106 cin_or_cout: np.ndarray, 

1107 flow: np.ndarray, 

1108 tedges: pd.DatetimeIndex, 

1109 cout_tedges: pd.DatetimeIndex, 

1110 aquifer_pore_volumes: np.ndarray, 

1111 mean_streamline_length: float, 

1112 mean_molecular_diffusivity: float, 

1113 mean_longitudinal_dispersivity: float, 

1114 is_forward: bool, 

1115 flow_out: np.ndarray | None = None, 

1116) -> None: 

1117 """Validate inputs for infiltration_to_extraction and extraction_to_infiltration. 

1118 

1119 Raises 

1120 ------ 

1121 ValueError 

1122 If array lengths are inconsistent, mean_molecular_diffusivity or 

1123 mean_longitudinal_dispersivity are negative, cin or cout or flow 

1124 contain NaN values, aquifer_pore_volumes contains non-positive 

1125 values, or mean_streamline_length is non-positive. 

1126 """ 

1127 if is_forward: 

1128 if len(tedges) != len(cin_or_cout) + 1: 

1129 msg = "tedges must have one more element than cin" 

1130 raise ValueError(msg) 

1131 elif len(cout_tedges) != len(cin_or_cout) + 1: 

1132 msg = "cout_tedges must have one more element than cout" 

1133 raise ValueError(msg) 

1134 if len(tedges) != len(flow) + 1: 

1135 msg = "tedges must have one more element than flow" 

1136 raise ValueError(msg) 

1137 if mean_molecular_diffusivity < 0: 

1138 msg = "mean_molecular_diffusivity must be non-negative" 

1139 raise ValueError(msg) 

1140 if mean_longitudinal_dispersivity < 0: 

1141 msg = "mean_longitudinal_dispersivity must be non-negative" 

1142 raise ValueError(msg) 

1143 if np.any(np.isnan(cin_or_cout)): 

1144 msg = f"{'cin' if is_forward else 'cout'} contains NaN values, which are not allowed" 

1145 raise ValueError(msg) 

1146 if np.any(np.isnan(flow)): 

1147 msg = "flow contains NaN values, which are not allowed" 

1148 raise ValueError(msg) 

1149 if np.any(flow < 0): 

1150 msg = "flow must be non-negative (negative flow not supported)" 

1151 raise ValueError(msg) 

1152 if np.any(aquifer_pore_volumes <= 0): 

1153 msg = "aquifer_pore_volumes must be positive" 

1154 raise ValueError(msg) 

1155 if mean_streamline_length <= 0: 

1156 msg = "mean_streamline_length must be positive" 

1157 raise ValueError(msg) 

1158 if flow_out is not None: 

1159 n_cout = len(cout_tedges) - 1 

1160 if len(flow_out) != n_cout: 

1161 msg = f"flow_out must have length len(cout_tedges) - 1 = {n_cout}, got {len(flow_out)}" 

1162 raise ValueError(msg) 

1163 if np.any(np.isnan(flow_out)): 

1164 msg = "flow_out contains NaN values, which are not allowed" 

1165 raise ValueError(msg) 

1166 if np.any(flow_out < 0): 

1167 msg = "flow_out must be non-negative (negative flow not supported)" 

1168 raise ValueError(msg) 

1169 

1170 

1171def _warn_dispersion() -> None: 

1172 """Emit warning about combining multiple pore volumes with dispersivity.""" 

1173 msg = ( 

1174 "Using multiple aquifer_pore_volumes with non-zero mean_longitudinal_dispersivity. " 

1175 "Both represent spreading from velocity heterogeneity at different scales.\n\n" 

1176 "This is appropriate when:\n" 

1177 " - APVD comes from streamline analysis (explicit geometry)\n" 

1178 " - You want to add unresolved microdispersion and molecular diffusion\n\n" 

1179 "This may double-count spreading when:\n" 

1180 " - APVD was calibrated from measurements (microdispersion already included)\n\n" 

1181 "For gamma-parameterized APVD, consider using the 'equivalent APVD std' approach\n" 

1182 "from 05_Diffusion_Dispersion.ipynb to combine both effects in the faster advection\n" 

1183 "module. Note: this variance combination is only valid for continuous (gamma)\n" 

1184 "distributions, not for discrete streamline volumes.\n" 

1185 "Suppress this warning with suppress_dispersion_warning=True." 

1186 ) 

1187 warnings.warn(msg, UserWarning, stacklevel=3)