Coverage for src / gwtransport / advection.py: 78%

190 statements  

« prev     ^ index     » next       coverage.py v7.13.5, created at 2026-04-27 06:32 +0000

1""" 

2Advective Transport Modeling for 1D Aquifer Systems. 

3 

4This module provides functions to model compound transport by advection in one-dimensional 

5aquifer systems, enabling prediction of solute or temperature concentrations in extracted 

6water based on infiltration data and aquifer properties. The model assumes one-dimensional 

7groundwater flow where water infiltrates with concentration ``cin``, flows through the aquifer 

8with pore volume distribution, compounds are transported with retarded velocity (retardation 

9factor >= 1.0), and water is extracted with concentration ``cout``. 

10 

11Available functions: 

12 

13- :func:`infiltration_to_extraction_series` - Single pore volume, time-shift only. Shifts 

14 infiltration time edges forward by residence time. Concentration values remain unchanged 

15 (cout = cin). No support for custom output time edges. Use case: Deterministic transport 

16 with single flow path. 

17 

18- :func:`infiltration_to_extraction` - Arbitrary pore volume distribution, convolution. 

19 Supports explicit distribution of aquifer pore volumes with flow-weighted averaging. 

20 Flexible output time resolution via cout_tedges. Use case: Known pore volume distribution 

21 from streamline analysis. 

22 

23- :func:`gamma_infiltration_to_extraction` - Gamma-distributed pore volumes, convolution. 

24 Models aquifer heterogeneity with 2-parameter gamma distribution. Parameterizable via 

25 (alpha, beta) or (mean, std). Discretizes gamma distribution into equal-probability bins. 

26 Use case: Heterogeneous aquifer with calibrated gamma parameters. 

27 

28Note on dispersion: The spreading from the pore volume distribution (APVD) represents 

29macrodispersion—aquifer-scale velocity heterogeneity that depends on both aquifer 

30properties and hydrological boundary conditions. If APVD was calibrated from 

31measurements, this spreading already includes microdispersion and molecular diffusion. 

32To add microdispersion and molecular diffusion separately (when APVD comes from 

33streamline analysis), use :mod:`gwtransport.diffusion`. 

34See :ref:`concept-dispersion-scales` for details. 

35 

36Note on cross-compound calibration: When APVD is calibrated from measurements of one 

37compound (e.g., temperature with D_m ~ 0.1 m²/day) and used to predict another (e.g., a 

38solute with D_m ~ 1e-4 m²/day), the molecular diffusion contribution baked into the 

39calibrated std may need correction. See :doc:`/examples/05_Diffusion_Dispersion` for 

40the correction procedure. 

41 

42- :func:`extraction_to_infiltration_series` - Single pore volume, time-shift only 

43 (deconvolution). Shifts extraction time edges backward by residence time. Concentration 

44 values remain unchanged (cin = cout). Symmetric inverse of infiltration_to_extraction_series. 

45 Use case: Backward tracing with single flow path. 

46 

47- :func:`extraction_to_infiltration` - Arbitrary pore volume distribution, deconvolution. 

48 Inverts forward transport for arbitrary pore volume distributions. Symmetric inverse of 

49 infiltration_to_extraction. Flow-weighted averaging in reverse direction. Use case: 

50 Estimating infiltration history from extraction data. 

51 

52- :func:`gamma_extraction_to_infiltration` - Gamma-distributed pore volumes, deconvolution. 

53 Inverts forward transport for gamma-distributed pore volumes. Symmetric inverse of 

54 gamma_infiltration_to_extraction. Use case: Calibrating infiltration conditions from 

55 extraction measurements. 

56 

57- :func:`infiltration_to_extraction_front_tracking` - Exact front tracking with nonlinear sorption. 

58 Event-driven algorithm that solves 1D advective transport with Freundlich or Langmuir isotherm 

59 using analytical integration of shock and rarefaction waves. Machine-precision physics (no 

60 numerical dispersion). Returns bin-averaged concentrations. Use case: Sharp concentration fronts 

61 with exact mass balance required, single deterministic flow path. 

62 

63- :func:`infiltration_to_extraction_front_tracking_detailed` - Front tracking with piecewise structure. 

64 Same as infiltration_to_extraction_front_tracking but also returns complete piecewise analytical 

65 structure including all events, segments, and callable analytical forms C(t). Use case: Detailed 

66 analysis of shock and rarefaction wave dynamics. 

67 

68This file is part of gwtransport which is released under AGPL-3.0 license. 

69See the ./LICENSE file or go to https://github.com/gwtransport/gwtransport/blob/main/LICENSE for full license details. 

70""" 

71 

72import numpy as np 

73import numpy.typing as npt 

74import pandas as pd 

75 

76from gwtransport import gamma 

77from gwtransport.advection_utils import _infiltration_to_extraction_weights 

78from gwtransport.fronttracking.math import ( 

79 EPSILON_FREUNDLICH_N, 

80 ConstantRetardation, 

81 FreundlichSorption, 

82 LangmuirSorption, 

83 SorptionModel, 

84) 

85from gwtransport.fronttracking.output import compute_bin_averaged_concentration_exact 

86from gwtransport.fronttracking.solver import FrontTracker 

87from gwtransport.fronttracking.waves import CharacteristicWave, RarefactionWave, ShockWave 

88from gwtransport.residence_time import residence_time 

89from gwtransport.utils import solve_inverse_transport 

90 

91 

92def infiltration_to_extraction_series( 

93 *, 

94 flow: npt.ArrayLike, 

95 tedges: pd.DatetimeIndex, 

96 aquifer_pore_volume: float, 

97 retardation_factor: float = 1.0, 

98) -> pd.DatetimeIndex: 

99 """ 

100 Compute extraction time edges from infiltration time edges using residence time shifts. 

101 

102 This function shifts infiltration time edges forward in time based on residence 

103 times computed from flow rates and aquifer properties. The concentration values remain 

104 unchanged (cout equals cin), only the time edges are shifted. This assumes a single pore 

105 volume (no distribution) and deterministic advective transport. 

106 

107 NOTE: This function is specifically designed for single aquifer pore volumes and does not 

108 support custom output time edges (cout_tedges). For distributions of aquifer pore volumes 

109 or custom output time grids, use `infiltration_to_extraction` instead. 

110 

111 Parameters 

112 ---------- 

113 flow : array-like 

114 Flow rate values in the aquifer [m3/day]. 

115 Length must match the number of time bins defined by tedges (len(tedges) - 1). 

116 tedges : pandas.DatetimeIndex 

117 Time edges defining bins for both cin and flow data. Has length of len(flow) + 1. 

118 aquifer_pore_volume : float 

119 Single aquifer pore volume [m3] used to compute residence times. 

120 retardation_factor : float, optional 

121 Retardation factor of the compound in the aquifer (default 1.0). 

122 Values > 1.0 indicate slower transport due to sorption/interaction. 

123 

124 Returns 

125 ------- 

126 pandas.DatetimeIndex 

127 Time edges for the extracted water concentration. Same length as tedges. 

128 The concentration values in the extracted water (cout) equal cin, but are 

129 aligned with these shifted time edges. 

130 

131 Raises 

132 ------ 

133 ValueError 

134 If ``retardation_factor`` is less than 1.0. 

135 

136 Examples 

137 -------- 

138 Basic usage with constant flow: 

139 

140 >>> import pandas as pd 

141 >>> import numpy as np 

142 >>> from gwtransport.utils import compute_time_edges 

143 >>> from gwtransport.advection import infiltration_to_extraction_series 

144 >>> 

145 >>> # Create input data 

146 >>> dates = pd.date_range(start="2020-01-01", end="2020-01-10", freq="D") 

147 >>> tedges = compute_time_edges( 

148 ... tedges=None, tstart=None, tend=dates, number_of_bins=len(dates) 

149 ... ) 

150 >>> 

151 >>> # Constant concentration and flow 

152 >>> cin = np.ones(len(dates)) * 10.0 

153 >>> flow = np.ones(len(dates)) * 100.0 # 100 m3/day 

154 >>> 

155 >>> # Run infiltration_to_extraction_series with 500 m3 pore volume 

156 >>> tedges_out = infiltration_to_extraction_series( 

157 ... flow=flow, 

158 ... tedges=tedges, 

159 ... aquifer_pore_volume=500.0, 

160 ... ) 

161 >>> len(tedges_out) 

162 11 

163 >>> # Time shift should be residence time = pore_volume / flow = 500 / 100 = 5 days 

164 >>> tedges_out[0] - tedges[0] 

165 Timedelta('5 days 00:00:00') 

166 

167 Plotting the input and output concentrations: 

168 

169 >>> import matplotlib.pyplot as plt 

170 >>> from gwtransport.utils import step_plot_coords 

171 >>> plt.plot( 

172 ... *step_plot_coords(tedges, cin), label="Concentration of infiltrated water" 

173 ... ) # doctest: +SKIP 

174 >>> 

175 >>> # cout equals cin, just with shifted time edges 

176 >>> plt.plot( 

177 ... *step_plot_coords(tedges_out, cin), label="Concentration of extracted water" 

178 ... ) # doctest: +SKIP 

179 >>> plt.xlabel("Time") # doctest: +SKIP 

180 >>> plt.ylabel("Concentration") # doctest: +SKIP 

181 >>> plt.legend() # doctest: +SKIP 

182 >>> plt.show() # doctest: +SKIP 

183 

184 With retardation factor: 

185 

186 >>> tedges_out = infiltration_to_extraction_series( 

187 ... flow=flow, 

188 ... tedges=tedges, 

189 ... aquifer_pore_volume=500.0, 

190 ... retardation_factor=2.0, # Doubles residence time 

191 ... ) 

192 >>> # Time shift is doubled: 500 * 2.0 / 100 = 10 days 

193 >>> tedges_out[0] - tedges[0] 

194 Timedelta('10 days 00:00:00') 

195 """ 

196 if retardation_factor < 1.0: 

197 msg = "retardation_factor must be >= 1.0" 

198 raise ValueError(msg) 

199 rt_array = residence_time( 

200 flow=flow, 

201 flow_tedges=tedges, 

202 index=tedges, 

203 aquifer_pore_volumes=aquifer_pore_volume, 

204 retardation_factor=retardation_factor, 

205 direction="infiltration_to_extraction", 

206 ) 

207 return tedges + pd.to_timedelta(rt_array[0], unit="D", errors="coerce") 

208 

209 

210def extraction_to_infiltration_series( 

211 *, 

212 flow: npt.ArrayLike, 

213 tedges: pd.DatetimeIndex, 

214 aquifer_pore_volume: float, 

215 retardation_factor: float = 1.0, 

216) -> pd.DatetimeIndex: 

217 """ 

218 Compute infiltration time edges from extraction time edges (deconvolution). 

219 

220 This function shifts extraction time edges backward in time based on residence 

221 times computed from flow rates and aquifer properties. The concentration values remain 

222 unchanged (cin equals cout), only the time edges are shifted. This assumes a single pore 

223 volume (no distribution) and deterministic advective transport. This is the inverse 

224 operation of infiltration_to_extraction_series. 

225 

226 NOTE: This function is specifically designed for single aquifer pore volumes and does not 

227 support custom output time edges (cin_tedges). For distributions of aquifer pore volumes 

228 or custom output time grids, use `extraction_to_infiltration` instead. 

229 

230 Parameters 

231 ---------- 

232 flow : array-like 

233 Flow rate values in the aquifer [m3/day]. 

234 Length must match the number of time bins defined by tedges (len(tedges) - 1). 

235 tedges : pandas.DatetimeIndex 

236 Time edges defining bins for both cout and flow data. Has length of len(flow) + 1. 

237 aquifer_pore_volume : float 

238 Single aquifer pore volume [m3] used to compute residence times. 

239 retardation_factor : float, optional 

240 Retardation factor of the compound in the aquifer (default 1.0). 

241 Values > 1.0 indicate slower transport due to sorption/interaction. 

242 

243 Returns 

244 ------- 

245 pandas.DatetimeIndex 

246 Time edges for the infiltrating water concentration. Same length as tedges. 

247 The concentration values in the infiltrating water (cin) equal cout, but are 

248 aligned with these shifted time edges. 

249 

250 Raises 

251 ------ 

252 ValueError 

253 If ``retardation_factor`` is less than 1.0. 

254 

255 Examples 

256 -------- 

257 Basic usage with constant flow: 

258 

259 >>> import pandas as pd 

260 >>> import numpy as np 

261 >>> from gwtransport.utils import compute_time_edges 

262 >>> from gwtransport.advection import extraction_to_infiltration_series 

263 >>> 

264 >>> # Create input data 

265 >>> dates = pd.date_range(start="2020-01-01", end="2020-01-10", freq="D") 

266 >>> tedges = compute_time_edges( 

267 ... tedges=None, tstart=None, tend=dates, number_of_bins=len(dates) 

268 ... ) 

269 >>> 

270 >>> # Constant concentration and flow 

271 >>> cout = np.ones(len(dates)) * 10.0 

272 >>> flow = np.ones(len(dates)) * 100.0 # 100 m3/day 

273 >>> 

274 >>> # Run extraction_to_infiltration_series with 500 m3 pore volume 

275 >>> tedges_out = extraction_to_infiltration_series( 

276 ... flow=flow, 

277 ... tedges=tedges, 

278 ... aquifer_pore_volume=500.0, 

279 ... ) 

280 >>> len(tedges_out) 

281 11 

282 >>> # Time shift should be residence time = pore_volume / flow = 500 / 100 = 5 days (backward) 

283 >>> # First few elements are NaT due to insufficient history, check a valid index 

284 >>> tedges[5] - tedges_out[5] 

285 Timedelta('5 days 00:00:00') 

286 

287 Plotting the input and output concentrations: 

288 

289 >>> import matplotlib.pyplot as plt 

290 >>> from gwtransport.utils import step_plot_coords 

291 >>> plt.plot( 

292 ... *step_plot_coords(tedges, cout), label="Concentration of extracted water" 

293 ... ) # doctest: +SKIP 

294 >>> 

295 >>> # cin equals cout, just with shifted time edges 

296 >>> plt.plot( 

297 ... *step_plot_coords(tedges_out, cout), 

298 ... label="Concentration of infiltrated water", 

299 ... ) # doctest: +SKIP 

300 >>> plt.xlabel("Time") # doctest: +SKIP 

301 >>> plt.ylabel("Concentration") # doctest: +SKIP 

302 >>> plt.legend() # doctest: +SKIP 

303 >>> plt.show() # doctest: +SKIP 

304 

305 With retardation factor: 

306 

307 >>> tedges_out = extraction_to_infiltration_series( 

308 ... flow=flow, 

309 ... tedges=tedges, 

310 ... aquifer_pore_volume=500.0, 

311 ... retardation_factor=2.0, # Doubles residence time 

312 ... ) 

313 >>> # Time shift is doubled: 500 * 2.0 / 100 = 10 days (backward) 

314 >>> # With longer residence time, more elements are NaT, check the last valid index 

315 >>> tedges[10] - tedges_out[10] 

316 Timedelta('10 days 00:00:00') 

317 """ 

318 if retardation_factor < 1.0: 

319 msg = "retardation_factor must be >= 1.0" 

320 raise ValueError(msg) 

321 rt_array = residence_time( 

322 flow=flow, 

323 flow_tedges=tedges, 

324 index=tedges, 

325 aquifer_pore_volumes=aquifer_pore_volume, 

326 retardation_factor=retardation_factor, 

327 direction="extraction_to_infiltration", 

328 ) 

329 return tedges - pd.to_timedelta(rt_array[0], unit="D", errors="coerce") 

330 

331 

332def gamma_infiltration_to_extraction( 

333 *, 

334 cin: npt.ArrayLike, 

335 flow: npt.ArrayLike, 

336 tedges: pd.DatetimeIndex, 

337 cout_tedges: pd.DatetimeIndex, 

338 mean: float | None = None, 

339 std: float | None = None, 

340 loc: float = 0.0, 

341 alpha: float | None = None, 

342 beta: float | None = None, 

343 n_bins: int = 100, 

344 retardation_factor: float = 1.0, 

345) -> npt.NDArray[np.floating]: 

346 """ 

347 Compute the concentration of the extracted water by shifting cin with its residence time. 

348 

349 The compound is retarded in the aquifer with a retardation factor. The residence 

350 time is computed based on the flow rate of the water in the aquifer and the pore volume 

351 of the aquifer. The aquifer pore volume is approximated by a (shifted) gamma distribution 

352 parameterized by either (mean, std, loc) or (alpha, beta, loc). 

353 

354 This function represents infiltration to extraction modeling (equivalent to convolution). 

355 

356 Provide either (mean, std) or (alpha, beta); ``loc`` is optional and defaults to 0. 

357 

358 Parameters 

359 ---------- 

360 cin : array-like 

361 Concentration of the compound in infiltrating water or temperature of infiltrating 

362 water. The model assumes this value is constant over each interval 

363 ``[tedges[i], tedges[i+1])``. 

364 flow : array-like 

365 Flow rate of water in the aquifer [m3/day]. The model assumes this value is 

366 constant over each interval ``[tedges[i], tedges[i+1])``. 

367 tedges : pandas.DatetimeIndex 

368 Time edges for both cin and flow data. Used to compute the cumulative concentration. 

369 Has a length of one more than `cin` and `flow`. 

370 cout_tedges : pandas.DatetimeIndex 

371 Time edges for the output data. Used to compute the cumulative concentration. 

372 Has a length of one more than the desired output length. 

373 mean : float, optional 

374 Mean of the gamma distribution of the aquifer pore volume. Must be strictly 

375 greater than ``loc``. 

376 std : float, optional 

377 Standard deviation of the gamma distribution of the aquifer pore volume 

378 (invariant under the ``loc`` shift). 

379 loc : float, optional 

380 Location (minimum pore volume) of the gamma distribution. Must satisfy 

381 ``0 <= loc < mean``. Default is ``0.0``. 

382 alpha : float, optional 

383 Shape parameter of gamma distribution of the aquifer pore volume (must be > 0). 

384 beta : float, optional 

385 Scale parameter of gamma distribution of the aquifer pore volume (must be > 0). 

386 n_bins : int 

387 Number of bins to discretize the gamma distribution. 

388 retardation_factor : float 

389 Retardation factor of the compound in the aquifer. 

390 

391 Returns 

392 ------- 

393 numpy.ndarray 

394 Concentration of the compound in the extracted water [ng/m3] or temperature. 

395 

396 See Also 

397 -------- 

398 infiltration_to_extraction : Transport with explicit pore volume distribution 

399 gamma_extraction_to_infiltration : Reverse operation (deconvolution) 

400 gwtransport.gamma.bins : Create gamma distribution bins 

401 gwtransport.residence_time.residence_time : Compute residence times 

402 gwtransport.diffusion.infiltration_to_extraction : Add microdispersion and molecular diffusion 

403 :ref:`concept-gamma-distribution` : Two-parameter pore volume model 

404 :ref:`assumption-gamma-distribution` : When gamma distribution is adequate 

405 

406 Notes 

407 ----- 

408 The APVD is only time-invariant under the steady-streamlines assumption 

409 (see :ref:`assumption-steady-streamlines`). 

410 

411 The spreading from the gamma-distributed pore volumes represents macrodispersion 

412 (aquifer-scale heterogeneity). When ``std`` comes from calibration on measurements, 

413 it absorbs all mixing: macrodispersion, microdispersion, and an average molecular 

414 diffusion contribution. When calibrating with the diffusion module, these three 

415 components are taken into account separately. When ``std`` comes from streamline 

416 analysis, it represents macrodispersion only; microdispersion and molecular diffusion 

417 can be added via the diffusion module or by combining variances 

418 (see :doc:`/examples/05_Diffusion_Dispersion`). 

419 

420 If calibrating with one compound (e.g., temperature) and predicting for another 

421 (e.g., a solute), the baked-in molecular diffusion contribution may need 

422 correction — see :doc:`/examples/05_Diffusion_Dispersion`. 

423 See :ref:`concept-dispersion-scales` for guidance on when to add microdispersion 

424 using the diffusion module. 

425 

426 Examples 

427 -------- 

428 Basic usage with alpha and beta parameters: 

429 

430 >>> import pandas as pd 

431 >>> import numpy as np 

432 >>> from gwtransport.utils import compute_time_edges 

433 >>> from gwtransport.advection import gamma_infiltration_to_extraction 

434 >>> 

435 >>> # Create input data with aligned time edges 

436 >>> dates = pd.date_range(start="2020-01-01", end="2020-01-20", freq="D") 

437 >>> tedges = compute_time_edges( 

438 ... tedges=None, tstart=None, tend=dates, number_of_bins=len(dates) 

439 ... ) 

440 >>> 

441 >>> # Create output time edges (can be different alignment) 

442 >>> cout_dates = pd.date_range(start="2020-01-05", end="2020-01-15", freq="D") 

443 >>> cout_tedges = compute_time_edges( 

444 ... tedges=None, tstart=None, tend=cout_dates, number_of_bins=len(cout_dates) 

445 ... ) 

446 >>> 

447 >>> # Input concentration and flow (same length, aligned with tedges) 

448 >>> cin = pd.Series(np.ones(len(dates)), index=dates) 

449 >>> flow = pd.Series(np.ones(len(dates)) * 100, index=dates) # 100 m3/day 

450 >>> 

451 >>> # Run gamma_infiltration_to_extraction with alpha/beta parameters 

452 >>> cout = gamma_infiltration_to_extraction( 

453 ... cin=cin, 

454 ... flow=flow, 

455 ... tedges=tedges, 

456 ... cout_tedges=cout_tedges, 

457 ... alpha=10.0, 

458 ... beta=10.0, 

459 ... n_bins=5, 

460 ... ) 

461 >>> cout.shape 

462 (11,) 

463 

464 Using mean and std parameters instead: 

465 

466 >>> cout = gamma_infiltration_to_extraction( 

467 ... cin=cin, 

468 ... flow=flow, 

469 ... tedges=tedges, 

470 ... cout_tedges=cout_tedges, 

471 ... mean=100.0, 

472 ... std=20.0, 

473 ... n_bins=5, 

474 ... ) 

475 

476 With retardation factor: 

477 

478 >>> cout = gamma_infiltration_to_extraction( 

479 ... cin=cin, 

480 ... flow=flow, 

481 ... tedges=tedges, 

482 ... cout_tedges=cout_tedges, 

483 ... alpha=10.0, 

484 ... beta=10.0, 

485 ... retardation_factor=2.0, # Doubles residence time 

486 ... ) 

487 """ 

488 bins = gamma.bins(mean=mean, std=std, loc=loc, alpha=alpha, beta=beta, n_bins=n_bins) 

489 return infiltration_to_extraction( 

490 cin=cin, 

491 flow=flow, 

492 tedges=tedges, 

493 cout_tedges=cout_tedges, 

494 aquifer_pore_volumes=bins["expected_values"], 

495 retardation_factor=retardation_factor, 

496 ) 

497 

498 

499def gamma_extraction_to_infiltration( 

500 *, 

501 cout: npt.ArrayLike, 

502 flow: npt.ArrayLike, 

503 tedges: pd.DatetimeIndex, 

504 cout_tedges: pd.DatetimeIndex, 

505 mean: float | None = None, 

506 std: float | None = None, 

507 loc: float = 0.0, 

508 alpha: float | None = None, 

509 beta: float | None = None, 

510 n_bins: int = 100, 

511 retardation_factor: float = 1.0, 

512 regularization_strength: float = 1e-10, 

513) -> npt.NDArray[np.floating]: 

514 """ 

515 Compute the concentration of the infiltrating water from extracted water (deconvolution). 

516 

517 The compound is retarded in the aquifer with a retardation factor. The residence 

518 time is computed based on the flow rate of the water in the aquifer and the pore volume 

519 of the aquifer. The aquifer pore volume is approximated by a (shifted) gamma distribution 

520 parameterized by either (mean, std, loc) or (alpha, beta, loc). 

521 

522 This function represents extraction to infiltration modeling (equivalent to deconvolution). 

523 It is symmetric to gamma_infiltration_to_extraction. 

524 

525 Provide either (mean, std) or (alpha, beta); ``loc`` is optional and defaults to 0. 

526 

527 Parameters 

528 ---------- 

529 cout : array-like 

530 Concentration of the compound in extracted water or temperature of extracted 

531 water. The model assumes this value is constant over each interval 

532 ``[cout_tedges[i], cout_tedges[i+1])``. 

533 flow : array-like 

534 Flow rate of water in the aquifer [m3/day]. The model assumes this value is 

535 constant over each interval ``[tedges[i], tedges[i+1])``. 

536 tedges : pandas.DatetimeIndex 

537 Time edges for cin (output) and flow data. 

538 Has a length of one more than `flow`. 

539 cout_tedges : pandas.DatetimeIndex 

540 Time edges for the cout data. 

541 Has a length of one more than `cout`. 

542 mean : float, optional 

543 Mean of the gamma distribution of the aquifer pore volume. Must be strictly 

544 greater than ``loc``. 

545 std : float, optional 

546 Standard deviation of the gamma distribution of the aquifer pore volume 

547 (invariant under the ``loc`` shift). 

548 loc : float, optional 

549 Location (minimum pore volume) of the gamma distribution. Must satisfy 

550 ``0 <= loc < mean``. Default is ``0.0``. 

551 alpha : float, optional 

552 Shape parameter of gamma distribution of the aquifer pore volume (must be > 0). 

553 beta : float, optional 

554 Scale parameter of gamma distribution of the aquifer pore volume (must be > 0). 

555 n_bins : int 

556 Number of bins to discretize the gamma distribution. 

557 retardation_factor : float, optional 

558 Retardation factor of the compound in the aquifer (default 1.0). 

559 Values > 1.0 indicate slower transport due to sorption/interaction. 

560 regularization_strength : float, optional 

561 Tikhonov regularization parameter λ. See 

562 :func:`extraction_to_infiltration` for details. Default is 1e-10. 

563 

564 Returns 

565 ------- 

566 numpy.ndarray 

567 Concentration of the compound in the infiltrating water [ng/m3] or temperature. 

568 

569 See Also 

570 -------- 

571 extraction_to_infiltration : Deconvolution with explicit pore volume distribution 

572 gamma_infiltration_to_extraction : Forward operation (convolution) 

573 gwtransport.gamma.bins : Create gamma distribution bins 

574 gwtransport.diffusion.extraction_to_infiltration : Deconvolution with microdispersion and molecular diffusion 

575 :ref:`concept-gamma-distribution` : Two-parameter pore volume model 

576 :ref:`assumption-gamma-distribution` : When gamma distribution is adequate 

577 

578 Notes 

579 ----- 

580 The APVD is only time-invariant under the steady-streamlines assumption 

581 (see :ref:`assumption-steady-streamlines`). 

582 

583 The spreading from the gamma-distributed pore volumes represents macrodispersion 

584 (aquifer-scale heterogeneity). When ``std`` comes from calibration on measurements, 

585 it absorbs all mixing: macrodispersion, microdispersion, and an average molecular 

586 diffusion contribution. When calibrating with the diffusion module, these three 

587 components are taken into account separately. When ``std`` comes from streamline 

588 analysis, it represents macrodispersion only; microdispersion and molecular diffusion 

589 can be added via the diffusion module or by combining variances 

590 (see :doc:`/examples/05_Diffusion_Dispersion`). 

591 

592 If calibrating with one compound (e.g., temperature) and predicting for another 

593 (e.g., a solute), the baked-in molecular diffusion contribution may need 

594 correction — see :doc:`/examples/05_Diffusion_Dispersion`. 

595 See :ref:`concept-dispersion-scales` for guidance on when to add microdispersion 

596 using the diffusion module. 

597 

598 Examples 

599 -------- 

600 Basic usage with alpha and beta parameters: 

601 

602 >>> import pandas as pd 

603 >>> import numpy as np 

604 >>> from gwtransport.utils import compute_time_edges 

605 >>> from gwtransport.advection import gamma_extraction_to_infiltration 

606 >>> 

607 >>> # Create cin/flow time edges 

608 >>> cin_dates = pd.date_range(start="2019-12-25", end="2020-01-15", freq="D") 

609 >>> tedges = compute_time_edges( 

610 ... tedges=None, tstart=None, tend=cin_dates, number_of_bins=len(cin_dates) 

611 ... ) 

612 >>> 

613 >>> # Create cout time edges 

614 >>> cout_dates = pd.date_range(start="2020-01-01", end="2020-01-20", freq="D") 

615 >>> cout_tedges = compute_time_edges( 

616 ... tedges=None, tstart=None, tend=cout_dates, number_of_bins=len(cout_dates) 

617 ... ) 

618 >>> 

619 >>> # Input concentration and flow 

620 >>> cout = np.ones(len(cout_dates)) 

621 >>> flow = np.ones(len(cin_dates)) * 100 # 100 m3/day 

622 >>> 

623 >>> # Run gamma_extraction_to_infiltration with alpha/beta parameters 

624 >>> cin = gamma_extraction_to_infiltration( 

625 ... cout=cout, 

626 ... flow=flow, 

627 ... tedges=tedges, 

628 ... cout_tedges=cout_tedges, 

629 ... alpha=10.0, 

630 ... beta=10.0, 

631 ... n_bins=5, 

632 ... ) 

633 >>> cin.shape 

634 (22,) 

635 

636 Using mean and std parameters instead: 

637 

638 >>> cin = gamma_extraction_to_infiltration( 

639 ... cout=cout, 

640 ... flow=flow, 

641 ... tedges=tedges, 

642 ... cout_tedges=cout_tedges, 

643 ... mean=100.0, 

644 ... std=20.0, 

645 ... n_bins=5, 

646 ... ) 

647 

648 With retardation factor: 

649 

650 >>> cin = gamma_extraction_to_infiltration( 

651 ... cout=cout, 

652 ... flow=flow, 

653 ... tedges=tedges, 

654 ... cout_tedges=cout_tedges, 

655 ... alpha=10.0, 

656 ... beta=10.0, 

657 ... retardation_factor=2.0, # Doubles residence time 

658 ... ) 

659 """ 

660 bins = gamma.bins(mean=mean, std=std, loc=loc, alpha=alpha, beta=beta, n_bins=n_bins) 

661 return extraction_to_infiltration( 

662 cout=cout, 

663 flow=flow, 

664 tedges=tedges, 

665 cout_tedges=cout_tedges, 

666 aquifer_pore_volumes=bins["expected_values"], 

667 retardation_factor=retardation_factor, 

668 regularization_strength=regularization_strength, 

669 ) 

670 

671 

672def infiltration_to_extraction( 

673 *, 

674 cin: npt.ArrayLike, 

675 flow: npt.ArrayLike, 

676 tedges: pd.DatetimeIndex, 

677 cout_tedges: pd.DatetimeIndex, 

678 aquifer_pore_volumes: npt.ArrayLike, 

679 retardation_factor: float = 1.0, 

680) -> npt.NDArray[np.floating]: 

681 """ 

682 Compute the concentration of the extracted water using flow-weighted advection. 

683 

684 This function implements an infiltration to extraction advection model where cin and flow values 

685 correspond to the same aligned time bins defined by tedges. 

686 

687 The algorithm: 

688 1. Computes residence times for each pore volume at cout time edges 

689 2. Calculates infiltration time edges by subtracting residence times 

690 3. Determines temporal overlaps between infiltration and cin time windows 

691 4. Creates flow-weighted overlap matrices normalized by total weights 

692 5. Computes weighted contributions and averages across pore volumes 

693 

694 

695 Parameters 

696 ---------- 

697 cin : array-like 

698 Concentration values of infiltrating water or temperature [concentration units]. 

699 Length must match the number of time bins defined by tedges. The model assumes 

700 this value is constant over each interval ``[tedges[i], tedges[i+1])``. 

701 flow : array-like 

702 Flow rate values in the aquifer [m3/day]. 

703 Length must match cin and the number of time bins defined by tedges. The model 

704 assumes this value is constant over each interval ``[tedges[i], tedges[i+1])``. 

705 tedges : pandas.DatetimeIndex 

706 Time edges defining bins for both cin and flow data. Has length of 

707 len(cin) + 1 and len(flow) + 1. 

708 cout_tedges : pandas.DatetimeIndex 

709 Time edges for output data bins. Has length of desired output + 1. 

710 Can have different time alignment and resolution than tedges. 

711 aquifer_pore_volumes : array-like 

712 Array of aquifer pore volumes [m3] representing the distribution 

713 of residence times in the aquifer system. 

714 retardation_factor : float, optional 

715 Retardation factor of the compound in the aquifer (default 1.0). 

716 Values > 1.0 indicate slower transport due to sorption/interaction. 

717 

718 Returns 

719 ------- 

720 numpy.ndarray 

721 Flow-weighted concentration in the extracted water. Same units as cin. 

722 Length equals len(cout_tedges) - 1. NaN values indicate time periods 

723 with no valid contributions from the infiltration data. 

724 

725 Raises 

726 ------ 

727 ValueError 

728 If tedges length doesn't match cin/flow arrays plus one, or if 

729 infiltration time edges become non-monotonic (invalid input conditions). 

730 

731 See Also 

732 -------- 

733 gamma_infiltration_to_extraction : Transport with gamma-distributed pore volumes 

734 extraction_to_infiltration : Reverse operation (deconvolution) 

735 infiltration_to_extraction_series : Simple time-shift for single pore volume 

736 gwtransport.residence_time.residence_time : Compute residence times from flow and pore volume 

737 gwtransport.residence_time.freundlich_retardation : Compute concentration-dependent retardation 

738 :ref:`concept-pore-volume-distribution` : Background on aquifer heterogeneity modeling 

739 :ref:`concept-transport-equation` : Flow-weighted averaging approach 

740 

741 Examples 

742 -------- 

743 Basic usage with pandas Series: 

744 

745 >>> import pandas as pd 

746 >>> import numpy as np 

747 >>> from gwtransport.utils import compute_time_edges 

748 >>> from gwtransport.advection import infiltration_to_extraction 

749 >>> 

750 >>> # Create input data 

751 >>> dates = pd.date_range(start="2020-01-01", end="2020-01-20", freq="D") 

752 >>> tedges = compute_time_edges( 

753 ... tedges=None, tstart=None, tend=dates, number_of_bins=len(dates) 

754 ... ) 

755 >>> 

756 >>> # Create output time edges (different alignment) 

757 >>> cout_dates = pd.date_range(start="2020-01-05", end="2020-01-15", freq="D") 

758 >>> cout_tedges = compute_time_edges( 

759 ... tedges=None, tstart=None, tend=cout_dates, number_of_bins=len(cout_dates) 

760 ... ) 

761 >>> 

762 >>> # Input concentration and flow 

763 >>> cin = pd.Series(np.ones(len(dates)), index=dates) 

764 >>> flow = pd.Series(np.ones(len(dates)) * 100, index=dates) # 100 m3/day 

765 >>> 

766 >>> # Define distribution of aquifer pore volumes 

767 >>> aquifer_pore_volumes = np.array([50, 100, 200]) # m3 

768 >>> 

769 >>> # Run infiltration_to_extraction 

770 >>> cout = infiltration_to_extraction( 

771 ... cin=cin, 

772 ... flow=flow, 

773 ... tedges=tedges, 

774 ... cout_tedges=cout_tedges, 

775 ... aquifer_pore_volumes=aquifer_pore_volumes, 

776 ... ) 

777 >>> cout.shape 

778 (11,) 

779 

780 Using array inputs instead of pandas Series: 

781 

782 >>> # Convert to arrays 

783 >>> cin_values = cin.values 

784 >>> flow_values = flow.values 

785 >>> 

786 >>> cout = infiltration_to_extraction( 

787 ... cin=cin_values, 

788 ... flow=flow, 

789 ... tedges=tedges, 

790 ... cout_tedges=cout_tedges, 

791 ... aquifer_pore_volumes=aquifer_pore_volumes, 

792 ... ) 

793 

794 With constant retardation factor (linear sorption): 

795 

796 >>> cout = infiltration_to_extraction( 

797 ... cin=cin, 

798 ... flow=flow, 

799 ... tedges=tedges, 

800 ... cout_tedges=cout_tedges, 

801 ... aquifer_pore_volumes=aquifer_pore_volumes, 

802 ... retardation_factor=2.0, # Compound moves twice as slowly 

803 ... ) 

804 

805 Note: For concentration-dependent retardation (nonlinear sorption), 

806 use `infiltration_to_extraction_front_tracking_detailed` instead, as this 

807 function only supports constant (float) retardation factors. 

808 

809 Using single pore volume: 

810 

811 >>> single_volume = np.array([100]) # Single 100 m3 pore volume 

812 >>> cout = infiltration_to_extraction( 

813 ... cin=cin, 

814 ... flow=flow, 

815 ... tedges=tedges, 

816 ... cout_tedges=cout_tedges, 

817 ... aquifer_pore_volumes=single_volume, 

818 ... ) 

819 """ 

820 tedges = pd.DatetimeIndex(tedges) 

821 cout_tedges = pd.DatetimeIndex(cout_tedges) 

822 

823 # Convert to arrays for vectorized operations 

824 cin = np.asarray(cin) 

825 flow = np.asarray(flow) 

826 aquifer_pore_volumes = np.asarray(aquifer_pore_volumes) 

827 

828 if len(tedges) != len(cin) + 1: 

829 msg = "tedges must have one more element than cin" 

830 raise ValueError(msg) 

831 if len(tedges) != len(flow) + 1: 

832 msg = "tedges must have one more element than flow" 

833 raise ValueError(msg) 

834 

835 # Validate inputs do not contain NaN values 

836 if np.any(np.isnan(cin)): 

837 msg = "cin contains NaN values, which are not allowed" 

838 raise ValueError(msg) 

839 if np.any(np.isnan(flow)): 

840 msg = "flow contains NaN values, which are not allowed" 

841 raise ValueError(msg) 

842 if np.any(flow < 0): 

843 msg = "flow must be non-negative (negative flow not supported)" 

844 raise ValueError(msg) 

845 if retardation_factor < 1.0: 

846 msg = "retardation_factor must be >= 1.0" 

847 raise ValueError(msg) 

848 

849 # Compute normalized weights (includes all pre-computation) 

850 normalized_weights, spinup_mask = _infiltration_to_extraction_weights( 

851 tedges=tedges, 

852 cout_tedges=cout_tedges, 

853 aquifer_pore_volumes=aquifer_pore_volumes, 

854 flow=flow, 

855 retardation_factor=retardation_factor, 

856 ) 

857 

858 # Apply to concentrations. Spin-up rows (no streamtube traced back into 

859 # the cin range) become NaN; zero-flow rows naturally produce 0 from the 

860 # zero weight row. 

861 out = normalized_weights.dot(cin) 

862 out[spinup_mask] = np.nan 

863 

864 return out 

865 

866 

867def extraction_to_infiltration( 

868 *, 

869 cout: npt.ArrayLike, 

870 flow: npt.ArrayLike, 

871 tedges: pd.DatetimeIndex, 

872 cout_tedges: pd.DatetimeIndex, 

873 aquifer_pore_volumes: npt.ArrayLike, 

874 retardation_factor: float = 1.0, 

875 regularization_strength: float = 1e-10, 

876) -> npt.NDArray[np.floating]: 

877 """ 

878 Compute the concentration of the infiltrating water from extracted water (deconvolution). 

879 

880 Inverts the forward transport model by solving the linear system 

881 ``W_forward @ cin = cout`` where ``W_forward`` is the weight matrix from 

882 :func:`infiltration_to_extraction`. Uses Tikhonov regularization to 

883 smoothly blend data fitting with a physically motivated target 

884 (transpose-and-normalize of the forward matrix). 

885 

886 Well-determined modes (large singular values relative to √λ) are 

887 dominated by the data; poorly-determined modes are pulled toward the 

888 target. This avoids edge oscillations and is less sensitive to the 

889 regularization parameter than truncated SVD (``rcond``). 

890 

891 Parameters 

892 ---------- 

893 cout : array-like 

894 Concentration values of extracted water [concentration units]. 

895 Length must match the number of time bins defined by cout_tedges. The model 

896 assumes this value is constant over each interval 

897 ``[cout_tedges[i], cout_tedges[i+1])``. 

898 flow : array-like 

899 Flow rate values in the aquifer [m3/day]. 

900 Length must match the number of time bins defined by tedges. The model assumes 

901 this value is constant over each interval ``[tedges[i], tedges[i+1])``. 

902 tedges : pandas.DatetimeIndex 

903 Time edges defining bins for both cin (output) and flow data. Has length of 

904 len(flow) + 1. Output cin has length len(tedges) - 1. 

905 cout_tedges : pandas.DatetimeIndex 

906 Time edges for cout data bins. Has length of len(cout) + 1. 

907 Can have different time alignment and resolution than tedges. 

908 aquifer_pore_volumes : array-like 

909 Array of aquifer pore volumes [m3] representing the distribution 

910 of residence times in the aquifer system. 

911 retardation_factor : float, optional 

912 Retardation factor of the compound in the aquifer (default 1.0). 

913 Values > 1.0 indicate slower transport due to sorption/interaction. 

914 regularization_strength : float, optional 

915 Tikhonov regularization parameter λ. Controls the tradeoff between 

916 fitting the data (``||W cin - cout||²``) and staying close to the 

917 regularization target (``λ ||cin - cin_target||²``). The target is 

918 the transpose-and-normalize of the forward matrix applied to cout. 

919 

920 Larger values trust the target more (smoother, more biased); smaller 

921 values trust the data more (noisier, less biased). The solution 

922 varies continuously with λ. Default is 1e-10. 

923 

924 A good starting value for noisy data is 

925 ``λ ≈ (noise_std / signal_amplitude)²``. For example, temperature 

926 data with 0.05 °C noise and ~10 °C seasonal amplitude suggests 

927 ``regularization_strength ≈ (0.05 / 10)² ≈ 2.5e-5``. Increase by 

928 a factor of 2-10 for additional smoothing. For noiseless synthetic 

929 data (e.g., roundtrip tests), the default 1e-10 preserves machine 

930 precision. 

931 

932 Returns 

933 ------- 

934 numpy.ndarray 

935 Concentration in the infiltrating water. Same units as cout. 

936 Length equals len(tedges) - 1. NaN values indicate time periods 

937 with no temporal overlap with the extraction data. 

938 

939 Raises 

940 ------ 

941 ValueError 

942 If tedges length doesn't match flow plus one, if cout_tedges length 

943 doesn't match cout plus one, or if inputs contain NaN. 

944 

945 See Also 

946 -------- 

947 gamma_extraction_to_infiltration : Deconvolution with gamma-distributed pore volumes 

948 infiltration_to_extraction : Forward operation (convolution) 

949 extraction_to_infiltration_series : Simple time-shift for single pore volume 

950 gwtransport.residence_time.residence_time : Compute residence times from flow and pore volume 

951 gwtransport.utils.solve_tikhonov : Solver used for inversion 

952 :ref:`concept-pore-volume-distribution` : Background on aquifer heterogeneity modeling 

953 :ref:`concept-transport-equation` : Flow-weighted averaging approach 

954 

955 Examples 

956 -------- 

957 Basic usage with pandas Series: 

958 

959 >>> import pandas as pd 

960 >>> import numpy as np 

961 >>> from gwtransport.utils import compute_time_edges 

962 >>> from gwtransport.advection import extraction_to_infiltration 

963 >>> 

964 >>> # Create cin/flow time edges 

965 >>> cin_dates = pd.date_range(start="2019-12-25", end="2020-01-15", freq="D") 

966 >>> tedges = compute_time_edges( 

967 ... tedges=None, tstart=None, tend=cin_dates, number_of_bins=len(cin_dates) 

968 ... ) 

969 >>> 

970 >>> # Create cout time edges 

971 >>> cout_dates = pd.date_range(start="2020-01-01", end="2020-01-20", freq="D") 

972 >>> cout_tedges = compute_time_edges( 

973 ... tedges=None, tstart=None, tend=cout_dates, number_of_bins=len(cout_dates) 

974 ... ) 

975 >>> 

976 >>> # Input concentration and flow 

977 >>> cout = np.ones(len(cout_dates)) 

978 >>> flow = np.ones(len(cin_dates)) * 100 # 100 m3/day 

979 >>> 

980 >>> # Define distribution of aquifer pore volumes 

981 >>> aquifer_pore_volumes = np.array([50, 100, 200]) # m3 

982 >>> 

983 >>> # Run extraction_to_infiltration 

984 >>> cin = extraction_to_infiltration( 

985 ... cout=cout, 

986 ... flow=flow, 

987 ... tedges=tedges, 

988 ... cout_tedges=cout_tedges, 

989 ... aquifer_pore_volumes=aquifer_pore_volumes, 

990 ... ) 

991 >>> cin.shape 

992 (22,) 

993 

994 Round-trip reconstruction (symmetric with infiltration_to_extraction): 

995 

996 >>> from gwtransport.advection import infiltration_to_extraction 

997 >>> cin_original = np.sin(np.linspace(0, 2 * np.pi, len(cin_dates))) + 2 

998 >>> cout = infiltration_to_extraction( 

999 ... cin=cin_original, 

1000 ... flow=flow, 

1001 ... tedges=tedges, 

1002 ... cout_tedges=cout_tedges, 

1003 ... aquifer_pore_volumes=aquifer_pore_volumes, 

1004 ... ) 

1005 >>> cin_recovered = extraction_to_infiltration( 

1006 ... cout=cout, 

1007 ... flow=flow, 

1008 ... tedges=tedges, 

1009 ... cout_tedges=cout_tedges, 

1010 ... aquifer_pore_volumes=aquifer_pore_volumes, 

1011 ... ) 

1012 """ 

1013 tedges = pd.DatetimeIndex(tedges) 

1014 cout_tedges = pd.DatetimeIndex(cout_tedges) 

1015 

1016 # Convert to arrays for vectorized operations 

1017 cout = np.asarray(cout) 

1018 flow = np.asarray(flow) 

1019 

1020 if len(tedges) != len(flow) + 1: 

1021 msg = "tedges must have one more element than flow" 

1022 raise ValueError(msg) 

1023 if len(cout_tedges) != len(cout) + 1: 

1024 msg = "cout_tedges must have one more element than cout" 

1025 raise ValueError(msg) 

1026 

1027 # Validate inputs do not contain NaN values 

1028 if np.any(np.isnan(cout)): 

1029 msg = "cout contains NaN values, which are not allowed" 

1030 raise ValueError(msg) 

1031 if np.any(np.isnan(flow)): 

1032 msg = "flow contains NaN values, which are not allowed" 

1033 raise ValueError(msg) 

1034 if retardation_factor < 1.0: 

1035 msg = "retardation_factor must be >= 1.0" 

1036 raise ValueError(msg) 

1037 

1038 aquifer_pore_volumes = np.asarray(aquifer_pore_volumes) 

1039 n_cin = len(tedges) - 1 

1040 

1041 # Build forward weight matrix: W_forward @ cin = cout 

1042 w_forward, _ = _infiltration_to_extraction_weights( 

1043 tedges=tedges, 

1044 cout_tedges=cout_tedges, 

1045 aquifer_pore_volumes=aquifer_pore_volumes, 

1046 flow=flow, 

1047 retardation_factor=retardation_factor, 

1048 ) 

1049 

1050 return solve_inverse_transport( 

1051 w_forward=w_forward, 

1052 observed=cout, 

1053 n_output=n_cin, 

1054 regularization_strength=regularization_strength, 

1055 ) 

1056 

1057 

1058def _validate_front_tracking_inputs( 

1059 *, 

1060 cin: npt.ArrayLike, 

1061 flow: npt.ArrayLike, 

1062 tedges: pd.DatetimeIndex, 

1063 cout_tedges: pd.DatetimeIndex, 

1064 aquifer_pore_volumes: npt.ArrayLike, 

1065 freundlich_k: float | None, 

1066 freundlich_n: float | None, 

1067 bulk_density: float | None, 

1068 porosity: float | None, 

1069 retardation_factor: float | None, 

1070 langmuir_s_max: float | None, 

1071 langmuir_k_l: float | None, 

1072) -> tuple[ 

1073 npt.NDArray[np.float64], 

1074 npt.NDArray[np.float64], 

1075 pd.DatetimeIndex, 

1076 pd.DatetimeIndex, 

1077 npt.NDArray[np.float64], 

1078 SorptionModel, 

1079 npt.NDArray[np.float64], 

1080]: 

1081 """Validate inputs and create sorption object for front tracking functions. 

1082 

1083 Returns 

1084 ------- 

1085 tuple 

1086 Validated and converted inputs: (cin, flow, tedges, cout_tedges, 

1087 aquifer_pore_volumes, sorption, cout_tedges_days). 

1088 

1089 Raises 

1090 ------ 

1091 ValueError 

1092 If array lengths are inconsistent, values are non-physical (negative 

1093 concentrations, non-positive flows, NaN values, non-positive pore 

1094 volumes), retardation_factor < 1, Freundlich or Langmuir parameters 

1095 are missing or non-positive, freundlich_n equals 1, or physical 

1096 parameters are invalid. 

1097 """ 

1098 cin = np.asarray(cin, dtype=float) 

1099 flow = np.asarray(flow, dtype=float) 

1100 tedges = pd.DatetimeIndex(tedges) 

1101 cout_tedges = pd.DatetimeIndex(cout_tedges) 

1102 aquifer_pore_volumes = np.asarray(aquifer_pore_volumes, dtype=float) 

1103 

1104 if len(tedges) != len(cin) + 1: 

1105 msg = "tedges must have length len(cin) + 1" 

1106 raise ValueError(msg) 

1107 if len(flow) != len(cin): 

1108 msg = "flow must have same length as cin" 

1109 raise ValueError(msg) 

1110 if np.any(cin < 0): 

1111 msg = "cin must be non-negative" 

1112 raise ValueError(msg) 

1113 if np.any(np.isnan(cin)) or np.any(np.isnan(flow)): 

1114 msg = "cin and flow must not contain NaN" 

1115 raise ValueError(msg) 

1116 if np.any(flow < 0): 

1117 msg = "flow must be non-negative (negative flow not supported)" 

1118 raise ValueError(msg) 

1119 if np.any(aquifer_pore_volumes <= 0): 

1120 msg = "aquifer_pore_volumes must be positive" 

1121 raise ValueError(msg) 

1122 

1123 # Convert cout_tedges to days (relative to tedges[0]) for output computation 

1124 t_ref = tedges[0] 

1125 cout_tedges_days = ((cout_tedges - t_ref) / pd.Timedelta(days=1)).values 

1126 

1127 # Determine which sorption model is requested 

1128 has_retardation = retardation_factor is not None 

1129 has_freundlich = freundlich_k is not None or freundlich_n is not None 

1130 has_langmuir = langmuir_s_max is not None or langmuir_k_l is not None 

1131 n_models = has_retardation + has_freundlich + has_langmuir 

1132 

1133 if n_models == 0: 

1134 msg = ( 

1135 "Must provide one of: retardation_factor, Freundlich parameters " 

1136 "(freundlich_k, freundlich_n, bulk_density, porosity), or Langmuir parameters " 

1137 "(langmuir_s_max, langmuir_k_l, bulk_density, porosity)" 

1138 ) 

1139 raise ValueError(msg) 

1140 if n_models > 1: 

1141 msg = "Only one sorption model can be specified (retardation_factor, Freundlich, or Langmuir)" 

1142 raise ValueError(msg) 

1143 

1144 # Create sorption object 

1145 if retardation_factor is not None: 

1146 if retardation_factor < 1.0: 

1147 msg = "retardation_factor must be >= 1.0" 

1148 raise ValueError(msg) 

1149 

1150 sorption: SorptionModel = ConstantRetardation(retardation_factor=retardation_factor) 

1151 elif has_freundlich: 

1152 if freundlich_k is None or freundlich_n is None or bulk_density is None or porosity is None: 

1153 msg = "All Freundlich parameters required (freundlich_k, freundlich_n, bulk_density, porosity)" 

1154 raise ValueError(msg) 

1155 if freundlich_k <= 0 or freundlich_n <= 0: 

1156 msg = "Freundlich parameters must be positive" 

1157 raise ValueError(msg) 

1158 if abs(freundlich_n - 1.0) < EPSILON_FREUNDLICH_N: 

1159 msg = "freundlich_n = 1 not supported (use retardation_factor for linear case)" 

1160 raise ValueError(msg) 

1161 if bulk_density <= 0 or not 0 < porosity < 1: 

1162 msg = "Invalid physical parameters" 

1163 raise ValueError(msg) 

1164 

1165 sorption = FreundlichSorption( 

1166 k_f=freundlich_k, 

1167 n=freundlich_n, 

1168 bulk_density=bulk_density, 

1169 porosity=porosity, 

1170 ) 

1171 else: 

1172 if langmuir_s_max is None or langmuir_k_l is None or bulk_density is None or porosity is None: 

1173 msg = "All Langmuir parameters required (langmuir_s_max, langmuir_k_l, bulk_density, porosity)" 

1174 raise ValueError(msg) 

1175 if langmuir_s_max <= 0 or langmuir_k_l <= 0: 

1176 msg = "Langmuir parameters must be positive" 

1177 raise ValueError(msg) 

1178 if bulk_density <= 0 or not 0 < porosity < 1: 

1179 msg = "Invalid physical parameters" 

1180 raise ValueError(msg) 

1181 

1182 sorption = LangmuirSorption( 

1183 s_max=langmuir_s_max, 

1184 k_l=langmuir_k_l, 

1185 bulk_density=bulk_density, 

1186 porosity=porosity, 

1187 ) 

1188 

1189 return cin, flow, tedges, cout_tedges, aquifer_pore_volumes, sorption, cout_tedges_days 

1190 

1191 

1192def _flow_weighted_front_tracking_output( 

1193 cout_tedges_days: npt.NDArray[np.floating], 

1194 flow_tedges_days: npt.NDArray[np.floating], 

1195 flow: npt.NDArray[np.floating], 

1196 v_outlet: float, 

1197 waves: list, 

1198 sorption: SorptionModel, 

1199) -> npt.NDArray[np.floating]: 

1200 """Compute flow-weighted bin-averaged concentration from front-tracking output. 

1201 

1202 Splits output bins at flow boundaries so that Q is constant within each 

1203 sub-bin, then combines sub-bins with flow-weighting: 

1204 ``c_avg = Σ(Q_k · c_k · dt_k) / Σ(Q_k · dt_k)``. 

1205 

1206 This is the per-streamtube outlet concentration for one pore volume: 

1207 mass flux divided by water flux for the streamtube's contribution to each 

1208 output bin. Aggregation across streamtubes is the caller's responsibility 

1209 (simple arithmetic mean over streamtubes — equal flow per streamtube). 

1210 

1211 Parameters 

1212 ---------- 

1213 cout_tedges_days : ndarray 

1214 Output time bin edges [days from reference]. 

1215 flow_tedges_days : ndarray 

1216 Flow time bin edges [days from reference]. 

1217 flow : ndarray 

1218 Flow rate per flow bin [m³/day]. 

1219 v_outlet : float 

1220 Outlet volume position [m³]. 

1221 waves : list 

1222 Wave list from front tracking simulation. 

1223 sorption : object 

1224 Sorption model. 

1225 

1226 Returns 

1227 ------- 

1228 ndarray 

1229 Flow-weighted bin-averaged concentrations. Length = len(cout_tedges_days) - 1. 

1230 """ 

1231 # Merge cout edges with flow edges that fall within the cout range 

1232 inner_flow_edges = flow_tedges_days[ 

1233 (flow_tedges_days > cout_tedges_days[0]) & (flow_tedges_days < cout_tedges_days[-1]) 

1234 ] 

1235 fine_edges = np.unique(np.concatenate([cout_tedges_days, inner_flow_edges])) 

1236 

1237 # Compute time-averaged C on fine grid 

1238 c_fine = compute_bin_averaged_concentration_exact( 

1239 t_edges=fine_edges, 

1240 v_outlet=v_outlet, 

1241 waves=waves, 

1242 sorption=sorption, 

1243 ) 

1244 

1245 # Map each fine sub-bin to its flow value 

1246 fine_mids = (fine_edges[:-1] + fine_edges[1:]) / 2 

1247 flow_idx = np.searchsorted(flow_tedges_days[1:], fine_mids, side="left") 

1248 flow_idx = np.clip(flow_idx, 0, len(flow) - 1) 

1249 q_fine = flow[flow_idx] 

1250 dt_fine = np.diff(fine_edges) 

1251 

1252 # Map each fine sub-bin to its original output bin 

1253 cout_bin_idx = np.searchsorted(cout_tedges_days[1:], fine_mids, side="left") 

1254 cout_bin_idx = np.clip(cout_bin_idx, 0, len(cout_tedges_days) - 2) 

1255 

1256 # Vectorized per-bin flow-weighted average: 

1257 # c_out[k] = sum_i (Q_i * c_i * dt_i) / sum_i (Q_i * dt_i) for fine sub-bins i in bin k 

1258 n_cout = len(cout_tedges_days) - 1 

1259 qdt_product = q_fine * dt_fine 

1260 cqdt_product = c_fine * qdt_product 

1261 denominator = np.bincount(cout_bin_idx, weights=qdt_product, minlength=n_cout).astype(np.float64) 

1262 numerator = np.bincount(cout_bin_idx, weights=cqdt_product, minlength=n_cout).astype(np.float64) 

1263 c_out = np.zeros(n_cout) 

1264 valid = denominator > 0 

1265 c_out[valid] = numerator[valid] / denominator[valid] 

1266 return c_out 

1267 

1268 

1269def infiltration_to_extraction_front_tracking( 

1270 *, 

1271 cin: npt.ArrayLike, 

1272 flow: npt.ArrayLike, 

1273 tedges: pd.DatetimeIndex, 

1274 cout_tedges: pd.DatetimeIndex, 

1275 aquifer_pore_volumes: npt.ArrayLike, 

1276 freundlich_k: float | None = None, 

1277 freundlich_n: float | None = None, 

1278 bulk_density: float | None = None, 

1279 porosity: float | None = None, 

1280 retardation_factor: float | None = None, 

1281 langmuir_s_max: float | None = None, 

1282 langmuir_k_l: float | None = None, 

1283 max_iterations: int = 10000, 

1284) -> npt.NDArray[np.floating]: 

1285 """ 

1286 Compute extracted concentration using exact front tracking with nonlinear sorption. 

1287 

1288 Uses event-driven analytical algorithm that tracks shock waves, rarefaction waves, 

1289 and characteristics with machine precision. No numerical dispersion, exact mass 

1290 balance to floating-point precision. 

1291 

1292 Exactly one sorption model must be specified: 

1293 

1294 - ``retardation_factor`` for constant (linear) retardation. 

1295 - ``freundlich_k`` + ``freundlich_n`` + ``bulk_density`` + ``porosity`` for 

1296 Freundlich isotherm. 

1297 - ``langmuir_s_max`` + ``langmuir_k_l`` + ``bulk_density`` + ``porosity`` for 

1298 Langmuir isotherm. 

1299 

1300 Parameters 

1301 ---------- 

1302 cin : array-like 

1303 Infiltration concentration [mg/L or any units]. 

1304 Length = len(tedges) - 1. The model assumes this value is constant over each 

1305 interval ``[tedges[i], tedges[i+1])``. 

1306 flow : array-like 

1307 Flow rate [m³/day]. Must be positive. 

1308 Length = len(tedges) - 1. The model assumes this value is constant over each 

1309 interval ``[tedges[i], tedges[i+1])``. 

1310 tedges : pandas.DatetimeIndex 

1311 Time bin edges. Length = len(cin) + 1. 

1312 cout_tedges : pandas.DatetimeIndex 

1313 Output time bin edges. Can be different from tedges. 

1314 Length determines output array size. 

1315 aquifer_pore_volumes : array-like 

1316 Array of aquifer pore volumes [m³] representing the distribution 

1317 of residence times in the aquifer system. Each pore volume must be positive. 

1318 freundlich_k : float, optional 

1319 Freundlich coefficient [(m³/kg)^(1/n)]. Must be positive. 

1320 freundlich_n : float, optional 

1321 Freundlich exponent [-]. Must be positive and != 1. 

1322 bulk_density : float, optional 

1323 Bulk density [kg/m³]. Must be positive. 

1324 Shared by Freundlich and Langmuir models. 

1325 porosity : float, optional 

1326 Porosity [-]. Must be in (0, 1). 

1327 Shared by Freundlich and Langmuir models. 

1328 retardation_factor : float, optional 

1329 Constant retardation factor [-]. Must be >= 1.0. 

1330 langmuir_s_max : float, optional 

1331 Langmuir maximum sorption capacity [mg/kg]. Must be positive. 

1332 langmuir_k_l : float, optional 

1333 Langmuir half-saturation constant [mg/L]. Must be positive. 

1334 max_iterations : int, optional 

1335 Maximum number of events. Default 10000. 

1336 

1337 Returns 

1338 ------- 

1339 cout : numpy.ndarray 

1340 Flow-weighted extraction concentration averaged across all pore volumes. 

1341 Length = len(cout_tedges) - 1. 

1342 

1343 See Also 

1344 -------- 

1345 infiltration_to_extraction_front_tracking_detailed : Returns detailed structure 

1346 infiltration_to_extraction : Convolution-based approach for linear case 

1347 gamma_infiltration_to_extraction : For distributions of pore volumes 

1348 :ref:`concept-nonlinear-sorption` : Freundlich isotherm and front-tracking theory 

1349 :ref:`assumption-advection-dominated` : When diffusion/dispersion is negligible 

1350 

1351 Notes 

1352 ----- 

1353 **Spin-up Period**: 

1354 The function computes the first arrival time t_first. Concentrations 

1355 before t_first are affected by unknown initial conditions and should 

1356 not be used for analysis. Use `infiltration_to_extraction_front_tracking_detailed` 

1357 to access t_first. 

1358 

1359 **Machine Precision**: 

1360 All calculations use exact analytical formulas. Mass balance is conserved 

1361 to floating-point precision (~1e-14 relative error). No numerical tolerances 

1362 are used for time/position calculations. 

1363 

1364 **Physical Correctness**: 

1365 - All shocks satisfy Lax entropy condition 

1366 - Rarefaction waves use self-similar solutions 

1367 - Causality is strictly enforced 

1368 

1369 Examples 

1370 -------- 

1371 >>> import numpy as np 

1372 >>> import pandas as pd 

1373 >>> 

1374 >>> # Pulse injection with single pore volume 

1375 >>> tedges = pd.date_range("2020-01-01", periods=4, freq="10D") 

1376 >>> cin = np.array([0.0, 10.0, 0.0]) 

1377 >>> flow = np.array([100.0, 100.0, 100.0]) 

1378 >>> cout_tedges = pd.date_range("2020-01-01", periods=10, freq="5D") 

1379 >>> 

1380 >>> cout = infiltration_to_extraction_front_tracking( 

1381 ... cin=cin, 

1382 ... flow=flow, 

1383 ... tedges=tedges, 

1384 ... cout_tedges=cout_tedges, 

1385 ... aquifer_pore_volumes=np.array([500.0]), 

1386 ... freundlich_k=0.01, 

1387 ... freundlich_n=2.0, 

1388 ... bulk_density=1500.0, 

1389 ... porosity=0.3, 

1390 ... ) 

1391 

1392 With multiple pore volumes (distribution): 

1393 

1394 >>> aquifer_pore_volumes = np.array([400.0, 500.0, 600.0]) 

1395 >>> cout = infiltration_to_extraction_front_tracking( 

1396 ... cin=cin, 

1397 ... flow=flow, 

1398 ... tedges=tedges, 

1399 ... cout_tedges=cout_tedges, 

1400 ... aquifer_pore_volumes=aquifer_pore_volumes, 

1401 ... freundlich_k=0.01, 

1402 ... freundlich_n=2.0, 

1403 ... bulk_density=1500.0, 

1404 ... porosity=0.3, 

1405 ... ) 

1406 """ 

1407 cin, flow, tedges, cout_tedges, aquifer_pore_volumes, sorption, cout_tedges_days = _validate_front_tracking_inputs( 

1408 cin=cin, 

1409 flow=flow, 

1410 tedges=tedges, 

1411 cout_tedges=cout_tedges, 

1412 aquifer_pore_volumes=aquifer_pore_volumes, 

1413 freundlich_k=freundlich_k, 

1414 freundlich_n=freundlich_n, 

1415 bulk_density=bulk_density, 

1416 porosity=porosity, 

1417 retardation_factor=retardation_factor, 

1418 langmuir_s_max=langmuir_s_max, 

1419 langmuir_k_l=langmuir_k_l, 

1420 ) 

1421 

1422 # Flow time edges in days (same reference as cout_tedges_days) 

1423 t_ref = tedges[0] 

1424 flow_tedges_days = ((tedges - t_ref) / pd.Timedelta(days=1)).values 

1425 

1426 # Each pore-volume bin from the gamma distribution is an equal-mass streamtube, 

1427 # so all streamtubes carry equal flow at the outlet. The bundle outlet 

1428 # concentration is the simple arithmetic mean over streamtubes. 

1429 cout_all = np.zeros((len(aquifer_pore_volumes), len(cout_tedges) - 1)) 

1430 

1431 for i, aquifer_pore_volume in enumerate(aquifer_pore_volumes): 

1432 tracker = FrontTracker( 

1433 cin=cin, 

1434 flow=flow, 

1435 tedges=tedges, 

1436 aquifer_pore_volume=aquifer_pore_volume, 

1437 sorption=sorption, 

1438 ) 

1439 

1440 tracker.run(max_iterations=max_iterations) 

1441 

1442 cout_all[i, :] = _flow_weighted_front_tracking_output( 

1443 cout_tedges_days=cout_tedges_days, 

1444 flow_tedges_days=flow_tedges_days, 

1445 flow=flow, 

1446 v_outlet=aquifer_pore_volume, 

1447 waves=tracker.state.waves, 

1448 sorption=sorption, 

1449 ) 

1450 

1451 return np.mean(cout_all, axis=0) 

1452 

1453 

1454def infiltration_to_extraction_front_tracking_detailed( 

1455 *, 

1456 cin: npt.ArrayLike, 

1457 flow: npt.ArrayLike, 

1458 tedges: pd.DatetimeIndex, 

1459 cout_tedges: pd.DatetimeIndex, 

1460 aquifer_pore_volumes: npt.ArrayLike, 

1461 freundlich_k: float | None = None, 

1462 freundlich_n: float | None = None, 

1463 bulk_density: float | None = None, 

1464 porosity: float | None = None, 

1465 retardation_factor: float | None = None, 

1466 langmuir_s_max: float | None = None, 

1467 langmuir_k_l: float | None = None, 

1468 max_iterations: int = 10000, 

1469) -> tuple[npt.NDArray[np.floating], list[dict]]: 

1470 """ 

1471 Compute extracted concentration with complete diagnostic information. 

1472 

1473 Returns both bin-averaged concentrations and detailed simulation structure for each pore volume. 

1474 

1475 Exactly one sorption model must be specified: 

1476 

1477 - ``retardation_factor`` for constant (linear) retardation. 

1478 - ``freundlich_k`` + ``freundlich_n`` + ``bulk_density`` + ``porosity`` for 

1479 Freundlich isotherm. 

1480 - ``langmuir_s_max`` + ``langmuir_k_l`` + ``bulk_density`` + ``porosity`` for 

1481 Langmuir isotherm. 

1482 

1483 Parameters 

1484 ---------- 

1485 cin : array-like 

1486 Infiltration concentration [mg/L or any units]. 

1487 Length = len(tedges) - 1. The model assumes this value is constant over each 

1488 interval ``[tedges[i], tedges[i+1])``. 

1489 flow : array-like 

1490 Flow rate [m³/day]. Must be positive. 

1491 Length = len(tedges) - 1. The model assumes this value is constant over each 

1492 interval ``[tedges[i], tedges[i+1])``. 

1493 tedges : pandas.DatetimeIndex 

1494 Time bin edges. Length = len(cin) + 1. 

1495 cout_tedges : pandas.DatetimeIndex 

1496 Output time bin edges. Can be different from tedges. 

1497 Length determines output array size. 

1498 aquifer_pore_volumes : array-like 

1499 Array of aquifer pore volumes [m³] representing the distribution 

1500 of residence times in the aquifer system. Each pore volume must be positive. 

1501 freundlich_k : float, optional 

1502 Freundlich coefficient [(m³/kg)^(1/n)]. Must be positive. 

1503 freundlich_n : float, optional 

1504 Freundlich exponent [-]. Must be positive and != 1. 

1505 bulk_density : float, optional 

1506 Bulk density [kg/m³]. Must be positive. 

1507 Shared by Freundlich and Langmuir models. 

1508 porosity : float, optional 

1509 Porosity [-]. Must be in (0, 1). 

1510 Shared by Freundlich and Langmuir models. 

1511 retardation_factor : float, optional 

1512 Constant retardation factor [-]. Must be >= 1.0. 

1513 langmuir_s_max : float, optional 

1514 Langmuir maximum sorption capacity [mg/kg]. Must be positive. 

1515 langmuir_k_l : float, optional 

1516 Langmuir half-saturation constant [mg/L]. Must be positive. 

1517 max_iterations : int, optional 

1518 Maximum number of events. Default 10000. 

1519 

1520 Returns 

1521 ------- 

1522 cout : numpy.ndarray 

1523 Flow-weighted concentrations averaged across all pore volumes. 

1524 

1525 structures : list of dict 

1526 List of detailed simulation structures, one for each pore volume, with keys: 

1527 

1528 - 'waves': List[Wave] - All wave objects created during simulation 

1529 - 'events': List[dict] - All events with times, types, and details 

1530 - 't_first_arrival': float - First arrival time (end of spin-up period) 

1531 - 'n_events': int - Total number of events 

1532 - 'n_shocks': int - Number of shocks created 

1533 - 'n_rarefactions': int - Number of rarefactions created 

1534 - 'n_characteristics': int - Number of characteristics created 

1535 - 'final_time': float - Final simulation time 

1536 - 'sorption': SorptionModel - Sorption object 

1537 - 'tracker_state': FrontTrackerState - Complete simulation state 

1538 - 'aquifer_pore_volume': float - Pore volume for this simulation 

1539 

1540 See Also 

1541 -------- 

1542 infiltration_to_extraction_front_tracking : Returns concentrations only (simpler interface) 

1543 :ref:`concept-nonlinear-sorption` : Freundlich isotherm and front-tracking theory 

1544 :ref:`assumption-advection-dominated` : When diffusion/dispersion is negligible 

1545 

1546 Examples 

1547 -------- 

1548 :: 

1549 

1550 cout, structures = infiltration_to_extraction_front_tracking_detailed( 

1551 cin=cin, 

1552 flow=flow, 

1553 tedges=tedges, 

1554 cout_tedges=cout_tedges, 

1555 aquifer_pore_volumes=np.array([500.0]), 

1556 freundlich_k=0.01, 

1557 freundlich_n=2.0, 

1558 bulk_density=1500.0, 

1559 porosity=0.3, 

1560 ) 

1561 

1562 # Access spin-up period for first pore volume 

1563 print(f"First arrival: {structures[0]['t_first_arrival']:.2f} days") 

1564 

1565 # Analyze events for first pore volume 

1566 for event in structures[0]["events"]: 

1567 print(f"t={event['time']:.2f}: {event['type']}") 

1568 """ 

1569 cin, flow, tedges, cout_tedges, aquifer_pore_volumes, sorption, cout_tedges_days = _validate_front_tracking_inputs( 

1570 cin=cin, 

1571 flow=flow, 

1572 tedges=tedges, 

1573 cout_tedges=cout_tedges, 

1574 aquifer_pore_volumes=aquifer_pore_volumes, 

1575 freundlich_k=freundlich_k, 

1576 freundlich_n=freundlich_n, 

1577 bulk_density=bulk_density, 

1578 porosity=porosity, 

1579 retardation_factor=retardation_factor, 

1580 langmuir_s_max=langmuir_s_max, 

1581 langmuir_k_l=langmuir_k_l, 

1582 ) 

1583 

1584 # Flow time edges in days (same reference as cout_tedges_days) 

1585 t_ref = tedges[0] 

1586 flow_tedges_days = ((tedges - t_ref) / pd.Timedelta(days=1)).values 

1587 

1588 # Each pore-volume bin from the gamma distribution is an equal-mass streamtube, 

1589 # so all streamtubes carry equal flow at the outlet. The bundle outlet 

1590 # concentration is the simple arithmetic mean over streamtubes. 

1591 cout_all = np.zeros((len(aquifer_pore_volumes), len(cout_tedges) - 1)) 

1592 structures = [] 

1593 

1594 for i, aquifer_pore_volume in enumerate(aquifer_pore_volumes): 

1595 tracker = FrontTracker( 

1596 cin=cin, 

1597 flow=flow, 

1598 tedges=tedges, 

1599 aquifer_pore_volume=aquifer_pore_volume, 

1600 sorption=sorption, 

1601 ) 

1602 

1603 tracker.run(max_iterations=max_iterations) 

1604 

1605 cout_all[i, :] = _flow_weighted_front_tracking_output( 

1606 cout_tedges_days=cout_tedges_days, 

1607 flow_tedges_days=flow_tedges_days, 

1608 flow=flow, 

1609 v_outlet=aquifer_pore_volume, 

1610 waves=tracker.state.waves, 

1611 sorption=sorption, 

1612 ) 

1613 

1614 # Build detailed structure dict for this pore volume 

1615 structure = { 

1616 "waves": tracker.state.waves, 

1617 "events": tracker.state.events, 

1618 "t_first_arrival": tracker.t_first_arrival, 

1619 "n_events": len(tracker.state.events), 

1620 "n_shocks": sum(1 for w in tracker.state.waves if isinstance(w, ShockWave)), 

1621 "n_rarefactions": sum(1 for w in tracker.state.waves if isinstance(w, RarefactionWave)), 

1622 "n_characteristics": sum(1 for w in tracker.state.waves if isinstance(w, CharacteristicWave)), 

1623 "final_time": tracker.state.t_current, 

1624 "sorption": sorption, 

1625 "tracker_state": tracker.state, 

1626 "aquifer_pore_volume": aquifer_pore_volume, 

1627 } 

1628 structures.append(structure) 

1629 

1630 return np.mean(cout_all, axis=0), structures