Coverage for src / gwtransport / diffusion_fast.py: 90%
195 statements
« prev ^ index » next coverage.py v7.13.5, created at 2026-04-27 06:32 +0000
« prev ^ index » next coverage.py v7.13.5, created at 2026-04-27 06:32 +0000
1"""
2Fast Diffusive Transport Corrections via Gaussian Smoothing.
4This module provides a computationally efficient approximation of diffusion/dispersion
5using Gaussian smoothing. It is much faster than :mod:`gwtransport.diffusion` but
6less physically accurate, especially under variable flow conditions.
8Both ``diffusion_fast`` and :mod:`gwtransport.diffusion` add microdispersion and
9molecular diffusion on top of macrodispersion captured by the APVD.
11**When to use diffusion_fast vs diffusion:**
13- Use ``diffusion_fast`` when: Speed is critical, flow and time steps are relatively
14 constant, or you need real-time processing
15- Use ``diffusion`` when: Physical accuracy is critical, flow varies significantly,
16 or you're analyzing periods with changing conditions
18See :ref:`concept-dispersion` for background on macrodispersion and microdispersion.
20This module implements diffusion/dispersion processes that modify advective transport
21in aquifer systems. Diffusion causes spreading and smoothing of concentration or
22temperature fronts as they travel through the aquifer. While advection moves compounds
23with water flow, diffusion causes spreading due to molecular diffusion, mechanical
24dispersion, and thermal diffusion (for temperature).
26Limitation: When ``flow_out`` is not provided, this fast approximation works best when
27flow and tedges are relatively constant. The underlying assumption is that dx (spatial
28step between cells) remains approximately constant, which holds for steady flow but
29breaks down under highly variable conditions. When input time bins are non-uniform
30(e.g., from signal compression), provide ``flow_out`` to apply Gaussian smoothing on
31the (typically uniform) output grid instead. For scenarios with significant flow
32variability, consider using :mod:`gwtransport.diffusion` instead.
34Available functions:
36- :func:`infiltration_to_extraction` - Apply diffusion during infiltration to extraction
37 transport. Combines advection (via residence time) with diffusion (via Gaussian smoothing).
38 Computes position-dependent diffusion based on local residence time and returns concentration
39 or temperature in extracted water on the extraction time grid.
41- :func:`extraction_to_infiltration` - Inverse diffusion via Tikhonov regularization. Builds
42 the combined forward matrix (advection + Gaussian diffusion) and solves the inverse problem
43 to reconstruct infiltration concentrations from extraction data.
45- :func:`gamma_infiltration_to_extraction` - Gamma-distributed pore volumes with fast
46 Gaussian diffusion. Convenience wrapper that parameterizes the pore volume distribution
47 as a gamma distribution.
49- :func:`gamma_extraction_to_infiltration` - Gamma-distributed pore volumes, inverse
50 fast Gaussian diffusion. Symmetric inverse of gamma_infiltration_to_extraction.
52- :func:`convolve_diffusion` - Apply variable-sigma Gaussian filtering. Extends
53 scipy.ndimage.gaussian_filter1d to position-dependent sigma using sparse matrix representation
54 for efficiency. Handles boundary conditions via nearest-neighbor extrapolation.
56- :func:`create_example_data` - Generate test data for demonstrating diffusion effects with
57 signals having varying time steps and corresponding sigma arrays. Useful for testing and
58 validation.
60This file is part of gwtransport which is released under AGPL-3.0 license.
61See the ./LICENSE file or go to https://github.com/gwtransport/gwtransport/blob/main/LICENSE for full license details.
62"""
64import warnings
66import numpy as np
67import numpy.typing as npt
68import pandas as pd
69from scipy import sparse
70from scipy.special import erf
72from gwtransport import gamma
73from gwtransport.advection_utils import _infiltration_to_extraction_weights
74from gwtransport.residence_time import residence_time_mean
75from gwtransport.utils import solve_inverse_transport
77# Minimum coefficient sum to consider a row valid
78_EPSILON_COEFF_SUM = 1e-10
81def infiltration_to_extraction(
82 *,
83 cin: npt.ArrayLike,
84 flow: npt.ArrayLike,
85 tedges: pd.DatetimeIndex,
86 cout_tedges: pd.DatetimeIndex,
87 aquifer_pore_volumes: npt.ArrayLike,
88 mean_streamline_length: float,
89 mean_molecular_diffusivity: float,
90 mean_longitudinal_dispersivity: float,
91 retardation_factor: float = 1.0,
92 suppress_dispersion_warning: bool = False,
93 asymptotic_cutoff_sigma: float | None = 3.0,
94 flow_out: npt.ArrayLike | None = None,
95 suppress_uniform_tedges_check: bool = False,
96) -> npt.NDArray[np.floating]:
97 """Compute diffusion during 1D transport using fast Gaussian smoothing.
99 Combines advection (via pore volume distribution) with diffusion (via Gaussian
100 smoothing) to produce extraction concentrations on the ``cout_tedges`` time grid.
101 This is the fast approximate counterpart of
102 :func:`gwtransport.diffusion.infiltration_to_extraction`.
104 When ``flow_out`` is provided, diffusion is applied on the output grid
105 (advect-then-smooth), which correctly handles non-uniform ``tedges``
106 (e.g., from signal compression). Without ``flow_out``, diffusion is applied
107 on the input grid and ``tedges`` must have constant time steps.
109 Parameters
110 ----------
111 cin : array-like
112 Concentration or temperature of the compound in the infiltrating water.
113 flow : array-like
114 Flow rate of water in the aquifer [m3/day].
115 tedges : pandas.DatetimeIndex
116 Time edges for cin and flow data. Has length len(cin) + 1.
117 cout_tedges : pandas.DatetimeIndex
118 Time edges for output data bins. Has length of desired output + 1.
119 aquifer_pore_volumes : array-like
120 Array of aquifer pore volumes [m3] representing the distribution of
121 flow paths.
122 mean_streamline_length : float
123 Mean travel distance [m]. Must be positive.
124 mean_molecular_diffusivity : float
125 Mean effective molecular diffusivity [m2/day]. Must be non-negative.
126 mean_longitudinal_dispersivity : float
127 Mean longitudinal dispersivity [m]. Must be non-negative.
128 retardation_factor : float, optional
129 Retardation factor (default 1.0). Values > 1.0 indicate slower transport.
130 suppress_dispersion_warning : bool, optional
131 Suppress warning about combining multiple pore volumes with dispersivity.
132 Default is False.
133 asymptotic_cutoff_sigma : float or None, optional
134 Truncate the Gaussian kernel at this many standard deviations.
135 Default is 3.0.
136 flow_out : array-like or None, optional
137 Flow rate [m3/day] on the output grid (aligned to ``cout_tedges``).
138 Required when ``tedges`` has non-uniform time steps. Length must
139 equal ``len(cout_tedges) - 1``. Default is None.
140 suppress_uniform_tedges_check : bool, optional
141 Skip the constant-dt check on ``tedges`` when ``flow_out`` is None.
142 Default is False.
144 Returns
145 -------
146 numpy.ndarray
147 Bin-averaged concentration in extracted water. Length equals
148 len(cout_tedges) - 1. NaN values indicate time periods with no valid
149 contributions from the infiltration data.
151 See Also
152 --------
153 gwtransport.diffusion.infiltration_to_extraction : Physically rigorous analytical solution
154 that supports per-pore-volume arrays for streamline_length, molecular_diffusivity,
155 and longitudinal_dispersivity.
156 extraction_to_infiltration : Inverse operation
157 :ref:`concept-dispersion-scales` : Macrodispersion vs microdispersion
159 Notes
160 -----
161 A single ``mean_streamline_length`` is shared across all pore volumes. This
162 assumes streamline-length heterogeneity is captured solely through the
163 pore-volume distribution: the effective cross-sectional area
164 ``A_eff = V_p / L_mean`` varies with V_p while the path length L is held
165 fixed. This is appropriate for many systems but breaks down for
166 partially-penetrating wells or wedge-shaped capture zones, where path
167 length itself varies between streamtubes. In those cases, use
168 :func:`gwtransport.diffusion.infiltration_to_extraction` with a per-streamtube
169 ``streamline_length`` array instead.
170 """
171 # Convert inputs
172 cout_tedges = pd.DatetimeIndex(cout_tedges)
173 tedges = pd.DatetimeIndex(tedges)
174 cin = np.asarray(cin, dtype=float)
175 flow = np.asarray(flow, dtype=float)
176 aquifer_pore_volumes = np.asarray(aquifer_pore_volumes, dtype=float)
178 if flow_out is not None:
179 flow_out = np.asarray(flow_out, dtype=float)
181 n_pore_volumes = len(aquifer_pore_volumes)
183 # Input validation
184 _validate_inputs(
185 cin_or_cout=cin,
186 flow=flow,
187 tedges=tedges,
188 cout_tedges=cout_tedges,
189 aquifer_pore_volumes=aquifer_pore_volumes,
190 mean_streamline_length=mean_streamline_length,
191 mean_molecular_diffusivity=mean_molecular_diffusivity,
192 mean_longitudinal_dispersivity=mean_longitudinal_dispersivity,
193 is_forward=True,
194 flow_out=flow_out,
195 )
197 # Dispersion warning
198 if n_pore_volumes > 1 and mean_longitudinal_dispersivity > 0 and not suppress_dispersion_warning:
199 _warn_dispersion()
201 # Build advection weight matrix (needed in both branches)
202 w_adv, spinup_mask = _infiltration_to_extraction_weights(
203 tedges=tedges,
204 cout_tedges=cout_tedges,
205 aquifer_pore_volumes=aquifer_pore_volumes,
206 flow=flow,
207 retardation_factor=retardation_factor,
208 )
210 if flow_out is None:
211 # Original algorithm: smooth-then-advect (requires uniform tedges)
212 _check_uniform_tedges(tedges=tedges, suppress=suppress_uniform_tedges_check)
214 sigma_array = _compute_sigma(
215 flow=flow,
216 tedges=tedges,
217 aquifer_pore_volumes=aquifer_pore_volumes,
218 streamline_length=mean_streamline_length,
219 molecular_diffusivity=mean_molecular_diffusivity,
220 longitudinal_dispersivity=mean_longitudinal_dispersivity,
221 retardation_factor=retardation_factor,
222 direction="infiltration_to_extraction",
223 )
224 cin_diffused = convolve_diffusion(
225 input_signal=cin, sigma_array=sigma_array, asymptotic_cutoff_sigma=asymptotic_cutoff_sigma
226 )
227 cout = w_adv @ cin_diffused
228 else:
229 # New algorithm: advect-then-smooth (works with any tedges spacing)
230 cout_unsmoothed = w_adv @ cin
232 # Rows with zero weight sum (spin-up or zero-flow traceback) carry no
233 # signal. Flag them so the normalized convolution
234 # smooth(signal * mask) / smooth(mask) properly renormalizes the kernel
235 # instead of letting a near-zero value bleed into valid neighbors.
236 total_coeff = np.sum(w_adv, axis=1)
237 invalid_mask = total_coeff < _EPSILON_COEFF_SUM
238 cout_unsmoothed[invalid_mask] = 0.0
239 validity = np.where(invalid_mask, 0.0, 1.0)
241 sigma_out = _compute_sigma(
242 flow=flow_out,
243 tedges=cout_tedges,
244 aquifer_pore_volumes=aquifer_pore_volumes,
245 streamline_length=mean_streamline_length,
246 molecular_diffusivity=mean_molecular_diffusivity,
247 longitudinal_dispersivity=mean_longitudinal_dispersivity,
248 retardation_factor=retardation_factor,
249 direction="extraction_to_infiltration",
250 )
251 smoothed = convolve_diffusion(
252 input_signal=cout_unsmoothed, sigma_array=sigma_out, asymptotic_cutoff_sigma=asymptotic_cutoff_sigma
253 )
254 smoothed_validity = convolve_diffusion(
255 input_signal=validity, sigma_array=sigma_out, asymptotic_cutoff_sigma=asymptotic_cutoff_sigma
256 )
257 with np.errstate(invalid="ignore"):
258 cout = np.where(smoothed_validity > _EPSILON_COEFF_SUM, smoothed / smoothed_validity, np.nan)
260 # Mark spin-up rows (signal has not broken through yet) as NaN. Zero-flow
261 # cout bins keep their 0 output from the zero weight row.
262 cout[spinup_mask] = np.nan
264 return cout
267def extraction_to_infiltration(
268 *,
269 cout: npt.ArrayLike,
270 flow: npt.ArrayLike,
271 tedges: pd.DatetimeIndex,
272 cout_tedges: pd.DatetimeIndex,
273 aquifer_pore_volumes: npt.ArrayLike,
274 mean_streamline_length: float,
275 mean_molecular_diffusivity: float,
276 mean_longitudinal_dispersivity: float,
277 retardation_factor: float = 1.0,
278 suppress_dispersion_warning: bool = False,
279 asymptotic_cutoff_sigma: float | None = 3.0,
280 regularization_strength: float = 1e-10,
281 flow_out: npt.ArrayLike | None = None,
282 suppress_uniform_tedges_check: bool = False,
283) -> npt.NDArray[np.floating]:
284 """Reconstruct infiltration concentration from extracted water via Tikhonov inversion.
286 Inverts the forward transport model (Gaussian diffusion + advection) by building
287 the combined forward matrix and solving via Tikhonov regularization. This is the
288 fast approximate counterpart of
289 :func:`gwtransport.diffusion.extraction_to_infiltration`.
291 When ``flow_out`` is provided, diffusion is applied on the output grid,
292 which correctly handles non-uniform ``tedges``. Without ``flow_out``,
293 diffusion is applied on the input grid and ``tedges`` must have constant
294 time steps.
296 Parameters
297 ----------
298 cout : array-like
299 Concentration of the compound in extracted water.
300 flow : array-like
301 Flow rate of water in the aquifer [m3/day].
302 tedges : pandas.DatetimeIndex
303 Time edges for cin (output) and flow data. Has length of len(flow) + 1.
304 cout_tedges : pandas.DatetimeIndex
305 Time edges for cout data bins. Has length of len(cout) + 1.
306 aquifer_pore_volumes : array-like
307 Array of aquifer pore volumes [m3].
308 mean_streamline_length : float
309 Mean travel distance [m]. Must be positive.
310 mean_molecular_diffusivity : float
311 Mean effective molecular diffusivity [m2/day]. Must be non-negative.
312 mean_longitudinal_dispersivity : float
313 Mean longitudinal dispersivity [m]. Must be non-negative.
314 retardation_factor : float, optional
315 Retardation factor (default 1.0).
316 suppress_dispersion_warning : bool, optional
317 Suppress warning about combining multiple pore volumes with dispersivity.
318 Default is False.
319 asymptotic_cutoff_sigma : float or None, optional
320 Truncate the Gaussian kernel at this many standard deviations.
321 Default is 3.0.
322 regularization_strength : float, optional
323 Tikhonov regularization parameter. Default is 1e-10.
324 flow_out : array-like or None, optional
325 Flow rate [m3/day] on the output grid (aligned to ``cout_tedges``).
326 Required when ``tedges`` has non-uniform time steps. Length must
327 equal ``len(cout_tedges) - 1``. Default is None.
328 suppress_uniform_tedges_check : bool, optional
329 Skip the constant-dt check on ``tedges`` when ``flow_out`` is None.
330 Default is False.
332 Returns
333 -------
334 numpy.ndarray
335 Bin-averaged concentration in infiltrating water. Length equals
336 len(tedges) - 1. NaN values indicate time periods with no valid
337 contributions from the extraction data.
339 Warns
340 -----
341 UserWarning
342 When the forward matrix is rank-deficient. This occurs with constant
343 flow when the residence time is an integer multiple of the time step
344 width. To fix, adjust ``aquifer_pore_volumes`` slightly (e.g.,
345 multiply by 1.001).
347 See Also
348 --------
349 infiltration_to_extraction : Forward operation
350 gwtransport.diffusion.extraction_to_infiltration : Analytically correct deconvolution
351 that supports per-pore-volume arrays for streamline_length, molecular_diffusivity,
352 and longitudinal_dispersivity.
353 :ref:`concept-dispersion-scales` : Macrodispersion vs microdispersion
355 Notes
356 -----
357 A single ``mean_streamline_length`` is shared across all pore volumes. This
358 assumes streamline-length heterogeneity is captured solely through the
359 pore-volume distribution: the effective cross-sectional area
360 ``A_eff = V_p / L_mean`` varies with V_p while the path length L is held
361 fixed. This is appropriate for many systems but breaks down for
362 partially-penetrating wells or wedge-shaped capture zones, where path
363 length itself varies between streamtubes. In those cases, use
364 :func:`gwtransport.diffusion.extraction_to_infiltration` with a per-streamtube
365 ``streamline_length`` array instead.
366 """
367 # Convert inputs
368 tedges = pd.DatetimeIndex(tedges)
369 cout_tedges = pd.DatetimeIndex(cout_tedges)
370 cout = np.asarray(cout, dtype=float)
371 flow = np.asarray(flow, dtype=float)
372 aquifer_pore_volumes = np.asarray(aquifer_pore_volumes, dtype=float)
374 if flow_out is not None:
375 flow_out = np.asarray(flow_out, dtype=float)
377 n_pore_volumes = len(aquifer_pore_volumes)
379 # Input validation
380 _validate_inputs(
381 cin_or_cout=cout,
382 flow=flow,
383 tedges=tedges,
384 cout_tedges=cout_tedges,
385 aquifer_pore_volumes=aquifer_pore_volumes,
386 mean_streamline_length=mean_streamline_length,
387 mean_molecular_diffusivity=mean_molecular_diffusivity,
388 mean_longitudinal_dispersivity=mean_longitudinal_dispersivity,
389 is_forward=False,
390 flow_out=flow_out,
391 )
393 # Dispersion warning
394 if n_pore_volumes > 1 and mean_longitudinal_dispersivity > 0 and not suppress_dispersion_warning:
395 _warn_dispersion()
397 n_cin = len(tedges) - 1
398 n_cout = len(cout_tedges) - 1
400 # Build advection weight matrix
401 w_adv, _ = _infiltration_to_extraction_weights(
402 tedges=tedges,
403 cout_tedges=cout_tedges,
404 aquifer_pore_volumes=aquifer_pore_volumes,
405 flow=flow,
406 retardation_factor=retardation_factor,
407 )
409 if flow_out is None:
410 # Original algorithm: G on input grid, W_forward = W_adv @ G
411 _check_uniform_tedges(tedges=tedges, suppress=suppress_uniform_tedges_check)
413 sigma_array = _compute_sigma(
414 flow=flow,
415 tedges=tedges,
416 aquifer_pore_volumes=aquifer_pore_volumes,
417 streamline_length=mean_streamline_length,
418 molecular_diffusivity=mean_molecular_diffusivity,
419 longitudinal_dispersivity=mean_longitudinal_dispersivity,
420 retardation_factor=retardation_factor,
421 direction="infiltration_to_extraction",
422 )
423 g_matrix = _build_gaussian_matrix(
424 n=n_cin, sigma_array=sigma_array, asymptotic_cutoff_sigma=asymptotic_cutoff_sigma
425 )
426 # Combined forward matrix: W_adv @ G via sparse @ dense (avoids dense G allocation)
427 w_forward = (g_matrix.T @ w_adv.T).T
428 else:
429 # New algorithm: G on output grid, W_forward = G @ W_adv
430 sigma_out = _compute_sigma(
431 flow=flow_out,
432 tedges=cout_tedges,
433 aquifer_pore_volumes=aquifer_pore_volumes,
434 streamline_length=mean_streamline_length,
435 molecular_diffusivity=mean_molecular_diffusivity,
436 longitudinal_dispersivity=mean_longitudinal_dispersivity,
437 retardation_factor=retardation_factor,
438 direction="extraction_to_infiltration",
439 )
440 g_matrix = _build_gaussian_matrix(
441 n=n_cout, sigma_array=sigma_out, asymptotic_cutoff_sigma=asymptotic_cutoff_sigma
442 )
443 w_forward = g_matrix @ w_adv # sparse @ dense
445 return solve_inverse_transport(
446 w_forward=w_forward,
447 observed=cout,
448 n_output=n_cin,
449 regularization_strength=regularization_strength,
450 warn_rank_deficient=True,
451 )
454def gamma_infiltration_to_extraction(
455 *,
456 cin: npt.ArrayLike,
457 flow: npt.ArrayLike,
458 tedges: pd.DatetimeIndex,
459 cout_tedges: pd.DatetimeIndex,
460 mean: float | None = None,
461 std: float | None = None,
462 loc: float = 0.0,
463 alpha: float | None = None,
464 beta: float | None = None,
465 n_bins: int = 100,
466 mean_streamline_length: float,
467 mean_molecular_diffusivity: float,
468 mean_longitudinal_dispersivity: float,
469 retardation_factor: float = 1.0,
470 suppress_dispersion_warning: bool = False,
471 asymptotic_cutoff_sigma: float | None = 3.0,
472 flow_out: npt.ArrayLike | None = None,
473 suppress_uniform_tedges_check: bool = False,
474) -> npt.NDArray[np.floating]:
475 """
476 Compute extracted concentration with fast Gaussian diffusion for gamma-distributed pore volumes.
478 Combines advective transport (based on gamma-distributed pore volumes) with
479 fast Gaussian diffusion. This is a convenience wrapper around
480 :func:`infiltration_to_extraction` that parameterizes the aquifer pore volume
481 distribution as a (shifted) gamma distribution.
483 Provide either (mean, std) or (alpha, beta); ``loc`` is optional and defaults to 0.
485 Parameters
486 ----------
487 cin : array-like
488 Concentration of the compound in infiltrating water.
489 flow : array-like
490 Flow rate of water in the aquifer [m3/day].
491 tedges : pandas.DatetimeIndex
492 Time edges for cin and flow data. Has length len(cin) + 1.
493 cout_tedges : pandas.DatetimeIndex
494 Time edges for output data bins. Has length of desired output + 1.
495 mean : float, optional
496 Mean of the gamma distribution of the aquifer pore volume. Must be strictly
497 greater than ``loc``.
498 std : float, optional
499 Standard deviation of the gamma distribution of the aquifer pore volume
500 (invariant under the ``loc`` shift).
501 loc : float, optional
502 Location (minimum pore volume) of the gamma distribution. Must satisfy
503 ``0 <= loc < mean``. Default is ``0.0``.
504 alpha : float, optional
505 Shape parameter of gamma distribution of the aquifer pore volume (must be > 0).
506 beta : float, optional
507 Scale parameter of gamma distribution of the aquifer pore volume (must be > 0).
508 n_bins : int, optional
509 Number of bins to discretize the gamma distribution. Default is 100.
510 mean_streamline_length : float
511 Mean travel distance through the aquifer [m] averaged over all aquifer
512 pore volumes. Must be positive. For per-pore-volume arrays, use
513 :func:`gwtransport.diffusion.gamma_infiltration_to_extraction`.
514 mean_molecular_diffusivity : float
515 Mean effective molecular diffusivity [m2/day] averaged over all
516 aquifer pore volumes. Must be non-negative. For per-pore-volume
517 arrays, use :func:`gwtransport.diffusion.gamma_infiltration_to_extraction`.
518 mean_longitudinal_dispersivity : float
519 Mean longitudinal dispersivity [m] averaged over all aquifer pore
520 volumes. Must be non-negative. For per-pore-volume arrays, use
521 :func:`gwtransport.diffusion.gamma_infiltration_to_extraction`.
522 retardation_factor : float, optional
523 Retardation factor (default 1.0). Values > 1.0 indicate slower transport.
524 suppress_dispersion_warning : bool, optional
525 Suppress warning about combining multiple pore volumes with dispersivity.
526 Default is False.
527 asymptotic_cutoff_sigma : float or None, optional
528 Truncate the Gaussian kernel at this many standard deviations.
529 Default is 3.0.
530 flow_out : array-like or None, optional
531 Flow rate [m3/day] on the output grid (aligned to ``cout_tedges``).
532 Required when ``tedges`` has non-uniform time steps. Length must
533 equal ``len(cout_tedges) - 1``. Default is None.
534 suppress_uniform_tedges_check : bool, optional
535 Skip the constant-dt check on ``tedges`` when ``flow_out`` is None.
536 Default is False.
538 Returns
539 -------
540 numpy.ndarray
541 Bin-averaged concentration in extracted water. Length equals
542 len(cout_tedges) - 1. NaN values indicate time periods with no valid
543 contributions from the infiltration data.
545 See Also
546 --------
547 infiltration_to_extraction : Transport with explicit pore volume distribution
548 gamma_extraction_to_infiltration : Reverse operation (deconvolution)
549 gwtransport.gamma.bins : Create gamma distribution bins
550 gwtransport.diffusion.gamma_infiltration_to_extraction : Physically rigorous analytical
551 solution that supports per-pore-volume arrays for streamline_length,
552 molecular_diffusivity, and longitudinal_dispersivity.
553 gwtransport.advection.gamma_infiltration_to_extraction : Pure advection (no dispersion)
554 :ref:`concept-gamma-distribution` : Two-parameter pore volume model
555 :ref:`concept-dispersion-scales` : Macrodispersion vs microdispersion
557 Notes
558 -----
559 The APVD is only time-invariant under the steady-streamlines assumption
560 (see :ref:`assumption-steady-streamlines`).
562 The spreading from the gamma-distributed pore volumes represents macrodispersion
563 (aquifer-scale heterogeneity). When ``std`` comes from calibration on measurements,
564 it absorbs all mixing: macrodispersion, microdispersion, and an average molecular
565 diffusion contribution. When ``std`` comes from streamline analysis, it represents
566 macrodispersion only; microdispersion and molecular diffusion can be added via the
567 dispersion parameters.
568 See :ref:`concept-dispersion-scales` for guidance on when to add microdispersion.
570 Examples
571 --------
572 >>> import pandas as pd
573 >>> import numpy as np
574 >>> from gwtransport.diffusion_fast import gamma_infiltration_to_extraction
575 >>>
576 >>> tedges = pd.date_range(start="2020-01-01", end="2020-01-20", freq="D")
577 >>> cout_tedges = pd.date_range(start="2020-01-05", end="2020-01-25", freq="D")
578 >>> cin = np.zeros(len(tedges) - 1)
579 >>> cin[5:10] = 1.0
580 >>> flow = np.ones(len(tedges) - 1) * 100.0
581 >>>
582 >>> cout = gamma_infiltration_to_extraction(
583 ... cin=cin,
584 ... flow=flow,
585 ... tedges=tedges,
586 ... cout_tedges=cout_tedges,
587 ... mean=500.0,
588 ... std=100.0,
589 ... n_bins=5,
590 ... mean_streamline_length=100.0,
591 ... mean_molecular_diffusivity=1e-4,
592 ... mean_longitudinal_dispersivity=1.0,
593 ... )
594 """
595 bins = gamma.bins(mean=mean, std=std, loc=loc, alpha=alpha, beta=beta, n_bins=n_bins)
596 return infiltration_to_extraction(
597 cin=cin,
598 flow=flow,
599 tedges=tedges,
600 cout_tedges=cout_tedges,
601 aquifer_pore_volumes=bins["expected_values"],
602 mean_streamline_length=mean_streamline_length,
603 mean_molecular_diffusivity=mean_molecular_diffusivity,
604 mean_longitudinal_dispersivity=mean_longitudinal_dispersivity,
605 retardation_factor=retardation_factor,
606 suppress_dispersion_warning=suppress_dispersion_warning,
607 asymptotic_cutoff_sigma=asymptotic_cutoff_sigma,
608 flow_out=flow_out,
609 suppress_uniform_tedges_check=suppress_uniform_tedges_check,
610 )
613def gamma_extraction_to_infiltration(
614 *,
615 cout: npt.ArrayLike,
616 flow: npt.ArrayLike,
617 tedges: pd.DatetimeIndex,
618 cout_tedges: pd.DatetimeIndex,
619 mean: float | None = None,
620 std: float | None = None,
621 loc: float = 0.0,
622 alpha: float | None = None,
623 beta: float | None = None,
624 n_bins: int = 100,
625 mean_streamline_length: float,
626 mean_molecular_diffusivity: float,
627 mean_longitudinal_dispersivity: float,
628 retardation_factor: float = 1.0,
629 suppress_dispersion_warning: bool = False,
630 asymptotic_cutoff_sigma: float | None = 3.0,
631 regularization_strength: float = 1e-10,
632 flow_out: npt.ArrayLike | None = None,
633 suppress_uniform_tedges_check: bool = False,
634) -> npt.NDArray[np.floating]:
635 """
636 Reconstruct infiltration concentration from extracted water for gamma-distributed pore volumes.
638 Inverts the forward transport model (fast Gaussian diffusion + advection with
639 gamma-distributed pore volumes) via Tikhonov regularization. This is a convenience
640 wrapper around :func:`extraction_to_infiltration` that parameterizes the aquifer
641 pore volume distribution as a (shifted) gamma distribution.
643 Provide either (mean, std) or (alpha, beta); ``loc`` is optional and defaults to 0.
645 Parameters
646 ----------
647 cout : array-like
648 Concentration of the compound in extracted water.
649 flow : array-like
650 Flow rate of water in the aquifer [m3/day].
651 tedges : pandas.DatetimeIndex
652 Time edges for cin (output) and flow data. Has length of len(flow) + 1.
653 cout_tedges : pandas.DatetimeIndex
654 Time edges for cout data bins. Has length of len(cout) + 1.
655 mean : float, optional
656 Mean of the gamma distribution of the aquifer pore volume. Must be strictly
657 greater than ``loc``.
658 std : float, optional
659 Standard deviation of the gamma distribution of the aquifer pore volume
660 (invariant under the ``loc`` shift).
661 loc : float, optional
662 Location (minimum pore volume) of the gamma distribution. Must satisfy
663 ``0 <= loc < mean``. Default is ``0.0``.
664 alpha : float, optional
665 Shape parameter of gamma distribution of the aquifer pore volume (must be > 0).
666 beta : float, optional
667 Scale parameter of gamma distribution of the aquifer pore volume (must be > 0).
668 n_bins : int, optional
669 Number of bins to discretize the gamma distribution. Default is 100.
670 mean_streamline_length : float
671 Mean travel distance through the aquifer [m] averaged over all aquifer
672 pore volumes. Must be positive. For per-pore-volume arrays, use
673 :func:`gwtransport.diffusion.gamma_extraction_to_infiltration`.
674 mean_molecular_diffusivity : float
675 Mean effective molecular diffusivity [m2/day] averaged over all
676 aquifer pore volumes. Must be non-negative. For per-pore-volume
677 arrays, use :func:`gwtransport.diffusion.gamma_extraction_to_infiltration`.
678 mean_longitudinal_dispersivity : float
679 Mean longitudinal dispersivity [m] averaged over all aquifer pore
680 volumes. Must be non-negative. For per-pore-volume arrays, use
681 :func:`gwtransport.diffusion.gamma_extraction_to_infiltration`.
682 retardation_factor : float, optional
683 Retardation factor (default 1.0). Values > 1.0 indicate slower transport.
684 suppress_dispersion_warning : bool, optional
685 Suppress warning about combining multiple pore volumes with dispersivity.
686 Default is False.
687 asymptotic_cutoff_sigma : float or None, optional
688 Truncate the Gaussian kernel at this many standard deviations.
689 Default is 3.0.
690 regularization_strength : float, optional
691 Tikhonov regularization parameter. Default is 1e-10.
692 flow_out : array-like or None, optional
693 Flow rate [m3/day] aligned to ``cout_tedges``. When provided, the
694 Gaussian matrix operates on the output grid. Length must equal
695 ``len(cout_tedges) - 1``. Default is None.
696 suppress_uniform_tedges_check : bool, optional
697 When True, skip the check that ``tedges`` has constant time steps
698 (only relevant when ``flow_out`` is None). Default is False.
700 Returns
701 -------
702 numpy.ndarray
703 Bin-averaged concentration in infiltrating water. Length equals
704 len(tedges) - 1. NaN values indicate time periods with no valid
705 contributions from the extraction data.
707 See Also
708 --------
709 extraction_to_infiltration : Deconvolution with explicit pore volume distribution
710 gamma_infiltration_to_extraction : Forward operation (convolution)
711 gwtransport.gamma.bins : Create gamma distribution bins
712 gwtransport.diffusion.gamma_extraction_to_infiltration : Physically rigorous analytical
713 solution that supports per-pore-volume arrays for streamline_length,
714 molecular_diffusivity, and longitudinal_dispersivity.
715 gwtransport.advection.gamma_extraction_to_infiltration : Pure advection (no dispersion)
716 :ref:`concept-gamma-distribution` : Two-parameter pore volume model
717 :ref:`concept-dispersion-scales` : Macrodispersion vs microdispersion
719 Notes
720 -----
721 The APVD is only time-invariant under the steady-streamlines assumption
722 (see :ref:`assumption-steady-streamlines`).
724 The spreading from the gamma-distributed pore volumes represents macrodispersion
725 (aquifer-scale heterogeneity). When ``std`` comes from calibration on measurements,
726 it absorbs all mixing: macrodispersion, microdispersion, and an average molecular
727 diffusion contribution. When ``std`` comes from streamline analysis, it represents
728 macrodispersion only; microdispersion and molecular diffusion can be added via the
729 dispersion parameters.
730 See :ref:`concept-dispersion-scales` for guidance on when to add microdispersion.
732 Examples
733 --------
734 >>> import pandas as pd
735 >>> import numpy as np
736 >>> from gwtransport.diffusion_fast import gamma_extraction_to_infiltration
737 >>>
738 >>> tedges = pd.date_range(start="2020-01-01", end="2020-01-20", freq="D")
739 >>> cout_tedges = pd.date_range(start="2020-01-05", end="2020-01-25", freq="D")
740 >>> cout = np.zeros(len(cout_tedges) - 1)
741 >>> cout[5:10] = 1.0
742 >>> flow = np.ones(len(tedges) - 1) * 100.0
743 >>>
744 >>> cin = gamma_extraction_to_infiltration(
745 ... cout=cout,
746 ... flow=flow,
747 ... tedges=tedges,
748 ... cout_tedges=cout_tedges,
749 ... mean=500.0,
750 ... std=100.0,
751 ... n_bins=5,
752 ... mean_streamline_length=100.0,
753 ... mean_molecular_diffusivity=1e-4,
754 ... mean_longitudinal_dispersivity=1.0,
755 ... )
756 """
757 bins = gamma.bins(mean=mean, std=std, loc=loc, alpha=alpha, beta=beta, n_bins=n_bins)
758 return extraction_to_infiltration(
759 cout=cout,
760 flow=flow,
761 tedges=tedges,
762 cout_tedges=cout_tedges,
763 aquifer_pore_volumes=bins["expected_values"],
764 mean_streamline_length=mean_streamline_length,
765 mean_molecular_diffusivity=mean_molecular_diffusivity,
766 mean_longitudinal_dispersivity=mean_longitudinal_dispersivity,
767 retardation_factor=retardation_factor,
768 suppress_dispersion_warning=suppress_dispersion_warning,
769 asymptotic_cutoff_sigma=asymptotic_cutoff_sigma,
770 regularization_strength=regularization_strength,
771 flow_out=flow_out,
772 suppress_uniform_tedges_check=suppress_uniform_tedges_check,
773 )
776def _compute_sigma(
777 *,
778 flow: npt.ArrayLike,
779 tedges: pd.DatetimeIndex,
780 aquifer_pore_volumes: npt.ArrayLike,
781 streamline_length: float,
782 molecular_diffusivity: float,
783 longitudinal_dispersivity: float,
784 retardation_factor: float,
785 direction: str,
786) -> npt.NDArray[np.floating]:
787 """Compute sigma in array-index units with moment-based averaging.
789 The per-streamtube variance of the Gaussian kernel (in index units)
790 is sigma_idx^2(V) = L_diff^2(V) * V^2 * R^2 / (Q * dt * L)^2.
791 L_diff^2 = 2*D_m*tau + 2*alpha_L*L is linear in V (via tau ~ V),
792 so sigma_idx^2 is cubic + quadratic in V. The averaged variance
793 over the pore volume distribution is (law of total variance):
795 E[sigma_idx^2] = R^2 / (Q*dt*L)^2
796 * (2*D_m*tau_bar/Vbar * E[V^3] + 2*alpha_L*L * E[V^2])
798 When a single pore volume is given this reduces to the classical formula.
800 See :ref:`concept-variance-components` for the derivation.
802 Parameters
803 ----------
804 flow : array-like
805 Flow rate [m3/day].
806 tedges : DatetimeIndex
807 Time edges for flow data.
808 aquifer_pore_volumes : array-like
809 Pore volumes [m3]. Moments E[V^2] and E[V^3] are computed
810 from this array.
811 streamline_length : float
812 Streamline length [m].
813 molecular_diffusivity : float
814 Effective molecular diffusivity [m2/day].
815 longitudinal_dispersivity : float
816 Longitudinal dispersivity [m].
817 retardation_factor : float
818 Retardation factor [-].
819 direction : {'infiltration_to_extraction', 'extraction_to_infiltration'}
820 Direction for residence time computation.
822 Returns
823 -------
824 ndarray
825 Sigma values in array-index units, clipped to [0, 100].
826 """
827 flow = np.asarray(flow, dtype=float)
828 aquifer_pore_volumes = np.asarray(aquifer_pore_volumes, dtype=float)
830 mean_pv = float(np.mean(aquifer_pore_volumes))
831 ev2 = float(np.mean(aquifer_pore_volumes**2))
832 ev3 = float(np.mean(aquifer_pore_volumes**3))
834 # Bin-mean residence time at the mean pore volume [days]. Using the bin-mean
835 # (rather than the bin-center point sample) is consistent with the bin-edge
836 # convention: per-bin output quantities are flow-weighted averages over the bin.
837 rt_array = residence_time_mean(
838 flow=flow,
839 flow_tedges=tedges,
840 tedges_out=tedges,
841 aquifer_pore_volumes=mean_pv,
842 retardation_factor=retardation_factor,
843 direction=direction,
844 )[0]
846 # Interpolate NaN values
847 valid_mask = ~np.isnan(rt_array)
848 if np.any(valid_mask):
849 rt_array = np.interp(np.arange(len(rt_array)), np.where(valid_mask)[0], rt_array[valid_mask])
851 # Two independent spatial variance components [m^2]:
852 # molecular: 2 * D_m * tau (linear in V via tau)
853 # dispersive: 2 * alpha_L * L (constant in V)
854 mol_var_x = 2.0 * molecular_diffusivity * rt_array
855 disp_var_x = 2.0 * longitudinal_dispersivity * streamline_length
857 # Averaged variance in index units (see docstring):
858 var_numerator = mol_var_x / mean_pv * ev3 + disp_var_x * ev2
860 timedelta = np.diff(tedges) / pd.to_timedelta(1, unit="D")
861 # q_dt_l_over_r = Q * dt * L / R has units m^4 (it is the squared scaling
862 # factor that converts spatial variance [m^2] into index-space variance).
863 q_dt_l_over_r = flow * timedelta * streamline_length / retardation_factor
865 with np.errstate(invalid="ignore", divide="ignore"):
866 sigma_sq = np.where(q_dt_l_over_r > 0.0, var_numerator / q_dt_l_over_r**2, 0.0)
868 return np.clip(np.sqrt(np.maximum(sigma_sq, 0.0)), 0.0, 100.0)
871def _check_uniform_tedges(*, tedges: pd.DatetimeIndex, suppress: bool) -> None:
872 """Raise ValueError if tedges has non-constant time steps.
874 Raises
875 ------
876 ValueError
877 If tedges has non-constant time steps and suppress is False.
878 """
879 if suppress:
880 return
881 dt = np.diff(tedges)
882 if not np.all(dt == dt[0]):
883 msg = (
884 "tedges must have constant time steps when flow_out is not provided. "
885 "Either provide flow_out or set suppress_uniform_tedges_check=True."
886 )
887 raise ValueError(msg)
890def _build_gaussian_matrix(
891 *, n: int, sigma_array: npt.ArrayLike, asymptotic_cutoff_sigma: float | None = 3.0
892) -> sparse.csr_matrix:
893 """Build sparse Gaussian convolution matrix with CDF-integrated bin weights.
895 Each row represents a position-specific Gaussian kernel where weights are
896 computed by integrating the Gaussian CDF over each bin, rather than
897 point-sampling the PDF. This is more accurate for small sigma values
898 (< ~2 bins) where the PDF peak can fall between bin centers.
900 Parameters
901 ----------
902 n : int
903 Length of the signal (number of bins).
904 sigma_array : array-like
905 Array of sigma values of length n.
906 asymptotic_cutoff_sigma : float or None, optional
907 Truncate the kernel at this many standard deviations. Set to None to
908 disable truncation (use full signal length). Default is 3.0.
910 Returns
911 -------
912 scipy.sparse.csr_matrix
913 Sparse convolution matrix of shape (n, n).
914 """
915 sigma_array = np.asarray(sigma_array)
917 # Handle zero sigma values
918 zero_mask = sigma_array == 0
919 if np.all(zero_mask):
920 return sparse.eye(n, format="csr", dtype=float) # pyright: ignore[reportReturnType]
922 # Get maximum kernel size and create position arrays
923 max_sigma = np.max(sigma_array)
924 truncate = asymptotic_cutoff_sigma if asymptotic_cutoff_sigma is not None else n / max(max_sigma, 1.0)
925 max_radius = int(truncate * max_sigma + 0.5)
927 # Create arrays for all possible kernel positions
928 positions = np.arange(-max_radius, max_radius + 1)
930 # Create a mask for valid sigma values
931 valid_sigma = ~zero_mask
932 valid_indices = np.where(valid_sigma)[0]
934 # Shape: (n_valid_points, 1)
935 center_positions = valid_indices[:, np.newaxis]
936 # Shape: (1, max_kernel_size)
937 kernel_positions = positions[np.newaxis, :]
939 # Calculate CDF-integrated weights for proper bin averaging.
940 # Compute erf at unique bin edges, then take differences. Adjacent bins
941 # share an edge, so this halves the number of erf evaluations.
942 sigmas = sigma_array[valid_sigma][:, np.newaxis]
943 sqrt2 = np.sqrt(2)
944 edges = np.arange(-max_radius - 0.5, max_radius + 1.5) # (2*max_radius + 2,)
945 erf_at_edges = erf(edges[np.newaxis, :] / (sigmas * sqrt2)) # (n_valid, 2*max_radius+2)
946 weights = 0.5 * np.diff(erf_at_edges, axis=1) # (n_valid, 2*max_radius+1)
948 # Normalize each kernel
949 weights /= np.sum(weights, axis=1)[:, np.newaxis]
951 # Calculate absolute positions in the signal
952 absolute_positions = center_positions + kernel_positions
954 # Handle boundary conditions (nearest-neighbor extrapolation)
955 absolute_positions = np.clip(absolute_positions, 0, n - 1)
957 # Create coordinate arrays for sparse matrix
958 rows = np.repeat(center_positions, weights.shape[1])
959 cols = absolute_positions.ravel()
960 data = weights.ravel()
962 # Remove zero weights to save memory
963 nonzero_mask = data != 0
964 rows = rows[nonzero_mask]
965 cols = cols[nonzero_mask]
966 data = data[nonzero_mask]
968 # Add identity matrix elements for zero-sigma positions
969 if np.any(zero_mask):
970 zero_indices = np.where(zero_mask)[0]
971 rows = np.concatenate([rows, zero_indices])
972 cols = np.concatenate([cols, zero_indices])
973 data = np.concatenate([data, np.ones(len(zero_indices))])
975 return sparse.csr_matrix((data, (rows, cols)), shape=(n, n))
978def convolve_diffusion(
979 *, input_signal: npt.ArrayLike, sigma_array: npt.ArrayLike, asymptotic_cutoff_sigma: float | None = 3.0
980) -> npt.NDArray[np.floating]:
981 """Apply Gaussian filter with position-dependent sigma values.
983 This function extends scipy.ndimage.gaussian_filter1d by allowing the standard
984 deviation (sigma) of the Gaussian kernel to vary at each point in the signal.
985 It implements the filter using a sparse convolution matrix where each row
986 represents a Gaussian kernel with a locally-appropriate standard deviation.
988 Kernel weights are computed by integrating the Gaussian CDF over each bin,
989 which is more accurate than point-sampling the PDF for small sigma values.
991 Parameters
992 ----------
993 input_signal : array-like
994 One-dimensional input array to be filtered.
995 sigma_array : array-like
996 One-dimensional array of standard deviation values, must have same length
997 as input_signal.
998 asymptotic_cutoff_sigma : float or None, optional
999 Truncate the filter at this many standard deviations. Set to None to
1000 disable truncation. Default is 3.0.
1002 Returns
1003 -------
1004 numpy.ndarray
1005 The filtered input signal. Has the same shape as input_signal.
1007 Raises
1008 ------
1009 ValueError
1010 If input_signal and sigma_array do not have the same length.
1012 See Also
1013 --------
1014 scipy.ndimage.gaussian_filter1d : Fixed-sigma Gaussian filtering
1016 Notes
1017 -----
1018 At the boundaries, the outer values are repeated to avoid edge effects
1019 (equivalent to mode='nearest' in `scipy.ndimage.gaussian_filter1d`).
1021 Examples
1022 --------
1023 >>> import numpy as np
1024 >>> from gwtransport.diffusion_fast import convolve_diffusion
1025 >>> # Create a sample signal
1026 >>> x = np.linspace(0, 10, 1000)
1027 >>> signal = np.exp(-((x - 3) ** 2)) + 0.5 * np.exp(-((x - 7) ** 2) / 0.5)
1029 >>> # Create position-dependent sigma values
1030 >>> diffusivity = 0.1
1031 >>> dt = 0.001 * (1 + np.sin(2 * np.pi * x / 10))
1032 >>> dx = x[1] - x[0]
1033 >>> sigma_array = np.sqrt(2 * diffusivity * dt) / dx
1035 >>> # Apply the filter
1036 >>> filtered = convolve_diffusion(input_signal=signal, sigma_array=sigma_array)
1037 """
1038 input_signal = np.asarray(input_signal)
1039 sigma_array = np.asarray(sigma_array)
1041 if len(input_signal) != len(sigma_array):
1042 msg = "Input signal and sigma array must have the same length"
1043 raise ValueError(msg)
1045 n = len(input_signal)
1047 # Handle zero sigma values
1048 if np.all(sigma_array == 0):
1049 return input_signal.copy()
1051 g_matrix = _build_gaussian_matrix(n=n, sigma_array=sigma_array, asymptotic_cutoff_sigma=asymptotic_cutoff_sigma)
1052 return g_matrix.dot(input_signal)
1055def create_example_data(
1056 *,
1057 nx: int = 1000,
1058 domain_length: float = 10.0,
1059 diffusivity: float = 0.1,
1060 seed: int | None = None,
1061) -> tuple[npt.NDArray[np.floating], npt.NDArray[np.floating], npt.NDArray[np.floating], npt.NDArray[np.floating]]:
1062 """Create example data for demonstrating variable-sigma diffusion.
1064 Parameters
1065 ----------
1066 nx : int, optional
1067 Number of spatial points. Default is 1000.
1068 domain_length : float, optional
1069 Domain length. Default is 10.0.
1070 diffusivity : float, optional
1071 Diffusivity. Default is 0.1.
1072 seed : int or None, optional
1073 Random seed for reproducibility. Default is None.
1075 Returns
1076 -------
1077 x : numpy.ndarray
1078 Spatial coordinates.
1079 signal : numpy.ndarray
1080 Initial signal (sum of two Gaussians with noise).
1081 sigma_array : numpy.ndarray
1082 Array of sigma values varying in space.
1083 dt : numpy.ndarray
1084 Array of time steps varying in space.
1085 """
1086 rng = np.random.default_rng(seed)
1088 # Create spatial grid
1089 x = np.linspace(0, domain_length, nx)
1090 dx = x[1] - x[0]
1092 # Create initial signal (two Gaussians)
1093 signal = np.exp(-((x - 3) ** 2)) + 0.5 * np.exp(-((x - 7) ** 2) / 0.5) + 0.1 * rng.standard_normal(nx)
1095 # Create varying time steps
1096 dt = 0.001 * (1 + np.sin(2 * np.pi * x / domain_length))
1098 # Calculate corresponding sigma values
1099 sigma_array = np.sqrt(2 * diffusivity * dt) / dx
1101 return x, signal, sigma_array, dt
1104def _validate_inputs(
1105 *,
1106 cin_or_cout: np.ndarray,
1107 flow: np.ndarray,
1108 tedges: pd.DatetimeIndex,
1109 cout_tedges: pd.DatetimeIndex,
1110 aquifer_pore_volumes: np.ndarray,
1111 mean_streamline_length: float,
1112 mean_molecular_diffusivity: float,
1113 mean_longitudinal_dispersivity: float,
1114 is_forward: bool,
1115 flow_out: np.ndarray | None = None,
1116) -> None:
1117 """Validate inputs for infiltration_to_extraction and extraction_to_infiltration.
1119 Raises
1120 ------
1121 ValueError
1122 If array lengths are inconsistent, mean_molecular_diffusivity or
1123 mean_longitudinal_dispersivity are negative, cin or cout or flow
1124 contain NaN values, aquifer_pore_volumes contains non-positive
1125 values, or mean_streamline_length is non-positive.
1126 """
1127 if is_forward:
1128 if len(tedges) != len(cin_or_cout) + 1:
1129 msg = "tedges must have one more element than cin"
1130 raise ValueError(msg)
1131 elif len(cout_tedges) != len(cin_or_cout) + 1:
1132 msg = "cout_tedges must have one more element than cout"
1133 raise ValueError(msg)
1134 if len(tedges) != len(flow) + 1:
1135 msg = "tedges must have one more element than flow"
1136 raise ValueError(msg)
1137 if mean_molecular_diffusivity < 0:
1138 msg = "mean_molecular_diffusivity must be non-negative"
1139 raise ValueError(msg)
1140 if mean_longitudinal_dispersivity < 0:
1141 msg = "mean_longitudinal_dispersivity must be non-negative"
1142 raise ValueError(msg)
1143 if np.any(np.isnan(cin_or_cout)):
1144 msg = f"{'cin' if is_forward else 'cout'} contains NaN values, which are not allowed"
1145 raise ValueError(msg)
1146 if np.any(np.isnan(flow)):
1147 msg = "flow contains NaN values, which are not allowed"
1148 raise ValueError(msg)
1149 if np.any(flow < 0):
1150 msg = "flow must be non-negative (negative flow not supported)"
1151 raise ValueError(msg)
1152 if np.any(aquifer_pore_volumes <= 0):
1153 msg = "aquifer_pore_volumes must be positive"
1154 raise ValueError(msg)
1155 if mean_streamline_length <= 0:
1156 msg = "mean_streamline_length must be positive"
1157 raise ValueError(msg)
1158 if flow_out is not None:
1159 n_cout = len(cout_tedges) - 1
1160 if len(flow_out) != n_cout:
1161 msg = f"flow_out must have length len(cout_tedges) - 1 = {n_cout}, got {len(flow_out)}"
1162 raise ValueError(msg)
1163 if np.any(np.isnan(flow_out)):
1164 msg = "flow_out contains NaN values, which are not allowed"
1165 raise ValueError(msg)
1166 if np.any(flow_out < 0):
1167 msg = "flow_out must be non-negative (negative flow not supported)"
1168 raise ValueError(msg)
1171def _warn_dispersion() -> None:
1172 """Emit warning about combining multiple pore volumes with dispersivity."""
1173 msg = (
1174 "Using multiple aquifer_pore_volumes with non-zero mean_longitudinal_dispersivity. "
1175 "Both represent spreading from velocity heterogeneity at different scales.\n\n"
1176 "This is appropriate when:\n"
1177 " - APVD comes from streamline analysis (explicit geometry)\n"
1178 " - You want to add unresolved microdispersion and molecular diffusion\n\n"
1179 "This may double-count spreading when:\n"
1180 " - APVD was calibrated from measurements (microdispersion already included)\n\n"
1181 "For gamma-parameterized APVD, consider using the 'equivalent APVD std' approach\n"
1182 "from 05_Diffusion_Dispersion.ipynb to combine both effects in the faster advection\n"
1183 "module. Note: this variance combination is only valid for continuous (gamma)\n"
1184 "distributions, not for discrete streamline volumes.\n"
1185 "Suppress this warning with suppress_dispersion_warning=True."
1186 )
1187 warnings.warn(msg, UserWarning, stacklevel=3)