Coverage for src / gwtransport / logremoval.py: 94%
72 statements
« prev ^ index » next coverage.py v7.13.5, created at 2026-04-27 06:32 +0000
« prev ^ index » next coverage.py v7.13.5, created at 2026-04-27 06:32 +0000
1"""
2Log Removal Calculations for First-Order Decay Processes.
4This module provides utilities to calculate log removal values from first-order decay
5processes, including pathogen inactivation and radioactive decay. The module supports
6basic log removal calculations and parallel flow arrangements where multiple flow paths
7operate simultaneously.
9First-Order Decay Model
10-----------------------
11The log removal from any first-order decay process is:
13 Log Removal = log10_decay_rate * residence_time
15where ``log10_decay_rate`` has units [log10/day] and ``residence_time`` has units [days].
16This is equivalent to exponential decay ``C_out/C_in = 10^(-mu * t)``, where mu is the
17log10 decay rate and t is residence time. The natural-log decay rate constant lambda [1/day]
18is related to mu by ``lambda = mu * ln(10)``.
20This model applies to any process that follows first-order kinetics:
22- **Pathogen inactivation**: viruses, bacteria, and protozoa lose infectivity over time
23- **Radioactive decay**: isotopes used for groundwater dating (tritium, CFC, SF6)
24- **Chemical degradation**: first-order breakdown of contaminants
26Pathogen Removal in Bank Filtration
27------------------------------------
28For pathogen removal during soil passage, total removal consists of two distinct mechanisms
29(Schijven and Hassanizadeh, 2000):
311. **Inactivation (time-dependent)**: Pathogens lose infectivity over time through biological
32 decay. This follows first-order kinetics and is modeled by this module as
33 ``LR_decay = log10_decay_rate * residence_time``. The inactivation rate depends strongly
34 on temperature and pathogen type.
362. **Attachment (geometry-dependent)**: Pathogens are physically removed by adsorption to soil
37 grains and straining. This depends on aquifer geometry, distance, soil properties, and pH,
38 and is NOT modeled by this module. Users should add this component separately based on
39 site-specific data.
41Total log removal = LR_decay (this module) + LR_attachment (user-specified).
43At the Castricum dune recharge site, Schijven et al. (1999) found that attachment contributed
44approximately 97% of total MS2 removal, with inactivation contributing only 3%. Inactivation
45rates for common model viruses at 10 degrees C are typically 0.02-0.11 log10/day (Schijven and
46Hassanizadeh, 2000, Table 7).
48Available functions:
50- :func:`residence_time_to_log_removal` - Calculate log removal from residence times and
51 decay rate coefficient. Uses formula: Log Removal = log10_decay_rate * residence_time.
52 Handles single values, 1D arrays, or multi-dimensional arrays of residence times. Returns
53 log removal values with same shape as input.
55- :func:`decay_rate_to_log10_decay_rate` - Convert a natural-log decay rate constant
56 lambda [1/day] to a log10 decay rate mu [log10/day].
58- :func:`log10_decay_rate_to_decay_rate` - Convert a log10 decay rate mu [log10/day]
59 to a natural-log decay rate constant lambda [1/day].
61- :func:`parallel_mean` - Calculate weighted average log removal for parallel flow systems.
62 Computes overall efficiency when multiple treatment paths operate in parallel with different
63 log removal values and flow fractions. Uses formula: Total Log Removal = -log10(sum(F_i * 10^(-LR_i)))
64 where F_i is flow fraction and LR_i is log removal for path i. Supports multi-dimensional
65 arrays via axis parameter for batch processing. Assumes equal flow distribution if flow_fractions
66 not provided.
68- :func:`gamma_pdf` - Compute probability density function (PDF) of log removal given
69 gamma-distributed residence time. Since R = mu*T and T ~ Gamma(alpha, beta), R follows a
70 Gamma(alpha, mu*beta) distribution.
72- :func:`gamma_cdf` - Compute cumulative distribution function (CDF) of log removal given
73 gamma-distributed residence time. Returns probability that log removal is less than or equal
74 to specified values.
76- :func:`gamma_mean` - Compute effective (parallel) mean log removal for gamma-distributed
77 residence time. Uses the moment generating function of the gamma distribution to compute the
78 log-weighted average: LR_eff = alpha * log10(1 + beta * mu * ln(10)).
80- :func:`gamma_find_flow_for_target_mean` - Find flow rate that produces specified target
81 effective mean log removal given gamma-distributed aquifer pore volume. Solves inverse problem:
82 flow = apv_beta * mu * ln(10) / (10^(target_mean / apv_alpha) - 1).
84This file is part of gwtransport which is released under AGPL-3.0 license.
85See the ./LICENSE file or go to https://github.com/gwtransport/gwtransport/blob/main/LICENSE for full license details.
86"""
88import numpy as np
89import numpy.typing as npt
90from scipy import optimize, stats
93def residence_time_to_log_removal(
94 *, residence_times: npt.ArrayLike, log10_decay_rate: float
95) -> npt.NDArray[np.floating]:
96 """
97 Compute log removal given residence times and a log10 decay rate.
99 This function calculates the log removal based on residence times
100 and a log10 decay rate coefficient using first-order decay:
102 Log Removal = log10_decay_rate * residence_time
104 This corresponds to exponential decay of pathogen concentration:
105 C_out/C_in = 10^(-log10_decay_rate * residence_time).
107 Parameters
108 ----------
109 residence_times : array-like
110 Array of residence times in days. Must be positive values.
111 log10_decay_rate : float
112 Log10 decay rate coefficient (log10/day). Relates residence time
113 to log removal efficiency via first-order decay.
115 Returns
116 -------
117 log_removals : numpy.ndarray
118 Array of log removal values corresponding to the input residence times.
119 Same shape as input residence_times.
121 See Also
122 --------
123 decay_rate_to_log10_decay_rate : Convert natural-log decay rate to log10 decay rate
124 log10_decay_rate_to_decay_rate : Convert log10 decay rate to natural-log decay rate
125 gamma_mean : Compute mean log removal for gamma-distributed residence times
126 gamma_find_flow_for_target_mean : Find flow rate to achieve target log removal
127 parallel_mean : Calculate weighted average for parallel flow systems
128 gwtransport.residence_time.residence_time : Compute residence times from flow and pore volume
129 :ref:`concept-residence-time` : Time in aquifer determines pathogen contact time
131 Notes
132 -----
133 Log removal is a logarithmic measure of pathogen reduction:
134 - Log 1 = 90% reduction
135 - Log 2 = 99% reduction
136 - Log 3 = 99.9% reduction
138 The first-order decay model is mathematically identical to radioactive
139 decay used in tracer dating. To convert a published natural-log decay
140 rate lambda [1/day] to log10_decay_rate mu [log10/day], use
141 :func:`decay_rate_to_log10_decay_rate`.
143 Examples
144 --------
145 >>> import numpy as np
146 >>> from gwtransport.logremoval import residence_time_to_log_removal
147 >>> residence_times = np.array([10.0, 20.0, 50.0])
148 >>> log10_decay_rate = 0.2
149 >>> residence_time_to_log_removal(
150 ... residence_times=residence_times, log10_decay_rate=log10_decay_rate
151 ... ) # doctest: +NORMALIZE_WHITESPACE
152 array([ 2., 4., 10.])
154 >>> # Single residence time
155 >>> residence_time_to_log_removal(residence_times=5.0, log10_decay_rate=0.3)
156 np.float64(1.5)
158 >>> # 2D array of residence times
159 >>> residence_times_2d = np.array([[10.0, 20.0], [30.0, 40.0]])
160 >>> residence_time_to_log_removal(
161 ... residence_times=residence_times_2d, log10_decay_rate=0.1
162 ... )
163 array([[1., 2.],
164 [3., 4.]])
165 """
166 return log10_decay_rate * np.asarray(residence_times, dtype=float)
169def decay_rate_to_log10_decay_rate(decay_rate: float) -> float:
170 """
171 Convert a natural-log decay rate constant to a log10 decay rate.
173 Converts lambda [1/day] to mu [log10/day] using the relationship
174 mu = lambda / ln(10).
176 Parameters
177 ----------
178 decay_rate : float
179 Natural-log first-order decay rate constant lambda (1/day).
180 For example, from tracer dating: lambda = ln(2) / half_life.
182 Returns
183 -------
184 log10_decay_rate : float
185 Log10 decay rate mu (log10/day).
187 See Also
188 --------
189 log10_decay_rate_to_decay_rate : Inverse conversion
190 residence_time_to_log_removal : Apply the log10 decay rate
192 Examples
193 --------
194 >>> from gwtransport.logremoval import decay_rate_to_log10_decay_rate
195 >>> import numpy as np
196 >>> # Convert a decay rate of ln(2)/30 (half-life of 30 days)
197 >>> decay_rate = np.log(2) / 30
198 >>> decay_rate_to_log10_decay_rate(decay_rate) # doctest: +SKIP
199 0.01003...
200 """
201 return decay_rate / np.log(10)
204def log10_decay_rate_to_decay_rate(log10_decay_rate: float) -> float:
205 """
206 Convert a log10 decay rate to a natural-log decay rate constant.
208 Converts mu [log10/day] to lambda [1/day] using the relationship
209 lambda = mu * ln(10).
211 Parameters
212 ----------
213 log10_decay_rate : float
214 Log10 decay rate mu (log10/day).
216 Returns
217 -------
218 decay_rate : float
219 Natural-log first-order decay rate constant lambda (1/day).
221 See Also
222 --------
223 decay_rate_to_log10_decay_rate : Inverse conversion
225 Examples
226 --------
227 >>> from gwtransport.logremoval import log10_decay_rate_to_decay_rate
228 >>> log10_decay_rate_to_decay_rate(0.2) # doctest: +SKIP
229 0.4605...
230 """
231 return log10_decay_rate * np.log(10)
234def parallel_mean(
235 *, log_removals: npt.ArrayLike, flow_fractions: npt.ArrayLike | None = None, axis: int | None = None
236) -> np.floating | npt.NDArray[np.floating]:
237 """
238 Calculate the weighted average log removal for a system with parallel flows.
240 This function computes the overall log removal efficiency of a parallel
241 filtration system. If flow_fractions is not provided, it assumes equal
242 distribution of flow across all paths.
244 The calculation uses the formula:
246 Total Log Removal = -log10(sum(F_i * 10^(-LR_i)))
248 Where:
249 - F_i = fraction of flow through system i (decimal, sum to 1.0)
250 - LR_i = log removal of system i
252 Parameters
253 ----------
254 log_removals : array-like
255 Array of log removal values for each parallel flow.
256 Each value represents the log10 reduction of pathogens.
257 For multi-dimensional arrays, the parallel mean is computed along
258 the specified axis.
260 flow_fractions : array-like, optional
261 Array of flow fractions for each parallel flow.
262 Must sum to 1.0 along the specified axis and have compatible shape
263 with log_removals. If None, equal flow distribution is assumed
264 (default is None).
266 axis : int, optional
267 Axis along which to compute the parallel mean for multi-dimensional
268 arrays. If None, the array is treated as 1-dimensional
269 (default is None).
271 Returns
272 -------
273 float or array-like
274 The combined log removal value for the parallel system.
275 If log_removals is multi-dimensional and axis is specified,
276 returns an array with the specified axis removed.
278 Raises
279 ------
280 ValueError
281 If ``flow_fractions`` does not sum to 1.0 along the specified axis.
283 See Also
284 --------
285 residence_time_to_log_removal : Compute log removal from residence times
287 Notes
288 -----
289 Log removal is a logarithmic measure of pathogen reduction:
291 - Log 1 = 90% reduction
292 - Log 2 = 99% reduction
293 - Log 3 = 99.9% reduction
295 For parallel flows, the combined removal is typically less effective
296 than the best individual removal but better than the worst.
297 For systems in series, log removals would be summed directly.
299 Examples
300 --------
301 >>> import numpy as np
302 >>> from gwtransport.logremoval import parallel_mean
303 >>> # Three parallel streams with equal flow and log removals of 3, 4, and 5
304 >>> log_removals = np.array([3, 4, 5])
305 >>> parallel_mean(log_removals=log_removals)
306 np.float64(3.431798275933005)
308 >>> # Two parallel streams with weighted flow
309 >>> log_removals = np.array([3, 5])
310 >>> flow_fractions = np.array([0.7, 0.3])
311 >>> parallel_mean(log_removals=log_removals, flow_fractions=flow_fractions)
312 np.float64(3.153044674980176)
314 >>> # Multi-dimensional array: parallel mean along axis 1
315 >>> log_removals_2d = np.array([[3, 4, 5], [2, 3, 4]])
316 >>> parallel_mean(log_removals=log_removals_2d, axis=1)
317 array([3.43179828, 2.43179828])
318 """
319 # Convert log_removals to numpy array if it isn't already
320 log_removals = np.asarray(log_removals, dtype=float)
322 # If flow_fractions is not provided, assume equal distribution
323 if flow_fractions is None:
324 if axis is None:
325 # 1D case: calculate the number of parallel flows
326 n = len(log_removals)
327 # Create equal flow fractions (avoid division by zero)
328 flow_fractions = np.full(n, 1.0 / n) if n > 0 else np.array([])
329 else:
330 # Multi-dimensional case: create equal flow fractions along the specified axis
331 n = log_removals.shape[axis]
332 shape = [1] * log_removals.ndim
333 shape[axis] = n
334 flow_fractions = np.full(shape, 1.0 / n)
335 else:
336 # Convert flow_fractions to numpy array
337 flow_fractions = np.asarray(flow_fractions, dtype=float)
339 fraction_sums = np.sum(flow_fractions, axis=axis)
340 if not np.all(np.isclose(fraction_sums, 1.0)):
341 msg = "flow_fractions must sum to 1.0 (along the specified axis)"
342 raise ValueError(msg)
344 # Convert log removal to decimal reduction values
345 decimal_reductions = 10 ** (-log_removals)
347 # Calculate weighted average decimal reduction
348 weighted_decimal_reduction = np.sum(flow_fractions * decimal_reductions, axis=axis)
350 # Convert back to log scale
351 return -np.log10(weighted_decimal_reduction)
354def gamma_pdf(
355 *,
356 r: npt.ArrayLike,
357 rt_alpha: float,
358 rt_beta: float,
359 rt_loc: float = 0.0,
360 log10_decay_rate: float,
361) -> npt.NDArray[np.floating]:
362 """
363 Compute the PDF of log removal given (shifted) gamma-distributed residence time.
365 With residence time ``T = T0 + rt_loc`` where ``T0 ~ Gamma(rt_alpha, rt_beta)``,
366 the log removal ``R = mu * T`` follows a shifted gamma distribution with shape
367 ``rt_alpha``, scale ``mu * rt_beta``, and location ``mu * rt_loc``.
369 Parameters
370 ----------
371 r : array-like
372 Log removal values at which to compute the PDF.
373 rt_alpha : float
374 Shape parameter of the gamma distribution for residence time.
375 rt_beta : float
376 Scale parameter of the gamma distribution for residence time (days).
377 rt_loc : float, optional
378 Location (minimum residence time, days) of the residence time distribution.
379 Must be non-negative. Default is ``0.0``.
380 log10_decay_rate : float
381 Log10 decay rate mu (log10/day). Relates residence time to
382 log removal via R = mu * T.
384 Returns
385 -------
386 pdf : numpy.ndarray
387 PDF values corresponding to the input r values.
389 Raises
390 ------
391 ValueError
392 If ``rt_loc`` is negative.
394 See Also
395 --------
396 gamma_cdf : Cumulative distribution function of log removal
397 gamma_mean : Mean of the log removal distribution
398 """
399 if rt_loc < 0:
400 msg = "rt_loc must be non-negative"
401 raise ValueError(msg)
402 r = np.asarray(r)
403 return stats.gamma.pdf(r, a=rt_alpha, loc=log10_decay_rate * rt_loc, scale=log10_decay_rate * rt_beta)
406def gamma_cdf(
407 *,
408 r: npt.ArrayLike,
409 rt_alpha: float,
410 rt_beta: float,
411 rt_loc: float = 0.0,
412 log10_decay_rate: float,
413) -> npt.NDArray[np.floating]:
414 """
415 Compute the CDF of log removal given (shifted) gamma-distributed residence time.
417 With residence time ``T = T0 + rt_loc`` where ``T0 ~ Gamma(rt_alpha, rt_beta)``,
418 the CDF is ``P(R <= r) = P(mu*(T0 + rt_loc) <= r) =
419 P(T0 <= (r - mu*rt_loc)/mu)`` which is the CDF of a shifted gamma distribution
420 with location ``mu * rt_loc``.
422 Parameters
423 ----------
424 r : array-like
425 Log removal values at which to compute the CDF.
426 rt_alpha : float
427 Shape parameter of the gamma distribution for residence time.
428 rt_beta : float
429 Scale parameter of the gamma distribution for residence time (days).
430 rt_loc : float, optional
431 Location (minimum residence time, days) of the residence time distribution.
432 Must be non-negative. Default is ``0.0``.
433 log10_decay_rate : float
434 Log10 decay rate mu (log10/day). Relates residence time to
435 log removal via R = mu * T.
437 Returns
438 -------
439 cdf : numpy.ndarray
440 CDF values corresponding to the input r values.
442 Raises
443 ------
444 ValueError
445 If ``rt_loc`` is negative.
447 See Also
448 --------
449 gamma_pdf : Probability density function of log removal
450 gamma_mean : Mean of the log removal distribution
451 """
452 if rt_loc < 0:
453 msg = "rt_loc must be non-negative"
454 raise ValueError(msg)
455 r = np.asarray(r)
456 return stats.gamma.cdf(r, a=rt_alpha, loc=log10_decay_rate * rt_loc, scale=log10_decay_rate * rt_beta)
459def gamma_mean(
460 *,
461 rt_alpha: float,
462 rt_beta: float,
463 rt_loc: float = 0.0,
464 log10_decay_rate: float,
465) -> float:
466 """
467 Compute the effective (parallel) mean log removal for (shifted) gamma-distributed residence time.
469 When water travels through multiple flow paths with gamma-distributed
470 residence times, the effective log removal is determined by mixing the
471 output concentrations (not by averaging individual log removals). For a
472 shifted gamma distribution ``T = T0 + rt_loc`` with ``T0 ~ Gamma(alpha, beta)``,
473 factoring the moment generating function gives:
475 LR_eff = -log10(E[10^(-mu*T)])
476 = -log10(10^(-mu*rt_loc) * E[10^(-mu*T0)])
477 = mu * rt_loc + alpha * log10(1 + beta * mu * ln(10))
479 The ``rt_loc`` term shifts the whole log-removal distribution by a constant
480 ``mu * rt_loc``; the alpha/beta term is unchanged. This is always less than
481 the arithmetic mean ``mu * (alpha * beta + rt_loc)`` because short residence
482 time paths contribute disproportionately to the output concentration.
484 Parameters
485 ----------
486 rt_alpha : float
487 Shape parameter of the gamma distribution for residence time.
488 rt_beta : float
489 Scale parameter of the gamma distribution for residence time (days).
490 rt_loc : float, optional
491 Location (minimum residence time, days) of the residence time distribution.
492 Must be non-negative. Default is ``0.0``.
493 log10_decay_rate : float
494 Log10 decay rate mu (log10/day).
496 Returns
497 -------
498 mean : float
499 Effective (parallel) mean log removal value.
501 Raises
502 ------
503 ValueError
504 If ``rt_loc`` is negative.
506 See Also
507 --------
508 gamma_find_flow_for_target_mean : Find flow for target mean log removal
509 parallel_mean : Discrete version of this calculation
510 gamma_pdf : PDF of the log removal distribution
511 gamma_cdf : CDF of the log removal distribution
512 :ref:`concept-pore-volume-distribution` : Why residence times are distributed
513 """
514 if rt_loc < 0:
515 msg = "rt_loc must be non-negative"
516 raise ValueError(msg)
517 return log10_decay_rate * rt_loc + rt_alpha * np.log10(1 + rt_beta * log10_decay_rate * np.log(10))
520def gamma_find_flow_for_target_mean(
521 *,
522 target_mean: float,
523 apv_alpha: float,
524 apv_beta: float,
525 apv_loc: float = 0.0,
526 log10_decay_rate: float,
527) -> float:
528 """
529 Find the flow rate that produces a target effective mean log removal.
531 Given a (shifted) gamma-distributed aquifer pore volume with parameters
532 ``(apv_alpha, apv_beta, apv_loc)``, the residence time distribution at flow
533 ``Q`` is a shifted gamma with shape ``apv_alpha``, scale ``apv_beta/Q``, and
534 location ``apv_loc/Q``. From :func:`gamma_mean`:
536 LR_eff = mu * apv_loc / Q + apv_alpha * log10(1 + (apv_beta/Q) * mu * ln(10))
538 For ``apv_loc == 0`` this is closed-form:
540 Q = apv_beta * mu * ln(10) / (10^(target_mean / apv_alpha) - 1)
542 For ``apv_loc > 0`` the equation is transcendental and solved numerically
543 with :func:`scipy.optimize.brentq` by bracketing the root in ``1/Q``.
545 Parameters
546 ----------
547 target_mean : float
548 Target effective mean log removal value. Must be positive.
549 apv_alpha : float
550 Shape parameter of the gamma distribution for aquifer pore volume.
551 apv_beta : float
552 Scale parameter of the gamma distribution for aquifer pore volume.
553 apv_loc : float, optional
554 Location (minimum aquifer pore volume) of the gamma distribution.
555 Must be non-negative. Default is ``0.0``.
556 log10_decay_rate : float
557 Log10 decay rate mu (log10/day).
559 Returns
560 -------
561 flow : float
562 Flow rate (same units as apv_beta per day) that produces the
563 target mean log removal.
565 Raises
566 ------
567 ValueError
568 If ``apv_loc`` is negative, if ``target_mean`` is not positive, if
569 ``log10_decay_rate`` is not positive (no decay can never produce a
570 positive target log removal), or if ``apv_alpha`` or ``apv_beta`` are
571 not positive.
573 See Also
574 --------
575 gamma_mean : Compute effective mean log removal for given parameters
576 """
577 if apv_loc < 0:
578 msg = "apv_loc must be non-negative"
579 raise ValueError(msg)
580 if target_mean <= 0:
581 msg = "target_mean must be positive"
582 raise ValueError(msg)
583 if apv_alpha <= 0:
584 msg = "apv_alpha must be positive"
585 raise ValueError(msg)
586 if apv_beta <= 0:
587 msg = "apv_beta must be positive"
588 raise ValueError(msg)
589 if log10_decay_rate <= 0:
590 # Without decay, the effective mean log removal is identically zero
591 # regardless of flow, so no finite flow can attain a positive target.
592 msg = "log10_decay_rate must be positive to attain a positive target_mean"
593 raise ValueError(msg)
595 ln10 = np.log(10)
596 flow_closed_form = apv_beta * log10_decay_rate * ln10 / (10 ** (target_mean / apv_alpha) - 1)
598 if apv_loc == 0.0:
599 return float(flow_closed_form)
601 # Solve target = mu*apv_loc*u + apv_alpha*log10(1 + apv_beta*mu*ln(10)*u) for u = 1/flow.
602 # Both terms are monotonically increasing in u, so f(u) - target is monotonic with a
603 # unique positive root. Bracket: at u = 1/flow_closed_form the alpha/beta term alone
604 # equals target_mean, so the full f overshoots by exactly mu*apv_loc*u_upper > 0.
605 u_upper = 1.0 / flow_closed_form
606 if not np.isfinite(u_upper) or u_upper <= 0:
607 # Defensive: with the validated inputs above, flow_closed_form is finite and
608 # strictly positive, so u_upper should be a finite positive bracket. If a
609 # pathological input slips through, surface it instead of calling brentq on
610 # an invalid interval.
611 msg = (
612 "Cannot bracket the root for the given parameters; check that "
613 "target_mean, apv_alpha, apv_beta, and log10_decay_rate are finite and positive."
614 )
615 raise ValueError(msg)
617 def residual(u: float) -> float:
618 return float(
619 log10_decay_rate * apv_loc * u
620 + apv_alpha * np.log10(1 + apv_beta * log10_decay_rate * ln10 * u)
621 - target_mean
622 )
624 u_root = optimize.brentq(residual, 0.0, u_upper)
625 return 1.0 / float(u_root) # type: ignore[arg-type]