Title: | Mass Balance Reconstruction |
---|---|
Description: | Mass-balance-adjusted Regression algorithm for streamflow reconstruction at sub-annual resolution (e.g., seasonal or monthly). The algorithm implements a penalty term to minimize the differences between the total sub-annual flows and the annual flow. The method is described in Nguyen et al (2020) <DOI:10.1002/essoar.10504791.1>. |
Authors: | Hung Nguyen |
Maintainer: | Hung Nguyen <[email protected]> |
License: | GPL (>= 2.0) |
Version: | 0.0.1.1 |
Built: | 2025-02-09 04:00:08 UTC |
Source: | https://github.com/critical-infrastructure-systems-lab/mbr |
Transform the reconstructed values back to the flow space and convert to data.table
back_trans(hat, years, mus, sigmas, log.trans, N, season.names)
back_trans(hat, years, mus, sigmas, log.trans, N, season.names)
hat |
A vector of estimated flow in the transformed space. |
years |
A vector of all years in the study period |
mus |
A vector of means, one for each target. |
sigmas |
A vector of the standard deviations, one for each target. |
log.trans |
A vector containing the indices of the columns to be log-transformed. |
N |
The number of targets (number of seasons plus one for the annual reconstruction). |
season.names |
A character vector containing the names of the seasons |
A data.table
with three columns: Q (the back-transformed streamflow), season, and year.
Calculate reconstruction metrics from the instrumental period
calculate_metrics(sim, obs, z, norm.fun = mean)
calculate_metrics(sim, obs, z, norm.fun = mean)
sim |
A vector of reconstruction output for instrumental period |
obs |
A vector of all observations |
z |
A vector of left out indices in cross validation |
norm.fun |
The function (unquoted name) used to calculate the normalizing constant. Default is |
A named vector of performance metrics
calculate_metrics(rnorm(100), rnorm(100), z = 1:10) calculate_metrics(rnorm(100), rnorm(100), z = 1:10, norm.fun = sd)
calculate_metrics(rnorm(100), rnorm(100), z = 1:10) calculate_metrics(rnorm(100), rnorm(100), z = 1:10, norm.fun = sd)
Same as base::scale()
but much faster.
colScale(x, add_attr = TRUE)
colScale(x, add_attr = TRUE)
x |
A matrix. |
add_attr |
If TRUE, the column means and standard deviations are returned as attributes. This is consistent with |
The scaled matrix.
This function was adopted from John Muschelli's code on StackOverflow, but I changed the underlying functions to calculate mean and standard deviation from matrixStats
to Rfast
, which is much faster.
Backtransform a matrix that was scaled before.
colUnscale(x, cm, csd)
colUnscale(x, cm, csd)
x |
A matrix. |
cm |
A vector of column means |
csd |
A vector of column standard deviations |
The unscaled matrix
Cross-validation
cv_mb( instQ, pc.list, cv.folds, start.year, lambda = 1, log.trans = NULL, force.standardize = FALSE, return.type = c("fval", "metrics", "metric means", "Q") )
cv_mb( instQ, pc.list, cv.folds, start.year, lambda = 1, log.trans = NULL, force.standardize = FALSE, return.type = c("fval", "metrics", "metric means", "Q") )
instQ |
Instrumental data, in the same order as pc.list. The "season" column must be a factor. |
pc.list |
List of PC matrices |
cv.folds |
A list containing the cross validation folds |
start.year |
The first year of record |
lambda |
The penalty weight |
log.trans |
A vector containing indices of the targets to be log-transformed. If no transformation is needed, provide |
force.standardize |
If TRUE, all observations are standardized. See Details. |
return.type |
The type of results to be returned. Several types are possible to suit multiple use cases.
|
A data.table
containing cross-validation results (metrics, fval, or metric means) for each target.
cvFolds <- make_Z(1922:2003, nRuns = 50, frac = 0.25, contiguous = TRUE) cv <- cv_mb(p1Seasonal, pc3seasons, cvFolds, 1750, log.trans = 1:3, return.type = 'metrics')
cvFolds <- make_Z(1922:2003, nRuns = 50, frac = 0.25, contiguous = TRUE) cv <- cv_mb(p1Seasonal, pc3seasons, cvFolds, 1750, log.trans = 1:3, return.type = 'metrics')
Kling-Gupta Efficiency
KGE(yhat, y)
KGE(yhat, y)
yhat |
Model outputs |
y |
Observations |
KGE value
KGE(rnorm(100), rnorm(100))
KGE(rnorm(100), rnorm(100))
Least square with mass balance penalty
lsq_mb(hat, obs, lambda, mus, sigmas, log.seasons, log.ann, N, sInd)
lsq_mb(hat, obs, lambda, mus, sigmas, log.seasons, log.ann, N, sInd)
hat |
A vector of estimated flow in the transformed space. |
obs |
A vector of observed flow in the transformed space. |
lambda |
Penalty weight. |
mus |
A vector of means, one for each target. |
sigmas |
A vector of the standard deviations, one for each target. |
log.seasons |
A vector containing the indices of the seasons that are log-transformed. |
log.ann |
TRUE if the annual reconstruction is log-transformed. |
N |
The number of targets (number of seasons plus one for the annual reconstruction). |
sInd |
Indices of the seasons, i.e, 1...N-1 |
Objective function value: least squares plus a penalty term.
Make a list of cross-validation folds. Each element of the list is a vector of the cross-validation points for one cross-validation run.
make_Z(obs, nRuns = 30, frac = 0.1, contiguous = TRUE)
make_Z(obs, nRuns = 30, frac = 0.1, contiguous = TRUE)
obs |
Vector of observations. |
nRuns |
Number of repetitions. |
frac |
Fraction of left-out points. For leave-one-out, use |
contiguous |
Logical. If |
A list of cross-validation folds
Z <- make_Z(p1Seasonal$Qa, nRuns = 30, frac = 0.25, contiguous = TRUE)
Z <- make_Z(p1Seasonal$Qa, nRuns = 30, frac = 0.25, contiguous = TRUE)
Fit parameters with mass balance criterion
mb_fit(X, Y, lambda, mus, sigmas, log.seasons, log.ann, N, sInd)
mb_fit(X, Y, lambda, mus, sigmas, log.seasons, log.ann, N, sInd)
X |
Inputs, must have columns of 1 added |
Y |
Observed Dry, Wet, and Annual log-transformed flows |
lambda |
Penalty weight. |
mus |
A vector of means, one for each target. |
sigmas |
A vector of the standard deviations, one for each target. |
log.seasons |
A vector containing the indices of the seasons that are log-transformed. |
log.ann |
TRUE if the annual reconstruction is log-transformed. |
N |
The number of targets (number of seasons plus one for the annual reconstruction). |
sInd |
Indices of the seasons, i.e, 1...N-1 |
A one-column matrix of beta value
Mass-balance-adjusted reconstruction
mb_reconstruction( instQ, pc.list, start.year, lambda = 1, log.trans = NULL, force.standardize = FALSE )
mb_reconstruction( instQ, pc.list, start.year, lambda = 1, log.trans = NULL, force.standardize = FALSE )
instQ |
Instrumental data, in the same order as pc.list. The "season" column must be a factor. |
pc.list |
List of PC matrices. The first element is for the first season, second element for second season, and so on. The last element is for the annual reconstruction. |
start.year |
The first year of record |
lambda |
The penalty weight |
log.trans |
A vector containing indices of the targets to be log-transformed. If no transformation is needed, provide |
force.standardize |
If TRUE, all observations are standardized. See Details. |
A data.table
with the following columns: season, year, Q, and lambda.
If some targets are log transformed and some are not, they will have different scales, which affects the objective function. In this case the observations will be standardized so that they are in the same range. Otherwise, standardization are skipped for speed. However, in some cases you may want to standardize any ways, for example when flows in some months are much larger than in other months. In this case, set force.standardize = TRUE
.
mb_reconstruction(p1Seasonal, pc3seasons, 1750, lambda = 1, log.trans = 1:3)
mb_reconstruction(p1Seasonal, pc3seasons, 1750, lambda = 1, log.trans = 1:3)
RMSE is normalized by the normalization constant
nRMSE(yhat, y, normConst)
nRMSE(yhat, y, normConst)
yhat |
Model outputs |
y |
Observations |
normConst |
The normalization constant |
normalized RMSE value
x <- rnorm(100) y <- rnorm(100) nRMSE(x, y, sd(y))
x <- rnorm(100) y <- rnorm(100) nRMSE(x, y, sd(y))
Nash-Sutcliffe Efficiency
NSE(yhat, y)
NSE(yhat, y)
yhat |
Model outputs |
y |
Observations |
NSE value
NSE(rnorm(100), rnorm(100))
NSE(rnorm(100), rnorm(100))
This is a wrapper for lsq_mb()
. It first calculates hat
, then calls lsq_mb()
.
This is used in optim()
, so it returns a scalar.
obj_fun(beta, X, Y, lambda, mus, sigmas, log.seasons, log.ann, N, sInd)
obj_fun(beta, X, Y, lambda, mus, sigmas, log.seasons, log.ann, N, sInd)
beta |
Parameters |
X |
Inputs, must have columns of 1 added |
Y |
Observed Dry, Wet, and Annual log-transformed flows |
lambda |
Penalty weight. |
mus |
A vector of means, one for each target. |
sigmas |
A vector of the standard deviations, one for each target. |
log.seasons |
A vector containing the indices of the seasons that are log-transformed. |
log.ann |
TRUE if the annual reconstruction is log-transformed. |
N |
The number of targets (number of seasons plus one for the annual reconstruction). |
sInd |
Indices of the seasons, i.e, 1...N-1 |
Objective function value
Streamflow at P.1 station (Chiang Mai, Thailand) for three reconstruction targets: dry season (NJ, Nov-Jun), wet season (JO, Jul-Oct), and water year (WY, Nov-Oct), as used by Nguyen et al (2020).
p1Seasonal
p1Seasonal
A data table with 246 rows and 3 variables:
a factor with three levels: "NJ", "JO", and "WY"
integer, from 1922 to 2003
Annual flow for each target
https://www.essoar.org/doi/10.1002/essoar.10504791.1
Nguyen, H. T. T., Galelli, S., Xu, C., & Buckley, B. (2020). Multi-Proxy, Multi-Season Streamflow Reconstruction with Mass Balance Adjustment. Earth and Space Science Open Archive, 22. https://doi.org/10.1002/essoar.10504791.1
Principal components of the Southeast Asian Dendrochronology Network, after appropriate sites have been selected for each season.
pc3seasons
pc3seasons
A list with three elements (NJ, JO, and WY), each element is a principal component matrix.
https://www.essoar.org/doi/10.1002/essoar.10504791.1
Nguyen, H. T. T., Galelli, S., Xu, C., & Buckley, B. (2020). Multi-Proxy, Multi-Season Streamflow Reconstruction with Mass Balance Adjustment. Earth and Space Science Open Archive, 22. https://doi.org/10.1002/essoar.10504791.1
Prepend a column of ones
prepend_ones(x)
prepend_ones(x)
x |
The input matrix |
x with a column of ones prepended, which is named 'Int' for 'intercept'
Reduction of Error
RE(yhat, y, yc_bar)
RE(yhat, y, yc_bar)
yhat |
Model outputs in the validation set |
y |
Observations in the validation set |
yc_bar |
Mean observations in the calibration set |
RE value
x <- rnorm(100) y <- rnorm(100) yc_bar <- mean(x[1:50]) RE(x[51:100], y[51:100], yc_bar)
x <- rnorm(100) y <- rnorm(100) yc_bar <- mean(x[1:50]) RE(x[51:100], y[51:100], yc_bar)
Similar to colScale
rowScale(x, add_attr = TRUE)
rowScale(x, add_attr = TRUE)
x |
A matrix. |
add_attr |
If TRUE, the column means and standard deviations are returned as attributes. This is consistent with |
The scaled matrix.
Backtransform a matrix that was scaled before.
rowUnscale(x, rm, rsd)
rowUnscale(x, rm, rsd)
x |
A matrix. |
rm |
A vector of row means |
rsd |
A vector of row standard deviations |
The unscaled matrix