| Title: | Covariate Assisted Principal Regression |
|---|---|
| Description: | Covariate Assisted Principal Regression (CAPR) for multiple covariance-matrix outcomes. The method identifies (principal) projection directions that maximize the log-likelihood of a log-linear regression model of the covariates. See Zhao et al. (2021), "Covariate Assisted Principal Regression for Covariance Matrix Outcomes" <doi:10.1093/biostatistics/kxz057>. |
| Authors: | Xi Luo [aut, cre], Yi Zhao [aut], Brian Caffo [aut] |
| Maintainer: | Xi Luo <[email protected]> |
| License: | GPL-3 |
| Version: | 0.3.0 |
| Built: | 2026-05-27 07:25:55 UTC |
| Source: | https://github.com/rluo/capr |
Fits CAP components sequentially, estimating principal direction vectors $\gamma_k$ and regression
coefficients $\beta_k$ for $k = 1, \ldots, K$. Each component is estimated via a flip-flop
algorithm with optional orthogonalization of successive directions.

    capr(
      S, X, K,
      B.init = NULL, Gamma.init = NULL, weight = NULL,
      max_iter = 200L, tol = 1e-06, orth = TRUE, n.init = 10L
    )

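| Argument | Description |
|---|---|
| S | Numeric 3D array of size $p \times p \times n$; each slice is a covariance matrix. |
| X | Numeric design matrix with one row per slice of S. |
| K | Integer scalar; number of components to fit. |
| B.init | Initial value of the coefficient array (default NULL). |
| Gamma.init | Initial value of the principal direction array (default NULL). |
| weight | Numeric vector of length $n$ giving the weight of each slice (default NULL). |
| max_iter | Integer scalar; maximum flip-flop iterations per component (default 200). |
| tol | Positive numeric scalar; convergence tolerance (default 1e-6). |
| orth | Logical scalar; if TRUE, successive directions are orthogonalized (default TRUE). |
| n.init | Integer scalar; number of random initializations (default 10). |
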
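For component $k$, CAP estimates the pair $(\beta_k, \gamma_k)$ by minimizing the weighted negative
log-likelihood of the log-linear model, subject to a normalization constraint on $\gamma_k$
and, for $k \ge 2$, an orthogonality constraint with respect to the previously estimated directions.
Each slice $i$ enters the objective with its own weight, $S_i$ is the $i$-th covariance slice,
and $H$ is the positive definite matrix used for the orthogonality constraint (see Zhao et al., 2021).
The algorithm fits $\beta_k$ and $\gamma_k$ sequentially with multiple random initializations and
returns the solution pair that minimizes the negative log-likelihood.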
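For orientation, the optimization problem can be sketched as follows. This is written from the formulation in Zhao et al. (2021), not from the package source. The symbols $w_i$ (weight of slice $i$) and $x_i$ (the $i$-th row of X), and the exact form of the orthogonality constraint, are assumptions made for illustration.

```latex
\begin{aligned}
\min_{\beta_k,\,\gamma_k}\quad
  & \frac{1}{2}\sum_{i=1}^{n} w_i\, x_i^\top \beta_k
    + \frac{1}{2}\sum_{i=1}^{n} w_i \exp\!\left(-x_i^\top \beta_k\right) \gamma_k^\top S_i\, \gamma_k \\
\text{subject to}\quad
  & \gamma_k^\top H\, \gamma_k = 1, \\
  & \gamma_j^\top \gamma_k = 0 \quad (j = 1, \ldots, k-1;\ k \ge 2).
\end{aligned}
```

Under this form, $\exp(x_i^\top \beta_k)$ models the variance of the data projected onto $\gamma_k$, which is what makes the regression log-linear.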
A list of class capr with:
| Component | Description |
|---|---|
| B | numeric matrix of regression coefficients |
| Gamma | numeric matrix of principal directions |
| loglike | negative log-likelihood, up to constant scaling and shift |
| S | 3D array used for fitting |
| X | design matrix used for fitting |
| weight | weight values used for fitting |

Zhao, Y., Wang, B., Mostofsky, S. H., Caffo, B. S., & Luo, X. (2021). "Covariate assisted principal regression for covariance matrix outcomes." Biostatistics, 22(3), 629-645.

    simu.data <- simu.capr(seed = 123L, n = 120L)
    K <- 2L
    fit <- capr(
      S = simu.data$S,
      X = simu.data$X,
      K = K
    )
    print(fit)

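To make the flip-flop idea concrete, here is a minimal base R sketch for a single component. It is not the `capr()` implementation. It assumes the objective `0.5 * sum(w * (eta + exp(-eta) * s))` with `eta = X %*% beta` and `s_i = t(gamma) %*% S_i %*% gamma`, under the constraint `t(gamma) %*% H %*% gamma = 1`. The function name `flip_flop_sketch` and the simulated data are hypothetical. The beta step uses `glm.fit()` with a Gamma log-link family, whose score equations coincide with those of the assumed objective, in place of a hand-written Newton solver.

```r
# Illustrative single-component flip-flop (assumed objective; not the package code).
flip_flop_sketch <- function(S, X, H, w = rep(1, dim(S)[3]), n_iter = 15L) {
  p <- dim(S)[1]
  n <- dim(S)[3]
  # H^{-1/2} from the eigendecomposition of the positive definite matrix H
  eH <- eigen(H, symmetric = TRUE)
  H_inv_sqrt <- eH$vectors %*% diag(1 / sqrt(eH$values), p) %*% t(eH$vectors)
  # Projected variance gamma' S_i gamma for every slice
  project <- function(gamma) {
    apply(S, 3, function(Si) drop(crossprod(gamma, Si %*% gamma)))
  }
  beta <- rep(0, ncol(X))
  objective <- numeric(n_iter)
  for (it in seq_len(n_iter)) {
    # gamma step: minimize gamma' A gamma subject to gamma' H gamma = 1
    eta <- drop(X %*% beta)
    A <- matrix(0, p, p)
    for (i in seq_len(n)) A <- A + w[i] * exp(-eta[i]) * S[, , i]
    M <- H_inv_sqrt %*% A %*% H_inv_sqrt
    eM <- eigen((M + t(M)) / 2, symmetric = TRUE)
    gamma <- drop(H_inv_sqrt %*% eM$vectors[, p])  # smallest eigenvalue comes last
    # beta step: Gamma GLM with log link on the projected variances
    s <- project(gamma)
    beta <- glm.fit(
      x = X, y = s, weights = w, family = Gamma(link = "log"),
      control = list(epsilon = 1e-10, maxit = 100)
    )$coefficients
    eta <- drop(X %*% beta)
    objective[it] <- 0.5 * sum(w * (eta + exp(-eta) * s))
  }
  list(beta = beta, gamma = gamma, objective = objective)
}

# Hypothetical noise-free data: shared directions Q, log-linear eigenvalues
set.seed(1)
n <- 60
p <- 3
X <- cbind(1, rnorm(n))
Q <- qr.Q(qr(matrix(rnorm(p * p), p, p)))
Beta <- cbind(c(1, 0.8), c(0, -0.5), c(-1, 0))
S <- array(0, dim = c(p, p, n))
for (i in seq_len(n)) {
  lambda <- exp(drop(X[i, ] %*% Beta))
  S[, , i] <- Q %*% diag(lambda) %*% t(Q)
}
H <- apply(S, c(1, 2), mean)
res <- flip_flop_sketch(S, X, H)
```

Because each half-step minimizes the objective over one block of parameters with the other held fixed, `res$objective` should be non-increasing across iterations, which is a quick sanity check on any implementation of this scheme.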
Generates bootstrap inference for the CAP regression coefficients $\beta$ while
holding the fitted directions $\Gamma$ fixed. Each replicate samples the
covariance slices with replacement, projects them onto the fixed
directions to obtain component-specific variances, and re-solves the
estimating equations for $\beta$. Quantile-based confidence intervals are returned
for every predictor/component pair.

    capr.boot(
      fit,
      nboot = 1000L, level = 0.95,
      max_iter = 100L, tol = 1e-06, seed = NULL
    )

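| Argument | Description |
|---|---|
| fit | A fitted object of class capr, as returned by capr(). |
| nboot | Integer; number of bootstrap replicates (default 1000). |
| level | Confidence level for the returned intervals (default 0.95). |
| max_iter | Maximum Newton iterations for solving for the coefficients (default 100). |
| tol | Convergence tolerance for the Newton solver (default 1e-6). |
| seed | Optional integer seed for reproducibility. |
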
A list of class capr.boot with:
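| Component | Description |
|---|---|
| beta | Bootstrap average of the coefficient estimates. |
| ci_lower, ci_upper | Matrices of lower and upper confidence limits. |
| level | The requested confidence level. |
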

    simu.data <- simu.capr(seed = 123L, n = 120L)
    K <- 3L
    fit <- capr(
      S = simu.data$S,
      X = simu.data$X,
      K = K
    )
    capr.boot(fit, nboot = 10L, level = 0.95, seed = 42L)

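The resampling scheme described above can be sketched in base R for one component. This is a hedged illustration, not the internals of `capr.boot()`: the projected variances `s` are simulated directly (hypothetical data), and the coefficients are refit with `glm.fit()` using a Gamma log-link family instead of the package's Newton solver. The names `fit_beta`, `boot`, and `beta_boot` are introduced here for illustration only.

```r
set.seed(42)
n <- 80
X <- cbind(1, rnorm(n))
beta_true <- c(0.5, -1)
# Hypothetical projected variances gamma' S_i gamma for a fixed direction gamma
s <- exp(drop(X %*% beta_true)) * rchisq(n, df = 50) / 50

# Refit the log-linear coefficients for one (re)sample
fit_beta <- function(X, s) {
  glm.fit(x = X, y = s, family = Gamma(link = "log"))$coefficients
}

nboot <- 200L
level <- 0.95
boot <- replicate(nboot, {
  idx <- sample.int(n, replace = TRUE)      # resample slices with replacement
  fit_beta(X[idx, , drop = FALSE], s[idx])  # direction stays fixed; only beta is refit
})

# Quantile-based (percentile) intervals, one per predictor
alpha <- (1 - level) / 2
ci_lower <- apply(boot, 1, quantile, probs = alpha)
ci_upper <- apply(boot, 1, quantile, probs = 1 - alpha)
beta_boot <- rowMeans(boot)
```

Holding the direction fixed keeps each replicate cheap, since only a low-dimensional regression is refit, but it also means the intervals do not reflect uncertainty in the estimated directions.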
Computes the cosine of the angle between two numeric vectors. Both vectors must have equal length and non-zero Euclidean norms.

    cosine_similarity(a, b, eps = 1e-12)

| Argument | Description |
|---|---|
| a, b | Numeric vectors of equal length. |
| eps | Non-negative numeric tolerance used to guard against division by zero. Defaults to 1e-12. |

A scalar double value in [-1, 1] representing the cosine
similarity between a and b.

    cosine_similarity(c(1, 2, 3), c(1, 2, 3))
    cosine_similarity(c(1, 0), c(0, 1))
    cosine_similarity(c(1, 2), c(-1, -2))

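For reference, the quantity being computed is the inner product divided by the product of the Euclidean norms. The base R function below is a minimal sketch of that formula. The name `cosine_sketch` is hypothetical, and the way `eps` is used here (rejecting inputs whose norm product falls at or below it) is an assumption, not necessarily how the package applies it.

```r
cosine_sketch <- function(a, b, eps = 1e-12) {
  stopifnot(is.numeric(a), is.numeric(b), length(a) == length(b))
  denom <- sqrt(sum(a^2)) * sqrt(sum(b^2))
  # Assumed guard: refuse (near-)zero vectors instead of dividing by ~0
  if (denom <= eps) stop("both vectors must have non-zero Euclidean norm")
  sum(a * b) / denom
}

same <- cosine_sketch(c(1, 2, 3), c(1, 2, 3))   # parallel vectors
orth <- cosine_sketch(c(1, 0), c(0, 1))         # orthogonal vectors
opp  <- cosine_sketch(c(1, 2), c(-1, -2))       # opposite directions
```

The three calls mirror the examples above and should give values of 1, 0, and -1 up to floating-point rounding.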
Implements the Flury & Gautschi (1986) (FG) iterative algorithm and a variant to estimate a common loading matrix across multiple covariance matrices. Each iteration cycles over all ordered pairs of variable indices and updates a (2 x 2) rotation so that the transformed matrices share diagonal structure.

    FG(cov_array, p = NULL, m = NULL, maxit = 30L)
    FG2(cov_array, p = NULL, m = NULL, maxit = 30L)

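| Argument | Description |
|---|---|
| cov_array | Numeric 3D array of shape $p \times p \times m$. |
| p | Optional integer specifying the matrix dimension; defaults to the first dimension of cov_array. |
| m | Optional integer specifying the number of matrices/slices; defaults to the third dimension of cov_array. |
| maxit | Integer scalar; number of outer iterations of the algorithm. |
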
Two solvers are exported:
- FG(): The original FG algorithm.
- FG2(): An alternative algorithm by Eslami et al. (2013).
A numeric matrix of estimated common loadings.
Flury, B. N. (1984). "Common Principal Components in k Groups." Journal of the American Statistical Association, 79, 892-898.
Flury, B. N., & Gautschi, W. (1986). "An Algorithm for Simultaneous Orthogonal Transformation of Several Positive Definite Symmetric Matrices to Nearly Diagonal Form." SIAM Journal on Scientific and Statistical Computing, 7(1), 169-184.
Eslami, A., Qannari, E. M., Kohler, A., & Bougeard, S. (2013). "General Overview of Methods of Analysis of Multi-Group Datasets." Revue des Nouvelles Technologies de l'Information, 25, 108-123.

    set.seed(1)
    p <- 3
    m <- 4
    mats <- replicate(m, {
      A <- matrix(rnorm(p * p), p, p)
      crossprod(A)
    }, simplify = FALSE)
    cov_cube <- array(NA_real_, dim = c(p, p, m))
    for (k in 1:m) cov_cube[, , k] <- mats[[k]]
    FG(cov_cube, maxit = 5)
    FG2(cov_cube, maxit = 5)

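To illustrate what a common loading matrix means, the base R sketch below covers the idealized case in which all matrices share exactly the same eigenvectors. It does not implement the FG algorithm. In this special case the eigenvectors of the pooled (summed) matrix already diagonalize every slice, which is the structure FG() and FG2() aim to approximate when the sharing is only approximate. All object names are hypothetical.

```r
set.seed(1)
p <- 3
m <- 4
# Shared orthogonal loadings
Q <- qr.Q(qr(matrix(rnorm(p * p), p, p)))

# Each slice has its own eigenvalues but the same eigenvectors
shared_cube <- array(NA_real_, dim = c(p, p, m))
for (k in seq_len(m)) {
  d <- sort(runif(p, 0.5, 3), decreasing = TRUE)
  shared_cube[, , k] <- Q %*% diag(d) %*% t(Q)
}

# In this exact case, the pooled matrix recovers the common loadings
pooled <- apply(shared_cube, c(1, 2), sum)
B <- eigen(pooled, symmetric = TRUE)$vectors

# Largest off-diagonal entry of t(B) %*% S_k %*% B for each slice
off_diag <- sapply(seq_len(m), function(k) {
  A <- crossprod(B, shared_cube[, , k] %*% B)
  max(abs(A[upper.tri(A)]))
})
```

Here `off_diag` is numerically zero for every slice. With real data the slices rarely share eigenvectors exactly, which is why an iterative procedure is needed.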
Evaluates the Flury-Gautschi log-deviation criterion for a collection of covariance matrices transformed by a loading matrix.

    log_deviation_from_diagonality(S_cube, nval, B)

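| Argument | Description |
|---|---|
| S_cube | Numeric 3D array of covariance matrices, one matrix per slice. |
| nval | Numeric vector with one entry per slice of S_cube. |
| B | Numeric loading matrix applied to each slice. |
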
Numeric scalar value equal to the log deviation-from-diagonality criterion evaluated at B.

    covs <- array(diag(2), dim = c(2, 2, 1))
    log_deviation_from_diagonality(covs, 1, diag(2))

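As a rough guide to what this criterion measures, the base R sketch below evaluates one common form of the Flury-Gautschi log-deviation: for each slice, the log-determinant of the diagonal part of `t(B) %*% S %*% B` minus the log-determinant of the full matrix, weighted by `nval` and summed. This form, including the absence of any normalization by `sum(nval)`, is an assumption, and `log_dfd_sketch` is a hypothetical name rather than the package function.

```r
log_dfd_sketch <- function(S_cube, nval, B) {
  m <- dim(S_cube)[3]
  total <- 0
  for (k in seq_len(m)) {
    A <- crossprod(B, S_cube[, , k] %*% B)  # t(B) %*% S_k %*% B
    log_det_diag <- sum(log(diag(A)))
    log_det_full <- as.numeric(determinant(A, logarithm = TRUE)$modulus)
    total <- total + nval[k] * (log_det_diag - log_det_full)
  }
  total
}

S1 <- matrix(c(2, 1, 1, 2), 2, 2)
cube <- array(S1, dim = c(2, 2, 1))

# Untransformed: diag(S1) has determinant 4, S1 has determinant 3
val_identity <- log_dfd_sketch(cube, 1, diag(2))   # log(4 / 3)

# Transformed by the eigenvectors of S1: exactly diagonal, so the value is ~0
val_eigen <- log_dfd_sketch(cube, 1, eigen(S1, symmetric = TRUE)$vectors)
```

By Hadamard's inequality each term is non-negative and vanishes only when the transformed matrix is diagonal, so smaller values indicate loadings that are closer to jointly diagonalizing the slices.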
For a fitted CAP regression, plots two diagnostics across the first
$K$ components: (1) the negative log-likelihood returned by capr()
and (2) the log deviation-from-diagonality (DfD) for the loading matrix
formed by the first $k$ directions, $k = 1, \ldots, K$. Both curves help assess the gain
from adding components.

    ## S3 method for class 'capr'
    plot(x, ...)

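| Argument | Description |
|---|---|
| x | A capr object, as returned by capr(). |
| ... | Additional arguments passed to the underlying plotting function. |
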
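The DfD criterion for the first $k$ directions is built from a deviation-from-diagonality
measure that is defined for a positive definite matrix and evaluated on each covariance
slice after projection onto those directions (see Zhao et al., 2021).
The curve shows the log DfD as a function of $k$. A common choice for
$K$ is the last point before a sudden jump in the negative
log-likelihood or log-DfD curve.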
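As a sketch of the criterion, following the definition in Zhao et al. (2021) rather than the package source, the DfD can be written as a weighted geometric mean of a per-slice measure $\nu$. The symbols $\Gamma^{(k)}$ (matrix of the first $k$ directions) and $w_i$ (slice weight) are notational assumptions made here.

```latex
\mathrm{DfD}\!\left(\Gamma^{(k)}\right)
  = \prod_{i=1}^{n}
    \nu\!\left(\Gamma^{(k)\top} S_i\, \Gamma^{(k)}\right)^{\,w_i / \sum_{j=1}^{n} w_j},
\qquad
\nu(A) = \frac{\det\{\operatorname{diag}(A)\}}{\det(A)} .
```

Since $\nu(A) \ge 1$ for any positive definite $A$, with equality only when $A$ is diagonal, the log DfD is non-negative and stays near zero as long as the selected directions jointly diagonalize the slices.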
Invisibly returns the numeric vector of log deviation values (one per component).
See also: log_deviation_from_diagonality()

    sim <- simu.capr(seed = 123L, n = 120L)
    fit <- capr(S = sim$S, X = sim$X, K = 3L)
    plot(fit)

Formats the coefficient matrix returned by capr() in a
linear-regression style table, showing the estimate for each predictor and
component.

    ## S3 method for class 'capr'
    print(x, digits = max(3L, getOption("digits") - 3L), ...)

| Argument | Description |
|---|---|
| x | An object of class capr. |
| digits | Number of significant digits to show when printing numeric values. |
| ... | Additional arguments passed on to the underlying print method. |

The input object x, invisibly.

    simu.data <- simu.capr(seed = 123L, n = 120L)
    K <- 2L
    fit <- capr(
      S = simu.data$S,
      X = simu.data$X,
      K = K
    )
    print(fit)

Displays bootstrap coefficient estimates and their confidence intervals for capr.boot objects, component by component, as compact tables.

    ## S3 method for class 'capr.boot'
    print(x, digits = max(4L, getOption("digits") - 4L), ...)

| Argument | Description |
|---|---|
| x | An object of class capr.boot. |
| digits | Number of significant digits to show when printing numeric values. |
| ... | Additional arguments passed on to the underlying print method. |

The input object x, invisibly.

    simu.data <- simu.capr(seed = 123L, n = 120L)
    K <- 2L
    fit <- capr(
      S = simu.data$S,
      X = simu.data$X,
      K = K
    )
    fit.boot <- capr.boot(
      fit = fit,
      nboot = 10L,
      max_iter = 20L,
      tol = 1e-6,
      seed = 123L
    )
    print(fit.boot)

See also: capr()
Generates a simple synthetic dataset for CAP regression consisting of a covariance cube, design matrix, and the latent orthogonal directions used to build the covariance slices.

    simu.capr(seed = 123L, n = 120L)

| Argument | Description |
|---|---|
| seed | Integer seed used for reproducibility. |
| n | Number of observations (slices) to generate. |

A list with components:
| Component | Description |
|---|---|
| S | Array of dimension $p \times p \times n$ holding the covariance slices. |
| X | Design matrix with $n$ rows. |
| Q | Orthogonal matrix whose columns are the latent directions. |
| BetaMat | True coefficient matrix used to form the eigenvalues. |
| H | Average covariance matrix. |
| p, n | The dimension and sample size supplied to the generator. |


    sim <- simu.capr(seed = 10, n = 50)
    str(sim$S)
