| Title: | Community Estimation in G-Models via CORD |
|---|---|
| Description: | Partitions data points (variables) into communities/clusters, similar to clustering algorithms such as k-means and hierarchical clustering. This package implements a clustering algorithm based on a new metric CORD, defined for high-dimensional parametric or semiparametric distributions. For more details see Bunea et al. (2020), Annals of Statistics <doi:10.1214/18-AOS1794>. |
| Authors: | Xi (Rossi) LUO [aut, cre], Florentina Bunea [aut], Christophe Giraud [aut] |
| Maintainer: | Xi (Rossi) LUO <[email protected]> |
| License: | GPL-3 |
| Version: | 0.2.0 |
| Built: | 2026-05-22 08:09:35 UTC |
| Source: | https://github.com/rluo/cord |
Partition data points (variables) into clusters/communities. Reference: Bunea et al. (2020), "Model assisted variable clustering: Minimax-optimal recovery and algorithms", Annals of Statistics, doi:10.1214/18-AOS1794.
cord(X, tau = 2 * sqrt(log(ncol(X))/nrow(X)), kendall = T, input = c("data", "cor", "dist"))
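| Argument | Description |
|---|---|
| `X` | Input data matrix. It should be an n (samples) by p (variables) matrix when `input = "data"`. |
| `tau` | Threshold to use at each iteration. A theoretical choice is about `2 * sqrt(log(p)/n)`, which is the default. |
| `kendall` | Whether to compute Kendall's tau correlation matrix from `X`. |
| `input` | Type of input `X`: one of `"data"`, `"cor"`, or `"dist"`. |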
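The default threshold and the `input` modes described above can be sketched with base R as follows. This is a minimal illustration, not the package's internals: the names `tau_default`, `R`, `fit_data`, and `fit_cor` are chosen here for illustration only, and the `cord()` calls run only if the package is installed. Because the default `tau` is written in terms of `nrow(X)`, it seems prudent to pass `tau` explicitly when `X` is a p by p matrix, since `nrow(X)` is then p rather than the sample size n.

```r
# Simulated data with the same shape as the package example (n = 200, p = 10).
set.seed(100)
n <- 200
p <- 10
X <- 2 * matrix(rnorm(n * 2), n, p) + matrix(rnorm(n * p), n, p)

# Default threshold, written out from the usage line: 2 * sqrt(log(p) / n).
tau_default <- 2 * sqrt(log(ncol(X)) / nrow(X))

# A p-by-p Pearson correlation matrix computed with base R (illustrative input).
R <- cor(X)

# Only call cord() when the package is actually installed.
if (requireNamespace("cord", quietly = TRUE)) {
  # Raw data: n-by-p matrix, correlations computed inside cord().
  fit_data <- cord::cord(X, tau = tau_default, kendall = TRUE, input = "data")
  # Precomputed correlation matrix: tau passed explicitly (see note above).
  fit_cor <- cord::cord(R, tau = tau_default, input = "cor")
}
```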
A list with one element: an integer vector giving the cluster/community to which each point (variable) is assigned.
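One way to work with the returned assignment vector is to turn it into a list of variable indices per cluster. The sketch below assumes only what is stated above (a one-element list holding an integer vector), so it accesses the element by position with `[[1]]` rather than by name. `groups_from_labels` and `demo_labels` are hypothetical names introduced here, and the `cord()` call runs only if the package is installed.

```r
# Hypothetical helper: map a label vector to a list of variable indices per cluster.
groups_from_labels <- function(labels) {
  labels <- as.integer(labels)
  split(seq_along(labels), labels)
}

# Made-up labels, used only to show the shape of the result.
demo_labels <- c(1L, 2L, 1L, 2L, 1L)
demo_groups <- groups_from_labels(demo_labels)

# With the package installed, apply the same helper to a real fit.
if (requireNamespace("cord", quietly = TRUE)) {
  set.seed(100)
  X <- 2 * matrix(rnorm(200 * 2), 200, 10) + matrix(rnorm(200 * 10), 200, 10)
  res <- cord::cord(X)
  fit_groups <- groups_from_labels(res[[1]])  # first (and only) list element
}
```

`lengths(demo_groups)` then gives the cluster sizes.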
set.seed(100)
X <- 2*matrix(rnorm(200*2), 200, 10)+matrix(rnorm(200*10), 200, 10)
cord(X)
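To see what the clustering is based on, the CORD metric can be sketched in base R. This assumes the definition from the cited paper, where the CORD distance between variables a and b is the largest absolute difference between their correlations with any third variable c. `cord_distance` is a hypothetical helper written for illustration; it is not part of the package, which may compute the metric differently, and it assumes at least three variables.

```r
# Hypothetical helper: pairwise CORD distances from a p-by-p correlation matrix R,
# assuming CORD(a, b) = max over c not in {a, b} of |R[a, c] - R[b, c]|.
cord_distance <- function(R) {
  p <- ncol(R)
  stopifnot(p >= 3, nrow(R) == p)
  D <- matrix(0, p, p)
  for (a in seq_len(p - 1)) {
    for (b in (a + 1):p) {
      others <- setdiff(seq_len(p), c(a, b))
      D[a, b] <- D[b, a] <- max(abs(R[a, others] - R[b, others]))
    }
  }
  D
}

# Toy correlation matrix with two blocks: {1, 2} and {3, 4}.
R_toy <- matrix(c(1.0, 0.8, 0.0, 0.0,
                  0.8, 1.0, 0.0, 0.0,
                  0.0, 0.0, 1.0, 0.8,
                  0.0, 0.0, 0.8, 1.0), 4, 4, byrow = TRUE)
D_toy <- cord_distance(R_toy)

# Same simulated data as the example above: in this simulation, columns with
# the same parity share a latent factor because rnorm(200*2) is recycled.
set.seed(100)
X <- 2 * matrix(rnorm(200 * 2), 200, 10) + matrix(rnorm(200 * 10), 200, 10)
D <- cord_distance(cor(X))
```

In the toy matrix, variables in the same block have distance 0 and variables in different blocks have distance 0.8, so thresholding these distances separates the two blocks.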