FDR control via Transformational Lasso & Stability Knockoffs under arbitrary transformations
Han Su · Qingyang Sun · Yinfei Kong · Gaorong Li
Beijing Normal Univ · Duke Univ · CSU Fullerton
Background, Structural Types & Existing Methods
Problem Formulation & GLasso
TLasso, StatKnock & SATLasso Methods
Theoretical Properties (6 Theorems)
Simulation: Estimation & FDR Control
Alzheimer's Disease Application & Conclusion
Examples of $D$:
• $D = I_p$: traditional variable selection
• $D = \Delta_G^{(1)}$: fused Lasso / change point detection
• $D = \Delta_G^{(2)}$: trend filtering / piecewise linear detection
• $D = [\Delta_G^{(1)T}, \Delta_G^{(2)T}]^T$: simultaneous testing of multiple structures
Incidence operator. On chain graphs this becomes fused Lasso for change point detection.
Second-order graph difference. It detects kinks in piecewise linear trends.
Stack multiple operators for chain, lattice, and general graphs.
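As a small illustration of the operators listed above, the following sketch builds first- and second-order difference matrices on a chain and stacks them. The function names, the sign convention, and the use of the classical $(p-2) \times p$ second difference are assumptions made for illustration; the graph operator $\Delta_G^{(2)}$ used in the paper may differ in sign and boundary rows.

```python
import numpy as np

def chain_first_difference(p):
    """(p-1) x p operator; row i maps beta to beta[i+1] - beta[i]."""
    D = np.zeros((p - 1, p))
    idx = np.arange(p - 1)
    D[idx, idx] = -1.0
    D[idx, idx + 1] = 1.0
    return D

def chain_second_difference(p):
    """(p-2) x p operator; row i maps beta to beta[i] - 2*beta[i+1] + beta[i+2]."""
    return chain_first_difference(p - 1) @ chain_first_difference(p)

p = 6
D1 = chain_first_difference(p)
D2 = chain_second_difference(p)
D_stacked = np.vstack([D1, D2])  # tests both structures at once

beta_const = np.array([1.0, 1.0, 1.0, 4.0, 4.0, 4.0])   # one jump
beta_linear = np.array([0.0, 1.0, 2.0, 3.0, 2.0, 1.0])  # one kink

print(D1 @ beta_const)    # nonzero only at the change point
print(D2 @ beta_linear)   # nonzero only at the kink
print(D_stacked.shape, np.linalg.matrix_rank(D_stacked))
```

The stacked operator has more rows than its rank, which is the rank-deficient situation that motivates lifting $D$.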
What's missing for arbitrary $D_{m \times p}$:
• No method handles $m \gg n$ (ultra-high dimensional $\gamma$)
• No finite-sample FDR guarantees for general transformation matrices
• No support for higher-order structures (piecewise linear, polynomial)
• No framework for testing multiple operators simultaneously
Makes structural sparsity estimable for arbitrary $D$ via lifting and noise injection.
Adds finite-sample FDR control on the TLasso-projected design.
Screens ultra-high dimensional $m$ while keeping true transformation signals.
Screen in $\gamma$-space, then reuse $\beta$ information for post-screening FDR.
Data: 65 AD patients from ADNI database, T1-weighted MRI segmented via MRICloud into 279 brain regions (Level 5), ADNI-MEM memory score as response.
Challenge: $n = 65$, $p = 279$, and under Model (5.2) $m = 378$ edges — a genuine high-dimensional structural sparsity problem requiring FDR control.
We select structural changes, not just raw coefficients.
The goal is to recover true transformed signals.
Discover signals while controlling false discoveries.
Idea: ordinary GLasso forces $\gamma$ to live in the lower-dimensional space Col$(D)$. TLasso lifts the model, then projects away nuisance directions so $\gamma$ behaves like a standard Lasso target.
$\gamma=D\beta$ lives in Col$(D)$, so it is not a free $m$-vector when rank$(D) < m$.
Augment $D$ to full row rank: $D_+=[D,N]$
Work with $\omega_+=(\omega^T,\omega_N^T)^T$ so $\gamma=D_+\omega_+$.
Remove nuisance directions and solve standard Lasso directly in $\gamma$.
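A minimal sketch of the lifting step, under stated assumptions: it builds $N$ as an orthonormal basis of the orthogonal complement of Col$(D)$ and checks that $D_+ = [D, N]$ has full row rank. The slides construct $N$ by QR decomposition; this sketch swaps in an SVD because it gives the numerical rank directly. The function name `augment_to_full_row_rank`, the tolerance, and the toy $D$ are hypothetical.

```python
import numpy as np

def augment_to_full_row_rank(D, tol=1e-10):
    """Return D_plus = [D, N], N, and rank r, where N spans the complement of Col(D)."""
    U, s, _ = np.linalg.svd(D, full_matrices=True)
    r = int(np.sum(s > tol * s.max()))   # numerical rank of D
    N = U[:, r:]                         # m x (m - r), orthonormal, N.T @ D = 0
    return np.hstack([D, N]), N, r

# Toy D: stacked first and second differences on a chain with p = 5 nodes.
p = 5
D1 = np.diff(np.eye(p), axis=0)          # (p-1) x p
D2 = np.diff(np.eye(p), n=2, axis=0)     # (p-2) x p
D = np.vstack([D1, D2])                  # m = 7 rows, rank 4

D_plus, N, r = augment_to_full_row_rank(D)

# With full row rank, any m-vector gamma can be written as D_plus @ omega_plus.
rng = np.random.default_rng(0)
gamma_free = rng.standard_normal(D.shape[0])
omega_plus = np.linalg.lstsq(D_plus, gamma_free, rcond=None)[0]

print(r, N.shape, np.linalg.matrix_rank(D_plus))
print(np.allclose(D_plus @ omega_plus, gamma_free))
```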
Construct $N$ via QR decomposition on rank-$r$ matrix $D$; form $D_+ = [D, N]$ with full row rank $m$
Generate auxiliary noise $X_N$; obtain augmented design $X_+ = [X, X_N]$
Apply projection technique to eliminate nuisance parameters: transform $(X_+, Y)$ to $(\tilde{X}, \tilde{Y})$
Solve standard Lasso on projected data: $\hat{\gamma} = \arg\min_\gamma \frac{1}{2n}\|\tilde{Y} - \tilde{X}\gamma\|_2^2 + \lambda\|\gamma\|_1$
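The final step is an ordinary Lasso, so any standard solver applies. Below is a hedged coordinate-descent sketch for the objective $\frac{1}{2n}\|\tilde{Y} - \tilde{X}\gamma\|_2^2 + \lambda\|\gamma\|_1$. It assumes the projected pair $(\tilde{X}, \tilde{Y})$ is already available; the projection itself is not reproduced here, and the names `lasso_cd`, `X_tilde`, and `Y_tilde` are hypothetical.

```python
import numpy as np

def soft_threshold(z, lam):
    """Soft-thresholding operator sign(z) * max(|z| - lam, 0)."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def lasso_cd(X, y, lam, n_iter=200, tol=1e-10):
    """Cyclic coordinate descent for (1/2n)||y - X g||^2 + lam * ||g||_1."""
    n, m = X.shape
    gamma = np.zeros(m)
    resid = np.asarray(y, dtype=float).copy()   # residual y - X @ gamma
    col_sq = (X ** 2).sum(axis=0) / n           # ||X_j||^2 / n
    for _ in range(n_iter):
        max_step = 0.0
        for j in range(m):
            if col_sq[j] == 0.0:
                continue
            # Partial correlation with coordinate j added back in.
            rho = X[:, j] @ resid / n + col_sq[j] * gamma[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            step = new - gamma[j]
            if step != 0.0:
                resid -= X[:, j] * step
                gamma[j] = new
                max_step = max(max_step, abs(step))
        if max_step < tol:
            break
    return gamma

# Sanity check on an orthogonal design (X^T X = n I), where the Lasso solution
# is the soft-thresholded least-squares coefficient.
rng = np.random.default_rng(0)
n, m = 50, 5
Q, _ = np.linalg.qr(rng.standard_normal((n, m)))
X_tilde = np.sqrt(n) * Q
gamma_true = np.array([2.0, 0.0, -1.5, 0.0, 0.05])
Y_tilde = X_tilde @ gamma_true
gamma_hat = lasso_cd(X_tilde, Y_tilde, lam=0.1)
print(gamma_hat)
```

In the noiseless orthogonal case the estimate equals `soft_threshold(gamma_true, 0.1)`, which makes the shrinkage and the zeroing of small coefficients easy to see.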
Construct transformational knockoff matrix $\widetilde{X}$ based on projected design $\tilde{X}$ from TLasso, satisfying pairwise exchangeability: swapping $\tilde{X}_j$ and $\widetilde{X}_j$ for null $j$ does not change the joint distribution $[\tilde{X}, \widetilde{X}]$.
Subsample data $L = 100$ times; for each subsample, run TLasso on both $[\tilde{X}, \widetilde{X}]$ and $[\widetilde{X}, \tilde{X}]$. Compute importance $W_j$ = probability that feature $j$ is selected in both halves — more powerful than LSM/LCD statistics.
Apply knockoff filter: $\hat{S} = \{j : W_j \geq T\}$ where $T = \min\{t : \frac{|\{j: W_j \leq -t\}|}{|\{j: W_j \geq t\}| \vee 1} \leq q\}$, which controls the modified FDR at level $q$ in finite samples; the threshold $T_1$ of Theorem 4 controls the usual FDR.
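The thresholding rule in the last step can be sketched as follows, assuming a vector of signed statistics $W$ is already computed. The function names are hypothetical. The `offset=1` option adds 1 to the numerator, the standard "knockoff+" variant; whether that coincides with the threshold $T_1$ of the slides is an assumption.

```python
import numpy as np

def knockoff_threshold(W, q, offset=0):
    """Smallest t among the nonzero |W_j| with estimated FDP at most q."""
    W = np.asarray(W, dtype=float)
    candidates = np.unique(np.abs(W[W != 0]))   # sorted ascending
    for t in candidates:
        neg = np.sum(W <= -t)                   # proxy for false discoveries
        pos = max(np.sum(W >= t), 1)            # discoveries, floored at 1
        if (offset + neg) / pos <= q:
            return float(t)
    return float("inf")                         # no feasible t: select nothing

def knockoff_select(W, q, offset=0):
    """Return the selected indices {j : W_j >= T} and the threshold T."""
    T = knockoff_threshold(W, q, offset)
    return np.flatnonzero(np.asarray(W, dtype=float) >= T), T

W = [5.0, 4.0, 3.0, -1.0, 2.0, -2.0, 0.5]
selected, T = knockoff_select(W, q=0.25)
selected_plus, T_plus = knockoff_select(W, q=0.25, offset=1)
print(selected, T)            # threshold 2.0 keeps indices 0, 1, 2, 4
print(selected_plus, T_plus)  # the offset variant finds no feasible threshold here
```

The example shows the usual trade-off: the offset version is more conservative and can return an empty set when few statistics are large.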
Theorem 1 (Oracle Inequality). With $\lambda \geq (2\sigma/\kappa)\sqrt{2\log m/n}$: (a) TLasso support $\hat{S} \subseteq S$ — no false positives; (b) $\ell_\infty$ bound $\|\hat{\gamma}_S - \gamma_S\|_\infty \leq C_n\lambda$ — estimation error vanishes at rate $\sqrt{\log m / n}$.
Theorem 2 (Selection & Sign Consistency). Under minimal signal $\min_{j \in S}|\gamma_j| \geq \varsigma_n\sqrt{\log m/n}$, TLasso achieves $P(\hat{S} = S) \to 1$ — recovers the exact active set with correct signs asymptotically.
Theorem 3 (Pairwise Exchangeability). The constructed knockoffs satisfy: (a) $[\tilde{X}, \widetilde{X}]_{\text{swap}(S^c)} \overset{d}{=} [\tilde{X}, \widetilde{X}]$; (b) $Y \perp \widetilde{X}_{S^c} \mid \tilde{X}$ — enabling the knockoff filter to control FDR without knowing null distribution.
Key insight: all conditions (minimal eigenvalue, mutual incoherence) are imposed on the projected design $\tilde{X}$, which is constructed algorithmically — not on the raw design $X$.
Theorem 4 (Finite-Sample FDR). For any $q \in (0,1)$, StatKnock controls the modified FDR: $\text{mFDR}(\hat{S}) = E\left[\frac{|\hat{S} \cap S^c|}{|\hat{S}| + 1/q}\right] \leq q$. With threshold $T_1$, also controls the usual $\text{FDR}(\hat{S}) \leq q$.
Theorem 5 (Power Guarantee). If the base procedure $\Phi$ (TLasso) achieves selection consistency, then $\text{Power}(\hat{S}) = E[|\hat{S} \cap S| / |S|] \to 1$ as $n \to \infty$ — StatKnock discovers all true signals asymptotically.
Theorem 6 (Sure Screening). SATLasso satisfies $P(\hat{S}_\gamma \supseteq S) \to 1$ — retains all true signals while reducing dimensionality from $m = O(e^{n^\xi})$ to $O(n)$, enabling subsequent StatKnock application.
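For reference, the per-run quantities whose expectations appear in Theorems 4 and 5 can be computed as below when the true support is known, as in a simulation. This is an illustrative sketch; the function name and the toy index sets are hypothetical.

```python
def selection_metrics(selected, true_support, q):
    """False discovery proportion, its modified version, and power for one run."""
    selected, true_support = set(selected), set(true_support)
    false_disc = len(selected - true_support)
    true_disc = len(selected & true_support)
    return {
        "fdp": false_disc / max(len(selected), 1),
        "mfdp": false_disc / (len(selected) + 1.0 / q),   # denominator |S_hat| + 1/q
        "power": true_disc / len(true_support) if true_support else 0.0,
    }

metrics = selection_metrics(selected=[0, 1, 2, 4], true_support=[0, 1, 2, 3], q=0.25)
print(metrics)
```

Averaging `fdp`, `mfdp`, and `power` over repeated simulation runs gives empirical estimates of FDR, mFDR, and Power.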
$n=300, p=150, m_1=20, A=0.5$ — Chain graph $G_1$ with $D=\Delta_{G_1}^{(1)}$ (piecewise constant)
| Metric | TLasso① | TLasso② | TLasso③ | GenLasso | SplitLasso |
|---|---|---|---|---|---|
| $\ell_2$ error (lower is better) | 0.947 | 0.956 | 0.942 | 1.326 | 2.148 |
| True Pos. | 19.74 | 19.77 | 19.76 | 19.95 | 3.00 |
| Model Size | 75.94 | 76.62 | 75.37 | 147.57 | 3.14 |
The three TLasso variants achieve the smallest $\ell_2$ error with a compact model size. GenLasso overfits (model size 147.57); SplitLasso is too conservative (3.00 true positives). Results are robust across the 3 noise mechanisms.
Find abnormal atrophy/expansion in hippocampus, amygdala, and cingulate regions.
Find atypical edges, including cingulate and frontal connections.
Find higher-order deviations in corpus callosum genu and precuneus regions.
Data: 65 AD patients, 279 brain regions (Level 5), ADNI-MEM memory score, $L = 100$ subsamples for StatKnock
Graph: regions connected if they share the same Level 3 anatomical parent, yielding $m = 378$ edges for $\Delta_G^{(1)}$
Methods: SATLasso-StatKnock vs BY vs SplitKnockoff at FDR level $q = 0.2$
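The graph rule above can be sketched on toy labels. The sketch assumes that every pair of regions sharing a parent is joined by an edge, which may not match the exact rule that yields $m = 378$; the labels, the function name, and the sign convention are hypothetical.

```python
import numpy as np
from itertools import combinations

def incidence_from_parents(parents):
    """Edge-by-node incidence matrix linking all nodes that share a parent label."""
    groups = {}
    for idx, parent in enumerate(parents):
        groups.setdefault(parent, []).append(idx)
    edges = [e for members in groups.values() for e in combinations(members, 2)]
    D = np.zeros((len(edges), len(parents)))
    for row, (i, j) in enumerate(edges):
        D[row, i] = -1.0    # each row encodes beta[j] - beta[i]
        D[row, j] = 1.0
    return D, edges

# Toy example: six regions under three hypothetical Level 3 parents.
parents = ["A", "A", "A", "B", "B", "C"]
D_graph, edges = incidence_from_parents(parents)
print(edges)
print(D_graph.shape)
```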
Hippocampus, amygdala, and cingulate regions match established AD biomarkers.
Additional frontal and corpus callosum signals appear under SATLasso-StatKnock.
Selected regions and connections are balanced across hemispheres.
BY and SplitKnockoff miss higher-order structures found by Model (5.3).
Oracle inequality, selection & sign consistency for arbitrary $D$
Finite-sample FDR control via stability knockoffs
Sure screening for ultra-high dimensional $m$
Novel insights into brain structural changes in Alzheimer's
Future directions: Adaptive FDR-controlled screening · Debiased TLasso for global testing · Extension to e-values & FWER control
Thank You — Questions?