Commit 3c471fe

harmonize documentation
1 parent 2439a23 commit 3c471fe

12 files changed, with 112 additions and 111 deletions.

R/addTest.R

Lines changed: 10 additions & 10 deletions
@@ -1,17 +1,17 @@
 #' Test for Additive Batch Effects
 #'
-#' \code{addTest} function will test for additive batch effects in the residuals for each feature after fitting a linear mixed effects model. Uses Kenward-Roger method. Data should be in "long" format. Depends on \code{lme4} and \code{pbkrtest} packages.
-#' @param idvar name of ID variable (character string).
-#' @param batchvar name of the batch/site/scanner variable (character string).
-#' @param features vector of names of the feature variables (character string) or the numeric indices of the corresponding columns.
-#' @param formula character string representing everything on the right side of the formula for the model, in the notation used by \code{lme4} including covariates, time, and any interactions, e.g., \code{"age + sex + diagnosis*time"} fits model with main effects age, sex, diagnosis, and time and the diagnosis*time interaction. Formula should NOT include batchvar and should NOT include random effects.
-#' @param ranef character string representing formula for the random effects in the notation used by \code{lme4}, e.g., \code{"(1|subid)"} fits a random intercept for each unique idvar \code{subid}, and \code{"(1 + time|subid)"} fits a random intercept and slope for unique \code{subid}.
-#' @param data name of the data frame that contains the variables above. Rows are different subject/timepoints (long format), columns are different variables.
-#' @param verbose prints messages (logical \code{TRUE} or \code{FALSE}). Default is \code{TRUE}.
+#' \code{addTest} function will test for additive batch effects in the residuals for each feature after fitting a linear mixed effects model. Uses Kenward-Roger method for significance testing. Data should be in "long" format. Depends on \code{lme4} and \code{pbkrtest} packages.
+#' @param idvar character string that specifies name of ID variable. ID variable can be factor, numeric, or character.
+#' @param batchvar character string that specifies name of the batch variable. Batch variable should be a factor.
+#' @param features character string that specifies names of the numeric feature variables, or the numeric indices of the corresponding columns.
+#' @param formula character string representing all fixed effects on the right side of the formula for the linear mixed effects model. This should be in the notation used by \code{lme4} and include covariates, time, and any interactions. For example, \code{"age + sex + diagnosis*time"} fits model with fixed effects age, sex, diagnosis, time, and the diagnosis*time interaction. Formula should NOT include batchvar and should NOT include random effects.
+#' @param ranef character string representing formula for the random effects in the notation used by \code{lme4}. For example, \code{"(1|subid)"} fits a random intercept for each unique idvar \code{subid}, and \code{"(1 + time|subid)"} fits a random intercept and random slope for each unique \code{subid}.
+#' @param data name of the data frame that contains the variables above. Rows are different observations (subject/timepoints), columns are different variables.
+#' @param verbose prints messages. Logical \code{TRUE} or \code{FALSE}. Default is \code{TRUE}.
 #' @return A data frame of Kenward-Roger test results for each feature.

-addTest <- function(idvar, batchvar, features,
-formula, ranef, data, verbose=TRUE){
+addTest <- function(idvar, batchvar, features,
+formula, ranef, data, verbose=TRUE){
 # make batch a factor if not already
 batch <- as.factor(data[,batchvar])
 if (verbose) cat("[addTest] found", nlevels(batch), 'batches\n')
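
The revised `formula` and `ranef` documentation describes two character strings that together specify the mixed model. The sketch below shows one way such strings could be combined into a single lme4-style formula using only base R. The column names (`thickness`, `scanner`, `subid`) are hypothetical, and the way the package assembles its formula internally is an assumption, since that code is not part of this hunk.

```r
# Hypothetical inputs that mirror the documented argument types
feature  <- "thickness"                    # assumed name of one numeric feature column
batchvar <- "scanner"                      # assumed name of the batch column
formula  <- "age + sex + diagnosis*time"   # fixed effects only: no batch, no random effects
ranef    <- "(1|subid)"                    # random intercept for each subject

# Assumed assembly: feature ~ fixed effects + batch + random effects
model_formula <- as.formula(
  paste0(feature, " ~ ", formula, " + ", batchvar, " + ", ranef)
)

# List every variable the assembled formula refers to
print(all.vars(model_formula))
```

Keeping batch and random effects out of `formula` means the caller states each piece once, and the function can add or drop the batch term as needed.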

R/batchBoxplot.R

Lines changed: 15 additions & 15 deletions
@@ -1,21 +1,21 @@
 #' Boxplot for Batch Effects
 #'
-#' \code{batchBoxplot} function will plot residuals of linear mixed effect model for a simgle feature by batch to visualize additive and multiplicative batch effects. Data should be in "long" format. Depends on \code{lme4} packages.
-#' @param idvar name of ID variable (character string).
-#' @param batchvar name of the batch/site/scanner variable (character string).
-#' @param feature name of the feature variable (character string) or the numeric index of the corresponding column.
-#' @param formula character string representing everything on the right side of the formula for the model, in the notation used by \code{lme4} including covariates, time, and any interaction, e.g., \code{"age + sex + diagnosis*time"} fits model with main effects age, sex, diagnosis, and time and the diagnosis*time interaction. Formula should NOT include batchvar and should NOT include random effects.
-#' @param ranef character string representing formula for the random effects in the notation used by \code{lme4}, e.g., \code{"(1|subid)"} fits a random intercept for each unique idvar \code{subid}, and \code{"(1 + time|subid)"} fits a random intercept and slope for unique \code{subid}.
-#' @param data name of the data frame that contains the variables above. Rows are different subject/timepoints (long format), columns are different variables.
-#' @param adjustBatch should residuals be adjusted for batch? (logical \code{TRUE} or \code{FALSE}). Use \code{FALSE} to illustrate additive (and multiplicative) batch effects. Use \code{TRUE} to illustrate only multiplicative batch effects. Default is \code{FALSE}.
-#' @param orderby \code{'mean'} orders boxplots by increasing mean; best for illustrating additive batch effects (use with \code{adjustBatch=FALSE}). \code{'var'} orders boxplots by increasing variance; best for illustrating multiplicative batch effects.
-#' @param plotMeans Should batch means be plotted on top of the boxplots? (logical \code{TRUE} or \code{FALSE}). Default is \code{TRUE}.
-#' @param colors Vector of colors the same length and order as \code{levels(as.factor(data[,batchvar]))} that determines the colors of the boxplots (character string of color names or hexadecimal codes). Default is \code{"grey"}.
-#' @param xlabel x-axis label, default is \code{'batch'} (character string).
-#' @param ylabel y-axis label, default is \code{'residuals'} (character string).
+#' \code{batchBoxplot} function will plot residuals of linear mixed effects model for a single feature by batch to visualize additive and multiplicative batch effects. Data should be in "long" format. Depends on \code{lme4} package.
+#' @param idvar character string that specifies name of ID variable. ID variable can be factor, numeric, or character.
+#' @param batchvar character string that specifies name of the batch variable. Batch variable should be a factor.
+#' @param feature character string that specifies name of the numeric feature variable, or the numeric index of the corresponding column.
+#' @param formula character string representing all fixed effects on the right side of the formula for the linear mixed effects model. This should be in the notation used by \code{lme4} and include covariates, time, and any interactions. For example, \code{"age + sex + diagnosis*time"} fits model with fixed effects age, sex, diagnosis, time, and the diagnosis*time interaction. Formula should NOT include batchvar and should NOT include random effects.
+#' @param ranef character string representing formula for the random effects in the notation used by \code{lme4}. For example, \code{"(1|subid)"} fits a random intercept for each unique idvar \code{subid}, and \code{"(1 + time|subid)"} fits a random intercept and random slope for each unique \code{subid}.
+#' @param data name of the data frame that contains the variables above. Rows are different observations (subject/timepoints), columns are different variables.
+#' @param adjustBatch should residuals be adjusted for the fixed effect of batch? Logical \code{TRUE} or \code{FALSE}. Use \code{FALSE} to illustrate additive (and multiplicative) batch effects. Use \code{TRUE} to illustrate only multiplicative batch effects. Default is \code{FALSE}.
+#' @param orderby \code{'mean'} orders boxplots by increasing mean; best for illustrating additive batch effects (use with \code{adjustBatch=FALSE}). \code{'var'} orders boxplots by increasing variance; best for illustrating multiplicative batch effects. Default is \code{'mean'}.
+#' @param plotMeans should batch means be plotted on top of the boxplots? Logical \code{TRUE} or \code{FALSE}. Default is \code{TRUE}.
+#' @param colors vector of colors the same length and order as \code{levels(as.factor(data[,batchvar]))} that determines the colors of the boxplots (character string of color names or hexadecimal codes). Default is \code{"grey"} for all.
+#' @param xlabel x-axis label (character string). Default is \code{'batch'}.
+#' @param ylabel y-axis label (character string). Default is \code{'residuals'}.
 #' @param title main title for the plot, default is no title (character string).
-#' @param verbose prints messages (logical \code{TRUE} or \code{FALSE}). Default is \code{TRUE}.
-#' @param ... other graphical parameter arguments passed to \code{par()}.
+#' @param verbose prints messages. Logical \code{TRUE} or \code{FALSE}. Default is \code{TRUE}.
+#' @param ... other graphical parameter arguments passed to \code{\link[graphics]{par}}.
 #' @return Creates a boxplot.

 batchBoxplot <- function(idvar, batchvar, feature,
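
The `colors` parameter must follow the order of `levels(as.factor(data[,batchvar]))`, which is not necessarily the order in which batches appear in the data. A minimal base R sketch of building such a vector is shown here; the data frame, the `scanner` column, and the color choices are made up for illustration.

```r
# Hypothetical long-format data with an unordered character batch column
df <- data.frame(scanner = c("siteB", "siteA", "siteC", "siteA"))

# Factor levels are sorted alphabetically by default, regardless of row order
batch_levels <- levels(as.factor(df[, "scanner"]))

# One color per level, in level order; the names are only a reading aid
colors <- setNames(c("steelblue", "tomato", "grey"), batch_levels)
print(colors)
```

Naming the vector by level makes a mismatch between batch and color easy to spot before plotting.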

R/batchTimeViz.R

Lines changed: 11 additions & 10 deletions
@@ -1,18 +1,19 @@
 #' Visualize Batches Over Time
 #'
 #' \code{batchTimeViz} is a simple function that will visualize batches over time for multi-batch longitudinal data. Data should be in "long" format.
-#' @param batchvar name of the batch/site/scanner variable (character string).
-#' @param timevar name of the time variable, e.g. age or time from baseline (character string).
-#' @param xlabel x-axis label, default is \code{'time'} (character string).
-#' @param ylabel y-axis label, default is \code{'batch'} (character string).
-#' @param title main title for the plot, default is no title (character string).
-#' @param data name of the data frame that contains the variables above rows are different subject/timepoints (long format), columns are different variables.
-#' @param verbose prints messages (logical \code{TRUE} or \code{FALSE}). Default is \code{TRUE}.
-#' @param ... other graphical parameter arguments passed to \code{par()}.
+#' @param batchvar character string that specifies name of the batch variable. Batch variable should be a factor.
+#' @param timevar character string that specifies name of numeric variable that distinguishes within-subject repeated measures, e.g., time, age, or visit. Will be plotted along x-axis.
+#' @param data name of the data frame that contains the variables above. Rows are different observations (subject/timepoints), columns are different variables.
+#' @param xlabel x-axis label (character string). Default is \code{'time'}.
+#' @param ylabel y-axis label (character string). Default is \code{'batch'}.
+#' @param title main title for the plot (character string). Default is no title.
+#' @param verbose prints messages. Logical \code{TRUE} or \code{FALSE}. Default is \code{TRUE}.
+#' @param ... other graphical parameter arguments passed to \code{\link[graphics]{par}}.
 #' @return Creates a plot.

-batchTimeViz <- function(batchvar, timevar, xlabel='time', ylabel='batch',
-title='', data, verbose=TRUE, ...){
+batchTimeViz <- function(batchvar, timevar, data,
+xlabel='time', ylabel='batch', title='',
+verbose=TRUE, ...){

 # make batch a factor if not already
 batch <- as.factor(data[,batchvar])
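
Beyond the documentation wording, this hunk moves `data` from the sixth argument position to the third, which changes how positional calls bind. The sketch below uses two stand-in functions, `old_sig` and `new_sig` (hypothetical names, not part of the package), that copy only the argument order of the two versions and return what they received.

```r
# Stand-ins that reproduce only the argument ORDER before and after the commit
old_sig <- function(batchvar, timevar, xlabel='time', ylabel='batch',
                    title='', data, verbose=TRUE, ...) {
  list(xlabel = xlabel, data = data)
}
new_sig <- function(batchvar, timevar, data,
                    xlabel='time', ylabel='batch', title='',
                    verbose=TRUE, ...) {
  list(xlabel = xlabel, data = data)
}

# Hypothetical long-format data
df <- data.frame(scanner = factor(c("A", "B")), time = c(0, 1))

# Named arguments bind the same way under both orderings
named_old <- old_sig(batchvar = "scanner", timevar = "time", data = df)
named_new <- new_sig(batchvar = "scanner", timevar = "time", data = df)

# A third positional argument is xlabel in the old order but data in the new one
pos_old <- old_sig("scanner", "time", "months", data = df)
pos_new <- new_sig("scanner", "time", df)
```

Callers that already pass `data =` by name should be unaffected by the reordering; callers that passed `xlabel` positionally would need to be updated.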

R/longCombat.R

Lines changed: 10 additions & 10 deletions
@@ -1,19 +1,19 @@
 #' Harmonize Multi-batch Longitudinal Data
 #'
 #' \code{longCombat} function will implement longitudinal ComBat harmonization for multi-batch longitudinal data. Longitudinal ComBat uses an empirical Bayes method to harmonize means and variances of the residuals across batches in a linear mixed effects model framework. Detailed methods are described in the manuscript at \url{https://www.biorxiv.org/content/10.1101/868810v4}. This is a modification of the ComBat function code from the \code{sva} package that can be found at \url{https://bioconductor.org/packages/release/bioc/html/sva.html} and \code{combat.R} that can be found at \url{https://github.com/Jfortin1/ComBatHarmonization}. Data should be in "long" format. Depends on \code{lme4} package.
-#' @param idvar name of ID variable (character string).
-#' @param timevar name of variable that distinguishes within-subject repeated measures, e.g., time, age, or visit (character string).
-#' @param batchvar name of the batch/site/scanner variable (character string).
-#' @param features vector of names of the feature variables (character string) or the numeric indices of the corresponding columns.
-#' @param formula character string representing everything on the right side of the formula for the model, in the notation used by \code{lme4} including covariates, time, and any interactions, e.g., \code{"age + sex + diagnosis*time"} fits model with main effects age, sex, diagnosis, and time and the diagnosis*time interaction. Formula should NOT include batchvar and should NOT include random effects.
-#' @param ranef character string representing formula for the random effects in the notation used by \code{lme4}, e.g., \code{"(1|subid)"} fits a random intercept for each unique idvar \code{subid}, and \code{"(1 + time|subid)"} fits a random intercept and slope for unique \code{subid}.
-#' @param data name of the data frame that contains the variables above. Rows are different subject/timepoints (long format); columns are different variables.
+#' @param idvar character string that specifies name of ID variable. ID variable can be factor, numeric, or character.
+#' @param timevar character string that specifies name of numeric variable that distinguishes within-subject repeated measures, e.g., time, age, or visit.
+#' @param batchvar character string that specifies name of the batch variable. Batch variable should be a factor.
+#' @param features character string that specifies names of the numeric feature variables, or the numeric indices of the corresponding columns.
+#' @param formula character string representing all fixed effects on the right side of the formula for the linear mixed effects model. This should be in the notation used by \code{lme4} and include covariates, time, and any interactions. For example, \code{"age + sex + diagnosis*time"} fits model with fixed effects age, sex, diagnosis, time, and the diagnosis*time interaction. Formula should NOT include batchvar and should NOT include random effects.
+#' @param ranef character string representing formula for the random effects in the notation used by \code{lme4}. For example, \code{"(1|subid)"} fits a random intercept for each unique idvar \code{subid}, and \code{"(1 + time|subid)"} fits a random intercept and random slope for each unique \code{subid}.
+#' @param data name of the data frame that contains the variables above. Rows are different observations (subject/timepoints), columns are different variables.
 #' @param niter number of iterations for empirical Bayes step. Usually converges quickly in less than 30 iterations. Default is 30.
-#' @param method method for estimating sigma in standardization step (character string). \code{'REML'} (default, more conservative type I error control) or \code{'MSR'} (more powerful, may not control type I error at nominal level).
-#' @param verbose prints messages (logical \code{TRUE} or \code{FALSE}). Default is \code{TRUE}.
+#' @param method method for estimating sigma in standardization step (character string). \code{'REML'} (default, more conservative type I error control) or \code{'MSR'} (more powerful, less conservative type I error control).
+#' @param verbose prints messages. Logical \code{TRUE} or \code{FALSE}. Default is \code{TRUE}.
 #' @return Function outputs a list including the following:
 #' \describe{
-#' \item{\code{data_combat}}{data frame with columns idvar, timevar, and ComBat harmonized data}
+#' \item{\code{data_combat}}{data frame with columns idvar, timevar, and ComBat-harmonized data for each feature}
 #' \item{\code{gammahat}}{data frame containing mean of standardized data for each batch (row) and feature (column)}
 #' \item{\code{delta2hat}}{data frame containing variance of standardized data for each batch (row) and feature (column)}
 #' \item{\code{gammastarhat}}{data frame containing empirical Bayes estimate of additive batch effects}

R/multTest.R

Lines changed: 8 additions & 8 deletions
@@ -1,13 +1,13 @@
 #' Test for Multiplicative Batch Effects
 #'
-#' \code{multTest} function will test for multiplicative batch effects in the residuals for each feature after fitting a linear mixed effects model. Uses Fligner-Killeen method. Data should be in "long" format. Depends on \code{lme4} package.
-#' @param idvar name of ID variable (character string).
-#' @param batchvar name of the batch/site/scanner variable (character string).
-#' @param features vector of names of the feature variables (character string) or the numeric indices of the corresponding columns.
-#' @param formula character string representing everything on the right side of the formula for the model, in the notation used by \code{lme4} including covariates, time, and any interactions, e.g., \code{"age + sex + diagnosis*time"} fits model with main effects age, sex, diagnosis, and time and the diagnosis*time interaction. Formula should NOT include batchvar and should NOT include random effects.
-#' @param ranef character string representing formula for the random effects in the notation used by lme4, e.g., \code{"(1|subid)"} fits a random intercept for each unique idvar \code{"subid"}, and \code{"(1 + time|subid)"} fits a random intercept and slope unique \code{"subid"}.
-#' @param data name of the data.frame that contains the variables above. Rows are different subject/timepoints (long format), columns are different variables.
-#' @param verbose prints messages (logical \code{TRUE} or \code{FALSE}). Default is \code{TRUE}.
+#' \code{multTest} function will test for multiplicative batch effects in the residuals for each feature after fitting a linear mixed effects model. Uses Fligner-Killeen method for significance testing. Data should be in "long" format. Depends on \code{lme4} package.
+#' @param idvar character string that specifies name of ID variable. ID variable can be factor, numeric, or character.
+#' @param batchvar character string that specifies name of the batch variable. Batch variable should be a factor.
+#' @param features character string that specifies names of the numeric feature variables, or the numeric indices of the corresponding columns.
+#' @param formula character string representing all fixed effects on the right side of the formula for the linear mixed effects model. This should be in the notation used by \code{lme4} and include covariates, time, and any interactions. For example, \code{"age + sex + diagnosis*time"} fits model with fixed effects age, sex, diagnosis, time, and the diagnosis*time interaction. Formula should NOT include batchvar and should NOT include random effects.
+#' @param ranef character string representing formula for the random effects in the notation used by \code{lme4}. For example, \code{"(1|subid)"} fits a random intercept for each unique idvar \code{subid}, and \code{"(1 + time|subid)"} fits a random intercept and random slope for each unique \code{subid}.
+#' @param data name of the data frame that contains the variables above. Rows are different observations (subject/timepoints), columns are different variables.
+#' @param verbose prints messages. Logical \code{TRUE} or \code{FALSE}. Default is \code{TRUE}.
 #' @return A data frame of Fligner-Killeen test results for each feature.

 multTest <- function(idvar, batchvar, features,
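
The description says `multTest` applies the Fligner-Killeen method to model residuals. As a rough illustration of that test on its own, the sketch below calls base R's `stats::fligner.test` on simulated residuals from two batches with different spread. The simulated data are invented, and whether the package calls `fligner.test` in this form is an assumption, since the function body is not shown in this hunk.

```r
set.seed(1)

# Simulated residuals: same mean in both batches, larger spread in batch B
resid_df <- data.frame(
  resid = c(rnorm(100, mean = 0, sd = 1), rnorm(100, mean = 0, sd = 3)),
  batch = factor(rep(c("A", "B"), each = 100))
)

# Fligner-Killeen test of equal variances across batches
fk <- fligner.test(resid ~ batch, data = resid_df)
print(fk$p.value)
```

A small p-value here would point to unequal residual variance across batches, which is the multiplicative batch effect the function is documented to test for.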
