shinyepico 1.4.1
shinyÉPICo is a web application based on shiny and created to accelerate and to streamline the study of Illumina Infinium DNA methylation arrays, including the 450k and EPIC. For this purpose, shinyÉPICo uses, among others, the widely used minfi (Aryee et al. 2014) and limma (Ritchie et al. 2015) packages for the array normalization and the differentially methylated positions (DMPs) calculation, respectively. Moreover, shinyÉPICo also allows calculation of differentially methylated regions calculation (DMRs), based on the mCSEA package (Martorell-Marugán, González-Rumayor, and Carmona-Sáez 2019).
It is conceived as a ‘graphical pipeline’ since, throughout the analysis, you can observe the result of the different options with multiple charts, and interactively modify decisions in current or prior steps. In addition, the application automates calculations that would otherwise take much more time and effort.
shinyÉPICo is a fully graphical application and the calculations performed in R are done entirely on the server-side. Therefore, the application can be used both on the local computer and through a web server, to which devices, such as computers, smartphones or tablets could be connected, without high RAM or CPU requirements and only a web browser as requirement.
The steps followed by the application includes the data importing, starting with iDAT files (Raw Illumina DNA methylation datasets), array normalization, quality control, and data exploring, and DMP and DMR calculation. Each step has multiple options and charts that we will describe in this manual.
ShinyÉPICo can run in GNU/Linux, Windows, or macOS. The package dependencies are automatically installed with the package.
Since the application allows to follow interactively all the analysis process, many objects have to be stored in RAM memory. Therefore, the application can be memory demanding, especially when trying to analyze a large number of samples at the same time. We recommend at least 12GB of RAM for smooth use of the application, but depending on the number of samples analyzed and whether they are EPIC or 450k, the needs may be lower or higher.
The package will be available soon from Bioconductor, where it can be installed following standard instructions:
if (!requireNamespace("BiocManager", quietly = TRUE))
install.packages("BiocManager")
BiocManager::install("shinyepico")
Currently, you can install the application through GitHub using the ‘remotes’ R package:
install.packages("remotes")
library("remotes")
install_github("omorante/shinyepico", upgrade="always", dependencies = TRUE)
ShinyÉPICo implements internal functions to follow the array normalization, and the DMP/DMR calculations, including, for example, the differential of beta values calculation between the groups. For the main features of the application, the package dependencies are the following:
The complete list of the application dependencies are:
DT (>= 0.15.0),
data.table (>= 1.13.0),
doParallel (>= 1.0.0),
dplyr (>= 1.0.9),
foreach (>= 1.5.0),
GenomicRanges (>= 1.38.0),
ggplot2 (>= 3.3.0),
gplots (>= 3.0.0),
heatmaply (>= 1.1.0),
limma (>= 3.42.0),
minfi (>= 1.32.0),
plotly (>= 4.9.2),
reshape2 (>= 1.4.0),
rlang (>= 1.0.2),
rmarkdown (>= 2.3.0),
rtracklayer (>= 1.46.0),
shiny (>= 1.5.0),
shinyWidgets (>= 0.5.0),
shinycssloaders (>= 0.3.0),
shinyjs (>= 1.1.0),
shinythemes (>= 1.1.0),
statmod (>= 1.4.0),
tidyr (>= 1.2.0),
zip (>= 2.1.0)
For DMR calculation, additional suggested packages are required. You can use the application without installing this package, but you will not able to calculate DMRs. Moreover, for the array normalization, you will need to install annotation and manifest packages, for 450k or EPIC.
knitr (>= 1.30.0),
mCSEA (>= 1.10.0),
IlluminaHumanMethylation450kanno.ilmn12.hg19,
IlluminaHumanMethylation450kmanifest,
IlluminaHumanMethylationEPICanno.ilm10b4.hg19,
IlluminaHumanMethylationEPICmanifest,
testthat,
minfiData
And, to run the application:
library("shinyepico")
set.seed(123)
run_shinyepico()
The function run_shinyepico has 4 parameters that can be modified:
As can be seen in the example, we used the set.seed function before starting ShinyÉPICo. We recommend always setting up the same seed, which can be any number, to avoid uncertainties and ensure that the results obtained are reproducible. Particularly, DMRs calculation, that rely on the mCSEA package, utilizes permutations to estimate the p.value. For that reason, some uncertainty is expected, and results can be a bit different every time you run the application. Using a seed avoids this problem, as you will obtain always the same result with the same seed.
shinyÉPICO workflow is divided into tabs that summarize the different steps to follow:
While it is possible to go back to previous steps and change options at any point in the process, some buttons are disabled until the steps required to perform them are completed. For example, you will not be able to generate a heatmap if you do not complete the normalization of the arrays and the calculation of the DMPs. This design is intended to avoid errors and guide the user through the analysis steps.
In the next sections, we will discuss the different steps, the options, and chart interpretation. Moreover, we will use a public DNA methylation array dataset to show an example of application use.
In the following example, iDATs from a study using Illumina EPIC DNA methylation arrays will be used to illustrate the steps (Li et al. 2020). Although shinyÉPICo can be used with multiple groups and large cohorts, and in that case, iteration and automation are more useful, for this manual we have decided to use a minimal example that can be easily reproduced on almost any computer. Specifically, monocyte and monocyte-derived macrophages (produced with M-CSF) has been selected, only 6 samples (3 monocyte samples and 3 macrophage samples from 3 different healthy donors).
The .zip file of this dataset can be found in the Github repository
In this example, the seed 123 has been set up using the function set.seed() before run_shinyepico().
The first step in the shinyÉPICo workflow is to prepare the data in the proper format. iDAT files should be compressed in a .zip file. The name of the files should follow the standard convention: XXXXXXXXXXXX_YYYYYY_ZZZ.idat being XXXXXXXXXXXX the Sentrix_ID, YYYYYY the Sentrix_Position, and ZZZ Grn or Red (corresponding, respectively, to the Red and Green signal file).
Moreover, a CSV (comma-separated) file with the annotation of the experiment should be included. It is mandatory to include the Sentrix_ID and Sentrix_Position columns that allow the software to find their respective iDAT files. Moreover, other columns should be added to reflect the different variables (e.g. sample name, health/disease, treatment/control, age, sex, hybridization day, etc.). Usually, iDATs and sample sheet have these features by default, and no additional work is required.
In this regard, 3 parameters are required in the Input tab to continue the analysis:
A column with the sample names, that should be unique, without duplicates.
A column with the variable of interest, in which sample groups will be found to calculate DMPs and DMRs.
Optionally, a donor variable is also included in the Input tab. If you have an experiment where several samples come from the same donor, it is recommended to add a column with this information and select it in the form. This information is used in the SNPs heatmap of the Normalization section (as we will discuss below), and also for the DMPs/DMR calculation. If you select a valid donor variable, this information is automatically added as covariable of the linear model in the DMPs section, and, therefore a “paired analysis” will be performed. If you do not have or you would not like to use this information, you can select the same column as sample names.
An aspect to take into account is that shinyÉPICo autodetects the variable types (numerical or categorical) depending on whether or not they can be coerced to numbers. For this purpose, variables are coerced to a numeric vector, and when the generated NAs are less than 75% of the total, the variable is set as numeric, and, otherwise, as categorical. Moreover, not informative categorical variables are excluded (e.g., a variable with the same value in all the samples, or with not equal values). Therefore, numbers should not be used for categorical variables in the sample sheet. Numerical variables with some NAs but less than 75% can be used in the exploratory analysis but not as covariables of the linear model, since Limma needs a design matrix without missing values.
In the next table, you can see the sample sheet of the example dataset:
Sample_Name | Sample_Well | Sample_Plate | Sample_Group | Donor | Pool_ID | Sentrix_ID | Sentrix_Position | Scan_Date | barcode |
---|---|---|---|---|---|---|---|---|---|
MAC A | B05 | SMET0256 | MAC | A | Epic | 202163550095 | R02C01 | 27-Apr-18 | 202163550095_R02C01 |
MO B | D05 | SMET0256 | MO | B | Epic | 202163550095 | R04C01 | 27-Apr-18 | 202163550095_R04C01 |
MAC C | F05 | SMET0256 | MAC | C | Epic | 202163550095 | R06C01 | 27-Apr-18 | 202163550095_R06C01 |
MO A | G05 | SMET0256 | MO | A | Epic | 202163550095 | R07C01 | 27-Apr-18 | 202163550095_R07C01 |
MO C | E06 | SMET0256 | MO | C | Epic | 202163550097 | R05C01 | 27-Apr-18 | 202163550097_R05C01 |
MAC B | G06 | SMET0256 | MAC | B | Epic | 202163550097 | R07C01 | 27-Apr-18 | 202163550097_R07C01 |
It includes the Sentrix_ID and Sentrix_Position mandatory columns, and also other columns relevant for the experiment such as Sample_name, Sample_group, and donor.
After selecting the proper column in each form field, and the samples to be included (in this example, all), it is possible to proceed with the next step, pressing the “Continue” button. When the application is performing an operation expected to take time, a progress bar is displayed at the bottom right of the window to inform the user that the application is busy.
Internally, shinyÉPICO uses the “read.metharray.sheet” and “read.metharray.exp” functions to read the sample sheet and load the DNA methylation data, respectively.
After loading the methylation data of the selected samples, the Normalization tab is automatically shown, where you can see an overview of the data quality and perform the array normalization.
First, the quality control tab shows two useful charts to identify bad samples.
On the one hand, the QC Signal plot shows the median methylated (mMed) and unmethylated (uMed) of each sample array. When mMed or uMed of a sample is less than 10, it is considered “Suboptimal”. Although this cutoff is arbitrary and it is the user who decides whether a sample is valid or not, a signal much lower than this threshold or very different from most samples indicates that there has been a problem with it. Depending on whether the signal is very far from this threshold and depending on the rest of the results, these types of samples should be excluded from the analysis. To exclude samples, you can go back to the Input tab to change the samples selected and load them again.
On the other hand, the Bisulfite conversion plot is calculated using the information of the bisulfite conversion II control probes of the 450k/EPIC arrays. When the bisulfite conversion reaction is successful, the probe of the control position will have intensity in the Red channel, whereas if the sample has unconverted DNA, the probe will have a high signal in the Green channel.
For each sample, we calculate all the Red/Green ratios for each control position, and the minimum ratio is shown in the chart. When the ratio of a sample is lower than 1.5, we flag it as “Failed”.
A correct array normalization is essential for the results of the subsequent steps in the workflow. shinyÉPICo can use all the normalization methods available in the minfi package:
Raw: Without normalization.
Illumina: A reverse-engineered implementation of the Genome Studio normalization.
Funnorm: A between-array normalization method that relies in data from control probes of the arrays. It performs also Noob normalization before the Functional Normalization.
Noob: A within-array normalization method with dye-bias normalization.
SWAN: A within-array normalization method that allows Infinium I and Infinium II probes to be normalized together.
Quantile: A between-array normalization method that assumes no global differences in methylation between the samples. When global changes are expected, such as in cancer samples, other methods, such as Funnorm, are recommended.
Additionally, it offers the option of performing Noob within array normalization followed by Quantile normalization (Noob + Quantile), analogously to the Funnorm method. This non-standard approach has empirically shown good results in our experience.
For details about the methods, the minfi documentation and the respective publications are good references. Although depending on the experimental design and the type of changes expected a normalization method can be expected to be better than another, usually the best method should be determined empirically among the valid options.
Additionally, shinyÉPICo offers three more options in the Normalization tab:
Drop CpHs: When this option is selected, CH probes (non-CpG methylation) are removed. It uses the minfi::dropMethylationLoci function with standard parameters.
Drop SNPs: When this option is selected, positions with SNPs annotated by Illumina are removed, with a MAF (minimum allele frequency) higher than the cutoff selected are removed. It uses the minfi::dropLociwithSnps function with standard parameters.
Drop X/Y Chr: When this option is selected, all the positions of the sex chromosomes are removed. Since the methylation values of these chromosomes can be very different depending on the sex of the donors, removing them can help to homogenize the data. Alternatively, or additionally, sex information can be added to the linear model.
Moreover, Using the minfi recommended value, the genomic positions of the final obtained RGChannelSet object (Raw Data) are filtered by the detection p.value (using the minfi::detectionP function) in the normalized data, removing all the positions with an average p.value higher of 0.01.
shinyÉPICo simplify the testing of different methods and options. When a normalization method is selected, clicking the “Select” button, the different charts are automatically generated.
In this example, we are going to select the Quantile normalization. We show different examples of the type of charts generated by the software.
This chart shows the distribution of the beta values of every sample. A bimodal distribution is expected, with two peaks centered around 0 (unmethylated) and 1 (methylated) beta values. When more peaks are shown in the middle, or when the pattern of different samples is very different, it indicates problems in some step of the sample preparation or hybridization. After normalization, we expect an improved alignment of the sample patterns, with little discrepancies between them.
Raw:
Quantile:
This chart shows the box-and-whisker plot of beta values in each sample. The median and interquartile range of the samples should be similar, and more homogeneous after the normalization.
Raw:
Quantile:
This heatmap utilizes information of a special subset of single nucleotide polymorphism (SNP) probes in the 450k and EPIC arrays. For the representation, we use the beta values of these probes (extracted using the function minfi::getSnpBeta()). The heatmap shows a column clustering, that should correspond with the donor information. For example, in this experiment, with 3 different treatments and 3 donors, we can see a clear clustering by donor.
Depending on the median X and Y chromosomes intensities, we show the sex prediction generated with the function minfi::getSex(). Note that, even if the Drop X/Y Chr. option is selected, this prediction is done before that step, and this graph will not be altered.
In this example, the three donors are males, as can be observed in the chart.
In order to get a first overview of the data before the DMP/DMR calculation, we provide two more charts.
First, we show the principal component analysis (PCA). Selecting specific principal components to plot in the x and y axis is possible, as well as changing the color variable. Moreover, a table showing the percentage of variance explained in each principal component is also depicted.
Secondly, we also provide a heatmap of the correlations between principal components and variables. As we explained above, variable types (categorical or numerical) are guessed, and not informative variables are excluded.
In the correlations tab, a detailed table of the variables is shown, including the type (categorical variables are shown as “factor”, numerical variables as “numeric”, and discarded variables as “discarded”).
Pearson correlation is applied to correlate principal components with numerical variables. For categorical variables, linear models (principal component ~ categorical variable) are generated and R-squared statistics are shown in the representation. Alternatively, the p.values associated with these statistical approaches can also be plotted in a heatmap.
Both PCA and correlations plots can be also useful to find possible variables affecting methylation data, in order to select covariates and interactions for DMP/DMR calculation.
Correlation value:
P value:
After normalizing the methylation data, the next tab, “DMPs”, will become enabled. The process of DMP calculation is divided into two parts: model generation and contrasts calculation.
In order to calculate DMPs between groups of the variable of interest, shinyÉPICo uses the limma package. First, shinyÉPICo fits a linear model using the M values, with the limma::lmFit function (see the limma user guide for more details.)
Beta values are used in all the charts of the application, because of its easier biological interpretation: Beta values range from 0 (totally unmethylated) to 1 (totally methylated).
However, for the statistical analysis with Limma, shinyÉPICo utilizes the M values (calculated as the logit of Beta values). M values range from -Infinite to Infinite.
Beta values are not as suitable to use with limma, because they have severe heteroscedasticity for highly methylated or unmethylated positions. For that reason, M values (homoscedastic data) are used for the limma model, whereas Beta values are used for representation (Du et al. 2010).
Several options can be set in this step:
Variable of interest: This variable, that should be categorical, will be used to perform the contrasts to generate the DMPs.
Linear model covariables: When other variables different from the variable of interest can be influencing the DNA methylation differences, they can be included in the linear model. Limma will take into account these differences and correct them in the result. For example, DNA methylation is very dependent on the donor, and, in experiments with several samples of the same donor, it is generally advisable to add the donor as a covariable. If you have selected a donor covariable in the Input tab, this option will be selected by default.
Linear model interactions: When you suspect that a variable can be affecting DNA methylation depending on another variable (for example, changes of DNA methylation between the sex that are specific of age), an interaction term can be added to the model.
Array Weights: If this option is enabled, estimated relative quality weights for each array are calculated and used in the limma model, using the function limma::arrayWeights. This option ponderates the influence of the arrays in the model depending on the calculated qualities. It is especially useful with large datasets of heterogeneous quality. Ritchie et al (2006)
When the linear model is generated, a diagnosis chart is plotted. It represents the square root of the standard deviation versus the average log expression of each position. We would expect that, as seen in the example, there is a flat relationship between both variables.
Moreover, the design matrix used to generate the model is also shown:
MAC | MO | DonorB | DonorC |
---|---|---|---|
1 | 0 | 0 | 0 |
0 | 1 | 1 | 0 |
1 | 0 | 0 | 1 |
0 | 1 | 0 | 0 |
0 | 1 | 0 | 1 |
1 | 0 | 1 | 0 |
ShinyÉPICo autodetects all the possible comparisons between the groups found in the variable of interest. The number of pairwise combinations between the different groups is calculated as follows: n! / (2! (n ― 2)!), where n is the number of groups.
Then, the application iterates over the contrasts and calculates the tables with the statistics produced by limma, using the functions limma::contrasts.fit, limma::eBayes and limma::topTable. Two more options can be set in this section:
Moreover, shinyÉPICo also calculates the differential of betas between groups, for each position. This information will be used in the next steps.
shinyÉPICo provides plenty of options to filter the statistics produced by limma and to generate a custom heatmap.
Filtering options
With these options, a summary table with the DMPs found in each contrast is shown. In this example:
contrast | Hypermethylated | Hypomethylated | total |
---|---|---|---|
MAC-MO | 2275 | 21 | 2296 |
The differential of beta is always calculated as the subtraction of the average value from the first group to the second group (in this example, MOavg - MACavg) and the positions are assigned to the “Hypermethylated” group if the differential is greater than 0, and to the “Hypomethylated” group if the differential is lower than 0.
Group options
Groups to plot: Groups to be shown in the heatmap. When the column dendrogram is disabled, the order of the samples follows the order in this parameter. Drag and drop is enabled to reorder the groups in the form.
Contrasts to plot: DMPs from the contrasts selected in this parameter are shown in the heatmap. Since several contrasts can share common DMPs, the heatmap shows only these common DMPs once, without duplicates.
Data options
Clustering options
Additional options
Since heatmap plotting can be very time- and memory-consuming, we have fixed a limitation of 12.000 positions (rows) in the heatmap. If you try to plot more positions, you will see a message with this information. However, if you need a bigger heatmap, you can download the information to generate it in the Export tab, as we will discuss below
In the next tab of the DMPs section, DMPs annotation, the information about the genes associated with the DMPs is provided in a table. Moreover, the boxplot of a selected position can be plotted.
The information on the table can be downloaded in CSV or XLSX (Excel) formats, using the buttons above it. However, only rows shown can be downloaded. It is necessary to display all the information in order to download it.
In addition to DMP calculation, shinyÉPICO also supports DMR calculation. A DMR is defined as a genomic region, including several positions, with differential methylation between groups. Instead of checking individual changes in genomic positions, DMR calculation involves the comparison of groups of genomic positions, which can be clusterized in several ways depending on the specific algorithm used.
DMR calculation of shinyÉPICO is based on the package mCSEA which searches DMRs in predefined regions: promoters, gene bodies and CpG islands.
mCSEA uses an algorithm based on the gene set enrichment analysis (Subramanian et al. 2005). Instead of genes, mCSEA uses a sorted list of genomic positions compared between two conditions.
Therefore, we use the eBayes result of limma of each contrast, sorted by the t-statistic, as the input of mCSEA. This implies that the parameters introduced in the limma model, including variables and covariates, are also taken into account for the calculation of DMRs, without the need of additional calculations.
In the DMRs tab, some options can be set for the calculation generated by the mCSEA package:
In the DMR section, shinyÉPICo provides the same options to filter the data and customize the data as in the DMP section.
Note that the default differential of beta threshold for DMRs is 0, in contrast with DMPs (0.2). This is justified considering that DMRs are calculated using the information of the methylation data of several positions, and that small changes may also be relevant if they are consistent. With these options, a summary table with the DMRs found in each contrast is shown. In this example:
contrast | Hypermethylated | Hypomethylated | total |
---|---|---|---|
MAC-MO | 64 | 78 | 142 |
MAC-MO | 160 | 111 | 271 |
MAC-MO | 45 | 62 | 107 |
In order to calculate the differential of beta for DMRs, all the positions in DMRs are considered, and the average difference is used. The differential of beta is always calculated as the subtraction of the average value from the first group to the second group (in this example, MOavg - MACavg) and the positions are assigned to the “Hypermethylated” group if the differential is greater than 0, and to the “Hypomethylated” group if the differential is lower than 0.
All the options of the DMP heatmap are also in the DMR heatmap. The explanation of the options can be found in the DMPs section.
In the Single DMR plot tab, a table with the DMRs of a selected contrast and region is shown. A DMR can be selected to be plotted in its genomic context.
In the Export tab, shinyÉPICo allows the user to download the results of the analysis. Initially, all the download buttons are disabled, and they only become enabled when the information to generate the objects is available.
Up to 9 R Objects can be download, selecting them in the form. They are generated in RDS format, using the saveRDS function. Therefore, they can be easily read in R using the function readRDS. We are going to describe the objects:
RGSet: An object of the RGChannelSet class (see the minfi vignette) with the raw information of the loaded arrays.
GenomicRatioSet: An object of the GenomicRatioSet class (see the minfi vignette) with the normalized information of the loaded arrays.
Fit: An MArrayLM object generated using the limma::lmFit function. It contains the linear model used to calculate DMPs.
Design: A matrix with the design information, with the variable of interest, covariates and interactions.
Ebayestables: A named list with dataframes including the information of the limma statistics, one for each contrast.
Bvalues: A dataframe with the normalized beta values of the experiment.
Mvalues: A dataframe with the normalized M values of the experiment.
Globaldifs: A dataframe with the calculated means and differentials of beta for each group and contrast, respectively.
DMR_results: A list of objects generated by the mCSEA::mCSEATest function, with the statistics of the DMR analysis for each contrast.
DMPs or DMRs genomic positions can be downloaded in BED format. The list can be generated with the hypomethylated and hypermethylated lists of DMPs/DMRs in each contrast, or with the heatmap clustering (if the Row Colors option is enabled). Additionally, a list with the background (all the genomic positions or genomic regions tested) is also generated.
Moreover, the genome can also be selected (hg19 or hg38 genome). Since 450k and EPIC arrays are annotated in hg19, when hg38 is selected, shinyÉPICo uses the function rtracklayer::liftOver to convert the genomic coordinates. It should be note that, as a result of this process, some positions may not be converted, and will not appear in the final lists.
This option is convenient to produce some typical analysis before the DMP/DMR calculation, such as Gene Ontology Over-representation (with, for example, GREAT website) or motif enrichment analysis (with, for example, the HOMER application).
An HTML file with all the options selected during the analysis, and most tables and charts produced, can be generated and downloaded. This report is intended as a reference that indicates all the details of the analysis that can be consulted at any time and reproduced again in shinyÉPICo if necessary.
This option creates a custom R script, including the parameters selected in the analysis, the main functions that shinyÉPICo has used to produced the results and the pipeline followed, from the data importing to the DMP/DMR calculation, DMP/DMR filtering and heatmap(s) generation. The user only has to select the proper folder with the unzipped data used in the analysis (sample sheet and iDAT files), and running the script will produce the same objects as shinyÉPICo.
Both DMP and DMR heatmaps can be downloaded in PDF format. This high quality vector format allows the user to modify heatmaps using software such as Inkscape or Adobe Illustrator.
#> R version 4.2.0 (2022-04-22)
#> Platform: x86_64-pc-linux-gnu (64-bit)
#> Running under: Ubuntu 20.04.4 LTS
#>
#> Matrix products: default
#> BLAS: /home/biocbuild/bbs-3.15-bioc/R/lib/libRblas.so
#> LAPACK: /home/biocbuild/bbs-3.15-bioc/R/lib/libRlapack.so
#>
#> locale:
#> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=en_GB LC_COLLATE=C
#> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
#> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] BiocStyle_2.24.0
#>
#> loaded via a namespace (and not attached):
#> [1] bookdown_0.26 digest_0.6.29 R6_2.5.1
#> [4] jsonlite_1.8.0 magrittr_2.0.3 evaluate_0.15
#> [7] highr_0.9 stringi_1.7.6 rlang_1.0.2
#> [10] cli_3.3.0 jquerylib_0.1.4 bslib_0.3.1
#> [13] rmarkdown_2.14 tools_4.2.0 stringr_1.4.0
#> [16] xfun_0.31 yaml_2.3.5 fastmap_1.1.0
#> [19] compiler_4.2.0 BiocManager_1.30.17 htmltools_0.5.2
#> [22] knitr_1.39 sass_0.4.1
Aryee, Martin J., Andrew E. Jaffe, Hector Corrada-Bravo, Christine Ladd-Acosta, Andrew P. Feinberg, Kasper D. Hansen, and Rafael A. Irizarry. 2014. “Minfi: A flexible and comprehensive Bioconductor package for the analysis of Infinium DNA methylation microarrays.” Bioinformatics 30 (10): 1363–9. https://doi.org/10.1093/bioinformatics/btu049.
Du, Pan, Xiao Zhang, Chiang Ching Huang, Nadereh Jafari, Warren A. Kibbe, Lifang Hou, and Simon M. Lin. 2010. “Comparison of Beta-value and M-value methods for quantifying methylation levels by microarray analysis.” BMC Bioinformatics 11 (1): 587. https://doi.org/10.1186/1471-2105-11-587.
Law, Charity W., Yunshun Chen, Wei Shi, and Gordon K. Smyth. 2014. “Voom: Precision weights unlock linear model analysis tools for RNA-seq read counts.” Genome Biology 15 (2). https://doi.org/10.1186/gb-2014-15-2-r29.
Li, Tianlu, Antonio Garcia-Gomez, Octavio Morante-Palacios, Laura Ciudad, Sevgi Özkaramehmet, Evelien Van Dijck, Javier Rodríguez-Ubreva, Alejandro Vaquero, and Esteban Ballestar. 2020. “SIRT1/2 orchestrate acquisition of DNA methylation and loss of histone H3 activating marks to prevent premature activation of inflammatory genes in macrophages.” Nucleic Acids Research 48 (2): 665–81. https://doi.org/10.1093/nar/gkz1127.
Martorell-Marugán, Jordi, Víctor González-Rumayor, and Pedro Carmona-Sáez. 2019. “MCSEA: Detecting subtle differentially methylated regions.” Bioinformatics 35 (18): 3257–62. https://doi.org/10.1093/bioinformatics/btz096.
Phipson, Belinda, Stanley Lee, Ian J. Majewski, Warren S. Alexander, and Gordon K. Smyth. 2016. “Robust hyperparameter estimation protects against hypervariable genes and improves power to detect differential expression.” Annals of Applied Statistics 10 (2): 946–63. https://doi.org/10.1214/16-AOAS920.
Ritchie, Matthew E., Belinda Phipson, Di Wu, Yifang Hu, Charity W. Law, Wei Shi, and Gordon K. Smyth. 2015. “Limma powers differential expression analyses for RNA-sequencing and microarray studies.” Nucleic Acids Research 43 (7): e47. https://doi.org/10.1093/nar/gkv007.
Subramanian, Aravind, Pablo Tamayo, Vamsi K. Mootha, Sayan Mukherjee, Benjamin L. Ebert, Michael A. Gillette, Amanda Paulovich, et al. 2005. “Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles.” Proceedings of the National Academy of Sciences of the United States of America 102 (43): 15545–50. https://doi.org/10.1073/pnas.0506580102.