Download
The regenie source code is hosted on Github.
Installation
Pre-requisites
regenie requires compilation with GCC version >= 5.1 (on Linux) or Clang version >=3.3 (on Mac OSX). It also requires having GFortran library installed.Pre-compiled binaries
Pre-compiled binaries are available in the Github repository. These are provided for Linux (including Centos7) and Mac OSX computing environments and are statically linked. For the Linux binaries, users should have GLIBC version >= 2.22 installed. Additionally, they are provided compiled with Intel MKL library which will provide speedups for many of the operations done in regenie.
Standard installation
- regenie requires the BGEN library so you will need to download and install that library.
- Edit the BGEN_PATH variable in the
Makefile
to the BGEN library path. - On the command line type
make
while in the main source code directory. - This should produce the executable called
regenie
.
regenie has been enhanced to allow for gzip compressed input
(for phenotype/covariate files) and output (for association results files)
using the Boost Iostream library.
If this library is installed on the system, you should compile using
make HAS_BOOST_IOSTREAM=1
.
Furthermore, we have enabled compilation of regenie with
the Intel Math Kernel (MKL) library. You first need to have it installed
on your system and modify the MKLROOT variable in the Makefile
to the installed MKL library path.
With CMake
You can compile the binary using CMake version >=3.13 (instead of make
as above).
mkdir -p build
cd build
BGEN_PATH=<path_to_bgen_lib> cmake ..
make
This will generate the binary in the build/
subdirectory.
To use with Boost Iostreams and/or Intel MKL library,
add the corresponding flags before the cmake
command on line 3
(e.g. BGEN_PATH=<path_to_bgen_lib> HAS_BOOST_IOSTREAM=1 cmake ..
).
With Docker
Alternatively, you can use a Docker image to run regenie. A guide to using docker is available on the Github page.
With conda
To install with conda, you can use the following commands:
# create new environment
conda create -n regenie_env -c conda-forge -c bioconda regenie
# load it
conda activate regenie_env
Computing requirements
We have tested regenie on 64-bit Linux and 64-bit Mac OSX computing environments.
Note that for Mac OSX computing environments, compiling is done without OpenMP, as the library is not built-in by default and has to be installed separately.
Memory usage
In both Step 1 and Step 2 of a regenie run the genetic data file is read once, in blocks of SNPs, so at no point is the full dataset ever stored in memory.
regenie uses a dimension reduction approach using ridge regression to produce a relatively small set of genetic predictors, that are then used to fit a whole-genome regression model. These genetic predictors are stored in memory by default, and can be relatively large if many phenotypes are stored at once.
For example, if there are phenotypes, SNPs and samples, and a block size of SNPs is used with ridge parameters, then regenie needs to store roughly doubles per phenotype, which is 8Gb per phenotype when and 200Gb in total when .
However, the --lowmem
option can be used to avoid that memory usage,
at negligible extra computational cost, by writing temporary files to disk.
Threading
regenie can take advantage of multiple cores using threading. The
number of threads can be specified using the --threads
option.
regenie uses the Eigen library for efficient linear algebra operations and this uses threading where possible.
For PLINK bed/bim/fam files, PLINK2 pgen/pvar/psam files, as well as BGEN v1.2 files with 8-bit encoding (format used for UK Biobank 500K imputed data), step 2 of regenie has been optimized by using multithreading through OpenMP.
When running the SKAT/ACAT gene-based tests, we recommend to use at most 2 threads and instead parallelize the runs over partitions of the genome (e.g. groups of genes).
For Windows platforms
If you are on a Windows machine, we recommend to use Windows Subsystem for Linux (WSL)
to install a Ubuntu distribution so that you will be able to run REGENIE
from a Linux terminal.
You can download pre-compiled REGENIE binaries from the Github repository
(note that you will need to install the libgomp1
library).
Note: from your Windows command prompt, you can run REGENIE using wsl regenie
.