General
Abstract
Structural genomic variations play an important role in human disease and phenotypic diversity. With the rise of high-throughput sequencing tools, mate-pair/paired-end sequencing has become an important technique for the detection and exploration of structural variation. ViVar is a comprehensive analysis platform for the processing, analysis and visualization of structural variation based on sequencing data or genomic microarrays, enabling the rapid identification of disease loci or genes. ViVar allows you to scale your analysis with your workload over multiple (cloud) servers, has user access control to keep your data safe but still easy to share, and is easily expandable as analysis techniques advance.
Data
To help you explore ViVar, we loaded some demo experiments using real (non-simulated) human data. When you're logged in, you can click the links in the table below to zoom to interesting genomic regions of the sample experiments.
Experiment | Description | Links |
---|---|---|
Patient 12 | Mate-pair sequencing data of duplication and deletion on chromosome 1 | View chr1 region |
Multiple Patients | Clustering analysis detecting a balanced aberration, an inversion. This would not be detectable by coverage or genomic microarray analysis. | View chr4 region |
Patient 28 | Mate-pair sequencing data and genomic microarray data of a complex duplication on the long arm of the X chromosome | View chrX region |
Patient 29 | Mate-pair sequencing data and genomic microarray data of a complex trisomy 21 | View chr21 region |
Technical Documentation
1. Requirements:
- General
ViVar is a PHP 5 application powered by the Slim framework and the Twig templating engine.
It is available as a standalone web application or can be served through a Docker container.
The ViVar Docker image is an easily distributable installation of the ViVar platform, based on the official Nginx image available at Docker Hub, which is in turn based on Debian:jessie.
All internal processes are monitored by Supervisord, and the whole setup depends on a separately run Mongo database.
- Hard- and Software
The ViVar web app requires three separate systems to run: the website front end, the analysis back-end and a Mongo database.
The website front end runs as a PHP 5 application and thus requires the following packages to be installed, available through apt-get on Ubuntu, Debian, etc.:
- php5-fpm
- php5-mongo
- php5-gd
- php5-curl
- curl
- samtools
- mongodb-clients
The analysis back-end requires a compute cluster with Torque. This cluster should have access to a folder shared with the web server, where the job submission scripts can be written. A shared folder for the raw data to be analyzed should also be available. Software-wise, the cluster must have the following executables installed and available in the PATH environment variable:
- bowtie2
- bedtools
- samtools
- R (>=3.2.2) with libraries
- Bioconductor-QDNAseq
- Bioconductor-CGHcall
- Perl with Mojolicious
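The PATH requirement above can be checked before any jobs are submitted. The following is a minimal Python sketch, not part of ViVar itself: the command names `R` and `perl` are assumptions for how the R and Perl interpreters are invoked on your cluster, and the helper `missing_executables` is a hypothetical name introduced for illustration.

```python
import shutil

# Executable names taken from the list above; "R" and "perl" are the
# assumed command names for the R and Perl interpreters.
REQUIRED_EXECUTABLES = ["bowtie2", "bedtools", "samtools", "R", "perl"]


def missing_executables(names=REQUIRED_EXECUTABLES, path=None):
    """Return the names that cannot be found on PATH (or on `path` if given)."""
    return [name for name in names if shutil.which(name, path=path) is None]


if __name__ == "__main__":
    missing = missing_executables()
    if missing:
        print("Not found in PATH: " + ", ".join(missing))
    else:
        print("All required executables were found in PATH.")
```

This only confirms that the executables resolve in PATH. It does not check the R version, the Bioconductor packages (QDNAseq, CGHcall) or the Mojolicious Perl library.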
bash submitdaemon.sh "queueName" "queueServer" "queueUser"
This script will also install the Mojolicious Perl libraries. To use the daemon, start a screen session and execute:
morbo vivar_submitdaemon.pl &
The whole web application uses a Mongo database as its backbone for data storage.
2. Installation:
- Docker Image (recommended)
Inside the container, Nginx listens on port 80, which can be mapped to a port of your choice. A graphical user interface for Supervisord is made available at port 9002.
A Mongo database can be run locally or as a separate Docker container. To link a local or existing Mongo database, map the appropriate port to 27017 in the ViVar container.
In a Docker container, data is not persistent. To avoid loss of data, you can mount /vivar and /vivar-data to your local filesystem.
The /vivar folder contains all source and configuration files for the web application. /vivar-data can be used to store and import data files.
CAVEAT: This Docker image is a work in progress. While efforts have been made to automate as much of the configuration as possible, things can still go wrong. If any problems occur, we advise checking the configuration files at /vivar/config.ini and /vivar/jobs/general.json for errors.
2. Run the ViVar webapp.
example: ViVar with MongoDB run locally and mounted data volumes
docker run --name=vivar -d -p 8080:80 -p 9002:9002 -e VIVAR_API="172.17.0.1:8080" -e MONGO_HOST="your_mongo_host" -v /foo/bar/vivar:/vivar/ -v /foo/bar/vivar-data:/vivar-data/ vivar
Available environment variables to edit
Variable | Default | Synopsis |
---|---|---|
VIVAR_API | 172.17.0.1:8080 | The address and port where ViVar listens for API calls. Normally this is the IP address of the docker0 interface, as shown by the command ip a. |
MONGO_HOST | 127.0.0.1 | Hostname of the machine where the mongo database listens for connections. When using a linked container, this value is set automatically. |
MONGO_PORT | 27017 | Port where the mongo database listens for connections. |
PBS_HOST | 127.0.0.1 | FQDN of the host where the job submission daemon listens. |
PBS_PORT | 3000 | Port at which the job submission daemon listens. |
PBS_SERVER | pbsqueue | Hostname of the PBS server. |
PBS_QUEUE | batch | Name of the queue where the jobs can be submitted. |
HOST_DATADIR | /mnt/vivar-data | Directory on the docker host where the /vivar-data volume is mounted. |
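
When several of these variables need to be changed, it can help to assemble the `docker run` command programmatically so that only non-default values are passed. The Python sketch below is one way to do this under stated assumptions: the image name `vivar` and the port mappings are copied from the example command above, the defaults are copied from the table, and `docker_run_args` and the queue name `long` are hypothetical names used only for illustration.

```python
import shlex

# Defaults copied from the table above.
DEFAULTS = {
    "VIVAR_API": "172.17.0.1:8080",
    "MONGO_HOST": "127.0.0.1",
    "MONGO_PORT": "27017",
    "PBS_HOST": "127.0.0.1",
    "PBS_PORT": "3000",
    "PBS_SERVER": "pbsqueue",
    "PBS_QUEUE": "batch",
    "HOST_DATADIR": "/mnt/vivar-data",
}


def docker_run_args(overrides, image="vivar", name="vivar"):
    """Build a `docker run` argument list, passing only non-default values."""
    unknown = sorted(set(overrides) - set(DEFAULTS))
    if unknown:
        # Catch misspelled variable names before the container is started.
        raise KeyError("Unknown variable(s): " + ", ".join(unknown))
    args = ["docker", "run", "--name=" + name, "-d",
            "-p", "8080:80", "-p", "9002:9002"]
    for key in DEFAULTS:  # keep the order used in the table
        value = str(overrides.get(key, DEFAULTS[key]))
        if value != DEFAULTS[key]:
            args += ["-e", key + "=" + value]
    args.append(image)
    return args


if __name__ == "__main__":
    # "long" is a hypothetical queue name.
    args = docker_run_args({"MONGO_HOST": "your_mongo_host", "PBS_QUEUE": "long"})
    print(" ".join(shlex.quote(a) for a in args))
```

Volume mounts (`-v`) are left out of the sketch. Add them in the same way if you persist /vivar and /vivar-data on the host.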
A list of available options can be found in the MongoDB documentation.
example:
docker run --name express -d -p 8081:8081 -e ME_CONFIG_MONGODB_SERVER="your_mongo_host" knickers/mongo-express
Database entries
Reference Organisms
It is possible to manually add new reference organisms. To add a new organism or reference build, edit or add the appropriate field in the database. New organism documents should be formatted as follows:
example: Organism document for Homo sapiens, with 2 reference builds included
{ "_id": 9606, "name": "Homo sapiens", "build": ["GRCh37","GRCh38"] }
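
An organism document of this shape can be built and sanity-checked before it is added to the database. The Python sketch below is illustrative only: the helper names `make_organism_doc` and `add_build` are hypothetical, the validation rules are assumptions rather than constraints enforced by ViVar, and the actual insert into MongoDB is left out because the collection name is not given here.

```python
import json


def make_organism_doc(taxonomy_id, name, builds):
    """Return an organism document shaped like the example above."""
    if isinstance(taxonomy_id, bool) or not isinstance(taxonomy_id, int) or taxonomy_id <= 0:
        raise ValueError("taxonomy_id must be a positive integer")
    if not name.strip():
        raise ValueError("name must not be empty")
    builds = list(builds)
    if not builds or len(set(builds)) != len(builds):
        raise ValueError("builds must be a non-empty list without duplicates")
    return {"_id": taxonomy_id, "name": name, "build": builds}


def add_build(doc, build):
    """Return a copy of `doc` with `build` appended if it is not listed yet."""
    builds = list(doc["build"])
    if build not in builds:
        builds.append(build)
    return {**doc, "build": builds}


if __name__ == "__main__":
    human = make_organism_doc(9606, "Homo sapiens", ["GRCh37"])
    human = add_build(human, "GRCh38")
    print(json.dumps(human))
```

Using the taxonomy ID as `_id` keeps the document in step with the top-level folder name of the reference files.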
Each organism should have the corresponding reference files stored in the appropriate file structure, with the top folder named after the corresponding taxonomy ID.
example: Directory tree for Homo sapiens, with 2 reference builds included
./9606
├── GRCh37
│   ├── bioconductor
│   │   ├── GRCh37.fa -> ../seq/GRCh37.fa
│   │   ├── QDNAseq.GRCh37.100kbp.SR100.rds
│   ├── bowtie2
│   │   ├── GRCh37.1.bt2
│   │   ├── GRCh37.2.bt2
│   │   ├── GRCh37.3.bt2
│   │   ├── GRCh37.4.bt2
│   │   ├── GRCh37.fa -> ../seq/GRCh37.fa
│   │   ├── GRCh37.rev.1.bt2
│   │   └── GRCh37.rev.2.bt2
│   ├── seq
│   │   └── GRCh37.fa
└── GRCh38
    ├── bioconductor
    ├── bowtie2
    ├── seq
    └── wisecondor
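
A reference folder can be compared against this layout before a new build is registered. The Python sketch below assumes that the FASTA and bowtie2 index files shown for GRCh37 are the ones required for every build. That is an inference from the example, not a documented rule. The QDNAseq bin annotation file is deliberately not checked, because its file name encodes a bin size and read length that may differ per setup. The function names are hypothetical.

```python
from pathlib import Path


def expected_reference_files(build):
    """Relative paths shown for GRCh37 in the example tree (assumed required)."""
    bowtie2_suffixes = ["1.bt2", "2.bt2", "3.bt2", "4.bt2",
                        "rev.1.bt2", "rev.2.bt2", "fa"]
    paths = [Path("seq") / (build + ".fa"),
             Path("bioconductor") / (build + ".fa")]
    paths += [Path("bowtie2") / (build + "." + s) for s in bowtie2_suffixes]
    return paths


def missing_reference_files(root, taxonomy_id, build):
    """Return expected files that do not exist under <root>/<taxonomy_id>/<build>."""
    base = Path(root) / str(taxonomy_id) / build
    # Path.exists() follows symlinks, so a dangling .fa link counts as missing.
    return [p.as_posix() for p in expected_reference_files(build)
            if not (base / p).exists()]


if __name__ == "__main__":
    for path in missing_reference_files(".", 9606, "GRCh37"):
        print("missing:", path)
```

Run it from the folder that contains the taxonomy ID folders. An empty result means every file from the assumed list is present.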
Loading reference features
To be continued ...