ARGOS: Analysis of Repetitive GenOme Sequences.
ARGOS is a pipeline for extracting intragenomic similarity signals from genomic sequences.
ARGOS is a pipeline for extracting three types of one-dimensional signals from a genomic sequence that characterize its repetitiveness. Taken together, these three signals show which parts of a sequence are redundant or copied, how many copies exist, and how similar the copies are to each other. This gives insight into genome architecture, genome evolution, and the role of repetitive genomic sequences in general. For example, an estimated two-thirds of the human genome is considered repetitive or repeat-derived.
The following three scores are calculated by ARGOS:
Here you can download pre-calculated ARGOS signals for some model organisms. You can also embed these tracks directly into the UCSC Genome Browser. Either follow the links below or open the Genome Browser and:
Please note that ARGOS is a work in progress! ARGOS source code and releases can be downloaded from GitHub. ARGOS uses Maven as its build tool. For development, we recommend the Eclipse IDE for Java Developers and the m2e Maven Integration for Eclipse.
The ARGOS jars can be built with
bin/build-java.sh <VERSION>
(where <VERSION> is, e.g., 0.0.1)
ARGOS contains a fully automated Python pipeline that takes an mFASTA file and some parameters as input and outputs a set of BigWig files containing the ARGOS scores, along with some other useful result files. You can call the pipeline using the following command:
python path/to/argos-pipeline.py [PARAMS]
Calling it without parameters prints basic usage information; calling it
with "-h" prints the up-to-date usage information:
$ python software/argos-pipeline.py -h
usage: argos-pipeline.py [-h] -g genome [-c [context [context ...]]] -o outdir
-t tmpdir [-rl rl] [-step step] [-ctxSize ctxSize]
[-dontclean] [-recalcScores recalcScores]
[-calcChrom calcChrom] [-gpu]
...
USAGE
optional arguments:
-h, --help show this help message and exit
-g genome, --genome genome
The mfasta file containing the considered genome
-c [context [context ...]], --context [context [context ...]]
The mfasta files containing the context-genome(s)
-o outdir, --outdir outdir
output directory
-t tmpdir, --tmpdir tmpdir
temp directory
-rl rl read length
-step step step size
-ctxSize ctxSize context size for local signals
-dontclean if set to true, the temp files will not be removed
(for debugging purposes only!)
-recalcScores recalcScores
force recalculation of mapping scores
-calcChrom calcChrom Calculation of per-chromosome ctx scores
-gpu Use GPU for alignment
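As an illustration of how the flags above fit together, the following minimal Python sketch assembles a command line for the pipeline. The helper name `build_argos_command`, the example file paths, and the numeric values are hypothetical and chosen only for illustration; only the flag names are taken from the usage text. The sketch prints the command rather than running it, so it does not depend on ARGOS being installed.

```python
import shlex


def build_argos_command(script, genome, outdir, tmpdir, contexts=(),
                        read_length=None, step=None, ctx_size=None,
                        gpu=False, python="python"):
    """Assemble an argv list for argos-pipeline.py (hypothetical helper).

    Only flags listed in the usage text are emitted; optional flags are
    skipped when their value is None or False.
    """
    # Required arguments according to the usage text: -g, -o and -t.
    cmd = [python, script, "-g", genome, "-o", outdir, "-t", tmpdir]
    if contexts:
        # -c accepts zero or more context-genome mFASTA files.
        cmd += ["-c", *contexts]
    if read_length is not None:
        cmd += ["-rl", str(read_length)]
    if step is not None:
        cmd += ["-step", str(step)]
    if ctx_size is not None:
        cmd += ["-ctxSize", str(ctx_size)]
    if gpu:
        cmd.append("-gpu")
    return cmd


if __name__ == "__main__":
    # All paths and values below are placeholders, not recommended settings.
    cmd = build_argos_command(
        "software/argos-pipeline.py",
        genome="genome.fa",
        outdir="results",
        tmpdir="tmp",
        read_length=100,
        step=10,
    )
    # Print a shell-quoted command line; pass `cmd` to subprocess.run to execute it.
    print(shlex.join(cmd))
```

Building the command as a list rather than a single string avoids shell quoting problems when paths contain spaces, and the list can be passed unchanged to `subprocess.run(cmd, check=True)`.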
Please note that this pipeline calls the following external tools: