ARGOS is a pipeline for extracting three types of one-dimensional signals from a genomic sequence that characterize its repetitiveness. By considering these three signals, one can learn which parts of a sequence are redundant or copied, how many copies there are, and how similar these copies are to each other. This enables insights into genome architecture, genome evolution and, more generally, the role of repetitive genomic sequences. Note that an estimated two thirds of the human genome, for example, is considered repetitive or repeat-derived.
The following three scores are calculated by ARGOS:
Here you can download pre-calculated ARGOS signals for some model organisms. You can also embed these tracks directly into the UCSC Genome Browser. Either click the links below or open the Genome Browser and:
Please note that ARGOS is a work in progress! ARGOS source code and releases can be downloaded from GitHub. ARGOS uses Maven as its build tool. For development we recommend the Eclipse IDE for Java developers and the m2e Maven Integration for Eclipse.
The ARGOS jars can be built with
(version is, e.g., 0.0.1)
ARGOS contains a fully automated Python pipeline that takes an mFASTA file and some parameters as input and outputs a set of BigWig files (along with some other useful result files) containing the ARGOS scores. You can call the pipeline using the following command:
python path/to/argos-pipeline.py [PARAMS]
Calling it without parameters prints basic usage information; calling it with "-h" prints the up-to-date usage information.
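Because the pipeline starts from an mFASTA file, it can be useful to check which sequences the file contains before starting a long run. The following is a minimal sketch, not part of ARGOS, that assumes mFASTA means a plain multi-record FASTA text file; the function name `fasta_lengths` and the example records are made up for illustration.

```python
import io


def fasta_lengths(handle):
    """Return {sequence name: length} for a (multi-)FASTA text stream.

    Hypothetical helper, not part of ARGOS. The sequence name is taken
    to be the first whitespace-delimited token of each '>' header line.
    """
    lengths = {}
    name = None
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            tokens = line[1:].split()
            name = tokens[0] if tokens else ""
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)  # sequence may span several lines
    return lengths


if __name__ == "__main__":
    # Made-up two-record example; a real genome would be opened with open(path).
    example = io.StringIO(">chr1 test\nACGT\nAC\n>chr2\nGGG\n")
    print(fasta_lengths(example))  # {'chr1': 6, 'chr2': 3}
```

For a real genome file, pass an open text file handle instead of the in-memory example.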
Please note that this pipeline calls the following external tools:
$ python software/argos-pipeline.py -h
usage: argos-pipeline.py [-h] -g genome [-c [context [context ...]]] -o outdir
                         -t tmpdir [-rl rl] [-step step] [-ctxSize ctxSize]
                         [-dontclean] [-recalcScores recalcScores]
                         [-calcChrom calcChrom] [-gpu]

... USAGE

optional arguments:
  -h, --help            show this help message and exit
  -g genome, --genome genome
                        The mfasta file containing the considered genome
  -c [context [context ...]], --context [context [context ...]]
                        The mfasta files containing the context-genome(s)
  -o outdir, --outdir outdir
                        output directory
  -t tmpdir, --tmpdir tmpdir
                        temp directory
  -rl rl                read length
  -step step            step size
  -ctxSize ctxSize      context size for local signals
  -dontclean            if set to true, the temp files will not be removed
                        (for debugging purposes only!)
  -recalcScores recalcScores
                        force recalculation of mapping scores
  -calcChrom calcChrom  Calculation of per-chromosome ctx scores
  -gpu                  Use GPU for alignment
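
When the pipeline is run for several genomes, it may help to assemble the command line in a script. The sketch below uses only flags that appear in the help output above; the helper name `build_argos_command`, all file paths, and the numeric values are placeholders chosen for illustration, not recommended settings.

```python
import shlex
import subprocess


def build_argos_command(script, genome, outdir, tmpdir, contexts=(),
                        read_length=None, step=None, ctx_size=None,
                        use_gpu=False):
    """Assemble an argument list for argos-pipeline.py.

    Hypothetical helper: it only maps its parameters onto the flags
    documented in the pipeline's -h output (-g, -c, -o, -t, -rl, -step,
    -ctxSize, -gpu) and does not validate their values.
    """
    cmd = ["python", script, "-g", genome]
    if contexts:
        cmd += ["-c", *contexts]  # -c accepts several context genomes
    cmd += ["-o", outdir, "-t", tmpdir]
    if read_length is not None:
        cmd += ["-rl", str(read_length)]
    if step is not None:
        cmd += ["-step", str(step)]
    if ctx_size is not None:
        cmd += ["-ctxSize", str(ctx_size)]
    if use_gpu:
        cmd.append("-gpu")
    return cmd


def run_argos(cmd):
    """Run the assembled command and raise if the pipeline exits non-zero."""
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Placeholder paths and values; adjust them to your own setup.
    cmd = build_argos_command(
        "software/argos-pipeline.py",
        genome="genome.fa",
        outdir="results",
        tmpdir="tmp",
        contexts=["context1.fa", "context2.fa"],
        read_length=100,
        step=10,
    )
    print(shlex.join(cmd))  # inspect the command before calling run_argos(cmd)
```

Printing the command first makes it easy to compare it against the current `-h` output before starting a long run, since the available flags may change while ARGOS is under development.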