Center for Integrative Bioinformatics Vienna
Max F. Perutz Laboratories
Dr. Bohr Gasse 9
A-1030 Vienna, Austria

Next Generation Sequencing Evaluation (NGSE)


Next Generation Sequencing Evaluation (NGSE) is a program that evaluates the performance of different mapping programs on the basis of the simulated read set created by Next Generation Sequencing Simulation (NGSS). Supported mapping programs are BWA (and any other reference assembler that produces SAM format output including the MD:Z field; see the SAM format specification), Shrimp, NextGenMap, SSaha2, Bowtie, and Blastn.
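Because SAM input must carry the MD:Z field, it can help to check a SAM file before passing it to NGSE. The following is a minimal sketch, not part of NGSE: the function names are hypothetical, and it relies only on the SAM layout (tab-separated records, 11 mandatory fields followed by optional TAG:TYPE:VALUE fields, bit 0x4 of the flag marking an unmapped read). Unmapped reads are skipped on the assumption that a mapper does not write MD:Z for them.

```python
def has_md_tag(sam_line):
    """Return True if a SAM alignment line carries an MD:Z optional field."""
    if sam_line.startswith("@"):
        return False  # header line, not an alignment record
    fields = sam_line.rstrip("\n").split("\t")
    # The first 11 fields are mandatory; optional fields follow them.
    return any(field.startswith("MD:Z:") for field in fields[11:])


def missing_md_records(lines):
    """Yield the names of mapped reads whose record lacks an MD:Z field."""
    for line in lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        if int(fields[1]) & 0x4:
            continue  # unmapped read: no MD:Z expected (assumption)
        if not has_md_tag(line):
            yield fields[0]
```

Passing an open SAM file handle to missing_md_records lists the reads that would need to be re-mapped with MD:Z output enabled.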

Requirements:

NGSE requires a PC with at least 4 GB of memory and the latest version of the Java Runtime Environment.

Installation:

No installation required.

Parameters:

-d Assembly method to be evaluated (supported values are BWA (SAM format), Shrimp, NextGenMap, SSaha2, Bowtie, Blastn)

-o Path of the output file to be created

-p Path of the property file (prefix_log) written by the simulation with NGSS

-r Path of the result file produced by the assembly program

A call of NGSE looks like this:

java -Xmx8000m -jar NGSE.jar -d BWA -o result -p /home/workspace/testRun/mySeq_log -r /scratch/data/out

The option -Xmx8000m sets the maximum Java heap size, allowing the Java Runtime to use up to 8000 MB of RAM. Adjust this value to match your hardware. This call evaluates the results of the reference assembly with BWA in the file '/scratch/data/out', using the data produced by NGSS in the directory '/home/workspace/testRun/'.
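When NGSE is run for several mappers in a row, the call above can be assembled programmatically. The sketch below is only an illustration: build_ngse_command and run_ngse are hypothetical helper names, the default jar name and heap size are taken from the example call, and the exact strings accepted by -d other than BWA are not confirmed here.

```python
import subprocess


def build_ngse_command(mapper, output, property_file, result_file,
                       heap_mb=8000, jar="NGSE.jar"):
    """Build the argument list for one NGSE call (hypothetical helper)."""
    return [
        "java", f"-Xmx{heap_mb}m", "-jar", jar,
        "-d", mapper,          # assembly method to be evaluated
        "-o", output,          # output file to be created
        "-p", property_file,   # prefix_log file written by NGSS
        "-r", result_file,     # result file of the assembly program
    ]


def run_ngse(*args, **kwargs):
    """Run NGSE and raise CalledProcessError if it exits with an error."""
    subprocess.run(build_ngse_command(*args, **kwargs), check=True)
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.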

Further adjustments:

Additional adjustments can be made in the file 'default.properties'. The option 'maxLineLength' sets the maximum length of any line in the input files, and 'maxReadLength' sets the maximum length of a read. If a line or a read exceeds its limit, the program stops prematurely. Changing either of these two values is not recommended. 'logPath' sets the path of the log files the program creates during evaluation.
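Since NGSE stops prematurely on an over-long line, an input file can be scanned beforehand. This is a minimal sketch under stated assumptions: lines_exceeding is a hypothetical helper, the limit passed to it is whatever value 'maxLineLength' has in your 'default.properties', and length is counted in characters without the line terminator, which may differ from how NGSE itself counts.

```python
def lines_exceeding(lines, max_line_length):
    """Return (line_number, length) for every line longer than the limit.

    `lines` is any iterable of strings, for example an open file handle.
    """
    too_long = []
    for number, line in enumerate(lines, start=1):
        # Ignore the trailing newline when measuring the line (assumption).
        length = len(line.rstrip("\r\n"))
        if length > max_line_length:
            too_long.append((number, length))
    return too_long
```

An empty result means no line in the file exceeds the limit that was passed in.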

NGSE Output Files:

log directory:
Log files from the application and a debug log file

Output file:
File containing one row for each read with 4 columns:
% bases mapped correctly, % bases mapped wrongly, % unmapped bases, read ID
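The per-read rows can be condensed into one summary per mapper. The sketch below assumes that the four columns appear in the order listed above and are separated by whitespace; the actual delimiter is not stated here, so pass sep="," or sep="\t" if the file differs. The names parse_row and summarize are hypothetical.

```python
from statistics import mean


def parse_row(line, sep=None):
    """Split one row into (pct_correct, pct_wrong, pct_unmapped, read_id)."""
    # maxsplit=3 keeps the read ID whole even if it contains the separator.
    correct, wrong, unmapped, read_id = line.strip().split(sep, 3)
    return float(correct), float(wrong), float(unmapped), read_id.strip()


def summarize(lines, sep=None):
    """Average the three percentage columns over all reads."""
    rows = [parse_row(line, sep) for line in lines if line.strip()]
    if not rows:
        return {"reads": 0}
    return {
        "reads": len(rows),
        "mean_pct_correct": mean(row[0] for row in rows),
        "mean_pct_wrong": mean(row[1] for row in rows),
        "mean_pct_unmapped": mean(row[2] for row in rows),
    }
```

Calling summarize on the open output file of each run gives one comparable record per mapping program.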
