Funseq2 hands-on tutorial

Only the most essentials are kept, mostly explanation of installation errors and how to fix them. For detailed instructions, please refer to Funseq2 Protocol.
It’s recommended to install things to a designated Conda environment, for some tools, I will only list Conda installation command.

Dependances to install first

TPMpvalue [link]
– Perl package Parallel::ForkManager
[link]
– bedtools/tabix/bigwigAverageOverBed
– VAT (snpMapper, indelMapper Module) [link]
– sed/awk/grep
(those should have been pre-installed already)

TPMpvalue
Download the TFM-Pvalue.tar.gz from its website, tar -xf TFM-Pvalue.tar.gz to decompress the package and cd into the folder. To compile the package type as follow.

tar -xf TFM-Pvalue.tar.gz
cd TFM-Pvalue
make

After compiling the tool successfully, either directly copy the compiled binary files into your /usr/bin folder, or add the path to the environmental variable as below.

export PATH=$PATH:/where/you/put/binary/files

Possible Error:
g++ -O3 -DJASPAR=1 -DPROGRAM=0 TFMpvalue.cpp Matrix.cpp ArgumentException.cpp FileException.cpp ParseException.cpp -o TFMpvalue-pv2sc
TFMpvalue.cpp: In function ‘void arguments(int, char* const*)’:
TFMpvalue.cpp:503:45: error: ‘getopt’ was not declared in this scope

Solution:
In the TFM-Pvalue folder, find the file TFPpvaue.cpp, uncomment line 16 in that file and change GetOpt.h into getopt.h.

Perl package Parallel::ForkManager
Here is the installation using Conda.

conda install -n yourEnvr -c bioconda perl-parallel-forkmanager

bedtools/tabix/bigwigAverageOverBed
Installation using Conda.

conda install -n yourEnvr -c bioconda bedtools
conda install -n yourEnvr -c bioconda tabix
conda install -n yourEnvr -c bioconda ucsc-bigwigaverageoverbed

VAT
The easiest way is to download pre-built binaries, save them into your local /usr/bin/ or add the path to your environmental variable and make them executable.

export PATH=$PATH:/where/you/put/binary/files
chmod +x snpMapper
chmod +x indelMapper # <- to make them executable

Download the Funseq2 and pre-processed data

Download the Funse2 from its newest update:
https://github.com/khuranalab/FunSeq2_DC

Download the latest data needed for Funseq2 and put everything into the data_context folder, unzip XXX.tar.gz folders within data_context
https://khuranalab.med.cornell.edu/data_DC3.html

You can also download an older build of the data set from https://khuranalab.med.cornell.edu/data.html
The content can be downloaded in a compressed folder from that page.

Prepare funseq2.sh and config.txt file to run Funseq2

In your FunSeq2_DC folder, you need to modify two files before starting: funseq2.sh and contig.txt.

### In funseq2.sh ###

# keep "data_context/user_annotations" intact
user_anno=/your/destination/of/data_context/user_annotations

### In contig.txt ###

# Change the file path first and then change the annotation files you want to use accordingly.
file_path=/your/destination/of/data_context

The destination user_anno can be specified in the running command by using the option -ua. But if you don’t want to type that every time, you can just change it in funseq2.sh.

If running on cluster and using Conda, remember to export the $TMPDIR to $PERL5LIB

export PERL5LIB=$PERL5LIB:$TMPDIR

Running FunSeq2

After all the preparation, we are finally here. To run FunSeq2 is simple.

funseq2.sh -f file -maf MAF -m <1/2> -len length_cut -inf <bed/vcf> -outf <bed/vcf> -nc -o path -g file -exp file -cls file -exf <rpkm/raw> -p int -cancer cancer_type -s score -uw -ua user_annotations_directory -db
FunSeq2 options

Source:
https://currentprotocols.onlinelibrary.wiley.com/doi/full/10.1002/cpbi.23
https://github.com/gersteinlab/FunSeq2/issues/4
http://info.gersteinlab.org/Funseq2#E._Input_files
https://anaconda.org/bioconda/perl-parallel-forkmanager
https://metacpan.org/pod/release/SZABGAB/Parallel-ForkManager-1.03/lib/Parallel/ForkManager.pm
https://anaconda.org/bioconda/bedtools
https://anaconda.org/bioconda/tabix
https://anaconda.org/bioconda/ucsc-bigwigaverageoverbed
https://github.com/khuranalab/FunSeq2_DC
https://genomebiology.biomedcentral.com/articles/10.1186/s13059-014-0480-5
https://science.sciencemag.org/content/342/6154/1235587.long