Liftover bam files

The most straightforward way is to use CrossMap.

Taking the conversion from hg19 to hg38 as an example:

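pip install CrossMap
CrossMap bam -a hg19ToHg38.over.chain input.bam output
# The .bam extension is added to the output name automatically.
# Some older CrossMap releases install the command as CrossMap.py instead.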

Genome liftover chain files can be downloaded here: (choose the chain file that matches your source and target assemblies)

According to the CrossMap website, the -a option should always be used.


Unix check if a file exists

test -f yourFile

if [ -f "$FILE" ]; then
    echo "$FILE exists."
else
    echo "$FILE does not exist."
fi

To match a wildcard in the file names:

if test -n "$(find /dir/to/search -maxdepth 1 -name 'files*' -print -quit)"; then
    echo found
else
    echo not found
fi
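
When the check has to happen inside a Python script rather than a shell script, the same two tests can be sketched with pathlib. This is only an illustrative equivalent of the shell commands above; the helper names file_exists and any_match are made up for this example.

```python
from pathlib import Path


def file_exists(path):
    """Equivalent of `test -f path`: True only for an existing regular file."""
    return Path(path).is_file()


def any_match(directory, pattern):
    """Equivalent of the find check: True if at least one entry directly
    inside `directory` matches the shell-style `pattern` (no recursion)."""
    # Path.glob is lazy, so any() stops at the first match,
    # much like `-print -quit` does for find.
    return any(Path(directory).glob(pattern))
```

For example, any_match("/dir/to/search", "files*") mirrors the find command above.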


Ignore row with only NaN in plotHeatmap – deepTools

If the output from computeMatrix contains NaN values, the generated heatmap is not sorted and a warning stating "Mean of empty slice" appears.

To avoid this, replace the missing values with 0 in the computeMatrix step by passing the --missingDataAsZero option.

computeMatrix scale-regions -S xxx.bw -R xxx.bed --missingDataAsZero -m xxx -b xxx -a xxx --numberOfProcessors xx -o xxx.gz

plotHeatmap -m xxx.gz -out xxx.png
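
To see why a row containing only NaN causes trouble, here is a small stand-alone sketch in plain Python. It is not deepTools code: the helper names and the toy matrix are invented for illustration, and it simply assumes that rows are ranked by their mean signal and that missing values are replaced by 0.

```python
import math


def nan_mean(row):
    """Mean over the non-NaN values; NaN when nothing is left to average."""
    values = [v for v in row if not math.isnan(v)]
    return sum(values) / len(values) if values else math.nan


def missing_as_zero(matrix):
    """Return a copy of the matrix with every NaN replaced by 0.0."""
    return [[0.0 if math.isnan(v) else v for v in row] for row in matrix]


# Toy matrix (hypothetical values): the second region has no signal at all.
matrix = [
    [1.0, 2.0],
    [math.nan, math.nan],
    [math.nan, 4.0],
]

# Without replacement, the all-NaN row has no defined mean,
# so it cannot be ranked against the other rows.
raw_means = [nan_mean(row) for row in matrix]

# With replacement, every row has a finite mean and sorting is well defined.
filled = missing_as_zero(matrix)
filled_means = [sum(row) / len(row) for row in filled]
order = sorted(range(len(filled)), key=lambda i: filled_means[i], reverse=True)
```

Note that in this sketch the replacement also lowers the mean of the partly missing third row (from 4.0 to 2.0), so treating missing data as zero can change the ranking of rows that are only partly covered.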


Python return os.system and subprocess output as a string

To run a Unix command from Python, we can call os.system() directly. However, os.system() returns an exit status rather than the command's output. To capture the output as a string, for example the file names printed by ls, we need to use subprocess.check_output() instead.

Note that subprocess.check_output() returns a bytes object rather than a string, so the result must be decoded to obtain a string.

import subprocess
fileList = subprocess.check_output('ls someFolder/*', shell=True).decode('utf-8').strip().split('\n')
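
As a variation on the one-liner above, the following sketch lets subprocess do the decoding through text=True and passes the command as a list, so no shell is involved. It assumes Python 3.7 or later. The function name list_dir is hypothetical, and the example lists a directory directly instead of expanding a someFolder/* wildcard, so it returns bare file names without the folder prefix.

```python
import subprocess


def list_dir(folder):
    """Return the names that `ls` prints for `folder` as a list of strings."""
    result = subprocess.run(
        ["ls", folder],       # argument list, so shell=True is not needed
        capture_output=True,  # collect stdout and stderr
        text=True,            # decode bytes to str automatically
        check=True,           # raise CalledProcessError if ls fails
    )
    # splitlines() gives [] for an empty folder, whereas
    # ''.strip().split('\n') would give [''].
    return result.stdout.splitlines()
```

A call such as list_dir("someFolder") then returns a list of strings directly, with no separate decode step.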