Installation time: approximately 5-10 min, depending on your machine
configuration. 
 Set up the FUSION program folder: 
cd ~/scratch/programs/
wget https://github.com/gusevlab/fusion_twas/archive/master.zip -O fusion.zip
unzip fusion.zip
The following assumes that you already have conda installed. For more
information on conda, see https://docs.conda.io/en/latest/miniconda.html
To create the conda environment for FUSION and FOCUS, please use the yml
file provided (see https://github.com/rodrigoduarte88/neuro_rTWAS/blob/main/fusion_final_environment.yml):
 conda env create --file fusion_final_environment.yml
 This yml file contains most software and library versions required
to run focus/fusion.
We will still need to install the R library “plink2R”. To do this,
first rename the LAPACK and BLAS libraries in the conda environment
folder (as detailed here), adjusting the path below to your own
miniconda installation.
cd /users/rodrigoduarte88/scratch/miniconda3/envs/fusion_final/lib
mv liblapack.so libRlapack.so
mv libblas.so libRblas.so
 
You will then need to start R and install plink2R manually using the
following commands:
conda activate fusion_final 
 R 
devtools::install_github("carbocation/plink2R/plink2R", ref="carbocation-permit-r361")
Now, let’s create the conda environment for FOCUS
conda create -n focus python=3.7 r-base 
conda activate focus 
pip install pyfocus --user 
pip install mygene --user 
pip install rpy2 --user 
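Once both environments exist, a quick check that the expected executables are on PATH can save time later. The following is a minimal sketch; the helper name `require_cmds` is hypothetical and not part of FUSION or FOCUS:

```shell
# Hypothetical helper: report any executable that cannot be found on PATH.
# Returns 0 if every command is available and 1 otherwise.
require_cmds() {
    missing=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "not found on PATH: $cmd" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example (run inside the relevant environment):
#   conda activate fusion_final && require_cmds R Rscript
#   conda activate focus && require_cmds python focus
```

If `focus` is reported as missing, note that `pip install --user` typically places command-line scripts in ~/.local/bin on Linux, which may need to be added to PATH.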
The required files include the SNP weights for FOCUS/FUSION and the
1000 Genomes reference panel for the population of interest. Please
download them from the King’s College London Research Data Repository
(KORDS), at https://doi.org/10.18742/22179655. Then decompress the
files:
 tar zxvf FOCUS_weights.tgz
tar zxvf FUSION_weights.tgz
tar zxvf 1000G_ref_panel.tgz
 
N.B.: The reference panels are annotated with dbsnp151/hg19 information.
Your GWAS summary statistics must be annotated with variant IDs according to dbsnp151. Use munge_sumstats.py from the ldsc package for pre-filtering. You can find an example of how this was done in the scripts available at https://github.com/rodrigoduarte88/TWAS_HERVs-SCZ. You can also check the FUSION guidelines for additional instructions.
Summary statistics for FUSION should look like:
SNP     A1      A2      Z
rs10    A       C       -0.501
rs1000000       G       A       2.238
rs10000003      A       G       -1.324
rs10000010      T       C       -0.082
rs10000013      C       A       -2.04
Summary statistics for FOCUS should look like:
CHR     SNP     BP      A1      A2      Z       N
7       rs10    92383888        A       C       -0.501  58749.13
12      rs1000000       126890980       G       A       2.238   58749.13
4       rs10000003      57561647        A       G       -1.324  58749.13
4       rs10000010      21618674        T       C       -0.082  58749.13
4       rs10000013      37225069        C       A       -2.04   58749.13
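Because the FUSION columns are a subset of the FOCUS columns, one way to keep the two files consistent is to derive the FUSION file from the FOCUS one. This is a minimal sketch, assuming a tab-delimited FOCUS-format file with the header shown above; the function name `focus_to_fusion` and the file names in the usage comment are hypothetical:

```shell
# Hypothetical helper: write the FUSION columns (SNP, A1, A2, Z) from a
# tab-delimited FOCUS-format file (CHR, SNP, BP, A1, A2, Z, N).
# Columns are looked up by header name, so their order in the input does not matter.
focus_to_fusion() {
    awk 'BEGIN { FS = OFS = "\t" }
         NR == 1 {
             for (i = 1; i <= NF; i++) col[$i] = i
             if (!("SNP" in col) || !("A1" in col) || !("A2" in col) || !("Z" in col)) {
                 print "missing required column" > "/dev/stderr"
                 exit 1
             }
         }
         { print $col["SNP"], $col["A1"], $col["A2"], $col["Z"] }' "$1"
}

# Example:
#   focus_to_fusion my_trait.sumstats.focus > my_trait.sumstats.fusion
```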
To run FUSION, activate the conda environment, and use the FUSION
weights and linkage disequilibrium reference panel provided. 
conda activate fusion_final
 
Rscript FUSION.assoc_test.R \
--sumstats PGC2.SCZ.sumstats.fusion \
--weights ./wrapped/CMC.pos \
--weights_dir ./wrapped/ \
--ref_ld_chr ./LDREF_harmonized/1000G.EUR. \
--chr 22 \
 --out PGC2.SCZ.22.dat 
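The command above processes chromosome 22 only. To cover every autosome, the same call can be wrapped in a loop. This is a sketch that reuses the paths and file names from the command above; the function name `run_fusion_all_chr` and the `RUN` switch are hypothetical conveniences, not FUSION options:

```shell
# Sketch: run the association test for chromosomes 1 to 22 in turn.
# RUN is a hypothetical switch: RUN=echo prints each command instead of running it.
run_fusion_all_chr() {
    for chr in $(seq 1 22); do
        ${RUN:-} Rscript FUSION.assoc_test.R \
            --sumstats PGC2.SCZ.sumstats.fusion \
            --weights ./wrapped/CMC.pos \
            --weights_dir ./wrapped/ \
            --ref_ld_chr ./LDREF_harmonized/1000G.EUR. \
            --chr "${chr}" \
            --out "PGC2.SCZ.${chr}.dat" || return 1
    done
}

# Example:
#   RUN=echo run_fusion_all_chr   # dry run: list the 22 commands
#   run_fusion_all_chr            # run them (inside the fusion_final environment)
```

Stopping at the first failing chromosome (`|| return 1`) is a design choice here, so that a missing weight or reference file is noticed before the results are combined.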
 
To run the conditional analysis, follow the instructions provided by
the authors of FUSION. First, obtain a file containing only the
Bonferroni-significant hits; then perform the conditional analysis on
that file.
 
Combine the results files from all chromosomes:
head -1 PGC2.SCZ.1.dat > SCZ_____all_chr.tsv
tail -n +2 -q PGC2.SCZ.*.dat* >> SCZ_____all_chr.tsv # the pattern excludes PGC2.SCZ.sumstats.fusion
 Create file with significant hits only (Bonferroni) 
bonferroni_p=$(bc -l <<< "scale=50; 0.05/8212") # 8212 is the number of expressed features in the weights
cat SCZ_____all_chr.tsv | awk -v var="${bonferroni_p}" 'NR == 1 || $20 < var' > SCZ_____all_chr.tsv.Sig
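The filter above selects the p-value column by position ($20). As a hedged alternative, the column can be looked up by name, which guards against a shifted column order. This sketch assumes a whitespace-delimited table with no empty fields and a p-value column named TWAS.P; the function name `bonferroni_filter` is hypothetical:

```shell
# Hypothetical helper: keep the header plus rows with TWAS.P < 0.05 / n_features.
#   $1 = combined results table, $2 = number of tested features (e.g. 8212)
# Assumes whitespace-delimited input and a p-value column named TWAS.P.
bonferroni_filter() {
    awk -v n="$2" '
        BEGIN { threshold = 0.05 / n }
        NR == 1 {
            for (i = 1; i <= NF; i++) if ($i == "TWAS.P") p = i
            if (!p) { print "TWAS.P column not found" > "/dev/stderr"; exit 1 }
            print; next
        }
        # Skip missing values explicitly so that NA is never treated as zero.
        $p != "NA" && ($p + 0) < threshold
    ' "$1"
}

# Example:
#   bonferroni_filter SCZ_____all_chr.tsv 8212 > SCZ_____all_chr.tsv.Sig
```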
Run the conditional analysis on the significant hits:
Rscript FUSION.post_process.R \
--sumstats PGC2.SCZ.sumstats.fusion \
--input SCZ_____all_chr.tsv.Sig \
--out SCZ_____all_chr.tsv.Sig.analysis \
--ref_ld_chr ./LDREF_harmonized/1000G.EUR. \
--chr 22 \
--plot --locus_win 100000
To run FOCUS, activate the conda environment, and use the FOCUS
weights and linkage disequilibrium reference panel provided. 
conda activate focus 
module load mesa-glu/9.0.1-gcc-9.4.0 # this is for CREATE users - loads libGL.so.1
 
focus finemap schizophrenia.gwas.focus \
LDREF_harmonized/1000G.EUR.22 CMC_brain_focus_database.db \
--chr 22 --plot --p-threshold 5E-08 \
--out SCZ_pgc3_CMC.5e-8.chr.22 --locations 37:EUR
 
 
For interpretation of the output files, please use the instructions
provided by the authors of FOCUS and FUSION. The results
contain gene and HERV expression signatures associated with genetic
susceptibility to your trait of interest.