TFClass Predict

Description

tfclass_predict predicts transcription factor binding sites in ATAC-seq data at the TFClass level using DNABERT.

Package Workflow Structure

Installation

Currently, only a pre-alpha version of the package is available. The package can be installed via pip:

pip install -i https://test.pypi.org/simple/ tfclass-predict

To use the package, the human reference genome (hg38) and the DNABERT model (v1-06) are needed.

DNABERT6 by jerryji1993

HG38 from UCSC

Both downloads need to be unzipped so that the path to hg38.fa and the path to the directory 6-new-12w-0 can be passed to the command-line tool or PredictionManager.
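
Before starting a run, it can help to confirm that both paths point where expected. The following is a minimal sketch and not part of tfclass-predict: the helper name check_inputs is hypothetical, and it only assumes the file and directory names mentioned above.

```python
from pathlib import Path


def check_inputs(genome_file, bert_model_dir):
    """Return a list of problems found with the two download paths.

    Hypothetical helper, not part of tfclass_predict.
    """
    problems = []
    genome = Path(genome_file)
    bert = Path(bert_model_dir)
    # hg38.fa must be the unzipped FASTA file, not the .gz archive
    if not genome.is_file():
        problems.append(f"genome file not found: {genome}")
    elif genome.suffix == ".gz":
        problems.append(f"genome file is still compressed: {genome}")
    # the DNABERT model is passed as a directory (e.g. 6-new-12w-0)
    if not bert.is_dir():
        problems.append(f"DNABERT directory not found: {bert}")
    return problems


if __name__ == "__main__":
    for problem in check_inputs("hg38.fa", "model/6-new-12w-0"):
        print(problem)
```

An empty list means both paths look usable; anything else is printed as a short message.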

Models

TFClass Predict currently supports only one hierarchy level (the class level). The corresponding model must be downloaded to use the tool.

Classlevel

Usage

The tool can be used from the command line with the following parameters:

tfclass-predict [-h] [--tfm TFM] bed_file hg_file dnabert output_dir

positional arguments:
  bed_file      Path to bed file of ATAC-seq or other NGS experiment.
  hg_file       Path to human genome reference.
  dnabert       Path to DNABERT model.
  output_dir    Path to output directory.

options:
  -h, --help    show this help message and exit
  --tfm TFM     Path to TFClass model.

Alternatively, the package can be used directly in Python scripts:

from tfclass_predict import PredictionManager

bed_file = "tests/GSM6915056_P1_summits_100.bed"  # smaller BED file for testing
genome_file = "hg38.fa"
tfclass_model = "model/Classlevel.h5"  # see Models
bert_model = "model/6-new-12w-0"  # see Installation
res_dir = "tests/res"

pred_manager = PredictionManager(bed_file, genome_file, res_dir, bert_model, tfclass_model)
pred_manager.predict()
pred_manager.save_results()
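
The test file name above suggests a subset of 100 summits. How that file was produced is not stated, so the following is only a sketch of one way to cut a small BED file from a larger one for quick trial runs. The helper write_bed_subset is hypothetical and not part of tfclass_predict.

```python
from itertools import islice


def write_bed_subset(src_bed, dst_bed, n_lines=100):
    """Copy the first n_lines records of a BED file to dst_bed.

    Hypothetical helper, not part of tfclass_predict.
    Returns the number of records written.
    """
    written = 0
    with open(src_bed) as src, open(dst_bed, "w") as dst:
        # skip blank lines and track/browser/comment header lines
        records = (
            line for line in src
            if line.strip() and not line.startswith(("track", "browser", "#"))
        )
        for line in islice(records, n_lines):
            dst.write(line)
            written += 1
    return written
```

The resulting file can then be passed as bed_file in the script above.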

Docker Image (under construction)

The docker directory includes a Dockerfile for building the Docker image. Go into the docker directory and run:
docker build -t "username_name_of_the_image" .

How to use the Docker image:
docker run -it -u 2696:205 --gpus '"device=0,1,2,3,4,5,6,7"' -v /scratch/docker_hti/MultiModel_160523/:/AI_PLATFORM/ --rm --name hti hti_tfplatform:1.1

Change the -u value to your own user and group ids. You can find them by running "id -u" and "id -g" in the terminal. Specify the GPU devices you want to use and change the scratch directory to the location of your downloaded files. Also change "hti" to your own username.
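
The substitutions described above can also be scripted. The following is a minimal sketch that assembles the same docker run command with the current user's ids filled in. The helper docker_run_command is hypothetical, the host path and container name in the usage lines are placeholders, and the image tag and the /AI_PLATFORM/ mount target are copied from the example command.

```python
import os
import shlex


def docker_run_command(host_dir, container_name, image="hti_tfplatform:1.1",
                       gpus="0", uid=None, gid=None):
    """Build the `docker run` argument list from the example command.

    Hypothetical helper; image tag and mount target are taken from the
    example command and may need adjusting.
    """
    # default to the current user's ids (Unix only)
    uid = os.getuid() if uid is None else uid
    gid = os.getgid() if gid is None else gid
    return [
        "docker", "run", "-it",
        "-u", f"{uid}:{gid}",
        # a comma-separated device list needs literal double quotes
        "--gpus", f'"device={gpus}"',
        "-v", f"{host_dir}:/AI_PLATFORM/",
        "--rm",
        "--name", container_name,
        image,
    ]


if __name__ == "__main__":
    # placeholder path and name; replace with your own
    cmd = docker_run_command("/scratch/my_data/", "my_name", gpus="0,1")
    print(shlex.join(cmd))
```

Printing the command with shlex.join gives a shell-safe string that can be inspected before running it, or the list can be passed to subprocess.run directly.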