{"id":2983,"date":"2025-09-11T17:28:49","date_gmt":"2025-09-11T08:28:49","guid":{"rendered":"https:\/\/nagasakilab.csml.org\/en\/?page_id=2983"},"modified":"2025-12-01T10:02:04","modified_gmt":"2025-12-01T01:02:04","slug":"qtfpred","status":"publish","type":"page","link":"https:\/\/nagasakilab.csml.org\/en\/qtfpred","title":{"rendered":"QTFPred: Quantum-based Transcription Factor Predictor"},"content":{"rendered":"\n<h1 class=\"wp-block-heading\"><strong>Overview<\/strong><\/h1>\n\n\n\n<p>QTFPred (Quantum-based Transcription Factor Predictor) is a quantum-classical hybrid deep learning framework for predicting transcription factor (TF) binding signals at base-pair resolution. By integrating quantum convolutional layers with fully convolutional neural networks (FCNs), QTFPred achieves state-of-the-art performance, particularly in data-sparse scenarios where conventional methods struggle.<\/p>\n\n\n\n<h1 class=\"wp-block-heading\"><strong>Download<\/strong><\/h1>\n\n\n\n<p>The QTFPred package can be downloaded here (<a href=\"https:\/\/nagasakilab.csml.org\/data\/QTFPred_signal.v6.tar.gz\">QTFPred_signal.v6.tar.gz<\/a>).<\/p>\n\n\n\n<p><strong>Execute the Extraction Command<\/strong><\/p>\n\n\n\n<p>Use the <code>tar -zxvf<\/code> command to decompress and extract the archive.
This will create the top-level repository folder.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>tar -zxvf QTFPred_signal.v6.tar.gz\n<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\">Implementation Reference<\/h2>\n\n\n\n<p>QTFPred builds upon the implementation approach of FCNsignal:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>FCNsignal Paper: <a href=\"https:\/\/journals.plos.org\/ploscompbiol\/article?id=10.1371\/journal.pcbi.1009941\">Base-resolution prediction of transcription factor binding signals by a deep learning framework<\/a> | PLOS Computational Biology, 2022<\/li>\n\n\n\n<li>FCNsignal GitHub: https:\/\/github.com\/turningpoint1988\/FCNsignal<\/li>\n<\/ul>\n\n\n\n<h1 class=\"wp-block-heading\"><strong>Key Features<\/strong><\/h1>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Quantum Convolutional Layer (QConv): 4-qubit parameterized quantum circuit for enhanced feature extraction<\/li>\n\n\n\n<li>Hybrid Architecture: Seamless integration of quantum and classical neural network layers<\/li>\n\n\n\n<li>Comprehensive Evaluation: Evaluation metrics include RMSE, Pearson correlation (PR), AUROC, and AUPRC<\/li>\n\n\n\n<li>Pre-optimized Hyperparameters: Hyperparameters tuned via Optuna for optimal performance<\/li>\n\n\n\n<li>Interpretable Motifs: Extract position frequency matrices (PFMs) from learned quantum filters<\/li>\n<\/ul>\n\n\n\n<h1 class=\"wp-block-heading\"><strong>Competing Methods<\/strong><\/h1>\n\n\n\n<p>QTFPred can be compared with the following state-of-the-art methods for TF binding prediction:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><a href=\"https:\/\/github.com\/kundajelab\/bpnet\/\">BPNet<\/a> &#8211; Base-pair resolution neural network for TF binding and chromatin accessibility prediction<\/li>\n\n\n\n<li><a href=\"https:\/\/github.com\/turningpoint1988\/FCNsignal\">FCNsignal<\/a> &#8211; Fully convolutional network for TF binding signal prediction at base resolution<\/li>\n<\/ul>\n\n\n\n<h1 
class=\"wp-block-heading\"><strong>System Requirements<\/strong><\/h1>\n\n\n\n<h2 class=\"wp-block-heading\">Hardware Requirements<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li>GPU: NVIDIA GPU with CUDA support (H100 or A100 recommended)<\/li>\n\n\n\n<li>GPU Memory: 5GB or more recommended<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">Software Requirements<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li>OS: Linux (Ubuntu 20.04 or later recommended)<\/li>\n\n\n\n<li>CUDA: 12.0.1<\/li>\n\n\n\n<li>cuDNN: 8.x<\/li>\n\n\n\n<li>Singularity: 3.8 or higher<\/li>\n<\/ul>\n\n\n\n<h1 class=\"wp-block-heading\"><strong>Installation<\/strong><\/h1>\n\n\n\n<h2 class=\"wp-block-heading\">Prerequisites<\/h2>\n\n\n\n<p>Ensure your working directory is set to the QTFPred_signal repository root:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>cd \/path\/to\/QTFPred_signal\nexport QTFPRED_ROOT=$(pwd)<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>Quick Setup (Recommended)<\/strong><\/h2>\n\n\n\n<p>The repository includes a pre-built Singularity container (<code>singularity\/test.v2.sif<\/code>, 11GB) with all dependencies installed. 
This is the recommended approach since the MEME Suite website is currently down.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code># Verify the pre-built container exists\nls -lh singularity\/test.v2.sif\n# Expected: ~11GB file\n\n# Test the container\nsingularity exec --nv singularity\/test.v2.sif python3.11 -c \"\nimport torch\nimport pennylane as qml\nprint(f'PyTorch: {torch.__version__}')\nprint(f'PennyLane: {qml.__version__}')\nprint(f'CUDA available: {torch.cuda.is_available()}')\n\"<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>Alternative: Build Container from Definition (Optional)<\/strong><\/h2>\n\n\n\n<p>If you need to rebuild the container:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code># Requires sudo privileges\nsudo singularity build singularity\/test.v2.sif singularity\/project_FCNsignal.def<\/code><\/pre>\n\n\n\n<p>Note: Building takes approximately 15-20 minutes.<\/p>\n\n\n\n<div class=\"wp-block-group\"><div class=\"wp-block-group__inner-container is-layout-constrained wp-block-group-is-layout-constrained\">\n<h2 class=\"wp-block-heading\" id=\"before-you-start-configuring-paths\"><strong>Before You Start: Configuring Paths<\/strong><\/h2>\n<\/div><\/div>\n\n\n\n<h2 class=\"wp-block-heading\">Understanding Singularity Bind Mounts<\/h2>\n\n\n\n<p>QTFPred uses Singularity containers to ensure reproducible execution environments. To use the scripts, you need to configure bind mounts that connect your host filesystem to the container&#8217;s internal filesystem.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>What is a Bind Mount?<\/strong><\/h2>\n\n\n\n<p>A bind mount allows the Singularity container to access files and directories on your host system. 
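For example, with the repository bound into the container, files under the host path become visible at the container path; a small sketch, where the host path is a hypothetical placeholder:

```shell
# Hypothetical host location; substitute your actual checkout path
PROJECT_ROOT="/home/username/QTFPred_signal"

# HOST_PATH:CONTAINER_PATH — the repository is mounted at /mnt/QTFPred_signal inside the container
BIND_PATH="${PROJECT_ROOT}:/mnt/QTFPred_signal"
echo "${BIND_PATH}"

# With the bind active, host files are reachable at the container path, e.g.:
# singularity exec --bind "${BIND_PATH}" singularity/test.v2.sif ls /mnt/QTFPred_signal
```

The `--bind` flag of `singularity exec` accepts exactly this `HOST:CONTAINER` form.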
Without proper bind configuration, the container cannot read your data or save results.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>Bind Mount Syntax<\/strong><\/h2>\n\n\n\n<pre class=\"wp-block-code\"><code>BIND_PATH=\"HOST_PATH:CONTAINER_PATH\"<\/code><\/pre>\n\n\n\n<ul class=\"wp-block-list\">\n<li>HOST_PATH (left side): Your actual filesystem path where QTFPred_signal is located<\/li>\n\n\n\n<li>CONTAINER_PATH (right side): Fixed internal path inside the container (<code>\/mnt\/QTFPred_signal<\/code>)<\/li>\n\n\n\n<li>The colon (<code>:<\/code>) separates the two paths<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>Why This Matters<\/strong><\/h2>\n\n\n\n<p>When you run scripts, they execute inside the Singularity container. The container has its own isolated filesystem. By setting <code>BIND_PATH<\/code>, you tell Singularity to &#8220;mount&#8221; your host directory at a specific location inside the container, making your files accessible.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>How to Configure Scripts<\/strong><\/h2>\n\n\n\n<p>Each execution script contains a &#8220;User Configuration&#8221; section at the top that you must update before running:<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>Step 1: Locate Your QTFPred_signal Path<\/strong><\/h2>\n\n\n\n<p>First, determine the absolute path to your QTFPred_signal repository:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>cd QTFPred_signal\npwd\n# Output example: \/home\/username\/QTFPred_signal<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>Step 2: Edit Each Script<\/strong><\/h2>\n\n\n\n<p>Open the script you want to run and find the &#8220;User Configuration&#8221; section (first 20-30 lines):<\/p>\n\n\n\n<p><strong>Example from <code>execute_train_QTFPred_signal.sh<\/code>:<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-code\"><code># ============================================================================\n# User Configuration (REQUIRED: Update these 
paths for your environment)\n# ============================================================================\n#\n# Set the absolute path to your QTFPred_signal repository root directory.\n# This path should point to where you cloned\/extracted QTFPred_signal.\n#\n# Example configurations:\n#   PROJECT_ROOT=\"\/home\/username\/QTFPred_signal\"\n#   PROJECT_ROOT=\"\/data\/projects\/QTFPred_signal\"\n#   PROJECT_ROOT=\"\/mnt\/storage\/research\/QTFPred_signal\"\n#\nPROJECT_ROOT=\"\/path\/to\/QTFPred_signal\"  # &lt;- CHANGE THIS LINE\n\n# Singularity container path (relative to PROJECT_ROOT)\nSINGULARITY_CONTAINER_PATH=\"${PROJECT_ROOT}\/singularity\/test.v2.sif\"\n\n# Bind path configuration (HOST:CONTAINER format)\nBIND_PATH=\"${PROJECT_ROOT}:\/mnt\/QTFPred_signal\"  # &lt;- This updates automatically\n\n# Python path for container environment\nSINGULARITYENV_PYTHONPATH=\"\/mnt\/QTFPred_signal\/scripts:${PYTHONPATH}\"<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>Step 3: Update PROJECT_ROOT<\/strong><\/h2>\n\n\n\n<p>Replace <code>\/path\/to\/QTFPred_signal<\/code> with your actual path:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code># Before (default)\nPROJECT_ROOT=\"\/path\/to\/QTFPred_signal\"\n\n# After (your environment - example)\nPROJECT_ROOT=\"\/home\/username\/QTFPred_signal\"<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>Step 4: Save and Verify<\/strong><\/h2>\n\n\n\n<p>After editing, verify your configuration:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code># Check that the container exists at the specified path\nls ${PROJECT_ROOT}\/singularity\/test.v2.sif\n\n# Output: Should show the 11GB container file<\/code><\/pre>\n\n\n\n<p>The following scripts need path configuration before use:<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><thead><tr><th>Script<\/th><th>Purpose<\/th><th>Configuration Required<\/th><\/tr><\/thead><tbody><tr><td><code>execute_train_QTFPred_signal.sh<\/code><\/td><td>Train 
quantum model<\/td><td>\u2713<\/td><\/tr><tr><td><code>execute_train_FCNsignal_signal.sh<\/code><\/td><td>Train FCNsignal<\/td><td>\u2713<\/td><\/tr><tr><td><code>execute_train_BPNet_signal.sh<\/code><\/td><td>Train BPNet<\/td><td>\u2713<\/td><\/tr><tr><td><code>execute_bed2signal.sh<\/code><\/td><td>Data preprocessing<\/td><td>\u2713<\/td><\/tr><tr><td><code>execute_download.sh<\/code><\/td><td>Download ChIP-seq data<\/td><td>\u2713<\/td><\/tr><tr><td><code>extract_motif_from_QTFPred.sh<\/code><\/td><td>Extract motifs<\/td><td>\u2713<\/td><\/tr><tr><td><code>run_tomtom_against_JASPAR.sh<\/code><\/td><td>TomTom analysis<\/td><td>\u2713<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<h1 class=\"wp-block-heading\"><strong>Directory Structure<\/strong><\/h1>\n\n\n\n<pre class=\"wp-block-code\"><code>QTFPred_signal\/\n\u251c\u2500\u2500 README.md\n\u251c\u2500\u2500 data\/                              # Data directory\n\u2502   \u251c\u2500\u2500 HeLa-S3\/                       # Cell line directory\n\u2502   \u2502   \u251c\u2500\u2500 datalist.txt              # List of TFs to download\n\u2502   \u2502   \u2514\u2500\u2500 ELK1\/                     # TF directory (example data included)\n\u2502   \u2502       \u251c\u2500\u2500 thresholded.bed       # IDR thresholded peaks\n\u2502   \u2502       \u251c\u2500\u2500 p-value.bigWig        # ChIP-seq signal\n\u2502   \u2502       \u2514\u2500\u2500 data\/                 # Preprocessed data\n\u2502   \u2502           \u251c\u2500\u2500 ELK1_train.npz    # Training data (~180MB)\n\u2502   \u2502           \u251c\u2500\u2500 ELK1_test.npz     # Test data (~23MB)\n\u2502   \u2502           \u2514\u2500\u2500 ELK1_neg.npz      # Negative data (~22MB)\n\u2502   \u251c\u2500\u2500 K562\/                          # Other cell lines\n\u2502   \u2502   \u2514\u2500\u2500 datalist.txt\n\u2502   \u251c\u2500\u2500 GM12878\/\n\u2502   \u2502   \u2514\u2500\u2500 datalist.txt\n\u2502   \u251c\u2500\u2500 Genome\/                     
   # Reference genome (\u203buser downloads)\n\u2502   \u2502   \u251c\u2500\u2500 hg38.fa                   # \u203bobtained via download_genome.sh\n\u2502   \u2502   \u2514\u2500\u2500 chromsize                 # \u203bobtained via download_genome.sh\n\u2502   \u2514\u2500\u2500 JASPAR\/                        # Motif database\n\u2502       \u2514\u2500\u2500 JASPAR2024_CORE_vertebrates_non-redundant_pfms_meme.txt\n\u251c\u2500\u2500 scripts\/                           # Execution scripts\n\u2502   \u251c\u2500\u2500 models\/                        # Model definitions\n\u2502   \u2502   \u251c\u2500\u2500 QTFPred_signal.py         # Quantum model\n\u2502   \u2502   \u251c\u2500\u2500 FCNmotif.py               # Classical model components\n\u2502   \u2502   \u2514\u2500\u2500 quantum_convolutional_layer.py  # Quantum convolutional layer\n\u2502   \u251c\u2500\u2500 data_processing\/               # Data processing\n\u2502   \u2502   \u251c\u2500\u2500 download_genome.sh        # Download genome reference\n\u2502   \u2502   \u251c\u2500\u2500 download_encode_data.py   # Download ChIP-seq data (Python)\n\u2502   \u2502   \u251c\u2500\u2500 execute_download.sh       # Download ChIP-seq data (Shell)\n\u2502   \u2502   \u251c\u2500\u2500 bed2signal.py             # Preprocessing (Python)\n\u2502   \u2502   \u251c\u2500\u2500 execute_bed2signal.sh     # Preprocessing (Shell)\n\u2502   \u2502   \u2514\u2500\u2500 datasets.py               # Dataset class\n\u2502   \u251c\u2500\u2500 training_execution_sh\/         # Training execution scripts\n\u2502   \u2502   \u251c\u2500\u2500 execute_train_QTFPred_signal.sh   # Train quantum model\n\u2502   \u2502   \u251c\u2500\u2500 execute_train_FCNsignal_signal.sh # Train FCNsignal\n\u2502   \u2502   \u2514\u2500\u2500 execute_train_BPNet_signal.sh     # Train BPNet\n\u2502   \u251c\u2500\u2500 run_model\/                     # Model execution\n\u2502   \u2502   \u251c\u2500\u2500 run_QTFPred_signal.py     # Run quantum 
model\n\u2502   \u2502   \u251c\u2500\u2500 run_classical_signal.py   # Run classical models\n\u2502   \u2502   \u2514\u2500\u2500 Trainer_signal.py         # Trainer class\n\u2502   \u251c\u2500\u2500 motif\/                         # Motif analysis\n\u2502   \u2502   \u251c\u2500\u2500 extract_motif_from_QTFPred.sh     # Extract motifs (Shell)\n\u2502   \u2502   \u251c\u2500\u2500 extract_motif_from_QTFPred.py     # Extract motifs (Python)\n\u2502   \u2502   \u2514\u2500\u2500 run_tomtom_against_JASPAR.sh      # TomTom analysis\n\u2502   \u2514\u2500\u2500 utils\/                         # Utilities\n\u2502       \u251c\u2500\u2500 loss.py                   # Loss functions\n\u2502       \u2514\u2500\u2500 check_npz_shapes.py       # Data validation\n\u251c\u2500\u2500 singularity\/                       # Singularity container\n\u2502   \u251c\u2500\u2500 test.v2.sif                   # Pre-built image (11GB, recommended)\n\u2502   \u251c\u2500\u2500 project_FCNsignal.def         # Container definition file\n\u2502   \u2514\u2500\u2500 requirements.txt              # Python dependencies\n\u251c\u2500\u2500 experiments\/                       # Experiment results (created after execution)\n\u2502   \u2514\u2500\u2500 {model}_{cell}_{TF}_{date}\/   # Experiment directory\n\u2502       \u251c\u2500\u2500 training\/                 # Training results\n\u2502       \u2502   \u251c\u2500\u2500 model_best.pth        # Best model weights\n\u2502       \u2502   \u251c\u2500\u2500 record.txt            # Evaluation metrics\n\u2502       \u2502   \u251c\u2500\u2500 info.log              # Execution log\n\u2502       \u2502   \u251c\u2500\u2500 debug.log             # Debug log\n\u2502       \u2502   \u2514\u2500\u2500 losscurve\/            # Loss curves\n\u2502       \u2502       \u2514\u2500\u2500 LossCurve.png\n\u2502       \u2514\u2500\u2500 motif\/                    # Motif analysis results\n\u2502           \u251c\u2500\u2500 motif.meme            # Extracted motifs 
(MEME format)\n\u2502           \u251c\u2500\u2500 info.log\n\u2502           \u251c\u2500\u2500 debug.log\n\u2502           \u2514\u2500\u2500 tomtom\/               # TomTom analysis results\n\u2502               \u251c\u2500\u2500 tomtom.tsv        # Matching results\n\u2502               \u2514\u2500\u2500 tomtom.xml\n\u251c\u2500\u2500 notebooks\/                         # Tutorial Jupyter notebooks\n\u2502   \u251c\u2500\u2500 01_quantum_computing_introduction.ipynb        # Quantum computing basics\n\u2502   \u251c\u2500\u2500 02_quantum_convolutional_layer_tutorial.ipynb  # Quantum convolutional layer\n\u2502   \u251c\u2500\u2500 requirements.txt               # Python dependencies for notebooks\n\u2502   \u2514\u2500\u2500 dev\/                           # Development versions (archived)\n\u251c\u2500\u2500 docs\/                              # Documentation (empty)\n\u2514\u2500\u2500 logs\/                              # Logs (empty)<\/code><\/pre>\n\n\n\n<p><strong>Notes:<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Files under <code>data\/Genome\/<\/code> (hg38.fa, chromsize) must be downloaded by users via <code>download_genome.sh<\/code><\/li>\n\n\n\n<li><code>docs\/<\/code> and <code>logs\/<\/code> directories are initially empty<\/li>\n\n\n\n<li><code>experiments\/<\/code> directory is automatically created during training execution<\/li>\n<\/ul>\n\n\n\n<h1 class=\"wp-block-heading\"><strong>Quick Start<\/strong><\/h1>\n\n\n\n<p>This quick start demonstrates training QTFPred using the pre-included HeLa-S3\/ELK1 dataset, allowing you to immediately evaluate the model without downloading additional data.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code># Set working directory\ncd \/path\/to\/QTFPred_signal\n\n# Step 1: Verify pre-existing data\nls data\/HeLa-S3\/ELK1\/data\/\n# Expected output:\n#   ELK1_train.npz (~180MB)\n#   ELK1_test.npz (~23MB)\n#   ELK1_neg.npz (~22MB)\n\n# Step 2: Configure script paths (REQUIRED - First time only)\n# 
Before running training, configure PROJECT_ROOT in the script\n# See \"Before You Start: Configuring Paths\" section above for details\nnano scripts\/training_execution_sh\/execute_train_QTFPred_signal.sh\n# Change: PROJECT_ROOT=\"\/path\/to\/QTFPred_signal\"\n# To: PROJECT_ROOT=\"\/your\/actual\/path\/to\/QTFPred_signal\"\n\n# Step 3: Train QTFPred (quantum model)\n# Note: This step requires GPU and takes approximately 30-60 minutes\nbash scripts\/training_execution_sh\/execute_train_QTFPred_signal.sh HeLa-S3 ELK1\n\n# Step 4: Check training results\n# Results are saved in experiments\/QTFPred_signal_HeLa-S3_ELK1_{date}\/training\/\ncat experiments\/QTFPred_signal_HeLa-S3_ELK1_*\/training\/record.txt<\/code><\/pre>\n\n\n\n<p><strong>Expected Output in record.txt:<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>Test Results:\nRegression - RMSE: 0.XXX, PR: 0.7X-0.8X\nClassification - AUROC: 0.8X-0.9X, AUPRC: 0.7X-0.8X\nSample Size:\nTrain: ~8000, Test: ~1000, Negative: ~1000<\/code><\/pre>\n\n\n\n<h1 class=\"wp-block-heading\"><strong>Complete Workflow<\/strong><\/h1>\n\n\n\n<p>This section describes the complete workflow from raw data download to motif analysis. If you want to process your own ChIP-seq data, follow these steps sequentially.<\/p>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p>Important: Before executing any scripts in this workflow, you must configure the <code>PROJECT_ROOT<\/code> path in each script. See the &#8220;<a href=\"#before-you-start-configuring-paths\">Before You Start: Configuring Paths<\/a>&#8221; section for detailed instructions. 
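If you would rather not open an editor, the placeholder can also be rewritten non-interactively; the snippet below demonstrates the substitution on a temporary file (in practice, point SCRIPT at the real training script, and the path assigned to NEW_ROOT here is hypothetical):

```shell
# Demo on a temporary file; in practice use e.g.
#   SCRIPT="scripts/training_execution_sh/execute_train_QTFPred_signal.sh"
SCRIPT="$(mktemp)"
printf 'PROJECT_ROOT="/path/to/QTFPred_signal"\n' > "${SCRIPT}"

NEW_ROOT="/home/username/QTFPred_signal"   # hypothetical path; use your own
# Replace the PROJECT_ROOT line, keeping a .bak backup of the original
sed -i.bak "s|^PROJECT_ROOT=.*|PROJECT_ROOT=\"${NEW_ROOT}\"|" "${SCRIPT}"
grep '^PROJECT_ROOT=' "${SCRIPT}"
```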
The scripts that require configuration are listed in the configuration section.<\/p>\n<\/blockquote>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>Step 1: Singularity Container Setup<\/strong><\/h2>\n\n\n\n<p>Recommended: Use the pre-built container included in the repository.<br>Note: The pre-built container is recommended because the MEME Suite website is currently experiencing downtime, which may cause build failures.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>cd \/path\/to\/QTFPred_signal\n\n# Verify container exists\nls -lh singularity\/test.v2.sif\n# Expected: 11409920000 bytes (~11GB)\n\n# Test container functionality\nsingularity exec singularity\/test.v2.sif python3.11 --version\nsingularity exec singularity\/test.v2.sif meme -version<\/code><\/pre>\n\n\n\n<p><strong>Alternative<\/strong>: Build the container yourself (requires sudo privileges and ~20 minutes).<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>sudo singularity build singularity\/test.v2.sif singularity\/project_FCNsignal.def<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>Step 2: Data Download<\/strong><\/h2>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>2a. Download Genome Reference<\/strong><\/h2>\n\n\n\n<p>Download the hg38 human genome reference and chromosome size information:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>cd \/path\/to\/QTFPred_signal\n\n# Download hg38.fa (~938MB) and chromsize\nbash scripts\/data_processing\/download_genome.sh\n\n# Verify downloaded files\nls -lh data\/Genome\/\n# Expected output:\n#   hg38.fa (~3GB uncompressed)\n#   chromsize (~3KB)<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>2b. Download ChIP-seq Data<\/strong><\/h2>\n\n\n\n<p>Download ChIP-seq datasets (peak files and signal tracks) for specific cell lines. 
The repository includes <code>datalist.txt<\/code> files for three cell lines:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>HeLa-S3<\/strong>: 12 TFs (CTCF, E2F1, E2F6, ELK1, ELK4, JUND, MAFF, MAX, MAZ, REST, RFX5, TBP)<\/li>\n\n\n\n<li><strong>K562<\/strong>: Multiple TFs (see <code>data\/K562\/datalist.txt<\/code>)<\/li>\n\n\n\n<li><strong>GM12878<\/strong>: Multiple TFs (see <code>data\/GM12878\/datalist.txt<\/code>)<\/li>\n<\/ul>\n\n\n\n<pre class=\"wp-block-code\"><code># Example: Download all ChIP-seq data for HeLa-S3\nbash scripts\/data_processing\/execute_download.sh HeLa-S3\n\n# Output structure:\n# data\/HeLa-S3\/{TF}\/\n#   \u251c\u2500\u2500 thresholded.bed      # IDR thresholded peaks\n#   \u2514\u2500\u2500 p-value.bigWig       # ChIP-seq signal track<\/code><\/pre>\n\n\n\n<p><strong>Arguments<\/strong>:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><code>&lt;cell_line&gt;<\/code> &#8211; Cell line name (HeLa-S3, K562, or GM12878)<\/li>\n\n\n\n<li><code>[--force]<\/code> &#8211; Force re-download existing files (optional)<\/li>\n\n\n\n<li><code>[--verbose]<\/code> &#8211; Enable verbose logging (optional)<\/li>\n\n\n\n<li><code>[--dry_run]<\/code> &#8211; Test URLs without downloading (optional)<\/li>\n<\/ul>\n\n\n\n<p><strong>Note<\/strong>: You can download specific cell lines only. 
For example, to start with HeLa-S3:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>bash scripts\/data_processing\/execute_download.sh HeLa-S3<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\">Step 3: Data Preprocessing (bed2signal)<\/h2>\n\n\n\n<p>Convert BED peak files and BigWig signal files into NPZ format suitable for model training.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>cd \/path\/to\/QTFPred_signal\n\n# Example: Preprocess E2F6 data for HeLa-S3\nbash scripts\/data_processing\/execute_bed2signal.sh HeLa-S3 E2F6\n\n# Output: data\/HeLa-S3\/E2F6\/data\/\n#   \u251c\u2500\u2500 E2F6_train.npz\n#   \u251c\u2500\u2500 E2F6_test.npz\n#   \u2514\u2500\u2500 E2F6_neg.npz<\/code><\/pre>\n\n\n\n<p>Arguments:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><code>&lt;cell_line&gt;<\/code> &#8211; Cell line name (e.g., HeLa-S3, K562, GM12878)<\/li>\n\n\n\n<li><code>&lt;TF_name&gt;<\/code> &#8211; Transcription factor name (e.g., E2F6, ELK1, CTCF)<\/li>\n<\/ul>\n\n\n\n<p><strong>Processing steps<\/strong>:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Extract 1,000 bp sequences centered on each peak position<\/li>\n\n\n\n<li>Apply random position shifts (-100 to 100 bp) for augmentation<\/li>\n\n\n\n<li>Filter samples with signal values in bottom 5%<\/li>\n\n\n\n<li>Generate negative samples from 3,000 bp upstream regions<\/li>\n\n\n\n<li>Normalize signal values: <code>log10(1 + signal)<\/code><\/li>\n<\/ol>\n\n\n\n<p><strong>Processing time<\/strong>: 5-15 minutes per TF depending on peak count.<\/p>\n\n\n\n<p><strong>Note<\/strong>: For quick testing with pre-processed data, see the Quick Start section which uses HeLa-S3\/ELK1 with included NPZ files.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Step 4: Model Training<\/h2>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p>Tip for Large-Scale Experiments: For training across multiple TFs and cell lines, we recommend using a job management system such as SLURM. 
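As an illustrative sketch, an array job could train one TF per task; the #SBATCH resource lines below are placeholders to adapt to your cluster, and the TF list is the HeLa-S3 set named earlier:

```shell
#!/bin/bash
# Illustrative SLURM array job — adjust job name, partition, and resources to your cluster
#SBATCH --job-name=qtfpred
#SBATCH --array=0-11
#SBATCH --gres=gpu:1

# One array task per HeLa-S3 TF
TFS=(CTCF E2F1 E2F6 ELK1 ELK4 JUND MAFF MAX MAZ REST RFX5 TBP)
TF="${TFS[${SLURM_ARRAY_TASK_ID:-0}]}"
echo "Training QTFPred for HeLa-S3/${TF}"
# bash scripts/training_execution_sh/execute_train_QTFPred_signal.sh HeLa-S3 "${TF}"
```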
The provided scripts are compatible with SLURM array jobs for efficient parallel processing.<\/p>\n<\/blockquote>\n\n\n\n<p>Train models to predict TF binding signals from DNA sequences. QTFPred supports both quantum and classical models.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">4a. Quantum Model (QTFPred)<\/h2>\n\n\n\n<p>Train the quantum-enhanced model with 4-qubit quantum convolutional layers:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>cd \/path\/to\/QTFPred_signal\n\n# Train QTFPred for HeLa-S3\/E2F6\nbash scripts\/training_execution_sh\/execute_train_QTFPred_signal.sh HeLa-S3 E2F6\n\n# Output directory structure:\n# experiments\/QTFPred_signal_HeLa-S3_E2F6_{date}\/training\/\n#   \u251c\u2500\u2500 model_best.pth              # Best model weights (saved at lowest validation loss)\n#   \u251c\u2500\u2500 record.txt                  # Evaluation metrics (RMSE, PR, AUROC, AUPRC)\n#   \u251c\u2500\u2500 info.log                    # Training progress log\n#   \u251c\u2500\u2500 debug.log                   # Detailed debug information\n#   \u2514\u2500\u2500 losscurve\/\n#       \u2514\u2500\u2500 LossCurve.png           # Training\/validation loss curves<\/code><\/pre>\n\n\n\n<p>Arguments:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><code>&lt;cell_line&gt;<\/code> &#8211; Cell line name (e.g., HeLa-S3, K562, GM12878)<\/li>\n\n\n\n<li><code>&lt;TF_name&gt;<\/code> &#8211; Transcription factor name (e.g., E2F6, ELK1, CTCF)<\/li>\n<\/ul>\n\n\n\n<p><strong>Evaluation metrics in record.txt<\/strong>:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>RMSE<\/strong>: Root Mean Square Error (regression task)<\/li>\n\n\n\n<li><strong>PR<\/strong>: Pearson Correlation (regression task)<\/li>\n\n\n\n<li><strong>AUROC<\/strong>: Area Under ROC Curve (classification task)<\/li>\n\n\n\n<li><strong>AUPRC<\/strong>: Area Under Precision-Recall Curve (classification task)<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">4b. 
Classical Models (for comparison)<\/h2>\n\n\n\n<p>Train baseline classical models for performance comparison:<\/p>\n\n\n\n<p><strong>FCNsignal<\/strong>:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>bash scripts\/training_execution_sh\/execute_train_FCNsignal_signal.sh HeLa-S3 E2F6\n\n# Output: experiments\/FCNsignal_HeLa-S3_E2F6_{date}\/training\/<\/code><\/pre>\n\n\n\n<p><strong>Arguments<\/strong>:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><code>&lt;cell_line&gt;<\/code> &#8211; Cell line name (e.g., HeLa-S3, K562, GM12878)<\/li>\n\n\n\n<li><code>&lt;TF_name&gt;<\/code> &#8211; Transcription factor name (e.g., E2F6, ELK1, CTCF)<\/li>\n<\/ul>\n\n\n\n<p><strong>BPNet<\/strong>:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>bash scripts\/training_execution_sh\/execute_train_BPNet_signal.sh HeLa-S3 E2F6\n\n# Output: experiments\/BPNet_HeLa-S3_E2F6_{date}\/training\/<\/code><\/pre>\n\n\n\n<p><strong>Arguments<\/strong>:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><code>&lt;cell_line&gt;<\/code> &#8211; Cell line name (e.g., HeLa-S3, K562, GM12878)<\/li>\n\n\n\n<li><code>&lt;TF_name&gt;<\/code> &#8211; Transcription factor name (e.g., E2F6, ELK1, CTCF)<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">Step 5: Motif Extraction (Quantum Model Only)<\/h2>\n\n\n\n<p>Extract learned TF binding motifs from the quantum convolutional filters. 
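Conceptually, each PFM is a per-position tally of bases over the high-activation 16 bp sub-sequences; a minimal sketch (the sequences below are made up for illustration, not real filter output):

```python
def build_pfm(seqs):
    """Position Frequency Matrix: per-position A/C/G/T counts
    over equal-length sequences (16 bp in QTFPred's case)."""
    length = len(seqs[0])
    assert all(len(s) == length for s in seqs), "sequences must be equal length"
    pfm = {base: [0] * length for base in "ACGT"}
    for seq in seqs:
        for pos, base in enumerate(seq):
            pfm[base][pos] += 1
    return pfm

# Toy high-activation sub-sequences (illustrative only)
seqs = ["ACGGAAGTGACGTCAT",
        "ACGGAAGTGGCGTCAT",
        "TCGGAAGTGACGTCAT"]
pfm = build_pfm(seqs)
print(pfm["A"][0])  # → 2 (two of the three sequences have 'A' at position 0)
```

Counts are typically normalized to per-position probabilities before being written out in MEME motif format.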
This step applies only to QTFPred, as quantum filters learn interpretable sequence patterns.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>cd \/path\/to\/QTFPred_signal\n\n# Extract motifs from trained QTFPred model\n# Replace {experiment_name} with your actual experiment directory name\n# Example: QTFPred_signal_HeLa-S3_E2F6_1027\nbash scripts\/motif\/extract_motif_from_QTFPred.sh HeLa-S3 E2F6 QTFPred_signal_HeLa-S3_E2F6_1027\n\n# Output: experiments\/QTFPred_signal_HeLa-S3_E2F6_1027\/motif\/\n#   \u251c\u2500\u2500 motif.meme         # 64 PFMs in MEME format (16 bp each)\n#   \u251c\u2500\u2500 info.log           # Execution log\n#   \u2514\u2500\u2500 debug.log          # Debug information<\/code><\/pre>\n\n\n\n<p><strong>Arguments<\/strong>:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><code>&lt;cell_line&gt;<\/code> &#8211; Cell line name (e.g., HeLa-S3, K562, GM12878)<\/li>\n\n\n\n<li><code>&lt;TF_name&gt;<\/code> &#8211; Transcription factor name (e.g., E2F6, ELK1, CTCF)<\/li>\n\n\n\n<li><code>&lt;experiment_name&gt;<\/code> &#8211; Experiment directory name from training step (e.g., QTFPred_signal_HeLa-S3_E2F6_1027)<\/li>\n<\/ul>\n\n\n\n<p><strong>What this step does<\/strong>:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Processes the test dataset through the trained QTFPred model<\/li>\n\n\n\n<li>Identifies 100 bp sub-regions with the highest predicted binding signals<\/li>\n\n\n\n<li>Calculates activation scores from the 64 quantum convolutional filters<\/li>\n\n\n\n<li>Extracts the 16 bp sub-sequences with the highest activation for each filter<\/li>\n\n\n\n<li>Constructs Position Frequency Matrices (PFMs) from high-scoring sequences<\/li>\n\n\n\n<li>Outputs 64 PFMs representing learned motif patterns<\/li>\n<\/ol>\n\n\n\n<h2 class=\"wp-block-heading\">Step 6: TomTom Analysis<\/h2>\n\n\n\n<p>Compare extracted motifs against the JASPAR 2024 vertebrate database to identify known TF binding motifs and discover cooperative binding patterns.<\/p>\n\n\n\n<pre
class=\"wp-block-code\"><code>cd \/path\/to\/QTFPred_signal\n\n# Run TomTom analysis against JASPAR database\nbash scripts\/motif\/run_tomtom_against_JASPAR.sh HeLa-S3 E2F6 QTFPred_signal_HeLa-S3_E2F6_1027\n\n# Output: experiments\/QTFPred_signal_HeLa-S3_E2F6_1027\/motif\/tomtom\/\n#   \u251c\u2500\u2500 tomtom.tsv         # Motif matching results (q-value &lt; 0.1)\n#   \u2514\u2500\u2500 tomtom.xml         # Detailed XML output<\/code><\/pre>\n\n\n\n<p><strong>Arguments<\/strong>:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><code>&lt;cell_line&gt;<\/code> &#8211; Cell line name (e.g., HeLa-S3, K562, GM12878)<\/li>\n\n\n\n<li><code>&lt;TF_name&gt;<\/code> &#8211; Transcription factor name (e.g., E2F6, ELK1, CTCF)<\/li>\n\n\n\n<li><code>&lt;experiment_name&gt;<\/code> &#8211; Experiment directory name from training step (e.g., QTFPred_signal_HeLa-S3_E2F6_1027)<\/li>\n<\/ul>\n\n\n\n<p>Interpreting results (tomtom.tsv):<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Query_ID<\/strong>: Filter number (0-63)<\/li>\n\n\n\n<li><strong>Target_ID<\/strong>: Matched JASPAR motif ID<\/li>\n\n\n\n<li><strong>p-value<\/strong>: Statistical significance<\/li>\n\n\n\n<li><strong>q-value<\/strong>: Multiple testing corrected p-value (threshold: &lt; 0.1)<\/li>\n\n\n\n<li><strong>Overlap<\/strong>: Number of overlapping positions<\/li>\n\n\n\n<li><strong>Offset<\/strong>: Alignment offset<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">Tutorial Notebooks (Optional)<\/h2>\n\n\n\n<p>For users who want to:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Understand quantum computing fundamentals and QTFPred implementation<\/li>\n\n\n\n<li>Apply quantum convolutional layers to custom use cases<\/li>\n\n\n\n<li>Interactively learn quantum circuit learning principles<\/li>\n<\/ul>\n\n\n\n<p>We provide interactive Jupyter notebooks in the <code>notebooks\/<\/code> directory.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Running Notebooks in VS Code 
(Recommended)<\/h2>\n\n\n\n<p><strong>Quick Start<\/strong> &#8211; 5 Steps:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Open VS Code<\/strong> and open the <code>QTFPred_signal<\/code> folder<\/li>\n\n\n\n<li><strong>Install Extensions<\/strong>: Python + Jupyter (by Microsoft)<\/li>\n\n\n\n<li><strong>Open Notebook<\/strong>: <code>notebooks\/01_quantum_computing_introduction.ipynb<\/code><\/li>\n\n\n\n<li><strong>Select Kernel<\/strong>: Click top-right \u2192 Choose <code>.venv: Python 3.11.x<\/code><\/li>\n\n\n\n<li><strong>Run Cells<\/strong>: Press <code>Shift + Enter<\/code> to execute sequentially<\/li>\n<\/ol>\n\n\n\n<p>The repository includes a <strong>pre-configured virtual environment<\/strong> (<code>.venv\/<\/code>) with Python 3.11 and all required dependencies (PennyLane, PyTorch, NumPy, Matplotlib, Jupyter).<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Alternative: Jupyter Lab (Command Line)<\/h2>\n\n\n\n<pre class=\"wp-block-code\"><code># Activate the pre-configured environment\nsource .venv\/bin\/activate\n\n# Launch Jupyter Lab\njupyter lab\n\n# Opens browser at http:\/\/localhost:8888<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\">Building Your Own Virtual Environment (Advanced)<\/h2>\n\n\n\n<p>If you prefer to create your own virtual environment instead of using the pre-configured <code>.venv<\/code>:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code># Create virtual environment with Python 3.11\npython3.11 -m venv my_qtfpred_env\n\n# Activate environment\nsource my_qtfpred_env\/bin\/activate  # Linux\/macOS\n# OR\nmy_qtfpred_env\\Scripts\\activate     # Windows\n\n# Install dependencies from notebooks\/requirements.txt\npip install -r notebooks\/requirements.txt\n\n# Register kernel for Jupyter\npython -m ipykernel install --user --name=my_qtfpred_env\n\n# Launch Jupyter Lab or VS Code with this environment\njupyter lab<\/code><\/pre>\n\n\n\n<p>The <code>notebooks\/requirements.txt<\/code> file contains all necessary dependencies 
including PennyLane, PyTorch, and visualization libraries.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Tutorial Contents<\/h2>\n\n\n\n<p><strong>Notebook 01: Quantum Computing Introduction<\/strong> (<code>01_quantum_computing_introduction.ipynb<\/code>)<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Bra-ket notation and quantum state vectors<\/li>\n\n\n\n<li>Quantum gates (Hadamard, Pauli, CNOT, rotation gates)<\/li>\n\n\n\n<li>Multi-qubit systems and entanglement<\/li>\n\n\n\n<li>Measurement and expectation values<\/li>\n\n\n\n<li>4-qubit circuits (QTFPred architecture foundation)<\/li>\n\n\n\n<li>Parametric quantum circuits for machine learning<\/li>\n\n\n\n<li>PennyLane and PyTorch integration basics<\/li>\n<\/ul>\n\n\n\n<p><strong>Notebook 02: Quantum Convolutional Layer Tutorial<\/strong> (<code>02_quantum_convolutional_layer_tutorial.ipynb<\/code>)<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Part 1<\/strong>: Quantum circuit fundamentals with 4-qubit examples<\/li>\n\n\n\n<li><strong>Part 2<\/strong>: QTFPred&#8217;s quantum circuit architecture (36 parameters, data re-uploading)<\/li>\n\n\n\n<li><strong>Parts 3&#8211;4<\/strong>: Single and multi-channel quantum convolution operations<\/li>\n\n\n\n<li><strong>Part 5<\/strong>: Kernel Division Strategy for receptive field extension (16 bp)<\/li>\n\n\n\n<li><strong>Part 6<\/strong>: Production <code>QConv1d<\/code> class usage with realistic examples (L=1001)<\/li>\n\n\n\n<li><strong>Part 7<\/strong>: <strong>PennyLane broadcasting for efficient batch processing<\/strong> (100-1000\u00d7 speedup)<\/li>\n\n\n\n<li><strong>Part 8<\/strong>: Complete QTFPred model forward pass with base-resolution output<\/li>\n<\/ul>\n\n\n\n<p><strong>Prerequisites<\/strong>:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Notebook 02 assumes completion of Notebook 01<\/li>\n\n\n\n<li>Basic understanding of machine learning and Python<\/li>\n\n\n\n<li>Familiarity with PyTorch (optional but 
helpful)<\/li>\n<\/ul>\n\n\n\n<p><strong>Total Tutorial Time<\/strong>: ~3-4 hours for complete walkthrough<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Hyperparameter Optimization (Optional)<\/h2>\n\n\n\n<p>For users who need to optimize hyperparameters for custom datasets, we provide an Optuna-based workflow that automates the search for well-performing model hyperparameters.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">When to Use Hyperparameter Optimization<\/h2>\n\n\n\n<p>Consider using Optuna tuning when:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Novel TF Targets<\/strong>: Working with TF targets not covered in the paper&#8217;s pre-optimized configurations<\/li>\n\n\n\n<li><strong>Custom Model Architectures<\/strong>: Developing custom models built on the quantum convolutional layer<\/li>\n<\/ul>\n\n\n\n<p>The hyperparameters included in the paper were optimized using this Optuna implementation.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">What Optuna Optimizes<\/h2>\n\n\n\n<p>The optimization process searches for the best combination of:<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><thead><tr><th>Hyperparameter<\/th><th>Type<\/th><th>Search Range<\/th><th>Description<\/th><\/tr><\/thead><tbody><tr><td>Learning rate<\/td><td>Log-scale<\/td><td>1e-5 to 1e-1<\/td><td>AdamW optimizer learning rate<\/td><\/tr><tr><td>Weight decay<\/td><td>Log-scale<\/td><td>1e-5 to 1e-1<\/td><td>AdamW optimizer weight decay<\/td><\/tr><tr><td>Batch size<\/td><td>Integer<\/td><td>20 to 120<\/td><td>Training batch size<\/td><\/tr><tr><td>Dropout<\/td><td>Float<\/td><td>0.1 to 0.8<\/td><td>Dropout rate for regularization<\/td><\/tr><tr><td>Init method<\/td><td>Categorical<\/td><td>Xavier, Default<\/td><td>Weight initialization method<\/td><\/tr><tr><td>Pooling type<\/td><td>Categorical<\/td><td>max, avg<\/td><td>Pooling layer type<\/td><\/tr><tr><td>Decoder kernel<\/td><td>Categorical<\/td><td>3, 5, 7<\/td><td>Decoder 
kernel size (odd only)<\/td><\/tr><tr><td>Activation<\/td><td>Categorical<\/td><td>elu, silu, gelu<\/td><td>Activation function<\/td><\/tr><tr><td>Bottleneck size<\/td><td>Integer<\/td><td>1 to 50<\/td><td>Bottleneck layer output size<\/td><\/tr><tr><td>GRU dropout<\/td><td>Float<\/td><td>0.1 to 0.8<\/td><td>GRU layer dropout rate<\/td><\/tr><tr><td>Quantum kernel<\/td><td>Categorical<\/td><td>3, 5, 7<\/td><td>Quantum kernel size (n_qubits)<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p><strong>Optimization Objective<\/strong>: Maximize Pearson correlation on test set<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Basic Usage<\/h2>\n\n\n\n<pre class=\"wp-block-code\"><code>cd \/path\/to\/QTFPred_signal\n\n# Example: Optimize hyperparameters for HeLa-S3\/ELK1\nbash scripts\/optuna\/run_optuna_QTFPred_signal.sh HeLa-S3 ELK1\n\n# With custom settings\nbash scripts\/optuna\/run_optuna_QTFPred_signal.sh HeLa-S3 ELK1 \\\n    --n_trials 50 \\\n    --max_epoch 20 \\\n    --study_name custom_study_name<\/code><\/pre>\n\n\n\n<p><strong>Arguments<\/strong>:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><code>&lt;cell_line&gt;<\/code> &#8211; Cell line name (e.g., HeLa-S3, K562, GM12878)<\/li>\n\n\n\n<li><code>&lt;TF_name&gt;<\/code> &#8211; Transcription factor name (e.g., ELK1, CTCF, E2F6)<\/li>\n\n\n\n<li><code>--n_trials<\/code> &#8211; Number of optimization trials (default: 100)<\/li>\n\n\n\n<li><code>--max_epoch<\/code> &#8211; Training epochs per trial (default: 30)<\/li>\n\n\n\n<li><code>--study_name<\/code> &#8211; Optuna study name (default: QTFPred_{cell}_{TF})<\/li>\n<\/ul>\n\n\n\n<p><strong>Prerequisites<\/strong>:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Training and test data must be preprocessed (Step 3: bed2signal)<\/li>\n\n\n\n<li>GPU recommended for reasonable optimization time<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">Understanding Output<\/h2>\n\n\n\n<p>After optimization completes, results are saved to:<\/p>\n\n\n\n<pre 
class=\"wp-block-code\"><code>experiments\/optuna_QTFPred_signal_{cell}_{TF}_{date}\/\n\u251c\u2500\u2500 optuna.log                    # Optuna framework logs\n\u251c\u2500\u2500 debug.log                     # Detailed execution logs\n\u2514\u2500\u2500 {study_name}.json             # Best hyperparameters (JSON format)<\/code><\/pre>\n\n\n\n<p><strong>Example <code>{study_name}.json<\/code><\/strong>:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>{\n    \"batch_size\": 64,\n    \"lr\": 0.0001234,\n    \"weight_decay\": 0.00567,\n    \"dropout\": 0.35,\n    \"init_method_name\": \"Xavier\",\n    \"pooling_type\": \"max\",\n    \"decoder_kernel\": 5,\n    \"activation\": \"gelu\",\n    \"bottleneck_size\": 25,\n    \"gru_dropout\": 0.42,\n    \"kernel_size\": 5\n}<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\">Advanced: Parallel Optimization<\/h2>\n\n\n\n<p>One of Optuna&#8217;s powerful features is <strong>parallel optimization<\/strong>. Multiple processes can contribute to the same optimization study simultaneously, dramatically accelerating the search process.<\/p>\n\n\n\n<p><strong>How it works<\/strong>:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Multiple processes share the same SQLite database and study name<\/li>\n\n\n\n<li>Each process runs trials independently<\/li>\n\n\n\n<li>Results are synchronized through the shared database<\/li>\n\n\n\n<li>No manual coordination required<\/li>\n<\/ul>\n\n\n\n<p><strong>Example &#8211; Running 2 parallel optimization processes<\/strong>:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code># Terminal 1: Start first optimization process\nbash scripts\/optuna\/run_optuna_QTFPred_signal.sh HeLa-S3 ELK1 \\\n    --study_name shared_study \\\n    --n_trials 50\n\n# Terminal 2: Start second process (simultaneously)\nbash scripts\/optuna\/run_optuna_QTFPred_signal.sh HeLa-S3 ELK1 \\\n    --study_name shared_study \\\n    --n_trials 50\n\n# Both processes contribute to the same study\n# Total: 100 trials completed faster through 
parallel execution<\/code><\/pre>\n\n\n\n<p><strong>Optuna Database Location<\/strong>:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Shared database: <code>experiments\/optuna_db\/optuna_results.db<\/code><\/li>\n\n\n\n<li>Studies persist across runs<\/li>\n\n\n\n<li>Resume interrupted optimizations by using the same study name<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">Acknowledgment<\/h2>\n\n\n\n<p>This hyperparameter optimization functionality is powered by <a href=\"https:\/\/optuna.org\/\">Optuna<\/a>, an open-source hyperparameter optimization framework designed for machine learning. We gratefully acknowledge the Optuna development team for providing this powerful and user-friendly optimization library.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Citation<\/h2>\n\n\n\n<p>If you use QTFPred in your research, please cite:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>@article{matsubara2025qtfpred,\n  title={QTFPred: robust high-performance quantum machine learning modeling that predicts main and cooperative transcription factor bindings with base resolution},\n  author={Matsubara, Taichi and Machida, Shuto and Owusu, Samuel Papa Kwesi and Asakura, Akihiro and Hashimoto, Hiroki and Matsuoka, Masanori and Nagasaki, Masao},\n  journal={Briefings in Bioinformatics},\n  volume={26},\n  number={6},\n  pages={bbaf604},\n  year={2025},\n  publisher={Oxford University Press}\n}<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\">Contact<\/h2>\n\n\n\n<p>For questions, issues, or feedback:<\/p>\n\n\n\n<p><strong>First Author<\/strong>: Taichi Matsubara<br>&nbsp;&#8211; Division of Biomedical Information Analysis<br>&nbsp;&#8211; Medical Research Center for High Depth Omics<br>&nbsp;&#8211; Medical Institute of Bioregulation, Kyushu University<\/p>\n\n\n\n<p><strong>Corresponding Author<\/strong>: Masao Nagasaki, Ph.D.<br>&nbsp;&#8211; Division of Biomedical Information Analysis<br>&nbsp;&#8211; Medical Research Center for High Depth Omics<br>&nbsp;&#8211; Medical Institute of Bioregulation, 
Kyushu University<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Acknowledgments<\/h2>\n\n\n\n<p>This work was supported by:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><a href=\"https:\/\/www.encodeproject.org\/\">ENCODE Project<\/a> &#8211; ChIP-seq datasets for TF binding analysis<\/li>\n\n\n\n<li><a href=\"https:\/\/jaspar.elixir.no\/\">JASPAR<\/a> &#8211; TF binding motif database (JASPAR 2024)<\/li>\n\n\n\n<li><a href=\"https:\/\/pennylane.ai\/\">PennyLane<\/a> &#8211; Quantum machine learning framework<\/li>\n\n\n\n<li><a href=\"https:\/\/pytorch.org\/\">PyTorch<\/a> &#8211; Deep learning infrastructure<\/li>\n\n\n\n<li><a href=\"https:\/\/optuna.org\/\">Optuna<\/a> &#8211; Hyperparameter optimization framework<\/li>\n\n\n\n<li><a href=\"https:\/\/sylabs.io\/singularity\/\">Singularity<\/a> &#8211; Container platform for reproducible environments<\/li>\n\n\n\n<li><a href=\"https:\/\/meme-suite.org\/\">MEME Suite<\/a> &#8211; Motif analysis tools (TomTom, FIMO)<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<p>Last Updated: 2025-10-28<br>Version: 1.0.0<br>Repository: QTFPred_signal<\/p>\n","protected":false}}