DeltaDesign/BioStructBenchmark

BioStructBenchmark

A tool for identifying the differences between two sets of structures, in particular protein-DNA complexes. It is built primarily for identifying systematic differences between experimentally determined structures and computationally predicted structures generated by programs such as AlphaFold3.

Features

  • Structure Alignment: Superimpose protein-DNA complexes for comparison
  • RMSD Calculation: Comprehensive RMSD decomposition (protein, DNA, interface, per-residue)
  • B-factor/pLDDT Analysis: Compare experimental B-factors with AlphaFold confidence scores
  • Consensus Error Mapping: Handle datasets with incidental mutations
  • Principal Component Analysis: Identify systematic differences
  • DNA Geometry Analysis: Analyze DNA structure using CURVES+

Installation

To install BioStructBenchmark:

pip install biostructbenchmark

Usage

Basic Structure Comparison

biostructbenchmark experimental.cif predicted.cif

This will align the structures and output RMSD metrics for the protein, DNA, and interface regions.
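
As a rough illustration of what per-region RMSD means, the sketch below computes RMSD separately for groups of matched atoms. It assumes the two structures have already been superimposed and that atoms are already paired; the function names and the coordinates are hypothetical and are not taken from the BioStructBenchmark code base.

```python
import math
from collections import defaultdict


def rmsd(pairs):
    """Root-mean-square deviation over matched (x, y, z) coordinate pairs."""
    if not pairs:
        raise ValueError("need at least one coordinate pair")
    total = sum(math.dist(a, b) ** 2 for a, b in pairs)
    return math.sqrt(total / len(pairs))


def rmsd_by_group(atoms):
    """Group matched atoms by label and compute one RMSD per group.

    atoms: iterable of (group_label, experimental_xyz, predicted_xyz),
    assumed to be superimposed already.
    """
    groups = defaultdict(list)
    for label, exp_xyz, pred_xyz in atoms:
        groups[label].append((exp_xyz, pred_xyz))
    return {label: rmsd(pairs) for label, pairs in groups.items()}


# Hypothetical, already-aligned coordinates (in angstroms), for illustration only.
atoms = [
    ("protein", (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)),
    ("protein", (1.0, 1.0, 1.0), (1.0, 1.0, 2.0)),
    ("dna", (5.0, 5.0, 5.0), (5.0, 7.0, 5.0)),
]
result = rmsd_by_group(atoms)
```

Reporting one value per group makes it easier to see whether a large overall RMSD comes from the protein, the DNA, or the interface.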

B-factor/pLDDT Confidence Analysis

Compare experimental B-factors with AlphaFold pLDDT confidence scores:

biostructbenchmark experimental.cif predicted.cif --analyze-bfactor --output-dir ./results

This will generate:

  • Console output with summary statistics
  • CSV file with per-residue B-factor comparisons at ./results/analysis/bfactor_comparison.csv

Understanding the Output:

  • Mean Experimental B-factor: Average thermal motion/disorder in experimental structure
  • Mean Predicted Confidence (pLDDT): Average AlphaFold confidence (0-100 scale)
  • Correlation: How well B-factors correlate with pLDDT scores
  • RMSD: Root mean square difference between B-factors and pLDDT
  • High/Low Confidence Accuracy: Accuracy reported separately for high-confidence regions (pLDDT > 70) and low-confidence regions (pLDDT ≤ 70)
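
The summary metrics listed above can be sketched in a few lines. This is a minimal illustration, assuming a Pearson correlation and a plain root mean square of the per-residue differences; the README does not state which correlation the tool uses, and the `summarize` function and its input values are hypothetical.

```python
import math


def summarize(bfactors, plddt):
    """Mean, Pearson correlation, and RMS difference for paired per-residue values."""
    n = len(bfactors)
    if n != len(plddt) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_b = sum(bfactors) / n
    mean_p = sum(plddt) / n
    cov = sum((b - mean_b) * (p - mean_p) for b, p in zip(bfactors, plddt))
    var_b = sum((b - mean_b) ** 2 for b in bfactors)
    var_p = sum((p - mean_p) ** 2 for p in plddt)
    # Correlation is undefined when either series is constant.
    if var_b == 0 or var_p == 0:
        corr = float("nan")
    else:
        corr = cov / math.sqrt(var_b * var_p)
    rms_diff = math.sqrt(sum((p - b) ** 2 for b, p in zip(bfactors, plddt)) / n)
    return {
        "mean_bfactor": mean_b,
        "mean_plddt": mean_p,
        "correlation": corr,
        "rmsd": rms_diff,
    }


# Hypothetical values: high B-factors paired with low confidence.
stats = summarize([10.0, 20.0, 30.0], [90.0, 80.0, 70.0])
```

A strongly negative correlation is the expected pattern when flexible or disordered regions (high B-factor) coincide with low predicted confidence.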

CSV Output Format:

residue_id,chain_id,position,experimental_bfactor,predicted_confidence,difference,normalized_bfactor
A_1,A,1,25.3,85.2,59.9,-0.5
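
The CSV can be post-processed with the standard library. The sketch below parses the sample row shown above and converts the numeric columns. `load_rows` is a hypothetical helper, and treating `position` as a plain integer is an assumption that may not hold for residues with insertion codes.

```python
import csv
import io

# The header and sample row from this README.
SAMPLE = """residue_id,chain_id,position,experimental_bfactor,predicted_confidence,difference,normalized_bfactor
A_1,A,1,25.3,85.2,59.9,-0.5
"""

NUMERIC_COLUMNS = (
    "experimental_bfactor",
    "predicted_confidence",
    "difference",
    "normalized_bfactor",
)


def load_rows(handle):
    """Read the B-factor comparison CSV and convert numeric columns."""
    rows = []
    for raw in csv.DictReader(handle):
        row = dict(raw)
        row["position"] = int(row["position"])  # assumption: integer positions
        for key in NUMERIC_COLUMNS:
            row[key] = float(row[key])
        rows.append(row)
    return rows


rows = load_rows(io.StringIO(SAMPLE))
first = rows[0]
# In the sample row, difference matches predicted_confidence - experimental_bfactor.
recomputed = first["predicted_confidence"] - first["experimental_bfactor"]
```

To read a real run, pass `open("./results/analysis/bfactor_comparison.csv", newline="")` to `load_rows` instead of the in-memory sample.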

Notes:

  • Only standard amino acid residues are analyzed; water molecules, ligands, and other heteroatoms are excluded
  • For multi-model structures (e.g., NMR ensembles), only the first model is used
  • More than 50% of the B-factors in a structure must be non-zero for it to pass validation
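
The validation rule in the last note can be expressed as a simple check. This is a sketch of the rule as described, not the tool's actual implementation; the function name and the strict "greater than" comparison are assumptions based on the ">50%" wording.

```python
def has_enough_bfactors(bfactors, threshold=0.5):
    """Return True when more than `threshold` of the B-factors are non-zero."""
    if not bfactors:
        return False
    nonzero = sum(1 for b in bfactors if b != 0.0)
    return nonzero / len(bfactors) > threshold


# Exactly half non-zero does not pass a strict ">50%" rule.
half = has_enough_bfactors([0.0, 0.0, 12.5, 30.1])
# Two of three non-zero passes.
most = has_enough_bfactors([0.0, 12.5, 30.1])
```

Predicted models and some low-quality depositions carry all-zero B-factor columns, which is the case a check like this would catch.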

Custom Output Location

Specify a custom path for the B-factor CSV:

biostructbenchmark experimental.cif predicted.cif \
  --analyze-bfactor \
  --bfactor-output my_analysis.csv

Save Aligned Structures

biostructbenchmark experimental.cif predicted.cif \
  --save-structures \
  --output-dir ./aligned_structures
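
For batch runs, the command line can be assembled from Python. The sketch below uses only the flags documented in this README; `build_command` and `run` are hypothetical helpers, and combining `--analyze-bfactor` with `--save-structures` in one call is an assumption that the README does not confirm.

```python
import subprocess


def build_command(experimental, predicted, output_dir=None,
                  analyze_bfactor=False, save_structures=False):
    """Assemble a biostructbenchmark argument list from documented flags."""
    cmd = ["biostructbenchmark", experimental, predicted]
    if analyze_bfactor:
        cmd.append("--analyze-bfactor")
    if save_structures:
        cmd.append("--save-structures")
    if output_dir is not None:
        cmd += ["--output-dir", output_dir]
    return cmd


def run(cmd):
    """Run the command and raise if it exits with a non-zero status."""
    return subprocess.run(cmd, check=True)


cmd = build_command(
    "experimental.cif",
    "predicted.cif",
    output_dir="./aligned_structures",
    save_structures=True,
)
# run(cmd)  # requires biostructbenchmark to be installed and on PATH
```

Passing the arguments as a list avoids shell quoting problems when file paths contain spaces.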

Development

This project uses uv for fast, reliable package management.

Quick Start

# Install uv if you don't have it
curl -LsSf https://astral.sh/uv/install.sh | sh

# Clone and install with dev dependencies
git clone https://github.com/BioStructBenchmark/BioStructBenchmark.git
cd BioStructBenchmark

# Install dev dependencies and set up pre-commit hooks
make dev

Development Commands

make help       # Show all available commands
make test       # Run tests with coverage (80% threshold)
make lint       # Run Ruff linting and format checks
make typecheck  # Run mypy type checking
make format     # Auto-format code with Ruff
make quality    # Run docstring coverage and dead code checks
make security   # Run Bandit and pip-audit security scans
make check      # Run all checks (lint, typecheck, quality, test, security)

Code Quality Tools

  • Ruff: Fast linting and formatting (replaces Black, isort, Flake8)
  • mypy: Static type checking with strict mode
  • pytest-cov: Test coverage with 80% threshold
  • interrogate: Docstring coverage checking
  • vulture: Dead code detection
  • Bandit: Security vulnerability scanning
  • pip-audit: Dependency vulnerability scanning
  • pre-commit: Git hooks for automated checks (includes zizmor for GitHub Actions)

CI/CD

All pull requests run:

  • Linting and formatting checks
  • Type checking
  • Tests with coverage
  • Security scans (Bandit + pip-audit)
