Getting Started
Biofilter 4 (BF4) is a biological knowledge platform that resolves entities (genes, proteins, pathways, diseases, variants), tracks their relationships, and exposes them through ready-to-use reports.
This section walks you through your first run end-to-end. Pick the path that matches your situation and follow it in order.
Choose your path
I just want to run reports against a database that already exists
You have access to a Biofilter database and don’t need to ingest any data yourself.
Install Biofilter — pick pip (recommended) or Docker.
Connect to the database — read Option A: connect to an existing database.
Find a report that fits your need — the catalog and the GPT assistant help here.
Run your first report — CLI and Python API examples.
I’m setting up my own Biofilter from scratch
You want a local database (SQLite for testing or PostgreSQL for production), populated by running the ETL yourself.
Install Biofilter — pick pip, or install from source if you plan to contribute back.
Connect to the database — read Option B: bootstrap a new database.
Run the ETL pipeline (covered in Option B of the same page).
Run your first report once the ETL completes.
What you’ll need
Python 3.10+ for pip-based installation, or Docker if you prefer containers.
A database connection string if you’re connecting to an existing instance — get this from whoever administers it.
Roughly 1 TB of free disk space if you’re bootstrapping your own local database with the full dataset.
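The first and last requirements above can be checked before you install anything. The following is a minimal sketch using only the Python standard library; it is not part of Biofilter. The helper names `python_ok` and `disk_ok` are hypothetical, and "roughly 1 TB" is taken here as 10^12 bytes, which is an assumption.

```python
import shutil
import sys

MIN_PYTHON = (3, 10)        # minimum version from the requirements list
FULL_DATA_BYTES = 10**12    # "roughly 1 TB", assumed here to mean 10^12 bytes


def python_ok(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


def disk_ok(path=".", required=FULL_DATA_BYTES):
    """Return True if the filesystem holding `path` has `required` bytes free."""
    return shutil.disk_usage(path).free >= required


if __name__ == "__main__":
    print("Python 3.10+:", "ok" if python_ok() else "too old")
    # Only relevant if you plan to bootstrap a full local database.
    print("~1 TB free:", "ok" if disk_ok() else "not enough")
```

Run it from the directory (or pass the path) where you intend to keep the database, since `shutil.disk_usage` reports on the filesystem that contains the given path. If you are only connecting to an existing database, the disk check does not apply.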
Where this guide stops
This Getting Started track is intentionally minimal. Once you can run a report, the rest of the documentation goes deeper:
Report catalog — every available report with descriptions and tutorials.
Configuration — full options for .biofilter.toml.
Database — schema, migrations, backup/restore.
ETL — managing data sources, ETL packages, and rollbacks.
System overview — architecture and design rationale.
Troubleshooting — common errors and fixes.