Report Catalog

Complete index of all reports available in Biofilter 4. Each report has a name (used in both the CLI and the Python API), a brief description, and links to its explain guide and interactive notebook tutorial, where available.

For general usage — how to run, list, and introspect reports — see Reports.

Running any report

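# CLI: run a report, optionally with parameters and an output file
biofilter report run --report-name <name> [--param KEY=VALUE ...] [--output file.csv]
# CLI: show a report's explanation
biofilter report explain --report-name <name>
# CLI: show a report's parameter template
biofilter report run --report-name <name> --params-template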

# Python API
df = bf.report.run("<name>", param1=value1, param2=value2)
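
As a rough illustration of the CLI synopsis above, the following sketch assembles the argument list for `biofilter report run` from a report name and a parameter dictionary. The helper `build_report_command` and the parameter name `param1` are hypothetical; only the flags shown in the synopsis are used.

```python
import shlex


def build_report_command(report_name, params=None, output=None):
    """Build the argv list for `biofilter report run` (hypothetical helper)."""
    argv = ["biofilter", "report", "run", "--report-name", report_name]
    for key, value in (params or {}).items():
        # One `--param KEY=VALUE` pair per parameter, as in the synopsis.
        argv += ["--param", f"{key}={value}"]
    if output is not None:
        argv += ["--output", output]
    return argv


# `param1` / `value1` are placeholders, not real report parameters.
cmd = build_report_command(
    "etl_status", params={"param1": "value1"}, output="etl_status.csv"
)
print(shlex.join(cmd))
```

On a machine where the `biofilter` CLI is installed, the resulting list could be handed to `subprocess.run(cmd, check=True)`.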

ETL & Platform Monitoring

Reports for inspecting the state of the ETL pipeline and the knowledge base.

Report	Description	Explain	Notebook
`etl_status`	Current status of all ETL packages (active, last run, row counts)	Guide	Tutorial
`etl_packages`	Full provenance log of all ETL executions with timestamps and file hashes	Guide	Tutorial
`platform_data_statistics`	Row counts and coverage metrics across all master tables	Guide	Tutorial
`db_pg_table_stats`	PostgreSQL table sizes, row estimates, and bloat metrics (PostgreSQL only)	Guide	Tutorial
`db_pg_index_stats`	PostgreSQL index usage, size, and scan counts (PostgreSQL only)	Guide	Tutorial

Entity & Relationship

Reports for exploring the biological entity graph.

Report	Description	Explain	Notebook
`entity_filter`	Filter and list entities (genes, pathways, diseases, …) by type, source, or name pattern	Guide	Tutorial
`entity_relationship_model`	Retrieve all entities related to an input list through shared biological groups (pathways, diseases, GO, PPI)	—	Tutorial
`entity_neighborhood_summary`	Resolve heterogeneous inputs (gene:, disease:, pathway:, …) and return a 1-hop neighborhood summary grouped by neighbor type	Guide	Tutorial
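
The prefixed inputs mentioned for `entity_neighborhood_summary` (gene:, disease:, pathway:, …) can be prepared with a small helper before being passed to the report. This is only a sketch: the function name, the example values, and the assumption that each input is a single `type:value` string are illustrative and not taken from the Biofilter API.

```python
from collections import defaultdict


def group_prefixed_inputs(tokens):
    """Group 'type:value' tokens by their type prefix (hypothetical helper)."""
    grouped = defaultdict(list)
    for token in tokens:
        # Split on the first colon only, so values that contain colons
        # themselves (for example ontology IDs) stay intact.
        prefix, sep, value = token.partition(":")
        if not sep or not prefix.strip() or not value.strip():
            raise ValueError(f"expected 'type:value', got {token!r}")
        grouped[prefix.strip().lower()].append(value.strip())
    return dict(grouped)


# Example values are made up for illustration.
inputs = ["gene:BRCA1", "disease:MONDO:0007254", "gene:TP53", "pathway:DNA repair"]
grouped = group_prefixed_inputs(inputs)
print(grouped)
```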

Annotation Masters

Reference tables exposing the full content of each biological domain in the knowledge base. Useful for exploring available terms before using them as filters in other reports.

Report	Description	Explain	Notebook
`annotation_master_gene`	All genes with HGNC symbol, Ensembl ID, locus, and source provenance	Guide	Tutorial
`annotation_master_pathway`	All pathways across all source systems (Reactome, KEGG, …)	Guide	Tutorial
`annotation_master_protein`	All proteins with UniProt IDs and gene mappings	Guide	Tutorial
`annotation_master_disease`	All diseases with MONDO/ClinGen IDs and gene associations	Guide	Tutorial
`annotation_master_go`	All Gene Ontology terms (BP, MF, CC) with gene memberships	Guide	Tutorial
`annotation_master_chemical`	All chemical compounds (ChEBI) with gene and pathway associations	Guide	Tutorial
`annotation_master_variant`	Full annotation for input variants: frequencies, pathogenicity scores, VEP consequences per transcript, AlphaMissense	Guide	Tutorial

Variant Analysis

Reports for annotating and filtering genomic variants.

Report	Description	Explain	Notebook
`variant_binning`	Assign variants to genomic bins; useful for burden-test preparation	Guide	Tutorial
`variant_gene_location_model`	Map variants to overlapping gene loci with distance and region annotations	—	Tutorial
`variant_annotation_expanded`	Full annotation expansion for a variant list (consequence, AF, predictions)	—	—
`variant_single_gene_annotation`	Phase 1 — Given a seed variant, returns the seed gene and all partner genes sharing biological context	Guide	Tutorial
`gene_to_variant_filtering`	Phase 2 — Collect and filter variants across a gene list with SQL-level pathogenicity filters	Guide	Tutorial
`annotation_variant_regulatory_evidence`	Variant ↔ gene regulatory evidence (eQTL / sQTL). Accepts gene symbols, rsids, or chr:pos as input; returns one row per (variant × tissue × regulated gene) with effect size and p-value	Guide	Tutorial
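
Because `annotation_variant_regulatory_evidence` accepts gene symbols, rsids, or chr:pos as input, it can help to check what kind of identifier each entry in a mixed list is before running the report. The sketch below is one way to do that; the regular expressions and the function name are assumptions for illustration, and Biofilter's own input resolution may differ.

```python
import re

# Assumed patterns: "rs" followed by digits, and an optional "chr" prefix
# followed by 1-22, X, Y, M or MT, then ":" and a position.
RSID_RE = re.compile(r"^rs\d+$", re.IGNORECASE)
CHR_POS_RE = re.compile(
    r"^(?:chr)?(?:[1-9]|1\d|2[0-2]|X|Y|MT?):\d+$", re.IGNORECASE
)


def classify_input(token):
    """Return 'rsid', 'chr_pos', or 'gene_symbol' for one input token."""
    token = token.strip()
    if RSID_RE.match(token):
        return "rsid"
    if CHR_POS_RE.match(token):
        return "chr_pos"
    # Anything else is treated as a gene symbol in this sketch.
    return "gene_symbol"


for item in ["rs429358", "chr19:44908684", "APOE"]:
    print(item, classify_input(item))
```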

Variant Interaction Modeling

Direct variant-to-variant interaction modeling from a pre-genotyped input list. Both variants in every pair come from the input; no additional variants are pulled in from the database.

Report	Description	Explain	Notebook
`variant_modeling`	Input variants → gene overlap → group co-membership → Variant×Variant pairs with group_support_count weight	Guide	Tutorial
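
The chain described for `variant_modeling` (input variants → gene overlap → group co-membership → pairs) can be pictured with a toy example. The sketch below is not Biofilter's implementation: the data structures are hypothetical, and it assumes that `group_support_count` is the number of distinct biological groups shared by the genes of both variants.

```python
from itertools import combinations


def variant_pairs(variant_genes, gene_groups):
    """Pair input variants whose genes share at least one group.

    variant_genes: variant ID -> set of overlapping gene symbols
    gene_groups:   gene symbol -> set of group IDs (pathway, GO, PPI, ...)
    Returns (variant_1, variant_2, shared_group_count) tuples.
    """
    # Collect every group reachable from each variant through its genes.
    variant_groups = {
        variant: set().union(*(gene_groups.get(g, set()) for g in genes))
        for variant, genes in variant_genes.items()
    }
    pairs = []
    # Both members of each pair come from the input list only.
    for v1, v2 in combinations(sorted(variant_groups), 2):
        shared = variant_groups[v1] & variant_groups[v2]
        if shared:
            pairs.append((v1, v2, len(shared)))
    return pairs


# Made-up identifiers for illustration.
variant_genes = {"rs1": {"GENE_A"}, "rs2": {"GENE_B"}, "rs3": {"GENE_C"}}
gene_groups = {
    "GENE_A": {"pathway:P1", "go:G1"},
    "GENE_B": {"pathway:P1", "go:G1", "ppi:X"},
    "GENE_C": {"ppi:X"},
}
pairs = variant_pairs(variant_genes, gene_groups)
print(pairs)
```

Here `rs1` and `rs2` share two groups and `rs2` and `rs3` share one, while `rs1` and `rs3` have nothing in common and produce no pair.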

SNP×SNP Interaction Pipeline

Reports implementing the biologically-informed SNP×SNP interaction workflow. See the full pipeline tutorial and methods document for end-to-end guidance.

Resource	Link
Pipeline notebook	pipeline__from_single_variant_to_interactions.ipynb
Pipeline methods doc	pipeline__from_single_variant_to_interactions.md

Report	Phase	Description	Explain	Notebook
`variant_single_gene_annotation`	Phase 1	Seed variant → partner gene list via biological network	Guide	Tutorial
`gene_to_variant_filtering`	Phase 2	Gene list → filtered, annotated variant set (List A)	Guide	Tutorial
`variant_list_intersect`	Phase 2.5	List A ∩ List B → List C (genotyped subset, PLINK-ready)	Guide	Pipeline notebook
`snp_snp_pair_generator`	Phase 3	List D → annotated interaction pairs with configurable pairing strategy	Guide	Pipeline notebook
`snp_snp_model`	Legacy	Earlier SNP×SNP pair model — expands variants from gene loci (superseded by `variant_modeling`)	Guide	Tutorial
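
The Phase 2.5 intersection (A ∩ B → C) and the idea of generating pairs from a variant list can be sketched in a few lines. This is an illustration under stated assumptions, not the reports' actual logic: the function names and variant IDs are hypothetical, and exhaustive all-pairs is just one possible pairing strategy, applied here to the intersected list purely for demonstration.

```python
from itertools import combinations


def intersect_variants(annotated, genotyped):
    """Keep annotated variants that are also genotyped, preserving order."""
    genotyped_set = set(genotyped)
    return [v for v in annotated if v in genotyped_set]


def all_pairs(variants):
    """Exhaustive unordered pairs: one simple pairing strategy."""
    return list(combinations(variants, 2))


# Made-up variant IDs for illustration.
list_a = ["rs10", "rs20", "rs30", "rs40"]  # filtered, annotated variants
list_b = ["rs40", "rs20", "rs99"]          # variants present in the genotype data
list_c = intersect_variants(list_a, list_b)
print(list_c)
print(all_pairs(list_c))
```

Note that the number of exhaustive pairs grows quadratically with the list size, which is one reason to filter the variant set before pairing.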

Pathway Burden Pipeline

Pipeline for prioritising pathways given a list of significant genes (e.g., ExWAS hits) and a target pathway list, using cross-source convergence scoring.

Resource	Link
Pipeline notebook	pipeline__pathway_burden_score.ipynb
Pipeline methods doc	pipeline__pathway_burden_score.md

Utilities

Report	Description	Explain	Notebook
`template`	Blank report template for development and testing	Guide	Tutorial

Coverage summary

Status	Count
Reports with explain guide + notebook	20
Reports with explain guide only	2 (`variant_list_intersect`, `snp_snp_pair_generator` — covered by pipeline notebook)
Reports with notebook only	2 (`entity_relationship_model`, `variant_gene_location_model`)
Reports with neither	1 (`variant_annotation_expanded`)
Total	25
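
The counts above can be re-derived from the catalog tables with a short tally. The sketch below transcribes each report's Explain and Notebook columns by hand, treats a "Pipeline notebook" link as no dedicated notebook, and assumes the `template` utility is left out of the tally, since that is the reading under which the rows sum to the stated total. The data structure and function names are hypothetical.

```python
from collections import Counter

# (report, has_explain_guide, has_dedicated_notebook), transcribed by hand.
CATALOG = [
    ("etl_status", True, True),
    ("etl_packages", True, True),
    ("platform_data_statistics", True, True),
    ("db_pg_table_stats", True, True),
    ("db_pg_index_stats", True, True),
    ("entity_filter", True, True),
    ("entity_relationship_model", False, True),
    ("entity_neighborhood_summary", True, True),
    ("annotation_master_gene", True, True),
    ("annotation_master_pathway", True, True),
    ("annotation_master_protein", True, True),
    ("annotation_master_disease", True, True),
    ("annotation_master_go", True, True),
    ("annotation_master_chemical", True, True),
    ("annotation_master_variant", True, True),
    ("variant_binning", True, True),
    ("variant_gene_location_model", False, True),
    ("variant_annotation_expanded", False, False),
    ("variant_single_gene_annotation", True, True),
    ("gene_to_variant_filtering", True, True),
    ("annotation_variant_regulatory_evidence", True, True),
    ("variant_modeling", True, True),
    ("variant_list_intersect", True, False),
    ("snp_snp_pair_generator", True, False),
    ("snp_snp_model", True, True),
    ("template", True, True),
]


def coverage_status(has_guide, has_notebook):
    """Map the two flags to one of the four summary categories."""
    if has_guide and has_notebook:
        return "guide + notebook"
    if has_guide:
        return "guide only"
    if has_notebook:
        return "notebook only"
    return "neither"


def tally(catalog, exclude=("template",)):
    """Count reports per category, skipping the names in `exclude`."""
    return Counter(
        coverage_status(guide, notebook)
        for name, guide, notebook in catalog
        if name not in exclude
    )


counts = tally(CATALOG)
for status, count in counts.items():
    print(status, count)
print("total", sum(counts.values()))
```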