Use Cases
This page covers the most common analysis scenarios you can run with Hoodini Colab.
Single Protein Analysis
When you want to explore the genomic neighborhood of a specific protein.
Scenario: You found an interesting protein in a paper and want to see what genes surround it across different organisms.
Configuration:
- Select Single Input mode
- Enter an NCBI protein ID (e.g., WP_012345678.1)
- Configure Remote BLAST settings:
  - E-value: 1e-10 (default, adjust for stringency)
  - Max targets: 100 (how many homologs to retrieve)
Input: WP_012345678.1
Mode: Single Input
Remote BLAST E-value: 1e-10
Max targets: 100
Single Input mode uses remote BLAST to automatically find homologous sequences. This only works with NCBI protein IDs.
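
To make the two Remote BLAST settings concrete, the sketch below shows how an E-value cutoff and a target cap typically combine when selecting hits. The `select_hits` function and the `(accession, evalue)` tuple format are hypothetical and for illustration only; this is not Hoodini's implementation.

```python
def select_hits(hits, evalue_cutoff=1e-10, max_targets=100):
    """Keep hits at or below the E-value cutoff, best first, capped at max_targets.

    `hits` is a list of (accession, evalue) tuples (hypothetical format).
    """
    passing = [h for h in hits if h[1] <= evalue_cutoff]
    passing.sort(key=lambda h: h[1])  # smaller E-value = more significant
    return passing[:max_targets]


example_hits = [
    ("WP_000000001.1", 1e-50),
    ("WP_000000002.1", 1e-5),   # fails the default 1e-10 cutoff
    ("WP_000000003.1", 1e-12),
]
selected = select_hits(example_hits)
print(selected)
```

Lowering the E-value cutoff makes the search stricter and returns fewer, closer homologs; raising Max targets only matters when more hits pass the cutoff than the cap allows.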
Custom Homolog List
When you already have a curated list of sequences to compare.
Scenario: You ran your own BLAST search and want to analyze specific sequences rather than letting Hoodini choose.
Configuration:
- Select Input List mode
- Paste your IDs (one per line):
WP_000000001.1
WP_000000002.1
WP_000000003.1
NZ_CP000001.1
Unlike Single Input mode, you can mix NCBI protein IDs and nucleotide IDs in this mode.
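
Before pasting a mixed list, it can help to check which entries look like protein IDs and which look like nucleotide IDs. The sketch below is a minimal pre-check that assumes versioned RefSeq-style accessions and covers only a partial list of prefixes; `classify_ids` is a hypothetical helper, not part of Hoodini.

```python
import re

# RefSeq accession prefixes (partial list; GenBank-style accessions are not covered).
PROTEIN_PREFIXES = ("WP_", "NP_", "XP_", "YP_", "AP_")
NUCLEOTIDE_PREFIXES = ("NZ_", "NC_", "NG_", "NT_", "NW_", "AC_")
# Two-letter prefix, underscore, body, and a required version suffix such as ".1".
ACCESSION_RE = re.compile(r"^[A-Z]{2}_[A-Z0-9]+\.\d+$")


def classify_ids(pasted_text):
    """Split pasted text (one ID per line) into protein, nucleotide, and unknown IDs."""
    result = {"protein": [], "nucleotide": [], "unknown": []}
    for line in pasted_text.splitlines():
        acc = line.strip()
        if not acc:
            continue  # skip blank lines
        well_formed = ACCESSION_RE.match(acc) is not None
        if well_formed and acc.startswith(PROTEIN_PREFIXES):
            result["protein"].append(acc)
        elif well_formed and acc.startswith(NUCLEOTIDE_PREFIXES):
            result["nucleotide"].append(acc)
        else:
            result["unknown"].append(acc)
    return result


pasted = "WP_000000001.1\nWP_000000002.1\n\nNZ_CP000001.1\nnot_an_id\n"
classified = classify_ids(pasted)
print(classified)
```

Entries that land in `unknown` are not necessarily invalid; they simply fall outside the patterns this sketch recognizes.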
Analyzing Non-Coding Regions
For CRISPR arrays, regulatory elements, or genomic islands.
Scenario: You want to compare CRISPR arrays or other non-coding genomic regions.
Configuration:
- Select Input List mode
- Enter nucleotide accessions (e.g., NZ_CP000001.1)
- In Neighborhood Window:
  - Set appropriate window sizes for your elements
  - Consider larger windows for genomic islands
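
As a rough guide to choosing window sizes, the sketch below shows how upstream and downstream extensions might translate into genomic coordinates around an element. The `neighborhood_window` function, the 1-based coordinates, and the strand handling are assumptions made for illustration; Hoodini's own window logic may differ.

```python
def neighborhood_window(start, end, upstream, downstream, strand="+", contig_length=None):
    """Return (window_start, window_end) in 1-based coordinates around a feature.

    On the minus strand, "upstream" lies at higher coordinates, so the two
    extensions are swapped (an assumption of this sketch).
    """
    if strand == "+":
        left, right = upstream, downstream
    else:
        left, right = downstream, upstream
    window_start = max(1, start - left)  # clamp at the start of the contig
    window_end = end + right
    if contig_length is not None:
        window_end = min(contig_length, window_end)  # clamp at the contig end
    return window_start, window_end


# A 1 kb element with 15 kb windows on each side
print(neighborhood_window(20000, 21000, 15000, 15000))
```

With 15000 bp on each side, a 1 kb element yields a region of roughly 31 kb, unless the window is cut short by a contig edge.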
Input: NZ_CP000001.1, NZ_CP000002.1
Window upstream: 15000
Window downstream: 15000
Defense System Survey
Comprehensive scan for antiphage defense systems.
Scenario: You want to catalog all defense systems in the neighborhoods around your genes of interest.
Configuration:
- Set up your input (any mode)
- Enable annotation tools:
- ✅ PADLOC — Antiphage defense systems
- ✅ DefenseFinder — Defense system detection
- ✅ CCtyper — CRISPR-Cas systems
- ✅ geNomad — Mobile genetic elements
First-time setup: Each tool requires downloading its database. Allow extra time (~2-5 min per tool) on the first run.
Phylogenetic Context Analysis
Add evolutionary context with tree construction.
Scenario: You want to visualize how genomic neighborhoods vary across a phylogenetic tree.
Configuration:
- Set up your input
- In Tree Construction, select a method:
Taxonomy Tree
Fastest option — Uses NCBI taxonomy
Tree mode: taxonomy_tree
Custom Coordinates (Input Sheet)
For precise control over which genomic regions to analyze.
Scenario: You have specific coordinates from your own analysis or want to reproduce exact regions.
Configuration:
- Select Input Sheet mode
- Fill in the table with:
  - protein_id or nucleotide_id
  - assembly_id
  - start and end coordinates
  - strand (+/-)
Example table format:
| protein_id | assembly_id | start | end | strand |
|---|---|---|---|---|
| WP_000001.1 | GCF_000001.1 | 10000 | 25000 | + |
| WP_000002.1 | GCF_000002.1 | 50000 | 65000 | - |
You can also paste TSV data directly into the table.
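
If your coordinates come from a script, you can generate the TSV text programmatically. The sketch below uses Python's standard `csv` module and the column names from the example table; the `rows_to_tsv` helper is hypothetical, and whether the widget expects a header row when pasting is an assumption you should verify.

```python
import csv
import io

COLUMNS = ["protein_id", "assembly_id", "start", "end", "strand"]

rows = [
    {"protein_id": "WP_000001.1", "assembly_id": "GCF_000001.1",
     "start": 10000, "end": 25000, "strand": "+"},
    {"protein_id": "WP_000002.1", "assembly_id": "GCF_000002.1",
     "start": 50000, "end": 65000, "strand": "-"},
]


def rows_to_tsv(rows, columns=COLUMNS):
    """Serialize row dicts to tab-separated text after basic sanity checks."""
    for row in rows:
        if row["strand"] not in ("+", "-"):
            raise ValueError(f"strand must be '+' or '-': {row['strand']!r}")
        int(row["start"]), int(row["end"])  # raises ValueError if not integers
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, delimiter="\t",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


tsv_text = rows_to_tsv(rows)
print(tsv_text)
```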
Comparing Gene Clusters
Visualize conserved operons across species.
Scenario: You’re studying a biosynthetic gene cluster and want to see how it varies across organisms.
Configuration:
- Use Input List mode with proteins from the cluster
- Enable Protein Links to see sequence similarity
- Set Clustering to group similar genes:
  - Identity threshold: 0.3 (30%)
  - Coverage threshold: 0.8 (80%)
- Enable a tree method to order genomes
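
To illustrate what the identity and coverage thresholds control, the sketch below groups proteins by single-linkage clustering: two proteins end up in the same group when a pairwise hit between them meets both thresholds. This is a generic illustration under stated assumptions, not how Hoodini clusters internally; `cluster_proteins` and the hit tuple format are hypothetical.

```python
def cluster_proteins(ids, pairwise_hits, min_identity=0.3, min_coverage=0.8):
    """Single-linkage clustering: link proteins whose hit passes both thresholds.

    `pairwise_hits` is a list of (id_a, id_b, identity, coverage) tuples with
    identity and coverage given as fractions in [0, 1] (hypothetical format).
    """
    parent = {pid: pid for pid in ids}

    def find(x):
        # Follow parent pointers to the cluster representative (with path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, identity, coverage in pairwise_hits:
        if identity >= min_identity and coverage >= min_coverage:
            parent[find(a)] = find(b)  # merge the two clusters

    clusters = {}
    for pid in ids:
        clusters.setdefault(find(pid), []).append(pid)
    return sorted(clusters.values())


ids = ["A", "B", "C", "D"]
hits = [
    ("A", "B", 0.45, 0.90),  # passes both thresholds
    ("B", "C", 0.25, 0.95),  # identity too low
    ("C", "D", 0.60, 0.50),  # coverage too low
]
groups = cluster_proteins(ids, hits)
print(groups)
```

Raising the identity threshold splits groups into narrower families, while lowering it merges more distant homologs into the same group.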
Batch Processing Tips
For large-scale analyses:
- Start with a subset: Test with 10-20 sequences first
- Use taxonomy trees: Faster than AAI/ANI for large datasets
- Limit annotations: Each tool adds processing time
- Add NCBI API key: Speeds up data downloads significantly
# Set API key before running
import os
os.environ['NCBI_API_KEY'] = 'your_api_key_here'
Next Steps
- API Reference — Programmatic access to the widget
- Hoodini CLI — Full CLI documentation
- Hoodini Viz — Visualization library documentation