Methodology
How the cost calculator works
Every figure in the calculator is either a published price (cited below) or a clearly-labelled engineering assumption. Nothing is invented. This page is the full derivation so the numbers can be checked — including against AWS's own Pricing Calculator.
1 · Your workload, in numbers
You enter your imaging volume in plates, images or terabytes. We convert it to two quantities that drive every cost: terabytes stored and CPU-hours of CellProfiler processing per month.
- A standard Cell Painting plate is 384 wells × 9 sites/well × 8 images/site ≈ 27,600 images — the imaging layout of the JUMP Consortium dataset (Chandrasekaran et al., Nature Methods 2024).
- Image size is ~2.5 MB, measured directly on JUMP cpg0000 images in the public Cell Painting Gallery, so a plate ≈ 69 GB (27,648 images × 2.5 MB).
- Sites per day = (new TB/month ÷ 20 MB per 8-channel site) ÷ 30.4 days per month, where 20 MB is 8 images × 2.5 MB. CPU-hours/month = sites/day × 30.4 × (CPU-minutes per site) ÷ 60.
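
The conversions above can be sketched in a few lines of Python. This is a minimal illustration rather than the calculator's actual code: the function names are hypothetical, and it assumes decimal units (1 TB = 1,000,000 MB), which is what the 69 GB plate figure implies.

```python
# Constants quoted in the text above.
WELLS_PER_PLATE = 384
SITES_PER_WELL = 9
IMAGES_PER_SITE = 8
MB_PER_IMAGE = 2.5
DAYS_PER_MONTH = 30.4

MB_PER_SITE = IMAGES_PER_SITE * MB_PER_IMAGE  # 20 MB per 8-channel site
MB_PER_TB = 1_000_000  # assumption: decimal units


def images_per_plate() -> int:
    """Images on one standard Cell Painting plate."""
    return WELLS_PER_PLATE * SITES_PER_WELL * IMAGES_PER_SITE


def plate_size_gb() -> float:
    """Approximate size of one plate in GB."""
    return images_per_plate() * MB_PER_IMAGE / 1000


def sites_per_day(new_tb_per_month: float) -> float:
    """Imaging sites acquired per day for a given monthly data volume."""
    return new_tb_per_month * MB_PER_TB / MB_PER_SITE / DAYS_PER_MONTH


def cpu_hours_per_month(new_tb_per_month: float, cpu_min_per_site: float) -> float:
    """CellProfiler CPU-hours per month at a given CPU-minutes per site."""
    return sites_per_day(new_tb_per_month) * DAYS_PER_MONTH * cpu_min_per_site / 60


if __name__ == "__main__":
    print(images_per_plate(), "images per plate")
    print(round(plate_size_gb(), 2), "GB per plate")
    print(round(cpu_hours_per_month(1.0, 3.0)), "CPU-hours/month for 1 TB/month at 3 CPU-min/site")
```

Note that the 30.4 cancels out of the monthly CPU-hours figure; it only matters for the sites-per-day value.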
2 · CellProfiler compute
The "analysis depth" you pick sets CPU-minutes per imaging site. The anchor is a real benchmark: we ran CellProfiler 4.2.6 on JUMP Cell Painting images and measured ~8 CPU-seconds per image. The presets:
| Analysis depth | CPU-min / site |
|---|---|
| Illumination correction & QC | ~1 |
| Standard Cell Painting analysis | ~3 |
| Deep analysis (dense cells, large feature sets) | ~7 |
This is the calculator's main assumption rather than a published price — a pilot measures it exactly on your pipelines.
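
To give a feel for the presets, the sketch below converts them to CPU-hours per plate and converts the benchmark to per-site terms. The preset keys are hypothetical labels. The link between the benchmark and the presets is our reading, not something stated above: 8 CPU-seconds per image at 8 images per site is roughly 1 CPU-minute per site.

```python
SITES_PER_PLATE = 384 * 9            # wells x sites/well, from section 1
IMAGES_PER_SITE = 8
BENCHMARK_CPU_SEC_PER_IMAGE = 8.0    # measured figure quoted above

# Preset values from the table; the dictionary keys are hypothetical labels.
CPU_MIN_PER_SITE = {
    "qc": 1.0,        # Illumination correction & QC
    "standard": 3.0,  # Standard Cell Painting analysis
    "deep": 7.0,      # Deep analysis
}


def benchmark_cpu_min_per_site() -> float:
    """Benchmark expressed per site instead of per image."""
    return BENCHMARK_CPU_SEC_PER_IMAGE * IMAGES_PER_SITE / 60


def cpu_hours_per_plate(depth: str) -> float:
    """CPU-hours needed to process one plate at the chosen analysis depth."""
    return SITES_PER_PLATE * CPU_MIN_PER_SITE[depth] / 60


if __name__ == "__main__":
    print(f"benchmark: {benchmark_cpu_min_per_site():.2f} CPU-min/site")
    for depth in CPU_MIN_PER_SITE:
        print(f"{depth}: {cpu_hours_per_plate(depth):.1f} CPU-hours/plate")
```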
3 · "Your AWS today" — self-managed
This column models running CellProfiler yourself on AWS Batch. The formula:
monthly cost = storage + compute + overhead, where:
- storage = S3 Standard, AWS volume tiers
- compute = CPU-hours ÷ utilization × on-demand vCPU rate
- overhead = 20% of compute
| Input | Value | Source |
|---|---|---|
| S3 Standard storage | $0.023 / $0.022 / $0.021 per GB-month (tiered) | AWS S3 pricing |
| Compute, c7i on-demand | $0.0446 / vCPU-hour (us-east-1) | AWS EC2 pricing |
| Effective utilization | ~55% | Assumption: hand-run Batch fleets lose capacity to scale-up lag, the array-job tail, and re-running failed jobs |
| Overhead | 20% of compute | Assumption: EBS volumes, data transfer, orchestration |
Check it yourself: open the AWS Pricing Calculator, add S3 Standard for your stored terabytes and EC2 c7i for your monthly vCPU-hours — it should reconcile with the storage and compute lines (before the utilization and overhead adjustments, which are ours).
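
The self-managed formula can be sketched as follows. This is an illustration under stated assumptions, not the calculator's source. The tier boundaries (first 50 TB, next 450 TB, over 500 TB) come from AWS's published S3 Standard pricing and are not quoted on this page; the sketch also assumes 1 TB = 1,000 GB and treats one CPU-hour as one vCPU-hour. Function names are hypothetical.

```python
# (GB in tier, $ per GB-month). Boundaries are an assumption taken from
# AWS's published S3 Standard tiers; rates are the ones in the table above.
S3_TIERS = [
    (50_000, 0.023),        # first 50 TB
    (450_000, 0.022),       # next 450 TB
    (float("inf"), 0.021),  # over 500 TB
]
VCPU_RATE = 0.0446   # $ per vCPU-hour, c7i on-demand, us-east-1
UTILIZATION = 0.55   # assumption from the table above
OVERHEAD = 0.20      # assumption from the table above


def s3_standard_cost(stored_tb: float) -> float:
    """Monthly S3 Standard cost, filling the volume tiers in order."""
    remaining_gb = stored_tb * 1000  # assumption: 1 TB = 1,000 GB
    cost = 0.0
    for tier_gb, rate in S3_TIERS:
        used = min(remaining_gb, tier_gb)
        cost += used * rate
        remaining_gb -= used
        if remaining_gb <= 0:
            break
    return cost


def self_managed_monthly_cost(stored_tb: float, cpu_hours: float) -> dict[str, float]:
    """Break down storage + compute + overhead for one month."""
    storage = s3_standard_cost(stored_tb)
    list_price_compute = cpu_hours * VCPU_RATE  # the line that should reconcile with AWS
    compute = list_price_compute / UTILIZATION  # our utilization adjustment
    overhead = OVERHEAD * compute
    return {
        "storage": storage,
        "compute": compute,
        "overhead": overhead,
        "total": storage + compute + overhead,
    }


if __name__ == "__main__":
    for name, value in self_managed_monthly_cost(stored_tb=10, cpu_hours=2500).items():
        print(f"{name}: ${value:,.2f}")
```

When reconciling against the AWS Pricing Calculator, compare its EC2 figure with `cpu_hours * VCPU_RATE`, that is, the compute line before it is divided by utilization.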
4 · ToxIndex Cell — the rate card
The two ToxIndex columns are not modelled costs — they are published prices: a flat rate per TB-month stored and per CPU-hour processed, applied to the same workload.
| Option | Storage | Processing |
|---|---|---|
| ToxIndex Cell · Cloud | $12 / TB-month | $0.035 / CPU-hour |
| ToxIndex Cell · On-prem | $6 / TB-month | $0.030 / CPU-hour |
Cloud runs on AWS spot capacity in your region; On-prem runs on dedicated ToxIndex-owned hardware, where storage is cheaper to provide. Neither price asks you to fund any infrastructure — you pay only the rate card.
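
Because both ToxIndex options are flat rates, pricing them is a two-term sum. A minimal sketch, with hypothetical option keys and function name:

```python
# ($ per TB-month stored, $ per CPU-hour processed), from the rate card above.
RATE_CARD = {
    "cloud": (12.0, 0.035),
    "on_prem": (6.0, 0.030),
}


def toxindex_monthly_cost(option: str, stored_tb: float, cpu_hours: float) -> float:
    """Flat-rate monthly cost: storage rate x TB stored + CPU rate x CPU-hours."""
    storage_rate, cpu_rate = RATE_CARD[option]
    return stored_tb * storage_rate + cpu_hours * cpu_rate


if __name__ == "__main__":
    for option in RATE_CARD:
        cost = toxindex_monthly_cost(option, stored_tb=10, cpu_hours=2500)
        print(f"{option}: ${cost:,.2f}")
```

The same `stored_tb` and `cpu_hours` inputs feed the self-managed model, so the columns are compared on an identical workload.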
5 · Prices vs. assumptions
To keep it honest: the exact, cited prices are AWS S3 and EC2 rates and the ToxIndex rate card. The assumptions — which a two-week pilot replaces with measured figures — are CPU-minutes per site, the ~55% AWS utilization, and the 20% overhead. The model is directional; it is built to be re-priced on your real pipelines and data, not to be the final quote.