import marimo as mo
cloudposterior: Cloud Execution
Run PyMC MCMC sampling on cloud VMs with one line of code. This notebook demonstrates remote execution using the classic Radon dataset from Gelman & Hill (2006).
Run this notebook locally (Jupyter or marimo) to watch the live progress display animate in-cell. Some outputs don’t render in GitHub’s notebook viewer.
import numpy as np
import pandas as pd
import pymc as pm
import arviz as az
Start fresh
Clear any cached results and this project’s Modal volume so the notebook runs cold and reproducibly. (Shown in marimo; hidden in the rendered notebook.)
import shutil
from pathlib import Path
import cloudposterior as cp
# Wipe the local result cache + this project's Modal volume so the example
# starts cold. The sampling cells below use `cp`, so marimo runs this first.
shutil.rmtree(Path(".cloudposterior"), ignore_errors=True)
cp.cleanup_volumes()
/Users/spencerboucher/Projects/cloudposterior-map-dashboard/cloudposterior/backends/modal_backend.py:790: AsyncUsageWarning: A blocking Modal interface is being used in an async context.
This may cause performance issues or bugs. Consider rewriting to use Modal's async interfaces:
https://modal.com/docs/guide/async
Suggested rewrite:
await modal.Volume.objects.delete.aio(volume_name)
Original line:
modal.Volume.objects.delete(volume_name)
Data
919 household radon measurements across 85 Minnesota counties (Gelman & Hill, 2006).
df = pd.read_csv(pm.get_data("radon.csv"))
county_names = df.county.unique()
county_idx = df.county_code.values
log_radon = df.log_radon.values
floor = df.floor.values
print(f"{len(df)} observations, {len(county_names)} counties")
919 observations, 85 counties
Model
Hierarchical varying-intercepts model with a non-centered parameterization. Each county gets its own intercept (partial pooling), and floor level (basement vs. first floor) is a fixed effect.
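Written out, the model described here can be sketched as follows. The symbols are notation introduced for this sketch ($j[i]$ is the county of household $i$, $x_i$ its floor indicator, $z_j$ the standardized county offset called `a_raw` in the code); the prior scales are the ones used in this notebook's model code.

```latex
\begin{aligned}
\mu_a &\sim \mathcal{N}(0,\, 5^2), \qquad \sigma_a \sim \text{HalfNormal}(2), \\
z_j &\sim \mathcal{N}(0,\, 1), \qquad a_j = \mu_a + \sigma_a z_j, \quad j = 1, \dots, 85, \\
\beta_{\text{floor}} &\sim \mathcal{N}(0,\, 5^2), \qquad \sigma_y \sim \text{HalfNormal}(2), \\
y_i &\sim \mathcal{N}\!\left(a_{j[i]} + \beta_{\text{floor}}\, x_i,\; \sigma_y^2\right), \quad i = 1, \dots, 919.
\end{aligned}
```

Sampling $z_j$ and computing $a_j$ deterministically is what makes the parameterization non-centered; the implied prior is still $a_j \sim \mathcal{N}(\mu_a, \sigma_a^2)$.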
with pm.Model(name="radon_intercepts", coords={"county": county_names}) as radon:
    mu_a = pm.Normal("mu_a", mu=0, sigma=5)
    sigma_a = pm.HalfNormal("sigma_a", sigma=2)
    a_raw = pm.Normal("a_raw", mu=0, sigma=1, dims="county")
    a = pm.Deterministic("a", mu_a + sigma_a * a_raw, dims="county")
    b_floor = pm.Normal("b_floor", mu=0, sigma=5)
    mu = a[county_idx] + b_floor * floor
    sigma_y = pm.HalfNormal("sigma_y", sigma=2)
    pm.Normal("obs", mu=mu, sigma=sigma_y, observed=log_radon)
pm.model_to_graphviz(radon)
Remote execution
cp.cloud() intercepts pm.sample() and runs it on a cloud VM. The model is uploaded to a volume on first run. Resources (CPU cores, memory) are auto-sized to your model.
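To make "intercepts" concrete, here is a minimal sketch of how a context manager can temporarily redirect a function call. It is not cloudposterior's implementation: `fake_pm`, `run_remotely`, and `intercept_sample` are hypothetical names invented for illustration, and the "remote" runner only returns a string.

```python
from contextlib import contextmanager
from types import SimpleNamespace

# Stand-in for a library namespace such as `pm`; purely illustrative.
fake_pm = SimpleNamespace(sample=lambda draws: f"sampled {draws} draws locally")


def run_remotely(draws):
    """Hypothetical remote runner; a real one would ship the job to a VM."""
    return f"sampled {draws} draws remotely"


@contextmanager
def intercept_sample(namespace, replacement):
    """Temporarily replace `namespace.sample`, restoring it on exit."""
    original = namespace.sample
    namespace.sample = replacement
    try:
        yield
    finally:
        namespace.sample = original  # restored even if the body raises


with intercept_sample(fake_pm, run_remotely):
    inside = fake_pm.sample(draws=2000)  # routed to run_remotely

outside = fake_pm.sample(draws=2000)  # original function is back
print(inside)
print(outside)
```

The `try`/`finally` is the important design choice in a pattern like this: the original function is restored even when the body of the `with` block fails, so later calls are unaffected.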
with cp.cloud(radon, remote=True):
    idata = pm.sample(draws=2000, tune=1000, chains=4)
Diagnostics
az.summary(idata, filter_vars="like", var_names=["mu_a", "sigma_a", "b_floor", "sigma_y"])
| | mean | sd | hdi_3% | hdi_97% | mcse_mean | mcse_sd | ess_bulk | ess_tail | r_hat |
|---|---|---|---|---|---|---|---|---|---|
| radon_intercepts::mu_a | 1.492 | 0.050 | 1.402 | 1.587 | 0.001 | 0.001 | 1997.0 | 3559.0 | 1.0 |
| radon_intercepts::b_floor | -0.664 | 0.069 | -0.797 | -0.536 | 0.001 | 0.001 | 6298.0 | 5719.0 | 1.0 |
| radon_intercepts::sigma_a | 0.322 | 0.045 | 0.234 | 0.402 | 0.001 | 0.001 | 1677.0 | 2974.0 | 1.0 |
| radon_intercepts::sigma_y | 0.727 | 0.018 | 0.693 | 0.760 | 0.000 | 0.000 | 8036.0 | 5877.0 | 1.0 |
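One way to read the table above is to check each row against convergence thresholds. The sketch below does that with the reported values; the cutoffs (`R_HAT_MAX = 1.01`, `ESS_MIN = 400`) are commonly used rules of thumb assumed here, not settings from this notebook, and `flag_problems` is a hypothetical helper. Note that the table shows `r_hat` rounded to one decimal place, so the check is only as precise as the display.

```python
# Values copied from the summary table above: (r_hat, ess_bulk, ess_tail).
summary = {
    "radon_intercepts::mu_a": (1.0, 1997.0, 3559.0),
    "radon_intercepts::b_floor": (1.0, 6298.0, 5719.0),
    "radon_intercepts::sigma_a": (1.0, 1677.0, 2974.0),
    "radon_intercepts::sigma_y": (1.0, 8036.0, 5877.0),
}

# Assumed rule-of-thumb thresholds, not values set by this notebook.
R_HAT_MAX = 1.01
ESS_MIN = 400.0


def flag_problems(rows, r_hat_max=R_HAT_MAX, ess_min=ESS_MIN):
    """Return the names of parameters that fail either threshold."""
    problems = []
    for name, (r_hat, ess_bulk, ess_tail) in rows.items():
        if r_hat > r_hat_max or min(ess_bulk, ess_tail) < ess_min:
            problems.append(name)
    return problems


problems = flag_problems(summary)
print(problems)
```

An empty list means no parameter in the table trips either threshold under these assumed cutoffs.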
az.plot_trace(idata, filter_vars="like", var_names=["mu_a", "sigma_a", "b_floor", "sigma_y"])[0, 0].figure
GPU acceleration with JAX
For models that benefit from GPU acceleration, use nuts_sampler="numpyro" to sample with JAX via NumPyro. cloudposterior automatically provisions a GPU container and installs jax[cuda12] when it detects a JAX-based sampler – no configuration needed.
with cp.cloud(radon, remote=True):
    idata_jax = pm.sample(draws=2000, tune=1000, chains=4, nuts_sampler="numpyro")
Cleanup
Model payloads are stored in a project-scoped volume. Delete it when you’re done.
cp.cleanup_volumes()