Three frameworks, one problem
In 2021, Meta open-sourced Robyn — an R-based Marketing Mix Modelling framework built by their Marketing Science team. In 2022, PyMC Labs released PyMC-Marketing, a Python library that wraps the powerful PyMC probabilistic programming framework with purpose-built MMM components. In 2024, Google released Meridian, a JAX-based Bayesian MMM framework designed to handle the full complexity of large multi-geo advertising datasets.
Within the space of four years, three major technology companies handed the DTC advertising world something that had previously cost hundreds of thousands of pounds to access: production-grade, validated, open-source Marketing Mix Modelling frameworks. You can download all three today for free. You can run your first model this week if your data is in order.
The problem is that "free" and "accessible" are not the same thing. Each framework makes different technical assumptions, requires different skills, runs at different speeds, and is suited to different types of Shopify brands. Choosing the wrong one does not just mean wasted engineering time — it can mean running a model that confidently produces wrong answers.
This post is a practical comparison. We will explain what each framework actually does under the hood, show you the key trade-offs in a comparison table, and give you clear criteria for deciding which one to run first — or whether to run all three and triangulate.
If you are new to Marketing Mix Modelling, read our complete guide to MMM for Shopify brands first. This post assumes you understand what adstock, saturation, and channel contributions are.
What each framework actually does
Meta Robyn — fast frequentist with a clever twist
Robyn is built in R and uses ridge regression as its core modelling engine. Ridge regression is ordinary least squares with an L2 regularisation penalty — it shrinks large coefficients towards zero, which helps enormously with the multicollinearity problem that plagues MMM when ad channels move together during peak periods. This is not a Bayesian model. There are no posterior distributions, no prior beliefs about channel behaviour, no credible intervals.
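To see why the L2 penalty helps, here is a toy two-channel example in pure Python (not Robyn's actual implementation, which uses glmnet under the hood, and Robyn itself is R, not Python). When two spend series move almost in lockstep, ordinary least squares splits credit between them erratically; the ridge penalty pulls both coefficients towards moderate, similar values:

```python
# Toy illustration of why ridge regression helps with collinear channels.
# Closed-form solution for two predictors: beta = (X'X + lam*I)^-1 X'y,
# solved by hand for the 2x2 case so no libraries are needed.

def fit_two_predictor(x1, x2, y, lam):
    s11 = sum(a * a for a in x1)
    s22 = sum(b * b for b in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    t1 = sum(a * c for a, c in zip(x1, y))
    t2 = sum(b * c for b, c in zip(x2, y))
    a, b, c, d = s11 + lam, s12, s12, s22 + lam
    det = a * d - b * c
    return ((d * t1 - b * t2) / det, (a * t2 - c * t1) / det)

# Nearly collinear spends (channel 2 tracks channel 1 with small noise);
# revenue is roughly x1 + x2, so the "true" coefficients are about (1, 1).
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [1.1, 1.9, 3.2, 3.9, 5.1]
y = [2.0, 4.1, 6.1, 8.0, 10.2]

ols = fit_two_predictor(x1, x2, y, lam=0.0)    # unstable split of credit
ridge = fit_two_predictor(x1, x2, y, lam=1.0)  # both pulled towards ~1
print(ols, ridge)
```

With these numbers OLS attributes most of the effect to one channel almost arbitrarily, while ridge recovers a near-even split. That instability under collinearity is exactly what peak-season MMM data produces.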
What Robyn has instead is a remarkably clever approach to model selection. Rather than fitting a single model and calling it the best, Robyn uses Meta's open-source Nevergrad gradient-free optimisation library to explore thousands of combinations of adstock decay rates and saturation curve parameters. The result is not one model — it is a Pareto frontier of equally valid models, plotted along two axes: model fit (NRMSE against held-out data) and decomposition quality (a metric called DECOMP.RSSD that measures how realistic the channel contributions look).
This Pareto frontier approach is genuinely useful. Instead of pretending your model is certain, Robyn shows you the space of plausible models and forces you to choose. Different models on the frontier will attribute different proportions of revenue to different channels. The spread of that frontier is an honest signal of how much genuine uncertainty exists in your data.
Robyn also includes a built-in budget optimiser and a calibration framework. The calibration framework allows you to feed in results from geo-holdout experiments or Meta's Conversion Lift studies — converting the absolute lift estimates from those experiments into constraints on the model's channel contribution estimates. This is a powerful feature that most practitioners underuse.
Runtime is fast. A typical Robyn run with 100 weeks of weekly data and four channels completes in under ten minutes on a modern laptop. This speed makes it practical to run multiple configurations, explore sensitivity to data choices, and iterate quickly on model specification.
PyMC-Marketing — full Bayesian inference
PyMC-Marketing is a Python library built on top of PyMC, which is one of the most widely used probabilistic programming frameworks in the world. Unlike Robyn, PyMC-Marketing fits a genuinely Bayesian model — which means it uses Markov Chain Monte Carlo (specifically the No-U-Turn Sampler, or NUTS) to draw samples from the full posterior distribution of every parameter in the model.
What does this mean in practice? It means that instead of getting a single point estimate for Meta's channel contribution, you get a distribution. You might find that Meta's contribution is estimated at £180,000 with a 94% credible interval of £110,000 to £250,000. That interval is telling you something real: given your data and your prior beliefs, any value in that range is plausible. A narrow interval means your data is highly informative about Meta's effect. A wide interval means genuine uncertainty — and you should be cautious about making large budget moves based on the point estimate alone.
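Reading an interval like that off posterior draws is mechanically simple. The sketch below fakes the draws with a random number generator purely to show the mechanics; in PyMC-Marketing the draws come from the fitted model via ArviZ, whose default summary reports a highest-density interval rather than the equal-tailed percentile interval used here:

```python
import random

random.seed(42)

# Pretend these are 4,000 posterior draws of Meta's contribution in GBP.
# (Invented numbers; a real model supplies these draws itself.)
draws = sorted(random.gauss(180_000, 35_000) for _ in range(4_000))

def equal_tailed_interval(sorted_draws, prob=0.94):
    """Chop `(1 - prob) / 2` of the draws off each tail."""
    n = len(sorted_draws)
    tail = (1 - prob) / 2
    return sorted_draws[int(tail * n)], sorted_draws[int((1 - tail) * n) - 1]

lo, hi = equal_tailed_interval(draws)
point = draws[len(draws) // 2]  # posterior median as the point estimate
print(f"median £{point:,.0f}, 94% interval £{lo:,.0f} to £{hi:,.0f}")
```

The width of that interval is the decision-relevant output: it tells you directly whether a proposed budget move is supported at both ends of the plausible range.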
The Bayesian framework also allows you to encode prior beliefs about how advertising works. You can tell the model that adstock decay rates should probably be between 0.3 and 0.8 for most channels, or that half-saturation points should fall within a plausible range of weekly spend. These priors are particularly valuable when you have limited data — they pull the model towards sensible answers in situations where the data alone cannot distinguish between multiple explanations.
PyMC-Marketing includes pre-built components for geometric adstock, Weibull adstock, Hill saturation, and logistic saturation. It has an MMM class that wraps all of these into a coherent model with sensible defaults. More advanced users can build custom model structures using PyMC's probabilistic programming primitives directly.
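Two of those transforms are simple enough to sketch in pure Python. This is an illustrative reimplementation, not the library's code — PyMC-Marketing builds them as tensor operations so the sampler can explore their parameters:

```python
# Minimal sketch of the two core MMM transforms.

def geometric_adstock(spend, alpha):
    """Carry a fraction `alpha` of each week's effect into the next week."""
    out, carry = [], 0.0
    for s in spend:
        carry = s + alpha * carry
        out.append(carry)
    return out

def hill_saturation(x, half_sat, shape):
    """Diminishing returns: response rises with spend but flattens out.
    `half_sat` is the input level that produces half the maximum response."""
    return [v**shape / (half_sat**shape + v**shape) for v in x]

# A single burst of spend decays over the following weeks...
spend = [100, 0, 0, 0, 50]
adstocked = geometric_adstock(spend, alpha=0.5)
# ...and the accumulated exposure is then squashed through the saturation curve.
saturated = hill_saturation(adstocked, half_sat=50.0, shape=1.0)
print(adstocked)   # [100.0, 50.0, 25.0, 12.5, 56.25]
print(saturated)
```

Note the ordering: adstock first, then saturation, which is the convention both PyMC-Marketing and Robyn follow by default.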
The trade-off for all this statistical rigour is runtime. A full MCMC sampling run on a typical Shopify MMM dataset (100 weeks, four channels, two thousand posterior samples per chain, four chains) takes between twenty minutes and several hours depending on your hardware and model complexity. You cannot iterate as quickly as you can with Robyn.
Google Meridian — JAX-based Bayesian with geo support
Meridian is the newest of the three frameworks and the most technically ambitious. Released by Google in 2024, it is built on JAX (Google's high-performance numerical computing library) and uses Hamiltonian Monte Carlo for Bayesian inference — similar in spirit to PyMC-Marketing but with a different computational backend that allows better parallelisation on GPUs.
Meridian's defining feature is native support for geo-level data. Most MMM frameworks model national-level weekly data: one row per week, one column per channel. Meridian can simultaneously model revenue across dozens of regions, states, or DMAs — each with their own spend levels and revenue outcomes — while sharing information across geographies through a hierarchical model structure. This dramatically increases the effective sample size and reduces confounding from factors like national economic trends.
Why does this matter? Because geo-level models can exploit natural experiments. If you ran a campaign heavily in the North West but not in the South East for six weeks, Meridian can use that geographic variation to identify Meta's causal effect much more precisely than a national-level model can. This is structurally similar to what geo-holdout tests do manually — but built directly into the model.
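The "sharing information across geographies" idea can be sketched as precision-weighted shrinkage: each geo's raw estimate gets pulled towards the national mean, and the noisier the geo's data, the harder the pull. This is a textbook simplification of what a hierarchical model does, with invented numbers; Meridian estimates everything jointly inside the sampler rather than in this two-step fashion:

```python
# Sketch of partial pooling across geos. All numbers are invented.

geos = {
    # geo: (raw ROAS estimate, weeks of usable data)
    "north_west": (3.8, 104),
    "south_east": (1.2, 104),
    "wales":      (9.0, 12),   # tiny sample -> implausibly high raw estimate
}

national_mean = 2.5
tau2 = 1.0     # assumed between-geo variance of true ROAS
sigma2 = 20.0  # assumed sampling variance per week of data

pooled = {}
for geo, (raw, n_weeks) in geos.items():
    se2 = sigma2 / n_weeks            # variance of this geo's raw estimate
    weight = tau2 / (tau2 + se2)      # how much to trust the raw value
    pooled[geo] = weight * raw + (1 - weight) * national_mean

print(pooled)  # the noisy Wales estimate is pulled strongly towards 2.5
```

The data-rich geos keep most of their own signal while the thin ones borrow strength from the rest of the country, which is exactly why geo-level modelling increases effective sample size.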
Meridian also has native integration with Google's own measurement tools, including support for ingesting reach and frequency data from YouTube campaigns and applying Google's recommended prior distributions for search and display advertising. Whether you view this as a feature or a conflict of interest depends on your perspective — and your level of Google spend.
The runtime picture for Meridian is nuanced. On CPU alone, Meridian can be slower than PyMC-Marketing. On GPU hardware (or Google Cloud's TPUs), it is substantially faster. For most Shopify brands, this means Meridian is most practical when run in a cloud environment.
Side-by-side comparison
| Dimension | Robyn (Meta) | PyMC-Marketing | Meridian (Google) |
|---|---|---|---|
| Statistical approach | Ridge regression (frequentist) | Bayesian MCMC (NUTS sampler) | Bayesian HMC (JAX backend) |
| Language | R | Python | Python |
| Runtime (typical) | 5–15 minutes | 20 min – 3 hours | 30 min – 4+ hours (CPU); much faster on GPU |
| Minimum data recommended | 52 weeks weekly | 52–104 weeks weekly | 104+ weeks or geo-level data |
| Uncertainty estimates | Pareto frontier (implicit) | Full posterior distributions | Full posterior distributions |
| Geo-level support | No (national only) | Limited (manual setup) | Yes (native hierarchical) |
| Shopify-friendly | Yes (with CSV input) | Yes (Python DataFrame) | Yes (Python DataFrame) |
| Prior calibration | Via lift test integration | Yes (full prior specification) | Yes (full prior specification) |
| Budget optimiser | Built-in | Built-in | Built-in |
| Technical difficulty | Medium (R) | High (Python + Bayesian) | Very high (Python + JAX + Bayesian) |
| Cost | Free (open source) | Free (open source) | Free (open source) |
| Best for | Fast iteration, mid-size DTC brands | Rigorous uncertainty, Python teams | Large brands, Google-heavy, geo data |
When to use Robyn
Robyn is the right choice when speed and iteration matter more than full statistical rigour. It is the framework you reach for when you need to answer a budget question this week, not next month. If your analyst is comfortable in R, Robyn will feel familiar and powerful within a day or two. If they are primarily a Python user, the R learning curve adds friction — though Robyn does have a Python interface via the robynpy wrapper that has improved substantially since its initial release.
Robyn excels in several specific situations:
- You are running your first MMM. The Pareto frontier is a gentle introduction to model uncertainty. It makes the point that there is no single correct model in an intuitive visual format, without requiring you to understand posterior distributions.
- Your team needs to iterate quickly. Running Robyn ten times to explore different adstock configurations takes less time than a single PyMC-Marketing fit. When you are in the exploratory phase of understanding your data, speed is a virtue.
- You have 1–3 years of data and 3–6 channels. This is Robyn's sweet spot. Ridge regression handles multicollinearity better than OLS, and the Nevergrad search does a reasonable job of exploring the adstock/saturation parameter space.
- You want to integrate conversion lift tests. Robyn's calibration framework is well-documented and specifically designed for this. If you regularly run Meta Conversion Lift or geo-holdout tests, Robyn makes it straightforward to use those results to constrain your model.
- You do not need credible intervals on individual channel contributions. If you are making directional budget decisions — "shift some spend from TikTok to Google" rather than "shift exactly £40,000 from TikTok to Google" — the Pareto frontier gives you enough information to act.
One honest limitation of Robyn worth flagging: because it is frequentist, it does not have a principled mechanism for incorporating prior knowledge about how advertising works. If you have strong external evidence that your Meta ads have a decay rate of approximately 0.6, you cannot formally encode that belief and let it influence your results. You can constrain the search range — but that is a cruder mechanism than Bayesian priors. This matters most when your data is limited or noisy.
If your Shopify store is doing between £500k and £5m annually, you have at least 12 months of weekly data, and you want a fast answer you can act on now, start with Robyn. You can always graduate to PyMC-Marketing once you understand your data and need tighter uncertainty estimates.
When to use PyMC-Marketing
PyMC-Marketing is the framework to reach for when the quality of your uncertainty estimates matters as much as the central estimate itself. In practical terms, this means: when you are making large budget decisions, when you have stakeholders who need to understand the confidence range around a recommendation, or when you are building MMM into a recurring measurement process that will inform quarterly planning.
The key advantage PyMC-Marketing has over Robyn is that its uncertainty is statistically principled. When PyMC says Meta's contribution is between £120,000 and £240,000 at the 94% credible level, that interval is a genuine posterior probability statement — it reflects everything your data and your prior beliefs can tell you about the true value. When Robyn's Pareto frontier shows a range of models attributing 15–28% of revenue to Meta, that range reflects model selection uncertainty, which is a related but different concept.
For Shopify brands, PyMC-Marketing is the right choice in these situations:
- You are managing a budget of £1m or more annually across paid channels. At this scale, the difference between Meta contributing 22% and 27% of revenue is worth hundreds of thousands of pounds. Getting a proper uncertainty interval on that estimate before shifting budget is worth the extra complexity.
- Your team knows Python and is comfortable reading MCMC diagnostics. PyMC-Marketing's outputs include trace plots, R-hat convergence statistics, and effective sample size metrics. If your analyst cannot interpret these, you cannot tell whether your model has converged correctly — and an unconverged Bayesian model is worse than no model at all.
- You want to encode prior knowledge about your channels. If you have run lift tests, analysed post-campaign effects manually, or have strong intuitions about which channels have long vs short decay rates, PyMC-Marketing allows you to formalise those beliefs as priors and propagate them through the model.
- You are building a recurring MMM workflow. PyMC-Marketing's Python API integrates naturally into data pipelines built on pandas, dbt, and modern data stacks. Once built, a PyMC-based MMM can be re-run on updated data with minimal manual intervention.
- You are combining MMM with other Bayesian analyses. If your team already uses probabilistic methods — Bayesian A/B testing, hierarchical models for LTV, etc. — PyMC-Marketing will feel like a natural extension of existing methodology.
The runtime cost is real. A well-specified PyMC-Marketing model with four chains and two thousand samples will take 30–90 minutes on a modern laptop for a typical Shopify dataset. You should plan for this and run models overnight or in a cloud environment. The flip side is that once your model has converged and you trust the results, the posterior is a rich object — you can extract uncertainty-quantified ROAS, budget optimiser recommendations with confidence intervals, and counterfactual simulations, all from the same fitted object.
A note on convergence
The single most important thing to check after running PyMC-Marketing is whether your MCMC chains have converged. The key diagnostics are:
- R-hat (Gelman-Rubin statistic): Should be below 1.01 for all parameters. Values above 1.05 indicate the chains have not mixed and your estimates are unreliable.
- Effective Sample Size (ESS): Should be at least 400 for all parameters. Low ESS means your posterior samples are highly autocorrelated, so your estimates carry far more Monte Carlo error than the raw sample count suggests.
- Trace plots: Each chain should look like a stationary, hairy caterpillar. Trends, drifts, or stuck chains are all warning signs.
Do not skip these checks. A model that has not converged will produce authoritative-looking numbers that are simply wrong.
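The intuition behind R-hat is worth internalising: it compares variance between chains to variance within chains, and a stuck chain inflates the ratio. Below is a rough pure-Python sketch of the classic Gelman-Rubin form on synthetic chains; ArviZ's production diagnostic is the stricter rank-normalised split-R-hat, so use that in practice rather than this:

```python
import random
import statistics

def r_hat(chains):
    """Classic Gelman-Rubin statistic: near 1.0 when chains agree."""
    n = len(chains[0])
    m = len(chains)
    chain_means = [statistics.fmean(c) for c in chains]
    grand_mean = statistics.fmean(chain_means)
    # Between-chain variance B and mean within-chain variance W.
    b = n / (m - 1) * sum((cm - grand_mean) ** 2 for cm in chain_means)
    w = statistics.fmean(statistics.variance(c) for c in chains)
    var_hat = (n - 1) / n * w + b / n
    return (var_hat / w) ** 0.5

random.seed(0)
# Four well-mixed chains sampling the same distribution.
good = [[random.gauss(0, 1) for _ in range(1000)] for _ in range(4)]
# Replace one chain with a stuck chain exploring the wrong region.
bad = [c[:] for c in good]
bad[0] = [random.gauss(3, 1) for _ in range(1000)]

print(round(r_hat(good), 3))  # close to 1.0
print(round(r_hat(bad), 3))   # well above 1.01: do not trust the fit
```

The stuck chain pushes the between-chain variance up and R-hat visibly away from 1, which is the signal the threshold in the checklist above is designed to catch.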
When to use Google Meridian
Meridian is the most technically demanding of the three frameworks and the one with the narrowest ideal use case for most Shopify brands. That is not a criticism — it is optimised for problems that most DTC brands do not face yet, and if your brand has outgrown national-level MMM, it is the right tool.
The cases where Meridian genuinely outperforms Robyn and PyMC-Marketing are:
- You have regional or geographic sales data. If you can break your Shopify revenue by county, state, region, or DMA — and you have at least two years of this data at weekly granularity — Meridian's hierarchical geo model will produce substantially more reliable channel attribution than a national-level model. The geographic variation gives the model natural experiments to work with.
- Google accounts for a large proportion of your spend. Meridian's default priors are calibrated on Google's own advertising research. If Google Search, Google Shopping, and YouTube are your primary channels, those priors will be well-matched to your situation. If Meta is your dominant channel, the priors may be less appropriate.
- You have in-house data science capability. Meridian's documentation is thorough but the debugging surface is large. JAX stack errors are notoriously cryptic. CUDA/GPU setup adds another layer of complexity. This is a framework for teams with a dedicated data scientist who has time to invest in the tooling.
- You are spending over £5m annually on advertising. At this scale, the precision gains from geo-level modelling start to pay for themselves in improved budget allocation. Below £2m, the engineering overhead of setting up Meridian is rarely justified by the marginal improvement in results.
- You need YouTube incrementality estimates. Meridian has specific support for reach and frequency data from YouTube campaigns, allowing it to model GRP-style exposure in a way that neither Robyn nor PyMC-Marketing supports natively.
There is one thing worth saying plainly about Meridian: it was built by Google, and it is designed to integrate with Google's measurement ecosystem. The framework is genuinely open-source and technically sound. But if you are concerned about methodology being influenced by the platform whose ads you are measuring, you should be aware of that context. Running Meridian alongside Robyn or PyMC-Marketing and checking that the results are broadly consistent is a reasonable safeguard.
If you are a £10m+ Shopify brand with meaningful Google spend, regional sales data, and a data scientist on staff, Meridian is worth the investment. For everyone else, start with Robyn or PyMC-Marketing and revisit Meridian when you have outgrown them.
Nuso runs all three — you compare results in one dashboard
One of the strongest validation techniques available to any MMM practitioner is framework triangulation: running the same dataset through multiple frameworks and checking that the channel attributions are broadly consistent. If Robyn attributes 23% of revenue to Meta, PyMC-Marketing attributes 19–27% (credible interval), and Meridian attributes 21%, you can be reasonably confident you are in the right ballpark. If one framework produces a wildly different result, that is a signal to investigate your data and model specification rather than to simply trust one answer over another.
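Mechanically, the consistency check described above reduces to asking whether the three ranges share a common overlap. The sketch below uses invented attribution numbers shaped like the example in the previous paragraph:

```python
# Toy triangulation check: do the frameworks' attribution ranges for one
# channel overlap? All numbers are invented for illustration.

estimates = {
    # framework: (low, point, high) share of revenue attributed to Meta
    "robyn":          (0.15, 0.23, 0.28),  # Pareto-frontier spread
    "pymc_marketing": (0.19, 0.23, 0.27),  # 94% credible interval
    "meridian":       (0.17, 0.21, 0.26),  # 94% credible interval
}

def consistent(estimates):
    """True when every framework's range shares a common overlap region."""
    highest_low = max(lo for lo, _, _ in estimates.values())
    lowest_high = min(hi for _, _, hi in estimates.values())
    return highest_low <= lowest_high

print(consistent(estimates))  # True: all three ranges overlap
```

If one framework's range sits entirely outside the others', the function returns False, which is precisely the "investigate your data before trusting any one answer" signal.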
The problem with framework triangulation is that it requires running and maintaining three separate modelling environments — R for Robyn, Python with MCMC overhead for PyMC-Marketing, and JAX/GPU infrastructure for Meridian. For most Shopify brands, that is three separate engineering investments, three sets of dependencies to maintain, and three sets of outputs to manually compare in spreadsheets.
Nuso's MMM module runs all three frameworks on your data automatically. You connect your Shopify revenue and ad platform spend data once. Nuso handles the data preparation, runs each framework in a cloud environment sized appropriately for the workload, and surfaces the results in a single comparison dashboard.
The dashboard shows you:
- Channel attribution by framework — side-by-side waterfall charts showing how each framework decomposes your revenue. Where they agree, you have high confidence. Where they diverge, the dashboard flags the discrepancy and explains likely causes.
- Uncertainty ranges — PyMC-Marketing and Meridian posterior credible intervals displayed alongside Robyn's Pareto frontier spread, so you can see the full uncertainty picture in one place.
- Budget optimiser consensus — the recommended spend allocation from each framework's built-in optimiser, and where they agree on directional moves (e.g., all three suggest increasing Google relative to TikTok).
- Baseline trend — how your organic revenue baseline is evolving over time, separated from paid media effects, across all three models.
You do not need to know R, set up a JAX environment, or interpret MCMC trace plots. Nuso runs the models and translates the outputs into plain-English findings and specific budget recommendations. If you want to go deeper, the raw model outputs and diagnostic plots are available for export.
For Shopify brands who want MMM-grade measurement without building and maintaining an in-house modelling stack, this is the practical path. You get the rigour of all three frameworks, the triangulation benefit, and results in a dashboard your whole team can read.
Run Robyn, PyMC, and Meridian on your Shopify data
Nuso connects to your store and ad accounts, runs all three MMM frameworks in parallel, and shows you where they agree — and where they don't.
Get started free