What is Marketing Mix Modelling?
Marketing Mix Modelling (MMM) is a statistical technique that uses historical sales and marketing data to estimate the contribution of each marketing channel — and non-marketing factors like price, seasonality, and distribution — to your overall revenue. It was invented in the 1960s by econometricians working for packaged-goods companies and has been quietly running inside the largest consumer brands in the world ever since.
The core question MMM answers is: if I had not spent anything on Meta ads last month, how much less revenue would I have made? That counterfactual — the revenue without the channel — is what the model is actually estimating. The difference between actual revenue and counterfactual revenue is the incremental contribution of that channel.
This is fundamentally different from last-click attribution, multi-touch attribution, or platform-reported ROAS. Those methodologies attribute credit to touchpoints in a customer's journey. MMM does not look at individual customers at all. It looks at aggregate weekly (or daily) totals and finds the statistical relationship between your spend and your revenue, controlling for everything else that was happening simultaneously.
MMM is not a replacement for pixel-based attribution — it is an alternative measurement layer that operates at a completely different level of abstraction. The two approaches answer different questions. Attribution says "which ad did this customer see before buying?" MMM says "if we had not run that campaign at all, how much revenue would we have lost?"
Because MMM works on aggregate data, it is completely privacy-safe. It does not require cookies, pixels, or individual-level tracking. This makes it uniquely robust in a world where third-party data is disappearing.
Why Shopify brands need MMM
For most of the 2010s, Shopify brands could run their entire marketing operation off the Meta and Google dashboards. The pixel was reliable. The attribution windows were long enough to catch most conversions. ROAS was a reasonable proxy for performance.
That world ended in April 2021 when Apple released iOS 14.5 and made App Tracking Transparency opt-in. Within months, roughly 60–70% of iOS users had opted out of tracking. Meta's pixel lost a significant proportion of its signal. Reported ROAS collapsed — in many cases by 30–50% — not because campaigns stopped working, but because the measurement was broken.
The problems with platform attribution in 2026
- iOS signal loss — Meta still estimates conversions through Aggregated Event Measurement and modelled signals, but these introduce material uncertainty into reported numbers.
- Platform bias — every ad platform has a financial incentive to claim credit for as many conversions as possible. When Meta and Google both claim the same sale, your blended reported ROAS looks fantastic and your true incrementality looks very different.
- Attribution window gaming — a 7-day click window will catch substantially more conversions than a 1-day click window, even if the campaign influence expired on day two.
- Offline and organic lift — branded search often spikes when you run TV, out-of-home, or heavy social campaigns. Platform attribution will credit branded search with these sales, making paid social look underpowered and SEO look miraculous.
- Cookie deprecation — third-party cookies are substantially degraded across Safari and Firefox and face continued pressure in Chrome. The cookie-based measurement stack is a depreciating asset.
MMM sidesteps all of these problems by ignoring individual tracking entirely. It does not care whether you can follow a user from ad impression to checkout. It observes that when you spent £50,000 on Meta in weeks where revenue was £400,000, and when you spent £20,000 in similar weeks where revenue was £280,000, there is a measurable relationship — and it estimates that relationship with statistical rigour.
How MMM works technically
A Marketing Mix Model is, at its core, a regression model. The dependent variable is revenue (or conversions, or orders). The independent variables include your spend by channel, plus a set of control variables that capture non-marketing factors.
But raw spend data is not sufficient. Advertising does not have an instant, simultaneous effect on sales. Someone who sees a Meta ad on Tuesday may not buy until Friday. A TV campaign may drive search volume for three weeks after the spots air. This lagged effect needs to be modelled explicitly.
Adstock transformations
Adstock is the transformation applied to raw spend data to capture carry-over effects. The simplest version is geometric adstock, where each week's effective spend is the sum of the current week's spend plus a fraction of the previous week's adstock:
Adstock(t) = Spend(t) + decay_rate × Adstock(t-1)
A decay rate of 0.5 means that half of last week's advertising effect carries over into this week. Channels like paid search typically have low decay rates (effects are immediate) whilst TV or brand campaigns typically have high decay rates (effects persist for weeks).
More sophisticated models use Weibull adstock, which allows the peak effect to be delayed rather than always occurring in the week of spend. This is important for channels like display or video where awareness builds before purchase intent forms.
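One common way to implement Weibull adstock is to convolve spend with a set of lag weights drawn from a Weibull PDF and normalised to sum to one. This is a minimal sketch of that idea; the parameterisation and the `shape`/`scale` values are illustrative, not fitted:

```python
import numpy as np

def weibull_adstock(spend, shape=2.0, scale=3.0, max_lag=12):
    """Adstock via convolution with normalised Weibull PDF lag weights.

    shape > 1 delays the peak effect past the week of spend;
    shape = 1 reduces to plain exponential (geometric-style) decay.
    """
    lags = np.arange(max_lag)
    # Weibull PDF evaluated at lag + 1 so that lag 0 gets a valid weight
    w = (shape / scale) * ((lags + 1) / scale) ** (shape - 1) \
        * np.exp(-(((lags + 1) / scale) ** shape))
    w = w / w.sum()                        # normalise: total effect preserved
    # convolve spend with the weights, truncated to the original length
    return np.convolve(spend, w)[: len(spend)]
```

With `shape=2.0`, a single pulse of spend produces a response that peaks one or two weeks later rather than immediately, which is the behaviour geometric adstock cannot express.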
Saturation curves
Advertising also exhibits diminishing returns at high spend levels. The first £10,000 you spend on a channel drives more incremental revenue than the second £10,000. Saturation functions capture this. The most common is the Hill function:
Saturation(x) = x^alpha / (x^alpha + K^alpha)
Where K is the half-saturation point (the spend level at which you're getting half of maximum possible effect) and alpha controls the steepness of the curve. Calibrating these parameters is one of the most technically demanding parts of building a reliable MMM.
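The Hill function above translates directly into code. This sketch just evaluates the formula; the parameter values in the usage note are illustrative:

```python
import numpy as np

def hill_saturation(spend, K, alpha):
    """Hill function: maps raw spend to a saturation level between 0 and 1.

    K is the half-saturation point; alpha controls the steepness of the curve.
    """
    spend = np.asarray(spend, dtype=float)
    return spend ** alpha / (spend ** alpha + K ** alpha)
```

By construction, `hill_saturation(K, K, alpha)` is exactly 0.5 for any `alpha` — spending at the half-saturation point yields half of the maximum possible effect.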
The regression equation
Once spend variables have been adstocked and saturated, the model fits a regression equation of the form:
Revenue = Baseline + β₁×Meta_adstocked + β₂×Google_adstocked + β₃×TikTok_adstocked + β₄×Seasonality + β₅×Promotions + ε
The baseline captures revenue that would occur even if you spent nothing on advertising — direct traffic, organic search, word of mouth, repeat customers. For many Shopify brands, baseline is 30–60% of total revenue, which is often a revelation when first seen.
The baseline contribution is often the most strategically important number in a Marketing Mix Model. A brand with 60% baseline revenue has a very different risk profile and budget flexibility than a brand with 20% baseline. If you suddenly stopped all paid advertising, how long would your business survive? MMM tells you exactly that.
OLS vs PyMC vs Meridian vs Robyn — when to use each
There are now four credible approaches to building an MMM, each with different trade-offs in terms of technical complexity, uncertainty quantification, and data requirements.
| Framework | Approach | Best for | Uncertainty | Complexity | Cost |
|---|---|---|---|---|---|
| OLS | Frequentist regression | First models, quick iteration | p-values / CI only | Low | Free |
| Robyn (Meta) | Ridge regression + Nevergrad optimiser | Mid-size brands, R users | Pareto frontier | Medium | Free |
| PyMC-Marketing | Bayesian (MCMC / NUTS sampler) | Full uncertainty quantification | Full posterior | High | Free |
| Meridian (Google) | Bayesian (TensorFlow Probability, HMC) | Large brands, geo-level data | Full posterior | High | Free |
OLS (Ordinary Least Squares)
OLS is the starting point for any MMM practitioner. It is fast, interpretable, and runs anywhere from Excel's LINEST function to Python's statsmodels or scikit-learn. The limitation is that it produces point estimates without meaningful uncertainty quantification, and it is sensitive to multicollinearity between channels (a common problem when brands increase spend on all channels simultaneously during peak periods).
Use OLS when you are exploring your data for the first time, checking for seasonality patterns, or need to present a quick directional answer to a stakeholder. Do not use OLS as your production model if you are making budget allocation decisions of material size.
Robyn (Meta Open Source)
Meta released Robyn as open-source R code in 2021, and it has become one of the most widely used MMM tools for DTC brands. Robyn uses ridge regression (which penalises large coefficients and helps with multicollinearity) combined with Meta's Nevergrad optimisation library to explore thousands of model configurations and return a Pareto frontier of equally valid models.
The Pareto frontier is a genuinely useful concept: rather than giving you a single "best" model, Robyn shows you the set of models where you cannot improve fit without degrading decomposition, and vice versa. This helps you reason about model uncertainty without full Bayesian inference. Robyn also includes a built-in budget optimiser and calibration framework for integrating conversion lift tests.
Robyn is a strong choice for brands spending £500k–£5m annually across 3–6 channels, with an analyst comfortable in R.
PyMC-Marketing
PyMC-Marketing is a Python library built on PyMC (a probabilistic programming framework) that provides purpose-built MMM components including geometric and Weibull adstock, Hill saturation, and a full Bayesian model class. Because it uses Markov Chain Monte Carlo (MCMC) to sample the full posterior distribution of every parameter, it gives you genuine uncertainty intervals on all outputs — including the channel contributions and the budget optimiser recommendations.
This uncertainty quantification is the key advantage of Bayesian MMM. Instead of saying "Meta contributed £180,000 to revenue last quarter," PyMC says "Meta contributed between £120,000 and £240,000, with a median of £180,000." That interval matters enormously when you are deciding whether to shift £100,000 from Meta to Google.
PyMC-Marketing requires comfort with Python, Bayesian reasoning, and MCMC diagnostics (R-hat, ESS, trace plots). It is the right choice for brands that want to build a rigorous, production-grade MMM in-house.
Meridian (Google)
Google released Meridian in 2024 as an open-source Bayesian MMM framework built on TensorFlow Probability. Its key differentiator is native support for geo-level data — modelling revenue at the regional or DMA level simultaneously — which dramatically increases the effective sample size and reduces confounding. If you have regional sales data and run TV or out-of-home advertising, Meridian is worth investigating.
Meridian is the most technically demanding of the four frameworks and is best suited to brands with in-house data science capability or an analytics agency partner.
How much data you need
This is the question every brand asks first, and the honest answer is: more than you probably have if your brand is less than two years old.
A reliable MMM needs enough variance in your spend patterns to identify the effect of each channel. If you have always spent roughly the same amount on each channel every week, the model has very little signal to work with. You need periods of high spend, low spend, and ideally zero spend on each channel.
- Minimum viable: 52 weeks of weekly data (1 year), at least 3 channels, with natural spend variation of at least ±30% around the mean per channel
- Recommended: 104 weeks (2 years) — this gives you two cycles of seasonality, which the model needs in order to separate seasonal effects from advertising effects
- Ideal: 156+ weeks (3 years) with geo-level splits and at least one geo-holdout experiment per major channel
Weekly granularity is the standard for MMM. Daily data is theoretically richer but introduces more noise and requires more sophisticated error structures. For most Shopify brands, weekly is the right level.
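The "±30% variation around the mean" rule of thumb above can be checked mechanically with a coefficient-of-variation test. This is a rough sketch — the 0.3 threshold is the heuristic from this article, not a statistical guarantee:

```python
import numpy as np

def has_enough_variation(weekly_spend, min_cv=0.30):
    """Rough data-sufficiency check for one channel: does weekly spend
    vary at least ~30% around its mean (coefficient of variation >= 0.3)?"""
    s = np.asarray(weekly_spend, dtype=float)
    if s.mean() == 0:
        return False                       # channel never ran: no signal either
    return (s.std() / s.mean()) >= min_cv
```

A channel that fails this check will tend to produce wide, unstable coefficient estimates, because the model has barely seen it at any spend level other than its usual one.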
Required data columns for a basic model:
- Week start date
- Total revenue (net of refunds)
- Spend per channel (Meta, Google, TikTok, email, etc.)
- Impressions or clicks per channel (optional but helpful)
- Promotion indicator (sale periods, discount codes)
- Seasonality indicators (or raw search volume data as a proxy)
- Price index (if you have had significant pricing changes)
Running your first MMM model: a step-by-step guide
Step 1: Collect and clean your data
Pull two years of weekly revenue from Shopify (gross revenue minus refunds, minus shipping). Pull weekly spend by channel from each ad platform — Meta Ads Manager, Google Ads, TikTok Ads Manager. Export to a single CSV with one row per week. Check for missing weeks, platform outages, and data anomalies. Flag any promotional periods (Black Friday, sitewide sales) with a binary column.
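The assembly step above can be sketched in pandas. The frames here stand in for your Shopify and ad-platform exports — all the numbers, the promo date, and the column names are hypothetical:

```python
import pandas as pd

# Hypothetical weekly exports: Shopify revenue and Meta spend
revenue = pd.DataFrame({
    "week": pd.date_range("2024-01-01", periods=4, freq="W-MON"),
    "revenue": [52000, 48000, 61000, 95000],
})
meta = pd.DataFrame({
    "week": pd.date_range("2024-01-01", periods=4, freq="W-MON"),
    "meta_spend": [8000, 7500, 9000, 14000],
})

# One row per week, channels joined on the week start date
df = revenue.merge(meta, on="week", how="left")

# Flag promotional weeks with a binary column (hypothetical sale week)
df["promo"] = (df["week"] == "2024-01-22").astype(int)

# Check for gaps: every week in the range should be present exactly once
expected = pd.date_range(df["week"].min(), df["week"].max(), freq="W-MON")
missing = expected.difference(df["week"])
assert len(missing) == 0, f"Missing weeks: {list(missing)}"
```

The missing-week check matters more than it looks: a silently dropped week shifts every adstocked series by one period and quietly corrupts the whole model.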
Step 2: Exploratory analysis
Before building any model, spend an afternoon understanding your data. Plot revenue over time. Plot spend per channel over time. Look for visual correlations. Check whether peaks in spend align with peaks in revenue — and by how many weeks they are offset. Calculate the correlation matrix between all channels. Very high correlations (>0.8) between channels indicate potential multicollinearity problems ahead.
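The correlation check at the end of that step is a two-liner with NumPy. The spend series below are invented for illustration — in the first two, Google roughly tracks Meta, which is exactly the pattern that causes trouble:

```python
import numpy as np

# Hypothetical weekly spend series for three channels
meta   = np.array([8000, 7500, 9000, 14000, 6000, 10000], dtype=float)
google = np.array([4100, 3900, 4500,  7200, 3000,  5100], dtype=float)
tiktok = np.array([1000,  500, 2500,  1500, 3000,   800], dtype=float)

corr = np.corrcoef([meta, google, tiktok])

# Flag channel pairs correlated above 0.8 -- a multicollinearity warning sign
high = [(i, j) for i in range(3) for j in range(i + 1, 3) if corr[i, j] > 0.8]
```

Here only the Meta–Google pair trips the 0.8 threshold, which tells you in advance that an unregularised model will struggle to separate those two channels.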
Step 3: Apply adstock transformations
Transform your spend variables using geometric adstock. Start with a decay rate of 0.5 for all channels as a baseline. You will refine these in the next step. In Python:

```python
def geometric_adstock(spend, decay):
    """Geometric adstock: this week's spend plus decay x last week's adstock."""
    adstocked = [spend[0]]
    for i in range(1, len(spend)):
        adstocked.append(spend[i] + decay * adstocked[i - 1])
    return adstocked
```
Step 4: Fit a baseline OLS model
Run a simple OLS regression with your adstocked spend variables, a trend term, and seasonal dummy variables. Check that all channel coefficients are positive (negative coefficients mean your spend is anti-correlated with revenue, which is a signal of multicollinearity or data error). Check R-squared — a reasonable MMM should explain 85–95% of revenue variance.
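As a sketch of this step, the baseline fit and both sanity checks can be run with nothing more than NumPy's least-squares solver. The data here is synthetic, generated so the true coefficients are known — on real data you would substitute your adstocked spend columns:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 104                                    # two years of weekly data

# Synthetic adstocked spend and controls (all numbers hypothetical)
meta   = rng.uniform(5000, 15000, n)
google = rng.uniform(2000, 8000, n)
trend  = np.arange(n)
season = np.sin(2 * np.pi * trend / 52)    # crude annual seasonality term

# Known data-generating process, so we can check the fit recovers it
revenue = (50000 + 3.0 * meta + 2.0 * google + 100 * trend
           + 8000 * season + rng.normal(0, 2000, n))

# Design matrix with an intercept column (the baseline)
X = np.column_stack([np.ones(n), meta, google, trend, season])
beta, *_ = np.linalg.lstsq(X, revenue, rcond=None)

# Sanity check 1: channel coefficients should be positive
assert beta[1] > 0 and beta[2] > 0

# Sanity check 2: R-squared should land in a plausible range
pred = X @ beta
r2 = 1 - np.sum((revenue - pred) ** 2) / np.sum((revenue - revenue.mean()) ** 2)
```

If a channel coefficient comes out negative on your real data, stop and investigate the data (or the collinearity) before moving on to Step 5.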
Step 5: Iterate with Robyn or PyMC
Once you have a sanity-checked OLS baseline, move to Robyn or PyMC-Marketing for production. Configure the adstock and saturation hyperparameter search ranges based on your OLS learnings. Run the model. Review fit metrics, trace plots (PyMC), or Pareto frontiers (Robyn).
Step 6: Validate with holdout tests
The only way to truly validate an MMM is to run a real-world experiment. Turn off spend on one channel in one geography for 4 weeks and see whether the model's prediction of the revenue impact matches what actually happened. This geo-holdout test is the gold standard for MMM validation and should be run at least once per year per major channel.
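Scoring a geo-holdout comes down to a difference-in-differences calculation: how much did the holdout region drop relative to a comparable control region, and does that match the model's estimate? All numbers here are hypothetical, including the model's predicted lift:

```python
import numpy as np

# Hypothetical weekly revenue (indexed) around a 4-week Meta blackout
holdout_pre  = np.array([100.0, 105.0,  98.0, 102.0])   # holdout geo, before
holdout_test = np.array([ 80.0,  82.0,  78.0,  85.0])   # holdout geo, dark
control_pre  = np.array([200.0, 210.0, 195.0, 205.0])   # control geo, before
control_test = np.array([198.0, 205.0, 200.0, 207.0])   # control geo, same weeks

# Difference-in-differences: holdout's change minus control's change
holdout_change = holdout_test.mean() - holdout_pre.mean()
control_change = control_test.mean() - control_pre.mean()
observed_lift = holdout_change - control_change          # revenue lost by going dark

model_predicted_lift = -18.0                             # hypothetical MMM estimate

# The model passes if its prediction lands near what the market actually did
ratio = model_predicted_lift / observed_lift
```

In this invented example the model predicted a lift of -18 against an observed -20, a ratio of 0.9 — close enough to trust the model's Meta estimates, at least directionally.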
Step 7: Run the budget optimiser
Given the fitted saturation curves, a budget optimiser can identify the spend allocation across channels that maximises total revenue for a given total budget. This is where MMM produces direct commercial value — not just measurement insight but a specific budget recommendation.
Interpreting MMM results: ROAS, contributions, and the budget optimiser
Channel contributions
The primary output of an MMM is a decomposition of total revenue into its component parts. For a typical Shopify brand this might look like:
| Component | % of Revenue | £ (example, £2M annual) |
|---|---|---|
| Baseline (organic, direct, repeat) | 42% | £840,000 |
| Meta Ads | 24% | £480,000 |
| Google Ads | 16% | £320,000 |
| Email / SMS | 11% | £220,000 |
| TikTok Ads | 4% | £80,000 |
| Seasonality / Promotions | 3% | £60,000 |
mROAS — marginal return on ad spend
Standard MMM ROAS divides incremental contribution by spend. But the more useful number is mROAS — the marginal return on the next pound of spend. Because saturation curves mean diminishing returns, a channel may have a historical ROAS of 4.0 but an mROAS of 1.8 — meaning if you spend one more pound today, you get £1.80 back. This is the number the budget optimiser uses.
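The gap between average and marginal ROAS falls straight out of the saturation curve. This sketch computes both for a single channel whose response is a scaled Hill function; the curve parameters and spend level are hypothetical:

```python
import numpy as np

def channel_revenue(spend, K=20000, alpha=2.0, scale=100000):
    """Hypothetical revenue response: a scaled Hill saturation of spend."""
    return scale * spend ** alpha / (spend ** alpha + K ** alpha)

spend = 30000.0

# Average ROAS: total incremental revenue over total spend
avg_roas = channel_revenue(spend) / spend

# mROAS: numerical derivative -- the return on the *next* pound
eps = 1.0
mroas = (channel_revenue(spend + eps) - channel_revenue(spend)) / eps
```

At this spend level (well past the half-saturation point `K`), the average ROAS is around 2.3 while the mROAS is around 1.4 — a historically "profitable" channel whose next pound earns much less than its history suggests.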
Budget optimiser
Given the marginal returns across all channels, the budget optimiser finds the allocation where the mROAS of every channel is equal. At that point, there is no shift you can make between channels that would increase total revenue. In practice this means increasing spend on under-saturated channels (high mROAS) and reducing it on over-saturated channels (low mROAS), until the marginal returns equalise.
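A simple way to find that equal-mROAS allocation is greedy marginal allocation: hand out the budget in small increments, each one going to the channel with the highest current marginal return. The sketch below assumes concave (alpha = 1) Hill response curves, where greedy allocation is valid because returns diminish everywhere; the per-channel parameters are hypothetical fitted values:

```python
def hill_revenue(spend, K, alpha, scale):
    """Hypothetical fitted response curve: scaled Hill saturation of spend."""
    return scale * spend ** alpha / (spend ** alpha + K ** alpha)

# Hypothetical fitted curves per channel: (K, alpha, scale); alpha = 1 keeps
# every curve concave, so marginal returns fall monotonically with spend
channels = {
    "meta":   (20000, 1.0, 120000),
    "google": (10000, 1.0,  60000),
    "tiktok": ( 5000, 1.0,  15000),
}

def optimise(total_budget, step=100.0):
    """Greedy allocation: give each increment to the channel whose marginal
    return (mROAS) on the next chunk of spend is currently highest."""
    alloc = {name: 0.0 for name in channels}
    for _ in range(int(total_budget / step)):
        best = max(
            channels,
            key=lambda c: hill_revenue(alloc[c] + step, *channels[c])
                          - hill_revenue(alloc[c], *channels[c]),
        )
        alloc[best] += step
    return alloc

alloc = optimise(60000)
```

Because every increment goes to the highest marginal return, the allocation converges on the point where all channels' marginal returns are (approximately) equal — the condition described above.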
Common MMM mistakes
- Too little data: Building an MMM on less than 52 weeks of data almost always produces unreliable results. The model has not seen enough variance in spend to separate channel effects from confounders.
- Not controlling for promotions: A single Black Friday week without a promotion flag will cause the model to attribute promotional lift to whichever channel happened to be running that week.
- Ignoring multicollinearity: If you always increase all channels at the same time, the model cannot separate their effects. Vary your channel mix deliberately, or use regularisation (ridge regression) to handle it.
- Forgetting the baseline: Many brands optimise their paid media and ignore the 30–60% of revenue coming from organic and retention. A strong brand investment that grows your baseline is often worth more than incremental paid efficiency.
- Not validating with experiments: An unvalidated MMM is a sophisticated guess. At minimum, run one geo-holdout test per year to check that the model's incremental estimates are approximately correct.
- Updating too infrequently: Consumer behaviour, competitive dynamics, and media efficiency all shift over time. An MMM built on 2024 data may give systematically wrong budget recommendations in 2026. Refit at least quarterly.
MMM in Nuso
Nuso includes a built-in Marketing Mix Modelling module designed specifically for Shopify brands. It connects directly to your Shopify revenue data and pulls spend automatically from your connected ad platforms, so you do not need to build and maintain a data pipeline before you can run a model.
The Nuso MMM module uses a PyMC-based Bayesian engine with geometric adstock and Hill saturation, pre-configured with sensible priors for DTC e-commerce brands. You select your channels, confirm your data range, and the model runs in the background — typically completing in 15–45 minutes depending on data volume.
Results are presented as an interactive contribution waterfall, a channel ROAS comparison table (MMM ROAS vs platform-reported ROAS, side by side), and a budget optimiser that shows you the recommended weekly spend per channel given your current total budget and target MER.
Nuso also surfaces the baseline trend over time — so you can see whether your brand is getting stronger or weaker independent of paid activity, which is arguably the most important strategic signal of all.
For brands new to MMM, Nuso includes a model confidence score and a plain-English interpretation of the key findings, so you do not need a data scientist on staff to act on the results.
Run your first Marketing Mix Model in Nuso
Connect your Shopify store and ad accounts. Your first MMM runs automatically — no data science required.
Start free trial