PIEModel

class pymc_marketing.pie.model.PIEModel(*, pre_determined_features, post_determined_features, target_column='y', model_config=None, sampler_config=None)

Predicted Incrementality by Experimentation model.

Trains a Bayesian BART regression on a corpus of past RCTs mapping campaign features to measured incrementality, then predicts incrementality for non-experimental campaigns.

Parameters:

pre_determined_features : list[str]

Feature columns known before the campaign runs (e.g. objective, vertical, budget, audience_type). In the current alpha implementation this list is concatenated with post_determined_features and fed identically into BART; the distinction is recorded for future versions that gate prediction on feature availability but has no effect on the model graph today.

post_determined_features : list[str]

Feature columns known only after the campaign runs (e.g. exposure_rate, ctr, last_click_conversions_per_dollar, avg_treated_outcome). As described under pre_determined_features, both lists are treated identically in this release.

target_column : str

Name used for the target variable in the PyMC graph and in saved idata groups (posterior_predictive[target_column], fit_data[target_column]). Does not select a column from X — X and y are always passed separately. Defaults to "y".

model_config : dict, optional

Override default priors / BART settings. Top-level keys merge with default_model_config(); nested dicts (e.g. "bart") are replaced wholesale, so a partial "bart" override must restate every required key (m, alpha, beta). Keys:

"bart": dict with m (int), alpha (float), beta (float), and optional response — "constant" (default, piecewise-constant leaves), "linear", or "mix" (the latter two fit linear models in the leaves, which can help on smooth response surfaces).
"sigma": pymc_extras.prior.Prior for the noise std.
"categorical_split": "onehot" (default) or "continuous". Controls how label-encoded categorical columns are split by BART — see Notes.

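The merge behaviour described above can be sketched with plain dictionaries. This is a minimal illustration, not the library's code: the default values and the `shallow_merge` helper are hypothetical, and the real defaults come from `default_model_config`.

```python
# Hypothetical defaults for illustration only; the real values come from
# PIEModel.default_model_config and may differ.
assumed_defaults = {
    "bart": {"m": 50, "alpha": 0.95, "beta": 2.0},
    "sigma": "<Prior placeholder>",
    "categorical_split": "onehot",
}


def shallow_merge(defaults, overrides):
    """Top-level keys in overrides win; nested dicts are not merged recursively."""
    return {**defaults, **(overrides or {})}


# A partial "bart" override loses alpha and beta, because the whole
# nested dict is replaced rather than merged key by key.
partial = shallow_merge(assumed_defaults, {"bart": {"m": 100}})
missing = {"m", "alpha", "beta"} - set(partial["bart"])

# Restating every required key keeps the nested config complete.
complete = shallow_merge(
    assumed_defaults, {"bart": {"m": 100, "alpha": 0.95, "beta": 2.0}}
)
```

Here `missing` contains the keys a partial override would drop, which is why a "bart" override should always restate m, alpha, and beta.
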
sampler_config : dict, optional

Passed to pymc.sample(). Defaults to {}.
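
As an illustration, a sampler configuration might look like the sketch below. The keys are standard `pymc.sample()` keyword arguments; the values are assumptions chosen for the example, not defaults of this class.

```python
# Keyword arguments forwarded to pymc.sample(); the values are illustrative
# assumptions, not defaults of PIEModel.
sampler_config = {
    "draws": 1000,  # posterior draws kept per chain
    "tune": 1000,   # warm-up iterations discarded per chain
    "chains": 4,    # independent chains, useful for convergence checks
    "cores": 4,     # number of chains sampled in parallel
}

# Usage, assuming pymc_marketing is installed:
# model = PIEModel(
#     pre_determined_features=["objective", "vertical", "budget"],
#     post_determined_features=["exposure_rate"],
#     sampler_config=sampler_config,
# )
```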

Notes

This module is alpha — the API and defaults may change. Tracked deviations from the paper [1]:

The paper uses a random forest fit to 2,226 RCTs; this implementation uses Bayesian Additive Regression Trees (PyMC-BART) for native posterior uncertainty.
The paper’s decision-theoretic framework (Type I/II error rates, disagreement vs RCT-based go/no-go decisions; paper §6) is not implemented.
Within-campaign sample splitting (paper §4.2) — which breaks the mechanical correlation between post-determined features and the target — is not implemented.
Extrapolation / cold-start diagnostics across advertiser segments (paper §5.3) are not implemented.
The footnote-2 measurement-error layer y_observed ~ Normal(y_true, se_rct) for per-RCT standard errors is not implemented.

Categorical columns (object or category dtype) are label-encoded in build_model. With categorical_split="onehot" (default), BART uses pymc_bart.split_rules.OneHotSplitRule for those columns so that splits are “level X vs not-X” rather than “encoded value < c” — this avoids imposing the encoder’s alphabetical ordering on unordered categories. Set categorical_split="continuous" to fall back to ordered splits.
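
The difference between the two split modes can be seen in a small pure-Python sketch. It mimics alphabetical label encoding and the two kinds of partition by hand; it does not call `pymc_bart`, and the helper names `ordered_split` and `onehot_split` are hypothetical.

```python
levels = ["travel", "finance", "retail"]

# Alphabetical label encoding, as an encoder would assign it.
codes = {level: code for code, level in enumerate(sorted(levels))}


def ordered_split(threshold):
    """Partition levels by 'encoded value < threshold' (continuous mode)."""
    left = {lvl for lvl, code in codes.items() if code < threshold}
    return left, set(codes) - left


def onehot_split(level):
    """Partition levels as 'level X vs not-X' (one-hot mode)."""
    return {level}, set(codes) - {level}


# Ordered splits can only cut the alphabetical ordering at one point, so
# the middle level "retail" is never isolated by a single split.
ordered_lefts = [ordered_split(t)[0] for t in (1, 2)]

# A one-hot split isolates any level directly.
retail_alone, rest = onehot_split("retail")
```

With ordered splits, separating "retail" from both "finance" and "travel" takes two successive splits, and which levels are adjacent depends only on spelling. The one-hot rule removes that dependence.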

References

[1] Gordon, B. R., Moakler, R., & Zettelmeyer, F. (2026). Predicted Incrementality by Experimentation (PIE) for Ad Measurement. NBER Working Paper No. 35044.

Examples

import pandas as pd

from pymc_marketing.pie import PIEModel

# Corpus of past campaigns, each labelled with the incrementality
# measured by its RCT.
X = pd.DataFrame(
    {
        "objective": ["conversions", "traffic", "awareness", "traffic"],
        "vertical": ["retail", "travel", "finance", "retail"],
        "budget": [50_000, 12_000, 80_000, 30_000],
        "exposure_rate": [0.42, 0.71, 0.33, 0.55],
    }
)
y = pd.Series([0.81, 0.34, 1.12, 0.49])  # incrementality per dollar

model = PIEModel(
    pre_determined_features=["objective", "vertical", "budget"],
    post_determined_features=["exposure_rate"],
)
model.fit(X, y, random_seed=42)
preds = model.sample_posterior_predictive(X)

Methods

`PIEModel.__init__`(*[, ...])	Initialize model configuration and sampler configuration for the model.
`PIEModel.approximate_fit`(X[, y, ...])	Fit a model using Variational Inference and return InferenceData.
`PIEModel.attrs_to_init_kwargs`(attrs)	Reconstruct constructor kwargs from saved idata attrs.
`PIEModel.build_from_idata`(idata)	Rebuild the model from saved inference data.
`PIEModel.build_model`(X, y, **kwargs)	Build the PyMC model graph.
`PIEModel.create_fit_data`(X, y)	Create the fit_data group based on the input data.
`PIEModel.create_idata_attrs`()	Extend the base idata attrs with PIEModel-specific fields.
`PIEModel.fit`(X[, y, progressbar, random_seed])	Fit a model using the data passed as a parameter.
`PIEModel.graphviz`(**kwargs)	Get the graphviz representation of the model.
`PIEModel.idata_to_init_kwargs`(idata)	Extract the model configuration and sampler configuration from the InferenceData as keyword arguments.
`PIEModel.load`(fname[, check])	Create a ModelBuilder instance from a file.
`PIEModel.load_from_idata`(idata[, check])	Create a ModelBuilder instance from an InferenceData object.
`PIEModel.post_sample_model_transformation`()	Perform transformation on the model after sampling.
`PIEModel.predict`(X[, extend_idata])	Predict on unseen data and return a point prediction for each sample.
`PIEModel.predict_posterior`(X[, ...])	Posterior predictive draws for `X` as a single DataArray.
`PIEModel.predict_proba`(X[, extend_idata, ...])	Alias for `predict_posterior`, for consistency with scikit-learn probabilistic estimators.
`PIEModel.sample_posterior_predictive`(X[, ...])	Sample posterior predictive draws and return in the original target scale.
`PIEModel.sample_prior_predictive`(X[, y, ...])	Sample from the model's prior predictive distribution.
`PIEModel.save`(fname, **kwargs)	Save the model's inference data to a file.
`PIEModel.set_idata_attrs`([idata])	Set attributes on an InferenceData object.
`PIEModel.table`(**model_table_kwargs)	Get the summary table of the model.

Attributes

`default_model_config`	Default BART hyperparameters, noise prior, and categorical split mode.
`default_sampler_config`	Default sampler configuration (empty — PyMC auto-assigns PGBART + NUTS).
`fit_result`	Get the posterior fit_result.
`id`	Generate a unique hash value for the model.
`output_var`	Name of the target variable in the PyMC graph and saved idata.
`posterior`	Access the 'posterior' attribute of the InferenceData object.
`posterior_predictive`	Access the 'posterior_predictive' attribute of the InferenceData object.
`predictions`	Access the 'predictions' attribute of the InferenceData object.
`prior`	Access the 'prior' attribute of the InferenceData object.
`prior_predictive`	Access the 'prior_predictive' attribute of the InferenceData object.
`version`
`idata`
`sampler_config`
`model_config`