PIEModel

class pymc_marketing.pie.model.PIEModel(*, pre_determined_features, post_determined_features, target_column='y', model_config=None, sampler_config=None)

Predicted Incrementality by Experimentation model.

Trains a Bayesian BART regression on a corpus of past RCTs mapping campaign features to measured incrementality, then predicts incrementality for non-experimental campaigns.

Parameters:
pre_determined_features : list[str]

Feature columns known before the campaign runs (e.g. objective, vertical, budget, audience_type). In the current alpha implementation this list is concatenated with post_determined_features and fed identically into BART; the distinction is recorded for future versions that gate prediction on feature availability but has no effect on the model graph today.

post_determined_features : list[str]

Feature columns known only after the campaign runs (e.g. exposure_rate, ctr, last_click_conversions_per_dollar, avg_treated_outcome). As noted for pre_determined_features, both lists are treated identically in this release.

target_column : str

Name used for the target variable in the PyMC graph and in saved idata groups (posterior_predictive[target_column], fit_data[target_column]). Does not select a column from X — X and y are always passed separately. Defaults to "y".

model_config : dict, optional

Override default priors / BART settings. Top-level keys merge with default_model_config(); nested dicts (e.g. "bart") are replaced wholesale, so a partial "bart" override must restate every required key (m, alpha, beta). Keys:

  • "bart": dict with m (int), alpha (float), beta (float), and an optional response key set to "constant" (default, piecewise-constant leaves), "linear", or "mix" (the latter two fit linear models in the leaves, which can help on smooth response surfaces).

  • "sigma": pymc_extras.prior.Prior for the noise std.

  • "categorical_split": "onehot" (default) or "continuous". Controls how label-encoded categorical columns are split by BART — see Notes.
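
The merge behaviour described above can be illustrated with plain dictionaries. This is a minimal sketch, not the library's implementation: the default values are placeholders invented for illustration (the real ones come from default_model_config), and shallow_merge is a hypothetical helper that mimics a top-level-only merge.

```python
# Placeholder defaults for illustration only; the real values live in
# PIEModel.default_model_config and may differ.
placeholder_defaults = {
    "bart": {"m": 50, "alpha": 0.95, "beta": 2.0},
    "categorical_split": "onehot",
}


def shallow_merge(defaults: dict, overrides: dict) -> dict:
    """Hypothetical helper: merge top-level keys only."""
    return {**defaults, **overrides}


# A partial "bart" override replaces the whole nested dict, so alpha and
# beta are dropped rather than inherited from the defaults.
partial = shallow_merge(placeholder_defaults, {"bart": {"m": 100}})
print(partial["bart"])

# Restating every required key keeps the nested dict complete.
complete = shallow_merge(
    placeholder_defaults,
    {"bart": {"m": 100, "alpha": 0.95, "beta": 2.0}},
)
print(complete["bart"])
```

Untouched top-level keys such as "categorical_split" survive the merge in both cases; only the nested "bart" dict is affected.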

sampler_config : dict, optional

Passed to pymc.sample(). Defaults to {}.
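
As a sketch, a sampler_config might set a few standard pymc.sample() keyword arguments. The values below are arbitrary examples, not recommended settings.

```python
# Standard pymc.sample() keyword arguments; values are illustrative only.
sampler_config = {
    "draws": 1000,  # posterior draws per chain
    "tune": 1000,   # tuning (warm-up) iterations per chain
    "chains": 4,    # number of independent chains
    "cores": 4,     # number of chains to run in parallel
}
```

The dictionary would then be supplied through the sampler_config argument of the PIEModel constructor.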

Notes

This module is alpha — the API and defaults may change. Tracked deviations from the paper [1]:

  • The paper uses a random forest fit to 2,226 RCTs; this implementation uses Bayesian Additive Regression Trees (PyMC-BART) for native posterior uncertainty.

  • The paper’s decision-theoretic framework (Type I/II error rates, disagreement vs RCT-based go/no-go decisions; paper §6) is not implemented.

  • Within-campaign sample splitting (paper §4.2) — which breaks the mechanical correlation between post-determined features and the target — is not implemented.

  • Extrapolation / cold-start diagnostics across advertiser segments (paper §5.3) are not implemented.

  • The footnote-2 measurement-error layer y_observed ~ Normal(y_true, se_rct) for per-RCT standard errors is not implemented.
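
To make the last item concrete, the following standard-library sketch simulates what such a measurement-error layer describes generatively: each RCT's observed incrementality is its true value plus Gaussian noise whose standard deviation is that RCT's standard error. All numbers are made up for illustration, and none of this is part of PIEModel.

```python
import random

rng = random.Random(0)

# Hypothetical true incrementality and per-RCT standard errors.
y_true = [0.81, 0.34, 1.12, 0.49]
se_rct = [0.05, 0.20, 0.10, 0.02]

# y_observed ~ Normal(y_true, se_rct): RCTs with a larger standard error
# tend to scatter further from their true value.
y_observed = [rng.gauss(mu, se) for mu, se in zip(y_true, se_rct)]

for mu, se, obs in zip(y_true, se_rct, y_observed):
    print(f"true={mu:.2f}  se={se:.2f}  observed={obs:.3f}")
```

Without this layer, per-RCT standard errors do not enter the model.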

Categorical columns (object or category dtype) are label-encoded in build_model. With categorical_split="onehot" (default), BART uses pymc_bart.split_rules.OneHotSplitRule for those columns so that splits are “level X vs not-X” rather than “encoded value < c” — this avoids imposing the encoder’s alphabetical ordering on unordered categories. Set categorical_split="continuous" to fall back to ordered splits.
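
The difference between the two split modes can be sketched in pure Python. This is an illustration of the idea only, not how pymc_bart implements its split rules; the level names are hypothetical and the encoding simply follows alphabetical order.

```python
# Alphabetical label encoding of three unordered levels.
levels = sorted({"retail", "travel", "finance"})
codes = {level: i for i, level in enumerate(levels)}

# Ordered splits: the left branch holds levels with "encoded value < c".
ordered_splits = [
    frozenset(lvl for lvl, code in codes.items() if code < c)
    for c in range(1, len(levels))
]

# One-hot style splits: "level X vs not-X" for every level.
onehot_splits = [frozenset({lvl}) for lvl in levels]

print(codes)
print([sorted(s) for s in ordered_splits])
print([sorted(s) for s in onehot_splits])
```

With ordered splits, the middle level ("retail" here) can never be isolated by a single split because it always travels with one of its alphabetical neighbours. A one-hot style split can separate any single level from the rest.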

References

[1] Gordon, B. R., Moakler, R., & Zettelmeyer, F. (2026). Predicted Incrementality by Experimentation (PIE) for Ad Measurement. NBER Working Paper No. 35044.

Examples

import pandas as pd

from pymc_marketing.pie import PIEModel

# Corpus of past campaigns, each labelled with the incrementality
# measured by its RCT.
X = pd.DataFrame(
    {
        "objective": ["conversions", "traffic", "awareness", "traffic"],
        "vertical": ["retail", "travel", "finance", "retail"],
        "budget": [50_000, 12_000, 80_000, 30_000],
        "exposure_rate": [0.42, 0.71, 0.33, 0.55],
    }
)
y = pd.Series([0.81, 0.34, 1.12, 0.49])  # incrementality per dollar

model = PIEModel(
    pre_determined_features=["objective", "vertical", "budget"],
    post_determined_features=["exposure_rate"],
)
model.fit(X, y, random_seed=42)
preds = model.sample_posterior_predictive(X)

Methods

PIEModel.__init__(*[, ...])

Initialize model configuration and sampler configuration for the model.

PIEModel.approximate_fit(X[, y, ...])

Fit a model using Variational Inference and return InferenceData.

PIEModel.attrs_to_init_kwargs(attrs)

Reconstruct constructor kwargs from saved idata attrs.

PIEModel.build_from_idata(idata)

Rebuild the model from saved inference data.

PIEModel.build_model(X, y, **kwargs)

Build the PyMC model graph.

PIEModel.create_fit_data(X, y)

Create the fit_data group based on the input data.

PIEModel.create_idata_attrs()

Extend the base idata attrs with PIEModel-specific fields.

PIEModel.fit(X[, y, progressbar, random_seed])

Fit a model using the data passed as a parameter.

PIEModel.graphviz(**kwargs)

Get the graphviz representation of the model.

PIEModel.idata_to_init_kwargs(idata)

Convert the model configuration and sampler configuration stored in the InferenceData into keyword arguments.

PIEModel.load(fname[, check])

Create a ModelBuilder instance from a file.

PIEModel.load_from_idata(idata[, check])

Create a ModelBuilder instance from an InferenceData object.

PIEModel.post_sample_model_transformation()

Perform transformation on the model after sampling.

PIEModel.predict(X[, extend_idata])

Predict on unseen data and return point predictions for all samples.

PIEModel.predict_posterior(X[, ...])

Posterior predictive draws for X as a single DataArray.

PIEModel.predict_proba(X[, extend_idata, ...])

Alias for predict_posterior, for consistency with scikit-learn probabilistic estimators.

PIEModel.sample_posterior_predictive(X[, ...])

Sample posterior predictive draws and return in the original target scale.

PIEModel.sample_prior_predictive(X[, y, ...])

Sample from the model's prior predictive distribution.

PIEModel.save(fname, **kwargs)

Save the model's inference data to a file.

PIEModel.set_idata_attrs([idata])

Set attributes on an InferenceData object.

PIEModel.table(**model_table_kwargs)

Get the summary table of the model.

Attributes

default_model_config

Default BART hyperparameters, noise prior, and categorical split mode.

default_sampler_config

Default sampler configuration (empty — PyMC auto-assigns PGBART + NUTS).

fit_result

Get the posterior fit_result.

id

Generate a unique hash value for the model.

output_var

Name of the target variable in the PyMC graph and saved idata.

posterior

Access the 'posterior' attribute of the InferenceData object.

posterior_predictive

Access the 'posterior_predictive' attribute of the InferenceData object.

predictions

Access the 'predictions' attribute of the InferenceData object.

prior

Access the 'prior' attribute of the InferenceData object.

prior_predictive

Access the 'prior_predictive' attribute of the InferenceData object.

version

idata

sampler_config

model_config