Wisto x Dyme: Mandate Optimization and Prediction
Business Context
Dyme acts as an intermediary for recurring consumer contracts (car insurance, energy, health insurance, legal insurance, travel insurance, liability insurance). Users grant Dyme a “mandate” to monitor their existing contracts and auto-switch them to better deals.
The deal lifecycle:
- Mandate created — user opts in for a contract category
- Weekly comparisons — Dyme’s system compares the user’s current contract against alternatives
- Deal found — user is notified
- Pre-rejection window — user can reject before seeing details (`pre_rejected`)
- Result shared — full comparison sent to user (`result_sent`)
- User decides — accept (auto-switch proceeds) or decline (`deal_declined`)
- 14-day cooling off — user can still reject after switch (`rejected`)
- Success — 14 days pass without rejection (`succeeded`)
Recent change: deal sending was previously manual (advisor discretion), now automated via business rules (including a minimum savings threshold). This creates two data eras with different characteristics.
Business rules
Rules that govern how deals are selected and sent. Understanding these is important because the model learns from deals that passed these filters — any rule change shifts the data distribution.
- Deal selection does not always pick the cheapest option. Other criteria are sometimes used in addition to price (details TBD).
Prediction Target
Modeling space: `result_sent` onwards. We predict the user’s response to a deal they’ve been shown. Everything before `result_sent` is out of scope.
- Population: All `auto_switches` that reached `result_sent` with a terminal outcome, excluding `comparison_type = 'recommendation'` and `is_new_insurance = 1`.
- Label: `succeeded` = positive; `deal_declined` + `rejected` = negative.
- Baseline: 57.2% conversion (33K succeeded / 58K total after scope filters).
```
[mandate] -> [comparisons] -> [deal found] -> [pre-rejection window]
                    |
              OUT OF SCOPE
                    |
               result_sent   <-- OUR START
                    |
       +------------+------------+
       |                         |
 user accepts              user declines
       |                  (deal_declined)
  auto-switch                    |
       |                      NEGATIVE
14-day cooling off
       |
       +-------+-------+
       |               |
   no reject      user rejects
       |           (rejected)
    POSITIVE           |
   (succeeded)      NEGATIVE
```
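As a concrete sketch, the scope filters and label definition above could be applied in pandas. Column names such as `status`, `comparison_type`, and `is_new_insurance` follow the text, but the exact schema is an assumption:

```python
import pandas as pd

def build_training_frame(auto_switches: pd.DataFrame) -> pd.DataFrame:
    """Apply the scope filters and label definition described above (sketch)."""
    terminal = {"succeeded", "deal_declined", "rejected"}
    df = auto_switches[
        auto_switches["status"].isin(terminal)                    # reached a terminal outcome
        & (auto_switches["comparison_type"] != "recommendation")  # manual advisor flow excluded
        & (auto_switches["is_new_insurance"] != 1)                # new-insurance flow excluded
    ].copy()
    # succeeded = positive; deal_declined and rejected = negative
    df["label"] = (df["status"] == "succeeded").astype(int)
    return df
```

The same filters would normally live in the data-extraction SQL; the pandas version is just the most compact way to state them.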
Goals
- Understand what drives deal acceptance — which factors matter, and which ones Dyme can influence
- Score each deal by predicted acceptance probability
- Optimize the variables Dyme controls: when to send, whether to send, how to sequence deals
- Enable the model to adapt as user behaviour and market conditions change (future)
What the model delivers
The model predicts the probability that a user will accept a given deal. This prediction draws on two kinds of signal:
Things Dyme controls — when to send a deal (day, time, spacing), whether to send it (savings threshold, filtering rules), and how many deals to send before it becomes noise. The model shows how these choices affect acceptance, enabling direct optimization.
Things Dyme can’t change but can act on — the deal’s savings amount, the contract category, and the user’s history (how many deals they’ve seen before, how many they accepted or rejected). The model uses these to score each deal, so Dyme can make better decisions about which deals are worth sending.
Understanding what drives acceptance
The model reveals which factors most influence whether a user accepts a deal. Splitting these by controllable vs uncontrollable shows where operational changes can improve conversion and where the deal or user profile is the dominant factor.
Deciding whether to send a deal
For each deal that’s ready to share, the model predicts the likelihood the user will accept. If the probability is too low, the deal can be suppressed — it’s not worth the user’s attention and may erode trust over time.
The key question is where to draw the line. A strict threshold means fewer deals sent but a higher acceptance rate. A lenient threshold means more deals sent but more declines. The right balance depends on how Dyme weighs the cost of a declined deal against the cost of a missed opportunity.
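The tradeoff above can be made visible with a simple threshold sweep over the model’s predicted probabilities. This is an illustrative sketch on arbitrary scores, not Dyme’s production logic:

```python
import numpy as np

def threshold_tradeoff(p_accept: np.ndarray, labels: np.ndarray, thresholds):
    """For each suppression threshold, report how many deals would be sent
    and the acceptance rate among those sent."""
    rows = []
    for t in thresholds:
        sent = p_accept >= t          # deals that clear the threshold
        n_sent = int(sent.sum())
        acc_rate = float(labels[sent].mean()) if n_sent else float("nan")
        rows.append({"threshold": t, "deals_sent": n_sent, "acceptance_rate": acc_rate})
    return rows
```

Sweeping thresholds on a held-out set produces the sent-volume vs acceptance-rate curve from which Dyme can pick its operating point.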
Removing hard-coded thresholds
Business logic currently filters out some deals before they reach users, which may leave money on the table. This model can replace that system: it can be recalibrated live and tuned more finely to Dyme’s appetite for errors in either direction (sending deals that get declined vs. suppressing deals that would have converted).
Routing high-value deals to advisors
Some deals have high savings but a lower predicted acceptance probability. These may be worth routing to a human advisor who can walk the user through the deal. For example: a deal saving €200/year with a 35% predicted acceptance rate could be a good candidate for personal follow-up.
Optimizing send timing
The model can capture whether deals sent on certain days or at certain times convert better. If so, Dyme can schedule deal delivery for the best windows.
A caveat: historical timing patterns may reflect Dyme’s existing behaviour (e.g., better deals already tend to be sent on certain days) rather than a true timing effect. Validating timing changes requires A/B testing.
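A first descriptive look at timing is a simple conversion-by-weekday breakdown. The column name `result_sent_at` is an assumption, and per the caveat above this is correlational only:

```python
import pandas as pd

def conversion_by_send_day(df: pd.DataFrame) -> pd.Series:
    """Mean conversion per weekday of the send timestamp (descriptive only:
    differences may reflect which deals were sent, not a causal timing effect)."""
    day = pd.to_datetime(df["result_sent_at"]).dt.day_name()
    return df.groupby(day)["label"].mean()
```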
Managing deal frequency per user
If a user has declined several deals in a row, is the next one worth sending? The model uses each user’s history (prior deals seen, accepted, rejected) to estimate whether additional deals are still productive or becoming noise.
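The per-user history features mentioned here must use only deals that came *before* the one being scored, to avoid leakage. A minimal sketch, assuming a `sent_at` timestamp and a 0/1 `label` column:

```python
import pandas as pd

def add_user_history(df: pd.DataFrame) -> pd.DataFrame:
    """Per-user history features computed from strictly earlier deals."""
    df = df.sort_values("sent_at").copy()
    g = df.groupby("user_id")["label"]
    df["prior_deals_seen"] = g.cumcount()           # deals seen before this one
    df["prior_accepts"] = g.cumsum() - df["label"]  # exclude the current deal's own label
    return df
```

For deals with a NULL `user_id` these features stay NULL, consistent with the data-scope decision below.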
Adjusting margins to maximize profit
Dyme earns commission on successful switches. If Dyme can adjust how much savings is passed to the user (by taking a smaller or larger margin), there’s a tradeoff: passing more savings increases acceptance probability but reduces revenue per deal.
The model can estimate how acceptance probability changes at different savings levels. This allows Dyme to find the margin that maximizes expected profit per deal: the point where acceptance probability * commission is highest. A deal where the user is very likely to accept anyway doesn’t need a large discount; a borderline deal might convert with a small nudge.
This requires careful validation — we can model the relationship between savings and acceptance from historical data, but each deal was only offered at one price. Testing a pricing strategy in practice requires controlled experimentation.
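The profit-maximization idea reduces to a one-dimensional search: for each candidate margin, expected profit is acceptance probability at the resulting user savings times the margin kept. The `p_accept_at` callback is a hypothetical interface around the model, and as noted above the acceptance curve itself needs experimental validation:

```python
def best_margin(gross_savings, candidate_margins, p_accept_at):
    """Pick the margin (kept by Dyme) maximizing expected commission:
    p_accept(savings passed to the user) * margin kept."""
    def expected_profit(m):
        return p_accept_at(gross_savings - m) * m
    return max(candidate_margins, key=expected_profit)
```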
Continuous learning
User behaviour and market conditions change over time. Once the model is validated, a retraining pipeline ensures it adapts to shifting patterns and surfaces changes in what drives acceptance.
Key Decisions
Decisions made during Phase 0 exploration. See the Phase 0 Report for supporting evidence.
Data scope
- Recommendations excluded: `comparison_type = 'recommendation'` (25K rows, 15.6% conversion) — manual advisor process, not applicable to the automated pipeline we’re modeling.
- New insurance excluded: `is_new_insurance = 1` (~2.9K rows) — different user flow, no `old_price`/`yearly_savings`. Future extension.
- NULL user_id kept: 8K labeled deals from real users who weren’t logged in. Per-user features will be NULL; handled natively by tree-based models.
Modeling strategy
- General model first: Cross-category model on `auto_switches` base features. Car-specific pilot deferred until the general model proves out.
- Derived savings: `COALESCE(yearly_savings, (old_price - new_price) * 12)` gives 99.9% savings coverage across the four major categories.
- `origin` not used as a feature: Derived column with unclear semantics. We use per-user history (prior success rate, deal count) instead.
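The derived-savings rule translates directly to pandas for feature building (a sketch; column names as in the SQL expression):

```python
import pandas as pd

def derived_yearly_savings(df: pd.DataFrame) -> pd.Series:
    """COALESCE(yearly_savings, (old_price - new_price) * 12):
    fall back to the annualized monthly-price difference when yearly_savings is NULL."""
    fallback = (df["old_price"] - df["new_price"]) * 12
    return df["yearly_savings"].fillna(fallback)
```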
Answers from Dyme
| Topic | Answer | Impact |
|---|---|---|
| NULL user_id | Real users not logged in | Keep in dataset |
| `processed_at` | `COALESCE(result_shared_at, completed_at)` | Excluded — post-prediction leakage |
| `origin` | Last relevant sub_status before current status | Not independent signal |
| `recurring` | Defaults TRUE, user can disable | Include as feature |
| `is_failed_reopen` | Manually reopened after failure | Include as feature |
| `is_recurrence` | Auto-switch created as recurrence of previous | Include as feature |
| Recurrence lifecycle | Succeeded + recurring=1 → new auto-switch ~1 year later | Each record is distinct per user/category/year |
| Pricing (energy) | `yearly_savings = (old_monthly - new_monthly) * 12` | Validates derived savings approach |
| Pricing (insurance) | `(old_price - new_price) * 12` | Simple formula |
| `auto_switches` is live | Rows change over time | Need snapshot strategy |
Reports
- Phase 0: Data Exploration — data landscape, population scoping, label definition, feature inventory, feasibility assessment
- Phase 1: Feature Importance & Optimization — model performance (AUC 0.927), SHAP feature importance, timing analysis, threshold analysis
Reference
- Data Model — table schemas, relationships, column documentation
- Data Quality Issues — known bugs, type mismatches, coverage gaps
Open Questions for Dyme
Retries: internal or user-facing?
`origin` values `failed` (9.9K) and `deal_declined` (2.4K) indicate that a new `auto_switch` was created after a previous one concluded negatively. These retries have notably low conversion (~28%).
- Is this a system retry (automatic, user may not be aware)?
- Or user-facing (user sees another deal and decides again)?
Sales platform
Appears correlated with success rate (30% to 86% spread). Need to understand:
- What does each value mean? (`overstappen`, `mobiel_nl`, `united_consumers`, `risk_intermediary`, `zorgkiezer`, `awin`, `arx`, `daisycon`)
- Is this the comparison engine, the sales channel, or the user acquisition source?
- Why is it NULL for ~60% of rows?
`recurring` NULL semantics
`recurring` is the strongest single predictor in the model (SHAP #1). In the data it has only two states: 1.0 (70% of rows, 75% conversion) and NULL (30% of rows, 19% conversion). There is no explicit 0.
- Does NULL mean the user actively disabled recurring (real behavioural signal)?
- Or does NULL mean the field wasn’t populated for certain deal types, eras, or categories (missing data)?
This matters because the model currently treats NULL as a distinct value (LightGBM handles NaN natively). If NULL = “user opted out”, we should coalesce to 0/1 and treat it as a binary feature. If NULL = “not applicable”, the current handling is correct but the feature’s predictive power may partly reflect data coverage patterns rather than user intent.
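The two candidate handlings can sit behind one switch until Dyme answers the question. A minimal sketch, assuming a float `recurring` column with values 1.0 or NaN:

```python
import pandas as pd

def recurring_feature(df: pd.DataFrame, null_means_opt_out: bool) -> pd.Series:
    """Two candidate encodings of `recurring`, pending Dyme's answer:
    - NULL = 'user opted out'  -> coalesce to a 0/1 binary feature
    - NULL = 'not applicable'  -> keep NaN for LightGBM's native missing handling"""
    if null_means_opt_out:
        return df["recurring"].fillna(0.0)
    return df["recurring"]
```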
Other
- Why are there negative savings? (price increase deals)
- What does “insurance” category mean?
- `subscription_id` / `subscription_product_id` — what do these reference?
- `original_monthly_completed_count` / `original_weekly_completed_count` — system-wide counts?
- `origin = 'in_contract'` (20K rows) — what triggers this?
- Status `in_contract` (6,668) and `closed` (2,795) — terminal or in-progress?
- Pre-automation vs post-automation cutoff date?
- Current minimum savings threshold? Has it changed?