
BPI 2017 Offer Log

TLDR: This case study uses the public BPI Challenge 2017 Offer log to build a staged Simply Solver model that closely mirrors the core shape of a Simio or FlexSim flow model: one source, several processing stages, weighted routing, and terminal outcomes. It fits mean cycle time and throughput closely, but it remains an intentionally simplified abstraction rather than a full operational digital twin.

Start Here

How to read this case study

Use this page like a professional review memo, not a blog post. Start with the decision question, verify the baseline fit, then open the model and test the recommended lever yourself.

  1. Read the problem framing first. The point is to test whether a lean staged model can preserve routing and cycle-time behavior well enough to support scenario thinking.
  2. Check the validation before trusting the result. The mean and throughput fit closely; the median and tail do not. That tells you what this abstraction is good for.
  3. Use the template as a sandbox. Start by changing O_Returned service time before touching capacity, because the baseline intentionally avoids artificial queue bottlenecks.

Open The Template

Jump straight into the runnable template

One click creates a working copy inside Simply Solver so you can inspect the board, run the baseline, and try the first what-if without stopping in the template library first.

  • Best for: a clean staged-flow benchmark with auditable routing.
  • First experiment: shorten O_Returned service time.
  • Watch: mean cycle time and throughput per day.

Question it answers

Can a lean, staged DES preserve the high-signal structure of a real offer workflow closely enough to support policy testing?

What it preserves

Stage order, branching weights, cycle-time calibration, and the main intervention ranking from the public event log.

What it does not claim

A staffing-level digital twin, explicit queues from the real institution, or a trace-faithful replay of the full cycle-time distribution.

  • Cases: 42,995. Public offer-level cases parsed from the 4TU event log.
  • Arrivals / day: 125.35. Steady-state source rate used to anchor throughput.
  • Mean cycle: 457.45 h. Empirical mean cycle time from the public log.
  • Accepted after return: 74.1%. The strongest terminal branch once an offer is returned.

Where volume concentrates

The log is clean and front-loaded. Most flow moves through the main mail-and-online path before branching into returned, cancelled, or refused outcomes.

  • O_Create Offer: 42,995
  • O_Created: 42,995
  • O_Sent (mail and online): 39,707
  • O_Returned: 23,305
  • O_Cancelled: 20,898
  • O_Accepted: 17,228
  • O_Refused: 4,695

What the branching really says

The big modeling move is not queue logic. It is preserving the two critical split points: creation-to-send and returned-to-outcome.

  • O_Created -> mail and online: 92.35%
  • O_Created -> online only: 4.71%
  • O_Returned -> accepted: 74.08%
  • O_Returned -> refused: 15.36%
  • O_Returned -> cancelled: 10.56%
These percentages come from the calibrated routing weights projected from the public event log.
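
To make the weighted-routing idea concrete, here is a minimal sketch in plain Python of how a split like O_Returned's can be sampled at run time. The weights are the calibrated percentages listed above; the function and constant names are illustrative, not Simply Solver internals.

```python
import random

# Calibrated branch weights for the O_Returned split (from the list above).
RETURNED_BRANCHES = {
    "O_Accepted": 0.7408,
    "O_Refused": 0.1536,
    "O_Cancelled": 0.1056,
}

def route(branches):
    """Sample one outgoing edge in proportion to its calibrated weight."""
    outcomes = list(branches)
    return random.choices(outcomes, weights=[branches[o] for o in outcomes], k=1)[0]

# Example: terminal outcomes for ten returned offers.
print([route(RETURNED_BRANCHES) for _ in range(10)])
```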

Why This Problem

The offer log is a strong first public case study because it sits in the sweet spot between realism and tractability. It comes from a real loan-application workflow, it has persistent case IDs, and it preserves meaningful routing structure without the heavy rework complexity of the full application log. The offer-only subset contains 42,995 cases, five core activity stages, three terminal outcomes, and no repeated-activity traces, which makes the translation from event log to discrete-event simulation straightforward to explain.

Signal | Value
Public dataset | BPI Challenge 2017 - Offer log
DOI | 10.4121/12705737.v2
Cases parsed | 42,995
Observed window | 2016-01-01 through 2017-02-02
Mean cycle time | 457.454 hours
Median cycle time | 381.360 hours
P90 cycle time | 742.564 hours
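
As a cross-check on the table above, the same signals can be recomputed in a few lines of pandas. This is a sketch, not the exact parsing pipeline behind the case study: it assumes the 4TU Offer log has been exported to CSV with the standard XES column names (case:concept:name, concept:name, time:timestamp) and uses a hypothetical filename.

```python
import pandas as pd

# Assumed CSV export of the Offer log with standard XES column names.
log = pd.read_csv("bpi2017_offer.csv", parse_dates=["time:timestamp"])

# Cycle time per case: last event minus first event, converted to hours.
by_case = log.groupby("case:concept:name")["time:timestamp"]
cycle_h = (by_case.max() - by_case.min()).dt.total_seconds() / 3600

print("Cases parsed:    ", cycle_h.size)
print("Mean cycle time: ", round(cycle_h.mean(), 3), "h")
print("Median cycle:    ", round(cycle_h.median(), 3), "h")
print("P90 cycle time:  ", round(cycle_h.quantile(0.90), 3), "h")
print("Observed window: ", log["time:timestamp"].min(), "to", log["time:timestamp"].max())
```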

How The Model Was Built

The modeling strategy was deliberately conservative. Each observed event-log stage became a single Simply Solver processor. The observed branch frequencies became weighted routing on outgoing edges. The arrival stream was represented as a steady-state exponential source calibrated to empirical arrivals per day. Processor capacities were kept intentionally high so the model would not invent queues that the log itself did not support. Processor timing was then calibrated to the empirical mean cycle time.

Interpretation choice

This is a staged DES model of the offer process, not a claim that the real institution had one queue, one team, or one deterministic task duration per stage. The abstraction is chosen to preserve behavioral flow, not staffing detail.

Observed transition | Modeled as
Applications arrive | One source node calibrated to 125.350 arrivals/day
O_Create Offer → O_Created | Two sequential processors
O_Created branches | Weighted routing to mail/online, online-only, cancelled, refused
O_Sent branches | Weighted routing to returned, cancelled, refused
O_Returned branches | Weighted routing to accepted, cancelled, refused
Offer outcomes | Three sink nodes
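
The mapping above translates almost line for line into a small discrete-event script. The sketch below uses SimPy rather than Simply Solver, with placeholder stage delays; the routing probabilities are the splits quoted on this page, and the sent-to-returned fraction (23,305 of 39,707 sent offers, about 58.7%) is derived from the volume counts earlier. It illustrates the modeling strategy, not the template itself.

```python
import random
import simpy

def run_model(stage_h, days=365, seed=1):
    """One replication of the staged offer flow; returns completed-case cycle
    times in hours. Capacities are effectively unlimited, so every stage is a
    pure delay and the model cannot invent queues the log does not support."""
    rng = random.Random(seed)
    env = simpy.Environment()
    cycle_times = []

    def offer():
        t0 = env.now
        yield env.timeout(stage_h["create"])            # O_Create Offer
        yield env.timeout(stage_h["created"])           # O_Created
        if rng.random() < 0.9235:                       # O_Created -> O_Sent (mail and online)
            yield env.timeout(stage_h["sent"])          # O_Sent
            if rng.random() < 0.587:                    # O_Sent -> O_Returned (23,305 / 39,707)
                yield env.timeout(stage_h["returned"])  # O_Returned, then terminal split
        cycle_times.append(env.now - t0)                # case reaches a sink

    def source():
        while True:                                     # exponential source, 125.35 offers/day
            yield env.timeout(rng.expovariate(125.35 / 24.0))
            env.process(offer())

    env.process(source())
    env.run(until=24 * days)                            # clock runs in hours
    return cycle_times

# Placeholder stage delays (hours); the real template calibrates these so the
# simulated mean cycle time lands on the empirical 457.454 h.
baseline = run_model({"create": 2.0, "created": 2.0, "sent": 250.0, "returned": 180.0})
print(f"cases completed: {len(baseline)}, mean cycle: {sum(baseline)/len(baseline):.1f} h")
```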

Baseline Validation

The baseline model fits the empirical mean cycle time almost exactly and stays within about one percent of empirical arrival-driven throughput. That is strong enough for scenario exploration on a staged flow model. The tradeoff is in the cycle-time distribution: because the calibrated version avoids artificial queueing and uses simple processor timing, it does not reproduce the observed P50 and P90 very well.

Empirical vs simulated

The mean and throughput line up closely. The distribution shape does not, which is why this is a useful operations model but not a full trace-faithful replay.

  • Mean cycle time: +0.01% (empirical 457.454 h, simulated 457.500 h)
  • Throughput / day: -0.89% (empirical 125.350, simulated 124.238)
  • Median cycle time: +36.59% (empirical 381.360 h, simulated 520.916 h)
  • P90 cycle time: -29.85% (empirical 742.564 h, simulated 520.916 h)
Metric | Empirical | Simulated | Difference
Mean cycle time | 457.454 h | 457.500 +/- 0.374 h | +0.01%
Arrivals / throughput per day | 125.350 | 124.238 +/- 0.364 | -0.89%
Median cycle time | 381.360 h | 520.916 h | +36.59%
P90 cycle time | 742.564 h | 520.916 h | -29.85%
Mean wait | Not directly observed | 0.000 min | Intentional simplification
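
The differences in the table are plain relative errors against the empirical statistics. Here is a small sketch of that comparison, suitable for the cycle-time list returned by the run_model sketch earlier on this page:

```python
import numpy as np

# Empirical cycle-time targets from the public log, in hours.
EMPIRICAL_H = {"mean": 457.454, "median": 381.360, "p90": 742.564}

def validate(sim_cycle_times_h):
    """Print simulated vs empirical cycle-time statistics and relative error."""
    sim = np.asarray(sim_cycle_times_h, dtype=float)
    stats = {
        "mean": sim.mean(),
        "median": float(np.median(sim)),
        "p90": float(np.percentile(sim, 90)),
    }
    for name, emp in EMPIRICAL_H.items():
        diff_pct = 100.0 * (stats[name] - emp) / emp
        print(f"{name:>6}: empirical {emp:8.3f} h, simulated {stats[name]:8.3f} h, {diff_pct:+7.2f}%")

# Example: validate(run_model(...)) using the sketch above.
```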

Intervention Results

Once the baseline was stable, the next step was not to add more realism blindly. It was to ask whether the staged model could still discriminate between useful policy changes. It could. Reducing service time in O_Returned improved cycle time the most. Increasing capacity there did effectively nothing in this abstraction, which is exactly what we should expect when the model is intentionally configured to avoid synthetic queue bottlenecks.

Which lever moved the needle

Measured as hours reduced from baseline mean cycle time. In this model, faster returned-offer handling is the only intervention that clearly changes the result.

  • ST-1 Reduce Returned Service Time: 14.00 h faster
  • RT-1 Shift Creation Toward Online: 0.03 h faster
  • CAP-1 Increase Returned Capacity: 0.00 h faster
Intervention | Mean cycle time | Throughput/day | Reading
Baseline | 457.269 +/- 0.242 h | 124.488 +/- 0.512 | Reference point
Increase Returned Capacity | 457.269 +/- 0.242 h | 124.488 +/- 0.512 | No material change in this abstraction
Reduce Returned Service Time | 443.268 +/- 0.288 h | 124.613 +/- 0.928 | Best cycle-time improvement
Shift Creation Toward Online | 457.242 +/- 0.378 h | 124.516 +/- 0.656 | Mix change, slight wait introduced
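
In this abstraction, an experiment really is a parameter edit plus a rerun. Here is a sketch of that loop, reusing the hypothetical run_model from the model-building sketch above; the 20% service-time reduction is illustrative rather than the template's exact setting. Because baseline capacity is already unlimited, the capacity scenario reruns identical parameters and lands exactly on the baseline, mirroring the table.

```python
from statistics import mean

# Assumes run_model from the model-building sketch above is in scope.
base = {"create": 2.0, "created": 2.0, "sent": 250.0, "returned": 180.0}

scenarios = {
    "Baseline": base,
    "CAP-1 Increase Returned Capacity": base,  # capacity is a no-op: stages never queue
    "ST-1 Reduce Returned Service Time": {**base, "returned": base["returned"] * 0.8},
}

results = {name: mean(run_model(stage_h)) for name, stage_h in scenarios.items()}
for name, cycle in results.items():
    delta = results["Baseline"] - cycle
    print(f"{name:<36} {cycle:7.1f} h  ({delta:+6.2f} h vs baseline)")
```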

How Close Is This To Simio Or FlexSim?

Structurally, it is close. Official Simio training material teaches beginners to build models with a Source, Server, and Sink, then use connector selection weights for probabilistic routing. FlexSim's fixed-resource examples similarly revolve around Source, Queue, Processor, and Sink objects. That is the same family of modeling moves we use here: a source feeds staged processors, weighted routes split the flow, and sinks capture terminal outcomes.

Dimension | Simply Solver case study | Simio / FlexSim comparison
Core object pattern | Source → processors → sinks | Very close to Simio Source/Server/Sink and FlexSim Source/Queue/Processor/Sink flow models
Routing | Weighted edge probabilities | Very close to Simio selection weights and standard branching logic in FlexSim
Experiment style | Parameter edits and reruns | Conceptually similar, though Simio and FlexSim offer deeper experiment-management tooling
Queue detail | Minimal by design in this model | Less detailed than FlexSim fixed-resource models with explicit queues or richer Simio server logic
Resource calendars and staffing logic | Not modeled here | Material gap relative to fuller Simio and FlexSim projects
Animation / facility realism | 2D process map | Much lighter than 3D facility-style representations
Bottom line

This case study mirrors the structural language of Simio and FlexSim well enough for staged process analysis, scenario comparison, and teaching. It does not yet mirror the full operational depth of a detailed staffing, scheduling, or 3D facility model.

What This Model Does Not Claim

This is not a staffing-level digital twin. It does not model the institution's actual queues, resource calendars, or staffing, and it does not replay the full cycle-time distribution; the validation section above shows exactly where the fit holds and where it breaks.

Use The Model

Open the template in Simply Solver to inspect the public-data translation directly and run your own what-if changes.
