A note before we begin: this post deals with the loss of a home to a natural disaster, and may be upsetting. If you are affected there are some helpline numbers at the bottom of the page.
The tree ranks, the LLM explains, the don’t-call list filters, Dina’s script is on Sam’s team’s wall, and the holdout design has a signature from Marcus on it in a green pen that Marcus never uses for anything else. Now the system needs to get out into the world without anyone getting hurt.
Priya writes the rollback plan on a Wednesday in late February, before anyone has written the deployment script.
This is not because she is virtuous. It is because she has been thinking, since Ruth, about the specific thing that went wrong in December. The model produced 3,100 drafts. Maya read 98. Somewhere between the model is running and the model is hurting someone there was a gap of two weeks and an inability to stop, and Priya cannot stop thinking about that gap.
She writes the rollback in a markdown file. The file has four sections, all short: what breaks, how we know, what we do in the first five minutes, what we do in the next hour. The first-five-minutes entry is a single command. It sets the feature flag to off, drains the outreach queue, and sends one email to Sam’s team. She tests it twice on the pre-production system. Both times the command takes under ninety seconds.
She pins the file in the modelling-Thursday gist and writes at the top: This is the rollback plan. It is more important than the deployment plan.
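For readers who want the shape of that single command, here is a minimal sketch in Python. Nothing in it is Greenbox’s actual tooling: the `OutreachSystem` class, the `rollback` function and the email address are hypothetical stand-ins, and a real version would talk to a flag service, a queue and a mail server rather than an in-memory object. Only the three steps and their order come from the plan as described.

```python
from dataclasses import dataclass, field


@dataclass
class OutreachSystem:
    """Hypothetical in-memory stand-in for the flag store, queue and mailer."""
    flag_on: bool = True
    queue: list = field(default_factory=list)
    sent: list = field(default_factory=list)


def rollback(system, team_address="outreach-team@example.com"):
    """Flag off, drain the outreach queue, send one notice. Returns calls drained."""
    # 1. Turn the feature flag off first, so nothing new is enqueued mid-rollback.
    system.flag_on = False
    # 2. Drain the queue so nobody is phoned from a list that is now suspect.
    drained = len(system.queue)
    system.queue.clear()
    # 3. One email to the whole team (assumed address), not one per person.
    system.sent.append(
        (team_address, "Do not phone anyone from the list until further notice.")
    )
    return drained


system = OutreachSystem(queue=["sub-001", "sub-002", "sub-003"])
drained = rollback(system)
print(drained, system.flag_on, len(system.queue), len(system.sent))
```

The design choice worth noticing is that the function takes no arguments that require thought. Whoever runs it at 9:17 on a bad morning should not have to decide anything.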
Dina’s kitchen table
Shadow mode starts on Monday the first of March. The model runs on the live event stream. Every Tuesday morning it produces a ranked at-risk list for each city. The list goes to Sam, Dina, and Priya. Nobody on Sam’s team phones anybody off it. Nobody on Sam’s team is told they’re not phoning anyone off it. The list is a draft, for the team to argue with rather than act on.
Week one’s Perth list has 47 subscribers in the top decile. Dina reads it at 7am on Tuesday at her kitchen table with a cup of tea and a biscuit. Her kitchen faces west and still has yesterday afternoon’s heat in the walls, the kind of early-March week where the nights don’t cool down properly until after midnight and the Doctor is still hours away.
By 7:45 she has marked seventeen of them in a column she’s added to the spreadsheet. The column is called I’d phone her. Of the 47, Dina would phone 17. Of the 30 she wouldn’t phone, fifteen have reasons Dina can write down immediately. The other fifteen are subscribers she genuinely can’t tell from the data, and for those she adds a second column: I’d want more context.
She sends the spreadsheet back to Priya with one message: your model is about 36% useful. that’s a good start. let me show you the ones that aren’t.
The Thursday after
Modelling-Thursday, 2pm. Priya, Kai, Dina, Sam, Anika on the laptop, and Charlotte, who is now a fixture; nobody comments on it any more.
Dina walks Kai through the 30 she wouldn’t phone. Kai listens, writes, occasionally asks a clarifying question. Most of the discussion is not about the model. It is about Dina’s judgement, which has nineteen years of subscription-services experience baked into it, and which the model has not met before.
About twenty minutes in, Kai realises the model has no concept of we already phoned her last week. The substitution-cancellation signal it is reading as high-risk is, for eight of the 47 subscribers, a direct consequence of a call Dina’s team made the week before. The call said let us know if there’s anything we can change about the box, and the subscribers, reasonably, cancelled the substitutions they didn’t want. The model is ranking responsiveness to our outreach as drift toward pause. Which is exactly backwards. Kai will add a recent-contact feature that night, train overnight, and by next Tuesday the cohort drops out of the top decile.
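A recent-contact feature of the kind Kai adds could be as small as the sketch below. It is an illustration under stated assumptions, not the feature he actually wrote: the fourteen-day window, the row layout and the names `contacted_recently` and `add_recent_contact` are all invented here, and the dates are arbitrary.

```python
from datetime import date, timedelta


def contacted_recently(last_contact, as_of, window_days=14):
    """True if the team phoned the subscriber within window_days before as_of."""
    if last_contact is None:
        return False
    return timedelta(0) <= as_of - last_contact <= timedelta(days=window_days)


def add_recent_contact(rows, as_of, window_days=14):
    """Return copies of the feature rows with a recent_contact flag added."""
    return [
        {**row, "recent_contact": contacted_recently(row.get("last_contact"), as_of, window_days)}
        for row in rows
    ]


as_of = date(2024, 3, 12)  # arbitrary example date, not from the story
rows = [
    {"id": "sub-001", "cancelled_substitutions": 3, "last_contact": date(2024, 3, 5)},
    {"id": "sub-002", "cancelled_substitutions": 3, "last_contact": None},
    {"id": "sub-003", "cancelled_substitutions": 1, "last_contact": date(2024, 1, 20)},
]
featured = add_recent_contact(rows, as_of)
print([(r["id"], r["recent_contact"]) for r in featured])
```

With the flag in place, the model can learn that three cancelled substitutions a week after a call mean something different from three cancelled substitutions out of nowhere.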
Dina interrupts herself, about forty minutes in, to read out one of the LLM’s explanation sentences. “Drop in engagement and three cancelled substitutions indicate this subscriber may be drifting toward pause.” She puts the spreadsheet down. “I don’t want the sentence to diagnose. I want it to describe. Three cancelled substitutions in the last four weeks and no reply to the last two weekly emails. Let me bring the diagnosis.” Priya rewrites the prompt that afternoon. Thirty lines becomes thirty-two. The sentences change.
Kai catches the last one himself, halfway out the door at the end of the session. Dina would phone three people who are in the fortieth to forty-second percentile, not in the top decile. She can say why, each time. The model is under-weighting the opened-pause-help-page signal relative to how Dina weights it. Kai writes a note to himself. He will adjust the feature weights by Friday. The three subscribers will move into the high thirties, which is still not the top decile, and Dina will phone them anyway, because Dina is allowed to phone people off-list. The whole point of the list is that it does not replace her.
Four weeks into shadow mode, Dina’s I’d phone her percentage on the top decile is 71%. Priya is pleased. Kai is delighted. Dina is measured.
“Seventy-one percent,” Dina says, reading the new spreadsheet at her kitchen table. “That’s not the model getting better. That’s the model and I meeting in the middle. I’ll tell you when I trust it.”
Priya writes that sentence in the Thursday gist, without adding her own commentary. Some things don’t need glossing.
The canary
Canary starts on Monday the 22nd of March. Ten percent of subscribers are in the live cohort. Ten percent in the holdout. The remaining eighty percent are still in shadow mode, for measurement purposes, being phoned or not-phoned by Sam’s team on the basis of the existing manual process.
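The story does not say how subscribers are assigned to the three groups. One common way to get a stable 10/10/80 split is to hash the subscriber ID, and the sketch below assumes that approach. The function name `assign_cohort`, the salt and the ID format are hypothetical.

```python
import hashlib
from collections import Counter


def assign_cohort(subscriber_id, salt="canary-v1"):
    """Map a subscriber to live (10%), holdout (10%) or shadow (80%)."""
    digest = hashlib.sha256(f"{salt}:{subscriber_id}".encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # stable bucket in 0..99
    if bucket < 10:
        return "live"
    if bucket < 20:
        return "holdout"
    return "shadow"


# The same ID always lands in the same cohort, week after week.
ids = [f"sub-{i}" for i in range(10_000)]
counts = Counter(assign_cohort(sub_id) for sub_id in ids)
print(counts)
```

Hashing rather than drawing random numbers each week means a subscriber never drifts between the live cohort and the holdout, which would muddy the comparison Marcus signed off on.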
Dina’s team works through the list on Monday and Tuesday. By Tuesday evening they have phoned twenty-seven subscribers. Fourteen of the calls are uneventful: a few thanks for checking in, we’re fine; two actually yes, the substitution rules aren’t working, can you fix them; one please take me off your calling list (noted, tagged, will not be contacted again). Six lead to a concrete action: a substitution-rule update, a box-size change, a conversation that ended with the subscriber saying that’s really helpful.
Seven of the twenty-seven unpause or extend. That is a better rate than the eight-week baseline. Marcus’s team does not call it a result yet. Marcus’s team calls it week one.
On Wednesday morning at 9:17, Priya’s phone rings. It is Iris, one of Dina’s team.
“I need to pause the model.”
“Why?”
“The call I just made. The subscriber said something. I need Dina to hear it and I need the rest of the team to stop calling.”
Priya does not ask what the subscriber said. She runs the rollback command. The flag flips. The outreach queue drains. Every member of Sam’s team receives an email at 9:19 saying do not phone anyone from the list until further notice.
The rollback takes two minutes forty seconds from Iris’s phone call to the email landing. Priya sits at her desk with her hands on the keyboard and makes herself breathe for a count of four before she opens Slack to find out what happened.
Iris, on the call
The subscriber, a woman in her sixties, in Redcliffe north of Brisbane, three years with Greenbox, had not been tagged as no-contact. She was on the at-risk list because she had reduced her box size, cancelled two substitutions, and opened the pause page once. She was in the top 3% of the risk ranking.
When Iris phoned her and opened with Is this a good time? I’m not calling about anything urgent, the subscriber paused for a long time, and then said: the cyclone took my house. I’ve been at my sister’s for three weeks. I can’t use the box from here, I tried to pause it online but I got halfway through and had to answer a door. Please just stop it.
Iris finished the call with care, stayed on the line for seventeen minutes, told the subscriber she’d take the account off any outreach list and that Maya would personally sign off on a three-month pause at no cost and hold her old address on the books until she knew where she was going next. Then she hung up, and rang Priya.
Dina is on the phone to Sam within five minutes. Priya is on the phone to Charlotte within seven.
What the don’t-call list had missed
The don’t-call list is populated in three ways. Manually, by Sam’s team when they learn of a life event. Via integration with the customer-service ticket system. And via Dina’s hand-tagging from funeral notices, media pieces, and obvious-at-a-glance signals (address changes to a hospice, bounced deliveries after a flagged postcode), a list Dina has never stopped adding to.
None of these had captured the subscriber’s displacement. The cyclone had been three weeks ago. She had not updated her delivery address because she did not yet know where she would be next week. She had not told customer service. Nothing in the data said this person should not be on this list. The system worked correctly given the inputs it had.
Dina, on the phone to Priya at a quarter past one: “We will have others like her. Not because we’re sloppy. Because the world keeps quiet sometimes and we don’t always find out. I need the opening-seconds part of the script to be strong enough to catch it when that happens. Iris caught it. I want to write down how she did, so the next person does it too.”
Priya: “Write it. Send it to me tonight.”
Dina wrote it that night. The draft was on the back page of the script by the following morning. Its top line was Dina’s, her exact phrasing, that she had used on calls like this four or five times across her career: If the subscriber tells you something hard in the first thirty seconds, a loss, an illness, a divorce, a job ended, stay with them, not with the script. Ask if there’s anything we can do that would make life easier. Offer a pause at no cost. Let them end the call when they want.
The second line came from Sam, after she read Dina’s first pass. Tell me about it afterwards, every time, so we can update the record. The call is not a sales call. The call is the company saying we have noticed.
The two of them read it together on Friday morning in Sam’s office. Sam said it was enough. Dina said it wasn’t enough but was as much as a script could do, and at some point you have to stop writing the script and teach it to a person. They agreed on that and closed the laptop.
The resumption
They resume the canary on Friday morning. The pause has lasted about forty-eight hours. They add two things before resuming: a more generous no-contact window after any life-event flag (ninety days, up from sixty) and a mandatory re-check of any subscriber in the top 1% against Sam’s team’s knowledge before the outreach email goes out. Dina now reads the top-1% list herself, every Tuesday morning, at her kitchen table, before anyone is phoned.
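The two new rules can be sketched as a filter that runs before the outreach email is built. This is one possible reading, not the team’s code: the ninety-day figure and the top-1% threshold come from the text, while the function names, the data shapes and the choice to treat day ninety itself as outside the window are assumptions.

```python
from datetime import date, timedelta

NO_CONTACT_DAYS = 90  # from the text: ninety days, up from sixty


def in_no_contact_window(life_event_on, today, days=NO_CONTACT_DAYS):
    """True while fewer than `days` days have passed since the life-event flag."""
    if life_event_on is None:
        return False
    return today - life_event_on < timedelta(days=days)


def split_for_outreach(ranked, life_events, today, review_percent=1):
    """ranked: subscriber IDs, highest risk first.

    Returns (needs_review, ready_to_call). Anyone inside the no-contact
    window is dropped; anyone in the top review_percent goes to a human first.
    """
    n_review = (len(ranked) * review_percent + 99) // 100  # ceiling division
    needs_review, ready_to_call = [], []
    for position, sub_id in enumerate(ranked):
        if in_no_contact_window(life_events.get(sub_id), today):
            continue
        if position < n_review:
            needs_review.append(sub_id)
        else:
            ready_to_call.append(sub_id)
    return needs_review, ready_to_call


today = date(2024, 3, 26)  # arbitrary example date, not from the story
ranked = [f"sub-{i:03d}" for i in range(200)]
life_events = {
    "sub-000": today - timedelta(days=30),   # inside the window: dropped
    "sub-005": today - timedelta(days=120),  # outside the window: callable
}
needs_review, ready_to_call = split_for_outreach(ranked, life_events, today)
print(needs_review, len(ready_to_call))
```

Neither rule would have caught the subscriber in Redcliffe on its own, since nothing had flagged her. That is why the script, and the person reading the top of the list, still carry the weight.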
The canary runs for another four weeks before going to full rollout. In the Wednesday morning stand-up of the third post-rollback week, Iris asks, quietly, looking at her notebook, whether the team could talk one afternoon about what it means to end a call well. Dina says yes before Priya has opened her mouth. The conversation happens the following Monday, for an hour, in Sam’s office, with tea. It is not, strictly, a modelling-Thursday. Nobody minds.
The rollback, in the end
The rollback plan was written on the 24th of February. It was used on the 24th of March, exactly four weeks later, from draft to production, without modification.
It took two hours to write, cost nothing to deploy, and was tested twice in a staging environment by a person who had already written the deploy script in her head. When Iris phoned on the Wednesday morning, it saved Sam’s team from making, on Priya’s count, somewhere between three and seven more calls before the problem would have surfaced through another channel.
Priya did not write a reflection about this. She did not underline anything. The rollback file sat in the gist next to the deploy plan, which is longer, and nobody ever photographed it for a slide. The two files lived alongside each other, in the same folder, at the same indentation, one of them the size of a Post-it and the other the size of a book.
On the Friday after the resumption, Dina brought Priya a small plant in a pot, a rosemary cutting from her garden in Bayswater, and put it on Priya’s desk without saying anything. Priya watered it the following Monday. It was still alive six months later.
If any of this landed close to home, please be kind to yourself today.
Your company's Employee Assistance Programme, your GP, or a mental health professional are good places to turn. When you can't face those yet, a phone call to one of the lines below is a reasonable first step.
In Australia, all free and most 24/7:
- Lifeline — 13 11 14 — crisis support, 24/7.
- Griefline — 1300 845 745 — grief counselling, Mon-Fri 6am-midnight AEST.
- Beyond Blue — 1300 22 4636 — mental health, 24/7.
- 13YARN — 13 92 76 — crisis support for Aboriginal and Torres Strait Islander peoples, 24/7.
Elsewhere: findahelpline.com will route you to a line in your country.
The Greenbox story is fiction. The feelings it touches on are not.
Two months on: the weekly ritual that will outlive Priya’s attention, and the question the team keeps trying not to ask about when to turn the whole thing off.