Ilija
efti.mov
Ilija
@efti.mov
Shipping software, changing diapers, sleep deprived. Engineering Manager @ Stripe.

https://stratechgist.com | https://ieftimov.com
Reviewers had a specific responsibility regarding debt: if a changeset ignored or introduced debt (per the plan), the reviewer had to explicitly acknowledge it in a comment, often linking to the IOU ticket.

This reinforces conscious decision-making and shared accountability. 4/4
May 22, 2025 at 10:02 AM
For lower-risk changes, a quick review by one person was sufficient. The key was pragmatism, leveraging extra scrutiny selectively where it added the most value. 3/4
May 22, 2025 at 10:02 AM
We adopted a nuanced approach: for changes touching critical or brittle hotspots, we required a primary and a secondary reviewer. Pair-programmed changes often only needed one reviewer, as the "review" happened during creation. 2/4
May 22, 2025 at 10:02 AM

We were pedantic here, but also pragmatic. We agreed that temporarily accepting reduced coverage in some non-critical areas was okay, provided we compensated with focused manual/exploratory testing and logged the gap on the IOU list. It's about applying testing effort intelligently. 4/4
May 20, 2025 at 10:02 AM
Where did we insist on good tests?

New critical path functionality – obvious.

Areas where debt was specifically deferred, especially high-impact items – you need to know if your workaround is holding.

Key integration points – where systems connect is often where things break. 3/4
May 20, 2025 at 10:02 AM
First things first: before touching brittle legacy code, solidify its existing test coverage. Use coverage tools to find the dark corners, write tests to illuminate them, and then start making your changes. 2/4
May 20, 2025 at 10:02 AM
If a debt-ridden section started causing problems in production, we could disable it or revert to older logic via the flag without a frantic redeploy.

This combo of visibility (monitoring/alerting) and control (feature flags) gives the confidence to ship fast—even when things aren't perfect. 4/4
May 15, 2025 at 10:02 AM
Feature flags are your best friend. We didn't just flag the main new feature; we used flags for specific risky code paths or areas where we'd taken shortcuts. This gave us an emergency brake. 3/4
May 15, 2025 at 10:02 AM
Monitor key metrics, dashboard the hell out of system behavior (especially the risky parts!), and set up aggressive alerting for anomalies or failures. But monitoring isn't enough. 2/4
May 15, 2025 at 10:02 AM
It turns vague "we'll fix it later" promises into concrete, tracked commitments. We even held short, regular "debt triage" meetings to manage this list. It's about accountability. 4/4
May 13, 2025 at 12:57 PM
What's the potential impact if left unaddressed? And critically, an explicit commitment to tackle it post-launch. This list must be visible to the entire team (including Product!), and adding items should be acknowledged openly. 3/4
May 13, 2025 at 12:57 PM
This isn't just another backlog; it's a visible, agreed-upon list of specific debt items you've chosen to incur and are committed to addressing later. Every item needs context: What is it? Why was it deferred (link the risk assessment!)? 2/4
May 13, 2025 at 12:57 PM
The plan was always to remove the wrapper and fix the core issue later, but during the crunch, containment was key.

Stick to the rule: refactor only if it directly unblocks the critical path or demonstrably reduces immediate launch risk.

Everything else goes on the list for later. 4/4
May 8, 2025 at 12:31 PM
We ended up writing wrappers or facades around particularly nasty bits of legacy code. The wrapper provided a cleaner interface for the new API code, shielding it from the underlying complexity. 3/4
May 8, 2025 at 12:31 PM
During a high-pressure crunch, giving in to drive-by refactoring is usually an anti-pattern. Instead, focus on isolation. If a module or class is problematic and fixing it properly is off the critical path, find ways to contain it. 2/4
May 8, 2025 at 12:31 PM
This wasn't about elegance; it was about ruthless triage. It forced us to ask: How likely is this shortcut to blow up during launch? How hard is it to patch?

Thinking about debt in these dimensions turns vague anxiety into actionable categories. 3/3
May 6, 2025 at 12:57 PM
Otherwise, you're flying blind, fixing low-impact stuff while ignoring potential launch-killers. We used a simple 2x2 matrix that saved our bacon:

X-axis: 'Impact on Launch Success/Stability'

Y-axis: 'Effort to Quickly Address'

2/3
May 6, 2025 at 12:57 PM
...and ultimately, losing the very user trust we're working so hard to gain. Getting their buy-in that this debt must be repaid isn't just CYA; it's crucial for securing the resources and time needed for the cleanup later.

Honesty upfront prevents nasty surprises down the line. 4/4
May 1, 2025 at 12:31 PM
Don't sugar-coat it. Go to your stakeholders and lay it out: "we can hit this deadline, but we're taking shortcuts". The trade-off is clear: if we don't allocate time to pay this debt back post-launch, we risk instability, incidents, poor user experience... 3/4
May 1, 2025 at 12:31 PM