
We stopped roadmap work for a week and fixed bugs

ChrisMarshallNY

I love the idea, but this line:

> 1) no bug should take over 2 days

is odd. It's virtually impossible for me to estimate how long it will take to fix a bug until the job is done.

That said, unless fixing a bug requires a significant refactor/rewrite, I can’t imagine spending more than a day on one.

Also, I tend to attack bugs by priority/severity, as opposed to difficulty.

Some of the most serious bugs are often quite easy to find.

Once I find the cause of a bug, the fix is usually just around the corner.

JJMcJ

It's like remodeling. The drywall comes down. Do you just put up a new sheet or do you need to reframe one wall of the house?

muixoozie

I worked for a company that used MS SQL Server a lot, and we would run into a heisenbug every few months that would crash our self-hosted SQL Server cluster or leave it unresponsive. I'm not a database person so I'm probably butchering the description here. From our POV progress would stop and require manual intervention (on call). Back and forth went on with MS and our DBAs for YEARS poring over logs or whatever they do. Honestly never thought it would be fixed. Then one time it happened and we caught all the data going into the commit and realized it would 100% reproduce the crash: only if we restored the database to a specific state and replayed that specific commit would it crash SQL Server. NDAs were signed and I took a machete to our code base to create a minimal repro binary that could deserialize our data store and commit / crash SQL Server. Made a nice PowerShell script to wrap it and repro the issue fast, and guess what? Within a month they fixed it. Was never clear on what exactly the problem was on their end. I got buffer overflow vibes, but that's a guess.

QuiEgo

As someone who works with hardware, hard to repro bugs can take months to track down. Your code, the compiler, or the hardware itself (which is often a complex ball of IP from dozens of manufacturers held together with a NoC) could all be the problem. The extra fun bugs are when two or three of them combine in a perfect storm to make a mega bug that is impossible to reproduce in isolation.

QuiEgo

Random example: I once worked on a debug where you were not allowed to send zero-length packets due to a known HW bug. Okay fine, work around it in SW. Turns out there was an HW eviction timer, supposedly disabled, connected to a counter that counted sys clk ticks. Due to a SW bug it was not entirely disabled, so once every 2^32 ticks it would trigger an eviction, and if the queue happened to be empty, it would send a ZLP, which triggered the first bug (hard-hanging the system in a way that breaks the debugger). There were dozens of ways that could hard hang the system, this was just one. Good luck debugging that in two days.

jeffreygoesto

We had one where data interpreted as an address (a simple C typo, before static analysis was common) fell into an unmapped memory region and the PCI controller stalled trying to get a response, thereby also halting the internal debugging logic, and JTAG just stopped forever (PPC603 core). Each time you'd hit the bug, the debugger was thrown off.

kykat

Sometimes, a "bug" can be caused by nasty architecture with intertwined hacks. Particularly on games, where you can easily have event A that triggers B unless C is in X state...

What I want to say is that I've seen what happens in a team with a history of quick fixes and architecture inadequate for the complex features it has to support. In that case, a proper bugfix can require significant rework and QA.

arkh

> Sometimes, a "bug" can be caused by nasty architecture with intertwined hacks

The joys of enterprise software. When searching for the cause of a bug lets you discover multiple "forgotten" servers, ETL jobs, and crons all interacting together. And no one knows why they do what they do, or how they do it, because the people behind them left many years ago.

fransje26

> searching for the cause of a bug lets you discover multiple "forgotten" servers, ETL jobs, and crons all interacting together. And no one knows why they do [..]

And then comes the "beginner's" mistake. They don't seem to be doing anything. Let's remove them, what could possibly go wrong?

silvestrov

plus report servers and others that run on obsolete versions of Windows/unix/IBM OS plus obsolete software versions.

and you just look at this and think: one day, all of this is going to crash and it will never, ever boot again.

groestl

And then it turns out the bug is actually very intentional behavior.

ChrisMarshallNY

In that case, maybe having bug fixing be a two-step process (identify, then fix), might be sensible.

OhMeadhbh

I do this frequently. But sometimes identifying and/or fixing takes more than 2 days.

But you hit on a point that seems to come up a lot. When a user story takes longer than the allotted points, I encourage my junior engineers to split it into two bugs. Exactly like what you say... One bug (or issue or story) describing what you did to typify the problem and another with a suggestion for what to do to fix it.

There doesn't seem to be a lot of industry best practice about how to manage this, so we just do whatever seems best to communicate to other teams (and to ourselves later in time after we've forgotten about the bug) what happened and why.

Bug fix times are probably a Pareto distribution. The overwhelming majority will be identifiable within a fixed time box, but not all. So in addition to saying "no bug should take more than 2 days" I would add "if the bug takes more than 2 days, you really need to tell someone, something's going on." And one of the things I work VERY HARD to create is a sense of psychological safety, so devs know they're not going to lose their bonus if they randomly picked a bug that was much more wicked than anyone thought.
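
A rough sketch of that intuition in Python (the Pareto shape and scale here are made-up, purely illustrative assumptions, not data from anywhere):

    import random

    random.seed(0)
    shape, scale = 1.5, 2.0  # assumed parameters, purely illustrative
    # Pareto-distributed fix times (in hours): most are small, a few are monsters
    samples = [scale * random.paretovariate(shape) for _ in range(10_000)]
    within_box = sum(t <= 16 for t in samples) / len(samples)
    print(f"fixed within a 16h (2-day) box: {within_box:.0%}")  # roughly 95%
    print(f"worst case in this sample: {max(samples):.0f}h")    # far outside the box

Under those assumed parameters the vast majority of bugs fit the time box, but the tail is long enough that "tell someone if you blow past 2 days" is exactly the right escape hatch.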

marginalia_nu

I think in general, bugs go unfixed in two scenarios:

1. The cause isn't immediately obvious. In this case, finding the problem is usually 90% of the work. Here it can't be known beforehand how long finding the problem will take, though I don't think bailing because it's taking too long is a good idea. If anything, it's in those really deep rabbit holes that the real gremlins hide.

2. The cause is immediately obvious, but it's an architecture mistake; the fix is a shit-ton of work, breaks workflows, requires involving stakeholders, etc. Even in this case it can be hard to say how long it will take, especially if other people are involved and have to sign off on decisions.

I suppose it can also happen in low-trust sweatshops where developers are held on such a tight leash they aren't able to fix trivial bugs they find without first going through a bunch of Jira rigmarole, which is sort of low-key the vibe I got from the post.

OhMeadhbh

At Amazon we had a bug that was the result of a compiler bug and the behaviour of Intel cores being mis-documented. It was intermittent and related to one core occasionally being allowed to access stale data in the cache. We debugged it with a logic analyzer, the commented nginx source, and a copy of the C++11 spec.

It took longer than 2 days to fix.

amoss

When you work on compilers, all bugs are compiler bugs.

(apart from the ones in the firmware, and the hardware glitches...)

ChrisMarshallNY

I’m old enough to have used ICEs to trace program execution.

They were damn cool. I seriously doubt that something like that exists outside of a TSMC or Intel lab these days.

Windchaser

/imagining using an internal combustion engine here

plq

ICE meaning in-circuit emulator in this instance, I assume?

auguzanellato

What kind of LA did you use to debug an Intel core?

OhMeadhbh

The hardware team had some semi-custom thing from Intel that spat out (no surprise) gigabytes of trace data per second. I remember much of the pain was in constructing a lab where we could drive a test system at reasonable loads to get the buggy behavior to emerge. It was intermittent, so it took us a couple weeks to come up with theories, another couple days for testing, and a week of analysis before we came up with triggers that allowed us to capture the data that showed the bug. It was a bit of a production.

peepee1982

Yep. Also, sometimes you figure out a bug and in the process you find a whole bunch of new ones that the first bug just never let surface.

khannn

I had a job that required estimation on bug tickets. It's honestly amazing how they didn't realize that I'd take my actual estimate, then multiply it by 4, then use the extra time to work on my other bug tickets that the 4x multiplier wasn't good enough for.

mewpmewp2

That's just you hedging, and they don't really need to know that. As long as you are hedging accurately in the big picture, that's all that matters. They need estimates to be able to make decisions on what should be done and what shouldn't.

You could tell them there's a 25% chance it takes 2 hours or less, a 50% chance it takes 4 hours or less, a 75% chance it takes 8 hours or less, and a 99% chance it takes 16 hours or less, to be accurate, but communication-wise you'll win out if you just intuitively call items like those 10 hours or similar. Intuitively, 10 hours feels safe with those probabilities (which are experience-based intuitions too). So you'd probably say 10 hours, unless something really unexpected (the 1%) happens.

Btw, in reality with the above probabilities the actual average would be around 5-6 hours, with 1% of tasks potentially blowing past the estimate, but even your intuitive probability estimates could be off, so you likely want to say 10 hours.
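
As a quick back-of-envelope check of that average (assuming, just for illustration, that time is uniform within each quantile bucket and that the 1% tail lands around 20 hours; both assumptions are mine):

    # Quantiles from above: 25% <= 2h, 50% <= 4h, 75% <= 8h, 99% <= 16h
    buckets = [
        (0.25, 1.0),   # 0-2h bucket, midpoint 1h
        (0.25, 3.0),   # 2-4h bucket, midpoint 3h
        (0.25, 6.0),   # 4-8h bucket, midpoint 6h
        (0.24, 12.0),  # 8-16h bucket, midpoint 12h
        (0.01, 20.0),  # 1% tail, assumed to land around 20h
    ]
    expected = sum(p * t for p, t in buckets)
    print(f"expected time: {expected:.1f} hours")  # prints ~5.6 hours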

But anyhow, that's why story points are mostly used as well: if you say hours, they will naturally think it's a more fixed estimation. Hours would be fine if everyone naturally understood that they imply a statistical average of time, plus a reasonable buffer, over a large number of similar tasks.

etamponi

Ex-Meta employee here. I worked at Reality Labs; perhaps in other orgs the situation is different.

At Meta we did "fix-it weeks", more or less every quarter. At the beginning I was thrilled: leadership that actually cares about fixing bugs!

Then reality hit: it's the worst possible decision for code and software quality. Basically this turned into: you are allowed to land all the crap you want, then you have one week to "fix all the bugs". Guess what: most of the time we couldn't even fix a single bug because we were drowning in tech debt.

mentos

Reminds me of id's policy of "As soon as you see a bug, you fix it"

"...if you don't fix your bugs your new code will be built on buggy code and ensure an unstable foundation and if you check in buggy code someone else is going to be writing code based on your bad code and well you know you can imagine how wasteful that's going to be"

16:22 of "The Early Days of id Software: Programming Principles" by John Romero (Strange Loop 2022) https://www.youtube.com/watch?v=IzqdZAYcwfY&t=982s

AdamN

Yeah, Joel Spolsky is adamant about this with the "Bugs First" approach, and he claims most of the delays and garbage that Microsoft released during the early years of his career stemmed from that one rule being violated.

demaga

From the post:

> That’s not to say we don’t fix important bugs during regular work; we absolutely do. But fixits recognize that there should be a place for handling the “this is slightly annoying but never quite urgent enough” class of problems.

So in their case, fixit week is mostly about smaller bugs, quality of life improvements and developer experience.

gregoriol

It must be part of the normal process. If the normal process leaves things like this to "some other time", one should start by fixing the process.

IgorPartola

Say you are working on a banking system. You ship a login form, it is deployed, used by tons of people. Six months later you are mid-sprint on the final leg of a project that will hook your bank into the new FedNow system. There are dozens of departments working together to coordinate deploying this new setup as large amounts of money will be moved through it. You are elbows deep in the context of your part of this and the system cannot go live without it. Twice a day you are getting QA feedback and need to make prompt updates to your code so the work doesn’t stall.

This is when the report comes in that your login form update from six months ago does not work on mobile Opera if you disable JavaScript. The fix isn’t obvious and will require research, potentially many hours or even days of testing and since it is a login form you will need the QA team to test it after you find another developer on your team to do a code review for you.

What exactly would you do in this case? Pull resources from a major project that has the full attention of the C suite to accommodate some tin foil Luddite a few weeks sooner or classify this as lower priority?

BurningFrog

This is weird to me...

The way I learned the trade, and usually worked, is that bug fixing always comes first!

You don't work on new features until the old ones work as they should.

This worked well for the teams I was on. Having an (AFAYK) bug-free code base is incredibly useful!!

Celeo

Depending on the size of the team/org/company, working on anything other than the next feature is a hard sell to PM/PO/PgM/management.

NegativeK

I've had to inform leadership that stability is a feature, just like anything else, and that you can't just expect it to happen without giving it time.

One leader kind of listened. Sort of. I'm pretty sure I was lucky.

deaux

Ask them if they're into pro sports. If so (and most men outside of tech are in some way), they'll probably know the phrase "availability is the best ability".

dijksterhuis

i got lucky at my last shop. b2b place for like 2x other customer companies. eng manager person (who was also like 3x other managers :/ ) let everything get super broken and unstable.

when i took lead of eng it was quite an easy path to making it clear stability was critical. slow everything down and actually do QA. customer became super happy because basically 3x releases went out with minimal bugs/tweaks required. “users don’t want broken changes immediately, they want working changes every so often” was my spiel etc etc.

unfortunately it was impossible to convince people about that until they screwed it all up. i still struggle to let things “get bad so they can get good”, but am aware of the lesson today at least.

tl;dr sometimes you gotta let people break things so badly that they become open to another way

BurningFrog

That's what I hear.

I've had some mix of luck and skill in finding these jobs. Working with people you've worked with before helps with knowing what you're in for.

I also don't really ask anyone, I just fix any bugs I find. That may not work in all organizations :)

ramon156

I can guarantee you this doesn't work in our team! You didn't make a ticket, so the PM has no idea what you're doing!

Yes, a ticket takes 2 seconds. It also puts me off my focus :P but I guess measuring is more important than achieving.

zelphirkalt

micro-managing middle manager: "Are all your other sprint tasks finished?"

code reviewing coworker: "This shouldn't be done on this branch!" (OK, at least this is easy to fix by doing it on a separate branch.)

RHSeeger

Bugs have priorities associated with them, too. It's reasonable for a new feature to be more important than fixing a lower-priority bug. For example, if reading the second "page" of results for an API isn't working correctly, but nobody is actually using that functionality, then it might not be that important to fix it.

tonyedgecombe

> For example, if reading the second "page" of results for an API isn't working correctly, but nobody is actually using that functionality, then it might not be that important to fix it.

I've seen that very argument several times; it was even in the requirements on one occasion. In each instance it was incorrect: there were times when a second page was reached.

AdamN

IMHO the best way to deal with that situation is to mark the bug as wontfix. Better to have a policy of always fixing bugs but be more flexible on what counts as a bug (and making sure the list of them is very small and being actively worked on).

jaredklewis

Where have you worked where this was practiced if you don’t mind sharing?

I've seen backends that were very close to bug-free (mostly early on in development). But every frontend code base ever just always seems to have a long list of low-impact bugs. Weird devices, a11y things, unanticipated screen widths, weird iOS Safari quirks, and so on.

Also I feel like if this was official policy, many managers would then just start classifying whatever they wanted done as a bug (and the line can be somewhat blurry anyway). So curious if that was an issue that needed dealing with.

mavamaarten

I'm not going to share my employer, but this is exactly how we operate. Bugs first, they show up on the Jira board at the top of the list. If managers would abuse that (they don't), we'd just convert them to stories, lol.

I do agree that it's rare, this is my first workplace where they actually work like that.

zelphirkalt

Frontend bugs mostly stem from the use of overblown frontend frameworks that try to abstract away the basics of the web too much. When relying on browser defaults and web standards, proper semantic HTML, and sane CSS usage, the scope of things that can go wrong is limited.

Sharlin

It's pretty wild that this is the case now (if it indeed is), given that for a long, long time, sticking to sane, standard stuff was the exact way you'd land in a glitch/compatibility hell. Yes, thanks mostly to IE, but still.

jaredsohn

I'd love to see an actual bug-free codebase. People who state the codebase is bug-free probably just lack awareness. Even stating we 'have only x bugs' is likely not true.

NegativeK

Top commenter's "AFAYK" acronym is covering that.

The type that claims they're going to achieve zero known and unknown bugs is also going to be the type to get mad at people for finding bugs.

supriyo-biswas

> The type that claims they're going to achieve zero known and unknown bugs is also going to be the type to get mad at people for finding bugs.

This is usually EMs in my experience.

At my last job, I remember reading a codebase that had recently been written by another developer to implement something in another project, and found a thread safety issue. When I brought this up and said we'd push the fix as part of the next release, he went on a little tirade about how proper processes weren't being followed, etc., although it was a mistake anyone could have made.

rurban

We kinda always leave documentation and test bugs in. Documentation teams have different scheduling, and tests are nice TODOs.

There are also always bugs detected after shipping (usually in beta), which need to be accounted for.

waste_monk

>I'd love to see an actual bug-free codebase.

cat /dev/null .

Sharlin

A specific individual execution is not a codebase.

mobeigi

Any modern system with a sizeable userbase has thousands of bugs. Not all bugs are severe; some are mere inconveniences affecting only a small % of customers. You usually have to balance feature work and bug fixes, and leadership almost always favours new features if the bugs aren't critical.

brulard

Many of the bugs have very low severity or appear for a small minority of users under very specific conditions. Fixing these first might be quite a bad use of your capacity. Like misaligned UI elements, etc. Critical bugs should of course be fixed immediately as a hotfix.

kykat

In the places that I worked, features came before all else, and bugs weren't fixed unless customers complained.

Cthulhu_

Thing is, if you follow a process like scrum, your product owner will set priorities; if there's a bug that isn't critical, it may go down the list of priorities compared to other issues.

And there's other bugs that don't really have any measurable impact, or only affect a small percentage of people, etc.

stevoski

I’m a strong believer in “fix bugs first” - especially in the modern age of “always be deploying” web apps.

(I run a small SaaS product - a micro-SaaS as some call it.)

We’ll stop work on a new feature to fix a newly reported bug, even if it is a minor problem affecting just one person.

Once you have been following a “fix bugs first” approach for a while, the newly discovered bugs tend to be few, and straightforward to reproduce and fix.

This is not necessarily the best approach from a business perspective.

But from the perspective of being proud of what we do, of making high quality software, and treating our customers well, it is a great approach.

Oh, and customers love it when the bug they reported is fixed within hours or days.

ivolimmen

Would love to work on a project with this as a rule, but I am working on a project that was built before me: 1.2 million lines of code, 15 years old, really old frameworks. I don't think we could add features if we did this.

chamomeal

Same. The legacy project that powers all of our revenue-making projects at work is a gargantuan hulking php monster of the worst code I’ve ever seen.

A lot of the internal behaviors ARE bugs that have been worked around, and become part of the arbitrary piles of logic that somehow serve customer needs. My own understanding of bugs in general has definitely changed.

stevoski

I wrote up my thoughts on this into a longer post: https://killthehippo.com/posts/fix-bugs-or-add-new-features

pmontra

About stopping and fixing problems, has anybody had this kind of experience?

1. Working on Feature A, stopped by management or by the customer because we need Feature B as soon as possible.

2. Working on Feature B, stopped because there is Emergency C in production due to something that you warned the customer about months ago but there was no time to stop, analyze and fix.

3. Deployed a workaround and created issue D to fix it properly.

4. Postponed issue D because the workaround is deemed to be enough, resumed Feature B.

5. Stopped Feature B again because either Emergency E or new higher priority Feature F. At this point you can't remember what that original Feature A was about and you get a feeling that you're about to forget Feature B too.

6. Working on whatever the new thing is, you are interrupted by Emergency G that happened because that workaround at step 3 was only a workaround, as you correctly assessed, but again, no time to implement the proper fix D so you hack a new workaround.

Maybe add another couple of iterations, but by this time every party is angry with, or at least unhappy about, every other party.

You have a feeling that the work of the last two or three months on every single feature has been wasted because you could not deliver any one of them. That means that the customer wasted the money they paid you. Their problem, but it can't be good for their business so your problem too.

The current state of the production system is "buggy and full of workarounds" and it's going to get worse. So you think that the customer would have been wiser to pause and fix all the nastier bugs before starting Feature A. We could have had a system running smoothly, no emergencies, and everybody happier. But no, so one starts thinking that maybe the best course of action is changing company or customer.

cracki

Symptoms of a dysfunctional company where communication has broken down, everyone with any authority is running around EXACTLY like a headless chicken, waving around frantically (giving orders). Margins are probably thin as a razor, or non-existent. They will micromanage your work time to death. You will be treated as a commodity factory machine and if you start using your brain to solve actual problems, you will be chastised. Deadlines everywhere keep everyone's brain shut off and in panic mode. No time to properly engineer anything. Nobody has the time to check anyone else's work, causing "trust" that isn't even blind, just foolish. You as the software guy end up debugging and fixing EVERYONE's mistakes. When the bug is in hardware/electronics, everyone knows who's actually to blame, but everyone still expects YOU to fix it, and they're immensely disappointed when you can't save the day.

These places cannot and will not change. If you can, find employment elsewhere.

pjc50

This is not uncommon but I've mostly managed to avoid it, because it's a management failure. There is a delicate process of "managing the customer" so that they get a result they will eventually be satisfied with, rather than just saying yes to whatever the last phone call was.

dsego

Yes, usually not worth it to spend too much time on proper engineering if the company is still trying to find a product-market fit and you will be working on something else or deleting the code in a few months.

abroszka33

> has anybody had this kind of experience?

Yes, the issue is not you, it's a toxic workplace. Leave as soon as you can.

Galxeagle

In my experience, having a fixit week on the calendar encourages teams to just defer what otherwise could be done relatively easily at first report. ("ah we'll get to it in fixit week"). Sometimes it's a PM justifying putting their feature ahead of product quality, other times it's because a dev thinks they're lining up work for an anticipated new hire's onboarding. It's even hinted at in the article ('All year round, we encourage everyone to tag bugs as “good fixit candidates” as they encounter them.')

My preferred approach is to explicitly plan 'keep the lights on' capacity into the quarter/sprint/etc in much the same way that oncall/incident handling is budgeted for. With the right guidelines, it gives the air cover for an engineer to justify spending the time to fix it right away and builds a culture of constantly making small tweaks.

That said, I totally resonate with the culture aspect - I think I'd just expand the scope of the week-long event to include enhancements and POCs like a quasi hackathon

hastily3114

We do this too sometimes and I love it. When I work on my own projects I always stop and refactor/fix problems before adding any new features. I wish companies would see the value in doing this

Also love the humble brag. "I've just closed my 12th bug" and later "12 was maximum number of bugs closed by one person"

troad

It's fairly telling of the state of the software industry that the exotic craft of 'fixing bugs' is apparently worth a LinkedIn-style self-promotional blog post.

I don't mean to be too harsh on the author. They mean well. But I am saddened by the wider context, where a dev posts 'we fix bugs occasionally' and everyone is thrilled, because the idea of ensuring software continues to work well over time is now as alien to software dev as the idea of fair dealing is to used car salesmen.

remus

> But I am saddened by the wider context, where a dev posts 'we fix bugs occasionally' and everyone is thrilled, because the idea of ensuring software continues to work well over time is now as alien to software dev as the idea of fair dealing is to used car salesmen

This is not the vibe I got from the post at all. I am sure they fix plenty of bugs throughout the rest of the year, but this will be balanced with other work on new features and the like, and will be guided by wider business priorities. The point of the exercise seems to be focusing solely on bugs to the exclusion of everything else, with a lot of latitude to just pick whatever has been annoying you personally.

ozim

That’s what we have fix anything Friday for.

The name is just an indication; you can do it any day, but the idea is that on Friday, when you're not about to start a big thing, you pick some small one you personally want to fix. Maybe a bug in the product, maybe the local dev setup.

pjmlp

That is why I stand on the side of better laws for company responsibility.

We as an industry have taught people that broken products are acceptable.

In any other industry, unless people know from the start that they are getting something broken or low quality (flea market, 1 euro shop, or similar), they will return the product, ask for their money back, sue the company, whatever.

zelphirkalt

There should be better regulation of course, but I want to point out that the comparison with other industries doesn't quite work, because these days software is often given away at no financial cost. Often it costs one's data. But once that data is released into their data flows, you can never unrelease it. It has already been processed in LLM training or used somehow to target you with ads or whatever other purpose. So people can't do what they usually would do when the product is broken.

nananana9

"Free" software resulting in your data being sold is the software working as intended, it's orthogonal to the question of software robustness.

Software isn't uniquely high stakes relative to other industries. Sure, if there's a data breach your data can't be un-leaked, but you can't be un-killed when a building collapses over your head or your car fails on the highway. The comparison with other industries works just fine - if we have high stakes, we should be shipping working products.

k4rli

Imagining that the software will be shipped with hardware that has no internet access, and therefore cumbersome firmware upgrades, might be helpful. Avoiding shipping critical bugs is actually critical, because bricking the hardware is undesirable.

Example: (aftermarket) car headunit.

zeroCalories

This type of testing is incredibly expensive and you'll have a startup run circles around you, assuming a startup could even exist when the YC investment needs to stretch 4x as far for the same product.

The real solution is to have individual software developers be licensed and personally liable for the damage their work does. Write horrible bugs? A licencing board will review your work. Make a calculated risk that damages someone? Company sued by the user, developer sued by the company. This correctly balances incentives between software quality and productivity, and has the added benefit of culling low quality workers.

alansaber

A company creating the conditions that allow for high quality engineering has always been the exception, not the norm

a4isms

In the early days of Hacker News, and maybe even before Hacker News when Reddit didn't have subreddits... OG blogger Joel Spolsky posited the "Joel Test," twelve simple yes/no questions that defined a certain reasonable-by-today's-standards local optimum for shipping software:

https://www.joelonsoftware.com/2000/08/09/the-joel-test-12-s...

Some seem ridiculously obvious today, but weren't standard 25 years ago. Seriously! At the turn of the century, not everyone used a bug database or ticket tracker. Lots of places had complicated builds to production, with error-prone manual steps.

But question five is still relevant today: Do you fix bugs before writing new code?

lalitmaganti

Author here! Really glad to have sparked a lively discussion in the comments. Since there are so many threads since I last looked at this post, I'm making one top-level comment to provide some thoughts:

1) I agree that estimating a bug's complexity upfront is an error-prone process. This is exactly why I say in the post that we encourage everyone to "feel out" non-trivial issues and, if it feels like the scope is expanding too much (after a few hours of investigation), to just pick something else after writing up their findings on the bug.

2) I use the word "bug" to refer to more traditional bugs ("X is wrong in product") but also feature requests ("I wish X feature worked differently"). This is just a companyism that maybe I should have called out in the post!

3) There's definitely a risk the fixit week turns into just "let's wait to fix bugs until that week". This is why our fixits are especially for small bugs which won't be fixed otherwise - it's not a replacement for technical hygiene (i.e. refactoring code, removing dead code, improving abstractions) nor a replacement for fixing big/important issues in a timely manner.

danielbarla

Very interesting post, thank you!

I'd also be curious to know the following: how many new errors or regressions were caused by the bug fixes?

julianlam

We did this ages ago at our company (back then we were making silly Facebook games, remember those?)

It was by far the most fun, productive, and fulfilling week.

It went on to shape the course of our development strategy when I started my own company. Regularly work on tech debt and actively applaud it when others do it too.

tracker1

I've been pushing for things like this for years...

Having every 3rd or 4th sprint being dev initiatives and bugs... Or having a long/short sprint cycle where short sprints are for bugs mostly... Basically every 3rd week is for meetings and bug work so you get a solid 2 weeks with reduced meetings.

It's hard to convince upper managers of the utility though.