I have watched a large rewrite fail and cost an engineering manager his job. The next manager, perhaps learning from his fallen comrade, did something that worked spectacularly well: a gradual, component-focused rewrite. With each release, they would carve out a part of the system and rewrite only that chunk. For anyone contemplating the big rewrite, I would suggest this as an alternative.
I've noticed this as well. You simply can't start from scratch.
The first thing you do is make sure you have good tests built around your old code base. Then you slowly start refactoring/rewriting pieces out. After every small refactoring round, run your tests and make sure everything is working. The key is to break the rewrite down into small steps and make sure you have a fully functioning product at each step. This might even mean writing code that will be removed after a couple of refactoring iterations.
Wish I could vote you up a few more times on this. Michael Feathers called them
characterization tests. Sure, run them as unit tests, but get them under CI
right away too. Test everything that moves, all the time.
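The idea behind a characterization test is just to pin down whatever the legacy code currently does, correct or not, so refactoring can't silently change it. A minimal sketch in Python — the function and its numbers are hypothetical stand-ins for real legacy code:

```python
# Characterization test: record the legacy code's CURRENT behavior,
# without judging whether that behavior is "correct".

def legacy_price(quantity):
    # Imagine this is tangled legacy logic we dare not touch yet.
    total = quantity * 9.99
    if quantity > 10:
        total = total * 0.9  # undocumented bulk discount
    return round(total, 2)

def test_characterize_legacy_price():
    # These expected values were captured by RUNNING the code,
    # not derived from a spec -- that is the whole point.
    assert legacy_price(1) == 9.99
    assert legacy_price(10) == 99.9
    assert legacy_price(11) == 98.9  # the surprise discount, pinned down

test_characterize_legacy_price()
```

Once a net of tests like this is green under CI, you can start swapping out the internals with some confidence.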
This type of work, and this approach, appeals to a limited set of
people, though. It's painstaking, detailed work. The other problem is
that businesses don't understand its value and don't want to
pay for it, in my experience. I've seen two companies go down from not paying attention in
this area, and two more that are currently dying.
Of course, if the thing had been under test to start with, things would be
so much simpler ;)
This is what Michael Feathers calls 'seams' in his book, Working Effectively with Legacy Code. Often you have to do exploratory testing: you don't really know the requirements, but you write tests that the current code passes. Then you can refactor it, knowing the current behavior won't be changed.
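One common kind of seam, sketched in Python with hypothetical names: a hard-wired dependency is turned into a parameter with the old behavior as its default, so production code is unchanged but tests can substitute their own version:

```python
# A "seam" is a place where tests can alter behavior without
# editing the code under test. Here the hard-wired clock becomes
# an injectable dependency (all names are illustrative).

from datetime import datetime

# Before: no seam -- the time source is hard-wired.
def is_business_hours_legacy():
    now = datetime.now()
    return 9 <= now.hour < 17

# After: the time source is a parameter whose default preserves
# the old behavior, so callers need not change.
def is_business_hours(now_fn=datetime.now):
    now = now_fn()
    return 9 <= now.hour < 17

# In a test we exploit the seam by injecting a fixed timestamp:
assert is_business_hours(lambda: datetime(2024, 1, 8, 10, 0)) is True
assert is_business_hours(lambda: datetime(2024, 1, 8, 20, 0)) is False
```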
Very good read, if you need to deal with legacy code and you don't know where to start.
This is exactly what we just did to ww.com, and while we're still on the fence about whether it was the right thing to do, the result is that we did not have big continuity issues and we now have a completely fresh codebase.
Where I see rewrites fail is when the company has one huge monolithic and interconnected application which really has to be rewritten all at once or not at all.
Dividing your application up into logical libraries and services goes a long way towards making it easy to rewrite. Essentially, you want to obey the single responsibility principle. This means that each component should not need to change if a change is made in another component. From the Wikipedia article:
> Martin defines a responsibility as a reason to change, and concludes that a class or module should have one, and only one, reason to change. As an example, consider a module that compiles and prints a report. Such a module can be changed for two reasons. First, the content of the report can change. Second, the format of the report can change. These two things change for very different causes; one substantive, and one cosmetic. The single responsibility principle says that these two aspects of the problem are really two separate responsibilities, and should therefore be in separate classes or modules. It would be a bad design to couple two things that change for different reasons at different times.[1]
For example, some of your core logic might be written as a library or a stand-alone server. This separates it from the GUI logic and the database logic. So if you need to rewrite the interface, you do that in the GUI code. If you want to refactor the business logic, you rewrite that component. If you want to change the database, you alter your data-access layer.
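The report example from the quoted Wikipedia passage can be sketched directly — content and formatting change for different reasons, so they live in separate classes (names here are illustrative):

```python
# Single responsibility principle, using the report example:
# WHAT the report says and HOW it is rendered are separate concerns.

class ReportData:
    """Owns the content -- changes only when the content changes."""
    def __init__(self, title, rows):
        self.title = title
        self.rows = rows

class PlainTextFormatter:
    """Owns the presentation -- changes only when the format changes."""
    def render(self, report):
        lines = [report.title, "-" * len(report.title)]
        lines.extend(str(row) for row in report.rows)
        return "\n".join(lines)

report = ReportData("Q3 Sales", ["North: 120", "South: 80"])
print(PlainTextFormatter().render(report))
```

A new output format (HTML, CSV) becomes a new formatter class; ReportData never has to change, which is exactly what makes the component safe to rewrite in isolation.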
> one huge monolithic and interconnected application which really has to be rewritten all at once or not at all
I disagree. Any program can always be refactored and compartmentalized. It may be a slow process, but it is possible to do a rewrite one chunk at a time.
Absolutely -- a total rewrite is only necessary if broad architectural decisions are all wrong. Otherwise, working to decouple the monolithic system is the crucial step which will allow the chunk-by-chunk rewrites.
The thing that popped into my head while reading the piece is that sparing one star developer might be pretty cheap; how about putting him on a Skunkworks rewrite for 6 months and see how it goes? If it's looking good, give him whatever resources are necessary to finish. Do you think that would work?
I think that's a good way to minimize risk, but also consider the morale issue. This would be the equivalent of giving one developer a corner window office while the rest stay in their cubes.
Perhaps you could apply Google's 20% rule? Every Friday the entire staff breaks up into teams and works on skunkworks rewrite projects. I think this would boost overall morale, and you might find your developers staying late on Friday nights :)
The advantage to the component based rewrite is that it doesn't cost you your head if it fails. You can still push out new features in each version, and you have a fallback plan if the component rewrite fails or is delayed (just use the old one).
The disasters were driven mainly by an attempt to create a manageable codebase while keeping the end user experience fairly similar.
The successes were things that built on existing systems but focused on delivering things that were actually radical improvements in functionality, with the rewrites being driven by this, not an end in themselves.
I feel like the mental stumbling block that stops people has something to do with our ideas of purity and cleanliness. Maybe people intuitively feel contact with the old system would make the new code unclean.
It is slow and painful and requires you to pay lots and lots of careful attention to the behavior of the existing enterprise system that you may very well hate.
Greenfield development sings a siren song. You get to scribble on a gloriously empty page in your imagination, free of such mundane concerns as cash flow, near-term customer demands, day-to-day stability, vitally important edge cases, and hard-won but crufty bug fixes.
Part of the reason could be technical analysis paralysis. I've worked on projects rewriting PowerBuilder components in C#. Getting the two to talk to each other is non-trivial, so determining where to slice off chunks to rewrite is an anxiety-inducing prospect.
We've been doing a variant of this for the last year. We rewrote the core and one large component first and have been gradually moving over the other components to the new architecture. It's definitely a safer way to go, though implementation time is longer.