Coding from the Trenches: Software Release Engineering

In his Coding Horror post titled How Should We Teach Computer Science?, Jeff Atwood blogs about the lack of coverage of release engineering in computer science courses. At best, he points out, it's given cursory coverage in these courses.

Now, I'm a self-taught developer. I started programming computers in 1985 or so, and I've taught myself everything I know. So I can't really comment about what the courses in a college or university are like. But I can say this: experience has taught me that a few things he says are absolutely, undeniably true. So in this article, I'm going to enumerate those things I think are really important, and how I built a software release process at the company I worked for.

The Ugly Truth About Release Engineering

Release Engineering is not simply deploying your product. There's a reason it's called engineering. It involves getting the latest version of the build from source code control, building the software, executing unit tests, building installers, and labeling the build if it's correctly built. It may involve pinning or branching the build. It requires a daily "good code" check-in time policy. It requires daily builds to ensure that you have software that compiles every day, a means of notifying folks when the build is broken, and a commitment to fixing the build-breaking code right away. It's NOT simple.
Consistent, disciplined use of source-code control is the bedrock of release engineering. At any given time, you might need to fix a bug in the prior release. That's hard to do if you've already started changing the code for the new release. Branched builds allow you to do that. Also, versioned files in the repository allow you to view the history of changes to a file to recover from unintentional changes. You can also develop for multiple platforms while using many of the same files, sharing them across projects without having to worry if they're out of sync. Labels on files and projects tell you exactly which version of a file was used to create any given build so that you can recreate a project from the repository if you need to.
Building for your environment is not enough. You need a test environment that mimics your client's environment as closely as you can make it, down to the OS, the browser, the applications and the add-ons. Just because it runs on your machine when you press F5 from Visual Studio does not mean it's going to run on the client's machine. If you're developing for multiple browsers, install those browsers and test for them.
(Ugly true story: our company accidentally allowed IE7 through the group policies. We had IE7 deployed everywhere. Our clients don't plan to upgrade to IE7 for another year, at least. Our product must run on IE6. I had to create a separate machine that was safe from IE7 downloads and strictly ran IE6 to be certain the product ran correctly.)
F5 is not enough. Every build should be a clean build. Every build. Don't ship files to the customer that aren't required to run the software. Create a build script that does the job. Excluding a file from a Visual Studio project doesn't delete the file from the folder, and the file stays in source code control (a good thing for versioning). To ensure a clean release, have your release script remove files you aren't using prior to shipment.
You need a check-in time policy. All good source code must be checked in by a certain time every day. Code that isn't checked in does not make it into the daily build. This check-in time should be early enough that the release manager can start the daily build (if it's a manual process), or make the rounds and make sure that all code is checked in prior to it. I favor end-of-day check-ins (around 4 PM) for nightly builds, but each organization is different.
The software must successfully build every day. A successful build is a good sign of project health. An automated build tool can be set up to execute the build in the off hours after everyone has gone home, and after all files are checked in. Once the build is complete, the pass/fail report is sent to your release manager. However, just because it compiles doesn't mean that it's entirely healthy or bug-free. Therefore...
Automated unit tests should be executed on every build. If you aren't using automated unit tests, you should be. They're not hard to learn, the tools to create them are freely available, and they can improve the stability and quality of your code immeasurably. Incorporate the unit tests into your build script so that they're executed every time you build the software. Correctly written unit tests alert you to build-breaking defects immediately.
Build-breaking defects must be resolved before anything else. This includes any defect that causes the software to fail to compile or any defect that causes a unit test to fail. The team must adopt a "drop everything and fix the build" mentality. In my own personal experience, this view is not easily accepted in the early stages of a project, but during the later stages, when there's typically a "crunch" mode, and the build isn't riddled with build-breaking defects, developers are thankful that those defects simply aren't there.
You need a release manager. While you might have many people who contribute to the build, checking in changes and adding new content, you need one person whose primary responsibility is to ensure that the software builds properly every day. That individual is also responsible for your installer, and for identifying the code that breaks the build and ensuring that it gets resolved. The release manager doesn't resolve the defects himself unless he checked in the build-breaking code, since he doesn't know anything about someone else's defect; rather, he must play the role of the hard-nosed drill sergeant ensuring that the coder who checked it in drops everything to fix the build right now. If you can't build the product, you can't ship the product, and anything else that developer might be working on is a moot point. It's an ugly, painful job, but it's crucial.
You need a dedicated build server. This machine is clean, and does nothing but build your software. This guarantees that it injects no artifacts into your final product. It runs the daily build, executes the unit tests, and sends out the notifications when the build passes or fails. It might also house archived copies of each build's source code and binaries. It must be on the network, and should be backed up regularly. The Release Manager should have access to it; no one else on the development team should.
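
To make the "remove files you aren't using prior to shipment" idea concrete, here is a minimal sketch in Python. It is not the cleanup script I actually used; the function name, the folder layout, and the idea of a "ship manifest" (a list of the relative paths that are allowed to ship) are all assumptions made up for illustration. Everything in the build folder that is not on the manifest gets deleted.

```python
from pathlib import Path


def remove_unshipped_files(build_dir: Path, ship_manifest: set[str]) -> list[str]:
    """Delete every file under build_dir that is not named in ship_manifest.

    ship_manifest holds paths relative to build_dir, written with forward
    slashes (a hypothetical convention for this sketch).
    Returns the relative paths that were removed, for the build log.
    """
    removed = []
    # sorted() materializes the listing first, so deleting while looping is safe.
    for path in sorted(build_dir.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(build_dir).as_posix()
        if rel not in ship_manifest:
            path.unlink()
            removed.append(rel)
    return removed
```

An allow-list like this errs on the side of shipping too little rather than too much: a file someone forgot to add to the manifest shows up as a missing file in testing, instead of as a stray file on the customer's machine.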

My Own Personal Release Process

It bears noting here that I've done the release process for two different companies. At one, it was for a full team of developers (about twenty of them), and the release process there was a nightmare. At that time, we couldn't get a release out in a month if we tried. So I volunteered to take on the job, and redesigned the process. It took about a week to get the process reengineered and everyone on board, but after two weeks we had daily builds working and everything was going much more smoothly.

I took many of those same principles and applied them to my new job. Clearly, some of them don't apply in a single-developer shop. But the basic principles are the same.

Source Code Control

All developers must use source code control.
All working, compilable code that does not break the build must be checked in by 3 PM every day. Code that is not checked in at this time does not make it into the daily build.
User names and passwords are required for accessing source code control.
The admin password is written on a piece of paper, sealed in an envelope, and stored in the CIO's desk. No one else has it.
Minimal rights to access the repository based on need are granted.
The main tree has the following subprojects: Build, Dev. Each tree's subprojects are mirrors of each other. Build is where the branched and pinned copies of the successful builds are. Mainline development takes place under the Dev tree.
Every file that is required to create or ship the project is included in source code control: source files, SQL scripts, Web pages, images, build scripts, unit tests, test plans, requirements documentation, etc.
Because we use SourceSafe, every weekend, during off-peak hours, regularly scheduled maintenance is performed on the repository to keep it in tip-top shape.
The repository is stored on the network. This folder is backed up incrementally every night, and fully every week.

Build Process

Every afternoon, at 3 PM, all developers must have code they want included in the build checked into the repository.
The release manager does a final verification at 3:15 to ensure that all code is checked in.
An automated script fires off the build at 3:30 PM. It does the following:
1. Cleans the build folders on the build server. This involves deleting all files and folders from the project's build folder, ensuring a clean build.
2. Gets the latest version of the software from the Dev tree in the repository.
3. Compiles the software and all of its dependencies. If the compilation fails, an email with high importance is sent to the release manager, notifying him of the failure, and the script aborts.
4. Executes the unit tests. The unit test results are output to a text log file, which is then sent to the release manager in an email.
5. Executes a cleanup batch file that ensures that any files that should not be shipped with the product are removed.
6. Creates the installer or archives the build into a ZIP file.
7. Labels the build in the repository.
8. If the build was successful, sends a "Build success" message to the release manager.
Note that step 3 may execute multiple times depending on whether you are targeting multiple platforms or releases (such as Debug and Release, or various browsers, or various OSes).
Upon receipt of a build failure email, the Release Manager reviews its contents, and identifies the offending source code. He then determines who checked that code in, and contacts that developer and asks them to resolve the defect as soon as possible.
Important: Except in the direst of circumstances, the release manager should not attempt to fix someone else's defects. He should ask the developer to fix his own defects. If the release manager takes this task on himself, he'll quickly become inundated trying to fix all the build-breaking defects, and won't have time to do his own work.
The developer resolves the build-breaking defect and checks in the change for inclusion in the next daily build.
If enough build-breaking defects were present, the Release Manager may choose to manually rebuild the software once defect corrections are checked in.
If the build is shipped to the customer, it is labeled, pinned and branched into the Build tree in the source code repository.
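
The control flow of the automated script described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the script I ran: `BuildStep`, `run_daily_build`, and the `notify` callback are hypothetical names, each step is a stand-in for the real work (getting latest from the repository, compiling, running the tests, and so on), and the sketch simplifies things by aborting on any failed step rather than only on a failed compile.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class BuildStep:
    name: str
    action: Callable[[], bool]  # returns True when the step succeeds


def run_daily_build(steps: list[BuildStep],
                    notify: Callable[[str, str], None]) -> bool:
    """Run the steps in order and stop at the first failure.

    notify(subject, body) stands in for the email to the release manager.
    """
    for step in steps:
        try:
            ok = step.action()
            detail = f"{step.name} failed"
        except Exception as exc:  # a step that crashes counts as a failed step
            ok = False
            detail = f"{step.name} raised {exc!r}"
        if not ok:
            notify("Build FAILED", detail)
            return False
    # Only reached when every step, including labeling, has succeeded.
    notify("Build success", f"{len(steps)} steps completed")
    return True


if __name__ == "__main__":
    # Placeholder steps; real ones would call the repository, compiler, etc.
    demo_steps = [
        BuildStep("clean build folder", lambda: True),
        BuildStep("get latest", lambda: True),
        BuildStep("compile", lambda: True),
        BuildStep("unit tests", lambda: True),
        BuildStep("remove unshipped files", lambda: True),
        BuildStep("create installer", lambda: True),
        BuildStep("label build", lambda: True),
    ]
    run_daily_build(demo_steps, lambda subject, body: print(f"{subject}: {body}"))
```

Keeping the steps as an ordered list makes it cheap to repeat the compile step for each target (Debug and Release, for example): you add one more entry rather than restructuring the script.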

In closing

We're a Microsoft shop. Although I've worked in Java houses, my limited experiences have largely focused on the Microsoft stack, and the process that I've outlined above is primarily geared for Microsoft Visual Studio and SourceSafe. But the basic principles should be pretty universal. You should be able to take them and apply them to just about any combination of source code repository tools, unit testing tools, and IDE (or text editor).

The primary thing to remember is this: if you can't build it reliably, predictably, and on a moment's notice, you're in trouble. When a development team knows they can't build the software, and when the testing team is sitting around for days or weeks at a time wondering when they're going to get a new release to test, morale suffers, tempers flare, and things rapidly go downhill. I've been there. I've seen it. It ain't pretty.

Every project needs a good, solid release process. I'm tempted to say that any release process is better than no release process, but that wouldn't be entirely true. A release process needs to be trim, make sense, bolster confidence in the project, and help propel the team forward towards success. That's what this process is designed to do.

I'm sure that others have some ideas on how to improve the process above. I'd love to hear those ideas. I'm sure that others have different ways of doing things. I'd love to hear that too. There is no silver bullet, and I'm not anywhere near stupid enough to think that this plan is perfect. But I hope it's enough to help someone, somewhere get a little bit closer to a project that gets out the door a bit faster, healthier, and with its developers' sanity intact.

Technorati Tags: Software, Release Engineering, Deployment, SCC, Unit Testing, Computer Science, Release Manager, Software Builds, Compile, Install

1 comment:

Anonymous said...: Developers should identify the defects in their source code in the early stage of the software development process. This may help people to save more time and money arising out of software defects. Developers can use a static analysis tool like Coverity Prevent to analyze source code and fix defects. Using Coverity Prevent, developers can fix defects like memory leaks, dangling pointers, uninitialized data, etc. Coverity Prevent is used by the Department of Homeland Security to scan many open source projects. You can get more info at http://www.Coverity.com; February 27, 2008 at 7:25 AM

Sunday, January 13, 2008
