
Wednesday, June 30, 2010

Knowing How is Not Enough

There’s an old adage that I heard once, and it’s stuck with me through the years:

He who knows how to do a thing is a good employee. He who knows why is his boss.

I’m also fond of this one:

If you can’t explain it, you don’t understand it.

So I’ve been ramping up on some technology that I’ve not had an opportunity to really use before, and I’m very excited about it. To make sure I understand it, I’ve decided to go back to the MSDN examples, reproduce them one line at a time, and then document the source code as I understand it. It’s a great way to learn, and it sheds a great deal of light on what you think is happening versus what’s actually happening.

To be perfectly honest, the technology is AJAX. Over the last few years, I’ve predominantly worked for companies that had no use for Web services, so there’s been no compelling need for it. I’m starting a new job soon that will rely heavily on Web services, and I want to make sure I understand them well before I set foot in the door. It has never been enough for me to know that you just drag a control onto a form or page, set a few properties, and press F5. To me, that degree of abstraction is a double-edged sword.

When abstraction reaches the level that it has with Microsoft AJAX, you run into some fairly significant issues when it comes time to test and debug the application. The MS AJAX framework is no small accomplishment, and it hides a lot of complexity from you. It makes it so easy to write AJAX applications that you don’t need to understand the underlying fundamentals of Asynchronous JavaScript and XML that make the whole thing work. Consequently, when things go wrong, you could very well be left scratching your head, without a clue where to begin looking.

Where, in all of this enormously layered abstraction, did something go wrong? Was it my code? Was it the compiler? Was it IIS? Was it permissions? Was it an update? Was it a configuration setting? Was it a misunderstanding of the protocol? Did the Web service go down or move? Was the proxy even generated? If it was, was it generated correctly? Do I even know what a proxy is and why I need it?!

When I started learning about AJAX, we coded simple calls against pages that could return anything to you over HTTP, using the XMLHttpRequest object. Sure, the response was supposed to be XML, but that was by convention only. The stuff I wrote back then (and I only wrote this stuff on extremely rare occasions, thank the gods) returned the smallest piece of data possible: a single field in flat text. It was enough to satisfy the business need, and it didn’t require XML DOM parsing.

But even with DOM parsing, the code to make a request and get its data back via XMLHttpRequest was a lot smaller than all the scaffolding you have to erect now. You might argue that you don’t have to write a lot of code now, but that’s an illusion. You’re not writing it; Microsoft is. Just because you don’t see it doesn’t mean it’s not there. Do you know what that code is doing?
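
To make the hidden plumbing concrete, here’s a rough sketch in C# of the kind of work a generated proxy does on your behalf. The endpoint URL is hypothetical, and a real proxy does far more (building the SOAP envelope, serializing arguments, deserializing the response), but every piece of it is code that runs whether you see it or not:

using System;
using System.IO;
using System.Net;

class RawServiceCall
{
    static void Main()
    {
        // Hypothetical endpoint; substitute any service URL you control.
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(
            "http://example.com/StockQuote.asmx/GetQuote?symbol=MSFT");
        request.Method = "GET";

        using (WebResponse response = request.GetResponse())
        using (StreamReader reader = new StreamReader(response.GetResponseStream()))
        {
            // A generated proxy would deserialize this into a typed object;
            // here we just read the raw payload so the plumbing is visible.
            Console.WriteLine(reader.ReadToEnd());
        }
    }
}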

In theory, the Why of Microsoft AJAX, or of any AJAX library, is to make our lives easier when it comes time to write dynamic Web applications that behave more like desktop applications. To a certain degree, they have. When they work. But when they don’t, I wonder if the enormous degree of abstraction they’ve introduced hasn’t dumbed us down to the point where we’ve discarded essential knowledge we ought to have.

If you’re going to write Web services, or consume them, you should, at a minimum, understand what they are and how they work. You should understand their history, how they evolved, and the problem that AJAX tries to solve. It’s not enough to know how to write a Web service; you have to know why you’re doing it, and why you’re doing it the way you are. That sort of knowledge can be crucial in making the right choices about algorithms, protocols, frameworks, caching, security, and so on.

But this could be true of any technology or practice we learn: AJAX, LINQ, design patterns, TDD, continuous integration, pair programming, and so on. Know why.

Try this simple litmus test: explain something you think you know to one of your peers. If you can’t explain it clearly without having to pull out a reference or go online, you don’t understand it as well as you thought you did. Consider relearning it. It’ll only improve your value to yourself, your peers, and your employer.

Tuesday, July 14, 2009

On Legacy Software Maintenance

In the mad, mad, mad, mad world of software development, we are faced with the trying task of maintaining legacy systems. In an ideal world, that wouldn't be the case. We'd all be developing brand new systems from the ground up, writing ground-breaking code that no one has ever seen before, without the hassles that arise when you have to worry about things like backwards compatibility and maximum system uptime.

But this isn't an ideal world. The vast majority of us don't have the luxury of developing completely new systems. Instead, our lives are fraught with the perils of correcting defects and adding new features to systems that have been around for years and, occasionally, decades. Those systems have usually passed through a number of hands, and they tend to be poorly documented. They sometimes have sprawling feature sets, support technologies that have long since fallen by the wayside, are bloated with code that doesn't appear to be invoked by anyone, and are riddled with obscure and seemingly nonsensical comments.

Your job, as a maintenance developer, is to massage that seemingly horrific beast into a thing of beauty. Real people doing real jobs depend on it to get their work done in a timely manner. As much as we might loathe an ancient codebase, the language it was written in, or the tools we have to use to get the job done, the truth is, a legacy application is maintained for a reason: it has intrinsic value to the bottom line of the business. When the folks who depend on that system can get their jobs done on time, in a productive manner, they can continue to draw a decent paycheck. That means that they can continue to pay the rent, put food on the table for their families, afford healthcare, and all those other essentials.

So tread lightly when you delve into the code of a legacy system. It's far more important than you think. We take it for granted that it's a dirty, loathsome job, that someone has to do it, and that we just happen to be the unlucky bastard who drew the short straw. Not so: you happen to be the lucky one who drew that straw. People depend on you to help them keep their families safe, warm, and well-fed.

My point isn't that every legacy application is manna from heaven. My point is that legacy applications exist for a reason, that they're maintained for a reason. They have long, varied histories for a reason. They have endured because they have value; they've grown beyond their original specification because the company sees real value in them, and doesn't want to lose that investment. The problem for you, as a developer, is in ensuring that you do not destroy that investment.

When we are first introduced to a legacy system, we have a tendency to look at the source code and view it as though it were written by a blind, deaf quadriplegic with Tourette's syndrome. No one in their right mind would have written a system that way. What could they possibly have been thinking? You certainly wouldn't have! I certainly wouldn't have.

But then, over time, we start to learn things about it. The code is old; very old. There's been a high turnover rate, and the code has passed through lots of hands. The companies that published third-party components went out of business when the dot-com bubble burst. They used to use Novell for security, and then switched to Active Directory. When this thing was released, Windows 95 was still popular. They upgraded the database about two years ago, and had to make some emergency revisions to the code.

There are reasons that things like this happen in a legacy system that's old enough to qualify for Medicare. Many of those reasons are valid, and many of them are the product of a lack of time and budget. Sometimes, sadly, it's a result of a lack of skilled developers (but that's something for someone else to blog about). The point, in short, is that systems grow from an original vision into large, cumbersome, bloated systems because developers respond to the needs and demands of the business.

Now, here you are, present day, and you're tasked with maintaining that source code. You have two primary responsibilities: (1) fix the defects, and (2) add new features. Keep in mind that while you are doing both, you must not at any point break backwards compatibility or bring down the system. People rely on this system. It's the bread and butter (or part of it) of the business. And it's your baby now.

It is absolutely crucial that you treat legacy software with its due gravity. If you view it like it's some recurring annoyance, stop that. If you leap to hasty conclusions about the cause of problems in the system, stop that immediately. This is a system that many people rely on. Get it into your head that you need to treat this thing delicately, as if it were your pride and joy. Once you fix a defect, once you put a new feature into it, your name is associated with that software. Take pride in it. Do it right. Take your time.

Over time, the legacy system stops being the horrific beast associated with all those who preceded you. It becomes the creature that you have molded it into. And then, people will associate it with you, for better or worse.

Monday, July 13, 2009

The Sad State of SQL Refactoring

I love refactoring tools.

The ability to select a variable in code, right-click on it, choose the Rename command from a context menu, and then safely rename that variable in every location in which it appears is a boon to my programming productivity. Similarly, the ability to take a large block of complex logic and extract it to its own method is really handy. Or, the ability to reorder parameters, and know that every piece of code that calls that method will be updated appropriately saves me tons of time and improves my confidence in the quality of the code.

When you refactor code of any kind, you enhance its readability without changing the way it works. But refactoring by hand is frequently difficult, tedious, and error-prone. That's why refactoring tools exist, and that's why the essential service they provide is important: they allow us to improve the maintainability of code quickly and efficiently, hopefully reducing maintenance costs.
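
A trivial sketch of why hand refactoring (or naive tooling) is dangerous, in C# with made-up identifiers: a plain text rename happily rewrites partial matches and string literals that a real refactoring engine, working on parsed code rather than text, would leave alone.

using System;
using System.Text.RegularExpressions;

class NaiveRename
{
    static void Main()
    {
        string code = "int cost = 5; int costly = cost * 2; Log(\"cost\");";

        // Naive replace: also mangles 'costly' and the string literal.
        Console.WriteLine(code.Replace("cost", "price"));

        // A word-boundary regex spares 'costly' but still rewrites the
        // literal; only a parser knows which tokens are identifiers.
        Console.WriteLine(Regex.Replace(code, @"\bcost\b", "price"));
    }
}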

Enter SQL languages.

In theory, there is a standard for SQL: the ANSI SQL-92 standard. One would like to think that it would be a simple matter to create refactoring tools for any SQL dialect based on that standard. One would be wrong. You can't use the standard as the sole basis of a refactoring tool.

Any given database vendor strives to make their product unique, to stand out from the crowd. And so they don't entirely conform to the standard; they have additional features that separate them from each other. For example: Oracle organizes functions and procedures into packages. Microsoft does not. Microsoft allows a bit data type on columns. Oracle does not. And, although we're all loath to think of Access as a real database, Access provides a Boolean data type that you can use in columns, while Oracle does not.

Now, let's also talk about legacy support. Oracle has been around for a very long time. Version 2 came out in 1979, from what I gather. Their legacy join syntax looks nothing like the ANSI standard (and that's not a criticism, just a simple statement of fact):

SELECT C.CustomerID, Company, OrderID
FROM Customers C, Orders O
WHERE C.CustomerID (+) = O.CustomerID
UNION
SELECT C.CustomerID, Company, OrderID
FROM Customers C, Orders O
WHERE C.CustomerID = O.CustomerID (+)

In ANSI SQL, this is:

SELECT C.CustomerID, Company, OrderID
FROM Customers C
FULL OUTER JOIN Orders O ON (C.CustomerID = O.CustomerID)

Now, we could add in all the different SQL variants that we know are out there:

  • ANSI standard
  • Interbase/Firebird
  • IBM SQL
  • Microsoft Transact SQL
  • MySQL
  • Oracle PL/SQL
  • PostgreSQL
  • Access
  • FoxPro

...and on and on and on, but you get the point. There are a lot of SQL variants out there. And they all aim to accomplish the same things:

  • Create a data store.
  • Occasionally, modify the structure of the data store.
  • Get data out of a data store.
  • Put data into a data store.
  • If supported, execute code within the confines of the data store and (optionally) return a result.

Behold, SQL in a nutshell.

Over the lifetimes of these various products, they have added features that they must now maintain for legacy support. Somewhere out there is a business that absolutely must have that feature in place, or its whole process will come crashing down. Don't you dare remove it. It doesn't matter that there are better features (likely based on a standard); there are applications out there for which the source code no longer exists, or which no one understands, and which everyone is too afraid to touch.

Now, all these reasons have led us to a scenario where we have vastly different implementations of SQL. Sure, they share a lot in common, but they also have radically different feature sets and syntax. And that situation, in and of itself, has led us to one frightening and sad conclusion:

We will likely never have a tool that is able to connect to any database and be able to correctly refactor its SQL.

And that's just a damned shame, because there's a lot of SQL out there: stored procedures, functions, views, triggers, even inline SQL in applications, and all that other jazz. But the amount of work it would take to build a refactoring tool that could recognize any variant of SQL, correctly parse it, refactor it, and not hose the code is enormous.

It's a pity, really. I could see tons of use for a tool like this. In my own office, we work with SQL Server, Oracle, and Atomix databases. What I wouldn't give for a tool that could refactor SQL to enhance its readability without changing the way it worked.

And who knows? Down the road, we may be working with something else.

But this is the world we live in. If we want to refactor SQL, it looks like we'll have to settle for separate refactoring tools for each dialect, each with its own quirks. That might be good or bad, but it's likely the best we can hope for, for now.


Saturday, July 11, 2009

Defects and the Scientific Method

Let's leap back a few years to that frightening time in junior high school. Remember those years? September, possibly October. It was still hot, and the air conditioning didn't work. Your science teacher stood at the back of the room with an overhead projector, a slide up for you to copy down. On it, he had this information:

THE SCIENTIFIC METHOD

  • Ask and define the question.
  • Gather information and resources through observation.
  • Form a hypothesis.
  • Perform one or more experiments and collect and sort data.
  • Analyze the data.
  • Interpret the data and make conclusions that point to a hypothesis.
  • Formulate a "final" or "finished" hypothesis.
Ah, remember those days? Remember how boring it was? Remember thinking to yourself, "I'll never use this"?

Well, as a software developer who is tasked with maintaining software that is virtually guaranteed to contain defects, you can be certain that you need to be intimately familiar with The Scientific Method.

The Scientific Method provides a clear roadmap for defect isolation. In fact, anyone who has any real experience isolating defects without disturbing the rest of the system has (whether he's aware of it or not) used the Scientific Method to do so. Here's how it breaks down:

1. Ask and define the question. The software should behave in this manner, but it does not. What is the cause of this problem, and how do we fix it?
2. Gather information and resources through observation. In a controlled environment that mimics production as closely as possible, reproduce the defect. If possible, step through the code and observe its behavior.
3. Form a hypothesis. The defect is caused by this behavior in the system (or by the behavior of this external system).
4. Perform one or more experiments and collect and sort data. Implement a code fix; attempt to reproduce the defect using the fixed code. Observe the results.
5. Analyze the data. Did the code fix have the desired effect? If so, how?
6. Interpret the data and make conclusions that point to a hypothesis. Was the code that was modified the cause of the defect, or was it merely a symptom of an underlying problem requiring further resolution?
7. Formulate a "final" or "finished" hypothesis. If the defect is fully repaired, check all code into the repository. Otherwise, continue the analysis until you have rooted out the underlying cause of the defect.

Simply put, there's no guesswork in defect resolution. It is a rational, thinking process, much like a game of Sudoku. If you approach any defect and just yank an answer out of thin air, You're Doing It Wrong.

Instant answers to defects are a dangerous game. Your first, instinctive answer to any problem is likely to be wrong, and the chances of that only rise as your code base grows. As your product gains features, you'll want to take greater care to disturb absolutely nothing outside of the defect you're trying to correct. So take some advice: keep your grubby fingers to yourself. Touch only the code in the defect domain. The best way to do that is to have a plan for defect resolution, and I strongly encourage you to apply The Scientific Method.

Developing software is a task for those who can think. It is not a task for the simple-minded, the lazy, or the inattentive. You have to be willing to pay attention to the details, and to invest the time it takes to hunt down a defect in painstaking detail to get to the root of a problem.

A good software developer knows the difference between a symptom and a disease, and how that correlates to software defects. Sure, you have a NullReferenceException, and your code is missing a handler for that. But is the problem the missing exception handler, or is the problem that a null somehow got into a table that should never have had it, or that a stored procedure in the database returned nulls when they were never expected? Which one is the symptom? Which is the disease? Make sure you're fixing the right defect. Don't just prescribe aspirin when the software needs invasive surgery. To find that out, you need to think critically. You need to apply the Scientific Method.
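
To make the distinction concrete, here's a contrived C# sketch; the DataRow column name is hypothetical. The first method medicates the symptom. The second assumes the disease (nulls escaping the database) has been cured at its source, and fails loudly if it ever recurs:

using System;
using System.Data;

class OrderReport
{
    // Symptom-level patch: swallow the null and move on. The report stops
    // crashing, but bad data is still flowing out of the database.
    static string CustomerNamePatched(DataRow row)
    {
        return row.IsNull("CustomerName")
            ? string.Empty
            : (string)row["CustomerName"];
    }

    // Disease-level fix: the stored procedure or table constraint is
    // corrected so CustomerName can never be null, and this code states
    // its real expectation instead of hiding the problem.
    static string CustomerName(DataRow row)
    {
        if (row.IsNull("CustomerName"))
            throw new InvalidOperationException(
                "CustomerName is null; the upstream fix didn't hold.");
        return (string)row["CustomerName"];
    }
}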

Thursday, March 13, 2008

NValidate: Misunderstood from the Outset

Occasionally, I will post questions about the design or feature set of NValidate on Google Newsgroups. More recently, I posted a question about it to LinkedIn. Almost immediately, I got this response:

I'd suggest looking at the Validation Application Block portion of the Enterprise Library from the Microsoft Patterns and Practices group.

Now, I'm not belittling the response, because it's perfectly valid, and the Validation Application Block attempts to solve essentially the same problem. But when I talk about NValidate, which I find myself doing a lot as I interview for jobs (it's listed on my résumé), people often ask me questions like these:

1. How is that any different from the Validator controls in ASP.NET?
2. Why don't you just use the Validation Application Block?
3. Why didn't you go with attributes instead?
4. Why didn't you use interfaces in the design?
5. Why not just use assertions instead of throwing exceptions?

These days, I find myself answering these questions with alarming frequency. It occurs to me that I should probably get around to answering them, so I'm going to address them here and now.

It helps, before starting, to understand the problem that NValidate is trying to solve: most programmers don't write consistent, correct parameter validation code because it's tedious, boring, and a pain in the neck. We'd rather be working on something else (like the business logic). Writing parameter validation code is just too much of a chore. NValidate tries to solve that problem by making it as easy as possible, with a minimal amount of overhead.

Q. How is NValidate any different from the Validator controls in ASP.NET?

A. The Validator controls in ASP.NET can only be used on pages. But what if I'm designing a class library? Isn't it vitally important that I test the parameters on my public interface to ensure that the caller passes me valid arguments? If I don't, my code is going to fail spectacularly, and not in a pretty way. You can't use the Validator controls (RangeValidator, CompareValidator, and so on) in a class library that's intended to be invoked from your Web application.

Q. Why don't you just use the Validation Application Block?

A. This one's pretty easy to answer. NValidate is designed to accommodate lazy programmers (like me).

Here's the theory that essentially drives the design of NValidate: Developers don't write parameter validation code with any sort of consistency because it's a pain in the neck to write it, and because we're in a big hurry to get to the business logic (the meat and potatoes of the software). Let's face it: if the first chunk of the code has to be two to twenty lines of you checking parameters and throwing exceptions, and doing it all over the place, you'd get tired of doing it, too. Especially if that code is extremely repetitive.

if (null == foo) throw new ArgumentNullException("foo");
if (string.Empty == foo) throw new ArgumentException("foo cannot be empty.");
if (foo.Length != 5) throw new ArgumentException("foo must be 5 characters.");

We hate writing this stuff. So we skip it, thinking we'll come back to it later and write it. But it never gets done, because we get all wrapped up in the business logic, and we simply forget. Then we're fixing bugs, going to meetings, putting out fires, reading blogs, and it gets overlooked. And the root cause is that it's tedious and boring.

I'm not making this up, folks. I've talked to lots of other developers, and they've all admitted (however reluctantly) that it's pretty much the truth. We're all guilty of it. Bugs creep in because we fail to erect that impenetrable wall that prevents invalid parameter values from slipping through. Then we have to go in after the fact, once we've got egg on our faces, and add the code at increased cost.

So, if you want to make sure that developers will write the parameter validation code, or are at least more likely to do it, you have to make it as easy as possible. That means writing as little code as possible.

Now, if we look at the code sample provided by Microsoft on their page for the Validation Application Block, we see this:

using Microsoft.Practices.EnterpriseLibrary.Validation;
using Microsoft.Practices.EnterpriseLibrary.Validation.Validators;

public class Customer
{
    [StringLengthValidator(0, 20)]
    public string CustomerName;
    public Customer(string customerName)
    {
        this.CustomerName = customerName;
    }
}

public class MyExample
{
    public static void Main()
    {
        Customer myCustomer = new Customer("A name that is too long");
        ValidationResults r = Validation.Validate<Customer>(myCustomer);
        if (!r.IsValid)
        {
            throw new InvalidOperationException("Validation error found.");
        }
    }
}

A few things worth noting:

1. You have to import two namespaces.
2. You have to apply a separate attribute for each test.
3. In your code that invokes the test, you need to do the following:
  1. Declare a ValidationResults variable.
  2. Call the static Validation.Validate method and assign its result to that variable.
  3. Potentially do a cast.
  4. Check the IsValid property on your ValidationResults variable.
  5. If IsValid returned false, take the appropriate action.

That's a lot of work. If you're trying to get lazy programmers to rigorously validate parameters, that's not going to encourage them a whole lot.

On the other hand, this is the same sample, done in NValidate:

using NValidate.Framework;

public class Customer
{
    public string CustomerName;
    public Customer(string customerName)
    {
        Demand.That(customerName, "customerName").HasLength(0, 20);
        this.CustomerName = customerName;
    }
}

public class MyExample
{
    public static void Main()
    {
        try
        {
            Customer myCustomer = new Customer("A name that is too long");
        }
        catch (ArgumentException)
        {
            throw new InvalidOperationException("Validation error found.");
        }
    }
}

A few things worth noting:

1. You only have to import one namespace.
2. In the constructor, you simply Demand.That your parameter is valid.
3. In your code that invokes the test, you need to do the following:
  1. Wrap the code in a try...catch block.
  2. Catch the exception and handle it, if appropriate.

See the difference? You don't have to write a lot of code to validate the parameter, and your clients don't have to write a lot of code to use your class, either.

Q. Why didn't you go with attributes instead?

A. I considered attributes in the original design of NValidate. But I ruled them out for a number of reasons:

1. Using them would have meant introducing a run-time dependency on reflection. While reflection isn't horrendously slow, it is slower than direct method invocation, and I wanted NValidate to be as fast as possible.
2. I wanted the learning curve for adoption to be as small as possible. I modeled the public interface for NValidate after a product I thought was pretty well known: NUnit. You'll note that Demand.That(param, paramName).IsNotNull() is remarkably similar to NUnit's Assert.IsNotNull(someTestCondition) syntax.
3. In NValidate, readability and performance are king. Consequently, it uses a fluent interface that allows you to chain the tests together, like so:

   Demand.That(foo, "foo").IsNotNull().HasLength(5).Matches("\\d{5}");

   This is a performance optimization that results in fewer objects created at runtime. It also allows you to do the tests in a smaller vertical space.

My concerns about attributes and reflection may not seem readily apparent until you consider the following: it's conceivable (in theory) that zealous developers could begin validating parameters in every frame of the stack. If the stack frame is sufficiently deep, the costs of invoking reflection to parse the metadata begin to add up. It may not seem significant yet, but consider the scenario where any one of those methods is recursive; perhaps it walks a binary tree, a DOM object, an XML document, or a directory containing lots of files and folders. When that happens, the cost of reflection can become prohibitively expensive.

In my book, that's simply not acceptable. And since, as a framework developer, I cannot predict or constrain where a user might invoke these methods, I must endeavor to make them as fast as possible. In other words: take the parameter information, create the appropriately typed validator, execute the test, and get the hell out as quickly as possible. Avoid any additional overhead at all costs.
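
The worry is easy to demonstrate. Here's a crude micro-benchmark sketch in C# (the class is illustrative, not part of NValidate; absolute numbers will vary by machine) comparing a direct call with the same call made through reflection:

using System;
using System.Diagnostics;
using System.Reflection;

class ReflectionCost
{
    public void Check(string s)
    {
        if (s == null) throw new ArgumentNullException("s");
    }

    static void Main()
    {
        ReflectionCost target = new ReflectionCost();
        MethodInfo method = typeof(ReflectionCost).GetMethod("Check");
        const int iterations = 1000000;

        Stopwatch direct = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            target.Check("x");
        direct.Stop();

        Stopwatch reflected = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            method.Invoke(target, new object[] { "x" });
        reflected.Stop();

        // Expect the reflected loop to be dramatically slower.
        Console.WriteLine("Direct:    {0} ms", direct.ElapsedMilliseconds);
        Console.WriteLine("Reflected: {0} ms", reflected.ElapsedMilliseconds);
    }
}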

Q. Why didn't you use interfaces in the design?

A. I go back and forth over this one all the time, and I keep coming back to the same answer: interfaces would tie my hands.

Let's assume, for a moment, that we published NValidate using nothing but interfaces. Then, in a subsequent release, we decide we want to add new tests. Now we have a problem: we can't extend the interfaces without breaking the contract with clients who are built against NValidate. Sure, they'll likely have to recompile anyway; but if I add new methods to interfaces, they might have to recompile lots of assemblies. That's something I'd rather not force them to do.

Abstract base classes, on the other hand, allow me to extend classes and add new tests and new strongly typed validators fairly easily. Further, they eliminate casting (because that's handled by the factory). If the system used interfaces instead, some methods would return references to an interface, some would return references to strongly typed validators, and some casting would have to be done at the point of call. I want to eliminate manual casting wherever I can, to keep that call to Demand.That as clean as possible: the cleaner it is, the more likely someone is to use it, because it's easy to do.
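
For the curious, here's a minimal sketch of the shape I'm describing. The names are illustrative, not NValidate's actual internals: a factory hands back a strongly typed validator built on an abstract base, and each test returns the validator itself, so calls chain on a single object with no casting at the point of call.

using System;

public abstract class Validator<T>
{
    protected readonly T Value;
    protected readonly string ParamName;

    protected Validator(T value, string paramName)
    {
        Value = value;
        ParamName = paramName;
    }
}

public class StringValidator : Validator<string>
{
    public StringValidator(string value, string paramName)
        : base(value, paramName) { }

    // Each test returns 'this', so tests chain on one object.
    public StringValidator IsNotNull()
    {
        if (Value == null)
            throw new ArgumentNullException(ParamName);
        return this;
    }

    public StringValidator HasLength(int length)
    {
        if (Value.Length != length)
            throw new ArgumentException(
                string.Format("{0} must be {1} characters.", ParamName, length));
        return this;
    }
}

public static class Demand
{
    // The factory is where new strongly typed validators get added; the
    // base classes can grow without breaking callers, and callers never cast.
    public static StringValidator That(string value, string paramName)
    {
        return new StringValidator(value, paramName);
    }
}

With that shape, Demand.That(foo, "foo").IsNotNull().HasLength(5) reads like a sentence, and adding a new test is just one new method on a class the framework author controls.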

Q. Why not just use assertions instead of throwing exceptions?

A. This should be fairly obvious: assertions don't survive into the release version of your software. Additionally, they don't work as you'd expect in a Web application, and rightly so: they'd kill the ASP.NET worker process and abort every session connected to it. (For a truly educational experience, set up a test Web server and issue a Visual Basic Stop statement from a DLL in your Web app. You'll kill the worker process, and it will be restarted on the next request. Nifty.)

Wisdom teaches us that the best laid plans of mice and men frequently fail. Your most thorough testing will miss some parts of your code. The chances of achieving 100% code coverage are pretty remote; if you do it with a high degree of frequency, I'm duly impressed (and I'd like to submit my résumé). But for the rest of us, we know that some code never gets executed during testing, and some code gets executed but not under the precise conditions that might reveal a subtle defect. That's why you want to leave those checks in the code. Yes, it's additional overhead. But wouldn't you rather know?
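
The contrast is stark in C#. Debug.Assert is marked [Conditional("DEBUG")], so it is compiled away in a Release build, while an explicit check survives into production, which is exactly where those unexercised code paths finally get exercised:

using System;
using System.Diagnostics;

class Guards
{
    static void ProcessWithAssert(string input)
    {
        // Stripped from Release builds: no protection in production.
        Debug.Assert(input != null, "input must not be null");
        // ... work with input ...
    }

    static void ProcessWithCheck(string input)
    {
        // Survives into Release and fails loudly at the point of the bug.
        if (input == null)
            throw new ArgumentNullException("input");
        // ... work with input ...
    }
}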

In Summary

Sure, these are tradeoffs in the design. But let's keep in mind who I'm targeting here: lazy programmers who are typically disinclined to write lots of code to validate their parameters. The idea is to make it so easy that they're more likely to do it. In this case, less code hopefully leads to more validation, which (I hope) leads to fewer defects and higher-quality software.