Friday, October 30, 2009

Oracle SQL Coding Standards, Or the Dearth of Them

So here I am, laboring furiously to get my refactoring tools polished up for Insight, and the one feature that I would really like to get implemented is a spiffy coding-standards compliance tool for SQL. One would think that it would be a simple matter of digging up the most widely accepted standard, allowing the user to modify that standard to suit their firm’s needs, and then applying it to the script in question. And that isn’t really rocket science. Except for one thing.

There isn’t really one widely-accepted PL/SQL coding standard.

Now, I’m not going to just come out and say that Oracle is horrible at documentation. No, wait, I am going to say that Oracle is horrible at documentation. Aside from the fact that the vast majority of it is nearly incomprehensible, incomplete, and badly formatted, they tend to gloss over things that concern the really important folks: those who have to use the damned thing.

A coding standard is a really important thing. Without it, you end up with ten-year-old databases that have passed through a plethora of hands and no two sets of hands agree about how the code should be laid out, how the variables and objects should be named, how things should be cased, and where commas should be placed. Sounds trivial, I know. But when you have to come in ten years later and try to reconcile all those competing non-standards into a single, cohesive “product” that makes sense so that you can maintain it, it’s a nightmare.
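To make that concrete: even a trivial mechanical checker can flag the kind of drift I’m talking about. Here’s a rough Python sketch (the rules and names here are mine, purely for illustration); a real compliance tool would load the firm’s own rule set:

```python
import re

# A rough sketch of a SQL standards checker. The rules here (uppercase
# keywords, no trailing whitespace) are purely illustrative; a real tool
# would load the firm's own rule set.
KEYWORDS = {"select", "from", "where", "group", "order", "by", "join", "on"}

def check_line(lineno, line):
    """Return (lineno, message) tuples for violations on one line."""
    violations = []
    if line != line.rstrip():
        violations.append((lineno, "trailing whitespace"))
    for word in re.findall(r"[A-Za-z_]+", line):
        if word.lower() in KEYWORDS and word != word.upper():
            violations.append((lineno, f"keyword '{word}' should be uppercase"))
    return violations

def check_script(sql):
    violations = []
    for lineno, line in enumerate(sql.splitlines(), start=1):
        violations.extend(check_line(lineno, line))
    return violations

# Flags line 2 twice: trailing whitespace, and a lowercase 'from'.
for lineno, message in check_script("SELECT owner\nfrom all_tables \nWHERE rownum < 10"):
    print(f"line {lineno}: {message}")
```

Casing and whitespace are the easy part; comma placement and layout rules would need a real parser, which is exactly where the lack of a single standard starts to hurt.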

We need a driving vision for a SQL coding standard, and we need it from a single source with some authority behind it. But it can’t be some heavy-handed decree that hasn’t been thought through. As much as I hate to admit it, Oracle has been doing this a long time, and they’re the logical choice to start this movement, with lots of input from the community.

On the other hand, this could be an entirely community-driven thing, a grassroots effort to establish a common SQL coding standard. On his blog, Lewis Cunningham discusses such a project, but it doesn’t appear to have taken off. It’s a pity, because I think it’d be well worth the effort.

Who knows, maybe I’ll slide on over to Google Apps and see what I can do to start such a grassroots effort myself. I’m tired of sitting around waiting for a consensus to just pop up out of thin air. You’ve got to start somewhere, right?

Wednesday, October 21, 2009

Smug Bastards, Toy Databases, and Ease of Use

Alarmingly, I cannot find anything in this post that I disagree with. I don't know whether that makes me proud, smug, or ashamed. Whatever the case, I laughed my ass off.

Wednesday, October 14, 2009

The Cost of a Quick Second

So, here I am, cleaning up a rather large code base, and I find myself in the throes of retrofitting properties with attributes. Now, when I wrote these properties, the prevalent thought at the time was that I had a crap-load of properties to write and tons of business code that needed them. So, in a big hurry, I slapped together some C# code that spit them out. You know, kind of like this:

public string Owner { get; protected set; }

Now, at the time, that’s all well and good, because at some point, the work just has to get done. But then I realized that the software I’ve been working on is actually very useful, and that I want to polish it up and get it ready to release to the masses. And I discovered, to my utter horror, that the source code is an absolute disaster because, well, I didn’t write it with the intention that anyone else would ever actually look at it.


Sure, the thing looks beautiful from the outside. And it does what it’s supposed to do (by and large). And it’s fast. Very fast. But please, please do not look at the plumbing. Or the wiring. The vinyl siding is Fabulous. But don’t look underneath it.


Hence, the retrofitting.


I’m sure you’ve been there. Don’t look at me like that. I know we’ve all written crap code slapped together in a hurry because we needed something just this second and it will only take a second to do.


Which leads me to my point: A quick second now will cost you hours later.


This particular project involves two separate assemblies: a Class Library DLL, which houses all the logic, and a Windows Forms application that references it. The problem is that the class library is missing the bulk of its XML documentation comments, and the properties aren’t properly marked with attributes. Silly me.


So now, I have to go back through the code, one file at a time, and mark up my source code. Trust me, this is no picnic. Fortunately, I believe in good, clean names, so the code is largely self-documenting. Still, there have been a few surprises here and there. Going back through any source file and adding the comments after the fact is slow and tedious work. You shouldn’t put yourself in a position to do it in the first place. Comment up front, when the purpose of the property or method is fresh in your mind.


When it comes to property attributes, you’d be amazed at how many there are:



  • DefaultProperty. This attribute doesn’t go on a property, but marks up the class to indicate the name of the property that should be treated as the default.
  • ReadOnly. Use this attribute to specify that a property is read-only in the property grid.
  • Description. Use this attribute to specify the description that appears in the property grid, or in the tooltip that appears when the user hovers over the property in the editor.
  • Category. Use this attribute to specify the category that the property is grouped under when the property appears in the property grid.

These are the few that I’m primarily interested in. So, for each property, I have to mark things up as follows:

/// <summary>
/// Gets the owner of the view.
/// </summary>
[ReadOnly(true), Description("Gets the owner of the view."), Category("Schema")]
public string Owner { get; protected set; }

This gets a lot more complicated if I’m documenting a method, as you can imagine. Then we’re dealing with parameters, return values, and all that other hoopla, because the consumer isn’t familiar with the code like I am.


Refactoring tools can simplify some of this stuff. Some software tools boast the ability to generate documentation comments for you, but my experience with them has been less than rosy.


So take some advice. Spare yourself a little effort down the road. Document and mark up your code as you go along. It’s worth the time to do it up front. I have to go through this code base and get this stuff cleaned up now. The only good thing I can say about it is that it’s giving me a really good chance to go through the code with a fine-toothed comb and clean it up, remove dead code (oh, for an open-source tool for that!), and eliminate some redundancies. Aside from that, I’ll be kicking myself in the future for taking hasty shortcuts like these.


You should be too.

Saturday, August 1, 2009

The Perfect Villain

I play City of Heroes quite a bit. Okay, so that's an understatement. I play it a lot. After five years in the game, my addiction runs so deep it's threaded its way through my genetic makeup. You'd be hard-pressed to separate me from it.

I've always had a love of comic books. What's not to love? Sure, they're the domain of geeks, nerds, and all their ilk. But let's face it: who wouldn't love to fly? To have virtually god-like power at their disposal? To have a body so ripped you wouldn't think twice about showing up in public in a spandex body suit?

But one thing has started to really bug me about the villains in my game of choice. Now, I laud the writers for COH. As the game has aged, the stories have grown more compelling, more twisted, more convoluted. Sure, we have some dangling plot threads, gaping plot holes, and all that jazz, but they've fleshed out some of the characters quite nicely. In a world full of super-powered beings, that's frequently no mean feat. It's not easy fleshing out characters in a story about normal humans; try fleshing out a character who can incinerate you with a look, or teleport across vast distances, or summon the dead--and justify their reasons for doing what they do. That's a lot easier said than done.

The budding writer in me understands their desire to do just that. I think that the armchair psychiatrist in all of us would tend to agree that any villain who embarks on a life of crime does so for a reason. Greed, revenge, betrayal, power, lust, all the standard reasons are there.

But I've watched those foundation villains in City of Heroes for some time now, and I've discovered something about them that disturbs me.

They are all too distracted.

Give me any villain from the COH Universe and I can likely show you how that villain is too distracted. They all lack focus. A clear vision of what it is that motivates them. What Paragon City, indeed the COH Universe, needs is a villain who is so absolutely single-minded that he (or she) is utterly unstoppable and incapable of being distracted.

Imagine, if you will, a villain whom you never see. He is adept at keeping his identity and location concealed from everyone, including those with psychic and magical abilities. He does not want money, power, fame, or legions of followers. What he does want is to watch people suffer. He does not care, particularly, how he does it. He just cares that the world is filled with pain, anguish, misery, and death, and that it was his hand that brought it to pass.

He is not a god. He is not inhuman. He is not even super-powered. He is simply intelligent, and possessed of a single-minded determination to bring mass chaos to the world around him. He loathes the other villains because they are weakened by the petty squabbles they engage in between themselves. He will kill heroes, villains and civilians with merciless abandon.

While he has no followers of his own, there are those who have heard of him, and rally to his cause. These copy-cats seek to emulate him. They spread his anarchist agenda with a ruthlessness on par with his own. They could be anyone: civilians, members of known factions, heroes teetering on the brink of villainy. But none of them knows who he is or has ever seen him.

The only reason anyone knows he exists is that he occasionally forewarns people of upcoming tragedies he will inflict, only to incite more terror. His mastery of misdirection will sometimes lead heroes to one site in an attempt to avert disaster, only for them to realize that his real target was another heavily populated site entirely.

His targets of choice are always large population centers: hospitals, police departments, large office buildings, schools, and so on. Given any choice where he can inflict damage, he will inflict as much damage as is humanly possible.

Yes, this is the kind of villain that the COH Universe needs. A Black Mask, completely unpredictable, untraceable, never brought to justice, enemy to both hero and villain, and incapable of being distracted from the one thing that he seeks the most: the pain and suffering of others.

I highly doubt that this sort of a villain could be brought to pass and maintain the T for Teen rating that the game enjoys in its current incarnation. But one could dream. And if you're going to dream, dream big.


Thursday, July 16, 2009

InternalsVisibleTo and Chasing Down Public Keys

Sometimes, just getting assemblies to cooperate the way you want them to is a real pain in the neck.

Today, I wanted one assembly (which we’ll call Bar) to be able to access the internal (Friend, for all us VB geeks) members of another assembly (which we’ll call Foo). Now, the documented way to achieve this is by adding the InternalsVisibleTo attribute to your project. The code sample on MSDN looks like this:

[assembly:InternalsVisibleTo("AssemblyB, PublicKey=32ab4ba45e0a69a1")]

For a Visual Basic application, that statement goes into the AssemblyInfo.vb file, and looks like this:

<Assembly:InternalsVisibleTo("AssemblyB, PublicKey=32ab4ba45e0a69a1")>

No big deal. It’s not rocket science. You just need a strong name. To generate one, you fire up the Visual Studio Command Prompt and run the Strong Name tool (sn.exe) as follows:


C:\Program Files\Microsoft Visual Studio 9.0\VC>sn.exe -k bar.snk

Microsoft (R) .NET Framework Strong Name Utility  Version 3.5.30729.1
Copyright (c) Microsoft Corporation.  All rights reserved.

Key pair written to bar.snk

As you can see, this creates a .SNK file, which contains a public and a private key. However, its contents are not human readable. Further, the InternalsVisibleTo attribute requires that you provide the public key in the constructor. (Don’t even think about trying it without it.)


Here’s where things get tricky: the code samples on MSDN do not provide a public key to the constructor; they provide a public key token. There’s a huge difference between the two, and it’s very misleading. A public key token is much shorter than a public key; the former is only 16 characters long, the latter well in excess of 300. If you rely on the code sample from MSDN to get you where you want to be, you’ll be pulling your hair out in no time.
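The gulf between the two is more than length, too. If memory serves, the runtime derives the token from the key: it’s the last eight bytes of the SHA-1 hash of the public key blob, displayed in reverse byte order. A quick Python sketch (the input blob here is a made-up fragment, just to show the shape):

```python
import hashlib

# The token is not a short key; it is derived from the full public key:
# the last eight bytes of the key blob's SHA-1 hash, in reverse byte order.
def public_key_token(public_key_hex):
    blob = bytes.fromhex(public_key_hex)
    digest = hashlib.sha1(blob).digest()
    return digest[-8:][::-1].hex()

# Any key blob yields a 16-hex-character token, deterministically.
# (This input is a made-up fragment, not a real key.)
token = public_key_token("00240000048000009400000006020000")
print(token, len(token))
```

Which is precisely why pasting the 16-character token into the PublicKey= portion can never work: the runtime can’t reconstruct a key from its hash.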


But how do you get the public key?


You can retrieve the public key as follows:


sn -p bar.snk barpublic.snk


This creates a new file that contains only the public key; the private key information is left out of it.


sn -tp barpublic.snk > bar.txt


This creates a text file that contains the full dump of the key information, including the public key. We redirect it to a text file so that you can open it in the editor of your choice (because you’ll have to do some cleanup to get the key onto one line). You’ll want to select the public key and paste it into the PublicKey portion of the constructor for the InternalsVisibleTo attribute.
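If you’d rather not do that cleanup by hand every time, a throwaway script can do it. A hedged Python sketch (the markers match the sn.exe output shown below; the sample here is an abbreviated stand-in, not a real key):

```python
# Throwaway helper: pull the public key out of an `sn -tp` dump and
# return it on a single line, ready to paste into InternalsVisibleTo.
def extract_public_key(dump):
    lines = [ln.strip() for ln in dump.splitlines()]
    start = lines.index("Public key is") + 1
    parts = []
    for ln in lines[start:]:
        if not ln or ln.startswith("Public key token"):
            break
        parts.append(ln)
    return "".join(parts)

# sn.exe wraps the key across several lines; this sample mimics that
# with a shortened, made-up blob.
sample = """Public key is
00240000048000009400
00000602000000240000

Public key token is db218359dd8997df"""
print(extract_public_key(sample))  # the two wrapped lines joined into one
```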


So here’s everything we did at the command prompt:


C:\Program Files\Microsoft Visual Studio 9.0\VC>sn -k bar.snk

Microsoft (R) .NET Framework Strong Name Utility  Version 3.5.30729.1
Copyright (c) Microsoft Corporation.  All rights reserved.

Key pair written to bar.snk

C:\Program Files\Microsoft Visual Studio 9.0\VC>sn -p bar.snk barpublic.snk

Microsoft (R) .NET Framework Strong Name Utility  Version 3.5.30729.1
Copyright (c) Microsoft Corporation.  All rights reserved.

Public key written to barpublic.snk

C:\Program Files\Microsoft Visual Studio 9.0\VC>sn -tp barpublic.snk > bar.txt

C:\Program Files\Microsoft Visual Studio 9.0\VC>

And here’s the contents of our text file:


Microsoft (R) .NET Framework Strong Name Utility  Version 3.5.30729.1
Copyright (c) Microsoft Corporation.  All rights reserved.

Public key is
0024000004800000940000000602000000240000525341310004000001000100e13cb392af5437279736fc3c33fe237242d0f6301fafb01c5cbc719d84102c2d8b30a148600997ed53d99624b5d0eab37fd6b24cca3ce7f7b62ae99f961e148d5421576bade0ac8ab1187a3eee318ca20026ffe9b56b8a63156f817cef49998633867ae547684e8e59c0fe0b68ab29dffa749340dc6cfdd18071f1b69c6772ac

Public key token is db218359dd8997df

So, when we finally add that attribute to our AssemblyInfo.vb file, it looks like this:


<Assembly:InternalsVisibleTo("Bar, PublicKey=0024000004800000940000000602000000240000525341310004000001000100e13cb392af5437279736fc3c33fe237242d0f6301fafb01c5cbc719d84102c2d8b30a148600997ed53d99624b5d0eab37fd6b24cca3ce7f7b62ae99f961e148d5421576bade0ac8ab1187a3eee318ca20026ffe9b56b8a63156f817cef49998633867ae547684e8e59c0fe0b68ab29dffa749340dc6cfdd18071f1b69c6772ac")>

Once this stuff is in place (and Bar is actually signed with bar.snk, so its public key matches the one in the attribute), Bar should be able to access any members in Foo that are marked internal/Friend. It should be smooth sailing from there.

Good luck!


Tuesday, July 14, 2009

On Legacy Software Maintenance

In the mad, mad, mad, mad world of software development, we are faced with the trying task of maintaining legacy systems. In an ideal world, that wouldn't be the case. We'd all be developing brand new systems from the ground up, writing ground-breaking code that no one has ever seen before, without the hassles that arise when you have to worry about things like backwards compatibility and maximum system uptime.

But this isn't an ideal world. The vast majority of us don't have the luxury of developing completely new systems. Instead, our lives are fraught with the perils of correcting defects and adding new features to systems that have been around for ages and, occasionally, decades. Those systems have usually passed through a number of hands, and they tend to be poorly documented. They sometimes have sprawling feature sets, support technologies that have long since fallen by the wayside, and are bloated with code that doesn't appear to be invoked by anyone and riddled with obscure, seemingly nonsensical comments.

Your job, as a maintenance developer, is to massage that seemingly horrific beast into a thing of beauty. Real people doing real jobs depend on it to get their work done in a timely manner. As much as we might loathe an ancient codebase, the language it was written in, or the tools we have to use to get the job done, truth is, a legacy application is maintained for a reason: it has intrinsic value to the bottom line of the business. When the folks who depend on that system can get their jobs done on time, in a productive manner, they can continue to draw a decent paycheck. That means that they can continue to pay the rent, put food on the table for their families, afford healthcare, and all those other essentials.

So tread lightly when you delve into the code of a legacy system. It's far more important than you think. We take it for granted that it's a dirty, loathsome job, that someone has to do it, and that we just happen to be the unlucky bastards who drew the short straw. Not so: you happen to be the lucky one who drew that straw. People depend on you to help them keep their families safe, warm, and well-fed.

My point isn't that every legacy application is manna from heaven. My point is that legacy applications exist for a reason, that they're maintained for a reason. They have long, varied histories for a reason. They have endured because they have value; they've grown beyond their original specification because the company sees real value in them, and doesn't want to lose that investment. The problem for you, as a developer, is in ensuring that you do not destroy that investment.

When we are first introduced to a legacy system, we have a tendency to look at the source code and view it as though it were written by a blind deaf quadriplegic with Tourette's syndrome. No one in their right mind would have written a system that way. What could they possibly have been thinking? You certainly wouldn't have! I certainly wouldn't have.

But then, over time, we start to learn things about it. The code is old; very old. There's been a high turnover rate, and the code has passed through lots of hands. The companies that published third-party components went out of business when the dot-com bubble burst. They used to use Novell for security, and then switched to Active Directory. When this thing was released, Windows 95 was still popular. They upgraded the database about two years ago, and had to make some emergency revisions to the code.

There are reasons that things like this happen in a legacy system that's old enough to qualify for Medicare. Many of those reasons are valid, and many of them are the product of a lack of time and budget. Sometimes, sadly, it's a result of a lack of skilled developers (but that's something for someone else to blog about). The point, in short, is that systems grow from an original vision into large, cumbersome, bloated systems because developers respond to the needs and demands of the business.

Now, here you are, present day, and you're tasked with maintaining that source code. You have two primary responsibilities: 1.) Fix the defects, and 2.) Add new features. Keep in mind that while you are doing both, you must not at any point in time break backwards compatibility or bring down the system. People rely on this system. It's the bread and butter (or part of it) of the business. And it's your baby now.

It is absolutely crucial that you treat legacy software with its due gravity. If you view it like it's some recurring annoyance, stop that. If you leap to hasty conclusions about the cause of problems in the system, stop that immediately. This is a system that many people rely on. Get it into your head that you need to treat this thing delicately, as if it were your pride and joy. Once you fix a defect, once you put a new feature into it, your name is associated with that software. Take pride in it. Do it right. Take your time.

Over time, the legacy system stops being the horrific beast associated with all those who preceded you. It becomes the creature that you have molded it into. And then, people will associate it with you, for better or worse.

Monday, July 13, 2009

The Sad State of SQL Refactoring

I love refactoring tools.

The ability to select a variable in code, right-click on it, choose the Rename command from a context menu, and then safely rename that variable in every location in which it appears is a boon to my programming productivity. Similarly, the ability to take a large block of complex logic and extract it to its own method is really handy. Or, the ability to reorder parameters, and know that every piece of code that calls that method will be updated appropriately saves me tons of time and improves my confidence in the quality of the code.

When you refactor code of any kind, you enhance its readability without changing the way it works. But refactoring by hand is frequently difficult, tedious, and error-prone. That's why refactoring tools exist and why the essential service they provide is important. They allow us to improve the maintainability of code, hopefully reducing maintenance costs, quickly and efficiently.
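That word "safely" is doing a lot of work, by the way. A rename refactoring is not a textual search-and-replace. A toy sketch in Python (my own illustration, not any real tool's implementation) shows the difference:

```python
import re

code = "total = total_count + subtotal; total += 1"

# Naive approach: a plain string replace mangles longer identifiers.
naive = code.replace("total", "sum")
# -> "sum = sum_count + subsum; sum += 1"  (wrong: it hit total_count and subtotal)

def rename_identifier(source, old, new):
    """Rename only whole-word occurrences of an identifier."""
    return re.sub(rf"\b{re.escape(old)}\b", new, source)

safe = rename_identifier(code, "total", "sum")
# -> "sum = total_count + subtotal; sum += 1"
print(naive)
print(safe)
```

A real refactoring tool goes further still: it parses the code, so it won't touch occurrences inside strings or comments, which the regex above happily would.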

Enter SQL languages.

In theory, there is a standard for SQL languages: the ANSI SQL-92 standard. One would like to think that it would be a simple matter to create refactoring tools for any SQL dialect based on that standard. One would be wrong. You can't use the standard as the sole basis of a refactoring tool.

Any given database vendor strives to make their product unique, to stand out from the crowd. And so, they don't entirely conform to the standard. They have additional features that separate them from each other. For example: Oracle organizes functions and procedures into packages. Microsoft does not. Microsoft allows a bit data type on columns. Oracle does not. And, although we all loathe to think of Access as a real database, Access provides a boolean data type that you can use in columns, while Oracle does not.

Now, let's also talk about legacy support. Oracle has been around for a very long time. Version 2 came out in 1979, from what I gather. Their legacy join syntax looks nothing like the ANSI standard (and that's not a criticism, just a simple statement of fact):

SELECT C.CustomerId, Company, OrderID
FROM Customers C, Orders O
WHERE C.CustomerID (+) = O.CustomerID
UNION
SELECT C.CustomerId, Company, OrderID
FROM Customers C, Orders O
WHERE C.CustomerID = O.CustomerID (+)

In ANSI SQL, this is:

SELECT C.CustomerId, Company, OrderId
FROM Customers C FULL OUTER JOIN Orders O
ON (C.CustomerId = O.CustomerId)

Now, we could add in all the different SQL variations that we know are out there:

  • ANSI standard
  • Interbase/Firebird
  • IBM SQL
  • Microsoft Transact SQL
  • MySQL
  • Oracle PL/SQL
  • PostgreSQL
  • Access
  • FoxPro

...and on and on and on, but you get the point. There are a lot of SQL variants out there. And they all exist to accomplish the same things:

  • Create a data store.
  • Occasionally, modify the structure of the data store.
  • Get data out of a data store.
  • Put data into a data store.
  • If supported, execute code within the confines of the data store and (optionally) return a result.

Behold, SQL in a nutshell.

Over the lifetime of these various products, they have added features that they have to maintain for legacy support. Somewhere out there there's a business that absolutely must have that feature in place or their whole process will come crashing down. Don't you dare remove it. It doesn't matter that there are better features (likely based on a standard); there are applications out there for which they don't have the source code anymore, or which no one understands, and they're too afraid to touch.

Now, all these reasons have led us to a scenario where we have vastly different implementations of SQL. Sure, they share a lot in common, but they also have radically different feature sets and syntax. And that situation, in and of itself, has led us to one frightening and sad conclusion:

We will likely never have a tool that is able to connect to any database and be able to correctly refactor its SQL.

And that's just a damned shame. Because there's a lot of SQL out there. Stored procedures, functions, views, triggers, even inline SQL in applications and all that other jazz. But the amount of work it would take to get us to a point where a refactoring tool could recognize any variant of SQL and correctly parse it, refactor it, and not hose the code is enormous.

It's a pity, really. I could see tons of use for a tool like this. In my own office, we work with SQL Server, Oracle, and Atomix databases. What I wouldn't give for a tool that could refactor SQL to enhance its readability without changing the way it worked.

And who knows? Down the road, we may be working with something else.

But this is the world we live in. If we want to refactor SQL, it looks like we'll have to settle for separate refactoring tools for each language. And then each will come with its own quirks. That might be good or bad, but it's likely the best we can hope for for now.


Saturday, July 11, 2009

Defects and the Scientific Method

Let's leap back a few years to that frightening time in Junior High School. Remember those years? September, possibly October. It was still hot, and the air conditioning didn't work. Your science teacher stood at the back of the room with an overhead projector, a slide up for you to copy down. On it, he had this information:

THE SCIENTIFIC METHOD

  • Ask and define the question.
  • Gather information and resources through observation.
  • Form a hypothesis.
  • Perform one or more experiments and collect and sort data.
  • Analyze the data.
  • Interpret the data and make conclusions that point to a hypothesis.
  • Formulate a "final" or "finished" hypothesis.
Ah, remember those days? Remember how boring it was? Remember thinking to yourself, "I'll never use this"?

Well, as a software developer who is tasked with maintaining software that is virtually guaranteed to contain defects, you can be certain that you need to be intimately familiar with The Scientific Method.

The Scientific Method provides a clear roadmap for defect isolation. In fact, anyone who has any real experience isolating defects without disturbing the rest of the system has (whether he's aware of it or not) used the Scientific Method to do so. Here's how it breaks down:

  1. Ask and define the question. The software should behave in this manner, but it does not. What is the cause of this problem, and how do we fix it?
  2. Gather information and resources through observation. In a controlled environment that mimics production as closely as possible, reproduce the defect. If possible, step through the code and observe its behavior.
  3. Form a hypothesis. The defect is caused by this behavior in the system (or by the behavior of this external system).
  4. Perform one or more experiments and collect and sort data. Implement a code fix; attempt to reproduce the defect using the fixed code. Observe the results.
  5. Analyze the data. Did the code fix have the desired effect? If so, how?
  6. Interpret the data and make conclusions that point to a hypothesis. Was the code that was modified the cause of the defect, or was it merely a symptom of an underlying problem requiring further resolution?
  7. Formulate a "final" or "finished" hypothesis. If the defect is fully repaired, check all code into the repository. Otherwise, continue the analysis until you have rooted out the underlying cause of the defect.

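Step 4 is the one developers most often skip. In code, the experiment can be as small as a regression test. Here's a hedged Python sketch, where both the defect and the fix are invented purely for illustration:

```python
# Step 4 in miniature: encode the reproduction as a test, apply the fix,
# and re-run the experiment. The defect here (truncating integer division
# where float division was intended) is invented for illustration.

def average_buggy(values):
    return sum(values) // len(values)   # defect: truncating division

def average_fixed(values):
    return sum(values) / len(values)

# Experiment: reproduce the defect with a controlled input...
assert average_buggy([1, 2]) == 1      # observed (wrong) behavior
# ...then verify the hypothesis by re-running against the fix.
assert average_fixed([1, 2]) == 1.5    # expected behavior
print("hypothesis confirmed: truncating division caused the defect")
```

The point isn't the test framework; it's that the experiment is repeatable, so the conclusion you draw in step 6 rests on observation rather than a hunch.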
Simply put, there's no guesswork in defect resolution. It is a rational, thinking process, much like a game of Sudoku. If you approach any defect and just yank an answer out of thin air, You're Doing It Wrong.

Instant answers to defects are a dangerous game. Your first, instinctive answer to any problem is likely to be wrong, and the chances of that only rise as your code base grows in size. As your product gains features, you'll want to take greater care to disturb absolutely nothing outside of the defect you're trying to correct. In that case, take some advice: keep your grubby fingers to yourself. Touch only the code in the defect domain. The best way to do that is to have a plan for defect resolution, and I strongly encourage you to apply The Scientific Method.

Developing software is a task for those who can think. It is not a task for the simple-minded, the lazy, or the inattentive. You have to be willing to pay attention to the details, and to invest the time it takes to hunt down a defect in painstaking detail to get to the root of a problem.

A good software developer knows the difference between a symptom and a disease, and how that distinction applies to software defects. Sure, you have a NullReferenceException, and your code is missing a handler for it. But is the problem the missing exception handler? Or is it that a null somehow got into a table that should never have had one, or that a stored procedure in the database returned nulls when they were never expected? Which one is the symptom? Which is the disease? Make sure you're fixing the right defect. Don't just prescribe aspirin when the software needs invasive surgery. To find that out, you need to think critically. You need to apply the Scientific Method.
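To put the symptom-versus-disease distinction in code form (the names and data here are all invented): swallowing the exception treats the symptom, while rejecting the bad data at its source treats the disease.

```python
# Invented example: a None sneaks into data that "can never" contain one.

def total_hours_symptom(rows):
    # Treats the symptom: swallow the error and limp along.
    total = 0
    for row in rows:
        try:
            total += row["hours"]
        except TypeError:
            pass  # a None slipped in; silently skipped
    return total

def insert_row(table, row):
    # Treats the disease: refuse the bad data where it enters the system.
    if row["hours"] is None:
        raise ValueError("hours may not be null")
    table.append(row)

rows = [{"hours": 8}, {"hours": None}, {"hours": 7}]
print(total_hours_symptom(rows))  # 15, but the bad row is still in there

try:
    insert_row([], {"hours": None})
except ValueError as e:
    print("rejected at the source:", e)
```

The first function keeps the application running while the corrupt data quietly spreads; the second makes the disease impossible to contract in the first place.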