Thursday, July 16, 2009

InternalsVisibleTo and Chasing Down Public Keys

Sometimes, just getting assemblies to cooperate the way you want them to is a real pain in the neck.

Today, I wanted to make one assembly (which we’ll call Bar) able to access the internal (Friend, for all us VB geeks) members of another assembly (which we’ll call Foo). Now, the documented way to achieve this is by adding the InternalsVisibleTo attribute to your project. The code sample on MSDN looks like this:

[assembly:InternalsVisibleTo("AssemblyB, PublicKey=32ab4ba45e0a69a1")]

For a Visual Basic application, that statement goes into the AssemblyInfo.vb file, and looks like this:

<Assembly:InternalsVisibleTo("AssemblyB, PublicKey=32ab4ba45e0a69a1")>

No big deal. It’s not rocket science. You just need a strong name. To generate a strong name key pair, you fire up the Visual Studio Command Prompt and run the Strong Name tool (sn.exe) as follows:


C:\Program Files\Microsoft Visual Studio 9.0\VC>sn.exe -k bar.snk

Microsoft (R) .NET Framework Strong Name Utility  Version 3.5.30729.1
Copyright (c) Microsoft Corporation.  All rights reserved.

Key pair written to bar.snk

As you can see, this creates a .SNK file, which contains a public and a private key. However, its contents are not human-readable. Further, when the assemblies involved are strong-named, the InternalsVisibleTo attribute requires that you provide the full public key in the constructor. (Don’t even think about trying it without it.)


Here’s where things get tricky: The code samples on MSDN do not provide a public key to the constructor; they provide a public key token. There’s a huge difference between the two, and it’s very misleading. A public key token is much shorter than a public key: the token is only 16 hexadecimal characters long, while the public key we generate in this post runs to 320. If you rely on the code sample from MSDN to get you where you want to be, you’ll be pulling your hair out in no time.
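
For the curious, the token is not a separate key at all; it is derived from the public key. Here is a minimal sketch in Python (not part of the .NET toolchain, and not a claim about how sn.exe is implemented), assuming the usual derivation: take the SHA-1 hash of the public key blob, keep the last 8 bytes, and reverse them. The function name and the all-zero placeholder blob are made up for illustration.

```python
import hashlib


def public_key_token(public_key_hex: str) -> str:
    """Derive the 16-character token from a full public key given as hex."""
    blob = bytes.fromhex(public_key_hex)
    digest = hashlib.sha1(blob).digest()
    # Assumed derivation: last 8 bytes of the SHA-1 hash, in reverse order.
    return digest[-8:][::-1].hex()


# Placeholder blob of 160 zero bytes. This is NOT a real key; it only has
# the same length as the key generated in this post.
placeholder_key_hex = "00" * 160
token = public_key_token(placeholder_key_hex)

print(len(placeholder_key_hex), "hex characters in the public key")
print(len(token), "hex characters in the token")
```

The point of the sketch is the size difference, and that you can go from key to token but not back again, which is why the attribute cannot make do with the token.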


But how do you get the public key?


You can retrieve the public key as follows:


sn -p bar.snk barpublic.snk


This creates a new file that contains only the public key. It removes the private key information from the file.


sn -tp barpublic.snk > bar.txt


This creates a text file that contains the full dump of the key information, including the public key and its token. We redirect the output to a text file so that you can open it in the editor of your choice (because you’ll have to do some cleanup to get the key onto one line). You’ll want to select the public key and paste it into the PublicKey portion of the constructor for the InternalsVisibleTo attribute.


So here’s everything we did at the command prompt:


C:\Program Files\Microsoft Visual Studio 9.0\VC>sn -k bar.snk

Microsoft (R) .NET Framework Strong Name Utility  Version 3.5.30729.1
Copyright (c) Microsoft Corporation.  All rights reserved.

Key pair written to bar.snk

C:\Program Files\Microsoft Visual Studio 9.0\VC>sn -p bar.snk barpublic.snk

Microsoft (R) .NET Framework Strong Name Utility  Version 3.5.30729.1
Copyright (c) Microsoft Corporation.  All rights reserved.

Public key written to barpublic.snk

C:\Program Files\Microsoft Visual Studio 9.0\VC>sn -tp barpublic.snk > bar.txt

C:\Program Files\Microsoft Visual Studio 9.0\VC>

And here are the contents of our text file:


Microsoft (R) .NET Framework Strong Name Utility  Version 3.5.30729.1
Copyright (c) Microsoft Corporation.  All rights reserved.

Public key is
0024000004800000940000000602000000240000525341310004000001000100e13cb392af5437279736fc3c33fe237242d0f6301fafb01c5cbc719d84102c2d8b30a148600997ed53d99624b5d0eab37fd6b24cca3ce7f7b62ae99f961e148d5421576bade0ac8ab1187a3eee318ca20026ffe9b56b8a63156f817cef49998633867ae547684e8e59c0fe0b68ab29dffa749340dc6cfdd18071f1b69c6772ac

Public key token is db218359dd8997df

So, when we finally add that attribute to Foo’s AssemblyInfo.vb file, it looks like this:


<Assembly:InternalsVisibleTo("Bar, PublicKey=0024000004800000940000000602000000240000525341310004000001000100e13cb392af5437279736fc3c33fe237242d0f6301fafb01c5cbc719d84102c2d8b30a148600997ed53d99624b5d0eab37fd6b24cca3ce7f7b62ae99f961e148d5421576bade0ac8ab1187a3eee318ca20026ffe9b56b8a63156f817cef49998633867ae547684e8e59c0fe0b68ab29dffa749340dc6cfdd18071f1b69c6772ac")>

Once this stuff is in place, and Bar itself is signed with bar.snk, Bar should be able to access any members in Foo that are marked internal/Friend. It should be smooth sailing from there.
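
If you would rather not do the cleanup by hand, that step can be scripted. What follows is a rough sketch in Python, not an official tool. It assumes the dump has the same shape as the one shown above (a "Public key is" line, the key on one or more lines of hex digits, then a "Public key token is" line). The 64-character line wrapping in the sample text and both function names are assumptions for illustration.

```python
import re
from typing import Tuple

# The dump from this post. The 64-character wrapping of the key is an
# assumption; sn.exe may break the lines elsewhere.
SN_DUMP = """\
Microsoft (R) .NET Framework Strong Name Utility  Version 3.5.30729.1
Copyright (c) Microsoft Corporation.  All rights reserved.

Public key is
0024000004800000940000000602000000240000525341310004000001000100
e13cb392af5437279736fc3c33fe237242d0f6301fafb01c5cbc719d84102c2d
8b30a148600997ed53d99624b5d0eab37fd6b24cca3ce7f7b62ae99f961e148d
5421576bade0ac8ab1187a3eee318ca20026ffe9b56b8a63156f817cef499986
33867ae547684e8e59c0fe0b68ab29dffa749340dc6cfdd18071f1b69c6772ac

Public key token is db218359dd8997df
"""


def parse_sn_dump(text: str) -> Tuple[str, str]:
    """Return (public_key, token) from text shaped like the dump above."""
    lines = [line.strip() for line in text.splitlines()]
    start = lines.index("Public key is") + 1  # ValueError if the line is absent
    key_parts = []
    for line in lines[start:]:
        # The key ends at the first line that is not purely hex digits.
        if not re.fullmatch(r"[0-9a-fA-F]+", line):
            break
        key_parts.append(line)
    token_match = re.search(r"Public key token is ([0-9a-fA-F]{16})", text)
    if not key_parts or token_match is None:
        raise ValueError("text does not look like an sn -tp dump")
    return "".join(key_parts), token_match.group(1)


def friend_attribute_vb(assembly_name: str, public_key: str) -> str:
    """Build the VB attribute line in the same form used in this post."""
    return '<Assembly:InternalsVisibleTo("{0}, PublicKey={1}")>'.format(
        assembly_name, public_key
    )


public_key, token = parse_sn_dump(SN_DUMP)
print(friend_attribute_vb("Bar", public_key))
```

Paste the printed line into AssemblyInfo.vb as before. If your version of sn.exe words its output differently, the two search strings are the parts to adjust.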

Good luck!


Tuesday, July 14, 2009

On Legacy Software Maintenance

In the mad, mad, mad, mad world of software development, we are faced with the trying task of maintaining legacy systems. In an ideal world, that wouldn't be the case. We'd all be developing brand new systems from the ground up, writing ground-breaking code that no one has ever seen before, without the hassles that arise when you have to worry about things like backwards compatibility and maximum system uptime.

But this isn't an ideal world. The vast majority of us don't have the luxury of developing completely new systems. Instead, our lives are fraught with the perils of correcting defects and adding new features to systems that have been around for ages and, occasionally, decades. Those systems have usually passed through a number of hands and they tend to be poorly documented. They sometimes have sprawling feature sets and support technologies that have long since fallen by the wayside. They are bloated with code that doesn't appear to be invoked by anyone, and riddled with obscure and seemingly nonsensical comments.

Your job, as a maintenance developer, is to massage that seemingly horrific beast into a thing of beauty. Real people doing real jobs depend on it to get their work done in a timely manner. As much as we might loathe an ancient codebase, the language it was written in, or the tools we have to use to get the job done, truth is, a legacy application is maintained for a reason: it has intrinsic value to the bottom line of the business. When the folks who depend on that system can get their jobs done on time, in a productive manner, they can continue to draw a decent paycheck. That means that they can continue to pay the rent, put food on the table for their families, afford healthcare, and all those other essentials.

So tread lightly when you delve into the code of a legacy system. It's far more important than you think. We take it for granted that it's a dirty, loathsome job that someone has to do, and that we just happen to be the unlucky bastard who drew the short straw. Not so: you happen to be the lucky one who drew that straw. People depend on you to help them keep their families safe, warm, and well-fed.

My point isn't that every legacy application is manna from heaven. My point is that legacy applications exist for a reason, that they're maintained for a reason. They have long, varied histories for a reason. They have endured because they have value; they've grown beyond their original specification because the company sees real value in them, and doesn't want to lose that investment. The problem for you, as a developer, is in ensuring that you do not destroy that investment.

When we are first introduced to a legacy system, we have a tendency to look at the source code and view it as though it were written by a blind deaf quadriplegic with Tourette's syndrome. No one in their right mind would have written a system that way. What could they possibly have been thinking? You certainly wouldn't have! I certainly wouldn't have.

But then, over time, we start to learn things about it. The code is old; very old. There's been a high turnover rate, and the code has passed through lots of hands. The companies that published third-party components went out of business when the dot-com bubble burst. They used to use Novell for security, and then switched to Active Directory. When this thing was released, Windows 95 was still popular. They upgraded the database about two years ago, and had to make some emergency revisions to the code.

There are reasons that things like this happen in a legacy system that's old enough to qualify for Medicare. Many of those reasons are valid, and many of them are the product of a lack of time and budget. Sometimes, sadly, it's a result of a lack of skilled developers (but that's something for someone else to blog about). The point, in short, is that systems grow from an original vision into large, cumbersome, bloated systems because developers respond to the needs and demands of the business.

Now, here you are, present day, and you're tasked with maintaining that source code. You have two primary responsibilities: 1.) Fix the defects, and 2.) Add new features. Keep in mind that while you are doing both, you must not at any point in time break backwards compatibility or bring down the system. People rely on this system. It's the bread and butter (or part of it) of the business. And it's your baby now.

It is absolutely crucial that you treat legacy software with its due gravity. If you view it like it's some recurring annoyance, stop that. If you leap to hasty conclusions about the cause of problems in the system, stop that immediately. This is a system that many people rely on. Get it into your head that you need to treat this thing delicately, as if it were your pride and joy. Once you fix a defect, once you put a new feature into it, your name is associated with that software. Take pride in it. Do it right. Take your time.

Over time, the legacy system stops being the horrific beast associated with all those who preceded you. It becomes the creature that you have molded it into. And then, people will associate it with you, for better or worse.

Monday, July 13, 2009

The Sad State of SQL Refactoring

I love refactoring tools.

The ability to select a variable in code, right-click on it, choose the Rename command from a context menu, and then safely rename that variable in every location in which it appears is a boon to my programming productivity. Similarly, the ability to take a large block of complex logic and extract it to its own method is really handy. Or, the ability to reorder parameters, and know that every piece of code that calls that method will be updated appropriately saves me tons of time and improves my confidence in the quality of the code.

When you refactor code of any kind, you enhance its readability without changing the way it works. But refactoring by hand is frequently difficult, tedious, and error-prone. That's why refactoring tools exist and why the essential service they provide is important. They allow us to improve the maintainability of code, hopefully reducing maintenance costs, quickly and efficiently.

Enter SQL languages.

In theory, there is a standard for SQL languages: the ANSI SQL-92 standard. One would like to think that it would be a simple matter to create refactoring tools for any SQL based on the ANSI standard for SQL. One would be wrong. You can't use the standard as the sole basis of a refactoring tool.

Any given database vendor strives to make its product unique, to stand out from the crowd. And so, vendors don't entirely conform to the standard. They have additional features that separate them from each other. For example: Oracle organizes functions and procedures into packages. Microsoft does not. Microsoft allows a bit data type on columns. Oracle does not. And, although we are all loath to think of Access as a real database, Access provides a boolean data type that you can use in columns, while Oracle does not.

Now, let's also talk about legacy support. Oracle has been around for a very long time. Version 2 came out in 1979, from what I gather. Their legacy join syntax looks nothing like the ANSI standard (and that's not a criticism, just a simple statement of fact):

SELECT C.CustomerID, C.Company, O.OrderID
FROM Customers C, Orders O
WHERE C.CustomerID (+) = O.CustomerID
UNION
SELECT C.CustomerID, C.Company, O.OrderID
FROM Customers C, Orders O
WHERE C.CustomerID = O.CustomerID (+)

In ANSI SQL, this is:

SELECT C.CustomerID, C.Company, O.OrderID
FROM Customers C FULL OUTER JOIN Orders O ON
(C.CustomerID = O.CustomerID)
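
To see how a third engine handles the same query, here is a small sketch using the sqlite3 module that ships with Python. SQLite does not understand Oracle's (+) syntax, and builds before 3.39 did not support FULL OUTER JOIN either, so the sketch spells the query as a UNION of two LEFT JOINs. The table layout and sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, Company TEXT);
    CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, CustomerID INTEGER);
    -- Globex has no orders; order 11 points at a customer that does not exist.
    INSERT INTO Customers VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO Orders VALUES (10, 1), (11, 3);
""")

# Full outer join written as "every customer" UNION "every order".
FULL_JOIN_EMULATED = """
    SELECT C.CustomerID, C.Company, O.OrderID
    FROM Customers C LEFT JOIN Orders O ON C.CustomerID = O.CustomerID
    UNION
    SELECT C.CustomerID, C.Company, O.OrderID
    FROM Orders O LEFT JOIN Customers C ON C.CustomerID = O.CustomerID
"""

rows = set(conn.execute(FULL_JOIN_EMULATED).fetchall())
for row in sorted(rows, key=repr):  # sort by repr so None values compare safely
    print(row)
```

That makes three spellings of one query across three engines, which is exactly the kind of variation a refactoring tool would have to understand.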

Now, we could add in all the different SQL variations that we know that are out there:

  • ANSI standard
  • Interbase/Firebird
  • IBM SQL
  • Microsoft Transact SQL
  • MySQL
  • Oracle PL/SQL
  • PostgreSQL
  • Access
  • FoxPro

...and on and on and on, but you get the point. There are a lot of SQL variants out there. And their goal is to accomplish the same thing:

  • Create a data store.
  • Occasionally, modify the structure of the data store.
  • Get data out of a data store.
  • Put data into a data store.
  • If supported, execute code within the confines of the data store and (optionally) return a result.

Behold, SQL in a nutshell.

Over the lifetime of these various products, they have added features that they have to maintain for legacy support. Somewhere out there there's a business that absolutely must have that feature in place or their whole process will come crashing down. Don't you dare remove it. It doesn't matter that there are better features (likely based on a standard); there are applications out there for which they don't have the source code anymore, or which no one understands, and they're too afraid to touch.

Now, all these reasons have led us to a scenario where we have vastly different implementations of SQL. Sure, they share a lot in common, but they also have radically different feature sets and syntax. And that situation, in and of itself, has led us to one frightening and sad conclusion:

We will likely never have a tool that is able to connect to any database and be able to correctly refactor its SQL.

And that's just a damned shame. Because there's a lot of SQL out there. Stored procedures, functions, views, triggers, even inline SQL in applications and all that other jazz. But the amount of work it would take to get us to a point where a refactoring tool could recognize any variant of SQL and correctly parse it, refactor it, and not hose the code is enormous.
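
To make "hose the code" concrete, here is a toy sketch in Python of renaming a column with plain text replacement versus a version that at least skips single-quoted string literals. Both functions and the sample statement are made up for illustration, and neither is a real refactoring tool: the second one still knows nothing about comments, quoted identifiers, scoping, or any vendor-specific syntax.

```python
import re

SQL = "SELECT Company, 'Company' AS Label FROM Customers WHERE Company LIKE 'A%'"


def naive_rename(sql: str, old: str, new: str) -> str:
    """Whole-word text replacement with no notion of SQL structure."""
    return re.sub(r"\b{0}\b".format(re.escape(old)), new, sql)


def literal_aware_rename(sql: str, old: str, new: str) -> str:
    """Rename only outside single-quoted literals ('' is an escaped quote)."""
    pattern = re.compile(r"\b{0}\b".format(re.escape(old)))
    # The capturing group keeps the literals in the result of the split.
    parts = re.split(r"('(?:[^']|'')*')", sql)
    return "".join(
        part if part.startswith("'") else pattern.sub(new, part)
        for part in parts
    )


print(naive_rename(SQL, "Company", "CompanyName"))
print(literal_aware_rename(SQL, "Company", "CompanyName"))
```

The first call rewrites the string literal along with the column, which changes what the query returns. The second gets this one case right, and every further case (and every further dialect) needs more of the same.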

It's a pity, really. I could see tons of use for a tool like this. In my own office, we work with SQL Server, Oracle, and Atomix databases. What I wouldn't give for a tool that could refactor SQL to enhance its readability without changing the way it worked.

And who knows? Down the road, we may be working with something else.

But this is the world we live in. If we want to refactor SQL, it looks like we'll have to settle for separate refactoring tools for each language. And then each will come with its own quirks. That might be good or bad, but it's likely the best we can hope for for now.

 

Saturday, July 11, 2009

Defects and the Scientific Method

Let's leap back a few years to that frightening time in Junior High School. Remember those years? September, possibly October. It was still hot, and the air conditioning didn't work. Your science teacher was standing in the back of the room, with an overhead projector and had a slide up for you to copy down. On it, he had this information:

THE SCIENTIFIC METHOD

  • Ask and define the question.
  • Gather information and resources through observation.
  • Form a hypothesis.
  • Perform one or more experiments and collect and sort data.
  • Analyze the data.
  • Interpret the data and make conclusions that point to a hypothesis.
  • Formulate a "final" or "finished" hypothesis.

    Ah, remember those days? Remember how boring it was? Remember thinking to yourself, "I'll never use this?"

    Well, as a software developer who is tasked with maintaining software that is virtually guaranteed to contain defects, you can be certain that you need to be intimately familiar with The Scientific Method.

    The Scientific Method provides a clear roadmap for defect isolation. In fact, anyone who has any real experience isolating defects without disturbing the rest of the system has (whether he's aware of it or not) used the Scientific Method to do so. Here's how it breaks down:

    1. Ask and define the question. The software should behave in this manner, but it does not. What is the cause of this problem, and how do we fix it?
    2. Gather information and resources through observation. In a controlled environment that mimics production as closely as possible, reproduce the defect. If possible, step through the code and observe its behavior.
    3. Form a hypothesis. The defect is caused by this behavior in the system (or by the behavior of this external system).
    4. Perform one or more experiments and collect and sort data. Implement a code fix; attempt to reproduce the defect using the fixed code. Observe the results.
    5. Analyze the data. Did the code fix have the desired effect? If so, how?
    6. Interpret the data and make conclusions that point to a hypothesis. Was the code that was modified the cause of the defect, or was it merely a symptom of an underlying problem requiring further resolution?
    7. Formulate a "final" or "finished" hypothesis. If the defect is fully repaired, check all code into the repository. Otherwise, continue the analysis until you have rooted out the underlying cause of the defect.

    Simply put, there's no guesswork in defect resolution. It is a rational, thinking process, much like a game of Sudoku. If you approach any defect and just yank an answer out of thin air, You're Doing It Wrong.

    Instant answers to defects are a dangerous game. Your first, instinctive answer to any problem is likely to be wrong; the chances of this being true will only rise as your code base grows in size. As your product gains features, you'll want to take greater care to make sure that you have taken the time to disturb absolutely nothing outside of the defect you're trying to correct. In that case, take some advice: Keep your grubby fingers to yourself. Touch only the code in the defect domain. The best way to do that is to have a plan for defect resolution, and I strongly encourage you to apply The Scientific Method.

    Developing software is a task for those who can think. It is not a task for the simple-minded, the lazy, or the inattentive. You have to be willing to pay attention to the details, and to invest the time it takes to hunt down a defect in painstaking detail to get to the root of a problem.

    A good software developer knows the difference between a symptom and a disease, and how that correlates to software defects. Sure, you have a NullReferenceException, and your code is missing a handler for that. But is the problem the missing exception handler, or is the problem that a null somehow got into a table that should never have had it, or that a stored procedure in the database returned nulls when they were never expected? Which one is the symptom? Which is the disease? Make sure you're fixing the right defect. Don't just prescribe aspirin when the software needs invasive surgery. To find that out, you need to think critically. You need to apply the Scientific Method.
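
To put the symptom-versus-disease question in code, here is a small sketch in Python, where the analog of a NullReferenceException is an AttributeError raised on None. Every name in it (the functions, the exception class, the sample row) is hypothetical. The first function treats the symptom by swallowing the failure where it surfaces; the second treats the disease by refusing the bad value at the point where data enters the program.

```python
from typing import Optional


class MissingEmailError(ValueError):
    """Raised when a customer row arrives without an email address."""


def email_domain_symptom_fix(email: Optional[str]) -> str:
    # Symptom fix: catch the failure where it shows up and carry on.
    # The bad data is still in the system; it is just quieter now.
    try:
        return email.split("@")[1]
    except AttributeError:
        return ""


def load_customer(row: dict) -> dict:
    # Root-cause fix: reject the null at the boundary, so the rest of the
    # code can rely on the value being present.
    if row.get("email") is None:
        raise MissingEmailError("customer {0} has no email".format(row.get("id")))
    return row


def email_domain(customer: dict) -> str:
    return customer["email"].split("@")[1]


bad_row = {"id": 42, "email": None}
print(repr(email_domain_symptom_fix(bad_row["email"])))  # defect is hidden
try:
    load_customer(bad_row)
except MissingEmailError as err:
    print("rejected:", err)  # defect is reported at its source
```

Which of the two is the right repair depends on what your experiments tell you about where the null came from, which is the whole point of working through the steps above.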

    Sunday, August 10, 2008

    Top 10 Signs You've Become Indispensable (and Are Therefore About to Be Fired)

    A wise man once told me that when someone becomes indispensable the very best thing you can do is get rid of them as soon as possible. I've always thought that was sage advice. I've tried to keep that in mind. So I'm always keeping an eye out for habits that "indispensable" coders have. These are the guys I want eliminated, even if one of those guys is me.

    What it comes down to is whether or not one guy on the team can hold an entire project hostage. No company can reasonably afford to be in that position. I certainly wouldn't want to put a company in that position. It's about risk, and managing that risk, and doing so proactively.

    So, without further ado, the Top 10 Signs You've Become Indispensable (and Are Therefore About to Be Fired):

    1. You're the only one who can work on the particular tasks assigned to you, because you're the only one who understands them.
    2. You believe in or practice job security through code obscurity.
    3. You don't communicate, and hoard valuable information that other members of the team need to get their jobs done.
    4. You make technology decisions, implement them, and expect everyone to follow suit, whether they understand them or not.
    5. You frequently make vast, sweeping changes to the underlying architecture of the system, without first discussing those changes or their impact with others on the team.
    6. You don't really understand object-oriented analysis and design, but you act like you do.
    7. You resist any suggestions for better, proven ways to implement solutions, simply because someone tried it that way before and it left a bad taste in your mouth.
    8. You use source code control like a backup device, rather than a version control system.
    9. When designing a system, your first thought is the code or data model and not the problem domain.
    10. You have no interest in being a member of the team, and would rather do it your way all the time.

    (This is, of course, completely unscientific and totally subjective. Take it or leave it.)

    Thursday, August 7, 2008

    Why Censor the Internet (Language Warning)

    A poster on Digg offered this eloquent response to the article, Internet Censorship is On it's Way. The i-Patriot Act:

    WHY THE FUCK ARE THEY CONCERNED ABOUT THE FUCKING INTERNET?!?!

    I mean seriously there are much bigger issues in the whole fucking world then the internet. We cannot be in our on privacy doing our own thing without the people watching over us. I mean come on its such bullshit. Soon it will be like the movie Demolition Man and we will get fined when we fucking curse at home! We are slowly creeping into a government who has complete control over everything we do.

    Censor the internet... Give me a fucking break.

    My apologies for the language, but this impassioned question deserves an equally impassioned answer. Why, indeed?

    In a fascist state, the last thing you want is for people to be able to express themselves and speak out against the government. It's all about control. And people who can speak freely can't be controlled. Neither can those who listen to those who speak freely.

    Anyone who's been raptly paying attention to what's been going on in our country (particularly over the last 8 years or so) knows that we've been becoming a fascist state. But it's by our own choosing. We elected these people to power, either by choice or by sheer apathy. We refused to entertain the notion of deviating from a two party system, and we allowed them to strip us of our rights and freedoms. We did not cry out in protest when the Patriot Act was put into place; rather, many of us celebrated it, embracing it as a necessary evil in order to hunt down the vile terrorists who had dared to attack us on our own soil.

    Thus, we surrendered our rights, our freedoms, our liberties in order to gain a false sense of security and chase after demons that never really existed. And from that day forward our government, which we put into power and have kept in power, has continued to play upon our fears in order to further strip us of any vestige of the Constitutional rights we had before. They can do what they want, when they want, to whomever they want, and there is little that any of us can do about it.

    But we chose this path. We elected it. Sixty percent of the population failed to vote in the last presidential election. It was far more important to watch reality television than it was to secure a meaningful future for our nation, and we allowed the same criminals to maintain their stranglehold on what was once a powerful, respectable democracy. But those same people maintain that their government fails them, that they have no rights, that the economy is in the toilet, that we're sending our troops to senseless deaths overseas in a war we should never have been involved in, and a myriad of other complaints. When asked, though, they'll tell you that they didn't vote because their vote didn't count. Of course it didn't. No uncast vote counts.

    But we've learned nothing. Even now we entertain the absurd notion of effecting change by maintaining the status quo. We're going to elect either Obama or McCain. Yet another pawn from a two-party system. Neither will be able to revolutionize the country and restore what it was. Neither will break the back of the military industrial complex. Neither will do what must be done to fix what must be fixed.

    That responsibility rests with you and me. Right here. Right now. Every day. But it means getting off our asses, turning off our televisions, and getting involved. We must DEMAND change, DEMAND our rights, and DEMAND the restoration of the Constitution.

    You see? A fascist would never want words like these uttered on any medium. And the Internet makes it all too easy to make such statements in a forum where hundreds, thousands, even millions of people can read it.

    Sunday, April 6, 2008

    The Absurdity of "Don't Reinvent the Wheel"

    As developers, we've had this adage drilled into us from the beginning: Don't reinvent the wheel. In short, don't rewrite what's already been written. The idea is sound, in theory. You can save yourself time and money if you'll simply reuse existing code and/or components rather than writing them yourself from scratch. This time and money is saved up front when you write it (or would have written it), and down the road, when you have to maintain your system.

    However, I'd like to point out another, equally applicable adage: There's nothing new under the sun. Anyone who's ever tried to write a novel, a short story, a play, a movie, a song, or a piece of software, will know this one simple truth: somewhere, at some point in time, it's already been written.

    Every algorithm, every piece of code that we will ever attempt to write has already been written somewhere, at some point in time, by someone. Only the names have been changed. You're not inventing anything that is completely new, that's never been seen before. You should wisely disabuse yourself of that notion as quickly as possible.

    In the grand scheme of things, at the application level, you may very well have an idea for a system that is unlike anything that has been done to date. But the algorithms that drive it have already been written. Bubble sorts, hashes, exception handlers, encryption, data access, socket management, shopping carts, entire application frameworks, date management, document management, serialization, port I/O, and all that other stuff have already been done. Further, they've already been done several times over in many different languages to varying degrees of success.

    Tragically, if you're using a large application framework, like Java EE or Microsoft's .NET, the chances are good that the functionality you're looking for is built right into the framework itself. The problem is that the framework is so vast that you'll spend more time looking for it, and determining whether or not it works the way you need it to work, than you would just rewriting it yourself.

    Application frameworks are stunningly afflicted with feature creep. They must do everything under the sun, must meet every possible need. The problem, then, is that their scope becomes so broad, so vast, that no one in their right mind could possibly grasp the totality of all that they can do. It is inevitable that anyone using them will reinvent some of their functionality. The scale of that functionality might be small (reformatting dates) or it might be substantial (pooled database connections).
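
As a small-scale illustration of the date example, here is a sketch in Python, with its standard library standing in for the much larger frameworks named above. The two function names and the sample date are made up. One function reformats a date by hand; the other uses what the library already provides.

```python
from datetime import datetime


def reformat_date_by_hand(iso_date: str) -> str:
    """The reinvented wheel: split the string and shuffle the pieces."""
    year, month, day = iso_date.split("-")
    return "{0}/{1}/{2}".format(month, day, year)


def reformat_date_with_library(iso_date: str) -> str:
    """The wheel that was already in the box, if you knew where to look."""
    return datetime.strptime(iso_date, "%Y-%m-%d").strftime("%m/%d/%Y")


sample = "2008-04-06"
print(reformat_date_by_hand(sample))
print(reformat_date_with_library(sample))
```

Both produce the same text for a valid date. They part ways on bad input: the library version raises a ValueError for something like 2008-02-30, while the hand-rolled one happily reformats it. Whether that difference is worth the time spent finding the library call is the buy vs. build question in miniature.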

    In the end, it's absurd to think that we can possibly avoid reinventing the wheel. Of course we're going to reinvent it. Every application we write is a reinvention of someone else's wheel. It just so happens that our wheel is a custom wheel. All this paranoia about reinventing the wheel is blown out of proportion. A proper buy vs. build decision should never be neglected; but don't ever think for one minute that what you're creating hasn't been created before.

    Consider the scenario where you're under the gun to get a product out the door. And I mean it's a really tight schedule. And don't act like it's a perfect world, and you have leverage over the schedule. This is reality here. In the real world, the customer controls the schedule, because it's tied to when the product is released, and that's tied to this big, huge monstrosity in another state or another country. The product's delivery schedule is a train barreling down the track at 120mph and no one short of God can stop it. Now, you have a very finite amount of time to work in. You need an algorithm. You know you could write it. Or you could look to see if someone else has written it.

    If you do the whole Web search thing, you have to ask yourself a few questions: Is it from a source you trust? Is it in the language you're using, or do you have to convert it? Does it work? Does it need to be tweaked? If any of these fail, you're back to the drawing board. Time's wasting here, and that train's getting closer to its destination. If all the answers pass, you have to make sure you don't run into any copyright or licensing issues with that code. (You are paying attention to that, aren't you?)

    If you decide to peruse your application's framework, you'd better hope it's well documented, and very easy to search. Good luck using the search features in .NET. It's not like the ASK.COM interface, where you can ask, "How do I convert a date in DOD format to Gregorian format?" Yeah. Good luck with that. On the other hand, you could ask your coworkers. They might know. Then again, they might not. If they don't, you're off to Google to get the information. Here's hoping you get a timely and accurate response.

    Sure, this is an extreme example. But it makes my point: At some point, the work has to get done. You can't afford to spend days or weeks scratching your head about whether or not that wheel's already been invented. Believe me, it has. The problem is, there are countless wheels, none of them are labeled, and you don't know where to find the wheel you're looking for.

    Stop wasting time, and invent your own damned wheel.

    After all, whatever code you might reuse is just someone else's reinvention of the same wheel.