Yesterday, I came across a spirited defense of reinventing the wheel in a recent post from Jeff Atwood. Dare Obasanjo, by contrast, stands firmly in the “roll your own as last resort” camp. In this particular case, Atwood asserts the following:
[D]eeply understanding HTML sanitization is a critical part of my business.
I’ll take Atwood at his word on what’s critical to his business (and what isn’t), but it seems that there’s a middle ground between his position and Obasanjo’s. Particularly when there’s an open source solution available (SgmlReader in this case, since it’s written in C#), adopting and improving it has these benefits:
- Improved understanding of HTML sanitization for the adopter.
- Strengthening of the existing community.
To Atwood’s credit, he’s made his solution publicly available so that all of us who write software for a living can benefit from it. I would be very interested in seeing a comparison of SgmlReader and Atwood’s HTML Sanitizer to see which is better.
My own experience with reinventing the wheel (in software development terms) has rarely, if ever, been positive. Therefore, I have a lot of sympathy for Obasanjo’s perspective. Because I’ve inherited a lot of software from predecessors at various employers, I’ve seen a lot of less-than-ideal (to put it kindly) custom implementations of validation, encryption, search and logging functionality.
There are probably plenty of reasons that development teams reinvent the wheel in these areas, but one highly likely (and unfortunate) reason seems to be insufficient awareness of the high-quality open source solutions available for a wide variety of problems. I don’t know whether this is actually more true in internal IT shops than in other environments, but it seems that way. Encryption and logging in particular are two areas where it seems like custom code would be a bad idea for virtually everyone (except those actually in the encryption and logging library businesses). With libraries like log4j, log4net, the Enterprise Library, and Bouncy Castle available, developers can spend their time focusing on what’s really important to their application.
Code for authentication and authorization seems like one of those areas as well. There are so many solutions to this problem (like OpenID on the public web, and Active Directory in the enterprise) that time spent hand-rolling login/password anything is time not spent working in areas where more innovation is possible (and needed).
When I asked Stack Overflow “what should always be third-party”, I got some interesting answers. Most answers seemed to agree that encryption should be third-party, except in rare cases, but there was surprisingly little consensus beyond that. Beyond the scarce resources argument against custom logging (or custom code in other areas with widely available open source alternatives), there’s a diminishing returns argument as well. I’ve only used log4net and the logging in the Enterprise Library, but they’re really good frameworks. Even if I had the resources to implement custom logging well, the odds that the result would be a significant improvement over the existing third-party options are slim to none. I’d like to see the quality argument made more often in buy vs. build decisions.