Will AI Change My Job or Replace It?

One of my Twitter mutuals recently shared the following tweet with me regarding AI:


I found Dare Obasanjo’s commentary especially interesting because my connection to Stack Overflow runs a bit deeper than it might for some developers. As I mentioned in a much older post, I was a beta tester for the original stackoverflow.com. Every beta tester contributed some of the original questions still on the site today. While the careers site Stack Overflow went on to create was sunset as a feature last year, it helped me find a role in healthcare IT where I spent a few years of my career before returning to the management ranks.

Why is this relevant to AI? Because the purpose of Stack Overflow was (and is) to provide a place for software engineers to ask questions of other software developers and get answers to help them solve programming problems. Obasanjo’s takeaway from the CEO’s letter is that this decade-plus-old collection of questions and answers about software development challenges will be used as input for an AI that can replace software engineers altogether. My main takeaway from the same letter is that at some point this summer (possibly later) Stack Overflow and Stack Overflow for Teams (their corporate product) will get some sort of conversational AI capability added, perhaps even without the “hallucination problems” that have made the news recently.

Part of the reason I’m more inclined to believe that [chatbot] + [10+ years of programming Q & A site data] = [better programming Q & A resource] or [better starter app scaffolder] instead of [replacement for junior engineers] is knowing just how long we’ve been trying to replace people with expertise in software development with tools that will enable people without expertise to create software. While enough engineers have copied and pasted code from Stack Overflow into their own projects that it led to an April Fool’s gag product (which later became a real product), I believe we’re probably still quite some distance away from text prompts generating working Java APIs. I’ve lost track of how many companies have come and gone who put products into the market promising to let businesses replace software developers with tools that let you draw what you want and generate working software, or drag and drop boxes and arrows you can connect together that will yield working software, or some other variation on this theme of [idea] + [magic tool] = [working software product] with no testing, validation, or software developers in between. The truth is that there’s much more mileage to be gained from tools that help software developers do their jobs better and more quickly.

ReSharper is a tool I used for many years when I was writing production C# code, and it went a long way toward reducing (if not eliminating) a lot of the drudgery of software development. Writing boilerplate code, renaming variables, and renaming classes are just a few of the boring (and time-consuming) things it accelerated immensely. And that’s before you get to the numerous quick fixes it suggested to improve your code, and static code analysis to find and warn you of potential problems. I haven’t used GitHub Copilot (Microsoft’s so-called “AI pair programmer”) myself (in part because I’m management and don’t write production code anymore, in part because there are probably unpleasant legal ramifications to giving such a tool access to code owned by an employer), but it sounds very much like ReSharper on steroids.

Anthony B (on Twitter and Substack) has a far more profane, hilarious (and accurate) take on what ChatGPT, Bard, and other systems some (very) generously call conversational AI actually are:

His Substack piece goes into more detail, and as amusing as the term “spicy autocomplete” is, his comparison of how large language model systems handle uncertainty to how spam detection systems handle uncertainty provides real insight into the limitations of these systems in their current state. Another aspect of the challenge he touches on briefly in the piece is training data. In the case of Stack Overflow in particular, having asked and answered dozens of questions that will presumably be part of the training data set for their chatbot, I know that the quality of both questions and answers varies widely. The upvotes and downvotes for each are decent clues to quality but are not necessarily authoritative. A Stack Overflow chatbot could conceivably respond with an answer based on something with a lot of upvotes that might actually not be correct.

There’s an entirely different discussion to be had (and litigation in progress against an AI image generation startup, and a lawsuit against Microsoft, GitHub, and OpenAI) regarding the training of large language models on copyrighted material without paying copyright holders. How the lawsuits turn out (via judgments or settlements) should answer at least some questions about what would-be chatbot creators can use for training data (and how lucrative it might be for copyright holders to make some of their material available for this purpose). But in the meantime, I do not expect my job to be replaced by AI anytime soon.

Tell Me About Yourself–Engineering Leader Edition

The following tweet starts an excellent thread of questions that I’m taking as a starting point for this post looking back over the past 5 years with my current company:

When was the last time you promoted someone on your team?  How did it happen? My organization works in a way that promotion decisions are actually approved (or rejected) at a much higher level than mine.  But I’ve advocated successfully for promotion for two of my direct reports, both during the pandemic.

The first was a recent college graduate who spent the first 18 months of his professional career on my team.  While I wasn’t his manager for the entirety of that time, I encouraged him to work on communication across various channels (Slack, email, documentation, pull request comments, etc).  I did what I could to put opportunities in front of him to grow and showcase his skills.  What he did on his own (in addition to pursuing a master’s degree in computer science on the side) was earn AWS certifications.  He passed 4(!) in a single calendar year.  So when it came time for year-end reviews, there were a lot of accomplishments to point to, as well as positive feedback from people outside our team about their experiences of working with him.  He was the first direct report I had who earned the highest possible year-end rating (exceptional), and the first I saw promoted (to senior engineer).  He’s still with the company today, and received another promotion (to principal engineer) in the same cycle I received a promotion to senior manager.

The second promotion was for someone who had been with the company longer than I had.  From what I was told, she had been submitted for promotion once or twice before but had not been selected.  She was (and is) one of those engineers who leads much more by example than by talking.  Having observed over the years that the review process tends to overindex on software engineers who present well, I became the person in meetings who consistently pushed people to consider written communication as well as presentations in judging the quality of an engineer’s communication.  I also recommended she take the technical writing courses offered by Google.  These steps, plus highlighting her numerous critical contributions to the team’s success during another year-end review cycle, appear to have been enough to get her promoted to principal engineer.

Why did the last person in this role leave?  It’s been long enough that I don’t actually recall why the previous leader of the team moved on.  I presume they found an opportunity with another company.

How do you nurture psychological safety in your team?  Regular one-on-ones (I follow a weekly cadence for these) have been important to nurturing psychological safety.  Because I joined the team to lead it after work-from-home began, Zoom meetings were really the only avenue available to build the rapport necessary for my team to trust me.  I also started a technical book club with the team, with the intention of giving my team exposure to software design and implementation principles outside the scope of our current work, along with providing opportunities for each member of the team to lead discussions and explore ideas.  It seems to have had the additional benefit of building everyone’s comfort level with, and trust in, each other along with all the other things I’d intended it for (including ideas originating from book club showing up as production enhancements to our software).

When was the last time you supported a direct report’s growth, even if it meant leaving your team or company?  In my previous department, I had staffing responsibilities for two teams for a while: my own team, plus one composed entirely of contractors.  In helping a scrum master friend of mine diagnose why the contractor team was struggling to be productive, I concluded that the main issue wasn’t technical expertise but the lack of a leader to help remove impediments and connect them with others in the organization who could help their tasks move forward.  I proposed this as a leadership opportunity for one of my direct reports and got buy-in from higher-level management.  He was so successful in the stretch opportunity I created that he got promoted after leaving my team.  Not long after that, he left our organization to join Amazon as an engineering team lead in Seattle.  He’s currently a principal software engineering manager with Microsoft in Atlanta.

Can I speak to some women on the team to hear more about their experience?  Two of the engineers on my current team are women.  If all goes well, another one of them will be promoted to principal engineer by virtue of her performance over the past 18 months.  While it will likely mean losing her to another team, her getting promoted and gaining new opportunities that my team’s scope doesn’t provide is more important to me.  I see it as another opportunity to build up another engineer in her place.

Nulls Break Polymorphism, Revisited

Steve Smith wrote this post regarding the problem with null about two years ago.  It’s definitely worth reading in full (as is pretty much anything Steve Smith writes).  The post provided code for the implementation of an extension method and a change in the main code that would address null without throwing an exception.  It also mentioned the null object design pattern but didn’t provide a code example.

I took the code from the original post and revised it to use the null object design pattern.  Unlike the original example, my version of the Employee class overrides ToString() instead of using an extension method.  There are almost certainly any number of other tweaks which could be made to the code which I may make later.
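For readers unfamiliar with the pattern, here is a rough sketch of the null object idea. It is written in JavaScript (the language of the jQuery snippets elsewhere in this collection) rather than C#, and it is not the code from my revision or from Smith’s post: the Employee fields, the NullEmployee class, and the findEmployee helper are all hypothetical names chosen for illustration.

```javascript
// Hypothetical Employee type; the fields are illustrative only.
class Employee {
  constructor(firstName, lastName) {
    this.firstName = firstName;
    this.lastName = lastName;
  }

  // Plays the role that overriding ToString() plays in C#.
  toString() {
    return `${this.firstName} ${this.lastName}`;
  }
}

// The null object: same interface as Employee, with harmless default behavior.
class NullEmployee extends Employee {
  constructor() {
    super("", "");
  }

  toString() {
    return "No employee found";
  }
}

// Hypothetical lookup that hands back a NullEmployee instead of null/undefined.
function findEmployee(employees, lastName) {
  return employees.find((e) => e.lastName === lastName) ?? new NullEmployee();
}

const staff = [new Employee("Ada", "Lovelace")];

// Callers can use the result without a null check in either case.
console.log(String(findEmployee(staff, "Lovelace")));
console.log(String(findEmployee(staff, "Nobody")));
```

The point of the design is that calling code treats both results the same way, so the polymorphism that a bare null would break stays intact.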

Smith’s post links additional material that’s also worth checking out:

Entity Framework Code First to a New Database (Revised Again)

As part of hunting for a new employer (an unfortunate necessity due to layoffs), I’ve been re-acquainting myself with the .NET stack after a couple of years building and managing teams of J2EE developers.  MSDN has a handy article on Entity Framework Code First, but the last update was about a year ago and some of the information hasn’t aged so well.

The first 3 steps in the article went as planned (I’m using Visual Studio 2017 Community Edition).  But once I got to step 4, neither of the suggested locations of the database worked per the instructions.  A quick look in App.config revealed what I was missing:

Once I provided the following value for the server name:

(localdb)\mssqllocaldb

the databases I could connect to revealed themselves and I was able to inspect the schema.  Steps 5-7 worked without modifications as well.  My implementation of the sample diverged slightly from the original in that I refactored the five classes out of Program.cs into separate files.  This didn’t change how the program operated at all–it just made for a simpler Program.cs file.  The code is available on GitHub.

Best Practices for Software Testing

I originally wrote the following as an internal corporate blog post to guide a pair of business analysts responsible for writing and unit testing business rules. The advice below applies pretty well to software testing in general.

80/20 Rule

80% of your test scenarios should cover failure cases, with the other 20% covering success cases.  Too much of testing (unit testing or otherwise) seems to cover the happy path.  A 4:1 ratio of failure case tests to success case tests will result in more durable software.

Boundary/Range Testing

Given a range of valid values for an input, the following tests are strongly recommended:

  • Test of behavior at minimum value in range
  • Test of behavior at maximum value in range
  • Tests outside of valid value range
    • Below minimum value
    • Above maximum value
  • Test of behavior within the range

These tests roughly conform to the 80/20 rule, and apply to numeric values, dates and times.
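As a rough illustration of the boundary cases listed above, here is a minimal JavaScript sketch. The rule being tested (whole numbers from 1 to 100 are valid) and the isInRange function are assumptions made up for the example, not part of any real business rule.

```javascript
const { strictEqual } = require("node:assert/strict");

// Hypothetical rule under test: a valid value is a whole number from 1 to 100.
const MIN = 1;
const MAX = 100;

function isInRange(value) {
  return Number.isInteger(value) && value >= MIN && value <= MAX;
}

// One small check per boundary case from the list above.
const cases = [
  { name: "minimum value is accepted", input: MIN, expected: true },
  { name: "maximum value is accepted", input: MAX, expected: true },
  { name: "value below minimum is rejected", input: MIN - 1, expected: false },
  { name: "value above maximum is rejected", input: MAX + 1, expected: false },
  { name: "value inside the range is accepted", input: 50, expected: true },
];

for (const { name, input, expected } of cases) {
  // The case name doubles as the failure message.
  strictEqual(isInRange(input), expected, name);
}
console.log(`${cases.length} boundary checks passed`);
```

Keeping the cases in a table makes it cheap to add more out-of-range inputs, which is where the 80/20 rule says most of the effort should go.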

Date/Time Testing

Above and beyond the boundary/range testing described above, the testing of dates creates a need to test how code handles different orderings of those values relative to each other.  For example, if a method has a start and end date as inputs, you should test to make sure that the code responds with some sort of error if the start date is later than the end date.  If a method has start and end times as inputs for the same day, the code should respond with an error if the start time is later than the end time.  Testing of date or date/time-sensitive code must include an abstraction to represent current date and time as a value (or values) you choose, rather than the current system date and time.  Otherwise, you’ll have no way to test code that should only be executed years in the future.
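One way to sketch the clock abstraction described above, in JavaScript: the code under test asks a clock object for the current time instead of reading the system time directly. The isDiscountActive rule, the systemClock object, and the fixedClock helper are hypothetical names invented for this example.

```javascript
const { strictEqual, throws } = require("node:assert/strict");

// Hypothetical rule: a discount applies only between start and end (inclusive).
// `clock` is any object with a now() method returning a Date.
function isDiscountActive(start, end, clock) {
  if (start > end) {
    throw new RangeError("start date must not be later than end date");
  }
  const now = clock.now();
  return now >= start && now <= end;
}

// Production code would pass the real clock...
const systemClock = { now: () => new Date() };
// ...while tests pass a clock frozen at whatever moment they choose.
const fixedClock = (isoString) => ({ now: () => new Date(isoString) });

const start = new Date("2040-01-01T00:00:00Z");
const end = new Date("2040-12-31T23:59:59Z");

// Code that only takes effect years from now is testable today.
strictEqual(isDiscountActive(start, end, fixedClock("2040-06-15T12:00:00Z")), true);
strictEqual(isDiscountActive(start, end, fixedClock("2041-01-01T00:00:00Z")), false);

// Start later than end should produce an error, not a silent false.
throws(() => isDiscountActive(end, start, systemClock), RangeError);
```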

Boolean Testing

Given that a boolean value is either true or false, testing code that takes a boolean as an input seems quite simple.  But if a method has multiple inputs that can be true or false, testing that the right behavior occurs for every possible combination of those values becomes less trivial.  Combine that with the possibility of a null value, or multiple null values being provided (as described in the next section) and comprehensive testing of a method with boolean inputs becomes even harder.
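To show how quickly the combinations add up, here is a small JavaScript sketch that checks every combination of three boolean inputs against a hand-written truth table. The shipsFree rule and the booleanCombinations helper are hypothetical, made up for the example.

```javascript
const { strictEqual } = require("node:assert/strict");

// Hypothetical rule: an order ships free when the customer is a member
// and either the order is large or a coupon was applied.
function shipsFree(isMember, isLargeOrder, hasCoupon) {
  return isMember && (isLargeOrder || hasCoupon);
}

// Generate all 2^n combinations of n boolean inputs.
function booleanCombinations(n) {
  const rows = [];
  for (let bits = 0; bits < 2 ** n; bits++) {
    rows.push(Array.from({ length: n }, (_, i) => Boolean(bits & (1 << i))));
  }
  return rows;
}

// Expected results written out by hand (isMember, isLargeOrder, hasCoupon),
// so the test does not simply restate the implementation.
const expectedByInputs = {
  "false,false,false": false,
  "true,false,false": false,
  "false,true,false": false,
  "true,true,false": true,
  "false,false,true": false,
  "true,false,true": true,
  "false,true,true": false,
  "true,true,true": true,
};

for (const combo of booleanCombinations(3)) {
  strictEqual(shipsFree(...combo), expectedByInputs[combo.join(",")], `inputs: ${combo}`);
}
console.log("all 8 combinations checked");
```

Three inputs already need eight rows; each additional boolean doubles the table, before null inputs are even considered.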

Null Testing

It is very important to test how a method behaves when it receives null values instead of valid data.  The method under test should fail in a graceful way instead of crashing or displaying cryptic error messages to the user.
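A minimal JavaScript sketch of what failing gracefully might look like follows. The formatCustomerName function and its result shape (an object with ok plus either value or error) are assumptions for illustration; a real codebase may prefer exceptions with clear messages instead.

```javascript
const { strictEqual } = require("node:assert/strict");

// Hypothetical function: missing input produces a readable error result
// instead of a crash such as "Cannot read properties of null".
function formatCustomerName(customer) {
  // `== null` is true for both null and undefined.
  if (customer == null) {
    return { ok: false, error: "Customer is required." };
  }
  if (customer.firstName == null || customer.lastName == null) {
    return { ok: false, error: "Customer first and last name are required." };
  }
  return { ok: true, value: `${customer.lastName}, ${customer.firstName}` };
}

// Null in place of the whole object.
strictEqual(formatCustomerName(null).ok, false);
// Null in place of one field.
strictEqual(formatCustomerName({ firstName: "Ada", lastName: null }).ok, false);
// The success case still works.
strictEqual(formatCustomerName({ firstName: "Ada", lastName: "Lovelace" }).value, "Lovelace, Ada");
```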

Arrange-Act-Assert

Arrange-Act-Assert is the organizing principle to follow when developing unit tests.  Arrange refers to the work your test should do first in order to set up any necessary data, create supporting objects, etc.  Act refers to executing the scenario you wish to test.  Assert refers to verifying that the outcome you expect is the same as the actual outcome.  A test should have just one assert.  The rationale for this relates to the Single Responsibility Principle.  That principle states that a class should have one, and only one, reason to change.  As I apply that to testing, a unit test should test only one thing so that the reason for failure is clear if and when that happens as a result of subsequent code changes.  This approach implies a large number of small, targeted tests, the majority of which should cover failure scenarios as indicated by the 80/20 Rule defined earlier.
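Here is a small JavaScript sketch of the Arrange-Act-Assert layout, with one assert per test. The ShoppingCart class is a hypothetical stand-in for whatever code is under test.

```javascript
const { strictEqual } = require("node:assert/strict");

// Hypothetical class under test.
class ShoppingCart {
  constructor() {
    this.items = [];
  }
  add(name, price) {
    this.items.push({ name, price });
  }
  total() {
    return this.items.reduce((sum, item) => sum + item.price, 0);
  }
}

function totalSumsThePricesOfAllItems() {
  // Arrange: build the objects and data the scenario needs.
  const cart = new ShoppingCart();
  cart.add("book", 20);
  cart.add("pen", 5);

  // Act: execute the one behavior being tested.
  const total = cart.total();

  // Assert: a single check, so a failure points at a single cause.
  strictEqual(total, 25);
}

function totalOfAnEmptyCartIsZero() {
  // Arrange
  const cart = new ShoppingCart();

  // Act
  const total = cart.total();

  // Assert
  strictEqual(total, 0);
}

totalSumsThePricesOfAllItems();
totalOfAnEmptyCartIsZero();
```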

Test-First Development & Refactoring

This approach to development is best visually explained by this diagram.  The key thing to understand is that a test that fails must be written before the code that makes the test pass.  This approach ensures that the test is good enough to catch any failures introduced by subsequent code changes.  This approach applies not just to new development, but to refactoring as well.  This means, if you plan to make a change that you know will result in broken tests, break the tests first.  This way, when your changes are complete, the tests will be green again and you’ll know your work is done.  You can find an excellent blog post on the subject of test-driven development by Bob Martin here.

Other Resources

I first learned about Arrange-Act-Assert for unit test organization from reading The Art of Unit Testing by Roy Osherove.  He’s on Twitter as @RoyOsherove.  While it’s not just about testing, Clean Code (by Bob Martin) is one of those books you should own and read regularly if you make your living writing software.

Which Programming Language(s) Should I Learn?

I had an interesting conversation with a friend of mine (a computer science professor) and one of his students last week.  Beyond the basic which language(s) question were a couple more intriguing ones:

  1. If you had to do it all over again, would you still stick with the Microsoft platform for your entire development career?
  2. Will Microsoft be relevant in another ten years?

The first question I hadn’t really contemplated in quite some time.  I distinctly recall a moment when there was a choice between two projects at the place where I was working–one project was a Microsoft project (probably ASP, VB6 and SQL Server) and the other one wasn’t (probably Java).  I chose the former because I’d had prior experience with all three of the technologies on the Microsoft platform and none with the others.  I probably wanted an early win at the company and picking familiar technology was the quickest way to accomplish that.  A couple of years later (in 2001), I was at another company and took them up on an opportunity to learn about .NET (which at the time was still in beta) from the people at DevelopMentor.  It only took one presentation by Don Box to convince me that .NET (and C#) were the way to go.  While it would be two more years before I wrote and deployed a working C# application to production, I’ve been writing production applications (console apps, web forms, ASP.NET MVC) in C# from then to now.  While it’s difficult to know for sure how that other project (or my career) would have turned out had I gone the Java route instead of the Microsoft route, I suspect the Java route would have been better.

One thing that seemed apparent even in 1999 was that Java developers (the good ones anyway) had a great grasp of object-oriented design (the principles Michael Feathers would apply the acronym SOLID to).  In addition, quite a number of open source and commercial software products were being built in Java.  The same could not be said of C# until much later.

To the question of whether Microsoft will still be relevant in another ten years, I believe the answer is yes.  With Satya Nadella at the helm, Microsoft seems to be doubling down on their efforts to maintain and expand their foothold in the enterprise space.  There are still tons of businesses of various sizes (not to mention state governments and the federal government) that view Microsoft as a familiar and safe choice both for COTS solutions and custom solutions.  So I expect it to remain possible to have a long and productive career writing software with the Microsoft platform and tools.

As more and more software is written for the web (and mobile browsers), whatever “primary” language a developer chooses (whether Java, C#, or something else altogether), they would be wise to learn JavaScript in significant depth.  One of the trends I noticed over the past couple of years of regularly attending .NET user groups was that fewer and fewer of the talks had much to do with the intricacies and syntactic tricks of Microsoft-specific technologies like C# or LINQ.  Instead, there would be talks about Bootstrap, Knockout.js, node.js, Angular, and JavaScript.  Multiple presenters, including those who worked for Microsoft partners, advocated quite effectively for us to learn these technologies in addition to what Microsoft put on the market in order to help us make the best, most flexible and responsive web applications we could.  Even if you’re writing applications in PHP or Python, JavaScript and JavaScript frameworks are becoming a more significant part of the web every day.

One other language worth knowing is SQL.  While NoSQL databases seem to have a lot of buzz these days, the reality is that there are tons of structured, relational data in companies and governments of every size.  There are tons of applications that still remain to be written (not to mention the ones in active use and maintenance) that expose and manipulate data stored in Microsoft (or Sybase) SQL Server, Oracle, MySQL, and PostgreSQL.  Many of the so-called business intelligence projects and products today have a SQL database as one of any number of data sources.

Perhaps the best advice about learning programming languages comes from The Pragmatic Programmer:

Learn at least one new language every year.

One of a number of useful things about a good computer science program is that after teaching you fundamentals, they push you to apply those fundamentals in multiple programming languages over the course of a semester or a year.  Finishing a computer science degree should not mean the end of striving to learn new languages.  They give us different tools for solving similar problems–and that ultimately helps make our code better, regardless of what language we’re writing it in.

Pseudo-random Sampling and .NET

One of the requirements I received for my current application was to select five percent of entities generated by another process for further review by an actual person. The requirement wasn’t quite a request for a simple random sample (since the process generates entities one at a time instead of in batches), so the code I had to write needed to give each entity generated a five percent chance of being selected for further review.  In .NET, anything involving percentage chances means using the Random class in some way.  Because the class doesn’t generate truly random numbers (it generates pseudo-random numbers), additional work is needed to make the outcomes more random.

The first part of my approach to making the outcomes more random was to simplify the five percent aspect of the requirement to a yes or no decision, where “yes” meant treat the entity normally and “no” meant select the entity for further review.  I modeled this as a collection of 100 boolean values with 95 true and five false.  I ended up using a for-loop to populate the boolean list with 95 true values.  Another option I considered was using Enumerable.Repeat (described in great detail in this post), but apparently that operation is quite a bit slower.  I could have used Enumerable.Range instead, and may investigate the possibility later to see what advantages or disadvantages there are in performance and code clarity.

Having created the list of decisions, I needed to randomize their order.  To accomplish this, I used LINQ to sort the list by the value of newly-generated GUIDs:

decisions.OrderBy(d => Guid.NewGuid()) //decisions is a list of bool

With a randomly-ordered list of decisions, the final step was to select a decision from a random location in the list.  For that, I turned to a Jon Skeet post that provided a helper class (see the end of that post) for retrieving a thread-safe instance of Random to use for generating a pseudo-random value within the range of possible decisions.  The resulting code is as follows:

return decisions.OrderBy(d => Guid.NewGuid()).ToArray()[RandomProvider.GetThreadRandom().Next(100)]; //decisions is a list of bool

I used LINQPad to test my code and over multiple executions, I got between 3 and 6 “no” results.
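For readers who want to experiment with the idea outside .NET, here is a rough JavaScript sketch of the same three steps: build the list of decisions, shuffle it, and pick one at a random position. It is an analogue rather than a port. Math.random stands in for .NET’s Random, a Fisher-Yates shuffle is swapped in for the GUID sort, and buildDecisions, shuffle, and treatNormally are hypothetical names.

```javascript
// Build `total` decisions: true = treat normally, false = select for review.
function buildDecisions(total = 100, reviewCount = 5) {
  return Array.from({ length: total }, (_, i) => i >= reviewCount);
}

// Fisher-Yates shuffle on a copy of the list (used here in place of the GUID sort).
// `rng` is any function returning a number in [0, 1).
function shuffle(items, rng = Math.random) {
  const copy = [...items];
  for (let i = copy.length - 1; i > 0; i--) {
    const j = Math.floor(rng() * (i + 1));
    [copy[i], copy[j]] = [copy[j], copy[i]];
  }
  return copy;
}

// Shuffle the decisions, then pick one from a random position.
function treatNormally(rng = Math.random) {
  const decisions = shuffle(buildDecisions(), rng);
  return decisions[Math.floor(rng() * decisions.length)];
}

// Tally how many of 10,000 simulated entities get selected for review.
const trials = 10000;
let selected = 0;
for (let i = 0; i < trials; i++) {
  if (!treatNormally()) {
    selected++;
  }
}
console.log(`${selected} of ${trials} selected for review`);
```

Over many trials the selected share should hover around five percent. Note that picking a uniformly random position from a list with 5 "no" entries out of 100 already gives each entity a five percent chance; the shuffle mirrors the approach above rather than changing that probability.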

RadioButtonListFor and jQuery

One requirement I received for a recent ASP.NET MVC form implementation was that particular radio buttons be checked on the basis of other radio buttons being checked. Because it’s a relatively simple form, I opted to fulfill the requirement with just jQuery instead of adding knockout.js as a dependency.

Our HTML helper for radio button lists is not much different than this one.  So the first task was to identify whether or not the radio button checked was the one that should trigger another action.  As has always been the case when grouping radio buttons in HTML, each radio button in the group shares the same name and differs by id and value.  The HTML looks kind of like this:

@Html.RadioButtonListFor(m => m.Choice.Id, Model.Choice.Id, Model.ChoiceListItems)

where ChoiceListItems is a list of System.Web.Mvc.SelectListItem and the ids are strings.  The jQuery to see if a radio button in the group has been checked looks like this:

// Runs whenever a radio button in the Choice.Id group changes
$("input[name='Choice.Id']").change(function () {
    ...
});

Having determined that a radio button in the group has been checked, we must be more specific and see if the checked radio button is the one that should trigger additional action. To accomplish this, the code snippet above is changed to the following:

$("input[name='Choice.Id']").change(function () {
    // Act only when the checked radio button is the special choice
    if ($("input[name='Choice.Id']:checked").val() === '@Model.SpecialChoiceId') {
        ...
    }
});

The SpecialChoiceId value is retrieved from the database. It’s one of the values used when building the ChoiceListItems collection mentioned earlier (so we know a match is possible). Now the only task that remains is to check the appropriate radio button in the second grouping. I used jQuery’s multiple attribute selector for this task.  Here’s the code:

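$("input[name='Choice.Id']").change(function () {
    // Act only when the checked radio button is the special choice
    if ($("input[name='Choice.Id']:checked").val() === '@Model.SpecialChoiceId') {
        // Check the matching radio button in the second group
        $("input[name='Choice2.Id'][value='@Model.Choice2TriggerId']").prop('checked', true);
    }
});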

The first attribute filter selects the second radio button group, the second attribute filter selects the specific radio button, and prop('checked', true) sets that radio button’s checked property so it appears selected. Like SpecialChoiceId, Choice2TriggerId is retrieved from the database (RavenDB in our specific case).

Complex Object Model Binding in ASP.NET MVC

In the weeks since my last post, I’ve been doing more client-side work and re-acquainting myself with ASP.NET MVC model binding.  The default model binder in ASP.NET MVC works extremely well.  In the applications I’ve worked on over the past 2 1/2 years, there have been maybe a couple of instances where the default model binder didn’t work the way I needed. The problems I’ve encountered with model binding lately have had more to do with read-only scenarios where certain data still needs to be posted back.  In the Razor template, I’ll have something like the following:

@Html.LabelFor(m => m.Entity.Person, "Person: ")
@Html.DisplayFor(m => m.Entity.Person.Name)
@Html.HiddenFor(m => m.Entity.Person.Name)

Nothing is wrong with the approach above if Name is a primitive (e.g. a string).  But if I forgot that Name was a complex type (as I did on one occasion), the end result was that no name was persisted to our datastore (RavenDB), which meant that there was no data to bring back when the entity was retrieved.  The correct approach for leveraging the default model binder in such cases is this:

@Html.LabelFor(m => m.Entity.Person, "Person: ")
@Html.DisplayFor(m => m.Entity.Person.Name)
@Html.HiddenFor(m => m.Entity.Person.Name.FirstName)
@Html.HiddenFor(m => m.Entity.Person.Name.LastName)
@Html.HiddenFor(m => m.Entity.Person.Name.Id)

Since FirstName, LastName and Id are all primitives, the default model binder handles them appropriately and the data is persisted on postback.

XUnit: Beyond the Fact Attribute (Part 2)

One thing I initially missed about NUnit after moving to XUnit (besides built-in support for it in tools like TeamCity) was attributes like SetUp and TestFixtureSetUp, which let you mark a method that sets variables (or runs any other logic) before each test or before all the tests in a test fixture.  When I first adopted test-driven development as a work practice, I felt they made things easier.  But the authors of NUnit eventually came to a different conclusion about those attributes, and implemented XUnit without them as a result.

Rather than define attributes for per-test and per-fixture setup, the authors of XUnit recommend using a no-arg constructor where you’d use SetUp and IUseFixture where you’d use TestFixtureSetUp or TestFixtureTearDown.  While this took me some time to get used to, leveraging the interface made it easier to handle the external dependencies of code I needed to implement and test.  One technique I’ve adopted to give myself additional flexibility in my test implementations is to add an extension point to the implementation of the SetFixture method

In this example, the extension point is a method named AdditionalFixtureConfiguration.  Calling it inside SetFixture ensures it will be called before each test class derived from UnitTestBase.  Making the method virtual and keeping it free of implementation means that I only need to override it if I need additional pre-test setup for particular test scenarios.  Because we use StructureMap as our IOC container, the equivalent of the UnitTestFixture class in my example has a public member of type StructureMap.IContainer.  The AdditionalFixtureConfiguration method provides a natural home for any setup code needed to configure additional mappings between interfaces and implementations, set up mocks, and even inject mocks into the container if a concrete implementation isn’t available (or isn’t needed).

While this is the implementation I’ve chosen, there are certainly other ways to accomplish the same thing.  Instead of defining AdditionalFixtureConfiguration in UnitTestBase, I could define it in UnitTestFixture instead and call it in every SetFixture implementation (or not, if that customization wasn’t needed).  I prefer having the abstract class because it makes for simpler actual test code.