Goto Statements
I've been quietly lobbying to my programmer friends about the utility of using the old "goto" keyword. I've gotten a little push back from them, mostly dismissive scorn, but hey, not everyone can be good. :) I should start a series on this and this will serve as my first post on the glories of GOTO.
The other day I posted a twitter image showing a good case of a goto statement.
This is a utility method I use when parsing Xml documents into objects. I received some push back from this specific twitter, mainly from my friend @gbattle. The requirements I gave to Greg were: "no nested if's, no multiple exits, no unnecessary object assignments and better error explanation"
He posted about his code for doing the same thing without using gotos. There were a couple points where he was incorrect, and I wanted to walk through his post. I start trivially and then increase for performance and readability. He still made his points regardless of the library being used. There was only one point, property accessing, that does behave different in .NET than say C++.
I will accept from Greg that breaking down each error condition into exactly what caused the specific failure might be better, but on a utility method like this, if I can wrap up the failure condition into a single statement be accurate, I'm fine with it. "Node should have one child CDATA section with data." Sure, specifically telling the user which 3 words of the all inclusive error might be better, but any software dev that can't follow that error message and look at the node that caused the failure, prolly should not be programing anyway.
// For style reasons, I always put the constant on the lhs within an if statement
// for the compiler to automatically discover nasty undesired assignment errors
if (1 != node.ChildNodes.Count)
I understand the sentiment of the lhs positioning of constants, and in my c++ days, I was an advocate of his pattern; however in .NET it is an error to do an assignment in an IF evaluator. (Which was extremely frustrating starting out because I wanted to do it sometimes)
// There is no reason for you to create a new XmlNode return the value.
// You waste time creating a new object and the assignment. Not needed.
XmlNode dataNode = node.FirstChild;
The assignment operator doesn't create a new object. It creates a pointer to the return object of calling property FirstChild. Creating a pointer to a Property result is the fastest way to access the property when you are accessing it more than once. When loading an property it first must resolve the memory location of the containing object, then look into its method offset table, calling the accessor (get) method, which walks more memory offsets, to return an object. Rather than repeatedly calling the Property (get method) I store the results in a stack variable pointer to the actual data returned by the Property call. (I'm ignoring the code inlining that might occur at run time, but that doesn't meet the speed of storing the reference locally)
Now the money shot for why gotos are good in this situation is something in his code he did not account for. When he is checking the number of ChildNodes off the parent node, if it comes out to zero, he has opened his code up for a NullReferenceException. If ChildNodes.Count == 0, then FirstChild will be null. Checking FirstChild.Value when first child is null will result in an exception. And here lies the nested if statements that I'm trying to avoid.
If the ChildNodes.Count property is 0 (or negatively not 1, the actual node count we are looking for), then you must branch this utility method into either a quick return or a nested if statement that continues checking the validity of the XmlNode when there are the correct number of ChildNodes.