Tuesday, July 01, 2008

A Rule A Day: MarkMembersAsStatic

Today I’d like to talk about one of my favorite rules - “Mark Members As Static” MarkMembersAsStatic rule (CA1822). The rule is triggered when you have methods in class that do not use instance members and therefore can be declared as static:

public class StaticTest
{
    // ... implementation 
     
    // fires CA 1822
    public void Log1(string message)
    {
        Trace.TraceError(message);
    }
 
    // does not use instance and is static 
    public static void Log2(string message)
    {
        Trace.TraceError(message);
    }
}

The rule is listed under “Performance” category, and here is why: when you declare method as static it (obviously) affects MSIL code emitted. As shown below, the method declarations will differ in one keyword (I have omitted the method implementations as it is exactly the same):

// static method
.method public hidebysig static void Log2() cil managed { ... }
// instance method
.method public hidebysig instance void Log1() cil managed { ... }

So where is the “performance” difference between two methods? It turns out, when you call those methods different MSIL gets generated using different opcode to call static vs. instance methods (call vs. callvirt). Here is the source code:

public class StaticTest
{
    // ...
    // static method call
    public void TestStatic()
    {
        StaticTest.Log2();
    }
    // instance method call
    public void TestInstance()
    {
        StaticTest test = new StaticTest();
        test.Log1();
    }
}

And here is generated MSIL code:

// static method call
.method public hidebysig instance void  TestStatic() cil managed
{
  .maxstack  8
  IL_0000:  nop
  IL_0001:  call void StaticTest::Log2()
  IL_0006:  nop
  IL_0007:  ret
}
// instance method call
.method public hidebysig instance void  TestInstance() cil managed
{
  .maxstack  1
  .locals init ([0] class StaticTest test)
  IL_0000:  nop
  IL_0001:  newobj instance void StaticTest::.ctor()
  IL_0006:  stloc.0
  IL_0007:  ldloc.0
  IL_0008:  callvirt instance void StaticTest::Log1()
  IL_000d:  nop
  IL_000e:  ret
}

The potential performance hit is hidden behind the opcode: as callvirt calls method on instance, it will throw NullReferenceException if the instance is null (as defined in CIL standard for callvirt opcode); on the other hand, call instruction does not require instance. In MSIL generated from C#, every instance method call will result in callvirt emitted (“why” is explained in Brad Abrams post); as far as performance is concerned, that means that for every instance method call, the check for null reference will be performed (regardless of whether method makes use of instance members). Thus if one has performance critical code, and has lots of methods that may be made static, fixing MarkMembersAsStatic violations can improve performance.

I hear you saying “Whoa, and how big is that performance gain?”. Well, myself I am pretty skeptical about significant performance increase; and I could not think about realistic enough benchmark to test it.

However, I still like this rule very much and here is why – it is my silver bullet against procedural programming style code written in C#. Here is how it goes: if you run code analysis on the code that is being reviewed and get a whole lot of MarkMembersAsStatic violations, then you might be looking at one of the following:

1. Utility function that is integral part of the class. My rule of thumb is that there should be limited number of such functions per class definition, and they should be protected/private.  If that’s the case, the rule violation is fixed by adding static keyword to utility methods (with the added benefit of clarifying the class interface; now it is clear which methods are “global” and not customizable per class inheritance chain).

2. Utility static class hidden in another class definition. If the number of static utility functions defined is large (absolute number or percentage relative to total number of members), it is worth checking whether static routines should be grouped in another (static) class. In such case to fix the rule violations the utility class should be created and used throughout the code.
What I like about having utility classes separate from code describing entity and its behavior, is that by virtue of separating this functionality you increase chances of code reuse. Just consider how likely you are to discover and reuse something contained in DirectoryUtilities class over, say, Patient class.

3. Procedural-style code disguised as static methods. This is the situation where MarkMembersAsStatic help can be invaluable. Basically, what happens is this: you have valid class (judging by interface), and then it turns out that most (if not all) methods can be declared static (MarkMembersAsStatic violations). Then it may be case of static utility class or alternatively behavior separated from data. To illustrate the latter, here is simple example:

    public class Patient
    {
        public string Id
        {
            get { ... }        
        }
    }
 
    public class PatientManager
    {
        public static bool IsSamePatientId(Patient patient1, 
                  Patient patient2)
        {
            if (String.IsNullOrEmpty(patient1.Id) &&    
                   String.IsNullOrEmpty(patient2.Id))
                return true;
            return (String.Compare(patient1.Id, 
                             patient2.Id, true) == 0);
        }
    }

Generally, I am of the opinion that in such cases the behavior belongs to the entity providing data (in the example above, Patient class should implement relevant method):

    public class Patient
    {
        public string Id
        {
            get { ... }        
        }
        public bool IsSameId(Patient patient)
        {
            if (String.IsNullOrEmpty(Id) &&   
                  String.IsNullOrEmpty(patient.Id))
                return true;
            return (String.Compare(Id, patient.Id, true) == 0);
        }
    }

Before C# 3.0, one could argue that if the class cannot be modified (f.e. belongs to System or 3rd party libraries), the procedural approach has its uses. But with the extension methods in C# 3.0 desired behavior may be attached to whatever class.

Thus to conclude, to me finding static methods not marked as such is less about performance and more about finding out (especially when researching unknown code) the interface design quality and “object-oriented”ness of the code, and I recommend it as an instrument for the code review (though your mileage may vary ;).

Related posts
- Design CA1009 DeclareEventHandlersCorrectly
- Performance CA1810 InitializeReferenceTypeStaticFieldsInline

No comments: