Recent reading

Cool articles! You must read them.

Sunday, September 04, 2005

The Case for Nullable Types

Nullable Types
by Kamran Qamar


Imagine some part of your code stores an amount of money. Software is driven by the strict rules of mathematics, and there is normally a need to quantify each variable we use. That is why, in .NET, each type has a default value. Value-type variables always have predefined values, e.g. int has a default value of 0 and bool has a default value of false. So far so good, but a problem arises when we try to apply our strict mathematical rules to the real world, in which you might not have all the data available. For example, how does a value of 0 distinguish the case where a user has no money from the case where they have not entered any value for your software to store? This problem becomes clearer when you start working with relational databases. Relational databases have a concept of a NULL value, which indicates that no data is present. Value types in .NET programming languages have no corresponding value with which to represent it.

Achieving Nullable Values - Current Techniques

Over the years developers have adopted a number of workarounds for this problem, some of which I'll briefly discuss:

Special Values

One of the most notable solutions is to use some special unused value of the given data type as the null value, e.g. if you are dealing with an age, you define it as int and use -1 as the null value. This solution is workable only when you can identify a value which is definitely unused and therefore won't be mistaken for a real value. If you are working with a dollar ($) amount, for example, -1 may indicate a loss of $1, so it is not suitable to represent null if such a value might occur.

This solution is simple to implement, but generally frowned upon because of the potential for subtle runtime bugs, especially if the application requirements subsequently change so that the value previously used as the null value becomes a legitimate value.
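
The pitfall described above can be made concrete with a small sketch. The class name SentinelDemo, the NoBalance constant and the Total method are all hypothetical names invented for this illustration, and the choice of -1 as the sentinel simply follows the example in the text.

```csharp
using System;

// Hypothetical illustration: -1 doubles as "no balance recorded".
public static class SentinelDemo
{
    public const decimal NoBalance = -1m;

    // Sums the balances, skipping entries equal to the sentinel.
    public static decimal Total(decimal[] balances)
    {
        decimal total = 0m;
        foreach (decimal b in balances)
        {
            // A genuine loss of $1 is indistinguishable from "no data"
            // and is silently skipped here as well.
            if (b != NoBalance)
                total += b;
        }
        return total;
    }
}
```

If the -1 in { 10, -1, 5 } was a real loss of $1, the correct total would be 14, yet Total returns 15 and nothing signals the mistake.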

Using a Flag

One solution that I have used frequently is to define a flag in the object that contains the possibly null field, which is used to determine if the value is null or not. Incidentally, system classes also use this approach, e.g. SqlDataReader has a method IsDBNull that reports whether a column of the current database record holds a null.

Here's an example of this approach:

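using System;
using System.Data.SqlClient;

public class Person
{
    private string _fullName;
    private int _age;
    private bool _hasAge;   // true once _age holds a real value

    public string FullName
    {
        get { return _fullName; }
        set { _fullName = value; }
    }

    public int Age
    {
        get
        {
            if (!_hasAge)
                throw new InvalidOperationException("Value not set");
            return _age;
        }
        set
        {
            _age = value;
            _hasAge = true;
        }
    }

    public bool HasAge
    {
        get { return _hasAge; }
    }

    public void Load(SqlDataReader dr)
    {
        // IsDBNull is a method on the reader and takes a column ordinal.
        int nameOrdinal = dr.GetOrdinal("FullName");
        int ageOrdinal = dr.GetOrdinal("Age");

        if (dr.IsDBNull(nameOrdinal))
            _fullName = "";
        else
            _fullName = dr[nameOrdinal].ToString();

        // The flag means "has a value", so it is the opposite of IsDBNull.
        _hasAge = !dr.IsDBNull(ageOrdinal);
        if (_hasAge)
            _age = (int)dr[ageOrdinal];
    }
}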

The above code is an example of using a flag to identify a value as null. Here the code uses the HasAge property to indicate whether an Age value is present. The class sets this flag to true whenever the Age property is set. In client code, you always need to check HasAge before reading the Age property.
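
Client code for the flag approach might look something like the sketch below. PersonSketch is a hypothetical, trimmed-down stand-in for the Person class (it keeps only the Age and HasAge members so the example compiles on its own), and DescribeAge is an invented helper name.

```csharp
using System;

// Hypothetical stand-in for Person, reduced to the nullable field.
public class PersonSketch
{
    private int _age;
    private bool _hasAge;

    public bool HasAge { get { return _hasAge; } }

    public int Age
    {
        get
        {
            if (!_hasAge)
                throw new InvalidOperationException("Value not set");
            return _age;
        }
        set { _age = value; _hasAge = true; }
    }
}

public static class PersonClient
{
    // Checks the flag first, so the Age getter never throws here.
    public static string DescribeAge(PersonSketch p)
    {
        return p.HasAge ? "Age: " + p.Age : "Age: unknown";
    }
}
```

Every caller has to remember the HasAge check; forgetting it turns a missing value into a runtime exception rather than a compile-time error.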

Defining a Nullable Type

Some developers go a step further and define their own value types, e.g. consider the following code:

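public struct NullableInt
{
    // A struct's fields are zeroed by default, so the flag records
    // "has a value": a default NullableInt is then null.
    private bool _hasValue;
    private int _value;

    public bool IsNull { get { return !_hasValue; } }

    public int Value
    {
        get { return _value; }
        set { _value = value; _hasValue = true; }
    }

    // No parameterless constructor: C# 2.0 structs cannot declare one.
    // etc.
}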

By using this NullableInt type instead of standard system int type, we obtain more control over our data.

Editor's Note: ASP Today readers might be interested to know that the ASP Today codebase makes extensive use of this solution for dates. Most of the dates you see displayed on the site come to you courtesy of a custom-coded Date struct, whose main raison d'etre is to wrap System.DateTime to make it nullable!

One advantage of this approach is that you can, if you wish, also define a DefaultValue field. A default value is something value types have, though it is lost in the above code. If you do define such a field, you'll need to decide on the most reasonable behaviour when, for example, constructing instances: you might want an instance to start out holding its default value, but be settable to null if the client code requires it.

This approach can work really well, and has the advantage of encapsulating the nullable-ness so client code doesn't have to worry as much about null values. But what happens when you add, say, two NullableInts? Yes, you can do that; all you have to do is extend the NullableInt struct with some operator overloads. And what about casting a double to NullableInt? That's less trivial, but you can achieve the same result by providing a constructor, at the cost of some less convenient syntax. (You could instead use an operator overload, but I'll illustrate the principle here with a constructor):

public NullableInt(double d) : this()   // this() zero-initializes every field first
{
    Value = (int)d;   // explicit cast; truncates toward zero
}

The client code that uses this might look as follows:

NullableInt i;
double d = 2.5;
i = new NullableInt(d);
Console.WriteLine(i.Value); // prints 2

Where traditionally you would have simply written code like this:

int i;
double d = 2.5;
i = (int)d; // C# requires an explicit cast from double to int
Console.WriteLine(i.ToString()); // prints 2

This approach is extremely powerful but in some cases can be overkill: not every value needs to be nullable all the time, and it requires custom data types and special coding discipline on the part of all the developers involved.
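
The operator overloads mentioned above could be sketched as follows. This is only an illustration under assumptions: NullableIntSketch is a hypothetical, self-contained variant of the NullableInt struct, and the rule that null plus anything yields null is a design choice made for this sketch, not something the struct above prescribes.

```csharp
using System;

// Hypothetical variant of NullableInt with arithmetic and conversion support.
public struct NullableIntSketch
{
    private readonly bool _hasValue;   // false for default(NullableIntSketch)
    private readonly int _value;

    public NullableIntSketch(int value)
    {
        _value = value;
        _hasValue = true;
    }

    public bool IsNull { get { return !_hasValue; } }

    public int Value
    {
        get
        {
            if (!_hasValue)
                throw new InvalidOperationException("Value is null");
            return _value;
        }
    }

    // Assumed rule for this sketch: null propagates through addition.
    public static NullableIntSketch operator +(NullableIntSketch a, NullableIntSketch b)
    {
        if (a.IsNull || b.IsNull)
            return default(NullableIntSketch);
        return new NullableIntSketch(a._value + b._value);
    }

    // Explicit, because converting a double loses the fractional part.
    public static explicit operator NullableIntSketch(double d)
    {
        return new NullableIntSketch((int)d);
    }
}
```

With the conversion operator, client code can write (NullableIntSketch)2.5 instead of calling a constructor, which is closer to the familiar cast syntax for int. Every further operator (subtraction, comparison, and so on) needs its own overload, which is part of the cost of this approach.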

Introducing .NET 2.0 Nullable Types

Now that I have scared you with the limitations of the third approach, let me give you the good news. The C# 2.0 language solves this long-standing problem by providing complete and integrated support for a nullable version of all value types, based on a clever application of generics along with some neat supporting syntax.

A variable of a nullable type can represent all the values of its underlying type, plus an additional null value. The way to define any value type as a nullable type is to use the generic System.Nullable<T> type. Therefore the syntax to define a nullable integer is:

System.Nullable<int> i;

This is kind of tedious, which is why the C# team did us a favor by providing a clean syntax for nullable types in the form T?. System.Nullable<T> and T? are interchangeable, so the statements below are equivalent and both declare a nullable int variable.

System.Nullable<int> i;

Or

int? i;

Using this syntax you can define any value type as a nullable type. You'll see more examples in the rest of this article.
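
A quick way to convince yourself that the two spellings name the same type is the sketch below. The class and method names (SyntaxDemo, SameType, RoundTrip) are hypothetical and exist only for this illustration.

```csharp
using System;

public static class SyntaxDemo
{
    // int? is shorthand for Nullable<int>, so both typeof
    // expressions yield the same Type object.
    public static bool SameType()
    {
        return typeof(int?) == typeof(Nullable<int>);
    }

    // No conversion is needed between the two spellings.
    public static int? RoundTrip(Nullable<int> n)
    {
        int? shorthand = n;
        return shorthand;
    }
}
```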

Understanding the System.Nullable type

Before I dig deeper into the different features and benefits of the nullable types, it will be beneficial to understand how the nullable behavior is achieved in .NET. As I mentioned, at the heart of this behavior is generics. For more information about generics see the excellent article from Juval Lowy at http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnvs05/html/csharp_generics.asp .

System.Nullable<T> is the generic type that makes the magic of nullable types possible. You can define any value type as nullable. Examples of nullable types are:

int? i = -1;
bool? flag = null;
char? chr = 'z';
int?[] MyArray = new int?[10];

Remember, T in System.Nullable<T> is always a value type. Hence the following lines of code are incorrect. When I tried them the compiler only gave warnings, but the released C# 2.0 compiler rejects them with error CS0453, because T must be a non-nullable value type:

string? Message = "Hello, World!"; // Compiler error: string is a reference type
SomeClass? someClass = null;       // Compiler error: SomeClass is a reference type

System.Nullable<T> has two public properties: HasValue, of type bool, and Value, which is of the same type as the nullable type's underlying type. The HasValue property is true for any non-null instance and false for null instances. When HasValue is true, the Value property returns the contained value. When HasValue is false, an attempt to access the Value property results in an InvalidOperationException. This follows what you'd probably regard as expected behaviour for a nullable type.
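
The behaviour of the two properties can be sketched as below. The class NullableProperties and its method names are hypothetical. GetValueOrDefault is not discussed in this article, but it is a real member of System.Nullable<T> and is included as a non-throwing alternative to Value.

```csharp
using System;

public static class NullableProperties
{
    // Reads Value only after HasValue confirms that one is present.
    public static string Describe(int? n)
    {
        if (n.HasValue)
            return "value " + n.Value;
        return "no value";
    }

    // Reports whether reading Value on a null instance throws.
    public static bool ValueThrowsWhenNull()
    {
        int? n = null;
        try
        {
            int unused = n.Value;
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }

    // GetValueOrDefault returns default(int), i.e. 0, instead of throwing.
    public static int OrZero(int? n)
    {
        return n.GetValueOrDefault();
    }
}
```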

Using a Nullable Type

The following code shows how you would use a nullable integer:

int? i = null;                   // declaration; a local must be assigned before it is read
Console.WriteLine(i.HasValue);   // prints False
// Console.WriteLine(i.Value);   // would throw InvalidOperationException
i = 18;                          // assignment of a real value
Console.WriteLine(i.HasValue);   // prints True
Console.WriteLine(i.Value);      // prints 18

Checking for Null Values

As you can see in the above code, when a new instance of a nullable type is created without a value, its HasValue property returns false, which means, as indicated before, that any attempt to access its value will result in an InvalidOperationException. Does this mean we always have to use the HasValue property to determine whether a nullable variable is null? Absolutely not. C# allows you to check whether a nullable object is null by comparing it against the null keyword, just as we do for reference-type objects. Hence, the following syntax is valid:

if (i == null)
    Console.WriteLine("i is null");
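
The comparison with null extends to comparisons between two nullable values. The sketch below uses hypothetical names (NullChecks and its methods) and relies on the standard C# rules: two nulls compare equal with ==, while relational operators such as < yield false whenever either operand is null.

```csharp
using System;

public static class NullChecks
{
    // Equivalent to !n.HasValue.
    public static bool IsNull(int? n)
    {
        return n == null;
    }

    // True when both are null, or both hold the same value.
    public static bool AreEqual(int? a, int? b)
    {
        return a == b;
    }

    // False if either operand is null.
    public static bool LessThan(int? a, int? b)
    {
        return a < b;
    }
}
```

One consequence worth keeping in mind: for a null operand, both a < b and a >= b are false, so the usual assumption that one of them must hold does not apply.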

In client code, you would most commonly check whether the object is null. If it is not null you use the value; otherwise you need to make some decision based on the situation. (This assumes, of course, that the algorithm you are coding does not already guarantee that a non-null value has been placed in the variable concerned.) For example, in the following code I have defined i as a nullable int and x as a normal int. I then check whether i is null; if it is, I assign x the default value of int. This, by the way, demonstrates the new default keyword in C# 2.0. If i is not null, I set x to the value of i.

int? i = 18;
int x;
if (i == null)
    x = default(int);
else
    x = i.Value;

That is quite lengthy code, isn't it? C# 2.0 introduces a new operator, ??, that does exactly this for us. See the code below:

int? i = 18;
int x;
x = i ?? default(int); // falls back to the default value of int

or

x = i ?? 18; //defining my own value
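
The ?? operator can also be chained, which is handy when there are several possible sources for a value. The sketch below uses hypothetical names (CoalesceDemo, FirstKnown); the operator is right-associative, so the expression reads as primary ?? (fallback ?? 0).

```csharp
using System;

public static class CoalesceDemo
{
    // Returns the first non-null operand, or 0 if both are null.
    // The result type is int because the last operand is a plain int.
    public static int FirstKnown(int? primary, int? fallback)
    {
        return primary ?? fallback ?? 0;
    }
}
```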

Conclusion

Remember how difficult it was to initialize or compare an enum or struct type for a null value with .NET 1.0? You typically had to write special code to check whether the value was valid or not. Now with C# 2.0, you can employ the power and elegance of the nullable type and the associated special syntax to check for null just like when using a reference type.

Nullable types equip us with a powerful mechanism to handle special situations, including database null values or other data types that contain fields that may not be assigned a value. Since this functionality is built right into the infrastructure using generics, it does not introduce any drastic performance penalty. However, converting an ordinary value type to a nullable value type does introduce additional work for the infrastructure; hence nullable types should be used cautiously. Their use also dictates special consideration while assigning to ordinary value types and operating with them.
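
The points about database nulls and about mixing nullable and ordinary value types can be illustrated with the sketch below. NullableInterop and its methods are hypothetical names. DataRow is used instead of a live SqlDataReader only so that the example needs no database, and the column is assumed to be of type int.

```csharp
using System;
using System.Data;

public static class NullableInterop
{
    // Maps a possibly-NULL integer column to int?.
    // Assumes the column's data type is int.
    public static int? ReadInt(DataRow row, string column)
    {
        return row.IsNull(column) ? (int?)null : (int)row[column];
    }

    // Arithmetic on nullable operands is "lifted": if either operand
    // is null, the result is null rather than an exception.
    public static int? Sum(int? a, int? b)
    {
        return a + b;
    }

    // Assigning int? to int needs an explicit cast, and that cast
    // throws InvalidOperationException when the value is null.
    public static int ToInt(int? n)
    {
        return (int)n;
    }
}
```

The explicit cast in ToInt is the "special consideration" in practice: the compiler will not let an int? flow silently into an int, so each such assignment is a place to decide what null should mean.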

That was a quick overview of nullable types in .NET 2.0. For more information you can consult the .NET 2.0 documentation.

Happy Programming!