.NET Type System, Part I

Learn how to write better .NET and C# applications! .NET allows you to store simple values and complex values. This article explains how, as well as how to use the ILDASM.EXE tool (Intermediate Language Disassembler) to inspect MSIL compiled .NET files. It also gives a detailed explanation of how and where the values are stored in your programs.

March 08, 2005

In this article you will learn about Value Types and Reference Types, and when you complete the article you will have an in-depth understanding of how and where the values are stored in your programs, which will help you write better .NET and C# applications. You will learn about the Stack and the Managed Heap, too. You will use the ILDASM.EXE tool (Intermediate Language Disassembler) to inspect the MSIL compiled .NET files (which we call Assemblies). As for Value Types, you will learn about Built-In Data Types, Enumerations and Structures. As for Reference Types, you will learn about Arrays and Strings for now. There are also Classes and Delegates, but I will just introduce them here and discuss each of them in its own article. So let's get started with a look at the .NET solution for storing simple values and complex values.

The .NET Types, again

Every time you write a C# application you will probably be defining Types. As we discussed in the previous article, Introducing C# and the .NET Framework, a Type is a representation of a value and its associated operations. Let's now divide up the kinds of values that we can have in our applications.

Every programming language has its own set of built-in data types such as integer numbers, floating point numbers and strings, and user defined data types such as classes and structures. This is great, but it's essential for you as a developer to understand what happens when you use these types, and what behavior to expect from each of them. Say that at a given point in your application you want to store a student's total grade (which is 98) in a variable of type int (integer), so you will use the following statement:

int TotalGrade = 98; 

What happens here is that the CLR allocates a four-byte memory location of type System.Int32, stores the value 98 there, and gives this location the name TotalGrade. What you need to know at this point is that:

  1. The location that the CLR has allocated to the variable TotalGrade is Strongly-Typed, which means that only the operations defined on the Int32 type are permitted for this location.

  2. TotalGrade contains a simple value, 98, so the value is stored directly in the variable's memory location.

The TotalGrade variable is a Value Type, which means that the value 98 is stored in-line (in the variable location). .NET actually defines another kind of Type called Reference Types. We will discuss Value and Reference Types in this article, but for now think of a Reference Type as a type that stores a reference to another memory location where the actual data is stored. So the .NET Type system is built on the idea of types whose variables contain the value (Value Types) and types whose variables contain a reference to another memory location where the actual value is stored (Reference Types). Table 2.1 illustrates the CTS Type System.

Reference Types

  1. Classes
  2. Delegates
  3. Arrays *
  4. Strings *
  5. Interfaces

Value Types

  1. Primitive Types *
  2. Structures *
  3. Enumerations *

* means that this Type will be discussed in this article; the others will be discussed in later articles.

Before we discuss Value Types and Reference Types, I want to introduce the ILDASM.EXE tool.

The ILDASM.EXE Tool

The .NET SDK includes a tool called ILDASM.EXE (Intermediate Language Disassembler) that you can use to inspect the generated .NET Assemblies and see the byte code instructions. For this section copy the following code, paste it into Notepad, and save it as HelloILDASM.cs.

using System;

namespace HelloILDASM
{
  public class Class1
  {
    static void Main()
    {
      Console.WriteLine("Hello ILDASM from C#");
    }
  }
}

Compile the application using the following command (note that I saved the file in the root of drive C on my machine):

C:\>csc HelloILDASM.cs

Use the following command to load the application with the ILDASM tool:

C:\>ildasm HelloILDASM.exe

And this is exactly what you will get:

First of all, let's learn more about these icons. Examine the table below.

Icon    Description

This is a namespace in the loaded assembly; here we have the HelloILDASM namespace.

This is a Reference Type symbol. We use this icon for Class1 (which is of course a Reference Type).

This is a structure Type symbol; it's the same shape as the Reference Type, but a different color.

This is an Interface symbol. In .NET there are some naming conventions, such as preceding the interface name with the I letter, as in IDisposable interface.

This is an Enumeration type icon.

This red arrow means more information can be obtained.

This symbol represents an instance method.

The S letter on the method symbol means that it's a static method.

This symbol represents a field.

This is a Property symbol.

This is an Event symbol.

This is the symbol for a static field or a constant.

Understanding MSIL is beyond the scope of this article, but we will take a quick look at the MSIL instructions. It's good for you to understand the MSIL code that the C# compiler generates from your source code; it will help you understand many concepts and develop better applications.

All the classes in the .NET Framework derive from a common base class which is called Object and lives in the System namespace. To illustrate this, double click on the More Information Icon of the Class1 class and the following class declaration window will be shown:

Actually this is no more than declaring a class (Class1) which is public and extending the class System.Object which lives in the assembly mscorlib. We didn't define that our Class1 would extend and inherit from System.Object; it's implicitly done for us by the C# Compiler. We will talk about this class soon. Close this window and double click on the .ctor(): void method and the next window will be shown:

This method declaration is the default constructor for our class Class1, and it does nothing more than call the System.Object constructor using the MSIL Instruction call. The method is marked as Managed Code using the cil managed keyword. Close this window and double click on the Main: void(). The next window will be shown:

As you know, every C# application has an entry point, and it's always the Main method. Here the .entrypoint defines the Main method as the application's entry point. Our C# source code was very simple; we just made a call to Console.WriteLine() method and passed the string "Hello ILDASM from C#". Let's see what the C# compiler has generated in the compiled assembly file.

IL_0000: ldstr   "Hello ILDASM from C#"
IL_0005: call     void [mscorlib]System.Console::WriteLine(string)
IL_000a: ret

The MSIL code is a series of instructions that the CLR executes. The first instruction is ldstr, which loads the specified string constant onto the stack. The next instruction, call, calls the method System.Console.WriteLine(string). The method argument is taken from the stack and nothing is put back -- because the method returns void, which means that the method doesn't return a value.

I don't want to delve into a lot of details about the MSIL Instructions because it would fill a book. But I will show you the MSIL code in a lot of articles until you get used to it. It will help you understand the code that gets generated under the hood. It's not that difficult and you will not need to write .NET Applications using IL Assembler; in most cases you will be using C#.

System.Object Class

The CTS has a base class, System.Object, from which all other classes in the .NET Framework must derive, directly or indirectly. At first you may think that it's not useful, but think about it: traditional programming languages have primitive types to store integers, floating point numbers and character strings, besides the user defined data types in the form of classes. In these languages the built-in types have nothing in common with each other or with user defined types, so the lack of a common base class makes it hard to write general purpose code. In C# and other .NET-compliant languages we can write code this way:

public void GetName(System.Object anyType)
{
  Console.WriteLine(anyType.ToString());
}

This method accepts a parameter of type System.Object and calls the ToString() method on it. We can't do this in the same way in C or C++; there we would have to write wrapper classes and overload constructors for each type we want to support. The CTS comes to the rescue with its base class (System.Object): all types derive from it, and this includes value types and reference types, both built-in and user-defined. The following are the public and protected methods of this class, which are implicitly inherited by any class.

Public methods

Method: Description

ToString: Returns a string that represents the object; it's like an answer to the question "What's your name?". It is virtual, which means that derived classes can override it to supply an appropriate representation of the object. For example, a Customer class may override this method to return the following pattern: CustomerFirstName + CustomerLastName.

Equals: The behavior of this method differs between value types and reference types. We will talk about this method in the last article (Objects manipulation and Operations). For now it is enough to say that it compares two objects and returns true if they are equal and false if they are not.

GetType: Returns a System.Type object that describes the Type of the object.

GetHashCode: Returns a hash code for the object, which hash tables use to locate it quickly.

Protected methods

Finalize: The Garbage Collector calls this method before it reclaims the object's memory. It is used to clean up and release resources that have been used by your object. It's not guaranteed that the CLR will call this method, so you need to use another technique to release resources reliably.

MemberwiseClone: Returns a shallow copy of the object. More on copying objects later.
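
To make the list above more concrete, here is a minimal sketch of a class that overrides the virtual methods and exposes a shallow copy. The Customer class, its fields and the ShallowCopy method are my own illustrative names, not part of the .NET Framework, and the equality rule (same first and last name) is just an assumption for the example.

```csharp
using System;

// Hypothetical Customer class, used only for illustration.
public class Customer
{
    public string FirstName;
    public string LastName;

    public Customer(string firstName, string lastName)
    {
        FirstName = firstName;
        LastName = lastName;
    }

    // ToString: supply a meaningful text representation.
    public override string ToString()
    {
        return FirstName + " " + LastName;
    }

    // Equals: here two customers are equal when their names match.
    public override bool Equals(object obj)
    {
        Customer other = obj as Customer;
        if (other == null) return false;
        return FirstName == other.FirstName && LastName == other.LastName;
    }

    // GetHashCode: must agree with Equals (equal objects, equal hash codes).
    public override int GetHashCode()
    {
        return ToString().GetHashCode();
    }

    // MemberwiseClone is protected, so we expose it through a public method.
    public Customer ShallowCopy()
    {
        return (Customer)MemberwiseClone();
    }
}

public class ObjectMethodsDemo
{
    public static void Main()
    {
        Customer a = new Customer("Ada", "Lovelace");
        Customer b = a.ShallowCopy();

        Console.WriteLine(a.ToString());                        // Ada Lovelace
        Console.WriteLine(a.GetType().FullName);                // Customer
        Console.WriteLine(a.Equals(b));                         // True
        Console.WriteLine(object.ReferenceEquals(a, b));        // False
        Console.WriteLine(a.GetHashCode() == b.GetHashCode());  // True
    }
}
```

Note that a and b are two different objects on the Managed Heap, yet Equals reports them as equal because we defined equality in terms of the names.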

 

The Stack and the Heap

If you are a C++ programmer, you are already familiar with object allocation with the Stack and the Heap. With C++ we need to decide if our object needs to be allocated on the Stack or the Heap, and of course it's a painful task (in comparison with what we have now with C# and .NET).

An object allocated on the Stack gives us fast access to its value, but its lifetime is bound to the method that allocated it: the object is destroyed when the method returns or completes execution. On the other hand, objects allocated on the Heap have a lifetime that goes beyond the method that allocated them, and they can be shared with other classes, but they cost a little more to allocate than Stack objects. As you know, in C++ objects allocated on the Heap must be explicitly managed, and if you forget to free them you will get a memory leak or a program failure.

In .NET this lifetime is managed for us by the CLR, and it is also optimized. Where a Type is allocated now depends on where it is used. I know, this is a bit confusing. In other words, a value-type declared as a local variable in a method will be allocated on the Stack, while a value-type declared as a field of a reference-type will be allocated on the Managed Heap, as part of the object that contains it. Let's look at Value-Types and Reference-Types in detail so you can understand what I mean.

Value-Types

Value-Types represent any simple value you may have in your programs; semantically such a value is a number (3, 65.67) or a particular kind of data such as a date (12/1/2004). For example, a variable of type int is a value-type, and when you declare it the CLR will allocate a four-byte memory location to store its value. So a variable of type int needs a fixed memory size of four bytes.

Value-Types put little overhead on the CLR because most of them, as we will see, are located on the stack and destroyed when the method returns, at which point the memory locations that have been used for the local variables are reclaimed. In .NET, Value-Types include the primitive types, User-Defined Value Types (Structures) and Enumerations. Value Types derive from the System.ValueType base class, which in turn derives from the System.Object base class. The ValueType class adapts the functionality provided by System.Object to value types; this includes comparing two instances of a value type, which will be discussed in the last article of the series.

When you assign one value-type variable to another (like x = y) the approach taken is copy-by-value, which means that the value of y is copied into x. Let's use a simple example to explain this behavior. Create a new Visual Studio.NET Console Project and call it ValueTypes, then delete all the code in Class1.cs and copy in the code below. If you are not using VS.NET, paste the code below into a file with a .cs extension and compile it.

using System;

namespace ValueTypes
{
  class Class1
  {
    static void Main(string[] args)
    {
      int EnglishGrade = 40;
      int FrenchGrade = 35;

      Console.WriteLine("The English Grade = {0}", EnglishGrade);
      Console.WriteLine("The French Grade = {0}", FrenchGrade);

      //here we assign EnglishGrade to FrenchGrade
      FrenchGrade = EnglishGrade;

      //then we write to the console the value again
      Console.WriteLine("(after assignment)The English Grade = {0}", EnglishGrade);
      Console.WriteLine("(after assignment)The French Grade = {0}", FrenchGrade);

      //read line to just make the console wait for you
      Console.ReadLine();
    }
  }
}

Run the Application and you will get the following result in the console:

The English Grade = 40
The French Grade = 35
(after assignment)The English Grade = 40
(after assignment)The French Grade = 40

It's pretty clear that the value of EnglishGrade has been copied to the variable FrenchGrade (through the assignment statement) and this illustrates the idea of Copy-By-Value. What you can notice here is that the variable FrenchGrade value was 35 before the assignment statement, and after this assignment it has the same value as the EnglishGrade variable, so the CLR performed bit-by-bit copying to have the value of EnglishGrade copied to the variable FrenchGrade.

All the built-in Types (listed in the previous article's CTS built-in types tables) are Value-Types except System.String and System.Object. We will talk about System.String object in the section Reference-Type. There's nothing special about primitive types to discuss at the moment, so we will move directly to Structures and Enumerations.
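
If you want to check this claim yourself, the sketch below asks the runtime whether a few built-in types are value types, using the IsValueType property of System.Type. The class name BuiltInTypesDemo and its helper method are just names I picked for the example.

```csharp
using System;

public class BuiltInTypesDemo
{
    // Returns true when the given type is a value type.
    public static bool IsValueType(Type type)
    {
        return type.IsValueType;
    }

    public static void Main()
    {
        Console.WriteLine("int:    {0}", IsValueType(typeof(int)));     // True
        Console.WriteLine("double: {0}", IsValueType(typeof(double)));  // True
        Console.WriteLine("bool:   {0}", IsValueType(typeof(bool)));    // True
        Console.WriteLine("string: {0}", IsValueType(typeof(string)));  // False
        Console.WriteLine("object: {0}", IsValueType(typeof(object)));  // False

        // The C# keyword int is an alias for System.Int32.
        Console.WriteLine(typeof(int).FullName);                        // System.Int32
    }
}
```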

Structures: complex Value-Types

The primitive data types such as int, double and byte represent simple values, but sometimes you need to store complex numerical values, such as a point on the screen in the form (x, y). Maybe you want to store the student English grade and French grade in one structure, and you may want to associate a number of operations with this structure. The primitive types can't help you to store such values, because you can't store two values in one variable. You may use a class in the following manner:

public class StudentGrades
{
  public int EnglishGrade = 40;
  public int FrenchGrade = 35;
}

But using a class (which is a Reference Type) to store two simple Value-Types (like x and y for the screen coordinates) is relatively costly for the CLR, because every instance has to be allocated on the Managed Heap and reclaimed later by the Garbage Collector. In C# we have another construction called a structure. When you need to define your own Value-Type you will be defining a structure. We can rewrite the above class as the following structure (without the field initializers, which structures don't allow, as we will see below):

public struct StudentGradesStruct
{
  public int EnglishGrade;
  public int FrenchGrade;
}

You define a structure using the keyword struct, and you may precede it with an access modifier (the keyword public means that the structure is visible to any client code). Let's put the class StudentGrades and the structure StudentGradesStruct in one file and compile it so we can take a look at the MSIL. Copy the following code into a file and save it as StructClass.cs, then compile it using the command csc /t:library StructClass.cs. Or you can use Visual Studio.NET to compile it as a Class Library by selecting a Class Library project.

using System;
namespace Structures
{
  public class StudentGrades
  {
    public int EnglishGrade;
    public int FrenchGrade;
  } 

  public struct StudentGradesStruct
  {
    public int EnglishGrade;
    public int FrenchGrade;
  }
}

Now let's take a look at the MSIL code using the ILDASM tool. Write the following command:

C:\>ildasm StructClass.dll

The ILDASM tool will load the Class Library file StructClass.dll. Now double click on the namespace Structures and you will see both the class and the structure. Double click on both so you can see their members.

You can learn a lot of "under the hood" concepts by loading your compiled files into this tool as we do here. Note the following points about both the class and the structure:

  • The class StudentGrades has a .ctor method, which is a default constructor, while the structure StudentGradesStruct has no constructor.

  • Both of them are just classes in MSIL, but the class StudentGrades derives from System.Object, while the structure StudentGradesStruct derives from System.ValueType.

  • The structure is declared in the MSIL code as sealed, which means that it can't be a base class for other classes (it can't be inherited). 

It's interesting, isn't it?
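
If you don't have ILDASM at hand, you can observe most of the same facts at run time through reflection. The following is only a sketch that repeats the two types from StructClass.cs (without the namespace, to keep it short) and asks System.Type about them; the StructVersusClassDemo class is a name I made up for the test program.

```csharp
using System;

public class StudentGrades
{
    public int EnglishGrade;
    public int FrenchGrade;
}

public struct StudentGradesStruct
{
    public int EnglishGrade;
    public int FrenchGrade;
}

public class StructVersusClassDemo
{
    public static void Main()
    {
        Type classType = typeof(StudentGrades);
        Type structType = typeof(StudentGradesStruct);

        // Base classes, as shown in the class declaration windows of ILDASM.
        Console.WriteLine(classType.BaseType.FullName);    // System.Object
        Console.WriteLine(structType.BaseType.FullName);   // System.ValueType

        // Only the structure is sealed.
        Console.WriteLine(classType.IsSealed);             // False
        Console.WriteLine(structType.IsSealed);            // True

        // Number of public instance constructors found in the metadata.
        Console.WriteLine(classType.GetConstructors().Length);   // 1
        Console.WriteLine(structType.GetConstructors().Length);
    }
}
```

The last line is left without an expected value on purpose: compare what it prints on your machine with the first point in the list above.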

Structures differ from classes in that they don't need the Garbage Collector (the .NET mechanism for destroying Reference-Types) to reclaim their memory. Structures are Value-Types, which means that when you create a structure as a local variable it will be located on the stack, and when the method completes, it will be destroyed. When you assign one structure instance to another, it will be copied much like in the assignment example above.

You should be careful when using structures precisely because they're Value-Types. If you have a very large structure, then copying it around in the application will slow things down. It's better to define such a structure as a class, because it's cheaper to copy a reference to another variable (classes exhibit Copy-By-Reference behavior) than to copy all the values of the structure (say 20 int fields, which means 80 bytes).
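
The difference between the two copying behaviors is easy to see in a small experiment. This is a minimal sketch with two hypothetical types, GradesClass and GradesStruct, that have the same fields; the names and the values 40 and 99 are arbitrary.

```csharp
using System;

// Hypothetical pair of types with the same fields.
public class GradesClass
{
    public int EnglishGrade;
    public int FrenchGrade;
}

public struct GradesStruct
{
    public int EnglishGrade;
    public int FrenchGrade;
}

public class CopySemanticsDemo
{
    // Copies a structure, changes the copy, and returns the original's grade.
    public static int OriginalAfterStructCopy()
    {
        GradesStruct original = new GradesStruct();
        original.EnglishGrade = 40;

        GradesStruct copy = original;   // all the fields are copied
        copy.EnglishGrade = 99;         // changes the copy only

        return original.EnglishGrade;
    }

    // Copies a reference, changes the object through it, and returns the original's grade.
    public static int OriginalAfterClassCopy()
    {
        GradesClass original = new GradesClass();
        original.EnglishGrade = 40;

        GradesClass copy = original;    // only the reference is copied
        copy.EnglishGrade = 99;         // both variables refer to the same object

        return original.EnglishGrade;
    }

    public static void Main()
    {
        Console.WriteLine(OriginalAfterStructCopy());  // 40
        Console.WriteLine(OriginalAfterClassCopy());   // 99
    }
}
```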

Structures can contain fields, methods, indexers, events, constructors and properties, but can't contain destructors. This makes sense because they're not Heap-Based objects -- thus, they can't have destructors. There are some important points you should know about structures, because they are Stack-based objects.

  • You can't define a default (parameterless) constructor for a structure, because default initialization is provided for you: when you use the new operator without arguments, the structure's fields are set to their default values, which are false for Boolean Types, 0 for numerical Types and null for references.

  • You can't initialize a field in the same declaration statement, except for constant values, which you declare using the keyword const (if you are not familiar with constant values, we will talk about these in the article covering Class and Access Modifiers).

  • You can define as many parameterized constructors as you want to initialize your fields.

  • As we saw in the MSIL code, the structure is declared using the keyword sealed, which means that you can't derive from it. In other words, it can't be a base class, so you will not use the C# keywords virtual, sealed or abstract in it. A structure implicitly inherits from System.ValueType and can't inherit from anything else, but it can implement interfaces; this is a result of supporting methods in structures.

  • You can override the methods inherited from System.ValueType like the ToString() method to support your programming construction.
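
The points above can be sketched in one small program. The ITotal interface, the Grades structure and the MaxGrade constant are hypothetical names that I use only to show the rules in action; they don't come from the .NET Framework.

```csharp
using System;

// Hypothetical interface, used only to show that a structure can implement one.
public interface ITotal
{
    int Total();
}

public struct Grades : ITotal
{
    public const int MaxGrade = 50;   // constants may be initialized inline
    public int EnglishGrade;
    public int FrenchGrade;

    // A parameterized constructor must assign every field.
    public Grades(int english, int french)
    {
        EnglishGrade = english;
        FrenchGrade = french;
    }

    public int Total()
    {
        return EnglishGrade + FrenchGrade;
    }

    // Overrides the method inherited from System.ValueType.
    public override string ToString()
    {
        return "Total = " + Total();
    }
}

public class StructRulesDemo
{
    public static void Main()
    {
        Grades empty = new Grades();        // fields take their default value, 0
        Grades filled = new Grades(40, 35);
        ITotal viaInterface = filled;       // the structure used through its interface

        Console.WriteLine(empty.Total());          // 0
        Console.WriteLine(filled);                 // Total = 75
        Console.WriteLine(viaInterface.Total());   // 75
        Console.WriteLine(Grades.MaxGrade);        // 50
    }
}
```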

Let's extend the StudentGradesStruct to override the inherited ToString() method. We will define two constructors to initialize the member fields.

public struct StudentGradesStruct
{
  public int EnglishGrade;
  public int FrenchGrade;

  public StudentGradesStruct(int EG)
  {
    EnglishGrade = EG;
    FrenchGrade = 0;
  }

  public StudentGradesStruct(int EG, int FG)
  {
    EnglishGrade = EG;
    FrenchGrade = FG;
  }

  public override string ToString()
  {
    return Convert.ToString(EnglishGrade + FrenchGrade);
  }
}

There is nothing new about this code except for the ToString() method, which overrides the base class implementation (which returns the fully qualified name of the type) to return an appropriate representation of the data. In this case, we return the total of the two grades. To run this code you need a test class like the following one (both need using System; at the top of the file):

public class Test
{
  static void Main()
  {
    StudentGradesStruct student = new StudentGradesStruct(40,35);
    Console.WriteLine(student.EnglishGrade.ToString());
    Console.WriteLine(student.FrenchGrade.ToString());
    Console.WriteLine(student.ToString());
    Console.ReadLine();
  }
}

You will get the following result from the console:

40
35
75

Note that you can call the ToString() method on any type in .NET because it's functionality inherited from the ultimate base class System.Object.
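
As a quick illustration of that note, the sketch below passes an int, a string and a user-defined structure to the same method, in the spirit of the GetName method from the System.Object section. ScreenPoint and ObjectParameterDemo are names I invented for the example.

```csharp
using System;

// Hypothetical structure, not part of the article's StructClass.cs file.
public struct ScreenPoint
{
    public int X;
    public int Y;

    public override string ToString()
    {
        return "(" + X + ", " + Y + ")";
    }
}

public class ObjectParameterDemo
{
    // Accepts any type, because every type derives from System.Object.
    public static string Describe(object anyType)
    {
        return anyType.ToString();
    }

    public static void Main()
    {
        ScreenPoint p = new ScreenPoint();
        p.X = 3;
        p.Y = 4;

        Console.WriteLine(Describe(98));        // 98      (a built-in value type)
        Console.WriteLine(Describe("hello"));   // hello   (a reference type)
        Console.WriteLine(Describe(p));         // (3, 4)  (a user-defined structure)
    }
}
```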

Let's copy the student structure variable into another variable and see the result. Modify the Main method to look like the following:

static void Main()
{
  StudentGradesStruct student = new StudentGradesStruct(40,35);
  Console.WriteLine(student.EnglishGrade.ToString());
  Console.WriteLine(student.FrenchGrade.ToString());
  Console.WriteLine(student.ToString());

  //assignment statement: copy the structure, then print the copy
  StudentGradesStruct student2 = student;
  Console.WriteLine(student2.EnglishGrade.ToString());
  Console.WriteLine(student2.FrenchGrade.ToString());
  Console.WriteLine(student2.ToString());

  Console.ReadLine();
}

The result would be:

40
35
75
40
35
75

Enumerations

Enumerations are not new to programming languages. In .NET and C# they have been designed as an Object-Oriented construction, and this gives developers a lot of new features that were not available, for example, in C++.

An Enumeration is a set of named-values. In other words, it's a set of numerical values associated with a name for each value. You define Enumerations in C# like the following example, a Jobs Enumeration:

enum Jobs
{
  Programmer, //Implicitly assigned a value of 0 
  Administrator, //Implicitly assigned a value of 1 
  Designer,  //Implicitly assigned a value of 2 
  Architect, //Implicitly assigned a value of 3 
  Manager    //Implicitly assigned a value of 4
}

We use the keyword enum followed by the Enumeration identifier. Before we talk more about Enumerations you should know that every time you define an Enumeration, it inherits from the base-class System.Enum, which in turn inherits from System.ValueType, which in turn inherits from System.Object.
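
You can verify this inheritance chain with a few lines of reflection. This is just a sketch that walks up the BaseType property starting from the Jobs Enumeration; EnumHierarchyDemo is an arbitrary name for the test class.

```csharp
using System;

public enum Jobs
{
    Programmer,
    Administrator,
    Designer,
    Architect,
    Manager
}

public class EnumHierarchyDemo
{
    public static void Main()
    {
        // Walk up the inheritance chain, starting from the enumeration itself.
        // Prints: Jobs, System.Enum, System.ValueType, System.Object
        Type current = typeof(Jobs);
        while (current != null)
        {
            Console.WriteLine(current.FullName);
            current = current.BaseType;
        }
    }
}
```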

As we said before, an Enumeration is a set of names (or symbols), each associated with a value. In the Jobs enumeration the values are not explicit; as my comments (//Implicitly assigned a value of 0, and so on) indicate, they are set by the C# Compiler, starting from 0. Each Enumeration has an underlying type, which is the type of the values assigned to the symbols. By default it's int, but you can use byte, sbyte, short, ushort, uint, long or ulong as the underlying type, as in the following example:

enum Jobs: byte
{
  Programmer,    //Implicitly assigned a value of 0 
  Administrator, //Implicitly assigned a value of 1 
  Designer,      //Implicitly assigned a value of 2 
  Architect,     //Implicitly assigned a value of 3 
  Manager        //Implicitly assigned a value of 4
}

You can explicitly define the values:

enum Jobs: byte
{
    Programmer = 1,
    Administrator = 2,
    Designer = 3,
    Architect = 4,
    Manager = 5
}

You can define the values in whatever order you like:

enum Jobs: byte
{
    Programmer = 5,
    Administrator = 4,
    Designer = 1,
    Architect = 2,
    Manager = 3
}

You can define a value for the first symbol only, and the other symbols' values will be implicitly assigned by incrementing it:

enum Jobs: byte
{
    Programmer = 1,
    Administrator,     // here is 2
    Designer,           // here is 3
    Architect,         // here is 4
    Manager           // here is 5
}

As you can see, Enumerations are a very flexible Object-Oriented construction. They make developing applications easier because you don't have to remember the types of jobs you have in your application and their values. For example, you will just write Jobs.Programmer instead of writing the value 1 (maybe as an argument to a method that has a parameter of type Jobs). Besides that, Enumerations in .NET are Strongly-Typed, which means that you can't pass a Jobs value to a method parameter that expects a value of a different Enumeration, say Titles.
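
Here is a small sketch of that strong typing at work. The Titles Enumeration, its members and the Describe method are hypothetical; the two commented-out calls are the ones the compiler should reject.

```csharp
using System;

public enum Jobs : byte
{
    Programmer = 1,
    Administrator,
    Designer,
    Architect,
    Manager
}

// Hypothetical second enumeration, standing in for the Titles type mentioned above.
public enum Titles
{
    Junior,
    Senior
}

public class EnumTypingDemo
{
    // Accepts only Jobs values.
    public static string Describe(Jobs job)
    {
        switch (job)
        {
            case Jobs.Programmer: return "writes code";
            case Jobs.Manager:    return "runs the team";
            default:              return "does other work";
        }
    }

    public static void Main()
    {
        Console.WriteLine(Describe(Jobs.Programmer));   // writes code
        Console.WriteLine(Describe(Jobs.Designer));     // does other work

        // Console.WriteLine(Describe(Titles.Senior));  // compile-time error: wrong enumeration
        // Console.WriteLine(Describe(1));              // compile-time error: no implicit conversion

        Console.WriteLine(Describe((Jobs)5));           // runs the team (explicit cast)
    }
}
```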

Enumerations are a Value-Type like Structures. Under the hood it looks like a structure; when the C# compiler compiles the Jobs Enumeration it makes all the symbols and their associated values into constant members of a structure like the following pseudo code:

struct Jobs: System.Enum
{
  public const Jobs Programmer = (Jobs)1;
  public const Jobs Administrator = (Jobs) 2;
  public const Jobs Designer = (Jobs) 3;
  public const Jobs Architect = (Jobs) 4;
  public const Jobs Manager = (Jobs) 5;
}

This pseudo code should clarify the inner workings of an enumeration: it's a structure that defines constant values for the symbols that you want to define. All the fields are declared as public const, which means that at compile time the compiler will replace any reference to Jobs.Programmer, Jobs.Administrator and so on with its numerical value. Also note that the compiled file (the assembly) is full of information about the Types you defined, including Enumeration Types. The assembly will contain information about each symbol and its associated value, so we can use casting operations to get the underlying value of a given symbol, as in the following:

using System;

namespace Structures
{
  public class Test
  {
    static void Main()
    {
      Console.WriteLine(Convert.ToString((byte)Jobs.Administrator));
      Console.ReadLine();
    }
  }

  enum Jobs: byte
  {
    Programmer = 1,
    Administrator, // here is 2
    Designer, // here is 3
    Architect, // here is 4
    Manager // here is 5
  }
}
     

The output will be:

2

We have used the cast operator, (byte) in our example, to convert the symbol Administrator of the Jobs Enumeration to its numerical value in the Enumeration's underlying type. The Convert.ToString() method returns a string representation of that value; here we pass it an expression that evaluates to the byte value 2 (the value of Jobs.Administrator). I think it will be interesting at this point to take a look at the MSIL as we did with structures. Create an Enum.cs file, copy only the Jobs Enumeration to this file, and compile it using the following command:

C:\>csc /t:library Enum.cs

Now let's open it with the ILDASM tool using the following command:

C:\>ildasm Enum.dll

The ILDASM tool will load the class library and you will get the same result as the following:

As you can see the Jobs Enumeration contains constant values and when you click on one of them you will get a window with its value:

We have assigned the value 1 to the Jobs.Programmer symbol, and as you can see here it has the value 0x01 (hexadecimal for 1). As I have told you, programming in C# is fun, especially with the ILDASM tool, which lets you understand what happens under the hood.
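
Because the symbols and their values are kept in the assembly, you can also read them back at run time with the static methods of System.Enum. The following is a minimal sketch using the same Jobs Enumeration; EnumMetadataDemo is just a name for the test class, and the value 9 is an arbitrary number that no symbol uses.

```csharp
using System;

public enum Jobs : byte
{
    Programmer = 1,
    Administrator,
    Designer,
    Architect,
    Manager
}

public class EnumMetadataDemo
{
    public static void Main()
    {
        Console.WriteLine(Enum.GetUnderlyingType(typeof(Jobs)).FullName);  // System.Byte

        // List every symbol together with its numerical value.
        foreach (Jobs job in Enum.GetValues(typeof(Jobs)))
        {
            Console.WriteLine("{0} = {1}", job, (byte)job);
        }

        // Convert a name back to its value at run time.
        Jobs parsed = (Jobs)Enum.Parse(typeof(Jobs), "Designer");
        Console.WriteLine((byte)parsed);                            // 3

        // Check whether a numerical value has a symbol.
        Console.WriteLine(Enum.IsDefined(typeof(Jobs), (byte)9));   // False
    }
}
```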

In the second part of this article I will discuss Reference-Types and you will learn about Classes, Arrays, Strings and Boxing and Unboxing operations.
