CIL stands for Common Intermediate Language. It is used to help make programs cross-platform compatible. This article, the first of two parts, will introduce you to CIL and show you how to use it.
With .NET, a program is usually written in a language such as C# or Visual Basic and is then compiled. However, the source code of the program is not directly compiled into native code. Rather, it is compiled into Common Intermediate Language (CIL), also known as Microsoft Intermediate Language (MSIL). Then, when the resulting executable file is executed, the intermediate code is translated into native code. This process, which is similar to the process a Java application goes through, has the advantage of cross-platform compatibility.
Of course, one need not know too much about CIL in order to develop applications in .NET. However, CIL is human-readable, and it's possible to work directly with it. In this article, we'll take a quick look at CIL.
A Conversion Task
To become familiar with the very basics of CIL, which is as far as this article goes, let's examine a program in C# and then work to convert it into CIL. Let's use a program that finds the prime numbers between two and one million:
using System;
public class FindPrimes
{
    public static void Main(string[] args)
    {
        for (int i = 2; i <= 1000000; i++)
        {
            bool prime = true;
            int limit = (int)Math.Sqrt(i);
            for (int f = 2; f <= limit; f++)
            {
                if (i % f == 0)
                {
                    prime = false;
                    break;
                }
            }
            if (prime)
            {
                Console.WriteLine(i);
            }
        }
        Console.WriteLine("Done!");
    }
}
The program is simple. We first set up an outer loop which loops over the numbers two through one million. Inside this loop, we assume that each number is prime to begin with. We then loop over all the numbers from two to the square root of the number being examined. If the number being examined is divisible by any of the numbers from the inner loop (that is, if the division leaves no remainder), then the number is not prime, and we may safely break out of the inner loop in order to begin testing a new number. If the inner loop runs to completion, then the number is prime, and we print it. At the end of everything, we print out a message and exit.
We'll now break up our short program and convert it into CIL step-by-step. The first thing we see in the program is a using statement. Statements like this one, however, only exist for convenience and have no equivalent in CIL. Before we move on, though, we have to do a little setting up. Create a file and name it what you will, giving it a “.il” extension. Now, paste the following into it:
.assembly extern mscorlib {}
.assembly FindPrimes {}
This is a necessary step. Ordinarily, when compiling a program from a higher-level language, the compiler would add some more meat here, but for our purposes, that isn't necessary – the assembler will take care of what we need.
The basic building block of a .NET program is, of course, the class. Here's ours:
public class FindPrimes
{
    ...
}
The concept of classes is not just a higher-level component of .NET. Classes, of course, exist in CIL, too, and, surprisingly, they don't look much different from those in C#. The following CIL code is equivalent to the above C# code. Paste it into the file we created earlier (leaving the “...” out in order to be replaced later with meaningful code, of course):
.class public FindPrimes
{
...
}
Again, notice how similar the two are. So far, we've encountered nothing scary. The main difference here is that the declaration begins with the .class directive, so the access modifier comes second in the CIL code rather than first as in the C# code.
Methods
Now, of course, we need to add a method to our class. The syntax to create a method in CIL is, as with creating classes, not too different from the C# equivalent. Place the following CIL code inside of our class:
.method public static void Main(string[] args)
{
...
}
Again, so far, everything is simple. Observe, however, that we're not dealing with just any method here. No, we're dealing with the entry point of our program. Execution starts with this method, and while C# is able to automatically identify the method and designate it as the program's entry point based on its signature, this isn't taken care of for us with CIL. Instead, we need to manually mark this method as the entry point of our program. Fortunately, this is easy enough and only involves a few characters:
.entrypoint
So, this is what you should have in the source file so far:
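Putting together the pieces introduced above, the file would look roughly like this. This is a sketch assembled from the earlier snippets; the comment stands in for the method body, which has not been written yet, so the program will not run correctly until that body is filled in.

```cil
.assembly extern mscorlib {}
.assembly FindPrimes {}

.class public FindPrimes
{
    .method public static void Main(string[] args)
    {
        // Marks this method as the program's entry point.
        .entrypoint
        // The instructions of Main go here; they are added in later steps.
    }
}
```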
Before we continue with our main conversion task, let's break off for a second, temporarily turning our program into a “hello world” example, just to explore some basic concepts without taking a dive (otherwise, with our example, it is quite a dive).
With C#, operations and method calls are all quite readable – the syntax makes sense and is what we're used to. Up until now, CIL hasn't looked too different from C#. Here, however, some clear differences appear. To perform operations, CIL programs use what is called a stack. First, values are pushed onto the stack. Then, an operation is performed which pops those values off of the stack and, if applicable, pushes a result back onto the stack. For example, consider addition. Let's add the numbers three and four together. First, we push three onto the stack:
3
Then, we push four onto the stack:
4
3
Notice how it is stacked on top of the previous value. Now, in order to perform the addition, we need to take both values off of the stack, add them up, and then push the result onto the stack. After that's done, the stack looks like this:
7
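The same three steps can be sketched in CIL as follows. This is a minimal illustration, not part of the conversion task: the assembly and class name AddExample are made up for the example, and the .maxstack directive, the call to WriteLine, and the ret instruction are only explained later in this article.

```cil
.assembly extern mscorlib {}
.assembly AddExample {}

.class public AddExample
{
    .method public static void Main()
    {
        .entrypoint
        // At most two values sit on the stack at once (3 and 4).
        .maxstack 2
        ldc.i4.3    // push the 32-bit integer 3    stack: 3
        ldc.i4.4    // push the 32-bit integer 4    stack: 4, 3
        add         // pop both, push their sum     stack: 7
        // Pop the result and print it so the effect is visible.
        call void [mscorlib]System.Console::WriteLine(int32)
        ret
    }
}
```

When assembled and run, this prints 7, the value left on the stack by the add instruction.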
Now we have the result. Let's turn to a more complicated example: a method call. Consider the following method call in C#:
Console.WriteLine("Two numbers: {0} and {1}", 73, 82);
C# makes this easy, but how does this look in CIL when we compile it? The resulting code seems quite scary:
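A reconstruction of the kind of code a C# compiler typically emits for this call is sketched below. It is not captured compiler output, and the exact opcodes may vary between compilers (for example, ldc.i4 instead of ldc.i4.s). The six instructions are a fragment of a method body; to run them, they would need to sit inside a method that declares .maxstack 3 and ends with ret.

```cil
ldstr "Two numbers: {0} and {1}"    // push the format string
ldc.i4.s 73                         // push the 32-bit integer 73
box [mscorlib]System.Int32          // box the integer into an object
ldc.i4.s 82                         // push the 32-bit integer 82
box [mscorlib]System.Int32          // box the integer into an object
call void [mscorlib]System.Console::WriteLine(string, object, object)
```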
It only seems scary, though. Let's take it line-by-line and examine what it does to the stack. The first line starts with ldstr, which stands for “load string.” As its name suggests, it loads a string onto the stack. So, after it's executed, the stack looks like this:
“Two numbers: {0} and {1}”
The next line isn't any scarier. It loads the number 73 onto the stack. The stack now looks like this:
73
“Two numbers: {0} and {1}”
After that, we box the number at the top of the stack (73), wrapping the value in an Int32 object so that it can be passed where an object is expected:
73 (Int32 object)
“Two numbers: {0} and {1}”
The next two lines are very similar, and after they are executed, the stack looks like this:
82 (Int32 object)
73 (Int32 object)
“Two numbers: {0} and {1}”
Now, the final line is where the action takes place. We call the WriteLine method of System.Console. Note that we have to specify the parameters of the method we want – this is important! Here, we specify that we want the WriteLine method that takes a string and then two objects. When this line is executed, an object is first popped off of the stack, followed by another object and then finally a string. Notice how this is done in reverse order – this, too, is important!
Returning to the “hello world” example, it's now clear what we have to do. First, we have to push a string onto the stack: “Hello, World.” Then, we simply call the WriteLine method, specifying that we have a string as the sole argument. So, let's write a short program using the techniques we've learned so far. First come the class and the method. We also have to put the required bit at the beginning. Put everything in a file called “helloworld.il”:
.assembly extern mscorlib {}
.assembly HelloWorld {}
.class public HelloWorld
{
.method public static void Main(string[] args)
{
.entrypoint
...
}
}
Now, we have to push the string that we're working with onto the stack in order for it to be passed in our call to WriteLine:
ldstr "Hello, World."
WriteLine can now be called. The above string will be popped off the stack, and WriteLine will print it onto the screen:
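The call is a single instruction. As a sketch, assuming the WriteLine overload that takes one string and lives in mscorlib, as in the earlier example:

```cil
// Pops the string off the stack and passes it to Console.WriteLine(string).
call void [mscorlib]System.Console::WriteLine(string)
```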
We're still not done, however. There are two things that need to be taken care of. First, in order for a method to work properly in CIL, we need to explicitly set the maximum size of the stack. That is, we must state the greatest number of items that will be on the stack at any one time. In our example, there is never more than one item on the stack:
.maxstack 1
Second, we have to add a return statement. In C#, this isn't necessary for a void method, but in CIL, we have to take care of this ourselves. Fortunately, it's easy enough:
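The instruction for this is ret. Combining it with the pieces above, the complete helloworld.il would look roughly like the following sketch. The order of the directives and instructions inside the method follows the steps described so far.

```cil
.assembly extern mscorlib {}
.assembly HelloWorld {}

.class public HelloWorld
{
    .method public static void Main(string[] args)
    {
        .entrypoint
        // Only one item (the string) is ever on the stack.
        .maxstack 1
        ldstr "Hello, World."    // push the string
        call void [mscorlib]System.Console::WriteLine(string)
        ret                      // return from Main
    }
}
```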
In order to be able to execute our program, we must first assemble it. This is done using ilasm.exe, which you should find in C:\WINDOWS\Microsoft.NET\Framework\vX.x: