Low-Level Code Optimization in Phoenix

Compiler writing has been reduced to the job of a few people in the past. The developer should stop worrying about the phase at which the code snippet he entered in becomes code that the machine can execute. The appearance of the Phoenix Compiler Framework may make things easier for those who wish to take advantage of it.

Contributed by
Rating: 5 stars5 stars5 stars5 stars5 stars / 3
March 11, 2009
Rate this Article:
MEH MEH++


SEARCH ASP FREE
TOOLS YOU CAN USE

advertisement

I already presented the framework in a nutshell, so this week I will not cover this section again. If you missed my article titled "The Phoenix Compiler Framework," you are free to read it, just search the ASP Free site, or even better, under my profile. Everything I presented there will come in handy for a better understanding of this article, although you do not have to read it first. 

Writing code has always been built around the developer. There always has been a direction to make the life of the developer easier. For this purpose, languages appeared that we call "managed," and for this reason, tools exist that help us develop, such as Microsoft Visual Studio or Netbeans.

Of course you can write a great program with just a text editor; however, at some point, that can become so complicated that the developer will "break his legs" in the code, and he will have enough of everything related to coding for a lifetime.

No one wants this to happen. Therefore, specific tools were created to compensate for algorithmic problems with ease of implementation. However, at the end of the day, the code written by a developer will have to be transformed into something that the machine can understand and run.

Nice, readable code will have little importance for the machine. It executes anything without checking its appearance. Moreover, the saddest thing is that the most optimal and fastest codes still are the ones that are hard to read and write. To make the best trade with the developer, the ideal case scenario is to transform your beautiful code into fast code.

By the end of the previous article about Phoenix, you realized the importance of the linear flow of the IR (Intermediate Representation). This is what a back=end programmer of Phoenix will have, and this is what he has to generate.

The front-end is responsible for checking that what you wrote is valid according to the language's semantics and enforce those rules on you. This includes jobs like making sure you do make every cast correct, or even that you made it at all, or that you have not missed a ";" somewhere.

So for this part, we are going to the back end of the compiler to see how your code snippet is going to become executable machine code. One of the people responsible for this job for the Phoenix Compiler is Russell Hadley. The code that he gets from the front-end looks just like your programming teacher at the university told you not to write it.

It is full of labels and "goto"s. It may include a sequence of instructions that says add this with this, flip this bit, check if this variable is zero and if so go to there. Russell is responsible for filtering the code in a manner that, in the end, will make it both efficient and understandable to the platform.

Resource allocation

The main object at this point is to use the resources we have as intelligently as we can. It is all about saying what needs to be done when, in what order, so the code can run as fast as possible. For this, a back-end programmer can do everything as long as the "as if" rule is maintained.

This says that as long as the code will remain the same from the outside, and the application does the same thing for which it was coded, the compiler is free to do whatever it wants. Ultimately, you can reduce it all to two main jobs: register allocation and low-level code optimization. As for register allocation, it is a job of prioritization. For example, an X64 processor has 16 registers. Now a couple of them are reserved for various tasks, so in practice you will have even less.

Nevertheless, sometimes you will have two dozen operations waiting for execution. You cannot run all of them now, so you will have to push a couple of them back to the memory and read them later on. Which of them you will send back so that the program will run as fast as possible will be decided here.

On the other hand, another way to increase the speed of the code you write is to simplify it, and this is what a low-level code optimizer is meant to accomplish. You see language designers (C++, F#, C#, Java) all encourage developers to write readable and simple code.

Jobs like adding up two constants or dividing them effectively go to the compiler. Let the user write it separately if, with it, the code produced is easier to comprehend and extend. Now this optimization can be easily broken down into many sub-categories. On the following page, you will see a simplified sketch of all this, and I will also present each one briefly.

The optimization checkpoints

This is a long picture here if you ask me, so let's get started. First, the code from the front-end comes in at the Reader. Here the code is just read in. This will produce the I ntermediate R epresentation, whose variations are preserved all the way down the translation descent. The translation descent is the process itself that the image presents, starting from high-level code provided by the front-end and ending with the machine code produced at the bottom.

From there we go to the Inliner. The inliner will try to avoid function calls where possible by getting the source code, one by one, and inserting it where the function is called. This will save one call each time and reduce call load.

This is great if you have a function called at a single place a lot of times; however, what if you call a function a million times from a million different places? In essence, it will try to reduce call load by making trade offs and deciding which one of them is better.

The global optimizer does a lot. It will optimize the calculation of values. Say you calculate something a dozen times. If the global optimizer observes it, it may say "I can save the value that we got the first time inside a temporary value and use it for the later calculations."

It will simplify your expressions. Inside statements, you want to express possibilities so that they will be readable, and you do not think about mathematical ways of expressing it as simply as possible with as few checks as possible. This change will be made here. 

The global optimizer computes ranges of values. Therefore, if you have an "if" that tells the program to enter here if this value is less than zero, but the global optimizer sees that it will never be less than zero, it just removes the entire block. This leaves less code to manage, and fewer places for something to go wrong. 

Moreover, it goes through a couple of other optimizations, causing a great improvement in performance. The global optimizer will take up around a quarter of the compilation time.

The Loop Optimizer, meanwhile, tries to make major changes where it really matters. It may decide to run the "for" loop in a different order if that will not change the result and the new order is easier to accomplish for the platform you compile. For example, it may invert the stepping of the variable I into decreasing rather than increasing order.

Alternatively, if you keep "i", "j" (two classic variables) inside an array, it may be more efficient to keep only an address to "i" and the constant that will show where "j" is, as we will have less data to track. In its essence, it will try to make all the loops as effective as possible, as these tend to slow down programs most of all. 

The Lower will produce LIR, a lower level of the optimization that is closer to the machine's code. Idioms Optimizer is an interesting phase that will try to simplify the code even more, if that is possible. There are some differences between machine code and the code written by us on the keyboard.

There may be a function on the machine that does not exist in our language, as it is too low-level for us to use. After the code is lowered to this level, this optimizer will search for this kind of presence according to a pattern-matching algorithm, and replace it with the simpler functions.

The Register allocator will allocate values to a register on the processor. First, it will make the value a candidate. After that, the register allocator may assign the value to something like EAX, which is actually a registry address on the processor. Therefore, it will allocate and assign from a set of abstract registries to some physical registries on the processor. This is accomplished according to the prioritization I talked about in the prior page.

The last part is the Encoder. Here it will actually bundle everything together into the binary representation that you can put inside an executable. It will fetch the IR that by now is super-optimized with appropriate resource allocation made, so it can just generate the binary representation from it. That is a long flow of alternating zeros and ones for you.

Actually, the Phoenix compiler can directly produce the executable as shown in the upper diagram; nevertheless, nowadays the compiler will first create, instead of the exe, an obj from which later on the Linker will create the proper executable file.

The linker exists so we can import libraries that contain functionality that we do not need to write. It just tells the application: "you know what, even now you do not have this function, trust me. It will exist, just ask the OS at run time to give it to you and you will have it."

Levels of the IR and producing plug-ins

You may have been asking what the HIR, MIR and LIR symbolize on the side of the image. Well these are the levels of the IR. It can be split up into three segments: the High Level, Medium Level and Low Level Intermediate Representation.

While all of them will maintain the same syntax during the translation descent with each step, the representation gets a little harder to understand as it gets a little closer to the machine's language. There are three segments where the representation suffers a change, just as shown in the upper image.

Why is this important? Well, because Phoenix is, at the end of the day, a code generating, optimizing and analyzing tool that has plug-in support. At any time, you may just come in and ask the IR from the compiler to let you use its knowledge for your own purposes.

However, because this will get harder and harder to track, the places marked with the star (*) point to where it is advisable for you to insert your own plug-in that will analyze/change the code. You can make sure that every error is checked in the program, even for your own custom set of rules.

Alternatively, you may be curious when someone actually adds a plug-in. You simply need to ask the Phoenix framework to raise an event at situations like this and notify you. This way you may check even the content of the plug-in and see what changes it makes to the code.

This can be reused inside a security application for analyzing legacy code and new code for system defects, things that might go wrong at run time. You can add new functionality inside the program just by extending the executable.

Nevertheless, be aware that all this is a lot more complicated than it sounds. What I have presented so far was a simplified overview/sketch of the process. The problem is that in real life usage, there are many details for which you have to be prepared. The number of details and exceptions encourage only those very familiar with the documentation to start writing. However, if you want the option it is there.

As we come to the end of this article, the obvious question pops up. Is it worth it to spend time on this now, with processors appearing with more and more core, and memory slowly reaching sky-high values? Well, it might not always be required. Sometimes you may get away without the performance increase.

Nevertheless, in the end every saved operation is a saved bit flip that is a saved energy source. Now the question remains: how much energy do you have to waste? I hope you found some interesting information inside this article.

I offer you the possibility to express your ideas related to this article here on the blog or join the friendly forum of Dev Hardware. If you decide to rate my article I would also thank you and until we meet again in one of my articles Live With Passion!

blog comments powered by Disqus
BRAINDUMP ARTICLES

- Microsoft Windows 8 Committed to Cloud Compu...
- Independent Developers Favor Windows Phone 7
- Dell Introduces VMware-based Cloud
- Microsoft and Skype Agree to Acquisition Deal
- Transfer Contacts in Microsoft Outlook
- Zune`s Next Steps
- Safari Books Online Review
- Does Microsoft Get Touch Screens Now?
- Microsoft`s Record Quarterly Earnings Not En...
- Basic Operations and Registers in Assembly
- Assembly Coding within Visual C/C++ IDE
- New Microsoft Office Coming with a Twist
- Microsoft`s FUSE Labs Unveils Spindex Social...
- HP Slate with Windows 7: Dead or Alive?
- Windows Phone 7 Mobile OS to Rival Android a...

ASP Web Hosting ASP.Net Web Hosting Windows Web Hosting
 
 
 

ASP Free Forums 
 RSS  Tutorials RSS
 RSS  Forums RSS
 RSS  All Feeds
Site Map 
Request Media Kit
Write For Us Get Paid 
Weekly Newsletter
 
Developer Updates  
Free Website Content 
Privacy Policy 
Support 


© 2003-2012 by Developer Shed. All rights reserved. DS Cluster 10 - Follow our Sitemap
Most Popular Topics
All ASP.Net Tutorials