If you code in C++, you will run into pointers sooner or later. They can be a great tool when implementing design patterns and building complex data structures. In this first of a series of articles about pointers, J. Nakamura introduces several operations that can be performed on pointers.
Contributed by J. Nakamura, November 29, 2004
When you are coding in C++, you are bound to run into pointers; we are not going to discuss here whether they are good or bad, rather how we can make good use of them. Don’t worry, even if you are well versed in C++, you might still find some interesting points in this article.
Pointers can be used to strengthen your understanding of computer memory and its management. They become an invaluable tool when implementing design patterns and building complex data structures. You can delay the construction of an object by using pointers, implement your own memory management if you work with a lot of fragmented data, and build powerful abstract interfaces utilizing polymorphism and function pointers.
This is the first in a series of articles that will introduce you to the implementation of some useful patterns. In order to make good use of them, a solid understanding of pointers is a prerequisite. At the end you will have built an application in C++ that downloads and displays the latest weather satellite images of Europe from the Internet. My goal, however, is to provide you with some essential building blocks that will help you implement these useful design patterns as reusable classes.
If you encounter a topic touched on in this article that you would like to see explained in more depth, please don’t hesitate to contact me.
So what is a pointer? Simple: pointers are variables that store addresses and can be null. How hard can it be or get? Read on to find out.
Let's start with a code example... I like code examples.
int main(int argc, char *argv[])
{
    int myVar = 10;
    int *pPtr = &myVar;
    return 0;
}
When you compile this and look at the memory address pPtr contains in the debugger (or should I say points at), you will find something that resembles:
The hexadecimal numbers on the left represent the addresses of sequential memory (note how I have chosen to display 4 bytes – 1 integer per line here). In the middle you find the values stored at that location, and next to that, its ASCII representation.
So a pointer is a variable that stores an address and can be null. In this case you see that pPtr (at 0x0012FEC8) contains the address of myVar (at 0x0012FED4). Note that although the address stored here looks like an integer (i.e. 4 bytes long), you must not assume this is always the case (e.g. on an IBM AS/400 a pointer is 16 bytes long).
The address-of operator: &
How do we play with pointers? One way is to retrieve the address of a variable and store it in a pointer. That is what you use the address-of operator (&) for.
In the example, ‘&myVar’ returns the address of ‘myVar’ (0x0012FED4), which is then stored in ‘pPtr’, a pointer that can hold the address of an int. From now on we can access the contents of myVar through pPtr by using the dereference operator, * (see below). It might seem confusing that you would want to have access to a value in two different ways, but it will prove to be very useful.
The dereference operator: *
The dereference operator is also known as the indirection operator. You use this operator to retrieve the value stored at the address contained in the pointer. Thus ‘*pPtr’ will yield the value 10, which is the value stored in variable myVar at address 0x0012FED4. (Notice that the same symbol was also used for declaring the pointer.)
It is a good habit to set any uninitialized pointer to 0 (NULL), because otherwise you might be poking around in a random piece of memory when accessing it through the pointer. This would be a good moment to use an assert to make sure the pointer is not 0 (using #include <assert.h>):
int *pPtr = 0;
/* some code here */
assert(pPtr != 0);
*pPtr = 10;
When you are using classes or structs (do you know what the difference is between them in C++?), you can use the member access operator (->), often called the arrow operator, to retrieve/set their values or call their functions. (The name "pointer-to-member operator" belongs to the different operators .* and ->*.) Consider this class:
class MyClass
{
public:
    MyClass() : myVar(10) {}
    int myVar;
};
Assume you are using it the following way:
MyClass myThing;
MyClass *pPtr = &myThing;
then you can access MyClass::myVar with the arrow operator:
pPtr->myVar = 20;
or use the dereference operator (*) if you like:
(*pPtr).myVar = 20;
... the outcome will be the same.
Pointer arithmetic
It is possible to conduct arithmetical operations on pointers, although only the addition and subtraction operators can be used. This is very useful when you need to traverse a block of contiguous memory, filled sequentially with instances of a single type. Depending on the size of the type, you will get different results (you can determine the size of a type with the sizeof(<type>) operator).
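#include <cstdio>

int main()
{
    char str1[] = "hello";      // assumed declaration; not shown in the original listing
    wchar_t str2[] = L"hello";  // assumed declaration; not shown in the original listing

    char *pPtr1 = str1;
    (void)printf("address of pPtr1: 0x%p\n", pPtr1);
    (void)printf("address of pPtr1+1: 0x%p\n", pPtr1+1);
    wchar_t *pPtr2 = str2;
    (void)printf("address of pPtr2: 0x%p\n", pPtr2);
    (void)printf("address of pPtr2+1: 0x%p\n", pPtr2+1);
    return 0;
}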
Your output will be similar to:
address of pPtr1: 0x00424038
address of pPtr1+1: 0x00424039
address of pPtr2: 0x004246E0
address of pPtr2+1: 0x004246E2
Notice how pPtr1+1 increased by 1 and how pPtr2+1 increased by 2! This is because sizeof(wchar_t) is 2 on this platform (on others, such as Linux with GCC, it is 4). If we had been working with pointers to int, you would typically have seen an increase of 4 (if you want to be sure what the size of int, or for that matter of any type, is, use the sizeof operator!).
Pointers can take on different identities, depending on how you wish to use them. You might find older C APIs using void* as function parameters to allow you to implement your own types. An API could, for example, allow you to provide function pointers that construct, destruct, read and print strings. Since the API is only concerned with void* it won’t care whether you are using char, wchar_t, std::string, BSTR or whatever. This way the integration of string into your code base remains up to you, although lack of types makes it unsafe. This is an interesting point we are definitely going to explore later on.
In C++, the API could use pointers to (preferably abstract) base classes instead of void*. It does all its operations on the base class and leaves the implementation up to you in a derived class. Through virtual functions, the right function of the derived class is selected at run time (the world of polymorphic classes). This way you can integrate your own code in a type safe way.
In order to use the different forms of a pointer, you sometimes need to cast them into their right shape. A nice rule to remember is that a cast doesn’t show respect for identity; so always verify how much respect your code is showing. Don’t be surprised to find bug-riddled code to be showing no respect at all.
C-style cast
The standard C-style cast is one that simply forces one type onto another.
T t = (T)expression;
e.g. float *pPtr2 = (float*)pMyVar;
C-style casts are totally unsafe (meaning they perform neither compile time nor run time checks), so C++ provides operators that enhance these casts. Although the C++ cast operators require more work as far as typing is concerned, they will help you catch bugs early in the development process.
Static_cast
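The static_cast relies only on compile time information and performs no run time checks. Unlike the C-style cast, it only converts between related types (e.g. between numeric types, from void* to a typed pointer, or up and down a class hierarchy); the compiler rejects a static_cast between unrelated pointer types such as int* and float*.
T t = static_cast<T>(expression);
e.g. float value = static_cast<float>(myVar);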
When you are dealing with a chunk of memory pointed at by char* (sizeof(char)==1, thus the pointer allows you to iterate byte by byte through the memory), there is an easy way to extract data types from this chunk. Note that static_cast will not compile here, because char* and int* are unrelated pointer types; you need the dereference operator (*) and the reinterpret_cast (described further on):
int framerate = *(reinterpret_cast<int*>(&pChunk[n]));
What happens is that the address of the nth byte in pChunk is retrieved and cast from a char* to an int*, which we then dereference. This forces the retrieval of the next sizeof(int) bytes (typically 4), the contents of which are copied as an int into framerate. Always make sure that the memory you are reading really is available, that the address is suitably aligned for an int, and that you are aware of any possible endianness issues. Strictly speaking, reading through such a cast pointer is undefined behavior under the C++ aliasing rules unless an int object actually lives at that address; copying the bytes with memcpy avoids the issue. Endianness is something you deal with when you are writing cross-platform code and will be explained in another article.
Dynamic_cast
The dynamic_cast is used to safely cast up and down an inheritance hierarchy (e.g. from a base object pointer to a derived object pointer) and relies on both compile time and run time information. For a cast down the hierarchy, the base class must be polymorphic, i.e. have at least one virtual function.
When dynamic_cast fails on a pointer during run time, it returns NULL, and when it fails on a reference it throws a std::bad_cast exception. This lets you determine at run time whether a cast succeeded, but you do pay a small performance penalty. Because of this run time check, the dynamic_cast is often used to implement the generic Visitor pattern; but if you don’t want that performance penalty you can still implement the Visitor pattern without dynamic_cast (albeit not as generic).
T t = dynamic_cast<T>(expression);
e.g. Derived *pPtr2 = dynamic_cast<Derived*>(pBase);
Note that you can never cast a base object into a derived object when it is an instantiation of that very base class! You cannot make the object any larger than it actually is: I may have a barrel of oil, but how could I just bluntly assume it is unleaded gasoline? I can static_cast it into unleaded gasoline or diesel and try to run my car on it, but this most definitely would fail.
Reinterpret_cast
Your last resort might be the reinterpret_cast, which just blindly casts type A into type B. Remember the void* I mentioned in the legacy C API? That would be a typical scenario where you would encounter a reinterpret_cast.
T t = reinterpret_cast<T>(expression);
Another example might be the need to cast between function pointer types. Let's say you have the following code:
typedef void (*FuncPtr)(); // FuncPtr is a function pointer
FuncPtr array[5]; // this is an array of 5 FuncPtrs
Now it won’t be a problem to put a function declared as void function(); into that array as a FuncPtr, but what happens when you want to place a function declared as int foo(); in there? Then this is what you need to make the compiler do your bidding:
array[0] = reinterpret_cast<FuncPtr>(&foo);
And realize that this is not showing a lot of respect for identity; I would even say that this code is being quite rude. Please stay away from dark, muddy, treacherous waters like these.
The void* operator
Once I ran into a statement that quite puzzled me… it read:
if (!myObj) {/*something is wrong*/}
What puzzled me was that myObj was an object, and how can an object be evaluated as true or false? It turns out that the compiler uses the class's conversion operator, operator void*(), as a last attempt to make sense of this through implicit conversion.
This can come in really handy, since you can query whether an object is valid or not. The STL allows you to validate iostreams this way; in the file xiosbase you can find:
operator void *() const
{   // test if any stream operation has failed
    return (fail() ? 0 : (void*)this);
}
As you see in the code above, when a certain state/condition isn’t met the operator returns 0, resolving to false in our statement (and yes a C++ cast could have been used instead of the C-style cast).
Conclusion
This concludes part one of our series on the SatView project. Stay tuned for the next article, where we'll be studying construction and destruction in C++. Until next time!