As well, take a look at our Other FAQs, the Comeau Templates FAQ and the Comeau C99 FAQ.
The intent of this page is to address questions about C++ and C that come up often, perhaps too often. However, it is exactly the frequency of these topics that is the reason for including a discussion of them below. These issues usually originate from a commonly made misleading statement, or from code shown in a book. These points have found themselves here as the result of our connection to the C++ and C communities for 20 years, whether teaching, helping in newsgroups, providing tech support for Comeau C++, or just plain listening to folks' issues. Some of the topics below can be found in other FAQs; however, here we try to offer more information on the respective topics, as well as issues related to them. Here are the current topics:
Note that we've broken our list down into categories; consider getting one or two from each category. Some you should get as references, others you should get to read from cover to cover.
Note that there is often a tension between technical accuracy and readability. You may have a very readable book that's telling you wrong things, or avoiding fundamentals or insights, but reading the wrong things, albeit easily, won't get you anywhere. The converse is a very accurate book that is not as easy to read. There is a price to pay either way, but generally you're better off with the technically correct text. We believe this is so for the short and long term. And in general, please make an effort to avoid product oriented books, or books with titles that just make things sound like everything will just be so great.
Categorically, we have not been satisfied with online tutorials (this does not mean that there are no good ones, just that we have not seen one yet). If you know of one that's excellent, don't hesitate to email us about it.
news:alt.comp.lang.learn.c-c++ |
http://www.comeaucomputing.com/learn/faq |
news:comp.lang.c news:comp.std.c |
http://c-faq.com |
news:comp.lang.c++ news:comp.lang.c++.moderated |
http://www.parashift.com/c++-faq-lite/ Welcome to comp.lang.c++ http://www.slack.net/~shiva/welcome.txt |
news:comp.std.c++ | http://www.jamesd.demon.co.uk/csc/faq.html |
The Comeau C++ and C FAQ | http://www.comeaucomputing.com/techtalk |
This document (The Comeau C++ TEMPLATES FAQ) | http://www.comeaucomputing.com/techtalk/templates |
The Comeau C99 FAQ | http://www.comeaucomputing.com/techtalk/c99 |
In general, if you need a FAQ, check out http://www.faqs.org
The latest revision of Standard C, so-called C99, is also available electronically from ANSI. ANSI C89 or ISO C90 is still currently available but only in paper form. For instance, Global Engineering Documents was carrying C89 (X3.159). Once at that link, click US, enter "x3.159" in the document number (NOT the document title) search box, and that'll get you the latest info from Global on this paper document (last we checked it was US$148).
// A: main should not return a void
void main() { /* ...Whatever... */ }
// AA: This is just the same as A:
void main(void) { /* ...Whatever... */ }
The problem is that this code declares main to return a void and that's just no good for a strictly conforming program. Neither is this:
// B: implicit int not allowed in C++ or C99
main() { /* ...Whatever... */ }
The problem with code example B is that it declares main without any return type at all. But not declaring a function's return type is an error in C++, whether the function is main or not. In C99, the October 1999 revision to Standard C, not declaring a function's return type is also an error (in the previous version of Standard C an implicit int was assumed as the return type if a function was declared/defined without one). But the usual requirement of both Standard C++ and Standard C is that main should be declared to return an int. In other words, this is an acceptable form of main:
// C: Ok in C++ and C
int main(void) { /* ... */ }
Also, an empty parameter list is considered as void in C++, so this is ok in C++:
// D: Ok in C++ and C
int main(/*NOTHING HERE*/) { /* ... */ }
Note that it is ok in Standard C too, because although there is an empty parameter list, D is not a declaration but a definition. Therefore, it is not unspecified as to what the arguments are in C, and it is also considered as having declared a void argument.
When you desire to process command line arguments, main may also take this form:
// E: Ok in C++ and C
int main(int argc, char *argv[]) { /* ... */ }
Note that the names argc and argv themselves are not significant, although commonly used. Therefore F is also acceptable:
// F: Ok in C++ and C
int main(int c, char *v[]) { /* ... */ }
Similarly, array parameters "collapse" into pointer arguments, therefore, G is also acceptable:
// G: Ok in C++ and C
int main(int c, char **v) { /* ... */ }
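To make forms E through G a little more concrete, here is a minimal sketch of walking the argument vector. The helper name collectArgs is made up for illustration and is not part of any standard. It assumes that argv[0] holds the program name, which is the usual arrangement but not something to rely on blindly.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: copies argv[1] .. argv[argc - 1] into strings,
// skipping argv[0] (conventionally the program name).
std::vector<std::string> collectArgs(int argc, char *argv[])
{
    std::vector<std::string> args;
    for (int i = 1; i < argc; ++i)
        args.push_back(argv[i]);
    return args;
}
```

A main written in form E, F or G could simply forward its two parameters, whatever they are named, as in collectArgs(argc, argv).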
The int return value may also be specified through a typedef, for instance:
// H: Ok in C++ and C
typedef int Blah;
Blah main() { /* ... */ }
Here, main is declared to return an int, since Blah is defined as an int. This might be used if you have system-wide typedefs in your shop. Of course, the following is also allowed since BLAH is text substituted by the preprocessor to be int:
// I: Ok in C++ and C
#define BLAH int
BLAH main() { /* ... */ }
Do note though that the standards do not talk about all integers, but int, so you wouldn't want to do these:
// J: Not ok
unsigned int main() { /* ... */ }
// K: Not ok
long main() { /* ... */ }
Often some of this can compound. For instance, a problem some run into looks as follows, consider:
#include "SomeHeader.h"
main() { /* ... */ }
However, note that SomeHeader.h is written like this:
struct WhatEver { /* ... */ } /* MISSING semicolon for the struct */
That means that the main in this case is being declared to return a WhatEver. This usually results in a bunch of funny errors, at best.
In short, you wouldn't expect this to work:
// file1.c
float foo(int arg) { ... }
// file2.c
int foo(int);
But that's exactly the scenario that occurs when you misdeclare main, since a call to it is already compiled into your C or C++ "startup code".
The above said, the standards also say that main may be declared in an implementation-defined manner. In such a case, that does allow for the possibility of a diagnostic, that is, an error message, to be generated if forms other than those shown above as ok are used. For instance, a common extension is to allow for the direct processing of environment variables. Such capability is available in some OS's such as UNIX and MS-Windows. Consider:
// L: Common extension, not standard
int main(int argc, char *argv[], char *envp[])
That said, it's worth pointing out that you should perhaps favor getenv() from stdlib.h in C, or cstdlib in C++, when you want to access environment variables passed into your application. (Note this is for reading them, but writing environment variables so that they are available after your application ends is tricky and OS specific at best.)
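As a rough sketch of the getenv() route, the following wraps the lookup so that a missing variable yields a default instead of a null pointer. The name envOrDefault is hypothetical; std::getenv itself is the Standard function from cstdlib.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: returns the value of the named environment
// variable, or fallback when std::getenv reports it as unset.
std::string envOrDefault(const char *name, const std::string &fallback)
{
    const char *value = std::getenv(name);
    return value ? std::string(value) : fallback;
}
```

Copying the result into a std::string right away is a deliberate choice here, since the pointer returned by getenv should not be modified and may be invalidated by later calls.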
Last but not least, it may be argued that all this is not worth the trouble of worrying about, since it's "such a minor issue". But that fosters carelessness. It also would support letting people accumulate wrong, albeit "small", pieces of information, but there is no productive benefit to that. It's important to know what is a compiler extension or not. There have even been compilers known to generate code that crashes if the wrong definition of main is provided.
By the way, the above discussions do not consider so-called freestanding implementations, where there may not even be a main, nor extensions such as WinMain, etc. It may also be so that you don't care about whether or not your code is Standard because, oh, for instance, the code is very old, or because you are using a very old C compiler; this is something you need to weigh. Too, note that void main was never K&R C, because K&R C never supported the void keyword. Anyway, if you are concerned about your code using Standard C++ or Standard C, make sure to turn warnings and strict mode on.
If you have a teacher, friend, book, online tutorial or help system that is informing you otherwise about Standard C++ or Standard C, please refer them to this web page http://www.comeaucomputing.com/techtalk. If you have non-standard code which is accepted by your compiler, you may want to double check that you've put the compiler into strict mode or ANSI-mode, and probably it will emit diagnostics when it is supposed to.
Semantically, returning from main is as if the program called exit (found in <cstdlib> in C++ and <stdlib.h> in C) with the same value that was specified in the return statement. One way to think about this is that the startup code which calls main effectively looks like this:
// ...low-level startup code provided by vendor
exit(main(count, vector));
This is ok even if you explicitly call exit from your program, which is another valid way to terminate your program, though in the case of main many prefer to return from it. Note that C (not C++) allows main to be called recursively (perhaps this is best avoided though), in which case returning will just return the appropriate value to wherever it was called from.
Also note that C++ destructors won't get run on ANY automatic objects if you call exit, nor obviously on some new'd objects. So there are exceptions to the semantic equivalence I've shown above.
By the way, the values which can be used for program termination are 0 or EXIT_SUCCESS, or EXIT_FAILURE (these macros can also be found in stdlib.h in C and cstdlib in C++), representing a successful or unsuccessful program termination status respectively. The intention is for the operating system to do something with the value of the status along these same lines, representing success or not. If you specify some other value, the status is implementation-defined.
What if your program does not call exit, or your main does not return a value? Well, first of all, if the program really is expected to end, then it should. However, if you don't code anything (and the program is not in a loop), then if the flow of execution reaches the terminating brace of main, then a return 0; is effectively executed. In other words, this program:
int main() { }
is effectively turned into this:
int main() { return 0; }
Some C++ compilers or C compilers may not yet support this (and some folks consider it bad style not to code it yourself anyway). Note that an implication of this is that your compiler may issue a diagnostic that your main is not returning a value, since it is usually declared to return an int. This is so even if you coded this:
int main() { exit(0); }
since exit is a library function, in which case you may or may not have to add a bogus return statement just to satisfy it.
An implementation-defined description follows. On UNIX, the low-order 8 bits of the int status are returned. Another process doing a wait() system call, which is not a Standard function, might be able to pick up the status. A UNIX Bourne shell script might pick it up via the shell's $? variable. MS-DOS and MS-Windows are similar, where the respective compilers also have functions such as wait through which another application can obtain the status. As well, in command line batch .BAT files, you can code something like IF ERRORLEVEL..., or with some versions of Windows, the %ERRORLEVEL% environment variable. Based upon the value, the program checking it may take some action.
Do note as mentioned above, that 0, EXIT_SUCCESS and EXIT_FAILURE are the portable successful/unsuccessful values allowed by the standard. Some programs may choose to use other values, both positive and negative, but realize that if you use those values, the integrity of those values is not something that the Standard controls. In other words, exiting with other than the portable values, let's assume values of 99 or -99, may or may not have the same results/intentions on every environment/OS (in other words, there is no guarantee that the 99 or -99 "go anywhere").
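One way to stay within the portable values is to funnel every outcome through the two macros. The sketch below assumes a program that only needs to report success or failure; the name portableStatus is invented for this example.

```cpp
#include <cstdlib>

// Hypothetical helper: maps an outcome onto the two portable statuses
// instead of an arbitrary number such as 99 or -99.
int portableStatus(bool succeeded)
{
    return succeeded ? EXIT_SUCCESS : EXIT_FAILURE;
}
```

A main could then end with return portableStatus(ok); where ok is whatever success flag the program keeps. If the environment really does need richer codes, that is a platform-specific decision and worth documenting as such.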
#include <iostream>
int main()
{
    int i = 99;
    std::cout << i << '\n'; // A
    std::cout << i << std::endl; // B
    return 0;
}
In short, using '\n' is a request to output a newline. Using endl also requests to output a newline, but it also flushes the output stream. In other words, the latter has the same effect as (ignoring the std:: for now):
cout << i << '\n'; // C: Emit newline
cout.flush(); // Then flush directly
Or this:
cout << i << '\n' << flush; // D: use flush manipulator
In a discussion like this, it's worth pointing out that these are different too:
cout << i << '\n'; // E: with single quotes
cout << i << "\n"; // F: with double quotes
Specifically, note that E's last output request is for a char, hence operator <<(ostream&, char) will be used. In F's case, the last is a const char[2], and so operator <<(ostream&, const char *) will be used. As you can imagine, this latter function will contain a loop, which one might argue is overkill to just print out a newline. Some of these same points also apply to comparing these three lines of code, which all output h, i and newline, somewhere in some way:
cout << "hi\n"; // G
cout << "hi" << '\n'; // H
cout << "hi" << "\n"; // I
By the way, although these examples have been using cout, it does not matter which stream is being used. Also, note that line A may also cause a flush operation to occur in the case where the newline character just happens to be the character to fill up the output buffer (if there is one, that is, the stream in question may happen to not be buffered).
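If you'd like to see the flush difference for yourself, here is a small sketch that counts flush requests. It assumes a stream buffer derived from std::stringbuf whose sync() override bumps a counter; CountingBuf and flushCount are names made up for this illustration, not library facilities.

```cpp
#include <ostream>
#include <sstream>

// Hypothetical buffer that counts how often it is asked to flush.
class CountingBuf : public std::stringbuf {
public:
    CountingBuf() : syncCalls(0) { }
    int syncCalls;
protected:
    virtual int sync()
    {
        ++syncCalls;
        return std::stringbuf::sync();
    }
};

// Writes 99 followed by a newline, via endl or '\n', and reports
// how many flushes the stream requested along the way.
int flushCount(bool useEndl)
{
    CountingBuf buf;
    std::ostream out(&buf);
    if (useEndl)
        out << 99 << std::endl;
    else
        out << 99 << '\n';
    return buf.syncCalls;
}
```

With this setup, flushCount(true) reports one flush and flushCount(false) reports none, since nothing else in the sketch asks the stream to flush.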
In conclusion, which of these should you use? Unless performance is critical, many favor using endl, as many find that just typing endl is in most cases easy and readable.
Here's a bit more technical information for those so inclined. As it turns out, endl is called an iostreams manipulator. In reality, it is a function generated from a template, even though it appears to be an object. For instance, retaining its semantics, it might look like this:
inline ostream &std::endl(ostream &OutStream)
{
    OutStream.put('\n');
    OutStream.flush();
    return OutStream;
}
Iostreams machinery kicks in because it has an ostream& std::operator <<(ostream &(*)(ostream&)) which will provide a match for endl. And of course, you can call endl directly. In other words, these two statements are equivalent:
endl(cout);
cout << endl;
Actually, they are not exactly the same; however, their observable semantics are.
There are other standard manipulators. We leave it as an exercise to the reader to research how to create your own manipulator, or how to override something like endl.
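For those who want a head start on that exercise, here is one possible sketch of a user-defined manipulator. It relies only on the fact that a function with the same signature as endl is matched by the same operator <<. The names tab and keyValue are hypothetical.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical manipulator: same signature as endl, so the operator <<
// overload taking a function pointer picks it up. Emits a tab, no flush.
std::ostream &tab(std::ostream &os)
{
    return os.put('\t');
}

// Formats "key<TAB>value" to show the manipulator used in an expression.
std::string keyValue(const std::string &key, int value)
{
    std::ostringstream out;
    out << key << tab << value;
    return out.str();
}
```

Manipulators that take arguments need more machinery than this, so treat the above as the simplest case only.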
#include <stdio.h>
int main()
{
    /* Code using stdin inserted here */
    fflush(stdin); // eat all chars from stdin, allegedly
    /* More code using stdin */
}
But this won't work. It is undefined behavior because the standard only provides words to make sense out of fflush()ing output streams, not input streams. The reason this is so is because the stdio buffer and the operating system buffer are usually two different buffers. Furthermore, stdio may not be able to get to the OS buffer, because for instance, some OSs don't pass the buffer along until you hit return. So, normally, you'd either have to use a non-standard non-portable extension to solve this, or write a function to read until the end of the line.
In C++, you might do this:
#include <iostream>
int main()
{
    char someBuffer[HopeFullyTheRightSize];
    /* Code this and that'ing std::cin inserted here */
    /* Now "eat" the stream till the next newline */
    std::cin.getline(someBuffer, sizeof(someBuffer), '\n');
    /* More code using std::cin */
}
One problem here is that an acceptable buffer size must be chosen by the user. To get around this, the user could have used a C++ string:
#include <iostream>
#include <string>
int main()
{
    std::string someStringBuffer; // Don't worry about line length
    /* This and that'ing with std::cin here */
    std::getline(std::cin, someStringBuffer);
    /* More code using std::cin */
}
A problem still remains here in that the code still requires a buffer to be explicitly provided by the user. However, iostreams has the capability to handle this:
#include <iostream>
#include <limits>
int main()
{
    /* Code using std::cin inserted here */
    std::cin.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
    /* More code using std::cin */
}
It might be worth wrapping that line up into an inline function, and letting it take a std::istream & argument:
inline void eatStream(std::istream &inputStream)
{
    inputStream.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
}
Note that often you will see std::cin.clear() used, which may look redundant given .ignore(). However, .clear() does not clear characters. Instead, it clears the respective stream's state, which is important to do sometimes. The two operations often go hand in hand because perhaps some error situation has occurred, like reading a number when alphabetic characters are in the input stream. Therefore, clearing the stream state and discarding the bad characters are often done in adjacent lines, but be clear :), they are two different operations.
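Here is a sketch of the two operations working side by side, assuming line-oriented input where a bad line should simply be thrown away. The function name readInt is made up for this example.

```cpp
#include <istream>
#include <limits>
#include <sstream>

// Hypothetical helper: tries to read one int. On failure it resets the
// stream state with clear(), then discards the rest of the offending
// line with ignore(), so the next attempt starts on a fresh line.
bool readInt(std::istream &in, int &value)
{
    if (in >> value)
        return true;
    in.clear(); // reset the error state; removes no characters
    in.ignore(std::numeric_limits<std::streamsize>::max(), '\n'); // drop the bad line
    return false;
}
```

Given input such as "abc" on one line and "42" on the next, the first call fails and cleans up, and the second call succeeds with 42. Note the order: calling ignore() before clear() would do nothing, because a stream in a failed state refuses to extract.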
Of course, you don't have to .ignore() the max possible number of characters, but as many chars as you like, if less makes sense for the problem that you are solving. The above shows C++ solutions, but C solutions will be similar, to wit, you need to explicitly eat the extra characters too, perhaps:
#include <stdio.h>
void eatToNL(FILE * inputStream)
{
    int c;
    /* Eat till (& including) the next newline */
    while ((c = getc(inputStream)) != EOF)
        if (c == '\n')
            break;
}
/* blah blah */
eatToNL(stdin);
/* blah blah */
As usual, don't hesitate to read your texts on the functionality of iostreams or stdio.
class Comeau { };
// ...
Comeau *p = new Comeau[99];
delete [] p; // ok
// ...
Comeau *p2 = new Comeau;
delete [] p2; // not ok
If you new a scalar, then you need to delete a scalar:
Comeau *p3 = new Comeau;
delete p3; // ok
// ...
Comeau *p4 = new Comeau[99];
delete p4; // not ok
The reason for this is that delete doesn't just get rid of the memory, but runs the respective destructors for each element as well. This does not mean that since builtin types don't have destructors that this is ok:
int *p5 = new int[99]; // AA
delete p5; // BB: NO!
It does not matter if violations of these points appear to work with your compiler. They may not work on another compiler, or even an upgrade of the same compiler, since the Standard has no such provision for line BB given line AA.
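To see the "destructor for each element" point in action, here is a sketch using a class that counts its own destructions. Tracked and destroyedByArrayDelete are names invented for this illustration.

```cpp
// Hypothetical class that records how many of its objects were destroyed.
struct Tracked {
    static int destroyed;
    ~Tracked() { ++destroyed; }
};
int Tracked::destroyed = 0;

// Allocates n objects with new[] and releases them with the matching
// delete[], returning how many destructors ran.
int destroyedByArrayDelete(int n)
{
    Tracked::destroyed = 0;
    Tracked *p = new Tracked[n];
    delete [] p; // array form: one destructor call per element
    return Tracked::destroyed;
}
```

The sketch intentionally shows only the correct pairing. Trying the mismatched form "to see what happens" proves nothing, since the behavior is undefined.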
#include <iostream>
int main()
{
    cout << "hello, world\n";
}
it may be an error, because there is no cout; there is a std::cout. So spelling it like that is one way to fix the above code:
std::cout << "hello, world\n";
You could also use a using declaration:
using std::cout; // read as: hey, cout is now in scope
// Therefore, cout is a synonym for std::cout
//...
std::cout << "hello, world\n"; // ok no matter what
cout << "hello, world\n"; // ok because std::cout is in scope
You could also use a using directive:
using namespace std; // hey, all of std is in scope
//...
std::cout << "hello, world\n"; // ok no matter what
cout << "hello, world\n"; // ok because std::cout is in scope
// because all of std from the headers used is in scope
Although this is not a namespaces tutorial, it's worth pointing out that you should usually consider putting your usings as local as possible. The reason why is because the more usings that you use, the more you'll be defeating namespaces. IOWs, when possible:
These points are true of the rest of the names in the standard library, whether std::vector, or whatever... so long as you've #included the right header of course:
#include <vector>
#include <string>
vector<string> X; // nope
std::vector<std::string> Y; // ok
Which brings up the issue that Standard C++ does not have a header named <iostream.h>, although many compilers support it for backwards compatibility with pre-Standard implementations, where namespace didn't exist. So, as an extension, the original example given in this section may work on some implementations (with iostream.h or iostream).
Similarly, the Standard C headers, such as <stdio.h>, are supported in C++, but they are deprecated. Because .h forms of C headers are deprecated, so-called Cname headers are often said to be preferred, for instance, cstdio, or cctype instead of ctype.h. Furthermore, names in the .h are assumed to be in the global namespace, and so therefore do not need to be qualified with std::. The .h form is sometimes used as a transition model for backwards compatibility. Or, for a "co-ed" source file able to be compiled by a C or C++ compiler. This said, there is some controversy about whether these headers should have ever been deprecated, so, IMO, the jury is still out on whether you must in all cases prefer Cname's to name.h's, or for that matter, in any cases.
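Returning to the earlier advice about keeping usings as local as possible, one reading of it is sketched below: the using declarations live inside a single function, so the rest of the file is unaffected. The function totalLength is hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: total number of characters across all words.
std::size_t totalLength(const std::vector<std::string> &words)
{
    // Using declarations scoped to this function only; nothing leaks
    // into the rest of the translation unit.
    using std::size_t;
    using std::string;
    using std::vector;

    size_t total = 0;
    for (vector<string>::const_iterator it = words.begin(); it != words.end(); ++it)
        total += it->size();
    return total;
}
```

Outside of totalLength, the names still need their std:: qualification, which is exactly the point.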
// x.cpp
static bool flag = false; // AAA
void foo() { if (flag)... }
void bar() { ...flag = true... }
should instead often be composed this way in C++:
// x.cpp
namespace /* NOTHING HERE!! */ { // BBB
    bool flag = false; // no need for static here
}
The use of static in AAA indicates that flag has internal linkage. This means that flag is local to its translation unit (that is, effectively it is only known by its name in some source file, in this case x.cpp). This means that flag can't be used by another translation unit (by its name at least). The goal is to have less global/cross-file name pollution in your programs while at the same time achieving some level of encapsulation. Such a goal is usually considered admirable and so therefore is often considered desirable (note that the goal, not the code, is being discussed in this sentence).
Contrast this to BBB. In the case of using the unnamed namespace above, flag has external linkage, yet it is effectively local to the translation unit. It is effectively still local because although we did not give the namespace a name, the compiler generated a unique name for it. In effect, the compiler changes BBB into this:
// Just get UNIQUE established
namespace UNIQUE { } // CCC
// Bring UNIQUE into this translation unit
using namespace UNIQUE;
// Now define UNIQUEs members
namespace UNIQUE {
    bool flag = false; // As Before
}
For each translation unit, a uniquely generated identifier name for UNIQUE somehow gets synthesized by the compiler, with the effect that no other translation unit can see names from an unnamed namespace, hence making it local even though the name may have external linkage.
Therefore, although flag in CCC has external linkage, its real name is UNIQUE::flag, but since UNIQUE is only known to x.cpp, it's effectively local to x.cpp and is therefore not known to any other translation unit.
Ok, so far, most of the discussion has been about how the two provide local names, but what are the differences? And why was static deprecated and the unnamed namespace considered superior?
First, if nothing else, static means many different things in C++ and reducing one such use is considered a step in the right direction by some people. Second to consider is that names in unnamed namespaces may have external linkage whereas with static a name must have internal linkage. In other words, although there is a syntactic transformation shown above between AAA and BBB, the two are not exactly equal (the one between BBB and CCC is equal).
Most books and usenet posts usually leave you off about right here. No problem with that per se, as the above info is not to be tossed out the window. However, you can't help but keep wondering what the BIG deal some people make about unnamed namespaces are. Some folks might even argue that they make your code less readable.
What's significant though is that some template arguments cannot be names with internal linkage, instead some require names with external linkage. Remember, the types of the arguments to templates become part of the instantiation type, but names with internal linkage aren't available to other translation units. A good rule of thumb to consider (said rather loosely) is that external names shouldn't depend upon names with less linkage (definitely not of those with no linkage, and often not even with names of internal linkage). And so it follows from the above that instantiating such a template with a static such as from AAA just isn't going to work. This is all similar to why these won't work:
template <const int& T> struct xyz { };
int c = 1;
xyz<c> y; // ok
static int sc = 1; // This is the kicker-out'er above
xyz<sc> y2; // not ok

template <char *p> struct abc { };
char comeau[] = "Comeau C++";
abc<comeau> co; // ok
abc<"Comeau C++"> co2; // not ok

template <typename T> struct qaz { };
void foo()
{
    char buf[] = "local";
    abc<buf> lb; // not ok
    static char buf2[] = "local";
    abc<buf2> lb2; // not ok
    struct qwerty {};
    qaz<qwerty> dq; // not ok
}
Last but not least, static and unnamed namespaces are not the same because static is deficient as a name constrainer. Sure, a C programmer might use it for flag above, but what do you do when you want to layer or just encapsulate say a class, template, enum, or even another namespace? ...for that you need namespaces. It might even be argued that you should wrap all your files in an unnamed namespace (all the file's functions, classes, whatever) and then only pull out the parts other files should know about.
Note that none of the above is equal to this:
// x.cpp
namespace { // DDD
    static bool flag = false;
}
The point of showing DDD is that you really wouldn't want to write it. I guess one could say that it is redundant, and really just brings all the above issues right back (as in this flavor, it's not external). So, it's only shown to make sure you see that none of the previous versions look like this :).
Note that namespaces containing extern "C" declarations are in some ways as if they were not declared in the namespace, so since an unnamed namespace is a namespace, this holds true for an unnamed namespace as well.
Note also that the above discussion does not apply to the other uses of static: static lifetime, static members or to static locals in a function.
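Pulling the pieces together, here is a sketch of the "wrap the private parts, expose the rest" idea for a single source file. The function names markDone, isDone and finish are invented for the example; only flag comes from the discussion above.

```cpp
// Everything in the unnamed namespace is usable by name only in this file.
namespace {
    bool flag = false;

    void markDone() { flag = true; }
}

// The parts other files are meant to know about. They would be declared
// in a header; the unnamed namespace members would not be.
bool isDone() { return flag; }
void finish() { markDone(); }
```

Another translation unit can call isDone() and finish(), but it has no way to name flag or markDone directly.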
void foo(int /*no name here*/)
{
    // code for foo
}
Take note that although an int argument would be passed, it is not named. Why would you do that? One reason is to "stub out" a routine. For instance, let's say that some functionality was removed from an already existing program. Instead of finding all calls of foo and removing them, they can be left in. The effect is to no-op the code. If the function is inline defined, it wouldn't even generate any code.
Of course, this doesn't depend upon functions with just one argument. For instance, you might have:
void bar(int arg1, int arg2, int arg3)
{
    // code for bar, using arg1, arg2 and arg3
}
Now let's say that the functionality of the program changes and that arg2 is no longer needed. Well, obviously you'll remove the code that uses arg2. But now the problem is that you'll probably get an "unused identifier" warning from your compiler. To get rid of the warning you could assign it a dummy value or add a dummy use of it within the function, but that'll just confuse the issue. Instead, you can just remove the argument name too:
void bar(int arg1, int /* Now unnamed */, int arg3)
{
    // code for bar, using arg1 and arg3
}
Sometimes though, the above approach is not just used to support legacy code, but also to make sure an overloaded function gets picked, perhaps a constructor. In other words, passing an additional argument just to make sure a certain function gets picked. As well, during code development it might help to use an unnamed argument if, for instance, you write stubs for some routines.
When possible, it probably should be argued that unused parameters should be removed completely both from the function and from all the call points though, unless you're specifically trying to overload operator new or something like that.
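As an illustration of using an unnamed parameter to steer overload resolution toward a particular constructor, consider the following sketch. The class Duration and the tag type InMinutes are hypothetical and exist only for this example.

```cpp
// Hypothetical tag type: carries no data, exists only to select an overload.
struct InMinutes { };

class Duration {
public:
    // Plain form: the argument is a number of seconds.
    explicit Duration(int seconds) : seconds_(seconds) { }
    // Tagged form: the second parameter is unnamed because its value is
    // never used; only its type matters for picking this constructor.
    Duration(int minutes, InMinutes) : seconds_(minutes * 60) { }
    int seconds() const { return seconds_; }
private:
    int seconds_;
};
```

So Duration(90) means 90 seconds, while Duration(2, InMinutes()) means two minutes, and the compiler never complains about an unused identifier.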
Also, note that the above discussion has no relation to code such as:
void function(int arg, ...);
which uses the ellipsis notation (the dot dot dot) to specify that the arguments to a function are unspecified. These are unnamed too, but it establishes variable arguments to a function (which C supports too), and that's something else altogether.
As you may be aware, Standard C and Standard C++ each support a rich myriad of implicit conversions. Generally, they allow us to "manipulate" common values and put them into an object of similar, but not exactly the same, type. And they happen by default, hence why they are implicit. IOWs, as objects of some types more naturally convert to objects of other types, the language provides rules allowing some of these conversions w/o needing to specify extra code or directives (it also has rules prohibiting other conversions). Therefore, this allows compilers to generate code to do these conversions automatically.
A classic example of this is stdio's getchar(). That is, although most code will be using the return value of getchar() as a char of some sort, it actually returns an int. That means that this code may have a problem:
#include <stdio.h>
int main()
{
    char c;
    /* ... */
    c = getchar();
}
because normally you'll probably be reading in characters in a loop and you would also want to be checking to see if there was a problem with input, for instance, an end of file condition:
while ((c = getchar()) != EOF)
    putchar(c);
But that would require:
signed int c;
Why? Well, Standard C says that EOF is a macro which "expands to an integer constant expression, with type int and a negative value, that is returned by several functions to indicate end-of-file, that is, no more input from a stream." Therefore, if getchar() is to be able to return all valid character input, then there must be some way to capture the negative value "error" condition represented by EOF as well. This means getchar() must return a type able to hold more than the character type can hold. (Some may question the design approach used with getchar(), but that's a topic for a different discussion.) In this case, that means an int, and a signed one (plain int is signed by default).
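The following sketch shows why the int matters without doing any actual I/O. It assumes, as on the usual implementations, that int is wider than char. All three function names are invented for the illustration.

```cpp
#include <cstdio>

// Mimics what getchar() hands back for a successfully read byte:
// the unsigned char value, widened to int, hence never negative.
int asGetcharWouldReturn(unsigned char byte)
{
    return byte;
}

// Kept as an int, no byte value can be mistaken for EOF.
bool collidesWithEOF(unsigned char byte)
{
    return asGetcharWouldReturn(byte) == EOF;
}

// Squeezed into a plain char first, the outcome is implementation-defined:
// where char is signed and EOF is -1, byte 0xFF compares equal to EOF.
bool collidesAfterNarrowing(unsigned char byte)
{
    char narrowed = static_cast<char>(asGetcharWouldReturn(byte));
    return narrowed == EOF;
}
```

collidesWithEOF is false for every byte, whereas collidesAfterNarrowing(0xFF) may well be true on your platform, which is the premature end-of-file bug in a nutshell.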
To connect all the dots here, the long story short is that internally getchar() is assigning a char value to an int, and then returning that. Inside getchar() (or some function it calls), that code is something along the lines of the following "Canglais" (C pseudo-code):
/* ... */
unsigned char buffer[SomeBufferSize];
unsigned char *bufferp;
readBuffer(buffer);
bufferp = buffer;
/* ... */
int charToReturn;
/* ... */
if (bufferNotEmpty)
    charToReturn = *bufferp++; /* int = unsigned char */
else if (DidHitEndOfFile)
    charToReturn = EOF;
/* ... */
return charToReturn;
Notice that the conversion from unsigned char to int "just happens". The rules of Standard C and Standard C++ each spell out what conversions can happen in contexts like this, and so compiler writers are able to generate the correct required executable code. A piece of code like this is ok too:
int main()
{
    double d;
    long l = 1234;
    d = l; /* long implicitly converted to double */
}
These same rules however will let compilers diagnose a problem such as:
int i = 99;
char *p;
// ...
p = i; // error, types are too different for implicit conversion
As you may also be aware, C and C++ also support explicit conversions, aka casts. They allow one to specify all the implicit conversions (consider these "castless conversions" if you want), and also other ones that are not implicit. Note that this does not mean you can convert anything to anything. Explicit conversions, or casts, are expressions which take the form of a so-called "C-style cast":
(T)E
In other words, a cast is a syntactic -- and hence purposeful, explicit and non-automatic -- mechanism/notation to accomplish a conversion. Normally it's for conversions that the compiler would not be doing by default, but you can also cast the default ones too if for some reason you want to make them explicit.
The type, T above, in a C style cast can be a simple type like int, a qualified (const or volatile) pointer, etc. and it is parenthesized.
The expression, E above, can be most normal expressions: additions, function calls, constants, etc. Therefore, to change the last example, I might have:
    p = (char *)i;  // compiles
Note that I do not say "OK" but "compiles", because we are converting an int to a pointer, and we are not guaranteed by the language that that conversion can actually take place correctly on a given implementation.
Certainly 99 in my toy example is a garbage memory location; however, the address of video card memory at location 0xFFFF0F, as used by a device driver, may not be. This is a reason that use of casts should be approached cautiously: they may or may not even make sense. Even if a cast does make sense, it has to actually work on a specific compiler and platform. And of course, portability of such constructs is often just thrown completely out the window.
NOTE: A cast is effectively a statement to the compiler that you know what you are doing and that it should shut up about any possible violations you may be making. So do make sure you know what you are doing, and why. This is important to consider in shops that do not permit warnings, because it is often too easy to insert a cast to satisfy the requirement and inadvertently render the code non-portable or incorrect (on other platforms, but probably even the same one). Furthermore, to their demise, newbies often also add casts to get around problems they don't understand and/or because of compiler errors, usually with no good basis to do so. A bad basis, whether from a newbie or even an expert, is that you are frustrated, or that you think you got the code working satisfactorily. These bugs are very easy to add but slippery once there, and painful to detect and fix.
Note that in the earlier examples we could have added casts:
    charToReturn = (int)*bufferp++;
    /* ... */
    d = (double)l;
but in the context of those examples, the casts are strictly speaking not necessary, since as mentioned, the conversions can already be done implicitly. In cases where the cast is exactly the same as not having one, the compiler will accept the cast but it will essentially have no effect, since the conversion will be happening anyway.
There is an argument that specifying a cast in such a manner makes the line of code more self-documenting. That may be so, but gratuitous casts can get burdensome. Furthermore, gratuitous casts become a code maintenance nightmare, and a trap, one which will most assuredly render many programs not only incorrect, but silently incorrect! So, be wise. Even casts which are not gratuitous should be used judiciously.
BTW, note that C allows types to be defined in casts, but C++ does not:
    int main()
    {
        // code that has not yet declared xyz
        p = (struct xyz { int i; } *)0;  // C++ error: type xyz can't be defined here
        return 0;
    }
C-style casts are more formally known as "explicit conversions" because you code them explicitly. Note, C99 also now supports "compound literals" which seem to have a cast'y look to them, however, they are not casts. That is, compound literals bring forth lvalues and are not conversion requests per se.
Speaking of which, note that a cast in Standard C cannot yield an lvalue (some compilers have non-Standard extensions that allow it though). However, in C++ you can cast to an lvalue as a reference:
    ...(some_base_class&)E...
C++ also allows classes:
    class xyz {
        // ...
    public:
        xyz(int);  // constructor taking an int
    };
    xyz anXyz(99);  // init anXyz with 99
where we can bring forth instances of user-defined types through constructors. This same kind of possibility has been allowed for built-in types too:
    int i(99);
This has led to the so-called constructor style initializer in C++. And hand in hand with this, but in contrast to the C style cast form (T)E, is constructor style casts of the form:
    T(E)
for instance:
    // ...
    char somechar;
    typedef int int_typedef_example;
    // ...
    x = this - that + (int)somechar;                   // C style form
    x = this - that + int(somechar);                   // C++ ctor style form
    x = this - that + (int)(somechar);                 // C style form with parens
    x = this - that + int_typedef_example(somechar);   // ctor style using typedef as type
Note that in C++, casts (all casts) may result in class specific "operator functions" being called. Note also that the type in a constructor style cast has to be written as a single type name, so this is not allowed:
    q = int *(p);
You would need a typedef for that.
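A minimal sketch of the typedef route, assuming p is a void * that really does hold the address of an int. The names int_ptr and as_int_ptr are hypothetical and exist only for this illustration.

```cpp
typedef int *int_ptr;   // hypothetical one-word name for the type "int *"

// Converts a void * back to int * using the constructor style cast.
int *as_int_ptr(void *p)
{
    // int *(p) would not parse; int_ptr(p) is a single type name followed by (E).
    return int_ptr(p);   // same meaning here as the C style cast (int *)p
}
```

The typedef changes only the spelling. The conversion performed is the same one the C style cast would perform, with the same caveats.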
As mentioned earlier, casts are a brute force conversion mechanism. As such, they are probably too powerful, and therein lies a problem: It cannot always be grokked by looking at a cast what exactly the intention of the conversion is. Consider:
    const T1 *p1;
    //...
    p2 = (T2)p1;  // converting T1* to T2? const to non-const? typo error?
Therefore, in order to make such code -- at least the casts deemed necessary to remain in the code -- more self-documenting, Standard C++ supports additional forms of casts often referred to as "new style casts". They are split into distinct categories for an opportunity to:
    // ...
    struct base {};
    struct derived : base {};
    base b;
    derived d;

    // Normal implicit "upcast" derived to base conversion
    base *bp = &d;  // similar to = static_cast<base *>(&d)

    // here we do what is normally considered a "conversion in the wrong direction":
    derived *dp1 = bp;  // error: base * can't init a derived *
    derived *dp2 = static_cast<derived *>(bp);  // ok as far as cast goes (static class navigation)
                                                // IOWs, may not be a base, but see dynamic_cast
                                                // below to "properly" navigate & check hierarchies
    // Can use references as well as ptrs
    derived &dr = static_cast<derived &>(*bp);  // ok as far as cast goes

    // As with C style casts, some of these are redundant (implicit),
    // may narrow, widen, etc. and are shown for exposition purposes only
    charToReturn = static_cast<int>(*bufferp++);   // char to int
    d = static_cast<double>(l);                    // long to double
    l = static_cast<long>(d);                      // and back
    x = this - that + static_cast<int>(somechar);  // char to int

    void *SomeFunc();
    void *p = SomeFunc();                    // perhaps malloc()
    MyType *myp = static_cast<MyType *>(p);  // void * to MyType *

    int SomeInt = 99;
    int *pi = &SomeInt;
    // ...
    void *vp = pi;  // castless conversion
    // ... say pi = 0;
    pi = vp;                      // error in C++
    pi = static_cast<int *>(vp);  // put it back

    enum colors { red, white, blue };
    int rwb = static_cast<int>(white);       // enum to int
    colors bwr = static_cast<colors>(rwb);   // and back

    // see some other static_cast's below
    void foo(const int *cip)
    {
        int *ip = const_cast<int *>(cip);  // de-const
        // ... modifying *ip might still be undefined behavior
    }
You can also use const_cast to add qualification, although that is rarer. Such use might be to force say a const member function to be called instead of its non-const version.
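A small sketch of that rarer use, adding const so that the const overload is selected. The Widget class and both function names are hypothetical, invented only to make the overload choice visible.

```cpp
#include <string>

// Hypothetical class with const and non-const overloads of the same member.
struct Widget {
    std::string which()       { return "non-const"; }
    std::string which() const { return "const"; }
};

// Calls the const overload on a non-const object by adding const via const_cast.
std::string call_const_version(Widget &w)
{
    return const_cast<const Widget &>(w).which();
}

// For comparison: no cast, so overload resolution picks the non-const version.
std::string call_default_version(Widget &w)
{
    return w.which();
}
```

Adding const this way is always safe, unlike removing it, since it only restricts what can be done through the resulting reference.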
    // from previous example using C style cast
    p = reinterpret_cast<char *>(i);  // int may not look the same as char *
A piece of code like this:
    int main()
    {
        int *ip;
        float *fp;

        ip = fp;
        fp = ip;

        ip = static_cast<int *>(fp);
        fp = static_cast<float *>(ip);
    }
produces errors, because int * and float * are unrelated pointer types:
    Comeau C/C++ for MS_WINDOWS_x86
    Copyright 1988-2005 Comeau Computing.  All rights reserved.
    MODE:strict errors C++

    "pcast.cpp", line 6: error: a value of type "float *" cannot be assigned to an entity of type "int *"
          ip = fp;
             ^

    "pcast.cpp", line 7: error: a value of type "int *" cannot be assigned to an entity of type "float *"
          fp = ip;
             ^

    "pcast.cpp", line 9: error: invalid type conversion
          ip = static_cast<int *>(fp);
               ^

    "pcast.cpp", line 10: error: invalid type conversion
          fp = static_cast<float *>(ip);
               ^

    4 errors detected in the compilation of "pcast.cpp".
Assuming you really want to do such a cast, a resolution using new style casts would be:
    ip = reinterpret_cast<int *>(fp);
    fp = reinterpret_cast<float *>(ip);  // fp may or may not have its original value
Assuming it makes sense to say one cast is more portable than another, don't count on it with reinterpret_cast, as you are really asking for type checking to be thrown out the window here.
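For contrast, here is a sketch of one of the few reinterpret_cast uses whose result is well defined: viewing an object as raw bytes through unsigned char *. The function names round_trip and sum_of_bytes are hypothetical. The sketch assumes an int without padding bits, which holds on common platforms.

```cpp
#include <cstddef>

// Round trip: int * -> unsigned char * -> int *. Converting to a pointer type
// whose alignment is no stricter, and back again, yields the original pointer.
int *round_trip(int *ip)
{
    unsigned char *raw = reinterpret_cast<unsigned char *>(ip);
    return reinterpret_cast<int *>(raw);
}

// Reads the bytes that make up an int through unsigned char * and sums them.
unsigned sum_of_bytes(const int &value)
{
    const unsigned char *raw = reinterpret_cast<const unsigned char *>(&value);
    unsigned sum = 0;
    for (std::size_t i = 0; i < sizeof value; ++i)
        sum += raw[i];
    return sum;
}
```

Nothing similar is promised for the int * and float * pair above. Reading a float through an int * (or vice versa) remains off limits even when the pointer value itself survives the round trip.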
Unlike the static_cast inheritance example earlier, be careful as reinterpret_cast does not require all types to be complete in some contexts:
    struct B;
    struct D;  // should be inherited from B!

    int main()
    {
        B *pb;
        D *pd;
        // ...
        pb = pd;                        // error: can't assign D* to B* implicitly
        pb = static_cast<B*>(pd);       // error: still can't assign D* to B*,
                                        // they are incomplete so no relationship established
        pb = reinterpret_cast<B*>(pd);  // compiles: but still no known relationship: Yikes!
    }
Here's another example with reinterpret_cast, which may have come from somebody trying to, say, write a debugger and "normalize" some pointers for some reason. Naturally this kind of stuff would be platform specific, and hence may have portability issues:
    void foo() { }
    int bar(int) { return 99; }

    struct wacko {
        void mfoo() { }
        int mbar(int) { return 99; }
    };

    int main()
    {
        void (*vpv)() = foo;
        int (*ipi)(int) = bar;

        vpv = bar;                                  // error: void (*)() can't hold int(*)(int)
        vpv = (void (*)())bar;                      // force it
        vpv = static_cast<void (*)()>(ipi);         // force this too? No: error: not related types
        vpv = reinterpret_cast<void (*)()>(ipi);    // ok: unrelated types, but force anyway
        ipi = reinterpret_cast<int (*)(int)>(vpv);  // usually copies back same

        void (wacko::*mvpv)() = &wacko::mfoo;
        int (wacko::*mipi)(int) = &wacko::mbar;
        mvpv = reinterpret_cast<void (wacko::*)()>(mipi);  // compiles but does it do what you want?

        return 0;
    }
The static_cast examples given earlier pertaining to inheritance usually apply to derived pointers or references being converted to base pointers or references respectively. Here's a classic OO example:
    // shapes.h
    class Shape {  // ABC: abstract base class
        // ...
    public:
        virtual void draw() = 0;  // pure virtual function
        virtual ~Shape();
    };

    class Circle : public Shape {
    public:
        // ... ctor with args, etc.
        void draw() { /* ... */ }  // virtuals inherit as virtual
    };

    class Square : public Shape {
    public:
        // ... ctor with args, etc.
        void draw() { /* ... */ }
    };

    // Your Shapes Here....
Since visual representations of such hierarchies are normally drawn with the base class on top and the derived classes as leaves, like fingers pointing downward (another way to look at it is that the base class is the root class and roots normally grow downward), going from a derived to a base is often spoken of as upcasting, since you will be casting up such a visual representation of the inheritance diagram. Upcasting is normally done implicitly, as that is normally the direction you convert your pointers and references when an inheritance hierarchy is involved:
    #include "shapes.h"

    int main()
    {
        Shape *sp;

        sp = new Circle;  // implicit upcast from Circle * to Shape *
        // ...
        sp = static_cast<Shape *>(new Circle);  // same but unnecessarily explicit
        sp->draw();  // Draw a Circle ala virtual draw()
        // ...
        sp = new Square;  // implicit Square * to Shape *
        sp->draw();  // Draw a Square ala virtual draw()
        // ...
        Square *sqp = static_cast<Square *>(sp);  // Shape * to Square *? Ok, get Square back
        Circle *cp = static_cast<Circle *>(sp);   // Silent Shape * to Circle * from underlying Square *?

        return 0;
    }
An issue with this static_cast is that it did not consider the dynamic type of the object that was cast, it only considered the declared static type, and so only "viewed" an object "slice" so to speak. In particular, what can be made of cp? Not much (a Circle * pointing at a Square? Ugh.). So what's often needed in some cases is to be able to "go deeper." This is significant because it points to an underlying issue when we have subsystems which were not designed with each other in mind, are not extensible enough, etc. and so they are not normally able "to speak" with each other directly or purposely or optimally in some way.
As many conversions go, this subsystem stuff can be ugly. It is still desirable because you want to hook into the services of a ("3rd party") library that you have and use, whether it be for windows, graphics, databases, games, file systems, geometry, networking, whatever. However, often you have not written the library, so usually you don't want to modify it, and often you can't because often you don't even have access to the source code, among other reasons. This means that the library author does not know if you have created a derived class using "it" as one of your base classes, and obviously its objects thereof. This then means that the library will often only "produce", use, pass around, etc. what it knows about, hence only going so far as its own classes and objects (which will be your base classes and base class objects), and services, since they are provided with the library. However, if you derived from them, then you may have some specialized functionality that you have added in your derived classes for your derived class objects that the library may not know about. And so, obviously, you'll want to perform some of your own operations and services and have them work with the closed base library and your "extensions".
Such conditions, where say you can't modify the design of the library, often delve into situations that will involve casting. What is crucially desired here is the answer to "Is it safe to use the derived class object? Does it even exist?" In particular, the thrust is that dynamic_cast provides direct language support by accepting a pointer or reference to a base class object (the one in the "closed library"), and respectively rendering (converting) it as a pointer or reference to a particular derived class (yours), all at runtime. Note that the cast sought is in the opposite direction from earlier.
This base to derived conversion is downcasting, as it is casting down the inheritance diagram. Downcasting behavior does not happen implicitly. With upcasting you are normally zeroing in on a specific ancestral base class, usually quite clearly, even considering multiple inheritance. However, with downcasting, since it fans out, the breadth of the choices expands without limit, and worse, the classes become less general and more specific, since that is how derived classes work and what they are often for: different niches. For instance, we can keep adding derived Shapes to the classic example given above. This should "just work" and normally should "just interface" with the subsystem (of course, one must override virtuals, etc.).
And yet there is a problem with the cp pointer in the shape example earlier: you cannot always use static_cast to traverse a hierarchy back and forth as it does not always work this way safely. This is similar to the classic problem of:
    // ...
    int *pi;
    float *pf;
    // ...
    void *pv = pi;     // send to pointer to void.
                       // IOWs, toss type baggage out the window.
    pi = (int *)pv;    // and back to pointer to int
    // ...
    pf = (float *)pv;  // but back to pointer to float? Say what?
but with its own issues: static_cast uses the static type of an object, not its dynamic type, whereas dynamic_cast "looks into" the object. There is a difference between a "plain ol" pointer conversion, and object interrogation. This difference, which is reflected in the difference between static_cast and dynamic_cast, provides the safety that is sought in this problem domain:
    Square *sqp = dynamic_cast<Square *>(sp);  // Shape * to Square *? Ok, get Square back
    Circle *cp = dynamic_cast<Circle *>(sp);   // Shape * to Circle * from underlying Square *? NOPE
Important here is that dynamic_cast will render the request if the derived class object really is the overlying object mentioned by the base class pointer or reference. As with most upcasting, this downcasting also begs a polymorphic (involving virtual functions -- at least a virtual destructor in a base class) inheritance relationship; it's important that the dynamic type of the pointer or reference can be picked up and used properly (that's why static_casting in the cp example is not useful). In other words, there is a check that occurs. That is, as dynamic_cast only applies to objects of polymorphic classes, then a request outside the realm of an inheritance relationship among polymorphic classes will simply and naturally elicit a compiler diagnostic:
    $ cat ccbase.c
    class CCbase { }; // NO VIRTUALS
    class CCderived : public CCbase { };
    class SomeOtherClass { };

    int main()
    {
        CCbase *b = new CCderived;

        CCderived *d = dynamic_cast<CCderived *>(b); // related but not polymorphically

        SomeOtherClass *p = new SomeOtherClass;
        d = dynamic_cast<CCderived *>(p); // not same inheritance relationship

        return 0;
    }
    $ como ccbase.c
    Comeau C/C++ 4.3.4.1 (Oct 30 2005 22:29:44) for MAC_OS_X
    Copyright 1988-2005 Comeau Computing.  All rights reserved.
    MODE:strict errors C++

    "ccbase.c", line 9: error: the operand of a runtime dynamic_cast must have a polymorphic class type
          CCderived *d = dynamic_cast<CCderived *>(b); // related but not polymorphically
                                                   ^

    "ccbase.c", line 12: error: the operand of a runtime dynamic_cast must have a polymorphic class type
          d = dynamic_cast<CCderived *>(p); // not same inheritance relationship
                                        ^

    2 errors detected in the compilation of "ccbase.c".
So we can even get compiler detection in addition to the runtime checks. If during runtime, the downcast pointer conversion request finds the specific base to derived relationship not to be the case, it returns a null pointer. In the non-failure case, the correct polymorphic inheritance relationship has been "verified", hence the cast correctly returns a pointer or reference, respectively, to the derived class object, and you can therefore perform the necessary derived class operations upon that result.
In other words, in the examples given, the host parenthesized expression pointer sp is not just converted to the bracketed target pointer Square * or Circle *. Instead the object being pointed to (*sp, not sp the pointer itself) "is queried" (it's implementation defined exactly how) as to whether or not it is a Square or Circle (or a class object derived from the Square or Circle class), and only if that is the case is it successfully converted. Let's play some:
    $ cat shapes2.c
    #include "shapes.h"
    #include <iostream>

    int main()
    {
        Shape *sp;

        sp = new Circle;  // implicit upcast from Circle * to Shape *
        std::cout << "sp=" << sp << std::endl;

        Square *sqp = dynamic_cast<Square *>(sp);  // Attempt downcast
        Circle *cp = dynamic_cast<Circle *>(sp);   // Attempt downcast
        std::cout << "sqp=" << sqp << std::endl;
        std::cout << "cp=" << cp << std::endl;

        return 0;
    }
    $ como shapes2.c
    Comeau C/C++ 4.3.4.1 (Oct 30 2005 22:29:44) for MAC_OS_X
    Copyright 1988-2005 Comeau Computing.  All rights reserved.
    MODE:strict errors C++

    $ a.out
    sp=4198976     Address of Circle object
    sqp=0          FAILED: sp points to a Circle, not a Square, hence null pointer
    cp=4198976     Okey dokey: Got back original address
The discussion thus far has focused on pointers to bases and pointers to deriveds, but a similar rendering request applies to references. Here though, a failure to convert does not result in the null pointer. Instead, as a reference should not be null, then a failure to convert a reference to a base into a reference to a derived in the same polymorphic inheritance relationship hierarchy will result in the std::bad_cast exception being thrown at runtime. So the following program will abort:
    $ cat shapes3.c
    #include "shapes.h"
    #include <iostream>

    void foo(Shape &s)
    {
        Circle &c = dynamic_cast<Circle &>(s);  // Attempt downcast
    }

    int main()
    {
        Circle c;
        Square sq;

        std::cout << "foo()ing Circle" << std::endl;
        foo(c);
        std::cout << "foo()ing Square" << std::endl;
        foo(sq);
        std::cout << "Done" << std::endl;

        return 0;
    }
    $ como shapes3.c
    Comeau C/C++ 4.3.4.1 (Oct 30 2005 22:29:44) for MAC_OS_X
    Copyright 1988-2005 Comeau Computing.  All rights reserved.
    MODE:strict errors C++

    $ a.out
    foo()ing Circle
    foo()ing Square
    C++ runtime abort: terminate() called by the exception handling mechanism
    Abort trap
This program never emits "Done", as passing a Square to foo() caused dynamic_cast to throw a bad_cast, since Squares and Circles are siblings. Since we don't catch it, the program by default eventually calls terminate(), which eventually by default calls abort(), as prescribed by the Standard C++ exception handling mechanism. If you did want to catch it, then just add try/catch:
    $ cat shapes4.c
    #include "shapes.h"
    #include <iostream>
    #include <typeinfo>

    void foo(Shape &s)
    {
        Circle &c = dynamic_cast<Circle &>(s);  // Attempt downcast
    }

    int main()
    {
        Circle c;
        Square sq;

        try {
            std::cout << "foo()ing Circle" << std::endl;
            foo(c);
            std::cout << "foo()ing Square" << std::endl;
            foo(sq);
            std::cout << "Done" << std::endl;
        }
        catch (std::bad_cast) {
            std::cerr << "Caught bad_cast" << std::endl;
            // Do something appropriate here
        }

        return 0;
    }
    $ como shapes4.c
    Comeau C/C++ 4.3.4.1 (Oct 30 2005 22:29:44) for MAC_OS_X
    Copyright 1988-2005 Comeau Computing.  All rights reserved.
    MODE:strict errors C++

    $ a.out
    foo()ing Circle
    foo()ing Square
    Caught bad_cast
Note the #include of <typeinfo> (not <exception>) to obtain bad_cast.
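The same idea can be packaged as a small query function. This is only a sketch: the three classes are minimal stand-ins for the ones in shapes.h, and the name refers_to_circle is hypothetical. It also catches the exception by const reference, a common variation on the by-value catch shown above that avoids copying the exception object.

```cpp
#include <typeinfo>   // std::bad_cast

// Minimal stand-ins for the Shape classes used in the text.
struct Shape  { virtual ~Shape() {} virtual void draw() = 0; };
struct Circle : Shape { void draw() {} };
struct Square : Shape { void draw() {} };

// Returns true if s really refers to a Circle (or something derived from it).
bool refers_to_circle(Shape &s)
{
    try {
        Circle &c = dynamic_cast<Circle &>(s);   // throws std::bad_cast on failure
        (void)c;                                 // the reference would be used here
        return true;
    } catch (const std::bad_cast &) {            // catch by const reference, no copy
        return false;
    }
}
```

When a failed conversion is an expected outcome rather than an error, the pointer form of dynamic_cast, which reports failure with a null pointer, is usually the more natural fit.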
It is tempting to use this infrastructure to orchestrate your program like some giant selection process using switch or if this-or-that-or-that. That is not necessarily the intent. Instead, when dynamic_cast was accepted for adoption into Standard C++, there was also another feature accepted at the same time: condition declarations. You may already be familiar with the idea with say a for statement:
    int i;  // line A
    for (i = 0; i < 99; i++)
        blah(i);
    j = i;  // ok, use i from line A, it's still in scope
which may in some cases be preferred to be written as the following, though with different behavior:
    for (int i = 0; i < 99; i++)
        blah(i);
    j = i;  // error, unless some other i is in scope
Note the declaration is inside the first clause of the for. The behavior difference is that now i only has scope within the for statement, and if you really need it afterwards (and sometimes you do) then it needs to be declared outside the loop the way the first example did. Well, this can also now be done with whiles, switches and ifs. For instance:
    if (int *p = Somelist.next()) {
        // If Somelist is empty we don't need to be here!
        // This p is only in scope here, or in a corresponding else
    }
    // No p from the if statement here, whether by id name or address
Note that this declares, defines, and initializes p. Then, p is tested to see if it is a null pointer or not. This is often just what you want. This provides a basis of locality; it builds on C++'s capability of allowing declarations to be nearer to their use and initialization, especially when linked to the test. This creates a scoped identifier, but only where and when needed. In this example, we only need p if there is another node still left in the list, otherwise we probably don't care. So something like this doesn't help in this or other circumstances:
    if (int *p2) { ... }
This line of thinking, and behavior, is usually exactly what's needed in the dynamic_cast situation. Putting "2 and 2" together provides us a purposely limited and succinct scope:
    if (Circle *cp = dynamic_cast<Circle *>(sp)) {
        // downcast was successful, we are where we need to be
        // Now can use services not supported by the core library
    }
If it's not obvious yet, a prime aspect of doing a dynamic_cast is to be sure you end up with the right object, and so for that you need to be sure it worked.
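A self-contained sketch of that pattern follows. The classes are minimal stand-ins for the Shape hierarchy, and the radius() member and the describe() function are hypothetical, standing in for "services not supported by the core library".

```cpp
#include <string>

struct Shape  { virtual ~Shape() {} };
struct Circle : Shape {
    double r;
    explicit Circle(double radius) : r(radius) {}
    double radius() const { return r; }          // a Circle-only service
};
struct Square : Shape { };

// Uses the Circle-only member when sp points at a Circle, and falls back otherwise.
std::string describe(Shape *sp)
{
    if (Circle *cp = dynamic_cast<Circle *>(sp)) {
        // cp is in scope only in this if statement, and is known to be non-null here
        return cp->radius() > 1.0 ? "big circle" : "small circle";
    }
    return "not a circle";   // also reached when sp itself is a null pointer
}
```

Because the declaration and the test are one construct, there is no window in which cp exists but has not yet been checked.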
Some may argue that a static_cast is more efficient than a dynamic_cast but when supporting code is added, any improvement, if there even was one, is probably a wash. So unless there is some super-duper compelling reason to not use the most appropriate feature, then, well, use it! Also, to counter the argument, it can also be claimed there are cases where dynamic_cast and other RTTI features can actually improve performance by allowing you to directly handle some otherwise inefficient case. As with all efficiency concerns, don't hallucinate them, and be sure they actually exist and are a problem to be solved.
Of course, on the other hand, library use is taxing in various ways. But the combination of these behaviors seems to achieve a reasonable balance between type safety and "getting along" in one's work, hence alleviating some aspects of the concern when using closed library systems. But also of course: when it is possible to use a cleaner design, by all means, do consider it. That is to say, if you have access to all the source code, then you may want to consider whether there is a better way to integrate your libraries with different base class behavior, virtuals, etc. so as to avoid dynamic_cast altogether. Such designs should usually be considered before RTTI. Newbies please note this! It's easier to pound out unmaintainable if statements without much thought than to properly think through a design. But do think it through! This is only a feature to use when control of the class design and integrated automatic polymorphic behavior is out of reach.
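To make the contrast concrete, here is a sketch that assumes you do control the base class, so the varying behavior can be expressed as a virtual function. The name() member and both free functions are hypothetical, introduced only to compare the two styles.

```cpp
#include <string>

// Assumes you control the base class, so the varying behavior can be a virtual.
struct Shape {
    virtual ~Shape() {}
    virtual std::string name() const = 0;
};
struct Circle : Shape { std::string name() const { return "circle"; } };
struct Square : Shape { std::string name() const { return "square"; } };

// RTTI version: every new Shape means another branch to remember to add.
std::string name_via_rtti(const Shape &s)
{
    if (dynamic_cast<const Circle *>(&s)) return "circle";
    if (dynamic_cast<const Square *>(&s)) return "square";
    return "unknown";
}

// Virtual version: new Shapes supply their own name(); this code never changes.
std::string name_via_virtual(const Shape &s)
{
    return s.name();
}
```

Both functions give the same answers today. The difference shows up when a Triangle is added: the virtual version keeps working untouched, while the RTTI version silently answers "unknown" until someone edits it.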
Lastly, I've been speaking of things such as proper class design, or limited closed systems, but in fact, in many cases the proper design is the functionality already provided. That is, extending subsystems for corner or specialized cases is not clean either. So be very careful here: design is not always black and white. Therefore, don't force-fit hierarchies "just because" or to always avoid dynamic_cast and then end up with almost dead code for the "but-we-must-handle-this-case-code" (sic) . What will you do when the next twist arises?
There are other parts to C++'s RTTI:
NOTES:
    short *sp = reinterpret_cast<short *>(cip);                     // error, const would be implicitly tossed out first
    short *sp = reinterpret_cast<short *>(const_cast<int *>(cip));  // ok, explicitly deconst
    $ cat shapes5.c
    #include "shapes.h"
    #include <iostream>

    int main()
    {
        Shape *sp;

        sp = new Circle;  // implicit upcast from Circle * to Shape *
        std::cout << "sp=" << sp << std::endl;

        Circle *cp = dynamic_cast<Circle *>(sp);  // Attempt downcast
        std::cout << "cp=" << cp << std::endl;

        sp = dynamic_cast<Shape *>(cp);  // And go back???
        std::cout << "sp=" << sp << std::endl;

        return 0;
    }
    $ como shapes5.c
    Comeau C/C++ 4.3.4.1 (Oct 30 2005 22:29:44) for MAC_OS_X
    Copyright 1988-2005 Comeau Computing.  All rights reserved.
    MODE:strict errors C++

    $ a.out
    sp=4198976
    cp=4198976
    sp=4198976
Note that you can sometimes go back the other direction. In this example, dynamic_casting from cp to sp is really not what you wanted to do. Be sure that is the conversion request that you really do want to make and that it is in the right direction (note it models assignment with the target type on the left of the host type)!
    // a.h:
    // ...
    #include "xyz.h"
    // ...

    // b.h:
    // ...
    #include "xyz.h"
    // ...

    // xyz.h:
    class xyz { };
Now, if you were to use some of these headers:
    // main.c
    #include "a.h"
    #include "b.h"
    // ...
then xyz.h will inadvertently be brought in twice, once by a.h and once by b.h. This is a problem because then xyz will end up being defined twice in the same translation unit, and it is an error to define the body of a class twice like that. This situation is by no means unique to just classes though, and is as true for duplicate definitions of structs, enums, inline functions, etc.
To get around this, in both C++ and in C, a common code technique is to sandwich the header file's source code with an #ifndef preprocessor directive. Consider a revised xyz.h:
    // xyz.h:
    #ifndef XYZ_H
    #define XYZ_H
    class xyz { };
    #endif
With this revision, if xyz.h is #included, and the XYZ_H macro is not yet defined, then it will #define it, and then it will inject the rest of the header file's code into the translation unit; in this case, the code is the definition of class xyz. However, if XYZ_H has already been macro defined, then the preprocessor will skip to the #endif, and in effect will not inject, or re-inject, class xyz into the translation unit. This technique establishes the notion of include guards.
Look back at main.c. We see that xyz.h will be included through a.h and since XYZ_H is not yet defined, it will define it and also let the compiler see the class xyz definition. Now, when it moves to b.h, it will open xyz.h but since it was already brought in by a.h, then XYZ_H is already macro defined. Therefore, it will skip right down to the #endif and the preprocessor will close xyz.h.
In effect, the second #include of xyz.h was skipped. Do note though that it did need to open it, process it only to see that it didn't need to do anything with it, and then close it.
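The effect can be simulated in a single file by pasting the guarded contents of the header twice, which is roughly what the compiler sees after both #includes are expanded. This is only a sketch: the value() member and the guarded_value() function are hypothetical additions so there is something to call.

```cpp
// --- contents of xyz.h, as seen through the first #include ---
#ifndef XYZ_H
#define XYZ_H
class xyz { public: int value() const { return 42; } };
#endif

// --- the same contents, as seen through the second #include ---
#ifndef XYZ_H      // XYZ_H is already defined, so everything up to #endif is skipped
#define XYZ_H
class xyz { public: int value() const { return 42; } };
#endif

// Compiles only because the second copy was skipped; without the guard,
// class xyz would be defined twice in one translation unit.
int guarded_value()
{
    return xyz().value();
}
```

Deleting the #ifndef/#define/#endif lines from both copies turns this into a redefinition error, which is the failure described for main.c.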
As shown above, the traditional naming of the macro is just to follow suit with _H as a suffix to the file name. Do try to make the name unique, or maybe even long'ish in some cases, in order to avoid name clashes with it though. As well, avoid a name that begins with an underscore, as most of those names are reserved for compiler vendors.
Given all this, it's easy to consider that you can just start including headers all over the place without any concerns. Clearly though, this can impact compile-time, and so you want to remain judicious in the headers you include. As well, some compilers support pre-compiled headers, but this should not be an excuse to get carried away.
It should be clear by now that you would add include guards for a.h and b.h too. Lastly, there are times even when you want to do this:
    // abc.h
    #ifndef ABC_H
    #define ABC_H

    #ifndef XYZ_H
    #include "xyz.h"
    #endif

    #endif
Here, abc.h actually tests the macro from xyz.h before even trying to include it. This keeps xyz.h from even being opened, but only if it's already been processed.
There's an implication here then that you can "fool a header", by #defineing the respective macro for it, so that perhaps it may not end up getting processed even once. Similarly, you can #undef the respective macro once the header has been processed. As with many of the above remarks, make sure that's really what you want to do.
Use of inline involves a space/time tradeoff.
A main premise of inline is that it should be used when the overhead of calling a function is higher than the overhead of the function itself. For instance here:
    foo(a, b, c);
The overhead of passing the 3 arguments, setting up returning to the call location, etc. might actually be higher than the cost of what the function does with the arguments. This normally means an inline function would be a small function, for some definition of small.
Note that inline functions are typesafe over macros and maintain function semantics (unlike macros, their arguments are only evaluated once, etc.).
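The "evaluated once" point can be seen in a small sketch. All of the names here (SQUARE_MACRO, square, next_value and the two calls_with_ functions) are made up for the illustration. A counter records how many times the argument expression actually runs.

```cpp
#define SQUARE_MACRO(x) ((x) * (x))          // textual substitution: x appears twice

inline int square(int x) { return x * x; }   // real function: argument evaluated once

static int calls = 0;                        // counts how often next_value() runs

int next_value() { return ++calls; }

// Returns how many times next_value() ran when passed to the macro.
int calls_with_macro()
{
    calls = 0;
    int r = SQUARE_MACRO(next_value());      // expands to ((next_value()) * (next_value()))
    (void)r;
    return calls;
}

// Returns how many times next_value() ran when passed to the inline function.
int calls_with_inline()
{
    calls = 0;
    int r = square(next_value());            // next_value() runs once, then x * x
    (void)r;
    return calls;
}
```

The macro version runs the argument twice, so any side effect happens twice and the "square" is really a product of two different values. The inline function behaves like any other function call.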
Note that whether the inline'ing is honored is up to the implementation.
The question arises: since inline is good, should I inline everything? No. For instance, it might be suspect to inline a function with a loop in it, even if it is small.
Also, as inline can result in an increase in the code size of your application's "program image" (because it must expand the function and its expression where it is used if the inline is honored), it may lead to programs that run slower when issues such as virtual memory, paging, and thrashing come into play. This is system dependent then. But it can mean your program runs slower in some cases because you "inline'd everything."
Of course, the compiler is allowed to ignore your inline request, whether implicitly inline (say the definition of a function within a class) or explicitly inline by using the inline keyword. As well, the compiler is allowed to inline functions that have not been declared inline.
Also, don't just assume your application needs to be blazingly fast. Many applications are I/O bound. For instance, to exaggerate the point, it doesn't matter how fast your app is if it is sitting there waiting for keyboard input. You can't make it wait faster. :) Instead give consideration to profiling your application to try and understand its performance. Do this over guessing. And if profiling points to a particular function, be sure that the actual problem is the function and not something else. Also, don't forget that another algorithm may be what the real resolution should be.
Also, remember that even for a small function, doing something like inline'ing a virtual function may have no effect, because it probably won't be called directly very often.
Note that normally inline does not disturb normal function semantics. There is s case where it matters though. Often, functions are declared in header files but defined in source files. And then the linker resolves the call across the object files. However, if a function is defined as inline in a source file, it will usually only be known as inline in that source file. Furthermore, it may not get a physical footprint in the object file corresponding to that source file. For instance, given:
    // blah.h
    void foo();

    // blahmain.c
    #include "blah.h"
    int main() { foo(); }

    // blah.c
    #include "blah.h"
    inline void foo() { }
will result in a linker error:
    c:\tmp>como blah.c blahmain.c
    C++'ing blah.c...
    Comeau C/C++ 4.3.8 (Sep 25 2006 11:02:23) for MS_WINDOWS_x86
    Copyright 1988-2006 Comeau Computing. All rights reserved.
    MODE:strict errors C++

    C++'ing blahmain.c...
    Comeau C/C++ 4.3.8 (Sep 25 2006 11:02:23) for MS_WINDOWS_x86
    Copyright 1988-2006 Comeau Computing. All rights reserved.
    MODE:strict errors C++

    blahmain.obj : error: unresolved external symbol foo() referenced in function main
because no out-of-line definition of foo needs to be emitted into the object file for blah.c. This means inline functions are usually defined in header files.
Earlier I mentioned that inlined functions should be small, for some definition of small. That was a cop out answer. The problem is, there is no concrete answer, since it depends upon a number of things that may be beyond your control. Does that mean you should not care? In many cases yes. Also, as compilers get smarter, many situations involving inline'ing will be able to be resolved automatically, as they have in many cases involving the register keyword. That said, the technology is not there yet, and it's doubtful it will ever be perfect. Some compilers even support special force-it inlining keywords and pragmas for this and other reasons.
So, the question still begs itself: How to decide whether to make something inline or not? I will answer with some considerations that need to be decided upon and/or calculated, which may be platform dependent, etc.:
Note that a function may be inline substituted in one place and not in other places. Also, you may let it be inline'd but also take its address. This may also mean there is an inline substituted version and a static local version.
Note that inline functions must still obey the "one definition rule". So, although it may work in a given implementation, you should not be providing different function bodies that do different things in different files for the same inline function for the same program.
Be aware of functions that get called implicitly. In particular be aware of constructors and destructors, as there are many contexts in which they may be invoked, whether as arguments to functions, as return values, while new'ing, during initializations, during conversions, for creating temporaries, etc. Also, of particular concern is that if ctor/dtors are inline up and down a class hierarchy, there can be a cascade of inline'ing that occurs in order to accommodate every base class subobject.
Lastly, I think there are some counter issues to be discussed so that you don't have solutions looking for problems:
    struct xyz {
        struct abc Abc; // AA
    };
    struct abc {
        struct xyz Xyz; // BB
    };
Unfortunately, for this to work, struct abc needs to be moved before xyz, or else, how could line AA work? But wait! That would mean xyz needs to be moved before abc, making this circular. One way around this is:
    struct abc; // CC
    struct xyz {
        struct abc* Abc; // DD
    };
    struct abc {
        struct xyz* Xyz; // EE
    };
Here, we've changed Abc and Xyz into pointers. As well, we've forward declared abc in line CC. Therefore, even though abc has still not been defined, only declared, that is enough to satisfy the pointers, because there is not yet any code which is going to be dereferencing the pointers, and by the time there is, both structs will have been defined.
Of course this new design implies having to dynamically allocate the memory these pointers will point to... unless of course, the self references are through C++ references, which are possible. Here's a toy example:
    struct abc; // CC
    struct xyz {
        struct abc& Abc; // DD
        xyz();
    };
    struct abc {
        struct xyz& Xyz; // EE
        abc();
    };
    xyz::xyz() : Abc(*new abc) { }
    abc::abc() : Xyz(*new xyz) { }
Of course, in this last example, the way the constructors are set up would establish an infinite loop, so you need to be careful when using self-referential classes. In other words, you would probably normally not do it like the example above shows, and depending upon your specifics may have to change your design.
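One way to sidestep the runaway construction, sketched here with hypothetical names (Xyz, Abc, partner and linkPair are not from the original example), is to let each object start out with a null pointer and tie two already-constructed objects together afterwards:

```cpp
struct Abc;                     // forward declaration: Abc is an incomplete type here

struct Xyz {
    Abc* partner;               // a pointer to an incomplete type is fine
    Xyz() : partner(0) {}       // starts out pointing at nothing
};

struct Abc {
    Xyz* partner;
    Abc() : partner(0) {}
};

// Hypothetical helper: make two existing objects refer to each other.
// Neither constructor creates the other type, so nothing recurses.
inline void linkPair(Xyz& x, Abc& a) {
    x.partner = &a;
    a.partner = &x;
}
```

This is only one possible design; who owns the objects, and how long each must live, still has to be decided for your own program.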
    #include <iostream>
    int main()
    {
        const int maxsides = 99;
        for (int numsides = 0; numsides < maxsides; numsides++) { // A
            if (...SomeCondition...)
                break;
            // blah blah
        }
        if (numsides != maxsides) // B
            std::cout << "Broke out of loop\n";
        return 0;
    }
However, toward the end of the standardization process, related to some other rules (about conditional declarations in general), the committee decided that the scope of the identifier should only be within the for; therefore, line B is an error according to Standard C++. Note that some compilers have a switch to allow old style code to continue to work. For instance, the transition model switch is --old_for_init under Comeau C++. In order for line B to work under Standard C++, line A needs to be split into two lines:
int numsides;
for (numsides = 0; numsides < maxsides; numsides++) { // revised line A
Here numsides is in scope till the end of its block, having nothing to do with the for as far as the compiler is concerned, which is "as usual". And of course, you can always introduce another brace enclosed block if you want to constrain the scope for some reason:
    // ...code previous...
    {
        int numsides;
        for (numsides = 0; numsides < maxsides; numsides++) {
            // ...
        } // end of for loop
    } // scope of numsides ends here
    // .. rest of code...
Finally, consider this code:
    #include <iostream>
    int numsides = 999; // C
    int main()
    {
        const int maxsides = 99;
        for (int numsides = 0; numsides < maxsides; numsides++) {
            if (...SomeCondition...)
                break;
            // blah blah
        }
        if (numsides == maxsides) // B: refers to ::numsides
            std::cout << "Broke out of loop\n";
        return 0;
    }
Here, line B is ok because although the local numsides from the for loop is out of scope, it turns out that the global numsides from line C is still available for use. As a QoI (Quality of Implementation) issue, a compiler might emit a warning in such a case. For instance, Comeau C++ generates the following diagnostic when --for_init_diff_warning is set:
warning: under the old for-init scoping rules variable "numsides" (the one declared at line 2) -- would have been variable "numsides" (declared at line 6)
Too, try to name your identifiers better, and perhaps they will have less chance of clashing like this anyway. Also, in some cases, you may have code something like this:
    for (int numsides = 0; numsides < maxsides; numsides++) // D
        ;
    // numsides should be out of scope here
    // create another, different, numsides
    for (int numsides = 0; numsides < maxsides; numsides++) // E
        ;
The numsides from line D can not be in scope in order for line E to work. Which is fine with Standard C++, as it won't be. However, you may be using it with a compiler that doesn't support the proper scoping rules, or for some reason, has disabled them. As a short term solution, you may want to consider the following hackery (think about how it'll limit the scope):
    #define for if(0);else for
Depending upon your compiler, these forms may or may not be better:
    #define for if(0) { } else for
    #define for if(false) { } else for
Of course, technically, as per Standard C++, redefining a keyword is illegal, but few compilers (if any) will actually diagnose this as a problem.
As well, for a totally different take on the subject, as shown in slightly different form above, you can add an extra set of braces around the code:
    {
        for (int numsides = 0; numsides < maxsides; numsides++) // F
            ;
    }
    {
        for (int numsides = 0; numsides < maxsides; numsides++) // G
            ;
    }
You can also of course add additional identifiers:
    for (int numsides = 0; numsides < maxsides; numsides++) // H
        ;
    for (int numsides2 = 0; numsides2 < maxsides; numsides2++) // I
        ;
None of these are optimal for normal code and perhaps not even good short term solutions, but if you have an old compiler, you may have no choice.
    // ...
    int counter;
    // ...
    // counter set by some other code somehow
    // ...
    for (/* nothing here */; counter < SomeMaxValue; counter++) {
        // ...
    }
Also, perhaps the last expression is complex, and therefore you may want to put its computation into the body of the loop, etc. So, perhaps you might have:
    // ...
    int counter;
    // ...
    for (counter = 0; counter < SomeMaxValue; /* nothing here */) {
        // ...
        // elaborate formula to calculate the next value of counter
        // ...
    }
The middle condition can also be missing, in which case it is taken as true. So unless specified elsewhere, this loops forever:
    for (int counter = 0; /* nothing here */; counter++) {
        // ...
        // Perhaps this loop has a test here, and executes a "break;"
        // ...
    }
In the case where all are missing, as in for(;;), it also establishes an infinite loop. Here too, some other code would be necessary to break out of it, unless of course the app is intended to run continuously:
    int main()
    {
        for ( ; ; ) {
            // ...
            // Perhaps this loop has a test, and eventually hits a "break;"
            // or Perhaps it really does loop forever
            // ...
        }
    }
As an aside, it's worth considering that this:
    for (xxx; yyy; zzz) {
        aaa;
    }
can be trivially thought of as this:
    {
        xxx;
        while(yyy) {
            aaa;
            zzz;
        }
    }
We say trivially because we are not trying to say every for loop can be so easily transformed into such a while loop so directly. For instance, if the code in aaa contains a continue statement, then zzz will be executed in the for loop but not in the while loop.
Also, it's worth pointing out that the while loop was placed into a block. See #forinit in this FAQ for why.
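The continue difference is easy to trip over, so here is a small sketch of it. The function names sumOddsFor and sumOddsWhile are invented for this illustration; both add up the odd numbers below a limit.

```cpp
// for version: `continue` still runs ++i before the condition is re-tested.
inline int sumOddsFor(int limit) {
    int sum = 0;
    for (int i = 0; i < limit; ++i) {
        if (i % 2 == 0)
            continue;           // jumps to ++i, then re-tests i < limit
        sum += i;
    }
    return sum;
}

// while version: the increment has to be repeated before every `continue`,
// otherwise i never changes and the loop never ends.
inline int sumOddsWhile(int limit) {
    int sum = 0;
    {                           // extra block keeps i scoped like the for version
        int i = 0;
        while (i < limit) {
            if (i % 2 == 0) {
                ++i;            // done by hand; a bare `continue` would skip it
                continue;
            }
            sum += i;
            ++i;
        }
    }
    return sum;
}
```

Both functions give the same answer for the same limit; the point is only how much more care the while form needs once continue is involved.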
Note the following infinite loops:
    for(;;).....
    for(; 1; )....
    for(; 1 == 1; )....
    for(; true; )... // in C++
    while(1)....
    while(1 == 1)....
    while(true)... // in C++
    do .... while(1);
    do .... while(true); // in C++
    label: ... goto label;
The first is probably the most common form seen, though now that C++ and C99 have booleans, that may be changing.
That being so, many compilers, or operating systems, do support an extension in order to be able to clear the screen, or erase the current line. For instance, with Borland, you might do this:
    #include <conio.h>
    // ...
    clrscr();
For some versions of Microsoft, you might do this:
    #include <conio.h>
    // ...
    _clearscreen(_GCLEARSCREEN);
One of these will also often work on some systems:
    #include <stdlib.h>
    // ...
    system("cls");        // For Dos/Windows et al "console app"
    system("clear");      // For some UNIX's, LINUX, etc.
    system("tput clear"); // For some UNIX's
    // ...
effectively running the command in quotes as a process. Tied in with this, you could also create a shell script named cls which internally will do the clear or tput clear, and so on. Note that <stdlib.h> might be <cstdlib> in the case of C++, and so therefore references to system() above would be std::system().
You can also just output a known number of newlines in some cases, if you know the number:
    // ...
    const int SCREENSIZE = ???;
    for (int i = 0; i < SCREENSIZE; i++)
        // output a newline, perhaps with std::cout << '\n';
    // flush the output, perhaps with std::cout.flush();
You can also just hard-code an escape sequence, though this too may not be portable:
    std::cout << "\033[2J" << std::flush;
This of course means too that something like this could be done in VC++ as well:
    #include <conio.h>
    // ...
    _cputs("\033[2J");
There is also the "curses" (or ncurses in some cases) system, which, although not Standard, was popular with UNIX before X-windows became popular:
    #include <curses.h>
    int main()
    {
        initscr(); // set up a curses window
        clear();   // clear the curses window
        refresh(); // commit the physical window to be the logical one
        // stuff using the window
        endwin();
        return 0;
    }
and/or various different graphics and windowing systems, etc.
If it's not obvious to you yet, whether you're clearing the screen, erasing the current line, changing the color of something, or whatever, you'll need to concede that no one way is portable, and that to accomplish what you need you'll have to find a platform specific solution in most cases. Check with your compiler or operating system vendor's documentation for more details. "Google" for it (websites and newsgroups) to see how somebody else may have solved this platform specific problem.
And never discard ingenuity when it's necessary. For instance, on some displays you may get away with emitting a form feed, or a special escape sequence. Or, if you have a primitive device you may have to do something like figure out how many newlines/spaces/tabs/whatever, to emit in order to scroll the screen away, though the cursor may not be where you prefer it. And so on.
Always construct your code sensibly. If you really do need say the ability to clear the screen, don't lace system dependent calls through your code, or even #ifdefs. Instead, write a function to encapsulate the call, so that if you do need to localize it in some way, at least it's just in one place.
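As a rough sketch of that encapsulation, the following puts the one system dependent detail behind a single function. It assumes a terminal that understands ANSI/VT100 escape sequences, which is not guaranteed; the names clearScreen and clearScreenSequence are made up for this example.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Assumption: the output device honors ANSI/VT100 escape sequences.
// "\033[2J" erases the display and "\033[H" moves the cursor to the top left.
// If that is not true on your platform, only this one function has to change.
inline void clearScreen(std::ostream& out) {
    out << "\033[2J\033[H";
    out.flush();
}

// Convenience for testing or logging: capture what clearScreen() would emit.
inline std::string clearScreenSequence() {
    std::ostringstream captured;
    clearScreen(captured);
    return captured.str();
}
```

The rest of the program would then just call clearScreen(std::cout) and never mention escape codes, conio.h, curses, or system() directly.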
There are many different phrases which involve the use of "null" in Standard C and Standard C++, and they are often misunderstood, confused, and misspoken. Here I will cover some of them:
    if (blah)
        ; /* null statement "just before" the ; */
    else
        a = b;
That's more as a convenience, and perhaps not the best strategy. However, there are some other cases where it ends up being a requirement. For instance, a common idiom might be in a loop:
    while (getAnotherLine()) /* eat up current input */
        ;
But there is a more serious problem:
    void foo()
    {
        // ...
        goto leave_foo;
        // ...
    leave_foo:
    }
The problem here is that you can't label a closing brace because a label must apply to a statement, so the null statement comes in handy here if there turns out to be nothing else to do:
    leave_foo: ;
Of course, you could use a return;. Of course, the conditions should be reviewed to see if a goto is warranted. You may want to use null statements in the cases of some switch statements too.
I suspect this rule originally came about because of accidental typings of a colon instead of a semicolon, and so the requirement allows some of those typos to get detected.
    #
That is, the only thing on the line is the hash mark. At least once in your programming career you might hear the # called an octothorp. If not, this is the one time. :) Anyway, once upon a time, in a land far, far away, this was used for spacing the input file, and so continues to be supported by Standard C (and Standard C++). Actually, I think also that the original C compiler used it as a clue that preprocessing directives were going to be used in the file (somebody email me whether I'm nuts or not). These days it seems a useless directive, unless you are trying to guess the final answer in a game of "C++ Jeopardy".
    char array[] = { 'C', 'o', 'm', 'e', 'a', 'u', ' ', 'C', '+', '+', '\0' };
    char c;
    // ...
    c = 'x';  /* bits representing x */
    c = '\0'; /* all bits cleared */
As it has the value zero, often plain zero is used instead:
    signed char c1 = 0;
    unsigned char c2 = 0;
as there is an implicit conversion from int to char.
But remember that in C, a single character in a character constant is int, but it is a char in C++.
    G:\tmp>type socc.c
    #include <stdio.h>
    int main()
    {
        printf("%d\n", sizeof(char));
        printf("%d\n", sizeof(int));
        printf("%d\n", sizeof('x'));
        printf("%d\n", sizeof('xx'));
        return 0;
    }

    G:\tmp>como --c99 socc.c
    Comeau C/C++ 4.3.4.1 (Mar 30 2005 22:54:12) for MS_WINDOWS_x86
    Copyright 1988-2005 Comeau Computing. All rights reserved.
    MODE:strict errors C99

    G:\tmp>aout
    1
    4
    4
    4

    G:\tmp>como --c++ socc.c
    Comeau C/C++ 4.3.4.1 (Mar 30 2005 22:54:12) for MS_WINDOWS_x86
    Copyright 1988-2005 Comeau Computing. All rights reserved.
    MODE:strict errors C++

    G:\tmp>aout
    1
    4
    1
    4
This is so even when the character is spelled with a numeric escape, which is what '\0' is (an octal escape sequence). So be careful when using 0 as a char (instead of an int) in C++; if you pass it to a function, you may end up picking the wrong overload:
    void foo(int);
    void foo(char);
    // ...
    foo(0);    // calls foo(int)
    foo('\0'); // calls foo(char)
A null character can come in handy when defining a "C string": characters terminated by a null character. If doing so, this can allow us to get the sizeof a string literal (which may contain embedded null characters), or the strlen() of char arrays (which counts until the first null character, assuming valid input).
Remember that not all character arrays are C strings:
    char a[2] = { 'a', 'b' }; /* No null byte */
You would not pass this to a function such as strlen(), and that limitation might be fine assuming that is the intent of this kind of array.
Remember that unspecified elements are 0'd, and hence become null characters too in a context such as:
    char x[99] = { '\0' };
Here, x[0] is explicitly initialized to the null character, and the other 98 elements are implicitly initialized to the null character.
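The sizeof versus strlen() distinction and the implicit zeroing can be seen together in a short sketch. The names embedded, embeddedSize, embeddedLength and allNull are only for illustration.

```cpp
#include <cstddef>
#include <cstring>

// A string literal with an embedded null character: 'a' 'b' '\0' 'c' 'd' '\0'.
// sizeof sees the whole array, including the terminating null character.
const char embedded[] = "ab\0cd";
const std::size_t embeddedSize = sizeof embedded;    // counts all 6 chars

// strlen stops at the first null character it meets, so it only sees "ab".
inline std::size_t embeddedLength() { return std::strlen(embedded); }

// Only x[0] is given explicitly; the remaining 98 elements are zeroed for us.
inline bool allNull() {
    char x[99] = { '\0' };
    for (std::size_t i = 0; i < sizeof x; ++i)
        if (x[i] != '\0')
            return false;
    return true;
}
```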
A null pointer constant is often used in initializers, assignments, and comparisons.
Note that a null pointer constant is "syntactic sugar". It is only a way of representing a concept in code. Repeat after me 1024 times, or at least write a for loop to generate such output: It does not mean that a subsequent null pointer has all zero bits -- so using a union pun, or calloc(), or memset(), etc. is not a portable strategy for obtaining null pointers (or for setting floating points to zero either, while it's being mentioned). However, once assigned (or initialized), a null pointer (this can be an expression too, not necessarily just an identifier) can be compared to a null pointer constant (see above) with no problem, because again, the comparison happens at the syntax level in your code, and the code generated "does the right thing" whether it is all bits zero or not.
Note that although a null pointer is a valid pointer, it is not valid to dereference one:
    int *p = 0;
    *p = 99; // Kablooie
Note that there is no requirement for programmers to make null pointers out of all pointers that do not point to valid memory. Often doing so makes sense, but often too, the flow and structure of the program can help determine if doing so makes sense or not. For instance, it may be that the pointer goes out of scope, and so = 0'ing it may not always make sense, nor the minor overhead of doing so, nor the possible false sense of security for thinking you've done so. My saying this does not endorse leaving invalid pointers that are going to be used invalid. Doing that is a huge source of bugs.
Note that testing an invalid pointer to see if it is a null pointer against a known null pointer or a null pointer constant is undefined behavior, so it is not usually considered wise to try it.
NULL is an implementation-defined null pointer constant. In C it is often:
    #define NULL ((void*)0)
However, because of overloading in C++, it is often defined in C++ as:
    #define NULL 0 // or 0L
Note I say often, because the C or C++ implementation is allowed to choose which way to define it, though NULL is not (void*)0 in C++.
In either language, be careful passing it to a variadic function (one taking a dot-dot-dot argument, aka an ellipsis), since if your code expects one type and it gets passed as another, perhaps with a different size, alignment and representation, then, as usual, all hell breaks loose, and speaking of nulls, you'll get a big fat null check from your boss. Oh, BTW, avoid variadic functions when possible.
Oh, oh, and in C++, a different overload can end up getting chosen depending upon which definition of NULL is used.
Oh, oh, and in C, don't forget to use function prototypes lest you run into similar problems when passing NULL or when returning NULL even when not a variadic function.
Also, newbies seem to like to do this:
    char c;
    // ...
    c = NULL; // Don't do this
It may compile, or it may not. Either way, it's a misuse of NULL since it should be involved with pointers. It should follow not to use it to do math either.
    #include <stdio.h>
    // ...
    getchar(); // Wait for any character to be hit
may not work because often input is first processed by your operating system a line at a time by default. This means, in those cases, that although the getchar() might be executing it won't be satisfied until the operating system passes its buffer to your program, upon which stdio will pass it to getchar(). Second, even if your OS is placed into a "raw" mode (assuming the OS even allows that), processing a char at a time, the stdio input stream being used might be in a buffered mode, therefore it would need to be made unbuffered, via say setbuf() (setting setbuf alone w/o raw mode may be insufficient).
Some compilers do support an extension in order to be able to get one character from a stdio input stream. For instance, on some Windows' compilers, you might do this:
    #include <conio.h>
    // ...
    getch();  // NOT standard, NOT in stdio
    getche(); // same as getch() but echo's the char to display device
Everything above is true about iostreams in C++ as well, although it would use different input routine names, etc.
Final note: before getting too carried away with this, be sure you actually need such a capability. That is, often it's as handy to get a whole line, waiting for return to be entered. So, just make sure you know what you want and/or what is acceptable.
    #include <stdlib.h>
    // ...
    const char array[] = "99";
    int x = atoi(array); // convert C-style string to int
Standard C also supports some other conversion functions. We'll show them using C++ style headers for a C++ program though (a C program would continue to use stdlib.h):
    #include <cstdlib> // Use C++ style header
    // ...
    const char xyz[] = "99";
    const char abc[] = "99.99";
    int i = atoi(xyz); // convert C-style string to int
    // Note: atof returns a double, not a float
    double d = atof(abc); // convert C-style string to double
    long l = atol(xyz); // convert C-style string to long
    // long long and atoll is new in C99, the 1999 revision to Standard C
    long long ll = atoll(xyz); // convert C-style string to long long
Note that these functions do no error reporting: if the string does not start with a number they simply return zero, and if the value is too big for the resulting type the result is undefined behavior:
    int x = atoi("999999999999999999"); // Undefined (if int cannot hold that value on your machine)
    int y = atoi("Use Comeau C++");     // ok: results in 0
    int z = atoi(" 99");                // ok: results in 99
    int e = atoi("-99");                // ok: results in -99
For exposition purposes, take note of the following "equivalences" (except for error handling):
atof | strtod(nptr, (char **)NULL) |
atoi | (int)strtol(nptr, (char **)NULL, 10) |
atol | strtol(nptr, (char **)NULL, 10) |
atoll | strtoll(nptr, (char **)NULL, 10) |
Note: the above strto.. equivalences are shown as semantic equivalences, and may not necessarily be the way you would code them. For instance, if you properly have a #include for stdlib.h (or cstdlib in C++ possibly with std:: qualification on the calls, etc.) then the cast of NULL may not be necessary in C or C++. The cast would be necessary in say C90 if you did not have the declarations of the routines available, for instance, by not having the #include available, since then you can run the risk of the compiler synthesizing that the argument is an int, even though it should really be a pointer, which is not good on some platforms, and sloppy at best.
Although much legacy code uses the ato... versions, and an implementation might have faster ato... versions, it's also true that the interfaces to the strto... versions are more flexible and reliable, with more control possible. As such, we encourage you to investigate these routines. There are also wide string versions of all the above too. Since the strto.. versions pass a null pointer as the "end pointer", here's a quick example that actually uses it with a "real" end pointer argument:
    #include <stdlib.h>
    #include <stdio.h>
    #include <string.h>
    int main()
    {
        char someBuffer[] = "1234";
        char someOtherBuffer[] = "1234s";
        char *endptr;
        long l;

        l = strtol(someBuffer, &endptr, 10);
        if (endptr == &someBuffer[strlen(someBuffer)])
            printf("SomeBuffer %s was a number\n", someBuffer);
        else
            printf("SomeBuffer %s was NOT a number\n", someBuffer);

        l = strtol(someOtherBuffer, &endptr, 10);
        if (endptr == &someOtherBuffer[strlen(someOtherBuffer)])
            printf("SomeOtherBuffer %s was a number\n", someOtherBuffer);
        else
            printf("SomeOtherBuffer %s was NOT a number\n", someOtherBuffer);
        return 0;
    }
Do read up on these routines since they have other capabilities (allowing plus, minuses, spaces, etc.).
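If you find yourself repeating the end pointer test, it can be folded into a helper along with an overflow check. This is just a sketch; parseLong is a hypothetical name, not a standard routine.

```cpp
#include <cerrno>
#include <cstdlib>

// Hypothetical helper: converts text to long and reports success via the return value.
// It fails if there are no digits, if characters are left over, or on overflow.
// On failure, result is left untouched.
inline bool parseLong(const char* text, long& result) {
    char* end = 0;
    errno = 0;                              // strtol sets errno only when something goes wrong
    const long value = std::strtol(text, &end, 10);
    if (end == text)     return false;      // no conversion was performed
    if (*end != '\0')    return false;      // junk after the number, e.g. "1234s"
    if (errno == ERANGE) return false;      // value does not fit in a long
    result = value;
    return true;
}
```

Whether trailing characters should count as an error depends upon your input; here they do, to mirror the example above.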
Do not forget the "String I/O" routines either. For instance:
    #include <stdio.h> // cstdio in C++
    // ...
    int return_status;
    int x;
    double d;
    return_status = sscanf(xyz, "%d", &x);
    // ...
    return_status = sscanf("99.99", "%lf", &d); // %lf, since d is a double
Perhaps abstracting this into a function of your own naming. You'll need to weigh the impact of bringing in all of sscanf, making sure you use the &, etc. There is also a similar situation with "String Streams I/O" in C++:
    #include <sstream>
    // ...
    char carray[] = "99";
    std::istringstream buffer(carray);
    int x;
    buffer >> x;
    // ...
    #include <string>
    std::string str("-99.99");
    std::istringstream FloatString(str);
    float f;
    FloatString >> f;
You can also convert a C++ string in other ways too, for instance:
    #include <cstdlib>
    #include <string>
    inline int stoi(const std::string &s) { return std::atoi(s.c_str()); }
    //...
    int i = stoi(str);
And so on. Of course in all situations above, you should check error situations as appropriate. This implies the last example should not be using atoi.
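As one possible shape for such checking, here is a sketch built on istringstream. The name toInt is hypothetical, and treating leftover text as an error is a choice made for this example, not a rule.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: converts a std::string to int and says whether it worked.
// Apart from surrounding whitespace, the whole string must be the number.
// On failure, result is left untouched.
inline bool toInt(const std::string& text, int& result) {
    std::istringstream in(text);
    int value = 0;
    if (!(in >> value))         // extraction failed: no digits, or out of range
        return false;
    char leftover;
    if (in >> leftover)         // some non-whitespace text follows the number
        return false;
    result = value;
    return true;
}
```

Usage would be along the lines of: int i; if (toInt(str, i)) { /* use i */ } else { /* report the bad input */ }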
Lastly, don't forget to have a look at the Boost library. It is a peer reviewed library written by many members of the C++ committee, though outside of the auspices of the committee. Parts of Boost will most likely eventually be incorporated (probably with some modifications) into the next revision of Standard C++. Therefore, also have a look at say the lexical_cast<>() functionality. For instance:
    #include <iostream>
    #include <string>
    #include <boost/lexical_cast.hpp>
    //...
    void example()
    {
        std::string s(" 123"); // establish some string
        try {
            int i = boost::lexical_cast<int>(s);
            // ...
            std::getline(std::cin, s); // stream error detection code left out
            i = boost::lexical_cast<int>(s);
            // ...
            ++i;
            s = boost::lexical_cast<std::string>(i); // works other direction too
        }
        catch(boost::bad_lexical_cast const& blc) {
            // ...
        }
    }
    #include <stdio.h> // cstdio in C++
    // ...
    char buffer[N]; // Use a buffer of the appropriate size for N!!
                    // Again: Use a buffer of the appropriate size for N!!
    int x = 99;
    sprintf(buffer, "%d", x);
If you were to wrap this into a routine, you'd need to either pass in the buffer, dynamically allocate it (realizing that it would need to be deallocated by the calling code somewhere), or use statically allocated space internal to the function (which would need to be copied). It might also be handy to have another argument which specifies the base that the string form of the number should be written in.
In the "new" version of C, C99, there is another function which can help:
    // ...
    snprintf(buffer, N, "%d", x);
    // ...
Notice this routine name has the letter n in it. It's different from sprintf in that any characters beyond N - 1 are thrown away. In other words, the buffer won't overflow. Note: Your C compiler may not yet support this routine. If it does, use it, as it can be helpful in your strategy to avoid buffer overruns, which adds up to bugs, often in unobvious places in your code at inopportune times. Note that many non-C99 compilers already support this routine, though it may have a different name such as _snprintf. As always, remember that you write programs, so don't expect magic out of something such as snprintf. That is, by this I mean, make sure you are passing the right buffer size, consider checking the return value of snprintf, and also consider what it means to throw away the other characters and whether this should be used in unison with some other strategy/technique.
In C++ you might also have:
    #include <sstream>
    // ...
    std::ostringstream buffer;
    buffer << x;
(And as a side remark, you can clear the ostringstream to reuse it by doing this:
    // buffer contains value of x from above
    buffer << y;    // buffer contains xy
    buffer.str(""); // 'clear' the buffer
    buffer << y;    // buffer contains y
Note the empty string literal).
So you might write yourself a handy routine:
    #include <sstream>
    #include <string>
    std::string itos(int arg)
    {
        std::ostringstream buffer;
        buffer << arg;       // send the int to the ostringstream
        return buffer.str(); // capture the string
    }
To use it you might have:
    // ...
    int x = 99;
    std::string s = itos(x);
    // ...
For that matter, you can do this for any type:
    #include <sstream>
    #include <string>
    template <typename T>
    std::string Ttos(T arg)
    {
        std::ostringstream buffer;
        buffer << arg;       // send the type to the ostringstream
        return buffer.str(); // capture the string
    }
    // ...
    std::string s = Ttos(99.99);
    std::string t = Ttos(99L);
Note that these examples assume that the << is successful, that is to say, that the conversion from the type to the string worked. For your code, you may want to check that it actually did before .str()ing it and deal with the situation as appropriate.
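A checked variant could be sketched as follows. Throwing std::runtime_error is only one way to "deal with the situation"; the name checkedToString is hypothetical.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical variant of Ttos that refuses to hand back a half-written string.
template <typename T>
std::string checkedToString(const T& arg) {
    std::ostringstream buffer;
    if (!(buffer << arg))       // the insertion left the stream in a failed state
        throw std::runtime_error("conversion to string failed");
    return buffer.str();
}
```

For the built-in types the insertion rarely fails, so the check mostly matters for user-defined types whose operator<< can set the stream's fail state.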
An issue such as "How can I append a float to a string?" has obvious solutions, whether you want to get a new string or not:
    std::string u;
    u = "Some float is " + Ttos(222.333);
    std::string v = "Some other float is ";
    v += Ttos(-333.222);
#include <string> int main() { std::string ComeauCpp("Check out Comeau C++"); char *p = ComeauCpp; // nope }To wit, there is no conversion from string to char *. Some folks would try to "correct" this by doing this:
char *p = &ComeauCpp; // If at first you don't succeed, try, try, againBut this is wrong too, because there is no conversion from string * to char *. As if two wrongs were not enough, some would still try to correct this as:
char *p = (char *)&ComeauCpp; // If two wrong's don't make a right, try threeunder that premise that there was now a char * on either side. The problem here is that this assumes that the string class is implemented with a char array buffer, which may not be true. And even if it were, it assumes that the buffer is at the start of the string class, not to mention a violation of encapsulation. The right way to do this is to use strings conversion feature, its member function c_str(). For instance:
const char *p = ComeauCpp.c_str(); // Aha!Note that you cannot alter the C-string "returned":
*p = 'X'; // Nopeand non-const string operations may invalidate it:
ComeauCpp.length(); // ok ComeauCpp += "now!"; // may invalidate, so ASSUME IT DOESTherefore, you should usually make use of the pointer immediately, and without string related side-effects. If the C-style string is something that you want to "hang around" and be used by other parts of your program, you can call c_str() again when you need it (which may mean its value may be changed). Or you may want to take a snapshot of its current value and therefore you'll need to make a copy of it in that case. For instance, you might do this:
char *p = new char[ComeauCpp.length() + 1]; strcpy(p, ComeauCpp.c_str());It might be handy to "wrap" this up into a helper function:
char *GetCString(std::string &s;) { /* ... */ }realizing it will be your responsibility to delete [] the memory that you new []d when you are done with it. Do not ever delete[] the pointer returned from c_str() though, as that memory is "owned" by the string class.
You can also use the copy() operation from the string class if you don't want to use routines such as strcpy(). For instance:
#include <string>

char *ToCString(std::string &s)
{
    char *cString = new char[s.length() + 1];
    s.copy(cString, std::string::npos); // copy s into cString
    cString[s.length()] = 0;            // copy doesn't do null byte
    return cString;
}

int main()
{
    std::string hello("hello");
    // ...
    char *p = ToCString(hello); // AA: copy the std::string
    CallSomeCRoutineForInstance(p); // ok: p points to a copy
    // ... Modify hello below, no problem, that is
    // " world\n" DOES NOT affect p, since it's been copied already
    hello += " world\n";
    // passing p is ok
    // (it's pointing to the same C-string from line AA)
    CallSomeCRoutineForInstance(p);
    // ...
    delete [] p; // Our ToCString new[]d it, so...
}
And so on.
// The "C way":
#include <cstring> // <string.h> in C
#include <cstdlib> // <stdlib.h> in C

void TheCishWay()
{
    char *s; // usually needs a pointer
    s = std::malloc(???); // enough space allocated?
                          // did you add +1 bytes for the null byte?
    std::strcpy(s, "hello");
    std::strcat(s, " world"); // did this overflow the space allocated?
    if (std::strcmp(s, "hello worlds") < 0)....
    std::free(s); // don't forget to free the memory somewhere
}

// The "C++ way":
#include <string>

void TheCppishWay()
{
    std::string s;
    s = "hello";
    s += " world";
    if (s < "hello worlds")....
    std::string s2 = s + "\n";
}
On this same note, but more generally, don't hesitate to make use of C++'s std::vector. As well, there are no null pointer std::strings, so "std::string ess;" establishes an empty string not a null pointer.
Of course, sometimes you have no choice as to whether you want a C++ or a C string. For instance, perhaps you are dealing with fragile legacy code. Or, perhaps, you need to use command line arguments, you know, argv from int main(int argc, char **argv). The above is also not to say that one can't have involved code in C++ using std::string, you can.
One question that arises about the differences is: Which should one use if a const string is desired? Well, consider what it means. It implies no modification, and so it will either be copied, inspected, or output. Well, it seems there are three choices:
const char *one = "Comeau C++";
const char two[] = "Comeau C++";
const std::string three = "Comeau C++";
One might say the purer of one and two is two since it truly declares a named array. This might be handy if you just want to output it as a message or something. And if you want to traverse it with pointer arithmetic or something, you may want to add:
const char *ps;
// ...
ps = two;
// ..
ps++;
// ...
The issue with one is that the string literal is an unnamed array, and one is made to point to it. This means both the array and the pointer need to exist. This is different from two and ps because two is still the array, whereas once you do say one++ you've lost the pointer to the string literal. In other words, to use one in the same way, you'd normally still need ps:
ps = one; // ... ps++; // ...So in common use, when const, in general, prefer the real array when possible. That leaves the general choice between two and three. Well, what can be said of const std::strings?
const std::string comeau = "Comeau C++";
const std::string morecomeau(comeau);
So which you want in the const case should be based on criteria such as this.
#include <stdio.h>

int main()
{
    char *p = "comeau"; /* Line A */
    p[0] = 'C'; /* Capitalize comeau?? */
    printf("p='%s'\n", p); /* Line B */
    return 0;
}
Here the characters enclosed within the double quotes are a string literal. String literals are normally terminated with a null character, which establishes a NTBS (null terminated byte string). The street slang for this term is string (and this usage is reinforced by its being defined as such in the library section of Standard C. This is also true for Standard C++, but do note that this can get confusing because C++ has the NTBS strings and it also provides std::strings. See #stringvsstring elsewhere in this document.)
The problem with the above code is that it is undefined behavior. This means it may appear to work. Or, it may not. For instance, here's possible output from the above program:
p=Comeau // 1
p=comeau // 2
dialog box under Windows about a GP fault or access violation // 3
UNIX core dump // 4
whatever it wants to happen // 5
Let's look at each:
/* ... */
static char NameProvidedByCompiler[7] = "comeau";
/* ... */
char *p = NameProvidedByCompiler;
However, the compiler is allowed to assume that a string literal is readonly (actually, in C++ it is a static const char[?]) and so therefore generate code accordingly with that assumption. This means it can put it in ROM. That's why case 2 might occur. Even if it does not, the system might put it into an unwritable section. That's why 3 or 4 might occur.
In short, this and some other related issues means that you should not write into a string literal. Note that this:
char Comeau[7] = "comeau";is not writing into a string literal because it is a named array, whereupon the string literal is used as a syntactic sugar form of initializer instead of having done:
char Comeau[7] = { 'c', 'o', 'm', 'e', 'a', 'u', '\0' };and so this does not suffer anything from the discussion above.
Another form of string literal prefixed with an L is also available. For instance, L"...." which establishes an array (a const one in C++) of wchar_ts. This is called a wide string literal. The above discussion applies to them as well.
Just to be clear:
Note that this has nothing to do with other uses of the literal before the undefined use. So for example, in the code above, making a copy of Line B where Line A is, would allow for the string to be output with no problem. You could also add something like:
char c = p[0]; /* retrieve from the literal, c is now 'c' */
char *p1 = "Comeau";
char *p2 = "Comeau"; /* Same "Comeau" or separate ones? */
#include <stdio.h>

int main()
{
    char comeau[] = "comeau"; /* Name it, now it's writeable, and unique.
                                 You may or may not want it to also be static
                                 or const which may or may not require
                                 other code changes. */
    char *p = comeau;
    p[0] = 'C'; /* Ok */
    printf("p='%s'\n", p);
    return 0;
}
When possible, you can also declare the original pointer as a const char * but that won't help in all cases.
printf("abcd\n");it does not mean that printf must always be called with a string literal argument! Go look for its prototype and you may find something such as:
int printf(const char *, ...);Note that it takes a pointer to a const char. IOWs, there is no requirement that printf should only take things in the form of "blah". For instance, you could do this:
char format_buffer[] = { 'a', 'b', 'c', 'd', '\0' }; printf(format_buffer);This should be true of other functions or your own or other libraries declared as such. Note that not only can I use a named buffer but I did not do this:
char format_buffer[] = { '"', 'a', 'b', 'c', 'd', '"', '\0' };
because the double quotes in a string literal are not part of the unnamed array that is established, so when you use alternatives to string literals, then they don't need the double quotes. This does not preclude that the semantics of your application may require you to put double quotes around something, but that's just character data in memory and not a string literal.
Note that this is so with std::strings too. They may be initialized with string literals, but internally just the characters are being stored. And remember there are operations to manipulate the stored characters, including metabolizing them into a C-like array:
#include <string> // ... void foo(const char *); // ... std::string format_buff("abcd\n"); // only a b c d NL gets stored // .. foo(format_buff.c_str());though if possible don't bring forth a C like array from a std::string unless absolutely necessary. Tend to only do so to interface with a C routine, and when possible, see if a clean wrapper can be abstracted instead, and/or the whole design changed in some way; that way your code is not littered with .c_str()s just to comply with what is probably an arcane oversight.
char *p = "abcde"; // LINE A
char q[] = "ABCDE";
Well, they are not the same. p is a pointer, a variable in its own right with its own memory, that is being initialized to point to the address of "abcde", an unnamed string literal. If you could break it down into steps, you might have:
char *p; p = "abcde";whereupon p will be pointing at the location of where the a is.
With q, it is actually the name for an array, specifically a 6 character array. It is as if it were declared as:
char q[] = { 'A', 'B', 'C', 'D', 'E', '\0' };Here q does not point at the array of chars, it actually is the beginning location of the array of chars, 6 of them.
With those things in mind, to be clear, note that p is of type char * and q is of type char [6]. In fact, the pointer can point at the array:
p = q; // ok, p points at the location holding the A, aka &q[0]
we can even write that:
p = &q[0];
The opposite is not true:
q = p; // error, can't assign to an array like this strcpy(q, p); // copy char at a time (needs to be large enough to hold)although given the original definition from LINE A the opposite strcpy would not hold:
strcpy(p, q); /* undefined behavior */as you should assume that a string literal of the form used to initialize p is not a writable entity nor may it be unique. See elsewhere in techtalk for more details on this point.
Note that the sizeof both of these are different while strlen() is the same:
G:\tmp>type ptrarr.c #include <stdio.h> #include <string.h> char *p = "abcde"; char q[] = "ABCDE"; int main() { printf("so p=%d\n", sizeof(p)); printf("so q=%d\n", sizeof(q)); printf("sl p=%d\n", strlen(p)); printf("sl q=%d\n", strlen(q)); return 0; } G:\tmp>como --c99 ptrarr.c Comeau C/C++ 4.3.4.1 (Mar 30 2005 22:54:12) for MS_WINDOWS_x86 Copyright 1988-2005 Comeau Computing. All rights reserved. MODE:strict errors C99 G:\tmp>aout so p=4 so q=6 sl p=5 sl q=5Note that the size of a pointer to a char may not be 4 on all machines, but that the size of q will always be 6 characters.
So, coming full circle, although outputting the contents of q and also what p points to may look the same, there is much difference going on under the hood. A program which just outputs the two may not seem like a big deal since the program is so toyish, but the difference in and to a more real program might be a big deal. For instance, with q there are only the 6 characters. With p there are 6 characters for the unnamed string literal and the space for the pointer p itself needs to be allocated. This space trade-off may be significant if you have lots of strings you're manipulating. On the other hand, if you need to look at things one character at a time, you'll be for want of some pointers. IOWs, each has their place. Of course, the convenience of a C++ std::string might be what you're seeking too.
struct xyz { int i; xyz() : i(99) { } // Style A }; xyz x;will initialize x.i to 99. The issue on the table here is what's the difference between that and doing this:
struct abc { int i; abc() { i = 99; } // Style B };Well, if the member is a const, then style B cannot possibly work:
struct HasAConstMember { const int ci; HasAConstMember() { ci = 99; } // not possible };since you cannot assign to a const. Similarly, if a member is a reference, it needs to be bound to something:
struct HasARefMember { int &ri; HasARefMember() { ri = SomeInt; } // nope };This does not bind SomeInt to ri (nor does it (re)bind ri to SomeInt) but instead assigns SomeInt to whatever ri is a reference to. But wait, ri is not a reference to anything here yet, and that's exactly the problem with it (and hence why it should get rejected by your compiler). Probably the coder wanted to do this:
struct HasARefMember { int &ri; HasARefMember() : ri(SomeInt) { } };Another place where a member initializer is significant is with class based members:
struct SomeClass { SomeClass(); SomeClass(int); // int ctor SomeClass& operator=(int); }; struct HasAClassMember { SomeClass sc; HasAClassMember() : sc(99) { } // calls sc's int ctor };It is preferred over this:
HasAClassMember::HasAClassMember() { sc = 99; } // AAAbecause the code for the assignment operator may be different than the code for the constructor. Furthermore, sc still needs to be constructed and so the compiler will insert a default constructor call anyway:
// Compiler re-writes AAA to this: HasAClassMember::HasAClassMember() : sc() { sc = 99; } // BBBwhich, at the least, might result in being inefficient since it will run SomeClass::SomeClass() and then SomeClass::operator=(). And of course, assignment is intended for copying an already existing object, whereas construction is intended to initialize an object to some sane and initial state. Meaning there may even be cases where coding something such as BBB is semantically wrong, which means there may be cases where coding something like AAA is wrong too.
As well, there is the case where a class has no default constructor declared, in particular:
struct ClassWithNoDefCtor { //ClassWithNoDefCtor(); // Note this is not declared ClassWithNoDefCtor(int); // int ctor // ... }; struct Blah { ClassWithNoDefCtor sc; Blah() /* Nothing here */ { } // calls sc's int ctor };Which will result in an error:
c:\tmp>como cwndc.cpp Comeau C/C++ 4.3.8 (Sep 25 2006 11:02:23) for MS_WINDOWS_x86 Copyright 1988-2006 Comeau Computing. All rights reserved. MODE:strict errors C++ "cwndc.cpp", line 9: error: no default constructor exists for class "ClassWithNoDefCtor" Blah() /* Nothing here */ { } // calls sc's int ctor ^
Taking care to note that the error is not necessarily a demand that you add a default ctor, just that it saw the ctor taking an int, and so expected that one was going to be called. And remember that a ctor with all default arguments is able to be used as a default ctor. IOWs, this is NOT an error:
struct ClassWithCallableDefCtor { ClassWithCallableDefCtor(int = 99); // int ctor, WITH DEFAULT ARG // ... }; struct Blah2 { ClassWithCallableDefCtor sc; Blah2() /* Nothing here */ { } // calls sc's int ctor };since now a programmer call to the ctor without an argument will work. This of course means that the compiler can too, and in this case will.
Builtin types such as int do not have these particular class based concerns, however, as a consistent style issue, it's worth treating them as such. Furthermore, through the life of the code, you may need to change the type of a member (say from char * to std::string), and you don't want a silent problem cropping up because of such a change.
Do note that the order of the member initializers in your code is not the order in which the members will be initialized. Instead, they are initialized in the order that they were declared in the class. For instance, given:
struct xyz { int i; int j; // j after i here xyz() : j(99), i(-99) { } // CCC }; xyz X;Note at runtime that X.j is not initialized before i. This is because j is defined after i. IOWs, think of it "as if" the compiler rewrites CCC as:
xyz() : i(-99), j(99) { } // DDDThis point can also go beyond just style though, because the value of one member may depend upon the value of another member to have already been initialized. So beware. (BTW, the reasoning behind this ordering is as usual: consistency in that destructors run in reverse order of constructors, and so it becomes another general rule for all such member initializer lists, not just those with constructors and/or destructors.)
Note that static members don't get member initialized; you can only member initialize nonstatic data members or classes specified as base classes.
Lastly, note that because member initializers accept expression lists and not initializer lists, you cannot member initialize a declared array. You also cannot member initialize a specific array element.
#define BOUNDS(array) (sizeof(array) / sizeof((array)[0]))
// ...
char buf[99];
// ...size_t is from stddef.h in C, cstddef in C++
size_t bd = BOUNDS(buf);
In C++ you might do this:
#include <cstddef> #include <iostream> template <typename T, std::size_t N> inline std::size_t bounds(T (&)[N]) { return N; } int main() { char a[99]; int b[9999]; char hello[] = "Hello"; char *ptr = hello; std::cout << bounds(a) << std::endl; // outputs 99 std::cout << bounds(b) << std::endl; // outputs 9999 std::cout << bounds(hello) << std::endl; // outputs 6, includes null byte std::cout << bounds(ptr) << std::endl; // error }Note that ptr IS NOT an array, and so neither bounds nor BOUNDS will work correctly with it (one could argue by design).
To pick up additional bounds, you might do:
template <typename T, std::size_t N, std::size_t N2> inline std::size_t bounds2(T (&)[N][N2]) { return N2; }
void foo(int); void foo(double);However, this is not allowed:
int bar(); double bar(); // error: int bar() already existsBut why isn't it allowed? One may argue that the compiler should be able to figure this out the same way (or kind of way) it figures out overloading based upon the argument types:
int i = bar(); // error, though intended to call 'int bar()'
double d = bar(); // error, though intended to call 'double bar()'
One obvious problem though is, which bar does this call:
// ... int main() { bar(); // call 'int bar()' or 'double bar()'??? }Or, which do these call:
int i = 99 + bar(); double d = 99.99 + bar(); double d2 = 99 + bar();And so on. So not only would allowing this be confusing in many cases, but as it turns out, rules necessary to enforce/guide it would not be as similar as one would originally think they would be to argument overloading.
In fairness, this issue is pretty slippery, and in fact one really has to work to understand the example below (so get out a pen and paper!). But this elusiveness is exactly why it can be harmful! Code can speak 1024 words, so here goes:
int main()
{
    const char cc = 'x'; // cc is const, so you should NOT write to it
    // cc = 'X'; // ErrorAAA: You should normally NOT write to a const
    char *pc; // Some pointer to char
    // pc = &cc; // ErrorBBB: attempt to assign const char * to char *
    char **ppc = &pc; // Some pointer to a pointer to char, &pc is one of those

    // This is the line in question that LOOKS legal and intuitive:
    const char **ppcc = ppc; // ErrorCCC: const char ** = char ** not allowed
                             // But WE'RE ASSUMING IT'S ALLOWED FOR NOW
                             // Could also have attempted: const char **ppcc = &pc;
    *ppcc = &cc; // So, const char * = const char *, in particular: pc = &cc;
                 // Note: But this was no good in line BBB above!
    *pc = 'X'; // char = char, IOWs: cc = 'X'; ==> Yikes!
    return 0;
}
Do not whip out a cast to silence the compiler! The issue at hand is that cc is const. But as you can see, if the conversion on line CCC were allowed, it would be possible to (inadvertently or purposely) circumvent normal type checking. Moreover, it would do so silently. Because of this, a char ** cannot implicitly be assigned to a const char **, nor can it initialize one.
Do note that the pointers involved here are dealing with two levels of indirection, not one. At first glance such a conversion seems like it should be allowed, because char * to const char * is allowed. But that's one level of indirection and now you know that any such type hijacking attempt like the example above should be considered suspect. Now you know why the const matters here. Now you know why a cast may not be a safe suggestion. Conclusion: Intuition is not always right.
Often, instead of the cast, you want this:
const char * const *ppcc = ppc; // DDD Notice the additional const
Note: Some earlier C++ compilers allow the conversion on line CCC without the cast. The C++ committee fixed the wording on this before Standard C++ was accepted and all current/modern compilers should reject the conversion on line CCC, if implicitly attempted, at least in their strict modes. Standard C requires rejecting this too. As a quality of implementation, you'd want to see a compiler at least give a warning about this.
Note: It seems that Standard C requires even line DDD to be an error because of how it deals with and specifies the interactions of compatible types. This appears to be an overspecification or an oversight.
The above deals with a "double pointer" example, however, it will of course extend into any additional levels of pointers too. As well, in C++, the same problem exists when converting a char * to a const char *&, etc.
int main()
{
    const char cc = 'x'; // cc is const, so you should NOT write to it
    char *pc = 0; // Some pointer to char

    // This is the line in question that LOOKS legal and intuitive:
    const char *&rpcc = pc; // ErrorEEE: const char *& = char * not allowed
                            // But WE'RE ASSUMING IT'S ALLOWED FOR NOW
                            // Could also have attempted: const char *&rpcc = &cc;
    rpcc = &cc; // So, const char * = const char *, in particular: pc = &cc;
    *pc = 'X'; // char = char, IOWs: cc = 'X'; ==> Yikes!
    return 0;
}
In C (or C++) you can tell what it is for your system by looking at limits.h (known as climits in C++) where the macro CHAR_BIT is defined. It represents the "number of bits for the smallest object that is not a bit-field", in other words, a byte. Note that it must be at least 8 (which means that strictly speaking, a CPU that supports a 6 bit byte has a problem with C or C++). Also note that sizeof(char) is defined as 1 by C++ and C (ditto for the sizeof unsigned char, signed char, and their const and volatile permutations).
It might be helpful to show a quote from Standard C:
int main() { bool b = true; // ... if (b == false)... }Such a boolean might be used as a flag. As well, many conditions in C++ now have boolean "targets". That is, consider this:
int i = 99;
// ...
if (i == 99)...
Here, i is compared to 99 and if they are equal, the result of that expression is true, otherwise it is false. This means something like this is ok too:
b = i == 99;
How big is a bool? Its size is implementation-defined, so use sizeof to find out for your platform. It is allowed to take up as little space as a char.
There are various other details, especially about conversions to bool that you should be aware of. Therefore, you should check a recent C++ book for further details on bools. While you're at it, you'll probably want to check out std::vector<bool> and the std::bitset template from the C++ Standard Library, especially if an array of single bit booleans are necessary (the FAQ right after this one, #binaryliteral has an example using bitset). That said, a word of caution is in order. As it turns out there are some requirements placed upon "containers" in the C++ Standard Library, and as std::vector<bool> is a partial specialization of std::vector it turns out that it does not meet those requirements. In other words, std::vector<bool> is not a true container type. The C++ committee is currently weighing how to resolve this. Please take a look at http://www.comeaucomputing.com/iso/lwg-active.html#96 for an executive summary of their varied thoughts. For a discussion of the issues, look at the article on Herb Sutter's site: http://www.gotw.ca/gotw/050.htm
C99 now also supports a boolean, however note that it will take some time before many C compilers catch up with the new 1999 revision to C. Note that Comeau C++ 4.2.45.2 and above supports most C99 language features when in C99 mode. Until most other C compilers support it, one must make do with various strategies. An oft used one might be to do something like this:
#define FALSE 0 #define TRUE 1 typedef int BOOL;Perhaps putting it into one of your header files. Here's another way:
typedef enum BOOL { FALSE, TRUE } BOOL;
By the way, situations like this are where BOOL comes from, either from folks doing it on their own, or it's provided in some API they are using.
As to the specifics of C99's boolean, it has added a new keyword, _Bool (yes, note the underscore, and the upper case letter) as an unsigned integer type for booleans in C. As well, C99 supports a new header, <stdbool.h>, which does a few things. It defines the macro bool to expand to _Bool, true to expand to 1, false to expand to 0, and __bool_true_false_are_defined which expands to 1. The thinking here is that so many programs already use the names bool and Bool that a new independent name from the so-called reserved vendor namespace was used. For programs where this is not the case (that is, where they do not do something such as define their own bool -- presumably then this would be for new code), one would #include <stdbool.h> to get the "common names" forms just mentioned. This seems confusing, but so be it.
Anyway, you use __bool_true_false_are_defined so that you can control integrating "old bools" with the new one. Until C compilers catch up with C99, perhaps it is best off using something like this:
#ifndef __bool_true_false_are_defined // <stdbool.h> not #include'd typedef enum _Bool { false, true } _Bool; typedef enum _Bool bool; #endifThis would offer a reasonable transition model to the C99 bools.
Despite this, there are ways to obtain the same effect. For instance, consider the following C++ program, based upon the bitset template found in the C++ standard library:
#include <bitset> #include <iostream> #include <limits> #include <climits> #include <string> int main() { std::bitset<9> allOnes = ~0; // sets up 9 1's // bitset has an operator<<, so just output the "binary number": std::cout << allOnes << std::endl; int someInt = 1; // AA // Get the number of bits in an unsigned int const int bitsNeeded = std::numeric_limits<unsigned int>::digits; // Establish a bitset that's a copy of someInt std::bitset<bitsNeeded> intAsBits = someInt; // As with allOnes, just output the binary rep: std::cout << intAsBits << std::endl; // BB // This is provided as an alternate to AA because some compilers // do not yet support <limits>, which contains numeric_limits<>. // (See http://www.comeaucomputing.com/techtalk/#bitsinbyte ) // If this is your situation, you'll need to use <climits> // (or perhaps even <limits.h> for your compiler) in order to obtain // CHAR_BIT. The net effect is that bitsNeeded2 should yield the // same value as bitsNeeded, and of course intAsBits2 should output // the same characters as intAsBits did, though +1, since it does someInt++ someInt++; const int bitsNeeded2 = sizeof(int) * CHAR_BIT; std::bitset<bitsNeeded2> intAsBits2 = someInt; std::cout << intAsBits2 << std::endl; // CC // This is just to show that if there is no reason to keep using // the bitset object later on in the code, then there may be no // reason to need intAsBits or intAsBits2. So then this version // directly calls the bitset<> constructor with the int, hence // creating a temporary bitset which is then output in binary: someInt++; std::cout << std::bitset<bitsNeeded>(someInt) << std::endl; // DD // Just an example where a hex constant was assigned to the int // and another where it is called directly with a hex someInt = 0xf0f0; std::cout << std::bitset<bitsNeeded>(someInt) << std::endl; std::cout << std::bitset<bitsNeeded>(0xf0f0) << std::endl; // EE // The binary forms as above may sometimes be used for more manipulations. 
// Therefore, the "output" may not be to a stream but to a string. // In such cases, perhaps converting the bitset to a std::string is // what's wanted. // // Converting a bitset to a string is quite involved syntax-wise. // Take note in the following code that bitset has a to_string() // member function, however, it MUST BE CALLED: // * With the template keyword, if in a template // * By explicitly specifying the template arguments to to_string() // as to_string() takes no function arguments in order to deduce anything. // // This is best put into an function instead of mainline code std::string s; // intAsBits declared above // Next line no good unless in a template! //s = intAsBits.template to_string<char, std::char_traits<char>, std::allocator<char> >(); s = intAsBits.to_string<char, std::char_traits<char>, std::allocator<char> >(); std::cout << s << std::endl; // FF // You can also fake-out actually writing a binary literal // by using a string. Then, you can convert it to a long. std::string binaryFakeOut("0101010101010101"); std::bitset<16> bfoAsBits(binaryFakeOut); unsigned long bfoAsLong = bfoAsBits.to_ulong(); // As above, this can be done in one step: unsigned long x = std::bitset<16>(binaryFakeOut).to_ulong(); // Or: unsigned long y = std::bitset<16>(std::string("0101010101010101")).to_ulong(); return 0; }Note that this works for negative values as well as positive ones. There are many other things that bitset can do, however, the intent of this section is just to show how you can convert some types into a binary representation. Check out a good recent C++ book if you want to know more about bitset.
In some of the above, for instance EE, we suggested abstracting such a line away in a function, perhaps in a template function which is keyed off the type in order to compute the number of bits automatically, etc. Of course, once that has been done, dependencies on <limits>, etc., are no longer laced through the mainline code, just in the function you've written to handle this.
That said, this same organizational point would need to be considered even if writing a C version:
#include <stdio.h> void OutputIntAsBinary(unsigned int theInt) { /* Find the "highest" bit of an int on your machine */ unsigned int currentBit = ~0 - (~0U >> 1); while (currentBit) { /* Go through each bit */ /* Is the respective bit in theInt set? */ putchar(currentBit & theInt ? '1' : '0'); currentBit >>= 1; /* Get next highest bit */ } putchar('\n'); }Of course, you'd probably not want to hard code putchar() in that case either, but pass a pointer to a function.
G:\tmp>type enumvals.c #include <stdio.h> typedef enum colors { red, orange, yellow, green, blue, indigo, violet } colors; int main() { colors c = violet; /* Need enum colors in C */ printf("%d\n", yellow); printf("%d\n", c); return 0; } G:\tmp>como enumvals.c Comeau C/C++ 4.3.4.1 (Mar 30 2005 22:54:12) for MS_WINDOWS_x86 Copyright 1988-2005 Comeau Computing. All rights reserved. MODE:strict errors C++ G:\tmp>aout 2 6A question comes up surprisingly often though: How do I get the string values that these numbers represent? IOWs, as 2 is yellow, how can I get to see yellow output and not the unhelpful (in some case) 2? Well, there is no direct language support in C and C++ for obtaining the name of an enumerator value. Instead, you need to do something such as:
#include <stdio.h>

typedef enum colors { red, orange, yellow, green, blue, indigo, violet, maxcolors } colors;

void emitcolor(colors c)
{
    const char *p;
    switch (c) {
    case red: p = "red"; break;
    case orange: p = "orange"; break;
    case yellow: p = "yellow"; break;
    case green: p = "green"; break;
    case blue: p = "blue"; break;
    case indigo: p = "indigo"; break;
    case violet: p = "violet"; break;
    default: p = "unknown-enum-colors"; break; /* e.g. maxcolors, so p is never left unset */
    }
    printf("%s\n", p);
}

int main()
{
    colors c = yellow;
    emitcolor(c);
    c = red;
    emitcolor(c);
    return 0;
}

G:\tmp>como enumstring1.c
Comeau C/C++ 4.3.4.1 (Mar 30 2005 22:54:12) for MS_WINDOWS_x86
Copyright 1988-2005 Comeau Computing. All rights reserved.
MODE:strict errors C++

G:\tmp>aout
yellow
red
emitcolor() is rather tedious, so perhaps that can be at least partially alleviated with this alternative:
G:\tmp>type enumstring.c #include <stdio.h> enum colors { red, orange, yellow, green, blue, indigo, violet, maxcolors }; char *colorsstrings[] = { "red", "orange", "yellow", "green", "blue", "indigo", "violet" }; int main() { printf("%s\n", colorsstrings[yellow]); printf("%s\n", colorsstrings[red]); return 0; } G:\tmp>como enumstring.c Comeau C/C++ 4.3.4.1 (Mar 30 2005 22:54:12) for MS_WINDOWS_x86 Copyright 1988-2005 Comeau Computing. All rights reserved. MODE:strict errors C++ G:\tmp>aout yellow redBut this in and of itself is not terribly handy, as you might as well have just emitted "red" instead of colorsstrings[red]. However, this kind of set up does provide a mechanism say to plug in say the colors in another language, or dialect of the same language.
However, that said, this can be done, which is more useful:
G:\tmp>type enumstring2.c
#include <stdio.h>

typedef enum colors { red, orange, yellow, green, blue, indigo, violet, maxcolors } colors;

const char *colorsstrings[] = { "red", "orange", "yellow", "green", "blue", "indigo", "violet" };

void emitcolor(colors c) /* without the typedef, C needs enum colors here */
{
    printf("%s\n", colorsstrings[c]);
}

int main()
{
    colors c = yellow; /* without the typedef, C needs enum colors here */
    emitcolor(c);
    c = red;
    emitcolor(c);
    return 0;
}

G:\tmp>como enumstring2.c
Comeau C/C++ 4.3.4.1 (Mar 30 2005 22:54:12) for MS_WINDOWS_x86
Copyright 1988-2005 Comeau Computing. All rights reserved.
MODE:strict errors C++

G:\tmp>aout
yellow
red

Note that here we don't necessarily know the color; that is, we can use a variable of type colors and it still works.
Note that colorsstrings was changed to point to const's, although an array of std::strings could have been used as well (meaning this example cannot be used in C, only C++):
G:\tmp>type enumstring3.c
#include <stdio.h>
#include <string>

enum colors { red, orange, yellow, green, blue, indigo, violet, maxcolors };

const std::string colorsstrings[] = { "red", "orange", "yellow", "green", "blue", "indigo", "violet" };

void emitcolor(colors c)
{
    printf("%s\n", colorsstrings[c].c_str());
}

int main()
{
    colors c = yellow;
    emitcolor(c);
    c = red;
    emitcolor(c);
    return 0;
}

Moving further into C++, there is also something like this:
#include <iostream>

enum colors { red, orange, yellow, green, blue, indigo, violet, maxcolors };

std::ostream& operator<<(std::ostream& os, colors c)
{
    const char *p;
    switch (c) {
    case red: p = "red"; break;
    case orange: p = "orange"; break;
    case yellow: p = "yellow"; break;
    case green: p = "green"; break;
    case blue: p = "blue"; break;
    case indigo: p = "indigo"; break;
    case violet: p = "violet"; break;
    default: p = "unknown-enum-colors"; break;
    }
    return os << p;
}

int main()
{
    colors c = yellow;
    std::cout << c << std::endl;
    c = red;
    std::cout << c << std::endl;
    return 0;
}

There is still one problem. The problem is that the array-based versions depend upon the enumerators starting at 0 and incrementing by 1, as normally happens by default. However, if colors was say
enum colors { red = 2, orange = 2, yellow = 2, green = 2, blue = 2, indigo = 2, violet = 2 };

then things would not work. It is not important for the enumerators to be in order, but each one needs to have a unique value; duplicates are dead in the water, since enumerators with equal values cannot be told apart, which is an artifact of enums. Anyway, if the values are unique, enum colors and colorsstrings would have to follow the same order. Also, the enumerators would have to be non-negative. So perhaps you might have:
enum colors { red = 2, orange = 4, yellow = 6, green = 8, blue = 10, indigo = 12, violet = 14 };

BUT, that would mean colorsstrings would also have to accommodate:
const std::string colorsstrings[] = { "", "", "red", "", "orange", "", "yellow", "", "green", "", "blue", "", "indigo", "", "violet" };

Notice the dummy entries. Now imagine a color enumerator with a value of 30000. This strategy clearly does not scale up. A std::map could scale up in that area though:
G:\tmp>type enumstring7.c
#include <iostream> // needed for std::cout and std::endl
#include <string>
#include <map>

enum colors { red = 200, orange = 400, yellow = 600, green = 800, blue = 1000, indigo = 1200, violet = 1400 };

std::map<colors, std::string> colorsstrings;

void initColorMap()
{
    colorsstrings[red] = "red";
    colorsstrings[orange] = "orange";
    colorsstrings[yellow] = "yellow";
    colorsstrings[green] = "green";
    colorsstrings[blue] = "blue";
    colorsstrings[indigo] = "indigo";
    colorsstrings[violet] = "violet";
}

void emitcolor(colors c)
{
    std::cout << colorsstrings[c] << std::endl;
}

int main()
{
    colors c = yellow;
    initColorMap();
    emitcolor(c);
    c = red;
    emitcolor(c);
    return 0;
}

G:\tmp>como enumstring7.c
Comeau C/C++ 4.3.4.1 (Mar 30 2005 22:54:12) for MS_WINDOWS_x86
Copyright 1988-2005 Comeau Computing. All rights reserved.
MODE:strict errors C++

G:\tmp>aout
yellow
red

Here, this will handle the larger values. However, it will not handle different colors of the same value: an additional enum key with another color string value will just overwrite any previous key entry. But this (duplicate enumerator values) is a flaw of all choices thus far, as is converting an int which does not have the value of any enumerator and trying to output that as a color string:
colors c = colors(99);
emitcolor(c);

That may mean invoking std::map's .find() capability in emitcolor(). Under these constraints, the point is to provide a "link" to the enums in each alternative. The map flavor requires the machinery of initColorMap() and of course the assignments within it, even if abstracted away.
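As a rough sketch of the .find() idea, assuming the same enum colors as in enumstring7.c: the helper names colorTable() and colorName() are made up for this illustration, and the function-local static table is just one way to avoid a separate initColorMap() call. Unlike operator[], find() does not insert an empty entry when the key is missing.

```cpp
#include <map>
#include <string>

enum colors { red = 200, orange = 400, yellow = 600, green = 800,
              blue = 1000, indigo = 1200, violet = 1400 };

// Hypothetical helper: builds the table on first use and returns it.
const std::map<colors, std::string>& colorTable()
{
    static std::map<colors, std::string> table;
    if (table.empty()) {
        table[red] = "red";
        table[orange] = "orange";
        table[yellow] = "yellow";
        table[green] = "green";
        table[blue] = "blue";
        table[indigo] = "indigo";
        table[violet] = "violet";
    }
    return table;
}

// Hypothetical helper: returns the enumerator's name, or a placeholder
// when the value does not match any enumerator.
std::string colorName(colors c)
{
    const std::map<colors, std::string>& table = colorTable();
    std::map<colors, std::string>::const_iterator it = table.find(c);
    return it != table.end() ? it->second : std::string("unknown-enum-colors");
}
```

With this, emitcolor() could output colorName(c), and colors(99) would produce the placeholder text rather than an empty string.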
The above considered, it would seem handy to be able to generate colorsstrings automatically. But how? One way is similar to the way generic.h worked before C++ supported templates. Basically we want to be able to do two things: stringize the enumerator, and declare/define colorsstrings. We saw in the previous examples a 1 to 1 mapping. So is there a way to use the preprocessor to automate this? Turns out there is. But it is intrusive on the code. In order to do this, the enum should be put into a header file, and then after some code changes involving function-like macros, the header will need to be processed twice. Once to get the actual enum color itself, and another time to get colorsstring. And you'd control this via a #ifdef. Enough English, let's talk code. You might want this:
// enumcolors.h
#include <string>

#ifdef DECLARE_ENUM_AS_STRING
#define declare_enumerator(arg) #arg
#define declare_enum(arg) extern const std::string arg##strings[] =
#else
#define declare_enumerator(arg) arg
#define declare_enum(arg) extern const std::string arg##strings[]; enum arg
#endif

declare_enum(colors) {
    declare_enumerator(red),
    declare_enumerator(orange),
    declare_enumerator(yellow),
    declare_enumerator(green),
    declare_enumerator(blue),
    declare_enumerator(indigo),
    declare_enumerator(violet)
};

This successfully parameterizes the enum. The above version assumes C++, but a version can be written for C as well. Also, a different version that works with the std::map version should be able to be written as well.
When DECLARE_ENUM_AS_STRING is defined, then declare_enumerator's use of # stringizes its argument, hence generating "red", etc. respectively. Also, declare_enum's use of ## concatenates the enum id name itself with the token strings, hence generating a definition of colorsstrings in the above example, which we only want to occur once. Here's part of the preprocessed output:
extern const std::string colorsstrings[] = { "red", "orange", "yellow", "green", "blue", "indigo", "violet" };
When DECLARE_ENUM_AS_STRING is not defined, which is how we normally want it, then the alternative declare_enumerator just utters its argument unchanged. Also, declare_enum declares the macro argument as an enum and also makes a declaration (not a definition) for colorsstrings. Here's part of that preprocessed output:
extern const std::string colorsstrings[];
enum colors { red, orange, yellow, green, blue, indigo, violet };

So instead of defining colors and colorsstrings in the examples above, use #include "enumcolors.h". You will also need to add another source file to establish the definition of colorsstrings:
G:\tmp\enum>type enumcolors.c
#define DECLARE_ENUM_AS_STRING
#include "enumcolors.h"

G:\tmp\enum>como enumstring8.c enumcolors.c
C++'ing enumstring8.c...
Comeau C/C++ 4.3.4.1 (Mar 30 2005 22:54:12) for MS_WINDOWS_x86
Copyright 1988-2005 Comeau Computing. All rights reserved.
MODE:strict errors C++

C++'ing enumcolors.c...
Comeau C/C++ 4.3.4.1 (Mar 30 2005 22:54:12) for MS_WINDOWS_x86
Copyright 1988-2005 Comeau Computing. All rights reserved.
MODE:strict errors C++

G:\tmp\enum>aout
yellow
red
To use this technique for other enums, pull out the declare_ machinery and create a declare_enum.h, or something to that effect, that would be used in a header such as enumcolors.h.
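For comparison, here is a minimal single-file variant of the same preprocessor idea (sometimes called an X-macro). It is not the two-pass header scheme shown above, just a related sketch: the enumerator list is written once inside a macro and expanded twice. The macro names COLOR_LIST, AS_ENUMERATOR and AS_STRING are invented for this illustration.

```cpp
#include <string>

// The enumerators are listed exactly once; X is supplied by each expansion.
#define COLOR_LIST(X) \
    X(red) X(orange) X(yellow) X(green) X(blue) X(indigo) X(violet)

#define AS_ENUMERATOR(name) name,   // red, orange, ...
#define AS_STRING(name) #name,      // "red", "orange", ...

// First expansion: the enum itself, with maxcolors as the trailing count.
enum colors { COLOR_LIST(AS_ENUMERATOR) maxcolors };

// Second expansion: the matching strings, in the same order by construction.
const std::string colorsstrings[] = { COLOR_LIST(AS_STRING) };
```

Like the array-based versions, this still assumes enumerators that start at 0 and increase by 1.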
In the "and now for something completely different" category, the best solution in some cases is to derive from a C++ std::locale::facet. I think the best way to explain this is to direct you to Stroustrup's already ample description: check out sections D.3.2 and D.4.7.1 of Appendix D: Locales (note it is copyrighted) of The C++ Programming Language (3rd and Special Edition). The infrastructure here is much more complicated, but it is robotic; it also provides for additional capabilities of error checking, and other schemes in addition to the macro technique shown above.
Each alternative above has focused on the output of the enumerator names. Of course, sending the output to buffers using s[n]printf(), ostringstreams, etc., is fair game.
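A small sketch of the buffer flavor, reusing the colorsstrings array from earlier. The helper names describeColor() and describeColorInto() are hypothetical, and std::snprintf assumes a C++11 (or C99) library.

```cpp
#include <cstdio>
#include <sstream>
#include <string>

enum colors { red, orange, yellow, green, blue, indigo, violet, maxcolors };

const char *colorsstrings[] = { "red", "orange", "yellow", "green", "blue", "indigo", "violet" };

// Hypothetical helper: collect the text in a std::string via std::ostringstream.
std::string describeColor(colors c)
{
    std::ostringstream os;
    os << "color=" << colorsstrings[c];
    return os.str();
}

// Hypothetical helper: write the text into a caller-supplied buffer.
// Returns what snprintf returns: the length the full text needs,
// not counting the terminating null character.
int describeColorInto(char *buf, std::size_t size, colors c)
{
    return std::snprintf(buf, size, "color=%s", colorsstrings[c]);
}
```

Both helpers still index the array directly, so the caveats above about out-of-range values apply here too.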
#include <cstdlib> // use <stdlib.h> in C
#include <string>

int main()
{
    const char dateCommand[] = "date";
    std::string theDate;
    int result;

    result = std::system("date"); // run the date command and return
    result = std::system(dateCommand); // run it again
    theDate = "/usr/bin/";
    theDate += "date";
    result = std::system(theDate.c_str()); // yet again
}

Using system() attempts to run the command processor on your system, for instance a shell. It returns the error code as determined by the command processor. This clearly depends upon whether or not there even is a command processor on your system. To see if there is a command processor being made available, pass the null pointer to system():
int result = system(0);
if (result) {
    // there is a command processor
} else {
    // there is not a command processor
}

Similarly, the result from your execution attempt is returned:
result = system("date");

Its value, and the meaning of such a value, is implementation-defined. For many popular operating systems, the PATH environment variable is expected to contain a list of paths where the command will be executed from. Of course too, you can specify a full path name, as theDate does above. You can also pass arguments to apps:
result = system("como file1.c file2.c");

These are just passed as you would normally type them on the command line to your command processor (on MS-OSs, the command processor might be command.com, whereas on UNIX, it might be /bin/sh). Of course, be careful when specifying characters that may end up being escape sequences (for instance, \c should be \\c).
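Pulling those pieces together, here is a hedged sketch: the helper runCommand() and the path in windowsStyleCommand are invented for illustration only. It checks for a command processor first, and shows the doubled backslashes needed in a string literal.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: returns false when no command processor is available;
// otherwise runs the command and stores system()'s implementation-defined
// result in the second argument.
bool runCommand(const std::string& command, int& result)
{
    if (std::system(0) == 0)
        return false;
    result = std::system(command.c_str());
    return true;
}

// An illustrative (made up) command line: each backslash meant for the
// command processor is written as \\ in the source.
const char windowsStyleCommand[] = "c:\\tools\\como file1.c";
```

What a nonzero result means is up to your implementation, so check its documentation before branching on it.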
If you need some other sort of control over executing processes from your C or C++ programs, many operating systems support additional, though non-standard routines. They often have names such as ?fork(), exec?(), spawn?(), etc., where ? is some characters as documented in the extensions that the OS supports, as an example, execvp(), and so on.
Well, for starters, POD is an acronym for "Plain Ol' Data". That's right, that's an official technical term. :)
More generally, POD refers to POD-structs, POD-unions, and even to POD-scalars. However, saying "POD" is usually meant to refer to POD-structs in most discussions, so that's where I'll focus.
A POD-struct is an aggregate that may not have a user-defined destructor or a user-defined copy assignment operator, and may not contain non-static data members which are references, pointers to members, non-PODs (struct or union), or arrays of non-PODs (struct or union). Note that aggregate is not being used in the typical English meaning here; instead it has a specific C++ meaning. In particular, an aggregate may not contain any user-defined constructors, base classes, virtual functions, or private/protected non-static data (so it may contain private/protected static member data/functions). It's significant to point out that as a POD-struct is an aggregate, it may not contain those things either.
In other words, a POD wouldn't contain the things classes are usually used for. What is it useful for then? In short, what this gives us is a shot at strong compatibility with C's structs. This is why they come up often. That is, compatibility with the C memory model is important to some programs.
This is not intended to be a full tutorial, but the above should address the initial questions asked. As to why most books don't cover any of this, well, most books are not worth buying. That said, what's important is not necessarily to be able to recite and memorize the above, but to be able to use it and know what it means to do so (in other words, some books may discuss it, but not refer to it as PODs).
What's important is to obtain a fighting chance at multi-language programming, specifically to be able to obtain C compatibility. For that you need info on the memory layout, clear copying semantics, and no surprises. Note that although extern "C" does not depend upon PODs, it is often PODs which you will be passing to and returning from extern "C" functions.
Coming full circle, what's going on here is that a POD struct in C++ is an intent to obtain what a normal struct would look like and be in C (which wouldn't be using inheritance or virtual functions or C++ features).
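If your compiler supports C++11 or later (an assumption; the description above follows the original C++ standard), the <type_traits> header lets the compiler check this for you. C++11 restates POD as "trivial and standard-layout", so the sketch below tests those two traits; the types Point and Shape and the helpers looksLikePod() and copyBytes() are made up for illustration.

```cpp
#include <cstring>
#include <type_traits>

// Looks like a C struct: no constructors, bases, virtuals, or private data.
struct Point {
    int x;
    int y;
};

// Has a user-defined constructor and a virtual destructor: not a POD.
struct Shape {
    Shape() : sides(0) {}
    virtual ~Shape() {}
    int sides;
};

// Hypothetical helper: true when T is both trivial and standard-layout.
template <typename T>
bool looksLikePod()
{
    return std::is_trivial<T>::value && std::is_standard_layout<T>::value;
}

// Byte-wise copying, as C code would do it, is fine for a type like Point.
Point copyBytes(const Point& from)
{
    Point to;
    std::memcpy(&to, &from, sizeof to);
    return to;
}
```

A static_assert on the same traits would turn an accidental non-POD change into a compile-time error.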
Some other quick POD facts:
For more formal details about POD's, see the following sections in Standard C++ as starting points:
void foo(int, double);

it internally changes the name to foo__Fid, where the F indicates that it is a function, the i indicates the first argument is an int and the d indicates that the second argument is a double. Other compilers do this in a similar manner. Since the definition of this function will be similarly decorated, the linker will resolve this symbol fine.
Now, consider something such as printf which is normally prototyped as follows:
int printf(const char *, ...);

which hence might map into printf__FPCce. However, if printf was compiled with a C compiler, well, no C compiler that we know of does name decorating (although they could, they don't), therefore, the name will not be resolved by the linker. To get around this, in most cases, you should use a linkage specification. Here's the above example again:
extern "C" void foo(int, double);

Now most implementations will not decorate foo any longer. Of course, you would extern "C" the function definition too (since the definition file should also be #include'ing the header file which contains the prototype, that will be satisfactory as far as the compiler is concerned, though you should consider self-documenting code). As well, if you have many such functions, you can create linkage blocks, consider:
// stdio.h
extern "C" {
// ...
int printf(const char *, ...);
// ...
}

Here, most C++ compilers will not decorate printf's name, therefore, it will probably now link with the binary version that was compiled by the C compiler. It's often useful to create a "co-ed" header file, so in such cases, this coding technique might be helpful:
// stdio.h
#ifdef __cplusplus
extern "C" {
#endif

// stuff from before

#ifdef __cplusplus
}
#endif

Since Standard C is not required to define __cplusplus, when compiling with a C compiler the extern block won't be established (but of course all the prototypes and such will be, which is as the C compiler would expect it).
Please note that name decoration is not required by C++, it is strictly an implementation detail. However, all C++ compilers do it. Similarly, a linkage specification should be considered a fighting chance at cross language linkage and not a guarantee. Again though, for most platforms, the reality is that it will work fine.
Too, do note that a linkage specification does not bring you into the other language! That is, you are still writing C++ code. As such, note that C++ keywords are still in existence even within a linkage specification -- so for instance, using new or bool or private as identifier names will end up being kicked out as errors; obviously then you'll need to rename those identifiers (you could do some preprocessor gymnastics but in the long run doing so usually does not pan out). Also note that doing something like passing a class based object, a reference, etc., to a C function means you are on your own. Note that other things affect name decoration too, such as a class name, namespace, etc. As well, you won't be overloading a function which you've extern "C"d because then you would have two functions with the same name (since most implementations wouldn't mangle them).
Also, doing this is usually a mistake:
// ...
extern "C" { // wrapped around #include, but prefer within header
#include "SomeHeaderFile.h"
}
// ...

because if the header is not prepared for being extern "C"d then doing the above is most likely just going to result in a bunch of error messages, and probably cryptically so too. Chances are good that the header will drag in other headers and bad fun will just ensue from there with nested headers, typedefs and who knows what else. Your extern "C"ing should follow the Las Vegas tenet "What happens in extern "C" stays in extern "C"": keep your extern "C" blocks within files, not across them.
The above has so far considered only the scenario of calling a C function from C++. The contrary, calling a C++ function from a C function, has the same solution. In other words, if you extern "C" a C++ function, then most implementations won't mangle it, therefore, most C compilers will be able to link with it. However, as just mentioned, if the C++ function expects something such as a reference argument, you are usually of course on your own. There are other issues too in this direction, for instance, consideration of C++'s overloading, the function expecting a C++ class based object as a parameter, etc. If it's not clear, C does not have all the features available in C++, and trying to mimic them (the calling routine(s) from C will have to do this) can be challenging to say the least in some cases. This all tends to complicate calling a C++ function from C, even with extern "C". In some cases, it may be worth considering stub routines to try and ease the pain, but this should be decided carefully. Of course as well, a C++ application can use C++ functions which have been extern "C"d.
There are other aspects to linkage specifications that you might want to be aware of. For instance, a linkage specification can be applied to objects. With the above under your belt, you may now want to pursue a quality C++ text book on this matter.
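To make the stub routine idea concrete, here is one possible shape, offered as a sketch and not as the only approach. The class Counter and the counter_* functions are invented for this example; the C side would see only the four function prototypes and an opaque void pointer.

```cpp
// A C++ class that C code cannot use directly.
class Counter {
public:
    Counter() : count_(0) {}
    void increment() { ++count_; }
    int value() const { return count_; }
private:
    int count_;
};

// Stub routines with C linkage: no overloading, no references, no class
// types in the signatures, only things a C caller can express.
extern "C" {

void *counter_create() { return new Counter; }

void counter_increment(void *handle)
{
    static_cast<Counter *>(handle)->increment();
}

int counter_value(void *handle)
{
    return static_cast<Counter *>(handle)->value();
}

void counter_destroy(void *handle)
{
    delete static_cast<Counter *>(handle);
}

} // extern "C"
```

A real set of stubs would also want to keep C++ exceptions (for instance from new) from escaping into the C caller; that is left out here to keep the sketch short.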
class xyz {
    int x;
    int y;
    mutable int z;
    // ...
public:
    static void staticMemFunc();
    void definedElsewhere() const;
    // ...
    void foo() { x = 99; }
    int getx() { return x; }
    int gety() const { return y; }
    void bar() const { z = -99; }
};

void global(int);

We know that in effect, the member functions really are:
class xyz {
    // ...
    void foo() { this->xyz::x = 99; }
    int getx() { return this->xyz::x; }
    int gety() const { return this->xyz::y; }
    void bar() const { this->xyz::z = -99; }
};

Note that gety() has a const after its argument list (which is empty in this example), whereas foo() and getx() do not. Because of the const, gety() is called a const member function. This means that the type of its this pointer is a const xyz * const. In the case of foo() and getx(), which are then non-const member functions, the type of their this pointer is a xyz * const. In other words, an implication is that gety() normally couldn't modify the (modifiable) object it was called with, whereas the other two can. As such, note that foo() modifies x, however, note that getx() and gety() just return copies of some of xyz's members. This seems to imply that getx() should also have been declared as a const member function.
Because of the difference in the qualification of what the this pointer points to, consider the following:
xyz A;
// ...
A.foo(); // ok
int ax = A.getx(); // ok
int ay = A.gety(); // ok

const xyz B;
B.foo(); // Not ok, expect a diagnostic
int bx = B.getx(); // Not ok, expect a diagnostic
int by = B.gety(); // ok

We observe here that a non-const object can be used with all the functions. We also observe that the const object's integrity is upheld: calling non-const member functions upon it draws a diagnostic, because those functions are allowed to modify the object, and if the object is const, that would not be a good thing.
A twist, and a point worth noting, is about member z. In particular, it is declared mutable. This means that even if an object is declared const (meaning it is immutable, that is, not modifiable), any members declared mutable are, well, mutable, that is, modifiable. This is upheld through const member functions, therefore, member bar() is able to modify z even though it was declared as a const member function.
Also note that const member functions apply to non-static member functions only. That means it does not apply to global or staticMemFunc in the example above. In other words, this is wrong:
class xyz {
    // ...
    static void staticmemfunc() const; // nope
};

void global(int) const; // nope, even if in a namespace

void blah()
{
    global(99); // calling with a constant does not matter
}

Please do not confuse this with an out-of-class definition of a const member function, which is fine:
// ...
void xyz::definedElsewhere() const
{ // const required in decl and def
    global(x); // perhaps
    global(z); // perhaps
}
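One common use of mutable, sketched here with an invented Document class, is caching a computed value inside a const member function. The object stays logically unchanged for callers, while the cache members are allowed to change behind the scenes.

```cpp
#include <cstddef>
#include <string>

class Document {
public:
    explicit Document(const std::string& text)
        : text_(text), cachedWords_(0), cacheValid_(false) {}

    // const member function: it may still update the mutable cache members.
    std::size_t wordCount() const
    {
        if (!cacheValid_) {
            cachedWords_ = countWords();
            cacheValid_ = true;
        }
        return cachedWords_;
    }

private:
    // Counts runs of non-space characters (only ' ' is treated as a space).
    std::size_t countWords() const
    {
        std::size_t n = 0;
        bool inWord = false;
        for (std::string::size_type i = 0; i < text_.size(); ++i) {
            const bool space = (text_[i] == ' ');
            if (!space && !inWord)
                ++n;
            inWord = !space;
        }
        return n;
    }

    std::string text_;
    mutable std::size_t cachedWords_; // modifiable even in a const Document
    mutable bool cacheValid_;
};
```

Given const Document d("one two three");, calling d.wordCount() compiles because wordCount() is const, and the second call returns the cached value.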
Do note that just because a product name uses C++ in it, does not mean that it is compliant with the C++ standard. There are various vendors who have implementations that target Standard C++. For instance, Comeau C++, or Microsoft's Visual C++ (aka VC++). Implementations are available for specific platforms. The implementations are compliant to the Standard in varying degrees, also, the implementations may or may not have extensions, which would be documented by the vendor of such a product.
Usually you want to compare implementations with each other. That is, you wouldn't necessarily ask what the advantage of C++ is over VC++, because that tends to be a non-question. That is, usually you don't want to look at a specific implementation of C++ as a language in and of itself. This doesn't mean that you cannot discuss the products, or ask what their extensions, platforms, ease of use, speed, etc., are, so long as you are aware that they are specifics about what a particular vendor does.
Usually you should contact your vendor about such an error. Too though, the diagnostic may give you information about the line of code which caused the error, and you may or may not be able to figure out a work around from there.
Here's some situations which may be the indirect cause of the internal error:
Here's some additional points to consider:
template <class T> class xyz { };

could have been written as:
template <typename T> class xyz { };

These two definitions of xyz are considered equivalent since template parameters using class or typename are interchangeable.
Additionally, there are various contexts where the compiler needs to know whether it is dealing with a declaration or an expression. In the case of templates, a similar parsing issue comes up. In particular, if T is a template parameter as it is in xyz above, then what does it mean for it to use say T::x? In other words, if the compiler does not know what T is until you instantiate it, how could it know what x is, since it is based upon T? Consider:
template <typename T>
class xyz {
    void foo() { T::x * p; /* ... */ p = blah; }
};

Does this declare p or does it multiply some p somewhere by T::x? If it should be a declaration, then you would do this to make that clear:
template <typename T>
class xyz {
    void foo() { typename T::x * p; /* ... */ p = blah; }
};

Now we know that blah is being assigned to the local p in foo.
Note that there is no guarantee that when you actually instantiate the template that x is actually a type. If it's not, you'll get an error at that point. Anyway, make sure that typenamed things will actually eventually refer to types.
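Here is a complete, compilable variation on the example, as a sketch: HasTypeX is a made-up type whose nested name x really is a type, and the member function deref() is invented so there is something to call.

```cpp
// Hypothetical argument type: its nested name x is a type (an alias for int).
struct HasTypeX {
    typedef int x;
};

template <typename T>
class xyz {
public:
    // typename says that T::x names a type, so the first statement in the
    // body declares a pointer p; it is not a multiplication.
    typename T::x deref(typename T::x *source) const
    {
        typename T::x *p = source;
        return *p;
    }
};
```

Instantiating xyz with a type whose x is, say, a static data member would be diagnosed when deref() is instantiated, which is the error mentioned above.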
Note too that some earlier compilers do not support the typename keyword at all. As well, some current compilers which do support it may have a switch to disable it, so double check your situation if your compiler won't accept your using this keyword. Furthermore, some compilers and/or modes will make an attempt to guess which names are actually supposed to be type names. A compiler can clearly do this in some cases, and in some cases typename is not required because the context requires a type, but to expect it to guess correctly in all cases is not possible.
Also worth pointing out is that typename is not the same as typedef. In brief, typedef creates an alias (another name) for an existing type, whereas typename acts as a contextual disambiguator as described above. In fact, you will often see them used together, for instance:
template <typename T>
class xyz {
    typedef typename T::SomeTypeInT SomeNewTypeForxyz;
};

By the way, if you are using just a T, then it does not require typename, so this is wrong:
template <typename T>
class abc {
    typedef typename T abcT;
};

For instance, here is the error message from Comeau C++ about this:
Comeau C/C++ 4.2.43 (Mar 3 2000 18:18:13) for Solaris_SPARC_2_x
Copyright 1988-2000 Comeau Computing. All rights reserved.
MODE:strict errors C++

"tn.c", line 2: error: a class or namespace qualified name is required
  typedef typename T abcT;
                   ^
There are other examples where typename can be used and their premise is similar to the above.
C++ was originally called "C With Classes". The problem was that people started calling C With Classes things like "new C" and even just plain old C. Because of this, AT&T management suggested that Stroustrup change its name to be more politically courteous. So, it (briefly) came to be known as C84. However, then the problem was that people began calling original C names like "old C". Furthermore, ANSI C was being developed around that time too, and C84 would clash with it too, because during standardization, languages usually get coined names like LANGUAGEYY, so ANSI C might end up being something like C86, so naming C++ as C84 would just make that confusing, especially if a new version of C84 came along!
So a new name still needed to be found. Around 1983 Rick Mascitti suggested C++. It's a pun off of the ++ operator in C, which deals with incrementing something (although it is semantically problematic since it should really be ++C), but anyway: first C, then C++, get it?