Section 1.2
Variables. Data types. Constants.
cplusplus.com

The usefulness of the "Hello World" programs shown in the previous section are something more than questionable. Since we have had to write several lines of code, compile them, and then execute the resulting program to obtain just a phrase on the screen as result. It is true, but programming is not limited only to print texts on screen, it would be much faster to simply write the output sentence by ourselves. In order to go a little further on and to become able to write programs that perform useful tasks that really save us work we need to introduce the concept of variable.

Let's think that I ask you to retain number 5 in your mental memory, and then I ask you to memorize also the number 2. You have just stored two values in your memory. Now, if I ask you to add 1 to the first number I said, you should be retaining numbers 6 (that is 5+1) and 2 in your memory. Values that now we could subtract and obtain 4 as result.

All this process that you have made is a simil of what a computer can do with two variables. This same process can be expressed in C++ with the following instruction set:

a = 5;
b = 2;
a = a + 1;
result = a - b;

Obviously this is a very simple example since we have only used two small integer values, but consider that your computer can store several million of numbers like these at the same time and conduct sophisticated mathematical operations with them.

Therefore, we can define a variable like a portion of memory to store a determined value.

Each variable needs an identifier that distinguishes it from the others, for example, in the previous code the variable identifiers were a, b and result, but we could have called the variables with any name we had wanted to invent, whenever they were valid identifiers.

Identifiers

A valid identifier is a sequence of one or more letters, digits or underline symbols ( _ ). The length of an identifier is not limited, although for some compilers only the 32 first characters of an identifier are significant (the rest are not considered).

Neither spaces nor marked letters can be part of an identifier. Only letters, digits and underline characters are valid. In addition, variable identifiers should always begin with a letter. They can also begin with an underline character ( _ ), but this is usually reserved for external links. In no case they can begin with a digit.

Another rule that you have to consider when inventing your own identifiers is that they cannot match with any language's key word nor your compiler's ones since they could be confused with these, for example, the following expressions are always considered key words according to the ANSI-C++ standard and therefore they must not be used as identifiers:

asm, car, bool, break, marry, catch, to char, class, const, const_cast, continue, default, delete, do, double, dynamic_cast, else, enum, explicit, extern, false, float, for, friend, goto, if, inline, int, long, mutable, namespace, new, operator, private, protected, public, to register, reinterpret_cast, return, short, signed, sizeof, static, static_cast, struct, switch, template, this, throw, true, try, typedef, typeid, typename, union, unsigned, using, virtual, void, volatile, wchar_t
Additionally, alternative representations for some operators do not have to be used as identifiers since they are reserved words under some circumstances:
and, and_eq, bitand, bitor, compl, not, not_eq, or, or_eq, xor, xor_eq
your compiler may also include some more specific reserved keywords. For example, many compilers which generate 16bits code (like some compilers for DOS) include also far, huge and near as key words.

Important: The C++ language is "case sensitive", that means that a same identifier written in capital letters is not equivalent to another one with the same name but written in small letters. Thus, for example the variable RESULT is not the same one that the variable result nor variable Result.

Data types

When programming, we store the variables in our computer's memory, but the computer must know what we want to store in them since it is not going to occupy the same space in memory to store a simple number, a letter or a large number.

Our computer's memory is organized in bytes. A byte is the minimum amount of memory which we can manage. A byte can store a relatively small data, like an integer between 0 and 255 or one single character. But in addition, the computer can manipulate more complex data types that come from grouping several bytes, like long numbers or numbers with decimals. Next you have a list of the existing fundamental data types in C++, as well as the range of values that can be represented with each of them:

DATA TYPES
Name Bytes* Description Range*
char 1 character or integer 8 bits length. signed: -128 to 127
unsigned: 0 to 255
short 2 integer 16 bits length. signed: -32763 to 32762
unsigned: 0 to 65535
long 4 integer 32 bits length. signed:-2147483648 to 2147483647
unsigned: 0 to 4294967295
int * Integer. Its length traditionally depends on the length of the system's Word type, thus in MSDOS it is 16 bits long, whereas in 32 bit systems (like Windows 9x/2000/NT and systems that work under protected mode in x86 systems) it is 32 bits long (4 bytes). See short, long
float 4 floating point number. 3.4e + / - 38 (7 digits)
double 8 double precision floating point number. 1.7e + / - 308 (15 digits)
long double 10 long double precision floating point number. 1.2e + / - 4932 (19 digits)
bool 1 Boolean value. It can take one of two values: true or false NOTE: this is a type recently added by the ANSI-C++ standard. Not all compilers support it. Consult section bool type for compatibility information. true or false
wchar_t 2 Wide character. It is designed as a type to store international characters of a two-byte character set. NOTE: this is a type recently added by the ANSI-C++ standard. Not all compilers support it. wide characters

* Values of columns Bytes and Range may vary depending on your system. The values included here are the most commonly accepted. They are used by almost all compilers.

In addition to these fundamental data types there also exist the pointers and the void parameter type specification, that we will see later.

Declaration of variables

In order to use a variable in C++, we must first declare it specifying which of the data types above we want it to be. The syntax to declare a new variable is to write the data type specifier that we want (like int, short, float...) followed by a valid variable identifier. For example:
int a;
float mynumber;
Aae valid declarations of variables. The first one declares a variable of type int with the identifier a. The second one declares a variable of type float with the identifier mynumber. Once declared, variables a and mynumber can be used within the rest of their scope in the program.

If you need to declare several variables of the same type and you want to save some writing work you can declare all of them in the same line separating the identifiers with commas. For example:

int a, b, c;
declares three variables (a, b and c) of type int , and has exactly the same meaning as if we have written:
int a;
int b;
int c;

Integer data types (char, short, long and int) can be signed or unsigned according to the range of numbers that we need to represent. Thus to specify an integer data type we do it by putting the keyword signed or unsigned before the data type itself. For example:

unsigned short NumberOfSons;
signed int MyAccountBalance;
By default, if we do not specify signed or unsigned it will be assumed that the type is signed, thus in the second declaration we could have written:
int MyAccountBalance;
with exactly the same meaning and being this last one the most usual, in fact, it is rare to see source codes including the keyword signed as part of a compound type name.

The only exception to this rule is the char type that exists by itself and it is considered a diferent type than signed char and unsigned char.

Finally, signed and unsigned may also be used as a simple types, meaning the same as signed int and unsigned int respectivelly. The following two declarations are equivalent:

unsigned MyBirthYear;
unsigned int MyBirthYear;

To see in action how looks like a declaration in a program, we are going to see the C++ code of the example about your mental memory proposed at the beginning of this section:

// operating with variables

#include <iostream.h>

int main ()
{
  // declaring variables:
  int a, b;
  int result;

  // process:
  a = 5;
  b = 2;
  a = a + 1;
  result = a - b;

  // print out the result:
  cout << result;

  // terminate the program:
  return 0;
}
4

Do not worry if something out of the varibale declarations sounds a bit strange to you. You will see the rest in detail in following sections.

Initialization of variables

When declaring a variable, its value is undetermined by default. But you may want that a variable stores a concrete value when it is declared. For that, you have to append an equal sign and the value you want to the variable declaration:
type identifier = initial_value ;
For example, if we want to declare an int variable called a that contains the value 0 since the moment in which is declared, we could write:
int a = 0;

Additionally to this C-like way of initializating variables, C++ has added a new way to initialize a variable: by enclosing the initial value between parenthesis ():

type identifier (initial_value) ;
For example:
int a (0);
Both ways are valid in C++.

Scope of variables

All the variables that we are going to use must have been previously declared. An important difference between the C and C++ languages, is that in C++ we can declare variables anywhere in the source code, even between two executable sentences, and not only at the beginning of a block of instructions, like happens in C.

Even so it can be recommendable under some circumstances to follow the indications of the C language when declaring variables, since it can be useful at the time of correcting a program to have all the declarations grouped together. Therefore, the traditional way to declare C-like variables is to include their declaration at the beginning of each function (local variables) or directly in the body of the program outside any function (global variables).

Global variables can be referred anywhere in the code, even within functions, whenever it is after its declaration.

The scope of the local variables is limited to the code level in which they are declared. If they are declared at the beginning of a function (like in main) its scope is the whole main function. This means that if in the example above, moreover than the function main() another function existed, the local variables declared in main could not be used in the other function and vice versa.

In addition to local and global scopes exists the external scope, that causes a variable to be visible not only in the same source file but in all other files which will be linked together with.

In C++, the scope of a variable is given by the block in which it is declared (a block is a group of instructions grouped together within key brackets {} signs). If it is declared within a function it will be a variable with function scope, if it is declared in a loop its scope will be only the loop, etc...

Constants: Literals.

A constant is any expression that has a fixed value. They can be divided in Integer Numbers, Floating-Point Numbers, Characters and Strings.

Integer Numbers

1776
707
-273
they are numerical constants that identify integer decimal numbers. Notice that to express a numerical constant we do not need to write quotes (") nor any special symbol. There is no doubt that it is a constant: whenever we write 1776 in a program we will be referring to the value 1776.

In addition to decimal numbers (those that all of us already know) C++ allows the use as literal constant of octal numbers (base 8) and hexadecimal numbers (base 16). If we want to express an octal number we must precede it with a 0 character (zero character). And to express a hexadecimal number we have to precede it with the characters 0x (zero, x). For example, the following literal constants are all equivalent to each other:

75         // decimal
0113       // octal
0x4b       // hexadecimal
All of them represent the same number: 75 (seventy five) expressed as a radix-10 number, octal and hexdecimal, respectively.

[ Note: You can find more information on hexadecimal and octal representations in the document Numerical radixes]

Floating Point Numbers
They express numbers with decimals and/or exponent. They can include a decimal point, an e character (that expresses "by ten at the Xth height", where X is the following integer value) or both.

3.14159    // 3.14159
6.02e23    // 6.02 x 1023
1.6e-19    // 1.6 x 10-19
3.0        // 3.0
these are four valid numbers with decimals expressed in C++. The first number is PI, the second one is the number of Avogadro, the third is the electric load of an electron (an extremely small number) -all of them approximated- and the last one is the number 3 expressed as a floating point number literal.

Characters and strings
There also exist non-numerical constants, like:

'z'
'p'
"Hello world"
"How do you do?"
The first two expressions represent single characters, and the following two represent strings of several characters. Notice that to represent a single character we enclose it between single quotes (') and to express a string of more than one character we enclose them between double quotes (").

When writing both single characters and strings of characters in a constant way, it is necessary to put the quotation marks to distinguish them from possible variable identifiers or reserved words. Notice this:

x
'x'
x refers to variable x, whereas 'x' refers to the character constant 'x'.

Character constants and string constants have certain peculiarities, like the escape codes. These are special characters that cannot be expressed otherwise in the sourcecode of a program, like newline (\n) or tab (\t). All of them are preceded by an inverted slash (\). Here you have a list of such escape codes:

\nnewline
\rcarriage return
\ttabulation
\vvertical tabulation
\bbackspace
\fpage feed
\aalert (beep)
\'single quotes (')
\"double quotes (")
\question (?)
\\inverted slash (\)
For example:
'\n'
'\t'
"Left \t Right"
"one\ntwo\nthree"
Additionally, to numerically express an ASCII code you can also use the inverted slash bar (\) followed by an ASCII code expressed as an octal (radix-8) or hexadecimal (radix-16) number. In the first case (octal) the number must follow immediately the inverted slash (for example \23 or \40), in the second (hexacedimal), you must put an x character before the number (for example \x20 or \x4A).
[Consult the document ASCII Code for more information about this type of escape code].

String of characters constants can be extended by more than a single code line if each code line ends with an inverted slash (\):

"string expressed in \
two lines"
You can also concatenate several string constants separating them by one or several blankspaces, tabulators, newline or any other valid blank character:
"we form" "a unique" "string" "of characters"

Defined constants (#define)

You can define your own names for constants that you use quite often without having to resource to variables, simply by using the #define preprocessor directive. This is its format:
#define identifier value
For example:
#define PI 3.14159265
#define NEWLINE '\n'
#define WIDTH 100
they define three new constants. Since the moment when they are declared, you will be able to use them in the rest of the code as any other constants, for example:
circle = 2 * PI * r;
cout << NEWLINE;
In fact the only thing that the compiler does when it finds #define directives is to replace literally any occurrence of the them (in the previous example, PI, NEWLINE or WIDTH) by the constants to which they have been defined (3.14159265, 'n' and 100, respectively). For this reason, #define constants are considered macro constants.

The #define directive is not an code instruction, it is a directive for the preprocessor, reason why it assumes the whole line as the directive and does not require a semicolon (;) at the end of it. If you include a semicolon character (;) at the end, it will also be added when the preprocessor will substitute any occurence of the defined constant within the body of the program.

declared constants (const)

With the const prefix you can declare constants with a specific type exactly just as you would do with a variable:
const int width = 100;
const char tab = '\t';
const zip = 12440;
In case that the type was not specified (as in the last example) the compiler assumes that it is type int.

© The C++ Resources Network, 2000-2001 - All rights reserved

Previous:
1-1. Structure of a C++ program.

index
Next:
1-3. Operators.