page28

Program 25

Multi-file project & more logic

This sample program shows how you use more than one source code file in a project. It consists of three files :

function.c - Holds a couple of handy functions that will be called from the main program
function.h - Holds the prototypes for the functions in the function.c file
prog25.c - The main source file, where the program starts and calls functions in the function.c source file.

If you've forgotten about function prototypes, they were covered briefly in the description on page15.

Create the three program files and, if using Visual C, place them in the folders shown in the description at the top of the code for each. It isn't essential to place the function.h header file in the Header Files folder. It doesn't even have to be added to the project at all, as long as the file exists where the compiler can find it. However, it is a lot neater to have it listed in the Header Files section, and it also means you can double click it in the left pane to look at it at any time, which is handy.

As well as showing how functions in one source file can be called from another source file in the same project, these programs show a few nifty new bits. It isn't important to understand how the capitalise or addascii functions work, but if you can, all the better. The capitalise function is explained in the description to help.

Program code

Function.c ( Place in the Source Files folder. Holds two functions that are called from a separate source code file )

#include <math.h>
#include <stdio.h>
#include <string.h>

// Capitalise the passed string
void capitalise(char *str)
{
    unsigned int i;
    char lastchar=' ';

    for(i=0;i<strlen(str);i++)
    {
        if(*(str+i)>64&&*(str+i)<123)
        {
            if(lastchar==' ')
                *(str+i)&=223;
            else
                *(str+i)|=32;
        }
        lastchar=*(str+i);
    }
}

// Return the sum total of all ascii character codes in the passed string
unsigned int addascii(char *str)
{
    unsigned int i,tot=0;

    for(i=0;i<strlen(str);i++)
        tot+=*(str+i);

    return(tot);
}

Function.h ( Place in the Header Files folder. Holds prototypes for the functions in function.c )

void capitalise(char *str);
unsigned int addascii(char *str);

Prog25.c ( Place in the Source Files folder. Is the main source code program which calls functions in the function.c source code )

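#include <stdio.h>
#include <string.h>
#include <conio.h>
#include "function.h"

int main(void)
{
    char mystr[200];

    printf("Please enter your name : ");
    // fgets never reads more than the buffer can hold, unlike gets
    if(fgets(mystr,sizeof(mystr),stdin)==NULL)
        mystr[0]='\0';
    mystr[strcspn(mystr,"\n")]='\0';   // remove the newline that fgets keeps
    printf("Initial string = %s ( Ascii value = %u )\n",mystr,addascii(mystr));
    capitalise(mystr);
    printf("After capitalise = %s ( Ascii value = %u )\n",mystr,addascii(mystr));
    getch();   // wait for a key press before the window closes
    return 0;
}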

Description of the program code

This program consists of three files ( function.c, function.h, and prog25.c ). When working with multiple source code files in a project, only one of them can contain a main function. This is the function that marks the start of the program, so if there were two, the build would fail because there would be no way of telling which one to use. In this case the main function ( start of the program ) is in prog25.c.

Starting then, with the prog25.c program :

The first new part encountered is the #include "function.h" line. A quick glance may not show why this is new, but a closer look reveals that the filename of the header file is enclosed in quotation marks " " rather than < and >. This tells the compiler to look for function.h first in the same folder as the source file that includes it ( in this case prog25.c ), before trying the folders it searches for the standard headers.

Way back on page15 it was said that a compiler works from top to bottom through a source file, so if a function is written lower in the source file than a call to that function, the compiler will complain. This is because, at the time the compiler is compiling the call, it has no knowledge at all of the function being called. If the function is above the call in the source file, then the compiler has already met it and won't complain. To get around this, function prototypes are used, which in short say to the compiler, "The function is here ... honest" ! The compiler in effect says, "Okay, I'll believe you. In any case, if you're wrong, the linker will get you" :-)

The same is true if the function being called is located in a separate source code file: the compiler needs a prototype to convince it that the function exists. These prototypes are best stored in a single file which can be included, in this case the function.h file. It is safe now to look at header files in a little more detail to see exactly what they are. The word include pretty much gives the game away, because exactly as it sounds, it includes the contents of the specified file into the program at that point. So strictly speaking you don't need the #include "function.h" line, since you could type the prototypes straight into the prog25.c file. This isn't really desirable though: using the #include line is far easier, and neater too.

Now that the compiler has been told about the functions in function.c, it will not complain when it meets calls to them. The addascii function is called first, to sum the ascii character codes of every character in the string entered. The capitalise function is then called, which in short takes a string and makes the first letter of each word upper case and every other letter lower case. Eg :

tHis iS thE strING turns into This Is The String

It's a useful little function to have handy for lots of programs, especially for dealing with names. For example, it can turn joe bloggs into Joe Bloggs. Because it is such a handy function, and one which could probably be used in lots of different programs, it is good that it is stored in a separate source file. Every time you need it, you can copy the function.c and function.h files to the folder of the program you are creating and just add them to your project. As you write other handy functions that you will use again, you can keep adding them to the function.c and function.h files. After a while, though, you may end up with a big function.c file and want to split the functions into categories, for example, functions dealing with strings, functions dealing with conversions, and so on. Eventually you'll end up with quite a few source and header files that can be re-used over and over for each program you write. So much so that, after a while, you'll probably end up writing programs which contain fewer and fewer standard 'C' functions and more of your own functions from your library.

Description of the capitalise function and logic

The code for the capitalise function is shown below, to save scrolling up the page :

void capitalise(char *str)
{
    unsigned int i;
    char lastchar=' ';

    for(i=0;i<strlen(str);i++)
    {
        if(*(str+i)>64&&*(str+i)<123)
        {
            if(lastchar==' ')
                *(str+i)&=223;
            else
                *(str+i)|=32;
        }
        lastchar=*(str+i);
    }
}

The function converts a string ( or array of characters, depending on which way you prefer to visualise it ) into a string where the first character of each word is upper case and the rest of the characters are lower case. To do this, it scans through the string ( the start of which is pointed to by str ). Yes, it's pointers again :-(. If you are still a bit shy at the sight of pointers, the same function could be written as below. It still uses pointers, but it's not as apparent as above. It also shows that the elements of an array are stored at consecutive memory addresses, which is why str[i] and *(str+i) refer to the same character.

void capitalise(char *str)
{
    unsigned int i;
    char lastchar=' ';

    for(i=0;i<strlen(str);i++)
    {
        if(str[i]>64&&str[i]<123)
        {
            if(lastchar==' ')
                str[i]&=223;
            else
                str[i]|=32;
        }
        lastchar=str[i];
    }
}

The i variable holds the loop counter. It starts at 0 and runs through to 1 less than the string length, which is the position of the last character ( remember that a string ends with a terminating null character, '\0', which strlen does not count ) :

for(i=0;i<strlen(str);i++)

The if line that follows checks whether the character at the current position in the string is a letter. If you look back at the ASCII table on page 9, you will see that the ascii codes for A - Z are 65 to 90 inclusive, and the ascii codes for a - z are 97 to 122 inclusive. You may now be starting to see a bug in the line :

if(str[i]>64&&str[i]<123)

Note that this bug won't show up when you compile the program, and may never show up at all, depending on what information is fed to the function. If you look again at the ascii table, the codes from 91 to 96 ( inclusive ) are not letters ! They are the characters [ \ ] ^ _ and `. These will be converted by the function in some cases, which is definitely not what is needed. So the corrected line should read :

if(( str[i]>64 && str[i]<91 ) || ( str[i]>96 && str[i]<123 ))

There are really two main conditions now in the if statement, with two further conditions in each of these. If either of the main conditions is met the if line will result in true and if both main if conditions are false the if line will result in false. If you're a bit lost with this line, consult the description on page 12 again.

Finally on this line, there is a more meaningful way still of writing it that performs the same task :

if(( str[i]>='A' && str[i]<='Z' ) || ( str[i]>='a' && str[i]<='z' ))

Whichever way suits you and your programming style is the best way of writing the line. If you are working as part of a team of programmers, though, I think your colleagues would prefer it written in the final way... back to the code.

Assuming that the function has met a letter character, it then uses the lastchar variable to see if the last character was a space. If it was a space, which is always assumed for the first character in the string because lastchar starts out as a space, then it will capitalise the character. It does this using a bitwise AND ( & ).

str[i]&=223;

If you read and understood page 23 and the logic and binary sections in the attached document, then you should be okay with this. If not, you may want to go over them again to fully understand it.

If you convert the 'A' character ( ascii code 65 ), and the 'a' character ( ascii code 97 ), to binary, you can see that the two binary numbers are very similar :

Code	128	64	32	16	8	4	2	1
65 (A)	0	1	0	0	0	0	0	1
97 (a)	0	1	1	0	0	0	0	1

The only difference between the two binary patterns is in the 32 column, where for the capital 'A' the bit is not set, but for the lower case 'a' it is set. This is the same for all the letters in the ASCII code. Try a few to prove it. This means that setting or clearing the bit in the 32 column will change an upper case letter to lower case or vice versa. You could also do this by adding or subtracting 32 from the ascii code, but then you would first need to find out whether the character is upper or lower case, which is a bit more work. Instead, the bitwise AND ( & ) and bitwise OR ( | ) can be used.

The bitwise AND is normally used to mask bits, ie, to set them to zero. The truth table for AND below may help show why :

input 1	input 2	output
0	0	0
0	1	0
1	0	0
1	1	1

Only when both inputs are 1 ( or true ) is the output 1 ( or true ). If either of the two inputs is 0 ( false ) then the output will be 0 ( false ). Therefore, if a bit is AND'd with a bit that is not set, the result is 0, irrespective of its initial value. Eg - 97 (a) is input 1, 223 is input 2 and the output is 65 (A) :

Code	128	64	32	16	8	4	2	1
97 (a)	0	1	1	0	0	0	0	1
And (223)	1	1	0	1	1	1	1	1
65 (A)	0	1	0	0	0	0	0	1

By setting the 32 bit column value to 0 and the others to 1, and AND'ing with that value, the bit in the 32 column is set to 0 regardless of its previous state and the other bits are left as they were, which is why the decimal value 223 is used. This has the effect of turning a lower case ascii letter into upper case, and if it's already upper case it will not change it.

The next important line ( below ) does a similar thing, but this one turns an ascii letter into lower case using a bitwise OR ( | ) operation.

str[i]|=32;

The bitwise OR is usually used to set bits, ie, to set them to one. The truth table for OR is shown below :

input 1	input 2	output
0	0	0
0	1	1
1	0	1
1	1	1

When either of the inputs is 1 ( or true ) the output is 1 ( or true ), and only when both inputs are 0 ( false ) is the output 0 ( false ). Therefore, if a bit is OR'd with a bit that is set, the result is 1, irrespective of its initial value. Eg - 65 (A) is input 1, 32 is input 2 and the output is 97 (a) :

Code	128	64	32	16	8	4	2	1
65 (A)	0	1	0	0	0	0	0	1
Or (32)	0	0	1	0	0	0	0	0
97 (a)	0	1	1	0	0	0	0	1

By setting the 32 bit column value to 1 and the others to 0, and OR'ing with that value, the bit in the 32 column is set to 1 regardless of its previous state and the other bits are left as they were, which is why the decimal value 32 is used. This has the effect of turning an upper case ascii letter into lower case, and if it's already lower case it will not change it.

Irrespective of whether the current character is a letter or not, or whether it's upper or lower case, the next line remembers it, so that on the next pass round the loop the function can check whether the previous character was a space when deciding whether to capitalise.

lastchar=str[i];

Summary

This page showed how you can create projects with more than one source code file. Even though it's pretty simple to do, this is a massive step up, because it gives the ability to create large programs and also create your own set of standard routines for dealing with various things. These then can be used over and over again with different projects, making the development of programs a lot quicker and neater.

A small function for turning the first letter of each word into a capital was also shown, as well as various other ways of writing the same function so that it still behaves the same way. This function should also show that you should never rely on the compiler to find "bugs" or errors in programs. The bugs compilers pick up are the easiest to recognise and fix. It's the ones they don't pick up that give the most headaches. Worse still, the bug that was in the program above may not have shown itself for a while, sitting there dormant, waiting to give you a headache at a later time.

To progress further still, a closer look at parameter passing and pointers is needed, which is next.

Tasks

25.1) Alter the capitalise function so that it will capitalise letters after either a space character or a full stop ( period . ).