Introduction to C File Access

The program below serves as a first introduction to reading and writing files in C directly, rather than through the shell's < and > redirection. It also introduces a few new C constructs.

The sample program first prompts the user for the names of the input and output files. It then reads the input file and searches for words consisting of a single nonzero digit. The output file it produces is identical to the input file, except that each such word has been changed from Arabic to Roman numerals. Here is a script record of a sample run:

     Script started on Wed Apr 15 23:52:27 1992
     heather% cat x

     She has 7 sisters,
     2 brothers, 6 nieces and 3
     nephews.

     heather% a.out
     enter name of input file
     x
     enter name of output file
     y
     heather% cat y

     She has VII sisters,
     II brothers, VI nieces and III
     nephews.

     heather% e
     heather%
     script done on Wed Apr 15 23:52:46 1992

Here is the program. Some remarks will follow it.

     1  
     2  
     3  /* converts single-digit, nonzero arabic numbers to roman numbers 
     4     in a given text file */
     5  
     6  
     7  /* WARNING:  for demonstration purposes only; not completely general */
     8  
     9  
    10  #include <stdio.h>
    11  
    12  
    13  char InFile[1000],  /* the entire input file */
    14       InFileName[20],  /* name of the input file */
    15       OutFileName[20],  /* name of the output file */
    16       Word[25];  /* current word */
    17  
    18  
    19  int InFileSize,  /* number of characters in the input file */
    20      CurrInFilePlace,  /* current scanning place in the input file; 
    21                           used when we are alternating looking for
    22                           blanks and words */
    23      WordLength;  /* length of the current word */
    24  
    25  
    26  FILE *InFilePtr,  /* pointer to input file */
    27       *OutFilePtr;  /* pointer to output file */
    28  
    29  
    30  ReadInFile()  /* read the input file into the array InFile */
    31  
    32  {  char C;
    33  
    34     InFileSize = 0;
    35     while (fscanf(InFilePtr,"%c",&C) != -1) InFile[InFileSize++] = C;
    36  }
    37  
    38  
    39  PrintFile()  /* for debugging purposes only */
    40  
    41  {  int I;
    42  
    43     for (I = 0; I < InFileSize; I++) printf("%c",InFile[I]);
    44  }
    45  
    46  
    47  GetFile()
    48  
    49  {  printf("enter name of input file\n");
    50     scanf("%s",InFileName);
    51     printf("enter name of output file\n");
    52     scanf("%s",OutFileName);
    53     InFilePtr = fopen(InFileName,"r");
    54     OutFilePtr = fopen(OutFileName,"w");
    55     ReadInFile();
    56  }
    57  
    58  
    59  WriteBlanksAndEOLs()  /* starting at the current position in the
    60                           input file, keep scanning until get a
    61                           "real" character, i.e. not a blank or
    62                           end-of-line; meanwhile, copy all the
    63                           blanks and EOLs to the output file */
    64  
    65  {  char C;
    66  
    67     while (1)  {
    68        C = InFile[CurrInFilePlace];
    69        if (C == ' ' || C == '\n')  {
    70           fprintf(OutFilePtr,"%c",C);
    71           CurrInFilePlace++;
    72        }
    73        else break;
    74     }
    75  }
    76  
    77  
    78  int GetWord()  /* scans this word until hit its end; copy the word
    79                    to the array Word; also, return 1 if successful
    80                    in getting a word, 0 otherwise (failed due to
    81                    reaching the end of the file) */
    82  
    83  {  char C;
    84  
    85     WordLength = 0;
    86  
    87     while (1)  {
    88        /* at end of file, leave; report failure only if Word is empty */
    89        if (CurrInFilePlace == InFileSize) return WordLength > 0;
    90        C = InFile[CurrInFilePlace];
    91        /* if hit the end of the word, leave, otherwise record the
    92           current character in the array Word */
    93        if (C == ' ' || C == '\n')  break;
    94        else  {
    95           Word[WordLength++] = C;
    96           CurrInFilePlace++;
    97        }
    98     }
    99  
   100     return 1;
   101  }
   102  
   103  
   104  
   105  ConvertWord()  /* the arabic-to-roman conversion is done here */
   106  
   107  {  switch (Word[0])  {
   108        case '1': Word[0] = 'I';                        /* 1 = I */
   109                  WordLength = 1; break;
   110        case '2': Word[0] = Word[1] = 'I';              /* 2 = II */
   111                  WordLength = 2; break;
   112        case '3': Word[0] = Word[1] = Word[2] = 'I';    /* 3 = III */
   113                  WordLength = 3; break;
   114        case '4': Word[0] = 'I'; Word[1] = 'V';         /* 4 = IV */
   115                  WordLength = 2; break;
   116        case '5': Word[0] = 'V';                        /* 5 = V */
   117                  WordLength = 1; break;
   118        case '6': Word[0] = 'V'; Word[1] = 'I';         /* 6 = VI */
   119                  WordLength = 2; break;
   120        case '7': Word[0] = 'V';                        /* 7 = VII */
   121                  Word[1] = Word[2] = 'I';  
   122                  WordLength = 3; break;
   123        case '8': Word[0] = 'V';                        /* 8 = VIII */
   124                  Word[1] = Word[2] = Word[3] = 'I';
   125                  WordLength = 4; break;
   126        case '9': Word[0] = 'I'; Word[1] = 'X';         /* 9 = IX */
   127                  WordLength = 2; 
   128     }
   129  }
   130  
   131  
   132  WriteWord()
   133  
   134  {  int I;
   135  
   136     for (I = 0; I < WordLength; I++)  fprintf(OutFilePtr,"%c",Word[I]);
   137  }
   138  
   139  
   140  main()
   141  
   142  {  GetFile();
   143     CurrInFilePlace = 0;
   144     /* keep alternating this cycle:  copy blanks and end-of-line chars
   145        to the output file, then get, convert and write the next word */
   146     while (CurrInFilePlace < InFileSize)  {
   147        WriteBlanksAndEOLs();
   148        if (GetWord())  {
   149           if (Word[0] > '0' && Word[0] <= '9' && WordLength == 1) 
   150              ConvertWord();
   151           WriteWord();
   152        }
   153     }
   154  }
   155  
   156

Analysis:

Line 10. This tells the C compiler (actually, the C preprocessor) to include C source code from the file stdio.h. Because the file name is enclosed in < >, the file will be searched for in certain standard directories, e.g. /usr/include on most systems. (You should take a look at the file.) It contains the definitions needed for the file manipulation we will be doing.

Lines 26-27. Here we are declaring two pointers to a new type called FILE. That type is defined in the #include file. We will discuss pointers soon, but for the time being just take it on faith that we need them here.

Line 35. Here we are using the fscanf() function, which is just like scanf(), except that it can be used on any file, not just the standard input. It takes one additional argument, placed first, which is the pointer to that file, in this case InFilePtr. Note that we keep looping until fscanf() tells us that we have reached the end of the file, just as with scanf() in earlier examples. It does so by returning the value EOF, a constant defined in stdio.h; EOF is -1 on most systems, which is why the comparison with -1 works here.
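The same loop can be tried in isolation. The following is a minimal sketch, not part of the program above: the function name CountChars is hypothetical, and the loop tests for a successful read (a return value of 1) rather than comparing with the literal -1, so it does not depend on the particular value of EOF.

```c
#include <stdio.h>

/* CountChars is a hypothetical helper, not part of the program above.
   It returns the number of characters in the named file, or -1 if
   the file cannot be opened. */
long CountChars(const char *FileName)
{  FILE *FilePtr;
   char C;
   long Count = 0;

   FilePtr = fopen(FileName,"r");
   if (FilePtr == NULL) return -1;
   /* fscanf() returns the number of items it stored (1 here), or EOF
      once the end of the file has been reached */
   while (fscanf(FilePtr,"%c",&C) == 1) Count++;
   fclose(FilePtr);
   return Count;
}
```

A call such as CountChars(InFileName) would give the same value that ReadInFile() leaves in InFileSize, assuming the file can be opened.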

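Lines 50 and 52. Here we see the %s format, for reading in character strings, just as %c format reads in a character and %d reads in an integer. With %s, scanf() skips any leading blanks, reads characters up to the next blank or end-of-line, and stores them in the array, followed by a terminating null character. We could instead have used a loop, reading the characters one at a time with %c format, but %s is much more convenient. Be aware that %s does not check the size of the array: a file name longer than 19 characters would overflow the 20-character arrays declared on Lines 14-15.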
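One common way to guard against an overly long name is to put a maximum field width in the format. The sketch below assumes a 20-character array, as in the program; the function name ReadName is hypothetical, and it uses fscanf() so that it can read from any file, including the standard input.

```c
#include <stdio.h>

/* ReadName is a hypothetical helper, not part of the program above.
   It reads one blank-delimited word from the file In into Name, which
   must have room for at least 20 characters.  It returns 1 if a word
   was read, 0 otherwise. */
int ReadName(FILE *In, char Name[20])
{
   /* the width 19 leaves room for the terminating null character */
   return fscanf(In,"%19s",Name) == 1;
}
```

With this helper, ReadName(stdin,InFileName) would read the name typed by the user, storing at most 19 characters of it.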

Line 53. Before accessing a file, we must open it, using the fopen() function. Here we are opening the file whose name we have stored in the array InFileName, and we intend to read from it, which is why we have specified "r". The fopen() function returns a pointer, which we assign to InFilePtr; this is how the pointer gets associated with the file. From that point on, all access to this file is done via the pointer InFilePtr (as in Line 35).

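footnote: Just as scanf() returns a value which can be used for error checking, so does fopen(). If the latter returns a null pointer (the constant NULL from stdio.h, which compares equal to 0), this means that it was not able to open the file. The program above does not make this check.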
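As an illustration of that check, here is a minimal sketch. The function name OpenOrComplain and the wording of the message are hypothetical, not part of the program above.

```c
#include <stdio.h>

/* OpenOrComplain is a hypothetical helper, not part of the program
   above.  It opens the named file in the given mode and prints a
   message if the open fails.  The caller must still test the
   returned pointer before using it. */
FILE *OpenOrComplain(const char *FileName, const char *Mode)
{  FILE *FilePtr;

   FilePtr = fopen(FileName,Mode);
   if (FilePtr == NULL)
      fprintf(stderr,"cannot open %s\n",FileName);
   return FilePtr;
}
```

In GetFile(), one could then write InFilePtr = OpenOrComplain(InFileName,"r"); and skip the call to ReadInFile() if InFilePtr turns out to be NULL.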

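Line 54. Same as Line 53, but with write access, specified by "w". If the output file does not exist, it is created; if it already exists, its old contents are discarded.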

Line 69. || is the C operator for logical or.

Line 70. fprintf() is just like printf(), except that it can write to any file, not just the standard output. Its first argument is the pointer to that file, here OutFilePtr. See the remarks on fscanf() and scanf() above.
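To show that fprintf() accepts the same formats as printf(), here is a small sketch. The function name WriteSummary and the wording of the report are hypothetical, not part of the program above.

```c
#include <stdio.h>

/* WriteSummary is a hypothetical helper, not part of the program
   above.  It writes a one-line report to whichever file Out
   points to. */
void WriteSummary(FILE *Out, const char *FileName, int Size)
{
   fprintf(Out,"%s has %d characters\n",FileName,Size);
}
```

Calling WriteSummary(stdout,InFileName,InFileSize) would print the report on the screen, exactly as printf() would, since stdout is the pointer that stdio.h provides for the standard output.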

Lines 107-128. A C switch statement is similar to a Pascal case statement, with one important difference: in the C version, once the matching case has been reached, execution continues on through all the cases that follow it. For example, suppose Word[0] is '3'. If we had not put in the break (Line 113), the code for the cases '4', '5', etc. would be executed too! So the break is crucial; it says, "Now leave this switch statement, and go to the first statement following the end of the switch." The last case, '9' (Lines 126-127), needs no break, since nothing follows it.
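The effect of leaving out the break can be seen in the following sketch. The function name CountCasesRun is hypothetical, not part of the program above; the breaks have been omitted on purpose so that each case runs on into the next.

```c
#include <stdio.h>

/* CountCasesRun is a hypothetical helper, not part of the program
   above.  It returns how many case bodies are executed when the
   switch is entered with the character C.  The breaks are omitted
   deliberately. */
int CountCasesRun(char C)
{  int Count = 0;

   switch (C)  {
      case '1': Count++;   /* falls through - no break here */
      case '2': Count++;   /* falls through - no break here */
      case '3': Count++;
   }
   return Count;
}
```

Here CountCasesRun('1') returns 3, because all three cases are executed, while CountCasesRun('3') returns 1. A character matching none of the cases, such as '4', skips the switch body entirely, so the result is 0.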


Norm Matloff
Wed Nov 8 17:32:58 PST 1995