Handling Binary Files for Sequential and Random Access

Richard Kay, 14 February 2000

1. Reading in binary mode: fread()

Prototype: size_t fread(void *ptr, size_t size, size_t nobj, FILE *stream);

fread() returns the number of objects (not bytes) successfully read from the file *stream. A return value smaller than nobj means that either end of file was reached or a read error occurred; feof() and ferror() tell the two apart. When reading one object at a time, end of file can therefore be detected by testing for a return value of 0. ptr is a pointer holding the address of the allocated storage (e.g. the address of the start of a structure record) into which the data read from the file will be stored. size is the size in bytes of each object to be read and nobj is the number of objects to be read. It follows that up to size*nobj bytes will be read and stored in the block whose starting address ptr holds. size_t is the data type of the size and nobj parameters and of the return value; it is an unsigned integer type defined using a typedef in the standard headers (including stdio.h) for portability reasons.

e.g. 1 with known file size:

FILE *in;
int ia[10];
in=fopen("ints.bin","rb"); /* read and binary mode */
fread(ia,sizeof(int),10,in); /* reads 10 integers into ia[] */

e.g. 2 with unknown file size:

FILE *in;
int ia[100],i=0; /* the loop stops after 100 ints even if the file holds more */
in=fopen("ints.bin","rb"); /* read and binary mode */
while(i<100 && fread(ia+i,sizeof(int),1,in)==1)
{
  /* reads integers one at a time into ia[] until EOF */
  printf("integer %d is %d\n",i+1,ia[i]);
  i++; /* incremented separately: using i and i++ in one call is undefined */
}
printf("there were %d integers in the file\n",i);
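
The fragments above do not check whether the file opened or whether a read failed. A minimal sketch of how those checks might look is given below; the function name read_ints() and its return convention are assumptions made for illustration, not part of the standard library.

```c
#include <stdio.h>

/* Hypothetical helper: reads up to max ints from the named binary file
   into buf. Returns the number of ints read, or -1 if the file could not
   be opened or a read error occurred. */
long read_ints(const char *name, int *buf, size_t max)
{
    FILE *in = fopen(name, "rb");
    size_t n;
    int failed;

    if (in == NULL)
        return -1;              /* file missing or not readable */

    /* One call asks for max objects; fread() reports how many complete
       objects it actually stored, which may be fewer at end of file. */
    n = fread(buf, sizeof(int), max, in);

    failed = ferror(in);        /* non-zero means a read error, not EOF */
    fclose(in);
    return failed ? -1 : (long)n;
}
```

A caller could write long n = read_ints("ints.bin", ia, 100); and treat a negative result as a failure. Reading the whole block in one call also avoids the per-integer loop when the values do not need to be printed as they arrive.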

2. Writing in binary mode: fwrite()

Prototype: size_t fwrite(const void *ptr, size_t size, size_t nobj, FILE *stream);

fwrite() is the "mirror" function of fread() and has the same parameters and return type. Instead of copying a block of data from a file into a memory area, it copies a memory area to the binary file. It returns the number of objects written; a value smaller than nobj indicates a write error. e.g.

int ia[]={2,4,8,16};
FILE *out=fopen("ints.bin","wb"); /* write and binary mode */
fwrite(ia,sizeof(int),sizeof(ia)/sizeof(int),out);
  /* writes contents of ia[] to binary file */
fclose(out); /* flushes any buffered data to the file */
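
As with reading, the return values are worth checking when writing. The sketch below wraps the same steps in a function; the name write_ints() and its 0 / -1 return convention are assumptions chosen for this example.

```c
#include <stdio.h>

/* Hypothetical helper: writes count ints from buf to a new binary file,
   replacing any existing file of that name.
   Returns 0 on success and -1 on any failure. */
int write_ints(const char *name, const int *buf, size_t count)
{
    FILE *out = fopen(name, "wb");
    size_t written;

    if (out == NULL)
        return -1;              /* could not create the file */

    written = fwrite(buf, sizeof(int), count, out);

    /* fclose() flushes buffered data, so its result matters too */
    if (fclose(out) != 0 || written != count)
        return -1;
    return 0;
}
```

With the array from the example above, the call would be write_ints("ints.bin", ia, sizeof(ia)/sizeof(int)). Note that a file written this way holds the machine's own representation of int, so it may not be readable on a machine with a different int size or byte order.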

3. Positioning within the file

The above functions can be used for sequential file access. When reading, the file "cursor" is automatically positioned at the next record to be read. When writing, the cursor is positioned at the next record to be written: this starts at the beginning of the file in overwrite or new file mode, and at the end of the existing file in append mode. However, files may also be accessed randomly. This can be faster and more efficient when handling very large files, particularly where access is needed to only small amounts of data compared to the size of the file. Random access will typically require some kind of index structure, possibly stored in a separate file or at the start of a large file, to indicate the file offsets at which the records for various keys are stored.

prototype: void rewind(FILE *stream);

 
rewind() repositions the cursor to the start of the file and clears the end-of-file and error indicators for the stream.
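
One common use of rewind() is to make two passes over the same file. The following sketch counts the integers in a file and then goes back to add them up; the function name count_and_sum() is hypothetical, and the stream is assumed to be open in binary read mode and positioned at the start.

```c
#include <stdio.h>

/* Hypothetical helper: makes two passes over an open binary stream of ints.
   Pass 1 counts the ints; rewind() then returns to the start so that
   pass 2 can add them up. Returns the count and stores the total in *sum.
   Assumes the stream is positioned at the start of the file on entry. */
long count_and_sum(FILE *in, long *sum)
{
    int value;
    long count = 0;

    while (fread(&value, sizeof(int), 1, in) == 1)
        count++;

    rewind(in);                 /* back to the first byte of the file */

    *sum = 0;
    while (fread(&value, sizeof(int), 1, in) == 1)
        *sum += value;

    return count;
}
```

Because rewind() also clears the end-of-file indicator set by the first pass, the second loop can read normally.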
 
prototype: int fseek(FILE *stream, long offset, int origin);
 
origin is an integer which must be one of the constants defined in stdio.h: SEEK_SET (start of file), SEEK_CUR (current position in the file) or SEEK_END (end of file). offset is a long integer giving the number of bytes relative to origin; it may be negative with SEEK_CUR and SEEK_END. fseek() returns 0 on success and a non-zero value on failure.
 
e.g.
 
fseek(in,100L,SEEK_SET); /* positions file cursor 100 bytes from the start of file */
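
When every record in the file has the same size, the offset of record n is simply n times the record size, so no separate index is needed. The sketch below illustrates this; struct record and read_record() are hypothetical names invented for the example, and the file is assumed to contain nothing but such records written with fwrite().

```c
#include <stdio.h>

/* Hypothetical record layout, used only for this sketch */
struct record {
    int  key;
    char name[20];
};

/* Reads record number n (counting from 0) from an open binary stream.
   Returns 1 on success, 0 if the record could not be read. */
int read_record(FILE *in, long n, struct record *rec)
{
    /* byte offset of record n from the start of the file */
    long offset = n * (long)sizeof(struct record);

    if (fseek(in, offset, SEEK_SET) != 0)
        return 0;               /* fseek() returns non-zero on failure */

    return fread(rec, sizeof(struct record), 1, in) == 1;
}
```

Asking for a record beyond the end of the file is not an error for fseek() itself; it shows up as fread() returning 0, which is why the result of fread() is what the function reports.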
 
prototype: int fgetpos(FILE *stream, fpos_t *ptr);
 
fgetpos() records the current position of the file in *ptr for subsequent use by fsetpos().
 
prototype: int fsetpos(FILE *stream, const fpos_t *ptr);
 
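fsetpos() can be used to reposition the file at a place previously recorded by fgetpos(), using the pointer ptr to data of type fpos_t, a type defined in stdio.h for this purpose. An fpos_t value should be treated as opaque: it is not necessarily an integer, so it should only be obtained from fgetpos() and passed back to fsetpos(). Both functions return 0 on success and a non-zero value on failure.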
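
A typical use of the pair is to "bookmark" a position, read ahead, and then return. The sketch below looks at the next integer in a file without consuming it; peek_int() is a hypothetical name used only for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: looks at the next int in the stream without
   consuming it. Returns 1 and stores the int in *value on success,
   0 otherwise. */
int peek_int(FILE *in, int *value)
{
    fpos_t mark;
    int ok;

    if (fgetpos(in, &mark) != 0)    /* remember where we are */
        return 0;

    ok = (fread(value, sizeof(int), 1, in) == 1);

    if (fsetpos(in, &mark) != 0)    /* go back to the remembered place */
        return 0;

    return ok;
}
```

The same effect could be had with ftell() and fseek(), but fgetpos() and fsetpos() are not limited to positions that fit in a long.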