CMSC-202, Section 1 Computer Science II for Majors
Fall 1997 28 August 1997 Homework 1

Assigned: Tuesday, 2 Sep
Due Date: Tuesday, 16 Sep
Late Date: Thursday, 18 Sep

Introduction

This assignment has these main objectives:

You are to write a C program that searches a file for words that contain occurrences of given strings at given positions. The program proceeds as follows:
  1. It requests (from stdin) the name of the file, the ``dictionary,'' that contains the list of words. If that file cannot be opened, the program terminates with an error message to stderr.
  2. It reads and stores the list of words from the file, then closes the file. Invalid words are ignored.
  3. It requests (from stdin) the maximum number of matching words to be displayed.
  4. It repeatedly requests (from stdin) strings and positions to be found, writing (to stdout) all the words in the dictionary that ``match.'' The output is limited to the number of words given in step 3. If there are additional matching words, the number of excess matches is reported.
In step 4, your program repeatedly asks the user for strings and positions. The user responds either by hitting ^D (for end-of-file) to terminate the program or by typing a line containing up to five whitespace-separated pairs of strings. For each pair, the first string is the ``pattern'' (maximum three characters) and the second string is an indication of the position of the pattern in the word (maximum two characters). Position can be indicated by any of the following: a positive integer (1 to 99), *, or +, as the error messages in the sample run in Section 11 indicate.

The dictionary file will be an ASCII file containing one word per line with no extraneous white space. Words that contain numbers, upper-case characters, or non-alphabet characters are to be ignored. Words longer than 30 characters are to be ignored. No more than 19,000 valid words are to be read from the file. The file /usr/lib/dict/words on the UCS gl machines (umbc8, umbc9, umbc10, and the SGI workstations) is the file on which your program will be tested. It makes sense for you to use it as your dictionary, too.

A sample run of the program is given in Section 11. Your output does not have to be formatted exactly the same as that shown in the sample, but should be in a similar style. Note that the program does not exit when a blank line is entered. It exits on EOF (entering ^D).
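
The dictionary rules above (lower-case letters only, at most 30 characters) can be checked with a small helper. The following is a minimal sketch, not part of the handout: the names is_valid_word and MAX_WORD_LEN are hypothetical, and treating an empty line as invalid is an assumption. It relies on the dictionary being ASCII, as stated above.

```c
#include <stddef.h>

#define MAX_WORD_LEN 30   /* longest word accepted, per the handout */

/* Returns 1 if word is non-empty, at most MAX_WORD_LEN characters long,
   and made up only of the letters 'a'..'z'; returns 0 otherwise.
   Rejecting the empty string is an assumption of this sketch. */
int is_valid_word(const char *word)
{
    size_t len = 0;

    for (; word[len] != '\0'; len++) {
        if (len >= MAX_WORD_LEN)
            return 0;               /* a 31st character exists: too long */
        if (word[len] < 'a' || word[len] > 'z')
            return 0;               /* digit, upper case, or punctuation */
    }
    return len > 0;
}
```

A word-loading loop could call this on each line (after stripping the trailing newline) and store only the words for which it returns 1.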

Error Checking

The program should terminate after issuing an error message if the dictionary file cannot be opened (step 1 of the Introduction).
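
One way to handle this case is sketched below. The function name open_dictionary and the wording of the message are hypothetical; only the behavior (message to stderr, then termination) comes from the assignment.

```c
#include <stdio.h>
#include <stdlib.h>

/* Opens the named file for reading. On failure, reports the problem
   on stderr and terminates the program with a failure status. */
FILE *open_dictionary(const char *filename)
{
    FILE *fp = fopen(filename, "r");

    if (fp == NULL) {
        fprintf(stderr, "Cannot open dictionary file %s\n", filename);
        exit(EXIT_FAILURE);
    }
    return fp;
}
```

Because the function never returns NULL, the caller can use the result without a further check.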

Error messages (but not termination of the program) are to be issued for other invalid input:

Note that in step 4, the program is to terminate on end-of-file (^D), not on null input. Null input is not to be considered an error.
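
The distinction between end-of-file and null input can be made when the line is read. The sketch below is one possible approach, not the required one: the names read_request and line_status are hypothetical, and it assumes each request fits in the caller's buffer.

```c
#include <stdio.h>
#include <string.h>

enum line_status { LINE_EOF, LINE_BLANK, LINE_TEXT };

/* Reads one line from in into buf (capacity size).
   Returns LINE_EOF at end-of-file (or on a read error), LINE_BLANK if
   the line holds only whitespace, and LINE_TEXT otherwise. */
enum line_status read_request(FILE *in, char *buf, size_t size)
{
    if (fgets(buf, (int)size, in) == NULL)
        return LINE_EOF;
    /* strspn gives the length of the leading run of whitespace;
       if that run reaches the terminator, the line is blank */
    if (buf[strspn(buf, " \f\n\r\t\v")] == '\0')
        return LINE_BLANK;
    return LINE_TEXT;
}
```

With this shape, the request loop ends on LINE_EOF, simply prompts again on LINE_BLANK, and parses the pairs only on LINE_TEXT.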

Program Modularity

You must do this homework using multiple files. You may have as many files as you wish, and you may name them as you wish. The main function should appear (with little else) in its own file (suggestion: name it hw1.c). Other functions must be declared in one or more interface files (suggestion: name them something like hw1_aux.h) and be defined in corresponding implementation files (suggestion: name them something like hw1_aux.c). The interface files must be guarded. It is perfectly ok to have just one interface file and one corresponding implementation file.

A sample main file is given in Section 9. Note that it is short and organized to ``tell the story'' of what the program does.

The interface file hw1_aux.h is given in Section 10. It contains the declarations for the functions called in the main file. Note the commenting style. Note the guarding of the interface file. Note the judicious use of #include. You may use these files, as is, if you wish.
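
For readers unfamiliar with guarding, the general shape of a guarded interface file is sketched below. This is not the course's hw1_aux.h: the guard macro name and the function load_words are hypothetical, chosen only to illustrate the pattern.

```c
/* hw1_aux.h (sketch): a guarded interface file */
#ifndef HW1_AUX_H
#define HW1_AUX_H

#include <stdio.h>   /* for FILE, used in the declaration below */

/* Hypothetical declaration for illustration only; the definition
   would live in the corresponding implementation file. */
int load_words(FILE *dictionary);

#endif /* HW1_AUX_H */
```

The guard makes a second #include of the same file expand to nothing, so the declarations are seen only once per compilation.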

Style Issues

Please refer to the handout ``Guidelines for Programs'' for guidelines on style and honesty issues. It is very important to follow this guidance. You will lose very substantial credit if you do not follow the required style.

Producing a Script

You will be submitting a script of an execution of your program. To make a script, enter script. A file named typescript will be created containing your session. Execute your program (by entering hw1), and enter some data. Be sure to enter at least as much data as is given in the sample script in Section 11. Don't over-do it. When you are finished, exit your program (by entering ^D), then exit the script by entering exit. The file typescript is now ready to submit.

Makefile

Write a makefile that correctly compiles your homework. You may name your files as you wish (suggestions were given above), but the executable must be named hw1. Your makefile will be used to compile your submitted code, so be sure it works. A simple makefile which just does the compilation is fine.

Submitting the Homework

You must submit the following files:

  1. the ``main function'' file (hw1.c),
  2. all your interface files (for example, hw1_aux.h),
  3. all your implementation files (for example, hw1_aux.c),
  4. your makefile, and
  5. a script (typescript) of your program execution.

To submit the files, use the submit program. For example, to submit the files hw1.c and hw1_aux.c, enter submit cs202_01 hw1 hw1.c hw1_aux.c

You can check your submission by entering submitls cs202_01 hw1

Due Date

The homework is due on Tuesday, 16 Sep. Submittals received by midnight of the due date will receive 5 bonus credits. There is an automatic extension of two days, so submittals received by midnight of Thursday, 18 Sep will receive full credit. No submittals will be accepted after Thursday, 18 Sep.

Do not wait till the last minute to submit your assignment. The automatic extension is given to accommodate unforeseen problems such as a machine or network crash. If you have not finished your assignment by the due date, please submit your work for partial credit. You can use the automatic extension to submit further work.

Sample Main File

 

/*
  hw1.c
  Main function for HW-1, CMSC-202, Fall 1997
  Thomas A. Anastasio
  Created: 24 July 1997
  Current: 28 August 1997
*/

     CODE DELETED -- YOU MUST TYPE IT IN IF YOU WANT TO USE IT

Sample Interface File

 

/* 
   hw1_aux.h 
   Interface file for HW-1,  CMSC-202, Fall 1997
   Thomas A.  Anastasio
   Created: 24 July 1997
   Current: 28 August 1997
*/

     CODE DELETED -- YOU MUST TYPE IT IN IF YOU WANT TO USE IT

A Sample Run

 

Script started on Sat Jul 26 17:08:39 1997
umbc10[101] hw1
Please enter filename for dictionary: /usr/lib/dict/words
Number of matches to print: a
 >>Must be an integer
Number of matches to print: -1
 >>Must be a positive integer
Number of matches to print: 8
Enter up to 5 pattern-position pair(s): a 1 zi * 
  azimuth
  azimuthal
Enter up to 5 pattern-position pair(s): zi *
  azimuth
  azimuthal
  bilharziasis
  brazier
  buzzing
  kibbutzim
  magazine
  muezzin
  12 additional matching words were found
Enter up to 5 pattern-position pair(s): zi * az *
  azimuth
  azimuthal
  brazier
  magazine
  palazzi
Enter up to 5 pattern-position pair(s): a   2 zi  *   ne  +
  magazine
Enter up to 5 pattern-position pair(s): k  1    ro    *
  kangaroo
  kerosene
  klystron
Enter up to 5 pattern-position pair(s): k 1 r * n + l 2
  klystron
Enter up to 5 pattern-position pair(s): t 1 ax 2
  tax
  taxation
  taxi
  taxicab
  taxied
  taxiway
  taxonomic
  taxonomy
  2 additional matching words were found
Enter up to 5 pattern-position pair(s): b 1 ax *
  biaxial
  borax

Enter up to 5 pattern-position pair(s): ax +
  ax
  borax
  climax
  coax
  flax
  lax
  minimax
  parallax
  6 additional matching words were found
Enter up to 5 pattern-position pair(s): a 1 b 2 e 3 f 4
  No word matches
Enter up to 5 pattern-position pair(s): a 1 b * c + d 2 e * f 3
 >>Maximum of 5 pairs allowed
Enter up to 5 pattern-position pair(s): abcd 1
 >>Maximum pattern size is three characters
Enter up to 5 pattern-position pair(s): ab  \#
 >>Each position must be positive int (1..99), *, or +
Enter up to 5 pattern-position pair(s): ab 100
 >>Each position must be positive int (1..99), *, or +
Enter up to 5 pattern-position pair(s): ab -1
 >>Each position must be positive int (1..99), *, or +
Enter up to 5 pattern-position pair(s): ab 0
 >>Each position must be positive int (1..99), *, or +
Enter up to 5 pattern-position pair(s): 
Enter up to 5 pattern-position pair(s): a 1 zi * 
  azimuth
  azimuthal
Enter up to 5 pattern-position pair(s): ^D
That's all folks
umbc10[102] exit
script done on Sat Jul 26 17:14:10 1997
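
The sample run above suggests how positions behave: a number n appears to mean that the pattern starts at character n (counting from 1), * that it occurs anywhere in the word, and + that the word ends with it (for example, ``a 2 zi * ne +'' yields magazine). The sketch below tests one pattern-position pair under that reading. The reading itself is inferred from the sample, and the names matches_at, POS_ANYWHERE, and POS_AT_END are hypothetical.

```c
#include <string.h>

#define POS_ANYWHERE 0    /* hypothetical encoding of '*' */
#define POS_AT_END  (-1)  /* hypothetical encoding of '+' */

/* Returns 1 if pattern occurs in word at the requested position:
     pos >= 1       pattern starts at character pos (counting from 1)
     POS_ANYWHERE   pattern occurs somewhere in word
     POS_AT_END     word ends with pattern
   Returns 0 otherwise, including for any other value of pos. */
int matches_at(const char *word, const char *pattern, int pos)
{
    size_t wlen = strlen(word);
    size_t plen = strlen(pattern);

    if (plen > wlen)
        return 0;                       /* pattern cannot fit at all */
    if (pos == POS_ANYWHERE)
        return strstr(word, pattern) != NULL;
    if (pos == POS_AT_END)
        return strcmp(word + (wlen - plen), pattern) == 0;
    if (pos < 1 || (size_t)pos - 1 + plen > wlen)
        return 0;                       /* would run past end of word */
    return strncmp(word + (pos - 1), pattern, plen) == 0;
}
```

A word would then match a request line when matches_at returns 1 for every pair on that line.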

Note on ``whitespace-separated'': the standard whitespace characters are space, form feed, newline, carriage return, horizontal tab, and vertical tab.



Thomas A. Anastasio
Thu Aug 28 20:10:47 EDT 1997