CMSC313, Computer Organization & Assembly Language Programming, Spring 2013
 Project 2: An Error-Correcting Code 
Due: Tuesday February 26, 2013 11:59pm
Objective
The objective of this programming project is for you to gain some
familiarity with the bit manipulation instructions in assembly language
programming.
Background
In Project 1, we saw that ISBN codes can detect some simple
typographical errors.  However, there is not much we can do after we
have detected the error.  An error-correcting code can fix errors, not
just detect them.
In this project, we will use a 31-bit Hamming code that can correct a 1-bit
error in each 32-bit codeword. Each 32-bit codeword encodes 3 bytes of the
original data. The format of the codeword is on the
Project 2 Codeword Format page.
Assignment
Write an assembly language program that encodes the input file using the
codeword format described on the Project 2 Codeword Format page. Your
program should read from standard input and write to standard output. We
can use Unix redirection to read from and write to files:
	./a.out < ifile > ofile
Some details:
-  Although it is terribly inefficient, your program should read 3
   bytes in each system call to READ and write 4 bytes in each system
   call to WRITE.

-  You may assume that when the operating system returns with 0 bytes
   read, the end of the input file has been reached. You may also
   assume that if fewer than 3 bytes are read, then those bytes are the
   last bytes in the file. (These are not foolproof assumptions
   in "real life".)

-  The 32-bit codewords must be written out in little-endian format.
   (You don't have to do anything special for this to happen. It is
   the "normal" thing on a little-endian CPU.)
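
The I/O details above can be sketched as a read/encode/write loop. This is only a minimal sketch, assuming 32-bit Linux system calls through int 0x80 and NASM syntax. The labels buf, codeword, read_loop and done are hypothetical names, and the encoding step is left as a placeholder because it depends on the codeword format.

```nasm
; Sketch only: assumes 32-bit Linux, NASM syntax, int 0x80 system calls.
; buf, codeword, read_loop and done are hypothetical names.

        section .bss
buf:      resb  4               ; 3 input bytes plus 1 byte of padding
codeword: resd  1               ; one 32-bit codeword to write out

        section .text
        global  _start
_start:
read_loop:
        mov     dword [buf], 0  ; clear bytes left over from the previous block
        mov     eax, 3          ; system call number for READ
        mov     ebx, 0          ; file descriptor 0 = standard input
        mov     ecx, buf        ; address of the input buffer
        mov     edx, 3          ; read at most 3 bytes
        int     0x80            ; EAX = number of bytes actually read
        cmp     eax, 0
        jle     done            ; 0 = end of file, negative = error

        ; ... build the codeword from [buf], [buf+1], [buf+2] here.
        ;     EAX tells you whether this was a short (last) block.
        ;     Store the finished codeword in [codeword] ...

        mov     eax, 4          ; system call number for WRITE
        mov     ebx, 1          ; file descriptor 1 = standard output
        mov     ecx, codeword   ; address of the 4 bytes to write
        mov     edx, 4          ; write exactly 4 bytes
        int     0x80
        jmp     read_loop

done:
        mov     eax, 1          ; system call number for EXIT
        mov     ebx, 0          ; exit status 0
        int     0x80
```

Clearing buf before each READ is a choice made for this sketch, so that any bytes not filled by a short read are 0. Whether that is what the last codeword requires is determined by the codeword format, not by this sketch.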
Two programs, decode and corrupt,
are provided in the GL file system in the directory:
   /afs/umbc.edu/users/c/h/chang/pub/cs313
Copy these programs to your own directory.  They can be used to decode
an encoded file and to corrupt an encoded file. You can use these
programs to check if your program is working correctly. Both programs
use I/O redirection.
Record some sample runs of your program using the Unix script
command. You should show that you can encode a file using your program,
then decode it and obtain a file that is identical to the original. Use the
Unix diff command to compare the original file with the decoded
file. You should also show that this works when the file is corrupted.
For example:
   linux2% ./a.out < test_file > encoded_file
   linux2% ./decode < encoded_file > decoded_file
   linux2% diff decoded_file test_file
   linux2% ./corrupt < encoded_file > corrupted_file
   linux2% diff encoded_file corrupted_file
   Binary files encoded_file and corrupted_file differ
   linux2% ./decode < corrupted_file > decoded_file2
   linux2% diff decoded_file2 test_file
Extra Credit
For 10 points extra credit, revise your program so that it reads at
least 200 bytes during each system call to READ (if that many bytes are available). Your
program must also write at least 200 bytes for each system call to
WRITE. (Note: it is advantageous to you if the number of bytes you read
is a multiple of 3.) You will need an inner loop to process 3-byte
blocks of the input you have read.
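
One possible shape for that inner loop is sketched here. It assumes 32-bit Linux and NASM syntax. The buffer sizes, labels and register assignments are hypothetical choices (300 is used only because it is a multiple of 3 that is at least 200), and the encoding step is left as a placeholder.

```nasm
; Sketch only: all names and sizes below are hypothetical.
INSIZE  equ     300             ; bytes requested per READ (a multiple of 3)
OUTSIZE equ     400             ; 4 output bytes for every 3 input bytes

        section .bss
inbuf:  resb    INSIZE
outbuf: resb    OUTSIZE

        section .text
        ; ... after a READ into inbuf that returned EAX > 0 bytes ...
        mov     esi, inbuf      ; ESI = address of the next 3-byte input block
        mov     edi, outbuf     ; EDI = address of the next 4-byte output slot
        mov     ecx, eax        ; ECX = input bytes still to process
block_loop:
        cmp     ecx, 3
        jl      short_block     ; fewer than 3 bytes are left
        ; ... encode [esi], [esi+1], [esi+2] into EAX here ...
        mov     [edi], eax      ; store the codeword in the output buffer
        add     esi, 3          ; advance past the 3 input bytes
        add     edi, 4          ; advance past the 4 output bytes
        sub     ecx, 3
        jmp     block_loop
short_block:
        ; ... handle 0, 1 or 2 leftover bytes, then WRITE
        ;     (EDI - outbuf) bytes starting at outbuf ...
```

Under the assumptions stated in the assignment, leftover bytes can only occur in the last block of the file, so the short_block case is reached with a nonzero count at most once.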
As stated previously, the extra credit policy for this class is that
extra credit is only given for programs that are mostly correct. A
half-hearted attempt at extra credit that doesn't really work will
receive 0 extra credit points. (This is to have you concentrate on the
regular portion of the assignment.)
Implementation Notes
   
   
-  Your program should not prompt the user for input, since
   we will be using Unix redirection.
   
    
-  Pay attention to the byte order both for input and for output.
   In the Project 2 Codeword Format, a0 – a7 is the first byte of
   the 3-byte input block.
    
   
    
-  The parity flag PF is set to 1 if the result of an instruction
   contains an even number of 1's. Unfortunately, PF only looks at the
   lowest 8 bits of the result. For this project, you will need to compute
   32-bit parities. Here's a simple way to compute the parity of the EAX
   register.
   
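	    mov     ebx,eax         ; save a copy of all 32 bits
	    shr     eax,16          ; AX = upper 16 bits
	    xor     ax,bx           ; fold: AX = upper 16 bits XOR lower 16 bits
	    xor     al,ah           ; fold again: parity of AL = parity of all 32 bits
	    jp      even_label      ; PF = 1 means an even number of 1's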
   
   Note that the EAX and EBX registers are modified in this process, so
   you may need to use different registers.
   
    
-  When you compute the value of a parity bit (see the Project 2
   Codeword Format page), only 16 bits of the 32-bit codeword are
   involved. You should use the AND instruction to mask out the 16 bits
   that you don't care about. (Make a copy, of course.)
   
    
-  Most assembly language instructions we are using require that
   their operands have the same number of bits. For example, you cannot
   OR a 32-bit register with an 8-bit register.
   
    
-  Take advantage of the fact that some 8-bit portions of the 32-bit
   general purpose registers have names. For example:
   
            mov     ebx, 0
            mov     bl, [buf]
   
   will copy the byte at address buf into the lowest 8 bits of
   the EBX register and clear the top 24 bits.
   
    
-  Yes, you can add constants to labels like this:
   
            mov     al, [buf+1]
            mov     al, [buf+2]
   
   (This is not an indexed addressing mode. Addresses like
   buf+2 are resolved by the loader.)
   
    
-  A single OR instruction can be used to set a single bit in a
   register. For example, to set bit 5 of the EBX register to 1, use
   the instruction
   
            or     ebx, 0x00000020
   
   This assumes that the bits are numbered 0 (least significant) through 31
   (most significant).
   
    
-  When you write 4 bytes to the output, you must store the 4
   bytes in memory somewhere (you decide where). The WRITE system call
   only writes from memory locations (and definitely will not write from a register).
   
    
-  The last 32-bit word output by your program requires special
   handling since the bits m1 and m0 must be encoded. Since these
   bits are also involved in the computation of the parity bits,
   the bits m1 and m0 must be set before you compute the parity
   bits p4, p3, p2, p1 and p0. 
   
    
-  The UNIX octal dump program is useful to see the contents
   of a file in hexadecimal. The name of the command is od. To 
   see the file foo in hexadecimal as 4-byte words use:
   
        od -t x4 foo
   
   To see the file foo in hexadecimal as 1-byte words use:
   
        od -t x1 foo
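
Several of the notes above (copy and mask with AND, fold to get a 32-bit parity, set a bit with OR, store the result in memory for WRITE) can be combined as in the sketch below. It is only an illustration: MASK_P and BIT_P are placeholder values, not the real mask or parity-bit position, which come from the Project 2 Codeword Format page. The labels, the register choices and the rule "set the bit when the count of 1's is odd" are likewise assumptions made for this sketch.

```nasm
; Sketch only. MASK_P and BIT_P are placeholders, NOT the real values.
MASK_P  equ     0x55555555      ; hypothetical: 16 bits covered by one parity bit
BIT_P   equ     0x00000002      ; hypothetical: position of that parity bit

        section .bss
codeword: resd  1               ; 4 bytes of memory for the WRITE system call

        section .text
        ; Assume EDX holds the codeword so far, with this parity bit still 0.
        mov     eax, edx        ; work on a copy so EDX is preserved
        and     eax, MASK_P     ; keep only the bits this parity bit covers
        mov     ebx, eax        ; fold 32 bits down to 8, as in the parity note
        shr     eax, 16
        xor     ax, bx
        xor     al, ah          ; PF = 1 if the masked bits hold an even number of 1's
        jp      p_done          ; assumed rule: leave the parity bit 0 when even
        or      edx, BIT_P      ; otherwise set the parity bit with a single OR
p_done:
        mov     [codeword], edx ; store in memory (low byte first) for WRITE
```

Because EAX and EBX are overwritten by the fold, the codeword itself is kept in a third register (EDX here) until it is stored.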
   
Turning in your program
Use the UNIX submit command on the GL system to turn in your project. You
should submit two files: 1) the assembly language program and 2) the
typescript file of sample runs of your program. The class name for submit
is cs313. The assignment name is proj2.
The UNIX command to do this should look something like:
        submit cs313 proj2 encode.asm typescript
Last Modified: 22 Jul 2024 11:28:12 EDT by Richard Chang