CMSC313, Computer Organization & Assembly Language Programming, Spring 2013
Project 2: An Error-Correcting Code
Due: Tuesday February 26, 2013 11:59pm
Objective
The objective of this programming project is for you to gain some
familiarity with the bit manipulation instructions in assembly language
programming.
Background
In Project 1, we saw that ISBN codes can detect some simple
typographical errors. However, there is not much we can do after we
have detected the error. An error-correcting code can fix errors, not
just detect them.
In this project, we will use a 31-bit Hamming code that can correct a 1-bit
error in each 32-bit codeword. Each 32-bit codeword encodes 3 bytes of the
original data. The format of the codeword is on the
Project 2 Codeword Format page.
Assignment
Write an assembly language program that encodes the input file using the
codeword format described below. Your program should read from standard
input and write to standard output. We can use Unix redirection to read
from and write to files:
./a.out < ifile > ofile
Some details:
- Although it is terribly inefficient, your program should read three
bytes in each system call to READ and write 4 bytes in each system
call to WRITE.
- You may assume that when the operating system returns with 0 bytes read
that the end of the input file has been reached. You may also assume
that if fewer than 3 bytes are read, then those bytes are the
last bytes in the file. (These are not fool-proof assumptions
in "real life".)
- The 32-bit codewords must be written out in little-endian format.
(You don't have to do anything special for this to happen. It is
the "normal" thing on a little-endian CPU.)
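The read/write pattern in the details above can be sketched as a bare loop. This is only a skeleton under stated assumptions: it targets 32-bit Linux with the int 0x80 system call interface, the labels inbuf, outbuf, next_block and done are made-up names, and the encoding step is a placeholder that copies the raw bytes instead of building a real codeword.

```nasm
; Sketch only: 32-bit Linux, NASM syntax.
; Assumed build: nasm -f elf32 loop.asm && ld -m elf_i386 loop.o
; inbuf, outbuf, next_block and done are illustrative names.

        section .bss
inbuf:  resb 4                      ; 3 input bytes plus 1 spare byte
outbuf: resb 4                      ; one 32-bit codeword

        section .text
        global _start
_start:
next_block:
        mov     dword [inbuf], 0    ; clear, so a short read is zero-padded
        mov     eax, 3              ; sys_read
        mov     ebx, 0              ; fd 0 = standard input
        mov     ecx, inbuf
        mov     edx, 3              ; ask for 3 bytes
        int     0x80
        cmp     eax, 0
        jle     done                ; 0 = end of file, negative = error

        ; EAX holds the number of bytes read (1, 2 or 3) at this point.
        ; Placeholder: a real encoder builds the codeword here.
        mov     esi, [inbuf]        ; the raw bytes, top byte is 0
        mov     [outbuf], esi       ; stored little-endian by the CPU

        mov     eax, 4              ; sys_write
        mov     ebx, 1              ; fd 1 = standard output
        mov     ecx, outbuf
        mov     edx, 4              ; always write 4 bytes
        int     0x80
        jmp     next_block

done:
        mov     eax, 1              ; sys_exit
        mov     ebx, 0
        int     0x80
```

Note that the byte count returned in EAX is overwritten when the WRITE call is set up, so a real program would need to save it first if the last block needs special handling.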
Two programs, decode and corrupt, are provided on the GL file system
in the directory:
/afs/umbc.edu/users/c/h/chang/pub/cs313
Copy these programs to your own directory. They can be used to decode
an encoded file and to corrupt an encoded file. You can use these
programs to check if your program is working correctly. Both programs
use I/O redirection.
Record some sample runs of your program using the Unix script
command. You should show that you can encode a file using your program,
then decode it and obtain a file that is identical to the original. Use the
Unix diff command to compare the original file with the decoded
file. You should also show that this works when the file is corrupted.
For example:
linux2% ./a.out < test_file > encoded_file
linux2% ./decode < encoded_file > decoded_file
linux2% diff decoded_file test_file
linux2% ./corrupt < encoded_file > corrupted_file
linux2% diff encoded_file corrupted_file
Binary files encoded_file and corrupted_file differ
linux2% ./decode < corrupted_file > decoded_file2
linux2% diff decoded_file2 test_file
Extra Credit
For 10 points extra credit, revise your program so that it reads at
least 200 bytes during each system call to READ (if that many bytes are available). Your
program must also write at least 200 bytes for each system call to
WRITE. (Note: it is advantageous to you if the number of bytes you read
is a multiple of 3.) You will need an inner loop to process 3-byte
blocks of the input you have read.
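One possible shape for the buffered version is sketched below. It is not a complete solution: IN_SIZE, OUT_SIZE, inbuf, outbuf and the labels are made-up names, the buffer size of 300 is just one multiple of 3 that is at least 200, and the encoding step is again a placeholder that copies the 3 raw bytes. The special handling of a final partial block is not shown beyond zero-padding it.

```nasm
; Sketch only: 32-bit Linux, NASM syntax.
; IN_SIZE, OUT_SIZE, inbuf, outbuf and all labels are illustrative names.

IN_SIZE  equ 300                    ; a multiple of 3, at least 200
OUT_SIZE equ (IN_SIZE / 3) * 4      ; 4 output bytes per 3 input bytes

        section .bss
inbuf:  resb IN_SIZE + 3            ; slack for padding and 4-byte loads
outbuf: resb OUT_SIZE

        section .text
        global _start
_start:
read_more:
        mov     eax, 3              ; sys_read
        mov     ebx, 0              ; fd 0 = standard input
        mov     ecx, inbuf
        mov     edx, IN_SIZE
        int     0x80
        cmp     eax, 0
        jle     done                ; 0 = end of file, negative = error

        mov     ebp, eax            ; EBP = input bytes left in the buffer
        mov     word [inbuf + eax], 0   ; zero-pad a partial last block
        mov     esi, inbuf          ; ESI -> next 3-byte block
        mov     edi, outbuf         ; EDI -> next codeword slot

next_block:                         ; inner loop over 3-byte blocks
        mov     edx, [esi]          ; load 4 bytes ...
        and     edx, 0x00FFFFFF     ; ... and keep the low 3
        ; Placeholder: a real encoder turns EDX into a codeword here.
        mov     [edi], edx
        add     esi, 3
        add     edi, 4
        sub     ebp, 3
        jg      next_block          ; more input left in this buffer

        mov     eax, 4              ; sys_write
        mov     ebx, 1              ; fd 1 = standard output
        mov     ecx, outbuf
        mov     edx, edi
        sub     edx, outbuf         ; EDX = number of bytes produced
        int     0x80
        jmp     read_more

done:
        mov     eax, 1              ; sys_exit
        mov     ebx, 0
        int     0x80
```

Choosing a multiple of 3 for the read size means only the very last block of the file can be partial, which keeps the inner loop simple.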
As stated previously, the extra credit policy for this class is that
extra credit is only given for programs that are mostly correct. A
half-hearted attempt at extra credit that doesn't really work will
receive 0 extra credit points. (This is to have you concentrate on the
regular portion of the assignment.)
Implementation Notes
- Your program should not prompt the user for input, since
we will be using Unix redirection.
- Pay attention to the byte order both for input and for output.
In the Project 2 Codeword Format, a0 – a7 is the first byte of the
3-byte input block.
- The parity flag PF is set to 1 if the result of an instruction
contains an even number of 1's. Unfortunately, PF only looks at the
lowest 8 bits of the result. For this project, you will need to compute
32-bit parities. Here's a simple way to compute the parity of the EAX
register.
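mov ebx,eax     ; keep a copy of the full 32-bit value
shr eax,16      ; move the upper 16 bits down into AX
xor ax,bx       ; AX = upper half XOR lower half
xor al,ah       ; AL = XOR of all four bytes, PF is set from AL
jp even_label   ; taken if the 32-bit value had an even number of 1's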
Note that the EAX and EBX registers are modified in this process, so
you may need to use different registers.
- When you compute the value of a parity bit (see below), only
16 bits of the 32-bit codeword are involved. You should use
the AND instruction to mask out the 16 bits that you don't
care about. (Make a copy, of course.)
- Most assembly language instructions we are using require that
their operands have the same number of bits. For example, you cannot
OR a 32-bit register with an 8-bit register.
- Take advantage of the fact that some 8-bit portions of the 32-bit
general purpose registers have names. For example:
mov ebx, 0
mov bl, [buf]
will copy the byte at address buf into the lowest 8 bits of
the EBX register, leaving the top 24 bits cleared.
- Yes, you can add constants to labels like this:
mov al, [buf+1]
mov al, [buf+2]
(This is not an indexed addressing mode. Addresses like
buf+2 are computed by the assembler and linker before the program runs.)
- A single OR instruction can be used to set a single bit in a
register. For example, to set bit 5 in the EBX register to 1, use
the instruction
or ebx, 0x00000020
This assumes that the bits are numbered 0 (least significant) through 31
(most significant).
- When you write 4 bytes to the output, you must store the 4
bytes in memory somewhere (you decide where). The WRITE system call
only writes from memory locations (and definitely will not write from a register).
- The last 32-bit word output by your program requires special
handling since the bits m1 and m0 must be encoded. Since these
bits are also involved in the computation of the parity bits,
the bits m1 and m0 must be set before you compute the parity
bits p4, p3, p2, p1 and p0.
- The UNIX octal dump program is useful to see the contents
of a file in hexadecimal. The name of the command is od. To
see the file foo in hexadecimal as 4-byte words use:
od -t x4 foo
To see the file foo in hexadecimal as 1-byte words use:
od -t x1 foo
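Several of the notes above (loading bytes, masking a copy with AND, the 32-bit parity trick, setting a bit with OR, and writing from memory) can be combined in one small test program. Everything specific in it is an assumption made for illustration: MASK_EXAMPLE, the choice of bit 5 as the parity position, the even-parity convention, and the placement of the three input bytes in the low 24 bits are NOT taken from the Project 2 Codeword Format. The labels buf, word_out and store are likewise made-up names, and movzx is used as a one-instruction equivalent of the mov ebx, 0 / mov bl, [buf] pair.

```nasm
; Sketch only: 32-bit Linux, NASM syntax.
; Assumed build: nasm -f elf32 demo.asm && ld -m elf_i386 demo.o
; MASK_EXAMPLE and bit 5 are made-up values, NOT the real codeword layout.

MASK_EXAMPLE equ 0xF0F0F0F0         ; hypothetical: 16 of the 32 bits

        section .data
buf:    db 0x41, 0x42, 0x43         ; sample 3-byte input block "ABC"

        section .bss
word_out: resd 1                    ; WRITE needs the word in memory

        section .text
        global _start
_start:
        ; Assemble the three bytes into one 32-bit register.
        movzx   esi, byte [buf]     ; first byte, upper 24 bits cleared
        movzx   eax, byte [buf+1]
        shl     eax, 8
        or      esi, eax            ; both operands are 32 bits wide
        movzx   eax, byte [buf+2]
        shl     eax, 16
        or      esi, eax            ; ESI = 0x00434241

        ; Parity of the masked bits, computed on a copy of ESI.
        mov     eax, esi
        and     eax, MASK_EXAMPLE   ; keep only the bits of interest
        mov     edx, eax
        shr     eax, 16
        xor     ax, dx
        xor     al, ah              ; PF now reflects all the masked bits
        jp      store               ; even number of 1's: leave bit 5 clear
        or      esi, 0x00000020     ; odd: set bit 5 (hypothetical position)

store:
        mov     [word_out], esi     ; little-endian byte order in memory
        mov     eax, 4              ; sys_write
        mov     ebx, 1              ; fd 1 = standard output
        mov     ecx, word_out
        mov     edx, 4
        int     0x80

        mov     eax, 1              ; sys_exit
        mov     ebx, 0
        int     0x80
```

With the sample bytes, the masked value 0x00404040 has three 1 bits, so bit 5 gets set. Running ./a.out | od -t x4 should then show the single word 00434261.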
Turning in your program
Use the UNIX submit command on the GL system to turn in your project. You
should submit two files: 1) the assembly language program and 2) the
typescript file of sample runs of your program. The class name for submit
is cs313. The assignment name is proj2.
The UNIX command to do this should look something like:
submit cs313 proj2 encode.asm typescript
Last Modified: 22 Jul 2024 11:28:12 EDT by Richard Chang