Question 1
Assume benchmarks show that 12.5% of the instructions executed by a specific MIPS program use an arithmetic immediate. Of these immediates, 14% need 16 bits, 10% need 15 bits, and 7% need 14 bits. The SPARC processor has only a 13-bit immediate in its register/immediate instruction format. Assuming clock speed and CPI are the same, what is the expected slowdown of the SPARC due to the smaller immediate field?
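One way to set up the arithmetic is sketched below. It assumes that each immediate too wide for 13 bits costs a fixed number of extra instructions on the SPARC (the parameter extra_instructions, defaulting to 1, is an assumption and not given in the question), and that execution time is proportional to instruction count because clock speed and CPI are equal. The function name is hypothetical.

```python
def sparc_slowdown(frac_immediate, frac_too_wide, extra_instructions=1):
    """Execution time of the 13-bit-immediate machine relative to the 16-bit one.

    frac_immediate: fraction of executed instructions using an arithmetic immediate.
    frac_too_wide: fraction of those immediates that need more than 13 bits.
    extra_instructions: ASSUMED number of additional instructions needed to
        build each oversized constant (not specified in the question).
    """
    return 1.0 + frac_immediate * frac_too_wide * extra_instructions


# Immediates needing 16, 15, or 14 bits do not fit in a 13-bit field.
frac_too_wide = 0.14 + 0.10 + 0.07
slowdown = sparc_slowdown(0.125, frac_too_wide)
print(f"slowdown = {slowdown:.5f}")
```

Under the one-extra-instruction assumption this gives a slowdown of about 1.039; with two extra instructions per oversized immediate it would be about 1.078.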
Question 2
You are considering adding a multiply-add instruction:
MAD $rd, $ra, $rb ; $rd = $rd + $ra*$rb
If multiplies make up 10% of the instructions executed by a particular program, and 5% of the total instructions are adds with an accompanying multiply that could be fused into a single instance of the new instruction, what is the expected speedup?
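A minimal sketch of the calculation follows. It assumes that clock speed and CPI are unchanged and that a MAD costs the same as the multiply it replaces, so each fused pair simply removes one add from the instruction stream. Neither assumption is stated in the question, and the function name is hypothetical.

```python
def mad_speedup(frac_fusable_adds):
    """Speedup from fusing each eligible add into its accompanying multiply.

    frac_fusable_adds: fraction of all executed instructions that are adds
        which can be folded into a MAD. Each one disappears from the stream.
    ASSUMES equal CPI and clock, and that MAD takes as long as a multiply.
    """
    return 1.0 / (1.0 - frac_fusable_adds)


speedup = mad_speedup(0.05)
print(f"speedup = {speedup:.4f}")
```

Under these assumptions the 10% multiply figure only confirms that there are enough multiplies to pair with the 5% of fusable adds; it does not otherwise enter the result, which comes out to roughly 1.05.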
Question 3
Consider this sequence of MIPS instructions:
LW $r1, #0($r4)
ADD $r1, $r1, $r2
SUBI $r3, $r1, #1
BNEZ $r3, target
ADDI $r1, $r1, #-1
target: SW $r1, #0($r4)
Assuming the standard 5-stage MIPS pipeline with the zero check in the ID stage, show a pipeline timing diagram for the case when the branch is taken, and for the case when it is not. Draw an arrow between stages for any forwarded result, and circle the stage in which the branch target and direction are known.
Question 4
You are considering adding a new "branch on not equal" immediate instruction that could replace the subtract and BNEZ, so that
SUBI $r3, $r1, #1
BNEZ $r3, target
becomes
BNEQI $r1, #1, target
Assuming the branch outcome can only be resolved in the EX stage, show new pipeline timing diagrams for the sequence in Question 3 with the SUBI/BNEZ pair replaced by the new instruction.
Question 5
This new instruction uses one register and two immediates. To support it, you'd need a new instruction format:
opcode (6b) | Rs (5b) | Immediate (?) | Offset (?) |
To figure out how to allocate the remaining 21 bits between immediate and offset, you could instrument a program to measure the percentage of instructions that are BNEQI (call that B), and the percentage of immediate and offset values in the new BNEQI that need n bits. From these measurements, you get a series of percentages for the immediates: I1 is the percentage needing 1 bit, I2 is the percentage needing 2 bits, etc. You could likewise find the distribution of offsets: O1 is the percentage needing 1 bit, O2 is the percentage needing 2 bits, etc. Give the equation for the expected speedup for any given choice of bit allocation, assuming you can fall back to SUBI/BNEZ if either the immediate or the offset is too big.
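One possible form of the equation is sketched below under several assumptions that the question leaves open: B is taken as the fraction of the original (SUBI/BNEZ) dynamic instruction count made up of replaceable pairs, each counted once; In and On are the fractions needing exactly n bits; immediate and offset sizes are independent; and CPI and clock are unchanged, so the extra branch delay from resolving in EX is ignored. With i bits given to the immediate and 21 - i to the offset, the speedup is then 1 / (1 - B * (I1 + ... + Ii) * (O1 + ... + O(21-i))). The function names are hypothetical.

```python
def fits(dist, bits):
    """Fraction of values needing at most `bits` bits.

    dist[k - 1] is the fraction of values needing exactly k bits.
    """
    return sum(dist[:bits])


def bneqi_speedup(B, I, O, imm_bits, total_bits=21):
    """Expected speedup when the immediate field gets `imm_bits` bits.

    ASSUMES B is the fraction of the original instruction count that are
    SUBI/BNEZ pairs (one count per pair), that immediate and offset sizes
    are independent, and that CPI and clock speed do not change.
    """
    off_bits = total_bits - imm_bits
    usable = fits(I, imm_bits) * fits(O, off_bits)  # pairs that can be fused
    # Each fused pair removes one instruction; the rest fall back to SUBI/BNEZ.
    return 1.0 / (1.0 - B * usable)


def best_split(B, I, O, total_bits=21):
    """Immediate width in 0..total_bits that maximizes the expected speedup."""
    return max(range(total_bits + 1),
               key=lambda i: bneqi_speedup(B, I, O, i, total_bits))
```

If B is instead measured in a program where every pair has already been replaced by BNEQI, the baseline instruction count changes and the expression would need to be adjusted accordingly.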