# Lecture 7A Computer Architecture I

# **Instruction Set Architecture**

F7A - 1 - Datorarkitektur 2009

# **Y86 Processor State**

| Program registers |      | Condition codes | Program counter | Memory |
|-------------------|------|-----------------|-----------------|--------|
| %eax              | %esi | OF ZF SF        | PC              |        |
| %ecx              | %edi |                 |                 |        |
| %edx              | %esp |                 |                 |        |
| %ebx              | %ebp |                 |                 |        |


- Program Registers
  - Same 8 as with IA32. Each 32 bits
- Condition Codes
  - Single-bit flags set by arithmetic or logical instructions

  - OF: Overflow, ZF: Zero, SF: Negative

- Program Counter
  - Indicates address of instruction
- Memory
  - Byte-addressable storage array
  - Words stored in little-endian byte order
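To make the little-endian rule concrete, here is a minimal C sketch of storing and loading a 32-bit word in a byte-addressable array. The helper names `store_word_le` and `load_word_le` are hypothetical and not part of any Y86 tool; the array simply stands in for Y86 memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: store a 32-bit word at mem[addr..addr+3],
   least significant byte first (little-endian). */
void store_word_le(uint8_t *mem, size_t addr, uint32_t value)
{
    assert(mem != NULL);
    for (int i = 0; i < 4; i++)
        mem[addr + i] = (uint8_t)(value >> (8 * i));
}

/* Hypothetical helper: read the word back from the same four bytes. */
uint32_t load_word_le(const uint8_t *mem, size_t addr)
{
    uint32_t value = 0;
    assert(mem != NULL);
    for (int i = 0; i < 4; i++)
        value |= (uint32_t)mem[addr + i] << (8 * i);
    return value;
}
```

Storing 5043 (0x13b3) this way yields the bytes b3 13 00 00, which is how `.long 5043` appears in the assembled object code later in the lecture.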

# **Instruction Set Architecture**

### **Assembly Language View**

- Processor state
  - Registers, memory, ...
- Instructions
  - addl, movl, andl, ...
  - How instructions are encoded as bytes

### **Layer of Abstraction**

- Above: how to program machine
  - Processor executes instructions in a sequence
- Below: what needs to be built
  - Use variety of tricks to make it run fast
  - E.g., execute multiple instructions simultaneously




# **Y86 Instructions**

### **Format**


- 1-6 bytes of information read from memory
  - Can determine instruction length from first byte
  - Not as many instruction types, and simpler encoding than with IA32
- Each accesses and modifies some part(s) of the program state

# **Encoding Registers**

### Each register has 4-bit ID

| Register | ID |
|----------|----|
| %eax     | 0  |
| %ecx     | 1  |
| %edx     | 2  |
| %ebx     | 3  |
| %esp     | 4  |
| %ebp     | 5  |
| %esi     | 6  |
| %edi     | 7  |

- Same encoding as in IA32

### Register ID 8 indicates "no register"

- Will use this in our hardware design in multiple places


# **Instruction Example**

### **Addition Instruction**



- Add value in register rA to that in register rB
  - Store result in register rB
  - Note that Y86 only allows addition to be applied to register data
- Set condition codes based on result
- E.g., addl %eax,%esi has the encoding 60 06
- Two-byte encoding
  - First indicates instruction type
  - Second gives source and destination registers
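The two-byte layout can be sketched in C as follows. This is only an illustration under the assumption that the second byte packs rA into the high four bits and rB into the low four bits, which is what the example encoding 60 06 suggests; the enum and the function `encode_addl` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Register IDs from the "Encoding Registers" slide. */
enum y86_reg { EAX = 0, ECX = 1, EDX = 2, EBX = 3,
               ESP = 4, EBP = 5, ESI = 6, EDI = 7, RNONE = 8 };

/* Hypothetical helper: write the two-byte encoding of addl rA,rB into out. */
void encode_addl(enum y86_reg rA, enum y86_reg rB, uint8_t out[2])
{
    assert(rA < RNONE && rB < RNONE);       /* addl needs two real registers */
    out[0] = 0x60;                          /* first byte: instruction type */
    out[1] = (uint8_t)((rA << 4) | rB);     /* second byte: source, destination */
}
```

Calling `encode_addl(EAX, ESI, out)` fills `out` with 60 06, matching the example above.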


# **Move Operations**




- Like the IA32 movl instruction
- Simpler format for memory addresses
- Give different names to keep them distinct

# **Arithmetic and Logical Operations**



- Refer to generically as "OPl"
- Encodings differ only by "function code"
  - Low-order 4 bits of the first instruction byte
- Set condition codes as side effect
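As a rough sketch of how the function code selects the operation and how the condition codes follow from the result, consider the C fragment below. The function-code values 0 to 3 for addl, subl, andl and xorl follow the usual Y86 definition but are not listed on this slide, so treat them as an assumption; all type and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Function codes as in the usual Y86 definition (assumed, not on the slide). */
enum y86_fn { FN_ADDL = 0, FN_SUBL = 1, FN_ANDL = 2, FN_XORL = 3 };

struct y86_cc { int zf, sf, of; };

/* Split the first instruction byte into instruction code and function code. */
unsigned icode_of(uint8_t byte0) { return byte0 >> 4; }
unsigned ifun_of(uint8_t byte0)  { return byte0 & 0xfu; }

/* Hypothetical helper: compute "valB OP valA" and set the condition codes. */
uint32_t exec_opl(unsigned ifun, uint32_t valA, uint32_t valB, struct y86_cc *cc)
{
    uint32_t r = 0;
    int of = 0;
    assert(ifun <= 3u && cc != 0);
    switch (ifun) {
    case FN_ADDL:
        r = valB + valA;
        /* overflow: both operands share a sign that the result lacks */
        of = (int)((~(valA ^ valB) & (valA ^ r)) >> 31);
        break;
    case FN_SUBL:
        r = valB - valA;
        /* overflow: operand signs differ and the result sign differs from valB */
        of = (int)(((valA ^ valB) & (valB ^ r)) >> 31);
        break;
    case FN_ANDL: r = valB & valA; break;
    case FN_XORL: r = valB ^ valA; break;
    }
    cc->zf = (r == 0);          /* ZF: result is zero */
    cc->sf = (int)(r >> 31);    /* SF: result is negative */
    cc->of = of;                /* OF: signed overflow */
    return r;
}
```

Unsigned arithmetic is used throughout so that wraparound is well defined in C; the sign bit is then inspected explicitly.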


# **Move Instruction Examples**

| IA32                  | Y86                     | Encoding (little-endian) |
|-----------------------|-------------------------|--------------------------|
| movl \$0xabcd, %edx   | irmovl \$0xabcd, %edx   | 30 82 cd ab 00 00        |
| movl %esp, %ebx       | rrmovl %esp, %ebx       | 20 43                    |
| movl -12(%ebp),%ecx   | mrmovl -12(%ebp),%ecx   | 50 15 f4 ff ff ff        |
| movl %esi,0x41c(%esp) | rmmovl %esi,0x41c(%esp) | 40 64 1c 04 00 00        |

| IA32                     | Y86 |
|--------------------------|-----|
| movl \$0xabcd, (%eax)    | -   |
| movl %eax, 12(%eax,%edx) | -   |
| movl (%ebp,%eax,4),%ecx  | -   |
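The six-byte encodings in the first table can be reproduced with a small C sketch. It assumes the layout implied by the table rows: an opcode byte, a register byte holding rA in the high nibble and rB in the low nibble, and a 32-bit immediate or displacement stored least significant byte first. The name `encode_move6` and the macro `Y86_NO_REG` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define Y86_NO_REG 8u   /* register ID 8 means "no register" */

/* Hypothetical helper: build a six-byte move instruction.
   opcode is the first byte (0x30 irmovl, 0x40 rmmovl, 0x50 mrmovl,
   as read off the table rows). */
void encode_move6(uint8_t opcode, unsigned rA, unsigned rB,
                  uint32_t imm_or_disp, uint8_t out[6])
{
    assert(rA <= Y86_NO_REG && rB <= Y86_NO_REG);
    out[0] = opcode;
    out[1] = (uint8_t)((rA << 4) | rB);
    for (int i = 0; i < 4; i++)             /* least significant byte first */
        out[2 + i] = (uint8_t)(imm_or_disp >> (8 * i));
}
```

For instance, `encode_move6(0x30, Y86_NO_REG, 2, 0xabcd, out)` produces 30 82 cd ab 00 00, and a displacement of -12 is stored as f4 ff ff ff in two's complement.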




# **Stack Operations**

pushl rA (encoded as a0 rA8)

- Decrement %esp by 4
- Store word from rA to memory at %esp
- Like IA32

popl rA (encoded as b0 rA8)

- Read word from memory at %esp
- Save in rA
- Increment %esp by 4
- Like IA32

# **Y86 Program Stack**



- Region of memory holding program data
- Used in Y86 (and IA32) for supporting procedure calls
- Stack top indicated by %esp
  - Address of top stack element
- Stack grows toward lower addresses
  - Top element is at lowest address in the stack
  - When pushing, must first decrement stack pointer
  - When popping, increment stack pointer
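The push and pop rules above can be sketched in C with a tiny machine model. The struct `y86_machine`, the memory size and the function names are assumptions made for this illustration only.

```c
#include <assert.h>
#include <stdint.h>

#define MEM_SIZE 0x200u   /* assumed memory size for this sketch */

struct y86_machine {
    uint32_t esp;            /* stack pointer: address of the top element */
    uint8_t  mem[MEM_SIZE];  /* byte-addressable memory */
};

/* Push: decrement %esp by 4 first, then store the word at the new top. */
void y86_pushl(struct y86_machine *m, uint32_t val)
{
    assert(m->esp >= 4 && m->esp <= MEM_SIZE);
    m->esp -= 4;
    for (int i = 0; i < 4; i++)              /* little-endian store */
        m->mem[m->esp + i] = (uint8_t)(val >> (8 * i));
}

/* Pop: read the word at %esp, then increment %esp by 4. */
uint32_t y86_popl(struct y86_machine *m)
{
    uint32_t val = 0;
    assert(m->esp + 4 <= MEM_SIZE);
    for (int i = 0; i < 4; i++)              /* little-endian load */
        val |= (uint32_t)m->mem[m->esp + i] << (8 * i);
    m->esp += 4;
    return val;
}
```

Starting with `esp` at 0x100, one push moves the top of the stack to 0xfc; popping returns the same word and restores `esp` to 0x100.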


# **Subroutine Call and Return**

call Dest

- Push address of next instruction onto stack
- Start executing instructions at Dest
- Like IA32

ret

- Pop value from stack
- Use as address for next instruction
- Like IA32
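A minimal C sketch of call and return, under the assumption that the call instruction occupies 5 bytes (one opcode byte plus a 4-byte destination, consistent with the encoding 80 28 00 00 00 in the assembled example later in the lecture). Memory is modelled as an array of words for brevity, and all names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define CPU_MEM_SIZE 0x200u   /* assumed memory size for this sketch */

struct y86_cpu {
    uint32_t pc;                        /* program counter */
    uint32_t esp;                       /* stack pointer */
    uint32_t words[CPU_MEM_SIZE / 4];   /* memory viewed as aligned words */
};

void cpu_push(struct y86_cpu *c, uint32_t val)
{
    assert(c->esp >= 4 && c->esp <= CPU_MEM_SIZE && c->esp % 4 == 0);
    c->esp -= 4;
    c->words[c->esp / 4] = val;
}

uint32_t cpu_pop(struct y86_cpu *c)
{
    uint32_t val;
    assert(c->esp + 4 <= CPU_MEM_SIZE && c->esp % 4 == 0);
    val = c->words[c->esp / 4];
    c->esp += 4;
    return val;
}

/* call Dest: push the address of the next instruction, then jump to Dest. */
void y86_call(struct y86_cpu *c, uint32_t dest)
{
    cpu_push(c, c->pc + 5);   /* the call instruction itself is 5 bytes long */
    c->pc = dest;
}

/* ret: pop a value from the stack and use it as the next instruction address. */
void y86_ret(struct y86_cpu *c)
{
    c->pc = cpu_pop(c);
}
```

With `pc` at 0x10 and `esp` at 0xfc, a call to 0x28 leaves the return address 0x15 at 0xf8; the matching return restores `pc` to 0x15 and `esp` to 0xfc.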


# **Miscellaneous Instructions**

nop

- Don't do anything

halt

- Stop executing instructions
- IA32 has comparable instruction, but can't execute it in user mode
- We will use it to stop the simulator


# **Y86 Code Generation Example**

### First Try

- Write typical array code

```
/* Find number of elements in
   null-terminated list */
int len1(int a[])
{
   int len;
   for (len = 0; a[len]; len++)
        ;
   return len;
}
```

- Compile with gcc -O2 -S

### **Problem**

- Hard to do array indexing on Y86
  - Since Y86 has no scaled addressing modes
  - Similar to SPARC

```
loop:
   incl %eax
entry:
   cmpl $0,(%edx,%eax,4)
   jne loop
```

# Writing Y86 Code

## Try to Use C Compiler as Much as Possible

- Write code in C
- Compile for IA32 with gcc -O2 -S
- This will generate optimized code that uses registers for local variables
- Transliterate into Y86

# **Coding Example**

- Find number of elements in null-terminated list
- int len1(int a[]);

```
a -> 5043
     6125
     7395
        0
```


# **Y86 Code Generation Example #2**

### Second Try

- Write with pointer code

```
/* Find number of elements in
   null-terminated list */
int len2(int a[])
{
   int len = 0;
   while (*a++)
        len++;
   return len;
}
```

- Compile with gcc -O2 -S

### Result

- Don't need to do indexed addressing

```
loop:
   movl (%edx),%eax
   incl %ecx
entry:
   addl $4,%edx
   testl %eax,%eax
   jne loop
```


# **Y86 Code Generation Example #3**

IA32 Code

```
len2:
   pushl %ebp
   xorl %ecx,%ecx
   movl %esp,%ebp
   movl 8(%ebp),%edx
   movl (%edx),%eax
   jmp entry
loop:
   movl (%edx),%eax     # Get *a
   incl %ecx            # len++
entry:
   addl $4,%edx         # a++
   testl %eax,%eax      # *a == 0?
   jne loop             # No--Loop
   movl %ebp,%esp       # Pop
   movl %ecx,%eax       # Rtn len
   popl %ebp
   ret
```

Y86 Code

```
len2:
   pushl %ebp           # Save %ebp
   xorl %ecx,%ecx       # len = 0
   rrmovl %esp,%ebp     # Set frame
   mrmovl 8(%ebp),%edx  # Get a
   mrmovl (%edx),%eax   # Get *a
   jmp entry            # Goto entry
loop:
   mrmovl (%edx),%eax   # Get *a
   irmovl $1,%esi
   addl %esi,%ecx       # len++
entry:
   irmovl $4,%esi
   addl %esi,%edx       # a++
   andl %eax,%eax       # *a == 0?
   jne loop             # No--Loop
   rrmovl %ebp,%esp     # Pop
   rrmovl %ecx,%eax     # Rtn len
   popl %ebp
   ret
```


# **Assembling Y86 Program**

unix> yas eg.ys


- Generates "object code" file eg.yo
  - Actually looks like disassembler output

```
0x000: 308400010000 | irmovl Stack,%esp   # Set up stack
0x006: 2045         | rrmovl %esp,%ebp    # Set up frame
0x008: 308218000000 | irmovl List,%edx
0x00e: a028         | pushl %edx          # Push argument
0x010: 8028000000   | call len2           # Call Function
0x015: 10           | halt                # Halt
0x018:              | .align 4
0x018:              | List:               # List of elements
0x018: b3130000     | .long 5043
0x01c: ed170000     | .long 6125
0x020: e31c0000     | .long 7395
0x024: 00000000     | .long 0
```


# **Y86 Program Structure**

```
   irmovl Stack,%esp    # Set up stack
   rrmovl %esp,%ebp     # Set up frame
   irmovl List,%edx
   pushl %edx           # Push argument
   call len2            # Call Function
   halt                 # Halt
.align 4
List:                   # List of elements
   .long 5043
   .long 6125
   .long 7395
   .long 0

# Function
len2:

# Allocate space for stack
.pos 0x100
Stack:
```

- Program starts at address 0
- Must set up stack
  - Make sure there is space enough between end of code and beginning of Stack so we don't overwrite code!
- Must initialize data
- Can use symbolic names


# **Simulating Y86 Program**

unix> yis eg.yo

- Instruction set simulator
  - Computes effect of each instruction on processor state
  - Prints changes in state from original

```
Stopped in 41 steps at PC = 0x16. Exception 'HLT'. CC Z=1 S=0 O=0
Changes to registers:
%eax:   0x00000000      0x00000003
%ecx:   0x00000000      0x00000003
%edx:   0x00000000      0x00000028
%esp:   0x00000000      0x000000fc
%ebp:   0x00000000      0x00000100
%esi:   0x00000000      0x00000004

Changes to memory:
0x00f4: 0x00000000      0x00000100
0x00f8: 0x00000000      0x00000015
0x00fc: 0x00000000      0x00000018
```


# **CISC Instruction Sets**

- Complex Instruction Set Computer
- Dominant style through mid-80's

### Stack-oriented instruction set

- Use stack to pass arguments, save program counter
- Explicit push and pop instructions

### **Arithmetic instructions can access memory**

- addl %eax, 12(%ebx,%ecx,4)
  - Requires memory read and write
  - Complex address calculation
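The address calculation hidden inside an operand such as 12(%ebx,%ecx,4) can be written out as a short C sketch. The function name `ia32_effective_address` is hypothetical; the formula displacement + base + index * scale is the standard IA32 addressing form.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: effective address of disp(base,index,scale). */
uint32_t ia32_effective_address(int32_t disp, uint32_t base,
                                uint32_t index, unsigned scale)
{
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    /* unsigned arithmetic wraps modulo 2^32, like 32-bit address math */
    return (uint32_t)disp + base + index * scale;
}
```

With %ebx = 0x1000 and %ecx = 3, the operand 12(%ebx,%ecx,4) refers to address 0x1018. Y86 has no such mode, which is why the len2 code advances its pointer with a separate irmovl and addl pair.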

### **Condition codes**

- Set as side effect of arithmetic and logical instructions

### **Philosophy**

- Add instructions to perform "typical" programming tasks


# CISC vs. RISC

# **Original Debate**

- Strong opinions!
- CISC proponents: easy for compiler, fewer code bytes
- RISC proponents: better for optimizing compilers, can be made to run fast with a simple chip design

### **Current Status**

- For desktop processors, choice of ISA not a technical issue
  - With enough hardware, can make anything run fast
  - Code compatibility more important
- For embedded processors, RISC makes sense
  - Smaller, cheaper, less power

# **RISC Instruction Sets**

- Reduced Instruction Set Computer
- Internal project at IBM, later popularized by Hennessy (Stanford) and Patterson (Berkeley)

### Fewer, simpler instructions

- Might take more to get given task done
- Can execute them with small and fast hardware

### Register-oriented instruction set

- Many more (typically 32) registers
- Use for arguments, return pointer, temporaries

### Only load and store instructions can access memory

- Similar to Y86 mrmovl and rmmovl

### **No Condition codes**

- Test instructions return 0/1 in register
- But SPARC has condition codes


# **Summary**

### Y86 Instruction Set Architecture

- Similar state and instructions to IA32
- Simpler encodings
- Somewhere between CISC and RISC

### **How Important is ISA Design?**

- Less now than before
  - With enough hardware, can make almost anything go fast
- Intel is moving away from IA32
  - Does not allow enough parallel execution
  - Introduced IA64
    - 64-bit word sizes (overcome address space limitations)
    - Radically different style of instruction set with explicit parallelism
    - Requires sophisticated compilers
