NUS CS MODS: CS2100: Performance and ISA General Concepts

Performance and ISA General Concepts

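Response time: number of seconds taken per instruction (or per task)

Performance (throughput): number of instructions completed per second

Throughput is the measure we use for pipelining, which completes more instructions per second without making any single instruction faster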

Average CPI

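CPI = sum over instruction classes k of (CPI_k * F_k)
F_k = I_k / Instruction Count
where CPI_k is the cycles per instruction for class k and I_k is the number of instructions of class k executed

CPI = (CPU time * Clock Rate) / Instruction Count
= Clock Cycles / Instruction Count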
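
The weighted sum above can be sketched in a few lines of Python. The instruction classes, cycle counts and instruction counts below are made-up numbers for illustration only, not measurements from any real machine.

```python
def average_cpi(instruction_mix):
    """Average CPI from a mix of instruction classes.

    instruction_mix maps a class name to (cycles per instruction, count).
    """
    total_count = sum(count for _, count in instruction_mix.values())
    cpi = 0.0
    for cycles, count in instruction_mix.values():
        frequency = count / total_count  # F for this class
        cpi += cycles * frequency        # cycles for this class * F
    return cpi


# Hypothetical mix: 50 ALU, 30 load/store and 20 branch instructions.
mix = {"alu": (1, 50), "load_store": (2, 30), "branch": (3, 20)}
cpi = average_cpi(mix)
print(cpi)  # close to 1.7 (1*0.5 + 2*0.3 + 3*0.2)
```

The same value comes out of Clock Cycles / Instruction Count: 170 cycles over 100 instructions.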

Influencing Factors on Performance

1. Compile the program to binary

Depending on the compiler and the kinds of instructions the ISA provides,

the number of instructions and the instruction mix change
[Instruction count, average CPI]

Compiler:

Different compilers use different techniques to compile

- gcc

- Clang

- icc

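They generate different binaries, and each can optimise the code at a level chosen with the -O<level> flag on the compile line (capital O; lowercase -o names the output file)

e.g. -O3 requests level 3, the highest numbered level that gcc documents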

Instruction Set Architecture:

The same high-level statement is translated differently depending on the ISA
e.g A*+B

2. Binary executes on the machine

[Cycle time, CPI]

Machine:

- More accurately, the hardware implementation

- Determines the cycle time and the cycles per instruction

Cycle time:
Set by the clock frequency (cycle time = 1 / clock rate)

Cycles per instruction:

Set by the design of the internal mechanism
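
Putting the two stages together, execution time is the product of instruction count, CPI and cycle time. The sketch below is a minimal illustration; the instruction count, CPI and clock rate are hypothetical values chosen to make the arithmetic easy.

```python
def cpu_time(instruction_count, cpi, clock_rate_hz):
    """CPU time in seconds = instruction count * CPI * cycle time."""
    cycle_time = 1.0 / clock_rate_hz  # seconds per clock cycle
    return instruction_count * cpi * cycle_time


# Hypothetical program: 1e9 instructions, average CPI of 2, 2 GHz clock.
baseline = cpu_time(1_000_000_000, 2.0, 2_000_000_000)

# A compiler that emits 20% fewer instructions but raises CPI to 2.5
# does not make the program faster on the same machine.
recompiled = cpu_time(800_000_000, 2.5, 2_000_000_000)

print(baseline, recompiled)  # both close to 1 second
```

This is why the factors have to be considered together: changing one of them can move another.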

Summary:

Performance is specific to the program being run

- A given machine can have a different CPI for different programs

- Common misunderstanding: expecting an improvement in one aspect of a machine to improve its overall performance by the same amount

Amdahl's Law

Performance is limited by the portion of the program that is not sped up. Improving one part leaves the time spent in the rest unchanged, so we can optimise, but there is a limit to what optimising gives.

Making FP run 5x faster doesn't mean we divide the total time by 5.
We divide only the FP time by 5 and add back the remaining time.
e.g.
FP instructions = 6 sec
Rest of benchmark = 6 sec
Total: 12 sec
New time: 6/5 (FP instructions) + 6 = 7.2 sec
Speedup: 12 / 7.2 ≈ 1.67
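
The same calculation as a small Python sketch, using the 6 sec + 6 sec example from these notes. The function names are only for illustration. It also shows the limit: even an infinitely fast FP unit cannot push the overall speedup past 2x here.

```python
def new_execution_time(enhanced_time, other_time, factor):
    """Time after speeding up only the enhanced portion by `factor`."""
    return enhanced_time / factor + other_time


def overall_speedup(enhanced_time, other_time, factor):
    """Old total time divided by new total time."""
    old_time = enhanced_time + other_time
    return old_time / new_execution_time(enhanced_time, other_time, factor)


new_time = new_execution_time(6, 6, 5)         # 6/5 + 6, about 7.2 sec
speedup = overall_speedup(6, 6, 5)             # about 1.67, not 5
limit = overall_speedup(6, 6, float("inf"))    # FP time drops to zero

print(new_time, speedup, limit)
```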

Boolean Algebra

Use variables such as X and Y to represent logic expressions

e.g. X = A + B (X is A OR B)

RISC VS CISC

CISC:

A single instruction can do complex work, e.g. one instruction that performs a whole matrix multiplication

Gives the user whatever operation they want.

Executable: small

Hardware: complex

e.g. Intel x86

RISC:

Provides only the simplest instructions, and the rest is built from them

e.g. add, mul, branch

Executable: big

Hardware: simple, easier to optimise

e.g. MIPS, ARM

#1 Data Storage

Storage architecture.

Von Neumann architecture: all data is kept in memory, and when the processor needs it, we bring it in

General-purpose register (GPR)

Instructions such as load and store move information between memory and registers

This is the most popular design

Memory-Memory

The memory addresses are specified in the instruction, which operates on memory straight away

This is slow because memory takes very long to access

Stack

Last-in, first-out data structure. Push and pop are used to move information between memory and the stack.

When we perform an add, the operands are taken from the top of the stack and the result is pushed onto the stack.

Popping stores the result back to memory.

All the instructions here are very small, since the operands are implicit

e.g Java JVM
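
A minimal sketch of the stack model, computing C = A + B. The three-operation instruction set, the tuple encoding and the dictionary used as memory are all assumptions made for illustration; they are not the JVM's actual bytecode.

```python
def run_stack_machine(program, memory):
    """Execute push/add/pop instructions against a dict-based memory."""
    stack = []
    for op, *args in program:
        if op == "push":            # memory -> top of stack
            stack.append(memory[args[0]])
        elif op == "add":           # operands are implicit: top two entries
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif op == "pop":           # top of stack -> memory
            memory[args[0]] = stack.pop()
        else:
            raise ValueError(f"unknown operation: {op}")
    return memory


memory = {"A": 3, "B": 4, "C": 0}
program = [("push", "A"), ("push", "B"), ("add",), ("pop", "C")]
run_stack_machine(program, memory)
print(memory["C"])  # 7
```

Note that add carries no operand at all, which is why stack instructions can be so small.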

Accumulator

There is a special register, the accumulator. Whatever result is calculated is placed in the accumulator, and the accumulator is an implicit operand of every ALU operation.

A load brings a value from memory into the accumulator, and a store writes the accumulator back to memory
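
For comparison, here is the same C = A + B on a sketch of an accumulator machine. The operation names and the dictionary used as memory are hypothetical, chosen only to show that each instruction names one memory operand while the accumulator is implied.

```python
def run_accumulator_machine(program, memory):
    """Execute load/add/store instructions with one implicit accumulator."""
    acc = 0
    for op, address in program:
        if op == "load":        # memory -> accumulator
            acc = memory[address]
        elif op == "add":       # accumulator = accumulator + memory operand
            acc = acc + memory[address]
        elif op == "store":     # accumulator -> memory
            memory[address] = acc
        else:
            raise ValueError(f"unknown operation: {op}")
    return memory


acc_memory = {"A": 3, "B": 4, "C": 0}
acc_program = [("load", "A"), ("add", "B"), ("store", "C")]
run_accumulator_machine(acc_program, acc_memory)
print(acc_memory["C"])  # 7
```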

#2 Memory and Addressing mode

- The address size is independent of the data size.

A k-bit address can refer to 2^k different locations

When reading, we use an n-bit data bus, but n need not be the same as k
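
The relationship between address width and the number of locations can be checked with a short sketch. The function names are illustrative only.

```python
def addressable_locations(address_bits):
    """A k-bit address can name 2^k distinct locations."""
    return 2 ** address_bits


def address_bits_needed(locations):
    """Smallest k such that 2^k >= locations (for locations >= 1)."""
    return (locations - 1).bit_length()


print(addressable_locations(16))   # 65536
print(address_bits_needed(65536))  # 16
print(address_bits_needed(65537))  # 17
```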

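Load:

The address is placed in the MAR (memory address register) and the value read arrives in the MDR (memory data register)

Store:

The address is placed in the MAR and the value to write is placed in the MDR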

Endianness

The ordering of the bytes of a multi-byte word stored in memory

Big-endian:

Stores the most significant byte at the lowest address

Little-endian:

Stores the least significant byte at the lowest address

The problem arises on a load: machines of different endianness return different results for the same bytes in memory.

Intel: little-endian

MIPS: depends on the implementation (the simulator is little-endian)

Network order: big-endian
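
The two byte orders can be seen directly with Python's built-in int.to_bytes and int.from_bytes. The 32-bit value 0x12345678 is an arbitrary example; index 0 of each bytes object stands for the lowest address.

```python
def store_word(value, byteorder):
    """Lay out a 32-bit word in memory order (index 0 = lowest address)."""
    return value.to_bytes(4, byteorder=byteorder)


def load_word(raw, byteorder):
    """Reassemble a 32-bit word from four bytes in memory order."""
    return int.from_bytes(raw, byteorder=byteorder)


word = 0x12345678
big = store_word(word, "big")        # 12 34 56 78
little = store_word(word, "little")  # 78 56 34 12

# Bytes written by a big-endian machine, read by a little-endian one.
misread = load_word(big, "little")

print(big.hex(), little.hex(), hex(misread))
```

The misread value is 0x78563412, which is why data exchanged between machines is converted to an agreed order such as network order.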

Addressing modes

We use 3 kinds of addressing modes, but other ISAs have more than 3

e.g. register indirect, auto-increment

#3 Operation

Standard operations in an instruction set

- data movement

- Arithmetic

- Shift

- Branch

- Call

Note: load is the most frequently used operation, so optimise it first

#4 Instruction Format

Instruction Length

Variable-length instructions:

Used in most CISC

Require multi-step fetch and decode

Fixed-length instructions:

Used in RISC

Easy to decode and encode

Instruction bits are scarce
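
To see why fixed-length instructions are easy to decode, here is a sketch that splits a 32-bit word into fields with shifts and masks. The field layout is the MIPS R-format (6-bit opcode, three 5-bit register numbers, 5-bit shift amount, 6-bit function code). That layout is not described in these notes and is brought in as an assumption for illustration.

```python
def decode_r_format(word):
    """Split a 32-bit word into MIPS R-format fields."""
    return {
        "opcode": (word >> 26) & 0x3F,  # bits 31..26
        "rs": (word >> 21) & 0x1F,      # bits 25..21
        "rt": (word >> 16) & 0x1F,      # bits 20..16
        "rd": (word >> 11) & 0x1F,      # bits 15..11
        "shamt": (word >> 6) & 0x1F,    # bits 10..6
        "funct": word & 0x3F,           # bits 5..0
    }


# 0x02324020 encodes: add $t0, $s1, $s2  (rd=8, rs=17, rt=18, funct=0x20)
fields = decode_r_format(0x02324020)
print(fields)
```

Every field sits at a fixed position, so decoding takes one step. The cost is that all fields must fit in 32 bits, which is why instruction bits are scarce.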

Hybrid:

A mix of both


Thursday, October 10, 2019
