06. Mathematical coprocessor
Real numbers
- There is no such object in real life
⇒ Always a model
- ⇒ Never test float1 == float2 (it is true only by coincidence)
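A minimal sketch of why exact comparison fails and what to use instead. The values 0.1, 0.2, 0.3 and the tolerance are arbitrary illustrative choices, not something the lecture prescribes.

```python
import math

a = 0.1 + 0.2
b = 0.3

# Exact comparison fails: both sides are rounded binary approximations.
exactly_equal = (a == b)

# Compare with a tolerance instead (1e-9 is an arbitrary choice here).
close_enough = math.isclose(a, b, rel_tol=1e-9)

print(exactly_equal, close_enough)   # False True
```

The same idea applies in assembly: compare the absolute difference with some ε rather than testing for equality.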
Representation
Fixed point: 1231234.21341234
Trailing/leading zeros: 1230000000.0/0.0000000123
Floating point: 1.213123·10²³ = 1.213123E23 = 1.213123e+23
Normalization $$1<=mantissa<10$$
Accidental zeros: 712E+8 + 3e-5 = 71200000000.00003
Real numbers modelling
- Fixed point: very small range
- Lexical (string/remainder based): too slow/complex, although perfect
⇒ floating point.
Binary fixed point: use 2⁻¹, 2⁻², 2⁻³ etc.
- 155.625 =
= 1·2⁷ + 0·2⁶ + 0·2⁵ + 1·2⁴ + 1·2³ + 0·2² + 1·2¹ + 1·2⁰ + 1·2⁻¹ + 0·2⁻² + 1·2⁻³ =
= 128 + 0 + 0 + 16 + 8 + 0 + 2 + 1 + 0.5 + 0 + 0.125 = 155.625₁₀ =
= 10011011.101₂
155.625 = 1.55625·10² = 10011011.101₂·2⁰ = 1.0011011101₂·2⁷ (exponent 111₂ = 7₁₀)
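The conversion above can be reproduced mechanically. This is a sketch for positive values only; `to_binary_fixed` and `normalize` are hypothetical helper names introduced for illustration.

```python
import math

def to_binary_fixed(value: float, frac_bits: int) -> str:
    """Return a non-negative value as binary fixed point text with frac_bits fractional digits."""
    whole = int(value)
    frac = value - whole
    digits = []
    for _ in range(frac_bits):
        frac *= 2              # shift the next binary digit into the integer position
        bit = int(frac)
        digits.append(str(bit))
        frac -= bit
    return bin(whole)[2:] + "." + "".join(digits)

def normalize(value: float):
    """Return (mantissa, exponent) with 1 <= mantissa < 2 and value == mantissa * 2**exponent."""
    mantissa, exponent = math.frexp(value)   # frexp gives 0.5 <= mantissa < 1
    return mantissa * 2, exponent - 1

print(to_binary_fixed(155.625, 3))   # 10011011.101
print(normalize(155.625))            # (1.2158203125, 7)
```

1.2158203125 is the decimal value of 1.0011011101₂, and the exponent 7 is 111₂.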
IEEE 754
But no, they thought they're all smartasses.
S[ E ][ M ] 01110101111100010110101110101111
- S - sign bit
E - biased exponent; 8 bits for 32-bit float
- E = exponent +127 for 32-bit float
M - fractional part of the mantissa (23 bits for 32-bit float)
2²³ = 8388608; if the mantissa needs more than 23 bits, it will lose its lower digits
2-normalized float: $$1<= mantissa <2$$
⇒ mantissa always starts with 1, do not store it
2-denormalized float (E = 0, $$|n| < 2^(-126)$$ for 32-bit float): $$0<=mantissa<1$$
⇒ mantissa always starts with 0, do not store it
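The loss of lower digits can be observed by forcing a value through the 32-bit format. A sketch: `to_float32` is a hypothetical helper built on the standard `struct` module, and the test values are chosen for illustration.

```python
import struct

def to_float32(x: float) -> float:
    """Round a Python float (a double) to the nearest 32-bit float and back."""
    return struct.unpack(">f", struct.pack(">f", x))[0]

# 2**24 + 1 needs 24 bits after the leading 1, one more than a 32-bit float stores.
print(to_float32(16777216.0))   # 16777216.0
print(to_float32(16777217.0))   # 16777216.0

# A small summand is absorbed completely next to a large one.
print(to_float32(712e8 + 3e-5) == to_float32(712e8))   # True
```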
Number | Sign (bit 31) | Biased exponent (bits 30-23) | Mantissa (bits 22-0) | Hexadecimal |
155.625 (normalized) | 0 | 10000110 | 00110111010000000000000 | 431BA000 |
-5.23E-39 (denormalized) | 1 | 00000000 | 01110001111001100011101 | 8038F31D |
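The table rows can be checked by splitting the 32-bit word into its fields. A sketch using the standard `struct` module; `float32_fields` and `float32_from_word` are hypothetical names.

```python
import struct

def float32_fields(x: float):
    """Split a number, rounded to a 32-bit float, into (sign, biased exponent, stored mantissa)."""
    (word,) = struct.unpack(">I", struct.pack(">f", x))
    return word >> 31, (word >> 23) & 0xFF, word & 0x7FFFFF

def float32_from_word(word: int) -> float:
    """Interpret a 32-bit pattern as an IEEE 754 single."""
    return struct.unpack(">f", struct.pack(">I", word))[0]

sign, exponent, mantissa = float32_fields(155.625)
print(sign, format(exponent, "08b"), format(mantissa, "023b"))
print(exponent - 127)                  # unbiased exponent: 7
print(float32_from_word(0x431BA000))   # 155.625
print(float32_from_word(0x8038F31D))   # the denormalized row, about -5.23E-39
```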
- Sign is stored like in a signed integer (top bit)
- Zero is all zero bits, like an integer zero (hence the exponent bias)
- Double float: 1-bit Sign, 11-bit Exponent, 52-bit Mantissa
- MARS: «Tools → Floating Point Representation»
IEEE 754 is mathematically and practically awful! (There is a longread about it in Russian.)
- ⇒ NaN, Inf etc.
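A short sketch of how the special values look and behave. The bit patterns are printed through a hypothetical helper `float32_word`; the set of cases shown is illustrative, not exhaustive.

```python
import math
import struct

def float32_word(x: float) -> int:
    """Return the 32-bit pattern of x rounded to an IEEE 754 single."""
    return struct.unpack(">I", struct.pack(">f", x))[0]

inf = float("inf")
nan = float("nan")

print(hex(float32_word(inf)))     # 0x7f800000: exponent all ones, mantissa zero
print(hex(float32_word(-inf)))    # 0xff800000
print(nan == nan)                 # False: NaN is not equal to anything, itself included
print(math.isnan(inf - inf))      # True: an undefined result becomes NaN
print(hex(float32_word(-0.0)))    # 0x80000000: there is a negative zero too
```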
FPU / C1
The concept of coprocessor: orthogonal task, data formats, performance, data flow
- C0 — control coprocessor (later)
- FPU MIPS:
- IEEE 754 /32 /64
32 dedicated C1 f-registers
=16 d-registers $f0~$f1, $f2~$f3 etc., so only $f0, $f2, $f4 ... can be used for doubles
Instruction set
- In memory:
op | cop | ft | fs | fd | funct |
6 bits | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits |
op = 17
cop = 16 for 32-bit and 17 for 64-bit
- fTarget, fSource, fDestination — f-registers
funct — extension
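To see how the fields pack into one 32-bit word, here is a sketch of an encoder for the arithmetic commands. The funct values (add 0, sub 1, mul 2, div 3) are not given in these notes and are taken from the MIPS32 instruction set as an assumption; `encode_fpu_arith` is a hypothetical name.

```python
FUNCT = {"add": 0, "sub": 1, "mul": 2, "div": 3}   # assumed from the MIPS32 instruction set
COP = {"s": 16, "d": 17}                           # cop field: 32-bit or 64-bit operation

def encode_fpu_arith(command: str, type_: str, fd: int, fs: int, ft: int) -> int:
    """Pack op | cop | ft | fs | fd | funct into a 32-bit machine word."""
    op = 17
    return (op << 26) | (COP[type_] << 21) | (ft << 16) | (fs << 11) | (fd << 6) | FUNCT[command]

print(hex(encode_fpu_arith("mul", "s", 1, 2, 8)))   # mul.s $f1 $f2 $f8 -> 0x46081042
print(hex(encode_fpu_arith("add", "d", 0, 0, 2)))   # add.d $f0 $f0 $f2 -> 0x46220000
```

Note that the assembler order is destination, source, target, while the word stores them as ft, fs, fd.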
- Assembler:
command.type $fdestination $fsource $ftarget
command: add, sub, div, mul
type: s or d
    mul.s $f1 $f2 $f8
    add.d $f0 $f0 $f2
command.type $fdestination $fsource
command: neg, abs, mov, sqrt, movf, movt
movf/movt: conditional move
    mov.s $f4 $f7
    sqrt.d $f0 $f4
memory: command.type f-register offset(common-register)
l/s (load/store), s/d
    l.s $f1 40($t4)
    s.d $f6 ($t5)
registers: command common-register f-register
mfc1/mtc1 (move from/to C1); no suffix for a single word, .d for a double
a double uses 2 common registers (e. g. $t0~$t1)
    mtc1 $t1 $f3
    mfc1.d $t2 $f4
float/int conversion: command.type.type f-destination f-source
cvt/floor/trunc/round, s/d/w (word, i. e. integer)
- use f-registers only (why?)
    cvt.w.s $f1 $f1
    floor.w.d $f2 $f4
More complex instructions
- Non-atomic conditional jumps
comparison: c.le/lt/eq.s/d $fsource $ftarget
store 1/0 into a C1 flag (#0 by default, but there are others, like c.le.s 1 $f0 $f1)
ge/gt is le/lt with the operands swapped
jump: bc1t/bc1f label
jump if C1 flag 0 is 1/0 (similarly bc1t 1 label for C1 flag 1)
    c.le.s $f0 $f1
    bc1t less
- Conditional moves:
movt/movf rdestination rsource: conditionally move a common register if C1 flag 0 is True/False (also movt $t0 $t1 2 for flag 2)
movt/movf.type fdestination fsource (+ optional flag number): the same for f-registers
    c.le.s $f0 $f1
    movt $t4 $t3
    movt.s $f1 $f0
- Also, conditional commands on common registers!
slt rdest rsource rtarget (set rdest to 1 if rsource is less than rtarget, to 0 otherwise); used in pseudoinstructions like blt $t0 $t1 label
movz/movn rdest rsource rtarget (set rdest to rsource if rtarget is zero/nonzero)
movz/movn.s/d fdest fsource rtarget (set fdest to fsource if rtarget is zero/nonzero)
Examples
Calculate a square root of an integer
        .data
src:    .word   100
dst:    .float  0
idst:   .word   0
        .text
        lw      $t0 src         # source integer
        mtc1    $t0 $f2         # store to FPU
        cvt.s.w $f2 $f2         # convert to single-sized float
        mtc1    $zero $f0       # zero in $f0 (no need to convert)
        c.lt.s  $f2 $f0         # check if <0 …
        bc1t    nosqrt          # no root then
        sqrt.s  $f2 $f2
nosqrt: s.s     $f2 dst         # store float result
        cvt.w.s $f2 $f2         # convert to integer
        mfc1    $t0 $f2         # get from FPU
        sw      $t0 idst        # store integer result
Caution: lt vs. 1t (c.lt.s vs. bc1t) are easy to mix up
Calculate $$e$$ as the infinite sum $$sum_(n=0)^infty 1/(n!)$$
        .data
one:    .double 1
ten:    .double 10
        .text
        l.d     $f2 one         # 1
        sub.d   $f4 $f4 $f4     # n = 0
        mov.d   $f6 $f2         # n! (0! = 1)
        mov.d   $f8 $f2         # here will be e (starts with the n = 0 summand)
        l.d     $f10 ten        # 10 for now, here will be ε
        mov.d   $f0 $f2         # here will be 10**K
        li      $v0 5           # input decimal length K
        syscall

enext:  blez    $v0 edone       # 10**K
        mul.d   $f0 $f0 $f10
        subi    $v0 $v0 1
        b       enext
edone:  div.d   $f10 $f2 $f0    # ε = 1/10**K

loop:   add.d   $f4 $f4 $f2     # n=n+1
        mul.d   $f6 $f6 $f4     # n!=(n-1)!*n
        div.d   $f0 $f2 $f6     # next summand
        add.d   $f8 $f8 $f0
        c.lt.d  $f0 $f10        # next summand < ε
        bc1f    loop

        li      $v0 3           # output a double
        mov.d   $f12 $f8        # $f12 by syscall standard
        syscall
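For cross-checking the assembly result, the same steps can be written in a high-level form. This is a sketch that mirrors the loop structure above; `e_series` is a hypothetical name and K = 5 is an arbitrary test value.

```python
import math

def e_series(k: int) -> float:
    """Sum 1/n! until a summand drops below 10**-k (same steps as the assembly loop)."""
    power = 1.0
    for _ in range(k):          # 10**k by repeated multiplication
        power *= 10.0
    eps = 1.0 / power

    n = 0.0
    factorial = 1.0             # 0! = 1
    total = 1.0                 # the n = 0 summand
    while True:
        n += 1.0
        factorial *= n          # n! = (n-1)! * n
        term = 1.0 / factorial
        total += term
        if term < eps:          # same exit test as c.lt.d / bc1f
            return total

print(abs(e_series(5) - math.e) < 1e-5)   # True
```

The exit test looks at the last summand only; this works here because the remaining tail of the series is smaller than that summand.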
H/W
EJudge: CubicRoot 'Cubic root'
Input a double float A (positive or negative), $$1<=|A|<=1000000$$, and $$0.00001<=varepsilon<=0.01$$. Calculate the cubic root of A to within $$varepsilon$$ (you do not need to round the result). HINT: you can always calculate the cube of something!
1000 0.0001
9.99995
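One possible reading of the hint is bisection: guess a value, cube it, and narrow the interval. The sketch below shows that idea in Python only; it is an assumption about the intended approach, not the reference solution, and it will not necessarily print the same digits as the sample output. `cubic_root` is a hypothetical name.

```python
def cubic_root(a: float, eps: float) -> float:
    """Bisection on [0, max(1, |a|)]: only cubes are computed, never a root."""
    target = abs(a)
    low, high = 0.0, max(1.0, target)
    while high - low > eps:
        mid = (low + high) / 2
        if mid * mid * mid < target:
            low = mid           # the root is in the upper half
        else:
            high = mid          # the root is in the lower half
    root = (low + high) / 2
    return -root if a < 0 else root

print(abs(cubic_root(1000, 0.0001) - 10) <= 0.0001)   # True
```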
EJudge: FractionTruncate 'Inexact fraction'
Input three natural numbers: A, B and n. Output a double float F that has exactly n decimal places of A/B. You need to write a subroutine that accepts a double f=A/B in $f12 and an integer n in $a0 and returns the rounded double F in $f0. Hint: $$10^n*A/B < 2^31$$
123 456 7
0.2697368
EJudge: LeibPi 'Calculating Pi'
Calculate the value of π using the Leibniz formula for π, accurate to N decimal places. Input N, output the result. Use the function defined in ../Homework_FractionTruncate to cut off the other digits. Keep in mind that the formula itself calculates π/4, so you should probably start with 4 instead of 1 to get the required accuracy. Warning: the algorithm is slow; do not panic, but keep the code as simple as possible.
4
3.1416