Calling all gurus
Ludis Langens
ludis at cruzers.com
Tue Sep 8 13:23:51 GMT 1998
"Mike Pitts" <mpitts at mail.emi.net> wrote:
> as well as tell me what the following routine does
>
> F047 3C PSHX ; Store X
> F048 30 TSX ; X = (SP) + 1
> F049 A3 00 SUBD $00,x ; D = D - X[0]
> F04B 37 PSHB ; store B
> F04C E6 00 25 LDAB $00,y ; B = Y[0]
> F04F 25 03 BLO $F054 ; if D < 0 goto $F054
> F051 3D MUL ; D = A * B
> F052 20 04 BRA $F058 ; goto $F058
> F054 3D MUL ; D = A * B
> F055 A0 00 E3 SUBA $00,y ; A = A - Y[0]
> F058 E3 00 ADDD $00,x ; D = D + X[0]
> F05A ED 00 STD $00,x ; X[0] = D
> F05C 32 PULA ; get A from stack (was B)
> F05D E6 00 3D LDAB $00,y ; B = Y[0]
> F060 3D MUL ; D = A * B
> F061 A9 01 ADCA $01,x ; A = A + X[1] + carry
> F063 16 TAB ; B = A
> F064 A6 00 LDAA $00,x ; A = X[0]
> F066 89 00 ADCA #$00 ; A = A + carry
> F068 38 PULX ; restore X
> F069 39 RTS ; return
typedef unsigned short uint16; /* compiler dependent: any 16 bit unsigned type */
typedef unsigned long uint32;  /* compiler dependent: any unsigned type of at least 32 bits */

uint16 Blend(uint16 numD, uint16 numX, unsigned char *ratio)
{
    uint32 temp, portionD, portionX;

    portionD = *ratio;          /* weight of the D input, 0..255 */
    portionX = 256 - *ratio;    /* weight of the X input, 1..256 */
    temp = (numD * portionD) + (numX * portionX);
    return (uint16)((temp + 128) >> 8); /* round, then drop the low 8 bits */
}
Yes, really! Now let's do a blow by blow analysis:
First, put the contents of X on the stack where it can be accessed
more easily. The address of that stack slot is placed into X so
that $00,x can be used to access the value.
PSHX
TSX
Now compute the difference between the X and D inputs. Save the low
order byte on the stack for later. Working on the difference means that
only one 8x16 multiply will be needed.
SUBD $00,x
PSHB
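Why the difference trick saves a multiply can be checked in C. This is
only a rough sketch; the function names are made up for illustration and
do not come from the ROM. It computes the unrounded 24 bit sum both ways:
once as D*ratio + X*(256-ratio), and once as X*256 + (D-X)*ratio.

```c
#include <stdint.h>

/* Direct form: two multiplies, D*ratio + X*(256 - ratio). */
uint32_t blend_two_multiplies(uint16_t numD, uint16_t numX, uint8_t ratio)
{
    return (uint32_t)numD * ratio + (uint32_t)numX * (256u - ratio);
}

/* Difference form: X*256 + (D - X)*ratio, a single multiply. */
uint32_t blend_one_multiply(uint16_t numD, uint16_t numX, uint8_t ratio)
{
    int32_t diff = (int32_t)numD - (int32_t)numX;   /* may be negative */
    return (uint32_t)((int32_t)numX * 256 + diff * ratio);
}
```

Both functions return the same value for any inputs, because the X*ratio
terms cancel when the second form is expanded.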
If D >= X, in other words, if the difference is positive, multiply the
upper byte of the difference (in A) by the blend ratio. Afterwards, the
value in D is aligned with bits 2^23 through 2^8 of the full 24 bit
product (of the difference and the blend ratio).
LDAB $00,y
BLO @0
MUL
BRA @1
If D < X, multiply the upper byte of the difference by the blend ratio.
The subtract fixes the result of the unsigned multiply instruction to
account for the signed (negative) operand. (Essentially, 256 is added
to A to make it positive, the multiply is performed on unsigned/positive
operands, and then 256 times the multiplier (old B) is subtracted to
arrive at the proper value.)
@0 MUL
SUBA $00,y
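The sign fix can be sketched in C as follows. The helper name and its
"negative" argument (standing in for the borrow from the SUBD) are
assumptions made for illustration only.

```c
#include <stdint.h>

/* Multiply the upper byte of the difference by the ratio using only an
 * unsigned 8x8 multiply. When the subtraction borrowed, the byte "hi"
 * really stands for the negative number hi - 256. */
int32_t signed_hi_times_ratio(uint8_t hi, int negative, uint8_t ratio)
{
    uint16_t product = (uint16_t)(hi * ratio);   /* what MUL computes */

    if (negative) {
        /* (hi - 256) * ratio == hi*ratio - 256*ratio. Subtracting
         * 256*ratio from D is the same as subtracting ratio from A. */
        return (int32_t)product - 256 * (int32_t)ratio;
    }
    return product;
}
```

For example, hi = $FE with a borrow stands for -2, and with a ratio of
100 the helper returns -200.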
Sum this portion of the difference into the original X input. This
is effectively doing a preemptive >>8 of the (as yet) unfinished product
calculation. If the original D >= X, then the value on the stack is
increased most of the way towards the final function result. If the
original D < X, then the value on the stack is decreased past the
final result - the final stage will increase it back up as needed.
@1 ADDD $00,x
STD $00,x
Now multiply the (saved) lower byte of the difference by the ratio.
This generates a result that needs to be added to bits 2^15 through 2^0
of the full 24 bit product. Because this product will be >>8, only the
result in A is fully needed. The low result byte in B will be used just
for rounding. The 68HC11 will conveniently place this rounding info
into the C flag.
PULA
LDAB $00,y
MUL
Sum the rest of the product into the X number. Rounding is rippled up
through the upper byte.
ADCA $01,x
TAB
LDAA $00,x
ADCA #$00
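The rounding step can be modelled in C like this. It is a sketch, not a
cycle accurate emulation; the struct and function names are invented
here. It relies on the MUL instruction leaving bit 7 of the low result
byte in the C flag.

```c
#include <stdint.h>

/* Model of MUL: D = A * B, with C set to bit 7 of the low result byte. */
typedef struct { uint8_t a, b; unsigned carry; } MulResult;

MulResult hc11_mul(uint8_t a, uint8_t b)
{
    uint16_t d = (uint16_t)(a * b);
    MulResult r;
    r.a = (uint8_t)(d >> 8);
    r.b = (uint8_t)(d & 0xFF);
    r.carry = (d >> 7) & 1u;
    return r;
}

/* Add the rounded upper byte of (diff_lo * ratio) into a 16 bit partial sum. */
uint16_t add_rounded_low_product(uint16_t partial, uint8_t diff_lo, uint8_t ratio)
{
    MulResult m = hc11_mul(diff_lo, ratio);
    unsigned lo = (partial & 0xFFu) + m.a + m.carry;   /* ADCA $01,x */
    unsigned hi = (partial >> 8) + (lo >> 8);          /* ADCA #$00 */
    return (uint16_t)(((hi & 0xFFu) << 8) | (lo & 0xFFu));
}
```

With a partial sum of $12FF, a low difference byte of $10 and a ratio of
$08, the product is $0080. That sets the carry, which ripples through
both bytes and gives $1300.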
Clear off the intermediate value stored on the stack, and return to the
caller. Even though this looks like a restore of X, it isn't because
that stack location has been modified. So, the function result is in D,
X has been trashed, Y has not been modified.
PULX
RTS
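Putting the steps together, here is a C model of the whole routine next
to the closed form, so the two can be compared. The names are made up
for this sketch, and the model only tracks the values that matter (it
ignores everything about the real stack except the one 16 bit slot).

```c
#include <stdint.h>

/* Closed form: (D*ratio + X*(256 - ratio) + 128) >> 8. */
uint16_t blend_reference(uint16_t d, uint16_t x, uint8_t ratio)
{
    uint32_t temp = (uint32_t)d * ratio + (uint32_t)x * (256u - ratio);
    return (uint16_t)((temp + 128u) >> 8);
}

/* Step by step model of the assembly routine. */
uint16_t blend_hc11_model(uint16_t d, uint16_t x, uint8_t ratio)
{
    uint16_t slot = x;                          /* PSHX, TSX */
    int borrow = d < x;                         /* C flag after SUBD */
    uint16_t diff = (uint16_t)(d - x);          /* SUBD $00,x */
    uint8_t diff_lo = (uint8_t)(diff & 0xFFu);  /* PSHB */
    uint8_t diff_hi = (uint8_t)(diff >> 8);

    uint16_t acc = (uint16_t)(diff_hi * ratio); /* LDAB $00,y / MUL */
    if (borrow)                                 /* BLO taken */
        acc = (uint16_t)(acc - ((uint16_t)ratio << 8));  /* SUBA $00,y */

    slot = (uint16_t)(slot + acc);              /* ADDD $00,x / STD $00,x */

    uint16_t low = (uint16_t)(diff_lo * ratio); /* PULA / LDAB / MUL */
    unsigned carry = (low >> 7) & 1u;           /* C = bit 7 of B */
    unsigned lo = (slot & 0xFFu) + (low >> 8) + carry;   /* ADCA $01,x */
    unsigned hi = (slot >> 8) + (lo >> 8);               /* ADCA #$00 */
    return (uint16_t)(((hi & 0xFFu) << 8) | (lo & 0xFFu));
}
```

For D = 1000, X = 2000 and a ratio of 64, both functions return 1750.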
This function is likely used by loading a new 16 bit value into D, the
old filter contents into X, pointing Y at the blend ratio, invoking the
function, and then storing D into the filter. A blend ratio of 0
means that the "new" value is ignored and the filter value never
changes. A blend ratio of 128 moves the filter half way towards the
"new" value.
When filtering 8 bit numbers, it can be useful to make the filter have 8
fractional bits. In that case, put the 8 bit value into A, zero out B
(or set it to 128 aka half), and then use the upper byte of the filter
value as the "output".
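In C terms, the usage described above might look something like the
sketch below. The helper names are hypothetical, and blend() simply
repeats the arithmetic of the C version of Blend() so the example stands
on its own.

```c
#include <stdint.h>

/* Same arithmetic as the C version of Blend(). */
uint16_t blend(uint16_t numD, uint16_t numX, const unsigned char *ratio)
{
    uint32_t temp = (uint32_t)numD * *ratio + (uint32_t)numX * (256u - *ratio);
    return (uint16_t)((temp + 128u) >> 8);
}

/* Hypothetical helper: move a 16 bit filter towards a new 16 bit sample. */
void filter16_update(uint16_t *filter, uint16_t sample, unsigned char ratio)
{
    *filter = blend(sample, *filter, &ratio);
}

/* Hypothetical helper: 8 bit samples, filter keeps 8 fractional bits. */
uint8_t filter8_update(uint16_t *filter, uint8_t sample, unsigned char ratio)
{
    uint16_t in = (uint16_t)(sample << 8);   /* A = sample, B = 0 */
    *filter = blend(in, *filter, &ratio);
    return (uint8_t)(*filter >> 8);          /* upper byte is the output */
}
```

Starting from a filter value of 1000 and feeding in 2000 with a ratio of
128 gives 1500, then 1750. With a ratio of 0 the filter stays at 1000.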
BTW, there are several ways to write this function. The optimal
implementation depends a lot on the target processor, desired rounding,
and so on. The 'HC11 is not an easy architecture for this function. GM
probably "paid" $1000 to $2000 for this one function.
--
Ludis Langens ludis (at) cruzers (dot) com
Mac, Fiero, & engine controller goodies: http://www.cruzers.com/~ludis/