Well, in the implementation I have currently there are as few conversions as possible. I'm trying to work purely in fix16, and I only convert to floating point when I print out the data (for telemetry purposes, which doesn't happen often: once out of every 16 new data refreshes).
Fixed Point Math library
Posted 4 years ago #
crenn - why convert to float? Isn't there a function for direct fixed I/O? I had a quick look and, I should say, I couldn't find one.
There is no reason to use float for I/O. If 'fix16 to string' is missing, output could be done by hand.
With a couple of functions to extract the integer and fractional parts:
static inline int32_t fix16_int_part(const fix16_t inVal) { return (inVal >> 16); }
static inline int32_t fix16_frac_part(const fix16_t inVal) { return (inVal & 0x0000ffff); }
Then actual output would be a bit clumsy:
SerialUSB.print(fix16_int_part(fx));
SerialUSB.print(".");
SerialUSB.print(fix16_frac_part(fx));
but you could wrap that up in a function or macro.
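One caveat with printing the two parts directly: the fractional part is a count of 1/65536 steps, not decimal digits, so 1.5 would come out as "1.32768", and negative values split awkwardly because >> rounds toward minus infinity. Here is a minimal sketch of a wrapper that scales the fraction to four decimal places first. The name fix16_format is hypothetical (it is not part of libfixmath), and the typedef assumes fix16_t is a signed 32-bit Q16.16 value.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

typedef int32_t fix16_t; /* assumption: Q16.16 stored in a signed 32-bit integer */

/* Hypothetical helper, not a libfixmath function: writes v as a decimal
   string with 4 fractional digits (truncated, not rounded) into buf.
   Returns what snprintf returns (the length of the formatted text). */
static int fix16_format(char *buf, size_t size, fix16_t v)
{
    assert(buf != NULL && size > 0);

    /* Work on the magnitude so negatives print as sign + magnitude. */
    uint32_t mag = (v < 0) ? (uint32_t)0 - (uint32_t)v : (uint32_t)v;
    uint32_t int_part = mag >> 16;
    uint32_t frac_bits = mag & 0xFFFFu;

    /* Rescale the fraction from units of 1/65536 to units of 1/10000.
       65535 * 10000 still fits in 32 bits. */
    uint32_t frac_dec = (frac_bits * 10000u) >> 16;

    return snprintf(buf, size, "%s%lu.%04lu", (v < 0) ? "-" : "",
                    (unsigned long)int_part, (unsigned long)frac_dec);
}
```

On the Maple the resulting buffer could then be handed to SerialUSB.print(buf) in one call.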
Input would be a tiny bit more, but not a lot. It is reading an integer, a '.' and a natural number, then gluing them together.
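A rough sketch of that input side, under the same assumption that fix16_t is a signed 32-bit Q16.16 value. The function name fix16_parse is made up for illustration, it reads from a string rather than from the serial port, and it does no overflow or error checking.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int32_t fix16_t; /* assumption: Q16.16 stored in a signed 32-bit integer */

/* Hypothetical helper, not a libfixmath function: parses "[-]digits[.digits]".
   Only the first 4 fractional digits are used. No overflow or error checks. */
static fix16_t fix16_parse(const char *s)
{
    assert(s != NULL);

    int negative = (*s == '-');
    if (negative)
        s++;

    /* The integer: digits before the '.' */
    uint32_t int_part = 0;
    while (*s >= '0' && *s <= '9')
        int_part = int_part * 10u + (uint32_t)(*s++ - '0');

    /* The natural number after the '.', plus its power-of-ten scale. */
    uint32_t frac = 0, scale = 1;
    if (*s == '.') {
        s++;
        for (int i = 0; i < 4 && *s >= '0' && *s <= '9'; i++) {
            frac = frac * 10u + (uint32_t)(*s++ - '0');
            scale *= 10u;
        }
    }

    /* Glue together: frac/scale converted to 1/65536 units, rounded to nearest. */
    uint32_t frac_bits = ((frac << 16) + scale / 2u) / scale;
    uint32_t mag = (int_part << 16) + frac_bits;

    return negative ? -(fix16_t)mag : (fix16_t)mag;
}
```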
(full disclosure: I am not a member of LeafLabs staff)
(Edit: WARNING - I have not compiled or tested that code)
Posted 4 years ago # -
holy cow. If you are using ANY floating point calculation, even a simple sin, you MUST convert your project to support this library. I spent 15 minutes editing my Makefile to accommodate the new files and WOW!
Replacing:
xPos = centerX - (int16_t)(sin(t * RAD) * diffX);
with:
xPos = centerX - fix16_to_int(fix16_mul(fix16_sin(fix16_from_float(t * RAD)), fix16_from_float(diffX)));
literally improved performance by 2.8x! I'm sure it can be improved even more if I remove both fix16_from_float calls!
-robodude666
EDIT: LOL! Pre-calculating fix16_from_float(diffX) bumped overall performance gain from 2.8x to 3.6x.
EDIT2: replacing t * RAD with its equivalent fix16 value yields an overall gain in performance from 2.8x to 4.3x.
Posted 4 years ago # -
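For anyone wondering what "its equivalent fix16 value" could look like in EDIT2: below is a sketch that assumes RAD is pi/180 and t is an angle in degrees (the post does not say either way). The constant is worked out offline, so no float conversion is left at run time. The names q16_mul, Q16_RAD and q16_deg_to_rad are made up here, and q16_mul is a plain stand-in rather than libfixmath's fix16_mul.

```c
#include <stdint.h>

typedef int32_t fix16_t; /* assumption: Q16.16 stored in a signed 32-bit integer */
#define FIX16_ONE 0x00010000

/* Stand-in Q16.16 multiply: 64-bit product shifted back down (truncating).
   The right shift of a negative value is arithmetic on GCC/ARM, but the
   C standard leaves it implementation-defined. */
static fix16_t q16_mul(fix16_t a, fix16_t b)
{
    return (fix16_t)(((int64_t)a * b) >> 16);
}

/* Assumed meaning of RAD: pi/180, precomputed as round(0.0174532925 * 65536).
   Only about 3 to 4 significant digits survive in Q16.16. */
#define Q16_RAD ((fix16_t)1144)

/* Hypothetical helper: degrees (Q16.16) to radians (Q16.16) with no float math. */
static fix16_t q16_deg_to_rad(fix16_t deg)
{
    return q16_mul(deg, Q16_RAD);
}
```

Such a small constant loses precision in 16.16, so keeping the angle in radians throughout, or using a lookup indexed by degrees, may be the better choice when accuracy matters.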
this is very cool. Now to get some audio code going on it.
Posted 4 years ago # -
Scuse the question: why not use operator overloading? Is it not supported in the compiler?
Also, is there a fixed point FFT algo, or should I just redo the one from Numerical Recipes?
Posted 4 years ago # -
C doesn't support operator overloading, and I haven't tried the C++ header yet. I will be doing so later today; I'm going to be taking a look at a couple of other libraries first.
Posted 4 years ago # -
Thanks for all the updates crenn, I was able to duplicate all your code and do the same float comparison on my Maple v5 board. I got exactly the same results as you report.
Cheers!
Chris Troutner
Posted 4 years ago # -
I wrote up a post on my blog on how I turned my Maple into a high speed data logger. I implement this fixed point library and have examples:
http://thesolarpowerexpert.com/an-open-source-high-speed-data-logger/
Posted 4 years ago # -
libfixmath has a bug in the division routine - it gets simple arithmetic badly wrong (at least, in the version I downloaded a week ago).
You could try to fix it, but I suggest "pushing" it off the libraries page and replacing it with Trenki's fixed point library, here http://www.trenki.net/. It's beautifully written c++ with overloaded operators, so you can write expressions with simple math-like syntax instead of ugly nested function calls. It gives accurate answers on my Maple (mini).
Notes: My tests show that Trenki's library does fast arithmetic with e.g. 24.8 precision, but if you use 16.16 then multiplication and division are actually a bit slower than floating point. The trig functions are only implemented for 16.16 precision, but they are REALLY fast, and accurate. Whether/how much your code speeds up using this library will depend on what you are doing. Matrices will multiply quite a bit faster (because they are a mix of * and +, and + is much faster for fixed's) and anything with trig operations will fly. If you are only doing arithmetic and 24.8 is good enough, it's fast. It's written with inline functions, so you might make your code smaller at the cost of a little speed, by making them ordinary functions.
I am going to have a crack at inserting faster * and / functions into this library over the weekend, and will let you know if I succeed (or, what not to try if I don't!).
Posted 4 years ago # -
mpaulin - thank you for sharing your findings. Have you got a little test/demo of that division bug that you could 'pastebin'?
I wonder if the bug has crept in, or "always" been there?
When you say "24.8 precision", I assume you mean 24 'integer' bits, and 8 'fractional' bits. Yes?
When you say
"... if you use 16.16 then multiplication and division are actually a bit slower than floating point."
did you really mean to say that, or did you mean to say "libfixmath" instead of "floating point"?
AFAIK, libfixmath is 2x-4x faster than floating point (I think robodude666 wrote even faster in some cases), and so would it be fair to assume that libfixmath would be 2x-4x faster than "trenki" for 16.16 multiplication?
Sounds like a merge of the best bits is needed :-)
Posted 4 years ago # -
Here's a little test routine for libfixmath. Comment in/out the lines after the for-loop to test speed and accuracy of different operations.
snip===============
#include <fixmath.h>

float af = 3.0;
float bf = 2.0;
float cf;

fix16_t a = fix16_from_float(af);
fix16_t b = fix16_from_float(bf);
fix16_t c;

int i;
long t;

void setup() {}

void loop() {
  // Time 100 fixed-point operations; leave exactly one line uncommented.
  t = micros();
  for (i = 0; i < 100; i++)
    // c = a + b;
    // c = fix16_mul(a, b); // c = a*b
    c = fix16_div(a, b);    // c = a/b
    // c = fix16_sin(a);
  t = micros() - t;
  SerialUSB.print("libfixmath got "); SerialUSB.print(fix16_to_float(c), 8);
  SerialUSB.print(", took "); SerialUSB.println(t);

  // Same operation in floating point, for comparison.
  t = micros();
  for (i = 0; i < 100; i++)
    // cf = af + bf;
    // cf = af * bf;
    cf = af / bf;
    // cf = sin(af);
  t = micros() - t;
  SerialUSB.print("math got "); SerialUSB.print(cf, 8);
  SerialUSB.print(", took "); SerialUSB.println(t);

  delay(1000);
}
snip========
In libfixmath, multiplication and division run about as fast as floating point. Yes, I mean floating point. It just isn't faster.
The real problem is that division is sometimes badly incorrect (e.g. 3/2 = 1.25). And when it's wrong it seems to be much slower (factor of about 3).
Addition is about 10x faster than floating point.
sin is a lot faster (about x10) but not very accurate.
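For checking results like the 3/2 case, it helps to have a slow but obviously correct reference to compare against. The sketch below is not libfixmath's fix16_div; it is a plain 64-bit version written for this purpose, with a made-up name, assuming fix16_t is a signed 32-bit Q16.16 value.

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t fix16_t; /* assumption: Q16.16 stored in a signed 32-bit integer */

/* Reference Q16.16 division (hypothetical name): widen the numerator to
   64 bits, scale it by 2^16, then do one integer divide.
   Truncates toward zero and does not check for overflow of the result. */
static fix16_t q16_div_ref(fix16_t a, fix16_t b)
{
    assert(b != 0);
    return (fix16_t)(((int64_t)a * 65536) / b);
}
```

With a = 3.0 (0x00030000) and b = 2.0 (0x00020000) this returns 0x00018000, i.e. 1.5, so any library result other than that for the same operands points at the library.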
Now here is an equivalent test file for trenki's library:
snip ===============
#include <fixed_class.h>
using namespace fixedpoint;

#define P 16

float af = 3;
float bf = 2;
float cf;

fixed_point<P> a = af;
fixed_point<P> b = bf;
fixed_point<P> c;

int i;
long t;

void setup() {}

void loop() {
  // Time 1000 fixed-point operations; leave exactly one line uncommented.
  t = micros();
  for (i = 0; i < 1000; i++)
    // c = a + b;
    c = a * b;
    // c = a / b;
    // c = sin(a);
  t = micros() - t;
  SerialUSB.print("Trenki got "); SerialUSB.print(fix2float<P>(c.intValue), 8);
  SerialUSB.print(", took "); SerialUSB.println(t);

  // Same operation in floating point, for comparison.
  t = micros();
  for (i = 0; i < 1000; i++)
    // cf = af + bf;
    cf = af * bf;
    // cf = af / bf;
    // cf = sin(af);
  t = micros() - t;
  SerialUSB.print("math got "); SerialUSB.print(cf, 8);
  SerialUSB.print(", took "); SerialUSB.println(t);

  delay(1000);
}
snip=================
Trenki's library does * about as fast as floating-point *. / is slow (factor of 3) but has the virtue of getting the right answer. + is fast, but not as fast as in libfixmath (x5).
sin is very fast (x25) and accurate.
So, you can get a big speedup from either library, depending on what operations you use. But don't use division in libfixmath. As gbulmer said, maybe we can hack something from the two of them. It's just unfortunate that they both fall down in division - one is uselessly slow, the other is uselessly inaccurate.
Posted 4 years ago # -
OK here's a fix: Use the 'inv' function in Trenki's library to implement division a/b as a*inv(b). This seems to be about as fast as division in libfixmath, with the added virtue of getting the right answer. Look in Trenki's fixed_func.cpp to see how to do this elegantly.
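To make the arithmetic of that idea concrete, here is a sketch of a/b computed as a*inv(b) in 16.16. It does not reproduce Trenki's inv (which is not shown in this thread); q16_inv below is a stand-in that uses a 64-bit divide, so it illustrates the numerics only, not the speed. All names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t fix16_t; /* assumption: Q16.16 stored in a signed 32-bit integer */

/* Stand-in Q16.16 multiply: 64-bit product shifted back down (truncating). */
static fix16_t q16_mul(fix16_t a, fix16_t b)
{
    return (fix16_t)(((int64_t)a * b) >> 16);
}

/* Stand-in reciprocal: 1.0 is 2^16, so 1/b in Q16.16 is 2^32 / b.
   Overflows for |b| < 2 raw units; that case is not handled here. */
static fix16_t q16_inv(fix16_t b)
{
    assert(b != 0);
    return (fix16_t)(((int64_t)1 << 32) / b);
}

/* a/b implemented as a * inv(b). */
static fix16_t q16_div_via_inv(fix16_t a, fix16_t b)
{
    return q16_mul(a, q16_inv(b));
}
```

3/2 comes out exactly (0x00018000), but the reciprocal is rounded before the multiply, so results can be off in the last bit: 3/3 gives 0x0000FFFF rather than 0x00010000 with this sketch. Worth keeping in mind when comparing accuracy.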
My benchmarks are not very exhaustive - the speed and accuracy of an operation can vary depending on operands - so a bit more testing might be in order. However, at this point it seems to me that Trenki's library with this fix is about as fast as libfixmath for most things, more than twice as fast for trig functions, and generally quite a lot more accurate (especially for division!). In addition, it is much easier to use because of the operator overloading.
and p.s. yes when I say (32-p).p precision I mean 32-p integer bits and p fractional bits.
Posted 4 years ago # -
"Use the 'inv' function in Trenki's library to implement division a/b as a*inv(b)"
Has anyone got this to work with anything except P=16? The fixed_class.h file says:
// math functions
// no default implementation
template <int p> inline fixed_point<p> sin(fixed_point<p> a);
template <int p> inline fixed_point<p> cos(fixed_point<p> a);
template <int p> inline fixed_point<p> sqrt(fixed_point<p> a);
template <int p> inline fixed_point<p> rsqrt(fixed_point<p> a);
template <int p> inline fixed_point<p> inv(fixed_point<p> a);
I've looked at using fixinv<P>(c), but this returns a result with 32-P fractional bits, and the only way I've found to convert this back to P fractional bits is to do a fix2float<32-P> and then a float2fix<P>, which seems quite wasteful (and brings in floating point operations)!
Posted 3 years ago # -
iainism - have you tried using p=16, and fixed_point<16> inv(..) as an experiment?
The comment in fixed_func.h says:
// q is the precision of the input
// output has 32-q bits of fraction
template <int q> inline int fixinv(int32_t a) ...
Which seems to agree with what you wrote.
So have you tried instantiating template <int q> inline int fixinv(int32_t a) with <32-p>?
Posted 3 years ago # -
Hi gbulmer, yes inv<16> works fine, (there's actually a function written for p=16 as a special case, but any other p fails at compile time, with a 'no match' error).
I can use the division operator to get where I need to be. I was just mindful of the discussion of the additional time taken for division, as discussed above, and so curious if anyone had found a way to use fixinv for p!=16, without using the two conversions, which add a floating point operation each.
I think that one could do a bitwise shift on the result of fixinv by |32-2p| bits, in one direction or the other depending on whether p is smaller or larger than 32-p. I'll test this once I'm back at my machine and feed back.
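A sketch of that shift, untested against Trenki's library itself. It only assumes what the fixed_func.h comment quoted above says, namely that the raw result has 32-p fractional bits when the input has p. The helper names are made up, and p is passed as a run-time argument here rather than as a template parameter.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: re-express a raw fixed-point value that has
   from_frac fractional bits as one with to_frac fractional bits.
   Right shifts drop low bits (truncation); left shifts can overflow. */
static int32_t q_requantize(int32_t v, int from_frac, int to_frac)
{
    assert(from_frac >= 0 && from_frac < 32);
    assert(to_frac >= 0 && to_frac < 32);

    int shift = from_frac - to_frac;
    if (shift >= 0)
        return v >> shift; /* arithmetic shift on GCC/ARM for negative v */
    return (int32_t)((uint32_t)v << -shift);
}

/* A raw result with 32-p fractional bits, brought back to p fractional
   bits: a shift by 32-2p, to the right if p < 16 and to the left if p > 16. */
static int32_t fixinv_result_to_p(int32_t raw, int p)
{
    assert(p > 0 && p < 32);
    return q_requantize(raw, 32 - p, p);
}
```

For p=8, a raw result of 0x00800000 (0.5 with 24 fractional bits) becomes 128 (0.5 with 8 fractional bits) with no floating point involved. For p > 16 the left shift cannot recover bits that fixinv never produced, so the result has fewer meaningful fractional bits than p suggests.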
Posted 3 years ago #