I'm used to developing fast FP algorithms for low-end MCUs, and to avoiding FP calculations altogether (using integers instead) to speed up algorithms drastically.
Now, what I've noticed recently is a lack of education amongst engineers in how to code FP libs optimally: they use Taylor series instead of Chebyshev polynomials. The standard since the '60s has been to use Chebyshev polynomials for calculating trig and similar functions. See the book Computer Approximations by Hart and Cheney. This is the bible of such calculations.
Now, is what happened that you realized you were using Taylor series instead of Chebyshev polynomials for your calculations and, on making the change, got a speed increase of 26x?
What I'm interested in is what exactly you did to get your speed increase, and I'm also interested in what algorithm you are using to do your trig calculations. Alternatively, is the source code available?
The reason I want this is to evaluate whether I should use your Mathlib in the future or write my own. If you got a 26x speed increase, I have to conclude that you were doing something wrong in your original incarnation. My next conclusion is that if that happened, what else are you doing wrong?
So, I need to know how you are doing your trig calculations. Note that I'm using trig as this is the standard I'm used to. Other calculations are done differently, e.g. SQRT is optimally done using the Newton-Raphson double iteration algorithm. So, trig is the one I'm interested in... let's say cosine, for example.
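For reference, this is roughly what I mean by the SQRT approach, as a minimal sketch in C. It assumes that "double iteration" means two Newton-Raphson steps after a cheap initial guess. The function name, the linear seed, and the use of frexp/ldexp for range reduction are illustrative choices of mine, not a tuned implementation.

```c
#include <math.h>

/* Square root by range reduction plus two Newton-Raphson steps.
 * Sketch only: the seed is a simple tangent-line guess, so the result
 * is good to roughly 5 or 6 significant digits, not full precision. */
double sqrt_newton2(double x)
{
    if (x < 0.0) return NAN;   /* no real square root */
    if (x == 0.0) return 0.0;

    /* Split x = m * 2^e with m in [0.5, 1), then force e to be even
     * so that sqrt(x) = sqrt(m) * 2^(e/2), with m in [0.5, 2). */
    int e;
    double m = frexp(x, &e);
    if (e % 2 != 0) {
        m *= 2.0;
        e -= 1;
    }

    /* Initial guess: tangent line to sqrt(m) at m = 1. */
    double y = 0.5 + 0.5 * m;

    /* Two Newton-Raphson iterations on y^2 - m = 0. */
    y = 0.5 * (y + m / y);
    y = 0.5 * (y + m / y);

    return ldexp(y, e / 2);
}
```

Each iteration roughly squares the relative error, so the quality of the seed decides how many iterations are needed. A better seed (a table lookup or a fitted linear guess) gets more digits out of the same two steps.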
Thank you
-D