grovel - Unnamed repository; edit this file 'description' to name the repository.

	Commit message	Author	Age	Files	Lines
*	math: fix fmodl for IEEE binary128•••This trivial copy-paste bug went unnoticed due to lack of testing. No currently supported target archs are affected.	Szabolcs Nagy	2015-02-09	1	-1/+1
*	math: fix __fpclassifyl(-0.0) for IEEE binary128•••The sign bit was not cleared before checking for 0 so -0.0 was misclassified as FP_SUBNORMAL instead of FP_ZERO.	Szabolcs Nagy	2015-02-08	1	-3/+2
*	add parentheses in fma.c to clarify intent and silence warnings	Szabolcs Nagy	2015-02-08	1	-1/+1
*	math: use fnstsw consistently instead of fstsw in x87 asm•••fnstsw does not wait for pending unmasked x87 floating-point exceptions and it is the same as fstsw when all exceptions are masked which is the only environment libc supports.	Szabolcs Nagy	2014-11-05	11	-11/+11
*	math: fix x86_64 and x32 asm not to use sahf instruction•••Some early x86_64 cpus (released before 2006) did not support sahf/lahf instructions so they should be avoided (intel manual says they are only supported if CPUID.80000001H:ECX.LAHF-SAHF[bit 0] = 1). The workaround simplifies exp2l and expm1l because fucomip can be used instead of the fucomp;fnstsw;sahf sequence copied from i386. In fmodl and remainderl sahf is replaced by a simple bit test.	Szabolcs Nagy	2014-11-05	6	-28/+14
*	math: use the rounding idiom consistently•••the idiomatic rounding of x is n = x + toint - toint; where toint is either 1/EPSILON (x is non-negative) or 1.5/EPSILON (x may be negative and nearest rounding mode is assumed) and EPSILON is according to the evaluation precision (the type of toint is not very important, because single precision float can represent the 1/EPSILON of ieee binary128). in case of FLT_EVAL_METHOD!=0 this avoids a useless store to double or float precision, and the long double code became cleaner with 1/LDBL_EPSILON instead of ifdefs for toint. __rem_pio2f and __rem_pio2 functions slightly changed semantics: on i386 a double-rounding is avoided so close to half-way cases may get evaluated differently eg. as sin(pi/4-eps) instead of cos(pi/4+eps)	Szabolcs Nagy	2014-10-31	13	-58/+89
*	fix rint.c and rintf.c when FLT_EVAL_METHOD!=0•••The old code used the rounding idiom incorrectly: y = (double)(x + 0x1p52) - 0x1p52; the cast is useless if FLT_EVAL_METHOD==0 and causes a second rounding if FLT_EVAL_METHOD==2 which can give incorrect result in nearest rounding mode, so the correct idiom is to add/sub a power-of-2 according to the characteristics of double_t. This did not cause actual bug because only i386 is affected where rint is implemented in asm. Other rounding functions use a similar idiom, but they give correct results because they only rely on getting a neighboring integer result and the rounding direction is fixed up separately independently of the current rounding mode. However they should be fixed to use the idiom correctly too.	Szabolcs Nagy	2014-10-31	2	-4/+22
*	always provide __fpclassifyl and __signbitl definitions•••previously the external definitions of these functions were omitted on archs where long double is the same as double, since the code paths in the math.h macros which would call them are unreachable. however, even if they are unreachable, the definitions are still mandatory. omitting them is invalid C, and in the case of a non-optimizing compiler, will result in a link error.	Rich Felker	2014-10-08	2	-1/+9
*	math: fix exp10 not to raise invalid exception on NaN•••This was not caught earlier because gcc incorrectly generates quiet relational operators that never raise exceptions.	Szabolcs Nagy	2014-09-18	3	-4/+13
*	fix exp10l.c to include float.h•••the previous commit was a no op in exp10l because LDBL_* macros were implicitly 0 (the preprocessor does not warn about undefined symbols).	Szabolcs Nagy	2014-09-08	1	-0/+1
*	prune math code on archs with binary64 long double•••__polevll, __p1evll and exp10l were provided on archs when long double is the same as double. The first two were completely unused and exp10l can be a wrapper around exp10.	Szabolcs Nagy	2014-09-08	2	-0/+10
*	math: fix aliasing violation in long double wrappers•••modfl and sincosl were passing long double* instead of double* to the wrapped double precision functions (on archs where long double and double have the same size). This is fixed now by using temporaries (this is not optimized to a single branch so the generated code is a bit bigger). Found by Morten Welinder.	Szabolcs Nagy	2014-04-11	2	-2/+10
*	x32 port (diff against vanilla x86_64)	rofl0r	2014-02-23	18	-69/+69
*	import vanilla x86_64 code as x32	rofl0r	2014-02-23	30	-0/+396
*	math: add drem and dremf weak aliases to i386 remainder asm•••weak_alias was only in the c code, so drem was missing on platforms where remainder is implemented in asm.	Szabolcs Nagy	2014-01-08	2	-0/+6
*	math: define _GNU_SOURCE when implementing non-standard math functions•••this makes the prototypes in math.h visible so they are checked against the function definitions	Szabolcs Nagy	2013-12-12	6	-0/+6
*	math: clean up __rem_pio2•••- remove the HAVE_EFFICIENT_IRINT case: fn is an exact integer, so it can be converted to int32_t a bit more efficiently than with a cast (the rounding mode change can be avoided), but musl does not support this case on any arch. - __rem_pio2: use double_t where possible - __rem_pio2f: use less assignments to avoid stores on i386 - use unsigned int bit manipulation (and union instead of macros) - use hexfloat literals instead of named constants	Szabolcs Nagy	2013-11-24	3	-71/+53
*	math: add (obsolete) bsd drem and finite functions	Szabolcs Nagy	2013-11-21	4	-0/+20
*	math: lgamma cleanup (simpler sin(pi*x) for the negative case)•••* simplify sin_pi(x) (don't care about inexact here, the result is inexact anyway, and x is not so small to underflow) * in lgammal add the previously removed special case for x==1 and x==2 (to fix the sign of zero in downward rounding mode) * only define lgammal on supported long double platforms * change tgamma so the generated code is a bit smaller	Szabolcs Nagy	2013-11-21	4	-202/+110
*	math: extensive log.c cleanup•••The log, log2 and log10 functions share a lot of code and to a lesser extent log1p too. A small part of the code was kept separately in __log1p.h, but since it did not capture much of the common code and it was inlined anyway, it did not solve the issue properly. Now the log functions have significant code duplication, which may be resolved later, until then they need to be modified together. logl, log10l, log2l, log1pl: * Fix the sign when the return value should be -inf. * Remove the volatile hack from log10l (seems unnecessary) log1p, log1pf: * Change the handling of small inputs: only |x|<2^-53 is special (then it is enough to return x with the usual subnormal handling) this fixes the sign of log1p(0) in downward rounding. * Do not handle the k==0 case specially (other than skipping the elaborate argument reduction) * Do not handle 1+x close to power-of-two specially (this code was used rarely, did not give much speed up and the precision wasn't better than the general) * Fix the correction term formula (c=1-(u-x) was used incorrectly when x<1 but (double)(x+1)==2, this was not a critical issue) * Use the exact same method for calculating log(1+f) as in log (except in log1p the c correction term is added to the result). log, logf, log10, log10f, log2, log2f: * Use double_t and float_t consistently. * Now the first part of log10 and log2 is identical to log (until the return statement, hopefully this makes maintenance easier). * Most special case formulas were removed (close to power-of-two and k==0 cases), they increase the code size without providing precision or performance benefits (and obfuscate the code). Only x==1 is handled specially so in downward rounding mode the sign of zero is correct (the general formula happens to give -0). * For x==0 instead of -1/0.0 or -two54/0.0, return -1/(x*x) to force raising the exception at runtime. Arg reduction code is changed (slightly simplified) * The thresholds for arg reduction to [sqrt(2)/2,sqrt(2)] are now consistently the [0x3fe6a09e00000000,0x3ff6a09dffffffff] and the [0x3f3504f3,0x3fb504f2] intervals for double and float reductions respectively (the exact threshold values are not critical) * Remove the obsolete comment for the FLT_EVAL_METHOD!=0 case in log2f (The same code is used for all eval methods now, on i386 slightly simpler code could be used, but we have asm there anyway) all: * Fix signed int arithmetics (using unsigned for bitmanipulation) * Fix various comments	Szabolcs Nagy	2013-10-28	14	-583/+369
*	math: fix rare underflow issue in fma•••the issue is described in commits 1e5eb73545ca6cfe8b918798835aaf6e07af5beb and ffd8ac2dd50f99c3c83d7d9d845df9874ec3e7d5	Szabolcs Nagy	2013-10-07	3	-13/+55
*	math: use sqrtl if FLT_EVAL_METHOD==2 in acosh and acoshf•••this makes acosh slightly more precise around 1.0 on i386	Szabolcs Nagy	2013-10-07	2	-0/+13
*	math: remove an unused variable from modfl	Szabolcs Nagy	2013-10-06	1	-1/+0
*	math: remove code duplication in erfl found by clang analyzer•••erfl had some superfluous code left around after the last erf cleanup. the issue was reported by Alexander Monakov	Szabolcs Nagy	2013-10-04	1	-13/+2
*	math: remove a useless assignment in lgammal found by clang analyzer•••the issue was reported by Alexander Monakov	Szabolcs Nagy	2013-10-04	1	-2/+2
*	fix x86_64 lrintl asm, again•••the underlying problem was not incorrect sign extension (fixed in the previous commit to this file by nsz) but that code that treats "long" as 32-bit was copied blindly from i386 to x86_64. now lrintl is identical to llrintl on x86_64, as it should be.	Rich Felker	2013-09-13	1	-2/+2
*	math: remove STRICT_ASSIGN from exp2f (see previous commit)	Szabolcs Nagy	2013-09-06	1	-1/+1
*	math: remove STRICT_ASSIGN macro•••gcc did not always drop excess precision according to c99 at assignments before version 4.5 even if -std=c99 was requested which caused badly broken mathematical functions on i386 when FLT_EVAL_METHOD!=0 but STRICT_ASSIGN was not used consistently and it is worked around for old compilers with -ffloat-store so it is no longer needed the new convention is to get the compiler respect c99 semantics and when excess precision is not harmful use float_t or double_t or to specialize code using FLT_EVAL_METHOD	Szabolcs Nagy	2013-09-06	10	-12/+13
*	math: support invalid ld80 representations in fpclassify•••apparently gnulib requires invalid long double representations to be handled correctly in printf so we classify them according to how the fpu treats them: bad inf is nan, bad nan is nan, bad normal is nan and bad subnormal/zero is minimal normal	Szabolcs Nagy	2013-09-05	1	-2/+4
*	math: fix atanh (overflow and underflow issues)•••in atanh exception handling was left to the called log functions, but the argument to those functions could underflow or overflow. use double_t and float_t to avoid some useless stores on x86	Szabolcs Nagy	2013-09-05	3	-14/+37
*	math: remove libc.h include from libm.h•••libc.h is only for weak_alias so include it directly where it is used	Szabolcs Nagy	2013-09-05	4	-1/+5
*	math: fix acoshf on negative values•••acosh(x) is invalid for x<1, acoshf tried to be clever using signed comparisons to handle all x<2 the same way, but the formula was wrong on large negative values.	Szabolcs Nagy	2013-09-05	2	-7/+8
*	math: fix expm1l on x86_64 (avoid underflow for large negative x)•••copy the fix from i386: return -1 instead of exp2l(x)-1 when x <= -65	Szabolcs Nagy	2013-09-05	3	-3/+13
*	math: fix lrintl.s on x86_64 (use movslq to signextend the result)	Szabolcs Nagy	2013-09-05	1	-1/+1
*	math: fix exp2l asm on x86 (raise underflow correctly)•••there were two problems: * omitted underflow on subnormal results: exp2l(-16383.5) was calculated as sqrt(2)*2^-16384, the last bits of sqrt(2) are zero so the down scaling does not underflow even though the result is in subnormal range * spurious underflow for subnormal inputs: exp2l(0x1p-16400) was evaluated as f2xm1(x)+1 and f2xm1 raised underflow (because inexact subnormal result) the first issue is fixed by raising underflow manually if x is in (-32768,-16382] and not integer (x-0x1p63+0x1p63 != x) the second issue is fixed by treating x in (-0x1p64,0x1p64) specially for these fixes the special case handling was completely rewritten	Szabolcs Nagy	2013-09-05	2	-67/+78
*	math: cosmetic cleanup (use explicit union instead of fshape and dshape)	Szabolcs Nagy	2013-09-05	10	-100/+84
*	math: remove *_WORD64 macros from libm.h•••only fma used these macros and the explicit union is clearer	Szabolcs Nagy	2013-09-05	1	-13/+13
*	math: long double fix (use ldshape union)•••* use new ldshape union consistently * add ld128 support to frexpl * simplify sqrtl comment (ld64 is not just arm)	Szabolcs Nagy	2013-09-05	8	-51/+24
*	math: use float_t and double_t in scalbnf and scalbn•••remove STRICT_ASSIGN (c99 semantics is assumed) and use the conventional union to prepare the scaling factor (so libm.h is no longer needed)	Szabolcs Nagy	2013-09-05	2	-16/+20
*	math: fix remaining old long double code (erfl, fmal, lgammal, scalbnl)•••in lgammal don't handle 1 and 2 specially, in fma use the new ldshape union instead of ld80 one.	Szabolcs Nagy	2013-09-05	5	-93/+65
*	math: cbrt cleanup and long double fix•••* use float_t and double_t * cleanup subnormal handling * bithacks according to the new convention (ldshape for long double and explicit unions for float and double)	Szabolcs Nagy	2013-09-05	3	-72/+59
*	math: fix underflow in exp.c and long double handling in exp2l•••* don't care about inexact flag * use double_t and float_t (faster, smaller, more precise on x86) * exp: underflow when result is zero or subnormal and not -inf * exp2: underflow when result is zero or subnormal and not exact * expm1: underflow when result is zero or subnormal * expl: don't underflow on -inf * exp2: fix incorrect comment * expm1: simplify special case handling and overflow properly * expm1: cleanup final scaling and fix negative left shift ub (twopk)	Szabolcs Nagy	2013-09-05	8	-182/+139
*	math: long double trigonometric cleanup (cosl, sinl, sincosl, tanl)•••ld128 support was added to internal kernel functions (__cosl, __sinl, __tanl, __rem_pio2l) from freebsd (not tested, but should be a good start for when ld128 arch arrives) __rem_pio2l had some code cleanup, the freebsd ld128 code seems to gather the results of a large reduction with precision loss (fixed the bug but a todo comment was added for later investigation) the old copyright was removed from the non-kernel wrapper functions (cosl, sinl, sincosl, tanl) since these are trivial and the interesting parts and comments had been already rewritten.	Szabolcs Nagy	2013-09-05	8	-236/+228
*	math: long double inverse trigonometric cleanup (acosl, asinl, atanl, atan2l)•••* added ld128 support from freebsd fdlibm (untested) * using new ldshape union instead of IEEEl2bits * inexact status flag is not supported	Szabolcs Nagy	2013-09-05	6	-103/+180
*	math: rewrite hypot•••method: if there is a large difference between the scale of x and y then the larger magnitude dominates, otherwise reduce x,y so the argument of sqrt (x*x+y*y) does not overflow or underflow and calculate the argument precisely using exact multiplication. If the argument has less error than 1/sqrt(2) ~ 0.7 ulp, then the result has less error than 1 ulp in nearest rounding mode. the original fdlibm method was the same, except it used bit hacks instead of dekker-veltkamp algorithm, which is problematic for long double where different representations are supported. (the new hypot and hypotl code should be smaller and faster on 32bit cpu archs with fast fpu), the new code behaves differently in non-nearest rounding, but the error should be still less than 2ulps. ld80 and ld128 are supported	Szabolcs Nagy	2013-09-05	3	-324/+135
*	math: rewrite remainder functions (remainder, remquo, fmod, modf)•••* results are exact * modfl follows truncl (raises inexact flag spuriously now) * modf and modff only had cosmetic cleanup * remainder is just a wrapper around remquo now * using iterative shift+subtract for remquo and fmod * ld80 and ld128 are supported as well	Szabolcs Nagy	2013-09-05	11	-1008/+470
*	math: rewrite rounding functions (ceil, floor, trunc, round, rint)•••* faster, smaller, cleaner implementation than the bit hacks of fdlibm * use arithmetics like y=(double)(x+0x1p52)-0x1p52, which is an integer neighbor of x in all rounding modes (0<=x<0x1p52) and only use bithacks when that's faster and smaller (for float it usually is) * the code assumes standard excess precision handling for casts * long double code supports both ld80 and ld128 * nearbyint is not changed (it is a wrapper around rint)	Szabolcs Nagy	2013-09-05	15	-904/+273
*	math: fix logb(-0.0) in downward rounding mode•••use -1/(x*x) instead of -1/(x+0) to return -inf, -0+0 is -0 in downward rounding mode	Szabolcs Nagy	2013-09-05	3	-6/+6
*	math: ilogb cleanup•••* consistent code style * explicit union instead of typedef for double and float bit access * turn FENV_ACCESS ON to make 0/0.0f raise invalid flag * (untested) ld128 version of ilogbl (used by logbl which has ld128 support)	Szabolcs Nagy	2013-09-05	3	-16/+43
*	long double cleanup, initial commit•••new ldshape union, ld128 support is kept, code that used the old ldshape union was rewritten (IEEEl2bits union of freebsd libm is not touched yet) ld80 __fpclassifyl no longer tries to handle invalid representation	Szabolcs Nagy	2013-09-05	6	-70/+61
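
Several entries above (2014-10-31 "use the rounding idiom consistently" and the rint.c fix of the same date) describe rounding x with n = x + toint - toint, where toint is 1.5/EPSILON and EPSILON matches the evaluation precision. The following is only a minimal sketch of that idiom, not the code from the repository: the helper name round_to_int_sketch and the EPS macro are made up for illustration, and it assumes the default round-to-nearest mode and |x| < 2^51.

```c
#include <assert.h>
#include <float.h>
#include <math.h>

/* EPS is chosen from the evaluation precision of double_t, as the log
 * entries describe. EPS is an illustrative name, not taken from the log. */
#if FLT_EVAL_METHOD == 0 || FLT_EVAL_METHOD == 1
#define EPS DBL_EPSILON
#elif FLT_EVAL_METHOD == 2
#define EPS LDBL_EPSILON
#else
#error "this sketch assumes FLT_EVAL_METHOD is 0, 1 or 2"
#endif

/* Hypothetical helper: round x to an integral value by adding and
 * subtracting a large constant. Assumes round-to-nearest and |x| < 2^51. */
static double round_to_int_sketch(double x)
{
    /* 1.5/EPS is used because x may be negative. */
    static const double_t toint = 1.5 / EPS;
    double_t n;

    assert(fabs(x) < 0x1p51);   /* keep x + toint inside one binade */
    n = x + toint - toint;      /* the addition discards the fraction bits */
    return (double)n;           /* exact: n is a small integer */
}
```

In round-to-nearest mode halfway cases go to the even neighbor, so round_to_int_sketch(2.5) gives 2.0 and round_to_int_sketch(3.5) gives 4.0. Keeping the intermediate in double_t is what avoids the second rounding that the rint.c entry warns about when FLT_EVAL_METHOD is 2.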