path: root/src/malloc/aligned_alloc.c
Date | Commit message | Author | Files | Lines
2018-10-18 | optimize internal putc_unlocked macro used in putc | Rich Felker | 1 | -1/+2
to check whether flush due to line buffering is needed, the int-type character argument must be truncated to unsigned char for comparison. if the original value is subsequently passed to __overflow, it must be preserved, adding to register pressure. since it doesn't matter, truncate all uses so the original value is no longer live.
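A minimal sketch of the truncation idea, assuming a simplified stream structure. `struct sketch_file`, `sketch_overflow`, and `sketch_putc_unlocked` are hypothetical names for illustration, not musl's actual definitions. Every use of the argument goes through the same `(unsigned char)` cast, so the untruncated int never needs to stay live:

```c
/* Hypothetical, simplified stream; not musl's real FILE layout. */
struct sketch_file {
    unsigned char buf[8];
    unsigned char *wpos, *wend;  /* write position and end of buffer */
    int lbf;                     /* character that forces a flush, or -1 */
    int overflow_calls;          /* bookkeeping for this sketch only */
    int last_overflow_arg;
};

/* Stand-in for the slow path; records what it was handed. */
static int sketch_overflow(struct sketch_file *f, int c)
{
    f->overflow_calls++;
    f->last_overflow_arg = c;
    return (unsigned char)c;
}

/* The argument is truncated at every use: the comparison, the store,
 * and the slow-path call. Like any such macro, it evaluates c more
 * than once, so c must not have side effects. */
#define sketch_putc_unlocked(c, f) \
    ( ((unsigned char)(c) != (f)->lbf && (f)->wpos != (f)->wend) \
      ? (*(f)->wpos++ = (unsigned char)(c)) \
      : sketch_overflow((f), (unsigned char)(c)) )
```

With this shape, passing an out-of-range int such as `'a' + 256` stores and returns `'a'`, and the slow path also only ever sees the truncated value.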
2018-10-18 | fix wrong result for putc variants due to operator precedence | Rich Felker | 1 | -1/+1
the internal putc_unlocked macro was wrongly returning a meaningless boolean result rather than the written character or EOF. bug was found by reading (very surprising) asm.
2018-10-18 | further optimize getc/putc when locking is needed | Rich Felker | 2 | -10/+10
check whether the lock is free before loading the calling thread's tid. if so, just use a dummy tid value that cannot compare equal to any actual thread id (because it's one bit wider). this also avoids the need to save the tid and pass it to locking_getc or locking_putc, reducing register pressure. this change might slightly hurt the case where the caller already holds the lock, but it does not affect the single-threaded case, and may significantly improve the multi-threaded case, especially on archs where loading the thread pointer is disproportionately expensive like early mips and arm ISA levels. but even on i386 it helps, at least on some machines; I measured roughly a 10-15% improvement.
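The ordering described above can be sketched with C11 atomics. This is an illustration under stated assumptions, not the code from the commit: `struct sketch_stream`, `sketch_self_tid`, `sketch_enter`, and the value of `SKETCH_DUMMY_TID` are all hypothetical, and the tid lookup is faked through a field so that the number of lookups can be counted.

```c
#include <stdatomic.h>

/* Assumed to be wider than any real thread id, so it can never
 * compare equal to one. The value is an assumption of this sketch. */
enum { SKETCH_DUMMY_TID = 0x3fffffff };

struct sketch_stream {
    atomic_int lock;   /* 0 when free, otherwise the owner's tid */
    int fake_tid;      /* pretend tid of the calling thread */
    int tid_loads;     /* how many times the tid was looked up */
};

/* Stand-in for loading the tid via the thread pointer, which can be
 * expensive on some architectures. */
static int sketch_self_tid(struct sketch_stream *s)
{
    s->tid_loads++;
    return s->fake_tid;
}

/* Returns 1 if the caller already owns the lock, 0 if the free lock
 * was taken with the dummy owner, -1 if another thread holds it
 * (a real implementation would block instead of returning). */
static int sketch_enter(struct sketch_stream *s)
{
    int l = atomic_load(&s->lock);
    /* The tid is consulted only when the lock is not free. */
    if (l != 0 && l == sketch_self_tid(s))
        return 1;
    int expected = 0;
    if (atomic_compare_exchange_strong(&s->lock, &expected, SKETCH_DUMMY_TID))
        return 0;
    return -1;
}

static void sketch_leave(struct sketch_stream *s)
{
    atomic_store(&s->lock, 0);
}
```

In the uncontended case `sketch_enter` returns 0 without ever calling `sketch_self_tid`, which is the saving the message describes. The tid is loaded only when the lock is already held.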
2018-10-18 | use prototype for function pointer in static link libc init barrier | Rich Felker | 1 | -1/+1
this is not needed for correctness, but doesn't hurt, and in some cases the compiler may pessimize the call assuming the callee might be variadic when it lacks a prototype.
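As a rough illustration of the difference, the sketch below calls through a pointer whose type carries a full prototype. The names `sketch_entry_fn`, `sketch_entry`, and `sketch_dispatch` are hypothetical. The unprototyped form appears only in a comment, because C23 redefines an empty parameter list to mean `(void)`.

```c
/* Unprototyped form (pre-C23 meaning), for comparison only:
 *
 *     int (*fn)() = sketch_entry;
 *     fn(argc, argv);
 *
 * Here the compiler knows nothing about the callee's parameters, so
 * it cannot rule out a variadic callee and may have to emit whatever
 * extra setup the ABI requires for such calls. */

/* Prototyped function type: parameter types are known at the call. */
typedef int sketch_entry_fn(int, char **);

static int sketch_entry(int argc, char **argv)
{
    (void)argv;
    return argc * 2;
}

static int sketch_dispatch(int argc, char **argv)
{
    sketch_entry_fn *fn = sketch_entry;
    return fn(argc, argv);
}
```

Both forms behave the same when the arguments match the callee. The prototype only gives the compiler enough information to generate a plain, non-variadic call.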
2018-10-18 | fix error in constraints for static link libc init barrier | Rich Felker | 1 | -1/+1
commit 4390383b32250a941ec616e8bff6f568a801b1c0 inadvertently used "r" instead of "0" for the input constraint, which only happened to work for the configuration I tested it on because it usually makes sense for the compiler to choose the same input and output register.
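A small sketch of why the matching constraint matters, assuming GCC or Clang extended asm. The function names are hypothetical and this is not the code from the commit. The asm body is empty, so the output register holds the intended pointer only if the input was placed in that same register. `"0"` guarantees that, while `"r"` merely makes it likely.

```c
typedef int sketch_fn(int);

static int sketch_stage2(int x)
{
    return x + 1;
}

static int sketch_call_through_barrier(int x)
{
    sketch_fn *target = sketch_stage2;
    sketch_fn *fn;
    /* "0" ties the input to the register chosen for output operand 0.
     * With "r" the compiler could legally pick a different register,
     * and since the asm emits no instructions, fn would then hold an
     * unrelated value. The "memory" clobber keeps memory accesses
     * from being moved across this point. */
    __asm__ ("" : "=r"(fn) : "0"(target) : "memory");
    return fn(x);
}
```

Because the compiler cannot see through the asm statement, it has to treat `fn` as an unknown pointer and cannot inline or hoist the call.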