newlib

Commit Graph

Author	SHA1	Message	Date
Kito Cheng	654398db84	RISC-V: Fix header guard for sys/fenv.h	2019-08-02 09:34:39 +02:00
Martin Erik Werner	739e89cbe6	or1k: Avoid write outside setjmp buf & shrink buf Update the offsets used to save registers into the setjmp jmp_buf structure in order to: * Avoid writing the supervision register outside the buffer and thus clobbering something on the stack. Previously the supervision register was written at offset 124 while the buffer was of length 124. * Shrink the jmp_buf down to the size actually needed, by avoiding holes at the locations of omitted registers.	2019-06-27 12:51:54 +02:00
Martin Erik Werner	8b080534ca	or1k: Correct longjmp return value Invert equality check instruction to correct the return value handling in longjmp. The return value should be the value of the second argument to longjmp, unless the argument value was 0 in which case it should be 1. Previously, longjmp would set return value 1 if the second argument was non-zero, and 0 if it was 0, which was incorrect.	2019-06-27 09:09:37 +02:00
Jeff Johnston	eb429ad509	Fix __getreent stack calculations for AMD GCN From: Andrew Stubbs <ams@codesourcery.com> Fix a bug in which the high-part of 64-bit values are being corrupted, leading to erroneous stack overflow errors. The problem was only that the mixed-size calculations are being treated as signed when they should be unsigned.	2019-06-07 13:57:45 -04:00
Jim Wilson	5c86f0da5f	RISC-V: Add size optimized memcpy, memmove, memset and strcmp. This patch adds implementations of memcpy, memmove, memset and strcmp optimized for size. The changes have been tested in riscv/riscv-gnu-toolchain by riscv-dejagnu with riscv-sim.exp/riscv-sim-nano.exp.	2019-05-22 17:36:57 -07:00
Jozef Lawrynowicz	1e6c561d48	Implement reduced code size "tiny" printf and puts "tiny" printf is derived from _vfprintf_r in libc/stdio/nano-vfprintf.c. "tiny" puts has been implemented so that it just calls write, without any other processing. Support for buffering, reentrancy and streams has been removed from these functions to achieve reduced code size. This reduced code size implementation of printf and puts can be enabled in an application by passing "--wrap printf" and "--wrap puts" to the GNU linker. This will replace references to "printf" and "puts" in user code with "__wrap_printf" and "__wrap_puts" respectively. If there is no implementation of these __wrap* functions in user code, these "tiny" printf and puts implementations will be linked into the final executable. The wrapping mechanism is supposed to be invisible to the user: - A GCC wrapper option such as "-mtiny-printf" will be added to alias these wrap commands. - If the user is unaware of the "tiny" implementation, and chooses to implement their own __wrap_printf and __wrap_puts, their own implementation will be automatically chosen over the "tiny" printf and puts from the library. Newlib must be configured with --enable-newlib-nano-formatted-io for the "tiny" printf and puts functions to be built into the library. Code size reduction examples: printf("Hello World\n") baseline - msp430-elf-gcc gcc-8_3_0-release text data bss 5638 214 26 "tiny" puts enabled text data bss 714 90 20 printf("Hello %d\n", a) baseline - msp430-elf-gcc gcc-8_3_0-release text data bss 10916 614 28 "tiny" printf enabled text data bss 4632 280 20	2019-04-15 14:22:33 +02:00
Jozef Lawrynowicz	2af6ad9f05	Copy prerequisite file for "tiny" printf implementation Use newlib/libc/stdio/nano-vfprintf.c as baseline for tiny-printf.c	2019-04-15 14:22:30 +02:00
Andrew Stubbs	e8b23909e4	Add missing includes. These missing includes were causing build warnings, but also a real bug in which the "size" parameter to "write" was being passed in 32-bit, whereas it ought to be 64-bit. This led to intermittent bad behaviour.	2019-03-25 16:44:10 +01:00
Jozef Lawrynowicz	b14a879d85	Remove matherr, and SVID and X/Open math library configurations Default math library configuration is now IEEE	2019-01-23 10:46:24 +01:00
Jeff Johnston	1787e9d033	AMD GCN Port contributed by Andrew Stubbs <ams@codesourcery.com> Add support for the AMD GCN GPU architecture. This is primarily intended for use with OpenMP and OpenACC offloading. It can also be used for stand-alone programs, but this is intended mostly for testing the compiler and is not expected to be useful in general. The GPU architecture is highly parallel, and therefore Newlib must be configured to use dynamic re-entrancy, and thread-safe malloc. The only I/O available is via a shared-memory interface provided by libgomp and the gcn-run tool included with GCC. At this time this is limited to stdout, argc/argv, and the return code.	2019-01-15 10:48:08 -05:00
Jeff Johnston	5726873100	Bump release to 3.1.0 for yearly snapshot	2018-12-31 23:40:11 -05:00
Wilco Dijkstra	df7824d1a4	Fix issue with dst bias in memset This patch fixes an issue in the previous memset loop change. If the zva size is >= 256 and there are more than 64 bytes left in the tail, we could enter the loop and thus need to rebias dst by 32 as well. Since no known CPUs use this size this can't be tested natively, so I've tested it on a simulator initialized with a large zva size. --	2018-11-08 16:45:19 +00:00
Wilco Dijkstra	d80db60066	Adjust writeback in non-zero memset This fixes an inefficiency in the non-zero memset. Delaying the writeback until the end of the loop is slightly faster on some cores - this shows ~5% performance gain on Cortex-A53 when doing large non-zero memsets. Tested against the GLIBC testsuite.	2018-11-06 14:59:51 +00:00
Sebastian Huber	da418955f5	Move common <sys/dirent.h> content to <dirent.h> Move common content of the various <sys/dirent.h> and the latest FreeBSD <dirent.h> to <dirent.h>. Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>	2018-10-11 08:29:16 +02:00
Jon Beniston	a9cfb33b6c	Add --disable-newlib-fno-builtin to allow compilation without -fno-builtin for smaller and faster code.	2018-08-31 15:40:42 -04:00
Keith Packard	82dfae9ab0	Use __inhibit_loop_to_libcall in all memset/memcpy implementations This macro selects a compiler option that disables recognition of common memset/memcpy patterns and converting those to direct memset/memcpy calls. Signed-off-by: Keith Packard <keithp@keithp.com>	2018-08-29 16:05:37 +02:00
Siddhesh Poyarekar	d02cc7a09d	strcmp.S: Improve performance for misaligned strings Replace the simple byte-wise compare in the misaligned case with a dword compare with page boundary checks in place. For simplicity I've chosen a 4K page boundary so that we don't have to query the actual page size on the system. This results in up to 3x improvement in performance in the unaligned case on falkor and about 2.5x improvement on mustang as measured using bench-strcmp in glibc.	2018-07-13 13:27:54 +02:00
Siddhesh Poyarekar	2d9f35c2cc	memcmp.S: optimize for medium to large sizes This improved memcmp provides a fast path for compares up to 16 bytes and then compares 16 bytes at a time, thus optimizing loads from both sources. The glibc memcmp microbenchmark retains performance (with an error of ~1ns) for smaller compare sizes and reduces up to 31% of execution time for compares up to 4K on the APM Mustang. On Qualcomm Falkor this improves to almost 48%, i.e. it is almost 2x improvement for sizes of 2K and above.	2018-07-13 13:27:54 +02:00
Siddhesh Poyarekar	f44eee8f1b	Improve strncmp for mutually misaligned inputs The mutually misaligned inputs on aarch64 are compared with a simple byte copy, which is not very efficient. Enhance the comparison similar to strcmp by loading a double-word at a time. The peak performance improvement (i.e. 4k maxlen comparisons) due to this on the strncmp microbenchmark in glibc is as follows: falkor: 3.5x (up to 72% time reduction) cortex-a73: 3.5x (up to 71% time reduction) cortex-a53: 3.5x (up to 71% time reduction) All mutually misaligned inputs from 16 bytes maxlen onwards show upwards of 15% improvement and there is no measurable effect on the performance of aligned/mutually aligned inputs.	2018-07-13 13:27:54 +02:00
Jeff Johnston	cd31fbb2ae	Add nvptx port. - From: Cesar Philippidis <cesar@codesourcery.com> Date: Tue, 10 Apr 2018 14:43:42 -0700 Subject: [PATCH] nvptx port This port adds support for Nvidia GPUs, which are primarily used as offload accelerators in OpenACC and OpenMP.	2018-04-13 15:42:37 -04:00
Sebastian Huber	1658a57715	epiphany: Additional setjmp() and longjmp() syms At least with Binutils 2.30 and GCC 7.3 we need symbol definitions without the leading underscore. Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>	2018-01-31 08:17:19 +01:00
Jeff Johnston	fffd2770db	Bump release to 3.0.0 for yearly snapshot - major release required due to removal of K&R support	2018-01-18 13:07:45 -05:00
Yaakov Selkowitz	7192f84096	ansification: remove _HAVE_STDC Signed-off-by: Yaakov Selkowitz <yselkowi@redhat.com>	2018-01-17 11:47:30 -06:00
Yaakov Selkowitz	70ee6b17df	ansification: remove _EXFUN, _EXFUN_NOTHROW Signed-off-by: Yaakov Selkowitz <yselkowi@redhat.com>	2018-01-17 11:47:29 -06:00
Yaakov Selkowitz	9087163804	ansification: remove _DEFUN Signed-off-by: Yaakov Selkowitz <yselkowi@redhat.com>	2018-01-17 11:47:26 -06:00
Yaakov Selkowitz	67ee0cac4c	ansification: remove _VOID Signed-off-by: Yaakov Selkowitz <yselkowi@redhat.com>	2018-01-17 11:47:20 -06:00
Yaakov Selkowitz	fff27f8429	ansification: remove _DEFUN_VOID Signed-off-by: Yaakov Selkowitz <yselkowi@redhat.com>	2018-01-17 11:47:19 -06:00
Yaakov Selkowitz	670b01da7f	ansification: remove _CAST_VOID Signed-off-by: Yaakov Selkowitz <yselkowi@redhat.com>	2018-01-17 11:47:17 -06:00
Yaakov Selkowitz	e6321aa6a6	ansification: remove _PTR Signed-off-by: Yaakov Selkowitz <yselkowi@redhat.com>	2018-01-17 11:47:16 -06:00
Yaakov Selkowitz	eea249da3b	ansification: remove _PARAMS Signed-off-by: Yaakov Selkowitz <yselkowi@redhat.com>	2018-01-17 11:47:13 -06:00
Yaakov Selkowitz	0bda30e1ff	ansification: remove _CONST Signed-off-by: Yaakov Selkowitz <yselkowi@redhat.com>	2018-01-17 11:47:08 -06:00
Yaakov Selkowitz	6783860a2e	ansification: remove _AND Signed-off-by: Yaakov Selkowitz <yselkowi@redhat.com>	2018-01-17 11:47:05 -06:00
Jon Turney	c006fd459f	makedoc: make errors visible Discard QUICKREF sections, rather than writing them to stderr Discard MATHREF sections, rather than discarding as an error Pass NOTES sections through to texinfo, rather than discarding as an error Don't redirect makedoc stderr to .ref file Remove makedoc output on error Remove .ref files from CLEANFILES Regenerate Makefile.ins Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>	2017-12-07 11:54:11 +00:00
Yaakov Selkowitz	1f1e477554	powerpc: remove TRAD_SYNOPSIS Signed-off-by: Yaakov Selkowitz <yselkowi@redhat.com>	2017-12-01 03:41:50 -06:00
Yaakov Selkowitz	ddd22ee069	nds32: remove TRAD_SYNOPSIS Signed-off-by: Yaakov Selkowitz <yselkowi@redhat.com>	2017-12-01 03:41:50 -06:00
Yaakov Selkowitz	4e8c64b928	microblaze: remove TRAD_SYNOPSIS Signed-off-by: Yaakov Selkowitz <yselkowi@redhat.com>	2017-12-01 03:41:50 -06:00
Kito Cheng	6864c08b94	Change license to FreeBSD License for RISC-V - To prevent confusion about which BSD license variant is used (2- or 3-clause), change the license to the FreeBSD License, which unambiguously refers to the 2-clause license.	2017-08-21 11:08:54 +02:00
Kito Cheng	363dbb9e44	Add RISC-V port for newlib Contributor list: - Andrew Waterman <andrew@sifive.com> - Palmer Dabbelt <palmer@dabbelt.com> - Kito Cheng <kito.cheng@gmail.com> - Scott Beamer <sbeamer@eecs.berkeley.edu>	2017-08-16 18:00:58 -04:00
Richard Earnshaw	d6cac3e1da	[arm] Fix strcpy for unified syntax on ARMv4t thumb. ARMv4t does not support mov between two low registers. Now that we use unified syntax, mov instructions need converting to movs.	2017-07-21 11:23:27 +01:00
Ian Tessier via newlib	4bce7ecbe1	arm: Update strcpy.c to use UAL syntax. With this change the arm platform can now be fully compiled with Clang. Tested by comparing the output with GCC 4.8.2, and Clang 4.0, using a variety of arches, big/little endianness, and arm/thumb mode to verify the generated assembly output matches between GCC vs Clang with UAL, and also GCC with UAL vs GCC with non-UAL, for all preprocessor code blocks. The only difference found is an extra nop at the end of the function when compiled with GCC using armv7-a/thumb/little-endian/-O2 compared to Clang. The nop is not emitted when compiled in big-endian mode.	2017-07-20 16:18:29 +02:00
Wilco Dijkstra	c86063bdc0	Optimized memcmp This is an optimized memcmp for AArch64. This is a complete rewrite using a different algorithm. The previous version split into cases where both inputs were aligned, the inputs were mutually aligned and unaligned using a byte loop. The new version combines all these cases, while small inputs of less than 8 bytes are handled separately. This allows the main code to be sped up using unaligned loads since there are now at least 8 bytes to be compared. After the first 8 bytes, align the first input. This ensures each iteration does at most one unaligned access and mutually aligned inputs behave as aligned. After the main loop, process the last 8 bytes using unaligned accesses. This improves performance of (mutually) aligned cases by 25% and unaligned by >500% (yes >6 times faster) on large inputs. ChangeLog: 2017-06-28 Wilco Dijkstra <wdijkstr@arm.com> * newlib/libc/machine/aarch64/memcmp.S (memcmp): Rewrite of optimized memcmp. GLIBC benchtests/bench-memcmp.c performance comparison for Cortex-A53: Length 1, alignment 1/ 1: 153% Length 1, alignment 1/ 1: 119% Length 1, alignment 1/ 1: 154% Length 2, alignment 2/ 2: 121% Length 2, alignment 2/ 2: 140% Length 2, alignment 2/ 2: 121% Length 3, alignment 3/ 3: 105% Length 3, alignment 3/ 3: 105% Length 3, alignment 3/ 3: 105% Length 4, alignment 4/ 4: 155% Length 4, alignment 4/ 4: 154% Length 4, alignment 4/ 4: 161% Length 5, alignment 5/ 5: 173% Length 5, alignment 5/ 5: 173% Length 5, alignment 5/ 5: 173% Length 6, alignment 6/ 6: 145% Length 6, alignment 6/ 6: 145% Length 6, alignment 6/ 6: 145% Length 7, alignment 7/ 7: 125% Length 7, alignment 7/ 7: 125% Length 7, alignment 7/ 7: 125% Length 8, alignment 8/ 8: 111% Length 8, alignment 8/ 8: 130% Length 8, alignment 8/ 8: 124% Length 9, alignment 9/ 9: 160% Length 9, alignment 9/ 9: 160% Length 9, alignment 9/ 9: 150% Length 10, alignment 10/10: 170% Length 10, alignment 10/10: 137% Length 10, alignment 10/10: 150% Length 
11, alignment 11/11: 160% Length 11, alignment 11/11: 160% Length 11, alignment 11/11: 160% Length 12, alignment 12/12: 146% Length 12, alignment 12/12: 168% Length 12, alignment 12/12: 156% Length 13, alignment 13/13: 167% Length 13, alignment 13/13: 167% Length 13, alignment 13/13: 173% Length 14, alignment 14/14: 167% Length 14, alignment 14/14: 168% Length 14, alignment 14/14: 168% Length 15, alignment 15/15: 168% Length 15, alignment 15/15: 173% Length 15, alignment 15/15: 173% Length 1, alignment 0/ 0: 134% Length 1, alignment 0/ 0: 127% Length 1, alignment 0/ 0: 119% Length 2, alignment 0/ 0: 94% Length 2, alignment 0/ 0: 94% Length 2, alignment 0/ 0: 106% Length 3, alignment 0/ 0: 82% Length 3, alignment 0/ 0: 87% Length 3, alignment 0/ 0: 82% Length 4, alignment 0/ 0: 115% Length 4, alignment 0/ 0: 115% Length 4, alignment 0/ 0: 122% Length 5, alignment 0/ 0: 127% Length 5, alignment 0/ 0: 119% Length 5, alignment 0/ 0: 127% Length 6, alignment 0/ 0: 103% Length 6, alignment 0/ 0: 100% Length 6, alignment 0/ 0: 100% Length 7, alignment 0/ 0: 82% Length 7, alignment 0/ 0: 91% Length 7, alignment 0/ 0: 87% Length 8, alignment 0/ 0: 111% Length 8, alignment 0/ 0: 124% Length 8, alignment 0/ 0: 124% Length 9, alignment 0/ 0: 136% Length 9, alignment 0/ 0: 136% Length 9, alignment 0/ 0: 136% Length 10, alignment 0/ 0: 136% Length 10, alignment 0/ 0: 135% Length 10, alignment 0/ 0: 136% Length 11, alignment 0/ 0: 136% Length 11, alignment 0/ 0: 136% Length 11, alignment 0/ 0: 135% Length 12, alignment 0/ 0: 136% Length 12, alignment 0/ 0: 136% Length 12, alignment 0/ 0: 136% Length 13, alignment 0/ 0: 135% Length 13, alignment 0/ 0: 136% Length 13, alignment 0/ 0: 136% Length 14, alignment 0/ 0: 136% Length 14, alignment 0/ 0: 136% Length 14, alignment 0/ 0: 136% Length 15, alignment 0/ 0: 136% Length 15, alignment 0/ 0: 136% Length 15, alignment 0/ 0: 136% Length 4, alignment 0/ 0: 115% Length 4, alignment 0/ 0: 115% Length 4, alignment 0/ 0: 115% Length 32, 
alignment 0/ 0: 127% Length 32, alignment 7/ 2: 395% Length 32, alignment 0/ 0: 127% Length 32, alignment 0/ 0: 127% Length 8, alignment 0/ 0: 111% Length 8, alignment 0/ 0: 124% Length 8, alignment 0/ 0: 124% Length 64, alignment 0/ 0: 128% Length 64, alignment 6/ 4: 475% Length 64, alignment 0/ 0: 131% Length 64, alignment 0/ 0: 134% Length 16, alignment 0/ 0: 128% Length 16, alignment 0/ 0: 119% Length 16, alignment 0/ 0: 128% Length 128, alignment 0/ 0: 129% Length 128, alignment 5/ 6: 475% Length 128, alignment 0/ 0: 130% Length 128, alignment 0/ 0: 129% Length 32, alignment 0/ 0: 126% Length 32, alignment 0/ 0: 126% Length 32, alignment 0/ 0: 126% Length 256, alignment 0/ 0: 127% Length 256, alignment 4/ 8: 545% Length 256, alignment 0/ 0: 126% Length 256, alignment 0/ 0: 128% Length 64, alignment 0/ 0: 171% Length 64, alignment 0/ 0: 171% Length 64, alignment 0/ 0: 174% Length 512, alignment 0/ 0: 126% Length 512, alignment 3/10: 585% Length 512, alignment 0/ 0: 126% Length 512, alignment 0/ 0: 127% Length 128, alignment 0/ 0: 129% Length 128, alignment 0/ 0: 128% Length 128, alignment 0/ 0: 129% Length 1024, alignment 0/ 0: 125% Length 1024, alignment 2/12: 611% Length 1024, alignment 0/ 0: 126% Length 1024, alignment 0/ 0: 126% Length 256, alignment 0/ 0: 128% Length 256, alignment 0/ 0: 127% Length 256, alignment 0/ 0: 128% Length 2048, alignment 0/ 0: 125% Length 2048, alignment 1/14: 625% Length 2048, alignment 0/ 0: 125% Length 2048, alignment 0/ 0: 125% Length 512, alignment 0/ 0: 126% Length 512, alignment 0/ 0: 127% Length 512, alignment 0/ 0: 127% Length 4096, alignment 0/ 0: 125% Length 4096, alignment 0/16: 125% Length 4096, alignment 0/ 0: 125% Length 4096, alignment 0/ 0: 125% Length 1024, alignment 0/ 0: 126% Length 1024, alignment 0/ 0: 126% Length 1024, alignment 0/ 0: 126% Length 8192, alignment 0/ 0: 125% Length 8192, alignment 63/18: 636% Length 8192, alignment 0/ 0: 125% Length 8192, alignment 0/ 0: 125% Length 16, alignment 1/ 2: 317% 
Length 16, alignment 1/ 2: 317% Length 16, alignment 1/ 2: 317% Length 32, alignment 2/ 4: 395% Length 32, alignment 2/ 4: 395% Length 32, alignment 2/ 4: 398% Length 64, alignment 3/ 6: 475% Length 64, alignment 3/ 6: 475% Length 64, alignment 3/ 6: 477% Length 128, alignment 4/ 8: 479% Length 128, alignment 4/ 8: 479% Length 128, alignment 4/ 8: 479% Length 256, alignment 5/10: 543% Length 256, alignment 5/10: 539% Length 256, alignment 5/10: 543% Length 512, alignment 6/12: 585% Length 512, alignment 6/12: 585% Length 512, alignment 6/12: 585% Length 1024, alignment 7/14: 611% Length 1024, alignment 7/14: 611% Length 1024, alignment 7/14: 611%	2017-06-29 20:36:35 +02:00
Sebastian Pop	9938a64ca9	aarch64: optimize the unaligned case of memcmp This brings to newlib a performance improvement that we developed in Bionic libc. That change has been submitted for review to Bionic libc: https://android-review.googlesource.com/418279 A similar patch has been submitted for review in glibc: https://sourceware.org/ml/libc-alpha/2017-06/msg01143.html Patch written by Vikas Sinha and Sebastian Pop. The performance was measured on the bionic-benchmarks on a hikey (aarch64 8xA53) board. There was no performance change to the existing benchmark and a performance improvement on the new benchmark for memcmp on the unaligned side. The new benchmark has been submitted for review at https://android-review.googlesource.com/414860 The overall performance improves by 18% for the small data set 8 and the performance improves by 450% for the large data set 64k. The base is with the libc from /system/lib64. The bionic libc with this patch is in /data. hikey:/data # export LD_LIBRARY_PATH=/system/lib64 hikey:/data # ./bionic-benchmarks --benchmark_filter='BM_string_memcmp' Run on (8 X 2.4 MHz CPU s) Benchmark Time CPU Iterations ---------------------------------------------------------------------- BM_string_memcmp/8 30 ns 30 ns 22955680 251.07MB/s BM_string_memcmp/64 57 ns 57 ns 12349184 1076.99MB/s BM_string_memcmp/512 305 ns 305 ns 2297163 1.56496GB/s BM_string_memcmp/1024 571 ns 571 ns 1225211 1.66912GB/s BM_string_memcmp/8k 4307 ns 4306 ns 162562 1.77177GB/s BM_string_memcmp/16k 8676 ns 8675 ns 80676 1.75887GB/s BM_string_memcmp/32k 19233 ns 19230 ns 36394 1.58695GB/s BM_string_memcmp/64k 36986 ns 36984 ns 18952 1.65029GB/s BM_string_memcmp_aligned/8 199 ns 199 ns 3519166 38.3336MB/s BM_string_memcmp_aligned/64 386 ns 386 ns 1810734 158.073MB/s BM_string_memcmp_aligned/512 1735 ns 1734 ns 403981 281.525MB/s BM_string_memcmp_aligned/1024 3200 ns 3200 ns 218838 305.151MB/s BM_string_memcmp_aligned/8k 25084 ns 25080 ns 28180 311.507MB/s 
BM_string_memcmp_aligned/16k 51730 ns 51729 ns 13521 302.057MB/s BM_string_memcmp_aligned/32k 103228 ns 103228 ns 6782 302.727MB/s BM_string_memcmp_aligned/64k 207117 ns 207087 ns 3450 301.806MB/s BM_string_memcmp_unaligned/8 339 ns 339 ns 2070998 22.5302MB/s BM_string_memcmp_unaligned/64 1392 ns 1392 ns 502796 43.8454MB/s BM_string_memcmp_unaligned/512 9194 ns 9194 ns 76133 53.1104MB/s BM_string_memcmp_unaligned/1024 18325 ns 18323 ns 38206 53.2963MB/s BM_string_memcmp_unaligned/8k 148579 ns 148574 ns 4713 52.5831MB/s BM_string_memcmp_unaligned/16k 298169 ns 298120 ns 2344 52.4118MB/s BM_string_memcmp_unaligned/32k 598813 ns 598797 ns 1085 52.188MB/s BM_string_memcmp_unaligned/64k 1196079 ns 1196083 ns 540 52.2539MB/s hikey:/data # export LD_LIBRARY_PATH=/data hikey:/data # ./bionic-benchmarks --benchmark_filter='BM_string_memcmp' Run on (8 X 2.4 MHz CPU s) Benchmark Time CPU Iterations ---------------------------------------------------------------------- BM_string_memcmp/8 30 ns 30 ns 23209918 252.802MB/s BM_string_memcmp/64 57 ns 57 ns 12348447 1076.95MB/s BM_string_memcmp/512 305 ns 305 ns 2296878 1.56471GB/s BM_string_memcmp/1024 572 ns 571 ns 1224426 1.6689GB/s BM_string_memcmp/8k 4309 ns 4308 ns 162491 1.77109GB/s BM_string_memcmp/16k 9348 ns 9345 ns 74894 1.63285GB/s BM_string_memcmp/32k 18329 ns 18322 ns 38249 1.6656GB/s BM_string_memcmp/64k 36992 ns 36981 ns 18952 1.65045GB/s BM_string_memcmp_aligned/8 199 ns 199 ns 3513925 38.3162MB/s BM_string_memcmp_aligned/64 386 ns 386 ns 1814038 158.192MB/s BM_string_memcmp_aligned/512 1735 ns 1735 ns 402279 281.502MB/s BM_string_memcmp_aligned/1024 3204 ns 3202 ns 218761 304.941MB/s BM_string_memcmp_aligned/8k 25577 ns 25569 ns 27406 305.548MB/s BM_string_memcmp_aligned/16k 52143 ns 52123 ns 13522 299.769MB/s BM_string_memcmp_aligned/32k 105169 ns 105127 ns 6637 297.26MB/s BM_string_memcmp_aligned/64k 206508 ns 206383 ns 3417 302.835MB/s BM_string_memcmp_unaligned/8 282 ns 282 ns 2482953 27.062MB/s 
BM_string_memcmp_unaligned/64 542 ns 541 ns 1298317 112.77MB/s BM_string_memcmp_unaligned/512 2152 ns 2152 ns 325267 226.915MB/s BM_string_memcmp_unaligned/1024 4025 ns 4025 ns 173904 242.622MB/s BM_string_memcmp_unaligned/8k 32276 ns 32271 ns 21818 242.09MB/s BM_string_memcmp_unaligned/16k 65970 ns 65970 ns 10554 236.851MB/s BM_string_memcmp_unaligned/32k 131241 ns 131242 ns 5129 238.11MB/s BM_string_memcmp_unaligned/64k 266159 ns 266160 ns 2661 234.821MB/s	2017-06-26 10:22:40 +02:00
Prakhar Bahuguna	21ff2cf930	Fix minor issues in memchr NEON implementation	2017-06-07 12:16:15 +02:00
Sebastian Huber	2693c1db69	Move ARM access.c from machine to sys The implementation of the POSIX access() function is nothing machine specific like memcpy(), etc. Move it back to the system domain. This avoids problems due to the include search order of the Newlib/GCC build which picks up machine includes before system includes. Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>	2017-05-25 12:34:53 -04:00
Prakhar Bahuguna	c47c9bdc1b	Optimise memchr for NEON-enabled processors	2017-04-06 18:19:20 +02:00
Catherine Moore	571c69656a	Use .syntax unified instead of .syntax divided.	2017-03-30 17:18:12 +02:00
Kyrill Tkachov	52a6da816f	arm: Fix addressing in optpld macro In patch `b219285f87` you have a syntax error in the PLD instruction. The syntax for the pld argument should be in square brackets as it's a memory address like so: pld [r1]. With your patch the newlib build fails for armv7-a targets. This patch fixes the build failures. Tested by making sure the newlib build completes successfully. 2016-01-26 Kyrylo Tkachov <kyrylo.tkachov@arm.com> * libc/machine/arm/strcpy.c (strcpy): Fix PLD assembly syntax. * libc/machine/arm/strlen-stub.c (strlen): Likewise.	2017-01-26 16:29:36 +01:00
Pat Pannuto	3ebc26958e	arm: Remove RETURN macro LTO can re-order top-level assembly blocks, which can cause this macro definition to appear after its use (or not at all), causing compilation failures. On modern toolchains (armv4t+), assembly should write `bx lr` in all cases, and linkers will transparently convert them to `mov pc, lr`, allowing us to simply remove the macro. (source: https://groups.google.com/forum/#!topic/comp.sys.arm/3l7fVGX-Wug and verified empirically) For the armv4.S file, preserve this macro to maximize backwards compatibility.	2017-01-25 13:32:09 +01:00
Pat Pannuto	b219285f87	arm: Remove optpld macro LTO can re-order top-level assembly blocks, which can cause this macro definition to appear after its use (or not at all), causing compilation failures. As the macro has very few uses, simply removing it by inlining is a simple fix. n.b. one of the macro invocations in strlen-stub.c was already guarded by the relevant #define, so it is simply converted directly to a pld	2017-01-25 13:32:09 +01:00
Pat Pannuto	e7332409cc	Remove unneeded references to arm_asm.h This should result in no functional changes, it simply removes references to arm_asm.h that did not use anything from that file.	2017-01-25 13:32:09 +01:00
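
The longjmp fix in commit 8b080534ca states the rule the or1k port has to follow: setjmp must appear to return the second argument of longjmp, except that an argument of 0 is reported as 1. A minimal portable C sketch of that rule, not the or1k assembly itself, could look like the following. The helper name setjmp_value_after_longjmp and the probe value 42 are made up for illustration.

```c
#include <setjmp.h>

/*
 * Hypothetical helper for illustration only; not part of newlib.
 * Jumps back to its own setjmp() with `val` and reports which value
 * setjmp() appeared to return on the second pass.
 */
int setjmp_value_after_longjmp(int val)
{
    jmp_buf env;

    /* setjmp() is used as the whole controlling expression of the
       switch, which is one of the contexts the C standard permits. */
    switch (setjmp(env)) {
    case 0:
        /* First pass: direct return from setjmp(). Jump back. */
        longjmp(env, val);   /* never returns */
    case 1:
        /* Reached for val == 1 and, per the C standard, for val == 0. */
        return 1;
    case 42:
        /* Reached for val == 42: the value is passed through as-is. */
        return 42;
    default:
        /* Any other non-zero value that was passed through. */
        return -1;
    }
}
```

On a conforming C library, calling the helper with 0 and with 1 both yield 1, while 42 yields 42. According to the commit message, the pre-fix or1k code would instead have reported 0 for an argument of 0 and 1 for every non-zero argument.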


434 Commits
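
Several of the memcmp commits listed above (c86063bdc0, 2d9f35c2cc, 9938a64ca9) speed up comparison by loading multiple bytes at a time instead of looping byte by byte. The real implementations are AArch64 assembly with alignment handling and tuned loads. The sketch below is only a rough C illustration of the general idea, under the assumption that a fixed 8-byte chunk and a byte-wise fallback are enough to show it. The name chunked_memcmp is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical illustration, not newlib's implementation: compare two
 * buffers 8 bytes at a time, then resolve the first differing chunk
 * (or the short tail) with a byte loop.
 */
int chunked_memcmp(const void *a, const void *b, size_t n)
{
    const unsigned char *p = a;
    const unsigned char *q = b;

    while (n >= 8) {
        uint64_t x, y;

        /* memcpy gives an alignment-safe 8-byte load; compilers
           typically lower it to a single load instruction. */
        memcpy(&x, p, 8);
        memcpy(&y, q, 8);
        if (x != y)
            break;          /* the difference is inside this chunk */
        p += 8;
        q += 8;
        n -= 8;
    }

    /* Byte loop: finds the differing byte in the current chunk, or
       handles a tail shorter than 8 bytes. Independent of endianness. */
    while (n-- > 0) {
        if (*p != *q)
            return (int)*p - (int)*q;
        p++;
        q++;
    }
    return 0;
}
```

Falling back to a byte loop for the differing chunk keeps the sketch independent of byte order. An optimized version would locate the differing byte directly from the two loaded words instead.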