The following FreeBSD kernel methods are not in any standard and
prototypes/definitions were leaking into application space:
+ round_page()
+ trunc_page()
+ atop()
+ ptoa()
+ pgtok()
v3: Add support for read ahead using strnlen, giving an additional 25% speedup
on large inputs (both short and long needles).
This patch significantly improves performance of strstr by using Sunday's
Quick-Search algorithm. Due to its simplicity it has the best average
performance of string matching algorithms on almost all inputs. It uses a
bad-character shift table to skip past mismatches.
The needle length is limited to 254 - this reduces the shift table memory
4 to 8 times, lowering preprocessing overhead and minimizing cache effects.
The limit also implies its worst-case performance is linear.
Larger needles are processed by the Two-Way algorithm. The macro AVAILABLE
has been improved to use strnlen to read the input in chunks. This results
in a 2.5 times speedup for large needles, reducing the performance drop when
the Quick-Search algorithm can't be used.
The code for 1-4 byte needles has been simplified and now uses unsigned
char. Since the optimized code relies on 8-bit chars, we defer to the
size-optimized implementation if CHAR_BIT > 8.
The performance gain of finding a set of randomly chosen words of size 8 in
256 bytes of English text is 14 times on AArch64. For longer haystacks the
gain is well over 20 times.
The size-optimized strstr has also been rewritten from scratch to improve
performance. On the same test the performance gain is 69%.
Tested against GLIBC testsuite, randomized tests and the GNULIB strstr test
(https://git.savannah.gnu.org/cgit/gnulib.git/tree/tests/test-strstr.c).
--
Use existing HAVE_OPENDIR define to determine if a generic
implementation should be provided. Cygwin for example has its own
implementation of opendir() and dirfd().
Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>
This is used by the file system support of libstdc++ for example. Use
content from latest FreeBSD <sys/dirent.h>
Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>
Move common content of the various <sys/dirent.h> and the latest FreeBSD
<dirent.h> to <dirent.h>.
Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>
Use O_RDONLY since you are not supposed to write to a directory.
Use O_DIRECTORY as mandated by POSIX (The Open Group Base Specifications
Issue 7, 2018 edition IEEE Std 1003.1-2017):
"If the type DIR is implemented using a file descriptor, the descriptor
shall be obtained as if the O_DIRECTORY flag was passed to open()."
Use O_CLOEXEC as mandated by POSIX:
"When a file descriptor is used to implement the directory stream, it
behaves as if the FD_CLOEXEC had been set for the file descriptor."
Drop the fcntl() call in favour of O_CLOEXEC.
Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>
Make the POSIX O_CLOEXEC, O_NOFOLLOW, O_DIRECTORY, O_EXEC, and O_SEARCH
open() flags available also to non-Cygwin systems.
Make the BSD/glibc O_DIRECT open() flag available also to non-Cygwin
systems.
Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>
Commit fbace81684
("Import correctly working strtold from David M. Gay.")
introduced two new files, strtorx.c and strtodg.c. The functions
are only called from strtold.c. However, while strtold.c is only
built if HAVE_LONG_DOUBLE is defined, the patch erroneously added
the two new files to GENERAL_SOURCES unconditionally.
Fix this by building both files only if HAVE_LONG_DOUBLE has been
defined.
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
Commit 6c212a8b78
("Fix strtod ("nan") and strtold ("nan") returns wrong negative NaN")
introduced an unconditional dependency to nanl and, in turn, to libm.
Rather than including nanl in libc as well, just call __builtin_nanl
from here. Requires GCC 3.3 or later.
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
Drop Cygwin-specific nanl in favor of a generic implementation
in newlib. Requires GCC 3.3 or later.
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
These attributes help static analysis tools to produce less false
positives, e.g. double free warnings.
Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>
This patch is similar the arm one committed recently.
2018-10-08 Christophe Lyon <christophe.lyon@linaro.org>
* libgloss/aarch64/syscalls.c (_sbrk): Fix prototype.
(_getpid, _write, _swiwrite, _lseek, _swilseek, _read, _wriread):
Likewise.
AngelSWI_Reason_ReportException does not return accoring to the ARM
documentation, so it is valid to mark _kill() as noreturn. This way,
the compiler does not warn about _exit() returning a value despite
being noreturn.
2018-10-01 Christophe Lyon <christophe.lyon@linaro.org>
* libgloss/arm/_exit.c (_exit): Declare _kill() as noreturn.
* libgloss/arm/_exit.c (_kill): Likewise. Remove the return
statements.
* newlib/libc/sys/arm/syscalls.c (_kill): Likewise..
While working on the strstr patch I noticed several copyright headers
of the new math functions are missing closing quotes after ``AS IS.
I've added these. Also update spellings of Arm Ltd in a few places
(but still use ARM LTD in upper case portion). Finally add SPDX
identifiers to make everything consistent.
- per Wilco Dijkstra's patch
- update Arm copyright notice to not exclude AArch64 and the various generic
contributions made (like the new math functions). Also update the date and
change spelling to Arm Ltd.
It's been a while... I see the CRIS port broke with the
time_t-default-to-64-bit change, observable by a few test-cases in the gcc
fortran(!) tests failing, regressing when trying a recent newlib.
This is a two-part belt-and-suspenders change: adjust the CRIS port
gettimeofday syscall (the only one in newlib/CRIS passing a time_t or
struct timeval) to handle a userspace 64-bit time_t and secondly default
time_t to 32-bit long anyway. I considered making the local
"kernel_timeval" copy in _gettimeofday conditional on (userspace) time_t
being 64 bits, but thought it not worth bothering with the few move insns.
The effect of a 64-bit time_t is however observable as longer simulation
time when running the gcc testsuite and as bigger binaries without any
actual upside from the larger time_t size, so I thought better make the
default for this port go back to being a "long" again.
Tested by running the gcc testsuite over the three combinations of two
parts of the patch and observing the expected changes. Committed.
newlib:
* configure.host (cris, crisv32): Default to "long" time_t.
Signed-off-by: Hans-Peter Nilsson <hp@axis.com>
It's been a while... I see the CRIS port broke with the
time_t-default-to-64-bit change, observable by a few test-cases in the
gcc fortran(!) tests failing, regressing when trying a recent newlib.
This is a two-part belt-and-suspenders change: adjust the CRIS port
gettimeofday syscall (the only one in newlib/CRIS passing a time_t or
struct timeval) to handle a userspace 64-bit time_t and secondly default
time_t to 32-bit long anyway. I considered making the local
"kernel_timeval" copy in _gettimeofday conditional on (userspace) time_t
being 64 bits, but thought it not worth bothering with the few move insns.
The effect of a 64-bit time_t is however observable as longer simulation
time when running the gcc testsuite and as bigger binaries without any
actual upside from the larger time_t size, so I thought better make the
default for this port go back to being a "long" again.
Tested by running the gcc testsuite over the three combinations of two
parts of the patch and observing the expected changes. Committed.
libgloss:
Adjust for syscall and userspace having different time_t or timeval.
* cris/linunistd.h (kernel_time_t, kernel_suseconds_t, kernel_timeval):
New types.
(gettimeofday): Change the type of the first argument to be a
pointer to a struct kernel_timeval.
* cris/gensyscalls (_gettimeofday): Use an intermediate struct
kernel_timeval for the syscall and initialize the result from
that.
Signed-off-by: Hans-Peter Nilsson <hp@axis.com>
The current loop condition is borderline. Make sure it ends and
choose a replacement char in the unlikely case the current console
font isn't recognized at all.
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
Rather than relying on an index variable, store the current
replacement char and use that directly in WriteConsoleW.
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
EnumFontFamiliesExW fails if the font is "Terminal" (aka "Raster Fonts")
and lfCharSet requests ANSI_CHARSET. Using DEFAULT_CHARSET fixes this.
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
hash.h: Use 32-bit type for data stored on disk, so code
works for 16 and 64-bit targets. Reduce maximum bucket size on 16-bit
targets, so it fits in available memory.
hash.c: Check bucket size isn't too big for target.
hash_buf.c: Fix overflow warning on 16-bit targets.
When __HAVE_LOCALE_INFO__ is not selected, directly access the
existing _ctype_ variable from __locale_ctype_ptr() and
__locale_ctype_ptr_l(), eliminating the need for any locale or reent
structure
Signed-off-by: Keith Packard <keithp@keithp.com>
v2:
locale: fix conflict with __locale_ctype_ptr macro
If we are building without __HAVE_LOCALE_INFO__, there is a
macro providing __locale_ctype_ptr which in turn fouls up this
declaration.
Signed-off-by: Michael Lyle <mlyle@lyle.org>
The string/float conversion functions need to get the locale decimal
point. Instead of calling __localeconv_l (which copies locale data
into lconv form from __get_numeric_locale), use __get_numeric_locale
directly.
Signed-off-by: Keith Packard <keithp@keithp.com>
This makes sure any system-defined limits are specified before the
defaults are checked. Without this, ARG_MAX and PATH_MAX end up
getting the default definitions from limits.h rather than the defines
from syslimits.h. This could potentially cause problems when
different files used different values for the same name.
Signed-off-by: Keith Packard <keithp@keithp.com>
Make sure device context is not copied to forked process.
It is a process-specific datastructure.
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
Try various Unicode characters which may be used as a replacement
character in case an invalid character has to be printed.
Current list is 0xfffd "REPLACEMENT CHARACTER", 0x25a1 "WHITE SQUARE",
and 0x2592 "MEDIUM SHADE" in that order.
Additionally workaround a problem with some fonts (namely DejaVu
Sans Mono) which are returned wit ha broken fontname with trailing
stray characters.
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
Improve strstr performance for the common case of short needles. For a single
character strchr is best, for 2-4 characters a small loop is fastest. For these
the speedup over the Two-Way algorithm is ~10 times on large strings.
Newlib builds, the new code passes GLIBC testsuite. OK for commit?
Issuing an ARM semi-hosting Seek command when just querying file
position with SEEK_CUR and offset zero is unnecessary, because unlike
the lseek() Unix system call the Seek command does not actually return
the file position. For that reason, syscalls.c for ARM keeps track of
file position in the 'poslog', so we can just return that.
Moreover, since the Seek command only accepts an absolute file position,
SEEK_CUR operations are implemented by adding the relative offset to the
position in the poslog. If the host implements non-binary files with
implicit carriage return characters but doesn't discount those implicit
CRs when implementing Seek (by just mapping straight to Windows file
operations), this actually ended up wrongly changing file position when
using SEEK_CUR with offset zero or functions like ftell() or fgetpos()
that are based on that.
Also, use off_t rather than int for the poslog.
So far we printed a half filled square (0x2592) if the input char is
invalid, but using REPLACEMENT CHARACTER (0xfffd) is apparently the way
to go.
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
Commit 35998fc2fa fixed the buffer underun
in win32 path normalization, but introduced a new bug: A wrong
assumption led to the inability to backtrack the path outside of the
current working directory in case of relative paths.
This patch fixes this problem, together with a minor problem if the CWD
is on a network share: The result erroneously started with tripple
backslash if the src path starts with a single backslash.
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
Standard headers shouldn't use non-reserved identifiers as parameter
names in function declarations, because programs could in theory
define macros with such names before including a header.
This macro selects a compiler option that disables recognition of
common memset/memcpy patterns and converting those to direct
memset/memcpy calls.
Signed-off-by: Keith Packard <keithp@keithp.com>