This was actually more evil:
• use a recursive function to display blocks in reverse order,
so that local variable values overwrite global ones
• add array support to typeset -p (from typeset -p -)
• display 'set -A varname' line before setting values, for -p
• if -p got arguments, only display those (from the innermost scope)
Also, the usual amount of code cleanup…
to get rid of the bias introduced by making the hash never zero
… he also pointed out a memory (heap) usage optimisation… which
may impact code size a bit though as I’d need to pass an additional
argument on hashtable function calls… or, forgo the benefit of not
having to pointer-align the key in the structure, which can be as
much as 3/7 octets per item, heap storage… OTOH the saved space is
4/8 octets per not-allocated item, possibly some code (use of an
multiply-add opcode), but the function call overhead/cost would
possibly be quite a bit… I guess I’ll have to measure…
XXX in the future, the entire scheme must be rethinked when we need more
XXX entropy for the hash tables; possibly a cheap add using NZAT and re-
XXX initialise the LCG only on access and when added (so keep NZAT state
XXX separate from LCG state); also, then we will need a more elaborate
XXX scheme, such as adding from environment, editor keypresses and timing
if we find any, but not later; do not check on every read
⇒ allows changing COLUMNS and LINES (independent of each other, or both)
for script shells by passing them in an environment setting, even if
we get a tty; interactive shells still check before each line is read…
reported by the PLD guys, thanks
handle any more, octal 010 style constants, as promised
• overhaul the manpage re. arithmetic expressions, make the guarantees
mksh code has explicitly, precisely, clear
• to reduce burden of the compiler, getint() now operates on mksh_uari_t
internally; it already applied the sign after operation, anyway (C99
guarantees wraparound on unsigned types, but for signed types we need
specific compiler support; apparently, this comes from hardware limits)
• use const and shuffle order of locals around while here
• while here, reformat 'struct tbl' comment-wise and placement-wise
and drop the Tflag typedef
• while here, write regression test for the "global" built-in, which
does what typeset is supposed to do except that it doubles as "local"
Testsuite:
• add new need-pass: {yes|no} attribute, default yes
• exit with 1 if a need-pass test failed unexpectedly
idea by Kacper Kornet <draenog@pld-linux.org>
• mark utf8bom-2 as need-pass: no
Infrstructure:
• add housekeeping function for making a tty raw
• switch functions with unused results to void
• struct op: u.charflag contains last char of ;; in TPAT
• var.c:arraysearch is now a global function
Language:
• add ;& (fall through) and ;| (examine next) delimiters
in addition to ;; (end case) as zsh extensions, because
POSIX standardised on ;& already
• add -A (read into array), -N (read exactly n bytes),
-n (read up to n bytes), -t (timeout) flags for read
from ksh93
• allow read -N -1 or -n -1 to slurp the entire input
• add -a (read into array the input characters) extension
specific to mksh to read, idea by David Korn
• add -e (exit with error if PWD was not set correctly
after a physical cd) to cd builtin, mandated by next
POSIX, and change error codes accordingly
Rewrites:
• full rewrite of read builtin and its manpage section
• add regression tetss for most of the new functionality
• duplicate hexdump demo tests for use of read -a
• use read -raN-1 in dot.mkshrc to get NUL safe base64,
DJB cdb hash and Jenkins one-at-a-time hash functions
a bit more with POSIX and the other shells
I considered http://austingroupbugs.net/view.php?id=253 but the use
of bi_errorf() is interesting, especially as it’s often enough a
noreturn function, and funnily enough, 'cd -P /foo' returns 0 while
'chdir -P /foo' fails (so idk where to put -e)…
① currently: ((cond) ? true : false) but (!!(cond)) and casting to bool,
the latter only if stdbool.h, would also work – which performs best on
(and across) all supported systems?
• in interactive mode, always look up {LC_{ALL,CTYPE},LANG} environment
variables if setlocale/nl_langinfo(CODESET) doesn’t suffice
• add the ability to call any builtin (some don't make sense or wouldn't
work) directly by analysing argv[0]
• for direct builtin calls, the {LC_{ALL,CTYPE},LANG} environment
variables determine utf8-mode, even if MKSH_ASSUME_UTF8 was set
• when called as builtin, echo behaves POSIXish
• add domainname as alias for true on MirBSD only, to be able to link it
• sync mksh Makefiles with Build.sh output
• adjust manpage wrt release plans
• link some things to mksh now that we have callable builtins:
bin/echo bin/kill bin/pwd bin/sleep (exact matches)
bin/test bin/[ (were scripts before)
bin/domainname=usr/bin/true usr/bin/false (move to /bin/ now)
• drop linked utilities and, except for echo and kill, their manpages
• adjust instbin and link a few more there as well
– possible integer overflows in memory allocation, mostly
‣ multiplication: all are checked now
‣ addition: reviewed them, most were “proven” or guessed to be
“almost” impossible to run over (e.g. when we have a string
whose length is taken it is assumed that the length will be
more than only a few bytes below SIZE_MAX, since code and
stack have to fit); some are checked now (e.g. when one of
the summands is an off_t); most of the unchecked ones are
annotated now
⇒ cost (MirBSD/i386 static): +76 .text
⇒ cost (Debian sid/i386): +779 .text -4 .data
– on Linux targets, setuid() setresuid() setresgid() can fail
with EAGAIN; check for that and, if so, warn once and retry
infinitely (other targets to be added later once we know that
they are “insane”)
⇒ cost (Debian sid/i386): +192 .text (includes .rodata)
• setmode.c: Do overflow checking for realloc() too; switch back
from calloc() to a checked malloc() for simplification while there
• define -DIN_MKSH and let setmode.c look a tad nicer while here
we don’t get SIGWINCH when the window size changes during the runtime of
that, so, the signal is only usable reliably during editing in the shell
and we re-check the window size before each interactive edit line again
• deactivate %a and %A since our libc doesn’t have it
• rewrite the mksh integration code to use shf instead of stdio, removing
floating point support always in the process, as shf doesn’t support it
⇒ saves 11114 (6706 text, 168 data, 4240 bss) with dietlibc on Debian
• fix -Wall -Wextra -Wformat -Wstrict-aliasing=2 for gcc (Debian 4.4.4-7)
• fix these and -Wc++-compat for gcc version 4.6.0 20100711 (experimental)
[trunk revision 162057] (Debian 20100711-1) except:
– a few enum warnings that relate to eglibc’s {g,s}etrlimit() functions
taking an enum instead of an int because they’re too stupid to adhere
to POSIX interfaces they design by themselves
– all “request for implicit conversion” involving a "void *" on one side
• tweak the manual page somewhat more
of foo[0] (but not its attributes), and the rest of the array, so that
later “set +A foo bar” will set foo[0]=bar but retain the attributes.
This is important, because, in the future, arrays will have different
attributes per element, instead of all the same (which, actually, is
not entirely true right now either, since “unset foo[0]” will not mo-
dify the attributes of a foo[1] existing at that point in time), where
foo[$newkey] will inherit from foo[0], but typeset foo will only affect
foo[0] no longer foo[*] in the future. (The rules about typeset=local
will still apply, as they affect creation of variables in a scope.)