Testsuite:
• add new need-pass: {yes|no} attribute, default yes
• exit with 1 if a need-pass test failed unexpectedly
idea by Kacper Kornet <draenog@pld-linux.org>
• mark utf8bom-2 as need-pass: no
Infrstructure:
• add housekeeping function for making a tty raw
• switch functions with unused results to void
• struct op: u.charflag contains last char of ;; in TPAT
• var.c:arraysearch is now a global function
Language:
• add ;& (fall through) and ;| (examine next) delimiters
in addition to ;; (end case) as zsh extensions, because
POSIX standardised on ;& already
• add -A (read into array), -N (read exactly n bytes),
-n (read up to n bytes), -t (timeout) flags for read
from ksh93
• allow read -N -1 or -n -1 to slurp the entire input
• add -a (read into array the input characters) extension
specific to mksh to read, idea by David Korn
• add -e (exit with error if PWD was not set correctly
after a physical cd) to cd builtin, mandated by next
POSIX, and change error codes accordingly
Rewrites:
• full rewrite of read builtin and its manpage section
• add regression tetss for most of the new functionality
• duplicate hexdump demo tests for use of read -a
• use read -raN-1 in dot.mkshrc to get NUL safe base64,
DJB cdb hash and Jenkins one-at-a-time hash functions
① currently: ((cond) ? true : false) but (!!(cond)) and casting to bool,
the latter only if stdbool.h, would also work – which performs best on
(and across) all supported systems?
– possible integer overflows in memory allocation, mostly
‣ multiplication: all are checked now
‣ addition: reviewed them, most were “proven” or guessed to be
“almost” impossible to run over (e.g. when we have a string
whose length is taken it is assumed that the length will be
more than only a few bytes below SIZE_MAX, since code and
stack have to fit); some are checked now (e.g. when one of
the summands is an off_t); most of the unchecked ones are
annotated now
⇒ cost (MirBSD/i386 static): +76 .text
⇒ cost (Debian sid/i386): +779 .text -4 .data
– on Linux targets, setuid() setresuid() setresgid() can fail
with EAGAIN; check for that and, if so, warn once and retry
infinitely (other targets to be added later once we know that
they are “insane”)
⇒ cost (Debian sid/i386): +192 .text (includes .rodata)
• setmode.c: Do overflow checking for realloc() too; switch back
from calloc() to a checked malloc() for simplification while there
• define -DIN_MKSH and let setmode.c look a tad nicer while here
we don’t get SIGWINCH when the window size changes during the runtime of
that, so, the signal is only usable reliably during editing in the shell
and we re-check the window size before each interactive edit line again
• deactivate %a and %A since our libc doesn’t have it
• rewrite the mksh integration code to use shf instead of stdio, removing
floating point support always in the process, as shf doesn’t support it
⇒ saves 11114 (6706 text, 168 data, 4240 bss) with dietlibc on Debian
• fix -Wall -Wextra -Wformat -Wstrict-aliasing=2 for gcc (Debian 4.4.4-7)
• fix these and -Wc++-compat for gcc version 4.6.0 20100711 (experimental)
[trunk revision 162057] (Debian 20100711-1) except:
– a few enum warnings that relate to eglibc’s {g,s}etrlimit() functions
taking an enum instead of an int because they’re too stupid to adhere
to POSIX interfaces they design by themselves
– all “request for implicit conversion” involving a "void *" on one side
• tweak the manual page somewhat more
comment what/why added (to aid understanding this code)
I wonder, though, why their x_escape now almost¹ looks like ours… is
that a coïncidence, or do they steal again (without understanding why)?
① they’re missing the semicolon but falsely added the closing bracket
some idiotic terminal emulators and/or people seem to use the es-
cape codes normally denoting Alt-Arrowkey instead so let's simply
bind them to the vt_hack as well... (untested)
others (colon and equals sign need to be simply escaped, while dollar
sign and accent gravis need double escaping like opening square brak-
ket did back then); add = to C_QUOTE to simplify (doesn't break any-
thing) and sort these strings asciibetically while here
MKSH_S_EDIT for small (Emacs) editing mode, MKSH_S_FEAT for all the dis-
abled language features), which can be set to 0 despite MKSH_SMALL being
defined to re-enable the Vi command line editing mode (which I wouldn't,
but fits into the general mastermind scheme)
some GNU bash extensions (suggested by cnuke@) and bind macros
* make the random cache more efficient (and the code potentially
smaller, although we have a new implementation of the oaat hash
function, alongside the old one, now) and pushb only if needed
(i.e. state has changed or user has set $RANDOM, but not onfork)
return information needed to do a real ktremove instead of the pseudo
ktdelete operation which merely unsets the DEFINED flag to mark it as
eligible for texpand garbage collection (even worse, !DEFINED entries
are still counted)
more from cid 1004A2D72DD5A4E4B4F tried to be fixed in 1004A300A72701188E3
but I’d appreciate someone who actually uses Vi Mode to test it:
Revision 1.26: [7]download - view: [8]text, [9]markup, [10]annotated - [11]select for diffs
Mon Jun 29 22:50:19 2009 UTC (5 days, 14 hours ago) by martynas
Branches: [12]MAIN
CVS tags: [13]OPENBSD_4_6_BASE, [14]OPENBSD_4_6, [15]HEAD
Diff to: previous 1.25: [16]preferred, [17]coloured
Changes since revision 1.25: +10 -5 lines
make VSEARCH werase act like regular werase after the last change.
vi back-words and emacs kill-region are not completely the same.
ok merdely@, millert@. "Get it in" Darrin Chandler
"make ksh vi mode handle werase more like vi. It's really irritating to
have whole paths go away on ^W instead of just the last bit."
"That looks right to me" millert@, "YES kthx bye!" thib@
.oO(there are vi mode users?) We are not GNU bash, good idea! tg@
fix the regression test’s results while here, which have been
broken since cid 10049D9BE5254CE65B8
• get rid of separate copyright file which was intended for De-
bian; track down commits in all files of oksh-mirbsd and mksh
to get correct copyright years per-file, as is BSD custom
the width for control characters (wcwidth(wc) == -1) was hard-coded
to 2 (!) in utf_widthadj, which is true for *only* one of the two x_zotc*
functions in Emacs editing mode, and none of the other functions which
use this piece of code
change to 1, to be more correct in the general case; use of the UTF C1
control characters U+0080‥U+009F is slightly broken anyway, and this
only shifts the brokenness to different places of code
XXX maybe we want to map U+0080‥U+009F into Unicode as if they were
XXX 0x80‥0x9F in ANSI cp1252 instead, at least for displaying?
the editing code is cruel…
was hard to type and hard to fix, galloc is also hard to fix, and some
things I learned will probably improve things more but make me use the
original form as base (especially for space savings)
* let sizeofN die though, remove even more casts
* optimise, polish
* regen Makefiles
* sprinkle a few /* CONSTCOND */ while here
please pcc, prompted for by Anders “ragge” Magnusson, problem spotted
originally by Adam “replaced” Hoka
⇒ rewrote x_bs2() and utf_backch() into a combined x_bs3() function,
since these are never used in any other way
• whitespace cleanup, while here
abortion (^G – ^C is SIGINT and doesn’t work like this, but
that’s actually good IMO)
prompted by enquiry about the Emacs editing mode by <smultron:#MidnightBSD>
in a somewhat hackish way, and it’s still quite different from zsh,
but probably closer to a desired functionality
XXX this makes state by abusing 「modified」 and 「xmp」 (“the mark”).
in præfix sequences (like ANSI cursor keys), leading to annoying effects
if we forget that
this patch changes the behaviour so that another character is read/peeked
at (since this is done in the main loop after ESC anyway, no function loss
through the delay) if ESC leads in a prefix-1 sequence, and if the peeked
character leads in a prefix-1 or prefix-2 sequence when in state prefix-1,
it’s still enacted (XXX document this in manpage)
empty) appears pushed into the history, so that, when pressing cursor-up
or ^P, with a cursor-down or ^N you get it back, unless you modify a line
retrieved from the history, in which case it will overwrite the saved line
and place the current history pointer past the entered history lines
This is for Emacs mode; Vi mode had something similar already, and shares
some code and data
XXX there are several static buffers of size LINE (currently 4096) in here
use the libc functions for converting between multibyte strings and wide
strings in here any more, besides mksh has slightly different needs than
SUSv3 compliance ⇒ hand-craft optimised and unrolled functions for that
• sync the mksh-internal wcwidth function with libc
• fix vi mode (which, however, is officially orphaned) multi-line $PS1 by
using a similar algorithm for prompt skipping as emacs mode (changing
the meaning of prompt_trunc variable and using prompt_redraw, just even
more efficiently than vi mode); reported by asarch via IRC
• fix multi-line prompts if last line is “too large” by using emacs mode
algorithm of just internally appending a newline, while here ☺ this even
saves us having to re-add the prompt_skip variable…
WARNING: this is only barely tested, as almost nobody ever uses vi mode
⇒ test yourself, there may be bugs (e.g. off-by-ones); already known is
that the vi input line editing mode is NOT multibyte safe…
‣ macro afreechk() is superfluous
• get rid of macro afreechv() by re-doing the “don’t leak that much” code
• some KNF (mostly, whitespace and 80c) while here
• more int → bool
• more regression tests: check if the utf8-hack flag is really disabled
at non-interactive startup, enabled at interactive startup, if the
current locale is a UTF-8 one
• make the mksh-local multibyte handling functions globally accessible,
change their names, syntax and semantics a little (XXX more work needed)
• optimise
• utf_wctomb: src → dst, as we’re writing to that char array (pasto?)
• edit.c:x_e_getmbc(): if the second byte of a 2- or 3-byte multibyte
sequence is invalid utf-8, ungetc it (not possible for the 3rd byte yet)
• edit.c:x_zotc3(): easier (and faster) handling of UTF-8
• implement, document and test for base-1 numbers: they just get the
ASCII (8-bit) or Unicode (UTF-8) value of the octet(s) after the ‘1#’,
or do the same as print \x## or \u#### (depending on the utf8-hack flag),
plus support the PUA assignment of EF80‥EFFF for the MirBSD encoding “hack”
(print doesn’t, as it has \x## and \u#### to distinguish, but we cannot use
base-0 numbers which I had planned to use for raw octets first, as they are
used internally): http://thread.gmane.org/gmane.os.miros.general/7938
• as an application example, add a hexdumper to the regression tests ☺