a bit more with POSIX and the other shells
I considered http://austingroupbugs.net/view.php?id=253 but the use
of bi_errorf() is interesting, especially as it’s often enough a
noreturn function, and funnily enough, 'cd -P /foo' returns 0 while
'chdir -P /foo' fails (so idk where to put -e)…
and switches to the TARGET_OS=Linux
• introduce android as regression test suite category
• add an android specific standard alias
• clean up redundant ‘-o sh’ arg in a few checks
UTF-8 BOM instead (UTFMODE has a separate value now for activated
during BOM skipping)
• parsing a COMSUB now skips UTF-8 BOM, too, but only temporarily
• use shf_putc (macro), shf_putchar (function) ipv tputc
• replace shf_putchar(x,y) calls for side-effect-less x with shf_putc
• plug another bug in the tree code – '\' → "\\" (backslashes must be
escaped inside double quotes, too)
• adjust testsuite (and, I _had_ wondered…)
EOF) # works again now, plugging a regression
• rewrite the here document parsing code to be *much* more efficient
(and a bit more readable too!) using goto, while here (no kidding)
instead, but the parser for the so-called “backticks” (U+0060) still emits
plaintext COMSUB wdstrings, and the evaluation code emits plaintext if the
code is not run (‘-n’ option), so it’s not worth the effort and memory ma-
nagement issues, even though it _would_ optimise the most common case…
Bump version numbers, sync regression tests; add one testcase from the old
webpages too. Sync manpage, this now works, but keep the workaround in, as
“portability issue” with slightly changed wording.
Also, /bin/sleep must be used in one manpage example if sleep is built in.
• PIPESTATUS now supported (like bash 2) whose last member
may actually differ from $? since the latter may not be the
result of a pipeline partial command
• add regression tests, documentation, etc.
• in interactive mode, always look up {LC_{ALL,CTYPE},LANG} environment
variables if setlocale/nl_langinfo(CODESET) doesn’t suffice
• add the ability to call any builtin (some don't make sense or wouldn't
work) directly by analysing argv[0]
• for direct builtin calls, the {LC_{ALL,CTYPE},LANG} environment
variables determine utf8-mode, even if MKSH_ASSUME_UTF8 was set
• when called as builtin, echo behaves POSIXish
• add domainname as alias for true on MirBSD only, to be able to link it
• sync mksh Makefiles with Build.sh output
• adjust manpage wrt release plans
• link some things to mksh now that we have callable builtins:
bin/echo bin/kill bin/pwd bin/sleep (exact matches)
bin/test bin/[ (were scripts before)
bin/domainname=usr/bin/true usr/bin/false (move to /bin/ now)
• drop linked utilities and, except for echo and kill, their manpages
• adjust instbin and link a few more there as well
a mirtoconf check, would’ve been a real problem on an LP64 platform
• sh.h: work around a bad interaction between -Wformat on gcc and manual
string pooling for T_synerr, which is used in place of a format string
in some places
– possible integer overflows in memory allocation, mostly
‣ multiplication: all are checked now
‣ addition: reviewed them, most were “proven” or guessed to be
“almost” impossible to run over (e.g. when we have a string
whose length is taken it is assumed that the length will be
more than only a few bytes below SIZE_MAX, since code and
stack have to fit); some are checked now (e.g. when one of
the summands is an off_t); most of the unchecked ones are
annotated now
⇒ cost (MirBSD/i386 static): +76 .text
⇒ cost (Debian sid/i386): +779 .text -4 .data
– on Linux targets, setuid() setresuid() setresgid() can fail
with EAGAIN; check for that and, if so, warn once and retry
infinitely (other targets to be added later once we know that
they are “insane”)
⇒ cost (Debian sid/i386): +192 .text (includes .rodata)
• setmode.c: Do overflow checking for realloc() too; switch back
from calloc() to a checked malloc() for simplification while there
• define -DIN_MKSH and let setmode.c look a tad nicer while here
we don’t get SIGWINCH when the window size changes during the runtime of
that, so, the signal is only usable reliably during editing in the shell
and we re-check the window size before each interactive edit line again
a string buffer whose window size is currently 32 (initial), your data
is 96 bytes, this routine used to resize the buffer to 64, append your
first 64 bytes to it (no matter if there's already something in it)
and then writes the remaining bytes to stdio fd instead of the string…
if it doesn’t SIGABRT before
discovered by wbx@ – thanks – bug inherited from pdksh 5.2.14 (AD 1999)
• deactivate %a and %A since our libc doesn’t have it
• rewrite the mksh integration code to use shf instead of stdio, removing
floating point support always in the process, as shf doesn’t support it
⇒ saves 11114 (6706 text, 168 data, 4240 bss) with dietlibc on Debian
• fix -Wall -Wextra -Wformat -Wstrict-aliasing=2 for gcc (Debian 4.4.4-7)
• fix these and -Wc++-compat for gcc version 4.6.0 20100711 (experimental)
[trunk revision 162057] (Debian 20100711-1) except:
– a few enum warnings that relate to eglibc’s {g,s}etrlimit() functions
taking an enum instead of an int because they’re too stupid to adhere
to POSIX interfaces they design by themselves
– all “request for implicit conversion” involving a "void *" on one side
• tweak the manual page somewhat more
• avoid calling realloc twice in sequence, since the final
size is known at the first call already
• do not lstat(2) the same path twice in the Hurd codepath
• expand-unglob-{dblq,unq} are the same as dash, but with ‘\}’ → ‘}’ as
per austin-group-l discussion, although this is not (yet) a standards
requirement, just a “doesn’t make sense otherwise” thing
expand-ugly:
• printf '%s\n' "foo ${IFS+"b c"} baz" → no field splitting, ksh93 is
wrong here (§2.6.2)
• ‘\}’ vs. ‘}’ as above
• ksh93 dropping a ‘}’ is probably another ksh93 bug
and vendor pdksh versions, re-introduce FPOSIX alongside FSH. The semantics
are now:
‣ set -o posix ⇒
• disable brace expansion and FSH when triggered
• use Debian Policy 10.4 compliant non-XSI “echo” builtin
• do not keep file descriptors > 2 to ksh
‣ set -o sh ⇒
• set automatically #ifdef MKSH_BINSHREDUCED
• disable brace expansion and FPOSIX when triggered
• use Debian Policy 10.4 compliant non-XSI “echo” builtin
• do not keep file descriptors > 2 to ksh
• trigger MKSH_MIDNIGHTBSD01ASH_COMPAT mode if compiled in
• make “set -- $(getopt ab:c "$@")” construct work
Note that the set/getopt one used to behave POSIXly only with FSH or
FPOSIX (depending on the mksh version) set and Bourne-ish with it not
set, so this changes default mksh behaviour to POSIX!
of foo[0] (but not its attributes), and the rest of the array, so that
later “set +A foo bar” will set foo[0]=bar but retain the attributes.
This is important, because, in the future, arrays will have different
attributes per element, instead of all the same (which, actually, is
not entirely true right now either, since “unset foo[0]” will not mo-
dify the attributes of a foo[1] existing at that point in time), where
foo[$newkey] will inherit from foo[0], but typeset foo will only affect
foo[0] no longer foo[*] in the future. (The rules about typeset=local
will still apply, as they affect creation of variables in a scope.)
some idiotic terminal emulators and/or people seem to use the es-
cape codes normally denoting Alt-Arrowkey instead so let's simply
bind them to the vt_hack as well... (untested)
• merge the rest of branch tg-wcswidth-behaviour
• enhance test cases for wcswidth-like behaviour
• switch hash table collision resolution algorithm to Python’s as announced
• bump vsn
│remember to restore errno (ie. stop someone from making a mistake later)
│ok guenther
check.t, sh.h: bump vsn
I wonder though why errno must be restored even if nothing was
called after reading it… moid?
which, in its latest sid incarnation, even received mksh's ability
to produce ${!foo[*]} array keys, wow!)
* plug a memory leak while here (ATEMP only, but still)
I read, IIRC in the Cederqvist, that 'cvs tag' sets a sticky tag onto
the cwd… it doesn’t, apparently. (I actually like it better this way,
but one needs to know!)
others (colon and equals sign need to be simply escaped, while dollar
sign and accent gravis need double escaping like opening square brak-
ket did back then); add = to C_QUOTE to simplify (doesn't break any-
thing) and sort these strings asciibetically while here
• use a combination of the one-at-a-time hash and an LCG for handling
the $RANDOM special if !HAVE_ARC4RANDOM instead of rand(3)/srand(3)
and get rid of time(3) usage to reduce import footprint
• raise entropy state (mostly in the !HAVE_ARC4RANDOM case though…)
• simplify handling of the $RANDOM_SPECIAL generally
• tweak hash() to save a temp var for non-optimising compilers
• some int → mksh_ari_t and other type fixes
• general tweaking of code and comments
change of a variable inside a Bourne style POSIX function will affect
the current execution environment of the function caller (to be consi-
stent with exec-function-environment-1)
just a "somewhat more POSIX" but also a "/bin/sh legacy kludge" mode
* consistently capitalise POSIX and SUSv3/SUSv4 (same as AT&T ksh) and
Bourne shell
to it are now either arc4random or rand/srand, but srand retains the old
state; set +o arc4random is no longer possible, but if it's there we use
arc4random(3), if not, we use rand(3) for $RANDOM reads; optimise special
variable handling too and fix a few consts and other minor things
MKSH_S_EDIT for small (Emacs) editing mode, MKSH_S_FEAT for all the dis-
abled language features), which can be set to 0 despite MKSH_SMALL being
defined to re-enable the Vi command line editing mode (which I wouldn't,
but fits into the general mastermind scheme)
some GNU bash extensions (suggested by cnuke@) and bind macros
* make the random cache more efficient (and the code potentially
smaller, although we have a new implementation of the oaat hash
function, alongside the old one, now) and pushb only if needed
(i.e. state has changed or user has set $RANDOM, but not onfork)
integers in addition to my 「1#a」 (or 「1#…」), which also allows for
finer end-of-character checking. Note that this is locale-dependent in
ksh93, set ±U dependent in mksh, and mksh’s OPTU-16 encoding is used.
Build.sh but use 'if defined(PRECOND) && !defined(TOBEDEFINED)'if possible
* for all of the source code, drop annotations "imake style" (if we check
for specific OSes, bad, instead of using mirtoconf checks proper) and
"conditions correct?" (if I'm not entirely sure if that #if catches all
cases and no false positives) where I can see it by grepping immediately
* bump mksh patchlevel
* refresh Makefiles
more from cid 1004A2D72DD5A4E4B4F tried to be fixed in 1004A300A72701188E3
but I’d appreciate someone who actually uses Vi Mode to test it:
Revision 1.26: [7]download - view: [8]text, [9]markup, [10]annotated - [11]select for diffs
Mon Jun 29 22:50:19 2009 UTC (5 days, 14 hours ago) by martynas
Branches: [12]MAIN
CVS tags: [13]OPENBSD_4_6_BASE, [14]OPENBSD_4_6, [15]HEAD
Diff to: previous 1.25: [16]preferred, [17]coloured
Changes since revision 1.25: +10 -5 lines
make VSEARCH werase act like regular werase after the last change.
vi back-words and emacs kill-region are not completely the same.
ok merdely@, millert@. "Get it in" Darrin Chandler
starting with an ‘!’ exclamation mark at the beginning of a com-
mand (PS1 not PS2), shall have the same effect as the predefined
“r” alias, to be compatible with csh and GNU bash’s “!string” to
«Execute last used command starting with string» – documentation
and feature request provided by wbx@ (Waldemar Brodkorb)
and it actually REDUCES code size to allow it as well; mention
in the manpage that it’s merely unportable (and of course exe-
cution time differs); sync clog
• expose “#ifdef MKSH_MIDNIGHTBSD01ASH_COMPAT” just in case they decide to
require it and show it in the ksh version automatically
• sync the use of non-ASCII characters over files (unification)
fix the regression test’s results while here, which have been
broken since cid 10049D9BE5254CE65B8
• get rid of separate copyright file which was intended for De-
bian; track down commits in all files of oksh-mirbsd and mksh
to get correct copyright years per-file, as is BSD custom
${foo:1:2} operates on characters ipv bytes – which means:
‣ set +U: octets
‣ set -U: MirOS OPTU-8 characters
for consistency I also adapted ${#stringname} to deliver the
length in characters ipv bytes; more may follow; for example
I’d like a way to expose the string width.
you can already get the MirOS OPTU-16 of a character in the
WTF-8 (「set -U」) mode with something like
│ typeset -Uui16 -Z7 x=1#${stringname:position:1}
which will correctly use the PUA EF80‥EFFF mapping for octets.
due to this being an incompatible change, bump to R38
also change the unicode-hexdump sample regression test and
add two news for ${x:1:2} and ${#x} checks in A/W mode ☺
struct env (other structures defined have no "foreign type with pos-
sible alignment constraints" members) and take care of it while dea-
ling in a struct env instance
gcc, SUNWcc, pcc, llvm-gcc, clang, etc. all didn't say a thing!
now compiles warning-free (testsuite pass) on ULTRIX 4.5 (1986),
and OSF/1 X2.0-8 (testsuite norun: perl missing) has only the usual
bitchings about "volatile sig_atomic_t" because the latter part is
already volatile, but otherwise warning-free compile, works fine
* Debian pdksh fails #3 (trap) and #6 (BSD make)
* AT&T ksh93 passes all
* zsh does not pass them literally, but the actual functionality
checked is right there
* dash fails #3 (trap) and does not pass #6 due to missing [[
* GNU bash 2 (MirPorts) and 3 (Debian) fails #6 (BSD make)
* oksh-current passes all
H4sIAAAAAAACA31TTUvDQBC951c804DtIQ0iiFhSPHiwiCDqzUpIuhO6JN2ETSS16n93Np+GFiGE
zM6bN2/fTPQOro5xENYuEVLzBz9eaF6RlWupSpyHaXqzVuvyljbbDCRV0UR7WeLiHEtT4D2GCcUy
pdNFh4pkg4wGZCV8Z6opTPOw3GI+swoq4ZIVZxqkSv0JqRAiWkBkmAJNegHIGG9vcAWcSnjM6dTo
uby8vsL7+wLllhQQKKr4SoFv/8nbXE5pQcdpk4nlwH6AE7zer17uVs/BmHVftnUdxYh0yHbV3ghY
9wh2rQlBkdMm8PmIe57Bjfs7dTW9X/OoEG6lwzwnzYKA7+8jHpsJTuFN4+IjKhKZM6dpZ0xuQ2Nz
7bGB1F74zlebmzTm/PS+OB0GZz4HLcPgz+BeDxyS9S7YU6e3qZaQk5gZgRHvQtI59Eet7QwBWFnL
O/F+ej9Fpmg0ug7OHrnqHy2+7y8x6DGEG3E0Az41RuPlYfXEEx0psrl8PAUebzf6ZklabgD8V3SS
Z7Vo6xfOuQS6gQMAAA==
「mksh -o posix z」 failed in that it continues; 「mksh z」 correctly aborts
let’s see what the obsd people have to say herefore
H4sIAAAAAAACAz1PywrDIBA8m68YgoT20EN7TMixX1F6yGNFIWhRSw2h/95VmuDB2XmsY2oDRVyo
Si1N2uHlMzQK0b8JHaImy4RQwxIITYPiUs4xqcyeGgdfrfumtWX5dMbGSHy0WQgP1PJa49lhdpkV
ynncYSxkOjjxd/V83dmcX9sQtFGRi0zORmO5042HbwnMzlIBqa9lAmfLVBIZ7XqpKBPDnDu+WXpi
wMhnOgQXYvUD+oKHAhUBAAA=
XXX OpenBSD has something different which may DTST or even DTRT (not break
XXX our make(1) wrt <bsd.subdir.mk>), check that
on Debian Lenny/amd64 (XXX need more verification; this
can be used for 64 bit arithmetics later too)
PPID, PGRP, RANDOM, USER_ID are now unsigned by default
on the screen is not enough for two columns, just output the text line by
line, instead of trying to format it; gets rid of superfluous empty lines
if we did not even have space for one column on the screen (x_cols)
noticed by Gábor Gergely in irc, thanks!
was hard to type and hard to fix, galloc is also hard to fix, and some
things I learned will probably improve things more but make me use the
original form as base (especially for space savings)
* let sizeofN die though, remove even more casts
* optimise, polish
* regen Makefiles
* sprinkle a few /* CONSTCOND */ while here
encountered. However, when reading end of input, the source type is set
to SEOF while popping, whereas the recursion check code only checks for
an SALIAS type.
Fix: add a new SF_HASALIAS flag; change u.tblp from being valid if type
is SALIAS to being valid if SF_HASALIAS is set; set SF_HASALIAS for the
created SALIAS sources; set SF_HASALIAS and u.tblp when creating SALIAS
whose next is SEOF on the SEOF source as well.
Reported by Michael Hlavinka as Redhat Bug #474115
allocator using malloc and free, with mmap malloc and omalloc in mind,
not counterfeiting its security measures such as guard pages, and having
some of our own, e.g. XOR random cookies, optional mprotect, etc.
zero cost (for we have arc4random())
$ (CCC_LD=mgcc CC=ccc sh Build.sh -r && ./test.sh -v) 2>&1 | tee log
Total failed: 2 (as expected)
Total passed: 278
Just the result is huge, and we could of course build to intermediate
byte code to optimise globally…
because categories in check.t are OR’d:
• no-stderr-ed disables the newish-ed tests (tried using testcase)
• stdout-ed enables the oldish-ed tests (variable, if the above
testcase succeeds, it’s added, but QNX overrides the variable)