just a "somewhat more POSIX" but also a "/bin/sh legacy kludge" mode
* consistently capitalise POSIX and SUSv3/SUSv4 (same as AT&T ksh) and
Bourne shell
call it only if $RANDOM is indeed set (although pool extension would be a
possibility, we do have arc4random_atexit, which does it nicely too)
• avoid calling setspec for int→str conversion just before execve()
to it are now either arc4random or rand/srand, but srand retains the old
state; set +o arc4random is no longer possible, but if it's there we use
arc4random(3), if not, we use rand(3) for $RANDOM reads; optimise special
variable handling too and fix a few consts and other minor things
MKSH_S_EDIT for small (Emacs) editing mode, MKSH_S_FEAT for all the
disabled language features), which can be set to 0 despite MKSH_SMALL
being defined to re-enable the Vi command line editing mode (which I
wouldn't do, but it fits into the general mastermind scheme)
some GNU bash extensions (suggested by cnuke@) and bind macros
* make the random cache more efficient (and the code potentially
smaller, although we have a new implementation of the oaat hash
function, alongside the old one, now) and pushb only if needed
(i.e. state has changed or user has set $RANDOM, but not onfork)
• shell flags are now handled in one single place (sh_flags.h)
• sync comments (between enum and array) and manpage with reality
• FMONITOR is now no longer needed for Hartz IV shells
rate-limit calls to CryptGenRandom to every 2‥4 minutes, if the last
call was successful, and operate with hash() on rnd_cache[], so that
it is mixed in a better way
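For reference, a minimal sketch of the one-at-a-time (oaat) hash named in the entries above. This is Bob Jenkins' classic formulation; the function name oaat_hash is made up for illustration, and mksh's own variant (initial value, how it is applied to rnd_cache[]) may well differ:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Classic Bob Jenkins one-at-a-time hash over a byte buffer.
 * Illustration only; not taken from the mksh sources.
 */
static uint32_t
oaat_hash(const void *buf, size_t len)
{
	const unsigned char *p = buf;
	uint32_t h = 0;

	while (len--) {
		/* mix in one byte at a time */
		h += *p++;
		h += h << 10;
		h ^= h >> 6;
	}
	/* final avalanche so that the last bytes affect all bits */
	h += h << 3;
	h ^= h >> 11;
	h += h << 15;
	return (h);
}
```

Hashing a cache array this way means every byte of the cache influences the result, which is presumably what "mixed in a better way" is after.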
integers in addition to my 「1#a」 (or 「1#…」), which also allows for
finer end-of-character checking. Note that this is locale-dependent in
ksh93, set ±U dependent in mksh, and mksh’s OPTU-16 encoding is used.
libc function realpath(3) which may not be available on the target
system; compile the realpath builtin unconditionally
looks fine to me, but review is appreciated; this is (very) lightly
based upon MirBSD libc’s realpath(3) and pdksh’s get_phys_path()
• we must not set the item pointer to NULL, since subsequent ktscan()
would stop there and not find any later occurrences
possible resolution strategies:
‣ still keep tablep; store a dummy value (either (void *)-1 or, probably
more portable, &ktenter or something like that) as is-free marker
⇒ retains benefit of keeping count of actually used entries
⇒ see below for further discussion
‣ don't keep tablep; revert to setting entry->flag = 0
⇒ need to ktwalk() or ktsort() for getting number of entries
⇒ simplest code
‣ same but with a twist: make ktscan() set pp to the first one with
!(entry->flag & DEFINED)¹ so that it can subsequently be re-used,
or, more accurately, free’d and the entry pointer re-used
⇒ less chance of texpand()ing when not needed
‣ similar (from kabelaffe@): in ktsearch(), move the one we DID find
to the first unused one
⇒ doesn’t need tablep or something, but has the overall best
memory use
⇒ more complicated ktscan(): needs to check pointer for NULL, for
dummyval, then entry->flag
⇒ makes lookup more expensive
⇒ benefit: self-optimising hash tables
⇒ loss: still need ktwalk() or ktsort()
• when afree()ing in ktremove(), …
① need to take FINUSE into account
• Python-2.5.4/Objects/dictnotes.txt talks about cache lines
‣ linear backward scan is much worse than linear forward scan
(even if we have to calculate the upper C-array bound)
‣ dereferencing the entry pointer in ktscan() is a penalty
• Python-2.5.4/Objects/dictobject.c has a lot of comments and
a rather interesting collision resolution algorithm, which
seems to de-cluster better than linear search at not much
more cost
• clib and libobjfw have unusable (for looking-at-for-ideas)
hash table implementations
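A rough sketch of the collision resolution scheme from dictobject.c, for comparison with linear search. The recurrence (i = 5*i + 1 + perturb, with perturb shifted right by 5 each round) is the one CPython documents; the helper probe_sequence and its interface are hypothetical and only exist to show which slots get visited:

```c
#include <stddef.h>
#include <stdint.h>

#define PERTURB_SHIFT 5	/* value used in CPython's dictobject.c */

/*
 * Hypothetical helper: store the first n slots probed for hash h
 * in a table of (mask + 1) slots; (mask + 1) must be a power of two.
 */
static void
probe_sequence(uint32_t h, uint32_t mask, uint32_t *out, size_t n)
{
	uint32_t i = h & mask, perturb = h;
	size_t k;

	for (k = 0; k < n; k++) {
		out[k] = i & mask;
		/* upper hash bits are folded in until perturb reaches 0 */
		i = (i << 2) + i + perturb + 1;
		perturb >>= PERTURB_SHIFT;
	}
}
```

Two hashes that land in the same first slot follow the same path only as long as their shifted-in upper bits agree, so clusters break up sooner than with linear search; once perturb is 0, the 5*i + 1 recurrence still visits every slot of a power-of-two table.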
this is a no-op change breaking ifdef’d-out code; the most likely
outcome is a switch to the following scheme:
• keep tablep in struct tbl
• use a magic pointer value for ktremove’d entries, deallocate
the struct tbl as soon as possible – if not FINUSE, immediately
inside ktremove()
‣ memory gain, despite needing to have tablep around
• nuke ktdelete, so that all ops go through kt{enter,remove}
‣ gains us accurate fill information
‣ speed gain: ktscan() no longer needs to dereference removed entries
‣ memory (ktsort) and speed (ktwalk) gain: removed entries are now
ignored right from the beginning, so tstate->left and the size
of the sorted array are accurate
‣ removed entries can no longer cause texpand() to be invoked
⇒ this does not give us self-optimising tables, but a speed and
memory benefit plus, probably, simplicity of code; we accurately
know how many non-deleted entries are in a keytab, so we can
calculate whether we need to expand and how much space ktsort() is
going to need, and, once indexed arrays are converted to use
keytabs instead of singly linked linear lists, ${#foo[*]} is fast
(although ${!foo[*]}² and ${foo[*]}³ will need some tweaking and
may run a little less quickly)
• shuffle code around, so that things like search/scan and garbage
collection can be re-used
• use Python’s collision resolution algorithm instead of linear search
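The magic-pointer part of this scheme can be sketched roughly as follows. Everything here (struct ktab, KT_REMOVED, the kt_* functions, the fixed table size, the toy hash, plain linear search) is hypothetical and much simpler than the real keytab code; it only shows why a removed slot must not become NULL and how the fill count stays accurate:

```c
#include <stddef.h>
#include <string.h>

#define KT_SIZE 8	/* power of two; a real table would grow */

struct ktab {
	const char *slot[KT_SIZE];	/* NULL = never used */
	size_t nused;			/* live entries only */
};

/* address of a private object: no real key pointer can equal it */
static const char kt_removed_marker;
#define KT_REMOVED (&kt_removed_marker)

static size_t
kt_hash(const char *s)
{
	size_t h = 0;

	while (*s)
		h = h * 31 + (unsigned char)*s++;
	return (h);
}

/* return the slot index holding key, or KT_SIZE if absent */
static size_t
kt_find(const struct ktab *t, const char *key)
{
	size_t i = kt_hash(key) & (KT_SIZE - 1), n;

	for (n = 0; n < KT_SIZE; n++, i = (i + 1) & (KT_SIZE - 1)) {
		if (t->slot[i] == NULL)
			break;		/* end of the probe chain */
		if (t->slot[i] == KT_REMOVED)
			continue;	/* skipped without dereferencing */
		if (strcmp(t->slot[i], key) == 0)
			return (i);
	}
	return (KT_SIZE);
}

/* return 0 on success (or if already present), -1 if the table is full */
static int
kt_enter(struct ktab *t, const char *key)
{
	size_t i = kt_hash(key) & (KT_SIZE - 1), n;

	if (kt_find(t, key) != KT_SIZE)
		return (0);
	for (n = 0; n < KT_SIZE; n++, i = (i + 1) & (KT_SIZE - 1))
		if (t->slot[i] == NULL || t->slot[i] == KT_REMOVED) {
			t->slot[i] = key;	/* removed slots get re-used */
			t->nused++;
			return (0);
		}
	return (-1);
}

static void
kt_remove(struct ktab *t, const char *key)
{
	size_t i = kt_find(t, key);

	if (i != KT_SIZE) {
		t->slot[i] = KT_REMOVED;	/* not NULL: later entries stay findable */
		t->nused--;
	}
}
```

With the keys "a", "i" and "q", which all hash to the same slot in this toy table, removing "i" leaves "q" reachable, nused drops to 2 right away, and a later kt_enter() of a colliding key takes over the freed slot.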
② the list of keys needs to be sorted, at least for indexed arrays⁴
③ this needs to be sorted by keys, at least for indexed arrays⁴
④ … but this is a nice-to-have for associative arrays⁵ as well
⑤ which we however do not have
bash4 doesn’t have it at all, despite knowing associative arrays
zsh does it………… differently and weirdly
this is for indexed arrays, as mksh doesn’t have associative arrays
but it should help ☺