jehanne/sys/src/cmd/hmi/vga/notes.txt

The following is a sort of theory of operation for aux/vga and the
kernel vga drivers.

--- aux/vga and basic kernel drivers

Aux/vga consists of a number of modules each of which conforms to an
interface called a Ctlr.  The Ctlr provides functions snarf, options,
init, load, and dump, which are explained in more detail below.  Video
cards are internally represented as just a collection of Ctlrs.  When
we want to run one of the functions (snarf, etc.)  on the whole card,
we run it on each Ctlr piece in turn.

In the beginning of aux/vga, it was common for video cards to mix and
match different VGA controller chips, RAMDACs, clock generators, and
sometimes even hardware cursors.  The original use for vgadb was to
provide a recipe for how to deal with each card.  The ordering in the
ctlr sections was followed during initialization, so that if you said
	ctlr
		0xC0076="Tseng Laboratories, Inc. 03/04/94 V8.00N"
		link=vga
		clock=ics2494a-324
		ctlr=et4000-w32p
		ramdac=stg1602-135
when aux/vga wanted to run, say, snarf on this card it would call the
snarf routines for the vga, ics2494a, et4000, and stg1602 Ctlrs, in
that order.  The special Ctlrs vga and ibm8514 take care of the
generic VGA register set and the extensions to that register set
introduced by the IBM 8514 chip.  Pretty much all graphics cards these
days still use the VGA register set with some extensions.  The only
exceptions currently in vgadb are the Ticket to Ride IV and the
Neomagic (both LCD cards).  The S3 line of chips tends to have the IBM
8514 extensions.

This "mix and match" diversity has settled down a bit, with one chip
now usually handling everything.  As a result, vgadb entries have
become a bit more formulaic, usually listing only the vga link, a
controller, and a hardware cursor.  For example:
	ctlr
		0xC0039="CL-GD540"
		link=vga
		ctlr=clgd542x
		hwgc=clgd542xhwgc

On to the controller functions themselves.  The functions mentioned
earlier are supposed to do the following.

void snarf(Vga *vga, Ctlr *ctlr)
	Read the ctlr's registers into memory, storing them
	either in the vga structure (if there is an appropriate
	place) or into a privately allocated structure, a pointer
	to which can be stored in vga->private [sic].
	[The use of vga->private rather than ctlr->private betrays
	the fact that private data has only been added after we got
	down to having cards with basically a single controller.]

void options(Vga *vga, Ctlr *ctlr)
	This step prepares to edit the in-memory copy of the
	registers to implement the mode given in vga->mode.
	It's really the first half of init, and is often empty.
	Basically, something goes here if you need to influence
	one of the other init routines and can't depend on being
	called before it.  For example, the virge Ctlr rounds
	the pixel line width up to a multiple of 16 in its options routine.
	This is necessary because the vga Ctlr uses the pixel line
	width.  If we set it in virge.init, vga.init would already
	have used the wrong value.

void init(Vga *vga, Ctlr *ctlr)
	Edit the in-memory copy of the registers to implement
	the mode given in vga->mode.

void load(Vga *vga, Ctlr *ctlr)
	Write all the ctlr's registers, using the in-memory values.
	This is the function actually used to switch modes.

void dump(Vga *vga, Ctlr *ctlr)
	Print (to the Biobuf stdout) a description of all the
	in-memory controller state.  This includes the in-memory
	copy of the registers but often includes other calculated
	state like the intended clock frequencies, etc.

Now we have enough framework to explain what aux/vga does.  It's
easiest to present it as a commented recipe.

1.  We sniff around in the BIOS memory looking for a match to
any of the strings given in vgadb.  (In the future, we intend also to
use the PCI configuration registers to identify cards.)

2.  Having identified the card and thus made the list of controller
structures, we snarf the registers and, if the -p flag was
given, dump them.

3.  If the -i or -l flag is given, aux/vga then locates the desired
mode in the vgadb and copies it into the vga structure.  It then does
any automatic frequency calculations if they need doing.  (See the
discussion of defaultclock in vgadb(6).)

For a good introduction to video modes, read Eric Raymond's XFree86
Video Timings HOWTO, which, although marked as obsolete for XFree86,
is still a good introduction to what's going on between the video card
and the monitor.
http://www.linuxdoc.org/HOWTO/XFree86-Video-Timings-HOWTO/

4.  Having copied the vgadb mode parameters into the vga structure,
aux/vga calls the options and then the init routines to twiddle the
in-memory registers appropriately.

5.  Now we are almost ready to switch video modes.  We dump the
registers to stdout if we're being verbose.

6.  We tell the kernel (via the "type" vga ctl message) what sort of
video card to look for.  Specifically, the kernel locates the named
kernel vga driver and runs its enable function.

7.  If we're using a frame buffer in direct-mapped linear mode (see
the section below), we express this intent with a "linear" vga ctl
message.  In response, the kernel calls the vga driver's linear
function.  This should map the video memory into the kernel's address
space.  Conventionally, it also creates a named memory segment for use
with segattach so that uesr-level programs can get at the video
memory.  If there is a separate memory-mapped i/o space, it too is
mapped and named.  These segments are only used for debugging,
specifically for debugging the hardware acceleration routines from
user space before putting them into the kernel.

8.  We tell the kernel the layout of video memory in a "size" ctl
message.  The arguments are the screen image resolution and the pixel
channel format string.

9.  Everything is set; we disable the video card, call the loads to
actally set the real registers, and reenable the card.

At this point there should be a reasonable picture on the screen.  It
will be of random memory contents and thus could be mostly garbage,
but there should be a distinct image on the screen rather than, say,
funny changing patterns due to having used an incorrect sync
frequency.

10.  We write "drawinit" into #v/vgactl, which will initialize the
screen and make console output from now on appear on the graphics
screen instead of being written to the CGA text video memory (as has
been happening).  This calls the kernel driver's drawinit function,
whose only job is to initialize hardware accelerated fills and scrolls
and hardware blanking if desired.

11.  We write "hwgc <hwgcname>" into #v/vgactl, which calls the enable
function on the named kernel hwgc driver.  (Plan 9 does not yet support
software graphics cursors.)

12.  We set the actual screen size with an "actualsize" ctl message.
The virtual screen size (which was used in the "size" message in step
8) controls how the video memory is laid out; the actual screen size
is how much fits on your monitor at a time.  Virtual screen size is
sometimes larger than actual screen size, either to implement panning
(which is really confusing and not recommended) or to round pixel
lines up to some boundary, as is done on the ViRGE and Matrox cards.
The only reason the kernel needs to know the actual screen size is to
make sure the mouse cursor stays on the actual screen.

13.  If we're being verbose, we dump the vga state again.

--- hardware acceleration and blanking

Hardware drawing acceleration is accomplished by calling the
kernel-driver-provided fill and scroll routines rather than
doing the memory operations ourselves.  For >8-bit pixel depths,
hardware acceleration is noticeably needed.  For typical Plan 9
applications, accelerating fill and scroll has been fast enough that we haven't
worried about doing anything else.

The kernel driver's drawinit function should sniff the card
and decide whether it can use accelerated fill and scroll functions.
If so, it fills in the scr->fill and scr->scroll function pointers
with functions that implement the following:

int fill(VGAscr *scr, Rectangle r, uint32_t val);
	Set every pixel in the given rectangle to val.
	Val is a bit pattern already formatted for the screen's
	pixel format (rather than being an RGBA quadruple).
	Do not return until the operation has completed
	(meaning video memory has been updated).
	Usually this means a busy wait looping for a bit
	in some status register.  Although slighty inefficient,
	the net effect is still much faster than doing the work
	ourselves.  It's a good idea to break out of the busy
	loop after a large number of iterations, so that
	if the driver or the card gets confused we don't
	lock up the system waiting for the bit.  Look at
	any of the accelerated drivers for the conventional
	method.

int scroll(VGAscr *scr, Rectangle r, Rectangle sr);
	Set the pixels in rectangle r with the pixels in sr.
	r and sr are allowed to overlap, and the correct
	thing must be done, just like memmove.
	Like fill, scroll must not return until the operation
	has completed.

Russ Cox <rsc@plan9.bell-labs.com> has a user-level scaffold
for testing fill and scroll routines before putting them into
the kernel.  You can mail him for them.

Finally, drawinit can set scr->blank to a hardware blanking
function.  On 8-bit displays we can set the colormap to all
black to get a sort of blanking, but for true-color displays
we need help from the hardware.

int blank(VGAscr *vga, int isblank);
	If isblank is set, blank the screen.  Otherwise, restore it.
	Implementing this function on CRT-based cards is known to
	mess up the registers coming out of the blank.
	We've had better luck with LCD-based cards although
	still not great luck.  But there it is.

--- linear mode and soft screens

In the bad old days, the entire address space was only 1MB, but video
memory (640x480x1) was only 37.5kB, so everything worked out.  It got
its own 64kB segment and everyone was happy.  When screens got deeper
and then bigger, the initial solution was to use the 64kB segment as a
window onto a particular part of video memory.  The offset of the
window was controlled by setting a register on the card.  This works
okay but is a royal pain, especially if you're trying to copy from one
area of the screen to another and they don't fit in the same window.
When we are forced to cope with cards that require accessing memory
through the 64kB window, we allocate our own copy of the screen (a
so-called soft screen) in normal RAM, make changes there, and then
flush the changed portions of memory to video RAM through the window.
To do this, we call the kernel driver-provided page routine:

int pageset(VGAscr *scr, int page);
	Set the base offset of the video window to point
	page*64kB into video memory.

With the advent of 32-bit address spaces, we can map all of video
memory and avoid the soft screen.  We call this running the card
in linear mode, because the whole video memory is mapped linearly
into our address space.  Aux/vga is in charge of deciding
whether to do this.  (In turn, aux/vga more or less respects
vgadb, which controls it by having or not having "linear=1" in
the controller entry.)  If not, aux/vga doesn't do anything special,
and we use a soft screen.  If so, aux/vga writes "linear" and
an address space size into vgactl in step #7 above.  In response
the kernel calls the kernel driver's linear function, whose
job was described in step #7.

Most drivers only implement one or the other interface: if you've
got linear mode, you might as well use it and ignore the paging
capabilties of the card.  Paging is typically implemented only
when necessary.

--- from here

If you want to write a VGA driver, it's fairly essential that you get
documentation for the video chipset.  In a pinch, you might be able to
get by with the XFree86 driver for the chipset instead.  (The NVidia
driver was written this way.)  Another alternative is to use
documentation for a similar but earlier chipset and then tweak
registers until you figure out what is different.  (The SuperSavage
parts of the virge driver got written this way, starting with the
Savage4 parts, which in turn were written by referring to the Savage4
documentation and the Virge parts.)

Even if you do get documentation, the XFree86 driver is good to
have to double check.  Sometimes the documentation is incomplete,
misleading, or just plain wrong, whereas the XFree86 drivers,
complicated beasts though they are, are known to work most of the time.

Another useful method for making sure you understand what is going on
is dumping the card's registers under another system like XFree86 or
Microsoft Windows.  The Plan 9 updates page contains an ANSI/POSIX
port of aux/vga that is useful only for dumping registers on various
systems.  It has been used under Linux, FreeBSD, and Windows 95/98.
It's not clear what to do on systems like Windows NT or Windows 2000
that both have reasonable memory protection and are hardware
programmer-unfriendly.

If you're going to write a driver, it's much easier with a real
Plan 9 network or at least with a do-everything cpu/auth/file server
terminal, so that you can have an editor and compiler going on a
usable machine while you continually frotz and reboot the machine
with the newfangled video card.  Booting this latter machine from
the network rather than its own disk makes life easier for you
(you don't have to explicitly copy aux/vga from the compiling machine to
the testing machine) and doesn't wreak havoc on the testing machine's
local kfs.

It's nice sometimes to have a command-line utility to poke
at the vga registers you care about.  We have one that perhaps
we can clean up and make available.  Otherwise, it's not hard
to roll your own.

The first step in writing an aux/vga driver is to write the
snarf and dump routines for the controller.  Then you can
run aux/vga -p and see whether the values you are getting
match what you expect from the documentation you have.

A good first resolution to try to get working is 640x480x8,
as it can use one of the standard clock modes rather than
require excessive clock fiddling.

/sys/src/cmd/aux/vga/template.c is a template for a new
vga controller driver.  There is no kernel template
but any of the current drivers is a decent template.
/sys/src/9/pc/vga3dfx.c is the smallest one that supports
linear addressing mode.