This document consists of three parts: the Introduction, the actual library interface definition and the Implementation Notes. There is also an index of the function names used, a section with ideas for further development, and acknowledgements.
This file has been written to define a certain compatibility level between different operating systems for 6502 computers. It defines an interface for system services on a more abstract level that can be fitted to different real OSes. The goal of this definition is to fit into various different environments, like
The possible operating systems also differ in style and features:
This interface offers a common level for all these scenarios. Programs written for this library can run on any of these platforms; a simple recompile on the proprietary 6502 assembler/compiler will be enough. If people can agree on a standard file format, not even recompiling might be necessary.
In this library there still is a certain degree of freedom in the implementation. System process IDs can still be 8 bit, as long as the library offers a 16 bit interface to the application. Memory allocation can still be pagewise (in 256 byte blocks), as long as an application does not rely on that - in other OSes it can probably allocate smaller chunks. The purpose of this interface is to hide the underlying OS. The application, on the other hand, should never assume any implementation specifics that are not documented in this definition.
Most of the calls are modeled along the standard libc C library, but there also are some calls from the Unix world.
This file does not make any assumptions about the implementation of the calls, although the behaviour is (i.e. should be :-) noted as exactly as possible.
The file interface uses file numbers. These file numbers are valid in the local environment, and need not be globally valid. But the lib6502 always has to accept these numbers, and then transforms them internally to whatever is appropriate for the given OS. Of course an implementation where both are identical should perform better.
File-nrs in lib6502 are treated as uni-directional or bi-directional channels, i.e. an application can either read or write to a provided uni-directional file-nr, and both at a time to a bi-directional file-nr. The OS can provide bi-directional (with read/write operations possible on the same file-descriptor in one task) or uni-directional (where a write to one end can be read from the other end even in the same task) file-descriptors. Only the latter case is difficult. The stdlib must check the fileno and remap it to different filenos for reading or writing.
When an application is started, three uni-directional file-nrs do already exist and are open: STDIN, STDOUT, and STDERR. STDOUT gives the file-nr for writing task output, STDERR is for error output, while STDIN is for reading program input.
All open files are closed when the process terminates.
fopen:
  <- a/y = address of null-terminated filename
     x = file mode
         mode : 0 = read-only
                1 = write-only
                2 = read-write
                3 = append
  -> c=0 : x = file-nr
     c=1 : a = error code
           E_NOTFOUND
           E_PERMISSION
           E_FNAMLEN
           E_DIRECTORY
Files are named as: "Device:directory/filename", where Device depends on the OS. There might be OSes where it's a single character (like A: or 8:), others might have a real name. The lib needs to be able to parse its own, implementation dependent namespace only. And the application should not assume anything about the length of the device name. Directory separator is "/". Escape sequence for the directory separator is "\/". For the filename interpretation see the beginning of the directory section. Character set is ASCII (i.e. all codes between 0x20 and 0x7f must be useable, others are not allowed).
The filesystem might imply other limitations on filename length, on whether there are directories at all, or on the allowed characters. Wildcards are "*", which matches any string, and "?", which matches exactly one character. The interpretation might depend on the filesystem, i.e. may not even be the same on the same system, but on different devices. Escape sequences are "\*" and "\?". Escape sequence for "\" is "\\".
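The wildcard and escape rules above can be modelled in a few lines. The following is only a sketch in Python, not part of the library: the function names `tokenize` and `wildcard_match` are made up for illustration, and the name being matched is assumed to be a plain string without escapes of its own.

```python
def tokenize(pattern):
    """Split a pattern into ('lit', ch), ('any',) and ('one',) tokens."""
    tokens = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            # "\*", "\?", "\\" and "\/" all stand for the escaped character
            tokens.append(("lit", pattern[i + 1]))
            i += 2
            continue
        if ch == "*":
            tokens.append(("any",))
        elif ch == "?":
            tokens.append(("one",))
        else:
            tokens.append(("lit", ch))
        i += 1
    return tokens


def wildcard_match(pattern, name):
    """Return True if name matches pattern ("*" = any string, "?" = one char)."""
    tokens = tokenize(pattern)

    def match(ti, ni):
        if ti == len(tokens):
            return ni == len(name)
        kind = tokens[ti][0]
        if kind == "any":
            # "*" may swallow zero or more characters
            return any(match(ti + 1, k) for k in range(ni, len(name) + 1))
        if ni == len(name):
            return False
        if kind == "one" or tokens[ti][1] == name[ni]:
            return match(ti + 1, ni + 1)
        return False

    return match(0, 0)
```

A real filesystem may interpret wildcards differently, as noted above, so this only shows the default reading of the rules.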
The mode byte can be one of the values mentioned above, which map to

  OPEN_RD       0
  OPEN_WR       1
  OPEN_RW       2
  OPEN_AP       3

Append means that before any write to the file, the file pointer is positioned at the end of the file. The mode can be or'd with some other bits, like

  OPEN_TRUNC    $80   If the file already exists, it will be truncated (when writing).
  OPEN_EXCL     $40   Only one process may have this file open at one time.
  OPEN_NOCREAT  $20   Do not automatically create when writing (OPEN_WR, _RW, _AP).
Default is to automatically create the file when it does not exist and it is opened for writing.
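As an illustration of how the mode byte combines, here is a small Python sketch. The constants are taken from the definition above; the helper `decode_mode` and the assumption that the access mode occupies the low two bits are mine and not stated in this definition.

```python
OPEN_RD, OPEN_WR, OPEN_RW, OPEN_AP = 0, 1, 2, 3
OPEN_TRUNC, OPEN_EXCL, OPEN_NOCREAT = 0x80, 0x40, 0x20


def decode_mode(mode):
    """Split a mode byte into its access mode and option flags."""
    access = mode & 0x03  # assumption: values 0..3 live in the low two bits
    return {
        "access": access,
        "truncate": bool(mode & OPEN_TRUNC),
        "exclusive": bool(mode & OPEN_EXCL),
        # default: create the file when opening for writing, unless NOCREAT
        "create": access != OPEN_RD and not (mode & OPEN_NOCREAT),
    }
```

For example, `decode_mode(OPEN_WR | OPEN_TRUNC)` describes a write-only open that truncates an existing file and creates a missing one.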
fclose
  <- x = file-nr
  Closes the given file-nr. Pushes all remaining data to the receiver, and waits till it is written.
  -> c=0 : everything ok
     c=1 : a = error code
           E_NOFILE
           E_WERROR
           E_NUL

fgetc
  <- x = file-nr
     c=0 : return immediately, c=1 : block till byte
  -> c=0 : a = data byte
     c=1 : a = error code
           E_NOFILE
           E_EMPTY
           E_EOF

fputc
  <- x = file-nr, a = data byte
     c=0 : return immediately, c=1 : block till byte
  -> c=0 : ??
     c=1 : a = error code
           E_NOFILE
           E_FULL (E_TRYAGAIN?)
           E_NUL (noone reads it)

fread
  <- x = file-nr, a/y = address of struct:
         .word address_of_buffer, length_of_buffer
     c = 0 : return immediately, even if nothing read or buffer only partially read.
     c = 1 : wait till buffer is full or EOF (or error)
  -> c = 0 : ok
     struct given holds address+a/y, length-a/y, such that it can be directly given to fread again.
     c = 1 : error code
           E_NOFILE
           E_EMPTY (E_TRYAGAIN?)
           E_EOF

fwrite
  <- x = file-nr, a/y = address of struct:
         .word address_of_buffer, length_of_buffer
     c = 0 : return immediately, even if nothing written, or buffer only partially written.
     c = 1 : wait till buffer is empty (or error)
  -> c = 0 : ok
     struct given holds address+a/y, length-a/y, such that it can be directly given to fwrite again.
     c = 1 : error code
           E_NOFILE
           E_FULL (E_TRYAGAIN?)
           E_NUL (noone reads it)

fseek
  <- x = file-nr, a/y = address of struct:
         .byt mode     ; offset is relative to
                       ; 0 = start of file
                       ; 1 = end of file
                       ; 2 = actual position
         .word 0,0     ; 32 bit offset
  -> c = 0 : ok
     c = 1 : error code
           E_NOFILE
           E_NOSEEK
fgetc and fread, and fputc and fwrite can be used interchangeably. fread/fwrite don't guarantee that the whole buffer is really read/written, even with carry set. For this, see fcntl below.
If fread returns an E_EOF there can still be bytes that have been read during this call, so the returned struct has to be checked.
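Because the struct is advanced by the number of bytes transferred, a caller can simply hand it back until the length reaches zero or the end of file is seen. The following Python sketch models that loop; `make_stream` is a hypothetical stand-in for the real fread call, and it reports bytes and end-of-file together, as described above.

```python
def make_stream(data, chunk):
    """Hypothetical stand-in for fread: delivers at most `chunk` bytes per call."""
    pos = 0

    def fread(address, length):
        nonlocal pos
        count = min(chunk, length, len(data) - pos)
        pos += count
        # bytes may be delivered in the same call that reports end of file
        return count, pos == len(data)

    return fread


def read_fully(fread, address, length):
    """Repeat fread with the updated (address, length) pair until done."""
    total = 0
    while length > 0:
        count, eof = fread(address, length)
        address += count  # struct now points behind the bytes just read
        length -= count   # and holds the remaining buffer length
        total += count
        if eof:
            break
    return total, address, length
```

A remaining length greater than zero after the loop means the stream ended before the buffer was filled.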
When a file is opened read-write, an fseek operation is always required when changing between reading and writing.
There are, however, files that cannot be seeked, namely character devices. If trying to use fseek on such a device, E_NOSEEK is returned. If a seekable file is given to STDIN and STDOUT/STDERR, the behaviour is not defined. Only non-seekable files should be given to STDIN and STDOUT/STDERR, when opened read-write.
pipe
  -> x = file-nr for reading
     y = file-nr for writing
  opens a uni-directional pipe with two file numbers, one for writing, and one for reading. To close the pipe, each end has to be closed separately.

flock
  <- x = file-nr
     a = operation: LOCK_SH, LOCK_EX, LOCK_UN
     c = 0: don't block
     c = 1: block till you get it
  -> c = 0: ok, got lock
     c = 1: a = error code
            E_NOTIMP
            E_NOFILE
            E_LOCKED

The flock call locks a file against access by other tasks. If locked shared, then other tasks may also acquire shared locks - for reading, for example. An exclusive lock can only be acquired by exactly one task at a time - for writing. An exclusive lock cannot be obtained when there are other shared locks, but a pending exclusive lock blocks all other attempts to lock it, even for shared locks. The flock implementation should be fair, i.e. lock attempts are served in the order they arrive, except that exclusive locks get served before shared locks. The flock call is optional. If not implemented, return E_NOTIMP.
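The granting rule can be written down as a single predicate. This is a minimal Python sketch of the rule as worded above; the function `can_grant` and its argument names are invented for illustration, and the operations are passed as strings because this definition does not give numeric values for LOCK_SH and LOCK_EX.

```python
def can_grant(op, shared_holders, exclusive_held, exclusive_waiting):
    """Decide whether a lock request can be granted right now.

    shared_holders    -- number of tasks currently holding a shared lock
    exclusive_held    -- True if some task holds the exclusive lock
    exclusive_waiting -- number of tasks blocked waiting for an exclusive lock
    """
    if op == "LOCK_SH":
        # a pending exclusive request blocks new shared locks too
        return not exclusive_held and exclusive_waiting == 0
    if op == "LOCK_EX":
        # exclusive needs the file completely unlocked
        return not exclusive_held and shared_holders == 0
    raise ValueError("unknown operation: " + op)
```

With carry clear, a request that cannot be granted would return E_LOCKED; with carry set it would wait until the predicate becomes true.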
fcntl
  <- x = file-nr, a = operation
     a = FC_PUSH    0   all buffers are flushed and sent
         FC_PULL    1   actively try to get everything that has already been sent
         FC_RCHECK  2   checks if there is data to read
         FC_WCHECK  3   checks if at least one byte can be written.
  -> c = 0: ok
     c = 1: a = error code
            E_NOFILE
            E_NOTIMP
            E_NOREAD
            E_NOWRITE
The fcntl return code should be ignored, as it is probably not implemented in most of the systems, except for RCHECK/WCHECK calls of course.
fcmd
  <- x = operation, a/y = filename,0 [ , filename2, 0 ]
     x = FC_RENAME  16   filename -> filename2
         FC_DELETE  17
         FC_MKDIR   18
         FC_RMDIR   19
         FC_FORMAT  20   filename only to determine drive
         FC_CHKDSK  21   - " -
Other important calls are the stddup and the dup call.
stddup
  <- x = old stdio file-nr (STDIN, STDOUT or STDERR)
     y = new file-nr for stdio file.
  -> c = 0: ok, x = old stdio file
     c = 1: a = error code
            E_NOFILE
This call replaces a stdio file-nr (the pre-defined STDIN, STDOUT, and STDERR file-nrs) with a new file-nr. The old file-nr must not be closed, as it is when the process terminates. Instead the new file-nr returned must be closed.
dup
  <- x = old file-nr
  -> c = 0: ok, x = new file-nr
     c = 1: a = error code
            E_NOFILE
This call 'reopens' a file, i.e. it returns a new file-nr that is used as the old one. They share the same read/write pointers etc. Both file-nrs must be closed. This way the same file can be given to STDOUT and STDERR in a fork call, for example.
If dup is given a read-write file-nr, both sides are duped and the returned file-nr is bi-directional again.
The library maintains a path that is used for each file system operation. If a filename does not start with a "/" and not with a drive, the path is put in front of the filename. If the filename starts with a drive, it is always taken as an absolute filename, even if the "/" is missing. If only the drive is missing, it is taken from the path.
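The three cases can be sketched as follows. This Python model is an illustration only: `split_drive` and `resolve` are hypothetical names, the saved path is assumed to always carry a drive (as in "A:/usr"), and escaped separators ("\/") are ignored for brevity.

```python
def split_drive(name):
    """Return (drive, rest); drive is None when the name has no "Device:" prefix."""
    head, sep, tail = name.partition(":")
    if sep and "/" not in head:
        return head, tail
    return None, name


def resolve(path, filename):
    """Combine the library's saved path with a filename."""
    drive, rest = split_drive(filename)
    if drive is not None:
        # a drive makes the name absolute, even if the "/" is missing
        if not rest.startswith("/"):
            rest = "/" + rest
        return drive + ":" + rest
    path_drive, path_dir = split_drive(path)
    if rest.startswith("/"):
        # only the drive is missing: take it from the path
        return path_drive + ":" + rest
    # relative name: put the whole path in front
    return path_drive + ":" + path_dir.rstrip("/") + "/" + rest
```

So with a saved path of "A:/usr", the names "file", "/etc/x" and "8:prog" resolve to "A:/usr/file", "A:/etc/x" and "8:/prog" respectively.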
A special case is the directory call with a filename as "*:". It does not use the path, but returns, in each entry, an available device name. The length attribute should give the available amount of storage space on the device. A wildcard in the device field is not allowed otherwise.
fopendir
  <- a/y = address of filename
  -> c = 0: ok, x = file-nr
     c = 1: a = error code
            E_NOFILE
            E_NOTDIR

freaddir
  <- x = file-nr
     a/y = address of buffer
  blocks until read, otherwise use fcntl(FC_RCHECK)
  reads _one_ directory entry into the buffer, which is of length (FD_NAME + MAX_FILENAME)

One entry consists of a directory struct

  .word 0            ; valid bits
  .word 0            ; owner ID
  .word 0            ; group ID
  .word 0            ; permissions (drwxrwxrwx) (2 byte)
  .word 0,0          ; file length in byte (4 byte)
  .byt 0,0,0,0,0,0   ; last modification date
                     ; (year-1990, month, day, hr, min, sec)

The valid bits say which entry in the struct is valid. Bit 0 is for the permissions, bit 1 for the file length, bit 2 for the date. The file length, if not zero, is an approximate value (like the blocks *254 in a vc1541). This struct is followed by the null-terminated filename. The valid bits are:

  FDV_PERM    1
  FDV_LENGTH  2
  FDV_MDATE   4
  FDV_OWNER   8
  FDV_GROUP   16

If permissions are valid, but not group/owner, then the permissions for user and group are invalid, and the permissions for others should be taken.

The directory entry struct definitions are:

  FD_VALID    0    /* 1 byte */
  FD_PERM     1    /* 2 byte */
  FD_LENGTH   3    /* 4 byte */
  FD_MDATE    7    /* 6 byte */
  FD_NAME     13   /* null-terminated */

The permission bits are actually a copy from the Linux man pages...:

  S_ISUID    $800    /* set user ID on execution */
  S_ISGID    $400    /* set group ID on execution */
  S_ISVTX    $200    /* -- (sticky bit) */
  S_IRWXU    $1c0    /* rwx mask for user */
  S_IRUSR    $100    /* read by owner */
  S_IWUSR    $080    /* write by owner */
  S_IXUSR    $040    /* execute by owner */
  S_IRWXG    $038    /* rwx mask for group */
  S_IRGRP    $020    /* read by group */
  S_IWGRP    $010    /* write by group */
  S_IXGRP    $008    /* execute by group */
  S_IRWXO    $007    /* rwx mask for others */
  S_IROTH    $004    /* read by others */
  S_IWOTH    $002    /* write by others */
  S_IXOTH    $001    /* execute by others */
  S_IFMT     $f000   /* file type mask */
  S_IFSOCK   $c000   /* socket */
  S_IFLNK    $a000   /* symbolic link */
  S_IFREG    $8000   /* regular file */
  S_IFBLK    $6000   /* block device */
  S_IFDIR    $4000   /* directory */
  S_IFCHR    $2000   /* character device */
  S_IFIFO    $1000   /* fifo */
  S_IFMNAME  $3000   /* media name */
  S_IFMFREE  $5000   /* free area on media */
  S_IFMSIZE  $7000   /* media size */

Most of the file types are probably not implemented on a 6502 system. But before defining new types, one should keep the already defined values safe. That's for the last three entries, which are not Posix standards. The length field can be an approximation if it is not zero and the length is not valid (for VC1541 block lengths).

fgetattr
  <- a/y = address of dir struct, incl filename (like in freaddir)
  -> c = 0: ok
     c = 1: a = error code
            E_NOTIMP

This tries to fill in the bits that are _not_ valid in a dir struct. For example, if freaddir returned the file length only, but no permissions, then calling fgetattr should get the file permissions. But it is not guaranteed that all fields are filled, as some are not implemented on a certain filesystem. So even after fgetattr, a check of the valid bits is needed. The filename must be completed with the device and path.

fsetattr
  <- a/y = address of dir struct, incl. filename
  -> c = 0: ok
     c = 1: a = error code
            E_NOTIMP

Tries to write new file attributes (the ones where the valid bits are set). Need not succeed. Clears the valid bits for the attributes it has successfully set. The filename must be completed with the device and path.

chdir
  <- a/y = address of new path, relative to the old one
  -> c = 0: ok
     c = 1: a = error code
            E_ILLDRIVE
            E_ILLPATH

cwd
  <- a/y = address of buffer, x = length of buffer
  -> c = 0: ok, buffer contains the current working directory.
     c = 1: a = error code
            E_NOTIMP
            E_FNAMLEN
The chdir call changes the saved path in the library. A "." filename means the same directory, while ".." means the parent directory.
cwd gives the address of the path. However, the path pointed to by the returned address must not be changed.
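Coming back to the directory entries returned by freaddir, the following Python sketch decodes one entry buffer. It follows the FD_* offsets given above (a one byte valid field at offset 0, name at offset 13) and assumes the usual 6502 low-byte-first order for multi-byte fields; the function `parse_entry` is hypothetical and only reports fields whose valid bit is set.

```python
import struct

FD_VALID, FD_PERM, FD_LENGTH, FD_MDATE, FD_NAME = 0, 1, 3, 7, 13
FDV_PERM, FDV_LENGTH, FDV_MDATE = 1, 2, 4
S_IFMT, S_IFDIR = 0xF000, 0x4000


def parse_entry(buf):
    """Decode one directory entry; invalid fields are left out of the result."""
    valid = buf[FD_VALID]
    # 2 byte permissions followed by 4 byte length, little-endian, unaligned
    perm, length = struct.unpack_from("<HI", buf, FD_PERM)
    year, month, day, hour, minute, second = buf[FD_MDATE:FD_NAME]
    end = buf.index(0, FD_NAME)  # filename is null-terminated
    entry = {"name": buf[FD_NAME:end].decode("ascii")}
    if valid & FDV_PERM:
        entry["perm"] = perm
        entry["is_dir"] = (perm & S_IFMT) == S_IFDIR
    if valid & FDV_LENGTH:
        entry["length"] = length
    if valid & FDV_MDATE:
        entry["mdate"] = (1990 + year, month, day, hour, minute, second)
    return entry
```

After fgetattr the same decoding applies; the valid byte still has to be checked, since not every filesystem can fill in every field.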
Network streams are used as well as any other file, so we only need opening calls. Currently only TCP/IP is defined and thought of, but there should be no problem allowing other networks.
connect
  <- a/y = address of: byte length of address (incl. length byte), plus 4 byte inet addr (+2 byte port for TCP/UDP)
     x = protocol (IPV4_TCP, IPV4_UDP,...)
  -> c = 0: x = (non-seekable, bi-directional) file-nr for read/write
     c = 1: a = error code
            E_NOTIMP
            E_PROT
            E_NOROUTE
            E_NOPERM

listen
  <- c = 0: a/y = addr of: byte length of port, 2 byte port number (for TCP/UDP on IP)
            x = protocol
  -> c = 0: ok, x = listenport
     c = 1: a = error code
            E_NOTIMP
            E_PROT
            E_PORTINUSE
  opens a port to listen at

  <- c = 1: x = listenport
  -> c = 0: ok
     c = 1: a = error code
            E_NOTIMP
            E_NOPORT
  closes the listenport again.

accept
  <- a/y = address of buffer for struct: 1 byte length of buffer (incl. length byte), the rest of the buffer need not be set.
           for IPV4 TCP and UDP we need place for: 4 byte IP address + 2 byte port
     x = listenport
     c = 0: don't block
     c = 1: block
  -> c = 0: x = file-nr for read/write
            The buffer contains the address that the remote machine uses. The 1st byte contains the length of the address (should not differ from the length indicated by the protocol number in listen).
     c = 1: a = error code
            E_NOTIMP
            E_ADDRLEN
            E_NOMEM
            E_TRYAGAIN

connect is something like Unix socket() and connect() together. listen is something like socket(), bind() and listen() together. listen tells the network layer that the application is going to accept connections on a certain port. Therefore, when a connection is requested from remote, the network layer can accept it already and hold it "on line" until the task gets the connection with accept. The maximum number of acceptable connections is implementation specific.
accept gets the first connection waiting for an accept. The other side's IP and port are stored in the buffer given in a/y. If a connection is refused after checking IP or port, the 'accepted' connection should be closed immediately.
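The address buffer used by connect and filled in by accept can be illustrated with a short Python sketch. The layout (length byte, 4 byte address, 2 byte port) is taken from the definition above; the byte order is not stated there, so network byte order for address and port is an assumption here, and the helper names are made up.

```python
import socket
import struct


def connect_address(ip, port):
    """Build the buffer passed to connect for IPv4 TCP/UDP."""
    body = socket.inet_aton(ip) + struct.pack(">H", port)  # assumed big-endian
    return bytes([len(body) + 1]) + body  # length byte counts itself


def parse_address(buf):
    """Decode a buffer as filled in by accept: (length, ip string, port)."""
    length = buf[0]
    ip = socket.inet_ntoa(buf[1:5])
    (port,) = struct.unpack(">H", buf[5:7])
    return length, ip, port
```

A task that wants to refuse connections from certain peers would decode the buffer this way after accept and close the returned file-nr immediately if the address is not acceptable.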
Allocation may happen at byte boundaries or at page boundaries - an application must not rely on a certain alignment!
malloc
  <- a/y length of block needed
  -> c=0: a/y address of block allocated
     c=1: a = error code
          E_NOMEM

mfree
  <- a/y address of block released
  -> c=0: ok
     c=1: error code
          E_ILLADDR

realloc
  <- a/y address of block, x position of new length in zeropage (2 byte)
  -> c=0: ok
     c=1: error code
          E_ILLADDR
Allocated memory blocks are automatically freed on process termination.
Process management is a bit more complicated. Process ID interface is 16 bit, although they need not all be used, of course.
exec
  <- a/y = addr of filename,0 [, parameter1, 0 ...] ,0
  -> if no error, then the new program starts and gets
     a/y = address of filename,0 [, parameter1, 0 ...],0
     otherwise return:
     c=1: a = error code
          E_NOTFOUND
          E_NOMEM
  allocates new environment and removes old environment. starts newly loaded executable file.

forkto
  <- a/y addr of struct:
         .byte STDIN, STDOUT, STDERR, exec_struct
  -> c=0: x/y = child pid
     c=1: a = error code
          E_NOMEM
          E_ILLSTR
          E_NOTFOUND
  This is not really a fork like in Unix, but it creates a new process, so it still 'forks'. The new process is started with executing the file given in the exec_struct - which is the same struct as given to exec. The file-nrs given for STDIN, STDOUT and STDERR share the same read/write pointers as the ones in this process. They are internally 'duped', and the calling task has to close them after calling forkto. No other file-nos are inherited.

forkthread
  <- a/y addr of execution for new thread
  -> c = 0: ok, xr = new thread number
     c = 1: a = error code
            E_NOTIMP
  forkthread sets up a new thread to run in the same memory environment as the calling thread/process. The new thread is started at the given address, with an empty stack. This means the thread has to explicitly call term when terminating.

term
  <- a = return code

kill
  <- a = return code, x/y = pid (or MYTASK = myself -> suicide)
  -> c=0: ok (except for MYTASK)
     c=1: a = error code
          E_ILLPID
The term call terminates the current thread only. The memory etc is only freed when all threads in this environment have terminated. Kill terminates all threads in the environment indicated by the process ID.
getpid -> x/y = own PID
When forking, the files still share the same seek pointer (address in the file where they read/write). When one process writes to a file, the other process's write pointer moves on too; the same holds for the read pointer. Otherwise file sharing with the 1541 would be impossible, for example.
STDIN/STDOUT and STDERR file-nrs appear to be opened before process start. They can be closed as any other file, though. When calling forkto, the file-nrs given to it are `duped' internally, such that they have to be closed in the calling process, as well as in the newly created process.
All files opened by this task are closed when it terminates. All memory blocks allocated by this task are freed when it terminates.
The newly created process is started by calling the "main" function, with a/y pointing to a list of arguments:
.byt "arg0",0, "arg1",0, ... ,"argn",0,0
The "main" function can either call "term", "kill" or return with a "rts" opcode.
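The argument list handed to "main" is a sequence of null-terminated strings closed by an extra zero byte. A minimal Python sketch of walking that list follows; `parse_args` is a hypothetical helper, and `mem` stands for the memory that a/y points into.

```python
def parse_args(mem, addr=0):
    """Collect the null-terminated strings starting at addr until an empty one."""
    args = []
    while mem[addr] != 0:          # a zero byte here is the closing terminator
        end = mem.index(0, addr)   # end of the current string
        args.append(mem[addr:end].decode("ascii"))
        addr = end + 1             # continue behind the terminating zero
    return args
```

One consequence of this encoding is that an empty argument cannot be passed, since it would be read as the end of the list.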
yield
yield gives the process control back to the scheduler. It need not be called at all. But if a process is doing a busy loop (or spin-lock) waiting for a certain condition, it might call yield to give other tasks the opportunity to run. It is not necessary to call yield at all! The scheduler interrupts any thread and takes control back when the thread has used its timeslice. But if the thread knows that it waits for something to be done by another task, instead of waiting for the end of its timeslice it can call yield to give other tasks the immediate opportunity to run.
Interprocess communication heavily depends on the system underneath the library, so it's not that easy. So far we handle semaphores, signals, and send/receive.
semget
  <- c = 0: don't block, c = 1: wait till you get one
  -> c = 0: ok, x = semaphore number
     c = 1: a = error code
            E_NOTIMP
            E_NOSEM
  gets a new semaphore

semfre
  <- x = semaphore number
  -> c = 0: ok
     c = 1: a = error code
            E_NOTIMP
            E_NOSEM
            E_INUSE
  releases a used semaphore. If a process is waiting for the semaphore, returns E_INUSE

semgetnamed
  <- c = 0: a/y = name of semaphore
            x = 0 : if not found, return error
            x = 1 : if not found, alloc name and return ok
  -> c = 0: ok, x = semaphore number
     c = 1: a = error code
            E_NOTIMP
            E_NOTFOUND
            E_NOSEM
  This call tries to allocate a 'named' semaphore. If the name already exists, the associated semaphore number is returned. If the name doesn't exist, and x=0, then an error is returned. If a name doesn't exist, and x=1, then the new name is allocated, a semaphore is allocated and associated with the name.

  <- c = 1: a/y = name of semaphore
  -> c = 0: ok
     c = 1: a = error code
            E_NOTIMP
            E_NOTFOUND
  The named semaphore is de-allocated. The named semaphore handler counts the number of allocations and frees a semaphore if the name is totally deallocated.

  With this call, one can system-independently allocate system and hardware resources, if they are protected by semaphores. Predefined semaphore names are:

    SEM_C64_SERIEC, SEM_C64_PARIEC, SEM_C64_SID, SEM_C64_VID, SEM_C64_KEYBOARD,
    SEM_C64_CIA1TA, SEM_C64_CIA1TB, SEM_C64_CIA1TOD,
    SEM_C64_CIA2TA, SEM_C64_CIA2TB, SEM_C64_CIA2TOD,
    SEM_CSA_SERIEC, SEM_CSA_PARIEC, SEM_CSA_WD1770,
    SEM_GECKO_SERIEC, SEM_GECKO_IRTX

psem
  <- x = semaphore number
     c = 0: don't block; c = 1: wait till gotten
  -> c = 0: got semaphore
     c = 1: a = error code
            E_NOSEM
  Pass operation on a semaphore. Locks the semaphore.

vsem
  <- x = semaphore number
  Free operation on a semaphore. Lets other threads "pass".

Signals are some kind of 'remote procedure call' - a signal handler for a certain signal is called upon another thread's request.

signal
  <- x = signal-number
     a/y = address of signal handler
  -> c = 0: ok
     c = 1: a = error code
            E_NOTIMP
            E_ILLSIG
  installs a signal handler for a signal. A signal handler address of NULL de-installs a handler.

sendsignal
  <- a/y pid of receiving process
     x = signal number
  -> c = 0: ok, sent
     c = 1: a = error code
            E_ILLPID
            E_ILLSIG
  sends a signal to another process. A signal is an emulated interrupt to the address specified as the signal handler address.
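The allocation counting behind named semaphores can be modelled as a small table. This Python sketch is only one possible reading: the class `NamedSemaphores` is hypothetical, error codes are returned as strings for readability, and it assumes that every successful semgetnamed call (whether it creates the name or finds it) counts as one allocation.

```python
class NamedSemaphores:
    """Toy model of the named semaphore handler with allocation counting."""

    def __init__(self):
        self.table = {}      # name -> [semaphore number, allocation count]
        self.next_free = 0   # stand-in for whatever semget would hand out

    def get(self, name, create):
        """Model of semgetnamed with c=0; create corresponds to x=1."""
        if name not in self.table:
            if not create:
                return "E_NOTFOUND"
            self.table[name] = [self.next_free, 0]
            self.next_free += 1
        self.table[name][1] += 1
        return self.table[name][0]

    def free(self, name):
        """Model of semgetnamed with c=1: drop one allocation of the name."""
        if name not in self.table:
            return "E_NOTFOUND"
        self.table[name][1] -= 1
        if self.table[name][1] == 0:
            del self.table[name]   # name totally deallocated: semaphore is freed
        return None
```

Two tasks asking for "SEM_C64_SID" would get the same semaphore number, and the semaphore is only released once both have de-allocated the name.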
Allowed signals are
SIG_USR1 SIG_USR2 SIG_USR3 SIG_USR4 SIG_CHLD
The signal handler gets the signal mask for the signal in ac (They may or may not be combined if the same signal handler is used for more than one signal type).
A SIG_CHLD is used to get information about child processes that died. If it is received, the xr/yr registers hold the PID of the dead child and ac holds the return code of the child. Only child processes are registered that died when the signal handler mask was set. You can only receive, but not send a SIG_CHLD.
This section is very preliminary, as the SEND/RECEIVE interface in OS/A65 is not really useable without MMU, and Lunix doesn't have SEND/RECEIVE.
send
  <- a/y = address of
         .word receiver_pid
         .word address_of_data
         .word length_of_data
     c = 0: don't block
     c = 1: wait till accepted
  -> c = 0: block sent
     c = 1: a = error code
            E_ILLPID
            E_NOTIMP
  sends a message to another process. The data sent is not changed, or freed or whatever.

receive
  <- a/y = address of three words, second and third word give address and length of receiver buffer
     x = 0 : accept any sender
     x = 1 : first word in (a/y) contains the sender
     c = 0 : don't block
     c = 1 : wait till received
  -> c = 0 : message received, (a/y) has
         .word sender_pid
         .word address_of_data
         .word length_of_data
     c = 1 : a = error code
            E_NOTIMP
            E_ILLPID
            E_NOMEM
  The data is stored in the buffer, and length_of_data is changed to the length actually received. If the buffer is too short, length_of_data is set to the length needed, and E_NOMEM is returned.

getenv:
  <- a/y name of env. variable
  -> a/y = address of value or NULL if not set.

putenv:
  <- a/y addr of "name=string"
  -> c=0: ok
     c=1: error
          E_NOMEM
          E_NOTIMP
  add or change env var. If the variable is empty, it is unset and will return either a NULL pointer with getenv or a pointer to an empty string.

getenvp:
  -> a/y addr of null-terminated list of null-terminated strings with "varname=value" in each string, describing all environment variables. It is NOT allowed to change these variables 'by hand'.

getos:
  -> a/y addr of operating system string
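The three-word parameter block and the length rule of receive can be shown with a short Python sketch. The little-endian word order is the normal 6502 .word layout; the helper names `pack_message` and `receive_into` are invented, and the error code is returned as a string purely for illustration.

```python
import struct


def pack_message(receiver_pid, data_addr, data_len):
    """Build the block passed to send: three 16 bit words, low byte first."""
    return struct.pack("<HHH", receiver_pid, data_addr, data_len)


def receive_into(buffer_len, message):
    """Model the length handling of receive.

    Returns (error, length_of_data): on success the length actually received,
    on a too short buffer E_NOMEM together with the length needed.
    """
    if len(message) > buffer_len:
        return "E_NOMEM", len(message)
    return None, len(message)
```

A receiver that gets E_NOMEM can thus allocate a buffer of the reported length and call receive again.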
The operating system string is a string containing the
Implementation notes are currently available for the o65 file format only. This file format is rather flexible, and some of the ideas can be taken for other lib6502 file formats.
The o65 file format is defined in another file format specification. It allows the use of undefined references. In order to simplify the relocation procedure, lib6502 files have one undefined reference, namely "STDLIB". This reference defines the base of the lib6502 jump table. At STDLIB+0 there is a JMP opcode pointing to the code for fopen. At STDLIB+3 is a JMP opcode pointing to the code for fclose etc. The order is determined by the order given in the index of this definition.
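Since every jump table entry is a three byte JMP instruction, the address of a call follows directly from its position in the index. The Python sketch below illustrates this; the function names are hypothetical, only the positions of fopen (0) and fclose (1) are taken from the text above, and $4C is the 6502 opcode for JMP absolute.

```python
JMP_ABS = 0x4C  # 6502 opcode: JMP absolute


def entry_address(stdlib_base, index):
    """Address of the jump table entry for the call at the given index."""
    return stdlib_base + 3 * index


def build_table(targets):
    """Emit one 'JMP target' per call, in index order."""
    table = bytearray()
    for target in targets:
        table += bytes([JMP_ABS, target & 0xFF, target >> 8])  # low byte first
    return bytes(table)
```

With the table relocated to, say, $8000, a program calling fclose would end up doing a JSR to `entry_address(0x8000, 1)`, i.e. $8003.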
A global variable is the "main" address, which is the start address for any lib6502 executable. If the "main" address is not given in the object file as a global variable, the start of the text segment is assumed to be the "main" address.
The lib6502 file format allows the use of "header options", where some OS specific options may be saved. The lib6502 files can - but don't need to - use a lib6502 header option (as defined in the o65 file format specification). This lib6502 header option contains the following struct:
.byt lib6502_major_version_nr, lib6502_minor_version_number .byt lib6502_needed_level, lib6502_possible_level
The version numbers are hints as to which library version the file is compiled with. The level is a new number that describes which functions are used, and which are not. A library may provide a certain amount of lib calls in the library call table (STDLIB). The maximum number of calls used is given in the "possible_level" value. The maximum number that must be functional (and with only few exceptions not just return "E_NOTIMP") is given by the "needed_level" number.
The level numbers are defined as:
If the file needs a Possible Level greater than the level provided by the library, an "E_LIBLEVEL" error code should be returned by forkto or exec.
Acknowledgements go to