Fix for Read_Only_Once bug for new partitions



From: Rodney.Hamilton@NOSPAM.Invalid (Rodney Hamilton)
Subject: Fix for Read_Only_Once bug for new partitions
Date: Sun, 24 Nov 2002 01:50:34 -0000

I ran into an interesting problem with the Glenside IDE driver when I tried to create new partitions on my drive. The symptom was that the new partitions would allow reads only ONCE, then fail thereafter.

I tracked down this Read_Only_Once problem to the DD.TOT field in the partition's drive table. Whenever the unformatted partition's LSN0 was read and copied into the drive table, the table's DD.TOT field was being set to zero! This happened because each new partition's LSN0 was zero-filled: I had previously cleared the disk under Linux by filling it from /dev/zero.

It seems that OS9 always trusts the drive table's sector count and refuses to read any LSN greater than or equal to DD.TOT. If the count becomes zero, NO data will be read, even for the raw device. That includes LSN0 itself: since LSN0 can no longer be read, the drive table is never refreshed and DD.TOT remains stuck at zero, making the partition unusable until the next reboot.
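
The range check described above can be modeled in a few lines. This is only a sketch in Python, not RBF's actual code; the function names are made up for illustration, and it assumes DD.TOT is the 3-byte big-endian value at the start of LSN0.

```python
def dd_tot_from_lsn0(lsn0: bytes) -> int:
    """Return the 24-bit sector count held in the first 3 bytes of LSN0."""
    # Assumption: DD.TOT is stored big-endian (6809 byte order).
    return int.from_bytes(lsn0[0:3], "big")


def lsn_allowed(lsn: int, dd_tot: int) -> bool:
    """Model of the check: any LSN >= DD.TOT is refused."""
    return lsn < dd_tot


zeroed_lsn0 = bytes(256)                  # sector wiped from /dev/zero
count = dd_tot_from_lsn0(zeroed_lsn0)     # count is 0
print(lsn_allowed(0, count))              # even LSN0 is refused
```

With a count of zero there is no LSN that passes the check, which is why the device goes completely dead.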

The Read_Only_Once bug is a direct consequence of this behavior when the partition's first three bytes are zeros. The driver init sets DD.TOT to $FF0000, and the FIRST process to open the device inherits this value and can read or write the disk. All later processes see the now-cached $000000 value and are refused access. I ran into the problem when I tried to format the new partitions - lformat opens and closes the new partition several times, thereby becoming a "later" process as seen by RBF.
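
The sequence of events can be simulated with a toy drive table. This is a simplified model for illustration only: the class and its methods are hypothetical, and the real RBF/driver interaction is more involved than a single read method.

```python
class SimulatedDrive:
    """Toy model of a drive table entry and its backing sectors."""

    INIT_DD_TOT = 0xFF0000  # value the driver init puts in DD.TOT

    def __init__(self, sectors):
        self.sectors = sectors
        self.dd_tot = self.INIT_DD_TOT

    def read(self, lsn: int) -> bytes:
        # Refuse any LSN at or beyond the cached sector count.
        if lsn >= self.dd_tot:
            raise OSError(f"LSN {lsn} refused, DD.TOT={self.dd_tot:#08x}")
        data = self.sectors[lsn]
        if lsn == 0:
            # Reading LSN0 copies its first 3 bytes into the drive table.
            self.dd_tot = int.from_bytes(data[0:3], "big")
        return data


drive = SimulatedDrive([bytes(256)] * 4)  # zero-filled new partition
drive.read(0)                             # first read works
try:
    drive.read(0)                         # every later read is refused
except OSError as err:
    print("second read failed:", err)
```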

The RBF DD.TOT problem can be easily duplicated with a scratch floppy:

  1. insert a blank OS9-formatted floppy into /d1
  2. run "ded /d1@", note valid LSN0 data (save DD.TOT value "for later")
  3. edit sector, set first 3 bytes to 00, write sector, quit ded
  4. rerun same ded command (CTRL-A), *NO* LSN0 data is read!
  5. No data CAN be read, from ANY sector! quit ded
  6. try dir/dump/format, etc. No go, THAT floppy is unreadable!

To fix it, run ded on a good floppy (so the drive table holds a nonzero count), swap in the dead one, reread LSN0, restore the saved count, write the sector, and quit. The floppy is good again.
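
The "fix the count" step amounts to rewriting the first three bytes of LSN0. A minimal sketch on an in-memory sector buffer follows; the helper name is hypothetical and the sector count used here is an arbitrary example value, not one taken from a real disk.

```python
def patch_dd_tot(lsn0: bytes, total_sectors: int) -> bytes:
    """Return a copy of LSN0 with DD.TOT (bytes 0..2) set to total_sectors."""
    if not 0 < total_sectors <= 0xFFFFFF:
        raise ValueError("DD.TOT must be a nonzero 24-bit value")
    # Assumption: DD.TOT is stored big-endian in the first 3 bytes.
    return total_sectors.to_bytes(3, "big") + lsn0[3:]


dead_lsn0 = bytes(256)                       # first 3 bytes zeroed
fixed_lsn0 = patch_dd_tot(dead_lsn0, 0x0005A0)
print(fixed_lsn0[:3].hex())
```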

For non-removable RBF devices, you have to play offset games with the device descriptor or even directly hack the in-memory drive tables to regain access. Since RBF has no mechanism to handle this special case, it is left up to the device driver to fix the problem.

The IDE driver fix turns out to be quite simple - after updating the drive table, check if DD.TOT is zero and force the msb to $FF if so. This should be done in the "CpyDrvTb" routine at the end of ccide16.a. I've tested this fix and it works just fine.
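
In outline, the check looks like the sketch below. It is written in Python purely to show the logic and is not the 6809 assembly from ccide16.a; the function name is made up for illustration.

```python
def copy_lsn0_to_drive_table(lsn0: bytes) -> int:
    """Return the DD.TOT value to store in the drive table after reading LSN0."""
    dd_tot = int.from_bytes(lsn0[0:3], "big")
    if dd_tot == 0:
        # A zero count would lock out all further reads, so force the
        # most significant byte to $FF, giving $FF0000.
        dd_tot |= 0xFF0000
    return dd_tot


print(hex(copy_lsn0_to_drive_table(bytes(256))))   # zeroed LSN0
```

A nonzero count read from LSN0 is left untouched, so formatted partitions behave exactly as before.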

Here is my updated version of that routine, with the drive table fix.