On Feb 8, 2006, at 3:53 AM, Rashmi M wrote:
I am writing an application on Mac to list out the files deleted from the system (even from the trash). I have written code to read the catalog file node by node. But I have no idea of how deleted files are represented and how can I access them. Any guesses?
First of all, "the trash" is just a separate directory on the disk. Putting something in the trash merely moves the file or directory into the trash directory. The file or directory doesn't actually get deleted until the user "empties" the trash. The name and location of the trash directory have changed in various versions of Mac OS. In Mac OS 9, the directory is named "Trash" and is in the volume's root directory; by convention, it has its "invisible" bit set in the Finder Info. In Mac OS X, there is a directory named ".Trashes" in the volume's root directory; inside it are directories whose names are numeric: one per user ID, for each user who has a trash directory (they're created on demand).
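To make the two layouts concrete, here is a minimal Python sketch that builds the candidate trash paths for a volume. The function name and the example mount point are hypothetical, and it only covers the two conventions described above; it does not check whether the directories exist.

```python
from pathlib import PurePosixPath


def trash_dir_candidates(volume_root: str, uid: int) -> list[PurePosixPath]:
    """Return the trash locations described above for one volume.

    Hypothetical helper: it only builds paths, it does not touch the disk.
    """
    root = PurePosixPath(volume_root)
    return [
        root / "Trash",                # Mac OS 9: single invisible "Trash" directory
        root / ".Trashes" / str(uid),  # Mac OS X: one numeric directory per user ID
    ]


if __name__ == "__main__":
    # "/Volumes/Data" and uid 501 are made-up example values.
    for path in trash_dir_candidates("/Volumes/Data", 501):
        print(path)
```

Anything still present under one of these directories has only been moved, so its catalog records are intact and it can be listed like any other file.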
When a file or directory is actually deleted, its record(s) are removed from the Catalog B-tree. And if it had overflow extents (more than 3 extents for HFS, or more than 8 extents for HFS Plus) then the overflow extent records are removed from the Extents B-tree. In Mac OS X 10.4.0 and later, a file or directory can have extended attributes stored in the Attributes B-tree; records for the deleted item would be removed from the Attributes B-tree as well. The space occupied by a file's forks is freed by clearing the corresponding bits in the allocation bitmap.
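The bookkeeping for extents and the bitmap can be sketched in a few lines of Python. This is only an illustration, not the file system's actual code: the function names are hypothetical, the bitmap is an in-memory bytearray, and the sketch assumes the most significant bit of each byte maps to the lowest-numbered allocation block.

```python
# Number of extents kept inline in the catalog record, per the text above.
INLINE_EXTENTS = {"HFS": 3, "HFS Plus": 8}


def has_overflow_extents(fs: str, extent_count: int) -> bool:
    """True if a fork with this many extents needs Extents B-tree records."""
    return extent_count > INLINE_EXTENTS[fs]


def free_extent(bitmap: bytearray, start_block: int, block_count: int) -> None:
    """Mark an extent's allocation blocks as free by clearing their bits.

    Assumption: bit 7 (0x80) of each byte is the lowest-numbered block.
    """
    for block in range(start_block, start_block + block_count):
        mask = 0x80 >> (block % 8)
        bitmap[block // 8] &= ~mask & 0xFF


if __name__ == "__main__":
    bitmap = bytearray([0xFF, 0xFF])   # 16 blocks, all allocated
    free_extent(bitmap, 6, 4)          # free blocks 6 through 9
    print(bitmap.hex())
```

Note that freeing only flips bits; the blocks' contents stay on the media until something else is allocated there and written.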
Trying to recover deleted files is problematic. Many file systems, such as UFS, EXT, or FAT, will simply mark a directory entry as "deleted" by overwriting a small number of bytes; you may be able to restore those bytes to a non-deleted state and find some or all of the original file's information. It's generally not that easy with HFS or HFS Plus.
In the B-trees, there are typically several records in a single node. They're essentially an array of records. If you delete a record in the middle of the node, the records that follow it get shuffled up to overwrite the original record, usually leaving no remnants of the original record. If the record being deleted is the last one in the node, it can be deleted by merely decrementing the number of records, in which case it might be possible to recover the original record. But with Mac OS X, we found that some non-Apple disk repair utilities were too aggressive in trying to recover valid-looking records in the unused portion of the node, so we began explicitly overwriting the newly freed space with zeroes. So, if Mac OS X deleted a file, the original record will always be overwritten with other data (either other records, or zeroes).
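Here is a simplified Python model of that behavior. It is not the on-disk node layout (there is no node descriptor, and the offsets live in a separate list rather than at the end of the node); the function name and the `zero_fill` flag are hypothetical and exist only to contrast the two behaviors described above.

```python
def delete_record(node: bytearray, offsets: list[int], index: int,
                  zero_fill: bool = True) -> None:
    """Delete record `index` from a packed node, in place.

    offsets[i] is the start of record i; offsets[-1] is the start of free space.
    """
    start, end = offsets[index], offsets[index + 1]
    size = end - start
    free_start = offsets[-1]

    # Shuffle the following records up over the deleted one.
    node[start:free_start - size] = node[end:free_start]

    # Optionally zero the newly freed tail (the Mac OS X behavior described above).
    if zero_fill:
        node[free_start - size:free_start] = bytes(size)

    # Drop the deleted record's entry and shift the later offsets.
    offsets[:] = offsets[:index] + [o - size for o in offsets[index + 1:]]


if __name__ == "__main__":
    node = bytearray(b"AAABBCCCC" + bytes(3))
    offsets = [0, 3, 5, 9]             # three records: AAA, BB, CCCC
    delete_record(node, offsets, 1)    # delete the middle record
    print(bytes(node), offsets)
```

With `zero_fill=False`, deleting the last record leaves its bytes sitting in the free area, which is the one case where a scan of the node might still find it.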
So perhaps your best bet at recovery is if you can recognize the content of a file. You could scan the volume's free allocation blocks looking for recognizable content. But beware that a file may not have been stored contiguously on the media. And the content may have been moved over time (especially with Mac OS X's adaptive hot file clustering), so you may see valid-looking content that is actually from an older version of the file.
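A minimal sketch of such a scan, in Python, might look like the following. Everything here is an assumption for illustration: the volume and allocation bitmap are passed in as byte strings you have already read, the bitmap uses most-significant-bit-first ordering, and only three well-known file signatures are checked, and only at the start of each free allocation block.

```python
# A few well-known leading byte signatures ("magic numbers").
SIGNATURES = {
    b"\xFF\xD8\xFF": "JPEG",
    b"%PDF-": "PDF",
    b"\x89PNG\r\n\x1a\n": "PNG",
}


def is_block_free(bitmap: bytes, block: int) -> bool:
    """Assumption: bit 7 (0x80) of each byte is the lowest-numbered block."""
    return not (bitmap[block // 8] & (0x80 >> (block % 8)))


def scan_free_blocks(volume: bytes, bitmap: bytes, block_size: int):
    """Return (block number, file type) for free blocks that start with a known signature."""
    hits = []
    for block in range(len(volume) // block_size):
        if not is_block_free(bitmap, block):
            continue                      # still in use by a live file
        offset = block * block_size
        head = volume[offset:offset + 16]
        for magic, kind in SIGNATURES.items():
            if head.startswith(magic):
                hits.append((block, kind))
    return hits
```

A hit only tells you where a file's first block may have been. Because of the fragmentation and relocation caveats above, the blocks that follow are not guaranteed to belong to the same file, or to its latest version.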
-Mark