On Friday, March 29, 2002, at 03:40 AM, Biswaroop(External) wrote:
Well, in the MDB structure for an HFS volume, the field vol.drXTClpSiz /* clump size for extents overflow file */ is 4 bytes long, but in the Catalog Data Record structure the member filClpSize /* file clump size */ takes only 2 bytes. Therefore, when I assign the value of the first field to the second, I lose information.
I'm not sure why you're copying from one to the other. The drXTClpSiz is the clump size for the extents B-tree only. Since the B-tree is used in a very different way from typical user files, I don't see a reason to try and set an ordinary file's clump size to be the same as one of the B-trees.
I believe Apple's code sets the clump size in a catalog record to zero; I think you can do the same. It turns out that having different clump sizes for different files wasn't very useful. If an application really wanted to make sure that a file was allocated in large contiguous pieces, it was generally better to try and pre-allocate it in one giant contiguous piece (or when allocating additional space, make the entire allocation contiguous). At runtime, Apple's code just uses a volume-wide default for ordinary files (i.e. ones with a catalog record).
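To illustrate that runtime behavior, here is a minimal sketch of how a zero per-file clump size can fall back to the volume-wide default. The field names follow the classic HFS structures (drClpSiz in the MDB, filClpSize in the catalog record), but the helper itself is just an illustration, not Apple's actual code:

    #include <stdint.h>

    /* Sketch only: pick the clump size to use when growing an ordinary
     * file.  If the catalog record's filClpSize is zero (as Apple's code
     * writes it), fall back to the volume-wide default from the MDB
     * (drClpSiz).  Hypothetical helper, not an HFS API. */
    static uint32_t effectiveClumpSize(uint16_t filClpSize, uint32_t drClpSiz)
    {
        return (filClpSize != 0) ? (uint32_t)filClpSize : drClpSiz;
    }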
Please, is there any simple formula to find the extents overflow file size and the catalog file size for a volume when we know beforehand how many files have to be on that volume? For example, if I know I have to write "X" files contained in "Y" directories, can I calculate what the volume's clump sizes for the extents overflow file and the catalog file should be?
There is certainly no simple formula for the catalog B-tree. That is partly because the size of the catalog is determined by the lengths of the file and directory names (even more so on HFS Plus, where the keys in index nodes are variable length), and partly because, for volumes that are modified over time, the order of operations affects the size of the B-tree in complex ways. I'm sure you could come up with a statistical guess based on average name lengths, average density of nodes (i.e. how "full" they are), etc.
Your particular case of creating a CD is actually a much simpler problem, and you can compute an exact answer if you want. Since the files won't be modified over time, you can guarantee that they will not be fragmented. That means you can get by with a minimal extents B-tree containing no leaf records, which needs just a single allocation block (for the header node; the other nodes are unused and should be filled with zeroes).
Since you know the complete set of files and directories in advance, you can build an optimal tree by packing as many leaf records into a node as possible, and then moving on to the next node. All it requires is knowing the order in which you will assign directory IDs to directories, and being able to sort the file and directory names for the items in a single directory. That way you can predict the entire leaf sequence. Once you know the number of leaf nodes, you can calculate the number of index nodes that will be parents of the leaf nodes, and so on up the tree until you get to a level containing exactly one node (the root). This should be relatively easy for HFS because the records in index nodes are constant size, so the calculation for each level is just a simple divide and round up (see the sketch below). For HFS Plus, you would have to keep track of the actual file and directory names, since the lengths of the keys in index nodes vary based on the name lengths.
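Here is a rough sketch of that per-level calculation, assuming you already know the number of leaf nodes and how many index records fit in one node; the function and its names are hypothetical, not taken from the HFS sources:

    /* Sketch: count the index nodes above a known number of leaf nodes,
     * assuming a constant number of index records per node (true for HFS,
     * where index records are fixed size).  Each index level has one
     * record per node in the level below, so divide and round up until a
     * single root node remains.  Add the header node (and any map nodes)
     * separately when sizing the B-tree file. */
    static unsigned long totalTreeNodes(unsigned long leafNodes,
                                        unsigned long indexRecordsPerNode)
    {
        unsigned long total = leafNodes;
        unsigned long level = leafNodes;

        while (level > 1) {
            level = (level + indexRecordsPerNode - 1) / indexRecordsPerNode;
            total += level;
        }
        return total;
    }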
If that's too complicated, you could always fall back to assuming a constant size (maximum or average) for all of the records. Don't forget that for thread records, the key is of fixed size but the data is variable (since it contains a variable-length string).
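And a sketch of that fallback estimate, assuming a constant (maximum or average) record size. The 512-byte node size is the HFS B-tree node size, but the per-node overhead figures here are assumptions you should check against your actual node layout:

    /* Sketch: estimate how many leaf nodes are needed for recordCount
     * records of a constant size.  Each record also costs a 2-byte offset
     * at the end of the node, and each node begins with a 14-byte node
     * descriptor.  Treat these numbers as assumptions to verify. */
    #define kNodeSize        512UL   /* HFS B-tree node size */
    #define kNodeDescSize     14UL   /* per-node descriptor overhead */

    static unsigned long estimateLeafNodes(unsigned long recordCount,
                                           unsigned long recordSize)
    {
        unsigned long usable  = kNodeSize - kNodeDescSize;
        unsigned long perNode = usable / (recordSize + 2);  /* +2 for offset */
        return (recordCount + perNode - 1) / perNode;       /* round up */
    }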
-Mark