% Copyright 2003,2005,2007 Alain Knaff.

% This documentation is for Mtools which is a collection of tools to
% allow Unix systems to manipulate MS-DOS files.

% Permission is granted to copy, distribute and/or modify this document
% under the terms of the GNU Free Documentation License, Version 1.3 or
% any later version published by the Free Software Foundation; with no
% Invariant Sections, with no Front-Cover Texts, and with no Back-Cover
% Texts. A copy of the license is included in the section entitled
% ``GNU Free Documentation License''.

\documentclass[a4paper,12pt]{article}

\usepackage[T1]{fontenc}
\usepackage[latin1]{inputenc}
\usepackage{pslatex}
\usepackage[pdfpagemode=None,colorlinks]{hyperref}

\author{Alain Knaff}
\title{How mformat-3.9.10 and above calculates the needed FAT size}

\begin{document}

\maketitle

This small document explains the formula used by {\tt mformat.c} to
figure out the FAT size and the number of clusters.
Due to the way that
filesystem parameters are stored in the boot sector, we can afford to
have a FAT that is larger than it needs to be to accommodate the
clusters present on disk, but under no circumstances can we have one
that is too small.

In this discussion, we use the following variable names:

\begin{tabular}{|r|p{12cm}|}

\hline

$fatNybls$&
Number of nybbles (4-bit units) per FAT entry. This is 3 for FAT12, 4
for FAT16, and 8 for FAT32\\

\hline

$numClus$&
Number of clusters on the disk\\

\hline

$clusSiz$&
Size of a cluster, expressed in sectors\\

\hline

$secSiz$&
Size of a sector, in bytes.
The size of a sector in nybbles is thus $2*secSiz$\\

\hline

$nfats$&
Number of FAT copies, usually two\\

\hline

$remSect$&
``Remaining sectors'', after the boot sector and the root directory
have been accounted for\\

\hline

$fatLen$&
Length of one FAT copy, in sectors\\

\hline


\end{tabular}

\ \\

Taking into account both data and FAT, each cluster takes up the
following number of nybbles (units of 4 bits):


$$clusSiz * (2*secSiz) + nfats * fatNybls$$

This accounts for the data of the cluster ($clusSiz * 2*secSiz$
nybbles), as well as for the space taken up by its FAT descriptor
($nfats * fatNybls$ nybbles).

The space taken up by all clusters together, plus the space taken by
the descriptors for clusters 0 and 1 ($2*nfats*fatNybls$ nybbles),
must be less than what is available.

Additional sectors may be lost due to slack (the FAT has to occupy a
whole number of sectors, and the data area has to be used in whole
clusters).
Thus, an {\em upper bound} for the number of clusters is as
follows:

$$
numClus \le {2*remSect*secSiz - 2*nfats*fatNybls \over
2*clusSiz*secSiz + nfats*fatNybls}
$$


$$
numClus \le {(remSect+2*clusSiz)*2*secSiz \over
2*clusSiz*secSiz + nfats*fatNybls} - 2
$$


These clusters will take up at most the following space in one copy of
the FAT (we have to round up, because even a half-full FAT sector must
be taken in its entirety):

$$
fatLen \le \left\lceil { (numClus+2)*fatNybls \over secSiz * 2 } \right\rceil
$$


$$
fatLen \le \left\lceil {
\left( { 2*(remSect+2*clusSiz)*secSiz \over
2*clusSiz*secSiz + nfats*fatNybls} \right) * fatNybls \over
2*secSiz
} \right\rceil
$$


$$
fatLen \le \left\lceil {
(remSect+2*clusSiz)* fatNybls \over
2*clusSiz*secSiz + nfats*fatNybls
} \right\rceil
$$

The space left after the FAT sectors have been deducted is now {\em
less than or equal to} what would be needed
for the data area of the clusters
(including fractional clusters), which is good, as under no
circumstances may we have {\em more} clusters in the data area than in
the FAT.
An important point in this calculation is that we based the needed FAT
size on a {\em fractional} number of clusters, rather than a rounded
down number of clusters. Indeed, using a rounded down number could
have exposed us to a situation where we had {\em almost} enough
space for one more cluster (i.e. not enough space for data + FAT, but
enough for data alone). This situation, combined with a situation
where the last FAT sector was completely full, could have led to a
situation where $numClus$ would become too large for the FAT to
accommodate. I think this is clearer with an example:
\begin{itemize}
\item $remSect=4127$, $clusSiz=1$, $nfats=2$, $fatNybls=4$ (FAT16)
\item (Non rounded down) $numClus={(4127+2)*(1024) \over 1032} -
2 =4094.992$
\item Rounded down: 4094 clusters
\item These fit into 16 FAT sectors, exactly
\item ... leaving us $4127 - 2*16 = 4095$ clusters, which is one too
many (because these 4095 clusters would now need 17 FAT sectors)
\end{itemize}

Keeping the fractional part (0.992) allows us to round {\em up} the
needed number of FAT sectors to 17, nicely solving this problem.
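The bound derived above can be sketched in Python. This is a
hypothetical illustration of the formula in this document, not the
actual {\tt mformat.c} code; variable names follow the table above:

```python
import math

def fat_len(rem_sect, clus_siz, nfats, fat_nybls, sec_siz=512):
    # fatLen <= ceil((remSect + 2*clusSiz) * fatNybls
    #                / (2*clusSiz*secSiz + nfats*fatNybls))
    # The numerator keeps the *fractional* number of clusters, so the
    # ceiling rounds the FAT up whenever there is "almost enough"
    # room for one more cluster.
    denom = 2 * clus_siz * sec_siz + nfats * fat_nybls
    return math.ceil((rem_sect + 2 * clus_siz) * fat_nybls / denom)

# The FAT16 example above: remSect=4127, clusSiz=1, nfats=2, fatNybls=4.
# Rounding the cluster count down first would have suggested 16 FAT
# sectors; the fractional calculation correctly yields 17.
print(fat_len(4127, 1, 2, 4))  # 17
```

Note that the function computes only the bound; the slack corrections
discussed below (odd $remSect$, FAT12/FAT16 cluster limits) are not
included.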

The downside of counting the fractional part, however, is that we
quite often waste a sector in the really common situation where both
$nfats$ and $clusSiz$ are even, while $remSect$ is odd. An easy way to
address this is to subtract one from $remSect$ before applying the
formula, if this case is detected. This operation carries no risk, as
the odd final sector cannot be used to make a full cluster.

There is still a case, however, where $fatLen$ must be grown manually
after application of the formula: if $numClus$ exceeds the maximum
number of clusters allowable for FAT12 or FAT16, we need to shrink
$numClus$ after application of the formula, and manually make the FAT
larger in order to take up any excess space.

Also note that as we used upper bounds, we may waste a certain number
of sectors, which an exact calculation might not have wasted. However,
normally, we should not lose more than one sector per FAT copy.

N.B. In its document at \url{http://www.microsoft.com/hwdev/download/hardware/fatgen103.pdf},
Microsoft proposes a much simpler formula. However, this formula is
wrong (i.e. occasionally it produces a smaller FAT than is needed for
the clusters on disk), less generic (it only works for a sector size
of 512), and less efficient (in case of FAT32, it may waste up to 8
sectors!)

The formula is the following (for FAT16):
$$
fatLen \le \left\lceil { remSect \over 256 * clusSiz + nfats} \right\rceil
$$

Note that it doesn't account for the dummy clusters 0 and 1. Thus, if
we have 258 sectors remaining, with a cluster size of 1, and two FAT
copies, the Microsoft formula mistakenly assumes $fatLen = 1$. This
leaves $258 - 2 = 256$ sectors for clusters, which yields 256 clusters.
However, those clusters do not fit into the FAT, as two entries are
lost to clusters 0 and 1. However, to Micro\$oft's credit, this is not
actually the formula they're using (tested with $remSect=160056$ and
$clusSiz=4$), so this seems to be a documentation issue rather than a
genuine bug.

In case of FAT32, Microsoft just halves the denominator. This is
wasteful, as only the $256*clusSiz$ part would need to be halved.

If we assume $remSect=16777000$, and a cluster size of 8, our formula
would give us:
$$fatLen = \left\lceil {(16777000 + 16) * 8 \over 2 * 8 * 512 + 16}
\right\rceil = 16352$$
This would leave $16777000-2*16352=16744296$ sectors for the clusters,
i.e. 2093037 clusters. The FAT descriptors for those 2093037 clusters
do indeed fit into our 16352 FAT sectors.
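This FAT32 comparison can be checked numerically. The sketch below
assumes $secSiz=512$, $fatNybls=8$ and $nfats=2$, and contrasts the
bound of this document with Microsoft's documented formula:

```python
import math

rem_sect, clus_siz, nfats, fat_nybls, sec_siz = 16777000, 8, 2, 8, 512

# This document's bound on the FAT length, in sectors:
ours = math.ceil((rem_sect + 2 * clus_siz) * fat_nybls
                 / (2 * clus_siz * sec_siz + nfats * fat_nybls))

# Microsoft's documented FAT32 formula: the FAT16 denominator,
# halved as a whole (instead of halving only the 256*clusSiz part).
ms = math.ceil(rem_sect / ((256 * clus_siz + nfats) / 2))

clusters_ours = (rem_sect - nfats * ours) // clus_siz
clusters_ms = (rem_sect - nfats * ms) // clus_siz

print(ours, ms)                     # 16352 16368
print(clusters_ours - clusters_ms)  # 4 clusters wasted by the MS formula
```

Running this reproduces the figures in the text: 16352 versus 16368
FAT sectors, and 2093037 versus 2093033 clusters.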

Microsoft, on the other hand, would get: $$fatLen = \left\lceil{
16777000 \over (256 * 8 + 2)/2} \right\rceil = 16368$$ This would only
leave $16777000-2*16368=16744264$ sectors, i.e. 2093033 clusters, thus
wasting 4 clusters. The FAT would be 28 sectors too big, i.e. more
than the mere 8 sectors announced by Microsoft! Unfortunately, I
currently don't have access to any sufficiently recent Windows to
check whether this really happens in the code, or whether it is only a
documentation issue as well.

Oh, and before somebody points it out: the formula in this document
occasionally wastes sectors too, although not as many. (Example: with
$remSect=685$, $clusSiz=1$ and $nfats=2$, our formula gives $fatLen=3$,
which leaves 679 clusters. However, we could use $fatLen=2$, leaving
681 clusters.)

\end{document}