<html><head><title>The design of toybox</title></head>
<!--#include file="header.html" -->

<h2>Topics</h2>
<ul>
<li><a href=#goals><h3>Design Goals</h3></a></li>
<li><a href=#portability><h3>Portability Issues</h3></a></li>
<li><a href=#license><h3>License</h3></a></li>
<li><a href=#codestyle><h3>Coding Style</h3></a></li>
</ul>
<hr />

<a name="goals"><b><h2><a href="#goals">Design goals</a></h2></b>

<p>Toybox should be simple, small, fast, and full featured. In that order.</p>

<p>It should be possible to get about <a href=https://en.wikipedia.org/wiki/Pareto_principle>80% of the way</a> to each goal
before they really start to fight.
When these goals need to be balanced off against each other, keeping the code
as simple as it can be to do what it does is the most important (and hardest)
goal. Then keeping it small is slightly more important than making it fast.
Features are the reason we write code in the first place but this has all
been implemented before so if we can't do a better job why bother?</p>

<b><h3>Features</h3></b>

<p>Toybox should provide the command line utilities of a build
environment capable of recompiling itself under itself from source code.
This minimal build system conceptually consists of 4 parts: toybox,
a C library, a compiler, and a kernel. Toybox needs to provide all the
commands (with all the behavior) necessary to run the configure/make/install
of each package and boot the resulting system into a usable state.</p>

<p>In addition, it should be possible to bootstrap up to arbitrary complexity
under the result by compiling and installing additional packages into this
minimal system, as measured by building both Linux From Scratch and the
Android Open Source Project under the result.
Any "circular dependencies"
should be solved by toybox including the missing dependencies itself
(see "Shared Libraries" below).</p>

<p>Toybox may also provide some "convenience" utilities
like top and vi that aren't necessarily used in a build but which turn
the minimal build environment into a minimal development environment
(supporting edit/compile/test cycles in a text console), configure
network infrastructure for communication with other systems (in a build
cluster), and so on.</p>

<p>And these days toybox is the command line of Android, so anything the Android
guys say to do gets at the very least closely listened to.</p>

<p>The hard part is deciding what NOT to include. A project without boundaries
will bloat itself to death. One of the hardest but most important things a
project must do is draw a line and say "no, this is somebody else's problem,
not something we should do."
Some things are simply outside the scope of the project: even though
posix defines commands for compiling and linking, we're not going to include
a compiler or linker (and support for a potentially infinite number of hardware
targets). And until somebody comes up with a ~30k ssh implementation (with
a crypto algorithm that won't need replacing every 5 years), we're
going to point you at dropbear or bearssl.</p>

<p>The <a href=roadmap.html>roadmap</a> has the list of features we're
trying to implement, and the reasons why we decided to include those
features. After the 1.0 release some of that material may get moved here,
but for now it needs its own page. The <a href=status.html>status</a>
page shows the project's progress against the roadmap.</p>

<p>There are potential features (such as a screen/tmux implementation)
that might be worth adding after 1.0, in part because they could share
infrastructure with things like "less" and "vi" so might be less work for
us to do than for an external from scratch implementation.
But for now, major
new features outside posix, android's existing commands, and the needs of
development systems, are a distraction from the 1.0 release.</p>

<b><h3>Speed</h3></b>

<p>Quick smoketest: use the "time" command, and if you haven't got a test
case that's embarrassing enough to motivate digging, move on.</p>

<p>It's easy to say a lot about optimizing for speed (which is why this section
is so long), but at the same time it's the optimization we care the least about.
The essence of speed is being as efficient as possible, which means doing as
little work as possible. A design that's small and simple gets you 90% of the
way there, and most of the rest is either fine-tuning or more trouble than
it's worth (and often actually counterproductive). Still, here's some
advice:</p>

<p>First, understand the darn problem you're trying to solve. You'd think
I wouldn't have to say this, and yet. Trying to find a faster sorting
algorithm is no substitute for figuring out a way to skip the sorting step
entirely.
The fastest way to do anything is not to have to do it at all,
and _all_ optimization boils down to avoiding unnecessary work.</p>

<p>Speed is easy to measure; there are dozens of profiling tools for Linux,
but sticking in calls to "millitime()" out of lib.c and subtracting
(or doing two clock_gettime() calls and then nanodiff() on them) is
quick and easy. Don't waste too much time trying to optimize something you
can't measure, and there's not much point speeding up things you don't spend
much time doing anyway.</p>

<p>Understand the difference between throughput and latency. Faster
processors improve throughput, but don't always do much for latency.
After 30 years of Moore's Law, most of the remaining problems are latency,
not throughput. (There are of course a few exceptions, like data compression
code, encryption, rsync...) Worry about throughput inside long-running
loops, and worry about latency everywhere else.
(And don't worry too much
about avoiding system calls or function calls or anything else in the name
of speed unless you are in the middle of a tight loop that you've already
proven isn't running fast enough.)</p>

<p>The lowest hanging optimization fruit is usually either "don't make
unnecessary copies of data" or "use a reasonable block size in your
I/O transactions instead of byte-at-a-time".
Start by looking for those; most of the rest of this advice is just explaining
why they're bad.</p>

<p>"Locality of reference" is generally nice, in all sorts of contexts.
It's obvious that waiting for disk access is 1000x slower than doing stuff in
RAM (and making the disk seek is 10x slower than sequential reads/writes),
but it's just as true that a loop which stays in L1 cache is many times faster
than a loop that has to wait for a DRAM fetch on each iteration. Don't worry
about whether "&" is faster than "%" until your executable loop stays in L1
cache and the data access is fetching cache lines intelligently.
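</p>

<p>As a rough illustration of the locality point, here is a minimal sketch
(not toybox code; sum_sequential() and sum_strided() are hypothetical names)
of two loops that compute the same total over a row-major matrix. The first
walks memory in order, the second jumps a whole row between accesses. How much
slower the second one runs depends on the matrix size and the hardware, so
measure rather than assume.</p>

```c
#include <stddef.h>

// Sum a rows x cols matrix stored row-major, walking memory sequentially
// so each cache line that gets fetched is fully used before moving on.
long sum_sequential(const int *m, size_t rows, size_t cols)
{
  long total = 0;
  size_t r, c;

  for (r = 0; r < rows; r++)
    for (c = 0; c < cols; c++) total += m[r*cols + c];

  return total;
}

// Same result, but striding by a whole row between accesses, so a large
// matrix touches a different cache line on nearly every iteration.
long sum_strided(const int *m, size_t rows, size_t cols)
{
  long total = 0;
  size_t r, c;

  for (c = 0; c < cols; c++)
    for (r = 0; r < rows; r++) total += m[r*cols + c];

  return total;
}
```

<p>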
(To
understand DRAM, L1, and L2 cache, read Hannibal's marvelous ram guide at Ars
Technica:
<a href=http://arstechnica.com/paedia/r/ram_guide/ram_guide.part1-2.html>part one</a>,
<a href=http://arstechnica.com/paedia/r/ram_guide/ram_guide.part2-1.html>part two</a>,
<a href=http://arstechnica.com/paedia/r/ram_guide/ram_guide.part3-1.html>part three</a>,
plus this
<a href=http://arstechnica.com/articles/paedia/cpu/caching.ars/1>article on
caching</a>, and this one on
<a href=http://arstechnica.com/articles/paedia/cpu/bandwidth-latency.ars>bandwidth
and latency</a>.
And there's <a href=http://arstechnica.com/paedia/index.html>more where that came from</a>.)
Code running out of L1 cache can execute one instruction per clock cycle, going
to L2 cache costs a dozen or so clock cycles, and waiting for a worst case DRAM
fetch (round trip latency with a bank switch) can cost thousands of
clock cycles. (Historically, this disparity has gotten worse with time,
just like the speed hit for swapping to disk. These days, a _big_ L1 cache
is 128k and a big L2 cache is a couple of megabytes.
A cheap low-power
embedded processor may have 8k of L1 cache and no L2.)</p>

<p>Learn how <a href=http://nommu.org/memory-faq.txt>virtual memory and
memory management units work</a>. Don't touch
memory you don't have to. Even just reading memory evicts stuff from L1 and L2
cache, which may have to be read back in later. Writing memory can force the
operating system to break copy-on-write, which allocates more memory. (The
memory returned by malloc() is only a virtual allocation, filled with lots of
copy-on-write mappings of the zero page. Actual physical pages get allocated
when the copy-on-write gets broken by writing to the virtual page. This
is why checking the return value of malloc() isn't very useful anymore: it
only detects running out of virtual memory, not physical memory.
Unless
you're using a <a href=http://nommu.org>NOMMU system</a>, where all bets
are off.)</p>

<p>Don't think that just because you don't have a swap file the system can't
start swap thrashing: any file backed page (ala mmap) can be evicted, and
there's a reason all running programs require an executable file (they're
mmaped, and can be flushed back to disk when memory is short). And long
before that, disk cache gets reclaimed and has to be read back in. When the
operating system really can't free up any more pages it triggers the out of
memory killer to free up pages by killing processes (the alternative is the
entire OS freezing solid). Modern operating systems seldom run out of
memory gracefully.</p>

<p>It's usually better to be simple than clever. Many people think that mmap()
is faster than read() because it avoids a copy, but twiddling with the memory
management is itself slow, and can cause unnecessary CPU cache flushes. And
if a read faults in dozens of pages sequentially, but your mmap iterates
backwards through a file (causing lots of seeks, each of which your program
blocks waiting for), the read can be many times faster.
On the other hand, the
mmap can sometimes use less memory, since the memory provided by mmap
comes from the page cache (allocated anyway), and it can be faster if you're
doing a lot of different updates to the same area. The moral? Measure, then
try to speed things up, and measure again to confirm it actually _did_ speed
things up rather than made them worse. (And understanding what's really going
on underneath is a big help to making it happen faster.)</p>

<p>Another reason to be simple rather than clever is that optimization
strategies change with time. For example, decades ago precalculating a table
of results (for things like isdigit() or cosine(int degrees)) was clearly
faster because processors were so slow. Then processors got faster and grew
math coprocessors, and calculating the value each time became faster than
the table lookup (because the calculation fit in L1 cache but the lookup
had to go out to DRAM). Then cache sizes got bigger (the Pentium M has
2 megabytes of L2 cache) and the table fit in cache, so the table became
fast again... Predicting how changes in hardware will affect your algorithm
is difficult, and using ten year old optimization advice can produce
laughably bad results.
Being simple and efficient should give at least a
reasonable starting point.</p>

<p>Even at the design level, a lot of simple algorithms scale terribly but
perform fine with small data sets. When small datasets are the common case,
"better" versions that trade higher throughput for worse latency can
consistently perform worse.
So if you think you're only ever going to feed the algorithm small data sets,
maybe just do the simple thing and wait for somebody to complain. For example,
you probably don't need to sort and binary search the contents of
/etc/passwd, because even 50k users is still a reasonably manageable data
set for a readline/strcmp loop, and that's the userbase of a fairly major
<a href=https://en.wikipedia.org/wiki/List_of_United_States_public_university_campuses_by_enrollment>university</a>.
Instead, commands like "ls" call bufgetpwuid() out of lib/lib.c
which keeps a linked list of recently seen items, avoiding reparsing entirely
and trusting locality of reference to bring up the same dozen or so entries
for "ls -l /dev" or similar.
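</p>

<p>A minimal sketch of that kind of linked list cache, assuming a hypothetical
slow_lookup() standing in for the expensive parse. This is not the
bufgetpwuid() implementation from lib/lib.c: cached_name(), struct name_cache,
and the slow_lookups counter are illustrative names only.</p>

```c
#include <stdio.h>
#include <stdlib.h>

// Counts how often the expensive path runs (for demonstration only).
static int slow_lookups;

// Hypothetical stand-in for an expensive lookup such as rescanning a file.
static void slow_lookup(unsigned id, char *name, size_t len)
{
  slow_lookups++;
  snprintf(name, len, "user%u", id);
}

struct name_cache {
  struct name_cache *next;
  unsigned id;
  char name[32];
};

static struct name_cache *cache_head;

// Return the name for id, checking the list of already seen entries before
// falling back to the slow lookup. Returns NULL if allocation fails.
const char *cached_name(unsigned id)
{
  struct name_cache *nc;

  for (nc = cache_head; nc; nc = nc->next) if (nc->id == id) return nc->name;

  nc = malloc(sizeof(*nc));
  if (!nc) return 0;
  nc->id = id;
  slow_lookup(id, nc->name, sizeof(nc->name));

  // New entries go on the front of the list.
  nc->next = cache_head;
  cache_head = nc;

  return nc->name;
}
```

<p>Calling cached_name() twice with the same id runs slow_lookup() only once.
Nothing is ever evicted, so the list only grows.
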
The pathological failure mode of "simple
linked list" is to perform exactly as badly as constantly rescanning a
huge /etc/passwd, so this simple optimization shouldn't ever make performance
worse (modulo possible memory exhaustion and thus swap thrashing).
On the other hand, toybox's multiplexer does sort and binary
search its command list to minimize the latency of each command startup,
because the sort is a compile-time cost done once per build,
and the whole of command startup
is a "hot path" that should do as little work as possible because EVERY
command has to go through it every time before performing any other function
so tiny gains are worthwhile. (These decisions aren't perfect; the point is
to show that thought went into them.)</p>

<p>The famous quote from Ken Thompson, "When in doubt, use brute force",
applies to toybox. Do the simple thing first, do as little of it as possible,
and make sure it's right. You can always speed it up later.</p>

<b><h3>Size</h3></b>
<p>Quick smoketest: build toybox with and without the command (or the change),
and maybe run "nm --size-sort" on files in generated/unstripped.
(See make bloatcheck below for toybox's built in nm size diff-er.)</p>

<p>Again, being simple gives you most of this. An algorithm that does less work
is generally smaller. Understand the problem, treat size as a cost, and
get a good bang for the byte.</p>

<p>What "size" means depends on context: there are at least a half dozen
different metrics in two broad categories: space used on disk/flash/ROM,
and space used in memory at runtime.</p>

<p>Your executable file has at least
four main segments (text = executable code, rodata = read only data,
data = writeable variables initialized to a value other than zero,
bss = writeable data initialized to zero). Text and rodata are shared
between multiple instances of the program running
simultaneously, the others (data, bss, stack, and heap) aren't. Only text,
rodata, and data take up space in the binary; bss, stack and heap only matter
at runtime. You can
view toybox's symbols with "nm generated/unstripped/toybox", the T/R/D/B
lets you know the segment the symbol lives in.
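</p>

<p>A small sketch of where different declarations usually end up, assuming a
typical ELF toolchain. The names (banner, retries, counter, hits, bump) are
made up for illustration, and the final placement is up to the compiler and
linker, so check the result with nm rather than trusting the comments.</p>

```c
// Typical placement only: optimizers and linkers are free to move or
// discard any of these, so treat each comment as an assumption to verify.

const char banner[] = "hello";   // rodata: read only, shareable
int retries = 3;                 // data: writeable, nonzero initial value
int counter;                     // bss: writeable, starts out zero
static int hits;                 // bss as well, but local to this file

// text: the machine code for the function itself
int bump(void)
{
  hits++;
  counter += retries;

  return counter + banner[0];
}
```

<p>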
(Lowercase means it's
local/static.)</p>

<p>Then at runtime there's
heap size (where malloc() memory lives) and stack size (where local
variables and function call arguments and return addresses live). And
on 32 bit systems mmap() can have a constrained amount of virtual memory
(usually a couple gigabytes: the limits on 64 bit systems are generally big
enough it doesn't come up).</p>

<p>Optimizing for binary size is generally good: less code is less to go
wrong, and executing fewer instructions makes your program run faster (and
fits more of it in cache). On embedded systems, binary size is especially
precious because flash is expensive and code may need binary auditing for
security. Small stack size
is important for nommu systems because they have to preallocate their stack
and can't make it bigger via page fault. And everybody likes a small heap.</p>

<p>Measure the right things. Especially with modern optimizers, expecting
something to be smaller is no guarantee it will be after the compiler's done
with it.
While total binary size is the final result, it isn't always the most
accurate indicator of the impact of a given change, because lots of things
get combined and rounded during compilation and linking (and things like
ASAN disable optimization). Toybox has scripts/bloatcheck to compare two versions
of a program and show size changes in each symbol (using "nm --size-sort").
You can "make baseline" to build a baseline version to compare against,
and then apply your changes and "make bloatcheck" to compare against
the saved baseline version.</p>

<p>Avoid special cases. Whenever you see similar chunks of code in more than
one place, it might be possible to combine them and have the users call shared
code (perhaps out of lib/*.c). This is the most commonly cited trick, which
doesn't make it easy to work out HOW to share.
If seeing two lines of code do
the same thing makes you slightly uncomfortable, you've got the right mindset,
but "reuse" requires the "re" to have benefit, and infrastructure in search
of a user will generally bit-rot before it finds one.</p>

<p>There are a lot of potential microoptimizations (on some architectures
using char instead of int as a loop index is noticeably slower, on some
architectures C bitfields are surprisingly inefficient, & is often faster
than % in a tight loop, conditional assignment avoids branch prediction
failures...) but they're generally not worth doing unless you're trying to
speed up the middle of a tight inner loop chewing through a large amount
of data (such as a compression algorithm). For data pumps, sane blocking
and fewer system calls (buffer some input/output and do a big read/write
instead of a bunch of small ones) is usually the big win. But
be careful about caching stuff: the two persistently hard problems in computer
science are naming things, cache coherency, and off by one errors.</p>

<b><h3>Simplicity</h3></b>

<p>Complexity is a cost, just like code size or runtime speed.
Treat it as
a cost, and spend your complexity budget wisely. (Sometimes this means you
can't afford a feature because it complicates the code too much to be
worth it.)</p>

<p>Simplicity has lots of benefits. Simple code is easy to maintain, easy to
port to new processors, easy to audit for security holes, and easy to
understand.</p>

<p>Simplicity itself can have subtle non-obvious aspects requiring a tradeoff
between one kind of simplicity and another: simple for the computer to
execute and simple for a human reader to understand aren't always the
same thing. A compact and clever algorithm that does very little work may
not be as easy to explain or understand as a larger more explicit version
requiring more code, memory, and CPU time. When balancing these, err on the
side of doing less work, but add comments describing how you
could be more explicit.</p>

<p>In general, comments are not a substitute for good code (or well chosen
variable or function names). Commenting "x += y;" with "/* add y to x */"
can actually detract from the program's readability.
If you need to describe
what the code is doing (rather than _why_ it's doing it), that means the
code itself isn't very clear.</p>

<p>Environmental dependencies are another type of complexity, so needing other
packages to build or run is a big downside. For example, we don't use curses
when we can simply output ansi escape sequences and trust all terminal
programs written in the past 30 years to be able to support them. Regularly
testing that we work with C libraries which support static linking (musl does,
glibc doesn't) is another way to be self-contained with known boundaries:
it doesn't have to be the only way to build the project, but should be regularly
tested and supported.</p>

<p>Prioritizing simplicity tends to serve our other goals: simplifying code
generally reduces its size (both in terms of binary size and runtime memory
usage), and avoiding unnecessary work makes code run faster.
Smaller code
also tends to run faster on modern hardware due to CPU caching: fitting your
code into L1 cache is great, and staying in L2 cache is still pretty good.</p>

<p>But a simple implementation is not always the smallest or fastest, and
balancing simplicity vs the other goals can be difficult. For example, the
atolx_range() function in lib/lib.c always uses the 64 bit "long long" type,
which produces larger and slower code on 32 bit platforms and is
often assigned into smaller integer types. Although libc has parallel
implementations for different data sizes (atoi, atol, atoll) we chose a
common codepath which can cover all cases (every user goes through the
same codepath, which gets the maximum amount of testing and avoids
surprising variations in behavior).</p>

<p>On the other hand, the "tail" command has two codepaths, one for seekable
files and one for nonseekable files.
Although the nonseekable case can handle
all inputs (and is required when input comes from a pipe or similar, so cannot
be removed), reading through multiple gigabytes of data to reach the end of
seekable files was both a common case and hugely penalized by a nonseekable
approach (half-minute wait vs instant results). This is one example
where performance did outweigh simplicity of implementation.</p>

<p><a href=http://www.joelonsoftware.com/articles/fog0000000069.html>Joel
Spolsky argues against throwing code out and starting over</a>, and he has
good points: an existing debugged codebase contains a huge amount of baked
in knowledge about strange real-world use cases that the designers didn't
know about until users hit the bugs, and most of this knowledge is never
explicitly stated anywhere except in the source code.</p>

<p>That said, the Mythical Man-Month's "build one to throw away" advice points
out that until you've solved the problem you don't properly understand it, and
about the time you finish your first version is when you've finally figured
out what you _should_ have done.
(The corollary is that if you build one
expecting to throw it away, you'll actually wind up throwing away two. You
don't understand the problem until you _have_ solved it.)</p>

<p>Joel is talking about what closed source software can afford to do: Code
that works and has been paid for is a corporate asset not lightly abandoned.
Open source software can afford to re-implement code that works, over and
over from scratch, for incremental gains. Before toybox, the unix command line
had already been reimplemented from scratch several times (the
original AT&T Unix command line in assembly and then in C, the BSD
versions, Coherent was the first full from-scratch Unix clone in 1980,
Minix was another clone which Linux was inspired by and developed under,
the GNU tools were yet another rewrite intended for use in the stillborn
"Hurd" project, BusyBox was still another rewrite, and more versions
were written in Plan 9, uclinux, klibc, sash, sbase, s6, and of course
android toolbox...). But maybe toybox can do a better job. :)</p>

<p>As Antoine de St.
Exupery (author of "The Little Prince" and an early
aircraft designer) said, "Perfection is achieved, not when there
is nothing left to add, but when there is nothing left to take away."
And Ken Thompson (creator of Unix) said "One of my most productive
days was throwing away 1000 lines of code." It's always possible to
come up with a better way to do it.</p>

<p>P.S. How could I resist linking to an article about
<a href=http://blog.outer-court.com/archive/2005-08-24-n14.html>why
programmers should strive to be lazy and dumb</a>?</p>

<hr>
<a name="portability"><b><h2><a href="#portability">Portability issues</a></h2></b>

<b><h3>Platforms</h3></b>
<p>Toybox should run on Android (all commands with musl-libc, as large a subset
as practical with bionic), and every other hardware platform Linux runs on.
Other posix/susv4 environments (perhaps MacOS X or newlib+libgloss) are vaguely
interesting but only if they're easy to support; I'm not going to spend much
effort on them.</p>

<p>I don't do windows.</p>

<a name="standards" />
<b><h3>Standards</h3></b>

<p>Toybox is implemented with reference to
<a href=https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1548.pdf>c11</a>,
<a href=roadmap.html#susv4>Posix 2008</a>,
<a href=#bits>LP64</a>,
<a href=roadmap.html#sigh>LSB 4.1</a>,
the <a href=https://www.kernel.org/doc/man-pages/>Linux man pages</a>,
various <a href=https://www.rfc-editor.org/rfc-index.html>IETF RFCs</a>,
the linux kernel source's
<a href=https://www.kernel.org/doc/Documentation/>Documentation</a> directory,
utf8 and unicode, and our terminal control outputs ANSI
<a href=https://man7.org/linux/man-pages/man4/console_codes.4.html>escape sequences</a>.
Toybox gets <a href=faq.html#cross>tested</a> with gcc and llvm on glibc,
musl-libc, and bionic, plus occasional <a href=https://github.com/landley/toybox/blob/master/kconfig/freebsd_miniconfig>FreeBSD</a> and
<a href=https://github.com/landley/toybox/blob/master/kconfig/macos_miniconfig>MacOS</a> builds for subsets
of the commands.</p>

<p>For the build environment and runtime environment, toybox depends on
posix-2008 libc features such as the openat() family of
functions. We also root around in the linux /proc directory a lot (no other
way to implement "ps" at the moment), and assume certain "modern" linux kernel
behavior (for example <a href=https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b6a2fea39318>linux 2.6.22</a>
expanded the 128k process environment size limit to 2 gigabytes, then it was
trimmed back down to 10 megabytes, and when I asked for a way to query the
actual value from the kernel if it was going to keep changing
like that <a href=https://lkml.org/lkml/2017/11/5/204>Linus declined</a>).
We make an effort to support <a href=faq.html#support_horizon>older kernels</a>
and other implementations (primarily MacOS and BSD) but we don't always
police their corner cases very closely.</p>

<p><b>Why not just use the newest version of each standard?</b>

<p>Partly to <a href=faq.html#support_horizon>support older systems</a>:
you can't fix a bug in the old system if you can't build in the old
environment.</p>

<p>Partly because toybox's maintainer has his own corollary to Moore's law:
50% of what you know about programming is obsolete every 18
months, but the advantage of C &amp; Unix is that it's usually the same 50% cycling
out over and over.</p>

<p>But mostly because the updates haven't added anything we care about.
Posix-2008 switched some things to larger (64 bit) data types and added the
openat() family of functions (which take a directory filehandle instead of
using the Current Working Directory),
but the 2013 and 2018 releases of posix were basically typo fixes: still
release 7, still SUSv4.
(An eventual release 8 might be interesting but
it's not out yet.)</p>

<p>We're nominally C11 but mostly just writing good old ANSI C (I.E. C89).
We use a few of the new features like compound literals (6.5.2.5) and structure
initialization by member name with unnamed members zeroed (6.7.9),
but mostly we "officially" went from c99 to C11 to work around a
<a href=https://github.com/landley/toybox/commit/3625a260065b>clang compiler bug</a>.
The main thing we use from c99 that c89 hadn't had was // single line comments.
(We mostly don't even use C99's explicit width data types, ala uint32_t and
friends, because LP64 handles that for us.)</p>

<p>We're ignoring new versions of the Linux Foundation's standards (LSB, FHS)
entirely, for the same reason Debian is: they're not good at maintaining
standards.
(The Linux Foundation acquiring the Free Standards Group worked
out about as well as Microsoft buying Nokia, Twitter buying Vine, Yahoo
buying Flickr...)</p>

<p>We refer to current versions of man7.org because it's
not easily versioned (the website updates regularly) and because
Michael Kerrisk does a good job maintaining it so far. That said, we
try to "provide new" in our commands but "depend on old" in our build scripts.
(For example, we didn't start using "wait -n" until it had been in bash for 7
years, and even then people depending on CentOS' 10 year support horizon
complained.)</p>

<p>Using newer vs older RFCs, and upgrading between versions, is a per-case
judgement call.</p>

<p><b>How strictly do you adhere to these standards?</b>

<p>...ish?
The man pages have a lot of stuff that's not in posix,
and there's no "init" or "mount" in posix, you can't implement "ps"
without relying on non-posix APIs....</p>

<p>When the options a command offers visibly contradict posix, we try to have
a "deviations from posix" section at the top of the source listing the
differences, but that's about what we provide not what we use from the OS
or build environment.</p>

<p>The build needs bash (not a pure-posix sh), and building on MacOS requires
"gsed" (because Mac's sed is terrible), but toybox is explicitly self-hosting
and any failure to build under the tool versions we provide would be a bug
needing to be fixed.</p>

<p>Within the code, everything in main.c and lib/*.c has to build
on every supported Linux version, compiler, and library, plus BSD and MacOS.
We mostly try to keep #if/else staircases for portability issues to
lib/portability.[ch].</p>

<p>Portability of individual commands varies: we sometimes program directly
against linux kernel APIs (unavoidable when accessing /proc and /sys),
individual commands are allowed to #include &lt;linux/*.h&gt; (common
headers and library files are not, except maybe lib/portability.* within an
appropriate #ifdef), we only really test against Linux errno values
(unless somebody on BSD submits a bug), and a few commands outright cheat
(the way ifconfig checks for ioctl numbers in the 0x89XX range). This is
the main reason some commands build on BSD/MacOS and some don't.</p>

<a name="bits" />
<b><h3>32/64 bit</h3></b>
<p>Toybox should work on both 32 bit and 64 bit systems.
64 bit desktop
hardware went mainstream <a href=https://web.archive.org/web/20040307000108mp_/http://developer.intel.com/technology/64bitextensions/faq.htm>in 2005</a>
and was essentially ubiquitous <a href=faq.html#support_horizon>by 2012</a>,
but 32 bit hardware will continue to be important in embedded devices for years to come.</p>

<p>Toybox relies on the
<a href=http://archive.opengroup.org/public/tech/aspen/lp64_wp.htm>LP64 standard</a>
which Linux, MacOS X, and BSD all implement, and which modern 64 bit processors such as
x86-64 were <a href=http://www.pagetable.com/?p=6>explicitly designed to
support</a>. (Here's the original <a href=https://web.archive.org/web/20020905181545/http://www.unix.org/whitepapers/64bit.html>LP64 white paper</a>.)</p>

<p>LP64 defines explicit sizes for all the basic C integer types, and
guarantees that on any Unix-like platform "long" and "pointer" types
are always the same size (the processor's register size).
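</p>

<p>As a rough illustration of that guarantee, here is a minimal sketch that
checks the type sizes and round-trips a pointer through a long. It assumes an
LP64 (or 32 bit) Linux-style target; the helper names lp64_sizes_ok() and
roundtrip() are made up for this example and are not part of toybox.</p>

```c
#include <assert.h>
#include <limits.h>

// Hypothetical helper (not a toybox function): returns 1 when the basic
// integer types have the sizes assumed on 32 bit and 64 bit Linux targets.
static int lp64_sizes_ok(void)
{
  return CHAR_BIT == 8 && sizeof(short) == 2 && sizeof(int) == 4
    && sizeof(long) == sizeof(void *) && sizeof(long long) == 8;
}

// Hypothetical helper: store a pointer in a long and read it back.
// Lossless only because long and pointer are the same size here.
static void *roundtrip(void *ptr)
{
  long value = (long)ptr;

  return (void *)value;
}
```

<p>Calling roundtrip() on the address of a local variable should hand back the
same address on any target where lp64_sizes_ok() returns 1.</p>

<p>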
This means it's safe to assign pointers into
longs and vice versa without losing data: on 32 bit systems both are 32 bit,
on 64 bit systems both are 64 bit.</p>

<table border=1 cellpadding=10 cellspacing=2>
<tr><td>C type</td><td>char</td><td>short</td><td>int</td><td>long</td><td>long long</td></tr>
<tr><td>32 bit<br />sizeof</td><td>8 bits</td><td>16 bits</td><td>32 bits</td><td>32 bits</td><td>64 bits</td></tr>
<tr><td>64 bit<br />sizeof</td><td>8 bits</td><td>16 bits</td><td>32 bits</td><td>64 bits</td><td>64 bits</td></tr>
</table>

<p>LP64 eliminates the need to use c99 "uint32_t" and friends: the basic
C types all have known size/behavior, and the only type whose
size varies is "long", which is the natural register size of the processor.</p>

<p>Note that Windows doesn't work like this, and I don't care, but if you're
curious here are <a href=https://devblogs.microsoft.com/oldnewthing/20050131-00/?p=36563>the insane legacy reasons why this is broken on Windows</a>.</p>

<p>The main squishy bit in LP64 is that "long long" was defined as
"at least" 64 bits instead of "exactly" 64 bits, and the standards body
that issued it collapsed in the wake of the <a href=https://en.wikipedia.org/wiki/Unix_wars>proprietary unix wars</a> (all
those lawsuits between AT&T/BSDI/Novell/Caldera/SCO), so it is
not available to issue an official correction. Then again a processor
with 128-bit general purpose registers wouldn't be commercially viable
<a href=https://landley.net/notes-2011.html#26-06-2011>until 2053</a>
(because 2005+32*1.5), and with the S-curve of Moore's Law slowly
<a href=http://www.acm.org/articles/people-of-acm/2016/david-patterson>bending back down</a> as
atomic limits and <a href=http://www.cnet.com/news/end-of-moores-law-its-not-just-about-physics/>exponential cost increases</a> produce increasing
drag.... (The original Moore's Law curve would mean that in the year 2022
a high end workstation would have around 8 terabytes of RAM, available retail.
Most don't even come with
that much disk space.) At worst we don't need to care for decades, the
S-curve bending down means probably not in our lifetimes, and
atomic limits may mean "never". So I'm ok treating "long long" as exactly 64 bits.</p>

<b><h3>Signedness of char</h3></b>
<p>On platforms like x86, variables of type char default to signed. On
platforms like arm, char defaults to unsigned. This difference can lead to
subtle portability bugs, and to avoid them we specify which one we want by
feeding the compiler -funsigned-char.</p>

<p>The reason to pick "unsigned" is that this way char strings are 8-bit clean by
default, which makes UTF-8 support easier.</p>

<p><h3>Error messages and internationalization</h3></p>

<p>Error messages are extremely terse not just to save bytes, but because we
don't use any sort of _("string") translation infrastructure. (We're not
translating the command names themselves, so we must expect a minimum amount of
english knowledge from our users, but let's keep it to a minimum.)</p>

<p>Thus "bad -A '%c'" is
preferable to "Unrecognized address base '%c'", because a non-english speaker
can see that -A was the problem (giving back the command line argument they
supplied). A user with a ~20 word english vocabulary is
more likely to know (or guess) "bad" than the longer message, and you can
use "bad" in place of "invalid", "inappropriate", "unrecognized"...
Similarly when atolx_range() complains about range constraints with
"4 &lt; 17" or "12 &gt; 5", it's intentional: those don't need to be translated.</p>

<p>The strerror() messages produced by perror_exit() and friends should be
localized by libc, and our error functions also prepend the command name
(which non-english speakers can presumably recognize already). Keep the
explanation in between to a minimum, and where possible feed back the values
they passed in to identify _what_ we couldn't process.
If you say perror_exit("setsockopt"), you've identified the action you
were trying to take, and the perror gives a translated error message (from libc)
explaining _why_ it couldn't do it, so you probably don't need to add english
words like "failed" or "couldn't assign".</p>

<p>All commands should be 8-bit clean, with explicit
<a href=http://yarchive.net/comp/linux/utf8.html>UTF-8</a> support where
necessary. Assume all input data might be utf8, and at least preserve
it and pass it through.
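</p>

<p>A small sketch of what handling high-bit bytes safely can look like. The
function utf8_count() is a hypothetical helper written for this page, not
toybox code. It counts code points by skipping UTF-8 continuation bytes
(10xxxxxx), and masks each byte so the result is the same whether plain
"char" is signed or unsigned. A comparison such as "*s &gt;= 0x80" would
not have that property with a signed char.</p>

```c
#include <assert.h>

// Hypothetical helper (not a toybox function): count the UTF-8 code points
// in a NUL terminated string. Every byte that is not a continuation byte
// (bit pattern 10xxxxxx) starts a new code point.
static int utf8_count(const char *s)
{
  int count = 0;

  // Masking with 0xc0 gives the same answer for signed and unsigned char.
  for (; *s; s++) if ((*s & 0xc0) != 0x80) count++;

  return count;
}
```

<p>For example, the six bytes of "h\xc3\xa9llo" make up five code points,
because 0xa9 is a continuation byte.</p>

<p>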
(For this reason, our build is -funsigned-char on
all architectures; "char" is unsigned unless you stick "signed" in front
of it.)</p>

<p>Locale support isn't currently a goal; that's a presentation layer issue
(I.E. a GUI problem).</p>

<p>Someday we should probably have translated --help text, but that's a
post-1.0 issue.</p>

<p><h3>Help text</h3></p>

<p>Each command's help text tries to briefly answer the questions "what does
this command do" and "how do I use it". There's a usage: line, basic
description, list of command line options (mostly in alphabetical order),
and sometimes additional explanation at the end.
Default values and --longopts
are usually in parentheses on the end of an option's explanation line.</p>

<p>Toybox silently accepts a lot of compatibility flags like <b>patch -u</b>
that aren't in the help text to work with existing scripts, but may not
mention options that don't help write new scripts (mostly synonyms and NOPs).</p>

<p><h3>Shared Libraries</h3></p>

<p>Toybox's policy on shared libraries is that they should never be
required, but can optionally be used to improve performance.</p>

<p>Toybox should provide the command line utilities for
<a href=roadmap.html#dev_env>self-hosting development environments</a>,
and an easy way to set up "hermetic builds" (I.E. builds which provide
their own dependencies, isolating the build logic from host command version
skew with a simple known build environment). In both cases, external
dependencies defeat the purpose.</p>

<p>This means toybox should provide full functionality without relying
on any external dependencies (other than libc).
But toybox may optionally use
libraries such as zlib and openssl to improve performance for things like
deflate and sha1sum, which lets the corresponding built-in implementations
be simple (and thus slow). But the built-in implementations need to exist and
work.</p>

<p>(This is why we use an external https wrapper program, because depending on
openssl or similar to be linked in would change the behavior of toybox.)</p>

<hr /><a name="license" /><h2>License</h2>

<p>Toybox is licensed <a href=license.html>0BSD</a>, which is a public domain
equivalent license approved by <a href=https://spdx.org/licenses/0BSD.html>SPDX</a>. This works like other BSD licenses except that it doesn't
require copying specific license text into the resulting project when
you copy code. (We care about attribution, not ownership, and the internet's
really good at pointing out plagiarism.)</p>

<p>This means toybox usually can't use external code contributions, and must
implement new versions of everything unless the external code's original
author (and any additional contributors) grants permission to relicense.
Just as a GPLv2 project can't incorporate GPLv3 code and a BSD-licensed
project can't incorporate either kind of GPL code, we can't incorporate
most BSD or Apache licensed code without changing our license terms.</p>

<p>The exception to this is code under an existing public domain equivalent
license, such as the xz decompressor or
<a href=https://github.com/mkj/dropbear/blob/master/libtommath/LICENSE>libtommath</a> and <a href=https://github.com/mkj/dropbear/blob/master/libtomcrypt/LICENSE>libtomcrypt</a>.</p>

<hr /><a name="codestyle" /><h2>Coding style</h2>

<p>The real coding style holy wars are over things that don't matter
(whitespace, indentation, curly bracket placement...) and thus have no
obviously correct answer. As in academia, "the fighting is so vicious because
the stakes are so small". That said, being consistent makes the code readable,
so here's how to make toybox code look like other toybox code.</p>

<p>Toybox source uses two spaces per indentation level, and wraps at 80
columns.
(Indentation of continuation lines is awkward no matter what
you do: sometimes two spaces looks better, sometimes indenting to the
contents of a parenthesis looks better.)</p>

<p>I'm aware this indentation style creeps some people out, so here's
the sed invocation to convert groups of two leading spaces to tabs:</p>
<blockquote><pre>
sed -i ':loop;s/^\( *\)  /\1\t/;t loop' filename
</pre></blockquote>

<p>And here's the sed invocation to convert leading tabs to two spaces each:</p>
<blockquote><pre>
sed -i ':loop;s/^\( *\)\t/\1  /;t loop' filename
</pre></blockquote>

<p>There's a space after C flow control statements that look like functions, so
"if (blah)" instead of "if(blah)". (Note that sizeof is actually an
operator, so we don't give it a space for the same reason ++ doesn't get
one. Yeah, it doesn't need the parentheses either, but it gets them.
These rules are mostly to make the code look consistent, and thus easier
to read.)
We also put a space around assignment operators (on both sides),
so "int x = 0;".</p>

<p>Blank lines (vertical whitespace) go between thoughts. "We were doing that,
now we're doing this." (Not a hard and fast rule about _where_ it goes,
but there should be some for the same reason writing has paragraph breaks.)</p>

<p>Variable declarations go at the start of blocks, with a blank line between
them and other code. Yes, c99 allowed you to put them anywhere, but they're
harder to find if you do that. If there's a large enough distance between
the declaration and the code using it to make you uncomfortable, maybe the
function's too big, or is there an if statement or something you can
use as an excuse to start a new closer block? Use a longer variable name
that's easier to search for perhaps?</p>

<p>An * binds to a variable name not a type name, so space it that way.
(In C "char *a, b;" and "char* a, b;" mean the same thing: "a" is a pointer
but "b" is not. Spacing it the second way is not how C works.)</p>

<p>We wrap lines at 80 columns.
Part of the reason for this is that I (toybox's
founder Rob) have mediocre eyesight (so tend to increase the font size in
terminal windows and web browsers), and program in a lot of coffee shops
on laptops with a smallish screen. I'm aware this <a href=http://lkml.iu.edu/hypermail/linux/kernel/2005.3/08168.html>exasperates Linus Torvalds</a>
(with his 8-character tab indents where just being in a function eats 8 chars
and 4 more indent levels eats half of an 80 column terminal), but you've
gotta break somewhere and even Linus admits there isn't another obvious
place to do so. (80 columns came from punched cards, which came
from civil war era dollar bill sorting boxes IBM founder Herman Hollerith
bought secondhand when bidding to run the 1890 census. "Totally arbitrary"
plus "100 years old" = standard.)</p>

<p>If statements with a single line body go on the same line when the result
fits in 80 columns, on a second line when it doesn't. We usually only use
curly brackets if we need to, either because the body is multiple lines or
because we need to distinguish which if an else binds to. Curly brackets go
on the same line as the test/loop statement.
The exception to both cases is
when the test part of an if statement is long enough to split into multiple
lines: then we put the curly bracket on its own line afterwards (so it doesn't
get lost in the multiple line variably indented mess), and we put it there
even if it's only grouping one line (because the indentation level is not
providing clear information in that case).</p>

<p>I.E.</p>

<blockquote>
<pre>
if (thingy) thingy;
else thingy;

if (thingy) {
  thingy;
  thingy;
} else thingy;

if (blah blah blah...
    && blah blah blah)
{
  thingy;
}
</pre></blockquote>

<p>Gotos are allowed for error handling, and for breaking out of
nested loops.
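</p>

<p>A minimal sketch of the nested loop case (find_zero() is a made-up
function for illustration, not toybox code): the goto jumps forward out of
both loops to a label at the left edge.</p>

<blockquote><pre>
// Hypothetical example: find the first zero in a rows x cols grid.
// Returns its offset from the start of the grid, or -1 if there is none.
// Assumes rows and cols are not negative.
int find_zero(int *grid, int rows, int cols)
{
  int y, x;

  for (y = 0; y != rows; y++)
    for (x = 0; x != cols; x++)
      if (!grid[y*cols+x]) goto found;

  return -1;

found:
  return y*cols+x;
}
</pre></blockquote>

<p>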
In general, a goto should only jump forward (not back), and
should either jump to the end of an outer loop, or to error handling code
at the end of the function. Goto labels are never indented: they override the
block structure of the file. Putting them at the left edge makes them easy
to spot as overrides to the normal flow of control, which they are.</p>

<p>When there's a shorter way to say something, we tend to do that for
consistency. For example, we tend to say "*blah" instead of "blah[0]" unless
we're referring to more than one element of blah. Similarly, NULL is
really just 0 (and C will automatically typecast 0 to anything, except in
varargs), "if (function() != NULL)" is the same as "if (function())",
"x = (blah == NULL);" is "x = !blah;", and so on.</p>

<p>The goal is to be
concise, not cryptic: if you're worried about the code being hard to
understand, splitting it to multiple steps on multiple lines is
better than a NOP operation like "!= NULL". A common sign of trying too
hard is nesting ? : three levels deep; sometimes if/else and a temporary
variable is just plain easier to read.
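</p>

<p>As a rough sketch of the points above (the function names are made up for
illustration and are not part of toybox), each pair below behaves the same,
written the long way and then the shorter way:</p>

<blockquote><pre>
// Verbose: explicit [0] and spelled-out comparisons against 0.
int is_blank_verbose(char *str)
{
  if (str == 0) return 1;
  if (str[0] == 0) return 1;

  return 0;
}

// Concise: !str tests for a null pointer, *str is str[0].
int is_blank(char *str)
{
  return !str || !*str;
}

// Trying too hard: ? : nested three levels deep.
char *size_name_nested(int bits)
{
  return bits == 8 ? "byte" : bits == 16 ? "short"
    : bits == 32 ? "int" : "long";
}

// Easier to read: if/else and a temporary variable.
char *size_name(int bits)
{
  char *name = "long";

  if (bits == 8) name = "byte";
  else if (bits == 16) name = "short";
  else if (bits == 32) name = "int";

  return name;
}
</pre></blockquote>

<p>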
If you think you need a comment,
you may be right.</p>

<p>Comments are nice, but don't overdo it. Comments should explain _why_,
not how. If the code doesn't make the how part obvious, that's a problem with
the code. Sometimes choosing a better variable name is more revealing than a
comment. Comments on their own line are better than comments on the end of
lines, and they usually have a blank line before them. Most of toybox's
comments are c99 style // single line comments, even when there's more than
one of them. The /* multiline */ style is used at the start for the metadata,
but not so much in the code itself. They don't nest cleanly, are easy to leave
accidentally unterminated, need extra nonfunctional * to look right, and if
you need _that_ much explanation maybe what you really need is a URL citation
linking to a standards document? Long comments can fall out of sync with what
the code is doing. Comments do not get regression tested. There's no such
thing as self-documenting code (if nothing else, code with _no_ comments
is a bit unfriendly to new readers), but "chocolate sauce isn't the answer
to bad cooking" either.
Don't use comments as a crutch to explain unclear
code if the code can be fixed.</p>

<!--#include file="footer.html" -->