- The goal is to remove the dependency on precompiled binaries, or at least to build everything in terms of some small core (something a human could audit).
- Each package has a set of inputs required to build it (for example, many projects require 'make', 'gcc', and the binutils linker 'ld').
- If one of these inputs is (directly or indirectly) the output of building the package itself, then the package is self-hosted (for example, gcc is usually built using gcc).
In a set of packages, if there is only one self-hosted package, then you can trace the build process of any other package back down to that core.
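The definition above amounts to asking whether a package can reach itself by following build inputs. A minimal sketch of that check follows, assuming the inputs are written to a text file with one "PACKAGE INPUT" pair per line. The file format and the function name find_self_hosted are illustrative assumptions, not part of any existing tool.

```shell
#!/bin/sh
# find_self_hosted FILE...
# Hypothetical helper: FILE holds lines of the form "PACKAGE INPUT",
# meaning PACKAGE needs INPUT in order to be built.
# Prints every package that is, directly or indirectly, one of its own inputs.
find_self_hosted() {
    awk '
    # Record each edge and remember which packages have declared inputs.
    NF == 2 { inputs[$1] = inputs[$1] " " $2; names[$1] = 1 }

    # Return 1 if "target" can be reached from "node" by following inputs.
    # "seen" is a global visited set that the caller resets per package.
    function reaches(node, target,    n, i, parts) {
        if (!(node in inputs) || (node in seen)) return 0
        seen[node] = 1
        n = split(inputs[node], parts, " ")
        for (i = 1; i <= n; i++)
            if (parts[i] == target || reaches(parts[i], target))
                return 1
        return 0
    }

    END {
        for (pkg in names) {
            split("", seen)      # portable way to empty the visited set
            if (reaches(pkg, pkg)) print pkg
        }
    }' "$@" | sort
}
```

Given a file containing "gcc gcc", "gcc make", "make gcc", "tcc mes" and "mes stage0", this sketch reports gcc and make, because each sits on a loop, while tcc and mes trace back to stage0 without one.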
- Identify self-hosted packages.
- Break their loop by creating a new set of inputs able to build them that is "smaller" and not self-hosted.
- Identify and create useful packages that are built in terms of smaller inputs (e.g. stage0, mes, guile, amber)
This can be done in parallel; there is little need to coordinate a specific bootstrap path. A path will simply exist if each package can be built in terms of smaller parts. The main difficulty is that most of the essential smaller parts are so useful that they are used all over the place, so basic cooperation is recommended to avoid the massive duplication of effort that such work could otherwise entail.
Note that compiling gcc with tcc does not completely solve the problem for gcc. Building gcc (with tcc) requires running ./configure, which depends on tr, diff, mktemp, and so on, all of which need to be built using a C compiler on a Unix system. So at some point we need to build these tools without ./configure and without a working Unix.
Here is a helpful command that lists every program executed while building a package (in this case, by running make):
TMP=$(mktemp); strace -o "$TMP" -f -e trace=execve -e 'signal=!all' make; sed -ne 's/.*execve("\([^"]*\)",.*/\1/p' "$TMP" | xargs ls -d 2>/dev/null | sort -u
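If the trace is worth keeping for later inspection, the one-liner can be split into two steps. The following is a sketch only: the function names trace_build and list_build_inputs are made up for illustration, and the existence filter is written as an explicit test instead of `xargs ls -d`, which is a small variation on the command above, not the original method.

```shell
#!/bin/sh
# trace_build LOG COMMAND...
# Hypothetical helper: run COMMAND under strace and record every execve
# call (including those made by child processes) in LOG.
trace_build() {
    log=$1
    shift
    strace -o "$log" -f -e trace=execve -e 'signal=!all' "$@"
}

# list_build_inputs LOG
# Hypothetical helper: print the unique program paths named in LOG.
# Paths that do not exist are dropped; these are typically failed
# attempts made while searching PATH.
list_build_inputs() {
    sed -ne 's/.*execve("\([^"]*\)",.*/\1/p' "$1" |
    while IFS= read -r path; do
        [ -e "$path" ] && printf '%s\n' "$path"
    done | sort -u
}
```

A typical session would be `trace_build build.log make` followed by `list_build_inputs build.log`, where build.log is an arbitrary file name chosen for this example.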
Example: Binutils Build Inputs