- The goal is to remove the dependency on precompiled binaries, or at least build everything in terms of some small core.
- Each package has a set of inputs required to build it. For example, many projects require 'make', 'gcc', and the binutils linker 'ld'.
- If one of these inputs (directly or indirectly) is the output of building the project itself, then the package is self-hosted. For example, gcc is usually built using gcc.
In a set of packages, if there is only one self-hosted package, then the build process of any other package can be traced back down to that core.
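The definition above amounts to finding packages that can reach themselves in the "is built from" graph. A minimal sketch, assuming the build inputs have already been written down as `package input` pairs, one per line; the function name `self_hosted` and the sample pairs are hypothetical and do not describe any real package's build:

```sh
# Hypothetical helper: read "package input" pairs on stdin and print every
# package that (directly or indirectly) appears among its own inputs.
self_hosted() {
  awk '
    NF >= 2 { dep[$1, $2] = 1; node[$1] = 1; node[$2] = 1 }
    END {
      # Warshall-style transitive closure: if i needs k and k needs j,
      # then i indirectly needs j.
      for (k in node)
        for (i in node)
          for (j in node)
            if (((i, k) in dep) && ((k, j) in dep))
              dep[i, j] = 1
      # A package that ends up depending on itself is self-hosted.
      for (n in node)
        if ((n, n) in dep)
          print n
    }
  ' | sort
}

# Illustrative input only (made-up edges):
printf '%s\n' 'gcc binutils' 'binutils gcc' 'make gcc' 'tcc mes' | self_hosted
```

With these made-up pairs, gcc and binutils form a loop, so both are reported, while make and tcc are not. The closure is cubic in the number of packages, which should be acceptable for a hand-maintained list but is not meant for a whole distribution.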
- Identify self hosted packages.
- Break their loop by creating a new set of inputs able to build them that is "smaller" and not self-hosted.
- Identify and create useful packages that are built in terms of smaller inputs (e.g. stage0, mes, guile, amber)
This can be done in parallel; there is no need to coordinate a specific bootstrap path. A path will simply exist if each package can be built in terms of smaller parts.
Note that the problem for gcc is not completely solved just by compiling gcc with tcc. To build gcc (with tcc) you need to run ./configure, which depends on tr, diff, mktemp, etc., all of which need to be built using a C compiler. So at some point we need to build these tools without ./configure.
Here is a helpful command to list all the binaries invoked while building a package:
TMP=`mktemp` ; strace -o "$TMP" -f -e trace=execve -e 'signal=!all' make ; sed -ne 's/.*execve("\([^"]*\)",.*/\1/p' "$TMP" | xargs ls -d 2>/dev/null | sort | uniq
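The one-liner above can also be split into small reusable pieces, which makes each stage easier to inspect. This is only a sketch: the function names `execve_paths` and `existing_paths` are hypothetical, the sed expression is the one from the command above, and the existence filter stands in for the `xargs ls -d 2>/dev/null` step (presumably there to drop paths from execve attempts that failed during PATH lookup).

```sh
# Hypothetical helper: read an strace log on stdin and print each distinct
# path passed to execve(), one per line.
execve_paths() {
  sed -ne 's/.*execve("\([^"]*\)",.*/\1/p' | sort -u
}

# Hypothetical helper: keep only the paths that exist on this system.
existing_paths() {
  while IFS= read -r path; do
    if [ -e "$path" ]; then
      printf '%s\n' "$path"
    fi
  done
}

# Usage sketch (needs strace and a Makefile in the current directory):
#   log=$(mktemp)
#   strace -o "$log" -f -e trace=execve -e 'signal=!all' make
#   execve_paths < "$log" | existing_paths
```

Keeping the log in a file means the same trace can be re-examined later, for example to count how often each binary was invoked, without rebuilding the package.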
Example: Binutils Build Inputs