Main Page

Revision as of 20:31, 31 August 2017 by Nickpsecurity (talk | contribs) (Past Research / intray: Added Red Language given it's powerful, productive, and mostly bootstrapped.)

Welcome to bootstrapping!

This wiki is about bootstrapping: building up compilers, interpreters, and tools from nothing.

"Recipe for yogurt: Add yogurt to milk." - Anon.

Short sci-fi story: Coding Machines by Lawrence Kesteloot, January 2009

Current Topics

Past Research

bcompiler by Grimley Evans
This is a detailed log of bootstrapping a series of languages, starting from just a hex assembler written with a hex editor.
The Cuneiform Tablets of 2015 by Long Tien Nguyen, Alan Kay
This discusses methods of long-term software preservation. It briefly covers hardware that will not degrade over time, but the majority of the paper is about how to design a software stack that can be executed in the far future. To achieve this, they recommend building everything in terms of a machine with a short, simple specification.
jonesforth.S by Richard W.M. Jones
An in-depth literate program describing a complete implementation of Forth. It is bootstrapped from 32-bit Intel assembly, with lots of assembler macros, into a fully self-extensible Forth. This is a really illuminating read: it teaches a lot of details about Forth and shows just how minimal the runtime of a programming language can be.
amber by nineties
These slides outline the development of rowl and amber, a programming language bootstrapped up from assembly. rowl is implemented directly in assembly, then parts of the amber VM and compiler are implemented in rowl, and the rest of amber is implemented by self-hosting.
SCM-Go by pkelcjte
This project builds a SICP-style Scheme interpreter with a REPL in Go. The blog post describes each phase, and the phases look simple. The GitHub repository integrates them into a total of 240 lines of code. Because Scheme is a simple language, the Go implementation could be ported to anything else in our collection or hand-assembled directly. More complex things could then be built on it, as nineties and other Lispers do.
jrp.c by curtism
A very small JIT stack calculator implemented in C. All of the instructions are coded in a clever way to make them each a double word or a quad word.
Yet Another BrainFuck Compiler by cameronswinoga
This is a brainfuck compiler implemented in C. It produces an ELF file directly.
The QCC project: hooking the tcc frontend up to qcc's code generator and creating a toybox-style set of cc, as, and ld tools.
List of Diverse Hardware
A big concern in dealing with trust in hardware is whether it's subverted or not. Intel, AMD, and many other big names have backdoors in their chips for management purposes. Among other things... ;) One cheat to get a trustworthy image is to just use a computer you have no reason to believe is subverted. Acquire it as a boring buyer, make sure it is itself boring tech, do your bootstrapping on it air-gapped, and use what it produces. It will likely *not* be subverted *by default* since the interdictors and TAO folks have limited resources and no reason to target the system. Use several different machines for best results. To help with that, I (Nick P.) put together a list of all kinds of CPUs and execution strategies on Schneier's blog. Things I left off the list include old TI-82 calculators, Palm Pilots, etc. There is lots of old stuff lying around that you can get in person with cash and that is probably unsubverted.
golang talk: the Go compiler transpiled from C to Go
"It's time for the Go compilers to be written in Go, not in C. I'll talk about the unusual process the Go team has adopted to make that happen: mechanical conversion of the existing C compilers into idiomatic Go code". They wrote the compiler in C, then translated the source code from C into Go almost automatically (with some manual fixing up). This is an interesting approach. Let's name it the transpile approach to self-hosting.

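As a rough illustration of the starting point in the bcompiler log above, a hex assembler does little more than turn pairs of hex digits into raw bytes. The sketch below is written in Python for readability only. The `#` comment syntax and the name `hex_assemble` are assumptions made up for this example, not bcompiler's actual input format.

```python
def hex_assemble(source: str) -> bytes:
    """Turn text made of hex digit pairs into raw bytes.

    Assumed input format (hypothetical, for illustration only):
    whitespace separates tokens and '#' starts a comment that runs
    to the end of the line.
    """
    out = bytearray()
    for line in source.splitlines():
        code = line.split("#", 1)[0]      # drop the comment, if any
        digits = "".join(code.split())    # drop all whitespace
        if len(digits) % 2 != 0:
            raise ValueError(f"odd number of hex digits in line: {line!r}")
        out += bytes.fromhex(digits)      # raises ValueError on non-hex input
    return bytes(out)


if __name__ == "__main__":
    program = """
    b8 01 00 00 00   # five arbitrary bytes
    cd 80            # two more
    """
    print(hex_assemble(program).hex())
```

Everything later in such a chain (labels, mnemonics, a real assembler) can then be written as hex text and fed through this first tool.
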
Past Research / intray

Important: try to summarize the lessons learned from each.

Karger-Thompson Attack

Anything related to the Karger-Thompson attack: proof-of-concept demos, mitigations, theory.

  • Multics: the original paper explaining the attack (before Thompson!)
  • SCM Security by Wheeler (Secure distribution & compilation of source fundamentals; Karger advised mastering it)
  • rotten by rntz (Thompson attack demo)
  • rust infection by manishearth (Thompson attack demo in the Rust compiler)
  • tcc ACSAC by David Wheeler
  • CompCert by Leroy et al (Mathematically-verified, C compiler whose specs and proofs checked with tiny, verified checker)
  • CakeML by Myreen et al (Mathematically-verified, SML compiler whose specs and proofs checked with different, tiny, verified checker)
  • VLISP by Oliva and Wand (Article has links to VLISP which mathematically verified PreScheme and Scheme48)
  • KCC by Rosu et al (Executable, formal semantics for C in rewrite logic; could do that w/ simpler engine)
  • TALC by Cornell (Typed, assembly language to verify safety w/out compiler; checker can be simple; C subset + verified compiler to TALC)
  • CoqASM by Microsoft Research (Bootstrap in verifiably-safe assembly in prover checked by tiny, verified checker)

Ubiquitous Implementations

These are tools written in ubiquitous languages, therefore they can be used in a wide variety of contexts.

  • shasm by Hohensee (x86 assembler written in BASH)
  • AWKLisp by Bacon (LISP written in Awk; includes Perl version from Perl Avenger)
  • Gherkin by Dipert (LISP written in Bash)
  • mal "make a lisp" implementing a very basic lisp interpreter in hundreds of languages

Small C Compilers

  • c4 by rswier (incredibly short C compiler)
  • cc500 by Edmund Grimley Evans (tiny C compiler)
  • CUCU by Zaitsev (Small, C compiler designed for easy understanding)
  • SmallerC by Frunze (Small, single-pass, C compiler for several ISA's)
  • picoc interpreter.
  • C Interpreter by Dr Dobbs (Describes building a C interpreter with source)
  • Small C for I386 (IA-32)
  • Selfie, a tiny self-compiling compiler for a subset of C, a tiny self-executing MIPS emulator, and a tiny self-hosting MIPS hypervisor, all in a single 7kLoC file. HN discussion. Paper.

Grammars, Parsing, and Term Rewriting

  • Grammar Executing Machine by McKeeman and He (Incrementally extend languages from simple to complex grammars in interpreter(s))
  • peg by kragen (parsing)
  • PEG-based simple compiler by Ian Piumarta
  • META II by Bayfront Tech (Original meta-compiler w/ live code and detailed tutorial; OMeta was successor)
  • META II implementation by Lugon (Looks like a small implementation of META II; also bootstrapped in META II)
  • OMeta# Intro by Moser (OMeta intro that nicely illustrates the meta approach/advantages)
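
To give a feel for the PEG style of parsing that several of these links build on, here is a minimal sketch of parsing-expression combinators with ordered choice. All names (`lit`, `seq`, `choice`, `many`, `number`, `total`) are invented for this example and are not taken from any of the linked projects.

```python
def lit(s):
    """Match the literal string s."""
    def parse(text, pos):
        if text.startswith(s, pos):
            return pos + len(s), s
        return None
    return parse


def seq(*parsers):
    """Match each parser in order; fail if any one fails."""
    def parse(text, pos):
        values = []
        for p in parsers:
            r = p(text, pos)
            if r is None:
                return None
            pos, v = r
            values.append(v)
        return pos, values
    return parse


def choice(*parsers):
    """Ordered choice: the first alternative that matches wins."""
    def parse(text, pos):
        for p in parsers:
            r = p(text, pos)
            if r is not None:
                return r
        return None
    return parse


def many(p):
    """Match p zero or more times, greedily."""
    def parse(text, pos):
        values = []
        while True:
            r = p(text, pos)
            if r is None or r[0] == pos:   # stop on failure or empty match
                return pos, values
            pos, v = r
            values.append(v)
    return parse


digit = choice(*[lit(c) for c in "0123456789"])


def number(text, pos):
    # number <- digit digit*
    r = seq(digit, many(digit))(text, pos)
    if r is None:
        return None
    pos, (first, rest) = r
    return pos, int(first + "".join(rest))


def total(text, pos):
    # total <- number ('+' number)*   (evaluated while parsing)
    r = seq(number, many(seq(lit("+"), number)))(text, pos)
    if r is None:
        return None
    pos, (first, rest) = r
    return pos, first + sum(n for _, n in rest)


if __name__ == "__main__":
    print(total("12+30+4", 0))   # (7, 46)
```

Each parser takes the text and a position and returns either `None` or the new position plus a value, so a grammar rule is just an ordinary function.
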

Virtual Machines, Instruction Sets

  • P-code by Wirth (High-level language & libraries target ultra-simple, portable interpreter)
  • sweet16 by Steve Wozniak
  • Tiny BASIC by Allison (Small BASIC whose original VM took 120 virtual opcodes to implement using 3KB RAM)
  • Klip by Cutting (Compiler & runtime for simple language for students; done in C#; runtime is very readable)
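
The common thread in this list, a language targeting a very small interpreter, can be sketched with a toy stack machine. The opcode names and the tuple encoding below are invented for illustration. They are not Wirth's P-code, SWEET16, or the Tiny BASIC VM.

```python
def run(program):
    """Execute a list of (opcode, operand) tuples on a toy stack machine.

    Returns the list of values emitted by the 'print' opcode.
    """
    stack = []
    output = []
    pc = 0
    while pc < len(program):
        op, arg = program[pc]
        pc += 1
        if op == "push":
            stack.append(arg)
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "sub":
            b, a = stack.pop(), stack.pop()
            stack.append(a - b)
        elif op == "dup":
            stack.append(stack[-1])
        elif op == "jnz":                 # pop; jump to arg if nonzero
            if stack.pop() != 0:
                pc = arg
        elif op == "print":
            output.append(stack.pop())
        elif op == "halt":
            break
        else:
            raise ValueError(f"unknown opcode: {op!r}")
    return output


# Count down from 3 to 1 using a loop built from dup/sub/jnz.
COUNTDOWN = [
    ("push", 3),      # 0
    ("dup", None),    # 1: loop start
    ("print", None),  # 2
    ("push", 1),      # 3
    ("sub", None),    # 4
    ("dup", None),    # 5
    ("jnz", 1),       # 6: loop while the counter is nonzero
    ("halt", None),   # 7
]

if __name__ == "__main__":
    print(run(COUNTDOWN))   # [3, 2, 1]
```

The whole interpreter is one loop and a handful of cases, which is what makes this kind of VM cheap to re-implement on a new host.
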

CPU's for Bootstrapping: The Simple, The Verified, and The Necessarily Complex

  • NAND2Tetris by Nisan and Schocken (Guide that teaches hardware step-by-step in fun way with simple CPU emerging)
  • J1 by Bowman (16-bit Forth CPU in 200 lines of Verilog that does 100 MIPS on FPGAs)
  • H2 by Howe (Modified VHDL version of J1 with detailed description; Howe's code is MIT-licensed)
  • RISC-0 by Wirth (Simple, RISC CPU & SOC designed for Oberon language with detailed docs and source online)
  • JOP by Schoeberl et al (Embedded Java processor that takes up 1830 slices on FPGA)
  • Scheme Machine by Burger (Scheme interpreter implemented as CPU using formal methods)
  • ZPU by Zylin AS (Tiny, 32-bit CPU for deep embedded apps in 440 LUT's)
  • J2 by Landley et al (Clone of cost-efficient, SuperH-2 CPU in open-source)
  • VAMP by Beyer et al (Formally-verified, DLX-style processor in 18,000 slices on Xilinx)
  • Leon3 by Gaisler (Industry-grade, 32-bit SPARC w/ auto-configuration of core and GPL license)
  • Rocket by Univ of CA (1.4GHz RISC-V CPU and generator for customization)
  • OpenPITON by Princeton (25-core, shared-memory, SPARC CPU open-sourced and very scalable)

Helpful Links