Aesop/Notes and Research
Random notes about bootstrapping Pascal with Aesop, An Extensive Subset Of Pascal.
Compiler backends
Conclusion: Assembly is the best choice (QBE was the original pick; see the update under QBE)
- C: Apparently compiling to C loses you some flexibility
- GCC: Far too bloated, apparently not nice to work with, hard dependency on C++.
- LLVM: Far too bloated, hard dependency on C++.
- Cranelift: Very immature and buggy, supports even fewer architectures than QBE.
- QBE: Supports fewer architectures, but that's not a problem thanks to QEMU. Only dependency is a C compiler. Update: It has come to my attention that, due to its youth, QBE does not support some important features such as inline asm.
- Assembly: The only choice after eliminating all the others. Unfortunately.
Language
Conclusion: Probably C.
- C: Bootstrapping it is annoying, but there are plenty of solid tools like Yacc and Lex for it. It also doesn't have a complexity problem like many of the others do.
- Go: I don't know it very well.
- Rust: Bootstrapping it is annoying, it suffers from complexity syndrome, and the ecosystem has an annoying tendency to pull in dozens of dependencies with each crate.
- D: Bootstrapping it is annoying, and it has too many features for its own good.
- Scheme: No static typing :(
- Lisp: No static typing :(
- Nim: The whole point of Aesop is to reach Nim in the first place!
- OCaml: Older version is bootstrapped; not all libraries will work with this version.
- Idris: Requires Haskell.
- Haskell: Attempting to bootstrap it is more scarring than programming in COBOL.
- One of the stage0 languages: Ha ha, you're very funny.
Possible alternative solutions to the FPC problem
Conclusion: diverse double-compilation with a new compiler (Aesop) is the best solution
- Compiling it with GNU Pascal: Nobody could figure out how to make GNU Pascal build with modern toolchains, and the two compilers follow completely different standards: FPC follows (and was written using) Turbo Pascal and Delphi features, whereas GNU Pascal followed the ISO Extended Pascal standard. However, GNU Pascal has significant support for the Borland Pascal (Turbo Pascal 7) dialect. This should be enough to compile Free Pascal 1.0.10 and start the bootstrap chain.
- Using an older version and doing a chain: Versions prior to Free Pascal 1.0.10, which added support for Unix-like systems, would have to be run in DOSBox, FreeDOS-in-QEMU, or something similar. FPC was started in response to Borland dropping support for DOS, and its first versions were compiled with the proprietary Turbo Pascal compiler. Those first versions probably STILL used Turbo Pascal extensions like the preprocessor and units. GNU Pascal supports Turbo Pascal extensions, but not the newer features implemented and used in recent Free Pascal versions.
- Using one of the Pascal to C tools: They have the same problem as GNU Pascal.
- Extending someone else's interpreter or compiler: Interpreters aren't viable, since I'd need to write an assembler and JIT for inline asm blocks. I couldn't find a toy Pascal compiler.
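For reference, the diverse double-compilation check named in the conclusion can be sketched roughly as follows. This is only an illustration under assumptions: `build` is a hypothetical callable standing in for "run this compiler binary on this source and return the output", the trusted compiler would be Aesop, the source would be the FPC sources, and the suspect would be the FPC binary that claims to be built from those sources. It also assumes the build is deterministic.

```python
import hashlib
from typing import Callable

# Hypothetical signature: build(compiler_binary, source) -> output binary.
# A real version would run the compiler in a clean environment and read back
# the file it produces; here it is just a callable so the logic can be tested.
Build = Callable[[bytes, bytes], bytes]


def ddc_matches(build: Build, trusted: bytes, source: bytes, suspect: bytes) -> bool:
    """Return True if the suspect binary is reproduced via the trusted compiler."""
    stage1 = build(trusted, source)  # compiler source built by the trusted compiler
    stage2 = build(stage1, source)   # compiler source built by stage1
    # stage1 itself is not compared: a different compiler generates different code.
    # stage2 and the suspect should be bit-identical if the build is deterministic.
    return hashlib.sha256(stage2).digest() == hashlib.sha256(suspect).digest()
```

If the two hashes agree, the suspect binary corresponds to its source, provided the trusted compiler and the environment are not compromised in the same way.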
A brief venture into the FPC source code, or: How non-standard is FPC exactly?
We'll clone the git repo at [1], then cd into it. The directory layout looks reasonable enough:
compiler fpmake.pp installer Makefile nohup.out README.md tests fpmake_add1.inc fpmake_proc1.inc LICENSE Makefile.fpc packages rtl utils
The only worrying thing there is that 'fpmake.pp' file. Looks like a custom build system... written in Pascal :( Since we probably don't need to worry about the other directories, let's cd into compiler/. I'm sure this directory layout will be fi-
aarch64 cstreams.pas mips objcdef.pas pinline.pas README.txt aasmbase.pas cutils.pas MPWMake objcgutl.pas pkgutil.pas rescmn.pas aasmcfi.pas dbgbase.pas msg objcutil.pas pmodules.pas rgbase.pas aasmcnst.pas dbgcodeview.pas msgidx.inc ogbase.pas powerpc rgobj.pas aasmdata.pas dbgdwarf.pas msgtxt.inc ogcoff.pas powerpc64 riscv aasmdef.pas dbgstabs.pas nadd.pas ogelf.pas pparautl.pas riscv32 aasmsym.pas dbgstabx.pas nbas.pas oglx.pas ppc68k.lpi riscv64 aasmtai.pas defcmp.pas ncal.pas ogmacho.pas ppc8086.lpi scandir.pas aggas.pas defutil.pas ncgadd.pas ogmap.pas ppcaarch64.lpi scanner.pas aoptbase.pas dirparse.pas ncgbas.pas ognlm.pas ppcarm.lpi sparc aoptda.pas dwarfbase.pas ncgcal.pas ogomf.pas ppcavr.lpi sparc64 aoptobj.pas elfbase.pas ncgcnv.pas ogrel.pas ppcgen sparcgen aopt.pas entfile.pas ncgcon.pas ogwasm.pas ppcjvm.lpi switches.pas aoptutils.pas export.pas ncgflw.pas omfbase.pas ppcmips64el.lpi symbase.pas arm expunix.pas ncghlmat.pas optbase.pas ppcmipsel.lpi symconst.pas armgen finput.pas ncginl.pas optconstprop.pas ppcmips.lpi symcreat.pas assemble.pas fmodule.pas ncgld.pas optcse.pas ppcppc64le.lpi symdef.pas avr fpcdefs.inc ncgmat.pas optdead.pas ppcppc64.lpi symsym.pas blockutl.pas fpchash.pas ncgmem.pas optdeadstore.pas ppcppc.lpi symtable.pas browcol.pas fpcp.pas ncgnstfl.pas optdfa.pas ppcriscv32.lpi symtype.pas catch.pas fpkg.pas ncgnstld.pas options.pas ppcriscv64.lpi symutil.pas ccharset.pas fppu.pas ncgnstmm.pas optloadmodifystore.pas ppcsparc64.lpi syscinfo.pas cclasses.pas gendef.pas ncgobjc.pas optloop.pas ppcsparc.lpi systems cepiktimer.pas generic ncgopt.pas opttail.pas ppcwasm32.lpi systems.inc cfidwarf.pas globals.pas ncgrtti.pas optutils.pas ppcx64llvm.lpi systems.pas cfileutl.pas globstat.pas ncgset.pas optvirt.pas ppcx64.lpi tgobj.pas cg64f32.pas globtype.pas ncgutil.pas owar.pas ppcxtensa.lpi tokens.pas cgbase.pas hlcg2ll.pas ncgvmt.pas owbase.pas ppcz80.lpi triplet.pas cgexcept.pas hlcgobj.pas ncnv.pas owomflib.pas ppheap.pas utils 
cghlcpu.pas html ncon.pas parabase.pas pp.lpi verbose.pas cgobj.pas htypechk.pas nflw.pas paramgr.pas pp.pas version.pas cgutils.pas i386 ngenutil.pas parser.pas ppu.pas wasm32 cmsgs.pas i8086 ngtcon.pas pass_1.pas procdefutil.pas wasmbase.pas comphook.pas impdef.pas ninl.pas pass_2.pas procinfo.pas widestr.pas compiler.pas import.pas nld.pas pbase.pas psabiehpi.pas wpobase.pas compinnr.pas jvm nmat.pas pcp.pas pstatmnt.pas wpoinfo.pas comprsrc.pas ldscript.pas nmem.pas pdecl.pas psub.pas wpo.pas comptty.pas link.pas nobjc.pas pdecobj.pas psystem.pas x86 constexp.pas llvm nobj.pas pdecsub.pas ptconst.pas x86_64 COPYING.txt m68k node.pas pdecvar.pas ptype.pas xtensa cprofile.pas macho.pas nopt.pas pexports.pas raatt.pas z80 crefs.pas machoutils.pas nset.pas pexpr.pas rabase.pas cresstr.pas Makefile nutils.pas pgentype.pas rasm.pas cscript.pas Makefile.fpc objcasm.pas pgenutil.pas rautils.pas
...oh. Let's see approximately how non-standard this code is. First thing; in every file we have:
unit (...);
...
interface
...
implementation
...
What's this, then? Hmm... Apparently this is Pascal's module system. Except even that isn't standard. So, we'd need a compiler/interpreter with at least a module system. GNU Pascal has one... but it's the completely different and incompatible Extended Pascal module system. I can also see loads of things that look like this, and nvim highlights them differently to comments:
{$i (...)}
These are, apparently, preprocessor directives. Of course, these aren't standardized either; they're a Turbo Pascal extension if I remember right. Let's have a look at, say, llvm/aasmllvm.pas now. As soon as you open it up, you can see a type declaration... and after the equals sign:
class(tai_cpu_abstract_sym)
...
end
The FPC codebase uses advanced Delphi object-oriented features extensively. Of course it does. I don't think there's any hope of using another compiler at this point, so let's just stop. (There are probably many other, more subtle portability problems in there, like non-standard built-in types, but I'm not experienced enough with Pascal to tell.)
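To put rough numbers on "how non-standard", a small script could count the three constructs spotted above (unit headers, `{$i ...}` includes, and `class(...)` declarations) across the source tree. The sketch below is an assumption-laden approximation: the regular expressions are my own guesses at what these constructs look like, they will also match text inside comments and strings, and the file extensions searched are just the ones visible in the directory listing.

```python
import re
from collections import Counter
from pathlib import Path

# Rough patterns (assumptions, not a real Pascal lexer).
PATTERNS = {
    # "unit name;" at the start of a line
    "unit": re.compile(r"^\s*unit\s+[\w.]+\s*;", re.IGNORECASE | re.MULTILINE),
    # "{$i file}" or "{$include file}"; the required whitespace keeps
    # switches like {$I+} and conditionals like {$ifdef X} out of the count
    "include": re.compile(r"\{\$(?:i|include)\s+[^}]+\}", re.IGNORECASE),
    # "= class(" as seen after the equals sign in a type declaration
    "class": re.compile(r"=\s*class\s*\(", re.IGNORECASE),
}


def count_features(text: str) -> Counter:
    """Count pattern matches in one file's text."""
    return Counter({name: len(p.findall(text)) for name, p in PATTERNS.items()})


def survey(root: str) -> Counter:
    """Sum the counts over every .pas, .pp and .inc file under root."""
    total = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix.lower() in {".pas", ".pp", ".inc"}:
            total += count_features(path.read_text(errors="replace"))
    return total
```

Running `survey("compiler")` from the top of the checkout would give a per-construct total, which is only a lower bound on the non-standard features a replacement compiler would have to handle.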