parabola-cross-bootstrap ======================== “Don't cry because it's over, smile because it happened.” ― Dr. Seuss 1. Introduction --------------- This project is an attempt to bootstrap a self-contained parabola GNU/Linux-libre system for the following architectures: - riscv64 (HiFive) (complete) - riscv32 (PULP) (blocked) - sparc (OpenSparc) (planned) - powerpc64le (TalosII) (multilib complete) The scripts are created with the goal to be as architecture agnostic as possible, to make future porting efforts easier. The build process is split into four stages, the rationale of which is outlined in section 2 below. To initiate a complete build of all stages, run: $> sudo ./create [CARCH] The builds can be configured to keep going if the build of a single package fails, by creating the file `build/.KEEP_GOING`. Otherwise, the build will stop once an error is encountered. This is useful for getting as much work done as possible unattended, but will make debugging harder in the later stages, because temporary build fragments and filesystem trees in the build chroots will be overwritten by the next package. The complete console output of a package build process can be found in the corresponding .MAKEPKGLOG file in the build directory of that package. Target platform configuration is persisted in the config/config.$CARCH.sh files. 1.1. System Requirements ------------------------ The scripts require, among probably other things, to be running on a fairly POSIX-conforming GNU/Linux system, and in particular need the following tools to be present and functional: * decently up-to-date GNU build toolchain (gcc, glibc, binutils) most of the * things in base-devel pacman, makepkg I have tried to make the script smart enough to check for required bits and pieces where needed, and to report when anything is missing ahead of time, but some requirements may be missing. 1.2. A note to the reader ------------------------- The scripts assume to be running on a parabola GNU/Linux-libre system, and may fail in subtle and unexpected ways otherwise. They also may fail in subtle and unexpected ways anyway, because they are modifying upstream PKGBUILD files, which are very volatile and change without notice. When adapting this project to a new architecture, or a different flavour of archlinux, execrcise caution, pay close attention to any output, and be prepared to fix and modify patches and scripts. Also, if you found this project useful, and want to chat about anything, you can email me at , or find me as in #parabola and others on irc.freenode.org. 1.3. Current state of the project --------------------------------- All four stages of the riscv64 bootstrap are complete, and efforts to add additional architectures has begun. A pointer where to find future development efforts for the parabola ports will be added here in due time. 2. Build Stages --------------- The following subsections outline the reasoning behind the separate bootstrap stages. More details about *how* things are done may be gathered from reading the inline comments of the respective scripts. From Stage 2 onwards, the scripts use DEPTREEs to determine what packages need to be built. these files are located in the build/$CARCH/stage$X/ directories and can give an insight into what packages are going to be built next and what dependencies are still missing. Since risc-v is a fairly new ISA, some packages were packaged with config.sub and config.guess files that are too old to recognize the target triplet. This requires config.sub and config.guess files to be refreshed with newer versions from upstream. This is done automatically for stage 2 and onwards, if REGEN_CONFIG_FRAGMENTS is set to yes (the default) in the architecture config file. Packages with the `any' architecture are reused from upstream arch for this bootstrap, since they should work for any architecture and do not need to be re-(or cross-)compiled. This assumption is not always true, but it is a reasonable approximation. Additionally, all checkdepends are ignored, and the check() phase of the builds is skipped for sanity reasons, since the tests are likely to fail for benign reasons in a cross-arch makepkg chroot. Note that the stages use upstream arch and parabola PKGBUILDs and attempt to apply custom patches to resolve unbuildable or missing dependencies, and fix architecture specific build issues. Look for these patches in src/stage$X/patches/* and be prepared to fix and adapt them for future porting efforts. 2.1. Stage 1 ------------ The first stage creates and installs a cross-compile toolchain for the target triplet defined in $CHOST, consisting of binutils, linux-libre-api-headers, gcc and glibc. The scripts will check for $CHOST-ar and $CHOST-gcc in $PATH to determine whether binutils and gcc are installed, and will then proceed to look for the following files in $CHOST-gcc's sysroot: $sysroot/lib/libc.so.6 # for $CHOST-glibc $sysroot/include/linux/kernel.h # for $CHOST-linux-libre-api-headers If your system contains these files, the toolchain bootstrap process will be skipped. Otherwise, the scripts will create a new toolchain as follows: - compile $CHOST-binutils - compile $CHOST-linux-libre-api-headers - compile $CHOST-gcc-bootstrap, without glibc dependency - cross-compile $CHOST-glibc using $CHOST-gcc-bootstrap - compile $CHOST-gcc against the created $CHOST-glibc The sysroot of the created toolchain is set as /usr/$CHOST. The $CHOST-{binutils,linux-libre-api-headers,gcc,glibc} packages are installed as regular pacman packages on the build system, and are not automatically removed by the scripts after the build is completed (or has failed). For more information, you might want to check the PKGBUILD files located in src/stage1/toolchain-pkgbuilds/*. 2.2. Stage 2 ------------ Stage 2 uses the toolchain created in Stage 1 to cross-compile a subset of the packages of the base-devel group plus transitive runtime dependencies. The script creates an empty skeleton chroot, into which the cross-compiled packages are going to be installed, and creates a sane makepkg.conf and a patched makepkg.sh to work in the prepared chroot root directory. To make the sysroot of the compiler available to builds in the chroot and vice-versa, the /usr directory of the chroot is mounted into the sysroot. At the end of Stage 2, or in case of an error, this directory is unmounted automatically. To build the packages, the DEPTREE is traversed and packages are cross-compiled using upstream PKGBUILDs and custom patches, and the compiled packages are installed into the chroot immediately. For this stage, the patches are mandatory for each built package. Note that this process is a bit fragile and dependent on arbitrary particularities of the host system, and thus might fail for subtle reasons, like missing, or superfluous build-time installed packages on the host. Exercise caution and common sense. 2.3. Stage 3 ------------ Stage 3 uses the cross-compiled makepkg chroot created in Stage 2 to natively recompile the base-devel group of packages. This stage requires to build more packages, since a reduced set of make-time dependencies need to be present in the makepkg chroot, as well as runtime dependencies. Additionally, running the cross-compiled native compiler instead of the cross compiler takes longer, since user-mode emulation needs to be applied. As a result, stage 3 is expected to takes much longer than stage 2. However, since now the process is isolated from the host systems installed packages, since everything is built cleanly in a chroot, the process is much more stable and less prone to hard to diagnose problems with the host system. The scripts create a clean librechroot from the cross-compiled packages produced in stage 2. A modified libremakepkg script is created to perform config fragment regeneration and to skip the check() phase, and then packages are built in order of the DEPTREE. Since no cross-compilation is needed from stage 3 onwards, the patches are no longer required for each package. Fewer packages need patching, and the patches are typically smaller and apply less pressure to the build recipes. As a consequence, the patches probably need less maintenance in the future, and less work to adapt to a different architecture. Note that the cross-compiled packages from stage 3 can be a bit derpy at times, hence the stage 3 build scripts prioritize building bash and make natively, to avoid some known issues down the line. 2.4. Stage 4 ------------ Stage 4 does a final recompile of the packages of the base-devel group, similar to stage 3, with the difference that more make-time dependencies are enabled, and the packages of the base group are added to the DEPTREE. Stage 4 relies entirely on the packages natively compiled in stage 3, and no cross-compiled packages are present in the build chroot at any time. This results in reliable builds and reproducible build failures. However, since the number of packages to be built in stage 4 is again much larger, expect the builds to take a long time (days / weeks, not including work required to fix broken patches and builds). The result of stage 4 is a repository of packages that should allow to bootstrap and boot a virtual machine of the bootstrapped architecture, with the packages required to build the entirety of the arch / parabola package repository. At this point, the port becomes self-hosting and I consider the bootstrap process done. Note that a bootloader is missing from the bootstrap process, but this can easily be built manually after stage4 using the bootstrapped makepkg chroot. 3. Acknowledgements =================== I would like to thank the awesome abaumann from the archlinux32 project for pointers on how to bootstrap a PKGBUILD based system for a new architecture, and for his work on the bootstrap32 project, which helped a lot in getting this project started: https://github.com/archlinux32/bootstrap32 Further, I would like to thank the amazing people in #riscv and #fedora-riscv on irc.freenode.org, especially rwmjones, davidlt and sorear for all the help getting packages to build and work on risc-v, and for providing the source packages of all fedora-riscv packages, including all patches and build recipes, which have been of immense help: https://fedorapeople.org/groups/risc-v/SRPMS/ Lastly, it was refreshing to see yor names pop up over and over again on the risc-v ports, patches and pull requests out there, and it made me realize how much work it really is to get a port like this off the ground. I could not have made it this far without your work. ― Andreas Grapentin, 2018-04-18 ------------------------------------------------------------------------------ After finishing the risc-v bootstrap, I didn't expect to revisit this repository so soon. Now, after bootstrapping parabola for powerpc64le, I was very pleased with how well the scripts held up to the new architecture and how easily everything came together, despite a couple of bumps in the road. I realize again that this would not have been possible without the people listed above - abaumann from archlinux32 and the fedora crew (including, of course, the amazing people working on ppc64le fedora). Thank you. ― Andreas Grapentin, 2018-06-22