Parallelize Pacman Downloads & Compilation

Parallelizing Pacman Downloads

I usually start off blog posts with an “about the project”, but this is a very short one, so let’s get right into it! You should modify your pacman.conf file to enable parallel downloads. It’ll speed up installation of packages that have several dependencies. For all those distro hoppers out there, this should cut down your install times.

By default the ParallelDownloads option is set to only 5, but it can be raised to any positive integer. The higher you make this number, the more downloads run at once. Neat, right? The downside is that you’ll also be saturating your network connection and potentially putting more stress on a mirror.

...truncated...
# Misc options
#UseSyslog
#Color
NoProgressBar
# We cannot check disk space from within a chroot environment
#CheckSpace
VerbosePkgLists
ParallelDownloads = 5
DownloadUser = alpm
#DisableSandbox
...truncated...
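
If you would rather script the change than open an editor, something like the sketch below should do it. The function name set_parallel_downloads and the value 10 are made up for this example, and it assumes GNU sed; try it on a copy of the file first.

```shell
#!/bin/sh
# Hypothetical helper (not part of pacman): set ParallelDownloads in a
# pacman.conf-style file. Handles both the active and commented-out forms.
set_parallel_downloads() {
    conf="$1"    # path to the config file to edit
    count="$2"   # desired number of parallel downloads

    # Reject anything that is not a positive integer.
    case "$count" in
        ''|*[!0-9]*|0)
            echo "count must be a positive integer" >&2
            return 1
            ;;
    esac

    # Rewrite the existing (possibly commented) ParallelDownloads line in place.
    sed -i -E "s/^#?[[:space:]]*ParallelDownloads[[:space:]]*=.*/ParallelDownloads = ${count}/" "$conf"
}

# Example (needs root when pointed at the real file):
#   set_parallel_downloads /etc/pacman.conf 10
```

The function only rewrites a line that already exists, so it does nothing if the option is missing from the file entirely.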

The screenshot below shows parallelization in action!

multi-package-downloading-with-arch.png

For those who use yay, this file is used unless you create a file called “yay” within ~/.config/. If you’re working in a RHEL environment, /etc/dnf.conf has a similar option, max_parallel_downloads, which allows up to 20 parallel downloads according to the man page.
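
For the dnf side, here is a comparable sketch. The helper name set_dnf_parallel is hypothetical, the lower bound of 1 is my own assumption, and the upper bound of 20 is the man page limit mentioned above. It also assumes GNU sed and that [main] is the only section in the file.

```shell
#!/bin/sh
# Hypothetical helper: set max_parallel_downloads in a dnf.conf-style file,
# clamping the requested value to the range 1..20.
set_dnf_parallel() {
    conf="$1"    # path to the config file to edit
    count="$2"   # requested number of parallel downloads

    # Only plain digits are accepted.
    case "$count" in
        ''|*[!0-9]*)
            echo "count must be a non-negative integer" >&2
            return 1
            ;;
    esac

    # Clamp to the assumed lower bound and the documented upper bound.
    if [ "$count" -lt 1 ]; then count=1; fi
    if [ "$count" -gt 20 ]; then count=20; fi

    if grep -q '^max_parallel_downloads=' "$conf"; then
        # Key already present: update it in place.
        sed -i "s/^max_parallel_downloads=.*/max_parallel_downloads=${count}/" "$conf"
    else
        # Key missing: append it (assumes [main] is the last section).
        printf 'max_parallel_downloads=%s\n' "$count" >> "$conf"
    fi
}

# Example (needs root when pointed at the real file):
#   set_dnf_parallel /etc/dnf.conf 10
```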

While this only speeds up downloading packages, we can tinker with configuration files further to distribute compilation as well! The distcc utility distributes the task of compilation across any number of hosts. Now you’re probably thinking, “well that’s wild, but why would I do this?” If you have a handful of Raspberry Pis floating around (and who doesn’t), they can share the work: as long as the appropriate compiler is installed on each target host, your application should compile without issue.

How does this relate to Arch Linux? The makepkg utility comes with out-of-the-box support for distcc! The excerpt from /etc/makepkg.conf below shows where distcc is enabled and where the hosts to connect to are listed.

...truncated...
#########################################################################
# BUILD ENVIRONMENT
#########################################################################
#
# Makepkg defaults: BUILDENV=(!distcc !color !ccache check !sign)
#  A negated environment option will do the opposite of the comments below.
#
#-- distcc:   Use the Distributed C/C++/ObjC compiler
#-- color:    Colorize output messages
#-- ccache:   Use ccache to cache compilation
#-- check:    Run the check() function if present in the PKGBUILD
#-- sign:     Generate PGP signature file
#
BUILDENV=(!distcc color !ccache check !sign)
#
#-- If using DistCC, your MAKEFLAGS will also need modification. In addition,
#-- specify a space-delimited list of hosts running in the DistCC cluster.
#DISTCC_HOSTS=""
#
#-- Specify a directory for package building.
#BUILDDIR=/tmp/makepkg
...truncated...
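
The comments in that file note that MAKEFLAGS needs adjusting once distcc is on. As a rough sketch of what the edited values might look like, the script below adds up the per-host job limits (distcc's host/limit notation) and prints candidate lines for makepkg.conf. The IP addresses, the distcc_jobs helper, and the DEFAULT_JOBS fallback are all assumptions made for this example.

```shell
#!/bin/sh
# Sketch: derive a make -j value from a distcc host list.
# Each entry may carry a job limit as host/limit; entries without one
# fall back to DEFAULT_JOBS, an assumed value chosen for this example.
DEFAULT_JOBS=4

distcc_jobs() {
    total=0
    for entry in $1; do
        case "$entry" in
            */*) limit="${entry##*/}" ;;   # text after the last slash
            *)   limit="$DEFAULT_JOBS" ;;
        esac
        limit="${limit%%,*}"               # drop trailing options such as ,lzo
        total=$((total + limit))
    done
    echo "$total"
}

# Hypothetical hosts; substitute your own.
DISTCC_HOSTS="192.168.0.10/4 192.168.0.11/4 localhost/2"
MAKEFLAGS="-j$(distcc_jobs "$DISTCC_HOSTS")"

# Print the lines you would place in /etc/makepkg.conf.
echo "BUILDENV=(distcc color !ccache check !sign)"
echo "DISTCC_HOSTS=\"$DISTCC_HOSTS\""
echo "MAKEFLAGS=\"$MAKEFLAGS\""
```

Note that distcc loses its leading ! in BUILDENV, which is what actually switches it on. The script only prints the lines, so nothing on the system is modified.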

distcc can also be configured to use ssh to initiate its connections. According to the homepage of the distcc project itself, setup is as easy as:

For each machine, download distcc, unpack, and do
  ./configure && make && sudo make install
On each of the servers, run distccd --daemon, with --allow options to restrict access.
Put the names of the servers in your environment:
  export DISTCC_POTENTIAL_HOSTS='localhost red green blue'
Build! Wrap your build command in the "pump" script and use "distcc" as your C compiler:
  cd ~/work/myproject; pump make -j8 CC=distcc
- https://www.distcc.org/
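
For the ssh route mentioned earlier, distcc treats a host entry containing an @ (either @host or user@host) as one to reach over ssh. Here is a small sketch that builds such a list; the ssh_hosts helper is hypothetical, and red, green, and blue are just the placeholder names from the quote above.

```shell
#!/bin/sh
# Sketch: turn plain host names into distcc's SSH-style entries by
# prefixing each with "@". localhost and entries that already contain
# an "@" (such as user@host) are left untouched.
ssh_hosts() {
    out=""
    for host in "$@"; do
        case "$host" in
            localhost|*@*) entry="$host" ;;
            *)             entry="@$host" ;;
        esac
        out="${out:+$out }$entry"   # join entries with single spaces
    done
    echo "$out"
}

# Placeholder host names; substitute your own machines.
DISTCC_HOSTS="$(ssh_hosts localhost red green blue)"
export DISTCC_HOSTS
echo "$DISTCC_HOSTS"
```

This assumes key-based ssh logins to each host are already set up, since a password prompt for every compile job would get old fast.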

With that, I’ll wrap up this blog post.

Hopefully this leaves you with enough to go forth and tinker!

Thank you for reading, please consider sharing.