Clone the github stuff, “make”, and then fail

I had to write two bug reports this morning, both about essentially the same annoying issue:

https://github.com/jameister/jsonpath/issues/1

and

https://github.com/tj/luna/issues/79

There are always more implicit (and sometimes outright hidden) dependencies than one might think.

This wreaks havoc on installers too, not just on the build process.

Docker solves this nicely by stating explicitly what the base image for a program must be. Declaring and enforcing a base image is a necessary (but not sufficient) condition for reproducible builds. Reproducible builds are also a pressing security concern, and time is running out on that front. Meanwhile, Debian is still not ready:

https://wiki.debian.org/ReproducibleBuilds

Our existing build and installation methods are simply unsustainable!

What is missing in Docker is some kind of git-style content database that stores every file ever downloaded for any image, including for the host image itself. With that in place, we would no longer need to download the entire image, but only the missing files. The image would just be a set of hard links to files in the download cache. Such a cache should not operate at the level of packages, but at the level of individual files. The current image definition method leads to far too much file duplication. What the image provisioning system should do is (see the sketch after the list):

[1] download the manifest for an image: its file tree, with a sha256 hash for every file
[2] populate the tree from the existing cache by hard-linking to the files that are already present
[3] download the missing files only
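
Here is a minimal sketch of that provisioning loop in Python. It assumes a hypothetical manifest of (relative path, sha256, download URL) entries and a flat cache directory keyed by hash; it does not reflect how Docker actually stores layers.

import hashlib
import os
import urllib.request

def provision(manifest, image_root, cache_dir):
    # Materialize an image tree from (rel_path, sha256, url) manifest entries.
    # Files already in the content-addressed cache are hard-linked into place;
    # only the missing ones are downloaded (and then cached for the next image).
    os.makedirs(cache_dir, exist_ok=True)
    for rel_path, sha256, url in manifest:
        cached = os.path.join(cache_dir, sha256)    # cache is keyed by file hash
        target = os.path.join(image_root, rel_path)
        os.makedirs(os.path.dirname(target), exist_ok=True)
        if not os.path.exists(cached):
            # Cache miss: download once, verify, store under its hash.
            data = urllib.request.urlopen(url).read()
            if hashlib.sha256(data).hexdigest() != sha256:
                raise ValueError("hash mismatch for " + rel_path)
            with open(cached, "wb") as f:
                f.write(data)
        # Cache hit (or freshly cached): the image file is just a hard link.
        os.link(cached, target)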

If I already downloaded exactly the same file for an Alpine Linux image, why download it again for an Arch Linux image? The OS package managers should do the same. For example, the apt-get command-line tool should not just download an entire package. It should first resolve the package's files against the cache and only download the remainder.
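
A package manager could apply the same idea by partitioning a package's file manifest into cached and missing sets before touching the network. Again a toy illustration, reusing the hypothetical manifest format and cache layout from the sketch above; this is not how apt actually works.

import os

def split_by_cache(manifest, cache_dir):
    # Partition (rel_path, sha256, url) entries into files we already have
    # and files that still need to be fetched.
    cached, missing = [], []
    for entry in manifest:
        _, sha256, _ = entry
        have = os.path.exists(os.path.join(cache_dir, sha256))
        (cached if have else missing).append(entry)
    return cached, missing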

In the absence of decent image management, and with the existing installed base image essentially being an arbitrary fact, the build/installation process may fail on all kinds of unstated and possibly unmet requirements. This is a real problem:

If the user cannot install the program, he cannot use it either.

Docker more or less solves the problem, but at the cost of an incredible amount of duplication. The next step should be to prevent the system from ever re-downloading a file that is already in the cache.

 
