You wrote that “A basic automated test suite may be of help here.” Totally agree.
We, of course, do have an automated test suite. I can see how it would appear as if we lack even the most basic CI/CD system, and it’s not a good look. I do appreciate the respectful way you offered coaching and guidance without implying that we are a bunch of jokers. Thank you. I think that is a very reasonable suggestion for you to make and question to ask.
Now it is worth explaining the state of the world and what went wrong in this case:
You can see part of the test suite here: sdk/e2e at master · dfinity/sdk · GitHub. We run these tests on both Linux and Darwin.
It’s an uncomfortable fact of life that all test suites have blind spots. This one is no exception. The test suite didn’t catch this defect, so we’ll learn from this and improve. I have work in progress to check for this specific condition: test: check binaries (wip) by ericswanson-dfinity · Pull Request #1868 · dfinity/sdk · GitHub
While I can’t speak to the TLS certificate issue (a different team manages that), I can explain what happened here with dfx start.
As part of open-sourcing the sdk repo, we needed to remove some dependencies on repositories that aren’t yet open-sourced. One of those dependencies is the replica binary and its launcher, ic-starter. Rather than have the sdk’s build system in turn build replica and ic-starter from a private repo, it now downloads prebuilt binaries from download.dfinity.systems.
We use nix for CI and for builds. Nix provides an environment for reproducible builds. It’s quite sophisticated, and also complicated. One of those complications is that in the absence of certain post-processing, dynamically-linked binaries produced by nix will contain links to /nix/store/. For example:
$ ldd replica
linux-vdso.so.1 (0x00007ffed39ea000)
libpthread.so.0 => /nix/store/0c7c96gikmzv87i7lv3vq5s1cmfjd6zf-glibc-2.31-74/lib/libpthread.so.0 (0x00007f3810b84000)
libgcc_s.so.1 => /nix/store/0c7c96gikmzv87i7lv3vq5s1cmfjd6zf-glibc-2.31-74/lib/libgcc_s.so.1 (0x00007f3810b6a000)
librt.so.1 => /nix/store/0c7c96gikmzv87i7lv3vq5s1cmfjd6zf-glibc-2.31-74/lib/librt.so.1 (0x00007f3810b60000)
libm.so.6 => /nix/store/0c7c96gikmzv87i7lv3vq5s1cmfjd6zf-glibc-2.31-74/lib/libm.so.6 (0x00007f3810a1f000)
libdl.so.2 => /nix/store/0c7c96gikmzv87i7lv3vq5s1cmfjd6zf-glibc-2.31-74/lib/libdl.so.2 (0x00007f3810a1a000)
libc.so.6 => /nix/store/0c7c96gikmzv87i7lv3vq5s1cmfjd6zf-glibc-2.31-74/lib/libc.so.6 (0x00007f3810859000)
/nix/store/0c7c96gikmzv87i7lv3vq5s1cmfjd6zf-glibc-2.31-74/lib/ld-linux-x86-64.so.2 => /nix/store/0c7c96gikmzv87i7lv3vq5s1cmfjd6zf-glibc-2.31-74/lib64/ld-linux-x86-64.so.2 (0x00007f3815b43000)
The previous approach of building replica and ic-starter within sdk used a build target that patched the /nix/store links, but the new process that builds the binaries for download.dfinity.systems did not. (It does now.) So all the automated test suites passed (because they run in nix), and our local testing of the release binaries worked (because we have nix installed). On a machine without nix, those /nix/store paths don’t exist, so the dynamic loader and libraries can’t be found.
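To make the failure mode concrete, here is a minimal sketch of the kind of check that would flag such a binary. It is not the check from the work-in-progress PR; the function names are made up for illustration. It assumes a Linux host with ldd on the PATH and output in the format shown above (on Darwin a different tool would be needed).

```python
import subprocess
from typing import List

NIX_STORE_PREFIX = "/nix/store/"


def nix_store_references(ldd_output: str) -> List[str]:
    """Return the resolved paths in ldd output that point into /nix/store."""
    hits = []
    for line in ldd_output.splitlines():
        # Drop the trailing load address, e.g. "(0x00007f3810b84000)".
        entry = line.split("(")[0].strip()
        # Lines look like "name => path" or just "name"; keep the right-hand side.
        path = entry.split("=>")[-1].strip()
        if path.startswith(NIX_STORE_PREFIX):
            hits.append(path)
    return hits


def check_binary(binary_path: str) -> List[str]:
    """Run ldd on a binary (hypothetical helper) and list its /nix/store references."""
    result = subprocess.run(
        ["ldd", binary_path], capture_output=True, text=True, check=True
    )
    return nix_store_references(result.stdout)
```

A release job could call check_binary on each downloaded binary and fail if the returned list is non-empty. Running it outside the nix environment would matter too, since that is where this defect actually showed up.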
TLDR: This is a leftover consequence of open-sourcing the SDK. This did not happen before, but we failed to update some things while open-sourcing the SDK. That is now being addressed.