Reproducibility in Computational Research

From scientific reproducibility to software environment management

Kolen Cheung

Research Software & Analytics Group, University of Exeter

November 26th, 2025

Introduction

What is reproducibility?

According to Hernández and Colom (2023),

  1. Re-runnable (\(R^1\))
  2. Repeatable (\(R^2\))
  3. Reproducible (\(R^3\))
  4. Reusable (\(R^4\))
  5. Replicable (\(R^5\))
Comparison of terminologies. See Plesser (2018).

Goodman                      Claerbout        ACM
---------------------------  ---------------  ---------------
                                              Repeatability
Methods reproducibility      Reproducibility  Replicability
Results reproducibility      Replicability    Reproducibility
Inferential reproducibility

Different kinds of reproducibility

What is a package manager?

“Package manager” can refer to multiple things:

What kinds of package managers?

By scope:

By distribution method:

By platform:

By linking strategy:

Problem statements

Conda

The problem with PyPI packages

From pixell/setup.py at b41248618ce92277a19a4efccadfc3b7403d67f5 · simonsobs/pixell

#!/usr/bin/env python
# -*- coding: utf-8 -*-

"""The setup script."""
from __future__ import print_function
import setuptools
from setuptools import find_packages
from distutils.errors import DistutilsError
from numpy.distutils.core import setup, Extension, build_ext, build_src
import versioneer
import os, sys
import subprocess as sp
import numpy as np

build_ext = build_ext.build_ext
build_src = build_src.build_src


compile_opts = {
    #'extra_compile_args': ['-std=c99','-fopenmp', '-Wno-strict-aliasing', '-g', '-O0', '-fPIC', '-fsanitize=address', '-fsanitize=undefined'],
    'extra_compile_args': ['-std=c99','-fopenmp', '-Wno-strict-aliasing', '-g', '-Ofast', '-fPIC'],
    'extra_f90_compile_args': ['-fopenmp', '-Wno-conversion', '-Wno-tabs', '-fPIC'],
    'f2py_options': ['skip:', 'map_border', 'calc_weights', ':'],
    'extra_link_args': ['-fopenmp', '-g', '-fPIC', '-fno-lto']
    }

# Set compiler options
# Windows
if sys.platform == 'win32':
    raise DistutilsError('Windows is not supported.')
elif sys.platform == 'darwin' or sys.platform == 'linux':
    environment = os.environ

    if not 'CC' in environment:
        environment["CC"] = "gcc"
    
    if not "CXX" in environment:
        environment["CXX"] = "g++"
    
    if not "FC" in environment:
        environment["FC"] = "gfortran"

    # Now, try out our environment!
    c_return = sp.call([environment["CC"], *compile_opts["extra_compile_args"], "scripts/omp_hello.c", "-o", "/tmp/pixell-cc-test"], env=environment)

    if c_return != 0:
        raise EnvironmentError(
            "Your C compiler does not support the following flags, required by pixell: "
            f"{' '.join(compile_opts['extra_compile_args'])}"
            ". Consider setting the value of environment variable CC to a known good gcc install. "
            "The built-in Apple clang does not support OpenMP. Use Homebrew to install either gcc or llvm. "
            f"Current value of $CC is {environment['CC']}.",
        )
    else:
        print(f"C compiler found ({environment['CC']}) and supports OpenMP.")
    
    
    cxx_return = sp.call([environment["CXX"], *compile_opts["extra_compile_args"], "scripts/omp_hello.c", "-o", "/tmp/pixell-cxx-test"], env=environment)

    if cxx_return != 0:
        raise EnvironmentError(
            "Your CXX compiler does not support the following flags, required by pixell: "
            f"{' '.join(compile_opts['extra_compile_args'])}"
            ". Consider setting the value of environment variable CXX to a known good gcc install. "
             "The built-in Apple clang does not support OpenMP. Use Homebrew to install either gcc or llvm. "
            f"Current value of $CXX is {environment['CXX']}.",
        )
    else:
        print(f"CXX compiler found ({environment['CXX']}) and supports OpenMP.")
    
    fc_return = sp.call([environment["FC"], *compile_opts["extra_f90_compile_args"], "scripts/omp_hello.f90", "-o", "/tmp/pixell-fc-test"], env=environment)

    if fc_return != 0:
        raise EnvironmentError(
            "Your Fortran compiler does not support the following flags, required by pixell: "
            f"{' '.join(compile_opts['extra_f90_compile_args'])}"
            ". Consider setting the value of environment variable FC to a known good gfortran install."
            f"Current value of $FC is {environment['FC']}.",
        )
    else:
        print(f"Fortran compiler found ({environment['FC']}) and supports OpenMP.")

    # Why do we remove -fPIC here?
    compile_opts['extra_link_args'] = ['-fopenmp']
else:
    raise EnvironmentError("Unknown platform. Please file an issue on GitHub.")

def pip_install(package):
    import pip
    if hasattr(pip, 'main'):
        pip.main(['install', package])
    else:
        pip._internal.main(['install', package])

with open('README.rst') as readme_file:
    readme = readme_file.read()

with open('HISTORY.rst') as history_file:
    history = history_file.read()

requirements =  ['numpy>=1.20.0',
                 'astropy>=2.0',
                 'setuptools>=39',
                 'h5py>=2.7',
                 'scipy>=1.0',
                 'python_dateutil>=2.7',
                 'cython<3.0.4',
                 'healpy>=1.13',
                 'matplotlib>=2.0',
                 'pyyaml>=5.0',
                 'Pillow>=5.3.0',
                 'pytest-cov>=2.6',
                 'coveralls>=1.5',
                 'pytest>=4.6',
                 'ducc0>=0.31.0']


test_requirements = ['pip>=9.0',
                     'bumpversion>=0.5',
                     'wheel>=0.30',
                     'watchdog>=0.8',
                     'flake8>=3.5',
                     'coverage>=4.5',
                     'Sphinx>=1.7',
                     'twine>=1.10',
                     'numpy>=1.20',
                     'astropy>=2.0',
                     'setuptools>=39.2',
                     'h5py>=2.7,<=2.10',
                     'scipy>=1.0',
                     'python_dateutil>=2.7',
                     'cython<3.0.4',
                     'matplotlib>=2.0',
                     'pyyaml>=5.0',
                     'pytest-cov>=2.6',
                     'coveralls>=1.5',
                     'pytest>=4.6']

# Why are we doing this instead of allowing the environment to do this? We should just use -O3 and -fPIC.
fcflags = os.getenv('FCFLAGS')
if fcflags is None or fcflags.strip() == '':
    fcflags = ['-O3','-fPIC']
    #fcflags = ['-O0','-fPIC', '-fsanitize=address', '-fsanitize=undefined']
else:
    print('User supplied fortran flags: ', fcflags)
    print('These will supersede other optimization flags.')
    fcflags = fcflags.split()
    
compile_opts['extra_f90_compile_args'].extend(fcflags)
compile_opts['extra_f77_compile_args'] = compile_opts['extra_f90_compile_args']

def presrc():
    # Create f90 files for f2py.
    if sp.call('make -C fortran', shell=True) != 0:
        raise DistutilsError('Failure in the fortran source-prep step.')
    
def prebuild():
    # Handle cythonization
    no_cython = sp.call('cython --version',shell=True)
    if no_cython:
        try:
            print("Cython not found. Attempting a conda install first.")
            import conda.cli
            conda.cli.main('conda', 'install',  '-y', 'cython')
        except:
            try:
                print("conda install of cython failed. Attempting a pip install.")
                pip_install("cython")
            except:
                raise DistutilsError('Cython not found and all attempts at installing it failed. User intervention required.')
        
    if sp.call('make -C cython',  shell=True) != 0:
        raise DistutilsError('Failure in the cython pre-build step.')


class CustomBuild(build_ext):
    def run(self):
        print("Running build...")
        prebuild()
        # Then let setuptools do its thing.
        return build_ext.run(self)

class CustomSrc(build_src):
    def run(self):
        print("Running src...")
        presrc()
        # Then let setuptools do its thing.
        return build_src.run(self)

class CustomEggInfo(setuptools.command.egg_info.egg_info):
    def run(self):
        print("Running EggInfo...")
        presrc()
        prebuild()
        return setuptools.command.egg_info.egg_info.run(self)   

# Cascade your overrides here.
cmdclass = {
    'build_ext': CustomBuild,
    'build_src': CustomSrc,
    'egg_info': CustomEggInfo,
}
cmdclass = versioneer.get_cmdclass(cmdclass)


setup(
    author="Simons Observatory Collaboration Analysis Library Task Force",
    author_email='mathewsyriac@gmail.com',
    classifiers=[
        'Development Status :: 2 - Pre-Alpha',
        'Intended Audience :: Developers',
        'License :: OSI Approved :: BSD License',
        'Natural Language :: English',
        "Programming Language :: Python :: 2",
        'Programming Language :: Python :: 2.7',
        'Programming Language :: Python :: 3',
        'Programming Language :: Python :: 3.4',
        'Programming Language :: Python :: 3.5',
        'Programming Language :: Python :: 3.6',
    ],
    description="pixell",
    package_dir={"pixell": "pixell"},
    entry_points={
    },
    ext_modules=[
        Extension('pixell.cmisc',
            sources=['cython/cmisc.c','cython/cmisc_core.c'],
            libraries=['m'],
            include_dirs=[np.get_include()],
            **compile_opts),
        Extension('pixell.distances',
            sources=['cython/distances.c','cython/distances_core.c'],
            libraries=['m'],
            include_dirs=[np.get_include()],
            **compile_opts),
        Extension('pixell.srcsim',
            sources=['cython/srcsim.c','cython/srcsim_core.c'],
            libraries=['m'],
            include_dirs=[np.get_include()],
            **compile_opts),
        Extension('pixell._interpol_32',
            sources=['fortran/interpol_32.f90'],
            **compile_opts),
        Extension('pixell._interpol_64',
            sources=['fortran/interpol_64.f90'],
            **compile_opts),
        Extension('pixell._colorize',
            sources=['fortran/colorize.f90'],
            **compile_opts),
        Extension('pixell._array_ops_32',
            sources=['fortran/array_ops_32.f90'],
            **compile_opts),
        Extension('pixell._array_ops_64',
            sources=['fortran/array_ops_64.f90'],
            **compile_opts),
    ],
    include_dirs = [],
    library_dirs = [],
    install_requires=requirements,
    extras_require = {'fftw':['pyFFTW>=0.10'],'mpi':['mpi4py>=2.0']},
    license="BSD license",
    long_description=readme + '\n\n' + history,
    package_data={'pixell': ['pixell/tests/data/*.fits','pixell/tests/data/*.dat','pixell/tests/data/*.pkl']},
    include_package_data=True,    
    data_files=[('pixell', ['pixell/arial.ttf'])],
    keywords='pixell',
    name='pixell',
    packages=find_packages(),
    test_suite='pixell.tests',
    tests_require=test_requirements,
    url='https://github.com/simonsobs/pixell',
    version=versioneer.get_version(),
    zip_safe=False,
    cmdclass=cmdclass,
    scripts=['scripts/test-pixell']
)

print('\n[setup.py request was successful.]')

The necessity of conda

it really sounds like your needs are so unusual compared to the larger Python community that you’re just better off building your own

From 2012 PyData Workshop Panel Discussion with Guido van Rossum. See Conda: Myths and Misconceptions | Pythonic Perambulations.

The conda solution

Separation of build, host, run-time dependencies

Section  Who needs it?                            Architecture?                  Example
-------  ---------------------------------------  -----------------------------  -----------------------
Build    The compiler machine                     Build Platform (e.g., x86)     cmake, gcc, make
Host     The package being built (linking phase)  Target Platform (e.g., ARM64)  openssl, python, libpng
Run      The final user                           Target Platform (e.g., ARM64)  python, requests, numpy

Multi-platform

  • Linux-x86_64
  • Linux-aarch64
  • Linux-ppc64le
  • MacOSX-x86_64
  • MacOSX-arm64
  • Windows-x86_64
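
As a rough sketch of how these platforms are named in practice, the snippet below maps the current machine to a conda "subdir" string such as linux-aarch64. The function name conda_subdir and the lookup table are illustrative assumptions, not part of conda's API; only the platforms listed above are covered.

```python
import platform

# (system, machine) as reported by the platform module -> conda subdir name.
# Illustrative table only; it covers just the platforms listed on this slide.
_SUBDIRS = {
    ("Linux", "x86_64"): "linux-64",
    ("Linux", "aarch64"): "linux-aarch64",
    ("Linux", "ppc64le"): "linux-ppc64le",
    ("Darwin", "x86_64"): "osx-64",
    ("Darwin", "arm64"): "osx-arm64",
    ("Windows", "AMD64"): "win-64",
}


def conda_subdir(system=None, machine=None):
    """Return the conda subdir for a platform (defaults to the current one)."""
    system = system or platform.system()
    machine = machine or platform.machine()
    try:
        return _SUBDIRS[(system, machine)]
    except KeyError:
        raise ValueError(f"no known conda subdir for {system}/{machine}") from None
```

Calling conda_subdir() with no arguments inspects the running interpreter's platform; passing explicit values is handy when reasoning about a cross-compilation target.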

Language agnostic

  • Python
  • C/C++
  • Fortran
  • R
  • Rust
  • bash
  • juliaup

Conda vs. Mamba + conda-forge

How conda/mamba achieves reproducibility

  • A package distributed via conda-forge carries a strong reproducibility guarantee by design, through:
    • conda/mamba-specific design decisions
    • conda-forge-specific infrastructure and CI design
Separation of build, host, run-time dependencies

Section  Who needs it?                            Architecture?                  Example
-------  ---------------------------------------  -----------------------------  -----------------------
Build    The compiler machine                     Build Platform (e.g., x86)     cmake, gcc, make
Host     The package being built (linking phase)  Target Platform (e.g., ARM64)  openssl, python, libpng
Run      The final user                           Target Platform (e.g., ARM64)  python, requests, numpy

How conda/mamba+conda-forge achieves customizability

Quoting directly from Knowledge Base | conda-forge | community-driven packaging for conda

You can switch your BLAS implementation by doing,

conda install "libblas=*=*_mkl"
conda install "libblas=*=*_openblas"
conda install "libblas=*=*_blis"
conda install "libblas=*=*_accelerate"
conda install "libblas=*=*_newaccelerate"
conda install "libblas=*=*_netlib"
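
The commands above use the name=version=build form of a conda package spec, with * as a wildcard. A minimal sketch of how such a spec selects a build variant follows; it handles only the glob form shown above (real conda specs support more syntax), and the helper names and the candidate build strings in the usage note are hypothetical.

```python
from fnmatch import fnmatchcase


def parse_spec(spec):
    """Split 'name=version=build' into three parts; missing parts become '*'."""
    parts = spec.split("=")
    if len(parts) > 3 or not parts[0]:
        raise ValueError(f"unsupported spec: {spec!r}")
    name, version, build = (parts + ["*", "*"])[:3]
    return name, version, build


def matches(spec, name, version, build):
    """Check one candidate package record against a glob-style spec."""
    s_name, s_version, s_build = parse_spec(spec)
    return (
        s_name == name
        and fnmatchcase(version, s_version)
        and fnmatchcase(build, s_build)
    )
```

For example, matches("libblas=*=*_mkl", "libblas", "3.9.0", "20_linux64_mkl") is true, while the same spec rejects a candidate whose build string ends in _openblas. The BLAS implementation is encoded in the build string, which is why the spec pins the third field rather than the version.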

MPI:

Or even microarch! See Microarchitecture-optimized builds

How to distribute a conda/mamba environment

A conda/mamba environment is designed to be reproducible under a different prefix (see the placeholder trick in "Detailed operations" in the documentation).
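
A simplified sketch of the placeholder idea: packages are built into a long placeholder prefix, and at install time that placeholder is rewritten to the real prefix. In text files a plain substitution suffices; in binaries the replaced string is padded with NUL bytes so that file offsets do not shift. The function names are illustrative and this is not conda's actual implementation.

```python
import re


def replace_prefix_text(data: bytes, placeholder: bytes, new_prefix: bytes) -> bytes:
    """Text files: plain substitution, the file length may change."""
    return data.replace(placeholder, new_prefix)


def replace_prefix_binary(data: bytes, placeholder: bytes, new_prefix: bytes) -> bytes:
    """Binary files: rewrite each NUL-terminated string, keeping its length."""
    if len(new_prefix) > len(placeholder):
        raise ValueError("new prefix is longer than the placeholder")
    # A string that contains the placeholder, up to and including its NUL.
    pattern = re.compile(re.escape(placeholder) + rb"[^\0]*\0")

    def pad(match):
        original = match.group()
        replaced = original.replace(placeholder, new_prefix)
        # Pad with NULs so every later byte keeps its original offset.
        return replaced + b"\0" * (len(original) - len(replaced))

    return pattern.sub(pad, data)
```

This is why the placeholder is made deliberately long: the real prefix must fit inside the space the placeholder occupied.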

Example: how to package a pure Python package that’s already on PyPI

grayskull pypi pytest

See more in conda/grayskull: Grayskull - Recipe generator for Conda.

Example: how to package complex scientific software

From ducc0-feedstock/recipe/meta.yaml at f114fdaa2eb46afb5dee0c4c92b366a506f3475a · conda-forge/ducc0-feedstock

{% set name = "ducc0" %}
{% set version = "0.39.1" %}

package:
  name: {{ name|lower }}
  version: {{ version }}

source:
  url: https://pypi.org/packages/source/{{ name[0] }}/{{ name }}/ducc0-{{ version }}.tar.gz
  sha256: 38eda188733d43c3602726e28bc9928d3117cdc23b5c1e7d89fdc26004a1d847

build:
  number: 1
  skip: true  # [py<=36]
  script_env: DUCC0_OPTIMIZATION=portable
  script: {{ PYTHON }} -m pip install . -vv

requirements:
  build:
    - python                                 # [build_platform != target_platform]
    - cross-python_{{ target_platform }}     # [build_platform != target_platform]
    - pybind11                               # [build_platform != target_platform]
    - nanobind
    - make
    - cmake
    - {{ compiler('c') }}
    - {{ stdlib("c") }}
    - {{ compiler('cxx') }}
  host:
    - pip
    - pybind11
    - nanobind
    - python
    - make
    - cmake
    - scikit-build
    - scikit-build-core
  run:
    - numpy >=1.17.0
    - python

test:
  imports:
    - ducc0
  commands:
    - pip check
  requires:
    - pip

about:
  home: https://gitlab.mpcdf.mpg.de/mtr/ducc
  summary: Distinctly useful code collection
  license: GPL-2.0-or-later
  license_file: LICENSE

extra:
  recipe-maintainers:
    - ickc
    - MarkWieczorek
    - mreineck
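
The trailing comments such as # [build_platform != target_platform] in the recipe above are selectors: a line is kept only when the bracketed expression is true for the current build. A rough sketch of that filtering step, assuming selectors are plain Python expressions over a small set of variables; the function name is illustrative and this is not conda-build's own code.

```python
import re

# A recipe line ending in "# [expression]".
_SELECTOR = re.compile(r"^(?P<body>.*?)\s*#\s*\[(?P<expr>[^\]]+)\]\s*$")


def apply_selectors(lines, variables):
    """Keep unconditional lines, and conditional lines whose selector is true."""
    kept = []
    for line in lines:
        match = _SELECTOR.match(line)
        if match is None:
            kept.append(line)  # no selector: always kept
        # eval is acceptable here only because recipes are trusted input.
        elif eval(match["expr"], {"__builtins__": {}}, dict(variables)):
            kept.append(match["body"])
    return kept
```

With build_platform and target_platform both set to linux-64, the cross-python line is dropped; when they differ (cross-compilation), it is kept in the build requirements.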

Example: how to package complex scientific software (cont’d)

From toast-feedstock/recipe/meta.yaml at df31bdbcae76b144ab89a8150ab8c43fb9a61d54 · conda-forge/toast-feedstock


{% set version = "2.3.14" %}
{% set sha256 = "924912213af3bbacd622b9318bd6d79055c4d57f58c2da486f4b3f62a12466f1" %}

{% set build = 2 %}
{% if blas_impl == 'openblas' %}
{% set build = build + 100 %}
{% endif %}

{% set blas_prefix = blas_impl %}

package:
  name: toast
  version: {{ version }}

source:
  url: https://github.com/hpc4cmb/toast/archive/{{ version }}.tar.gz
  sha256: {{ sha256 }}

build:
  skip: True  # [py<37]
  skip: True  # [win]
  number: {{ build }}
  string: "{{ blas_prefix }}_py{{ py }}h{{ PKG_HASH }}_{{ build }}"
  run_exports:
    - toast * {{ blas_prefix }}_*

requirements:
  build:
    - {{ compiler('c') }}
    - {{ compiler('cxx') }}
    - cmake
    - make                 # [unix]
    - llvm-openmp >=4.0.1  # [osx]
  host:
    - llvm-openmp >=4.0.1  # [osx]
    - python
    - fftw  # [blas_impl == 'openblas']
    - openblas * openmp_*  # [blas_impl == 'openblas']
    - mkl-devel  # [blas_impl == 'mkl']
    - liblapack
    - suitesparse
    - libaatm
  run:
    - llvm-openmp >=4.0.1  # [osx]
    - python
    - {{ pin_compatible("fftw") }}  # [blas_impl == 'openblas']
    - openblas * openmp_*  # [blas_impl == 'openblas']
    - {{ pin_compatible("mkl") }}  # [blas_impl == 'mkl']
    - {{ pin_compatible("liblapack") }}
    - {{ pin_compatible("suitesparse") }}
    - {{ pin_compatible("libaatm") }}
    - numpy
    - scipy
    - astropy
    - healpy
    - h5py
    - ephem

test:
  files:
    - run_test.sh
  commands:
    - ./run_test.sh

about:
  home: https://github.com/hpc4cmb/toast
  license: BSD-2-Clause
  license_family: BSD
  license_file: LICENSE
  summary: 'Time Ordered Astrophysics Scalable Tools'
  description: |
    TOAST is a software framework for simulating and processing timestream data
    collected by microwave telescopes.
  dev_url: https://github.com/hpc4cmb/toast

extra:
  recipe-maintainers:
    - tskisner

Example: how to reproduce a set of system software on HPC

From envoy/conda/system_linux-aarch64.yml at main · ickc/envoy

channels:
- conda-forge
dependencies:
- bash
- bat
- bat-extras
- bottom
- btop
- bzip2
- clang-format
- coreutils
- curl
- difftastic
- diffutils
- direnv
- dua-cli
- dust
- exiftool
- fastfetch
- fd-find
- ffmpeg
- file
- findutils
- fzf
- gawk
- gh
- ghostscript
- git
- git-delta
- gnu-units
- go-shfmt
- go-task
- graphviz
- grep
- gzip
- htop
- hyperfine
- imagemagick
- inetutils
- joshuto
- jq
- juliaup
- libarchive
- lsdeluxe
- make
- mediainfo
- mosh
- nano
- nvtop
- onefetch
- openssh
- pandoc
- parallel
- patch
- pdf2svg
- pixi
- poppler
- prettier
- ripgrep
- rsync
- sed
- shellcheck
- starship
- tar
- tmux
- tokei
- tree
- unzip
- uv
- wget
- which
- zellij
- zsh
- zstd
name: system

Example: how to distribute a complex scientific software environment on a heterogeneous HPC cluster

SO:UK Data Centre example:

Pixi

Introduction

Example: PyAutoLens

From python-autojax/pixi.toml at c8a71287dd42752e95e06d3339eb44bc472c5d99 · ickc/python-autojax

[project]
authors = ["Kolen Cheung <christian.kolen@gmail.com>"]
channels = ["conda-forge"]
description = "DiRAC: revealing the nature of dark matter with the James Webb space telescope and JAX"
name = "autojax"
platforms = ["osx-arm64", "linux-64", "linux-aarch64"]
version = "0.1.0"

[tasks]

[dependencies]
python = ">=3.9"
numpy = "*"
numba = "*"
jax = "*"
# build
poetry = "*"
# extras
bump-my-version = "*"
# tests
coverage = "*"
pytest = "*"
pytest-benchmark = "*"
# docs
furo = "*"
linkify-it-py = "*"
myst-parser = "*"
sphinx = "*"
sphinx-autobuild = "*"
pygal = ">=3.0.5,<4"
defopt = ">=6.4.0,<7"
ipykernel = ">=6.29.5,<7"

[pypi-dependencies]
sphinx-last-updated-by-git = "*"
sphinxcontrib-apidoc = ">=0.5.0,<1"
autojax = { path = ".", editable = true}

[feature.cuda]
system-requirements = {cuda = "12"}
platforms = ["linux-64", "linux-aarch64"]

[feature.cuda.target.linux-64.dependencies]
jaxlib = { version = "*", build = "*cuda*" }

[environments]
cuda = ["cuda"]

Example: BrownianSpinDynamics

From brownian-spin-dynamics/pixi.toml at b2ba42450fe0049c79eb27464eb2c8d1d16c87e6 · UniExeterRSE/brownian-spin-dynamics

[workspace]
channels = ["conda-forge"]
platforms = ["win-64", "linux-64", "linux-aarch64", "osx-64", "osx-arm64"]

[tasks]
# bootstrap
bootstrap-julia = { cmd = "juliaup add $JULIAUP_CHANNEL", description = "install julia version specified by JULIAUP_CHANNEL" }

# resolve
resolve = { depends-on = ["resolve-root", "resolve-library", "resolve-docs"], description = "resolve environments" }
resolve-root = { cmd = "julia --project=. -e 'using Pkg; Pkg.develop(PackageSpec(path=\"BrownianSpinDynamics\")); Pkg.resolve()'", description = "resolve root environment" }
resolve-library = { cmd = "julia --project=BrownianSpinDynamics -e 'using Pkg; Pkg.resolve()'", description = "resolve library environment" }
resolve-docs = { cmd = "julia --project=BrownianSpinDynamics/docs -e 'using Pkg; Pkg.develop(PackageSpec(path=\"BrownianSpinDynamics\")); Pkg.resolve()'", description = "resolve docs environment" }

# update
update = { depends-on = ["update-root", "update-library", "update-docs"], description = "update environments" }
update-root = { cmd = "julia --project=. -e 'using Pkg; Pkg.develop(PackageSpec(path=\"BrownianSpinDynamics\")); Pkg.update()'", description = "update root environment" }
update-library = { cmd = "julia --project=BrownianSpinDynamics -e 'using Pkg; Pkg.update()'", description = "update library environment" }
update-docs = { cmd = "julia --project=BrownianSpinDynamics/docs -e 'using Pkg; Pkg.develop(PackageSpec(path=\"BrownianSpinDynamics\")); Pkg.update()'", description = "update docs environment" }
update-precompile = { cmd = "julia --project=BrownianSpinDynamics -e 'using Pkg; Pkg.precompile()'", description = "update precompile environment" }

# precompile
precompile = { depends-on = ["precompile-root", "precompile-library", "precompile-docs"], description = "precompile environments" }
precompile-root = { cmd="julia --project=. -e 'using Pkg; Pkg.develop(PackageSpec(path=\"BrownianSpinDynamics\")); Pkg.instantiate(); Pkg.precompile()'", description = "precompile root environment" }
precompile-library = { cmd="julia --project=BrownianSpinDynamics -e 'using Pkg; Pkg.instantiate(); Pkg.precompile()'", description = "precompile library environment" }
precompile-docs = { cmd="julia --project=BrownianSpinDynamics/docs -e 'using Pkg; Pkg.develop(PackageSpec(path=\"BrownianSpinDynamics\")); Pkg.instantiate(); Pkg.precompile()'", description = "precompile docs environment" }

# test
test = { cmd = "julia --project=BrownianSpinDynamics integration_tests/runtests_all.jl", description = "run all tests" }
test-unit = { cmd = "julia --project=BrownianSpinDynamics -e 'using Pkg; Pkg.test(test_args=ARGS, allow_reresolve = false)' {{ case }}", args = [{ arg = "case", default = ""}], description = "run unit tests" }
test-integration = { cmd = "julia --project=BrownianSpinDynamics integration_tests/runtests.jl {{ case }}", args = [{ arg = "case", default = ""}], description = "run integration tests" }

# linting
lint-aqua = { cmd = "julia --project=. scripts/lint_package.jl", description = "lint the library with Aqua.jl"}

# benchmarks
bench = { cmd = "julia --project=. BrownianSpinDynamics/bench/bench.jl {{ case }}", args = [{ arg = "case", default = ""}], description = "run benchmarks" }

# format
format = { depends-on = ["pre-sync", "julia-format", "post-sync"], description = "format everything" }
pre-sync = { cmd = "jupytext --sync 'tutorials/*.ipynb'", description = "Synchronize ipynb,jl pairs using jupytext" }
post-sync = { cmd = "jupytext --sync 'tutorials/*.ipynb'", description = "Synchronize ipynb,jl pairs using jupytext" }
julia-format = { cmd = "julia -e 'using JuliaFormatter; format(\".\")'", description = "format all files using JuliaFormatter"}
format-library = { cmd = "julia -e 'using JuliaFormatter; format(\"BrownianSpinDynamics\")'", description = "format BrownianSpinDynamics using JuliaFormatter" }

# docs
docs-build = { cmd = "julia --project=BrownianSpinDynamics/docs BrownianSpinDynamics/docs/make.jl", description = "build docs" }
docs-serve = { cmd = "julia --project=BrownianSpinDynamics/docs BrownianSpinDynamics/docs/serve.jl", description = "serve docs" }

# install
install-kernel = { cmd = "julia --project=. scripts/install-julia-brownian-spin-dynamics.jl --overwrite", description = "install Jupyter kernel for BrownianSpinDynamics" }

# dev
find-version = { cmd = "scripts/find-version.sh {{ pkg }}", args = ["pkg"], description = "find version of a package from Manifest.toml" }

[dependencies]
juliaup = ">=1.17.21,<2"
jupytext = ">=1.17.2,<2"

[activation.env]
JULIA_PROJECT = "@."
JULIAUP_CHANNEL = "1.11.7"

# this put the .julia directory typically available in ~/.julia
# to the conda prefix that the pixi environment resides in
[target.unix.activation.env]
JULIA_DEPOT_PATH = "$CONDA_PREFIX/.julia"
JULIAUP_DEPOT_PATH = "$CONDA_PREFIX/.julia"
[target.win.activation.env]
JULIA_DEPOT_PATH = "%CONDA_PREFIX%\\.julia"
JULIAUP_DEPOT_PATH = "%CONDA_PREFIX%\\.julia"
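
Tasks such as format and resolve above are composed from other tasks through depends-on. The sketch below expands such a task into a flat run order, assuming dependencies run in the order listed and before the task itself, with each task run once. That ordering is an assumption for illustration; pixi's actual scheduler may behave differently.

```python
def execution_order(tasks, name):
    """Expand a task into a run order: dependencies first, in listed order."""
    order, active = [], set()

    def visit(task):
        if task in order:
            return  # already scheduled once
        if task in active:
            raise ValueError(f"dependency cycle through {task!r}")
        active.add(task)
        for dep in tasks.get(task, []):
            visit(dep)
        active.discard(task)
        order.append(task)

    visit(name)
    return order


# depends-on lists copied from the manifest above.
TASKS = {
    "resolve": ["resolve-root", "resolve-library", "resolve-docs"],
    "format": ["pre-sync", "julia-format", "post-sync"],
}
```

Under these assumptions, execution_order(TASKS, "format") yields pre-sync, julia-format, post-sync, and finally format itself.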

Nix

Why a functional package manager?

If we represent the lifecycle of reproducibility from source code and data to result via functions:

  1. \(c_i = C(s_i, g_i(s_j))\): Compilation takes source code and the dependency graph to compiled binaries

  2. \(e = G(c_i)\): environment constructed from the whole dependency Graph of all precompiled binaries

  3. \(p_i = f_i(e, d_j)\): filters or functions that are an individual part of your scientific workflow, executing in the environment and acting on your data to produce data products.

  4. \(r = W(e, f_i, d_j)\): a Workflow that chains all these to obtain the final result.

Then it becomes clear that keeping (3) a pure function is the job of the programmer, and keeping (4) a pure function is the job of the workflow manager, so that the result is reproducible given the same inputs.

The remaining tasks, (1) and (2), are the job of a package manager.

What if we can make them pure functions? That’s basically what a functional package manager does.
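
As a toy illustration of step (1) as a pure function: if a build output is identified by a hash of everything that went into it (its source, its build recipe, and the identities of its dependencies), then the same inputs always map to the same output, and any change upstream propagates downstream. The store_key function and its hashing layout are illustrative assumptions, not Nix's actual scheme.

```python
import hashlib


def store_key(name, source, dep_keys, recipe):
    """Derive an output identity from all build inputs (a toy model)."""
    h = hashlib.sha256()
    parts = (name.encode(), source, recipe.encode(),
             *sorted(k.encode() for k in dep_keys))
    for part in parts:
        # Length-prefix each part so distinct inputs cannot collide by
        # concatenation.
        h.update(len(part).to_bytes(8, "big"))
        h.update(part)
    return f"{h.hexdigest()[:32]}-{name}"


# Same inputs give the same key; patching a dependency changes the
# key of everything built on top of it.
zlib = store_key("zlib", b"zlib source", [], "make install")
app = store_key("app", b"app source", [zlib], "make install")
zlib_patched = store_key("zlib", b"zlib source + patch", [], "make install")
app_rebuilt = store_key("app", b"app source", [zlib_patched], "make install")
```

Here app and app_rebuilt differ even though the application's own source is unchanged, which mirrors \(c_i = C(s_i, g_i(s_j))\): the output depends on the whole dependency graph, not only on \(s_i\).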

Impurity in building software

What could make it impure?

Solutions to purity

On top of these, a functional package manager guarantees that building software is a pure function, so the build is always reproducible.

(In contrast, despite all these efforts, non-functional package managers cannot guarantee purity, and hence cannot guarantee reproducibility.)

Nix (and also Guix, another functional package manager inspired by Nix) has various levels of integration with Software Heritage to automatically mitigate against link rot.
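
One way to picture why a pinned hash helps against link rot: once a source is identified by its hash rather than by its URL, any mirror or archive that serves bytes with the matching hash is an acceptable substitute. A minimal sketch follows; fetch_verified and its fetch parameter are hypothetical, and the real Nix and Guix integrations with Software Heritage are more involved.

```python
import hashlib


def fetch_verified(urls, expected_sha256, fetch):
    """Return the first payload whose SHA-256 matches the pinned hash.

    `fetch` is any callable mapping a URL to bytes and raising OSError on
    failure (urllib's URLError is a subclass of OSError).
    """
    errors = []
    for url in urls:
        try:
            data = fetch(url)
        except OSError as exc:
            errors.append(f"{url}: {exc}")
            continue  # dead link: try the next location
        digest = hashlib.sha256(data).hexdigest()
        if digest == expected_sha256:
            return data
        errors.append(f"{url}: hash mismatch ({digest})")
    raise RuntimeError("no source matched the pinned hash:\n" + "\n".join(errors))
```

The sha256 fields in the conda-forge recipes shown earlier play the same role: they pin the content, so the URL becomes just one of possibly several places to find it.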

Spack

Docker

Misc.

Reflections on trusting trust

Thompson (1984)

Prefix, RPATH, and all that

References

Hernández, José Armando, and Miguel Colom. 2023. “Repeatability, Reproducibility, Replicability, Reusability (4R) in Journals’ Policies and Software/Data Management in Scientific Publications: A Survey, Discussion, and Perspectives.” arXiv:2312.11028. Preprint, arXiv, December 18. https://doi.org/10.48550/arXiv.2312.11028.
Plesser, Hans E. 2018. “Reproducibility Vs. Replicability: A Brief History of a Confused Terminology.” Frontiers in Neuroinformatics 11 (January): 76. https://doi.org/10.3389/fninf.2017.00076.
Thompson, Ken. 1984. “Reflections on Trusting Trust.” Communications of the ACM 27 (8): 761–63. https://doi.org/10.1145/358198.358210.