Technical Report: Nix on Windows

Nix is a package manager for Unix-like systems, which supports the installation of multiple versions of an application and which can run in parallel to whatever package manager a system uses natively (apt, pacman, yum, …). Like other package managers, it needs a compiler toolchain to generate packages. Mingw-w64 and Cygwin provide free open source toolchains on Windows, each with its own trade-offs and benefits: Cygwin offers a POSIX emulation needed by many applications, but binaries generated with Cygwin require the Cygwin runtime library; Mingw-w64 generates native Windows binaries without providing a POSIX emulation. In this report we describe how the Nix package manager realizes the installation of multiple versions of an application, what challenges Cygwin poses to these mechanisms, how we addressed them, what should be done next, and how Mingw-w64 could be integrated to support management of native Windows applications.

1   Cross-platform deployment with Nix

Cross-platform deployment means exactly reproducing an application with its dependencies across different target platforms, like GNU/Linux, Mac OS X or Windows. An application (or, more specifically, a release of an application) is built and tested with specific versions of dependencies, like dynamic libraries, interpreter language modules, databases and command line tools.

For the deployment of multiple applications or releases of one application, it is essential that multiple versions of dependencies can be installed in parallel without interference. To minimize disk and RAM usage it is desirable that identical dependencies are shared between applications.

The Nix Package Manager solves these problems; the remainder of this section gives a brief introduction to its features and mechanisms.

1.1   Nix — The Purely Functional Package Manager

Nix is a package manager for Unix-like systems, which can run in parallel to whatever package manager the system uses natively (apt, pacman, yum, …). As of version 1.7 it was available for GNU/Linux and Mac OS X (Darwin). NixOS is a Linux distribution that uses it natively as its sole package manager.

Nix is a purely functional package manager in that it treats the files of packages like values of functions in purely functional programming languages such as Haskell: The functions describe how packages are built and have no side effects, and the packages they produce never change after they have been built.

A collection of such build functions is provided as the Nix Packages Collection (Nixpkgs). At its core Nixpkgs has a set of functions that provide a compiler toolchain and a minimal set of system tools across platforms, described in its own section: The Standard Environment.

Nix supports installation of multiple versions of one application and even multiple, differently compiled builds of one version; sticking to the Filesystem Hierarchy Standard (FHS) would lead to collisions. Instead, Nix installs the files generated by its build functions into a database, described in its own section: The Nix Store.

1.2   The Nix Store

Nix uses functions in a purely functional sense to describe its packages. The values of these functions are called outputs and are stored as subdirectories of the Nix Store (/nix/store). An output in the store holds the content of a package that would usually be installed to /usr. In that regard, the store resembles /opt with the possibility to have multiple versions and even multiple builds of the same version installed in parallel. This is achieved by using a cryptographic hash as a prefix for the store output.

% ls -l /nix/store/w08118q0kp26qdcz5cdl2rx6chghawam-hello-2.9
dr-xr-xr-x 2 root root 4096 Jan  1  1970 bin/
dr-xr-xr-x 5 root root 4096 Jan  1  1970 share/

% ls -l /nix/store/w08118q0kp26qdcz5cdl2rx6chghawam-hello-2.9/bin
-r-xr-xr-x 2 root root 30845 Jan  1  1970 hello*

The cryptographic hash used as a prefix is calculated over the function describing the build of a package as well as all its build-time dependencies, represented by similar functions. Variations in any of these will lead to a different hash and therefore a different store location.
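The effect of hashing a build description together with its dependencies can be illustrated with a small shell sketch. This is a toy model, not the hashing scheme Nix actually uses: the function name toy_store_path, the use of sha256sum and the recipe strings are assumptions made purely for illustration.

```shell
#!/bin/sh
# Toy model of input-addressed store paths (NOT Nix's real algorithm).
# usage: toy_store_path NAME RECIPE [DEPENDENCY_STORE_PATH...]
toy_store_path() (
    name=$1; recipe=$2; shift 2
    # Hash the build recipe together with the store paths of all
    # build-time dependencies, one item per line.
    hash=$(printf '%s\n' "$recipe" "$@" | sha256sum | cut -c1-32)
    printf '/nix/store/%s-%s\n' "$hash" "$name"
)

# Two hypothetical builds of the same glibc version with different recipes.
glibc_a=$(toy_store_path glibc-2.19 './configure && make')
glibc_b=$(toy_store_path glibc-2.19 './configure --enable-foo && make')

# Same recipe for hello, but a different glibc build as dependency:
# the two resulting store paths differ.
toy_store_path hello-2.9 'make install' "$glibc_a"
toy_store_path hello-2.9 'make install' "$glibc_b"
```

Because the dependency's store path is part of the hashed input, a change anywhere in the dependency graph propagates to every package built on top of it.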

To make sure that a binary in the store will load the correct dynamic libraries, Nix hardcodes absolute store paths of dependencies into built binaries.

% ldd =hello
    linux-vdso.so.1 (0x00007fffbc1fe000)
    libc.so.6 => /nix/store/i11d0d4015p0vbdnjq7lb509v9pwp049-glibc-2.19/lib/libc.so.6 (0x00007f55d1263000)
    /nix/store/i11d0d4015p0vbdnjq7lb509v9pwp049-glibc-2.19/lib/ld-linux-x86-64.so.2 (0x00007f55d1610000)

Based on such and other references of store locations, Nix performs automatic runtime dependency detection and supports copying a package and its dependencies from one machine to another: If a store path of package A occurs in any file of package B, package A is a runtime dependency of B.
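The detection rule can be approximated with a plain text search. The following is a simplified sketch, not the scanner built into Nix; the function name references_store_path is hypothetical, and it assumes store paths of the form /nix/store/<hash>-<name> and a grep that supports the -r and -a options (as GNU grep does).

```shell
#!/bin/sh
# Simplified reference scan: does any file below OUTPUT_DIR contain the
# hash part of CANDIDATE_STORE_PATH?
# usage: references_store_path OUTPUT_DIR CANDIDATE_STORE_PATH
# exit status 0 means "CANDIDATE is a runtime dependency of OUTPUT_DIR".
references_store_path() (
    output_dir=$1
    candidate=$2
    # /nix/store/<hash>-<name>  ->  <hash>
    base=${candidate##*/}
    hash=${base%%-*}
    # -r: recurse, -q: quiet, -a: treat binaries as text, -F: fixed string
    grep -rqaF -e "$hash" "$output_dir"
)
```

Called as references_store_path "$out" /nix/store/i11d0d4015p0vbdnjq7lb509v9pwp049-glibc-2.19, it succeeds for the hello binary shown above, because the glibc store path is embedded in it.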

1.3   The Standard Environment

One essential component for building packages with Nix is the standard environment, mostly consisting of:

  • A working compiler toolchain (gcc, libc, binutils) and
  • a minimal set of system tools (e.g. curl, coreutils, tar, bzip2).

Each platform (Linux/Darwin) has a unique standard environment. It builds the base of Nixpkgs and is provided by the stdenv package, which resolves to different implementations depending on the Nix system variable.

Most Linux distributions including NixOS are handled by stdenvLinux, which provides a so-called pure standard environment. It has no external dependencies (impurities) to programs or libraries provided by the host system outside of the Nix Store.

In contrast to that, stdenvNative is another standard environment, which creates wrapper shell scripts in the store to use the native compiler toolchain provided by the host system. Its dependency on programs and libraries outside of the store make it impure. We will see implications of this later on.

Darwin uses a mixture of these two approaches, where the native C library is used but apart from that the toolchain is built by Nix.

2   Compiling on and for Windows

Our target platforms are:

  • Windows Vista (32bit/64bit),
  • Windows 7 (32bit/64bit),
  • Windows 8 and 8.1 (32bit/64bit),
  • Windows Server 2008 and 2008 R2 (32bit/64bit),
  • Windows Server 2012 and 2012 R2 (64bit).

We investigated two projects that provide free open source 32bit and 64bit toolchains for Windows: Mingw-w64 and Cygwin.

2.1   Mingw-w64

"Mingw-w64 brings free software toolchains to Windows". It is a collection of headers and provides pre-packaged toolchains to compile native Windows binaries. While more and more projects are using it, it does not provide full POSIX API functionality and it is up to the projects and their build systems to support Mingw-w64.

One hard requirement for us was POSIX API functionality of the target system. We outline in Native Windows Binaries how Mingw-w64 could be put to use.

2.2   Cygwin

"Cygwin is a large collection of GNU and Open Source tools which provide functionality similar to a Linux distribution on Windows." In contrast to Mingw-w64, it has a DLL that "provides substantial POSIX API functionality" on Windows at its core. The emulated POSIX API functionality has some drawbacks, explained in Challenges on Cygwin,

Running 32bit Cygwin on a 64bit Windows works well with the exception of Windows Server 2012 and its R2, which seem to be unsuitable for WOW64.

As a consequence Windows Server 2012 (R2) can be used to build and run 64bit Cygwin binaries, but not 32bit.

3   Challenges on Cygwin

The combination of Windows, Cygwin, and Nix introduces challenges caused by Windows’ Portable Executable Format (PE). PE is to Windows, what ELF is to Linux. It is the file format used for .dll and .exe files on Windows, and it has some peculiar properties that cause major problems in Cygwin environments.

Further, Cygwin goes to great lengths to ease execution of PE binaries from within Cygwin’s Unix-like environment.

3.1   Finding DLLs

Windows has no simple, reliable way to load DLLs from absolute filesystem locations, which is inherently needed to make Nix’ concept of the store work.

Windows executables and shared libraries reference shared libraries they depend on only by name, not by full path as on some other platforms. This means that the linker’s -rpath option (passed through GCC as -Wl,-rpath) is useless on Windows. When C:\foo\bar.exe requests library baz, Windows follows certain search rules to find and load a file named baz.dll. Usually it looks first in the directory of the executable itself (C:\foo\) and, if unsuccessful, it will at some point also try all directories contained in the PATH environment variable.
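The two lookup steps named above can be modelled in a few lines of shell. This is a deliberately simplified sketch: the real Windows search order contains further locations (such as system directories) that are left out here, the function name find_dll is hypothetical, and the search path is colon-separated as inside Cygwin.

```shell
#!/bin/sh
# Simplified model of the DLL lookup described above: first the directory
# of the executable, then every directory in a colon-separated search path.
# (System directories and other Windows rules are deliberately left out.)
# usage: find_dll EXE_DIR DLL_NAME SEARCH_PATH
find_dll() (
    exe_dir=$1
    dll=$2
    search_path=$3
    # Disable globbing so that directory names are taken literally when
    # the search path is split on ':'.
    set -f
    IFS=:
    for dir in "$exe_dir" $search_path; do
        if [ -f "$dir/$dll" ]; then
            printf '%s\n' "$dir/$dll"
            exit 0
        fi
    done
    exit 1
)
```

Which baz.dll is found depends only on the directory of the executable and on the order of the search path, not on anything recorded in the executable itself.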

This property makes library versioning particularly difficult to achieve and leads to the infamous DLL Hell.

3.2   DLL Relocation and Cygwin’s fork

Windows has no equivalent to the POSIX fork() system call, and Cygwin goes to great lengths to implement it on top of Windows’ CreateProcess. Cygwin’s fork() implementation requires that all DLLs are loaded to unique pre-defined addresses by all processes.

When Windows runs a program that loads multiple DLLs to the same, intersecting or otherwise occupied address, the libraries have to be relocated to different random addresses. This is a common procedure and every DLL provides a .reloc section that describes how to update address references in the DLL code. Apart from minor load time penalties these DLL relocations do not interfere in any way with normal Windows programs.

However, Cygwin programs are not simple Windows programs. Cygwin implements a POSIX layer in Windows, and some POSIX calls do not map easily to the Win32 API. Cygwin’s implementation of fork() is not a lightweight copy-on-write approach like its equivalents on, e.g., Linux. Instead Cygwin creates a new instance of the running program and copies all data and state from the parent to the new child process. For this to work it is necessary that the forked binaries and all their libraries have been loaded to the same non-colliding virtual addresses in the parent as well as in the child process. If Windows DLL relocation happens, the fork() will most likely fail with an error like "address space already occupied".
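Whether two images collide is a simple interval test on their base addresses and sizes. The sketch below only illustrates that condition; the function name ranges_overlap and the example addresses are assumptions and are not taken from Cygwin or Windows.

```shell
#!/bin/sh
# Two images collide if their address ranges [base, base+size) intersect.
# usage: ranges_overlap BASE1 SIZE1 BASE2 SIZE2   (exit status 0 = overlap)
ranges_overlap() {
    [ $(( $1 < $3 + $4 && $3 < $1 + $2 )) -eq 1 ]
}

# Hypothetical example: the second image starts inside the first one.
if ranges_overlap 0x10000 0x20000 0x20000 0x10000; then
    echo "collision: one of the images has to be relocated"
fi
```

Any pair of DLLs for which this test succeeds forces a relocation in at least one process, which is exactly the situation Cygwin’s fork() cannot tolerate.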

3.3   Automagic .exe Suffixing

Binary executables in Windows have a .exe filename extension. As this would break many programs and shell scripts, Cygwin implements automatic filename globbing when it comes to handling executables. If there is a file named foo.exe in the current working directory, calling ./foo will actually run foo.exe. Similarly, running vim foo would open foo.exe in vim for editing. This default behavior saves many programs from needing special patches or being broken on Windows.

While this makes it tricky to create a file foo when a file foo.exe already exists, it can also be exploited: If there is a foo.exe in the current directory and at the same time an executable shell script called foo, then calling ./foo will run the shell script. The shell script effectively shadows the executable and can act as a transparent wrapper script, a feature that we exploit to improve Cygwin’s Nix compatibility.

4   Reproducible Cygwin Environments

Cygwin has no fixed release cycles and its update procedure resembles a rolling release distribution. The main feature of Nix is reproducibility. Nix on Cygwin, at least so far, is impure in that it uses components outside the store provided by Cygwin. We need a way to make Cygwin environments reproducible.

Our requirements are:

  • one Windows machine to host an arbitrary number of Cygwin environments for 32bit and 64bit,
  • control which Cygwin package versions are installed so we can reliably reproduce Cygwin environments,
  • Nix packaged for Cygwin to be installable like any other Cygwin package,
  • installation of Cygwin packages via command line,
  • and ssh access to each of the environments.

4.1   Custom Cygwin Mirror

By setting up a Cygwin mirror, we can freeze the state of all Cygwin packages and subsequently control which packages are updated and when. It also allows us to add our own packages to the repository. Cygwin mirrors use the Perl script genini to create index files containing metadata of all available packages. There is one for 32bit (x86/setup.ini) and one for 64bit (x86_64/setup.ini).

We use three folders: upstream tracks an upstream mirror, custom contains our own packages and cygwin merges the former two and forms our Cygwin mirror.

upstream

is synced via rsync with an upstream mirror (around 60GB).

for x in x86 x86_64; do
  rsync -vazP ${RSYNC_MIRROR}/$x/ upstream/$x/
done

By omitting the --delete flag and putting the upstream/*/setup.* index files under version control we can follow updates while being able to go back in case something breaks.

custom

will contain our custom packages and we use genini to create its setup.ini index files.

for x in x86 x86_64; do
  perl genini --arch=$x --recursive --output $x/setup.ini $x/setup.ini $x/release
done

It is important to call genini in the directory containing x86 and x86_64 with relative paths like above as otherwise paths in the generated files will be incorrect.

cygwin

is the root of the custom Cygwin mirror. It is a symlink farm that merges upstream and custom. As genini is very slow, we create the index files by concatenating while omitting what would be an in-between header. Cygwin also looks for a bzip2 version named setup.bz2.

for x in x86 x86_64; do
  cat upstream/$x/setup.ini > cygwin/$x/setup.ini
  tail -n+7 custom/$x/setup.ini >> cygwin/$x/setup.ini
  bzip2 -c cygwin/$x/setup.ini > cygwin/$x/setup.bz2
done
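The tail -n+7 call above relies on the header of the custom index being exactly six lines long. As a hedged alternative, the header can be skipped by content instead of by line count, assuming that every package entry in setup.ini starts with a line of the form "@ <package>". The function name merge_setup_ini is hypothetical.

```shell
#!/bin/sh
# Merge two setup.ini files: keep UPSTREAM completely, append CUSTOM
# without its header. Assumption: the header ends where the first
# package entry (a line starting with "@ ") begins.
# usage: merge_setup_ini UPSTREAM_INI CUSTOM_INI > MERGED_INI
merge_setup_ini() {
    cat "$1"
    # Print nothing until the first "@ " line has been seen, then
    # print that line and everything after it.
    awk 'seen || /^@ / { seen = 1; print }' "$2"
}
```

Inside the loop above it would be used as merge_setup_ini upstream/$x/setup.ini custom/$x/setup.ini > cygwin/$x/setup.ini, followed by the same bzip2 call.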

4.2   Installation and Cygwin Package Management

To bootstrap a Cygwin environment we use a batch file calling Cygwin’s setup.exe. Next to it are a skel directory to be used as Cygwin’s /etc/skel and the 32bit and 64bit installers.

bootstrap-cygwin.bat
setup-x86_64.exe
setup-x86.exe
skel/

The bootstrap script creates the mintty.bat batch file, which starts a shell in the new Cygwin environment by executing its mintty.exe as login shell (parameter -) and setting the Cygwin directory as window title.

start "%~dp0" "%~dp0bin\mintty.exe" -

Especially with ssh access (see Remote Access), it helps a lot if shell prompts indicate the Cygwin directory they belong to, as set in .bashrc.

export PS1='\[\e]0;\w\a\]\n(cygroot='$(cygpath -w -a /)') \[\e[32m\]\u@\h \[\e[33m\]\w\[\e[0m\]\n\$ '

For Cygwin package management we found it handy to put the Cygwin installer for the architecture (setup-x86.exe or setup-x86_64.exe) as setup.exe into the Cygwin root folder and to create a wrapper setup.bat with some default parameters. This is handled by bootstrap-cygwin.bat (see above).

$ ls -1 /setup.*
setup.bat
setup.exe

$ cat /setup.bat
"%~dp0setup.exe" -q -X -O -s http://custom.cygwin/mirror -R "%~dp0" -l "c:\cygwin-cache" %*

This enables easy execution from Windows Explorer. To have it available from a shell within the Cygwin environment we created two aliases: one for querying available packages and the other to simply run /setup.bat.

if test "$(uname -m)" = "x86_64"; then
  cygarch=x86_64
else
  cygarch=x86
fi
export cygarch

alias cygsetup='/setup.bat'
alias cygquery='cat /cygdrive/c/cygwin-cache/<MIRDIR>/${cygarch}/setup.ini |grep -i --color'

4.3   Remote Access

To register SSH as a Windows service, Cygwin has the ssh-host-config configure script. We enhanced it to accept a port number and a service name, so that multiple Cygwin installations can provide simultaneous SSH access. Our patches for this and for cygserver (see below) have been submitted and integrated upstream.

$ ssh-host-config --port 32001 --name cygwin-x86-32001-sshd
$ cygrunsrv --start cygwin-x86-32001-sshd

Cygserver provides Cygwin applications with services which require security arbitration or which need to persist while no other Cygwin application is running. Similar to the SSH services there is a script (cygserver-config) to configure the Windows service, which with our patches now supports an explicit name to allow for multiple cygserver services per Windows machine.

$ cygserver-config --name cygwin-x86-32001-cygserver
$ cygrunsrv --start cygwin-x86-32001-cygserver

4.4   DO NOT COPY CYGWINS

We found that Windows Explorer does not preserve permissions correctly when copying a Cygwin installation, which effectively led to the umask being ignored.

5   Nix on Cygwin

In this section we describe fixes to the Nix package manager itself to compile on Cygwin and hooks to create an impure standard environment for Cygwin, and give an overview of fixes we made to Nixpkgs.

5.1   Preparations

There are a couple of things to ensure or be aware of before actually starting with Nix on Cygwin:

git and unix line endings

Before using git in any way, especially to check out Nixpkgs, git needs to be told not to be smart about line endings. Our bootstrap handles this via a skeleton file.

$ git config --global core.eol lf

If for some reason you do not want to set this globally for your Cygwin user, at the very least you have to set it for checkouts of Nixpkgs.

XXX: Is this still needed? If so, there should be a bug/feature request for Nix/Nixpkgs

allowBroken

At the time of writing, Nix considers its support for Cygwin to be broken and has to be told that you are aware of that. Our bootstrap handles this via a skeleton file.

$ cd; mkdir -p .nixpkgs; echo "{ allowBroken = true; }" >> .nixpkgs/config.nix

XXX: Is this still needed?

5.2   Package, Build and Install Nix

To install Nix on Cygwin we decided to use Cygwin’s cygport source packaging tool. Cygport is inspired by Gentoo’s Portage package managing system and borrows some of its ideas and syntax for defining packages. The only item we have to provide is a nix.cygport file, available in ternaris/cygports:

NAME="nix"
VERSION=1.8
RELEASE=1
CATEGORY="Devel"
SUMMARY="The purely functional package manager"
DESCRIPTION="Nix is a powerful package manager for Linux and other Unix systems that makes package management reliable and reproducible. It provides atomic upgrades and rollbacks, side-by-side installation of multiple versions of a package, multi-user package management and easy setup of build environments."
HOMEPAGE="http://nixos.org/nix/"
SRC_URI="http://nixos.org/releases/nix/nix-1.8/nix-1.8.tar.xz"


DEPEND="bison curl flex gcc-core gcc-g++ libbz2-devel libsqlite3-devel make openssl-devel patch perl perl-DBD-SQLite perl-WWW-Curl pkg-config"
REQUIRES="$DEPEND"


src_compile() {
  cd ${S}
  ./configure --prefix=/usr --sysconfdir=/etc --disable-init-state
  ./config.status --quiet --file Makefile.config
  sed -i Makefile.config \
      -e "s,^libdir = /usr/lib,libdir = /usr/bin," \
      -e "s,^perllibdir = .*$,perllibdir = $(perl -E 'use Config; print $Config{vendorarch};'),"
  make
}


src_install() {
  cd ${S}

  # XXX: maybe we can configure already to the $pkgdir locations, but it feels that up there should be the after-install-locations. currently, this leads to a couple of things being relinked on make install.
  sed -i Makefile.config \
      -e "s,dir = /usr,dir = ${pkgdir}/usr," \
      -e "s,prefix = /usr,prefix = ${pkgdir}/usr,"

  cyginstall profiledir=${pkgdir}/etc/profile.d
}

Some patches to Nix’ sources were needed. They are outlined below and have been merged upstream:

  1. GCC complained that several standard BSD or POSIX functions, e.g. srandom, are not defined.
  2. Libraries (DLLs) were built with a .so file extension instead of the Windows standard .dll extension, and they were installed to the ${prefix}/lib directory. This is fine on Linux and Darwin; on Windows, however, DLL files have to be installed to the ${prefix}/bin directory.
  3. The Nix Store Perl extension did not compile due to missing Perl symbols.

5.2.1   Fix Missing Library Function Definitions

GCC complained about missing definitions of standard functions in some source files. This happened mainly for functions usually found in stdlib.h, which is not always included explicitly by the Nix sources, as it is pulled in by other headers on Linux/Darwin. Adding the explicit #include statements where appropriate partially solves the problem.

However, some functions are still not found, even with all headers in place. As it turns out, the culprit is the -std=c++0x GCC flag, which sets the language standard and implicitly defines __STRICT_ANSI__, which in turn prevents the Cygwin headers from declaring some vital functions. Unsetting it by adding -U__STRICT_ANSI__ to the CFLAGS variable concludes this fix.

5.2.2   Fix Library File Extension and Directory

This requires a straightforward patch to the make files to conditionally set file extension and installation directory of shared libraries.

5.2.3   Fix Perl Extension

Nix’ Perl extensions use internal Perl functions found in the corresponding perl.dll. On Windows we cannot create libraries with unresolved symbols, which makes it necessary to link the extension module against perl.dll. To address this issue we discover the path of perl.dll on Windows and add it to LDFLAGS of the store target in the corresponding Makefile.

5.3   Standard Environment Hooks

After experimenting with various approaches we decided to use stdenvNative. We outline the steps towards a pure environment in Pure Standard Environment.

At some point in the past Nixpkgs had support for the Cygwin platform. Over time this port became dysfunctional and it was subsequently removed, leaving only a few code remnants in the Nixpkgs tree. One of the first of these remnants we encountered was a hook for stdenvNative that set some platform specific options and most notably disabled shared libraries by default. For our new Cygwin support we decided to reevaluate all previous changes in the spirit of staying as close as possible to Nix on NixOS for easier maintenance in the future.

In some aspects, however, Cygwin is fundamentally different from other Nix platforms. As described in Challenges on Cygwin, the Windows PE format causes two major challenges:

  1. DLLs cannot be reliably referenced by absolute paths
  2. DLLs need to have unique pre-defined addresses

We address these challenges by hooks to Nixpkgs’ stdenvNative.

5.3.1   DLL Search Path Wrappers

As described in Finding DLLs, Windows looks for DLLs in the PATH environment variable and does not support referencing them by absolute path, as needed by Nix.

To emulate library references by absolute path we have to make sure that, on program execution, the directories of all DLLs that are direct or indirect dependencies are present in the PATH variable. To this end we added a postFixup hook to stdenvNative that creates wrapper shell scripts for all executables of a store output, exploiting Cygwin’s Automagic .exe Suffixing.

postFixupHooks+=(_cygwinWrapExesToFindDlls)

_cygwinWrapExesToFindDlls() {
    find $out -type l | while read LINK; do
        TARGET="$(readlink "${LINK}")"

        # fix all non .exe links that link explicitly to a .exe
        if [[ ${TARGET} == *.exe ]] && [[ ${LINK} != *.exe ]]; then
            mv "${LINK}" "${LINK}.exe"
            LINK="${LINK}.exe"
        fi

        # generate complementary filenames
        if [[ ${LINK} == *.exe ]]; then
            _LINK="${LINK%.exe}"
            _TARGET="${TARGET%.exe}"
        else
            _LINK="${LINK}.exe"
            _TARGET="${TARGET}.exe"
        fi

        # check if we should create the complementary link
        DOLINK=1
        if [[ ${_TARGET} == *.exe ]]; then
            # the canonical target has to be a .exe
            CTARGET="$(readlink -f "${LINK}")"
            if [[ ${CTARGET} != *.exe ]]; then
                CTARGET="${CTARGET}.exe"
            fi

            if [ ! -e "${CTARGET}" ]; then
                unset DOLINK
            fi
        fi

        if [ -e "${_LINK}" ]; then
            # complementary link seems to exist
            # but could be cygwin smoke and mirrors
            INO=$(stat -c%i "${LINK}")
            _INO=$(stat -c%i "${_LINK}")
            if [ "${INO}" -ne "${_INO}" ]; then
                unset DOLINK
            fi
        fi

        # create complementary link
        if [ -n "${DOLINK}" ]; then
            ln -s "${_TARGET}" "${_LINK}.tmp"
            mv "${_LINK}.tmp" "${_LINK}"
        fi
    done

    find $out -type f -name "*.exe" | while read EXE; do
        WRAPPER="${EXE%.exe}"
        if [ -e "${WRAPPER}" ]; then
            # check if really exists or cygwin smoke and mirrors
            INO=$(stat -c%i "${EXE}")
            _INO=$(stat -c%i "${WRAPPER}")
            if [ "${INO}" -ne "${_INO}" ]; then
                continue
            fi
        fi

        mv "${EXE}" "${EXE}.tmp"

        cat >"${WRAPPER}" <<EOF
#!/bin/sh
export PATH=$_PATH${_PATH:+:}\${PATH}
exec "\$0.exe" "\$@"
EOF
        chmod +x "${WRAPPER}"
        mv "${EXE}.tmp" "${EXE}"
    done
}

Creating these scripts can be tricky, as writing to the file foo in the presence of foo.exe in the same directory will overwrite the contents of the latter executable. Symlinks to executables need special attention, too. If a program provides foo.exe and a symlink bar.exe->foo.exe, the hook has to create a wrapper foo and also a complementary symlink bar->foo, otherwise programs whose behavior depends on the executable name (e.g. busybox) will break. Some build systems create constructs like bar->foo.exe, which our hook sanitizes.
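The core of a generated wrapper can be tried out on any POSIX system, where foo and foo.exe are simply two different files. The following is a reduced sketch, not the hook itself: the function name make_dll_wrapper is hypothetical, the directories to prepend are passed as an explicit argument, and symlink handling as well as the temporary rename that protects foo.exe on Cygwin are left out.

```shell
#!/bin/sh
# Create a wrapper next to EXE (which must end in .exe) that prepends
# DLL_DIRS (colon-separated) to PATH and then executes the real binary.
# usage: make_dll_wrapper EXE DLL_DIRS
make_dll_wrapper() (
    exe=$1
    dll_dirs=$2
    wrapper=${exe%.exe}
    # $dll_dirs is expanded now, at build time; the escaped \$ variables
    # are expanded later, when the wrapper runs.
    cat > "$wrapper" <<EOF
#!/bin/sh
export PATH="$dll_dirs\${PATH:+:}\${PATH}"
exec "\$0.exe" "\$@"
EOF
    chmod +x "$wrapper"
)
```

Because the wrapper executes "$0.exe", it keeps working when it is reached through a complementary symlink only if a matching .exe exists under that name as well, which is why the hook creates both.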

5.3.2   Runtime Dependency Non-Detection

In order to detect store location A to be a runtime dependency of B, the Nix store relies on A’s path being mentioned in B. On Linux and Darwin, binaries reference their dependencies by absolute paths and therefore enable runtime detection. On Windows this is not possible as outlined in Finding DLLs. As a quick and dirty solution we turn all build-time dependencies of a package into runtime dependencies as well.

# On cygwin, automatic runtime dependency detection does not work
# because the binaries do not contain absolute references to store
# locations (yet)
postFixupHooks+=(_cygwinAllBuildInputsAsRuntimeDep)

_cygwinAllBuildInputsAsRuntimeDep() {
    if [ -n "$buildInputs" ]; then
        mkdir -p "$out/nix-support"
        echo "$buildInputs" >> "$out/nix-support/cygwin-buildinputs-as-runtime-deps"
    fi

    if [ -n "$nativeBuildInputs" ]; then
        mkdir -p "$out/nix-support"
        echo "$nativeBuildInputs" >> "$out/nix-support/cygwin-buildinputs-as-runtime-deps"
    fi
}

This works well, but results in superfluous runtime dependencies. See Proper Runtime Dependency Detection for a better proposal.

5.3.3   DLL Rebasing to Enable Cygwin’s fork()

As described in DLL Relocation and Cygwin’s fork, DLL relocation is lethal to Cygwin’s fork(). Key to a working fork() call is to avoid address collisions when loading DLLs by minimizing the probability of relocation.

Windows DLLs can define an ImageBase in the IMAGE_OPTIONAL_HEADER section of their PE header. Its value defines the DLL’s preferred base address when it is loaded into the virtual address space of a running process.

Cygwin makes sure that all DLLs it installs are rebased to unique addresses using ImageBase and that the address space beginning at ImageBase is large enough to hold the complete library. Rebasing happens automatically whenever Cygwin’s setup.exe installs or updates a package.

Binutils’ ld linker accepts an option called --enable-auto-image-base for DLL creation. When enabled, ld automatically generates a pseudo-unique value for ImageBase based on a hash of the library name. The rationale behind this approach is that differently named DLLs will end up at different base addresses. It does not account for which parts of the address space are claimed by other libraries and collisions are still possible albeit somewhat less likely to occur. Cygwin’s global solution does a better job ensuring non-overlapping address spaces.

However, Cygwin is not aware of libraries compiled and installed via Nix. For choosing and setting ImageBase in newly compiled DLLs we wrote two more postFixup hooks, specific to 32bit and 64bit:

postFixupHooks+=(_cygwinFixAutoImageBase)

_cygwinFixAutoImageBase() {
    find $out -name "*.dll" | while read DLL; do
        if [ -f /etc/rebasenix.nextbase ]; then
            NEXTBASE="$(</etc/rebasenix.nextbase)"
        fi
        NEXTBASE=${NEXTBASE:-0x200000000}

        REBASE=(`/bin/rebase -i $DLL`)
        BASE=${REBASE[2]}
        SIZE=${REBASE[4]}
        SKIP=$(((($SIZE>>16)+1)<<16))

        echo "REBASE FIX: $DLL $BASE -> $NEXTBASE"
        /bin/rebase -b $NEXTBASE $DLL
        NEXTBASE="0x`printf %x $(($NEXTBASE+$SKIP))`"

        echo $NEXTBASE > /etc/rebasenix.nextbase
    done
}

The hook keeps a global base address counter, whose initial value is set slightly above the address space taken by Cygwin’s most important library, cygwin1.dll, and is therefore specific to 32bit/64bit. For each new library the postFixup hook rebases the library to the address saved in the counter and increments the counter by the size of the library, rounded up to the next multiple of 64kB. This ensures unique non-overlapping areas for all DLLs in the store, without the need for additional or global rebasing scripts.

This approach has some drawbacks: address space cannot be reclaimed by the hook when software is uninstalled from the store, and at least on 32bit Windows the hook will run out of valid addresses sooner rather than later. Also, on Windows, one program cannot load multiple shared libraries with the same name at the same time. It would therefore be possible to rebase different versions of the same library to the same base address and thus make better use of the available address space.
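The counter arithmetic can be checked in isolation, without calling rebase. The sketch below reuses the formula from the hook; the function name next_image_base and the image sizes are made up for illustration, while the start value 0x200000000 is the default from the hook.

```shell
#!/bin/sh
# Address bookkeeping of the rebase hook, without calling rebase itself.
# usage: next_image_base CURRENT_BASE IMAGE_SIZE   (both may be hex)
next_image_base() {
    # Same formula as in the hook: reserve the image size rounded up to
    # 64 kB granularity. Note that a size that already is a multiple of
    # 64 kB still gets one extra 64 kB block.
    skip=$(( (($2 >> 16) + 1) << 16 ))
    printf '0x%x\n' $(( $1 + skip ))
}

next_image_base 0x200000000 0x5e0000   # prints 0x2005f0000
next_image_base 0x2005f0000 0x12345    # prints 0x200610000
```

Each DLL thus consumes at least 64 kB of address space that is never handed back, which makes the exhaustion problem on 32bit concrete.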

See Store Level DLL Rebasing for a better proposal.

5.4   Nixpkgs fixes — A Beginning

Nixpkgs used to have support for Cygwin, which was dropped at some point. Before starting our work we flagged occurrences of old Cygwin specifics to be investigated. For 4 packages we could remove the old Cygwin specifics; 49 such todo markers remain in packages that we have not investigated so far.

All our patches to Nixpkgs were discussed as pull requests and have meanwhile been merged. Here we’d like to give a brief overview.

  • In three cases (gnugrep, gnum4, gnumake) we decided to disable failing tests, but only because they were already disabled for at least one other platform.
  • In case a newer version of a package had Cygwin support, we decided to upgrade for all platforms (boehmgc, gettext).
  • In case of bash we had to downgrade from 4.2 to 4.1, the latest version with support for Cygwin. We did so only for Cygwin, the other platforms remain on 4.2. It looks like Cygwin support will be back with bash 4.3.
  • We encountered 4 packages (cpio, findutils, gnutar, gzip) that ship an older version of gnulib, which gets confused by the Cygwin headers. Small package-specific preConfigure hooks solve the issues.
  • For gawk, gettext, ncurses, and zip it was sufficient to specify configure flags specific to Cygwin.
  • For 12 packages we use Cygwin specific patches taken from Cygwin itself: bash, boehmgc, coreutils, gettext, libffi, libiconv, mysql, openssl, pkgconfig, popt, python27, w3m.
  • Nixpkgs provides a substituteInPlace command which is implemented using bash’s pattern substitution. At least with bash-4.1 we experienced this to be seriously slow and propose to use sed instead, along with an alias sed-i that maps to sed -i"" for gnused and sed -i "" for bsdsed. Alternatively, we could require gnused to be installed.

And there is the category of weird fixes, most notably asciidoc:

+  makeFlags = if stdenv.isCygwin then "DESTDIR=/." else null;

Without this the destination directory would be prefixed with //, which thanks to Cygwin becomes \\ and is subsequently interpreted as a network path.

6   Summary and Outlook

There is a cygport to install Nix on Cygwin, and an impure standard environment allows installation of some packages contained in the Nix Packages Collection. In addition to investigating and fixing the remaining packages, we propose:

  • DLL rebasing in the store to enable binary caches for Cygwin,
  • detection of real runtime dependencies,
  • a pure standard environment to eliminate interference with the Cygwin host system,
  • reduction of environment variable length to remain within Cygwin’s limits,
  • investigation of batch files as an alternative or supplement to shebang wrappers to enable execution from outside of Cygwin, and
  • Mingw-w64 as an alternative or supplement to Cygwin to create native Windows binaries.

6.1   Store Level DLL Rebasing

Currently, we use a simple, non-intrusive postInstall hook that rebases all DLLs as part of the build process (see DLL Rebasing to Enable Cygwin’s fork()). Therefore, rebasing is only run when building a package locally, but will not affect packages that are installed through binary caches.

This renders distribution of precompiled packages effectively useless at the moment. Additionally, address space cannot be reused when libraries are removed from the store.

A better solution would track addition and removal of packages to and from the store and perform rebasing (and defragmentation) as part of a package’s build. For that to work, the Nix store would need support for platform-specific hooks, which would be executed after adding and before removing software from the store. This solution would move the current code to a more appropriate place and make all steps reversible.

We believe that such a solution would greatly improve the long-term health of a Nix on Cygwin installation: running out of address space would become much less likely, and distributed binaries would work out of the box on every machine, as the Nix Store would take care of rebasing regardless of a package’s origin.

6.2   Proper Runtime Dependency Detection

Currently, we simply turn all build dependencies into runtime dependencies (see Runtime Dependency Non-Detection), which results in superfluous runtime dependencies.

The DLL Search Path Wrappers already mention dependencies by absolute path. As a side effect these dependencies would already be detected. However, a proper solution should not rely on side effects of another hook.

A proper runtime dependency detection should scan all binaries (.dll and .exe) of a package and map the referenced library names to absolute store locations, based on the currently set PATH variable.
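A minimal sketch of such a scan is shown below. It is an illustration under stated assumptions, not the implementation used in our port: the function names are hypothetical, and it assumes that objdump from binutils is installed, whose -p output for PE files contains one "DLL Name:" line per imported library. Lookup is by exact file name, so case differences between the import table and the file system are not handled.

```shell
#!/bin/sh
# resolve_dll NAME SEARCH_PATH   (hypothetical helper)
# Print the first match for NAME in the colon-separated SEARCH_PATH;
# return 1 if no directory contains it.
resolve_dll() {
    name=$1
    old_ifs=$IFS
    IFS=:
    for dir in $2; do
        if [ -f "$dir/$name" ]; then
            IFS=$old_ifs
            printf '%s\n' "$dir/$name"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}

# list_imported_dlls FILE   (hypothetical helper)
# Assumption: binutils' objdump is available and FILE is a PE binary.
list_imported_dlls() {
    objdump -p "$1" | sed -n 's/^[[:space:]]*DLL Name: //p'
}

# runtime_deps FILE   (hypothetical helper)
# Resolve every imported DLL of FILE against the current PATH.
# DLLs that are not found (for example Windows system DLLs outside
# PATH) are silently skipped.
runtime_deps() {
    list_imported_dlls "$1" | while IFS= read -r dll; do
        resolve_dll "$dll" "$PATH" || true
    done
}
```

Running runtime_deps over all .dll and .exe files of a package and keeping only results below /nix/store would yield the set of store paths to register as runtime dependencies.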

6.3   Pure Standard Environment

Some of the package issues we encountered were caused by build systems discovering libraries belonging to the host Cygwin environment.

We would like to investigate a pure standard environment, where everything — including the Cygwin DLL — is served from the Nix Store.

This environment could enforce purity much like stdenvLinux does on NixOS. With an almost empty / there would be nothing to find for badly behaved packages. Such a "self-hosted" solution would not require the explicit installation of Cygwin anymore, but could be distributed as a small tarball consisting of not much more than a /nix directory.

6.4   Environment Variable Length

Windows limits the maximum size of user-defined environment variables to 32767 characters.

Nix’s way of handling Python packages, for example, stores each dependency individually in the Nix store and assembles the Python environment via the PYTHONPATH variable set in shebang wrapper scripts. Assuming an average of 100 characters per Python package store path, this starts to become an issue with more than 300 dependencies.
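The estimate can be reproduced with a little shell arithmetic. The sketch below uses hypothetical function names and assumes that entries are joined by a single ':' separator, so that n entries of a given average length need n times that length plus n - 1 characters.

```shell
#!/bin/sh
# Limit for a single environment variable, as stated above.
MAX_ENV_CHARS=32767

# max_entries AVG_LEN   (hypothetical helper)
# Largest n with n*AVG_LEN + (n-1) <= MAX_ENV_CHARS,
# i.e. n <= (MAX_ENV_CHARS + 1) / (AVG_LEN + 1).
max_entries() {
    echo $(( (MAX_ENV_CHARS + 1) / ($1 + 1) ))
}

# pythonpath_headroom   (hypothetical helper)
# Characters left before the current PYTHONPATH reaches the limit.
pythonpath_headroom() {
    p=${PYTHONPATH-}
    echo $(( MAX_ENV_CHARS - ${#p} ))
}
```

With an average of 100 characters per entry, max_entries 100 prints 324, which matches the rough threshold of 300 dependencies once slightly longer paths are taken into account.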

Different forms of packaging that do not assemble Python environments via PYTHONPATH need to be investigated and likewise for other languages.

Up until Windows Server 2003 and Windows XP there was an additional limit on the whole block of all environment variables, which does not exist anymore in newer versions.

6.5   Batch Files Instead of Shebang Wrappers

Nix in general, and Nix on Cygwin in particular, makes heavy use of shebang wrappers to set environment variables for an application, e.g.:

#!/bin/sh
export PATH=/nix/store/3hshx30bkahwyxs2y7dkiilzf72ggva7-cc-native/bin:${PATH}
exec "$0.exe" "$@"

Cygwin supports shebang well, but it is not easily possible to execute these wrappers from outside Cygwin. To support this, generation of Windows batch files could be an alternative. However, if the Windows command prompt is involved in their execution, this would create an even smaller limit of 8191 characters (see Environment Variable Length).
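As an illustration only, a generator for such batch files could be sketched as follows. The function name, its arguments and the naming of the wrapped executable are assumptions made for this example; the directory is expected in Windows notation, which on Cygwin can be obtained with cygpath -w.

```shell
#!/bin/sh
# emit_batch_wrapper WIN_DIR EXE_NAME   (hypothetical helper)
# Print a batch file that prepends WIN_DIR (a Windows-style path) to
# PATH and then runs EXE_NAME from the directory of the batch file,
# passing all arguments through. Lines end in CRLF.
emit_batch_wrapper() {
    printf '@echo off\r\n'
    printf 'set "PATH=%s;%%PATH%%"\r\n' "$1"
    printf '"%%~dp0%s" %%*\r\n' "$2"
}

# Possible use on Cygwin (store path taken from the example above;
# the executable name foo.exe is made up for illustration):
#   dir=$(cygpath -w /nix/store/3hshx30bkahwyxs2y7dkiilzf72ggva7-cc-native/bin)
#   emit_batch_wrapper "$dir" foo.exe > foo.bat
```

In the generated file, %~dp0 expands to the drive and directory of the batch file itself and %* forwards all arguments, mirroring "$0.exe" and "$@" in the shell wrapper.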

6.6   Native Windows Binaries

As outlined in Compiling on and for Windows we evaluated two projects providing free open source toolchains: Cygwin and Mingw-w64.

While our initial focus was to provide a full POSIX environment based on Cygwin, it would be very interesting to attempt the integration of Mingw-w64 to create native Windows binaries, which can be run outside of Cygwin.

Build Environment    Toolchain    Runtime Environment
Cygwin               Cygwin       Cygwin
Cygwin               Mingw-w64    native
Mingw-w64            Mingw-w64    native

Our current solution is represented by the first row: Cygwin is used as build environment and toolchain to create binaries that need Cygwin also as runtime environment.

As the next step, we could use Nix in a Cygwin build environment with Mingw-w64 toolchain to build native Windows programs, which could then be run natively. A Cygwin build environment could support this native mode as well as the Cygwin runtime.

We could then perhaps go even further and build Nix itself with Mingw-w64. This would require creating alternative code paths for quite a lot of POSIX-specific Nix code. However, it would remove some of the complexities of the current build environment and may lead to the creation of the ultimate Windows build tool.
