moby: cbfs driver causes 'hcsshim::PrepareLayer - failed failed in Win32: Incorrect function. (0x1)'

Description

Note: There is a long, two-year-old discussion of this issue, with lots of “me too” comments, at docker/for-win#3884. Re-opening here based on feedback from @stephen-turner.

Starting with Windows 10 1903, having certain ‘drive’ software installed with certain versions of CBFS drivers causes Docker builds to fail with Windows containers, even with the most trivial Dockerfile.

hcsshim::PrepareLayer - failed failed in Win32: Incorrect function. (0x1)

Steps to reproduce the issue:

  1. Install Box Drive (no need to log in, just install)
  2. Create a simple Dockerfile
FROM mcr.microsoft.com/windows/nanoserver:1809
SHELL ["powershell", "-Command", "$ErrorActionPreference = 'Stop'; $ProgressPreference = 'SilentlyContinue';"]
  3. docker build .

Describe the results you received:

➜  docker build .
Sending build context to Docker daemon  2.048kB
Step 1/2 : FROM mcr.microsoft.com/windows/nanoserver:1809
 ---> f524b7260f3c
Step 2/2 : SHELL ["powershell", "-Command", "$ErrorActionPreference = 'Stop'; $ProgressPreference = 'SilentlyContinue';"]
 ---> Running in 389b868934e6
hcsshim::PrepareLayer - failed failed in Win32: Incorrect function. (0x1)

Uninstalling Box Drive makes the issue go away.

Describe the results you expected: A successful build.

Additional information you deem important (e.g. issue happens only occasionally): The company that makes the CBFS Connect driver used by Box and others has apparently released a version with a workaround for this bug. Installing software that contains this updated driver (such as SFTP Drive v2) actually resolves the issue. However, Box has not released a fix, and insists that the issue is a regression in Docker and/or Windows itself.

Output of docker version:

Client:
 Cloud integration: 1.0.17
 Version:           20.10.8
 API version:       1.41
 Go version:        go1.16.6
 Git commit:        3967b7d
 Built:             Fri Jul 30 19:58:50 2021
 OS/Arch:           windows/amd64
 Context:           default
 Experimental:      true

Server: Docker Engine - Community
 Engine:
  Version:          20.10.8
  API version:      1.41 (minimum version 1.24)
  Go version:       go1.16.6
  Git commit:       75249d8
  Built:            Fri Jul 30 19:54:29 2021
  OS/Arch:          windows/amd64
  Experimental:     false

Output of docker info:

Client:
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Build with BuildKit (Docker Inc., v0.6.1-docker)
  compose: Docker Compose (Docker Inc., v2.0.0-rc.1)
  scan: Docker Scan (Docker Inc., v0.8.0)

Server:
 Containers: 11
  Running: 0
  Paused: 0
  Stopped: 11
 Images: 801
 Server Version: 20.10.8
 Storage Driver: windowsfilter
  Windows:
 Logging Driver: json-file
 Plugins:
  Volume: local
  Network: ics internal l2bridge l2tunnel nat null overlay private transparent
  Log: awslogs etwlogs fluentd gcplogs gelf json-file local logentries splunk syslog
 Swarm: inactive
 Default Isolation: hyperv
 Kernel Version: 10.0 19043 (19041.1.amd64fre.vb_release.191206-1406)
 Operating System: Windows 10 Enterprise Version 2009 (OS Build 19043.985)
 OSType: windows
 Architecture: x86_64
 CPUs: 12
 Total Memory: 31.81GiB
 Name: LT-NWE1-T-US
 ID: KJ3K:R3GY:FLRD:BAJT:4KZW:LFB3:KPOO:JMA7:Y5Y7:UYWE:MPYU:2M7U
 Docker Root Dir: C:\ProgramData\Docker
 Debug Mode: true
  File Descriptors: -1
  Goroutines: 26
  System Time: 2021-08-31T14:21:10.5302762-04:00
  EventsListeners: 1
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false
 Product License: Community Engine

Additional environment details (AWS, VirtualBox, physical, etc.): physical

About this issue

  • State: closed
  • Created 3 years ago
  • Reactions: 6
  • Comments: 18 (10 by maintainers)

Most upvoted comments

In that case, I’m not sure why the old CBFS driver didn’t cause similar problems elsewhere.

Yes, curious about that as well; from the description it seems like there’s a chance it may fail / cause errors in different scenarios as well.

referencing the same IRP_MN_MOUNT_VOLUME documentation, so perhaps it is HCS that got narrower in what it expects to receive from the mount call, and what it considers a failure, or perhaps a larger logic change in how HCS mounts volumes.

perhaps @katiewasnothere or @kevpar can help shine a light on this (or know who would be able to); and/or help improve the Windows documentation if it’s ambiguous.

That’s the problem - someone in MS was trying to fit multiple different pieces of information into one variable. It would be more logical to have a separate parameter for an indicator of the volume recognition and reserve the result code for possible errors.

I’m not familiar with the overall Windows APIs in this respect, but it’s quite a common pattern to use sentinel errors for control flow. In this case (deducing from the above), I think the intent is:

  • “no error”: the driver successfully handled the request, so no need to continue with other drivers
  • STATUS_UNRECOGNIZED_VOLUME: (sentinel error): the driver did not handle the request, continue with next driver
  • any other (non-sentinel) error: the driver was able to handle the request, but something failed. The error returned may be specific to this driver.
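
The three cases listed above can be sketched as a dispatch loop. This is only an illustration of the sentinel pattern in Python, not how the Windows I/O manager is implemented. The names mount_volume, foreign_fs, broken_fs and native_fs are hypothetical, and STATUS_UNSUCCESSFUL is a stand-in for any non-sentinel error (the thread does not say which status the old CBFS driver returned).

```python
from typing import Callable, Iterable, Optional, Tuple

STATUS_SUCCESS = 0x00000000
STATUS_UNRECOGNIZED_VOLUME = 0xC000014F  # sentinel: "this volume is not mine"
STATUS_UNSUCCESSFUL = 0xC0000001         # stand-in for any real failure

# A mount handler takes a volume name and returns an NTSTATUS-like integer.
MountHandler = Callable[[str], int]


def mount_volume(
    volume: str, drivers: Iterable[Tuple[str, MountHandler]]
) -> Tuple[Optional[str], int]:
    """Offer the volume to each driver in turn, following the three cases."""
    for name, handler in drivers:
        status = handler(volume)
        if status == STATUS_SUCCESS:
            # Case 1: the driver handled the request, stop here.
            return name, status
        if status == STATUS_UNRECOGNIZED_VOLUME:
            # Case 2: sentinel, the driver declined, try the next one.
            continue
        # Case 3: any other error ends the search and is reported as a failure.
        return name, status
    # Nobody claimed the volume.
    return None, STATUS_UNRECOGNIZED_VOLUME


# Hypothetical drivers used only to exercise the three branches.
def foreign_fs(volume: str) -> int:
    return STATUS_UNRECOGNIZED_VOLUME  # declines politely


def broken_fs(volume: str) -> int:
    return STATUS_UNSUCCESSFUL  # declines with a generic error


def native_fs(volume: str) -> int:
    return STATUS_SUCCESS


print(mount_volume("vol", [("foreign", foreign_fs), ("native", native_fs)]))
print(mount_volume("vol", [("broken", broken_fs), ("native", native_fs)]))
```

Under this reading, a driver that answers with a generic error for a volume it does not own stops the search before the driver that could have mounted it is ever asked.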

I can confirm that it replicates just with wclayer on my Windows 10 20H2 (19042.1165) desktop box.

Given that 32ebba58e25d6508c171d737289fc376095d82d0b026132e903dd32673a88a65 is a Windows container base layer (I happened to have hello-world:latest pulled here, so I think this is a nanoserver base), and working in some temp directory:

## Box is not installed
> wclayer create tm -l C:\ProgramData\Docker\windowsfilter\32ebba58e25d6508c171d737289fc376095d82d0b026132e903dd32673a88a65
> wclayer mount tm -l C:\ProgramData\Docker\windowsfilter\32ebba58e25d6508c171d737289fc376095d82d0b026132e903dd32673a88a65
\\?\Volume{74a47361-9cf4-4774-8655-a25fb69af17a}
> wclayer unmount tm
## Installed https://e3.boxcdn.net/box-installers/desktop/releases/win/Box-x64.msi here, closed the login window
> wclayer mount tm -l C:\ProgramData\Docker\windowsfilter\32ebba58e25d6508c171d737289fc376095d82d0b026132e903dd32673a88a65
hcsshim::PrepareLayer - failed failed in Win32: Incorrect function. (0x1)

The loaded filesystem filters are unchanged by installing or uninstalling Box, so however CBFS works, it’s not a minifilter as I had assumed, or somehow it’s causing problems without being loaded.

> fltmc

Filter Name                     Num Instances    Altitude    Frame
------------------------------  -------------  ------------  -----
bindflt                                 1       409800         0
FsDepends                               8       407000         0
WdFilter                                8       328010         0
storqosflt                              1       244000         0
wcifs                                   1       189900         0
gameflt                                 5       189850         0
CldFlt                                  1       180451         0
FileCrypt                               0       141100         0
luafv                                   1       135000         0
npsvctrig                               1        46000         0
Wof                                     6        40700         0
FileInfo                                8        40500         0
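
To compare the filter list before and after installing Box without eyeballing the table, the fltmc output can be reduced to a set of names. This is a rough sketch that assumes the column layout shown above; loaded_filter_names is a hypothetical helper, and the sample text is a shortened copy of the output in this comment.

```python
def loaded_filter_names(fltmc_output: str) -> set:
    """Return the lower-cased filter names listed in `fltmc` output."""
    names = set()
    in_table = False
    for line in fltmc_output.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if stripped.startswith("---"):
            # The dashed separator marks the end of the header row.
            in_table = True
            continue
        if in_table:
            # The filter name is the first whitespace-separated column.
            names.add(stripped.split()[0].lower())
    return names


SAMPLE = """\
Filter Name                     Num Instances    Altitude    Frame
------------------------------  -------------  ------------  -----
bindflt                                 1       409800         0
wcifs                                   1       189900         0
CldFlt                                  1       180451         0
"""

names = loaded_filter_names(SAMPLE)
print("wcifs" in names)
print("cbfsconnect2017" in names)
```

Taking the set difference of two such results (one captured with Box installed, one without) would show directly whether the installer added or removed a filter.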

I do see it’s a kernel mode driver.

A service was installed in the system.

Service Name:  cbfsconnect2017
Service File Name:  C:\WINDOWS\system32\drivers\cbfsconnect2017.sys
Service Type:  kernel mode driver
Service Start Type:  system start
Service Account:  
> sc query cbfsconnect2017

SERVICE_NAME: cbfsconnect2017
        TYPE               : 1  KERNEL_DRIVER
        STATE              : 4  RUNNING
                                (NOT_STOPPABLE, NOT_PAUSABLE, IGNORES_SHUTDOWN)
        WIN32_EXIT_CODE    : 0  (0x0)
        SERVICE_EXIT_CODE  : 0  (0x0)
        CHECKPOINT         : 0x0
        WAIT_HINT          : 0x0
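
The same check can be scripted by parsing the sc query text. The sketch below assumes the "KEY : value" layout shown above; parse_sc_query and is_running_kernel_driver are hypothetical helper names, and the sample is copied from the output in this comment.

```python
def parse_sc_query(output: str) -> dict:
    """Split `sc query` output into a {field: value} dictionary."""
    fields = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            # Blank lines and the flags continuation line have no colon.
            continue
        fields[key.strip()] = value.strip()
    return fields


def is_running_kernel_driver(fields: dict) -> bool:
    """True when TYPE is KERNEL_DRIVER and STATE is RUNNING."""
    return (
        "KERNEL_DRIVER" in fields.get("TYPE", "").split()
        and "RUNNING" in fields.get("STATE", "").split()
    )


SAMPLE = """\
SERVICE_NAME: cbfsconnect2017
        TYPE               : 1  KERNEL_DRIVER
        STATE              : 4  RUNNING
                                (NOT_STOPPABLE, NOT_PAUSABLE, IGNORES_SHUTDOWN)
        WIN32_EXIT_CODE    : 0  (0x0)
        SERVICE_EXIT_CODE  : 0  (0x0)
        CHECKPOINT         : 0x0
        WAIT_HINT          : 0x0
"""

fields = parse_sc_query(SAMPLE)
print(fields["SERVICE_NAME"], is_running_kernel_driver(fields))
```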

So I’m assuming it’s a legacy filesystem filter driver. However, the docs for that say it should appear in the above fltmc output, and it does not. I suspect that’s because the driver installation is being done with the Service Control Manager (pre-XP) approach.

I uninstalled Box, and installed SFTP Drive V2 Personal Edition, and confirmed that it looks the same (i.e. not in fltmc output) and the repro case passes. So the difference must be in the driver implementation itself.

So I can confirm that the 2019 release of cbfsconnect bundled with Box shows this problem, and the 2020 release of cbfsconnect bundled with SFTP Drive V2 does not.

And it’s nothing to do with Docker, it’s a conflict between the device driver and the Windows container filesystem support.

Windows 10 1903 introduced some driver model changes, so it’s not unbelievable to me that this is what introduced the incompatibility, and since the fix is apparently obvious to the CBFS driver developers in their closed-source product, I can’t see any useful way forward except either:

  • CBFS driver developers acknowledge it was a bug or legacy issue in their code and the Box developers upgrade their integration
  • CBFS driver developers usefully document the Windows bug they are working around to Microsoft (perhaps raise it on the hcsshim bug tracker, since there’s a simple repro case there), and MS fixes it as appropriate.

Last email from Callback on this:

Thanks for the update. It sounds like the direction is good and we don’t have anything further to add. The return status in newer builds does align with TBBle’s more strict interpretation. Let us know however we can assist.

It seems there is nothing to fix in Windows/moby/Docker but rather this is now on Box to incorporate the updated driver. Thanks all for the thorough investigation, I’ll close and continue my year+ effort to get Box to incorporate the fix. 😬

Wow, thanks for the deep-dive into this, @TBBle. Looks like there’s not a lot that can be done in this repository.

Does someone on this thread have a subscription for Box Drive or is in contact with the CBFS Connect driver people to get the missing information?