autograph: Tests fail on macOS Monterey, Rust 1.57

Tests fail to finish on M1 Mac

$ cargo test device_new --features device_tests
running 1 test
test device::tests::device_new has been running for over 60 seconds
error: test failed, to rerun pass '--lib'
Caused by:
  process didn't exit successfully: `/Users/rjzak/Downloads/autograph/target/debug/deps/autograph-868587c6365604da device_new` (signal: 9, SIGKILL: kill)

$ cargo test --features "full device_tests"
test device::buffer::tests::device_buffer_copy_from_slice has been running for over 60 seconds
test device::buffer::tests::device_buffer_serde has been running for over 60 seconds
test device::buffer::tests::fill_bf16 has been running for over 60 seconds
test device::buffer::tests::fill_f16 has been running for over 60 seconds
test device::buffer::tests::fill_f32 has been running for over 60 seconds
test device::buffer::tests::fill_f64 has been running for over 60 seconds
test device::buffer::tests::fill_i16 has been running for over 60 seconds
test device::buffer::tests::fill_i32 has been running for over 60 seconds
error: test failed, to rerun pass '--lib'
Caused by:
  process didn't exit successfully: `/Users/rjzak/Downloads/autograph/target/debug/deps/autograph-aa9dbc5e89ab94bc` (signal: 9, SIGKILL: kill)

$ uname -a
Darwin macmini.local 21.1.0 Darwin Kernel Version 21.1.0: Wed Oct 13 17:33:24 PDT 2021; root:xnu-8019.41.5~1/RELEASE_ARM64_T8101 arm64
$ rustc --version
rustc 1.57.0 (f1edd0429 2021-11-29)

About this issue

State: closed
Created 3 years ago
Comments: 37 (20 by maintainers)

Most upvoted comments

OK, so I found some potential issues. This really needs unit tests, because there are at least a few common setups for host/device memory. Typical discrete GPUs have a big DEVICE_LOCAL heap, a small DEVICE_LOCAL | CPU_VISIBLE heap, and a CPU_VISIBLE | COHERENT (and potentially CPU_CACHED) heap. Integrated GPUs and the M1 may differ, with more DEVICE_LOCAL | CPU_VISIBLE memory in addition to plain DEVICE_LOCAL memory. The implementation was intended to handle this, but without tests it got neglected. It should have tests for known configurations, since this is necessary to run not only on the M1 but also on iGPUs and mobile chips.

charles-r-earp on Dec 14, 2021
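
To make the heap layouts in the comment above concrete, here is a minimal sketch of how memory-type selection could be checked against known configurations. It is not autograph's actual implementation: `Props`, `MemoryType`, `pick_memory_type`, `discrete_gpu`, and `unified_gpu` are hypothetical names, the flags only borrow the names used in the comment, and the heap sizes are made-up illustrative values rather than measurements from an M1.

```rust
use std::ops::BitOr;

/// Hypothetical property flags, named after the ones in the comment above.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Props(u8);

impl Props {
    pub const NONE: Props = Props(0);
    pub const DEVICE_LOCAL: Props = Props(1);
    pub const CPU_VISIBLE: Props = Props(1 << 1);
    pub const COHERENT: Props = Props(1 << 2);
    pub const CPU_CACHED: Props = Props(1 << 3);

    /// True if every flag set in `other` is also set in `self`.
    pub fn contains(self, other: Props) -> bool {
        self.0 & other.0 == other.0
    }
}

impl BitOr for Props {
    type Output = Props;
    fn bitor(self, rhs: Props) -> Props {
        Props(self.0 | rhs.0)
    }
}

/// One memory type plus the size of its heap in MiB (illustrative numbers only).
#[derive(Clone, Copy, Debug)]
pub struct MemoryType {
    pub props: Props,
    pub heap_mib: u64,
}

/// Index of the best memory type that has all `required` flags.
/// Types that also have the `preferred` flags win; ties go to the larger heap.
pub fn pick_memory_type(types: &[MemoryType], required: Props, preferred: Props) -> Option<usize> {
    types
        .iter()
        .enumerate()
        .filter(|(_, t)| t.props.contains(required))
        .max_by_key(|(_, t)| (t.props.contains(preferred), t.heap_mib))
        .map(|(i, _)| i)
}

/// Layout described in the comment for a typical discrete GPU.
pub fn discrete_gpu() -> Vec<MemoryType> {
    vec![
        MemoryType { props: Props::DEVICE_LOCAL, heap_mib: 8192 },
        MemoryType { props: Props::DEVICE_LOCAL | Props::CPU_VISIBLE, heap_mib: 256 },
        MemoryType { props: Props::CPU_VISIBLE | Props::COHERENT | Props::CPU_CACHED, heap_mib: 16384 },
    ]
}

/// Assumed layout for an integrated / unified-memory GPU: no host-only heap.
pub fn unified_gpu() -> Vec<MemoryType> {
    vec![
        MemoryType { props: Props::DEVICE_LOCAL, heap_mib: 4096 },
        MemoryType { props: Props::DEVICE_LOCAL | Props::CPU_VISIBLE | Props::COHERENT, heap_mib: 8192 },
    ]
}
```

With these assumed layouts, a request for host-visible staging memory (`CPU_VISIBLE` required, `COHERENT` preferred) resolves to the host heap on the discrete configuration but has to fall back to the `DEVICE_LOCAL | CPU_VISIBLE` type on the unified one. A selection routine that assumes a separate host-only heap always exists would find nothing there, which is the kind of case a per-configuration unit test would catch.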

Making some progress! The device_new test passed, so I also ran the full test suite: 145 passed, 1 failed, 2 ignored.

% git status
On branch vulkano
Your branch is up to date with 'origin/vulkano'.

nothing to commit, working tree clean
% cargo test --features device_tests device_new
   Compiling autograph v0.1.1 (/Users/rjzak/Downloads/autograph)
warning: unused imports: `SliceMut`, `Slice`
 --> src/device.rs:2:14
  |
2 |     buffer::{Slice, SliceMut},
  |              ^^^^^  ^^^^^^^^
  |
  = note: `#[warn(unused_imports)]` on by default

warning: unused import: `bytemuck::Pod`
 --> src/device.rs:6:5
  |
6 | use bytemuck::Pod;
  |     ^^^^^^^^^^^^^

warning: unused imports: `Weak`, `drop`, `iter`, `take`, `time::Duration`
  --> src/device/engine.rs:17:5
   |
17 |     iter,
   |     ^^^^
18 |     iter::once,
19 |     mem::{drop, take, transmute},
   |           ^^^^  ^^^^
...
23 |         Arc, Weak,
   |              ^^^^
24 |     },
25 |     time::Duration,
   |     ^^^^^^^^^^^^^^

warning: unused imports: `AutoCommandBufferBuilder`, `DescriptorSetResources`, `DescriptorSetWithOffsets`, `DescriptorWrite`, `FenceSignalFuture`, `FenceWaitError`, `StdDescriptorPool`, `UnsafeDescriptorSet`, `memory::DeviceMemoryAllocError`, `now`
  --> src/device/engine.rs:39:9
   |
39 |         AutoCommandBufferBuilder, CommandBufferLevel, CommandBufferUsage, PrimaryCommandBuffer,
   |         ^^^^^^^^^^^^^^^^^^^^^^^^
...
46 |             StdDescriptorPool,
   |             ^^^^^^^^^^^^^^^^^
47 |         },
48 |         sys::{DescriptorWrite, UnsafeDescriptorSet},
   |               ^^^^^^^^^^^^^^^  ^^^^^^^^^^^^^^^^^^^
49 |         DescriptorSet, DescriptorSetResources, DescriptorSetWithOffsets,
   |                        ^^^^^^^^^^^^^^^^^^^^^^  ^^^^^^^^^^^^^^^^^^^^^^^^
...
57 |     memory::DeviceMemoryAllocError,
   |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
...
63 |     sync::{now, AccessFlags, Fence, FenceSignalFuture, FenceWaitError, GpuFuture, PipelineStages},
   |            ^^^                      ^^^^^^^^^^^^^^^^^  ^^^^^^^^^^^^^^

warning: unused import: `Capability`
 --> src/device/shader.rs:8:13
  |
8 |     spirv::{Capability, Decoration, ExecutionMode, ExecutionModel, Op, StorageClass, Word},
  |             ^^^^^^^^^^

warning: unused import: `ModuleId`
  --> src/device.rs:26:23
   |
26 | use shader::{EntryId, ModuleId};
   |                       ^^^^^^^^

warning: unused import: `PrimaryCommandBuffer`
  --> src/device/engine.rs:39:75
   |
39 |         AutoCommandBufferBuilder, CommandBufferLevel, CommandBufferUsage, PrimaryCommandBuffer,
   |                                                                           ^^^^^^^^^^^^^^^^^^^^

warning: unused import: `DescriptorSet`
  --> src/device/engine.rs:49:9
   |
49 |         DescriptorSet, DescriptorSetResources, DescriptorSetWithOffsets,
   |         ^^^^^^^^^^^^^

warning: unused import: `GpuFuture`
  --> src/device/engine.rs:63:72
   |
63 |     sync::{now, AccessFlags, Fence, FenceSignalFuture, FenceWaitError, GpuFuture, PipelineStages},
   |                                                                        ^^^^^^^^^

warning: unused variable: `base`
   --> src/device.rs:460:21
    |
460 |         if let Some(base) = self.base.as_ref() {
    |                     ^^^^ help: if this is intentional, prefix it with an underscore: `_base`
    |
    = note: `#[warn(unused_variables)]` on by default

warning: unused variable: `src`
   --> src/device/buffer.rs:606:36
    |
606 |             (DynBufferBase::Device(src), Some(device)) => todo!(), // Ok(src.into_device(device).await?.into()),
    |                                    ^^^ help: if this is intentional, prefix it with an underscore: `_src`

warning: unused variable: `device`
   --> src/device/buffer.rs:606:47
    |
606 |             (DynBufferBase::Device(src), Some(device)) => todo!(), // Ok(src.into_device(device).await?.into()),
    |                                               ^^^^^^ help: if this is intentional, prefix it with an underscore: `_device`

warning: unused variable: `guard`
   --> src/device/buffer.rs:942:26
    |
942 |             Self::Device(guard) => todo!(), // guard.as_slice().to_vec(),
    |                          ^^^^^ help: if this is intentional, prefix it with an underscore: `_guard`

warning: unnecessary `unsafe` block
   --> src/device/engine.rs:538:27
    |
538 |         let mut barrier = unsafe { UnsafeCommandBufferBuilderPipelineBarrier::new() };
    |                           ^^^^^^ unnecessary `unsafe` block
    |
    = note: `#[warn(unused_unsafe)]` on by default

warning: call to unsafe function is unsafe and requires unsafe block (error E0133)
  --> src/util.rs:35:21
   |
35 |             None => unreachable_unchecked(),
   |                     ^^^^^^^^^^^^^^^^^^^^^^^ call to unsafe function
   |
note: the lint level is defined here
  --> src/lib.rs:1:23
   |
1  | #![warn(missing_docs, unsafe_op_in_unsafe_fn)]
   |                       ^^^^^^^^^^^^^^^^^^^^^^
   = note: consult the function's documentation for information on how to avoid undefined behavior

warning: call to unsafe function is unsafe and requires unsafe block (error E0133)
   --> src/buffer.rs:159:12
    |
159 |         Ok(Buffer::alloc(device, len)?.into())
    |            ^^^^^^^^^^^^^^^^^^^^^^^^^^ call to unsafe function
    |
    = note: consult the function's documentation for information on how to avoid undefined behavior

warning: call to unsafe function is unsafe and requires unsafe block (error E0133)
   --> src/buffer.rs:388:12
    |
388 |         Ok(Buffer::alloc(device, len)?.into())
    |            ^^^^^^^^^^^^^^^^^^^^^^^^^^ call to unsafe function
    |
    = note: consult the function's documentation for information on how to avoid undefined behavior

warning: call to unsafe function is unsafe and requires unsafe block (error E0133)
   --> src/buffer/float.rs:167:56
    |
167 |                           FloatType::F32 => Ok(Self::F32($buffer::alloc(device, len)?)),
    |                                                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^ call to unsafe function
...
210 | / impl_float_buffer_owned! {
211 | |     (FloatBuffer, Buffer),
212 | |     (FloatArcBuffer, ArcBuffer),
213 | |     (FloatCowBuffer<'a>, CowBuffer<'a>),
214 | | }
    | |_- in this macro invocation
    |
    = note: consult the function's documentation for information on how to avoid undefined behavior
    = note: this warning originates in the macro `impl_float_buffer_owned` (in Nightly builds, run with -Z macro-backtrace for more info)

warning: call to unsafe function is unsafe and requires unsafe block (error E0133)
   --> src/buffer/float.rs:166:58
    |
166 |                           FloatType::BF16 => Ok(Self::BF16($buffer::alloc(device, len)?)),
    |                                                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^ call to unsafe function
...
210 | / impl_float_buffer_owned! {
211 | |     (FloatBuffer, Buffer),
212 | |     (FloatArcBuffer, ArcBuffer),
213 | |     (FloatCowBuffer<'a>, CowBuffer<'a>),
214 | | }
    | |_- in this macro invocation
    |
    = note: consult the function's documentation for information on how to avoid undefined behavior
    = note: this warning originates in the macro `impl_float_buffer_owned` (in Nightly builds, run with -Z macro-backtrace for more info)

warning: field is never read: `descriptor_set_layout`
   --> src/device/engine.rs:528:5
    |
528 |     descriptor_set_layout: Arc<DescriptorSetLayout>,
    |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |
    = note: `#[warn(dead_code)]` on by default

warning: field is never read: `device`
   --> src/device/engine.rs:634:5
    |
634 |     device: Arc<Device>,
    |     ^^^^^^^^^^^^^^^^^^^

warning: associated function is never used: `into_device`
   --> src/device/buffer.rs:262:14
    |
262 |     async fn into_device(self, device: DeviceBase) -> Result<DeviceBuffer<T>>
    |              ^^^^^^^^^^^

warning: associated function is never used: `specialization_size`
   --> src/device/shader.rs:181:19
    |
181 |     pub(super) fn specialization_size(&self) -> usize {
    |                   ^^^^^^^^^^^^^^^^^^^

warning: associated function is never used: `size`
   --> src/device/shader.rs:203:19
    |
203 |     pub(super) fn size(&self) -> usize {
    |                   ^^^^

warning: field is never read: `local_size`
   --> src/device.rs:281:5
    |
281 |     local_size: [u32; 3],
    |     ^^^^^^^^^^^^^^^^^^^^

warning: `autograph` (lib test) generated 25 warnings
    Finished test [unoptimized + debuginfo] target(s) in 4.87s
     Running unittests (target/debug/deps/autograph-6651538b3f305376)

running 1 test
[mvk-info] MoltenVK version 1.1.5, supporting Vulkan version 1.1.189.
	The following 72 Vulkan extensions are supported:
		VK_KHR_16bit_storage v1
		VK_KHR_8bit_storage v1
		VK_KHR_bind_memory2 v1
		VK_KHR_create_renderpass2 v1
		VK_KHR_dedicated_allocation v3
		VK_KHR_depth_stencil_resolve v1
		VK_KHR_descriptor_update_template v1
		VK_KHR_device_group v4
		VK_KHR_device_group_creation v1
		VK_KHR_driver_properties v1
		VK_KHR_external_fence v1
		VK_KHR_external_fence_capabilities v1
		VK_KHR_external_memory v1
		VK_KHR_external_memory_capabilities v1
		VK_KHR_external_semaphore v1
		VK_KHR_external_semaphore_capabilities v1
		VK_KHR_get_memory_requirements2 v1
		VK_KHR_get_physical_device_properties2 v2
		VK_KHR_get_surface_capabilities2 v1
		VK_KHR_imageless_framebuffer v1
		VK_KHR_image_format_list v1
		VK_KHR_maintenance1 v2
		VK_KHR_maintenance2 v1
		VK_KHR_maintenance3 v1
		VK_KHR_multiview v1
		VK_KHR_portability_subset v1
		VK_KHR_push_descriptor v2
		VK_KHR_relaxed_block_layout v1
		VK_KHR_sampler_mirror_clamp_to_edge v3
		VK_KHR_sampler_ycbcr_conversion v14
		VK_KHR_shader_draw_parameters v1
		VK_KHR_shader_float16_int8 v1
		VK_KHR_shader_subgroup_extended_types v1
		VK_KHR_storage_buffer_storage_class v1
		VK_KHR_surface v25
		VK_KHR_swapchain v70
		VK_KHR_swapchain_mutable_format v1
		VK_KHR_timeline_semaphore v2
		VK_KHR_uniform_buffer_standard_layout v1
		VK_KHR_variable_pointers v1
		VK_EXT_debug_marker v4
		VK_EXT_debug_report v10
		VK_EXT_debug_utils v2
		VK_EXT_descriptor_indexing v2
		VK_EXT_fragment_shader_interlock v1
		VK_EXT_hdr_metadata v2
		VK_EXT_host_query_reset v1
		VK_EXT_image_robustness v1
		VK_EXT_inline_uniform_block v1
		VK_EXT_memory_budget v1
		VK_EXT_metal_surface v1
		VK_EXT_post_depth_coverage v1
		VK_EXT_private_data v1
		VK_EXT_robustness2 v1
		VK_EXT_scalar_block_layout v1
		VK_EXT_shader_stencil_export v1
		VK_EXT_shader_viewport_index_layer v1
		VK_EXT_subgroup_size_control v2
		VK_EXT_swapchain_colorspace v4
		VK_EXT_texel_buffer_alignment v1
		VK_EXT_texture_compression_astc_hdr v1
		VK_EXT_vertex_attribute_divisor v3
		VK_AMD_gpu_shader_half_float v2
		VK_AMD_negative_viewport_height v1
		VK_AMD_shader_image_load_store_lod v1
		VK_AMD_shader_trinary_minmax v1
		VK_IMG_format_pvrtc v1
		VK_INTEL_shader_integer_functions2 v1
		VK_GOOGLE_display_timing v1
		VK_MVK_macos_surface v3
		VK_MVK_moltenvk v32
		VK_NV_glsl_shader v1
[mvk-info] GPU device:
		model: Apple M1
		type: Integrated
		vendorID: 0x106b
		deviceID: 0xa140
		pipelineCacheUUID: C1D03328-0400-03EF-0000-000000000000
	supports the following Metal Versions, GPU's and Feature Sets:
		Metal Shading Language 2.3
		GPU Family Apple 7
		GPU Family Apple 6
		GPU Family Apple 5
		GPU Family Apple 4
		GPU Family Apple 3
		GPU Family Apple 2
		GPU Family Apple 1
		GPU Family Mac 2
		GPU Family Mac 1
		GPU Family Common 3
		GPU Family Common 2
		GPU Family Common 1
		macOS GPU Family 2 v1
		macOS GPU Family 1 v4
		macOS GPU Family 1 v3
		macOS GPU Family 1 v2
		macOS GPU Family 1 v1
[mvk-info] Created VkInstance for Vulkan version 1.1.0, as requested by app, with the following 0 Vulkan extensions enabled:
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test device::tests::device_new ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 41 filtered out; finished in 0.10s

% cargo test --features "full device_tests" --no-fail-fast
   Compiling indexmap v1.8.0
   Compiling tokio v1.14.0
   Compiling serde_json v1.0.72
   Compiling naga v0.5.0
   Compiling autograph v0.1.1 (/Users/rjzak/Downloads/autograph)
   Compiling plist v1.3.1
   Compiling envmnt v0.8.4
   Compiling gfx-hal v0.9.0
   Compiling vulkano v0.27.1
   Compiling ci_info v0.10.2
   Compiling ash-molten v0.12.0+1.1.5
   Compiling rusty-hook v0.11.2
   Compiling tokio-util v0.6.9
   Compiling tokio-native-tls v0.3.0
   Compiling h2 v0.3.9
   Compiling hyper v0.14.16
   Compiling hyper-tls v0.5.0
   Compiling reqwest v0.11.7
   Compiling downloader v0.2.6
warning: unused imports: `SliceMut`, `Slice`
 --> src/device.rs:2:14
  |
2 |     buffer::{Slice, SliceMut},
  |              ^^^^^  ^^^^^^^^
  |
  = note: `#[warn(unused_imports)]` on by default

warning: unused import: `bytemuck::Pod`
 --> src/device.rs:6:5
  |
6 | use bytemuck::Pod;
  |     ^^^^^^^^^^^^^

warning: unused imports: `Weak`, `drop`, `iter`, `take`, `time::Duration`
  --> src/device/engine.rs:17:5
   |
17 |     iter,
   |     ^^^^
18 |     iter::once,
19 |     mem::{drop, take, transmute},
   |           ^^^^  ^^^^
...
23 |         Arc, Weak,
   |              ^^^^
24 |     },
25 |     time::Duration,
   |     ^^^^^^^^^^^^^^

warning: unused imports: `AutoCommandBufferBuilder`, `DescriptorSetResources`, `DescriptorSetWithOffsets`, `DescriptorWrite`, `FenceSignalFuture`, `FenceWaitError`, `StdDescriptorPool`, `UnsafeDescriptorSet`, `memory::DeviceMemoryAllocError`, `now`
  --> src/device/engine.rs:39:9
   |
39 |         AutoCommandBufferBuilder, CommandBufferLevel, CommandBufferUsage, PrimaryCommandBuffer,
   |         ^^^^^^^^^^^^^^^^^^^^^^^^
...
46 |             StdDescriptorPool,
   |             ^^^^^^^^^^^^^^^^^
47 |         },
48 |         sys::{DescriptorWrite, UnsafeDescriptorSet},
   |               ^^^^^^^^^^^^^^^  ^^^^^^^^^^^^^^^^^^^
49 |         DescriptorSet, DescriptorSetResources, DescriptorSetWithOffsets,
   |                        ^^^^^^^^^^^^^^^^^^^^^^  ^^^^^^^^^^^^^^^^^^^^^^^^
...
57 |     memory::DeviceMemoryAllocError,
   |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
...
63 |     sync::{now, AccessFlags, Fence, FenceSignalFuture, FenceWaitError, GpuFuture, PipelineStages},
   |            ^^^                      ^^^^^^^^^^^^^^^^^  ^^^^^^^^^^^^^^

warning: unused import: `Capability`
 --> src/device/shader.rs:8:13
  |
8 |     spirv::{Capability, Decoration, ExecutionMode, ExecutionModel, Op, StorageClass, Word},
  |             ^^^^^^^^^^

warning: unused import: `ModuleId`
  --> src/device.rs:26:23
   |
26 | use shader::{EntryId, ModuleId};
   |                       ^^^^^^^^

warning: unused import: `PrimaryCommandBuffer`
  --> src/device/engine.rs:39:75
   |
39 |         AutoCommandBufferBuilder, CommandBufferLevel, CommandBufferUsage, PrimaryCommandBuffer,
   |                                                                           ^^^^^^^^^^^^^^^^^^^^

warning: unused import: `DescriptorSet`
  --> src/device/engine.rs:49:9
   |
49 |         DescriptorSet, DescriptorSetResources, DescriptorSetWithOffsets,
   |         ^^^^^^^^^^^^^

warning: unused import: `GpuFuture`
  --> src/device/engine.rs:63:72
   |
63 |     sync::{now, AccessFlags, Fence, FenceSignalFuture, FenceWaitError, GpuFuture, PipelineStages},
   |                                                                        ^^^^^^^^^

warning: unused variable: `base`
   --> src/device.rs:460:21
    |
460 |         if let Some(base) = self.base.as_ref() {
    |                     ^^^^ help: if this is intentional, prefix it with an underscore: `_base`
    |
    = note: `#[warn(unused_variables)]` on by default

warning: unused variable: `src`
   --> src/device/buffer.rs:606:36
    |
606 |             (DynBufferBase::Device(src), Some(device)) => todo!(), // Ok(src.into_device(device).await?.into()),
    |                                    ^^^ help: if this is intentional, prefix it with an underscore: `_src`

warning: unused variable: `device`
   --> src/device/buffer.rs:606:47
    |
606 |             (DynBufferBase::Device(src), Some(device)) => todo!(), // Ok(src.into_device(device).await?.into()),
    |                                               ^^^^^^ help: if this is intentional, prefix it with an underscore: `_device`

warning: unused variable: `guard`
   --> src/device/buffer.rs:942:26
    |
942 |             Self::Device(guard) => todo!(), // guard.as_slice().to_vec(),
    |                          ^^^^^ help: if this is intentional, prefix it with an underscore: `_guard`

warning: unnecessary `unsafe` block
   --> src/device/engine.rs:538:27
    |
538 |         let mut barrier = unsafe { UnsafeCommandBufferBuilderPipelineBarrier::new() };
    |                           ^^^^^^ unnecessary `unsafe` block
    |
    = note: `#[warn(unused_unsafe)]` on by default

warning: call to unsafe function is unsafe and requires unsafe block (error E0133)
  --> src/util.rs:35:21
   |
35 |             None => unreachable_unchecked(),
   |                     ^^^^^^^^^^^^^^^^^^^^^^^ call to unsafe function
   |
note: the lint level is defined here
  --> src/lib.rs:1:23
   |
1  | #![warn(missing_docs, unsafe_op_in_unsafe_fn)]
   |                       ^^^^^^^^^^^^^^^^^^^^^^
   = note: consult the function's documentation for information on how to avoid undefined behavior

warning: call to unsafe function is unsafe and requires unsafe block (error E0133)
   --> src/buffer.rs:159:12
    |
159 |         Ok(Buffer::alloc(device, len)?.into())
    |            ^^^^^^^^^^^^^^^^^^^^^^^^^^ call to unsafe function
    |
    = note: consult the function's documentation for information on how to avoid undefined behavior

warning: call to unsafe function is unsafe and requires unsafe block (error E0133)
   --> src/buffer.rs:388:12
    |
388 |         Ok(Buffer::alloc(device, len)?.into())
    |            ^^^^^^^^^^^^^^^^^^^^^^^^^^ call to unsafe function
    |
    = note: consult the function's documentation for information on how to avoid undefined behavior

warning: call to unsafe function is unsafe and requires unsafe block (error E0133)
   --> src/tensor/float.rs:338:35
    |
338 |         let data = S::from_buffer(FloatBuffer::alloc(float_type, device, dim.size())?);
    |                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ call to unsafe function
    |
    = note: consult the function's documentation for information on how to avoid undefined behavior

warning: call to unsafe function is unsafe and requires unsafe block (error E0133)
   --> src/buffer/float.rs:167:56
    |
167 |                           FloatType::F32 => Ok(Self::F32($buffer::alloc(device, len)?)),
    |                                                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^ call to unsafe function
...
210 | / impl_float_buffer_owned! {
211 | |     (FloatBuffer, Buffer),
212 | |     (FloatArcBuffer, ArcBuffer),
213 | |     (FloatCowBuffer<'a>, CowBuffer<'a>),
214 | | }
    | |_- in this macro invocation
    |
    = note: consult the function's documentation for information on how to avoid undefined behavior
    = note: this warning originates in the macro `impl_float_buffer_owned` (in Nightly builds, run with -Z macro-backtrace for more info)

warning: call to unsafe function is unsafe and requires unsafe block (error E0133)
   --> src/buffer/float.rs:166:58
    |
166 |                           FloatType::BF16 => Ok(Self::BF16($buffer::alloc(device, len)?)),
    |                                                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^ call to unsafe function
...
210 | / impl_float_buffer_owned! {
211 | |     (FloatBuffer, Buffer),
212 | |     (FloatArcBuffer, ArcBuffer),
213 | |     (FloatCowBuffer<'a>, CowBuffer<'a>),
214 | | }
    | |_- in this macro invocation
    |
    = note: consult the function's documentation for information on how to avoid undefined behavior
    = note: this warning originates in the macro `impl_float_buffer_owned` (in Nightly builds, run with -Z macro-backtrace for more info)

warning: field is never read: `descriptor_set_layout`
   --> src/device/engine.rs:528:5
    |
528 |     descriptor_set_layout: Arc<DescriptorSetLayout>,
    |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |
    = note: `#[warn(dead_code)]` on by default

warning: field is never read: `device`
   --> src/device/engine.rs:634:5
    |
634 |     device: Arc<Device>,
    |     ^^^^^^^^^^^^^^^^^^^

warning: associated function is never used: `into_device`
   --> src/device/buffer.rs:262:14
    |
262 |     async fn into_device(self, device: DeviceBase) -> Result<DeviceBuffer<T>>
    |              ^^^^^^^^^^^

warning: associated function is never used: `specialization_size`
   --> src/device/shader.rs:181:19
    |
181 |     pub(super) fn specialization_size(&self) -> usize {
    |                   ^^^^^^^^^^^^^^^^^^^

warning: associated function is never used: `size`
   --> src/device/shader.rs:203:19
    |
203 |     pub(super) fn size(&self) -> usize {
    |                   ^^^^

warning: field is never read: `local_size`
   --> src/device.rs:281:5
    |
281 |     local_size: [u32; 3],
    |     ^^^^^^^^^^^^^^^^^^^^

warning: unused import: `half::bf16`
   --> src/tensor/linalg.rs:181:9
    |
181 |     use half::bf16;
    |         ^^^^^^^^^^

warning: `autograph` (lib) generated 26 warnings
warning: `autograph` (lib test) generated 27 warnings (26 duplicates)
    Finished test [unoptimized + debuginfo] target(s) in 36.74s
     Running unittests (target/debug/deps/autograph-6263cc1c4b31f23f)

running 148 tests
[mvk-info] Created VkInstance for Vulkan version 1.1.0, as requested by app, with the following 0 Vulkan extensions enabled:
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test buffer::tests::device_buffer_from_vec ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test buffer::tests::device_buffer_serde ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test buffer::tests::device_buffer_copy_from_slice ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test buffer::tests::fill_i64 ... ok
test buffer::tests::fill_i32 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test buffer::tests::fill_bf16 ... ok
test buffer::tests::fill_i8 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test buffer::tests::fill_f32 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test buffer::tests::host_buffer_copy_from_slice ... ok
test buffer::tests::fill_u16 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test buffer::tests::host_buffer_serde ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test buffer::tests::fill_u32 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test buffer::tests::fill_f16 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test buffer::tests::fill_u8 ... ok
test buffer::tests::fill_u64 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test buffer::tests::fill_i16 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test buffer::tests::fill_f64 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
test buffer::tests::scale::f32::f32 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Using MTLEvent for Vulkan semaphores.
test buffer::tests::scale::bf16::i32 ... ok
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test buffer::tests::scale::f32::u32 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test buffer::tests::scale::i32::i32 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test buffer::tests::scale::bf16::u32 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test buffer::tests::scale::bf16::f32 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test buffer::tests::scale::i32::u32 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test buffer::tests::scale::i32::f32 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test buffer::tests::scale::f32::i32 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test buffer::tests::scale::u16::f32 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test buffer::tests::scale::u16::i32 ... ok
test device::engine::tests::instance ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test buffer::tests::scale::u32::f32 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test device::tests::device_new ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test buffer::tests::scale::u32::u32 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test buffer::tests::scale::u16::u32 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test buffer::tests::scale::u8::i32 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test buffer::tests::scale::u8::f32 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test buffer::tests::scale::u8::u32 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test buffer::tests::scale::u32::i32 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test learn::neural_network::autograd::tests::cross_entropy_loss_backward_f32 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test learn::neural_network::autograd::tests::bias_backward_f32 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test learn::neural_network::autograd::tests::cross_entropy_loss_f32 ... FAILED
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test learn::kmeans::tests::compute_distances_f32_m11_k5_n13 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test learn::neural_network::layer::tests::max_pool_2d_backward_atomic_f32 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test learn::neural_network::autograd::tests::bias_backward_bf16 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test device::shader::tests::shader_module_from_spirv ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test learn::kmeans::tests::compute_distances_bf16_m11_k5_n13 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test learn::neural_network::layer::tests::relu_backward_f32 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test learn::neural_network::layer::tests::relu_backward_bf16 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test learn::neural_network::layer::tests::max_pool_2d_f32 ... ok
test learn::neural_network::layer::tests::mean_pool_2d_f32 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test learn::neural_network::layer::tests::mean_pool_2d_backward_atomic_f32 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test learn::neural_network::layer::tests::mean_pool_2d_backward_f32 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test learn::neural_network::layer::tests::max_pool_2d_backward_f32 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test learn::neural_network::layer::tests::relu_bf16 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test glsl_shaders::tests::to_metal ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test learn::neural_network::layer::tests::relu_f32 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test tensor::accuracy::tests::accuracy_u32 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test tensor::linalg::tests::tensor_dot_f32_m25_k611_n6_N_N ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test tensor::linalg::tests::tensor_dot_f32_m21_k31_n41_N_N ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test tensor::linalg::tests::tensor_dot_f32_m121_k131_n141_N_N ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test tensor::linalg::tests::tensor_dot_f32_m121_k131_n141_N_T ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test tensor::linalg::tests::tensor_dot_f32_m121_k131_n141_T_N ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test tensor::linalg::tests::tensor_dot_f32_m121_k131_n141_T_T ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test tensor::linalg::tests::tensor_dot_i32_m21_k31_n41_N_N ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test tensor::linalg::tests::tensor_dot_i32_m121_k131_n141_N_N ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test tensor::linalg::tests::tensor_dot_i32_m121_k131_n141_N_T ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test rust_shaders::tests::core_to_metal ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test tensor::linalg::tests::tensor_dot_u32_m21_k31_n41_N_N ... ok
test tensor::ops::tests::im2col_convolution_bf16 ... ignored
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test tensor::ops::tests::col2im_convolution_f32 ... ok
test tensor::reduce::tests::atomic_add_f32 ... ignored
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test tensor::reduce::tests::tensor_argmax_bf16_11x12_axis0 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test tensor::reduce::tests::tensor_argmax_bf16_22x23_axis1 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test tensor::reduce::tests::tensor_argmax_f32_11x12_axis0 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test tensor::reduce::tests::tensor_argmax_f32_22x23_axis1 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test tensor::ops::tests::im2col_convolution_f32 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test tensor::reduce::tests::tensor_argmax_i32_22x23_axis1 ... ok
test tensor::reduce::tests::tensor_argmax_i32_11x12_axis0 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test tensor::reduce::tests::tensor_argmax_u32_22x23_axis1 ... ok
test tensor::reduce::tests::tensor_argmax_u32_11x12_axis0 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test tensor::reduce::tests::tensor_argmin_bf16_22x23_axis1 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test tensor::reduce::tests::tensor_argmin_bf16_11x12_axis0 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test tensor::linalg::tests::tensor_dot_i32_m121_k131_n141_T_N ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
test tensor::reduce::tests::tensor_argmin_f32_11x12_axis0 ... ok
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test tensor::reduce::tests::tensor_argmin_f32_22x23_axis1 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test tensor::reduce::tests::tensor_argmin_u32_11x12_axis0 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test tensor::reduce::tests::tensor_argmin_u32_22x23_axis1 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test tensor::reduce::tests::tensor_sum_bf16_11x12_axis0 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test tensor::reduce::tests::tensor_sum_bf16_22x23_axis1 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test tensor::reduce::tests::tensor_argmin_i32_11x12_axis0 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test tensor::reduce::tests::tensor_argmin_i32_22x23_axis1 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test tensor::reduce::tests::tensor_sum_f32_22x23_axis1 ... ok
test tensor::reduce::tests::tensor_sum_f32_11x12_axis0 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test tensor::reduce::tests::tensor_sum_u32_11x12_axis0 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test tensor::reduce::tests::tensor_sum_u32_22x23_axis1 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test tensor::reduce::tests::tensor_sum_i32_22x23_axis1 ... ok
test tensor::reduce::tests::tensor_sum_i32_11x12_axis0 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test tensor::reorder::tests::into_standard_layout_4d_u32 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test tensor::reorder::tests::into_standard_layout_6d_u32 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test tensor::linalg::tests::tensor_dot_u32_m121_k131_n141_T_N ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test tensor::linalg::tests::tensor_dot_i32_m121_k131_n141_T_T ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test tensor::reorder::tests::reorder_2d_f32_f32 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test tensor::tests::one_hot_u16_i32 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test tensor::tests::one_hot_u16_bf16 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test tensor::tests::one_hot_u16_f32 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test tensor::linalg::tests::tensor_dot_u32_m121_k131_n141_N_T ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test tensor::tests::one_hot_u32_f32 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test tensor::tests::one_hot_u16_u32 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test tensor::linalg::tests::tensor_dot_u32_m121_k131_n141_N_N ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test tensor::tests::one_hot_u32_i32 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test tensor::tests::one_hot_u32_bf16 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test tensor::tests::scaled_cast_bf16_f32 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test tensor::tests::scaled_cast_bf16_i32 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test tensor::tests::scaled_cast_bf16_u32 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test tensor::tests::one_hot_u8_bf16 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test tensor::tests::scaled_cast_i32_f32 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test tensor::tests::scaled_cast_i32_i32 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test tensor::tests::one_hot_u8_f32 ... ok
test tensor::tests::scaled_cast_i32_u32 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test tensor::tests::scaled_cast_u16_f32 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test tensor::tests::scaled_cast_u16_i32 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test tensor::tests::scaled_cast_u16_u32 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test tensor::tests::one_hot_u8_i32 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test tensor::tests::scaled_cast_u32_f32 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test tensor::tests::one_hot_u32_u32 ... ok
test tensor::tests::scaled_cast_u32_i32 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test tensor::tests::scaled_cast_u32_u32 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test tensor::tests::one_hot_u8_u32 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test tensor::tests::scaled_cast_u8_f32 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test tensor::tests::scaled_cast_u8_i32 ... ok
test tensor::tests::tensor_from_array0 ... ok
test tensor::tests::tensor_from_array1 ... ok
test tensor::tests::tensor_from_array2 ... ok
test tensor::tests::tensor_from_array3 ... ok
test tensor::tests::scaled_cast_u8_u32 ... ok
test tensor::tests::tensor_from_array4 ... ok
test tensor::tests::tensor_from_array6 ... ok
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 with the following 1 Vulkan extensions enabled:
		VK_KHR_portability_subset v1
test tensor::tests::tensor_from_arrayD ... ok
test tensor::tests::tensor_serde_host ... ok
test tensor::tests::tensor_serde_device ... ok
test util::tests::size_eq ... ok
test tensor::tests::test_from_array5 ... ok
test util::tests::type_eq ... ok
test tensor::tests::scaled_cast_bf16_bf16 ... ok
test tensor::tests::scaled_cast_i32_bf16 ... ok
test tensor::tests::scaled_cast_u32_bf16 ... ok
test tensor::tests::scaled_cast_u8_bf16 ... ok
test tensor::linalg::tests::tensor_dot_u32_m121_k131_n141_T_T ... ok
test tensor::tests::scaled_cast_u16_bf16 ... ok

failures:

---- learn::neural_network::autograd::tests::cross_entropy_loss_f32 stdout ----
[src/device/engine.rs:191] physical_device.supported_features() = Features {
    acceleration_structure: false,
    acceleration_structure_capture_replay: false,
    acceleration_structure_host_commands: false,
    acceleration_structure_indirect_build: false,
    advanced_blend_coherent_operations: false,
    alpha_to_one: true,
    attachment_fragment_shading_rate: false,
    bresenham_lines: false,
    buffer_device_address: false,
    buffer_device_address_capture_replay: false,
    buffer_device_address_multi_device: false,
    color_write_enable: false,
    compute_derivative_group_linear: false,
    compute_derivative_group_quads: false,
    compute_full_subgroups: true,
    conditional_rendering: false,
    constant_alpha_color_blend_factors: true,
    cooperative_matrix: false,
    cooperative_matrix_robust_buffer_access: false,
    corner_sampled_image: false,
    coverage_reduction_mode: false,
    custom_border_color_without_format: false,
    custom_border_colors: false,
    decode_mode_shared_exponent: false,
    dedicated_allocation_image_aliasing: false,
    depth_bias_clamp: true,
    depth_bounds: false,
    depth_clamp: true,
    depth_clip_enable: false,
    descriptor_binding_acceleration_structure_update_after_bind: false,
    descriptor_binding_inline_uniform_block_update_after_bind: true,
    descriptor_binding_partially_bound: true,
    descriptor_binding_sampled_image_update_after_bind: true,
    descriptor_binding_storage_buffer_update_after_bind: true,
    descriptor_binding_storage_image_update_after_bind: true,
    descriptor_binding_storage_texel_buffer_update_after_bind: true,
    descriptor_binding_uniform_buffer_update_after_bind: true,
    descriptor_binding_uniform_texel_buffer_update_after_bind: true,
    descriptor_binding_update_unused_while_pending: true,
    descriptor_binding_variable_descriptor_count: true,
    descriptor_indexing: false,
    device_coherent_memory: false,
    device_generated_commands: false,
    device_memory_report: false,
    diagnostics_config: false,
    draw_indirect_count: false,
    draw_indirect_first_instance: true,
    dual_src_blend: true,
    events: true,
    exclusive_scissor: false,
    extended_dynamic_state: false,
    extended_dynamic_state2: false,
    extended_dynamic_state2_logic_op: false,
    extended_dynamic_state2_patch_control_points: false,
    external_memory_rdma: false,
    fill_mode_non_solid: true,
    format_a4b4g4r4: false,
    format_a4r4g4b4: false,
    fragment_density_map: false,
    fragment_density_map_deferred: false,
    fragment_density_map_dynamic: false,
    fragment_density_map_non_subsampled_images: false,
    fragment_shader_barycentric: false,
    fragment_shader_pixel_interlock: true,
    fragment_shader_sample_interlock: true,
    fragment_shader_shading_rate_interlock: false,
    fragment_shading_rate_enums: false,
    fragment_stores_and_atomics: true,
    full_draw_index_uint32: true,
    geometry_shader: false,
    geometry_streams: false,
    global_priority_query: false,
    host_query_reset: true,
    image_cube_array: true,
    image_footprint: false,
    image_view2_d_on3_d_image: false,
    image_view_format_reinterpretation: true,
    image_view_format_swizzle: true,
    imageless_framebuffer: true,
    independent_blend: true,
    index_type_uint8: false,
    inherited_conditional_rendering: false,
    inherited_queries: true,
    inherited_viewport_scissor2_d: false,
    inline_uniform_block: true,
    invocation_mask: false,
    large_points: true,
    logic_op: false,
    memory_priority: false,
    mesh_shader: false,
    multi_draw: false,
    multi_draw_indirect: true,
    multi_viewport: true,
    multisample_array_image: true,
    multiview: true,
    multiview_geometry_shader: false,
    multiview_tessellation_shader: false,
    mutable_comparison_samplers: true,
    mutable_descriptor_type: false,
    no_invocation_fragment_shading_rates: false,
    null_descriptor: false,
    occlusion_query_precise: true,
    pageable_device_local_memory: false,
    performance_counter_multiple_query_pools: false,
    performance_counter_query_pools: false,
    pipeline_creation_cache_control: false,
    pipeline_executable_info: false,
    pipeline_fragment_shading_rate: false,
    pipeline_statistics_query: false,
    point_polygons: false,
    present_id: false,
    present_wait: false,
    primitive_fragment_shading_rate: false,
    primitive_topology_list_restart: false,
    primitive_topology_patch_list_restart: false,
    private_data: true,
    protected_memory: false,
    provoking_vertex_last: false,
    ray_query: false,
    ray_tracing_motion_blur: false,
    ray_tracing_motion_blur_pipeline_trace_rays_indirect: false,
    ray_tracing_pipeline: false,
    ray_tracing_pipeline_shader_group_handle_capture_replay: false,
    ray_tracing_pipeline_shader_group_handle_capture_replay_mixed: false,
    ray_tracing_pipeline_trace_rays_indirect: false,
    ray_traversal_primitive_culling: false,
    rectangular_lines: false,
    representative_fragment_test: false,
    robust_buffer_access: true,
    robust_buffer_access2: false,
    robust_image_access: true,
    robust_image_access2: true,
    runtime_descriptor_array: true,
    sample_rate_shading: true,
    sampler_anisotropy: true,
    sampler_filter_minmax: false,
    sampler_mip_lod_bias: false,
    sampler_mirror_clamp_to_edge: false,
    sampler_ycbcr_conversion: true,
    scalar_block_layout: true,
    separate_depth_stencil_layouts: false,
    separate_stencil_mask_ref: true,
    shader_buffer_float16_atomic_add: false,
    shader_buffer_float16_atomic_min_max: false,
    shader_buffer_float16_atomics: false,
    shader_buffer_float32_atomic_add: false,
    shader_buffer_float32_atomic_min_max: false,
    shader_buffer_float32_atomics: false,
    shader_buffer_float64_atomic_add: false,
    shader_buffer_float64_atomic_min_max: false,
    shader_buffer_float64_atomics: false,
    shader_buffer_int64_atomics: false,
    shader_clip_distance: true,
    shader_cull_distance: false,
    shader_demote_to_helper_invocation: false,
    shader_device_clock: false,
    shader_draw_parameters: true,
    shader_float16: true,
    shader_float64: false,
    shader_image_float32_atomic_add: false,
    shader_image_float32_atomic_min_max: false,
    shader_image_float32_atomics: false,
    shader_image_gather_extended: true,
    shader_image_int64_atomics: false,
    shader_input_attachment_array_dynamic_indexing: true,
    shader_input_attachment_array_non_uniform_indexing: true,
    shader_int16: true,
    shader_int64: true,
    shader_int8: true,
    shader_integer_dot_product: false,
    shader_integer_functions2: true,
    shader_output_layer: false,
    shader_output_viewport_index: false,
    shader_resource_min_lod: true,
    shader_resource_residency: false,
    shader_sample_rate_interpolation_functions: true,
    shader_sampled_image_array_dynamic_indexing: true,
    shader_sampled_image_array_non_uniform_indexing: true,
    shader_shared_float16_atomic_add: false,
    shader_shared_float16_atomic_min_max: false,
    shader_shared_float16_atomics: false,
    shader_shared_float32_atomic_add: false,
    shader_shared_float32_atomic_min_max: false,
    shader_shared_float32_atomics: false,
    shader_shared_float64_atomic_add: false,
    shader_shared_float64_atomic_min_max: false,
    shader_shared_float64_atomics: false,
    shader_shared_int64_atomics: false,
    shader_sm_builtins: false,
    shader_storage_buffer_array_dynamic_indexing: true,
    shader_storage_buffer_array_non_uniform_indexing: false,
    shader_storage_image_array_dynamic_indexing: true,
    shader_storage_image_array_non_uniform_indexing: true,
    shader_storage_image_extended_formats: true,
    shader_storage_image_multisample: false,
    shader_storage_image_read_without_format: true,
    shader_storage_image_write_without_format: true,
    shader_storage_texel_buffer_array_dynamic_indexing: true,
    shader_storage_texel_buffer_array_non_uniform_indexing: true,
    shader_subgroup_clock: false,
    shader_subgroup_extended_types: true,
    shader_subgroup_uniform_control_flow: false,
    shader_terminate_invocation: false,
    shader_tessellation_and_geometry_point_size: true,
    shader_uniform_buffer_array_dynamic_indexing: true,
    shader_uniform_buffer_array_non_uniform_indexing: false,
    shader_uniform_texel_buffer_array_dynamic_indexing: true,
    shader_uniform_texel_buffer_array_non_uniform_indexing: true,
    shader_zero_initialize_workgroup_memory: false,
    shading_rate_coarse_sample_order: false,
    shading_rate_image: false,
    smooth_lines: false,
    sparse_binding: false,
    sparse_image_float32_atomic_add: false,
    sparse_image_float32_atomic_min_max: false,
    sparse_image_float32_atomics: false,
    sparse_image_int64_atomics: false,
    sparse_residency16_samples: false,
    sparse_residency2_samples: false,
    sparse_residency4_samples: false,
    sparse_residency8_samples: false,
    sparse_residency_aliased: false,
    sparse_residency_buffer: false,
    sparse_residency_image2_d: false,
    sparse_residency_image3_d: false,
    stippled_bresenham_lines: false,
    stippled_rectangular_lines: false,
    stippled_smooth_lines: false,
    storage_buffer16_bit_access: true,
    storage_buffer8_bit_access: true,
    storage_input_output16: true,
    storage_push_constant16: true,
    storage_push_constant8: true,
    subgroup_broadcast_dynamic_id: false,
    subgroup_size_control: true,
    subpass_shading: false,
    supersample_fragment_shading_rates: false,
    synchronization2: false,
    task_shader: false,
    tessellation_isolines: false,
    tessellation_point_mode: false,
    tessellation_shader: true,
    texel_buffer_alignment: true,
    texture_compression_astc_hdr: true,
    texture_compression_astc_ldr: true,
    texture_compression_bc: true,
    texture_compression_etc2: true,
    timeline_semaphore: true,
    transform_feedback: false,
    transform_feedback_preserves_provoking_vertex: false,
    triangle_fans: false,
    uniform_and_storage_buffer16_bit_access: true,
    uniform_and_storage_buffer8_bit_access: true,
    uniform_buffer_standard_layout: true,
    variable_multisample_rate: false,
    variable_pointers: true,
    variable_pointers_storage_buffer: true,
    vertex_attribute_access_beyond_stride: true,
    vertex_attribute_instance_rate_divisor: true,
    vertex_attribute_instance_rate_zero_divisor: true,
    vertex_input_dynamic_state: false,
    vertex_pipeline_stores_and_atomics: true,
    vulkan_memory_model: false,
    vulkan_memory_model_availability_visibility_chains: false,
    vulkan_memory_model_device_scope: false,
    wide_lines: false,
    workgroup_memory_explicit_layout: false,
    workgroup_memory_explicit_layout16_bit_access: false,
    workgroup_memory_explicit_layout8_bit_access: false,
    workgroup_memory_explicit_layout_scalar_block_layout: false,
    ycbcr2plane444_formats: false,
    ycbcr_image_arrays: false,
}
thread 'learn::neural_network::autograd::tests::cross_entropy_loss_f32' panicked at 'assert_relative_eq!(y_array, y_true, max_relative = 0.000_001)

    left  = [[761.26965, 663.8111, 568.35254, 474.89398, 383.43542, 293.97687, 206.51831, 121.05977, 37.601215],
 [685.14264, 596.6841, 510.22556, 425.767, 343.30847, 262.8499, 184.39136, 107.9328, 33.47425],
 [609.0157, 529.5571, 452.0986, 376.64005, 303.1815, 231.72295, 162.26439, 94.80584, 29.34729],
 [532.88885, 462.4303, 393.97174, 327.51318, 263.05466, 200.5961, 140.13754, 81.678986, 25.220432],
 [456.76187, 395.3033, 335.8448, 278.38623, 222.92767, 169.46912, 118.01056, 68.552, 21.093452],
 ...,
 [372.17633, 320.7178, 271.25925, 223.80069, 178.34213, 134.88358, 93.425026, 53.966473, 16.50792],
 [296.04938, 253.59082, 213.13226, 174.6737, 138.21515, 103.7566, 71.29805, 40.839493, 12.3809395],
 [219.92229, 186.46375, 155.00519, 125.54664, 98.08809, 72.62954, 49.17099, 27.71244, 8.253891],
 [143.79535, 119.33679, 96.87824, 76.41969, 57.961143, 41.502594, 27.044044, 14.585495, 4.1269455],
 [67.668396, 52.209846, 38.751297, 27.292747, 17.834198, 10.3756485, 4.917099, 1.4585495, 0.0]], shape=[67, 9], strides=[9, 1], layout=Cc (0x5), const ndim=2
    right = [[761.26965, 663.8111, 568.35254, 474.894, 383.43546, 293.9769, 206.51834, 121.0598, 37.601242],
 [685.14264, 596.68414, 510.2256, 425.76706, 343.3085, 262.8499, 184.39137, 107.93283, 33.474277],
 [609.0157, 529.5572, 452.09863, 376.64008, 303.18152, 231.72296, 162.2644, 94.80586, 29.347311],
 [532.88873, 462.4302, 393.97168, 327.51312, 263.05457, 200.596, 140.13745, 81.6789, 25.220345],
 [456.76178, 395.30325, 335.8447, 278.38614, 222.9276, 169.46902, 118.01048, 68.55193, 21.09338],
 ...,
 [372.17627, 320.71774, 271.2592, 223.80063, 178.34207, 134.88351, 93.424965, 53.966415, 16.507862],
 [296.0493, 253.59076, 213.13222, 174.67366, 138.2151, 103.756546, 71.298, 40.83945, 12.380897],
 [219.92233, 186.46379, 155.00525, 125.54669, 98.08814, 72.629585, 49.171032, 27.712484, 8.253931],
 [143.79538, 119.33683, 96.87828, 76.41972, 57.961174, 41.50262, 27.044067, 14.585518, 4.1269655],
 [67.66841, 52.20986, 38.751312, 27.292759, 17.834208, 10.375655, 4.9171033, 1.4585518, 0.0]], shape=[67, 9], strides=[9, 1], layout=Cc (0x5), const ndim=2

', src/learn/neural_network/autograd.rs:1369:13
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace


failures:
    learn::neural_network::autograd::tests::cross_entropy_loss_f32

test result: FAILED. 145 passed; 1 failed; 2 ignored; 0 measured; 0 filtered out; finished in 5.21s

   Doc-tests autograph

running 2 tests
test src/lib.rs - buffer (line 62) - compile ... ok
test src/learn.rs - learn::neural_network (line 26) - compile ... ok

test result: ok. 2 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.23s
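
For anyone reading the one remaining failure: `cross_entropy_loss_f32` did not crash, it missed the `max_relative = 0.000_001` tolerance by a small margin on some elements. Here is a minimal sketch of that kind of check, using a few value pairs copied from the panic message above. The helper names are made up for illustration, and this is a simplified stand-in for what `assert_relative_eq!` does (as far as I know the real macro also applies an absolute epsilon check first), so treat it as arithmetic, not as the crate's implementation.

```rust
/// Simplified relative comparison: the absolute difference must not exceed
/// `max_relative` times the larger magnitude of the two inputs.
/// (Hypothetical helper, not the `approx` crate's code.)
pub fn relative_eq(a: f32, b: f32, max_relative: f32) -> bool {
    if a == b {
        return true;
    }
    let diff = (a - b).abs();
    let largest = a.abs().max(b.abs());
    diff <= largest * max_relative
}

/// Relative error of `a` against `b`, scaled by the larger magnitude.
/// Returns 0.0 when both inputs are zero.
pub fn relative_error(a: f32, b: f32) -> f32 {
    let largest = a.abs().max(b.abs());
    if largest == 0.0 {
        0.0
    } else {
        (a - b).abs() / largest
    }
}
```

With the logged pair `25.220432` vs `25.220345`, the difference is about 8.7e-5 on a magnitude of about 25, so the relative error is a few parts per million and the 1e-6 check fails. The first elements of each row (for example `761.26965`) match exactly and pass. This does not say which side is the more accurate one.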

rjzak on Jan 22, 2022

Well, that’s encouraging. I’m trying to figure out how to use ash-molten for the static linking. It seemed to work, but the function I used returned None for some reason. All the extensions are exposed through the API, though, so it will be possible to select impls based on those rather than guessing based on platform.
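
Roughly what selecting an impl from reported capabilities could look like, as a sketch only. `DeviceFeatures`, `FillF64Impl`, and `select_fill_f64` are hypothetical names invented for this example, not autograph types. The field names mirror a few entries from the `Features` dump in the log, and the "emulated" fallback is an assumption about what a device without native f64 might use.

```rust
/// Hypothetical subset of the device feature flags printed in the log.
#[derive(Debug, Clone, Copy, Default)]
pub struct DeviceFeatures {
    pub shader_float64: bool,
    pub shader_int8: bool,
    pub shader_int16: bool,
}

/// Hypothetical kernel variants for filling an f64 buffer.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FillF64Impl {
    /// Shader uses native 64-bit floats.
    Native,
    /// Shader writes the f64 bit pattern as two 32-bit words (assumed fallback).
    EmulatedAsU32Pairs,
}

/// Picks a variant from what the device reports, not from the OS or GPU vendor.
pub fn select_fill_f64(features: &DeviceFeatures) -> FillF64Impl {
    if features.shader_float64 {
        FillF64Impl::Native
    } else {
        FillF64Impl::EmulatedAsU32Pairs
    }
}
```

For the M1 dump above (`shader_float64: false`, `shader_int8: true`, `shader_int16: true`), this would pick the emulated variant without any platform check.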

charles-r-earp on Jan 22, 2022

OK, so that didn’t actually work as expected. Anyway, I was pretty sure that it was the atomic ops. Metal does support some atomic operations, but not atomic_or (used in the GLSL impls for storing bf16) or atomic_compare_exchange (Metal has atomic_compare_exchange_weak, which should work fine). So I think it will work if I use atomic_compare_exchange_weak.
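
To illustrate the idea of replacing an atomic OR with a weak compare-exchange loop, here is a CPU-side sketch in Rust using `std::sync::atomic`. It is not the GLSL or Metal shader code. The function names are hypothetical, and packing two 16-bit values into one 32-bit word is an assumption about how a bf16 store might be laid out.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Emulates an atomic OR with a weak compare-exchange loop.
/// Returns the value the cell held before the OR was applied.
pub fn fetch_or_via_cas(cell: &AtomicU32, bits: u32) -> u32 {
    let mut current = cell.load(Ordering::Relaxed);
    loop {
        // The weak variant may fail spuriously, so it always sits in a retry loop.
        match cell.compare_exchange_weak(
            current,
            current | bits,
            Ordering::AcqRel,
            Ordering::Relaxed,
        ) {
            Ok(previous) => return previous,
            Err(observed) => current = observed,
        }
    }
}

/// Writes a 16-bit value into the low (even index) or high (odd index) half
/// of a 32-bit word. Assumes that half is still zero, since OR can only set bits.
pub fn store_u16_half(cell: &AtomicU32, index: usize, value: u16) {
    let shift = 16 * (index as u32 & 1);
    fetch_or_via_cas(cell, (value as u32) << shift);
}
```

Starting from a zeroed word, storing `0x3F80` at index 0 and `0x4000` at index 1 leaves `0x4000_3F80` in the cell, regardless of which store lands first.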

However, it turns out that gfx-hal (the base API that I used to abstract over the 3 backends) uses spirv_cross to compile SPIR-V to HLSL but, for whatever reason, uses naga by default to compile for Metal. Naga is a shader translation tool written in Rust, but it doesn’t support everything yet and does not appear to support atomic operations at all. So I should be able to use spirv_cross instead, which should fix the issue. At the very least it should be able to parse the SPIR-V, as it has worked on DX12, but there may still be limitations or things that don’t translate to Metal. I’m working on validating this at shader compile time to catch issues like this. I should be able to check that the shaders compile to Metal and HLSL cross-platform, including on CI.

But hopefully with a few small changes everything should work correctly without separate impls.

Eventually I would like to fix the issue with shader compilation at runtime poisoning the device; it would potentially be nicer if it blocked instead. That way you could try different versions with different capabilities / extensions, which would be more flexible.

charles-r-earp on Dec 21, 2021

Awesome! I think I found the issue. Can you pull the changes and retry please? It looks like I requested CPU_CACHED for host memory when that is not always available.
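
The general shape of that kind of fix, as a sketch: ask for the preferred property set first and fall back to weaker ones when no memory type offers it. The constants and functions below are hypothetical stand-ins written for this example (plain `u32` bit masks named after the properties discussed in this thread), not the gfx-hal or autograph API, and the preference order is an assumption.

```rust
// Hypothetical property bits, named after the properties discussed in this thread.
pub const DEVICE_LOCAL: u32 = 1 << 0;
pub const CPU_VISIBLE: u32 = 1 << 1;
pub const COHERENT: u32 = 1 << 2;
pub const CPU_CACHED: u32 = 1 << 3;

/// Tries each preference in order and returns the index of the first memory
/// type whose properties contain every requested bit.
pub fn pick_memory_type(available: &[u32], preferences: &[u32]) -> Option<usize> {
    preferences.iter().find_map(|&wanted| {
        available.iter().position(|&props| props & wanted == wanted)
    })
}

/// Host-side staging memory: prefer cached, but do not require it.
pub fn pick_host_memory_type(available: &[u32]) -> Option<usize> {
    pick_memory_type(
        available,
        &[
            CPU_VISIBLE | COHERENT | CPU_CACHED,
            CPU_VISIBLE | COHERENT,
            CPU_VISIBLE,
        ],
    )
}
```

On a made-up discrete-GPU layout with a `CPU_VISIBLE | COHERENT | CPU_CACHED` type, the cached type is chosen. On a made-up unified layout that only offers `DEVICE_LOCAL | CPU_VISIBLE | COHERENT`, the second preference matches instead of the lookup failing.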

charles-r-earp on Dec 14, 2021