edgetpu: Installation failing on Raspberry Pi CM4 for PCI-E driver

Following the installation guide for the M.2, I get several compilation errors when it’s trying to install gasket. Here is the log of the make process: gasket-make.log

It seems it’s mostly the same few errors:

  • invalid use of undefined type ‘struct msix_entry’
  • implicit declaration of function ‘writeq_relaxed’; did you mean ‘writel_relaxed’
  • implicit declaration of function ‘readq_relaxed’; did you mean ‘readw_relaxed’
  • implicit declaration of function ‘pci_disable_msix’; did you mean ‘pci_disable_sriov’

This is using gcc version 8.3.0 on the latest Raspbian with kernel 5.4.51-v7l+. I’m unsure whether this is a compiler, kernel header or code issue.

About this issue

  • State: open
  • Created 4 years ago
  • Comments: 93 (13 by maintainers)

Most upvoted comments

Ok,

I’ve committed my changes as a fork of google-coral/libedgetpu and google/gasket-driver here:

https://github.com/ghollingworth/libedgetpu https://github.com/ghollingworth/gasket-driver

I’ve tried to keep my changes to an absolute minimum, so they’re easy to understand

To force the CM4 to use a window of 0x01000040 you just need to copy pcie-32bit-dma-overlay.dts and change the dma-ranges to:

		dma-ranges = <0x02000000 0x01000040 0x00000000 0x0 0x00000000
			      0x0 0x80000000>;

But as I said above, this will help for a few accesses, but then the upper 32 bits of the page table address register change to something else and this no longer works (though it can be used as a proof of concept).
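As a rough illustration of how the dma-ranges cells above combine into addresses, here is a minimal sketch. It assumes the usual PCI layout of three child address cells (a flags cell plus a 64-bit PCI address), two parent (CPU) address cells and two size cells; the struct and function names are made up for this example.

```c
#include <stdint.h>

/* One decoded dma-ranges entry. The 3 + 2 + 2 cell layout is an assumption
 * about the bus node this overlay targets. */
struct dma_range {
    uint32_t pci_flags;  /* first cell, kept as-is */
    uint64_t pci_addr;   /* cells 1 and 2: PCI (device-side) address */
    uint64_t cpu_addr;   /* cells 3 and 4: CPU physical address */
    uint64_t size;       /* cells 5 and 6: window size in bytes */
};

/* Join two 32-bit device tree cells into one 64-bit value. */
uint64_t join_cells(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 32) | lo;
}

/* Decode the seven cells of a single dma-ranges entry. */
struct dma_range decode_dma_range(const uint32_t cells[7])
{
    struct dma_range r;
    r.pci_flags = cells[0];
    r.pci_addr  = join_cells(cells[1], cells[2]);
    r.cpu_addr  = join_cells(cells[3], cells[4]);
    r.size      = join_cells(cells[5], cells[6]);
    return r;
}
```

Under those assumptions the entry above decodes to PCI address 0x0100004000000000, CPU address 0x0 and size 0x80000000 (2 GiB), i.e. 0x01000040 ends up as the upper 32 bits of every address the device has to emit.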

So, having spent a little time understanding this problem I’ve come to the following conclusions:

Doing a downstream 64bit access (either read or write) through the PCIe interface is broken into two 32bit accesses incorrectly such that the least significant word is repeated. So if you do

*(unsigned long long *) addr = 0x12345678aabbccdd;

It’ll actually write 0xaabbccddaabbccdd to the address. This is only in the downstream direction (root to device); the upstream memory bus is not affected (i.e. when the device accesses main memory it doesn’t matter what width of data is used).

This means that a possible solution would instead be to do the following:

*(unsigned long *) addr = 0xaabbccdd;
*(unsigned long *) (addr+4) = 0x12345678;

This will work correctly through the downstream interface sending two separate 32bit writes to the hardware.
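A slightly more defensive version of the split access might look like the sketch below. The function names are hypothetical, `reg` stands for a pointer to an already-mapped 64-bit register, and fixed-width types are used so the word size does not depend on what `unsigned long` means on a given build.

```c
#include <stdint.h>

/* Write a 64-bit value as two separate 32-bit stores, least significant
 * word first. `reg` is a hypothetical pointer to a mapped 64-bit register. */
void write64_split(volatile void *reg, uint64_t value)
{
    volatile uint32_t *words = (volatile uint32_t *)reg;
    words[0] = (uint32_t)(value & 0xffffffffu);  /* offset +0 */
    words[1] = (uint32_t)(value >> 32);          /* offset +4 */
}

/* Read a 64-bit value back as two separate 32-bit loads. */
uint64_t read64_split(const volatile void *reg)
{
    const volatile uint32_t *words = (const volatile uint32_t *)reg;
    uint32_t lo = words[0];
    uint32_t hi = words[1];
    return ((uint64_t)hi << 32) | lo;
}
```

This is a user-space sketch only. In kernel code the same pattern would normally go through the MMIO accessors, and the kernel ships a header, linux/io-64-nonatomic-lo-hi.h, that implements 64-bit accessors as two 32-bit accesses in this order.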

Doing this I’ve been able to get further with the code such that it seems to correctly set up the registers in the Google Coral device. The problem occurs when the device tries to access data from main memory (in the upstream direction). It does this through some page tables on the device which first need to be programmed up to point to the scattered user data in physical memory.

These page table registers/memories on the Google Coral device cannot be written to 32bits at a time, for example, if you write the previous values as two 32bit accesses (least significant word first) to one of these page table registers and read back the data you get:

print("%04x", *(unsigned long *) addr);       // 0xyyyyyyyy
print("%04x", *(unsigned long *) (addr+4));   // 0x12345678

Or if you write most significant word first you get:

print("%04x", *(unsigned long *) addr);       // 0xaabbccdd
print("%04x", *(unsigned long *) (addr+4));   // 0xyyyyyyyy

Whichever word gets written second will be correct, but the other word looks like something hanging around in a latch somewhere (0xyyyyyyyy just means it’s not absolutely certain what this number will be). I did find that, in general, if you only wrote to the lower 32 bits, the upper 32 bits got set to 0x01000040.
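The observed behaviour can be captured in a toy model, which may help when reasoning about why neither write order works. This is purely an illustration built from the readbacks described above, not a description of the real hardware; the struct, the `latch` field and the function name are invented for the sketch.

```c
#include <stdint.h>

/* Toy model (an assumption, not the real device): a 64-bit register where
 * every 32-bit write stores its own half and clobbers the other half with
 * whatever value is sitting in an internal latch. */
struct pt_reg_model {
    uint32_t lo;
    uint32_t hi;
    uint32_t latch;  /* stale value, e.g. 0x01000040 in some observations */
};

/* Apply one 32-bit write to the modelled register. */
void model_write32(struct pt_reg_model *reg, int upper_half, uint32_t value)
{
    if (upper_half) {
        reg->hi = value;
        reg->lo = reg->latch;  /* other half is lost */
    } else {
        reg->lo = value;
        reg->hi = reg->latch;  /* other half is lost */
    }
}
```

In this model, writing the low word and then the high word leaves only the high word intact, and the reverse order leaves only the low word intact, which matches both readbacks above. It also shows why a fixed latch value would have made the lower-word-only trick viable.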

So I changed the upstream PCIe window to be 0x01000040 and forced the driver to only use simple page table entries (rather than extended page table entries)…

Suddenly, I was getting some upstream accesses working… Around about the first couple of kB of accesses get through correctly! RESULT!

Actually no, because soon after that the most significant word changes to something else. It seems that the 0x01000040 is not fixed (kind of obvious really), and suddenly the PCIe root just fails the transaction and you end up in the same place.

So, this now requires a Google Coral hardware engineer to look at the verilog around the page table AXI bus interface to see whether it is possible somehow to set both words using only 32bit accesses. This may be possible using some internally programmable master that can write to those registers (a DMA controller or processor of some type).

I will create a pull request against google-coral/libedgetpu and mbrooksx/gasket-driver with my changes, limited to just the changes required to at least get this far…

I completely agree about the potential with the combination. At this point, it looks like an irreparable hardware issue with the antiquated CM4 PCIe module. I have forced all the allocations into simple mapping (see above for more info about this) so that all the virtual addresses are 32-bit, as well as previously setting all reads/writes to 32-bit. However, the device itself (in hardware) makes reads/writes in the coherent cache, and all of these reads/writes are 64-bit.

For now, the plan is to wait until the office is open so we can use a PCIe analyzer and confirm this hypothesis. But there don’t appear to be any additional changes that we can make in SW: the expectation that a host can perform 64-bit reads/writes is built into the device hardware.

USB is still the recommendation for the CM4. USB2.0 is possible out of box, and USB3.0 may be possible although extra design considerations are required (more info here: https://coral.ai/products/accelerator-module/).

@timonsku : Yes, I’m actively working with the people in the Pi forum discussion. While MSI-X isn’t technically supported by the BCM2711, as you saw from that patch if SW indicates it works then the PCIe hardware is actually able to map some MSI-X interrupts correctly.

We’ve validated further than you have (including MSI-X). Your errors occur because you’re building for the 32-bit kernel but the driver expects 64-bit read/write (which is why writeq/readq don’t exist). My plan is to customize the driver for the Pi (including 32-bit workarounds) and likely submit it to the Pi kernel rather than trying to update our DKMS package. I will keep you informed of the status.

@SamueldeFaria could you maybe specify which product you have chosen instead? This may be helpful for others who follow, or who stumble on this issue in the future.

It would be so sad if it were never possible to use the Coral boards via PCIe on the CM4. The combo is the perfect high performance - low power - compact form factor - multi camera - mainline kernel supported - embedded inference platform. Please, please find a way to make it usable.

incredible this issue is still open after 2 years

Any updates on this issue?

There was a tentative plan to investigate further when a PCIe Analyzer became available. Have these tests been done?

Thanks!

Although I cannot help, I come here every day hoping to see that they can work together ^^

I unfortunately don’t have an estimated date. The CM4 PCIe hardware is antiquated, and there are endless hacks required to try to have it operate competently (note that the TPU is a PCIe bus master, and I don’t see any evidence of a bus master ever being tested with the CM4). We haven’t been receiving the support needed from the Pi team, so for now it’s continuing to try things to understand the issues with communication (at this point it seems an issue with the shared memory). It may be within the next few weeks for operation (in which case I would post the hacked up version for your evaluation while we decide the best way to release this without polluting the main Coral codebase). I will keep this thread up to date.

Depending on the board configuration, USB may be a better choice.

If someone at Google is working on it, or is going to, it would be nice to get a very rough ETA (weeks, months) on when we can expect to know whether or not the TPU will ever work over PCIe on a CM4. I’ll be creating a new revision of my product’s PCB in a few weeks, and if there’s very little chance the PCIe TPU will work anytime soon, I’ll have to switch both to USB.

I’d like to add some clarifications to the comments from SamueldeFaria above. The reason the Coral team could not provide host board design support for the CM4 with the Coral Accelerator is that we won’t know whether a certain board design is going to work until the board is physically made and tuned, given the stated reason of PHY trace. We asked for some information about what product a developer is making or what customer they serve not to obtain confidential product information or with any nefarious intention, as the comment might have implied, but simply to get a sense of whether the product has a compelling use case or an important customer that might merit extra assistance from our engineering team. Due to the sheer volume of support requests we receive about potentially using the CM4 + Coral Accelerator, we use this information to help assess and prioritize the inquiries, so we can allocate limited engineering support resources to help as many key customers as possible. Of course, if anyone is not comfortable telling us what they are developing, there’s no need to reply to our follow-up.

Regardless, some customers have successfully implemented a USB3 interface with the Coral Accelerator Module in their product design, since the PCIe interface with the CM4 is out of the question at the moment. One of them is Upverter, who offer board design tools and services to other customers who’d like to leverage their experience and expertise in designing CM4-based host boards with the Coral Accelerator integrated. Anyone who’d like to try a USB3 interface can also try their tools and services. Thanks!

Reading this thread, it is still not clear to me whether there is a hardware limitation that prevents us from having CM4 + Edge TPU via PCIe in the future. Is this a complicated software issue or potentially a hardware dead end?

It seems that Gumstix for example has at least one board with the Edge TPU via PCI-E interface: https://www.gumstix.com/cm4-uprev-ai.html

According to Gumstix and Google, their solution is CM4 -> PCIe -> USB3 -> Coral TPU, so you still get the fast performance instead of CM4 -> USB2 -> Coral TPU. Gumstix is using ASM1142 USBH1 to Coral Accelerator Module.

As long as this issue is still open, there’s no working solution yet for CM4 -> PCIe -> Coral TPU.

Thanks for your kind response. My questions are: 1 - Why don’t you provide complete and professionally made datasheets as all the other manufacturers do? 2 - What’s the idea behind providing that link? A company that is designing hardware, as we are? Thanks but no thanks. I’m glad I didn’t provide any information after all.

The product is working now with a solution from another manufacturer. They were really helpful and provided all the documentation needed. No client ID, product type/idea, production numbers or … I will look to them again if needed for another product. As far as I’m concerned, I will not look to you for any other hardware solutions you may have.

Has anyone had a go at this? I’ve done a bit of debugging and hacking myself and got the kernel module to load and libedgetpu to start an inference (although it never finishes, some event is missing, and there is an HIB error?).

There are some changes needed in both the kernel module and the user-space drivers, so far primarily replacing 64bit memory accesses with two 32bit ones. My progress is here for the module which I have updated to the latest version from the dkms package and here for libedgetpu, but these changes are of course nowhere near merge-quality.

This is what libedgetpu logs:

I :273] Starting in normal mode
I :83] Opening /dev/apex_0. read_only=0
I :97] mmap_offset=0x0000000000040000, mmap_size=4096
I :108] Got map addr at 0x0xb6fde000
I :97] mmap_offset=0x0000000000044000, mmap_size=4096
I :108] Got map addr at 0x0xb6fdd000
I :97] mmap_offset=0x0000000000048000, mmap_size=4096
I :108] Got map addr at 0x0xb6fdc000
I :229] Read: offset = 0x00000000000486f0, value: = 0x0000000000000000, w0=0x00000000, w1=0x00000000
I :191] Write: offset = 0x00000000000487a8, value = 0x0000000000000000
I :229] Read: offset = 0x0000000000048578, value: = 0x0000000000000010, w0=0x00000010, w1=0x00000000
I :136] MmuMapper#Map() : 00000000b6627000 -> 0000000001000000 (1 pages) flags=00000000.
I :55] MapMemory() page-aligned : device_address = 0x0000000001000000
I :169] Queue base : 0xb6627000 -> 0x0000000001000000 [4096 bytes]
I :136] MmuMapper#Map() : 00000000b6628000 -> 0000000001001000 (1 pages) flags=00000000.
I :55] MapMemory() page-aligned : device_address = 0x0000000001001000
I :179] Queue status block : 0xb6628000 -> 0x0000000001001000 [16 bytes]
I :191] Write: offset = 0x0000000000048590, value = 0x0000000001000000
I :191] Write: offset = 0x0000000000048598, value = 0x0000000001001000
I :191] Write: offset = 0x00000000000485a0, value = 0x0000000000000100
I :191] Write: offset = 0x0000000000048568, value = 0x0000000000000005
I :229] Read: offset = 0x0000000000048570, value: = 0x0000000000000001, w0=0x00000001, w1=0x00000000
I :229] Read: offset = 0x00000000000486d0, value: = 0x0000000000000000, w0=0x00000000, w1=0x00000000
I :191] Write: offset = 0x0000000000044018, value = 0x0000000000000001
I :191] Write: offset = 0x0000000000044158, value = 0x0000000000000001
I :191] Write: offset = 0x0000000000044198, value = 0x0000000000000001
I :191] Write: offset = 0x00000000000441d8, value = 0x0000000000000001
I :191] Write: offset = 0x0000000000044218, value = 0x0000000000000001
I :191] Write: offset = 0x0000000000048788, value = 0x000000000000007f
I :229] Read: offset = 0x0000000000048788, value: = 0x000000000000007f, w0=0x0000007f, w1=0x00000000
I :191] Write: offset = 0x00000000000400c0, value = 0x0000000000000001
I :191] Write: offset = 0x0000000000040150, value = 0x0000000000000001
I :191] Write: offset = 0x0000000000040110, value = 0x0000000000000001
I :191] Write: offset = 0x0000000000040250, value = 0x0000000000000001
I :191] Write: offset = 0x0000000000040298, value = 0x0000000000000001
I :191] Write: offset = 0x00000000000402e0, value = 0x0000000000000001
I :191] Write: offset = 0x0000000000040328, value = 0x0000000000000001
I :191] Write: offset = 0x0000000000040190, value = 0x0000000000000001
I :191] Write: offset = 0x00000000000401d0, value = 0x0000000000000001
I :191] Write: offset = 0x0000000000040210, value = 0x0000000000000001
I :191] Write: offset = 0x00000000000486e8, value = 0x0000000000000000
I :45] Set event fd : event_id:0 -> event_fd:7,
I :45] Set event fd : event_id:4 -> event_fd:11,
I :62] event_fd=7. Monitor thread begin.
I :45] Set event fd : event_id:5 -> event_fd:12,
I :45] Set event fd : event_id:6 -> event_fd:13,
I :62] event_fd=12. Monitor thread begin.
I :62] event_fd=11. Monitor thread begin.
I :45] Set event fd : event_id:7 -> event_fd:14,
I :62] event_fd=13. Monitor thread begin.
I :45] Set event fd : event_id:8 -> event_fd:15,
I :62] event_fd=14. Monitor thread begin.
I :45] Set event fd : event_id:9 -> event_fd:16,
I :45] Set event fd : event_id:10 -> event_fd:17,
I :62] event_fd=15. Monitor thread begin.
I :45] Set event fd : event_id:11 -> event_fd:18,
I :62] event_fd=16. Monitor thread begin.
I :62] event_fd=17. Monitor thread begin.
I :45] Set event fd : event_id:12 -> event_fd:19,
I :62] event_fd=18. Monitor thread begin.
I :191] Write: offset = 0x00000000000486a0, value = 0x000000000000000f
I :191] Write: offset = 0x00000000000485c0, value = 0x0000000000000001
I :191] Write: offset = 0x00000000000486c0, value = 0x0000000000000001
I :172] Opening device at /dev/apex_0
I :62] event_fd=19. Monitor thread begin.
I :75] event_fd=19. Monitor thread got num_events=1.
I :191] Write: offset = 0x00000000000486c0, value = 0x0000000000000000
I :191] Write: offset = 0x00000000000486c8, value = 0x0000000000000000
I :229] Read: offset = 0x00000000000486f0, value: = 0x0000000000000001, w0=0x00000001, w1=0x00000000
I :229] Read: offset = 0x0000000000048700, value: = 0x0000000000000001, w0=0x00000001, w1=0x00000000
E :254] HIB Error. hib_error_status = 0000000000000001, hib_first_error_status = 0000000000000001
I :75] event_fd=19. Monitor thread got num_events=1.
I :191] Write: offset = 0x00000000000486c0, value = 0x0000000000000000
I :191] Write: offset = 0x00000000000486c8, value = 0x0000000000000000
I :229] Read: offset = 0x00000000000486f0, value: = 0x0000000000000001, w0=0x00000001, w1=0x00000000
I :229] Read: offset = 0x0000000000048700, value: = 0x0000000000000001, w0=0x00000001, w1=0x00000000
E :254] HIB Error. hib_error_status = 0000000000000001, hib_first_error_status = 0000000000000001
----INFERENCE TIME----
Note: The first inference on Edge TPU is slow because it includes loading the model into Edge TPU memory.
I :47] Adding input "map/TensorArrayStack/TensorArrayGatherV3" with 150528 bytes.
I :58] Adding output "prediction" with 965 bytes.
I :167] Request prepared, total batch size: 1, total TPU requests required: 1.
I :310] Request [0]: Submitting P0 request immediately.
I :373] Request [0]: Need to map parameters.
I :136] MmuMapper#Map() : 00000000ad93d000 -> 8000000000000000 (953 pages) flags=00000002.
I :55] MapMemory() page-aligned : device_address = 0x8000000000000000
I :252] Mapped params : Buffer(ptr=0xad93d000) -> 0x8000000000000000, 3900864 bytes.
I :252] Mapped params : Buffer(ptr=(nil)) -> 0x0000000000000000, 0 bytes.
I :387] Request [0]: Need to do parameter-caching.
I :80] [0] Request constructed.
I :46] InstructionBuffers created.
I :653] Created new instruction buffers.
I :75] Mapped scratch : Buffer(ptr=(nil)) -> 0x0000000000000000, 0 bytes.
I :368] MapDataBuffers() done.
I :187] Linking Parameter: 0x8000000000000000
I :136] MmuMapper#Map() : 0000000001266000 -> 8000000000400000 (3 pages) flags=00000002.
I :55] MapMemory() page-aligned : device_address = 0x8000000000400000
I :223] Mapped "instructions" : Buffer(ptr=0x1266000) -> 0x8000000000400000, 9680 bytes. Direction=1
I :384] MapInstructionBuffers() done.
I :481] [0] SetState old=0, new=1.
I :393] [0] NotifyRequestSubmitted()
I :481] [0] SetState old=1, new=2.
I :83] Request[0]: Submitted
I :401] [0] NotifyRequestActive()
I :481] [0] SetState old=2, new=3.
I :133] Request[0]: Scheduling DMA[0]
I :394] Adding an element to the host queue.
I :191] Write: offset = 0x00000000000485a8, value = 0x0000000000000001
I :80] [1] Request constructed.
I :113] Adding input "map/TensorArrayStack/TensorArrayGatherV3" with 150528 bytes.
I :188] Adding output "prediction" with 965 bytes.
I :46] InstructionBuffers created.
I :653] Created new instruction buffers.
I :75] Mapped scratch : Buffer(ptr=(nil)) -> 0x0000000000000000, 0 bytes.
I :136] MmuMapper#Map() : 0000000001226000 -> 8000000000440000 (38 pages) flags=00000002.
I :55] MapMemory() page-aligned : device_address = 0x8000000000440000
I :223] Mapped "map/TensorArrayStack/TensorArrayGatherV3" : Buffer(ptr=0x1226440) -> 0x8000000000440440, 150528 bytes. Direction=1
I :136] MmuMapper#Map() : 0000000001276000 -> 8000000000404000 (1 pages) flags=00000004.
I :55] MapMemory() page-aligned : device_address = 0x8000000000404000
I :223] Mapped "prediction" : Buffer(ptr=0x1276000) -> 0x8000000000404000, 968 bytes. Direction=2
I :368] MapDataBuffers() done.
I :93] Linking map/TensorArrayStack/TensorArrayGatherV3[0]: 0x8000000000440440
I :93] Linking prediction[0]: 0x8000000000404000
I :136] MmuMapper#Map() : 00000000012b9000 -> 8000000000420000 (32 pages) flags=00000002.
I :55] MapMemory() page-aligned : device_address = 0x8000000000420000
I :223] Mapped "instructions" : Buffer(ptr=0x12b9000) -> 0x8000000000420000, 129536 bytes. Direction=1
I :384] MapInstructionBuffers() done.
I :481] [1] SetState old=0, new=1.
I :393] [1] NotifyRequestSubmitted()
I :481] [1] SetState old=1, new=2.
I :83] Request[1]: Submitted
I :401] [1] NotifyRequestActive()
I :481] [1] SetState old=2, new=3.
I :133] Request[1]: Scheduling DMA[0]
I :394] Adding an element to the host queue.
I :191] Write: offset = 0x00000000000485a8, value = 0x0000000000000002
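The page counts in the MmuMapper#Map() lines of this log are consistent with a 4096-byte host page size, counting every page the buffer touches (including a partial first page when the buffer is not page-aligned). The sketch below reproduces that arithmetic; the 4096-byte page size and the function name are assumptions made for illustration, not taken from libedgetpu.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE_BYTES 4096u  /* assumed host page size */

/* Number of pages touched by a buffer of `length` bytes starting at `start`. */
size_t pages_spanned(uintptr_t start, size_t length)
{
    if (length == 0)
        return 0;
    uintptr_t first = start / PAGE_SIZE_BYTES;
    uintptr_t last = (start + length - 1) / PAGE_SIZE_BYTES;
    return (size_t)(last - first + 1);
}
```

For example, the 3900864-byte parameter buffer at 0xad93d000 spans 953 pages, and the 150528-byte input at 0x1226440 spans 38 pages because it starts 0x440 bytes into its first page.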

Also the only interrupt firing seems to be the fatal error one:

cat /sys/class/apex/apex_0/interrupt_counts
0x00: 0
0x01: 0
0x02: 0
0x03: 0
0x04: 0
0x05: 0
0x06: 0
0x07: 0
0x08: 0
0x09: 0
0x0a: 0
0x0b: 0
0x0c: 2
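When repeating this check during debugging, it can be handy to pick out the non-zero counters automatically. The following is a small sketch that parses text in the format shown above; the function name is made up, and it only assumes lines of the form `0x0c: 2`.

```c
#include <stdio.h>

/* Count the interrupt indices with a non-zero count in text formatted like
 * the interrupt_counts output above. The index of the last non-zero entry
 * is stored in *last_index (left untouched if there is none). */
int nonzero_interrupts(const char *text, unsigned *last_index)
{
    int found = 0;
    unsigned index;
    unsigned long count;
    int consumed;

    /* Each line looks like "0x0c: 2"; %x accepts the 0x prefix. */
    while (sscanf(text, " %x: %lu%n", &index, &count, &consumed) == 2) {
        if (count != 0) {
            found++;
            *last_index = index;
        }
        text += consumed;  /* advance past the entry just parsed */
    }
    return found;
}
```

Fed the output above, this reports a single non-zero entry at index 0x0c.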

I’m just going to do a shameless plug here: https://github.com/will127534/Coral-USB3-M2-Module A fully open-source design that passes the CTS tests.

No it doesn’t… It’s no more promising than any of the other products that will not work due to hardware limitations of the BCM2711 and the Google Coral device.

I designed an M.2 card with the Coral Accelerator Module that seems to work fine with the Piunora CM4 baseboard. Test suggestions are welcome.

Hello, I’m thinking about getting a Coral for my Raspberry Pi 4 (not a Compute Module). The easy way would be to get the USB stick, but I live in France and it is not delivered there. Would I get all the problems you guys are getting if I were to plug a mini PCIe Coral into my Pi through something like mini PCIe -> USB -> RPi4?

This has nothing to do with this issue. PCIe is used on the RPi4 for USB3 and needs customization to make use of it. I think your best chance is still to get the USB Accelerator, which uses USB3.

At Google I/O 2021, the Coral team announced companies they were working with to develop TPU projects, and Gumstix was one. Gumstix has a Pixhawk development board which uses Coral and CM4. I asked the Coral team if that unit works (since it uses the PCIe interface to talk to the TPU). Their response was:

“Unfortunately, we haven’t been able to run the TPU on a 32-bit system (estaban - am assuming this means any?) . Please refer to this issue: https://github.com/google-coral/edgetpu/issues/280 (estaban - this posting). The CM4 has a 32-bit bus, and despite changing both the driver and userspace (see bug for links to GitHub repos with those changes) - the device still is hardcoded to issue 64-bit operations. We expect that the 32-bit host simply omits the upper word, leading to invalid read/writes (as reflected in the HIB error).”

The bottom of the email has the following bug report fields: Status - In-progress Priority - Medium Status Detail - Assigned

Not solved but maybe not dropped… and maybe no 32-bit processor can use it. That seems important for an Edge TPU.

R/Estaban

@n1mda - Coral and CM4 are a no go. Coral seems to work on the Pi 5 (and hopefully the CM5 when it is released), as it has a more compliant PCIe bus.

I’m not going to sell this. I’ve been using this board to evaluate the Coral module, but its performance, its availability (the one you saw in the image took 6 months of waiting) and having a USB3 controller in the middle when both device and host support PCIe just don’t make sense in terms of power, cost and complexity. The git repo is more about documenting the USB3 capability of the Coral module that Google hides from its datasheet.

Really appreciate you spending so much time and energy on this.

Thanks Manoj. I have posted the query on the sales link, but it usually takes a long time to hear back from them. Since this is a bit urgent, I was hoping someone from Google here can connect me to the right team to take this forward.

On 13-Apr-2022, at 12:24 PM, Manoj @.***> wrote:

@manishbuttan Please contact Coral sales via the link given at https://coral.ai/products/accelerator-module#tech-specs

@vebmaster - So far I haven’t been able to get that compute module (nor Pine64’s SOQuartz) to get to a state where I can test it but TPUs are some of the first devices I’m planning on testing!

Will this problem be wherever “Raspberry Pi CM4” is used or are there exceptions?

Did I understand correctly that at the moment (December 2021) it makes no sense to buy “Raspberry Pi CM4” to use “Coral edge Mini PCIe” and “Coral edge m.2”?

There hasn’t been an update to the gasket driver (which I’ve now moved to https://github.com/google/gasket-driver) or libedgetpu to enable 32-bit operation required by the CM4. The only way we’re aware of to communicate between the CM4 and the TPU is via USB3.

This can be accomplished by starting with a known-good design from Gumstix and customizing in Upverter (based on this board) or by reaching out to Coral Sales to discuss how to build your own USB3 design.

As for the performance of the USB3 + CM4, here is the models_benchmark output on the GumStix PoE camera (this can be compared to the tested CTS runs) - you’ll see it significantly outperforms the USB2 design (Dev Board Mini) but is slightly less than the x86 USB3 (due to the more powerful CPU) or Dev Board (due to the slight latency added with the PCIe-USB bridge on the CM4 design).

-----------------------------------------------------
models_benchmark
-----------------------------------------------------
2021-05-13 22:49:44
Running /home/pi/coral/cts/models_benchmark
Run on (4 X 1500 MHz CPU s)
***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead.
-----------------------------------------------------------------------------------------------------------
Benchmark                                                                    Time           CPU Iterations
-----------------------------------------------------------------------------------------------------------
BM_MobileNetV1<coral::kEdgeTpu>                                        2474282 ns     123110 ns       6085
BM_MobileNetV1<coral::kCpu>                                          264763991 ns  264650975 ns          3
BM_MobileNetV1_25<coral::kEdgeTpu>                                      992947 ns     133297 ns       7661
BM_MobileNetV1_25<coral::kCpu>                                        15568433 ns   15554960 ns         35
BM_MobileNetV1_50<coral::kEdgeTpu>                                     1334842 ns     125070 ns       6861
BM_MobileNetV1_50<coral::kCpu>                                        57824373 ns   57769407 ns          9
BM_MobileNetV1_75<coral::kEdgeTpu>                                     1688143 ns     123297 ns       6376
BM_MobileNetV1_75<coral::kCpu>                                       139678319 ns  139581462 ns          6
BM_MobileNetV1_L2Norm<coral::kEdgeTpu>                                 3427892 ns    1134517 ns        491
BM_MobileNetV1_L2Norm<coral::kCpu>                                   268103838 ns  267834574 ns          2
BM_MobileNetV2<coral::kEdgeTpu>                                        2713575 ns     117379 ns       1000
BM_MobileNetV2<coral::kCpu>                                          259893815 ns  259716474 ns          3
BM_MobileNetV2INatPlant<coral::kEdgeTpu>                               2957550 ns     140127 ns       1000
BM_MobileNetV2INatPlant<coral::kCpu>                                 272773147 ns  272692758 ns          2
BM_MobileNetV2INatInsect<coral::kEdgeTpu>                              2756209 ns     128306 ns       1000
BM_MobileNetV2INatInsect<coral::kCpu>                                288822651 ns  288533369 ns          3
BM_MobileNetV2INatBird<coral::kEdgeTpu>                                2697566 ns     111712 ns       1000
BM_MobileNetV2INatBird<coral::kCpu>                                  253786802 ns  253703727 ns          3
BM_SsdMobileNetV1<coral::kEdgeTpu>                                   178930700 ns  164907000 ns          4
BM_SsdMobileNetV1<coral::kCpu>                                       817034721 ns  816729350 ns          1
BM_SsdMobileNetV2<coral::kEdgeTpu>                                    19009941 ns    8747667 ns         89
BM_SsdMobileNetV2<coral::kCpu>                                       600252151 ns  599998258 ns          1
BM_FaceSsd<coral::kEdgeTpu>                                           21361053 ns   15720419 ns         56
BM_FaceSsd<coral::kCpu>                                              630894423 ns  630591164 ns          1
BM_InceptionV1<coral::kEdgeTpu>                                        3458645 ns     138411 ns       1000
BM_InceptionV1<coral::kCpu>                                          538686872 ns  538445563 ns          2
BM_InceptionV2<coral::kEdgeTpu>                                       16778940 ns     180806 ns       1000
BM_InceptionV2<coral::kCpu>                                          783592939 ns  783126423 ns          1
BM_InceptionV3<coral::kEdgeTpu>                                       49608231 ns     190925 ns        100
BM_InceptionV3<coral::kCpu>                                         2072519064 ns 2072098252 ns          1
BM_InceptionV4<coral::kEdgeTpu>                                       96992586 ns     200406 ns        100
BM_InceptionV4<coral::kCpu>                                         4660374403 ns 4658627612 ns          1
BM_EfficientNetEdgeTpuSmall<coral::kEdgeTpu>                           5179719 ns     127570 ns       1000
BM_EfficientNetEdgeTpuSmall<coral::kCpu>                             978528500 ns  978162145 ns          1
BM_EfficientNetEdgeTpuMedium<coral::kEdgeTpu>                          9829099 ns     179685 ns       1000
BM_EfficientNetEdgeTpuMedium<coral::kCpu>                           1640263796 ns 1639636846 ns          1
BM_EfficientNetEdgeTpuLarge<coral::kEdgeTpu>                          27017634 ns     184506 ns        100
BM_EfficientNetEdgeTpuLarge<coral::kCpu>                            4013653278 ns 4012972856 ns          1
BM_Deeplab513Mv2Dm1_WithArgMax<coral::kEdgeTpu>                      326646686 ns  303988036 ns          2
BM_Deeplab513Mv2Dm1_WithArgMax<coral::kCpu>                         2587352037 ns 2586685081 ns          1
BM_Deeplab513Mv2Dm05_WithArgMax<coral::kEdgeTpu>                     331540823 ns  314183018 ns          2
BM_Deeplab513Mv2Dm05_WithArgMax<coral::kCpu>                        1283381701 ns 1282998809 ns          1
BM_KerasPostTrainingQuantizedUnetMv2128<coral::kEdgeTpu>              10664440 ns    3564409 ns        211
BM_KerasPostTrainingQuantizedUnetMv2128<coral::kCpu>                 306444764 ns  306451518 ns          2
BM_KerasPostTrainingQuantizedUnetMv2256<coral::kEdgeTpu>             125483203 ns     407458 ns        100
BM_KerasPostTrainingQuantizedUnetMv2256<coral::kCpu>                1430027962 ns 1429929847 ns          1
BM_SsdMobileNetV1FineTunedPet<coral::kEdgeTpu>                        71192741 ns   60151344 ns         10
BM_SsdMobileNetV1FineTunedPet<coral::kCpu>                           645666361 ns  645403571 ns          1
BM_PostTrainingQuantizedTf2KerasMobileNetV1<coral::kEdgeTpu>           2590886 ns     136522 ns       1000
BM_PostTrainingQuantizedTf2KerasMobileNetV1<coral::kCpu>             292732080 ns  292613949 ns          3
BM_PostTrainingQuantizedTf2KerasMobileNetV2<coral::kEdgeTpu>           2839777 ns     134647 ns       1000
BM_PostTrainingQuantizedTf2KerasMobileNetV2<coral::kCpu>             218506495 ns  218423487 ns          3
BM_PostTrainingQuantizedTf2KerasMobileNetV3EdgeTpu<coral::kEdgeTpu>    2997397 ns     122935 ns       1000
BM_PostTrainingQuantizedTf2KerasMobileNetV3EdgeTpu<coral::kCpu>      424009442 ns  423864110 ns          2
BM_SsdLiteMobileDet<coral::kEdgeTpu>                                  19936002 ns    8912055 ns         66
BM_SsdLiteMobileDet<coral::kCpu>                                     834558010 ns  834245273 ns          1
BM_SsdMobileNetV1_NoNms<coral::kEdgeTpu>                              17669110 ns    7159439 ns        102
BM_SsdMobileNetV1_NoNms<coral::kCpu>                                 629323244 ns  602055865 ns          1
BM_SsdMobileNetV2_NoNms<coral::kEdgeTpu>                              24258748 ns    9243972 ns         80
BM_SsdMobileNetV2_NoNms<coral::kCpu>                                 668772697 ns  643773166 ns          1

Yes, we worked with GumStix to enable a USB3.0 + CM4 solution. They are using a PCIe-to-USB3 bridge to accomplish this (since the CM4 only pins out USB2.0). It fully passes Coral CTS and achieves the expected performance for USB3. They have this in both the IP Camera form factor and the Pixhawk. For USB designs, you can use their Upverter platform to create a known-good USB3.0 baseboard for the CM4, or reach out to Coral Sales to discuss the special design considerations for USB3. We will keep this issue open, however, for PCIe discussion.

The Accelerator Module is USB2, but GumStix is using a USB3 version that you need to contact sales to purchase. Are there any performance benefits to getting the USB3 version instead of USB2? Is the USB3 version going to be generally available? Thanks.

The accelerator module supports PCIe, USB3, and USB2. USB3 requires working with us on extra design considerations to ensure that the design will work properly. As we validate more designs, we may make this information generally available, but we want to ensure it can work across many designs (instead of setting people up for failure).

As for performance, it’s a significant difference. I would recommend referring to the CTS outputs. The bottom of the CTS outputs contains benchmarks; specifically, I’d compare the Dev Board (A53 + PCIe) and Dev Board Mini (A35 + USB2). While the Dev Board is PCIe, it’s a fairer comparison than x86+USB (also in the CTS outputs) because of the much faster CPU time on the beefier platform.
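
For anyone comparing benchmark runs like these, a small script can do the arithmetic. The sketch below is not part of the Coral tooling. It assumes the console layout seen in the benchmark output earlier in this thread (name, wall time, CPU time, iterations), and the function names `parse_benchmarks` and `speedups` are made up for illustration. It pairs each model's `kEdgeTpu` and `kCpu` rows and reports the wall-time ratio.

```python
import re

# Sample rows copied from the benchmark output earlier in this thread.
SAMPLE = """\
BM_SsdLiteMobileDet<coral::kEdgeTpu>                                  19936002 ns    8912055 ns         66
BM_SsdLiteMobileDet<coral::kCpu>                                     834558010 ns  834245273 ns          1
BM_SsdMobileNetV2_NoNms<coral::kEdgeTpu>                              24258748 ns    9243972 ns         80
BM_SsdMobileNetV2_NoNms<coral::kCpu>                                 668772697 ns  643773166 ns          1
"""

# Assumed column order: benchmark name, wall time, CPU time, iterations.
LINE_RE = re.compile(
    r"^(?P<model>\w+)<coral::k(?P<device>\w+)>\s+"
    r"(?P<wall>\d+) ns\s+(?P<cpu>\d+) ns\s+(?P<iters>\d+)\s*$"
)


def parse_benchmarks(text):
    """Return {model: {device: wall_time_ns}} for every row that matches."""
    results = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            per_model = results.setdefault(match["model"], {})
            per_model[match["device"]] = int(match["wall"])
    return results


def speedups(results):
    """Return {model: cpu_wall / edgetpu_wall} where both rows are present."""
    return {
        model: times["Cpu"] / times["EdgeTpu"]
        for model, times in results.items()
        if "Cpu" in times and "EdgeTpu" in times
    }


if __name__ == "__main__":
    for model, ratio in speedups(parse_benchmarks(SAMPLE)).items():
        print(f"{model}: {ratio:.1f}x faster on the Edge TPU (wall time)")
```

Wall time is used rather than CPU time because, for the Edge TPU rows, most of the elapsed time is spent waiting on the accelerator and the bus, which the CPU-time column does not capture.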

@julled - Yeah this new overlay is unrelated (and frankly I can’t believe that overlay is actually useful). Thanks for finding the source.

Choosing to believe this is still possible…here are my current DMESG and libedgetpu logs: (Kernel: 5.10.23-v8+ (aarch64) with gasket/apex modules and libedgetpu from mbooksx’s repos, custom Buildroot Rootfs)

DMESG

[ 1876.006541] apex 0000:01:00.0: Fault VA: 0xffffffff
[ 1876.012884] apex 0000:01:00.0: Failing in first (simple) read access. Extended_level0: 0x7ff, Simple: 0x1fff
[ 1876.024280] apex 0000:01:00.0: Computed Failing Bus Addr: 0x0
[ 1876.031596] apex 0000:01:00.0: Fault VA: 0x0
[ 1876.042358] apex 0000:01:00.0: Fault VA: 0x0
[ 1876.048153] apex 0000:01:00.0: Fault VA: 0x0
[ 1876.053923] apex 0000:01:00.0: Fault VA: 0x0
[ 1876.059681] apex 0000:01:00.0: Fault VA: 0x0
[ 1876.065456] apex 0000:01:00.0: Fault VA: 0x0
[ 1876.071141] apex 0000:01:00.0: Fault VA: 0x0
[ 1876.076769] apex 0000:01:00.0: Fault VA: 0x0
[ 1876.082370] apex 0000:01:00.0: Fault VA: 0x0
[ 1876.089568] apex 0000:01:00.0: Map Simple Pages: host_addr 0x7f89c74000, dev_addr 0x1000000, num_pages 1
[ 1876.100752] apex 0000:01:00.0: Map Simple Pages: host_addr 0x7f89c75000, dev_addr 0x1001000, num_pages 1
[ 1876.160486] apex 0000:01:00.0: Map Simple Pages: host_addr 0x7f5f969000, dev_addr 0x0, num_pages 1603
[ 1876.171885] apex 0000:01:00.0: Map Simple Pages: host_addr 0xd9c3000, dev_addr 0x1004000, num_pages 3
[ 1876.185214] apex 0000:01:00.0: Map Simple Pages: host_addr 0x7f88350000, dev_addr 0x1080000, num_pages 66
[ 1876.196648] apex 0000:01:00.0: Map Simple Pages: host_addr 0xd9c7000, dev_addr 0x1002000, num_pages 2
[ 1876.208103] apex 0000:01:00.0: Map Simple Pages: host_addr 0x7f88272000, dev_addr 0x1040000, num_pages 44
[ 1876.219712] apex 0000:01:00.0: Map Simple Pages: host_addr 0xd9ca000, dev_addr 0x1008000, num_pages 2
[ 1876.230804] apex 0000:01:00.0: Map Simple Pages: host_addr 0x7f88231000, dev_addr 0x1100000, num_pages 63

(here the test program hangs until ctrl-c)

[ 1904.820076] apex 0000:01:00.0: Fault VA: 0xbe96c8
[ 1904.826533] apex 0000:01:00.0: Failing in first (simple) read access. Extended_level0: 0x5, Simple: 0xbe9
[ 1904.837859] apex 0000:01:00.0: Computed Failing Bus Addr: 0x100004000000000
[ 1904.846581] apex 0000:01:00.0: Fault VA: 0xbe96c8
[ 1904.853128] apex 0000:01:00.0: Failing in first (simple) read access. Extended_level0: 0x5, Simple: 0xbe9
[ 1904.864475] apex 0000:01:00.0: Computed Failing Bus Addr: 0x100004000000000
[ 1904.873204] apex 0000:01:00.0: Fault VA: 0xffffffffffffffff
[ 1904.880539] apex 0000:01:00.0: Fault VA: 0xffffffff
[ 1904.887108] apex 0000:01:00.0: Failing in first (simple) read access. Extended_level0: 0x7ff, Simple: 0x1fff
[ 1904.898652] apex 0000:01:00.0: Computed Failing Bus Addr: 0x0
[ 1904.906057] apex 0000:01:00.0: Fault VA: 0x0
[ 1904.921784] apex 0000:01:00.0: Fault VA: 0x0
[ 1904.927701] apex 0000:01:00.0: Fault VA: 0x0
[ 1904.933515] apex 0000:01:00.0: Fault VA: 0x0
[ 1904.939298] apex 0000:01:00.0: Fault VA: 0x0
[ 1904.945065] apex 0000:01:00.0: Fault VA: 0xffffffffffffffff
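
The "Read 32 Hacks" and "Write 32 Hacks" lines in the libedgetpu log that follows appear to come from a patched library that performs each 64-bit register access as two 32-bit accesses, which matches the broken 64-bit downstream access described earlier in this thread. The sketch below only models that idea in Python. It is not the libedgetpu code: a `bytearray` stands in for the mmap'd register window, the function names are hypothetical, and placing the lower word at `offset` and the upper word at `offset + 4` is an assumption.

```python
import struct

MASK32 = 0xFFFFFFFF


def broken_write64(regs, offset, value):
    """Model of the fault described in this thread: the low word lands in both halves."""
    lower = value & MASK32
    struct.pack_into("<II", regs, offset, lower, lower)


def write64_as_two_32(regs, offset, value):
    """Workaround sketch: write the lower and upper 32-bit halves separately."""
    lower = value & MASK32
    upper = (value >> 32) & MASK32
    struct.pack_into("<I", regs, offset, lower)      # assumed: low word first
    struct.pack_into("<I", regs, offset + 4, upper)  # assumed: high word at +4


def read64_as_two_32(regs, offset):
    """Read both halves and recombine, like the upper/lower/value log fields."""
    (lower,) = struct.unpack_from("<I", regs, offset)
    (upper,) = struct.unpack_from("<I", regs, offset + 4)
    return (upper << 32) | lower


if __name__ == "__main__":
    regs = bytearray(16)  # stand-in for a mapped register region
    broken_write64(regs, 0, 0x12345678AABBCCDD)
    write64_as_two_32(regs, 8, 0x12345678AABBCCDD)
    print(hex(read64_as_two_32(regs, 0)))  # value after the modelled fault
    print(hex(read64_as_two_32(regs, 8)))  # value after the split write
```

On real hardware the two halves are separate bus transactions, so a split access is not atomic. Whether that matters depends on the register being accessed.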

libedgetpu (verbosity=10)

I :944] EnumerateDevices: vendor:0x1a6e, product:0x89a                                                                                                                                                            
I :944] EnumerateDevices: vendor:0x18d1, product:0x9302                                                                                                                                                           
Test_EdgeTPU[412]: (main:70): Num EdgeTPU Devices: 1                                                                                                                                                              
I :453] No matching device is already opened for shared ownership.                                                                                                                                                
I :944] EnumerateDevices: vendor:0x1a6e, product:0x89a                                                                                                                                                            
I :944] EnumerateDevices: vendor:0x18d1, product:0x9302                                                                                                                                                           
I :104] USB always DFU: False (default)                                                                                                                                                                           
I :126] USB bulk-in queue capacity: default                                                                                                                                                                       
I :65] Performance expectation: Max (default)                                                                                                                                                                     
I :273] Hello Adam!                                                                                                                                                                                               
I :274] Starting in FUCK YEAH mode                                                                                                                                                                                
I :83] Opening /dev/apex_0. read_only=0                                                                                                                                                                           
I :97] mmap_offset=0x0000000000040000, mmap_size=4096                                                                                                                                                             
I :108] Got map addr at 0x0x7f904db000                                                                                                                                                                            
I :97] mmap_offset=0x0000000000044000, mmap_size=4096                                                                                                                                                             
I :108] Got map addr at 0x0x7f89c79000                                                                                                                                                                            
I :97] mmap_offset=0x0000000000048000, mmap_size=4096                                                                                                                                                             
I :108] Got map addr at 0x0x7f89c78000                                                                                                                                                                            
I :240] Offset: 0x00000000000486f0, mmap_reg: 0x7f89c786f0, Upper: 0x0000000000000000, Shifted upper: 0x0000000000000000, lower: 0x0000000000000000, value:0x0000000000000000                                     
I :269] Read 32 Hacks: offset = 0x00000000000486f0, lower: = 0x0000000000000000 upper: = 0x0000000000000000 value: = 0x0000000000000000 mmap: 0x7f89c786f0                                                        
I :282] Page Fault Address: 0x0000000000000000                                                                                                                                                                    
I :195] Write 32 Hacks: offset = 0x00000000000487a8, value = 0x0000000000000000 mmap=0x7f89c787a8                                                                                                                 
I :206] ReRead 32 Hacks: offset = 0x00000000000487a8, value: = 0x0000000000000000                                                                                                                                 
I :240] Offset: 0x0000000000048578, mmap_reg: 0x7f89c78578, Upper: 0x0000000000000000, Shifted upper: 0x0000000000000000, lower: 0x0000000000000010, value:0x0000000000000010                                     
I :269] Read 32 Hacks: offset = 0x0000000000048578, lower: = 0x0000000000000010 upper: = 0x0000000000000000 value: = 0x0000000000000010 mmap: 0x7f89c78578                                                        
I :282] Page Fault Address: 0x0000000000000000                                                                                                                                                                    
I :136] MmuMapper#Map() : 0000007f89c74000 -> 0000000001000000 (1 pages) flags=00000000.                                                                                                                          
I :55] MapMemory() page-aligned : device_address = 0x0000000001000000                                                                                                                                             
I :169] Queue base : 0x7f89c74000 -> 0x0000000001000000 [4096 bytes]                                                                                                                                              
I :136] MmuMapper#Map() : 0000007f89c75000 -> 0000000001001000 (1 pages) flags=00000000.                                                                                                                          
I :55] MapMemory() page-aligned : device_address = 0x0000000001001000                                                                                                                                             
I :179] Queue status block : 0x7f89c75000 -> 0x0000000001001000 [16 bytes]                                                                                                                                        
I :195] Write 32 Hacks: offset = 0x0000000000048590, value = 0x0000000001000000 mmap=0x7f89c78590                                                                                                                 
I :206] ReRead 32 Hacks: offset = 0x0000000000048590, value: = 0x0000000001000000                                                                                                                                 
I :195] Write 32 Hacks: offset = 0x0000000000048598, value = 0x0000000001001000 mmap=0x7f89c78598                                                                                                                 
I :206] ReRead 32 Hacks: offset = 0x0000000000048598, value: = 0x0000000001001000                                                                                                                                 
I :195] Write 32 Hacks: offset = 0x00000000000485a0, value = 0x0000000000000100 mmap=0x7f89c785a0                                                                                                                 
I :206] ReRead 32 Hacks: offset = 0x00000000000485a0, value: = 0x0000000000000100                                                                                                                                 
I :195] Write 32 Hacks: offset = 0x0000000000048568, value = 0x0000000000000005 mmap=0x7f89c78568                                                                                                                 
I :206] ReRead 32 Hacks: offset = 0x0000000000048568, value: = 0x0000000000000005                                                                                                                                 
I :240] Offset: 0x0000000000048570, mmap_reg: 0x7f89c78570, Upper: 0x0000000000000000, Shifted upper: 0x0000000000000000, lower: 0x0000000000000001, value:0x0000000000000001                                     
I :269] Read 32 Hacks: offset = 0x0000000000048570, lower: = 0x0000000000000001 upper: = 0x0000000000000000 value: = 0x0000000000000001 mmap: 0x7f89c78570                                                        
I :282] Page Fault Address: 0x0000000000000000                                                                                                                                                                    
I :240] Offset: 0x00000000000486d0, mmap_reg: 0x7f89c786d0, Upper: 0x0000000000000000, Shifted upper: 0x0000000000000000, lower: 0x0000000000000000, value:0x0000000000000000                                     
I :269] Read 32 Hacks: offset = 0x00000000000486d0, lower: = 0x0000000000000000 upper: = 0x0000000000000000 value: = 0x0000000000000000 mmap: 0x7f89c786d0                                                        
I :282] Page Fault Address: 0x0000000000000000                                                                                                                                                                    
I :195] Write 32 Hacks: offset = 0x0000000000044018, value = 0x0000000000000001 mmap=0x7f89c79018                                                                                                                 
I :206] ReRead 32 Hacks: offset = 0x0000000000044018, value: = 0x0000000000000000                                                                                                                                 
I :195] Write 32 Hacks: offset = 0x0000000000044158, value = 0x0000000000000001 mmap=0x7f89c79158                                                                                                                 
I :206] ReRead 32 Hacks: offset = 0x0000000000044158, value: = 0x0000000000000000                                                                                                                                 
I :195] Write 32 Hacks: offset = 0x0000000000044198, value = 0x0000000000000001 mmap=0x7f89c79198                                                                                                                 
I :206] ReRead 32 Hacks: offset = 0x0000000000044198, value: = 0x0000000000000000                                                                                                                                 
I :195] Write 32 Hacks: offset = 0x00000000000441d8, value = 0x0000000000000001 mmap=0x7f89c791d8                                                                                                                 
I :206] ReRead 32 Hacks: offset = 0x00000000000441d8, value: = 0x0000000000000000                                                                                                                                 
I :195] Write 32 Hacks: offset = 0x0000000000044218, value = 0x0000000000000001 mmap=0x7f89c79218                                                                                                                 
I :206] ReRead 32 Hacks: offset = 0x0000000000044218, value: = 0x0000000000000000                                                                                                                                 
I :195] Write 32 Hacks: offset = 0x0000000000048788, value = 0x000000000000007f mmap=0x7f89c78788                                                                                                                 
I :206] ReRead 32 Hacks: offset = 0x0000000000048788, value: = 0x000000000000007f                                                                                                                                 
I :240] Offset: 0x0000000000048788, mmap_reg: 0x7f89c78788, Upper: 0x0000000000000000, Shifted upper: 0x0000000000000000, lower: 0x000000000000007f, value:0x000000000000007f                                     
I :269] Read 32 Hacks: offset = 0x0000000000048788, lower: = 0x000000000000007f upper: = 0x0000000000000000 value: = 0x000000000000007f mmap: 0x7f89c78788                                                        
I :282] Page Fault Address: 0x0000000000000000                                                                                                                                                                    
I :195] Write 32 Hacks: offset = 0x00000000000400c0, value = 0x0000000000000001 mmap=0x7f904db0c0                                                                                                                 
I :206] ReRead 32 Hacks: offset = 0x00000000000400c0, value: = 0x0000000000000000                                                                                                                                 
I :195] Write 32 Hacks: offset = 0x0000000000040150, value = 0x0000000000000001 mmap=0x7f904db150                                                                                                                 
I :206] ReRead 32 Hacks: offset = 0x0000000000040150, value: = 0x0000000000000000                                                                                                                                 
I :195] Write 32 Hacks: offset = 0x0000000000040110, value = 0x0000000000000001 mmap=0x7f904db110                                                                                                                 
I :206] ReRead 32 Hacks: offset = 0x0000000000040110, value: = 0x0000000000000000                                                                                                                                 
I :195] Write 32 Hacks: offset = 0x0000000000040250, value = 0x0000000000000001 mmap=0x7f904db250                                                                                                                 
I :206] ReRead 32 Hacks: offset = 0x0000000000040250, value: = 0x0000000000000000                                                                                                                                 
I :195] Write 32 Hacks: offset = 0x0000000000040298, value = 0x0000000000000001 mmap=0x7f904db298                                                                                                                 
I :206] ReRead 32 Hacks: offset = 0x0000000000040298, value: = 0x0000000000000000                                                                                                                                 
I :195] Write 32 Hacks: offset = 0x00000000000402e0, value = 0x0000000000000001 mmap=0x7f904db2e0                                                                                                                 
I :206] ReRead 32 Hacks: offset = 0x00000000000402e0, value: = 0x0000000000000000                                                                                                                                 
I :195] Write 32 Hacks: offset = 0x0000000000040328, value = 0x0000000000000001 mmap=0x7f904db328                                                                                                                 
I :206] ReRead 32 Hacks: offset = 0x0000000000040328, value: = 0x0000000000000000                                                                                                                                 
I :195] Write 32 Hacks: offset = 0x0000000000040190, value = 0x0000000000000001 mmap=0x7f904db190                                                                                                                 
I :206] ReRead 32 Hacks: offset = 0x0000000000040190, value: = 0x0000000000000000                                                                                                                                 
I :195] Write 32 Hacks: offset = 0x00000000000401d0, value = 0x0000000000000001 mmap=0x7f904db1d0                                                                                                                 
I :206] ReRead 32 Hacks: offset = 0x00000000000401d0, value: = 0x0000000000000000                                                                                                                                 
I :195] Write 32 Hacks: offset = 0x0000000000040210, value = 0x0000000000000001 mmap=0x7f904db210                                                                                                                 
I :206] ReRead 32 Hacks: offset = 0x0000000000040210, value: = 0x0000000000000000                                                                                                                                 
I :195] Write 32 Hacks: offset = 0x00000000000486e8, value = 0x0000000000000000 mmap=0x7f89c786e8                                                                                                                 
I :206] ReRead 32 Hacks: offset = 0x00000000000486e8, value: = 0x0000000000000000                                                                                                                                 
I :45] Set event fd : event_id:0 -> event_fd:8,                                                                                                                                                                   
I :45] Set event fd : event_id:4 -> event_fd:12,                                                                                                                                                                  
I :62] event_fd=8. Monitor thread begin.                                                                                                                                                                          
I :45] Set event fd : event_id:5 -> event_fd:13,                                                                                                                                                                  
I :62] event_fd=12. Monitor thread begin.                                                                                                                                                                         
I :45] Set event fd : event_id:6 -> event_fd:14,                                                                                                                                                                  
I :62] event_fd=13. Monitor thread begin.                                                                                                                                                                         
I :45] Set event fd : event_id:7 -> event_fd:15,                                                                                                                                                                  
I :62] event_fd=14. Monitor thread begin.                                                                                                                                                                         
I :45] Set event fd : event_id:8 -> event_fd:16,                                                                                                                                                                  
I :62] event_fd=15. Monitor thread begin.                                                                                                                                                                         
I :45] Set event fd : event_id:9 -> event_fd:17,                                                                                                                                                                  
I :62] event_fd=16. Monitor thread begin.                                                                                                                                                                         
I :45] Set event fd : event_id:10 -> event_fd:18,                                                                                                                                                                 
I :62] event_fd=17. Monitor thread begin.                                                                                                                                                                         
I :45] Set event fd : event_id:11 -> event_fd:19,                                                                                                                                                                 
I :62] event_fd=18. Monitor thread begin.                                                                                                                                                                         
I :45] Set event fd : event_id:12 -> event_fd:20,                                                                                                                                                                 
I :62] event_fd=19. Monitor thread begin.                                                                                                                                                                         
I :195] Write 32 Hacks: offset = 0x00000000000486a0, value = 0x000000000000000f mmap=0x7f89c786a0                                                                                                                 
I :206] ReRead 32 Hacks: offset = 0x00000000000486a0, value: = 0x000000000000000f                                                                                                                                 
I :195] Write 32 Hacks: offset = 0x00000000000485c0, value = 0x0000000000000001 mmap=0x7f89c785c0                                                                                                                 
I :206] ReRead 32 Hacks: offset = 0x00000000000485c0, value: = 0x0000000000000001                                                                                                                                 
I :195] Write 32 Hacks: offset = 0x00000000000486c0, value = 0x0000000000000001 mmap=0x7f89c786c0                                                                                                                 
I :206] ReRead 32 Hacks: offset = 0x00000000000486c0, value: = 0x0000000000000001                                                                                                                                 
I :62] event_fd=20. Monitor thread begin.                                                                                                                                                                         
I :172] Opening device at /dev/apex_0                                                                                                                                                                             
Test_EdgeTPU[412]: (main:75): EdgeTPU - path:type (0=PCIe, 1=USB): /dev/apex_0:0                                                                                                                                  
Test_EdgeTPU[412]: (main:80): Loading Model: /home/kampff/Voight-Kampff/objects_edgetpu.tflite                                                                                                                    
Test_EdgeTPU[412]: (main:82): Model Created
Test_EdgeTPU[412]: (main:89): Options configured: maybe                                                                                                                                                           
Test_EdgeTPU[412]: (main:94): Interpreter Created                                                                                                                                                                 
Test_EdgeTPU[412]: (main:98): Tensors Allocated                                                                                                                                                                   
Test_EdgeTPU[412]: (main:120): NPU inputs: 1 vs 1                                                                                                                                                                 
Test_EdgeTPU[412]: (main:127):  - Input 0 (normalized_input_image_tensor): Dimensionsw: 4                                                                                                                         
Test_EdgeTPU[412]: (main:132):    - Dimension 0: (size: 1)                                                                                                                                                        
Test_EdgeTPU[412]: (main:132):    - Dimension 1: (size: 300)                                                                                                                                                      
Test_EdgeTPU[412]: (main:132):    - Dimension 2: (size: 300)                                                                                                                                                      
Test_EdgeTPU[412]: (main:132):    - Dimension 3: (size: 3)                                                                                                                                                        
Test_EdgeTPU[412]: (main:138): NPU outputs: 4 vs 4                                                                                                                                                                
Test_EdgeTPU[412]: (main:145):  - Ouput 0 (TFLite_Detection_PostProcess): Dimensions: 3                                                                                                                           
Test_EdgeTPU[412]: (main:150):    - Dimension 0: 1)                                                                                                                                                               
Test_EdgeTPU[412]: (main:150):    - Dimension 1: 20)                                                                                                                                                              
Test_EdgeTPU[412]: (main:150):    - Dimension 2: 4)                                                                                                                                                               
Test_EdgeTPU[412]: (main:145):  - Ouput 1 (TFLite_Detection_PostProcess:1): Dimensions: 2                                                                                                                         
Test_EdgeTPU[412]: (main:150):    - Dimension 0: 1)                                                                                                                                                               
Test_EdgeTPU[412]: (main:150):    - Dimension 1: 20)                                                                                                                                                              
Test_EdgeTPU[412]: (main:145):  - Ouput 2 (TFLite_Detection_PostProcess:2): Dimensions: 2                                                                                                                         
Test_EdgeTPU[412]: (main:150):    - Dimension 0: 1)                                                                                                                                                               
Test_EdgeTPU[412]: (main:150):    - Dimension 1: 20)                                                                                                                                                              
Test_EdgeTPU[412]: (main:145):  - Ouput 3 (TFLite_Detection_PostProcess:3): Dimensions: 1                                                                                                                         
Test_EdgeTPU[412]: (main:150):    - Dimension 0: 1)                                                                                                                                                               
Test_EdgeTPU[412]: (main:167): Test Image Loaded                                                                                                                                                                  
Test_EdgeTPU[412]: (main:185): Labels Loaded                                                                                                                                                                      
Test_EdgeTPU[412]: (main:209): Inputs Configured                                                                                                                                                                  
I :47] Adding input "normalized_input_image_tensor" with 270000 bytes.                                                                                                                                            
I :58] Adding output "Squeeze" with 7668 bytes.                                                                                                                                                                   
I :58] Adding output "convert_scores" with 174447 bytes.                                                                                                                                                          
I :167] Request prepared, total batch size: 1, total TPU requests required: 1.                                                                                                                                    
I :310] Request [0]: Submitting P0 request immediately.                                                                                                                                                           
I :373] Request [0]: Need to map parameters.                                                                                                                                                                      
I :136] MmuMapper#Map() : 0000007f5f969000 -> 0000000000000000 (1603 pages) flags=00000002.                                                                                                                       
I :55] MapMemory() page-aligned : device_address = 0x0000000000000000                                                                                                                                             
I :252] Mapped params : Buffer(ptr=0x7f5f969000) -> 0x0000000000000000, 6564224 bytes.                                                                                                                            
I :252] Mapped params : Buffer(ptr=(nil)) -> 0x0000000000000000, 0 bytes.                                                                                                                                         
I :387] Request [0]: Need to do parameter-caching.                                                                                                                                                                
I :80] [0] Request constructed.                                                                                                                                                                                   
I :46] InstructionBuffers created.                                                                                                                                                                                
I :653] Created new instruction buffers.                                                                                                                                                                          
I :75] Mapped scratch : Buffer(ptr=(nil)) -> 0x0000000000000000, 0 bytes.                                                                                                                                         
I :368] MapDataBuffers() done.                                                                                                                                                                                    
I :187] Linking Parameter: 0x0000000000000000                                                                                                                                                                     
I :136] MmuMapper#Map() : 000000000d9c3000 -> 0000000001004000 (3 pages) flags=00000002.                                                                                                                          
I :55] MapMemory() page-aligned : device_address = 0x0000000001004000                                                                                                                                             
I :223] Mapped "instructions" : Buffer(ptr=0xd9c3000) -> 0x0000000001004000, 11472 bytes. Direction=1                                                                                                             
I :384] MapInstructionBuffers() done.                                                                                                                                                                             
I :481] [0] SetState old=0, new=1.                                                                                                                                                                                
I :393] [0] NotifyRequestSubmitted()                                                                                                                                                                              
I :481] [0] SetState old=1, new=2.                                                                                                                                                                                
I :83] Request[0]: Submitted                                                                                                                                                                                      
I :401] [0] NotifyRequestActive()                                                                                                                                                                                 
I :481] [0] SetState old=2, new=3.                                                                                                                                                                                
I :133] Request[0]: Scheduling DMA[0]                                                                                                                                                                             
I :393] Adding an element to the host queue.                                                                                                                                                                      
I :195] Write 32 Hacks: offset = 0x00000000000485a8, value = 0x0000000000000001 mmap=0x7f89c785a8                                                                                                                 
I :206] ReRead 32 Hacks: offset = 0x00000000000485a8, value: = 0x0000000000000001                                                                                                                                 
I :75] event_fd=20. Monitor thread got num_events=1.                                                                                                                                                              
I :80] [1] Request constructed.                                                                                                                                                                                   
I :195] Write 32 Hacks: offset = 0x00000000000486c0, value = 0x0000000000000000 mmap=0x7f89c786c0                                                                                                                 
I :113] Adding input "normalized_input_image_tensor" with 270000 bytes.                                                                                                                                           
I :206] ReRead 32 Hacks: offset = 0x00000000000486c0, value: = 0x0000000000000000                                                                                                                                 
I :188] Adding output "Squeeze" with 7668 bytes.                                                                                                                                                                  
I :195] Write 32 Hacks: offset = 0x00000000000486c8, value = 0x0000000000000000 mmap=0x7f89c786c8                                                                                                                 
I :188] Adding output "convert_scores" with 174447 bytes.                                                                                                                                                         
I :206] ReRead 32 Hacks: offset = 0x00000000000486c8, value: = 0x0000000000000001                                                                                                                                 
I :240] Offset: 0x00000000000486f0, mmap_reg: 0x7f89c786f0, Upper: 0x0000000000000000, Shifted upper: 0x0000000000000000, lower: 0x0000000000000211, value:0x0000000000000211                                     
I :269] Read 32 Hacks: offset = 0x00000000000486f0, lower: = 0x0000000000000211 upper: = 0x0000000000000000 value: = 0x0000000000000211 mmap: 0x7f89c786f0                                                        
I :282] Page Fault Address: 0x0000000000be96c8                                                                                                                                                                    
I :240] Offset: 0x0000000000048700, mmap_reg: 0x7f89c78700, Upper: 0x0000000000000000, Shifted upper: 0x0000000000000000, lower: 0x0000000000000010, value:0x0000000000000010                                     
I :269] Read 32 Hacks: offset = 0x0000000000048700, lower: = 0x0000000000000010 upper: = 0x0000000000000000 value: = 0x0000000000000010 mmap: 0x7f89c78700                                                        
I :282] Page Fault Address: 0x0000000000be96c8                                                                                                                                                                    
I :240] Offset: 0x0000000000048700, mmap_reg: 0x7f89c78700, Upper: 0x0000000000000000, Shifted upper: 0x0000000000000000, lower: 0x0000000000000010, value:0x0000000000000010                                     
I :269] Read 32 Hacks: offset = 0x0000000000048700, lower: = 0x0000000000000010 upper: = 0x0000000000000000 value: = 0x0000000000000010 mmap: 0x7f89c78700                                                        
I :282] Page Fault Address: 0x0000000000be96c8                                                                                                                                                                    
E :254] HIB Error. hib_error_status = 0000000000000211, hib_first_error_status = 0000000000000010                                                                                                                 
I :75] event_fd=20. Monitor thread got num_events=1.                                                                                                                                                              
I :195] Write 32 Hacks: offset = 0x00000000000486c0, value = 0x0000000000000000 mmap=0x7f89c786c0                                                                                                                 
I :206] ReRead 32 Hacks: offset = 0x00000000000486c0, value: = 0x0000000000000000                                                                                                                                 
I :195] Write 32 Hacks: offset = 0x00000000000486c8, value = 0x0000000000000000 mmap=0x7f89c786c8                                                                                                                 
I :206] ReRead 32 Hacks: offset = 0x00000000000486c8, value: = 0x0000000000000000                                                                                                                                 
I :240] Offset: 0x00000000000486f0, mmap_reg: 0x7f89c786f0, Upper: 0x0000000000000000, Shifted upper: 0x0000000000000000, lower: 0x0000000000000211, value:0x0000000000000211                                     
I :269] Read 32 Hacks: offset = 0x00000000000486f0, lower: = 0x0000000000000211 upper: = 0x0000000000000000 value: = 0x0000000000000211 mmap: 0x7f89c786f0                                                        
I :282] Page Fault Address: 0x0000000000be96c8                                                                                                                                                                    
I :240] Offset: 0x0000000000048700, mmap_reg: 0x7f89c78700, Upper: 0x0000000000000000, Shifted upper: 0x0000000000000000, lower: 0x0000000000000010, value:0x0000000000000010                                     
I :269] Read 32 Hacks: offset = 0x0000000000048700, lower: = 0x0000000000000010 upper: = 0x0000000000000000 value: = 0x0000000000000010 mmap: 0x7f89c78700                                                        
I :282] Page Fault Address: 0x0000000000be96c8                                                                                                                                                                    
E :254] HIB Error. hib_error_status = 0000000000000211, hib_first_error_status = 0000000000000010                                                                                                                 
I :46] InstructionBuffers created.                                                                                                                                                                                
I :653] Created new instruction buffers.                                                                                                                                                                          
I :75] Mapped scratch : Buffer(ptr=(nil)) -> 0x0000000000000000, 0 bytes.                                                                                                                                         
I :136] MmuMapper#Map() : 0000007f88350000 -> 0000000001080000 (66 pages) flags=00000002.                                                                                                                         
I :55] MapMemory() page-aligned : device_address = 0x0000000001080000                                                                                                                                             
I :223] Mapped "normalized_input_image_tensor" : Buffer(ptr=0x7f88350040) -> 0x0000000001080040, 270000 bytes. Direction=1                                                                                        
I :136] MmuMapper#Map() : 000000000d9c7000 -> 0000000001002000 (2 pages) flags=00000004.                                                                                                                          
I :55] MapMemory() page-aligned : device_address = 0x0000000001002000                                                                                                                                             
I :136] MmuMapper#Map() : 0000007f88272000 -> 0000000001040000 (44 pages) flags=00000004.                                                                                                                         
I :55] MapMemory() page-aligned : device_address = 0x0000000001040000                                                                                                                                             
I :223] Mapped "convert_scores" : Buffer(ptr=0x7f88272000) -> 0x0000000001040000, 176368 bytes. Direction=2                                                                                                       
I :223] Mapped "Squeeze" : Buffer(ptr=0xd9c7000) -> 0x0000000001002000, 7672 bytes. Direction=2                                                                                                                   
I :368] MapDataBuffers() done.                                                                                                                                                                                    
I :93] Linking normalized_input_image_tensor[0]: 0x0000000001080040                                                                                                                                               
I :93] Linking Squeeze[0]: 0x0000000001002000                                                                                                                                                                     
I :93] Linking convert_scores[0]: 0x0000000001040000                                                                                                                                                              
I :136] MmuMapper#Map() : 000000000d9ca000 -> 0000000001008000 (2 pages) flags=00000002.                                                                                                                          
I :55] MapMemory() page-aligned : device_address = 0x0000000001008000                                                                                                                                             
I :136] MmuMapper#Map() : 0000007f88231000 -> 0000000001100000 (63 pages) flags=00000002.                                                                                                                         
I :55] MapMemory() page-aligned : device_address = 0x0000000001100000                                                                                                                                             
I :223] Mapped "instructions" : Buffer(ptr=0x7f88231000) -> 0x0000000001100000, 256992 bytes. Direction=1                                                                                                         
I :223] Mapped "instructions" : Buffer(ptr=0xd9ca000) -> 0x0000000001008000, 7632 bytes. Direction=1                                                                                                              
I :384] MapInstructionBuffers() done.                                                                                                                                                                             
I :481] [1] SetState old=0, new=1.                                                                                                                                                                                
I :393] [1] NotifyRequestSubmitted()                                                                                                                                                                              
I :481] [1] SetState old=1, new=2.                                                                                                                                                                                
I :83] Request[1]: Submitted                                                                                                                                                                                      
I :401] [1] NotifyRequestActive()                                                                                                                                                                                 
I :481] [1] SetState old=2, new=3.                                                                                                                                                                                
I :133] Request[1]: Scheduling DMA[0]                                                                                                                                                                             
I :393] Adding an element to the host queue.                                                                                                                                                                      
I :195] Write 32 Hacks: offset = 0x00000000000485a8, value = 0x0000000000000002 mmap=0x7f89c785a8                                                                                                                 
I :206] ReRead 32 Hacks: offset = 0x00000000000485a8, value: = 0x0000000000000002                                                                                                                                 
I :133] Request[1]: Scheduling DMA[1]                                                                                                                                                                             
I :393] Adding an element to the host queue.                                                                                                                                                                      
I :195] Write 32 Hacks: offset = 0x00000000000485a8, value = 0x0000000000000003 mmap=0x7f89c785a8                                                                                                                 
I :206] ReRead 32 Hacks: offset = 0x00000000000485a8, value: = 0x0000000000000003

The program then hangs until killed with Ctrl-C.
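For anyone reading the "Write 32 Hacks" / "Read 32 Hacks" lines above: they appear to come from replacing each 64-bit register access with two 32-bit ones, since a single 64-bit downstream access repeats the low word on this PCIe interface. Below is a minimal sketch of that idea in C. It is not the actual code from the fork: `write64_split`, `read64_split`, and the `bar` pointer (standing in for the mmap'd register window) are hypothetical names, and it assumes a little-endian layout with the low word at the lower offset.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only; not taken from libedgetpu.
 * `bar` stands in for the mmap'd register window (the mmap=... values in the
 * log). Real driver code would also need appropriate memory barriers. */

static inline void write32(volatile uint8_t *bar, uint64_t offset, uint32_t v)
{
    *(volatile uint32_t *)(bar + offset) = v;
}

static inline uint32_t read32(volatile uint8_t *bar, uint64_t offset)
{
    return *(volatile uint32_t *)(bar + offset);
}

/* Write a 64-bit register as two 32-bit stores: low word, then high word. */
static void write64_split(volatile uint8_t *bar, uint64_t offset, uint64_t value)
{
    assert((offset & 0x7) == 0); /* 64-bit registers are assumed 8-byte aligned */
    write32(bar, offset, (uint32_t)(value & 0xffffffffu));
    write32(bar, offset + 4, (uint32_t)(value >> 32));
}

/* Read a 64-bit register as two 32-bit loads and recombine them,
 * mirroring the "lower" / "upper" / "value" fields printed in the log. */
static uint64_t read64_split(volatile uint8_t *bar, uint64_t offset)
{
    assert((offset & 0x7) == 0);
    uint64_t lower = read32(bar, offset);
    uint64_t upper = read32(bar, offset + 4);
    return (upper << 32) | lower;
}
```

With helpers like these, a value such as `0x12345678aabbccdd` is never sent as one 64-bit transaction, so the low word cannot be duplicated into the high half. Whether the device tolerates the two halves of a register being updated at different times is a separate question that this sketch does not answer.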