circuitpython: MatrixPortal hangs and disconnects from USB after reading URL multiple times

CircuitPython version

Adafruit CircuitPython 7.2.0 on 2022-02-24; Adafruit Matrix Portal M4 with samd51j19
Board ID:matrixportal_m4

Code/REPL

import time
import board
import traceback

from adafruit_matrixportal.matrixportal import MatrixPortal
from secrets import secrets

matrix_portal = MatrixPortal(status_neopixel=board.NEOPIXEL, debug=True)
network = matrix_portal.network
display = matrix_portal.graphics.display

url = "http://example.com"

while True:
    try:
        print(f"Reading url: {url}")
        matrix_portal.set_background(0x400000)
        resp = network.fetch(url)
        matrix_portal.set_background(0x000000)
        print(resp.content.decode('utf-8')[:64])
    except (ValueError, RuntimeError) as e:
        print("Likely data read error:")
        traceback.print_exception(None, e, e.__traceback__)
    except OSError as e:
        print("Likely wifi connection error:")
        traceback.print_exception(None, e, e.__traceback__)
    except Exception as e:
        print("Unknown error:")
        traceback.print_exception(None, e, e.__traceback__)
    time.sleep(1)

Behavior

After some time passes (random, but typically less than an hour), the device stops responding.

The last thing printed is “Retrieving data…” (a print that can’t be turned off in the CircuitPython libraries). The LED matrix is stuck on red, so the call to fetch hasn’t returned.

After some time passes, USB disconnects. The board has to be hard-reset to recover.

There is no Python backtrace or any other error message printed via the serial port.

Description

No response

Additional information

I tested this initially with a text file hosted on an Apache server on the local LAN. I then tested with example.com to rule out anything weird on my Apache instance, so you should be able to reproduce this with basically any URL.

You can probably omit most of the exception handling from my example code. I’ve found that sometimes the MatrixPortal doesn’t want to connect to wifi or the response gets screwy on occasion, so I handle those cases. However, I haven’t hit those exceptions when performing this test, so the handlers are likely not needed.
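For anyone trying to reproduce this, a stripped-down loop might look like the sketch below. It is not the code I ran: `repro_loop` and its parameters are hypothetical names, and the fetch function is passed in, so on the board you would call it as `repro_loop(network.fetch, url)`. Logging before and after each call means the last line on the serial console shows whether the hang happened inside the fetch.

```python
import time


def repro_loop(fetch, url, iterations=None, delay=1.0, log=print, sleep=time.sleep):
    """Call fetch(url) repeatedly, logging before and after each call.

    fetch is any callable taking a URL (assumed here to be network.fetch
    on the board). iterations=None loops forever, like the original test.
    Returns the number of fetches that completed.
    """
    completed = 0
    while iterations is None or completed < iterations:
        log("fetch %d starting" % (completed + 1))
        fetch(url)  # if the board hangs here, "returned" is never logged
        log("fetch %d returned" % (completed + 1))
        completed += 1
        sleep(delay)
    return completed
```

No exceptions are caught here, so a wifi or data error stops the loop with a traceback instead of being retried.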

About this issue

  • State: closed
  • Created 2 years ago
  • Comments: 64 (6 by maintainers)

Most upvoted comments

The issue I’m reporting is that it hangs - there’s no exception raised.

I upgraded to 1.7.4, and I am also now seeing hangs, sometimes after a couple of minutes, sometimes much longer. I don’t think this is necessarily due to the upgrade; it’s just that I happened to see a different error before. No panel is attached to the MatrixPortal, so that’s not a factor. So it’s disappointing the upgrade is not helping, but I have some leads.

I was just going to write to you. I fixed it in a different way, which I think will work too. It is not the same fix as what I gave you. See https://github.com/adafruit/circuitpython/pull/6498 and https://github.com/adafruit/samd-peripherals/pull/42. Try the build artifacts: https://github.com/adafruit/circuitpython/actions/runs/2516315728. Scroll down to see the artifacts. Unzip the file for your board and get the .uf2 you need.

The second I bring the LED display (via adafruit_matrixportal Matrix) into the picture, alongside fetching data online, the system will inevitably hang within a couple of hours.

Thanks, that is very helpful.

code.txt

@dhalbert Here is a simplified single file version that should be easier to parse through. At its core, it’s doing the same thing without object-oriented methods. I got a USB disconnect and hang at about 90 minutes on this one.

Thanks for looking at this!

It’s not a fix, but an option you have is to try the hardware watchdog, which would reset the whole microcontroller (the one where CircuitPython resides) if your program stops running normally.

Before doing this, be comfortable with entering safe mode on your board, because an errant watchdog reset can make it difficult to modify your code.py.

You’d initialize the watchdog with a timeout:

from microcontroller import watchdog as w
from watchdog import WatchDogMode

w.timeout = 2.5  # timeout in seconds
w.mode = WatchDogMode.RESET  # reset the microcontroller if the timeout expires
w.feed()  # restart the countdown

and within your top-level loop you’d keep the watchdog happy with w.feed(). If it’s ever NOT called for 2.5 seconds, the whole microcontroller is reset, similar to pressing the reset button.
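As a rough sketch of that top-level loop, the helper below feeds the watchdog only after each pass of work completes, so a fetch that never returns eventually triggers the reset. `run_with_watchdog` and its parameters are hypothetical names, not part of any CircuitPython library. The watchdog object is passed in, on the assumption that on the board it would be `microcontroller.watchdog`, already configured as above.

```python
import time


def run_with_watchdog(wdt, work, iterations=None, delay=1.0, sleep=time.sleep):
    """Run work() in a loop, feeding the watchdog after each completed pass.

    wdt is any object with a feed() method (assumed to be
    microcontroller.watchdog on the board, with timeout and mode set).
    iterations=None loops forever, as a code.py main loop would.
    Returns the number of completed passes.
    """
    count = 0
    while iterations is None or count < iterations:
        work()      # e.g. lambda: network.fetch(url); a hang here skips feed()
        wdt.feed()  # restart the watchdog countdown
        sleep(delay)
        count += 1
    return count
```

The timeout has to be longer than the slowest normal pass, including the fetch and the sleep. The 2.5 seconds used above is probably too short for a network request, so pick a value with some margin or the board will reset during ordinary operation.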