eModbus: ModbusServerRTU not working reliable for function code 0x10
My client (windows PC) ask from my server (ESP32) to write holding registers (0x10). When writing less than 5 words, it usually goes fine but still sometimes (maybe 1 out of 40 packets) gets an error thrown (PACKET_LENGTH_ERROR).
Increasing the amount of word writes to holding registers, you will receive more frequent errors. Going to 100 word writes (I know, absurd value) you will always get an error. I use 100 word writes to holding registers as a test case in the code. And I got it fixed that never an error will be thrown.
The problem is because of the following;
The ModbusServerRTU function serve() is pinned to core 0 and assigned the highest priority (number 8 priority).
See:
xTaskCreatePinnedToCore((TaskFunction_t)&serve, taskName, 4096, this, 8, &serverTask, coreID >= 0 ? coreID : NULL);
Now going to the file RTUuntils.cpp. The function ModbusMessage RTUutils::receive waits for new data to arrive on the serial port in the state “WAIT_DATA”. When a single byte is received, the state is changed from “WAIT_DATA” to “IN_PACKET”.
And here is the problem. “IN_PACKET” is an blocking function. It blocks the entire RTOS. Thus also Queues will not work anymore from FreeRtos. The serial port espHAL uses the FreeRtos Queue to transfer data from hardware uart to the FreeRtosQueue.
Meaning, when in the state “IN_PACKET” no new serial port data can arrive in the freertos queue. You will only receive the current queue data until it is empty. New serial port data during this time will not be added to the queue causing that modbus frame’s will not be received completely if they are a longer than 1mS.
- Quick solution that I have tested to work FINE with 100 words. (zero faults!).
In the “IN_PACKET” routine add a delay “delay(1);” thus giving the scheduler some space to allow the queues to be filled. Also increase the interval for this state to e.g. 10mS. Why such a high interval? Because on the core0 also WiFI functions are operating. When using WiFi like I do, it is possible that there will be some context switches available that can cause context switches delay of several mS.
The question also is, why is such tight “interval” value is needed anyways. Because currently at 56k6 BAUD rate, the interval will be only 650uS. For a server, it is absolutely no problem to react some time later. Even the most robust PLC’s that I work with, do not reply that soon. They take 50mS or more for a response. Do note that the interval specified in the ModbusRTU is mininum interval. Not maximum. RTU servers may not respond earlier than this interval, but later is no problem.
- Solid solution: It could be improved the following way:
Add to the “IN_PACKET” state an routine that checks the incoming data. With the first let say 6 bytes from the modbus frame, you can already determine what frame length of data is to be expected. Using such a peek function, after just receiving 6 bytes of data from a client, you can calculate how many bytes are left to be received thus at arrival of the latest char you can directly start processing without any interval delay.
case IN_PACKET:
// Data waiting and space left in buffer?
while (serial.available()) {
// Yes. Catch the byte
// Due to a latent bug in esp32-hal-uart.c of the arduino-esp32 core,
// available() may return a 1 although there is no byte in the buffer yet.
// So we have to attempt a read() and discard the result, if it is < 0.
int b = serial.read();
if (b >= 0) {
buffer[bufferPtr++] = b;
// Buffer full?
if (bufferPtr >= bufferBlocks * BUFBLOCKSIZE) {
// Yes. Extend it by another block
bufferBlocks++;
uint8_t *temp = new uint8_t[bufferBlocks * BUFBLOCKSIZE];
memcpy(temp, buffer, (bufferBlocks - 1) * BUFBLOCKSIZE);
delete[] buffer;
buffer = temp;
}
// Rewind timer
lastMicros = micros();
}else
{
LOG_D("Err no byte in buffer\n");
}
}
delay(1); // **<---- add this delay for the scheduler to have some room to fill queues**
// Gap of at least _interval micro seconds passed without data?
if (micros() - lastMicros >= 10000){// interval) { // **<---- Replace interval for fixed timeout**
state = DATA_READ;
LOG_D("RTU Interval end \n");
}
break;
About this issue
- Original URL
- State: closed
- Created 3 years ago
- Comments: 29 (29 by maintainers)
Hmm, never heard about that looong intervals before. The client in fact is dictating the interval, so if a server timing-wise would be below the standards of the client, it could not work reliably. In the
ModbusClientRTU
code thebegin()
function has an optional factor parameter to scale the interval up to cope with slow responding servers.But I always wanted to have the server code maintain the interval as good as possible. I will experiment a bit with different ideas I already got. PLS SBY! 😉
I will revert to the
delay(1)
as well and take out thetaskYIELD()
.Despite using RTU and WIFI in parallel on the bridge devices I do not remember having any freezes or reboots, so I will leave the RTU task priority at 6. Anyone is free to choose core 1 for it anyway - that is in fact what we suggest in the docs.
Thank you for your insights and suggestions!
The current “hot thing” I am working on is a
CoilData
class to make handling of this data type easier. This will be released next together with the RTU changes. Feel free to try it out!Before I created many libraries for Microchip 16/32 bit devices and one of them was Modbus. The power with those devices was that there is no OS. You could pre-calculate timings to the microsecond.
However, this is not a time-critical bus (Modbus). If someone need something time critical, probably they choose CANbus. For now, I will use this library with the
delay(1);
inside, works rock solid and I think it is the best you can do when A) using an OS and B) not sacrificing that other libraries wont work anymore and C) to have an universal library.Next step for me is to change the priority to zero and maybe increase the hardware buffer to 512 bytes. That way, the OS at ESP32 just gets an trigger when there is data in the buffer and thus can be read out at once.
Now I am also using an auto direction RS485 chip, so no more delay toggling the receive/drive enable pin. If you have something new to test, let me know 😃
Thanks for looking at the code again.
I was not aware that
delayMicroseconds()
was blocking; I thought it would behave the same asdelay()
. Since the minimal interval now is 1750µs, adelay(1)
would indeed be better suited.The
while
loop is trying to do several things at once. The buffer extension part most probably will not come into play at all with behaving participants on the bus, but will kick in when any interval is missed, so we seemingly get a message exceeding the 256 bytes. While this message will be meaningless anyway, we at least will “clean” the wire for the next regular packet to drop in.If we have the situation that you described with stray bits coming in all the time, it might be worthwhile not to break the loop after a fixed time, but when the buffer extension is filled as well. This will be below the one second with baud rates > 5120 and still be an emergency stop for the loop.
I quickly changed the code in
receive()
to a fixed 512-byte buffer and jump out thewhile
if that buffer is exhausted. THat simplified the code a lot, the while loop now will be less blocking even if a regular packet is caught.Right. But give me some more time to think about it - maybe I get another idea…