SDL: Event processing seems to be slow and dependent on the SDL_Delay interval between frames

I have created a fairly complex main loop in my engine. This loop calls the logic update and rendering functions at approximately equal intervals (fixed/limited frame rate). I have neither the target logic update code nor the rendering code yet, so I implemented spinlocks that mimic the actual code.

These spinlocks consume a certain number of milliseconds over and over again by calling SDL_GetPerformanceCounter, so they wait a given number of milliseconds with a very high precision (usually down to a microsecond). I added code for updating and rendering time measurements, which are represented as milliseconds (as floats). If the spinlock of the logic update is to be 5ms and the rendering time is 10ms, the measurement results look something like this:

Update: 5.00059ms | Render: 10.00030ms

A high enough precision. This is where the problem arises. Logic update is also event queue processing, so the update measurement relates to event processing and the update function (the one with a spinlock for 5ms). The results are as follows:

Update: 5.14063ms | Render: 10.00047ms

If I exclude the 5ms spinlock from the measurement and measure only the event queue processing time, the results look like this:

Update: 0.18426ms | Render: 10.00049ms

Thus, processing the event queue itself takes from a few tenths of a millisecond to even more than one millisecond (sometimes almost two milliseconds). It is unacceptable for event processing to take up to 5-10% of the time devoted to generating a frame (at 60fps). I measured the SDL event processing time by running the program and not generating any events (I didn’t move the mouse, I didn’t press any keys), so the events queue was empty almost all the time.

I thought the problem was with my loop, so I created a minimal test program to test the event processing only measurement. It looks like this:

{$MODE OBJFPC}

uses
  SDL2;
var
  Window: PSDL_Window;
  Event:  TSDL_Event;
var
  Stopped: Boolean = False;
var
  TicksLast:    Int64;
  TicksCurrent: Int64;
var
  CounterBegin:     Int64;
  CounterEnd:       Int64;
  CounterSample:    Int64;
  CounterPerSecond: Int64;
begin
  SDL_Init(SDL_INIT_EVERYTHING);

  Event  := Default(TSDL_Event);
  Window := SDL_CreateWindow(nil, SDL_WINDOWPOS_CENTERED, SDL_WINDOWPOS_CENTERED, 800, 600, SDL_WINDOW_SHOWN);

  CounterPerSecond := SDL_GetPerformanceFrequency();
  TicksLast        := SDL_GetTicks64() div (1000 div 4);

  repeat
    CounterBegin := SDL_GetPerformanceCounter();

      while SDL_PollEvent(@Event) = 1 do
        if Event.Type_ = SDL_QUITEV then
          Stopped := True;

    CounterEnd    := SDL_GetPerformanceCounter();
    CounterSample := CounterEnd - CounterBegin;

    TicksCurrent := SDL_GetTicks64() div (1000 div 4);

    if TicksCurrent <> TicksLast then
    begin
      WriteLn('time: ', (CounterSample * 1000 / CounterPerSecond):1:3, 'ms | sample: ', CounterSample);
      TicksLast := TicksCurrent;
    end;

    SDL_Delay(1);
  until Stopped;

  SDL_DestroyWindow(Window);
  SDL_Quit();
end.

Operation is simple:

get the number of ticks before processing the event loop,
process all queued events
get the number of ticks after processing the event loop,
display the measurement in the console (limited to four times per second)
wait a millisecond for the next measurement (it saves CPU) and go back to step 1.

The results are as follows:

time: 0.046ms | sample: 461
time: 0.010ms | sample: 100
time: 0.047ms | sample: 465
time: 0.010ms | sample: 103
time: 0.047ms | sample: 473
time: 0.010ms | sample: 103
time: 0.047ms | sample: 469
time: 0.010ms | sample: 103
time: 0.047ms | sample: 468
time: 0.010ms | sample: 101
time: 0.124ms | sample: 1238
time: 0.047ms | sample: 467
time: 0.010ms | sample: 101
time: 0.046ms | sample: 463
time: 0.136ms | sample: 1359
time: 0.136ms | sample: 1362
time: 0.045ms | sample: 449

As you can see, the very processing of events (despite not generating them) takes from a few hundredths of a millisecond to a few tenths of a millisecond. Now the most interesting — I change SDL_Delay(1) to SDL_Delay(10):

time: 0.091ms | sample: 905
time: 0.068ms | sample: 681
time: 0.379ms | sample: 3794
time: 0.103ms | sample: 1033
time: 0.069ms | sample: 686
time: 0.176ms | sample: 1764
time: 0.294ms | sample: 2942
time: 0.175ms | sample: 1754
time: 0.121ms | sample: 1210
time: 0.093ms | sample: 933
time: 0.095ms | sample: 949
time: 0.090ms | sample: 901
time: 0.089ms | sample: 893
time: 0.076ms | sample: 761
time: 0.179ms | sample: 1793

Now event processing takes from a few hundredths of a millisecond to even 0.4ms. Now for SDL_Delay(100):

time: 0.295ms | sample: 2945
time: 0.345ms | sample: 3451
time: 0.794ms | sample: 7941
time: 0.323ms | sample: 3227
time: 0.298ms | sample: 2982
time: 0.295ms | sample: 2953
time: 0.469ms | sample: 4691
time: 0.295ms | sample: 2950
time: 0.323ms | sample: 3230
time: 0.615ms | sample: 6154
time: 0.340ms | sample: 3403
time: 0.339ms | sample: 3388
time: 0.356ms | sample: 3556
time: 0.337ms | sample: 3370
time: 0.343ms | sample: 3433
time: 1.159ms | sample: 11585
time: 0.660ms | sample: 6601
time: 0.341ms | sample: 3411
time: 0.352ms | sample: 3523
time: 0.342ms | sample: 3415

Processing of events takes from a few tenths of a millisecond to over one millisecond. For SDL_Delay(500):

time: 2.073ms | sample: 20731
time: 1.399ms | sample: 13985
time: 3.798ms | sample: 37983
time: 4.745ms | sample: 47452
time: 2.030ms | sample: 20296
time: 2.040ms | sample: 20396
time: 2.259ms | sample: 22590
time: 4.019ms | sample: 40187
time: 2.196ms | sample: 21957
time: 1.975ms | sample: 19749
time: 2.307ms | sample: 23074
time: 2.301ms | sample: 23013
time: 2.593ms | sample: 25927
time: 1.972ms | sample: 19722
time: 2.032ms | sample: 20317
time: 3.758ms | sample: 37582

Now it takes a few milliseconds all the time. For SDL_Delay(1000):

time: 4.549ms | sample: 45487
time: 4.860ms | sample: 48596
time: 4.597ms | sample: 45973
time: 4.699ms | sample: 46986
time: 5.158ms | sample: 51582
time: 4.550ms | sample: 45502
time: 5.308ms | sample: 53084
time: 5.304ms | sample: 53037
time: 2.319ms | sample: 23192
time: 4.635ms | sample: 46354
time: 4.761ms | sample: 47608
time: 4.454ms | sample: 44542
time: 4.637ms | sample: 46371
time: 4.949ms | sample: 49490
time: 4.469ms | sample: 44686
time: 4.661ms | sample: 46608
time: 4.728ms | sample: 47281

As you can see, the larger the SDL_Delay interval between individual measurements, the longer it takes to process the event queue, or at least this is what the measurements show. The larger the interval, the greater the difference in the number of ticks taken before and after the event loop.

Why does the interval of the SDL_Delay function called between measurements affect the measurement of event processing time? After all, this function is called between measurements, not during them, so it should not affect them (only increase or decrease the number of measurements per second).

The same is true if I explicitly use the QueryPerformanceFrequency, QueryPerformanceCounter and Sleep functions, so this is not a problem with SDL_Delay (especially since it uses Sleep internally).

I know guys that you use C language and I use Free Pascal, so in the attachment I added test programs compiled in FPC so that you can test them on your own (Win 64-bit). Attached is the last stable library SDL2.dll, downloaded from this repository.

There are several executable files in the attachment — the number in the executable name is the interval used in the SDL_Delay function called between measurements (no number in the name is a version without using SDL_Delay).

Attachment — timming_test.zip

About this issue

Original URL
State: closed
Created 2 years ago
Comments: 15 (8 by maintainers)

Most upvoted comments

@peppy: even if PeekEvents works for their case, I think it is worth to check why PumpEvents is slow. At some point PumpEvents must be called either directly or indirectly. It does not make sense to expect the application to change its main loop.

@furious-programming : do you have some special sensor or what kind of videocard are you using? I think it would help to find out the bottleneck. Have you tried to set SDL_JOYSTICK_DISABLED or SDL_SENSOR_DISABLED?

pionere on Nov 9, 2022

I’ll go ahead and close this, but if you find new information, please let us know!

slouken on Nov 15, 2022

Before creating this issue, I checked if the queue was actually empty, so I did not include it in the logs. Anyway, it is logical that if I do not move the mouse over the window and do not press the keys, then no events are generated.

Okay, here are the logs with added event count tracking and comments that indicate problematic lines. This time, I alternately did nothing and tested the event generation by moving the cursor over the window. I removed the excess lines showing 0 events to shorten the log. This is a test for SDL_Delay(1):

time: 0.010ms | sample:   104 | events:  0
time: 0.148ms | sample:  1484 | events: 11 // 11 events, 0.148ms - bad
time: 0.048ms | sample:   477 | events: 22
time: 0.010ms | sample:   103 | events:  0
time: 0.046ms | sample:   464 | events:  0
time: 0.046ms | sample:   461 | events:  0
time: 0.046ms | sample:   462 | events:  0
time: 0.310ms | sample:  3104 | events: 28 // 0.310ms to process 28 events
time: 0.175ms | sample:  1748 | events: 29
time: 0.026ms | sample:   257 | events:  0
time: 0.201ms | sample:  2010 | events:  0 // 0.201ms but no events processed...
time: 0.013ms | sample:   129 | events:  0
time: 0.048ms | sample:   480 | events:  0
time: 0.010ms | sample:   103 | events:  0
time: 0.010ms | sample:   101 | events: 11 // 11 events, only 0.010ms
time: 0.011ms | sample:   109 | events:  0
time: 0.043ms | sample:   425 | events:  5
time: 0.011ms | sample:   108 | events: 12 // 0.011ms to process 12 events
time: 0.010ms | sample:   102 | events:  0
time: 0.047ms | sample:   470 | events:  0
time: 0.010ms | sample:   102 | events:  0
time: 0.047ms | sample:   473 | events:  0
time: 0.047ms | sample:   466 | events:  0
time: 0.012ms | sample:   118 | events: 21 // 21 events, only 0.012ms to process
time: 0.047ms | sample:   467 | events: 26 // 26 events, only 0.047ms
time: 0.011ms | sample:   109 | events:  0
time: 0.042ms | sample:   417 | events: 25 // 25 events, only 0.042ms
time: 0.010ms | sample:   100 | events: 17 // 17 events, only 0.010ms
time: 0.162ms | sample:  1615 | events: 26 // 26 events, but many times longer processing
time: 0.047ms | sample:   471 | events: 16 // 16 events, 0.047ms
time: 0.010ms | sample:   100 | events: 14
time: 0.010ms | sample:   102 | events:  0
time: 0.047ms | sample:   466 | events:  0
time: 0.010ms | sample:   103 | events:  0
time: 0.047ms | sample:   465 | events:  0
time: 0.049ms | sample:   488 | events:  0
time: 0.048ms | sample:   481 | events: 16 // 16 events - again 0.048ms
time: 0.010ms | sample:   100 | events:  0
time: 0.049ms | sample:   486 | events:  0
time: 0.010ms | sample:   100 | events:  0
time: 0.011ms | sample:   110 | events:  0
time: 0.047ms | sample:   470 | events:  0
time: 0.011ms | sample:   106 | events:  0
time: 0.192ms | sample:  1925 | events: 12 // 0.192ms to process 12 events?!
time: 0.010ms | sample:   103 | events: 31 // 0.010ms to process 31 events - what a difference!
time: 0.048ms | sample:   483 | events: 26
time: 0.010ms | sample:   100 | events:  0
time: 0.047ms | sample:   465 | events:  2
time: 0.009ms | sample:    93 | events:  3
time: 0.047ms | sample:   473 | events: 19 // 19 events, 0.047ms
time: 0.009ms | sample:    93 | events:  0
time: 0.049ms | sample:   489 | events:  0
{...}

As can be clearly seen in the logs, when events appear in the queue, their processing time varies greatly. SDL can return 30 events in 0.01 ms, and then returns a similar number of events in several dozen times longer. Besides, once in a while it takes even 0.2 ms to process an empty queue (so calling the SDL_PollEvent function to get the result 0).

Another problem I noticed is that when you drag a window with the mouse, by grabbing its title bar, it adds moving time to adding the event processing time itself. Example for SDL_Delay(1):

time: 0.024ms | sample:   241 | events: 10
time: 0.346ms | sample:  3462 | events: 22
time: 0.010ms | sample:   104 | events: 27
time: 0.050ms | sample:   500 | events: 17
time: 0.051ms | sample:   508 | events: 14
time: 0.046ms | sample:   461 | events:  0
time: 2160.101ms | sample: 21601013 | events:  2 // time of moving window is added to the sample
time: 0.287ms | sample:  2870 | events:  0
time: 0.042ms | sample:   417 | events: 10
time: 0.047ms | sample:   468 | events: 23
time: 0.047ms | sample:   470 | events:  0
time: 0.010ms | sample:   103 | events:  5

I held the window by the title bar and moved it for about two seconds. The time of holding and moving the window has been added to the measurement — why? This happens at any SDL_Delay interval and every time I move the window.

And this is the test for SDL_Delay(100) between measurements:

// doing nothing:
time: 1.131ms | sample: 11307 | events:  0 // A la la la la long long li long long long, oh yeah!
time: 0.454ms | sample:  4544 | events:  0
time: 0.441ms | sample:  4410 | events:  0
time: 0.653ms | sample:  6534 | events:  0
time: 0.433ms | sample:  4333 | events:  0
time: 0.746ms | sample:  7463 | events:  0
time: 0.548ms | sample:  5476 | events:  0
time: 0.456ms | sample:  4562 | events:  0
time: 0.620ms | sample:  6199 | events:  0
time: 0.439ms | sample:  4390 | events:  0
time: 0.656ms | sample:  6565 | events:  0

// moving the mouse over the window:
time: 0.622ms | sample:  6224 | events:  2
time: 0.619ms | sample:  6186 | events:  3
time: 2.047ms | sample: 20474 | events:  2
time: 1.361ms | sample: 13611 | events:  3
time: 0.660ms | sample:  6595 | events:  2
time: 1.066ms | sample: 10658 | events:  2
time: 0.887ms | sample:  8870 | events:  3
time: 0.654ms | sample:  6542 | events:  2
time: 0.617ms | sample:  6168 | events:  3
time: 0.672ms | sample:  6719 | events:  2
time: 0.602ms | sample:  6019 | events:  3
time: 1.153ms | sample: 11528 | events:  2
time: 1.354ms | sample: 13538 | events:  3
time: 0.688ms | sample:  6879 | events:  2
time: 0.725ms | sample:  7253 | events:  3
time: 0.646ms | sample:  6463 | events:  2
time: 0.856ms | sample:  8562 | events:  3
time: 0.978ms | sample:  9775 | events:  2
time: 0.446ms | sample:  4458 | events:  2
time: 0.444ms | sample:  4444 | events:  0
time: 0.437ms | sample:  4370 | events:  0
time: 0.672ms | sample:  6721 | events:  1
time: 2.953ms | sample: 29532 | events:  3
time: 0.473ms | sample:  4729 | events:  0
time: 0.443ms | sample:  4434 | events:  0
time: 0.441ms | sample:  4413 | events:  0
time: 0.444ms | sample:  4444 | events:  0
time: 0.446ms | sample:  4461 | events:  0
time: 0.446ms | sample:  4462 | events:  0
time: 1.435ms | sample: 14351 | events:  0 // WTF is this...
time: 0.451ms | sample:  4511 | events:  0
time: 0.448ms | sample:  4476 | events:  0
time: 0.450ms | sample:  4503 | events:  0

When no events is processed, the times are significantly longer. For the SDL_Delay(1000):

// doing nothing:
time: 4.594ms | sample: 45936 | events:  0
time: 5.578ms | sample: 55778 | events:  0
time: 3.030ms | sample: 30298 | events:  0
time: 2.894ms | sample: 28938 | events:  0
time: 4.830ms | sample: 48301 | events:  0
time: 4.991ms | sample: 49907 | events:  0
time: 4.583ms | sample: 45831 | events:  0
time: 4.654ms | sample: 46535 | events:  0
time: 4.562ms | sample: 45616 | events:  0
time: 6.130ms | sample: 61298 | events:  0
time: 6.193ms | sample: 61930 | events:  0
time: 4.664ms | sample: 46643 | events:  0
time: 4.570ms | sample: 45696 | events:  0
time: 5.418ms | sample: 54185 | events:  0
time: 7.045ms | sample: 70447 | events:  0

// moving the cursor over the window:
time: 7.582ms | sample: 75824 | events:  2
time: 6.830ms | sample: 68295 | events:  1
time: 5.856ms | sample: 58558 | events:  1
time: 6.802ms | sample: 68017 | events:  1
time: 5.739ms | sample: 57389 | events:  1
time: 5.226ms | sample: 52257 | events:  0
time: 4.910ms | sample: 49103 | events:  1
time: 4.972ms | sample: 49716 | events:  1
time: 7.504ms | sample: 75042 | events:  1
time: 2.953ms | sample: 29532 | events:  1
time: 5.022ms | sample: 50224 | events:  1
time: 5.323ms | sample: 53233 | events:  1
time: 5.271ms | sample: 52711 | events:  1

The times are ridiculous. It’s also interesting that I’m moving the cursor all the time, making circles, but the queue only gets one event each second. The situation is different if I also click LMB while making circles with the cursor:

time: 6.046ms | sample: 60461 | events: 11
time: 4.670ms | sample: 46701 | events:  5
time: 7.254ms | sample: 72539 | events:  9
time: 11.297ms | sample: 112965 | events: 20
time: 12.228ms | sample: 122276 | events: 21
time: 15.361ms | sample: 153605 | events: 17
time: 10.515ms | sample: 105149 | events: 15
time: 8.147ms | sample: 81474 | events:  3
time: 15.132ms | sample: 151317 | events: 17
time: 12.836ms | sample: 128358 | events: 21
time: 12.371ms | sample: 123712 | events: 21
time: 12.695ms | sample: 126953 | events: 21
time: 16.724ms | sample: 167241 | events: 26
time: 12.847ms | sample: 128474 | events: 23
time: 10.596ms | sample: 105959 | events: 17
time: 10.694ms | sample: 106942 | events: 13

There are many events in the queue, but I didn’t click 21 times per second, but only few times per second (3-5), so the rest of the events must be related to the mouse movement.

It should also be taken into account that if the SDL_Delay interval is less than a quarter of a second (e.g. 1 or 10), the number of events displayed in the console does not apply to the number of events per frame, but to the sum of events in the last 250 ms. If the event counter is reset every frame (below printing data, at the end of the loop), the number of events in the console will be smaller, but the time samples will be the same (so much the worse for SDL).

It is difficult for me to say why the measurements look like this, and it is especially difficult for me to understand why manipulating the SDL_Delay interval (which is called between frames) has a real impact on the amount of time used by SDL_PollEvent. I am adding a new package of test programs to the attachments, which also displays the number of processed events every quarter second (events counter is reset 4 times per second).

Attachment — timming_test_2.zip

Here is the code template of these programs:

{$MODE OBJFPC}

uses
  SDL2;
var
  Window: PSDL_Window;
  Event:  TSDL_Event;
var
  Stopped: Boolean = False;
var
  TicksLast:    Int64;
  TicksCurrent: Int64;
var
  CounterBegin:     Int64;
  CounterEnd:       Int64;
  CounterSample:    Int64;
  CounterPerSecond: Int64;
var
  EventsTime: Single = 0.0;
  EventsAccu: Int32  = 0;
begin
  SDL_Init(SDL_INIT_EVERYTHING);

  Event  := Default(TSDL_Event);
  Window := SDL_CreateWindow(nil, SDL_WINDOWPOS_CENTERED, SDL_WINDOWPOS_CENTERED, 800, 600, SDL_WINDOW_SHOWN);

  CounterPerSecond := SDL_GetPerformanceFrequency();
  TicksLast        := SDL_GetTicks64() div (1000 div 4);

  repeat
    CounterBegin := SDL_GetPerformanceCounter();

      while SDL_PollEvent(@Event) = 1 do
      begin
        EventsAccu += 1;

        if Event.Type_ = SDL_QUITEV then
          Stopped := True;
      end;

    CounterEnd    := SDL_GetPerformanceCounter();
    CounterSample := CounterEnd - CounterBegin;

    TicksCurrent := SDL_GetTicks64() div (1000 div 4);

    if TicksCurrent <> TicksLast then
    begin
      EventsTime := CounterSample * 1000 / CounterPerSecond;

      WriteLn('time: ', EventsTime:1:3, 'ms | sample: ', CounterSample:5, ' | events:',  EventsAccu:3); // new print format

      TicksLast  := TicksCurrent;
      EventsAccu := 0;
    end;

    SDL_Delay(1); // delay between frames
  until Stopped;

  SDL_DestroyWindow(Window);
  SDL_Quit();
end.

furious-programming on Nov 6, 2022