triggers: Pod proliferation problem using Triggers with many webhooks

Expected Behavior

ValidationTasks allow events to be properly validated, which gates EventListener Triggers.

Actual Behavior

This works “as intended” under low traffic. However, an EventListener can currently serve multiple event sources and hold many Triggers, and each incoming event starts a validate Task for every single Trigger on the EventListener, which results in a lot of wasted work.

For example, take just two different GitHub repositories (they require separate validation), where one receives many events and the other receives none. Whenever an event comes in for the active repository, the validation Task for the inactive repository is run as well.

This could easily result in spikes of thousands of pods being created if someone were to put all their Triggers in a single EventListener. With a forced single-source approach, N listener pods are created for N event sources (compared to 1), but this lets native iptables rules route traffic through the underlying Services to these pods (with a single validation) rather than resolving it internally through many much slower Tasks.
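To make the trade-off concrete, here is a rough back-of-the-envelope model of the two layouts. It is only a sketch under simplifying assumptions that are not part of Triggers itself: every validate Task is one pod, every Trigger on a shared EventListener is validated for every event, and the single-source layout validates each event exactly once. The function names are hypothetical.

```python
def shared_listener_validations(events_per_source, trigger_count):
    """One EventListener holds every Trigger.

    Assumption: each incoming event starts one validate Task (pod)
    per Trigger, whether or not the Trigger matches the event source.
    """
    total_events = sum(events_per_source.values())
    return total_events * trigger_count


def single_source_validations(events_per_source):
    """One EventListener per event source.

    Assumption: each event is validated once, by its own listener.
    Returns (long-lived listener pods, validations performed).
    """
    listener_pods = len(events_per_source)
    validations = sum(events_per_source.values())
    return listener_pods, validations


# Two repositories, one Trigger each: one busy, one idle.
events = {"busy-repo": 1000, "idle-repo": 0}
print(shared_listener_validations(events, trigger_count=2))
print(single_source_validations(events))
```

Under these assumptions, the shared layout spends half of its validate pods on the idle repository, while the single-source layout pays a fixed cost of one long-lived listener pod per source.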

Additional Info

Here are some questions to evaluate under this issue.

About this issue

  • State: closed
  • Created 5 years ago
  • Comments: 18 (13 by maintainers)

Most upvoted comments

@vincent-pli fwiw - what you describe is how we use this project. I will likely never have an EventListener exposed outside the cluster for security reasons - everything gets processed and validated up front (and converted to CloudEvents) and the listener is only used locally. This pattern works really well so far. Hopefully whatever we land on doesn’t exclude our ability to make use of this pattern. The “optional” part of what @dibyom mentioned regarding the service above makes me feel like this won’t impair users’ ability to have a system up front doing validation for them instead (or in addition).

It seems to me like the questions at the core of this issue are:

1. Is it “bad” to have one EventListener for each webhook (when I have many webhooks)? Why is it “bad”?

  • This could create many idle EventListener sink pods
  • Define what is too “bad” to use

2. Is it “bad” to have many triggers in one EventListener? Why is it “bad”?

  • This could spawn many Validation tasks (and/or Filter tasks when this is implemented)
  • Define what is too “bad” to use

3. Which approach (1) or (2) should users take when creating many webhooks using Triggers?

  • Do we need an additional change/solution for a user with many webhooks to be able to use Triggers?

I think that we should rephrase this issue to answer these questions above. What do you think @vtereso?

Thanks @ncskier I agree we should tackle 1. and 2. separately.

To me 2. is the more immediate problem. As @vtereso and @khrm suggested, we could change the interface for the validation from a task that is run per event to something more long running.

One idea – the interface for the validation could be an objectReference to another Addressable, similar to how EventSources in knative/eventing reference their sink: https://github.com/knative/docs/blob/master/docs/eventing/samples/github-source/github-source.yaml#L17

Another idea could be to have something like an EventSource (or just reuse the Knative EventSources) that processes the event, i.e., validates it and then sends it to the Triggers EventListener for the actual triggering.
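As a rough illustration of the “validate up front, then forward” idea, the sketch below checks a GitHub webhook signature once in a long-running process and only then hands the event on. GitHub signs the request body with HMAC-SHA256 and sends the result in the `X-Hub-Signature-256` header as `sha256=<hex digest>`. Everything else here is an assumption for illustration: the function names, the per-repository secret map, and the `forward` callback standing in for delivery to the EventListener are hypothetical and not part of Triggers or Knative.

```python
import hashlib
import hmac


def is_valid_github_signature(secret: bytes, body: bytes, signature_header: str) -> bool:
    """Compare the X-Hub-Signature-256 header value against our own HMAC."""
    expected = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), signature_header.encode())


def validate_and_forward(secrets_by_repo, repo, body, signature_header, forward):
    """Validate one event once, then forward it (hypothetical flow).

    `forward` is a placeholder for sending the event to the listener.
    Returns True if the event was forwarded, False if it was rejected.
    """
    secret = secrets_by_repo.get(repo)
    if secret is None:
        return False  # unknown source: nothing to validate against
    if not is_valid_github_signature(secret, body, signature_header):
        return False
    forward(repo, body)
    return True
```

Because the check runs in-process, an event for one repository never causes any work for another repository's secret, which is the cost the per-Trigger validate Task incurs.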

One consideration is that, in the case of GitHub, someone could create multiple different webhooks on a repository, in which case we would probably have to take one of the following stances:

  • Expect one webhook with all the events instead
  • Require all the webhooks to use the same secret
  • Also have one EventListener for each of these distinct webhooks on the repository
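The stances above can be compared with a small sketch that counts how many distinct validators (or EventListeners) each repository would need, assuming one validator can only hold one secret per repository. The function name and the `(repo, secret_name)` input shape are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def listeners_needed(webhooks):
    """Count distinct secrets per repository.

    `webhooks` is an iterable of (repo, secret_name) pairs.
    Assumption: webhooks that share a secret can share one validator;
    each additional secret on a repository needs its own.
    """
    secrets_by_repo = defaultdict(set)
    for repo, secret_name in webhooks:
        secrets_by_repo[repo].add(secret_name)
    return {repo: len(names) for repo, names in secrets_by_repo.items()}


webhooks = [
    ("org/app", "shared"),  # two webhooks, same secret
    ("org/app", "shared"),
    ("org/lib", "push"),    # two webhooks, different secrets
    ("org/lib", "release"),
]
print(listeners_needed(webhooks))
```

A repository that reports a count of 1 fits the first two stances; any higher count corresponds to the third, with one EventListener per distinct webhook secret.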