edgedb: Prohibit DML queries in function bodies

The proposal is to prohibit INSERT, UPDATE, and DELETE statements in function bodies.

Currently these statements cannot appear in conditional expressions. More specifically, they are executed unconditionally, so it is semantically misleading to place them in a conditional branch that may or may not be executed - so this is forbidden. There’s a specialized form of these statements that allows them to be conditionally executed #1639.

However, functions don’t have the same unconditional execution policy and if these statements were to appear in functions, this could lead to unpredictable results, especially since function calls may be optimized by the query planner based on volatility and where they appear. In order to prevent unspecified behavior we currently choose to ban data manipulation inside function bodies.

About this issue

  • Original URL
  • State: closed
  • Created 4 years ago
  • Comments: 15 (13 by maintainers)

Commits related to this issue

Most upvoted comments

The inability to create conditional triggers limits the flexibility of the application. I tried to work around this limitation using the following approach. Unfortunately it didn’t work.

function createL(app: LApp) -> bool {
    volatility := "Volatile";
    using (
        WITH
            l:= (INSERT L {code:= app.code})
        SELECT true)
}

trigger create_L_after_D after insert for each do (
    assert(createL(__new__.app) if __new__.event = Event.D else true)
);

Perhaps there is some chance to make a special type of functions to support DML?

mutation can also write to the database and external state. Can neither be used in constraints nor in conditionals.

I would even say that we need a statement like CALL create_new_session() or MUTATE WITH create_new_session() and this is the only place where these functions can be used. Then these functions are as apparent as INSERT/UPDATE/DELETE statements and can be used in a same way (i.e. not a part of conditionals, …etc)

The simplest functions are pure, deterministic, do not access the database and always terminate normally. On top of that there are a bunch of additional capabilities a function could have:

  1. Non determinism/read access to external state (randomness, current time, …)
  2. Read access to the database
  3. Write access to the database
  4. Write access to external state
  5. Throw an exception
  6. Run forever

Functions used in constraints cannot access to database and not determinism is generally a bad idea in constraints.

Functions mutating cannot be optimized out and can only be used in certain places.

IMO you should encode at least enough of the above properties in the function to decide if the function can be used in constraints or conditionals.

The minimal design which achieves that would be three types of function:

  • function (pure, deterministic, can throw exceptions and run forever. Can be used in constraints and conditionals.
  • query can also read from the database and be read external state/be non deterministic. Can not be used in constraints but can be used in conditionals.
  • mutation can also write to the database and external state. Can neither be used in constraints nor in conditionals.

I hope you understand that this would be a big breaking change.

Absolutely! Hence the “under discussion” label here.

The ability to use DML in functions was there for months

You might have noticed that functions with DML have serious issues: 1) it’s impossible to see the effect of a data-modifying function in a calling query; 2) depending on where the call site in the query is, the function might not even get called, or it might, the behavior is undefined. We’d like to avoid inheriting PostgreSQL mistakes and it’s not too late yet.

Is there is absolutely no other way around at the moment?

We can add a configuration knob that enables the current behavior, e.g. CONFIGURE SYSTEM/SESSION SET allow_dml_in_functions := true (it’ll be disabled by default).

Furhter, we will look into finding a way to allow DML-containing functions to be used in a safe and predictable manner, but before we do that it’s imperative that we don’t promote their use in new code.

The minimal design which achieves that would be three types of function:

Actually this is an interesting way of looking at this problem. I specifically like the notion of “mutation” you suggested: a special kind of function where DML is allowed and with a set of restrictions on its call sites.