edgedb: Prohibit DML queries in function bodies

The proposal is to prohibit INSERT, UPDATE, and DELETE statements in function bodies.

Currently these statements cannot appear in conditional expressions. More specifically, they are executed unconditionally, so it is semantically misleading to place them in a conditional branch that may or may not be executed - so this is forbidden. There’s a specialized form of these statements that allows them to be conditionally executed #1639.

However, functions don’t have the same unconditional execution policy and if these statements were to appear in functions, this could lead to unpredictable results, especially since function calls may be optimized by the query planner based on volatility and where they appear. In order to prevent unspecified behavior we currently choose to ban data manipulation inside function bodies.

About this issue

Original URL
State: closed
Created 4 years ago
Comments: 15 (13 by maintainers)

Commits related to this issue

Forbid usage of DML in some special cases. (#1726) There are some limitations to using DML (`INSERT`, `DELETE`, `UPDATE`) statements as part of other expressions: * They cannot appear as branches... — committed to edgedb/edgedb by vpetrovykh 4 years ago
Add tracking of DML inside function bodies. Track the presence of DML statemnts and function calls that appear inside function bodies. Issue #1661 — committed to edgedb/edgedb by vpetrovykh 4 years ago
Add tracking of DML inside function bodies. Track the presence of DML statemnts and function calls that appear inside function bodies. Issue #1661 — committed to edgedb/edgedb by vpetrovykh 4 years ago
Add tracking of DML inside function bodies. Track the presence of DML statemnts and function calls that appear inside function bodies. Issue #1661 — committed to edgedb/edgedb by vpetrovykh 4 years ago
Add tracking of DML inside function bodies. Track the presence of DML statemnts and function calls that appear inside function bodies. Issue #1661 — committed to edgedb/edgedb by vpetrovykh 4 years ago
Prohibit DML in function bodies (for now) Data modification in functions has a number of problems: - Effects of DML inside a function is invisible to the calling query, which is confusing. - The e... — committed to edgedb/edgedb by elprans 3 years ago
Prohibit DML in function bodies (for now) Data modification in functions has a number of problems: - Effects of DML inside a function is invisible to the calling query, which is confusing. - The e... — committed to edgedb/edgedb by elprans 3 years ago
Prohibit DML in function bodies (for now) Data modification in functions has a number of problems: - Effects of DML inside a function is invisible to the calling query, which is confusing. - The e... — committed to edgedb/edgedb by elprans 3 years ago
Prohibit DML in function bodies (for now) Data modification in functions has a number of problems: - Effects of DML inside a function is invisible to the calling query, which is confusing. - The e... — committed to edgedb/edgedb by elprans 3 years ago

Most upvoted comments

The inability to create conditional triggers limits the flexibility of the application. I tried to work around this limitation using the following approach. Unfortunately it didn’t work.

function createL(app: LApp) -> bool {
    volatility := "Volatile";
    using (
        WITH
            l:= (INSERT L {code:= app.code})
        SELECT true)
}

trigger create_L_after_D after insert for each do (
    assert(createL(__new__.app) if __new__.event = Event.D else true)
);

Perhaps there is some chance to make a special type of functions to support DML?

mkhlsvn on Jul 17, 2023

mutation can also write to the database and external state. Can neither be used in constraints nor in conditionals.

I would even say that we need a statement like CALL create_new_session() or MUTATE WITH create_new_session() and this is the only place where these functions can be used. Then these functions are as apparent as INSERT/UPDATE/DELETE statements and can be used in a same way (i.e. not a part of conditionals, …etc)

tailhook on Aug 19, 2020

The simplest functions are pure, deterministic, do not access the database and always terminate normally. On top of that there are a bunch of additional capabilities a function could have:

Non determinism/read access to external state (randomness, current time, …)
Read access to the database
Write access to the database
Write access to external state
Throw an exception
Run forever

Functions used in constraints cannot access to database and not determinism is generally a bad idea in constraints.

Functions mutating cannot be optimized out and can only be used in certain places.

IMO you should encode at least enough of the above properties in the function to decide if the function can be used in constraints or conditionals.

The minimal design which achieves that would be three types of function:

function (pure, deterministic, can throw exceptions and run forever. Can be used in constraints and conditionals.
query can also read from the database and be read external state/be non deterministic. Can not be used in constraints but can be used in conditionals.
mutation can also write to the database and external state. Can neither be used in constraints nor in conditionals.

CodesInChaos on Aug 18, 2020

I hope you understand that this would be a big breaking change.

Absolutely! Hence the “under discussion” label here.

The ability to use DML in functions was there for months

You might have noticed that functions with DML have serious issues: 1) it’s impossible to see the effect of a data-modifying function in a calling query; 2) depending on where the call site in the query is, the function might not even get called, or it might, the behavior is undefined. We’d like to avoid inheriting PostgreSQL mistakes and it’s not too late yet.

Is there is absolutely no other way around at the moment?

We can add a configuration knob that enables the current behavior, e.g. CONFIGURE SYSTEM/SESSION SET allow_dml_in_functions := true (it’ll be disabled by default).

Furhter, we will look into finding a way to allow DML-containing functions to be used in a safe and predictable manner, but before we do that it’s imperative that we don’t promote their use in new code.

elprans on Aug 14, 2020

The minimal design which achieves that would be three types of function:

Actually this is an interesting way of looking at this problem. I specifically like the notion of “mutation” you suggested: a special kind of function where DML is allowed and with a set of restrictions on its call sites.

1st1 on Aug 18, 2020