ozo: connection pool error: get resource timeout

I keep seeing ozo::error_code errors with the following value and message.

yamail::resource_pool::error::detail::category:1

get resource timeout
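For context, the category/value/message above are just what I get from the standard error_code accessors, roughly like this (the helper below is only an illustration, not my actual logging code; as far as I can tell ozo::error_code is compatible with boost::system::error_code):

#include <boost/system/error_code.hpp>
#include <iostream>

// How the value/message above were obtained, using the standard accessors.
void log_error(const boost::system::error_code& ec)
{
    std::cerr << ec.category().name() << ":" << ec.value() << "\n"  // yamail::resource_pool::error::detail::category:1
              << ec.message() << std::endl;                         // get resource timeout
}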

This is happening in the middle of my application’s activities, after database activity has already been running normally for several minutes.

I’m using the default ozo::connection_pool_config settings.
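For reference, the pool is set up roughly like the sketch below, with a default-constructed config (the factory and config field names are from the ozo docs as I remember them and may differ between versions; the connection string is a placeholder):

#include <boost/asio/io_context.hpp>
// ...plus the relevant ozo headers (connection_info / connection_pool).

boost::asio::io_context ioc;

// Connection string is a placeholder.
ozo::connection_info<> conn_info("host=... dbname=... user=... password=...");

// Default-constructed config, i.e. the default capacity / queue capacity / timeouts.
ozo::connection_pool_config pool_config;

auto pool = ozo::make_connection_pool(conn_info, pool_config);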

I’m connecting to a database-as-a-service database in “the cloud”, and have a pgBouncer load balancer configured and managed by my service provider, so there shouldn’t be any issues connecting to the database, or any problem with a limited number of connection slots being available.

Could I get an explanation of possible causes for this error?

About this issue

  • State: closed
  • Created 5 years ago
  • Comments: 17 (8 by maintainers)

Most upvoted comments

For anyone reading along, I’ve implemented this functionality in an application-specific way, like so:

First, run SQL code similar to this:

CREATE OR REPLACE FUNCTION handle_telemetry_result_array(_ids TEXT[], _contents TEXT[])
RETURNS VOID AS
$BODY$
BEGIN
    IF ARRAY_LENGTH(_ids, 1) != ARRAY_LENGTH(_contents, 1) THEN
        RAISE invalid_parameter_value USING MESSAGE = 'Parameters must all be 1 dimensional arrays of equal length.';
    END IF;

    FOR i IN ARRAY_LOWER(_ids, 1) .. ARRAY_UPPER(_ids, 1)
    LOOP
        BEGIN
            PERFORM handle_telemetry_result(_ids[i], _contents[i]);
        EXCEPTION WHEN OTHERS THEN
            RAISE NOTICE 'Exception when running telemetry transaction for ID=%', _ids[i];
        END;
    END LOOP;
END
$BODY$ LANGUAGE plpgsql;

This function takes several arrays of data points, all of which must be the same length.

The function then iterates over each array, and executes your function of preference with the values provided.

In my case, the handle_telemetry_result() function looks up which function to execute based on the telemetryid that I provide, and executes it with the ID and contents provided.
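To see how the two arrays line up, an ad-hoc call through ozo might look roughly like this (pool and ioc stand in for your own connection pool and io_context; the IDs and payloads are made up):

// Illustrative one-off call, not the batching controller below: element i of the
// first array is paired with element i of the second by the plpgsql loop above.
std::vector<std::string> ids      {"telemetry-1", "telemetry-2"};
std::vector<std::string> contents {R"({"temp":20})", R"({"temp":22})"};

ozo::execute(pool[ioc],
             ozo::make_query("select handle_telemetry_result_array($1::TEXT[], $2::TEXT[])",
                             ids, contents),
             std::chrono::seconds(5),
             [](ozo::error_code ec, auto conn) { /* check ec, log on failure */ });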

Then, in your C++ code, something like this’ll do the trick.

struct TelemetryDataController : std::enable_shared_from_this<TelemetryDataController>
{
    TelemetryDataController(boost::asio::io_context&                         ioc,
                            ozo::connection_pool<ozo::connection_info<>>&    connPool)
     : m_ioc(ioc)
     , m_connPool(connPool)
    { }

    void report_telemetry(std::string                clientID,
                          std::optional<std::string> content);
private:
    /**
     * @brief execute
     * If there is no data in flight, sends current data to the database.
     * Otherwise does nothing.
     */
    void execute(void);

private:
    boost::asio::io_context&                         m_ioc;
    ozo::connection_pool<ozo::connection_info<>>&    m_connPool;

    //Member variables that accumulate data while waiting for the current query to finish.
    std::vector<std::string>                m_ids;
    std::vector<std::optional<std::string>> m_contents;

    //Member variables with data that's in-flight to the database.
    std::vector<std::string>                m_inFlightIds;
    std::vector<std::optional<std::string>> m_inFlightContents;
};


void TelemetryDataController::report_telemetry(std::string                clientID,
                                               std::optional<std::string> content)
{
    m_ids.push_back(std::move(clientID));
    m_contents.push_back(std::move(content));

    execute();
} // TelemetryDataController::report_telemetry


//-----------------------------------------------------------------------------

void TelemetryDataController::execute(void)
{
    BOOST_ASSERT(m_ids.size() == m_contents.size());

    if(   m_ids.empty()
       || m_contents.empty())
    {
        return;
    }

    BOOST_ASSERT(m_inFlightIds.size() == m_inFlightContents.size());

    if(   ! m_inFlightIds.empty()
       || ! m_inFlightContents.empty())
    {
        return;
    }

    std::swap(m_ids,      m_inFlightIds);
    std::swap(m_contents, m_inFlightContents);

    auto callback = [this, pSelf = this->shared_from_this()](ozo::error_code ec, auto conn)
    {
        if(ec) { /* log failure, or take other action on failure */ }
        m_inFlightIds.clear();
        m_inFlightContents.clear();

        this->execute();
    };

    ozo::execute(m_connPool[m_ioc],
                 ozo::make_query("select handle_telemetry_result_array($1::TEXT[], $3::TEXT[])",
                                 // These are kept alive by callback holding ptr to TelemetryDataController
                                 std::ref(m_inFlightIds),
                                 std::ref(m_inFlightContents)),
                 // Adjust to be application specific, maybe grow with data items?
                 std::chrono::seconds(5),
                 std::move(callback));
} // TelemetryDataController::execute()
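One wiring note: because TelemetryDataController derives from std::enable_shared_from_this, it must be owned by a std::shared_ptr (execute() calls shared_from_this() to keep the controller alive while a query is in flight). Putting it together looks roughly like this; the connection string and telemetry values are placeholders:

boost::asio::io_context ioc;

ozo::connection_info<> conn_info("host=... dbname=... user=... password=...");

// Assumes make_connection_pool() yields the ozo::connection_pool<ozo::connection_info<>>
// that the controller's constructor expects.
auto pool = ozo::make_connection_pool(conn_info, ozo::connection_pool_config{});

// Must be owned by a shared_ptr, or shared_from_this() inside execute() will throw.
auto controller = std::make_shared<TelemetryDataController>(ioc, pool);

controller->report_telemetry("device-42", std::string(R"({"temp":21.5})"));
controller->report_telemetry("device-43", std::nullopt);   // content is optional

ioc.run();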

In this way, you only send data to the database when you don’t already have a database operation in flight, and then as soon as that in-flight operation returns, another operation is triggered to send any data that has accumulated in the mean time.

Compared to simply having one DB connection per operation, this increases your average latency substantially. However, DB connections aren’t free, and you’re almost always limited to a small number of them, so it’s generally worth the trade off if you’re not very concerned about latency.

Is this a perfect solution? No, of course not. The only real solution to this problem is for libpq to gain the ability to use a single connection for multiple queries at the same time… even if the queries are all executed serially on the database, just having less dead air on the wire is a substantial reduction in latency.

But that’s out of the OZO library’s control. So this is what I’m using until that feature is implemented.