pandas: REGR: ImportError on other dbapi2 connections when sqlalchemy is not installed

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import psycopg2 #version 2.9.3
import pandas as pd #version 1.4.0
conn = psycopg2.connect(
        host=XXX,
        dbname=XXX,
        user=XXX,
        password=XXX,
    )
df = pd.read_sql("select 1", conn)

ImportError: Missing optional dependency 'SQLAlchemy'.  Use pip or conda to install SQLAlchemy.

Issue Description

There was no SQLAlchemy dependency in the previous version when using a psycopg2 connection object in read_sql pandas function

Even after installing SQLAlchemy, there is a unexpected warning :

UserWarning: pandas only support SQLAlchemy connectable(engine/connection) ordatabase string URI or sqlite3 DBAPI2 connectionother DBAPI2 objects are not tested, please consider using SQLAlchemy

Expected Behavior

No error neither warning using a psycopg2 connection object in read_sql pandas function

Installed Versions

INSTALLED VERSIONS

commit : bb1f651536508cdfef8550f93ace7849b00046ee python : 3.9.9.final.0 python-bits : 64 OS : Linux OS-release : 5.13.0-27-generic Version : #29~20.04.1-Ubuntu SMP Fri Jan 14 00:32:30 UTC 2022 machine : x86_64 processor : x86_64 byteorder : little LC_ALL : None LANG : en_US.UTF-8 LOCALE : en_US.UTF-8

pandas : 1.4.0 numpy : 1.22.1 pytz : 2021.3 dateutil : 2.8.2 pip : 20.0.2 setuptools : 44.0.0 Cython : None pytest : None hypothesis : None sphinx : None blosc : None feather : None xlsxwriter : None lxml.etree : None html5lib : None pymysql : None psycopg2 : 2.9.3 jinja2 : None IPython : 7.31.1 pandas_datareader: None bs4 : None bottleneck : None fastparquet : None fsspec : None gcsfs : None matplotlib : None numba : None numexpr : None odfpy : None openpyxl : None pandas_gbq : None pyarrow : None pyreadstat : None pyxlsb : None s3fs : None scipy : 1.7.3 sqlalchemy : 1.4.31 tables : None tabulate : None xarray : None xlrd : None xlwt : None zstandard : None

About this issue

  • Original URL
  • State: closed
  • Created 2 years ago
  • Reactions: 9
  • Comments: 25 (9 by maintainers)

Most upvoted comments

Same for me: psycopg2.connect() has worked flawlessly for 3 years and I am ignoring this warning for now:

import psycopg2
import warnings
import pandas as pd


def db_query(sql_query: str, db_conn: psycopg2.extensions.connection) -> pd.DataFrame:
    """Execute SQL Query and get Dataframe with pandas"""
    with warnings.catch_warnings():
        # ignore warning for non-SQLAlchemy Connecton
        # see github.com/pandas-dev/pandas/issues/45660
        warnings.simplefilter('ignore', UserWarning)
        # create pandas DataFrame from database query
        df = pd.read_sql_query(sql_query, db_conn)
    return df

db_conn= psycopg2.connect(
        host=DB_HOST,
        port=DB_PORT ,
        dbname=DB_NAME,
        user=DB_USER,
        password=DB_PASS
)

df_test = db_query("SELECT * FROM public.test LIMIT 1", db_conn)

I found this link: https://stackoverflow.com/questions/71082494/getting-a-warning-when-using-a-pyodbc-connection-object-with-pandas

from sqlalchemy.engine import URL connection_string = “DRIVER={ODBC Driver 17 for SQL Server};SERVER=dagger;DATABASE=test;UID=user;PWD=password” connection_url = URL.create(“mssql+pyodbc”, query={“odbc_connect”: connection_string})

engine = create_engine(connection_url)

df = pd.sql_read(query, engine)

@fangchenli so just to confirm, are you saying you never supported other connection types, but this warning has been added recently? I’m suddenly seeing the same message using psycopg2 on code that I’ve been running fine for 3+ years?

Also, the message itself has grammatical errors

only support should be only supports ordatabase should be or database

That is correct. Before the recent changes, all non-sqlalchemy connections were passed to a fallback class that was only tested against sqlite3. Surprisingly it has worked well with many different connection types for many years. We probably will address this issue and officially support other raw SQL connections soon.

And apologize for the grammatical errors lol…

@fangchenli so just to confirm, are you saying you never supported other connection types, but this warning has been added recently? I’m suddenly seeing the same message using psycopg2 on code that I’ve been running fine for 3+ years? Also, the message itself has grammatical errors only support should be only supports ordatabase should be or database

That is correct. Before the recent changes, all non-sqlalchemy connections were passed to a fallback class that was only tested against sqlite3. Surprisingly it has worked well with many different connection types for many years. We probably will address this issue and officially support other raw SQL connections soon.

I think supporting various database connections is the way to go. I’ve tested pretty much all connectors and they did not report an error until very recently. Even with an error reported, my code works fine. The error is just annoying.

In testing I’ve found that the DBAPI2 connectors are much, much faster than others, at least when connecting to SQL Server. CTDS (https://zillow.github.io/ctds/) is actually the fastest connector library that I’ve found for connecting to SQL Server, however, pandas does still give an error when using this library. It would be great if this and other connectors could get ‘official’ support.

It was asked above how to make the warning go away, for psycopg2 connections (and I suspect most of the others) I found a fairly painless way of quickly building an sqlalchemy engine from an existing connection which should then stop those annoying warnings:

import pandas as pd
import psycopg2
import sqlalchemy
from sqlalchemy.pool import StaticPool
…
engine = sqlalchemy.create_engine('postgresql+psycopg2://', poolclass=StaticPool, creator= lambda: {psycopg2connection})
data = pd.read_sql_query("select something”, con=engine)

I hope this helps.

Sorry I’m a late add here. I got the same warning with cx_Oracle.connect and pyodbc.connect. That didn’t happen before until I reinstalled all my python packages. Current using pandas 1.4.1. Also installed SQLAlchemy-1.4.31.

Anyway to include these connections? What should I do to make the warning go away?

Does anyone know a way to simply suppress the warning without installing sqlalchemy?

Approach of @Sieboldianus works fine

@fangchenli so just to confirm, are you saying you never supported other connection types, but this warning has been added recently? I’m suddenly seeing the same message using psycopg2 on code that I’ve been running fine for 3+ years? Also, the message itself has grammatical errors only support should be only supports ordatabase should be or database

That is correct. Before the recent changes, all non-sqlalchemy connections were passed to a fallback class that was only tested against sqlite3. Surprisingly it has worked well with many different connection types for many years. We probably will address this issue and officially support other raw SQL connections soon.

And apologize for the grammatical errors lol…

@fangchenli Could you please give us a tentative date on when this would be fixed? Thanks

@fangchenli so just to confirm, are you saying you never supported other connection types, but this warning has been added recently? I’m suddenly seeing the same message using psycopg2 on code that I’ve been running fine for 3+ years?

Also, the message itself has grammatical errors

only support should be only supports ordatabase should be or database