pyodbc: Unable to insert UTF-8 character with fast_executemany option

I am trying to copy data from one SQL Server instance to another using pyODBC package and I encountered an error while handling UTF-8 characters on Windows.

I have narrowed it down to this snippet, which fails for me:

import pyodbc
import sys
print('system:' + str(sys.getwindowsversion()))
print('python:' + sys.version)
print('pyodbc:' + pyodbc.version)
cnxn = pyodbc.connect(
    r'''Driver=ODBC Driver 17 for SQL Server;Server=localhost;'''
    r'''Database=DB_Pros;port=1433;Trusted_Connection=yes'''
    )
crsr = cnxn.cursor()
crsr.fast_executemany=True

crsr.execute("""DROP TABLE IF EXISTS dbo.testing_table""")
crsr.execute("""CREATE TABLE dbo.testing_table (col1 nvarchar(max) null)""")
crsr.executemany("""INSERT INTO dbo.testing_table (col1) VALUES (?)""", [[' 🎥 ',]])

return_value = crsr.execute("SELECT * FROM dbo.testing_table").fetchone()

print(return_value )

crsr.close()
cnxn.close()

I get the following error:

system:sys.getwindowsversion(major=10, minor=0, build=18362, platform=2, service_pack='')
python:3.7.1 (default, Oct 28 2018, 08:39:03) [MSC v.1912 64 bit (AMD64)]
pyodbc:4.0.27
Traceback (most recent call last):
  File "test.py", line 15, in <module>
    crsr.executemany("""INSERT INTO dbo.testing_table (col1) VALUES (?)""", [[' 🎥 ',]])
pyodbc.DataError: ('22026', '[22026] [Microsoft][ODBC Driver 17 for SQL Server]String data, length mismatch (0) (SQLParamData)')

Everything works fine when I set the crsr.fast_executemany=False.

This seems to be somewhat related to #246 , but this happens on Windows system.

About this issue

Original URL
State: closed
Created 5 years ago
Comments: 17 (5 by maintainers)

Commits related to this issue

Fix for #617 Incorrect size calculation for Unicode characters (#652) * Merging updates. (#1) Merging updates. * fix for smalldatetime issue * Fixed a bad merge * Fix for inserting high u... — committed to mkleehammer/pyodbc by deleted user 5 years ago

Most upvoted comments

Is the fix for this problem (parameters not literals) currently scheduled for any release?

In the interim is there a workaround, a way of ‘pre-treating’ the strings to bypass the error? The closest I can get is val.encode('unicode-escape').decode('utf-8') anything going to nvarchar(max), but then they look pretty bad because it encodes even unicode chars below 256, such as common western language accented vowels.

I load/reload a lot of data daily, many millions of rows, so I need fast_executemany. I just swapped out turbodbc, because it has its own problems with unicode strings longer than 1600 chars, and now I have hit this issue.

Any advice is appreciated, thanks.

ambarbour on Nov 25, 2019