pandas: pd.merge() doesn't merge int and str column dtypes but no warning or error

When merging a column of int dtype with a column of str dtype, the join silently finds no matches:

>>> import pandas as pd
>>> df1 = pd.DataFrame({"A":[0]})
>>> df2 = pd.DataFrame({"A":["0"]})
>>> pd.merge(df1, df2, on=["A"])
Empty DataFrame
Columns: [A]
Index: []

I think it would be better to get a warning that the join is performed on incompatible column dtypes.
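As a rough sketch of the kind of check meant here, the key dtypes can be compared by hand before calling pd.merge(). The helper name mismatched_key_dtypes is hypothetical and is not part of pandas; it only compares the dtypes of the key columns and leaves the merge itself to the caller.

```python
import warnings

import pandas as pd


def mismatched_key_dtypes(left, right, on):
    """Hypothetical helper (not a pandas API): list the key columns whose
    dtypes differ between the two frames as (key, left dtype, right dtype)."""
    keys = [on] if isinstance(on, str) else list(on)
    return [
        (key, left[key].dtype, right[key].dtype)
        for key in keys
        if left[key].dtype != right[key].dtype
    ]


df1 = pd.DataFrame({"A": [0]})
df2 = pd.DataFrame({"A": ["0"]})

# Warn about every key column that would be joined across different dtypes.
mismatches = mismatched_key_dtypes(df1, df2, on=["A"])
for key, left_dtype, right_dtype in mismatches:
    warnings.warn(
        "merge key %r has dtype %s on the left but %s on the right"
        % (key, left_dtype, right_dtype)
    )
```

A check like this only looks at the dtypes, so it does not need to inspect the values to decide whether the strings look like numbers.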

This is my pandas version:

>>> pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 2.7.9.final.0
python-bits: 64
OS: Darwin
OS-release: 13.4.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.15.2
nose: 1.3.4
Cython: 0.21
numpy: 1.9.2
scipy: 0.15.1
statsmodels: 0.6.1
IPython: 2.2.0
sphinx: 1.2.3
patsy: 0.3.0
dateutil: 2.4.1
pytz: 2014.9
bottleneck: None
tables: 3.1.1
numexpr: 2.3.1
matplotlib: 1.4.3
openpyxl: 1.8.5
xlrd: 0.9.3
xlwt: 0.7.5
xlsxwriter: 0.5.7
lxml: 3.4.0
bs4: 4.3.2
html5lib: None
httplib2: None
apiclient: None
rpy2: 2.5.6
sqlalchemy: 0.9.7
pymysql: None
psycopg2: None

Thanks for all your work on pandas!

About this issue

  • State: closed
  • Created 9 years ago
  • Comments: 15 (13 by maintainers)

Most upvoted comments

You are doing an inner merge, and the keys don’t match, so the result is empty. I’m not sure we could reliably detect this, since it would take a computation to figure out that you have strings that look like numbers.

In [8]: pd.merge(df1, df2, on=["A"], how='outer')
Out[8]: 
   A
0  0
1  0

In [9]: pd.merge(df1, df2, on=["A"], how='outer').dtypes
Out[9]: 
A    float64
dtype: object

You can also call df.convert_objects(convert_numeric=True) to force the object columns to become numbers.
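A minimal sketch of the same idea, converting the string key to numbers before the merge so that both sides share a dtype. It uses pd.to_numeric rather than convert_objects, on the assumption that a later pandas release is available: convert_objects was deprecated and eventually removed, and pd.to_numeric does not exist in 0.15.2. The name df2_numeric is made up for the example.

```python
import pandas as pd

df1 = pd.DataFrame({"A": [0]})
df2 = pd.DataFrame({"A": ["0"]})

# Parse the string key as numbers and cast to int64 so that both key
# columns share a dtype. pd.to_numeric raises on values it cannot parse.
df2_numeric = df2.assign(A=pd.to_numeric(df2["A"]).astype("int64"))

# With matching dtypes the inner merge finds the row.
merged = pd.merge(df1, df2_numeric, on=["A"])
print(merged)
```

Casting in the other direction, for example df1["A"].astype(str), should work as well if the keys are meant to be treated as text.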