imbalanced-learn: [BUG] Error with SMOTENC fit_resample: ValueError: could not broadcast input array from shape (137,12) into shape (272,12)

Describe the bug

Error with SMOTENC.fit_resample: ValueError: could not broadcast input array from shape (137,12) into shape (272,12)

Steps/Code to Reproduce

Using the two attached CSV datasets, X and y:

X.zip y.zip

I’m running:

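from imblearn.over_sampling import SMOTENC

# X and y are loaded from the attached CSV files.
smote = SMOTENC(
    categorical_features=[19],  # index of the categorical column in X
    sampling_strategy="auto",
    random_state=0,
    n_jobs=8,
)
X, y = smote.fit_resample(X, y)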

Expected Results

No error is thrown.

Actual Results

File "C:\Users\c42steguerri\PycharmProjects\StrategyLab\venv\lib\site-packages\imblearn\over_sampling\_smote\base.py", line 577, in _generate_samples
    ] = self._X_categorical_minority_encoded
ValueError: could not broadcast input array from shape (137,12) into shape (272,12) 
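
For anyone reading the traceback: the message itself comes from NumPy, not from SMOTENC-specific logic. It is raised when an array is assigned into a slice that has a different number of rows. Here is a minimal sketch that raises the same kind of error with plain NumPy. The shapes are taken from the message above; the array names and the column offset of 3 are made up for illustration and are not taken from the imblearn source.

```python
import numpy as np

# Destination with 272 rows; the last 12 columns stand in for the
# one-hot encoded categorical block (the offset 3 is an assumption).
target = np.zeros((272, 15))

# Source with only 137 rows and 12 columns.
source = np.ones((137, 12))

message = None
try:
    # The slice has shape (272, 12), the source has shape (137, 12):
    # the row counts differ, so NumPy cannot broadcast one into the other.
    target[:, 3:] = source
except ValueError as exc:
    message = str(exc)
    print(message)
```

So the failing line in _generate_samples appears to be writing a 137-row array into a 272-row destination. Why the two row counts differ for this dataset is not something this sketch explains.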

Versions

System:
    python: 3.7.7 (tags/v3.7.7:d7c567b08f, Mar 10 2020, 10:41:24) [MSC v.1900 64 bit (AMD64)]
executable: C:\Users\c42steguerri\PycharmProjects\StrategyLab\venv\Scripts\python.exe
   machine: Windows-10-10.0.16299-SP0

Python dependencies:
          pip: 19.0.3
   setuptools: 40.8.0
      sklearn: 0.24.1
        numpy: 1.18.4
        scipy: 1.4.1
       Cython: None
       pandas: 1.0.5
   matplotlib: None
       joblib: 0.14.1
threadpoolctl: 2.0.0

Built with OpenMP: True

About this issue

  • State: closed
  • Created 3 years ago
  • Reactions: 1
  • Comments: 18 (8 by maintainers)

Most upvoted comments

It could be another bug with the same error. Don’t hesitate to open a new issue with a minimal example that triggers the error.
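
As a starting point for such a minimal example, something along these lines might help. It is only a sketch: the data is synthetic, the helper names make_toy_data and run_smotenc are hypothetical, and the sizes and column layout are assumptions. It is not known to reproduce the error, so the data would need to be adjusted until it does.

```python
import numpy as np


def make_toy_data(n_major=80, n_minor=20, seed=0):
    """Build a small synthetic imbalanced dataset (all sizes are assumptions)."""
    rng = np.random.RandomState(seed)
    n = n_major + n_minor
    continuous = rng.normal(size=(n, 2))          # two continuous columns
    categorical = rng.randint(0, 3, size=(n, 1))  # one integer-coded categorical column
    X = np.hstack([continuous, categorical])
    y = np.array([0] * n_major + [1] * n_minor)   # majority class 0, minority class 1
    return X, y


def run_smotenc(X, y, categorical_features):
    """Resample with SMOTENC, similar to the call in the report.

    Requires imbalanced-learn; the import is done lazily so the data
    helper above can be used without it.
    """
    from imblearn.over_sampling import SMOTENC

    smote = SMOTENC(categorical_features=categorical_features, random_state=0)
    return smote.fit_resample(X, y)


X, y = make_toy_data()
print(X.shape, np.bincount(y))

# Uncomment to run the resampling step (column 2 is the categorical one here):
# X_res, y_res = run_smotenc(X, y, categorical_features=[2])
```

Posting the script together with the full traceback and the version information makes it much easier for others to run it and compare results.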

I’m having a similar issue with some code I’m testing. If I discover anything I’ll let you know.