gspread: Duplicate header check in 5.2.0 is not backward compatible

Describe the bug

A spreadsheet with multiple columns that had a blank header used to load with get_all_records before 5.2.0, but it now fails with a “headers must be uniques” exception. I presume, but did not confirm, that this is due to the following simplification: https://github.com/burnash/gspread/commit/c8a5a7350c40498cf38d3c4a27c748100632804a

To Reproduce

Steps to reproduce the behavior:

  1. Run get_all_records on a spreadsheet with multiple columns with a blank header.
  2. See error “headers must be uniques”.
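Why several blank headers trip the check can be illustrated without a spreadsheet. The sketch below assumes the check simply compares the number of header cells with the number of distinct values; has_duplicate_headers is a hypothetical helper, not gspread code.

```python
# Minimal sketch, assuming the uniqueness check compares the header count
# with the number of distinct header values. Not gspread's actual code.

def has_duplicate_headers(keys):
    """Return True when at least two header cells hold the same value."""
    return len(set(keys)) != len(keys)


# Two blank headers are both the empty string, so they count as duplicates.
print(has_duplicate_headers(["repo", "", "owner", ""]))

# A single blank header is still unique.
print(has_duplicate_headers(["repo", "owner", ""]))
```

Under that assumption, a sheet with one blank header column would keep working, while a second blank column is enough to raise the error.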

Expected behavior

This should work as it used to, without an error.

Environment info:

  • Operating System: macOS
  • Python version: 3.8
  • gspread version: 5.2.0

Stack trace or other output that would be helpful

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/edx/other/edx-repo-health/repo_health/check_ownership.py", line 79, in check_ownership
    records = find_worksheet(google_creds_file, spreadsheet_url, worksheet_id)
  File "/edx/other/edx-repo-health/repo_health/check_ownership.py", line 44, in find_worksheet
    return worksheet.get_all_records()
  File "/edx/venvs/edx-repo-health/lib/python3.8/site-packages/gspread/worksheet.py", line 408, in get_all_records
    raise GSpreadException("headers must be uniques")
gspread.exceptions.GSpreadException: headers must be uniques

About this issue

  • State: closed
  • Created 2 years ago
  • Reactions: 4
  • Comments: 17

Most upvoted comments

I solved this problem. The cause was the gspread version: just install 5.1.1 instead of 5.2.0.

This was still happening to me. My version is 5.4.0.

Workaround:

sheet_ref = gspread_client.open_by_key(sheet_key).get_worksheet_by_id(worksheet_gid)
expected_headers = sheet_ref.row_values(1)
all_records = sheet_ref.get_all_records(expected_headers=expected_headers)

Hi @robrap

thank you for raising this issue. I confirm that it breaks if all your headers are empty strings "".

I am wondering why you would use the method get_all_records to retrieve all the values of a sheet if your headers are empty 🤔

Instead you could use the following methods:

  • get_values() with no range specified: it returns all the values of the current sheet
  • get_all_values(): a legacy method that calls get_values
  • get_all_cells(): returns a single list of Cell objects, one for every cell in that sheet.
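If record-style dictionaries are still wanted, they can be built by hand from the rows these methods return. The sketch below assumes the values arrive as a list of rows with the header in the first row; records_from_values and the "column_" placeholder prefix are hypothetical names chosen for illustration.

```python
# Minimal sketch: turn get_all_values()-style output (a list of rows, header
# first) into one dict per data row. Blank header cells receive a positional
# placeholder so that no column is dropped. The helper name and the
# "column_" prefix are assumptions, not part of gspread.

def records_from_values(values, blank_prefix="column_"):
    """Build one dict per data row, naming blank headers by position."""
    if not values:
        return []
    header_row, *data_rows = values
    keys = [
        name if name else f"{blank_prefix}{index}"
        for index, name in enumerate(header_row, start=1)
    ]
    return [dict(zip(keys, row)) for row in data_rows]


rows = [
    ["repo", "", "owner", ""],
    ["gspread", "x", "burnash", "y"],
]
print(records_from_values(rows))
```

This only handles blank headers: two columns sharing the same non-blank name would still overwrite each other in the resulting dict.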

I am still thinking about a way to prevent this breaking change and keep the new feature.

Hi, version 5.3.2 provides an extra parameter that allows you to pass a list of the headers you expect from the spreadsheet. This lets you use the method get_all_records with only the subset of your headers that is unique.
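One way to pick such a subset is to keep only the headers that are non-blank and occur exactly once in the header row. This is a sketch under that assumption; unique_nonblank_headers is a hypothetical helper, and the commented usage relies on the expected_headers parameter described above.

```python
from collections import Counter


def unique_nonblank_headers(header_row):
    """Return the headers that are non-empty and appear exactly once."""
    counts = Counter(header_row)
    return [name for name in header_row if name and counts[name] == 1]


print(unique_nonblank_headers(["repo", "", "owner", "", "team", "team"]))

# Hypothetical usage against a real worksheet (needs credentials, so it is
# left as a comment):
#   headers = worksheet.row_values(1)
#   records = worksheet.get_all_records(
#       expected_headers=unique_nonblank_headers(headers)
#   )
```

Columns whose headers are left out of the list are not validated, so their values should not be relied on in the returned records.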

It’s done ✔️

This proposal for a fix has been released in https://github.com/burnash/gspread/releases/tag/v5.3.2

I found a potential way to get the best of both worlds:

  • add an extra argument to provide the list of expected keys that matter
  • make sure this list is unique
  • make sure this list is part of the pulled headers
  • make sure the list does not contain extra headers (they would not prevent gspread from working, but it is safer)

See linked PR.
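The checks proposed above can be sketched in plain Python. This is an illustration of the idea, not the code from the linked PR: validate_expected_headers is a hypothetical name, and ValueError stands in for gspread's own exception type.

```python
# Minimal sketch of the proposed validation. Assumptions: the function name
# is hypothetical and ValueError replaces GSpreadException for portability.

def validate_expected_headers(headers, expected_headers=None):
    """Check the expected headers against the header row pulled from a sheet.

    With no expected_headers, every pulled header is expected, which
    reproduces the strict "all headers unique" behaviour.
    """
    if expected_headers is None:
        expected_headers = headers

    # The expected list itself must not contain duplicates.
    if len(set(expected_headers)) != len(expected_headers):
        raise ValueError("expected headers must be unique")

    # Every expected header must exist in the sheet (no extra headers).
    extra = set(expected_headers) - set(headers)
    if extra:
        raise ValueError(f"expected headers not found in sheet: {sorted(extra)}")

    return list(expected_headers)


sheet_headers = ["repo", "", "owner", ""]
print(validate_expected_headers(sheet_headers, ["repo", "owner"]))
```

Passing only the headers that matter lets the blank columns through, while calling the function without expected_headers on the same row still raises, as in 5.2.0.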