pandas: DOC: fix EX03 errors in docstrings
pandas has a script for validating docstrings
Currently, some methods fail the EX03 check.
The task here is:
- take 2-4 methods
- run: scripts/validate_docstrings.py --format=actions --errors=EX03 method-name
- check whether docstring validation passes for those methods and, if necessary, fix the docstrings according to whatever error is reported
- remove those methods from code_checks.sh
- commit, push, open pull request
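The validation step above can be scripted when checking several methods in a row. This is a minimal sketch, assuming it is run from the root of a pandas checkout where scripts/validate_docstrings.py exists. The helper names build_command and validate_methods are hypothetical; only the flags quoted in this issue are used.

```python
import subprocess
import sys


def build_command(method_name, error_code="EX03"):
    """Build the validation command quoted in this issue for one method."""
    return [
        sys.executable,
        "scripts/validate_docstrings.py",
        "--format=actions",
        f"--errors={error_code}",
        method_name,
    ]


def validate_methods(method_names):
    """Run the validation script once per method and collect return codes.

    Assumption: the current working directory is the pandas repository root.
    """
    results = {}
    for name in method_names:
        completed = subprocess.run(build_command(name), check=False)
        results[name] = completed.returncode
    return results
```

For example, validate_methods(["pandas.Series.dt.day_name", "pandas.Series.str.len"]) would run the script twice and return a dict of return codes keyed by method name.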
Please don’t comment take, as multiple people can work on this issue. You also don’t need to ask for permission to work on this; just comment which methods you are going to work on.
If you’re a new contributor, please check the contributing guide.
Thanks @MarcoGorelli for giving me the idea for this issue.
About this issue
- State: closed
- Created 6 months ago
- Reactions: 1
- Comments: 66 (54 by maintainers)
yeah that’s the fix - sorry if it wasn’t clear, it was more of an explanation for people that had trouble figuring out the lines affected.
Explanation of what to look for:
EX03 covers errors in the example code blocks of a function's or method’s documentation.
For pandas.errors.SpecificationError, the line numbers in the error refer to lines of the examples: line 4 here would be the 4th line in the examples, which is
>>> df.groupby('A').B.agg({'foo': 'count'}) # doctest: +SKIP
line 6 would be
>>> df.groupby('A').agg({'B': {'foo': ['sum', 'max']}}) # doctest: +SKIP
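To look up which example line a reported number points at, the example statements can be listed with the standard library's doctest parser. This is only a sketch: SAMPLE_DOCSTRING and numbered_example_lines are hypothetical names, and whether the validation script counts lines in exactly this way is an assumption, so treat the numbering as a guide rather than a guarantee.

```python
import doctest

# Hypothetical docstring used only for illustration.
SAMPLE_DOCSTRING = """
Examples
--------
>>> import pandas as pd
>>> df = pd.DataFrame({'A': [1, 1, 2], 'B': [1, 2, 3]})
>>> total = (df['B']
...          .sum())
>>> total
6
"""


def numbered_example_lines(docstring):
    """Return (line_number, source_line) pairs for the example code only.

    Expected outputs (such as the ``6`` above) are not part of the source
    and are therefore not numbered.
    """
    examples = doctest.DocTestParser().get_examples(docstring)
    source = "".join(example.source for example in examples)
    return list(enumerate(source.splitlines(), start=1))


for number, line in numbered_example_lines(SAMPLE_DOCSTRING):
    print(f"line {number}: {line}")
```

The parser only extracts the statements; it does not execute them, so pandas does not need to be importable for this lookup.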
@jordan-d-murphy, I agree, seems we fixed all flake8 errors. Thank you for working on this issue with intensity and helping other contributors. Now, we can close this issue.
@jordan-d-murphy great I’ll take those!
I’ll take:
Okay! Makes sense. Hope the photo might help someone else then 🙂
I’ve opened a PR for the remaining 4 functions. I believe this will close this issue.
pandas.Series.plot.line
pandas.Series.to_sql
pandas.read_json
pandas.DataFrame.to_sql
I tried all of that and still didn’t work. I’m going to stop working on it and find another issue to take on. Thanks for all the help!
hmmm okay, yes your approach seems correct, but when I ran this on the latest branch I’m seeing no EX03 errors for pandas.Series.plot.line.
I’ve been using the following approach to set up my dev env and working branch before working on my PRs, which ensures my branch is up to date with the latest version of main.
can you try running these commands, and then try running your script again and see if it helps?
Updating the development environment
git checkout main
git merge upstream/main
mamba activate pandas-dev
mamba env update -f environment.yml --prune
Creating a feature branch
git checkout main
git pull upstream main --ff-only
git checkout -b shiny-new-feature
(NOTE: shiny-new-feature should be your new working branch name)
After running the above commands, running the following script
scripts/validate_docstrings.py --format=actions --errors=EX03 pandas.Series.plot.line
results in output that ends with this:
script:
scripts/validate_docstrings.py --format=actions --errors=EX03 pandas.Series.plot.line
Output:
I cannot work on these since I cannot get numpydoc to work to test for validity. So they are up for grabs.
@natmokval should pandas.arrays.DatetimeArray be added to the list? I see a flake8 error on my PR tests:
Error: /home/runner/work/pandas/pandas/pandas/core/arrays/datetimes.py:179:EX03:pandas.arrays.DatetimeArray:flake8 error: line 2, col 4: E121 continuation line under-indented for hanging indent
Edit: Looks like this is fixed in #56855
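For anyone hitting the same E121 message, here is a sketch of the kind of change that usually resolves it. The snippet is hypothetical and is not the actual DatetimeArray example, which is not quoted in this thread. E121 is typically reported when a continuation line after an opening bracket is indented by fewer than four spaces; indenting it by four spaces is the common fix.

```python
# Flagged form (hypothetical), kept as a comment so this block stays clean:
#
#     values = dict(
#       start="2023-01-01", periods=3, freq="D"
#     )
#
# The continuation line above hangs by only 2 spaces, which flake8 would
# typically report as E121.

# Form with a 4-space hanging indent:
values = dict(
    start="2023-01-01", periods=3, freq="D"
)

print(values)
```

In a docstring the same rule applies to the code that follows the "... " continuation prompt.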
I’ll take these:
pandas.io.formats.style.Styler.highlight_quantile
pandas.io.formats.style.Styler.background_gradient
pandas.io.formats.style.Styler.text_gradient
Working on:
I’ll take these:
pandas.io.formats.style.Styler.set_tooltips
pandas.io.formats.style.Styler.set_uuid
pandas.io.formats.style.Styler.pipe
pandas.io.formats.style.Styler.highlight_between
Working on:
Working on:
I’ll take:
pandas.io.formats.style.Styler.format_index
pandas.io.formats.style.Styler.relabel_index
pandas.io.formats.style.Styler.hide
pandas.io.formats.style.Styler.set_td_classes
I’ll take these:
pandas.io.json.build_table_schema
pandas.read_stata
pandas.plotting.scatter_matrix
pandas.Index.droplevel
pandas.Grouper
I’ll take: pandas.Timestamp.ceil pandas.Timestamp.floor pandas.Timestamp.round
working on:
Hi @natmokval, the command line
scripts/validate_docstrings.py --format=actions --errors=EX03 method-name
outputs all kinds of errors, not just the EX03 errors.
https://github.com/pandas-dev/pandas/blob/c778746f2219601ac3c38f4f287f9a4e68905655/scripts/validate_docstrings.py#L444-L458
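Until that is sorted out, one workaround is to filter the script's output yourself. This is a sketch only, assuming each reported line carries the error code between colons, as in the flake8 error quoted elsewhere in this thread. The function name only_error_code is hypothetical, and the second sample line is made up for illustration.

```python
def only_error_code(output_lines, error_code="EX03"):
    """Keep only the lines that mention the given error code.

    Assumption: lines look like ``path:lineno:CODE:object:message``.
    """
    marker = f":{error_code}:"
    return [line for line in output_lines if marker in line]


sample_output = [
    # Line quoted in this thread:
    "/home/runner/work/pandas/pandas/pandas/core/arrays/datetimes.py:179:"
    "EX03:pandas.arrays.DatetimeArray:flake8 error: line 2, col 4: "
    "E121 continuation line under-indented for hanging indent",
    # Hypothetical non-EX03 line, for illustration only:
    "pandas/core/example.py:10:GL08:pandas.example:"
    "The object does not have a docstring",
]

for line in only_error_code(sample_output):
    print(line)
```

The output of the validation script could be captured with subprocess.run(..., capture_output=True, text=True) and passed in via stdout.splitlines().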
Working on:
PR opened for the following:
work on
I’ll take these:
pandas.core.resample.Resampler.interpolate
pandas.pivot
pandas.merge_asof
pandas.wide_to_long
pandas.Index.rename
pandas.Index.isin
pandas.IndexSlice
I am also new and facing a similar issue. Even after making the changes, the error logs don’t change.
Working on:
Working on:
I will work for:
pandas.Series.cat.set_categories
pandas.Series.plot.bar
pandas.Series.plot.hist
I can take the first two methods.
pandas.Series.dt.day_name
pandas.Series.str.len