arrow: Wrong result when a shift involves a DST change.
In Paris in 2013, we switched to DST on 2013-03-31 (at 2 AM it became 3 AM), going from UTC+1 to UTC+2 (DST).
Here’s an example of the issue I have with arrow:
>>> import arrow, datetime
>>> just_before = arrow.get(datetime.datetime(2013, 3, 31, 1, 50, 45), "Europe/Paris").ceil("hour")
>>> just_before
<Arrow [2013-03-31T01:59:59.999999+01:00]>
>>> just_after = just_before.replace(microseconds=+1)
>>> just_after
<Arrow [2013-03-31T02:00:00+02:00]>
>>> # That is not right... It should be either 2 AM UTC+1 or preferably 3AM UTC +2
>>> # One could argue that depending on what you ask for, it may be the correct answer, but then the following is :
>>> (just_after - just_before).total_seconds()
1e-06
>>> # That is right but not consistent with the dates
>>> (just_after.to("utc") - just_before.to("utc")).total_seconds()
-3599.999999
>>> # That should be the same value as the previous computation. Plus, the value is negative because:
>>> just_before.to('utc'), just_after.to('utc')
(<Arrow [2013-03-31T00:59:59.999999+00:00]>,
<Arrow [2013-03-31T00:00:00+00:00]>)
I think the problem is pretty clear…
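For comparison, here is a minimal sketch of the result I would expect, using only the standard library rather than arrow. It assumes Python 3.9+ (for `zoneinfo`) and an available tz database; doing the addition in UTC and converting back is just one way to get a consistent answer, not a description of how arrow works internally.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

PARIS = ZoneInfo("Europe/Paris")

# Last representable microsecond before the spring-forward transition.
just_before = datetime(2013, 3, 31, 1, 59, 59, 999999, tzinfo=PARIS)

# Add the microsecond in UTC, then convert back to local time.
just_after = (
    just_before.astimezone(timezone.utc) + timedelta(microseconds=1)
).astimezone(PARIS)

print(just_after.isoformat())  # 2013-03-31T03:00:00+02:00

# Compare instants in UTC: subtracting two datetimes that share the same
# tzinfo object would ignore the offsets and compare wall-clock fields only.
gap = just_after.astimezone(timezone.utc) - just_before.astimezone(timezone.utc)
print(gap.total_seconds())  # 1e-06
```

Here the local result is 3 AM UTC+2 and the elapsed time is one microsecond whichever way it is measured.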
About this issue
- State: closed
- Created 11 years ago
- Reactions: 1
- Comments: 36 (15 by maintainers)
@andrewelkins It seems that all the examples you point out as needing a fix come from here: https://github.com/sdispater/pendulum#why-not-arrow (I am the author of Pendulum, by the way).
I should point out that this approach is wrong: those are just a few non-exhaustive examples, and they should not be treated as specific cases to patch but as symptoms of the underlying problem, which here is understanding how DST transitions work.
This issue has been open for 3 years, and neither Chris nor you have been able to fix it even though it is critical for a datetime library, so the promise made by Arrow seems dubious at best. I started Pendulum because Arrow is so broken (you can just look at your issues piling up) that I wondered whether its maintainers had any knowledge of what a datetime library should be, or any desire to move the library forward.
Sorry if I sound too harsh but I have been bitten so many times by Arrow that I could no longer rely on it. To be honest, the first stable version of Pendulum is around the corner and does everything Arrow does and more, without the bugs. What I don’t understand is how a 6-month old library can be more complete and less buggy than a 3-year old one.
I don’t know what you plan to do with Arrow (it seems to me that you just expect people to fix the issues for you at this point) but you owe it to your users to warn them about the critical shortcomings of the library.
@systemcatch Yes, I’ll definitely revisit arrow (and in fact, I have not pulled it from my flow yet). I really like it, but I cannot give up the practicality of using `pandas` for this project.

If there is something you want me to help with on your fork, let me know. I don’t have a ton of free time, but I would like to help give back. 😃
I’d just like to be clear about this - doing a time shift should automatically resolve DST shifts, both `.dst()` `timedelta`-wise and time-wise. That’s the goal, right?

I get it that you’re upset at Arrow not working how you would expect, but there’s no need to be aggressive and condescending here. We’re in @crsmithdev's space. AFAIK neither of us has any idea of what might have happened in their life that kept them from fixing the bug (or, I should say, from spending countless hours and/or days maintaining a library they may not even use anymore themselves, working completely for nothing, in their free time, in addition to whatever else they might have to deal with), and it’s dishonest to assume it has anything to do with their competence.
Using someone else’s software won’t make them owe you anything. It’s a pure question of gratefulness, of trust, of sharing.
A library does not grow with time but with contributions. Contributions are more frequent when a project is useful and welcoming. Having people take over issues to say their own project is better and this project is shit does not really help make the space welcoming. Just saying.
I’ve been using Arrow a little bit (not that much, I admit), but I’ve always looked at that project with interest. When I discovered pendulum lately (and we had an exchange on twitter), I was equally interested. Sadly, I’m not sure I really want to go further now. I’m aware that you have no reason to care, but I sincerely hope that by the time pendulum becomes a must-have in many projects, which might happen, I can find a more welcoming atmosphere in your community 😕
I’m still a bit confused. Here is a script:
Can you tell me what is wrong here? And what you want?
The really annoying part is that around DST, when you shift by 1h from 1:30 AM, the time difference should be 60 minutes (what you asked for) and the “visual difference” should be 2 hours (not what you asked for), landing on 3:30 AM. But when you shift by 1 day, say from midnight, the time difference should be 23h while the visual difference should be exactly 1 day, so it’s the other way around (well, if you define 1 day as “adding 1 to the ‘day’ field of the calendar” and not as “24 hours”).
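To make the two kinds of shift concrete, here is a rough stdlib-only sketch (Python 3.9+ with `zoneinfo`). The helper names `shift_absolute`, `shift_calendar` and `elapsed` are hypothetical, made up for this illustration; they are not arrow functions.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

PARIS = ZoneInfo("Europe/Paris")


def shift_absolute(dt, delta):
    """Shift by elapsed time: do the arithmetic in UTC, then convert back."""
    return (dt.astimezone(timezone.utc) + delta).astimezone(dt.tzinfo)


def shift_calendar(dt, days):
    """Shift by calendar days: the wall-clock fields move, the offset is re-resolved."""
    # Adding a timedelta to an aware datetime is wall-clock arithmetic.
    return dt + timedelta(days=days)


def elapsed(a, b):
    """Real elapsed time between two aware datetimes, measured in UTC."""
    return b.astimezone(timezone.utc) - a.astimezone(timezone.utc)


# Shift by 1 hour from 1:30 AM on the spring-forward day.
start = datetime(2013, 3, 31, 1, 30, tzinfo=PARIS)
plus_hour = shift_absolute(start, timedelta(hours=1))
print(plus_hour.isoformat())      # 2013-03-31T03:30:00+02:00
print(elapsed(start, plus_hour))  # 1:00:00

# Shift by 1 calendar day from midnight on the same day.
midnight = datetime(2013, 3, 31, 0, 0, tzinfo=PARIS)
next_day = shift_calendar(midnight, 1)
print(next_day.isoformat())         # 2013-04-01T00:00:00+02:00
print(elapsed(midnight, next_day))  # 23:00:00
```

The hour shift moves the clock face by two hours but only 60 minutes elapse; the day shift moves the clock face by exactly one day but only 23 hours elapse.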
Among the different algorithms that have worked in some cases for me was:
Or differentiate shifts for date and shifts for time,
This last strategy can lead to potential issues: when you want to relocalize, there might be an ambiguity. For example, if you are in Paris at 2013-03-30 2:30:00+01:00 and you add 1 day, you land on an hour that does not exist. The same goes for DST end, where you can land on 2 different hours that have the same name.
Hoping these thoughts can help you 😄
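The relocalization problem above (landing on a missing or duplicated hour) can at least be detected. This is a sketch with the standard library (Python 3.9+ `zoneinfo` and the `fold` attribute), assuming that detecting the case is enough and that the caller decides how to resolve it. `is_nonexistent` and `is_ambiguous` are hypothetical helper names.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

PARIS = ZoneInfo("Europe/Paris")


def is_nonexistent(naive, tz):
    """True if this wall-clock time is skipped by a DST gap in `tz`."""
    local = naive.replace(tzinfo=tz)
    # A skipped time does not survive a round trip through UTC.
    round_trip = local.astimezone(timezone.utc).astimezone(tz)
    return round_trip.replace(tzinfo=None) != naive


def is_ambiguous(naive, tz):
    """True if this wall-clock time occurs twice in `tz` (DST end)."""
    first = naive.replace(tzinfo=tz, fold=0)
    second = naive.replace(tzinfo=tz, fold=1)
    # Both folds also disagree inside a gap, so exclude that case.
    return first.utcoffset() != second.utcoffset() and not is_nonexistent(naive, tz)


# 2013-03-30 02:30 in Paris plus one calendar day lands in the spring gap.
landed = datetime(2013, 3, 30, 2, 30, tzinfo=PARIS) + timedelta(days=1)
print(is_nonexistent(landed.replace(tzinfo=None), PARIS))  # True

# At DST end (2013-10-27 in Paris), 02:30 happens twice.
print(is_ambiguous(datetime(2013, 10, 27, 2, 30), PARIS))  # True
```

What to do once a gap or a fold is detected (shift forward, pick the first occurrence, raise an error) is a policy decision this sketch leaves open.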
OK, thanks @dlopuch and @systemcatch. I’d read somewhere that `pytz` had more issues with it than `arrow` when dealing with these corner cases, but I did not realize / research that `pandas` used `pytz` under the hood anyway.

So I’ll just adjust my code to work with `pandas` “purely”, and avoid `arrow` in this case. That’s a shame, since I really like how `arrow` works; it’s a great API.

I guess I didn’t realize how much the Python ecosystem has left to deal with in this datetime world. I’m not even asking about leap seconds or anything bizarre (IMO). I know it’s a tricky, pain-in-the-ass problem to solve, but it’s also a hugely important one.
This issue isn’t limited to `.shift()`, but also functions like `.range()`.

You probably knew that, but here’s a quick `.range()` test case across the spring DST boundary to check:

In the normalized-to-UTC column, note the two `2018-03-11T09:00:00+00:00` entries:

This is (should be) a barrier to anyone using Arrow in production code with non-UTC date math. I’ll be giving Arrow serious consideration only after this is fixed.
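As an illustration of why an hourly range across the boundary can contain a repeated UTC instant, here is a stdlib-only sketch (Python 3.9+). The zone `America/Los_Angeles` is an assumption, since the zone of the test case is not stated here, and `hourly_range` is a hypothetical helper, not arrow’s `.range()`. Which instant ends up duplicated depends on how a library resolves the nonexistent hour, so this is not expected to reproduce the exact timestamps quoted above.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

TZ = ZoneInfo("America/Los_Angeles")  # assumed zone, for illustration only


def hourly_range(start, count):
    """Return `count` instants one elapsed hour apart, expressed in start's zone."""
    start_utc = start.astimezone(timezone.utc)
    return [
        (start_utc + timedelta(hours=i)).astimezone(start.tzinfo)
        for i in range(count)
    ]


start = datetime(2018, 3, 11, 0, 0, tzinfo=TZ)

# Naive approach: step the wall clock, which passes through the skipped 02:00.
wall_clock = [start + timedelta(hours=i) for i in range(4)]
wall_clock_utc = [d.astimezone(timezone.utc) for d in wall_clock]

# Stepping in UTC instead gives four distinct instants.
utc_stepped = hourly_range(start, 4)
utc_stepped_utc = [d.astimezone(timezone.utc) for d in utc_stepped]

for local, utc in zip(wall_clock, wall_clock_utc):
    print(local.isoformat(), "->", utc.isoformat())
print([d.isoformat() for d in utc_stepped])
```

In the wall-clock version, the nonexistent 02:00 and the real 03:00 normalize to the same UTC instant, while the UTC-stepped version goes straight from 01:00-08:00 to 03:00-07:00 locally.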
I will say that ewjoachim’s first paragraph is the behaviour I hope Arrow will adopt.