watchdog: Modified event triggered twice

If I copy and paste a file on OS X Lion, watchdog triggers the created event once and the modified event twice. Shouldn’t it trigger the modified event just once?

INFO:root:Created file: /Users/ouyangjichao/BoxSync/rename/testwatchdog.py
INFO:root:Modified file: /Users/ouyangjichao/BoxSync/rename/testwatchdog.py
INFO:root:Modified file: /Users/ouyangjichao/BoxSync/rename/testwatchdog.py

About this issue

  • State: closed
  • Created 12 years ago
  • Reactions: 4
  • Comments: 21 (1 by maintainers)

Most upvoted comments

So I did some digging and found watchgod, which handles this better, albeit in a different way. I opened 4 files in VS Code, made a change to each, and did a “save all” while running the following example:

from watchgod import watch

for changes in watch('.'):
    print(changes)

resulting output:

python .\test.py 
{(<Change.modified: 2>, '.\\test\\test_file-d.txt'), (<Change.modified: 2>, '.\\test\\test_file-c.txt'), (<Change.modified: 2>, '.\\test\\test_file-a.txt'), (<Change.modified: 2>, '.\\test\\test_file-b.txt')}

It’s definitely different, but it’s not duplicated, so from my POV that’s something I can work with. The last thing I want to do is add a FIFO queue and dequeue system to deduplicate these events before running a function on the file.

Example 2: $ touch test_file-{a..z}.txt

{(<Change.modified: 2>, '.\\test\\test_file-d.txt'), (<Change.modified: 2>, '.\\test\\test_file-m.txt'), (<Change.modified: 2>, '.\\test\\test_file-f.txt'), (<Change.modified: 2>, '.\\test\\test_file-h.txt'), (<Change.modified: 2>, '.\\test\\test_file-v.txt'), (<Change.modified: 2>, '.\\test\\test_file-w.txt'), (<Change.modified: 2>, '.\\test\\test_file-g.txt'), (<Change.modified: 2>, '.\\test\\test_file-n.txt'), (<Change.modified: 2>, '.\\test\\test_file-q.txt'), (<Change.modified: 2>, '.\\test\\test_file-z.txt'), (<Change.modified: 2>, '.\\test\\test_file-x.txt'), (<Change.modified: 2>, '.\\test\\test_file-i.txt'), (<Change.modified: 2>, '.\\test\\test_file-l.txt'), (<Change.modified: 2>, '.\\test\\test_file-c.txt'), (<Change.modified: 2>, '.\\test\\test_file-p.txt'), (<Change.modified: 2>, '.\\test\\test_file-a.txt'), (<Change.modified: 2>, '.\\test\\test_file-o.txt'), (<Change.modified: 2>, '.\\test\\test_file-r.txt'), (<Change.modified: 2>, '.\\test\\test_file-s.txt'), (<Change.modified: 2>, '.\\test\\test_file-y.txt'), (<Change.modified: 2>, '.\\test\\test_file-j.txt'), (<Change.modified: 2>, '.\\test\\test_file-t.txt'), (<Change.modified: 2>, '.\\test\\test_file-u.txt'), (<Change.modified: 2>, '.\\test\\test_file-e.txt'), (<Change.modified: 2>, '.\\test\\test_file-k.txt'), (<Change.modified: 2>, '.\\test\\test_file-b.txt')}

All 26 files changed were shown in a single response.

I tried the same with 1…1000 and all the files arrived in the same response. I added print(len(changes)) to the for loop to confirm that too.

All in all it seems to work and it may help the people in this thread.

Disclaimer: It uses file system polling, so it may not be efficient, but the async support offers some perks against that backdrop. Performance comparison isn’t easy with an async function, so I’ve not tried to wrap it in timeit. That said, it may be useful to get back a snapshot dictionary of the changes within a polling time delta rather than many itemised changes, depending on the use case.
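To illustrate why a polling approach reports each file at most once per interval, here is a minimal sketch of snapshot diffing. It is not watchgod’s actual implementation; the names take_snapshot and diff_snapshots are hypothetical and chosen only for this example.

```python
import os


def take_snapshot(root):
    """Map every file path under root to its last-modified time (ns)."""
    snapshot = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            snapshot[path] = os.stat(path).st_mtime_ns
    return snapshot


def diff_snapshots(old, new):
    """Return a set of (change, path) pairs; each path appears at most once."""
    changes = set()
    for path, mtime in new.items():
        if path not in old:
            changes.add(("added", path))
        elif old[path] != mtime:
            # However many writes happened between the two polls,
            # only the final mtime is compared, so one entry results.
            changes.add(("modified", path))
    for path in old:
        if path not in new:
            changes.add(("deleted", path))
    return changes
```

Calling take_snapshot on every polling tick and diffing against the previous result yields one set of changes per interval. The trade-off is the one mentioned above: every tick walks the whole tree.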

Note: Just tried it on Arch with Linux 5.3 and it’s orders of magnitude faster than Windows. For comparison, I switched to native PowerShell with fsutil to create files and couldn’t even create files quickly enough. On Linux, with bash brace expansion on touch, I could create, and get a dictionary of, 10000 files in less than a second.

Can’t confirm on macOS. I can’t think of a way to create many dummy files simultaneously on Windows: a for loop with fsutil seems too slow, and using touch in bash under WSL suffers the IO performance penalty of the NTFS translation layer. If anyone has a good suggestion for creating 1000 files on Windows simultaneously for testing, I’m all ears.

I also have this issue, and I think I found the problem.

In src/watchdog/observers/inotify.py line 162 we have:

elif event.is_modify:
    cls = DirModifiedEvent if event.is_directory else FileModifiedEvent
    self.queue_event(cls(src_path))

This uses the is_modify property, which comes from src/watchdog/observers/inotify_c.py:

@property
def is_modify(self):
    return self._mask & InotifyConstants.IN_MODIFY > 0

Notice that IN_MODIFY is used as the definition of file modification. But the Inotify documentation (http://inotify.aiken.cz/?section=inotify&page=faq&lang=en) says:

What is the difference between IN_MODIFY and IN_CLOSE_WRITE? The IN_MODIFY event is emitted on a file content change (e.g. via the write() syscall) while IN_CLOSE_WRITE occurs on closing the changed file. It means each change operation causes one IN_MODIFY event (it may occur many times during manipulations with an open file) whereas IN_CLOSE_WRITE is emitted only once (on closing the file).

Is it better to use IN_MODIFY or IN_CLOSE_WRITE? It varies from case to case. Usually it is more suitable to use IN_CLOSE_WRITE because if emitted the all changes on the appropriate file are safely written inside the file. The IN_MODIFY event needn’t mean that a file change is finished (data may remain in memory buffers in the application). On the other hand, many logs and similar files must be monitored using IN_MODIFY - in such cases where these files are permanently open and thus no IN_CLOSE_WRITE can be emitted.

So the solution is, I think, to expose both the IN_MODIFY and the IN_CLOSE_WRITE event in the Python interface. This way both use cases can be supported.
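As a rough illustration of the distinction, the sketch below classifies raw inotify masks into two separate event kinds. The constant values are the ones defined in the Linux inotify header; the classify function and the event names are hypothetical and are not part of watchdog’s API.

```python
# Values as defined in <sys/inotify.h> on Linux.
IN_MODIFY = 0x00000002       # file content was written
IN_CLOSE_WRITE = 0x00000008  # a file opened for writing was closed


def classify(mask):
    """Return hypothetical event names for a raw inotify mask."""
    names = []
    if mask & IN_MODIFY:
        names.append("modified")
    if mask & IN_CLOSE_WRITE:
        names.append("closed_after_write")
    return names


# Assumed sequence for a copy that issues two write() calls
# and then closes the file.
raw_masks = [IN_MODIFY, IN_MODIFY, IN_CLOSE_WRITE]
events = [name for mask in raw_masks for name in classify(mask)]
```

With both kinds exposed, a handler that only cares about finished writes could react to the single close event, while a log follower could keep reacting to every modification.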

@RRSR A workaround for this, which I am using, is based on a suggestion by @travcunn:

https://pastebin.com/gFQBQM0S

The two events are fired right after each other, a few milliseconds apart at most (at least in my case). Hope this helps.
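Since the duplicates arrive within milliseconds of each other, one possible approach is a per-path debounce. The sketch below is not the content of the pastebin link; the Debouncer class and its window parameter are hypothetical names used for illustration.

```python
import time


class Debouncer:
    """Suppress repeated events for the same path within a short window."""

    def __init__(self, window=0.5, clock=time.monotonic):
        self.window = window    # seconds during which repeats are ignored
        self.clock = clock      # injectable so the logic can be tested
        self._last_handled = {}

    def should_handle(self, path):
        now = self.clock()
        last = self._last_handled.get(path)
        if last is not None and now - last < self.window:
            return False        # too close to the previous handled event
        self._last_handled[path] = now
        return True
```

An event handler would call should_handle(event.src_path) at the top of its on_modified method and return early when it gives False. Tracking the time per path, rather than globally, keeps a burst of changes to many different files from being dropped. The cost is that two genuine modifications of the same file inside the window collapse into one.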

@anvesh1212 On Windows 10 it might be a slightly different issue. The library will provide a different implementation (WindowsApiObserver) when it detects that the platform is Windows. I was able to reproduce the issue 100% of the time on Windows 10 with the default import:

from watchdog.observers import Observer

However, I can no longer reproduce it when selecting the generic implementation:

from watchdog.observers import PollingObserver as Observer

This could be used as a workaround, although this class may not be guaranteed to be suitable in the future.

This happens because you’re falling back to the slow polling implementation rather than relying on the Windows file-change API. See more here:
https://github.com/gorakhargosh/watchdog#supported-platforms
https://github.com/gorakhargosh/watchdog/blob/master/src/watchdog/observers/polling.py#L126

As an aside, I had to use from watchdog.observers.polling import PollingObserver as Observer instead for v0.9.0.

I’m not convinced this is suited to anything beyond tens of files; it’ll scale awfully as the file structure grows more complex.

It’s a complicated issue to fix, but it makes things difficult for my use case: something akin to an npm watch command that monitors file changes to compile jinja2 templates.

Interestingly, VS Code has some peculiar but expected behaviour when saving: if the file is empty it’s a single modification event, and if it’s saving with content it’s two, presumably with one of them dumping the content afterwards. There’s not much else to debug here, other than noting that some events will be duplicated and that sometimes it could be by design.

Edit: Found this and it works for the most part, but it limits some things just by virtue of having a second timeout to prevent duplicates: https://stackoverflow.com/questions/18599339/python-watchdog-monitoring-file-for-changes

As an example, running touch test_file-{0..100000}.txt in WSL Bash produced the following output from that Stack Overflow script:

Event type: modified  path : .\test\test_file-0.txt
False
Event type: modified  path : .\test\test_file-3334.txt
False
Event type: modified  path : .\test\test_file-7209.txt
False
Event type: modified  path : .\test\test_file-10331.txt
False
Event type: modified  path : .\test\test_file-13916.txt
False
Event type: modified  path : .\test\test_file-17153.txt
False
Event type: modified  path : .\test\test_file-20907.txt
False
Event type: modified  path : .\test\test_file-24328.txt
False
Event type: modified  path : .\test\test_file-27599.txt
False
Event type: modified  path : .\test\test_file-30788.txt
False
Event type: modified  path : .\test\test_file-33740.txt
False
Event type: modified  path : .\test\test_file-37422.txt
False
Event type: modified  path : .\test\test_file-40442.txt
False
Event type: modified  path : .\test\test_file-44172.txt
False
Event type: modified  path : .\test\test_file-47999.txt
False
Event type: modified  path : .\test\test_file-51371.txt
False
Event type: modified  path : .\test\test_file-55067.txt
False
Event type: modified  path : .\test\test_file-58909.txt
False
Event type: modified  path : .\test\test_file-61758.txt
False
Event type: modified  path : .\test\test_file-65516.txt
False
Event type: modified  path : .\test\test_file-69054.txt
False
Event type: modified  path : .\test\test_file-72506.txt
False
Event type: modified  path : .\test\test_file-76257.txt
False
Event type: modified  path : .\test\test_file-79455.txt
False
Event type: modified  path : .\test\test_file-82837.txt
False
Event type: modified  path : .\test\test_file-86725.txt
False
Event type: modified  path : .\test\test_file-90524.txt
False
Event type: modified  path : .\test\test_file-94054.txt
False
Event type: modified  path : .\test\test_file-97403.txt
False

As such, it misses most of the events. There may be some scope to fix that, because if a lot of files change at once it will miss changes rather than just dropping duplicates.

For comparison, this is the standard observer behaviour when I click “save all” on 4 modified files:

event type: modified  path : .\test\test_file-b.txt
event type: modified  path : .\test\test_file-a.txt
event type: modified  path : .\test\test_file-c.txt
event type: modified  path : .\test\test_file-d.txt
event type: modified  path : .\test\test_file-b.txt
event type: modified  path : .\test\test_file-d.txt
event type: modified  path : .\test\test_file-c.txt
event type: modified  path : .\test\test_file-a.txt

@anvesh1212 I don’t think it will ever be fixed. This issue has been open since 2012. I’m going with the workaround that @naveenjafer posted on 25 Dec 2017.

The same issue still exists, i.e. on creation of a new file the ‘modified’ event is fired twice.

Perhaps it would be helpful to stat the file and check when it was last modified to avoid triggering twice, especially if the duplication is unavoidable in the OS-specific implementation.
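A minimal sketch of that idea, assuming the handler can afford one os.stat call per event. The MtimeFilter class is a hypothetical helper, not something watchdog provides.

```python
import os


class MtimeFilter:
    """Report a path as changed only when its mtime differs from last time."""

    def __init__(self):
        self._seen = {}  # path -> last observed st_mtime_ns

    def is_new_change(self, path):
        mtime = os.stat(path).st_mtime_ns
        if self._seen.get(path) == mtime:
            return False  # same timestamp as before: treat as a duplicate
        self._seen[path] = mtime
        return True
```

This only filters duplicates that arrive after the last write has landed. If the two events correspond to two separate writes with different timestamps, both will still pass, and the file may be deleted between the event and the stat call, so real code would need to handle OSError.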

I’m having the same issue when using vim to edit files (with “set nobackup, nowritebackup”) under Ubuntu 14.04. Although this is very late, watchdog.observers.polling.PollingObserver now works fine for me.