wraith: `spider_skips` property broken
I’m using the default spider.yaml from https://raw.githubusercontent.com/BBC-News/wraith/master/templates/configs/spider.yaml
I’ve tried running it with wraith installed locally on my mac, and also via the wraith docker image. Both fail, with different error messages.
On my mac locally:
$ wraith capture spider.yaml
Config validated. No serious issues found.
no paths defined in config, crawling from site root
creating new spider file
/Users/paullew/.rvm/gems/ruby-2.2.1/gems/anemone-0.7.2/lib/anemone/core.rb:298:in `=~': type mismatch: String given (TypeError)
from /Users/paullew/.rvm/gems/ruby-2.2.1/gems/anemone-0.7.2/lib/anemone/core.rb:298:in `block in skip_link?'
from /Users/paullew/.rvm/gems/ruby-2.2.1/gems/anemone-0.7.2/lib/anemone/core.rb:298:in `any?'
from /Users/paullew/.rvm/gems/ruby-2.2.1/gems/anemone-0.7.2/lib/anemone/core.rb:298:in `skip_link?'
from /Users/paullew/.rvm/gems/ruby-2.2.1/gems/anemone-0.7.2/lib/anemone/core.rb:256:in `visit_link?'
from /Users/paullew/.rvm/gems/ruby-2.2.1/gems/anemone-0.7.2/lib/anemone/core.rb:151:in `block in run'
from /Users/paullew/.rvm/gems/ruby-2.2.1/gems/anemone-0.7.2/lib/anemone/core.rb:151:in `delete_if'
from /Users/paullew/.rvm/gems/ruby-2.2.1/gems/anemone-0.7.2/lib/anemone/core.rb:151:in `run'
from /Users/paullew/.rvm/gems/ruby-2.2.1/gems/anemone-0.7.2/lib/anemone/core.rb:92:in `block in crawl'
from /Users/paullew/.rvm/gems/ruby-2.2.1/gems/anemone-0.7.2/lib/anemone/core.rb:83:in `initialize'
from /Users/paullew/.rvm/gems/ruby-2.2.1/gems/anemone-0.7.2/lib/anemone/core.rb:90:in `new'
from /Users/paullew/.rvm/gems/ruby-2.2.1/gems/anemone-0.7.2/lib/anemone/core.rb:90:in `crawl'
from /Users/paullew/.rvm/gems/ruby-2.2.1/gems/anemone-0.7.2/lib/anemone/core.rb:18:in `crawl'
from /Users/paullew/.rvm/gems/ruby-2.2.1/gems/wraith-3.1.0/lib/wraith/spider.rb:69:in `spider'
from /Users/paullew/.rvm/gems/ruby-2.2.1/gems/wraith-3.1.0/lib/wraith/spider.rb:35:in `determine_paths'
from /Users/paullew/.rvm/gems/ruby-2.2.1/gems/wraith-3.1.0/lib/wraith/spider.rb:23:in `check_for_paths'
from /Users/paullew/.rvm/gems/ruby-2.2.1/gems/wraith-3.1.0/lib/wraith/cli.rb:36:in `check_for_paths'
from /Users/paullew/.rvm/gems/ruby-2.2.1/gems/wraith-3.1.0/lib/wraith/cli.rb:133:in `block in capture'
from /Users/paullew/.rvm/gems/ruby-2.2.1/gems/wraith-3.1.0/lib/wraith/cli.rb:28:in `within_acceptable_limits'
from /Users/paullew/.rvm/gems/ruby-2.2.1/gems/wraith-3.1.0/lib/wraith/cli.rb:130:in `capture'
from /Users/paullew/.rvm/gems/ruby-2.2.1/gems/thor-0.19.1/lib/thor/command.rb:27:in `run'
from /Users/paullew/.rvm/gems/ruby-2.2.1/gems/thor-0.19.1/lib/thor/invocation.rb:126:in `invoke_command'
from /Users/paullew/.rvm/gems/ruby-2.2.1/gems/thor-0.19.1/lib/thor.rb:359:in `dispatch'
from /Users/paullew/.rvm/gems/ruby-2.2.1/gems/thor-0.19.1/lib/thor/base.rb:440:in `start'
from /Users/paullew/.rvm/gems/ruby-2.2.1/gems/wraith-3.1.0/bin/wraith:5:in `<top (required)>'
from /Users/paullew/.rvm/gems/ruby-2.2.1/bin/wraith:23:in `load'
from /Users/paullew/.rvm/gems/ruby-2.2.1/bin/wraith:23:in `<main>'
from /Users/paullew/.rvm/gems/ruby-2.2.1/bin/ruby_executable_hooks:15:in `eval'
from /Users/paullew/.rvm/gems/ruby-2.2.1/bin/ruby_executable_hooks:15:in `<main>'
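The TypeError in the trace above looks like what Ruby raises when `String#=~` is handed a String instead of a Regexp. The following is a minimal sketch of that situation, not a copy of the anemone source: the `skip_link?` method here is a stand-in inferred from the trace, and both skip lists are hypothetical values for illustration.

```ruby
# Stand-in for the check at anemone core.rb:298 (an assumption based on the
# trace, not the gem's actual code): match a path against each pattern.
def skip_link?(path, patterns)
  patterns.any? { |pattern| path =~ pattern }
end

# Hypothetical skip lists: one holding a plain String, one holding a Regexp.
string_skips = ["/foo/bar.html"]
regexp_skips = [%r{\A/foo/bar\.html\z}]

begin
  skip_link?("/foo/bar.html", string_skips)
rescue TypeError => e
  # String#=~ raises TypeError when its argument is a String.
  puts "TypeError: #{e.message}"
end

# With a Regexp entry the same check runs without raising.
puts skip_link?("/foo/bar.html", regexp_skips)
```

If this is indeed the cause, any plain-string entry under `spider_skips` would be enough to trigger the failure, regardless of the site being crawled.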
Running it via the wraith docker image:
$ docker run --rm -P -v ~/devel/resources/testing/wraith:/wraithy -w='/wraithy' bbcnews/wraith capture spider.yaml
/usr/local/lib/ruby/gems/2.1.0/gems/wraith-3.1.2/lib/wraith/spider.rb:64:in `spider': undefined local variable or method `wraith' for #<Wraith::Crawler:0x005648d0ca4be8> (NameError)
from /usr/local/lib/ruby/gems/2.1.0/gems/wraith-3.1.2/lib/wraith/spider.rb:36:in `determine_paths'
from /usr/local/lib/ruby/gems/2.1.0/gems/wraith-3.1.2/lib/wraith/spider.rb:24:in `check_for_paths'
from /usr/local/lib/ruby/gems/2.1.0/gems/wraith-3.1.2/lib/wraith/cli.rb:36:in `check_for_paths'
from /usr/local/lib/ruby/gems/2.1.0/gems/wraith-3.1.2/lib/wraith/cli.rb:134:in `block in capture'
from /usr/local/lib/ruby/gems/2.1.0/gems/wraith-3.1.2/lib/wraith/cli.rb:28:in `within_acceptable_limits'
from /usr/local/lib/ruby/gems/2.1.0/gems/wraith-3.1.2/lib/wraith/cli.rb:131:in `capture'
from /usr/local/lib/ruby/gems/2.1.0/gems/thor-0.19.1/lib/thor/command.rb:27:in `run'
from /usr/local/lib/ruby/gems/2.1.0/gems/thor-0.19.1/lib/thor/invocation.rb:126:in `invoke_command'
from /usr/local/lib/ruby/gems/2.1.0/gems/thor-0.19.1/lib/thor.rb:359:in `dispatch'
from /usr/local/lib/ruby/gems/2.1.0/gems/thor-0.19.1/lib/thor/base.rb:440:in `start'
from /usr/local/lib/ruby/gems/2.1.0/gems/wraith-3.1.2/bin/wraith:5:in `<top (required)>'
from /usr/local/bin/wraith:23:in `load'
from /usr/local/bin/wraith:23:in `<main>'
Config validated. No serious issues found.
no paths defined in config, crawling from site root
About this issue
- State: open
- Created 8 years ago
- Reactions: 2
- Comments: 16 (1 by maintainers)
Same Issue!
I’m getting undefined local variable or method `wraith' for #<Wraith::Crawler:0x005648d0ca4be8> (NameError) with the default spider.yml as well, on version 3.1.2.
I’m also getting the same error as @imagreenplant. Is this solved?
Same here:
wraith capture configs/spider.yaml
Config validated. No serious issues found.
no paths defined in config, crawling from site root
/Library/Ruby/Gems/2.0.0/gems/wraith-3.1.2/lib/wraith/spider.rb:64:in `spider': undefined local variable or method `wraith' for #<Wraith::Crawler:0x007fd2f1938610> (NameError)
@sembrat
Yes, switching to regex does seem to correct the errors.
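As a rough illustration of why regex entries avoid the TypeError, here is a sketch that coerces every skip entry to a Regexp before matching. The helper `to_skip_patterns` is hypothetical and not part of wraith or anemone, and treating a plain string as an exact-path match is an assumption about the intended behaviour.

```ruby
# Hypothetical helper (not part of wraith or anemone): coerce every entry
# to a Regexp so that `path =~ pattern` is always a valid call.
def to_skip_patterns(skips)
  Array(skips).map do |skip|
    # Keep existing Regexps; turn plain strings into exact-match patterns.
    skip.is_a?(Regexp) ? skip : /\A#{Regexp.escape(skip.to_s)}\z/
  end
end

# Mixed input: one plain string and one regex, both hypothetical.
patterns = to_skip_patterns(["/foo/bar.html", %r{\A/baz/}])

puts patterns.any? { |pattern| "/foo/bar.html" =~ pattern }
puts patterns.any? { |pattern| "/baz/page" =~ pattern }
puts patterns.any? { |pattern| "/other" =~ pattern }
```

Writing the entries as regexes in the config achieves the same thing without any code change, which matches the workaround reported here.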
This error has been fixed in 3.2.1:
'spider': undefined local variable or method 'wraith' for # (NameError)
However, I can see that the original error in this issue is:
core.rb:298:in `=~': type mismatch: String given (TypeError)
This issue has been closed in error. Re-opening.
I have the same error message as @slimatic, full dump:
Any thoughts on what this issue could be?
/usr/local/lib/ruby/gems/2.1.0/gems/wraith-3.2.0/lib/wraith/spider.rb:64:in `spider': undefined local variable or method `wraith' for #<Wraith::Crawler:0x0055cde740ed58> (NameError)
I get the same thing. Same error as @imagreenplant