wraith: `spider_skips` property broken

I’m using the default spider.yaml from https://raw.githubusercontent.com/BBC-News/wraith/master/templates/configs/spider.yaml

I’ve tried running it with wraith installed locally on my Mac and also via the wraith Docker image. Both fail, but with different error messages.

Locally on my Mac:

$ wraith capture spider.yaml
Config validated. No serious issues found.
no paths defined in config, crawling from site root
creating new spider file
/Users/paullew/.rvm/gems/ruby-2.2.1/gems/anemone-0.7.2/lib/anemone/core.rb:298:in `=~': type mismatch: String given (TypeError)
    from /Users/paullew/.rvm/gems/ruby-2.2.1/gems/anemone-0.7.2/lib/anemone/core.rb:298:in `block in skip_link?'
    from /Users/paullew/.rvm/gems/ruby-2.2.1/gems/anemone-0.7.2/lib/anemone/core.rb:298:in `any?'
    from /Users/paullew/.rvm/gems/ruby-2.2.1/gems/anemone-0.7.2/lib/anemone/core.rb:298:in `skip_link?'
    from /Users/paullew/.rvm/gems/ruby-2.2.1/gems/anemone-0.7.2/lib/anemone/core.rb:256:in `visit_link?'
    from /Users/paullew/.rvm/gems/ruby-2.2.1/gems/anemone-0.7.2/lib/anemone/core.rb:151:in `block in run'
    from /Users/paullew/.rvm/gems/ruby-2.2.1/gems/anemone-0.7.2/lib/anemone/core.rb:151:in `delete_if'
    from /Users/paullew/.rvm/gems/ruby-2.2.1/gems/anemone-0.7.2/lib/anemone/core.rb:151:in `run'
    from /Users/paullew/.rvm/gems/ruby-2.2.1/gems/anemone-0.7.2/lib/anemone/core.rb:92:in `block in crawl'
    from /Users/paullew/.rvm/gems/ruby-2.2.1/gems/anemone-0.7.2/lib/anemone/core.rb:83:in `initialize'
    from /Users/paullew/.rvm/gems/ruby-2.2.1/gems/anemone-0.7.2/lib/anemone/core.rb:90:in `new'
    from /Users/paullew/.rvm/gems/ruby-2.2.1/gems/anemone-0.7.2/lib/anemone/core.rb:90:in `crawl'
    from /Users/paullew/.rvm/gems/ruby-2.2.1/gems/anemone-0.7.2/lib/anemone/core.rb:18:in `crawl'
    from /Users/paullew/.rvm/gems/ruby-2.2.1/gems/wraith-3.1.0/lib/wraith/spider.rb:69:in `spider'
    from /Users/paullew/.rvm/gems/ruby-2.2.1/gems/wraith-3.1.0/lib/wraith/spider.rb:35:in `determine_paths'
    from /Users/paullew/.rvm/gems/ruby-2.2.1/gems/wraith-3.1.0/lib/wraith/spider.rb:23:in `check_for_paths'
    from /Users/paullew/.rvm/gems/ruby-2.2.1/gems/wraith-3.1.0/lib/wraith/cli.rb:36:in `check_for_paths'
    from /Users/paullew/.rvm/gems/ruby-2.2.1/gems/wraith-3.1.0/lib/wraith/cli.rb:133:in `block in capture'
    from /Users/paullew/.rvm/gems/ruby-2.2.1/gems/wraith-3.1.0/lib/wraith/cli.rb:28:in `within_acceptable_limits'
    from /Users/paullew/.rvm/gems/ruby-2.2.1/gems/wraith-3.1.0/lib/wraith/cli.rb:130:in `capture'
    from /Users/paullew/.rvm/gems/ruby-2.2.1/gems/thor-0.19.1/lib/thor/command.rb:27:in `run'
    from /Users/paullew/.rvm/gems/ruby-2.2.1/gems/thor-0.19.1/lib/thor/invocation.rb:126:in `invoke_command'
    from /Users/paullew/.rvm/gems/ruby-2.2.1/gems/thor-0.19.1/lib/thor.rb:359:in `dispatch'
    from /Users/paullew/.rvm/gems/ruby-2.2.1/gems/thor-0.19.1/lib/thor/base.rb:440:in `start'
    from /Users/paullew/.rvm/gems/ruby-2.2.1/gems/wraith-3.1.0/bin/wraith:5:in `<top (required)>'
    from /Users/paullew/.rvm/gems/ruby-2.2.1/bin/wraith:23:in `load'
    from /Users/paullew/.rvm/gems/ruby-2.2.1/bin/wraith:23:in `<main>'
    from /Users/paullew/.rvm/gems/ruby-2.2.1/bin/ruby_executable_hooks:15:in `eval'
    from /Users/paullew/.rvm/gems/ruby-2.2.1/bin/ruby_executable_hooks:15:in `<main>'

Running it via the wraith docker image:

$ docker run --rm -P -v ~/devel/resources/testing/wraith:/wraithy -w='/wraithy' bbcnews/wraith capture spider.yaml
/usr/local/lib/ruby/gems/2.1.0/gems/wraith-3.1.2/lib/wraith/spider.rb:64:in `spider': undefined local variable or method `wraith' for #<Wraith::Crawler:0x005648d0ca4be8> (NameError)
    from /usr/local/lib/ruby/gems/2.1.0/gems/wraith-3.1.2/lib/wraith/spider.rb:36:in `determine_paths'
    from /usr/local/lib/ruby/gems/2.1.0/gems/wraith-3.1.2/lib/wraith/spider.rb:24:in `check_for_paths'
    from /usr/local/lib/ruby/gems/2.1.0/gems/wraith-3.1.2/lib/wraith/cli.rb:36:in `check_for_paths'
    from /usr/local/lib/ruby/gems/2.1.0/gems/wraith-3.1.2/lib/wraith/cli.rb:134:in `block in capture'
    from /usr/local/lib/ruby/gems/2.1.0/gems/wraith-3.1.2/lib/wraith/cli.rb:28:in `within_acceptable_limits'
    from /usr/local/lib/ruby/gems/2.1.0/gems/wraith-3.1.2/lib/wraith/cli.rb:131:in `capture'
    from /usr/local/lib/ruby/gems/2.1.0/gems/thor-0.19.1/lib/thor/command.rb:27:in `run'
    from /usr/local/lib/ruby/gems/2.1.0/gems/thor-0.19.1/lib/thor/invocation.rb:126:in `invoke_command'
    from /usr/local/lib/ruby/gems/2.1.0/gems/thor-0.19.1/lib/thor.rb:359:in `dispatch'
    from /usr/local/lib/ruby/gems/2.1.0/gems/thor-0.19.1/lib/thor/base.rb:440:in `start'
    from /usr/local/lib/ruby/gems/2.1.0/gems/wraith-3.1.2/bin/wraith:5:in `<top (required)>'
    from /usr/local/bin/wraith:23:in `load'
    from /usr/local/bin/wraith:23:in `<main>'
Config validated. No serious issues found.
no paths defined in config, crawling from site root

About this issue

  • State: open
  • Created 8 years ago
  • Reactions: 2
  • Comments: 16 (1 by maintainers)

Most upvoted comments

Same Issue!

I’m getting undefined local variable or method `wraith' for #<Wraith::Crawler:0x005648d0ca4be8> (NameError) with the default spider.yml as well, version 3.1.2

I’m also getting the same error as @imagreenplant. Is this solved?

Same here:

$ wraith capture configs/spider.yaml
Config validated. No serious issues found.
no paths defined in config, crawling from site root
/Library/Ruby/Gems/2.0.0/gems/wraith-3.1.2/lib/wraith/spider.rb:64:in `spider': undefined local variable or method `wraith' for #<Wraith::Crawler:0x007fd2f1938610> (NameError)

@sembrat

Yes, switching to regex does seem to correct the errors.
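For anyone wondering why switching to regex helps: the TypeError in the first trace is what Ruby raises when a String is matched against another String with `=~`. Here is a minimal sketch of that behaviour. The `skip_link?` method below is a hypothetical stand-in for the check at anemone's core.rb:298, not the library's actual code, and the paths are made up for illustration.

```ruby
# Hypothetical stand-in for the pattern check in the anemone trace above.
# It matches a link path against every configured skip pattern.
def skip_link?(path, patterns)
  patterns.any? { |pattern| path =~ pattern }
end

# Illustrative values only; not taken from the real spider.yaml.
string_skips = ["/foo/bar.html"]          # plain strings
regex_skips  = [%r{^/foo/bar\.html$}]     # Regexp objects

begin
  skip_link?("/foo/bar.html", string_skips)
rescue TypeError => e
  # String =~ String is not allowed in Ruby, so this branch runs.
  puts "String pattern failed: #{e.class}: #{e.message}"
end

# With Regexp patterns the same check works as intended.
puts "Regexp pattern matched: #{skip_link?('/foo/bar.html', regex_skips)}"
```

So if the skip entries reach the crawler as plain strings, the crawl fails on the first link it checks, whereas entries that are Regexp objects are matched normally.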

This error has been fixed in 3.2.1:

'spider': undefined local variable or method 'wraith' for # (NameError)

However, I can see that the original error in this issue is:

core.rb:298:in `=~': type mismatch: String given (TypeError)

This issue has been closed in error. Re-opening.

I have the same error message as @slimatic. Full dump:

/usr/local/lib/ruby/gems/2.2.0/gems/wraith-3.1.2/lib/wraith/spider.rb:64:in 'spider': undefined local variable or method 'wraith' for #<Wraith::Crawler:0x007fef0c1129b8> (NameError)
    from /usr/local/lib/ruby/gems/2.2.0/gems/wraith-3.1.2/lib/wraith/spider.rb:36:in 'determine_paths'
    from /usr/local/lib/ruby/gems/2.2.0/gems/wraith-3.1.2/lib/wraith/spider.rb:24:in 'check_for_paths'
    from /usr/local/lib/ruby/gems/2.2.0/gems/wraith-3.1.2/lib/wraith/cli.rb:36:in 'check_for_paths'
    from /usr/local/lib/ruby/gems/2.2.0/gems/wraith-3.1.2/lib/wraith/cli.rb:134:in 'block in capture'
    from /usr/local/lib/ruby/gems/2.2.0/gems/wraith-3.1.2/lib/wraith/cli.rb:28:in 'within_acceptable_limits'
    from /usr/local/lib/ruby/gems/2.2.0/gems/wraith-3.1.2/lib/wraith/cli.rb:131:in 'capture'
    from /usr/local/lib/ruby/gems/2.2.0/gems/thor-0.19.1/lib/thor/command.rb:27:in 'run'
    from /usr/local/lib/ruby/gems/2.2.0/gems/thor-0.19.1/lib/thor/invocation.rb:126:in 'invoke_command'
    from /usr/local/lib/ruby/gems/2.2.0/gems/thor-0.19.1/lib/thor.rb:359:in 'dispatch'
    from /usr/local/lib/ruby/gems/2.2.0/gems/thor-0.19.1/lib/thor/base.rb:440:in 'start'
    from /usr/local/lib/ruby/gems/2.2.0/gems/wraith-3.1.2/bin/wraith:5:in '<top (required)>'
    from /usr/local/bin/wraith:23:in 'load'
    from /usr/local/bin/wraith:23:in '<main>'

Any thoughts on what this issue could be?

/usr/local/lib/ruby/gems/2.1.0/gems/wraith-3.2.0/lib/wraith/spider.rb:64:in `spider': undefined local variable or method `wraith' for #<Wraith::Crawler:0x0055cde740ed58> (NameError)

I get the same thing. Same error as @imagreenplant