jekyll-algolia: Unable to batch push data on the index

Hello,

Just found out about Algolia and its Jekyll plugin.

However, when I run the command:

ALGOLIA_API_KEY='secret' bundle exec jekyll algolia --trace

I stumble upon the following issue:

Extracting records...
Updating records in index blog.frankel.ch...
Records to delete: 33
Records to add:    57
bundler: failed to load command: jekyll (/Users/i303869/.rbenv/versions/2.4.2/bin/jekyll)
NoMethodError: undefined method `[]' for nil:NilClass
  /Users/i303869/.rbenv/versions/2.4.2/lib/ruby/gems/2.4.0/gems/jekyll-algolia-1.1.0/lib/jekyll/algolia/error_handler.rb:198:in `record_too_big?'
  /Users/i303869/.rbenv/versions/2.4.2/lib/ruby/gems/2.4.0/gems/jekyll-algolia-1.1.0/lib/jekyll/algolia/error_handler.rb:58:in `block in identify'
  /Users/i303869/.rbenv/versions/2.4.2/lib/ruby/gems/2.4.0/gems/jekyll-algolia-1.1.0/lib/jekyll/algolia/error_handler.rb:57:in `each'
  /Users/i303869/.rbenv/versions/2.4.2/lib/ruby/gems/2.4.0/gems/jekyll-algolia-1.1.0/lib/jekyll/algolia/error_handler.rb:57:in `identify'
  /Users/i303869/.rbenv/versions/2.4.2/lib/ruby/gems/2.4.0/gems/jekyll-algolia-1.1.0/lib/jekyll/algolia/error_handler.rb:23:in `stop'
  /Users/i303869/.rbenv/versions/2.4.2/lib/ruby/gems/2.4.0/gems/jekyll-algolia-1.1.0/lib/jekyll/algolia/indexer.rb:154:in `rescue in block in update_records'
  /Users/i303869/.rbenv/versions/2.4.2/lib/ruby/gems/2.4.0/gems/jekyll-algolia-1.1.0/lib/jekyll/algolia/indexer.rb:151:in `block in update_records'
  /Users/i303869/.rbenv/versions/2.4.2/lib/ruby/gems/2.4.0/gems/jekyll-algolia-1.1.0/lib/jekyll/algolia/indexer.rb:150:in `each'
  /Users/i303869/.rbenv/versions/2.4.2/lib/ruby/gems/2.4.0/gems/jekyll-algolia-1.1.0/lib/jekyll/algolia/indexer.rb:150:in `each_slice'
  /Users/i303869/.rbenv/versions/2.4.2/lib/ruby/gems/2.4.0/gems/jekyll-algolia-1.1.0/lib/jekyll/algolia/indexer.rb:150:in `update_records'
  /Users/i303869/.rbenv/versions/2.4.2/lib/ruby/gems/2.4.0/gems/jekyll-algolia-1.1.0/lib/jekyll/algolia/indexer.rb:192:in `run'
  /Users/i303869/.rbenv/versions/2.4.2/lib/ruby/gems/2.4.0/gems/jekyll-algolia-1.1.0/lib/jekyll-algolia.rb:119:in `write'
  /Users/i303869/.rbenv/versions/2.4.2/lib/ruby/gems/2.4.0/gems/jekyll-3.7.2/lib/jekyll/site.rb:75:in `process'
  /Users/i303869/.rbenv/versions/2.4.2/lib/ruby/gems/2.4.0/gems/jekyll-algolia-1.1.0/lib/jekyll-algolia.rb:57:in `run'
  /Users/i303869/.rbenv/versions/2.4.2/lib/ruby/gems/2.4.0/gems/jekyll-algolia-1.1.0/lib/jekyll/commands/algolia.rb:42:in `block (2 levels) in init_with_program'
  /Users/i303869/.rbenv/versions/2.4.2/lib/ruby/gems/2.4.0/gems/mercenary-0.3.6/lib/mercenary/command.rb:220:in `block in execute'
  /Users/i303869/.rbenv/versions/2.4.2/lib/ruby/gems/2.4.0/gems/mercenary-0.3.6/lib/mercenary/command.rb:220:in `each'
  /Users/i303869/.rbenv/versions/2.4.2/lib/ruby/gems/2.4.0/gems/mercenary-0.3.6/lib/mercenary/command.rb:220:in `execute'
  /Users/i303869/.rbenv/versions/2.4.2/lib/ruby/gems/2.4.0/gems/mercenary-0.3.6/lib/mercenary/program.rb:42:in `go'
  /Users/i303869/.rbenv/versions/2.4.2/lib/ruby/gems/2.4.0/gems/mercenary-0.3.6/lib/mercenary.rb:19:in `program'
  /Users/i303869/.rbenv/versions/2.4.2/lib/ruby/gems/2.4.0/gems/jekyll-3.7.2/exe/jekyll:15:in `<top (required)>'
  /Users/i303869/.rbenv/versions/2.4.2/bin/jekyll:23:in `load'
  /Users/i303869/.rbenv/versions/2.4.2/bin/jekyll:23:in `<top (required)>'

Interestingly enough, if I remove all the posts (but not the pages), or use the example repo, it works.

Is there any way I could debug this further? Any hint or help is appreciated.

My site is hosted in a private GitLab repo, but if need be, I could give you access rights.

About this issue

  • State: closed
  • Created 6 years ago
  • Comments: 23 (1 by maintainers)

Most upvoted comments

I just pulled your branch and ran jekyll algolia. I managed to push ~600 records to my index with no error thrown.

But I realized that re-running the command was actually deleting all records and re-adding them. I’m going to get that fixed (I think it has something to do with the adoc converter). Maybe both bugs are related.

In the meantime, could you try updating to the latest version, v1.1.1? It should handle the error you were having better. And to answer your question, records are pushed from memory and not written to disk. The new feature in v1.1.1 writes a log file in your source directory containing the content of the record that was refused by Algolia. This should help you track down the origin of the issue.

FYI, I wrote a basic plugin (I’m not a Ruby developer) instead of a hook to cut on the excerpt size:

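# Cap each post's excerpt at 150 characters so the indexed records stay small.
Jekyll::Hooks.register :posts, :pre_render do |doc|
  content = doc.content
  if content.length > 150
    # Exclusive range: keeps characters 0 through 149, i.e. exactly 150.
    content = content[0...150]
  end
  doc.data['excerpt'] = content
end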

Not only does indexing work now, it’s also very fast.

Thanks for the report.

I stumbled upon a similar issue yesterday. What I think is going on is that the plugin is trying to push a record that is too big for Algolia (more than 10 KB). The plugin should handle this gracefully and tell you which record is causing the problem, but that part of the code is itself failing.
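
While waiting for better error reporting, a size check along these lines could help spot the offending record. This is only a sketch, not the plugin's code: it assumes records are plain hashes, takes the 10 KB figure above as the limit, and the names MAX_RECORD_BYTES and oversized_records are made up for illustration.

```ruby
require 'json'

# Assumed limit, taken from the "more than 10 KB" figure mentioned above.
MAX_RECORD_BYTES = 10_000

# Returns [record, size_in_bytes] pairs for every record whose JSON
# serialization exceeds the limit, largest first.
def oversized_records(records, limit = MAX_RECORD_BYTES)
  records
    .map { |record| [record, JSON.generate(record).bytesize] }
    .select { |_, size| size > limit }
    .sort_by { |_, size| -size }
end

# Hypothetical sample data standing in for extracted records.
records = [
  { 'title' => 'Short post', 'content' => 'Hello' },
  { 'title' => 'Huge post', 'content' => 'x' * 20_000 }
]

oversized_records(records).each do |record, size|
  puts "#{record['title']}: #{size} bytes"
end
```

Measuring bytes rather than characters matters here, since non-ASCII content takes more than one byte per character once serialized.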

Now, why is your record that big? This is often caused either by malformed HTML (unclosed tags, causing the HTML parser to grab too much content), or by recursive rendering in your pages (calling {{ content }} from a page, forcing it to re-render itself inside itself, inception-style).
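
To look for the recursive-rendering case, a quick scan of the page sources for {{ content }} may be enough. The sketch below is a rough heuristic under stated assumptions: sources are passed in as strings keyed by path, front matter is a leading block delimited by --- lines, and the helper names are hypothetical. Layouts legitimately use {{ content }}, so only posts and pages should be checked.

```ruby
# Matches a Liquid output tag that starts with the bare `content` variable,
# e.g. {{ content }} or {{content | strip }}, but not {{ page.content }}.
CONTENT_TAG = /\{\{-?\s*content\b/

# Strips a leading YAML front matter block, if there is one.
def body_without_front_matter(source)
  source.sub(/\A---\s*\n.*?\n---\s*\n/m, '')
end

# True when the body of a post or page references its own rendered content.
def self_including?(source)
  body_without_front_matter(source).match?(CONTENT_TAG)
end

# Hypothetical sample sources, keyed by path.
sources = {
  '_posts/ok.md'  => "---\ntitle: Fine\n---\nJust some text.\n",
  '_posts/bad.md' => "---\ntitle: Loop\n---\nIntro {{ content }} outro.\n"
}

suspects = sources.select { |_, source| self_including?(source) }.keys
puts suspects
```

On a real site, the hash could be built by reading the files under _posts and the page directories.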

I’ll release a new version of the plugin soon, with better error handling that might help you pinpoint the issue, but if you can grant me access to your GitLab, or give me a smaller reproducible test case I’d be happy to have a look.