aws-sdk-ruby: Nesting Bug When DynamoDB is Mid Provisioning Capacity

I’m using this code to do bulk writes to DynamoDB:

@dynamodb ||= Aws::DynamoDB::Client.new

  def build_put_request(data)
    while data.any?
      # DynamoDB supports batch writes of up to 25 items at a time
      batch = data.slice!(0, 25)
      items = put_request_items(batch)
      write_to_dynamo(ENV['DYNAMO_TABLE'] => items)
    end
  end

  def put_request_items(batch)
    Array.new(batch.length) do |i|
      { put_request: item(batch[i]) }
    end
  end

  def item(batch)
    { item: {
      user: batch['user'],
      pic: batch['pic'],
      bio: batch['bio'],
      fav_movies: batch['movies'],
      fav_tv_shows: batch['fav_tv_shows'],
      age: batch['age'],
      premium_user: true,
      referral_source: batch['source']
    } }
  end

  def write_to_dynamo(items)
    tries ||= 5
    resp = @dynamodb.batch_write_item(request_items: items)
    handle_unprocessed_items(resp.unprocessed_items)
  rescue StandardError => e
    @logger.error("#{e}\nError on: #{{ request_items: items }}")
    sleep(2)
    retry unless (tries -= 1).zero?
    @logger.fatal('Fatal Dynamo Errors, Program Aborting')
    abort
  end

  def handle_unprocessed_items(unprocessed_items)
    if unprocessed_items.count > 0
      log_failure(unprocessed_items[ENV['DYNAMO_TABLE']]) if (@tries + 1) > 1
      sleep(@tries * 0.75)
      write_to_dynamo(unprocessed_items)
    else
      reset_tries
      print_sent_message
    end
  end

I get this error when I run out of provisioned capacity: ERROR -- DynamoDB: The level of configured provisioned throughput for one or more global secondary indexes of the table was exceeded. Consider increasing your provisioning level for the under-provisioned global secondary indexes with the UpdateTable API

On the retry (another app raises the provisioned write capacity dynamically), I then get a different error: ERROR -- DynamoDB: The provided key element does not match the schema

When I look at the data, I suddenly see odd nesting like this:

{:request_items=>{"users"=>[#<struct Aws::DynamoDB::Types::WriteRequest put_request=#<struct Aws::DynamoDB::Types::PutRequest item={"age"=>{:m=>{"n"=>{:s=>"34"}}}, "user"=>{:m=>{"s"=>{:s=>"Uncle Bill"}}}, "pic"=>{:m=>{"s"=>{:s=>"https://s3.aws.sdfa3jlfsd.com"}}}, "bio"=>{:m=>{"s"=>{:s=>"lorem ipsum...."}}}, "premium_user"=>{:m=>{"bool"=>{:bool=>true}}} # etc...

The request should look like this as far as I understand:

{:request_items=>{"users"=>[#<struct Aws::DynamoDB::Types::WriteRequest put_request=#<struct Aws::DynamoDB::Types::PutRequest item={"age"=>{:n=>"34"}, "user"=>{:s=>"Uncle Bill"}, "pic"=>{:s=>"https://s3.aws.sdfa3jlfsd.com"}, "bio"=>{:s=>"lorem ipsum...."}, "premium_user"=>{:bool=>true} # etc...

This only seems to happen while more capacity is being provisioned. Or did I do something wrong?

AWS Gem Versions:

aws-sdk (2.6.44)
  aws-sdk-resources (= 2.6.44)
aws-sdk-core (2.6.44)
  aws-sigv4 (~> 1.0)
  jmespath (~> 1.0)
aws-sdk-resources (2.6.44)
  aws-sdk-core (= 2.6.44)
aws-sigv4 (1.0.0)

About this issue

  • State: closed
  • Created 7 years ago
  • Comments: 24 (14 by maintainers)

Most upvoted comments

Okay, I do see a possible bug source here. The code works under normal circumstances, but it’s possible you’re getting an unusual error, like perhaps a network error, under production load and our plugin stack is consuming the mutated hash for a built-in retry.

In our Aws::Plugins::DynamoDBSimpleAttributes class, the param hash gets mutated. If we go through the full stack, this mutation happens in both directions and there’s no problem, but it looks like in some cases we might mutate it twice.
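To illustrate how a second pass could produce the nesting shown in the report, here is a minimal sketch in plain Ruby. The `marshal` and `marshal_in_place!` helpers are hypothetical, simplified stand-ins written for this example; they are not the SDK's actual code, and only mimic the idea of converting plain values to typed attribute values in place.

```ruby
# Hypothetical, simplified stand-in for simple-attribute marshaling.
# It is NOT the SDK implementation; it only mirrors the general shape.
def marshal(value)
  case value
  when Hash
    # A hash is treated as a DynamoDB map; keys become strings.
    { m: value.each_with_object({}) { |(k, v), map| map[k.to_s] = marshal(v) } }
  when String then { s: value }
  when Numeric then { n: value.to_s }
  when true, false then { bool: value }
  else raise ArgumentError, "unsupported type: #{value.class}"
  end
end

# Mutates the caller's hash, as the comment above describes for the params.
def marshal_in_place!(item)
  item.keys.each { |key| item[key] = marshal(item[key]) }
  item
end

item = { 'age' => 34, 'premium_user' => true }

# First pass: the expected typed values.
once = marshal_in_place!(item).dup

# Second pass over the already-marshaled hash: each typed value is itself
# a Hash, so it gets wrapped as a map, matching the nesting in the report.
twice = marshal_in_place!(item)

puts once.inspect
puts twice.inspect
```

In this sketch the second pass turns `{ n: "34" }` into `{ m: { "n" => { s: "34" } } }`, which has the same shape as the `"age"` entry in the failing request.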

Still investigating.
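Until the cause is confirmed, one defensive option on the caller's side might be to hand the client a deep copy of the params, so that a rescue-and-retry always starts from an unmutated hash. The sketch below is an assumption-laden illustration, not a confirmed fix: `MutatingClient`, `deep_copy`, and `safe_batch_write` are hypothetical names, and the fake client only imitates in-place mutation.

```ruby
# Hypothetical stand-in for a client whose plugin stack mutates params in place.
class MutatingClient
  def batch_write_item(params)
    params[:request_items].each_value do |requests|
      requests.each do |request|
        item = request[:put_request][:item]
        # Imitate in-place conversion of each attribute value.
        item.keys.each { |key| item[key] = { wrapped: item[key] } }
      end
    end
    :ok
  end
end

# Deep copy via Marshal; suitable for plain hashes, arrays, strings and numbers.
def deep_copy(obj)
  Marshal.load(Marshal.dump(obj))
end

# Pass a copy to the client so the caller's hash stays untouched for retries.
def safe_batch_write(client, request_items)
  client.batch_write_item(request_items: deep_copy(request_items))
end

original = { 'users' => [{ put_request: { item: { 'age' => 34 } } }] }
result = safe_batch_write(MutatingClient.new, original)

puts result.inspect
puts original.inspect
```

The trade-off is an extra copy per request; for batches of at most 25 items that cost should be small.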