cloud_controller_ng: App events with large metadata cause a DB error and are not recorded in the events table

Issue

This is related to https://github.com/cloudfoundry/capi-release/issues/152. Application crash events that have large metadata cause a DB error and are not recorded in the CC DB events table.

Below are the CC and TPS logs related to the failure (truncated for readability).

CC logs:

"message":"Started POST \"/internal/v4/apps/76a37782-5b65-4ecb-980b-3d26a6ed094d-e3083ba6-e413-4d68-8006-a26d741ef2ac/crashed\" for user:...
"message":"exception not translated: Sequel::DatabaseError - Mysql2::Error: Data too long for column 'metadata' at row 1...
"message":"Request failed: 500: {\"description\"=>\"Database error\", \"error_code\"=>\"CF-DatabaseError\", \"code\"=>10011, \"test_mode_info\"=>{\"description\"=>\"Database error\", \"error_code\"=>\"CF-DatabaseError\"...
"message":"Completed 500 vcap-request-id: ae31f995-3ad3-44c9-b77f-4699c57002b2","log_level":"info","source":"cc.api","data"...

TPS logs:

"message":"tps-watcher.watcher.app-crashed","data":{"index":0,"process-guid":"76a37782-5b65-4ecb-980b-3d26a6ed094d-e3083ba6-e413-4d68-8006-a26d741ef2ac","session":"2"}}
"message":"tps-watcher.watcher.recording-app-crashed","data":{"index":0,"process-guid":"76a37782-5b65-4ecb-980b-3d26a6ed094d-e3083ba6-e413-4d68-8006-a26d741ef2ac","session":"2"}}
"message":"tps-watcher.watcher.failed-recording-app-crashed","data":{"error":"Crashed response POST failed with 500","index":0,"process-guid":"76a37782-5b65-4ecb-980b-3d26a6ed094d-e3083ba6-e413-4d68-8006-a26d741ef2ac","session":"2"}}

Steps to Reproduce

Pushing an app that exceeds its disk limit seems to reliably reproduce this issue. Push a simple test app that contains a file larger than the assigned disk limit. For example, with an app larger than 10 MB:

cf push myapp -k 10M

Expected result

I would expect to see events for the app showing that it crashed due to insufficient disk space.

Current result

No events are recorded for the app crash, and errors appear in the CC and TPS logs.

Possible Fix

Limit the size of the metadata posted to '/internal/v4/apps/:process_guid/crashed'. Ideally this would happen before the message is sent to the TPS or CC, rather than in the CC code itself (app/controllers/internal/app_crashed_controller.rb).
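A minimal sketch of the sender-side idea, in Python for illustration only (the components involved are not written in Python). It assumes the oversized part of the crash payload is a single free-text field; the field name `exit_description`, the helper names, and the 10 KiB limit are all assumptions, not taken from the actual TPS or CC code.

```python
# Assumed limit; the real column size and payload schema are not given in this issue.
MAX_DESCRIPTION_BYTES = 10 * 1024


def truncate_utf8(text: str, max_bytes: int, marker: str = " (truncated)") -> str:
    """Return text cut so that its UTF-8 encoding is at most max_bytes long."""
    raw = text.encode("utf-8")
    if len(raw) <= max_bytes:
        return text
    budget = max_bytes - len(marker.encode("utf-8"))
    if budget <= 0:
        # No room for the marker: cut hard, dropping any split multi-byte character.
        return raw[:max_bytes].decode("utf-8", errors="ignore")
    # errors="ignore" drops a trailing partial multi-byte sequence left by the slice.
    return raw[:budget].decode("utf-8", errors="ignore") + marker


def limit_crash_payload(payload: dict,
                        field: str = "exit_description",
                        max_bytes: int = MAX_DESCRIPTION_BYTES) -> dict:
    """Return a copy of the crash payload with one (hypothetical) text field capped."""
    limited = dict(payload)
    value = limited.get(field)
    if isinstance(value, str):
        limited[field] = truncate_utf8(value, max_bytes)
    return limited
```

Truncating on bytes rather than characters matters here because the database error is about column size, and a character-based cut could still overflow with multi-byte text.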

About this issue

  • State: closed
  • Created 5 years ago
  • Comments: 15 (5 by maintainers)

Most upvoted comments

We haven't done any analysis of that sort, so I couldn't say whether that lines up with our data, but realistically I don't think it matters from our perspective. Let's just pick a reasonable size like 10K and make sure any DB writes are truncated to fit so we don't get a SQL error. I think the missing events in the events table are the bigger issue here; that should never happen.
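As a rough illustration of "truncate any DB write to fit", the sketch below (Python, for illustration only) serializes event metadata to JSON and shrinks the longest top-level string value until the result fits a byte budget. The function name, the `truncated` flag, and the 10 KiB budget are hypothetical; nested values are not handled, and the budget is assumed to be large enough to hold the fallback stub.

```python
import json

# Assumed budget, following the "10K" figure suggested in the comment above.
METADATA_LIMIT_BYTES = 10 * 1024


def serialize_metadata(metadata: dict, limit: int = METADATA_LIMIT_BYTES) -> str:
    """Serialize metadata to JSON that fits within `limit` bytes.

    Only top-level string values are shortened; this is a sketch, not the
    behaviour of any existing Cloud Controller code.
    """
    data = dict(metadata)
    encoded = json.dumps(data)  # ASCII-only output by default
    while len(encoded.encode("utf-8")) > limit:
        strings = {k: v for k, v in data.items() if isinstance(v, str) and v}
        if not strings:
            # Nothing left to shorten: store a stub rather than fail the write.
            return json.dumps({"truncated": True})
        key = max(strings, key=lambda k: len(strings[k]))
        overflow = len(encoded.encode("utf-8")) - limit
        keep = max(0, len(data[key]) - overflow)
        data[key] = data[key][:keep]
        data["truncated"] = True  # lets readers of the event see data was cut
        encoded = json.dumps(data)
    return encoded
```

With this kind of guard the event row is always written, which addresses the missing-events problem even when the crash details themselves are incomplete.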