tensorflow: mobile_ssd_v2_float_coco.tflite model: Cannot copy between a TensorFlowLite tensor with shape [1, 2034, 4] and a Java object with shape [1, 10, 4].

When using the mobile_ssd_v2_float_coco.tflite model with the TensorFlow Object Detection example code, I am seeing this error:

Process: org.tensorflow.lite.examples.detection, PID: 30765
java.lang.IllegalArgumentException: Cannot copy between a TensorFlowLite tensor with shape [1, 2034, 4] and a Java object with shape [1, 10, 4].
    at org.tensorflow.lite.Tensor.throwIfShapeIsIncompatible(Tensor.java:282)
    at org.tensorflow.lite.Tensor.throwIfDataIsIncompatible(Tensor.java:249)
    at org.tensorflow.lite.Tensor.copyTo(Tensor.java:141)
    at org.tensorflow.lite.NativeInterpreterWrapper.run(NativeInterpreterWrapper.java:161)
    at org.tensorflow.lite.Interpreter.runForMultipleInputsOutputs(Interpreter.java:275)
    at org.tensorflow.lite.examples.detection.tflite.TFLiteObjectDetectionAPIModel.recognizeImage(TFLiteObjectDetectionAPIModel.java:220)
    at org.tensorflow.lite.examples.detection.DetectorActivity$2.run(DetectorActivity.java:197)
    at android.os.Handler.handleCallback(Handler.java:873)
    at android.os.Handler.dispatchMessage(Handler.java:99)
    at android.os.Looper.loop(Looper.java:214)
    at android.os.HandlerThread.run(HandlerThread.java:65)

  • I did change private static final int TF_OD_API_INPUT_SIZE = 320;
  • I followed this StackOverflow solution https://stackoverflow.com/questions/54423649/tensorflow-lite-gpu-support-on-object-detector but it only partially solves the problem. Can anyone suggest a proper solution?

About this issue

  • Original URL
  • State: closed
  • Created 5 years ago
  • Comments: 36 (10 by maintainers)

Most upvoted comments

@jdduke One more query: the official documentation suggests the mobile_ssd_v2_float_coco.tflite model for enabling the GPU delegate, but after analyzing the model with Netron, mobile_ssd_v2_float_coco.tflite shows:

OUTPUT:
  raw_outputs/box_encodings
    id: raw_outputs/box_encodings
    type: float32[1,2034,4]
  raw_outputs/class_predictions
    id: raw_outputs/class_predictions
    type: float32[1,2034,91]

Whereas the default model detect.tflite used in the demo has 4 different outputs:

OUTPUT:
  TFLite_Detection_PostProcess
    id: TFLite_Detection_PostProcess
    type: float32
  TFLite_Detection_PostProcess:1
    id: TFLite_Detection_PostProcess:1
    type: float32
  TFLite_Detection_PostProcess:2
    id: TFLite_Detection_PostProcess:2
    type: float32
  TFLite_Detection_PostProcess:3
    id: TFLite_Detection_PostProcess:3
    type: float32

i.e. locations, classes, scores, and number of detections.

How can we use mobile_ssd_v2_float_coco.tflite, or any other model, to enable the GPU delegate with object detection? Please suggest.

So, these models aren’t directly compatible. That is, the model included with the detection sample creates the following output tensors: locations, classes, scores, detections. The mobile_ssd_v2_float_coco.tflite model produces a 1x2034x4 encodings output tensor and a 1x2034x91 class predictions tensor. You’ll need to modify the logic in TFLiteObjectDetectionAPIModel to handle this altered output format. You can use the Netron visualization tool to help identify how the output tensors differ.

So if you look at the output_details, specifically the shape of each output tensor, you will see that the order of outputs is actually [classes, boxes, num_detections, scores] or [scores, boxes, num_detections, classes]. (Because the shape of the output at index 3 is the same as the shape of the output at index 0, one of them is scores and the other is classes; I am not sure which.) Right now, it's likely that your Java code assumes [boxes, classes, scores, num_detections], which is why the error says that you are trying to copy a wrong-shaped tensor into another one.

You can run the model with some Python code with example images (as in the Colab I sent you) to debug.
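The shape-matching reasoning above can be sketched in a few lines of Python. This is a hypothetical helper (not from the thread) that classifies detection output tensors by their shapes, assuming the standard TFLite_Detection_PostProcess shapes; in practice the shape tuples would come from interpreter.get_output_details() in TF Lite. Note that classes and scores both have shape [1, N], so, as noted above, shape alone cannot tell them apart.

```python
# Sketch: classify detection output tensors by shape, following the
# TFLite_Detection_PostProcess convention:
#   locations      -> [1, N, 4]
#   classes/scores -> [1, N]   (ambiguous by shape alone)
#   num_detections -> [1]

def classify_outputs(shapes):
    """Map each output index to a role based on its shape tuple."""
    roles = {}
    for i, shape in enumerate(shapes):
        if len(shape) == 3 and shape[2] == 4:
            roles[i] = "locations"
        elif len(shape) == 2:
            # Could be either classes or scores; shape cannot disambiguate.
            roles[i] = "classes_or_scores"
        elif len(shape) == 1:
            roles[i] = "num_detections"
        else:
            roles[i] = "unknown"
    return roles

# The output order reported in this thread:
print(classify_outputs([(1, 10), (1, 10, 4), (1,), (1, 10)]))
# -> {0: 'classes_or_scores', 1: 'locations', 2: 'num_detections', 3: 'classes_or_scores'}
```

Running this against the real output_details shapes tells you which index to bind each Java buffer to.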

Hey @srjoglekar246

Thank you so much. You made my day!

For anyone looking for the answer in the future, as advised by @srjoglekar246:

// outputLocations: array of shape [Batchsize, NUM_DETECTIONS, 4]
// contains the location of detected boxes
private float[][][] outputLocations;
// outputClasses: array of shape [Batchsize, NUM_DETECTIONS]
// contains the classes of detected boxes
private float[][] outputClasses;
// outputScores: array of shape [Batchsize, NUM_DETECTIONS]
// contains the scores of detected boxes
private float[][] outputScores;
// numDetections: array of shape [Batchsize]
// contains the number of detected boxes
private float[] numDetections;

The answer lies in the output indices: [numDetections: 2, outputScores: 0/3, outputLocations: 1, outputClasses: 3/0]

The change is in the indexing of the outputMap.put calls, and it all works fine!

Object[] inputArray = {imgData};
Map<Integer, Object> outputMap = new HashMap<>();
outputMap.put(1, outputLocations);
outputMap.put(3, outputClasses);
outputMap.put(0, outputScores);
outputMap.put(2, numDetections);

Thank you once again for the swift replies. I will try to share this on Stack Overflow to help others 😃
God speed.

Feel free to just follow up here, so that it's easier for me to track 😃

I got the same issue. I set num_classes to 5 to train on my custom dataset, but I got the error: java.lang.IllegalArgumentException: Cannot copy between a TensorFlowLite tensor with shape [1, 10, 4] and a Java object with shape [1, 5, 4]. The 5 is my number of classes, so why is the output tensor 10?
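A likely explanation (my reading, not confirmed in this thread): the second dimension of the detection outputs is the maximum number of detections produced by the post-processing step (NUM_DETECTIONS in the demo app, typically 10), not the number of classes. Sizing the Java buffers by num_classes produces exactly this mismatch. A minimal sketch of the expected shapes:

```python
# Sketch: TFLite_Detection_PostProcess output shapes depend on the max
# number of detections, not on num_classes. Allocating buffers sized by
# num_classes (here 5) causes the shape-mismatch error above.

def expected_output_shapes(batch_size, max_detections):
    """Shapes of the four detection outputs for a given detection cap."""
    return {
        "locations": (batch_size, max_detections, 4),
        "classes": (batch_size, max_detections),
        "scores": (batch_size, max_detections),
        "num_detections": (batch_size,),
    }

# With the demo's NUM_DETECTIONS = 10, locations is [1, 10, 4]
# regardless of how many classes the model was trained on.
print(expected_output_shapes(1, 10)["locations"])  # -> (1, 10, 4)
```

So the fix would be to size the Java arrays with the detection cap (10), not the class count (5).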

label_map.pbtxt

item {
  id: 1
  name: 'me'
}

item {
  id: 2
  name: 'teammate'
}

item {
  id: 3
  name: 'enemy'
}

item {
  id: 4
  name: 'enemy_no_threat'
}

item {
  id: 5
  name: 'ignorance'
}

ssdlite_mobilenet_v3_small_320x320_coco.config

model {
  ssd {
    inplace_batchnorm_update: true
    freeze_batchnorm: false
    num_classes: 5
    box_coder {
      faster_rcnn_box_coder {
        y_scale: 10.0
        x_scale: 10.0
        height_scale: 5.0
        width_scale: 5.0
      }
    }
    matcher {
      argmax_matcher {
        matched_threshold: 0.5
        unmatched_threshold: 0.5
        ignore_thresholds: false
        negatives_lower_than_unmatched: true
        force_match_for_each_row: true
        use_matmul_gather: true
      }
    }
    similarity_calculator {
      iou_similarity {
      }
    }
    encode_background_as_zeros: true
    anchor_generator {
      ssd_anchor_generator {
        num_layers: 6
        min_scale: 0.2
        max_scale: 0.95
        aspect_ratios: 1.0
        aspect_ratios: 2.0
        aspect_ratios: 0.5
        aspect_ratios: 3.0
        aspect_ratios: 0.3333
      }
    }
    image_resizer {
      fixed_shape_resizer {
        height: 320
        width: 320
      }
    }
    box_predictor {
      convolutional_box_predictor {
        min_depth: 0
        max_depth: 0
        num_layers_before_predictor: 0
        use_dropout: false
        dropout_keep_probability: 0.8
        kernel_size: 3
        use_depthwise: true
        box_code_size: 4
        apply_sigmoid_to_scores: false
        class_prediction_bias_init: -4.6
        conv_hyperparams {
          activation: RELU_6,
          regularizer {
            l2_regularizer {
              weight: 0.00004
            }
          }
          initializer {
            random_normal_initializer {
              stddev: 0.03
              mean: 0.0
            }
          }
          batch_norm {
            train: true,
            scale: true,
            center: true,
            decay: 0.97,
            epsilon: 0.001,
          }
        }
      }
    }
    feature_extractor {
      type: 'ssd_mobilenet_v3_small'
      min_depth: 16
      depth_multiplier: 1.0
      use_depthwise: true
      conv_hyperparams {
        activation: RELU_6,
        regularizer {
          l2_regularizer {
            weight: 0.00004
          }
        }
        initializer {
          truncated_normal_initializer {
            stddev: 0.03
            mean: 0.0
          }
        }
        batch_norm {
          train: true,
          scale: true,
          center: true,
          decay: 0.97,
          epsilon: 0.001,
        }
      }
      override_base_feature_extractor_hyperparams: true
    }
    loss {
      classification_loss {
        weighted_sigmoid_focal {
          alpha: 0.75,
          gamma: 2.0
        }
      }
      localization_loss {
        weighted_smooth_l1 {
          delta: 1.0
        }
      }
      classification_weight: 1.0
      localization_weight: 1.0
    }
    normalize_loss_by_num_matches: true
    normalize_loc_loss_by_codesize: true
    post_processing {
      batch_non_max_suppression {
        score_threshold: 1e-8
        iou_threshold: 0.6
        max_detections_per_class: 100
        max_total_detections: 100
        use_static_shapes: true
      }
      score_converter: SIGMOID
    }
  }
}

train_config: {
  batch_size: 512
  sync_replicas: true
  startup_delay_steps: 0
  replicas_to_aggregate: 32
  num_steps: 800000
  data_augmentation_options {
    random_horizontal_flip {
    }
  }
  data_augmentation_options {
    ssd_random_crop {
    }
  }
  optimizer {
    momentum_optimizer: {
      learning_rate: {
        cosine_decay_learning_rate {
          learning_rate_base: 0.4
          total_steps: 800000
          warmup_learning_rate: 0.13333
          warmup_steps: 2000
        }
      }
      momentum_optimizer_value: 0.9
    }
    use_moving_average: false
  }
  max_number_of_boxes: 100
  unpad_groundtruth_tensors: false
}

train_input_reader: {
  tf_record_input_reader {
    input_path: "/home/cy/workspace/ssd/tfdata/sausagetain.record"
  }
  label_map_path: "/home/cy/workspace/ssd/tfdata/sausage_label_map.pbtxt"
}

eval_config: {
  num_examples: 8000
}

eval_input_reader: {
  tf_record_input_reader {
    input_path: "/home/cy/workspace/ssd/tfdata/sausageval.record"
  }
  label_map_path: "/home/cy/workspace/ssd/tfdata/sausage_label_map.pbtxt"
  shuffle: false
  num_readers: 1
}

Convert pb to tflite.

tensorflow/lite/toco/toco \
--input_file=/Users/cy/PycharmProjects/sausage.pb \
--output_file=/Users/cy/PycharmProjects/detect5.tflite \
--output_format=TFLITE \
--input_arrays=normalized_input_image_tensor \
--input_shapes=1,320,320,3 \
--output_arrays='TFLite_Detection_PostProcess','TFLite_Detection_PostProcess:1','TFLite_Detection_PostProcess:2','TFLite_Detection_PostProcess:3' \
--inference_type=FLOAT \
--mean_values=128 \
--std_values=128 \
--allow_custom_ops \
--change_concat_input_ranges=false
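For reference, the --mean_values=128 and --std_values=128 flags in the toco command describe how a uint8 pixel maps into the float range the model expects, roughly real_value = (pixel - mean) / std. A small illustration of that arithmetic (my own sketch, not part of the conversion itself):

```python
# Sketch: effect of toco's --mean_values=128 --std_values=128 on a
# uint8 pixel value, mapping [0, 255] to approximately [-1, 1]:
#   normalized = (pixel - mean) / std

def normalize_pixel(pixel, mean=128.0, std=128.0):
    return (pixel - mean) / std

print(normalize_pixel(0))    # -> -1.0
print(normalize_pixel(128))  # -> 0.0
print(normalize_pixel(255))  # -> 0.9921875
```

If the app feeds the model float input, it must apply the same normalization when filling imgData, or detections will be poor even though the shapes match.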