PaddleOCR: 【BUG】ppstructure quickstart: errors when running the official quickstart Python SDK code for document analysis and table recognition

Please provide the following information to quickly locate the problem.

When I used the Python code from the ppstructure documentation to try out image orientation classification + layout analysis + table recognition, an error occurred. I did not change the logic of the code; I typed it exactly as it appears in the official documentation.

System Environment: CentOS 7
Version: Paddle: paddlegpu 2.4.2&dev 0.0.0; PaddleOCR: release2.7
Related components:
Command Code: following the official ppstructure document-analysis documentation:
table_engine = PPStructure(show_log=True, image_orientation=True)
img = cv2.imread(img_path)
result = table_engine(img)
Complete Error Message:
Traceback (most recent call last):
  File "/opt/conda/envs/jcai/bin/paddleocr", line 8, in <module>
    sys.exit(main())
  File "/opt/conda/envs/jcai/lib/python3.8/site-packages/paddleocr/paddleocr.py", line 837, in main
    result = engine(img, img_idx=index)
  File "/opt/conda/envs/jcai/lib/python3.8/site-packages/paddleocr/paddleocr.py", line 759, in __call__
    res, _ = super().__call__(
  File "/opt/conda/envs/jcai/lib/python3.8/site-packages/paddleocr/ppstructure/predict_system.py", line 143, in __call__
    filter_boxes, filter_rec_res, ocr_time_dict = self.text_system(
  File "/opt/conda/envs/jcai/lib/python3.8/site-packages/paddleocr/tools/infer/predict_system.py", line 105, in __call__
    rec_res, elapse = self.text_recognizer(img_crop_list)
  File "/opt/conda/envs/jcai/lib/python3.8/site-packages/paddleocr/tools/infer/predict_rec.py", line 616, in __call__
    self.predictor.run()
ValueError: (InvalidArgument) Broadcast dimension mismatch. Operands could not be broadcast together with the shape of X = [4, 96, 3, 90] and the shape of Y = [4, 96, 4, 90]. Received [3] in X is not equal to [4] in Y at i:2.
  [Hint: Expected x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1 == true, but received x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1:0 != true:1.] (at /paddle/paddle/phi/kernels/funcs/common_shape.h:84)
  [operator < elementwise_add > error]

We provide AceIssueSolver to solve issues, do you want it? (Please write yes/no): yes
Please try not to include images in the issue.

About this issue

State: closed
Created 9 months ago
Comments: 15 (3 by maintainers)

Most upvoted comments

According to the error message, the problem occurs in the text recognition stage: the height of the image fed to the recognition model differs from the expected input height (Operands could not be broadcast together with the shape of X = [4, 96, 3, 90] and the shape of Y = [4, 96, 4, 90]). This may happen when the height of the input image is not a multiple of 16, because rounding in the downsampling and upsampling steps can leave the feature maps one row apart when they are concatenated. We suggest checking your input image.
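To illustrate the kind of rounding described above, here is a minimal sketch. It is generic layer arithmetic, not the actual PaddleOCR recognition network: the kernel sizes, strides, and the helper names `conv_out`, `pool_out`, and `rows_to_pad` are assumptions chosen for illustration. It shows how two branches that both halve an odd height can end up with 4 and 3 rows, and how many rows of padding would bring a height up to a multiple of 16.

```python
def conv_out(size, kernel=3, stride=2, padding=1):
    """Output length of a strided convolution along one axis (floor rule)."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel=2, stride=2):
    """Output length of an unpadded pooling layer along one axis."""
    return (size - kernel) // stride + 1


def rows_to_pad(height, multiple=16):
    """Rows to append so that `height` becomes a multiple of `multiple`."""
    return (-height) % multiple


# An odd intermediate height of 7 yields two different sizes, so an
# element-wise add of the two branches cannot broadcast.
print(conv_out(7), pool_out(7))  # 4 3

# With an even height of 8 both branches agree.
print(conv_out(8), pool_out(8))  # 4 4

# Padding needed for a few example heights (assumed values).
for height in (48, 50, 64):
    print(height, rows_to_pad(height))
```

Whether padding the crop is the right fix depends on the model in use; the sketch only shows where a one-row difference such as 3 versus 4 can come from.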

xu-peng-7 on Sep 20, 2023

Can you first run the code in section 2.2.2 (Layout analysis + table recognition) and check whether the same error occurs?

Sure. After running the code above, I also get an error:

Traceback (most recent call last):
  File "demo.py", line 57, in <module>
    result = table_engine(img)
  File "/opt/conda/envs/jcai/lib/python3.8/site-packages/paddleocr/paddleocr.py", line 759, in __call__
    res, _ = super().__call__(
  File "/opt/conda/envs/jcai/lib/python3.8/site-packages/paddleocr/ppstructure/predict_system.py", line 143, in __call__
    filter_boxes, filter_rec_res, ocr_time_dict = self.text_system(
  File "/opt/conda/envs/jcai/lib/python3.8/site-packages/paddleocr/tools/infer/predict_system.py", line 105, in __call__
    rec_res, elapse = self.text_recognizer(img_crop_list)
  File "/opt/conda/envs/jcai/lib/python3.8/site-packages/paddleocr/tools/infer/predict_rec.py", line 616, in __call__
    self.predictor.run()
ValueError: (InvalidArgument) Broadcast dimension mismatch. Operands could not be broadcast together with the shape of X = [6, 96, 3, 74] and the shape of Y = [6, 96, 4, 74]. Received [3] in X is not equal to [4] in Y at i:2.
  [Hint: Expected x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1 == true, but received x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1:0 != true:1.] (at ../paddle/phi/kernels/funcs/common_shape.h:86)
  [operator < elementwise_add > error]
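The hint in the traceback spells out the per-dimension rule behind the failure: two sizes are compatible only if they are equal or one of them is at most 1. Below is a small sketch of that rule in plain Python, applied to the shapes from the error above. The helper name `first_mismatch` is hypothetical, and the sketch assumes both shapes have the same rank, as they do here; it is not Paddle's implementation.

```python
def first_mismatch(x_dims, y_dims):
    """Return the first axis where two equal-rank shapes cannot broadcast, or None."""
    if len(x_dims) != len(y_dims):
        raise ValueError("this sketch only handles shapes of equal rank")
    for i, (x, y) in enumerate(zip(x_dims, y_dims)):
        # Same condition as in the hint: equal sizes, or one side <= 1.
        if not (x == y or x <= 1 or y <= 1):
            return i
    return None


x_shape = [6, 96, 3, 74]
y_shape = [6, 96, 4, 74]

# Axis 2 (3 vs 4) is the offending one, matching "at i:2" in the message.
print(first_mismatch(x_shape, y_shape))  # 2

# A size of 1 on that axis would broadcast without error.
print(first_mismatch([6, 96, 1, 74], y_shape))  # None
```

Axes 0, 1, and 3 agree in both tracebacks, so only the third axis of the two operands (3 versus 4) needs investigating.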

PancakeAwesome on Sep 20, 2023

Can you first run the code in section 2.2.2 (Layout analysis + table recognition) and check whether the same error occurs?

xu-peng-7 on Sep 20, 2023

Hi @PancakeAwesome, could you provide the image you used for testing? I'll try to reproduce the error.

GreatV on Sep 20, 2023