agones: FleetAutoscaler bug

What happened:

We found a bug when using FleetAutoscaler with the following configuration:

apiVersion: "autoscaling.agones.dev/v1"
kind: FleetAutoscaler
metadata:
  name: gameserver-autoscaler
spec:
  fleetName: test-gameserver
  policy:
    type: Buffer
    buffer:
      bufferSize: 2
      minReplicas: 3
      maxReplicas: 100

This gives the following fleet and pods, which is as expected so far:

NAME                            SCHEDULING   DESIRED   CURRENT   ALLOCATED   READY   AGE
gameserver   Packed       3         3         0           0       43d
NAME                                        READY   STATUS    RESTARTS   AGE
gameserver-xkhxz-77mbr   2/2     Running   0          13s
gameserver-xkhxz-t44r9   2/2     Running   0          14s
gameserver-xkhxz-xn4bz   2/2     Running   0          22s

Next, we allocate 2 gameservers. The fleet scales to 4, giving 2 Ready and 2 Allocated gameservers:

NAME                            SCHEDULING   DESIRED   CURRENT   ALLOCATED   READY   AGE
gameserver   Packed       4         4         2           2       43d
NAME                                        READY   STATUS    RESTARTS   AGE
gameserver-xkhxz-77mbr   2/2     Running   0          4m41s
gameserver-xkhxz-g47f7   2/2     Running   0          42s
gameserver-xkhxz-t44r9   2/2     Running   0          4m42s
gameserver-xkhxz-xn4bz   2/2     Running   0          4m50s
NAME                                        STATE       ADDRESS          PORT   NODE                                               AGE
gameserver-xkhxz-77mbr   Allocated   18.181.197.173   7190   ip-10-188-11-12.ap-northeast-1.compute.internal    4m42s
gameserver-xkhxz-g47f7   Ready       3.112.57.82      7233   ip-10-188-36-130.ap-northeast-1.compute.internal   43s
gameserver-xkhxz-t44r9   Allocated   3.112.57.82      7842   ip-10-188-36-130.ap-northeast-1.compute.internal   4m43s
gameserver-xkhxz-xn4bz   Ready       18.181.197.173   7518   ip-10-188-11-12.ap-northeast-1.compute.internal    4m51s

Now, when we shut down one Allocated gameserver, a new gameserver is created and one of the gameservers that was Ready gets deleted, as shown below:

NAME                            SCHEDULING   DESIRED   CURRENT   ALLOCATED   READY   AGE
gameserver   Packed       4         4         1           2       43d
NAME                                        READY   STATUS        RESTARTS   AGE
gameserver-xkhxz-77mbr   2/2     Running       0          4m58s
gameserver-xkhxz-g47f7   2/2     Running       0          59s
gameserver-xkhxz-gm9ll   2/2     Running       0          4s
gameserver-xkhxz-t44r9   2/2     Terminating   0          4m59s
gameserver-xkhxz-xn4bz   2/2     Running       0          5m7s
NAME                                        STATE       ADDRESS          PORT   NODE                                               AGE
gameserver-xkhxz-77mbr   Allocated   18.181.197.173   7190   ip-10-188-11-12.ap-northeast-1.compute.internal    4m59s
gameserver-xkhxz-g47f7   Ready       3.112.57.82      7233   ip-10-188-36-130.ap-northeast-1.compute.internal   60s
gameserver-xkhxz-gm9ll   Scheduled   3.112.57.82      7838   ip-10-188-36-130.ap-northeast-1.compute.internal   5s   <--- new server
gameserver-xkhxz-t44r9   Shutdown    3.112.57.82      7842   ip-10-188-36-130.ap-northeast-1.compute.internal   5m  <--- shutdown server
gameserver-xkhxz-xn4bz   Ready       18.181.197.173   7518   ip-10-188-11-12.ap-northeast-1.compute.internal    5m8s <--- this Ready server gets deleted

gameserver-xkhxz-xn4bz, which was Ready, is then deleted:

NAME                            SCHEDULING   DESIRED   CURRENT   ALLOCATED   READY   AGE
gameserver   Packed       3         3         1           1       43d
NAME                                        READY   STATUS    RESTARTS   AGE
gameserver-xkhxz-77mbr   2/2     Running   0          5m15s
gameserver-xkhxz-g47f7   2/2     Running   0          76s
gameserver-xkhxz-gm9ll   2/2     Running   0          21s
NAME                                        STATE       ADDRESS          PORT   NODE                                               AGE
gameserver-xkhxz-77mbr   Allocated   18.181.197.173   7190   ip-10-188-11-12.ap-northeast-1.compute.internal    5m16s
gameserver-xkhxz-g47f7   Ready       3.112.57.82      7233   ip-10-188-36-130.ap-northeast-1.compute.internal   77s
gameserver-xkhxz-gm9ll   Scheduled   3.112.57.82      7838   ip-10-188-36-130.ap-northeast-1.compute.internal   22s

Some minutes later, the new gameserver has become Ready and the fleet has settled at 1 Allocated and 2 Ready:

NAME                            SCHEDULING   DESIRED   CURRENT   ALLOCATED   READY   AGE
gameserver   Packed       3         3         1           2       43d
NAME                                        READY   STATUS    RESTARTS   AGE
gameserver-xkhxz-77mbr   2/2     Running   0          17m
gameserver-xkhxz-g47f7   2/2     Running   0          13m
gameserver-xkhxz-gm9ll   2/2     Running   0          12m
NAME                                        STATE       ADDRESS          PORT   NODE                                               AGE
gameserver-xkhxz-77mbr   Allocated   18.181.197.173   7190   ip-10-188-11-12.ap-northeast-1.compute.internal    17m
gameserver-xkhxz-g47f7   Ready       3.112.57.82      7233   ip-10-188-36-130.ap-northeast-1.compute.internal   13m
gameserver-xkhxz-gm9ll   Ready       3.112.57.82      7838   ip-10-188-36-130.ap-northeast-1.compute.internal   12m
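
The DESIRED values in the outputs above are consistent with a simple reading of the Buffer policy: desired replicas = allocated + bufferSize, clamped to the range from minReplicas to maxReplicas. The sketch below only models that arithmetic so the numbers can be checked. It is an assumption drawn from the observed output, not the Agones implementation, and `desired_replicas` is a hypothetical helper name.

```python
def desired_replicas(allocated: int, buffer_size: int,
                     min_replicas: int, max_replicas: int) -> int:
    """Hypothetical model of an absolute Buffer policy.

    Keeps `buffer_size` spare gameservers on top of the allocated ones,
    then clamps the result to [min_replicas, max_replicas].
    """
    return max(min_replicas, min(max_replicas, allocated + buffer_size))


# Replay the three stages from the report (bufferSize 2, min 3, max 100).
for allocated in (0, 2, 1):
    desired = desired_replicas(allocated, buffer_size=2,
                               min_replicas=3, max_replicas=100)
    print(f"allocated={allocated} desired={desired}")
```

Under this model, shutting down one Allocated gameserver lowers the desired count from 4 to 3. The replacement gameserver was created while DESIRED was still 4, so once it drops to 3 there is one gameserver too many, and a Ready one is removed.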

What you expected to happen:

When we shut down an Allocated gameserver, the existing Ready gameservers should be kept.

How to reproduce it (as minimally and precisely as possible):

See the steps above.

Anything else we need to know?:

Environment:

  • Agones version: agones-1.13.0
  • Kubernetes version (use kubectl version): v1.19.6-eks-49a6c0
  • Cloud provider or hardware configuration:
  • Install method (yaml/helm): helm
  • Troubleshooting guide log(s):
  • Others:

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 19 (17 by maintainers)

Most upvoted comments

I think this is a reasonable feature 👍🏻 and could be interesting to explore more as a feature.

I think the above feature can mitigate the impact and is worth doing, but it still can’t solve the root problem. Those temporarily created gameservers will cause issues in number-sensitive systems such as billing.

Personally, I’d like to implement a “lazy reconciling” feature. In this mode, reconciling would wait until the gameservers being deleted are fully removed before creating new gameservers. What do you think?
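
One possible reading of the “lazy reconciling” idea, as a minimal sketch: while any gameserver is still being removed, the reconciler creates nothing and waits for the next pass. The function name `servers_to_create`, its parameters, and the `lazy` flag are all hypothetical and exist only for illustration. None of them is an Agones API or setting.

```python
def servers_to_create(desired: int, active: int,
                      terminating: int, lazy: bool) -> int:
    """Hypothetical reconcile step: how many gameservers to create now.

    `active` counts gameservers that are not shutting down;
    `terminating` counts those still being removed.
    """
    if lazy and terminating > 0:
        # Lazy mode: defer creation until deletions have fully completed.
        return 0
    return max(0, desired - active)


# Right after the shutdown in the report: DESIRED is still 4,
# 3 gameservers remain active and 1 is terminating.
print(servers_to_create(desired=4, active=3, terminating=1, lazy=False))
print(servers_to_create(desired=4, active=3, terminating=1, lazy=True))

# Later pass: the autoscaler has lowered DESIRED to 3 and the
# terminating gameserver is gone, so nothing needs to be created.
print(servers_to_create(desired=3, active=3, terminating=0, lazy=True))
```

In the eager case, the first call asks for one new gameserver, which then becomes surplus once the desired count falls. In the lazy case no temporary gameserver is created, at the cost of a slower refill whenever the desired count does not fall.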