Regarding the Fleet Fight in X47L-Q

Hello everyone,

Yesterday you took to the battlefield and had yourselves a massive brawl in X47L-Q, with local reaching over 6,000 pilots. These moments are always awe-inspiring, and we get extremely excited to witness these battles and wars unfold. It is our goal to deliver the capability for you to create these moments in ever bigger and more glorious fashion.

I will go over the general circumstances surrounding the issues you experienced yesterday and what we are doing to address them going forward.

The issues that ultimately led to the death of the node containing the system in which the fight took place were unexpected. At this time we do not know what caused them; we are looking into this and into whether they relate to the recent node deaths that preceded it.

Following server downtime a new boundary was pushed when every participant attempted to log in simultaneously while the system was still loading. Battles of this size do not usually span downtime, and with this many participants the ensuing login process was very slow. This was not an unexpected result, but we are looking into how we can better facilitate battles of this size around server downtime.
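
To illustrate what we mean, here is a simplified sketch of draining a post-downtime login queue in small batches rather than admitting everyone at once. This is not our actual server code, and all names in it are made up:

```python
# Simplified sketch only: drain a post-downtime login queue for one solar
# system in small batches, instead of creating thousands of sessions at once
# while the system is still loading. All names here are hypothetical.
import time
from collections import deque

def drain_login_queue(pending_logins: deque, start_session, batch_size=50, pause_s=1.0):
    """Admit queued characters a batch at a time."""
    while pending_logins:
        for _ in range(min(batch_size, len(pending_logins))):
            character_id = pending_logins.popleft()
            start_session(character_id)   # hand the character to the solar system node
        time.sleep(pause_s)               # let the node settle between batches
```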

We noticed there was talk of a rollback in the restoration of the system following the node death. We would like to clarify that this was absolutely not the case: no one at CCP decided to restore the system to a state saved before the service containing it terminated. We had indicators that a node death might be imminent and prepared to bring the system back online. When the node death occurred, the system was restored as is, with the very last saved state we had.

When the outstanding calls on a node begin to pile up, the information that is recorded can sometimes vary. Ship kills and losses are the highest priority, and ship location is often much lower. When the X47 node was killed and the system was placed on a new node, we were using the most up-to-date information available.
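
As a purely illustrative sketch (again, not our actual persistence code), you can think of this as a priority queue of outstanding writes flushed with a limited budget, so kill records drain first and low-priority data such as positions may never be written:

```python
# Illustrative sketch: a priority queue of outstanding writes, flushed with a
# limited budget. Kills/losses drain first; positions may never be written if
# the node dies before the budget allows. Names and priorities are made up.
import heapq

KILL, DOCKING, POSITION = 0, 1, 2      # lower number = higher priority

class PersistenceQueue:
    def __init__(self):
        self._heap = []
        self._seq = 0                  # tie-breaker, keeps FIFO order per priority

    def enqueue(self, priority, record):
        heapq.heappush(self._heap, (priority, self._seq, record))
        self._seq += 1

    def flush(self, write, budget):
        """Write at most `budget` records before time runs out."""
        while self._heap and budget > 0:
            _, _, record = heapq.heappop(self._heap)
            write(record)
            budget -= 1
```

With a tight budget the kill records make it to disk while position updates do not, which is why a restore can end up with accurate losses but stale locations.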

It has been a while since we discussed the method of requesting fleet fight reinforcements; given an issue that manifested yesterday, this seems as good a time as any. When requesting multiple systems to be reinforced, it is paramount that your staging system is not requested unless you intend to fight there. Staging systems naturally have more load, and thus more resources dedicated to them, without you having to request them. Node reinforcements are applied following downtime; if there are more requests than dedicated fleet fight nodes, the requested systems are distributed across those nodes and then resolved manually after downtime. In this instance the automated distribution placed a staging system on the same node as X47L-Q by chance, and this caused some problems.
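
As a rough illustration of why this can happen (a simple round-robin stand-in, not the actual allocation logic), any distribution must double up systems on a node whenever there are more requests than dedicated fleet fight nodes:

```python
# Rough illustration only: with more requests than dedicated fleet fight
# nodes, any simple distribution will put two requested systems on one node.
# Round-robin is used here as a stand-in for the real allocation logic.
def distribute(requested_systems, fleet_fight_nodes):
    assignment = {node: [] for node in fleet_fight_nodes}
    for i, system in enumerate(requested_systems):
        node = fleet_fight_nodes[i % len(fleet_fight_nodes)]
        assignment[node].append(system)
    return assignment

# Four requested systems onto three dedicated nodes: one node carries two
# systems, and if one of them is a staging system you get exactly the problem
# described above. System and node names below are placeholders.
print(distribute(["X47L-Q", "staging-system", "system-3", "system-4"],
                 ["node-1", "node-2", "node-3"]))
```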

In the short term we have, as in previous large conflicts, temporarily dedicated more nodes to serve as fleet fight nodes; these become available after server downtime tomorrow. As you continue to push the boundaries, we continue working to make that possible. We will look into these issues and keep you informed of our progress.

o7

18 Likes

More nodes just means more characters just means more of the same issues.

3 Likes

Umm… restoring to the very last saved state is surely the very definition of a rollback?

3 Likes

If I understood correctly, more dedicated fleet fight nodes means less chance that other systems happen to end up on the same reinforced node, and so less chance of a high-intensity battle affecting another system as well.

Even if a high-intensity battle again manages to crash a node, it wouldn't take another system with it if the node is dedicated to a single system rather than two, which is an improvement.

1 Like

Isn't it only a rollback if you go back to a previous state, before the last one?

Here they just continued the server from where it last saved its state.

9 Likes

Perhaps… but… it's still a rollback from the state the server was at when it died, i.e. ship positions etc. A lot of people who were clear of the Keepstar suddenly found themselves back on it. So, intended or not, it is still, even if inadvertently, a rollback from the natural state!

It could also be argued that by returning the ships to the Keepstar grid, many losses were incurred over time that otherwise would not have been, as I suspect a lot of people logged in as soon as they could once the server gained some stability, thinking they were safe, only to find themselves deposited in a prior position, i.e. where they were located before the node crash/death!

3 Likes

Which you're probably never going to be able to work out. You don't want to write half a state, because that breaks a LOT of things, and you can't rely on the state that immediately led to the failure, because if something about that state caused the failure you would just be restoring a node that was about to implode again.

It's not a true "rollback" in the sense that EVE players commonly use the term.
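
Roughly, the "never write half a state" part usually looks something like this generic sketch (not anything CCP has published): write the snapshot to a temporary file and only swap it in once it is fully on disk, so a node dying mid-write leaves the previous complete save untouched.

```python
# Generic sketch (not CCP's code): write the new snapshot to a temporary file,
# flush it to disk, then atomically swap it over the previous save. A crash
# mid-write leaves the last complete snapshot intact.
import json, os, tempfile

def write_snapshot(path, state):
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(state, f)
            f.flush()
            os.fsync(f.fileno())       # make sure the whole snapshot is on disk
        os.replace(tmp_path, path)     # atomic swap: old save or new, never half
    except BaseException:
        os.unlink(tmp_path)            # discard the partial snapshot
        raise
```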

4 Likes

When the outstanding calls on a node begin to pile up, the information that is recorded can sometimes vary. Ship kills and losses are the highest priority, and ship location is often much lower.

What do people think about de-prioritizing info recorded in a node-death-imminent snapshot? Some ships get away, some don't, some capital ships will live that should have died, etc. Complaints either way.

Thanks for posting this explanation, Paragon.

4 Likes

Could the timing of downtime be changed under such circumstances? For example, if you know that there is likely to be a massive fight around 11:00 Eve, shift downtime back to 23:00 Eve?

1 Like

Maybe there should be loading nodes that accept logins but hold the characters in stasis for X amount of time while the system fills with characters.

2 Likes

I see how node death is not a rollback. But can I suggest that in the future there is a "save" with consistent time data? Perhaps further back than the currently working set, which you can restore from?

The issue with the restore in this case was that everything in the game was restored at different stages.

For instance, drone paths that showed our previous escape from the Keepstar were still present; however, people were landing at 0.

Ships that had docked were in the correct station after the restart; however, if you were in space on tether at the same structure, you were suddenly teleported to your pre-downtime location.

All this points towards a need for some validation of the save before it is restored to make sure time stamps match for each of the datatypes stored.
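
Something like this is what I mean by validation (purely illustrative, not CCP's data model): refuse to restore unless every stored datatype was written at the same save sequence, and otherwise fall back to the newest save where they all agree.

```python
# Purely illustrative, not CCP's data model: a save is consistent only if every
# datatype (positions, docking state, drones, ...) was written at the same
# save sequence; otherwise fall back to the newest save where they all agree.
def validate_save(save):
    """save maps datatype name -> {'seq': save sequence number, 'data': ...}"""
    sequences = {part["seq"] for part in save.values()}
    return len(sequences) == 1         # consistent only if all parts match

def pick_restorable(saves):
    """saves is ordered newest-first; return the newest consistent one, if any."""
    for save in saves:
        if validate_save(save):
            return save
    return None
```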

The split in this data caused mass confusion on the third server restart, as the half of the fleet that managed to dock was unable to provide any help to the other half of the pilots, who had suddenly respawned at 0. It gave a major advantage to the defenders in this case, as none of them had to warp anywhere, and if they had docked they could just undock at 0 again.

We had to escape twice…

I understand that this kind of player count is unprecedented and can cause issues, but in this case a failure of your systems not only caused a fight to go badly from a gameplay POV; data was also lost and a restore was implemented unfairly (whether intentional or not).

This needs to be fixed or at least managed next time with a proper restore, with a time indicated so your players are at least informed as to where they will log in.

Any reason why the servers buckled under the 2017 V-3YG7 fight (some modules ended up going offline; one emergency DC kept cycling and draining its ship of cap instead of burning out, and it kept cycling even when cap was gone) but not the 2020 FWST-8 one, even though the latter had more toons on grid involved?

How about the participants working with or around known limitations? Stop making your problems someone else's problems.

5 Likes

It is not my problem. It is a problem for the thousands of people who were engaging in the fight, for CCP, and for Eve in general. It was rather ironic that CCP issued a Facebook post about the ongoing fight with over 6,000 capsuleers involved and how fantastic Eve was, and very shortly afterwards the node died. These huge fights clearly put a huge load on the servers, and it would appear that there are particular problems when they span downtime. If that pressure can be eased by occasionally shifting downtime by a few hours, so that those involved in these headline-creating fights have a more enjoyable experience, surely that is to everyone's benefit.

It doesn't matter how much CCP increases server performance, because it's not just that; it's also latency, which is simply a reality when you have thousands of people from all over the world all interacting with each other at the same moment.

If CCP doubled server performance, next time there's a fight both sides would just bring 9,000 people each and create the same problem again, because they're doing it on purpose.

6 Likes

I guess this just shows how dangerous the automated assignment of nodes using the fleet fight tool is!

On Wednesday we have 4 Keepstars coming out of reinforced mode, all of which span downtime. Given that there are a lot of potential fight systems (one of which is already a staging system) so close to downtime, I'd hope you can manually intervene to ensure the nodes are distributed correctly; otherwise this could be a huge mess waiting to happen!

2023-03-15 10:53:25: X47L-Q
2023-03-15 10:54:36: 5ZXX-K
2023-03-15 11:22:57: F-NMX6
2023-03-15 12:24:21: ROIR-Y

As a developer, I would be really curious and enthusiastic if you could write an article as technically detailed as the one that introduced Tranquility Tech IV.

For example, explaining why doubling the server performance does not mean doubling the number of supported players could be insightful for many of us.
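
The usual back-of-the-envelope argument (not CCP's numbers, just an illustration) is that the work grows with the number of pairs of pilots that can interact, not with the number of pilots:

```python
# Not CCP's numbers, just the standard argument: work in a big fight scales
# with the number of *pairs* of pilots that can interact, not with the number
# of pilots, so doubling server performance doesn't double the supported fleet.
def pairwise_interactions(pilots):
    return pilots * (pilots - 1) // 2

for n in (3000, 6000, 12000):
    print(n, "pilots ->", pairwise_interactions(n), "potential interactions")
# Doubling from 3,000 to 6,000 pilots roughly quadruples the interaction count.
```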

And of course, explaining how the admin team managed the battle from their side and what their challenges were would be great.

4 Likes

The server is the source of truth, not the state of your client.

Take gate bombing, for example: your client can be in the next system, but the server kills you in the system you were jumping out of.

It's not ideal, but it is what it is.

3 Likes

I agree! I'm curious too; to do a good job, it's not enough to eat a pepito… well, doing a good job also requires going to work!