Super-lag during large fleet battles, node crashes, etc

From what I understand, in EvE these big fights occur on a single machine and run on a single core. While I don’t understand the issues involved in this particular situation (running EvE), I find it hard to believe that this problem isn’t parallelizable, at least to a higher degree than what is currently being achieved.

What’s the main problem they are solving? Is it collision detection and resolution (weapons hitting targets)? Can someone who actually knows what he’s talking about convince me that this, or other problems they are solving, are inherently sequential and can’t be parallelized?

You are far from the first to have that thought. The problem with developing for multiple threads is keeping things in sync. This is especially important here, where you have several thousand players on the same grid who all want to see the same things happen on their screens. Having everything run out of sync is much worse than having everything run slow AF.

And then there’s the fact that the game logic, as far as I know, is written in Python, and developing multi-threaded applications in Python really is no fun at all.

I can imagine that it’s possible to give each grid its own thread, but for one thing this would not help much, as most large battles happen in one grid anyway, and it would likely cause problems when multiple grids start merging.

What definitely can be (and is being) done is to keep as many things as possible that aren’t immediately relevant to flying in space out of that one main thread, and to reduce the calculations this one thread has to do, for example by merging swarms of fighters into a squad that the server now treats as just one entity. But this has limits, of course: many things simply need to be in sync at all times (e.g. ship stats, skill bonuses, hitpoints, damage dealt, target locks, …).

And yes, those things you mention all depend on one another:

How much damage a weapon deals (or if it hits at all) depends on the shooter’s skill bonuses, as well as how fast and in which direction the target moves. This in turn depends on how fast that ship can go, which depends on the skill bonuses of the pilot, and it depends on whether the ship is being bumped, which depends on the speed and position of the bumper, which in turn depends on their skill bonuses. And those values are all modified by the fleet they are in through fleet boosters, and the boost amount again depends on the skills of the booster pilot.

If those things go out of sync on the client side (e.g. when the server simply doesn’t respond in time or starts dropping packets), that is bad, but not catastrophic. Once the server catches up and sends you new data, the situation gets rectified and you finally see your ship explode. However, if the server itself thinks that the target is both dead and alive, because the “imma firin mah laz0rs”-thread thinks it should hit, but the “how many hitpoints do I have”-thread thinks it’s not dead yet because it just got an armor boost, that’s when things start getting really messy.
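
To make the "dead and alive at once" case concrete, here is a minimal Python sketch of the usual fix: route damage and repairs through one lock so every thread sees a consistent ship. The `Ship` class and `apply_delta` method are hypothetical names for illustration only and say nothing about how CCP's server actually works.

```python
import threading


class Ship:
    """Hypothetical ship state, for illustration only."""

    def __init__(self, hp):
        self.hp = hp
        self.alive = True
        self._lock = threading.Lock()

    def apply_delta(self, delta):
        # Damage (negative) and repairs (positive) take the same lock, so the
        # "is it dead?" check and the hitpoint update happen as one step.
        with self._lock:
            if not self.alive:
                return False  # a dead ship takes no further damage or reps
            self.hp += delta
            if self.hp <= 0:
                self.alive = False
            return True
```

The state stays consistent, but every thread touching the same ship now waits for the others, which is exactly the cost being argued about in this thread.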

How is the weather forecast calculated? With a system state matrix and iterations over it, which are literally done on graphics cards. The solution is simple, but CCP would have to hire and pay real developers rather than the “developers” they have today.

I’m not sure if the weather forecast is a good example. On the other hand, if server side calculations in Eve were just as accurate as the weather forecast usually is, that could be quite entertaining.

EVE is a single server with multiple nodes, that much I assume you do understand. Each node normally handles multiple star systems (don’t ask me how many). In times of large fleet fights in a single system they can allocate additional nodes to that one system (a sort of band-aid) to increase the game’s processing power to better handle all the action taking place, which should ease up the TiDi to a certain extent.

You can search around the forums and you will find a few posts with suggestions on how to improve EVE’s responsiveness. Parallelization has been mentioned in those discussions, but there are quite a few limitations: like with any game, there will always be limits to what the hardware can manage, and there is also the matter of EVE’s core systems (legacy code) being old and obsolete compared to today’s standards and not easily replaced.

Either the calculations going on in a single system in a large fleet fight are parallelizable, or they are not. If they are, at least to some extent, then there is no good excuse for the current situation.

The problem is the players. No matter how many people CCP can viably make a system able to hold, the playerbase will intentionally exceed that figure for whatever reason and then complain it’s not working.

Honestly, because hardware has limitations and the playerbase can be trusted about zero, the only real answer is to target the very mechanics that allow alliances to create such massive numbers of capitals without worry. That very problem, and everyone knows it, starts with the massive amount of ore the Rorquals produce. This is what you get when you “give more”: you get ■■■■ on back. The easy solution is to “give less” to get ■■■■ on less.

Hardware does have limitations, but that isn’t the issue here. If the algorithms are parallelizable, then you just throw more hardware at the problem. If they aren’t parallelizable, then it doesn’t matter how much hardware you throw at the problem, because it won’t help.
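
One standard way to put numbers on this is Amdahl's law: if only a fraction of the work can run in parallel, the serial remainder caps the speedup no matter how many cores are added. A small Python sketch follows; the function name and the fractions used below are made up for illustration, not measurements of EVE.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Upper bound on speedup when only part of the work can run in parallel."""
    serial_fraction = 1.0 - parallel_fraction
    # The serial part takes the same time regardless of core count;
    # only the parallel part shrinks as cores are added.
    return 1.0 / (serial_fraction + parallel_fraction / cores)
```

With a perfectly parallel task, `amdahl_speedup(1.0, 8)` gives 8.0. With a task that is 90% parallel, even 1000 cores stay below a 10x speedup, so "parallelizable or not" is really a question of how large the serial part is.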

Too much small, mutually dependent stuff happening when there are thousands of players on grid. Parallelizing that gives you worse performance.

That is exactly the problem parallelization is supposed to solve: If you have a task that can be parallelized perfectly, then you can just keep throwing cores at it, and cores nowadays are a dime a dozen. However, the fact of the matter is that it simply isn’t as easy as Beast of Revelations wants to make it:

It’s either possible to fly a man to Mars or it is not. And we are fairly sure it is, so there is no excuse for not having flown a man to Mars already, I suppose. So, where’s the real life Martian?

At this stage, not everything can just be parallelized, and big companies are throwing lots of money at hardware and software developers to change that. However, those are a whole different caliber than a little niche game developer. The sad fact is that most games still cannot be properly parallelized, and that’s a problem. But keeping parallel threads in sync without having them wait for each other all the time (in which case it’s really no better than a single thread) is really frickin hard, if it’s possible at all.

It’s ultimately impossible for anyone to give an exact answer, unless said person has got first-hand experience with the server code. The code may be close to perfect, in which case the hardware would be too weak. Or the code could be sub-optimal and simply not utilize the hardware to its fullest potential, in which case it would need refactoring, revision or even a rewrite. The answer may be somewhere in between, and we still couldn’t do anything but hope for the best and trust in CCP’s ability to keep giving us our game. Hence it’s all a big cheese wheel…

Mutually dependent stuff can easily be made independent.
For example:

  1. Process all the bonuses
  2. Process all the weapons
  3. Process all energy/shield/armor logistics
  4. Calculate remaining HP of ships
  5. Remove dead ships
  6. Add new ships/pods to grid

This is just an example of how the system can be split into stages, where each stage can be calculated in parallel threads.

But what do you gain by putting all those things in separate threads if you have to keep them in order anyway? It only means that all those threads won’t do anything but wait most of the time, and it will add unnecessary memory overhead.

Without knowing how the processes work I can only speculate, and I think that CCP could very likely do better. But there’s more to it than just “put everything into different threads”.

Hard for who? People who aren’t trained to do it or think that way? And using languages which aren’t designed for it? I will grant you that.

I myself write highly-concurrent systems for a living, employing parallel algorithms which utilize thousands of processes running on many cores and across many machines. But the problems you are solving have to be parallelizable. Not all are, but most of the time when someone says their problem isn’t, it is; they just don’t know what they are doing.

If I had to guess, I suspect most of the server code was written years ago without much thought or regard to parallelization, and now CCP is in “maintenance mode” with this game so they have no desire or intent to rewrite it to “do it right.”

If that list of things has to be in order, then you simply use your thread (or process) pool to tackle that list in parallel, but in order. For instance, all X threads/processes work on step 1, then they all tackle step 2, etc. Naturally, that assumes each step can be parallelized.

I agree, a lot of this may just have to do with the fact that this game is 15 years old and written in Python. But do you really think that just rewriting everything and buying a new GPGPU compute cluster to run the new code on is a viable solution for a performance problem in a silly online internet spaceships game? Even if a game developer is not in maintenance mode, this would be just silly.

You don’t put stages into separate threads.
You execute each stage in multiple threads. As all calculations inside one stage are independent of each other, they can be run in any number of separate threads.
Then you combine the results in one place and move on to the next stage.

Basically this is MapReduce.
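
The stage-by-stage idea described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: `resolve_shot`, `run_tick`, and the data layout (a dict of hitpoints and a list of `(target, base_damage, bonus)` shots) are hypothetical, and real game rules are far more entangled.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def resolve_shot(shot):
    # Map step: the result depends only on this one shot, never on other shots.
    target, base_damage, bonus = shot
    return target, base_damage * bonus


def run_tick(hitpoints, shots, workers=4):
    """Run one simulated tick: resolve shots in parallel, then combine."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() waits for every result, so it acts as the barrier
        # between this stage and the next one.
        resolved = list(pool.map(resolve_shot, shots))

    # Reduce step: combine the per-shot results in one place.
    incoming = defaultdict(float)
    for target, damage in resolved:
        incoming[target] += damage

    # Next stages: compute remaining hitpoints, then remove dead ships.
    remaining = {name: hp - incoming.get(name, 0.0) for name, hp in hitpoints.items()}
    return {name: hp for name, hp in remaining.items() if hp > 0}
```

For example, `run_tick({"a": 100.0, "b": 50.0}, [("a", 30.0, 1.5), ("b", 40.0, 1.25), ("b", 5.0, 1.0)])` leaves only ship `a`, with 55.0 hitpoints. Note that in standard CPython the global interpreter lock keeps threads from running Python code on several cores at once, so a real speedup for CPU-bound stages would need processes (or native code) rather than this thread pool.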

Sure, why not?

I’m afraid I don’t follow.