Tripwire should be back online and updated to 1.13 - please refresh your browsers. A hard refresh might be required but shouldn’t be.
I re-enabled the ship mass tracking feature for now. If it causes problems again, I might have to disable it.
This worked great for a few days after the reset, but the last 48-72 hours have been hit or miss, mostly miss - seeing a lot of 502/504 errors.
Something is going on with the PHP servers. I have not yet solved the issue as it always seems to happen when I am away/working.
I am monitoring and trying to resolve the issues.
Thank you for your time & effort.
Think I finally got the 502s figured out. A table suddenly needed an index.
It was going pretty well for about 24h and then just up and died.
Maybe the mass tracking?
Making an official public cry for help…
Tripwire 502s are being caused by database issues. The signature/wormhole tables (InnoDB) are becoming locked for long periods of time.
I’ve blindly tried adding indexes and other things to try and fix the issue and have even run out of ideas to try. Tripwire needs a DBA (Database Administrator) who can look at the structure and make professional recommendations on a proper setup to prevent table locks - if anyone is one or knows one, PLEASE HELP!
You can find the entire database structure in the source code here: https://bitbucket.org/daimian/tripwire
Look in /.docker/mysql/tripwire.sql
I cannot and will not provide data to populate a database - this would destroy Tripwire security, so don’t bother asking.
In my opinion there is no way to be immune to table locks by tinkering with the DB itself. You can make it better, but never eliminate it.
Most such problems are coding issues, like starting a read/write transaction and doing a lot of other work before committing it, or demanding exclusive read/write access when you only need to read.
Also, you can probably run queries to check for queued transactions to help track down which transactions are blocking. I only have experience with Postgres, so I can’t tell you what those commands would be in your case.
As a professional programmer with years of experience, I normally have a routine in the admin control panel specifically for recreating indexes.
When I want a new index, I never create it in the database by hand; I run my module, which checks whether an index named IX_TABLENAME_FIELDNAME exists and creates it if it doesn’t. I can rerun or rebuild it if I mess something up.
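A minimal sketch of that check-then-create routine in MySQL - the table, column, and index names are just the Persons example from further down, not Tripwire’s:
-- Hedged sketch: create the index only if it does not already exist.
SELECT COUNT(*) INTO @idx_exists
FROM information_schema.STATISTICS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'Persons'
  AND INDEX_NAME = 'IX_Persons_LastName';
SET @sql = IF(@idx_exists = 0,
              'CREATE INDEX IX_Persons_LastName ON Persons (LastName)',
              'SELECT 1');  -- no-op when the index is already there
PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;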
Anyway, you need to go through your source code, look at what SELECTs you are running, and collect them in a notepad/editor file.
In small to medium datasets (less than 20 GB) I normally get the best results using simple indexes, not composite indexes.
Good but confusing:
CREATE INDEX IX_Persons_LastNameMIX ON Persons (LastName, FirstName);
More readable for me:
CREATE INDEX IX_Persons_LastName ON Persons (LastName);
CREATE INDEX IX_Persons_FirstName ON Persons (FirstName);
And I assume you don’t have CHAR fields, but VARCHAR instead.
InnoDB doesn’t do table locks; it does row locks.
from your db structure: https://bitbucket.org/daimian/tripwire/src/9c04fdbc0fc95db602b973fd26999f5ae94fb07a/.docker/mysql/tripwire.sql#lines-66
MyISAM.
SELECT TABLE_NAME,
ENGINE
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = 'database';
You are going to need to do an ALTER TABLE and fix all these tables so they appropriately use the InnoDB engine rather than MyISAM.
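A conversion would look something like this (hypothetical table name; on a live database you’d want a backup and a quiet window, since the table gets rebuilt):
-- Hedged example: convert one MyISAM table to InnoDB in place.
ALTER TABLE some_myisam_table ENGINE = InnoDB;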
My recommendation would be that you start by looking into the query planner explain command. I use the TSQL equivalent a lot when doing these kinds of optimizations to the eve DB. Unfortunately, I know very little about the details of innoDB, but as an example, consider the following query:
UPDATE signatures SET modifiedByID = 0, modifiedByName = "Tripwire" WHERE lifeLeft < NOW() AND lifeLength <> '0' AND type <> 'wormhole';
When executing that query, you’re hinging on the index being created by this key in the signatures table to avoid a whole table scan:
KEY `lifeLeft, lifeLength` (`lifeLeft`,`lifeLength`),
However, it may be buying you less than you expect. You’re about to wander into the deep magic of implementation-specific db engine index construction techniques.
Here are some good links on the topic:
https://www.sql-server-performance.com/composite-indexes/
https://dev.mysql.com/doc/refman/5.7/en/multiple-column-indexes.html
It’s hard to say exactly what this is doing without looking at the query planner results. However, my guess is that because you also have the “type” in the where clause, the composite index is less useful than you might think. The composite index can be used to rapidly assemble a bunch of rows, but it’s possible that they now need to be merged and scanned to see if they are of type wormhole. It’s possible the innoDB engine is ignoring the index entirely and just doing a full table scan, because it considers the merge-and-filter to be more expensive than the table scan.
My recommendations would be:
- Throw some of these things through the EXPLAIN statement and see if it tells you where the bottlenecks are.
- Try making a triple composite index that goes (type, lifeLength, lifeLeft) and see if it makes the updates and deletes magically faster (a sketch of both recommendations follows below).
Note however that if #2 does magically work you probably want to consider the situation a bit more and not just blindly leave that there. A triple index in that order might make your inserts much slower. Use the query planner!
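To make those two suggestions concrete, here is a minimal sketch - the index name is my own invention, and EXPLAIN on an UPDATE assumes MySQL 5.6.3 or newer:
-- Recommendation 1: ask the planner what it is actually doing with the expiry UPDATE.
EXPLAIN UPDATE signatures SET modifiedByID = 0, modifiedByName = "Tripwire"
WHERE lifeLeft < NOW() AND lifeLength <> '0' AND type <> 'wormhole';
-- Recommendation 2: a composite index covering the WHERE clause columns (hypothetical name).
CREATE INDEX IX_signatures_type_lifeLength_lifeLeft ON signatures (type, lifeLength, lifeLeft);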
Do you have any more details of the query plan and/or what scenario(s) the performance issue occurs?
inserting sigs? editing? etc
I can assure you that the tables causing the issues (the signatures & wormholes tables) are in fact InnoDB and are experiencing table locks, which I thought was not possible because InnoDB is a row-locking engine.
Other tables in Tripwire’s DB are actually MyISAM because they are 99% reads, and I don’t have any issues with table locks on them.
I don’t mind blind suggestions (though yours isn’t completely blind), but I am already making these kinds of blind changes myself. I am really looking for someone who can take the time to deeply look into the DB and figure out what is going on.
Wow, that’s some awesome info - thanks. You actually hinted at exactly something I recently tried and thought had fixed the issue, only to have the issue come back (#2).
The issues are with the signatures & wormholes tables becoming locked. According to the DB logs, the cause seems to be two scheduled events called wormholeClean and wormholeCritical: one removes expired wormholes and their signatures, the other marks wormholes as critical. Here is a sample of the logs:
2020-07-15T17:59:08.370317Z 263 [ERROR] Event Scheduler: [root@localhost][tripwire.wormholeClean] Lock wait timeout exceeded; try restarting transaction
2020-07-15T17:59:08.370367Z 263 [Note] Event Scheduler: [root@localhost].[tripwire.wormholeClean] event execution failed.
2020-07-15T17:59:08.425392Z 422 [ERROR] Event Scheduler: [root@localhost][tripwire.wormholeCritical] Deadlock found when trying to get lock; try restarting transaction
2020-07-15T17:59:08.425421Z 422 [ERROR] Event Scheduler: [root@localhost][tripwire.wormholeCritical] Unknown error
2020-07-15T17:59:08.425427Z 422 [Note] Event Scheduler: [root@localhost].[tripwire.wormholeCritical] event execution failed.
Honestly, I’d much rather have a working Tripwire than have wormholes removed/made crit automatically.
Actually, on that topic, my corp and I quite often notice that wormholes are made crit or even deleted too soon.
Hrmm, I am mad suspicious about those update+innerjoin queries in the wormholeCritical event.
Again, not an expert in innodb and hard to say without the query planner, but according to this gentleman, you might be locking way more of the table rows than you expect: https://stackoverflow.com/questions/6664667/mysql-innodb-locks-on-joined-rows/12251122#12251122
You could try using a WITH clause; it might help hint the query planner to reduce the number of rows locked by divorcing the table scans from the update. Again, the EXPLAIN command will help you figure out whether the change has really made a difference - you should see radically different plans.
Consider this query:
UPDATE signatures s
INNER JOIN wormholes w ON (s.id = initialID OR s.id = secondaryID) AND life = 'stable'
SET modifiedByID = 0, modifiedByName = 'Tripwire', modifiedTime = NOW()
WHERE s.type = 'wormhole' AND lifeLength <> '0' AND DATE_SUB(lifeLeft, INTERVAL 4 HOUR) < NOW();
If we re-write it (I have done this by hand without testing, it may not work without tweaking) so that “wormholes w” is encapsulated in a WITH clause:
WITH w AS (
    SELECT initialID, secondaryID
    FROM wormholes
    WHERE life = 'stable'
)
UPDATE signatures s
SET s.modifiedByID = 0, s.modifiedByName = 'Tripwire', s.modifiedTime = NOW()
WHERE s.type = 'wormhole' AND s.lifeLength <> '0' AND DATE_SUB(s.lifeLeft, INTERVAL 4 HOUR) < NOW()
AND (s.id IN (SELECT initialID FROM w) OR s.id IN (SELECT secondaryID FROM w));
That might speed it up just by allowing the query planner to unlock a bunch of rows that maybe it wasn’t sure about before. However, this re-write helps show something else as well. Notice the condition used to generate the result set w. Its only criterion is that the wormhole must be “stable”. How much of your wormhole table is lifetime “stable”?
Provided you’re culling dead wormholes from the table, I’m guessing a lot of it. Which means you’re having to compare every row in the signatures table against not one but two columns of most of the wormholes you know about in order to tell if it needs to be updated.
The database engine might be smart enough to invert the query and make it a “not in” if that reduces the data more, but that’s implementation dependent and I don’t know how innoDB profiles these things. You could try inverting the condition on w and on that last AND clause to hint to the query planner that there are way more stable wormholes than not, and see if excluding rather than including is faster in the average case.
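A hedged sketch of that excluding form, written with NOT EXISTS to dodge the usual NOT IN/NULL trap. It assumes every wormhole-type signature has a matching wormholes row; if that doesn’t hold, this variant would also touch orphaned signatures, so treat it as a direction to test rather than a drop-in replacement:
UPDATE signatures s
SET s.modifiedByID = 0, s.modifiedByName = 'Tripwire', s.modifiedTime = NOW()
WHERE s.type = 'wormhole'
  AND s.lifeLength <> '0'
  AND DATE_SUB(s.lifeLeft, INTERVAL 4 HOUR) < NOW()
  -- inverted: exclude signatures attached to any non-stable wormhole
  AND NOT EXISTS (
    SELECT 1 FROM wormholes w
    WHERE (w.initialID = s.id OR w.secondaryID = s.id)
      AND w.life <> 'stable'
  );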
With the INNER JOIN strategy, I wonder if perhaps the database can’t “abort early” for rows which don’t meet some conditions of the WHERE clause, and it’s being forced to lock a tonne of rows in both tables that it needn’t, because it has to join first before it can confidently execute the WHERE clause and start excluding signatures that don’t meet the type condition.
Now, you do have indexes on both initialID and secondaryID to help with these issues. But again, I’m not certain they’re doing what you expect. It seems to me that the query planner would have to tunnel all the way down to the leaf node level for every row in the signatures table to be sure it’s not in the result, and it has to do that twice, once for the initial and once for the secondary. At the risk of sounding like a cracked record, throw it through the EXPLAIN statement and see what it’s really doing.
I also noticed the inner join updates yesterday, which I found curious as joins are not quite the norm for updates, and I will have a look into them sometime today as well.
There are two general situations we can look at:
a) known naturally long lived processes, assess if they are taking more locks than absolutely necessary
b) performance problems where the level of locking is appropriate but the performance is detected as sub-optimal
@Daimian_Mercer the two transactions you mentioned are the processes affected by locks; while they may also be the cause, they could just as well be processes affected by ‘something’ else taking up the locks they need.
The mention of deadlocks is interesting, so please try what CCP Bartender is suggesting first to decouple the query, because if we do suspect the same query is causing itself problems, then we just need to go gung-ho on performance tuning and reduce the number of locks held by those statements as much as possible.
The below will help assess whether the same SQL is causing another instance of itself to lock, or something else entirely. If you are able to catch the situation when it occurs, or replicate it as needed, you could try to analyse the root of the lock by using the MySQL equivalent of sp_who:
SHOW FULL PROCESSLIST
It shows all active transactions and their state/status - specifically whether they are waiting for locks to be granted or are the transaction holding the locks. It can also be used to check for transactions lasting longer than appropriate, irrespective of whether they hold locks or not.
Whilst unfortunately it doesn’t tell you explicitly which transaction is blocking another, if there are only a few long-lived active transactions at the time, that information can be derived, along with which transaction is the offender.
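If the InnoDB information_schema tables are available (MySQL 5.7-era; in 8.0 this moved to performance_schema.data_locks), a join across them can name the blocker and the waiter explicitly - something along these lines:
-- Hedged sketch: pair each waiting transaction with the transaction blocking it.
SELECT r.trx_mysql_thread_id AS waiting_thread,
       r.trx_query           AS waiting_query,
       b.trx_mysql_thread_id AS blocking_thread,
       b.trx_query           AS blocking_query
FROM information_schema.INNODB_LOCK_WAITS lw
JOIN information_schema.INNODB_TRX b ON b.trx_id = lw.blocking_trx_id
JOIN information_schema.INNODB_TRX r ON r.trx_id = lw.requesting_trx_id;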
Then you could run EXPLAIN on the statement that we think is causing the locks.
As a reference, however (not sure if this has already been done), the following command shows in-depth detail about deadlocks if they aren’t currently being logged in detail:
SHOW ENGINE INNODB STATUS
Most relevant is the ‘LATEST DETECTED DEADLOCK’
If you want this continuously dumped, then the following needs to be enabled:
set global innodb_print_all_deadlocks=1;
External References:
https://www.xaprb.com/blog/2006/07/31/how-to-analyze-innodb-mysql-locks/
Change this to
lifeLeft < DATE_ADD(NOW(), INTERVAL 4 HOUR)
NOW() is evaluated once at the start of the query, and the execution planner should resolve the right-hand side to a constant. This avoids having to calculate DATE_SUB for every row in the table, and lets the query actually use the index on lifeLeft if it has one.
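Putting that together, the INNER JOIN update from earlier might end up looking something like this (an untested sketch - the point is simply that the indexed column is left bare on the left-hand side of the comparison):
UPDATE signatures s
INNER JOIN wormholes w ON (s.id = w.initialID OR s.id = w.secondaryID) AND w.life = 'stable'
SET s.modifiedByID = 0, s.modifiedByName = 'Tripwire', s.modifiedTime = NOW()
WHERE s.type = 'wormhole'
  AND s.lifeLength <> '0'
  AND s.lifeLeft < DATE_ADD(NOW(), INTERVAL 4 HOUR);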