Posted to server-dev@james.apache.org by "Benoit Tellier (Jira)" <se...@james.apache.org> on 2021/08/06 03:42:00 UTC

[jira] [Created] (JAMES-3626) Setting value of a collection inserts tombstone

Benoit Tellier created JAMES-3626:
-------------------------------------

             Summary: Setting value of a collection inserts tombstone
                 Key: JAMES-3626
                 URL: https://issues.apache.org/jira/browse/JAMES-3626
             Project: James Server
          Issue Type: Improvement
          Components: cassandra
    Affects Versions: master
            Reporter: Benoit Tellier
             Fix For: 3.7.0


https://docs.datastax.com/en/landing_page/doc/landing_page/dataModel.html#dataModel__checkTableStructure


{code:java}
Check use of collection types

When insert or full update of a non-frozen collection occurs, such as replacing the value of the column with another value like UPDATE table SET field = new_value …, Cassandra inserts a tombstone marker to prevent possible overlap with previous data even if data did not previously exist. A large number of tombstones can significantly affect read performance.

When you know that no previous data exists and to prevent creation of tombstones when inserting data into a set or map (or when performing the full update of a set or map), you can use append operation for columns. For example:

CREATE TABLE test.m1 (
id int PRIMARY KEY,
m map<int, text>
);

instead of using:

INSERT INTO test.m1(id, m) VALUES (1, {1:'t1', 2:'t2'}); 

or

UPDATE test.m1 SET m = {1:'t1', 2:'t2'} WHERE id = 1; 

which generate tombstones, execute:

UPDATE test.m1 SET m = m + {1:'t1', 2:'t2'} WHERE id = 1; 

which has the same result, but without tombstone generation.
{code}
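
As a rough illustration of the difference between the two UPDATE forms quoted above, here is a minimal Java sketch that only builds the CQL text with named bind markers. It does not use any Cassandra driver. The class and method names, the userFlags column name and the mailboxId/uid key columns in the WHERE clause are assumptions made for the example, not the actual James schema or code. As the quoted documentation notes, the append form gives the same result only when no previous data exists for the collection.

```java
// Sketch only: builds CQL statement text; it does not connect to Cassandra.
// Table, column and key names passed in are hypothetical placeholders.
final class AppendUpdateSketch {

    // Full overwrite of the collection column. Per the quoted documentation,
    // this form makes Cassandra write a tombstone for a non-frozen collection.
    static String overwriteStatement(String table, String column, String whereClause) {
        return "UPDATE " + table + " SET " + column + " = :" + column
            + " WHERE " + whereClause + ";";
    }

    // Append form: "column = column + :column". Per the quoted documentation,
    // this avoids the tombstone; it is only equivalent to an overwrite when
    // the collection held no previous data.
    static String appendStatement(String table, String column, String whereClause) {
        return "UPDATE " + table + " SET " + column + " = " + column + " + :" + column
            + " WHERE " + whereClause + ";";
    }

    public static void main(String[] args) {
        // Assumed key columns, for illustration only.
        String where = "mailboxId = :mailboxId AND uid = :uid";
        System.out.println(overwriteStatement("imapUidTable", "userFlags", where));
        System.out.println(appendStatement("imapUidTable", "userFlags", where));
    }
}
```

The statements use bind markers rather than inlined values so that they could be prepared once and reused. How they would be prepared and bound depends on the driver version in use and is left out here.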

Affected tables:

 - imapUidTable (use of a non-frozen set for user flags)
      - inserts are done
      - full updates are done
      - This table is critical for JMAP performance. Note that it can explain the ~10% performance hit that occurs when reading mailboxes with recent inserts versus reading mailboxes without recent inserts.

 - messageIdTable (use of a non-frozen set for user flags)
      - inserts are done
      - full updates are done
      - This table is critical for IMAP performance.

 - enqueuedMailsV3 (non-frozen list data types are mistakenly used for RECIPIENTS and PER_RECIPIENT_SPECIFIC_HEADERS, where frozen would be acceptable; a map is used for attributes and could be frozen too)
      - As a temporary measure, we can fix the inserts.
      - This table is critical to the browse start update performance

 - mailRepositoryContentV2 (non-frozen list data types are mistakenly used for RECIPIENTS and PER_RECIPIENT_SPECIFIC_HEADERS, where frozen would be acceptable; a map is used for attributes and could be frozen too)
      - As a temporary measure, we can fix the inserts.
      - This table is not critical


