User:CSammy/Database Ideas

From diaspora* project wiki
Revision as of 17:37, 16 February 2016 by CSammy (talk | contribs)

In this GitHub issue, a discussion started (and continued in issue 6604) on how to make the database smaller. Size, however, is not the only problem the database faces. Speed is another, and it is not inherently linked to the size of the database. A few ideas on how to restructure the database to make it faster follow. For the current database schema, I use [1] as a reference.

Signatures

Signatures appear to make up a large part of the database. Aside from the fact that a good part of the signatures can apparently be discarded anyway, they are not needed for daily operation once the object has been created or received. Splitting the signatures off into a table of their own may therefore improve the performance of the tables that currently have a signature column.

I propose replacing the signature field with an integer(4) that acts as a foreign key into a "signatures" table consisting of the fields "signature_id (int4)" and "signature (text65535)". When a signature is needed, it is unlikely that many are needed at once, which keeps joins cheap. PostgreSQL's range-based index may also be a big help here in keeping the index small (good for keeping it in main memory). This, however, depends on how often signatures are needed, since a range index requires a sequential walk within the addressed block to finally reach the correct entry.
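
The split proposed above can be sketched as follows. This is only an illustration under assumptions, not the actual diaspora* schema: the "comments" table, its columns and the helper functions are hypothetical stand-ins for any table that currently carries a signature column. An in-memory SQLite database is used purely to keep the sketch self-contained, so PostgreSQL's range-based index is not shown.

```python
import sqlite3

# Hypothetical, simplified schema: "comments" stands in for any table that
# currently has a signature column. All names here are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE signatures (
        signature_id INTEGER PRIMARY KEY,
        signature    TEXT NOT NULL
    );
    CREATE TABLE comments (
        id           INTEGER PRIMARY KEY,
        text         TEXT NOT NULL,
        signature_id INTEGER REFERENCES signatures(signature_id)
    );
""")


def add_comment(text, signature):
    """Store the signature in its own table and link it by id."""
    cur = conn.execute(
        "INSERT INTO signatures (signature) VALUES (?)", (signature,)
    )
    cur = conn.execute(
        "INSERT INTO comments (text, signature_id) VALUES (?, ?)",
        (text, cur.lastrowid),
    )
    return cur.lastrowid


def comment_text(comment_id):
    """Daily operation: read the narrow table only, no signature involved."""
    row = conn.execute(
        "SELECT text FROM comments WHERE id = ?", (comment_id,)
    ).fetchone()
    return row[0]


def comment_signature(comment_id):
    """Rare case: fetch a single signature through a one-row join."""
    row = conn.execute(
        "SELECT s.signature FROM comments c "
        "JOIN signatures s ON s.signature_id = c.signature_id "
        "WHERE c.id = ?",
        (comment_id,),
    ).fetchone()
    return row[0]


cid = add_comment("Hello", "c2lnbmF0dXJl")
```

Here comment_text(cid) never touches the signatures table, while comment_signature(cid) pays for the join only in the rare case where a signature is actually needed.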