Relay servers for public posts
The problem
Diaspora is an excellent communication platform for sharing and discussing ideas. Decentralization, however, brings issues along with its benefits. One of those "make or break" issues is the lack of federation for public posts. Setting up your own pod at the moment doesn't really make sense, since you instantly lose one of the biggest features: tag following. Because posts are generally delivered only to pods that have participants for that post, lonely pods will not receive the bulk of the public posts going around.
This creates a broken network. This proposal aims to fix that in a lighter way than just pushing all public posts to all pods (which technically is not a good thing to do).
In fact, this proposal doesn't limit itself to diaspora*, but would also serve other federated social networks using the same protocol (e.g. RedMatrix and Friendica).
Concepts of the solution
Relay server
A relay server is a lightweight server app that has only one function - passing of messages from one place to another. In terms of federation message support, the server needs to implement:
- /.well-known/host-meta to provide discovery for the special relay handle
- WebFinger or parts of it, to allow querying of the special relay handle
- hCard for the special relay handle
- /receive/public to receive public posts. The relay can and should skip verification of author signatures, which will be done by receiving pods anyway
- POST of received messages to /receive/public of other pods
There should not be only one relay server. There should be many relay servers, and anyone in the community should be able to run one. A default relay server should be specified in the core code base (in the format of a diaspora handle, e.g. relay@defaultrelay.tld) so new pods can easily enable the public posts outbound relay feature.
Storage
The relay is not required to store messages any longer than it needs to send them out. The storage should preferably work on a "first in, first out" principle, so something like a list would do well.
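As a minimal sketch of the "first in, first out" storage described above, the following uses an in-memory queue. The class name `OutboundQueue` and its methods are hypothetical and not part of any existing relay code. A real relay might prefer a persistent list so that queued messages survive a restart.

```python
from collections import deque


class OutboundQueue:
    """Hypothetical in-memory FIFO buffer of messages waiting to be relayed."""

    def __init__(self):
        self._messages = deque()

    def push(self, payload):
        # Newly received messages go to the back of the queue.
        self._messages.append(payload)

    def pop(self):
        # The oldest message leaves first; None signals an empty queue.
        if not self._messages:
            return None
        return self._messages.popleft()

    def __len__(self):
        return len(self._messages)
```

Once a message has been popped and sent, nothing of it remains in the queue, which matches the requirement that the relay does not keep messages longer than needed.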
The relay should store the list of active pods locally. The relay should add to this pod list details retrieved from the pods themselves, including the following:
- Subscription status
- Scope
- Tags
Pod configuration and statistics.json / NodeInfo
A pod can publish information about itself in the current statistics.json route, which in the future will be replaced by NodeInfo. The attributes related to relays are as follows, in JSON format (with the first possible value as default):
"relay": {
  "subscribe": (false/true),
  "scope": "(all/tags)",
  "tags": []
}
The pod configuration file should have the following available (defaults first):
relay:
  send_outbound: false/true
  outbound_handle: relay@defaultrelay.tld
  subscribe: false/true
  scope: all/tags
  tags_source: users/podmin
The tags_source setting governs how tags is populated in the statistics.json / NodeInfo output.
- users -- All tags followed by active users (active within the last 6 months) are added to the list
- podmin -- The list of tags is maintained by the podmin in the admin menu (needs a database table). Keeping it in the admin menu avoids requiring constant restarts.
The rationale is that podmins can choose whether the pod focuses on certain topics or lets the users influence the topics.
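To illustrate how a relay might read the relay attributes that a pod publishes in statistics.json, here is a rough sketch that applies the defaults listed above (subscribe false, scope all, empty tags). The function name `relay_preferences` is hypothetical. Treating an unknown scope value as "all" is an assumption, since the proposal does not say how invalid values should be handled.

```python
import json


def relay_preferences(document):
    """Extract relay settings from a statistics.json body, applying defaults."""
    data = json.loads(document)
    relay = data.get("relay") if isinstance(data, dict) else None
    if not isinstance(relay, dict):
        relay = {}  # pod publishes no relay section: fall back to defaults

    scope = relay.get("scope", "all")
    if scope not in ("all", "tags"):
        scope = "all"  # assumption: unknown scopes fall back to the default

    return {
        "subscribe": bool(relay.get("subscribe", False)),
        "scope": scope,
        "tags": list(relay.get("tags", [])),
    }
```

A pod that publishes no relay section ends up unsubscribed, so the relay sends it nothing unless the podmin opts in.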
List of active pods
Where the relay gets the active pods list doesn't matter. There are several possibilities already, including http://the-federation.info, https://diasp.net/active and http://podupti.me. A relay should be flexible in falling back to other lists when needed. A relay should also cache the list of pods internally. A relay should query the pod list on a regular basis, for example daily, so newly registered pods don't have to wait for a long time.
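The fallback and caching behaviour described above could be sketched roughly as follows. The class `PodListCache` is hypothetical, and each source is assumed to be a plain callable that returns a list of pod hosts, since the response formats of the listed sites are not specified here. Serving the stale list when every source fails is also an assumption.

```python
import time


class PodListCache:
    """Hypothetical cache that refreshes the pod list from fallback sources."""

    def __init__(self, sources, max_age_seconds=24 * 60 * 60, clock=time.time):
        self._sources = list(sources)  # callables, tried in order
        self._max_age = max_age_seconds  # daily refresh by default
        self._clock = clock
        self._pods = None
        self._fetched_at = None

    def get(self):
        now = self._clock()
        if self._pods is not None and now - self._fetched_at < self._max_age:
            return self._pods  # cached list is still fresh

        for fetch in self._sources:
            try:
                pods = list(fetch())
            except Exception:
                continue  # this source is unavailable, try the next one
            if pods:
                self._pods = pods
                self._fetched_at = now
                return pods

        # Assumption: if every source failed, keep serving the stale list.
        return self._pods or []
```

Passing the sources in as callables keeps the relay independent of any single pod directory, in line with the point that the origin of the list doesn't matter.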
Diagrams
Data flow
Process of post delivery
Core code changes required for outbound sending
Diaspora already includes a kind of "carbon copy" possibility in Postzord. If outbound relay has been activated, a code change will add the relay handle as a recipient to each public post. Working POC code already exists here (with a hard-coded, locally created relay Person object id).
Phases of post delivery
As per federation requirements, the sending pod first needs to fetch the contact of the relay handle via normal methods, if it hasn't been retrieved yet. This doesn't require any additional code changes.
1. Relay contact discovery
2. Delivery as normal to recipients
3. Relay saves message to its outbound queue
4. Relay analyses received posts FIFO
5. Relay opens message, skipping verification of author signatures, and parses tags from the message
6. Post delivered to remote pods according to their subscription settings
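Steps 5 and 6 above can be sketched as follows, under several assumptions. The tag pattern is a simplification and not diaspora*'s actual tag syntax, the function names are hypothetical, and skipping the sending pod as a target is an assumption rather than something the proposal states. The subscription data is assumed to have the subscribe, scope and tags fields described earlier.

```python
import re

# Simplified tag pattern (assumption): "#" followed by word characters.
TAG_PATTERN = re.compile(r"(?<!\w)#(\w+)")


def parse_tags(text):
    """Return the lower-cased set of tags found in a post body."""
    return {tag.lower() for tag in TAG_PATTERN.findall(text)}


def select_targets(post_text, pods, sender_host=None):
    """Pick the pods that should receive a post, given their subscriptions.

    `pods` maps a pod host to its relay preferences.
    """
    post_tags = parse_tags(post_text)
    targets = []
    for host, prefs in pods.items():
        if host == sender_host or not prefs.get("subscribe"):
            continue  # never echo back to the sender; skip unsubscribed pods
        if prefs.get("scope", "all") == "all":
            targets.append(host)
        elif post_tags & {tag.lower() for tag in prefs.get("tags", [])}:
            targets.append(host)  # at least one subscribed tag matches
    return targets
```

For example, a pod with scope "tags" and tags ["Linux"] would receive a post containing "#linux", while a pod subscribed only to "cats" would not.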
The spec still needs to be thought through to make sure that everything flows to pods that have no other contact with the post's originating pod. If necessary, relays should locally cache the following data for each post they deliver:
[
  ....
  { "post_guid": ["list", "of", "pods"] },
  ....
]
Pods should then always deliver comments and other participations for public posts to the relay as well, which checks against its database which pods, if any, to forward them to.
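A rough sketch of this first option: the relay records which pods received each post and looks that up when a participation arrives. The class `DeliveryLog` and its method names are hypothetical, and excluding the pod that sent the participation is an assumption.

```python
from collections import defaultdict


class DeliveryLog:
    """Hypothetical record of which pods each relayed post was delivered to."""

    def __init__(self):
        self._pods_by_post = defaultdict(set)

    def record(self, post_guid, pods):
        # Called after a post has been delivered to the given pods.
        self._pods_by_post[post_guid].update(pods)

    def targets_for(self, post_guid, exclude=None):
        # Pods that should receive a participation for this post.
        # Assumption: the pod that sent the participation is excluded.
        pods = self._pods_by_post.get(post_guid, set())
        return sorted(pods - {exclude})
```

Unlike the message queue, this data has to be kept for as long as participations are expected, which is part of the extra weight this option would add.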
The other option would be to have receiving pods ping back when they receive a public post from a relay. This would be to create the necessary participation on the originating pod to allow comments etc to federate.
Either of these options would increase the weight that the relay system puts on all pods. The federation flow first needs to be checked to find the smallest possible change that ensures delivery of participations to all participants.
Other considerations
Pod database size
Most of the pod database size is actually caused by participations, not posts. So unless the participations problem above is solved, pod databases shouldn't be affected that much. Most of the additional weight will be on Sidekiq, and that weight can be partly controlled by pod subscription settings.
However, some solution to the way participations bloat the database should be considered, so that the number of participations can grow through the relay system and information can flow where it needs to go.