Difference between revisions of "Relay servers for public posts"


Latest revision as of 13:25, 4 April 2019

»» Note
Code enabling relays to work with diaspora* has been merged into 0.6 version.
»» Note
Discussion regarding this on Discourse (https://discourse.diasporafoundation.org/t/public-post-federation/264).


The problem

Diaspora is an excellent communication platform for sharing ideas and discussing them. Decentralization, however, brings some issues along with its benefits. One of those "make or break" issues is the lack of federation for public posts. Setting up your own pod at the moment doesn't really make sense, since you instantly lose one of the biggest features: tag following. Since posts are generally only delivered to pods where there are participants for that post, lonely pods will generally not receive the bulk of the public posts going around.

This creates a broken network, and this proposal aims to fix that in a lighter way than just pushing all public posts to all pods (which technically is not a good thing to do).

In fact, this proposal doesn't limit itself to diaspora*, but would also serve other federated social networks using the same protocol (e.g. RedMatrix and Friendica).

Concepts of the solution

Relay server

A relay server is a lightweight server app that has only one function - passing of messages from one place to another. In terms of federation message support, the server needs to implement:

  • /.well-known/host-meta to provide discovery for the special relay handle
  • WebFinger or parts of it, to allow querying of the special relay handle
  • hCard for the special relay handle
  • /receive/public to receive public posts. The relay can and should skip verifying author signatures, since receiving pods verify them anyway
  • POST of received messages to /receive/public of other pods

There should not be only one relay server. There should be many relay servers, and anyone in the community should be able to run one. A default relay server should be specified (in the format of a diaspora handle, i.e. relay@defaultrelay.tld) in the core code base so new pods can easily enable the public posts outbound relay feature.

In the future the relay network could be load balanced, if necessary, by having pods exchange lists of working relays between themselves.

Storage

The relay is not required to store messages for any longer than it needs to send them out. The storage should preferably work on a "first in, first out" principle, so something like a list would do well.
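As a rough illustration of the "first in, first out" storage idea, here is a minimal in-memory sketch. The class name OutboundQueue and its methods are made up for this example; a real relay might use a Redis list instead, as the PoC does.

```python
from collections import deque


class OutboundQueue:
    """Hypothetical in-memory FIFO queue of raw payloads awaiting delivery."""

    def __init__(self):
        self._items = deque()

    def push(self, payload: bytes) -> None:
        # New payloads are appended to the right end of the deque.
        self._items.append(payload)

    def pop(self):
        # The oldest payload comes off the left end; None when the queue is empty.
        # Once popped, the queue keeps no copy of the message.
        return self._items.popleft() if self._items else None

    def __len__(self) -> int:
        return len(self._items)
```

Payloads come out in the order they were pushed, and nothing is retained after delivery, which matches the requirement that the relay is not a message store.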

The relay should store the list of active pods locally. The relay should add to this pod list details retrieved from the pods themselves, including the following:

  • Subscription status
  • Scope
  • Tags

Pod configuration and .well-known/x-social-relay

A pod can publish information about itself using the .well-known/x-social-relay file (schema: http://the-federation.info/social-relay/well-known-schema-v1.json). An example:

{
  "subscribe": true,
  "scope": "tags",
  "tags": ["foo", "bar"]
}
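On the relay side, reading such a document could look roughly like the sketch below. The names RelaySubscription and parse_social_relay are hypothetical, and the defaults for missing keys and the lower-casing of tags are assumptions of this example, not something the schema is known to require.

```python
import json
from dataclasses import dataclass, field


@dataclass
class RelaySubscription:
    """Hypothetical record of what one pod wants from the relay."""
    subscribe: bool = False
    scope: str = "all"
    tags: list = field(default_factory=list)


def parse_social_relay(document: str) -> RelaySubscription:
    """Parse the body of a pod's .well-known/x-social-relay file."""
    data = json.loads(document)
    scope = data.get("scope", "all")  # default is an assumption
    if scope not in ("all", "tags"):
        raise ValueError(f"unknown scope: {scope!r}")
    # Tags are lower-cased so later matching is case-insensitive (an assumption).
    tags = [str(tag).lower() for tag in data.get("tags", [])]
    return RelaySubscription(bool(data.get("subscribe", False)), scope, tags)
```

The parsed record holds exactly the three details the relay keeps per pod: subscription status, scope and tags.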

The pod configuration file should have the following available (defaults first):

relay:
  outbound:
    send: false/true
    handle: relay@defaultrelay.tld
  inbound:
    subscribe: false/true
    scope: all/tags
    include_user_tags: false/true
    pod_tags: foo,bar,etc

The relay.inbound configuration controls what the .well-known/x-social-relay file will contain. If scope is tags, then the tags list in the well-known file will be populated as follows:

  • Any tags in pod_tags setting
  • If include_user_tags is true, then all tags followed by active users (active within the last 6 months) will be added to the list

The rationale is that podmins can control whether they want the pod to focus on certain topics, while also letting the users influence the topics.
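The population rules above could be sketched as follows. The function build_social_relay is hypothetical, the list of tags followed by active users is assumed to be supplied by the caller, and leaving the tags list empty when scope is all is an assumption of this example.

```python
def build_social_relay(inbound: dict, user_tags: list) -> dict:
    """Build the x-social-relay document from the relay.inbound settings.

    `inbound` mirrors the relay.inbound keys; `user_tags` is a hypothetical
    list of tags followed by active users, supplied by the caller.
    """
    scope = inbound.get("scope", "all")
    tags = []
    if scope == "tags":
        # pod_tags is a comma-separated string in the configuration file.
        tags = [t.strip() for t in inbound.get("pod_tags", "").split(",") if t.strip()]
        if inbound.get("include_user_tags", False):
            tags.extend(user_tags)
    # Drop duplicates while keeping first-seen order.
    tags = list(dict.fromkeys(tags))
    return {"subscribe": bool(inbound.get("subscribe", False)), "scope": scope, "tags": tags}
```

With pod_tags set to "foo, bar" and users following bar and baz, the resulting tags list is foo, bar, baz.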

List of active pods

Where the relay gets the active pods list doesn't matter; there are several possibilities already, including http://the-federation.info, https://diasp.net/active and http://podupti.me. A relay should be flexible in falling back to other lists when needed. A relay should also cache the list of pods internally. A relay should query the pod list on a regular basis, for example daily, so newly registered pods don't have to wait for a long time.
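A minimal sketch of the fallback and caching behaviour, assuming each pod list source is wrapped in a callable that returns a list of pod hosts. The class PodListCache and its parameters are hypothetical; how each source is actually fetched and parsed is left out.

```python
import time


class PodListCache:
    """Hypothetical cache that tries several pod list sources in order."""

    def __init__(self, sources, max_age=24 * 60 * 60, clock=time.time):
        self.sources = sources    # callables, each returning a list of pod hosts
        self.max_age = max_age    # seconds before a refresh is attempted (daily by default)
        self.clock = clock
        self.pods = []
        self.fetched_at = None

    def get(self):
        now = self.clock()
        if self.fetched_at is not None and now - self.fetched_at < self.max_age:
            return self.pods  # cached list is still fresh
        for source in self.sources:
            try:
                pods = source()
            except Exception:
                continue  # this list is unavailable, fall back to the next one
            if pods:
                self.pods, self.fetched_at = list(pods), now
                break
        # If every source failed, the previously cached list is returned.
        return self.pods
```

Keeping the stale list when all sources fail is a design choice of this sketch: the relay keeps delivering to known pods rather than stopping.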

Diagrams

Data flow

[Image: PublicRelays2_1.png]

Process of post delivery

[Image: PublicRelays2_2.png]

Core code changes required for outbound sending

Any pod wanting to send public posts out to a relay should carbon copy them to a relay. This is the only change that is required. Support in diaspora* for this is in the develop branch, coming out in the 0.6 release.

Phases of post delivery

As per federation requirements, the sending pod first needs to fetch the contact for the relay handle via the normal methods, if it hasn't been retrieved yet. This doesn't require any additional code changes.

  1. Relay contact discovery
  2. Delivery as normal to recipients
  3. Relay saves message to its outbound queue
  4. Relay processes received posts in FIFO order
  5. Relay opens the message, skipping author signature verification, and parses the tags from it
  6. Post delivered to remote pods according to their subscription settings

Note that the relay opens the message only to read the tags from it. It should store the actual message just as a dump of the whole XML payload received, and send that out exactly as received. Tampering with the payload would invalidate the message signatures.
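Steps 5 and 6 could be sketched as below. The functions extract_tags and select_recipients are hypothetical; the post text is assumed to have already been pulled out of the payload, the hashtag pattern is a simplification, and skipping the sending pod is an assumption of this example. The raw payload itself is never touched here, so it can be forwarded exactly as received.

```python
import re

# Simplified hashtag pattern; real tag rules may differ.
TAG_PATTERN = re.compile(r"#(\w+)")


def extract_tags(text: str) -> set:
    """Return the lower-cased hashtags found in the post text."""
    return {tag.lower() for tag in TAG_PATTERN.findall(text)}


def select_recipients(tags: set, pods: dict, sender: str) -> list:
    """Pick the pods that should receive a post carrying `tags`.

    `pods` maps a pod host to its subscription settings, shaped like the
    x-social-relay document (subscribe, scope, tags).
    """
    recipients = []
    for host, sub in sorted(pods.items()):
        if host == sender or not sub.get("subscribe"):
            continue  # skip the sending pod and pods that have not subscribed
        if sub.get("scope") == "all" or tags & {t.lower() for t in sub.get("tags", [])}:
            recipients.append(host)
    return recipients
```

A pod with scope all receives every post, while a pod with scope tags receives a post only when at least one of its tags appears in it.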

What about comments, likes, reshares and other participations?

TBD

Other considerations

Pod database size

Most of the pod database size is actually caused by participations, not posts. Before participations are distributed with the relay system, the diaspora* database size needs to be fixed.

The problem with participations bloating the database is highlighted in this GitHub issue: https://github.com/diaspora/diaspora/issues/4920

Show me the code!

There is some PoC code to back up this proposal.

  • Social-Federation (https://github.com/jaywink/social-federation) is a Python library that aims to abstract multiple federated social networking protocols under it. Currently it only supports the Diaspora protocol, and only the relevant parts of it. Parsing of a status_message message, which covers public posts, is fully implemented.
  • Social-Relay (https://github.com/jaywink/social-relay) is a Python Flask application that aims to cover this proposal, using the social-federation library. It handles receiving public posts and storing them to Redis. It is currently receiving all public posts from the pod iliketoast.net.
  • Change to Diaspora deferred_dispatch worker to make public posts be sent out to a relay (https://github.com/jaywink/diaspora/commit/77aa340cdc0da7372facada3674b9918437debba) [merged

There is an example relay running at https://relay.iliketoast.net.

A relay could be written in any programming language - the only constraint is that it talks the Diaspora protocol.

The way forward

Blockers for 0.6 (all done):

  • Implement missing parts of the social-federation library, meaning webfinger etc
  • Finish up PoC social-relay server to include exposing relay handle, reading pod lists, querying remote pods and delivering posts
  • Submit PR to Diaspora core to include configuration and post mirroring to relays
  • Investigate federation flow regarding participations
  • Lots of testing before 0.6 release. (months of testing have shown that the relay system works and causes no problems to the core)

Non-blockers for 0.6:

  • Document suggestions regarding participations flow
  • Document suggestions regarding relay decentralization
  • Implement participations flow
  • Implement relay decentralization