User:Jaywink/Central hub

Introduction

Diaspora* is decentralized, and it is important that we keep it that way. This means the end product that we as a project build (the diaspora* server software) needs to work in a decentralized way. Each pod must be an independent entity, not restricted by the others.

However, some things cannot be decentralized, and there is no sense in trying to decentralize them. Examples are the code repository (GitHub) and the project home page (having many would be confusing). The licenses also need to be owned by a single entity.

The same logic applies to this proposal for a Central Hub. It is needed for many reasons, and for many reasons it has to be centralized. The information in it must be true, and it must be authoritative over any conflicting information. At the same time, registering with the central hub MUST NOT be required in order to use the diaspora* server software. It is just one extra resource the project uses to help us build our decentralized social network.

What would it do?

The central hub would collect and share information. What information it keeps and shares needs to be defined, but the important aspect is that all of it should be public, for transparency. Any information that is not public should not be in the data schema of the central hub. The information should of course relate to the diaspora* social network as a whole: anything the network needs to function better and grow. Sharing the information should also always be opt-in, and pods should be able to register and disconnect at any time.

Information would be provided to anyone who asks for it, over a REST interface and some report pages on the project site.

What information do we need there and how will having it benefit the network?

The following is what I have in mind that we could gather from participating pods:

Name of pod
URL of pod
Registrations open / closed
Version
TOS (when implemented)
Number of local users
Number of local users active in the last 6 months
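As a rough illustration of the list above, the per-pod record could look something like the following Python sketch. The class name, field names and types are assumptions made for illustration only: the proposal does not define a schema, and Python is used here purely for brevity, not as the proposed stack.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class PodRecord:
    """One pod's public entry in the hub (hypothetical schema)."""
    name: str
    url: str
    registrations_open: bool
    version: str
    tos: Optional[str] = None                    # TOS, once implemented
    local_users: Optional[int] = None            # None when not reported
    local_active_users_6m: Optional[int] = None  # active in the last 6 months


# Placeholder values, not real pod data.
example = PodRecord(
    name="Example pod",
    url="https://pod.example.org",
    registrations_open=True,
    version="0.0.0",
    local_users=120,
)
public_view = asdict(example)  # everything stored is meant to be public
```

Making the two user counts optional leaves room for pods that choose not to report them.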

While much of this is already available from podupti.me and diapod.net, those lists cannot be used for some purposes since they are not integrated into the pods programmatically. We need authoritative data that we can use to make our network work better and to tell people that the diaspora* network is alive.

While some technical proposals (like Relay servers for public posts and Tag aggregation) require or benefit from this kind of central hub, I think the real benefit is visibility into the network itself.

We need to know how we are doing. We need to know whether our user base is growing or shrinking. We need to be able to tell the press that our network is growing, since most of the internet thinks diaspora* died a long time ago.

The most important thing here is that the central hub stays opt-in and that pods can remove themselves from it at any time.

Technical

Code stack

I propose using a MEAN stack, since it suits this case well: we mainly need a capable REST server plus storage for pod information. The views coming out of the app will be mostly informational, and they don't necessarily even need to be served by this application. I would also like diaspora* as a project not to be totally Ruby-dependent, so that it can attract developers from all kinds of technologies.

If this proposal is accepted, I am willing to do a large part, or all, of the work required to implement the application side of this (if someone helps on the diaspora* server side).

The code should of course live on GitHub under the diaspora* organization and should be licensed under AGPLv3.

Architecture

We need a server to host the application. I would be honored to host it until we have truly project-owned resources.

REST endpoints

/register Pods call this initially when they want to register. The central hub then calls back to the pod to check that it really is a pod.

/remove Pods can use this to remove themselves from the central hub. The central hub verifies the request by calling the pod, and only then performs the removal.

/get/<information>?params REST endpoint to retrieve information from the central hub. <information> is the sub type of information requested, and params are filters that narrow down the amount of information returned.

/export Endpoint to get a full data export from the hub. In a disaster, this export would be imported into a new hub. Since many community members would be subscribing to this archive, no one needs to trust the server admin.
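The behaviour of the four endpoints can be sketched with a small in-memory model. This is only an illustration under stated assumptions: the Hub class, the Verifier callback (standing in for the HTTP call back to the pod), the equality-based filtering and the JSON export format are all hypothetical, and the <information> sub type of /get is left out. Python is used for brevity, not as the proposed stack.

```python
import json
from typing import Callable, Dict, List

# Stand-in for the HTTP call the hub makes back to a pod. It receives the pod
# URL and the action to confirm, and returns True if the pod confirms it.
Verifier = Callable[[str, str], bool]


class Hub:
    """In-memory model of the four endpoints (hypothetical, no HTTP layer)."""

    def __init__(self, verify: Verifier) -> None:
        self._verify = verify
        self.pods: Dict[str, dict] = {}  # pod URL -> public pod information

    def register(self, pod_url: str) -> bool:
        """/register: add the pod only if the callback confirms it."""
        if not self._verify(pod_url, "register"):
            return False
        self.pods.setdefault(pod_url, {"url": pod_url})
        return True

    def remove(self, pod_url: str) -> bool:
        """/remove: delete the pod only if the callback confirms it."""
        if pod_url not in self.pods or not self._verify(pod_url, "remove"):
            return False
        del self.pods[pod_url]
        return True

    def get(self, params: Dict[str, str]) -> List[dict]:
        """/get: return pods whose fields match every query parameter.

        Query values arrive as text, so stored values are compared by their
        lowercase string form ("true" matches True).
        """
        return [
            pod for pod in self.pods.values()
            if all(k in pod and str(pod[k]).lower() == v.lower()
                   for k, v in params.items())
        ]

    def export(self) -> str:
        """/export: serialize all hub data; sorted keys keep archives diffable."""
        return json.dumps({"pods": self.pods}, sort_keys=True)

    @classmethod
    def from_export(cls, archive: str, verify: Verifier) -> "Hub":
        """Rebuild a replacement hub from a previously saved export."""
        hub = cls(verify)
        hub.pods = json.loads(archive)["pods"]
        return hub


# Usage with a fake verifier that only confirms one known pod.
trust_one = lambda url, action: url == "https://pod.example.org"
hub = Hub(trust_one)
hub.register("https://pod.example.org")    # accepted
hub.register("https://not-a-pod.example")  # rejected, the callback fails
restored = Hub.from_export(hub.export(), trust_one)
```

Injecting the verifier keeps the sketch testable without a network. A real implementation would replace it with a request to the pod's verification endpoint.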

Operations

The hub will periodically call each pod in its database to retrieve information. If the pod declines to give the information a certain number of times, or is down for a certain amount of time, it will be removed or marked inactive, depending on the situation.
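One possible way to track this per pod is sketched below in Python. The thresholds, the names, and the mapping of outcomes to results (repeated declines lead to removal, repeated downtime leads to inactive) are assumptions for illustration; the proposal leaves these details open. Downtime is also counted in polls here rather than in elapsed time, to keep the sketch short.

```python
from dataclasses import dataclass

MAX_DECLINES = 3    # assumed: declines tolerated before removal
MAX_DOWN_POLLS = 5  # assumed: unreachable polls tolerated before marking inactive


@dataclass
class PollState:
    declines: int = 0
    down_polls: int = 0
    status: str = "active"  # "active", "inactive" or "removed"


def record_poll(state: PollState, outcome: str) -> PollState:
    """Update a pod's state after one poll; outcome is "ok", "declined" or "down"."""
    if state.status == "removed":
        return state  # removed pods stay removed until they register again
    if outcome == "ok":
        state.declines = 0
        state.down_polls = 0
        state.status = "active"
    elif outcome == "declined":
        state.declines += 1
        if state.declines >= MAX_DECLINES:
            state.status = "removed"
    elif outcome == "down":
        state.down_polls += 1
        if state.down_polls >= MAX_DOWN_POLLS:
            state.status = "inactive"
    else:
        raise ValueError(f"unknown poll outcome: {outcome}")
    return state


state = PollState()
for outcome in ["down"] * MAX_DOWN_POLLS:
    record_poll(state, outcome)
# state.status is now "inactive"; a later "ok" poll makes the pod active again
```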

Pod changes

Pods will need a new configuration section with the following settings (or similar):

  central_hub:
    register: false  # boolean, false by default
    hub_url: 'https://hub.diasporafoundation.org'  # included just to avoid hardcoding it; a podmin shouldn't normally change it
    hide:
      localuserscount: false  # boolean, false by default
      localactiveuserscount: false  # boolean, false by default

The pod should keep track in its database of whether it is currently registered, and when this changes it should call the central hub to request the change.
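Read one way, this means comparing the register setting from the configuration with the registration state stored in the database. The following Python sketch shows that comparison. The function name and the idea of returning the endpoint to call are assumptions for illustration, not part of the proposal.

```python
from typing import Optional


def pending_hub_call(config_register: bool, currently_registered: bool) -> Optional[str]:
    """Return the hub endpoint the pod should call, or None if nothing changed."""
    if config_register and not currently_registered:
        return "/register"
    if currently_registered and not config_register:
        return "/remove"
    return None
```

For example, a podmin switching register to true on an unregistered pod yields "/register", while an unchanged setting yields None and no call is made.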

The pod needs to implement an endpoint that the central hub can call to retrieve information and to verify requests for removal.
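A minimal Python sketch of how such an endpoint might assemble its answer while honoring the hide settings follows. It assumes the keys under hide match the names of the reported fields; the function name and the sample values are hypothetical.

```python
from typing import Dict


def build_pod_info(stats: Dict[str, object], hide: Dict[str, bool]) -> Dict[str, object]:
    """Return the data the pod reports, leaving out any field the podmin hides."""
    hidden = {key for key, is_hidden in hide.items() if is_hidden}
    return {key: value for key, value in stats.items() if key not in hidden}


# Placeholder values, not real pod data.
stats = {
    "name": "Example pod",
    "url": "https://pod.example.org",
    "registrations_open": True,
    "localuserscount": 120,
    "localactiveuserscount": 45,
}
info = build_pod_info(stats, hide={"localuserscount": False, "localactiveuserscount": True})
```

Omitting a hidden field entirely, rather than sending a zero, lets the hub tell "not shared" apart from "no users".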

Security

Since all the information is public and opt-in, is voluntarily shared by pods, and doesn't include any personal information about users themselves, there is little need to restrict calls to the central hub REST endpoints or to the pod interfaces directly. This would allow someone to build their own version if they want and query pods directly. We can of course make this configurable, with something like "allow_non_hub_calls" if we want, and then make the pod reject calls not coming from the correct source (domain name).
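The pod-side check could be as simple as the Python sketch below. It assumes the pod has already determined the caller's host name by some means, which is not covered here, and the function and parameter names are hypothetical.

```python
from urllib.parse import urlsplit


def is_call_allowed(caller_host: str, hub_url: str, allow_non_hub_calls: bool) -> bool:
    """Decide whether the pod answers a request coming from caller_host."""
    if allow_non_hub_calls:
        return True
    # Compare against the host part of the configured hub URL only.
    hub_host = urlsplit(hub_url).hostname or ""
    return caller_host.lower() == hub_host
```

With allow_non_hub_calls set to false, only a caller identified as hub.diasporafoundation.org would be answered when hub_url has its default value.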