The Content Service

The Content Service Interface

Concepts

Before diving into how clients will connect to and use the content service, we need to discuss how we will tackle this service conceptually. We have to support managing scenes and profiles, and these have a few similarities that we can generalize.

Entities

The API will be built around the concept of entities. An entity is an instance of one of the types described above (scenes and profiles). As such, an entity will have the following properties:

  • A type (either a scene or profile)
  • Content (files that are referenced or used by the entity)
  • Metadata (for example, in the case of profiles, it could be a description)
  • Pointers (indices that reference the entity)

It is important to point out that if any of these properties were to change, we would consider the updated version to be a completely new entity.
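As a rough sketch, the entity described above could be modeled like this (all names here are illustrative assumptions, not the service’s actual wire format):

```typescript
// Illustrative entity model; names are assumptions, not the actual wire format.
type EntityType = "scene" | "profile";

interface ContentMapping {
  file: string; // file name referenced by the entity
  hash: string; // hash of the file's content
}

interface Entity {
  id: string;                // hash of the entity's definition file
  type: EntityType;
  content: ContentMapping[]; // files referenced or used by the entity
  metadata: unknown;         // free-form, depends on the entity type
  pointers: string[];        // indices that reference this entity
  timestamp: number;         // epoch time of creation
}

// Entities are immutable: an "update" is simply a brand new entity.
function updateMetadata(entity: Entity, newId: string, metadata: unknown): Entity {
  return { ...entity, id: newId, metadata };
}
```

Because an “update” produces a new entity, the old one is never mutated; only the pointers move.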

Pointers

Since entities are immutable, and every time an update happens a new entity is created, we add the concept of pointers to be able to easily search for and access specific entities. Each pointer can only reference one entity (this will be enforced by the server), but an entity can be referenced by many different pointers.

Finally, it is important to mention that depending on the entity’s type, a specific pointer structure might be considered valid, while others might fail such validation.

Example

Let’s look at an example to better understand these concepts. Say we have an entity of type scene called Scene1. This entity has some metadata, such as a name, and some content files that will be rendered in the Metaverse. Pointers for the scene type have the form “X,Y”, and we know them as parcels.

Note that we can’t have a parcel point to two different scenes, but we can have a scene referenced by many parcels. It is also important to mention that each of these pointers (in this case, parcels) has an owner, who determines which entity the pointer references.

This idea can also be extended to the other entities. Take profiles for example, where an ETH address will point to a specific profile.
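The per-type pointer validation mentioned above could be sketched as follows. The exact formats are assumptions based on this document: parcels are “X,Y” with integer coordinates, and profile pointers are ETH addresses:

```typescript
// Sketch of per-type pointer validation. The formats are taken from this
// document: parcels are "X,Y" (integers), profile pointers are ETH addresses.
function isValidPointer(type: "scene" | "profile", pointer: string): boolean {
  if (type === "scene") {
    // A parcel: two integers separated by a comma, e.g. "10,-5"
    return /^-?\d+,-?\d+$/.test(pointer);
  }
  // A profile pointer: an Ethereum address (0x followed by 40 hex chars)
  return /^0x[0-9a-fA-F]{40}$/.test(pointer);
}
```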

Content Retrieval

Now that we have established what entities and pointers are, let’s look at how the API would work.

Entities Endpoint

Entity Querying

The /entities endpoint can be queried in two different ways:

  • By list of pointers
  • By list of entity ids

On the server side, we will need to make sure that one (and only one) of these options is used.
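This exactly-one-option check could look like the following sketch (parameter names are assumptions):

```typescript
// Sketch of the server-side check that one (and only one) query option
// is used. Parameter names are assumptions.
function validateEntityQuery(pointers: string[], ids: string[]): void {
  const byPointers = pointers.length > 0;
  const byIds = ids.length > 0;
  if (byPointers === byIds) {
    // Either both options were used, or neither was.
    throw new Error("Exactly one of 'pointer' or 'id' must be provided");
  }
}
```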

Example:

Request:

GET /entities/scenes/?pointer=x1,y1&pointer=x2,y2

Response:

[
    {
        "id" : "SceneId1",
        "contents" : [{
            "file" : "file_nameX",
            "hash" : "file_hashX"
        }],
        "pointers" : ["X1,Y1"],
        "metadata" : { ... },
    },
    {
        "id" : "SceneId2",
        "contents" : [{
            "file" : "file_nameY",
            "hash" : "file_hashY"
        }],
        "pointers" : ["X2,Y2"],
        "metadata" : { ... },
    }
]

Fields Param

In order to allow clients to choose which fields they want to see, we will add a query param called “fields”, where one or more of the following values can be set: “contents”, “pointers”, “metadata”. When the query param is set, only the entity’s ID and the specified fields will be present in the response.
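A minimal sketch of how the server could apply the “fields” param (the shapes here are assumptions):

```typescript
// Sketch of applying the "fields" query param: when set, only the entity's
// id and the requested fields are returned. Shapes are assumptions.
type Field = "contents" | "pointers" | "metadata";

function filterEntity(
  entity: Record<string, unknown> & { id: string },
  fields?: Field[]
): Record<string, unknown> {
  if (!fields || fields.length === 0) return entity; // param absent: full entity
  const result: Record<string, unknown> = { id: entity.id }; // id always present
  for (const f of fields) {
    if (f in entity) result[f] = entity[f];
  }
  return result;
}
```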

Contents Endpoint

By using the contents endpoint, clients will be able to download files, based on the file’s hash. To understand more about how this hash is calculated, continue to the next section.

Request:

GET /contents/{hashId}

Content Upload

A client that wants to perform the upload will first need to hash all files that compose the entity:
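This hashing step could be sketched as follows. Plain SHA-256 is an assumption made for illustration; the actual service may use a different hash (the /available-content endpoint’s “cid” naming suggests IPFS-style identifiers):

```typescript
// Sketch of the client-side hashing step. Plain SHA-256 is an assumption
// made for illustration; the service may use a different hash function.
import { createHash } from "crypto";

function hashFile(content: Buffer | string): string {
  return createHash("sha256").update(content).digest("hex");
}

// Produce one <file name, hash> pair per file composing the entity.
function hashFiles(files: Map<string, Buffer>): { file: string; hash: string }[] {
  const result: { file: string; hash: string }[] = [];
  files.forEach((content, file) => result.push({ file, hash: hashFile(content) }));
  return result;
}
```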


entity.json

The hashes calculated above, together with some extra information, will be used to generate a new entity.json file:

{
    "type": "scene",
    "content" : [
        {
            "file" : "file_name1",
            "hash" : "file_hash1"
        },
        ...
    ],
    "metadata": { ... }
    "pointers": ["x1,y1", "x2,y2", ...],
    "timestamp" : 150000000
}

Field       Description
type        The type of entity. Could be one of: scenes, profiles
content     A list of pairs <File Name, File Hash>
metadata    Some data that makes sense to the entity
pointers    List of pointers that will reference the new entity
timestamp   Epoch time that determines when the entity was created

This file will also be hashed and the result will become the identifier for the deployment and, in a sense, of the entity itself. This hash (which we will call EHash from now on) will be used as the entity’s ID.


deploy.proof

Once the EHash is generated, the client will need to sign it in order to prove ownership of the pointers that they want to modify. This additional information will also be sent to the Content Service.


{
    "id" : "EHash",
    "address": "0x...",
    "signature": "..."
}

Field       Description
id          The EHash, which will be used to identify this entity from now on
address     Associated address of the signer
signature   sig(EHash)
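The sign-and-verify flow behind deploy.proof can be illustrated with the sketch below. Ethereum signing (secp256k1 with keccak hashing) requires extra libraries, so this example uses Node’s built-in Ed25519 purely to show the mechanics: the client signs the EHash, and anyone with the corresponding public key can verify it:

```typescript
// Illustrative sign/verify flow. NOTE: real Ethereum signatures use
// secp256k1 + keccak; Ed25519 is used here only because it is built
// into Node's crypto module.
import { generateKeyPairSync, sign, verify } from "crypto";

const { publicKey, privateKey } = generateKeyPairSync("ed25519");

// The client signs the EHash to prove ownership of the pointers.
const eHash = Buffer.from("EHash-placeholder");
const signature = sign(null, eHash, privateKey);

// The service (or any auditor) verifies the signature.
const isValid = verify(null, eHash, publicKey, signature);
```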

We will add a timestamp to the deploy.proof file, marking the moment when the deployment was executed.

Complete upload request

Putting everything together, the request to the Content Service will be a multipart request composed of:

  • All of the entity’s content files
  • An entity.json file
  • A deploy.proof file

The upload endpoint will be /entities, called with a POST.

When the request is received, the Content Service will validate that:

  • entity.json
    • All files referenced in entity.json are either part of the request or have been previously uploaded
      • If the hash of a file that is being uploaded is already present, then we won’t re-upload it
    • The uploaded files’ hashes actually match the reported hashes
    • There are no duplicate file names
    • Pointers are valid
      • No duplicates
      • Make sure that pointers aren’t pointing to an entity with a higher timestamp
  • deploy.proof
    • The entity.json hash matches the one inside the deploy.proof file
    • Validate that the address has permissions to alter the state of the given pointers
    • Validate the signature to verify that the caller is actually the owner of the address
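A partial sketch of these server-side checks (signature validation omitted; SHA-256 and all names are assumptions for illustration):

```typescript
// Partial sketch of the server-side validation above (signature checks
// omitted). SHA-256 and all names are assumptions for illustration.
import { createHash } from "crypto";

interface EntityJson {
  content: { file: string; hash: string }[];
  pointers: string[];
  timestamp: number;
}

function validateUpload(
  entity: EntityJson,
  uploadedFiles: Map<string, Buffer>,    // hash -> content, from the request
  alreadyStored: Set<string>,            // hashes previously uploaded
  currentTimestamps: Map<string, number> // pointer -> timestamp of current entity
): void {
  const names = new Set<string>();
  for (const { file, hash } of entity.content) {
    // No duplicate file names
    if (names.has(file)) throw new Error(`Duplicate file name: ${file}`);
    names.add(file);
    // Every file is either part of the request or previously uploaded
    const uploaded = uploadedFiles.get(hash);
    if (!uploaded && !alreadyStored.has(hash)) throw new Error(`Missing file: ${file}`);
    // Uploaded content must actually match its reported hash
    if (uploaded && createHash("sha256").update(uploaded).digest("hex") !== hash)
      throw new Error(`Hash mismatch for ${file}`);
  }
  // Pointers: no duplicates, none pointing to a newer entity
  if (new Set(entity.pointers).size !== entity.pointers.length)
    throw new Error("Duplicate pointers");
  for (const p of entity.pointers) {
    const current = currentTimestamps.get(p);
    if (current !== undefined && current > entity.timestamp)
      throw new Error(`Pointer ${p} already references a newer entity`);
  }
}
```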

Entity Overlap

It could happen that when an entity is deployed, the pointers that reference it overlap with an already deployed entity. To see how this would work, let’s look at an example with scenes.

In this example, we have a scene already deployed at (0,0), and the pointers of the newly deployed entity collide with it. When this happens, the outcome will be:

  • Pointers(new entity) will point to the new entity
  • (Pointers(old entity) - Pointers(new entity)) will no longer point to an entity

This makes more sense when we discuss synchronization later on.
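The overlap rule above can be sketched as a pointer-map update (names are illustrative):

```typescript
// Sketch of the overlap rule: pointers claimed by the new entity move to
// it, and the old entity's remaining pointers stop pointing to anything.
function applyDeployment(
  pointerMap: Map<string, string>, // pointer -> entity id
  oldEntityId: string,
  newEntityId: string,
  newPointers: string[]
): Map<string, string> {
  const result = new Map(pointerMap);
  const orphaned: string[] = [];
  result.forEach((entityId, pointer) => {
    if (entityId === oldEntityId && newPointers.indexOf(pointer) === -1)
      orphaned.push(pointer); // Pointers(old) - Pointers(new)
  });
  orphaned.forEach((pointer) => result.delete(pointer));
  newPointers.forEach((pointer) => result.set(pointer, newEntityId));
  return result;
}
```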

Entity Audit

Any user will be able to retrieve the deployment proof just by calling the /audit endpoint, which will return the content of the deploy.proof file. Since the ID can be used to then download the entity.json file, any user will be able to verify that the files have not been modified by us or any third party.

Request:

GET /audit/{type}/{entityId}

Response:

{
    "id" : "EHash",
    "address": "0x….",
    "signature": "...",
    "timestamp": "..."
}

Available Contents

Currently, our CLI checks whether any of a scene’s assets have already been uploaded to the content service, in order to avoid a re-upload. It makes sense to continue supporting this use case, so we will build an endpoint that returns whether a file is already stored on the service.

Request:

GET /available-content?cids={cid1},{cid2}

Response:

[
    {
        "cid": "cid1",
        "available": true
    },
    ...
]

History

In order to support cluster synchronization, we will add an endpoint that reports the deployment history of a given content service. The request can take optional time delimiters.

Request:

GET /history?from={timestamp}&to={timestamp}

Response:

[
    {
        "entityType": "{entityType}",
        "entityId": "{entityId}",
        "timestamp": "..."     
    },
    ...
]

The Complete Interface

  • GET /entities/{type}/?{filter}&fields={fieldList}
  • POST /entities/
  • GET /contents/{hashId}
  • GET /pointers/{type}
  • GET /audit/{type}/{entityId}
  • GET /available-content?cids={hashId1},{hashId2}
  • GET /history?from={timestamp}&to={timestamp}

Entities

As mentioned before, we would like to support two different kinds of entities: scenes and profiles.

Scenes

Scenes will be referenced by pointers known as parcels. These pointers have a 1:1 relation with the NFTs called LAND. Parcels have the form “X,Y”, where both X and Y are integers.

The metadata field will be filled with what today goes on the scene.json file.

Profiles

Profiles will be referenced by ETH addresses. This way, claimed names can be traded without affecting one’s avatar. The metadata field will be filled with what we defined as a profile today, with the name, description and avatar.

The owner of each ETH address will be the only one who can modify the pointer.

One aspect to consider is that only users with Web3 enabled will be able to store their profile on the content service. Users without Web3 enabled will store their profile in their browser’s local storage.

Content Storage

The interface described above should be respected by all content services, so clients can call any server uniformly. However, every server could choose a different storage solution. For the first version of the service, we propose using the file system as a database.


This same approach can easily be mapped to an S3 bucket or other storage backends. Now, in order for our clients to have a good experience, we will need to cache some of these files in memory; otherwise, every request would require a disk read, which would not be ideal.

Decentralization

Before diving into the details of how decentralization will be handled, let’s remember that there is going to be a DAO that provides a list of the community-approved services.

Synchronization

So far, we have described (hopefully thoroughly) how a single content service would work. Now we need to understand how multiple instances would work together. We could potentially make a service ping the others when a user uploads data to it, but this could generate problems when a new node is introduced. To keep things simple, every service will instead check the history the others provide and detect when a new deployment is made.

When a node learns of a new deployment, it will ask the node where the deployment happened for the metadata and file mappings, and then download the entity’s content for itself.
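The pull-based synchronization described above can be sketched like this, with the remote node modeled as a plain function instead of HTTP calls to /history and /entities (all names are illustrative assumptions):

```typescript
// Sketch of the pull-based sync loop. A real node would call the remote
// /history and /entities endpoints; here they are modeled as callbacks.
interface HistoryEntry {
  entityType: string;
  entityId: string;
  timestamp: number;
}

function syncFrom(
  fetchHistory: (from: number) => HistoryEntry[], // stand-in for GET /history?from=...
  lastSeen: number,
  deployLocally: (entry: HistoryEntry) => void    // re-validates and stores the entity
): number {
  for (const entry of fetchHistory(lastSeen)) {
    deployLocally(entry);
    lastSeen = Math.max(lastSeen, entry.timestamp);
  }
  return lastSeen; // cursor for the next poll
}
```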

It is important to mention that before updating its own vision of the Metaverse, each node should validate the entity again (the proof, the hashes, everything). If nodes skipped this validation, they could end up with two different versions of the Metaverse.

An example of how this would work is shown below:

Conflicts

Given the current amount of deployments, a conflict would be unlikely. But it could happen that two or more people with write access to the same parcel deploy different scenes on different nodes at almost the same time.

When this happens, the deployment with the highest timestamp will prevail, while the other will eventually be replaced by the newest deployment. This means the approach provides eventual consistency for the content management cluster as a whole. We should highlight that, for this approach to work, we need to make it really clear and transparent how a timestamp should be calculated.
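The winner-selection rule could be sketched as follows. The deterministic tie-breaker (comparing entity ids) is an assumption; the document only specifies that the highest timestamp prevails:

```typescript
// Sketch of conflict resolution: highest timestamp wins. The id-based
// tie-breaker is an assumption, needed so every node picks the same winner.
interface Deployment {
  entityId: string;
  timestamp: number;
}

function resolveConflict(a: Deployment, b: Deployment): Deployment {
  if (a.timestamp !== b.timestamp) return a.timestamp > b.timestamp ? a : b;
  return a.entityId > b.entityId ? a : b; // deterministic tie-breaker (assumed)
}
```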

One other aspect to highlight is that even though a user upload must be rejected if there is already a newer entity deployed on one of the intended pointers, the same doesn’t happen when handling updates from other nodes. Let’s look at an example with scenes:

Now, if t0, t1, t2 and t3 are timestamps, and t0 < t1 < t2 < t3, then the final state should reflect only the newest deployments.

In order for this to happen, Node B shouldn’t ignore the update from Node A, even though it already has an overlapping entity with a newer timestamp. If Node B ignored the update, then Scene 0 and Scene 2 would live together.

This implies that even though the update on Node A will be overridden, it still needs to be processed, and it will also be added to the history. This way, we won’t have any gaps (at least eventually) in the node’s history.

Node Onboarding

When a new node is whitelisted into the cluster (by the DAO), it will select one (or more) of the other whitelisted nodes and ask for their history. Until something like snapshots is implemented, the new node will go through each entry in the history and download the entity. If an entity is known to have been overwritten, the new node won’t download its content files.

Node Revocation

When a node is removed from the list, other services will simply stop listening to its updates. As clients should refresh their view of the whitelist after a certain amount of time, there is nothing more for the nodes to do on their side.


Awesome work! Maybe we should add this to the documentation. Thanks Nico and welcome to the forum!