ADR-153: Transform SDK component

Latest published version: https://adr.decentraland.org/adr/ADR-153
Authors: menduz

Abstract

This document describes the Transform component of the SDK. It is used to place things spatially in the world, and covers rotation, scaling, positioning and parenting. The semantics of this component also determine the coordinate system of Decentraland explorers. Transform operations are commutative and idempotent, as are all operations in the protocol's CRDT.

Component description

The TransformComponent is one of the available ways to instruct the rendering engine about how to place, rotate and scale any element in the 3D space. It lets users provide those three values (position, rotation and scale) plus a parent entity, which enables the creation of a hierarchy. Within this hierarchy, each child's reference system is transformed so that its origin, scale and rotation are those given by the parent's Transform component. We call this the "local coordinate system".

Serialization

parameters:
  COMPONENT_ID: 1
  COMPONENT_NAME: core::Transform
  CRDT_TYPE: LastWriteWin-Element-Set

The Transform component is serialized as a plain-old C struct with a fixed memory layout, because of how often it needs to be serialized and deserialized. The layout is represented as TransformComponent in the following snippet:

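// Sizes in the comments assume 32-bit float and 32-bit unsigned int.
// Three floats: 12 bytes.
struct Vector3 {
  float x, y, z;
};
// Four floats: 16 bytes.
struct Quaternion {
  float x, y, z, w;
};
// Fields appear in serialization order; every member is 4-byte aligned,
// so no padding is expected between them.
struct TransformComponent {
  struct Vector3 position;
  struct Quaternion rotation;
  struct Vector3 scale;
  unsigned int parentEntityId;
};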

All fields are encoded in little-endian.
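The layout above can be sketched in TypeScript with a DataView. This is a minimal illustration rather than the SDK's actual code: the names serializeTransform, deserializeTransform and TRANSFORM_BYTE_LENGTH are hypothetical, and the sketch assumes that float and unsigned int are both 32 bits wide, which gives 44 bytes in total.

```typescript
// Hypothetical helpers; only the field order and the endianness come from the text above.
type Vector3 = { x: number; y: number; z: number };
type Quaternion = { x: number; y: number; z: number; w: number };
type TransformComponent = {
  position: Vector3;
  rotation: Quaternion;
  scale: Vector3;
  parentEntityId: number;
};

// Assumption: 10 floats of 4 bytes each plus one 4-byte unsigned integer.
const TRANSFORM_BYTE_LENGTH = 44;

function serializeTransform(t: TransformComponent): Uint8Array {
  const view = new DataView(new ArrayBuffer(TRANSFORM_BYTE_LENGTH));
  const floats = [
    t.position.x, t.position.y, t.position.z,
    t.rotation.x, t.rotation.y, t.rotation.z, t.rotation.w,
    t.scale.x, t.scale.y, t.scale.z,
  ];
  // The third argument `true` selects little-endian byte order.
  floats.forEach((value, i) => view.setFloat32(i * 4, value, true));
  view.setUint32(40, t.parentEntityId, true);
  return new Uint8Array(view.buffer);
}

function deserializeTransform(bytes: Uint8Array): TransformComponent {
  if (bytes.byteLength !== TRANSFORM_BYTE_LENGTH) {
    throw new Error("unexpected Transform payload length");
  }
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const f = (index: number): number => view.getFloat32(index * 4, true);
  return {
    position: { x: f(0), y: f(1), z: f(2) },
    rotation: { x: f(3), y: f(4), z: f(5), w: f(6) },
    scale: { x: f(7), y: f(8), z: f(9) },
    parentEntityId: view.getUint32(40, true),
  };
}
```

Because the layout is fixed, a reader can reject any payload whose length differs from the expected size before decoding a single field.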

Semantics

In any rendering engine, matrix transformations are used to project three-dimensional points onto a two-dimensional screen. First, the "world matrix" of an entity is calculated. That matrix is then used to transform the local position of each vertex into world space. Finally, those vertex positions are projected to screen coordinates using a "view matrix" and a "projection matrix". We will focus on the first one, the "world matrix". To calculate it, all the vector components of the TransformComponent are used, taking into consideration the parent entity's "world matrix". But parenting of entities is complex for an ECS-based engine, because entities are stored in a flat structure, and trees are a synthetic construct used for positioning purposes only.

One of the complexities is that rendering engines usually require an acyclic tree-like structure to calculate all the world matrices. Due to the commutative nature of the CRDT messages, there may be scenarios in which, during a window of time, the state reflected by the messages contains cycles in the parenting hierarchy.

The RECOMMENDED implementation path is to move the entities to the root level of the scene while there are parenting cycles, always prioritizing the best possible performance for the best-case scenario (a state without cycles). If the scenes are well programmed, after processing all CRDT messages the scene should converge towards an acyclic hierarchy (a DAG) starting at the root entity.
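One possible reading of this recommendation is sketched below. It assumes a flat table from entity id to declared parent id, and it treats only the entities that lie on a cycle as temporarily parented to the root, leaving entities that merely hang off a cycle where they are. The names isOnCycle and resolveEffectiveParents are hypothetical, and this naive version favors clarity over the best-case performance the text asks for.

```typescript
const ROOT_ENTITY = 0;

// True when following the declared parents from `entity` leads back to `entity`.
function isOnCycle(entity: number, declaredParents: Map<number, number>): boolean {
  const visited = new Set<number>();
  let current = declaredParents.get(entity);
  while (current !== undefined && current !== ROOT_ENTITY && !visited.has(current)) {
    if (current === entity) return true;
    visited.add(current);
    current = declaredParents.get(current);
  }
  // The walk reached the root, an unknown parent, or a cycle that does not include `entity`.
  return false;
}

// Returns the parent the renderer would use for each entity.
// Assumption: entities on a cycle are placed at the root until the cycle disappears.
function resolveEffectiveParents(declaredParents: Map<number, number>): Map<number, number> {
  const effective = new Map<number, number>();
  declaredParents.forEach((declaredParent, entity) => {
    effective.set(entity, isOnCycle(entity, declaredParents) ? ROOT_ENTITY : declaredParent);
  });
  return effective;
}
```

For example, if entity 2 declares parent 3 and entity 3 declares parent 2, both resolve to the root, while an entity 4 parented to 2 keeps its declared parent. Once a later CRDT message breaks the cycle, recomputing the table restores the declared hierarchy.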

To elaborate on the parenting process, we must first introduce how vertex projection works in 3D engines. It is all based on matrix calculations. Starting from an identity matrix, we can translate, scale or rotate by multiplying the current matrix by a translation, scale or rotation matrix, as many times as needed.

The process of calculating the world matrix allows us to change the reference system for each entity. It is performed by multiplying the parent entity's world matrix by the current entity's local matrix, that is, the matrix built from its own Transform component.

Matrix operations are multiplications, and matrix multiplication is not commutative: in general, (a*b) != (b*a). This forces every rendering engine to traverse the entire tree of entities from the root entity downwards, in preorder, so that each parent's world matrix is calculated before those of its children.
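The traversal described above can be sketched as follows. This is an illustration under stated assumptions, not an engine's implementation: matrices are 4x4 arrays of rows applied to column vectors (so the parent matrix goes on the left), the hierarchy is assumed to be free of cycles, and the names SceneEntity and computeWorldMatrices are hypothetical.

```typescript
type Matrix4 = number[][]; // 4 rows of 4 numbers

const ROOT_ENTITY = 0;

function identity(): Matrix4 {
  return [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]];
}

// Translation for column vectors: the offset lives in the last column.
function translation(x: number, y: number, z: number): Matrix4 {
  return [[1, 0, 0, x], [0, 1, 0, y], [0, 0, 1, z], [0, 0, 0, 1]];
}

function scaling(x: number, y: number, z: number): Matrix4 {
  return [[x, 0, 0, 0], [0, y, 0, 0], [0, 0, z, 0], [0, 0, 0, 1]];
}

function multiply(a: Matrix4, b: Matrix4): Matrix4 {
  const out = identity();
  for (let row = 0; row < 4; row++) {
    for (let col = 0; col < 4; col++) {
      let sum = 0;
      for (let k = 0; k < 4; k++) sum += a[row][k] * b[k][col];
      out[row][col] = sum;
    }
  }
  return out;
}

interface SceneEntity {
  parent: number; // id of the parent entity, ROOT_ENTITY when unparented
  local: Matrix4; // matrix built from the entity's own Transform
}

// Visits each entity before its children, so every parent's world matrix
// is known by the time a child needs it.
function computeWorldMatrices(entities: Map<number, SceneEntity>): Map<number, Matrix4> {
  const children = new Map<number, number[]>();
  entities.forEach((entity, id) => {
    const siblings = children.get(entity.parent) ?? [];
    siblings.push(id);
    children.set(entity.parent, siblings);
  });

  const world = new Map<number, Matrix4>();
  const visit = (id: number, parentWorld: Matrix4): void => {
    const entity = entities.get(id);
    // The root has no Transform of its own here, so it keeps the identity.
    const worldMatrix = entity ? multiply(parentWorld, entity.local) : parentWorld;
    world.set(id, worldMatrix);
    for (const child of children.get(id) ?? []) visit(child, worldMatrix);
  };
  visit(ROOT_ENTITY, identity());
  return world;
}
```

The order of the operands matters: multiply(translation(1, 0, 0), scaling(2, 2, 2)) leaves the offset at 1, whereas multiply(scaling(2, 2, 2), translation(1, 0, 0)) doubles it to 2, which is why a child cannot be processed before its parent.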

And since matrix operations are multiplications, in the cases where there is no Transform component, we must assume an identity matrix. Otherwise, every position would be rendered at the coordinates 0,0,0. The identity values for the TransformComponent are defined as follows:

// the identity transform
Transform.Identity = {
  scale: Vector3(1, 1, 1),
  position: Vector3(0, 0, 0),
  rotation: Quaternion(0, 0, 0, 1), // Identity
  parent: ROOT_ENTITY // 0
}
// yields an identity matrix
Matrix4x4.Identity = [
  [1, 0, 0, 0],
  [0, 1, 0, 0],
  [0, 0, 1, 0],
  [0, 0, 0, 1]
]
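As a rough check of the claim that the identity transform yields an identity matrix, the sketch below composes a 4x4 matrix from position, rotation and scale. It assumes a translation * rotation * scale order applied to column vectors and a unit quaternion. The name composeMatrix is hypothetical, and the row and column layout is an assumption of this sketch, not something the protocol text prescribes.

```typescript
type Vector3 = { x: number; y: number; z: number };
type Quaternion = { x: number; y: number; z: number; w: number };
type Matrix4 = number[][]; // 4 rows of 4 numbers

// Builds translation * rotation * scale (assumed order) for column vectors.
function composeMatrix(position: Vector3, rotation: Quaternion, scale: Vector3): Matrix4 {
  const { x, y, z, w } = rotation;
  // Standard rotation matrix of a unit quaternion.
  const r = [
    [1 - 2 * (y * y + z * z), 2 * (x * y - z * w), 2 * (x * z + y * w)],
    [2 * (x * y + z * w), 1 - 2 * (x * x + z * z), 2 * (y * z - x * w)],
    [2 * (x * z - y * w), 2 * (y * z + x * w), 1 - 2 * (x * x + y * y)],
  ];
  // Scale multiplies each column of the rotation; the position fills the last column.
  return [
    [r[0][0] * scale.x, r[0][1] * scale.y, r[0][2] * scale.z, position.x],
    [r[1][0] * scale.x, r[1][1] * scale.y, r[1][2] * scale.z, position.y],
    [r[2][0] * scale.x, r[2][1] * scale.y, r[2][2] * scale.z, position.z],
    [0, 0, 0, 1],
  ];
}

// Values taken from Transform.Identity above (the parent field is not part of the matrix).
const identityMatrix = composeMatrix(
  { x: 0, y: 0, z: 0 },
  { x: 0, y: 0, z: 0, w: 1 },
  { x: 1, y: 1, z: 1 },
);
```

By contrast, passing a scale of (0, 0, 0) zeroes the upper-left 3x3 block, so every vertex collapses onto the entity's position. This is why a missing Transform has to default to the identity values and not to zeros.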

To perform all the calculations consistently, the protocol requires a left-handed coordinate system with the parameters UP=vec3(0,1,0) and FORWARD=vec3(0,0,1).

From this reasoning, we get a set of rules:

When an entity A is added to an entity B but the engine does not know about entity B

When the engine is made aware of the real Transform component of entity B

Complex scenario

After creating and parenting entities with explicit transforms in the form:

ROOT_ENTITY
  └── A
      └── B
          └── C
              └── D
                  └── E
                      └── F
ROOT_ENTITY
  ├── A
  │   └── B
  │       └── C
  └── D
      └── E
          └── F

RFC 2119 and RFC 8174

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 and RFC 8174.

License

Copyright and related rights waived via CC0-1.0. Living