WebXR Plane Detection Module

Draft Community Group Report,

This version:
https://github.com/immersive-web/real-world-geometry/
Issue Tracking:
GitHub
Editor:
(Google)
Participate:
File an issue (open issues)
Mailing list archive
W3C’s #immersive-web IRC

Abstract

Status of this document

1. Introduction

2. Initialization

2.1. Feature descriptor

In order for applications to signal their interest in using plane detection during a session, the session must be requested with an appropriate feature descriptor. The string plane-detection is introduced by this module as a new valid feature descriptor for the plane detection feature.

A device is capable of supporting the plane-detection feature if the device’s tracking system exposes a native plane detection capability. The inline XR device MUST NOT be treated as capable of supporting the plane-detection feature.

When a session is created with the plane-detection feature enabled, the update planes algorithm MUST be added to the list of frame updates of that session.

The following code demonstrates how a session that requires plane detection could be requested:
const session = await navigator.xr.requestSession("immersive-ar", {
  requiredFeatures: ["plane-detection"]
});
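
Applications that can also operate without plane data could instead request the feature as optional and check whether it was granted. The following non-normative sketch assumes the optionalFeatures request option and the enabledFeatures attribute defined by the WebXR Device API:

const session = await navigator.xr.requestSession("immersive-ar", {
  optionalFeatures: ["plane-detection"]
});

// enabledFeatures is defined by the WebXR Device API; if the feature was not
// granted, the application can fall back to running without plane data.
const planeDetectionGranted = session.enabledFeatures.includes("plane-detection");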

3. Planes

3.1. XRPlaneOrientation

enum XRPlaneOrientation {
    "horizontal",
    "vertical"
};

3.2. XRPlane

interface XRPlane {
    [SameObject] readonly attribute XRSpace planeSpace;

    readonly attribute FrozenArray<DOMPointReadOnly> polygon;
    readonly attribute XRPlaneOrientation? orientation;
    readonly attribute DOMHighResTimeStamp lastChangedTime;
    readonly attribute DOMString? semanticLabel;
};

An XRPlane represents a single, flat surface detected by the underlying XR system.

The planeSpace is an XRSpace that establishes the coordinate system of the plane. The native origin of the planeSpace tracks the plane’s center. The underlying XR system defines the exact meaning of the plane center. The Y axis of the coordinate system defined by planeSpace MUST represent the plane’s normal vector.

Each XRPlane has an associated native entity.

Each XRPlane has an associated frame.

The polygon is an array of vertices that describe the shape of the plane. They are returned in the form of a loop of points at the edges of the polygon, expressed in the coordinate system defined by planeSpace. The Y coordinate of each vertex MUST be 0.0.

The semanticLabel attribute is a string that describes the semantic label of the plane. This attribute is null if there is no semantic information. The XRSystem SHOULD populate this with the semantic label it has knowledge of.

A semantic label is an ASCII lowercase DOMString that describes the real-world name of the XRPlane as known by the XRSystem. The list of semantic labels is defined in the semantic label registry.

The orientation describes the orientation of the plane, as classified by the underlying XR system. If the plane cannot be classified as "horizontal" or "vertical" by the underlying XR system, this attribute will be set to null.

The lastChangedTime is the last time any of the plane’s attributes changed.

Note: The pose of a plane is not considered a plane attribute, and therefore updates to the plane’s pose will not cause lastChangedTime to change. This is because the plane’s pose is derived from two different entities - planeSpace and the XRSpace relative to which the pose is computed via the getPose() function.
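
For example, within a requestAnimationFrame callback an application could read a plane’s attributes and query its pose. The following non-normative sketch assumes frame is an active XRFrame, plane is an XRPlane obtained from frame.detectedPlanes, and xrReferenceSpace is an XRReferenceSpace created earlier by the application:

// Non-normative sketch: `frame` is an active XRFrame, `plane` comes from
// frame.detectedPlanes, and `xrReferenceSpace` is an application-created
// XRReferenceSpace.
const planePose = frame.getPose(plane.planeSpace, xrReferenceSpace);
if (planePose) {
  // Polygon vertices are expressed relative to planeSpace; Y is always 0.0,
  // so only X and Z carry information about the plane's shape.
  for (const vertex of plane.polygon) {
    console.log(`vertex at x=${vertex.x}, z=${vertex.z}`);
  }
  console.log(`orientation: ${plane.orientation}`);      // "horizontal", "vertical", or null
  console.log(`semantic label: ${plane.semanticLabel}`); // e.g. "floor", or null if unknown
}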

4. Obtaining detected planes

4.1. XRPlaneSet

interface XRPlaneSet {
  readonly setlike<XRPlane>;
};

An XRPlaneSet is a collection of XRPlanes. It is the primary mechanism of obtaining the collection of planes detected in an XRFrame.

partial interface XRFrame {
  readonly attribute XRPlaneSet detectedPlanes;
};

XRFrame is extended to contain a detectedPlanes attribute which contains all planes that are still tracked in the frame. The set is initially empty and will be populated by the update planes algorithm. If this attribute is accessed when the frame is not active, the user agent MUST throw an InvalidStateError.

partial interface XRSession {
  Promise<undefined> initiateRoomCapture();
};

XRSession is extended to contain the initiateRoomCapture method which, if supported, will ask the XR Compositor to capture the current room layout. It is up to the XR Compositor whether this will replace or augment the set of tracked planes. The user agent MAY also ignore this call, for instance if it doesn’t support a manual room capture mode or if it determines that the room is already set up. The initiateRoomCapture method MUST only be able to be called once per XRSession.
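
For instance, an application could ask for a manual room capture once, right after the session is established. The following non-normative sketch feature-detects the method and tolerates the user agent declining the request:

async function captureRoomOnce(session) {
  // initiateRoomCapture() may be absent on user agents that do not implement
  // this module, and may be ignored or rejected by those that do.
  if (typeof session.initiateRoomCapture !== "function") {
    return;
  }
  try {
    await session.initiateRoomCapture();
  } catch (e) {
    // For example, the promise could reject if the method was already called
    // during this session.
  }
}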

XRSession is also extended to contain an associated set of tracked planes, which is initially empty. The elements of the set will be of XRPlane type.

In order to update planes for frame, the user agent MUST run the following steps:
  1. Let session be a frame’s session.

  2. Let device be a session’s XR device.

  3. Let trackedPlanes be the result of calling into device’s native plane detection capability to obtain tracked planes at frame’s time.

  4. For each native plane in trackedPlanes, run:

    1. If desired, treat the native plane as if it were not present in trackedPlanes and continue to the next entry. See § 6 Privacy & Security Considerations for criteria that could be used to determine whether an entry should be ignored in this way.

    2. If session’s set of tracked planes contains an object plane that corresponds to native plane, invoke update plane object algorithm with plane, native plane, and frame, and continue to the next entry.

    3. Let plane be the result of invoking the create plane object algorithm with native plane and frame.

    4. Add plane to session’s set of tracked planes.

  5. Remove each object in session’s set of tracked planes that was neither created nor updated during the invocation of this algorithm.

  6. Set frame’s detectedPlanes to session’s set of tracked planes.
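
The bookkeeping performed by these steps can be illustrated with the following non-normative sketch, in which device.getNativePlanes(), shouldIgnore(), session.trackedPlanes (a Map keyed by native plane), and frame.time are hypothetical stand-ins for the native capability and the internal concepts defined in this section:

// Non-normative sketch of the update planes bookkeeping. All names below are
// stand-ins for internal concepts; none of them are exposed to script.
function updatePlanes(frame, device) {
  const session = frame.session;
  const nativePlanes = device.getNativePlanes(frame.time);  // hypothetical
  const retained = new Set();

  for (const nativePlane of nativePlanes) {
    if (shouldIgnore(nativePlane)) continue;  // see § 6 Privacy & Security Considerations

    let plane = session.trackedPlanes.get(nativePlane);
    if (plane) {
      updatePlaneObject(plane, nativePlane, frame);
    } else {
      plane = createPlaneObject(nativePlane, frame);
      session.trackedPlanes.set(nativePlane, plane);
    }
    retained.add(nativePlane);
  }

  // Planes that were neither created nor updated are no longer tracked:
  for (const nativePlane of [...session.trackedPlanes.keys()]) {
    if (!retained.has(nativePlane)) session.trackedPlanes.delete(nativePlane);
  }

  frame.detectedPlanes = new Set(session.trackedPlanes.values());
}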

In order to create plane object from a native plane object native plane and XRFrame frame, the user agent MUST run the following steps:
  1. Let result be a new instance of XRPlane.

  2. Set result’s native entity to native plane.

  3. Set result’s planeSpace to a new XRSpace object created with session set to frame’s session and native origin set to track native plane’s native origin.

  4. Invoke update plane object algorithm with result, native plane, and frame.

  5. Return result.

A plane object, result, created in such a way is said to correspond to the passed in native plane object native plane.

In order to update plane object plane from a native plane object native plane and XRFrame frame, the user agent MUST run the following steps:
  1. Set plane’s frame to frame.

  2. If native plane is classified by the underlying system as vertical, set plane’s orientation to "vertical". Otherwise, if native plane is classified by the underlying system as horizontal, set plane’s orientation to "horizontal". Otherwise, set plane’s orientation to null.

  3. Set plane’s polygon to the new array of vertices representing native plane’s polygon, performing all necessary conversions to account for differences in native plane polygon representation.

  4. Set plane’s semanticLabel to a new string containing native plane’s semantic label, or to null if the system has no semantic information for it.

  5. If desired, reduce the level of detail of the plane’s polygon as described in § 6 Privacy & Security Considerations.

  6. Set plane’s lastChangedTime to frame’s time.
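
A matching non-normative sketch of the create plane object and update plane object steps; the native plane accessors and the createXRSpace() and convertPolygonToPlaneSpace() helpers used below are hypothetical:

// Non-normative sketch; XRPlane objects cannot actually be constructed by
// script, and the nativePlane accessors below are hypothetical.
function createPlaneObject(nativePlane, frame) {
  const plane = {
    nativeEntity: nativePlane,
    planeSpace: createXRSpace(frame.session, nativePlane.nativeOrigin),  // hypothetical helper
  };
  updatePlaneObject(plane, nativePlane, frame);
  return plane;
}

function updatePlaneObject(plane, nativePlane, frame) {
  plane.frame = frame;
  plane.orientation = nativePlane.isVertical   ? "vertical"
                    : nativePlane.isHorizontal ? "horizontal"
                    : null;
  plane.polygon = convertPolygonToPlaneSpace(nativePlane);  // hypothetical conversion
  plane.semanticLabel = nativePlane.semanticLabel ?? null;
  // A user agent could reduce the polygon's level of detail here (see § 6).
  plane.lastChangedTime = frame.time;
}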

The following example demonstrates how an application could obtain information about detected planes and act on it. The code that can be used to render a graphical representation of the planes is not shown.

// `planes` will track all detected planes that the application is aware of,
// and at what timestamp they were updated. Initially, this is an empty map.
const planes = new Map();

function onXRFrame(timestamp, frame) {
  const detectedPlanes = frame.detectedPlanes;

  // First, let’s check if any of the planes we knew about is no longer tracked:
  for (const [plane, timestamp] of planes) {
    if(!detectedPlanes.has(plane)) {
      // Handle removed plane - `plane` was present in previous frame,
      // but is no longer tracked.

      // We know the plane no longer exists, remove it from the map:
      planes.delete(plane);
    }
  }

  // Then, let’s handle all the planes that are still tracked.
  // This consists both of tracked planes that we have previously seen (may have
  // been updated), and new planes.
  detectedPlanes.forEach(plane => {
    if (planes.has(plane)) {
      // Handle previously-seen plane:

      if(plane.lastChangedTime > planes.get(plane)) {
        // Handle previously seen plane that was updated.
        // It means that one of the plane’s properties is different than
        // it used to be - most likely, the polygon has changed.

        ... // Render / prepare the plane for rendering, etc.

        // Update the time when we have updated the plane:
        planes.set(plane, plane.lastChangedTime);
      } else {
        // Handle previously seen plane that was not updated in current frame.
        // Note that plane’s pose relative to some other space MAY have changed.
      }
    } else {
      // Handle new plane.

      // Set the time when we have updated the plane:
      planes.set(plane, plane.lastChangedTime);
    }

    // Irrespective of whether the plane was previously seen or not,
    // & updated or not, its pose MAY have changed:
    const planePose = frame.getPose(plane.planeSpace, xrReferenceSpace);
  });

  frame.session.requestAnimationFrame(onXRFrame);
}

5. Native device concepts

5.1. Native plane detection

The plane detection API provides information about flat surfaces detected in the users’ environment. It is assumed in this specification that user agents can rely on native plane detection capabilities provided by the underlying platform for their implementation of plane-detection features. Specifically, the underlying XR device should provide a way to query all planes that are tracked at a time that corresponds to the time of a specific XRFrame.

Moreover, it is assumed that the tracked planes, known as native plane objects, maintain their identity across frames - that is, given a plane object P returned by the underlying system at time t0, and a plane object Q returned by the underlying system at time t1, it is possible for the user agent to query the underlying system about whether P and Q correspond to the same logical plane object. The underlying system is also expected to provide a native origin that can be used to query the plane’s pose at time t, although it is not guaranteed that the plane’s pose will always be known (for example for planes that are still tracked but not localizable at a given time). In addition, the native plane object should expose a polygon describing the approximate shape of the detected plane.

In addition, the underlying system should recognize native planes as native entities for the purposes of XRAnchor creation. For more information, see WebXR Anchors Module § native-anchor section.

6. Privacy & Security Considerations

The plane detection API exposes information about the users’ physical environment. The exposed plane information (such as a plane’s polygon) may be limited if the user agent so chooses. Some of the ways in which the user agent can reduce the exposed information are: decreasing the level of detail of a plane’s polygon in the update plane object algorithm (for example by decreasing the number of vertices, or by rounding / quantizing the coordinates of the vertices), or removing the plane altogether by behaving as if the plane object was not present in the trackedPlanes collection in the update planes algorithm (this could be done, for example, if the detected plane is deemed too small or too detailed to be surfaced and the mechanisms to reduce details exposed on planes are not implemented by the user agent). The poses of the planes (obtainable from planeSpace) could also be quantized.
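
As an illustration, a user agent could quantize polygon vertices before exposing them to the page. The following non-normative sketch snaps each vertex to a grid; the 5 cm step is an arbitrary illustrative choice:

// Non-normative sketch: snap each polygon vertex to a grid before exposing it.
// The 5 cm step is an arbitrary illustrative choice.
function quantizePolygon(vertices, step = 0.05) {
  const snap = (value) => Math.round(value / step) * step;
  return vertices.map(v =>
    DOMPointReadOnly.fromPoint({ x: snap(v.x), y: 0, z: snap(v.z), w: 1 }));
}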

Since concepts from plane detection API can be used in methods exposed by [webxr-anchors-module] specification, some of the privacy & security considerations that are relevant to WebXR Anchors Module also apply here. For details, see WebXR Anchors Module § privacy-security section.

Due to how the plane detection API extends the WebXR Device API, the section WebXR Device API § 13. Security, Privacy, and Comfort Considerations is also applicable to the features exposed by the WebXR Plane Detection Module.

7. Acknowledgements

The following individuals have contributed to the design of the WebXR Plane Detection specification:

Conformance

Document conventions

Conformance requirements are expressed with a combination of descriptive assertions and RFC 2119 terminology. The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in the normative parts of this document are to be interpreted as described in RFC 2119. However, for readability, these words do not appear in all uppercase letters in this specification.

All of the text of this specification is normative except sections explicitly marked as non-normative, examples, and notes. [RFC2119]

Examples in this specification are introduced with the words “for example” or are set apart from the normative text with class="example", like this:

This is an example of an informative example.

Informative notes begin with the word “Note” and are set apart from the normative text with class="note", like this:

Note, this is an informative note.

Index

Terms defined by this specification

Terms defined by reference

References

Normative References

[GEOMETRY-1]
Simon Pieters; Chris Harrelson. Geometry Interfaces Module Level 1. URL: https://drafts.fxtf.org/geometry/
[HR-TIME-3]
Yoav Weiss. High Resolution Time. URL: https://w3c.github.io/hr-time/
[RFC2119]
S. Bradner. Key words for use in RFCs to Indicate Requirement Levels. March 1997. Best Current Practice. URL: https://datatracker.ietf.org/doc/html/rfc2119
[WEBIDL]
Edgar Chen; Timothy Gu. Web IDL Standard. Living Standard. URL: https://webidl.spec.whatwg.org/
[WEBXR]
Brandon Jones; Manish Goregaokar; Rik Cabanier. WebXR Device API. URL: https://immersive-web.github.io/webxr/

Informative References

[WEBXR-ANCHORS-MODULE]
Piotr Bialecki. WebXR Anchors Module. DR. URL: https://immersive-web.github.io/anchors/

IDL Index

enum XRPlaneOrientation {
    "horizontal",
    "vertical"
};

interface XRPlane {
    [SameObject] readonly attribute XRSpace planeSpace;

    readonly attribute FrozenArray<DOMPointReadOnly> polygon;
    readonly attribute XRPlaneOrientation? orientation;
    readonly attribute DOMHighResTimeStamp lastChangedTime;
    readonly attribute DOMString? semanticLabel;
};

interface XRPlaneSet {
  readonly setlike<XRPlane>;
};

partial interface XRFrame {
  readonly attribute XRPlaneSet detectedPlanes;
};

partial interface XRSession {
  Promise<undefined> initiateRoomCapture();
};