Integration with IoT Devices

This guide shows developers how to integrate MultiSet's Visual Positioning System (VPS) into non-AR, camera-enabled IoT devices such as the ESP32. The device sends a single image to the MultiSet REST API and receives its pose in return, giving instant, centimeter-level positioning.

Advantages of MultiSet VPS for IoT

Integrating MultiSet VPS with your IoT devices offers several key advantages:

  • High-Precision Positioning: Achieve centimeter-level accuracy, enabling a new class of location-aware applications.

  • Instant Localization: Get a precise position and orientation from a single image, eliminating the need for drift-prone tracking over time.

  • Cost-Effective Hardware: The solution works with standard camera modules, allowing you to leverage affordable and widely available hardware like the ESP32-CAM.

  • Low On-Device Processing: The heavy computational tasks are offloaded to the MultiSet cloud, keeping the device-side requirements minimal.

  • Robustness: Visual positioning is less susceptible to the interference and signal-loss issues that can affect other positioning technologies like GPS or Wi-Fi, especially indoors.

  • Rich Data: The system provides a full 6DoF (six degrees of freedom) pose, that is, 3D position plus 3D rotation, so you know both where the device is and how it is oriented in space.

Integration Workflow

The process of getting a device's pose is straightforward and consists of the following steps:

  1. Capture an Image: Use the camera module on your device (e.g., ESP32) to capture an image of the surrounding environment.

  2. Prepare API Request: Collect the necessary camera parameters: the image resolution (width, height) and the intrinsic values, namely the focal length (fx, fy) and principal point (px, py).

  3. Send Request to MultiSet API: Make an HTTP POST request to the MultiSet VPS API endpoint. The request will be a multipart/form-data submission containing the image and camera parameters.

  4. Receive Pose Data: The API processes the image and returns a JSON object. If the location is recognized, the object contains the device's calculated position and rotation.

  5. Utilize Pose Data: Parse the JSON response on your device to extract the position and rotation, and use this data in your application.
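
Step 2 depends on knowing the camera intrinsics. When a calibrated set is not available, a rough starting point can be derived from the lens's horizontal field of view with the pinhole camera model. The sketch below is an approximation under stated assumptions, not MultiSet's recommended procedure: the struct and function names are hypothetical, it assumes square pixels and a principal point at the image centre, and it assumes the API expects values in pixel units. A proper calibration of your specific camera module should give better results.

```cpp
#include <cmath>

// Camera parameters needed in step 2 (pixel units assumed).
struct CameraIntrinsics {
    double fx, fy;      // focal length along x and y
    double px, py;      // principal point
    int width, height;  // image resolution
};

// Approximates pinhole intrinsics from the horizontal field of view.
// Assumptions: square pixels (fx == fy) and a principal point at the
// image centre. Hypothetical helper, not part of any MultiSet SDK.
CameraIntrinsics approximateIntrinsics(int width, int height, double hfovDegrees) {
    const double kPi = 3.14159265358979323846;
    const double halfFovRadians = hfovDegrees * kPi / 180.0 / 2.0;
    const double focal = (width / 2.0) / std::tan(halfFovRadians);
    return CameraIntrinsics{focal, focal, width / 2.0, height / 2.0, width, height};
}
```

For example, `approximateIntrinsics(640, 480, 90.0)` yields a focal length of 320 pixels and a principal point of (320, 240). The field of view of your lens should come from the module's datasheet.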

Developer Integration Steps

Integration requires interacting with two main API endpoints.

1. Generate M2M Auth Token

This endpoint authenticates your device and returns a JWT (JSON Web Token) that is required for all other API calls.

  • Endpoint: POST https://api.multiset.ai/v1/m2m/token

  • Body (application/json):

    codeJSON

  • Success Response (200 OK):

    codeJSON

2. Query VPS Map

This endpoint takes an image and camera parameters to perform localization.

  • Endpoint: POST https://api.multiset.ai/v1/vps/map/query-form

  • Authorization Header: Authorization: Bearer <YOUR_JWT_TOKEN>

  • Body (multipart/form-data):

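    • mapCode (Code of the map to query)

    • fx, fy, px, py, width, height (Required camera parameters: focal length fx, fy; principal point px, py; image resolution width, height)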

    • queryImage (Required image file)
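
On a microcontroller the multipart/form-data body often has to be assembled by hand. The sketch below shows one way to lay out the fields listed above in standard C++, following the general multipart format of RFC 7578. The helper names, the `frame.jpg` filename, the boundary string, and the choice of JPEG as the image format are illustrative assumptions, not values confirmed by the MultiSet documentation.

```cpp
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// One text field of the form, e.g. {"mapCode", "<your map code>"}.
using FormField = std::pair<std::string, std::string>;

// Builds a multipart/form-data body: the text fields first, then a single
// file part named "queryImage". The boundary must not occur inside any
// field value or inside the image bytes.
// Assumed for illustration: JPEG image data and the filename "frame.jpg".
std::string buildMultipartBody(const std::vector<FormField>& fields,
                               const std::vector<std::uint8_t>& jpeg,
                               const std::string& boundary) {
    std::string body;
    for (const auto& field : fields) {
        body += "--" + boundary + "\r\n";
        body += "Content-Disposition: form-data; name=\"" + field.first + "\"\r\n\r\n";
        body += field.second + "\r\n";
    }
    body += "--" + boundary + "\r\n";
    body += "Content-Disposition: form-data; name=\"queryImage\"; filename=\"frame.jpg\"\r\n";
    body += "Content-Type: image/jpeg\r\n\r\n";
    body.append(jpeg.begin(), jpeg.end());   // raw image bytes, unmodified
    body += "\r\n--" + boundary + "--\r\n";  // closing delimiter
    return body;
}

// Value for the request's Content-Type header; it must carry the same boundary.
std::string multipartContentType(const std::string& boundary) {
    return "multipart/form-data; boundary=" + boundary;
}
```

The returned string is sent as the POST payload, together with the Content-Type value from `multipartContentType` and the Authorization header. On memory-constrained boards it may be preferable to stream the header, image, and footer separately instead of copying the image into one buffer.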

Understanding the API Response

The API returns a JSON object with the localization result.

  • poseFound: A boolean that is true if the device's location was successfully identified within the map.

  • location: An object containing the 6DoF pose.

    • position: The (x, y, z) coordinates of the device relative to the map's origin.

    • rotation: The orientation of the device represented as a quaternion (qx, qy, qz, qw).

  • confidence: A numerical value indicating the confidence level of the localization result.

  • mapId: The unique identifier of the map in which the device was localized.
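
To use the rotation in an application, the quaternion usually has to be applied to a direction vector, for example to find which way the camera is facing in map coordinates. The sketch below uses the standard unit-quaternion rotation formula in plain C++. The type and function names are hypothetical, and the source does not state the axis convention or whether the rotation maps device coordinates to map coordinates or the reverse, so treat the interpretation of the result as an assumption to verify against your map.

```cpp
#include <array>
#include <cmath>

struct Quaternion { double x, y, z, w; };  // corresponds to qx, qy, qz, qw
using Vec3 = std::array<double, 3>;

// Rotates v by q using the rotation matrix of a unit quaternion.
// q is normalised first so rounding in the JSON values does not scale v.
// A zero quaternion is treated as "no rotation".
Vec3 rotate(const Quaternion& q, const Vec3& v) {
    const double norm = std::sqrt(q.x * q.x + q.y * q.y + q.z * q.z + q.w * q.w);
    if (norm == 0.0) return v;
    const double x = q.x / norm, y = q.y / norm, z = q.z / norm, w = q.w / norm;
    return Vec3{
        (1 - 2 * (y * y + z * z)) * v[0] + 2 * (x * y - w * z) * v[1] + 2 * (x * z + w * y) * v[2],
        2 * (x * y + w * z) * v[0] + (1 - 2 * (x * x + z * z)) * v[1] + 2 * (y * z - w * x) * v[2],
        2 * (x * z - w * y) * v[0] + 2 * (y * z + w * x) * v[1] + (1 - 2 * (x * x + y * y)) * v[2]};
}
```

With the identity quaternion (0, 0, 0, 1) the vector is returned unchanged, and a 90-degree rotation about the z axis turns the x axis (1, 0, 0) into the y axis (0, 1, 0).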

Sample C++ Script for ESP32

This sample code demonstrates how to send an image to the MultiSet VPS API from an ESP32 using the HTTPClient and ArduinoJson libraries.

Prerequisites:

  1. ESP32 board with a camera module (e.g., ESP32-CAM).

  2. Arduino IDE with the ESP32 board support package installed.

  3. Install the ArduinoJson library from the Arduino Library Manager.

  4. Configure your Wi-Fi credentials, API token, and camera parameters in the script.
