Understanding Mixed Reality Development on the Microsoft HoloLens

Damian Perera
Feb 2, 2020


This article was originally published on my personal blog in November 2016 and is recommended for developers looking to begin development for the HoloLens 1 Developer Edition. It in no way covers the entire spectrum of game physics, programming paradigms, and concepts that a developer needs to know when building advanced applications for the HoloLens or any other Mixed Reality device.

To understand the concept of Mixed Reality, one must first be familiar with the terms Virtual Reality and Augmented Reality, since Mixed Reality combines both the Virtual and Augmented scopes.

  • Virtual Reality refers to the simulation of a user’s physical presence in a virtual environment.
  • Augmented Reality refers to the view of a physical environment where some or all of its elements are augmented by computer-generated input.

Mixed Reality (also referred to as Hybrid Reality) refers to the merging of both Virtual and Augmented Reality where the physical environment and the virtual environment co-exist based on several paradigms.

Mixed Reality covers more than one scenario: in one, computer-generated elements are overlaid on the real world; in another, the real world is replaced by a virtual one (e.g. HoloTour). In this post, I’ll review and shed some light on the functioning of the Microsoft HoloLens headset, the first Mixed Reality device.

RoboRaid — A look into the Augmented Reality capabilities of the HoloLens

Microsoft HoloLens

The HoloLens is a Mixed Reality device powered by a modified version of Windows 10 IoT Core. Its forte, and the power that drives its complex system, is the Holographic Processing Unit (HPU), a custom coprocessor that processes the headset’s sensor data. Graphics are rendered through DirectX 11.

The advantage of a DirectX 11 based platform that supports the Universal Windows Platform (UWP) is that developers can build holographic applications using existing suites and frameworks such as Unity and SharpDX. Since Unity is already a fully-fledged game engine and supports Windows Holographic out of the box, we will look into the development of applications based on Unity 5.4.x.

Anyone who develops for the HoloLens on top of Unity needs to be familiar with its Game Physics and Input Paradigms.

Game Physics

The HoloLens API (UnityEngine.VR.WSA) provides spatial mapping and plane detection, whereby we create a mesh collider called the SpatialMappingCollider (SMC) for each surface plane. The SMC includes renderer components and a Spatial Mesh component (used to let the real environment occlude virtual GameObjects). All interactions and animations are conducted within this collider, where GameObjects and Rigid Body Physics use the collider as the default horizontal and vertical plane. Every GameObject that uses spatial mapping for placement or animation will use the SMC to identify floors and walls.
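To make the idea of identifying floors and walls concrete, here is a minimal, engine-independent sketch in Python. It is not the Unity or HoloLens API. It assumes each detected plane comes with a normal vector pointing into the room, that the world "up" axis is +Y (as in Unity), and that a 10 degree tolerance is acceptable. The function name `classify_plane` and its parameters are hypothetical.

```python
import math

def classify_plane(normal, up=(0.0, 1.0, 0.0), tolerance_deg=10.0):
    """Label a plane as 'floor', 'ceiling', 'wall' or 'other' from its normal.

    Assumption: the normal points away from the surface, into the room,
    and `up` is a unit vector.
    """
    length = math.sqrt(sum(c * c for c in normal))
    if length == 0.0:
        raise ValueError("normal must be non-zero")
    unit = tuple(c / length for c in normal)

    # Angle between the plane normal and the world up axis, in degrees.
    cos_up = sum(a * b for a, b in zip(unit, up))
    cos_up = max(-1.0, min(1.0, cos_up))  # guard against rounding error
    angle = math.degrees(math.acos(cos_up))

    if angle <= tolerance_deg:
        return "floor"      # normal points up
    if angle >= 180.0 - tolerance_deg:
        return "ceiling"    # normal points down
    if abs(angle - 90.0) <= tolerance_deg:
        return "wall"       # normal is roughly horizontal
    return "other"          # sloped surface


print(classify_plane((0.0, 1.0, 0.0)))
print(classify_plane((1.0, 0.0, 0.0)))
```

A placement script could use a label like this to decide, for example, that a character may only stand on planes classified as floors.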

For the game engine to correctly position and apply game physics to GameObjects, they should contain a box collider to supplement the SMC. The behaviour, collision detection and interaction between the SMC and the different GameObject colliders are the basis for Game Physics in the augmented environment.

How the HoloLens sees the real environment

Input Paradigms

Input for the HoloLens manifests in three main forms, collectively known as GGV (Gaze, Gesture and Voice).

  • Gaze — Simulates mouse pointing in the environment
  • Gesture — Simulates selection, drag and drop, bloom etc. in the virtual environment
  • Voice — Allows the user to issue voice commands for predefined interactions or inputs

In addition to the GGV Input Paradigm, the HoloLens uses Spatial Sound for immersive sound output in the virtual environment. Using Spatial Sound, a developer can simulate the origin of an audio source — the closer a user is to the sound source, the clearer and louder it can be heard, and vice versa.
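The distance effect described above can be approximated with a simple inverse-distance rolloff plus a left/right pan. The Python sketch below is only an illustration of that idea. It is not how the HoloLens audio pipeline is implemented, and the function names, the `min_distance` parameter and the requirement that `right` be a unit vector are all assumptions.

```python
import math

def distance_gain(listener, source, min_distance=1.0):
    """Inverse-distance rolloff: full volume within min_distance, then 1/d."""
    d = math.dist(listener, source)
    return min_distance / max(d, min_distance)

def stereo_pan(listener, right, source):
    """Pan in [-1, 1]: -1 is fully left, +1 is fully right.

    Assumption: `right` is a unit vector pointing to the listener's right.
    """
    offset = [s - l for s, l in zip(source, listener)]
    d = math.sqrt(sum(c * c for c in offset))
    if d == 0.0:
        return 0.0  # source is at the listener's position: centre it
    return sum(o * r for o, r in zip(offset, right)) / d


listener = (0.0, 0.0, 0.0)
right = (1.0, 0.0, 0.0)
print(distance_gain(listener, (0.0, 0.0, 2.0)))
print(stereo_pan(listener, right, (2.0, 0.0, 0.0)))
```

Doubling the distance halves the gain in this model, which matches the intuition that a nearby source sounds louder than a distant one.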

Gaze, Gesture and Voice

Since every input in the HoloLens is done via GGV, a developer must be thoroughly familiar with all three input sources in order to use them in conjunction with each other.

An important point to note is that, contrary to popular belief, the Gaze input is not provided by the operating system. It is instead a user-programmed input source that casts a ray from the main camera position (transform.position) along its forward direction (transform.forward) and uses the resulting raycast hit. Hence, the cursor design and behaviour are completely user-programmable.
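In Unity this is typically done with Physics.Raycast from a script, but the underlying logic is easy to see outside the engine. The following Python sketch is a simplified illustration under stated assumptions: holograms are approximated as spheres, `origin` and `forward` stand in for the camera's transform.position and transform.forward, and the names `raycast_sphere`, `gaze_cursor` and `default_distance` are hypothetical.

```python
import math

def raycast_sphere(origin, direction, center, radius):
    """Distance along a unit-length ray to the nearest hit on a sphere, or None.

    Simplification: a camera located inside the sphere is reported as no hit.
    """
    oc = [o - c for o, c in zip(origin, center)]
    b = sum(d * x for d, x in zip(direction, oc))
    c = sum(x * x for x in oc) - radius * radius
    disc = b * b - c
    if disc < 0.0:
        return None  # the ray misses the sphere
    t = -b - math.sqrt(disc)
    return t if t >= 0.0 else None

def gaze_cursor(origin, forward, targets, default_distance=2.0):
    """Return (name of the gazed target or None, cursor position).

    targets maps a name to a (center, radius) pair.
    """
    length = math.sqrt(sum(f * f for f in forward))
    direction = [f / length for f in forward]

    best_name, best_t = None, None
    for name, (center, radius) in targets.items():
        t = raycast_sphere(origin, direction, center, radius)
        if t is not None and (best_t is None or t < best_t):
            best_name, best_t = name, t

    # With no hit, the cursor floats at a fixed distance along the gaze.
    distance = best_t if best_t is not None else default_distance
    position = tuple(o + d * distance for o, d in zip(origin, direction))
    return best_name, position


holograms = {"cube": ((0.0, 0.0, 5.0), 1.0), "far": ((0.0, 0.0, 10.0), 1.0)}
print(gaze_cursor((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), holograms))
```

Because the cursor position is just a value computed by this kind of script, its appearance and behaviour (for example snapping to surfaces or changing shape on a hit) are entirely up to the developer.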

As a beginner to Unity and the MR field, it took me quite a while to figure out the basics that drive every MR application built on top of UWP and the Unity Game Engine. I hope this post helps anyone looking to begin development on the Microsoft HoloLens. I would recommend checking out the Microsoft HoloToolkit for Unity for a helping hand in development.