
SteamVR HTC Vive In-depth - Lighthouse Tracking System Dissected and Explored

Subject: General Tech
Manufacturer: HTC

The SteamVR Lighthouse Tracking System

Principles of Operation

Enter the Lighthouse tracking system. Pioneered by Alan Yates of Valve, this system uses Beacons (a.k.a. Base Stations) to emit precisely timed IR pulses (blinks) and X/Y-axis IR laser sweeps. Instead of cameras or IR LEDs mounted to the HMD and controllers, Lighthouse embeds an array of IR-filtered photodiodes within every item that requires tracking. As the X and Y ‘planes’ of IR laser light sweep past the sensors embedded in the controllers and HMD, each diode’s output is amplified and passed on to an internal ASIC, which is programmed with the physical location of that sensor on the device. Provided there are enough inputs (sensors with a direct line of sight to one or both Base Stations), the ASIC can then work out the device’s location and orientation within the room.
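
Before we dig into the hardware, it helps to see how little math a single sensor hit actually requires. The snippet below is our own minimal illustration (with made-up names), assuming the nominal 60 Hz rotor speed covered later in this article; it is not Valve's firmware, just the core timing-to-angle idea:

```python
import math

# One full rotation of a sweep rotor at the nominal 60 Hz (3600 RPM).
ROTOR_PERIOD_S = 1.0 / 60.0

def sweep_angle(time_since_blink_s: float) -> float:
    """Convert the delay between the sync blink and the laser hitting a
    photodiode into the sweep angle (in radians) from the Base Station."""
    return 2.0 * math.pi * (time_since_blink_s / ROTOR_PERIOD_S)

# Example: a sensor hit 4.167 ms after the blink sits a quarter turn
# (90 degrees) into that sweep.
print(math.degrees(sweep_angle(0.004167)))  # ~90.0
```

With one such angle per sweep axis per sensor, and the sensor layout known in advance, there is enough information to solve for the pose of the whole device.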


The HTC Vive HMD uses over a dozen IR sensors to determine its position and orientation.

Internals

That was a lot to digest, so let’s try to get a better handle on things by taking a look inside, shall we?


Going to the Base Station first, we see a relatively simple-looking box with a lot going on inside.


Starting at the top, we have the IR blinker array. The large number of LEDs here increases the brightness of the synchronization pulses. It is also possible to encode simple data into the blinks (for identification purposes where there are multiple Base Stations).

On either side we see a pair of brushless, permanent-magnet, three-phase synchronous motors. I’ve seen a few fans of the Rift complain about these ‘moving parts’ as a point of potential failure. To put that concern to rest, consider that these are the same type of motors used in hard drives, except they spin at half the speed (3600 RPM) and are designed to spin continuously for years without failure, so no worries there.
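
For reference, the arithmetic tying that motor speed to the sweep timing is trivial; a quick sanity check:

```python
# 3600 RPM works out to 60 full rotations per second, i.e. one sweep
# rotation roughly every 16.7 ms per rotor.
rpm = 3600
revs_per_second = rpm / 60            # 60.0
rotation_period_ms = 1000 / revs_per_second
print(revs_per_second, round(rotation_period_ms, 2))  # 60.0 16.67
```

Those numbers line up with the blink and sweep intervals measured in the Operation section below.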

Each motor spins an optical flywheel with a ‘line lens’ of the multiple-rod type. This lens takes the internal laser beam and fans it out into a line perpendicular to the direction of the rods. The end result is an IR laser line that sweeps around the axis of each spinning rotor. We will get into more detail on how these work later on.

Straight down from the blinker array and between the two flywheels is a single photodiode, which is used to synchronize multiple beacons (eliminating the need for a sync cable, though one can be used when the stations cannot see each other). Just below the photodiode we see a small PCB containing a discrete amplifier. A ribbon cable then carries the amplified signal to the bottom PCB.


At the rear, we find some beefy capacitors, drive electronics for the flywheel motors, and power / sync / USB connectors. The USB connector is used for firmware updates.


Flipping back around to the front, we find two sets of wires leading to a pair of IR laser diodes. These fire across each other and down the center of each flywheel assembly, where a 45-degree mirror reflects each beam towards the multi-rod laser lens, forming the line that sweeps either horizontally or vertically (depending on which flywheel you are looking at).

Operation

We found some examples and even some high-speed video of a Base Station in operation, but most of the information out there is incorrect or based on the development hardware. This meant we had to shoot our own video.

The above video was shot at 960 FPS and slowed to 1/160th of real-time speed. The red flashes you see are dimmer than we would like because most of the IR was filtered out by our camera’s internal optics, but there is enough to get the point. If you watch the video closely, you’ll note that there is a ‘blink’ from the IR LED array in between alternating passes of each flywheel lens. The blinks occur every 8.33 ms (120 Hz), and the two rotating lenses are interleaved, each passing by every 16.7 ms (60 Hz).

Those with a keen eye will note that the flywheel lasers are only intermittent: the pattern is blink / X sweep / blink / Y sweep / blink / (none) / blink / (none). This is different from what you might have seen in other videos, partially because our video was taken with two Base Stations in sync with each other. Lighthouse cannot work properly with multiple sweeps overlapping the same volume at the same time, so each beacon must take its turn. The ‘blinks’ themselves are automatically synchronized across all beacons and are intended to act like an IR camera flash across the whole room, while the sweeps are interleaved among them. Below is a video representation of what is happening here (but one that assumes only a single beacon is in operation, with no interleaving):
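
To make the interleaving a bit more concrete, here is a minimal sketch of the blink/sweep schedule just described. The 120 Hz blink cadence comes from our measurements above; the exact slot ordering is an illustrative assumption rather than Valve's documented protocol:

```python
# Toy model of the two-station blink/sweep schedule described above.
BLINK_PERIOD_MS = 1000.0 / 120.0  # ~8.33 ms between sync blinks

# One full cycle spans four blink slots: station A sweeps X then Y,
# then station B sweeps X then Y while A stays dark.
SCHEDULE = ["A: X sweep", "A: Y sweep", "B: X sweep", "B: Y sweep"]

def sweep_after_blink(time_ms: float) -> str:
    """Return which sweep follows the sync blink at a given time."""
    slot = int(time_ms // BLINK_PERIOD_MS) % len(SCHEDULE)
    return SCHEDULE[slot]

for t in (0.0, 8.34, 16.67, 25.0, 33.34):
    print(f"{t:6.2f} ms -> blink, then {sweep_after_blink(t)}")
```

Each station's X sweep therefore repeats only every four blink slots (about 33.3 ms), which is where the 30 Hz per-station figure discussed in the comments comes from.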

As with any optical positional tracking implementation, there are limits. To get an initial position and orientation ‘lock’, a device must have at least 5 sensors ‘lit’ by a Base Station (or 3 if two Base Stations are in view). Once locked, the device can use data from an internal gyro + accelerometer (IMU) to track its own location in real-time, meaning it knows where it is during the periods of time between Base Station sweeps or even if all sensors have been temporarily blocked from view (a form of dead reckoning). Provided a fix was recently established, the system can still determine its location in the room with just a single sensor and two beacons.
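
To illustrate the dead-reckoning idea (and only that), here is a toy one-axis sketch. A real headset fuses gyro and accelerometer data through a proper filter, so treat the structure and numbers below as illustrative assumptions rather than how the Vive actually implements it:

```python
from dataclasses import dataclass

@dataclass
class State:
    position: float  # metres, single axis for brevity
    velocity: float  # m/s

def propagate(state: State, accel: float, dt: float) -> State:
    """Integrate accelerometer output to keep the position estimate alive
    while no sweep data is available."""
    new_velocity = state.velocity + accel * dt
    new_position = state.position + state.velocity * dt + 0.5 * accel * dt * dt
    return State(new_position, new_velocity)

# Between two 30 Hz optical fixes (~33 ms apart), an IMU sampled at 1 kHz
# steps the estimate forward many times; the next sweep then snaps any
# accumulated drift back to the optically measured position.
state = State(position=0.0, velocity=0.1)
for _ in range(33):
    state = propagate(state, accel=0.0, dt=0.001)
print(round(state.position, 4))  # ~0.0033 m travelled during the gap
```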

Controllers

Now that you hopefully have a grasp on how the Base Stations blink and sweep the room, let’s crack open one of the controllers:


This is the top-facing portion of the Vive’s controller ring. Here we can make out 11 sensor units.


A closer look at some of the sensors. These are tucked into nearly every crevice of the ring portion of the controller. There are 24 total sensors on the ring.

Conclusion

That about wraps things up here. We hope this answers some of the questions and curiosities you might have had about the Lighthouse tracking system. Feel free to discuss further in the comments below!


April 5, 2016 | 05:35 PM - Posted by Anonymous (not verified)

wow, that's a lot of accessories and a lot of plugs

April 5, 2016 | 05:50 PM - Posted by Allyn Malventano

It looks simpler once everything is set up, but yeah, there are a few parts.

April 5, 2016 | 06:09 PM - Posted by louis ifesiokwu (not verified)

I think HTC's tech is a good idea, as each sweep establishes a reference datum point for the sensors it can see, making it quite accurate. But do you think there might be issues with syncing both lighthouses, as any error in the speed of those motors would throw it out of sync?

Well, out of the two main technologies, which one do you think would have the larger CPU overhead or be more prone to errors?

April 5, 2016 | 07:24 PM - Posted by Allyn Malventano

The motors are synchronous, meaning they spin at the exact speed that they are directed to by the controller. The controller sends a three-phase AC signal which the permanent-magnet rotor must follow. The only way they could be out of sync would be if there were some sort of failure.

Position tracking is done in hardware on the Vive, so no CPU overhead necessary. It may be done in software on the Rift, but it doesn't appear to have any noticeable impact (position tracking math is very simple compared to other game tasks performed by a CPU).

April 5, 2016 | 07:34 PM - Posted by Anonymous (not verified)

Does the Rift camera require a high speed USB port? It seems like it would need to essentially stream video to the system for processing but that processing would not take much power. It would be at 60 fps or more, but all it is doing is tracking LEDs which doesn't take much. It doesn't seem like a very good solution since expanding to multiple cameras and multiple tracked devices will increase overhead. There is no overhead for multiple lighthouse units and there is minimal overhead for tracking more devices. I would expect a second generation Rift to switch to lighthouse style tracking.

April 6, 2016 | 03:47 PM - Posted by Anonymous (not verified)

The Rift requires USB 3.0 for position tracking. Using only USB 2.0 will make the position tracking on the Oculus quite unstable if you have other USB devices connected at the same time, so it's quite demanding actually. So a proper USB 3.0 connection for at least the Oculus tracking camera is a minimum requirement in my eyes if you want rock-steady tracking.

April 7, 2016 | 05:12 AM - Posted by psuedonymous (not verified)

>Position tracking is done in hardware on the Vive, so no CPU overhead necessary.

This is incorrect.

Both the Vive and Rift perform the model-fit algorithm and sensor fusion (the actually CPU-intensive part, though not all that intensive) on the host CPU. The only difference between the two systems is how they populate the array of X/Y camera-relative (or basestation-relative) coordinates for each tracked object: Lighthouse does so asynchronously, with the sensor readouts providing sweep-relative timings, while Constellation does it synchronously by processing the camera image (thresholding then centroid-finding, and logging the blink codes frame-to-frame for marker ID). While the Constellation cameras do use the host CPU to do the image processing, this is a very basic process: in the Wii Remote this processing was done by the ISP on the PixArt camera, to give an example of how low-power the task is.
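
To give a sense of how light that per-frame work is, a bare-bones sketch of the threshold-then-centroid step might look like the following (purely illustrative; this is not Oculus' actual pipeline, and the frame format and threshold value are assumptions):

```python
import numpy as np
from scipy import ndimage

def led_centroids(frame: np.ndarray, threshold: int = 200):
    """Return (y, x) centroids of bright blobs in a grayscale IR frame."""
    mask = frame > threshold                # keep only the bright pixels
    labels, count = ndimage.label(mask)     # group them into blobs
    return ndimage.center_of_mass(mask, labels, range(1, count + 1))
```

Matching those centroids to specific LEDs via their blink codes, and then fitting the known LED model to them, happens on top of this, but the per-frame image work itself really is that small.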

April 7, 2016 | 01:40 PM - Posted by Allyn Malventano

The Lighthouse inventor has stated that pose / positional calculations are done in an ASIC within the controllers and HMD. Is this different from what you are talking about?

April 11, 2016 | 08:00 AM - Posted by Anonymous (not verified)

From Alan Yates (reddit username vr2zay): https://www.reddit.com/r/oculus/comments/3a879f/how_the_vive_tracks_posi...

" Presently the pose computation is done on the host PC, the MCU is just managing the IMU and FPGA data streams and sending them over radio or USB.

A stand-alone embeddable solver is a medium term priority and if Lighthouse is adopted will likely become the standard configuration. There are currently some advantages to doing the solve on the PC, in particular the renderer can ask the Kalman filter directly for predictions instead of having another layer of prediction. It also means the complete system can use global information available to all objects the PC application cares about, for example the solver for a particular tracked object can know about Lighthouses it hasn't seen yet, but another device has."

April 13, 2016 | 04:47 PM - Posted by Allyn Malventano

Interesting. In his interview video on Tested he talked as though everything were being done in the ASIC, but that was back when everything was a prototype, so that must have changed in production.

April 5, 2016 | 07:24 PM - Posted by Anonymous (not verified)

I think that the Lighthouse technique would have almost no CPU overhead. It just reads the IR sensors on the headset and determines the location based on the different times of arrival. It then just reports the location data to the system. This is basically how GPS works. GPS satellites are moving, though, so they broadcast the time and position. The receiver then calculates position based on the time difference of arrival of the signals.

With the Rift, it actually has to take a picture with the camera at (I would guess) 90 to 120 Hz. It then has to process each image to determine the location of the IR LEDs on the headset. It is probably a low-resolution image though. If the camera attachment had an integrated processor, then the CPU usage would be about the same, but I doubt that it has one. The Xbox One actually used a portion of its processing power to handle input from the Kinect camera, although it was doing more than just tracking IR lights. If the Rift camera requires a high-speed USB port, then it is probably streaming the images to the computer for processing, since just sending coordinates would be low bandwidth. Making the camera separate seems like it was a bad idea. Even the CastAR device integrated the camera on the headset and used stationary LEDs for tracking.

The light house solution seems much better. It allows you to completely rotate around in a larger area. To do that with the Rift's set-up, you would need multiple, probably wired cameras. The tracking is done by the camera, not the headset, so wireless cameras would probably be too high latency. With light house, the boxes are essentially passive. They have some communication ability to sync with other light house units, but that is it.

People seem impressed with the Rift's screen/lenses, but that is about it. You will be limited to sitting there with an old-style game controller. It cannot easily do things like the "Hover Junkers" game, which uses 3D-mapped controllers to represent guns: you just point at what you want to shoot and pull the trigger. You can't do this with the Rift. In fact there are a lot of games which have been demonstrated on the Vive Pre which cannot easily be implemented with the Rift hardware, while any game for the Rift could easily be played on the Vive. I think Facebook needs to make a new revision of the Rift using Lighthouse-style tracking and with 3D-mapped controllers. This would obsolete this initial version of the Rift though.

April 5, 2016 | 09:21 PM - Posted by Anonymous (not verified)

My guess would be that the Vive has lower position reporting latency and better accuracy. The Rift is the cheaper option to build due to the lower number of electronic parts though.

I'm glad PCPer is pushing full throttle into VR technology, reminds me of the nascent SSD days.

April 6, 2016 | 03:19 AM - Posted by Anonymous (not verified)

The Vive is supposed to be $100 more, I think. I don't know if that represents that much of a difference in the cost to make the device though. I suspect that the Rift is being sold at closer to cost than the Vive. I kind of doubt that the two Light House units cost $100 to make; they mostly look like off-the-shelf components. You get more with the Vive though, so that extra money is probably worth it. The Rift will be a very limited experience just sitting with an old style game controller. I don't know what they will be doing to support 3D mapped controllers with the Rift. The current implementation doesn't seem like it can support it easily.

April 6, 2016 | 05:31 AM - Posted by Anonymous (not verified)

You're right about the Rift controller being limited. Also, with just one camera, as soon as you turn away it's screwed; I don't think you can get past that, so the controllers will need another camera as well.

I'm not sure how cost is factored into the price of either of these devices (or overhead). It seems likely that both could sell at a loss to get people using their platform (like consoles).

April 6, 2016 | 06:03 PM - Posted by Allyn Malventano

The Rift is designed to track position even when looking to the side / away. There are LEDs all the way around. Agreed on the controller tracking though.

April 6, 2016 | 09:28 AM - Posted by DakTannon (not verified)

(Question) Hey Ryan, have you guys tested the USB compatibility of the CV1 yet? I have heard a LOT of people have been getting incompatibility ratings in the Oculus tool, and I don't know whether those parts will actually cause trouble, whether they just aren't on Oculus' whitelist (the Vive doesn't have this problem at all), or whether the tool is just being overprotective. For example, I have a 3930K @ 4.8 GHz and it says my CPU isn't good enough, but even per core it is stronger than a 4590 (I got an 11 on the SteamVR tool), so I know they are wrong. It also says my USB is no good; that could be because the X79 platform is not on their radar, or because, as it turns out, if you have even one incapable USB port alongside four usable ones, the tool will still fail you unless you disable the offending ports. You guys could test an older but viable platform like Z77 or X79, from before Intel had integrated USB 3.0.

April 6, 2016 | 06:05 PM - Posted by Allyn Malventano

From what we've gathered so far, the Rift camera is picky and prefers native USB 3.0 implementations. 

April 7, 2016 | 12:00 AM - Posted by rhekman

Allyn, thanks for the neat breakdown and analysis. We've known for some time that the Lighthouse system's base stations are mostly passive and the actual sensing and location calculation happen on the headset and controllers. Despite that knowledge, so many outlets and commentators online continue to refer to the lighthouses as "sensors" or "cameras". Thanks for providing a resource to point to so that can be corrected.

I'd also like to point out that the Lighthouse system has an analog in GPS. The base stations are similar to GPS satellites in that they provide a timing signal via their blinks, and identification from the order of the sweeps. The headset and controllers have the sensors and logic to pick up the signals and then do the trigonometry to calculate their position. It's not a perfect analogy, since Lighthouse uses IR instead of RF, and the receivers have more sensors in order to determine orientation as well as position. Also, the IMUs play an important role in responsiveness, power consumption, and orientation accuracy.

April 9, 2016 | 08:36 PM - Posted by Anonymous (not verified)

It actually is quite similar to GPS, but it is not, strictly speaking, triangulation. There have been a lot of people referring to such systems as triangulation when they are really multilateration. They do not measure angles to something as with triangulation. With triangulation, you measure the angles that the signals are arriving from, and it is very simple to determine position based on trigonometric identities. Actual triangulation also only requires two receivers; it is basically how your eyes determine the position of something: they turn inward while focusing on something close, providing an angle measurement to your brain.

GPS and Lighthouse use the time difference of arrival of signals to determine location. With GPS, the satellites broadcast the time and their position, since they are moving. All of the satellites have extremely accurate atomic clocks and all are synced precisely. The receiving device can determine its location based on the time difference in the arrival of the signals. Since the time is embedded in the signals, the receiver doesn't need to have a super accurate clock. The distances are great enough that the speed-of-light signals take a sufficiently long time in transit to provide a measurable difference in arrival time. The solution is calculated, as far as I know, by solving a system of equations, not by applying trigonometric functions.

The Lighthouse system doesn't use the time difference of arrival based on the speed of light; over that small a distance it would be difficult to measure. It uses the time difference of when the laser sweeps over the sensors instead. If the lighthouse is sweeping right to left (based on your frame of reference), the beam will strike the sensors on your right first and sweep across to the sensors on your left. The timing of when each sensor is triggered can then be plugged into a system of equations to solve for your position relative to the lighthouse. I don't know exactly what that system of equations looks like, but it is not the simple trigonometry used in triangulation. Computers can solve such systems of equations very easily though.
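
For anyone curious what "plugged into a system of equations" can look like in practice, here is a deliberately simplified 2-D toy (our own illustration, not the actual solver): the known sensor layout on a tracked object, plus the sweep angles measured from one station, are handed to a generic nonlinear least-squares routine, which recovers the object's position and rotation.

```python
import numpy as np
from scipy.optimize import least_squares

# Known sensor positions in the tracked object's own frame (metres).
SENSORS = np.array([[0.00, 0.00], [0.30, 0.00], [0.15, 0.25]])

def predicted_angles(pose):
    """Sweep angles at which a station at the origin would hit each sensor,
    for an object at pose = (tx, ty, theta)."""
    tx, ty, theta = pose
    rot = np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta),  np.cos(theta)]])
    world = SENSORS @ rot.T + [tx, ty]          # sensors in room coordinates
    return np.arctan2(world[:, 1], world[:, 0])

# Simulate measurements for a "true" pose, then solve for it from the angles.
true_pose = [1.2, 0.9, 0.4]
measured = predicted_angles(true_pose)
fit = least_squares(lambda p: predicted_angles(p) - measured, x0=[1.0, 1.0, 0.0])
print(np.round(fit.x, 3))  # lands close to [1.2, 0.9, 0.4] from this guess
```

The real solve is the same idea in 3-D, with more sensors, two sweep axes per station, and IMU data folded in.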

April 7, 2016 | 10:22 PM - Posted by Kingkookaluke (not verified)

This whole crew thing is a fad,

April 8, 2016 | 06:11 AM - Posted by Anonymous (not verified)

Hey,
Very nice article. I have question though.
In the podcast you said you only get a position at 15 Hz. But even with 2 base stations, and one of them occluded, you should get 30 positions per second, because the sweeps from one station should be sufficient to determine your position. The second lighthouse is just there for better occlusion coverage.

Or am I missing something?

April 8, 2016 | 03:32 PM - Posted by Allyn Malventano

You're right, I divided by too many twos on the podcast. It's 30 Hz when one station is occluded, 60 with both visible (assuming a controller can track under both simultaneously, which we are not sure of yet).

April 18, 2016 | 07:04 PM - Posted by Anonymous (not verified)

I would say they can track both simultaneously. They can see the blinks as well, so they know the second set of sweeps is from a different base.

April 18, 2016 | 09:40 PM - Posted by Tom S (not verified)

Hi,

Can you talk about expandability? Valve has on a couple of occasions mentioned that this system is very expandable by adding base stations. However, the question recently came up on Reddit/r/vive, and someone used your article as a point that if you keep increasing the number of lighthouse units, you run into issues. Cut & paste: "The more you add, the more infrequent each base's sweeps are sent out. Only one sweep can fill a volume at a time; otherwise the sensors will get confused about which sweep they are seeing. So, more base stations means each one has to wait longer before it can sweep again while the others get their turn. Other than that, adding additional base stations is as easy as firmware updates.
As it is, each base station does its sweeps at 30 Hz. That would be down to 20 Hz with three and 15 Hz with four. If the other three were occluded, then you'd be down to only 15 Hz tracking with the one station that can see you, with ~50 ms between each sweep set.
That is, unless they can change the speeds of the motors at will, in which case this all goes out the window. However, if they speed them up, the sweeps cross the volume faster, and that means reduced reaction time for the sensors to do their math. Not sure if they're anywhere close to the limits of the ASICs they're using. Also, higher speed means more noise and more Lighthouse motor wear."

It seems to me Valve must have thought of this? However, getting hold of anyone there is a crap-shoot, and you seem to know a lot on the subject!

April 22, 2016 | 02:05 PM - Posted by jkostans (not verified)

I think your math is correct regarding scalability vs. report rate. I think even 3 stations would potentially create a scenario where you may occlude 2 and only be getting 20 Hz data from a single station. As it is now, 30 Hz from a single station isn't enough to maintain solid tracking, so dropping to 20 Hz would just make that worse. If you instead have a setup where no more than 1 station is occluded at any time, you would only be dropping to a 40 Hz rate, with two different viewpoints. That situation is much better than a 2-station setup with one station occluded.
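
To spell that rate arithmetic out (assuming the 8.33 ms blink slots and the take-turns sweeping described in the article; the formula is a simplification):

```python
BLINK_HZ = 120  # sync blinks per second

def per_station_rate(num_stations: int) -> float:
    """Complete X+Y sweep sets per second each station can deliver when
    every station needs two blink slots and they all take turns."""
    return BLINK_HZ / (2 * num_stations)

for n in (1, 2, 3, 4):
    print(n, "station(s):", per_station_rate(n), "Hz per station")
# 1 -> 60.0, 2 -> 30.0, 3 -> 20.0, 4 -> 15.0
```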

In order to really scale up, there needs to be some frequency domain separation of the lighthouses. That way they can all sweep at the same time and the receiver should be able to differentiate between the stations in the frequency domain. This may or may not be possible with firmware updates. I doubt it because I would think this would have been the route they would have taken initially if the hardware was capable.

May 10, 2016 | 12:04 PM - Posted by Anonymous (not verified)

How does the controller of the Vive know which base station hit it, so that it can calculate the right relative position?

June 19, 2016 | 11:20 AM - Posted by Dale A (not verified)

The system is identical to how aircraft VOR navigation works.

At 0 degrees, an omnidirectional radio transmission occurs. Then, as a directional radio signal sweeps across your aircraft, the time difference between the two gives you the angle from the station.

I've spent the last few weeks integrating both the Vive and the Rift into Aces High. The Rift's software prediction for lost frames is very impressive compared to OpenVR's.

The Vive controllers force a rethink of how to do all player interactions. A simple thing like changing the controller into a hand makes it easy to just press virtual buttons with your finger for GUIs, or to reach over and grab the virtual joystick or throttle in an aircraft.

HiTech

November 24, 2016 | 10:26 PM - Posted by Student (not verified)

How does the Lighthouse tracking system work at a low level, in terms of the mathematics needed to process the inputs? What are the inputs and outputs?