From the Lab: Exploring AR
Here at Steamclock, we spend a portion of our time each quarter deep diving into new technologies and potential product ideas that might be useful, interesting, and fun. This January, we decided it was time to dig into augmented reality (AR) and what it can do.
While a really compelling consumer AR headset isn’t here yet, AR capabilities on mobile have been growing thanks to increased computing power and the lower cost of sensors like LiDAR (put a pin in that one for later). As developers, we find AR especially exciting: it opens up a new space full of opportunity, where we can try to enhance existing products, solve new kinds of problems, or just find interesting ways to delight users.
Opening the Walls
We brainstormed a lot of AR product ideas potentially worth prototyping, but we zeroed in on the home renovation space. We wondered: how could we make use of AR here? Well, we know that building maintenance can be a big headache – not only because buildings are big (they are! look around you!), but because some of their most important components are hidden away in the walls. Not knowing exactly where your wires and pipes are can make renovations a lot more costly and uncertain.
Our solution to the problem: map out the inside of the walls in AR. Imagine lifting your phone – or later putting on your AR glasses – and seeing all the pipes, wires, studs, and everything else you’d need to plan out a renovation.
The idea: when a building’s walls are open, before the drywall goes up, we could scan and recognize the space, then store that knowledge for later use. No expensive mistakes while renovating!
But from that elevator pitch alone, we’ve already introduced a lot of technical complexity. How do we map out an existing space, recognize components, label and connect components, and load a persisted space with all of the previous metadata? Given the number of technical and practical uncertainties, we put together a small team and embarked on a 6-week “Labs Sprint” to see what we could learn.
Digging In, Writing Code
When there are a lot of uncertain pieces to a project, it’s best to break out a few experiments around the most uncertain parts, start prototyping, and learn as you go. Given that, we decided to focus our sprint on the following four main tasks:
- Mapping out a space using AR
- Persisting that information and recalling it for a particular room
- Detecting light switches in that space using machine learning
- Linking found light switches with a light
After weighing the various platforms we could try this on, we went for iOS. This was closest to our existing wheelhouse as mobile developers, letting us use Swift to test both LiDAR and non-LiDAR devices using the same APIs (Let’s put a second pin in LiDAR now!).
Our initial investigations (read: Googling) quickly led us to ARWorldMap, Apple’s ARKit API for persisting and loading a real-world space. ARWorldMap maps out a space using feature points in your environment, lets you place anchors in the map, and then lets you persist that information on device. When it’s time to load the space, you’re shown a snapshot of the space to line up your camera with; ARWorldMap then compares the feature points and, if they match, lays out the persisted AR map around them.
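As a rough illustration of that save-and-restore flow, here’s a minimal sketch using ARKit’s real `getCurrentWorldMap` and `initialWorldMap` APIs. The file name and the stripped-down error handling are our own simplifications, not code from the sprint:

```swift
import ARKit

// Hypothetical save location for the archived map.
let mapURL = FileManager.default.urls(for: .documentDirectory, in: .userDomainMask)[0]
    .appendingPathComponent("room.worldmap")

func saveWorldMap(from session: ARSession) {
    session.getCurrentWorldMap { worldMap, _ in
        guard let map = worldMap else { return }
        // ARWorldMap conforms to NSSecureCoding, so it can be archived to disk.
        if let data = try? NSKeyedArchiver.archivedData(withRootObject: map,
                                                        requiringSecureCoding: true) {
            try? data.write(to: mapURL)
        }
    }
}

func loadWorldMap(into session: ARSession) {
    guard let data = try? Data(contentsOf: mapURL),
          let map = try? NSKeyedUnarchiver.unarchivedObject(ofClass: ARWorldMap.self,
                                                            from: data) else { return }
    let configuration = ARWorldTrackingConfiguration()
    // ARKit relocalizes once it matches enough feature points in view.
    configuration.initialWorldMap = map
    session.run(configuration, options: [.resetTracking, .removeExistingAnchors])
}
```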
For our demo purposes, this was a clear win, allowing us to build on Apple’s ARWorldMap sample project to fulfill 2 of our 4 initial goals: mapping out a space and persisting it. In terms of an eventual product, though, ARWorldMap’s reliance on feature points in the environment – effectively requiring that the environment keep looking the same – seemed like it would prevent us from comparing a room at the time of construction (with open walls) against that same room later, after drywalling. We set out to verify whether that was the case.
After a series of experiments to build our understanding of ARWorldMap’s limits, we eventually picked up a LiDAR enabled device for the project. It was finally time to pay the piper with our pair of pickled pins: What is a LiDAR sensor, and why do they matter?
A LiDAR sensor determines the range between itself and an object using a laser, measuring the time it takes for the reflection to return. This allows for better depth mapping of a space than a camera-plus-machine-learning approach. In particular, we found that accurately mapping out a plane is only reliable on LiDAR-enabled iOS devices. That gave us what we needed for the demo, though we filed away the discovery that a fully functional version of the features we’d envisioned would probably only be possible on the smaller install base of LiDAR-enabled devices.
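In practice, ARKit exposes LiDAR-backed depth mapping as “scene reconstruction,” which you feature-gate at runtime rather than checking for the sensor directly. A minimal sketch of how an app might opt in on capable hardware:

```swift
import ARKit

// Sketch: enable mesh reconstruction only where the hardware supports it.
func makeConfiguration() -> ARWorldTrackingConfiguration {
    let configuration = ARWorldTrackingConfiguration()
    if ARWorldTrackingConfiguration.supportsSceneReconstruction(.mesh) {
        // Only LiDAR-equipped devices pass this check, so non-LiDAR
        // hardware silently falls back to plain plane detection.
        configuration.sceneReconstruction = .mesh
    }
    configuration.planeDetection = [.horizontal, .vertical]
    return configuration
}
```

This runtime check is also what lets the same Swift code run on both LiDAR and non-LiDAR devices, as mentioned earlier.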
For identifying light switches with machine learning, Core ML was the natural choice. Apple’s example project for Core ML was useful, letting us use a pre-built object classification model to tell whether a light switch was on screen. For our purposes, though, we didn’t want to classify an image – we wanted to detect an object and obtain a bounding box for its position. We weren’t able to find a pre-built Core ML model that could detect a light switch out of the box, so we opted to train our own. Surprisingly, this turned out to be easier than expected: we combined an open-source dataset that included images of light switches with Create ML. A couple thousand iterations later, we had a model that could somewhat reliably detect light switches in the environment and draw a bounding box around them.
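For the curious, running a detector like that typically goes through the Vision framework, which wraps a Core ML model and returns observations with normalized bounding boxes. A sketch, where `LightSwitchDetector` stands in for whatever name a Create ML-trained model is given:

```swift
import Vision
import CoreML

// Sketch: run a hypothetical Create ML object detection model on a camera frame.
func detectLightSwitches(in pixelBuffer: CVPixelBuffer,
                         completion: @escaping ([VNRecognizedObjectObservation]) -> Void) {
    guard let coreMLModel = try? LightSwitchDetector(configuration: MLModelConfiguration()).model,
          let visionModel = try? VNCoreMLModel(for: coreMLModel) else { return }

    let request = VNCoreMLRequest(model: visionModel) { request, _ in
        // Object detection models yield observations carrying a label,
        // a confidence, and a normalized bounding box.
        completion(request.results as? [VNRecognizedObjectObservation] ?? [])
    }
    let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer)
    try? handler.perform([request])
}
```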
That left just one feature of our four: manually placing a light and connecting it to the detected light switch. Leveraging our new experience with projecting into 3D space, we were able to quickly build a rough version of this feature. Our demo now supported a simple flow for mapping or loading a room, detecting light switches, and adding a corresponding light for each switch. The connection between a switch and its light is indicated through colour, and we used two different shapes to distinguish the two kinds of object. One demo presentation and a project debrief later, we had completed our first AR demo!
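The “projecting into 3D space” step above can be sketched with ARKit’s raycasting API: take the detector’s 2D bounding box, raycast through its center, and drop an anchor at the hit point on the wall. The coordinate flip and anchor name here are our own illustrative choices:

```swift
import ARKit

// Sketch: turn a detected 2D bounding box into a 3D anchor on a wall.
// `boundingBox` is a Vision-style normalized rect (origin at bottom-left).
func placeAnchor(for boundingBox: CGRect, in sceneView: ARSCNView) {
    // Convert the box's center into view coordinates; Vision's origin is
    // bottom-left while UIKit's is top-left, hence the y flip.
    let center = CGPoint(x: boundingBox.midX * sceneView.bounds.width,
                         y: (1 - boundingBox.midY) * sceneView.bounds.height)

    guard let query = sceneView.raycastQuery(from: center,
                                             allowing: .estimatedPlane,
                                             alignment: .vertical),
          let result = sceneView.session.raycast(query).first else { return }

    // Anchor the detected switch at the raycast hit so a light can be
    // associated with it later.
    let anchor = ARAnchor(name: "lightSwitch", transform: result.worldTransform)
    sceneView.session.add(anchor: anchor)
}
```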
And with that, our Labs Sprint was a success. We learned a lot about ARKit and Core ML, and we even created our own object detection ML model – something we weren’t expecting when we started. As a team, we also learned a lot about our process for future tech explorations. We struggled a bit with spreading exploratory work between all the team members, but we’ve learned how to better set up research and planning for this kind of project. Next up: more experiments – especially now that Apple has announced their RoomPlan AR scene detection technology this week. AR is just getting started.
Interested in future posts or announcements? Subscribe to our feed.