My Romance with XR
July 18, 2018
Dr Andrew Yew
My first ever XR application was completed in 2009 for my final-year project during my Bachelor's studies at the National University of Singapore. Okay, it was an AR application. XR probably encompasses so much more than AR. Confused? Medium.com gives a short summary of the long path it has taken to reach an uncomfortable consensus on what the terms AR/VR/MR/XR mean. Fun fact: medium.com had only recently been unblocked in Malaysia after the new Pakatan Harapan government took over.
Back to the matter at hand. My first ever AR application was completed in 2009 for my final-year project during my Bachelor's studies at the National University of Singapore Augmented Reality Lab. It was a collaborative tool where two users in different locations could manipulate the same 3D models viewed in Augmented Reality (AR). Today, an 11-year-old could probably do that in three days. This 11-year-old would probably take two hours.
Then I continued there as a PhD candidate and a research fellow for almost 7 years. While I was there, I encountered several things that absolutely blew my mind, including:
1. Markerless augmented reality
2. Bare-hand human-computer interaction
3. Haptic feedback in mid-air
4. Actual “no-trickery” holograms
5. Augmented reality sandbox
Today, (1) comes built into the latest phones, and (2) can be added to any system with a tiny device that costs 80 USD. I tried (3) at AWE 2018, and was part of one of only two teams that won the haptic VR game they used to demo the system *proud*. I haven’t experienced (4) and (5) for myself, but I show these at every presentation I give just to hear the sound of more minds being blown. In case you didn’t know, it sounds like a low bass drop.
When I attended AWE 2018, I felt like Leela from Futurama traveling to what she thought was her long-lost home world of cyclops people, except for me it turned out to be real. All the speakers waxed lyrical about the massive impact XR is making, and how it is changing the face of society. And if any of them had reached their hand out to me, I would have joined them in a femtosecond. Because no matter what name it is called by, XR or AR, I’ve always firmly believed that it would bring about a fundamental change in how we interact in our world.
XR – A Shared Personal Experience
To me, an XR world revolves around the user – all content that the user sees should be relevant to them – but it also blends seamlessly with anyone else’s world. Mark Billinghurst, who is something of a hero of mine, gave a talk at AWE 2018 about telepresence with AR. He described his Shared Sphere project, which is a system where one user (the sender) can share his physical space with another remote user and perform tasks together in the shared space. Through a VR headset, the remote user is immersed in the shared space, which is created using a 360 camera or depth camera. The view of the remote user is independent of the sender and, with bare-hand interaction, the remote user can point at objects and annotate the shared space by sketching in 3D.
My AWE Journey in Pictures!
There’s so much more about AWE and XR that I want to talk about but I have yet to organize these other thoughts. Let me just leave you for now with snippets of my AWE journey.
Talk by Mark Billinghurst
Augmented Teleportation, a talk given by Prof. Mark Billinghurst, co-creator of the legendary ARToolKit. Mark proposed the use of AR to break communication seams between the task space (where stuff is done) and the communication space (where people talk about doing the stuff). This is where he described the Shared Sphere project.
This talk was given by Wrnch.ai CEO Dr. Paul Kruszewski and was about the use of deep learning for motion capture (head, face, body, hands) without suits or sensors, just regular RGB cameras. This saves a lot of cost for traditional mocap applications, but also opens up many new possibilities for interactive and AR applications. If you want to play around with this technology, you can check out the free and open-source OpenPose, but for production-grade applications Dr. Kruszewski’s mocap model is probably more robust and performs better.
This was one of the few talks about Blockchain. Many believe that Blockchain and AR/VR are a match made in heaven. The premise is that with everyone already connected online, there would be enough nodes to maintain a Blockchain network, and the use of a trustworthy system to manage transactions of virtual and physical goods and real estate would simplify the implementation of e-commerce.
The Zapbox 2.0 was on display. It came with a cardboard video-see-through AR viewer, a pointer with a button that works entirely through marker tracking, and a set of markers to establish a ground plane for more sprawling AR content. The pointer worked surprisingly well, but I’m still uncomfortable with video-see-through AR, especially when it’s mobile. I still bought a Zapbox though, hoping to create something awesome with it.
A Plethora of Headsets
So many different types of headsets were on display here, including one to be used while riding a motorcycle. I saw people assembling headsets on the spot in order to demonstrate them to bigger groups. Ultimately, I feel that all these headsets are severely lacking in tracking and graphics quality compared to the Microsoft HoloLens.
These bags are made using fabric from AFFOA, each designed with a unique barcode pattern. This allows every bag to be identified and recognized by a mobile app using computer vision. The app is used to register content such as a name, logo, and social media links to the bag. When the app is used to scan the bag, this content is pulled from the cloud and displayed over the bag. Pretty cool and lots of potential. These fabrics can also be used to make shoes and clothes. I saw that many other AWE attendees had bought a bag, and so did I.
Occlusion of Physical Objects by Virtual Objects
Correct occlusion between physical and virtual objects in AR has long been an issue. Most apps get around this problem by designing experiences where the virtual content is always meant to be in front of the physical environment. When I was working in my lab, I had two methods of solving this. The first was for the occlusion of virtual objects by the user’s hands: I used a skin color segmentation mask to mask out the parts of the virtual content that were supposed to be behind the hand. The second was for physical objects to occlude virtual objects: I obtained the 3D meshes and tracked the poses of the physical objects, and used these to populate the depth buffer of my rendering engine. Stereolabs, depicted in the photo, developed a depth camera and Unity plugin to take care of the problem. Notice that the ping pong table is a virtual object while the plastic bottle in front is real. Forgive my poorly framed photo. Watch this video instead: https://youtu.be/rfskhlS-XT0
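For the curious, here is a minimal sketch of the two occlusion ideas described above, written in plain Python so it runs anywhere. It is not my original lab code. The function names, the YCbCr skin thresholds, and the use of a ready-made per-pixel depth map (standing in for the depth buffer filled from tracked meshes or a depth camera) are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple

Pixel = Tuple[int, int, int]            # (R, G, B), each 0-255
INF = float("inf")                      # "no physical surface measured here"


def is_skin(pixel: Pixel) -> bool:
    """Rough skin test in YCbCr space.

    The Cb/Cr ranges below are a commonly quoted rule of thumb and are an
    assumption here; a real system would tune them for camera and lighting.
    """
    r, g, b = pixel
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return 77 <= cb <= 127 and 133 <= cr <= 173


def composite(
    camera: List[List[Pixel]],
    real_depth: List[List[float]],
    virtual: List[List[Optional[Pixel]]],
    virtual_depth: List[List[float]],
    hands_in_front: bool = True,
) -> List[List[Pixel]]:
    """Blend a rendered virtual layer over a camera frame with occlusion.

    A virtual pixel is shown only if it exists (not None), it is closer to
    the camera than the physical surface at that pixel (the depth test), and
    the camera pixel is not classified as skin (the hand mask).
    """
    out = []
    for y, row in enumerate(camera):
        out_row = []
        for x, cam_px in enumerate(row):
            v_px = virtual[y][x]
            show_virtual = (
                v_px is not None
                and virtual_depth[y][x] < real_depth[y][x]
                and not (hands_in_front and is_skin(cam_px))
            )
            out_row.append(v_px if show_virtual else cam_px)
        out.append(out_row)
    return out


if __name__ == "__main__":
    GRAY, SKIN, GREEN = (128, 128, 128), (224, 172, 138), (0, 128, 0)
    # One row of three pixels: open wall, a bottle in front, a hand.
    frame = [[GRAY, GRAY, SKIN]]
    depth = [[2.0, 0.5, INF]]           # metres; the hand has no depth reading
    table = [[GREEN, GREEN, GREEN]]     # virtual ping pong table, 1 m away
    table_depth = [[1.0, 1.0, 1.0]]
    print(composite(frame, depth, table, table_depth))
```

In this toy frame the table is drawn over the wall, hidden behind the nearer bottle by the depth test, and hidden behind the hand by the skin mask alone. A depth camera like the one Stereolabs showed would supply `real_depth` directly, which is why it makes the hand-specific trick largely unnecessary.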
AR Car Manual
This demo by Vuforia showcases an AR car manual. When you point the app at different areas of the car interior, buttons appear which you tap in order to bring up instructions and information. The demo made use of Vuforia’s latest technology to track physical objects based on their CAD model alone.
The True Scope of XR
I don’t think many people know what XR really is yet. For me, XR basically gives us the ability to receive useful information without having to put in much effort to search, to access and interact with digital services intuitively, and to reach out to people around the world in a manner that makes it easier for them to receive us. XR is enabled by every technology that enhances mankind. Here is my list of five (not click-bait I promise!) key enabling technologies for re-imagining our world in XR:
1. Augmented reality – places digital information and virtual objects seamlessly in our physical world.
2. Natural interaction – human-computer interaction methods such as hand motion tracking, gaze tracking, voice recognition, etc.
3. Internet of Things – sensors and devices all interconnected so that they can provide higher level functionality than what they can achieve individually.
4. Machine learning – raw data about different phenomena is consumed by computers so that they learn to understand the world of humans and the world of nature, allowing them to feed us even more useful information and even automate complicated tasks.
5. Blockchain – a distributed record of transactions and information that is designed to be incorruptible, thus no one has to trust anyone but the blockchain itself, which is trustworthy.
What is the readiness level of technology to create a universal and always-on XR framework? From what I saw at AWE 2018, I would say 97.6% (hello K-2SO fans). (2), (3) and (4) are so advanced now you wouldn’t believe it. However, two things are still lacking: a suitable viewing device for augmented reality, and a Blockchain that has worked out its implementation issues.
Blockchain was only briefly discussed at AWE compared to the other stuff. However, I saw a wider array of different brands of head-mounted display glasses for AR than anything I had ever imagined. When I first started out doing research in the university, a pair of head-mounted display glasses cost upwards of 5,000 USD, and that was just for a pair of bulky display glasses that did not have any processor or tracking hardware. By comparison, the Microsoft HoloLens, which is a standalone system in itself that includes augmented reality tracking hardware and hand tracking sensors, costs 3,000 USD.
The Meta2 Augmented Reality Headset.
HoloLens – Gamechanger? Almost!
The Microsoft HoloLens is an optical see-through display, which is to say, the graphical display is projected onto a transparent surface so that users get to see the physical environment as well. The other category is video see-through, which means the physical environment is captured by cameras and seen as a video display on which the augmented graphics are overlaid. For AR viewing, video see-through displays are pretty much passé, and also really uncomfortable and unsafe.
The Microsoft HoloLens works really really well. All other head-mounted displays pale in comparison to the HoloLens. The closest is probably the Meta2. While the Meta2 boasts a much wider field of view for the AR graphics, its main drawback is that it has to be connected to a computer or laptop. I also did not find the display of the Meta2 as bright (now you get the pale joke?) or clear as the HoloLens, or the tracking to be as stable, which means that the AR graphics were slightly jittery and appeared to “swim” around the physical environment at times. For the HoloLens, the tracking is so stable that a virtual object appears absolutely stationary with respect to the physical location it is registered to.