Several years ago, I watched a presentation by Andy Goodman titled Zero UI. He spoke about designing for a screen-less world, where we’d be interacting with spaces and objects in a more immersive way by relying on natural gestures and voice input. The prospect of designing outside of rectangles, for contexts extending to cities, homes, education, commerce and more, thrilled me!
Over the last year, I’ve had the opportunity to design Augmented Reality products with the Spark AR team at Facebook. I’ve enjoyed experimenting within this emerging platform, and it’s incredible to watch AR get us closer to Andy’s vision of future interfaces.
Here are a few things I’ve learned along the way. I hope these are helpful to anyone who is itching to jump into AR design!
How does Augmented Reality work?
Computer vision is used to understand what is where in the world. Just as our brains are trained to recognise aspects of the world around us, computer vision uses semantics and geometry to inform a live camera feed about what it is seeing and where things are.
In practice, Augmented Reality could enable a potential customer to try on a pair of sunglasses virtually: by recognising a human face and finding the landmark points it shares with other faces, the system understands where your eyes are positioned and, in turn, where the sunglasses should be anchored.
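To make that anchoring step concrete, here’s a minimal sketch in TypeScript. It assumes a face tracker that supplies left- and right-eye landmark positions (the names and shapes here are hypothetical, not any particular SDK’s API):

```typescript
// Hypothetical landmark output from a face-tracking model.
type Point3 = { x: number; y: number; z: number };

interface FaceLandmarks {
  leftEye: Point3;
  rightEye: Point3;
}

// Anchor the glasses at the midpoint between the eyes, scaled by the
// inter-eye distance so they fit faces of different sizes.
function sunglassesAnchor(face: FaceLandmarks): { position: Point3; scale: number } {
  const { leftEye: l, rightEye: r } = face;
  const position = {
    x: (l.x + r.x) / 2,
    y: (l.y + r.y) / 2,
    z: (l.z + r.z) / 2,
  };
  const scale = Math.hypot(r.x - l.x, r.y - l.y, r.z - l.z);
  return { position, scale };
}
```

A real try-on effect would also need the head’s rotation to orient the glasses, but position and scale are the core of the idea.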
As computer vision’s understanding of the world continues to improve, the range of semantics paired with geometry will enable us to understand more complex shapes in the physical world, and we’ll be able to augment more specific objects and places.
Today’s augmented reality is informed by simpler information inputs like planes, points, depth, and common visual patterns in 2D images.
Types of AR Tracking
Image (Marker), Planes & People (Marker-less), and Location-based Tracking are the three most common types of AR tracking.
Tracking Images & Textures
Marker, Target or Recogniser-based AR tries to understand unique characteristics of an image, in order to add relevant digital content. This means you can use an image or texture to trigger AR content to appear on top of or near it.
To learn more about what types of images can be used for AR image tracking, check out Best Practices for Target Tracking in Spark AR Studio.
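To make the trigger-and-anchor idea concrete, here’s a hedged sketch of the per-frame marker-tracking loop in TypeScript. The detector is a stand-in for a real computer-vision tracker; only the show/hide-and-anchor logic is the point:

```typescript
interface Pose {
  position: [number, number, number];
  rotation: [number, number, number, number]; // quaternion
}

interface Detection {
  found: boolean;
  pose?: Pose;
}

// Stand-in: a real SDK would match the camera frame's features against the
// target image and estimate its pose.
function detectImageTarget(frame: unknown, targetId: string): Detection {
  return { found: false }; // placeholder result
}

interface ArContent {
  visible: boolean;
  pose?: Pose;
}

// Each frame: show the content only while the marker is visible, and lock it
// to the marker's estimated position and orientation.
function updateMarkerContent(frame: unknown, content: ArContent): void {
  const detection = detectImageTarget(frame, "movie-poster");
  content.visible = detection.found;
  if (detection.found) content.pose = detection.pose;
}
```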
Tracking Spaces and People in the World
Marker-less AR detects and maps a real-world environment or a person using inputs such as points on a face or surfaces in a room to understand where digital content should be placed.
Digital content is anchored with the help of SLAM (Simultaneous Localization and Mapping) which tracks feature points in a physical environment, making it possible for a camera to identify the distance between an object or surface and the user. SLAM enables us to track a specific scene or object in the world. Examples of marker-less AR include tracking planes, points of intersection, faces, hands and other parts of the body.
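Once SLAM has localised both the camera and a feature point in a shared world coordinate system, the distance estimate mentioned above comes down to plain vector math. A minimal sketch in TypeScript:

```typescript
type Vec3 = [number, number, number];

// Straight-line distance between the camera and a tracked feature point,
// both expressed in the same world coordinate system.
function distanceToFeature(camera: Vec3, feature: Vec3): number {
  return Math.hypot(
    feature[0] - camera[0],
    feature[1] - camera[1],
    feature[2] - camera[2],
  );
}

// e.g. camera at the origin, a surface point 1.2 m ahead and 0.5 m below:
console.log(distanceToFeature([0, 0, 0], [0, -0.5, -1.2])); // ≈ 1.3 metres
```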
Location-based AR
Live location data (GPS) is used to position digital content to specific coordinates in the physical world. This means you can anchor AR content contextually to points of interest, roads, businesses, etc. If you’re interested in working with location datasets, I’d recommend looking at Mapbox.
A great example of location-based AR used for navigation is Google Maps Live View.
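At its core, deciding whether to show a point of interest’s content is a distance check between two GPS coordinates. Here’s a small TypeScript sketch using the haversine formula (the 50 m radius is an illustrative threshold, not anything prescribed by a particular platform):

```typescript
interface GeoPoint {
  lat: number; // degrees
  lon: number; // degrees
}

// Great-circle distance between two GPS coordinates (haversine formula).
function haversineMetres(a: GeoPoint, b: GeoPoint): number {
  const R = 6_371_000; // mean Earth radius in metres
  const toRad = (deg: number) => (deg * Math.PI) / 180;
  const dLat = toRad(b.lat - a.lat);
  const dLon = toRad(b.lon - a.lon);
  const h =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(a.lat)) * Math.cos(toRad(b.lat)) * Math.sin(dLon / 2) ** 2;
  return 2 * R * Math.asin(Math.sqrt(h));
}

// Show a point of interest's AR content only when the user is nearby.
const shouldShowPoi = (user: GeoPoint, poi: GeoPoint) =>
  haversineMetres(user, poi) < 50;
```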
Context is King
Designers are responsible for simplifying how users consume information on any given interface, whether based on a user’s intent at a certain point in time or their location within a product. This will not, and should not, change as platforms and products evolve towards spatial interfaces.
Currently, spatial awareness is mostly limited to surfaces, points, and images around the user. To make AR truly utilitarian, we will need to anticipate human needs. Serving users AR content that is contextual to their needs may require combining various sets of information: time, location, face recognition, environment semantics, and so on. Understanding a person’s position and intent in the world as well as we understand a human face will be an essential challenge for the success of AR.
Interaction Design 2.0
When we think of interaction design in the more traditional sense (2D interfaces), designers rely on inputs like a mouse, finger or stylus. Augmented Reality affords interaction through facial gestures, voice, audio, and haptics. Technologies such as hand tracking will enable us to perform more complex hand gestures, diversifying the range of possible inputs within AR/VR experiences.
Hand tracking is used in Oculus Quest to replace controller input. — Credit: Oculus Quest by Facebook
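As one concrete example of a hand-tracking input, a “pinch” can be detected by checking whether the thumb tip and index fingertip are close together. The landmark names below are hypothetical stand-ins for whatever a hand-tracking model provides, and the 2.5 cm threshold is illustrative:

```typescript
type Point3 = { x: number; y: number; z: number };

// Hypothetical fingertip positions from a hand-tracking model.
interface HandLandmarks {
  thumbTip: Point3;
  indexTip: Point3;
}

// A pinch begins when the two fingertips come within a small threshold.
function isPinching(hand: HandLandmarks, thresholdMetres = 0.025): boolean {
  const d = Math.hypot(
    hand.thumbTip.x - hand.indexTip.x,
    hand.thumbTip.y - hand.indexTip.y,
    hand.thumbTip.z - hand.indexTip.z,
  );
  return d < thresholdMetres;
}
```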
Spaces & Coordinate Systems
When building mobile AR, we can place digital content in either screen space (2D coordinates) or world space (3D coordinates). Content placed in screen space appears on the device screen like typical UI: notifications, menus, overlays, and heads-up displays. Virtual objects and content placed in world space can be attached to surfaces and people in a physical space.
Objects in World Space (left) are anchored to surfaces, objects or people. Objects in Screen Space (right) are anchored to the device screen.
As when working in 3D software, when building AR we should be aware of local and global coordinate systems. In practice, placing an object within the device camera’s coordinate system means that when you move your device around, the virtual object stays locked in position relative to the camera lens: the object and the camera share a local coordinate system and always move together, so the object’s position relative to the camera never changes. Alternatively, if the object is placed outside the camera’s coordinate system, the camera and the object have independent coordinate systems, and their positions in world space change independently of one another.
Local coordinates are the coordinates of an object relative to its parent.
World coordinates are its global coordinates in world space.
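The distinction is easy to see in code: an object’s world position is its local position run through its parent’s transform. A minimal TypeScript sketch, handling only translation and yaw rotation for brevity:

```typescript
type Point3 = { x: number; y: number; z: number };

interface Transform {
  position: Point3;    // parent's position in world space
  yawRadians: number;  // parent's rotation around the vertical (y) axis
}

// Rotate the local offset by the parent's yaw, then translate by the
// parent's world position.
function localToWorld(local: Point3, parent: Transform): Point3 {
  const cos = Math.cos(parent.yawRadians);
  const sin = Math.sin(parent.yawRadians);
  return {
    x: parent.position.x + local.x * cos + local.z * sin,
    y: parent.position.y + local.y,
    z: parent.position.z - local.x * sin + local.z * cos,
  };
}

// An object parented to the camera keeps the same local coordinates as the
// camera moves; its world coordinates change with the camera's transform.
const camera: Transform = { position: { x: 2, y: 1.6, z: 0 }, yawRadians: Math.PI / 2 };
console.log(localToWorld({ x: 0, y: 0, z: -1 }, camera)); // 1 m in front of the camera
```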
Being comfortable with being uncomfortable
Augmented Reality is such a new space, with an undetermined path to a somewhat speculative destination. Feeling a sense of impostor syndrome is inevitable when starting out in this industry. When I joined the Spark AR design team, I had limited knowledge of 3D design and no experience working with AR technologies. Diving into software like Cinema 4D, Blender and Unity helped me establish a better understanding of 3D fundamentals. Later on, working more closely with tracking technologies served as a building block for uncovering problems and getting closer to current and future user pain points.
One of the exciting things about working in AR is that most of the problems posed require new solutions, for users and platforms that might not yet exist. But because it is early days in the trajectory of spatial computing, designers, engineers, data scientists, researchers, content strategists, product managers, and marketers (all of us) are learning together!
Start making
The best way to get familiar with AR design is to start making! Here are some tools I think are worth trying out if you are interested in diving into AR creation. Have fun!
AR Creation (without coding):
- Spark AR Studio & Spark AR Player (iOS) (Android)
- Torch
- Adobe Aero
- Lens Studio
3D Scanning:
Thank you to the Spark AR team at Facebook for making it a great place to work, and to my mentors for helping me learn and grow in this industry.
Help others find my article on Medium by clapping below. And follow me on Twitter, Medium, and LinkedIn.