Road to VR
Microsoft Reportedly Orders Samsung Micro OLEDs to Restart XR Hardware Ambitions
7 August 2024 at 10:45

By: Scott Hayden

Samsung Micro OLED at MWC 2024 | Image courtesy Samsung

According to a report from Korean tech outlet The Elec, Microsoft has contracted Samsung to supply micro OLED display panels for what is described as “next-generation mixed reality devices.”

Citing industry sources, the report maintains the order could reach into the “hundreds of thousands” of micro OLED displays, with such a Microsoft XR device reportedly slated to arrive as early as 2026.

Unlike Meta’s current line of Quest headsets, the alleged Microsoft headset will be used for “enjoying or watching content such as games or movies rather than the metaverse,” the report maintains (machine translated from Korean), potentially putting it in competition with Apple Vision Pro.

Since Microsoft’s abandonment of its WMR platform late last year and ongoing stagnation around its HoloLens AR platform, the company has mostly concentrated on smaller XR software projects.

Microsoft HoloLens 2 | Image by Road to VR

In January 2024, the company released support for 3D and VR meetings in Mesh, its immersive chatting platform. The company later announced at its Build developer conference in May it was bringing Windows Volumetric apps to Quest.

Since the release of Vision Pro earlier this year however, competing—or at least preparing to compete—with Apple seems to be the order of the day.

Samsung and Google confirmed in July their forthcoming “XR platform” will be announced sometime this year. The ‘platform’, which is thought to be hardware built by Samsung and an Android XR operating system built by Google, was previously reportedly delayed in an effort to better compete with Vision Pro.

Meta is also apparently looking to compete with Vision Pro, with a device reported to arrive sometime in 2027.

Thanks to Brad ‘SadlyItsBradley’ Lynch for pointing us to the news.

IEEE Spectrum Recent Content full text
How LG and Samsung Are Making TV Screens Disappear
29 July 2024 at 15:00

By: Alfred Poor

A transparent television might seem like magic, but both LG and Samsung demonstrated such displays this past January in Las Vegas at CES 2024. And those large transparent TVs, which attracted countless spectators peeking through video images dancing on their screens, were showstoppers.

Although they are indeed impressive, transparent TVs are not likely to appear—or disappear—in your living room any time soon. Samsung and LG have taken two very different approaches to achieve a similar end—LG is betting on OLED displays, while Samsung is pursuing microLED screens—and neither technology is quite ready for prime time. Understanding the hurdles that still need to be overcome, though, requires a deeper dive into each of these display technologies.

How does LG’s see-through OLED work?

OLED stands for organic light-emitting diode, and that pretty much describes how it works. OLED materials are carbon-based compounds that emit light when energized with an electrical current. Different compounds produce different colors, which can be combined to create full-color images.

To construct a display from these materials, manufacturers deposit them as thin films on some sort of substrate. The most common approach arranges red-, green-, and blue-emitting (RGB) materials in patterns to create a dense array of full-color pixels. A display with what is known as 4K resolution contains a matrix of 3,840 by 2,160 pixels—8.3 million pixels in all, formed from nearly 25 million red, green, and blue subpixels.
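The pixel and subpixel figures above are easy to verify. A minimal sketch, assuming exactly three subpixels (one red, one green, one blue) per pixel as described; the function name is hypothetical and used only for illustration.

```python
def panel_counts(width_px, height_px, subpixels_per_pixel=3):
    """Return (pixels, subpixels) for a panel of the given resolution."""
    pixels = width_px * height_px
    return pixels, pixels * subpixels_per_pixel


# 4K resolution as quoted in the text: 3,840 by 2,160 pixels.
pixels, subpixels = panel_counts(3840, 2160)
print(f"{pixels:,} pixels, {subpixels:,} subpixels")
```

The result is about 8.3 million pixels and just under 25 million subpixels, each of which needs its own emitter and switching transistors.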

The timing and amount of electrical current sent to each subpixel determines how much light it emits. So by controlling these currents properly, you can create the desired image on the screen. To accomplish this, each subpixel must be electrically connected to two or more transistors, which act as switches. Traditional wires wouldn’t do for this, though: They’d block the light. You need to use transparent (or largely transparent) conductive traces.

An image of an array of 15 transparent TVs, shot with a fish-eye lens and displaying white trees with pink and green swaths of color above them. LG’s demonstration of transparent OLED displays at CES 2024 seemed almost magical. Ethan Miller/Getty Images

A display has thousands of such traces arranged in a series of rows and columns to provide the necessary electrical connections to each subpixel. The transistor switches are also fabricated on the same substrate. That all adds up to a lot of materials that must be part of each display. And those materials must be carefully chosen for the OLED display to appear transparent.

The conductive traces are the easy part. The display industry has long used indium tin oxide as a thin-film conductor. A typical layer of this material is only 135 nanometers thick but allows about 80 percent of the light impinging on it to pass through.

The transistors are more of a problem, because the materials used to fabricate them are inherently opaque. The solution is to make the transistors as small as you can, so that they block the least amount of light. The amorphous silicon layer used for transistors in most LCD displays is inexpensive, but its low electron mobility means that transistors composed of this material can only be made so small. This silicon layer can be annealed with lasers to create low-temperature polysilicon, a crystallized form of silicon, which improves electron mobility, reducing the size of each transistor. But this process works only for small sheets of glass substrate.

Faced with this challenge, designers of transparent OLED displays have turned to indium gallium zinc oxide (IGZO). This material has high enough electron mobility to allow for smaller transistors than is possible with amorphous silicon, meaning that IGZO transistors block less light.

These tactics help solve the transparency problem, but OLEDs have some other challenges. For one, exposure to oxygen or water vapor destroys the light-emissive materials. So these displays need an encapsulating layer, something to cover their surfaces and edges. Because this layer creates a visible gap when two panels are placed edge to edge, you can’t tile a set of smaller displays to create a larger one. If you want a big OLED display, you need to fabricate a single large panel.

The result of even the best engineering here is a “transparent” display that still blocks some light. You won’t mistake LG’s transparent TV for window glass: People and objects behind the screen appear noticeably darker than when viewed directly. According to one informed observer, the LG prototype appears to have 45 percent transparency.

How does Samsung’s magical MicroLED work?

For its transparent displays, Samsung is using inorganic LEDs. These devices, which are very efficient at converting electricity into light, are commonplace today: in household lightbulbs, in automobile headlights and taillights, and in electronic gear, where they often show that the unit is turned on.

In LED displays, each pixel contains three LEDs, one red, one green, and one blue. This works great for the giant digital displays used in highway billboards or in sports-stadium jumbotrons, whose images are meant to be viewed from a good distance. But up close, these LED pixel arrays are noticeable.

TV displays, on the other hand, are meant to be viewed from modest distances and thus require far smaller LEDs than the chips used in, say, power-indicator lights. Two years ago, these “microLED” displays used chips that were just 30 by 50 micrometers. (A typical sheet of paper is 100 micrometers thick.) Today, such displays use chips less than half that size: 12 by 27 micrometers.
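To put those chip dimensions in perspective, the short sketch below compares the footprint of the two chip generations. It assumes the chips are simple rectangles with the quoted dimensions; the variable names are illustrative only.

```python
# Chip dimensions in micrometers, as quoted in the text.
old_chip_um = (30, 50)
new_chip_um = (12, 27)

old_area = old_chip_um[0] * old_chip_um[1]  # square micrometers
new_area = new_chip_um[0] * new_chip_um[1]

# Fraction of the older chip's footprint that the newer chip occupies.
area_ratio = new_area / old_area
print(f"old: {old_area} um^2, new: {new_area} um^2, ratio: {area_ratio:.1%}")
```

Under that assumption, the newer chip covers roughly a fifth of the area of the older one, which matters for transparency because the opaque emitter blocks correspondingly less of each pixel.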

A wooden frame surrounds a transparent display featuring an advertisement for a Black Friday Sale and a large image of a smartwatch. While transparent displays are stunning, they might not be practical for home use as televisions. Expect to see them adopted first as signage in retail settings. AUO

These tiny LED chips block very little light, making the display more transparent. The Taiwanese display maker AUO recently demonstrated a microLED display with more than 60 percent transparency.

Oxygen and moisture don’t affect microLEDs, so they don’t need to be encapsulated. This makes it possible to tile smaller panels to create a seamless larger display. And the silicon coating on such small panels can be annealed to create polysilicon, which performs better than IGZO, so the transistors can be even smaller and block less light.

But the microLED approach has its own problems. Indeed, the technology is still in its infancy: The displays cost a great deal to manufacture and require some contortions to get uniform brightness and color across the entire display.

For example, individual OLED materials emit a well-defined color, but that’s not the case for LEDs. Minute variations in the physical characteristics of an LED chip can alter the wavelength of light it emits by a measurable—and noticeable—amount. Manufacturers have typically addressed this challenge by using a binning process: They test thousands of chips and then group them into bins of similar wavelengths, discarding those that don’t fit the desired ranges. This explains in part why those large digital LED screens are so expensive: Many LEDs created for their construction must be discarded.
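The binning idea can be sketched in a few lines. This is an illustration only: the target wavelength, bin width, rejection limit, and measured values are all hypothetical, and real production binning also sorts on brightness and forward voltage, which this sketch ignores.

```python
from collections import defaultdict


def bin_chips(wavelengths_nm, target_nm, bin_width_nm, max_offset_nm):
    """Group chips into wavelength bins; reject those too far from target."""
    bins = defaultdict(list)
    rejected = []
    for chip_id, wavelength in enumerate(wavelengths_nm):
        offset = wavelength - target_nm
        if abs(offset) > max_offset_nm:
            rejected.append(chip_id)  # outside every usable range: discarded
            continue
        # Bin 0 is centered on the target; bins -1, 1, ... lie on either side.
        bins[round(offset / bin_width_nm)].append(chip_id)
    return dict(bins), rejected


# Hypothetical measured peak wavelengths for eight green chips, in nanometers.
measured = [523.1, 525.0, 526.8, 531.5, 524.2, 518.0, 527.1, 525.4]
bins, rejected = bin_chips(measured, target_nm=525.0,
                           bin_width_nm=2.0, max_offset_nm=4.0)
print(bins, rejected)
```

Every chip in the rejected list is a manufactured part that never reaches a display, which is where the cost comes from.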

But binning doesn’t really work when dealing with microLEDs. The tiny chips are difficult to test and are so expensive that costs would be astronomical if too many had to be rejected.

A person wearing a white shirt with red text and a name badge is placing his hand behind a transparent display screen. The screen shows an image of splashing liquid and fire. Though you can see through today’s transparent displays, they do block a noticeable amount of light, making the background darker than when viewed directly. Tekla S. Perry

Instead, manufacturers test microLED displays for uniformity after they’re assembled, then calibrate them to adjust the current applied to each subpixel so that color and brightness are uniform across the display. This calibration process, which involves scanning an image on the panel and then reprogramming the control circuitry, can sometimes require thousands of iterations.
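A scan-and-adjust loop of this kind might look like the sketch below. It is not any manufacturer's actual procedure: it assumes luminance is simply proportional to drive current, uses a damped multiplicative correction chosen for illustration, and replaces the camera scan with a simulated panel whose per-subpixel efficiencies are made up.

```python
def calibrate(measure, n_subpixels, target, damping=0.5,
              tolerance=0.01, max_iters=1000):
    """Scale per-subpixel drive currents until every measured luminance
    is within `tolerance` (as a fraction) of `target`."""
    currents = [1.0] * n_subpixels
    for iteration in range(1, max_iters + 1):
        luminance = measure(currents)  # stands in for scanning the panel
        if all(abs(lum - target) <= tolerance * target for lum in luminance):
            return currents, iteration
        # Damped correction: dim subpixels get more current, bright ones
        # less. The calibrator never needs to know each chip's efficiency.
        currents = [c * (target / lum) ** damping
                    for c, lum in zip(currents, luminance)]
    raise RuntimeError("panel did not converge")


# Hypothetical panel: three subpixels with different efficiencies.
efficiencies = [0.8, 1.0, 1.25]


def simulated_scan(currents):
    return [e * c for e, c in zip(efficiencies, currents)]


currents, iterations = calibrate(simulated_scan, len(efficiencies), target=1.0)
print(currents, iterations)
```

A real panel has tens of millions of subpixels, responds nonlinearly, and must be uniform in color as well as brightness, which helps explain why the process can take so many passes.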

Then there’s the problem of assembling the panels. Remember those 25 million microLED chips that make up a 4K display? Each must be positioned precisely, and each must be connected to the correct electrical contacts.

The LED chips are initially fabricated on sapphire wafers, each of which contains chips of only one color. These chips must be transferred from the wafer to a carrier to hold them temporarily before applying them to the panel backplane. The Taiwanese microLED company PlayNitride has developed a process for creating large tiles with chips spaced less than 2 micrometers apart. Its process for positioning these tiny chips has better than 99.9 percent yields. But even at a 99.9 percent yield, you can expect about 25,000 defective subpixels in a 4K display. They might be positioned incorrectly so that no electrical contact is made, or the wrong color chip is placed in the pattern, or a subpixel chip might be defective. While correcting these defects is sometimes possible, doing so just adds to the already high cost.
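The defect estimate follows directly from the subpixel count. A minimal sketch, assuming defects are independent and the quoted 99.9 percent figure applies per placed chip:

```python
# Subpixels in a 4K display: 3,840 x 2,160 pixels, three chips per pixel.
subpixels = 3840 * 2160 * 3

placement_yield = 0.999  # fraction of chips placed and connected correctly

# Expected number of bad subpixels per panel under the assumptions above.
expected_defects = subpixels * (1 - placement_yield)
print(f"{expected_defects:,.0f} expected defective subpixels")
```

This gives roughly 25,000 defects per panel, matching the figure in the text.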

A person looks at a transparent micro led screen displaying splashes of liquid in red, yellow, and green. Samsung’s microLED technology allows the image to extend right up to the edge of the glass panel, making it possible to create larger displays by tiling smaller panels together. Brendan Smialowski/AFP/Getty Images

Could MicroLEDs still be the future of flat-panel displays? “Every display analyst I know believes that microLEDs should be the ‘next big thing’ because of their brightness, efficiency, color, viewing angles, response times, and lifetime,” says Bob Raikes, editor of the 8K Monitor newsletter. “However, the practical hurdles of bringing them to market remain huge. That Apple, which has the deepest pockets of all, has abandoned microLEDs, at least for now, and after billions of dollars in investment, suggests that mass production for consumer markets is still a long way off.”

At this juncture, even though microLED technology offers some clear advantages, OLED is more cost-effective and holds the early lead for practical applications of transparent displays.

But what is a transparent display good for?

Samsung and LG aren’t the only companies to have demonstrated transparent panels recently.

AUO’s 60-inch transparent display, made of tiled panels, won the People’s Choice Award for Best MicroLED-Based Technology at the Society for Information Display’s Display Week, held in May in San Jose, Calif. And the Chinese company BOE Technology Group demonstrated a 49-inch transparent OLED display at CES 2024.

These transparent displays all have one feature in common: They will be insanely expensive. Only LG’s transparent OLED display has been announced as a commercial product. It’s without a price or a ship date at this point, but it’s not hard to guess how costly it will be, given that nontransparent versions are expensive enough. For example, LG prices its top-end 77-inch OLED TV at US $4,500.

A diagram of the structure of a display pixel represented as a grey rectangle, which frames an open area labeled transmissive space, and three rectangular blocks labeled R, G, and B. Displays using both microLED technology [above] and OLED technology have some components in each pixel that block light coming from the background. These include the red, green, and blue emissive materials along with the transistors required to switch them on and off. Smaller components mean that you can have a larger transmissive space that will provide greater transparency. Illustration: Mark Montgomery; Source: Samsung

Thanks to seamless tiling, transparent microLED displays can be larger than their OLED counterparts. But their production costs are larger as well. Much larger. And that is reflected in prices. For example, Samsung’s nontransparent 114-inch microLED TV sells for $150,000. We can reasonably expect transparent models to cost even more.

Seeing these prices, you really have to ask: What are the practical applications of transparent displays?

Don’t expect these displays to show up in many living rooms as televisions. And high price is not the only reason. After all, who wants to see their bookshelves showing through in the background while they’re watching Dune? That’s why the transparent OLED TV LG demonstrated at CES 2024 included a “contrast layer”—basically, a black cloth—that unrolls and covers the back of the display on demand.

Transparent displays could have a place on the desktop—not so you can see through them, but so that a camera can sit behind the display, capturing your image while you’re looking directly at the screen. This would help you maintain eye contact during a Zoom call. One company—Veeo—demonstrated a prototype of such a product at CES 2024, and it plans to release a 30-inch model for about $3,000 and a 55-inch model for about $8,500 later this year. Veeo’s products use LG’s transparent OLED technology.

Transparent screens are already showing up as signage and other public-information displays. LG has installed transparent 55-inch OLED panels in the windows of Seoul’s new high-speed underground rail cars, which are part of a system known as the Great Train eXpress. Riders can browse maps and other information on these displays, which can be made clear when needed for passengers to see what’s outside.

LG transparent panels have also been featured in an E35e excavator prototype by Doosan Bobcat. This touchscreen display can act as the operator’s front or side window, showing important machine data or displaying real-time images from cameras mounted on the vehicle. Such transparent displays can serve a similar function as the head-up displays in some aircraft windshields.

And so, while the large transparent displays are striking, you’ll be more likely to see them initially as displays for machinery operators, public entertainment, retail signage, and even car windshields. The early adopters might cover the costs of developing mass-production processes, which in turn could drive prices down. But even if costs eventually reach reasonable levels, whether the average consumer really wants a transparent TV in their home is something that remains to be seen—unlike the device itself, whose whole point is not to be.

KGOnTech
Mixed Reality at CES & AR/VR/MR 2024 (Part 3 Display Devices)
20 April 2024 at 14:59

By: Karl Guttag

Update 2/21/24: I added a discussion of the DLP’s new frame rates and its potential to address field sequential color breakup.

Introduction

In part 3 of my combined CES and AR/VR/MR 2024 coverage of over 50 Mixed Reality companies, I will discuss display companies.

As discussed in Mixed Reality at CES and the AR/VR/MR 2024 Video (Part 1 – Headset Companies), Jason McDowall of The AR Show recorded more than four hours of video on the 50 companies. In editing the videos, I felt the need to add more information on the companies. So, I decided to release each video in sections with a companion blog article with added information.

Outline of the Video and Additional Information

The part of the video on display companies is only about 14 minutes long, but with my background working in displays, I had more to write about each company. The times in blue on the left of each subsection below link to the YouTube video section discussing a given company.

00:10 Lighting Silicon (Formerly Kopin Micro-OLED)

Lighting Silicon is a spinoff of Kopin’s micro-OLED development. Kopin started making micro-LCD microdisplays with its transmissive color filter “Lift-off LCOS” process in 1990. In 2011, Kopin acquired Forth Dimension Displays (FDD), a high-resolution Ferroelectric (reflective) LCOS maker. In 2016, I first reported on Kopin Entering the OLED Microdisplay Market. Lighting Silicon (as Kopin) was the first company to promote the combination of all plastic pancake optics with micro-OLEDs (now used in the Apple Vision Pro). Panasonic picked up the Lighting/Kopin OLED with pancake optics design for their Shift All headset (see also: Pancake Optics Kopin/Panasonic).

At CES 2024, I was invited by Chris Chinnock of Insight Media to be on a panel at Lighting Silicon’s reception. The panel’s title was “Finding the Path to a Consumer-Friendly Vision Pro Headset” (video link – remember this was made before the Apple Vision Pro was available). The panel started with Lighting Silicon’s Chairman, John Fan, explaining Lighting Silicon and its relationship with Lakeside Lighting Semiconductor. Essentially, Lighting Silicon designs the semiconductor backplane, and Lakeside Lighting does the OLED assembly (including applying the OLED material a wafer at a time, sealing the display, singulating the displays, and bonding). Currently, Lakeside Lighting is only processing 8-inch/200mm wafers, limiting Lighting Silicon to making ~2.5K resolution devices. To make ~4K devices, Lighting Silicon needs a more advanced semiconductor process that is only available in more modern 12-inch/300mm fabs. Lakeside is now building a manufacturing facility that can handle 12-inch OLED wafer assembly, enabling Lighting Silicon to offer ~4K devices.

Related info on Kopin’s history in microdisplays and micro-OLEDs:

2022 AWE Video Discussion with Brad Lynch Kopin (LCOS and OLED microdisplays)
2021 Pancake Optics Kopin/Panasonic
2013 Kopin Displays and Near Eye (Follow-Up to Seeking Alpha Article)
2013 Extended Temperature Range with LC-Based Microdisplays (about Kopin)

02:55 RaonTech

RaonTech seems to be one of the most popular LCOS makers, as I see their devices being used in many new designs/prototypes. Himax (Google Glass, Hololens 1, and many others) and Omnivision (Magic Leap 1&2 and other designs) are also LCOS makers I know are in multiple designs, but I didn’t see them at CES or the AR/VR/MR. I first reported on RaonTech at CES 2018 (Part 1 – AR Overview). RaonTech makes various LCOS devices with different pixel sizes and resolutions. More recently, they have developed a 2.15-micron pixel pitch field sequential color pixel with an “embedded spatial interpolation is done by pixel circuit itself,” so (as I understand it) the 4K image is based on 2K data being sent and interpolated by the display.

In addition to LCOS, RaonTech has been designing backplanes for other companies making micro-OLED and MicroLED microdisplays.

04:01 May Display (LCOS)

May Display is a Korean LCOS company that I first saw at CES 2022. It surprised me, as I thought I knew most of the LCOS makers. May is still a bit of an enigma. They make a range of LCOS panels, their most advanced being an 8K (7,680 x 4,320) 3.2-micron pixel pitch. May also makes a 4K VR headset with a 75-degree FOV using their LCOS devices.

May has its own in-house LCOS manufacturing capability. May demonstrated using its LCOS devices in projectors and VR headsets and showed them being used in a (true) holographic projector (I think using phase LCOS).

May Display sounds like an impressive LCOS company, but I have not seen or heard of their LCOS devices being used in other companies’ products or prototypes.

04:16 Kopin’s Forth Dimensions Display (LCOS)

As discussed earlier with Lighting Silicon, Kopin acquired Ferroelectric LCOS maker Forth Dimension Displays (FDD) in 2011. FDD was originally founded as Micropix in 1988 as part of CRL-Opto, then renamed CRLO in 2004, and finally Forth Dimension Displays in 2005, before Kopin’s 2011 acquisition.

I started working in LCOS in 1998 as the CTO of Silicon Display, a startup developing a VR/AR monocular headset. I designed an XGA (1024 x 768) LCOS backplane and the FPGA to drive it. We were looking to work with MicroPix/CRL-Opto to do the LCOS assembly (applying the cover glass, glue seal, and liquid crystal). When MicroPix/CRL-Opto couldn’t get their backplane to work, they ended up licensing the XGA LCOS backplane design I did at Silicon Display to be their first device, which they made for many years.

FDD has focused on higher-end display applications, with its most high-profile design win being the early 4K RED cameras. But (almost) all viewfinders today, including RED, use OLEDs. FDD’s LCOS devices have been used in military and industrial VR applications, but I haven’t seen them used in the broader AR/VR market. According to FDD, one of the biggest markets for their devices today is in “structured light” for 3-D depth sensing. FDD’s devices are also used in industrial and scientific applications such as 3D Super Resolution Microscopy and 3D Optical Metrology.

05:34 Texas Instruments (TI) DLP®

Around 2015, DLP and LCOS displays seemed to have been used in roughly equal numbers of waveguide-based AR/MR designs. However, since 2016, almost all new waveguide-based designs have used LCOS, most notably the Hololens 1 (2016) and Magic Leap One (2018). Even companies previously using DLP switched to LCOS and, more recently, MicroLEDs with new designs. Among the reasons the companies gave for switching from DLP to LCOS were smaller pixel size (and, thus, a smaller device for a given resolution), lower power consumption of the display plus ASIC, more choice in device resolutions and form factors, and cost.

While DLP does not require polarized light, which is a significant efficiency advantage in room/theater projector applications that project hundreds or thousands of lumens, the power of the display device and control logic/ASICs are much more of a factor in near-eye displays that require less than 1 to at most a few lumens since the light is directly aimed into the eye rather than illuminating the whole room. Additionally, many near-eye optical designs employ one or more reflective optics requiring polarized light.

Another issue with DLP is drive algorithm control. Texas Instruments does not give its customers direct access to the DLP’s drive algorithm, which was a major issue for CREAL (to be discussed in the next article), which switched from DLP to LCOS partly because of the need to control its unique light field driving method directly. VividQ (also to be discussed in the next article), which generates a holographic display, started with DLP and now uses LCOS. Lightspace 3D has similarly switched.

Far from giving up, TI is making a concerted effort to improve its position in the AR/VR/MR market with new, smaller, and more efficient DLP/DMD devices and chipsets and reference design optics.

Color Breakup On Hololens 1 using a low color sequential field rate

Added 2/21/24: I forgot to discuss the DLP’s new frame rates and field sequential color breakup.

I find the new, much higher frame rates the most interesting. Both DLP and LCOS use field sequential color (FSC), which can be prone to color breakup with eye and/or image movement. One way to reduce the chance of breakup is to increase the frame rate and, thus, the color field sequence rate (there are nominally three color fields, R, G, & B, per frame). With DLP’s new much higher 240Hz & 480Hz frame rates, the DLP would have 720 or 1440 color fields per second. Some older LCOS had as low as 60-frames/180-fields (I think this was used on Hololens 1 – right), and many, if not most, LCOS today use 120-frames/360-fields per second. A few LCOS devices I have seen can go as high as 180-frames/540-fields per second. So, the newer DLP devices would have an advantage in that area.
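The frame-rate-to-field-rate arithmetic above can be sketched as follows. It assumes the nominal three color fields (R, G, and B) per frame stated in the text; real devices may repeat or add fields, and the function name is illustrative only.

```python
FIELDS_PER_FRAME = 3  # nominal R, G, B fields per frame


def color_field_rate(frame_rate_hz, fields_per_frame=FIELDS_PER_FRAME):
    """Color fields per second for a field sequential color display."""
    return frame_rate_hz * fields_per_frame


# Frame rates mentioned in the text, for older/current LCOS and newer DLP.
rates = {hz: color_field_rate(hz) for hz in (60, 120, 180, 240, 480)}
for hz, fields in rates.items():
    # Each field is on screen for 1000 / fields milliseconds.
    print(f"{hz:>3} Hz frames -> {fields:>4} fields/s, {1000 / fields:.2f} ms per field")
```

The shorter each field's duration, the less the eye or image can move between successive color fields, which is why higher rates reduce visible breakup.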

The content below was extracted from the TI DLP presentation given at AR/VR/MR 2024 on January 29, 2024 (note that only the abstract seems available on the SPIE website).

My Background at Texas Instruments:

I worked at Texas Instruments from 1977 to 1998, becoming the youngest TI Fellow in the company’s history in 1988. However, contrary to what people may think, I never directly worked on the DLP. The closest I came was a short-lived joint development program to develop a DLP-based color copier using the TMS320C80 image processor, for which I was the lead architect.

I worked in the Microprocessor division developing the TMS9918/28/29 (the first “Sprite” video chip), the TMS9995 CPU, the TMS99000 CPU, the TMS34010 (the first programmable graphics processor), the TMS34020 (2nd generation), the TMS320C80 (first image processor with 4 DSP CPUs and a RISC CPU), several generations of Video DRAM (starting with the TMS4161), and the first Synchronous DRAM. I designed silicon to generate or process pixels for about 17 of my 20 years at TI.

After leaving TI, I ended up working on LCOS, a rival technology to DLP, from 1998 through 2011. But then, when I was designing an aftermarket automotive HUD at Navdy, I chose to use a DLP engine for the projector for its advantages in that application. I like to think of myself as product-focused, and I want to use whichever technology works best for the given application. I see pros and cons in all the display technologies.

07:25 VueReal MicroLED

VueReal is a Canadian-based startup developing MicroLEDs. Their initial focus was on making single color per device microdisplays (below left).

However, perhaps VueReal’s most interesting development is their cartridge-based method of microprinting MicroLEDs. In this process, they singulate the individual LEDs, test and select them, and then transfer them to a substrate with either passive (wire) or active (e.g., thin-film transistors on glass or plastic) drive. They claim to have extremely high yields with this process. With this process, they can make full-color rectangular displays (above right), transparent displays (by spacing the LEDs out on a transparent substrate), and displays of various shapes, such as an automotive instrument panel or a tail light.

I was not allowed to take pictures in the VueReal suite, but Chris Chinnock of Insight Media was allowed to make a video from the suite, though he had to keep his distance from the demos. For more information on VueReal, I would also suggest going to MicroLED-Info, which has a combination of information and videos on VueReal.

08:26 MojoVision MicroLED

MojoVision is pivoting from a “Contact Lens Display Company” to a “MicroLED component company.” Its new CEO is Dr. Nikhil Balram, formerly the head of Google’s Display Group. MojoVision started saying (in private) that it was putting more emphasis on being a MicroLED component company around 2021. Still, it didn’t publicly stop developing the contact lens display until January 2023, after spending more than $200M.

To be clear, I always thought the contact lens display concept was fatally flawed due to physics, to the point where I thought it was a scam. Some third-party NDA reasons kept me from talking about MojoVision until 2022. I outlined some fundamental problems and why I thought the contact lens display was a sham in my 2022 CES Discussion video with Brad Lynch on the Mojovision Contact Display (if you take pleasure in my beating up on a dumb concept for about 14 minutes, it might be a fun thing to watch).

So, in my book, Mojovision, the company, starts with a major credibility problem. Still, they are now under new leadership and focusing on what they got to work, namely very small MicroLEDs. Their 1.75-micron LEDs are the smallest I have heard about. The “old” Mojovision had developed direct/native green MicroLEDs, but the new MojoVision is developing native blue LEDs and then using quantum dot conversion to get green and red.

I have been hearing about using quantum dots to make full-color MicroLEDs for ~10 years, and many companies have said they are working on it. Playnitride demonstrated quantum dot-converted microdisplays (via Lumus waveguides) and larger direct-view displays at AR/VR/MR 2023 (see MicroLEDs with Waveguides (CES & AR/VR/MR 2023 Pt. 7)).

Mike Wiemer (CTO) gave a presentation on “Comparing Reds: QD vs InGaN vs AlInGaP” (behind the SPIE Paywall). Below are a few slides from that presentation.

Wiemer gave many of the (well-known in the industry) advantages of the blue LED with the quantum dot approach for MicroLEDs over competing approaches to full-color MicroLEDs, including:

- Blue LEDs are the most efficient color.
- You only have to make a single type of LED crystal structure in a single layer.
- It is relatively easy to print small quantum dots; it is infeasible to pick and place microdisplay-size MicroLEDs.
- Green and red produced by quantum dot conversion of blue are much more efficient than native green and red LEDs.
- Native red LEDs are inefficient in GaN crystalline structures that are moderately compatible with native green and blue LEDs.
- Stacking native LEDs of different colors on different layers is a complex crystalline growth process, and blocking light from lower layers causes efficiency issues.
- Single emitters with multiple-color LEDs (e.g., see my article on Porotech) have efficiency issues, particularly in red, which are further exacerbated by the need to time sequence the colors. Controlling a large array of single emitters with multiple colors requires a yet-to-be-developed, complex backplane.

Some of the known big issues with quantum dot conversion with MicroLED microdisplays (not a problem for larger direct view displays):

MicroLEDs can only have a very thin layer of quantum dots. If the layer is too thin, the light/energy is wasted, and the residual blue light must be filtered out to get good greens and reds.
- MojoVision claims to have developed quantum dots that can convert all the blue light to red or green with thin layers
There must be some structure/isolation to prevent the blue light from adjacent cells from activating the quantum dots of a given cell, which would cause the desaturation of colors. Eliminating color crosstalk/desaturation is another advantage of having thinner quantum dot layers.
The lifetime of quantum dots and their potential for color shifting are concerns, particularly if they are driven hard. Native crystalline LEDs are more durable and can be driven harder/brighter. Thus, quantum dot-converted blue LEDs, while more than 10x brighter than OLEDs, are expected to be less bright than native LEDs.
While MojoVision has a relatively small 1.37-micron LED on a 1.87-micron pitch, that still gives a 3.74-micron pixel pitch (assuming MojoVision keeps using two reds to get enough red brightness). While this is still about half the Apple Vision Pro’s ~7.5-micron OLED pixel pitch, a smaller pixel size, such as with a single emitter with multiple colors (e.g., Porotech), would be better (more efficient due to étendue; see MicroLEDs with Waveguides (CES & AR/VR/MR 2023 Pt. 7)) for semi-collimating the light using microlenses as needed by waveguides.
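
The pitch arithmetic in the last item can be checked with a short sketch. It assumes a square 2x2 cell of emitters per full-color pixel (blue, green, and two reds), which is inferred from the "two reds" remark above rather than from a published MojoVision layout; the function and variable names are illustrative only.

```python
# Sketch of the pitch arithmetic quoted above.
# Assumption: each full-color pixel is a square 2x2 cell of emitters.
EMITTER_PITCH_UM = 1.87    # emitter-to-emitter pitch quoted in the text
SUBPIXELS_PER_SIDE = 2     # assumed 2x2 cell: blue, green, and two reds
AVP_PIXEL_PITCH_UM = 7.5   # approximate Apple Vision Pro OLED pitch from the text


def full_color_pixel_pitch(emitter_pitch_um, subpixels_per_side):
    """Pitch of one full-color pixel built from a square grid of emitters."""
    return emitter_pitch_um * subpixels_per_side


pixel_pitch_um = full_color_pixel_pitch(EMITTER_PITCH_UM, SUBPIXELS_PER_SIDE)
ratio_to_avp = pixel_pitch_um / AVP_PIXEL_PITCH_UM

print(f"Full-color pixel pitch: {pixel_pitch_um:.2f} microns")
print(f"Ratio to the AVP pixel pitch: {ratio_to_avp:.2f}")
```

A single-emitter design would, under the same assumption, have a pixel pitch equal to the emitter pitch, which is why it helps with étendue.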

10:20 Porotech MicroLED

I covered Porotech’s single emitter, multiple color, MicroLED technology extensively last year in CES 2023 (Part 2) – Porotech – The Most Advanced MicroLED Technology, MicroLEDs with Waveguides (CES & AR/VR/MR 2023 Pt. 7), and my CES 2023 Video with Brad Lynch.

While technically interesting, Porotech’s single-emitter device will likely take considerable time to perfect. The single-emitter approach has the major advantage of supporting a smaller pixel since only one LED per pixel is required. This also results in only two electrical connections (power and ground) to the LED per pixel.

However, as the current level controls the color wavelength, this level must be precise. The brightness is then controlled by the duty cycle. An extremely advanced semiconductor backplane will be needed to precisely control the current and duty cycle per pixel, a backplane vastly more complex than LCOS or spatial color MicroLEDs (such as MojoVision and Playnitride) require.

Using current to control the color of LEDs is well-known to experts in LEDs. Multiple LED experts have told me that based on their knowledge, they believe Porotech’s red light output will be small relative to the blue and green. To produce a full-color image, the single emitter will have to sequentially display red, green, and blue, further exacerbating the red’s brightness issues.

12:55 Brilliance Color Laser Combiner

Brilliance has developed a 3-color laser combiner on silicon. Light guides formed in/on the silicon act similarly to fiber optics to combine red, green, and blue laser diodes into a single beam. The obvious application of this technology would be a laser beam scanning (LBS) display.

While I appreciate Brilliance’s technical achievement, I don’t believe that laser beam scanning (LBS) is a competitive display technology for any known application. This blog has written dozens of articles (too many to list here) about the failure of LBS displays.

14:24 TriLite/Trixel (Laser Combiner and LBS Display Glasses)

Last and certainly least, we get to TriLite Laser Beam Scanning (LBS) glasses. LBS displays for near-eye and projector use have a perfect 25+ year record of failure. I have written about many of these failures since this blog started. I see nothing in TriLite that will change this trend. It does not matter if they shoot from the temple onto a hologram directly into the eye like North Focals or use a waveguide like TriLite; the fatal weak link is using an LBS display device.

It has reached the point that when I see a device with an LBS display, I’m pretty sure it is either part of a scam and/or the people involved are too incompetent to create a good product (and yes, I include Hololens 2 in this category). Every company with an LBS display (once again, including Hololens 2) lies about the resolution by conflating “scan lines” with the rows of a pixel-based display. Scan lines are not the same as pixel rows because the LBS scan lines vary in spacing and follow a curved path. Thus, every pixel in the image must be resampled into a distorted and non-uniform scanning process.

Like Brilliance above, TriLite’s core technology combines three lasers for LBS. Unlike Brilliance, TriLite does not end up with the beams being coaxial; rather, they are at slightly different angles. This will cause the various colors to diverge by different amounts in the scanning process. TriLite uses its “Trajectory Control Module” (TCM) to compute how to re-sample the image to align the red, green, and blue.

TriLite then compounds its problems with LBS using a Lissajous scanning process, about the worst possible scanning process for generating an image. I wrote about the problems with the Lissajous scanning process, also used by Oqmented (TriLite uses Infineon’s scanning mirror), in AWE 2021 Part 2: Laser Scanning – Oqmented, Dispelix, and ST Micro. Lissajous scanning may be a good way to scan a laser beam for LiDAR (as I discussed in CES 2023 (4) – VoxelSensors 3D Perception, Fast and Accurate), but it is a horrible way to display an image.
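
One way to see the non-uniformity is to count where a Lissajous-scanned beam spends its time. The sketch below drives both axes with sine waves and bins the beam positions into a coarse grid. The frequencies (101 and 103), the sample count, and the 10x10 grid are arbitrary illustrative assumptions, not TriLite's or Infineon's actual parameters, and this is only one aspect of the problem, not the full analysis from the article cited above.

```python
import math


def lissajous_hit_counts(fx, fy, samples, grid):
    """Count how many beam samples land in each cell of a grid x grid image."""
    counts = [[0] * grid for _ in range(grid)]
    for i in range(samples):
        t = i / samples                       # one pattern period, normalized to 1
        x = math.sin(2 * math.pi * fx * t)    # horizontal mirror position, -1..1
        y = math.sin(2 * math.pi * fy * t)    # vertical mirror position, -1..1
        col = min(grid - 1, int((x + 1) / 2 * grid))
        row = min(grid - 1, int((y + 1) / 2 * grid))
        counts[row][col] += 1
    return counts


# Hypothetical mirror frequencies, chosen only so the pattern fills the frame.
counts = lissajous_hit_counts(fx=101, fy=103, samples=200_000, grid=10)
center = counts[4][4] + counts[4][5] + counts[5][4] + counts[5][5]
corners = counts[0][0] + counts[0][9] + counts[9][0] + counts[9][9]

print("samples in the 4 center cells:", center)
print("samples in the 4 corner cells:", corners)
```

Because a sinusoidally driven mirror moves slowest near its turnaround points, the beam dwells far longer in the corners than in the center, so the image has to be resampled onto a very uneven sampling pattern.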

The information and images below have been collected from TriLite’s website.

As far as I have seen, it is a myth that LBS has any advantage in size, cost, and power over LCOS for the same image resolution and FOV. As discussed in part 1, Avegant generated the comparison below, comparing North Focals LBS glasses with a ~12-degree FOV and roughly 320×240 resolution to Avegant’s 720 x 720 30-degree LCOS-based glasses.

Below is a selection (from dozens) of related articles I have written on various LBS display devices:

2012 Cynic’s Guild to CES — Measuring Resolution – Discusses how LBS companies confabulate resolution with scan lines
2018 North’s Focals Laser Beam Scanning AR Glasses – “Color Intel Vaunt”
2015 Celluon Laser Beam Scanning Projector Technical Analysis – Part 1 More on LBS and Resolution:
2019 Hololens 2 First Impressions: Good Ergonomics, But The LBS Resolution Math Fails! – This article goes into the basic math behind LBS
2020 Hololens 2 Display Evaluation (Part 1: LBS Visual Sausage Being Made) – This article details the Hololens very complex LBS scanning process and its problems
2021 AWE 2021 Part 2: Laser Scanning – Oqmented, Dispelix, and ST Micro – Goes into the problems with Lissajous scanning in a display device.
2023 Humane AI – Pico Laser Projection – $230M AI Twist on an Old Scam (Title says it all)
2016 Wrist Projector Scams – Ritot, Cicret, the new eyeHand
2018 CES Haier Laser Projector Watch – (Wrist Projector Scams Revisited)
2018 Intel AR “Fixer-Upper” For Sale? Only $350M ???
2018 Magic Leap Fiber Scanning Display (FSD) – “The Big Con” at the “Core”

Next Time

I plan to cover non-display devices next in this series on CES and AR/VR/MR 2024. That will leave sections on Holograms and Lightfields, Display Measurement Companies, and finally, Jason and my discussion of the Apple Vision Pro.

KGOnTech
Apple Vision Pro – Influencing the Influencers & “Information Density”
16 March 2024 at 22:12

By: Karl Guttag

Introduction

Many media outlets, large and small, both text and video, use this blog as a resource for technical information on mixed reality headsets. Sometimes, they even give credit. In the past two weeks, this blog was prominently cited in YouTube videos by Linus Tech Tips (LTT) and Artur’s Tech Tales. Less fortunately, Adam Savage’s Tested, with host Norman Chan, used a spreadsheet test pattern from this blog in its Apple Vision Pro review to demonstrate foveated rendering issues.

I will follow up with a discussion of the Linus Tech Tips video, which deals primarily with human factors. In particular, I want to discuss the “Information Density issue” of virtual versus physical monitors, which the LTT video touched on.

Influencing the Influencers On Apple Vision Pro

Linus Tech Tips (LTT)

In their “Apple Vision Pro—A PC Guy’s Perspective,” Linus Tech Tips showed several pages from this blog and were nice enough to prominently feature the pages they were using and the web addresses (below). Additionally, I enjoyed their somewhat humorous physical “simulation” of the AVP (more on that in a bit). LTT used images (below-left and below-center) from the blog to explain how the optics distort the display and how the processing in the AVP is used in combination with eye tracking to reduce that distortion. LTT also used images from the blog (below-right) to show how the field of view (FOV) changes based on the distance from the eye to the optics.

Adam Savage’s Tested

Adam Savage’s Tested, in host Norman Chan’s review of the Apple Vision Pro, used this blog’s AVP-XLS-on-BLACK-Large-Array from Spreadsheet “Breaks” The Apple Vision Pro’s (AVP) Eye-Tracking/Foveation & the First Through-the-optics Pictures to discuss how the foveated boundaries of the Apple Vision Pro are visible. While the spreadsheet is taken from this blog, I didn’t see any references given.

The Adam Savage’s Tested video either missed or was incorrect on several points it made:

It missed the point of the blog article that the foveated rendering has problems with spreadsheets when directly rendered from Excel on the AVP instead of mirrored by a MacBook.
It stated that taking pictures through the optics is impossible, which this blog has been doing for over a month (including in this article).
It said that the AVP’s passthrough 3-D perspective was good with short-range but bad with long-range objects, but Linus Tech Tips (discussed later) found the opposite. The AVP’s accuracy is poor with short-range objects due to the camera placement.
It said there was no “warping” of the real world with video passthrough, which is untrue. The AVP does less warping than the Meta Quest 3 and Quest Pro, but it still warps objects less than 0.6 meters (2 feet) away and toward the center to the upper part of the user’s view. It is impossible to be both perspective-correct and not warp with the AVP’s camera placement with near objects; the AVP seems to trade off being perspective-correct to have less warping than the Meta headsets.

Artur’s Tech Tales – Interview on AVP’s Optical Design

Artur’s Tech Tales Apple Vision Pro OPTICS—Deep Technical Analysis, featuring Arthur Rabner (CEO of Hypervision), includes an interview with and a presentation by Rabner. In his presentation, Rabner mentions this blog several times. The video details the AVP optics and follows up on Hypervision’s white paper discussed in Apple Vision Pro (Part 4) – Hypervision Pancake Optics Analysis.

Linus Tech Tips on Apple Vision Pro’s Human Factors

Much of the Linus Tech Tips (LTT) video deals with human factors and user interface issues. For the rest of this article, I will discuss and expand upon comments made in the LTT video. Linus also commented on the passthrough camera’s “shutter angle,” but I moved my discussion on that subject to the “Appendix” at the end as it was a bit off-topic and needed some explanation.

It makes a mess of your face

At 5:18 in the video, Linus takes the headset off and shows the red marks left by the Apple Vision Pro (left), which I think may have been intentional after Linus complained about issues with the headband earlier. For reference, I have included the marks left by the Apple Vision Pro on my face (below-right). I sometimes joke that if I wear it long enough, it will make a groove in my skull to help hold up the headset.

An Apple person who is an expert at AVP fitting will probably be able to tell based on the marks on our faces if we have the “wrong” face interface. Linus’s headset makes stronger marks on his cheeks, whereas mine makes the darkest marks on my forehead. As I use inserts, I have a fairly thick (but typical for wearing inserts) 25W face hood with the thinner “W” interface, and AVP’s eye detection often complains that I need to get my eyes closer to the lenses. So, I end up cranking the solo band almost to the point where I feel my pulse on my forehead like a blood pressure measuring cuff (perhaps a health “feature” in the future?).

Need for game controllers

For virtual reality, Linus is happy with the resolution and placement of virtual objects in the real world. But he stated, “Unfortunately, the whole thing falls apart when you interact with the game.” Linus then goes into the many problems of not having controllers and relying on hand tracking alone.

I’m not a VR gamer, but I agree with The Verge that AVP’s hand and eye tracking is “magic until it’s not.” I am endlessly frustrated with eye-tracking-based finger selection. Even with the headset cranked hard against my face, the eye tracking is unstable, even after recalibrating the IPD and eye tracking many times. I consider eye and hand tracking a good “secondary” selection tool that needs an accurate primary selection tool. I have an Apple Magic Trackpad that “works” with the AVP but does not work in “3-D space.”

Windows PC Gaming Video Mirroring via WiFi has Lag, Low Resolution, and Compression Artifacts

Linus discussed using the Steam App on the AVP to play games. He liked that he could get a large image and lay back, but there is some lag, which could be problematic for some games, particularly competitive ones; the resolution is limited to 1080p, and compression artifacts are noticeable.

Linus also discussed using the Sunshine (streaming server on the PC) and Moonlight (remote access on the AVP) apps to mirror Windows PCs. While this combination supports up to 4K at 120p, Linus says you will need an incredibly good wireless access point for the higher resolution and frame rates. In terms of effective resolution and what I like to call “Information Density,” these apps will still suffer the loss of significant resolution due to trying to simulate a virtual monitor in 3-D space, as I have discussed in Apple Vision Pro (Part 5C) – More on Monitor Replacement is Ridiculous and Apple Vision Pro (Part 5A) – Why Monitor Replacement is Ridiculous and shown with through the lens pictures in Apple Vision Pro’s (AVP) Image Quality Issues – First Impressions and Apple Vision Pro’s Optics Blurrier & Lower Contrast than Meta Quest 3.

From a “pro” design perspective, it is rather poor on Apple’s part that the AVP does not support a direct Thunderbolt link for both data and power, while at the same time, it requires a wired battery. I should note that the $300 developer’s strap supports a lowish 100 Mbps Ethernet data rate (compared to USB-C/Thunderbolt speeds of 0.48 to 40 Gbps) through a USB-C connector while still requiring the battery pack for power. There are many unused pins on the developer’s strap, and there are indications in the AVP’s software that the strap might support higher-speed connections (and maybe access to peripherals) in the future.

Warping effect of passthrough

In terms of video passthrough, at 13:43 in the video, Linus comments about the warping effect of close objects and depth perception being “a bit off.” He also discussed that you are looking at the world through phone-type cameras. When you move your head, the passthrough looks duller, with a significant blur (“Jello”).

The same Linus Tech Tips video also included humorous simulations of the AVP environment with people carrying large-screen monitors. At one point (shown below), they show a person wearing a respirator mask (to “simulate” the headset) surrounded by three very large monitors/TVs. They show how the user has to move their head around to see everything. LTT doesn’t mention that those monitors’ angular resolution is fairly low, which is why those monitors need to be so big.

Sharing documents is a pain.

Linus discussed the AVP’s difficulty sharing documents with others in the same room. Part of this is because the MacBook’s display goes blank when mirroring onto the AVP. Linus discussed how he had to use a “bizarre workaround” of setting up a video conference to share a document with people in the same room.

Information Density – The AVP Delivers Effectively Multiple Large but Very Low-Resolution Monitors

The most important demonstration in the LTT video involves what I like to call the “Information Density” problem. The AVP, or any VR headset, has low information density when trying to emulate a 2-D physical monitor in 3-D space. It is a fundamental problem; the effective resolution of the AVP is well less than half (linearly, less than a quarter two-dimensionally) of the resolution of the monitors that are being simulated (as discussed in Apple Vision Pro (Part 5C) – More on Monitor Replacement is Ridiculous and Apple Vision Pro (Part 5A) and shown with through the lens pictures in Apple Vision Pro’s (AVP) Image Quality Issues – First Impressions and Apple Vision Pro’s Optics Blurrier & Lower Contrast than Meta Quest 3). The key contributors to this issue are:

The peak display resolution in the center of the optics is only 44.4 pixels per degree (human vision is typically better than 60 ppd).
The 2-D/Monitor image must be resampled into 3-D space with an effective resolution loss greater than 2x.
If the monitor is to be viewable, it must be inscribed inside the oval sweet spot of the optics. In the case of the AVP, this cuts off about half the pixels.
While the AVP’s approximate horizontal FOV is about 100 degrees, the optical resolution drops considerably in the outer third of the optics. Only about the center 40-50 degrees of the FOV is usable for high-resolution content.
Simply put, the AVP needs more than double the PPD and better optics to provide typical modern computer monitors’ information/resolution density. Even then, it would be somewhat lacking in some aspects.
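
As a rough, best-case sketch of the numbers above, the following multiplies the peak PPD by an assumed 45-degree sweet spot (a value picked from the 40-50 degree range given above) and compares the result to the columns of a 4K monitor filling the same angle. It deliberately ignores the resampling loss and the PPD falloff away from the center, so the real figure would be lower; the variable names are illustrative only.

```python
PEAK_PPD = 44.4             # peak pixels per degree from the text
SWEET_SPOT_DEG = 45.0       # assumed value within the 40-50 degree usable range
MONITOR_4K_COLUMNS = 3840   # horizontal pixels of a 4K monitor

# Upper bound: assumes the peak PPD holds across the whole sweet spot
# and that no resolution is lost when resampling into 3-D space.
headset_columns = PEAK_PPD * SWEET_SPOT_DEG
linear_ratio = headset_columns / MONITOR_4K_COLUMNS
area_ratio = linear_ratio ** 2

print(f"Best-case headset columns in the sweet spot: {headset_columns:.0f}")
print(f"Linear fraction of a 4K monitor: {linear_ratio:.2f}")
print(f"Area fraction of a 4K monitor: {area_ratio:.2f}")
```

Even with these generous assumptions, the headset delivers roughly half the columns and about a quarter of the pixel area of a 4K monitor covering the same angle.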

Below is the close-up center (best case) view through the AVP’s optics (left) and the same image at about the same FOV on a computer monitor (right). Things must be blown up about 2x (linearly) to be as legible on the AVP as on a good computer monitor.

Comparisons of AVP to a Computer Monitor and Quest 3 from Apple Vision Pro’s Optics Blurrier & Lower Contrast than Meta Quest 3

Some current issues with monitor simulation are “temporary software issues” that can be improved, but that is not true with the information density problem.

Linus states in the video (at 17:48) that setting up the AVP is a “bit of a chore,” but it should be understood most of the “chore” is due to current software limitations that could be fixed with better software. The most obvious problems, as identified by Linus, are that the AVP does not currently support multiple screens from a MacBook, and it does not save the virtual screen location of the MacBook. I think most people expect Apple to fix these problems at some point in the near future.

At 18:20, Linus showed the real multiple-monitor workspace of someone doing video editing (see below). While a bit extreme for some people, with two vertically stacked 4K monitors in landscape orientation and a third 4K monitor in portrait mode, it is not that far off what I have been using for over a decade with two large side-by-side monitors (today I have a 34″ 22:9 1440p “center monitor” and a 28″ 4K side monitor, both in landscape mode).

I want to note a comment made by Linus (with my bold emphasis):

“Vision Pro Sounds like having your own personal Colin holding a TV for you and then allowing it to be repositioned and float effortlessly wherever you want. But in practice, I just don’t really often need to do that, and neither do a lot of people. For example, Nicole, here’s a real person doing real work [and] for a fraction of the cost of a Vision Pro, she has multiple 4K displays all within her field of view at once, and this is how much she has to move her head in order to look between them. Wow.

Again, I appreciate this thing for the technological Marvel that it is—a 4K display in a single Square inch. But for optimal text clarity, you need to use most of those pixels, meaning that the virtual monitor needs to be absolutely massive for the Vision Pro to really shine.“

The bold highlights above make the point about information density. A person can see all the information all at once and then, with minimal eye and head movement, see the specific information they want to see at that moment. Making text bigger only “works” for small amounts of content as it makes reading slower with larger head and eye movement and will tend to make the eyes more tired with movement over wider angles.

To drive the point home, the LTT video “simulates” an AVP desktop, assuming multiple monitor support, by physically placing three very large monitors side by side with two smaller displays on top. They had the simulated user wear a paint respirator mask to “simulate” the headset (and likely for comic effect). I would like to add that each of those large monitors, even at that size, will have, with the AVP, the resolution capability of more like a 1920x1080 monitor, or about half linearly and one-fourth in area of the content of a 4K monitor.

Quoting Linus about this part of the video (with my bold emphasis):

It’s more like having a much larger TV that is quite a bit farther away, and that is a good thing in the sense that you’ll be focusing more than a few feet in front of you. But I still found that in spite of this, that it was a big problem for me if I spent more than an hour or so in spatial-computing-land.

Making this productivity problem worse is the fact that, at this time, the Vision Pro doesn’t allow you to save your layouts. So every time you want to get back into it, you’ve got to put it on, authenticate, connect to your MacBook, resize that display, open a safari window, put that over there where you want it, maybe your emails go over here, it’s a lot of friction that our editors, for example, don’t go through every time they want to sit down and get a couple hours of work done before their eyes and face hurt too much to continue.

I would classify many of the issues Linus gave in the above quote as solvable in software for the AVP. What is not likely solvable in software are headaches, eye strain, and low angular resolution of the AVP relative to a modern computer monitor in a typical setup.

While speaking in the Los Angeles area at the SID LA One Day conference, I stopped in at Bigscreen Beyond to try out their headset for about three hours. I could wear the Bigscreen Beyond for almost three hours, whereas I typically get a splitting headache with the AVP after about 40 minutes. I don’t know why, but it is likely a combination of much less pressure on my forehead and something to do with the optics. Whatever it is, there is clearly a big difference to me. It was also much easier to drink from a can (right) with the Bigscreen’s much-reduced headset.

Conclusion

It is gratifying to see the blog’s work reach a wide audience worldwide (about 50% of this blog’s audience is outside the USA). As a result of other media outlets picking up this blog’s work, the readership roughly doubled last month to about 50,000 (Google Analytics “Users”).

I particularly appreciated the Linus Tech Tips example of a real workspace in contrast to their “simulation” of the AVP workspace. It helps illustrate some human factor issues with having a headset simulate a computer monitor, including information density. I keep pounding on the Information Density issue because it seems underappreciated by many of the media reports on the AVP.

Appendix: Linus’s Comments on AVP’s “Weird Camera Shutter Angle”

I moved this discussion to this Appendix because it involves some technical discussion that, while it may be important, may not be of interest to everyone and takes some time to explain. At the same time, I didn’t want to ignore it as it brings up a potential issue with the AVP.

At about 16:30 in the LTT Video, Linus also states that the Apple Vision Pro cameras use “weird shutter angles to compensate for the flickering of lights around you, causing them [the AVP] to crank up the ISO [sensitivity], adding a bunch of noise to the image.”

From Wikipedia – Example of a 180-degree shutter angle

For those that don’t know, “shutter angle” (see also https://www.digitalcameraworld.com/features/cheat-sheet-shutter-angles-vs-shutter-speeds) is a hold-over term from the days of mechanical movie shutters where the shutter was open for a percentage of a 360-degree rotating shutter (right). Still, it is now applied to camera shutters, including “electronic shutters” (many large mirrorless cameras have mechanical and electronic shutter options with different effects). A 180-degree shutter angle means the shutter/camera scanning is open one-half the frame time, say 1/48th of a second for a 1/24th-of-a-second frame time or 1/180th of a second for a 1/90th-of-a-second frame time. Typically, people talk about how different shutter angles affect the choppiness of motion and motion blur, not brightness or ISO, even though it does affect ISO/brightness due to the change in exposure time.
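
The relationship between shutter angle, frame rate, and exposure time is simple enough to sketch. The function name below is illustrative, and the 90 frames-per-second case is only the example rate used in the paragraph above, not a measured figure for the AVP's passthrough cameras.

```python
from fractions import Fraction


def exposure_time(shutter_angle_deg, frames_per_second):
    """Exposure time in seconds: the open fraction of the frame time."""
    return Fraction(shutter_angle_deg, 360) / frames_per_second


print(exposure_time(180, 24))   # 1/48
print(exposure_time(180, 90))   # 1/180
print(exposure_time(90, 90))    # 1/360
```

Halving the shutter angle halves the exposure time, so the camera needs one more stop of gain (ISO) for the same image brightness, which is where the added noise comes from.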

I’m not sure why Linus is saying that certain lights are reducing the shutter angle, thus increasing ISO, unless he is saying that the shutter time is being reduced with certain types of light (or simply bright lights) or with certain types of flickering lights the cameras are missing much of the light. If so, it is a roundabout way of discussing the camera issue; as discussed above, the term shutter angle is typically used in the context of motion effects, with brightness/ISO being more of a side issue.

A related temporal issue is the duty cycle of the displays (as opposed to the passthrough cameras), which has a similar “shutter angle” issue. VR users have found that displays with long on-time duty cycles cause perceived blurriness with rapid head movement. Thus, they tend to prefer display technologies with low duty cycles. However, low display duty cycles typically result in less display brightness. LED backlit LCDs can drive the LEDs harder for shorter periods to help make up for the brightness loss. However, OLED microdisplays commonly have relatively long (sometimes 100%) on-time duty cycles. I have not yet had a chance to check the duty cycle of the AVP, but it is one of the things on my to-do list. In light of Linus’s comments, I will want to set up some experiments to check out the temporal behavior of the AVP’s passthrough camera.

KGOnTech
Apple Vision Pro’s Optics Blurrier & Lower Contrast than Meta Quest 3
1 March 2024 at 19:02

By: Karl Guttag

Introduction – Sorry, But It’s True

I have taken thousands of pictures through dozens of different headsets, and I noticed that the Apple Vision Pro (AVP) image is a little blurry, so I decided to investigate. Following up on my Apple Vision Pro’s (AVP) Image Quality Issues – First Impressions article, this article will compare the AVP to the Meta Quest 3 by taking the same image at the same size in both headsets, and I got what many will find to be surprising results.

I know all “instant experts” are singing the praises of “the Vision Pro as having such high resolution that there is no screen door effect,” but they don’t seem to understand that the screen door effect is hiding in plain sight, or should I say “blurry sight.” As mentioned last time, the AVP covers its lower-than-human vision angular resolution by making everything bigger and bolder (defaults, even for the small window mode setting, are pretty large).

While I’m causing controversies by showing evidence, I might as well point out that the AVP’s contrast and color uniformity are also slightly lower than the Meta Quest 3 on anything but a nearly black image. This is because the issues with AVP’s pancake optics dominate over AVP’s OLED microdisplay. This should not be a surprise. Many people have reported “glow” coming from the AVP, particularly when watching movies. That “glow” is caused by unwanted reflections in the pancake optics.

If you click on any image in this article, you can access it in full resolution as cropped from a 45-megapixel original image. The source image is on this blog’s Test Pattern Page. As is the usual practice of this blog, I will show my work below. If you disagree, please show your evidence.

Hiding the Screen Door Effect in Plain Sight with Blur

The numbers don’t lie. As I reported last time in Apple Vision Pro’s (AVP) Image Quality Issues – First Impressions, the AVP’s peak center resolution is about 44.4 pixels per degree (PPD), below the 80 PPD that Apple calls “retinal resolution,” so the pixel jaggies and screen door should be visible if the optics were sharp. So why are so many reporting that the AVP’s resolution must be high since they don’t see the screen door effect? Well, because they are ignoring the issue of the sharpness of the optics.

Two factors affect the effective resolution: the PPD through the optics and the sharpness and contrast of the optics, commonly measured by the Modulation Transfer Function (MTF; see Appendix on MTF).

People do not see the screen door effect with the AVP because the display is slightly out of focus/blurry. Low pass filtering/blurring is the classic way to reduce aliasing and screen door effects. I noticed that when playing with the AVP’s optics, the optics have to be almost touching the display to be in focus. The AVP’s panel appears to be recessed by about 1 millimeter (roughly judging by my eye) beyond the best focus distance. This is just enough so that the thinner gaps between pixels are out of focus while only making the pixels slightly blurry. There are potentially other explanations for the blur, including the microlenses over the OLED panel or possibly a softening film on top of the panel. Still, the focus seems to be the most likely cause of the blurring.
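
A small one-dimensional sketch shows why a little blur hides the gaps between pixels. It builds a row of pixels with dark gaps, applies a Gaussian blur, and measures the Michelson contrast, (max - min) / (max + min), before and after. The fill factor (9 lit samples and a 3-sample gap per pixel) and the blur width (sigma of 3 samples) are arbitrary assumptions for illustration, not measurements of the AVP's panel or optics.

```python
import math


def gaussian_kernel(sigma, radius):
    """Normalized Gaussian weights from -radius to +radius."""
    weights = [math.exp(-(i * i) / (2 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_periodic(signal, kernel):
    """Circular convolution, so the repeating pattern has no edge effects."""
    n, radius = len(signal), len(kernel) // 2
    return [sum(kernel[k] * signal[(i + k - radius) % n]
                for k in range(len(kernel)))
            for i in range(n)]


def michelson_contrast(signal):
    hi, lo = max(signal), min(signal)
    return (hi - lo) / (hi + lo)


# Fine structure: 9 lit samples and a 3-sample dark gap per pixel (assumed).
pixel_grid = ([1.0] * 9 + [0.0] * 3) * 8
# Coarser content: bright and dark bars, each 2 pixels (24 samples) wide.
bars = ([1.0] * 24 + [0.0] * 24) * 2

kernel = gaussian_kernel(sigma=3.0, radius=9)
gap_contrast = michelson_contrast(blur_periodic(pixel_grid, kernel))
bar_contrast = michelson_contrast(blur_periodic(bars, kernel))

print(f"Gap (screen door) contrast after blur: {gap_contrast:.2f}")
print(f"2-pixel bar peak contrast after blur:  {bar_contrast:.2f}")
```

In this sketch, the same blur that suppresses most of the gap contrast leaves the peak contrast of the wider bars intact, although their edges become softer, which is the sharpness cost of hiding the screen door this way.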

Full Image Pictures from the center 46 Degrees of the FOV

I’m going to start with high-resolution pictures through the optics. You won’t be able to see any detail without clicking on them to see them at full resolution, but you may discern that the MQ3 feels sharper by looking at the progressively smaller fonts. This is true even in the center of the optics (square “34” below), even before the AVP’s foveated rendering results in a very large blur at the outside of the image (11, 21, 31, 41, 51, and 61). Later, I will show a series of crops to show the central regions next to each other in more detail.

The pictures below were taken by a Canon R5 (45 Megapixel) camera with a 16mm lens at f8. With a combination of window sizing and moving the headset, I created the same size image on the Apple Vision Pro and Meta Quest 3 to give a fair comparison (yes, it took a lot of time). A MacBook Pro M3 Pro was casting the AVP image, and the Meta Quest 3 was running the Immersed application (to get a flat image) mirroring a PC laptop. For reference, I added a picture of a 28″ LCD monitor taken from about 30″ to give approximately the same FOV as the image from a conventional 4K monitor (this monitor could resolve single pixels of four of these 1080p images, although you would have to have very good vision to see them distinctly).
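
The monitor geometry above can be sanity-checked with basic trigonometry. This sketch assumes a flat 16:9 monitor with 3840 horizontal pixels viewed on-axis from 30 inches; the aspect ratio and the function name are assumptions, since only the diagonal and the approximate viewing distance are given above.

```python
import math


def horizontal_fov_deg(diagonal_in, distance_in, aspect_w=16, aspect_h=9):
    """Horizontal angle subtended by a flat monitor viewed on-axis."""
    width_in = diagonal_in * aspect_w / math.hypot(aspect_w, aspect_h)
    return 2 * math.degrees(math.atan((width_in / 2) / distance_in))


fov_deg = horizontal_fov_deg(diagonal_in=28, distance_in=30)
average_ppd = 3840 / fov_deg   # average over the width; assumes a 4K panel

print(f"Horizontal FOV of the monitor: {fov_deg:.1f} degrees")
print(f"Average pixels per degree: {average_ppd:.1f}")
```

Under these assumptions, the monitor covers roughly the same central angle as the headset crops while delivering about twice the AVP's 44.4 PPD peak.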

Medium Close-Up Comparison

Below are crops from near the center of the AVP image (left), the 28″ monitor (center), and the MQ3 image (right). The red circle on the AVP image over the number 34 is from the eye-tracking pointer being on (also used to help align and focus the camera). The blur of the AVP is more evident in the larger view.

Extreme Close-Up of AVP and MQ3

Cropping even closer to see the details (all the images above are at the same resolution) with the AVP on the top and the MQ3 on the bottom. Some things to note:

Neither the AVP nor MQ3 can resolve the 1-pixel lines, even though a cheap 1080p monitor would show them distinctly.
While the MQ3 has more jaggies and the screen door effect, it is noticeably sharper.
Looking at the space between the circle and the 3-pixel wide lines pointed at by the red arrow, it should be noticed that the AVP has less contrast (is less black) than the MQ3.
Neither the AVP nor MQ3 can resolve the 1-pixel-wide lines correctly, but on the MQ3, the 2- and 3-pixel-wide lines, along with all the text, are significantly sharper and have higher contrast than on the AVP. Yes, the effective resolution of the MQ3 is objectively better than the AVP.
Some color moiré can be seen in the MQ3 image, a color artifact due to the camera’s Bayer filter (not seen by the eye) and the relative sharpness of the MQ3 optics. The camera can “see” the MQ3’s LCD color filters through the optics.

Experiment with Slightly Blurring the Meta Quest 3

A natural question is whether Meta should have made the MQ3’s optics slightly out of focus to hide the screen door effect. As a quick experiment, I applied a slight (Gaussian) blur to the MQ3’s image (middle image below). There is room to blur it while still having a higher effective resolution than the AVP. The AVP still has more pixels, and the person/elf’s image looks softer on the slightly blurred MQ3. The lines are testing for high contrast resolution (and optical reflections), and the photograph shows what happens to a lower contrast, more natural image with more pixel detail.
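The trade-off in this experiment can be sketched in one dimension. The snippet below is only an illustration, not the processing used for the pictures: the blur radius (sigma of 0.8 pixels) and the ideal black/white line patterns are assumptions chosen to show that a small Gaussian blur removes far more contrast from 1-pixel lines than from 2- or 3-pixel lines.

```python
import math

def gaussian_kernel(sigma):
    """Return a normalized 1-D Gaussian kernel truncated at about 3 sigma."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(x * x) / (2 * sigma * sigma))
               for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur(signal, sigma):
    """Convolve a 1-D signal with a Gaussian, clamping indices at the edges."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)
            acc += w * signal[j]
        out.append(acc)
    return out

def line_pattern(line_width, periods=8):
    """Alternating white (1.0) and black (0.0) lines, each line_width pixels wide."""
    one_period = [1.0] * line_width + [0.0] * line_width
    return one_period * periods

def contrast(signal):
    """(Imax - Imin) / (Imax + Imin), measured away from the clamped edges."""
    interior = signal[len(signal) // 4: 3 * len(signal) // 4]
    hi, lo = max(interior), min(interior)
    return (hi - lo) / (hi + lo)

# Assumed blur radius; a real lens blur would have to be measured.
for width in (1, 2, 3):
    blurred = blur(line_pattern(width), sigma=0.8)
    print(f"{width}-pixel lines: contrast after blur = {contrast(blurred):.2f}")
```

The thinnest features (including the gaps between pixels that cause the screen door effect) lose contrast first, which is why a slight blur can hide them while leaving wider strokes mostly intact.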

AVP’s Issues with High-Resolution Content

While Apple markets each display as having the same number of pixels as a 4K monitor (but differently shaped and not as wide), the resolution is reduced by multiple factors, including those listed below:

1. The oval-shaped optics cut about 25-30% of the pixels.
2. The outer part of the optics has poor resolution (about 1/3rd the pixels per degree of the center) and has poor color.
3. A rectangular image must be inscribed inside the “good” part of the oval-shaped optics with a margin to support head movement. While the combined display might have a ~100-degree FOV, there is only about a 45- to 50-degree sweet spot.
4. Any pixels in the source image must be scaled and mapped into the destination pixels. For any high-resolution content, this can cause more than a 2x (linear) loss in resolution and much worse if it aliases. For more on the scaling issues, see my articles on Apple Vision Pro (Part 5A, 5B, & 5C).
5. As part of #4 above or in a separate process, the image must be corrected for optical distortion and color as a function of eye tracking, causing further image degradation.
6. Scintillation and wiggling of high-resolution content with any head movement.
7. Blurring by the optics.

The net of the above, as demonstrated by the photographs through the optics shown earlier, is that the AVP can’t accurately display a detailed 1920×1080 (1080p) image.

AVP Lacks “Information Density”

Making everything bigger, including short messages and videos, can work for low-information-density applications. If anything, the AVP demonstrates that very high resolution is less important for movies than people think (watching movies is a notoriously bad way to judge resolution).

As discussed last time, the AVP makes up for its less-than-human angular resolution by making everything big to hide the issue. But making the individual elements bigger means that the “information density” goes down: less content can be seen simultaneously, and the eyes and head have to move more to see the same amount of content. Consider a spreadsheet; fewer rows and columns will be in the sweet spot of a person’s vision, and less of the spreadsheet will be visible without needing to turn your head.

This blog’s article, FOV Obsession, discusses the issue of eye movement and fatigue using information from Thad Starner’s 2019 Photonics West AR/VR/MR presentation. The key point is that the eye does not normally want to move more than 10 degrees for an extended period. The graph below left is for a monocular display where the text does not move with the head-turning. Starner points out that a typical newspaper column is only about 6.6 degrees. It is also well known that when reading content more than ~30 degrees wide, even for a short period, people will turn their heads rather than move their eyes. Making text content bigger to make it legible will necessitate more eye and head movement to see/read the same amount of content, likely leading to fatigue (I would like to see a study of this issue).

ANSI-Like Contrast

A standard way to measure contrast is using a black-and-white checkerboard pattern, often called ANSI Contrast. It turns out that with a large checkerboard pattern, the AVP and MQ3 have very similar contrast ratios. For the picture below, I made the checkerboard bigger to fill about 70 degrees horizontally for each device’s FOV. The optical reflections inside the AVP’s optics cancel out the inherent high contrast of the OLED displays inside the AVP.

The AVP Has Worse Color Uniformity than the MQ3

You may be able to tell that the AVP has a slightly pink color in the center white squares. As I move my head around, I see the pink region move with it. Part of the AVP’s processing is used to correct color based on eye tracking. Most of the time, the AVP does an OK job, but it can’t perfectly correct for color issues with the optics, which becomes apparent in large white areas. The issues are most apparent with head and eye movement. Sometimes, by Apple’s admission, the correction can go terribly wrong if it has problems with eye tracking.

Using the same images above and increasing the color saturation in both images by the same amount makes the color issues more apparent. The MQ3 color uniformity only slightly changes in the color of the whites, but the AVP turns pink in the center and cyan on the outside.

The AVP’s “aggressive” optical design has about 1.6x the magnification of the MQ3 and, as discussed last time, has a curved quarter waveplate (QWP). Waveplates modify polarized light and are wavelength (color) and angle of light-dependent. Having repeatedly switched between the AVP and MQ3, the MQ3 has better color uniformity, particularly when taking one off and quickly putting the other on.

Conclusion and Comments

As a complete product (more on this in future articles), the AVP is superior to the Meta Quest Pro, Quest 3, or any other passthrough mixed reality headset. Still, the AVP’s effective resolution is less than the pixel differences would suggest due to the softer/blurrier optics.

While the pixel resolution is better than the Quest Pro and Quest 3, its effective resolution after the optics is worse on high-contrast images. Due to having a somewhat higher PPD, the AVP looks better than the MQP and MQ3 on “natural” lower-contrast content. The AVP image is much worse than a cheap monitor displaying high-resolution, high-contrast content. Effectively, what the AVP supports is multiple low angular resolution monitors.

And before anyone makes me out to be a Meta fanboy, please read my series of articles on the Meta Quest Pro. I’m not saying the MQ3 is better than the AVP. I am saying that the MQ3 is objectively sharper and has better color uniformity. Apple and Meta don’t get different physics, and they make different trade-offs which I am pointing out.

The AVP and any VR/MR headset will fare much better with “movie” and video content with few high-contrast edges; most “natural” content is also low in detail and pixel-to-pixel contrast (which is why compression works so well with pictures and movies). I must also caution that we are still in the “wild enthusiasm stage,” where the everyday problems with technology get overlooked.

In the best case, the AVP in the center of the display gives the user a ~20/30 vision view of its direct (non-passthrough) content and worse when using passthrough (20/35 to 20/50). Certainly, some people will find the AVP useful. But it is still a technogeek toy. It will impress people the way 3-D movies did over a decade ago. As a reminder, 3-D TV peaked at 41.45 million units in 2012 before disappearing a few years later.

Making a headset display is like n-dimensional chess; more than 20 major factors must be improved, and improving one typically worsens other factors. These factors include higher resolution, wider FOV, peripheral vision and safety issues, lower power, smaller, less weight, better optics, better cameras, more cameras and sensors, and so on. And people want all these improvements while drastically reducing the cost. I think too much is being made about the cost, as the AVP is about right regarding the cost for a new technology when adjusted for inflation; I’m worried about the other 20 problems that must be fixed to have a mass-market product.

Appendix – Modulation Transfer Function (MTF)

MTF is measured by putting in a series of lines of equal width and spacing and measuring the difference between the white and black as the size and spacing of the lines change. By convention, people typically use the 50% contrast point to specify the MTF. But note that contrast is defined as (Imax-Imin)/(Imax+Imin), so to achieve 50% contrast, the black level must be 1/3rd of the white level. The figure (below) shows how the response changes with the line spacing.

The MTF of the optics is reduced by both the sharpness of the optics and any internal reflections that, in turn, reduce contrast.
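As a quick check of the contrast definition above, the following sketch evaluates (Imax-Imin)/(Imax+Imin) and solves it for the black level. The function names are illustrative only and are not taken from any measurement tool.

```python
def michelson_contrast(i_max, i_min):
    """Contrast as defined in the text: (Imax - Imin) / (Imax + Imin)."""
    return (i_max - i_min) / (i_max + i_min)

def black_level_for_contrast(contrast, white=1.0):
    """Solve C = (W - B) / (W + B) for B, giving B = W * (1 - C) / (1 + C)."""
    return white * (1.0 - contrast) / (1.0 + contrast)

# A black level of 1/3 of white gives 50% contrast.
print(michelson_contrast(1.0, 1.0 / 3.0))
# Conversely, 50% contrast requires black at 1/3 of white.
print(black_level_for_contrast(0.5))
```

This makes the point in the text concrete: a display can lose half of its measured contrast while the “black” lines are still only one third as bright as the white ones.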

KGOnTech
Apple Vision Pro (Part 5C) – More on Monitor Replacement is Ridiculous
21 August 2023 at 00:17

By: Karl Guttag

Introduction

In this series about the Apple Vision Pro, this sub-series on Monitor Replacement and Business/Text applications started with Part 5A, which discussed scaling, text grid fitting, and binocular overlap issues. Part 5B starts by documenting some of Apple’s claims that the AVP would be good for business and text applications. It then discusses the pincushion distortion common in VR optics and likely in the AVP and the radial effect of distortion on resolution in terms of pixels per degree (ppd).

The prior parts, 5A and 5B, provide setup and background information for what started as a simple “Shootout” between a VR virtual monitor and physical monitors. As discussed in 5A, my office setup has a 34″ 22:9 3440×1440 main monitor with a 27″ 4K (3840×2160) monitor on the right side, which is a “modern” multiple monitor setup that costs ~$1,000. I will use these two monitors plus a 15.5″ 4K OLED Laptop display to compare to the Meta Quest Pro (MQP) since I don’t have an AVP and then extrapolate the results to the AVP.

*My Office Setup: 34″ 22:9 3440×1440 (left) and 27″ 4K (right)*

I will be saving my overall assessment, comments, and conclusions about VR for Office Applications for Part 5D rather than somewhat burying them at the end of this article.

Office Text Applications and “Information Density” – Font Size is Important

A point to be made by using spreadsheets to generate the patterns is that if you have to make text bigger to be readable, you are lowering the information density and are less productive. Lowering the information density with bigger fonts is also true when reading documents, particularly when scanning web pages or documents for information.

Improving font readability is not solely about increasing their size. VR headsets will have imperfect optics that cause distortions, focus problems, chromatic aberrations, and loss of contrast. These issues make it harder to read fonts below a certain size. In Part 5A, I discussed how scaling/resampling and the inability to grid fit when simulating virtual monitors could cause fonts to appear blurry and scintillate/wiggle when locked in 3-D space, leading to reduced readability and distraction.

Meta Quest Pro Horizon Worktop Desktop Approach

As discussed in Part 5A, with Meta’s Horizon Desktop, each virtual monitor is reported to Windows as 1920 by 1200 pixels. When sitting at the nominal position of working at the desktop, the center virtual monitor fills about 880 physical pixels of the MQP’s display vertically. So roughly 1200 virtual pixels are resampled into 880 physical pixels in the center of view, or to about 73% (880/1200) of the source size. As discussed in Part 5B, the scaling factor is variable due to severe pincushion distortion of the optics and the (impossible to turn off) curved screen effect in Meta Horizons.

The picture below shows the whole FOV captured by the camera before cropping shot through the left eye. The camera was aligned for the best image quality in the center of the virtual monitor.

Analogous to Nyquist sampling, when you scale a pixel-rendered image, you want the display to have about 2X (linearly) the number of pixels of the source image to render it reasonably faithfully. Below left is a 1920 by 1200 pixel test pattern (a 1920×1080 pattern padded on the top and bottom), “native” to what the MQP reports to Windows. On the right is the picture cropped to that same center monitor.
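The 2X rule of thumb can be put into numbers with a short sketch. The 1200 and 880 pixel figures come from the discussion of the Horizon Desktop above; the function names and the default oversampling factor of 2 are just an illustrative encoding of the rule, not a formal criterion.

```python
def display_pixels_needed(source_pixels, oversampling=2.0):
    """Display pixels wanted (in one dimension) to resample a source faithfully."""
    return source_pixels * oversampling

def fraction_of_target(display_pixels, source_pixels, oversampling=2.0):
    """How much of the wanted display resolution is actually available."""
    return display_pixels / display_pixels_needed(source_pixels, oversampling)

source_rows = 1200   # virtual monitor height reported to Windows
display_rows = 880   # approximate physical rows covered in the center of view

needed = display_pixels_needed(source_rows)
available = fraction_of_target(display_rows, source_rows)
print(f"rows wanted by the 2X rule: {needed:.0f}")
print(f"rows available: {display_rows} ({available:.0%} of the target)")
```

Under these assumptions, the headset is not just short of the 2X target; it has fewer physical rows than the source image itself, so fine detail is necessarily lost.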

The picture was taken at 405mp, then scaled down by 3X linearly and cropped. When taking high-resolution display pictures, some amount of moiré in color and intensity is inevitable. The moiré is also affected by scaling and JPEG compression.

Below is a center crop from the original test pattern that has been 2x pixel-replicated to show the detail in the pattern.

Below is a crop from the full-resolution image with reduced exposure to show sub-pixel (color element) detail. Notice how the 1-pixel wide lines are completely blurred, and the text is just becoming fully formed at about Arial 11 point (close to, but not the same scale as used in the MS Excel Calibri 11pt tests to follow). Click on the image to see the full resolution that the camera captured (3275 x 3971 pixels).

The scaling process might lose a little detail for things like pictures and videos of the real world (such as the picture of the elf in the test pattern), but it will be almost impossible for a human to notice most of the time. Pictures of the real world don’t have the level of pixel-to-pixel contrast and fine detail caused by small text and other computer-generated objects.

Meta Quest Pro Virtual Versus Physical Monitor “Shootout”

For the desktop “shootout,” I picked the 34” 22:9 and 27” 4K monitors I regularly use (side by side as shown in Part 5A), plus a Dell 15.5” 4K laptop display. An Excel spreadsheet is used with various displays to demonstrate the amount of content that can be seen at one time on a screen. The spreadsheet allows for flexible changing of how the screen is scaled for various resolutions and text sizes, and the number of cells measures the information density. For repeatability, a screen capture of each spreadsheet was taken and then played back in full-screen mode (Appendix 1 includes the source test patterns).

The Shootout

The pictures below show the relative FOVs of the MQP and various physical monitors taken with the same camera and lens. The camera was approximately 0.5 meters from the center of the physical monitors, and the headset was at the initial position at the MQP’s Horizon Desktop. All the pictures were cropped to the size of a single physical or virtual monitor.

The following is the basic data:

Meta Quest Pro – Central Monitor (only) ~43.5° horizontal FOV. Used an 11pt font with Windows Display Text Scaling at 150% (100% and 175% also taken and included later)
34″ 22:9 3440×1440 LCD – 75° FOV and 45ppd from 0.5m. 11pt font with 100% scaling
27″ 4K (3840 x 2160) LCD – 56° FOV and 62ppd from 0.5m. 11pt font with 150% scaling (results in text the same size as the 34″ 3440×1440 at 100% – 2160/1440 = 150%)
15.5″ 4K OLED – 32° FOV from 0.5m. Shown below is 11pt with 200% scaling, which is what I use on the laptop (a later image shows 250% scaling, which is what Windows “recommends” and would result in approximately the same size fonts as the 34″ 22:9 at 100%).
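The FOV and ppd figures in the list above can be approximated from simple geometry. The sketch below assumes a flat screen viewed on-axis and reports the average ppd across the full width, so its results can differ by a few degrees from values measured through the camera; the function name is hypothetical.

```python
import math

def monitor_fov_and_ppd(diagonal_in, h_pixels, v_pixels, distance_m):
    """Estimate horizontal FOV (degrees) and average pixels per degree.

    Assumes a flat panel with square pixels, viewed from a point centered
    on and perpendicular to the screen.
    """
    diag_pixels = math.hypot(h_pixels, v_pixels)
    width_m = diagonal_in * 0.0254 * h_pixels / diag_pixels
    fov_deg = 2.0 * math.degrees(math.atan((width_m / 2.0) / distance_m))
    return fov_deg, h_pixels / fov_deg

monitors = [
    ("34-inch 3440x1440", 34.0, 3440, 1440),
    ("27-inch 3840x2160", 27.0, 3840, 2160),
]
for name, diagonal, width_px, height_px in monitors:
    fov, ppd = monitor_fov_and_ppd(diagonal, width_px, height_px, distance_m=0.5)
    print(f"{name}: about {fov:.0f} degrees wide, about {ppd:.0f} ppd at 0.5 m")
```

Because the angle subtended per pixel is largest straight ahead and shrinks toward the edges of a flat screen, the ppd at the center of the monitor is somewhat lower than this average.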

*Composite image showing the relative FOV – Click to see in higher resolution (9016×5641 pixels)*

The pictures below show the MQP with MS Windows display text scaling set to 100% (below left) and 175% (below middle). The 175% scaling would result in fonts with about the same number of pixels per font as the Apple Vision Pro (but with a larger angular resolution). Also included below (right) is the 15.5″ 4K display with 250% scaling (as recommended by Windows).

The camera was aimed and focused at the center of the MQP, the best case for it, as the optical quality falls off radially (discussed in Part 5B). The text sharpness is the same for the physical monitors from center to outside, but they have some brightness variation due to their edge illumination.

Closeup Look at the Displays

Each picture above was initially taken 24,576 x 16,384 (405mp) by “pixel shifting” the 45MP R5 camera sensor to support capturing the whole FOV while capturing better than pixel-level detail from the various displays. In all the pictures above, including the composite image with multiple monitors, each image was reduced linearly by 3X.

The crops below show the full resolution (3x linearly the images above) of the center of the various monitors. As the camera, lens, and scaling are identical, the relative sizes are what you would see looking through the headset for the MQP sitting at the desktop and the physical monitors at about 0.5 meters. I have also included a 2X magnification of the MQP’s images.

With Windows 100% text scaling, the 11pt font on the MQP is about the same size as it is on the 34” 22:9 monitor at 100%, the 27” 4K monitor at 150% scaling, and the 15.5” 4K monitor at 250% scaling. But while the fonts are readable on the physical monitor, they are a blurry mess on the MQP at 100%. The MQP at 150% and 175% is “readable” but certainly does not look as sharp as the physical monitors.

Extrapolating to Apple Vision Pro

Apple’s AVP has about 175% of the linear pixel density of the MQP. Thus the 175% case gives a reasonable idea of how text should look on the AVP. For comparison below, the MQP’s 175% case has been scaled to match the size of the 34” 22:9 and 27” 4K monitors at 100%. While the text is “readable” and about the same size, it is much softer/blurrier than the physical monitor. Some of this softness is due to optics, but a large part is due to scaling. While the AVP may have better optics and a better text rendering pipeline, it still doesn’t have the resolution to compete on content density and readability with a relatively inexpensive physical monitor.

Reportedly, Apple Vision Pro Directly Rendering Fonts

Thomas Kumlehn had an interesting comment on Part 5B (with my bold highlighting) that I would like to address:

After the VisionPro keynote in a Developer talk at WWDC, Apple mentioned that they rewrote the entire render stack, including the way text is rendered. Please do not extrapolate from the text rendering of the MQP, as Meta has the tech to do foveated rendering but decided to not ship it because it reduced FPS.

*From Part 5A,* “Rendering a Pixel Size Dot.“

Based on my understanding, the AVP will “render from scratch” instead of rendering an intermediate image that is then rescaled as is done with the MQP discussed in Part 5A. While rendering from scratch has a theoretical advantage regarding text image quality, it may not make a big difference in practice. With an ~40 pixels per degree (ppd) display, the strokes and dots of what should be readable small text will be on the order of 1 pixel wide. The AVP will still have to deal with approximately pixel-width objects straddling four or more pixels, as discussed in Part 5A: Simplified Scaling Example – Rendering a Pixel Size Dot.
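The straddling problem can be illustrated with a simple area-coverage calculation. This is only a sketch using a box filter, which is an assumption made for illustration; neither the AVP’s nor the MQP’s actual rendering filter is known here.

```python
import math

def dot_coverage(x, y, size=1.0):
    """Area-weighted coverage of a size-by-size square dot on a unit pixel grid.

    (x, y) is the dot's top-left corner in pixel units. Returns a dict that
    maps each touched pixel (column, row) to the fraction of the dot's
    intensity that lands in it.
    """
    coverage = {}
    for px in range(math.floor(x), math.ceil(x + size)):
        for py in range(math.floor(y), math.ceil(y + size)):
            overlap_x = min(x + size, px + 1) - max(x, px)
            overlap_y = min(y + size, py + 1) - max(y, py)
            if overlap_x > 0 and overlap_y > 0:
                coverage[(px, py)] = overlap_x * overlap_y
    return coverage

print(dot_coverage(0.0, 0.0))   # aligned with the grid: one pixel at full intensity
print(dot_coverage(0.5, 0.5))   # half a pixel off: four pixels at 1/4 intensity each
```

With a locked-in-space virtual monitor, the offset changes continuously as the head moves, so the same dot keeps shifting between these cases, which contributes to the softness and scintillation described in Part 5A.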

Some More Evaluation of MQP’s Pancake Optics Using Immersed Virtual Monitor

I wanted to evaluate the MQP pancake optics more than I did in Part 5B. Meta’s Horizon Desktop interface was very limiting. So I decided to try out Immersed Virtual Desktop software. Immersed has much more flexibility in the resolution, size, placements, and the ability to select flat or curved monitors. Importantly for my testing, I could create a large, flat virtual 4K monitor that could fill the entire FOV with a single test pattern (the pattern is included in Appendix 1).

Unfortunately, while the Immersed software had the basic features I wanted, I found it difficult to precisely control the size and positioning of the virtual monitor (more on this later). Due to these difficulties, I just tried to fill the display with the test pattern, with the virtual monitor only roughly perpendicular to the headset/camera. It was a painfully time-consuming process, and I never could get the monitor to where it seemed perfectly perpendicular.

Below is a picture of the whole (camera) FOV taken at 405mp and then scaled down to 45mp. The image is a bit underexposed to show the sub-pixel (color) detail when viewed at full resolution. In taking the picture, I determined that the focus of the MQP’s pancake optics appears to be “dished,” with the focus in the center slightly different than on the outsides. The picture was taken focusing between the center and outside focus and using f11 to increase the photograph’s depth of focus. For a person using the headset, this dishing of the focus is likely not a problem as their eye will refocus based on their center of vision.

As discussed in Part 5B, the MQP’s pancake optics have severe pincushion distortion, requiring significant digital pre-correction to make the net result flat/rectilinear. Most notably, the outside areas of the display have about 1/3rd the linear pixel per degree of the center.

Next are shown 9 crops from the full-resolution (click to see) picture at the center, the four corners, top, bottom, left, and right of the camera’s FOV.

The main things I learned from this exercise are the apparent dish in the focus of the optics and the falloff in brightness. I had already determined the change in resolution in the studies shown in Part 5B.

Some feedback on Immersed (and all other VR/AR/MR) virtual monitor placement control.

While Immersed had the features I wanted, it was difficult to control the setup of the monitors. The software feels very “beta,” and the interface I got differed from most of the help documentation and videos, suggesting it is a work in progress. In particular, I couldn’t figure out how to pin the screen, as the control for pinning shown in the help guides/videos didn’t seem to exist on my version. So I had to start from scratch on each session and often within a session.

Trying to orient and resize the screen with controllers or hand gestures was needlessly difficult. I would highly suggest Immersed look at some of the 3-D CAD software controls of 3-D models. For example, it would be great to have a single (virtual) button that would position the center monitor directly in front and perpendicular to the user. It would also be a good idea to allow separate control for tilt, virtual distance, and zoom/resize while keeping the monitor centered.

The software seemed to be “aware” of things in the room, which only served to fight what I wanted to do. I was left contorting my wrist to try and get the monitor roughly perpendicular and then playing with the corners to try and both resize and center the monitor. The interface also appears to conflate “resizing” with moving the monitor closer. While moving the virtual monitor closer or resizing it both affect the size of everything, the effect will be different when the head moves. I would have a home (perpendicular and center) “button,” and then left-right, up-down, tilt, distance, and size controls.

To be fair, I only wanted to set up the screen for a few pictures, and I may have overlooked something. Still, I found the user interface could be vastly better for setting up the monitors, and the controller or gesture monitor sizing and positioning were a big fail in my use.

BTW, I don’t want to just pick on Immersed for this “all-in-one” control problem. Every VR and AR/MR headset I have tried that supports virtual monitors has made it a pain to place the monitors in 3D space; none gives the user good, simple, intuitive controls. Meta Horizons Desktop goes to the extreme of giving no control and overly curved screens.

Other Considerations and Conclusions in Part 5D

This series-within-a-series on VR and the AVP’s use as an “office monitor replacement” has become rather long with many pictures and examples. I plan to wrap up this series within the series on the AVP with a separate article on issues to consider and my conclusions.

Appendix 1: Test Patterns

Below is a gallery of PNG file test patterns used in this article. Click on each thumbnail to see the full-resolution test pattern.

Appendix 2: Some More Background Information

More Comments on Font Sizes with Windows

As discussed in Appendix 3: Confabulating typeface “points” (pt) with With Pixels – A Brief History, a font “point” is defined as 1/72nd of an inch (some use 1/72.272 or thereabout – it is a complicated history). Microsoft throws in the concept of 96 dots per inch (dpi) as 100%. But it is not that simple.

I wanted to share measurements regarding the Calibri 11pt font size. After measuring it on my monitor with a resolution of 110 pixels per inch (PPI), I found that it translates to approximately 8.44pt (8.44/72 inches). However, when factoring in the monitor PPI of 110 and Windows DPI of 96, the font size increases to ~9.67pt. Alternatively, when using a monitor PPI of 72, the font size increases to ~12.89pt. Interestingly, if printed assuming a resolution of 96ppi, the font reaches the standard 11pt size. It seems Windows applies some additional scaling on the screen. Nevertheless, I regularly use the 11pt 100% font size on my 110ppi monitor, which is the Windows default in Excel and Word, and it is also the basis for the test patterns.
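The arithmetic in the paragraph above can be reproduced with a few lines. This sketch only re-derives the quoted figures from the 8.44pt measurement; the function names are illustrative, and nothing here describes how Windows actually scales fonts internally.

```python
POINTS_PER_INCH = 72.0  # the 1/72-inch definition of a point used in the text

def rescale_points(measured_pt, monitor_ppi, assumed_ppi):
    """Scale a measured size by the ratio of actual to assumed pixel density."""
    return measured_pt * monitor_ppi / assumed_ppi

def points_to_pixels(points, dpi):
    """Pixels spanned by a size in points at a given dots per inch."""
    return points / POINTS_PER_INCH * dpi

measured = 8.44  # measured size of Calibri 11pt on the 110 PPI monitor
print(f"{rescale_points(measured, 110, 96):.2f} pt relative to 96 DPI")
print(f"{rescale_points(measured, 110, 72):.2f} pt relative to 72 PPI")
print(f"{points_to_pixels(11, 96):.2f} pixels for 11pt at 96 DPI")
```

The first two results match the ~9.67pt and ~12.89pt figures quoted above.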

How pictures were shot and moiré

As discussed in 5A’s Appendix 2: Notes on Pictures, some moiré issues will be unavoidable when taking high-resolution pictures of a display device. As noted in that Appendix, all pictures in Lens Shootout were taken with the same camera and lens, and the original images were captured at 405 megapixels (Canon R5 “IBIS sensor shift” mode) and then scaled down by 3X. All test patterns used in this article are included in Appendix 1.

KGOnTech
Apple Vision Pro (Part 5A) – Why Monitor Replacement is Ridiculous
5 August 2023 at 17:53

By: Karl Guttag

Introduction

As I wrote in Apple Vision Pro (Part 1) regarding the media coverage of the Apple Vision Pro, “Unfortunately, I saw very little technical analysis and very few with deep knowledge of the issues of virtual and augmented reality. At least they didn’t mention what seemed to me to be obvious issues and questions.”

I have been working for the last month on an article to quantify why it is ridiculous to think that a VR headset, even one from Apple, will be a replacement for a physical monitor. In writing the article, I felt the need to include a lot of background material and other information as part of the explanation. As the article was getting long, I decided to break it into two parts, this being the first part.

The issues will be demonstrated using the Meta Quest Pro (MQP) because that is the closest headset available, and it also claims to be for monitor replacement and uses similar pancake optics. I will then translate these results to the higher, but still insufficient, resolution of the Apple Vision Pro (AVP). The AVP will have to address all the same issues as the MQP.

Office applications, including word processing, spreadsheets, presentations, and internet browsing, mean dealing with text. As this article will discuss, text has always been treated as a special case with some “cheating” (“hints” for grid fitting) to improve sharpness and readability. This article will also deal with resolution issues with trying to fit a virtual monitor in a 3-D space.

For this set of articles, I will suspend my disbelief in many other human factor problems caused by trying to simulate a fixed monitor in VR so that I can concentrate on the readability of text.

Back to the Future with Very Low Pixels Per Degree (ppd) with the Apple Vision Pro

Working on this article reminded me of lessons learned in the mid-1980s when I was the technical leader of the TMS34010, the first fully programmable graphics processor. The TMS340 development started in 1982 before an Apple Macintosh (1984) or Lisa (1983) existed (and they were only 1-bit per pixel). But like those products, my work on the 34010 was influenced by Xerox PARC. At that time, only very expensive CAD and CAM systems had “bitmapped graphics,” and all PC/Home Computer text was single-size and monospaced. They were very low resolution if they had color graphics (~320×200 pixels). IBM introduced VGA (640×480) and XGA (1024×768) in 1987, which were their first IBM PC square pixel color monitors.

The original XGA monitor, considered “high resolution” at the time, had a 16” diagonal and 82ppi, which translates to 36 to 45 pixels per degree (ppd) from 0.5 meters to 0.8 meters (typical monitor viewing distance), respectively. Factoring in the estimated FOV and resolutions, the Apple Vision Pro is between 35 and 40 ppd or about the same as a 1987 monitor.

So it is time to dust off the DeLorean and go Back to the Future of the mid-1980s and the technical issues with low ppd displays. Only it is worse this time because, in the 1980s, we didn’t have to resample/rescale everything in 3-D space when the user’s head moves to give the illusion that the monitor isn’t moving.

For more about my history in 1980s computer graphics and GPUs, see Appendix 1: My 1980s History with Bitmapped Fonts and Multiple Monitors.

The question is, “Would People?” Not “Could People?” Use an Apple Vision Pro (AVP) as a Computer Monitor

With their marketing and images (below), Apple and Meta suggest that their headsets will work as a monitor replacement. Yes, they will “work” as a monitor if you are desperate and have nothing else, but having multiple terrible monitors is not a solution many people will want. These marketing concepts fail to convey that each virtual monitor will have low effective resolution forcing the text to be blown up to be readable and thus have less content per monitor. They also fail to convey that the text looks grainy and shimmers (more on this in a bit).

Meta Quest Pro (left) and Apple Vision Pro (right) have similar multiple monitor concepts.

Below is a through-the-lens picture of MQP’s Horizons Virtual Desktop. It was taken through the left eye’s optics with the camera centered for best image quality and shows more of the left side of the binocular FOV. Almost all the horizontal FOV for the left eye is shown in the picture, but the camera slightly cuts off the top and bottom.

*MQP Horizon Desktop – Picture via the Left Eye Optics (camera FOV 80°x64°)*

Below for comparison is my desktop setup with a 34” 22:9 3440×1440 monitor on the left and a 27” 4K monitor on the right. The combined cost of the two monitors is less than $1,000 today. The 22:9 monitor display setting is 100% scale (in Windows display settings) and has 11pt fonts in the spreadsheet. The righthand monitor is set for 150% scaling with 11pt fonts, netting fonts that are physically the same size.

My office setup – 34” 22:9 3440×1440 (110 PPI) widescreen (left) & 27” 16:9 4K (163 PPI) Monitor (right)

Sitting 0.5 to 0.8 meters away (typical desktop monitor distance), I would judge the 11pt font on either of the physical monitors as much more easily readable than the 11pt font on the Meta Quest Pro with the 150% scaling, even though the MQP’s “11pt” is angularly about 1.5x bigger (as measured via the camera). The MQP’s text is fuzzier, grainier, and scintillates/shimmers. I could over six times the legible text on the 34” 22:9 monitor and over four times on the 27” 4K as the MQP. With higher angular resolution, the AVP will be better than the MQP but still well below the amount of legible text.

Note on Windows Scaling

In Windows, 100% means a theoretical 96 dots per inch. Windows factors in the information reported to it by the monitor (in this case, from the MQP’s software) to give a “Scale and Layout” recommendation (right). The resolution reported to Windows by the MQP’s Horizon’s virtual monitor is 1920×1200, and the recommended scaling was 150%. This setting is what I used for most pictures other than for the ones called out as being at 100% or 175%.
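
To make the scaling arithmetic concrete, here is a small sketch of how a font size in points maps to a nominal (em) size in pixels, assuming the conventional 1/72-inch point and the 96 DPI baseline for 100% scaling. The function name is illustrative and is not a Windows API.

```python
def font_px(points, scale_percent, base_dpi=96):
    """Nominal font (em) size in pixels for a given Windows scaling percentage."""
    return points / 72 * base_dpi * scale_percent / 100

for scale in (100, 150, 175):
    print(f"11pt at {scale}%: {font_px(11, scale):.1f} px")
```

This is only the nominal size; the actual height of individual glyphs depends on the typeface.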

For more on the subject of how font “points” are defined, see Appendix 3: Confabulating typeface “points” (pt) with Pixels – A Brief History.

Optics

I’m not going to go into everything wrong with VR optics, as this article deals with being able to read text in office applications. VR optics have a lot of constraints in terms of cost, space, weight, and wide FOV. While pancake optics are a major improvement over the more common Fresnel lenses, to date, they still are poor optically (we will have to see about the AVP).

While not bad in the center of the FOV, they typically have severe pincushion distortion and chromatic (color) aberrations. Pancake optics are more prone to collecting and scattering light, causing objects to glow on dark backgrounds, contrast reduction, and ghosts (out-of-focus reflections). I discussed these issues with Pancake Optics in Meta (aka Facebook) Cambria Electrically Controllable LC Lens for VAC. With computer monitors, there are no optics to cause these problems.

Optical Distortion

As explained in Meta Quest Pro (Part 1) – Unbelievably Bad AR Passthrough, the Meta Quest Pro rotates the two displays for the eyes ~20° to clear the nose. The optics also have very large pincushion distortion. The display processor on the MQP pre-corrects digitally for the display optics’ severe pincushion distortion. This correction comes at some loss of fidelity in the resampling process.

The top right image shows the video feed to the displays. The distortion and rotation have been digitally corrected in the lower right image, but other optical problems are not shown (see through-the-lens pictures in this article).

There is also an optical “cropping” of the left and right eye displays, indicated by the Cyan and Red dashed lines, respectively. The optical cropping shown is based on my observations and photographs.

The pre-distortion correction is certainly going to hurt the image quality. It is likely that the AVP, using similar pancake optics, will have similar needs for pre-correction. Even though the MQP displays are rotated (no word on the AVP), there are so many other transforms/rescalings, including the transforms in 3-D space required to make the monitor(s) appear stationary, that if the rotation is combined with them (rather than done as a separate transform), the effect of the display’s rotation on resolution may be negligible. The optical quality distortion and the loss of text resolution, when transformed in 3-D space, are more problematic.

Binocular Overlap and Rivalry

One of the ways to improve the overall FOV with a binocular system is to have the FOV of the left and right eye only partially overlap (see figure below). The paper Perceptual Guidelines for Optimizing Field of View in Stereoscopic Augmented Reality Displays and the article Understanding Binocular Overlap and Why It’s Important for VR Headsets discuss the issues with binocular overlap (also known as “Stereo Overlap”). Most optical AR/MR systems have a full or nearly full overlap, whereas VR headsets often have a significant amount of partial overlap.

Partial overlap increases the total FOV when combining both eyes. The problem with partial overlap occurs at the boundary where one FOV ends in the middle of the other eye’s FOV. One eye sees the image fade out to black, whereas the other sees the image. This is a form of Binocular Rivalry, and it is left to the visual cortex to sort out what is seen. The visual cortex will mostly sort it out in a desirable way, but there will be artifacts. Most often, the visual cortex will pick the eye that appears brighter (i.e., the cortex picks one and does not average), but there can be problems with the transition area. Additionally, where one is concentrating can affect what is seen/perceived.

In the case of the MQP, the region of binocular overlap is slightly less than the width of the center monitor in Meta’s Horizon’s Desktop when viewed from the starting position. Below left shows the view through the left eye when centering the monitor in the binocular FOV.

When concentrating on a cell in the center, I didn’t notice a problem, but when I took in the whole image, I could see these rings, particularly in the lighter parts of the image.

The Meta Quest 2 appears to have substantially more overlap. On the left is a view through the left eye with the camera positioned similarly to the MQP (above left). Note how the left eye’s FOV overlaps the whole central monitor. I didn’t notice the transition “rings” with the Meta Quest 2 as I did with the MQP.

Binocular overlap is not one of those things VR companies like to specify; they would rather talk about the bigger FOV.

In the case of the AVP, it will be interesting to see the amount of binocular overlap in their optics and if it affects the view of the virtual monitors. One would like the overlap to be more than the width of a “typical” virtual monitor, but what does “typical” mean if the monitors can be of arbitrary size and positioned anywhere in 3-D space, as suggested in the AVP’s marketing material?

Inscribing a virtual landscape-oriented monitor uses about half of the vertical pixels of the headset.

The MQP’s desktop illustrates the basic issues of inscribing a virtual monitor into the VR FOV while keeping the monitor stationary. There is some margin for allowing head movement without cutting off the monitor, which would be distracting. Additionally, the binocular overlap cutting off the monitor is discussed above.

As discussed in more detail later, the MQP uses a 16:10 aspect ratio, 1920×1200 pixel “virtual monitors” (the size it reports to Windows). The multiple virtual monitors are mapped into the MQP’s 1920×1800 physical display. Looking straight ahead, sitting at the desktop, you see the central monitor and about 30% of the two side monitors.

The center monitor’s center uses about 880 pixels, or about half of the 1800 vertical pixels of the MQP’s physical display. The central monitor behaves as if it is about 1.5 meters (5 feet) away or about 2 to 3 times the distance of a typical computer monitor. This makes “head zooming” (leaning in to make the image bigger) ineffective.

Apple’s AVP has a similar FOV and will have similar limitations in fitting virtual monitors. There is the inevitable compromise between showing the whole monitor with some latitude for the user moving their head while avoiding cutting off the sides of the monitor.

Simplified Scaling Example – Rendering a Pixel Size Dot

The typical readable text has a lot of high-resolution, high-contrast features that will be on the order of one pixel wide, such as the stroke and dot in the letter “i.” The problems with drawing a single pixel size dot in 3-D space illustrate some of the problems.

Consider drawing a small circular dot that, after all the 3-D transforms, is the size of about one pixel. In the figure below, the pixel boundaries are shown with blue lines. The four columns in the figure below show a few of an infinite number of relationships between a rendered dot and the pixel grid.

The first row shows the four dots relative to the grid. The nearest pixel is turned on in the second row based on the centroid. In row three, a simple average is used to draw the pixel where the average of 4 pixels should equal the brightness of one pixel. The fourth row shows a low-pass filter of the virtual dots. The fifth row renders the pixels based on the average value of the low-pass filtered version of the dots.

The centroid method is the sharpest and keeps the size of the dot the same, but the location will tend to jump around with the slightest head movement. If many dots formed an object, the shape would appear to wriggle. With the simple average, the “center of mass” is more accurate than the centroid method, but the dot changes shape dramatically based on alignment/movement. The average of the low-pass filter method is better in terms of center of mass, and the shape changes less based on alignment, but now a single pixel size circle is blurred out over 9 pixels.
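
The tradeoff between these methods can be sketched in one dimension. The Python below is only an illustration under simplifying assumptions: the dot is a one-pixel-wide box on a single row of pixels, and the low-pass step uses a 3-tap [1/4, 1/2, 1/4] kernel that I picked for the example; it is not the filter any particular headset uses.

```python
def centroid_method(center, n=6):
    """Light only the pixel that contains the dot's center (sharp, but jumpy)."""
    out = [0.0] * n
    out[int(center)] = 1.0
    return out

def simple_average(center, n=6, width=1.0):
    """Give each pixel the fraction of the dot that overlaps it."""
    lo, hi = center - width / 2, center + width / 2
    return [max(0.0, min(hi, i + 1) - max(lo, i)) / width for i in range(n)]

def low_pass_average(center, n=6):
    """Blur the coverage with a 3-tap [1/4, 1/2, 1/4] filter (an assumed kernel)."""
    cov = [0.0] + simple_average(center, n) + [0.0]   # pad both ends with zeros
    return [0.25 * cov[i] + 0.5 * cov[i + 1] + 0.25 * cov[i + 2] for i in range(n)]

def center_of_mass(pixels):
    """Brightness-weighted position; pixel i spans [i, i + 1)."""
    return sum((i + 0.5) * v for i, v in enumerate(pixels)) / sum(pixels)

for c in (2.5, 2.9, 3.1):   # a one-pixel dot drifting across a pixel boundary
    for method in (centroid_method, simple_average, low_pass_average):
        px = method(c)
        lit = sum(1 for v in px if v > 0)
        print(f"center={c} {method.__name__:16s} lit={lit} "
              f"center_of_mass={center_of_mass(px):.2f}")
```

In this sketch, the centroid method always lights exactly one pixel but its center of mass jumps by a whole pixel between 2.9 and 3.1, while the two averaging methods track the true center at the cost of spreading the dot. Three lit pixels per axis in the low-pass case correspond to a 3×3 block in two dimensions.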

There are many variations to resampling/scaling, but they all make tradeoffs. A first-order tradeoff is between wiggling (changing in shape and location) with movement versus sharpness. A big problem with text when rendered on low ppd displays, including the Apple Vision Pro, is that many features, from periods to the dots of letters to the stroke width of small text fonts, will be close to 1 pixel.

Scaling text – 40+ Years of Computer Font Grid Fitting (“Cheating”) Exposed

Since the beginning, personal computers have dealt with low pixels-per-inch monitors, translating into low pixels per degree based on typical viewing distances. Text is full of fine detail and often has perfectly horizontal and vertical strokes that, even with today’s higher PPI monitors, cause pixel alignment issues. Text is so important and so common that it gets special treatment. Everyone “cheats” to make text look better.

The fonts need to be recognizable without making them so big that the eye has to move a lot to read words and make content less dense with less information on a single screen. Big fonts produce less content per display and more eye movement, making the muscles sore.

In the early to mid-1980s, PCs moved from rough-looking fixed-space text to proportionally spaced text with carefully hand-crafted fonts, and only a few font sizes were available. Font edges are also smoothed (antialiased) to make them look better. Today, most fonts are rendered from a model with “hints” that help the fonts look better on a pixel grid. TrueType, originally developed by Apple as a workaround to paying royalties to Adobe, is used by both Apple and MS Windows and includes “Hints” in the font definitions for grid fitting (see: Windows hinting and Apple hinting).

Simplistically, grid fitting tries to make horizontal and vertical strokes of a font land on the pixel grid by slightly modifying the shape and location (vertical and horizontal spacing) of the font. Doing so requires less smoothing/antialiasing without making the font look jagged. This works because computer monitor pixels are on a rectangular grid, and in most text applications, the fonts are drawn in horizontal rows.
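
As a rough illustration of the idea (not how TrueType hints are actually encoded or executed), the sketch below snaps a hypothetical vertical stem to whole pixels and compares the pixel coverage with and without the snap. The stem position, width, and function names are all made up for the example.

```python
def coverage(left, width, n=8):
    """Fraction of each pixel column covered by a stem spanning [left, left + width)."""
    right = left + width
    return [round(max(0.0, min(right, i + 1) - max(left, i)), 3) for i in range(n)]

def grid_fit(left, width):
    """Snap a stem to whole pixels: a crude stand-in for what font hints do."""
    return float(round(left)), float(max(1, round(width)))

stem = (3.4, 1.2)   # hypothetical stem position and width in pixels
print("unfitted:   ", coverage(*stem))             # partial coverage in two columns
print("grid fitted:", coverage(*grid_fit(*stem)))  # one fully covered column
```

The unfitted stem needs two partially lit (gray) columns, whereas the fitted stem is a single crisp column, at the cost of moving and resizing the stroke slightly.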

Almost all font rendering uses grid fitting, just some more than others (see, from 2007, Font rendering philosophies of Windows & Mac OS X). Apple (and Adobe) have historically tried to keep the text size and spacing more accurate at some loss in font sharpness and readability on low PPI monitors (an easy solution for Apple as they expect you to buy a higher PPI monitor). MS Windows with ClearType and Apple with their LCD font smoothing have options to try and improve fonts further by taking advantage of LCDs with side-by-side red-green-blue subpixels.

But this whole grid fitting scheme falls apart when the monitors are virtualized. Horizontal and vertical strokes transform into diagonal lines. Because grid fitting won’t work, the display of a virtual monitor needs to be much higher in angular resolution than a physical monitor to show a font of the same size with similar sharpness. Yet today and for the foreseeable future, VR displays are much lower resolution.

For more on the definition of font “Points” and their history with Windows and Macs, see Appendix 3: Confabulating typeface “points” (pt) with Pixels – A Brief History.

Rendering Options: Virtual Monitors Fixed in 3-D Space Breaks the “Pixel Grid.”

The slightest head movement means that everything has to be re-rendered. The “grid” to which you want to render text is not the virtual monitor but that of the headset’s display. There are at least two main approaches:

Re-render everything from scratch every frame – This will give the best theoretical image quality but is very processor intensive and will not be supported by most legacy applications. Simply put, these applications are structured to draw in terms of physical pixels of a fixed size and orientation rather than everything drawn virtually.
Render to a “higher” resolution (if possible) and then scale to the headset’s physical pixels.
- One would like the rendering to be at least 2X (linearly, 4X the pixels) of the physical pixels of the headset covering the same area to keep from having significant degradation in image quality after the scaling-down process.
- The higher-resolution virtual image is then transformed onto the surface (which might be curved itself) of the virtual monitor in 3-D space. Virtual monitor processing can become complex if the user can put multiple monitors here, there, and everywhere that can be viewed from any angle and distance. The rendering resolution needed for each virtual monitor depends on the virtual distance from the eye.
- Even with this approach, there are “application issues” from the legacy of 40+ years of PCs dealing with fixed pixel grids.
- The grid stretching (font hinting) becomes counterproductive since it stretches to the grid of the virtual rather than the physical display.
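
How the needed rendering resolution depends on virtual distance can be sketched with some simple geometry. The Python below assumes a flat virtual monitor viewed head-on, a headset with constant ppd across the field of view (real optics vary), and the 2X linear oversampling suggested above. The 1.0 m monitor width and the distances are hypothetical; 22.5 ppd is the MQP figure used in this article.

```python
import math

def render_width_px(monitor_width_m, distance_m, headset_ppd, oversample=2.0):
    """Horizontal render resolution for a flat virtual monitor viewed head-on."""
    angle_deg = math.degrees(2 * math.atan(monitor_width_m / (2 * distance_m)))
    headset_px = angle_deg * headset_ppd       # headset pixels the monitor spans
    return oversample * headset_px

for d in (0.75, 1.5, 3.0):   # hypothetical virtual distances in meters
    print(f"{d} m: about {render_width_px(1.0, d, 22.5):.0f} px wide")
```

The closer the virtual monitor, the more headset pixels it spans and the higher the resolution it should be rendered at; a distant monitor needs far fewer.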

Systems will end up with a hybrid of the two approaches mixing “new” 3-D applications with legacy office applications.

Inscribing a virtual landscape-oriented monitor uses about half of the vertical pixels of the headset.

The MQP’s Horizons appears to render the virtual monitor(s) and then re-render them in 3-D space along with the cylindrical effect plus pre-correction for their Pancake lens distortion.

The MQP uses a 16:10 aspect ratio, 1920×1200 pixel “virtual monitors.” The multiple virtual monitors are mapped into the MQP’s 1920×1800 physical display. Looking straight ahead, sitting at the desktop, you see the central monitor and about 30% of the two side monitors.

The virtual monitor’s center uses about 880 pixels, or about half of the 1800 vertical pixels of the MQP’s physical display or 64% of the 1200 vertical pixels reported to Windows with the user at the desktop.

The central monitor behaves like it is about 1.5 meters (5 feet) away or about 2 to 3 times the distance of a typical computer monitor. This makes “head zooming” (leaning in to make the image bigger) much less effective (by a factor of 2 to 3X).

Apple’s AVP has a similar FOV and will have similar limitations in fitting virtual monitors. There is the inevitable compromise between showing the whole monitor with some latitude for the user moving their head while avoiding cutting off the monitor on the sides.

The pre-distortion correction is certainly going to hurt the image. It is possible that the AVP, using similar pancake optics, will have similar needs for pre-correction (most, if not all, VR optics have significant pincushion distortion – a side effect of trying to support a wide FOV). The MQP displays are rotated to clear the nose (no word on the AVP). However, this can be rolled into the other transformations and probably does not significantly impact the processing requirement or image quality.

A simplified example of scaling text

The image below, one cell of a test pattern with two lines of text and some 1- and 2-pixel-wide lines, shows a simulation (in Photoshop) of the scaling process. For this test, I chose a 175% scaled 11pt font which should have roughly the same number of pixels as an 11pt font at 100% on an Apple Vision Pro. This simulation greatly simplifies the issue but shows what is happening with the pixels. The MQP and AVP must support resampling with 6 degrees of freedom in the virtual world and pre-correct for the distortion of the optics (and, in the case of MQP’s Horizons, curve the virtual monitor).

Source Cell (left), Simulated 64% scaling (right)

Sidenote: This one test pattern accidentally has an “i” rather than a “j” between the g & k that I discovered late into editing.

The pixels have been magnified by 600% (in the full-size image), and a grid has been added to show the individual pixels. On the top right, the source has been scaled by 64%, about the same amount MQP Horizons scales the center of the 1920×1200 virtual monitor when sitting at the desktop. The bottom right image scales by 64% and rotates by 1° to simulate some head tilt.

If you look carefully at the scaled one and two-pixel wide lines in the simulation, you will notice that sometimes the one-pixel wide lines are as wide as the 2-pixel lines but dimmer. You will also see what started as identical fonts from line to line look different when scaled even without any rotation. Looking through the lens cells, the fonts have further degradation/softening as they are displayed on color subpixels.
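
The effect on the one- and two-pixel lines can be reproduced with a tiny one-dimensional resampler. This is a sketch only, assuming a plain box (area-average) filter, which is not necessarily what Photoshop or the MQP uses; the row contents are a made-up test pattern.

```python
def area_resample(src, scale):
    """Downscale a 1-D row of pixel intensities with a box (area-average) filter."""
    n_out = int(len(src) * scale)
    out = []
    for j in range(n_out):
        lo, hi = j / scale, (j + 1) / scale        # source span of output pixel j
        total = sum(v * max(0.0, min(hi, i + 1) - max(lo, i))
                    for i, v in enumerate(src))
        out.append(total / (hi - lo))
    return out

row = [0.0] * 12
row[3] = 1.0                  # a 1-pixel-wide line
row[7] = row[8] = 1.0         # a 2-pixel-wide line
print([round(v, 2) for v in area_resample(row, 0.64)])
```

With these particular positions, both lines end up spread across two output pixels, with the one-pixel line noticeably dimmer. Shifting either line by a source pixel changes how its brightness is split, which is why identical features look different from place to place.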

Below is what the 11pt 175% fonts look like via the lens of the MQP in high enough resolution to see the color subpixels. By the time the fonts have gone through all the various scaling, they are pretty rounded off. If you look closely at the same character in different locations (say the “7” or the decimal point), you will notice every instance is different, whereas, on a conventional physical monitor, they would all be identical due to grid fitting.

For reference, the full test pattern and the through-the-lens picture of the virtual monitor are given below (Click on the thumbnails to see the full-resolution images). The camera’s exposure was set low so the subpixels would not blow out and lose all their color.

Scintillating Text

When looking through the MQP, the text scintillates/sparkles. This occurs because no one can keep their head perfectly still, and every text character is being redrawn on each frame with slightly different alignments to the physical pixels causing the text to wriggle and scintillate.

Scaling/resampling can be done with sharper or softer processing. Unfortunately, the sharper the image after resampling, the more it will wriggle with movement. The only way to avoid this wriggling and have sharp images is to have a much higher ppd. MQP has only 22.5ppd, and the AVP has about 40ppd and should be better, but I think they would need about 80ppd (about the limit of good vision and what Apple retinal monitors support) to eliminate the problems.

The MQP (and most displays) uses spatial color with individual red, green, and blue subpixels, so the wriggling is at the subpixel level. The picture below shows the same text with the headset moving slightly between shots.

Below is a video from two pictures taken with the headset moved slightly between shots to demonstrate the scintillation effect. The 14pt font on the right has about the same number of pixels as an 11pt font with the resolution of the Apple Vision Pro.

Scintillation/wiggle of two frames (right-click > “loop” -> play triangle to see the effect)

Conclusion

This will not be a close call, and using any VR headset, including the MQP and Apple Vision Pro, as a computer monitor replacement fails any serious analysis. It might impress people who don’t understand the issues and can be wowed by a flashy short demo, and it might be better than nothing. But it will be a terrible replacement for a physical monitor/display.

I can’t believe Apple seriously thinks a headset display with about 40ppd will make a good virtual monitor. Even if some future VR headset has 80ppd and over 100-degree FOV, double the AVP linearly or 4X, it will still have problems.

Part 5B of this series will include more examples and more on my conclusions.

Appendix 1: My 1980s History with Bitmapped Fonts and Multiple Monitors

All this discussion of fonts and 3-D rendering reminded me of those early days when the second-generation TMS34020 almost got designed into the color Macintosh (1985 faxed letter from Steve Perlman from that era – right). I also met with Steve Jobs at NeXT and mentioned Pixar to him before Jobs bought them (discussed in my 2011 blog article) and John Warnock, a founder of Adobe, who was interested in doing a Port of Postscript to the 34010 in that same time frame.

In the 1980s, I was the technical leader for a series of programs that led to the first fully programmable graphics processor, the TMS34010, and the Multi-ported Video DRAM (which led to today’s SDRAM and GDRAM) at Texas Instruments (TI) (discussed a bit more here and in Jon Peddie’s 2019 IEEE article and his 2022 book “The History of the GPU – Steps to Invention”).

In the early 1980s, Xerox PARC’s work influenced my development of the TMS34010, including Warnock’s 1980 paper (while still at PARC), “The Display of Characters Using Gray Level Sample Arrays,” and the series of PARC’s articles in BYTE Magazine, particularly the August 1981 edition on Smalltalk which discussed bit/pixel aligned transfers (BitBlt) and the use of a “mouse” which had to be explained to BYTE readers as, “a small mechanical box with wheels that lets you quickly move the cursor around the screen.”

When defining the 34010, I had to explain to TI managers that the Mouse would be the next big input device for ergonomic reasons, not the lightpen (used on CAD terminals at TI in the early 1980s), which requires the user to keep their arm floating in the air, which quickly becomes tiring. Most AR headset user interfaces make users suffer with having to float their hands to point, select, and type, so the lessons of the past are being relearned.

In the late 1980s, a systems engineer for a company I had never heard of called “Bloomberg,” who wanted to support 2 to 4 monitors per PC graphics board, came to see us at TI. In a time when a single 1024×768 graphic card could cost over $1,200 (about $3,000 in 2023 dollars), this meeting stood out. The Bloomberg engineer explained how Wall Street traders would pay a premium to get as much information as possible in front of them, and a small advantage on a single trade would pay for the system. It was my first encounter with someone wanting multiple high-resolution monitors per PC.

I used to have a life designing cutting-edge products from blank sheets of paper (back then, it was physical paper) through production and marketing; in contrast, I blog about other people’s designs today. And I have dealt with pixels and fonts for over 40 years.

1982

Below is one of my early presentations on what was then called the “Intelligent Graphics Controller” (for internal political reasons, we could not call it a “processor”), which became the TMS34010 Graphics System Processor. You can also see the state of 1982 presentation technology with a fixed-spaced font and the need to cut and paste hand drawings. This slide was created in Feb 1982. The Apple Lisa didn’t come out until 1983, and the Mac in 1984.

1986 and the Battle with Intel for Early Graphics Processor Dominance

We announced the TMS34010 in 1986, and our initial main competitor was the Intel 82786. But the Intel chip was “hardware” and lacked the 34010’s programmability, and to top it off, the Intel chip had many bugs. In just a few months, the 82786 was a non-factor. The copies of a few of the many articles below capture the events.

1986-June-and-AugTMS34010-and-Intel-82786-PC-Week-EETimes-Siggraph-EDN-Chart-Download

In 1986, we wrote two articles on the 34010 in the IEEE CG&A magazine. You can see from the front pages of the articles the importance we put on drawing text. Copies of these articles are available online (click on the thumbnails below to be linked to the full articles). You may note the similarity of the IEEE CG&A article’s first figure to the one in the 1981 Byte Smalltalk article, where we discussed extending “BitBlt” to the color “PixBlt.”

Around 1980 we started publishing a 3rd party guide of all the companies developing hardware and software for the 340 family of products, and the June 1990 4th Edition contained over 200 hardware and software products.

Below is a page from the TMS340 TIGA Graphics Library, including the font library. In the early 1980s, everyone had to develop their own font libraries. There was insufficient power to render fonts with “hints” on the fly. We also did well to have bitmapped fonts with little or no antialiasing/smoothing. From about

Sadly, we were a bit before our time, and Texas Instruments had, by the late 1980s, fallen far behind TSMC and many other companies in semiconductor technology for making processors. Our competitors, such as ATI (NVidia wasn’t founded until 1993), could get better semiconductor processing at a lower cost from the then-new semiconductor 3rd party fabs such as TSMC (founded in 1987).

Appendix 2: Notes on Pictures

All the MQP pictures in these two articles were taken through the left eye optics using either the Canon R5 (45mp) with an RF16mmf2.8 or 28mmf2.8 “pancake” lens or the lower resolution Olympus E-M5D-3 (20mp) with 9-18mm zoom lens at 9mm. Both cameras have a “pixel shift” mode that moves the sensor, giving 405mp (24,576 x 16,384) for the R5 and 80mp (10,368 x 7,776 pixels) for the M5D-3, and all the pictures used this feature as it gave better resolution, even if the images were later scaled down.

High-resolution pictures of computer monitors with color subpixels and any scaling or compression cause issues with color and intensity moiré (false patterning) due to the “beat frequency” between the camera’s color sensor and the display device. In this case, there are many different beat frequencies between both the pixels and color subpixels of the MQP’s displays and the cameras. Additionally, the issues of the MQP’s optics (which are poor compared to a camera lens) vary the resolution radially. I found for the whole FOV image, the lower-resolution Olympus camera didn’t have nearly as severe a moiré issue (only a little in intensity and almost none in color). In contrast, it was unavoidable with the R5 with the 16mm lens (see comparison below).

Lower Resolution Olympus D3 with very little moiré

The R5 with the 28mmf2.8 Lens and pixel shift mode could capture the MQP’s individual red, green, and blue subpixels (right). In the picture above, the two “7s” on the far right have a little over 1 pixel wide horizontal and diagonal stroke. The two 7s are formed by different subpixels caused by them being slightly differently aligned in 3D space. The MQP’s displays are rotated by about 20°; thus, the subpixels are on a 20° diagonal (about the same as the lower stroke on the 7s). Capturing at this resolution where the individual red, green, and blue sub-pixels are visible necessitated underexposing the overall image by about 8X (3 camera stops). Otherwise, some color dots (particularly green) will “blow out” and shift the color balance.

As seen in the full-resolution crop above, each color dot in the MQP’s display device covers about 1/8th of the area of a pixel, with the other two colors and black filling the rest of the area of a pixel. Note how the scaled-down version of the same pixels on the right look dim when the subpixels are averaged together. The camera exposure had to be set about three stops lower (8 times in brightness as stops are a power of two) to avoid blowing out the subpixels.

Appendix 3: Confabulating typeface “points” (pt) with Pixels – A Brief History

Making a monitor appear locked in 3-D space breaks everything about how PCs have dealt with rendering text and most other objects. Since the beginning of PC bitmap graphics, practical compromises (and shortcuts) have been made to reduce processing and to make images look better on affordable computer monitors. A classic compromise is the font “point,” defined (since 1517) at ~1/72nd of an inch.

So, in theory, when rendering text, a computer should consider the physical size of the monitor’s pixels. Early bitmapped graphics monitors in the mid-1980s had about 60 to 85 ppi. Without the processing power to deal with the difference, and needing to get on with making products, PC developers confabulated “points” and “pixels” (the exception was Adobe with their Postscript printers; Adobe’s founders came from Xerox PARC, which also influenced Apple). Display font “scaling” helps correct this early transgression.

Many decades ago, MS Windows decided that a (virtual) 96 dots per inch (DPI) would be their default “100%” font scaling. An interesting Wikipedia article on the convoluted logic that led to Microsoft’s decision is discussed here. Conversely, Apple stuck with 72 PPI as their basis for fonts and made compromises with font readability on lower-resolution monitors with smaller fonts. Adherence to 72 PPI may explain why a modern Apple Mac 27” monitor is 5K to reach 218 ppi (within rounding of 3×72=216). In contrast, the much more common and affordable 27” 4K monitor has 163 ppi, not an integer multiple of 72, and Macs have scaling issues with 3rd party monitors, including the very common 27” 4k.

Microsoft and Macs have tried to improve the text by varying the intensity of the color subpixels. Below is an example from MS Windows with “ClearType” for a series of different-size fonts. Note particularly the horizontal strokes at the bottom of the numbers 1, 2, and 7 below: they are 1 pixel wide with no smoothing from Calibri 9pt to 14pt, then at 18pt the strokes jump to 2 pixels wide with a little smoothing, and then at 20pt they become 2 pixels wide with no smoothing vertically.

Apple has a similar function known as “LCD Font Smoothing.” Apple has put low-ppd text rendering issues in its rearview mirror with “retinal resolution” displays for Mac laptops and monitors. “Retinal resolution” translates to more than 80ppd when viewed normally, which is from about 12” (0.3 meters) for handheld devices (ex. iPhone) or about 0.5 to 0.8 meters for a computer.

The chart was edited for space, and ppd information was added.

Apple today sells “retina monitors” with a high 218 PPI, which makes text grid fitting less of an issue. But as the chart from Mac external displays for designers and developers (right) shows, Mac systems have resolution and performance issues with in-between resolution monitors.

The Apple Vision Pro has less than 40 ppd, much lower than any of these monitors at normal viewing distance. And that is before all the issues with making the virtual monitor seem stationary as the user moves.

KGOnTech
Apple Vision Pro (Part 4) – Hypervision Pancake Optics Analysis
26 June 2023 at 23:07

By: Karl Guttag

Introduction

Hypervision, a company making a name for itself by developing very wide field of view VR pancake optics, just released a short article on their website analyzing the Apple Vision Pro’s pancake optics, titled First Insights about Apple Vision Pro Optics. I found the article very interesting, coming from a company that designs pancake optics. I will give a few highlights and key points from Hypervision’s article, but I recommend going to their website for more information.

Hypervision has demonstrated a single pancake 140° VR and an innovative 240° dual pancake per eye optical design. I will briefly discuss Hypervision’s designs after the Apple Vision Pro optics information.

Apple Vision Pro’s Pancake Optical Design

Hypervision’s article starts with a brief description of the basics of pancake optics (this blog also discussed how pancake optics work as part of the article Meta (aka Facebook) Cambria Electrically Controllable LC Lens for VAC?).

Hypervision points out that an important difference in the Apple Pancake optics shown in the WWDC 2023 video and other pancake optics, such as the Meta Quest Pro, is that the Quarter Waveplate (QWP) retarder 2, as shown above, must be curved. Hypervision shows both Meta (Facebook) and Apple patent applications showing pancake optics with a curved QWP. Below are Figs 8 and 9 from Apple’s patent application and Hypervision’s translation into some solid optics.

Hypervision’s Field of View Analysis

Hypervision has also made a detailed field-of-view analysis. They discuss how VR experts who have seen the AVP say they think the AVP FOV is about 110°. Hypervision’s analysis suggests the AVP’s FOV “wishfully” could be as high as 120°. Either value is probably within the margin of error due to assumptions. Below is a set of diagrams from Hypervision’s analysis.

Pixels Per Degree (ppd)

Hypervision’s analysis shows 34 pixels per degree (ppd) on the lower end. The lower ppd comes from Hypervision’s slightly wider FOV calculations. Hypervision notes that this calculation is rough and may vary across the field of view as the optics may have some non-linear magnification.

I have roughly measured the Meta Quest Pro’s (MQP) ppd in the center and come up with about 22 ppd. Adjusting for about 1.8X more pixels linearly and the difference between the 106° FOV of the MQP and the 110° FOV of the AVP, I get an estimate of about 39 ppd. Once again, there are a lot of assumptions in my estimate. Considering everything, depending on the combination of high and low estimates, the AVP has between 34 ppd and 39 ppd.
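The scaling in that estimate can be written out as a small sketch. The function name and parameter names below are illustrative only, and the calculation assumes ppd scales linearly with pixel count and inversely with FOV, which ignores any non-linear magnification in the optics.

```python
def scale_ppd(ref_ppd, linear_pixel_ratio, ref_fov_deg, target_fov_deg):
    """Scale a measured center ppd from a reference headset to a target headset.

    Assumes ppd grows linearly with the number of pixels across the display
    and shrinks linearly as the same pixels are spread over a wider FOV.
    """
    return ref_ppd * linear_pixel_ratio * (ref_fov_deg / target_fov_deg)


# Figures quoted in the text: ~22 ppd measured on the MQP, ~1.8X more pixels
# linearly on the AVP, 106 degree FOV (MQP) versus 110 degree FOV (AVP).
avp_ppd = scale_ppd(ref_ppd=22, linear_pixel_ratio=1.8,
                    ref_fov_deg=106, target_fov_deg=110)
print(f"Estimated AVP center ppd: {avp_ppd:.1f}")
```

With these inputs the result lands a little above 38 ppd, which is consistent with the rough "about 39" figure given how approximate each input is.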

Eye Box

Hypervision makes the point that, because the AVP’s Micro-OLEDs have a smaller pixel size and thus require more magnification, the eye box (and thus the sweet spot) of the AVP is likely to be smaller than that of some other headsets that use pancake optics with LCDs.

Hypervision

Hypervision clearly has some serious optical design knowledge. I first saw them in 2022, but as their optics have been aimed at VR, I have not previously written about them. But at AR/VR/MR 2023, they showed a design with vastly improved optical quality using pancake optics to support 140° with a single pancake optic and 240° with what I call a dual pancake (per eye) design. I took more notice with pancake optics becoming all the rage in VR headsets with MR passthrough.

AR/VR/MR 2022 with Dual Fused Fresnel Lenses and 270°

I first saw Hypervision at AR/VR/MR in January 2022. At the time, they were demonstrating a 270° headset based on what I call a fused dual Fresnel optical design using two LCDs. I took some pictures (below), but I was not covering much about VR at the time unless it was related to passthrough mixed reality. While the field of view was very impressive, there were the usual problems with Fresnel optics and the seam between the dual Fresnel lenses was pretty evident.

AR/VR/MR 2023 Pancake Optics

Below are pictures I took at AR/VR/MR 2023 of Hypervision’s 140° single pancake and 240° dual pancake designs. The pancake designs were optically much better than their earlier Fresnel-based designs. The “seam” with the dual pancakes seemed barely noticeable (Brad Lynch also reported a barely visible seam in his video). Hypervision has some serious optical design expertise.

I mentioned Hypervision to Brad Lynch of SadlyItsBradley, who covers VR in more detail. Brad had the chance to see them at Display Week 2023 and recorded a video discussing them. Brad said that multiple companies, including Lynx, were impressed by Hypervision.

Closing

Hypervision is a company with impressive optical design expertise, and they demonstrated that they understand pancake optics with their designs. I appreciate that they contacted me to let me know they had analyzed the Apple Vision Pro. It is one thing for me, an MSEE who picked up some optics through my industry exposure, to try and figure out what is going on with a given optical design; it is something else to have the analysis from a company that has designed that type of optics. So once again, I would recommend reading the whole article on Hypervision’s site.

KGOnTech
Apple Vision Pro (Part 3) – Why It May Be Lousy for Watching Movies On a Plane
22 June 2023 at 01:58

By: Karl Guttag

Introduction

Part 1 and Part 2 of this series on the Apple Vision Pro (AVP) primarily covered the hardware. Over the next several articles, I plan to discuss the applications Apple (and others) suggest for AVP. I will try to show the issues with human factors and provide data where possible.

I started working in head-mounted displays in 1998, and we bought a Sony Glasstron to study. Sony’s 1998 Glasstron had an 800×600 (SVGA) display, about the same as most laptop computers in that year, and higher resolution than almost everyone’s television in the U.S. (HDTVs first went on sale in 1998). The 1998 Glasstron even had transparent (sort of) LCD and LCD shutters to support see-through operation.

In the past 25 years, many companies have introduced headsets with increasingly better displays. According to some reports, the installed base of VR headsets will be ~25 million units in 2023. Yet I have never seen anyone on an airplane or a train wear a head-mounted display. I first wrote about this issue in 2012 in an article on the then-new Google Glass with what I called “The Airplane Test.”

I can’t say I was surprised to see Apple showing the movie-watching-on-airplanes VR app, as I have seen it again and again over the last 25 years. It makes me wonder how well Apple verified the concepts they showed. As Snazzy Labs explained, there were no new apps that Apple showed that had not failed before, and it is not clear they failed due to not having better hardware.

Since the technology for watching videos on a headset has been available for decades, there must be reasons why almost no one (Brad Lynch of SadlyItsBradley says he has) uses a headset to watch movies on a plane. I also realize that some VR fans will watch movies on their headsets, but this, like VR, does not mean it will support mass market use.

As will be shown, the angular resolution (pixels per degree) of the AVP, while not horrible, is not particularly good for watching movies. But then, the resolution has not been what has stopped people from using VR on airplanes; it has been other human factors. So the question becomes, “Has the AVP solved the human factors problems that prevent people from using headsets to watch movies on airplanes?”

Some Relevant Movie Watching Human Factors Information

In 2019 in FOV Obsession, I discussed an excellent Photonics West AR/VR/MR Conference presentation by Thad Starner of the Georgia Institute of Technology, a long-time AR advocate and user.

First, the eye only has high resolution in the fovea, which covers only ~2°. The eye goes through a series of movements and fixations known as saccades. What a person “sees” results from the human vision system piecing together a series of “snapshots” at each saccade. The saccadic movement is a function of the activity and the person’s attention. Also, vision is partially, but not completely, blanked when the eye is moving (see: We thought our eyes turned off when moving quickly, but that’s wrong, and Intrasaccadic motion streaks jump-start gaze correction)

Starner shows the results from a 2017 Thesis by Haynes, which included a study on FOV and eye discomfort. Haynes’ thesis states (page 8 of 303 pages and 275 megabytes – click here to download it):

“Thus, eye physiology provides some basic parameters for potential HWD design. A display can be no more than 55° horizontally from the normal line of sight based on oculomotor mechanical limits. However, the effective oculomotor range places a de facto limit at 45°. Further, COMR and saccadic accuracy suggest visually comfortable display locations may be no more than [plus or minus] 10-20° from the primary position of gaze.”

The encyclopedic Optical Architectures for Augmented-, Virtual-, and Mixed-Reality Headsets by Bernard Kress writes about a “fixed foveated region” of about 40-50° (right). But in reality, the eyes can’t see 40-50° with high resolution for more than a few minutes without becoming tired.

The bottom line is that the human eye will want to stay within about 20° of the center when watching a movie. Generally, if a user wants to see something more than about 30° from the center of their vision, they will turn their head rather than use just their eyes. This is also true when watching a movie or using a large computer monitor for office-type work.

The Optimum Movie Watching FOV is about 30-40 Degrees

It may shock many VR game players that want 120+ degree FOVs, but SMPTE, which sets the recommendations for movie theaters, says the optimal viewing angle for HDTV is only 30°. THX specifies 40 degrees (Wikipedia and many other sources). These same optimum seating location angles apply to normal movie theaters as well.

The front row of a “normal” movie theater is about 60°, which is usually the last row in a theater where people will want to sit. Most people don’t want to sit in the front rows of a theater because of the “head pong” (as Thad Starner called it) required to watch a movie that is ~60° wide.

While 30°-40° may seem small, it comes back to human factors and a feedback loop of the content generated to work well with typical theater setups. A person in the theater will naturally only see what is happening in the center ~30° of the screen most of the time, except for some head-turning fast action.

The image content generated outside of ~30° helps give an immersive feel but costs money to create and will not be seen in any detail 99.999% of the time. If you take content generated assuming a nominal 30° to 40° viewing angle and enlarge it to fill 90°, it will cause eye and head discomfort for the user to watch it.

AVP’s Pixels Per Degree Are Below “Retinal Resolution”

Another factor is “angular resolution.” The bands in the chart on the right show how far back from a TV of a given size and resolution one must sit before the pixels can no longer be seen. The metric they use for being “beneficial” is 60ppd or more. Also shown on the chart with the dotted white lines are the SMPTE 30° and THX 40° recommendations.

Apple has not given the exact resolution but stated 23 million pixels (for both eyes). Assuming a square display, this computes to about 3,400 pixels in each direction. The images in the video look to be about a 7:6 aspect ratio, which would work out to ~3680 by ~3150. Also, the optics cut off some of the display’s pixels for each eye, yet companies often count all the display’s pixels.
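The per-eye pixel dimensions can be sanity-checked with a short sketch. It assumes the 23 million figure is split evenly between the two eyes and that every pixel is counted; the function name is illustrative.

```python
import math


def per_eye_dimensions(total_pixels_both_eyes, aspect_w, aspect_h):
    """Return (width, height) in pixels for one eye at a given aspect ratio.

    Assumes the total is split evenly between two identical displays.
    """
    per_eye = total_pixels_both_eyes / 2
    # width * height = per_eye and width / height = aspect_w / aspect_h
    height = math.sqrt(per_eye * aspect_h / aspect_w)
    width = height * aspect_w / aspect_h
    return round(width), round(height)


print("square:", per_eye_dimensions(23_000_000, 1, 1))
print("7:6   :", per_eye_dimensions(23_000_000, 7, 6))
```

The square case comes out just under 3,400 pixels per side. The 7:6 case comes out slightly below the ~3680 by ~3150 quoted above, which is what one would expect if "23 million" is itself a rounded-down figure.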

Apple didn’t specify the field of view (FOV). One big point of confusion on FOV is that VR headsets are typically quoted for both eyes, including the binocular view combining both eyes. The FOV also varies based on the eye relief from person to person (people’s eye insets, foreheads, and other physical features are different). Reports are that the FOV is “similar” to the Meta Quest Pro, which has a binocular FOV of about 106 degrees. The single-eye FOV is about 90°.

Combining the information from various sources, the net result is about 35 to 42 pixels per degree (ppd). Good human 20/20 vision is said to be ~60ppd. Steve Jobs, with the iPhone 4, called 300 pixels per inch at reading distance (which works out to ~60ppd) “retinal resolution.” For the record, people with very good eyesight can see 80ppd.
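As a rough check on the 300 pixels per inch figure, pixels per inch can be converted to pixels per degree once a viewing distance is chosen. The 10 to 12 inch reading distance used below is an assumption for illustration, not a number taken from this article.

```python
import math


def ppd_from_ppi(ppi, distance_in):
    """Pixels per degree for a flat display viewed head-on.

    One degree at the center of view spans 2 * d * tan(0.5 deg) inches
    at a viewing distance of d inches.
    """
    inches_per_degree = 2 * distance_in * math.tan(math.radians(0.5))
    return ppi * inches_per_degree


for distance in (10, 12):  # assumed reading distances in inches
    print(f"300 ppi at {distance} in: {ppd_from_ppi(300, distance):.0f} ppd")
```

At 12 inches this gives a little over 60 ppd and at 10 inches a little over 50 ppd, so the ~60ppd figure corresponds to the far end of that assumed range.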

Some people wearing the AVP commented that they could make out some screen door effect consistent with about 35-40ppd. The key point is that the AVP is below 60, so jagged line effects will be noticeable.

Using the THX 40° horizontal FOV standard and assuming the AVP is about 90° horizontally (per eye, 110 for both eyes), ~3680 pixels horizontally, and almost no pixels get cropped, this leaves 3680 x (40/90) = ~1635 pixels horizontally. Using the SMPTE 30° gives about 3680 x (30/90) = ~1226 pixels wide.
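The same arithmetic as a small sketch, using the 90° per-eye FOV for both cases. It assumes pixels are spread evenly across the FOV, which real optics only approximate; the function name is illustrative.

```python
def pixels_in_window(display_pixels, display_fov_deg, window_fov_deg):
    """Pixels available to a virtual screen spanning window_fov_deg.

    Assumes a uniform pixels-per-degree across the whole field of view.
    """
    return display_pixels * window_fov_deg / display_fov_deg


AVP_PIXELS_H = 3680   # estimated horizontal pixels per eye (from the text)
AVP_FOV_H = 90        # assumed per-eye horizontal FOV in degrees

for label, fov in (("THX 40 deg", 40), ("SMPTE 30 deg", 30)):
    px = pixels_in_window(AVP_PIXELS_H, AVP_FOV_H, fov)
    print(f"{label}: ~{px:.0f} pixels wide")

print(f"Implied ppd: {AVP_PIXELS_H / AVP_FOV_H:.1f}")
```

Both widths fall below the 1,920 pixels of full HD, and the implied ~41 ppd sits inside the 35 to 42 ppd range given earlier.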

If the AVP is used for watching movies and showing the movie content “optimally,” the image will be lower than full HD (1920×1080) resolution, and since there are ~40ppd, jaggies will be visible.

While the AVP has “more pixels than a 4K TV,” as claimed, they can’t deliver those pixels to an optimally displayed movie’s 40° or 30° horizontal FOV. Using the full FOV would, in effect, put you visually closer than the front row of a movie theater, not where most people would want to watch a movie.

Still, resolution and jaggies alone are not so bad as they would not, and have not, stopped people from using a VR headset for movies.

Vestibulo–Ocular Reflex (VOR) – Stabilizing the View with Head Movement – Simple Head Tracking Fails

The vestibulo-ocular reflex (VOR) stabilizes a person’s gaze during head movement. The inner ear detects the rotation, and if one is gazing, it causes the eyes to rotate to counter the movement to stay fixed on where the person is gazing. In this way, a person can, for example, read a document even if their head is moving. People with a VOR deficiency have problems reading.

Human vision will automatically suppress the VOR when it is counterproductive. For example, the VOR will be suppressed if one is tracking an object with a combination of head and eye movement, where the VOR would be counterproductive. The key point is that the display system must account for the combined head and eye movement to generate the image without causing a vestibular (motion sickness) problem where the inner ear does not agree with the eyes.

Quoting from the WWDC 2023 video at ~1:51:18:

Running in parallel is a brand-new chip called R1. This specialized chip was designed specifically for the challenging task of real-time sensor processing. It processes input from 12 cameras, five sensors, and six microphones.

In other head-worn systems, latency between sensors and displays can contribute to motion discomfort. R1 virtually eliminates lag, streaming new images to the displays within 12 milliseconds. That’s eight times faster than the blink of an eye!

Apple did not say if the “12 cameras” included eye-tracking cameras, as they only showed the cameras on the front, but likely they are included. Complicating matters further is the saccadic movement of the eye. Eye tracking can know where the eye is aimed, but not what is seen. The AVP is known to have superior eye tracking for selecting things from a menu. But we don’t know if the eye tracking coupled with the head tracking deals with VOR, and if so, whether it is accurate and fast enough to not cause VOR-related problems for the user.

Movies on AVP (and VR) – Choose Your Compromises

Now consider some options for displaying a virtual screen on a headset below. Apple has shown locking the screen in the 3-D space. For their demos, they appear to have gone with a very large (angularly) virtual screen for the demo impact. But, as outlined below, making a very large virtual screen is not the best thing to do for more normal movie and video watching. No matter which option is chosen below, jaggies and “zipper/ripple” antialiasing artifacts will be visible at times due to the angular resolution (ppd) of the AVP.

1. Simplistic Option: Scale the image to full screen for the maximum size and have the screen move with the headset (not locked in the virtual 3-D space). This option is typically chosen for headsets with smaller FOVs, but it is a poor choice for headsets with large FOVs.
- It is like sitting in a movie theater’s front row (or worse).
- The screen moves unnaturally with head motion as it follows any head motion.
2. Lock the Virtual Screen but nearly fill the FOV: This is what I will call “Head-Lock for Demos Only Mode.” If the virtual screen nearly fills the FOV, then a small head movement will cause the screen to be cut off, which will, in turn, trigger a person’s peripheral vision, causing some distraction. To avoid distraction, the user must limit head movement and eye movement; perhaps doable in a short demo, but not a comfortable way to watch a movie.
3. Lock the Screen in 3-D space with the screen at SMPTE 30° to THX 40°: With a ~40° FOV, there is room for the head to turn and tilt without cutting off the screen or forcing the user to keep their head rigidly held in one location.
- This will test the ability of the system to track head motion without causing motion sickness. There will always be some motion-to-photon lag and some measurement errors. There is also the VOR issue discussed earlier and whether it is solvable.
- Some additional loss in resolution and potential for motion/temporal artifacts as the flat or 3-D movie is resampled into the virtual space.
- Add motion blur to deal with head and eye movement (unlikely as it would be really complex).
- The AVP reshows a 24 fps movie four times at 96Hz – does each frame get corrected at 96Hz, and what about visual artifacts when doing so?
- What does it do for 30 fps and 60 fps video?
- The screen will still unnaturally be cut off if the user’s head turns too far. It does not “degrade gracefully” as a real-world screen would when you turn away from it.
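The frame-rate question in the list above comes down to whether the content rate divides evenly into the display refresh rate. A minimal sketch, assuming the display stays at 96Hz for all content (the text does not say whether the AVP changes refresh rate for other content):

```python
from fractions import Fraction


def refreshes_per_frame(display_hz, content_fps):
    """Exact number of display refreshes available per content frame."""
    return Fraction(display_hz, content_fps)


for fps in (24, 30, 60):
    ratio = refreshes_per_frame(96, fps)
    if ratio.denominator == 1:
        note = "even cadence, every frame shown the same number of times"
    else:
        note = "uneven cadence, frames must be shown for differing times or resampled"
    print(f"{fps} fps at 96Hz: {float(ratio):.1f} refreshes per frame ({note})")
```

Only the 24 fps case divides evenly (four refreshes per frame). At a fixed 96Hz, 30 fps and 60 fps content would need an uneven repeat pattern or some form of temporal resampling, which is one possible source of the artifacts raised above.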

Apple showed (above) images that might fill about 70 to 90 degrees of the FOV in its short Avatar demos (case 2 above). This will “work” in a demo to be something new and different, but as discussed in #2 above, it is not what you would want to do for a long movie.

And You Are on a Plane and Wearing A Heavy Headset Pressed Against Your Face with a Cord to Snag

On top of all the other issues, the headset processing and sensors must address vestibular-related motion sickness problems caused by being in a moving vehicle while displaying an image.

You then have the ergonomic issues of wearing a somewhat heavy, warm headset sealed against your face with no air circulation for hours while on a plane. Then you have the snag hazard of the cord, which will catch on just about everything.

There will be flight attendants or others tapping you to get your attention. Certainly, you don’t want the see-through mode to come on each time somebody walks by you in the aisle.

A more basic practical problem is that a headset takes up more room/volume due to its shape and the need to protect the glass front than a smartphone, tablet, or even a moderately sized laptop.

Conclusions

It is important to note that humans understand what behaves as “real” versus virtual. The AVP is still cutting off much of a person’s peripheral vision. Something like VOR and Vergence-Accommodation Conflict (VAC discussed in Part 2) and the way focus behaves are well-known issues with VR, but many more subtle issues can cause humans to sense there is something just not right.

In visual human factors, I like to bring up the 90/90 rule, which states, “it takes 90% of the effort to get 90% of the way there, and then the other 90% of the effort to solve the last 10%.” Sometimes this rule has to be applied recursively where multiples of the “90%” effort are required. Apple could do a vastly better job of head and eye tracking with faster response time, and yet people would still prefer to watch movies and videos on a direct-view display.

Certainly, nobody will be the wiser in a short flashy demo. The question is whether it will work for most people watching long movies on an airplane. If it does, it will break a 25+ year losing streak for this application.

KGOnTech
Apple Vision Pro (Part 2) – Hardware Issues
16 June 2023 at 20:52

By: Karl Guttag

Introduction

This part will primarily cover the hardware and related human physical and visual issues with the Apple Vision Pro (AVP). In Part 3, I intend to discuss my issues with the applications Apple has shown for the AVP. In many cases, I won’t be able to say that the AVP will definitely cause problems for most people, but I can see and report on many features and implementation issues and explain why they may cause problems.

It is important to note that there is a wide variation between humans in their susceptibility and discomfort with visual issues. All display technologies are based on an illusion, and different people have different issues with various imperfections in the illusions. Some people may be able to adapt to some ill effects, whereas others can’t or won’t. Drawing on over 40 years of working with graphics and display devices, this article points out problems I see with the hardware that might not be readily apparent in a short demo. I can’t always say there will be problems, but some things concern me.

The Appendix has some “cleanup/corrections” on Part 1 of this series on the Apple Vision Pro (AVP).

Demos are a “Magic Show” and “Sizzle Reels”

Things a 30-minute demo won’t show

I’m constantly telling people that “Demos are Magic Shows”: what you see has been carefully selected not to show any problems and only what they want you to see. Additionally, it is impossible to find all the human factor physical and optical issues in the cumulative ~30-minute demo sessions at WWDC. Each session was further broken into short “Sizzle Reels” of various potential applications.

The experience that people can tolerate and enjoy with a short theme park ride or movie clip might make them sick if they endure it for more than a few minutes. In recent history, we have seen how 3-D movies reappeared, migrated to home TVs, and later disappeared after the novelty wore off and people discovered the limitations and downsides of longer-term use.

It will take months of studies with large populations as it is well known that problems with the human visual perception of display technologies vary widely from person to person. Maybe Apple has done some of these studies, but they have not released them. There are some things that Apple looks like they are doing wrong from a human and visual factors perspective (nothing is perfect), but how severe the effects will be on humans will vary from person to person. I will try to point out things I see that Apple is doing that may cause issues and claims that may be “incomplete” and gloss over problems.

Low Processing Lag Time and High Frame Rate are Necessary but not Sufficient to Solve Visual Issues

Apple employed a trick that gets the observer to focus on one aspect of a problem that is a known issue and where they think they do well. Quoting from the WWDC 2023 video at ~1:51:34:

In other head-worn systems, latency between sensors and displays can contribute to motion discomfort. R1 virtually eliminates lag, streaming new images to the displays within 12 milliseconds. That’s eight times faster than the blink of an eye!

I will give him credit for saying that the delay “can contribute” rather than saying it is the whole cause. But they were also very selective with the wording “streaming new images to the displays within 12 milliseconds,” which is only a part of the “motion to photon” latency problem. They didn’t discuss the camera or display latency. Assuming the camera and display are both at 90Hz frame rates and are working one frame at a time, this would roughly triple the total latency, and there may be other buffering delays not mentioned. We then have any errors that will occur.
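The "roughly triple" estimate can be reproduced with a short sketch. It assumes, as the paragraph does, that the camera and the display each add one full frame time at 90Hz on top of the quoted 12 milliseconds; actual pipeline details are not public, so these are illustrative numbers only.

```python
def frame_time_ms(rate_hz):
    """Duration of one frame in milliseconds at the given rate."""
    return 1000.0 / rate_hz


PROCESSING_MS = 12.0               # figure quoted in the WWDC 2023 video
camera_ms = frame_time_ms(90)      # assumed: one full camera frame of delay
display_ms = frame_time_ms(90)     # assumed: one full display frame of delay

total_ms = PROCESSING_MS + camera_ms + display_ms
ratio = total_ms / PROCESSING_MS

print(f"camera {camera_ms:.1f} ms + processing {PROCESSING_MS:.1f} ms "
      f"+ display {display_ms:.1f} ms = {total_ms:.1f} ms ({ratio:.2f}x)")
```

Under these assumptions the total comes to about 34 ms, a bit under three times the quoted 12 ms, before counting any additional buffering.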

The statement, “That’s eight times faster than the blink of an eye!” is pure marketing fluff as it does not tell you if it is fast enough.

In some applications, even 12 milliseconds could be marginal. Some very low latency systems process scan lines from the camera to the display with near zero latency rather than frames to reduce the motion-to-photon time. But this scan line processing becomes even more difficult when you add virtual content and requires special cameras and displays that work line by line synchronously. Even systems that work on scan lines rather than frames may not be fast enough for intensive applications. Specifically, this issue is well-known in the area of night vision. The US and other militaries still prefer monochrome (green or b&w) photomultiplier tubes in Enhanced Night Vision Goggles (ENVG) over cameras with displays. They still use the photomultiplier tubes (improved 1940s-era technology) and not semiconductor cameras because the troops find even the slightest delay disorienting.

Granted, troops making military maneuvers outdoors for long periods may be an extreme case, but at least in this application, it shows that even the slightest delay causes issues. What is unknown is who, what applications, and which activities might have problems with the level of delays and tracking errors associated with the AVP.

The militaries also use photomultiplier tubes because they still work with less light (just starlight) than the best semiconductor sensors. But I have been told by night vision experts that the delay is the biggest issue.

Poor location of main cameras relative to the user’s eye due to the Eyesight Display

The proper location of the cameras would be coaxial with the user’s two eyes. Still, as seen in the figure (right), the Main Cameras and all the other cameras and sensors are in fixed locations well below the eyes, which is not optimal, as will be discussed. This is very different than other passthrough headsets, where the passthrough cameras are roughly located in front of the eyes.

It appears the main cameras and all the other sensors are so low down relative to the eyes to be out of the way of the “Eyesight Display.” The Eyesight display (right) has a glass cover that contributes a lot of weight to the headset. I hear the glass cover is also causing some calibration problems with the various cameras and sensors, as there is variation in the glass, and its placement varies from unit to unit. The glass cover also inhibits heat from escaping, on top of the power/heat caused by the display itself.

It seems Apple wanted the Eyesight Display so much that they were willing to significantly hurt other design aspects.

Centering correctly for the human visual system

The importance of centering the (actual or “virtual”) camera with the user’s eye for long-term comfort was a major point made by mixed reality (optical and passthrough) headset user and advocate Steve Mann in his March 2013 IEEE Spectrum article, “What I’ve learned from 35 years of wearing computerized eyewear.” Quoting from the article, “The slight misalignment seemed unimportant at the time, but it produced some strange and unpleasant results. And those troubling effects persisted long after I took the gear off. That’s because my brain had adjusted to an unnatural view, so it took a while to readjust to normal vision.”

I don’t know if or how well Apple has corrected the misalignment with “virtual cameras” (transforming the image to match what the eye should see) as Meta attempted (poorly) with the MQP. Still, they seem to have made the problem much more difficult by locating the cameras so far away from the center of the eyes.

Visual Coordination and depth perception

Having the cameras and sensors in poor locations would make visual depth sensing and coordination more difficult and less accurate, particularly at short distances. Any error will be relatively magnified as things like one’s hands get close to the eyes. In the extreme case, I don’t see how it would work if the user’s hands were near and above the eyes.
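A simple geometric sketch shows why an offset camera matters more at short distances. The 5 cm vertical offset below is a hypothetical value chosen only for illustration; the actual camera placement is not given here, and the function name is likewise illustrative.

```python
import math


def parallax_error_deg(camera_offset_m, object_distance_m):
    """Angle between the eye's and the camera's line of sight to an object.

    Treats the camera as displaced perpendicular to the eye's line of sight
    by camera_offset_m, with the object straight ahead of the eye.
    """
    return math.degrees(math.atan2(camera_offset_m, object_distance_m))


HYPOTHETICAL_OFFSET_M = 0.05  # assumed offset, not a measured AVP value

for distance_m in (2.0, 1.0, 0.5, 0.3):
    error = parallax_error_deg(HYPOTHETICAL_OFFSET_M, distance_m)
    print(f"object at {distance_m:.1f} m: {error:.1f} degrees of parallax")
```

The uncorrected angular error grows quickly as the object approaches, which is why any residual error in a "virtual camera" correction would be most noticeable when working with the hands.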

The demos in the video that indicated using some level of depth perception (stills below) were contrived/simple. I have not heard of any demos stressing coordinated hand movement with a real object. Any offset error in the virtual camera location might cause coordination problems. Nobody may notice or have serious problems with a short demo, particularly if they don’t do anything close up, but I am curious about what will happen with prolonged use.

Vergence Accommodation Conflict or Variable Focus

There must be on the order of a thousand papers and articles on the issue of vergence-accommodation conflict (VAC). Everyone in the AR/VR and 3-D movie industries knows about the problem. The 3-D stereo effect is caused by having a different view for each eye which causes the eyes to rotate and “verge,” but the muscles in the eye will adjust focus, “accommodate,” based on what it takes to focus. If the perceived distances are different, it causes discomfort, referred to as VAC.

Figure From: Kieran Carnegie, Taehyun Rhee, “Reducing Visual Discomfort with HMDs Using Dynamic Depth of Field,” *IEEE Computer Graphics & Applications*, Sept.-Oct. 2015, doi:10.1109/MCG.2015.98

Like most other VR headsets, the AVP most likely has a fixed focus at about 2 meters (+/- 0.5m). From multiple developer reports, Apple seems to be telling developers to put things further away from the eyes. Two meters is a good compromise distance for video games where things are on walls or further away. VAC is more of a problem when things get inside 1m, such as when the user works with their hands, which can be 0.5m or less away.
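One common way to express the size of the conflict is in diopters (the reciprocal of distance in meters). The sketch below assumes the roughly 2 meter fixed focus suggested above and simply compares it with where the eyes verge; it is an illustration and not a model of discomfort.

```python
def vac_mismatch_diopters(focus_distance_m, vergence_distance_m):
    """Difference between focal demand and vergence demand, in diopters."""
    return abs(1.0 / vergence_distance_m - 1.0 / focus_distance_m)


FIXED_FOCUS_M = 2.0  # assumed fixed focus distance of the headset optics

for object_m in (4.0, 2.0, 1.0, 0.5, 0.3):
    mismatch = vac_mismatch_diopters(FIXED_FOCUS_M, object_m)
    print(f"virtual object at {object_m:.1f} m: {mismatch:.2f} D mismatch")
```

Because diopters are reciprocal, everything beyond 2 m stays within 0.5 D of the fixed focus, while an object at 0.5 m is 1.5 D off. This is the arithmetic behind pushing content further away.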

When there is a known problem with many papers on the subject and no products solving it, it usually means there aren’t good solutions. The Magic Leap 1 tried a dual-focus waveguide solution but at the expense of image quality and cost and abandoned it on Magic Leap 2. Meta regularly presents papers and videos about their attempts to address VAC, including Half Dome 1, 2, and 3, focus surfaces, and a new paper using varifocal at Siggraph in August 2023.

There are two main approaches to VAC; one involves trying to solve for focus everywhere, including light fields, computational holograms, or simultaneous focus planes (ex. CREAL3D, VividQ, & Lightspace3D), and the other uses eye tracking to control varifocal optics. Each requires more processing, hardware complexity, and a loss of absolute image quality. But just because the problem is hard does not make it disappear.

From bits and pieces I have heard from developers at WWDC 2023, it sounds like Apple is trying to nudge developers to make objects/screens bigger but with more virtual distance. In essence, to design the interfaces to reduce the VAC issue from close-up objects.

Real-World Monitors are typically less than 0.5m away

Consider a virtual computer monitor placed 2m away; it won’t behave like a real-world monitor less than 1/2 meter away. You can blow up the monitor to have the text be the same size, but if working properly in the virtual space, the text and other content won’t vary in size the same way when you lean in, much less let you point at something with your finger. Many subtle things you do with a close-up monitor won’t work with a virtual, far-away large monitor. If you make the virtual monitor act like it is the size and distance of a real-world monitor, you have a VAC problem.
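The lean-in difference can be illustrated with basic geometry. The 0.6 m monitor width and the 0.2 m lean are hypothetical values picked for the example; the virtual monitor is scaled so that it subtends the same angle at 2 m as the real one does at 0.5 m.

```python
import math


def angular_width_deg(width_m, distance_m):
    """Horizontal angle subtended by a flat screen viewed head-on."""
    return math.degrees(2 * math.atan(width_m / (2 * distance_m)))


REAL_WIDTH_M, REAL_DIST_M = 0.6, 0.5          # hypothetical desk monitor
VIRT_DIST_M = 2.0                             # virtual monitor distance
VIRT_WIDTH_M = REAL_WIDTH_M * VIRT_DIST_M / REAL_DIST_M  # same starting angle
LEAN_M = 0.2                                  # hypothetical lean toward the screen

for name, width, dist in (("real", REAL_WIDTH_M, REAL_DIST_M),
                          ("virtual", VIRT_WIDTH_M, VIRT_DIST_M)):
    before = angular_width_deg(width, dist)
    after = angular_width_deg(width, dist - LEAN_M)
    print(f"{name:8s}: {before:.0f} deg -> {after:.0f} deg "
          f"({after / before:.2f}x after leaning in)")
```

Both monitors start at the same angular size, but the same lean enlarges the nearby real monitor far more than the distant virtual one, so leaning in to read small text does not work the same way.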

I know some people have suggested using large TVs from further away as computer monitors to relax the eyes, but I have not seen this happening much in practice. I suspect it does not work very well. I have also seen “Ice Bucket challenges,” where people have worn a VR headset as a computer monitor for a week or month, but I have yet to see anyone say they got rid of their monitors at the end of the experiment. Granted, the AVP has more resolution and better motion sensing and tracking than other VR headsets, but these may be necessary but not sufficient. I don’t see a virtual workspace as efficient for business applications compared to using one or more monitors (I am open to seeing studies that could prove otherwise).

A related point that I plan to discuss in more detail in Part 3 is that there have been near-eye “glasses” for TVs (such as Sony Glasstron) and computer use for the last ~30 years. Yet, I have never seen one used on an airplane, on a train, or in an office in all these years. It is not that the displays didn’t work or were too expensive for an air traveler (who will spend $350 on noise-canceling earphones), and they had sufficient resolution for at least watching movies. But 100% of people decide to use a much smaller (effective) image; there must be a reason.

The inconvenient real world with infinite focus distances and eye saccades

VAC is only one of many image generation issues I put in the class of “things not working right,” causing problems for the human visual system. The real world is also “inconvenient” because it has infinite focus distances, and objects can be any distance from the user.

The human eye works very differently from a camera or display device. The eye jumps around in “saccades,” with vision semi-blanked during the jumps. Where the eye looks is a combination of voluntary and involuntary movement and varies if one is reading or looking, for example, at a face. Only the center of vision has significant resolution and color differentiation, and a sort of variable resolution “snapshot” is taken at each saccade. The human visual system then pieces together what a person “sees” from a combination of objective things captured by each saccade and subjective information (eyewitnesses can be highly unreliable). Sometimes the human visual system pieces together some display illusions “wrong,” and the person sees an artifact; often, it is just a flash of something the eye is not meant to see.

Even with great eye tracking, a computer system might know where the eye is pointing, but it does not know what was “seen” by the human visual system. So here we have the human eye taking these “snapshots,” and the virtual image presented does not change quite the way the real world does. There is a risk that the human visual system will know something is wrong at a conscious (you see an artifact that may flash, for example) or unconscious level (over time, you get a headache). And once again, everybody is different in what visual problems most affect them.

Safety and Peripheral Vision

Anyone who has put on a VR headset from a major manufacturer gets bombarded with messages at power-up to make sure they are in a safe place. Most have some form of electronic “boundaries” to warn you when you are straying from your safe zone. As VR evangelist Bradley Lynch told me, the issue is known as “VR to the ER,” for when an enthusiastic VR user accidentally meets a real-world object.

I should add that the warnings and virtual boundaries with VR headsets are probably more of a “lawyer thing” than true safety. As I’m fond of saying, “No virtual boundary is small enough to keep you safe or large enough not to be annoying.”

Those in human visual factors say (to the effect), “Your peripheral vision is there to keep you from being eaten by the tigers.” Translated to the modern world, it keeps you from getting hit by cars and running into things in your house. Human vision and anatomy (how your neck wants to bend) are biased in favor of looking down. As the saying goes, there are many more dangerous things on the ground than in the air.

Peripheral vision has very low resolution and almost no sense of color, but it is very sensitive to motion and flicker. It lets you sense things you don’t consciously see, so that you turn your head to look at them before you run into them. The two charts on the right illustrate a typical person’s human vision for the Hololens 2 and the AVP. The lightest gray areas are for the individual right and left eye; the central rounded triangular mid-gray area is where the eyes have binocular overlap, and you have stereo/depth vision. The near-black areas are where the headset blocks your vision. The green area shows the display’s FOV.

Battery Cable and No Keep-Alive Battery

What is concerning from a safety perspective is that with the AVP, essentially all peripheral vision is lost, even if the display is in full passthrough mode with no content. It is one thing to have a demo in a safe demo room with “handlers/wranglers,” as Apple did at the WWDC; it is another thing to let people loose in a real house or workplace.

Almost as a topper on safety, the AVP has the battery on an external cable which is a snag hazard. By all reports, the AVP does not have a small “keep-alive” battery built into the headset if the battery is accidentally disconnected or deliberately swapped (this seems like an oversight). So if the cable gets pulled, the user is completely blinded; you better hope it doesn’t happen at the wrong time. Another saying I have is, “There is no release strength on a breakaway cable that is weak enough to keep you safe that is strong enough not to release when you don’t want it to break.”

Question: which is worse?

A) To have the pull force so high that you risk pulling the head into something dangerous, or

B) To have the cord pull out needlessly blinding the person so they trip or run into something

This makes me wonder what warnings, if any, will occur with the AVP.

Mechanical Ergonomics

When it comes to the physical design of the headset, it appears that Apple strongly favored style over functionality. Even from largely favorable reviewers, there were many complaints about physical comfort being a problem.

Terrible Weight Distribution

About 90% of the weight of the AVP appears to be in front of the eyes, making the unit very front-heavy. The AVP’s “solution” is to clamp the headset to the face with the “Light Seal” face adapter applying pressure to the face. Many users with just half-hour wear periods discussed the unit’s weight and pressure on the face. Wall Street Journal reporter Joanna Stern discussed the problem and even showed how it left red marks on her face. Apple was making the excuse that they only had limited face adapters and that better adapters would fix or improve the problem. There is no way a better Light Seal shape will fix the problem with so much weight sitting beyond the eyes and without any overhead support.

*Estimation of the battery size and weight*

Experienced VR users who tried on the AVP report that they think the AVP headset weighs at least 450 grams, with some thinking it might be over 500 grams. Based on the battery cable’s size, I think the cable weighs about 60 grams, pulling asymmetrically on the headset. Based on a similar size but slightly differently shaped battery, the AVP’s battery is about 200 grams. While a detachable battery gives options for larger batteries or a direct power connection, it only saves about 200-60 = 140 grams of weight on the head in the current configuration.

Many test users commented on there being an over-the-head strap, and one was shown in the videos (see lower right above). Still, the strap shown is very far behind the unit’s center of gravity and will do little to take the weight off the front, which is what would reduce the clamping force required against the face. This is basic physics 101.

I have seen reports that several strap types will be available, including ones made out of leather. I expect there will have to be front-to-back straps built-in to relieve pressure on the user’s face.

I thought they could clip a battery pack with a shorter cable to the back of the headset, similar to the Meta Quest Pro and Hololens 2 (below), but this won’t work as the back headband is flexible and thus will not transfer the force to help balance the front. Perhaps Apple or 3rd parties will develop a different back headband without as much flexibility, incorporating a battery to help counterbalance the front. Of course, all this talk of straps will be problematic with some hairstyles (e.g., right) where neither a front-to-back nor side-to-side strap will work.

Meta Quest Pro is 722 grams (including a ~20Wh battery), and Hololens 2 is 566 grams (including a ~62Wh battery). Even with the forehead pad, the Hololens 2 comes with a front-to-back strap (not shown in the picture above), and the Meta Quest Pro needs one if worn for prolonged periods (and there are multiple aftermarket straps). Even most VR headsets lighter than the AVP with face seals have overhead straps.

If Apple integrated the battery into the back headband, they would only add about 200 grams or a net 140 grams, subtracting out the weight of the cable. This would place the AVP between the Meta Quest Pro and Hololens 2 in weight.
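As a rough check of the weight arithmetic above, here is a minimal sketch using only the estimates quoted in the text. It assumes the 450-gram headset guess excludes the cable, which the reports do not confirm, and the function name is made up for illustration.

```python
# All weights in grams; every figure is an estimate quoted in the text, not a spec.
AVP_HEADSET_G = 450   # low-end guess from experienced VR users (assumed to exclude the cable)
BATTERY_G = 200       # estimated from a similar-size battery
CABLE_G = 60          # estimated from the cable size
MQP_G = 722           # Meta Quest Pro, battery included
HOLOLENS2_G = 566     # Hololens 2, battery included


def on_head_weight(headset_g, battery_g=0, cable_g=0):
    """Weight carried by the head for a given configuration."""
    return headset_g + battery_g + cable_g


external = on_head_weight(AVP_HEADSET_G, cable_g=CABLE_G)        # battery in a pocket
integrated = on_head_weight(AVP_HEADSET_G, battery_g=BATTERY_G)  # battery in the headband
net_added = integrated - external

print(f"external battery: {external} g, integrated: {integrated} g, net added: {net_added} g")
print("between Hololens 2 and MQP:", HOLOLENS2_G < integrated < MQP_G)
```

Using the 500-gram guess for the headset instead gives 700 grams integrated, which still lands between the two other headsets.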

Apple denies physics and the shape of human heads if they think they won’t need better support than they have shown for the AVP. I don’t think the net 140 grams of a battery is the difference between needing head straps and not needing them.

Conclusions

I see many of the problems with the AVP as stemming from how hard it is to do passthrough AR well and from the trade-offs and compromises Apple made between features and looks. I think Apple made some significant compromises to support the Eyesight feature, which even many fans of the technology say will have Uncanny Valley problems with people.

As I wrote in Part 1, the AVP blows away the Meta Quest Pro (MQP) and has a vastly improved passthrough. The MQP is obsolete by comparison. Still, I am not convinced it is good enough for long-term use. There are also a lot of basic safety issues.

Next time, I plan to explore more about the applications Apple presented and whether they are realistic regarding hardware support and human factors.

Appendix: Some Cleanup on Part 1

I had made some size comparisons and estimated that the AVP’s battery was about 35Wh to 50Wh, and then I found that someone had leaked (falsely) 36Wh, so I figured that must be it. It is not a big difference, though, as other reports now estimate the battery at about 37Wh. My main point is that the power was higher than some reported, and my power estimate seems close to correct.

All the pre- and post-announcement rumors suggested that the AVP uses pancake optics. I jumped to an erroneous conclusion from the WWDC 2023 video, which made the optics look aspheric refractive. In watching the flurry of reports and concentrating on the applications, I missed circling back to check on this assumption. It turns out that Apple’s June 5th news release states, “This technological breakthrough, combined with custom catadioptric lenses that enable incredible sharpness and clarity . . . ” Catadioptric means a combination of refractive and reflective optical elements, which includes pancake optics. Apple recently bought Limbak, an optics design company known for catadioptric designs, including those used in Lynx (which are catadioptric, but not pancake optics, and not what the AVP uses). They also had what they called “super pancake” designs. Apple eschews using any word used by other companies, as they avoided saying MR, XR, AR, VR, and Metaverse, and we can add “pancake optics” to that list.

*From Limbak’s Website: Left shows their 2-element “Super-Pancake,” and the middle two show Lynx’s optics.*

KGOnTech
Apple Vision Pro (Part 1) – What Apple Got Right Compared to The Meta Quest Pro
14 June 2023 at 03:21

By: Karl Guttag

Update June 14, 2023 PM: It turns out that Apple’s news release states, “This technological breakthrough, combined with custom catadioptric lenses that enable incredible sharpness and clarity . . . ” Catadioptric means a combination of refractive and reflective optical elements. This means that they are not “purely refractive” as I first guessed (wrongly). They could be pancake or some variation of pancake optics. Apple recently bought Limbak, an optics design company known for catadioptric designs including those used in Lynx. They also had what they called “super pancake” designs. Assuming Apple is using a pancake design, then the light and power output of the OLEDs will need to be about 10X higher.

UPDATE June 14, 2023 AM: The information on the battery used as posted by Twitter User Kosutami turned out to be a hoax/fake. The battery shown was that of a Meta Quest 2 Elite as shown in a Reddit post of a teardown of the Quest 2 Elite. I still think the battery capacity of the Apple Vision Pro is in the 35 to 50Wh range based on the size of the AVP’s battery pack. I want to thank reader Xuelei Zhang for pointing out the error. I have red-lined and X’ed out the incorrect information in the original article. Additionally, based on the battery’s size, Charger Labs estimates that the Apple Vision Pro could be in the 74Wh range, but I think this is likely too high based on my own comparison.

I have shot a picture with a Meta Quest Pro (as a stand-in to judge size and perspective) to compare against Apple’s picture of the battery pack. In the picture is a known 37Wh battery pack. This battery pack is in a plastic case with two USB-A ports and one USB-micro port, which are not in the Apple battery pack (there are likely some other differences internally).

I tried to get the picture with a similar setup and perspective, but this is all very approximate to get a rough idea of the battery size. The Apple battery pack looks a little thinner, less wide, and longer than the 37Wh “known” battery pack. The net volume appears to be similar. Thus I would judge the Apple battery to be between about 35Wh and 50Wh.

Introduction

I’ve been watching and reading the many reviews by those invited to try (typically for about 30 minutes) the Apple Vision Pro (AVP). Unfortunately, I saw very little technical analysis and very few with deep knowledge of the issues of virtual and augmented reality. At least they didn’t mention what seemed to me to be obvious issues and questions. Much of what I saw were people that were either fans or grateful to be selected to get an early look at the AVP and wanted (or needed) to be invited back by Apple.

Unfortunately, I didn’t see a lot of “critical thinking” or understanding of the technical issues rather than having “blown minds.” Specifically, while many discussed the issue of the uncanny valley with the face capture and Eyesight Display, no one even mentioned the issues of variable focusing and Vergence-Accommodation Conflict (VAC). The only places I have seen it mentioned are in the Reddit AR/VR/MR and Y-Combinator forums. On June 4th, Brad Lynch reported on Twitter that Meta would present their “VR headset with a retinal resolution varifocal display” paper at Siggraph 2023.

As I mentioned in my AWE 2023 presentation video (and full slides set here), I was doubtful based on what was rumored that Apple would address VAC. Like many others, Apple appears to have ignored the well-known and well-documented human mechanical and visual problem with VR/MR. As I said many times, “If all it took were money and smart people, it would be here already. Apple, Meta, etc. can’t buy different physics,” and I should add, “they are also stuck with humans as they exist with their highly complex and varied visual systems.”

Treat the above as a “teaser” for some of what I will discuss in Part 2. Before discussing the problems I see with the Apple Vision Pro and its prospective applications in Part 2, this part will discuss what the AVP got right over the Meta Quest Pro (MQP).

I know many Apple researchers and executives read this blog; if you have the goods, how about arranging for someone that understands the technology and human factor issues to evaluate the AVP?

Some Media with some Critical Thinking about the AVP

I want to highlight three publications that brought up some good issues and dug at least a little below the surface. SadlyItsBradley had an hour and 49-minute live stream discussing many issues, particularly the display hardware and the applications relative to VR (the host, Brad Lynch, primarily follows VR). The Verge Podcast had a pre-WWDC (which included some Meta Quest 3) and post-WWDC discussion that brought up issues with the presented applications. I particularly recommend listening to Adi Robertson’s comments in the “pre” podcast; she is hilarious in her take. Finally, I found Snazzy Labs’ 13-minute explanation about the applications put into words some of the problems with the applications Apple showed; in short, there was nothing new that had not failed before, and the earlier failures were not just because the hardware was not good enough.

What Apple got right that Meta Quest (Half)-Pro got wrong.

Apple’s AVP has shown up Meta’s MQP in just about everyone’s opinion. The Meta Quest Pro is considered expensive, with many poorly executed features. The MQP cost less than half as much as the AVP at introduction (less than 1/3rd after the price drop) but is a bridge to nowhere. The MQP perhaps would better be called the Quest 2.5 (i.e., halfway to the Quest 3). Discussed below are specific hardware differences between the AVP and MQP.

People Saying the AVP’s $3,499 price is too high lack historical perspective

I will be critical of many of Apple’s AVP decisions, but I think all the comments I have seen about the price being too high completely miss the point. The price is temporal and can be reduced with volume. Apple or Meta must prove that a highly useful MR passthrough headset can be made at any price. I’m certainly not convinced yet, based on what I have seen, that the AVP will succeed in proving the future of passthrough MR, but the MQP has shown that halfway measures fail.

The people commenting on the AVP’s price have been spoiled by looking at mature rather than new technology. As just one example, the original retail price of the Apple II computer with 4 KB of RAM was US$1,298 (equivalent to $6,268 in 2022), and it was US$2,638 (equivalent to $12,739 in 2022) with the maximum 48 KB of RAM (source Wikipedia). As another example, I bought my first video tape recorder in 1979 for about $1,000, which is more than $4,400 adjusted for inflation, and a blank 1.5-hour tape was about $10 (~$44 in 2023 dollars). The problem is not price but whether the AVP is something people will use regularly.

Passthrough

The Meta Quest Pro (MQP) looks like a half-baked effort compared to the AVP. The MQP’s passthrough mode is comically bad, as shown in Meta Quest Pro (Part 1) – Unbelievably Bad AR Passthrough. Apple’s AVP passthrough will not be “perfect” (more on that in part 2), but Apple didn’t make something with so many obvious problems.

The MQP used two IR cameras with a single high-resolution color camera in the middle to try and synthesize a “virtual camera” for each eye with 3-D depth perception. The article above shows that the MQP’s method resulted in a low-resolution and very distorted view. The AVP has a high-resolution camera per eye, with more depth-sensing cameras/sensors and much more processing to create virtual camera-per-eye views.

I should add that there are no reports I have seen on how accurately the AVP creates 3-D views of the real world, but by all reports, the AVP’s passthrough is vastly better than that of the MQP. A hint that all is not well with the AVP’s passthrough is that the forward main cameras are poorly positioned (to be discussed in Part 2).

Resolution for “Business Applications” (including word processing) – Necessary, but not sufficient

The next issue is that if you target “business applications” and computer monitor replacement, you need at least 40 pixels per degree (ppd), preferably more. The MQP has only about 20 pixels per degree, meaning much less readable text can fit in a given area. Because the fonts are bigger, the eyes must move further to read the same amount of text, thus slowing down reading speed. The FOV of the AVP has been estimated to be about the same as the MQP, but the AVP has more than 2X the horizontal and vertical pixels, resulting in about 40 ppd.

A note on measuring Pixels per Degree: Typically, VR headset measurement of FOV includes the binocular overlap from both eyes. When it comes to measuring “pixels per degree,” the measurement is based on the total visible pixels divided by the FOV in the same direction for a single eye. The single-eye FOV is often not specified, and there may be pixels that are cut off based on the optics and the eye location. Additionally, the measurement has a degree of variability based on the amount of eye relief assumed.
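The measurement described in the note can be sketched in a few lines. The pixel counts and the 100-degree single-eye FOV below are round, hypothetical numbers chosen only to illustrate the 2X relationship; they are not measured values for either headset.

```python
def pixels_per_degree(visible_pixels, single_eye_fov_deg):
    """Average ppd along one axis: visible pixels divided by single-eye FOV.

    This is a simple average; real optics vary in ppd across the field,
    and some pixels may be cut off by the optics or the eye position.
    """
    return visible_pixels / single_eye_fov_deg


# Hypothetical illustration: same FOV, twice the pixels along the axis.
FOV_DEG = 100                      # assumed single-eye FOV, not a spec
lower_res = pixels_per_degree(2000, FOV_DEG)
higher_res = pixels_per_degree(4000, FOV_DEG)

print(f"{lower_res:.0f} ppd vs {higher_res:.0f} ppd")
```

With the same FOV, doubling the pixel count along an axis doubles the ppd, which is how a headset goes from about 20 ppd to about 40 ppd.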

Having at least 40 pixels per degree is “necessary but not sufficient” for supporting business applications. I believe that other visual human factors will make the AVP unsuitable for business applications beyond “emergency” situations and what I call the “Ice Bucket Challenges,” where someone wears a headset for a week or a month to “prove” it could be done and then goes back to a computer monitor/laptop. I have not seen any study (having looked for many years), and Apple presented none, that suggests the long-term use of virtual desktops is good for humans (if you know of one, please let me know).

The watchOS WWDC message to watch screens less

Ironically, in the watchOS video, only a few minutes before the AVP announcement, Apple discussed (linked in WWDC 2023 video) how they implemented features in watchOS to encourage people to go outside and stop looking at screens, as it may be a cause of myopia. I’m not the only one to catch this seeming contradiction in messaging.

AVP Micro-OLED vs. MQP’s LCD with Mini-LED Local Dimmable Backlight

The AVP’s Micro-OLED should give better black levels/contrast than the MQP’s LCD with a mini-LED local dimmable backlight. Local dimming is problematic and depends on scene content. While the mini-LEDs are more efficient in producing light, much of that light is lost when going through the LCD, and typically only about 3% to 6% of the backlight makes it through the LCD.

While Apple claims to be making the Micro-OLED CMOS “backplane,” by all reports, Sony is applying the OLEDs and performing the Micro-OLED assembly. Sony has long been the leader in Micro-OLEDs used in camera viewfinders and birdbath AR headsets, including Xreal (formerly Nreal; see Nreal Teardown: Part 2, Detailed Look Inside).

Micro-Lens Array for Added Efficiency

The color sub-pixel arrangement in the WWDC videos shows a decidedly smaller light emission area, with more black space between pixels, than the older Sony ECX335 (shown with pixels roughly to scale above). This suggests that Apple didn’t need to push the light output (see optic efficiency in next section) and supported more efficient light collection (semi-collimation) with the use of micro-lens arrays (MLAs), which are reportedly used on top of the AVP’s Micro-OLED.

MQP’s LCD with Mini-LED with Local Dimming

John Carmack, former Meta Consulting CTO, gave some of the limitations and issues with MQP’s Local Dimming feature in his unscripted talk after the MQP’s introduction (excerpts from his discussion):

21:10 Quest Pro has a whole lot of back lights, a full grid of them, so we can kind of strobe them off in rows or columns as we scan things out, which lets us sort of get the ability of chasing a rolling shutter like we have on some other things, which should give us some extra latency. But unfortunately, some other choices in this display architecture cost us some latency, so we didn’t wind up really getting a win with that.

But one of the exciting possible things that you can do with this is do local dimming, where if you know that an area of the screen has nothing but black in it, you could literally turn off the bits of the backlight there. . . .

Now, it’s not enabled by default because to do this, we have to kind of scan over the screens and that costs us some time, and we don’t have a lot of extra time here. But a layer can choose to enable this extra local dimming. . . .

And if you’ve got an environment like I’m in right now, there’s literally no complete, maybe a little bit on one of those surfaces over there that’s a complete black. On most systems, most scenes, it doesn’t wind up actually benefiting you. . . .

There’s still limits where you’re not going to get, on an OLED, you can do super bright stars on a completely black sky. With local dimming, you can’t do that because if you’ve got a max value star in a min value black sky, it’s still gotta pick something and stretch the pixels around it. . . . We do have this one flag that we can set up for layer optimization.
John Carmack Meta Connect 2022 Unscripted Talk

Pancake (MQP) versus Aspherical Refractive Optics (AVP)

Update June 14, 2023 PM: It turns out that Apple’s news release states, “This technological breakthrough, combined with custom catadioptric lenses that enable incredible sharpness and clarity . . . ” Catadioptric means a combination of refractive and reflective optical elements. This means that they are not “purely refractive” as I first guessed (wrongly). They could be pancake or some variation of pancake optics. Apple recently bought Limbak, an optics design company known for catadioptric designs including those used in Lynx. They also had what they called “super pancake” designs. Assuming Apple is using a pancake design, then the power output of the OLEDs will need to be about 10X higher.

Apple used a 3-element aspherical optic rather than the Pancake optics used in the MQP and many other new VR designs. See this blog’s article Meta (aka Facebook) Cambria Electrically Controllable LC Lens for VAC?, which discusses the efficiency issues with Pancake Optics. Pancake optics are particularly inefficient with Micro-OLED displays, as used in the AVP, because they require the unpolarized OLED light to be polarized for the optics to work. This polarization typically loses about 55% of the light (45% transmissive). Then there is a 50% loss on the transmissive pass and another 50% loss on the reflection of a 50/50 semi-mirror in the pancake optics. Combined with the polarization loss, this results in only about 11% of the OLED’s light making it through the pancake optics. It should be noted that the MQP currently uses LCDs that output polarized light, so it doesn’t suffer the polarization loss with pancake optics but still has the 50/50 semi-mirror losses.
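The loss chain described above is just a product of three factors, sketched below with the round figures from the paragraph. The constant and function names are made up for illustration, and the model deliberately ignores every other loss (coatings, the reflective polarizer, and so on), so real throughput would be somewhat lower.

```python
POLARIZER_TRANSMISSION = 0.45   # unpolarized light surviving polarization (~55% lost)
SEMI_MIRROR_PASS = 0.50         # first pass through the 50/50 semi-mirror
SEMI_MIRROR_REFLECT = 0.50      # later reflection off the same semi-mirror


def pancake_throughput(source_is_polarized):
    """Fraction of display light surviving the losses named in the text."""
    throughput = SEMI_MIRROR_PASS * SEMI_MIRROR_REFLECT
    if not source_is_polarized:
        # OLED light is unpolarized and must be polarized first.
        throughput *= POLARIZER_TRANSMISSION
    return throughput


oled = pancake_throughput(source_is_polarized=False)   # Micro-OLED case
lcd = pancake_throughput(source_is_polarized=True)     # LCD already emits polarized light

print(f"OLED through pancake: {oled:.1%}, LCD through pancake: {lcd:.1%}")
```

An LCD still loses three quarters of its light at the semi-mirror, but an OLED gives up more than half of what remains to the polarizer on top of that.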

AVP’s Superior Hand Tracking

The AVP uses four hand-tracking cameras, with the two extra cameras supporting the tracking of hands at about waist level. Holding your hand up to be tracked has been a major ergonomic complaint of mine since I first tried the Hololens 1. Anyone who knows anything about ergonomics knows that humans are not designed to hold their hands up for long periods. Apple seems to be the first company to address this issue. Additionally, by all reports, the hand tracking is very accurate and likely much better than the MQP’s.

AVP’s Exceptionally Good Eye Tracking

According to all reports, the AVP’s eye tracking is exceptionally good and accurate. Part of the reason for this better eye tracking is likely better algorithms and processing. On the hardware side, it is interesting that the AVP’s IR illuminator and cameras go through the eyepiece optics. In contrast, on the Meta Quest Pro, the IR illuminator and cameras are closer to the eye on a ring outside the optics. The result is that the AVP cameras have a more straight-on look at the eyes. {Brad Lynch of SadlyItsBradley pointed out the difference in IR illuminator and camera location between the AVP and MQP in an offline discussion.}

Processing and power

As many others have pointed out, the AVP uses a computer-level CPU+GPU (M2) and a custom-designed R1 “vision processor,” whereas the MQP uses high-end smartphone processors. Apple has pressed its advantage in hardware design over Meta or anyone else.

The AVP (below left) has two squirrel-cage fans situated between the M2 and R1 processor chips and the optics. The AVP appears to have about a 37 Watt-hour battery (see next section) to support the two-hour rated battery life, which suggests that the AVP “typically” consumes about 18.5 Watts. This is consistent with people noticing very warm/hot air coming out of the top vent holes. The MQP (below right) has similar dual-fan cooling. The MQP has a 20.58 Watt-hour battery, with the MQP rated by Meta as lasting 2-3 hours.

Because the AVP uses a Micro-OLED and a much more efficient optical design, I would expect the AVP’s OLED to consume less than 1W per eye and much less when not viewing mostly white content. I, therefore, suspect that much of the power in the AVP is going to the M2 and R1 processing. In the case of Meta’s MQP, I suspect that a much higher percentage of the system power goes into pushing light through the inefficient optical architecture.

It should be noted that the AVP displays about 3.3 times the pixels, has more and higher resolution cameras, and supports much higher resolution passthrough. Thus the AVP is moving massively more data, which also consumes power. So while it looks like the AVP consumes about double the power, the power “per pixel” is about 1/3rd less than the MQP and probably much less when considering all factors. Considering that the processing done by the AVP seems much more advanced, it demonstrates Apple’s processing efficiency.
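The “per pixel” comparison is a ratio of two ratios, sketched below with the round figures from the paragraph (about 2X the power, about 3.3X the pixels). The function name is made up for illustration, and the result is only as good as those two estimates.

```python
def relative_power_per_pixel(power_ratio, pixel_ratio):
    """AVP power per displayed pixel relative to the MQP (1.0 means equal)."""
    return power_ratio / pixel_ratio


POWER_RATIO = 2.0    # AVP power / MQP power, estimated
PIXEL_RATIO = 3.3    # AVP pixels / MQP pixels, estimated

relative = relative_power_per_pixel(POWER_RATIO, PIXEL_RATIO)
reduction = 1.0 - relative

print(f"AVP uses about {relative:.2f}x the MQP's power per pixel "
      f"({reduction:.0%} less)")
```

With these round inputs the reduction comes out somewhere between a third and 40%, so the “about 1/3rd less” figure is, if anything, slightly conservative.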

AVP appears to take about 2X the power of the MQP – And 2X what most others are reporting

CORRECTION (June 14, 2023): Based on information from reader Xuelei Zhang, I was able to confirm that the widely reported tweet of the so-called Apple Vision Pro battery was a hoax and that what was shown is the battery used in a Meta Quest 2 Elite. You can see in the picture on the right how the number is the same and there is the metal slug with the hole, just like the supposed AVP battery. I still think, based on its size, that the battery pack is similar to a 37Wh battery or perhaps larger. In an article published today, Charger Labs estimates that the Apple Vision Pro could be in the 74Wh range, which is certainly possible but appears to me to be too big. It looks to me like the battery is between 35Wh and 50Wh.

Based on the available information, I would peg the battery to be in the 35 to 50Wh range and thus the “typical” power consumption of the AVP to be in the 17.5W to 25W range, or about two times the Meta Quest Pro’s ~10W.
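These power figures come from dividing battery energy (watt-hours) by rated runtime (hours), which yields average power in watts. A minimal sketch follows, assuming the two-hour AVP rating and the 2-3 hour MQP rating quoted in this article, and assuming the battery is fully drained over the rated time; the function name is made up for illustration.

```python
def average_power_w(capacity_wh, runtime_h):
    """Average draw in watts if the battery is fully drained over the runtime."""
    return capacity_wh / runtime_h


# AVP: estimated 35-50 Wh pack, two-hour rated life.
avp_low = average_power_w(35, 2)
avp_high = average_power_w(50, 2)

# MQP: 20.58 Wh pack, rated 2-3 hours.
mqp_high = average_power_w(20.58, 2)
mqp_low = average_power_w(20.58, 3)

print(f"AVP: {avp_low:.1f} W to {avp_high:.1f} W")
print(f"MQP: {mqp_low:.1f} W to {mqp_high:.1f} W")
```

Because watts already measure energy per unit time, the result is simply “watts,” not “watts per hour.”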

Numerous articles and videos report, erroneously I think, that the AVP has a 4789mAh/18.3Wh battery. Going back to the source of those reports, a Tweet by Kosutami, it appears that the word “dual” was missed. Looking at the original follow-up Tweets, the report is clear that two cells are folded about a metal slug and, when added together, would total 36.6Wh. Additionally, in comparing the AVP’s battery to scale with the headset, it appears to be about the same size as a 37Wh battery I own, which is what I was estimating before I saw Kosutami’s tweet.

Importantly, if the AVP’s battery capacity is doubled, as I think is correct, then the estimated power consumption of the AVP is about double what others have reported, or about 18.5 Watts.

The MQP battery was identified by iFixit (above left) to have two cells that combine to form a 20.58Wh battery pack, or just over half that of the AVP.

With both the MQP and AVP claiming similar battery life (big caveat, as both are talking “typical use”), it suggests the AVP is consuming about double the power.

Based on my quick analysis of the optics and displays, I think the AVP’s displays consume less than 1W per eye, or less than 2W in total. This suggests that the bulk of the ~18W is used by the two processors (M2, R1), data/memory movement (often ignored), the many cameras, and IR illuminators.

In part 2 of this series, I plan to discuss the many user problems I see with the AVP’s battery pack.

Audio

This blog does not seriously follow audio technology, but by all accounts, the AVP’s audio hardware and spatial sound processing capability will be far superior to that of the MQP.

Conclusions

In many ways, the AVP can be seen as the “Meta Quest Pro done much better.” If you are doing more of a “flagship/Pro product,” it better be a flagship. The AVP is 3.5 times the current price of the MQP and about seven times that of the Meta Quest 3, but that is largely irrelevant in the long run. The key to the future is whether anyone can prove that the “vision” for passthrough VR at any price is workable for a large user base. I can see significant niche applications for the AVP (support for people with low vision is just one, although the display resolution is overkill for this use). But as I will discuss next time, there are giant holes in the applications presented.

If the MQP or AVP would solve the problems they purport to solve, the price would not be the major stumbling block. As Apple claimed in the WWDC 2023 video, the feature set of the AVP would be a bargain for many people. Time and volume will cure the cost issues. My problem (teaser for Part 2) is that neither will be able to fulfill the vision they paint, and it is not the difference between a few thousand dollars and a few more years of development.

KGOnTech
DigiLens, Lumus, Vuzix, Oppo, & Avegant Optical AR (CES & AR/VR/MR 2023 Pt. 8)
27 March 2023 at 19:46

By: Karl Guttag

Introduction – Contrast in Approaches and Technologies

This article will compare and contrast the Vuzix Ultralite, Lumus Z-Lens, and DigiLens Argo waveguide-based AR prototypes I saw at CES 2023. I discussed these three prototypes with SadlyItsBradley in our CES 2023 video. It will also briefly discuss Avegant’s related AR/VR/MR 2022 and 2023 presentations about their new smaller LCOS projection engine and Magic Leap 2’s LCOS design to show some other projection engine options.

It will go a bit deeper into some of the human factors of the DigiLens Argo. This is not to pick on the Argo; rather, it has more features and demonstrates some common traits and issues of trying to support a rich feature set in a glasses-like form factor.

When I quote various specs below, they are all manufacturer’s claims unless otherwise stated. Some of these claims will be based on where the companies expect the product to be in production. No one has checked the claims’ veracity, and most companies typically round up, sometimes very generously, on brightness (nits) and field of view (FOV) specs.

This is a somewhat long article, and the key topics discussed include:

MicroLED versus LCOS Optical engine sizes
The image quality of MicroLED vs. LCOS and Reflective (Lumus) vs. Diffractive waveguides
The efficiency of Reflective vs. Diffractive waveguides with MicroLEDs
The efficiency of MicroLED vs. LCOS
Glasses form factor (using Digilens Argo as an example)

Overview of the prototypes

Vuzix Ultralite and Oppo Air Glass 2

The Vuzix Ultralite and Oppo Air Glass 2 (top two on the right) each have a 640 by 480 pixel, green-only Jade Bird Display (JBD) MicroLED per eye. They were discussed in MicroLEDs with Waveguides (CES & AR/VR/MR 2023 Pt. 7).

Each weighs about 38 grams, including the frames. Both are self-contained, with integrated processing, wireless communication, and batteries.

Vuzix developed their own glass diffractive waveguides and optical engines for the Ultralite. They claim a 30-degree FOV with 3,000 nits.

Oppo uses resin plastic waveguides and a MicroLED optical engine developed jointly with Meta Bounds. I have previously seen prototype resin plastic waveguides from other companies for several years. This is the first time I have seen them in a product getting ready for production. The glasses (described in a 1.5-minute YouTube/CNET video) include microphones and speakers for applications, including voice-to-text and phone calls. They also plan on supporting vision correction with lenses built into the frames. Oppo claims the Air Glass 2 has a 27-degree FOV and outputs 1,400 nits.

Lumus Z-Lens

Lumus’s Z-Lens (third from the top right) supports up to a 2K by 2K full/true color LCOS display with a 50-degree FOV. Its FoV is 3 to 4 times the area of the other three headsets, so it must output more than 3 to 4 times the total light. It supports about 4.5x the number of pixels of the DigiLens Argo and over 13x the pixels of the Vuzix Ultralite and Oppo Air Glass 2.
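The pixel and FOV ratios quoted above can be reproduced with a few lines of arithmetic. This is only a sketch: it assumes “2K by 2K” means 2048 by 2048, that all the quoted FOVs are measured the same way with similar aspect ratios, and that the image area on a flat virtual plane scales with the square of the tangent of the half-angle. The helper names are illustrative.

```python
import math

def pixel_count(width: int, height: int) -> int:
    """Total pixels in a display."""
    return width * height

def fov_area_ratio(fov_a_deg: float, fov_b_deg: float) -> float:
    """Ratio of flat-plane image areas for two FOVs with the same aspect ratio."""
    tan_a = math.tan(math.radians(fov_a_deg / 2))
    tan_b = math.tan(math.radians(fov_b_deg / 2))
    return (tan_a / tan_b) ** 2

lumus = pixel_count(2048, 2048)   # ASSUMPTION: "2K" taken as 2048
argo = pixel_count(1280, 720)
jbd = pixel_count(640, 480)       # Vuzix Ultralite and Oppo Air Glass 2

lumus_vs_argo = lumus / argo
lumus_vs_jbd = lumus / jbd
area_vs_30deg = fov_area_ratio(50, 30)   # versus the claimed 30-degree FOVs
area_vs_27deg = fov_area_ratio(50, 27)   # versus Oppo's claimed 27-degree FOV

print(f"Z-Lens vs Argo pixels:      {lumus_vs_argo:.2f}x")
print(f"Z-Lens vs 640x480 pixels:   {lumus_vs_jbd:.2f}x")
print(f"50-deg vs 30-deg FOV area:  {area_vs_30deg:.2f}x")
print(f"50-deg vs 27-deg FOV area:  {area_vs_27deg:.2f}x")
```

Under these assumptions, the pixel ratios land at about 4.5x and a bit over 13x, and the FOV area ratio falls between 3x and 4x.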

The Z-Lens prototype is a demonstration of display capability and, unlike the other three, is not self-contained and has no battery or processing. A cable provides the display signal and power for each eye. Lumus is an optics waveguide and projector engine company and leaves it to its customers to make full-up products.

Digilens Argo

The DigiLens Argo (bottom, above right) uses a 1280 by 720 full/true color LCOS display. The Argo has many more features than the other devices, with integrated SLAM cameras, GNSS (GPS, etc.), Wi-Fi, Bluetooth, a 48MP (with 4×4 pixel “binning” like the iPhone 14) color camera, voice recognition, batteries, and a more advanced CPU (Qualcomm Snapdragon XR2). Digilens intends to sell the Argo for enterprise applications, perhaps with partners, while continuing to sell waveguides and optical engines as components for higher-volume applications. As the Argo has a much more complete feature set, I will discuss some of the pros and cons of some of the human factors of the Argo design later in this article.

Through the Lens Images

Below is a composite image from four photographs taken with the same camera (OM-D E-M5 Mark III) and lens (fixed 17mm). The pictures were taken at conferences, handheld, and not perfectly aligned for optimum image quality. The projected display and the room/outdoor lighting have a wide range of brightness between the pictures. None of the pictures have been resized, so the relative FoVs have been maintained, and you get an idea of the image content.

The Lumus Z-Lens reflective waveguide has a much bigger FOV, significantly more resolution, and exhibits much better color uniformity with the same or higher brightness (nits). It also appears that reflective waveguides have a significant efficiency advantage with both MicroLEDs and LCOS, as discussed in MicroLEDs with Waveguides (CES & AR/VR/MR 2023 Pt. 7). It should also be noted that the Lumus Z-Lens prototype has only the display with optics and has no integrated processing, communication, or battery. In contrast, the others are closer to full products.

A more complex issue is that of power consumption versus brightness. LCOS engines today are much more efficient (by 10x or more) than MicroLEDs with similar waveguides when showing full-screen bright images. MicroLED’s big power advantage occurs when the content is sparse, as the power consumption is roughly proportional to the average pixel value, whereas, with LCOS, the whole display is illuminated regardless of the content.
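The trade-off described above can be put into a simple normalized model. This is a sketch under stated assumptions, not measured data: LCOS power is treated as a constant 1.0 regardless of content, MicroLED power is treated as strictly proportional to the average pixel level (APL), and the 10x full-screen advantage for LCOS is taken from the text. The function names are illustrative.

```python
def microled_relative_power(apl: float, lcos_advantage: float) -> float:
    """MicroLED power relative to LCOS (LCOS = 1.0) at a given average pixel level.

    apl is the average pixel level from 0.0 (all black) to 1.0 (full white).
    lcos_advantage is how many times more efficient LCOS is at full white.
    """
    if not 0.0 <= apl <= 1.0:
        raise ValueError("apl must be between 0.0 and 1.0")
    return lcos_advantage * apl

def break_even_apl(lcos_advantage: float) -> float:
    """Average pixel level below which the MicroLED uses less power than LCOS."""
    return 1.0 / lcos_advantage

LCOS_ADVANTAGE = 10.0  # "by 10x or more" from the text; treated here as exactly 10

for apl in (0.02, 0.05, 0.10, 0.50, 1.00):
    rel = microled_relative_power(apl, LCOS_ADVANTAGE)
    print(f"APL {apl:>4.0%}: MicroLED uses {rel:.1f}x the LCOS power")

print(f"Break-even APL: {break_even_apl(LCOS_ADVANTAGE):.0%}")
```

In this simplified model, the MicroLED only wins when the content averages below about 10% of full white, which is the sparse-content case of arrows and short text.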

If and when MicroLEDs support full color, the efficiency of nits-per-Watt will be significantly lower than monochrome green. Whatever method produces full color will detract from the overall electrical and optical efficiency. Additionally, color balancing for white requires adding blue and red light with lower nits-per-Watt.

Some caveats:

The Lumus Z-Lens is a prototype and does not have all the anti-reflective and other coatings of a production waveguide. Lumus uses an LCOS device with roughly 3-micron pixels, which fits 1440 by 1440 within the ~50-degree FOV supported by the optics. Lumus is working with at least one LCOS maker to get a roughly 2-micron pixel size to support 2K by 2K resolution with the same size display. The image is cut off on the right-hand side by the camera, which was rotated into portrait mode to fit inside the glasses.
The Digilens through the lens image is from Photonics West in 2022 (about one year old). Digilens has continued to improve its waveguide since this picture was taken.
The Vuzix picture was taken via Vuzix Shield, which uses the same waveguide and optics as the Vuzix Ultralite.
The Oppo image was taken at the AR/VR/MR 2023 conference.
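The Lumus pixel-size caveat above can be checked with simple arithmetic: the active width of a display is the pixel count times the pixel pitch. The sketch below assumes exact 3-micron and 2-micron pitches and takes “2K” as 2048 pixels; both are approximations of the rounded figures in the text, and the helper names are illustrative.

```python
def active_width_mm(pixels_across: int, pitch_um: float) -> float:
    """Active display width in millimeters for a given pixel count and pitch."""
    return pixels_across * pitch_um / 1000.0

def pixels_per_degree(pixels_across: int, fov_deg: float) -> float:
    """Average angular resolution across the field of view."""
    return pixels_across / fov_deg

FOV_DEG = 50.0  # approximate FOV supported by the optics, per the text

current_width = active_width_mm(1440, 3.0)   # ASSUMPTION: exactly 3 microns
future_width = active_width_mm(2048, 2.0)    # ASSUMPTION: 2K = 2048, exactly 2 microns

print(f"1440 px at 3 um: {current_width:.2f} mm wide, "
      f"{pixels_per_degree(1440, FOV_DEG):.1f} pixels/degree")
print(f"2048 px at 2 um: {future_width:.2f} mm wide, "
      f"{pixels_per_degree(2048, FOV_DEG):.1f} pixels/degree")
```

The two widths come out within a few percent of each other (a little over 4 mm), which is why the smaller pixel allows 2K by 2K in about the same size display.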

Optical Engine Sizes

Vuzix has an impressively small optical engine driving its diffractive waveguides. Seen below left is a comparison of Vuzix’s older full-color DLP engine with an in-development color X-Cube engine and the green MicroLED engine used in the Vuzix Ultralite™ and Shield. In the center below is an exploded view of the Oppo and Meta Bounds glasses (a joint design, as they describe it) with their MicroLED engine, as shown in their short CNET YouTube video. As seen in the still from the Oppo video, they have plans to support vision correction built into the glasses.

Below right is the Digilens LCOS engine, which is a fairly conventional LCOS design (using OmniVision’s LCOS device, with the driver ASIC showing). The dotted line indicates where the engine blocks off the upper part of the waveguide. This blocked-off area carries over to the Argo design.

The Digilens Argo, with its more “conventional” LCOS engine, requires a large “brow” above the eye to hide it (more on this issue later). All the other companies have designed their engines to avoid this level of intrusion into the front area of the glasses.

Lumus had been developing their 1-D pupil-expanding reflective waveguide for nearly two decades, and it needed a relatively wide optical engine. With the 2-D Maximus waveguide in 2021 (see: Lumus Maximus 2K x 2K Per Eye, >3000 Nits, 50° FOV with Through-the-Optics Pictures), Lumus demonstrated their ability to shrink the optical engine. This year, Lumus further reduced the size of the optical engine and its intrusion into the front lens area with their new Z-Lens design (compare the two right pictures below of Maximus to Z-Lens).

Shown below are frontal views of the four lenses and their optical engines. The Oppo Air Glass 2 “disguises” the engine within the industrial design of a wider frame (and wider waveguide). The Lumus Z-Lens, with full color and about 3.5 times the FOV area of the others, has about the same frontal intrusion as the green-only MicroLED engines. The Argo (below right) stands out with the large brow above the eye (the rough location of the optical engine is shown with the red dotted line).

Lumus Removes the Need for Air Gaps with the Z-Lens

Another significant improvement with Lumus’s Z-Lens is that unlike Lumus’s prior waveguides and all diffractive waveguides, it does not require an air gap between the waveguide’s surface and any encapsulating plastics. This could prove to be a big advantage in supporting integrated prescription vision correction or simple protection. Supporting air gaps with waveguides has numerous design, cost, and optical problems.

A full-color diffractive waveguide typically has two or three waveguides sandwiched together, with air gaps between them plus an air gap on each side of the sandwich. Everywhere there is an air gap, there is also a desire for antireflective coatings to remove reflections and improve efficiency.

Avegant and Magic Leap Small LCOS Projector Engines

Older LCOS projection engines have historically had size problems. We are seeing new LCOS designs, such as the Lumus Z-lens (above), and designs from Avegant and Magic Leap that are much smaller and no more intrusive into the lens area than the MicroLED engines. My AR/VR/MR 2022 coverage included the article Magic Leap 2 at SPIE AR/VR/MR 2022, which discusses the small LCOS engines from both Magic Leap and Avegant. In our AWE 2022 video with SadlyItsBradley, I discuss the smaller LCOS engines by Avegant, Lumus (Maximus), and Magic Leap.

Below is what Avegant demonstrated at AR/VR/MR 2022 with their small “L” shaped optical engines. These engines have very little intrusion into the front lenses, but they run down the temple of the glasses, which inhibits folding the temple for storage like normal glasses.

At AR/VR/MR 2023, Avegant showed a newer optical design that reduced the footprint of their optics by 65%, including shortening them to the point that the temples can be folded, similar to conventional glasses (below left). It should be noted that what is called a “waveguide” in the Avegant diagram is very different from the waveguides used to show the image in AR glasses. Avegant’s waveguide is used to illuminate the LCOS device. Avegant, in their presentation, also discussed various drive modes of the LEDs to give higher brightness and efficiency with green-only and black-and-white modes. The 13-minute video of Avegant’s presentation is available at the SPIE site (behind SPIE’s paywall). According to Avegant’s presentation, the optics are 15.6mm long by 12.4mm wide, support a 30-degree FOV, with 34 pixels/degree, and 2 lumens of output in full color and up to 6 lumens in limited color outdoor mode. According to the presentation, they expect about 1,500 nits with typical diffractive waveguides in the full-color mode, which would roughly double in the outdoor mode.
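Two of the Avegant figures above can be combined into more familiar quantities. As a rough sketch (the variable names are illustrative, and it assumes the 34 pixels/degree applies uniformly across whichever direction the 30-degree FOV is measured in):

```python
FOV_DEG = 30          # claimed field of view
PIXELS_PER_DEG = 34   # claimed angular resolution
LENGTH_MM = 15.6      # claimed optics length
WIDTH_MM = 12.4       # claimed optics width

# Pixels spanned across the FOV, assuming uniform angular resolution.
pixels_across = FOV_DEG * PIXELS_PER_DEG

# Rectangular footprint of the optics, treating it as a simple length x width box.
footprint_mm2 = LENGTH_MM * WIDTH_MM

print(f"Pixels across the FOV: about {pixels_across}")
print(f"Optics footprint:      about {footprint_mm2:.0f} mm^2")
```

That works out to roughly 1,000 pixels across the FOV in a footprint of a little under 2 square centimeters.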

The Magic Leap 2 (ML2) takes reducing the optics one step further and puts the illumination LEDs and LCOS on opposite sides of the display’s waveguide (below and described in Magic Leap 2 at SPIE AR/VR/MR 2022). The ML2 claims to have 2,000 nits with a much larger 70-degree FOV.

Transparency (vs. Birdbath) and “Eye Glow”

Transparency

As seen in the pictures above, all the waveguide-based glasses have transparency on the order of 80-90%. This is a far cry from the common birdbath optics, with typically only 25% transparency (see Nreal Teardown: Part 1, Clones and Birdbath Basics). The former Osterhout Design Group (ODG) made birdbath AR Glasses popular first with their R6 and then with the R8 and R9 models (see my 2017 article ODG R-8 and R-9 Optic with OLED Microdisplays), which served as the models for designs such as Nreal and Lenovo’s A3.

ODG Legacy and Progress

Several former ODG designers have ended up at Lenovo, the design firm Pulsar, Digilens, and elsewhere in the AR community. I found pictures of Digilens VP Nima Shams wearing the ODG R9 in 2017 and the Digilens Argo at CES. When I showed the pictures to Nima, he pointed out the progress that had been made. The 2023 Argo is lighter, sticks out less far, has more eye relief, is much more transparent, has a brighter image to the eye, and is much more power efficient. At the same time, it adds features and processing not found on the ODG R8 and R9.

Front Projection (“Eye Glow”)

Another social aspect of AR glasses is Front Projection, known as “Eye Glow.” Most famously, the Hololens 1 and 2 and the Magic Leap 1 and 2 project much of the light forward. The birdbath optics-based glasses also have front projection issues but are often hidden behind additional dark sunglasses.

When looking at the “eye glow” pictures below, I want to caution you that these are random pictures and not controlled tests. The glasses display radically different brightness settings, and the ambient light is very different. Also, front projection is typically highly directional, so the camera angle has a major effect (and there was no attempt to search for the worst-case angle).

In our AWE 2022 Video with SadlyItsBradley, I discussed how several companies, including Dispelix, are working to reduce front projection. Digilens is one of the companies I discussed that has been working to reduce front projection. Lumus’s reflective approach has inherent advantages in terms of front projection. The DigiLens Argo (pictures 2 and 3 from the right) has greatly reduced its eye glow. The Vuzix Shield (with the same optics as the Ultralite) has some front projection (and some on my cheek), as seen in the picture below (4th from the left). Oppo appears to have a fairly pronounced front projection, as seen in two short videos (video 1 and video 2).

DigiLens Argo Deeper Look

DigiLens has been primarily a maker of diffractive waveguides, but it has, through the years, made several near-product demonstrations. A few years ago, they went through a major management change (see 2021 article, DigiLens Visit), and with the new management came changes in direction.

Argo’s Business Model

I’m always curious when a “component company” develops an end product. I asked DigiLens to help clarify their business approaches and received the following information (with my edits):

Optical Solutions Licensing – where we provide solutions to our licensees to build their own waveguides using our scalable printing/contactless copy process. Our licensees can design their waveguides, which Digilens’ software tools enable. This business is aimed at higher-volume applications from larger companies, mostly focused on, but not limited to, the consumer side of the head-worn market.

Enterprise/Industrial Products – ARGO is the first product from DigiLens that targets the enterprise and industrial market as a full solution. It will be built to scale and meet its target market’s compliance and reliability needs. It uses DigiLens optical technology in the waveguides and projector and is built by a team with experience shipping thousands of enterprise & Industrial glasses from Daqri, ODG, and RealWear.

Image Quality

As I was familiar with DigiLens’ image quality, I didn’t really check it out that much with the ARGO; rather, I was interested in the overall product concept. Over the last several years, I have seen improved image quality, including uniformity and addressing the “eye glow” issue (discussed earlier).

For the type of applications in the “enterprise market” ARGO is trying to serve, absolute image quality may not be nearly as important as other factors. As I have often said, “Hololens 2 proves that image quality [is not critical] for the customers that use it” (see this set of articles discussing the Hololens 2’s poor image quality). For many AR markets, the display information is simple indicators such as arrows, a few numbers, and lines. In terms of color, it may be good enough if only a few key colors are easily distinguishable.

Overall, Digilens has color uniformity issues across the field of view similar to those of all other diffractive waveguides I have seen. In the last few years, they have gone from having poor color uniformity to being among the better diffractive waveguides I have seen. I don’t think any diffractive waveguide would be widely considered good enough for movies and good photographs, but they are good enough to show lines, arrows, and text. But let me add a key caveat: what all companies demonstrate are almost certainly cherry-picked samples.

Field of View (FOV)

While the Argo’s 30-degree FOV is considered too small for immersive games, for many “enterprise applications,” it should be more than sufficient. I discussed why very large FOVs are often unnecessary in AR in this blog’s 2019 article FOV Obsession. Many have conflated VR immersion with AR applications that need to show key information while offering high transparency, light weight, and hands-free operation. As Professor and decades-long AR advocate Thad Starner pointed out, requiring the eye to move too much causes discomfort. I make this point because a very large FOV comes at the expense of weight, power, and cost.

Key Feature Set

The diagram below is from DigiLens on the ARGO and outlines the key features. I won’t review all the features, but I want to discuss some of their design choices. Also, I can’t comment on the quality of their various features (SLAM, WiFi, GPS, etc.) as A) I haven’t extensively tried them, and B) I don’t have the equipment or expertise. But at least on the surface, in terms of feature set, Argo compares favorably to the Hololens 1 and 2, albeit with a smaller FOV than the Hololens 2 but with much better image quality.

Audio Input for True Hands-Free Operation

As stated above, Digilens’ management team includes experience from RealWear. RealWear acquired a lot of technology from Kopin’s Golden-i. Like ARGO, Golden-i was a system product outgrowth from display component maker Kopin with a legacy before 2011 when I first saw Golden-i. Even though Kopin was a display device company, Golden-i emphasized voice recognition with high accuracy even in noisy environments. Note the inclusion of 5 microphones on the ARGO.

Most realistic enterprise-use models for AR headsets include significant, if not exclusively, hands-free operation. The basic idea of mounting a display on the user’s head is so they can keep their hands free. You can’t be working with your hands and have a controller in your hand.

While hand tracking cameras remove the need for the physical controller, they do not free up the hands as the hands are busy making gestures rather than performing the task with their hands. In the implementations I have tried thus far, gestures are even worse than physical controllers in terms of distraction, as they force the user to focus on the gestures to make it (barely sometimes) work. One of the most awful experiences I have had in AR was trying to type in a long WiFi password (with it hidden as I typed by asterisk marks) using gestures on a Hololens 1 (my hands hurt just thinking about it – it was a beyond terrible user experience).

Similarly, as I discussed with SadlyItsBradley about Meta’s BCI wristband, using nerve and/or muscle-detecting wristbands still does not free up the hands. The user still has their hands and mental focus slaved to making the wristband work.

Voice control seems to have big advantages for hands-free operation if it can work accurately in a noisy environment. There is a delicate balance between not recognizing words and phrases, false recognition or activation, and becoming too burdensome with the need for verification.

Skull-Gripping “Glasses” vs. Headband or Open Helmet

In what I see as a futile attempt to sort of look like glasses (big ugly ones at that), many companies have resorted to skull-gripping features. Looking at the skull profile (right), there really isn’t much that will stop the forward rotation of front-heavy AR glasses unless they wrap around the lower part of the occipital bone at the back of the head.

Both the ARGO (below left) and Panasonic’s (Shiftall division) VR headsets (right two images below) take the concept of skull-grabbing glasses to almost comic proportions. Panasonic includes a loop for the headband, and some models also include a forehead pad. The Panasonic Shiftall uses pads pressed against the front of the head to support the front, while the ARGO uses an oversized nose bridge as found on many other AR “glasses.”

ARGO supports a headband option, but it requires the ends of the temples with the skull-grabbers to be removed and replaced by a headband.

As anyone who knows anything about human factors with glasses knows, the ears and the nose cannot support much weight, and the ears and nose will get sore if much weight is supported for a long time.

Large soft nose pads are not an answer. There is still too much weight on the nose, and the variety of nose shapes makes them not work well for everyone. In the case of the Argo, the large nose pads also interfere with wearing glasses; the nose pads are located almost precisely where the nose pads for glasses would go.

Bustle/Bun on the Back Weight Distribution – Liberating the Design

As was pointed out by Microsoft with their Hololens 2 (HL2), weight distribution is also very important. I don’t know if they were the first with what I call “the bustle on the back” approach, but it was a massive improvement, as I discussed in Hololens 2 First Impressions: Good Ergonomics, But The LBS Resolution Math Fails! Several others have used a similar approach, most notably with the Meta Quest Pro VR (it has very poor passthrough AR, as I discussed in Meta Quest Pro (Part 1) – Unbelievably Bad AR Passthrough). Another feature of the HL2 ergonomics is that the forehead pad eliminates weight from the nose and frees up that area in support of ordinary prescription glasses.

The problem with the sort-of-glasses form factor so common in most AR headsets today is that it locks the design into other poor decisions, not the least of which is putting too much weight too far forward. Once it is realized that these are not really glasses, it frees up other design features for improvement. Weight can be taken out of the front and moved to the back for better weight distribution.

ARGO’s Eye-Relief Missed Opportunity for Supporting Normal Glasses

Perhaps the best ergonomic/user feature of the Hololens 1 & 2 over most other AR headsets is that they have enough eye relief (distance from the waveguide to the eye) and space to support most normal eyeglasses. The ARGO’s waveguide and optical design have enough eye relief to support wearing most normal glasses, but still, they require specialized inserts.

You might notice some “eye glow” in the CNET picture (above right). I think this is not from the waveguide itself but is a reflection off of the prescription inserts (likely, they don’t have good anti-reflective coatings).

A big part of the problem with supporting eyeglasses goes back to trying to maintain the fiction of a “glasses form factor.” The nose bridge support will get in the way of the glasses, but the nose bridge support is required to support the headset. Additionally, hardware in the “brow” over the eyes, which may interfere with glasses, could have been moved elsewhere.

Another technical issue is the location and shape of their optical engine. As discussed earlier, the Digilens engine shape causes issues with jutting into the front of glasses, resulting in a large brow over the eyes. This brow, in turn, may interfere with various eyeglasses.

It looks like Argo started with the premise of looking like glasses, putting form ahead of function. As it turns out, they have what for me is an unhappy compromise that neither looks like glasses nor has the Hololens 2 advantage of working with most normal glasses. Starting with comfort and functionality as the primary goals would have also led to a different form factor for the optical engine.

Conclusions

While MicroLEDs may hold many long-term advantages, they are not ready to go head-to-head with LCOS engines regarding image quality and color. Multiple companies are showing LCOS engines that are more than competitive in size and shape with the small MicroLED engines. The LCOS engines also support much higher resolutions and larger FOVs.

Lumus, with their Z-Lens 2-D reflective waveguides, seems to have a big advantage in image quality and efficiency over the many diffractive waveguides. Allowing the Z-lens to be encased without an air gap adds another significant advantage.

Yet today, most waveguide-based AR glasses use diffractive waveguides. The reasons include that there are many sources of diffractive waveguides and that companies can make their own custom designs. In contrast, Lumus controls its reflective waveguide I.P. Additionally, Lumus has only recently developed 2-D reflective waveguides, dramatically reducing the size of the projection engine driving their waveguides. But the biggest reason for using diffractive waveguides is that Lumus waveguides are thought to be more expensive; Lumus and their new manufacturing partner Schott Glass claim that they will be able to make waveguides at competitive or better costs.

A combination of cost, color, and image quality will likely limit MicroLEDs to use in ultra-small and light glasses with low amounts of visual content, known as “data snacking” (think arrows and simple text, not web browsing and movies). This market could be attractive in enterprise applications. I’m doubtful that consumers will be very accepting of monochrome displays. I’m reminded of a quote from an IBM executive in the 1980s who, when asked whether resolution or color was more important, said: “Color is the least necessary and most desired feature in a display.”

Not to pick on Argo, but it demonstrates many of the issues with making a full-featured device in a glasses form factor: as SLAM (with multiple spatially separated cameras), processing, communication, batteries, etc., are added, the overall design strays away from looking like glasses. It is as I wrote in my 2019 article, Starts with Ray-Ban®, Ends Up Like Hololens.


KGOnTech
MicroLEDs with Waveguides (CES & AR/VR/MR 2023 Pt. 7)
13 March 2023 at 01:54

By: Karl Guttag

Introduction

My coverage of CES and SPIE AR/VR/MR 2023 continues, this time on MicroLEDs. MicroLED companies were abundant in the booths, talks, and private conversations at AR/VR/MR 2023.

The list on the right shows some of the MicroLED companies I have looked at in recent years. Marked with a blue asterisk “*” are companies I talked to at AR/VR/MR 2023, with Jade Bird Display (JBD), PlayNitride, Porotech, and MICLEDI having booths in the exhibition. The green bracket on the left indicates companies where I had seen a MicroLED display generating an image (not just one or a few LEDs). Inside the gold rectangle in the list above are MicroLED companies that system companies have bought. MicroLEDs are the display technology where tech giants Meta, Apple, and Google place their bets for the future.

A much more extensive list of companies involved in MicroLED development can be found at microled-info.com, a site dedicated to tracking the MicroLED industry. Microled-info’s parent company, Metalgrass, also organized the MicroLED Association, and I spoke at their Feb. 7th Webinar (but you have to join the association to see it).

The efficiency of getting the Lambertian light that most LEDs emit through a waveguide to the eye is a major issue I have studied for years and will be covered first. Then after covering recent MicroLED prototypes and discussions, I have included an appendix with background information in the subsections “What is a MicroLED company,” “Microdisplay vs. Direct View Pixel Sizes,” and “Multicolor, Full Color, or True Color.”

MicroLEDs and Waveguides; Millions of Nits-In to Thousands of Nits-Out with Waveguides

When first hearing of MicroLEDs outputting millions of nits, you might think it must be overkill to deliver thousands of nits to the eye for outdoor use with a waveguide. But due to pupil expansion and light losses, only a tiny fraction of the light-in makes it to the eye. The figure (right) diagrams the efficiency issues with waveguides using a diffractive waveguide.

Most LEDs output diffuse (roughly) Lambertian light, whereas waveguides require collimated light. Typically, micro-optics such as microlens arrays (MLA) on top of the MicroLEDs semi-collimate the light. These optics increase the nits; typically, the nits quoted for the MicroLED display are after micro-optics. A waveguide’s small entrance area severely limits the light due to a physics property known as “etendue,” causing it to be called “etendue loss.” Then there are the losses due to the pupil expansion/replication structures (diffraction gratings in the case of diffractive waveguides, semi-reflective “facets” in the case of reflective waveguides). Finally, the light-in from the small entrance area ends up spread out over the much larger exit area to support seeing the image over the whole FOV as the eye moves.

Multiple Headsets Using Diffractive Waveguides with JBD MicroLED

I found it an interesting dichotomy that while all the other prototypes I have seen using Jade Bird Display (JBD) MicroLEDs, including those from Vuzix, Oppo, TCL, Dispelix, and Waveoptics (before being acquired by Snap), use diffractive waveguides, JBD themselves showed a prototype 3-chip color cube projector with a Lochn “clone” (with lesser image quality) of a Lumus 2D expanding reflective waveguide in their booth (I was asked not to photograph it). Then, in the Playnitride booth, they featured Lumus reflective waveguides. I should note that while efficiency is a major factor, other design factors, including cost, will drive different decisions.

Reflective (Lumus) Waveguides are More Efficient than Diffractive Waveguides with MicroLEDs

According to Lumus, their 2-D reflective (Lumus) waveguides result in a 3 to 9 times larger entrance area, and their semi-reflective facets lose less light than diffraction gratings. The net result is that reflective waveguides can be 5 to >10 times more optically efficient than diffractive waveguides with the same microLEDs, a major advantage in brightness and power (= less heat and longer battery life). This efficiency advantage appears to have been playing out at AR/VR/MR 2023.

Playnitride prominently showed their MicroLEDs using Lumus 2D and older 1D reflective waveguides in their booth (below left and middle). Their full-color QD-MicroLEDs only output about 150K nits (compared to the millions of others’ single-color native LEDs), so they needed a more efficient waveguide. Playnitride uses Quantum Dot conversion of blue LEDs to give red and green.
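To illustrate why a dimmer full-color MicroLED benefits from a more efficient waveguide, the sketch below treats a waveguide as a simple “nits-out per nits-in” multiplier. This is a deliberately crude model with loudly hypothetical inputs: the 2,000,000-nit green MicroLED input is an assumed round number (not a figure from any company), the 3,000 nits out is the Vuzix claim quoted in Pt. 8 of this series, and differences in FOV, eyebox, and color are ignored. Only the 150K-nit figure and the 5x to 10x range come from the text above.

```python
def throughput_ratio(nits_out: float, nits_in: float) -> float:
    """Fraction of display nits that reach the eye through a waveguide."""
    return nits_out / nits_in

# HYPOTHETICAL baseline: a green MicroLED at 2,000,000 nits (assumed) yielding
# 3,000 nits to the eye through a diffractive waveguide (Vuzix's claim in Pt. 8).
HYPOTHETICAL_GREEN_NITS_IN = 2_000_000
DIFFRACTIVE_NITS_OUT = 3_000
diffractive_ratio = throughput_ratio(DIFFRACTIVE_NITS_OUT, HYPOTHETICAL_GREEN_NITS_IN)

PLAYNITRIDE_NITS_IN = 150_000   # full-color QD-MicroLED figure from the text
REFLECTIVE_GAIN_LOW = 5         # "5 to >10 times" from the text
REFLECTIVE_GAIN_HIGH = 10

diffractive_out = PLAYNITRIDE_NITS_IN * diffractive_ratio
reflective_out_low = diffractive_out * REFLECTIVE_GAIN_LOW
reflective_out_high = diffractive_out * REFLECTIVE_GAIN_HIGH

print(f"Assumed diffractive throughput: {diffractive_ratio:.2%}")
print(f"150K nits via diffractive:      ~{diffractive_out:.0f} nits to the eye")
print(f"150K nits via reflective:       ~{reflective_out_low:.0f} "
      f"to ~{reflective_out_high:.0f} nits to the eye")
```

Under these assumptions, the same 150K-nit display goes from a few hundred nits to the eye with a diffractive waveguide to something in the low thousands with a reflective one, which is the difference between indoor-only and plausibly outdoor-usable brightness.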

Lumus CTO Dr. Yochay Danziger brought a 2D expanding waveguide with input optics that he held up to Porotech’s MicroLEDs. I captured a quick handheld (and thus not very good) shot (with ND filters to reduce the very bright image) of Porotech’s green MicroLED via Lumus’s handheld waveguide (above right).

Lumus was the only company featured in the Schott Glass booth at AR/VR/MR 2023. The often-asked question about Lumus is whether its waveguides can be made in volume production. The Schott Glass representative assured me they could make Lumus’s 2-D waveguides in volume production.

I plan on covering Lumus’s new Z-Lens 2D waveguide, which is smaller than their two-year-old Maximus 2D waveguide, in an upcoming article. In the meantime, I discussed the Z-Lens in the CES 2023 Video with SadlyItsBradley.

Other Optics (ex., Bird Bath, Freeform, and VR-Pancake) and Micro-OLEDs

I want to note here that while MicroLEDs are hundreds to over a thousand times brighter than Micro-OLEDs, they are likely well more than five years away from having anywhere near the same color control and uniformity. Thus designs that favor image quality over brightness using optical designs that are much more efficient than waveguides, such as Bird Bath, Freeform, and VR-pancake optics, will continue to use Micro-OLEDs or LCDs for the foreseeable future. Micro-OLEDs are expected to continue getting brighter, with some claiming they have roadmaps to get to about 30K nits.

Jade Bird Display (JBD) Based AR Glasses

Jade Bird Display (JBD) is the only company I know to be shipping MicroLEDs in production. All working headsets I have seen use JBD’s 640×480 green (only) MicroLEDs, including ones from Vuzix (Ultralite and Shield), Oppo, and Waveoptics (shown in 2022 before being acquired by Snap). JBD is developing devices supporting higher pixel depth and higher resolution.

Also, as background to MicroLEDs in general, as well as JBD and the glasses using their MicroLEDs, there is my 2022 blog article AWE 2022 (Part 6) – MicroLED Microdisplays for Augmented Reality and the associated video with SadlyItsBradley. Additionally, there is my 2021 article on JBD and WaveOptics in News: WaveOptics & Jade Bird Display MicroLED Partnership.

The current green MicroLEDs support only 4 bits per pixel, or 16 (2⁴) brightness levels, and will show contour lines in what should be a smoothly shaded area. I hear that JBD’s future designs will support more levels. While I have seen continuous improvement in the pixel-to-pixel brightness differences through the year, and while they are the most uniform MicroLED devices I have seen, there is still visible “grain” in what should be a solid area.
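As a rough illustration of why 16 levels produce contour lines, the minimal Python sketch below quantizes a smooth 8-bit ramp down to 4 bits. The function name quantize and the 256-step ramp are assumptions made for illustration; this is not JBD's actual drive scheme.

```python
def quantize(value, bits=4, max_in=255):
    """Map an 8-bit value onto the nearest of 2**bits evenly spaced levels."""
    levels = 2 ** bits
    step = max_in / (levels - 1)        # spacing between displayable levels
    return round(round(value / step) * step)

ramp = list(range(256))                  # a smooth 8-bit gradient (illustrative)
banded = [quantize(v) for v in ramp]     # what a 4-bit panel could show

distinct = sorted(set(banded))
print(len(distinct))                     # 16 distinct output levels
print(distinct[1] - distinct[0])         # 17: each band spans 17 input codes
```

Each of the 16 output levels covers a band of roughly 17 input codes, and the visible edges between those bands are the contour lines.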

Vuzix

At CES 2023, Vuzix showed off the small size possible with their Ultralite™ glasses (left side below), which weigh only 38 grams (not much more than most conventional glasses). A tray full of display engines on public display was there to emphasize that they were in production. The comparison of light engines (below left) shows how compact the MicroLED green and color cube projector engines are compared with Vuzix’s older (but true color) DLP design with similar resolution. I discussed Vuzix’s Ultralite and Shield in the CES 2023 video with SadlyItsBradley.

The Vuzix Shield and Ultralite share the same small green MicroLED engine. The combination of the engine and Vuzix waveguide is capable of up to 4,100 nits, which is bright enough to enable outdoor use. The power consumption of MicroLEDs is roughly proportional to the average pixel value (APV). Paul Travers, CEO of Vuzix, says that the Ultralites consume very little power and can work for two days in typical use on a charge. Vuzix has also improved their in-house developed waveguides, significantly reducing the forward projection (“eye-glow”).
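To make the APV point concrete, here is a minimal Python sketch that estimates relative power for a frame under a purely linear model. The linear model, the function names, and the tiny 4x4 frames are assumptions for illustration; a real display also has fixed overhead (backplane, drivers) that this ignores.

```python
def average_pixel_value(frame, max_value=15):
    """Mean pixel value of a frame as a fraction of full scale (0.0 to 1.0)."""
    pixels = [p for row in frame for p in row]
    return sum(pixels) / (len(pixels) * max_value)

def relative_power(frame, max_value=15):
    """Assumed linear model: power relative to an all-pixels-at-maximum frame."""
    return average_pixel_value(frame, max_value)

# Hypothetical 4x4 frames of 4-bit values (0 to 15)
sparse = [
    [0, 0, 0, 0],
    [0, 15, 15, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
full = [[15] * 4 for _ in range(4)]

print(relative_power(sparse))  # 0.125
print(relative_power(full))    # 1.0
```

Under this assumption, typical AR content with a few lit pixels on a mostly black frame draws a small fraction of the full-white power, which fits the long battery life claimed.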

Vuzix has been very involved with several MicroLED companies, as discussed with SadlyItsBradley in our AWE 2022 Video.

Oppo

At AR/VR/MR 2023, Oppo showed me their JBD green MicroLED-based glasses with a form factor similar to the Vuzix Ultralite. The overall image quality and resolution seem similar on casual inspection. The Vuzix waveguide’s diffraction gratings seem less noticeable from the outside, but I have not compared them side by side in the same conditions.

TCL and JBD X-Cube Color

At CES 2023, TCL demonstrated a multicolor 3-Chip (R, G, and B) combined with an X-Cube prototype (using a Lochn reflective waveguide). Vuzix, in a 2020 concept video, and Meta (Facebook), in a 2019 patent application, have shown the use of three waveguides to combine the three primary colors (below right). I discussed the TCL glasses with JBD color X-Cube design and some of the issues with X-Cubes in the CES video with SadlyItsBradley.

The TCL glasses appear to be using a diffraction grating waveguide that is very different from others I have seen due to the way the exit grating has very big steps in the transmission of light (right). This waveguide differs from the reflective waveguide JBD was showing in their booth or other diffractive waveguides. I have seen diffractive waveguides that were non-uniform but never with such large steps in the output gratings. While I didn’t get a chance to see an image through the TCL glasses, the reports I got from others were that the image quality was not very good.

Goertek/Goeroptics Design and Manufacturing JBD Projection Engines

In the CES 2023 TCL video, I discussed some of the issues associated with X-Cube color combining and the problems with aligning the three panels. At the AR/VR/MR conference, the Goeroptics division of Goertek showed that they were making both green-only and Color X-Cube designs for JBD’s MicroLEDs (slide from their presentation below). While Goertek may not be a household name, they are a very large optics and non-optics designer and OEM for many famous brands, including giants such as Apple, Microsoft, Sony, Samsung, and Lenovo.

Porotech, Ostendo, and Innovation Semiconductor color tunable LEDs

I met Porotech in their private suite at CES and their booth at AR/VR/MR 2023. They have already received much attention on this blog in CES 2023 (Part 2) – Porotech – The Most Advanced MicroLED Technology, AWE 2022 (Part 6) – MicroLED Microdisplays for Augmented Reality, and my CES 2023 video with SadlyItsBradley on Porotech. They have been making a lot of news in the last year with their development of single-color InGaN red, green, and blue MicroLEDs and particularly their single emitter color tunable LED (what Porotech calls DynamicPixelTuning® or DPT®).

Below is a very short video I captured in the Porotech booth with a macro lens of their DynamicPixelTuning demo. I apologize for the camera changing focus when I switched from still to video mode and for the blooming due to the wide range of brightness as the color changes. The demo shows the whole display changing color, as Porotech does not have a backplane that can change colors pixel by pixel.

Porotech showed a combination of motion and color changing with their DynamicPixelTuning

At CES 2023, I was reminded by Ostendo, best known for their color-stacked MicroLED technology, that they had developed tunable color LEDs several years ago. Sure enough, six years ago, Ostendo presented the paper III-nitride monolithic LED covering full RGB color gamut in the Journal of the SPIE in February 2016. I have not seen evidence that Ostendo has pursued it beyond the single LED prototype stage, as Porotech has done with their DynamicPixelTuning.

The recent startup Innovation Semiconductor (below) is developing technology to integrate the control transistor circuitry into the InGaN substrate and avoid the more common hybrid InGaN and CMOS approach almost all others are using. They are also developing a “V-groove” technology for making color-tunable LEDs. Innovation Semi cites work by the University of California at Santa Barbara (see paper 1 and paper 2) plus their own work that suggests that V-grooves may be a more efficient way to produce color-tunable LEDs than the approach taken by Porotech and Ostendo.

A major concern I have with Innovation Semi’s approach to integrating the control transistors in GaN is whether they will be able to integrate enough control circuitry without making the devices too expensive and/or making the pixel size bigger.

PlayNitride (Blue with QD Conversion Spatial Color)

PlayNitride demonstrated its full-color MicroLED technology, which uses blue LEDs with Quantum Dot (QD) conversion to produce red and green. At 150K nits, they are extremely bright compared to Micro-OLEDs but are much less bright than native red, green, and blue MicroLEDs from companies including JBD and Porotech.

As discussed earlier, PlayNitride showed their MicroLEDs working with Lumus waveguides. But even though Lumus waveguides are more efficient than diffractive waveguides, 150K nits from the display are not bright enough for practical uses. They are about 1/10th the brightness of the native MicroLEDs of JBD and Porotech, and their pixels are bigger.

PlayNitride was the only company showing fairly high-resolution (1K by 1K and 1080P) full-color single-chip MicroLED microdisplays. These are only prototypes, however, and the green and red were substantially weaker than the blue, as seen in the direct (no waveguide) macro photograph of PlayNitride’s MicroLED below. Also, the red was more magenta (mixed red and blue).

Looking at the 2X zoom, one sees the “grain” associated with the pixel-to-pixel brightness differences in all colors common to all MicroLEDs demonstrated to date. Additionally, in the larger reddish wedge pointed at by the red arrow, there are color differences/grain at the pixel level.

Known issue with QD spatial color conversion and microdisplays

While quantum dot (QD) color conversion of blue and UV LEDs has been proposed as a method to make full-color MicroLEDs for many years, there are particular issues with using QD with very small microdisplay pixels. Normally the QD layer required for conversion stays roughly the same thickness as the pixels become smaller, resulting in a very tall stack of QD compared to the pixel size. It then requires some form of microscopic baffling to prevent the light from adjacent LEDs from illuminating the wrong color.

Some have tried using thinner layers of QD and then relied on color filters to “clean up” the colors, but this comes with significant losses in efficiency and issues with heat. There are also issues with how hard the QD material can be driven before it degrades, which will limit brightness. Using spatial color itself has the issue of pixel sizes becoming too big for use in AR.

Many of these issues will be very different for making larger direct-view and VR pixels. The thickness of the QD layers becomes a non-issue as the pixels get bigger, and spatial color has long been used with larger pixels. We have already seen where different OLED technologies have been used based on pixel size and application; for example, color-filtered OLEDs won out in large-screen TVs, whereas native color OLED subpixels are used in smartphones, smartwatches, and microdisplay OLEDs.

MICLEDI Reconstituted InGaN Wafers

MICLEDI, which had a booth at AR/VR/MR 2023, was spun out of the IMEC research institute in Belgium in 2019. They are fabless with a mix of MicroLED technologies they have developed (right). They claim to have single color per die, spatial color (colors side by side), and stacked color technology. They have also developed GaN and Aluminum Indium Gallium Phosphide (AlInGaP) red. After some brief discussions in their booth and going through their handout material, their MicroLEDs seem like a bit of a grab bag of technology for license without a clear direction.

The one technology that seems to set MICLEDI apart is for taking 100mm, 150mm, or 200mm GaN or AlInGaP epi wafers and making a “reconstituted” wafer with pick-and-placed known good dies. These reconstituted wafers can be “flip chipped” with today’s 300mm CMOS wafers. Today, almost all LED manufacturing is on much smaller wafers than mainstream production CMOS. For development today, companies are flipping small GaN wafers with spaced-out sets of LED arrays onto a larger CMOS wafer and throwing away most of the CMOS wafer.
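The wafer-size mismatch can be put into rough numbers. The Python sketch below computes what fraction of a 300mm CMOS wafer's area a smaller LED wafer could cover at most; it is a simple geometric upper bound under the assumption of one LED wafer per CMOS wafer, and it ignores edge exclusion and the spacing between LED arrays. The function names are illustrative.

```python
import math

def wafer_area_mm2(diameter_mm):
    """Area of a circular wafer in square millimeters."""
    return math.pi * (diameter_mm / 2) ** 2

def coverage_fraction(led_wafer_mm, cmos_wafer_mm=300):
    """Upper bound on the share of the CMOS wafer covered by one LED wafer."""
    return wafer_area_mm2(led_wafer_mm) / wafer_area_mm2(cmos_wafer_mm)

for diameter in (100, 150, 200):
    print(diameter, round(coverage_fraction(diameter), 3))
# 100 0.111
# 150 0.25
# 200 0.444
```

Even in the best case of a 200mm LED wafer, more than half of the 300mm CMOS wafer goes unused, which is the waste a reconstituted wafer is meant to avoid.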

Stacked MicroLEDs

While I didn’t see MIT at CES or AR/VR/MR 2023, MIT made news during AR/VR/MR with stacked color MicroLEDs. I don’t know the details, but it sounds similar to what Ostendo discussed, at least as far back as 2016 (see lower left). MICLEDI (above) also claims a stacked color LED technology alongside its spatial color technology, where the LEDs are side by side.

The obvious advantage of stacked color is that the full-color pixel is smaller. But the disadvantage is that the LEDs and other circuitry above block light from lower LEDs. The net result is that stacked LEDs will likely be much brighter than Micro-OLEDs but much less bright than other MicroLED technologies. Also concerning is that while red is the color with the least efficiency today, it seems to end up on the lowest layer.

With their mid-range brightness, stacked MicroLEDs would likely be targeted at non-waveguide optics designs. Ostendo has been developing its optical design, which tiles multiple small MicroLEDs to give a wider FOV.

Conclusions

Many giant and small companies are betting that MicroLEDs will be the future of MicroDisplay technology for VR and AR. At the same time, one should realize that none of the technologies is competitive today regarding image quality with Micro-OLED, LCOS, or DLP. There are many manufacturing and technical hurdles yet to be solved. Each of the methods for producing full-color MicroLEDs has advantages and disadvantages. The race in AR is to support full-color displays and higher resolution at high brightness, low power, and small size. I can’t see how multiple monochrome displays using X-Cubes, Waveguides, or other methods are long-term AR solutions.

I often warn people that if someone does a demo first, that does not mean they will be in production first. Some technical approaches will yield a hand-crafted one-off demo faster but are not manufacturable. The warning is doubly true when it comes to color MicroLEDs. It is easier to rule out certain approaches than to say which approach or approaches will succeed. For MicroDisplay MicroLEDs used in AR, I think native LEDs will win out over color-converted (ex., QD) blue LEDs. A different MicroLED technology will likely be better for direct-view displays.

It will be interesting to see the market adoption of the new small form factor but green-only AR glasses. While they meet the form factor requirement of looking like glasses with acceptable weight, they don’t have great vision correction solutions, and being green-only will limit consumer interest.

A continuing issue will be which optics work best with MicroLEDs. Part of this issue will be affected by the degree of collimation of the light from the LEDs. The 2-D reflective waveguides developed by Lumus have a significant efficiency advantage, but still, many more companies are using diffractive waveguides today.

Appendix: MicroLED Background Information

What is a MicroLED Company?

Having a successful MicroLED display is about more than making the LEDs; it is about making a complete display and the ability to control it accurately at an affordable cost.

What constitutes a “MicroLED company” varies widely from a completely fabless design company to one that might design and fab the LEDs, design the (typically) CMOS control backplane, and then do the assembly and electrical connection of the (typically) Indium Gallium Nitride (InGaN) LEDs onto the CMOS backplane. Almost every company has a different “flow” or order in which they assemble/combine various component technologies. For example, shown below is the flow given by JBD, where they appear to be applying the epi-layer to grow the LEDs on top of the CMOS wafer; other companies would form the LEDs first on the InGaN wafer and then bond the finished LED arrays onto the finished CMOS control devices.

There is no common approach, and there are as many different methods as there are companies, with some flows radically different from JBD’s. Greatly complicating matters is that most InGaN fabrication is done on 150mm to 200mm diameter wafers. In contrast, mainstream CMOS today is made on 300mm wafers, which leads to a variety of methods to address this issue, some of which are better suited to volume manufacturing than others.

Microdisplay vs. Direct View Pixel Sizes

What companies call MicroLED displays varies from wall-size monitors and TVs that can be more than a meter wide down to microdisplays typically less than 25mm in diagonal. As the table on the right shows, a small pixel on an AR microdisplay is about 300 to 600 times smaller than a pixel on a direct-view smartphone or smartwatch. Pixel sizes get closer when comparing waveguide-based AR to VR pixels.

VR headsets started with essentially direct-view cell phone-type displays with some cheap optics to enable the human eye to focus but have been driving the pixel size down to improve angular resolution. The latest trend is to use pancake optics which can use even smaller pixels to enable smaller headsets.

There is some “bridging” between AR and VR with display types. For example, large combiner “bug-eye” AR often uses direct-view type displays common in VR. Some pancake optics-based VR displays use the same Micro-OLED displays used with AR birdbath optics.

With the radically different pixel sizes, it should not be surprising that the best technology to support that pixel size could change. Small microdisplays used by waveguide-based AR require microdisplays with semiconductor (usually CMOS) transistors. TVs, smartphones, and smartwatches use various types of thin film transistors.

Particularly regarding supporting color with MicroLEDs, it should be expected that the technologies used for microdisplays could be very different from those used for direct-view type displays. For example, while quantum dot color conversion of blue or UV light might be a good method for supporting larger displays, it does not seem to scale well to the small pixel sizes used in AR.

Multicolor, Full Color, or True Color

While not “industry standard definitions,” for the sake of discussion, I want to define three categories of color display:

Multicolor – Provides multiple identifiable colors, including, at a minimum, the primary colors of red, green, and blue. This type of display is useful for providing basic information and color coding it. Photographic images will look cartoonish at best, and there are typically very visible “contour lines” in what should be smoothly shaded surfaces.
Full Color – This case supports a wide range of colors, and smooth surfaces will not have significant contours, but the color control across the display is not good enough for showing pictures of people.
True Color – The display is capable of reasonably accurate color control across the display. Importantly, faces and skin tones, to which human vision is particularly sensitive, look good. If a display is “true color,” it should also be able to control the “white point,” and whites will look white, and grays will be gray. There should be no visible contouring.

The images below are examples of “multicolor,” “full color,” and “true color” images.

It might seem to some that my definition of “full” versus “true” color is redundant, but I have seen many demonstrations through the years where the display can display color but can’t control it well. In 2012, I wrote Cynics Guide to CES – Glossary of Terms. I called this issue “Pixar-ized” because there were so many demos of cartoon characters showing color saturation but none showing humans, which requires accurate color control.

Pixar-ized – The showing of only cartoons because the device can’t control color well and/or has low resolution. People have very poor absolute color perception but tend to be very sensitive to skin tones and know what looks right when viewing humans, but the human visual system is very poor at judging whether the color is right in a cartoon. Additionally, it is very hard to tell resolution when viewing a cartoon.

I will add to the category above “artistic” false/shifted color images (see PlayNitride’s above). Sometimes this is done because the work to calibrate the prototype has not been completed, even though the display can eventually support full color. Still, it is often done to hide problems.

I should note that what can be acceptable to the eye with a single-color image can look very bad when combined with other colors. What are weak or dead pixels with a monochrome display will turn into colorized or color-shifted pixels that will stick out. Anyone with a single dead color within a pixel on a display has seen how the missing color sticks out. The images below are a simplified Photoshop (simulation) of what happens if random noise and dim areas occur in the various colors. The left image shows the effect on the full-color image, and the right image shows the same amount of random noise and dimming (in green) with the monochrome green (note, the image on the right is the grayscale image converted to green and not just the green channel from the true color image). In the green-only image, you can see some noise and a slight dimming that might not even be noticeable, whereas, in the color image, it turns into a magenta-colored area.
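The same effect can be reproduced with simple arithmetic on a single pixel. The Python sketch below dims only the green channel of a neutral gray pixel and compares the result with a green-only rendering of the same gray level. The 30% dimming factor, the simple-average grayscale conversion, and the function names are all assumptions for illustration and are not the settings used for the Photoshop simulation.

```python
def dim_green(pixel, factor):
    """Scale only the green channel of an (R, G, B) pixel."""
    r, g, b = pixel
    return (r, round(g * factor), b)

def to_green_only(pixel):
    """Monochrome green display: show the gray level on the green channel alone."""
    r, g, b = pixel
    gray = round((r + g + b) / 3)    # simple average, an assumed conversion
    return (0, gray, 0)

gray_pixel = (128, 128, 128)          # neutral mid-gray
factor = 0.7                          # hypothetical weak green emitter (30% dim)

color_result = dim_green(gray_pixel, factor)
mono_result = dim_green(to_green_only(gray_pixel), factor)

print(color_result)   # (128, 90, 128): red and blue exceed green, a magenta cast
print(mono_result)    # (0, 90, 0): still green, only somewhat dimmer
```

The identical defect is a hue shift in the color case but only a brightness change in the monochrome case, and the eye is far less forgiving of the former.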

In that same 2012 article, I wrote about “Stilliphobia,” the fear of showing still images. We are seeing that with displaying content that is very busy and/or with lots of motion to hide dead or weak pixels or random pixel values in the display. When I see a needlessly busy image or lots of motion, I immediately think they are trying to hide problems. Someone with a great-looking display should show pictures of people and smooth images for at least some content.

Most of today’s MicroLED displays are working on getting to multicolor displays and are far from true color. All MicroLED microdisplays I have seen to date have large pixel-to-pixel variations. No amount of calibration or mura correction will be enough to produce a good photographic image if the individual colors can’t be controlled accurately. The good news is that most of today’s AR applications only require a multicolor display.

How does LG’s see-through OLED work?

How does Samsung’s magical MicroLED work?

But what is a transparent display good for?

Introduction

Outline of the Video and Additional Information

02:55 RaonTech

04:01 May Display (LCOS)

04:16 Kopin’s Forth Dimension Displays (LCOS)

05:34 Texas Instruments (TI) DLP®

My Background at Texas Instruments:

07:25 VueReal MicroLED

08:26 MojoVision MicroLED

10:20 Porotech MicroLED

12:55 Brilliance Color Laser Combiner

14:24 TriLite/Trixel (Laser Combiner and LBS Display Glasses)

Next Time

Introduction

Influencing the Influencers On Apple Vision Pro

Linus Tech Tips (LTT)

Adam Savage’s Tested

Artur’s Tech Tales – Interview on AVP’s Optical Design

Linus Tech Tips on Apple Vision Pro’s Human Factors

It makes a mess of your face

Need for game controllers

Windows PC Gaming Video Mirroring via WiFi has Lag, Low Resolution, and Compression Artifacts

Warping effect of passthrough

Sharing documents is a pain.

Information Density – The AVP Delivers Effectively Multiple Large but Very Low-Resolution Monitors

Conclusion

Appendix Linus Comments on AVP’s “Weird Camera Shutter Angle”

Introduction – Sorry, But It’s True

Hiding the Screen Door Effect in Plain Sight with Blur

Full Image Pictures from the center 46 Degrees of the FOV

Medium Close-Up Comparison

Extreme Close-Up of AVP and MQ3

Experiment with Slightly Blurring the Meta Quest 3

AVP’s Issues with High-Resolution Content

AVP Lacks “Information Density”

ANSI-Like Contrast

The AVP Has Worse Color Uniformity than the MQ3

Conclusion and Comments

Introduction

Office Text Applications and “Information Density” – Font Size is Important

Meta Quest Pro Horizon Worktop Desktop Approach

Meta Quest Pro Virtual Versus Physical Monitor “Shootout”

The Shootout

Closeup Look at the Displays

Extrapolating to Apple Vision Pro

Reportedly, Apple Vision Pro Directly Rendering Fonts

Some More Evaluation of MQP’s Pancake Optics Using immersed Virtual Monitor

Some feedback on immersed (and all other VR/AR/MR) virtual monitor placement control.

Other Considerations and Conclusions in Part 5D

Appendix 1: Test Patterns

Appendix 2: Some More Background Information

More Comments on Font Sizes with Windows

How pictures were shot and moiré

Introduction

Back to the Future with Very Low Pixels Per Degree (ppd) with the Apple Vision Pro

The question is, “Would People?” Not “Could People?” Use an Apple Vision Pro (AVP) as a Computer Monitor

Note on Windows Scaling

Optics

Optical Distortion

Binocular Overlap and Rivalry

Inscribing a virtual landscape-oriented monitor uses about half of the vertical pixels of the headset.

Scaling text – 40+ Years of Computer Font Grid Fitting (“Cheating”) Exposed

Rendering Options: Virtual Monitors Fixed in 3-D Space Breaks the “Pixel Grid.”

Inscribing a virtual landscape-oriented monitor uses about half of the vertical pixels of the headset.

A simplified example of scaling text

Scintillating Text

Conclusion

1982

1986 and the Battle with Intel for Early Graphics Processor Dominance

Introduction

Apple Vision Pro’s Pancake Optical Design

Hypervision’s Field of View Analysis

Pixels Per Degree (ppd)

Eye Box

Hypervision

AR/VR/MR 2022 with Dual Fused Fresnel Lenses and 270°

05:34 Texas Instruments (TI) DLP®