
Claude AI to process secret government data through new Palantir deal

8 November 2024 at 23:08

Anthropic has announced a partnership with Palantir and Amazon Web Services to bring its Claude AI models to unspecified US intelligence and defense agencies. Claude, a family of AI language models similar to those that power ChatGPT, will work within Palantir's platform using AWS hosting to process and analyze data. But some critics have called out the deal as contradictory to Anthropic's widely-publicized "AI safety" aims.

On X, former Google co-head of AI ethics Timnit Gebru wrote of Anthropic's new deal with Palantir, "Look at how they care so much about 'existential risks to humanity.'"

The partnership makes Claude available within Palantir's Impact Level 6 environment (IL6), a defense-accredited system that handles data critical to national security up to the "secret" classification level. This move follows a broader trend of AI companies seeking defense contracts, with Meta offering its Llama models to defense partners and OpenAI pursuing closer ties with the Defense Department.



FBI says hackers are sending fraudulent police data requests to tech giants to steal people’s private information

8 November 2024 at 17:50

The warning is a rare admission from the FBI about the threat from fake emergency data requests submitted by hackers with access to police email accounts.


Meta beats suit over tool that lets Facebook users unfollow everything

8 November 2024 at 17:46

Meta has defeated a lawsuit—for now—that attempted to invoke Section 230 protections for a third-party tool that would have made it easy for Facebook users to toggle on and off their news feeds as they pleased.

The lawsuit was filed by Ethan Zuckerman, a professor at University of Massachusetts Amherst. He feared that Meta might sue to block his tool, Unfollow Everything 2.0, because Meta threatened to sue to block the original tool when it was released by another developer. In May, Zuckerman told Ars that he was "suing Facebook to make it better" and planned to use Section 230's shield to do it.

Zuckerman's novel legal theory argued that Congress always intended for Section 230 to protect third-party tools designed to empower users to take control over potentially toxic online environments. In his complaint, Zuckerman tried to convince a US district court in California that […]



‘For You’ feeds fail on election night, offering outdated information, angering users

6 November 2024 at 16:49

“For You” algorithms that promote the most interesting content across a social network, personalized to the individual user, offered a disjointed, outdated, and nearly unusable experience on election night in the U.S. as they highlighted hours-old posts that no longer reflected the current state of the race. Frustrations were particularly high on Threads, Meta’s X […]


Anticipating post-election drama, Meta extends political ad ban past Election Day

5 November 2024 at 17:40

Meta announced Tuesday that it is extending its restriction on political ads until later in the week, though it did not specify which day the ban would be lifted.  When asked for more information, a Meta spokesperson pointed TechCrunch to the announcement about the company working to “protect the integrity of elections on Facebook and […]


Meta Orion (Pt. 3 Response to Meta CTO on Eye Glow and Transparency)

29 October 2024 at 13:44

Introduction: Challenge from Bosworth Accepted

Several people pointed me to an interesting Instagram video AMA (ask me anything) by Meta CTO Andrew Bosworth on October 21, 2024, that appeared to challenge my October 6th article, Meta Orion AR Glasses (Pt. 1 Waveguide), which discussed both transparency and “Eye Glow” (what Bosworth referred to as “Blue Glow”). Challenge accepted.

On the right is a Google Search for “Meta” [and] “Orion” [and] “Eye Glow” OR “Blue Glow” from Sept 7th (when Orion was announced) through Oct 28, 2024. Everything pertinent to the issue was from this blog or was citing this blog. A Google search for “Meta Orion” and “blue glow” alone returns nothing.

As far as I can find, this blog (and a few other sites citing this blog) has been the only one reporting on Meta Orion’s transparency or Eye Glow. So when Bosworth said, “Another thing that was kind of funny about the reviews is people were like, oh, you know you can see the blue glow well,” who else could he be referring to?

Housekeeping – Parts 2 and 3 of the Snap and Orion Roundtable are to be released soon.

The rest of the two-hour roundtable discussion about Snap Spectacles and Meta Orion should be released soon. Part 2 will focus on Meta Orion. Part 3 will discuss more applications and market issues, along with some scuttlebutt about Meta’s EMG wristband controller.

Bosworth’s Statement on Transparency and Eye Glow in Instagram AMA Video – Indirect Shout Out to this Blog

Below is a computer transcription (with minor edits to clean up the speech-to-text and add punctuation and capitalization) of Bosworth’s October 21, 2024, AMA on Instagram, starting at about 14:22 into the video.

14:22 Question: What % of light does Orion block from your view of the world, how much is it darkened?

I don’t know exactly. So, all glass limits transmission to some degree. So, even if you have completely clear glasses, you know, maybe they take you from 100% transmission up your eyes like 97% um, and normal sunglasses that you have are much darker than you think they’re like 17% transmissive is like a standard for sunglasses.  Orion is clear. It’s closer [to clear], I don’t know what the exact number is, but it’s closer to regular prescription glasses than any kind of glasses [in context, he sounds like he is referring to other AR glasses]. There’s no tint on it [Orion]. We did put tint on a couple of demo units so we could see what that looked like, but that’s not how they [Orion] work.

I won’t get into the electrochromic and that kind of stuff.  Some people were theorizing that they were tinted to increase contrast. This is not uncommon [for AR] glasses. We’re actually quite proud that these were not. If I was wearing them, and you’re looking at my eyes, you would just see my eyes.

Note that Bosworth mentioned electrochromic [dimming] but “won’t get into it.” As I stated in Orion Part 1, I believe Orion has electrochromic (electrically controlled) dimming. While not asked, Bosworth gratuitously discusses “Blue Glow,” which in context can only mean “Eye Glow.”

Another thing that was kind of funny about the reviews is people were like, oh, you know you can see the blue glow well. What we noticed was so funny was the photographers from the press who were taking pictures of the glasses would work hard to get this one angle, which is like 15 degrees down and to the side where you do see the blue glow.  That’s what we’re actually shunting the light to. If you’re standing in front of me looking at my eyes, you don’t see the glow, you just see my eyes. We really worked hard on that we’re very proud of it.

But of course, if you’re the person who’s assigned by some journalist outfit to take pictures of these new AR glasses, you want to have pictures that look like you can see something special or different about them. It was so funny as every Outlet included that one angle. And if you look at them all now, you’ll see that they’re all taken from this one specific down and to the side angle.

As far as I can find (it’s difficult to search), this blog is the only place that has discussed the transparency percentage of Orion’s glasses (see: Light Transmission (Dimming?)). Also, as discussed in the introduction, this blog is the only one discussing eye glow (see Eye Glow) in the same article. Then, consider how asking about the percentage of light blockage caused Bosworth to discuss blue [eye] glow — a big coincidence?

But what caused me to write this article is the factually incorrect statement that the only place [the glow] is visible is from “15 degrees down and to the side.” He doth protest too much, methinks.

Orion’s Glow Is in Pictures Taken from More than “This One Specific Down and to the Side Angle”

To begin with, the image I show in Meta Orion AR Glasses (Pt. 1 Waveguide) is a more or less straight-on shot from a video by The Verge (right). It is definitely not shot from a “down and to the side angle.”

In fact, I was able to find images with Bosworth in which the camera was roughly straight on, down and to the side, and even looking down on the Orion glasses, both in Bosworth’s Sept. 25, 2024, Instagram video and in Adam Savage’s Tested video (far right below).

In the same video from The Verge, there is eye glow with Mark Zuckerberg looking almost straight on into the camera and from about eye level to the side.

The eye glow was even captured by the person wearing another Orion headset when playing a pong-like game. The images below are composites of the Orion camera and what was shown in the glasses; thus, they are simulated views (and NOT through the Orion’s waveguide). The stills are from The Verge (left) and CNBC (right).

Below are more views of the eye-glow (mostly blue in this case) from the same The Verge video.

The eye glow still frames below were captured from a CNBC video.

Here are a few more examples of eye glow that were taken while playing the pong-like game from roughly the same location as the CNBC frames above right. They were taken from about even with the glasses but off to the side.

In summary, there is plenty of evidence that the eye glow from Meta’s Orion can be seen from many different angles, not just from the one “down and to the side” angle that Bosworth claims.

Meta Orion’s Transparency and Electrochromic Dimming

Bosworth’s deflection on the question of Orion’s light transmission

Bosworth started by correctly saying that nothing manmade is completely transparent. A typical (uncoated) glass reflects about 8% of the light. Eyeglasses with good antireflective coatings reflect about 0.5%. The ANSI/ISEA Z87.1 safety glasses standard specifies “clear” as >85% transmission. Bosworth appears to catch himself, knowing that there is a definition for clear, and says that Orion is “closer to clear” than sunglasses at about 17%.

Bosworth then says there is “no tint” in Orion, but respectfully, that was NOT the question. He then says, “I won’t get into the electrochromic and that kind of stuff,” yet the electrochromic dimming is likely a major contributor to the light blocking. Any dimming technology I know of is going to block much more light than a typical waveguide. The transparency of Orion is a function of the waveguide, dimming layer, other optics layers, and the inner and outer protection covers.

Since Bosworth evaded answering the question, I will work through it and try to get an answer. The process will include trying to figure out what kind of dimming I think Orion uses.

What type of electrochromic dimming is Orion Using?

First, I want to put in context what my first article was discussing regarding Orion’s Light Transmission (Dimming?). I was well aware that diffractive waveguides, even glass ones, are by themselves typically about 85-90% transmissive. From various photographs, I’m pretty sure Orion has some form of electrochromic dimming, as I stated in the first article. I could see the dimming change in one video, and in a view of the exploded parts, there appeared to be a dimming device. In that figure, the dimming device seems fairly transparent, on the order of the waveguides and other flat optics. What I was trying to figure out was whether they were using the more common polarization-based dimming or a non-polarization-based technology. The picture is inconclusive as to the type of dimming used, as the device I identified as the dimmer might be only the liquid crystal part of the shutter, with the polarizers (if there are any) in the cover glass or not shown.

The Magic Leap 2 uses polarization-based dimming (see: Magic Leap 2 (Pt. 3): Soft Edge Occlusion, a Solution for Investors and Not Users). Polarization-based dimming is fast and gives a very wide dimming range (from 10:1 to >100:1), but it requires the real-world light first to be polarized, and when everything is considered, it blocks more than 70% of the light. It’s possible to get somewhat better transmission by using semi-polarizing polarizers, but that gives up a lot of dimming range to gain some transmission. Polarization also causes issues when looking at LCDs, such as computer monitors and some cell phones.
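As a back-of-the-envelope sketch of why polarization-based dimming costs so much light, here is a minimal light budget. The individual efficiencies are generic illustrative assumptions, not measured values for any particular product:

```python
# Rough budget for a polarization-based dimmer in its MOST transmissive state.
# All values are generic assumptions for illustration, not measurements.
front_polarizer = 0.42       # real polarizers pass ~42-45% of unpolarized light
lc_cell_and_analyzer = 0.80  # LC cell plus exit polarizer, fully "open"
other_films = 0.90           # ITO, adhesives, compensation films, etc.

transmission = front_polarizer * lc_cell_and_analyzer * other_films
print(f"Best-case transmission: {transmission:.0%}")      # ~30%
print(f"Light blocked:          {1 - transmission:.0%}")  # ~70%
# Roughly consistent with the "blocks more than 70% of the light" figure above.
```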

Non-polarization dimming (see, for example, CES & AR/VR/MR Pt. 4 – FlexEnable’s Dimming, Electronic Lenses, & Curved LCDs) blocks less light in its most transmissive state but has less of a dimming range. For example, FlexEnable has a dimming cell that ranges from ~87% transmissive down to 35%, or less than a 3:1 dimming range. Snap Spectacles 5 uses (based on a LinkedIn post that has since been removed) non-polarization-based electrochromic dimming from AlphaMicron, which they call e-Tint. Both AlphaMicron’s e-Tint and FlexEnable’s dimming use what is known as Guest-Host LC, which absorbs light rather than changing polarization.

Assuming Orion uses non-polarization dimming, I would assume that the waveguide and related optical surfaces have about 85-90% transmissivity and the non-polarization dimming about 70-80%. Since the two effects are multiplicative, that would put Orion in the range of 90% × 80% = 72% down to 85% × 70% ≈ 60%.
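To make the arithmetic explicit, here is a minimal sketch using the estimates above (85-90% for the waveguide and related optics, 70-80% for a non-polarization dimmer); these are my estimates, not Meta’s numbers:

```python
# Rough estimate of Orion's overall transparency. The ranges are my estimates
# from the text above, not Meta's numbers.
waveguide_stack = (0.85, 0.90)   # waveguide + related optical surfaces
dimmer = (0.70, 0.80)            # non-polarization dimmer, most transmissive state

# Transmissions in series multiply, so the combined range is:
low = waveguide_stack[0] * dimmer[0]    # 0.85 * 0.70
high = waveguide_stack[1] * dimmer[1]   # 0.90 * 0.80
print(f"Estimated Orion transparency: {low:.1%} to {high:.1%}")
# -> Estimated Orion transparency: 59.5% to 72.0%
```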

Orion’s Dimming

Below are a series of images from videos by CNET, The Verge, and Bloomberg. Notice that CNET’s image appears to be much more transmissive. On both the CNET and The Verge images, I included eye glow pictures from a few frames later in the video to prove that both pairs of glasses were turned on. CNET’s Orion glasses are significantly more transparent than in any other Orion video I have seen (from the more than 10 I have looked at to date), even when showing the same demos as the other videos. I missed this big difference when preparing my original article and only discovered it while preparing this one.

Below are some more frame captures on the top row. On the bottom row, for comparison, there are pictures of the Lumus Maximus (the most transparent waveguide I have seen), the WaveOptics Titan, the Magic Leap One (with no tint), and circular polarizing glasses. The circular polarizing glasses are approximately what I would expect if the Orion glasses were using polarizing dimming.

Snap Spectacles 5, which uses non-polarization dimming, is shown on the left. It compares reasonably well to the CNET image. Based on the available evidence, it appears that Orion is also using a non-polarization-based electrochromic dimming technology. Per my prior estimate, this would put Orion’s best-case (CNET) transparency in the range of 60-70%.

What I don’t know is why the glasses in the CNET video were so much more transparent than in the others, even though they appear to be in similar lighting. My best guess is that the dimming feature was adjusted differently or disabled for the CNET video.

Why is Orion Using Electronic Dimming Indoors?

All the Orion videos I have seen indicate that Orion is adding electrochromic dimming even when indoors. Even bright indoor lighting is much less bright than sunlight. Unlike Snap with the Spectacles 5 (which also has electronic dimming), Meta didn’t demo the unit outdoors. There can be several reasons, including:

  • The most obvious reason is the lack of display brightness.
  • For colors to “pop,” they need to be at least 8x brighter than the surroundings. Bright white objects in a well-lit room could be more than 50 nits. Maybe they couldn’t or didn’t want to go that bright for power/heat reasons.
  • Reduced heat of the MicroLEDs
  • Saves on battery life

Thinking about this issue made me notice that the walls in the demo room are painted a fairly dark color. Maybe it was a designer’s decision, but it also goes to my saying, “Demos Are a Magic Show,” and darker walls would make the AR display look better.

When this is added up, it suggests that the displays in the demos were likely outputting about 200 nits (just an educated guess). While ~200 nits would be a bright computer monitor, colors would be washed out in a well-lit room when viewed against a non-black background (monitors “bring their own black background”). Simply based on how they demoed it, I suspect that Snap Spectacles 5 is four to eight times brighter than Orion, with its dimming used to support working outdoors (rather than indoors).
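As a rough sanity check on that guess, here is a sketch using the numbers above: the ~8x “pop” rule, ~50-nit bright white objects in a well-lit room, my ~60% transparency estimate, and an assumed dimmed state of ~30% transmission. All of these are estimates or assumptions, not measured values:

```python
# Why dim the real world even indoors? All numbers are the rough estimates
# from the text (or assumptions), not measurements.
display_nits = 200   # educated guess at Orion's demo display brightness
white_object = 50    # bright white object in a well-lit room, in nits
pop_ratio = 8        # colors need ~8x the background luminance to "pop"

def virtual_to_background(transmission):
    """Ratio of the virtual image to the real-world background seen through the lens."""
    return display_nits / (white_object * transmission)

print(f"Undimmed (~60% transparent): {virtual_to_background(0.60):.1f}x")  # ~6.7x
print(f"Dimmed to ~30% transmission: {virtual_to_background(0.30):.1f}x")  # ~13.3x
# Without dimming, a ~200-nit display falls short of the ~8x target against a
# 50-nit white background, consistent with Meta adding dimming indoors.
```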

Conclusion and Comments

When I first watched Bosworth’s video, his argument that the eye glow could only be seen from one angle seemed persuasive. But then I went back to check and could easily see that what he stated was provably false. I’m left to speculate as to why he brought up the eye glow issue (as it was not the original question) and proceeded to give erroneous information. It did motivate me to understand Orion better😁.

Based on what I saw in the CNET picture and what is a reasonable assumption for the waveguide, non-polarizing dimmer, and other optics (with transparency being multiplicative and not additive), it pegs Orion in the 60% transparency range plus or minus about 5%.

Bosworth’s answer on transparency was evasive; saying there was no “tint” was a non-answer. He mentioned electrochromic dimming but didn’t say for sure that Orion was using it. In the end, he said Orion was closer to prescription glasses (which are about 90% transmissive uncoated and about 99.5% with anti-reflective coatings) than to sunglasses at 17%. If we take uncoated glasses at 90% and sunglasses at 17%, the midpoint between them would be about 53%, so Orion may be, at best, only slightly closer to uncoated eyeglasses than to sunglasses. There are waveguide-based AR glasses that are more transparent (but without dimming) than Orion.
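For reference, the “closer to which?” arithmetic, using the 90% and 17% figures above and my ~60% estimate (not a measured value):

```python
# Is ~60% closer to uncoated eyeglasses (~90%) or typical sunglasses (~17%)?
# The 60% figure is my estimate from earlier, not a measurement.
eyeglasses, sunglasses, orion_estimate = 0.90, 0.17, 0.60
midpoint = (eyeglasses + sunglasses) / 2
print(f"Midpoint: {midpoint:.1%}, Orion estimate: {orion_estimate:.0%}")
# -> Midpoint: 53.5%, Orion estimate: 60%
# 60% is only slightly above the midpoint, so "closer to prescription glasses
# than sunglasses" is technically true, but not by much.
```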

Bosworth was giving an off-the-cuff AMA rather than a formal presentation for a broad audience, so some level of generalization and the occasional goof is to be expected. While he danced around the transparency issue a bit, it is the “glow” statement and its specificity that I have more of an issue with.

Even though Bosworth is the CTO and head of Meta’s Reality Labs, his background is in software, not optics, so he may have been ill-informed rather than deliberately misleading. I generally find him likable in the videos, and he shares a lot of information (while I have met many people from Meta’s Reality Labs, I have not met Bosworth). At the same time, it sounds to my ear that when he discusses optics, he is parroting things he has been told, sometimes without fully understanding what he is saying. This is in sharp contrast to, say, Hololens’ former leader, Alex Kipman, who I believe out-and-out lied repeatedly.

Working on this article caused me to reexamine what Snap Spectacles was using for dimming. In my earlier look at AlphaMicron, I missed that AlphaMicron’s “e-Tint®” was a Guest Host dimming technology rather than a polarization-based one.

From the start, I was pretty sure Orion was using electrochromic dimming, but I was not sure whether it was polarization or non-polarization-based. In working through this article, I’m now reasonably certain it is a non-polarization-based dimming technology.

Working through this article, I realized that the available evidence also suggests that Orion’s display is not very bright. I would guess less than 200 nits, or at least they didn’t want to drive it brighter than that for very long.

Appendix: Determining the light blocking from videos is tricky

Human vision has a huge dynamic range and automatically adjusts as the light varies. As Bosworth noted, typical sunglasses transmit well under half of the light (he cited about 17%), yet they don’t seem that dark to the wearer. Human perception of brightness is roughly binary-logarithmic, so if there is plenty of available light, most people will barely notice a 50% dimming.

When wearing AR glasses, a large percentage (for some AR headsets, nearly all) of the light needed to view the eye will pass through the AR lens optics twice (in and back out). Because light blocking in series is multiplicative, this can cause the eyes to look much darker than what the person perceives when looking through them.

I set up a simple test using a WaveOptics waveguide, which is ~85% transmissive; circular polarizing glasses (for 3-D movies) that were 33% transmissive; and a Magic Leap One waveguide (out of its frame) that was 70% transmissive. In the upper right, I show a few examples where I held a piece of white paper far enough away from the lens that the lens did not affect the illumination of the paper. On the lower right, I moved the paper up against the lens so that the paper was primarily illuminated via the lens, to demonstrate the light-blocking-squared effect.
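Here is a minimal sketch of that double-pass (squared) effect for the three lenses in the test, using the transmission values stated above; the “stops” figure is just log2 of the light loss to show how perception compresses it:

```python
import math

# The light illuminating the eye passes through the lens twice (in and back
# out), so the eye appears darkened by T^2 rather than the wearer's one-pass T.
# Transmission values are the ones quoted for the test setup above.
lenses = {
    "WaveOptics waveguide": 0.85,
    "Magic Leap One waveguide": 0.70,
    "Circular polarizer (3-D glasses)": 0.33,
}

for name, t in lenses.items():
    double_pass = t * t
    stops = math.log2(1 / double_pass)  # perceptual "stops" of darkening
    print(f"{name:33s} one-pass {t:.0%}, eye appears at {double_pass:.0%} (~{stops:.1f} stops)")
# The 33% polarizer leaves the eye at ~11% illumination (over 3 stops), which
# looks far darker than the wearer's own one-pass view through it.
```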

Orion’s Silicon Carbide (SiC) is not significantly more transparent than glass. Most of the light blocking in a diffractive waveguide comes from the diffraction gratings, optical coatings, and the number of layers. Considering that Orion is a “hero prototype,” with $5B in R&D expenses for only 1,000 units, its waveguide is probably more transparent than typical, but only by about 5%.

When looking at open glasses like Orion (unlike, say, Magic Leap or Hololens), the lenses block only part of the eye’s illumination, so you get something less than the square law effect. So, in judging the amount of light blocking, you also have to estimate how much light is getting around the lenses and frames.

Snap Spectacles 5 and Meta Orion Roundtable Video Part 1

24 October 2024 at 03:59

Introduction

On October 17th, 2024, Jason McDowall (The AR Show), Jeri Ellsworth (Tilt Five), David Bonelli (Pulsar), Bradley Lynch (SadlyItsBradley), and I recorded a 2-hour roundtable discussion about the recent announcements of the Snap Spectacles 5 and Meta Orion optical AR/MR glasses. Along the way, we discussed various related subjects, including some about the Apple Vision Pro.

I’m breaking the video into several parts to keep some discussions from being buried in a single long video. In this first part, we primarily discuss the Snap Spectacles 5 (SS5). The SS5 will be discussed some more in the other parts, which will be released later. We also made some comments on the Apple Vision Pro, which Bradley Lynch and I own.

The 2-hour roundtable is being released in several parts, with AR Roundtable Part 1 Snap Spectacles 5 and some Apple Vision Pro being the first to be released.

0:00 Introduction of the panelists

Jason McDowall, as moderator, gets things going by having each panelist introduce themselves.

2:11 See-Through versus Passthrough Mixed Reality

I gave a very brief explanation of the difference between see-through/optical AR/MR and passthrough MR. The big point is that with see-through/optical AR/MR, the view of the real world is most important, whereas with passthrough MR, the virtual world is most important, with the camera’s view of the real world augmenting it.

5:51 Snap Spectacles 5 (SS5) experience and discussion

Jason McDowall had the opportunity to get a demo of the Snap Spectacles 5, followed by a discussion among the panelists. Jason has a more detailed explanation of his experience, and an interview with Sophia Dominguez, the Director of AR Platform Partnerships and Ecosystem at Snap, on his podcast.

11:59 Dimming (light blocking) with optical AR glasses

Jason noted the dimming feature of the SS5, and this led to the discussion of the need for light blocking with see-through AR.

19:15 See-through AR is not well suited for watching movies and TV

I make the point that see-through AR is not going to be a good device for watching movies and TV.

19:54 What is the application?

We get into a discussion of the applications for see-through AR.

20:23 Snap’s motivation? And more on applications

There is some discussion about what is driving Snap to make Spectacles, followed by more discussion of applications.

22:35 What are Snap’s and Meta’s motivations?

The panelists give their opinions on what is motivating Snap and Meta to enter the see-through AR space.

23:31 What makes something “portable?”

David makes the point that if AR glasses are not all-day wearable, then they are not very portable. When you take them off, you have fragile things to protect in a case that is a lot bigger and bulkier than a smartphone you can shove in your pocket.

24:13 Wearable AI (Humane AI and Rabbit)

Many companies are working on “AI wearable” devices, and we know many companies are looking to combine a small-FOV display (typically 25-35 degrees) with audio “AI” glasses.

24:40 Reviewers/Media Chasing the Shiny Object (Apple Vision Pro and Meta Orion)

25:45 Need for a “$99 Google Glass”

Jeri liked Google Glass and thinks there is a place for a “$99 Google Glass”-like product in the market. David adds some information about the economics of ramping up production of the semi-custom display that Google Glass uses. I (Karl) then discuss some of the ecosystem issues of making a volume product.

27:28 Apple Vision Pro discussion

Brad Lynch uses his Apple Vision Pro daily and has even replaced his monitor with the AVP. He regularly used the “Personas” (avatars) when talking with co-workers and others in the VR community, but he now refrains from using them when talking with others “out of respect.” I have only used mine occasionally since doing my initial evaluation for this blog.

29:10 Mixed Reality while driving (is a bad idea)

Jeri brings up the “influencers” who bought an Apple Vision Pro (and likely returned it within the two-week return window) to make a viral YouTube video of themselves driving around in a Cybertruck. We then discuss how driving around this way is dangerous.

Next Video – Meta Orion

In the next video in this series, we discuss Meta Orion.

Meta Orion AR (Pt. 2 Orion vs Wave Optics/Snap and Magic Leap Waveguides)

17 October 2024 at 14:50

Update (Oct. 19th, 2024)

While the general premise of this article, that Meta Orion is using waveguide technology similar to that of Snap (Wave Optics) and the Magic Leap 2, is correct, it turns out that a number of my assumptions about the specifics of what the various companies actually used in their products were incorrect. One of my readers (who wishes to remain anonymous), with deep knowledge of waveguides, responded to my request for more information on the various waveguides. This person has both theoretical knowledge of waveguides and knowledge of what Meta Orion, Wave Optics (now Snap), Magic Leap 2, and Hololens 2 used.

My main error about the nature of waveguide “grating” structures was a bias toward linear gratings, with which I was more familiar. I overlooked the possibility that Wave Optics was using a set of “pillar” gratings that act like a 2D set of linear gratings.

A summary of the corrections:

  1. Hololens 2 had a two-sided waveguide. The left and right expansion gratings are on opposite sides of the waveguide.
  2. Prior Wave Optics (Snap) waveguides use a pillar-type 2-D diffraction grating on one side, with a single waveguide for full color. The new Snap Spectacles 5 is likely (not 100% sure) using linear diffraction gratings on both sides of a single full-color waveguide, as shown in this article.
  3. Magic Leap Two uses linear diffraction gratings on both sides of the waveguide. It does use three waveguides.

The above corrections indicate that Meta Orion, Snap Spectacles 5 (Wave Optics), and Magic Leap all have overlapping linear gratings on both sides. Meta Orion and Snap likely use a single waveguide for full color, whereas the Magic Leap 2 has separate waveguides for the three primary colors.

I’m working on an article that will go into more detail and should appear soon, but I wanted to get this update out quickly.

Introduction and Background

After my last article, Meta Orion AR Glasses (Pt. 1 Waveguides), I got to thinking that the only other diffractive waveguide I have seen with 2-D (X-Y) expansion and exit gratings, like those used in Meta’s Orion, was from Wave Optics (purchased by Snap in May 2021).

The unique look of Wave Optics waveguides is how I easily identified that Snap was using them before it was announced that Snap had bought Wave Optics in 2021 (see Exclusive: Snap Spectacles Appears to Be Using WaveOptics and [an LCOS] a DLP Display).

I then wondered what Magic Leap Two (ML2) did to achieve its 70-degree FOV and uncovered some more interesting information about Meta’s Orion. The more I researched ML2, the more similarities I found with Meta’s Orion. What started as a short observation that Meta Orion’s waveguide appears to share commonality with Snap (Wave Optics) waveguides ballooned up when I discovered/rediscovered the ML2 information.

Included in this article is some “background” information from prior articles to help compare and contrast what has been done before with what Meta’s Orion, Snap/Wave Optics, and Magic Leap Two are doing.

Diffractive Waveguide Background

I hadn’t previously looked in any detail at how Wave Optics’ diffraction gratings worked differently. All other diffraction grating waveguides I had seen before (I don’t know about holographic ones) used three (or four) separate gratings on the same surface of the glass: an entrance grating, a first expansion and turning grating, and then a second expansion and exit grating. The location of the first expansion grating, and whether it was horizontal or vertical, varied with different waveguides.

Hololens 2 had a variation with left and right horizontal expansion and turning gratings and a single exit grating to increase the field of view. Still, all the gratings were on the same side of the waveguide.

Diffraction gratings bend light based on wavelength, similar to a prism. But unlike a prism, a grating bends the light into a series of “orders.” With a diffractive waveguide, only the light from one of these orders is used; the rest of the light is not only wasted but can cause problems, including “eye glow,” and can reduce the contrast of the overall system.

Because diffraction is wavelength-based, it bends different colors/wavelengths by different amounts. This causes issues when sending more than one color through a single waveguide/diffraction grating. These problems are compounded as the size of the exit grating and the FOV increase. Several diffractive waveguide companies use one (full color) or two (red+blue and blue+green) waveguides for smaller FOVs and then use three waveguides for wider FOVs.
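To illustrate why a single grating struggles with full color over a wide FOV, below is a small sketch of the standard first-order grating equation for light coupled into a waveguide at normal incidence. The grating pitch and refractive indices are illustrative assumptions only, not the actual design values of Orion or any other product:

```python
import math

# First-order diffraction into a waveguide at normal incidence:
#   n * sin(theta) = wavelength / pitch
# Different wavelengths diffract to different angles, which is the root of the
# single-waveguide full-color problem. Pitch and indices below are illustrative
# assumptions, not actual design values.
pitch_nm = 380.0
wavelengths_nm = {"blue": 460.0, "green": 530.0, "red": 620.0}

for n, material in [(1.8, "high-index glass"), (2.6, "silicon carbide")]:
    critical = math.degrees(math.asin(1.0 / n))  # minimum angle for TIR
    print(f"\n{material} (n={n}): TIR critical angle ~{critical:.1f} deg")
    for color, wl in wavelengths_nm.items():
        theta = math.degrees(math.asin(wl / (n * pitch_nm)))
        print(f"  {color:5s}: propagates at {theta:.1f} deg inside the waveguide")
# Note how the red-to-blue angular spread shrinks at the higher index, one
# reason a high-index substrate makes a single full-color, wide-FOV waveguide
# more tractable.
```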

For more information, see Quick Background on Diffraction Waveguides; MicroLEDs and Waveguides: Millions of Nits-In to Thousands of Nits-Out with Waveguides; and Magic Leap, HoloLens, and Lumus Resolution “Shootout” (ML1 review part 3).

Meta Orion’s and Wave Optics Waveguides

I want to start with a quick summary of Orion’s waveguide, as the information and figures will be helpful in comparing it to those of Wave Optics (owned by Snap and used in Snap’s Spectacles AR glasses) and the ML2.

Summary of Orion’s waveguide from the last article

Orion’s waveguide appears to be using a waveguide substrate with one entrance grating per primary color and then two expansion and exit/output gratings. The two (crossed) output gratings are on opposite sides of the Silicon Carbide (SiC) substrate, whereas most diffractive waveguides use glass, and all the gratings are on one side.

Another interesting feature, shown in the patents and discussed by Meta CTO Bosworth in some of his video interviews about Orion, is “Disparity Correction,” which has an extra grating used by other optics and circuitry to detect whether the waveguides are misaligned. This feature is not supported in Orion, but Bosworth says it will be included in future iterations, which will also move the input grating to the “eye side” of the waveguide. As shown in the figure below, and apparently in Orion, light enters the waveguide from the side opposite the eyes. Since the projectors are on the eye side (in the temples), they require some extra optics, which, according to Bosworth, make the Orion frames thicker.

Wave Optics (Snap) Dual-Sided 2D Expanding Waveguide

Wave Optics US patent application 2018/0210205 is based on the first Wave Optics patent from international application WO/2016/020643, first filed in 2014. FIG. 3 (below) shows a 3-D representation of a diffraction grating arrangement with an input grating (H0) and cross gratings (H1 and H2) on opposite sides of a single waveguide substrate.

The patent also shows that the cross gratings (H1 and H2) are on opposite sides of a single waveguide (FIG. 15B above) or one side of two waveguides (FIG. 15A above). I don’t know if Wave Optics (Snap) uses single- or double-sided waveguides in its current designs, but I would suspect it is double-sided.

While on the subject of Wave Optics waveguide design, I happen to have a picture of a Wave Optics 300mm glass wafer with 24 waveguides (right). I took the picture in the Schott booth at AR/VR/MR 2020. In the inset, I added Meta’s picture of the Orion 100mm SiC wafer, roughly to scale, with just four waveguides.

By the way, in my May 2021 article Exclusive: Snap Spectacles Appears to Be Using WaveOptics and [an LCOS] a DLP Display, I assumed that Spectacles would be using LCOS in 2021 since WaveOptics was in the process of moving to LCOS when they were acquired. I was a bit premature, as it took until 2024 for Spectacles to use LCOS.

In my hurry to put the information together and dig for connections, it looked to me like WaveOptics would be using an LCOS microdisplay. As I pointed out, WaveOptics had been moving away from DLP to LCOS with their newer designs. Subsequent information suggests that WaveOptics was still using their much older DLP design. It is still likely that future versions will use LCOS, but the current version apparently does not.

Magic Leap

Magic Leap One (ML1) “Typical” Three Grating Waveguide

This blog’s first significant article about Magic Leap was in November 2016 (Magic Leap: “A Riddle Wrapped in an Enigma”). Since then, Magic Leap has been discussed in about 90 articles. Most other waveguide companies coaxially input all colors from a single projector. However, even though the ML1 had a single field sequential color LCOS device and projector, the LED illumination sources are spatially arranged so that the image from each color output is sent to a separate input grating. ML1 had six waveguides, three for each of the two focus planes, resulting in 6 LEDs (two sets of R, G, & B) and six entrance gratings (see: Magic Leap House of Cards – FSD, Waveguides, and Focus Planes).

Below is a diagram that iFixit developed jointly with this blog. It shows a side view of the ML1 optical path. The inset picture in the lower right shows the six entrance gratings of the six stacked waveguides.

Below left is a picture of the (stack of six) ML1 waveguides showing the six entrance gratings, the large expansion and turning gratings, and the exit gratings. Other than having spatially separate entrance gratings, the general design of the waveguides is the same as most other diffractive gratings, including the Hololens 1 shown in the introduction. The expansion gratings are mostly hidden in the ML1’s upper body (below right). The large expansion and turning grating can be seen as a major problem in fitting a “typical” diffractive waveguide into an eyeglass form factor, which is what drove Meta to find an alternative that goes beyond the ML1’s 50-degree FOV.

Figure 18 from US application 2018/0052276 diagrams the ML1’s construction. This diagram is very close to the ML1’s construction down to the shape of the waveguide and even the various diffraction grating shapes.

Magic Leap Two (ML2)

The ML1 failed so badly that very few were interested in the ML2 compared to the ML1. There is much less public information about the second-generation device, and I didn’t buy an ML2 for testing. I have covered many of the technical aspects of ML2, but I haven’t studied the waveguide before. With the ML2 having a 70-degree FOV compared to the ML1’s 50-degree FOV, I became curious about how they got it to fit.

To start with, the ML2 eliminated the ML1’s support for two focus planes. This cut the waveguides in half and meant that the exit grating of the waveguide didn’t need to change the focus of the virtual image (for more on this subject, see: Single Waveguide Set with Front and Back “Lens Assemblies”).

Looking through the Magic Leap patent applications, I turned up US 2018/0052276 to Magic Leap, which shows a 2-D combined exit grating. US 2018/0052276 is what is commonly referred to in the patent field as an “omnibus patent application,” which combines a massive number of concepts (the application has 272 pages) in a single application. The application starts with concepts in the ML1 (including the just prior FIG 18) and goes on to concepts in the ML2.

This application, loosely speaking, shows how to take the Wave Optics concept of two crossed diffraction gratings on different sides of a waveguide and integrate them onto the same side of the waveguide.

Magic Leap patent application 2020/0158942 describes in detail how the two crossed output gratings are made. It shows the “prior art” (Wave Optics and Meta Orion-like) method of two gratings on opposite sides of a waveguide in FIG. 1 (below). The application then shows how the two crossed gratings can be integrated into a single grating structure. The patent even includes scanning electron microscope photos of the structures Magic Leap had made (ex., FIG 5), which demonstrates that Magic Leap had gone far beyond the concept stage by the time of the application’s filing in Nov. 2018.

I then went back to pictures I took of Magic Leap’s 2022 AR/VR/MR conference presentation (see also Magic Leap 2 at SPIE AR/VR/MR 2022) on the ML2. I realized that the concept of a 2D OPE+EPE (crossed diffraction gratings) was hiding in plain sight as part of another figure, thus confirming that ML2 was using the concept. The main topic of this figure is “Online display calibration,” which appears to be the same concept as Orion’s “disparity correction” shown earlier.

The next issue is whether the ML2 used a single input grating for all colors and whether it used more than one waveguide. It turns out that these are both answered in another figure from Magic Leap’s 2022 AR/VR/MR presentation shown below. Magic Leap developed a very compact projector engine that illuminates an LCOS panel through the (clear) part of the waveguides. Like the ML1, the red, green, and blue illumination LEDs are spatially separated, which, in turn, causes the light out of the projector lens to be spatially separated. There are then three spatially separate input gratings on three waveguides, as shown.

Based on the ML2’s three waveguides, I assumed it was too difficult or impossible to support the “crossed” diffraction grating effect while supporting full color in a single wide FOV waveguide.

Summary: Orion, ML2, & Wave Optics Waveguide Concepts

Orion, ML2, and Wave Optics all have some form of two-dimensional pupil expansion using overlapping diffraction gratings. By overlapping the gratings, they reduce the size of the waveguide considerably compared to the more conventional approach with three spatially separate diffraction gratings on a single surface.

To summarize:

  • Meta Orion – “Crossed” diffraction gratings on both sides of a single SiC waveguide for full color.
  • Snap/Wave Optics – “Crossed” diffraction gratings on both sides of a single glass waveguide for full color. Alternatively, “crossed” diffraction waveguides on two glass waveguides for full color (I just put a request into Snap to try and clarify).
  • Magic Leap Two – A single diffraction grating that acts like a crossed diffraction grating on high index (~2.0) glass with three waveguides (one per primary color).

The above is based on the currently available public information. If you have additional information or analysis, please share it in the comments, or if you don’t want to share it publicly, you can send a private email to newsinfo@kgontech.com. To be clear, I don’t want stolen information or any violation of NDAs, but I am sure there are waveguide experts who know more about this subject.

What about Meta Orion’s Image Quality?

I have not had the opportunity to look through Meta’s Orion or Snap Spectacles 5 and have only seen the ML2 in a canned demo. Unfortunately, I was not invited to demo Meta’s Orion, much less given access to one for evaluation (if you can help me gain (legal) access, contact me at newsinfo@kgontech.com).

I have tried the ML2 a few times. However, I have never had the opportunity to take pictures through the optics or use my test patterns. From my limited experience with the ML2, it is much better in terms of image quality than the ML1 (which was abysmal – see Magic Leap Review Part 1 – The Terrible View Through Diffraction Gratings), but it still has significant issues with color uniformity, like other wide (>40-degree) FOV diffractive waveguides. If someone has an ML2 that I can borrow for evaluation, please get in touch with me at newsinfo@kgontech.com.

I have been following Wave Optics (now Snap) for many years and have a 2020-era Titan DLP-based 40-degree FOV Wave Optics evaluation unit (through-the-optics picture below). I would consider the Wave Optics Titan a “middle of the pack” diffractive waveguide for that time (I had seen better and worse). I have seen what seem to be better diffractive waveguides before and since, but it is hard to compare them objectively, as they have different FOVs and I was not able to use my own content, only curated demo content. Wave Optics seemed to be showing better waveguides at shows before being acquired by Snap in 2021, but once again, that was with their demo content and short views at shows. I am working on getting a Spectacles 5 to do a more in-depth evaluation and see how it has improved.

Without the ability to test, compare, and contrast, I can only speculate about Meta Orion’s image quality based on my experience with diffractive waveguides. The higher index of refraction of SiC helps, as there are fewer TIR bounces (which degrade image quality), but SiC is far from a volume-production-ready technology. I’m concerned about image uniformity with a large FOV, and even more so with a single set of diffraction gratings, as diffraction is based on wavelength (color).

Lumus Reflective Waveguide Rumors

The article Meta Orion AR Glasses: The first DEEP DIVE into the optical architecture stated:

There were rumors before that Meta would launch new glasses with a 2D reflective (array) waveguide optical solution and LCoS optical engine in 2024-2025. With the announcement of Orion, I personally think this possibility has not disappeared and still exists.

The “reflective waveguide” would most likely be a reference to Lumus’s reflective waveguides. I have seen a few “Lumus clone” reflective waveguides from Chinese companies, but their image quality is very poor compared to Lumus. In the comment section of my last article, Ding, on October 8, 2024, wrote:

There’s indeed rumor that Meta is planning an actual product in 2025 based on LCOS and Lumus waveguide. 

Lumus has demonstrated impressive image quality in a glasses-like form factor (see my 2021 article: Exclusive: Lumus Maximus 2K x 2K Per Eye, >3000 Nits, 50° FOV with Through-the-Optics Pictures). Since the 2021 Maximus, they have been shrinking the form factor and improving support for prescription lens integration with their new “Z-lens” technology. Lumus claims its Z-Lens technology should be able to support greater than a 70-degree FoV in glass. Lumus also says because their waveguides support a larger input pupil, they should have a 5x to 10x efficiency advantage.

The market question about Lumus is whether its waveguides can be made cost-effectively in mass production. In the past, I have asked their manufacturing partner, Schott, who says they can make it, but I have yet to see a consumer product built around the Z-Lens. It would be interesting to see what would happen if a company like Meta put the kind of money it invested in complex Silicon Carbide waveguides into reflective waveguides instead.

While diffractive waveguides are not inexpensive, they are considered less expensive at present (except, of course, for Meta Orion’s SiC waveguides). Perhaps an attractive proposition for researchers and companies with proprietary designs is that diffractive waveguides can be customized more easily (at least on glass).

Not Addressing Who Invented What First

I want to be clear: this article does not in any way make assertions about who invented what first or whether anyone is infringing on anyone else’s invention. Making that determination would require a massive amount of work, lawyers, and the courts. The reason I cite patents and patent applications is that they are public records that are easily searched and often document technical details that are missing from published presentations and articles.

Conclusions

There seems to be a surprising amount of commonality between Meta’s Orion, the Snap/Wave Optics, and the Magic Leap Two waveguides. They all avoided the “conventional” three diffraction gratings on one side of a waveguide in order to support a wider FOV in an eyeglass form factor. Rediscovering that the ML2 supported “disparity correction,” as Meta refers to it, was a bit of a bonus.

As I wrote last time, Meta’s Orion seems like a strange mix of technology to make a big deal about at Meta Connect. They combined a ridiculously expensive waveguide with a very low-resolution display. The two-sided diffraction grating Silicon Carbide waveguides seem to be more than a decade away from practical volume production. It’s not clear to me that, even if they could be made cost-effectively, they would have as good a view of the real world or the image quality of reflective waveguides, particularly at wider FOVs.

Meta could have put together a headset with technology that was within three years of being ready for production. As it is, it seemed like more of a stunt in response to the Apple Vision Pro. In that regard, the stunt seems to have worked, in the sense that some reviewers were reminded that seeing the real world directly with optical AR/MR beats looking at it through a camera and display.

This Eyewear Offers a Buckshot Method to Monitor Health



Emteq Labs wants eyewear to be the next frontier of wearable health technology.

The Brighton, England-based company introduced today its emotion-sensing eyewear, Sense. The glasses contain nine optical sensors distributed across the rims that detect subtle changes in facial expression with more than 93 percent accuracy when paired with Emteq’s current software. “If your face moves, we can capture it,” says Steen Strand, whose appointment as Emteq’s new CEO was also announced today. With that detailed data, “you can really start to decode all kinds of things.” The continuous data could help people uncover patterns in their behavior and mood, similar to an activity or sleep tracker.

Emteq is now aiming to take its tech out of laboratory settings with real-world applications. The company is currently producing a small number of Sense glasses, and they’ll be available to commercial partners in December.

The announcement comes just weeks after Meta and Snap each unveiled augmented reality glasses that remain in development. These glasses are “far from ready,” says Strand, who led the augmented reality eyewear division while working at Snap from 2018 to 2022. “In the meantime, we can serve up lightweight eyewear that we believe can deliver some really cool health benefits.”

Fly Vision Vectors

While current augmented reality (AR) headsets have large battery packs to power the devices, glasses require a lightweight design. “Every little bit of power, every bit of weight, becomes critically important,” says Strand. The current version of Sense weighs 62 grams, slightly heavier than the Ray-Ban Meta smart glasses, which weigh in at about 50 grams.

Because of the weight constraints, Emteq couldn’t use the power-hungry cameras typically used in headsets. With cameras, motion is detected by looking at how pixels change between consecutive images. The method is effective, but captures a lot of redundant information and uses more power. The eyewear’s engineers instead opted for optical sensors that efficiently capture vectors when points on the face move due to the underlying muscles. These sensors were inspired by the efficiency of fly vision. “Flies are incredibly efficient at measuring motion,” says Emteq founder and CSO Charles Nduka. “That’s why you can’t swat the bloody things. They have a very high sample rate internally.”

Sense glasses can capture data as often as 6,000 times per second. The vector-based approach also adds a third dimension to a typical camera’s 2D view of pixels in a single plane.

These sensors look for activation of facial muscles, and the area around the eyes is an ideal spot. While it’s easy to suppress or force a smile, the upper half of our face tends to have more involuntary responses, explains Nduka, who also works as a plastic surgeon in the United Kingdom. However, the glasses can also collect information about the mouth by monitoring the cheek muscles that control jaw movements, conveniently located near the lower rim of a pair of glasses. The data collected is then transmitted from the glasses to pass through Emteq’s algorithms in order to translate the vector data into usable information.

In addition to interpreting facial expressions, Sense can be used to track food intake, an application discovered by accident when one of Emteq’s developers was wearing the glasses while eating breakfast. By monitoring jaw movement, the glasses detect when a user chews and how quickly they eat. Meanwhile, a downward-facing camera takes a photo to log the food, and uses a large language model to determine what’s in the photo, effectively making food logging a passive activity. Currently, Emteq is using an instance of OpenAI’s GPT-4 large language model to accomplish this, but the company has plans to create their own algorithm in the future. Other applications, including monitoring physical activity and posture, are also in development.
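Emteq has not published how its food-logging pipeline is implemented. Purely as an illustrative sketch of the general pattern described above (a meal photo sent to a multimodal GPT-4-class model for labeling), assuming the OpenAI Python SDK and a hypothetical image path:

```python
import base64
from openai import OpenAI  # assumes the official OpenAI Python SDK is installed

# Illustrative sketch only: Emteq's actual pipeline is not public. This just
# shows the general pattern described above -- send a meal photo to a
# multimodal GPT-4-class model and ask it to describe the food.
client = OpenAI()  # reads OPENAI_API_KEY from the environment

def log_meal(photo_path: str) -> str:
    with open(photo_path, "rb") as f:
        photo_b64 = base64.b64encode(f.read()).decode()

    response = client.chat.completions.create(
        model="gpt-4o",  # a multimodal GPT-4-class model
        messages=[{
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "List the foods visible in this photo with rough portions."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{photo_b64}"}},
            ],
        }],
    )
    return response.choices[0].message.content

# print(log_meal("breakfast.jpg"))  # hypothetical file path
```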

One Platform, Many Uses

Nduka believes Emteq’s glasses represent a “fundamental technology,” similar to how the accelerometer is used for a host of applications in smartphones, including managing screen orientation, tracking activity, and even revealing infrastructure damage.

Similarly, Emteq has chosen to develop the technology as a general facial data platform for a range of uses. “If we went deep on just one, it means that all the other opportunities that can be helped—especially some of those rarer use cases—they’d all be delayed,” says Nduka. For example, Nduka is passionate about developing a tool to help those with facial paralysis. But a specialized device for those patients would have high unit costs and be unaffordable for the target user. Allowing more companies to use Emteq’s intellectual property and algorithms will bring down cost.

In this buckshot approach, the general target for Sense’s potential use cases is health applications. “If you look at the history of wearables, health has been the primary driver,” says Strand. The same may be true for eyewear, and he says there’s potential for diet and emotional data to be “the next pillar of health” after sleep and physical activity.

How the data is delivered is still to be determined. In some applications, it could be used to provide real-time feedback—for instance, vibrating to remind the user to slow down eating. Or, it could be used by health professionals only to collect a week’s worth of at-home data for patients with mental health conditions, which Nduka notes largely lack objective measures. (As a medical device for treatment of diagnosed conditions, Sense would have to go through a more intensive regulatory process.) While some users are hungry for more data, others may require a “much more gentle, qualitative approach,” says Strand. Emteq plans to work with expert providers to appropriately package information for users.

Interpreting the data must be done with care, says Vivian Genaro Motti, an associate professor at George Mason University who leads the Human-Centric Design Lab. What expressions mean may vary based on cultural and demographic factors, and “we need to take into account that people sometimes respond to emotions in different ways,” Motti says. With little regulation of wearable devices, she says it’s also important to ensure privacy and protect user data. But Motti raises these concerns because there is a promising potential for the device. “If this is widespread, it’s important that we think carefully about the implications.”

Privacy is also a concern to Edward Sazonov, a professor of electrical and computer engineering at the University of Alabama, who developed a similar device for dietary tracking in his lab. Having a camera mounted on Emteq’s glasses could pose issues, both for the privacy of those around a user and for a user’s own personal information. Many people eat in front of their computer or cell phone, so sensitive data may be in view.

For technology like Sense to be adopted, Sazonov says questions about usability and privacy concerns must first be answered. “Eyewear-based technology has potential for a great future—if we get it right.”

Meta Orion AR Glasses (Pt. 1 Waveguides)

7 October 2024 at 02:54

Introduction

While Meta’s announcement of its Orion prototype AR glasses at Meta Connect made big news, there were few technical details beyond it having a 70-degree field of view (FOV) and using Silicon Carbide waveguides. While Meta demoed to the more general technical press and “influencers,” they didn’t seem to invite the more AR- and VR-centric people who might be more analytical. Via some Meta patents, a Reddit post, and studying videos and articles, I was able to tease out some information.

This first article will concentrate on Orion’s Silicon Carbide diffractive waveguide. I have a lot of other thoughts on the mismatch of features and human factors that I will discuss in upcoming articles.

Wild Enthusiasm Stage and Lack of Technical Reviews

In the words of Yogi Berra, “It’s like deja vu all over again.” We went through this with the Apple Vision Pro, which went from being the second coming of the smartphone to almost disappearing earlier this year. This time, a more limited group of media people has been given access. There is virtually no critical analysis of the display’s image quality or the effect on the real world. I may be skeptical, but I have seen dozens of different diffractive waveguide designs, and there must be some issues, yet nothing has been reported. I expect there are problems with color uniformity and diffraction artifacts, but nothing was mentioned in any article or video. Heck, I have yet to see anyone mention the obvious eye glow problem (more on this in a bit).

The Vergecast podcast discusses some of the utility issues, and The Verge’s related video, Exclusive: We tried Meta’s AR glasses with Mark Zuckerberg, gives some more information about the experience. Thankfully, unlike Meta and others with their (simulated) through-the-optics videos, The Verge clearly marked its videos as “Simulated” (screen capture on the right).

As far as I can tell, there are no true “through-the-optics” videos or pictures (likely at Meta’s request). All the images and videos I found that may look like they could have been taken through the optics have been “simulated.”

Another informative video was by Norm Chan of Adam Savage’s Tested, particularly the last two-thirds of the video after his interview with Meta CTO Andrew Bosworth. Norm discussed that the demo was “on rails,” with limited demos in a controlled room environment. I’m going to quote Bosworth a few times in this article because he added information; while he may have been giving some level of marketing spin, he seems to be generally truthful, unlike former Hololens 2 leader Alex Kipman, who was repeatedly dishonest in his Hololens 2 presentation (which I documented in several articles, including Hololens 2 and why the resolution math fails, Alex Kipman Fibbing about the field of view, Alex Kipman’s problems at Microsoft with references to other places where Kipman was “fibbing,” and Hololens 2 Display Evaluation (Part 2: Comparison to Hololens 1), or input “Kipman” in this blog’s search feature).

I’m not against companies making technology demos in general. However, making a big deal about a “prototype” and not a “product” at Meta Connect rather than at a technical conference like Siggraph indicates AR’s importance to Meta. It invites comparisons to the Apple Vision Pro, which Meta probably intended.

It is a little disappointing that they also only share the demos with selected “invited media” that, for the most part, lack deep expertise in display technology and are easily manipulated by a “good” demo (see Appendix: “Escape from a Lab” and “Demos Are a Magic Show”). They will naturally tend to pull punches to keep access to new product announcements from Meta and other major companies. As a result, there is no information about the image quality of the virtual display or any reported issues looking through the waveguides (which there must be).

Eye Glow

I’ve watched hours of videos and read multiple articles, and I have yet to hear anyone mention the obvious issue of “eye glow” (front projection). They will talk about the social acceptance of them looking like glasses and being able to see the person’s eyes, but then they won’t mention the glaring problem of the person’s eyes glowing. It stuck out to me because they didn’t mention the eye glow issue, evident in all the videos and many photos.

Eye glow is an issue that diffractive waveguide designers have been trying to reduce or eliminate for years. Then there are Lumus reflective waveguides with inherently little eye glow. Vuzix, Digilens, and Dispelix make big points about how they have reduced the problem with diffractive waveguides (see Front Projection ("Eye Glow") and Pantoscopic Tilt to Eliminate "Eye Glow"). However, these diffractive waveguide designs with greatly reduced eye glow have relatively small (25-35 degree) FOVs. The Orion design supports a very wide 70-degree FOV while trying to fit the size of a "typical" (if bulky) glasses frame; I suspect that the design methods needed to meet the size and FOV requirements meant that the issue of "eye glow" could not be addressed.

Light Transmission (Dimming?)

The transmissivity seems to vary in the many images and videos of people wearing Orion. It's hard to tell, but it appears to change. On the right, two frames switch back and forth, and the glasses darken as the person puts them on (from the video Orion AR Glasses: Apple's Last Days).

Because I’m judging from videos and pictures with uncontrolled lighting, it’s impossible to know the transmissivity, but I can compare it to other AR glasses. Below are the highly transmissive Lumus Maximus glasses with greater than 80% transmissivity and the Hololens 2 with ~40% compared to the two dimming levels of the Orion glasses.

Below is a still frame from a Meta video showing some of the individual parts of the Orion glasses. They appear to show unusually dark cover glass, a dimming shutter (possibly liquid crystal) with a drive circuit attached, and a stack of flat optics including the waveguide with electronics connected to it. In his video, Norm Chan stated, "My understanding is the frontmost layer can be like a polarized layer." This is consistent with what appears to be the cover "glass" (which could be plastic) looking so dark compared to the dimming shutter (LC is nearly transparent, as it only changes the polarization of light).

If it does use a polarization-based dimming structure, this will cause problems when viewing polarization-based displays (such as LCD-based computer monitors and smartphones).
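To give a feel for the scale of that problem, here is a minimal sketch of Malus's law (generic polarizer physics, not anything confirmed about Orion's actual stack): light from an already-polarized source, such as an LCD monitor, falls off with the square of the cosine of the head-tilt angle, so the monitor can dim dramatically or go nearly black.

```python
import math

def malus_transmission(angle_deg: float) -> float:
    """Malus's law: fraction of already-polarized light (e.g., from an LCD
    monitor) passing a linear polarizer rotated angle_deg from the source's
    polarization axis."""
    return math.cos(math.radians(angle_deg)) ** 2

# Unpolarized real-world light would lose roughly 50% regardless of head tilt,
# but a polarized display can go from full brightness to nearly black.
for angle in (0, 30, 45, 60, 90):
    print(f"head tilt {angle:2d} deg: {malus_transmission(angle) * 100:5.1f}% of the monitor's light gets through")
```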

Orion’s Unusual Diffractive Waveguides

Axel Wong's analysis of Meta Orion's waveguide, which was translated and published on Reddit as Meta Orion AR Glasses: The first DEEP DIVE into the optical architecture, served as a starting point for my study of the Meta Orion's optics, and I largely agree with his findings. Based on the figures he showed, his analysis was based on Meta Platforms' (a patent holding company of Meta) US patent application 2024/0179284. Three figures from that application are shown below.

[10-08-2024 – Corrected the order of the Red, Green, and Blue inputs in Fig 10 below]

Overlapping Diffraction Gratings

It appears that Orion uses waveguides with diffraction gratings on both sides of the substrate (see FIG. 12A above). In Figure 10, the first and second “output gratings” overlap, which suggests that these gratings are on different surfaces. Based on FIGs 12A and 7C above, the gratings are on opposite sides of the same substrate. I have not seen this before with other waveguides and suspect it is a complicated/expensive process.

Hololens 1

As Axel Wong pointed out in his analysis, supporting such a wide FOV in a glasses form factor necessitated that the two large gratings overlap. Below (upper left) is the Hololens 1 waveguide, typical of most other diffractive waveguides. It consists of a small input grating, an (often) trapezoidal expansion grating, and a more rectangular second expansion and output/exit grating. In the Orion (upper right), the two larger gratings effectively overlap so that the waveguide fits in the eyeglasses form factor. I have roughly positioned the Hololens 1 and Orion waveguides at the same vertical location relative to the eye.

Also shown in the figure above (lower left) is Orion’s waveguide wafer, which I used to generate the outlines of the gratings, and a picture (lower right) showing the two diffraction gratings in the eye glow from Orion.

It should be noted that while the Hololens 1 has only about half the FOV of the Orion, the size of the exit gratings is similar. The size of the Hololens 1 exit grating is due to the Hololens 1 having enough eye relief to support most people who wear glasses. The farther the eye is from the grating, the bigger the grating needs to be for a given FOV.
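A rough geometric sketch of that eye-relief effect (my own simplification with illustrative numbers, not Microsoft's or Meta's specifications): the exit grating must span the eyebox plus the cone the FOV subtends at the eye-relief distance, so a ~35-degree design with generous eye relief needs roughly the same grating width as a ~70-degree design worn much closer to the eye.

```python
import math

def exit_grating_width(fov_deg: float, eye_relief_mm: float, eyebox_mm: float) -> float:
    """Rough lower bound on exit-grating width: the eyebox plus the width of the
    FOV cone projected out to the eye-relief distance (simple pinhole geometry,
    ignoring pupil swim and grating-efficiency roll-off)."""
    return eyebox_mm + 2.0 * eye_relief_mm * math.tan(math.radians(fov_deg) / 2.0)

# Illustrative (not measured) numbers: similar grating widths despite a 2x FOV difference.
print(f"~35 deg FOV, 25 mm eye relief: {exit_grating_width(35, eye_relief_mm=25, eyebox_mm=10):.1f} mm wide")
print(f"~70 deg FOV, 12 mm eye relief: {exit_grating_width(70, eye_relief_mm=12, eyebox_mm=10):.1f} mm wide")
```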

Light Entering From the “wrong side” of the waveguide

The patent application figures 12A and 7C are curious because the projector is on the opposite side of the waveguide from the eye/output. This would suggest that the projectors are outside the glasses rather than hidden in the temples on the same side of the waveguide as the eye.

Meta’s Bosworth in The WILDEST Tech I’ve Ever Tried – Meta Orion at 9:55 stated, “And so, this stack right here [pointing to the corner of the glasses of the clear plastic prototype] gets much thinner, actually, about half as thick. ‘Cause the protector comes in from the back at that point.”

Based on Bosworth’s statement, some optics route the light from the projectors in the temples to the front of the waveguides, necessitating thicker frames. Bosworth said that the next generation’s waveguides will accept light from the rear side of the waveguide. I assume that making the waveguides work this way is more difficult, or they would have already done it rather than having thicker frames on Orion.

However, Bosworth said, "There's no bubbles. Like you throw this thing in a fish tank, you're not gonna see anything." This implies that everything is densely packed into the glasses, so other than saving the volume of the extra optics, there may not be a major size reduction possible. (Bosworth referenced the Steve Jobs story of dropping an iPod prototype in water to prove it could be made smaller because air bubbles escaped.)

Disparity Correction (Shown in Patent Application but not in Orion)

Meta’s application 2024/0179284, while showing many other details of the waveguide, is directed to “disparity correction.” Bosworth discusses in several interviews (including here) that Orion does not have disparity correction but that they intend to put it in future designs. As Bosworth describes it, the disparity correction is intended to correct for any flexing of the frames (or other alignment issues) that would cause the waveguides (and their images relative to the eyes) to move. He seems to suggest that this would allow Meta to use frames that would be thinner and that might have some flex to them.

Half Circular Entrance Gratings

Wong, in the Reddit article, also noticed that the small input/entrance gratings visible on the wafer looked to be cut-off circles and commented:

However, if the coupling grating is indeed half-moon shaped, the light spot output by the light engine is also likely to be this shape. I personally guess that this design is mainly to reduce a common problem with SRG at the coupling point, that is, the secondary diffraction of the coupled light by the coupling grating.

Before the light spot of the light engine embarks on the great journey of total reflection and then entering the human eye after entering the coupling grating, a considerable part of the light will unfortunately be diffracted directly out by hitting the coupling grating again. This part of the light will cause a great energy loss, and it is also possible to hit the glass surface of the screen and then return to the grating to form ghost images.

Single Waveguide for all three colors?

Magic Leap Application Showing Three Stacked Waveguides

The patent application seems to suggest that there is a single (double-sided) waveguide for all three colors (red, green, and blue). Most larger-FOV, full-color diffractive AR glasses stack either three waveguides (red, green, and blue—examples: Hololens 1 and Magic Leap 1&2) or two waveguides (red+green and green+blue—example: Hololens 2). Dispelix has single-layer, full-color diffractive waveguides that go up to a 50-degree FOV.

Diffraction gratings have a line spacing based on the wavelengths of light they are meant to diffract. Supporting full color with such a wide FOV in a single waveguide would typically cause issues with image quality, including light fall-off in some colors and contrast losses. Unfortunately, there are no “through the optics” pictures or even subjective evaluations by an independent expert as to the image quality of Orion.
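To make the wavelength dependence concrete, below is a minimal sketch of the textbook grating equation (a generic relation with an illustrative pitch, not Orion's actual grating design): for a single fixed pitch, red, green, and blue diffract into noticeably different angles inside the guide, which is why single-layer full-color designs tend to struggle with color uniformity and contrast.

```python
import math

def diffraction_angle_deg(wavelength_nm: float, pitch_nm: float,
                          incidence_deg: float = 0.0, n_out: float = 2.0, order: int = 1) -> float:
    """Grating equation: n_out * sin(theta_m) = sin(theta_i) + m * lambda / pitch,
    where n_out is the index of the medium the diffracted ray travels in (the waveguide)."""
    s = (math.sin(math.radians(incidence_deg)) + order * wavelength_nm / pitch_nm) / n_out
    return math.degrees(math.asin(s))

pitch = 380.0  # nm; an illustrative pitch only, not Orion's actual grating spacing
for name, wl in (("blue", 460.0), ("green", 530.0), ("red", 620.0)):
    print(f"{name:5s} {wl:.0f} nm -> diffracted to about {diffraction_angle_deg(wl, pitch):5.1f} deg inside the guide")
```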

Silicon Carbide Waveguide Substrate

The idea of using silicon carbide for waveguides is not unique to Meta. Below is an image from GETTING THE BIG PICTURE IN AR/VR, which discusses the advantages of using high-index materials like lithium niobate and silicon carbide to make waveguides. It is well known that a higher-index substrate supports wider FOVs, as shown in the figure below. The problem, as Bosworth points out, is that growing silicon carbide wafers is very expensive. The wafers are also much smaller, yielding fewer waveguides per wafer. From the pictures of Meta's wafers, they only get four waveguides per wafer, whereas a dozen or more diffractive waveguides can be made on larger and much less expensive glass wafers.
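The following sketch illustrates the index-of-refraction argument in simplified form (generic total-internal-reflection math with approximate indices, not Meta's design data): a higher-index substrate lowers the critical angle, widening the band of angles that can propagate down the guide and thus the FOV the waveguide can carry.

```python
import math

def critical_angle_deg(n_substrate: float, n_outside: float = 1.0) -> float:
    """Critical angle for total internal reflection at the substrate/air interface."""
    return math.degrees(math.asin(n_outside / n_substrate))

# Usable in-guide band, crudely taken as the critical angle up to ~75 degrees
# (beyond that, rays skim along the guide and bounce too few times). Indices approximate.
for name, n in (("glass n~1.5", 1.5), ("high-index glass n~1.9", 1.9),
                ("lithium niobate n~2.2", 2.2), ("silicon carbide n~2.6", 2.6)):
    theta_c = critical_angle_deg(n)
    print(f"{name:24s} critical angle {theta_c:4.1f} deg, usable band ~ {75 - theta_c:4.1f} deg")
```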

Bosworth Says "Nearly Artifact-Free" and Low "Rainbow" Capture

Examples of "Rainbow Artifacts" from Diffractive Waveguides

A common issue with diffractive waveguides is that the diffraction gratings will capture light in the real world and then spread it out by wavelength like a prism, which creates a rainbow-like effect.

In Adam Savage's Tested interview (@~5:10), Bosworth said, "The waveguide itself is nano etched into silicon carbide, which is a novel material with a super high index of refraction, which allows us to minimize the lost photons and minimize the number of photons we capture from the world, so it minimizes things like ghosting and haze and rainbow, all these artifacts, while giving you that field of view that you want. Well, it's not artifact-free, it's very close to artifact-free." I appreciate that while Bosworth tried to give the advantages of their waveguide technology, he immediately corrected himself when he had overstated his case (unlike Hololens' Kipman, as cited in the Introduction). I would feel even better if they let some independent experts study it and give their opinions.

What Bosworth says about rainbows and other diffractive artifacts may be true, but I would like to see it evaluated by independent experts. Norm said in the same video, “It was a very on-rails demo with many guard rails. They walked me through this very evenly diffused lit room, so no bright lights.” I appreciate that Norm recognized he was getting at least a bit of a “magic show” demo (see appendix).


Strange Mix of a Wide FOV and Low Resolution

There was also little to no discussion in the reviews of Orion's very low angular resolution of only 13 pixels per degree (ppd) spread over a 70-degree FOV (a topic for my next article on Orion). This works out to roughly a 720- by 540-pixel display resolution.
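For those who want to check the arithmetic, here is my back-of-the-envelope estimate (assuming a roughly 4:3 aspect ratio for the 70-degree diagonal, which Meta has not confirmed):

```python
import math

ppd = 13.0                 # reported pixels per degree
diag_fov = 70.0            # reported diagonal FOV in degrees
aspect_w, aspect_h = 4, 3  # assumed aspect ratio (not confirmed by Meta)

# Split the diagonal FOV into horizontal/vertical components (small-angle treatment).
diag = math.hypot(aspect_w, aspect_h)
h_fov = diag_fov * aspect_w / diag
v_fov = diag_fov * aspect_h / diag
print(f"~{h_fov:.0f} x {v_fov:.0f} degrees -> ~{h_fov * ppd:.0f} x {v_fov * ppd:.0f} pixels")
# Prints roughly 56 x 42 degrees -> about 730 x 550 pixels, i.e., on the order of 720x540.
```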

Several people reported seeing a 26ppd demo, but it was unclear whether this was in the Orion form factor or a lab-bench demo. Even 26ppd is a fairly low angular resolution.

Optical versus Passthough AR – Orion vs Vision Pro

Meta’s Orion demonstration is a declaration that optical AR (e.g., Orion) and non-camera passthrough AR, such as Apple Vision Pro, are the long-term prize devices. It makes the point that no passthrough camera and display combination can come close to competing with the real-world view in terms of dynamic range, resolution, biocular stereo, and infinite numbers of focus depths.

As I have repeatedly pointed out in writing and presentations, optical AR prioritizes the view of the real world, while camera-passthrough AR prioritizes the virtual image view. I think there is very little overlap in their applications. I can't imagine anyone allowing someone out on a factory floor or onto the streets of a city in a future Apple Vision Pro-type device, but one could imagine it with something like the Meta Orion. And I think this is the point that Meta wanted to make.

Conclusions

I understand that Meta was demonstrating, in a way, “If money was not an obstacle, what could we do?” I think they were too fixated on the very wide FOV issue. I am concerned that the diffractive Silicon Carbide waveguides are not the right solution in the near or long term. They certainly can’t have a volume/consumer product with a significant “eye glow” problem.

This is a subject I have discussed many times, including in Small FOV Optical AR Discussion with Thad Starner and FOV Obsession. In some ways, Orion has the worst of both worlds: with a very wide FOV and a relatively low-resolution display, it blocks much of the real world to show a given amount of content. With the same money, I think they could have made a more impressive demo without exotic waveguide materials that seem so far off in the future. I intend to get more into the human factors and display utility in this series on Meta Orion.

Appendix: “Demos Are a Magic Show”

Seeing the way Meta introduced Orion and hearing of the crafted demos they gave reminded me of one of my earliest blog articles from 2012, called Cynics Guide to CES – Glossary of Terms, which warned about seeing demos.

Escaped From the Lab

Orion seems to fit the definition of an “escape from the lab.” Quoting from the 2012 article:

“Escaped from the lab” – This is the demonstration of a product concept that is highly impractical for any of a number of reasons including cost, lifetime/reliability, size, unrealistic setting (for example requires a special room that few could afford), and dangerous without skilled supervision.  Sometimes demos “escape from the lab” because a company’s management has sunk a lot of money into a project and a public demo is an attempt to prove to management that the concepts will at least one day appeal to consumers.

I have used this phrase a few times over the years, including for the Hololens 2 (Hololens 2 Video with Microvision "Easter Egg" Plus Some Hololens and Magic Leap Rumors), which was officially discontinued this month, although it has long been seen as a failed product. I also commented (in Magic Leap Review Part 1 – The Terrible View Through Diffraction Gratings – see my Sept. 27, 2019 comment) that the Magic Leap One was "even more of a lab project."

Why make such a big deal about Orion, a prototype with a strange mix of features and impractically expensive components? Someone(s) is trying to prove that the product concept was worth continued investment.

Magic Show

I also warned that demos are “a magic show.”

A Wizard of Oz (visual) – Carefully controlling the lighting, image size, viewing location and/or visual content in order to hide what would be obvious defects.   Sometimes you are seeing a “magic show” that has little relationship to real world use.

I went into further detail on this subject in my early coverage of the Hololens 2, in the section "Demos are a Magic Show and why are there no other reports of problems?":

I constantly try to remind people that "demos are a magic show." Most people get wowed by the show or by being one of the special people to try a new device. Many in the media may be great at writing, but they are not experts at evaluating displays. The imperfections and problems go unnoticed in a well-crafted demo by someone who is not trained to "look behind the curtain."

The demo content is often picked to best show off a device and avoid content that might show flaws. For example, content that is busy with lots of visual “noise” will hide problems like image uniformity and dead pixels. Usually, the toughest test patterns are the simplest, as one will immediately be able to tell if something is wrong. I typically like patterns with a mostly white screen to check for uniformity and a mostly black screen to check for contrast, with some details in the patterns to show resolution and some large spots to check for unwanted reflections. For example, see my test patterns, which are free to download. When trying on a headset that supports a web browser, I will navigate to my test pattern page and select one of the test patterns.

Most of the companies that are getting early devices will have a special relationship with the manufacturer. They have a vested interest in seeing that the product succeeds either for their internal program or because they hope to develop software for the device. They certainly won’t want to be seen as causing Microsoft problems. They tend to direct their negative opinions to the manufacturer, not public forums.

Only with independent testing by people with display experience using their own test content will we understand the image quality of the Hololens 2.

Meta Releases New Audio Ray Tracing Tool for More Immersive Soundscapes on Quest

31 July 2024 at 11:19

Meta announced it has released a new Acoustic Ray Tracing feature that will make it easier for developers to add more immersive audio to their VR games and apps.

Released in the Audio SDK for Unity and Unreal, the new Acoustic Ray Tracing tech is designed to automate the complex process of simulating realistic acoustics which is traditionally achieved through labor-intensive, manual methods.

With Meta’s Acoustic Ray Tracing, the company says in a developer blog post it can simulate sound reflections, reverberations, and things like diffraction, occlusion, and obstruction—all critical to making spatial audio closer to the real thing.

Image courtesy Meta

The new audio feature, which Meta calls “a more natural audio experience” than its older Shoebox Room Acoustics model, also supports complex environments.

“Our acoustics features can handle arbitrarily complex geometry, ensuring that even the most intricate environments are accurately simulated,” Meta says. “Whether your VR scene is a winding cave, a bustling cityscape, or an intricate indoor environment, our technology can manage the complexity without compromising on performance.”

The Acoustic Ray Tracing system is said to integrate with existing workflows, supporting popular middleware like FMOD and Wwise. Notably, Acoustic Ray Tracing is being used in the upcoming Quest exclusive Batman: Arkham Shadow, demonstrating its potential for creating immersive experiences.

“One of the standout benefits of our new acoustics features is their performance on mobile hardware. While other solutions in the market require powerful PCs due to their high performance cost, our SDK is optimized to run efficiently on mobile devices such as Quest headsets. This opens up new possibilities for high-quality audio simulation in mobile applications, making immersive audio more accessible than ever before,” the company says.

You can find out more about Meta’s Acoustic Ray Tracing here. You’ll also find documentation on Meta’s Audio SDK (Unity | Unreal) and Acoustic Ray Tracing (Unity | Unreal).


AWE 2024 Panel: The Current State and Future Direction of AR Glasses

29 June 2024 at 23:22

Introduction

At AWE 2024, I was on a panel discussion titled “The Current State and Future Direction of AR Glasses.” Jeri Ellsworth, CEO of Tilt Five; Ed Tang, CEO of Avegant; Adi Robertson, Senior Reporter at The Verge; and I were on the panel, with Jason McDowell of The AR Show moderating. Jason did an excellent job of moderating and keeping the discussion moving. Still, with only 55 minutes, including questions from the audience, we could only cover a fraction of the topics we had considered discussing. I’m hoping to reconvene this panel sometime. I also want to thank Dean Johnson, Associate Professor at Western Michigan University, who originated the idea and helped me organize this panel. AWE’s video of our panel is available on YouTube.

First, I will outline what was discussed in the panel. Then, I want to follow up on small FOV optical AR glasses and some back-and-forth discussions with AWE Legend Thad Starner.

Outline of the Panel Discussion

The panel covered many topics, and below, I have provided a link to each part of our discussion and added additional information and details for some of the topics.

  • 0:00 Introductions
  • 2:19 Apple Vision Pro (AVP) and why it has stalled. It has been widely reported that AVP sales have stalled. Just before the conference, The Information reported that Apple had suspended Vision Pro 2 development and is now focused on a lower-cost version. I want to point out that a 1984 128K Mac would cost over $7,000 adjusted for inflation, and the original 1977 Apple 2 with 4K of memory (without a monitor or floppy drive) would cost about $6,700 in today’s dollars. I contend that utility, not price, is the key problem with AVP sales volume and that Apple is thus drawing the wrong conclusion.
  • 7:20 Optical versus Passthrough AR. The panel discusses why their requirements are so different.
  • 11:30 Mentioned Thad Starner and the desire for smaller FOV optical AR headsets. It turns out that Thad Starner attended our panel, but as I later found out, he arrived late and missed my mentioning him. Thad, later questioned the panel. In 2019, I wrote the article FOV Obsession, which discussed Thad’s SPIE AR/VR/MR presentation about smaller FOV. Thad is a Georgia Institute of Technology professor and a part-time Staff Researcher at Google (including on Google Glass). He has continuously worn AR devices since his research work at MIT’s media lab in the 1990s.
  • 13:50 Does “tethering make sense” with cables or wirelessly?
  • 20:40 Does an AR device have to work outside (in daylight)?
  • 26:49 The need to add displays to today’s Audio-AI glasses (ex. Meta Ray-Ban Wayfarer).
  • 31:45 Making AR glasses less creepy?
  • 35:10 Does it have to be a glasses form factor?
  • 35:55 Monocular versus Biocular
  • 37:25 What did Apple Vision Pro get right (and wrong) regarding user interaction?
  • 40:00 I make the point that eye tracking and gesture recognition on the “Apple Vision Pro is magical until it is not,” paraphrasing Adi Robertson, and I then added, “and then it is damn frustrating.” I also discuss that “it’s not truly hands-free if you have to make gestures with your hands.”
  • 41:48 Waiting for the Superman [savior] company. And do big companies help or crush innovation?
  • 44:20 Vertical integration (Apple’s big advantage)
  • 46:13 Audience Question: When will AR glasses replace a smartphone (enterprise and consumer)
  • 49:05 What is the first use case to break 1 million users in Consumer AR?
  • 49:45 Thad Starner – “Bold Prediction” that the first large application will be with small FOV (~20 degrees), monocular, and not centered in the user’s vision (off to the ear side by ~8 to 20 degrees), and monochrome would be OK. A smartphone is only about 9 by 15 degrees FOV [or ~20 degrees diagonally when a phone is held at a typical distance].
  • 52:10 Audience Question: Why aren’t more companies going after OSHA (safety) certification?

Small FOV Optical AR Discussion with Thad Starner

As stated in the outline above, Thad Starner arrived late and missed my discussion of smaller FOVs that mentioned Thad, as I learned after the panel. Thad, who has been continuously wearing AR glasses and researching them since the mid-1990s, brings an interesting perspective. Since I first saw and met him in 2019, he has strongly advocated for AR headsets having a smaller FOV.

Thad also states that the AR headset should have a monocular (single-eye) display and be 8—to 20 degrees on the ear side of the user’s straight-ahead vision. He also suggests that monochrome is fine for most purposes. Thad stated that his team will soon publish papers backing up these contentions.

In the sections below, I went from the YouTube transcript and did some light editing to make what was said more readable.

My discussion from earlier in the panel:

11:30 Karl Guttag – I think a lot of the AR or optical see-through gets confabulated with what was going on in VR, because VR was cheap and easy to make a wide field of view by sticking a cell phone with some cheap optics in front of your face. You get a wide field of view, and people went crazy about that. I made this point years ago on my blog [2019 article FOV Obsession]. Thad Starner makes this point: he’s one of our Legends at AWE, and I took that to heart many years ago at SPIE AR/VR/MR 2019.

The problem is that as soon as you go beyond about a 30-degree field of view, even projecting forward [with technology advancements], you’re in a helmet, something looking like Magic Leap. And Magic Leap ended up in Nowheresville. [Magic Leap] ended up with 25 to 30% see-through, so it’s not really that good a see-through, and yet it doesn’t have the image quality that you would get from an OLED display shot right into your eyes. You could get a better image on an Xreal or something like that.

People are confabulating too many different specs, so they want a wide field of view. The problem is, as soon as you say 50 degrees, and then you say, yeah, I need spatial recognition, I want to do SLAM, and I want to do this and that, you’ve now spiraled into the helmet. Meta was talking the other day on another panel and said they’re looking at about 50 grams [for the Meta Ray-Bans], and my glasses are 23 grams. As soon as you say a 50-degree field of view, you’re over 100 grams and heading to the Moon as you add more and more cameras and all this other stuff. I think that’s one of our bigger problems with AR, really optical AR.

The experiment we’re going to see played out because many companies are working on adding displays to to so called AI audio glasses. We’re going to see if that works because companies are getting ready to make glasses that have 20—to 30-degree field of view glasses tied into AI and audio stuff.

Thad Starner’s comments and the follow-up discussion during the Q&A at the end of the panel:

AWE Legend Thad Starner Wearing Vuzix’s Ultralight Glasses – After the Panel

49:46 Hi, my name is Thad Starner. I’m a professor at Georgia Tech. I’m going to make a bold prediction here that the first system to sell over a million units will be a small field of view, monocular, non-line-of-sight display, and monochrome is okay. The reason I say that is, number one, I’ve done different user studies in my lab that we’ll be publishing soon on this subject, but the other thing is that our phones, which are the most popular interface out there, are only 9 degrees by 16 degrees field of view. Putting something outside of the line of sight means that it doesn’t interrupt you while you’re crossing the street or driving or flying a plane, right? We know these numbers: between 8 and 20 degrees towards the ear and plus or minus 8 degrees. I’m looking at Karl [Guttag] here so he can digest all these things.

Karl – I wrote a whole article about it [FOV Obsession]

Thad – And not having a pixel in line of sight, so now feel free to pick me apart and disagree with me.

Jeri-  I want to know a price point.

Thad – I think the first market will be captioning for the hard of hearing, not for the deaf. Also possibly transcription, not translation. At that price point, you’re talking about making reading glasses for people instead of hearing aids. There’s a lot of pushback against hearing aids, but reading glasses people tend to accept, so I’d say you’re probably in the $200 to $300 range.

Ed – I think your prediction is spot on, minus the color green. The only thing I think is that it’s not going to fly.

Thad – I said monochrome is okay.

Ed – I think the monocular field of view is going to be an entry-level product, and I think you will see products that fit that category, with roughly that field of view and roughly that offset angle [not in the center of view], in the beginning. I agree with that, but I think that’s just the first step. I think you will see a lot of products after that doing a lot more than monocular, monochrome, offset displays, going to a larger field of view and binocular. I think that will happen pretty quickly.

Adi – It does feel like somebody tries to do that every 18 months, though. Intel tried to make a pair of glasses that did that, and it’s a little bit what North did. I guess it’s just a matter of throwing the idea at the wall until it takes, because I think it’s a good one.

I was a little taken aback to have Thad call me out as if I had disagreed with him when I had made the point about the advantages of a smaller FOV earlier. Only after the presentation did I find out that he had arrived late. I’m not sure what comment I made that made Thad think I was advocating for a larger FOV in AR glasses.

I want to add that there can be big differences between what consumers and experts will accept in a product. I’m reminded of a story I read in the early 1980s when there was a big debate between very high-resolution monochrome versus lower-resolution color (back then, you could only have one or the other with CRTs) that the head of IBM’s monitor division said, “Color is the least necessary and most desired feature in a monitor.” All the research suggested that resolution was more important for the tasks people did on a computer at the time, but people still insisted on color monitors. Another example is the 1985 New Coke fiasco, in which Coke’s taste studies proved that people liked New Coke better, but it still failed as a product.

In my experience, a big factor is whether the person is being trained to use the device for enterprise or military use versus buying it for their own enjoyment. The military has used monochrome displays on devices, including night vision and heads-up displays, for decades. I like to point out that the requirements change depending on whether the user is paid to use the device or is paying to use it. Enterprises and the military care about whether the product gets the job done and pay someone to use the device. The consumer has different criteria. I will also agree that there are cases where the user is motivated to be trained, such as Thad’s hard-of-hearing example.

Conclusion on Small FOV Optical AR

First, I agree with Thad’s comments about the smaller FOV and have stated such before. There are also cases outside of enterprise and industrial use where the user is motivated to be trained, such as Thad’s hard-of-hearing example. But while I can’t disagree with Thad or his studies that show having a monocular monochrome image located outside the line of sight is technically better, I think consumers will have a tougher time accepting a monocular monochrome display. What you can train someone to use differs from what they would buy for themselves.

Thad makes a good point that having a biocular display directly in the line of sight can be problematic and even dangerous. At the same time, untrained people don’t like monocular displays outside the line of sight. It becomes (as Ed Tang said in the panel) a point of high friction to adoption.

Based on the many designs I have seen for AR glasses, we will see this all played out. Multiple companies are developing optical see-through AR glasses with monocular green MicroLEDs, color X-cube-based MicroLEDs, and LCOS-based displays with glass form-factor waveguide optics (both diffractive and reflective).

Apple Vision Pro Part 6 – Passthrough Mixed Reality (PtMR) Problems

27 September 2023 at 05:09

Introduction

I planned to wrap up my first pass coverage of the Apple Vision Pro (AVP) with my summary and conclusions based on prior articles. But the more I thought about it, Apple’s approach to Passthrough Mixed Reality (PtMR) seems like it will be so egregiously bad that it should be broken out and discussed separately.

Apple Prioritized EyeSight “Gimmick” Over Ergonomics and Functionality

There are some features, particularly surrounding camera passthrough, where there should have been an internal battle between those who wanted the EyeSight™ gimmick and what I would consider more important functionality. The backers of EyeSight must have won, forcing the horrible location of the passthrough cameras, the optical distortion from the curved glass in front of all the forward-facing cameras and sensors, a fragile piece of hard-to-replace glass on the front where it can be easily scratched and broken, and added weight on the front where it is least desired. Also, as discussed later, there are negative effects on the human visual system caused by misaligning the passthrough cameras with the eyes.

The negative effects of EyeSight are so bad for so many fundamental features that someone in power with little appreciation for the technical difficulties must have forced the decision (at least, that is the only way I can conceive of it happening).  People inside the design team must have known it would cause serious problems. Supporting passthrough mixed reality (PtMR) is hard enough without deliberately creating problems.

Meta Quest 3 Camera Location

As noted in Meta Quest Pro (Part 1) – Unbelievably Bad AR Passthrough, Meta is locating the soon-to-be-released Quest 3 main passthrough camera closer to the center of view of the eyes. Fixed cameras in front of the eyes won’t be perfect and will still require digital correction for better functional use. It does appear that Meta is taking the PtMR more seriously than it did with the Meta Quest Pro and Quest 2.

I’m going to be looking forward to getting a Meta Quest 3 to test out when it is released soon.

Definitions of AR/VR/MR and PtMR

The terms used to describe mixed reality have been very fluid over the last few years. Before the introduction of Hololens, Augmented reality meant any headset that displayed virtual content on a see-through display. For example, just before Hololens went on sale, Wired in 2015 titled their article (with my bold emphasis): Microsoft Shows HoloLens’ Augmented Reality Is No Gimmick. With the introduction of Hololens, the term “Mixed Reality” was used to distinguish AR headsets with SLAM to lock the virtual to the real world. “AR” headsets without SLAM are sometimes called AR Heads-Up Displays (HUDs), but these get confused with automotive HUDs. Many today refer to a see-through headset without SLAM as “AR” and one with SLAM as “MR,” whereas previously, the terms “AR” covered both with and without SLAM.

Now we have the added confusion of optical see-through (e.g., Hololens) and camera passthrough "Mixed Reality." While they may be trying to accomplish similar things, they are radically different in their capabilities. Rather than constantly typing "passthrough" before MR, I abbreviate it as PtMR.

In Optical AR, the Virtual Content Augments the Real World – With PtMR, the Real World Augments the Virtual Content

Optical MR prioritizes seeing the real world at the expense of the virtual content. The real world is in perfect perspective, at the correct focus distance, with no limitation by a camera or display on brightness, with zero lag, etc. If done well, there is minimal light blocking and distortion of the real world and little blocking of the real-world FOV.

PtMR, on the other hand, prioritizes virtual image quality at the expense of the real world, both in how things behave in 3-D space (focus perspective) and in image quality.

We are likely many decades away, if ever, from passing what Douglas Lanman of Meta calls their Visual Turing Test (see also the video linked here).

Meta’s demonstrations at Siggraph 2023 of their Flamera with perspective-correct passthrough and Butterscotch with vergence accommodation conflict served to show how far PtMR is from optical passthrough. They can only address each problem individually, each with a large prototype, and even then, there are severe restrictions. The Flamera has a very low-resolution passthrough, and Butterscotch only supports a 50-degree FOV.

It is also interesting that Butterscotch moves back from Half Dome 3’s electronic LCD variable focus to electro-mechanical focusing to address VAC. As reported in Mixed Reality News, “However, the technology presented problems with light transmission and image quality [of the electronic LCD approach], so Meta discarded it for Butterscotch Varifocal at the expense of weight and size.”

All of this work is to try and solve some of the many problems created by PtMR that don’t exist with optical MR. PtMR does not “solve” the issues with optical MR. It just creates a long list of massively hard new problems. Optical AR has issues with the image quality of the virtual world, very large FOV, and hard-edge occlusion (see my article Magic Leap 2 (Pt. 3): Soft Edge Occlusion, a Solution for Investors and Not Users). I often say, “What is hard in optical MR is easy in PtMR and vice versa.”

Demo or Die

Meta and others seem to use Siggraph to show off research work that is far from practical. As stated by Lanman of Meta, of their Flamera and Butterscotch VAC demos at Siggraph 2023, Meta’s Reality Labs has a “Demo or Die” philosophy. They will not be tipping off their competition on concepts they will use within a few years. To be clear, I’m happy to see companies showing off their technical prowess, but at the same time, I want to put it in perspective.

Cosmetic vs. Functional Passthrough PtMR

JayzTwoCents' video on the HTC Vive XR Elite includes a presentation by Phil on what he calls "3D Depth Projection" (others refer to it as "perspective correct"). In the video (sequence of clips below), Phil demonstrates that because the passthrough video was not corrected in scale, position, and perspective in 3-D space, it deprives him of the hand-eye coordination needed to catch a bottle tossed to him.

This was discussed in Meta Quest Pro (Part 1) – Unbelievably Bad AR Passthrough, in the section "The Method in the Madness: MQP prioritizes 3-D spatial over image quality."

Phil demonstrated in the video (and in a sequence of clips below) that with the Meta Quest Pro, even though the image quality is much worse and distorted due to the 3D projection, he can at least catch the bottle.

I would classify the HTC Vive XR Elite as having a "Cosmetic Passthrough." While the image quality is better (but still not very good), it is non-functional. While the Meta Quest Pro's image quality is lousy, it is at least somewhat functional.

Something else to notice in the MQP frame sequence above is that there are both lag and accuracy errors in hand tracking.

Effects on Vision with Long-Term Use

It is less obvious that the human visual system will start adapting to any camera placement and then have to re-adapt after the headset is removed. This was briefly discussed in AVP Part 2 in the section titled Centering correctly for the human visual system, which references Steve Mann's March 2013 IEEE Spectrum article, "What I've learned from 35 years of wearing computerized eyewear." In Steve Mann's early days, there was no processing power to attempt to digitally shift the camera images to compensate. At the same time, I'm not sure how well the correction will work or how a distorted view will affect people's visual perception during and after long exposure. As with most visual effects, it will vary from one individual to another.

Meta Flamera Light Field Camera at Siggraph 2023

As discussed in AVP Part 2 and Meta Quest Pro (Part 1) – Unbelievably Bad AR Passthrough, having the passthrough cameras as close as possible to being coaxial to the eyes (among other things) is highly desirable.

To reduce any undesired negative effects on human vision caused by cameras not aligning with the eyes, some devices, such as the Quest 2 and Quest Pro from Meta, use processing to create what I will call “virtual cameras” with a synthesized view for each eye. The farther the physical cameras are from the eye’s location, the larger the correction will be required and the larger the distortion in the final result.

Meta at Siggraph 2023 presented the paper “Perspective-Correct VR Passthrough Without Reprojection” (and IEEE article) and showed their Flamera prototype with a light field camera (right). The figure below shows how the camera receives light rays from the same angle as the eye with the Light Field Passthrough Camera.

Below are a couple of still frames (with my annotations) from the related video that show how, with the Meta Quest 2, the eye and camera views differ (below left), resulting in a distorted image (below right). The distortion/error increases as the distance from the eye decreases.
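As a rough illustration of why the error grows for close objects (generic geometry, not Meta's reprojection math): a passthrough camera displaced a few centimeters from the eye sees a nearby object along a noticeably different angle than the eye would, while the discrepancy shrinks for distant objects.

```python
import math

def view_angle_error_deg(camera_offset_m: float, object_distance_m: float) -> float:
    """Angular difference between the eye's and an offset camera's line of sight
    to an on-axis object (simple triangle; ignores any reprojection/correction)."""
    return math.degrees(math.atan2(camera_offset_m, object_distance_m))

offset = 0.04  # assume ~4 cm between the real eye position and the passthrough camera
for d in (0.3, 0.5, 1.0, 3.0, 10.0):
    print(f"object at {d:4.1f} m -> ~{view_angle_error_deg(offset, d):4.1f} deg of view error")
```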

It should be noted that while Flamera’s light field camera approach addresses the angular problems of the camera location, it does so with a massive loss in resolution (by at least “n,” where n is the number of light field subviews). So, while interesting in terms of research and highlighting the problem, it is still a highly impractical approach.

The Importance of “Perspective Correct” PtMR

In preparing this article, I returned to a thread on Hacker News about my Meta Quest Pro (Part 1) – Unbelievably Bad AR Passthrough article. In that article, the section "The Method in the Madness: MQP prioritizes 3-D spatial over image quality" tried to explain why Meta was distorting the image.

Poster Zee2 took exception to my article and seemed to feel I was understating the problem of 3-D perspective. I think Zee2 missed what I meant by “pyrrhic victory.” I was trying to say they were correct to address the 3D depth issue but that doing so with a massive loss in image quality was not the solution. I was not dismissing the importance of perspective-correct passthrough.

Below, I am copying his comment from that thread (with my bold highlighting), including a quote from my article. Interestingly, Zee2 comments on Varjo having good image quality with its passthrough, even though it is not perspective-correct.

I also really don’t know why he [refering to my article] decided to deemphasize the perspective and depth correctness so much. He mentions it here:

>[Quoting Meta Quest Pro (Part 1) – Unbelievably Bad AR Passthrough] In this case, they were willing to sacrifice image quality to try to make the position of things in the real world agree with where virtual objects appear. To some degree, they have accomplished this goal. But the image quality and level of distortion, particularly of “close things,” which includes the user’s hands, is so bad that it seems like a pyrrhic victory.

I don’t think this is even close to capturing how important depth and perspective correct passthrough is.

Reprojecting the passthrough image onto a 3D representation of the world mesh to reconstruct a perspective-correct view is the difference between a novelty that quickly gives people headaches and something that people can actually wear and look through for an extended period of time.

Varjo, as a counterexample, uses incredibly high-resolution cameras for their passthrough. The image quality is excellent, text is readable, contrast is good, etc. However, they make no effort to reproject their passthrough in terms of depth reconstruction. The result is a passthrough image that is very sharp, but is instantly, painfully, nauseatingly uncomfortable when walking around or looking at closeup objects alongside a distant background.

The importance of depth-correct passthrough reprojection (essentially, spacewarp using the depth info of the scene reconstruction mesh) absolutely cannot be understated and is a make or break for general adoption of any MR device. Karl is doing the industry a disservice with this article.

From: Hacker News Meta Quest Pro – Bad AR Passthrough comment by Zee2 

Does the AVP have Cosmetic or Functional PtMR or Something Else?

With the AVP’s passthrough cameras being so poorly located (thanks to EyeSight™), severe distortion would seem inevitable to support functional PtMR. I don’t believe there is some magic (perhaps a pun on Magic Leap) that Apple could employ that Meta couldn’t that would simultaneously support good image quality without serious distortion with the terrible camera placement due to the Eyesight(tm) feature.

So, based on the placement of the cameras, I have low expectations for the functionality of the AVP’s PtMR. The “instant experts” who got to try out the AVP would be more impressed by a cosmetically better-looking passthrough. Since there are no reports of distortion like the MQP, I’m left to conclude that, at least for the demo, they were only doing a cosmetic passthrough.

As I often say, “Nobody will volunteer information, but everyone will correct you.” Thus, it is better to take a position based on the current evidence and then wait for a correction or confirmation from the many developers with AVPs who read this blog.

Conclusion

I’m not discounting the technical and financial power of Apple. But then I have been writing about the exaggerated claims for Mixed Reality products by giant companies such as Google, Meta, and Microsoft, not to mention the many smaller companies, including the over $3B spent by Magic Leap, for the last ten years. The combined sunk cost of about $50B of these companies, not including Apple. As I’m fond of saying, “If all it took were money and smart people, it would already be solved.

Apple doesn’t fully appreciate the difficulties with Passthrough Mixed Reality, or they wouldn’t prioritize the EyeSight gimmick over core capabilities. I’m not saying the AVP would work well for passthrough AR without EyeSight, but it is hard enough without digging big technical holes to support a novelty feature.

Apple Vision Pro (Part 5C) – More on Monitor Replacement is Ridiculous

21 August 2023 at 00:17

Introduction

In this series about the Apple Vision Pro, this sub-series on Monitor Replacement and Business/Text applications started with Part 5A, which discussed scaling, text grid fitting, and binocular overlap issues. Part 5B starts by documenting some of Apple’s claims that the AVP would be good for business and text applications. It then discusses the pincushion distortion common in VR optics and likely in the AVP and the radial effect of distortion on resolution in terms of pixels per degree (ppd).

The prior parts, 5A, and 5B, provide setup and background information for what started as a simple “Shootout” between a VR virtual monitor and physical monitors. As discussed in 5A, my office setup has a 34″ 22:9 3440×1440 main monitor with a 27″ 4K (3840×2160) monitor on the right side, which is a “modern” multiple monitor setup that costs ~$1,000. I will use these two monitors plus a 15.5″ 4K OLED Laptop display to compare to the Meta Quest Pro (MQP) since I don’t have an Apple AVP and then extrapolate the results to the AVP.

My Office Setup: 34″ 22:9 3440×1440 (left) and 27″ 4K (right)

I will be saving my overall assessment, comments, and conclusions about VR for Office Applications for Part 5D rather than somewhat burying them at the end of this article.

Office Text Applications and “Information Density” – Font Size is Important

A point to be made by using spreadsheets to generate the patterns is that if you have to make text bigger to be readable, you are lowering the information density and are less productive. Lowering the information density with bigger fonts is also true when reading documents, particularly when scanning web pages or documents for information.

Improving font readability is not solely about increasing their size. VR headsets will have imperfect optics that cause distortions, focus problems, chromatic aberrations, and loss of contrast. These issues make it harder to read fonts below a certain size. In Part 5A, I discussed how scaling/resampling and the inability to grid fit when simulating virtual monitors could cause fonts to appear blurry and scintillate/wiggle when locked in 3-D space, leading to reduced readability and distraction.

Meta Quest Pro Horizon Worktop Desktop Approach

As discussed in Part 5A, with Meta’s Horizon Desktop, each virtual monitor is reported to Windows as 1920 by 1200 pixels. When sitting at the nominal position of working at the desktop, the center virtual monitor fills about 880 physical pixels of the MQP’s display. So roughly 1200 virtual pixels are resampled into 880 vertical pixels in the center of view or by about 64%. As discussed in Part 5B, the scaling factor is variable due to severe pincushion distortion of the optics and the (impossible to turn off) curved screen effect in Meta Horizons.

The picture below shows the whole FOV captured by the camera before cropping shot through the left eye. The camera was aligned for the best image quality in the center of the virtual monitor.

Analogous to Nyquist sampling, when you scale a pixel-rendered image, you want the display to have about 2X (linearly) the number of pixels of the source image to render it reasonably faithfully. Below left is a 1920 by 1200 pixel test pattern (a 1920×1080 pattern padded on the top and bottom), "native" to what the MQP reports to Windows. On the right is the picture cropped to that same center monitor.

1920×1200 Test Pattern
Through the optics picture

The picture was taken at 405mp, then scaled down by 3X linearly and cropped. When taking high-resolution display pictures, some amount of moiré in color and intensity is inevitable. The moiré is also affected by scaling and JPEG compression.

Below is a center crop from the original test pattern that has been 2x pixel-replicated to show the detail in the pattern.

Below is a crop from the full-resolution image with reduced exposure to show sub-pixel (color element) detail. Notice how the 1-pixel-wide lines are completely blurred, and the text is just becoming fully formed at about Arial 11 point (close to, but not the same scale as, the MS Excel Calibri 11pt used in the tests to follow). Click on the image to see the full resolution the camera captured (3275 x 3971 pixels).

The scaling process might lose a little detail for things like pictures and videos of the real world (such as the picture of the elf in the test pattern), but it will be almost impossible for a human to notice most of the time. Pictures of the real world don’t have the level of pixel-to-pixel contrast and fine detail caused by small text and other computer-generated objects.
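A tiny numerical sketch of why fine text suffers most (my own 1-D illustration, with linear interpolation standing in for whatever resampling filter the headset actually uses): 1-pixel-wide lines land on the destination grid at different sub-pixel phases, so some render at full brightness while others are smeared to a fraction of it, which shows up as the blur and scintillation of small text.

```python
import numpy as np

# 1-D stand-in for 1-pixel-wide text strokes: a bright 1-pixel line every 4 source pixels.
src = np.zeros(1200)
src[::4] = 1.0

# Resample 1200 "virtual monitor" pixels onto 880 display pixels with linear interpolation.
dst_positions = np.linspace(0, 1199, 880)
dst = np.interp(dst_positions, np.arange(1200), src)

# For each source line, find the brightest display pixel that represents it.
peaks = []
for line_pos in range(0, 1200, 4):
    near = np.abs(dst_positions - line_pos) <= 2.0
    peaks.append(dst[near].max())
peaks = np.array(peaks)
print(f"brightest rendered line: {peaks.max():.2f}  dimmest: {peaks.min():.2f}  mean: {peaks.mean():.2f}")
# Lines hit the display grid at different sub-pixel phases, so some keep full brightness
# while others are smeared down to roughly a third of it.
```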

Meta Quest Pro Virtual Versus Physical Monitor “Shootout”

For the desktop "shootout," I picked the 34" 22:9 and 27" 4K monitors I regularly use (side by side, as shown in Part 5A), plus a Dell 15.5" 4K laptop display. An Excel spreadsheet is used on the various displays to demonstrate how much content can be seen at one time on a screen. The spreadsheet allows flexible changes to how the screen is scaled for various resolutions and text sizes, and the number of visible cells measures the information density. For repeatability, a screen capture of each spreadsheet was taken and then played back in full-screen mode (Appendix 1 includes the source test patterns).

The Shootout

The pictures below show the relative FOVs of the MQP and various physical monitors taken with the same camera and lens. The camera was approximately 0.5 meters from the center of the physical monitors, and the headset was at the initial position at the MQP’s Horizon Desktop. All the pictures were cropped to the size of a single physical or virtual monitor.

The following is the basic data (a short sketch after the composite image shows how the FOV and ppd figures can be computed):

  • Meta Quest Pro – Central Monitor (only) ~43.5° horizontal FOV. Used an 11pt font with Windows Display Text Scaling at 150% (100% and 175% also taken and included later)
  • 34″ 22:9 3440×1440 LCD – 75° FOV and 45ppd from 0.5m. 11pt font with 100% scaling
  • 27″ 4K (3840 x 2160) LCD – 56° FOV and 62ppd from 0.5m. 11pt font with 150% scaling (results in text the same size as the 34″ 3440×1440 at 100% – 2160/1440 = 150%)
  • 15.5″ 4K OLED – 32° FOV from 0.5m. Shown below is 11pt with 200% scaling, which is what I use on the laptop (a later image shows 250% scaling, which is what Windows “recommends” and would result in approximately the same size fonts at the 34″ 22:9 at 100%).
Composite image showing the relative FOV – Click to see in higher resolution (9016×5641 pixels)
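As noted above, here is how the monitor FOV and ppd figures can be computed from screen size, resolution, and viewing distance (simple flat-panel geometry; the results come out close to, though not exactly, the listed values, which also depend on the exact camera position):

```python
import math

def monitor_fov_and_ppd(diag_in: float, res_w: int, res_h: int, distance_m: float):
    """Horizontal FOV (degrees) and average pixels-per-degree for a flat monitor
    viewed on-axis from distance_m."""
    aspect = res_w / res_h
    width_m = diag_in * 0.0254 * aspect / math.hypot(aspect, 1.0)
    h_fov = 2.0 * math.degrees(math.atan(width_m / (2.0 * distance_m)))
    return h_fov, res_w / h_fov

for name, diag, w, h in (("34-inch 22:9", 34, 3440, 1440), ("27-inch 4K", 27, 3840, 2160)):
    fov, ppd = monitor_fov_and_ppd(diag, w, h, 0.5)
    print(f"{name}: ~{fov:.0f} deg horizontal FOV, ~{ppd:.0f} ppd from 0.5 m")
```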

The pictures below show the MQP with MS Windows display text scaling set to 100% (below left) and 175% (below middle). The 175% scaling would result in fonts with about the same number of pixels per font as the Apple Vision Pro (but with a larger angular resolution). Also included below (right) is the 15.5″ 4K display with 250% scaling (as recommended by Windows).

MQP -11pt scaled=100%
MQP – 11pt scaled=175%
15.5″ – 11pt scale=250%

The camera was aimed and focused at the center of the MQP, the best case for it, as the optical quality falls off radially (discussed in Part 5B). The text sharpness is the same for the physical monitors from center to outside, but they have some brightness variation due to their edge illumination.

Closeup Look at the Displays

Each picture above was initially taken 24,576 x 16,384 (405mp) by “pixel shifting” the 45MP R5 camera sensor to support capturing the whole FOV while capturing better than pixel-level detail from the various displays. In all the pictures above, including the composite image with multiple monitors, each image was reduced linearly by 3X.

The crops below show the full resolution (3x linearly the images above) of the center of the various monitors. As the camera, lens, and scaling are identical, the relative sizes are what you would see looking through the headset for the MQP sitting at the desktop and at the physical monitors from about 0.5 meters. I have also included a 2X magnification of the MQP's images.

With Windows 100% text scaling, the 11pt font on the MQP is about the same size as it is on the 34" 22:9 monitor at 100%, the 27" 4K monitor at 150% scaling, and the 15.5" 4K monitor at 250% scaling. But while the fonts are readable on the physical monitors, they are a blurry mess on the MQP at 100%. The MQP at 150% and 175% is "readable" but certainly does not look as sharp as the physical monitors.

Extrapolating to Apple Vision Pro

Apple’s AVP has about 175% linear pixel density of the MQP. Thus the 175% case gives a reasonable idea of how text should look on the AVP. For comparison below, the MQP’s 175% case has been scaled to match the size of the 34” 22:9 and 27” 4K monitors at 100%. While the text is “readable” and about the same size, it is much softer/blurrier than the physical monitor. Some of this softness is due to optics, but a large part is due to scaling. While the AVP may have better optics and a text rendering pipeline, they still don’t have the resolution to compete on content density and readability with a relatively inexpensive physical monitor.

Reportedly, the Apple Vision Pro Directly Renders Fonts

Thomas Kumlehn had an interesting comment on Part 5B (with my bold highlighting) that I would like to address:

After the VisionPro keynote in a Developer talk at WWDC, Apple mentioned that they rewrote the entire render stack, including the way text is rendered. Please do not extrapolate from the text rendering of the MQP, as Meta has the tech to do foveated rendering but decided to not ship it because it reduced FPS.

From Part 5A, “Rendering a Pixel Size Dot.”

Based on my understanding, the AVP will “render from scratch” instead of rendering an intermediate image that is then rescaled as is done with the MQP discussed in Part 5A. While rendering from scratch has a theoretical advantage regarding text image quality, it may not make a big difference in practice. With an ~40 pixels per degree (ppd) display, the strokes and dots of what should be readable small text will be on the order of 1 pixel wide. The AVP will still have to deal with approximately pixel-width objects straddling four or more pixels, as discussed in Part 5A: Simplified Scaling Example – Rendering a Pixel Size Dot.

Some More Evaluation of MQP’s Pancake Optics Using immersed Virtual Monitor

I wanted to evaluate the MQP pancake optics more than I did in Part 5B, and Meta’s Horizon Desktop interface was very limiting. So I decided to try out the immersed virtual desktop software. Immersed has much more flexibility in resolution, size, and placement, plus the ability to select flat or curved monitors. Importantly for my testing, I could create a large, flat virtual 4K monitor that could fill the entire FOV with a single test pattern (the pattern is included in Appendix 1).

Unfortunately, while the immersed software had the basic features I wanted, I found it difficult to precisely control the size and positioning of the virtual monitor (more on this later). Due to these difficulties, I just tried to fill the display with the test pattern with the monitor only roughly perpendicular to the headset/camera. It was a painfully time-consuming process, and I never could get the monitor to where it seemed perfectly perpendicular.

Below is a picture of the whole (camera) FOV taken at 405mp and then scaled down to 45mp. The image is a bit underexposed to show the sub-pixel (color) detail when viewed at full resolution. In taking the picture, I determined that the MQP’s pancake optics’ focus appears to be “dished,” with the focus in the center slightly different than on the outsides. The picture was taken focused between the center and outside focus and at f/11 to increase the photograph’s depth of focus. For a person using the headset, this dishing of the focus is likely not a problem, as their eye will refocus based on their center of vision.

As discussed in Part 5B, the MQP’s pancake optics have severe pincushion distortion, requiring significant digital pre-correction to make the net result flat/rectilinear. Most notably, the outside areas of the display have about 1/3rd the linear pixels per degree of the center.

Next are nine crops from the full-resolution picture (click to see) at the center, the four corners, the top, the bottom, and the left and right of the camera’s FOV.

The main things I learned from this exercise are the apparent dish in the focus of the optics and the falloff in brightness. I had already determined the change in resolution in the studies shown in Part 5B.

Some feedback on immersed (and all other VR/AR/MR) virtual monitor placement control.

While immersed had the features I wanted, it was difficult to control the setup of the monitors. The software feels very “beta,” and the interface I got differed from most of the help documentation and videos, suggesting it is a work in progress. In particular, I couldn’t figure out how to pin the screen, as the control for pinning shown in the help guides/videos didn’t seem to exist in my version. So I had to start from scratch on each session and often within a session.

Trying to orient and resize the screen with controllers or hand gestures was needlessly difficult. I would highly suggest immersed look at how 3-D CAD software handles manipulating 3-D models. For example, it would be great to have a single (virtual) button that would position the center monitor directly in front of and perpendicular to the user. It would also be a good idea to allow separate controls for tilt, virtual distance, and zoom/resize while keeping the monitor centered.

It seemed to be “aware” of things in the room, which only served to fight what I wanted to do. I was left contorting my wrist to try to get the monitor roughly perpendicular and then playing with the corners to try to both resize and center the monitor. The interface also appears to conflate “resizing” with moving the monitor closer. While moving the virtual monitor closer and resizing both affect the size of everything, the effect will be different when the head moves. I would have a home (perpendicular and centered) “button,” and then left-right, up-down, tilt, distance, and size controls.

To be fair, I only wanted to set up the screen for a few pictures, and I may have overlooked something. Still, I found the user interface could be vastly better for setting up the monitors, and the controller and gesture controls for monitor size and positioning were a big fail in my use.

BTW, I don’t want to just pick on immersed for this “all-in-one” control problem. On every VR and AR/MR headset I have tried that supports virtual monitors, it has been a pain to place the monitors in 3-D space for lack of good, simple, intuitive controls. Meta Horizons Desktop goes to the extreme of giving no control and overly curved screens.

Other Considerations and Conclusions in Part 5D

This series-within-a-series on the VR and the AVP use as an “office monitor replacement” has become rather long with many pictures and examples. I plan to wrap up this series within the series on the AVP with a separate article on issues to consider and my conclusions.

Appendix 1: Test Patterns

Below is a gallery of PNG file test patterns used in this article. Click on each thumbnail to see the full-resolution test pattern.

22:9 3440×1440 100% 11pt
MQP 1920×1200 100% 11pt
MQP 1920×1200 150% 11pt
MQP 1920×1200 175% 11pt
4K 150% 11pt
4K 200% 11pt
4K 250% 11pt
MQP 1920×1200 “Tuff Test” on Black
MQP 3840×2160 “immersed” lens test

Appendix 2: Some More Background Information

More Comments on Font Sizes with Windows

As discussed in Appendix 3: Confabulating typeface “points” (pt) with Pixels – A Brief History, a font “point” is defined as 1/72nd of an inch (some use 1/72.272 or thereabouts – it is a complicated history). Microsoft treats a theoretical 96 dots per inch (dpi) as 100%. But it is not that simple.

I wanted to share measurements regarding the Calibri 11pt font size. After measuring it on my monitor with a resolution of 110 pixels per inch (PPI), I found that it translates to approximately 8.44pt (8.44/72 inches). However, when factoring in the monitor PPI of 110 and Windows DPI of 96, the font size increases to ~9.67pt. Alternatively, when using a monitor PPI of 72, the font size increases to ~12.89pt. Interestingly, if printed assuming a resolution of 96ppi, the font reaches the standard 11pt size. It seems Windows applies some additional scaling on the screen. Nevertheless, I regularly use the 11pt 100% font size on my 110ppi monitor, which is the Windows default in Excel and Word, and it is also the basis for the test patterns.
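For reference, below is a minimal Python sketch of the point/pixel/inch arithmetic involved, using the 110 PPI monitor and the 96 DPI Windows assumption from the text. Actual rendered glyph heights also depend on the font’s internal metrics, which is why the measured values above do not fall straight out of this arithmetic:

PT_PER_INCH = 72   # 1 point = 1/72 inch (nominal)
WINDOWS_DPI = 96   # Windows "100%" scaling assumes 96 DPI

def pt_to_px(points, dpi=WINDOWS_DPI):
    """Nominal pixel size of a point measurement at Windows 100% scaling."""
    return points / PT_PER_INCH * dpi

def px_to_physical_pt(pixels, monitor_ppi):
    """Physical size (in points) that a pixel count occupies on a real monitor."""
    return pixels / monitor_ppi * PT_PER_INCH

px_11pt = pt_to_px(11)   # ~14.7 px at 100% scaling
print(f"11pt at 100% -> {px_11pt:.1f} px")
print(f"  = {px_to_physical_pt(px_11pt, 110):.1f}pt physically on a 110 PPI monitor")
print(f"  = {px_to_physical_pt(px_11pt, 96):.1f}pt if the monitor really were 96 PPI")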

How pictures were shot and moiré

As discussed in 5A’s Appendix 2: Notes on Pictures, some moiré issues will be unavoidable when taking high-resolution pictures of a display device. As noted in that Appendix, all pictures in the Lens Shootout were taken with the same camera and lens, and the original images were captured at 405 megapixels (Canon R5 “IBIS sensor shift” mode) and then scaled down by 3X. All test patterns used in this article are included in Appendix 1 above.

Apple Vision Pro (Part 5B) – More on Monitor Replacement is Ridiculous.

10 August 2023 at 03:21

Introduction – Now Three Parts 5A-C

I want to address feedback in the comments and on LinkedIn from Part 5A about whether Apple claimed the Apple Vision Pro (AVP) was supposed to be a monitor replacement for office/text applications. Another theory/comment from more than one person is that Apple is hiding the good “spatial computing” concepts so they will have a jump on their competitors. I don’t know whether Apple might be hiding “the good stuff,” but it would seem better for Apple to establish the credibility of the concept. Apple is, after all, a dominant high-tech company and could stomp any competitor.

Studying the MQP’s images in more detail, I found it was too simplistic to use the average pixels per degree (ppd) given by dividing the resolution by the FOV of the MQP (and likely the AVP).

As per last time, since I don’t have an AVP, I’m using the Meta Quest Pro (MQP) and extrapolating the results to the AVP’s resolution. I will show a “shootout” comparing the text quality of the MQP to existing computer monitors. I will then wrap up with miscellaneous comments and my conclusions.

I have also included some discussion of Gaze-Contingent Ocular Parallax (GCOP) from some work by the Stanford Computational Imaging Lab (SCIL) that a reader of this blog asked about. These videos and papers suggest that some amount of depth perception is conveyed to a person by the movement of each eye in addition to vergence (binocular disparity) and accommodation (focus distance).

I’m pushing out a set of VR versus Physical Monitor “Shootout” pictures and some overall conclusions to Part 5C to discuss the above.

Yes, Apple Claimed the AVP is a Monitor Replacement and Good for High-Resolution Text

Apple Vision Pro Concept

In Apple Vision Pro (Part 5A) – Why Monitor Replacement is Ridiculous, I tried to lay a lot of groundwork for why The Apple Vision Pro (AVP), and VR headsets in general, will not be a good replacement for a monitor. I thought it was obvious, but apparently not, based on some feedback I got.

So, to be specific, direct quotes from Apple’s WWDC 2023 presentation (YouTube transcript) are given below with timestamps, with my bold emphasis added and with in-line comments about resolution:

1:22:33 Vision Pro is a new kind of computer that augments reality by seamlessly blending the real world with the digital world.

1:31:42 Use the virtual keyboard or Dictation to type. With Vision Pro, you have the room to do it all. Vision Pro also works seamlessly with familiar Bluetooth accessories, like Magic Trackpad and Magic Keyboard, which are great when you’re writing a long email or working on a spreadsheet in Numbers.

“Seamless” makes many lists of the most overused high-tech marketing words. Marketeers seem to love it because it is imprecise, suggests everything works well, and is unfalsifiable (how do you measure “seamless”?). “Seamlessly” was used eight times in the WWDC23 presentation to describe the AVP and twice by Meta to describe the Meta Quest Pro (MQP) at Meta Connect 2022. As noted in Meta Quest Pro (Part 1) – Unbelievably Bad AR Passthrough, Meta also used “seamless” to describe the MQP’s MR passthrough.

Apple claims the AVP is good for text-intensive work: “writing a long email or working on a spreadsheet in Numbers.”

1:32:10 Place your Mac screen wherever you want and expand it–giving you an enormous, private, and portable 4K display. Vision Pro is engineered to let you use your Mac seamlessly within your ideal workspace. So you can dial in the White Sands Environment, and use other apps in Vision Pro side by side with your Mac. This powerful Environment and capabilities makes Apple Vision Pro perfect for the office, or for when you’re working remote.

Besides the fact that it is not 4K wide, it is stretching those pixels over about 80 degrees so that there are only about 40 pixels per degree (ppd), much lower than is typical with a TV or movie theater. There are the issues discussed in Part 5A that if you are going to make the display stationary in 3-D, the virtual monitor must be inscribed in the viewable area of the physical display with some margin for head movement, and content must be resampled, causing a loss of resolution. Movies are typically in a wide format, whereas the AVP’s FOV is closer to square. As discussed in Apple Vision Pro (Part 3) – Why It May Be Lousy for Watching Movies On a Plane, there is also the issue that the AVP’s horizontal FOV is about 80°, whereas movies are designed for about 45 degrees.

Here, Apple claims that the Apple Vision Pro is “perfect for the office, or for when you’re working remote.”

1:48:06 And of course, technological breakthroughs in displays. Your eyes see the world with incredible resolution and color fidelity. To give your eyes what they need, we had to invent a display system with a huge number of pixels, but in a small form factor. A display where the pixels would disappear, creating a smooth, continuous image.

The AVP’s expected average of 40ppd is well below the angular resolution “where the pixels would disappear.” It is below Apple’s “retinal resolution.” If the AVP has a radial distortion profile similar to the MQP (discussed in the next section), then the center of the image will have about 60ppd, or almost “retinal.” But most of the image will have jaggies that a typical eye can see, particularly when they move/ripple, causing scintillation (discussed in Part 5A).

1:48:56 We designed a custom three-element lens with incredible sharpness and clarity. The result is a display that’s everywhere you look, delivering jaw-dropping experiences that are simply not possible with any other device. It enables video to be rendered at true 4K resolution, with wide color and high dynamic range, all at massive scale. And fine text looks super sharp from any angle. This is critical for browsing the web, reading messages, and writing emails.

WWDC 2023 video at 1:56:08 with Excel shown

As stated above, the video will not be a “true 4K resolution.” Here is the claim that “fine text looks super sharp from any angle,” which is impossible with text resampled onto a ~40ppd display.

1:56:08 Microsoft apps like Excel, Word, and Teams make full use of the expansive canvas and sharp text rendering of Vision Pro.

Here again is the claim that there will be “sharp text” in text-intensive applications like Excel and Word.

I’m not sure how much clearer it can be that Apple was claiming that the AVP would be a reasonable monitor replacement, used even when a laptop display is present. Also, they were very clear that the AVP would be good for heavily text-based applications.

Meta Quest Pro (likely AVP) Pincushion Distortion and its Effect on Pixels Per Degree (ppd)

While I was aware, as discussed in Meta Quest Pro (Part 1) – Unbelievably Bad AR Passthrough, that the MQP, like almost all VR optics, had significant pincushion distortion, I hadn’t quantified the amount of distortion and its effect on the angular resolution, aka ppd. Below is the video capture from the MQP developer’s app on the left, and the resultant image as seen through the optics (middle).

Particularly note above how small the white wall to the left of the left bookcase is relative to its size after the optics; it looks more than 3X wider.

For a good (but old) video explaining how VR headsets map source pixels into the optics (among other concepts), I recommend watching How Barrel Distortion Works on the Oculus Rift. The image on the right shows how equal size rings in the display are mapped into ever-increasing width rings after the optics with a severe pincushion distortion.

Mapping Pixels Per Degree (ppd)

I started with a 405mp camera picture through the MQP optics (right – scaled down 3x linearly), where I could see most of the FOV and zoom in to see individual pixels. I then picked a series of regions in the image to evaluate. Since the pixels in the display device are of uniform size, any change in their size/spacing must be due to the optics.

The RF16f2.8 camera lens has a known optical barrel distortion that was digitally corrected by the camera, so the camera pixels are roughly linear. The camera and lens combination has a horizontal FOV of 98 degrees and 24,576 pixels or ~250.8ppd.

The MQP display processing pre-compensates for the optics plus adds a cylindrical curvature effect to the virtual monitors. These corrections change the shape of objects in the image but not the physical pixels.

The cropped sections below demonstrate the process. For each region, 8 by 8 pixels were marked with a grid. The horizontal and vertical width of the 8 pixels was counted in terms of the camera pixels. The MQP display is rotated by about 20 degrees to clear the nose of the user, so the rectangular grids are rotated. In addition to the optical distortion in size, chroma aberrations (color separation) and focus worsen with increasing radii.
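The arithmetic behind each measured region is straightforward given the camera’s ~250.8 ppd; below is a minimal sketch (the span values are illustrative placeholders, not the actual measurements):

CAMERA_PPD = 24576 / 98   # ~250.8 camera pixels per degree (R5 + RF16mm f/2.8)

def headset_ppd(span_camera_px, headset_px=8):
    """Local headset ppd from the camera-pixel span of `headset_px` display pixels."""
    degrees_spanned = span_camera_px / CAMERA_PPD
    return headset_px / degrees_spanned

# Placeholder spans for illustration only (not the actual measurements):
for label, span in [("near optical center", 65), ("mid radius", 90), ("far edge", 180)]:
    print(f"{label}: ~{headset_ppd(span):.0f} ppd")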

The image below shows the ppd at a few selected radii. Unlike the Oculus Rift video that showed equal rings, the stepping between these rings below is unequal. The radii are given in terms of angular distance from the optical center.

The plots below show the ppd versus radius for the MQP (left); interestingly, the relationship turns out to be close to linear. The right-hand plot assumes the AVP has a similar distortion profile and FOV, but with three times the pixels, as reported. It should be noted that ppd is not the only factor affecting resolution; other factors include focus, chroma aberrations, and contrast, which worsen with increasing radii.

The display on the MQP is 1920×1800 pixels, and the FOV is about 90° per eye diagonally across a roughly circular image, which works out to about 22 to 22.5 ppd on average. The optical center has about 1/3rd higher ppd with the pincushion distortion optics. For the MQP Horizon Desktop application shown, the center monitor is mostly within the 25° circle, where the ppd is at or above average.

Gaze-Contingent Ocular Parallax

While a bit orthogonal to the discussion of ppd and resolution, Gaze-Contingent Ocular Parallax (GCOP) is another issue that may cause problems. A reader and VR user, who claims to have noticed GCOP, brought to my attention the Stanford Computational Imaging Lab’s (SCIL) work on GCOP. SCIL has put out multiple videos and articles, including Eye Tracking Revisited by Gordon Wetzstein and Gaze-Contingent Ocular Parallax Rendering for Virtual Reality (associated paper link). I’m a big fan of Wetzstein’s general presentations; per his usual standard, his video explains the concept and related issues well.

The basic concept is that because the center of projection (where the image lands on the retina) and the center of rotation of the eye are different, the human visual system can detect some amount of 3-D depth in each eye. A parallax and occlusion difference occurs when the eye moves (stills from some video sequences below). Since the eyes constantly move and fixate (saccades), depth can be detected.

GCOP may not be as big a factor as vergence and accommodation. I put it in the category of one of the many things that can cause people to perceive that they are not looking at the real world and may cause problems.

Conclusion

The marketing spin (I think I have heard this before) on VR optics is that they have “fixed foveated optics” in that there is a higher resolution in the center of the display. There is some truth that severe pincushion optical distortion improves the pixel density in the center, but it makes a mess of the rest of the display.

While the MQP’s optics have a bigger sweet spot, and the optical quality falls off less rapidly than the Quest 2’s Fresnel optics, they are still very poor by camera standards (on the right is the optical diagram for the 9-element RF16f2.8 lens, a relatively simple camera lens, used to take the main picture). VR optics must compromise due to space, cost, and, perhaps most importantly, supporting a very wide FOV.

With a monitor, there is only air between the eye and the display device with no loss of image quality, and there is no need to resample the monitor’s image when the user’s head moves like there is with a VR virtual monitor.

As the MQP’s pancake optics and most, if not all, other VR optics have major pincushion distortion, I fully expect the AVP’s will also. Regardless of the ppd, the MQP virtual monitor’s far left and right sides become difficult to read due to other optical problems. The image quality can be no better than its weakest link. If the AVP has 3X the pixels and roughly 1.75x the linear ppd, the optics must be much better than the MQP’s to deliver the same small readable text that a physical monitor can deliver.

Apple Vision Pro (Part 5A) – Why Monitor Replacement is Ridiculous

5 August 2023 at 17:53

Introduction

As I wrote in Apple Vision Pro (Part 1) regarding the media coverage of the Apple Vision Pro, “Unfortunately, I saw very little technical analysis and very few with deep knowledge of the issues of virtual and augmented reality. At least they didn’t mention what seemed to me to be obvious issues and questions.”

I have been working for the last month on an article to quantify why it is ridiculous to think that a VR headset, even one from Apple, will be a replacement for a physical monitor. In writing the article, I felt the need to include a lot of background material and other information as part of the explanation. As the article was getting long, I decided to break it into two parts, this being the first part.

The issues will be demonstrated using the Meta Quest Pro (MQP) because that is the closest headset available, and it also claims to be for monitor replacement and uses similar pancake optics. I will then translate these results to the higher, but still insufficient, resolution of the Apple Vision Pro (AVP). The AVP will have to address all the same issues as the MQP.

Office applications, including word processing, spreadsheets, presentations, and internet browsing, mean dealing with text. As this article will discuss, text has always been treated as a special case with some “cheating” (“hints” for grid fitting) to improve sharpness and readability. This article will also deal with the resolution issues of trying to fit a virtual monitor into a 3-D space.

For this set of articles, I will suspend my disbelief about the many other human-factor problems caused by trying to simulate a fixed monitor in VR and concentrate on the readability of text.

Back to the Future with Very Low Pixels Per Degree (ppd) with the Apple Vision Pro

Working on this article reminded me of lessons learned in the mid-1980s when I was the technical leader of the TMS34010, the first fully programmable graphics processor. The TMS340 development started in 1982 before an Apple Macintosh (1984) or Lisa (1983) existed (and they were only 1-bit per pixel). But like those products, my work on the 34010 was influenced by Xerox PARC. At that time, only very expensive CAD and CAM systems had “bitmapped graphics,” and all PC/Home Computer text was single-size and monospaced. They were very low resolution if they had color graphics (~320×200 pixels). IBM introduced VGA (640×480) and XGA (1024×768) in 1987, which were their first IBM PC square pixel color monitors.

The original XGA monitor, considered “high resolution” at the time, had a 16” diagonal and 82ppi, which translates to 36 to 45 pixels per degree (ppd) from 0.5 meters to 0.8 meters (typical monitor viewing distance), respectively. Factoring in the estimated FOV and resolutions, the Apple Vision Pro is between 35 and 40 ppd, or about the same as a 1987 monitor.

So it is time to dust off the DeLorean and go Back to the Future of the mid-1980s and the technical issues with low ppd displays. Only it is worse this time because, in the 1980s, we didn’t have to resample/rescale everything in 3-D space when the user’s head moves to give the illusion that the monitor isn’t moving.

For more about my history in 1980s computer graphics and GPUs, see Appendix 1: My 1980s History with Bitmapped Fonts and Multiple Monitors.

The question is, “Would People?” Not “Could People?” Use an Apple Vision Pro (AVP) as a Computer Monitor

With their marketing and images (below), Apple and Meta suggest that their headsets will work as a monitor replacement. Yes, they will “work” as a monitor if you are desperate and have nothing else, but having multiple terrible monitors is not a solution many people will want. These marketing concepts fail to convey that each virtual monitor will have low effective resolution, forcing the text to be blown up to be readable and thus fitting less content per monitor. They also fail to convey that the text looks grainy and shimmers (more on this in a bit).

Meta Quest Pro (left) and Apple Vision Pro (right) have similar multiple monitor concepts.

Below is a through-the-lens picture of MQP’s Horizons Virtual Desktop. It was taken through the left eye’s optics with the camera centered for best image quality, so it shows more of the left side of the binocular FOV. Almost all the horizontal FOV for the left eye is shown in the picture, but the camera slightly cuts off the top and bottom.

MQP Horizon Desktop – Picture via the Left Eye Optics (camera FOV 80°x64°)

Below, for comparison, is my desktop setup with a 34” 22:9 3440×1440 monitor on the left and a 27” 4K monitor on the right. The combined cost of the two monitors is less than $1,000 today. The 22:9 monitor display setting is 100% scale (in Windows display settings) and has 11pt fonts in the spreadsheet. The righthand monitor is set for 150% scaling with 11pt fonts, netting fonts that are physically the same size.

My office setup – 34” 22:9 3440×1440 (110 PPI) widescreen (left) & 27” 16:9 4K (163 PPI) Monitor (right)

Sitting 0.5 to 0.8 meters away (typical desktop monitor distance), I would judge the 11pt font on either of the physical monitors as much more easily readable than the 11pt font on the Meta Quest Pro with 150% scaling, even though the MQP’s “11pt” is angularly about 1.5x bigger (as measured via the camera). The MQP’s text is fuzzier, grainier, and scintillates/shimmers. I could fit over six times the legible text on the 34” 22:9 monitor, and over four times on the 27” 4K, compared to the MQP. With higher angular resolution, the AVP will be better than the MQP but still well behind the physical monitors in the amount of legible text.

Note on Windows’ Scaling

In Windows, 100% means a theoretical 96 dots per inch. Windows factors in the information reported to it by the monitor (in this case, by the MQP’s software) to give a “Scale and Layout” recommendation (right). The resolution reported to Windows by the MQP Horizons virtual monitor is 1920×1200, and the recommended scaling was 150%. This setting is what I used for most pictures other than the ones called out as being at 100% or 175%.
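A quick sketch of what those scaling percentages mean in pixels for an 11pt font on that 1920×1200 virtual monitor (nominal em size, before any of the 3-D resampling discussed later):

PT_PER_INCH, WINDOWS_DPI = 72, 96

def font_px(points, scale_pct):
    # Nominal em size in pixels at a given Windows scaling percentage.
    return points / PT_PER_INCH * WINDOWS_DPI * scale_pct / 100

for scale in (100, 150, 175):
    print(f"11pt at {scale}%: ~{font_px(11, scale):.1f} px nominal em size")
# 100% -> ~14.7 px, 150% -> ~22.0 px, 175% -> ~25.7 px, before any 3-D resampling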

For more on the subject of how font “points” are defined, see Appendix 3: Confabulating typeface “points” (pt) with Pixels – A Brief History.

Optics

I’m not going to go into everything wrong with VR optics, and this article deals with being able to read text in office applications. VR optics have a lot of constraints in terms of cost, space, weight, and wide FOV. While pancake optics are a major improvement over the more common Fresnel lenses, to date, they still are poor optically (we will have to see about the AVP).

While not bad in the center of the FOV, they typically have severe pincushion distortion and chroma (color) aberrations. Pancake optics are more prone to collecting and scattering light, causing objects to glow on dark backgrounds, contrast reduction, and ghosts (out-of-focus reflection). I discussed these issues with Pancake Optics in Meta (aka Facebook) Cambria Electrically Controllable LC Lens for VAC. With computer monitors, there are no optics to cause these problems.

Optical Distortion

As explained in Meta Quest Pro (Part 1) – Unbelievably Bad AR Passthrough, the Meta Quest Pro rotates the two displays for the eyes ~20° to clear the nose. The optics also have very large pincushion distortion. The display processor on the MQP pre-corrects digitally for the display optics’ severe pincushion distortion. This correction comes at some loss of fidelity in the resampling process.

The top right image shows the video feed to the displays. The distortion and rotation have been digitally corrected in the lower right image, but other optical problems are not shown (see the through-the-lens pictures in this article).

There is also an optical “cropping” of the left and right eye displays, indicated by the Cyan and Red dashed lines, respectively. The optical cropping shown is based on my observations and photographs.

The pre-distortion correction is certainly going to hurt the image quality. It is likely that the AVP, using similar pancake optics, will have similar needs for pre-correction. Even though the MQP displays are rotated (no word on the AVP), there are so many other transforms/rescalings, including the transforms in 3-D space required to make the monitor(s) appear stationary, that if the rotation is combined with them (rather than done as a separate transform), the rotation of the display’s effect on resolution may be negligible. The optical quality distortion and the loss of text resolution when transformed in 3-D space are more problematic.

Binocular Overlap and Rivalry

One of the ways to improve the overall FOV with a binocular system is to have the FOVs of the left and right eyes only partially overlap (see figure below). The paper Perceptual Guidelines for Optimizing Field of View in Stereoscopic Augmented Reality Displays and the article Understanding Binocular Overlap and Why It’s Important for VR Headsets discuss the issues with binocular overlap (also known as “stereo overlap”). Most optical AR/MR systems have a full or nearly full overlap, whereas VR headsets often have a significant amount of partial overlap.

Partial overlap increases the total FOV when combining both eyes. The problem with partial overlap occurs at the boundary where one FOV ends in the middle of the other eye’s FOV. One eye sees the image fade out to black, whereas the other sees the image. This is a form of binocular rivalry, and it is left to the visual cortex to sort out what is seen. The visual cortex will mostly sort it out in a desirable way, but there will be artifacts. Most often, the visual cortex will pick the eye that appears brighter (i.e., the cortex picks one and does not average), but there can be problems with the transition area. Additionally, where one is concentrating can affect what is seen/perceived.

In the case of the MQP, the region of binocular overlap is slightly less than the width of the center monitor in Meta’s Horizon’s Desktop when viewed from the starting position. Below left shows the view through the left eye when centering the monitor in the binocular FOV.

When concentrating on a cell in the center, I didn’t notice a problem, but when I took in the whole image, I could see these rings, particularly in the lighter parts of the image.  

The Meta Quest 2 appears to have substantially more overlap. On the left is a view through the left eye with the camera positioned similarly to the MQP (above left). Note how the left eye’s FOV overlaps the whole central monitor. I didn’t notice the transition “rings” with the Meta Quest 2 as I did with the MQP.

Binocular overlap is not one of those things VR companies like to specify; they would rather talk about the bigger FOV.

In the case of the AVP, it will be interesting to see the amount of binocular overlap in their optics and if it affects the view of the virtual monitors. One would like the overlap to be more than the width of a “typical” virtual monitor, but what does “typical” mean if the monitors can be of arbitrary size and positioned anywhere in 3-D space, as suggested in the AVP’s marketing material?

Inscribing a virtual landscape-oriented monitor uses about half of the vertical pixels of the headset.

The MQP’s desktop illustrates the basic issues of inscribing a virtual monitor into the VR FOV while keeping the monitor stationary. There is some margin for allowing head movement without cutting off the monitor, which would be distracting. Additionally, the binocular overlap cutting off the monitor is discussed above.

As discussed in more detail, the MQP uses a 16:10 aspect ratio, 1920×1200 pixel “virtual monitors” (the size it reports to Windows). The multiple virtual monitors are mapped into the MQP’s 1920×1800 physical display. Looking straight ahead, sitting at the desktop, you see the central monitor and about 30% of the two side monitors.

The center monitor’s center uses about 880 pixels, or about half of the 1800 vertical pixels of the MQP’s physical display. The central monitor behaves as if it is about 1.5 meters (5 feet) away, or about 2 to 3 times the distance of a typical computer monitor. This makes “head zooming” (leaning in to make the image bigger) ineffective.

Apple’s AVP has a similar FOV and will have similar limitations in fitting virtual monitors. There is the inevitable compromise between showing the whole monitor, with some latitude for the user moving their head, while avoiding cutting off the sides of the monitor.

Simplified Scaling Example – Rendering a Pixel Size Dot

Typical readable text has a lot of fine, high-contrast features that will be on the order of one pixel wide, such as the stroke and dot of the letter “i.” The problems with drawing a single pixel-size dot in 3-D space illustrate some of the problems.

Consider drawing a small circular dot that, after all the 3-D transforms, is the size of about one pixel. In the figure below, the pixel boundaries are shown with blue lines. The four columns in the figure show a few of an infinite number of relationships between a rendered dot and the pixel grid.

The first row shows the four dots relative to the grid. The nearest pixel is turned on in the second row based on the centroid. In row three, a simple average is used to draw the pixel where the average of 4 pixels should equal the brightness of one pixel. The fourth row shows a low-pass filter of the virtual dots. The fifth row renders the pixels based on the average value of the low-pass filtered version of the dots.

The centroid method is the sharpest and keeps the size of the dot the same, but the location will tend to jump around with the slightest head movement. If many dots formed an object, the shape would appear to wriggle. With the simple average, the “center of mass” is more accurate than the centroid method, but the dot changes shape dramatically based on alignment/movement. The average of the low-pass filter method is better in terms of center of mass, and the shape changes less based on alignment, but now a single pixel size circle is blurred out over 9 pixels.

There are many variations to resampling/scaling, but they all make tradeoffs. A first-order tradeoff is between wiggling (changing in shape and location) with movement versus sharpness. A big problem with text rendered on low-ppd displays, including the Apple Vision Pro, is that many features, from periods to the dots of letters to the stroke width of small text fonts, will be close to 1 pixel.
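Below is a small Python sketch in the spirit of the figure above, comparing the nearest-pixel (centroid) and area-average renderings of a one-pixel dot at a few sub-pixel offsets. It is only an illustration of the tradeoff, not the MQP’s or AVP’s actual resampling pipeline:

import numpy as np

SS = 32  # supersamples per pixel (assumption for this sketch)

def render_dot(offset_x, offset_y, grid=4):
    """Render a 1-pixel-diameter dot onto a grid x grid pixel array two ways."""
    size = grid * SS
    ys, xs = (np.mgrid[0:size, 0:size] + 0.5) / SS   # sample coords in pixel units
    cx, cy = grid / 2 + offset_x, grid / 2 + offset_y
    dot = ((xs - cx) ** 2 + (ys - cy) ** 2) <= 0.5 ** 2  # disk of radius half a pixel

    averaged = dot.reshape(grid, SS, grid, SS).mean(axis=(1, 3))  # per-pixel coverage
    centroid = np.zeros((grid, grid))
    centroid[int(cy), int(cx)] = 1.0   # light only the pixel containing the center
    return centroid, averaged

for off in (0.0, 0.25, 0.5):
    c, a = render_dot(off, off)
    print(f"offset {off:0.2f}: centroid lights {int(c.sum())} pixel(s); "
          f"area-average spreads over {int((a > 0.01).sum())} pixel(s), total={a.sum():.2f}")

Depending on the sub-pixel offset, the same dot either stays a single crisp pixel that jumps around or gets spread as dimmer gray over up to four pixels, which is the tradeoff described above.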

Scaling text – 40+ Years of Computer Font Grid Fitting (“Cheating”) Exposed

Since the beginning, personal computers have dealt with low pixels-per-inch monitors, translating into low pixels per degree based on typical viewing distances. Text is full of fine detail and often has perfectly horizontal and vertical strokes that, even with today’s higher PPI monitors, cause pixel alignment issues. Text is so important and so common that it gets special treatment. Everyone “cheats” to make text look better.

The fonts need to be recognizable without making them so big that the eye has to move a lot to read words, and without making the content less dense with less information on a single screen. Big fonts produce less content per display and more eye movement, making the eye muscles sore.

In the early to mid-1980s, PCs moved from rough-looking fixed-spaced text to proportionally spaced text with carefully hand-crafted fonts, and only a few font sizes were available. Font edges are also smoothed (antialiased) to make them look better. Today, most fonts are rendered from a model with “hints” that help the fonts look better on a pixel grid. TrueType, originally developed by Apple as a workaround to paying royalties to Adobe, is used by both Apple and MS Windows and includes “hints” in the font definitions for grid fitting (see: Windows hinting and Apple hinting).

Simplistically, grid fitting tries to make horizontal and vertical strokes of a font land on the pixel grid by slightly modifying the shape and location (vertical and horizontal spacing) of the font. Doing so requires less smoothing/antialiasing without making the font look jagged. This works because computer monitor pixels are on a rectangular grid, and in most text applications, the fonts are drawn in horizontal rows.
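A toy sketch of the principle (not TrueType’s actual hinting machinery): a one-pixel-wide vertical stroke rasterized with simple area coverage, first at a fractional position and then snapped (“hinted”) to the pixel grid:

import numpy as np

def rasterize_stroke(left, width=1.0, n=8):
    """Per-pixel coverage of a vertical stroke spanning [left, left + width)."""
    cov = np.zeros(n)
    for px in range(n):
        cov[px] = max(0.0, min(px + 1, left + width) - max(px, left))
    return cov

print("unhinted (left=2.5):", np.round(rasterize_stroke(2.5), 2))  # two gray half-pixels
print("hinted   (left=2.0):", np.round(rasterize_stroke(2.0), 2))  # one solid pixel

The unhinted stroke smears into two dim pixels, while the snapped stroke lands on a single solid pixel, which is what makes grid-fitted text read as sharper on a low-PPI monitor.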

Almost all font rendering is grid fit, just some more than others (see, from 2007, Font rendering philosophies of Windows & Mac OS X). Apple (and Adobe) have historically tried to keep the text size and spacing more accurate at some loss in font sharpness and readability on low PPI monitors (an easy solution for Apple, as they expect you to buy a higher PPI monitor). MS Windows with ClearType and Apple with their LCD font smoothing have options to try and improve fonts further by taking advantage of LCDs with side-by-side red-green-blue subpixels.

But this whole grid fitting scheme falls apart when the monitors are virtualized. Horizontal and vertical strokes transform into diagonal lines. Because grid fitting won’t work, the display of a virtual monitor needs to be much higher in angular resolution than a physical monitor to show a font of the same size with similar sharpness. Yet today and for the foreseeable future, VR displays are much lower resolution.

For more on the definition of font “points” and their history with Windows and Macs, see Appendix 3: Confabulating typeface “points” (pt) with Pixels – A Brief History.

Rendering Options: Virtual Monitors Fixed in 3-D Space Breaks the “Pixel Grid.”

The slightest head movement means that everything has to be re-rendered. The “grid” to which you want to render text is not the virtual monitor but that of the headset’s display. There are at least two main approaches:

  1. Re-render everything from scratch every frame – This will give the best theoretical image quality but is very processor intensive and will not be supported by most legacy applications. Simply put, these applications are structured to draw in terms of physical pixels of a fixed size and orientation rather than everything drawn virtually.
  2. Render to a “higher” resolution (if possible) and then scale to the headset’s physical pixels.
    • One would like the rendering to be at least 2X (linearly, 4X the pixels) of the physical pixels of the headset covering the same area to keep from having significant degradation in image quality after the scaling-down process.
    • The higher-resolution virtual image transformed onto the surface (which might be curved itself) of the virtual monitor in 3-D space. Virtual monitor processing can become complex if the user can put multiple monitors here, there, and everywhere that can be viewed from any angle and distance. The rendering resolution needed for each virtual monitor depends on the virtual distance from the eye.
    • Even with this approach, there are “application issues” from the legacy of 40+ years of PCs dealing with fixed pixel grids.
    • The grid stretching (font hinting) becomes counterproductive since the hints fit the text to the grid of the virtual rather than the physical display.

Systems will end up with a hybrid of the two approaches mixing “new” 3-D applications with legacy office applications.

Inscribing a virtual landscape-oriented monitor uses about half of the vertical pixels of the headset.

The MQP’s Horizons appears to render the virtual monitor(s) and then re-render them in 3-D space along with the cylindrical effect plus pre-correction for their Pancake lens distortion.

The MQP’s desktop illustrates the basic issues of inscribing a virtual monitor into the VR FOV while keeping the monitor stationary. There is some margin for allowing head movement without cutting off the monitor, which would be distracting. Additionally, the binocular overlap cutting off the monitor is discussed above.

The MQP uses a 16:10 aspect ratio, 1920×1200 pixel “virtual monitors.” The multiple virtual monitors are mapped into the MQP’s 1920×1800 physical display. Looking straight ahead, sitting at the desktop, you see the central monitor and about 30% of the two side monitors.

The virtual monitor’s center uses about 880 pixels, or about half of the 1800 vertical pixels of the MQP’s physical display, or 64% of the 1200 vertical pixels reported to Windows, when used at the desktop.

The central monitor behaves like it is about 1.5 meters (5 feet) away or about 2 to 3 times the distance of a typical computer monitor. This makes “head zooming” (leaning in to make the image bigger) much less effective (by a factor of 2 to 3X).
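A rough illustration of the head-zooming point, assuming (purely for the example) a 0.7-meter-wide monitor and a 0.25-meter lean-in; both numbers are arbitrary:

import math

def angular_width_deg(width_m, dist_m):
    return 2 * math.degrees(math.atan(width_m / 2 / dist_m))

WIDTH = 0.7   # assumed monitor width in meters (arbitrary)
LEAN = 0.25   # how far the user leans in, in meters (arbitrary)

for dist in (0.5, 1.5):
    before = angular_width_deg(WIDTH, dist)
    after = angular_width_deg(WIDTH, dist - LEAN)
    print(f"monitor {dist} m away: {before:.0f} deg -> {after:.0f} deg "
          f"({after / before:.2f}x) after leaning in {LEAN} m")

The same lean-in makes a nearby physical monitor noticeably bigger but barely changes a virtual monitor placed 1.5 meters away.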

Apple’s AVP has a similar FOV and will have similar limitations in fitting virtual monitors. There is the inevitable compromise between showing the whole monitor, with some latitude for the user moving their head, while avoiding cutting off the sides of the monitor.

The pre-distortion correction is certainly going to hurt the image. It is possible that the AVP, using similar pancake optics, will have similar needs for pre-correction (most, if not all, VR optics have significant pincushion distortion – a side effect of trying to support a wide FOV). The MQP displays are rotated to clear the nose (no word on the AVP). However, this can be rolled into the other transformations and probably does not significantly impact the processing requirement or image quality.

A simplified example of scaling text

The image below, one cell of a test pattern with two lines of text and some 1- and 2-pixel-wide lines, shows a simulation (in Photoshop) of the scaling process. For this test, I chose a 175% scaled 11pt font, which should have roughly the same number of pixels as an 11pt font at 100% on an Apple Vision Pro. This simulation greatly simplifies the issue but shows what is happening with the pixels. The MQP and AVP must support resampling with 6 degrees of freedom in the virtual world and a pre-correcting distortion for the optics (and, in the case of MQP’s Horizons, curving the virtual monitor).

Source Cell (left), Simulated 64% scaling (right)
  • Sidenote: This one test pattern accidentally has an “i” rather than a “j” between the g & k that I discovered late into editing.

The pixels have been magnified by 600% (in the full-size image), and a grid has been shown to see the individual pixels. On the top right, the source has been scaled by 64%, about the same amount MQP Horizons scales the center of the 1920×1200 virtual monitor when sitting at the desktop. The bottom right image scales by 64% and rotates by 1° to simulate some head tilt.

If you look carefully at the scaled one- and two-pixel-wide lines in the simulation, you will notice that sometimes the one-pixel-wide lines are as wide as the 2-pixel lines but dimmer. You will also see that what started as identical fonts from line to line look different when scaled, even without any rotation. In the through-the-lens cells, the fonts have further degradation/softening as they are displayed on color subpixels.
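For anyone who wants to reproduce something like this simulation outside of Photoshop, below is a minimal Python/Pillow sketch that scales a synthetic pattern of 1- and 2-pixel-wide lines by 64% and rotates it by 1°. It greatly simplifies the real pipeline (no 3-D transform, no optics pre-correction, no subpixels), and the pattern and output file name are made up for the example rather than being the actual test pattern:

import numpy as np
from PIL import Image

# Synthetic test pattern: 1-pixel-wide vertical lines plus one 2-pixel-wide line.
src = np.zeros((200, 200), dtype=np.uint8)
src[:, 20::20] = 255      # 1-px lines every 20 px
src[:, 100:102] = 255     # a 2-px line
img = Image.fromarray(src)

w, h = img.size
scaled = img.resize((int(w * 0.64), int(h * 0.64)), resample=Image.BILINEAR)
scaled_rot = img.rotate(1.0, resample=Image.BILINEAR).resize(
    (int(w * 0.64), int(h * 0.64)), resample=Image.BILINEAR)

# After resampling, the formerly identical 1-px lines land differently on the new
# grid: some stay roughly 1 px wide, others smear across 2 px at lower intensity.
row = np.asarray(scaled)[64]
print([int(v) for v in row if v > 0][:12])
scaled_rot.save("scaled_rotated.png")  # hypothetical output file name for inspection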

Below is what the 11pt 175% fonts look like via the lens of the MQP in high enough resolution to see the color subpixels. By the time the fonts have gone through all the various scaling, they are pretty rounded off. If you look closely at the same font in different locations (say the “7” for the decimal point), you will notice every instance is different, whereas, on a conventional physical monitor, they would all be identical due to grid fitting.

MQP 175% Scaled 11pt fonts

For reference, the full test pattern and the through-the-lens picture of the virtual monitor are given below (Click on the thumbnails to see the full-resolution images). The camera’s exposure was set low so the subpixels would not blow out and lose all their color.

Test Pattern Cells Replicated
A full test pattern with a center cell and an off-center cell indicated by red rectangles (exposed to show subpixels)

Scintillating Text

When looking through the MQP, the text scintillates/sparkles. This occurs because no one can keep their head perfectly still, and every text character is being redrawn on each frame with slightly different alignments to the physical pixels causing the text to wriggle and scintillate.

Scaling/resampling can be done with sharper or softer processing. Unfortunately, the sharper the image after resampling, the more it will wriggle with movement. The only way to avoid this wriggling and have sharp images is to have a much higher ppd. The MQP has only 22.5ppd, and the AVP at about 40ppd should be better, but I think they would need about 80ppd (about the limit of good vision and what Apple retinal monitors support) to eliminate the problems.

The MQP (and most displays) uses spatial color with individual red, green, and blue subpixels, so the wriggling is at the subpixel level. The picture below shows the same text with the headset moving slightly between shots.

Below is a video from two pictures taken with the headset moved slightly between shots to demonstrate the scintillation effect. The 14pt font on the right has about the same number of pixels as an 11pt font with the resolution of the Apple Vision Pro.

Scintillation/wiggle of two frames (right-click > “loop” -> play triangle to see the effect)

Conclusion

This will not be a close call, and using any VR headset, including the MQP and Apple Vision Pro, as a computer monitor replacement fails any serious analysis. It might impress people who don’t understand the issues and can be wowed by a flashy short demo, and it might be better than nothing. But it will be a terrible replacement for a physical monitor/display.

I can’t believe Apple seriously thinks a headset display with about 40ppd will make a good virtual monitor. Even if some future VR headset has 80ppd and over a 100-degree FOV, double the AVP linearly or 4X the pixels, it will still have problems.

Part 5B of this series will include more examples and more on my conclusions.

Appendix 1: My 1980s History with Bitmapped Fonts and Multiple Monitors

All this discussion of fonts and 3-D rendering reminded me of those early days when the second-generation TMS34020 almost got designed into the color Macintosh (a 1985 faxed letter from Steve Perlman from that era – right). I also met with Steve Jobs at NeXT and mentioned Pixar to him before Jobs bought them (discussed in my 2011 blog article), and with John Warnock, a founder of Adobe, who was interested in doing a port of PostScript to the 34010 in that same time frame.

In the 1980s, I was the technical leader for a series of programs that led to the first fully programmable graphics processor, the TMS34010, and the Multi-ported Video DRAM (which led to today’s SDRAM and GDRAM) at Texas Instruments (TI) (discussed a bit more here and in Jon Peddie’s 2019 IEEE article and his 2022 book “The History of the GPU – Steps to Invention”).

In the early 1980s, Xerox PARC’s work influenced my development of the TMS34010, including Warnock’s 1980 paper (while still at PARC), “The Display of Characters Using Gray Level Sample Arrays,” and the series of PARC’s articles in BYTE Magazine, particularly the August 1981 edition on Smalltalk which discussed bit/pixel aligned transfers (BitBlt) and the use of a “mouse” which had to be explained to BYTE readers as, “a small mechanical box with wheels that lets you quickly move the cursor around the screen.”

When defining the 34010, I had to explain to TI managers that the mouse would be the next big input device for ergonomic reasons, not the lightpen (used on CAD terminals at TI in the early 1980s), which requires the user to keep their arm floating in the air, which quickly becomes tiring. Most AR headset user interfaces make users suffer by having to float their hands to point, select, and type, so the lessons of the past are being relearned.

In the late 1980s, a systems engineer for a company I had never heard of called “Bloomberg,” who wanted to support 2 to 4 monitors per PC graphics board, came to see us at TI. In a time when a single 1024×768 graphics card could cost over $1,200 (about $3,000 in 2023 dollars), this meeting stood out. The Bloomberg engineer explained how Wall Street traders would pay a premium to get as much information as possible in front of them, and a small advantage on a single trade would pay for the system. It was my first encounter with someone wanting multiple high-resolution monitors per PC.

I used to have a life designing cutting-edge products from blank sheets of paper (back then, it was physical paper) through production and marketing; in contrast, I blog about other people’s designs today. And I have dealt with pixels and fonts for over 40 years.

1982

Below is one of my early presentations on what was then called the “Intelligent Graphics Controller” (for internal political reasons, we could not call it a “processor”), which became the TMS34010 Graphics System Processor. You can also see the state of 1982 presentation technology with a fixed-spaced font and the need to cut and paste hand drawings. This slide was created in Feb 1982. The Apple Lisa didn’t come out until 1983, and the Mac in 1984.

1986 and the Battle with Intel for Early Graphics Processor Dominance

We announced the TMS34010 in 1986, and our initial main competitor was the Intel 82786. But the Intel chip was fixed-function “hardware” and lacked the 34010’s programmability, and to top it off, the Intel chip had many bugs. In just a few months, the 82786 was a non-factor. The copies of a few of the many articles below capture the events.

In 1986, we wrote two articles on the 34010 in the IEEE CG&A magazine. You can see from the front pages of the articles the importance we put on drawing text. Copies of these articles are available online (click on the thumbnails below to be linked to the full articles). You may note the similarity of the IEEE CG&A article’s first figure to the one in the 1981 Byte Smalltalk article, where we discussed extending “BitBlt” to the color “PixBlt.”

In the late 1980s, we started publishing a 3rd-party guide of all the companies developing hardware and software for the 340 family of products, and the June 1990 4th Edition contained over 200 hardware and software products.

Below is a page from the TMS340 TIGA Graphics Library, including the font library. In the early 1980s, everyone had to develop their own font libraries. There was insufficient processing power to render fonts with “hints” on the fly, so we did well to have bitmapped fonts with little or no antialiasing/smoothing.

Sadly, we were a bit ahead of our time, and Texas Instruments had, by the late 1980s, fallen far behind TSMC and many other companies in semiconductor technology for making processors. Our competitors, such as ATI (NVidia wasn’t founded until 1993), could get better semiconductor processing at a lower cost from the then-new 3rd-party semiconductor fabs such as TSMC (founded in 1987).

Appendix 2: Notes on Pictures

All the MQP pictures in these two articles were taken through the left eye optics using either the Canon R5 (45mp) with an RF16mmf2.8 or 28mmf2.8 “pancake” lens or the lower resolution Olympus E-M5D-3 (20mp) with a 9-18mm zoom lens at 9mm. Both cameras feature a “pixel shift” mode that moves the sensor, giving 405mp (24,576 x 16,384) for the R5 and 80mp (10,368 x 7,776 pixels) for the M5D-3, and all the pictures used this feature as it gave better resolution, even if the images were later scaled down.

High-resolution pictures of computer monitors with color subpixels and any scaling or compression cause issues with color and intensity moiré (false patterning) due to the “beat frequency” between the camera’s color sensor and the display device. In this case, there are many different beat frequencies between both the pixels and color subpixels of the MQP’s displays and the cameras. Additionally, the issues of the MQP’s optics (which are poor compared to a camera lens) vary the resolution radially. I found for the whole FOV image, the lower-resolution Olympus camera didn’t have nearly as severe a moiré issue (only a little in intensity and almost none in color). In contrast, it was unavoidable with the R5 with the 16mm lens (see comparison below).

Lower Resolution Olympus D3 with very little moiré
Higher Resolution Canon R5 “catches” moiré

The R5 with the 28mmf2.8 lens and pixel shift mode could capture the MQP’s individual red, green, and blue subpixels (right). In the picture above, the two “7s” on the far right have a little over 1-pixel-wide horizontal and diagonal strokes. The two 7’s are formed by different subpixels because they are slightly differently aligned in 3D space. The MQP’s displays are rotated by about 20°; thus, the subpixels are on a 20° diagonal (about the same angle as the lower stroke on the 7’s). Capturing at this resolution, where the individual red, green, and blue sub-pixels are visible, necessitated underexposing the overall image by about 8X (3 camera stops). Otherwise, some color dots (particularly green) will “blow out” and shift the color balance.

As seen in the full-resolution crop above, each color dot in the MQP’s display device covers about 1/8th of the area of a pixel, with the other two colors and black filling the rest of the area of a pixel. Note how the scaled-down version of the same pixels on the right look dim when the subpixels are averaged together. The camera exposure had to be set about three stops lower (8 times in brightness as stops are a power of two) to avoid blowing out the subpixels.

Appendix 3: Confabulating typeface “points” (pt) with Pixels – A Brief History

Making a monitor appear locked in 3-D spaces breaks everything about how PCs have dealt with rendering text and most other objects. Since the beginning of PC bitmap graphics, practical compromises (and shortcuts) have been made to reduce processing and to make images look better on affordable computer monitors. A classic compromise is the font “point,” defined (since 1517) at ~1/72nd of an inch.

So, in theory, when rendering text, a computer should consider the physical size of the monitor’s pixels. Early bitmapped graphics monitors in the mid-1980s had about 60 to 85 ppi, so PC developers (except Adobe with their PostScript printers, whose founders came from Xerox PARC and also influenced Apple), lacking the processing power to deal with it and needing to get on with making products, confabulated “points” and “pixels.” Display font “scaling” helps correct this early transgression.

Many decades ago, MS Windows decided that a (virtual) 96 dots per inch (DPI) would be their default “100%” font scaling. An interesting Wikipedia article on the convoluted logic that led to Microsoft’s decision is discussed here. Conversely, Apple stuck with 72 PPI as their basis for fonts and made compromises with font readability on lower-resolution monitors with smaller fonts. Adherence to 72 PPI may explain why a modern Apple Mac 27” monitor is 5K to reach 218 ppi (within rounding of 3×72=216). In contrast, the much more common and affordable 27” 4K monitor has 163 ppi, not an integer multiple of 72, and Macs have scaling issues with 3rd party monitors, including the very common 27” 4k.
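The PPI figures above are easy to check:

import math

def ppi(h_px, v_px, diag_in):
    return math.hypot(h_px, v_px) / diag_in

print(f"27-inch 5K (5120x2880): {ppi(5120, 2880, 27):.0f} PPI")  # ~218 (about 3 x 72)
print(f"27-inch 4K (3840x2160): {ppi(3840, 2160, 27):.0f} PPI")  # ~163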

Microsoft and Apple have tried to improve text by varying the intensity of the color subpixels. Below is an example from MS Windows with “ClearType” for a series of different-size fonts. Note particularly the horizontal strokes at the bottom of the numbers 1, 2, and 7 below: they are 1 pixel wide with no smoothing from Calibri 9 to 14pt, jump to 2 pixels wide with a little smoothing at 18pt, and then at 20pt become 2 pixels wide with no vertical smoothing.

Apple has a similar function known as “LCD Font Smoothing.” Apple has put low-ppd text rendering issues in its rearview mirror with “retinal resolution” displays for Mac laptops and monitors. “Retinal resolution” translates to more than 80ppd when viewed normally, which is from about 12” (0.3 meters) for handheld devices (e.g., an iPhone) or about 0.5 to 0.8 meters for a computer.

The chart was edited for space, and ppd information was added.

Apple today sells “retina monitors” with a high 218 PPI, which makes text grid fitting less of an issue. But as the chart from Mac external displays for designers and developers (right) shows, Mac systems have resolution and performance issues with in-between-resolution monitors.

The Apple Vision Pro has less than 40 ppd, much lower than any of these monitors at normal viewing distance. And that is before all the issues with making the virtual monitor seem stationary as the user moves.
