Instagram tightens restrictions on teen use, putting parents in control

17 September 2024 at 14:00

Instagram is introducing Teen Accounts to automatically enroll young users into an app experience with built-in protections. The company announced that, beginning Tuesday, it will place all current and future accounts held by teenagers into a Teen Account. Instagram is also rolling out new updates to its parental supervision feature that allows […]

Instagram comments will soon be visible on Threads too

Dutch Cowboys

Laura Jenny

16 September 2024 at 08:55

The post Instagram comments will soon be visible on Threads too appeared first on DutchCowboys.

OpenAI previews its new Strawberry model

TechCrunch

Cody Corrall

14 September 2024 at 19:05

OpenAI this week unveiled a preview of OpenAI o1, also known as Strawberry. The company claims that o1 can more effectively reason through math and science, as well as fact-check itself by spending more time considering all parts of a query. The family of models is available in ChatGPT and via OpenAI’s API, though OpenAI […]

Meta reignites plans to train AI using UK users’ public Facebook and Instagram posts

TechCrunch

Paul Sawers

13 September 2024 at 16:25

Meta has confirmed that it’s restarting efforts to train its AI systems using public Facebook and Instagram posts from its U.K. userbase. The company claims it has “incorporated regulatory feedback” into a revised “opt-out” approach to ensure that it’s “even more transparent,” as its blog post spins it. It is also seeking to paint the move as […]

Meta is making its AI info label less visible on content edited or modified by AI tools

TechCrunch

Aisha Malik

12 September 2024 at 22:11

By making the AI info label harder to find, it might be easier for users to be deceived by content that was edited with AI, especially as editing tools become more and more advanced.

Threads makes it easier to evangelize the open social web with a new direct link feature

TechCrunch

Sarah Perez

12 September 2024 at 19:08

It’s a small advance, but one that speaks to Meta’s engineering team paying attention to how the fediverse community is trying to educate Threads users about the possibilities.

Meta, TikTok, and Snap pledge to participate in program to combat suicide and self-harm content

TechCrunch

Kyle Wiggers

12 September 2024 at 15:55

In an attempt to prevent suicide and self-harm content from spreading online, the nonprofit Mental Health Coalition (MHC) today announced a new program, Thrive, aimed at encouraging online platforms to share “signals” of potentially harmful material. Thrive, which counts Meta, Snap, and TikTok as founding members, will provide ways for platforms to share hashes — […]

Irish Big Tech watchdog digs into platforms’ content reporting mechanisms after DSA complaints

TechCrunch

Natasha Lomas

12 September 2024 at 14:22

Ireland’s media regulator said it is reviewing how major platforms let users report illegal content, following a high number of complaints.

WhatsApp brings Meta Verified, customized messages to small businesses in India

TechCrunch

Jagmeet Singh, Ivan Mehta

12 September 2024 at 09:44

WhatsApp is now letting small businesses in India sign up for a Meta Verified badge and giving them the ability to send customized messages to customers.

Meta is dropping its Apple Vision Pro competitor

Dutch Cowboys

Laura Jenny

23 August 2024 at 22:37

The post Meta is dropping its Apple Vision Pro competitor appeared first on DutchCowboys.

Meta Releases New Audio Ray Tracing Tool for More Immersive Soundscapes on Quest

Road to VR

Scott Hayden

31 July 2024 at 11:19

Meta announced it has released a new Acoustic Ray Tracing feature that will make it easier for developers to add more immersive audio to their VR games and apps.

Released in the Audio SDK for Unity and Unreal, the new Acoustic Ray Tracing tech is designed to automate the complex process of simulating realistic acoustics which is traditionally achieved through labor-intensive, manual methods.

With Meta’s Acoustic Ray Tracing, the company says in a developer blog post it can simulate sound reflections, reverberations, and things like diffraction, occlusion, and obstruction—all critical to making spatial audio closer to the real thing.

The new audio feature, which Meta calls “a more natural audio experience” than its older Shoebox Room Acoustics model, also supports complex environments.

“Our acoustics features can handle arbitrarily complex geometry, ensuring that even the most intricate environments are accurately simulated,” Meta says. “Whether your VR scene is a winding cave, a bustling cityscape, or an intricate indoor environment, our technology can manage the complexity without compromising on performance.”

The Acoustic Ray Tracing system is said to integrate with existing workflows, supporting popular middleware like FMOD and Wwise. Notably, Acoustic Ray Tracing is being used in the upcoming Quest exclusive Batman: Arkham Shadow, demonstrating its potential for creating immersive experiences.

“One of the standout benefits of our new acoustics features is their performance on mobile hardware. While other solutions in the market require powerful PCs due to their high performance cost, our SDK is optimized to run efficiently on mobile devices such as Quest headsets. This opens up new possibilities for high-quality audio simulation in mobile applications, making immersive audio more accessible than ever before,” the company says.

You can find out more about Meta’s Acoustic Ray Tracing here. You’ll also find documentation on Meta’s Audio SDK (Unity | Unreal) and Acoustic Ray Tracing (Unity | Unreal).

The post Meta Releases New Audio Ray Tracing Tool for More Immersive Soundscapes on Quest appeared first on Road to VR.

Netflix Discontinues Quest App Following Browser Streaming Quality Bump

Road to VR

Scott Hayden

17 July 2024 at 16:05

Netflix has pulled its video streaming app from the Quest content store.

As first reported by UploadVR, the long-neglected Netflix app for Quest is now gone from the store. If you already downloaded the app before then, you’ll find it no longer works.

That doesn’t mean you can’t watch Netflix on Quest, though. The streaming giant recently bumped streaming quality in the Quest Browser to 1080p, a stark contrast to the app’s resolution, which was capped at 480p. The app also notably didn’t support mixed reality passthrough or downloads.

Originally released in 2015 for Samsung Gear VR and developed by former Meta CTO John Carmack, the app experienced very few updates over the years, with the latest arriving in 2019 alongside the launch of the original Quest.

Why not simply develop a new official app? Netflix requires devices to be certified in order to push streaming beyond that 480p cap, which requires meeting technical requirements, submitting the device for testing, and even possibly negotiating a licensing agreement, which are all things Meta would have to initiate.

Notably, Quest has native apps for Amazon Prime Video, Pluto TV, and MLB, in addition to its own Meta TV app. It lacks Disney+, Hulu, Paramount+, and HBO Max.

The post Netflix Discontinues Quest App Following Browser Streaming Quality Bump appeared first on Road to VR.

Deepfake Porn Is Leading to a New Protection Industry

IEEE Spectrum Recent Content full text

Eliza Strickland

15 July 2024 at 14:00

It’s horrifyingly easy to make deepfake pornography of anyone thanks to today’s generative AI tools. A 2023 report by Home Security Heroes (a company that reviews identity-theft protection services) found that it took just one clear image of a face and less than 25 minutes to create a 60-second deepfake pornographic video—for free.

The world took notice of this new reality in January when graphic deepfake images of Taylor Swift circulated on social media platforms, with one image receiving 47 million views before it was removed. Others in the entertainment industry, most notably Korean pop stars, have also seen their images taken and misused—but so have people far from the public spotlight. There’s one thing that virtually all the victims have in common, though: According to the 2023 report, 99 percent of victims are women or girls.

This dire situation is spurring action, largely from women who are fed up. As one startup founder, Nadia Lee, puts it: “If safety tech doesn’t accelerate at the same pace as AI development, then we are screwed.” While there’s been considerable research on deepfake detectors, they struggle to keep up with deepfake generation tools. What’s more, detectors help only if a platform is interested in screening out deepfakes, and most deepfake porn is hosted on sites dedicated to that genre.

“Our generation is facing its own Oppenheimer moment,” says Lee, CEO of the Australia-based startup That’sMyFace. “We built this thing” (that is, generative AI) “and we could go this way or that way with it.” Lee’s company is first offering visual-recognition tools to corporate clients who want to be sure their logos, uniforms, or products aren’t appearing in pornography (think, for example, of airline stewardesses). But her long-term goal is to create a tool that any woman can use to scan the entire Internet for deepfake images or videos bearing her own face.

Another startup founder had a personal reason for getting involved. Breeze Liu was herself a victim of deepfake pornography in 2020; she eventually found more than 800 links leading to the fake video. She felt humiliated, she says, and was horrified to find that she had little recourse: The police said they couldn’t do anything, and she herself had to identify all the sites where the video appeared and petition to get it taken down—appeals that were not always successful. There had to be a better way, she thought. “We need to use AI to combat AI,” she says.

Liu, who was already working in tech, founded Alecto AI, a startup named after a Greek goddess of vengeance. The app she’s building lets users deploy facial recognition to check for wrongful use of their own image across the major social media platforms (she’s not considering partnerships with porn platforms). Liu aims to partner with the social media platforms so her app can also enable immediate removal of offending content. “If you can’t remove the content, you’re just showing people really distressing images and creating more stress,” she says.

Liu says she’s currently negotiating with Meta about a pilot program, which she says will benefit the platform by providing automated content moderation. Thinking bigger, though, she says the tool could become part of the “infrastructure for online identity,” letting people check also for things like fake social media profiles or dating site profiles set up with their image.

Can Regulations Combat Deepfake Porn?

Removing deepfake material from social media platforms is hard enough—removing it from porn platforms is even harder. To have a better chance of forcing action, advocates for protection against image-based sexual abuse think regulations are required, though they differ on what kind of regulations would be most effective.

Susanna Gibson started the nonprofit MyOwn after her own deepfake horror story. She was running for a seat in the Virginia House of Delegates in 2023 when the official Republican party of Virginia mailed out sexual imagery of her that had been created and shared without her consent, including, she says, screenshots of deepfake porn. After she narrowly lost the election, she devoted herself to leading the legislative charge in Virginia and then nationwide to fight back against image-based sexual abuse.

Her first win was a bill that the Virginia governor signed in April to expand the state’s existing “revenge porn” law to cover more types of imagery. “It’s nowhere near what I think it should be, but it’s a step in the right direction of protecting people,” Gibson says.

While several federal bills have been introduced to explicitly criminalize the nonconsensual distribution of intimate imagery or deepfake porn in particular, Gibson says she doesn’t have great hopes of those bills becoming the law of the land. There’s more action at the state level, she says.

“Right now there are 49 states, plus D.C., that have legislation against nonconsensual distribution of intimate imagery,” Gibson says. “But the problem is that each state is different, so it’s a patchwork of laws. And some are significantly better than others.” Gibson notes that almost all of the laws require proof that the perpetrator acted with intent to harass or intimidate the victim, which can be very hard to prove.

Among the different laws, and the proposals for new laws, there’s considerable disagreement about whether the distribution of deepfake porn should be considered a criminal or civil matter. And if it’s civil, which means that victims have the right to sue for damages, there’s disagreement about whether the victims should be able to sue the individuals who distributed the deepfake porn or the platforms that hosted it.

Beyond the United States is an even larger patchwork of policies. In the United Kingdom, the Online Safety Act passed in 2023 criminalized the distribution of deepfake porn, and an amendment proposed this year may criminalize its creation as well. The European Union recently adopted a directive that combats violence and cyberviolence against women, which includes the distribution of deepfake porn, but member states have until 2027 to implement the new rules. In Australia, a 2021 law made it a civil offense to post intimate images without consent, but a newly proposed law aims to make it a criminal offense, and also aims to explicitly address deepfake images. South Korea has a law that directly addresses deepfake material, and unlike many others, it doesn’t require proof of malicious intent. China has a comprehensive law restricting the distribution of “synthetic content,” but there’s been no evidence of the government using the regulations to crack down on deepfake porn.

While women wait for regulatory action, services from companies like Alecto AI and That’sMyFace may fill the gaps. But the situation calls to mind the rape whistles that some urban women carry in their purses so they’re ready to summon help if they’re attacked in a dark alley. It’s useful to have such a tool, sure, but it would be better if our society cracked down on sexual predation in all its forms, and tried to make sure that the attacks don’t happen in the first place.

Zuckerberg: Meta “almost ready” to Show Off Prototype AR Glasses

Road to VR

Scott Hayden

2 July 2024 at 17:30

A previous report maintained that Meta is getting ready to show off prototype AR hardware at the company’s upcoming Connect developer conference in September, which up until now has been tightly under wraps. Now Meta CEO Mark Zuckerberg says he’s “almost ready” to reveal a pair of “unmistakably [AR] glasses.”

Update (July 2nd, 2024): Zuckerberg sat down with Kane ‘Kallaway’ Sutter in a recent video interview where he revealed that the company’s prototype AR glasses are nearly ready to be shown off to the public.

“The glasses are, I think, going to be a big deal,” Zuckerberg said. “We’re almost ready to start showing the prototype version of the full holographic glasses. We’re not going to be selling it broadly; we’re focused on building the full consumer version rather than selling the prototype.”

Zuckerberg noted early testers were left with “a giddy reaction” when demoing the device, which is indeed set to be glasses and not a headset like HoloLens 2 or Quest 3:

The prototype version is “not the most stylish thing, but […] it’s unmistakably glasses, not a headset,” Zuckerberg confirmed.

The original article detailing the previous report follows below:

Original (March 5th, 2024): A report from Business Insider maintains Meta’s AR team has been tapped to get its ‘Orion’ AR glasses ready to unveil at Connect 2024, which typically happens in October. The report cites two people familiar with the matter, whose identities were confirmed by Business Insider.

Orion has been under development for the past nine years, however there is allegedly now “internal pressure to ensure a high level of performance” at Connect, which the company regularly uses to not only unveil new products, such as Quest 3, but also research projects and prototypes such as Project Aria, which when unveiled in 2020 showed off a bevy of sensors the company was using to train its AR perception systems and assess public perception of the technology.

It’s uncertain whether Orion is one and the same as Project Nazare, which Meta teased back in 2021, saying it would be the company’s “first full augmented reality glasses.” Back then, Meta CEO and founder Mark Zuckerberg outlined just how difficult it would be:

“There’s a lot of technical work to get this form-factor and experience right. We have to fit hologram displays, projectors, batteries, radios, custom silicon chips, cameras, speakers, sensors to map the world around you and more into glasses that are about 5mm thick. So we still have a ways to go with Nazare, but we’re making good progress,” Zuckerberg said.

Speaking to The Verge late last year, Meta CTO and Reality Labs Chief Andrew ‘Boz’ Bosworth described the company’s AR glasses as having been built on a “prohibitively expensive technology path.”

Notably, these are set to be ‘true’ AR glasses, and not HUD-based smartglasses like Google Glass, or a mixed reality headset, such as the company’s Quest line. Find out more about the difference between AR and smartglasses in our handy primer.

According to Business Insider, it’s expected that a consumer version of the AR glasses won’t be ready for a number of years, as previous reports maintain it could come as soon as 2027.

The post Zuckerberg: Meta “almost ready” to Show Off Prototype AR Glasses appeared first on Road to VR.

AWE 2024 Panel: The Current State and Future Direction of AR Glasses

KGOnTech

Karl Guttag

29 June 2024 at 23:22

Introduction

At AWE 2024, I was on a panel discussion titled “The Current State and Future Direction of AR Glasses.” Jeri Ellsworth, CEO of Tilt Five, Ed Tang, CEO of Avegant, Adi Robertson, Senior Reporter at The Verge, and I were on the panel, with Jason McDowell, The AR Show, moderating. Jason McDowell did an excellent job of moderation and keeping the discussion moving. Still, with only 55 minutes, including questions from the audience, we could only cover a fraction of the topics we had considered discussing. I’m hoping to reconvene this panel sometime. I also want to thank Dean Johnson, Associate Professor at Western Michigan University, who originated the idea and helped me organize this panel. AWE’s video of our panel is available on YouTube.

First, I will outline what was discussed in the panel. Then, I want to follow up on small FOV optical AR glasses and some back-and-forth discussions with AWE Legend Thad Starner.

Outline of the Panel Discussion

The panel covered many topics, and below, I have provided a link to each part of our discussion and added additional information and details for some of the topics.

0:00 Introductions
2:19 Apple Vision Pro (AVP) and why it has stalled. It has been widely reported that AVP sales have stalled. Just before the conference, The Information reported that Apple had suspended the Vision Pro 2 development and is now focused on a lower-cost version. I want to point out that a 1984 128K Mac 1 would cost over $7,000 adjusted for inflation, and the original 1977 Apple 2 4K computer (without a monitor or floppy drive) would cost about $6,700 in today’s dollars. I contend that utility and not price is the key problem with the AVP sales volume and that Apple is thus drawing the wrong conclusion.
7:20 Optical versus Passthrough AR. The panel discusses why their requirements are so different.
11:30 Mentioned Thad Starner and the desire for smaller FOV optical AR headsets. It turns out that Thad Starner attended our panel, but as I later found out, he arrived late and missed my mentioning him. Thad later questioned the panel. In 2019, I wrote the article FOV Obsession, which discussed Thad’s SPIE AR/VR/MR presentation about smaller FOV. Thad is a Georgia Institute of Technology professor and a part-time Staff Researcher at Google (including on Google Glass). He has continuously worn AR devices since his research work at MIT’s media lab in the 1990s.
13:50 Does “tethering make sense” with cables or wirelessly?
20:40 Does an AR device have to work outside (in daylight)?
26:49 The need to add displays to today’s Audio-AI glasses (ex. Meta Ray-Ban Wayfarer).
31:45 Making AR glasses less creepy?
35:10 Does it have to be a glasses form factor?
35:55 Monocular versus Biocular
37:25 What did Apple Vision Pro get right (and wrong) regarding user interaction?
40:00 I make the point that eye tracking and gesture recognition on the “Apple Vision Pro is magical until it is not,” paraphrasing Adi Robertson, and I then added, “and then it is damn frustrating.” I also discuss that “it’s not truly hands-free if you have to make gestures with your hands.”
41:48 Waiting for the Superman [savior] company. And do big companies help or crush innovation?
44:20 Vertical integration (Apple’s big advantage)
46:13 Audience Question: When will AR glasses replace a smartphone (enterprise and consumer)
49:05 What is the first use case to break 1 million users in Consumer AR?
49:45 Thad Starner – “Bold Prediction” that the first large application will be with small FOV (~20 degrees), monocular, and not centered in the user’s vision (off to the ear side by ~8 to 20 degrees), and monochrome would be OK. A smartphone is only about 9 by 15 degrees FOV [or ~20 degrees diagonally when a phone is held at a typical distance].
52:10 Audience Question: Why aren’t more companies going after OSHA (safety) certification?
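
The field-of-view figures quoted in the outline above can be sanity-checked with basic trigonometry. The Python sketch below is illustrative only: the function names, and the example screen width and viewing distance, are assumptions and not figures from the panel.

```python
import math

def angular_fov_deg(size_cm: float, distance_cm: float) -> float:
    """Angle subtended by a flat object of the given size viewed head-on."""
    return math.degrees(2 * math.atan(size_cm / (2 * distance_cm)))

def diagonal_fov_deg(h_deg: float, v_deg: float) -> float:
    """Diagonal FOV of a flat rectangular display from its horizontal and vertical FOV."""
    half_h = math.tan(math.radians(h_deg) / 2)
    half_v = math.tan(math.radians(v_deg) / 2)
    return math.degrees(2 * math.atan(math.hypot(half_h, half_v)))

# Quoted figure: a phone covers about 9 by 15 degrees.
print(round(diagonal_fov_deg(9, 15), 1))

# Hypothetical phone screen, 6.5 cm wide, held 40 cm from the eye (assumed values).
print(round(angular_fov_deg(6.5, 40), 1))
```

Under these assumptions, a 9 by 15 degree display works out to a diagonal of a little over 17 degrees, in the same ballpark as the roughly 20 degrees quoted, and a 6.5 cm wide screen at 40 cm subtends a bit more than 9 degrees.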

Small FOV Optical AR Discussion with Thad Starner

As stated in the outline above, Thad Starner arrived late and missed my discussion of smaller FOVs that mentioned Thad, as I learned after the panel. Thad, who has been continuously wearing AR glasses and researching them since the mid-1990s, brings an interesting perspective. Since I first saw and met him in 2019, he has strongly advocated for AR headsets having a smaller FOV.

Thad also states that the AR headset should have a monocular (single-eye) display and be 8 to 20 degrees on the ear side of the user’s straight-ahead vision. He also suggests that monochrome is fine for most purposes. Thad stated that his team will soon publish papers backing up these contentions.

In the sections below, I went from the YouTube transcript and did some light editing to make what was said more readable.

My discussion from earlier in the panel:

11:30 Karl Guttag – I think a lot of the AR or Optical see-through gets confabulated with what was going on in VR because VR was cheap and easy to make a wide field of view by sticking a cell phone with some cheap Optics in front of your face. You get a wide field of view, and people went crazy about that. I made this point years ago on my blog [2019 article FOV Obsession] was the problem. Thad Starner makes this point: he’s one of our Legends at AWE, and I took that to heart many years ago at SPIE AR/VR/MR 2019.

The problem is that as soon as you say beyond about 30-degree field of view, even projecting forward [with technology advancements], as you go beyond 30-degree field of view, you’re in a helmet, something looking like Magic Leap. And Magic Leap ended up in Nowheresville. [Magic Leap] ended up with 25 to 30% see-through, so it’s not really that good see-through, and yet it’s not got the image quality that you would get of an old display shot right in your eyes. You might you could get a better image on an Xreal or something like that.

People are confabulating too many different specs, so they want a wide field of view. The problem is as soon as you say 50 degrees and then you say, yeah, and I need like spatial recognition, I want to do SLAM, and I want to do this, and I want to do that. You’ve now spiraled into the helmet. I mean, you know, Meta was talking the other day about the other panels and said they’re looking at about 50 grams [for the Meta Ray Bans], and my glasses are 23 grams. You’re out of that as soon as you say 50-degree field of view, you’re over 100 grams and heading to the Moon as you add more and more cameras and all this other stuff, so I think that’s one of our bigger problems whereas AR really Optical AR.

The experiment we’re going to see played out because many companies are working on adding displays to so-called AI audio glasses. We’re going to see if that works because companies are getting ready to make glasses that have a 20- to 30-degree field of view tied into AI and audio stuff.

Thad Starner’s comments and the follow-up discussion during the Q&A at the end of the panel:

AWE Legend Thad Starner Wearing Vuzix’s Ultralight Glasses – After the Panel

49:46 Hi, my name is Thad Starner. I’m Professor Georgia Tech. I’m going to make a bold prediction here that the future, at least the first system to sell over a million units, will be a small field of view monocular, non-line-of-sight display, monochrome is okay now; the reason I say that is number one I’ve done different user studies in my lab that we’ll be publishing soon on this subject but the other thing is that you know our phones which is the most popular interface out there are only 9 degrees by 16 degrees field of view. Putting something outside of the line of sight means that it doesn’t interrupt you while you’re crossing the street or driving or flying a plane, right? We know these numbers, so between 8° and 20 degrees towards the ear and plus or minus 8 degrees, I’m looking at Karl [Guttag] here so he can digest all these things.

Karl – I wrote a whole article about it [FOV Obsession]

Thad – And not having a pixel in line of sight, so now feel free to pick me apart and disagree with me.

Jeri – I want to know a price point.

Thad – I think the first market will be captioning for the hard of hearing, not for the deaf. Also, possible transcription, not translation; at that price point, you’re talking about making reading glasses for people instead of hearing aids. There’s a lot of pushback against hearing [aids], but reading glasses people tend to do, so I’d say you’re probably in the $200 to $300 range.

Ed – I think your prediction is spot on, minus the color green. The only thing I think is that it’s not going to fly.

Thad – I said monochrome is okay.

Ed – I think the monocular field of view is going to be an entry-level product, and you see, I think you will see products that will fit that category with roughly that field of view with roughly that offset angle [not in the center of view] is what you’re going to see in the beginning. Yeah I agree with that but I don’t I think that’s the first step I think you will see a lot of products after that that’s going to do a lot more than monocular monochrome offset displays, start going to larger field of view binocular I think that will happen pretty quickly.

Adi – It does feel like somebody tries to do that every 18 months, though, like Intel tried to make a pair of glasses that did that. It’s a little bit what North did. I guess it’s just a matter of throwing the idea at the wall because I think it’s a good one until it takes.

I was a little taken aback to have Thad call me out as if I had disagreed with him when I had made the point about the advantages of a smaller FOV earlier. Only after the presentation did I find out that he had arrived late. I’m not sure what comment I made that made Thad think I was advocating for a larger FOV in AR glasses.

I want to add that there can be big differences between what consumers and experts will accept in a product. I’m reminded of a story I read in the early 1980s, when there was a big debate between very high-resolution monochrome and lower-resolution color (back then, you could only have one or the other with CRTs), in which the head of IBM’s monitor division said, “Color is the least necessary and most desired feature in a monitor.” All the research suggested that resolution was more important for the tasks people did on a computer at the time, but people still insisted on color monitors. Another example is the 1985 New Coke fiasco, in which Coke’s taste studies proved that people liked New Coke better, but it still failed as a product.

In my experience, a big factor is whether the person is being trained to use the device for enterprise or military use versus whether the user is buying it for their own enjoyment. The military has used monochrome displays on devices, including night vision and heads-up displays, for decades. I like to point out that the requirements can change depending on whether “the user is paid to use versus is paying to use.” Enterprises and the military care about whether the product gets the job done and pay someone to use the device. The consumer has different criteria. I will also agree that there are cases where the user is motivated to be trained, such as Thad’s hard-of-hearing example.

Conclusion on Small FOV Optical AR

First, I agree with Thad’s comments about the smaller FOV and have stated such before. There are also cases outside of enterprise and industrial use where the user is motivated to be trained, such as Thad’s hard-of-hearing example. But while I can’t disagree with Thad or his studies that show having a monocular monochrome image located outside the line of sight is technically better, I think consumers will have a tougher time accepting a monocular monochrome display. What you can train someone to use differs from what they would buy for themselves.

Thad makes a good point that having a biocular display directly in the line of sight can be problematic and even dangerous. At the same time, untrained people don’t like monocular displays outside the line of sight. It becomes (as Ed Tang said in the panel) a point of high friction to adoption.

Based on the many designs I have seen for AR glasses, we will see this all played out. Multiple companies are developing optical see-through AR glasses with monocular green MicroLEDs, color X-cube-based MicroLEDs, and LCOS-based displays with glasses form-factor waveguide optics (both diffractive and reflective).

LG Shakes Up XR Division, Reportedly Putting Meta Headset Partnership on Ice

Road to VR

Scott Hayden

12 June 2024 at 15:36

Mark Zuckerberg (Meta), William Cho (LG), and Park Hyoung-sei (LG)

Korean media last month alleged that the Meta/LG partnership to create a high-end XR headset wasn’t going so well, suggesting either outright cancellation or a delay pushing release of a prospective Apple Vision Pro competitor to 2027. While this hasn’t been substantiated by either company, it’s clear there’s something big going on under the surface, as LG is now shuffling employees from its XR division to other parts of the company.

As confirmed by Korean outlet ETNews (Korean), LG is reassigning employees in charge of the XR division to research and development and other business divisions within the company.

Here’s the official statement from LG obtained by ETNews, machine translated from Korean to English:

“We have recently confirmed the relocation policy for personnel in charge of the XR business. Taking into account the department and work location desired by the personnel and the demand for additional personnel in other departments, the relocation will take place for about a month.”

ETNews reports that the nature of the shakeup is “unusual” in LG, as such cases of forming a product division after research and development comes as a rare occurrence.

The report further stipulates LG has delayed its own XR tech indefinitely, and terminated its joint commercialization of a product with Meta. The report, however, maintains the two companies will continue in research and development of XR technologies.

When LG announced its collaboration with Meta in May, it was said the partnership would be focused on strengthening “the fusion of Meta’s diverse core technological elements with LG’s cutting-edge product and quality capabilities [promising] significant synergies in next-gen XR device development.”

While not explicitly stated by either LG or Meta, it was rumored the two had been working to create a competitor to Apple Vision Pro for launch in 2025.

One possible reason for the XR shakeup could be Meta is getting ready to release its XR operating system to third-party OEMs for the first time, which will include new Quest-style headsets coming from ASUS, Lenovo, and Xbox. Growing the number of competing devices that will use Meta’s HorizonOS (ex-QuestOS) and Horizon Store (ex-Quest Store) so rapidly may have spoiled the deal for LG—although without confirmation from either company, that remains conjecture at this time.


1-bit LLMs Could Solve AI’s Energy Demands

IEEE Spectrum Recent Content full text

Matthew Hutson

30 May 2024 at 20:28

Large language models, the AI systems that power chatbots like ChatGPT, are getting better and better—but they’re also getting bigger and bigger, demanding more energy and computational power. For LLMs that are cheap, fast, and environmentally friendly, they’ll need to shrink, ideally small enough to run directly on devices like cellphones. Researchers are finding ways to do just that by drastically rounding off the many high-precision numbers that store their memories to equal just 1 or -1.

LLMs, like all neural networks, are trained by altering the strengths of connections between their artificial neurons. These strengths are stored as mathematical parameters. Researchers have long compressed networks by reducing the precision of these parameters—a process called quantization—so that instead of taking up 16 bits each, they might take up 8 or 4. Now researchers are pushing the envelope to a single bit.

How to Make a 1-bit LLM

There are two general approaches. One approach, called post-training quantization (PTQ), is to quantize the parameters of a full-precision network. The other approach, quantization-aware training (QAT), is to train a network from scratch to have low-precision parameters. So far, PTQ has been more popular with researchers.
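As a rough illustration of the PTQ idea, the sketch below binarizes an already-trained list of weights by keeping only each weight's sign plus one shared full-precision scale (the mean absolute value). The function names and example weights are invented for illustration; this is the simplest textbook form of binarization, not the method of any particular paper mentioned here.

```python
def binarize(weights):
    """Post-training binarization: one sign per weight plus one shared scale."""
    scale = sum(abs(w) for w in weights) / len(weights)
    signs = [1 if w >= 0 else -1 for w in weights]
    return signs, scale


def dequantize(signs, scale):
    """Rebuild approximate weights from the 1-bit signs and the shared scale."""
    return [s * scale for s in signs]


weights = [0.42, -0.13, 0.07, -0.88]   # hypothetical trained weights
signs, scale = binarize(weights)
approx = dequantize(signs, scale)
```

Each weight now costs a single bit, at the price of every reconstructed weight having the same magnitude.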

In February, a team including Haotong Qin at ETH Zurich, Xianglong Liu at Beihang University, and Wei Huang at the University of Hong Kong introduced a PTQ method called BiLLM. It approximates most parameters in a network using 1 bit, but represents a few salient weights—those most influential to performance—using 2 bits. In one test, the team binarized a version of Meta’s LLaMa LLM that has 13 billion parameters.

To score performance, the researchers used a metric called perplexity, which is basically a measure of how surprised the trained model was by each ensuing piece of text. For one dataset, the original model had a perplexity of around 5, and the BiLLM version scored around 15, much better than the closest binarization competitor, which scored around 37 (for perplexity, lower numbers are better). That said, the BiLLM model required only about a tenth as much memory capacity as the original.
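For readers unfamiliar with the metric, here is a minimal sketch of the standard perplexity calculation: the exponential of the average negative log-probability the model assigned to each token that actually came next. The probability lists are made-up inputs, not data from the studies described.

```python
import math


def perplexity(token_probs):
    """exp(mean negative log-probability) over the tokens that actually occurred."""
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)


# A model that always spreads its bets evenly over k choices has perplexity k.
uniform_over_5 = perplexity([1 / 5] * 10)
uniform_over_15 = perplexity([1 / 15] * 10)
```

Read this way, a perplexity of 5 versus 15 is roughly the difference between hesitating among 5 equally likely next tokens and hesitating among 15.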

PTQ has several advantages over QAT, says Wanxiang Che, a computer scientist at Harbin Institute of Technology, in China. It doesn’t require collecting training data, it doesn’t require training a model from scratch, and the training process is more stable. QAT, on the other hand, has the potential to make models more accurate, since quantization is built into the model from the beginning.

1-bit LLMs Find Success Against Their Larger Cousins

Last year, a team led by Furu Wei and Shuming Ma, at Microsoft Research Asia, in Beijing, created BitNet, the first 1-bit QAT method for LLMs. After fiddling with the rate at which the network adjusts its parameters, in order to stabilize training, they created LLMs that performed better than those created using PTQ methods. They were still not as good as full-precision networks, but roughly 10 times as energy efficient.

In February, Wei’s team announced BitNet 1.58b, in which parameters can equal -1, 0, or 1, which means they take up roughly 1.58 bits of memory per parameter. A BitNet model with 3 billion parameters performed just as well on various language tasks as a full-precision LLaMA model with the same number of parameters and amount of training, but it was 2.71 times as fast, used 72 percent less GPU memory, and used 94 percent less GPU energy. Wei called this an “aha moment.” Further, the researchers found that as they trained larger models, efficiency advantages improved.
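The 1.58 figure is simply the information content of a three-way choice, log2(3). The sketch below computes it and shows one plausible way to map weights onto -1, 0, and 1: divide by the mean absolute weight, then round and clip. The rounding scheme and the sample weights are assumptions for illustration, not a description of the team's actual implementation.

```python
import math

# Three possible states (-1, 0, +1) carry log2(3) bits of information.
bits_per_ternary_weight = math.log2(3)


def ternarize(weights, eps=1e-8):
    """Scale by the mean absolute value, then round and clip to {-1, 0, 1}."""
    scale = sum(abs(w) for w in weights) / len(weights)
    ternary = [max(-1, min(1, round(w / (scale + eps)))) for w in weights]
    return ternary, scale


ternary, scale = ternarize([0.9, -0.05, 0.4, -1.2])   # hypothetical weights
```

Small weights collapse to 0, which lets the model drop some connections entirely.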

This year, a team led by Che, of Harbin Institute of Technology, released a preprint on another LLM binarization method, called OneBit. OneBit combines elements of both PTQ and QAT. It uses a full-precision pretrained LLM to generate data for training a quantized version. The team’s 13-billion-parameter model achieved a perplexity score of around 9 on one dataset, versus 5 for a LLaMA model with 13 billion parameters. Meanwhile, OneBit occupied only 10 percent as much memory. On customized chips, it could presumably run much faster.

Wei, of Microsoft, says quantized models have multiple advantages. They can fit on smaller chips, they require less data transfer between memory and processors, and they allow for faster processing. Current hardware can’t take full advantage of these models, though. LLMs often run on GPUs like those made by Nvidia, which represent weights using higher precision and spend most of their energy multiplying them. New hardware could natively represent each parameter as a -1 or 1 (or 0), and then simply add and subtract values and avoid multiplication. “One-bit LLMs open new doors for designing custom hardware and systems specifically optimized for 1-bit LLMs,” Wei says.
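To see why multiplication becomes unnecessary, consider a dot product in which every weight is -1, 0, or 1. A minimal sketch, with made-up weights and activations, follows; it models the arithmetic only and says nothing about how real hardware would be organized.

```python
def ternary_dot(weights, activations):
    """Dot product with weights in {-1, 0, 1}, using only adds and subtracts."""
    total = 0.0
    for w, x in zip(weights, activations):
        if w == 1:
            total += x
        elif w == -1:
            total -= x
        # w == 0 contributes nothing, so the activation is skipped
    return total


weights = [1, 0, -1, 1]
activations = [0.5, 2.0, 1.5, -0.25]
result = ternary_dot(weights, activations)

# Same answer as the usual multiply-accumulate form.
reference = sum(w * x for w, x in zip(weights, activations))
```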

“They should grow up together,” Huang, of the University of Hong Kong, says of 1-bit models and processors. “But it’s a long way to develop new hardware.”

Apple Vision Pro Part 6 – Passthrough Mixed Reality (PtMR) Problems

KGOnTech

Karl Guttag

27 September 2023 at 05:09

Introduction

I planned to wrap up my first pass coverage of the Apple Vision Pro (AVP) with my summary and conclusions based on prior articles. But the more I thought about it, Apple’s approach to Passthrough Mixed Reality (PtMR) seems like it will be so egregiously bad that it should be broken out and discussed separately.

Apple Prioritized EyeSight “Gimmick” Over Ergonomics and Functionality

There are some features, particularly surrounding camera passthrough, where there should have been an internal battle between those who wanted the EyeSight™ gimmick and what I would consider more important functionality. The backers of EyeSight must have won and forced the horrible location of the passthrough cameras, optical distortion from the curved glass in front of all the forward-facing cameras and sensors, put a fragile piece of hard-to-replace glass on the front where it can be easily scratched and broken, and added weight to the front where it is least desired. Also, as discussed later, there are negative effects on the human visual system caused by misaligning the passthrough cameras with the eyes.

The negative effects of EyeSight are so bad for so many fundamental features that someone in power with little appreciation for the technical difficulties must have forced the decision (at least, that is the only way I can conceive of it happening). People inside the design team must have known it would cause serious problems. Supporting passthrough mixed reality (PtMR) is hard enough without deliberately creating problems.

Meta Quest 3 Camera Location

As noted in Meta Quest Pro (Part 1) – Unbelievably Bad AR Passthrough, Meta is locating the soon-to-be-released Quest 3 main passthrough camera closer to the center of view of the eyes. Fixed cameras in front of the eyes won’t be perfect and will still require digital correction for better functional use. It does appear that Meta is taking the PtMR more seriously than it did with the Meta Quest Pro and Quest 2.

I’m going to be looking forward to getting a Meta Quest 3 to test out when it is released soon.

Definitions of AR/VR/MR and PtMR

The terms used to describe mixed reality have been very fluid over the last few years. Before the introduction of Hololens, Augmented reality meant any headset that displayed virtual content on a see-through display. For example, just before Hololens went on sale, Wired in 2015 titled their article (with my bold emphasis): Microsoft Shows HoloLens’ Augmented Reality Is No Gimmick. With the introduction of Hololens, the term “Mixed Reality” was used to distinguish AR headsets with SLAM to lock the virtual to the real world. “AR” headsets without SLAM are sometimes called AR Heads-Up Displays (HUDs), but these get confused with automotive HUDs. Many today refer to a see-through headset without SLAM as “AR” and one with SLAM as “MR,” whereas previously, the terms “AR” covered both with and without SLAM.

Now we have the added confusion of optical see-through (e.g., Hololens) and camera passthrough “Mixed Reality.” While they may be trying to accomplish similar capabilities, they are radically different in their capabilities. Rather than constantly typing “passthrough” before MR, I abbreviated it as PtMR.

In Optical AR, the Virtual Content Augments the Real World – With PtMR, the Real World Augments the Virtual Content

Optical MR prioritizes seeing the real world at the expense of the virtual content. The real world is in perfect perspective, at the correct focus distance, with no limitation by a camera or display on brightness, with zero lag, etc. If done well, there is minimal light blocking and distortion of the real world and little blocking of the real-world FOV.

PtMR, on the other hand, prioritizes virtual image quality at the expense of the real world, both in how things behave in 3-D space (focus perspective) and in image quality.

We are likely many decades away, if ever, from passing what Douglas Lanman of Meta calls their Visual Turing Test (see also the video linked here).

Meta’s demonstrations at Siggraph 2023 of their Flamera with perspective-correct passthrough and Butterscotch with vergence accommodation conflict served to show how far PtMR is from optical passthrough. They can only address each problem individually, each with a large prototype, and even then, there are severe restrictions. The Flamera has a very low-resolution passthrough, and Butterscotch only supports a 50-degree FOV.

It is also interesting that Butterscotch moves back from Half Dome 3’s electronic LCD variable focus to electro-mechanical focusing to address VAC. As reported in Mixed Reality News, “However, the technology presented problems with light transmission and image quality [of the electronic LCD approach], so Meta discarded it for Butterscotch Varifocal at the expense of weight and size.”

All of this work is to try and solve some of the many problems created by PtMR that don’t exist with optical MR. PtMR does not “solve” the issues with optical MR. It just creates a long list of massively hard new problems. Optical AR has issues with the image quality of the virtual world, very large FOV, and hard-edge occlusion (see my article Magic Leap 2 (Pt. 3): Soft Edge Occlusion, a Solution for Investors and Not Users). I often say, “What is hard in optical MR is easy in PtMR and vice versa.”

Demo or Die

Meta and others seem to use Siggraph to show off research work that is far from practical. As stated by Lanman of Meta, of their Flamera and Butterscotch VAC demos at Siggraph 2023, Meta’s Reality Labs has a “Demo or Die” philosophy. They will not be tipping off their competition on concepts they will use within a few years. To be clear, I’m happy to see companies showing off their technical prowess, but at the same time, I want to put it in perspective.

Cosmetic vs. Functional Passthrough PtMR

JayzTwoCents video on the HTC Vive XR Elite has a presentation by Phil on what he calls “3D Depth Projection” (others refer to it as “perspective correct”). In the video (sequence of clips below), Phil demonstrates that because the passthrough video was not corrected in scale, position, and perspective in 3-D space, it deprives him of hand-eye coordination to catch a bottle tossed to him.

As discussed in Meta Quest Pro (Part 1) – Unbelievably Bad AR Passthrough, in the section The method in the Madness: MQP prioritizes 3-D spatial over image quality, Phil also demonstrated in the video (and in a sequence of clips below) that with the Meta Quest Pro, even though the image quality is much worse and distorted due to the 3D projection, he can at least catch the bottle.

I would classify the HTC Vive XR Elite as having a “Cosmetic Passthrough.” While the image quality is better (but still not very good), it is non-functional. While Meta Quest Pro’s image quality is lousy, it is at least somewhat functional.

Something else to notice in the MQP frame sequence above is that there are both lag and accuracy errors in hand tracking.

Effects on Vision with Long-Term Use

It is less obvious that the human visual system will start adapting to any camera placement and then have to re-adapt after the headset is removed. This was briefly discussed in AVP Part 2 in the section titled Centering correctly for the human visual system, which references Steve Mann in his March 2013 IEEE Spectrum article, “What I’ve learned from 35 years of wearing computerized eyewear.” In the early days with Steve Mann, they had no processing power to attempt to move the effect of the camera images digitally. At the same time, I’m not sure how well the correction will work or how a distorted view will affect people’s visual perception during and after long exposure. As with most visual effects, it will vary from one individual to another.

Meta Flamera Light Field Camera at Siggraph 2023

As discussed in AVP Part 2 and Meta Quest Pro (Part 1) – Unbelievably Bad AR Passthrough, having the passthrough cameras as close as possible to being coaxial to the eyes (among other things) is highly desirable.

To reduce any undesired negative effects on human vision caused by cameras not aligning with the eyes, some devices, such as the Quest 2 and Quest Pro from Meta, use processing to create what I will call “virtual cameras” with a synthesized view for each eye. The farther the physical cameras are from the eye’s location, the larger the correction will be required and the larger the distortion in the final result.

Meta at Siggraph 2023 presented the paper “Perspective-Correct VR Passthrough Without Reprojection” (and IEEE article) and showed their Flamera prototype with a light field camera (right). The figure below shows how the camera receives light rays from the same angle as the eye with the Light Field Passthrough Camera.

Below are a couple of still frames (with my annotations) from the related video that show how, with the Meta Quest 2, the eye and camera views differ (below left), resulting in a distorted image (below right). The distortion/error increases as the distance from the eye decreases.
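The dependence on distance follows from simple parallax. As a simplified sketch, assume the camera sits a small lateral distance from the eye and the object is straight ahead of the eye; the angle between the two lines of sight is then atan(offset / distance). The 3 cm offset is a hypothetical number for illustration, not a measurement of any headset.

```python
import math


def parallax_error_deg(camera_offset_m, object_distance_m):
    """Angle between the eye's and the camera's line of sight to the same point."""
    return math.degrees(math.atan2(camera_offset_m, object_distance_m))


offset = 0.03  # hypothetical 3 cm lateral eye-to-camera offset
errors = {d: parallax_error_deg(offset, d) for d in (0.3, 1.0, 3.0)}
```

With these assumptions, the mismatch is well under a degree for an object 3 m away but several degrees at arm's length, which is where hand-eye coordination matters most.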

It should be noted that while Flamera’s light field camera approach addresses the angular problems of the camera location, it does so with a massive loss in resolution (by at least “n,” where n is the number of light field subviews). So, while interesting in terms of research and highlighting the problem, it is still a highly impractical approach.

The Importance of “Perspective Correct” PtMR

In preparing this article, I returned to a thread on Hacker News about my Meta Quest Pro (Part 1) – Unbelievably Bad AR Passthrough article. In that article’s section “The method in the Madness: MQP prioritizes 3-D spatial over image quality,” I was trying to explain why Meta was distorting the image.

Poster Zee2 took exception to my article and seemed to feel I was understating the problem of 3-D perspective. I think Zee2 missed what I meant by “pyrrhic victory.” I was trying to say they were correct to address the 3D depth issue but that doing so with a massive loss in image quality was not the solution. I was not dismissing the importance of perspective-correct passthrough.

Below, I am copying his comment from that thread (with my bold highlighting), including a quote from my article. Interestingly, Zee2 comments on Varjo having good image quality with its passthrough, but it is not perspective-correct.

I also really don’t know why he [referring to my article] decided to deemphasize the perspective and depth correctness so much. He mentions it here:

>[Quoting Meta Quest Pro (Part 1) – Unbelievably Bad AR Passthrough] In this case, they were willing to sacrifice image quality to try to make the position of things in the real world agree with where virtual objects appear. To some degree, they have accomplished this goal. But the image quality and level of distortion, particularly of “close things,” which includes the user’s hands, is so bad that it seems like a pyrrhic victory.

I don’t think this is even close to capturing how important depth and perspective correct passthrough is.

Reprojecting the passthrough image onto a 3D representation of the world mesh to reconstruct a perspective-correct view is the difference between a novelty that quickly gives people headaches and something that people can actually wear and look through for an extended period of time.

Varjo, as a counterexample, uses incredibly high-resolution cameras for their passthrough. The image quality is excellent, text is readable, contrast is good, etc. However, they make no effort to reproject their passthrough in terms of depth reconstruction. The result is a passthrough image that is very sharp, but is instantly, painfully, nauseatingly uncomfortable when walking around or looking at closeup objects alongside a distant background.

The importance of depth-correct passthrough reprojection (essentially, spacewarp using the depth info of the scene reconstruction mesh) absolutely cannot be understated and is a make or break for general adoption of any MR device. Karl is doing the industry a disservice with this article.

From: Hacker News Meta Quest Pro – Bad AR Passthrough comment by Zee2

Does the AVP have Cosmetic or Functional PtMR or Something Else?

With the AVP’s passthrough cameras being so poorly located (thanks to EyeSight™), severe distortion would seem inevitable to support functional PtMR. I don’t believe there is some magic (perhaps a pun on Magic Leap) that Apple could employ that Meta couldn’t that would simultaneously support good image quality without serious distortion with the terrible camera placement due to the EyeSight™ feature.

So, based on the placement of the cameras, I have low expectations for the functionality of the AVP’s PtMR. The “instant experts” who got to try out the AVP would be more impressed by a cosmetically better-looking passthrough. Since there are no reports of distortion like the MQP, I’m left to conclude that, at least for the demo, they were only doing a cosmetic passthrough.

As I often say, “Nobody will volunteer information, but everyone will correct you.” Thus, it is better to take a position based on the current evidence and then wait for a correction or confirmation from the many developers with AVPs who read this blog.

Conclusion

I’m not discounting the technical and financial power of Apple. But then I have been writing about the exaggerated claims for Mixed Reality products by giant companies such as Google, Meta, and Microsoft, not to mention the many smaller companies, including the over $3B spent by Magic Leap, for the last ten years. The combined sunk cost of these companies, not including Apple, is about $50B. As I’m fond of saying, “If all it took were money and smart people, it would already be solved.”

Apple doesn’t fully appreciate the difficulties with Passthrough Mixed Reality, or they wouldn’t prioritize the EyeSight gimmick over core capabilities. I’m not saying the AVP would work well for passthrough AR without EyeSight, but it is hard enough without digging big technical holes to support a novelty feature.

Apple Vision Pro (Part 5C) – More on Monitor Replacement is Ridiculous

KGOnTech

Karl Guttag

21 August 2023 at 00:17

Introduction

In this series about the Apple Vision Pro, this sub-series on Monitor Replacement and Business/Text applications started with Part 5A, which discussed scaling, text grid fitting, and binocular overlap issues. Part 5B starts by documenting some of Apple’s claims that the AVP would be good for business and text applications. It then discusses the pincushion distortion common in VR optics and likely in the AVP and the radial effect of distortion on resolution in terms of pixels per degree (ppd).

The prior parts, 5A and 5B, provide setup and background information for what started as a simple “Shootout” between a VR virtual monitor and physical monitors. As discussed in 5A, my office setup has a 34″ 22:9 3440×1440 main monitor with a 27″ 4K (3840×2160) monitor on the right side, which is a “modern” multiple monitor setup that costs ~$1,000. I will use these two monitors plus a 15.5″ 4K OLED Laptop display to compare to the Meta Quest Pro (MQP) since I don’t have an Apple AVP and then extrapolate the results to the AVP.

*My Office Setup: 34″ 22:9 3440×1440 (left) and 27″ 4K (right)*

I will be saving my overall assessment, comments, and conclusions about VR for Office Applications for Part 5D rather than somewhat burying them at the end of this article.

Office Text Applications and “Information Density” – Font Size is Important

A point to be made by using spreadsheets to generate the patterns is that if you have to make text bigger to be readable, you are lowering the information density and are less productive. Lowering the information density with bigger fonts is also true when reading documents, particularly when scanning web pages or documents for information.

Improving font readability is not solely about increasing their size. VR headsets will have imperfect optics that cause distortions, focus problems, chromatic aberrations, and loss of contrast. These issues make it harder to read fonts below a certain size. In Part 5A, I discussed how scaling/resampling and the inability to grid fit when simulating virtual monitors could cause fonts to appear blurry and scintillate/wiggle when locked in 3-D space, leading to reduced readability and distraction.

Meta Quest Pro Horizon Worktop Desktop Approach

As discussed in Part 5A, with Meta’s Horizon Desktop, each virtual monitor is reported to Windows as 1920 by 1200 pixels. When sitting at the nominal position of working at the desktop, the center virtual monitor fills about 880 physical pixels of the MQP’s display. So roughly 1200 virtual pixels are resampled into 880 vertical pixels in the center of view or by about 64%. As discussed in Part 5B, the scaling factor is variable due to severe pincushion distortion of the optics and the (impossible to turn off) curved screen effect in Meta Horizons.
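As a quick arithmetic check, the sketch below takes the two figures in the paragraph at face value (1200 reported rows, roughly 880 display rows). The variable names are only illustrative.

```python
virtual_rows = 1200    # rows the virtual monitor reports to Windows
physical_rows = 880    # approximate rows it covers on the headset display

scale = physical_rows / virtual_rows                 # fraction of rows kept
source_rows_per_display_row = virtual_rows / physical_rows
```

With these two inputs the ratio works out to roughly 0.73, so the 64% figure presumably rests on slightly different measurements (about 770 display rows would give it). Either way, more than one source row has to be squeezed into every display row, and the exact factor varies across the view.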

The picture below shows the whole FOV captured by the camera before cropping shot through the left eye. The camera was aligned for the best image quality in the center of the virtual monitor.

Analogous to Nyquist sampling, when you scale a pixel-rendered image, you want about 2X (linearly) the number of pixels in the display of the source image to render it reasonably faithfully. Below left is a 1920 by 1200 pixel test pattern (a 1920×1080 pattern padded on the top and bottom), “native” to what the MQP reports to Windows. On the right is the picture cropped to that same center monitor.
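To make the sampling point concrete, the sketch below shrinks a row of alternating 1-pixel black and white lines to 75% of its width with a simple area-averaging resampler. The resampler is an assumption chosen for clarity; it is not the filter any headset actually uses.

```python
def area_resample(src, n_out):
    """Resample a 1-D row by averaging the source area under each output pixel."""
    n_in = len(src)
    step = n_in / n_out  # source pixels covered by one output pixel
    out = []
    for j in range(n_out):
        start, end = j * step, (j + 1) * step
        total = 0.0
        i = int(start)
        while i < n_in and i < end:
            overlap = min(end, i + 1) - max(start, i)
            total += src[i] * overlap
            i += 1
        out.append(total / step)
    return out


one_pixel_lines = [0, 1] * 6                  # alternating black/white lines
shrunk = area_resample(one_pixel_lines, 9)    # 12 -> 9 pixels, a 75% scale
```

The output is no longer pure black and white: many pixels land on in-between grays, which is the blur seen on 1-pixel-wide lines and small text.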

The picture was taken at 405mp, then scaled down by 3X linearly and cropped. When taking high-resolution display pictures, some amount of moiré in color and intensity is inevitable. The moiré is also affected by scaling and JPEG compression.

Below is a center crop from the original test pattern that has been 2x pixel-replicated to show the detail in the pattern.

Below is a crop from the full-resolution image with reduced exposure to show sub-pixel (color element) detail. Notice how the 1-pixel wide lines are completely blurred, and the text is just becoming fully formed at about Arial 11 point (close to, but not the same scale as used in the MS Excel Calibri 11pt tests to follow). Click on the image to see the full resolution that the camera captured (3275 x 3971 pixels).

The scaling process might lose a little detail for things like pictures and videos of the real world (such as the picture of the elf in the test pattern), but it will be almost impossible for a human to notice most of the time. Pictures of the real world don’t have the level of pixel-to-pixel contrast and fine detail caused by small text and other computer-generated objects.

Meta Quest Pro Virtual Versus Physical Monitor “Shootout”

For the desktop “shootout,” I picked the 34” 22:9 and 27” 4k monitors I regularly use (side by side as shown in Part 5A), plus a Dell 15.5” 4K laptop display. An Excel spreadsheet is used with various displays to demonstrate the amount of content that can be seen at one time on a screen. The spreadsheet allows for flexible changing of how the screen is scaled for various resolutions and text sizes, and the number of cells measures the information density. For repeatability, a screen capture of each spreadsheet was taken and then played back in full-screen mode (Appendix 1 includes the source test patterns).

The Shootout

The pictures below show the relative FOVs of the MQP and various physical monitors taken with the same camera and lens. The camera was approximately 0.5 meters from the center of the physical monitors, and the headset was at the initial position at the MQP’s Horizon Desktop. All the pictures were cropped to the size of a single physical or virtual monitor.

The following is the basic data:

Meta Quest Pro – Central Monitor (only) ~43.5° horizontal FOV. Used an 11pt font with Windows Display Text Scaling at 150% (100% and 175% also taken and included later)
34″ 22:9 3440×1440 LCD – 75° FOV and 45ppd from 0.5m. 11pt font with 100% scaling
27″ 4K (3840 x 2160) LCD – 56° FOV and 62ppd from 0.5m. 11pt font with 150% scaling (results in text the same size as on the 34″ 3440×1440 at 100% – 2160/1440 = 150%)
15.5″ 4K OLED – 32° FOV from 0.5m. Shown below is 11pt with 200% scaling, which is what I use on the laptop (a later image shows 250% scaling, which is what Windows “recommends” and would result in approximately the same size fonts as on the 34″ 22:9 at 100%).
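The scaling percentages that give matching physical text sizes can be estimated from pixel pitch alone. The sketch below assumes flat panels with square pixels and uses the nominal diagonals and resolutions listed above; the helper name is made up for illustration.

```python
import math


def pixel_pitch_mm(diagonal_in, h_px, v_px):
    """Physical size of one pixel for a flat panel with square pixels."""
    return diagonal_in * 25.4 / math.hypot(h_px, v_px)


reference = pixel_pitch_mm(34, 3440, 1440)            # 34-inch at 100% scaling

# Scaling needed for text to come out the same physical size as the reference.
scaling_27 = reference / pixel_pitch_mm(27, 3840, 2160)
scaling_15 = reference / pixel_pitch_mm(15.5, 3840, 2160)
```

Under these assumptions the ratios come out near 1.5 and 2.6, in line with the 150% and 250% settings used in the comparison.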

*Composite image showing the relative FOV – Click to see in higher resolution (9016×5641 pixels)*

The pictures below show the MQP with MS Windows display text scaling set to 100% (below left) and 175% (below middle). The 175% scaling would result in fonts with about the same number of pixels per font as the Apple Vision Pro (but with a larger angular resolution). Also included below (right) is the 15.5″ 4K display with 250% scaling (as recommended by Windows).

The camera was aimed and focused at the center of the MQP, the best case for it, as the optical quality falls off radially (discussed in Part 5B). The text sharpness is the same for the physical monitors from center to outside, but they have some brightness variation due to their edge illumination.

Closeup Look at the Displays

Each picture above was initially taken 24,576 x 16,384 (405mp) by “pixel shifting” the 45MP R5 camera sensor to support capturing the whole FOV while capturing better than pixel-level detail from the various displays. In all the pictures above, including the composite image with multiple monitors, each image was reduced linearly by 3X.

The crops below show the full resolution (3x linearly the images above) of the center of the various monitors. As the camera, lens, and scaling are identical, the relative sizes are what you would see looking through the headset for the MQP sitting at the desktop and the physical monitors at about 0.5 meters. I have also included a 2X magnification of the MQP’s images.

With Windows 100% text scaling, the 11pt font on the MQP is about the same size as it is on the 34” 22:9 monitor at 100%, the 27” 4K monitor at 150% scaling, and the 15.5” 4K monitor at 250% scaling. But while the fonts are readable on the physical monitor, they are a blurry mess on the MQP at 100%. The MQP at 150% and 175% is “readable” but certainly does not look as sharp as the physical monitors.

Extrapolating to Apple Vision Pro

Apple’s AVP has about 175% linear pixel density of the MQP. Thus the 175% case gives a reasonable idea of how text should look on the AVP. For comparison below, the MQP’s 175% case has been scaled to match the size of the 34” 22:9 and 27” 4K monitors at 100%. While the text is “readable” and about the same size, it is much softer/blurrier than the physical monitor. Some of this softness is due to optics, but a large part is due to scaling. While the AVP may have better optics and a text rendering pipeline, they still don’t have the resolution to compete on content density and readability with a relatively inexpensive physical monitor.

Reportedly, Apple Vision Pro Directly Rendering Fonts

Thomas Kumlehn had an interesting comment on Part 5B (with my bold highlighting) that I would like to address:

After the VisionPro keynote in a Developer talk at WWDC, Apple mentioned that they rewrote the entire render stack, including the way text is rendered. Please do not extrapolate from the text rendering of the MQP, as Meta has the tech to do foveated rendering but decided to not ship it because it reduced FPS.

*From Part 5A,* “Rendering a Pixel Size Dot.”

Based on my understanding, the AVP will “render from scratch” instead of rendering an intermediate image that is then rescaled as is done with the MQP discussed in Part 5A. While rendering from scratch has a theoretical advantage regarding text image quality, it may not make a big difference in practice. With an ~40 pixels per degree (ppd) display, the strokes and dots of what should be readable small text will be on the order of 1 pixel wide. The AVP will still have to deal with approximately pixel-width objects straddling four or more pixels, as discussed in Part 5A: Simplified Scaling Example – Rendering a Pixel Size Dot.
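The straddling problem can be shown with a minimal sketch: a dot exactly one pixel square, placed at a fractional position, spreads its light over up to four display pixels in proportion to the overlap area. The function is an idealized box-coverage model, not a description of Apple's or Meta's renderer.

```python
def dot_coverage(x, y):
    """Share of a 1x1-pixel dot (top-left corner at x, y) landing on each pixel it overlaps."""
    fx, fy = x % 1.0, y % 1.0
    return {
        "top_left": (1 - fx) * (1 - fy),
        "top_right": fx * (1 - fy),
        "bottom_left": (1 - fx) * fy,
        "bottom_right": fx * fy,
    }


aligned = dot_coverage(3.0, 7.0)      # sits exactly on the pixel grid
straddling = dot_coverage(3.5, 7.5)   # worst case: centered on a pixel corner
```

On the grid, one pixel gets all the light; in the worst case, four pixels each get a quarter, so the dot is drawn twice as wide and at a quarter of the contrast. Because the virtual monitor is locked in 3-D space, small head movements keep shifting the dot between these cases.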

Some More Evaluation of MQP’s Pancake Optics Using immersed Virtual Monitor

I wanted to evaluate the MQP pancake optics more than I did in Part 5B. Meta’s Horizon Desktop interface was very limiting. So I decided to try out immersed Virtual Desktop software. Immersed has much more flexibility in the resolution, size, placements, and the ability to select flat or curved monitors. Importantly for my testing, I could create a large, flat virtual 4K monitor that could fill the entire FOV with a single test pattern (the pattern is included in Appendix 1).

Unfortunately, while the immersed software had the basic features I wanted, I found it difficult to precisely control the size and positioning of the virtual monitor (more on this later). Due to these difficulties, I just tried to fill the display with the test pattern, with the monitor only roughly perpendicular to the headset/camera. It was a painfully time-consuming process, and I never could get the monitor to where it seemed perfectly perpendicular.

Below is a picture of the whole (camera) FOV taken at 405 megapixels and then scaled down to 45 megapixels. The image is a bit underexposed to show the sub-pixel (color) detail when viewed at full resolution. In taking the picture, I determined that the focus of the MQP’s pancake optics appears to be “dished,” with the focus in the center slightly different from that on the outsides. The picture was taken focusing between the center and outside focus and using f/11 to increase the photograph’s depth of focus. For a person using the headset, this dishing of the focus is likely not a problem, as their eye will refocus based on their center of vision.

As discussed in Part 5B, the MQP’s pancake optics have severe pincushion distortion, requiring significant digital pre-correction to make the net result flat/rectilinear. Most notably, the outside areas of the display have about one-third the linear pixels per degree of the center.

Shown next are nine crops from the full-resolution picture (click to see): the center, the four corners, and the top, bottom, left, and right of the camera’s FOV.

The main things I learned from this exercise are the apparent dishing of the optics’ focus and the falloff in brightness. I had already determined the change in resolution in the studies shown in Part 5B.

Some Feedback on Immersed (and All Other VR/AR/MR) Virtual Monitor Placement Control

While Immersed had the features I wanted, it was difficult to control the setup of the monitors. The software feels very “beta,” and the interface I got differed from most of the help documentation and videos, suggesting it is a work in progress. In particular, I couldn’t figure out how to pin the screen, as the control for pinning shown in the help guides/videos didn’t seem to exist in my version. So I had to start from scratch in each session, and often within a session.

Trying to orient and resize the screen with controllers or hand gestures was needlessly difficult. I would highly suggest that Immersed look at how 3-D CAD software controls 3-D models. For example, it would be great to have a single (virtual) button that positions the center monitor directly in front of and perpendicular to the user. It would also be a good idea to allow separate control of tilt, virtual distance, and zoom/resize while keeping the monitor centered.

The software seemed to be “aware” of things in the room, which only served to fight what I wanted to do. I was left contorting my wrist to get the monitor roughly perpendicular and then playing with the corners to both resize and center the monitor. The interface also appears to conflate “resizing” with moving the monitor closer. While both moving the virtual monitor closer and resizing it change the apparent size of everything, the effect will be different when the head moves. I would have a home (perpendicular and center) “button,” and then left-right, up-down, tilt, distance, and size controls.
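The difference between resizing and moving the monitor closer can be checked with a little trigonometry. The sketch below is only an illustration with made-up sizes and distances (none of these numbers come from Immersed or the MQP): two flat monitors that subtend the same angle from the starting position stop matching as soon as the head moves toward them.

```python
import math


def angular_width_deg(width_m, distance_m):
    """Horizontal angle subtended by a flat monitor viewed head-on."""
    return math.degrees(2.0 * math.atan((width_m / 2.0) / distance_m))


# Hypothetical setups: a small, near monitor and a large, far monitor.
near_small = {"width_m": 1.0, "distance_m": 1.0}
far_large = {"width_m": 2.0, "distance_m": 2.0}

# From the starting head position, both fill the same angle.
before_near = angular_width_deg(near_small["width_m"], near_small["distance_m"])
before_far = angular_width_deg(far_large["width_m"], far_large["distance_m"])

# Assume the head leans 0.5 m toward the monitors.
head_move_m = 0.5
after_near = angular_width_deg(near_small["width_m"],
                               near_small["distance_m"] - head_move_m)
after_far = angular_width_deg(far_large["width_m"],
                              far_large["distance_m"] - head_move_m)

print(f"before: near {before_near:.1f} deg, far {before_far:.1f} deg")
print(f"after:  near {after_near:.1f} deg, far {after_far:.1f} deg")
```

With these assumed numbers, the near monitor grows much more than the far one for the same head movement, which is why separate distance and size controls would be useful.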

To be fair, I only wanted to set up the screen for a few pictures, and I may have overlooked something. Still, I found the user interface could be vastly better for setting up the monitors, and sizing and positioning the monitor with the controllers or gestures was a big fail in my use.

BTW, I don’t want to just pick on Immersed for this “all-in-one” control problem. Every VR and AR/MR headset I have tried that supports virtual monitors has made it a pain to place the monitors in 3-D space; none gives the user good, simple, intuitive controls. Meta’s Horizon Desktop goes to the extreme of giving no control and overly curved screens.

Other Considerations and Conclusions in Part 5D

This series-within-a-series on the use of VR headsets and the AVP as an “office monitor replacement” has become rather long, with many pictures and examples. I plan to wrap it up with a separate article on issues to consider and my conclusions.

Appendix 1: Test Patterns

Below is a gallery of PNG file test patterns used in this article. Click on each thumbnail to see the full-resolution test pattern.

Appendix 2: Some More Background Information

More Comments on Font Sizes with Windows

As discussed in Appendix 3: Confabulating typeface “points” (pt) with Pixels – A Brief History, a font “point” is defined as 1/72nd of an inch (some use 1/72.272 or thereabout; it is a complicated history). Microsoft throws in the concept of 96 dots per inch (dpi) as 100% scaling. But it is not that simple.

I wanted to share measurements regarding the Calibri 11pt font size. After measuring it on my monitor, which has a resolution of 110 pixels per inch (PPI), I found that it translates to approximately 8.44pt (8.44/72 inches). However, when factoring in the monitor PPI of 110 and the Windows DPI of 96, the font size increases to ~9.67pt. Alternatively, when assuming a monitor PPI of 72, the font size increases to ~12.89pt. Interestingly, if printed assuming a resolution of 96ppi, the font reaches the standard 11pt size. It seems Windows applies some additional scaling on the screen. Nevertheless, I regularly use the 11pt font at 100% on my 110ppi monitor, which is the Windows default in Excel and Word, and it is also the basis for the test patterns.
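The ~9.67pt and ~12.89pt figures follow from rescaling the measured 8.44pt by the ratio of the monitor’s actual PPI to an assumed PPI. A minimal sketch of that arithmetic is below; the helper name `rescale_points` is hypothetical, and the code only reproduces the numbers quoted above rather than explaining what Windows does internally.

```python
def rescale_points(size_pt, actual_ppi, assumed_ppi):
    """Rescale a point size measured at actual_ppi as if the display
    had assumed_ppi pixels per inch (same pixel count, different inches)."""
    return size_pt * actual_ppi / assumed_ppi


measured_pt = 8.44    # Calibri 11pt as measured on the 110 PPI monitor
monitor_ppi = 110.0   # physical pixels per inch of the monitor

# Treating the same pixels as if they were on a 96 dpi display.
at_96_dpi = rescale_points(measured_pt, monitor_ppi, 96.0)

# Treating the same pixels as if they were on a 72 PPI display,
# where one point equals one pixel.
at_72_ppi = rescale_points(measured_pt, monitor_ppi, 72.0)

print(f"assuming 96 dpi: {at_96_dpi:.2f} pt")
print(f"assuming 72 PPI: {at_72_ppi:.2f} pt")
```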

How pictures were shot and moiré

As discussed in 5A’s Appendix 2: Notes on Pictures, some moiré issues will be unavoidable when taking high-resolution pictures of a display device. As noted in that Appendix, all pictures in Lens Shootout were taken with the same camera and lens, and the original images were captured at 405 megapixels (Canon R5 “IBIS sensor shift” mode) and then scaled down by 3X linearly. All test patterns used in this article are included in Appendix 1.
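As a quick consistency check, a 3X reduction applies to each linear dimension, so the pixel count drops by a factor of nine. This matches the 405-megapixel capture and the 45-megapixel scaled picture mentioned earlier. The function name below is just for illustration.

```python
import math


def linear_scale_factor(source_megapixels, target_megapixels):
    """Per-axis reduction needed to go from source to target pixel count."""
    return math.sqrt(source_megapixels / target_megapixels)


capture_mp = 405.0  # Canon R5 sensor-shift capture
scaled_mp = 45.0    # size of the scaled-down picture

factor = linear_scale_factor(capture_mp, scaled_mp)
print(f"linear scale factor: {factor:.1f}X")
print(f"pixel-count reduction: {capture_mp / scaled_mp:.0f}X")
```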