
What does the Vision Pro mean for the future of video?

Your 5-minute video trends and marketing update.

Video Signals

Anything happen in the tech world yesterday?

So I was organizing our usual 5-3-1 newsletter format but decided to write my thoughts about Apple’s new announcements and what they mean for video.

And then I kept writing. And writing.

So this’ll be a special edition of the VideoBrand Newsletter - all about Apple’s announcements and what it specifically means (or could mean) for the future of media and video.

I hope you enjoy and would love to get your thoughts. Just reply to this email or join us over on the VB Community.

What does the Vision Pro mean for the future of video?

There will be tons of think pieces and thoughts and opinions on what we know about the device and what a future with spatial computing means (and if it’ll take off).

Also, only a handful of people at WWDC got to briefly demo the headset (MKBHD says “very impressed”), and this device isn’t coming out for at least 7 months, probably longer.

But I wanted to look at what this means for video, both as a productivity tool for making videos and as different ways we might watch or consume content.

But first, Vision Pro wasn’t the only announcement. There were a ton of other updates across all the product lines.

Video related ones:

  • Mac Pro finally gets upgraded from Intel to Apple silicon with the M2 Ultra chip. But this computer is overkill for 99% of use cases. Get the Mac Studio.

  • New Presenter Overlay mode uses machine learning to separate you from your background so slides or a screen share can sit between the two. A simple presentation feature, but it shows the potential of real-time video effects processing built right into the devices.

  • A lot of updates bring more uses and integration to Apple TV through AirPlay. TV as a content-watching device is coming back. Half of YouTube views in the US are on the TV.

Presenter Overlay demo

Ok. The Vision Pro.

So I have Meta’s Quest 2. The first time I tried it on and loaded one of the Star Wars virtual games, I had an ‘oh shit’ moment. Being fully immersed in a world was a completely different experience.

There’s only so much you can learn about VR and AR from reading and watching (on 2D screens) - until you try the device on and have that ‘aha’ moment, a lot of it is hard to convey.

Now with the Apple device, there are a lot of noticeable differences around how they framed and demoed this compared to Meta’s vision.

Singular focus on augmented reality - overlaying the interface on the room you’re in, and blending the real and virtual worlds. I think maybe 2-3 shots showed a fully virtual space; everything else was an augmented reality hybrid.

I don’t think they mentioned virtual reality once, and they definitely didn’t mention the metaverse.

Very minimal 3D in the demo. A lot of 2D panels floating in 3D space.

And a focus on being a different type of computer - spatial computing. But still a computer for doing things.

Like an iPad on your face (it looks like the initial apps will be iPad apps ported to the Vision Pro, plus whatever developers come out with before launch).

So let’s look at this spatial computer and what it could mean for making videos.

Spatial Video Editing

Final Cut in Vision Pro

This is the most immediate, obvious benefit of having a monitor strapped to your face - you have unlimited screen real estate for all your computer windows.

Yes - the Vision Pro will connect to your Mac computer and can be used as a virtual monitor with your keyboard and mouse. When they showed a demo of this, Final Cut was loaded on the screen. But it was only in one window.

To truly unlock the power (assuming it’s capable of it, which I’d hope), I’d want to see Final Cut split into 2 separate windows plus a third window for video playback.

There are apps that make this possible on Meta, but with the Quest 2 the resolution was so low that everything was hard to read and if the device wasn’t perfectly aligned with my eyes it’d be blurry. Hopefully, the Vision Pro’s 4K screens per eye solve this.

Now this is running off a separate Mac - what about built-in apps?

Apple just released Final Cut for iPad, so I’d expect it to work on the Vision Pro. But will it take advantage of unlimited space or be restricted to one window? We’ll see.

I’d also expect other iPad editing staples to have some Vision Pro version, like LumaFusion.

Special thanks to our sponsors for making the newsletter possible

📮 Metricool ► Easy yet powerful social media scheduling and metrics dashboards.

📺 OpenReel ► No fuss video creation, great for marketing videos

🔒 Vestigit ► Secure and protect your video content

🪧 Adspective ► Use AI to dynamically place ads and products in existing videos and photos

🗄️ MASV ► Transfer large files and folders fast

New Types of Video?

Spatial Video

One of the other big features was the Vision Pro’s ability to use its many cameras to take what Apple calls Spatial Photos and Spatial Video - Minority Report-like memories that have some form of 3D-ness captured, so the memory feels more real when played back.

This is one of those things that’s like, “hmm, do we really need that?” but it’s impossible to fully understand it in a 2D demo.

This quote from Mike Podwal, an AR/VR engineer, caught my attention:

Spatial Photos/Videos: this will blow people's minds. The realism will transform how we think about capturing and reliving memories. (That they're being captured from first-person POV will only add to the feeling of 'being there' again). Very hard to understand how cool this is with a 2D demo video.

So Mike’s argument is that the camera being in the goggles adds to the feeling of being there, since it’s a POV angle.

The problem is…you look like this when taking videos:

If you hold a camera up to your face you still get the POV experience - so I’m wondering, will we see cameras (or future iPhones) with the ability to capture Spatial Video?

And while we’re on the subject of capturing reality, where do NeRFs fit into this spatial future?

Neural radiance fields (NeRFs) are a technique that generates 3D representations of an object or scene from 2D images by using advanced machine learning.

Imagine taking some photos and being able to stitch them together into a fully explorable 3D space with your Vision Pro.
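For the curious, the core trick is smaller than it sounds: a NeRF learns to predict a density and a color at any 3D point, and a classic volume-rendering step composites those predictions along each camera ray into a pixel. Here’s a toy sketch of just that compositing step - the function name, shapes, and values are mine for illustration, not from any particular NeRF library:

```python
import numpy as np

def render_ray(sigmas, colors, deltas):
    """Composite color samples along one camera ray, NeRF-style.

    sigmas: (N,) volume densities predicted at N sample points along the ray
    colors: (N, 3) RGB colors predicted at those same points
    deltas: (N,) distances between consecutive sample points
    """
    # Opacity contributed by each segment of the ray
    alphas = 1.0 - np.exp(-sigmas * deltas)
    # Fraction of light surviving to reach each sample (transmittance)
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alphas[:-1]]))
    # Weighted blend of all sample colors gives the final pixel color
    weights = trans * alphas
    return (weights[:, None] * colors).sum(axis=0)
```

An empty ray (all densities zero) renders black, while a dense sample near the camera dominates the pixel and occludes everything behind it - which is exactly the “solid objects block what’s behind them” behavior that makes the reconstructed scene feel like a real 3D space.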

To further go down this rabbit hole - how much actual information do we need to capture from a moment to replicate it, and how much can be machine generated?

Just look at last week’s signal where we talked about the Paragraphica, an AI ‘camera’ that uses metadata like time, location, weather, and points of interest to generate an AI image.

And of course, to bring it full circle, my favorite quote from our interview with Adobe’s Michael Cioni:

My advice to cinematographers and photographers, creatives in general, is to really recognize that, enormously, the percentage of images that are photographed or recorded in the world for production is going to go down.

Michael Cioni, Adobe’s Head of Innovation

Immersive Experiences & 3D Generation

In a throwback to Steve Jobs’ 2005 keynote where they announced a partnership with Disney/ABC to bring TV shows to the iPod, good ol’ Bob Iger came out to announce that Disney would be first in line to roll out new content for the Vision Pro.

But details were vague. A lot of the teaser demo was 2D content in a 3D environment - like watching The Mandalorian from a landspeeder on Tatooine.

The coolest part of the demo was the sports one, with a 3D model appearing for a live-action replay.

They had a little teaser of an F1 race - I’d love to see a 3D model of a track to watch the cars in real time.

But I feel like something is lost with this abundance of caution in distancing the device from ‘the metaverse.’

One of my favorite experiences on the Quest 2 was Tales from the Galaxy’s Edge. It’s a simple shooter game, but the world is modeled on Batuu, which is the same fictional planet from Star Wars Land at Disney World/Land.

Building a slice of the world in real life at the theme parks, and then expanding it in virtual reality was a really fun and cool bridge to the story and possibilities for exploration - a sort of VR cyberflaneur.

In 3D land, the other mind-blowing thing the Vision Pro does is scan your face to create an accurate-looking (…maybe?) virtual avatar of you. This was demoed for use during FaceTime calls - how you show up when you’re wearing the Vision Pro but everyone else is in front of a webcam.

But now that this device can scan your face, build a 3D model of you, and replicate your facial expressions - what else can we do with that?

Create quicker animations with Adobe Character Animator? Load your persona into Unreal’s Metahuman? Let Apple just make a complete copy of you since it can now also replicate your voice?


And lastly - collaboration.

All of the demos were with one person wearing a Vision Pro. There was a lot of emphasis on being able to blend how much of the real world you see versus the virtual world, from a web browser window floating in your living room to a fully immersive, movie-theater-like experience watching Avatar in 3D.

Chris made a great point over on the VideoBrand Community:

I will say this: theaters certainly are afraid of this technology. And I don't look forward to the day when I walk into a room and everyone is wearing these (and not talking to one another) and the solitude/isolation this could potentially create.

What happens when 2 people are in the same room (real or virtual) wearing one? This is probably the area with the most unanswered questions for me.

To Chris’ point - if I’m in the same room with friends/family, and we all have a Vision Pro on and are watching a movie, if we turn to look at each other, do we see our avatar? The other person wearing the goggles?

A lot of the Apple TV updates I mentioned at the beginning of this article were around merging FaceTime with watching TV in sync with friends - so I’d expect something similar to make the viewing experience more communal.

What about working remotely? The virtual face seems like it solves a lot of the presence issues Meta had with the floaty cartoon avatars.

If my co-worker and I are working on our computers with Vision Pros, can we get any type of 3D experience, or would we have to just start a FaceTime call and talk to each other’s virtual avatars in a 2D window floating in our home offices?

So yeah. Lots of questions. Lots of possibilities.

If AR/VR/Spatial Computing ever had a shot at taking off, this is the best thing we’ve seen so far to do it.

And compared to all the other v1 products from Apple - the iPhone, the iPad, the Watch - this looks like the best, most usable v1 product they’ve ever launched.

Apple Watch 1 was so painfully slow I still haven’t gone back to an Apple Watch.

10-20 years from now, this ski-goggle sized device with a dangling power cable will be a sleek pair of glasses you just pop on your face.

How else I can help you:

  • Check out our YouTube channel with lots of free tutorials (and let me know what else you’d like covered)

  • Join our free community

  • Get our YouTube & Video course, which has had over 4000 students

  • If you have a specific project in mind and would like to potentially work together, just hit reply

What'd you think of this email?
