Discussion
IshKebab: Pretty impressive results! Seems like someone has even made a GUI for it: https://github.com/edenaion/EZ-CorridorKeyStill Python unfortunately.
Computer0: Looking forward to trying it out, 8gb of vram or unified memory required!
Springtime: In an earlier video they made a couple years back about Disney's sodium vapor technique Paul Debevec suggested he was considering creating a dataset using a similar premise: filming enough perfectly masked references to be able to train models to achieve better keying. So it was interesting seeing Corridor tackle this by instead using synthetic data.
somat: With regards to the sodium vapor process, an idea has been peculating in the back of my head ever since I saw that video. But I don't really have the budget or inclination to try it out.theory: make the mask out of non-visable lightilluminate the backing screen in near Infra-Red light.point two cameras at a splitting prism with a near IR pass filterLeave the 90 degree(unaltered path) camera untouched, this is the visible camera.Remove the IR filter from the 180 degree(filter path) camera, this is the mask camera.Now you get a perfect non-color shifting mask, The splitting prism does hurt light intake. It would be worth trying to put the cameras really close together and see if that is good enough.
ralusek: I'm a software engineer that, like the vast majority of you, uses AI/agents in my workflow every day. That being said, I have to admit that it feels a little weird to hear someone who does not write code say that they built something, without even mentioning that they had an agent build it (unless I missed that).
overvale: Debevec tried a version of this: https://arxiv.org/abs/2306.13702
diacritical: From ~04:10 till 05:00 they talk about sodium-vapor lights and how Disney has the exclusive rights to use it. From what I read the knowledge on how to make them is a trade secret, so it's not patented. Seems weird that it would be hard to recreate something from the 1950's.I also wonder how many hours were wasted by people who had to use inferior technology because Disney kept it secret. Cutting out animals and objects from the background 1 frame at a time seems so mindnumbingly boring.
meatmanek: The lights are relatively easy to get. iirc (it's been a bit since I watched their full video on the subject[1]) the hard part to find was the splitter that sends the sodium-vapor light to one camera and everything else to another camera.1. https://www.youtube.com/watch?v=UQuIVsNzqDk
wiml: This approach was used in the 1950s/60s with ultraviolet light (rather than IR) to create a traveling matte. I'm not sure why visible-light techniques won out. Easier to make sure that the illumination is set up correctly, maybe?
superjan: Watched this a few days ago. The video is light on technical details, except maybe that they used CGI to generate training data.
rhdunn: The idea behind a greenscreen is that you can make that green colour transparent in the frames of footage allowing you to blend that with some other background or other layered footage. This has issues like not always having a uniform colour, difficulty with things like hair, and lighting affecting some edges. These have to be manually cleaned up frame-by-frame, which takes a lot of time that is mostly busy work.An alternative approach (such as that used by the sodium lighting on Mary Poppins) is that you create two images per frame -- the core image and a mask. The mask is a black and white image where the white pixels are the pixels to keep and the black pixels the ones to discard. Shades of gray indicate blended pixels.For the mask approach you are filming a perfect alpha channel to apply to the footage that doesn't have the issues of greenscreen. The problem is that this requires specialist, licensed equipment and perfect filming conditions.The new approach is to take advantage of image/video models to train a model that can produce the alpha channel mask for a given frame (and thus an entire recording) when just given greenscreen footage.The use of CGI in the training data allows the input image and mask to be perfect without having to spend hundreds of hours creating that data. It's also easier to modify and create variations to test different cases such as reflective or soft edges.Thus, you have the greenscreen input footage, the expected processed output and alpha channel mask. You can then apply traditional neural net training techniques on the data using the expected image/alpha channel as the target. For example, you can compute the difference on each of the alpha channel output neurons from the expected result, then apply backpropagation to compute the differences through the neural network, and then nudge the neuron weights in the computed gradient direction. Repeat that process across a distribution of the test images over multiple passes until the network no longer changes significantly between passes.
diacritical: Yup, I wanted to say that the prisms are hard to recreate, not the light itself.
actionfromafar: I'll do you one better, which requires no special cameras (most have IR filters) nor double cameras or prisms.Shoot the scene in 48 or 96 fps. Sync the set lighting to odd frames. Every odd frame, the set lights are on. Every even frame, set lights are off.For the backing screen, do the reverse. Even frames, the backing screen is on. Odd frames, backing screen is off.There you go. Mask / normal shot / Mask normal shot / Mask ... you get the idea.Of course, motion will cause normal image and mask go out of sync, but I bet that can be remedied by interpolating a new frame between every mask frame. Plus, when you mix it down to 24fps you can introduce as much motion blur and shutter angle "emulation" as you want.
amluto: Surely this makes your actors feel sick? And wouldn’t it make your motion blur look dashed and also cause artifacts at the edge of the mask if there’s a lot of motion?