The system takes video input, reduces the quality of each frame, removes the object from the reduced-quality frames, then restores the quality of the altered frames and integrates them back into the video. The object removal takes just 41 milliseconds, which works out to roughly 24 frames per second if that is the full per-frame cost, so users effectively see the video in real time. However, "reducing frame quality" seems like a vague description of what such a system actually does.
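To make the described steps concrete, here is a minimal sketch of that reduce, remove, restore, and integrate loop on a single grayscale frame. It is only an illustration under stated assumptions: it assumes "reducing quality" means lowering the resolution, and the removal step is a crude placeholder (fill with the average of the visible pixels), not the system's actual method. All function names (`downscale`, `fill_masked`, `remove_object`, and so on) are hypothetical.

```python
def downscale(frame, factor):
    """Shrink a grayscale frame by averaging each factor x factor block."""
    h, w = len(frame), len(frame[0])
    small = []
    for y in range(0, h, factor):
        row = []
        for x in range(0, w, factor):
            block = [frame[yy][xx]
                     for yy in range(y, min(y + factor, h))
                     for xx in range(x, min(x + factor, w))]
            row.append(sum(block) / len(block))
        small.append(row)
    return small


def downscale_mask(mask, factor):
    """A low-res pixel counts as masked if any pixel in its block is masked."""
    h, w = len(mask), len(mask[0])
    return [[any(mask[yy][xx]
                 for yy in range(y, min(y + factor, h))
                 for xx in range(x, min(x + factor, w)))
             for x in range(0, w, factor)]
            for y in range(0, h, factor)]


def fill_masked(frame, mask):
    """Placeholder removal step: overwrite masked pixels with the mean of the rest."""
    known = [v for row, mrow in zip(frame, mask)
             for v, m in zip(row, mrow) if not m]
    if not known:  # nothing visible to borrow from, so leave the frame alone
        return [row[:] for row in frame]
    fill = sum(known) / len(known)
    return [[fill if m else v for v, m in zip(row, mrow)]
            for row, mrow in zip(frame, mask)]


def upscale(frame, factor, height, width):
    """Nearest-neighbour enlargement back to the original size."""
    return [[frame[y // factor][x // factor] for x in range(width)]
            for y in range(height)]


def remove_object(frame, mask, factor=2):
    """Reduce, remove, restore, then integrate only the masked region."""
    h, w = len(frame), len(frame[0])
    small = downscale(frame, factor)
    small_mask = downscale_mask(mask, factor)
    cleaned = fill_masked(small, small_mask)
    restored = upscale(cleaned, factor, h, w)
    # Keep original pixels wherever the object was not present.
    return [[restored[y][x] if mask[y][x] else frame[y][x] for x in range(w)]
            for y in range(h)]


# A flat background (value 10) with a bright two-pixel "object" (value 200).
frame = [[10] * 4 for _ in range(4)]
mask = [[False] * 4 for _ in range(4)]
for x in (1, 2):
    frame[1][x] = 200
    mask[1][x] = True

result = remove_object(frame, mask)
```

Working on the smaller frame means the expensive removal step touches a quarter of the pixels at `factor=2`, which is one plausible reason a system would reduce quality first; only the masked region is pasted back, so the rest of the frame keeps its full detail.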

Based on the video, the system seems to have a similar effect to using the mask, rubber stamp, and other Photoshop tools to take a blemish out of a picture. In Photoshop, you take details from the surrounding area and draw over the blemish with details that blend in with the rest of the picture. This requires the surrounding area to be of a consistent color and texture, so that details can be borrowed from it; this sort of editing can't draw details it can't see, and can only use what is visible to produce a composite.
All of the examples in the video were of contrasting objects against consistent backgrounds, and the more consistent the background, the better the result. On flat surfaces like a red fabric chair or a black bathroom counter, the objects were removed almost flawlessly, while more textured, detailed surfaces like a wood-grain desktop or a brick street showed flickers and inconsistencies. Still, it's a very impressive technology to see working in real time.
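The link between background consistency and result quality can be illustrated with a toy fill-from-the-surroundings routine. This is a sketch under stated assumptions, not the system's algorithm: it uses simple neighbour averaging (a basic diffusion fill) as a stand-in for whatever the real system does, and the names `inpaint` and `masked_error` are hypothetical.

```python
def inpaint(frame, mask, iterations=200):
    """Fill masked pixels by repeatedly averaging their 4-neighbours."""
    h, w = len(frame), len(frame[0])
    out = [[0.0 if mask[y][x] else float(frame[y][x]) for x in range(w)]
           for y in range(h)]
    for _ in range(iterations):
        nxt = [row[:] for row in out]
        for y in range(h):
            for x in range(w):
                if mask[y][x]:
                    nbrs = [out[ny][nx]
                            for ny, nx in ((y - 1, x), (y + 1, x),
                                           (y, x - 1), (y, x + 1))
                            if 0 <= ny < h and 0 <= nx < w]
                    nxt[y][x] = sum(nbrs) / len(nbrs)
        out = nxt
    return out


def masked_error(result, truth, mask):
    """Mean absolute difference from the true background, over masked pixels only."""
    diffs = [abs(result[y][x] - truth[y][x])
             for y in range(len(truth))
             for x in range(len(truth[0]))
             if mask[y][x]]
    return sum(diffs) / len(diffs)


SIZE = 6
# The "object" covers a 2 x 2 patch in the middle of the frame.
mask = [[2 <= y <= 3 and 2 <= x <= 3 for x in range(SIZE)] for y in range(SIZE)]

# Flat background: every pixel has the same value.
flat = [[50] * SIZE for _ in range(SIZE)]
# Textured background: alternating dark and bright vertical stripes.
striped = [[100 if x % 2 else 0 for x in range(SIZE)] for _ in range(SIZE)]

flat_err = masked_error(inpaint(flat, mask), flat, mask)
striped_err = masked_error(inpaint(striped, mask), striped, mask)
```

On the flat background the fill converges to the surrounding value, so `flat_err` is essentially zero. On the striped background the averaging smears the stripes into a mid-gray patch, so `striped_err` is large: the fill can only blend what it sees around the hole, and has no way to recover the pattern that was hidden.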