In Brief: Project Naptha OCRs Web Images

By Norman Chan

Coolest web demo I've seen in a while.

If you're using Chrome, try this new web demo out right now. Project Naptha is a browser extension that taps into open-source OCR (optical character recognition) algorithms to let you copy and paste text from web images straight from your browser. It works very much like OCR software did a decade ago, except that instead of processing text from a scanned document, it can pull text from a webcomic, a screenshot, or even an Advice Animal image macro. The secret sauce isn't just OCR transcription, but a technique called Stroke Width Transform that detects whether there's text embedded in an image in the first place. The extension uses several tricks to hide the computation time: it tracks cursor movement and predicts where you might highlight over an image, then scans ahead and runs the processor-intensive character recognition algorithms before you get there. Its creators are also experimenting with the ability to translate highlighted text (much like the Word Lens app) and even with "inpainting" algorithms that erase text from an image (similar to Photoshop's Content-Aware Fill feature).
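To get a feel for the cursor-prediction trick, here is a minimal Python sketch. It is not Project Naptha's actual code, and the extension's real heuristics aren't described here; the sketch simply assumes straight-line extrapolation from the last two cursor samples. The names `Region`, `predict_position`, and `region_to_prefetch`, along with the 200 ms lookahead, are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """A hypothetical detected text region inside an image, in pixels."""
    name: str
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, x, y):
        return self.left <= x <= self.right and self.top <= y <= self.bottom


def predict_position(samples, lookahead_ms):
    """Extrapolate the cursor in a straight line from the last two samples.

    Each sample is a (timestamp_ms, x, y) tuple. Linear extrapolation is an
    assumption made for this sketch, not the extension's documented method.
    """
    (t0, x0, y0), (t1, x1, y1) = samples[-2], samples[-1]
    dt = t1 - t0
    if dt <= 0:
        # No usable velocity estimate: assume the cursor stays put.
        return x1, y1
    vx = (x1 - x0) / dt
    vy = (y1 - y0) / dt
    return x1 + vx * lookahead_ms, y1 + vy * lookahead_ms


def region_to_prefetch(samples, regions, lookahead_ms=200):
    """Return the first region the cursor is predicted to reach, or None."""
    x, y = predict_position(samples, lookahead_ms)
    for region in regions:
        if region.contains(x, y):
            return region
    return None


if __name__ == "__main__":
    regions = [
        Region("title", 0, 0, 100, 40),
        Region("caption", 120, 60, 300, 100),
    ]
    # Cursor moved from (0, 0) to (50, 25) over 100 ms.
    samples = [(0, 0, 0), (100, 50, 25)]
    target = region_to_prefetch(samples, regions)
    print(target.name if target else "nothing to prefetch")
```

In this toy run the cursor is still over the "title" region, but its projected position 200 ms ahead lands in "caption", so that is the region a prefetcher would start recognizing early.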