Monday, May 17, 2010

Using Microsoft Office OneNote to Convert Image to Text



OneNote does two things fairly well - clipping images from the screen, and copying text from pictures. Combine the two, and you have a quick and dirty way to get text from a locked or scanned image PDF.

I will assume that you already have the OneNote Screen Clipper and Launcher running. If not, enable it and set it to run from start-up. If you prefer a different screen clipper, skip to step 2 below.

Step 1: Use Win+S to clip the portion of the PDF you need.
Step 2: Paste the image into OneNote.
Step 3: Right-click the image and select "Copy Text from Picture".
Step 4: Use Ctrl+V to paste the text elsewhere.
Step 5: Check that the text was recognized correctly.

A few additional tips:
Tip 1: OneNote tries to match both the text and its formatting, so it might work better to paste into a format-free editor like Notepad.
Tip 2: The clearer the text, the better OneNote will recognize it, so make the text you want as big as possible before clipping.
Tip 3: You can use this for any image, not just PDFs.

The title picture shows a screen clipping of the Google home page, and the right-click menu in OneNote. The first picture below shows what it looks like if you copy the text from the picture and paste it into OneNote.



Finally, the text below shows what OneNote recognized if you paste into Notepad instead. Note that the recognition isn't perfect:


Web Images Videos Maps News Shoiing Gmail v
Google
images
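
If you have a known-good copy of part of the text, the check in step 5 can be partly automated. The following is a minimal sketch in Python using the standard difflib module; the function name flag_suspect_words is hypothetical, and the reference string is an assumption typed by hand (it guesses that "Shoiing" was meant to be "Shopping"), not something OneNote provides.

```python
import difflib


def flag_suspect_words(recognized, reference):
    """Return (recognized, reference) pairs wherever the two texts differ, word by word."""
    rec_words = recognized.split()
    ref_words = reference.split()
    matcher = difflib.SequenceMatcher(None, rec_words, ref_words)
    suspects = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # Keep both sides so the mismatch can be reviewed by eye.
            suspects.append((" ".join(rec_words[i1:i2]), " ".join(ref_words[j1:j2])))
    return suspects


# Text as pasted from OneNote (first line of the sample above, minus the stray "v").
recognized = "Web Images Videos Maps News Shoiing Gmail"
# Assumption: a hand-typed reference of what the image actually says.
reference = "Web Images Videos Maps News Shopping Gmail"

suspects = flag_suspect_words(recognized, reference)
for got, expected in suspects:
    print("recognized %r, expected %r" % (got, expected))
```

Without a reference text this does not help, and proofreading by eye is still the only check.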