Extract Images from an Excel Document

Source: http://stackoverflow.com/questions/5503015/extract-images-from-an-excel-document

First, use unoconv to convert the .xls to .pdf:

http://dag.wieers.com/home-made/unoconv/

On Ubuntu 10.10 command line:

sudo apt-get install unoconv
unoconv -f pdf file.xls
Then extract the images from the PDF using pdfimages (part of the poppler-utils package, which seems to come bundled with Ubuntu):

http://en.wikipedia.org/wiki/Pdfimages

Back on the command line:

pdfimages file.pdf fileimage
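And done! All of the images in the .xls are now in separate files in the current directory, each named with the fileimage prefix. By default pdfimages writes PPM/PBM files; pass -j if you want JPEG images saved as .jpg instead. This could be done very easily on most Linux systems using your language of choice. In Python, for example: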

import subprocess

# Convert the spreadsheet to PDF, then pull the images out of the PDF.
subprocess.call(['unoconv', '-f', 'pdf', 'file.xls'])
subprocess.call(['pdfimages', 'file.pdf', 'fileimage'])

I would love to hear a simpler solution if somebody has one.
******************************************************************************************

If the Excel file is an .xlsx, it is already a compressed (zip) archive, so you can simply unzip it:

$ unzip file.xlsx

All of the pictures are in xl/media/.
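
The unzip approach can also be done from Python with the standard zipfile module, which avoids unpacking the whole archive. This is only a minimal sketch: the function name extract_xlsx_media and the output directory are made up for illustration, and it assumes the pictures live under xl/media/ as described above.

```python
import os
import zipfile


def extract_xlsx_media(xlsx_path, out_dir):
    """Copy every file under xl/media/ in the .xlsx archive into out_dir.

    Hypothetical helper for illustration; returns the list of paths written.
    """
    os.makedirs(out_dir, exist_ok=True)
    written = []
    with zipfile.ZipFile(xlsx_path) as archive:
        for name in archive.namelist():
            # Skip everything outside xl/media/ and any directory entries.
            if not name.startswith('xl/media/') or name.endswith('/'):
                continue
            # Flatten the path so images land directly in out_dir.
            target = os.path.join(out_dir, os.path.basename(name))
            with archive.open(name) as src, open(target, 'wb') as dst:
                dst.write(src.read())
            written.append(target)
    return written
```

Calling extract_xlsx_media('file.xlsx', 'images') should leave the pictures in an images/ directory. Note that this only works for .xlsx; the older binary .xls format is not a zip archive, so the unoconv route above still applies there.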
