Extract Images from an Excel Document

Source: http://stackoverflow.com/questions/5503015/extract-images-from-an-excel-document

First, use unoconv to convert the .xls to .pdf:

http://dag.wieers.com/home-made/unoconv/

On Ubuntu 10.10 command line:

sudo apt-get install unoconv
unoconv -f pdf file.xls
Then extract the images from the PDF using pdfimages (part of the poppler-utils package, which seems to come bundled with Ubuntu):

http://en.wikipedia.org/wiki/Pdfimages

Back on the command line:

pdfimages file.pdf fileimage
And done! All of the images in the .xls are now in separate files in the current directory, each named with the fileimage prefix. The same two steps can be scripted on most Linux systems in your language of choice. In Python, for example:

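import subprocess
# Convert the spreadsheet to PDF (writes file.pdf)
subprocess.call(['unoconv', '-f', 'pdf', 'file.xls'])
# Extract every image in the PDF into files prefixed with "fileimage"
subprocess.call(['pdfimages', 'file.pdf', 'fileimage'])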
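
The two calls above carry on even if a tool fails. Here is a minimal sketch of the same pipeline that stops on the first error. It assumes unoconv writes the PDF next to the input file, as it does in the commands above. The names build_commands and extract_images are made up for illustration and are not part of either tool.

```python
import subprocess
from pathlib import Path


def build_commands(xls_path, image_prefix="fileimage"):
    """Return the two command lines used above, in order."""
    xls = Path(xls_path)
    # Assumption: unoconv writes <name>.pdf beside the input file.
    pdf = xls.with_suffix(".pdf")
    return [
        ["unoconv", "-f", "pdf", str(xls)],
        ["pdfimages", str(pdf), image_prefix],
    ]


def extract_images(xls_path, image_prefix="fileimage"):
    """Run both steps; check_call raises CalledProcessError on failure."""
    for command in build_commands(xls_path, image_prefix):
        subprocess.check_call(command)
```

Calling extract_images('file.xls') should behave like the two-line version, except that a failed conversion raises an exception instead of being silently ignored.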

I would love to hear a simpler solution if somebody has one.
******************************************************************************************

If the Excel file is an .xlsx, it is already a compressed (zip) archive, so you can unzip it directly:

$ unzip file.xlsx

All of the pictures are in xl/media/.
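
The same thing can be done without unpacking the whole archive. This is a rough sketch using Python's standard zipfile module. The function name extract_media and the out_dir parameter are made up for illustration, and it assumes the pictures sit directly under xl/media/ as described above.

```python
import zipfile
from pathlib import Path


def extract_media(xlsx_path, out_dir="."):
    """Copy every file under xl/media/ in the .xlsx into out_dir."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    with zipfile.ZipFile(xlsx_path) as archive:
        for name in archive.namelist():
            # Skip everything except files inside xl/media/
            if name.startswith("xl/media/") and not name.endswith("/"):
                target = out / Path(name).name
                target.write_bytes(archive.read(name))
                saved.append(target)
    return saved
```

For example, extract_media('file.xlsx', 'images') would write the pictures into an images directory and return their paths. An old-style .xls file is not a zip archive, so this only applies to .xlsx.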
