Origami/pdfextract
Description
Extracts various data from a PDF document (streams, scripts, images, fonts, metadata, attachments).
Usage
Syntax
Usage: /usr/local/bin/pdfextract <PDF-file> [-afjms] [-d <output-directory>]
Options
- -d, --output-dir DIR: Output directory
- -s, --streams: Extracts all decoded streams
- -a, --attachments: Extracts file attachments
- -f, --fonts: Extracts embedded font files
- -j, --js: Extracts JavaScript scripts
- -m, --metadata: Extracts metadata streams
- -i, --images: Extracts embedded images
- -h, --help: Show this message
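The options above can be combined in a single run. As a rough sketch, the following Python helper assembles a pdfextract command line from the short flags listed above. The helper name build_pdfextract_command, its keyword names, and the "out" directory are hypothetical and only serve the illustration. It assumes pdfextract is reachable on the PATH, and it only builds the argument list without running the tool.

```python
import shlex

# Short flags as listed in the Options section, keyed by a hypothetical keyword name.
FLAGS = {
    "streams": "-s",
    "attachments": "-a",
    "fonts": "-f",
    "js": "-j",
    "metadata": "-m",
    "images": "-i",
}


def build_pdfextract_command(pdf_path, output_dir=None, executable="pdfextract", **selected):
    """Return the argument list for one pdfextract run (hypothetical helper)."""
    unknown = set(selected) - set(FLAGS)
    if unknown:
        raise ValueError(f"unknown extraction type(s): {sorted(unknown)}")
    # Follow the documented syntax: <PDF-file> first, then the option flags.
    command = [executable, pdf_path]
    # Emit flags in the order of the FLAGS table so the result is deterministic.
    command += [flag for name, flag in FLAGS.items() if selected.get(name)]
    if output_dir is not None:
        command += ["-d", output_dir]
    return command


if __name__ == "__main__":
    cmd = build_pdfextract_command(
        "/data/tmp/pdf1.pdf", js=True, attachments=True, output_dir="out"
    )
    # Print a shell-quoted version of the command instead of executing it.
    print(shlex.join(cmd))
```

To actually run the tool, the returned list could be passed to subprocess.run(cmd, check=True), provided pdfextract is installed.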
Example
Let's extract JavaScript contained in the pdf1.pdf file:
$ ./pdfextract -j /data/tmp/pdf1.pdf
Extracted 1 scripts to 'pdf1.dump/scripts'.
The JavaScript has been dumped to the "pdf1.dump/scripts/" directory:
$ cat pdf1.dump/scripts/script_576906449.js
function re(count,what) {
  var v = "";
  while (--count >= 0)
    v += what;
  return v;
}
[SNIP]
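When a document contains several scripts, it can help to list what was dumped before opening each file. The sketch below assumes the layout seen in the example, a "scripts" subdirectory holding ".js" files inside the dump directory. The function name list_extracted_scripts is hypothetical, and other versions of the tool may use a different layout.

```python
from pathlib import Path


def list_extracted_scripts(dump_dir):
    """Return sorted (file name, size in bytes) pairs for .js files in <dump_dir>/scripts."""
    scripts_dir = Path(dump_dir) / "scripts"
    if not scripts_dir.is_dir():
        # Nothing was extracted, or the layout differs from the one assumed here.
        return []
    return sorted((p.name, p.stat().st_size) for p in scripts_dir.glob("*.js"))


if __name__ == "__main__":
    # "pdf1.dump" is the dump directory from the example above.
    for name, size in list_extracted_scripts("pdf1.dump"):
        print(f"{name}\t{size} bytes")
```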