%FIXME: Distinguish properly between options and operations throughout manual.
%FIXME: Finish and document -dump-attachments.
%FIXME: Mention that OpenAction supersedes PageLayout so use -remove-dict-option to get rid of it
%FIXME: Document new -hard-box option
%FIXME: Document that -upright also shifts the page to 0,0
%FIXME: Document %PageDiv2
%FIXME: Document new bookmark format + use of -utf8 for getting good bookmarks
%FIXME: Fix docs on -fit-window and friends
%FIXME: Document new -pad-with (for -pad-before, -pad-after, -pad-every)
%FIXME: Activate documentation for -extract-images (when done)
%FIXME: Document new -artbox, -trimbox, -bleedbox and -remove-artbox, -remove-trimbox, -remove-bleedbox
%FIXME: Document -cropbox and -remove-cropbox as synonyms of -crop and -remove-crop
%FIXME: Document new XMP metadata stuff including setmetadata date and its format
%FIXME: Document new -gs-malformed flag.
%FIXME: Document new -create-metadata
%FIXME: Document -remove-clipping
%FIXME: Document new -list-spot-colours
%FIXME: Document new -pad-multiple-before
%FIXME: Document new @N@@@ @E@@@, @S@@@ options
%FIXME: Document the rotate dance for adding rotated text
%FIXME: Document -gs gs -gs-malformed
%FIXME: Document -gs gs -gs-embed-fonts
%FIXME: Document -merge-add-bookmarks, -merge-add-bookmarks-use-titles
%FIXME: Document -bookmarks-open-to-level
%FIXME: Explain in key places that you probably want UTF8 a lot
\documentclass{book}
\usepackage{palatino}
\usepackage{microtype}
\usepackage{graphics}
\usepackage[plainpages=false,pdfpagelabels,pdfborder=0 0 0]{hyperref}
\usepackage{framed}
\newcommand{\smallgap}{\bigskip}
\newcommand{\cpdf}{\texttt{cpdf}}
\addtolength{\textwidth}{20mm}
\usepackage{makeidx}\makeindex
\usepackage[left=3cm, right=1.5cm, top=2cm, bottom=1.8cm, paperwidth=7.5in, paperheight=9.25in]{geometry}
\usepackage{fancyhdr}
\fancyhf{}
\pagestyle{fancy}
\fancyhead[lo]{\slshape\nouppercase{\leftmark}\hfill\thepage}
\fancyhead[re]{\thepage\hfill\slshape\nouppercase{\leftmark}}
\fancyfoot{}
%\fancyfoot[LE,RO]{\thepage}
\renewcommand{\headrulewidth}{0pt}
\renewcommand{\footrulewidth}{0pt}

\begin{document}
\frontmatter
\pagestyle{empty}
\begin{flushright}
{\sffamily \bfseries \Huge Coherent PDF

\vspace{2mm}
Command Line Toolkit}

\vspace{12mm}
{\Huge User Manual}\\
Version 2.3 (October 2019)

\vspace{25mm}
\vfill
\includegraphics{logo.pdf}

\vspace{2mm}
{\sffamily \bfseries \LARGE Coherent Graphics Ltd}
\end{flushright}

\clearpage
\pagestyle{empty}
\noindent For bug reports, feature requests and comments, email\\
\texttt{contact@coherentgraphics.co.uk}

\vspace*{\fill}
\noindent\copyright 2017 Coherent Graphics Limited. All rights reserved. ISBN 978-0957671140

\smallgap
\noindent Adobe, Acrobat, Adobe PDF, Adobe Reader and PostScript are registered trademarks of Adobe Systems Incorporated. Windows, Powerpoint and Excel are registered trademarks of Microsoft Corporation.

% Letter
\cleardoublepage
\tableofcontents
\cleardoublepage

\chapter*{Typographical Conventions}

Command lines to be typed are shown in \texttt{typewriter\hspace{-1mm} font} in a box. For example:

\begin{framed}
\small\verb!cpdf in.pdf -o out.pdf!
\end{framed}

\noindent When describing the general form of a command, rather than a particular example, square brackets \verb|[]| are used to enclose optional parts, and angled braces \verb!<>! to enclose general descriptions which may be substituted for particular instances. For example,

\begin{framed}
\small\verb!cpdf <operation> in.pdf [<range>] -o out.pdf!
\end{framed}

\noindent describes a command line which requires an operation and, optionally, a range. An exception is that we use \texttt{in.pdf} and \texttt{out.pdf} instead of \texttt{<input file>} and \texttt{<output file>} to reduce verbosity. Under Microsoft Windows, type \texttt{cpdf.exe} instead of \texttt{cpdf}.
\cleardoublepage
\mainmatter
%\chapterstyle{hangnum}
%\pagestyle{ruled}
\pagestyle{fancy}

\chapter{Basic Usage}
\label{basicusage}

\begin{framed}
\small
\noindent\begin{verbatim}
 -o  -idir  -recrypt  -stdout  -stdin  -stdin-user  -stdin-owner
 -producer  -creator  -change-id  -l  -cpdflin  -keep-l
 -no-preserve-objstm  -create-objstm  -control  -args
 -utf8  -stripped  -raw  -no-embed-font\end{verbatim}\end{framed}

The Coherent PDF tools provide a wide range of facilities for modifying PDF files created by other means. There is a single command-line program \cpdf\ (\texttt{cpdf.exe} under Microsoft Windows). The rest of this manual describes the options that may be given to this program.

\index{input files}
\index{output files}
\section{Input and Output Files}

The typical pattern for usage is

\begin{framed}
\small\verb!cpdf [<operation>] <input file(s)> -o <output file>!
\end{framed}

\noindent and the simplest concrete example, assuming the existence of a file \texttt{in.pdf}, is:

\begin{framed}
\small\verb!cpdf in.pdf -o out.pdf!
\end{framed}

\noindent which copies \texttt{in.pdf} to \texttt{out.pdf}. The input and output may be the same file. Of course, we should like to do more interesting things to the PDF file than that!

Files on the command line are distinguished from other input by their containing a period. If an input file does not contain a period, it should be preceded by \verb!-i!. For example:

\begin{framed}
\small\verb!cpdf -i in -o out.pdf!
\end{framed}

\noindent A whole directory of files may be added (where a command supports multiple files) by using the \verb!-idir! option:

\begin{framed}
\small\verb!cpdf -merge -idir myfiles -o out.pdf!
\end{framed}

\noindent The files in the directory \verb!myfiles! are considered in alphabetical order. They must all be PDF files. If the names of the files are numeric, leading zeroes will be required for the order to be correct (e.g. \verb!001.pdf!, \verb!002.pdf! etc).
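
As an illustration of why the padding matters (the file names here are hypothetical, and plain character-by-character comparison of the names is assumed), three unpadded names would sort in the first order below, with \verb!10.pdf! ahead of \verb!2.pdf!, whereas the zero-padded names sort in the intended numeric order:

\begin{framed}
\small\begin{verbatim}
1.pdf    10.pdf   2.pdf
001.pdf  002.pdf  010.pdf\end{verbatim}
\end{framed}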
\section{Input Ranges}

An \index{input range} \index{range} \textit{input range} may be specified after each input file. This is treated differently by each operation. For instance

\begin{framed}
\small\verb!cpdf in.pdf 2-5 -o out.pdf!
\end{framed}

\noindent extracts pages two, three, four and five from \texttt{in.pdf}, writing the result to \texttt{out.pdf}, assuming that \texttt{in.pdf} contains at least five pages. \index{page!range} \index{reversing} Here are the rules for building input ranges:

\begin{itemize}
\item A dash (\texttt{-}) defines ranges, e.g. \texttt{1-5} or \texttt{6-3}.
\item A comma (\texttt{,}) allows one to specify several ranges, e.g. \texttt{1-2,4-5}.
\item The word \texttt{end} represents the last page number.
\item The words \texttt{odd} and \texttt{even} can be used in place of or at the end of a page range to restrict to just the odd or even pages.
\item The words \texttt{portrait} and \texttt{landscape} can be used in place of or at the end of a page range to restrict to just those pages which are portrait or landscape. Note that the meaning of ``portrait'' and ``landscape'' does not take account of any viewing rotation in place (use \texttt{-upright} first, if required). A page with equal width and height is considered neither portrait nor landscape.
\item The word \texttt{reverse} is the same as \texttt{end-1}.
\item The word \texttt{all} is the same as \texttt{1-end}.
\item A range must contain no spaces.
\item A tilde (\texttt{\~{}}) defines a page number counting from the end of the document rather than the beginning. Page \texttt{\~{}1} is the last page, \texttt{\~{}2} the penultimate page etc.
\end{itemize}

\noindent For example:

\begin{framed}
\small\verb!cpdf in.pdf 1,2,7-end -o out.pdf!
\vspace{2.5mm}

\noindent Remove pages three, four, five and six from a document.
\vspace{2.5mm}

\verb!cpdf in.pdf 1-16odd -o out.pdf!
\vspace{2.5mm}

\noindent Extract the odd pages 1,3,...,13,15.
\vspace{2.5mm}

\verb!cpdf in.pdf landscape -rotate 90 -o out.pdf!
\vspace{2.5mm}

\noindent Rotate all landscape pages by ninety degrees.
\vspace{2.5mm}

\verb!cpdf in.pdf 1,all -o out.pdf!
\vspace{2.5mm}

\noindent Duplicate the front page of a document, perhaps as a fax cover sheet.
\vspace{2.5mm}

\verb!cpdf in.pdf ~3-~1 -o out.pdf!
\vspace{2.5mm}

\noindent Extract the last three pages of a document, in order.
\end{framed}

\index{decryption}
\section{Working with Encrypted Documents}
\index{owner password}
\index{user password}
\index{password}

In order to perform many operations, encrypted input PDF files must be decrypted. Some require the owner password, some either the user or owner passwords. Either password is supplied by writing \texttt{user=<password>} or \texttt{owner=<password>} following each input file requiring it (before or after any range). The document will \textit{not} be re-encrypted upon writing. For example:

\begin{framed}
\noindent\small\verb!cpdf in.pdf user=charles -info!\\
\noindent\small\verb!cpdf in.pdf owner=fred reverse -o out.pdf!
\end{framed}

\noindent To re-encrypt the file with its existing encryption upon writing, which is required if only the user password was supplied, but allowed in any case, add the \texttt{-recrypt} option:

\begin{framed}
\small\verb!cpdf in.pdf user=fred reverse -recrypt -o out.pdf!
\end{framed}

\noindent The password required (owner or user) depends upon the operation being performed. Separate facilities are provided to decrypt and encrypt files (See Section \ref{crypt}).

\section{Standard Input and Standard Output}
\index{standard input}
\index{standard output}

Thus far, we have assumed that the input PDF will be read from a file on disk, and the output written similarly. Often it's useful to be able to read input from \texttt{stdin} (Standard Input) or write output to \texttt{stdout} (Standard Output) instead.
The typical use is to join several programs together into a \textit{pipe}, passing data from one to the next without the use of intermediate files. Use \texttt{-stdin} to read from standard input, and \texttt{-stdout} to write to standard output, either to pipe data between multiple programs, or multiple invocations of the same program. For example, this sequence of commands (all typed on one line)

\begin{framed}
\small\begin{verbatim}
cpdf in.pdf reverse -stdout | cpdf -stdin 1-5 -stdout |
cpdf -stdin reverse -o out.pdf\end{verbatim}
\end{framed}

\noindent extracts the last five pages of \texttt{in.pdf} in the correct order, writing them to \texttt{out.pdf}. It does this by reversing the input, taking the first five pages and then reversing the result.

To supply passwords for a file from \texttt{-stdin}, use \texttt{-stdin-owner <password>} and/or \texttt{-stdin-user <password>}.

Using \texttt{-stdout} on the final command in the pipeline to output the PDF to screen is not recommended, since PDF files often contain compressed sections which are not screen-readable.

Several \cpdf\ operations write to standard output by default (for example, listing fonts). A useful feature of the command line (not specific to \cpdf) is the ability to redirect this output to a file. This is achieved with the \texttt{>} operator:

\begin{framed}
\small\verb!cpdf -info in.pdf > file.txt!
\vspace{2.5mm}

\noindent Use the \texttt{-info} operation (See Section \ref{info}), redirecting the output to \texttt{file.txt}.
\end{framed}

\section{Doing Several Things at Once with AND}

The keyword \texttt{AND} can be used to string together several commands in one. The advantage compared with using pipes is that the file need not be repeatedly parsed and written out, saving time.

To use \texttt{AND}, simply leave off the output specifier (e.g. \texttt{-o}) of one command, and the input specifier (e.g. filename) of the next. For instance:

\begin{framed}
\small\verb!cpdf -merge in.pdf in2.pdf AND -add-text "Label"!
\noindent\small\verb! AND -merge in3.pdf -o out.pdf! \vspace{2.5mm} \noindent Merge \texttt{in.pdf} and \texttt{in2.pdf} together, add text to both pages, append \texttt{in3.pdf} and write to \texttt{out.pdf}. \end{framed} \noindent To specify the range for each section, use \texttt{-range}: \begin{framed} \small\verb!cpdf -merge in.pdf in2.pdf AND -range 2-4 -add-text "Label"! \noindent\small\verb! AND -merge in3.pdf -o out.pdf! \end{framed} \section{Units} \index{units} When measurements are given to \cpdf, they are in points (1 point = 1/72 inch). They may optionally be followed by some letters to change the measurement. The following are supported: \begin{table}[h] \centering \begin{tabular}{rl} \texttt{pt} & Points (72 points per inch). The default. \\ \texttt{cm} & Centimeters \\ \texttt{mm} & Millimeters \\ \texttt{in} & Inches \\ \end{tabular} \end{table} \noindent For example, one may write \texttt{14mm} or \texttt{21.6in}. In addition, the following letters stand, in some operations (\texttt{-scale-page}, \texttt{-scale-to-fit}, \texttt{-scale-contents}, \texttt{-shift}, \texttt{-mediabox},\\ \texttt{-crop}) for various page dimensions: \begin{table}[h] \centering \begin{tabular}{rl} \texttt{PW} & Page width\\ \texttt{PH} & Page height\\ \texttt{PMINX} & Page minimum x coordinate\\ \texttt{PMINY} & Page minimum y coordinate\\ \texttt{PMAXX} & Page maximum x coordinate\\ \texttt{PMAXY} & Page maximum y coordinate\\ \texttt{CW} & Crop box width\\ \texttt{CH} & Crop box height\\ \texttt{CMINX} & Crop box minimum x coordinate\\ \texttt{CMINY} & Crop box minimum y coordinate\\ \texttt{CMAXX} & Crop box maximum x coordinate\\ \texttt{CMAXY} & Crop box maximum y coordinate \end{tabular} \end{table} \noindent For example, we may write \texttt{PMINX PMINY} to stand for the coordinate of the lower left corner of the page. 
Simple arithmetic may be performed using the words \texttt{add}, \texttt{sub}, \texttt{mul} and \texttt{div} to stand for addition, subtraction, multiplication and division. For example, one may write \texttt{14in\hspace{-1mm} sub\hspace{-1mm} 30pt} or \texttt{PMINX\hspace{-1mm} mul\hspace{-1mm} 2}.

\section{Setting the Producer and Creator}

The \texttt{-producer} and \texttt{-creator} options may be added to any \texttt{cpdf} command line to set the producer and/or creator of the PDF file. If the file was converted from another format, the \textit{creator} is the program producing the original, the \textit{producer} the program converting it to PDF.

\begin{framed}
\small\verb!cpdf -merge in.pdf in2.pdf -producer MyMerger -o out.pdf!\\
\vspace{2.5mm}

\noindent Merge \texttt{in.pdf} and \texttt{in2.pdf}, setting the producer to \texttt{MyMerger} and writing the output to \texttt{out.pdf}.\end{framed}

\section{PDF Version Numbers}
\index{version number}

When an operation uses a part of the PDF standard which was introduced in a later version than that of the input file, the PDF version in the output file is set to the later version (most PDF viewers will try to load any PDF file, even if it is marked with a later version number). However, this automatic version changing may be suppressed with the \texttt{-keep-version} flag. Here is a list of Acrobat versions together with the maximum PDF version they are intended to support:

\vspace{2mm}
\begin{tabular}{rl}
PDF 1.2 & Acrobat 3.0 \\
PDF 1.3 & Acrobat 4.0 \\
PDF 1.4 & Acrobat 5.0 \\
PDF 1.5 & Acrobat 6.0 \\
PDF 1.6 & Acrobat 7.0 \\
PDF 1.7 & Acrobat 8.0, 9.0, 10.0
\end{tabular}
\vspace{2mm}

\noindent If you wish to manually alter the PDF version of a file, use the \texttt{-set-version} option described in Section \ref{setversion}.

\section{File IDs}

PDF files contain an ID (consisting of two parts), used by some workflow systems to uniquely identify a file.
To change the ID, use the \texttt{-change-id} operation. This will create a new ID for the output file.

\begin{framed}
\small\verb!cpdf -change-id in.pdf -o out.pdf!
\vspace{2.5mm}

\noindent Write \texttt{in.pdf} to \texttt{out.pdf}, changing the ID.
\end{framed}

\section{Linearization}
\index{linearization}

Linearized PDF is a version of the PDF format in which the data is held in a special manner to allow content to be fetched only when needed. This means viewing a multipage PDF over a slow connection is more responsive. By default, \cpdf\ does not linearize output files. To make it do so, add the \texttt{-l} option to the command line, in addition to any other command being used. For example:

\begin{framed}
\small\verb!cpdf -l in.pdf -o out.pdf!
\vspace{2.5mm}

\noindent Linearize the file \texttt{in.pdf}, writing to \texttt{out.pdf}.
\end{framed}

\noindent This requires the existence of the external program \texttt{cpdflin} which is provided with commercial versions of \texttt{cpdf}. This must be installed as described in the installation documentation provided with your copy of \texttt{cpdf}. If you are unable to install \texttt{cpdflin}, you must use \texttt{-cpdflin} to let \texttt{cpdf} know where to find it:

\begin{framed}
\small\verb!cpdf.exe -cpdflin "C:\\cpdflin.exe" -l in.pdf -o out.pdf!
\vspace{2.5mm}

\noindent Linearize the file \texttt{in.pdf}, writing to \texttt{out.pdf}.
\end{framed}

In extremis, you may place \texttt{cpdflin} and its resources in the current working directory, though this is not recommended. For further help, refer to the installation instructions for your copy of \texttt{cpdf}.

To keep the existing linearization status of a file (produce linearized output if the input is linearized, and non-linearized output otherwise), use \texttt{-keep-l} instead of \texttt{-l}.

\section{Object Streams}

PDF 1.5 introduced a new mechanism for storing objects to save space: object streams.
By default, \texttt{cpdf} will preserve object streams in input files, creating no more. To prevent the retention of existing object streams, use \texttt{-no-preserve-objstm}:

\begin{framed}
\small\verb!cpdf -no-preserve-objstm in.pdf -o out.pdf!
\vspace{2.5mm}

\noindent Write the file \texttt{in.pdf} to \texttt{out.pdf}, removing any object streams.
\end{framed}

\noindent To create new object streams if none exist, or augment the existing ones, use \texttt{-create-objstm}:

\begin{framed}
\small\verb!cpdf -create-objstm in.pdf -o out.pdf!
\vspace{2.5mm}

\noindent Write the file \texttt{in.pdf} to \texttt{out.pdf}, preserving any existing object streams, and creating any new ones for new objects which have been added.
\end{framed}

\noindent To create wholly new object streams, use both options together:

\begin{framed}
\small\verb!cpdf -create-objstm -no-preserve-objstm in.pdf -o out.pdf!
\vspace{2.5mm}

\noindent Write the file \texttt{in.pdf} to \texttt{out.pdf} with wholly new object streams.
\end{framed}

\noindent Files written with object streams will be set to PDF 1.5 or higher, unless \texttt{-keep-version} is used (see above).

\section{Malformed Files}

There are many malformed PDF files in existence, including many produced by otherwise-reputable applications. \cpdf\ attempts to correct these problems silently.

Grossly malformed files will be reconstructed. The reconstruction progress is shown on \verb!stderr! (Standard Error):

\begin{framed}
\noindent\small\verb!./cpdf in.pdf -o out.pdf!\\
\small\verb!couldn't lex object number!\\
\small\verb!Attempting to reconstruct the malformed pdf in.pdf...!\\
\small\verb!Read 5530 objects!\\
\small\verb$Malformed PDF reconstruction succeeded!$
\end{framed}

\noindent Sometimes files can be technically well-formed but use inefficient PDF constructs.
If you are sure the input files you are using are impeccably formed, the \texttt{-fast} option may be added to the command line (or, if using \texttt{AND}, to each section of the command line). This will use certain shortcuts which speed up processing, but would fail on badly-produced files. The \verb!-fast! option may be used with:

\begin{framed}
\small\noindent Chapter \ref{pages}\\
\noindent\small\verb!-rotate-contents -upright -vflip -hflip!\\
\small\verb!-shift -scale -scale-to-fit -scale-contents!\\
\noindent Chapter \ref{stamps}\\
\noindent\small\verb!-add-text!\\
\small\verb!-stamp-on -stamp-under -combine-pages!
\end{framed}

\noindent If problems occur, refrain from using \verb!-fast!.

\section{Error Handling}
\index{error handling}

When \cpdf\ encounters an error, it exits with code 2. An error message is displayed on \texttt{stderr} (Standard Error). In normal usage, this means it's displayed on the screen. When a bad or inappropriate password is given, the exit code is 1.

\section{Control Files}
\index{control file}

\begin{framed}
\noindent\small\verb!cpdf -control <filename>!\\
\noindent\small\verb!cpdf -args <filename>!
\end{framed}

Some operating systems have a limit on the length of a command line. To circumvent this, or simply for reasons of flexibility, a control file may be specified from which arguments are drawn. This file does not support the full syntax of the command line. Commands are separated by whitespace, quotation marks may be used if an argument contains a space, and the sequence \verb!\"! may be used to introduce a genuine quotation mark in such an argument.

Several \verb!-control! arguments may be specified, and may be mixed in with conventional command-line arguments. The commands in each control file are considered in the order in which they are given, after all conventional arguments have been processed.

It is recommended to use \texttt{-args} in all new applications. However, \texttt{-control} will be supported for legacy applications.
To avoid interference between \texttt{-control} and \texttt{AND}, a new mechanism has been added. Using \texttt{-args} in place of \texttt{-control} will perform direct textual substitution of the file into the command line, prior to any other processing. \section{String Arguments} Command lines are handled differently on each operating system. Some characters are reserved with special meanings, even when they occur inside quoted string arguments. To avoid this problem, \cpdf\ performs processing on string arguments as they are read. A backslash is used to indicate that a character which would otherwise be treated specially by the command line interpreter is to be treated literally. For example, Unix-like systems attribute a special meaning to the exclamation mark, so the command line \begin{framed} \small\verb?cpdf -add-text "Hello!" in.pdf -o out.pdf? \end{framed} \noindent would fail. We must escape the exclamation mark with a backslash: \begin{framed} \small\verb?cpdf -add-text "Hello\!" in.pdf -o out.pdf? \end{framed} \noindent It follows that backslashes intended to be taken literally must themselves be escaped (i.e. written \verb!\\!). \section{Text Encodings} \index{text encodings} Some \texttt{cpdf} commands write text to standard output, or read text from the command line or configuration files. These are: \begin{framed} \noindent\small\verb!-info!\\ \noindent\small\verb!-list-bookmarks!\\ \noindent\small\verb!-set-author! et al.\\ \noindent\small\verb!-list-annotations! \end{framed} \noindent There are three options to control how the text is interpreted: \begin{framed} \noindent\small\verb!-utf8!\\ \noindent\small\verb!-stripped!\\ \noindent\small\verb!-raw! \end{framed} \noindent Add \verb!-utf8! to use Unicode UTF8, \verb!-stripped! to convert to 7 bit ASCII by dropping any high characters, or \verb!-raw! to perform no processing. The default is \verb!-stripped!. 
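
As a sketch combining options already described in this chapter (the output file name \texttt{bookmarks.txt} is only an example), the bookmarks of a file might be listed as UTF8 and redirected to a text file:

\begin{framed}
\small\verb!cpdf -list-bookmarks -utf8 in.pdf > bookmarks.txt!
\vspace{2.5mm}

\noindent List the bookmarks of \texttt{in.pdf} using Unicode UTF8, redirecting the output to \texttt{bookmarks.txt}.
\end{framed}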
\section{Font Embedding}

Use the \texttt{-no-embed-font} option to avoid embedding the Standard 14 Font metrics when adding text with \texttt{-add-text}.

\chapter{Merging and Splitting}

\begin{framed}
\small
\noindent\begin{verbatim}
cpdf -merge in1.pdf [<range>] in2.pdf [<range>] [<more names/ranges>]
     [-retain-numbering] [-remove-duplicate-fonts] -o out.pdf\end{verbatim}
\vspace{1.5mm}

\noindent\verb!cpdf -split in.pdf -o <format> [-chunk <chunksize>]!
\vspace{1.5mm}

\noindent\verb!cpdf -split-bookmarks <level> in.pdf -o <format>!
\end{framed}

\vspace{12mm}

\section{Merging}
\index{merging}

The \texttt{-merge} operation allows the merging of several files into one. Ranges can be used to select only a subset of pages from each input file in the output. The output file consists of the concatenation of all the input pages in the order specified on the command line. In fact, \texttt{-merge} can be omitted, since merging is the default operation of \cpdf.

\begin{framed}\small
\verb!cpdf -merge a.pdf 1 b.pdf 2-end -o out.pdf!
\vspace{2.5mm}

\noindent Take page one of \texttt{a.pdf} and all but the first page of \texttt{b.pdf}, merge them and produce \texttt{out.pdf}.
\end{framed}

\noindent Merge maintains bookmarks, named destinations, and name dictionaries. Forms and other objects which cannot be merged are retained if they are from the document which first exhibits that feature.

The \texttt{-retain-numbering} option keeps the PDF page numbering labels of each document intact, rather than renumbering the output pages from 1.

The \texttt{-remove-duplicate-fonts} option ensures that fonts used in more than one of the inputs only appear once in the output.

\section{Splitting}
\index{splitting}

The \texttt{-split} operation splits a PDF file into a number of parts which are written to file, their names being generated from a \emph{format}. The optional \texttt{-chunk} option allows the number of pages written to each output file to be set.

\begin{framed}\small
\verb!cpdf -split a.pdf -o out%%%.pdf!
\vspace{2.5mm}

\noindent Split \texttt{a.pdf} to the files \texttt{out001.pdf}, \texttt{out002.pdf} etc.
\vspace{2.5mm}

\verb!cpdf -split a.pdf even -chunk 10 -o dir/out%%%.pdf!
\vspace{2.5mm}

\noindent Split the even pages of \texttt{a.pdf} to the files \texttt{out001.pdf}, \texttt{out002.pdf} etc. with at most ten pages in each file. The directory (folder) \texttt{dir} must exist.
\end{framed}

\noindent If the output format does not provide enough numbers for the files generated, the result is unspecified. The following format operators may be used:

\begin{table}[h]
\centering
\begin{tabular}{rl}
\verb!%, %%, %%% etc.! & Sequence number padded to the number of percent signs\\
\texttt{@F} & Original filename without extension \\
\texttt{@N} & Sequence number without padding zeroes \\
\texttt{@S} & Start page of this chunk \\
\texttt{@E} & End page of this chunk \\
\texttt{@B} & Bookmark name at this page \\
\end{tabular}
\end{table}

\section{Splitting on Bookmarks}
\index{splitting!on bookmarks}

The \texttt{-split-bookmarks <level>} operation splits a PDF file into a number of parts, according to the page ranges implied by the document's bookmarks. These parts are then written to file with names generated from the given format.

Level 0 denotes the top-level bookmarks, level 1 the next level (sub-bookmarks) and so on. So \texttt{-split-bookmarks 1} creates breaks on level 0 and level 1 boundaries.

\begin{framed}\small
\verb!cpdf -split-bookmarks 0 a.pdf -o out%%%.pdf!
\vspace{2.5mm}

\noindent Split \texttt{a.pdf} to the files \texttt{out001.pdf}, \texttt{out002.pdf} on bookmark boundaries.
\end{framed}

\noindent Now, there may be many bookmarks on a single page (for instance, if paragraphs are bookmarked or there are two subsections on one page). The splits calculated by \texttt{-split-bookmarks} ensure that each page appears in only one of the output files.
It is possible to use the \texttt{@} operators above, including operator \texttt{@B} which expands to the text of the bookmark:

\begin{framed}\small
\verb!cpdf -split-bookmarks 0 a.pdf -o @B.pdf!
\vspace{2.5mm}

\noindent Split \texttt{a.pdf} on bookmark boundaries, using the bookmark text as the filename.
\end{framed}

\noindent The bookmark text used for a name is converted from Unicode to 7 bit ASCII, and the following characters are removed, in addition to any character with ASCII code less than 32:

\begin{framed}
\centering
\verb! / ? < > \ : * | " ^ + =!
\end{framed}

\section{Encrypting with Split and Split Bookmarks}

The encryption parameters described in Chapter \ref{encryption} may be added to the command line to encrypt each split PDF. Similarly, the \texttt{-recrypt} switch described in Chapter \ref{basicusage} may be given to re-encrypt each file with the existing encryption of the source PDF.

\pagestyle{empty}\thispagestyle{fancy}
\chapter{Pages}
\pagestyle{fancy}
\label{pages}

\begin{framed}
\small\noindent\verb!cpdf -scale-page "<scale x> <scale y>" in.pdf [<range>] -o out.pdf!
\vspace{1.5mm}

\small\noindent\verb!cpdf -scale-to-fit "<x size> <y size>" [-scale-to-fit-scale <scale>]!\\
\noindent\verb!     in.pdf [<range>] -o out.pdf!
%\vspace{1.5mm}
%\small\noindent\verb!cpdf -scale-to-fit-best "<x size> <y size>" in.pdf [<range>] -o out.pdf!
%
%\vspace{1.5mm}
%\small\noindent\verb!cpdf -scale-to-fit-minus "<x size> <y size>" in.pdf [<range>] -o out.pdf!
\vspace{1.5mm}

\small\noindent\verb!cpdf -scale-contents [<scale>] [<position>] in.pdf [<range>] -o out.pdf!
\vspace{1.5mm}

\small\noindent\verb!cpdf -shift "<shift x> <shift y>" in.pdf [<range>] -o out.pdf!
\vspace{1.5mm}

\small\noindent\verb!cpdf -rotate <angle> in.pdf [<range>] -o out.pdf!
\vspace{1.5mm}

\small\noindent\verb!cpdf -rotateby <angle> in.pdf [<range>] -o out.pdf!
\vspace{1.5mm}

\small\noindent\verb!cpdf -rotate-contents <angle> in.pdf [<range>] -o out.pdf!
\vspace{1.5mm}

\small\noindent\verb!cpdf -upright in.pdf [<range>] -o out.pdf!
\vspace{1.5mm}

\small\noindent\verb!cpdf -hflip in.pdf [<range>] -o out.pdf!
\vspace{1.5mm}

\small\noindent\verb!cpdf -vflip in.pdf [<range>] -o out.pdf!
\vspace{1.5mm}

\small\noindent\verb!cpdf -mediabox "<x> <y> <w> <h>" in.pdf [<range>] -o out.pdf!
\vspace{1.5mm}

\small\noindent\verb!cpdf -crop "<x> <y> <w> <h>" in.pdf [<range>] -o out.pdf!
\vspace{1.5mm}

\small\noindent\verb!cpdf -remove-crop in.pdf [<range>] -o out.pdf!
%\vspace{1.5mm}
%\small\noindent\verb!cpdf -copy-cropbox-to-mediabox in.pdf [<range>] -o out.pdf!
\vspace{1.5mm}

\small\noindent\verb!cpdf -frombox <boxname> -tobox <boxname> [-mediabox-if-missing]! \\
\noindent\verb!     in.pdf [<range>] -o out.pdf!
\end{framed}

\section{Page Sizes}
\index{page size}

Whenever a page size is required, instead of writing, for instance, \texttt{"210mm 297mm"} one can write \texttt{a4portrait}. Here is a list of supported page sizes:

{\small
\smallgap
\begin{tabular}{lll}
\texttt{a0portrait} & \texttt{a1portrait} & \texttt{a2portrait} \\
\texttt{a3portrait} & \texttt{a4portrait} & \texttt{a5portrait} \\
\texttt{a6portrait} & \texttt{a7portrait} & \texttt{a8portrait} \\
\texttt{a9portrait} & \texttt{a10portrait} & \\
\\
\texttt{a0landscape} & \texttt{a1landscape} & \texttt{a2landscape} \\
\texttt{a3landscape} & \texttt{a4landscape} & \texttt{a5landscape} \\
\texttt{a6landscape} & \texttt{a7landscape} & \texttt{a8landscape} \\
\texttt{a9landscape} & \texttt{a10landscape} & \\
\\
\texttt{usletterportrait} & \texttt{usletterlandscape} & \\
\texttt{uslegalportrait} & \texttt{uslegallandscape} &
\end{tabular}
}

\section{Scale Pages}
\index{scale pages}

The \texttt{-scale-page} operation scales each page in the range by the X and Y factors given. This scales both the page contents, and the page size itself. It also scales any Crop Box and other boxes (Art Box, Trim Box etc). As with several of these commands, remember to take into account any page rotation when considering what the X and Y axes relate to.

\begin{framed}
\small\noindent\verb!cpdf -scale-page "2 2" in.pdf -o out.pdf!
\vspace{2.5mm}

\noindent Convert an A4 page to A2, for instance.
\end{framed}

\noindent The \texttt{-scale-to-fit} operation scales each page in the range to fit a given page size, preserving aspect ratio and centering the result.

\begin{framed}
\small\noindent\verb!cpdf -scale-to-fit "210mm 297mm" in.pdf -o out.pdf!

\small\noindent\verb!cpdf -scale-to-fit a4portrait in.pdf -o out.pdf!
\vspace{2.5mm}

\noindent Scale a file's pages to fit A4 portrait.
\end{framed}

%The \texttt{-scale-to-fit-best} and \texttt{-scale-to-fit-minus} are similar, but will rotate a page by $90^\circ$ or $-90^\circ$ respectively on any page where doing so would maximise the scale.

\noindent The scale can optionally be set to a percentage of the available area, instead of filling it.

\begin{framed}
\small\noindent\verb!cpdf -scale-to-fit a4portrait -scale-to-fit-scale 0.9 in.pdf -o out.pdf!
\vspace{2.5mm}

\noindent Scale a file's pages to fit A4 portrait, scaling the page to 90\% of its possible size.
\end{framed}

\noindent The \texttt{-scale-contents} operation scales the contents about the center of the crop box (or, if absent, the media box), leaving the page dimensions (boxes) unchanged.

\begin{framed}
\small\noindent\verb!cpdf -scale-contents 0.5 in.pdf -o out.pdf!
\vspace{2.5mm}

\noindent Scale a file's contents on all pages to 50\% of its original dimensions.
\end{framed}

\noindent To scale about a point other than the center, one can use the positioning commands described in Section \ref{position}. For example:

\begin{framed}
\small\noindent\verb!cpdf -scale-contents 0.5 -topright 20 in.pdf -o out.pdf!
\vspace{2.5mm}

\noindent Scale a file's contents on all pages to 50\% of its original dimensions about a point 20pts from its top right corner.
\end{framed}

\section{Shift Page Contents}
\index{shift page contents}

The \texttt{-shift} operation shifts the contents of each page in the range by X points horizontally and Y points vertically.

\begin{framed}
\small\noindent\verb!cpdf -shift "50 0" in.pdf even -o out.pdf!
\vspace{2.5mm}
\noindent Shift the contents of the even pages to the right by 50 points (for instance, to increase the binding margin).
\end{framed}

\section{Rotating Pages}
\index{rotate!pages}

There are two ways of rotating pages: (1)~setting a value in the PDF file which asks the viewer (e.g. Acrobat) to rotate the page on-the-fly when viewing it (use \texttt{-rotate} or \texttt{-rotateby}) and (2)~actually rotating the page contents and/or the page dimensions (use \texttt{-upright} afterwards or \texttt{-rotate-contents} to just rotate the page contents). The possible values for \texttt{-rotate} and \texttt{-rotateby} are 0, 90, 180 and 270, all interpreted as being clockwise. Any value may be used for \texttt{-rotate-contents}.

The \texttt{-rotate} operation sets the viewing rotation of the selected pages to the absolute value given.

\begin{framed}
\small\verb!cpdf -rotate 90 in.pdf -o out.pdf!

\vspace{2.5mm}
\noindent Set the rotation of all the pages in the input file to ninety degrees clockwise.
\end{framed}

\noindent The \texttt{-rotateby} operation changes the viewing rotation of all the given pages by the relative value given.

\begin{framed}
\small\verb!cpdf -rotateby 90 in.pdf -o out.pdf!

\vspace{2.5mm}
\noindent Rotate all the pages in the input file by ninety degrees clockwise.
\end{framed}

\noindent The \texttt{-rotate-contents} operation rotates the contents of the page by the given relative value, leaving the page dimensions unchanged. \index{rotate!contents}

\begin{framed}
\small\verb!cpdf -rotate-contents 90 in.pdf -o out.pdf!

\vspace{2.5mm}
\noindent Rotate all the page contents in the input file by ninety degrees clockwise. Does not change the page dimensions.
\end{framed}

\label{upright}
\noindent The \texttt{-upright} operation does whatever combination of \texttt{-rotate} and \texttt{-rotate-contents} is required to change the rotation of the document to zero without altering its appearance.
In addition, it makes sure the media box has its origin at (0,0), changing other boxes to compensate.

\section{Flipping Pages}
\index{flip pages}

The \texttt{-hflip} and \texttt{-vflip} operations flip the contents of the chosen pages horizontally or vertically. No account is taken of the current page rotation when considering what ``horizontally'' and ``vertically'' mean, so you may like to use \texttt{-upright} first.

\begin{framed}
\small\verb!cpdf -hflip in.pdf even -o out.pdf!

\vspace{2.5mm}
\noindent Flip the even pages in \texttt{in.pdf} horizontally.

\vspace{2.5mm}
\verb!cpdf -vflip in.pdf -o out.pdf!

\vspace{2.5mm}
\noindent Flip all the pages in \texttt{in.pdf} vertically.
\end{framed}

\section{Boxes and Cropping}
\index{crop pages}
\index{media box}

All PDF files contain a \textit{media box} for each page, giving the dimensions of the paper. To change these dimensions (without altering the page contents in any way), use the \texttt{-mediabox} operation.

\begin{framed}
\small\verb!cpdf -mediabox "0pt 0pt 500pt 500pt" in.pdf -o out.pdf!

\vspace{2.5mm}
\noindent Set the media box to 500 points square.
\end{framed}

\noindent The four numbers are minimum x, minimum y, width and height. The x coordinates increase to the right, and the y coordinates increase upwards. PDF files can also optionally contain a \textit{crop box} for each page, defining to what extent the page is cropped before being displayed or printed. A crop box can be set, changed and removed without affecting the underlying media box. To set or change the crop box, use \texttt{-crop}. To remove any existing crop box, use \texttt{-remove-crop}.

\begin{framed}
\small\verb!cpdf -crop "0pt 0pt 200mm 200mm" in.pdf -o out.pdf!

\vspace{2.5mm}
\noindent Crop pages to the bottom left 200-millimeter square of the page.

\vspace{2.5mm}
\verb!cpdf -remove-crop in.pdf -o out.pdf!

\vspace{2.5mm}
\noindent Remove cropping.
\end{framed}

\noindent Note that the crop box is only obeyed in some viewers.
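
\noindent Because the box arguments are given as minimum x, minimum y, width and height rather than as two corners, it can help to compute the argument from corner coordinates. The following is a minimal sketch in Python and is not part of \cpdf; the names \texttt{mm\_to\_pt} and \texttt{box\_from\_corners} are hypothetical helpers introduced here purely for illustration.

\begin{framed}{\small\begin{verbatim}
MM_PER_INCH = 25.4
PT_PER_INCH = 72.0


def mm_to_pt(mm):
    """Convert millimetres to PDF points (1 pt = 1/72 inch)."""
    return mm * PT_PER_INCH / MM_PER_INCH


def box_from_corners(x0, y0, x1, y1):
    """Build a 'minx miny width height' string, in points,
    from any two opposite corners of a rectangle."""
    min_x, min_y = min(x0, x1), min(y0, y1)
    width, height = abs(x1 - x0), abs(y1 - y0)
    return "{:g}pt {:g}pt {:g}pt {:g}pt".format(
        min_x, min_y, width, height)


# Corners (50, 50) and (550, 750), already in points.
print(box_from_corners(50, 50, 550, 750))  # 50pt 50pt 500pt 700pt\end{verbatim}}\end{framed}

\noindent The printed string has the same form as the quoted arguments given to \texttt{-mediabox} and \texttt{-crop} in the examples above.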
\begin{framed} \small\noindent\verb!cpdf -frombox -tobox [-mediabox-if-missing]! \\ \noindent\verb! in.pdf [] -o out.pdf! \vspace{2.5mm} \noindent Copy the contents of one box to another. \end{framed} \noindent This operation copies the contents of one box (Media box, Crop box, Trim box etc.) to another. If \texttt{-mediabox-if-missing} is added, the media box will be substituted when the 'from' box is not set for a given page. For example \begin{framed} \small\verb!cpdf -frombox /TrimBox -tobox /CropBox in.pdf -o out.pdf! \end{framed} \noindent copies the Trim Box of each page to the Crop Box of each page. The possible boxes are \texttt{/MediaBox}, \texttt{/CropBox}, \texttt{/BleedBox}, \texttt{/TrimBox}, \texttt{/ArtBox}.\pagestyle{empty}\thispagestyle{fancy} \chapter{Encryption and Decryption} \pagestyle{fancy} \label{encryption} \index{encryption} \index{decryption} \begin{framed} \small\noindent\verb!cpdf -encrypt !\\ \noindent\verb! [-no-encrypt-metadata] in.pdf -o out.pdf! \vspace{1.5mm} \noindent\verb!cpdf -decrypt in.pdf owner= -o out.pdf! \end{framed} \label{crypt} \section{Introduction} PDF files can be encrypted using various types of encryption and attaching various permissions describing what someone can do with a particular document (for instance, printing it or extracting content). There are two types of person: \begin{description} \item The \textbf{User} can do to the document what is allowed in the permissions. \item The \textbf{Owner} can do anything, including altering the permissions or removing encryption entirely. 
\end{description}

There are five kinds of encryption:

\begin{itemize}
\item 40-bit encryption (method \texttt{40bit}) in Acrobat 3 (PDF 1.1) and above
\item 128-bit encryption (method \texttt{128bit}) in Acrobat 5 (PDF 1.4) and above
\item 128-bit AES encryption (method \texttt{AES}) in Acrobat 7 (PDF 1.6) and above
\item 256-bit AES encryption (method \texttt{AES256}) in Acrobat 9 (PDF 1.7) -- \textit{this is deprecated -- do not use for new documents}
\item 256-bit AES encryption (method \texttt{AES256ISO}) in PDF 2.0
\end{itemize}

\vspace{2mm}
\noindent All encryption methods support these kinds of permissions:

\vspace{2mm}
\begin{tabular}{ll}
\texttt{-no-edit} & Cannot change the document\\
\texttt{-no-print} & Cannot print the document\\
\texttt{-no-copy} & Cannot select or copy text or graphics\\
\texttt{-no-annot} & Cannot add or change form fields or annotations\\
\end{tabular}

\vspace{2mm}
\noindent In addition, 128-bit encryption (Acrobat 5 and above) and AES encryption support these:

\vspace{2mm}
\begin{tabular}{ll}
\texttt{-no-forms} & Cannot edit form fields\\
\texttt{-no-extract} & Cannot extract text or graphics\\
\texttt{-no-assemble} & Cannot merge files etc.\\
\texttt{-no-hq-print} & Cannot print high-quality\\
\end{tabular}

\vspace{2mm}
\noindent Add these flags to the command line to prevent each operation.

\vspace{2mm}
\section{Encrypting a Document}

To encrypt a document, the owner and user passwords must be given (here, \texttt{fred} and \texttt{charles} respectively):

\begin{framed}
\small\verb!cpdf -encrypt 40bit fred charles -no-print in.pdf -o out.pdf!

\vspace{1.5mm}
\small\verb!cpdf -encrypt 128bit fred charles -no-extract in.pdf -o out.pdf!

\vspace{1.5mm}
\small\verb!cpdf -encrypt AES fred "" -no-edit -no-copy in.pdf -o out.pdf!
\end{framed}

\noindent A blank user password is common. In this event, PDF viewers will typically not prompt for a password when opening the file or for operations allowable with the user password.
\begin{framed} \vspace{1.5mm} \small\verb!cpdf -encrypt AES256 fred "" -no-forms in.pdf -o out.pdf! \end{framed} \noindent In addition, the usual method can be used to give the existing owner password, if the document is already encrypted. When using AES encryption, the option is available to refrain from encrypting the metadata. Add \texttt{-no-encrypt-metadata} to the command line. \section{Decrypting a Document} To decrypt a document, the owner password is provided. \begin{framed} \small\verb!cpdf -decrypt in.pdf owner=fred -o out.pdf! \end{framed} \noindent The user password cannot decrypt a file. \chapter{Compression} \begin{framed} \small\noindent\verb!cpdf -decompress in.pdf -o out.pdf! \vspace{1.5mm} \noindent\verb!cpdf -compress in.pdf -o out.pdf! \vspace{1.5mm} \noindent\verb!cpdf -squeeze in.pdf [-squeeze-log-to ] -o out.pdf! \end{framed} \cpdf\ provides basic facilities for decompressing and compressing PDF streams. \section{Decompressing a Document} \index{decompressing} To decompress the streams in a PDF file, for instance to manually inspect the PDF, use: \begin{framed} \small\verb!cpdf -decompress in.pdf -o out.pdf! \end{framed} \noindent If \cpdf\ finds a compression type it can't cope with, the stream is left compressed. When using \texttt{-decompress}, object streams are not compressed. \section{Compressing a Document} \index{compressing} To compress the streams in a PDF file, use: \begin{framed} \small\verb!cpdf -compress in.pdf -o out.pdf! \end{framed} \noindent\cpdf\ compresses any streams which have no compression using the \textbf{Flate\-Decode} method, with the exception of Metadata streams, which are left uncompressed. \section{Squeezing a Document} \index{squeeze} To \textit{squeeze} a PDF file, reducing its size by an average of about twenty percent (though sometimes not at all), use: \begin{framed} \small\verb!cpdf -squeeze in.pdf -o out.pdf! 
\end{framed}

\noindent Adding \texttt{-squeeze} to the command line when using another operation will \textit{squeeze} the file or files upon output.

The \texttt{-squeeze} operation writes some information about the squeezing process to standard output. Squeezing consists of several steps, each of which attempts to reduce the file size losslessly. It is slow, so should not be used without thought.

\begin{verbatim}
$ ./cpdf -squeeze in.pdf -o out.pdf
Beginning squeeze: 123847 objects
Squeezing... Down to 114860 objects
Squeezing... Down to 114842 objects
Squeezing page data
Recompressing document
\end{verbatim}

The \texttt{-squeeze-log-to } option writes the log to the given file instead of to standard output.

\chapter{Bookmarks}

\begin{framed}
\small\noindent\verb!cpdf -list-bookmarks [-utf8 | -raw] in.pdf!

\vspace{1.5mm}
\small\noindent\verb!cpdf -remove-bookmarks in.pdf -o out.pdf!

\vspace{1.5mm}
\small\noindent\verb!cpdf -add-bookmarks in.pdf -o out.pdf!
\end{framed}

\index{bookmarks}
\index{document outline}
PDF Bookmarks (properly called the \textit{document outline}) represent a tree of references to parts of the file, typically displayed at the side of the screen. The user can click on one to move to the specified place. \cpdf\ provides facilities to list, add, and remove bookmarks. The format used by the list and add operations is the same, so you can feed the output of one into the other, for instance to copy bookmarks.

\section{List Bookmarks}
\index{bookmarks!listing}

The \texttt{-list-bookmarks} operation prints (to standard output) the bookmarks in a file. The first column gives the level of the tree at which a particular bookmark is. Then comes the text of the bookmark in quotes, then the page number which the bookmark points to, then (optionally) the word ``open'' if the bookmark should have its children (at the level immediately below) visible when the file is loaded. For example, upon executing

\begin{framed}
\small\verb!cpdf -list-bookmarks doc.pdf!
\end{framed}

\noindent the result might be:

\begin{framed}{\small\begin{verbatim}
0 "Part 1" 1 open
1 "Part 1A" 2
1 "Part 1B" 3
0 "Part 2" 4
1 "Part 2a" 5\end{verbatim}}\end{framed}

\noindent If the page number is 0, it indicates that clicking on that entry doesn't move to a page.

By default, \cpdf\ converts Unicode to ASCII text, dropping characters outside the ASCII range. To prevent this, and return Unicode UTF8 output, add the \texttt{-utf8} option to the command. To prevent any processing, use the \texttt{-raw} option.

\section{Remove Bookmarks}
\label{removebookmarks}
\index{bookmarks!removing}

The \texttt{-remove-bookmarks} operation removes all bookmarks from the file.

\begin{framed}
\small\verb!cpdf -remove-bookmarks in.pdf -o out.pdf!
\end{framed}

\section{Add Bookmarks}
\index{bookmarks!adding}

The \texttt{-add-bookmarks} operation adds bookmarks as specified by a \textit{bookmarks file}, a text file in ASCII or UTF8 encoding and in the same format as that produced by the \texttt{-list-bookmarks} operation. If there are any bookmarks in the input PDF already, they are discarded. For example, if the file \texttt{bookmarks.txt} contains the output from \texttt{-list-bookmarks} above, then the command

\begin{framed}
\small\verb!cpdf -add-bookmarks bookmarks.txt in.pdf -o out.pdf!
\end{framed}

\noindent adds the bookmarks to the input file, writing to \texttt{out.pdf}. An error will be given if the bookmarks file is not in the correct form (in particular, the numbers in the first column which specify the level must form a proper tree with no entry being more than one greater than the last).
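
\noindent The tree rule can be checked before a bookmarks file is handed to \texttt{-add-bookmarks}. The following is a minimal sketch in Python and is not part of \cpdf. It assumes the simple four-column format shown above, and it further assumes that the first entry must be at level 0; \texttt{parse\_bookmarks} and \texttt{is\_proper\_tree} are hypothetical names introduced for illustration.

\begin{framed}{\small\begin{verbatim}
import re

# level, quoted title, page number, optional word "open"
LINE_RE = re.compile(r'^(\d+)\s+"(.*)"\s+(\d+)(?:\s+(open))?\s*$')


def parse_bookmarks(text):
    """Parse bookmark lines into (level, title, page, is_open)."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        m = LINE_RE.match(line)
        if m is None:
            raise ValueError("malformed line: " + repr(line))
        level, title, page, is_open = m.groups()
        entries.append(
            (int(level), title, int(page), is_open is not None))
    return entries


def is_proper_tree(entries):
    """True if no level is more than one greater than the last."""
    previous = -1  # assumption: the first entry must be level 0
    for level, _title, _page, _is_open in entries:
        if level > previous + 1:
            return False
        previous = level
    return True\end{verbatim}}\end{framed}

\noindent Applied to the listing above, \texttt{is\_proper\_tree} returns true. A file whose second entry jumps from level 0 straight to level 2 is rejected.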
\chapter{Presentations}

\begin{framed}
\small\noindent\begin{verbatim}
cpdf -presentation in.pdf [] -o out.pdf
     [-trans ] [-duration ]
     [-vertical] [-outward] [-direction ]
     [-effect-duration ]\end{verbatim}
\end{framed}

\index{presentations}
\vspace{12mm}
The PDF file format, starting at Version 1.1, provides for simple slide-show presentations in the manner of Microsoft PowerPoint. These can be played in Acrobat and possibly other PDF viewers, typically started by entering full-screen mode. The \texttt{-presentation} operation allows such a presentation to be built from any PDF file.

The \texttt{-trans} option chooses the transition style. When a page range is used, it is the transition \textit{from} each page named which is altered. The following transition styles are available:

\begin{description}
\item[Split]Two lines sweep across the screen, revealing the new page. By default the lines are horizontal. Vertical lines are selected by using the \texttt{-vertical} option.
\item[Blinds]Multiple lines sweep across the screen, revealing the new page. By default the lines are horizontal. Vertical lines are selected by using the \texttt{-vertical} option.
\item[Box]A rectangular box sweeps inward from the edges of the page. Use \texttt{-outward} to make it sweep from the center to the edges.
\item[Wipe]A single line sweeps across the screen from one edge to the other in a direction specified by the \texttt{-direction} option.
\item[Dissolve]The old page dissolves gradually to reveal the new one.
\item[Glitter]The same as \textbf{Dissolve} but the effect sweeps across the page in the direction specified by the \texttt{-direction} option.
\end{description}

\noindent To remove a transition style currently applied to the selected pages, omit the \texttt{-trans} option.

The \texttt{-effect-duration} option specifies the length of time in seconds for the transition itself. The default value is one second.
The \texttt{-duration} option specifies the maximum time in seconds that the page is displayed before the presentation automatically advances. The default, in the absence of the \texttt{-duration} option, is for no automatic advancement.

The \texttt{-direction} option (for \textbf{Wipe} and \textbf{Glitter} styles only) specifies the direction of the effect. The following values are valid:

\begin{itemize}
\item[\textbf{0}] Left to right
\item[\textbf{90}] Bottom to top (\textbf{Wipe} only)
\item[\textbf{180}] Right to left (\textbf{Wipe} only)
\item[\textbf{270}] Top to bottom
\item[\textbf{315}] Top-left to bottom-right (\textbf{Glitter} only)
\end{itemize}

\noindent For example:

\begin{framed}
\small
\noindent\verb!cpdf -presentation in.pdf 2-end -trans Split -vertical -duration 10 -o out.pdf!

\vspace{2.5mm}
The \textbf{Split} style, with vertical lines, and each slide staying ten seconds unless manually advanced. The first page (being a title) does not move on automatically, and has no transition effect.
\end{framed}

\noindent To use different options on different page ranges, run \cpdf\ multiple times on the file using a different page range each time.

\chapter{Watermarks and Stamps}
\label{stamps}
\index{watermarks}
\index{stamps}

\begin{framed}
\noindent\small\verb!cpdf -stamp-on source.pdf!\\
\noindent\small\verb! [-scale-stamp-to-fit] [] [-relative-to-cropbox] !\\
\noindent\small\verb! in.pdf [] -o out.pdf!

\vspace{1.5mm}
\noindent\small\verb!cpdf -stamp-under source.pdf!\\
\noindent\small\verb! [-scale-stamp-to-fit] [] [-relative-to-cropbox]!\\
\noindent\small\verb! in.pdf [] -o out.pdf!

\vspace{1.5mm}
\noindent\small\verb!cpdf -combine-pages over.pdf under.pdf -o out.pdf!
\vspace{1.5mm}
\noindent\small\begin{verbatim}cpdf ([-add-text | -add-rectangle ])
     [-font ] [-font-size ] [-color ] [-line-spacing ]
     [-outline] [-linewidth ] [-underneath] [-relative-to-cropbox]
     [-prerotate] [-bates ] [-bates-at-range ] [-bates-pad-to ]
     [-opacity ] [-midline] [-topline]
     in.pdf [] -o out.pdf\end{verbatim}

\noindent See also positioning commands below.

\vspace{1.5mm}
\noindent\small\verb!cpdf -remove-text in.pdf [] -o out.pdf!
\end{framed}

\section{Add a Watermark or Logo}

The \texttt{-stamp-on} and \texttt{-stamp-under} operations stamp the first page of a source PDF onto or under each page in the given range of the input file. For example,

\begin{framed}
\small\verb!cpdf -stamp-on logo.pdf in.pdf odd -o out.pdf!
\end{framed}

\noindent stamps the file \texttt{logo.pdf} onto the odd pages of \texttt{in.pdf}, writing to \texttt{out.pdf}. A watermark should go underneath each page:

\begin{framed}
\small\verb!cpdf -stamp-under topsecret.pdf in.pdf -o out.pdf!
\end{framed}

\noindent The position commands in Section \ref{position} can be used to locate the stamp more precisely (they are calculated relative to the crop box of the stamp). Or, preprocess the stamp with \texttt{-shift} first.

The \texttt{-scale-stamp-to-fit} option can be added to scale the stamp to fit the page before applying it. The use of positioning commands together with \texttt{-scale-stamp-to-fit} is not recommended.

The \texttt{-combine-pages} operation takes two PDF files and stamps each page of one over each page of the other. The length of the output is the same as the length of the ``under'' file. For instance:

\begin{framed}
\small\verb!cpdf -combine-pages over.pdf under.pdf -o out.pdf!
\end{framed}

\noindent Page attributes (such as the display rotation) are taken from the ``under'' file. For best results, remove any rotation differences in the two files using \texttt{-upright} first.

\noindent The \texttt{-relative-to-cropbox} option takes the positioning command to be relative to the crop box of each page rather than the media box.

\section{Stamp Text, Dates and Times}
\index{date}
\index{time}
\index{stamp text}

The \texttt{-add-text} operation allows text, dates and times to be stamped over one or more pages of the input at a given position and using a given font, font size and color.

\begin{framed}
\small\verb!cpdf -add-text "Copyright 2014 ACME Corp." in.pdf -o out.pdf!
\end{framed}

\noindent The default is black 12pt Times New Roman text in the top left of each page. The text can be placed underneath rather than over the page by adding the \texttt{-underneath} option.

Text previously added by \cpdf\ may be removed by the \texttt{-remove-text} operation.
\index{removing text}

\begin{framed}
\small\verb!cpdf -remove-text in.pdf -o out.pdf!
\end{framed}

\subsection{Page Numbers}
\index{page!numbers}

There are various special codes to include the page number in the text:

\vspace{2mm}
\begin{tabular}{ll}
\texttt{\%Page} & Page number in arabic notation (1, 2, 3\ldots) \\
\texttt{\%roman} & Page number in lower-case roman notation (i, ii, iii\ldots) \\
\texttt{\%Roman} & Page number in upper-case roman notation (I, II, III\ldots) \\
\texttt{\%EndPage} & Last page of document in arabic notation \\
\texttt{\%Label} & The page label of the page \\
\texttt{\%EndLabel} & The page label of the last page \\
\texttt{\%filename} & The full file name of the input document \\
\end{tabular}

\vspace{2mm}
\noindent For example, the format \texttt{"Page~\%Page~of~\%EndPage"} might become ``Page~5~of~17''.

NB: In some circumstances (e.g. in batch files) on Microsoft Windows, \verb!%! is a special character, and must be escaped (written as \verb$%%$). Consult your local documentation for details.
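
\noindent To make the substitution concrete, the following minimal sketch in Python mimics how four of the codes above expand for a given page. It is an illustration only, not \cpdf's implementation; the function names \texttt{to\_roman} and \texttt{expand} are hypothetical, and the label and file name codes are deliberately left out.

\begin{framed}{\small\begin{verbatim}
def to_roman(n):
    """Upper-case roman numeral for a positive integer."""
    pairs = [(1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
             (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
             (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I")]
    out = []
    for value, symbol in pairs:
        while n >= value:
            out.append(symbol)
            n -= value
    return "".join(out)


def expand(fmt, page, end_page):
    """Expand %Page, %EndPage, %roman and %Roman in fmt."""
    replacements = [
        ("%EndPage", str(end_page)),
        ("%Page", str(page)),
        ("%roman", to_roman(page).lower()),
        ("%Roman", to_roman(page)),
    ]
    for code, value in replacements:
        fmt = fmt.replace(code, value)
    return fmt


print(expand("Page %Page of %EndPage", 5, 17))  # Page 5 of 17\end{verbatim}}\end{framed}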
\subsection{Date and Time Formats}

\begin{tabular}{ll}
\texttt{\%a} & Abbreviated weekday name (Sun, Mon etc.)\\
\texttt{\%A} & Full weekday name (Sunday, Monday etc.)\\
\texttt{\%b} & Abbreviated month name (Jan, Feb etc.)\\
\texttt{\%B} & Full month name (January, February etc.)\\
\texttt{\%d} & Day of the month (01--31) \\
\texttt{\%e} & Day of the month (1--31) \\
\texttt{\%H} & Hour in 24-hour clock (00--23)\\
\texttt{\%I} & Hour in 12-hour clock (01--12)\\
\texttt{\%j} & Day of the year (001--366)\\
\texttt{\%m} & Month of the year (01--12)\\
\texttt{\%M} & Minute of the hour (00--59)\\
\texttt{\%p} & ``a.m'' or ``p.m''\\
\texttt{\%S} & Second of the minute (00--61)\\
\texttt{\%T} & Same as \%H:\%M:\%S\\
\texttt{\%u} & Weekday (1--7, 1 = Monday)\\
\texttt{\%w} & Weekday (0--6, 0 = Sunday)\\
\texttt{\%Y} & Year (0000--9999)\\
\texttt{\%\%} & The \% character.
\end{tabular}

\subsection{Bates Numbers}
\index{bates numbers}

Unique page identifiers can be specified by putting \verb!%Bates! in the format. The starting point can be set with the \texttt{-bates} option. For example:

\begin{framed}
\small\verb!cpdf -add-text "Page ID: %Bates" -bates 23745 in.pdf -o out.pdf!
\end{framed}

To specify that Bates numbering begins at the first page of the range, use \texttt{-bates-at-range} instead. This option must be specified after the range is specified. To pad the Bates number up to a given number of leading zeros, use \texttt{-bates-pad-to} in addition to either \texttt{-bates} or \texttt{-bates-at-range}.

\subsection{Position}
\label{position}

The position of the text may be specified in absolute terms:

\begin{framed}
\small\verb!-pos-center "200 200"!

\vspace{2.5mm}
\noindent Position the center of the baseline of the text at (200pt, 200pt)

\vspace{2.5mm}
\small\verb!-pos-left "200 200"!

\vspace{2.5mm}
\noindent Position the left of the baseline of the text at (200pt, 200pt)

\vspace{2.5mm}
\small\verb!-pos-right "200 200"!
\vspace{2.5mm}
\noindent Position the right of the baseline of the text at (200pt, 200pt)
\end{framed}

\noindent Alternatively, positions relative to certain common points can be set:

\begin{framed}
\noindent\begin{tabular}{ll}
\small\verb!-top 10! & Center of baseline 10 pts down from the top center \\
\small\verb!-topleft 10! & Left of baseline 10 pts down and in from top left \\
\small\verb!-topright 10! & Right of baseline 10 pts down and left from top right\\
\small\verb!-left 10! & Left of baseline 10 pts in from center left \\
\small\verb!-bottomleft 10! & Left of baseline 10 pts in and up from bottom left \\
\small\verb!-bottom 10! & Center of baseline 10 pts up from bottom center\\
\small\verb!-bottomright 10! & Right of baseline 10 pts up and in from bottom right \\
\small\verb!-right 10! & Right of baseline 10 pts in from the center right \\
\small\verb!-diagonal! & Diagonal, bottom left to top right, centered on page\\
\small\verb!-reverse-diagonal! & Diagonal, top left to bottom right, centered on page\\
\small\verb!-center! & Centered on page\\
\end{tabular}
\end{framed}

\noindent No attempt is made to take account of the page rotation when interpreting the position, so \texttt{-prerotate} must be added to the command line if the file contains pages with a non-zero viewing rotation. This is equivalent to pre-processing the document with \texttt{-upright}.

%The \texttt{-shorter-side} modifier can be used to indicate that all the
%positions above are relative to the shorter side of the page, any rotation
%required being automatic. In other words, \texttt{top, topleft, topright} are
%either on the top or left, depending upon which is the shorter side, and
%\texttt{bottom, bottomleft, bottomright} are either on the bottom or right
%similarly. This flag has no effect on \texttt{-diagonal}.

The \texttt{-relative-to-cropbox} modifier can be added to the command line to make these measurements relative to the crop box instead of the media box.
The default position is equivalent to \texttt{-topleft 100}.

The \texttt{-midline} option may be added to specify that the positioning commands above are to be considered relative to the midline of the text, rather than its baseline. Similarly, the \texttt{-topline} option may be used to specify that the position is taken relative to the top of the text.

\subsection{Font and Size}
\index{font}

The font may be set with the \texttt{-font} option. The 14 Standard PDF fonts are available:

\vspace{2mm}
\begin{tabular}{l}
Times-Roman\\
Times-Bold\\
Times-Italic\\
Times-BoldItalic\\
Helvetica\\
Helvetica-Bold\\
Helvetica-Oblique\\
Helvetica-BoldOblique\\
Courier\\
Courier-Bold\\
Courier-Oblique\\
Courier-BoldOblique\\
Symbol\\
ZapfDingbats
\end{tabular}

\noindent For example, page numbers in Times Italic can be achieved by:

\begin{framed}
\small\verb!cpdf -add-text "-%Page-" -font "Times-Italic" in.pdf -o out.pdf!
\end{framed}

\noindent See Section \ref{copyfont} for how to use other fonts.

The font size can be altered with the \texttt{-font-size} option, which specifies the size in points:

\begin{framed}
\small\verb!cpdf -add-text "-%Page-" -font-size 36 in.pdf -o out.pdf!
\end{framed}

\subsection{Colors}
\index{color}

The \texttt{-color} option takes an RGB color, where the red, green and blue components range between 0 and 1. The following values are predefined:

\vspace{2mm}
\begin{tabular}{ll}
\textbf{Color} & \textbf{R, G, B} \\
\hline
white & 1, 1, 1\\
black & 0, 0, 0\\
red & 1, 0, 0\\
green & 0, 1, 0\\
blue & 0, 0, 1\\
\end{tabular}

\begin{framed}
\small\verb!cpdf -add-text "Hullo" -color "red" in.pdf -o out.pdf!

\vspace{1.5mm}
\small\verb!cpdf -add-text "Hullo" -color "0.5 0.5 0.5" in.pdf -o out.pdf!
\end{framed}

\noindent Partly-transparent text may be specified using the \verb!-opacity! option. Wholly opaque is 1 and wholly transparent is 0. For example:

\begin{framed}
\small\verb!cpdf -add-text "DRAFT" -color "red" -opacity 0.3 in.pdf -o out.pdf!
\end{framed}

\subsection{Outline Text}
\index{outline text}

The \texttt{-outline} option sets outline text. The line width (default 1pt) may be set with the \texttt{-linewidth} option. For example, to stamp documents as drafts:

\begin{framed}
\small\verb!cpdf -add-text "DRAFT" -diagonal -outline in.pdf -o out.pdf!
\end{framed}

\subsection{Multi-line Text}

The code \texttt{$\backslash$n} can be included in the text string to move to the next line. In this case, the vertical position refers to the baseline of the first line of text (if the position is at the top, top left or top right of the page) or the baseline of the last line of text (if the position is at the bottom, bottom left or bottom right).

\begin{framed}
\small\begin{verbatim}cpdf -add-text "Specification\n%Page of %EndPage"
     -topright 10 in.pdf -o out.pdf\end{verbatim}
\end{framed}

\noindent The \texttt{-midline} option may be used to make these vertical positions relative to the midline of a line of text rather than the baseline, as usual.

The \texttt{-line-spacing} option can be used to increase or decrease the line spacing, where a spacing of 1 is the standard.

\begin{framed}
\small\begin{verbatim}cpdf -add-text "Specification\n%Page of %EndPage"
     -topright 10 -line-spacing 1.5 in.pdf -o out.pdf\end{verbatim}
\end{framed}

\noindent Justification of multiple lines is handled by the \texttt{-justify-left}, \texttt{-justify-right} and\\ \texttt{-justify-center} options. The defaults are left justification for positions relative to the left hand side of the page, right justification for those relative to the right, and center justification for positions relative to the center of the page. For example:

\begin{framed}
\small\begin{verbatim}cpdf -add-text "Long line\nShort"
     -justify-right in.pdf -o out.pdf\end{verbatim}
\end{framed}

\subsection{Special Characters}

If your command line allows for the inclusion of Unicode characters, the input text will be considered as UTF8 by \verb!cpdf!.
Special characters which exist in the PDF WinAnsiEncoding Latin 1 code (such as many accented characters) will be reproduced in the PDF. This does not mean, however, that every special character can be reproduced. You must experiment. For compatibility with previous versions of cpdf, special characters may be introduced manually with a backslash followed by the three-digit octal code of the character in the PDF WinAnsiEncoding Latin 1 Code. The full table is included in Appendix D of the Adobe PDF Reference Manual, which is available at \url{http://www.adobe.com/devnet/pdf/pdf_reference.html}. For example, a German sharp s (\ss) may be introduced by \verb!\337!. \section{Stamping Graphics} A rectangle may be placed on one or more pages by using the \texttt{-add-rectangle } command. Most of the options discussed above for text placement apply in the same way. For example: \begin{framed} \small\begin{verbatim}cpdf -add-rectangle "200 300" -pos-right 30 -color red -outline in.pdf -o out.pdf\end{verbatim} \end{framed} This can be used to blank out or highlight part of the document. The following positioning options work as you would expect: \texttt{-topleft}, \texttt{-top}, \texttt{-topright}, \texttt{-right}, \texttt{-bottomright}, \texttt{-bottom}, \texttt{-bottomleft}, \texttt{-left}, \texttt{-center}. When using the option \texttt{-pos-left "x y"}, the point (x, y) refers to the bottom-left of the rectangle. When using the option \texttt{-pos-right "x y"}, the point (x, y) refers to the bottom-right of the rectangle. When using the option \texttt{-pos-center "x y"}, the point (x, y) refers to the center of the rectangle. The options \texttt{-diagonal} and \texttt{-reverse-diagonal} have no meaning.\pagestyle{empty}\thispagestyle{fancy} \chapter{Multipage Facilities}\pagestyle{fancy} \begin{framed} \small\noindent\verb!cpdf -twoup-stack in.pdf -o out.pdf! \vspace{1.5mm} \small\noindent\verb!cpdf -twoup in.pdf -o out.pdf! 
\vspace{1.5mm} \small\noindent\verb!cpdf -pad-before in.pdf [] -o out.pdf! \vspace{1.5mm} \small\noindent\verb!cpdf -pad-after in.pdf [] -o out.pdf! \vspace{1.5mm} \small\noindent\verb!cpdf -pad-every [] in.pdf -o out.pdf! \vspace{1.5mm} \small\noindent\verb!cpdf -pad-multiple [] in.pdf -o out.pdf! \end{framed} \section{Two-up} \index{two-up} This facility puts multiple logical pages on a single physical page. The \texttt{-twoup-stack} operation puts two logical pages on each physical page, rotating them 90 degrees to do so. The new mediabox is thus larger. The \texttt{-twoup} operation does the same, but scales the new sides down so that the media box is unchanged. \section{Inserting Blank Pages} \index{blank pages!inserting} Sometimes, for instance to get a printing arrangement right, it's useful to be able to insert blank pages into a PDF file. \cpdf\ can add blank pages before a given page or pages, or after. The pages in question are specified by a range in the usual way: \begin{framed} \small\verb!cpdf -pad-before in.pdf 1 -o out.pdf! \vspace{2.5mm} \noindent Add a blank page before page 1 (i.e. at the beginning of the document.) \vspace{2.5mm} \verb!cpdf -pad-after in.pdf 2,16,38,84,121,147 -o out.pdf! \vspace{2.5mm} \noindent Add a blank page after pages 2, 16, 38, 84, 121 and 147 (for instance, to add a clean page between chapters of a document.) \end{framed} \noindent The dimensions of the padded page are derived from the boxes (media box, crop box etc.) of the page after or before which the padding is to be applied. The \verb!-pad-every n! operation places a blank page after every n pages, excluding any last one. For example\ldots \begin{framed} \small\verb!cpdf -pad-every 3 in.pdf -o out.pdf! \vspace{2.5mm} \noindent Add a blank page after every three pages \end{framed} \noindent\ldots on a 9 page document adds a blank page after pages 3 and 6. The \verb!-pad-multiple n! operation adds blank pages so the document has a multiple of \verb!n! pages. 
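
\noindent The page arithmetic behind \texttt{-pad-every} and \texttt{-pad-multiple} can be sketched in a few lines of Python. This is an illustration of the rules as stated above, not \cpdf's own code, and the function names \texttt{pad\_every\_positions} and \texttt{pad\_multiple\_blanks} are hypothetical.

\begin{framed}{\small\begin{verbatim}
def pad_every_positions(page_count, n):
    """Pages after which a blank is inserted: after every n
    pages, excluding any position at the very end."""
    return list(range(n, page_count, n))


def pad_multiple_blanks(page_count, n):
    """Blank pages needed to bring page_count up to a multiple of n."""
    return (-page_count) % n


print(pad_every_positions(9, 3))  # [3, 6]
print(pad_multiple_blanks(9, 8))  # 7\end{verbatim}}\end{framed}

\noindent The first result matches the 9 page example above. The second shows that a 9 page document needs seven blank pages to reach 16, the next multiple of 8.
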
For example:
\begin{framed}
\small\verb!cpdf -pad-multiple 8 in.pdf -o out.pdf!

\vspace{2.5mm}
\noindent Add blank pages to \texttt{in.pdf} so it has a multiple of 8 pages.
\end{framed}

\chapter{Annotations}
\begin{framed}
\small\noindent\verb!cpdf -list-annotations in.pdf [<range>]!

\vspace{1.5mm}
\small\noindent\verb!cpdf -copy-annotations from.pdf to.pdf [<range>] -o out.pdf!

\vspace{1.5mm}
\small\noindent\verb!cpdf -remove-annotations in.pdf [<range>] -o out.pdf!
\end{framed}

\section{List Annotations}
\index{annotations!listing}
The \texttt{-list-annotations} operation prints the textual content of any annotations on the selected pages to standard output. Each annotation is preceded by the page number and followed by a newline.
\begin{framed}
\small\verb!cpdf -list-annotations in.pdf > annots.txt!

\vspace{2.5mm}
\noindent Print annotations from \texttt{in.pdf}, redirecting output to \texttt{annots.txt}.
\end{framed}

\section{Copy Annotations}
\index{annotations!copying}
The \texttt{-copy-annotations} operation copies the annotations in the given page range from one file (the file specified immediately after the option) to another pre-existing PDF. The range is specified after this pre-existing PDF. The result is then written to an output file, specified in the usual way.
\begin{framed}
\small\verb!cpdf -copy-annotations from.pdf to.pdf 1-10 -o result.pdf!

\vspace{2.5mm}
\noindent Copy annotations from the first ten pages of \texttt{from.pdf} onto the PDF file \texttt{to.pdf}, writing the result to \texttt{result.pdf}.
\end{framed}

\section{Remove Annotations}
\index{annotations!removing}
The \texttt{-remove-annotations} operation removes all annotations from the given page range.
\begin{framed}
\small\verb!cpdf -remove-annotations in.pdf 1 -o out.pdf!

\vspace{2.5mm}
\noindent Remove annotations from the first page of a file only.
\end{framed}

\chapter{Document Information and Metadata}
\index{document information}
\index{metadata}
\begin{framed}
\small\noindent\verb!cpdf -list-fonts in.pdf!

\vspace{1.5mm}
\small\noindent\verb!cpdf -info [-raw | -utf8] in.pdf!

\vspace{1.5mm}
\small\noindent\verb!cpdf -page-info in.pdf!

\vspace{1.5mm}
\small\noindent\verb!cpdf -pages in.pdf!

\vspace{1.5mm}
\small\noindent\verb!cpdf -set-title in.pdf -o out.pdf!\\
(Also \texttt{-set-author} etc. See Section \ref{setdocinfo}.)

\vspace{1.5mm}
\small\noindent\verb!cpdf -set-page-layout <layout> in.pdf -o out.pdf!

\vspace{1.5mm}
\small\noindent\verb!cpdf -set-page-mode <mode> in.pdf -o out.pdf!

\vspace{1.5mm}
\small\noindent\verb!cpdf -hide-toolbar <true | false> in.pdf -o out.pdf!\\
\noindent\verb!     -hide-menubar!\\
\noindent\verb!     -hide-window-ui!\\
\noindent\verb!     -fit-window!\\
\noindent\verb!     -center-window!\\
\noindent\verb!     -display-doc-title!

\vspace{1.5mm}
\small\noindent\verb!cpdf -open-at-page <page number> in.pdf -o out.pdf!\\
\noindent\verb!cpdf -open-at-page-fit <page number> in.pdf -o out.pdf!

\vspace{1.5mm}
\small\noindent\verb!cpdf -set-metadata <metadata-file> in.pdf -o out.pdf!\\
\small\noindent\verb!cpdf -remove-metadata in.pdf -o out.pdf!\\
\small\noindent\verb!cpdf -print-metadata in.pdf!

\vspace{1.5mm}
\small\noindent\verb!cpdf -add-page-labels in.pdf -o out.pdf!\\
\noindent\verb!     [-label-style <style>] [-label-prefix <string>]!\\
\noindent\verb!     [-label-startval <integer>]!

\vspace{1.5mm}
\small\noindent\verb!cpdf -remove-page-labels in.pdf -o out.pdf!\\
\small\noindent\verb!cpdf -print-page-labels in.pdf!
\end{framed}

\section{Listing Fonts}
\index{fonts!listing}
The \texttt{-list-fonts} operation prints the fonts in the document to standard output, one per line.
For example:
\begin{framed}\small\begin{verbatim}1 /F245 /Type0 /Cleargothic-Bold /Identity-H
1 /F247 /Type0 /ClearGothicSerialLight /Identity-H
1 /F248 /Type1 /Times-Roman /WinAnsiEncoding
1 /F250 /Type0 /Cleargothic-RegularItalic /Identity-H
2 /F13 /Type0 /Cleargothic-Bold /Identity-H
2 /F16 /Type0 /Arial-ItalicMT /Identity-H
2 /F21 /Type0 /ArialMT /Identity-H
2 /F58 /Type1 /Times-Roman /WinAnsiEncoding
2 /F59 /Type0 /ClearGothicSerialLight /Identity-H
2 /F61 /Type0 /Cleargothic-BoldItalic /Identity-H
2 /F68 /Type0 /Cleargothic-RegularItalic /Identity-H
3 /F47 /Type0 /Cleargothic-Bold /Identity-H
3 /F49 /Type0 /ClearGothicSerialLight /Identity-H
3 /F50 /Type1 /Times-Roman /WinAnsiEncoding
3 /F52 /Type0 /Cleargothic-BoldItalic /Identity-H
3 /F54 /Type0 /TimesNewRomanPS-BoldItalicMT /Identity-H
3 /F57 /Type0 /Cleargothic-RegularItalic /Identity-H
4 /F449 /Type0 /Cleargothic-Bold /Identity-H
4 /F451 /Type0 /ClearGothicSerialLight /Identity-H
4 /F452 /Type1 /Times-Roman /WinAnsiEncoding
\end{verbatim}
\end{framed}

\noindent The first column gives the page number, the second the internal unique font name, the third the type of font (Type1, TrueType etc), the fourth the PDF font name, the fifth the PDF font encoding.

\section{Reading Document Information}
\label{info}
The \texttt{-info} option prints entries from the document information dictionary, and from any XMP metadata to standard output.
\begin{framed}
{\small\begin{verbatim}
$ cpdf -info pdf_reference.pdf
Encryption: 40bit
Linearized: true
Permissions: No edit
Version: 1.6
Pages: 1310
Title: PDF Reference, version 1.7
Author: Adobe Systems Incorporated
Subject: Adobe Portable Document Format (PDF)
Keywords:
Creator: FrameMaker 7.2
Producer: Acrobat Distiller 7.0.5 (Windows)
Created: D:20061017081020Z
Modified: D:20061118211043-02'30'
XMP pdf:Producer: Adobe PDF library 7.77
XMP xmp:CreateDate: 2006-12-21T18:19:09+01:00
XMP xmp:CreatorTool: Adobe Illustrator CS2
XMP xmp:MetadataDate: 2006-12-21T18:19:09Z
XMP xmp:ModifyDate: 2006-12-21T18:19:09Z
XMP dc:title: AI6\end{verbatim}}\end{framed}

\noindent The details of the format for creation and modification dates can be found in Appendix~\ref{dates}.

By default, \cpdf\ converts these strings to ASCII, discarding character codes above 127. To preserve the original Unicode, add the \texttt{-utf8} option. To disable all postprocessing of the strings, add \texttt{-raw}.

\vspace{4mm}
The \texttt{-page-info} option prints the page label, media box and other boxes page-by-page to standard output, for all pages in the current range.
\begin{framed}
{\small\begin{verbatim}
$ cpdf -page-info 14psfonts.pdf
Page 1:
Label: i
MediaBox: 0.000000 0.000000 600.000000 450.000000
CropBox: 200.000000 200.000000 500.000000 500.000000
BleedBox:
TrimBox:
ArtBox:
Rotation: 0
\end{verbatim}}
\end{framed}

\noindent Note that the format for boxes is minimum x, minimum y, maximum x, maximum y.

\smallgap
\noindent The \texttt{-pages} operation prints the number of pages in the file.
\begin{framed}
{\small\begin{verbatim}
$ cpdf -pages Archos.pdf
8
\end{verbatim}}
\end{framed}

\section{Setting Document Information}
\label{setdocinfo}
The \textit{document information dictionary} in a PDF file specifies various pieces of information about a PDF. These can be consulted in a PDF viewer (for instance, Acrobat).
Here is a summary of the commands for setting entries in the document information dictionary: {\small\begin{framed} \noindent\begin{tabular}{ll} \textbf{Information} & \textbf{Example command-line fragment} \\ Title & \texttt{cpdf -set-title "Discourses"} \\ Author & \texttt{cpdf -set-author "Joe Smith"} \\ Subject & \texttt{cpdf -set-subject "Behavior"} \\ Keywords & \texttt{cpdf -set-keywords "Ape Primate"} \\ Creator & \texttt{cpdf -set-creator "Original Program"} \\ Producer & \texttt{cpdf -set-producer "Distilling Program"} \\ Creation Date & \texttt{cpdf -set-create "D:19970915110347-08'00'"} \\ Modification Date & \texttt{cpdf -set-modify "D:19970915110347-08'00'"} \\ Mark as Trapped & \texttt{cpdf -set-trapped} \\ Mark as Untrapped & \texttt{cpdf -set-untrapped} \\ \end{tabular} \end{framed}} \noindent (The details of the format for creation and modification dates can be found in Appendix~\ref{dates}. Using the date \texttt{"now"} uses the time and date at which the command is executed. Note also that \texttt{-producer} and \texttt{-creator} may be used to set the producer and/or the creator when writing any file, separate from the operations described in this chapter.) \vspace{2mm} For example, to set the title, the full command line would be \begin{framed} \small\verb!cpdf -set-title "A Night in London" in.pdf -o out.pdf! \end{framed} \noindent The text string is considered to be in UTF8 format, unless the \texttt{-raw} option is added---in which case, it is unprocessed, save for the replacement of any octal escape sequence such as \texttt{\textbackslash 017}, which is replaced by a character of its value (here, 15). \section{Upon Opening a Document} \subsection{Page Layout} \index{page!layout} The \texttt{-set-page-layout} option specifies the page layout to be used when a document is opened in, for instance, Acrobat. 
The possible (case-sensitive) values are: \vspace{2mm} {\small\begin{tabular}{ll} \texttt{SinglePage} & \vspace{2mm} \parbox{8cm}{Display one page at a time} \\ \texttt{OneColumn} & \vspace{2mm} \parbox{8cm}{Display the pages in one column} \\ \texttt{TwoColumnLeft} & \vspace{2mm} \parbox{8cm}{Display the pages in two columns, odd numbered pages on the left} \\ \texttt{TwoColumnRight} & \vspace{2mm} \parbox{8cm}{Display the pages in two columns, even numbered pages on the left} \\ \texttt{TwoPageLeft} & \vspace{2mm} \parbox{8cm}{(PDF 1.5 and above) Display the pages two at a time, odd numbered pages on the left} \\ \texttt{TwoPageRight} & \vspace{2mm} \parbox{8cm}{(PDF 1.5 and above) Display the pages two at a time, even numbered pages on the left} \end{tabular}}\\ \noindent For instance: \begin{framed} \small\verb!cpdf -set-page-layout TwoColumnRight in.pdf -o out.pdf! \end{framed} \subsection{Page Mode} \index{page!mode} The \textit{page mode} in a PDF file defines how a viewer should display the document when first opened. The possible (case-sensitive) values are: \vspace{2mm} {\small\begin{tabular}{ll} \texttt{UseNone} & \vspace{2mm} \parbox{8cm}{Neither document outline nor thumbnail images visible} \\ \texttt{UseOutlines} & \vspace{2mm} \parbox{8cm}{Document outline (bookmarks) visible} \\ \texttt{UseThumbs} & \vspace{2mm} \parbox{8cm}{Thumbnail images visible} \\ \texttt{FullScreen} & \vspace{2mm} \parbox{8cm}{Full-screen mode (no menu bar, window controls, or anything but the document visible)} \\ \texttt{UseOC} & \vspace{2mm} \parbox{8cm}{(PDF 1.5 and above) Optional content group panel visible} \\ \texttt{UseAttachments} & \vspace{2mm} \parbox{8cm}{(PDF 1.5 and above) Attachments panel visible} \end{tabular}}\\ \noindent For instance: \begin{framed} \small\verb!cpdf -set-page-mode FullScreen in.pdf -o out.pdf! 
\end{framed}

\subsection{Display Options}

\vspace{2mm}
{\small\begin{tabular}{ll}
\texttt{-hide-toolbar} & \vspace{2mm} \parbox{8cm}{Hide the viewer's toolbar} \\
\texttt{-hide-menubar} & \vspace{2mm} \parbox{8cm}{Hide the viewer's menu bar} \\
\texttt{-hide-window-ui} & \vspace{2mm} \parbox{8cm}{Hide the viewer's scroll bars} \\
\texttt{-fit-window} & \vspace{2mm} \parbox{8cm}{Resize the document's window to fit the size of the first page} \\
\texttt{-center-window} & \vspace{2mm} \parbox{8cm}{Position the document window in the center of the screen} \\
\texttt{-display-doc-title} & \vspace{2mm} \parbox{8cm}{Display the document title instead of the file name in the title bar}
\end{tabular}}\\

\noindent For instance:
\begin{framed}
\small\verb!cpdf -hide-toolbar true in.pdf -o out.pdf!
\end{framed}

\noindent The page a PDF file opens at can be set using \texttt{-open-at-page}:
\begin{framed}
\small\verb!cpdf -open-at-page 15 in.pdf -o out.pdf!
\end{framed}

\noindent To have that page scaled to fit the window in the viewer, use \texttt{-open-at-page-fit} instead:
\begin{framed}
\small\verb!cpdf -open-at-page-fit 15 in.pdf -o out.pdf!
\end{framed}

\section{Metadata}
\index{metadata}
PDF files can contain a piece of arbitrary metadata, often in XMP format. This is typically stored in an uncompressed stream, so that other applications can read it without having to decode the whole PDF. To set the metadata:
\begin{framed}
\small\verb!cpdf -set-metadata data.xml in.pdf -o out.pdf!
\end{framed}

\noindent To remove any metadata:
\begin{framed}
\small\verb!cpdf -remove-metadata in.pdf -o out.pdf!
\end{framed}

\noindent To print the current metadata to standard output:
\begin{framed}
\small\verb!cpdf -print-metadata in.pdf!
\end{framed}

\section{Page Labels}
\index{page labels}\index{page!labels}
It is possible to add \textit{page labels} to a document.
These are not printed on the page, but may be displayed alongside thumbnails or in print dialogue boxes by PDF readers. We use \texttt{-add-page-labels} to do this, by default with decimal arabic numbers (1,2,3\ldots). We can add \texttt{-label-style} to choose what type of labels to add from these kinds:

\vspace{4mm}
{\small\begin{tabular}{rl}
\texttt{DecimalArabic} & 1,2,3,4,5\ldots \\
\texttt{LowercaseRoman} & i,ii,iii,iv,v\ldots \\
\texttt{UppercaseRoman} & I,II,III,IV,V\ldots \\
\texttt{LowercaseLetters} & a,b,c,\ldots,z,aa,bb\ldots \\
\texttt{UppercaseLetters} & A,B,C,\ldots,Z,AA,BB\ldots \\
\texttt{NoLabelPrefixOnly} & No number, but a prefix will be used if defined.
\end{tabular}}

\vspace{4mm}
\noindent We can use \texttt{-label-prefix} to add a textual prefix to each label. Consider a file with twenty pages and no current page labels (a PDF reader will assume 1,2,3\ldots if there are none). We will add the following page labels:

\vspace{4mm}
i, ii, iii, iv, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, A-0, A-1, A-2, A-3, A-4, A-5

\vspace{4mm}
\noindent Here are the commands, in order:

{\small\begin{framed}
\noindent\verb!cpdf -add-page-labels in.pdf 1-4 -label-style LowercaseRoman!\\
\noindent\verb!     -o out.pdf!\\
\noindent\verb!cpdf -add-page-labels out.pdf 5-14 -o out.pdf!\\
\noindent\verb!cpdf -add-page-labels out.pdf 15-20 -label-prefix "A-"!\\
\noindent\verb!     -label-startval 0 -o out.pdf!
\end{framed}}

\noindent By default, the labels in each range begin at 1. To override this, we can use \texttt{-label-startval}; in the final command we used $0$ so that the numbers begin at zero rather than one.

Page labels may be removed altogether by using the \texttt{-remove-page-labels} command. To print the page labels from an existing file, use \texttt{-print-page-labels}.
For example:
\begin{framed}\small\begin{verbatim}$ cpdf -print-page-labels cpdfmanual.pdf
labelstyle: LowercaseRoman
labelprefix: None
startpage: 1
startvalue: 1
labelstyle: DecimalArabic
labelprefix: None
startpage: 9
startvalue: 1
\end{verbatim}
\end{framed}\pagestyle{empty}\thispagestyle{fancy}

\chapter{File Attachments}\pagestyle{fancy}
\index{attachments}
\begin{framed}
\small\noindent\verb!cpdf -attach-file <filename> [-to-page <page number>] in.pdf -o out.pdf!

\vspace{1.5mm}
\small\noindent\verb!cpdf -list-attached-files in.pdf!

\vspace{1.5mm}
\small\noindent\verb!cpdf -remove-files in.pdf -o out.pdf!
\end{framed}

PDF supports adding attachments (files of any kind, including other PDFs) to an existing file. The \cpdf\ tool supports adding and removing \textit{document-level attachments} (that is, ones which are associated with the document as a whole rather than with an individual page) and also \textit{page-level attachments}, associated with a particular page.

\section{Adding Attachments}
\index{attachments!adding}
To add an attachment, use the \texttt{-attach-file} option. For instance,
\begin{framed}
\small\verb!cpdf -attach-file sheet.xls in.pdf -o out.pdf!
\end{framed}

\noindent attaches the Excel spreadsheet \texttt{sheet.xls} to the input file. If the file already has attachments, the new file is added to their number. You can specify multiple files to be attached by using \verb!-attach-file! multiple times. They will be attached in the given order.

The \texttt{-to-page} option can be used to specify that the files will be attached to the given page, rather than at the document level. The \texttt{-to-page} option may be specified at most once.

\section{Listing Attachments}
\index{attachments!listing}
To list all document- and page-level attachments, use the \texttt{-list-attached-files} operation. The page number and filename of each attachment are given, page 0 representing a document-level attachment.
\begin{framed}
{\small\begin{verbatim}
$ cpdf -list-attached-files 14psfonts.pdf
0 utility.ml
0 utility.mli
4 notes.xls
\end{verbatim}}
\end{framed}

\section{Removing Attachments}
\index{attachments!removing}
To remove all document-level and page-level attachments from a file, use the \texttt{-remove-files} operation:
\begin{framed}
\small\verb!cpdf -remove-files in.pdf -o out.pdf!
\end{framed}

\chapter{Working with Images}
\begin{framed}
\noindent\small\verb!cpdf -image-resolution <minimum resolution> in.pdf [<range>]!
%\vspace{1.5mm}
%\noindent\small\verb!cpdf -extract-images in.pdf [<range>] -o <string>!
\end{framed}

\section{Detecting Low-resolution Images}
To list all images in the given range of pages which fall below a given resolution (in dots-per-inch), use the \verb!-image-resolution! function:
\begin{framed}
\noindent\small\verb@cpdf -image-resolution 300 in.pdf [<range>]@
\end{framed}
\begin{framed}
{\small\begin{verbatim}2, /Im5, 531, 684, 149.935297, 150.138267
2, /Im6, 184, 164, 149.999988, 150.458710
2, /Im7, 171, 156, 149.999996, 150.579145
2, /Im9, 65, 91, 149.999986, 151.071856
2, /Im10, 94, 60, 149.999990, 152.284285
2, /Im15, 184, 139, 149.960011, 150.672060
4, /Im29, 53, 48, 149.970749, 151.616446\end{verbatim}}
\end{framed}

\noindent The format is \textit{page number, image name, x pixels, y pixels, x resolution, y resolution}. The resolutions refer to the image's effective resolution at point of use (taking account of scaling, rotation etc).

% \section{Extracting Images}
% \begin{framed}
% \noindent\verb!cpdf -extract-images in.pdf [<range>] -o <string>!
% \end{framed}
%The Tools can extract images from PDF files to JPEG, JPEG2000, JBIG2 and PNM (Portable Any Map) files. Images which are already in JPEG/JPEG2000/JBIG2 format in the PDF are written in those formats, unaltered. All other images are decoded and written as PNM files (unless the decoding method is unknown). If the command line tool \textsf{pnm2png} is present, PNG files are output instead.
%For example,
% \begin{framed}
% \small\verb!cpdf -extract-images in.pdf 2-6 -o img%%%!
% \end{framed}
%might generate \texttt{img001.jpg}, \texttt{img002.png}, \texttt{img003.jpg} etc. from the images on pages two to six. The number of percentage characters in the output format indicate the width of the numbering system for the output file names.

\pagestyle{empty}
\chapter{Fonts}\pagestyle{fancy}
{\small
\begin{framed}
\noindent\verb!cpdf -copy-font fromfile.pdf -copy-font-page <int>!\\
\noindent\verb!     -copy-font-name <name> in.pdf [<range>] -o out.pdf!

\vspace{1.5mm}
\noindent\verb!cpdf -remove-fonts in.pdf -o out.pdf!

\vspace{1.5mm}
\noindent\verb!cpdf -missing-fonts in.pdf!
\end{framed}}

\section{Copying Fonts}
\label{copyfont}
In order to use a font other than the standard 14 with \verb!-add-text!, it must be added to the file. The font source PDF is given, together with the font's resource name on a given page, and that font is copied to all the pages in the input file's range, and then written to the output file.

The font is named in the output file with its basefont name, so it can be easily used with \verb!-add-text!.

For example, if the file \verb!fromfile.pdf! has a font \verb!/GHLIGA+c128! with the name \verb!/F10! on page 1 (this information can be found with \verb!-list-fonts!), the following would copy the font to the file \verb!in.pdf! on all pages, writing the output to \verb!out.pdf!:
\begin{framed}
\small\noindent\verb!cpdf -copy-font fromfile.pdf -copy-font-name /F10!\\
\small\noindent\verb!     -copy-font-page 1 in.pdf -o out.pdf!
\end{framed}

\noindent Text in this font can then be added by giving \verb!-font /GHLIGA+c128!. Be aware that due to the vagaries of PDF font handling concerning which characters are present in the source font, not all characters may be available, or the encoding (mapping from input codes to glyphs) may be non-obvious.
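
\smallgap
\noindent As a rough sketch, the whole workflow described above might be run as three commands in sequence: look up the resource name, copy the font, then add text in it. The file names \texttt{with-font.pdf} and \texttt{stamped.pdf} and the text \texttt{"Sample"} are placeholders chosen for illustration only, and the sketch assumes that \verb!-add-text! takes the text to add as its argument and otherwise accepts input and output files in the usual way. Each command is typed on a single line; the indented line is a continuation.
\begin{framed}
\small\begin{verbatim}
cpdf -list-fonts fromfile.pdf

cpdf -copy-font fromfile.pdf -copy-font-name /F10
     -copy-font-page 1 in.pdf -o with-font.pdf

cpdf -add-text "Sample" -font /GHLIGA+c128 with-font.pdf -o stamped.pdf
\end{verbatim}
\end{framed}

\noindent Writing the copied font to an intermediate file first keeps the original \texttt{in.pdf} untouched, which makes it easier to retry with a different font if some characters turn out to be unavailable.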
\section{Removing Fonts} \label{removefont} To remove embedded fonts from a document, use \verb!-remove-fonts!. PDF readers will substitute local fonts for the missing fonts. The use of this function is only recommended when file size is the sole consideration. \begin{framed} \small\noindent\verb!cpdf -remove-fonts in.pdf -o out.pdf! \vspace{2.5mm} \end{framed} \section{Listing Missing Fonts} The \verb!-missing-fonts! operation lists any unembedded fonts in the document, one per line. \begin{framed} \small\noindent\verb!cpdf -missing-fonts in.pdf! \vspace{2.5mm} \end{framed} \noindent The format is \begin{framed} \small\noindent\verb!Page number, Name, Subtype, Basefont, Encoding! \vspace{2.5mm} \end{framed} \label{listmisingfonts} \chapter{Miscellaneous} {\small\begin{framed} \noindent\verb!cpdf -draft [-boxes] in.pdf [<range>] -o out.pdf! \vspace{1.5mm} \noindent\verb!cpdf -blacktext in.pdf [<range>] -o out.pdf! \vspace{1.5mm} \noindent\verb!cpdf -blacklines in.pdf [<range>] -o out.pdf! \vspace{1.5mm} \noindent\verb!cpdf -blackfills in.pdf [<range>] -o out.pdf! \vspace{1.5mm} \noindent\verb!cpdf -thinlines <minimum thickness> in.pdf [<range>] -o out.pdf! \vspace{1.5mm} \noindent\verb!cpdf -clean in.pdf -o out.pdf! \vspace{1.5mm} \noindent\verb!cpdf -set-version <version number> in.pdf -o out.pdf! \vspace{1.5mm} \noindent\verb!cpdf -copy-id-from source.pdf in.pdf -o out.pdf! \vspace{1.5mm} \noindent\verb!cpdf -remove-id in.pdf -o out.pdf! \vspace{1.5mm} \noindent\verb!cpdf -list-spot-colors in.pdf! \vspace{1.5mm} \noindent\verb!cpdf -remove-dict-entry in.pdf -o out.pdf! \vspace{1.5mm} \noindent\verb!cpdf -remove-clipping in.pdf -o out.pdf! \end{framed}} \section{Draft Documents} \index{draft} The \texttt{-draft} option removes bitmap (photographic) images from a file, so that it can be printed with less ink. Optionally, the \texttt{-boxes} option can be added, filling the spaces left blank with a crossed box denoting where the image was. 
This is not guaranteed to be fully visible in all cases (the bitmap may have been partially covered by vector objects or clipped in the original). For example:
\begin{framed}
\small\verb!cpdf -draft -boxes in.pdf -o out.pdf!
\end{framed}

\section{Blackening Text, Lines and Fills}
\index{blacken text}
Sometimes PDF output from an application (for instance, a web browser) has text in colors which would not print well on a grayscale printer. The \texttt{-blacktext} operation blackens all text on the given pages so it will be readable when printed.

This will not work on text which has been converted to outlines, nor on text which is part of a form.
\begin{framed}
\small\verb!cpdf -blacktext in.pdf -o out.pdf!
\end{framed}

\index{blacken lines}
\noindent The \texttt{-blacklines} operation blackens all lines on the given pages.
\begin{framed}
\small\verb!cpdf -blacklines in.pdf -o out.pdf!
\end{framed}

\index{blacken fills}
\noindent The \texttt{-blackfills} operation blackens all fills on the given pages.
\begin{framed}
\small\verb!cpdf -blackfills in.pdf -o out.pdf!
\end{framed}

\noindent Contrary to their names, all these operations can use another color, if specified with \texttt{-color}.

\section{Hairline Removal}
\index{hairline removal}
Quite often, applications will use very thin lines, or even the value of 0, which in PDF means ``the thinnest possible line on the output device''. This might be fine for on-screen work, but when printed on a high resolution device, such as by a commercial printer, such lines may be too faint, or disappear altogether. The \texttt{-thinlines} option prevents this by changing all lines thinner than \texttt{<minimum~thickness>} to the given thickness. For example:
\begin{framed}
\small\noindent\verb!cpdf -thinlines 0.2mm in.pdf [<range>] -o out.pdf!

\vspace{2.5mm}
\noindent Thicken all lines less than 0.2mm to that value.
\end{framed}

\section{Garbage Collection}
\index{garbage collection}
Sometimes incremental updates to a file, or badly behaved applications, can leave data in a PDF file which is no longer used. The \texttt{-clean} operation removes that unneeded data.
\begin{framed}
\small\noindent\verb!cpdf -clean in.pdf -o out.pdf!
\end{framed}

\section{Change PDF Version Number}
\index{version number}
\label{setversion}
To change the PDF version number, use the \texttt{-set-version} operation, giving the part of the version number after the decimal point. For example:
\begin{framed}
\small\noindent\verb!cpdf -set-version 4 in.pdf -o out.pdf!

\vspace{2.5mm}
\noindent Change file to PDF 1.4.
\end{framed}

\noindent This does not alter any of the actual data in the file, only the declared version number.

\section{Copy ID}
\index{copy ID}
The \texttt{-copy-id-from} operation copies the ID from the given file to the input, writing to the output.
\begin{framed}
\small\noindent\verb!cpdf -copy-id-from source.pdf in.pdf -o out.pdf!

\vspace{2.5mm}
\noindent Copy the ID from \texttt{source.pdf} to the contents of \texttt{in.pdf}, writing to \texttt{out.pdf}.
\end{framed}

\noindent If there is no ID in the source file, the existing ID is retained. You cannot use \texttt{-recrypt} with \texttt{-copy-id-from}.

\section{Remove ID}
\index{remove ID}
The \texttt{-remove-id} operation removes the ID from a document.
\begin{framed}
\small\noindent\verb!cpdf -remove-id in.pdf -o out.pdf!

\vspace{2.5mm}
\noindent Remove the ID from \texttt{in.pdf}, writing to \texttt{out.pdf}.
\end{framed}

\noindent You cannot use \texttt{-recrypt} with \texttt{-remove-id}.

\section{List Spot Colors}
The \texttt{-list-spot-colors} operation lists the name of any ``separation'' color space in the given PDF file.
\begin{framed}
\small\noindent\verb!cpdf -list-spot-colors in.pdf!

\vspace{2.5mm}
\noindent List the spot colors in \texttt{in.pdf}, one per line, writing to \texttt{stdout}.
\end{framed}

\section{Removing Dictionary Entries}
This is for editing data within the PDF's internal representation. Use with caution.
\begin{framed}
\small\noindent\verb!cpdf -remove-dict-entry /One in.pdf -o out.pdf!

\vspace{2.5mm}
\noindent Remove the entry for \texttt{/One} in every dictionary in \texttt{in.pdf}, writing to \texttt{out.pdf}.
\end{framed}

\section{Remove Clipping}
The \texttt{-remove-clipping} operation removes any clipping paths from the file.
\begin{framed}
\small\noindent\verb!cpdf -remove-clipping in.pdf -o out.pdf!

\vspace{2.5mm}
\noindent Remove every clipping path in \texttt{in.pdf}, writing to \texttt{out.pdf}.
\end{framed}

\appendix
\chapter{Dates}\pagestyle{empty}
\label{dates}
\index{dates!defined}
Dates in PDF are specified according to the following format:
\begin{framed}
\texttt{D:YYYYMMDDHHmmSSOHH'mm'}\\\\where:
\begin{itemize}
\item \texttt{YYYY} is the year;
\item \texttt{MM} is the month (01-12);
\item \texttt{DD} is the day (01-31);
\item \texttt{HH} is the hour (00-23);
\item \texttt{mm} is the minute (00-59);
\item \texttt{SS} is the second (00-59);
\item \texttt{O} is the relationship of local time to Universal Time (UT), denoted by \texttt{+}, \texttt{-} or \texttt{Z};
\item \texttt{HH} is the absolute value of the offset from UT in hours (00-23);
\item \texttt{mm} is the absolute value of the offset from UT in minutes (00-59).
\end{itemize}
\end{framed}

\noindent A contiguous prefix of the parts above can be used instead, for lower accuracy dates. For example:
\begin{framed}
\small\noindent\verb!D:2014! (2014)

\vspace{1.5mm}
\noindent\verb!D:20140103! (3rd January 2014)

\vspace{1.5mm}
\noindent\verb!D:201401031854-08'00'! (3rd January 2014, 6:54PM, US Pacific Standard Time)
\end{framed}

\backmatter
\pagestyle{fancy}
\printindex
\end{document}