diff --git a/html_manual/cpdfmanual.html b/html_manual/cpdfmanual.html deleted file mode 100644 index 243e2c9..0000000 --- a/html_manual/cpdfmanual.html +++ /dev/null @@ -1,2259 +0,0 @@ - - - - - - - - - - - - -

- - -

-

-

Coherent PDF -

Command Line Toolkit -

User Manual
-Version 2.2 (March 2017) -

Coherent Graphics Ltd -

- - -

For bug reports, feature requests and comments, email
contact@coherentgraphics.co.uk -

©2017 Coherent Graphics Limited. All rights reserved. ISBN 978-0957671140 -

Adobe, Acrobat, Adobe PDF, Adobe Reader and PostScript are registered trademarks of Adobe -Systems Incorporated. Windows, Powerpoint and Excel are registered trademarks of Microsoft -Corporation. - - - - -

- - -

Contents

- 1 Basic Usage -
 1.1 Input and Output Files -
 1.2 Input Ranges -
 1.3 Working with Encrypted Documents -
 1.4 Standard Input and Standard Output -
 1.5 Doing Several Things at Once with AND -
 1.6 Units -
 1.7 Setting the Producer and Creator -
 1.8 PDF Version Numbers -
 1.9 File IDs -
 1.10 Linearization -
 1.11 Object Streams -
 1.12 Malformed Files -
 1.13 Error Handling -
 1.14 Control Files -
 1.15 String Arguments -
 1.16 Text Encodings -
 1.17 Font Embedding -
2 Merging and Splitting -
 2.1 Merging -
 2.2 Splitting -
 2.3 Splitting on Bookmarks -
 2.4 Encrypting with Split and Split Bookmarks -
3 Pages -
 3.1 Page Sizes -
 3.2 Scale Pages -
 3.3 Shift Page Contents -
 3.4 Rotating Pages -
 3.5 Flipping Pages -
 3.6 Boxes and Cropping -
4 Encryption and Decryption -
 4.1 Introduction -
 4.2 Encrypting a Document -
 4.3 Decrypting a Document -
5 Compression -
 5.1 Decompressing a Document -
 5.2 Compressing a Document - - -
 5.3 Squeezing a Document -
6 Bookmarks -
 6.1 List Bookmarks -
 6.2 Remove Bookmarks -
 6.3 Add Bookmarks -
7 Presentations -
8 Watermarks and Stamps -
 8.1 Add a Watermark or Logo -
 8.2 Stamp Text, Dates and Times. -
  8.2.1 Page Numbers -
  8.2.2 Date and Time Formats -
  8.2.3 Bates Numbers -
  8.2.4 Position -
  8.2.5 Font and Size -
  8.2.6 Colors -
  8.2.7 Outline Text -
  8.2.8 Multi-line Text -
  8.2.9 Special Characters -
 8.3 Stamping Graphics -
9 Multipage Facilities -
 9.1 Two-up -
 9.2 Inserting Blank Pages -
10 Annotations -
 10.1 List Annotations -
 10.2 Copy Annotations -
 10.3 Remove Annotations -
11 Document Information and Metadata -
 11.1 Listing Fonts -
 11.2 Reading Document Information -
 11.3 Setting Document Information -
 11.4 Upon Opening a Document -
  11.4.1 Page Layout -
  11.4.2 Page Mode -
  11.4.3 Display Options -
 11.5 Metadata -
 11.6 Page Labels -
12 File Attachments -
 12.1 Adding Attachments -
 12.2 Listing Attachments -
 12.3 Removing Attachments -
13 Working with Images -
 13.1 Detecting Low-resolution Images -
14 Fonts -
 14.1 Copying Fonts -
 14.2 Removing Fonts -
 14.3 Listing Missing Fonts - - -
15 Miscellaneous -
 15.1 Draft Documents -
 15.2 Blackening Text, Lines and Fills -
 15.3 Hairline Removal -
 15.4 Garbage Collection -
 15.5 Change PDF Version Number -
 15.6 Copy ID -
 15.7 Remove ID -
 15.8 List Spot Colours -
 15.9 Removing Dictionary Entries -
 15.10 Remove Clipping -
A Dates -
- - -

- - - - -

- - -

Typographical Conventions

Command lines to be typed are shown in typewriter font in a box. For example: -

-

When describing the general form of a command, rather than a particular example, square brackets [] -are used to enclose optional parts, and angled braces <> to enclose general descriptions which may be -substituted for particular instances. For example, -

-

describes a command line which requires an operation and, optionally, a range. An exception is that -we use in.pdf and out.pdf instead of <input file> and <output file> to reduce verbosity. Under -Microsoft Windows, type cpdf.exe instead of cpdf. - - -

- - - - -

- - - - -

- - -

Chapter 1
Basic Usage

-

-

The Coherent PDF tools provide a wide range of facilities for modifying PDF files -created by other means. There is a single command-line program cpdf (cpdf.exe under -Microsoft Windows). The rest of this manual describes the options that may be given to this -program. - - -

1.1 Input and Output Files

-

The typical pattern for usage is -

-

and the simplest concrete example, assuming the existence of a file in.pdf is: -

-

which copies in.pdf to out.pdf. The input and output may be the same file. Of course, we should like -to do more interesting things to the PDF file than that! -

Files on the command line are distinguished from other input by their containing a period. If an -input file does not contain a period, it should be preceded by -i. For example: -

-

A whole directory of files may be added (where a command supports multiple files) by using the -idir -option: -

-

The files in the directory myfiles are considered in alphabetical order. They must all be PDF files. If -the names of the files are numeric, leading zeroes will be required for the order to be correct (e.g -001.pdf, 002.pdf etc). -
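The two naming rules above can be sketched in a script that prepares arguments for cpdf. This is a minimal illustration, not part of cpdf: the helper names `input_args` and `zero_padded` are hypothetical, and Python's `sorted` is assumed to approximate the alphabetical ordering cpdf applies to a directory.

```python
def input_args(name):
    """Return the argument list naming one input file.

    Names containing a period are recognized as files directly;
    anything else has to be introduced with -i.
    """
    return [name] if "." in name else ["-i", name]


def zero_padded(names, width=3):
    """Rename purely numeric stems such as '2.pdf' to '002.pdf'."""
    padded = []
    for name in names:
        stem, dot, ext = name.rpartition(".")
        padded.append(stem.zfill(width) + dot + ext if stem.isdigit() else name)
    return padded


names = ["1.pdf", "10.pdf", "2.pdf"]
print(sorted(names))               # alphabetical: 10.pdf sorts before 2.pdf
print(sorted(zero_padded(names)))  # padded names sort in numeric order
print(input_args("in.pdf"), input_args("in"))
```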

-

1.2 Input Ranges

-

An input range may be specified after each input file. This is treated differently by each operation. -For instance -

-

extracts pages two, three, four and five from in.pdf, writing the result to out.pdf, assuming that -in.pdf contains at least five pages. Here are the rules for building input ranges: - - -

-

For example: -

- - - -

-

1.3 Working with Encrypted Documents

- - - -

In order to perform many operations, encrypted input PDF files must be decrypted. Some require the -owner password, some either the user or owner passwords. Either password is supplied -by writing user=<password> or owner=<password> following each input file requiring it -(before or after any range). The document will not be re-encrypted upon writing. For -example: -

-

To re-encrypt the file with its existing encryption upon writing, which is required if only the user -password was supplied, but allowed in any case, add the -recrypt option: -

-

The password required (owner or user) depends upon the operation being performed. Separate -facilities are provided to decrypt and encrypt files (See Section 4). -

-

1.4 Standard Input and Standard Output

- - -

Thus far, we have assumed that the input PDF will be read from a file on disk, and the output written similarly. Often it’s useful to be able to read input from stdin (Standard Input) or write output to stdout (Standard Output) instead. The typical use is to join several programs together into a pipe, passing data from one to the next without the use of intermediate files. Use -stdin to read from standard input, and -stdout to write to standard output, either to pipe data between multiple programs, or multiple invocations of the same program. For example, this sequence of commands (all typed on one line) -

-

extracts the last five pages of in.pdf in the correct order, writing them to out.pdf. It does this by -reversing the input, taking the first five pages and then reversing the result. -

To supply passwords for a file from -stdin, use -stdin-owner <password> and/or -stdin-user -<password>. -

Using -stdout on the final command in the pipeline to output the PDF to screen is not -recommended, since PDF files often contain compressed sections which are not screen-readable. -

Several cpdf operations write to standard output by default (for example, listing fonts). A useful -feature of the command line (not specific to cpdf) is the ability to redirect this output to a file. This is -achieved with the > operator: - - -

-

-

1.5 Doing Several Things at Once with AND

-

The keyword AND can be used to string together several commands in one. The advantage compared -with using pipes is that the file need not be repeatedly parsed and written out, saving -time. -

To use AND, simply leave off the output specifier (e.g -o) of one command, and the input specifier -(e.g filename) of the next. For instance: -

-

To specify the range for each section, use -range: -

-

-

1.6 Units

- -

When measurements are given to cpdf, they are in points (1 point = 1/72 inch). They -may optionally be followed by some letters to change the measurement. The following are -supported: -

- - -


- - -
-pt  Points (72 points per inch). The default.
-cm  Centimeters
-mm  Millimeters
-in  Inches
-
- - -

-
-

For example, one may write 14mm or 21.6in. In addition, the following letters stand, in some -operations (-scale-page, -scale-to-fit, -scale-contents, -shift, -mediabox, -crop) for various -page dimensions: -

- - -


- - -
-   PW  Page width
-   PH  Page height
-PMINX  Page minimum x coordinate
-PMINY  Page minimum y coordinate
-PMAXX  Page maximum  x coordinate
-PMAXY  Page maximum  y coordinate
-   CW  Crop box width
-   CH  Crop box height
-CMINX  Crop box minimum  x coordinate
-CMINY  Crop box minimum  y coordinate
-CMAXX  Crop box maximum  x coordinate
-CMAXY  Crop box maximum  y coordinate
-
- - -

-
-

For example, we may write PMINX PMINY to stand for the coordinate of the lower left corner of the -page. -

Simple arithmetic may be performed using the words add, sub, mul and div to stand for addition, -subtraction, multiplication and division. For example, one may write 14insub30pt or PMINXmul -2 -
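The unit suffixes and arithmetic words can be worked through numerically. The sketch below is not cpdf's own parser: the names `to_points` and `evaluate` are hypothetical, only a single operation per expression is handled, and the page-dimension names (PW, PMINX and so on) are left out.

```python
import re

# Points per unit: 72 points per inch, 2.54 cm per inch.
POINTS_PER_UNIT = {"pt": 1.0, "in": 72.0, "cm": 72.0 / 2.54, "mm": 72.0 / 25.4}

OPERATIONS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
    "div": lambda a, b: a / b,
}


def to_points(measure):
    """Convert a measurement such as '14mm' or '21.6in' to points."""
    match = re.fullmatch(r"(-?\d+(?:\.\d+)?)(pt|cm|mm|in)?", measure)
    if match is None:
        raise ValueError("not a measurement: " + measure)
    return float(match.group(1)) * POINTS_PER_UNIT[match.group(2) or "pt"]


def evaluate(expression):
    """Evaluate a measurement or one binary expression such as '14insub30pt'."""
    for word, operation in OPERATIONS.items():
        left, found, right = expression.partition(word)
        if found:
            return operation(to_points(left), to_points(right))
    return to_points(expression)


print(to_points("1in"))         # 72.0
print(evaluate("14insub30pt"))  # 978.0
```

A bare number is treated as points, which is the default stated above.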

1.7 Setting the Producer and Creator

-

The -producer and -creator options may be added to any cpdf command line to set the -producer and/or creator of the PDF file. If the file was converted from another format, the -creator is the program producing the original, the producer the program converting it to -PDF. -

-

-

1.8 PDF Version Numbers

- -

When an operation uses a part of the PDF standard which was introduced in a later version than that of the input file, the PDF version in the output file is set to the later version (most PDF viewers will try to load any PDF file, even if it is marked with a later version number). However, this automatic version changing may be suppressed with the -keep-version flag. -

Here is a list of Acrobat versions together with the maximum PDF version they are intended to -support: -

- PDF 1.2  Acrobat 3.0
-PDF 1.3  Acrobat 4.0
-PDF 1.4  Acrobat 5.0
-PDF 1.5  Acrobat 6.0
-PDF 1.6  Acrobat 7.0
-PDF 1.7  Acrobat 8.0, 9.0, 10.0
-

If you wish to manually alter the PDF version of a file, use the -set-version option described in -Section 15.5. -

- - -

1.9 File IDs

-

PDF files contain an ID (consisting of two parts), used by some workflow systems to uniquely identify a file. To change the ID, use the -change-id operation. This will create a new ID for the output file. -

-

-

1.10 Linearization

- -

Linearized PDF is a version of the PDF format in which the data is held in a special manner to allow -content to be fetched only when needed. This means viewing a multipage PDF over a slow connection -is more responsive. By default, cpdf does not linearize output files. To make it do so, add -the -l option to the command line, in addition to any other command being used. For -example: -

-

This requires the existence of the external program cpdflin which is provided with commercial -versions of cpdf. This must be installed as described in the installation documentation provided with -your copy of cpdf. If you are unable to install cpdflin, you must use -cpdflin to let cpdf know -where to find it: -

-

In extremis, you may place cpdflin and its resources in the current working directory, though this -is not recommended. For further help, refer to the installation instructions for your copy of -cpdf. -

To keep the existing linearization status of a file (produce linearized output if the input is -linearized and the reverse), use -keep-l instead of -l. -

-

1.11 Object Streams

-

PDF 1.5 introduced a new mechanism for storing objects to save space: object streams. By default, cpdf will preserve object streams in input files, creating no more. To prevent the retention of existing object streams, use -no-preserve-objstm: -

-

To create new object streams if none exist, or augment the existing ones, use -create-objstm: - - -

-

To create wholly new object streams, use both options together: -

-

Files written with object streams will be set to PDF 1.5 or higher, unless -keep-version is used (see -above). -

-

1.12 Malformed Files

-

There are many malformed PDF files in existence, including many produced by otherwise-reputable -applications. cpdf attempts to correct these problems silently. -

Grossly malformed files will be reconstructed. The reconstruction progress is shown on stderr -(Standard Error): -

-

Sometimes files can be technically well-formed but use inefficient PDF constructs. If you are sure the input files you are using are impeccably formed, the -fast option may be added to the command line (or, if using AND, to each section of the command line). This will use certain shortcuts which speed up processing, but would fail on badly-produced files. -

The -fast option may be used with: -

-

If problems occur, refrain from using -fast. -

-

1.13 Error Handling

- -

When cpdf encounters an error, it exits with code 2. An error message is displayed on stderr -(Standard Error). In normal usage, this means it’s displayed on the screen. When a bad or -inappropriate password is given, the exit code is 1. - - -
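A calling script can branch on these exit codes. The following is a sketch under stated assumptions: the helper names are hypothetical, exit code 0 is assumed to mean success (the usual convention, not stated above), and `run_cpdf` assumes a `cpdf` executable is on the search path.

```python
import subprocess


def describe_exit(code):
    """Map a cpdf exit code to a short description."""
    if code == 0:
        return "success"  # assumption: conventional meaning of 0
    if code == 1:
        return "bad or inappropriate password"
    if code == 2:
        return "error"
    return "unexpected exit code"


def run_cpdf(args, executable="cpdf"):
    """Run cpdf and return (exit code, message written to standard error)."""
    result = subprocess.run([executable, *args], capture_output=True, text=True)
    return result.returncode, result.stderr


print(describe_exit(1))
print(describe_exit(2))
```

Capturing standard error keeps the message available to the caller instead of leaving it on the screen.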

-

1.14 Control Files

- -

-

Some operating systems have a limit on the length of a command line. To circumvent this, or -simply for reasons of flexibility, a control file may be specified from which arguments are drawn. This -file does not support the full syntax of the command line. Commands are separated by whitespace, -quotation marks may be used if an argument contains a space, and the sequence \" may be used to -introduce a genuine quotation mark in such an argument. -

Several -control arguments may be specified, and may be mixed in with conventional -command-line arguments. The commands in each control file are considered in the order in which they -are given, after all conventional arguments have been processed. It is recommended to use -args in all -new applications. However, -control will be supported for legacy applications. -

To avoid interference between -control and AND, a new mechanism has been added. Using -args -in place of -control will perform direct textual substitution of the file into the command line, prior to -any other processing. -
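The quoting rules for such a file can be sketched as a small writer. This is an illustration only: `quote_arg` and `control_file_text` are hypothetical helpers, and quoting every argument that contains a quotation mark (not just those containing a space) is an assumption made to keep the escape sequence inside a quoted argument.

```python
def quote_arg(arg):
    """Quote one argument for a control file.

    Arguments containing whitespace are wrapped in quotation marks, and a
    genuine quotation mark inside such an argument is written as \\".
    """
    if any(ch.isspace() for ch in arg) or '"' in arg:
        return '"' + arg.replace('"', '\\"') + '"'
    return arg


def control_file_text(args):
    """Join arguments with single spaces, one command line per file."""
    return " ".join(quote_arg(arg) for arg in args) + "\n"


text = control_file_text(["-add-text", "Page one", "in.pdf", "-o", "out.pdf"])
print(text, end="")
```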

-

1.15 String Arguments

-

Command lines are handled differently on each operating system. Some characters are reserved with -special meanings, even when they occur inside quoted string arguments. To avoid this problem, -cpdf performs processing on string arguments as they are read. -

A backslash is used to indicate that a character which would otherwise be treated specially by the -command line interpreter is to be treated literally. For example, Unix-like systems attribute a special -meaning to the exclamation mark, so the command line -

-

would fail. We must escape the exclamation mark with a backslash: -

-

It follows that backslashes intended to be taken literally must themselves be escaped (i.e. written -\\). -

-

1.16 Text Encodings

- -

Some cpdf commands write text to standard output, or read text from the command line or -configuration files. These are: - - -

-

There are three options to control how the text is interpreted: -

-

Add -utf8 to use Unicode UTF8, -stripped to convert to 7 bit ASCII by dropping any high -characters, or -raw to perform no processing. The default is -stripped. -
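The effect of the default can be imitated in a few lines. This sketch assumes that "high characters" means every character above code 127; the function name `stripped` is hypothetical and is not cpdf's implementation.

```python
def stripped(text):
    """Drop every character outside the 7 bit ASCII range."""
    return "".join(ch for ch in text if ord(ch) < 128)


title = "Résumé 2017"
print(stripped(title))        # accented letters are dropped, not replaced
print(title.encode("utf-8"))  # the bytes a UTF8 consumer would receive
```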

-

1.17 Font Embedding

-

Use the -no-embed-font option to avoid embedding the Standard 14 Font metrics when adding text with -add-text. - - -

Chapter 2
Merging and Splitting

-

2.1 Merging

- -

The -merge operation allows the merging of several files into one. Ranges can be used to select only a subset of pages from each input file in the output. The output file consists of the concatenation of all the input pages in the order specified on the command line. Actually, the -merge can be omitted, since this is the default operation of cpdf. -

-

Merge maintains bookmarks, named destinations, and name dictionaries. -

Forms and other objects which cannot be merged are retained if they are from the document which -first exhibits that feature. -

The -retain-numbering option keeps the PDF page numbering labels of each document intact, -rather than renumbering the output pages from 1. -

The -remove-duplicate-fonts option ensures that fonts used in more than one of the inputs only appear once in the output. -

-

2.2 Splitting

- -

The -split operation splits a PDF file into a number of parts which are written to file, their names -being generated from a format. The optional -chunk option allows the number of pages written to -each output file to be set. -

-

If the output format does not provide enough numbers for the files generated, the result is unspecified. -The following format operators may be used: -

- - -


- - -
-%, %%, %%% etc.  Sequence number padded to the number of percent signs
-             @F  Original filename without extension
-             @N  Sequence number without padding zeroes
-             @S  Start page of this chunk
-             @E  End page of this chunk
-             @B  Bookmark name  at this page
-
- - -

-
-
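To preview which file names a format would produce, the operators in the table can be expanded by hand. The sketch below is not cpdf's implementation: the function names are hypothetical, and it assumes that sequence numbers start at 1 and that a chunk holds one page unless stated otherwise.

```python
import os
import re


def chunk_ranges(page_count, chunk=1):
    """Return (start page, end page) pairs, one per output file."""
    return [(start, min(start + chunk - 1, page_count))
            for start in range(1, page_count + 1, chunk)]


def expand_format(fmt, seq, start, end, source, bookmark=""):
    """Expand %, %%, ... and the @ operators for one output file."""
    stem = os.path.splitext(os.path.basename(source))[0]
    values = {"@F": stem, "@N": str(seq), "@S": str(start),
              "@E": str(end), "@B": bookmark}

    def replace(match):
        token = match.group()
        if token.startswith("%"):
            # Pad the sequence number to the number of percent signs.
            return str(seq).zfill(len(token))
        return values[token]

    return re.sub(r"%+|@[FNSEB]", replace, fmt)


for seq, (start, end) in enumerate(chunk_ranges(5, chunk=2), start=1):
    print(expand_format("out%%%-@S-@E.pdf", seq, start, end, "in.pdf"))
```

Note that `zfill` never truncates, so a sequence number wider than the run of percent signs is simply printed in full here; cpdf leaves that case unspecified.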

2.3 Splitting on Bookmarks

- -

The -split-bookmarks <level> operation splits a PDF file into a number of parts, according to the -page ranges implied by the document’s bookmarks. These parts are then written to file with names -generated from the given format. -

Level 0 denotes the top-level bookmarks, level 1 the next level (sub-bookmarks) and so on. So --split-bookmarks 1 creates breaks on level 0 and level 1 boundaries. -

-

Now, there may be many bookmarks on a single page (for instance, if paragraphs are bookmarked or -there are two subsections on one page). The splits calculated by -split-bookmarks ensure that each -page appears in only one of the output files. It is possible to use the @ operators above, including -operator @B which expands to the text of the bookmark: -

-

The bookmark text used for a name is converted from unicode to 7 bit ASCII, and the -following characters are removed, in addition to any character with ASCII code less than -32: -

-

-

2.4 Encrypting with Split and Split Bookmarks

-

The encryption parameters described in Chapter 4 may be added to the command line to encrypt each split PDF. Similarly, the -recrypt switch described in Chapter 1 may be given to re-encrypt each file with the existing encryption of the source PDF. - - -

Chapter 3
Pages

-

3.1 Page Sizes

- -

Whenever a page size is required, instead of writing, for instance, "210mm 297mm", one can write a4portrait. Here is a list of supported page sizes: -

- a0portrait       a1portrait        a2portrait
-a3portrait       a4portrait        a5portrait
-a6portrait       a7portrait        a8portrait
-a9portrait       a10portrait
-a0landscape      a1landscape        a2landscape
-a3landscape      a4landscape        a5landscape
-a6landscape      a7landscape        a8landscape
-a9landscape      a10landscape
-
-usletterportrait  usletterlandscape
-uslegalportrait   uslegallandscape
-
-

3.2 Scale Pages

- -

The -scale-page operation scales each page in the range by the X and Y factors given. -This scales both the page contents, and the page size itself. It also scales any Crop Box - - -and other boxes (Art Box, Trim Box etc). As with several of these commands, remember -to take into account any page rotation when considering what the X and Y axes relate -to. -

-

The -scale-to-fit operation scales each page in the range to fit a given page size, preserving aspect -ratio and centering the result. -

-

The scale can optionally be set to a percentage of the available area, instead of filling it. -
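The geometry of fitting while preserving aspect ratio can be written out as follows. This is a sketch of the arithmetic only, not cpdf's code: the function name is hypothetical, and the optional percentage is assumed to multiply the fitted scale factor (passed here as a fraction between 0 and 1).

```python
def scale_to_fit(page_w, page_h, target_w, target_h, fraction=1.0):
    """Return (scale, x offset, y offset) for fitting a page into a target size.

    The smaller of the two ratios keeps the whole page inside the target;
    the offsets center the scaled page.
    """
    scale = min(target_w / page_w, target_h / page_h) * fraction
    offset_x = (target_w - page_w * scale) / 2
    offset_y = (target_h - page_h * scale) / 2
    return scale, offset_x, offset_y


print(scale_to_fit(100, 200, 200, 200))       # (1.0, 50.0, 0.0)
print(scale_to_fit(100, 200, 200, 200, 0.5))  # (0.5, 75.0, 50.0)
```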

-

The -scale-contents operation scales the contents about the center of the crop box (or, if absent, -the media box), leaving the page dimensions (boxes) unchanged. -

-

To scale about a point other than the center, one can use the positioning commands described in -Section 8.2.4. For example: -

-

-

3.3 Shift Page Contents

- -

The -shift operation shifts the contents of each page in the range by X points horizontally and Y -points vertically. -

-

-

3.4 Rotating Pages

- -

There are two ways of rotating pages: (1) setting a value in the PDF file which asks the viewer (e.g. Acrobat) to rotate the page on-the-fly when viewing it (use -rotate or -rotateby) and (2) actually rotating the page contents and/or the page dimensions (use -upright afterwards or -rotate-contents to just rotate the page contents). -

The possible values for -rotate and -rotateby are 0, 90, 180 and 270, all interpreted as being clockwise. Any value may be used for -rotate-contents. -

The -rotate operation sets the viewing rotation of the selected pages to the absolute value given. -

The -rotateby operation changes the viewing rotation of all the given pages by the relative value given. -

The -rotate-contents operation rotates the contents and dimensions of the page by the given relative value. -

-

The -upright operation does whatever combination of -rotate and -rotate-contents is -required to change the rotation of the document to zero without altering its appearance. -In addition, it makes sure the media box has its origin at (0,0), changing other boxes to -compensate. -
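The difference between the absolute and relative viewing rotations described in this section can be modeled with simple arithmetic. The function names below are hypothetical, and wrapping at 360 degrees is an assumption about how relative rotations combine.

```python
VIEWING_ROTATIONS = (0, 90, 180, 270)


def check_rotation(value):
    """Reject values that are not valid viewing rotations."""
    if value not in VIEWING_ROTATIONS:
        raise ValueError("viewing rotation must be 0, 90, 180 or 270")


def after_rotate(current, value):
    """Absolute: the new viewing rotation is the value given."""
    check_rotation(value)
    return value


def after_rotateby(current, value):
    """Relative: the value is added clockwise to the current rotation."""
    check_rotation(value)
    return (current + value) % 360


print(after_rotate(270, 90))     # 90
print(after_rotateby(270, 180))  # 90
```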

-

3.5 Flipping Pages

- -

The -hflip and -vflip operations flip the contents of the chosen pages horizontally or vertically. No -account is taken of the current page rotation when considering what ”horizontally” and ”vertically” -mean, so you may like to use -upright first. -

-

3.6 Boxes and Cropping

- - -

All PDF files contain a media box for each page, giving the dimensions of the paper. To -change these dimensions (without altering the page contents in any way), use the -mediabox -option. - - -

The four numbers are minimum x, minimum y, width, height. x coordinates increase to the right, y coordinates increase upwards. PDF files can also optionally contain a crop box for each page, defining to what extent the page is cropped before being displayed or printed. A crop box can be set, changed and removed, without affecting the underlying media box. To set or change the crop box use -crop. To remove any existing crop box, use -remove-crop. -

-

Note that the crop box is only obeyed in some viewers. -
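Since boxes are given as minimum x, minimum y, width and height, the maximum coordinates follow by addition. The sketch below relates those four numbers to the page-dimension names from Section 1.6; the function name is hypothetical, and treating the P-prefixed names as properties of the media box is an assumption.

```python
def page_dimensions(min_x, min_y, width, height):
    """Derive the page-dimension values from the four box numbers."""
    return {
        "PW": width,
        "PH": height,
        "PMINX": min_x,
        "PMINY": min_y,
        "PMAXX": min_x + width,   # x increases to the right
        "PMAXY": min_y + height,  # y increases upwards
    }


dims = page_dimensions(10, 20, 100, 50)
print(dims["PMAXX"], dims["PMAXY"])  # 110 70
```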

-This operation copies the contents of one box (Media box, Crop box, Trim box etc.) to another. If --mediabox-if-missing is added, the media box will be substituted when the ’from’ box is not set for -a given page. For example -

-copies the Trim Box of each page to the Crop Box of each page. The possible boxes are /MediaBox, -/CropBox, /BleedBox, /TrimBox, /ArtBox. - - -

Chapter 4
Encryption and Decryption

-

4.1 Introduction

-

PDF files can be encrypted using various types of encryption and attaching various permissions -describing what someone can do with a particular document (for instance, printing it or extracting -content). There are two types of person: -

-
The User can do to the document what is allowed in the permissions. -
-
The Owner can do anything, including altering the permissions or removing encryption - entirely.
-

There are five kinds of encryption: -

-

All encryption supports these kinds of permissions: -

- -no-edit   Cannot change the document
--no-print  Cannot print the document
--no-copy   Cannot select or copy text or graphics
--no-annot  Cannot add or change form fields or annotations
-
-

In addition, 128-bit encryption (Acrobat 5 and above) and AES encryption support these: -

- -no-forms     Cannot edit form fields
--no-extract   Cannot extract text or graphics
--no-assemble  Cannot merge files etc.
--no-hq-print  Cannot print high-quality
-
- - -

Add these flags to the command line to prevent each operation. -

-

4.2 Encrypting a Document

-

To encrypt a document, the owner and user passwords must be given (here, fred and charles -respectively): -

A blank user password is common. In this event, PDF viewers will typically not prompt for a password when opening the file or for operations allowable with the user password. -

In -addition, the usual method can be used to give the existing owner password, if the document is -already encrypted. -

When using AES encryption, the option is available to refrain from encrypting the metadata. Add --no-encrypt-metadata to the command line. -

-

4.3 Decrypting a Document

-

To decrypt a document, the owner password is provided. -

The -user password cannot decrypt a file. - - -

Chapter 5
Compression

-

-cpdf provides basic facilities for decompressing and compressing PDF streams. -

5.1 Decompressing a Document

- -

To decompress the streams in a PDF file, for instance to manually inspect the PDF, use: -

If -cpdf finds a compression type it can’t cope with, the stream is left compressed. When using --decompress, object streams are not compressed. -

5.2 Compressing a Document

- -

To compress the streams in a PDF file, use: -

-cpdf compresses any streams which have no compression using the FlateDecode method, with the -exception of Metadata streams, which are left uncompressed. -

-

5.3 Squeezing a Document

- -

To squeeze a PDF file, reducing its size by an average of about twenty percent (though sometimes not -at all), use: -

-Adding -squeeze to the command line when using another operation will squeeze the file or files upon -output. -

The -squeeze operation writes some information about the squeezing process to standard output. -The squeezing process involves several processes which losslessly attempt to reduce the file size. It is -slow, so should not be used without thought. - - -

-$ ./cpdf -squeeze in.pdf -o out.pdf - 
Beginning squeeze: 123847 objects - 
Squeezing... Down to 114860 objects - 
Squeezing... Down to 114842 objects - 
Squeezing page data - 
Recompressing document -
-

-

The -squeeze-log-to <filename> option writes the log to the given file instead of to standard -output. - - -

Chapter 6
Bookmarks

PDF Bookmarks (properly called the document outline) represent a tree of references to parts of the -file, typically displayed at the side of the screen. The user can click on one to move to the specified -place. cpdf provides facilities to list, add, and remove bookmarks. The format used by the list and -add operations is the same, so you can feed the output of one into the other, for instance to copy -bookmarks. -

6.1 List Bookmarks

- -

The -list-bookmarks operation prints (to standard output) the bookmarks in a file. The first column -gives the level of the tree at which a particular bookmark is. Then the text of the bookmark in quotes, -then the page number which the bookmark points to, then (optionally) the word ”open” if the -bookmark should have its children (at the level immediately below) visible when the file is loaded. For -example, upon executing -

-

the result might be: - - -

-0 "Part 1" 1 open - 
1 "Part 1A" 2 - 
1 "Part 1B" 3 - 
0 "Part 2" 4 - 
1 "Part 2a" 5 -
-

-

If the page number is 0, it indicates that clicking on that entry doesn’t move to a page. -

By default, cpdf converts unicode to ASCII text, dropping characters outside the ASCII range. To -prevent this, and return unicode UTF8 output, add the -utf8 option to the command. To prevent any -processing, use the -raw option. -
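A script that consumes this listing needs to split each line into its four fields. The parser below is a minimal sketch: the names `Bookmark` and `parse_bookmark` are hypothetical, and it assumes the bookmark text itself contains no unescaped quotation mark followed by a space and a number.

```python
import re
from typing import NamedTuple


class Bookmark(NamedTuple):
    level: int
    text: str
    page: int
    is_open: bool


# level, quoted text, page number, optional word "open"
LINE = re.compile(r'^(\d+) "(.*)" (\d+)( open)?$')


def parse_bookmark(line):
    """Parse one line of bookmark listing output."""
    match = LINE.match(line.strip())
    if match is None:
        raise ValueError("not a bookmark line: " + line)
    return Bookmark(int(match.group(1)), match.group(2),
                    int(match.group(3)), match.group(4) is not None)


print(parse_bookmark('0 "Part 1" 1 open'))
print(parse_bookmark('1 "Part 1A" 2'))
```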

-

6.2 Remove Bookmarks

- -

The -remove-bookmarks operation removes all bookmarks from the file. -

-

-

6.3 Add Bookmarks

- -

The -add-bookmarks file adds bookmarks as specified by a bookmarks file, a text file in ASCII or -UTF8 encoding and in the same format as that produced by the -list-bookmarks option. If there -are any bookmarks in the input PDF already, they are discarded. For example, if the file -bookmarks.txt contains the output from -list-bookmarks above, then the command - -adds the bookmarks to the input file, writing to out.pdf. An error will be given if the bookmarks file -is not in the correct form (in particular, the numbers in the first column which specify -the level must form a proper tree with no entry being more than one greater than the -last). - - -
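The tree condition on the first column can be checked before handing a bookmarks file to cpdf. This is a sketch with a hypothetical function name; requiring the first entry to be at level 0 is an assumption that follows from treating "no previous entry" as level -1.

```python
def check_levels(levels):
    """Check that bookmark levels form a proper tree.

    No entry may be more than one level deeper than the entry before it.
    """
    previous = -1
    for number, level in enumerate(levels, start=1):
        if level < 0 or level > previous + 1:
            raise ValueError(
                "entry %d: level %d cannot follow level %d"
                % (number, level, previous))
        previous = level
    return True


print(check_levels([0, 1, 1, 0, 1]))  # the levels of the listing shown earlier
```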

Chapter 7
Presentations

-

- -

The PDF file format, starting at Version 1.1, provides for simple slide-show presentations in the -manner of Microsoft Powerpoint. These can be played in Acrobat and possibly other PDF viewers, -typically started by entering full-screen mode. The -presentation operation allows such a -presentation to be built from any PDF file. -

The -trans option chooses the transition style. When a page range is used, it is the transition -from each page named which is altered. The following transition styles are available: -

-Split
Two lines sweep across the screen, revealing the new page. By default the lines are - horizontal. Vertical lines are selected by using the -vertical option. -
-Blinds
Multiple lines sweep across the screen, revealing the new page. By default the lines are - horizontal. Vertical lines are selected by using the -vertical option. -
-Box
A rectangular box sweeps inward from the edges of the page. Use -outward to make it - sweep from the center to the edges. -
-Wipe
A single line sweeps across the screen from one edge to the other in a direction specified - by the -direction option. -
-Dissolve
The old page dissolves gradually to reveal the new one. -
-Glitter
The same as Dissolve but the effect sweeps across the page in the direction specified - by the -direction option.
-

To remove a transition style currently applied to the selected pages, omit the -trans option. -

The -effect-duration option specifies the length of time in seconds for the transition itself. The -default value is one second. -

The -duration option specifies the maximum time in seconds that the page is displayed before the -presentation automatically advances. The default, in the absence of the -duration option, is for no -automatic advancement. -

The -direction option (for Wipe and Glitter styles only) specifies the direction of the effect. -The following values are valid: -

-

For example: -

-

To use different options on different page ranges, run cpdf multiple times on the file using a different -page range each time. - - -

Chapter 8
Watermarks and Stamps

-

-

8.1 Add a Watermark or Logo

-

The -stamp-on and -stamp-under operations stamp the first page of a source PDF onto or under -each page in the given range of the input file. For example, -

-stamps the file logo.pdf onto the odd pages of in.pdf, writing to out.pdf. A watermark should go -underneath each page: -

-

The position commands in Section 8.2.4 can be used to locate the stamp more precisely (they are -calculated relative to the crop box of the stamp). Or, preprocess the stamp with -shift -first. -

The -scale-stamp-to-fit option can be added to scale the stamp to fit the page before -applying it. The use of positioning commands together with -scale-stamp-to-fit is not -recommended. -

The -combine-pages operation takes two PDF files and stamps each page of one over each page -of the other. The length of the output is the same as the length of the “under” file. For -instance: - - -

-

Page attributes (such as the display rotation) are taken from the “under” file. For best results, remove -any rotation differences in the two files using -upright first. -

The -relative-to-cropbox option takes the positioning command to be relative to the crop box of -each page rather than the media box. -

-

8.2 Stamp Text, Dates and Times.

- - - -

The -add-text operation allows text, dates and times to be stamped over one or more pages of the -input at a given position and using a given font, font size and color. -

-

The default is black 12pt Times New Roman text in the top left of each page. The text can be placed -underneath rather than over the page by adding the -underneath option. -

Text previously added by cpdf may be removed by the -remove-text operation. -

-

-

8.2.1 Page Numbers

- -

There are various special codes to include the page number in the text: -

%Page      Page number in arabic notation (1, 2, 3...)
%roman     Page number in lower-case roman notation (i, ii, iii...)
%Roman     Page number in upper-case roman notation (I, II, III...)
%EndPage   Last page of document in arabic notation
%Label     The page label of the page
%EndLabel  The page label of the last page
%filename  The full file name of the input document
-
-

For example, the format "Page %Page of %EndPage" might become "Page 5 of 17".
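
The substitution of the arabic and roman page codes can be sketched in Python. This is only an illustration of the idea, not cpdf's own code: the function names `to_roman` and `expand_page_codes` are hypothetical, and the label and filename codes are left out.

```python
def to_roman(n: int, upper: bool = False) -> str:
    """Convert a positive integer to a roman numeral (lower-case by default)."""
    if n <= 0:
        raise ValueError("roman numerals need a positive integer")
    numerals = [
        (1000, "m"), (900, "cm"), (500, "d"), (400, "cd"),
        (100, "c"), (90, "xc"), (50, "l"), (40, "xl"),
        (10, "x"), (9, "ix"), (5, "v"), (4, "iv"), (1, "i"),
    ]
    parts = []
    for value, glyph in numerals:
        count, n = divmod(n, value)
        parts.append(glyph * count)
    text = "".join(parts)
    return text.upper() if upper else text


def expand_page_codes(fmt: str, page: int, end_page: int) -> str:
    """Substitute %Page, %EndPage, %roman and %Roman in a format string."""
    replacements = {
        "%EndPage": str(end_page),
        "%Page": str(page),
        "%roman": to_roman(page),
        "%Roman": to_roman(page, upper=True),
    }
    for code, value in replacements.items():
        fmt = fmt.replace(code, value)
    return fmt


print(expand_page_codes("Page %Page of %EndPage", 5, 17))  # Page 5 of 17
```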

NB: In some circumstances (e.g. in batch files) on Microsoft Windows, % is a special character and must be escaped (written as %%). Consult your local documentation for details.

-

8.2.2 Date and Time Formats

-
%a  Abbreviated weekday name (Sun, Mon etc.)
%A  Full weekday name (Sunday, Monday etc.)
%b  Abbreviated month name (Jan, Feb etc.)
%B  Full month name (January, February etc.)
%d  Day of the month (01–31)
%e  Day of the month (1–31)
%H  Hour in 24-hour clock (00–23)
%I  Hour in 12-hour clock (01–12)
%j  Day of the year (001–366)
%m  Month of the year (01–12)
%M  Minute of the hour (00–59)
%p  "a.m" or "p.m"
%S  Second of the minute (00–61)
%T  Same as %H:%M:%S
%u  Weekday (1–7, 1 = Monday)
%w  Weekday (0–6, 0 = Monday)
%Y  Year (0000–9999)
%%  The % character.
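
Several of these codes resemble the familiar strftime codes, so Python's `datetime.strftime` can serve as a rough preview of how a format expands. This is a stand-in under that assumption, not cpdf itself: individual codes (for instance %e, %p, %u and %w) may behave differently in cpdf, and the date used is arbitrary.

```python
from datetime import datetime

# An arbitrary example moment: 5 March 2017, 14:07:09.
stamp_time = datetime(2017, 3, 5, 14, 7, 9)

# Full weekday name, zero-padded day, full month name, four-digit year.
print(stamp_time.strftime("%A %d %B %Y"))  # Sunday 05 March 2017

# The expansion the table gives for %T.
print(stamp_time.strftime("%H:%M:%S"))     # 14:07:09

# Day of the year, zero-padded to three digits.
print(stamp_time.strftime("%j"))           # 064
```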
-
-

-

8.2.3 Bates Numbers

- -

Unique page identifiers can be specified by putting %Bates in the format. The starting point can be set with the -bates option. For example:

-

To specify that Bates numbering begins at the first page of the range, use -bates-at-range instead. This option must be specified after the range is specified. To pad the Bates number up to a given number of leading zeros, use -bates-pad-to in addition to either -bates or -bates-at-range.
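
The difference between the two starting points can be sketched as follows. The function `bates_numbers` is hypothetical, and it rests on two assumptions not spelled out above: that with -bates the count starts at page 1 of the document, and that the padding value is a minimum total width filled with leading zeros.

```python
def bates_numbers(page_range, start, pad_to=0, at_range=False):
    """Map each page number in an ascending range to its Bates string.

    Assumption: without at_range, page 1 of the document receives `start`;
    with at_range, the first page of the range receives `start`.
    Assumption: pad_to is a minimum width, filled with leading zeros.
    """
    first = page_range[0] if at_range else 1
    return {page: str(start + (page - first)).zfill(pad_to) for page in page_range}


# Pages 3 to 5, counting from the start of the document.
print(bates_numbers([3, 4, 5], start=100))
# Pages 3 to 5, counting from the start of the range, padded to six digits.
print(bates_numbers([3, 4, 5], start=100, pad_to=6, at_range=True))
```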

-

8.2.4 Position

-

The position of the text may be specified either in absolute terms:

-

Positions relative to certain common points can be set: -

-

No attempt is made to take account of the page rotation when interpreting the position, so -prerotate must be added to the command line if the file contains pages with a non-zero viewing rotation. This is equivalent to pre-processing the document with -upright.

The -relative-to-cropbox modifier can be added to the command line to make these measurements relative to the crop box instead of the media box.

The default position is equivalent to -topleft 100. -

The -midline option may be added to specify that the positioning commands above are to be considered relative to the midline of the text, rather than its baseline. Similarly, the -topline option may be used to specify that the position is taken relative to the top of the text.

-

8.2.5 Font and Size

- -

The font may be set with the -font option. The 14 Standard PDF fonts are available: -

Times-Roman
Times-Bold
Times-Italic
Times-BoldItalic
Helvetica
Helvetica-Bold
Helvetica-Oblique
Helvetica-BoldOblique
Courier
Courier-Bold
Courier-Oblique
Courier-BoldOblique
Symbol
ZapfDingbats
-
-

For example, page numbers in Times Italic can be achieved by: -

See Section 14.1 for how to use other fonts.

The font size can be altered with the -font-size option, which specifies the size in points:

-

-

8.2.6 Colors

- -

The -color option takes an RGB color, where red, green and blue components range between 0 and 1. The following values are predefined:

Color   R, G, B
white   1, 1, 1
black   0, 0, 0
red     1, 0, 0
green   0, 1, 0
blue    0, 0, 1
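
As an illustration, the predefined names and the 0 to 1 component range can be modelled like this. The helper `parse_color` is hypothetical, and the space-separated form for explicit components is an assumption made for this sketch rather than a statement about cpdf's argument syntax.

```python
PREDEFINED_COLORS = {
    "white": (1.0, 1.0, 1.0),
    "black": (0.0, 0.0, 0.0),
    "red": (1.0, 0.0, 0.0),
    "green": (0.0, 1.0, 0.0),
    "blue": (0.0, 0.0, 1.0),
}


def parse_color(spec: str) -> tuple:
    """Return an (r, g, b) triple from a predefined name or three numbers.

    Assumption: explicit components are written as "r g b", separated by spaces.
    """
    if spec in PREDEFINED_COLORS:
        return PREDEFINED_COLORS[spec]
    components = tuple(float(part) for part in spec.split())
    if len(components) != 3 or any(c < 0.0 or c > 1.0 for c in components):
        raise ValueError("expected three components between 0 and 1")
    return components


print(parse_color("red"))          # (1.0, 0.0, 0.0)
print(parse_color("0.5 0.5 0.5"))  # (0.5, 0.5, 0.5)
```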
-
-

-

Partly-transparent text may be specified using the -opacity option. Wholly opaque is 1 and wholly transparent is 0. For example:

- - -

-

8.2.7 Outline Text

- -

The -outline option sets outline text. The line width (default 1pt) may be set with the -linewidth option. For example, to stamp documents as drafts:

-

-

8.2.8 Multi-line Text

-

The code \n can be included in the text string to move to the next line. In this case, the vertical position refers to the baseline of the first line of text (if the position is at the top, top left or top right of the page) or the baseline of the last line of text (if the position is at the bottom, bottom left or bottom right).

-

The -midline option may be used, as usual, to make these vertical positions relative to the midline of a line of text rather than the baseline.

The -line-spacing option can be used to increase or decrease the line spacing, where a spacing of 1 is the standard.

-

Justification of multiple lines is handled by the -justify-left, -justify-right and -justify-center options. The defaults are left justification for positions relative to the left hand side of the page, right justification for those relative to the right, and center justification for positions relative to the center of the page. For example:

-

-

8.2.9 Special Characters

-

If your command line allows for the inclusion of Unicode characters, the input text will be considered as UTF8 by cpdf. Special characters which exist in the PDF WinAnsiEncoding Latin 1 code (such as many accented characters) will be reproduced in the PDF. This does not mean, however, that every special character can be reproduced. You must experiment.

For compatibility with previous versions of cpdf, special characters may be introduced manually with a backslash followed by the three-digit octal code of the character in the PDF WinAnsiEncoding Latin 1 Code. The full table is included in Appendix D of the Adobe PDF Reference Manual, which is available at http://www.adobe.com/devnet/pdf/pdf_reference.html.

For example, a German sharp s (ß) may be introduced by \337.
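
To see why \337 gives ß, note that octal 337 is decimal 223, the code of ß in Latin 1. A minimal sketch of expanding such escapes follows; the function `expand_octal_escapes` is hypothetical, and it uses plain Latin 1 code points as an approximation, since WinAnsiEncoding differs from Latin 1 for some codes.

```python
import re


def expand_octal_escapes(text: str) -> str:
    """Replace a backslash plus three octal digits (000-377) with that character.

    Approximation: the code is read as a Latin 1 code point.
    """
    return re.sub(
        r"\\([0-3][0-7]{2})",
        lambda match: chr(int(match.group(1), 8)),
        text,
    )


print(int("337", 8))                       # 223
print(expand_octal_escapes(r"Stra\337e"))  # Straße
```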

-

8.3 Stamping Graphics

-

A rectangle may be placed on one or more pages by using the -add-rectangle <size> command. Most of the options discussed above for text placement apply in the same way. For example:

-

This can be used to blank out or highlight part of the document. The following positioning options work as you would expect: -topleft, -top, -topright, -right, -bottomright, -bottom, -bottomleft, -left, -center. When using the option -pos-left "x y", the point (x, y) refers to the bottom-left of the rectangle. When using the option -pos-right "x y", the point (x, y) refers to the bottom-right of the rectangle. When using the option -pos-center "x y", the point (x, y) refers to the center of the rectangle. The options -diagonal and -reverse-diagonal have no meaning.

- - -

Chapter 9
Multipage Facilities

-

9.1 Two-up

- -

This facility puts multiple logical pages on a single physical page. The -twoup-stack operation puts two logical pages on each physical page, rotating them 90 degrees to do so. The new media box is thus larger. The -twoup operation does the same, but scales the new sides down so that the media box is unchanged.

-

9.2 Inserting Blank Pages

- -

Sometimes, for instance to get a printing arrangement right, it’s useful to be able to insert blank pages into a PDF file. cpdf can add blank pages before a given page or pages, or after. The pages in question are specified by a range in the usual way:

The dimensions of the padded page are derived from the boxes (media box, crop box etc.) of the page after or before which the padding is to be applied.

The -pad-every n operation places a blank page after every n pages, excluding any last one. For example…

…on a 9-page document adds a blank page after pages 3 and 6.

The -pad-multiple n operation adds blank pages so the document has a multiple of n pages. For example:

- - -
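
The arithmetic behind -pad-every and -pad-multiple can be sketched as below. Both function names are hypothetical, and "excluding any last one" is read here as meaning that no blank page is added after the final page of the document.

```python
def pad_every_positions(page_count: int, n: int) -> list:
    """Pages after which a blank page is inserted: every n-th page except the last."""
    return list(range(n, page_count, n))


def pad_multiple_blanks(page_count: int, n: int) -> int:
    """Number of blank pages needed to reach the next multiple of n."""
    return -page_count % n


print(pad_every_positions(9, 3))  # [3, 6]
print(pad_multiple_blanks(9, 4))  # 3, giving a 12-page document
```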

- - -

Chapter 10
Annotations

-

10.1 List Annotations

- -

The -list-annotations operation prints the textual content of any annotations on the selected pages to standard output. Each annotation is preceded by the page number and followed by a newline.

-

-

10.2 Copy Annotations

- -

The -copy-annotations operation copies the annotations in the given page range from one file (the file specified immediately after the option) to another pre-existing PDF. The range is specified after this pre-existing PDF. The result is then written to an output file, specified in the usual way.

-

10.3 Remove Annotations

- -

The -remove-annotations operation removes all annotations from the given page range. -

- - -

- - -

Chapter 11
Document Information and Metadata

-

-

11.1 Listing Fonts

- -

The -list-fonts operation prints the fonts in the document, one per line, to standard output. For example:

-

The first column gives the page number, the second the internal unique font name, the third the type of font (Type1, TrueType etc.), the fourth the PDF font name, the fifth the PDF font encoding.

-

11.2 Reading Document Information

-

The -info option prints entries from the document information dictionary, and from any XMP metadata, to standard output.

-

The details of the format for creation and modification dates can be found in Appendix A. -

By default, cpdf strips to ASCII, discarding character codes in excess of 127. In order to preserve the original Unicode, add the -utf8 option. To disable all postprocessing of the string, add -raw.
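
The effect of the default ASCII stripping can be pictured with a short sketch. The function `strip_to_ascii` is hypothetical and only mirrors the rule stated above (codes above 127 are discarded); it is not cpdf's own implementation.

```python
def strip_to_ascii(text: str) -> str:
    """Drop every character whose code exceeds 127."""
    return "".join(ch for ch in text if ord(ch) <= 127)


print(strip_to_ascii("Café menu"))  # Caf menu
```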

The -page-info option prints the page label, media box and other boxes page-by-page to standard output, for all pages in the current range.

-

Note that the format for boxes is minimum x, minimum y, maximum x, maximum y. -
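
Given that ordering, the width and height of a box follow by subtraction. A small sketch, with a hypothetical helper `box_size` and example values chosen only for illustration:

```python
def box_size(box):
    """Return (width, height) for a box given as (min x, min y, max x, max y)."""
    min_x, min_y, max_x, max_y = box
    return (max_x - min_x, max_y - min_y)


print(box_size((0, 0, 595, 842)))     # (595, 842)
print(box_size((36, 36, 559, 806)))   # (523, 770)
```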

The -pages operation prints the number of pages in the file. -

-

-

11.3 Setting Document Information

- - -

The document information dictionary in a PDF file specifies various pieces of information about a PDF. These can be consulted in a PDF viewer (for instance, Acrobat).

Here is a summary of the commands for setting entries in the document information dictionary:

-

(The details of the format for creation and modification dates can be found in Appendix A. Using the date "now" uses the time and date at which the command is executed. Note also that -producer and -creator may be used to set the producer and/or the creator when writing any file, separate from the operations described in this chapter.)

For example, to set the title, the full command line would be -

-

The text string is considered to be in UTF8 format unless the -raw option is added, in which case it is unprocessed, save for the replacement of any octal escape sequence such as \017, which is replaced by the character with that value (here, 15).

-

11.4 Upon Opening a Document

-

-

11.4.1 Page Layout

- -

The -set-page-layout option specifies the page layout to be used when a document is opened in, for instance, Acrobat. The possible (case-sensitive) values are:

SinglePage      Display one page at a time
OneColumn       Display the pages in one column
TwoColumnLeft   Display the pages in two columns, odd numbered pages on the left
TwoColumnRight  Display the pages in two columns, even numbered pages on the left
TwoPageLeft     (PDF 1.5 and above) Display the pages two at a time, odd numbered pages on the left
TwoPageRight    (PDF 1.5 and above) Display the pages two at a time, even numbered pages on the left
-

-

For instance: -

-

-

11.4.2 Page Mode

- -

The page mode in a PDF file defines how a viewer should display the document when first opened. The possible (case-sensitive) values are:

UseNone         Neither document outline nor thumbnail images visible
UseOutlines     Document outline (bookmarks) visible
UseThumbs       Thumbnail images visible
FullScreen      Full-screen mode (no menu bar, window controls, or anything but the document visible)
UseOC           (PDF 1.5 and above) Optional content group panel visible
UseAttachments  (PDF 1.5 and above) Attachments panel visible
-

-

For instance: -

-

-

11.4.3 Display Options

-
-hide-toolbar       Hide the viewer’s toolbar
-hide-menubar       Hide the viewer’s menu bar
-hide-window-ui     Hide the viewer’s scroll bars
-fit-window         Resize the document’s window to fit the size of the first page
-center-window      Position the document window in the center of the screen
-display-doc-title  Display the document title instead of the file name in the title bar
-

-

For instance: -

-

The page a PDF file opens at can be set using -open-at-page: -

-

To have that page scaled to fit the window in the viewer, use -open-at-page-fit instead: -

-

-

11.5 Metadata

- -

PDF files can contain a piece of arbitrary metadata, often in XMP format. This is typically stored in an uncompressed stream, so that other applications can read it without having to decode the whole PDF. To set the metadata:

-

To remove any metadata: -

To print the current metadata to standard output:

-

-

11.6 Page Labels

- - -

It is possible to add page labels to a document. These are not printed on the page, but may be displayed alongside thumbnails or in print dialogue boxes by PDF readers. We use -add-page-labels to do this, by default with decimal arabic numbers (1,2,3…). We can add -label-style to choose what type of labels to add from these kinds:

DecimalArabic      1,2,3,4,5...
LowercaseRoman     i,ii,iii,iv,v...
UppercaseRoman     I,II,III,IV,V...
LowercaseLetters   a,b,c,...,z,aa,bb...
UppercaseLetters   A,B,C,...,Z,AA,BB...
NoLabelPrefixOnly  No number, but a prefix will be used if defined.
-

We can use -label-prefix to add a textual prefix to each label. Consider a file with twenty pages and no current page labels (a PDF reader will assume 1,2,3… if there are none). We will add the following page labels:

i, ii, iii, iv, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, A-0, A-1, A-2, A-3, A-4, A-5 -

Here are the commands, in order: -

-

By default the labels begin at page number 1 for each range. To override this, we can use -label-startval (we used 0 in the final command, where we want the numbers to begin at zero rather than one).
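
The way style, prefix and start value combine can be modelled in a few lines of Python. This is a sketch of the labelling rules described above, not of cpdf itself: the helper names are hypothetical, and the three ranges (pages 1 to 4, 5 to 14 and 15 to 20) are inferred from the list of labels in the example.

```python
def to_roman(n):
    """Lower-case roman numeral for a positive integer."""
    pairs = [(1000, "m"), (900, "cm"), (500, "d"), (400, "cd"), (100, "c"),
             (90, "xc"), (50, "l"), (40, "xl"), (10, "x"), (9, "ix"),
             (5, "v"), (4, "iv"), (1, "i")]
    out = ""
    for value, glyph in pairs:
        while n >= value:
            out += glyph
            n -= value
    return out


def to_letters(n):
    """a to z for 1 to 26, then aa, bb, ... with the letter repeated."""
    letter = chr(ord("a") + (n - 1) % 26)
    return letter * ((n - 1) // 26 + 1)


STYLES = {
    "DecimalArabic": str,
    "LowercaseRoman": to_roman,
    "UppercaseRoman": lambda n: to_roman(n).upper(),
    "LowercaseLetters": to_letters,
    "UppercaseLetters": lambda n: to_letters(n).upper(),
    "NoLabelPrefixOnly": lambda n: "",
}


def make_labels(page_count, style="DecimalArabic", prefix="", startval=1):
    """Labels for one range of page_count pages, numbered from startval."""
    render = STYLES[style]
    return [prefix + render(startval + i) for i in range(page_count)]


labels = (make_labels(4, "LowercaseRoman")            # pages 1-4
          + make_labels(10)                           # pages 5-14
          + make_labels(6, prefix="A-", startval=0))  # pages 15-20
print(", ".join(labels))
```

Running it prints the twenty labels listed in the example, from i through A-5.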

Page labels may be removed altogether by using the -remove-page-labels command. To print the page labels from an existing file, use -print-page-labels. For example:

- - -

- - -

Chapter 12
File Attachments

-

-

PDF supports adding attachments (files of any kind, including other PDFs) to an existing file. The cpdf tool supports adding and removing document-level attachments (that is, ones which are associated with the document as a whole rather than with an individual page) and also page-level attachments, associated with a particular page.

12.1 Adding Attachments

- -

To add an attachment, use the -attach-file option. For instance, -

attaches the Excel spreadsheet sheet.xls to the input file. If the file already has attachments, the new file is added to their number. You can specify multiple files to be attached by using -attach-file multiple times. They will be attached in the given order.

The -to-page option can be used to specify that the files will be attached to the given page, rather than at the document level. The -to-page option may be specified at most once.

-

12.2 Listing Attachments

- -

To list all document- and page-level attachments, use the -list-attached-files operation. The page number and filename of each attachment are given, page 0 representing a document-level attachment.

-

-

12.3 Removing Attachments

- -

To remove all document-level and page-level attachments from a file, use the -remove-files operation:

- - -

- - -

Chapter 13
Working with Images

-

-

13.1 Detecting Low-resolution Images

-

To list all images in the given range of pages which fall below a given resolution (in dots-per-inch), use the -image-resolution function:

-

-

The format is page number, image name, x pixels, y pixels, x resolution, y resolution. The resolutions refer to the image’s effective resolution at point of use (taking account of scaling, rotation etc.).
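
Effective resolution is the pixel count divided by the size at which the image is drawn, in inches. The sketch below assumes the drawn size is known in points (72 points to the inch) and ignores rotation; the function names and the sample figures are hypothetical.

```python
POINTS_PER_INCH = 72.0


def effective_dpi(pixels: int, displayed_points: float) -> float:
    """Pixels per inch along one axis, given the drawn size in points."""
    return pixels / (displayed_points / POINTS_PER_INCH)


def below_threshold(x_pixels, y_pixels, width_pt, height_pt, min_dpi):
    """True if either axis falls below the requested resolution."""
    return (effective_dpi(x_pixels, width_pt) < min_dpi
            or effective_dpi(y_pixels, height_pt) < min_dpi)


# A 300 x 300 pixel image drawn two inches (144 points) square.
print(effective_dpi(300, 144))                   # 150.0
print(below_threshold(300, 300, 144, 144, 300))  # True
```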

- - -

Chapter 14
Fonts

-

-

14.1 Copying Fonts

-

In order to use a font other than the standard 14 with -add-text, it must be added to the file. The font source PDF is given, together with the font’s resource name on a given page, and that font is copied to all the pages in the input file’s range, and then written to the output file.

The font is named in the output file with its basefont name, so it can be easily used with -add-text.

For example, if the file fromfile.pdf has a font /GHLIGA+c128 with the name /F10 on page 1 (this information can be found with -list-fonts), the following would copy the font to the file in.pdf on all pages, writing the output to out.pdf:

-

Text in this font can then be added by giving -font /GHLIGA+c128. Be aware that due to the vagaries of PDF font handling concerning which characters are present in the source font, not all characters may be available, or the encoding (mapping from input codes to glyphs) may be non-obvious.

-

14.2 Removing Fonts

-

To remove embedded fonts from a document, use -remove-fonts. PDF readers will substitute local fonts for the missing fonts. The use of this function is only recommended when file size is the sole consideration.

-

-

14.3 Listing Missing Fonts

-

The -missing-fonts operation lists any unembedded fonts in the document, one per line. -

-

The format is -

- - -

- - -

Chapter 15
Miscellaneous

-

-

15.1 Draft Documents

- -

The -draft option removes bitmap (photographic) images from a file, so that it can be printed with less ink. Optionally, the -boxes option can be added, filling the spaces left blank with a crossed box denoting where the image was. This is not guaranteed to be fully visible in all cases (the bitmap may have been partially covered by vector objects or clipped in the original). For example:

-

-

15.2 Blackening Text, Lines and Fills

- -

Sometimes PDF output from an application (for instance, a web browser) has text in colors which would not print well on a grayscale printer. The -blacktext operation blackens all text on the given pages so it will be readable when printed.

This will not work on text which has been converted to outlines, nor on text which is part of a form.

- -

The -blacklines operation blackens all lines on the given pages. -

- -

The -blackfills operation blackens all fills on the given pages. - - -

-

Contrary to their names, all these operations can use another color, if specified with -color. -

-

15.3 Hairline Removal

- -

Quite often, applications will use very thin lines, or even the value of 0, which in PDF means “the thinnest possible line on the output device”. This might be fine for on-screen work, but when printed on a high resolution device, such as by a commercial printer, they may be too faint, or disappear altogether. The -thinlines option prevents this by changing all lines thinner than <minimal thickness> to the given thickness. For example:

-
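
The rule amounts to raising every line width below the minimum up to that minimum, leaving thicker lines alone. A sketch, with a hypothetical function `thicken` and widths taken to be in points for illustration:

```python
def thicken(widths, minimum):
    """Raise any line width below `minimum` to `minimum`; leave the rest alone."""
    return [max(width, minimum) for width in widths]


# A zero-width hairline, a very thin line, and two lines already thick enough.
print(thicken([0, 0.1, 0.5, 2], 0.2))  # [0.2, 0.2, 0.5, 2]
```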

-

15.4 Garbage Collection

- -

Incremental updates to a file by an application, or badly behaved applications, can sometimes leave data in a PDF file which is no longer used. This function removes that unneeded data.

-

-

15.5 Change PDF Version Number

- -

To change the PDF version number, use the -set-version operation, giving the part of the version number after the decimal point. For example:

This does not alter any of the actual data in the file, only the declared version number.

-

15.6 Copy ID

- -

The -copy-id-from operation copies the ID from the given file to the input, writing to the output.

-

If there is no ID in the source file, the existing ID is retained. You cannot use -recrypt with -copy-id-from.

-

15.7 Remove ID

- -

The -remove-id operation removes the ID from a document. -

-

You cannot use -recrypt with -remove-id. -

-

15.8 List Spot Colours

-

This operation lists the name of any “separation” color space in the given PDF file. -

-

-

15.9 Removing Dictionary Entries

-

This is for editing data within the PDF’s internal representation. Use with caution. -

-

-

15.10 Remove Clipping

-

The -remove-clipping operation removes any clipping paths from the file. -

- - -

- - - -

Appendix A
Dates

Dates in PDF are specified according to the following format: -

-

A contiguous prefix of the parts above can be used instead, for lower accuracy dates. For example:

- - - - - - - - - -