diff --git a/html_manual/cpdfmanual.html b/html_manual/cpdfmanual.html
index dd05156..55e0bf1 100644
--- a/html_manual/cpdfmanual.html
+++ b/html_manual/cpdfmanual.html
@@ -12,37 +12,37 @@
-
+
-
+ Coherent PDF
- Command Line Toolkit
- User Manual Coherent Graphics Ltd
Version 2.2 (March 2017)
-
For bug reports, feature requests and comments, email
contact@coherentgraphics.co.uk
-
©2017 Coherent Graphics Limited. All rights reserved. ISBN 978-0957671140
- Adobe, Acrobat, Adobe PDF, Adobe Reader and PostScript are registered trademarks of Adobe
+ Adobe, Acrobat, Adobe PDF, Adobe Reader and PostScript are registered trademarks of Adobe
Systems Incorporated. Windows, Powerpoint and Excel are registered trademarks of Microsoft
Corporation.
-
+
Dates
+
+ + + +
- - -
- -
-
When describing the general form of a command, rather than a particular example, square brackets []
are used to enclose optional parts, and angled braces <> to enclose general descriptions which may be
substituted for particular instances. For example,
- describes a command line which requires an operation and, optionally, a range. An exception is that
+ describes a command line which requires an operation and, optionally, a range. An exception is that
we use in.pdf and out.pdf instead of <input file> and <output file> to reduce verbosity. Under Microsoft Windows, type cpdf.exe instead of cpdf.
+
+
+
+
+
-
-
-
-
-
-
+
The Coherent PDF tools provide a wide range of facilities for modifying PDF files
+ The Coherent PDF tools provide a wide range of facilities for modifying PDF files
created by other means. There is a single command-line program cpdf (cpdf.exe under
@@ -305,44 +305,44 @@ program.
The typical pattern for usage is
and the simplest concrete example, assuming the existence of a file in.pdf is:
which copies in.pdf to out.pdf. The input and output may be the same file. Of course, we should like
to do more interesting things to the PDF file than that!
- Files on the command line are distinguished from other input by their containing a period. If an
+ Files on the command line are distinguished from other input by their containing a period. If an
input file does not contain a period, it should be preceded by -i. For example:
A whole directory of files may be added (where a command supports multiple files) by using the -idir
option:
The files in the directory myfiles are considered in alphabetical order. They must all be PDF files. If
the names of the files are numeric, leading zeroes will be required for the order to be correct (e.g.
001.pdf, 002.pdf etc).
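Why the leading zeroes matter can be seen with a small Python sketch. It assumes that the alphabetical order used for -idir behaves like plain character-by-character string comparison; the helper name in_directory_order is hypothetical and is not part of cpdf.

```python
def in_directory_order(names):
    """Return file names in plain alphabetical (lexicographic) order.

    Assumption: this models the "alphabetical order" described for -idir.
    """
    return sorted(names)


# Without leading zeroes, "10.pdf" sorts before "2.pdf".
unpadded = in_directory_order(["1.pdf", "2.pdf", "10.pdf"])

# With leading zeroes, alphabetical order and numeric order agree.
padded = in_directory_order(["001.pdf", "002.pdf", "010.pdf"])

print(unpadded)
print(padded)
```

In the unpadded case the tenth file lands between the first and second, which is usually not what is wanted when merging a directory.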
-
+
An input range may be specified after each input file. This is treated differently by each operation.
For instance
extracts pages two, three, four and five from in.pdf, writing the result to out.pdf, assuming that
~) defines a page number counting from the end of the doc
beginning. Page ~1 is the last page, ~2 the penultimate page etc.
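The counting-from-the-end rule can be illustrated with a short Python sketch. It only models single page numbers, not full ranges, and the function name resolve_page is hypothetical rather than anything cpdf exposes.

```python
def resolve_page(spec, page_count):
    """Turn a page spec such as "3" or "~1" into a 1-based page number."""
    if spec.startswith("~"):
        # ~1 is the last page, ~2 the penultimate page, and so on.
        page = page_count - int(spec[1:]) + 1
    else:
        page = int(spec)
    if not 1 <= page <= page_count:
        raise ValueError(f"page {spec!r} is outside 1..{page_count}")
    return page


# In a 17-page document, ~1 is page 17 and ~2 is page 16.
print(resolve_page("~1", 17), resolve_page("~2", 17))
```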
- For example:
- For example:
+
+
In order to perform many operations, encrypted input PDF files must be decrypted. Some require the
+ In order to perform many operations, encrypted input PDF files must be decrypted. Some require the
owner password, some either the user or owner passwords. Either password is supplied
by writing user=<password> or owner=<password> following each input file
(before or after any range). The document will not be re-encrypted upon writing. For
example:
- To re-encrypt the file with its existing encryption upon writing, which is required if only the user
+ To re-encrypt the file with its existing encryption upon writing, which is required if only the user
password was supplied, but allowed in any case, add the -recrypt option:
- The password required (owner or user) depends upon the operation being performed. Separate
+ The password required (owner or user) depends upon the operation being performed. Separate
facilities are provided to decrypt and encrypt files (See Section 4).
-
+
Thus far, we have assumed that the input PDF will be read from a file on disk, and the output written
+ Thus far, we have assumed that the input PDF will be read from a file on disk, and the output written
similarly. Often it’s useful to be able to read input from stdin (Standard Input) or write output to
-stdout to write to standard output, either to pipe data between multiple
programs, or multiple invocations of the same program. For example, this sequence of commands (all
typed on one line)
extracts the last five pages of in.pdf in the correct order, writing them to out.pdf. It does this by
reversing the input, taking the first five pages and then reversing the result.
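The reverse, take, reverse trick is easy to check on a plain list. The following Python sketch models pages as integers; it does not call cpdf, and the name last_n_in_order is hypothetical.

```python
def last_n_in_order(pages, n):
    """Take the last n pages, in their original order, via two reversals."""
    reversed_pages = pages[::-1]   # step 1: reverse the input
    first_n = reversed_pages[:n]   # step 2: take the first n pages
    return first_n[::-1]           # step 3: reverse the result


# A 12-page document: the last five pages are 8 to 12.
print(last_n_in_order(list(range(1, 13)), 5))
```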
To supply passwords for a file from -stdin, use -stdin-owner <password> and/or -stdin-user
<password>.
Using -stdout on the final command in the pipeline to output the PDF to screen is not
recommended, since PDF files often contain compressed sections which are not screen-readable.
Several cpdf operations write to standard output by default (for example, listing fonts). A useful
feature of the command line (not specific to cpdf) is the ability to redirect this output to a file. This is
@@ -469,38 +469,38 @@ achieved with the > operator:
-
+
The keyword AND can be used to string together several commands in one. The advantage compared
with using pipes is that the file need not be repeatedly parsed and written out, saving
time.
To use AND, simply leave off the output specifier (e.g. -o) of one command, and the input specifier
(e.g. filename) of the next. For instance:
To specify the range for each section, use -range:
-
+
When measurements are given to cpdf, they are in points (1 point = 1/72 inch). They
may optionally be followed by some letters to change the measurement. The following are
supported:
For example, one may write 14mm or 21.6in. In addition, the following letters stand, in some
operations (-crop), for various page dimensions:
For example, we may write PMINX PMINY to stand for the coordinates of the lower left corner of the
page.
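The relationship between unit suffixes and points can be sketched in Python. Only the 1 point = 1/72 inch rule and the mm and in suffixes come from the text above; the cm and pt entries in the table, and the function name to_points, are assumptions made for illustration.

```python
# Points per unit, derived from 1 point = 1/72 inch and 1 inch = 25.4 mm.
# Assumption: "cm" and "pt" are included alongside the "mm" and "in"
# suffixes that appear in the manual's example.
POINTS_PER_UNIT = {
    "pt": 1.0,
    "in": 72.0,
    "cm": 72.0 / 2.54,
    "mm": 72.0 / 25.4,
}


def to_points(measurement):
    """Convert a string such as "14mm" or "21.6in" to points."""
    for suffix, factor in POINTS_PER_UNIT.items():
        if measurement.endswith(suffix):
            return float(measurement[: -len(suffix)]) * factor
    # No suffix: the number is already in points.
    return float(measurement)


print(round(to_points("14mm"), 2), round(to_points("21.6in"), 2))
```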
Simple arithmetic may be performed using the words add, sub, mul and div, for example PMINX mul 2.
The -producer and -creator options may be added to any cpdf command line to set the
@@ -576,20 +576,20 @@ producer and/or creator of the PDF file. If the file was converted from another
creator
+
When an operation is used which relies on a part of the PDF standard introduced in a later version
than that of the input file, the PDF version in the output file is set to the later version (most PDF
viewers will try to load any PDF file, even if it is marked with a later version number).
However, this automatic version changing may be suppressed with the -keep-version
flag.
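The version rule amounts to taking the later of two versions unless -keep-version is given. A minimal Python sketch, with versions modelled as (major, minor) tuples and a hypothetical function name output_version:

```python
def output_version(input_version, feature_version, keep_version=False):
    """Pick the PDF version written to the output file.

    Versions are (major, minor) tuples, e.g. (1, 5) for PDF 1.5.
    keep_version models the effect of the -keep-version flag.
    """
    if keep_version:
        return input_version
    # Otherwise the output is raised to the later of the two versions.
    return max(input_version, feature_version)


# A PDF 1.4 input processed with a PDF 1.5 feature becomes PDF 1.5.
print(output_version((1, 4), (1, 5)))
```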
- Here is a list of Acrobat versions together with the maximum PDF version they are intended to
+ Here is a list of Acrobat versions together with the maximum PDF version they are intended to
support:
If you wish to manually alter the PDF version of a file, use the -set-version option described in
Section 15.5.
-
+
PDF files contain an ID (consisting of two parts), used by some workflow systems to uniquely identify
+ PDF files contain an ID (consisting of two parts), used by some workflow systems to uniquely identify
a file. To change the ID, use the -change-id operation. This will create a new ID for the
output file.
-
+
Linearized PDF is a version of the PDF format in which the data is held in a special manner to allow
+ Linearized PDF is a version of the PDF format in which the data is held in a special manner to allow
content to be fetched only when needed. This means viewing a multipage PDF over a slow connection
is more responsive. By default, cpdf does not linearize output files. To make it do so, add
the -l option to the command line, in addition to any other command being used. For
example:
This requires the existence of the external program cpdflin which is provided with commercial
versions of cpdf. This must be installed as described in the installation documentation provided with
@@ -638,144 +638,144 @@
In extremis, you may place cpdflin and its resources in the current working directory, though this
is not recommended. For further help, refer to the installation instructions for your copy of
cpdf.
- To keep the existing linearization status of a file (produce linearized output if the input is
+ To keep the existing linearization status of a file (produce linearized output if the input is
linearized and the reverse), use -keep-l instead of -l.
-
+
PDF 1.5 introduced a new mechanism for storing objects to save space: object streams. By default,
cpdf will preserve object streams in input files, creating no more. To prevent the retention of existing
object streams, use -no-preserve-objstm:
To create new object streams if none exist, or augment the existing ones, use -create-objstm:
- To create wholly new object streams, use both options together:
- To create wholly new object streams, use both options together:
Files written with object streams will be set to PDF 1.5 or higher, unless -keep-version is used (see
above).
-
+
There are many malformed PDF files in existence, including many produced by otherwise-reputable
+ There are many malformed PDF files in existence, including many produced by otherwise-reputable
applications. cpdf attempts to correct these problems silently.
Grossly malformed files will be reconstructed. The reconstruction progress is shown on stderr
(Standard Error):
- Sometimes files can be technically well-formed but use inefficient PDF constructs. If you are sure the
+ Sometimes files can be technically well-formed but use inefficient PDF constructs. If you are sure the
input files you are using are impeccably formed, the -fast option may be added to the command line (or, if
using AND, to each section of the command line). This will use certain shortcuts which speed up
processing, but would fail on badly-produced files.
The -fast option may be used with:
If problems occur, refrain from using -fast.
-
+
When cpdf encounters an error, it exits with code 2. An error message is displayed on stderr
(Standard Error). In normal usage, this means it’s displayed on the screen. When a bad or
inappropriate password is given, the exit code is 1.
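A calling script can branch on these exit codes. The Python sketch below is one possible wrapper: codes 1 and 2 are as described above, while treating 0 as success is an assumption based on common convention, and the helper names describe_exit and run_cpdf are hypothetical.

```python
import subprocess


def describe_exit(code):
    """Map a cpdf exit code to a short description."""
    if code == 0:
        # Assumption: 0 means success, following the usual convention.
        return "success"
    if code == 1:
        return "bad or inappropriate password"
    if code == 2:
        return "error (message on stderr)"
    return "unexpected exit code"


def run_cpdf(args):
    """Run cpdf with the given argument list and describe the outcome.

    Requires a cpdf executable on the PATH; not called in this sketch.
    """
    result = subprocess.run(["cpdf", *args], capture_output=True, text=True)
    return describe_exit(result.returncode), result.stderr


print(describe_exit(2))
```

With cpdf installed, a call such as run_cpdf(["in.pdf", "-o", "out.pdf"]) would return the description together with anything written to stderr.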
-
+
Some operating systems have a limit on the length of a command line. To circumvent this, or
+ Some operating systems have a limit on the length of a command line. To circumvent this, or
simply for reasons of flexibility, a control file may be specified from which arguments are drawn. This
file does not support the full syntax of the command line. Commands are separated by whitespace,
quotation marks may be used if an argument contains a space, and the sequence \" may be used to
introduce a genuine quotation mark in such an argument.
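The three rules for control files (whitespace separates arguments, quotation marks group an argument containing spaces, and \" gives a literal quotation mark inside such an argument) can be modelled with a small tokenizer. This Python sketch is only an approximation of those stated rules, not cpdf's actual parser, and split_control_text is a hypothetical name.

```python
def split_control_text(text):
    """Split control-file text into arguments using the three rules above."""
    args = []
    current = []
    in_quotes = False
    have_token = False
    i = 0
    while i < len(text):
        ch = text[i]
        if in_quotes:
            if ch == "\\" and i + 1 < len(text) and text[i + 1] == '"':
                # \" inside quotes is a genuine quotation mark.
                current.append('"')
                i += 2
                continue
            if ch == '"':
                in_quotes = False
            else:
                current.append(ch)
        elif ch == '"':
            in_quotes = True
            have_token = True
        elif ch.isspace():
            # Whitespace outside quotes ends the current argument.
            if have_token:
                args.append("".join(current))
                current, have_token = [], False
        else:
            current.append(ch)
            have_token = True
        i += 1
    if have_token:
        args.append("".join(current))
    return args


print(split_control_text(r'-add-text "Draft \"v2\" copy" -o out.pdf'))
```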
Several -control arguments may be specified, and may be mixed in with conventional
command-line arguments. The commands in each control file are considered in the order in which they
are given, after all conventional arguments have been processed. It is recommended to use -args in all
new applications. However, -control will be supported for legacy applications.
To avoid interference between -control and AND, a new mechanism has been added. Using -args
in place of -control will perform direct textual substitution of the file into the command line, prior to
any other processing.
-
+
Command lines are handled differently on each operating system. Some characters are reserved with
+ Command lines are handled differently on each operating system. Some characters are reserved with
special meanings, even when they occur inside quoted string arguments. To avoid this problem,
cpdf performs processing on string arguments as they are read.
- A backslash is used to indicate that a character which would otherwise be treated specially by the
+ A backslash is used to indicate that a character which would otherwise be treated specially by the
command line interpreter is to be treated literally. For example, Unix-like systems attribute a special
meaning to the exclamation mark, so the command line
- would fail. We must escape the exclamation mark with a backslash:
- would fail. We must escape the exclamation mark with a backslash:
+ It follows that backslashes intended to be taken literally must themselves be escaped (i.e. written
+ It follows that backslashes intended to be taken literally must themselves be escaped (i.e. written
\\).
-
+
Some cpdf commands write text to standard output, or read text from the command line or
configuration files. These are:
- There are three options to control how the text is interpreted:
- There are three options to control how the text is interpreted:
Add -utf8 to use Unicode UTF-8, -stripped to convert to 7-bit ASCII by dropping any high
characters, or -raw to perform no processing. The default is -stripped.
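The difference between the stripped and UTF-8 treatments can be shown on an ordinary string. This Python sketch imitates the described behaviour only; it is not cpdf's implementation, and the function names are hypothetical.

```python
def stripped(text):
    """Model of -stripped: drop every character outside 7-bit ASCII."""
    return "".join(ch for ch in text if ord(ch) < 128)


def utf8(text):
    """Model of -utf8: keep every character, encoded as UTF-8 bytes."""
    return text.encode("utf-8")


# The sharp s is dropped by the stripped form but kept as two bytes in UTF-8.
print(stripped("Straße"), utf8("ß"))
```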
-
+
Use the -no-embed-font option to avoid embedding the Standard 14 Font metrics when adding text with
-add-text.
@@ -788,43 +788,43 @@
Merging
The -merge operation allows the merging of several files into one. Ranges can be used to select only a
subset of pages from each input file in the output. The output file consists of the concatenation of all
the input pages in the order specified on the command line. Actually, the -merge can be omitted, since
this is the default operation of cpdf.
- Merge maintains bookmarks, named destinations, and name dictionaries.
- Forms and other objects which cannot be merged are retained if they are from the document which
+ Merge maintains bookmarks, named destinations, and name dictionaries.
+ Forms and other objects which cannot be merged are retained if they are from the document which
first exhibits that feature.
The -retain-numbering option keeps the PDF page numbering labels of each document intact,
rather than renumbering the output pages from 1.
The -remove-duplicate-fonts option ensures that fonts used in more than one of the inputs only
appear once in the output.
-
+
The -split operation splits a PDF file into a number of parts which are written to file, their names
being generated from a format. The optional -chunk option allows the number of pages written to
each output file to be set.
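How a chunk size divides a document can be sketched in Python. The default of one page per output file is an assumption, as is the hypothetical name split_ranges; the sketch computes page ranges only and does not write any files.

```python
def split_ranges(page_count, chunk=1):
    """Return the (first, last) page of each output file for a given chunk size.

    Assumption: without -chunk, each output file holds a single page.
    """
    return [
        (start, min(start + chunk - 1, page_count))
        for start in range(1, page_count + 1, chunk)
    ]


# A 10-page file split in chunks of 4 gives two full parts and a short last one.
print(split_ranges(10, 4))
```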
- If the output format does not provide enough numbers for the files generated, the result is unspecified.
+ If the output format does not provide enough numbers for the files generated, the result is unspecified.
The following format operators may be used:
The -split-bookmarks <level> operation splits a PDF file into a number of parts, according to the
page ranges implied by the document’s bookmarks. These parts are then written to file with names
generated from the given format.
- Level 0 denotes the top-level bookmarks, level 1 the next level (sub-bookmarks) and so on. So
+ Level 0 denotes the top-level bookmarks, level 1 the next level (sub-bookmarks) and so on. So
-split-bookmarks 1 creates breaks on level 0 and level 1 boundaries.
- Now, there may be many bookmarks on a single page (for instance, if paragraphs are bookmarked or
+ Now, there may be many bookmarks on a single page (for instance, if paragraphs are bookmarked or
there are two subsections on one page). The splits calculated by -split-bookmarks ensure that each
page appears in only one of the output files. It is possible to use the @ operators above, including
operator @B which expands to the text of the bookmark:
The bookmark text used for a name is converted from Unicode to 7-bit ASCII, and the
following characters are removed, in addition to any character with ASCII code less than
32:
-
+
The encryption parameters described in Chapter 4 may be added to the command line to encrypt each
split PDF. Similarly, the -recrypt switch described in
Page Sizes
Whenever a page size is required, instead of writing, for instance, "210mm 297mm", one can simply
write a4portrait. Here is a list of supported page sizes:
@@ -908,7 +908,7 @@ uslegalportrait uslegallandscape
The -scale-page operation scales each page in the range by the X and Y factors given.
This scales both the page contents, and the page size itself. It also scales any Crop Box
@@ -916,42 +916,42 @@ This scales both the page contents, and the page size itself. It also scales any
and other boxes (Art Box, Trim Box etc). As with several of these commands, remember
to take into account any page rotation when considering what the X and Y axes relate
to.
The -scale-to-fit operation scales each page in the range to fit a given page size, preserving aspect
ratio and centering the result.
- The scale can optionally be set to a percentage of the available area, instead of filling it.
- The scale can optionally be set to a percentage of the available area, instead of filling it.
The -scale-contents operation scales the contents about the center of the crop box (or, if absent,
the media box), leaving the page dimensions (boxes) unchanged.
- To scale about a point other than the center, one can use the positioning commands described in
+ To scale about a point other than the center, one can use the positioning commands described in
Section 8.2.4. For example:
-
+
The -shift operation shifts the contents of each page in the range by X points horizontally and Y
points vertically.
-
+
There are two ways of rotating pages: (1) setting a value in the PDF file which asks the viewer (e.g.
+ There are two ways of rotating pages: (1) setting a value in the PDF file which asks the viewer (e.g.
Acrobat) to rotate the page on-the-fly when viewing it (use -rotate or -rotate-by) and
@@ -961,46 +961,46 @@
The possible values for -rotate and -rotate-by are 0, 90, 180 and 270, all interpreted as being
clockwise. Any value may be used for -rotate-contents.
The -rotate operation sets the viewing rotation of the selected pages to the absolute value
given.
The -upright operation does whatever combination of -rotate and -rotate-contents is
required to change the rotation of the document to zero without altering its appearance.
In addition, it makes sure the media box has its origin at (0,0), changing other boxes to
compensate.
-
+
The -hflip and -vflip operations flip the contents of the chosen pages horizontally or vertically. No
account is taken of the current page rotation when considering what “horizontally” and “vertically”
mean, so you may like to use -upright first.
All PDF files contain a media box for each page, giving the dimensions of the paper. To
change these dimensions (without altering the page contents in any way), use the -mediabox
option.
- Note that the crop box is only obeyed in some viewers.
- Note that the crop box is only obeyed in some viewers.
+ PDF files can be encrypted using various types of encryption and attaching various permissions
+ PDF files can be encrypted using various types of encryption and attaching various permissions
describing what someone can do with a particular document (for instance, printing it or extracting
content). There are two types of person:
There are five kinds of encryption:
+ There are five kinds of encryption:
All encryption supports these kinds of permissions:
+ All encryption supports these kinds of permissions:
In addition, 128-bit encryption (Acrobat 5 and above) and AES encryption support these:
Add these flags to the command line to prevent each operation.
-
+ Add these flags to the command line to prevent each operation.
+
To encrypt a document, the owner and user passwords must be given (here, fred and charles
respectively):
- When using AES encryption, the option is available to refrain from encrypting the metadata. Add
+ When using AES encryption, the option is available to refrain from encrypting the metadata. Add
-no-encrypt-metadata to the command line.
-
+
To decrypt a document, the owner password is provided.
- To decrypt a document, the owner password is provided.
+ To decompress the streams in a PDF file, for instance to manually inspect the PDF, use:
- To decompress the streams in a PDF file, for instance to manually inspect the PDF, use:
To compress the streams in a PDF file, use:
+
To squeeze a PDF file, reducing its size by an average of about twenty percent (though sometimes not
at all), use:
The -squeeze operation writes some information about the squeezing process to standard output.
The squeezing process involves several processes which losslessly attempt to reduce the file size. It is
slow, so should not be used without thought.
@@ -1183,8 +1183,8 @@ $ ./cpdf -squeeze in.pdf -o out.pdf
- The
+ The -squeeze-log-to <filename> option writes the log to the given file instead of to standard
output.
@@ -1204,7 +1204,7 @@ bookmarks.
List Bookmarks
The -list-bookmarks operation prints (to standard output) the bookmarks in a file. The
first column gives the level of the tree at which a particular bookmark is. Then the
text of the bookmark in quotes, then the page number which the bookmark points to,
@@ -1212,7 +1212,7 @@ then (optionally) the word ”open” if the bookmark should have its ch
level immediately below) visible when the file is loaded. For example, upon executing
the result might be:
+ the result might be:
- If the page number is 0, it indicates that clicking on that entry doesn’t move to a page.
- By default,
+ If the page number is 0, it indicates that clicking on that entry doesn’t move to a page.
+ By default, cpdf converts Unicode to ASCII text, dropping characters outside the ASCII range. To
prevent this, and return Unicode UTF8 output, add the -utf8 option to the command. To prevent any
processing, use the -raw option.
-
+
The -remove-bookmarks operation removes all bookmarks from the file.
+
The -add-bookmarks operation adds bookmarks as specified by a bookmarks file, a text file in ASCII or
UTF8 encoding and in the same format as that produced by the -list-bookmarks operation.
Chapter 7
The PDF file format, starting at Version 1.1, provides for simple slide-show presentations in the
+ The PDF file format, starting at Version 1.1, provides for simple slide-show presentations in the
manner of Microsoft Powerpoint. These can be played in Acrobat and possibly other PDF viewers,
typically started by entering full-screen mode. The -presentation operation allows such a
presentation to be built from any PDF file.
The -trans option chooses the transition style. When a page range is used, it is the transition
from each page named which is altered. The following transition styles are available:
@@ -1312,17 +1312,17 @@
The same as Dissolve but the effect sweeps across the page in the direction specified
by the -direction option.
To remove a transition style currently applied to the selected pages, omit the -trans option.
The -effect-duration option specifies the length of time in seconds for the transition itself. The
default value is one second.
The -duration option specifies the maximum time in seconds that the page is displayed before the
presentation automatically advances. The default, in the absence of the -duration option, is for no
automatic advancement.
The -direction option (for Wipe and Glitter styles only) specifies the direction of the effect.
@@ -1342,10 +1342,10 @@ class="cmbx-10">Wipe For example:
- For example:
To use different options on different page ranges, run cpdf multiple times on the file using a different
page range each time.
@@ -1354,48 +1354,48 @@ page range each time.
Watermarks and Stamps
The -stamp-on and -stamp-under operations stamp the first page of a source PDF onto or under
each page in the given range of the input file. For example,
The position commands in Section 8.2.4 can be used to locate the stamp more precisely (they are
calculated relative to the crop box of the stamp). Or, preprocess the stamp with -shift
first.
The -scale-stamp-to-fit option can be added to scale the stamp to fit the page before
applying it. The use of positioning commands together with -scale-stamp-to-fit is not
recommended.
The -combine-pages operation takes two PDF files and stamps each page of one over each page
of the other. The length of the output is the same as the length of the “under” file. For
instance:
- Page attributes (such as the display rotation) are taken from the “under” file. For best results, remove
+ Page attributes (such as the display rotation) are taken from the “under” file. For best results, remove
any rotation differences in the two files using -upright first.
The -relative-to-cropbox option takes the positioning command to be relative to the crop box of
each page rather than the media box.
-
+
The -add-text operation allows text, dates and times to be stamped over one or more pages of the
input at a given position and using a given font, font size and color.
- The default is black 12pt Times New Roman text in the top left of each page. The text can be placed
+ The default is black 12pt Times New Roman text in the top left of each page. The text can be placed
underneath rather than over the page by adding the -underneath option.
Text previously added by cpdf may be removed by the -remove-text operation.
-
+
There are various special codes to include the page number in the text:
+ There are various special codes to include the page number in the text:
For example, the format "Page %Page of %EndPage" might become “Page 5 of 17”.
NB: In some circumstances (e.g. in batch files) on Microsoft Windows, % is a special
character, and must be escaped (written as %%). Consult your local documentation for
details.
-
+
+
Unique page identifiers can be specified by putting %Bates in the format.
The starting point can be set with the -bates option. For example:
To specify that Bates numbering begins at the first page of the range, use -bates-at-range
instead. This option must be specified after the range is specified. To pad the Bates number up
to a given number of leading zeros, use -bates-pad-to in addition to either -bates or
-bates-at-range.
-
+
The position of the text may be specified either in absolute terms:
+ The position of the text may be specified either in absolute terms:
- Positions relative to certain common points can be set:
- Positions relative to certain common points can be set:
+ No attempt is made to take account of the page rotation when interpreting the position, so
+ No attempt is made to take account of the page rotation when interpreting the position, so
-prerotate must be added to the command line if the file contains pages with a non-zero viewing
rotation. This is equivalent to pre-processing the document with -upright.
The -relative-to-cropbox modifier can be added to the command line to make these
measurements relative to the crop box instead of the media box.
The default position is equivalent to -topleft 100.
The -midline option may be added to specify that the positioning commands above are to
be considered relative to the midline of the text, rather than its baseline. Similarly, the
-topline option may be used to specify that the position is taken relative to the top of the
text.
-
+
The font may be set with the -font option. The 14 Standard PDF fonts are available:
For example, page numbers in Times Italic can be achieved by:
- For example, page numbers in Times Italic can be achieved by:
The font size can be altered with the -font-size option, which specifies the size in
points:
-
+
The -color option takes an RGB color, where red, green and blue components range between 0 and 1.
The following values are predefined:
Partly-transparent text may be specified using the -opacity option. Wholly opaque is 1 and wholly
transparent is 0. For example:
-
+
The -outline option sets outline text. The line width (default 1pt) may be set with the -linewidth
option. For example, to stamp documents as drafts:
-
+
The code \n can be included in the text string to move to the next line. In this case, the vertical
position refers to the baseline of the first line of text (if the position is at the top, top left or top right
of the page) or the baseline of the last line of text (if the position is at the bottom, bottom left or
bottom right).
The -midline option may be used to make these vertical positions relative to the midline of a line of
text rather than the baseline, as usual.
The -line-spacing option can be used to increase or decrease the line spacing, where a spacing of
1 is the standard.
Justification of multiple lines is handled by the -justify-left, -justify-right and
+
If your command line allows for the inclusion of Unicode characters, the input text will be considered
as UTF8 by cpdf. Special characters which exist in the PDF WinAnsiEncoding Latin 1 code (such as
many accented characters) will be reproduced in the PDF. This does not mean, however, that every
special character can be reproduced. You must experiment.
- For compatibility with previous versions of cpdf, special characters may be introduced manually
+ For compatibility with previous versions of cpdf, special characters may be introduced manually
with a backslash followed by the three-digit octal code of the character in the PDF WinAnsiEncoding
Latin 1 Code. The full table is included in Appendix D of the Adobe PDF Reference Manual, which is
available at http://www.adobe.com/devnet/pdf/pdf_reference.html.
- For example, a German sharp s (ß) may be introduced by \337.
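The octal convention described above can be illustrated with a small sketch. This is not cpdf's own code: the function name `expand_octal_escapes` is hypothetical, and Python's `cp1252` codec is used as an assumed stand-in for PDF WinAnsiEncoding (the two agree for most codes, but a few codes are undefined in `cp1252` and would raise an error here).

```python
import re


def expand_octal_escapes(text: str) -> str:
    """Replace each backslash + three-digit octal code with its character.

    Assumption: cp1252 approximates the PDF WinAnsiEncoding table.
    """
    def replace(match: re.Match) -> str:
        code = int(match.group(1), 8)          # e.g. "337" -> 223
        return bytes([code]).decode("cp1252")  # 223 -> "ß"

    # First digit 0-3 keeps the value within a single byte (0..255).
    return re.sub(r"\\([0-3][0-7]{2})", replace, text)


print(expand_octal_escapes(r"Stra\337e"))
```

Text without any backslash sequence passes through unchanged, so the same string can mix plain characters and octal codes.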
-
+
A rectangle may be placed on one or more pages by using the -add-rectangle <size>
command. Most of the options discussed above for text placement apply in the same way. For
example:
- This can be used to blank out or highlight part of the document. The following positioning options
+ This can be used to blank out or highlight part of the document. The following positioning options
work as you would expect: -topleft, -top, -reverse-diagonal have no
meaning.
-
+
This facility puts multiple logical pages on a single physical page. The -twoup-stack operation puts
two logical pages on each physical page, rotating them 90 degrees to do so. The new mediabox is thus
larger. The -twoup operation does the same, but scales the new sides down so that the media box is
unchanged.
-
+
Sometimes, for instance to get a printing arrangement right, it’s useful to be able to insert blank pages
+ Sometimes, for instance to get a printing arrangement right, it’s useful to be able to insert blank pages
into a PDF file. cpdf can add blank pages before a given page or pages, or after. The pages in
question are specified by a range in the usual way:
- The -pad-every n operation places a blank page after every n pages, excluding any last one. For
example…
- The -pad-multiple n operation adds blank pages so the document has a multiple of n pages. For
example:
-
+
The -list-annotations operation prints the textual content of any annotations on the selected pages
to standard output. Each annotation is preceded by the page number and followed by a
newline.
-
+
The -copy-annotations operation copies the annotations in the given page range from one file (the
file specified immediately after the option) to another pre-existing PDF. The range is specified after
this pre-existing PDF. The result is then written to an output file, specified in the usual way.
+
The -remove-annotations operation removes all annotations from the given page range.
-
+
The -list-fonts operation prints the fonts in the document, one per line, to standard output. For
example:
- The first column gives the page number, the second the internal unique font name, the third the type
+ The first column gives the page number, the second the internal unique font name, the third the type
of font (Type1, TrueType etc), the fourth the PDF font name, the fifth the PDF font
encoding.
-
+
The -info option prints entries from the document information dictionary, and from any XMP
metadata to standard output.
- The details of the format for creation and modification dates can be found in Appendix A.
- By default, cpdf strips to ASCII, discarding character codes in excess of 127. In order to preserve
+ By default, cpdf strips to ASCII, discarding character codes in excess of 127. In order to preserve
the original unicode, add the -utf8 option. To disable all postprocessing of the string, add
-raw.
- The -page-info option prints the page label, media box and other boxes page-by-page to standard
output, for all pages in the current range.
- Note that the format for boxes is minimum x, minimum y, maximum x, maximum y.
- The
+ Note that the format for boxes is minimum x, minimum y, maximum x, maximum y.
+ The -pages operation prints the number of pages in the file.
-
+
The document information dictionary in a PDF file specifies various pieces of information about a
PDF. These can be consulted in a PDF viewer (for instance, Acrobat).
- Here is a summary of the commands for setting entries in the document information
+ Here is a summary of the commands for setting entries in the document information
dictionary:
- (The details of the format for creation and modification dates can be found in Appendix A. Using the
date "now" uses the time and date at which the command is executed. Note also that -producer and
-creator may be used to set the producer and/or the creator when writing any file, separate from the
operations described in this chapter.)
- For example, to set the title, the full command line would be
- For example, to set the title, the full command line would be
+ The text string is considered to be in UTF8 format, unless the -raw option is added, in which case it
is unprocessed, save for the replacement of any octal escape sequence such as \017, which is replaced
by a character of its value (here, 15).
-
+
+
The -set-page-layout option specifies the page layout to be used when a document is opened in, for
instance, Acrobat. The possible (case-sensitive) values are:
For instance:
- For instance:
+
+
The page mode in a PDF file defines how a viewer should display the document when first opened.
The possible (case-sensitive) values are:
For instance:
- For instance:
+
+
For instance:
- For instance:
+ The page a PDF file opens at can be set using -open-at-page:
- To have that page scaled to fit the window in the viewer, use -open-at-page-fit instead:
-
+
PDF files can contain a piece of arbitrary metadata, often in XMP format. This is typically stored in
+ PDF files can contain a piece of arbitrary metadata, often in XMP format. This is typically stored in
an uncompressed stream, so that other applications can read it without having to decode the whole
PDF. To set the metadata:
- To remove any metadata:
- To remove any metadata:
+
+
It is possible to add page labels to a document. These are not printed on the page, but may be
displayed alongside thumbnails or in print dialogue boxes by PDF readers. We use -add-page-labels
@@ -1944,35 +1944,35 @@ src="cpdfmanual119x.png" alt=" DecimalArabic 1,2,3,4,5...
LowercaseLetters a,b,c,...,z,aa,bb...
UppercaseLetters A,B,C,...,Z,AA,BB...
NoLabelPrefixOnly No number, but a prefix will be used if defined. " > We can use -label-prefix to add a textual prefix to each label. Consider a file with twenty pages
and no current page labels (a PDF reader will assume 1,2,3…if there are none). We will add the
following page labels:
- i, ii, iii, iv, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, A-0, A-1, A-2, A-3, A-4, A-5
- Here are the commands, in order:
- i, ii, iii, iv, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, A-0, A-1, A-2, A-3, A-4, A-5
+ Here are the commands, in order:
+ By default the labels begin at page number 1 for each range. To override this, we can use
+ By default the labels begin at page number 1 for each range. To override this, we can use
-label-startval (we used 0 in the final command), where we want the numbers to begin at zero
rather than one.
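The twenty-page example above (lowercase Roman for the first four pages, decimal for the next ten, then an "A-" prefix with numbering starting at 0) can be checked with a short sketch. It only models the resulting label sequence and does not call cpdf. The helper names and the style keys `"roman"` and `"decimal"` are hypothetical, chosen for illustration.

```python
def to_lower_roman(n: int) -> str:
    """Convert a positive integer to a lowercase Roman numeral."""
    pairs = [(1000, "m"), (900, "cm"), (500, "d"), (400, "cd"),
             (100, "c"), (90, "xc"), (50, "l"), (40, "xl"),
             (10, "x"), (9, "ix"), (5, "v"), (4, "iv"), (1, "i")]
    out = []
    for value, numeral in pairs:
        while n >= value:
            out.append(numeral)
            n -= value
    return "".join(out)


def page_labels(ranges):
    """Build labels from (page_count, style, prefix, start_value) tuples.

    Each range restarts numbering at its own start value, mirroring the
    behaviour described for -label-startval (default 1).
    """
    labels = []
    for page_count, style, prefix, start in ranges:
        for offset in range(page_count):
            number = start + offset
            text = to_lower_roman(number) if style == "roman" else str(number)
            labels.append(prefix + text)
    return labels


labels = page_labels([
    (4, "roman", "", 1),      # pages 1-4:   i, ii, iii, iv
    (10, "decimal", "", 1),   # pages 5-14:  1 .. 10
    (6, "decimal", "A-", 0),  # pages 15-20: A-0 .. A-5
])
print(", ".join(labels))
```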
- Page labels may be removed altogether by using the -remove-page-labels command. To print the
page labels from an existing file, use -print-page-labels. For example:
-
+
PDF supports adding attachments (files of any kind, including other PDFs) to an existing file. The
+ PDF supports adding attachments (files of any kind, including other PDFs) to an existing file. The
cpdf tool supports adding and removing document-level attachments — that is, ones which are
@@ -1984,83 +1984,83 @@ class="cmti-10">attachments To add an attachment, use the -attach-file option. For instance,
- The -to-page option can be used to specify that the files will be attached to the given
page, rather than at the document level. The -to-page option may be specified at most
once.
-
+
To list all document- and page-level attachments, use the -list-attached-files operation. The page
number and filename of each attachment are given, page 0 representing a document-level
attachment.
-
+
To remove all document-level and page-level attachments from a file, use the -remove-files
operation:
-
+
To list all images in the given range of pages which fall below a given resolution (in dots-per-inch), use
+ To list all images in the given range of pages which fall below a given resolution (in dots-per-inch), use
the -image-resolution function:
- The format is page number, image name, x pixels, y pixels, x resolution, y resolution. The resolutions
refer to the image’s effective resolution at point of use (taking account of scaling, rotation
etc).
-
+
In order to use a font other than the standard 14 with -add-text, it must be added to the file. The
font source PDF is given, together with the font’s resource name on a given page, and that
font is copied to all the pages in the input file’s range, and then written to the output
file.
- The font is named in the output file with its basefont name, so it can be easily used with
+ The font is named in the output file with its basefont name, so it can be easily used with
-add-text.
- For example, if the file fromfile.pdf has a font /GHLIGA+c128 with the name /F10 on page 1
@@ -2069,90 +2069,90 @@ class="cmtt-10">-list-fonts Text in this font can then be added by giving -font /GHLIGA+c128. Be aware that due to the vagaries
of PDF font handling concerning which characters are present in the source font, not all
characters may be available, or the encoding (mapping from input codes to glyphs) may be
non-obvious.
-
+
To remove embedded fonts from a document, use -remove-fonts. PDF readers will substitute local
fonts for the missing fonts. The use of this function is only recommended when file size is the sole
consideration.
-
+
The -missing-fonts operation lists any unembedded fonts in the document, one per line.
- The format is
- The format is
+
+
The -draft option removes bitmap (photographic) images from a file, so that it can be printed with
less ink. Optionally, the -boxes option can be added, filling the spaces left blank with a crossed box
denoting where the image was. This is not guaranteed to be fully visible in all cases (the
bitmap may have been partially covered by vector objects or clipped in the original). For
example:
-
+
Sometimes PDF output from an application (for instance, a web browser) has text in colors which
+ Sometimes PDF output from an application (for instance, a web browser) has text in colors which
would not print well on a grayscale printer. The -blacktext operation blackens all text on the given
pages so it will be readable when printed.
- This will not work on text which has been converted to outlines, nor on text which is part of a
+ This will not work on text which has been converted to outlines, nor on text which is part of a
form.
- The -blacklines operation blackens all lines on the given pages.
- The -blackfills operation blackens all fills on the given pages.
- Contrary to their names, all these operations can use another color, if specified with -color.
-
+
Quite often, applications will use very thin lines, or even the value of 0, which in PDF means ”The
+ Quite often, applications will use very thin lines, or even the value of 0, which in PDF means ”The
thinnest possible line on the output device”. This might be fine for on-screen work, but when printed
on a high resolution device, such as by a commercial printer, they may be too faint, or
disappear altogether. The -thinlines option prevents this by changing all lines thinner than
<minimal thickness> to the given thickness. For example:
-
+
Sometimes incremental updates to a file by an application, or bad applications can leave data in a
+ Sometimes incremental updates to a file by an application, or bad applications can leave data in a
PDF file which is no longer used. This function removes that unneeded data.
-
+
To change the PDF version number, use the -set-version operation, giving the part of the version
number after the decimal point. For example:
-
+
The -copy-id-from operation copies the ID from the given file to the input, writing to the
output.
- If there is no ID in the source file, the existing ID is retained. You cannot use -recrypt with
-copy-id-from.
-
+
The -remove-id operation removes the ID from a document.
- You cannot use -recrypt with -remove-id.
-
+
This operation lists the name of any “separation” color space in the given PDF file.
- This operation lists the name of any “separation” color space in the given PDF file.
+
+
This is for editing data within the PDF’s internal representation. Use with caution.
- This is for editing data within the PDF’s internal representation. Use with caution.
+
+
The -remove-clipping operation removes any clipping paths from the file.
-
+
A contiguous prefix of the parts above can be used instead, for lower accuracy dates. For
+ A contiguous prefix of the parts above can be used instead, for lower accuracy dates. For
example:
-
-Chapter 1
-
Basic Usage
-1.1 Input and Output Files
-
-
-
-
-1.2 Input Ranges
-
-1.3 Working with Encrypted Documents
id="dx1-6002">
-
-
-1.4 Standard Input and Standard Output
-
-
-1.5 Doing Several Things at Once with AND
-
-
-1.6 Units
-
1.7 Setting the Producer and Creator
-
-1.8 PDF Version Numbers
-1.9 File IDs
-
-1.10 Linearization
-
-
-1.11 Object Streams
-
-
-
-1.12 Malformed Files
-
-
-1.13 Error Handling
-1.14 Control Files
-
-1.15 String Arguments
-
-
-1.16 Text Encodings
-
-
-1.17 Font Embedding
-
-2.2 Splitting
-
-
-
-
-2.4 Encrypting with Split and Split Bookmarks
-
-
-
-
-
-3.3 Shift Page Contents
-
-3.4 Rotating Pages
-
The
-rotateby operation changes the viewing rotation of all the given pages by the relative value
given.
-
The
-rotate-contents operation rotates the contents and dimensions of the page by the given relative
value.
-
-3.5 Flipping Pages
-
3.6 Boxes and Cropping
@@ -1008,14 +1008,14 @@ src="cpdfmanual50x.png" alt="" class="fbox" >
id="dx1-32001">
-
The
four numbers are minimum x, minimum y, width, height. x coordinates increase to the right, y
coordinates increase upwards. PDF file can also optionally contain a -crop.
To remove any existing crop box, use -remove-crop.
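Two box notations appear in this manual: the four numbers described here are minimum x, minimum y, width, height, whereas -page-info reports boxes as minimum x, minimum y, maximum x, maximum y. The following sketch converts between the two. The function names are hypothetical and nothing here calls cpdf itself; it is only the arithmetic.

```python
def extents_to_size(box):
    """(min_x, min_y, max_x, max_y) -> (min_x, min_y, width, height)."""
    min_x, min_y, max_x, max_y = box
    return (min_x, min_y, max_x - min_x, max_y - min_y)


def size_to_extents(box):
    """(min_x, min_y, width, height) -> (min_x, min_y, max_x, max_y)."""
    min_x, min_y, width, height = box
    return (min_x, min_y, min_x + width, min_y + height)


# A box reported with corners (20, 30) and (320, 430) is 300 wide and 400 high.
print(extents_to_size((20, 30, 320, 430)))
```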
-
-
This operation copies the contents of one box (Media box, Crop box, Trim box etc.) to another. If
-mediabox-if-missing is added, the media box will be substituted when the ’from’ box is not set for
a given page. For example
-
copies the Trim Box of each page to the Crop Box of each page. The possible boxes are /MediaBox,
@@ -1052,7 +1052,7 @@ class="cmtt-10">/ArtBox4.1 Introduction
-
can do to the document what is allowed in the permis
class="description">The Owner can do anything, including altering the permissions or removing encryption
entirely.
-


4.2 Encrypting a Document
-
A
blank user password is common. In this event, PDF viewers will typically not prompt for a password
when opening the file or for operations allowable with the user password.
-
In
addition, the usual method can be used to give the existing owner password, if the document is
already encrypted.
-4.3 Decrypting a Document
-
The
user password cannot decrypt a file.
Chapter 5
-
Compression
cpdf provides basic facilities for decompressing and compressing PDF streams.
@@ -1139,8 +1139,8 @@ class="cmtt-10">cpdf
If
cpdf finds a compression type it can’t cope with, the stream is left compressed. When using
@@ -1150,26 +1150,26 @@ class="cmtt-10">-decompress
cpdf compresses any streams which have no compression using the FlateDecode method, with the
exception of Metadata streams, which are left uncompressed.
-5.3 Squeezing a Document
-
Adding -squeeze to the command line when using another operation will squeeze the file or files upon
output.
-
Squeezing page data
Recompressing document
-
0 "Part 2" 4
1 "Part 2a" 5
6.2 Remove Bookmarks
-
-6.3 Add Bookmarks
-
Presentations
-
-
8.1 Add a Watermark or Logo
-
stamps the file logo.pdf onto the odd pages of in.pdf, writing to out.pdf. A watermark should go
underneath each page:
-
-
-8.2 Stamp Text, Dates and Times.
-
-
-8.2.1 Page Numbers
-
8.2.2 Date and Time Formats
8.2.3 Bates Numbers
-
-8.2.4 Position
-
-
-8.2.5 Font and Size
-
See
Section 14.1 for how to use other fonts.
-
-8.2.6 Colors
-
-
-8.2.7 Outline Text
-
-8.2.8 Multi-line Text
-
-
-
-justify-center options. The defaults are left justification for positions relative to the left hand side
of the page, right justification for those relative to the right, and center justification for positions
relative to the center of the page. For example:
-
-8.2.9 Special Characters
-8.3 Stamping Graphics
-
-Chapter 9
id="x1-600009.1">Two-up
-9.2 Inserting Blank Pages
-
The
dimensions of the padded page are derived from the boxes (media box, crop box etc.) of the page after
or before which the padding is to be applied.
-
…on
a 9 page document adds a blank page after pages 3 and 6.
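The padding arithmetic can be sketched as follows. This reflects one reading of the descriptions of -pad-every and -pad-multiple in this manual, not cpdf's implementation, and both function names are hypothetical.

```python
def pad_every_positions(page_count: int, n: int) -> list[int]:
    """Pages after which a blank page is inserted by a pad-every-n rule.

    A blank follows every n-th page, except when that page is the last
    page of the document.
    """
    return list(range(n, page_count, n))


def pad_multiple_blanks(page_count: int, n: int) -> int:
    """Number of blank pages needed to reach a multiple of n pages."""
    return (-page_count) % n


print(pad_every_positions(9, 3))   # the 9-page example: blanks after pages 3 and 6
print(pad_multiple_blanks(9, 4))   # 9 pages padded up to 12
```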
-
-Chapter 10
id="x1-6300010.1">List Annotations
-
-10.2 Copy Annotations
-
-10.3 Remove Annotations
-
-Chapter 11
-
Document Information and Metadata
11.1 Listing Fonts
-
-11.2 Reading Document Information
-
-
-
-11.3 Setting Document Information
-
-
-11.4 Upon Opening a Document
-11.4.1 Page Layout
-
-
-11.4.2 Page Mode
-
-
-11.4.3 Display Options
-
-
-
-11.5 Metadata
-
-
To
print the current metadata to standard output:
-
-11.6 Page Labels
-
-
-Chapter 12
-
File Attachments
-
attaches the Excel spreadsheet sheet.xls to the input file. If the file already has attachments, the
new file is added to their number. You can specify multiple files to be attached by using -attach-file
multiple times. They will be attached in the given order.
-12.2 Listing Attachments
-
-12.3 Removing Attachments
-
-Chapter 13
-
Working with Images
13.1 Detecting Low-resolution Images
-
-
-Chapter 14
-
Fonts
14.1 Copying Fonts
-
-14.2 Removing Fonts
-
-14.3 Listing Missing Fonts
-
-
-Chapter 15
-
Miscellaneous
15.1 Draft Documents
-
-15.2 Blackening Text, Lines and Fills
-
-15.3 Hairline Removal
-
-15.4 Garbage Collection
-
-15.5 Change PDF Version Number
-
This does not alter any of the actual data in the file — just the supposed version number.
-15.6 Copy ID
-
-15.7 Remove ID
-
-15.8 List Spot Colours
-
-15.9 Removing Dictionary Entries
-
-15.10 Remove Clipping
-
-Appendix A
Dates in PDF are specified according to the following format:
-
Dates
-