diff --git a/html_manual/clean b/html_manual/clean
index df07573..2b7b2c7 100755
--- a/html_manual/clean
+++ b/html_manual/clean
@@ -1 +1 @@
-rm -f *.4ct *.4tc *.aux *.css *.dvi *.idv *.idx *.lg *.log *.tmp *.xref *.png
+rm -f *.4ct *.4tc *.aux *.css *.dvi *.idv *.idx *.lg *.log *.tmp *.xref *.png *.toc *.out
diff --git a/html_manual/cpdfmanual.html b/html_manual/cpdfmanual.html
index 5cc7ddf..dd05156 100644
--- a/html_manual/cpdfmanual.html
+++ b/html_manual/cpdfmanual.html
@@ -91,6 +91,156 @@ href="#x1-220002.1" id="QQ2-1-22">Merging
href="#x1-230002.2" id="QQ2-1-23">Splitting
2.3 Splitting on Bookmarks
+
2.4 Encrypting with Split and Split Bookmarks
+
3 Pages
+
3.1 Page Sizes
+
3.2 Scale Pages
+
3.3 Shift Page Contents
+
3.4 Rotating Pages
+
3.5 Flipping Pages
+
3.6 Boxes and Cropping
+
4 Encryption and Decryption
+
4.1 Introduction
+
4.2 Encrypting a Document
+
4.3 Decrypting a Document
+
5 Compression
+
5.1 Decompressing a Document
+
5.2 Compressing a Document
+
+
+
5.3 Squeezing a Document
+
6 Bookmarks
+
6.1 List Bookmarks
+
6.2 Remove Bookmarks
+
6.3 Add Bookmarks
+
7 Presentations
+
8 Watermarks and Stamps
+
8.1 Add a Watermark or Logo
+
8.2 Stamp Text, Dates and Times.
+
8.2.1 Page Numbers
+
8.2.2 Date and Time Formats
+
8.2.3 Bates Numbers
+
8.2.4 Position
+
8.2.5 Font and Size
+
8.2.6 Colors
+
8.2.7 Outline Text
+
8.2.8 Multi-line Text
+
8.2.9 Special Characters
+
8.3 Stamping Graphics
+
9 Multipage Facilities
+
9.1 Two-up
+
9.2 Inserting Blank Pages
+
10 Annotations
+
10.1 List Annotations
+
10.2 Copy Annotations
+
10.3 Remove Annotations
+
11 Document Information and Metadata
+
11.1 Listing Fonts
+
11.2 Reading Document Information
+
11.3 Setting Document Information
+
11.4 Upon Opening a Document
+
11.4.1 Page Layout
+
11.4.2 Page Mode
+
11.4.3 Display Options
+
11.5 Metadata
+
11.6 Page Labels
+
12 File Attachments
+
12.1 Adding Attachments
+
12.2 Listing Attachments
+
12.3 Removing Attachments
+
13 Working with Images
+
13.1 Detecting Low-resolution Images
+
14 Fonts
+
14.1 Copying Fonts
+
14.2 Removing Fonts
+
14.3 Listing Missing Fonts
+
+
+
15 Miscellaneous
+
15.1 Draft Documents
+
15.2 Blackening Text, Lines and Fills
+
15.3 Hairline Removal
+
15.4 Garbage Collection
+
15.5 Change PDF Version Number
+
15.6 Copy ID
+
15.7 Remove ID
+
15.8 List Spot Colours
+
15.9 Removing Dictionary Entries
+
15.10 Remove Clipping
+
A Dates
@@ -275,8 +425,8 @@ class="cmtt-10">-recrypt option:
The password required (owner or user) depends upon the operation being performed. Separate -facilities are provided to decrypt and encrypt files (See Section ??). +facilities are provided to decrypt and encrypt files (See Section 4).
If you wish to manually alter the PDF version of a file, use the -set-version option described in -Section ??. +Section 15.5.
@@ -718,6 +868,1387 @@ following characters are removed, in addition to any character with ASCII code l 32:
+
+
The encryption parameters described in Chapter 4 may be added to the command line to encrypt each +split PDF. Similarly, the -recrypt switch described in 1 may be given to re-encrypt each file with the +existing encryption of the source PDF. + + +
Any time when a page size is required, instead of writing, for instance "210mm 197mm" one can instead +write a4portrait. Here is a list of supported page sizes: +
The -scale-page operation scales each page in the range by the X and Y factors given. +This scales both the page contents, and the page size itself. It also scales any Crop Box + + +and other boxes (Art Box, Trim Box etc). As with several of these commands, remember +to take into account any page rotation when considering what the X and Y axes relate +to. +
+
The -scale-to-fit operation scales each page in the range to fit a given page size, preserving aspect +ratio and centering the result. +
+
The scale can optionally be set to a percentage of the available area, instead of filling it. +
+
The -scale-contents operation scales the contents about the center of the crop box (or, if absent, +the media box), leaving the page dimensions (boxes) unchanged. +
+
To scale about a point other than the center, one can use the positioning commands described in +Section 8.2.4. For example: +
+
+
The -shift operation shifts the contents of each page in the range by X points horizontally and Y +points vertically. +
+
+
There are two ways of rotating pages: (1) setting a value in the PDF file which asks the viewer (e.g. +Acrobat) to rotate the page on-the-fly when viewing it (use -rotate or -rotateby) and +(2) actually rotating the page contents and/or the page dimensions (use -upright afterwards or + + +-rotate-contents to just rotate the page contents). +
The possible values for -rotate and -rotateby are 0, 90, 180 and 270, all interpreted as being +clockwise. Any value may be used for -rotate-contents. +
The -rotate operation sets the viewing rotation of the selected pages to the absolute value +given. +
The +-rotateby operation changes the viewing rotation of all the given pages by the relative value +given. +
The +-rotate-contents operation rotates the contents and dimensions of the page by the given relative +value. +
+
The -upright operation does whatever combination of -rotate and -rotate-contents is +required to change the rotation of the document to zero without altering its appearance. +In addition, it makes sure the media box has its origin at (0,0), changing other boxes to +compensate. +
+
The -hflip and -vflip operations flip the contents of the chosen pages horizontally or vertically. No +account is taken of the current page rotation when considering what “horizontally” and “vertically” +mean, so you may like to use -upright first. +
+
All PDF files contain a media box for each page, giving the dimensions of the paper. To +change these dimensions (without altering the page contents in any way), use the -mediabox +option. + + +
The +four numbers are minimum x, minimum y, width, height. x coordinates increase to the right, y +coordinates increase upwards. PDF files can also optionally contain a crop box for each page, defining +to what extent the page is cropped before being displayed or printed. A crop box can be set, changed +and removed, without affecting the underlying media box. To set or change the crop box use -crop. +To remove any existing crop box, use -remove-crop. +
+
Note that the crop box is only obeyed in some viewers. +
+This operation copies the contents of one box (Media box, Crop box, Trim box etc.) to another. If +-mediabox-if-missing is added, the media box will be substituted when the ’from’ box is not set for +a given page. For example +
+copies the Trim Box of each page to the Crop Box of each page. The possible boxes are /MediaBox, +/CropBox, /BleedBox, /TrimBox, /ArtBox. + + +
PDF files can be encrypted using various types of encryption and attaching various permissions +describing what someone can do with a particular document (for instance, printing it or extracting +content). There are two types of person: +
There are five kinds of encryption: +
All encryption supports these kinds of permissions: +
In addition, 128-bit encryption (Acrobat 5 and above) and AES encryption support these: +
Add these flags to the command line to prevent each operation. +
+
To encrypt a document, the owner and user passwords must be given (here, fred and charles +respectively): +
A +blank user password is common. In this event, PDF viewers will typically not prompt for a password +when opening the file or for operations allowable with the user password. +
In +addition, the usual method can be used to give the existing owner password, if the document is +already encrypted. +
When using AES encryption, the option is available to refrain from encrypting the metadata. Add +-no-encrypt-metadata to the command line. +
+
To decrypt a document, the owner password is provided. +
The +user password cannot decrypt a file. + + +
+cpdf provides basic facilities for decompressing and compressing PDF streams. +
To decompress the streams in a PDF file, for instance to manually inspect the PDF, use: +
If +cpdf finds a compression type it can’t cope with, the stream is left compressed. When using +-decompress, object streams are not compressed. +
To compress the streams in a PDF file, use: cpdf compresses any streams which have no compression using the FlateDecode method, with the +exception of Metadata streams, which are left uncompressed. +
+
To squeeze a PDF file, reducing its size by an average of about twenty percent (though sometimes not +at all), use: +
+Adding -squeeze to the command line when using another operation will squeeze the file or files upon +output. +
The -squeeze operation writes some information about the squeezing process to standard output. +The squeezing process involves several processes which losslessly attempt to reduce the file size. It is +slow, so should not be used without thought. + + +
+
The -squeeze-log-to <filename> option writes the log to the given file instead of to standard +output. + + +
The -list-bookmarks operation prints (to standard output) the bookmarks in a file. The +first column gives the level of the tree at which a particular bookmark is. Then the +text of the bookmark in quotes, then the page number which the bookmark points to, +then (optionally) the word “open” if the bookmark should have its children (at the +level immediately below) visible when the file is loaded. For example, upon executing + +
the result might be: + + +
+
If the page number is 0, it indicates that clicking on that entry doesn’t move to a page. +
By default, cpdf converts unicode to ASCII text, dropping characters outside the ASCII range. To +prevent this, and return unicode UTF8 output, add the -utf8 option to the command. To prevent any +processing, use the -raw option. +
+
The -remove-bookmarks operation removes all bookmarks from the file. + +
+
The -add-bookmarks file adds bookmarks as specified by a bookmarks file, a text file in ASCII or +UTF8 encoding and in the same format as that produced by the -list-bookmarks option. If there +are any bookmarks in the input PDF already, they are discarded. For example, if the file +bookmarks.txt contains the output from -list-bookmarks above, then the command + +adds the bookmarks to the input file, writing to out.pdf. An error will be given if the bookmarks file +is not in the correct form (in particular, the numbers in the first column which specify +the level must form a proper tree with no entry being more than one greater than the +last). + + +
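The tree rule described above can be checked before handing a bookmarks file to cpdf. The following is a minimal sketch in Python, not part of cpdf itself. It assumes each line has the layout level, quoted text, page number and an optional "open", separated by whitespace, and it assumes the first entry must be at level 0. The names parse_bookmarks and check_tree are hypothetical.

```python
import re

# Assumed line layout, following the description of -list-bookmarks:
#   <level> "<text>" <page> [open]
LINE_RE = re.compile(r'^(\d+)\s+"(.*)"\s+(\d+)(?:\s+(open))?\s*$')


def parse_bookmarks(text):
    """Parse bookmark lines into (level, title, page, is_open) tuples."""
    entries = []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        match = LINE_RE.match(line)
        if match is None:
            raise ValueError(f"line {number}: not in the expected form")
        level, title, page, is_open = match.groups()
        entries.append((int(level), title, int(page), is_open is not None))
    return entries


def check_tree(entries):
    """Raise ValueError if a level is more than one greater than the last."""
    previous = -1  # assumption: the first entry must be at level 0
    for level, title, _page, _is_open in entries:
        if level > previous + 1:
            raise ValueError(f"bookmark {title!r} skips a level")
        previous = level
```

Calling check_tree(parse_bookmarks(text)) on the contents of a bookmarks file returns silently when the levels form a proper tree and raises ValueError otherwise.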
The PDF file format, starting at Version 1.1, provides for simple slide-show presentations in the +manner of Microsoft Powerpoint. These can be played in Acrobat and possibly other PDF viewers, +typically started by entering full-screen mode. The -presentation operation allows such a +presentation to be built from any PDF file. +
The -trans option chooses the transition style. When a page range is used, it is the transition +from each page named which is altered. The following transition styles are available: +
To remove a transition style currently applied to the selected pages, omit the -trans option. +
The -effect-duration option specifies the length of time in seconds for the transition itself. The +default value is one second. +
The -duration option specifies the maximum time in seconds that the page is displayed before the +presentation automatically advances. The default, in the absence of the -duration option, is for no +automatic advancement. +
The -direction option (for Wipe and Glitter styles only) specifies the direction of the effect. +The following values are valid: +
For example: +
+
To use different options on different page ranges, run cpdf multiple times on the file using a different +page range each time. + + +
+
The -stamp-on and -stamp-under operations stamp the first page of a source PDF onto or under +each page in the given range of the input file. For example, +
+stamps the file logo.pdf onto the odd pages of in.pdf, writing to out.pdf. A watermark should go +underneath each page: +
+
The position commands in Section 8.2.4 can be used to locate the stamp more precisely (they are +calculated relative to the crop box of the stamp). Or, preprocess the stamp with -shift +first. +
The -scale-stamp-to-fit option can be added to scale the stamp to fit the page before +applying it. The use of positioning commands together with -scale-stamp-to-fit is not +recommended. +
The -combine-pages operation takes two PDF files and stamps each page of one over each page +of the other. The length of the output is the same as the length of the “under” file. For + + +instance: +
+
Page attributes (such as the display rotation) are taken from the “under” file. For best results, remove +any rotation differences in the two files using -upright first. +
The -relative-to-cropbox option takes the positioning command to be relative to the crop box of +each page rather than the media box. +
+
The -add-text operation allows text, dates and times to be stamped over one or more pages of the +input at a given position and using a given font, font size and color. +
+
The default is black 12pt Times New Roman text in the top left of each page. The text can be placed +underneath rather than over the page by adding the -underneath option. +
Text previously added by cpdf may be removed by the -remove-text operation. +
+
+
There are various special codes to include the page number in the text: +
For example, the format "Page %Page of %EndPage" might become “Page 5 of 17”. +
NB: In some circumstances (e.g. in batch files) on Microsoft Windows, % is a special +character, and must be escaped (written as %%). Consult your local documentation for +details. + + +
+
+
Unique page identifiers can be specified by putting %Bates in the format. +The starting point can be set with the -bates option. For example: + +
To specify that bates numbering begins at the first page of the range, use -bates-at-range +instead. This option must be specified after the range is specified. To pad the bates number up +to a given number of leading zeros, use -bates-pad-to in addition to either -bates or +-bates-at-range. +
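As an illustration of padding, here is a minimal Python sketch of how a padded Bates number could be formed. It is not taken from cpdf. It assumes the number increases by one per page and that the padding value is the total width of the number in digits. The function name bates_label is hypothetical.

```python
def bates_label(start, page_index, pad_to=0):
    """Return the Bates number for the page at 0-based page_index.

    Assumptions: numbering starts at `start` and increases by one per page;
    `pad_to` is the total width, filled on the left with zeros.
    """
    return str(start + page_index).zfill(pad_to)
```

Under these assumptions, a start of 21 padded to six digits gives 000021 on the first page and 000023 on the third.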
+
The position of the text may be specified either in absolute terms: + + +
+
Positions relative to certain common points can be set: +
+
No attempt is made to take account of the page rotation when interpreting the position, so +-prerotate must be added to the command line if the file contains pages with a non-zero viewing +rotation. This is equivalent to pre-processing the document with -upright. +
The -relative-to-cropbox modifier can be added to the command line to make these +measurements relative to the crop box instead of the media box. +
The default position is equivalent to -topleft 100. +
The -midline option may be added to specify that the positioning commands above are to +be considered relative to the midline of the text, rather than its baseline. Similarly, the +-topline option may be used to specify that the position is taken relative to the top of the +text. +
+
The font may be set with the -font option. The 14 Standard PDF fonts are available: +
For example, page numbers in Times Italic can be achieved by: +
See +Section 14.1 for how to use other fonts. +
The font size can be altered with the -font-size option, which specifies the size in +points: +
+
+
The -color option takes an RGB color, where red, green and blue components range between 0 and 1. +The following values are predefined: +
+
Partly-transparent text may be specified using the -opacity option. Wholly opaque is 1 and wholly +transparent is 0. For example: +
+ + +
+
The -outline option sets outline text. The line width (default 1pt) may be set with the -linewidth +option. For example, to stamp documents as drafts: +
+
+
The code \n can be included in the text string to move to the next line. In this case, the vertical +position refers to the baseline of the first line of text (if the position is at the top, top left or top right +of the page) or the baseline of the last line of text (if the position is at the bottom, bottom left or +bottom right). +
+
The -midline option may be used to make these vertical positions relative to the midline of a line of +text rather than the baseline, as usual. +
The -line-spacing option can be used to increase or decrease the line spacing, where a spacing of +1 is the standard. +
+
Justification of multiple lines is handled by the -justify-left, -justify-right and
-justify-center options. The defaults are left justification for positions relative to the left hand side
+of the page, right justification for those relative to the right, and center justification for positions
+relative to the center of the page. For example:
+
+
+
If your command line allows for the inclusion of unicode characters, the input text will be considered +as UTF8 by cpdf. Special characters which exist in the PDF WinAnsiEncoding Latin 1 code (such as +many accented characters) will be reproduced in the PDF. This does not mean, however, that every +special character can be reproduced. You must experiment. +
For compatibility with previous versions of cpdf, special characters may be introduced manually +with a backslash followed by the three-digit octal code of the character in the PDF WinAnsiEncoding +Latin 1 Code. The full table is included in Appendix D of the Adobe PDF Reference Manual, which is +available at http://www.adobe.com/devnet/pdf/pdf_reference.html. +
For example, a German sharp s (ß) may be introduced by \337. + + +
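The octal escape convention can be sketched in Python as follows. This is an illustration only, not cpdf's own code. It assumes Python's cp1252 codec (Windows-1252) as a stand-in for the PDF WinAnsiEncoding, and the function name expand_octal_escapes is hypothetical.

```python
import re


def expand_octal_escapes(text):
    """Replace a backslash plus three octal digits with the matching character."""
    def replace(match):
        code = int(match.group(1), 8)  # e.g. "337" -> 223
        # Assumption: cp1252 approximates WinAnsiEncoding; undefined codes
        # become the Unicode replacement character instead of raising.
        return bytes([code]).decode("cp1252", errors="replace")
    # [0-3] for the first digit keeps the value within a single byte (0-255)
    return re.sub(r"\\([0-3][0-7]{2})", replace, text)
```

With this sketch, the input Stra\337e expands to Straße, since octal 337 is decimal 223, the code for ß.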
+
A rectangle may be placed on one or more pages by using the -add-rectangle <size> +command. Most of the options discussed above for text placement apply in the same way. For +example: +
+
This can be used to blank out or highlight part of the document. The following positioning options +work as you would expect: -topleft, -top, -topright, -right, -bottomright, -bottom, +-bottomleft, -left, -center. When using the option -pos-left "x y", the point (x, y) refers to the +bottom-left of the rectangle. When using the option -pos-right "x y", the point (x, y) refers to the +bottom-right of the rectangle. When using the option -pos-center "x y", the point (x, y) refers +to the center of the rectangle. The options -diagonal and -reverse-diagonal have no +meaning. + + +
+ + +
This facility puts multiple logical pages on a single physical page. The -twoup-stack operation puts +two logical pages on each physical page, rotating them 90 degrees to do so. The new mediabox is thus +larger. The -twoup operation does the same, but scales the new sides down so that the media box is +unchanged. +
+
Sometimes, for instance to get a printing arrangement right, it’s useful to be able to insert blank pages +into a PDF file. cpdf can add blank pages before a given page or pages, or after. The pages in +question are specified by a range in the usual way: +
The +dimensions of the padded page are derived from the boxes (media box, crop box etc.) of the page after +or before which the padding is to be applied. +
The -pad-every n operation places a blank page after every n pages, excluding any last one. For +example… +
…on +a 9 page document adds a blank page after pages 3 and 6. +
The -pad-multiple n operation adds blank pages so the document has a multiple of n pages. For +example: +
+ + +
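The arithmetic behind -pad-every and -pad-multiple can be sketched in Python. This follows the description above and is not cpdf's implementation. Both function names are hypothetical.

```python
def pad_every_positions(page_count, n):
    """Pages after which a blank is added, excluding the last page."""
    # range() stops before page_count, so no blank follows the final page
    return list(range(n, page_count, n))


def pad_multiple_blanks(page_count, n):
    """Number of blank pages needed to reach a multiple of n pages."""
    return (-page_count) % n
```

For a 9 page document, pad_every_positions(9, 3) gives pages 3 and 6, matching the example above, and pad_multiple_blanks(9, 4) reports that three blank pages are needed to reach 12.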
+ + +
The -list-annotations operation prints the textual content of any annotations on the selected pages +to standard output. Each annotation is preceded by the page number and followed by a +newline. +
+
+
The -copy-annotations operation copies the annotations in the given page range from one file (the +file specified immediately after the option) to another pre-existing PDF. The range is specified after +this pre-existing PDF. The result is then written to an output file, specified in the usual way. + +
+
The -remove-annotations operation removes all annotations from the given page range. +
+ + +
+ + +
+
The -list-fonts operation prints the fonts in the document, one-per-line to standard output. For +example: + + +
+
The first column gives the page number, the second the internal unique font name, the third the type +of font (Type1, TrueType etc), the fourth the PDF font name, the fifth the PDF font +encoding. +
+
The -info option prints entries from the document information dictionary, and from any XMP +metadata to standard output. + + +
+
The details of the format for creation and modification dates can be found in Appendix A. +
By default, cpdf strips to ASCII, discarding character codes in excess of 127. In order to preserve +the original unicode, add the -utf8 option. To disable all postprocessing of the string, add +-raw. +
The -page-info option prints the page label, media box and other boxes page-by-page to standard +output, for all pages in the current range. +
+
Note that the format for boxes is minimum x, minimum y, maximum x, maximum y. +
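The box format printed here differs from the one accepted by -mediabox, which takes minimum x, minimum y, width, height. The following Python sketch converts between the two. It is only an illustration, and the function names are hypothetical.

```python
def size_to_corners(min_x, min_y, width, height):
    """Convert (min x, min y, width, height) to (min x, min y, max x, max y)."""
    return (min_x, min_y, min_x + width, min_y + height)


def corners_to_size(min_x, min_y, max_x, max_y):
    """Convert (min x, min y, max x, max y) to (min x, min y, width, height)."""
    return (min_x, min_y, max_x - min_x, max_y - min_y)
```

The two forms coincide only when the box has its origin at (0, 0). A box reported as 10 20 110 220, for instance, is 100 points wide and 200 points high.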
The -pages operation prints the number of pages in the file. +
+
+
The document information dictionary in a PDF file specifies various pieces of information about a +PDF. These can be consulted in a PDF viewer (for instance, Acrobat). +
Here is a summary of the commands for setting entries in the document information +dictionary: +
+
(The details of the format for creation and modification dates can be found in Appendix A. Using the +date "now" uses the time and date at which the command is executed. Note also that -producer and +-creator may be used to set the producer and/or the creator when writing any file, separate from the +operations described in this chapter.) +
For example, to set the title, the full command line would be +
+
The text string is considered to be in UTF8 format, unless the -raw option is added—in which case, it +is unprocessed, save for the replacement of any octal escape sequence such as \017, which is replaced +by a character of its value (here, 15). +
+
+
The -set-page-layout option specifies the page layout to be used when a document is opened in, for +instance, Acrobat. The possible (case-sensitive) values are: +
For instance: +
+
+
The page mode in a PDF file defines how a viewer should display the document when first opened. +The possible (case-sensitive) values are: +
For instance: +
+
+
For instance: +
+
The page a PDF file opens at can be set using -open-at-page: +
+
To have that page scaled to fit the window in the viewer, use -open-at-page-fit instead: +
+
+
PDF files can contain a piece of arbitrary metadata, often in XMP format. This is typically stored in +an uncompressed stream, so that other applications can read it without having to decode the whole +PDF. To set the metadata: +
+
To remove any metadata: +
To +print the current metadata to standard output: +
+
+
It is possible to add page labels to a document. These are not printed on the page, but may be +displayed alongside thumbnails or in print dialogue boxes by PDF readers. We use -add-page-labels +to do this, by default with decimal arabic numbers (1,2,3…). We can add -label-style to choose what +type of labels to add from these kinds: + + +
We can use -label-prefix to add a textual prefix to each label. Consider a file with twenty pages +and no current page labels (a PDF reader will assume 1,2,3…if there are none). We will add the +following page labels: +
i, ii, iii, iv, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, A-0, A-1, A-2, A-3, A-4, A-5 +
Here are the commands, in order: +
+
By default the labels begin at page number 1 for each range. To override this, we can use +-label-startval (we used 0 in the final command), where we want the numbers to begin at zero +rather than one. +
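The labelling scheme above can be modelled in a few lines of Python. This is a sketch of the arithmetic only, not cpdf's implementation. The style names "roman" and "decimal" are placeholders rather than values accepted by -label-style, and the page ranges 1-4, 5-14 and 15-20 are inferred from the twenty labels listed above.

```python
def to_roman(n):
    """Lowercase roman numeral for a positive integer."""
    numerals = [(1000, "m"), (900, "cm"), (500, "d"), (400, "cd"),
                (100, "c"), (90, "xc"), (50, "l"), (40, "xl"),
                (10, "x"), (9, "ix"), (5, "v"), (4, "iv"), (1, "i")]
    out = ""
    for value, symbol in numerals:
        while n >= value:
            out += symbol
            n -= value
    return out


def labels_for_range(first_page, last_page, style="decimal", prefix="", startval=1):
    """Labels for pages first_page..last_page; numbering restarts at startval."""
    render = to_roman if style == "roman" else str
    count = last_page - first_page + 1
    return [prefix + render(startval + i) for i in range(count)]


# One call per labelling command, applied in order
labels = (labels_for_range(1, 4, style="roman")
          + labels_for_range(5, 14)
          + labels_for_range(15, 20, prefix="A-", startval=0))
```

The final call passes startval=0, which is why the appendix labels begin at A-0 rather than A-1.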
Page labels may be removed altogether by using the -remove-page-labels command. To print the +page labels from an existing file, use -print-page-labels. For example: +
+ + +
+ + +
+
PDF supports adding attachments (files of any kind, including other PDFs) to an existing file. The +cpdf tool supports adding and removing document-level attachments — that is, ones which are +associated with the document as a whole rather than with an individual page, and also page-level +attachments, associated with a particular page. +
To add an attachment, use the -attach-file option. For instance, +
+attaches the Excel spreadsheet sheet.xls to the input file. If the file already has attachments, the +new file is added to their number. You can specify multiple files to be attached by using -attach-file +multiple times. They will be attached in the given order. +
The -to-page option can be used to specify that the files will be attached to the given +page, rather than at the document level. The -to-page option may be specified at most +once. +
+
To list all document- and page-level attachments, use the -list-attached-files operation. The page +number and filename of each attachment is given, page 0 representing a document-level +attachment. +
+
+
To remove all document-level and page-level attachments from a file, use the -remove-files +operation: +
+ + +
+ + +
+
To list all images in the given range of pages which fall below a given resolution (in dots-per-inch), use +the -image-resolution function: +
+
+
The format is page number, image name, x pixels, y pixels, x resolution, y resolution. The resolutions +refer to the image’s effective resolution at point of use (taking account of scaling, rotation +etc). + + +
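To make "effective resolution" concrete, here is a small Python sketch of the usual calculation, given that there are 72 points to the inch. How cpdf itself computes the figure is not stated here, so treat this as an assumption. The rule that an image counts as low-resolution when either axis falls below the threshold is also an assumption, and both function names are hypothetical.

```python
POINTS_PER_INCH = 72.0


def effective_dpi(pixels, size_in_points):
    """Pixels per inch for an image drawn size_in_points wide (or high)."""
    return pixels / (size_in_points / POINTS_PER_INCH)


def below_threshold(x_px, y_px, width_pt, height_pt, min_dpi):
    """True if the lower of the two axis resolutions is under min_dpi."""
    lowest = min(effective_dpi(x_px, width_pt), effective_dpi(y_px, height_pt))
    return lowest < min_dpi
```

An image of 600 pixels drawn two inches (144 points) wide therefore has an effective resolution of 300 dots per inch, whatever resolution is recorded in the image file.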
+ + +
+
In order to use a font other than the standard 14 with -add-text, it must be added to the file. The +font source PDF is given, together with the font’s resource name on a given page, and that +font is copied to all the pages in the input file’s range, and then written to the output +file. +
The font is named in the output file with its basefont name, so it can be easily used with +-add-text. +
For example, if the file fromfile.pdf has a font /GHLIGA+c128 with the name /F10 on page 1 +(this information can be found with -list-fonts), the following would copy the font to the file +in.pdf on all pages, writing the output to out.pdf: +
+
Text in this font can then be added by giving -font /GHLIGA+c128. Be aware that due to the vagaries +of PDF font handling concerning which characters are present in the source font, not all +characters may be available, or the encoding (mapping from input codes to glyphs) may be +non-obvious. +
+
To remove embedded fonts from a document, use -remove-fonts. PDF readers will substitute local +fonts for the missing fonts. The use of this function is only recommended when file size is the sole +consideration. +
+
+
The -missing-fonts operation lists any unembedded fonts in the document, one per line. +
+
The format is +
+ + +
+ + +
+
The -draft option removes bitmap (photographic) images from a file, so that it can be printed with +less ink. Optionally, the -boxes option can be added, filling the spaces left blank with a crossed box +denoting where the image was. This is not guaranteed to be fully visible in all cases (the +bitmap may have been partially covered by vector objects or clipped in the original). For +example: +
+
+
Sometimes PDF output from an application (for instance, a web browser) has text in colors which +would not print well on a grayscale printer. The -blacktext operation blackens all text on the given +pages so it will be readable when printed. +
This will not work on text which has been converted to outlines, nor on text which is part of a +form. +
The -blacklines operation blackens all lines on the given pages. +
The -blackfills operation blackens all fills on the given pages. + + +
+
Contrary to their names, all these operations can use another color, if specified with -color. +
+
Quite often, applications will use very thin lines, or even the value of 0, which in PDF means “The +thinnest possible line on the output device”. This might be fine for on-screen work, but when printed +on a high resolution device, such as by a commercial printer, they may be too faint, or +disappear altogether. The -thinlines option prevents this by changing all lines thinner than +<minimal thickness> to the given thickness. For example: +
+
+
Sometimes incremental updates to a file by an application, or bad applications, can leave data in a +PDF file which is no longer used. This function removes that unneeded data. +
+
+
To change the PDF version number, use the -set-version operation, giving the part of the version +number after the decimal point. For example: +
+This does not alter any of the actual data in the file — just the supposed version number. +
+
The -copy-id-from operation copies the ID from the given file to the input, writing to the +output. + + +
+
If there is no ID in the source file, the existing ID is retained. You cannot use -recrypt with +-copy-id-from. +
+
The -remove-id operation removes the ID from a document. +
+
You cannot use -recrypt with -remove-id. +
+
This operation lists the name of any “separation” color space in the given PDF file. +
+
+
This is for editing data within the PDF’s internal representation. Use with caution. +
+
+
The -remove-clipping operation removes any clipping paths from the file. +
+ + +
+
A contiguous prefix of the parts above can be used instead, for lower accuracy dates. For +example: +
+ + + +