pdftotext.cat (xpdf-4.03) | : | pdftotext.cat (xpdf-4.04) | ||
---|---|---|---|---|
pdftotext(1) General Commands Manual pdftotext(1) | pdftotext(1) General Commands Manual pdftotext(1) | |||
NAME | NAME | |||
pdftotext - Portable Document Format (PDF) to text converter (version | pdftotext - Portable Document Format (PDF) to text converter (version | |||
4.03) | 4.04) | |||
SYNOPSIS | SYNOPSIS | |||
pdftotext [options] [PDF-file [text-file]] | pdftotext [options] [PDF-file [text-file]] | |||
DESCRIPTION | DESCRIPTION | |||
Pdftotext converts Portable Document Format (PDF) files to plain text. | Pdftotext converts Portable Document Format (PDF) files to plain text. | |||
Pdftotext reads the PDF file, PDF-file, and writes a text file, text- | Pdftotext reads the PDF file, PDF-file, and writes a text file, text- | |||
file. If text-file is not specified, pdftotext converts file.pdf to | file. If text-file is not specified, pdftotext converts file.pdf to | |||
file.txt. If text-file is '-', the text is sent to stdout. | file.txt. If text-file is '-', the text is sent to stdout. | |||
CONFIGURATION FILE | CONFIGURATION FILE | |||
Pdftotext reads a configuration file at startup. It first tries to | Pdftotext reads a configuration file at startup. It first tries to | |||
find the user's private config file, ~/.xpdfrc. If that doesn't exist, | find the user's private config file, ~/.xpdfrc. If that doesn't exist, | |||
it looks for a system-wide config file, typically /usr/local/etc/xpdfrc | it looks for a system-wide config file, typically /etc/xpdfrc (but this | |||
(but this location can be changed when pdftotext is built). See the | location can be changed when pdftotext is built). See the xpdfrc(5) | |||
xpdfrc(5) man page for details. | man page for details. | |||
OPTIONS | OPTIONS | |||
Many of the following options can be set with configuration file com- | Many of the following options can be set with configuration file com- | |||
mands. These are listed in square brackets with the description of the | mands. These are listed in square brackets with the description of the | |||
corresponding command line option. | corresponding command line option. | |||
-f number | -f number | |||
Specifies the first page to convert. | Specifies the first page to convert. | |||
-l number | -l number | |||
skipping to change at line 102 | skipping to change at line 102 | |||
Sets the encoding to use for text output. The encoding-name | Sets the encoding to use for text output. The encoding-name | |||
must be defined with the unicodeMap command (see xpdfrc(5)). | must be defined with the unicodeMap command (see xpdfrc(5)). | |||
The encoding name is case-sensitive. This defaults to "Latin1" | The encoding name is case-sensitive. This defaults to "Latin1" | |||
(which is a built-in encoding). [config file: textEncoding] | (which is a built-in encoding). [config file: textEncoding] | |||
-eol unix | dos | mac | -eol unix | dos | mac | |||
Sets the end-of-line convention to use for text output. [config | Sets the end-of-line convention to use for text output. [config | |||
file: textEOL] | file: textEOL] | |||
-nopgbrk | -nopgbrk | |||
Don't insert page breaks (form feed characters) between pages. | Don't insert a page break (form feed character) at the end of | |||
[config file: textPageBreaks] | each page. [config file: textPageBreaks] | |||
-bom Insert a Unicode byte order marker (BOM) at the start of the | -bom Insert a Unicode byte order marker (BOM) at the start of the | |||
text output. | text output. | |||
-marginl number | -marginl number | |||
Specifies the left margin, in points. Text in the left margin | Specifies the left margin, in points. Text in the left margin | |||
(i.e., within that many points of the left edge of the page) is | (i.e., within that many points of the left edge of the page) is | |||
discarded. The default value is zero. | discarded. The default value is zero. | |||
-marginr number | -marginr number | |||
skipping to change at line 135 | skipping to change at line 135 | |||
gin (i.e., within that many points of the bottom edge of the | gin (i.e., within that many points of the bottom edge of the | |||
page) is discarded. The default value is zero. | page) is discarded. The default value is zero. | |||
-opw password | -opw password | |||
Specify the owner password for the PDF file. Providing this | Specify the owner password for the PDF file. Providing this | |||
will bypass all security restrictions. | will bypass all security restrictions. | |||
-upw password | -upw password | |||
Specify the user password for the PDF file. | Specify the user password for the PDF file. | |||
-verbose | ||||
Print a status message (to stdout) before processing each page. | ||||
[config file: printStatusInfo] | ||||
-q Don't print any messages or errors. [config file: errQuiet] | -q Don't print any messages or errors. [config file: errQuiet] | |||
-cfg config-file | -cfg config-file | |||
Read config-file in place of ~/.xpdfrc or the system-wide config | Read config-file in place of ~/.xpdfrc or the system-wide config | |||
file. | file. | |||
-listencodings | -listencodings | |||
List all available text output encodings, then exit. | List all available text output encodings, then exit. | |||
-v Print copyright and version information, then exit. | -v Print copyright and version information, then exit. | |||
-h Print usage information, then exit. (-help and --help are | -h Print usage information, then exit. (-help and --help are | |||
equivalent.) | equivalent.) | |||
BUGS | BUGS | |||
Some PDF files contain fonts whose encodings have been mangled beyond | Some PDF files contain fonts whose encodings have been mangled beyond | |||
recognition. There is no way (short of OCR) to extract text from these | recognition. There is no way (short of OCR) to extract text from these | |||
files. | files. | |||
EXIT CODES | EXIT CODES | |||
The Xpdf tools use the following exit codes: | The Xpdf tools use the following exit codes: | |||
0 No error. | 0 No error. | |||
1 Error opening a PDF file. | 1 Error opening a PDF file. | |||
2 Error opening an output file. | 2 Error opening an output file. | |||
3 Error related to PDF permissions. | 3 Error related to PDF permissions. | |||
99 Other error. | 99 Other error. | |||
AUTHOR | AUTHOR | |||
The pdftotext software and documentation are copyright 1996-2021 Glyph | The pdftotext software and documentation are copyright 1996-2022 Glyph | |||
& Cog, LLC. | & Cog, LLC. | |||
SEE ALSO | SEE ALSO | |||
xpdf(1), pdftops(1), pdftohtml(1), pdfinfo(1), pdffonts(1), pdfde- | xpdf(1), pdftops(1), pdftohtml(1), pdfinfo(1), pdffonts(1), pdfde- | |||
tach(1), pdftoppm(1), pdftopng(1), pdfimages(1), xpdfrc(5) | tach(1), pdftoppm(1), pdftopng(1), pdfimages(1), xpdfrc(5) | |||
http://www.xpdfreader.com/ | http://www.xpdfreader.com/ | |||
28 Jan 2021 pdftotext(1) | 18 Apr 2022 pdftotext(1) | |||
End of changes. 8 change blocks. | ||||
9 lines changed or deleted | 13 lines changed or added |
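
The options and exit codes shown in both versions above can be exercised from a script. The following is a minimal sketch, not part of the xpdf distribution: the helper names `build_command`, `run_pdftotext`, and `describe_exit` are hypothetical, and it assumes a `pdftotext` binary from xpdf is on the PATH. It uses only flags that appear in the diff (`-f`, `-l`, `-eol`, `-nopgbrk`, `-q`) and the exit codes listed under EXIT CODES.

```python
import subprocess

# Exit codes as listed in the EXIT CODES section of the man page.
EXIT_CODES = {
    0: "No error.",
    1: "Error opening a PDF file.",
    2: "Error opening an output file.",
    3: "Error related to PDF permissions.",
    99: "Other error.",
}

# End-of-line conventions accepted by -eol, per the man page.
EOL_CHOICES = ("unix", "dos", "mac")


def build_command(pdf_file, text_file=None, first=None, last=None,
                  eol=None, no_page_breaks=False, quiet=False):
    """Build an argv list for pdftotext (hypothetical helper).

    Follows the synopsis: pdftotext [options] [PDF-file [text-file]].
    Pass text_file="-" to send the text to stdout.
    """
    cmd = ["pdftotext"]
    if first is not None:
        cmd += ["-f", str(first)]      # first page to convert
    if last is not None:
        cmd += ["-l", str(last)]       # last page to convert
    if eol is not None:
        if eol not in EOL_CHOICES:
            raise ValueError(f"eol must be one of {EOL_CHOICES}")
        cmd += ["-eol", eol]
    if no_page_breaks:
        cmd.append("-nopgbrk")         # no form feed characters
    if quiet:
        cmd.append("-q")               # suppress messages and errors
    cmd.append(pdf_file)
    if text_file is not None:
        cmd.append(text_file)
    return cmd


def describe_exit(code):
    """Translate an exit code into the man page's description."""
    return EXIT_CODES.get(code, f"Unknown exit code {code}.")


def run_pdftotext(pdf_file, **options):
    """Run pdftotext, writing to stdout, and return (text, exit message).

    Assumes the pdftotext binary is installed; output is returned as
    raw bytes because the encoding depends on the -enc setting.
    """
    cmd = build_command(pdf_file, "-", **options)
    result = subprocess.run(cmd, capture_output=True)
    return result.stdout, describe_exit(result.returncode)
```

Keeping command construction separate from execution makes the argument list easy to inspect before anything is run. For example, `build_command("report.pdf", "-", first=1, last=2)` yields `["pdftotext", "-f", "1", "-l", "2", "report.pdf", "-"]`.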