"Fossies" - the Fresh Open Source Software Archive  

Source code changes of the file "src/parallel.texi" between
parallel-20210122.tar.bz2 and parallel-20210222.tar.bz2

About: GNU Parallel is a shell tool for executing jobs in parallel using multiple CPU cores and/or multiple computers.

parallel.texi (parallel-20210122.tar.bz2) vs. parallel.texi (parallel-20210222.tar.bz2)
skipping to change at line 16
@settitle parallel - build and execute shell command lines from standard input in parallel
@node Top
@top parallel
@menu
* NAME::
* SYNOPSIS::
* DESCRIPTION::
* OPTIONS::
* EXAMPLES::
* SPREADING BLOCKS OF DATA::
* QUOTING::
* LIST RUNNING JOBS::
* COMPLETE RUNNING JOBS BUT DO NOT START NEW JOBS::
* ENVIRONMENT VARIABLES::
* DEFAULT PROFILE (CONFIG FILE)::
* PROFILE FILES::
* EXIT STATUS::
* DIFFERENCES BETWEEN GNU Parallel AND ALTERNATIVES::
* BUGS::
skipping to change at line 409 (old) / line 344 (new)
without extension. It is a combination of @strong{@{}@emph{n}@strong{@}}, @strong{@{/@}}, and
@strong{@{.@}}.
This positional replacement string will be replaced by the input from
input source @emph{n} (when used with @strong{-a} or @strong{::::}) or with the
@emph{n}'th argument (when used with @strong{-N}). The input will have the
directory (if any) and extension removed.
To understand positional replacement strings see @strong{@{}@emph{n}@strong{@}}.
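For instance (an illustrative sketch; the file names are invented),
@strong{@{2/.@}} picks the second input source's argument and strips
both its directory and its extension:
@verbatim
parallel echo {1} {2/.} ::: A B ::: dir/file.txt
@end verbatim
This prints 'A file' and 'B file'.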
@item @strong{@{=}@emph{perl expression}@strong{=@}} (alpha testing)
@anchor{@strong{@{=}@emph{perl expression}@strong{=@}} (alpha testing)}
Replace with calculated @emph{perl expression}. @strong{$_} will contain the
same as @strong{@{@}}. After evaluating @emph{perl expression} @strong{$_} will be used
as the value. It is recommended to only change $_ but you have full
access to all of GNU @strong{parallel}'s internal functions and data
structures.
The expression must give the same result if evaluated twice -
otherwise the behaviour is undefined. E.g. this will not work as expected:
@verbatim
parallel echo '{= $_= ++$wrong_counter =}' ::: a b c
@end verbatim
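By contrast, an expression that only transforms @strong{$_} gives the
same result every time it is evaluated, so it is safe. An illustrative
sketch:
@verbatim
parallel echo '{= $_=uc($_) =}' ::: a b c
@end verbatim
This prints A, B, and C no matter how many times the expression is
evaluated.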
A few convenience functions and data structures have been made:
@table @asis
@item @strong{Q(}@emph{string}@strong{)}
@anchor{@strong{Q(}@emph{string}@strong{)}}
shell quote a string
@item @strong{pQ(}@emph{string}@strong{)}
@anchor{@strong{pQ(}@emph{string}@strong{)}}
skipping to change at line 882 (old) / line 825 (new)
Even quoted newlines are parsed correctly:
@verbatim
(echo '"Start of field 1 with newline'
 echo 'Line 2 in field 1";value 2') |
  parallel --csv --colsep ';' echo Field 1: {1} Field 2: {2}
@end verbatim
When used with @strong{--pipe} only pass full CSV-records.
@item @strong{--delay} @emph{mytime}
@anchor{@strong{--delay} @emph{mytime}}
Delay starting next job by @emph{mytime}. GNU @strong{parallel} will pause
@emph{mytime} after starting each job. @emph{mytime} is normally in seconds,
but can be floats postfixed with @strong{s}, @strong{m}, @strong{h}, or @strong{d} which would
multiply the float by 1, 60, 3600, or 86400. Thus these are
equivalent: @strong{--delay 100000} and @strong{--delay 1d3.5h16.6m4s}.
If you append 'auto' to @emph{mytime} (e.g. 13m3sauto) GNU @strong{parallel} will
automatically try to find the optimal value: If a job fails, @emph{mytime}
is doubled. If a job succeeds, @emph{mytime} is decreased by 10%.
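A minimal sketch: start jobs at most one per half second:
@verbatim
parallel --delay 0.5 echo Starting {} ::: 1 2 3
@end verbatim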
skipping to change at line 1020 (old) / line 963 (new)
@strong{--pipepart} will give data to the program on stdin (standard
input). With @strong{--fifo} GNU @strong{parallel} will create a temporary fifo
with the name in @strong{@{@}}, so you can do: @strong{parallel --pipe --fifo wc @{@}}.
Beware: If data is not read from the fifo, the job will block forever.
Implies @strong{--pipe} unless @strong{--pipepart} is used.
See also: @strong{--cat}.
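An illustrative sketch that counts lines per 1 MB block by letting
each job read its block from the fifo:
@verbatim
seq 1000000 | parallel --pipe --fifo --block 1M wc -l {}
@end verbatim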
@item @strong{--filter} @emph{filter} (alpha testing)
@anchor{@strong{--filter} @emph{filter} (alpha testing)}
Only run jobs where @emph{filter} is true. @emph{filter} can contain
replacement strings and Perl code. Example:
@verbatim
parallel --filter '{1} < {2}+1' echo ::: {1..3} ::: {1..3}
@end verbatim
Outputs: 1,1 1,2 1,3 2,2 2,3 3,3
@item @strong{--filter-hosts}
@anchor{@strong{--filter-hosts}}
Remove down hosts. For each remote host: check that login through ssh
works. If not: do not use this host.
For performance reasons, this check is performed only at the start and
every time @strong{--sshloginfile} is changed. If a host goes down after
the first check, it will go undetected until @strong{--sshloginfile} is
changed; @strong{--retries} can be used to mitigate this.
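An illustrative invocation (the host names are placeholders);
unreachable hosts are silently skipped:
@verbatim
parallel --filter-hosts -S server1,server2,server3 echo ::: 1 2 3
@end verbatim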
skipping to change at line 1058 (old) / line 1013 (new)
followed by stderr (standard error).
This takes in the order of 0.5ms per job and depends on the speed of
your disk for larger output. It can be disabled with @strong{-u}, but this
means output from different commands can get mixed.
@strong{--group} is the default. Can be reversed with @strong{-u}.
See also: @strong{--line-buffer} @strong{--ungroup}
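To see the difference, compare grouped and ungrouped output of two
jobs that print both before and after sleeping (a sketch):
@verbatim
parallel --group 'echo start {}; sleep {}; echo end {}' ::: 2 1
parallel -u 'echo start {}; sleep {}; echo end {}' ::: 2 1
@end verbatim
The first keeps each job's output together; the second may mix the
lines of the two jobs.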
@item @strong{--group-by} @emph{val}
@anchor{@strong{--group-by} @emph{val}}
Group input by value. Combined with @strong{--pipe}/@strong{--pipepart}
@strong{--group-by} groups lines with the same value into a record.
The value can be computed from the full line or from a single column.
@emph{val} can be:
@table @asis
@item column number
skipping to change at line 1277 (old) / line 1232 (new)
replacement variables: @strong{@{column name@}}, @strong{@{column name/@}}, @strong{@{column
name//@}}, @strong{@{column name/.@}}, @strong{@{column name.@}}, @strong{@{=column name perl
expression =@}}, ..
For @strong{--pipe} the matched header will be prepended to each output.
@strong{--header :} is an alias for @strong{--header '.*\n'}.
If @emph{regexp} is a number, it is a fixed number of lines.
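A minimal sketch of named columns (the column name 'name' is
invented): with @strong{--header :} the first argument of the input
source becomes the header:
@verbatim
parallel --header : echo Hello {name} ::: name Alice Bob
@end verbatim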
@item @strong{--hostgroups} (beta testing)
@anchor{@strong{--hostgroups} (beta testing)}
@item @strong{--hgrp} (beta testing)
@anchor{@strong{--hgrp} (beta testing)}
Enable hostgroups on arguments. If an argument contains '@@' the string
after '@@' will be removed and treated as a list of hostgroups on which
this job is allowed to run. If there is no @strong{--sshlogin} with a
corresponding group, the job will run on any hostgroup.
Example:
@verbatim
parallel --hostgroups \
skipping to change at line 1573 (old) / line 1528 (new)
mix. Compare:
@verbatim
parallel -j0 'echo {};sleep {};echo {}' ::: 1 3 2 4
parallel -j0 --lb 'echo {};sleep {};echo {}' ::: 1 3 2 4
parallel -j0 -k --lb 'echo {};sleep {};echo {}' ::: 1 3 2 4
@end verbatim
See also: @strong{--group} @strong{--ungroup}
@item @strong{--xapply} (alpha testing)
@anchor{@strong{--xapply} (alpha testing)}
@item @strong{--link} (alpha testing)
@anchor{@strong{--link} (alpha testing)}
Link input sources. Read multiple input sources like @strong{xapply}. If
multiple input sources are given, one argument will be read from each
of the input sources. The arguments can be accessed in the command as
@strong{@{1@}} .. @strong{@{}@emph{n}@strong{@}}, so @strong{@{1@}} will be a line from the first input
source, and @strong{@{6@}} will refer to the line with the same line number
from the 6th input source.
Compare these two:
skipping to change at line 1656 (old) / line 1611 (new)
only start as many as there is memory for. If less than @emph{size} bytes
are free, no more jobs will be started. If less than 50% @emph{size} bytes
are free, the youngest job will be killed, and put back on the queue
to be run later.
@strong{--retries} must be set to determine how many times GNU @strong{parallel}
should retry a given job.
See also: @strong{--memsuspend}
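An illustrative combination (the job name is a placeholder): only
start a job when at least 1 GB is free, and retry killed jobs up to
10 times:
@verbatim
parallel --memfree 1G --retries 10 ./memhungry {} ::: inputs/*
@end verbatim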
@item @strong{--memsuspend} @emph{size} (beta testing)
@anchor{@strong{--memsuspend} @emph{size} (beta testing)}
Suspend jobs when there is less than 2 * @emph{size} memory free. The
@emph{size} can be postfixed with K, M, G, T, P, k, m, g, t, or p which
would multiply the size with 1024, 1048576, 1073741824, 1099511627776,
1125899906842624, 1000, 1000000, 1000000000, 1000000000000, or
1000000000000000, respectively.
If the available memory falls below 2 * @emph{size}, GNU @strong{parallel}
will suspend some of the running jobs. If the available memory falls
below @emph{size}, only one job will be running.
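A sketch (the command is a placeholder): suspend jobs when free
memory drops below 8 GB, and throttle to a single job below 4 GB:
@verbatim
parallel --memsuspend 4G ./bigjob {} ::: data/*
@end verbatim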
skipping to change at line 1772 (old) / line 1727 (new)
@anchor{@strong{--outputasfiles}}
@item @strong{--files}
@anchor{@strong{--files}}
Instead of printing the output to stdout (standard output) the output
of each job is saved in a file and the filename is then printed.
See also: @strong{--results}
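For instance (illustrative), the printed file names can be collected
and processed later:
@verbatim
parallel --files echo job {} ::: 1 2 3 > filelist
cat filelist | parallel 'cat {}; rm {}'
@end verbatim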
@item @strong{--pipe}
@anchor{@strong{--pipe}}
@item @strong{--spreadstdin}
@anchor{@strong{--spreadstdin}}
Spread input to jobs on stdin (standard input). Read a block of data
from stdin (standard input) and give one block of data as input to one
job.
The block size is determined by @strong{--block}. The strings @strong{--recstart}
and @strong{--recend} tell GNU @strong{parallel} how a record starts and/or
ends. The block read will have the final partial record removed before
the block is passed on to the job. The partial record will be
prepended to the next block.
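A classic illustration (the input file is a placeholder): split a
large stream into blocks of roughly 10 MB and count lines in each:
@verbatim
cat bigfile | parallel --pipe --block 10M wc -l
@end verbatim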
skipping to change at line 1844 (old) / line 1799 (new)
@anchor{@strong{--plus}}
Activate additional replacement strings: @{+/@} @{+.@} @{+..@} @{+...@} @{..@}
@{...@} @{/..@} @{/...@} @{##@}. The idea being that '@{+foo@}' matches the opposite of
'@{foo@}' and @{@} = @{+/@}/@{/@} = @{.@}.@{+.@} = @{+/@}/@{/.@}.@{+.@} = @{..@}.@{+..@} =
@{+/@}/@{/..@}.@{+..@} = @{...@}.@{+...@} = @{+/@}/@{/...@}.@{+...@}
@strong{@{##@}} is the total number of jobs to be run. It is incompatible with
@strong{-X}/@strong{-m}/@strong{--xargs}.
@strong{@{0%@}} zero-padded jobslot. (alpha testing)
@strong{@{0#@}} zero-padded sequence number. (alpha testing)
@strong{@{choose_k@}} is inspired by n choose k: Given a list of n elements,
choose k. k is the number of input sources and n is the number of
arguments in an input source. The content of the input sources must
be the same and the arguments must be unique.
Shorthands for variables:
@verbatim
{slot}       $PARALLEL_JOBSLOT (see {%})
{sshlogin}   $PARALLEL_SSHLOGIN
skipping to change at line 2155 (old) / line 2114 (new)
If @emph{name} ends in @strong{.csv}/@strong{.tsv} the output will be a CSV-file
named @emph{name}.
@strong{.csv} gives a comma separated value file. @strong{.tsv} gives a TAB
separated value file.
@strong{-.csv}/@strong{-.tsv} are special: It will give the file on stdout
(standard output).
@strong{JSON file output}
If @emph{name} ends in @strong{.json} the output will be a JSON-file
named @emph{name}.
@strong{-.json} is special: It will give the file on stdout (standard
output).
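A minimal sketch of both forms:
@verbatim
parallel --results out.csv echo ::: a b c
parallel --results out.json echo ::: a b c
@end verbatim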
@strong{Replacement string output file}
If @emph{name} contains a replacement string and the replaced result does
not end in /, then the standard output will be stored in a file named
by this result. Standard error will be stored in the same file name
with '.err' added, and the sequence number will be stored in the same
file name with '.seq' added.
E.g.
@verbatim
skipping to change at line 2785 (old) / line 2744 (new)
If @strong{--sqlworker} runs on the local machine, the hostname in the SQL
table will not be ':' but instead the hostname of the machine.
@item @strong{--ssh} @emph{sshcommand}
@anchor{@strong{--ssh} @emph{sshcommand}}
GNU @strong{parallel} defaults to using @strong{ssh} for remote access. This can
be overridden with @strong{--ssh}. It can also be set on a per server
basis (see @strong{--sshlogin}).
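For example (illustrative; the server name and key file are
placeholders), to pass extra options to ssh:
@verbatim
parallel --ssh "ssh -i ~/.ssh/id_rsa" -S server echo ::: ok
@end verbatim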
@item @strong{--sshdelay} @emph{mytime}
@anchor{@strong{--sshdelay} @emph{mytime}}
Delay starting next ssh by @emph{mytime}. GNU @strong{parallel} will not start
another ssh for the next @emph{mytime}.
For details on @emph{mytime} see @strong{--delay}.
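This helps when many simultaneous ssh connections would trigger rate
limiting in sshd. A sketch (the host name is a placeholder):
@verbatim
parallel --sshdelay 0.2 -S server echo ::: 1 2 3
@end verbatim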
@item @strong{-S} @emph{[@@hostgroups/][ncpus/]sshlogin[,[@@hostgroups/][ncpus/]sshlogin[,...]]}
@anchor{@strong{-S} @emph{[@@hostgroups/][ncpus/]sshlogin[@comma{}[@@hostgroups/][ncpus/]sshlogin[@comma{}...]]}}
@item @strong{-S} @emph{@@hostgroup}
skipping to change at line 2932 (old) / line 2891 (new)
Use the replacement string @emph{replace-str} instead of @strong{@{%@}} for
job slot number.
@item @strong{--silent}
@anchor{@strong{--silent}}
Silent. The job to be run will not be printed. This is the default.
Can be reversed with @strong{-v}.
@item @strong{--template} @emph{file}=@emph{repl} (alpha testing)
@anchor{@strong{--template} @emph{file}=@emph{repl} (alpha testing)}
@item @strong{--tmpl} @emph{file}=@emph{repl} (alpha testing)
@anchor{@strong{--tmpl} @emph{file}=@emph{repl} (alpha testing)}
Copy @emph{file} to @emph{repl}. All replacement strings in the contents of
@emph{file} will be replaced. All replacement strings in the name @emph{repl}
will be replaced.
With @strong{--cleanup} the new file will be removed when the job is done.
If @emph{my.tmpl} contains this:
@verbatim
Xval: {x}
Yval: {y}
FixedValue: 9
# x with 2 decimals
DecimalX: {=x $_=sprintf("%.2f",$_) =}
TenX: {=x $_=$_*10 =}
RandomVal: {=1 $_=rand() =}
@end verbatim
it can be used like this:
@verbatim
myprog() { echo Using "$@"; cat "$@"; }
export -f myprog
parallel --cleanup --header : --tmpl my.tmpl={#}.t myprog {#}.t \
::: x 1.234 2.345 3.45678 ::: y 1 2 3
@end verbatim
@item @strong{--tty}
@anchor{@strong{--tty}}
Open terminal tty. If GNU @strong{parallel} is used for starting a program
that accesses the tty (such as an interactive program) then this
option may be needed. It will default to starting only one job at a
time (i.e. @strong{-j1}), not buffer the output (i.e. @strong{-u}), and it will
open a tty for the job.
You can of course override @strong{-j1} and @strong{-u}.
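For instance (illustrative; the file names are placeholders), edit a
set of files one at a time in an interactive editor:
@verbatim
parallel --tty vim ::: file1 file2
@end verbatim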
skipping to change at line 3328 (old) / line 3320 (new)
line. If @strong{@{@}} is used multiple times each @strong{@{@}} will be replaced
with all the arguments.
Support for @strong{--xargs} with @strong{--sshlogin} is limited and may fail.
See also @strong{-X} for context replace. If in doubt use @strong{-X} as that will
most likely do what is needed.
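A minimal illustration: fit as many arguments as possible on a single
command line:
@verbatim
seq 10 | parallel --xargs echo
@end verbatim
This prints all ten numbers on one line.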
@end table
@node EXAMPLES
@chapter EXAMPLES
@menu
* EXAMPLE@asis{:} Working as xargs -n1. Argument appending::
* EXAMPLE@asis{:} Simple network scanner::
* EXAMPLE@asis{:} Reading arguments from command line::
* EXAMPLE@asis{:} Inserting multiple arguments::
* EXAMPLE@asis{:} Context replace::
* EXAMPLE@asis{:} Compute intensive jobs and substitution::
* EXAMPLE@asis{:} Substitution and redirection::
* EXAMPLE@asis{:} Composed commands::
* EXAMPLE@asis{:} Composed command with perl replacement string::
* EXAMPLE@asis{:} Composed command with multiple input sources::
* EXAMPLE@asis{:} Calling Bash functions::
* EXAMPLE@asis{:} Function tester::
* EXAMPLE@asis{:} Continuously show the latest line of output::
* EXAMPLE@asis{:} Log rotate::
* EXAMPLE@asis{:} Removing file extension when processing files::
* EXAMPLE@asis{:} Removing strings from the argument::
* EXAMPLE@asis{:} Download 24 images for each of the past 30 days::
* EXAMPLE@asis{:} Download world map from NASA::
* EXAMPLE@asis{:} Download Apollo-11 images from NASA using jq::
* EXAMPLE@asis{:} Download video playlist in parallel::
* EXAMPLE@asis{:} Prepend last modified date (ISO8601) to file name::
* EXAMPLE@asis{:} Save output in ISO8601 dirs::
* EXAMPLE@asis{:} Digital clock with "blinking" @asis{:}::
* EXAMPLE@asis{:} Aggregating content of files::
* EXAMPLE@asis{:} Breadth first parallel web crawler/mirrorer::
* EXAMPLE@asis{:} Process files from a tar file while unpacking::
* EXAMPLE@asis{:} Rewriting a for-loop and a while-read-loop::
* EXAMPLE@asis{:} Rewriting nested for-loops::
* EXAMPLE@asis{:} Finding the lowest difference between files::
* EXAMPLE@asis{:} for-loops with column names::
* EXAMPLE@asis{:} All combinations in a list::
* EXAMPLE@asis{:} From a to b and b to c::
* EXAMPLE@asis{:} Count the differences between all files in a dir::
* EXAMPLE@asis{:} Speeding up fast jobs::
* EXAMPLE@asis{:} Using shell variables::
* EXAMPLE@asis{:} Group output lines::
* EXAMPLE@asis{:} Tag output lines::
* EXAMPLE@asis{:} Colorize output::
* EXAMPLE@asis{:} Keep order of output same as order of input::
* EXAMPLE@asis{:} Parallel grep::
* EXAMPLE@asis{:} Grepping n lines for m regular expressions.::
* EXAMPLE@asis{:} Using remote computers::
* EXAMPLE@asis{:} Transferring of files::
* EXAMPLE@asis{:} Distributing work to local and remote computers::
* EXAMPLE@asis{:} Running the same command on remote computers::
* EXAMPLE@asis{:} Running 'sudo' on remote computers::
* EXAMPLE@asis{:} Using remote computers behind NAT wall::
* EXAMPLE@asis{:} Parallelizing rsync::
* EXAMPLE@asis{:} Use multiple inputs in one command::
* EXAMPLE@asis{:} Use a table as input::
* EXAMPLE@asis{:} Output to database::
* EXAMPLE@asis{:} Output to CSV-file for R::
* EXAMPLE@asis{:} Use XML as input::
* EXAMPLE@asis{:} Run the same command 10 times::
* EXAMPLE@asis{:} Working as cat | sh. Resource inexpensive jobs and evaluation::
* EXAMPLE@asis{:} Call program with FASTA sequence::
* EXAMPLE@asis{:} Processing a big file using more CPUs::
* EXAMPLE@asis{:} Grouping input lines::
* EXAMPLE@asis{:} Running more than 250 jobs workaround::
* EXAMPLE@asis{:} Working as mutex and counting semaphore::
* EXAMPLE@asis{:} Mutex for a script::
* EXAMPLE@asis{:} Start editor with filenames from stdin (standard input)::
* EXAMPLE@asis{:} Running sudo::
* EXAMPLE@asis{:} GNU Parallel as queue system/batch manager::
* EXAMPLE@asis{:} GNU Parallel as dir processor::
* EXAMPLE@asis{:} Locate the missing package::
@end menu
@node EXAMPLE: Working as xargs -n1. Argument appending
@section EXAMPLE: Working as xargs -n1. Argument appending
GNU @strong{parallel} can work similar to @strong{xargs -n1}.
To compress all html files using @strong{gzip} run:
@verbatim
find . -name '*.html' | parallel gzip --best
@end verbatim
If the file names may contain a newline use @strong{-0}. Substitute FOO BAR with
FUBAR in all files in this dir and subdirs:
@verbatim
find . -type f -print0 | \
  parallel -q0 perl -i -pe 's/FOO BAR/FUBAR/g'
@end verbatim
Note @strong{-q} is needed because of the space in 'FOO BAR'.
@node EXAMPLE: Simple network scanner
@section EXAMPLE: Simple network scanner
@strong{prips} can generate IP-addresses from CIDR notation. With GNU
@strong{parallel} you can build a simple network scanner to see which
addresses respond to @strong{ping}:
@verbatim
prips 130.229.16.0/20 | \
  parallel --timeout 2 -j0 \
    'ping -c 1 {} >/dev/null && echo {}' 2>/dev/null
@end verbatim
@node EXAMPLE: Reading arguments from command line
@section EXAMPLE: Reading arguments from command line
GNU @strong{parallel} can take the arguments from command line instead of
stdin (standard input). To compress all html files in the current dir
using @strong{gzip} run:
@verbatim
parallel gzip --best ::: *.html
@end verbatim
To convert *.wav to *.mp3 using LAME running one process per CPU run:
@verbatim
parallel lame {} -o {.}.mp3 ::: *.wav
@end verbatim
@node EXAMPLE: Inserting multiple arguments
@section EXAMPLE: Inserting multiple arguments
When moving a lot of files like this: @strong{mv *.log destdir} you will
sometimes get the error:
@verbatim
bash: /bin/mv: Argument list too long
@end verbatim
because there are too many files. You can instead do:
skipping to change at line 3409 (old) / line 3473 (new)
ls | grep -E '\.log$' | parallel -m mv {} destdir
@end verbatim
In many shells you can also use @strong{printf}:
@verbatim
printf '%s\0' *.log | parallel -0 -m mv {} destdir
@end verbatim
@node EXAMPLE: Context replace
@section EXAMPLE: Context replace
To remove the files @emph{pict0000.jpg} .. @emph{pict9999.jpg} you could do:
@verbatim
seq -w 0 9999 | parallel rm pict{}.jpg
@end verbatim
You could also do:
@verbatim
skipping to change at line 3437 (old) / line 3501 (new)
You could also run:
@verbatim
seq -w 0 9999 | parallel -X rm pict{}.jpg
@end verbatim
This will also only run @strong{rm} as many times as needed to keep the
command line length short enough.
@node EXAMPLE: Compute intensive jobs and substitution
@section EXAMPLE: Compute intensive jobs and substitution
If ImageMagick is installed this will generate a thumbnail of a jpg
file:
@verbatim
convert -geometry 120 foo.jpg thumb_foo.jpg
@end verbatim
This will run with number-of-cpus jobs in parallel for all jpg files
in a directory:
skipping to change at line 3474 (old) / line 3538 (new)
Use @strong{@{.@}} to avoid the extra .jpg in the file name. This command will
make files like ./foo/bar_thumb.jpg:
@verbatim
find . -name '*.jpg' | \
  parallel convert -geometry 120 {} {.}_thumb.jpg
@end verbatim
@node EXAMPLE: Substitution and redirection
@section EXAMPLE: Substitution and redirection
This will generate an uncompressed version of .gz-files next to the .gz-file:
@verbatim
parallel zcat {} ">"{.} ::: *.gz
@end verbatim
Quoting of > is necessary to postpone the redirection. Another
solution is to quote the whole command:
@verbatim
parallel "zcat {} >{.}" ::: *.gz
@end verbatim
Other special shell characters (such as * ; $ > < | >> <<) also need
to be put in quotes, as they may otherwise be interpreted by the shell
and not given to GNU @strong{parallel}.
@node EXAMPLE: Composed commands
@section EXAMPLE: Composed commands
A job can consist of several commands. This will print the number of
files in each directory:
@verbatim
ls | parallel 'echo -n {}" "; ls {}|wc -l'
@end verbatim
To put the output in a file called <name>.dir:
skipping to change at line 3541 (old) / line 3605 (new)
find mirror_dir -type l | parallel -m rm {} '&&' touch {}
@end verbatim
Find the files in a list that do not exist:
@verbatim
cat file_list | parallel 'if [ ! -e {} ] ; then echo {}; fi'
@end verbatim
@node EXAMPLE: Composed command with perl replacement string
@section EXAMPLE: Composed command with perl replacement string
You have a bunch of files. You want them sorted into dirs. The dir of
each file should be named the first letter of the file name.
@verbatim
parallel 'mkdir -p {=s/(.).*/$1/=}; mv {} {=s/(.).*/$1/=}' ::: *
@end verbatim
@node EXAMPLE: Composed command with multiple input sources
@section EXAMPLE: Composed command with multiple input sources
You have a dir with files named as 24 hours in 5 minute intervals:
00:00, 00:05, 00:10 .. 23:55. You want to find the files missing:
@verbatim
parallel [ -f {1}:{2} ] "||" echo {1}:{2} does not exist \
  ::: {00..23} ::: {00..55..5}
@end verbatim
@node EXAMPLE: Calling Bash functions
@section EXAMPLE: Calling Bash functions
If the composed command is longer than a line, it becomes hard to
read. In Bash you can use functions. Just remember to @strong{export -f} the
function.
@verbatim
doit() {
  echo Doing it for $1
  sleep 2
  echo Done with $1
skipping to change at line 3599 (old) / line 3663 (new)
@verbatim
parallel --env doit -S server doit ::: 1 2 3
parallel --env doubleit -S server doubleit ::: 1 2 3 ::: a b
@end verbatim
If your environment (aliases, variables, and functions) is small you
can copy the full environment without having to @strong{export -f}
anything. See @strong{env_parallel}.
@node EXAMPLE: Function tester
@section EXAMPLE: Function tester
To test a program with different parameters:
@verbatim
tester() {
  if (eval "$@") >&/dev/null; then
    perl -e 'printf "\033[30;102m[ OK ]\033[0m @ARGV\n"' "$@"
  else
    perl -e 'printf "\033[30;101m[FAIL]\033[0m @ARGV\n"' "$@"
  fi
}
export -f tester
parallel tester my_program ::: arg1 arg2
parallel tester exit ::: 1 0 2 0
@end verbatim
If @strong{my_program} fails a red FAIL will be printed followed by the failing
command; otherwise a green OK will be printed followed by the command.
@node EXAMPLE: Continuously show the latest line of output
@section EXAMPLE: Continuously show the latest line of output
It can be useful to monitor the output of running jobs.
This shows the most recent output line until a job finishes, after
which the output of the job is printed in full:
@verbatim
parallel '{} | tee >(cat >&3)' ::: 'command 1' 'command 2' \
  3> >(perl -ne '$|=1;chomp;printf"%.'$COLUMNS's\r",$_." "x100')
@end verbatim
@node EXAMPLE: Log rotate
@section EXAMPLE: Log rotate
Log rotation renames a logfile to an extension with a higher number:
log.1 becomes log.2, log.2 becomes log.3, and so on. The oldest log is
removed. To avoid overwriting files the process starts backwards from
the high number to the low number. This will keep 10 old versions of
the log:
@verbatim
seq 9 -1 1 | parallel -j1 mv log.{} log.'{= $_++ =}'
mv log log.1
@end verbatim
@node EXAMPLE: Removing file extension when processing files
@section EXAMPLE: Removing file extension when processing files
When processing files removing the file extension using @strong{@{.@}} is
often useful.
Create a directory for each zip-file and unzip it in that dir:
@verbatim
parallel 'mkdir {.}; cd {.}; unzip ../{}' ::: *.zip
@end verbatim
skipping to change at line 3679 (old) / line 3743 (new)
@end verbatim
Put all converted files in the same directory:
@verbatim
find sounddir -type f -name '*.wav' | \
  parallel lame {} -o mydir/{/.}.mp3
@end verbatim
@node EXAMPLE: Removing strings from the argument
@section EXAMPLE: Removing strings from the argument
If you have a directory with tar.gz files and want these extracted in
the corresponding dir (e.g. foo.tar.gz will be extracted in the dir
foo) you can do:
@verbatim
parallel --plus 'mkdir {..}; tar -C {..} -xf {}' ::: *.tar.gz
@end verbatim
If you want to remove a different ending, you can use @{%string@}:
skipping to change at line 3709 (old) / line 3773 (new)
@end verbatim
To remove a string anywhere you can use regular expressions with
@{/regexp/replacement@} and leave the replacement empty:
@verbatim
parallel --plus echo {/demo_/} ::: demo_mycode remove_demo_here
@end verbatim
@node EXAMPLE: Download 24 images for each of the past 30 days
@section EXAMPLE: Download 24 images for each of the past 30 days
Let us assume a website stores images like:
@verbatim
http://www.example.com/path/to/YYYYMMDD_##.jpg
@end verbatim
where YYYYMMDD is the date and ## is the number 01-24. This will
download images for the past 30 days:
skipping to change at line 3735 (old) / line 3799 (new)
}
export -f getit
parallel getit ::: $(seq 30) ::: $(seq -w 24)
@end verbatim
@strong{$(date -d "today -$1 days" +%Y%m%d)} will give the dates in
YYYYMMDD with @strong{$1} days subtracted.
@node EXAMPLE: Download world map from NASA
@section EXAMPLE: Download world map from NASA
NASA provides tiles to download on earthdata.nasa.gov. Download tiles
for Blue Marble world map and create a 10240x20480 map.
@verbatim
base=https://map1a.vis.earthdata.nasa.gov/wmts-geo/wmts.cgi
service="SERVICE=WMTS&REQUEST=GetTile&VERSION=1.0.0"
layer="LAYER=BlueMarble_ShadedRelief_Bathymetry"
set="STYLE=&TILEMATRIXSET=EPSG4326_500m&TILEMATRIX=5"
tile="TILEROW={1}&TILECOL={2}"
format="FORMAT=image%2Fjpeg"
url="$base?$service&$layer&$set&$tile&$format"
parallel -j0 -q wget "$url" -O {1}_{2}.jpg ::: {0..19} ::: {0..39}
parallel eval convert +append {}_{0..39}.jpg line{}.jpg ::: {0..19}
convert -append line{0..19}.jpg world.jpg
@end verbatim
@node EXAMPLE: Download Apollo-11 images from NASA using jq
@section EXAMPLE: Download Apollo-11 images from NASA using jq
Search NASA using their API to get JSON for images related to 'apollo
11' that have 'moon landing' in the description.
The search query returns JSON containing URLs to JSON containing
collections of pictures. One of the pictures in each of these
collections is @emph{large}.
@strong{wget} is used to get the JSON for the search query. @strong{jq} is then
used to extract the URLs of the collections. @strong{parallel} then calls
skipping to change at line 3784 (old) / line 3848 (new)
media_type="media_type=image"
wget -O - "$base?$q&$description&$media_type" |
  jq -r .collection.items[].href |
  parallel wget -O - |
  jq -r .[] |
  grep large |
  parallel wget
@end verbatim
@node EXAMPLE: Download video playlist in parallel
@section EXAMPLE: Download video playlist in parallel
@strong{youtube-dl} is an excellent tool to download videos. It cannot,
however, download videos in parallel. This takes a playlist and
downloads 10 videos in parallel.
@verbatim
url='youtu.be/watch?v=0wOf2Fgi3DE&list=UU_cznB5YZZmvAmeq7Y3EriQ'
export url
youtube-dl --flat-playlist "https://$url" |
  parallel --tagstring {#} --lb -j10 \
    youtube-dl --playlist-start {#} --playlist-end {#} '"https://$url"'
@end verbatim
@node EXAMPLE: Prepend last modified date (ISO8601) to file name
@section EXAMPLE: Prepend last modified date (ISO8601) to file name
@verbatim
parallel mv {} '{= $a=pQ($_); $b=$_;' \
  '$_=qx{date -r "$a" +%FT%T}; chomp; $_="$_ $b" =}' ::: *
@end verbatim
@strong{@{=} and @strong{=@}} mark a perl expression. @strong{pQ} perl-quotes the
string. @strong{date +%FT%T} is the date in ISO8601 with time.
@node EXAMPLE: Save output in ISO8601 dirs
@section EXAMPLE: Save output in ISO8601 dirs
Save output from @strong{ps aux} every second into dirs named
yyyy-mm-ddThh:mm:ss+zz:zz.
@verbatim
seq 1000 | parallel -N0 -j1 --delay 1 \
  --results '{= $_=`date -Isec`; chomp=}/' ps aux
@end verbatim
@node EXAMPLE: Digital clock with "blinking" :
@section EXAMPLE: Digital clock with "blinking" :
The : in a digital clock blinks. To make every other line have a ':'
and the rest a ' ' a perl expression is used to look at the 3rd input
source. If the value modulo 2 is 1: Use ":" otherwise use " ":
@verbatim
parallel -k echo {1}'{=3 $_=$_%2?":":" "=}'{2}{3} \
  ::: {0..12} ::: {0..5} ::: {0..9}
@end verbatim
@node EXAMPLE: Aggregating content of files
@section EXAMPLE: Aggregating content of files
This:
@verbatim
parallel --header : echo x{X}y{Y}z{Z} \> x{X}y{Y}z{Z} \
  ::: X {1..5} ::: Y {01..10} ::: Z {1..5}
@end verbatim
will generate the files x1y01z1 .. x5y10z5. If you want to aggregate
the output grouping on x and z you can do this:
skipping to change at line 3859 (old) / line 3923 (new)
For all values of x and z it runs commands like:
@verbatim
cat x1y*z1 > x1z1
@end verbatim
So you end up with x1z1 .. x5z5 each containing the content of all
values of y.
@node EXAMPLE: Breadth first parallel web crawler/mirrorer
@section EXAMPLE: Breadth first parallel web crawler/mirrorer
The script below will crawl and mirror a URL in parallel. It
downloads first pages that are 1 click down, then 2 clicks down, then
3; instead of the normal depth first, where the first link on
each page is fetched first.
Run like this:
@verbatim
PARALLEL=-j100 ./parallel-crawl http://gatt.org.yeslab.org/
skipping to change at line 3910 skipping to change at line 3974
        do { $seen{$1}++ or print }' |
      grep -F $BASEURL |
      grep -v -x -F -f $SEEN | tee -a $SEEN > $URLLIST2
    mv $URLLIST2 $URLLIST
  done

  rm -f $URLLIST $URLLIST2 $SEEN
@end verbatim

@node EXAMPLE: Process files from a tar file while unpacking
@section EXAMPLE: Process files from a tar file while unpacking

If the files to be processed are in a tar file then unpacking one file
and processing it immediately may be faster than first unpacking all
files.

@verbatim
  tar xvf foo.tgz | perl -ne 'print $l;$l=$_;END{print $l}' | \
    parallel echo
@end verbatim

The Perl one-liner is needed to make sure the file is complete before
handing it to GNU @strong{parallel}.
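To see why this works, note that the one-liner buffers one line: each
filename is printed only after @strong{tar} has announced the next one, by
which time the earlier file is fully unpacked. A minimal demonstration
of the one-line delay:

@verbatim
  printf 'a\nb\nc\n' | perl -ne 'print $l;$l=$_;END{print $l}'
@end verbatim

The output is still a, b, c, but a is only emitted once b has been
read.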
@node EXAMPLE: Rewriting a for-loop and a while-read-loop
@section EXAMPLE: Rewriting a for-loop and a while-read-loop

for-loops like this:

@verbatim
  (for x in `cat list` ; do
    do_something $x
  done) | process_output
@end verbatim

and while-read-loops like this:

@verbatim
  cat list | (while read x ; do
    do_something $x
  done) | process_output
@end verbatim

can be written like this:

@verbatim
  cat list | parallel do_something | process_output
@end verbatim

If the loop body is bigger, it improves readability to put it in a
function:

@verbatim
  doit() {
    x=$1
    do_something $x
    [... 100 lines that do something with $x ...]
  }
  export -f doit
  cat list | parallel doit
@end verbatim

@node EXAMPLE: Rewriting nested for-loops
@section EXAMPLE: Rewriting nested for-loops

Nested for-loops like this:

@verbatim
  (for x in `cat xlist` ; do
    for y in `cat ylist` ; do
      do_something $x $y
    done
  done) | process_output
@end verbatim

can be written like this:

@verbatim
  parallel do_something {1} {2} :::: xlist ylist | process_output
@end verbatim

Nested for-loops like this:

@verbatim
  (for colour in red green blue ; do
    for size in S M L XL XXL ; do
      echo $colour $size
    done
  done) | sort
@end verbatim

can be written like this:

@verbatim
  parallel echo {1} {2} ::: red green blue ::: S M L XL XXL | sort
@end verbatim

@node EXAMPLE: Finding the lowest difference between files
@section EXAMPLE: Finding the lowest difference between files

@strong{diff} is good for finding differences in text files. @strong{diff | wc -l}
gives an indication of the size of the difference. To find the
differences between all files in the current dir do:

@verbatim
  parallel --tag 'diff {1} {2} | wc -l' ::: * ::: * | sort -nk3
@end verbatim

This way it is possible to see if some files are closer to other
files.
@node EXAMPLE: for-loops with column names
@section EXAMPLE: for-loops with column names

When doing multiple nested for-loops it can be easier to keep track of
the loop variable if it is named instead of just having a number. Use
@strong{--header :} to let the first argument be a named alias for the
positional replacement string:

@verbatim
  parallel --header : echo {colour} {size} \
    ::: colour red green blue ::: size S M L XL XXL
@end verbatim

This also works if the input file is a file with columns:

@verbatim
  cat addressbook.tsv | \
    parallel --colsep '\t' --header : echo {Name} {E-mail address}
@end verbatim
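Assuming a (hypothetical) @emph{addressbook.tsv} with a header line and
one tab-separated record:

@verbatim
  Name<TAB>E-mail address
  Ole<TAB>ole@example.com
@end verbatim

the command above would run @strong{echo Ole ole@@example.com}.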
@node EXAMPLE: All combinations in a list
@section EXAMPLE: All combinations in a list

GNU @strong{parallel} makes all combinations when given two lists.

To make all combinations in a single list with unique values, you
repeat the list and use replacement string @strong{@{choose_k@}}:

@verbatim
  parallel --plus echo {choose_k} ::: A B C D ::: A B C D
  parallel --plus echo 2{2choose_k} 1{1choose_k} ::: A B C D ::: A B C D
@end verbatim

@strong{@{choose_k@}} works for any number of input sources:

@verbatim
  parallel --plus echo {choose_k} ::: A B C D ::: A B C D ::: A B C D
@end verbatim
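@strong{@{choose_k@}} skips the jobs where the values are not in order,
removing duplicates and permutations. The first command should
therefore print these 6 pairs (not necessarily in this order):

@verbatim
  A B
  A C
  A D
  B C
  B D
  C D
@end verbatim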
@node EXAMPLE: From a to b and b to c
@section EXAMPLE: From a to b and b to c

Assume you have input like:

@verbatim
  aardvark
  babble
  cab
  dab
  each
@end verbatim

and want to run combinations like:

@verbatim
  aardvark babble
  babble cab
  cab dab
  dab each
@end verbatim
If the input is in the array $a here are two solutions:

@verbatim
  seq $((${#a[@]}-1)) | \
    env_parallel --env a echo '${a[{=$_--=}]} - ${a[{}]}'
  parallel echo {1} - {2} ::: "${a[@]::${#a[@]}-1}" :::+ "${a[@]:1}"
@end verbatim
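In the first solution @strong{seq} generates the indexes 1..4, and the
perl expression @strong{@{=$_--=@}} decrements each index, so element i-1
is paired with element i. Assuming $a holds the five words above, both
commands should print:

@verbatim
  aardvark - babble
  babble - cab
  cab - dab
  dab - each
@end verbatim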
@node EXAMPLE: Count the differences between all files in a dir
@section EXAMPLE: Count the differences between all files in a dir

Using @strong{--results} the results are saved in /tmp/diffcount*.

@verbatim
  parallel --results /tmp/diffcount "diff -U 0 {1} {2} | \
    tail -n +3 |grep -v '^@'|wc -l" ::: * ::: *
@end verbatim

To see the difference between file A and file B look at the file
'/tmp/diffcount/1/A/2/B'.

@node EXAMPLE: Speeding up fast jobs
@section EXAMPLE: Speeding up fast jobs

Starting a job on the local machine takes around 10 ms. This can be a
big overhead if the job takes very few ms to run. Often you can group
small jobs together using @strong{-X} which will make the overhead less
significant. Compare the speed of these:

@verbatim
  seq -w 0 9999 | parallel touch pict{}.jpg
  seq -w 0 9999 | parallel -X touch pict{}.jpg
@end verbatim

If you do not need GNU @strong{parallel} to have control over each job, it
can be even faster to generate the command lines yourself and pipe
them to a shell:

@verbatim
  mygenerator() {
    seq 10000000 | perl -pe 'print "echo This is fast job number "';
  }
  mygenerator | parallel --pipe --block 10M sh
@end verbatim

The overhead is 100000 times smaller, namely around 100 nanoseconds
per job.

@node EXAMPLE: Using shell variables
@section EXAMPLE: Using shell variables

When using shell variables you need to quote them correctly as they
may otherwise be interpreted by the shell.

Notice the difference between:

@verbatim
  ARR=("My brother's 12\" records are worth <\$\$\$>"'!' Foo Bar)
  parallel echo ::: ${ARR[@]} # This is probably not what you want
@end verbatim

and:

@verbatim
  ARR=("My brother's 12\" records are worth <\$\$\$>"'!' Foo Bar)
  parallel echo ::: "${ARR[@]}" # This is what you want
@end verbatim

Variables used in a function need to be exported, and the function
itself needs to be exported with @strong{export -f}:

@verbatim
  VAR="My brother's 12\" records are worth <\$\$\$>"
  export VAR
  myfunc() { echo "$VAR" "$1"; }
  export -f myfunc
  parallel myfunc ::: '!'
@end verbatim
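As a sketch of an alternative: @strong{env_parallel} copies the active
environment (variables, arrays, and functions) into the jobs, so the
exports can be skipped. It must first be enabled for your shell (see
@strong{env_parallel --install}):

@verbatim
  VAR="My brother's 12\" records are worth <\$\$\$>"
  myfunc() { echo "$VAR" "$1"; }
  env_parallel myfunc ::: '!'
@end verbatim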
@node EXAMPLE: Group output lines
@section EXAMPLE: Group output lines

When running jobs that output data, you often do not want the output
of multiple jobs to run together. GNU @strong{parallel} defaults to grouping
the output of each job, so the output is printed when the job
finishes. If you want full lines to be printed while the job is
running you can use @strong{--line-buffer}. If you want output to be
printed as soon as possible you can use @strong{-u}.

Compare the output of:

@verbatim
  parallel wget --limit-rate=100k \
    https://ftpmirror.gnu.org/parallel/parallel-20{}0822.tar.bz2 \
    ::: {12..16}
  parallel --line-buffer wget --limit-rate=100k \
    https://ftpmirror.gnu.org/parallel/parallel-20{}0822.tar.bz2 \
    ::: {12..16}
  parallel -u wget --limit-rate=100k \
    https://ftpmirror.gnu.org/parallel/parallel-20{}0822.tar.bz2 \
    ::: {12..16}
@end verbatim

@node EXAMPLE: Tag output lines
@section EXAMPLE: Tag output lines

GNU @strong{parallel} groups the output lines, but it can be hard to see
where the different jobs begin. @strong{--tag} prepends the argument to make
that more visible:

@verbatim
  parallel --tag wget --limit-rate=100k \
    https://ftpmirror.gnu.org/parallel/parallel-20{}0822.tar.bz2 \
    ::: {12..16}
@end verbatim

@strong{--tag} works with @strong{--line-buffer} but not with @strong{-u}:

@verbatim
  parallel --tag --line-buffer wget --limit-rate=100k \
    https://ftpmirror.gnu.org/parallel/parallel-20{}0822.tar.bz2 \
    ::: {12..16}
@end verbatim

Check the uptime of the servers in @emph{~/.parallel/sshloginfile}:

@verbatim
  parallel --tag -S .. --nonall uptime
@end verbatim

@node EXAMPLE: Colorize output
@section EXAMPLE: Colorize output

Give each job a new color. Most terminals support ANSI colors with the
escape code "\033[30;3Xm" where 0 <= X <= 7:

@verbatim
  seq 10 | \
    parallel --tagstring '\033[30;3{=$_=++$::color%8=}m' seq {}
  parallel --rpl '{color} $_="\033[30;3".(++$::color%8)."m"' \
    --tagstring {color} seq {} ::: {1..10}
@end verbatim

To get rid of the initial \t (which comes from @strong{--tagstring}):

@verbatim
  ... | perl -pe 's/\t//'
@end verbatim
@node EXAMPLE: Keep order of output same as order of input
@section EXAMPLE: Keep order of output same as order of input

Normally the output of a job will be printed as soon as it
completes. Sometimes you want the order of the output to remain the
same as the order of the input. This is often important if the output
is used as input for another system. @strong{-k} will make sure the order
of output will be in the same order as input even if later jobs end
before earlier jobs.

Append a string to every line in a text file:

@verbatim
  cat textfile | parallel -k echo {} append_string
@end verbatim

If you remove @strong{-k} some of the lines may come out in the wrong
order.
To download a 1 GB file we need 100 10MB chunks downloaded and
combined in the correct order.

@verbatim
  seq 0 99 | parallel -k curl -r \
    {}0000000-{}9999999 http://example.com/the/big/file > file
@end verbatim

@node EXAMPLE: Parallel grep
@section EXAMPLE: Parallel grep

@strong{grep -r} greps recursively through directories. On multicore CPUs
GNU @strong{parallel} can often speed this up.

@verbatim
  find . -type f | parallel -k -j150% -n 1000 -m grep -H -n STRING {}
@end verbatim

This will run 1.5 jobs per CPU, and give 1000 arguments to @strong{grep}.

@node EXAMPLE: Grepping n lines for m regular expressions.
@section EXAMPLE: Grepping n lines for m regular expressions.

The simplest solution to grep a big file for a lot of regexps is:

@verbatim
  grep -f regexps.txt bigfile
@end verbatim

Or if the regexps are fixed strings:

@verbatim
  grep -F -f regexps.txt bigfile
@end verbatim

There are 3 limiting factors: CPU, RAM, and disk I/O. RAM is easy to
measure: if @strong{grep} takes up most of your free memory, then RAM is a
limiting factor. CPU is also easy to measure: if @strong{grep} takes >90%
CPU in @strong{top}, then the CPU is a limiting factor, and
parallelization will speed this up. It is harder to see if disk I/O is
the limiting factor, and depending
on the disk system it may be faster or slower to parallelize. The only
way to know for certain is to test and measure.

@menu
* Limiting factor@asis{:} RAM::
* Limiting factor@asis{:} CPU::
* Bigger problem::
@end menu

@node Limiting factor: RAM
@subsection Limiting factor: RAM

The normal @strong{grep -f regexps.txt bigfile} works no matter the size of
bigfile, but if regexps.txt is so big it cannot fit into memory, then
you need to split this.

@strong{grep -F} takes around 100 bytes of RAM and @strong{grep} takes about 500
bytes of RAM per 1 byte of regexp. So if regexps.txt is 1% of your
RAM, then it may be too big.

If you can convert your regexps into fixed strings do that, as
@strong{grep -F} is much faster and uses far less memory. If regexps.txt
still does not fit in memory you can split it into blocks and run one
@strong{grep} per block. Numbering the matched lines (@strong{grep -n}) lets
@strong{sort -un} remove duplicates and restore the original order. With
$percpu set to a block size that fits in the RAM of a single CPU
thread:

@verbatim
  parallel --pipepart -a regexps.txt --block $percpu --compress \
    grep -F -f - -n bigfile | \
    sort -un | perl -pe 's/^\d+://'
@end verbatim
If you can live with duplicated lines and wrong order, it is faster to do:

@verbatim
  parallel --pipepart -a regexps.txt --block $percpu --compress \
    grep -F -f - bigfile
@end verbatim

@node Limiting factor: CPU
@subsection Limiting factor: CPU

If the CPU is the limiting factor parallelization should be done on
the regexps:

@verbatim
  cat regexps.txt | parallel --pipe -L1000 --roundrobin --compress \
    grep -f - -n bigfile | \
    sort -un | perl -pe 's/^\d+://'
@end verbatim

This will start one @strong{grep} per CPU and read @emph{bigfile} one time per
CPU, but as that is done in parallel, all reads except the first will
be cached in RAM.

If both the regexps and @emph{bigfile} are too big for RAM, you can
combine the two using @strong{--cat}:

@verbatim
  parallel --pipepart --block 100M -a bigfile --cat cat regexps.txt \
    \| parallel --pipe -L1000 --roundrobin grep -f - {}
@end verbatim

If a line matches multiple regexps, the line may be duplicated.

@node Bigger problem
@subsection Bigger problem

If the problem is too big to be solved by this, you are probably ready
for Lucene.

@node EXAMPLE: Using remote computers
@section EXAMPLE: Using remote computers

To run commands on a remote computer SSH needs to be set up and you
must be able to login without entering a password (the commands
@strong{ssh-copy-id}, @strong{ssh-agent}, and @strong{sshpass} may help you do that).

If you need to login to a whole cluster, you typically do not want to
accept the host key for every host. You want to accept them the first
time and be warned if they are ever changed. To do that you can
temporarily relax host key checking while you contact each host once,
as sketched below.
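A minimal sketch of that first contact, assuming the hosts are listed
in the (hypothetical) file @emph{~/.parallel/my_cluster} and OpenSSH >= 7.6
(which understands @strong{StrictHostKeyChecking accept-new}):

@verbatim
  # Back up the ssh config, then accept unknown host keys temporarily
  touch ~/.ssh/config
  cp ~/.ssh/config ~/.ssh/config.backup
  (echo 'Host *'; echo '  StrictHostKeyChecking accept-new') >> ~/.ssh/config
  # Contact every host once so its key is recorded
  parallel --slf my_cluster --nonall true
  # Restore the original config; changed keys will then give warnings
  mv ~/.ssh/config.backup ~/.ssh/config
@end verbatim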
If the number of CPUs on the remote computers is not identified
correctly the number of CPUs can be added in front. Here the computer
has 8 CPUs.

@verbatim
  seq 10 | parallel --sshlogin 8/server.example.com echo
@end verbatim

@node EXAMPLE: Transferring of files
@section EXAMPLE: Transferring of files

To recompress gzipped files with @strong{bzip2} using a remote computer run:

@verbatim
  find logs/ -name '*.gz' | \
    parallel --sshlogin server.example.com \
      --transfer "zcat {} | bzip2 -9 >{.}.bz2"
@end verbatim

This will list the .gz-files in the @emph{logs} directory and all
directories below, transfer each file to @emph{server.example.com}, and
run the command there. To get the resulting @emph{.bz2}-file transferred
back, add @strong{--return @{.@}.bz2}; add @strong{--cleanup} to remove the
transferred and returned files on the remote computer afterwards.
@strong{--trc @{.@}.bz2} is shorthand for
@strong{--transfer --return @{.@}.bz2 --cleanup}.
If the file @emph{~/.parallel/sshloginfile} contains the list of computers
the special shorthand @emph{-S ..} can be used:

@verbatim
  find logs/ -name '*.gz' | parallel -S .. \
    --trc {.}.bz2 "zcat {} | bzip2 -9 >{.}.bz2"
@end verbatim

@node EXAMPLE: Distributing work to local and remote computers
@section EXAMPLE: Distributing work to local and remote computers

Convert *.mp3 to *.ogg running one process per CPU on local computer
and server2:

@verbatim
  parallel --trc {.}.ogg -S server2,: \
    'mpg321 -w - {} | oggenc -q0 - -o {.}.ogg' ::: *.mp3
@end verbatim

@node EXAMPLE: Running the same command on remote computers
@section EXAMPLE: Running the same command on remote computers

To run the command @strong{uptime} on remote computers you can do:

@verbatim
  parallel --tag --nonall -S server1,server2 uptime
@end verbatim

@strong{--nonall} reads no arguments. If you have a list of jobs you want
to run on each computer you can do:

@verbatim
  parallel --tag --onall -S server1,server2 echo ::: 1 2 3
@end verbatim

Remove @strong{--tag} if you do not want the sshlogin added before the
output.

If you have a lot of hosts use '-j0' to access more hosts in parallel.
@node EXAMPLE: Running 'sudo' on remote computers
@section EXAMPLE: Running 'sudo' on remote computers

Put the password into passwordfile then run:

@verbatim
  parallel --ssh 'cat passwordfile | ssh' --nonall \
    -S user@server1,user@server2 sudo -S ls -l /root
@end verbatim

@node EXAMPLE: Using remote computers behind NAT wall
@section EXAMPLE: Using remote computers behind NAT wall

If the workers are behind a NAT wall, you need some trickery to get to
them.

If you can @strong{ssh} to a jumphost, and reach the workers from there,
then the obvious solution would be this, but it @strong{does not work}:

@verbatim
  parallel --ssh 'ssh jumphost ssh' -S host1 echo ::: DOES NOT WORK
@end verbatim

It does not work because the command is dequoted by @strong{ssh} twice,
whereas GNU @strong{parallel} only expects it to be dequoted once. Instead
put this in @strong{~/.ssh/config}:

@verbatim
  Host host1 host2 host3
    ProxyCommand ssh jumphost.domain nc -w 1 %h 22
@end verbatim

It requires @strong{nc}(netcat) to be installed on jumphost. With this you
can simply:
@verbatim
  parallel -S host1,host2,host3 echo ::: This does work
@end verbatim

@menu
* No jumphost@comma{} but port forwards::
* No jumphost@comma{} no port forwards::
@end menu

@node No jumphost@comma{} but port forwards
@subsection No jumphost, but port forwards

If there is no jumphost but each server has port 22 forwarded from the
firewall (e.g. the firewall's port 22001 = port 22 on host1, 22002 = host2,
22003 = host3) then you can use @strong{~/.ssh/config}:

@verbatim
  Host host1.v
    Port 22001
  Host host2.v
    Port 22002
  Host host3.v
    Port 22003
  Host *.v
    Hostname firewall
@end verbatim

And then use host@{1..3@}.v as normal hosts:

@verbatim
  parallel -S host1.v,host2.v,host3.v echo ::: a b c
@end verbatim

@node No jumphost@comma{} no port forwards
@subsection No jumphost, no port forwards

If ports cannot be forwarded, you need some sort of VPN to traverse
the NAT-wall. TOR is one option for that, as it is very easy to get
working.

You need to install TOR and set up a hidden service. In @strong{torrc} put:

@verbatim
  HiddenServiceDir /var/lib/tor/hidden_service/
  HiddenServicePort 22 127.0.0.1:22
@end verbatim

Then reload TOR. The hostname of the hidden service is now in
@emph{/var/lib/tor/hidden_service/hostname} and looks something like
@emph{izjafdceobowklhz.onion}. Prepend @strong{torsocks} to @strong{ssh} to reach
it:

@verbatim
  parallel --ssh 'torsocks ssh' -S izjafdceobowklhz.onion \
    echo ::: a b c
@end verbatim
If not all hosts are accessible through TOR:

@verbatim
  parallel -S 'torsocks ssh izjafdceobowklhz.onion,host2,host3' \
    echo ::: a b c
@end verbatim

See more @strong{ssh} tricks on
https://en.wikibooks.org/wiki/OpenSSH/Cookbook/Proxies_and_Jump_Hosts

@node EXAMPLE: Parallelizing rsync
@section EXAMPLE: Parallelizing rsync

@strong{rsync} is a great tool, but sometimes it will not fill up the
available bandwidth. Running multiple @strong{rsync} in parallel can fix
this.

@verbatim
  cd src-dir
  find . -type f |
    parallel -j10 -X rsync -zR -Ha ./{} fooserver:/dest-dir/
@end verbatim

Adjust @strong{-j10} until you find the optimal number.

@strong{rsync -R} will create the needed subdirectories, so all files are
not put into a single dir. The @strong{./} is needed so the resulting
command looks similar to:

@verbatim
  rsync -zR ././sub/dir/file fooserver:/dest-dir/
@end verbatim
The @strong{/./} is what @strong{rsync -R} works on.

If you are unable to push data, but need to pull them and the files
are called digits.png (e.g. 000000.png) you might be able to do:

@verbatim
  seq -w 0 99 | parallel rsync -Havessh fooserver:src/*{}.png destdir/
@end verbatim

@node EXAMPLE: Use multiple inputs in one command
@section EXAMPLE: Use multiple inputs in one command

Copy files like foo.es.ext to foo.ext:

@verbatim
  ls *.es.* | perl -pe 'print; s/\.es//' | parallel -N2 cp {1} {2}
@end verbatim

The perl command spits out 2 lines for each input. GNU @strong{parallel}
takes 2 inputs (using @strong{-N2}) and replaces @{1@} and @{2@} with the inputs.
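To see what GNU @strong{parallel} receives, run the first part of the pipe
alone. With (hypothetical) files @emph{bar.es.ext} and @emph{foo.es.ext}
present it emits:

@verbatim
  bar.es.ext
  bar.ext
  foo.es.ext
  foo.ext
@end verbatim

so the first @strong{cp} gets @{1@}=bar.es.ext and @{2@}=bar.ext.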
Convert files from all subdirs to PNG-files with consecutive numbers
(useful for making input PNGs for @strong{ffmpeg}):

@verbatim
  parallel --link -a <(find . -type f | sort) \
    -a <(seq $(find . -type f|wc -l)) convert {1} {2}.png
@end verbatim

Alternative version:

@verbatim
  find . -type f | sort | parallel convert {} {#}.png
@end verbatim

@node EXAMPLE: Use a table as input
@section EXAMPLE: Use a table as input

Content of table_file.tsv:

@verbatim
  foo<TAB>bar
  baz <TAB> quux
@end verbatim

To run:

@verbatim
  cmd -o bar -i foo
  cmd -o quux -i baz
@end verbatim

you can run:

@verbatim
  parallel -a table_file.tsv --colsep '\t' cmd -o {2} -i {1}
@end verbatim
Note: The default for GNU @strong{parallel} is to remove the spaces around
the columns. To keep the spaces:

@verbatim
  parallel -a table_file.tsv --trim n --colsep '\t' cmd -o {2} -i {1}
@end verbatim

@node EXAMPLE: Output to database
@section EXAMPLE: Output to database

GNU @strong{parallel} can output to a database table and a CSV-file:

@verbatim
  dburl=csv:///%2Ftmp%2Fmydir
  dbtableurl=$dburl/mytable.csv
  parallel --sqlandworker $dbtableurl seq ::: {1..10}
@end verbatim

It is rather slow and takes up a lot of CPU time because GNU
@strong{parallel} parses the whole CSV-file for each update.

A better approach is to use an SQLite-base and then convert that to
CSV:

@verbatim
  dburl=sqlite3:///%2Ftmp%2Fmy.sqlite
  dbtableurl=$dburl/mytable
  parallel --sqlandworker $dbtableurl seq ::: {1..10}
  sql $dburl 'SELECT * FROM mytable;' > mytable.csv
@end verbatim

It is also possible to use MySQL:

@verbatim
  dburl=mysql://user:pass@host/mydb
  dbtableurl=$dburl/mytable
  parallel --sqlandworker $dbtableurl seq ::: {1..10}
  sql -p -B $dburl "SELECT * FROM mytable;" > mytable.tsv
  perl -pe 's/"/""/g; s/\t/","/g; s/^/"/; s/$/"/;
    %s=("\\" => "\\", "t" => "\t", "n" => "\n");
    s/\\([\\tn])/$s{$1}/g;' mytable.tsv
@end verbatim

@node EXAMPLE: Output to CSV-file for R
@section EXAMPLE: Output to CSV-file for R

If you have no need for the advanced job distribution control that a
database provides, but you simply want output into a CSV file that you
can read into R or LibreCalc, then you can use @strong{--results}:

@verbatim
  parallel --results my.csv seq ::: 10 20 30
  R
  > mydf <- read.csv("my.csv");
  > print(mydf[2,])
  > write(as.character(mydf[2,c("Stdout")]),'')
@end verbatim

@node EXAMPLE: Use XML as input
@section EXAMPLE: Use XML as input

The show Aflyttet on Radio 24syv publishes an RSS feed with their audio
podcasts on: http://arkiv.radio24syv.dk/audiopodcast/channel/4466232

Using @strong{xpath} you can extract the URLs for 2019 and download them
using GNU @strong{parallel}:

@verbatim
  wget -O - http://arkiv.radio24syv.dk/audiopodcast/channel/4466232 | \
    xpath -e "//pubDate[contains(text(),'2019')]/../enclosure/@url" | \
    parallel -u wget '{= s/ url="//; s/"//; =}'
@end verbatim

@node EXAMPLE: Run the same command 10 times
@section EXAMPLE: Run the same command 10 times

If you want to run the same command with the same arguments 10 times
in parallel you can do:

@verbatim
  seq 10 | parallel -n0 my_command my_args
@end verbatim
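A sketch of the same without a pipe: @strong{-N0} reads one argument per
job but inserts none, so the input values only control the number of
runs:

@verbatim
  parallel -N0 my_command my_args ::: {1..10}
@end verbatim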
@node EXAMPLE: Working as cat | sh. Resource inexpensive jobs and evaluation
@section EXAMPLE: Working as cat | sh. Resource inexpensive jobs and evaluation

GNU @strong{parallel} can work similar to @strong{cat | sh}.

A resource inexpensive job is a job that takes very little CPU, disk
I/O and network I/O. Ping is an example of a resource inexpensive
job. wget is too - if the webpages are small.

The content of the file jobs_to_run:

@verbatim
  ping -c 1 10.0.0.1
  wget http://example.com/status.cgi?ip=10.0.0.1
  ping -c 1 10.0.0.2
  wget http://example.com/status.cgi?ip=10.0.0.2
  ...
  ping -c 1 10.0.0.255
  wget http://example.com/status.cgi?ip=10.0.0.255
@end verbatim
To run 100 processes simultaneously do:

@verbatim
  parallel -j 100 < jobs_to_run
@end verbatim

As there is no @emph{command} the jobs will be evaluated by the shell.

@node EXAMPLE: Call program with FASTA sequence
@section EXAMPLE: Call program with FASTA sequence

FASTA files have the format:

@verbatim
  >Sequence name1
  sequence
  sequence continued
  >Sequence name2
  sequence
  sequence continued
@end verbatim
To call @strong{myprog} with the sequence as argument run:

@verbatim
  cat file.fasta |
    parallel --pipe -N1 --recstart '>' --rrs \
      'read a; echo Name: "$a"; myprog $(tr -d "\n")'
@end verbatim
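Here @strong{--recstart '>'} makes each record start at a '>', @strong{-N1}
passes one record per job, and @strong{--rrs} removes the record start
string, so each job should receive on stdin something like:

@verbatim
  Sequence name1
  sequence
  sequence continued
@end verbatim

@strong{read a} consumes the name line and @strong{tr -d "\n"} joins the
remaining lines into one string that is passed to @strong{myprog}.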
@node EXAMPLE: Processing a big file using more CPUs
@section EXAMPLE: Processing a big file using more CPUs

To process a big file or some output you can use @strong{--pipe} to split up
the data into blocks and pipe the blocks into the processing program.

If the program is @strong{gzip -9} you can do:

@verbatim
  cat bigfile | parallel --pipe --recend '' -k gzip -9 > bigfile.gz
@end verbatim

A bottleneck of @strong{--pipe} is that every
byte has to be copied through GNU @strong{parallel}. But if @strong{bigfile} is a
real (seekable) file GNU @strong{parallel} can by-pass the copying and send
the parts directly to the program:

@verbatim
  parallel --pipepart --block 100m -a bigfile --files sort |\
    parallel -Xj1 sort -m {} ';' rm {} >bigfile.sort
@end verbatim

@node EXAMPLE: Grouping input lines
@section EXAMPLE: Grouping input lines

When processing with @strong{--pipe} you may have lines grouped by a
value. Here is @emph{my.csv}:

@verbatim
  Transaction Customer Item
      1       a       53
      2       b       65
      3       b       82
      4       c       96
@end verbatim

where you want all lines for one customer (column 2) to go to the same
process. Mark the boundary between two customers with a random string
and use it as record separator:

@verbatim
  sep=`perl -e 'print map { ("a".."z","A".."Z")[rand(52)] } (1..50);'`
  cat my.csv | \
    perl -ape '$F[1] ne $l and print "'$sep'"; $l = $F[1]' | \
    parallel --recend $sep --rrs --pipe -N1 wc
@end verbatim

If your program can process multiple customers replace @strong{-N1} with a
reasonable @strong{--blocksize}.

@node EXAMPLE: Running more than 250 jobs workaround
@section EXAMPLE: Running more than 250 jobs workaround

If you need to run a massive amount of jobs in parallel, then you will
likely hit the filehandle limit which is often around 250 jobs. If you
are super user you can raise the limit in /etc/security/limits.conf
but you can also use this workaround. The filehandle limit is per
process. That means that if you just spawn more GNU @strong{parallel}s then
each of them can run 250 jobs. This will spawn up to 2500 jobs:

@verbatim
  cat myinput |\
    parallel --pipe -N 50 --roundrobin -j50 parallel -j50 your_prg
@end verbatim
This will spawn up to 62500 jobs (use with caution - you need 64 GB
RAM to do this, and you may need to increase /proc/sys/kernel/pid_max):

@verbatim
  cat myinput |\
    parallel --pipe -N 250 --roundrobin -j250 parallel -j250 your_prg
@end verbatim

@node EXAMPLE: Working as mutex and counting semaphore
@section EXAMPLE: Working as mutex and counting semaphore

The command @strong{sem} is an alias for @strong{parallel --semaphore}.

A counting semaphore will allow a given number of jobs to be started
in the background. When that number of jobs are running in the
background, GNU @strong{sem} will wait for one of these to complete before
starting another command. @strong{sem --wait} will wait for all jobs to
complete.

Run 10 jobs concurrently in the background:

@verbatim
  for i in *.log ; do
    echo $i
    sem -j10 gzip $i ";" echo done
  done
  sem --wait
@end verbatim

A mutex is a counting semaphore allowing only one job to run. This
will edit the file @emph{myfile} and prepend it with lines containing
the numbers 1 to 3:

@verbatim
  seq 3 | parallel sem sed -i -e '1i{}' myfile
@end verbatim

As @emph{myfile} can be very big it is important that only one process edits
the file at the same time.

Name the semaphore to have multiple different semaphores active at the
same time:

@verbatim
  seq 3 | parallel sem --id mymutex sed -i -e '1i{}' myfile
@end verbatim

@node EXAMPLE: Mutex for a script
@section EXAMPLE: Mutex for a script

Assume a script is called from cron or from a web service, but only
one instance can be run at a time. With @strong{sem} and @strong{--shebang-wrap}
the script can be made to wait for other instances to finish. Here in
@strong{bash}:

@verbatim
  #!/usr/bin/sem --shebang-wrap -u --id $0 --fg /bin/bash

  echo This will run
  sleep 5
  echo exclusively
@end verbatim

The same in @strong{python}:
@verbatim
  #!/usr/local/bin/sem --shebang-wrap -u --id $0 --fg /usr/bin/python

  import time
  print "This will run ";
  time.sleep(5)
  print "exclusively";
@end verbatim

@node EXAMPLE: Start editor with filenames from stdin (standard input)
@section EXAMPLE: Start editor with filenames from stdin (standard input)

You can use GNU @strong{parallel} to start interactive programs like emacs or vi:

@verbatim
  cat filelist | parallel --tty -X emacs
  cat filelist | parallel --tty -X vi
@end verbatim

If there are more files than will fit on a single command line, the
editor will be started again with the remaining files.

@node EXAMPLE: Running sudo
@section EXAMPLE: Running sudo

@strong{sudo} requires a password to run a command as root. It caches the
access, so you only need to enter the password again if you have not
used @strong{sudo} for a while.

The command:

@verbatim
  parallel sudo echo ::: This is a bad idea
@end verbatim

is no good, as you would be prompted for the sudo password for each of
the jobs. You can either do:

@verbatim
  sudo echo This
  parallel sudo echo ::: is a good idea
@end verbatim
or:

@verbatim
  sudo parallel echo ::: This is a good idea
@end verbatim

This way you only have to enter the sudo password once.

@node EXAMPLE: GNU Parallel as queue system/batch manager
@section EXAMPLE: GNU Parallel as queue system/batch manager

GNU @strong{parallel} can work as a simple job queue system or batch manager.
The idea is to put the jobs into a file and have GNU @strong{parallel} read
from that continuously. As GNU @strong{parallel} will stop at end of file we
use @strong{tail} to continue reading:

@verbatim
  true >jobqueue; tail -n+0 -f jobqueue | parallel
@end verbatim

To submit your jobs to the queue:

@verbatim
  echo my_command my_arg >> jobqueue
@end verbatim

There is a small issue when using GNU @strong{parallel} as queue
system/batch manager: you have to submit JobSlot number of jobs before
they will start, and after that you can submit one at a time, and the job
will start immediately if free slots are available. Output from the
running or completed jobs is held back and will only be printed when
JobSlots more jobs have been started (unless you use --ungroup or
--line-buffer, in which case the output from the jobs is printed
immediately). E.g. if you have 10 jobslots then the output from the
first completed job will only be printed when job 11 has started, and
the output of the second completed job will only be printed when job 12
has started.

@node EXAMPLE: GNU Parallel as dir processor
@section EXAMPLE: GNU Parallel as dir processor

If you have a dir in which users drop files that need to be processed
you can do this on GNU/Linux (if you know what @strong{inotifywait} is
called on other platforms file a bug report):

@verbatim
  inotifywait -qmre MOVED_TO -e CLOSE_WRITE --format %w%f my_dir |\
    parallel -u echo
@end verbatim

This will run the command @strong{echo} on each file put into @emph{my_dir} or
subdirs of @emph{my_dir}.

You can of course use @strong{-S} to distribute the jobs to remote
computers:

@verbatim
  inotifywait -qmre MOVED_TO -e CLOSE_WRITE --format %w%f my_dir |\
    parallel -S .. -u echo
@end verbatim

If the files to be processed are in a tar file then unpacking one file
and processing it immediately may be faster than first unpacking all
files. Set up the dir processor as above and unpack into the dir.

Using GNU @strong{parallel} as dir processor has the same limitations as
using GNU @strong{parallel} as queue system/batch manager.

@node EXAMPLE: Locate the missing package
@section EXAMPLE: Locate the missing package

If you have downloaded source and tried compiling it, you may have seen:

@verbatim
  $ ./configure
  [...]
  checking for something.h... no
  configure: error: "libsomething not found"
@end verbatim

Often it is not obvious which package you should install to get that
file. @strong{tracefile} (from the @strong{tangetools} collection) can list the
files a program tries to access, and on Debian @strong{apt-file} can search
for a file in all packages. Combining the two and looking at the last
files accessed:

@verbatim
  $ tracefile -un ./configure | tail | parallel -j0 apt-file search
@end verbatim
@node ENVIRONMENT VARIABLES
@chapter ENVIRONMENT VARIABLES

@table @asis
@item $PARALLEL_HOME
@anchor{$PARALLEL_HOME}

Dir where GNU @strong{parallel} stores config files, semaphores, and caches
information between invocations. Default: $HOME/.parallel.

@item $PARALLEL_ARGHOSTGROUPS (beta testing)
@anchor{$PARALLEL_ARGHOSTGROUPS (beta testing)}

When using @strong{--hostgroups} GNU @strong{parallel} sets this to the hostgroups
of the job.

Remember to quote the $, so it gets evaluated by the correct shell. Or
use @strong{--plus} and @{agrp@}.

@item $PARALLEL_HOSTGROUPS
@anchor{$PARALLEL_HOSTGROUPS}

When using @strong{--hostgroups} GNU @strong{parallel} sets this to the hostgroups
of the sshlogin that the job is run on.

@end table

@node PROFILE FILES
@chapter PROFILE FILES

If @strong{--profile} is set, GNU @strong{parallel} will read the profile from
that file rather than the global or user configuration files. You can
have
multiple @strong{--profiles}.

Profiles are searched for in @strong{~/.parallel}. If the name starts with
@strong{/} it is seen as an absolute path. If the name starts with @strong{./} it
is seen as a relative path from current dir.

Example: Profile for running a command on every sshlogin in
~/.ssh/sshlogins and prepending the output with the sshlogin:

@verbatim
  echo --tag -S .. --nonall > ~/.parallel/nonall_profile
  parallel -J nonall_profile uptime
@end verbatim

Example: Profile for running every command with @strong{-j-1} and @strong{nice}:

@verbatim
  echo -j-1 nice > ~/.parallel/nice_profile
  parallel -J nice_profile bzip2 -9 ::: *
@end verbatim

Example: Profile for running a perl script before every command: