"Fossies" - the Fresh Open Source Software Archive  

Source code changes of the file "Basic/Pod/ParallelCPU.pod" between
PDL-2.074.tar.gz and PDL-2.075.tar.gz

About: PDL (Perl Data Language) aims to turn perl into an efficient numerical language for scientific computing (similar to IDL and MatLab).

ParallelCPU.pod  (PDL-2.074):ParallelCPU.pod  (PDL-2.075)
skipping to change at line 40 skipping to change at line 40
# processing operation. # processing operation.
$actualPthreads = get_autopthread_actual(); $actualPthreads = get_autopthread_actual();
# Or compare these to see CPU usage (first one only 1 pthread, second one 10) # Or compare these to see CPU usage (first one only 1 pthread, second one 10)
# in the PDL shell: # in the PDL shell:
$x = ones(10,1000,10000); set_autopthread_targ(1); $y = sin($x)*cos($x); p get_autopthread_actual; $x = ones(10,1000,10000); set_autopthread_targ(1); $y = sin($x)*cos($x); p get_autopthread_actual;
$x = ones(10,1000,10000); set_autopthread_targ(10); $y = sin($x)*cos($x); p get_autopthread_actual; $x = ones(10,1000,10000); set_autopthread_targ(10); $y = sin($x)*cos($x); p get_autopthread_actual;
=head1 Terminology =head1 Terminology
The use of the term I<threading> can be confusing with PDL, because it can refer To reduce the confusion that existed in PDL before 2.075, this document uses
to I<PDL threading>,
as defined in the L<PDL::Threading> docs, or to I<processor multi-threading>.
To reduce confusion with the existing PDL threading terminology, this document uses
B<pthreading> to refer to I<processor multi-threading>, which is the use of multiple processor threads B<pthreading> to refer to I<processor multi-threading>, which is the use of multiple processor threads
to split up numerical processing into parallel operations. to split up numerical processing into parallel operations.
=head1 Functions that control PDL pthreads =head1 Functions that control PDL pthreads
This is a brief listing and description of the PDL pthreading functions, see the L<PDL::Core> docs This is a brief listing and description of the PDL pthreading functions, see the L<PDL::Core> docs
for detailed information. for detailed information.
=over 5 =over 5
skipping to change at line 92 skipping to change at line 89
I<set_autopthread_size> functions made with the environment variable's values. I<set_autopthread_size> functions made with the environment variable's values.
For example, if the environment var B<PDL_AUTOPTHREAD_TARG> is set to 3, and B<PDL_AUTOPTHREAD_SIZE> is For example, if the environment var B<PDL_AUTOPTHREAD_TARG> is set to 3, and B<PDL_AUTOPTHREAD_SIZE> is
set to 10, then any pdl script will run as if the following lines were at the top of the file: set to 10, then any pdl script will run as if the following lines were at the top of the file:
set_autopthread_targ(3); set_autopthread_targ(3);
set_autopthread_size(10); set_autopthread_size(10);
=head1 How It Works =head1 How It Works
The auto-pthreading process works by analyzing threaded array dimensions in PDL The auto-pthreading process works by analyzing broadcast array dimensions in PDL
operations operations (those above the operation's "signature" dimensions)
and splitting up processing based on the thread dimension sizes and desired number of and splitting up processing according to those and the desired number of
pthreads (i.e. the pthread target or pthread_targ). The offsets, pthreads (i.e. the pthread target or pthread_targ). The offsets,
increments, and dimension-sizes (in case the whole dimension does increments, and dimension-sizes (in case the whole dimension does
not divide neatly by the number of pthreads) that PDL uses to step not divide neatly by the number of pthreads) that PDL uses to step
thru the data in memory are modified for each pthread so each one sees a different set of data when thru the data in memory are modified for each pthread so each one sees a different set of data when
performing processing. performing processing.
B<Example> B<Example>
$x = sequence(20,4,3); # Small 3-D Array, size 20,4,3 $x = sequence(20,4,3); # Small 3-D Array, size 20,4,3
# Setup auto-pthreading: # Setup auto-pthreading:
set_autopthread_targ(2); # Target of 2 pthreads set_autopthread_targ(2); # Target of 2 pthreads
set_autopthread_size(0); # Zero so that the small PDLs in this example will be pthreaded set_autopthread_size(0); # Zero so that the small PDLs in this example will be pthreaded
# This will be split up into 2 pthreads # This will be split up into 2 pthreads
$c = maximum($x); $c = maximum($x);
For the above example, the I<maximum> function has a signature of C<(a(n); [o]c())>, which means that the first For the above example, the I<maximum> function has a signature of C<(a(n); [o]c())>, which means that the first
dimension of $x (size 20) is a I<Core> dimension of the I<maximum> function. The other dimensions of $x (size 4,3) dimension of $x (size 20) is a I<Core> dimension of the I<maximum> function. The other dimensions of $x (size 4,3)
are I<threaded> dimensions (i.e. will be threaded-over in the I<maximum> function. are I<broadcast> dimensions (i.e. will be broadcasted-over in the I<maximum> function.
The auto-pthreading algorithm examines the threaded dims of size (4,3) and picks the 4 dimension, The auto-pthreading algorithm examines the broadcasted dims of size (4,3) and picks the 4 dimension,
since it is evenly divisible by the autopthread_targ of 2. The processing of the maximum function is then since it is evenly divisible by the autopthread_targ of 2. The processing of the maximum function is then
split into two pthreads on the size-4 dimension, with dim indexes 0,2 processed by one pthread split into two pthreads on the size-4 dimension, with dim indexes 0,2 processed by one pthread
and dim indexes 1,3 processed by the other pthread. and dim indexes 1,3 processed by the other pthread.
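The index assignment described in the example (indexes 0,2 to one pthread and 1,3 to the other) can be illustrated with a few lines of plain Perl. This is only a sketch of the pattern as the text states it, not PDL's internal code; C<assign_indexes> is a hypothetical helper and does not exist in PDL.

```perl
use strict;
use warnings;

# Hypothetical helper (not a PDL function): deal the indexes of one
# broadcast dimension out to $n pthreads in round-robin order, which
# reproduces the 0,2 / 1,3 assignment described in the example above.
sub assign_indexes {
    my ($size, $n) = @_;
    my @work = map { [] } 1 .. $n;                    # one index list per pthread
    push @{ $work[$_ % $n] }, $_ for 0 .. $size - 1;  # index i goes to pthread i % n
    return \@work;
}

my $work = assign_indexes(4, 2);   # size-4 dimension, pthread target of 2
print "pthread $_: @{ $work->[$_] }\n" for 0 .. $#$work;
# pthread 0: 0 2
# pthread 1: 1 3
```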
=head1 Limitations =head1 Limitations
=head2 Must have POSIX Threads Enabled =head2 Must have POSIX Threads Enabled
Auto-pthreading only works if your PDL installation was compiled with POSIX threads enabled. This is normally Auto-pthreading only works if your PDL installation was compiled with POSIX threads enabled. This is normally
the case if you are running on Windows, Linux, MacOS X, or other unix variants. the case if you are running on Windows, Linux, MacOS X, or other unix variants.
skipping to change at line 138 skipping to change at line 135
Not all the libraries that PDL interfaces to are thread-safe, i.e. they aren't written to operate Not all the libraries that PDL interfaces to are thread-safe, i.e. they aren't written to operate
in a multi-threaded environment without crashing or causing side-effects. Some examples in the PDL in a multi-threaded environment without crashing or causing side-effects. Some examples in the PDL
core are the I<fft> function and the I<pnmout> functions. core are the I<fft> function and the I<pnmout> functions.
To operate properly with these types of functions, the PPCode flag B<NoPthread> has been introduced to indicate To operate properly with these types of functions, the PPCode flag B<NoPthread> has been introduced to indicate
a function as I<not> being pthread-safe. See L<PDL::PP> docs for details. a function as I<not> being pthread-safe. See L<PDL::PP> docs for details.
=head2 Size of PDL Dimensions and pthread Target =head2 Size of PDL Dimensions and pthread Target
As of PDL 2.058, the threaded dimension sizes do not need to divide As of PDL 2.058, the broadcasted dimension sizes do not need to divide
exactly by the pthread target, although if one does, it will be exactly by the pthread target, although if one does, it will be
used. used.
If no dimension is as large as the pthread target, the number of If no dimension is as large as the pthread target, the number of
pthreads will be the size of the largest threaded dimension. pthreads will be the size of the largest broadcasted dimension.
In order to minimise idle CPUs on the last iteration at the end of In order to minimise idle CPUs on the last iteration at the end of
the threaded dimension, the algorithm that picks the dimension to the broadcasted dimension, the algorithm that picks the dimension to
pthread on aims for the largest remainder in dividing the pthread pthread on aims for the largest remainder in dividing the pthread
target into the sizes of the threaded dimensions. For example, if target into the sizes of the broadcasted dimensions. For example, if
a PDL has threaded dimension sizes of (9,6,2) and the I<auto_pthread_targ> a PDL has broadcasted dimension sizes of (9,6,2) and the I<auto_pthread_targ>
is 4, the algorithm will pick the 1-th (size 6), as that will leave is 4, the algorithm will pick the 1-th (size 6), as that will leave
a remainder of 2 (leaving 2 idle at the end) in preference to one a remainder of 2 (leaving 2 idle at the end) in preference to one
with size 9, which would leave 3 idle. with size 9, which would leave 3 idle.
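The selection rule described above can be sketched in plain Perl. This is one reading of the prose, not PDL's actual implementation: C<pick_pthread_dim> is a hypothetical helper, and the tie-breaking (first dimension found wins) and the skipping of dimensions smaller than the target are assumptions.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of PDL). Given a pthread target and the
# broadcast dimension sizes, return (dimension index, number of pthreads).
sub pick_pthread_dim {
    my ($target, @sizes) = @_;
    return (-1, 1) if $target <= 1 || !@sizes;   # nothing to split

    my ($best, $best_rem) = (-1, -1);
    my $largest = 0;
    for my $i (0 .. $#sizes) {
        $largest = $i if $sizes[$i] > $sizes[$largest];
        next if $sizes[$i] < $target;            # assumption: too small for every pthread
        my $rem = $sizes[$i] % $target;
        return ($i, $target) if $rem == 0;       # exact division is used when available
        ($best, $best_rem) = ($i, $rem) if $rem > $best_rem;   # otherwise largest remainder
    }
    return ($best, $target) if $best >= 0;

    # No dimension is as large as the target: fall back to the largest one.
    return ($largest, $sizes[$largest]);
}

my ($dim, $n) = pick_pthread_dim(4, 9, 6, 2);
print "dim $dim, $n pthreads\n";   # dim 1, 4 pthreads
```

With the sizes (9,6,2) and a target of 4, size 9 has remainder 1 and size 6 has remainder 2, so the sketch picks dimension 1, matching the example in the text. With sizes (4,3) and a target of 2 it returns dimension 0, since 4 divides exactly.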
=head2 Speed improvement might be less than you expect. =head2 Speed improvement might be less than you expect.
If you have an 8-core machine and call I<auto_pthread_targ> with 8 If you have an 8-core machine and call I<auto_pthread_targ> with 8
to generate 8 parallel pthreads, you to generate 8 parallel pthreads, you
probably won't get a 8X improvement in speed, due to memory bandwidth issues. Even though you have 8 separate probably won't get a 8X improvement in speed, due to memory bandwidth issues. Even though you have 8 separate
CPUs crunching away on data, you will have (for most common machine architectures) common RAM that now becomes CPUs crunching away on data, you will have (for most common machine architectures) common RAM that now becomes
 End of changes. 8 change blocks. 
17 lines changed or deleted 11 lines changed or added
