"Fossies" - the Fresh Open Source Software Archive  

Source code changes of the file "Basic/Pod/BadValues.pod" between
PDL-2.078.tar.gz and PDL-2.079.tar.gz

About: PDL (Perl Data Language) aims to turn perl into an efficient numerical language for scientific computing (similar to IDL and MatLab).

BadValues.pod  (PDL-2.078):BadValues.pod  (PDL-2.079)
skipping to change at line 27 skipping to change at line 27
If you're not interested in this, then you may (rightly) be concerned If you're not interested in this, then you may (rightly) be concerned
with how this affects the speed of PDL, since the overhead of checking for a with how this affects the speed of PDL, since the overhead of checking for a
bad value at each operation can be large. bad value at each operation can be large.
Because of this, the code has been written to be as fast as possible - Because of this, the code has been written to be as fast as possible -
particularly when operating on ndarrays which do not contain bad values. particularly when operating on ndarrays which do not contain bad values.
In fact, you should notice essentially no speed difference when working In fact, you should notice essentially no speed difference when working
with ndarrays which do not contain bad values. with ndarrays which do not contain bad values.
You may also ask 'well, my computer supports IEEE NaN, so I already have this'. You may also ask 'well, my computer supports IEEE NaN, so I already have this'.
Well, yes and no - many routines, such as C<y=sin(x)>, will propagate NaN's They are different things; a bad value signifies "leave this out of
processing", whereas NaN is the result of a mathematically-invalid
operation.
Many routines, such as C<y=sin(x)>, will propagate NaN's
without the user having to code differently, but routines such as C<qsort>, or without the user having to code differently, but routines such as C<qsort>, or
finding the median of an array, need to be re-coded to handle bad values. finding the median of an array, need to be re-coded to handle bad values.
For floating-point datatypes, C<NaN> and C<Inf> can be used to flag bad For floating-point datatypes, C<NaN> and C<Inf> can be used to flag bad
values, but by default values, but by default
special values are used (L<Default bad values|/Default bad values>). I special values are used (L<Default bad values|/Default bad values>).
do not have any benchmarks to see which option is faster.
As of PDL 2.040, you can have different bad values for separate ndarrays of the There is one default bad value for each datatype, but
as of PDL 2.040, you can have different bad values for separate ndarrays of the
same type. same type.
You can use C<NaN> as the bad value for any floating-point type,
including complex.
=head2 A quick overview =head2 A quick overview
pdl> $x = sequence(4,3); pdl> $x = sequence(4,3);
pdl> p $x pdl> p $x
[ [
[ 0 1 2 3] [ 0 1 2 3]
[ 4 5 6 7] [ 4 5 6 7]
[ 8 9 10 11] [ 8 9 10 11]
] ]
pdl> $x = $x->setbadif( $x % 3 == 2 ) pdl> $x = $x->setbadif( $x % 3 == 2 )
skipping to change at line 64 skipping to change at line 71
pdl> $x *= 3 pdl> $x *= 3
pdl> p $x pdl> p $x
[ [
[ 0 3 BAD 9] [ 0 3 BAD 9]
[ 12 BAD 18 21] [ 12 BAD 18 21]
[BAD 27 30 BAD] [BAD 27 30 BAD]
] ]
pdl> p $x->sum pdl> p $x->sum
120 120
C<demo bad> and C<demo bad2> C<demo bad>
within L<perldl|PDL::perldl> or L<pdl2|PDL::Perldl2> gives a demonstration of so me of the things within L<perldl|PDL::perldl> or L<pdl2|PDL::Perldl2> gives a demonstration of so me of the things
possible with bad values. These are also available on PDL's web-site, possible with bad values. These are also available on PDL's web-site,
at F<http://pdl.perl.org/demos/>. See L<PDL::Bad> for useful routines for worki ng at F<http://pdl.perl.org/demos/>. See L<PDL::Bad> for useful routines for worki ng
with bad values and F<t/bad.t> to see them in action. with bad values and F<t/bad.t> to see them in action.
To find out if a routine supports bad values, use the C<badinfo> command in To find out if a routine supports bad values, use the C<badinfo> command in
L<perldl|PDL::perldl> or L<pdl2|PDL::Perldl2> or the C<-b> option to L<perldl> or L<pdl2> or the C<-b> option to L<pdldoc>.
L<pdldoc|PDL::pdldoc>. This facility is currently a 'proof of concept'
(or, more realistically, a quick hack) so expect it to be rough around the edges
.
Each ndarray contains a flag - accessible via C<< $pdl->badflag >> - to say Each ndarray contains a flag - accessible via C<< $pdl->badflag >> - to say
whether there's any bad data present: whether there's any bad data present:
=over 4 =over 4
=item * =item *
If B<false/0>, which means there's no bad data here, the code supplied by the If B<false/0>, which means there's no bad data here, the code supplied by the
C<Code> option to C<pp_def()> is executed. C<Code> option to C<pp_def()> is executed.
skipping to change at line 116 skipping to change at line 121
pdl> $x = zeroes(20,30); pdl> $x = zeroes(20,30);
pdl> $y = $x->slice('0:10,0:10'); pdl> $y = $x->slice('0:10,0:10');
pdl> $c = $y->slice(',(2)'); pdl> $c = $y->slice(',(2)');
pdl> print ">>c: ", $c->badflag, "\n"; pdl> print ">>c: ", $c->badflag, "\n";
>>c: 0 >>c: 0
pdl> $x->badflag(1); pdl> $x->badflag(1);
pdl> print ">>c: ", $c->badflag, "\n"; pdl> print ">>c: ", $c->badflag, "\n";
>>c: 1 >>c: 1
I<No> change is made to the parents of an ndarray, so This is also propagated to the parents of an ndarray, so
pdl> print ">>a: ", $x->badflag, "\n"; pdl> print ">>a: ", $x->badflag, "\n";
>>a: 1 >>a: 1
pdl> $c->badflag(0); pdl> $c->badflag(0);
pdl> print ">>a: ", $x->badflag, "\n"; pdl> print ">>a: ", $x->badflag, "\n";
>>a: 1 >>a: 0
Thoughts:
=over 4
=item *
the badflag can ONLY be cleared IF an ndarray has NO parents,
and that this change will propagate to all the children of that
ndarray. I am not so keen on this anymore (too awkward to code, for
one).
=item *
C<< $x->badflag(1) >> should propagate the badflag to BOTH parents and
children.
=back
This shouldn't be hard to implement (although an initial attempt failed!). There's also
Does it make sense though? There's also
the issue of what happens if you change the badvalue of an ndarray - should the issue of what happens if you change the badvalue of an ndarray - should
these propagate to children/parents (yes) or whether you should only be these propagate to children/parents (yes) or whether you should only be
able to change the badvalue at the 'top' level - i.e. those ndarrays which do able to change the badvalue at the 'top' level - i.e. those ndarrays which do
not have parents. not have parents.
The C<orig_badvalue()> method returns the compile-time value for a given The C<orig_badvalue()> method returns the compile-time value for a given
datatype. It works on ndarrays, PDL::Type objects, and numbers - eg datatype. It works on ndarrays, PDL::Type objects, and numbers - eg
$pdl->orig_badvalue(), byte->orig_badvalue(), and orig_badvalue(4). $pdl->orig_badvalue(), byte->orig_badvalue(), and orig_badvalue(4).
It also has a horrible name...
To get the current bad value, use the C<badvalue()> method - it has the same To get the current bad value, use the C<badvalue()> method - it has the same
syntax as C<orig_badvalue()>. syntax as C<orig_badvalue()>.
To change the current bad value, supply the new number to badvalue - eg To change the current bad value, supply the new number to badvalue - eg
$pdl->badvalue(2.3), byte->badvalue(2), badvalue(5,-3e34). $pdl->badvalue(2.3), byte->badvalue(2), badvalue(5,-3e34).
I<Note>: the value is silently converted to the correct C type, and I<Note>: the value is silently converted to the correct C type, and
returned - i.e. C<< byte->badvalue(-26) >> returns 230 on my Linux machine. returned - i.e. C<< byte->badvalue(-26) >> returns 230 on my Linux machine.
Note that changes to the bad value are I<NOT> propagated to previously-created Note that changes to the bad value are I<NOT> propagated to previously-created
ndarrays - they will still have the bad value set, but suddenly the elements ndarrays - they will still have the bad flag set, but suddenly the elements
that were bad will become 'good', but containing the old bad value. that were bad will become 'good', but containing the old bad value.
See discussion below. It's not a problem for floating-point types See discussion below.
which use NaN, since you can not change their badvalue.
=head2 Bad values and boolean operators =head2 Bad values and boolean operators
For those boolean operators in L<PDL::Ops>, evaluation For those boolean operators in L<PDL::Ops>, evaluation
on a bad value returns the bad value. Whilst this means that on a bad value returns the bad value. This:
$mask = $img > $thresh; $mask = $img > $thresh;
correctly propagates bad values, it I<will> cause problems correctly propagates bad values. This will omit any bad values, but
for checks such as return a bad value if there are no good ones:
do_something() if any( $img > $thresh );
which need to be re-written as something like $bool = any( $img > $thresh );
do_something() if any( setbadtoval( ($img > $thresh), 0 ) ); As of 2.077, a bad value used as a boolean will throw an exception.
When using one of the 'projection' functions in L<PDL::Ufunc> - such as When using one of the 'projection' functions in L<PDL::Ufunc> - such as
L<orover|PDL::Ufunc/orover> - L<orover|PDL::Ufunc/orover> -
bad values are skipped over (see the documentation of these bad values are skipped over (see the documentation of these
functions for the current (poor) handling of the case when functions for the current handling of the case when
all elements are bad). all elements are bad).
=head2 A bad value for each ndarray, and related issues
There is one default bad value for each datatype, but
you can have a separate bad value for each ndarray as of PDL 2.040.
=head1 IMPLEMENTATION DETAILS =head1 IMPLEMENTATION DETAILS
PDL code just needs to access the C<%PDL::Config>
array (e.g. F<Basic/Bad/bad.pd>) to find out whether bad-value support is requir
ed.
A new flag has been added to the state of an ndarray - C<PDL_BADVAL>. If unset, then A new flag has been added to the state of an ndarray - C<PDL_BADVAL>. If unset, then
the ndarray does not contain bad values, and so all the support code can be the ndarray does not contain bad values, and so all the support code can be
ignored. If set, it does not guarantee that bad values are present, just that ignored. If set, it does not guarantee that bad values are present, just that
they should be checked for. Thanks to Christian, C<badflag()> - which they should be checked for.
sets/clears this flag (see F<Basic/Bad/bad.pd>) - will update I<ALL> the
children/grandchildren/etc of an ndarray if its state changes (see
C<badflag> in F<Basic/Bad/bad.pd> and
C<propagate_badflag> in F<Basic/Core/Core.xs.PL>).
It's not clear what to do with parents: I can see the reason for propagating a
'set badflag' request to parents, but I think a child should NOT be able to clea
r
the badflag of a parent.
There's also the issue of what happens when you change the bad value for an ndar
ray.
The C<pdl_trans> structure has been extended to include an integer value, The C<pdl_trans> structure has been extended to include an integer value,
C<bvalflag>, which acts as a switch to tell the code whether to handle bad value s C<bvalflag>, which acts as a switch to tell the code whether to handle bad value s
or not. This value is set if any of the input ndarrays have their C<PDL_BADVAL> or not. This value is set if any of the input ndarrays have their C<PDL_BADVAL>
flag set (although this code can be replaced by setting C<FindBadStateCode> in flag set (although this code can be replaced by setting C<FindBadStateCode> in
pp_def). The logic of the check is going to get a tad more complicated pp_def).
if I allow routines to fall back to using the C<Code> section for
floating-point types. =head2 Default bad values
The default bad values The default bad values
are now stored in a structure within the Core PDL structure are now stored in a structure within the Core PDL structure
- C<PDL.bvals> (eg F<Basic/Core/pdlcore.h.PL>); see also - C<PDL.bvals> (eg F<Basic/Core/pdlcore.h.PL>); see also
C<typedef badvals> in F<Basic/Core/pdl.h.PL> and the C<typedef badvals> in F<Basic/Core/pdl.h.PL> and the
BOOT code of F<Basic/Core/Core.xs.PL> where the values are initialised to BOOT code of F<Basic/Core/Core.xs.PL> where the values are initialised to
(hopefully) sensible values. (hopefully) sensible values.
See F<PDL/Bad/bad.pd> for read/write routines to the values. See L<PDL::Bad/badvalue> and L<PDL::Bad/orig_badvalue> for read/write routines t
o the values.
=head2 Why not make a PDL subclass?
The support for bad values could have been done as a PDL sub-class. The default/original bad values are set to the C type's maximum (unsigned
The advantage of this approach would be that you only load in the code integers) or the minimum (floating-point and signed integers).
to handle bad values if you actually want to use them.
The downside is that the code then gets separated: any bug fixes/improvements
have to be done to the code in two different files. With the present approach
the code is in the same C<pp_def> function (although there is still the problem
that both C<Code> and C<BadCode> sections need updating).
=head2 Default bad values
The default/original bad values are set to (taken from the Starlink
distribution):
#include <limits.h>
PDL_Byte == UCHAR_MAX
PDL_Short == SHRT_MIN
PDL_Ushort == USHRT_MAX
PDL_Long == INT_MIN
PDL_Float == -FLT_MAX
PDL_Double == -DBL_MAX
=head2 How do I change a routine to handle bad values? =head2 How do I change a routine to handle bad values?
Examples can be found in most of the F<*.pd> files in F<Basic/> (and See L<PDL::PP/BadCode> and L<PDL::PP/HandleBad>.
hopefully many more places soon!).
Some of the logic might appear a bit unclear - that's probably because it
is! Comments appreciated.
All routines should automatically propagate the bad status flag to output
ndarrays, unless you declare otherwise.
If a routine explicitly deals with bad values, you must provide this option
to pp_def:
HandleBad => 1
This ensures that the correct variables are initialised for the C<$ISBAD> etc
macros. It is also used by the automatic document-creation routines to
provide default information on the bad value support of a routine without
the user having to type it themselves (this is in its early stages).
To flag a routine as NOT handling bad values, use
HandleBad => 0
This I<should> cause the routine to print a warning if it's sent any ndarrays
with the bad flag set. Primitive's C<intover> has had this set - since it
would be awkward to convert - but I've not tried it out to see if it works.
If you want to handle bad values but not set the state of all the output
ndarrays, or if it's only one input ndarray that's important, then look
at the PP rules C<NewXSFindBadStatus> and C<NewXSCopyBadStatus> and the
corresponding C<pp_def> options:
=over 4
=item FindBadStatusCode
By default, C<FindBadStatusCode> creates code which sets
C<$PRIV(bvalflag)> depending on the state of the bad flag
of the input ndarrays: see C<findbadstatus> in F<Basic/Gen/PP.pm>.
User-defined code should also store the value of C<bvalflag>
in the C<$BADFLAGCACHE()> variable.
=item CopyBadStatusCode
The default code here is a bit simpler than for C<FindBadStatusCode>:
the bad flag of the output ndarrays are set if
C<$BADFLAGCACHE()> is true after the code has been
evaluated. Sometimes C<CopyBadStatusCode> is set to an empty string,
with the responsibility of setting the badflag of the output ndarray
left to the C<BadCode> section (e.g. the C<xxxover> routines
in F<Basic/Primitive/primitive.pd>).
Prior to PDL 2.4.3 we used C<$PRIV(bvalflag)>
instead of C<$BADFLAGCACHE()>. This is dangerous since the C<$PRIV()>
structure is not guaranteed to be valid at this point in the
code.
=back
If you have a routine that you want to be able to use as in-place, look If you have a routine that you want to be able to use as in-place, look
at the routines in F<bad.pd> (or F<ops.pd>) at the routines in F<bad.pd> (or F<ops.pd>)
which use the C<in-place> option to see how the which use the C<in-place> option to see how the
bad flag is propagated to children using the C<xxxBadStatusCode> options. bad flag is propagated to children using the C<xxxBadStatusCode> options.
I decided not to automate this as rules would be a I decided not to automate this as rules would be a
little complex, since not every in-place op will need to propagate the little complex, since not every in-place op will need to propagate the
badflag (eg unary functions). badflag (eg unary functions).
If the option
HandleBad => 1
is given, then many things happen. For integer types, the readdata code
automatically creates a variable called C<(pdl name)_badval>,
which contains the bad value for that ndarray (see
C<get_xsdatapdecl()> in F<Basic/Gen/PP/PdlParObjs.pm>). However, do not
hard code this name into your code!
Instead use macros (thanks to Tuomas for the suggestion):
'$ISBAD(a(n=>1))' expands to '$a(n=>1) == a_badval'
'$ISGOOD(a())' '$a() != a_badval'
'$SETBAD(bob())' '$bob() = bob_badval'
well, the C<$a(...)> is expanded as well. Also, you can use a C<$> before the
pdl name, if you so wish, but it begins to look like line noise -
eg C<$ISGOOD($a())>.
If you cache an ndarray value in a variable -- eg C<index> in F<slices.pd> --
the following routines are useful:
'$ISBADVAR(c_var,pdl)' 'c_var == pdl_badval'
'$ISGOODVAR(c_var,pdl)' 'c_var != pdl_badval'
'$SETBADVAR(c_var,pdl)' 'c_var = pdl_badval'
The following have been introduced, They may need playing around with to
improve their use.
'$PPISBAD(CHILD,[i]) 'CHILD_physdatap[i] == CHILD_badval'
'$PPISGOOD(CHILD,[i]) 'CHILD_physdatap[i] != CHILD_badval'
'$PPSETBAD(CHILD,[i]) 'CHILD_physdatap[i] = CHILD_badval'
You can use C<NaN> as the bad value for any floating-point type,
including complex.
This all means that you can change This all means that you can change
Code => '$a() = $b() + $c();' Code => '$a() = $b() + $c();'
to to
BadCode => 'if ( $ISBAD(b()) || $ISBAD(c()) ) { BadCode => 'if ( $ISBAD(b()) || $ISBAD(c()) ) {
$SETBAD(a()); $SETBAD(a());
} else { } else {
$a() = $b() + $c(); $a() = $b() + $c();
}' }'
leaving Code as it is. PP::PDLCode will then create a loop something like leaving Code as it is. PP::PDLCode will then create code something like
if ( __trans->bvalflag ) { if ( __trans->bvalflag ) {
broadcastloop over BadCode broadcastloop over BadCode
} else { } else {
broadcastloop over Code broadcastloop over Code
} }
(it's probably easier to just look at the F<.xs> file to see what goes on).
=head2 Going beyond the Code section
Similar to C<BadCode>, there's C<BadBackCode>, and C<BadRedoDimsCode>.
Handling C<EquivCPOffsCode> is a bit different: under the assumption that the
only access to data is via the C<$EQUIVCPOFFS(i,j)> macro, then we can
automatically create the 'bad' version of it; see the C<[EquivCPOffsCode]>
and C<[Code]> rules in L<PDL::PP>.
=head2 Macro access to the bad flag of an ndarray
Macros have been provided to provide access to the bad-flag status of
a pdl:
'$PDLSTATEISBAD(a)' -> '($PDL(a)->state & PDL_BADVAL) > 0'
'$PDLSTATEISGOOD(a)' '($PDL(a)->state & PDL_BADVAL) == 0'
'$PDLSTATESETBAD(a)' '$PDL(a)->state |= PDL_BADVAL'
'$PDLSTATESETGOOD(a)' '$PDL(a)->state &= ~PDL_BADVAL'
For use in C<xxxxBadStatusCode> (+ other stuff that goes into the INIT: section)
there are:
'$SETPDLSTATEBAD(a)' -> 'a->state |= PDL_BADVAL'
'$SETPDLSTATEGOOD(a)' -> 'a->state &= ~PDL_BADVAL'
'$ISPDLSTATEBAD(a)' -> '((a->state & PDL_BADVAL) > 0)'
'$ISPDLSTATEGOOD(a)' -> '((a->state & PDL_BADVAL) == 0)'
In PDL 2.4.3 the C<$BADFLAGCACHE()> macro was introduced for use in
C<FindBadStatusCode> and C<CopyBadStatusCode>.
=head1 WHAT ABOUT DOCUMENTATION? =head1 WHAT ABOUT DOCUMENTATION?
One of the strengths of PDL is its on-line documentation. The aim is to use One of the strengths of PDL is its on-line documentation. The aim is to use
this system to provide information on how/if a routine supports bad values: this system to provide information on how/if a routine supports bad values:
in many cases C<pp_def()> contains all the information anyway, so the in many cases C<pp_def()> contains all the information anyway, so the
function-writer doesn't need to do anything at all! For the cases when this is function-writer doesn't need to do anything at all! For the cases when this is
not sufficient, there's the C<BadDoc> option. For code written at not sufficient, there's the C<BadDoc> option. For code written at
the Perl level - i.e. in a .pm file - use the C<=for bad> pod directive. the Perl level - i.e. in a .pm file - use the C<=for bad> pod directive.
This information will be available via man/pod2man/html documentation. It's also This information will be available via man/pod2man/html documentation. It's also
accessible from the C<perldl> or C<pdl2> shells - using the C<badinfo> command - and the C<pdldoc> accessible from the C<perldl> or C<pdl2> shells - using the C<badinfo> command - and the C<pdldoc>
shell command - using the C<-b> option. shell command - using the C<-b> option.
=head1 CURRENT ISSUES
There are a number of areas that need work, user input, or both! They are
mentioned elsewhere in this document, but this is just to make sure they don't g
et lost.
=head2 Trapping invalid mathematical operations
Should we add exceptions to the functions in C<PDL::Ops> to
set the output bad for out-of-range input values?
pdl> p log10(pdl(10,100,-1))
I would like the above to produce "[1 2 BAD]", but this would
slow down operations on I<all> ndarrays.
We could check for C<NaN>/C<Inf> values after the operation,
but I doubt that would be any faster.
=head2 Dataflow of the badflag
Currently changes to the bad flag are propagated to the children of an ndarray,
but perhaps they should also be passed on to the parents as well. With the
advent of per-ndarray bad values we need to consider how to handle changes
to the value used to represent bad items too.
=head1 EVERYTHING ELSE
The build process has been affected. The following files are
now created during the build:
Basic/Core/pdlcore.h pdlcore.h.PL
pdlcore.c pdlcore.c.PL
pdlapi.c pdlapi.c.PL
Core.xs Core.xs.PL
Core.pm Core.pm.PL
Several new files have been added:
Basic/Pod/BadValues.pod (i.e. this file)
t/bad.t
Basic/Bad/
Basic/Bad/Makefile.PL
bad.pd
etc
=head1 TODO/SUGGESTIONS
=over 4
=item *
what to do about C<$y = pdl(-2); $x = log10($y)> - C<$x> should
be set bad, but it currently isn't.
=item *
Allow the operations in PDL::Ops to skip the check for bad values when using
NaN as a bad value and processing a floating-point ndarray.
Needs a fair bit of work to PDL::PP::PDLCode.
=item *
C<< $pdl->badflag() >> now updates all the children of this ndarray
as well. However, not sure what to do with parents, since:
$y = $x->slice();
$y->badflag(0)
doesn't mean that C<$x> shouldn't have its badvalue cleared.
however, after
$y->badflag(1)
it's sensible to assume that the parents now get flagged as
containing bad values.
PERHAPS you can only clear the bad value flag if you are NOT
a child of another ndarray, whereas if you set the flag then all
children AND parents should be set as well?
Similarly, if you change the bad value in an ndarray, should this
be propagated to parent & children? Or should you only be able to do
this on the 'top-level' ndarray? Nasty...
=item *
some of the names aren't appealing - I'm thinking of C<orig_badvalue()>
in F<Basic/Bad/bad.pd> in particular. Any suggestions appreciated.
=back
=head1 AUTHOR =head1 AUTHOR
Copyright (C) Doug Burke (djburke@cpan.org), 2000, 2006. Copyright (C) Doug Burke (djburke@cpan.org), 2000, 2006.
The per-ndarray bad value support is by Heiko Klein (2006). The per-ndarray bad value support is by Heiko Klein (2006).
Commercial reproduction of this documentation in a different format
is forbidden.
=cut
 End of changes. 29 change blocks. 
313 lines changed or deleted 34 lines changed or added

Home  |  About  |  Features  |  All  |  Newest  |  Dox  |  Diffs  |  RSS Feeds  |  Screenshots  |  Comments  |  Imprint  |  Privacy  |  HTTP(S)