"Fossies" - the Fresh Open Source Software Archive  

Source code changes of the file "README.html" between
hpcc-1.5.0b.tar.gz and hpcc-1.5.0.tar.gz

About: HPCC (HPC Challenge) benchmark consists of basically 7 tests: HPL, STREAM, RandomAccess, PTRANS, FFTE, DGEMM and b_eff Latency/Bandwidth.

README.html  (hpcc-1.5.0b):README.html  (hpcc-1.5.0)
<!DOCTYPE html> <!DOCTYPE html>
<html> <html>
<head> <head>
<meta http-equiv="Content-Type" content="text/html; charset=US-ASCII"> <meta http-equiv="Content-Type" content="text/html; charset=US-ASCII">
<meta name="generator" content="hevea 2.00"> <meta name="generator" content="hevea 2.25">
<style type="text/css"> <style type="text/css">
.c000{font-family:monospace;color:navy;} .c000{font-family:monospace}
.li-itemize{margin:1ex 0ex;} .li-itemize{margin:1ex 0ex;}
.li-enumerate{margin:1ex 0ex;} .li-enumerate{margin:1ex 0ex;}
.footnotetext{margin:0ex; padding 0ex;} .footnotetext{margin:0ex; padding:0ex;}
div.footnotetext P{margin:0px; text-indent:1em;} div.footnotetext P{margin:0px; text-indent:1em;}
.thefootnotes{text-align:left;margin:0ex;} .thefootnotes{text-align:left;margin:0ex;}
.dt-thefootnotes{margin:0em;} .dt-thefootnotes{margin:0em;}
.dd-thefootnotes{margin:0em 0em 0em 2em;} .dd-thefootnotes{margin:0em 0em 0em 2em;}
.footnoterule{margin:1em auto 1em 0px;width:50%;} .footnoterule{margin:1em auto 1em 0px;width:50%;}
.title{margin:2ex auto;text-align:center} .title{margin:2ex auto;text-align:center}
.titlemain{margin:1ex 2ex 2ex 1ex;}
.titlerest{margin:0ex 2ex;}
div table{margin-left:inherit;margin-right:inherit;margin-bottom:2px;margin-top: 2px} div table{margin-left:inherit;margin-right:inherit;margin-bottom:2px;margin-top: 2px}
td table{margin:auto;} td table{margin:auto;}
table{border-collapse:collapse;} table{border-collapse:collapse;}
td{padding:0;} td{padding:0;}
pre{text-align:left;margin-left:0ex;margin-right:auto;} pre{text-align:left;margin-left:0ex;margin-right:auto;}
blockquote{margin-left:4ex;margin-right:4ex;text-align:left;} blockquote{margin-left:4ex;margin-right:4ex;text-align:left;}
td p{margin:0px;} td p{margin:0px;}
tt {color:navy}
h2,h3,h4 {color:#527bbd;}
h2 {border-bottom: 2px solid silver;}
</style> </style>
<title>DARPA/DOE HPC&#XA0;Challenge Benchmark version 1.5.0beta <title>DARPA/DOE HPC&#XA0;Challenge Benchmark version 1.5.0beta
</title> </title>
</head> </head>
<body> <body>
<!--HEVEA command line is: hevea -fix -O README.tex --> <!--HEVEA command line is: hevea -fix -O README.tex -->
<!--CUT STYLE article--><!--CUT DEF section 1 --><table class="title"><tr><td><h 1 class="titlemain">DARPA/DOE HPC&#XA0;Challenge Benchmark version 1.5.0beta</h1 ><h3 class="titlerest">Piotr Luszczek<sup><a id="text1" href="#note1">*</a></sup ></h3><h3 class="titlerest">October 12, 2012</h3></td></tr> <!--CUT STYLE article--><!--CUT DEF section 1 --><table class="title"><tr><td st yle="padding:1ex"><h1 class="titlemain">DARPA/DOE HPC&#XA0;Challenge Benchmark v ersion 1.5.0beta</h1><h3 class="titlerest">Piotr Luszczek<sup><a id="text1" href ="#note1">*</a></sup></h3><h3 class="titlerest">October 12, 2012</h3></td></tr>
</table> </table>
<!--TOC section id=sec1 Introduction--> <!--TOC section id="sec1" Introduction-->
<h2 class="section" id="sec1">1&#XA0;&#XA0;Introduction</h2><!--SEC END --><p> <h2 class="section" id="sec1">1&#XA0;&#XA0;Introduction</h2><!--SEC END --><p>
This is a suite of benchmarks that measure performance of processor, This is a suite of benchmarks that measure performance of processor,
memory subsytem, and the interconnect. For details refer to the memory subsytem, and the interconnect. For details refer to the
HPC&#XA0;Challenge web site (<span class="c000">http://icl.cs.utk.edu/hpcc/</spa n>.)</p><p>In essence, HPC&#XA0;Challenge consists of a number of tests each HPC&#XA0;Challenge web site (<span class="c000">http://icl.cs.utk.edu/hpcc/</spa n>.)</p><p>In essence, HPC&#XA0;Challenge consists of a number of tests each
of which measures performance of a different aspect of the system.</p><p>If you are familiar with the High Performance Linpack&#XA0;(HPL) benchmark of which measures performance of a different aspect of the system.</p><p>If you are familiar with the High Performance Linpack&#XA0;(HPL) benchmark
code (see the HPL web site: code (see the HPL web site:
<span class="c000">http://www.netlib.org/benchmark/hpl/</span>) then you can reu se the <span class="c000">http://www.netlib.org/benchmark/hpl/</span>) then you can reu se the
build script file&#XA0;(input for <span class="c000">make(1)</span> command) and the input build script file&#XA0;(input for <span class="c000">make(1)</span> command) and the input
file that you already have for HPL. The HPC&#XA0;Challenge benchmark file that you already have for HPL. The HPC&#XA0;Challenge benchmark
includes HPL and uses its build script and input files with only includes HPL and uses its build script and input files with only
slight modifications. The most important change must be done to the slight modifications. The most important change must be done to the
line that sets the <span class="c000">TOPdir</span> variable. For HPC&#XA0;Chall enge, the line that sets the <span class="c000">TOPdir</span> variable. For HPC&#XA0;Chall enge, the
variable&#X2019;s value should always be <span class="c000">../../..</span> rega rdless of what variable&#X2019;s value should always be <span class="c000">../../..</span> rega rdless of what
it was in the HPL build script file.</p> it was in the HPL build script file.</p>
<!--TOC section id=sec2 Compiling--> <!--TOC section id="sec2" Compiling-->
<h2 class="section" id="sec2">2&#XA0;&#XA0;Compiling</h2><!--SEC END --><p> <h2 class="section" id="sec2">2&#XA0;&#XA0;Compiling</h2><!--SEC END --><p>
The first step is to create a build script file that reflects The first step is to create a build script file that reflects
characteristics of your machine. This file is reused by all the characteristics of your machine. This file is reused by all the
components of the HPC&#XA0;Challenge suite. The build script file should be components of the HPC&#XA0;Challenge suite. The build script file should be
created in the <span class="c000">hpl</span> directory. This directory contains created in the <span class="c000">hpl</span> directory. This directory contains
instructions (the files <span class="c000">README</span> and <span class="c000"> INSTALL</span>) on how instructions (the files <span class="c000">README</span> and <span class="c000"> INSTALL</span>) on how
to create the build script file for your system. The to create the build script file for your system. The
<span class="c000">hpl/setup</span> directory contains many examples of build sc ript <span class="c000">hpl/setup</span> directory contains many examples of build sc ript
files. A recommended approach is to copy one of them to the files. A recommended approach is to copy one of them to the
<span class="c000">hpl</span> directory and if it doesn&#X2019;t work then chang e it.</p><p>The build script file has a name that starts with <span class="c000" >Make.</span> <span class="c000">hpl</span> directory and if it doesn&#X2019;t work then chang e it.</p><p>The build script file has a name that starts with <span class="c000" >Make.</span>
skipping to change at line 116 skipping to change at line 115
loop. If the number of process is not a power two then the code loop. If the number of process is not a power two then the code
is the same as the one performed with the <span class="c000">RA_SANDIA_NOPT</spa n> setting.</li><li class="li-itemize"><span class="c000">RA_TIME_BOUND_DISABLE< /span>: if this symbol is defined then the is the same as the one performed with the <span class="c000">RA_SANDIA_NOPT</spa n> setting.</li><li class="li-itemize"><span class="c000">RA_TIME_BOUND_DISABLE< /span>: if this symbol is defined then the
standard Global RandomAccess code will be used without time limits. This is standard Global RandomAccess code will be used without time limits. This is
discouraged for most runs because the standard algorithm tends to be slow for discouraged for most runs because the standard algorithm tends to be slow for
large array sizes due to a large overhead for short MPI messages.</li><li class= "li-itemize"><span class="c000">USING_FFTW</span>: if this symbol is defined the standard large array sizes due to a large overhead for short MPI messages.</li><li class= "li-itemize"><span class="c000">USING_FFTW</span>: if this symbol is defined the standard
HPC&#XA0;Challenge FFT implemenation&#XA0;(called FFTE) will not be used. HPC&#XA0;Challenge FFT implemenation&#XA0;(called FFTE) will not be used.
Instead, FFTW library will be called. Defining the Instead, FFTW library will be called. Defining the
<span class="c000">USING_FFTW</span> symbol is not sufficient: appropriate flags have <span class="c000">USING_FFTW</span> symbol is not sufficient: appropriate flags have
to be added in the make script so that FFTW headers files can be found to be added in the make script so that FFTW headers files can be found
at compile time and the FFTW libraries at link time.</li></ul> at compile time and the FFTW libraries at link time.</li></ul>
<!--TOC section id=sec3 Runtime Configuration--> <!--TOC section id="sec3" Runtime Configuration-->
<h2 class="section" id="sec3">3&#XA0;&#XA0;Runtime Configuration</h2><!--SEC END --><p> <h2 class="section" id="sec3">3&#XA0;&#XA0;Runtime Configuration</h2><!--SEC END --><p>
The HPC&#XA0;Challenge is driven by a short input file named The HPC&#XA0;Challenge is driven by a short input file named
<span class="c000">hpccinf.txt</span> that is almost the same as the input file for <span class="c000">hpccinf.txt</span> that is almost the same as the input file for
HPL&#XA0;(customarily called <span class="c000">HPL.dat</span>). Refer to the di rectory HPL&#XA0;(customarily called <span class="c000">HPL.dat</span>). Refer to the di rectory
<span class="c000">hpl/www/tuning.html</span> for details about the input file f or <span class="c000">hpl/www/tuning.html</span> for details about the input file f or
HPL. A sample input file is included with the HPC&#XA0;Challenge HPL. A sample input file is included with the HPC&#XA0;Challenge
distribution.</p><p>The differences between HPL&#X2019;s input file and HPC&#XA0 ;Challenge&#X2019;s input distribution.</p><p>The differences between HPL&#X2019;s input file and HPC&#XA0 ;Challenge&#X2019;s input
file can be summarized as follows:</p><ul class="itemize"><li class="li-itemize" > file can be summarized as follows:</p><ul class="itemize"><li class="li-itemize" >
Lines 3 and 4 are ignored. The output is always appended to the Lines 3 and 4 are ignored. The output is always appended to the
file named <span class="c000">hpccoutf.txt</span>. file named <span class="c000">hpccoutf.txt</span>.
skipping to change at line 177 skipping to change at line 176
</li><li class="li-itemize">Line 28: form of L1 for HPL </li><li class="li-itemize">Line 28: form of L1 for HPL
</li><li class="li-itemize">Line 29: form of U for HPL </li><li class="li-itemize">Line 29: form of U for HPL
</li><li class="li-itemize">Line 30: value that specifies whether equilibration should be used by HPL </li><li class="li-itemize">Line 30: value that specifies whether equilibration should be used by HPL
</li><li class="li-itemize">Line 31: memory alignment for HPL </li><li class="li-itemize">Line 31: memory alignment for HPL
</li><li class="li-itemize">Line 32: ignored </li><li class="li-itemize">Line 32: ignored
</li><li class="li-itemize">Line 33: number of additional problem sizes for PTRA NS </li><li class="li-itemize">Line 33: number of additional problem sizes for PTRA NS
</li><li class="li-itemize">Line 34: additional problem sizes for PTRANS </li><li class="li-itemize">Line 34: additional problem sizes for PTRANS
</li><li class="li-itemize">Line 35: number of additional blocking factors for P TRANS </li><li class="li-itemize">Line 35: number of additional blocking factors for P TRANS
</li><li class="li-itemize">Line 36: additional blocking factors for PTRANS </li><li class="li-itemize">Line 36: additional blocking factors for PTRANS
</li></ul> </li></ul>
<!--TOC section id=sec4 Running--> <!--TOC section id="sec4" Running-->
<h2 class="section" id="sec4">4&#XA0;&#XA0;Running</h2><!--SEC END --><p> <h2 class="section" id="sec4">4&#XA0;&#XA0;Running</h2><!--SEC END --><p>
The exact way to run the HPC&#XA0;Challenge benchmark depends on the MPI The exact way to run the HPC&#XA0;Challenge benchmark depends on the MPI
implementation and system details. An example command to run the implementation and system details. An example command to run the
benchmark could like like this: <span class="c000">mpirun -np 4 hpcc</span>. The benchmark could like like this: <span class="c000">mpirun -np 4 hpcc</span>. The
meaning of the command&#X2019;s components is as follows: meaning of the command&#X2019;s components is as follows:
</p><ul class="itemize"><li class="li-itemize"> </p><ul class="itemize"><li class="li-itemize">
<span class="c000">mpirun</span> is the command that starts execution of an MPI <span class="c000">mpirun</span> is the command that starts execution of an MPI
code. Depending on the system, it might also be <span class="c000">aprun</span>, code. Depending on the system, it might also be <span class="c000">aprun</span>,
<span class="c000">mpiexec</span>, <span class="c000">mprun</span>, <span class= "c000">poe</span>, or something <span class="c000">mpiexec</span>, <span class="c000">mprun</span>, <span class= "c000">poe</span>, or something
appropriate for your computer.</li><li class="li-itemize"><span class="c000">-np 4</span> is the argument that specifies that 4 MPI appropriate for your computer.</li><li class="li-itemize"><span class="c000">-np 4</span> is the argument that specifies that 4 MPI
processes should be started. The number of MPI processes should be processes should be started. The number of MPI processes should be
large enough to accomodate all the process grids specified in the large enough to accomodate all the process grids specified in the
<span class="c000">hpccinf.txt</span> file.</li><li class="li-itemize"><span cla ss="c000">hpcc</span> is the name of the HPC&#XA0;Challenge executable to <span class="c000">hpccinf.txt</span> file.</li><li class="li-itemize"><span cla ss="c000">hpcc</span> is the name of the HPC&#XA0;Challenge executable to
run. run.
</li></ul><p>After the run, a file called <span class="c000">hpccoutf.txt</span> is created. It </li></ul><p>After the run, a file called <span class="c000">hpccoutf.txt</span> is created. It
contains results of the benchmark. This file should be uploaded contains results of the benchmark. This file should be uploaded
through the web form at the HPC&#XA0;Challenge website.</p> through the web form at the HPC&#XA0;Challenge website.</p>
<!--TOC section id=sec5 Source Code Changes across Versions (ChangeLog)--> <!--TOC section id="sec5" Source Code Changes across Versions (ChangeLog)-->
<h2 class="section" id="sec5">5&#XA0;&#XA0;Source Code Changes across Versions ( ChangeLog)</h2><!--SEC END --> <h2 class="section" id="sec5">5&#XA0;&#XA0;Source Code Changes across Versions ( ChangeLog)</h2><!--SEC END -->
<!--TOC subsection id=sec6 Version 1.5.0beta (2015-07-23)--> <!--TOC subsection id="sec6" Version 1.5.0 (2016-03-18)-->
<h3 class="subsection" id="sec6">5.1&#XA0;&#XA0;Version 1.5.0beta (2015-07-23)</ <h3 class="subsection" id="sec6">5.1&#XA0;&#XA0;Version 1.5.0 (2016-03-18)</h3><
h3><!--SEC END --><ol class="enumerate" type=1><li class="li-enumerate"> !--SEC END --><ol class="enumerate" type=1><li class="li-enumerate">
Fixed memory leak in STREAM code.
</li><li class="li-enumerate">Fixed bug in STREAM that resulted in minimum resul
ts reported as 0.
</li><li class="li-enumerate">Removed some of the compilation warnings.
</li></ol>
<!--TOC subsection id="sec7" Version 1.5.0beta (2015-07-23)-->
<h3 class="subsection" id="sec7">5.2&#XA0;&#XA0;Version 1.5.0beta (2015-07-23)</
h3><!--SEC END --><ol class="enumerate" type=1><li class="li-enumerate">
Added new targets to the main make(1) file. Added new targets to the main make(1) file.
</li><li class="li-enumerate">Fixed bug introduced while updating to MPI STREAM 1.7 with spurious global communicator (reported by NEC). </li><li class="li-enumerate">Fixed bug introduced while updating to MPI STREAM 1.7 with spurious global communicator (reported by NEC).
</li><li class="li-enumerate">Added make(1) file for OpenMPI from MacPorts. </li><li class="li-enumerate">Added make(1) file for OpenMPI from MacPorts.
</li><li class="li-enumerate">Fixed bug introduced while updating to MPI STREAM 1.7 that caused some ranks to use NULL communicator. </li><li class="li-enumerate">Fixed bug introduced while updating to MPI STREAM 1.7 that caused some ranks to use NULL communicator.
</li><li class="li-enumerate">Fixed bug introduced while updating to MPI STREAM 1.7 that caused syntax errors. </li><li class="li-enumerate">Fixed bug introduced while updating to MPI STREAM 1.7 that caused syntax errors.
</li></ol> </li></ol>
<!--TOC subsection id=sec7 Version 1.5.0alpha (2015-05-22)--> <!--TOC subsection id="sec8" Version 1.5.0alpha (2015-05-22)-->
<h3 class="subsection" id="sec7">5.2&#XA0;&#XA0;Version 1.5.0alpha (2015-05-22)< <h3 class="subsection" id="sec8">5.3&#XA0;&#XA0;Version 1.5.0alpha (2015-05-22)<
/h3><!--SEC END --><ol class="enumerate" type=1><li class="li-enumerate"> /h3><!--SEC END --><ol class="enumerate" type=1><li class="li-enumerate">
Added global error accounting in STREAM. Added global error accounting in STREAM.
</li><li class="li-enumerate">Updated checking to report from multiple MPI proce sses contributing to overall error. </li><li class="li-enumerate">Updated checking to report from multiple MPI proce sses contributing to overall error.
</li><li class="li-enumerate">Added barrier to make sure all processes enter STR EAM kernel tests at the same time. </li><li class="li-enumerate">Added barrier to make sure all processes enter STR EAM kernel tests at the same time.
</li><li class="li-enumerate">Updated naming conventions to match the original b enchmark in STREAM. </li><li class="li-enumerate">Updated naming conventions to match the original b enchmark in STREAM.
</li><li class="li-enumerate">Changed scaling constant to prevent verification f rom overflowing in STREAM. </li><li class="li-enumerate">Changed scaling constant to prevent verification f rom overflowing in STREAM.
</li><li class="li-enumerate">Simplified MPI communicator code in STREAM. </li><li class="li-enumerate">Simplified MPI communicator code in STREAM.
</li><li class="li-enumerate">Substituted large constants for more descriptive c ompile time arithmetic in STREAM. </li><li class="li-enumerate">Substituted large constants for more descriptive c ompile time arithmetic in STREAM.
</li><li class="li-enumerate">Added the &#X201C;restrict&#X201D; keyword to the STREAM vector pointers for faster generated code. </li><li class="li-enumerate">Added the &#X201C;restrict&#X201D; keyword to the STREAM vector pointers for faster generated code.
</li><li class="li-enumerate">Updated STREAM code to the official STREAM MPI ver sion 1.7. </li><li class="li-enumerate">Updated STREAM code to the official STREAM MPI ver sion 1.7.
</li><li class="li-enumerate">Removed infinite loop due to default compiler opti mization in DLAMCH and SLAMCH. </li><li class="li-enumerate">Removed infinite loop due to default compiler opti mization in DLAMCH and SLAMCH.
</li><li class="li-enumerate">Added compiler flags to allow compiling with a C++ compiler. </li><li class="li-enumerate">Added compiler flags to allow compiling with a C++ compiler.
</li></ol> </li></ol>
<!--TOC subsection id=sec8 Version 1.4.3 (2013-08-26)--> <!--TOC subsection id="sec9" Version 1.4.3 (2013-08-26)-->
<h3 class="subsection" id="sec8">5.3&#XA0;&#XA0;Version 1.4.3 (2013-08-26)</h3>< <h3 class="subsection" id="sec9">5.4&#XA0;&#XA0;Version 1.4.3 (2013-08-26)</h3><
!--SEC END --><ol class="enumerate" type=1><li class="li-enumerate"> !--SEC END --><ol class="enumerate" type=1><li class="li-enumerate">
Increased the size of scratch vector for local FFT tests that was Increased the size of scratch vector for local FFT tests that was
missed in the previous version (reported by SGI). missed in the previous version (reported by SGI).
</li><li class="li-enumerate">Added Makefile for Blue Gene/P contributed by Vasi l Tsanov. </li><li class="li-enumerate">Added Makefile for Blue Gene/P contributed by Vasi l Tsanov.
</li></ol> </li></ol>
<!--TOC subsection id=sec9 Version 1.4.2 (2012-10-12)--> <!--TOC subsection id="sec10" Version 1.4.2 (2012-10-12)-->
<h3 class="subsection" id="sec9">5.4&#XA0;&#XA0;Version 1.4.2 (2012-10-12)</h3>< <h3 class="subsection" id="sec10">5.5&#XA0;&#XA0;Version 1.4.2 (2012-10-12)</h3>
!--SEC END --><ol class="enumerate" type=1><li class="li-enumerate"> <!--SEC END --><ol class="enumerate" type=1><li class="li-enumerate">
Increased sizes of scratch vectors for local FFT tests to account for Increased sizes of scratch vectors for local FFT tests to account for
runs on systems with large main memory (reported by IBM, SGI and Intel). runs on systems with large main memory (reported by IBM, SGI and Intel).
</li><li class="li-enumerate">Reduced vector size for local FFT tests due to lar ger scratch space needed. </li><li class="li-enumerate">Reduced vector size for local FFT tests due to lar ger scratch space needed.
</li><li class="li-enumerate">Added a type cast to prevent overflow of a 32-bit integer vector </li><li class="li-enumerate">Added a type cast to prevent overflow of a 32-bit integer vector
size in FFT data generation routine (reported by IBM). size in FFT data generation routine (reported by IBM).
</li><li class="li-enumerate">Fixed variable types to handle array sizes that ov erflow 32-bit </li><li class="li-enumerate">Fixed variable types to handle array sizes that ov erflow 32-bit
integers in RandomAccess (reported by IBM and SGI). integers in RandomAccess (reported by IBM and SGI).
</li><li class="li-enumerate">Changed time-bound code to be used by default in G lobal RandomAccess and </li><li class="li-enumerate">Changed time-bound code to be used by default in G lobal RandomAccess and
allowed for it to be switched off with a compile time flag if necessary. allowed for it to be switched off with a compile time flag if necessary.
</li><li class="li-enumerate">Code cleanup to allow compilation without warnings of RandomAccess test. </li><li class="li-enumerate">Code cleanup to allow compilation without warnings of RandomAccess test.
</li><li class="li-enumerate">Changed communication code in PTRANS to avoid larg e message sizes that </li><li class="li-enumerate">Changed communication code in PTRANS to avoid larg e message sizes that
caused problems in some MPI implementations. caused problems in some MPI implementations.
</li><li class="li-enumerate">Updated documentation in README.txt and README.htm l files. </li><li class="li-enumerate">Updated documentation in README.txt and README.htm l files.
</li></ol> </li></ol>
<!--TOC subsection id=sec10 Version 1.4.1 (2010-06-01)--> <!--TOC subsection id="sec11" Version 1.4.1 (2010-06-01)-->
<h3 class="subsection" id="sec10">5.5&#XA0;&#XA0;Version 1.4.1 (2010-06-01)</h3> <h3 class="subsection" id="sec11">5.6&#XA0;&#XA0;Version 1.4.1 (2010-06-01)</h3>
<!--SEC END --><ol class="enumerate" type=1><li class="li-enumerate"> <!--SEC END --><ol class="enumerate" type=1><li class="li-enumerate">
Added optimized variants of RandomAccess that use Linear Congruential Generator for random number generation. Added optimized variants of RandomAccess that use Linear Congruential Generator for random number generation.
</li><li class="li-enumerate">Made corrections to comments that provide definiti on of the RandomAccess test. </li><li class="li-enumerate">Made corrections to comments that provide definiti on of the RandomAccess test.
</li><li class="li-enumerate">Removed initialization of the main array from the timed section of optimized versions of RandomAccess. </li><li class="li-enumerate">Removed initialization of the main array from the timed section of optimized versions of RandomAccess.
</li><li class="li-enumerate">Fixed the length of the vector used to compute err or when using MPI implementation from FFTW. </li><li class="li-enumerate">Fixed the length of the vector used to compute err or when using MPI implementation from FFTW.
</li><li class="li-enumerate">Added global reduction to error calculation in MPI FFT to achieve more accurate error estimate. </li><li class="li-enumerate">Added global reduction to error calculation in MPI FFT to achieve more accurate error estimate.
</li><li class="li-enumerate">Updated documentation in README. </li><li class="li-enumerate">Updated documentation in README.
</li></ol> </li></ol>
<!--TOC subsection id=sec11 Version 1.4.0 (2010-03-26)--> <!--TOC subsection id="sec12" Version 1.4.0 (2010-03-26)-->
<h3 class="subsection" id="sec11">5.6&#XA0;&#XA0;Version 1.4.0 (2010-03-26)</h3> <h3 class="subsection" id="sec12">5.7&#XA0;&#XA0;Version 1.4.0 (2010-03-26)</h3>
<!--SEC END --><ol class="enumerate" type=1><li class="li-enumerate"> <!--SEC END --><ol class="enumerate" type=1><li class="li-enumerate">
Added new variant of RandomAccess that uses Linear Congruential Generator for ra ndom number generation. Added new variant of RandomAccess that uses Linear Congruential Generator for ra ndom number generation.
</li><li class="li-enumerate">Rearranged the order of benchmarks so that HPL com ponent runs last and may be aborted </li><li class="li-enumerate">Rearranged the order of benchmarks so that HPL com ponent runs last and may be aborted
if the performance of other components was not satisfactory. RandomAccess is now first to assist in tuning if the performance of other components was not satisfactory. RandomAccess is now first to assist in tuning
the code. the code.
</li><li class="li-enumerate">Added global initialization and finalization routi ne that allows to properly initialize </li><li class="li-enumerate">Added global initialization and finalization routi ne that allows to properly initialize
and finalize external software and hardware components without changing the rest of the HPCC testing harness. and finalize external software and hardware components without changing the rest of the HPCC testing harness.
</li><li class="li-enumerate">Lack of <span class="c000">hpccinf.txt</span> is n o longer reported as error but as a warning. </li><li class="li-enumerate">Lack of <span class="c000">hpccinf.txt</span> is n o longer reported as error but as a warning.
</li></ol> </li></ol>
<!--TOC subsection id=sec12 Version 1.3.2 (2009-03-24)--> <!--TOC subsection id="sec13" Version 1.3.2 (2009-03-24)-->
<h3 class="subsection" id="sec12">5.7&#XA0;&#XA0;Version 1.3.2 (2009-03-24)</h3> <h3 class="subsection" id="sec13">5.8&#XA0;&#XA0;Version 1.3.2 (2009-03-24)</h3>
<!--SEC END --><ol class="enumerate" type=1><li class="li-enumerate"> <!--SEC END --><ol class="enumerate" type=1><li class="li-enumerate">
Fixed memory leaks in G-RandomAccess driver routine. Fixed memory leaks in G-RandomAccess driver routine.
</li><li class="li-enumerate">Made the check for 32-bit vector sizes in G-FFT op tional. MKL allows for 64-bit vector sizes in its FFTW wrapper. </li><li class="li-enumerate">Made the check for 32-bit vector sizes in G-FFT op tional. MKL allows for 64-bit vector sizes in its FFTW wrapper.
</li><li class="li-enumerate">Fixed memory bug in single-process FFT. </li><li class="li-enumerate">Fixed memory bug in single-process FFT.
</li><li class="li-enumerate">Update documentation (README). </li><li class="li-enumerate">Update documentation (README).
</li></ol> </li></ol>
<!--TOC subsection id=sec13 Version 1.3.1 (2008-12-09)--> <!--TOC subsection id="sec14" Version 1.3.1 (2008-12-09)-->
<h3 class="subsection" id="sec13">5.8&#XA0;&#XA0;Version 1.3.1 (2008-12-09)</h3> <h3 class="subsection" id="sec14">5.9&#XA0;&#XA0;Version 1.3.1 (2008-12-09)</h3>
<!--SEC END --><ol class="enumerate" type=1><li class="li-enumerate"> <!--SEC END --><ol class="enumerate" type=1><li class="li-enumerate">
Fixed a dead-lock problem in FFT component due to use of wrong communicator. Fixed a dead-lock problem in FFT component due to use of wrong communicator.
</li><li class="li-enumerate">Fixed the 32-bit random number generator in PTRANS that was using 64-bit </li><li class="li-enumerate">Fixed the 32-bit random number generator in PTRANS that was using 64-bit
routines from HPL. routines from HPL.
</li></ol> </li></ol>
<!--TOC subsection id=sec14 Version 1.3.0 (2008-11-13)--> <!--TOC subsection id="sec15" Version 1.3.0 (2008-11-13)-->
<h3 class="subsection" id="sec14">5.9&#XA0;&#XA0;Version 1.3.0 (2008-11-13)</h3> <h3 class="subsection" id="sec15">5.10&#XA0;&#XA0;Version 1.3.0 (2008-11-13)</h3
<!--SEC END --><ol class="enumerate" type=1><li class="li-enumerate"> ><!--SEC END --><ol class="enumerate" type=1><li class="li-enumerate">
Updated HPL component to use HPL 2.0 source code Updated HPL component to use HPL 2.0 source code
<ol class="enumerate" type=a><li class="li-enumerate"> <ol class="enumerate" type=a><li class="li-enumerate">
Replaced 32-bit Pseudo Random Number Generator (PRNG) with a 64-bit one. Replaced 32-bit Pseudo Random Number Generator (PRNG) with a 64-bit one.
</li><li class="li-enumerate">Removed 3 numerical checks of the solution residua l with a single one. </li><li class="li-enumerate">Removed 3 numerical checks of the solution residua l with a single one.
</li><li class="li-enumerate">Added support for 64-bit systems with large memory sizes (before they would </li><li class="li-enumerate">Added support for 64-bit systems with large memory sizes (before they would
overflow during index calculations 32-bit integers.) overflow during index calculations 32-bit integers.)
</li></ol> </li></ol>
</li><li class="li-enumerate">Introduced a limit on FFT vector size so they fit in a 32-bit integer (only </li><li class="li-enumerate">Introduced a limit on FFT vector size so they fit in a 32-bit integer (only
applicable when using FFTW version 2.) applicable when using FFTW version 2.)
</li></ol> </li></ol>
<!--TOC subsection id=sec15 Version 1.2.0 (2007-06-25)--> <!--TOC subsection id="sec16" Version 1.2.0 (2007-06-25)-->
<h3 class="subsection" id="sec15">5.10&#XA0;&#XA0;Version 1.2.0 (2007-06-25)</h3 <h3 class="subsection" id="sec16">5.11&#XA0;&#XA0;Version 1.2.0 (2007-06-25)</h3
><!--SEC END --><ol class="enumerate" type=1><li class="li-enumerate"> ><!--SEC END --><ol class="enumerate" type=1><li class="li-enumerate">
Changes in the FFT component: Changes in the FFT component:
<ol class="enumerate" type=a><li class="li-enumerate"> <ol class="enumerate" type=a><li class="li-enumerate">
Added flexibility in choosing vector sizes and processor counts: Added flexibility in choosing vector sizes and processor counts:
now the code can do powers of 2, 3, and 5 both sequentially and in parallel now the code can do powers of 2, 3, and 5 both sequentially and in parallel
tests. tests.
</li><li class="li-enumerate">FFTW can now run with ESTIMATE (not just MEASURE) flag: it might produce </li><li class="li-enumerate">FFTW can now run with ESTIMATE (not just MEASURE) flag: it might produce
worse performance results but often reduces time to run the test and cuases worse performance results but often reduces time to run the test and cuases
less memory fragmentation. less memory fragmentation.
</li></ol> </li></ol>
</li><li class="li-enumerate">Changes in the DGEMM component: </li><li class="li-enumerate">Changes in the DGEMM component:
skipping to change at line 334 skipping to change at line 339
</li></ol> </li></ol>
</li><li class="li-enumerate">Miscellaneous changes: </li><li class="li-enumerate">Miscellaneous changes:
<ol class="enumerate" type=a><li class="li-enumerate"> <ol class="enumerate" type=a><li class="li-enumerate">
Added better support for Windows-based clusters by taking advantage of Added better support for Windows-based clusters by taking advantage of
Win32 API. Win32 API.
</li><li class="li-enumerate">Added custom memory allocator to deal with memory fragmentation on some </li><li class="li-enumerate">Added custom memory allocator to deal with memory fragmentation on some
systems. systems.
</li><li class="li-enumerate">Added better reporting of configuration options in the output file. </li><li class="li-enumerate">Added better reporting of configuration options in the output file.
</li></ol> </li></ol>
</li></ol> </li></ol>
<!--TOC subsection id=sec16 Version 1.0.0 (2005-06-11)--> <!--TOC subsection id="sec17" Version 1.0.0 (2005-06-11)-->
<h3 class="subsection" id="sec16">5.11&#XA0;&#XA0;Version 1.0.0 (2005-06-11)</h3 <h3 class="subsection" id="sec17">5.12&#XA0;&#XA0;Version 1.0.0 (2005-06-11)</h3
><!--SEC END --> ><!--SEC END -->
<!--TOC subsection id=sec17 Version 0.8beta (2004-10-19)--> <!--TOC subsection id="sec18" Version 0.8beta (2004-10-19)-->
<h3 class="subsection" id="sec17">5.12&#XA0;&#XA0;Version 0.8beta (2004-10-19)</ <h3 class="subsection" id="sec18">5.13&#XA0;&#XA0;Version 0.8beta (2004-10-19)</
h3><!--SEC END --> h3><!--SEC END -->
<!--TOC subsection id=sec18 Version 0.8alpha (2004-10-15)--> <!--TOC subsection id="sec19" Version 0.8alpha (2004-10-15)-->
<h3 class="subsection" id="sec18">5.13&#XA0;&#XA0;Version 0.8alpha (2004-10-15)< <h3 class="subsection" id="sec19">5.14&#XA0;&#XA0;Version 0.8alpha (2004-10-15)<
/h3><!--SEC END --> /h3><!--SEC END -->
<!--TOC subsection id=sec19 Version 0.6beta (2004-08-21)--> <!--TOC subsection id="sec20" Version 0.6beta (2004-08-21)-->
<h3 class="subsection" id="sec19">5.14&#XA0;&#XA0;Version 0.6beta (2004-08-21)</ <h3 class="subsection" id="sec20">5.15&#XA0;&#XA0;Version 0.6beta (2004-08-21)</
h3><!--SEC END --> h3><!--SEC END -->
<!--TOC subsection id=sec20 Version 0.6alpha (2004-05-31)--> <!--TOC subsection id="sec21" Version 0.6alpha (2004-05-31)-->
<h3 class="subsection" id="sec20">5.15&#XA0;&#XA0;Version 0.6alpha (2004-05-31)< <h3 class="subsection" id="sec21">5.16&#XA0;&#XA0;Version 0.6alpha (2004-05-31)<
/h3><!--SEC END --> /h3><!--SEC END -->
<!--TOC subsection id=sec21 Version 0.5beta (2003-12-01)--> <!--TOC subsection id="sec22" Version 0.5beta (2003-12-01)-->
<h3 class="subsection" id="sec21">5.16&#XA0;&#XA0;Version 0.5beta (2003-12-01)</ <h3 class="subsection" id="sec22">5.17&#XA0;&#XA0;Version 0.5beta (2003-12-01)</
h3><!--SEC END --> h3><!--SEC END -->
<!--TOC subsection id=sec22 Version 0.4alpha (2003-11-13)--> <!--TOC subsection id="sec23" Version 0.4alpha (2003-11-13)-->
<h3 class="subsection" id="sec22">5.17&#XA0;&#XA0;Version 0.4alpha (2003-11-13)< <h3 class="subsection" id="sec23">5.18&#XA0;&#XA0;Version 0.4alpha (2003-11-13)<
/h3><!--SEC END --> /h3><!--SEC END -->
<!--TOC subsection id=sec23 Version 0.3alpha (2004-11-05)--> <!--TOC subsection id="sec24" Version 0.3alpha (2004-11-05)-->
<h3 class="subsection" id="sec23">5.18&#XA0;&#XA0;Version 0.3alpha (2004-11-05)< <h3 class="subsection" id="sec24">5.19&#XA0;&#XA0;Version 0.3alpha (2004-11-05)<
/h3><!--SEC END --><!--BEGIN NOTES document--> /h3><!--SEC END --><!--BEGIN NOTES document-->
<hr class="footnoterule"><dl class="thefootnotes"><dt class="dt-thefootnotes"> <hr class="footnoterule"><dl class="thefootnotes"><dt class="dt-thefootnotes">
<a id="note1" href="#text1">*</a></dt><dd class="dd-thefootnotes"><div class="fo otnotetext">University of Tennessee Knoxville, Innovative <a id="note1" href="#text1">*</a></dt><dd class="dd-thefootnotes"><div class="fo otnotetext">University of Tennessee Knoxville, Innovative
Computing Laboratory</div> Computing Laboratory</div></dd></dl>
</dd></dl>
<!--END NOTES--> <!--END NOTES-->
<!--CUT END --> <!--CUT END -->
<!--HTMLFOOT--> <!--HTMLFOOT-->
<!--ENDHTML--> <!--ENDHTML-->
<!--FOOTER--> <!--FOOTER-->
<hr style="height:2"><blockquote class="quote"><em>This document was translated from L<sup>A</sup>T<sub>E</sub>X by <hr style="height:2"><blockquote class="quote"><em>This document was translated from L<sup>A</sup>T<sub>E</sub>X by
</em><a href="http://hevea.inria.fr/index.html"><em>H<span style="font-size:smal l"><sup>E</sup></span>V<span style="font-size:small"><sup>E</sup></span>A</em></ a><em>.</em></blockquote></body> </em><a href="http://hevea.inria.fr/index.html"><em>H<span style="font-size:smal l"><sup>E</sup></span>V<span style="font-size:small"><sup>E</sup></span>A</em></ a><em>.</em></blockquote></body>
</html> </html>
 End of changes. 23 change blocks. 
68 lines changed or deleted 74 lines changed or added

Home  |  About  |  All  |  Newest  |  Fossies Dox  |  Screenshots  |  Comments  |  Imprint  |  Privacy  |  HTTPS