mandrews/spiral_wht_2
Folders and files
| Name | Name | Last commit date | ||
|---|---|---|---|---|
Repository files navigation
README
-------------------------------------------------------------------------------
DESCRIPTION
-------------------------------------------------------------------------------
The WHT package exists for mostly for pedagogical purposes and
serves as a reference model for exploring and experimenting with
architecture specific optimizations. This package is a modified
version of the Spiral WHT 1.8 package:
http://www.spiral.net/software/wht.html
Attention:
It is presently a prelease, and some features and data
structures will
change before the next release.
Background
-------------------------------------------------------------------------------
This package has been created as part of the SPIRAL project:
http://www.spiral.net
The SPIRAL project pursues the goal of automating the implementa-
tion, optimization, and platform adaptation of digital signal
processing (DSP) algorithms. The original design of this package
was based on an FFT package by Sebastian Egner:
http://avalon.ira.uka.de/home/egner
The latest version incorporates a more modular design based on
the FFTW package by Matteo Frigo, and Steven G. Johnson:
http://www.fftw.org
It is possible to use native hardware performance counters using
the PAPI library:
http://icl.cs.utk.edu/papi
Authors
-------------------------------------------------------------------------------
See the AUTHORS file which is distributed with this package.
Copyright
-------------------------------------------------------------------------------
See the COPYING and COPYRIGHT files which are distributed with
this package.
Installation
-------------------------------------------------------------------------------
See the INSTALL file which is distributed with this package.
Getting Started
-------------------------------------------------------------------------------
Once you have configured and compiled the package, several pro-
grams will be available to you.
Verify
-------------------------------------------------------------------------------
This program accepts as input a WHT plan and empyrically deter-
mines, through direct calculation, whether or not the plan is:
1. In the langauge accepted by the library.
2. Numerically correct
To execute this program try:
$ wht_verify -w 'small[1]'
$ wht_verify -w 'split[small[2],small[3]]'
$ wht_verify -w 'split[]'
The last invocation should fail since it is not a string accepted
by the grammar.
Measure
-------------------------------------------------------------------------------
This program measures some metric against the execution of a WHT
plan. By default this utility can measure running time in mi-
croseconds. If configured to use the PAPI library, all perfor-
mance counters available on your architecture can be measured.
To execute this program try:
$ wht_measure -w 'split[small[4],small[4]]' -t 2.0
This will run execute the plan until it has spent a total of 2
seconds executing and print the average computation time in mi-
croseconds. If you have PAPI, try these parameters:
$ wht_measure -w 'split[small[4],small[4]]' -e PAPI -m
TOT_CYC -a 99.5 -p 1
-k 10
This will count the average number of cycles it takes the plan to
execute. The alpha parameter (-a) and rho parameter (-p) specify
that the number of samples should be such that the measured mean
is within 1 % of the true mean with 99.5 % confidence. The run
parameter (-k) specifies that 10 runs are average and considered
a single sample.
Random Plans
-------------------------------------------------------------------------------
This program generates random WHT plans given a set of con-
straints. To be specific, all trees generated by this program are
Uniform Level, Bernoulli Recursive. That is, for each integer
composition within the constraints, we assume that each is equal-
ly likely to occur. If the integer composition can be factored
into another integer composition, we recursively apply the algo-
rithm. If the integer could be factor but could also remain as a
leaf node, we flip a coin to determine if the algorithm should be
recursively applied. To execute this program try:
$ wht_randtree -n 8 -a 2 -b 4
This generates a random WHT trees of order 8, with 2 to 4 chil-
dren at level.
Convert
-------------------------------------------------------------------------------
This program applies code transformations to a given WHT plan to
produce a new plan. If you have enabled interleaveding try:
$ wht_convert -w 'split[small[2],small[2]]' -t 'spli-
til[small[0]]' |
wht_convert -t 'smallil(2)[0]'
This first transforms the split codelet into a split interleaved
codelet, and then interleaves all codelets by 2 if this operation
is allowed.
Maintainers
-------------------------------------------------------------------------------
A set of tools hidden from normal users is available for those
who wish to further develop the package. To rebuild the auto-
tools build system type:
./autogen.sh
To rebuild the documentation including this (README file) type:
./autodoc.sh
Various developer related features are enabled by configuring
with the maintainer mode flag:
./configure --enable-maintainer-mode ...
Adding new codelets to the system:
See also:
wht/registry.h
wht/Makefile.am
Adding new code generators to the system:
See also:
whtgen/whtgen.c
wht/Makefile.am
whtgen/whtgen-simd
Adding generated code to the system:
See also:
wht/codelets/register_codelets.pl
wht/codelets/make_small_codelets.sh
wht/codelets/make_interleave_codelets.sh
wht/codelets/make_simd_codelets.sh
wht/Makefile.am
When submiting a patch please use:
diff -Naur spiral_wht-2.0 my_source_tree | gzip >
my_patch.txt.gz
Publications
-------------------------------------------------------------------------------
* About the original WHT package:
o Jeremy Johnson and Markus Pueschel. In Search of the Optimal
Walsh-Hadamard
Transform. Proc. ICASSP 2000, pp. 3347-3350.
* About DDL:
o Neungsoo Park and Viktor Prasanna. Cache Conscious Walsh-
Hadamard
Transform. Proc. ICASSP 2001, Vol. II.
* About Loop Interleaving:
o K. S. Gatlin and L. Carter. Faster FFTs via Architecture-Cog-
nizance. Proc.
PACT, 2000.
* About Parallelization with openMP:
o Kang Chen and Jeremy Johnson. A Prototypical Self-Optimiza-
tion Package for
Parallel Implementation of Fast. Signal Transforms. IPDPS
2002.
* About Performance Analysis:
o Michael Andrews and Jeremy Johnson. Performance Analysis of a
Family of WHT
Algorithms. IPDPS 2007.