**SHTns is a high-performance library for Spherical Harmonic Transforms written in C,
aimed at numerical simulation (fluid flows, MHD, ...) in spherical geometries.**

Copyright (c) 2010-2014 Centre National de la Recherche Scientifique.
Written by Nathanael Schaeffer (CNRS, ISTerre, Grenoble, France).
SHTns is distributed under the open source [CeCILL License](http://www.cecill.info/licences/Licence_CeCILL_V2.1-en.html)
(GPL compatible) located in the LICENSE file.

FEATURES:
---------

- **blazingly fast**.
- both **scalar and vector transforms**.
- backward and forward (synthesis and analysis) functions.
- flexible truncation (degree, order, azimuthal periodicity).
- spatial data can be stored in latitude-major or longitude-major arrays.
- various conventions (normalization and Condon-Shortley phase).
- can be used from **Fortran, C/C++, and Python** programs.
- a highly efficient Gauss algorithm working with Gauss nodes (based on
  Gauss-Legendre quadrature).
- support for SSE2, SSE3 and **AVX** vectorization, as well as Xeon Phi and
  Blue Gene/Q.
- **parallel transforms with OpenMP** (for Gauss grid only).
- an algorithm using DCT for regular nodes (generalized Fejer quadrature).
- synthesis (inverse transform) at any coordinate (not constrained to a grid).
- ability to choose the optimal spatial sizes for a given spectral truncation.
- **on-the-fly transforms**: saving memory and bandwidth, they are even faster
  on modern architectures.
- accurate up to spherical harmonic degree l=16383 (at least).
- rotation functions to rotate spherical harmonics (beta).
- special spectral operator functions that do not require a transform
  (multiply by cos(theta)...).
- scalar transforms for complex spatial fields.
- SHT at fixed m (without FFT, aka Legendre transform - beta).

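
The Gauss grid in the feature list rests on Gauss-Legendre quadrature. The sketch below is plain Python and is not SHTns code: the function `gauss_legendre` and the variables `lmax` and `nlat` are names chosen here for illustration only. It computes the quadrature nodes (the cosines of the colatitudes of a Gauss grid) and weights with a Newton iteration, then checks that `nlat = lmax + 1` nodes integrate a polynomial of degree `2*lmax` to rounding error while `nlat = lmax` nodes do not. This exactness up to degree `2*nlat - 1` is the usual reason a Gauss grid needs more than `lmax` latitudes; the sizes SHTns actually selects are described in its own documentation.

```python
import math

def gauss_legendre(n, tol=1e-14, max_iter=100):
    """Return (nodes, weights) of the n-point Gauss-Legendre rule on [-1, 1].

    Illustrative helper, not part of SHTns. Nodes are returned in
    decreasing order, i.e. from the north pole to the south pole when
    read as x = cos(theta).
    """
    nodes, weights = [], []
    for i in range(n):
        # Classical initial guess for the i-th root of the Legendre polynomial P_n.
        x = math.cos(math.pi * (i + 0.75) / (n + 0.5))
        for _ in range(max_iter):
            # Three-term recurrence: (k+1) P_{k+1} = (2k+1) x P_k - k P_{k-1}.
            p_prev, p = 1.0, x
            for k in range(1, n):
                p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
            # Derivative: P_n'(x) = n (x P_n - P_{n-1}) / (x^2 - 1).
            dp = n * (x * p - p_prev) / (x * x - 1.0)
            dx = p / dp
            x -= dx  # Newton step towards the root
            if abs(dx) < tol:
                break
        nodes.append(x)
        weights.append(2.0 / ((1.0 - x * x) * dp * dp))
    return nodes, weights

lmax = 15                         # illustrative truncation degree
exact = 2.0 / (2 * lmax + 1)      # integral of x**(2*lmax) over [-1, 1]

for nlat in (lmax + 1, lmax):     # enough nodes, then one node too few
    x, w = gauss_legendre(nlat)
    approx = sum(wi * xi ** (2 * lmax) for xi, wi in zip(x, w))
    print(nlat, "nodes, quadrature error:", abs(approx - exact))
```

The first line printed should show an error at the level of floating-point rounding, the second a visibly larger one, since a degree `2*lmax` polynomial exceeds the exactness limit `2*nlat - 1` when `nlat = lmax`.
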
INSTALL:
--------

Briefly, the shell commands `./configure; make; make install` should
configure, build, and install this package. `./configure --help` will
list available options (among which `--enable-openmp` and `--enable-python`).
However, in order to get the best performance, it is highly recommended to
compile and install the FFTW library yourself, because many distributions
include a non-optimized FFTW library.

DOCUMENTATION:
--------------

- Online documentation is available: <http://users.isterre.fr/nschaeff/SHTns/>
- You can build it locally: run `make docs` to generate the documentation
  (requires doxygen).
  Then browse the HTML documentation starting with `doc/html/index.html`.
- A related research paper has been published:
  [Efficient Spherical Harmonic Transforms aimed at pseudo-spectral numerical simulations](http://dx.doi.org/10.1002/ggge.20071),
  also [available from arXiv](http://arxiv.org/abs/1202.6522).
- If you use SHTns for research work, please **cite this paper**:

        @article {shtns,
          author = {Schaeffer, Nathanael},
          title = {Efficient spherical harmonic transforms aimed at
                   pseudospectral numerical simulations},
          journal = {Geochemistry, Geophysics, Geosystems},
          doi = {10.1002/ggge.20071},
          volume = {14}, number = {3}, pages = {751--758},
          year = {2013},
        }

CHANGE LOG:
-----------

* v2.6.3 (9 Mar 2015)
    - better default compilation flags for icc.
    - complex transforms added to Fortran API (thanks to Bertrand Putigny).

* v2.6.2 (30 Dec 2014)
    - fix regression: Schmidt normalized analysis failed since v2.6 in some cases.

* v2.6.1 (17 Dec 2014)
    - new functions in Python interface to control console output.
    - fix: `spat_cplx_to_SH()` and `SH_to_spat_cplx()` were missing
      a (-1)^m for m<0 [issue #16].
    - fix: segfault in `spat_to_SH_ml()` [issue #15].
    - fix a few compilation issues.

* v2.6 (24 Oct 2014)
    - support for IBM Blue Gene/Q (QPX) with [bgclang](http://trac.alcf.anl.gov/projects/llvm-bgq).
      Configure with `./configure --enable-many-core CC=bgclang`
    - new beta feature: SHT at fixed m (aka Legendre transform).
    - faster initialization with OpenMP.
    - fix: in Python, a rare coredump now correctly raises an exception.
    - fix: a few compilation problems.

* v2.5 (13 Mar 2014)
    - new experimental support for Intel Xeon Phi (MIC) in native mode,
      with contributions from Vincent Boulos (Bull). For good performance
      icc 14 is required. Configure with `./configure --enable-mic CC=icc`
    - fftw3.h included for easier compilation.
    - fix: obey the `OMP_NUM_THREADS` environment variable.
    - fix: failure of on-the-fly analysis with some special (rare) sizes.
    - add missing `shtns_print_cfg()` to Fortran interface.
    - new save/restore plan feature for bit-level reproducibility.

* v2.4.1 (18 Sep 2013)
    - performance improvement: analysis with `SHT_PHI_CONTIGUOUS` is now
      on par with synthesis (or better), even for large transforms.

* v2.4 (5 Aug 2013)
    - new scalar transforms for complex spatial fields: `SH_to_spat_cplx()`
      and `spat_cplx_to_SH()`.
    - new `shtns_verbose()` function to control output during initialization.
    - better MKL support (including multi-thread). Warning: MKL's FFTW
      interface is not thread safe, so SHTns cannot be called from multiple
      threads if compiled with MKL.
    - fix compatibility with C++ std::complex.
    - new shallow water simulation example in examples/

* v2.3.1 (10 Apr 2013)
    - OpenMP library is now installed as `libshtns_omp.a`.
    - fix detection of OpenMP multithreaded FFTW.
    - new configure option `--enable-mkl` to use the FFT of the MKL
      library instead of FFTW (lower performance expected).
    - `time_SHT` can be compiled on MacOSX and uses less memory.
    - new `SH_to_lat()` function.
    - a few other minor improvements and fixes.

* v2.3 (3 Oct 2012)
    - added `mi` member in `shtns_info` structure (ABI change).
    - added function to access the Gauss nodes.
    - added support for special operators in spectral space (multiplication
      by cos(theta) and sin(theta).d/dtheta for instance).
    - shtns.h is now compatible with C++.
    - better Python interface for rotations.
    - performance improvement for OpenMP code without `fftw3_omp`.
    - slightly faster `SH_to_point()` [5%] and `SHqst_to_point()` [20%].
    - bugfix: in some rare cases, OpenMP code freed unallocated memory.
    - bugfix: fixed Python interface compilation with clang.

* v2.2.4 (25 Jun 2012)
    - the previous critical bugfix had not been applied to parallel OpenMP
      transforms.

* v2.2.3 (24 Jun 2012)
    - critical bugfix: `SHtor_to_spat()` and `SHsph_to_spat()` gave wrong results
      for mmax>0 with on-the-fly transforms.
    - minor bugfix in Python interface.

* v2.2.2 (21 Jun 2012)
    - better Python interface: using `synth()` and `analys()` methods.
    - bugfix in build system: can now compile Python extension without OpenMP.

* v2.2.1 (21 May 2012)
    - slightly faster parallel transforms.
    - better Python interface: decent error handling and keyword argument support.
    - changes to Python interface: grid defaults to `SHT_PHI_CONTIGUOUS`,
      `set_grid_auto()` removed.
    - bugfix: default compilation with FFTW 3.0 to avoid "bad Gauss points" error.
    - bugfix: correct alignment of Gauss weights on 32 bit systems to avoid
      segfaults.
    - new ./configure script for easier configuration and compilation.

* v2.2 (23 Apr 2012)
    - parallel transforms with OpenMP (for Gauss grid, significant benefit
      for l>=127).

* v2.1 (8 Mar 2012)
    - support for huge spherical harmonic degree (tested up to l>43600).
    - speed improvements, especially for large transforms.
    - compilation with FFTW v3.0 or more is now possible through a
      configuration option (see `sht_config.h`).

* v2.0 (9 Feb 2012)
    - support for AVX instruction set (almost x2 speed-up on Sandy-Bridge
      processors).
    - allow multiple transforms with different sizes, normalizations and
      grids (C interface only).
    - changes to C interface: most functions now require a handle to identify
      the transform. (Fortran interface unchanged)
    - transforms are accurate up to spherical harmonic degree l=2700 (at least).
    - lots of small improvements, speed-ups and a few bug fixes.
    - requires FFTW v3.3.
    - better Python interface using NumPy arrays (beta).
    - rotation functions to rotate spherical harmonics (beta).

* v1.5 (4 May 2011)
    - on-the-fly transforms which do not require huge matrices: save memory
      and bandwidth, and can be faster on some architectures.
    - runtime selection of fastest algorithm, including on-the-fly transforms.
    - transforms are accurate up to spherical harmonic degree l=2045 (at least).
    - fix a bug that led to wrong results for `SHtor_to_spat` and `SHsph_to_spat`.
    - a bunch of minor improvements, optimizations and fixes.

* v1.0 (9 June 2010)
    - initial release for C/C++ and Fortran under CeCILL licence (GPL compatible).
    - scalar and vector, forward and backward transforms.
    - support several normalization conventions.
    - transforms are accurate up to spherical harmonic degree l=1300 (at least).
    - flexible truncation and spatial sizes.
    - support spatial data stored in latitude-major or longitude-major arrays.
    - regular grid (with DCT acceleration) or Gauss grid (highly optimized).
    - SSE2 vectorization.
    - synthesis at any coordinate (not constrained to grid).
    - can choose the optimal spatial size for a given spectral truncation.
    - requires FFTW 3.0.