=====================================================================
NEW IN VERSION 13
=====================================================================
- (Feature) Added the ability for XSBench to write out a binary
  file containing a randomized XS dataset. The code is also capable
  of reading in this file instead of generating a new XS dataset
  each time the program is run. This feature may be useful for those
  running in simulation environments where walltime minimization
  is key for logistical reasons.
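
  The write-once, read-many idea can be sketched as below. This is a
  minimal illustration only: the XSPoint struct, the file layout (a
  count header followed by the raw array), and the function names are
  assumptions made for the example, not the format XSBench uses. A raw
  dump like this is only readable on a machine with the same struct
  layout and endianness.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for one gridpoint of XS data (not the XSBench layout). */
typedef struct {
    double energy;
    double xs[5];
} XSPoint;

/* Write n points to path: a count header followed by the raw array.
   Returns 0 on success, -1 on failure. */
int write_xs_file(const char *path, const XSPoint *pts, size_t n)
{
    assert(path != NULL && pts != NULL);
    FILE *fp = fopen(path, "wb");
    if (fp == NULL)
        return -1;
    int ok = fwrite(&n, sizeof n, 1, fp) == 1 &&
             fwrite(pts, sizeof *pts, n, fp) == n;
    if (fclose(fp) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/* Read the file back. On success returns a malloc'd array (caller frees)
   and stores the count in *n_out; returns NULL on any failure. */
XSPoint *read_xs_file(const char *path, size_t *n_out)
{
    assert(path != NULL && n_out != NULL);
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return NULL;
    size_t n = 0;
    XSPoint *pts = NULL;
    /* Reject an empty file or a count whose byte size would overflow. */
    if (fread(&n, sizeof n, 1, fp) == 1 && n > 0 && n <= SIZE_MAX / sizeof *pts) {
        pts = malloc(n * sizeof *pts);
        if (pts != NULL && fread(pts, sizeof *pts, n, fp) != n) {
            free(pts);
            pts = NULL;
        }
    }
    fclose(fp);
    if (pts != NULL)
        *n_out = n;
    return pts;
}
```

  A caller would generate the dataset once, call write_xs_file(), and
  on later runs call read_xs_file() instead of regenerating the data.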

- Minor refactoring/reorganization of code to make the code clearer
  and easier to read. After many updates, the code had become a
  little bloated and difficult to read, so a cleanup was in order.

- Removed synthetic delay injection (via dummy FLOPS or loads).
  These were not very useful or accurate and had not been used by
  anyone after the initial analyses were done with them. As they were
  definitely adding to the code bloat of the program, they were
  removed.

=====================================================================
NEW IN VERSION 12
=====================================================================
- (Bugfix) The XL and XXL runtime options didn't work correctly.
  The unionized energy grid overflowed the bounds of normal 4 byte
  integers, and actually required use of 8 byte integers.

  The variables "n_isotopes" and "n_gridpoints" have been refactored
  to 8 byte long integers. All variables that use n_isotopes and
  n_gridpoints as input have also been refactored to 8 byte longs.

  Note that a simple "patch" from version 11 to version 12 can be
  manually done by simply changing line 73 of GridInit.c to be a
  long instead of an int. The more thorough refactoring done in v12
  is done to "future proof" the code.
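
  The overflow can be illustrated with a short sketch. The function
  names and the input values in the usage note are hypothetical, and
  int64_t is used here in place of "long" so the 8 byte width is
  guaranteed on every platform (a plain long is only 4 bytes on some).

```c
#include <assert.h>
#include <stdint.h>

/* Number of points in a grid built from every gridpoint of every
   isotope, computed in 64 bits so the product cannot wrap for
   realistic inputs. */
int64_t unionized_grid_points(int64_t n_isotopes, int64_t n_gridpoints)
{
    assert(n_isotopes >= 0 && n_gridpoints >= 0);
    return n_isotopes * n_gridpoints;
}

/* Returns 1 if the count is representable in a 4 byte signed int, else 0. */
int fits_in_int32(int64_t count)
{
    return count >= INT32_MIN && count <= INT32_MAX;
}
```

  For example, with a made-up 1000 isotopes and 5,000,000 gridpoints
  per isotope, unionized_grid_points() returns 5,000,000,000, which
  fits_in_int32() rejects; the same product computed in a 4 byte int
  would overflow.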

=====================================================================
NEW IN VERSION 11
=====================================================================

- Updated & greatly improved the PAPI capability of XSBench. Events
  can now be tallied during multi-core runs. See README for more
  info.

- Added an option to sleep each thread between macro XS lookups.
  Very similar to adding dummy flops, but a little cleaner.

  With a sleep as short as 0.1 ms, we get linear scaling with threads.
  While this initially appears to confirm our initial suspicions
  regarding memory contention / latency problems, I think the delays
  resulting from the sleeps could potentially just be washing out
  the scaling numbers. Even with just 0.1 ms, over 15 million lookups,
  the majority of the runtime (>90%) is just sleep, so scaling numbers
  aren't very expressive anymore. Need to implement timers that
  ignore the sleep parts.
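
  One possible shape for such a timer is sketched below, assuming a
  POSIX system (clock_gettime and nanosleep). The names
  timed_lookups() and dummy_lookup() are hypothetical placeholders,
  not XSBench functions. Note that timing every lookup adds its own
  overhead, so this is an illustration rather than a tuned design.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <time.h>

/* Seconds on a monotonic clock. */
static double now_sec(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* Hypothetical placeholder for one macro XS lookup. */
static volatile double sink;
void dummy_lookup(void)
{
    for (int i = 0; i < 1000; i++)
        sink += i * 0.5;
}

/* Run n_lookups calls of work(), sleeping sleep_ns after each one.
   Stores the total wall time in *total_sec (if non-NULL) and returns
   only the time spent inside work(), so the sleeps are excluded. */
double timed_lookups(void (*work)(void), long n_lookups, long sleep_ns,
                     double *total_sec)
{
    assert(work != NULL && sleep_ns >= 0 && sleep_ns < 1000000000L);
    struct timespec pause = { 0, sleep_ns };
    double busy = 0.0;
    double start = now_sec();
    for (long i = 0; i < n_lookups; i++) {
        double t0 = now_sec();
        work();
        busy += now_sec() - t0;   /* count the lookup only */
        nanosleep(&pause, NULL);  /* not counted in busy */
    }
    if (total_sec != NULL)
        *total_sec = now_sec() - start;
    return busy;
}
```

  Scaling numbers would then be computed from the returned busy time
  rather than from the total wall time.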

- Specified OpenMP schedule mode as 'dynamic'. This is the default
  on most systems, but now it's set explicitly since it's a lot
  faster than 'static' or other modes.

- Added a "benchmarking" mode, which runs every thread count in the
  range 1 <= nthreads <= max_threads.
  This helps to save considerable benchmarking time, as the
  data structures can be re-used between runs rather than regenerated
  each time. Benchmarking mode is enabled in the makefile.

=====================================================================
NEW IN VERSION 10
=====================================================================

- Changed verification mode to be more portable. The verification
  strategy introduced in version 9 had discrepancies on different
  platforms and compilers. This was due to reliance on the
  compiler-provided rand() function, which produces a different
  series of random numbers from one implementation to another. Also,
  there were some issues with the associativity of floating point
  arithmetic. These issues have now all been solved, and the
  verification hash is consistent across all tested platforms.

- Revised "XL" size parameters, as well as adding in an "XXL" size
  option. The XL size now uses 120 GB of XS data. The XXL mode uses
  252 GB of XS data. More details are in the verification section of
  the readme.

=====================================================================
NEW IN VERSION 9
=====================================================================

- Added in new code verification mode. This can be toggled on in
  the makefile. When code is compiled and run, a hash of the results
  will be generated which can then be compared to other versions and
  configurations of XSBench. See readme for more details.

- Moved PAPI def to makefile. Makes it easier to toggle.

- Added -l command line option to set the number of cross section
  lookups performed by XSBench.

=====================================================================
NEW IN VERSION 8
=====================================================================

- Simplified command line interface (CLI) parsing. XSBench
  now supports a more traditional CLI, as follows:

  Usage: ./XSBench <options>
  Options include:
    -n <threads>     Number of OpenMP threads to run
    -s <size>        Size of H-M Benchmark to run (small, large, XL)
    -g <gridpoints>  Number of gridpoints per isotope
  Default is equivalent to: -s large
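
  A parser for this style of CLI could look roughly like the sketch
  below. The Options struct, parse_options(), the default of one
  thread, and the use of 0 to mean "no -g given" are all assumptions
  made for the example; only the option letters and the "large"
  default come from the usage text above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    int  n_threads;     /* -n (default of 1 is an assumption) */
    char size[8];       /* -s: "small", "large", or "XL" */
    long n_gridpoints;  /* -g; 0 means "use the default for the size" */
} Options;

/* Parse "-n", "-s", and "-g", each followed by a value.
   Returns 0 on success, -1 on an unknown option or a bad value. */
int parse_options(int argc, char **argv, Options *opt)
{
    assert(argv != NULL && opt != NULL);
    opt->n_threads = 1;
    strcpy(opt->size, "large");
    opt->n_gridpoints = 0;

    for (int i = 1; i < argc; i += 2) {
        if (i + 1 >= argc)
            return -1;                     /* option without a value */
        const char *val = argv[i + 1];
        if (strcmp(argv[i], "-n") == 0) {
            opt->n_threads = atoi(val);
            if (opt->n_threads < 1)
                return -1;
        } else if (strcmp(argv[i], "-s") == 0) {
            if (strcmp(val, "small") != 0 && strcmp(val, "large") != 0 &&
                strcmp(val, "XL") != 0)
                return -1;
            strcpy(opt->size, val);        /* longest value is 5 chars */
        } else if (strcmp(argv[i], "-g") == 0) {
            opt->n_gridpoints = strtol(val, NULL, 10);
            if (opt->n_gridpoints < 1)
                return -1;
        } else {
            return -1;                     /* unknown option */
        }
    }
    return 0;
}
```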

- Updated README with new CLI usage details.

- Fixed several typos in the XSBench Theory PDF.

=====================================================================
NEW IN VERSION 7
=====================================================================

- Added MPI support. Multithreaded run executes on all ranks.
  Neither the problem size nor the data is subdivided - the exact
  same problem is solved in parallel by all ranks. The only MPI
  communication is a single reduce at the end to aggregate timing
  data.

  To enable MPI mode, simply change the MPI flag in the makefile
  to "MPI = yes". Make sure mpicc is available on your system.

- Added in "XL" size option for a giant 277 GB energy grid. This
  is unlikely to fit on a single node, but is useful for
  experimentation purposes.

- Removed "BGQ mode" CLI argument option, as it wasn't being used
  by anything in the code anymore.

=====================================================================
NEW IN VERSION 6
=====================================================================

- Fixed small bug in calculate_micro_xs() function. Occasionally,
  the index returned would be the last nuclide gridpoint for that
  nuclide, causing the "high" energy point to be off the end of the
  grid (likely into the next nuclide's energy grid). Added a check
  to correct for when this occurs.

  Note that this bug did not affect performance - only made the
  calculation of XS's more "correct".
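
  The kind of check described can be sketched as follows. This is an
  illustration under assumptions, not the actual calculate_micro_xs()
  code: the function names, the separate energy and xs arrays, and
  the linear interpolation form are chosen for the example.

```c
#include <assert.h>

/* Given the index returned by a search of one nuclide's energy grid
   (n_points entries), return the "low" index of the bracketing pair
   so that low + 1 is always a valid "high" index in the same grid. */
long clamp_low_index(long idx, long n_points)
{
    assert(n_points >= 2 && idx >= 0 && idx < n_points);
    return (idx == n_points - 1) ? idx - 1 : idx;
}

/* Linearly interpolate xs at energy e between the bracketing points. */
double interpolate_xs(const double *energy, const double *xs,
                      long n_points, long idx, double e)
{
    long lo = clamp_low_index(idx, n_points);
    long hi = lo + 1;
    double f = (energy[hi] - e) / (energy[hi] - energy[lo]);
    return xs[hi] - f * (xs[hi] - xs[lo]);
}
```

  Without the clamp, an idx equal to n_points - 1 would make hi read
  one element past the end of the nuclide's grid.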

=====================================================================
NEW IN VERSION 5
=====================================================================

- Added ChangeLog.

- Moved source code files to src/ directory.

- Updated README.txt file to enhance documentation.

- Added significant documentation with regards to theory
  in the docs/XSBench_Theory.pdf file. The README.txt file is now
  more of a quick-start & users guide, whereas the XSBench_Theory.pdf
  guide covers the details and theory behind the code.

=====================================================================
