Commit fbd2d4fa authored by muntwiler_m

public distro 2.1.0

parent acea809e
# Slurm script template for PMSCO calculations on the Ra cluster
# based on by V. Markushin 2016-03-01
# this version checks out the source code from a git repository
# to a temporary location and compiles the code.
# this is to minimize conflicts between different jobs
# but requires that each job has its own git commit.
# Use:
# - enter the appropriate parameters and save as a new file.
# - call the sbatch command to pass the job script.
# request a specific number of nodes and tasks.
# example:
# sbatch --nodes=2 --ntasks-per-node=24 --time=02:00:00
# the qpmsco script does all this for you.
# PMSCO arguments
# copy this template to a new file, and set the arguments
# path to be used as working directory.
# contains the script derived from this template
# and a copy of the pmsco code in the 'pmsco' directory.
# receives output and temporary files.
# python module that declares the project and starts the calculation.
# must include the file path relative to $PMSCO_WORK_DIR.
# name of output file. should not include a path.
# all paths are relative to $PMSCO_WORK_DIR or (better) absolute.
# Further arguments
# PMSCO_JOBNAME (required)
# the job name is the base name for output files.
# PMSCO_WALLTIME_HR (integer, required)
# wall time limit in hours. must be integer, minimum 1.
# this value is passed to PMSCO.
# it should specify the same amount of wall time as requested from the scheduler.
# extra arguments that are parsed by the project module.
#SBATCH --output="_PMSCO_JOBNAME.o.%j"
#SBATCH --error="_PMSCO_JOBNAME.e.%j"
module load psi-python36/4.4.0
module load gcc/4.8.5
module load openmpi/3.1.3
source activate pmsco3
echo '================================================================================'
echo "=== Running $0 at the following time and place:"
ls -lA
#the intel compiler is currently not compatible with mpi4py. -mm 170131
#echo '================================================================================'
#echo "=== Setting the environment to use Intel Cluster Studio XE 2016 Update 2 intel/16.2:"
#cmd="source /opt/psi/Programming/intel/16.2/bin/ intel64"
#echo $cmd
echo '================================================================================'
echo "=== The environment is set as follows:"
echo '================================================================================'
echo "BEGIN test"
which mpirun
cmd="mpirun /bin/hostname"
echo $cmd
echo "END test"
echo '================================================================================'
echo "BEGIN mpirun pmsco"
cd pmsco
echo "code revision"
git log --pretty=tformat:'%h %ai %d' -1
make -C pmsco all
python -m compileall pmsco
python -m compileall projects
PMSCO_CMD="python pmsco/pmsco $PMSCO_PROJECT_FILE"
if [ -n "$PMSCO_SCAN_FILES" ]; then
if [ -n "$PMSCO_OUT" ]; then
if [ "$PMSCO_WALLTIME_HR" -ge 1 ]; then
if [ -n "$PMSCO_LOGLEVEL" ]; then
# Do not use the OpenMPI-specific options, like "-x LD_LIBRARY_PATH", with the Intel mpirun.
cmd="mpirun $PMSCO_CMD $PMSCO_ARGS"
echo $cmd
echo "END mpirun pmsco"
echo '================================================================================'
rm -rf pmsco
ls -lAtr
echo '================================================================================'
exit 0
......@@ -75,10 +75,10 @@ PMSCO_OUT="_PMSCO_JOBNAME"
module load psi-python27/2.4.1
module load psi-python36/4.4.0
module load gcc/4.8.5
module load openmpi/1.10.2
source activate pmsco
module load openmpi/3.1.3
source activate pmsco3
echo '================================================================================'
echo "=== Running $0 at the following time and place:"
# submission script for PMSCO calculations on the Ra cluster
# this version clones the current git repository at HEAD to the work directory.
# thus, version conflicts between jobs are avoided.
if [ $# -lt 1 ]; then
echo ""
echo " NOSUB (optional): do not submit the script to the queue. default: submit."
echo " GIT_TAG: git tag or branch name of the code. HEAD for current code."
echo " DESTDIR: destination directory. must exist. a sub-dir \$JOBNAME is created."
echo " JOBNAME (text): name of job. use only alphanumeric characters, no spaces."
echo " NODES (integer): number of computing nodes. (1 node = 24 or 32 processors)."
echo " do not specify more than 2."
echo " TASKS_PER_NODE (integer): 1...24, or 32."
echo " 24 or 32 for full-node allocation."
echo " 1...23 for shared node allocation."
echo " WALLTIME:HOURS (integer): requested wall time."
echo " 1...24 for day partition"
echo " 24...192 for week partition"
echo " 1...192 for shared partition"
echo " PROJECT: python module (file path) that declares the project and starts the calculation."
echo " ARGS (optional): any number of further PMSCO or project arguments (except time)."
echo ""
echo "the job script is written to \$DESTDIR/\$JOBNAME which is also the destination of calculation output."
exit 1
# location of the pmsco package is derived from the path of this script
SCRIPTDIR="$(dirname $(readlink -f $0))"
SOURCEDIR="$(readlink -f $SCRIPTDIR/..)"
# read arguments
if [ "$1" == "NOSUB" ]; then
if [ "$1" == "HEAD" ]; then
BRANCH_ARG="-b $1"
shift 2
# select partition
if [ $PMSCO_WALLTIME_HR -ge 25 ]; then
if [ $PMSCO_TASKS_PER_NODE -lt 24 ]; then
PMSCO_PROJECT_FILE="$(readlink -f $1)"
# set up working directory
cd "$DEST_DIR"
if [ ! -d "$PMSCO_JOBNAME" ]; then
# copy code
git clone $BRANCH_ARG --single-branch --depth 1 $PMSCO_SOURCE_REPO pmsco || exit
cd pmsco
PMSCO_REV=$(git log --pretty=format:"%h, %ai" -1) || exit
echo "$PMSCO_REV" > revision.txt
# generate job script from template
"$SCRIPTDIR/pmsco.ra-git.template" > $PMSCO_JOBNAME.job
chmod u+x "$PMSCO_JOBNAME.job" || exit
# request nodes and tasks
# The option --ntasks-per-node is meant to be used with the --nodes option.
# (For the --ntasks option, the default is one task per node, use the --cpus-per-task option to change this default.)
# sbatch options
# --cores-per-socket=16
# 32 cores per node
# --partition=[shared|day|week]
# --time=8-00:00:00
# override default time limit (2 days in long queue)
# time formats: "minutes", "minutes:seconds", "hours:minutes:seconds", "days-hours", "days-hours:minutes", "days-hours:minutes:seconds"
# --mail-type=ALL
# --test-only
# check script but do not submit
SLURM_ARGS="--nodes=$PMSCO_NODES --ntasks-per-node=$PMSCO_TASKS_PER_NODE"
if [ $PMSCO_TASKS_PER_NODE -gt 24 ]; then
SLURM_ARGS="--cores-per-socket=16 $SLURM_ARGS"
echo $CMD
if [ "$NOSUB" != "true" ]; then
exit 0
# submission script for PMSCO calculations on the Ra cluster
# CAUTION: the job will execute the pmsco code which is present in the directory tree
# of this script _at the time of job execution_, not submission!
# before changing the code, make sure that all pending jobs have started execution,
# otherwise you will experience version conflicts.
# it's better to use the script which clones the code.
if [ $# -lt 1 ]; then
......@@ -87,9 +93,9 @@ PMSCO_WORK_DIR="$WORKDIR"
# provide revision information, requires git repository
PMSCO_REV=$(git log --pretty=format:"Data revision %h, %ai" -1)
PMSCO_REV=$(git log --pretty=format:"%h, %ai" -1)
if [ $? -ne 0 ]; then
PMSCO_REV="Data revision unknown, "$(date +"%F %T %z")
PMSCO_REV="revision unknown, "$(date +"%F %T %z")
echo "$PMSCO_REV" > revision.txt
......@@ -86,9 +86,9 @@ PHD_WORK_DIR="$WORKDIR"
# provide revision information, requires git repository
PHD_REV=$(git log --pretty=format:"Data revision %h, %ad" --date=iso -1)
PHD_REV=$(git log --pretty=format:"%h, %ad" --date=iso -1)
if [ $? -ne 0 ]; then
PHD_REV="Data revision unknown, "$(date +"%F %T %z")
PHD_REV="revision unknown, "$(date +"%F %T %z")
echo "$PHD_REV" > revision.txt
......@@ -763,6 +763,7 @@ src/introduction.dox \
src/concepts.dox \
src/concepts-tasks.dox \
src/concepts-emitter.dox \
src/concepts-atomscat.dox \
src/installation.dox \
src/execution.dox \
src/commandline.dox \
......@@ -21,9 +21,6 @@ Do not include the extension <code>.py</code> or a trailing slash.
@c path/to/ should be the path and name to your project module.
Common args and project args are described below.
Note: In contrast to earlier versions, the project module is not executed directly any more.
Rather, it is loaded by the main pmsco module as a 'plug-in'.
\subsection sec_common_args Common Arguments
......@@ -43,15 +40,14 @@ The following table is ordered by importance.
| --log-level | DEBUG, INFO, WARNING (default), ERROR, CRITICAL | Minimum level of messages that should be added to the log. |
| --log-file | file system path | Name of the main log file. Under MPI, the rank of the process is inserted before the extension. Default: output-file + log, or pmsco.log. |
| --log-disable | | Disable logging. By default, logging is on. |
| --pop-size | integer | Population size (number of particles) in swarm optimization mode. The default value is the greater of 4 or two times the number of calculation processes. |
| --pop-size | integer | Population size (number of particles) in swarm and genetic optimization mode. The default value is the greater of 4 or the number of parallel calculation processes. |
| --seed-file | file system path | Name of the population seed file. Population data of previous optimizations can be used to seed a new optimization. The file must have the same structure as the .pop or .dat files. See @ref pmsco.project.Project.seed_file. |
| --table-file | file system path | Name of the model table file in table scan mode. |
| -c, --code | edac (default) | Scattering code. At the moment, only edac is supported. |
\subsubsection sec_file_categories File Categories
The following category names can be used with the @c --keep-files option.
The following category names can be used with the `--keep-files` option.
Multiple names can be specified and must be separated by spaces.
| Category | Description | Default Action |
......@@ -59,7 +55,7 @@ Multiple names can be specified and must be separated by spaces.
| all | shortcut to include all categories | |
| input | raw input files for calculator, including cluster and phase files in custom format | delete |
| output | raw output files from calculator | delete |
| phase | phase files in portable format for report | delete |
| atomic | atomic scattering and emission files in portable format | delete |
| cluster | cluster files in portable XYZ format for report | keep |
| debug | debug files | delete |
| model | output files in ETPAI format: complete simulation (a_-1_-1_-1_-1) | keep |
......@@ -67,9 +63,20 @@ Multiple names can be specified and must be separated by spaces.
| symmetry | output files in ETPAI format: symmetry (a_b_c_-1_-1) | delete |
| emitter | output files in ETPAI format: emitter (a_b_c_d_-1) | delete |
| region | output files in ETPAI format: region (a_b_c_d_e) | delete |
| report| final report of results | keep |
| report| final report of results | keep always |
| population | final state of particle population | keep |
| rfac | files related to models which give bad r-factors | delete |
| rfac | files related to models which give bad r-factors, see warning below | delete |
The `report` category is always kept and cannot be turned off.
The `model` category is always kept in single calculation mode.
If you want to specify `rfac` with the `--keep-files` option,
you have to add the file categories that you want to keep, e.g.,
`--keep-files rfac cluster model scan population`
(to return the default categories for all calculated models).
Do not specify `rfac` alone as this will effectively not return any file.
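The rule above can be sketched as a small helper that always adds the default categories next to any extra ones. This is only an illustration: `keep_files_args` and `DEFAULT_KEPT` are hypothetical names that are not part of PMSCO, and the list of default categories is taken from the example command above.

```python
# Hypothetical helper, not part of PMSCO.
# The default categories are copied from the example in the text above.
DEFAULT_KEPT = ["cluster", "model", "scan", "population"]


def keep_files_args(extra=()):
    """Build the --keep-files arguments so that extra categories such as
    'rfac' never appear without the default categories."""
    categories = list(extra)
    for name in DEFAULT_KEPT:
        if name not in categories:
            categories.append(name)
    return ["--keep-files"] + categories


args = keep_files_args(["rfac"])
print(" ".join(args))
```

The resulting list can be appended to the other PMSCO arguments when a command line is assembled in a script.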
\subsection sec_project_args Project Arguments
......@@ -125,4 +132,4 @@ The job script is written to @c $DESTDIR/$JOBNAME which is also the destination
| MODE | single, swarm, grid, genetic | PMSCO operation mode. This value is passed on to PMSCO as the @c --mode argument. |
| ARGS (optional) | | Any further arguments are passed on verbatim to PMSCO. You don't need to specify the mode and time limit here. |
\ No newline at end of file
/*! @page pag_concepts_atomscat Atomic scattering
\section sec_atomscat Atomic scattering
\subsection sec_atomscat_intro Introduction
The process of calculating atomic scattering factors (phase shifts) can be customized in several ways.
1. Internal processing.
Some multiple scattering programs, like EDAC, contain a built-in facility to calculate phase shifts.
This is the simplest option and the default behaviour.
2. Automatic calculation in a separate program.
PMSCO has an interface to run the PHAGEN program from
the MsSpec-1.0 package to calculate scattering factors.
Note that the PHAGEN code is not included in the public distribution of PMSCO.
3. Manual calculation.
Scattering files created manually using an external program can be used by providing the file names.
The files must have the format required by the multiple scattering code,
and they must be linked to the corresponding atoms of the cluster.
In the case of automatic calculation, the project code can optionally hook into the process
and modify clusters before and after scattering factors are calculated.
For instance, it may provide an extended cluster in order to reduce boundary effects,
or it may modify the assignment of scattering files to cluster atoms
so that the scattering factors of selected atom classes are used
(cf. section \ref sec_atomscat_atomclass).
\subsection sec_atomscat_usage Usage
\subsubsection sec_atomscat_internal Internal processing
This is the default behaviour selected in the inherited pmsco.project.Project class.
Make sure not to override the `atomic_scattering_factory` attribute.
Its default value is pmsco.calculators.calculator.InternalAtomicCalculator.
\subsubsection sec_atomscat_external Automatic calculation in a separate program
To select the atomic scattering calculator,
assign its interface class to the project's `atomic_scattering_factory` attribute.
For example, to use PHAGEN, add the following code to your project's `__init__` constructor:
from pmsco.calculators.phagen import PhagenCalculator
self.atomic_scattering_factory = PhagenCalculator
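To show where this assignment sits in a project class, here is a minimal sketch. So that it runs without PMSCO installed, the classes `AtomicCalculator`, `InternalAtomicCalculator`, `PhagenCalculator` and `Project` are local stand-ins for the PMSCO classes of the same name, and `MyProject` is a hypothetical project class. In real project code the PMSCO classes would be imported instead.

```python
# Stand-ins for the PMSCO classes so that the sketch is self-contained.
class AtomicCalculator:
    """Stand-in for pmsco.calculators.calculator.AtomicCalculator."""


class InternalAtomicCalculator(AtomicCalculator):
    """Stand-in for the default calculator (internal processing)."""


class PhagenCalculator(AtomicCalculator):
    """Stand-in for pmsco.calculators.phagen.PhagenCalculator."""


class Project:
    """Stand-in for pmsco.project.Project with the documented default."""

    def __init__(self):
        self.atomic_scattering_factory = InternalAtomicCalculator


class MyProject(Project):
    """Hypothetical project that selects PHAGEN."""

    def __init__(self):
        super().__init__()
        # assign the interface class itself, not an instance of it
        self.atomic_scattering_factory = PhagenCalculator


project = MyProject()
print(project.atomic_scattering_factory.__name__)
```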
\subsubsection sec_atomscat_manual Manual calculation
If you want to keep the scattering factors constant during an optimization,
you should run PMSCO in _single_ mode and provide the model parameters and cluster
that will return the desired scattering files.
In the `create_params` method of your project,
you should then set the `phase_files` attribute,
which is a dictionary that maps atom classes to the names of the scattering files.
Unless you set specific values in the cluster object, the atom class defaults to the element number.
The file names should include a path relative to the working directory.
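A minimal sketch of such a `create_params` method follows. Several details are assumptions made for illustration: the `Params` class is a stand-in for the scattering parameters object, the method signature `(model, index)` is assumed, and the file names are hypothetical. The keys 29 and 8 are the element numbers of copper and oxygen, used here as default atom classes.

```python
class Params:
    """Stand-in for the scattering parameters object (assumption)."""

    def __init__(self):
        self.phase_files = {}


class MyProject:
    """Hypothetical project; only the relevant method is shown."""

    def create_params(self, model, index):
        # the signature (model, index) is an assumption of this sketch
        params = Params()
        # keys: atom classes (by default the element number)
        # values: scattering file names relative to the working directory
        params.phase_files = {
            29: "phases/cu.pha",  # hypothetical file name for Cu
            8: "phases/o.pha",    # hypothetical file name for O
        }
        return params


params = MyProject().create_params(model={}, index=None)
print(sorted(params.phase_files))
```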
\subsection sec_atomscat_implement Implementation
\subsubsection sec_atomscat_atomclass Atom classes
Atomic scattering programs classify atoms based on chemical element, charge state and symmetry of the local environment.
This means that two atoms of the same chemical element may have different scattering factors.
For example, if you have EDAC output the cluster after calculation of the muffin tin potential,
you will find that the chemical element number has been replaced by an arbitrary integer.
By default, PMSCO will do the linking of atom classes and scattering files transparently.
However, if you want to reduce the number of atom classes,
or if you have the scattering factors calculated on a reference cluster,
you will have to provide project code to do the assignment.
This is described further below.
\subsubsection sec_atomscat_calculator Atomic scattering calculator
The project selects the atomic scattering calculation mode by specifying its `atomic_scattering_factory` attribute.
This is the name of a class that inherits from @ref pmsco.calculators.calculator.AtomicCalculator.
The following calculators are currently implemented:
| Class | Description |
| --- | --- |
| pmsco.calculators.calculator.InternalAtomicCalculator | Calculate the atomic scattering factors in the multiple-scattering program. |
| pmsco.calculators.phagen.PhagenCalculator | Calculate the atomic scattering factors in the PHAGEN program. |
An atomic calculator class essentially defines a `run` method that operates on a cluster and scattering parameters object.
It generates the necessary scattering files, updates the cluster with the new atom classes
and updates the parameters with the file names of the scattering files.
Note that the scattering files have to be in the correct format for the multiple scattering calculator.
\subsubsection sec_atomscat_hooks Project hooks
Before and after calculation of the scattering factors,
the project's `before_atomic_scattering` and `after_atomic_scattering` methods are called
with the cluster and input parameters.
The _before_ method provides the cluster to be used for atomic scattering calculations.
It may:
1. just return the original cluster,
2. modify the provided cluster to include additional atoms or modify the charge state of the emitter,
3. create a completely different cluster,
4. return None to suppress the atomic scattering calculation.
The method is called once at the beginning of the PMSCO job with model -1,
where it may return the global reference cluster.
Later on it is called once for each calculation task with the specific task index.
Similarly, the _after_ method collects the results and updates the `phase_files` dictionary of the input parameters.
It is free to consolidate atom classes and remove unwanted atoms.
However, it must make sure that for each atom class in the cluster,
there is a corresponding link to a scattering file.
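The two hooks can be sketched as follows. This is an illustration under stated assumptions, not the PMSCO implementation: `Cluster` and `Params` are simplified stand-ins, the attribute `atom_classes` and the argument order `(task_index, params, cluster)` are hypothetical, and the file names are invented. The _before_ hook uses the first option from the list above, and the _after_ hook only checks that every atom class has a scattering file.

```python
class Cluster:
    """Stand-in cluster that stores one atom class per atom (assumption)."""

    def __init__(self, atom_classes):
        self.atom_classes = list(atom_classes)


class Params:
    """Stand-in for the input parameters object (assumption)."""

    def __init__(self, phase_files=None):
        self.phase_files = dict(phase_files or {})


class MyProject:
    """Hypothetical project; the hook signatures are assumptions."""

    def before_atomic_scattering(self, task_index, params, cluster):
        # option 1: return the original cluster unchanged
        return cluster

    def after_atomic_scattering(self, task_index, params, cluster):
        # every atom class in the cluster needs a link to a scattering file
        missing = sorted(set(cluster.atom_classes) - set(params.phase_files))
        if missing:
            raise ValueError("no scattering file for atom classes: {}".format(missing))


project = MyProject()
cluster = Cluster([29, 29, 8])
params = Params({29: "phases/cu.pha", 8: "phases/o.pha"})
used = project.before_atomic_scattering(-1, params, cluster)
project.after_atomic_scattering(-1, params, used)
```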
\ No newline at end of file
......@@ -39,8 +39,8 @@ The code depends on the following libraries:
- Python 2.7 or 3.6
- Numpy >= 1.11
- Python packages from PyPI listed in the requirements.txt file
- Numpy >= 1.13
- Python packages listed in the requirements.txt file
Most of these requirements are available from the Linux distribution.
For an easily maintainable Python environment, [Miniconda]( is recommended.
......@@ -50,11 +50,11 @@ and it's difficult to switch between different Python versions.
On the PSI cluster machines, the environment must be set using the module system and conda (on Ra).
Details are explained in the PEARL Wiki.
PMSCO runs under Python 2.7 or Python 3.6 or higher.
Since Python 2 is being deprecated, the code has been ported to Python 3.6.
Compatibility with Python 2.7 is maintained by using
the [future package](
New code should be written according to their guidelines.
PMSCO runs under Python 2.7 or Python 3.6.
Since Python 2 is being deprecated, Python 3.6 is recommended.
Compatibility with Python 2.7 is currently maintained by using
the [future package](
but may be dropped at any time.
\subsection sec_install_instructions Instructions
......@@ -86,7 +86,6 @@ nano \
openmpi-bin \
openmpi-common \
sqlite3 \
swig \
......@@ -102,11 +101,11 @@ Install Miniconda according to their [instructions](
then configure the Python environment:
conda create -q --yes -n pmsco python=2.7
conda create -q --yes -n pmsco python=3.6
source activate pmsco
conda install -q --yes -n pmsco \
pip \
numpy \
"numpy>=1.13" \
scipy \
ipython \
mpi4py \
......@@ -114,7 +113,9 @@ conda install -q --yes -n pmsco \
nose \
mock \
future \
statsmodels \
swig \
pip install periodictable attrdict fasteners
......@@ -9,13 +9,15 @@ The actual scattering calculation is done by code developed by other parties.
While the scattering program typically calculates a diffraction pattern based on a set of static parameters and a specific coordinate file in a single process,
PMSCO wraps around that program to facilitate parameter handling, cluster building, structural optimization and parallel processing.
In the current version, the [EDAC]( code
developed by F. J. García de Abajo, M. A. Van Hove, and C. S. Fadley (1999) is used for scattering calculations.
Other code can be integrated as well.
Initially, support for the MSC program by Kaduwela, Friedman, and Fadley was planned but is currently not maintained.
PMSCO is written in Python 2.7.
EDAC is written in C++, MSC in Fortran.
PMSCO interacts with the calculation programs through Python wrappers for C++ or Fortran.
In the current version, PMSCO can make use of the following programs.
Other programs may be integrated as well.
- [EDAC](
by F. J. García de Abajo, M. A. Van Hove, and C. S. Fadley,
[Phys. Rev. B 63 (2001) 075404](
- PHAGEN from the [MsSpec package](
by C. R. Natoli and D. Sébilleau,
[Comp. Phys. Comm. 182 (2011) 2567](
\section sec_intro_highlights Highlights
......@@ -63,11 +65,11 @@ An open distribution of PMSCO is available under the [Apache License, Version 2.
- Please acknowledge the use of the code.
- Please share your development of the code with the original author.
Due to different copyright, the MSC and EDAC programs are not contained in the public software repository.
Due to different copyright terms, the third-party calculation programs are not contained in the public software repository.
These programs may not be used without an explicit agreement by the respective original authors.
\author Matthias Muntwiler, <>
\version This documentation is compiled from version $(REVISION).
\copyright 2015-2018 by [Paul Scherrer Institut](
\copyright 2015-2019 by [Paul Scherrer Institut](
\copyright Licensed under the [Apache License, Version 2.0](
......@@ -30,6 +30,55 @@ The domain parameters have the following meanings:
| step | Not used. |
\subsubsection sec_opt_seed Seeding a population
By default, one particle is initialized with the start value declared in the parameter domain,
and the others are set to random values within the domain.
You may initialize more particles of the population with specific values by providing a seed file.
The seed file must have a format similar to that of the result `.dat` files,
with a header line specifying the column names and data rows containing the values for each particle.
A good practice is to use a previous `.dat` file and remove unwanted rows.
To continue an interrupted optimization,
the `.dat` file from the previous optimization can be used as is.
The seeding procedure can be tweaked by several optimizer parameters (see above).
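Removing unwanted rows from a previous `.dat` file can be scripted. The following sketch keeps the header line and the rows with the lowest R-factor. It rests on assumptions about the file layout that should be checked against an actual result file: columns are separated by whitespace, the header line may start with `#`, the R-factor is stored in a column named `_rfac`, and lower values are better. The function name `select_seed_rows` is hypothetical.

```python
def select_seed_rows(dat_lines, n_best, rfac_column="_rfac"):
    """Return the header line plus the n_best data rows with the lowest R-factor.

    Assumes whitespace-separated columns and a header line that may start with '#'.
    """
    header = dat_lines[0]
    names = header.lstrip("#").split()
    col = names.index(rfac_column)
    rows = [line for line in dat_lines[1:] if line.strip()]
    rows.sort(key=lambda line: float(line.split()[col]))
    return [header] + rows[:n_best]


# invented example data with two model parameters A and B
example = [
    "# A B _rfac",
    "1.0 2.0 0.50",
    "1.5 2.5 0.20",
    "0.5 1.0 0.80",
]
seed = select_seed_rows(example, 2)
print("\n".join(seed))
```

The selected lines can then be written to a new file, whose name is passed to PMSCO with the `--seed-file` option.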