This is a short summary of the steps to be followed to access the experimental data and the compute resources available for users of the Maloja endstation.
Setup for external users:
- Contact the team to get an external PSI account (the DUO account is not enough).
- To access the data and compute resources (i.e., the Ra cluster) remotely or from a private computer there are two options:
In the following, access via NoMachine will be described:
- Create a new NX connection to host rem-acc.psi.ch and connect to it.
- Log in to host ra-nx.psi.ch.
- Open a terminal inside NoMachine.
- Run the command:
/sf/maloja/bin/ra_setup.sh pXXXXX
with pXXXXX being the number of the proposal/project (example: p12345). A folder named after the project number will appear in your home folder, containing the folder structure described below.
If you have already run the command before, you may receive the following warning:
'/sf/maloja/applications/miniconda3/envs' already in 'envs_dirs' list, moving to the top
This can be safely ignored.
- Open the web browser in NoMachine and go to the website https://jupytera.psi.ch.
- Sign in with your PSI username (for external users it typically follows the scheme ext-familyname_initial) and password.
- A couple of windows will appear in your browser. Launch the server and specify the run time, with a maximum of 24 hours. After the specified run time, you will be logged off automatically. The files are always saved automatically.
- Continue here.
Setup for internal users:
- Connect to the corp wifi.
- Go to the website https://jupytera.psi.ch.
- Sign in with your PSI username (for internal users it typically follows the scheme familyname_initial) and password.
- A couple of windows will appear in your browser. Launch the server and specify the run time, with a maximum of 24 hours. After the specified run time, you will be logged off automatically. The files are always saved automatically.
- Open a terminal within Jupyter: top-right menu -> New -> Terminal.
- Run the command:
/sf/maloja/bin/ra_setup.sh pXXXXX
with pXXXXX being the number of the proposal/project (example: p12345). A folder named after the project number will appear in your home folder, containing the folder structure described below.
If you have already run the command before, you may receive the following warning:
'/sf/maloja/applications/miniconda3/envs' already in 'envs_dirs' list, moving to the top
This can be safely ignored.
- Close the terminal tab.
- Restart Jupyter (Control Panel button at the top -> Stop My Server -> Start My Server).
- Continue here.
How to access the data
Note: To transfer the data to an external location, please check the procedure here.
Create a new notebook
To create a new notebook that uses the Maloja analysis environment, go to the top-right menu:
New -> Python [conda env:miniconda3-mana]
Project folder organization:
Inside the project folder there are four sub-folders:
- raw - contains the raw experimental data, which are read-only.
- work - contains processed data and personal folders.
- scratch - for temporary files (folder cleaned regularly).
- res - for final results.
Please create a personal folder for your data analysis code in work/analysis/yourname.
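The personal folder can also be created from a notebook. The following is a minimal sketch, assuming the project folder sits in your home folder as created by ra_setup.sh; the function name make_personal_folder is hypothetical and not part of any Maloja tool.

```python
from pathlib import Path


def make_personal_folder(project_dir, yourname):
    """Create <project_dir>/work/analysis/<yourname> if it does not exist yet.

    project_dir is the project folder (for example "~/p12345").
    Returns the path of the personal folder.
    """
    folder = Path(project_dir).expanduser() / "work" / "analysis" / yourname
    # exist_ok=True makes the call safe to repeat; parents=True creates
    # missing intermediate folders.
    folder.mkdir(parents=True, exist_ok=True)
    return folder
```

For example, make_personal_folder("~/p12345", "yourname") would return the path ~/p12345/work/analysis/yourname with the home folder expanded.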
Raw data files
The raw data is organized in four sub-folders:
- run_info
- scan_data
- scan_info
- static_data
static_data includes data acquired for N shots without changing any experimental condition. scan_data is similar to static_data, but a certain parameter is scanned, e.g., delay time, pulse energy, etc. The whole scan over the given parameter may be repeated several times ("repetitions"); the repetitions are saved under the same name with a counter appended. For each scan, a json file is stored in the folder scan_info with the same name as the corresponding scan_data.
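Because the json file in scan_info carries the name of the scan, it can be looked up directly from that name. The sketch below assumes the file is called scan name plus ".json"; this extension, and the function name load_scan_info, are assumptions for illustration. The content of the json file is returned as-is, since its fields are not described here.

```python
import json
from pathlib import Path


def load_scan_info(raw_dir, scan_name):
    """Load the json file from <raw_dir>/scan_info that belongs to scan_name."""
    # Assumption: the file is named "<scan_name>.json".
    path = Path(raw_dir) / "scan_info" / f"{scan_name}.json"
    with path.open() as f:
        return json.load(f)
```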
For automated signal/background data acquisition, a background is taken and saved automatically after each scan. The signal and background data share the same name and are distinguished by "sig"/"bkg". Backgrounds are static data.
Data structure
The data is stored in folders containing several .h5 files per run, for instance BSDATA, CAMERAS and PVCHANNELS. Specific experiments may have extra .h5 files with data from special devices such as a Jungfrau detector. Each run contains the information of a single step in a scan, or the whole data set for static measurements, and includes the information per shot. The folder run_info contains information about the runs, grouped according to the first 3 digits of the run number followed by 000 (for example, for run numbers starting with 011 the run info is in the folder 011000). This information is not meant for treating the data but for checking possible problems.
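The grouping rule for run_info can be written as a small helper. This is a sketch that assumes run numbers are zero-padded to six digits, as suggested by the example 011 -> 011000; the function name run_info_folder is hypothetical.

```python
def run_info_folder(run_number):
    """Return the name of the run_info sub-folder for a given run number.

    Assumption: run numbers are zero-padded to six digits, so run 11234
    is written as 011234 and belongs to the folder 011000.
    """
    run = f"{int(run_number):06d}"
    # Keep the first 3 digits and append 000.
    return run[:3] + "000"
```

With this assumption, run_info_folder(11234) and run_info_folder("011999") both give "011000".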
BSDATA is data recorded at the repetition rate of the FEL, typically 100 Hz. CAMERAS contains camera data, also recorded at the repetition rate of the FEL; the only difference from BSDATA is how the data is managed. PVCHANNELS contains variables that evolve slowly (not shot to shot), such as temperatures, manipulator positions, etc. So far, PVCHANNELS data is not saved properly, and a warning will appear when loading the data.
The structure of the data depends on the channel. For instance, CAMERAS data are arrays of shape (# shots, # Y pixels, # X pixels).
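To illustrate the axis order of camera data, the following sketch uses a synthetic NumPy array in place of real data (the numbers are random and the sizes are arbitrary). The first axis runs over shots, the second over Y pixels and the third over X pixels.

```python
import numpy as np

# Synthetic stand-in for camera data: 5 shots, 4 Y pixels, 6 X pixels.
rng = np.random.default_rng(0)
images = rng.random((5, 4, 6))

n_shots, n_y, n_x = images.shape

# A single shot is a 2D image of shape (n_y, n_x).
first_image = images[0]

# Averaging over the shot axis gives the mean image, shape (n_y, n_x).
mean_image = images.mean(axis=0)

# Summing over both pixel axes gives one integrated value per shot.
intensity_per_shot = images.sum(axis=(1, 2))
```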
Preprocessed data
We preprocess the CAMERAS data simultaneously with the acquisition. The raw image is always kept as ChannelName:FPICTURE. The preprocessing consists of loading the data, correcting it with the background-dark (recorded before the acquisition), and calculating 2 projections of the image within 2 regions of interest (ROIs): one for the signal and one for a background taken from a region of the data image where there is no signal (not to be confused with the background-dark, which has the full size of the image). A threshold can also be applied to filter the noise.
The results of the preprocessing are stored as BSDATA:
- ChannelName.projection_signal contains the projection of the signal within a specified ROI (per shot) over a certain axis (typically X).
- ChannelName.projection_background is similar, but within the ROI specified for the background.
- ChannelName.processing_parameters contains the pulse ID, time stamp, dimensions of the ROIs used for the data and background, the axis over which the projection was done and the threshold value (along with other extra parameters).
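The preprocessing steps can be illustrated with NumPy. This is only a sketch of the idea, not the actual preprocessing code: the function names, the ROI format (y0, y1, x0, x1), the use of a sum for the projection, and setting pixels below the threshold to zero are all assumptions made for this example.

```python
import numpy as np


def project_roi(images, roi, axis):
    """Project every image onto one axis inside roi = (y0, y1, x0, x1).

    images has shape (shots, Y, X). axis refers to a single image:
    axis=0 sums over Y pixels, axis=1 sums over X pixels (assumed meaning).
    """
    y0, y1, x0, x1 = roi
    # Axis 0 of images is the shot axis, so image axes are shifted by one.
    return images[:, y0:y1, x0:x1].sum(axis=axis + 1)


def preprocess(images, dark, roi_signal, roi_background, axis=1, threshold=None):
    """Dark-correct the images and return (signal, background) projections."""
    # Subtract the full-size background-dark image from every shot.
    corrected = images.astype(float) - dark
    if threshold is not None:
        # Assumption: pixels below the threshold are treated as noise
        # and set to zero.
        corrected = np.where(corrected < threshold, 0.0, corrected)
    signal = project_roi(corrected, roi_signal, axis)
    background = project_roi(corrected, roi_background, axis)
    return signal, background
```

Both returned arrays keep the shot axis first, so each row holds the projection of one shot.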
Working with the data
The SFDataFiles Python library (https://github.com/paulscherrerinstitute/sf_datafiles) helps to handle the SwissFEL data files. Please have a look at the detailed information given there (expert support by Sven Augustin).
Additional information and documentation for detector processing can be found in other pages of this GitLab wiki (https://gitlab.psi.ch/maloja/docs/-/wikis/home), e.g., the SPECS 150Ep processing description (https://gitlab.psi.ch/maloja/docs/-/wikis/SPECS-electron-spectrometer-processing).
Generic examples are available at https://github.com/paulscherrerinstitute/sf_datafiles. In addition, working examples showing how to check the recorded channels, load the data, and perform basic data operations can be found in the folder pXXXXX/work/analysis/GettingStarted.