src issues
https://gitlab.psi.ch/OPAL/src/-/issues
2018-01-04T18:14:26+01:00

https://gitlab.psi.ch/OPAL/src/-/issues/200
Optimiser throws unexpected Exceptions and later dies
2018-01-04T18:14:26+01:00
adelmann

After 3ed381f8 I see more Exceptions (merlinl01:/gpfs/home/adelmann/scratch/awa-optim/code/optLinac_40nC.o78706)
I am now a bit more confused:
- the *.data file should only be read once (but we have over 600 Exceptions)
- we again have directories that seem to exist
The real problem is reported in merlinl01:/gpfs/home/adelmann/scratch/awa-optim-0/code/optLinac_40nC.o78651
**terminate called after throwing an instance of 'OpalException'**
OPAL 2.0.0
kraus
adelmann
Yves Ineichen
kraus

https://gitlab.psi.ch/OPAL/src/-/issues/174
optimiser run hasResultsAvailable()
2019-04-06T10:08:12+02:00
adelmann

It seems that hasResultsAvailable() is sometimes true after I removed
the pid from the hash string. Removing it was necessary when more than
one $CORE is used for a worker.
I probably need to add this back, not with the pid but with an
id that represents the "worker with more than one core".
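The idea of replacing the pid with a per-worker id can be illustrated with a short sketch. It is not OPAL code: the function names `worker_group_id` and `simulation_key`, the rule `rank // cores_per_worker`, and the use of SHA-1 are all assumptions made for illustration. The point is only that every core of one worker derives the same key, while two different workers do not.

```python
import hashlib


def worker_group_id(rank: int, cores_per_worker: int) -> int:
    """Hypothetical mapping: consecutive ranks form one multi-core worker."""
    if cores_per_worker < 1:
        raise ValueError("cores_per_worker must be >= 1")
    return rank // cores_per_worker


def simulation_key(design_vars: dict, group_id: int) -> str:
    """Hash the sorted design variables together with the worker-group id.

    Unlike a pid, the group id is identical on every core of one worker,
    so all of them agree on the key; different workers get different keys.
    """
    parts = [f"{name}={value!r}" for name, value in sorted(design_vars.items())]
    parts.append(f"group={group_id}")
    return hashlib.sha1(";".join(parts).encode("utf-8")).hexdigest()
```

With four cores per worker, ranks 0 to 3 share group id 0 and therefore the same key for the same design variables, whereas rank 4 (group id 1) gets a different key.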
@ineichen can you point me to that structure.
OPAL 2.0.0
adelmann
Yves Ineichen
adelmann
2017-10-28

https://gitlab.psi.ch/OPAL/src/-/issues/93
SAAMG-Test-1.in PARALLEL
2017-08-09T21:28:33+02:00
adelmann
The test is from git@gitlab.psi.ch:OPAL/regression-tests.git,
checked out with `git checkout OPAL-1.6`.
The parallel run fails; the serial run is OK.
```
mpirun -np 4 opal SAAMG-Test-1.in
* Node:0, Filling RHS...
* Node:1, Filling RHS...
* Node:1, Rho for final element: 0.0000000000000000e+00
* Node:2, Filling RHS...
* Node:2, Rho for final element: 0.0000000000000000e+00
* Node:2, Local nx*ny*nz = 1575
* Node:2, Number of reserved local elements in RHS: 832
* Node:2, Number of reserved global elements in RHS: 3328
* Node:3, Filling RHS...
* Node:3, Rho for final element: 0.0000000000000000e+00
* Node:3, Local nx*ny*nz = 3375
* Node:3, Number of reserved local elements in RHS: 832
* Node:3, Number of reserved global elements in RHS: 3328
* Node:0, Rho for final element: 0.0000000000000000e+00
* Node:0, Local nx*ny*nz = 735
* Node:0, Number of reserved local elements in RHS: 832
* Node:0, Number of reserved global elements in RHS: 3328
* Node:1, Local nx*ny*nz = 1575
* Node:1, Number of reserved local elements in RHS: 832
* Node:1, Number of reserved global elements in RHS: 3328
* Node:2, Number of Local Inside Points 832
* Node:0, Number of Local Inside Points 832
* Node:3, Number of Local Inside Points 832
* Node:3, Done.
* Node:0, Done.
* Node:1, Number of Local Inside Points 832
* Node:1, Done.
* Node:2, Done.
[fast-dude:02195] *** Process received signal ***
[fast-dude:02195] Signal: Segmentation fault: 11 (11)
[fast-dude:02195] Signal code: Address not mapped (1)
[fast-dude:02195] Failing at address: 0x7fe2336ae600
[fast-dude:02195] *** End of error message ***
--------------------------------------------------------------------------
mpirun noticed that process rank 3 with PID 2195 on node fast-dude exited on signal 11 (Segmentation fault: 11).
--------------------------------------------------------------------------
```
OPAL 2.0.0
Yves Ineichen
Yves Ineichen
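Because the four ranks print interleaved, the per-node values in the log above are hard to compare by eye. The following is a minimal sketch of a helper that groups them by node. It is not part of OPAL; the name `group_by_node` and the list of keys are assumptions derived only from the `* Node:N, ...` lines shown in the log.

```python
import re
from collections import defaultdict

# Integer-valued quantities printed per node in the log above.
KEYS = (
    "Local nx*ny*nz",
    "Number of reserved local elements in RHS",
    "Number of reserved global elements in RHS",
    "Number of Local Inside Points",
)

NODE_RE = re.compile(r"^\* Node:(\d+), (.*)$")


def group_by_node(lines):
    """Return {node: {key: value}} for the known integer-valued log lines."""
    table = defaultdict(dict)
    for line in lines:
        match = NODE_RE.match(line.strip())
        if match is None:
            continue  # not a per-node line (e.g. the signal report)
        node, rest = int(match.group(1)), match.group(2)
        for key in KEYS:
            if rest.startswith(key):
                # The separator is ": ", " = " or a single space.
                table[node][key] = int(rest[len(key):].strip(" :="))
                break
    return dict(table)
```

Applied to the output above, this shows that every node reports 832 reserved local elements and 832 local inside points, while `Local nx*ny*nz` is 735, 1575, 1575 and 3375 on nodes 0 to 3. Whether that imbalance is related to the segmentation fault on rank 3 is not established here.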