1. 15 Jul, 2019 4 commits
  2. 14 Jul, 2019 5 commits
  3. 12 Jul, 2019 7 commits
  4. 11 Jul, 2019 4 commits
    • New MPL routine · 9ff6a342
      ulrich_y authored
      This is the algorithm used by GiNaC. In theory, one
      could extend it with a caching mechanism such as
      
          complex(kind=prec) :: cache(size(x),MPLMaxQ)
      
          do q=1,j
            cache(:,q) = x**q/q**m
          enddo
          do q=1,MPLMaxQ
            res = t(1)
      
            ! Fortran stores arrays in column-major order, so the
            ! contiguous slice cache(:,q) is faster than cache(q,:).
            cache(:,q+j-1) = x**(q+j-1)/(q+j-1)**m
            t(j) = t(j) + cache(j,q)
            do k=1,j-1
              t(j-k) = t(j-k) + t(j-k+1) * cache(j-k,k+q)
            enddo
      
            if (mod(q,2) .eq. 1) then
              if (abs(t(1)-res).lt.MPLdel) exit
            endif
          enddo
      
      In practice this doesn't really help because any
      time saved by the cache is lost again to the
      allocation and clearing of cache(:,:). Both
      variants currently perform similarly well. If at
      some point we need MPLs with many more arguments
      (size(x)), this might change.
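The cache fill described in the commit message above can be sketched as a standalone routine. This is a minimal sketch under assumptions, not the repository's code: the module name `mpl_cache_sketch`, the routine `fill_cache`, the choice of double precision for `prec`, and the reading of `m` as one integer weight per argument are all hypothetical. Converting `q` to real before exponentiation is an addition made here so that `q**m` is not evaluated in integer arithmetic.

```fortran
module mpl_cache_sketch
  implicit none
  ! Assumption: double precision; the original definition of prec is not shown.
  integer, parameter :: prec = selected_real_kind(15, 32)
contains
  ! Fill cache(i,q) = x(i)**q / q**m(i) for q = 1..nq.
  ! Assumption: m holds one integer weight per argument, so size(m) == size(x).
  subroutine fill_cache(x, m, nq, cache)
    complex(kind=prec), intent(in)  :: x(:)
    integer,            intent(in)  :: m(:)
    integer,            intent(in)  :: nq
    complex(kind=prec), intent(out) :: cache(size(x), nq)
    integer :: q
    do q = 1, nq
      ! cache(:,q) is one column, i.e. contiguous in column-major storage.
      ! real(q) avoids integer overflow in q**m for large q and m.
      cache(:, q) = x**q / real(q, kind=prec)**m
    enddo
  end subroutine fill_cache
end module mpl_cache_sketch
```

A caller would have to provide a `cache(size(x), nq)` array for every evaluation, which is where the allocation cost mentioned in the commit message comes from.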
    • Follow-up: missing endif · ee4b3891
      ulrich_y authored
    • Added ifort · f9b9a84c
      ulrich_y authored
    • Added cflags · 3d3d01c3
      ulrich_y authored
  5. 10 Jul, 2019 10 commits
    • Changed compiler flags · 3c052e0e
      ulrich_y authored
      I've studied the timing a bit and the message is
      quite clear:
      
       1) Use -O3
       2) Use -march=native and -mtune=native (in some
          cases it might be better to actually work
          out what the architecture is, as Kaby Lake
          (7th gen i5) is misdetected as Broadwell (5th
          gen))
       3) Do not use -ffast-math: even though it speeds
          the code up tremendously, it also produces
          very wrong results.
      
      Below is a list of G/s:
      
      Nothing           :  5907.27,  5852.97,  4255.59,  5627.56,  5886.03
      O3                :  9780.68, 11269.85, 11464.97, 10475.08, 10966.49
      O3+unroll         : 11385.20, 10785.49, 11361.03, 10225.69, 11134.86
      O3+tree vec       : 11028.18, 11232.13, 11349.96, 11257.04, 11410.13
      O3+native         :  9124.84,  8609.82,  9330.82,  9912.89,  9503.70
      O3+skylake        : 11894.75, 11966.64, 11882.61, 12000.73, 11666.96
      O3+march+mtune    : 11818.07, 11943.54, 11963.75, 11780.81, 11560.69
      O3+native+nati    : 11390.44, 11873.35, 11827.96, 11781.51, 11725.69
      O3+ffast-math     : 19014.69, 19016.52, 18849.96, 19213.97, 19067.80
      O3+un+vec+nat+nat : 11521.01, 11666.27, 11341.51, 11290.41, 11488.92
      O3+vec+nat+nat    : 10712.55, 11211.78, 11328.53, 11140.49, 11298.79
      O3+un+nat+nat     : 11702.13, 11442.44, 11680.73, 11498.26, 11677.28
      
      unroll: -funroll-loops
      tree vec: -ftree-vectorize
      skylake: -march=skylake
      native: -march=native
      march+mtune: -march=skylake -mtune=skylake
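Since item 2 above notes that `native` can be misdetected, it may help to record which options a binary was actually built with. The following is a minimal sketch, assuming a Fortran 2008 compiler; `build_info_sketch` and `print_build_info` are hypothetical names and not part of the repository. What `compiler_options()` reports is processor-dependent, so whether it shows the architecture that `-march=native` resolved to depends on the compiler.

```fortran
module build_info_sketch
  ! compiler_version and compiler_options are Fortran 2008 inquiry
  ! functions provided by the intrinsic module iso_fortran_env.
  use, intrinsic :: iso_fortran_env, only: compiler_version, compiler_options
  implicit none
contains
  ! Print the compiler and the options this unit was compiled with.
  subroutine print_build_info()
    print '(2a)', 'compiler: ', compiler_version()
    print '(2a)', 'options : ', compiler_options()
  end subroutine print_build_info
end module build_info_sketch
```

Calling `print_build_info` at the start of a timing run would keep each measured number tied to the flags that produced it.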
    • Added user interface · 1b1a9094
      ulrich_y authored
    • Made zeta array · 81f13108
      ulrich_y authored
    • Removed unused things · 48d76755
      ulrich_y authored
    • Disabled verb if RELEASE · 5a06376e
      ulrich_y authored
    • Merge branch 'ieps' · 24959b26
      ulrich_y authored
    • Removed old ieps framework · 3215f5fb
      ulrich_y authored
    • Added some muone G's · 61308a0a
      ulrich_y authored
    • Added optional GiNaC interface · 23bf1dac
      ulrich_y authored
    • global params to control accuracy · 3ec2beb9
      Luca Naterop authored
  6. 09 Jul, 2019 10 commits