ADIOS2: Python/Fortran bindings -- bug likely in write with different dimensionality / time stepping

I’m having some issues writing and reading N-dimensional arrays in Python, possibly over multiple time steps. Here’s a test script I made to investigate. I write a very simple piece of data, then check whether I get the same hash back on the read end. Sometimes I do, sometimes I don’t. The example below writes arrays of up to 6 dimensions over 10 time steps. The bpls output looks more or less okay.

#!/usr/bin/env python
import sys

import numpy as np
from mpi4py import MPI
import adios2

if __name__ == "__main__":
    comm = MPI.COMM_WORLD

    ndim = int(sys.argv[1])
    ntime = int(sys.argv[2])

    size = comm.Get_size()
    rank = comm.Get_rank()
    globalsizes = []

    hashes = []
    zeros = []

    fw = adios2.open('ND.bp', "w", comm)

    for time in range(ntime):
        h = []
        z = []
        for i in range(ndim):
            ind = i + 1
            shape = np.array([ind]*ind)
            data = (rank+1)*(np.reshape(1+np.arange(np.power(ind, ind), dtype=np.float64), tuple(shape)))

            # np.int was removed from recent NumPy releases; use an explicit integer type
            start = np.zeros(ind, dtype=np.int64)
            start[0] = rank * shape[0]
            globalsize = np.copy(shape)
            globalsize[0] *= size
            globalsizes.append(globalsize)

            fw.write("arr{0}".format(ind), data, list(globalsize), list(start), list(shape), endl=True)

            newdata = np.empty(globalsize, dtype=np.float64)
            comm.Gather(data, newdata, root=0)

            if rank == 0:
                # Hash the raw bytes: hash() on a float64 memoryview fails on Python 3
                h.append(hash(newdata.tobytes()))
                z.append(np.sum(newdata <= 0))
        hashes.append(h)
        zeros.append(z)
    fw.close()


    fr = adios2.open("ND.bp", "r", comm)
    for time in range(ntime):
        if rank == 0:
            for i in range(ndim):
                ind = i + 1
                data = fr.read("arr{0}".format(ind), np.zeros(ind, dtype=np.int64), globalsizes[i], endl=True)

                # Same byte-level hash as on the write side
                h = hash(data.tobytes())
                if hashes[time][i] == h:
                    print("time: {1}, dimensionality: {0} OK".format(ind, time))
                else:
                    print("time: {1}, dimensionality: {0} ERROR -- unequal hashes {3} != {4} , {2} obvously bad values returned".format(ind, time, np.sum(data <= 0), hashes[time][i], h))

    fr.close()
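
For reference, the decomposition the script uses can be sketched without MPI or ADIOS2: every rank owns a slab of `ind` rows along the first dimension, and the global array is those slabs stacked in rank order. The helper names below (`block_for_rank`, `expected_global`) are hypothetical and only mirror the arithmetic in the script; this is a sketch of what a correct read should return, not part of the original test.

```python
import numpy as np


def block_for_rank(ind, rank, nranks):
    """Return (data, global_shape, start, count) as the script computes them."""
    count = [ind] * ind
    # Values 1..ind**ind, scaled by (rank + 1), so every written value is > 0.
    data = (rank + 1) * np.reshape(
        1 + np.arange(ind ** ind, dtype=np.float64), count)
    start = [0] * ind
    start[0] = rank * ind          # offset of this rank's slab along axis 0
    global_shape = list(count)
    global_shape[0] *= nranks      # only the first dimension is decomposed
    return data, global_shape, start, count


def expected_global(ind, nranks):
    """Stack the per-rank slabs along axis 0, in rank order."""
    blocks = [block_for_rank(ind, r, nranks)[0] for r in range(nranks)]
    return np.concatenate(blocks, axis=0)


if __name__ == "__main__":
    data, gshape, start, count = block_for_rank(ind=3, rank=2, nranks=4)
    print(gshape, start, count)          # [12, 3, 3] [6, 0, 0] [3, 3, 3]
    print(expected_global(3, 4).shape)   # (12, 3, 3)
```

Because every written value is at least 1, any value less than or equal to 0 in the read-back array can only come from the I/O path, which is what the "bad values" count in the script relies on.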

Running on 4 ranks with up to 6 dimensions and 10 time steps:

mpirun -host localhost -np 4 ./test-nd.py 6 10
time: 0, dimensionality: 1 OK
time: 0, dimensionality: 2 OK
time: 0, dimensionality: 3 OK
time: 0, dimensionality: 4 ERROR -- unequal hashes -5817521839159847613 != 4710672338217278104 , 648 obvously bad values returned
time: 0, dimensionality: 5 OK
time: 0, dimensionality: 6 OK
time: 1, dimensionality: 1 OK
time: 1, dimensionality: 2 OK
time: 1, dimensionality: 3 OK
time: 1, dimensionality: 4 ERROR -- unequal hashes -5817521839159847613 != 1644371425632087350 , 104 obvously bad values returned
time: 1, dimensionality: 5 ERROR -- unequal hashes 7537877987181268272 != 411882050336115912 , 10576 obvously bad values returned
time: 1, dimensionality: 6 ERROR -- unequal hashes -4189560973479895787 != -8040249360400408643 , 173120 obvously bad values returned
time: 2, dimensionality: 1 OK
time: 2, dimensionality: 2 OK
time: 2, dimensionality: 3 OK
time: 2, dimensionality: 4 OK
time: 2, dimensionality: 5 ERROR -- unequal hashes 7537877987181268272 != 6921899629403144131 , 10068 obvously bad values returned
time: 2, dimensionality: 6 ERROR -- unequal hashes -4189560973479895787 != 1150135602477787782 , 172612 obvously bad values returned
time: 3, dimensionality: 1 OK
time: 3, dimensionality: 2 OK
time: 3, dimensionality: 3 OK
time: 3, dimensionality: 4 OK
time: 3, dimensionality: 5 ERROR -- unequal hashes 7537877987181268272 != 8606326729852778618 , 9560 obvously bad values returned
time: 3, dimensionality: 6 ERROR -- unequal hashes -4189560973479895787 != 5384636986928982883 , 172108 obvously bad values returned
time: 4, dimensionality: 1 OK
time: 4, dimensionality: 2 OK
time: 4, dimensionality: 3 OK
time: 4, dimensionality: 4 OK
time: 4, dimensionality: 5 ERROR -- unequal hashes 7537877987181268272 != -1677615237727256141 , 9052 obvously bad values returned
time: 4, dimensionality: 6 ERROR -- unequal hashes -4189560973479895787 != -163812680856230090 , 171600 obvously bad values returned
time: 5, dimensionality: 1 OK
time: 5, dimensionality: 2 OK
time: 5, dimensionality: 3 OK
time: 5, dimensionality: 4 OK
time: 5, dimensionality: 5 ERROR -- unequal hashes 7537877987181268272 != -1583164979516588521 , 8548 obvously bad values returned
time: 5, dimensionality: 6 ERROR -- unequal hashes -4189560973479895787 != -5321989567651457599 , 171092 obvously bad values returned
time: 6, dimensionality: 1 OK
time: 6, dimensionality: 2 OK
time: 6, dimensionality: 3 OK
time: 6, dimensionality: 4 OK
time: 6, dimensionality: 5 ERROR -- unequal hashes 7537877987181268272 != 2250494806201746315 , 8040 obvously bad values returned
time: 6, dimensionality: 6 ERROR -- unequal hashes -4189560973479895787 != -7413598994453349213 , 170584 obvously bad values returned
time: 7, dimensionality: 1 OK
time: 7, dimensionality: 2 OK
time: 7, dimensionality: 3 OK
time: 7, dimensionality: 4 OK
time: 7, dimensionality: 5 ERROR -- unequal hashes 7537877987181268272 != 6178203745650766318 , 7532 obvously bad values returned
time: 7, dimensionality: 6 ERROR -- unequal hashes -4189560973479895787 != -3610488979245109061 , 170079 obvously bad values returned
time: 8, dimensionality: 1 OK
time: 8, dimensionality: 2 OK
time: 8, dimensionality: 3 OK
time: 8, dimensionality: 4 OK
time: 8, dimensionality: 5 ERROR -- unequal hashes 7537877987181268272 != -8130837426127217004 , 7024 obvously bad values returned
time: 8, dimensionality: 6 ERROR -- unequal hashes -4189560973479895787 != 1935007435453733543 , 169572 obvously bad values returned
time: 9, dimensionality: 1 OK
time: 9, dimensionality: 2 OK
time: 9, dimensionality: 3 OK
time: 9, dimensionality: 4 OK
time: 9, dimensionality: 5 ERROR -- unequal hashes 7537877987181268272 != -2487306817042493317 , 6520 obvously bad values returned
time: 9, dimensionality: 6 ERROR -- unequal hashes -4189560973479895787 != 6528630630600386117 , 169064 obvously bad values returned
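
A hash comparison only says that the arrays differ, not where. A minimal sketch of a more informative check, using plain NumPy, is below; `describe_mismatch` is a hypothetical helper, not something from ADIOS2 or the original script, and it assumes both arrays fit in memory on rank 0.

```python
import numpy as np


def describe_mismatch(expected, actual):
    """Summarize how `actual` differs from `expected` (element-wise)."""
    expected = np.asarray(expected)
    actual = np.asarray(actual)
    if expected.shape != actual.shape:
        return {"equal": False, "n_bad": None, "first_bad": None,
                "reason": "shape {0} != {1}".format(expected.shape, actual.shape)}
    bad = expected != actual
    n_bad = int(bad.sum())
    if n_bad == 0:
        return {"equal": True, "n_bad": 0, "first_bad": None, "reason": ""}
    # Index of the first differing element in C (row-major) order.
    first_bad = tuple(int(i) for i in np.argwhere(bad)[0])
    return {"equal": False, "n_bad": n_bad, "first_bad": first_bad,
            "reason": "values differ"}


if __name__ == "__main__":
    expected = 1.0 + np.arange(24, dtype=np.float64).reshape(2, 3, 4)
    actual = expected.copy()
    actual[1, 2, :] = 0.0            # simulate a row that came back zeroed
    print(describe_mismatch(expected, actual))
```

Knowing the first bad index and the number of bad elements would show, for instance, whether the corruption starts at a rank boundary along the first dimension.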

The bpls listing for the file:

bpls2 ND.bp --long
  double  arr1  10*{4}  1 / 4
  double  arr2  10*{8, 2}  1 / 16
  double  arr3  10*{12, 3, 3}  1 / 108
  double  arr4  10*{16, 4, 4, 4}  1 / 1024
  double  arr5  10*{20, 5, 5, 5, 5}  1 / 12500
  double  arr6  10*{24, 6, 6, 6, 6, 6}  1 / 186624
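
The listing above can be checked against the script by hand. Assuming the last column is the min / max that bpls reports in long mode, the smallest value written is 1 (rank 0, first element) and the largest is `nranks * ind**ind` (last rank, last element). The sketch below recomputes each row for 4 ranks; `expected_bpls_row` is a hypothetical name used only for this illustration.

```python
def expected_bpls_row(ind, nranks):
    """Return (global_shape, min, max) for arr<ind> as written by the script."""
    n = ind ** ind                                   # elements per rank
    global_shape = [ind * nranks] + [ind] * (ind - 1)
    return global_shape, 1.0, float(nranks * n)


if __name__ == "__main__":
    for ind in range(1, 7):
        shape, lo, hi = expected_bpls_row(ind, nranks=4)
        print("arr{0}  {1}  {2:g} / {3:g}".format(ind, shape, lo, hi))
```

For 4 ranks this reproduces the shapes and the min / max pairs in the listing (for example `{16, 4, 4, 4}` with `1 / 1024` for arr4). Matching min / max values do not prove that every element is correct, so this is consistent with bpls looking fine while the full read still returns bad data.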

About this issue

  • State: closed
  • Created 6 years ago
  • Comments: 24 (24 by maintainers)

Most upvoted comments

Thanks @williamfgc. That was the problem in the Python script; all the Python tests now pass. In Fortran, I was missing a barrier that I should have had. I’ll share the results with these fixes next.

I might not be understanding what you mean, @williamfgc, but I think that’s what I’m doing. I’m only writing one array in this case (it is the only data in the file), and that’s the only write statement in each loop iteration.