[flang] bug passing non-contiguous array to MPI procedure #138471

@nncarlson

I've run into errors with flang 20.1.0 when passing a non-contiguous dummy array to MPI procedures (MPICH 4.3.0), a case where the compiler needs to do copy-in/copy-out. It appears that the address of the first array element is passed instead of the address of a contiguous copy of the array. I'm not sure whether this is a bug in flang or a problem in MPICH, whose mpi module may not be written correctly for flang.

I've pared it down to a small (serial) reproducer that doesn't use MPI at all but mimics the mpi module from MPICH 4.3.0. There are two files. The first is mpi_bcast.c:

// Dummy version of MPI_Bcast to dump the received buffer
#include <stdio.h>
void mpi_bcast_(void *buffer, int *count) {
  float *vector = (float*) buffer;
  printf("     in mpi_bcast: buffer=");
  for (int i = 0; i < *count; ++i) {
    printf(" %f", vector[i]);
  }
  printf("\n");
}

and the main program flang-20240504.F90:

module mpi_dummy

  !! Actual form of the interface from the MPICH 4.3.0 mpi module
  interface MPI_Bcast
    subroutine MPI_Bcast(buffer, count)!, datatype, root, comm, ierror)
      implicit none
#ifdef __flang__
      !DIR$ IGNORE_TKR buffer
#endif
#ifdef __INTEL_COMPILER
      !DEC$ ATTRIBUTES NO_ARG_CHECK :: buffer
#endif
#ifdef __GFORTRAN__
      !GCC$ ATTRIBUTES NO_ARG_CHECK :: buffer
#endif
      real :: buffer
      integer :: count
      !integer :: datatype
      !integer :: root
      !integer :: comm
      !integer :: ierror
    end subroutine
  end interface

end module

program main

  use mpi_dummy

  real :: x(5)
  x = [1,2,3,4,5]
  print *, 'x=', x
  call bcast(x(1::2))

contains

  subroutine bcast(buffer)
    real, intent(inout) :: buffer(:)
    print *, '  in bcast: buffer=', buffer
    ! A CONTIGUOUS COPY OF BUFFER MUST BE PASSED HERE
    call MPI_Bcast(buffer, size(buffer))
  end subroutine

end program

To compile:

clang -c mpi_bcast.c
flang flang-20240504.F90 mpi_bcast.o

The expected output is:

$ ./a.out
 x= 1. 2. 3. 4. 5.
   in bcast: buffer= 1. 3. 5.
     in mpi_bcast: buffer= 1.000000 3.000000 5.000000

But with flang I'm getting

$ ./a.out
 x= 1. 2. 3. 4. 5.
   in bcast: buffer= 1. 3. 5.
     in mpi_bcast: buffer= 1.000000 2.000000 3.000000

Both Intel ifx and gfortran produce the expected results.
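
To make the two behaviors concrete, here is a minimal C sketch of what the caller side could look like in each case. It is only a model of the calling convention, not what flang or any other compiler actually emits. All names (`callee_double`, `call_with_base_address`, `call_with_copy_in_out`) are hypothetical, and the callee doubles its buffer in place (unlike the dummy `mpi_bcast_` above, which only prints) so that the copy-out step is visible too.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the external procedure: like the dummy
   mpi_bcast_ above, it assumes `buffer` holds `count` contiguous floats.
   It doubles them in place so that copy-out can be observed. */
void callee_double(float *buffer, int count) {
  for (int i = 0; i < count; ++i) {
    buffer[i] *= 2.0f;
  }
}

/* Models the observed behavior: the address of the first element of the
   strided section is passed as-is, so the callee walks over adjacent
   elements of the parent array and ignores the stride. */
void call_with_base_address(float *base, size_t stride, int count) {
  (void)stride; /* the stride is lost at the call boundary */
  callee_double(base, count);
}

/* Models copy-in/copy-out: gather the strided elements into a contiguous
   temporary, call the procedure on it, then scatter the results back.
   Returns 0 on success, -1 if the temporary cannot be allocated. */
int call_with_copy_in_out(float *base, size_t stride, int count) {
  assert(stride >= 1);
  if (count <= 0) {
    return 0; /* nothing to pass */
  }
  float *tmp = malloc((size_t)count * sizeof *tmp);
  if (tmp == NULL) {
    return -1;
  }
  for (int i = 0; i < count; ++i) {
    tmp[i] = base[(size_t)i * stride];  /* copy-in */
  }
  callee_double(tmp, count);
  for (int i = 0; i < count; ++i) {
    base[(size_t)i * stride] = tmp[i];  /* copy-out */
  }
  free(tmp);
  return 0;
}
```

With `x = {1, 2, 3, 4, 5}`, a stride of 2 and a count of 3 (the `x(1::2)` section from the reproducer), `call_with_base_address` operates on the elements 1, 2, 3, which matches the flang output above, while `call_with_copy_in_out` operates on 1, 3, 5, which matches the ifx and gfortran output.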
