Write a program to measure the time it takes to perform an MPI_Allreduce on MPI_COMM_WORLD. Use a single MPI_DOUBLE for both the send and receive buffers. Use the same techniques as in the memcpy assignment to average out timing variations and the overhead of MPI_Wtime.

Print the size of MPI_COMM_WORLD and the measured time for each test.

Make sure that every process is ready when you begin the test; in MPI_Allreduce each process is both a sender and a receiver.

How does the performance of MPI_Allreduce vary with the size of MPI_COMM_WORLD?