OpenMPI strange output error

I am using OpenMPI 1.3 on a small cluster.

    void invertColor_Parallel(struct image *im, int size, int rank)
    {
        int i,j,aux,r;
        int total_pixels = (*im).ih.width * (*im).ih.height;
        int qty = total_pixels/(size-1);
        int rest = total_pixels % (size-1);
        MPI_Status status;
        //printf("\n%d\n", rank);
        if(rank == 0)
        {
            for(i=1; i<size; i++){
                j = i*qty - qty;
                aux = j;
                if(rest != 0 && i==size-1) {qty=qty+rest;} // to distribute the entire load
                //printf("\nj: %d qty: %d rest: %d\n", j, qty, rest);
                MPI_Send(&aux, 1, MPI_INT, i, MASTER_TO_SLAVE_TAG+1, MPI_COMM_WORLD);
                MPI_Send(&qty, 1, MPI_INT, i, MASTER_TO_SLAVE_TAG+2, MPI_COMM_WORLD);
                MPI_Send(&(*im).array[j], qty*3, MPI_BYTE, i, MASTER_TO_SLAVE_TAG, MPI_COMM_WORLD);
            }
        }
        else
        {
            MPI_Recv(&aux, 1, MPI_INT, MPI_ANY_SOURCE, MASTER_TO_SLAVE_TAG+1, MPI_COMM_WORLD,&status);
            MPI_Recv(&qty, 1, MPI_INT, MPI_ANY_SOURCE, MASTER_TO_SLAVE_TAG+2, MPI_COMM_WORLD,&status);
            pixel *arreglo = (pixel *)calloc(qty, sizeof(pixel));
            MPI_Recv(&arreglo[0], qty*3, MPI_BYTE, MPI_ANY_SOURCE, MASTER_TO_SLAVE_TAG, MPI_COMM_WORLD,&status);
            //printf("Receiving node=%d, message=%d\n", rank, aux);
            for(i=0;i<qty;i++)
            {
                arreglo[i].R = 255-arreglo[i].R;
                arreglo[i].G = 255-arreglo[i].G;
                arreglo[i].B = 255-arreglo[i].B;
            }
            MPI_Send(&aux, 1, MPI_INT, 0, SLAVE_TO_MASTER_TAG+1, MPI_COMM_WORLD);
            MPI_Send(&qty, 1, MPI_INT, 0, SLAVE_TO_MASTER_TAG+2, MPI_COMM_WORLD);
            MPI_Send(&arreglo[0], qty*3, MPI_BYTE, 0, SLAVE_TO_MASTER_TAG, MPI_COMM_WORLD);
            free(arreglo);
        }
        if (rank==0){
            //printf("\nrank: %d\n", rank);
            for (i=1; i<size; i++) // until all slaves have handed back the processed data
            {
                MPI_Recv(&aux, 1, MPI_INT, MPI_ANY_SOURCE, SLAVE_TO_MASTER_TAG+1, MPI_COMM_WORLD,&status);
                MPI_Recv(&qty, 1, MPI_INT, MPI_ANY_SOURCE, SLAVE_TO_MASTER_TAG+2, MPI_COMM_WORLD,&status);
                MPI_Recv(&(*im).array[aux], qty*3, MPI_BYTE, MPI_ANY_SOURCE, SLAVE_TO_MASTER_TAG, MPI_COMM_WORLD,&status);
            }
        }
    }

    int main(int argc, char *argv[])
    {
        //////////time counter
        clock_t begin;
        int rank, size;
        MPI_Init(&argc, &argv);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Status status;
        int op = (int)atof(argv[1]);
        char filename_toload[50];
        int bright_number=0;
        struct image image2;
        if (rank==0)
        {
            loadImage(&image2, argv[2]);
        }
        //Broadcast the user's choice to all other ranks
        MPI_Bcast(&op, 1, MPI_INT, 0, MPI_COMM_WORLD);
        switch(op)
        {
            case 1:
                if (rank==0) {begin = clock();}
                MPI_Barrier(MPI_COMM_WORLD);
                invertColor_Parallel(&image2, size, rank);
                MPI_Barrier(MPI_COMM_WORLD);
                if (rank==0) {runningTime(begin, clock()); printf("Se invirtieron los colores de la imagen\n\n");}
                break;
        }
        MPI_Barrier(MPI_COMM_WORLD);
        if (rank==0)
        {
            saveImage(&image2, argv[3]);
            free(image2.array);
        }
        MPI_Finalize();
        return 0;
    }

Sometimes, when I call this function, I get the following error:

    [email protected]:/mpi$ mpirun -np 60 -hostfile /home/hostfile paralelo 1 image.bmp out.bmp
    [email protected]'s password:
    [maestro:5194] *** An error occurred in MPI_Recv
    [maestro:5194] *** on communicator MPI_COMM_WORLD
    [maestro:5194] *** MPI_ERR_TRUNCATE: message truncated
    [maestro:5194] *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
    --------------------------------------------------------------------------
    mpirun has exited due to process rank 0 with PID 5194 on node maestro
    exiting without calling "finalize". This may have caused other
    processes in the application to be terminated by signals sent by
    mpirun (as reported here).
    --------------------------------------------------------------------------
    [nodo1] [[49223,1],55][../../../../../../ompi/mca/btl/tcp/btl_tcp_frag.c:216:mca_btl_tcp_frag_recv] mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104)

Whether the error occurs depends on the number of processes. With `-np 99` it works pretty well.

I would like to know what is happening.
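
One way to see why the process count can matter is to look at the chunk sizes the master hands out. The sketch below reproduces the `qty`/`rest` arithmetic of `invertColor_Parallel` in two helpers, `chunk_pixels` and `chunks_are_uniform`, which are hypothetical names introduced only for illustration. It assumes, as in the code above, that rank 0 does no pixel work and that only the last worker absorbs the remainder.

```c
#include <assert.h>

/* Hypothetical helper, not part of the original program: number of pixels
   handed to worker `worker` (valid range 1..size-1), using the same
   qty/rest arithmetic as invertColor_Parallel. */
int chunk_pixels(int total_pixels, int size, int worker)
{
    assert(size >= 2 && worker >= 1 && worker < size);
    int qty = total_pixels / (size - 1);
    int rest = total_pixels % (size - 1);
    /* Only the last worker receives the leftover pixels. */
    return (worker == size - 1) ? qty + rest : qty;
}

/* Hypothetical helper: returns 1 when every worker gets the same number
   of pixels, i.e. when the remainder is zero, and 0 otherwise. */
int chunks_are_uniform(int total_pixels, int size)
{
    assert(size >= 2);
    return total_pixels % (size - 1) == 0;
}
```

When the chunks are not uniform, the last worker sends back a longer data message than the others. If the master then posts a receive sized from another worker's `qty` and that longer message is the one matched, the message no longer fits the posted buffer. When the remainder is zero, all data messages have the same length, so a mismatched receive would not be truncated, although it could still store a chunk at the wrong offset.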

Hi, should it be `MPI_Recv(&aux, 1, MPI_INT, MPI_ANY_SOURCE, SLAVE_TO_MASTER_TAG+1, MPI_COMM_WORLD, &status);` or `MPI_Recv(&aux, 1, MPI_INT, status.MPI_SOURCE, SLAVE_TO_MASTER_TAG+1, MPI_COMM_WORLD, &status);`? You gave me a better solution. Thank you very much for your time.

It should be exactly the same as the second code snippet I have: `MPI_ANY_SOURCE` for receiving `aux`, and `status.MPI_SOURCE` for the following calls to `MPI_Recv`. I believe I explained why.

Thank you very much. Now I understand why that was happening, and it is working pretty well now. Thanks.
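
The advice in the comments (a wildcard for the first receive, then the sender recorded in `status.MPI_SOURCE` for the following ones) can be illustrated with a small MPI-free toy model. This is a sketch under stated assumptions, not the program's actual code: `struct msg`, `take`, `count_truncations`, `ANY_SOURCE` and the two tags are hypothetical names, and the mailbox simply stands in for messages that have already arrived at rank 0. The model only assumes what MPI guarantees, namely that messages from one sender are matched in order while messages from different senders may interleave.

```c
#include <assert.h>
#include <stddef.h>

#define ANY_SOURCE (-1)              /* stands in for MPI_ANY_SOURCE */
enum { TAG_QTY = 1, TAG_DATA = 2 };  /* hypothetical tags for this model */

/* One pending message in a toy mailbox; not an MPI type.
   For TAG_QTY, value is a pixel count; for TAG_DATA, it is a byte length. */
struct msg { int source; int tag; int value; int taken; };

/* Take the first pending message with the given tag that comes from
   `source` (or from anyone, if source is ANY_SOURCE).
   Returns its index, or -1 if nothing matches. */
int take(struct msg *box, size_t n, int source, int tag)
{
    for (size_t i = 0; i < n; i++) {
        if (!box[i].taken && box[i].tag == tag &&
            (source == ANY_SOURCE || box[i].source == source)) {
            box[i].taken = 1;
            return (int)i;
        }
    }
    return -1;
}

/* Replay the master's receive loop over the mailbox and count data
   messages longer than the buffer announced by the preceding qty
   (3 bytes per pixel). If pin_source is nonzero, the data receive
   reuses the sender of the qty message instead of a wildcard. */
int count_truncations(struct msg *box, size_t n, int workers, int pin_source)
{
    assert(box != NULL);
    int truncated = 0;
    for (int w = 0; w < workers; w++) {
        int q = take(box, n, ANY_SOURCE, TAG_QTY);
        if (q < 0) break;
        int from = pin_source ? box[q].source : ANY_SOURCE;
        int d = take(box, n, from, TAG_DATA);
        if (d < 0) break;
        if (box[d].value > box[q].value * 3) truncated++;
    }
    return truncated;
}
```

For instance, take a mailbox holding, in arrival order, a count of 3 from worker 1, a count of 4 from worker 3, 12 bytes of data from worker 3, and 9 bytes of data from worker 1. The wildcard variant pairs worker 1's count with worker 3's larger chunk and reports one truncation. The pinned variant reports none.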