
...

Now interchange the loops so that the two block loops (kk and jj) are outermost:
for (kk = 0; kk < 512; kk += BLKSZ)
  for (jj = 0; jj < 512; jj += BLKSZ)
    for (i = 0; i < 512; i++)
      for (k = kk; k < min(kk+BLKSZ, 512); k++) {
        r = A[i][k];
        for (j = jj; j < min(jj+BLKSZ, 512); j++)
          C[i][j] += r * B[k][j];
      }
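
To check that the interchange does not change the result, the blocked nest can be compared against the unblocked i-k-j product. What follows is only a minimal sketch under stated assumptions: the value BLKSZ = 32, the element type double, the helper names (min_int, init, matmul_plain, matmul_blocked, check_blocked_matches_plain) and the test data are all illustrative choices, not taken from the notes.

```c
#include <assert.h>
#include <string.h>

#define N 512
#define BLKSZ 32  /* assumed block size; the notes leave BLKSZ unspecified */

/* Static storage: four 512x512 double matrices are too large for the stack. */
static double A[N][N], B[N][N], C_ref[N][N], C_blk[N][N];

/* Hypothetical helper standing in for the min() used in the notes. */
static int min_int(int a, int b) { return a < b ? a : b; }

/* Fill the inputs with small integers and zero both result matrices. */
static void init(void)
{
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) {
            A[i][j] = (double)((i + j) % 7);
            B[i][j] = (double)((i * 3 + j) % 5);
        }
    memset(C_ref, 0, sizeof C_ref);
    memset(C_blk, 0, sizeof C_blk);
}

/* Unblocked i-k-j product, used only as the reference result. */
static void matmul_plain(void)
{
    for (int i = 0; i < N; i++)
        for (int k = 0; k < N; k++) {
            double r = A[i][k];
            for (int j = 0; j < N; j++)
                C_ref[i][j] += r * B[k][j];
        }
}

/* Blocked product with the block loops (kk, jj) outermost. */
static void matmul_blocked(void)
{
    assert(BLKSZ > 0);  /* a non-positive step would never terminate */
    for (int kk = 0; kk < N; kk += BLKSZ)
        for (int jj = 0; jj < N; jj += BLKSZ)
            for (int i = 0; i < N; i++)
                for (int k = kk; k < min_int(kk + BLKSZ, N); k++) {
                    double r = A[i][k];
                    for (int j = jj; j < min_int(jj + BLKSZ, N); j++)
                        C_blk[i][j] += r * B[k][j];
                }
}

/* Returns 1 if both versions produce identical matrices, 0 otherwise. */
static int check_blocked_matches_plain(void)
{
    init();
    matmul_plain();
    matmul_blocked();
    return memcmp(C_ref, C_blk, sizeof C_ref) == 0;
}
```

Calling check_blocked_matches_plain() from a main function should return 1. For any fixed C[i][j], both versions add the k-terms in the same increasing order of k, so the comparison can be exact rather than tolerance-based; the small integer test data keeps every partial sum exactly representable as well.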