
  • Loop Pipelining
  • canci
  • Loop pipelining and loop unrolling

    • improve a hardware function’s performance
    • by exploiting the parallelism between loop iterations.


  • Covered here:

    • the basic concepts of loop pipelining and loop unrolling,
    • example code that applies these techniques,
    • the factors that limit the performance achievable with these techniques.


Loop Pipelining

  • In sequential languages such as C/C++,
  • the operations in a loop
    • are executed sequentially.
  • The next iteration of the loop
    • can only begin when the last operation in the current loop iteration is complete.
  • Loop pipelining
    • allows the operations in a loop to be implemented concurrently, as the comparison below shows.

  • Without pipelining: three clock cycles between the two RD operations, and

    • six clock cycles for the entire loop to finish.


  • With pipelining: one clock cycle between the two RD operations, and

    • four clock cycles for the entire loop to finish;
    • the next iteration of the loop can start before the current iteration is finished.

  • Key term in loop pipelining:

    • Initiation Interval (II):
    • the number of clock cycles between the start times of consecutive loop iterations.


  • In the pipelined case above,

    • II is one:
    • there is one clock cycle between the start times of consecutive loop iterations.
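
The cycle counts quoted above can be reproduced with a simplified model. This is only a sketch: it assumes a loop of two iterations, each taking three clock cycles, which is consistent with the six-cycle and four-cycle totals above but is not stated explicitly in the source. It also ignores any loop entry or exit overhead a real tool may add. The function names are hypothetical.

```c
#include <assert.h>

/* Start cycle of iteration `index` (0-based) when a new iteration
 * is launched every `ii` clock cycles. */
unsigned iteration_start(unsigned index, unsigned ii)
{
    return index * ii;
}

/* Total cycles when each iteration must finish before the next begins.
 * `depth` is the number of clock cycles one iteration takes. */
unsigned sequential_cycles(unsigned iterations, unsigned depth)
{
    return iterations * depth;
}

/* Total cycles for a pipelined loop: the last iteration starts at
 * (iterations - 1) * ii and then needs `depth` cycles to complete. */
unsigned pipelined_cycles(unsigned iterations, unsigned depth, unsigned ii)
{
    assert(iterations > 0 && depth > 0 && ii > 0);
    return iteration_start(iterations - 1, ii) + depth;
}
```

Under these assumptions, sequential_cycles(2, 3) gives the six cycles of the non-pipelined case and pipelined_cycles(2, 3, 1) gives the four cycles of the pipelined case. Setting ii equal to depth makes both models agree, since no iterations overlap.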

  • To pipeline a loop,
  • put #pragma HLS pipeline at the beginning of the loop body;
  • Vivado HLS then tries to pipeline the loop with the minimum Initiation Interval.
for (index_a = 0; index_a < A_NROWS; index_a++) {
    for (index_b = 0; index_b < B_NCOLS; index_b++) {
/* Pipeline the index_b loop body, requesting an Initiation Interval of 1 */
#pragma HLS PIPELINE II=1
        float result = 0;
        for (index_d = 0; index_d < A_NCOLS; index_d++) {
            float product_term = in_A[index_a][index_d] * in_B[index_d][index_b];
            result += product_term;
        }
        out_C[index_a * B_NCOLS + index_b] = result;
    }
}
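
To try the loop nest above in software before synthesis, it can be wrapped in a function and checked with a small testbench. The following is a minimal sketch, not the original design: the function names mmult_hw and mmult_testbench, the 2x3 and 3x2 matrix sizes, and the test values are all assumptions made for illustration.

```c
#include <assert.h>

/* Illustrative sizes; the original fragment does not state them. */
#define A_NROWS 2
#define A_NCOLS 3
#define B_NCOLS 2

/* Hypothetical top-level function wrapping the pipelined loop nest. */
void mmult_hw(float in_A[A_NROWS][A_NCOLS],
              float in_B[A_NCOLS][B_NCOLS],
              float out_C[A_NROWS * B_NCOLS])
{
    int index_a, index_b, index_d;

    for (index_a = 0; index_a < A_NROWS; index_a++) {
        for (index_b = 0; index_b < B_NCOLS; index_b++) {
/* Request one new index_b iteration per clock cycle. */
#pragma HLS PIPELINE II=1
            float result = 0;
            for (index_d = 0; index_d < A_NCOLS; index_d++) {
                float product_term = in_A[index_a][index_d] * in_B[index_d][index_b];
                result += product_term;
            }
            out_C[index_a * B_NCOLS + index_b] = result;
        }
    }
}

/* Software testbench: compares the output with hand-computed values. */
void mmult_testbench(void)
{
    float a[A_NROWS][A_NCOLS] = {{1, 2, 3}, {4, 5, 6}};
    float b[A_NCOLS][B_NCOLS] = {{7, 8}, {9, 10}, {11, 12}};
    float c[A_NROWS * B_NCOLS];
    const float expected[A_NROWS * B_NCOLS] = {58, 64, 139, 154};
    int i;

    mmult_hw(a, b, c);
    for (i = 0; i < A_NROWS * B_NCOLS; i++) {
        assert(c[i] == expected[i]);
    }
}
```

A standard C compiler does not act on the HLS pragma (it may emit an unknown-pragma warning), so the same source can be checked functionally on a workstation. The pragma only affects the schedule that Vivado HLS generates, not the computed result.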


