Slide 39 of 88
suninhouse

This may have come up as a question in the lecture but why (mul + add)? Does it mean that per clock mul and add can be executed at the same time?

jgrace

I was wondering the same thing and tried to learn about this. I could be incorrect (somebody please correct me) but I think it means that either one multiplication or one addition operation can be computed per clock cycle on each execution unit.

blipblop

@suninhouse @jgrace It seems that (from the algebra) one mul and one add (= 2 operations) can both be executed simultaneously in one clock cycle. How the ALU achieves this, I am not sure.

timothyyeo

Can anybody show the exact calculation behind 268 GFLOPS? I'm trying to reproduce the number but haven't managed to yet.

timothyyeo

Is it 4x8x2x4.2 = 268.8 GFLOPs?
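
To make the arithmetic in that product easy to check, here is a minimal sketch. The meaning attached to each factor (cores, SIMD lanes, clock in GHz) is an assumption read off the numbers, not something confirmed in this thread, and the variable names are purely illustrative.

```python
# Sketch of the peak-throughput arithmetic, under assumed factor meanings.
cores = 4            # assumption: number of cores
simd_lanes = 8       # assumption: 32-bit SIMD lanes per core
flops_per_lane = 2   # one mul + one add counted as two operations
clock_ghz = 4.2      # assumption: clock frequency in GHz

# operations per clock across the chip, times billions of clocks per second
peak_gflops = cores * simd_lanes * flops_per_lane * clock_ghz
print(f"{peak_gflops:.1f} GFLOPS")
```

If those readings of the factors are right, dropping the factor of 2 (counting only a mul or only an add per lane per clock) would give half the figure, which is why the (mul + add) term matters.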

a7hu

I think mul+add refers to the fused multiply-add (FMA) instructions on Intel processors. A fused multiply-add instruction performs both a multiplication and an addition in one clock cycle. For instance, _mm256_fmadd_ps(a, b, c) computes dst[i] = (a[i] * b[i]) + c[i].
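
As a rough illustration of the dst[i] formula, here is a plain-Python sketch of the lane-wise arithmetic. The helper name fmadd_lanes is hypothetical, and the code models only what values come out, not how the hardware fuses or rounds the operation.

```python
def fmadd_lanes(a, b, c):
    """Lane-wise a[i] * b[i] + c[i], mirroring the dst[i] formula.

    Hypothetical helper: models the arithmetic only, not the hardware.
    """
    assert len(a) == len(b) == len(c)
    return [x * y + z for x, y, z in zip(a, b, c)]

LANES = 8  # a 256-bit register holds eight 32-bit floats

a = [float(i) for i in range(LANES)]
b = [2.0] * LANES
c = [1.0] * LANES

dst = fmadd_lanes(a, b, c)
flops = 2 * LANES  # one mul and one add per lane

print(dst)    # [1.0, 3.0, 5.0, 7.0, 9.0, 11.0, 13.0, 15.0]
print(flops)  # 16
```

One such instruction over eight lanes accounts for 16 floating-point operations, which matches the 8 x 2 part of the product discussed above.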

jessiexu

Fused multiply-add saves cycles compared with a separate multiply followed by an add, since the multiplier and the adder share some of the same circuitry. If the compiler can find enough multiply-then-add patterns to turn into FMA instructions, the program can run faster.
