Is your feature request related to a problem? Please describe.
I can't analyze the following loop (generated by the Intel ifx compiler):
.LBB0_112:
vmovups -8(%r10,%r14,4), %ymm6
vmovups -4(%r10,%r14,4), %ymm7
vaddps (%r10,%r14,4), %ymm6, %ymm6
vaddps (%r11,%r14,4), %ymm6, %ymm6
vaddps 4(%r12,%r14,4), %ymm6, %ymm6
vmulps %ymm2, %ymm6, %ymm6
vsubps %ymm6, %ymm7, %ymm7
vandps %ymm3, %ymm7, %ymm7
vcmpleps %ymm5, %ymm7, %ymm8
vblendvps %ymm8, %ymm5, %ymm7, %ymm5
vmovups %ymm6, (%r9,%r14,4)
addq $8, %r14
cmpq %r8, %r14
jl .LBB0_112
The performance data for the vcmpleps instruction is missing.
Describe the solution you'd like
Performance data for vcmpleps should be available, at least for SKX and SPR.
Additional context
Link to Compiler Explorer for the case under scrutiny: https://godbolt.org/z/5h5zxr6o7