VPSHLDD - Packed SHift Left Double Dword
VPSHLDD xmm1{k1}{z}, xmm2, xmm3/m128/m32bcst, imm8    (V5+VBMI2+VL)
__m128i _mm_shldi_epi32(__m128i a, __m128i b, int imm8)
__m128i _mm_mask_shldi_epi32(__m128i s, __mmask8 k, __m128i a, __m128i b, int imm8)
__m128i _mm_maskz_shldi_epi32(__mmask8 k, __m128i a, __m128i b, int imm8)

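For each DWORD, (1) is shifted left by the number of bits specified in imm8 bits 4:0, and the upper bits of (2) are copied into the vacated lower bits. The result is stored in (3). Here (1) is the first source (xmm2, intrinsic argument a), (2) is the second source (xmm3/m128/m32bcst, argument b), and (3) is the destination (xmm1, the return value).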
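The per-DWORD operation can be sketched as a scalar model in C. This is an illustrative reference, not the intrinsic itself: the helper name shld32_lane is hypothetical, and it assumes the first source (a) forms the high half of a 64-bit concatenation with the second source (b) as the low half.

```c
#include <stdint.h>

/* Scalar model of one DWORD lane of VPSHLDD.
   shld32_lane is a hypothetical helper name, not a real intrinsic. */
static uint32_t shld32_lane(uint32_t a, uint32_t b, unsigned imm8)
{
    unsigned n = imm8 & 31;                   /* only imm8 bits 4:0 are used */
    uint64_t cat = ((uint64_t)a << 32) | b;   /* a = high half, b = low half */
    return (uint32_t)((cat << n) >> 32);      /* keep the upper 32 bits      */
}
```

With a = 0x12345678, b = 0x9ABCDEF0 and imm8 = 8 the model yields 0x3456789A. Passing the same value for both sources gives a rotate left by the count.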
VPSHLDD ymm1{k1}{z}, ymm2, ymm3/m256/m32bcst, imm8    (V5+VBMI2+VL)
__m256i _mm256_shldi_epi32(__m256i a, __m256i b, int imm8)
__m256i _mm256_mask_shldi_epi32(__m256i s, __mmask8 k, __m256i a, __m256i b, int imm8)
__m256i _mm256_maskz_shldi_epi32(__mmask8 k, __m256i a, __m256i b, int imm8)

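For each DWORD, (1) is shifted left by the number of bits specified in imm8 bits 4:0, and the upper bits of (2) are copied into the vacated lower bits. The result is stored in (3). Here (1) is the first source (ymm2, intrinsic argument a), (2) is the second source (ymm3/m256/m32bcst, argument b), and (3) is the destination (ymm1, the return value).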
VPSHLDD zmm1{k1}{z}, zmm2, zmm3/m512/m32bcst, imm8    (V5+VBMI2)
__m512i _mm512_shldi_epi32(__m512i a, __m512i b, int imm8)
__m512i _mm512_mask_shldi_epi32(__m512i s, __mmask16 k, __m512i a, __m512i b, int imm8)
__m512i _mm512_maskz_shldi_epi32(__mmask16 k, __m512i a, __m512i b, int imm8)

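For each DWORD, (1) is shifted left by the number of bits specified in imm8 bits 4:0, and the upper bits of (2) are copied into the vacated lower bits. The result is stored in (3). Here (1) is the first source (zmm2, intrinsic argument a), (2) is the second source (zmm3/m512/m32bcst, argument b), and (3) is the destination (zmm1, the return value).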
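The _mask_ and _maskz_ forms listed above differ only in what happens to lanes whose mask bit is 0. A minimal scalar sketch for the 16-lane (512-bit) case follows, assuming the usual AVX-512 convention that merge-masking keeps the lane from s and zero-masking ({z}) clears it. All function names here are hypothetical models, not real intrinsics.

```c
#include <stdint.h>

/* Hypothetical per-lane model: concatenate a:b, shift left, keep the upper 32 bits. */
static uint32_t shld_lane_ref(uint32_t a, uint32_t b, unsigned imm8)
{
    unsigned n = imm8 & 31;                   /* only imm8 bits 4:0 are used */
    uint64_t cat = ((uint64_t)a << 32) | b;
    return (uint32_t)((cat << n) >> 32);
}

/* Models the merge-masked form: lanes with a clear mask bit are copied from s. */
static void mask_shld_model(uint32_t dst[16], const uint32_t s[16], uint16_t k,
                            const uint32_t a[16], const uint32_t b[16],
                            unsigned imm8)
{
    for (int i = 0; i < 16; i++)
        dst[i] = ((k >> i) & 1) ? shld_lane_ref(a[i], b[i], imm8) : s[i];
}

/* Models the zero-masked form: lanes with a clear mask bit become 0. */
static void maskz_shld_model(uint32_t dst[16], uint16_t k,
                             const uint32_t a[16], const uint32_t b[16],
                             unsigned imm8)
{
    for (int i = 0; i < 16; i++)
        dst[i] = ((k >> i) & 1) ? shld_lane_ref(a[i], b[i], imm8) : 0;
}
```

The 128-bit and 256-bit forms behave the same way with 4 and 8 lanes, using the low bits of a __mmask8.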