runtime: Support emitting a constant for the Vector64/128/256.Create hardware intrinsic methods
For the case where the Vector64.Create, Vector128.Create, and Vector256.Create helper functions are called with all constant arguments, we should support emitting a constant which can be loaded from memory, rather than emitting a chain of shuffle or insert calls.
We may also find some benefit in doing the same for partial constants, since a partial constant followed by several inserts can still be faster than treating the whole vector as non-constant.
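To make the two cases concrete, here is a minimal C# sketch. The class and method names (`VectorCreateSketch`, `AllConstant`, `PartialConstant`, `PartialConstantManual`) are hypothetical and chosen only for illustration. Whether the JIT actually emits a memory constant for either method depends on the runtime version, so the comments describe the intent of this issue rather than guaranteed codegen.

```csharp
using System;
using System.Runtime.Intrinsics;

public static class VectorCreateSketch
{
    // Every argument is a compile-time constant. This is the case the issue
    // targets: ideally one 16-byte constant loaded from memory instead of a
    // chain of inserts/shuffles (actual codegen is runtime-dependent).
    public static Vector128<float> AllConstant() =>
        Vector128.Create(1.0f, 2.0f, 3.0f, 4.0f);

    // Three constant elements and one variable element: a "partial constant".
    public static Vector128<float> PartialConstant(float x) =>
        Vector128.Create(1.0f, 2.0f, 3.0f, x);

    // Hand-written form of the strategy suggested for partial constants:
    // start from a fully constant vector, then insert the one varying element.
    public static Vector128<float> PartialConstantManual(float x) =>
        Vector128.Create(1.0f, 2.0f, 3.0f, 0.0f).WithElement(3, x);
}
```

`PartialConstant(x)` and `PartialConstantManual(x)` produce the same value. The second simply spells out "constant load plus one insert" by hand, which is the shape the JIT could choose on its own for partial constants.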
category:cq theme:vector-codegen skill-level:intermediate cost:medium
About this issue
- State: closed
- Created 6 years ago
- Reactions: 4
- Comments: 18 (18 by maintainers)
Hmm, right, SetVector tends to generate so much code that either way things aren’t that great in terms of size. The concerns about data size would be more relevant to SetAll: that one generates much less code, and yet native compilers tend to use memory constants for it as well. But then native compilers have the luxury of deduplicating constants…
I’m not so sure on this part. It looks like it generally takes more bytes to do the insert/shift code than it does to store the raw bytes and read from memory.
So, even with “perfect” deduping of float constants (which we don’t have), we still only have a code savings of ~2 bytes.