These Vulkan 1.1 operations can be used in place of
`OpSubgroup{All,Any,AllEqual,Ballot}KHR`, among other things.
For `OpGroupNonUniformShuffleXor`, which was already implemented, it
turns out the scope argument must be encoded not as an immediate, but as
an id that refers to a constant integer.
Suppose you try to call, say, `AddEntryPoint` with a `std::vector<Id>`
as the `interfaces` argument - something that yuzu does. This can match
the non-variadic overload, since `std::vector<Id>` is implicitly
convertible to the argument type `std::span<const Id>`. But it can also
match the variadic overload, and the compiler sees that as a 'better'
match because it doesn't require implicit conversion. So it picks that
overload and promptly errors out trying to convert `std::vector<Id>` to
`Id`.
To make the compiler pick the right overload, you would have to
explicitly convert to `std::span<const Id>`, which is annoyingly
verbose.
To avoid this, add `requires` clauses to all variadic convenience
overloads, requiring each of the variadic arguments to be convertible to
the corresponding element type. If you pass a vector/array/etc., this
rules out the variadic overload as a candidate, and the call goes
through with the non-variadic overload.
Also, use slightly different code to forward to the non-variadic
overloads, so that forwarding works even if the arguments need to be
converted.
Note: I used this in a WIP branch updating yuzu to the latest version of
sirit.
Note 2: I tried to run clang-format on this, but it mangled the requires
clauses pretty horribly, so I didn't accept its changes. I googled it,
and apparently clang-format doesn't properly support concepts yet...
Before this commit, sirit generated a stream of tokens that would then be
inserted into the final SPIR-V binary. This design was carried over from
the initial design of manually inserting opcodes into the code. Now that
all instructions but labels are inserted when their respective function
is called, the old design can be dropped in favor of generating a valid
stream of SPIR-V opcodes.
This breaks the API for variables, but adopting the new one is trivial.
Instead of calling OpVariable and then adding the result as a global or
local variable, OpVariable was removed, and global and local variables
are generated when their respective functions are called.
Duplicates are now avoided with a std::unordered_set instead of a
linear search that jumps through vtables.
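
As a rough illustration of set-based deduplication, the sketch below keys each declaration by its encoded words. This is an assumption made for the example: `Words`, `WordsHash`, and `DeclarationSet` are hypothetical names, and sirit's real declaration type and hash may look quite different.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical encoding: a declaration is identified by its words.
using Words = std::vector<std::uint32_t>;

struct WordsHash {
    std::size_t operator()(const Words& words) const noexcept {
        std::size_t hash = words.size();
        for (const std::uint32_t word : words) {
            // Boost-style hash_combine mixing step.
            hash ^= word + 0x9e3779b9u + (hash << 6) + (hash >> 2);
        }
        return hash;
    }
};

class DeclarationSet {
public:
    // Returns true if the declaration was new, false if it was a duplicate.
    bool Insert(Words words) {
        assert(!words.empty()); // A declaration has at least an opcode word.
        return seen.insert(std::move(words)).second;
    }

    std::size_t Size() const noexcept {
        return seen.size();
    }

private:
    std::unordered_set<Words, WordsHash> seen;
};
```

Lookup cost is then constant on average, rather than growing with the number of declarations already emitted.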
Enable cast warnings in gcc and clang and always treat warnings as
errors.
GetWordCount now returns std::size_t for simplicity, and the word count
is asserted and cast in WordCount (now called CalculateTotalWords).
Silence warnings.
Previously the test couldn't fail unless it crashed. Now that sirit does
not do work "behind the scenes" that can change between versions (like
declaring capabilities), we can add this check.
All instructions but OpVariable and OpLabel are automatically emitted.
These functions have to call AddLocalVariable/AddGlobalVariable or
AddLabel respectively.
Like the other overloads, we can insert the whole string within one
operation instead of doing a byte-by-byte append.
We only do byte-by-byte appending when padding is necessary.
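
A minimal sketch of that approach, assuming the stream is backed by a `std::vector<std::uint8_t>` (`AppendString` and `bytes` are hypothetical names, not sirit's). It relies on SPIR-V literal strings being nul-terminated and zero-padded to a 4-byte word boundary.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string_view>
#include <vector>

// Hypothetical helper; `bytes` stands in for the stream's byte buffer.
void AppendString(std::vector<std::uint8_t>& bytes, std::string_view text) {
    constexpr std::size_t word_size = 4;
    assert(bytes.size() % word_size == 0); // Stream is word-aligned on entry.

    // Copy all characters with a single range insert.
    bytes.insert(bytes.end(), text.begin(), text.end());

    // Append the nul terminator plus zero padding up to the next word
    // boundary, one byte at a time (always 1 to 4 bytes).
    const std::size_t padding = word_size - text.size() % word_size;
    for (std::size_t i = 0; i < padding; ++i) {
        bytes.push_back(0);
    }
}
```

A string whose length is already a multiple of four still gets one full word of zeros, because the terminator needs somewhere to live.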
It's undefined behavior to cast a pointer to another type and
dereference the result unless one of the following holds:
1. The cast-to type is similar to the original type (an *extremely*
vague definition at face value, see below for clarification).
2. The cast-to type is the signed or unsigned variant of the original
type (e.g. it's fine to cast an int* to an unsigned int* and
vice-versa).
3. The cast-to pointer type is std::byte*, char*, or unsigned char*.
With regards to type similarity, two types (X and Y) are considered
"similar" if:
1. They're the same type (naturally)
2. They're both pointers and the pointed-to types are similar (basically
1. but for pointers)
3. They're both pointers to members of the same class and the types of
the pointed-to members are similar in type.
4. They're both arrays of the same size or both arrays of unknown size
*and* the array element types are similar.
Plus, doing it this way doesn't do a byte-by-byte appending to the
underlying std::vector and instead allocates all the necessary memory up
front and slaps the elements at the end of it.
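
To make the aliasing rules concrete, here is a small sketch under the assumption that the stream is a `std::vector<std::uint8_t>`. `AppendWord` and `ReadWord` are hypothetical helpers. Writing views the word through `unsigned char*`, which is always permitted. Reading back uses `std::memcpy` rather than casting the byte pointer to `std::uint32_t*`, since that cast-and-dereference would fall outside the allowed cases.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Append the object representation of `word` (host byte order).
void AppendWord(std::vector<std::uint8_t>& bytes, std::uint32_t word) {
    // Inspecting any object through unsigned char* is allowed.
    const auto* first = reinterpret_cast<const unsigned char*>(&word);
    // One range insert: the vector grows once for all four bytes.
    bytes.insert(bytes.end(), first, first + sizeof(word));
}

// Read a word back without violating aliasing rules.
std::uint32_t ReadWord(const std::vector<std::uint8_t>& bytes, std::size_t offset) {
    assert(offset + sizeof(std::uint32_t) <= bytes.size());
    std::uint32_t word;
    // memcpy copies the bytes; no uint8_t* -> uint32_t* dereference occurs.
    std::memcpy(&word, bytes.data() + offset, sizeof(word));
    return word;
}
```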
Previously this wasn't utilizing any of the compiler flags, meaning it
wasn't applying any of the specified warnings.
Applying the warnings to the target uncovered a few warning cases, such
as shadowed class variables on MSVC, which have been fixed.
While looping here does work fine, it's mildly inefficient, particularly
if the number of members being added is large, because it can result in
multiple allocations over the period of the insertion, depending on how
much extra memory push_back may allocate for successive elements.
Instead, we can just tell the std::vector that we want to slap the whole
contained sequence at the back of it with insert, which lets it allocate
the whole memory block in one attempt.
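
The change amounts to something like the following sketch. `AppendAll` is a hypothetical helper name used only for illustration.

```cpp
#include <cassert>
#include <vector>

// Append every element of `src` to `dst` with one range insert instead of
// a push_back loop.
template <typename T>
void AppendAll(std::vector<T>& dst, const std::vector<T>& src) {
    assert(&dst != &src); // Inserting a vector's own range into itself is not allowed.
    // The iterator distance is known up front, so the vector can reserve
    // the required capacity once rather than growing repeatedly.
    dst.insert(dst.end(), src.begin(), src.end());
}
```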