SIMD[dtype, size]
dtype is a DType value, for example DType.int8 or DType.float64, and size is the number of elements in the vector. Int64 is a SIMD vector of size 1. The SIMD type has methods and operators.
print(
4 * SIMD[DType.int8,4](1,2,3,4)
)
[4, 8, 12, 16]
print(
SIMD[DType.int32,4](1,1,2,2).reduce_add()
)
6
print(
SIMD[DType.bool,4](True,False,True,False).reduce_and()
)
false
The multiply operator (*) works in a uniform manner on SIMD vectors of any size, whether size 1 or size 32.
It has the __add__ dunder, __init__, and many more methods.
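To make the operators concrete, here is a minimal sketch that reuses only the constructor, the arithmetic operators, and reduce_add from the snippets above. The names a and b are illustrative and not part of the original article.

```mojo
def main():
    # Two vectors of four 32-bit integers (illustrative names).
    var a = SIMD[DType.int32, 4](1, 2, 3, 4)
    var b = SIMD[DType.int32, 4](10, 20, 30, 40)

    # + calls __add__ and works lane by lane.
    print(a + b)  # [11, 22, 33, 44]

    # * also works lane by lane; reduce_add then sums the four lanes.
    print((a * b).reduce_add())  # 300
```

Each operator touches all four lanes at once, so no explicit loop over the elements is needed.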
SIMD on the Stack
var x = SIMD[DType.float64,2](1.5, 2.5)
var y = x.reduce_add()
print(y)
y is a Float64.
SIMD on the Heap
Take, for example, a pointer to 10 Int64 values. Instead of iterating over each element to add them together, it is also possible to do a fast addition with SIMD!
DTypePointer
It is like a pointer, but more specialized for SIMD.
Because we allocate with alloc, we have to release the memory with free.
def main():
    alias amount_of_bytes = 256
    var mem = DTypePointer[DType.uint8].alloc(amount_of_bytes)
    for i in range(amount_of_bytes):
        mem[i] = i  # slower, but a good first step!
SIMD vector
Let's load the first 8 elements:
var bunch_of_bytes = SIMD[type=DType.uint8, size=8].load(mem)
print(bunch_of_bytes)
[0, 1, 2, 3, 4, 5, 6, 7]
The data is now in a SIMD vector.
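Going back to the earlier example of ten Int64 values on the heap, the "fast addition" could be sketched as follows. This is only an illustration under assumptions: it presumes the same Mojo release as the snippets above (one where DTypePointer is in scope and SIMD[...].load accepts the pointer), and the names count, data, first_eight, and total are made up for this example.

```mojo
def main():
    alias count = 10
    var data = DTypePointer[DType.int64].alloc(count)
    for i in range(count):
        data[i] = i  # values 0..9

    # Load the first 8 values into one SIMD vector and sum its lanes.
    var first_eight = SIMD[type=DType.int64, size=8].load(data)
    var total = first_eight.reduce_add()

    # 10 is not a multiple of 8, so the two leftover values are added one by one.
    total += data[8]
    total += data[9]

    print(total)  # 45
    data.free()
```

Eight of the ten additions happen inside a single reduce_add; only the tail that does not fill a whole vector falls back to scalar additions.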
width is the size of the SIMD vector; stride is the distance between two consecutive loaded elements and can be used together with an offset.
0◄─────┐
1 │
2◄─────┤
3 │ Stride: 2
4◄─────┤ Width: 4
5 │
6◄─────┤
7 │
│
▼
[0,2,4,6] SIMD[Width:4]

###### A. The concept
```mojo
var stride_like = 2
for i in range(0,8,stride_like):
    print(i)
```
> 0, 2, 4, 6

###### B. The SIMD stride
var separated_by_2 = mem.simd_strided_load[width = 8](
stride = 2
)
print(separated_by_2)
[0, 2, 4, 6, 8, 10, 12, 14]
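To connect the strided load with the plain loop from part A, the sketch below performs the same read twice: once as a single simd_strided_load and once element by element. It assumes the same Mojo release and DTypePointer API as the article's own snippets; the smaller 16-byte buffer and the name lane are choices made for this example.

```mojo
def main():
    alias amount_of_bytes = 16
    var mem = DTypePointer[DType.uint8].alloc(amount_of_bytes)
    for i in range(amount_of_bytes):
        mem[i] = i

    # One strided load: the 8 lanes come from positions 0, 2, 4, ..., 14.
    print(mem.simd_strided_load[width = 8](stride = 2))

    # The same positions visited one at a time.
    for lane in range(8):
        print(mem[lane * 2])

    mem.free()
```

Lane number lane reads position lane * stride, so with width 8 and stride 2 the highest position touched is 14, which is why the buffer needs at least 15 elements.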
gather
It gathers the values stored at various positions into a SIMD vector.
for i in range(16):
    mem[i] = i*i
print(
mem.gather(
SIMD[DType.int64,4](1, 2, 5, 6)
)
)
[1, 4, 25, 36]
Here is the gather method of DTypePointer in a visual form:
Memory: 0 10 20 30 40 50
│ │ │ │
└─────┬─┴──┴──┘
Gather 0 │ 3 4 5
▼
[0,30,40,50]
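The diagram can be reproduced in code. This is a small sketch that assumes the same DTypePointer API as the snippets above; the six-element buffer filled with multiples of 10 simply mirrors the picture.

```mojo
def main():
    alias count = 6
    var mem = DTypePointer[DType.uint8].alloc(count)
    for i in range(count):
        mem[i] = i * 10  # 0, 10, 20, 30, 40, 50

    # Pick positions 0, 3, 4 and 5 in a single call.
    print(mem.gather(SIMD[DType.int64, 4](0, 3, 4, 5)))  # [0, 30, 40, 50]

    mem.free()
```

Unlike a strided load, the positions do not need to be evenly spaced: each lane of the offset vector selects its own element.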
scatter
It assigns new values to various positions. The positions (int64) and the values are each provided as a SIMD vector.
mem.scatter(
offset = SIMD[DType.int64, 2](1,10),
val = SIMD[DType.uint8, 2](0, 0)
)
print(mem[1])
print(mem[10])
0
0
Here is the scatter method of DTypePointer in a visual form:
Memory: 0 10 20 30 40 50
▲ ▲
│ │
│ │
┌──┴─────┘
│ 0 2 Indexes
│ 100 200 Values
scatter
Memory: 100, 10, 200, 30, 40, 50
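The scatter diagram can be written out the same way. As before, this is a sketch that assumes the DTypePointer API used in the article's snippets; the buffer contents mirror the picture.

```mojo
def main():
    alias count = 6
    var mem = DTypePointer[DType.uint8].alloc(count)
    for i in range(count):
        mem[i] = i * 10  # 0, 10, 20, 30, 40, 50

    # Write 100 to position 0 and 200 to position 2 in a single call.
    mem.scatter(
        offset = SIMD[DType.int64, 2](0, 2),
        val = SIMD[DType.uint8, 2](100, 200)
    )

    # Prints 100, 10, 200, 30, 40, 50 (one value per line).
    for i in range(count):
        print(mem[i])

    mem.free()
```

Positions that are not listed in offset keep their previous values, which is why 10, 30, 40 and 50 are unchanged.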
free
alloc gave us some RAM for the program; free gives it back:
mem.free()
Very easy to use:
┌──────────────────────┐
│ RAM │
├──┐ │
└┼─┴───────────────────┘
│
│
▼
alloc
┌──┐
└──┘
Our program has to give that small amount of RAM back
because another program might need it!