Add vectorized fillNextPrimes() algorithm for other CPU architectures (e.g. arm64)

The performance of ```primesieve::iterator``` depends heavily on the ```fillNextPrimes()``` method from ```PrimeGenerator.cpp```. For x64 we have a [vectorized AVX512 algorithm](https://github.com/kimwalisch/primesieve/blob/c2c56e5b01478f6cf8cbfc10443156c482362fab/src/PrimeGenerator.cpp#L349) that is close to optimal for this task. Once other CPU architectures (e.g. arm64) support 512-bit vector instructions comparable to AVX512, we should port our AVX512 algorithm to these CPU architectures.
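For readers unfamiliar with the task, here is a minimal scalar sketch of the kind of work ```fillNextPrimes()``` has to do: turn the set bits of a sieve array into 64-bit primes. This is not the code from ```PrimeGenerator.cpp```. The function name ```decodeSieveScalar()```, the ```kOffsets``` table and the bit layout (each byte covers 30 integers, bit i stands for ```low + 30 * byteIndex + kOffsets[i]```) are assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed bit layout (illustration only): each sieve byte covers 30
// consecutive integers and bit i of byte b stands for the number
// low + 30 * b + kOffsets[i].
constexpr std::uint64_t kOffsets[8] = {7, 11, 13, 17, 19, 23, 29, 31};

// Hypothetical helper, not part of the primesieve API.
// Appends one prime per set bit to `primes` and returns how many were added.
inline std::size_t decodeSieveScalar(const std::uint8_t* sieve,
                                     std::size_t sieveSize,
                                     std::uint64_t low,
                                     std::vector<std::uint64_t>& primes)
{
    std::size_t count = 0;

    for (std::size_t b = 0; b < sieveSize; b++)
    {
        unsigned bits = sieve[b];

        while (bits != 0)
        {
            // Find the index of the lowest set bit (portable loop,
            // a real implementation would use a ctz instruction).
            unsigned i = 0;
            while (((bits >> i) & 1u) == 0)
                i++;

            primes.push_back(low + 30 * b + kOffsets[i]);
            bits &= bits - 1; // clear the lowest set bit
            count++;
        }
    }

    return count;
}
```

The inner loop handles one bit per iteration and its trip count depends on the data. A vectorized version would instead aim to convert many set bits into primes per instruction, which is why the available vector width matters for a port.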

ARM recently (2021) added the Scalable Vector Extension (SVE) to its CPUs. SVE is designed as a portable vector instruction set that works with different vector widths. However, to vectorize our ```fillNextPrimes()``` method we need vectors that are at least 512 bits wide.
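Because the SVE vector width is implementation-defined, a port would probably need to check it before selecting a 512-bit code path. The following is a minimal sketch of such a check, assuming the code is compiled with SVE enabled (so that ```__ARM_FEATURE_SVE``` is defined) and using the ACLE intrinsic ```svcntb()``` from ```<arm_sve.h>```. The helper names ```sveVectorBits()``` and ```hasVectorWidthFor512BitKernel()``` are hypothetical and do not exist in primesieve.

```cpp
#include <cstddef>

#if defined(__ARM_FEATURE_SVE)
  #include <arm_sve.h>
#endif

// Hypothetical helper: returns the SVE vector width in bits,
// or 0 if this translation unit was not compiled with SVE enabled.
inline std::size_t sveVectorBits()
{
#if defined(__ARM_FEATURE_SVE)
    // svcntb() returns the number of bytes in one SVE vector.
    return static_cast<std::size_t>(svcntb()) * 8;
#else
    return 0;
#endif
}

// Hypothetical helper: true if the vectors are wide enough for
// an algorithm that assumes at least 512-bit vectors.
inline bool hasVectorWidthFor512BitKernel()
{
    return sveVectorBits() >= 512;
}
```

This check only covers builds where SVE is enabled at compile time. A binary that must also run on CPUs without SVE would additionally need runtime CPU feature detection before calling any SVE code.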
