
Difference between revisions of "cpp/experimental/simd"

From cppreference.com
m (External links: +links to implementation.)
 
(47 intermediate revisions by 6 users not shown)
Line 1: Line 1:
{{title|Data-Parallel Vector Library}}
+
{{title|SIMD library}}
{{navbar
+
{{cpp/experimental/simd/navbar}}
|heading1 = cpp/navbar heading
+
 
|content1 = cpp/navbar content
+
The SIMD library provides portable types for explicitly stating data-parallelism and structuring data for more efficient SIMD access.
|heading2 = cpp/experimental/navbar heading
+
 
|content2 = cpp/experimental/navbar content
+
An object of type {{ltt|cpp/experimental/simd/simd|simd<T>}} behaves analogously to objects of type {{tt|T}}. But while {{tt|T}} stores and manipulates one value, {{tt|simd<T>}} stores and manipulates multiple values (called ''width'' but identified as {{c|size}} for consistency with the rest of the standard library; cf. {{ltt|cpp/experimental/simd/simd_size}}).
|heading3 = cpp/experimental/parallelism/navbar heading
+
 
|content3 = cpp/experimental/parallelism/navbar content
+
Every operator and operation on {{tt|simd<T>}} acts ''element-wise'' (except for ''horizontal'' operations, which are clearly marked as such). This simple rule expresses data-parallelism and will be used by the compiler to generate SIMD instructions and/or independent execution streams.
|heading4 = cpp/experimental/simd/navbar heading
+
 
|content4 = cpp/experimental/simd/navbar content
+
The width of the types {{tt|simd<T>}} and {{ltt|cpp/experimental/simd/simd|native_simd<T>}} is determined by the implementation at compile-time. In contrast, the width of the type {{ltt|cpp/experimental/simd/simd|fixed_size_simd<T, N>}} is fixed by the developer to a certain size.
 +
 
 +
A recommended pattern for using a mix of different SIMD types with high efficiency uses {{ltt|cpp/experimental/simd/simd|native_simd}} and {{ltt|cpp/experimental/simd/rebind_simd}}:
 +
{{source|1=
 +
#include <experimental/simd>
 +
namespace stdx = std::experimental;
 +
 
 +
using floatv  = stdx::native_simd<float>;
 +
using doublev = stdx::rebind_simd_t<double, floatv>;
 +
using intv    = stdx::rebind_simd_t<int, floatv>;
 
}}
 
}}
 +
This ensures that the set of types all have the same width and thus can be interconverted. A conversion with mismatching width is not defined because it would either drop values or have to invent values. For resizing operations, the SIMD library provides the {{ltt|cpp/experimental/simd/split}} and {{ltt|cpp/experimental/simd/concat}} functions.
  
 
{{dsc begin}}
 
{{dsc begin}}
{{dsc header | experimental/simd }}
+
{{dsc header|experimental/simd}}
 
{{dsc end}}
 
{{dsc end}}
  
 
===Main classes===
 
===Main classes===
 
 
{{dsc begin}}
 
{{dsc begin}}
{{dsc inc | cpp/experimental/simd/dsc simd}}
+
{{dsc inc|cpp/experimental/simd/dsc simd}}
{{dsc inc | cpp/experimental/simd/dsc simd_mask}}
+
{{dsc inc|cpp/experimental/simd/dsc simd_mask}}
 
{{dsc end}}
 
{{dsc end}}
  
 
===ABI tags===
 
===ABI tags===
 
 
{{dsc begin}}
 
{{dsc begin}}
{{dsc namespace | std::experimental::simd_abi }}
+
{{dsc namespace|std::experimental::simd_abi}}
{{dsc class | cpp/experimental/simd/scalar |  | notes={{mark since parallelism_ts_2}}}}
+
{{dsc inc|cpp/experimental/simd/dsc scalar}}
{{dsc tclass | cpp/experimental/simd/fixed |  | notes={{mark since parallelism_ts_2}}}}
+
{{dsc inc|cpp/experimental/simd/dsc fixed_size}}
{{dsc talias | cpp/experimental/simd/compatible |  | notes={{mark since parallelism_ts_2}}}}
+
{{dsc inc|cpp/experimental/simd/dsc compatible}}
{{dsc talias | cpp/experimental/simd/native |  | notes={{mark since parallelism_ts_2}}}}
+
{{dsc inc|cpp/experimental/simd/dsc native}}
{{dsc const | cpp/experimental/simd/max_fixed_size |  | notes={{mark since parallelism_ts_2}}}}
+
{{dsc inc|cpp/experimental/simd/dsc max_fixed_size}}
 +
{{dsc inc|cpp/experimental/simd/dsc deduce}}
 
{{dsc end}}
 
{{dsc end}}
  
 
===Alignment tags===
 
===Alignment tags===
 
 
{{dsc begin}}
 
{{dsc begin}}
{{dsc class | cpp/experimental/simd/element aligned | | notes={{mark since parallelism_ts_2}} | title=element_aligned_tag <br/> element_aligned}}
+
{{dsc inc|cpp/experimental/simd/element aligned}}
{{dsc class | cpp/experimental/simd/vector aligned | | notes={{mark since parallelism_ts_2}} | title=vector_aligned_tag <br/> vector_aligned}}
+
{{dsc inc|cpp/experimental/simd/vector_aligned}}
{{dsc tclass | cpp/experimental/simd/overaligned | | notes={{mark since parallelism_ts_2}} | title=overaligned_tag <br/> overaligned}}
+
{{dsc inc|cpp/experimental/simd/overaligned}}
 
{{dsc end}}
 
{{dsc end}}
  
 
===Where expression===
 
===Where expression===
 
 
{{dsc begin}}
 
{{dsc begin}}
{{dsc tclass | cpp/experimental/simd/const_where_expression | | notes={{mark since parallelism_ts_2}}}}
+
{{dsc tclass|cpp/experimental/simd/const_where_expression|selected elements with non-mutating operations|notes={{mark since parallelism_ts_2}}}}
{{dsc tclass | cpp/experimental/simd/where_expression | | notes={{mark since parallelism_ts_2}}}}
+
{{dsc tclass|cpp/experimental/simd/where_expression|selected elements with mutating operations|notes={{mark since parallelism_ts_2}}}}
{{dsc tfun | cpp/experimental/simd/where | | notes={{mark since parallelism_ts_2}}}}
+
{{dsc tfun|cpp/experimental/simd/where|produces const_where_expression and where_expression|notes={{mark since parallelism_ts_2}}}}
 
{{dsc end}}
 
{{dsc end}}
  
 
===Casts===
 
===Casts===
 
 
{{dsc begin}}
 
{{dsc begin}}
{{dsc tfun | cpp/experimental/simd/simd_cast | | notes={{mark since parallelism_ts_2}} | title=simd_cast <br/> static_simd_cast }}
+
{{dsc tfun|cpp/experimental/simd/simd_cast|element-wise static_cast|notes={{mark since parallelism_ts_2}}|title=simd_cast<br>static_simd_cast}}
{{dsc tfun | cpp/experimental/simd/abi_cast | | notes={{mark since parallelism_ts_2}} | title=to_fixed_size <br/> to_compatible <br/> to_native }}
+
{{dsc tfun|cpp/experimental/simd/abi_cast|element-wise ABI cast|notes={{mark since parallelism_ts_2}}|title=to_fixed_size<br>to_compatible<br>to_native}}
{{dsc tfun | cpp/experimental/simd/split | | notes={{mark since parallelism_ts_2}}}}
+
{{dsc tfun|cpp/experimental/simd/split|splits single simd object to multiple ones|notes={{mark since parallelism_ts_2}}|title=split<br>split_by}}
{{dsc tfun | cpp/experimental/simd/concat | | notes={{mark since parallelism_ts_2}}}}
+
{{dsc tfun|cpp/experimental/simd/concat|concatenates multiple simd objects to a single one|notes={{mark since parallelism_ts_2}}}}
 
{{dsc end}}
 
{{dsc end}}
  
 
===Algorithms===
 
===Algorithms===
 
 
{{dsc begin}}
 
{{dsc begin}}
{{dsc tfun | cpp/experimental/simd/min | | notes={{mark since parallelism_ts_2}}}}
+
{{dsc tfun|cpp/experimental/simd/min|element-wise min operation|notes={{mark since parallelism_ts_2}}}}
{{dsc tfun | cpp/experimental/simd/max | | notes={{mark since parallelism_ts_2}}}}
+
{{dsc tfun|cpp/experimental/simd/max|element-wise max operation|notes={{mark since parallelism_ts_2}}}}
{{dsc tfun | cpp/experimental/simd/minmax | | notes={{mark since parallelism_ts_2}}}}
+
{{dsc tfun|cpp/experimental/simd/minmax|element-wise minmax operation|notes={{mark since parallelism_ts_2}}}}
{{dsc tfun | cpp/experimental/simd/clamp | | notes={{mark since parallelism_ts_2}}}}
+
{{dsc tfun|cpp/experimental/simd/clamp|element-wise clamp operation|notes={{mark since parallelism_ts_2}}}}
 
{{dsc end}}
 
{{dsc end}}
  
 
===Reduction===
 
===Reduction===
 
 
{{dsc begin}}
 
{{dsc begin}}
{{dsc tfun | cpp/experimental/simd/reduce | | notes={{mark since parallelism_ts_2}}}}
+
{{dsc tfun|cpp/experimental/simd/reduce|reduces the vector to a single element|notes={{mark since parallelism_ts_2}}|title=reduce<br>hmin<br>hmax}}
{{dsc tfun | cpp/experimental/simd/hmin |  | notes={{mark since parallelism_ts_2}}}}
+
{{dsc tfun | cpp/experimental/simd/hmax |  | notes={{mark since parallelism_ts_2}}}}
+
 
{{dsc end}}
 
{{dsc end}}
  
 
===Mask reduction===
 
===Mask reduction===
 
 
{{dsc begin}}
 
{{dsc begin}}
{{dsc tfun | cpp/experimental/simd/all_of | | notes={{mark since parallelism_ts_2}} | title=all_of <br/> any_of <br/> none_of <br/> some_of }}
+
{{dsc inc|cpp/experimental/simd/dsc all_of}}
{{dsc tfun | cpp/experimental/simd/popcount | | notes={{mark since parallelism_ts_2}}}}
+
{{dsc inc|cpp/experimental/simd/dsc popcount}}
{{dsc tfun | cpp/experimental/simd/find_first_set | | notes={{mark since parallelism_ts_2}} | title=find_first_set <br/> find_last_set}}
+
{{dsc inc|cpp/experimental/simd/dsc find_first_set}}
 
{{dsc end}}
 
{{dsc end}}
  
 
===Traits===
 
===Traits===
 
 
{{dsc begin}}
 
{{dsc begin}}
{{dsc tclass | cpp/experimental/simd/is_abi_tag |  | notes={{mark since parallelism_ts_2}} | title=is_abi_tag <br/> is_abi_tag_v}}
+
{{dsc inc|cpp/experimental/simd/dsc is_simd}}
{{dsc tclass | cpp/experimental/simd/is_simd |  | notes={{mark since parallelism_ts_2}} | title=is_simd <br/> is_simd_v}}
+
{{dsc inc|cpp/experimental/simd/dsc is_abi_tag}}
{{dsc tclass | cpp/experimental/simd/is_simd_mask |  | notes={{mark since parallelism_ts_2}} | title=is_simd_mask <br/> is_simd_mask_v}}
+
{{dsc inc|cpp/experimental/simd/dsc is_simd_flag_type}}
{{dsc tclass | cpp/experimental/simd/is_simd_flag_type |  | notes={{mark since parallelism_ts_2}} | title=is_simd_flag_type <br/> is_simd_flag_type_v}}
+
{{dsc inc|cpp/experimental/simd/dsc simd_size}}
{{dsc tclass | cpp/experimental/simd/simd_size |  | notes={{mark since parallelism_ts_2}} | title=simd_size <br/> simd_size_v}}
+
{{dsc inc|cpp/experimental/simd/dsc memory_alignment}}
{{dsc tclass | cpp/experimental/simd/memory_alignment |  | notes={{mark since parallelism_ts_2}} | title=memory_alignment <br/> memory_alignment_v}}
+
{{dsc inc|cpp/experimental/simd/dsc rebind_simd}}
{{dsc tclass | cpp/experimental/simd/abi_for_size |  | notes={{mark since parallelism_ts_2}} | title=abi_for_size <br/> abi_for_size_t}}
+
 
{{dsc end}}
 
{{dsc end}}
  
===Helpers===
+
===Math functions===
 +
All functions in {{header|cmath}}, except for the special math functions, are overloaded for {{tt|simd}}.
  
 +
===Example===
 +
{{example
 +
|code=
 +
#include <experimental/simd>
 +
#include <iostream>
 +
#include <string_view>
 +
namespace stdx = std::experimental;
 +
 +
void println(std::string_view name, auto const& a)
 +
{
 +
    std::cout << name << ": ";
 +
    for (std::size_t i{}; i != std::size(a); ++i)
 +
        std::cout << a[i] << ' ';
 +
    std::cout << '\n';
 +
}
 +
 +
template<class A>
 +
stdx::simd<int, A> my_abs(stdx::simd<int, A> x)
 +
{
 +
    where(x < 0, x) = -x;
 +
    return x;
 +
}
 +
 +
int main()
 +
{
 +
    const stdx::native_simd<int> a = 1;
 +
    println("a", a);
 +
 +
    const stdx::native_simd<int> b([](int i) { return i - 2; });
 +
    println("b", b);
 +
 +
    const auto c = a + b;
 +
    println("c", c);
 +
 +
    const auto d = my_abs(c);
 +
    println("d", d);
 +
 +
    const auto e = d * d;
 +
    println("e", e);
 +
 +
    const auto inner_product = stdx::reduce(e);
 +
    std::cout << "inner product: " << inner_product << '\n';
 +
 +
    const stdx::fixed_size_simd<long double, 16> x([](int i) { return i; });
 +
    println("x", x);
 +
    println("cos²(x) + sin²(x)", stdx::pow(stdx::cos(x), 2) + stdx::pow(stdx::sin(x), 2));
 +
}
 +
|output=
 +
a: 1 1 1 1
 +
b: -2 -1 0 1
 +
c: -1 0 1 2
 +
d: 1 0 1 2
 +
e: 1 0 1 4
 +
inner product: 6
 +
x: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
 +
cos²(x) + sin²(x): 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 +
}}
 +
 +
===See also===
 
{{dsc begin}}
 
{{dsc begin}}
{{dsc talias | cpp/experimental/simd/native_simd |  | notes={{mark since parallelism_ts_2}}}}
+
{{dsc inc|cpp/numeric/dsc valarray}}
{{dsc talias | cpp/experimental/simd/fixed_sized_simd |  | notes={{mark since parallelism_ts_2}}}}
+
{{dsc talias | cpp/experimental/simd/native_simd_mask |  | notes={{mark since parallelism_ts_2}}}}
+
{{dsc talias | cpp/experimental/simd/fixed_sized_simd_mask |  | notes={{mark since parallelism_ts_2}}}}
+
 
{{dsc end}}
 
{{dsc end}}
  
===Math functions===
+
===External links===
 +
{{elink begin}}
 +
{{elink|num=1|1=[https://github.com/VcDevel/std-simd The implementation of ISO/IEC TS 19570:2018 Section 9 "Data-Parallel Types"] — github.com}}
 +
{{elink|num=2|1=TS implementation for [https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=libstdc%2B%2B-v3/include/experimental/simd;hb=HEAD GCC/libstdc++] (std::experimental::simd has shipped since GCC 11) — gcc.gnu.org}}
 +
{{elink end}}
  
===Example===
+
{{langlinks|ja|zh}}

Latest revision as of 08:56, 13 October 2023


The SIMD library provides portable types for explicitly stating data-parallelism and structuring data for more efficient SIMD access.

An object of type simd<T> behaves analogously to objects of type T. But while T stores and manipulates one value, simd<T> stores and manipulates multiple values (their number is called the width, but it is exposed as size for consistency with the rest of the standard library; cf. simd_size).

Every operator and operation on simd<T> acts element-wise (except for horizontal operations, which are clearly marked as such). This simple rule expresses data-parallelism and will be used by the compiler to generate SIMD instructions and/or independent execution streams.

The width of the types simd<T> and native_simd<T> is determined by the implementation at compile-time. In contrast, the width of the type fixed_size_simd<T, N> is fixed by the developer to a certain size.

A recommended pattern for using a mix of different SIMD types with high efficiency uses native_simd and rebind_simd:

#include <experimental/simd>
namespace stdx = std::experimental;
 
using floatv  = stdx::native_simd<float>;
using doublev = stdx::rebind_simd_t<double, floatv>;
using intv    = stdx::rebind_simd_t<int, floatv>;

This ensures that the set of types all have the same width and thus can be interconverted. A conversion with mismatching width is not defined because it would either drop values or have to invent values. For resizing operations, the SIMD library provides the split and concat functions.

Defined in header <experimental/simd>

Main classes

simd: data-parallel vector type (class template) (parallelism TS v2)
simd_mask: data-parallel type with the element type bool (class template) (parallelism TS v2)

ABI tags

Defined in namespace std::experimental::simd_abi
scalar: tag type for storing a single element (typedef) (parallelism TS v2)
fixed_size: tag type for storing a specified number of elements (alias template) (parallelism TS v2)
compatible: tag type that ensures ABI compatibility (alias template) (parallelism TS v2)
native: tag type that is most efficient (alias template) (parallelism TS v2)
max_fixed_size: the maximum number of elements guaranteed to be supported by fixed_size (constant) (parallelism TS v2)
deduce: obtains an ABI type for a given element type and number of elements (class template) (parallelism TS v2)

Alignment tags

element_aligned_tag, element_aligned: flag indicating alignment of the load/store address to element alignment (class)
vector_aligned_tag, vector_aligned: flag indicating alignment of the load/store address to vector alignment (class)
overaligned_tag, overaligned: flag indicating alignment of the load/store address to the specified alignment (class template) (parallelism TS v2)

Where expression

const_where_expression: selected elements with non-mutating operations (class template) (parallelism TS v2)
where_expression: selected elements with mutating operations (class template) (parallelism TS v2)
where: produces const_where_expression and where_expression (function template) (parallelism TS v2)

Casts

simd_cast, static_simd_cast: element-wise static_cast (function template) (parallelism TS v2)
to_fixed_size, to_compatible, to_native: element-wise ABI cast (function template) (parallelism TS v2)
split, split_by: splits a single simd object into multiple ones (function template) (parallelism TS v2)
concat: concatenates multiple simd objects into a single one (function template) (parallelism TS v2)

Algorithms

min: element-wise min operation (function template) (parallelism TS v2)
max: element-wise max operation (function template) (parallelism TS v2)
minmax: element-wise minmax operation (function template) (parallelism TS v2)
clamp: element-wise clamp operation (function template) (parallelism TS v2)

Reduction

reduce, hmin, hmax: reduces the vector to a single element (function template) (parallelism TS v2)

Mask reduction

all_of, any_of, none_of, some_of: reductions of simd_mask to bool (function template) (parallelism TS v2)
popcount: reduction of simd_mask to the number of true values (function template) (parallelism TS v2)
find_first_set, find_last_set: reductions of simd_mask to the index of the first or last true value (function template) (parallelism TS v2)

Traits

is_simd, is_simd_mask: checks if a type is a simd or simd_mask type (class template) (parallelism TS v2)
is_abi_tag: checks if a type is an ABI tag type (class template) (parallelism TS v2)
is_simd_flag_type: checks if a type is a simd flag type (class template) (parallelism TS v2)
simd_size: obtains the number of elements of a given element type and ABI tag (class template) (parallelism TS v2)
memory_alignment: obtains an appropriate alignment for vector_aligned (class template) (parallelism TS v2)
rebind_simd, resize_simd: changes the element type or the number of elements of simd or simd_mask (class template) (parallelism TS v2)

Math functions

All functions in <cmath>, except for the special math functions, are overloaded for simd.

Example

#include <experimental/simd>
#include <iostream>
#include <string_view>
namespace stdx = std::experimental;
 
void println(std::string_view name, auto const& a)
{
    std::cout << name << ": ";
    for (std::size_t i{}; i != std::size(a); ++i)
        std::cout << a[i] << ' ';
    std::cout << '\n';
}
 
template<class A>
stdx::simd<int, A> my_abs(stdx::simd<int, A> x)
{
    where(x < 0, x) = -x;
    return x;
}
 
int main()
{
    const stdx::native_simd<int> a = 1;
    println("a", a);
 
    const stdx::native_simd<int> b([](int i) { return i - 2; });
    println("b", b);
 
    const auto c = a + b;
    println("c", c);
 
    const auto d = my_abs(c);
    println("d", d);
 
    const auto e = d * d;
    println("e", e);
 
    const auto inner_product = stdx::reduce(e);
    std::cout << "inner product: " << inner_product << '\n';
 
    const stdx::fixed_size_simd<long double, 16> x([](int i) { return i; });
    println("x", x);
    println("cos²(x) + sin²(x)", stdx::pow(stdx::cos(x), 2) + stdx::pow(stdx::sin(x), 2));
}

Possible output (the first six lines assume that native_simd<int> holds four elements):

a: 1 1 1 1 
b: -2 -1 0 1 
c: -1 0 1 2 
d: 1 0 1 2 
e: 1 0 1 4 
inner product: 6
x: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 
cos²(x) + sin²(x): 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

See also

valarray: numeric arrays, array masks and array slices (class template)

External links

1.  The implementation of ISO/IEC TS 19570:2018 Section 9 "Data-Parallel Types" — github.com
2.  TS implementation for GCC/libstdc++ (std::experimental::simd has shipped since GCC 11) — gcc.gnu.org