Commit 87390cd

committed
minor typo fixes and clarify differences with std::linalg::copy
1 parent 6a4a361 commit 87390cd

3 files changed

Lines changed: 60 additions & 30 deletions

P0009/wg21/data/index.yaml

Lines changed: 1 addition & 1 deletion
@@ -104768,7 +104768,7 @@ references:
104768104768
- family: Mark Hoemmen, Daisy Hollman, Christian Trott, Daniel Sunderland, Nevin Liber, Alicia Klinvex, Li-Ta Lo, Damien Lebrun-Grandie, Graham Lopez, Peter Caday, Sarah Knepper, Piotr Luszczek, Timothy Costa
104769104769
issued:
104770104770
year: 2023
104771-
URL: https://wg21.link/p1673r1
104771+
URL: https://wg21.link/p1673r13
104772104772
- id: P1674R0
104773104773
citation-label: P1674R0
104774104774
title: "Evolving a Standard C++ Linear Algebra Library from the BLAS"

mdspan_copy/mdspan_copy.html

Lines changed: 43 additions & 21 deletions
@@ -4,7 +4,7 @@
44
<meta charset="utf-8" />
55
<meta name="generator" content="mpark/wg21" />
66
<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" />
7-
<meta name="dcterms.date" content="2024-03-22" />
7+
<meta name="dcterms.date" content="2024-04-01" />
88
<title>Copy and fill for mdspan</title>
99
<style>
1010
code{white-space: pre-wrap;}
@@ -25,7 +25,7 @@
2525
}
2626
@media print {
2727
pre > code.sourceCode { white-space: pre-wrap; }
28-
pre > code.sourceCode > span { display: inline-block; text-indent: -5em; padding-left: 5em; }
28+
pre > code.sourceCode > span { text-indent: -5em; padding-left: 5em; }
2929
}
3030
pre.numberSource code
3131
{ counter-reset: source-line 0; }
@@ -425,7 +425,7 @@ <h1 class="title" style="text-align:center">Copy and fill for
425425
</tr>
426426
<tr>
427427
<td>Date: </td>
428-
<td>2024-03-22</td>
428+
<td>2024-04-01</td>
429429
</tr>
430430
<tr>
431431
<td style="vertical-align:top">Project: </td>
@@ -465,15 +465,15 @@ <h1 id="toctitle">Contents</h1>
465465
</ul>
466466
</div>
467467
<h1 data-number="1" id="motivation"><span class="header-section-number">1</span> Motivation<a href="#motivation" class="self-link"></a></h1>
468-
<p>C++23 introduced <code>mdspan</code> (<span class="citation" data-cites="P0009R18">[<a href="#ref-P0009R18" role="doc-biblioref">P0009R18</a>]</span>), a nonowning multidmensional
468+
<p>C++23 introduced <code>mdspan</code> (<span class="citation" data-cites="P0009R18">[<a href="#ref-P0009R18" role="doc-biblioref">P0009R18</a>]</span>), a non-owning multidimensional
469469
array abstraction that has a customizable layout. Layout customization
470470
was originally motivated in <span class="citation" data-cites="P0009R18">[<a href="#ref-P0009R18" role="doc-biblioref">P0009R18</a>]</span> with considerations for
471471
interoperability and performance, particularly on different
472472
architectures. Moreover, <span class="citation" data-cites="P2630R4">[<a href="#ref-P2630R4" role="doc-biblioref">P2630R4</a>]</span> introduced
473473
<code>submdspan</code>, a slicing function that can yield arbitrarily
474474
strided layouts. However, without standard library support, copying
475-
efficiently between mdspans with mixes of complex layouts is challenging
476-
for users.</p>
475+
efficiently between <code>mdspan</code>s with mixes of complex layouts
476+
is challenging for users.</p>
477477
<p>Many applications, including high-performance computing (HPC), image
478478
processing, computer graphics, etc that benefit from <code>mdspan</code>
479479
also would benefit from basic memory operations provided in standard
@@ -487,21 +487,21 @@ <h1 data-number="1" id="motivation"><span class="header-section-number">1</span>
487487
that represent the span of the <code>mdspan</code>. Additionally, it’s
488488
not entirely clear what this would entail.
489489
<code>std::linalg::copy</code> (<span class="citation" data-cites="P1673R13">[<a href="#ref-P1673R13" role="doc-biblioref">P1673R13</a>]</span>) is limited to
490-
<code>mdspans</code> of rank 2 or lower.</p>
490+
<code>mdspan</code>s of rank 2 or lower.</p>
491491
<p>Moreover, the manner in which an <code>mdspan</code> is copied (or
492492
filled) is highly performance sensitive, particularly in regards to
493-
caching behavior when traversing mdspan memory. A naïve user
494-
implementation is easy to get wrong in addition to being tedious for
495-
higher rank <code>mdspan</code>s. Ideally, an implementation would be
496-
free to use information about the layout of the <code>mdspan</code>
493+
caching behavior when traversing <code>mdspan</code> memory. A naive
494+
user implementation is easy to get wrong in addition to being tedious
495+
for higher rank <code>mdspan</code>s. Ideally, an implementation would
496+
be free to use information about the layout of the <code>mdspan</code>
497497
known at compile time to perform optimizations; e.g. a continuous span
498-
<code>mdspan</code> copy for trivial types could be implementeed with a
498+
<code>mdspan</code> copy for trivial types could be implemented with a
499499
<code>memcpy</code>.</p>
500500
<p>Finally, providing these generic algorithms would also enable these
501501
operations for types that are representable by <code>mdspan</code>. For
502502
example, this would naturally include <code>mdarray</code>, which is
503503
convertible to <code>mdspan</code>, or for user-defined types whose view
504-
of memory corresponds to <code>mdspans</code> (e.g. an image class or
504+
of memory corresponds to <code>mdspan</code>s (e.g. an image class or
505505
something similar).</p>
506506
<h2 data-number="1.1" id="safety"><span class="header-section-number">1.1</span> Safety<a href="#safety" class="self-link"></a></h2>
507507
<p>Due to the closed nature of <code>mdspan</code> extents, copy
@@ -555,18 +555,40 @@ <h2 data-number="2.2" id="existing-copy-in-stdlinalg"><span class="header-sectio
555555
<p><span class="citation" data-cites="P1673R13">[<a href="#ref-P1673R13" role="doc-biblioref">P1673R13</a>]</span> introduced several linear
556556
algebra operations including <code>std::linalg::copy</code>. This
557557
operation only applies to <code>mdspan</code>s with <span class="math inline"><em>r</em><em>a</em><em>n</em><em>k</em> ≤ 2</span>.
558-
This paper is proposing a version of <code>copy</code> that is
559-
constrained to a superset of <code>std::linalg::copy</code>.</p>
560-
<p>Right now the strict addition of <code>copy</code> would potentially
561-
cause the following code to be ambiguous, due to ADL-finding
558+
This paper is proposing a version of <code>copy</code> that is not
559+
constrained by the number of ranks and differs from
560+
<code>std::linalg::copy</code> in some important ways outlined below.</p>
561+
<p>Note that right now the strict addition of <code>copy</code> would
562+
potentially cause the following code to be ambiguous, due to ADL-finding
562563
<code>std::copy</code>:</p>
563564
<div class="sourceCode" id="cb1"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a><span class="kw">using</span> std<span class="op">::</span>linalg<span class="op">::</span>copy;</span>
564565
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a>copy<span class="op">(</span>mds1, mds2<span class="op">)</span>;</span></code></pre></div>
565566
<p>One possibility would be to remove <code>std::linalg::copy</code>, as
566-
it is a subset of the proposed <code>std::copy</code>, though as of now
567-
this paper does not propose to do this.</p>
567+
it is a subset of the proposed <code>std::copy</code>. This was rejected
568+
by the paper authors because of certain requirements in
569+
[linalg.reqs.alg] – that is:</p>
570+
<blockquote>
571+
<p>The function may make arbitrarily many objects of any linear algebra
572+
value type, value-initializing or direct-initializing them with any
573+
existing object of that type.</p>
574+
</blockquote>
575+
<p>This requirement is likely undesirable for a generalized copy
576+
algorithm.</p>
577+
<p>There is a similar argument against simply generalizing
578+
<code>std::linalg::copy</code>. In addition to the freedom of
579+
<code>std::linalg::copy</code> to value-initialize or
580+
direct-initialize arbitrarily many objects, using the linear algebra version of copy
581+
would require the use of unnecessary includes and namespaces. It seems
582+
not very ergonomic for a user to have to use
583+
<code>std::linalg::copy</code> and include <code>&lt;linalg&gt;</code>
584+
even if the <code>mdspan</code> operations they are performing are
585+
unrelated to linear algebra.</p>
568586
<h2 data-number="2.3" id="what-the-proposal-does-not-include"><span class="header-section-number">2.3</span> What the proposal does not
569587
include<a href="#what-the-proposal-does-not-include" class="self-link"></a></h2>
588+
<p>There are a few additions that are analogous to existing standard
589+
algorithms that are not included in this proposal, both to keep the
590+
proposal small and because some of these algorithms do not make sense in
591+
the context of <code>mdspan</code>s:</p>
570592
<ul>
571593
<li><code>std::move</code>: Perhaps this should be included for
572594
completeness’s sake. However, it doesn’t seem applicable to the typical
@@ -587,7 +609,7 @@ <h2 data-number="2.3" id="what-the-proposal-does-not-include"><span class="heade
587609
<h1 data-number="3" id="wording"><span class="header-section-number">3</span> Wording<a href="#wording" class="self-link"></a></h1>
588610
<div class="sourceCode" id="cb2"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a><span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> SrcElementType, <span class="kw">class</span> SrcExtents, <span class="kw">class</span> SrcLayoutPolicy, <span class="kw">class</span> SrcAccessorPolicy,</span>
589611
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a> <span class="kw">class</span> DstElementType, <span class="kw">class</span> DstExtents, <span class="kw">class</span> DstLayoutPolicy, <span class="kw">class</span> DstAccessorPolicy<span class="op">&gt;</span></span>
590-
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a><span class="dt">void</span> copy<span class="op">(</span>mdspan<span class="op">&lt;</span>SrcElementType, SrcExtents, SrcLayoutPolicy, SrcAccessorPolicy<span class="op">&gt;</span> src, </span>
612+
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a><span class="dt">void</span> copy<span class="op">(</span>mdspan<span class="op">&lt;</span>SrcElementType, SrcExtents, SrcLayoutPolicy, SrcAccessorPolicy<span class="op">&gt;</span> src,</span>
591613
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a> mdspan<span class="op">&lt;</span>DstElementType, DstExtents, DstLayoutPolicy, DstAccessorPolicy<span class="op">&gt;</span> dst<span class="op">)</span>;</span>
592614
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a></span>
593615
<span id="cb2-6"><a href="#cb2-6" aria-hidden="true" tabindex="-1"></a><span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> ExecutionPolicy, <span class="kw">class</span> SrcElementType, <span class="kw">class</span> SrcExtents, <span class="kw">class</span> SrcLayoutPolicy, <span class="kw">class</span> SrcAccessorPolicy,</span>
@@ -650,7 +672,7 @@ <h1 data-number="4" id="references"><span class="header-section-number">4</span>
650672
Daniel Sunderland, Nevin Liber, Alicia Klinvex, Li-Ta Lo, Damien
651673
Lebrun-Grandie, Graham Lopez, Peter Caday, Sarah Knepper, Piotr
652674
Luszczek, Timothy Costa. 2023. A free function linear algebra interface
653-
based on the BLAS. <a href="https://wg21.link/p1673r1"><div class="csl-block">https://wg21.link/p1673r1</div></a></div>
675+
based on the BLAS. <a href="https://wg21.link/p1673r13"><div class="csl-block">https://wg21.link/p1673r13</div></a></div>
654676
</div>
655677
<div id="ref-P1684R5" class="csl-entry" role="listitem">
656678
<div class="csl-left-margin">[P1684R5] </div><div class="csl-right-inline">Christian Trott, Daisy Hollman, Mark Hoemmen,

mdspan_copy/mdspan_copy.md

Lines changed: 16 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -14,15 +14,15 @@ author:
1414

1515
# Motivation
1616

17-
C++23 introduced `mdspan` ([@P0009R18]), a nonowning multidmensional array abstraction that has a customizable layout. Layout customization was originally motivated in [@P0009R18] with considerations for interoperability and performance, particularly on different architectures. Moreover, [@P2630R4] introduced `submdspan`, a slicing function that can yield arbitrarily strided layouts. However, without standard library support, copying efficiently between mdspans with mixes of complex layouts is challenging for users.
17+
C++23 introduced `mdspan` ([@P0009R18]), a non-owning multidimensional array abstraction that has a customizable layout. Layout customization was originally motivated in [@P0009R18] with considerations for interoperability and performance, particularly on different architectures. Moreover, [@P2630R4] introduced `submdspan`, a slicing function that can yield arbitrarily strided layouts. However, without standard library support, copying efficiently between `mdspan`s with mixes of complex layouts is challenging for users.
1818

1919
Many applications, including high-performance computing (HPC), image processing, computer graphics, etc., that benefit from `mdspan` also would benefit from basic memory operations provided in standard algorithms such as copy and fill. Indeed, the authors found that a copy algorithm would have been quite useful in their implementation of the copying `mdarray` ([@P1684R5]) constructor. A more constrained form of `copy` is also included in the standard linear algebra library ([@P1673R13]).
2020

21-
However, existing standard library facilities are not sufficient here. Currently, `mdspan` does not have iterators or ranges that represent the span of the `mdspan`. Additionally, it's not entirely clear what this would entail. `std::linalg::copy` ([@P1673R13]) is limited to `mdspans` of rank 2 or lower.
21+
However, existing standard library facilities are not sufficient here. Currently, `mdspan` does not have iterators or ranges that represent the span of the `mdspan`. Additionally, it's not entirely clear what this would entail. `std::linalg::copy` ([@P1673R13]) is limited to `mdspan`s of rank 2 or lower.
2222

23-
Moreover, the manner in which an `mdspan` is copied (or filled) is highly performance sensitive, particularly in regards to caching behavior when traversing mdspan memory. A naïve user implementation is easy to get wrong in addition to being tedious for higher rank `mdspan`s. Ideally, an implementation would be free to use information about the layout of the `mdspan` known at compile time to perform optimizations; e.g. a continuous span `mdspan` copy for trivial types could be implementeed with a `memcpy`.
23+
Moreover, the manner in which an `mdspan` is copied (or filled) is highly performance sensitive, particularly with regard to caching behavior when traversing `mdspan` memory. A naive user implementation is easy to get wrong in addition to being tedious for higher rank `mdspan`s. Ideally, an implementation would be free to use information about the layout of the `mdspan` known at compile time to perform optimizations; e.g. a copy between `mdspan`s whose elements are contiguous in memory could be implemented with a `memcpy` for trivially copyable types.
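As a rough illustration of the point above, the following sketch shows a hand-written rank-2 copy with a `memcpy` fast path. It assumes a hypothetical `strided_view2d` type standing in for a rank-2 `mdspan` and a hypothetical helper `copy2d`; neither is part of the standard or of this proposal, and a real implementation could choose its traversal order and fast paths quite differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for a rank-2 mdspan with a strided layout.
template <class T>
struct strided_view2d {
  T* data;
  std::size_t extent0, extent1;  // number of rows and columns
  std::size_t stride0, stride1;  // element strides per dimension

  T& operator()(std::size_t i, std::size_t j) const {
    return data[i * stride0 + j * stride1];
  }

  // True when the elements form one dense row-major block.
  bool is_dense_row_major() const {
    return stride1 == 1 && stride0 == extent1;
  }
};

// Hypothetical helper (not the proposed std::copy): copies src into dst.
template <class T>
void copy2d(strided_view2d<const T> src, strided_view2d<T> dst) {
  // Precondition: both views have the same extents.
  assert(src.extent0 == dst.extent0 && src.extent1 == dst.extent1);
  const std::size_t count = src.extent0 * src.extent1;
  if (count == 0) return;

  // Fast path: both sides are dense row-major and T is trivially copyable.
  if constexpr (std::is_trivially_copyable_v<T>) {
    if (src.is_dense_row_major() && dst.is_dense_row_major()) {
      std::memcpy(dst.data, src.data, count * sizeof(T));
      return;
    }
  }

  // Generic path: element-wise copy. This loop order favors row-major
  // storage; for other layouts a different order may be more cache friendly.
  for (std::size_t i = 0; i < src.extent0; ++i) {
    for (std::size_t j = 0; j < src.extent1; ++j) {
      dst(i, j) = src(i, j);
    }
  }
}
```

Even in this rank-2 sketch the caller has to reason about strides, traversal order, and type properties, and the amount of bookkeeping grows with rank.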
2424

25-
Finally, providing these generic algorithms would also enable these operations for types that are representable by `mdspan`. For example, this would naturally include `mdarray`, which is convertible to `mdspan`, or for user-defined types whose view of memory corresponds to `mdspans` (e.g. an image class or something similar).
25+
Finally, providing these generic algorithms would also enable these operations for types that are representable by `mdspan`. For example, this would naturally include `mdarray`, which is convertible to `mdspan`, or for user-defined types whose view of memory corresponds to `mdspan`s (e.g. an image class or something similar).
2626

2727
## Safety
2828

@@ -47,19 +47,27 @@ We settled on `<mdspan>` because as proposed this is a relatively light-weight a
4747

4848
## Existing `copy` in `std::linalg`
4949

50-
[@P1673R13] introduced several linear algebra operations including `std::linalg::copy`. This operation only applies to `mdspan`s with $rank \le 2$. This paper is proposing a version of `copy` that is constrained to a superset of `std::linalg::copy`.
50+
[@P1673R13] introduced several linear algebra operations including `std::linalg::copy`. This operation only applies to `mdspan`s with $rank \le 2$. This paper is proposing a version of `copy` that is not constrained by the number of ranks and differs from `std::linalg::copy` in some important ways outlined below.
5151

52-
Right now the strict addition of `copy` would potentially cause the following code to be ambiguous, due to ADL-finding `std::copy`:
52+
Note that right now the strict addition of `copy` would potentially cause the following code to be ambiguous, due to ADL-finding `std::copy`:
5353

5454
```c++
5555
using std::linalg::copy;
5656
copy(mds1, mds2);
5757
```
5858
59-
One possibility would be to remove `std::linalg::copy`, as it is a subset of the proposed `std::copy`, though as of now this paper does not propose to do this.
59+
One possibility would be to remove `std::linalg::copy`, as it is a subset of the proposed `std::copy`. This was rejected by the paper authors because of certain requirements in \[linalg.reqs.alg\] -- that is:
60+
61+
> The function may make arbitrarily many objects of any linear algebra value type, value-initializing or direct-initializing them with any existing object of that type.
62+
63+
This requirement is likely undesirable for a generalized copy algorithm.
64+
65+
There is a similar argument against simply generalizing `std::linalg::copy`. In addition to the freedom of `std::linalg::copy` to value-initialize or direct-initialize arbitrarily many objects, using the linear algebra version of copy would require the use of unnecessary includes and namespaces. It is not very ergonomic for a user to have to call `std::linalg::copy` and include `<linalg>` even if the `mdspan` operations they are performing are unrelated to linear algebra.
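The ADL concern raised earlier in this section can be reproduced in miniature. The sketch below uses hypothetical stand-ins (`lib` for `std`, `lib::view` for `mdspan`, and two non-template `copy` functions) rather than the real library, so it only illustrates the lookup mechanism; whether the actual overloads end up ambiguous depends on their constraints.

```cpp
#include <cassert>

namespace lib {  // hypothetical stand-in for namespace std
struct view {};  // hypothetical stand-in for std::mdspan

// Stand-in for the proposed std::copy overload.
inline int copy(view, view) { return 1; }

namespace linalg {
// Stand-in for std::linalg::copy.
inline int copy(view, view) { return 2; }
}  // namespace linalg
}  // namespace lib

// Qualified calls name exactly one function, so they are unambiguous.
inline int call_generic(lib::view a, lib::view b) { return lib::copy(a, b); }
inline int call_linalg(lib::view a, lib::view b) { return lib::linalg::copy(a, b); }

// The problematic pattern, left commented out because it does not compile
// with these stand-ins:
//
//   using lib::linalg::copy;
//   copy(a, b);  // the using-declaration finds lib::linalg::copy, and ADL
//                // on lib::view additionally finds lib::copy: ambiguous

inline void demo() {
  lib::view a, b;
  assert(call_generic(a, b) == 1);
  assert(call_linalg(a, b) == 2);
}
```

With these stand-ins, fully qualifying the call is enough to avoid the ambiguity.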
6066
6167
## What the proposal does not include
6268
69+
There are a few additions that are analogous to existing standard algorithms that are not included in this proposal, both to keep the proposal small and because some of these algorithms do not make sense in the context of `mdspan`s:
70+
6371
* `std::move`: Perhaps this should be included for completeness's sake. However, it doesn't seem applicable to the typical usage of `mdspan`.
6472
* `(copy|fill)_n`: As a multidimensional view `mdspan` does not in general follow a specific ordering. Memory ordering may not be obvious to calling code, so it's not even clear how these would work. Any applications intending to copy a subset of `mdspan` should call `copy` on the result of `submdspan`.
6573
* `copy_backward`: As above, there is no specific ordering. A similar effect could be achieved via transformations with a custom layout, similar to `layout_transpose` in [@P1673R13].
@@ -70,7 +78,7 @@ One possibility would be to remove `std::linalg::copy`, as it is a subset of the
7078
```c++
7179
template<class SrcElementType, class SrcExtents, class SrcLayoutPolicy, class SrcAccessorPolicy,
7280
class DstElementType, class DstExtents, class DstLayoutPolicy, class DstAccessorPolicy>
73-
void copy(mdspan<SrcElementType, SrcExtents, SrcLayoutPolicy, SrcAccessorPolicy> src,
81+
void copy(mdspan<SrcElementType, SrcExtents, SrcLayoutPolicy, SrcAccessorPolicy> src,
7482
mdspan<DstElementType, DstExtents, DstLayoutPolicy, DstAccessorPolicy> dst);
7583
7684
template<class ExecutionPolicy, class SrcElementType, class SrcExtents, class SrcLayoutPolicy, class SrcAccessorPolicy,
