1 /*
2 * Copyright (c) 2017, 2020, Oracle and/or its affiliates. All rights reserved.
3 * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
4 *
5 * This code is free software; you can redistribute it and/or modify it
6 * under the terms of the GNU General Public License version 2 only, as
7 * published by the Free Software Foundation. Oracle designates this
8 * particular file as subject to the "Classpath" exception as provided
9 * by Oracle in the LICENSE file that accompanied this code.
10 *
11 * This code is distributed in the hope that it will be useful, but WITHOUT
12 * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
13 * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
14 * version 2 for more details (a copy is included in the LICENSE file that
15 * accompanied this code).
16 *
17 * You should have received a copy of the GNU General Public License version
18 * 2 along with this work; if not, write to the Free Software Foundation,
19 * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
20 *
21 * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA
22 * or visit www.oracle.com if you need additional information or have any
23 * questions.
24 */
25 package jdk.incubator.vector;
26
27 import java.nio.ByteBuffer;
28 import java.nio.ByteOrder;
29 import java.util.Arrays;
30
31 /**
32 * A
33 *
34 * <!-- The following paragraphs are shared verbatim
35 * -- between Vector.java and package-info.java -->
36 * sequence of a fixed number of <em>lanes</em>,
37 * all of some fixed
38 * {@linkplain Vector#elementType() <em>element type</em>}
39 * such as {@code byte}, {@code long}, or {@code float}.
40 * Each lane contains an independent value of the element type.
41 * Operations on vectors are typically
42 * <a href="Vector.html#lane-wise"><em>lane-wise</em></a>,
43 * distributing some scalar operator (such as
44 * {@linkplain Vector#add(Vector) addition})
45 * across the lanes of the participating vectors,
46 * usually generating a vector result whose lanes contain the various
47 * scalar results. When run on a supporting platform, lane-wise
746 * sign bits as zero (as on some computers) then this API would reach
747 * for big-endian fictions to create unified addressing of vector
748 * bytes.
749 *
750 * <h2><a id="memory"></a>Memory operations</h2>
751 *
752 * As was already mentioned, vectors can be loaded from memory and
753 * stored back. An optional mask can control which individual memory
754 * locations are read from or written to. The shape of a vector
755 * determines how much memory it will occupy.
756 *
757 * An implementation typically has the property, in the absence of
758 * masking, that lanes are stored as a dense sequence of back-to-back
759 * values in memory, the same as a dense (gap-free) series of single
760 * scalar values in an array of the scalar type.
761 *
762 * In such cases memory order corresponds exactly to lane order. The
763 * first vector lane value occupies the first position in memory, and so on,
764 * up to the length of the vector. Further, the memory order of stored
765 * vector lanes corresponds to increasing index values in a Java array or
766 * in a {@link java.nio.ByteBuffer}.
767 *
768 * <p> Byte order for lane storage is chosen such that the stored
769 * vector values can be read or written as single primitive values,
770 * within the array or buffer that holds the vector, producing the
771 * same values as the lane-wise values within the vector.
772 * This fact is independent of the convenient fiction that lane values
773 * inside of vectors are stored in little-endian order.
774 *
775 * <p> For example,
776 * {@link FloatVector#fromArray(VectorSpecies, float[], int)
777 * FloatVector.fromArray(fsp,fa,i)}
778 * creates and returns a float vector of some particular species {@code fsp},
779 * with elements loaded from some float array {@code fa}.
780 * The first lane is loaded from {@code fa[i]} and the last lane
781  * is loaded from {@code fa[i+VL-1]}, where {@code VL}
782 * is the length of the vector as derived from the species {@code fsp}.
783 * Then, {@link FloatVector#add(Vector) fv=fv.add(fv2)}
784 * will produce another float vector of that species {@code fsp},
785 * given a vector {@code fv2} of the same species {@code fsp}.
786 * Next, {@link FloatVector#compare(VectorOperators.Comparison,float)
788  * mnz=fv.compare(NE, 0.0f)} tests which lanes of the result are non-zero,
788 * yielding a mask {@code mnz}. The non-zero lanes (and only those
789 * lanes) can then be stored back into the original array elements
790 * using the statement
1022 * These operations are:
1023 * <ul>
1024 *
1025 * <li>The {@link #slice(int,Vector) slice()} family of methods,
1026  * which extract a contiguous slice of {@code VLENGTH} fields from a given
1027  * origin point within a concatenated pair of vectors (see the sketch after this list).
1028 *
1029 * <li>The {@link #unslice(int,Vector,int) unslice()} family of
1030 * methods, which insert a contiguous slice of {@code VLENGTH} fields
1031 * into a concatenated pair of vectors at a given origin point.
1032 *
1033 * <li>The {@link #rearrange(VectorShuffle) rearrange()} family of
1034 * methods, which select an arbitrary set of {@code VLENGTH} lanes
1035 * from one or two input vectors, and assemble them in an arbitrary
1036 * order. The selection and order of lanes is controlled by a
1037  * {@code VectorShuffle} object, which acts as a routing table
1038 * mapping source lanes to destination lanes. A {@code VectorShuffle}
1039 * can encode a mathematical permutation as well as many other
1040 * patterns of data movement.
1041 *
1042 * </ul>
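 * <p> For example (a non-normative sketch, assuming a 128-bit
 * {@code int} species of four lanes), slicing a concatenated pair
 * of vectors works as follows:
 * <pre>{@code
 * var sp = IntVector.SPECIES_128;   // four int lanes
 * var a  = IntVector.fromArray(sp, new int[] {1, 2, 3, 4}, 0);
 * var b  = IntVector.fromArray(sp, new int[] {5, 6, 7, 8}, 0);
 * // Lanes are taken from the virtual concatenation a:b,
 * // starting at origin lane 2 of the first vector:
 * var r  = a.slice(2, b);           // yields [3, 4, 5, 6]
 * }</pre>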
1043 * <p> Some vector operations are not lane-wise, but rather move data
1044 * across lane boundaries. Such operations are typically rare in SIMD
1045 * code, though they are sometimes necessary for specific algorithms
1046 * that manipulate data formats at a low level, and/or require SIMD
1047 * data to move in complex local patterns. (Local movement in a small
1048 * window of a large array of data is relatively unusual, although
1049 * some highly patterned algorithms call for it.) In this API such
1050 * methods are always clearly recognizable, so that simpler lane-wise
1051 * reasoning can be confidently applied to the rest of the code.
1052 *
1053 * <p> In some cases, vector lane boundaries are discarded and
1054 * "redrawn from scratch", so that data in a given input lane might
1055 * appear (in several parts) distributed through several output lanes,
1056 * or (conversely) data from several input lanes might be consolidated
1057 * into a single output lane. The fundamental method which can redraw
1058  * lane boundaries is
1059 * {@link #reinterpretShape(VectorSpecies,int) reinterpretShape()}.
1060 * Built on top of this method, certain convenience methods such
1061 * as {@link #reinterpretAsBytes() reinterpretAsBytes()} or
2672 *
2673  * <p> This method returns the result of the following pseudocode:
2674 * <pre>{@code
2675 * Vector<E> r1 = this.rearrange(s.wrapIndexes());
2676 * // or else: r1 = this.rearrange(s, s.laneIsValid());
2677 * Vector<E> r2 = v.rearrange(s.wrapIndexes());
2678 * return r2.blend(r1,s.laneIsValid());
2679 * }</pre>
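 *
 * <p> For example (a non-normative sketch, assuming a 128-bit
 * {@code int} species of four lanes):
 * <pre>{@code
 * var sp = IntVector.SPECIES_128;
 * var a  = IntVector.fromArray(sp, new int[] {10, 20, 30, 40}, 0);
 * var b  = IntVector.fromArray(sp, new int[] {50, 60, 70, 80}, 0);
 * // Non-negative indexes select from a; the exceptional indexes
 * // -1 and -3 wrap to lanes 3 and 1 of b.
 * var sh = VectorShuffle.fromValues(sp, 0, -1, 2, -3);
 * var r  = a.rearrange(sh, b);      // yields [10, 80, 30, 60]
 * }</pre>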
2680 *
2681 * @param s the shuffle controlling lane selection from both input vectors
2682 * @param v the second input vector
2683 * @return the rearrangement of lane elements of this vector and
2684 * a second input vector
2685 * @see #rearrange(VectorShuffle)
2686 * @see #rearrange(VectorShuffle,VectorMask)
2687 * @see VectorShuffle#laneIsValid()
2688 * @see #slice(int,Vector)
2689 */
2690 public abstract Vector<E> rearrange(VectorShuffle<E> s, Vector<E> v);
2691
2692 /**
2693 * Using index values stored in the lanes of this vector,
2694  * assemble values stored in a second vector {@code v}.
2695 * The second vector thus serves as a table, whose
2696 * elements are selected by indexes in the current vector.
2697 *
2698 * This is a cross-lane operation that rearranges the lane
2699 * elements of the argument vector, under the control of
2700 * this vector.
2701 *
2702 * For each lane {@code N} of this vector, and for each lane
2703 * value {@code I=this.lane(N)} in this vector,
2704 * the output lane {@code N} obtains the value from
2705 * the argument vector at lane {@code I}.
2706 *
2707 * In this way, the result contains only values stored in the
2708 * argument vector {@code v}, but presented in an order which
2709 * depends on the index values in {@code this}.
2710 *
2711 * The result is the same as the expression
2837 * vector without performing any value conversions.
2838 *
2839 * <p> Depending on the selected species, this operation may
2840 * either <a href="Vector.html#expansion">expand or contract</a>
2841 * its logical result, in which case a non-zero {@code part}
2842 * number can further control the selection and steering of the
2843 * logical result into the physical output vector.
2844 *
2845 * <p>
2846 * The underlying bits of this vector are copied to the resulting
2847 * vector without modification, but those bits, before copying,
2848  * may be truncated if this vector's bit-size is greater than the
2849  * desired vector's bit-size, or filled with zero bits if this
2850  * vector's bit-size is less than the desired vector's bit-size.
2851 *
2852  * <p> If the old and new species have different shapes, this is a
2853 * <em>shape-changing</em> operation, and may have special
2854 * implementation costs.
2855 *
2856 * <p> The method behaves as if this vector is stored into a byte
2857 * buffer or array using little-endian byte ordering and then the
2858 * desired vector is loaded from the same byte buffer or array
2859 * using the same ordering.
2860 *
2861 * <p> The following pseudocode illustrates the behavior:
2862 * <pre>{@code
2863 * int domSize = this.byteSize();
2864 * int ranSize = species.vectorByteSize();
2865 * int M = (domSize > ranSize ? domSize / ranSize : ranSize / domSize);
2866 * assert Math.abs(part) < M;
2867 * assert (part == 0) || (part > 0) == (domSize > ranSize);
2868 * byte[] ra = new byte[Math.max(domSize, ranSize)];
2869 * if (domSize > ranSize) { // expansion
2870  *     this.intoByteArray(ra, 0, ByteOrder.nativeOrder());
2871  *     int origin = part * ranSize;
2872  *     return species.fromByteArray(ra, origin, ByteOrder.nativeOrder());
2873  * } else { // contraction or size-invariant
2874  *     int origin = (-part) * domSize;
2875  *     this.intoByteArray(ra, origin, ByteOrder.nativeOrder());
2876  *     return species.fromByteArray(ra, 0, ByteOrder.nativeOrder());
2877 * }
2878 * }</pre>
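 *
 * <p> For example (a non-normative sketch, assuming 256-bit and
 * 128-bit {@code int} species), an expanding reinterpretation splits
 * a 256-bit vector of eight {@code int} lanes into two 128-bit halves:
 * <pre>{@code
 * var a = IntVector.fromArray(IntVector.SPECIES_256,
 *                             new int[] {1, 2, 3, 4, 5, 6, 7, 8}, 0);
 * // The 256-bit logical result does not fit into 128 bits, so a
 * // positive part number selects which half is kept:
 * Vector<Integer> lo = a.reinterpretShape(IntVector.SPECIES_128, 0);  // [1, 2, 3, 4]
 * Vector<Integer> hi = a.reinterpretShape(IntVector.SPECIES_128, 1);  // [5, 6, 7, 8]
 * }</pre>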
2879 *
2880 * @apiNote Although this method is defined as if the vectors in
2881 * question were loaded or stored into memory, memory semantics
2882  * has little or nothing to do with the actual implementation.
2883 * The appeal to little-endian ordering is simply a shorthand
2884 * for what could otherwise be a large number of detailed rules
2885 * concerning the mapping between lane-structured vectors and
2886 * byte-structured vectors.
2887 *
2888 * @param species the desired vector species
2889 * @param part the <a href="Vector.html#expansion">part number</a>
2890 * of the result, or zero if neither expanding nor contracting
2891 * @param <F> the boxed element type of the species
2892 * @return a vector transformed, by shape and element type, from this vector
2893 * @see Vector#convertShape(VectorOperators.Conversion,VectorSpecies,int)
2894 * @see Vector#castShape(VectorSpecies,int)
2895 * @see VectorSpecies#partLimit(VectorSpecies,boolean)
2896 */
2897 public abstract <F> Vector<F> reinterpretShape(VectorSpecies<F> species, int part);
2898
2899 /**
2900 * Views this vector as a vector of the same shape
2901 * and contents but a lane type of {@code byte},
2902 * where the bytes are extracted from the lanes
2903 * according to little-endian order.
2904 * It is a convenience method for the expression
2905  * {@code reinterpretShape(species().withLanes(byte.class), 0)}.
2906 * It may be considered an inverse to the various
2907 * methods which consolidate bytes into larger lanes
2908 * within the same vector, such as
2909 * {@link Vector#reinterpretAsInts()}.
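 *
 * <p> For example (a non-normative sketch, assuming a 128-bit
 * {@code int} species):
 * <pre>{@code
 * var iv = IntVector.fromArray(IntVector.SPECIES_128,
 *                              new int[] {0x04030201, 0, 0, 0}, 0);
 * ByteVector bv = iv.reinterpretAsBytes();
 * // Under the little-endian lane convention, the first four
 * // byte lanes of bv are 1, 2, 3, 4.
 * byte b0 = bv.lane(0);   // 1
 * }</pre>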
2910 *
2911 * @return a {@code ByteVector} with the same shape and information content
2912 * @see Vector#reinterpretShape(VectorSpecies,int)
2913 * @see IntVector#intoByteArray(byte[], int, ByteOrder)
2914 * @see FloatVector#intoByteArray(byte[], int, ByteOrder)
2915 * @see VectorSpecies#withLanes(Class)
2916 */
2917 public abstract ByteVector reinterpretAsBytes();
2918
2919 /**
2920 * Reinterprets this vector as a vector of the same shape
2921 * and contents but a lane type of {@code short},
2922 * where the lanes are assembled from successive bytes
2923 * according to little-endian order.
2924 * It is a convenience method for the expression
2925  * {@code reinterpretShape(species().withLanes(short.class), 0)}.
2926 * It may be considered an inverse to {@link Vector#reinterpretAsBytes()}.
2927 *
2928 * @return a {@code ShortVector} with the same shape and information content
2929 */
2930 public abstract ShortVector reinterpretAsShorts();
2931
2932 /**
2933 * Reinterprets this vector as a vector of the same shape
2934 * and contents but a lane type of {@code int},
3302 * Checks that this vector has the given species,
3303 * and returns this vector unchanged.
3304 * The effect is similar to this pseudocode:
3305 * {@code species == species()
3306 * ? this
3307 * : throw new ClassCastException()}.
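 *
 * <p> For example (a non-normative sketch):
 * <pre>{@code
 * Vector<Integer> v = IntVector.zero(IntVector.SPECIES_256);
 * Vector<Integer> w = v.check(IntVector.SPECIES_256);   // returns v unchanged
 * // v.check(IntVector.SPECIES_128) would throw ClassCastException
 * }</pre>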
3308 *
3309 * @param species the required species
3310 * @param <F> the boxed element type of the required species
3311 * @return the same vector
3312 * @throws ClassCastException if the vector has the wrong species
3313 * @see Vector#check(Class)
3314 * @see VectorMask#check(VectorSpecies)
3315 * @see VectorShuffle#check(VectorSpecies)
3316 */
3317 public abstract <F> Vector<F> check(VectorSpecies<F> species);
3318
3319     // Array stores
3320
3321 /**
3322 * Stores this vector into a byte array starting at an offset
3323 * using explicit byte order.
3324 * <p>
3325 * Bytes are extracted from primitive lane elements according
3326 * to the specified byte ordering.
3327 * The lanes are stored according to their
3328 * <a href="Vector.html#lane-order">memory ordering</a>.
3329 * <p>
3330 * This method behaves as if it calls
3331 * {@link #intoByteBuffer(ByteBuffer,int,ByteOrder,VectorMask)
3332 * intoByteBuffer()} as follows:
3333 * <pre>{@code
3334 * var bb = ByteBuffer.wrap(a);
3335 * var m = maskAll(true);
3336 * intoByteBuffer(bb, offset, bo, m);
3337 * }</pre>
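 *
 * <p> For example (a non-normative sketch, assuming a 128-bit
 * {@code float} species):
 * <pre>{@code
 * float[] fa = {1.0f, 2.0f, 3.0f, 4.0f};
 * FloatVector fv = FloatVector.fromArray(FloatVector.SPECIES_128, fa, 0);
 * byte[] bytes = new byte[FloatVector.SPECIES_128.vectorByteSize()];
 * fv.intoByteArray(bytes, 0, ByteOrder.LITTLE_ENDIAN);
 * }</pre>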
3338 *
3339 * @param a the byte array
3340 * @param offset the offset into the array
3341 * @param bo the intended byte order
3342 * @throws IndexOutOfBoundsException
3343 * if {@code offset+N*ESIZE < 0}
3344 * or {@code offset+(N+1)*ESIZE > a.length}
3345 * for any lane {@code N} in the vector
3346 */
3347 public abstract void intoByteArray(byte[] a, int offset,
3348 ByteOrder bo);
3349
3350 /**
3351 * Stores this vector into a byte array starting at an offset
3352 * using explicit byte order and a mask.
3353 * <p>
3354 * Bytes are extracted from primitive lane elements according
3355 * to the specified byte ordering.
3356 * The lanes are stored according to their
3357 * <a href="Vector.html#lane-order">memory ordering</a>.
3358 * <p>
3359 * This method behaves as if it calls
3360 * {@link #intoByteBuffer(ByteBuffer,int,ByteOrder,VectorMask)
3361 * intoByteBuffer()} as follows:
3362 * <pre>{@code
3363 * var bb = ByteBuffer.wrap(a);
3364 * intoByteBuffer(bb, offset, bo, m);
3365 * }</pre>
3366 *
3367 * @param a the byte array
3368 * @param offset the offset into the array
3369 * @param bo the intended byte order
3370 * @param m the mask controlling lane selection
3371 * @throws IndexOutOfBoundsException
3372 * if {@code offset+N*ESIZE < 0}
3373 * or {@code offset+(N+1)*ESIZE > a.length}
3374 * for any lane {@code N} in the vector
3375 * where the mask is set
3376 */
3377 public abstract void intoByteArray(byte[] a, int offset,
3378 ByteOrder bo,
3379 VectorMask<E> m);
3380
3381 /**
3382 * Stores this vector into a byte buffer starting at an offset
3383 * using explicit byte order.
3384 * <p>
3385 * Bytes are extracted from primitive lane elements according
3386 * to the specified byte ordering.
3387 * The lanes are stored according to their
3388 * <a href="Vector.html#lane-order">memory ordering</a>.
3389 * <p>
3390 * This method behaves as if it calls
3391 * {@link #intoByteBuffer(ByteBuffer,int,ByteOrder,VectorMask)
3392 * intoByteBuffer()} as follows:
3393 * <pre>{@code
3394 * var m = maskAll(true);
3395 * intoByteBuffer(bb, offset, bo, m);
3396 * }</pre>
3397 *
3398 * @param bb the byte buffer
3399  * @param offset the offset into the byte buffer
3400 * @param bo the intended byte order
3401 * @throws IndexOutOfBoundsException
3402 * if {@code offset+N*ESIZE < 0}
3403 * or {@code offset+(N+1)*ESIZE > bb.limit()}
3404 * for any lane {@code N} in the vector
3405 * @throws java.nio.ReadOnlyBufferException
3406 * if the byte buffer is read-only
3407 */
3408 public abstract void intoByteBuffer(ByteBuffer bb, int offset, ByteOrder bo);
3409
3410 /**
3411 * Stores this vector into a byte buffer starting at an offset
3412 * using explicit byte order and a mask.
3413 * <p>
3414 * Bytes are extracted from primitive lane elements according
3415 * to the specified byte ordering.
3416 * The lanes are stored according to their
3417 * <a href="Vector.html#lane-order">memory ordering</a>.
3418 * <p>
3419 * The following pseudocode illustrates the behavior, where
3420  * the primitive element type is not {@code byte},
3421 * {@code EBuffer} is the primitive buffer type, {@code ETYPE} is the
3422 * primitive element type, and {@code EVector} is the primitive
3423 * vector type for this vector:
3424 * <pre>{@code
3425 * EBuffer eb = bb.duplicate()
3426 * .position(offset)
3427 * .order(bo).asEBuffer();
3428 * ETYPE[] a = this.toArray();
3429 * for (int n = 0; n < a.length; n++) {
3430 * if (m.laneIsSet(n)) {
3431 * eb.put(n, a[n]);
3432 * }
3433 * }
3434 * }</pre>
3435  * When the primitive element type is {@code byte}, the primitive
3436  * byte buffer is obtained as follows, where the operation on the buffer
3437 * remains the same as in the prior pseudocode:
3438 * <pre>{@code
3439 * ByteBuffer eb = bb.duplicate()
3440 * .position(offset);
3441 * }</pre>
3442 *
3443 * @implNote
3444 * This operation is likely to be more efficient if
3445 * the specified byte order is the same as
3446 * {@linkplain ByteOrder#nativeOrder()
3447 * the platform native order},
3448 * since this method will not need to reorder
3449 * the bytes of lane values.
3450 * In the special case where {@code ETYPE} is
3451 * {@code byte}, the byte order argument is
3452 * ignored.
3453 *
3454 * @param bb the byte buffer
3455  * @param offset the offset into the byte buffer
3456 * @param bo the intended byte order
3457 * @param m the mask controlling lane selection
3458 * @throws IndexOutOfBoundsException
3459 * if {@code offset+N*ESIZE < 0}
3460 * or {@code offset+(N+1)*ESIZE > bb.limit()}
3461 * for any lane {@code N} in the vector
3462 * where the mask is set
3463 * @throws java.nio.ReadOnlyBufferException
3464 * if the byte buffer is read-only
3465 */
3466 public abstract void intoByteBuffer(ByteBuffer bb, int offset,
3467 ByteOrder bo, VectorMask<E> m);
3468
3469 /**
3470 * Returns a packed array containing all the lane values.
3471 * The array length is the same as the vector length.
3472 * The element type of the array is the same as the element
3473 * type of the vector.
3474 * The array elements are stored in lane order.
3475 * Overrides of this method on subtypes of {@code Vector}
3476 * which specify the element type have an accurately typed
3477 * array result.
3478 *
3479 * @apiNote
3480 * Usually {@linkplain FloatVector#toArray() strongly typed access}
3481 * is preferable, if you are working with a vector
3482 * subtype that has a known element type.
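 *
 * <p> For example (a non-normative sketch using the strongly typed
 * override on {@code FloatVector}):
 * <pre>{@code
 * float[] fa = {1.0f, 2.0f, 3.0f, 4.0f};
 * FloatVector fv = FloatVector.fromArray(FloatVector.SPECIES_128, fa, 0);
 * float[] copy = fv.toArray();   // a fresh array of length 4
 * }</pre>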
3483 *
3484 * @return an accurately typed array containing
3485 * the lane values of this vector
3486 * @see ByteVector#toArray()
3487 * @see IntVector#toArray()
|
1 /*
2 * Copyright (c) 2017, 2022, Oracle and/or its affiliates. All rights reserved.
3 * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
4 *
5 * This code is free software; you can redistribute it and/or modify it
6 * under the terms of the GNU General Public License version 2 only, as
7 * published by the Free Software Foundation. Oracle designates this
8 * particular file as subject to the "Classpath" exception as provided
9 * by Oracle in the LICENSE file that accompanied this code.
10 *
11 * This code is distributed in the hope that it will be useful, but WITHOUT
12 * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
13 * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
14 * version 2 for more details (a copy is included in the LICENSE file that
15 * accompanied this code).
16 *
17 * You should have received a copy of the GNU General Public License version
18 * 2 along with this work; if not, write to the Free Software Foundation,
19 * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
20 *
21 * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA
22 * or visit www.oracle.com if you need additional information or have any
23 * questions.
24 */
25 package jdk.incubator.vector;
26
27 import jdk.incubator.foreign.MemorySegment;
28
29 import java.nio.ByteOrder;
30 import java.util.Arrays;
31
32 /**
33 * A
34 *
35 * <!-- The following paragraphs are shared verbatim
36 * -- between Vector.java and package-info.java -->
37 * sequence of a fixed number of <em>lanes</em>,
38 * all of some fixed
39 * {@linkplain Vector#elementType() <em>element type</em>}
40 * such as {@code byte}, {@code long}, or {@code float}.
41 * Each lane contains an independent value of the element type.
42 * Operations on vectors are typically
43 * <a href="Vector.html#lane-wise"><em>lane-wise</em></a>,
44 * distributing some scalar operator (such as
45 * {@linkplain Vector#add(Vector) addition})
46 * across the lanes of the participating vectors,
47 * usually generating a vector result whose lanes contain the various
48 * scalar results. When run on a supporting platform, lane-wise
747 * sign bits as zero (as on some computers) then this API would reach
748 * for big-endian fictions to create unified addressing of vector
749 * bytes.
750 *
751 * <h2><a id="memory"></a>Memory operations</h2>
752 *
753 * As was already mentioned, vectors can be loaded from memory and
754 * stored back. An optional mask can control which individual memory
755 * locations are read from or written to. The shape of a vector
756 * determines how much memory it will occupy.
757 *
758 * An implementation typically has the property, in the absence of
759 * masking, that lanes are stored as a dense sequence of back-to-back
760 * values in memory, the same as a dense (gap-free) series of single
761 * scalar values in an array of the scalar type.
762 *
763 * In such cases memory order corresponds exactly to lane order. The
764 * first vector lane value occupies the first position in memory, and so on,
765 * up to the length of the vector. Further, the memory order of stored
766 * vector lanes corresponds to increasing index values in a Java array or
767 * in a {@link jdk.incubator.foreign.MemorySegment}.
768 *
769 * <p> Byte order for lane storage is chosen such that the stored
770 * vector values can be read or written as single primitive values,
771 * within the array or segment that holds the vector, producing the
772 * same values as the lane-wise values within the vector.
773 * This fact is independent of the convenient fiction that lane values
774 * inside of vectors are stored in little-endian order.
775 *
776 * <p> For example,
777 * {@link FloatVector#fromArray(VectorSpecies, float[], int)
778 * FloatVector.fromArray(fsp,fa,i)}
779 * creates and returns a float vector of some particular species {@code fsp},
780 * with elements loaded from some float array {@code fa}.
781 * The first lane is loaded from {@code fa[i]} and the last lane
782  * is loaded from {@code fa[i+VL-1]}, where {@code VL}
783 * is the length of the vector as derived from the species {@code fsp}.
784 * Then, {@link FloatVector#add(Vector) fv=fv.add(fv2)}
785 * will produce another float vector of that species {@code fsp},
786 * given a vector {@code fv2} of the same species {@code fsp}.
787 * Next, {@link FloatVector#compare(VectorOperators.Comparison,float)
788  * mnz=fv.compare(NE, 0.0f)} tests which lanes of the result are non-zero,
789 * yielding a mask {@code mnz}. The non-zero lanes (and only those
790 * lanes) can then be stored back into the original array elements
791 * using the statement
1023 * These operations are:
1024 * <ul>
1025 *
1026 * <li>The {@link #slice(int,Vector) slice()} family of methods,
1027  * which extract a contiguous slice of {@code VLENGTH} fields from
1028 * a given origin point within a concatenated pair of vectors.
1029 *
1030 * <li>The {@link #unslice(int,Vector,int) unslice()} family of
1031 * methods, which insert a contiguous slice of {@code VLENGTH} fields
1032 * into a concatenated pair of vectors at a given origin point.
1033 *
1034 * <li>The {@link #rearrange(VectorShuffle) rearrange()} family of
1035 * methods, which select an arbitrary set of {@code VLENGTH} lanes
1036 * from one or two input vectors, and assemble them in an arbitrary
1037 * order. The selection and order of lanes is controlled by a
1038  * {@code VectorShuffle} object, which acts as a routing table
1039 * mapping source lanes to destination lanes. A {@code VectorShuffle}
1040 * can encode a mathematical permutation as well as many other
1041 * patterns of data movement.
1042 *
1043 * <li>The {@link #compress(VectorMask)} and {@link #expand(VectorMask)}
1044 * methods, which select up to {@code VLENGTH} lanes from an
1045 * input vector, and assemble them in lane order. The selection of lanes
1046  * is controlled by a {@code VectorMask}, whose set lanes map source
1047  * lanes to destination lanes by compression or expansion in lane order.
1048 *
1049 * </ul>
1050 * <p> Some vector operations are not lane-wise, but rather move data
1051 * across lane boundaries. Such operations are typically rare in SIMD
1052 * code, though they are sometimes necessary for specific algorithms
1053 * that manipulate data formats at a low level, and/or require SIMD
1054 * data to move in complex local patterns. (Local movement in a small
1055 * window of a large array of data is relatively unusual, although
1056 * some highly patterned algorithms call for it.) In this API such
1057 * methods are always clearly recognizable, so that simpler lane-wise
1058 * reasoning can be confidently applied to the rest of the code.
1059 *
1060 * <p> In some cases, vector lane boundaries are discarded and
1061 * "redrawn from scratch", so that data in a given input lane might
1062 * appear (in several parts) distributed through several output lanes,
1063 * or (conversely) data from several input lanes might be consolidated
1064 * into a single output lane. The fundamental method which can redraw
1065  * lane boundaries is
1066 * {@link #reinterpretShape(VectorSpecies,int) reinterpretShape()}.
1067 * Built on top of this method, certain convenience methods such
1068 * as {@link #reinterpretAsBytes() reinterpretAsBytes()} or
2679 *
2680  * <p> This method returns the result of the following pseudocode:
2681 * <pre>{@code
2682 * Vector<E> r1 = this.rearrange(s.wrapIndexes());
2683 * // or else: r1 = this.rearrange(s, s.laneIsValid());
2684 * Vector<E> r2 = v.rearrange(s.wrapIndexes());
2685 * return r2.blend(r1,s.laneIsValid());
2686 * }</pre>
2687 *
2688 * @param s the shuffle controlling lane selection from both input vectors
2689 * @param v the second input vector
2690 * @return the rearrangement of lane elements of this vector and
2691 * a second input vector
2692 * @see #rearrange(VectorShuffle)
2693 * @see #rearrange(VectorShuffle,VectorMask)
2694 * @see VectorShuffle#laneIsValid()
2695 * @see #slice(int,Vector)
2696 */
2697 public abstract Vector<E> rearrange(VectorShuffle<E> s, Vector<E> v);
2698
2699 /**
2700 * Compresses the lane elements of this vector selecting lanes
2701 * under the control of a specific mask.
2702 *
2703 * This is a cross-lane operation that compresses the lane
2704 * elements of this vector as selected by the specified mask.
2705 *
2706 * For each lane {@code N} of the mask, if the mask at
2707 * lane {@code N} is set, the element at lane {@code N}
2708  * of the input vector is selected and stored into the output
2709  * vector contiguously, starting from lane {@code 0}.
2710 * All the upper remaining lanes, if any, of the output
2711 * vector are set to zero.
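 *
 * <p> For example (a non-normative sketch, assuming a 128-bit
 * {@code int} species of four lanes):
 * <pre>{@code
 * var sp = IntVector.SPECIES_128;
 * var v  = IntVector.fromArray(sp, new int[] {1, 2, 3, 4}, 0);
 * var m  = VectorMask.fromValues(sp, true, false, true, false);
 * var r  = v.compress(m);   // yields [1, 3, 0, 0]
 * }</pre>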
2712 *
2713 * @param m the mask controlling the compression
2714 * @return the compressed lane elements of this vector
2715 * @since 19
2716 */
2717 public abstract Vector<E> compress(VectorMask<E> m);
2718
2719 /**
2720 * Expands the lane elements of this vector
2721 * under the control of a specific mask.
2722 *
2723 * This is a cross-lane operation that expands the contiguous lane
2724 * elements of this vector into lanes of an output vector
2725 * as selected by the specified mask.
2726 *
2727 * For each lane {@code N} of the mask, if the mask at
2728  * lane {@code N} is set, the next contiguous element of the input vector
2729 * starting from lane {@code 0} is selected and stored into the output
2730 * vector at lane {@code N}.
2731 * All the remaining lanes, if any, of the output vector are set to zero.
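 *
 * <p> For example (a non-normative sketch, assuming a 128-bit
 * {@code int} species of four lanes):
 * <pre>{@code
 * var sp = IntVector.SPECIES_128;
 * var v  = IntVector.fromArray(sp, new int[] {1, 2, 3, 4}, 0);
 * var m  = VectorMask.fromValues(sp, false, true, false, true);
 * var r  = v.expand(m);     // yields [0, 1, 0, 2]
 * }</pre>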
2732 *
2733  * @param m the mask controlling the expansion
2734 * @return the expanded lane elements of this vector
2735 * @since 19
2736 */
2737 public abstract Vector<E> expand(VectorMask<E> m);
2738
2739 /**
2740 * Using index values stored in the lanes of this vector,
2741  * assemble values stored in a second vector {@code v}.
2742 * The second vector thus serves as a table, whose
2743 * elements are selected by indexes in the current vector.
2744 *
2745 * This is a cross-lane operation that rearranges the lane
2746 * elements of the argument vector, under the control of
2747 * this vector.
2748 *
2749 * For each lane {@code N} of this vector, and for each lane
2750 * value {@code I=this.lane(N)} in this vector,
2751 * the output lane {@code N} obtains the value from
2752 * the argument vector at lane {@code I}.
2753 *
2754 * In this way, the result contains only values stored in the
2755 * argument vector {@code v}, but presented in an order which
2756 * depends on the index values in {@code this}.
2757 *
2758 * The result is the same as the expression
2884 * vector without performing any value conversions.
2885 *
2886 * <p> Depending on the selected species, this operation may
2887 * either <a href="Vector.html#expansion">expand or contract</a>
2888 * its logical result, in which case a non-zero {@code part}
2889 * number can further control the selection and steering of the
2890 * logical result into the physical output vector.
2891 *
2892 * <p>
2893 * The underlying bits of this vector are copied to the resulting
2894 * vector without modification, but those bits, before copying,
2895  * may be truncated if this vector's bit-size is greater than the
2896  * desired vector's bit-size, or filled with zero bits if this
2897  * vector's bit-size is less than the desired vector's bit-size.
2898 *
2899  * <p> If the old and new species have different shapes, this is a
2900 * <em>shape-changing</em> operation, and may have special
2901 * implementation costs.
2902 *
2903  * <p> The method behaves as if this vector is stored into a byte
2904  * array using little-endian byte ordering and then the desired
2905  * vector is loaded from the same byte array using the same ordering.
2906 *
2907 * <p> The following pseudocode illustrates the behavior:
2908 * <pre>{@code
2909 * int domSize = this.byteSize();
2910 * int ranSize = species.vectorByteSize();
2911 * int M = (domSize > ranSize ? domSize / ranSize : ranSize / domSize);
2912 * assert Math.abs(part) < M;
2913 * assert (part == 0) || (part > 0) == (domSize > ranSize);
2914 * MemorySegment ms = MemorySegment.ofArray(new byte[Math.max(domSize, ranSize)]);
2915 * if (domSize > ranSize) { // expansion
2916  *     this.intoMemorySegment(ms, 0, ByteOrder.nativeOrder());
2917  *     int origin = part * ranSize;
2918  *     return species.fromMemorySegment(ms, origin, ByteOrder.nativeOrder());
2919  * } else { // contraction or size-invariant
2920  *     int origin = (-part) * domSize;
2921  *     this.intoMemorySegment(ms, origin, ByteOrder.nativeOrder());
2922  *     return species.fromMemorySegment(ms, 0, ByteOrder.nativeOrder());
2923 * }
2924 * }</pre>
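 *
 * <p> For example (a non-normative sketch, assuming 128-bit and
 * 256-bit {@code int} species), a contracting reinterpretation steers
 * a 128-bit vector into either half of a 256-bit vector, with the
 * unused half zero-filled:
 * <pre>{@code
 * var a = IntVector.fromArray(IntVector.SPECIES_128, new int[] {1, 2, 3, 4}, 0);
 * Vector<Integer> lo = a.reinterpretShape(IntVector.SPECIES_256, 0);   // [1, 2, 3, 4, 0, 0, 0, 0]
 * Vector<Integer> hi = a.reinterpretShape(IntVector.SPECIES_256, -1);  // [0, 0, 0, 0, 1, 2, 3, 4]
 * }</pre>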
2925 *
2926 * @apiNote Although this method is defined as if the vectors in
2927 * question were loaded or stored into memory, memory semantics
2928  * has little or nothing to do with the actual implementation.
2929 * The appeal to little-endian ordering is simply a shorthand
2930 * for what could otherwise be a large number of detailed rules
2931 * concerning the mapping between lane-structured vectors and
2932 * byte-structured vectors.
2933 *
2934 * @param species the desired vector species
2935 * @param part the <a href="Vector.html#expansion">part number</a>
2936 * of the result, or zero if neither expanding nor contracting
2937 * @param <F> the boxed element type of the species
2938 * @return a vector transformed, by shape and element type, from this vector
2939 * @see Vector#convertShape(VectorOperators.Conversion,VectorSpecies,int)
2940 * @see Vector#castShape(VectorSpecies,int)
2941 * @see VectorSpecies#partLimit(VectorSpecies,boolean)
2942 */
2943 public abstract <F> Vector<F> reinterpretShape(VectorSpecies<F> species, int part);
2944
2945 /**
2946 * Views this vector as a vector of the same shape
2947 * and contents but a lane type of {@code byte},
2948 * where the bytes are extracted from the lanes
2949 * according to little-endian order.
2950 * It is a convenience method for the expression
2951  * {@code reinterpretShape(species().withLanes(byte.class), 0)}.
2952 * It may be considered an inverse to the various
2953 * methods which consolidate bytes into larger lanes
2954 * within the same vector, such as
2955 * {@link Vector#reinterpretAsInts()}.
2956 *
2957 * @return a {@code ByteVector} with the same shape and information content
2958 * @see Vector#reinterpretShape(VectorSpecies,int)
2959 * @see IntVector#intoMemorySegment(jdk.incubator.foreign.MemorySegment, long, java.nio.ByteOrder)
2960 * @see FloatVector#intoMemorySegment(jdk.incubator.foreign.MemorySegment, long, java.nio.ByteOrder)
2961 * @see VectorSpecies#withLanes(Class)
2962 */
2963 public abstract ByteVector reinterpretAsBytes();
2964
2965 /**
2966 * Reinterprets this vector as a vector of the same shape
2967 * and contents but a lane type of {@code short},
2968 * where the lanes are assembled from successive bytes
2969 * according to little-endian order.
2970 * It is a convenience method for the expression
2971  * {@code reinterpretShape(species().withLanes(short.class), 0)}.
2972 * It may be considered an inverse to {@link Vector#reinterpretAsBytes()}.
2973 *
2974 * @return a {@code ShortVector} with the same shape and information content
2975 */
2976 public abstract ShortVector reinterpretAsShorts();
2977
2978 /**
2979 * Reinterprets this vector as a vector of the same shape
2980 * and contents but a lane type of {@code int},
3348 * Checks that this vector has the given species,
3349 * and returns this vector unchanged.
3350 * The effect is similar to this pseudocode:
3351 * {@code species == species()
3352 * ? this
3353 * : throw new ClassCastException()}.
3354 *
3355 * @param species the required species
3356 * @param <F> the boxed element type of the required species
3357 * @return the same vector
3358 * @throws ClassCastException if the vector has the wrong species
3359 * @see Vector#check(Class)
3360 * @see VectorMask#check(VectorSpecies)
3361 * @see VectorShuffle#check(VectorSpecies)
3362 */
3363 public abstract <F> Vector<F> check(VectorSpecies<F> species);
3364
3365     // Array stores
3366
3367 /**
3368 * Stores this vector into a {@linkplain MemorySegment memory segment}
3369 * starting at an offset using explicit byte order.
3370 * <p>
3371 * Bytes are extracted from primitive lane elements according
3372 * to the specified byte ordering.
3373 * The lanes are stored according to their
3374 * <a href="Vector.html#lane-order">memory ordering</a>.
3375 * <p>
3376 * This method behaves as if it calls
3377 * {@link #intoMemorySegment(MemorySegment,long,ByteOrder,VectorMask)
3378 * intoMemorySegment()} as follows:
3379 * <pre>{@code
3380 * var m = maskAll(true);
3381 * intoMemorySegment(ms, offset, bo, m);
3382 * }</pre>
3383 *
3384 * @param ms the memory segment
3385 * @param offset the offset into the memory segment
3386 * @param bo the intended byte order
3387 * @throws IndexOutOfBoundsException
3388 * if {@code offset+N*ESIZE < 0}
3389 * or {@code offset+(N+1)*ESIZE > ms.byteSize()}
3390 * for any lane {@code N} in the vector
3391 * @throws UnsupportedOperationException
3392 * if the memory segment is read-only
3393 * @throws IllegalArgumentException if the memory segment is a heap segment that is
3394 * not backed by a {@code byte[]} array.
3395 * @throws IllegalStateException if the memory segment's session is not alive,
3396 * or if access occurs from a thread other than the thread owning the session.
3397 * @since 19
3398 */
3399 public abstract void intoMemorySegment(MemorySegment ms, long offset, ByteOrder bo);
3400
3401 /**
3402 * Stores this vector into a {@linkplain MemorySegment memory segment}
3403 * starting at an offset using explicit byte order and a mask.
3404 * <p>
3405 * Bytes are extracted from primitive lane elements according
3406 * to the specified byte ordering.
3407 * The lanes are stored according to their
3408 * <a href="Vector.html#lane-order">memory ordering</a>.
3409 * <p>
3410 * The following pseudocode illustrates the behavior, where
3411 * {@code JAVA_E} is the layout of the primitive element type, {@code ETYPE} is the
3412 * primitive element type, and {@code EVector} is the primitive
3413 * vector type for this vector:
3414 * <pre>{@code
3415 * ETYPE[] a = this.toArray();
3416  * var slice = ms.asSlice(offset);
3417  * for (int n = 0; n < a.length; n++) {
3418  *     if (m.laneIsSet(n)) {
3419  *         slice.setAtIndex(ValueLayout.JAVA_E.withBitAlignment(8), n, a[n]);
3420  *     }
3421  * }
3422 * }</pre>
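 *
 * <p> For example (a non-normative sketch, assuming a 128-bit
 * {@code int} species and a heap segment):
 * <pre>{@code
 * var sp = IntVector.SPECIES_128;
 * var v  = IntVector.fromArray(sp, new int[] {1, 2, 3, 4}, 0);
 * var ms = MemorySegment.ofArray(new byte[sp.vectorByteSize()]);
 * var m  = sp.indexInRange(0, 3);   // lanes 0..2 set, lane 3 unset
 * // Only the bytes of lanes 0..2 are written; the bytes of lane 3
 * // are left untouched in the segment.
 * v.intoMemorySegment(ms, 0, ByteOrder.nativeOrder(), m);
 * }</pre>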
3423 *
3424 * @implNote
3425 * This operation is likely to be more efficient if
3426 * the specified byte order is the same as
3427 * {@linkplain ByteOrder#nativeOrder()
3428 * the platform native order},
3429 * since this method will not need to reorder
3430 * the bytes of lane values.
3431 * In the special case where {@code ETYPE} is
3432 * {@code byte}, the byte order argument is
3433 * ignored.
3434 *
3435 * @param ms the memory segment
3436 * @param offset the offset into the memory segment
3437 * @param bo the intended byte order
3438 * @param m the mask controlling lane selection
3439 * @throws IndexOutOfBoundsException
3440 * if {@code offset+N*ESIZE < 0}
3441 * or {@code offset+(N+1)*ESIZE > ms.byteSize()}
3442 * for any lane {@code N} in the vector
3443 * where the mask is set
3444 * @throws UnsupportedOperationException
3445 * if the memory segment is read-only
3446 * @throws IllegalArgumentException if the memory segment is a heap segment that is
3447 * not backed by a {@code byte[]} array.
3448 * @throws IllegalStateException if the memory segment's session is not alive,
3449 * or if access occurs from a thread other than the thread owning the session.
3450 * @since 19
3451 */
3452 public abstract void intoMemorySegment(MemorySegment ms, long offset,
3453 ByteOrder bo, VectorMask<E> m);
3454
3455 /**
3456 * Returns a packed array containing all the lane values.
3457 * The array length is the same as the vector length.
3458 * The element type of the array is the same as the element
3459 * type of the vector.
3460 * The array elements are stored in lane order.
3461 * Overrides of this method on subtypes of {@code Vector}
3462 * which specify the element type have an accurately typed
3463 * array result.
3464 *
3465 * @apiNote
3466 * Usually {@linkplain FloatVector#toArray() strongly typed access}
3467 * is preferable, if you are working with a vector
3468 * subtype that has a known element type.
3469 *
3470 * @return an accurately typed array containing
3471 * the lane values of this vector
3472 * @see ByteVector#toArray()
3473 * @see IntVector#toArray()
|