/*
 * Copyright (c) 2017, 2020, Oracle and/or its affiliates. All rights reserved.
 * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
 *
 * This code is free software; you can redistribute it and/or modify it
 * under the terms of the GNU General Public License version 2 only, as
 * published by the Free Software Foundation. Oracle designates this
 * particular file as subject to the "Classpath" exception as provided
 * by Oracle in the LICENSE file that accompanied this code.
 *
 * This code is distributed in the hope that it will be useful, but WITHOUT
 * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
 * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
 * version 2 for more details (a copy is included in the LICENSE file that
 * accompanied this code).
 *
 * You should have received a copy of the GNU General Public License version
 * 2 along with this work; if not, write to the Free Software Foundation,
 * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
 *
 * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA
 * or visit www.oracle.com if you need additional information or have any
 * questions.
 */
package jdk.incubator.vector;

import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

/**
 * A
 *
 * <!-- The following paragraphs are shared verbatim
 *   -- between Vector.java and package-info.java -->
 * sequence of a fixed number of <em>lanes</em>,
 * all of some fixed
 * {@linkplain Vector#elementType() <em>element type</em>}
 * such as {@code byte}, {@code long}, or {@code float}.
 * Each lane contains an independent value of the element type.
 * Operations on vectors are typically
 * <a href="Vector.html#lane-wise"><em>lane-wise</em></a>,
 * distributing some scalar operator (such as
 * {@linkplain Vector#add(Vector) addition})
 * across the lanes of the participating vectors,
 * usually generating a vector result whose lanes contain the various
 * scalar results. When run on a supporting platform, lane-wise
 * operations can be executed in parallel by the hardware. This style
 * of parallelism is called <em>Single Instruction Multiple Data</em>
 * (SIMD) parallelism.
 *
 * <p> In the SIMD style of programming, most of the operations within
 * a vector lane are unconditional, but the effect of conditional
 * execution may be achieved using
 * <a href="Vector.html#masking"><em>masked operations</em></a>
 * such as {@link Vector#blend(Vector,VectorMask) blend()},
 * under the control of an associated {@link VectorMask}.
 * Data motion other than strictly lane-wise flow is achieved using
 * <a href="Vector.html#cross-lane"><em>cross-lane</em></a>
 * operations, often under the control of an associated
 * {@link VectorShuffle}.
 * Lane data and/or whole vectors can be reformatted using various
 * kinds of lane-wise
 * {@linkplain Vector#convert(VectorOperators.Conversion,int) conversions},
 * and byte-wise reformatting
 * {@linkplain Vector#reinterpretShape(VectorSpecies,int) reinterpretations},
 * often under the control of a reflective {@link VectorSpecies}
 * object which selects an alternative vector format different
 * from that of the input vector.
 *
 * <p> {@code Vector<E>} declares a set of vector operations (methods)
 * that are common to all element types. These common operations
 * include generic access to lane values, data selection and movement,
 * reformatting, and certain arithmetic and logical operations (such as addition
 * or comparison) that are common to all primitive types.
 *
 * <p> <a href="Vector.html#subtypes">Public subtypes of {@code Vector}</a>
 * correspond to specific
 * element types. These declare further operations that are specific
 * to that element type, including unboxed access to lane values,
 * bitwise operations on values of integral element types, or
 * transcendental operations on values of floating point element
 * types.
 *
 * <p> Some lane-wise operations, such as the {@code add} operator, are defined as
 * a full-service named operation, where a corresponding method on {@code Vector}
 * comes in masked and unmasked overloadings, and (in subclasses) also comes in
 * covariant overrides (returning the subclass) and additional scalar-broadcast
 * overloadings (both masked and unmasked).
 *
 * Other lane-wise operations, such as the {@code min} operator, are defined as a
 * partially serviced (not full-service) named operation, where a corresponding
 * method on {@code Vector} and/or a subclass provides some but not all possible
 * overloadings and overrides (commonly the unmasked variant with scalar-broadcast
 * overloadings).
 *
 * Finally, all lane-wise operations (those named as previously described,
 * or otherwise unnamed method-wise) have a corresponding
 * {@link VectorOperators.Operator operator token}
 * declared as a static constant on {@link VectorOperators}.
 * Each operator token defines a symbolic Java expression for the operation,
 * such as {@code a + b} for the
 * {@link VectorOperators#ADD ADD} operator token.
 * General lane-wise operation-token accepting methods, such as for a
 * {@linkplain Vector#lanewise(VectorOperators.Unary) unary lane-wise}
 * operation, are provided on {@code Vector} and come in the same variants as
 * a full-service named operation.
 *
 * <p>This package contains a public subtype of {@link Vector}
 * corresponding to each supported element type:
 * {@link ByteVector}, {@link ShortVector},
 * {@link IntVector}, {@link LongVector},
 * {@link FloatVector}, and {@link DoubleVector}.
 *
 * <!-- The preceding paragraphs are shared verbatim
 *   -- between Vector.java and package-info.java -->
 *
 * <p><a id="ETYPE"></a> The {@linkplain #elementType element type} of a vector,
 * referred to as {@code ETYPE}, is one of the primitive types
 * {@code byte}, {@code short}, {@code int}, {@code long},
 * {@code float}, or {@code double}.
 *
 * <p> The type {@code E} in {@code Vector<E>} is the <em>boxed</em> version
 * of {@code ETYPE}. For example, in the type {@code Vector<Integer>}, the {@code E}
 * parameter is {@code Integer} and the {@code ETYPE} is {@code int}. In such a
 * vector, each lane carries a primitive {@code int} value. This pattern continues
 * for the other primitive types as well. (See also sections {@jls 5.1.7} and
 * {@jls 5.1.8} of <cite>The Java Language Specification</cite>.)
 *
 * <p><a id="VLENGTH"></a> The {@linkplain #length() length} of a vector
 * is the lane count, the number of lanes it contains.
 *
 * This number is also called {@code VLENGTH} when the context makes
 * clear which vector it belongs to. Each vector has its own fixed
 * {@code VLENGTH} but different instances of vectors may have
 * different lengths. {@code VLENGTH} is an important number, because
 * it estimates the SIMD performance gain of a single vector operation
 * as compared to scalar execution of the {@code VLENGTH} scalar
 * operators which underlie the vector operation.
 *
 * <h2><a id="species"></a>Shapes and species</h2>
 *
 * The information capacity of a vector is determined by its
 * {@linkplain #shape() <em>vector shape</em>}, also called its
 * {@code VSHAPE}.
 * Each possible {@code VSHAPE} is represented by
 * a member of the {@link VectorShape} enumeration, and represents
 * an implementation format shared in common by all vectors of
 * that shape. Thus, the {@linkplain #bitSize() size in bits}
 * of a vector is determined by appealing to its vector shape.
 *
 * <p> Some Java platforms give special support to only one shape,
 * while others support several. A typical platform is not likely
 * to support all the shapes described by this API. For this reason,
 * most vector operations work on a single input shape and
 * produce the same shape on output. Operations which change
 * shape are clearly documented as <em>shape-changing</em>,
 * while the majority of operations are <em>shape-invariant</em>,
 * to avoid disadvantaging platforms which support only one shape.
 * There are queries to discover, for the current Java platform,
 * the {@linkplain VectorShape#preferredShape() preferred shape}
 * for general SIMD computation, or the
 * {@linkplain VectorShape#largestShapeFor(Class) largest
 * available shape} for any given lane type. To be portable,
 * code using this API should start by querying a supported
 * shape, and then process all data with shape-invariant
 * operations, within the selected shape.
 *
 * <p> Each unique combination of element type and vector shape
 * determines a unique
 * {@linkplain #species() <em>vector species</em>}.
 * A vector species is represented by a fixed instance of
 * {@link VectorSpecies VectorSpecies&lt;E&gt;}
 * shared in common by all vectors of the same shape and
 * {@code ETYPE}.
 *
 * <p> Unless otherwise documented, lane-wise vector operations
 * require that all vector inputs have exactly the same {@code VSHAPE}
 * and {@code VLENGTH}, which is to say that they must have exactly
 * the same species. This allows corresponding lanes to be paired
 * unambiguously.
 * The {@link #check(VectorSpecies) check()} method
 * provides an easy way to perform this check explicitly.
 *
 * <p> Vector shape, {@code VLENGTH}, and {@code ETYPE} are all
 * mutually constrained, so that {@code VLENGTH} times the
 * {@linkplain #elementSize() bit-size of each lane}
 * must always match the bit-size of the vector's shape.
 *
 * Thus, {@linkplain #reinterpretShape(VectorSpecies,int) reinterpreting} a
 * vector may double its length if and only if it either halves the lane size,
 * or else changes the shape. Likewise, reinterpreting a vector may double the
 * lane size if and only if it either halves the length, or else changes the
 * shape of the vector.
 *
 * <h2><a id="subtypes"></a>Vector subtypes</h2>
 *
 * Vector declares a set of vector operations (methods) that are common to all
 * element types (such as addition). Sub-classes of Vector with a concrete
 * element type declare further operations that are specific to that
 * element type (such as access to element values in lanes, logical operations
 * on values of integral element types, or transcendental operations on values
 * of floating point element types).
 * There are six abstract sub-classes of Vector corresponding to the supported set
 * of element types, {@link ByteVector}, {@link ShortVector},
 * {@link IntVector}, {@link LongVector}, {@link FloatVector}, and
 * {@link DoubleVector}. Along with type-specific operations these classes
 * support creation of vector values (instances of Vector).
 * They expose static constants corresponding to the supported species,
 * and static methods on these types generally take a species as a parameter.
 * For example,
 * {@link FloatVector#fromArray(VectorSpecies, float[], int) FloatVector.fromArray}
 * creates and returns a float vector of the specified species, with elements
 * loaded from the specified float array.
 * It is recommended that Species instances be held in {@code static final}
 * fields for optimal creation and usage of Vector values by the runtime compiler.
 *
 * <p> As an example of static constants defined by the typed vector classes,
 * constant {@link FloatVector#SPECIES_256 FloatVector.SPECIES_256}
 * is the unique species whose lanes are {@code float}s and whose
 * vector size is 256 bits. Again, the constant
 * {@link FloatVector#SPECIES_PREFERRED} is the species which
 * best supports processing of {@code float} vector lanes on
 * the currently running Java platform.
 *
 * <p> As another example, a broadcast scalar value of
 * {@code (double)0.5} can be obtained by calling
 * {@link DoubleVector#broadcast(VectorSpecies,double)
 * DoubleVector.broadcast(dsp, 0.5)}, but the argument {@code dsp} is
 * required to select the species (and hence the shape and length) of
 * the resulting vector.
 *
 * <h2><a id="lane-wise"></a>Lane-wise operations</h2>
 *
 * We use the term <em>lanes</em> when defining operations on
 * vectors. The number of lanes in a vector is the number of scalar
 * elements it holds. For example, a vector of type {@code float} and
 * shape {@code S_256_BIT} has eight lanes, since {@code 32*8=256}.
 *
 * <p> Most operations on vectors are lane-wise, which means the operation
 * is composed of an underlying scalar operator, which is repeated for
 * each distinct lane of the input vector. If there are additional
 * vector arguments of the same type, their lanes are aligned with the
 * lanes of the first input vector. (They must all have a common
 * {@code VLENGTH}.) For most lane-wise operations, the output resulting
 * from a lane-wise operation will have a {@code VLENGTH} which is equal to
 * the {@code VLENGTH} of the input(s) to the operation. Thus, such lane-wise
 * operations are <em>length-invariant</em>, in their basic definitions.
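 *
 * <p> As a rough illustration of the lane-count arithmetic above, the
 * following plain-Java sketch divides the bit-size of a shape by the
 * bit-size of one lane. It is not part of this API: the class
 * {@code LaneCountSketch} and the method {@code laneCount} are
 * hypothetical names, and shapes are modeled as plain bit counts rather
 * than {@code VectorShape} values.

```java
final class LaneCountSketch {
    // Hypothetical helper: VLENGTH = (bit-size of the shape) / (bit-size of one lane).
    static int laneCount(int shapeBits, int elementBits) {
        if (elementBits <= 0 || shapeBits % elementBits != 0) {
            throw new IllegalArgumentException("shape is not a whole number of lanes");
        }
        return shapeBits / elementBits;
    }

    public static void main(String[] args) {
        // A 256-bit shape holds eight float lanes, since 32 * 8 = 256.
        System.out.println(laneCount(256, Float.SIZE));   // prints 8
        // The same shape holds 32 byte lanes.
        System.out.println(laneCount(256, Byte.SIZE));    // prints 32
        // A 128-bit shape holds two double lanes.
        System.out.println(laneCount(128, Double.SIZE));  // prints 2
    }
}
```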
 *
 * <p> The principle of length-invariance is combined with another
 * basic principle, that most length-invariant lane-wise operations are also
 * <em>shape-invariant</em>, meaning that the inputs and the output of
 * a lane-wise operation will have a common {@code VSHAPE}. When the
 * principles conflict, because a logical result (with an invariant
 * {@code VLENGTH}), does not fit into the invariant {@code VSHAPE},
 * the resulting expansions and contractions are handled explicitly
 * with
 * <a href="Vector.html#expansion">special conventions</a>.
 *
 * <p> Vector operations can be grouped into various categories and
 * their behavior can be generally specified in terms of underlying
 * scalar operators. In the examples below, {@code ETYPE} is the
 * element type of the operation (such as {@code int.class}) and
 * {@code EVector} is the corresponding concrete vector type (such as
 * {@code IntVector.class}).
 *
 * <ul>
 * <li>
 * A <em>lane-wise unary</em> operation, such as
 * {@code w = v0.}{@link Vector#neg() neg}{@code ()},
 * takes one input vector,
 * distributing a unary scalar operator across the lanes,
 * and produces a result vector of the same type and shape.
 *
 * For each lane of the input vector {@code a},
 * the underlying scalar operator is applied to the lane value.
 * The result is placed into the vector result in the same lane.
 * The following pseudocode illustrates the behavior of this operation
 * category:
 *
 * <pre>{@code
 * ETYPE scalar_unary_op(ETYPE s);
 * EVector a = ...;
 * VectorSpecies<E> species = a.species();
 * ETYPE[] ar = new ETYPE[a.length()];
 * for (int i = 0; i < ar.length; i++) {
 *     ar[i] = scalar_unary_op(a.lane(i));
 * }
 * EVector r = EVector.fromArray(species, ar, 0);
 * }</pre>
 * </li>
 *
 * <li>
 * A <em>lane-wise binary</em> operation, such as
 * {@code w = v0.}{@link Vector#add(Vector) add}{@code (v1)},
 * takes two input vectors,
 * distributing a binary scalar operator across the lanes,
 * and produces a result vector of the same type and shape.
 *
 * For each lane of the two input vectors {@code a} and {@code b},
 * the underlying scalar operator is applied to the lane values.
 * The result is placed into the vector result in the same lane.
 * The following pseudocode illustrates the behavior of this operation
 * category:
 *
 * <pre>{@code
 * ETYPE scalar_binary_op(ETYPE s, ETYPE t);
 * EVector a = ...;
 * VectorSpecies<E> species = a.species();
 * EVector b = ...;
 * b.check(species);  // must have same species
 * ETYPE[] ar = new ETYPE[a.length()];
 * for (int i = 0; i < ar.length; i++) {
 *     ar[i] = scalar_binary_op(a.lane(i), b.lane(i));
 * }
 * EVector r = EVector.fromArray(species, ar, 0);
 * }</pre>
 * </li>
 *
 * <li>
 * Generalizing from unary and binary operations,
 * a <em>lane-wise n-ary</em> operation takes {@code N} input vectors {@code v[j]},
 * distributing an n-ary scalar operator across the lanes,
 * and produces a result vector of the same type and shape.
 * Except for a few ternary operations, such as
 * {@code w = v0.}{@link FloatVector#fma(Vector,Vector) fma}{@code (v1,v2)},
 * this API has no support for
 * lane-wise n-ary operations.
 *
 * For each lane of all of the input vectors {@code v[j]},
 * the underlying scalar operator is applied to the lane values.
 * The result is placed into the vector result in the same lane.
 * The following pseudocode illustrates the behavior of this operation
 * category:
 *
 * <pre>{@code
 * ETYPE scalar_nary_op(ETYPE... args);
 * EVector[] v = ...;
 * int N = v.length;
 * VectorSpecies<E> species = v[0].species();
 * for (EVector arg : v) {
 *     arg.check(species);  // all must have same species
 * }
 * ETYPE[] ar = new ETYPE[v[0].length()];
 * for (int i = 0; i < ar.length; i++) {
 *     ETYPE[] args = new ETYPE[N];
 *     for (int j = 0; j < N; j++) {
 *         args[j] = v[j].lane(i);
 *     }
 *     ar[i] = scalar_nary_op(args);
 * }
 * EVector r = EVector.fromArray(species, ar, 0);
 * }</pre>
 * </li>
 *
 * <li>
 * A <em>lane-wise conversion</em> operation, such as
 * {@code w0 = v0.}{@link
 * Vector#convert(VectorOperators.Conversion,int)
 * convert}{@code (VectorOperators.I2D, 0)},
 * takes one input vector,
 * distributing a unary scalar conversion operator across the lanes,
 * and produces a logical result of the converted values. The logical
 * result (or at least a part of it) is presented in a vector of the
 * same shape as the input vector.
 *
 * <p> Unlike other lane-wise operations, conversions can change lane
 * type, from the input (domain) type to the output (range) type. The
 * lane size may change along with the type. In order to manage the
 * size changes, lane-wise conversion methods can produce <em>partial
 * results</em>, under the control of a {@code part} parameter, which
 * is <a href="Vector.html#expansion">explained elsewhere</a>.
 * (Following the example above, the second group of converted lane
 * values could be obtained as
 * {@code w1 = v0.convert(VectorOperators.I2D, 1)}.)
 *
 * <p> The following pseudocode illustrates the behavior of this
 * operation category in the specific example of a conversion from
 * {@code int} to {@code double}, retaining either lower or upper
 * lanes (depending on {@code part}) to maintain shape-invariance:
 *
 * <pre>{@code
 * IntVector a = ...;
 * int VLENGTH = a.length();
 * int part = ...;  // 0 or 1
 * VectorShape VSHAPE = a.shape();
 * double[] arlogical = new double[VLENGTH];
 * for (int i = 0; i < VLENGTH; i++) {
 *     int e = a.lane(i);
 *     arlogical[i] = (double) e;
 * }
 * VectorSpecies<Double> rs = VSHAPE.withLanes(double.class);
 * int M = Double.SIZE / Integer.SIZE;  // expansion factor
 * int offset = part * (VLENGTH / M);
 * DoubleVector r = DoubleVector.fromArray(rs, arlogical, offset);
 * assert r.length() == VLENGTH / M;
 * }</pre>
 * </li>
 *
 * <li>
 * A <em>cross-lane reduction</em> operation, such as
 * {@code e = v0.}{@link
 * IntVector#reduceLanes(VectorOperators.Associative)
 * reduceLanes}{@code (VectorOperators.ADD)},
 * operates on all
 * the lane elements of an input vector.
 * An accumulation function is applied to all the
 * lane elements to produce a scalar result.
 * If the reduction operation is associative then the result may be accumulated
 * by operating on the lane elements in any order using a specified associative
 * scalar binary operation and identity value. Otherwise, the reduction
 * operation specifies the order of accumulation.
 * The following pseudocode illustrates the behavior of this operation category
 * if it is associative:
 * <pre>{@code
 * ETYPE assoc_scalar_binary_op(ETYPE s, ETYPE t);
 * EVector a = ...;
 * ETYPE r = <identity value>;
 * for (int i = 0; i < a.length(); i++) {
 *     r = assoc_scalar_binary_op(r, a.lane(i));
 * }
 * }</pre>
 * </li>
 *
 * <li>
 * A <em>cross-lane movement</em> operation, such as
 * {@code w = v0.}{@link
 * Vector#rearrange(VectorShuffle) rearrange}{@code (shuffle)},
 * operates on all
 * the lane elements of an input vector and moves them
 * in a data-dependent manner into <em>different lanes</em>
 * in an output vector.
 * The movement is steered by an auxiliary datum, such as
 * a {@link VectorShuffle} or a scalar index defining the
 * origin of the movement.
 * The following pseudocode illustrates the behavior of this
 * operation category, in the case of a shuffle:
 * <pre>{@code
 * EVector a = ...;
 * VectorShuffle<E> s = ...;
 * ETYPE[] ar = new ETYPE[a.length()];
 * for (int i = 0; i < ar.length; i++) {
 *     int source = s.laneSource(i);
 *     ar[i] = a.lane(source);
 * }
 * EVector r = EVector.fromArray(a.species(), ar, 0);
 * }</pre>
 * </li>
 *
 * <li>
 * A <em>masked operation</em> is one which is a variation on one of the
 * previous operations (either lane-wise or cross-lane), where
 * the operation takes an extra trailing {@link VectorMask} argument.
 * In lanes where the mask is set, the operation behaves as if the mask
 * argument were absent, but in lanes where the mask is unset, the
 * underlying scalar operation is suppressed.
 * Masked operations are explained in
 * <a href="Vector.html#masking">greater detail elsewhere</a>.
 * </li>
 *
 * <li>
 * A very special case of a masked lane-wise binary operation is a
 * {@linkplain #blend(Vector,VectorMask) blend}, which operates
 * lane-wise on two input vectors {@code a} and {@code b}, selecting lane
 * values from one input or the other depending on a mask {@code m}.
 * In lanes where {@code m} is set, the corresponding value from
 * {@code b} is selected into the result; otherwise the value from
 * {@code a} is selected. Thus, a blend acts as a vectorized version
 * of Java's ternary selection expression {@code m?b:a}:
 * <pre>{@code
 * ETYPE[] ar = new ETYPE[a.length()];
 * for (int i = 0; i < ar.length; i++) {
 *     boolean isSet = m.laneIsSet(i);
 *     ar[i] = isSet ? b.lane(i) : a.lane(i);
 * }
 * EVector r = EVector.fromArray(a.species(), ar, 0);
 * }</pre>
 * </li>
 *
 * <li>
 * A <em>lane-wise binary test</em> operation, such as
 * {@code m = v0.}{@link Vector#lt(Vector) lt}{@code (v1)},
 * takes two input vectors,
 * distributing a binary scalar comparison across the lanes,
 * and produces, not a vector of booleans, but rather a
 * {@linkplain VectorMask vector mask}.
 *
 * For each lane of the two input vectors {@code a} and {@code b},
 * the underlying scalar comparison operator is applied to the lane values.
 * The resulting boolean is placed into the vector mask result in the same lane.
 * The following pseudocode illustrates the behavior of this operation
 * category:
 * <pre>{@code
 * boolean scalar_binary_test_op(ETYPE s, ETYPE t);
 * EVector a = ...;
 * VectorSpecies<E> species = a.species();
 * EVector b = ...;
 * b.check(species);  // must have same species
 * boolean[] mr = new boolean[a.length()];
 * for (int i = 0; i < mr.length; i++) {
 *     mr[i] = scalar_binary_test_op(a.lane(i), b.lane(i));
 * }
 * VectorMask<E> m = VectorMask.fromArray(species, mr, 0);
 * }</pre>
 * </li>
 *
 * <li>
 * Similarly to a binary comparison, a <em>lane-wise unary test</em>
 * operation, such as
 * {@code m = v0.}{@link Vector#test(VectorOperators.Test)
 * test}{@code (IS_FINITE)},
 * takes one input vector, distributing a scalar predicate
 * (a test function) across the lanes, and produces a
 * {@linkplain VectorMask vector mask}.
 * </li>
 *
 * </ul>
 *
 * <p>
 * If a vector operation does not belong to one of the above categories then
 * the method documentation explicitly specifies how it processes the lanes of
 * input vectors, and where appropriate illustrates the behavior using
 * pseudocode.
 *
 * <p>
 * Most lane-wise binary and comparison operations offer convenience
 * overloadings which accept a scalar as the second input, in place of a
 * vector. In this case the scalar value is promoted to a vector by
 * {@linkplain Vector#broadcast(long) broadcasting it}
 * into the same lane structure as the first input.
 *
 * For example, to multiply all lanes of a {@code double} vector by
 * a scalar value {@code 1.1}, the expression {@code v.mul(1.1)} is
 * easier to work with than an equivalent expression with an explicit
 * broadcast operation, such as {@code v.mul(v.broadcast(1.1))}
 * or {@code v.mul(DoubleVector.broadcast(v.species(), 1.1))}.
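 *
 * <p> The equivalence between the scalar overloading and an explicit
 * broadcast can be modeled with arrays standing in for vectors. The
 * following is a minimal sketch under that assumption, using
 * {@code double} lanes; the class {@code BroadcastSketch} and its
 * methods are hypothetical helpers and do not call this API.

```java
import java.util.Arrays;

final class BroadcastSketch {
    // Models a broadcast: every lane of the result holds the same scalar.
    static double[] broadcast(int vlength, double e) {
        double[] r = new double[vlength];
        Arrays.fill(r, e);
        return r;
    }

    // Models a lane-wise binary multiply on two "vectors" of equal VLENGTH.
    static double[] mul(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("inputs must have a common length");
        }
        double[] r = new double[a.length];
        for (int i = 0; i < r.length; i++) {
            r[i] = a[i] * b[i];
        }
        return r;
    }

    // Models the scalar convenience overloading: broadcast first, then multiply.
    static double[] mul(double[] a, double e) {
        return mul(a, broadcast(a.length, e));
    }

    public static void main(String[] args) {
        double[] v = {1.0, 2.0, 3.0, 4.0};
        double[] viaScalar = mul(v, 1.1);
        double[] viaBroadcast = mul(v, broadcast(v.length, 1.1));
        System.out.println(Arrays.equals(viaScalar, viaBroadcast));  // prints true
    }
}
```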
 *
 * Unless otherwise specified the scalar variant always behaves as if
 * each scalar value is first transformed to a vector of the same
 * species as the first vector input, using the appropriate
 * {@code broadcast} operation.
 *
 * <h2><a id="masking"></a>Masked operations</h2>
 *
 * <p> Many vector operations accept an optional
 * {@link VectorMask mask} argument, selecting which lanes participate
 * in the underlying scalar operator. If present, the mask argument
 * appears at the end of the method argument list.
 *
 * <p> Each lane of the mask argument is a boolean which is either in
 * the <em>set</em> or <em>unset</em> state. For lanes where the mask
 * argument is unset, the underlying scalar operator is suppressed.
 * In this way, masks allow vector operations to emulate scalar
 * control flow operations, without losing SIMD parallelism, except
 * where the mask lane is unset.
 *
 * <p> An operation suppressed by a mask will never cause an exception
 * or side effect of any sort, even if the underlying scalar operator
 * can potentially do so. For example, an unset lane that seems to
 * access an out of bounds array element or divide an integral value
 * by zero will simply be ignored. Values in suppressed lanes never
 * participate or appear in the result of the overall operation.
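 *
 * <p> The division-by-zero case can be modeled in plain Java. The
 * following is a scalar sketch, not a use of this API: arrays stand in
 * for vectors and a {@code boolean[]} stands in for the mask, and the
 * class {@code MaskedDivSketch} and method {@code maskedDiv} are
 * hypothetical names. Unset lanes keep the value of the first input, as
 * this section specifies for masked arithmetic operations, so the zero
 * divisors in those lanes are never evaluated.

```java
import java.util.Arrays;

final class MaskedDivSketch {
    // Scalar model of a masked lane-wise integer division.
    // Set lanes compute a[i] / b[i]; unset lanes keep a[i] and never divide.
    static int[] maskedDiv(int[] a, int[] b, boolean[] m) {
        if (a.length != b.length || a.length != m.length) {
            throw new IllegalArgumentException("inputs must have a common length");
        }
        int[] r = new int[a.length];
        for (int i = 0; i < r.length; i++) {
            r[i] = m[i] ? a[i] / b[i] : a[i];
        }
        return r;
    }

    public static void main(String[] args) {
        int[] a = {12, 7, 9, 20};
        int[] b = {3, 0, 0, 5};                    // zero divisors in lanes 1 and 2
        boolean[] m = {true, false, false, true};  // those lanes are unset
        // No ArithmeticException: the suppressed lanes are skipped.
        System.out.println(Arrays.toString(maskedDiv(a, b, m)));  // prints [4, 7, 9, 4]
    }
}
```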
 *
 * <p> Result lanes corresponding to a suppressed operation will be
 * filled with a default value which depends on the specific
 * operation, as follows:
 *
 * <ul>
 *
 * <li>If the masked operation is a unary, binary, or n-ary arithmetic or
 * logical operation, suppressed lanes are filled from the first
 * vector operand (i.e., the vector receiving the method call), as if
 * by a {@linkplain #blend(Vector,VectorMask) blend}.</li>
 *
 * <li>If the masked operation is a memory load or a {@code slice()} from
 * another vector, suppressed lanes are not loaded, and are filled
 * with the default value for the {@code ETYPE}, which in every case
 * consists of all zero bits. An unset lane can never cause an
 * exception, even if the hypothetical corresponding memory location
 * does not exist (because it is out of an array's index range).</li>
 *
 * <li>If the operation is a cross-lane operation with an operand
 * which supplies lane indexes (of type {@code VectorShuffle} or
 * {@code Vector}), suppressed lanes are not computed, and are filled
 * with the zero default value. Normally, invalid lane indexes elicit
 * an {@code IndexOutOfBoundsException}, but if a lane is unset, the
 * zero value is quietly substituted, regardless of the index. This
 * rule is similar to the previous rule, for masked memory loads.</li>
 *
 * <li>If the masked operation is a memory store or an {@code unslice()} into
 * another vector, suppressed lanes are not stored, and the
 * corresponding memory or vector locations (if any) are unchanged.
 *
 * <p> (Note: Memory effects such as race conditions never occur for
 * suppressed lanes. That is, implementations will not secretly
 * re-write the existing value for unset lanes.
 * In the Java Memory
 * Model, reassigning a memory variable to its current value is not a
 * no-op; it may quietly undo a racing store from another
 * thread.)</p>
 * </li>
 *
 * <li>If the masked operation is a reduction, suppressed lanes are ignored
 * in the reduction. If all lanes are suppressed, a suitable neutral
 * value is returned, depending on the specific reduction operation,
 * and documented by the masked variant of that method. (This means
 * that users can obtain the neutral value programmatically by
 * executing the reduction on a dummy vector with an all-unset mask.)
 * </li>
 *
 * <li>If the masked operation is a comparison operation, suppressed output
 * lanes in the resulting mask are themselves unset, as if the
 * suppressed comparison operation returned {@code false} regardless
 * of the suppressed input values. In effect, it is as if the
 * comparison operation were performed unmasked, and then the
 * result intersected with the controlling mask.</li>
 *
 * <li>In other cases, such as masked
 * <a href="Vector.html#cross-lane"><em>cross-lane movements</em></a>,
 * the specific effects of masking are documented by the masked
 * variant of the method.
 * </li>
 *
 * </ul>
 *
 * <p> As an example, a masked binary operation on two input vectors
 * {@code a} and {@code b} suppresses the binary operation for lanes
 * where the mask is unset, and retains the original lane value from
 * {@code a}.
 * The following pseudocode illustrates this behavior:
 * <pre>{@code
 * ETYPE scalar_binary_op(ETYPE s, ETYPE t);
 * EVector a = ...;
 * VectorSpecies<E> species = a.species();
 * EVector b = ...;
 * b.check(species);  // must have same species
 * VectorMask<E> m = ...;
 * m.check(species);  // must have same species
 * ETYPE[] ar = new ETYPE[a.length()];
 * for (int i = 0; i < ar.length; i++) {
 *     if (m.laneIsSet(i)) {
 *         ar[i] = scalar_binary_op(a.lane(i), b.lane(i));
 *     } else {
 *         ar[i] = a.lane(i);  // from first input
 *     }
 * }
 * EVector r = EVector.fromArray(species, ar, 0);
 * }</pre>
 *
 * <h2><a id="lane-order"></a>Lane order and byte order</h2>
 *
 * The number of lane values stored in a given vector is referred to
 * as its {@linkplain #length() vector length} or {@code VLENGTH}.
 *
 * It is useful to consider vector lanes as ordered
 * <em>sequentially</em> from first to last, with the first lane
 * numbered {@code 0}, the next lane numbered {@code 1}, and so on to
 * the last lane numbered {@code VLENGTH-1}. This is a temporal
 * order, where lower-numbered lanes are considered earlier than
 * higher-numbered (later) lanes. This API uses these terms
 * in preference to spatial terms such as "left", "right", "high",
 * and "low".
 *
 * <p> Temporal terminology works well for vectors because they
 * (usually) represent small fixed-sized segments in a long sequence
 * of workload elements, where the workload is conceptually traversed
 * in time order from beginning to end. (This is a mental model: it
 * does not exclude multicore divide-and-conquer techniques.) Thus,
 * when a scalar loop is transformed into a vector loop, adjacent
 * scalar items (one earlier, one later) in the workload end up as
 * adjacent lanes in a single vector (again, one earlier, one later).
 * At a vector boundary, the last lane item in the earlier vector is
 * adjacent to (and just before) the first lane item in the
 * immediately following vector.
 *
 * <p> Vectors are also sometimes thought of in spatial terms, where
 * the first lane is placed at an edge of some virtual paper, and
 * subsequent lanes are presented in order next to it. When using
 * spatial terms, all directions are equally plausible: Some vector
 * notations present lanes from left to right, and others from right
 * to left; still others present from top to bottom or vice versa.
 * Using the language of time (before, after, first, last) instead of
 * space (left, right, high, low) is often more likely to avoid
 * misunderstandings.
 *
 * <p> A second reason to prefer temporal to spatial language about
 * vector lanes is the fact that the terms "left", "right", "high" and
 * "low" are widely used to describe the relations between bits in
 * scalar values. The leftmost or highest bit in a given type is
 * likely to be a sign bit, while the rightmost or lowest bit is
 * likely to be the arithmetically least significant, and so on.
 * Applying these terms to vector lanes risks confusion, however,
 * because it is relatively rare to find algorithms where, given two
 * adjacent vector lanes, one lane is somehow more arithmetically
 * significant than its neighbor, and even in those cases, there is no
 * general way to know which neighbor is the more significant.
 *
 * <p> Putting the terms together, we view the information structure
 * of a vector as a temporal sequence of lanes ("first", "next",
 * "earlier", "later", "last", etc.) of bit-strings which are
 * internally ordered spatially (either "low" to "high" or "right" to
 * "left"). The primitive values in the lanes are decoded from these
 * bit-strings, in the usual way.
 * Most vector operations, like most
 * Java scalar operators, treat primitive values as atomic values, but
 * some operations reveal the internal bit-string structure.
 *
 * <p> When a vector is loaded from or stored into memory, the order
 * of vector lanes is <em>always consistent</em> with the inherent
 * ordering of the memory container. This is true whether or not
 * individual lane elements are subject to "byte swapping" due to
 * details of byte order. Thus, while the scalar lane elements of
 * a vector might be "byte swapped", the lanes themselves are never
 * reordered, except by an explicit method call that performs
 * cross-lane reordering.
 *
 * <p> When vector lane values are stored to Java variables of the
 * same type, byte swapping is performed if and only if the
 * implementation of the vector hardware requires such swapping. It
 * is therefore unconditional and invisible.
 *
 * <p> As a useful fiction, this API presents a consistent illusion
 * that vector lane bytes are composed into larger lane scalars in
 * <em>little endian order</em>. This means that storing a vector
 * into a Java byte array will reveal the successive bytes of the
 * vector lane values in little-endian order on all platforms,
 * regardless of native memory order, and also regardless of byte
 * order (if any) within vector unit registers.
 *
 * <p> This hypothetical little-endian ordering also appears when a
 * {@linkplain #reinterpretShape(VectorSpecies,int) reinterpretation cast} is
 * applied in such a way that lane boundaries are discarded and
 * redrawn differently, while maintaining vector bits unchanged. In
 * such an operation, two adjacent lanes will contribute bytes to a
 * single new lane (or vice versa), and the sequential order of the
 * two lanes will determine the arithmetic order of the bytes in the
 * single lane.
 * In this case, the little-endian convention provides
 * portable results, so that on all platforms earlier lanes tend to
 * contribute lower (rightward) bits, and later lanes tend to
 * contribute higher (leftward) bits. The {@linkplain #reinterpretAsBytes()
 * reinterpretation casts} between {@link ByteVector}s and the
 * other non-byte vectors use this convention to clarify their
 * portable semantics.
 *
 * <p> The little-endian fiction for relating lane order to per-lane
 * byte order is slightly preferable to an equivalent big-endian
 * fiction, because some related formulas are much simpler,
 * specifically those which renumber bytes after lane structure
 * changes. The earliest byte is invariantly earliest across all lane
 * structure changes, but only if little-endian conventions are used.
 * The root cause of this is that bytes in scalars are numbered from
 * the least significant (rightmost) to the most significant
 * (leftmost), and almost never vice-versa. If we habitually numbered
 * sign bits as zero (as on some computers) then this API would reach
 * for big-endian fictions to create unified addressing of vector
 * bytes.
 *
 * <h2><a id="memory"></a>Memory operations</h2>
 *
 * As was already mentioned, vectors can be loaded from memory and
 * stored back. An optional mask can control which individual memory
 * locations are read from or written to. The shape of a vector
 * determines how much memory it will occupy.
 *
 * An implementation typically has the property, in the absence of
 * masking, that lanes are stored as a dense sequence of back-to-back
 * values in memory, the same as a dense (gap-free) series of single
 * scalar values in an array of the scalar type.
 *
 * In such cases memory order corresponds exactly to lane order.
 * The first vector lane value occupies the first position in memory,
 * and so on, up to the length of the vector. Further, the memory order
 * of stored vector lanes corresponds to increasing index values in a
 * Java array or in a {@link java.nio.ByteBuffer}.
 *
 * <p> Byte order for lane storage is chosen such that the stored
 * vector values can be read or written as single primitive values,
 * within the array or buffer that holds the vector, producing the
 * same values as the lane-wise values within the vector.
 * This fact is independent of the convenient fiction that lane values
 * inside of vectors are stored in little-endian order.
 *
 * <p> For example,
 * {@link FloatVector#fromArray(VectorSpecies, float[], int)
 *        FloatVector.fromArray(fsp,fa,i)}
 * creates and returns a float vector of some particular species {@code fsp},
 * with elements loaded from some float array {@code fa}.
 * The first lane is loaded from {@code fa[i]} and the last lane
 * is loaded from {@code fa[i+VL-1]}, where {@code VL}
 * is the length of the vector as derived from the species {@code fsp}.
 * Then, {@link FloatVector#add(Vector) fv=fv.add(fv2)}
 * will produce another float vector of that species {@code fsp},
 * given a vector {@code fv2} of the same species {@code fsp}.
 * Next, {@link FloatVector#compare(VectorOperators.Comparison,float)
 * mnz=fv.compare(NE, 0.0f)} tests whether the result is non-zero,
 * yielding a mask {@code mnz}. The non-zero lanes (and only those
 * lanes) can then be stored back into the original array elements
 * using the statement
 * {@link FloatVector#intoArray(float[],int,VectorMask) fv.intoArray(fa,i,mnz)}.
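 *
 * <p> As a sketch only, the load, add, compare, and masked store
 * steps described above might be combined as follows. The method name
 * {@code addAndStoreNonZero}, the second array {@code fb}, and the
 * choice of {@code SPECIES_PREFERRED} are illustrative assumptions,
 * and the caller is assumed to guarantee that {@code i+VL} does not
 * exceed the length of either array:
 * <pre>{@code
 * static final VectorSpecies<Float> fsp = FloatVector.SPECIES_PREFERRED;
 *
 * static void addAndStoreNonZero(float[] fa, float[] fb, int i) {
 *     FloatVector fv = FloatVector.fromArray(fsp, fa, i);   // lanes fa[i..i+VL-1]
 *     FloatVector fv2 = FloatVector.fromArray(fsp, fb, i);  // same species fsp
 *     fv = fv.add(fv2);                                     // lane-wise sum
 *     VectorMask<Float> mnz = fv.compare(VectorOperators.NE, 0.0f);
 *     fv.intoArray(fa, i, mnz);  // writes back only the non-zero lanes
 * }
 * }</pre>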
 *
 * <h2><a id="expansion"></a>Expansions, contractions, and partial results</h2>
 *
 * Since vectors are fixed in size, occasions often arise where the
 * size of the logical result of an operation is not the same as the
 * physical size of the proposed output vector. To encourage user code
 * that is as portable and predictable as possible, this API has a
 * systematic approach to the design of such <em>resizing</em> vector
 * operations.
 *
 * <p> As a basic principle, lane-wise operations are
 * <em>length-invariant</em>, unless clearly marked otherwise.
 * Length-invariance simply means that
 * if {@code VLENGTH} lanes go into an operation, the same number
 * of lanes come out, with nothing discarded and no extra padding.
 *
 * <p> As a second principle, sometimes in tension with the first,
 * lane-wise operations are also <em>shape-invariant</em>, unless
 * clearly marked otherwise.
 *
 * Shape-invariance means that {@code VSHAPE} is constant for typical
 * computations. Keeping the same shape throughout a computation
 * helps ensure that scarce vector resources are efficiently used.
 * (On some hardware platforms shape changes could cause unwanted
 * effects like extra data movement instructions, round trips through
 * memory, or pipeline bubbles.)
 *
 * <p> Tension between these principles arises when an operation
 * produces a <em>logical result</em> that is too large for the
 * required output {@code VSHAPE}. In other cases, when a logical
 * result is smaller than the capacity of the output {@code VSHAPE},
 * the positioning of the logical result is open to question, since
 * the physical output vector must contain a mix of logical result and
 * padding.
 *
 * <p> In the first case, of a too-large logical result being crammed
 * into a too-small output {@code VSHAPE}, we say that data has
 * <em>expanded</em>.
 * In other words, an <em>expansion operation</em>
 * has caused the output shape to overflow. Symmetrically, in the
 * second case of a small logical result fitting into a roomy output
 * {@code VSHAPE}, the data has <em>contracted</em>, and the
 * <em>contraction operation</em> has required the output shape to pad
 * itself with extra zero lanes.
 *
 * <p> In both cases we can speak of a parameter {@code M} which
 * measures the <em>expansion ratio</em> or <em>contraction ratio</em>
 * between the logical result size (in bits) and the bit-size of the
 * actual output shape. When vector shapes are changed, and lane
 * sizes are not, {@code M} is just the integral ratio of the output
 * shape to the logical result. (With the possible exception of
 * the {@linkplain VectorShape#S_Max_BIT maximum shape}, all vector
 * sizes are powers of two, and so the ratio {@code M} is always
 * an integer. In the hypothetical case of a non-integral ratio,
 * the value {@code M} would be rounded up to the next integer,
 * and then the same general considerations would apply.)
 *
 * <p> If the logical result is larger than the physical output shape,
 * such a shape change must inevitably drop result lanes (all but
 * {@code 1/M} of the logical result). If the logical size is smaller
 * than the output, the shape change must introduce zero-filled lanes
 * of padding (all but {@code 1/M} of the physical output). The first
 * case, with dropped lanes, is an expansion, while the second, with
 * padding lanes added, is a contraction.
 *
 * <p> Similarly, consider a lane-wise conversion operation which
 * leaves the shape invariant but changes the lane size by a ratio of
 * {@code M}. If the logical result is larger than the output (or
 * input), this conversion must reduce the {@code VLENGTH} lanes of the
 * output by {@code M}, dropping all but {@code 1/M} of the logical
 * result lanes.
 * As before, the dropping of lanes is the hallmark of
 * an expansion. A lane-wise operation which contracts lane size by a
 * ratio of {@code M} must increase the {@code VLENGTH} by the same
 * factor {@code M}, filling the extra lanes with a zero padding
 * value; because padding must be added this is a contraction.
 *
 * <p> It is also possible (though somewhat confusing) to change both
 * lane size and container size in one operation which performs both
 * lane conversion <em>and</em> reshaping. If this is done, the same
 * rules apply, but the logical result size is the product of the
 * input size times any expansion or contraction ratio from the lane
 * size change.
 *
 * <p> For completeness, we can also speak of <em>in-place
 * operations</em> for the frequent case when resizing does not occur.
 * With an in-place operation, the data is simply copied from logical
 * output to its physical container with no truncation or padding.
 * The ratio parameter {@code M} in this case is unity.
 *
 * <p> Note that the classification of contraction vs. expansion
 * depends on the relative sizes of the logical result and the
 * physical output container. The size of the input container may be
 * larger or smaller than either of the other two values, without
 * changing the classification. For example, a conversion from a
 * 128-bit shape to a 256-bit shape will be a contraction in many
 * cases, but it would be an expansion if it were combined with a
 * conversion from {@code byte} to {@code long}, since in that case
 * the logical result would be 1024 bits in size. This example also
 * illustrates that a logical result does not need to correspond to
 * any particular platform-supported vector shape.
 *
 * <p> Although lane-wise masked operations can be viewed as producing
 * partial operations, they are not classified (in this API) as
 * expansions or contractions.
 * A masked load from an array surely
 * produces a partial vector, but there is no meaningful "logical
 * output vector" that this partial result was contracted from.
 *
 * <p> Some care is required with these terms, because it is the
 * <em>data</em>, not the <em>container size</em>, that is expanding
 * or contracting, relative to the size of its output container.
 * Thus, resizing a 128-bit input into a 512-bit vector has the effect
 * of a <em>contraction</em>. Though the 128 bits of payload have not
 * changed in size, we can say it "looks smaller" in its new 512-bit
 * home, and this will capture the practical details of the situation.
 *
 * <p> If a vector method might expand its data, it accepts an extra
 * {@code int} parameter called {@code part}, or the "part number".
 * The part number must be in the range {@code [0..M-1]}, where
 * {@code M} is the expansion ratio. The part number selects one
 * of {@code M} contiguous disjoint equally-sized blocks of lanes
 * from the logical result and fills the physical output vector
 * with this block of lanes.
 *
 * <p> Specifically, the lanes selected from the logical result of an
 * expansion are numbered in the range {@code [R..R+L-1]}, where
 * {@code L} is the {@code VLENGTH} of the physical output vector, and
 * the origin of the block, {@code R}, is {@code part*L}.
 *
 * <p> A similar convention applies to any vector method that might
 * contract its data. Such a method also accepts an extra part number
 * parameter (again called {@code part}) which steers the contracted
 * data lanes into one of {@code M} contiguous disjoint equally-sized
 * blocks of lanes in the physical output vector. The remaining lanes
 * are filled with zero, or as specified by the method.
 *
 * <p> Specifically, the data is steered into the lanes numbered in the
 * range {@code [R..R+L-1]}, where {@code L} is the {@code VLENGTH} of
 * the logical result vector, and the origin of the block, {@code R},
 * is again a multiple of {@code L} selected by the part number,
 * specifically {@code |part|*L}.
 *
 * <p> In the case of a contraction, the part number must be in the
 * non-positive range {@code [-M+1..0]}. This convention is adopted
 * because some methods can perform both expansions and contractions,
 * in a data-dependent manner, and the extra sign on the part number
 * serves as an error check. If a vector method takes a part number and
 * is invoked to perform an in-place operation (neither contracting
 * nor expanding), the {@code part} parameter must be exactly zero.
 * Part numbers outside the allowed ranges will elicit an indexing
 * exception. Note that in all cases a zero part number is valid, and
 * corresponds to an operation which preserves as many lanes as
 * possible from the beginning of the logical result, and places them
 * into the beginning of the physical output container. This is
 * often a desirable default, so a part number of zero is safe
 * in all cases and useful in most cases.
 *
 * <p> The various resizing operations of this API contract or expand
 * their data as follows:
 * <ul>
 *
 * <li>
 * {@link Vector#convert(VectorOperators.Conversion,int) Vector.convert()}
 * will expand (respectively, contract) its operand by ratio
 * {@code M} if the
 * {@linkplain #elementSize() element size} of its output is
 * larger (respectively, smaller) by a factor of {@code M}.
 * If the element sizes of input and output are the same,
 * then {@code convert()} is an in-place operation.
 *
 * <li>
 * {@link Vector#convertShape(VectorOperators.Conversion,VectorSpecies,int) Vector.convertShape()}
 * will expand (respectively, contract) its operand by ratio
 * {@code M} if the bit-size of its logical result is
 * larger (respectively, smaller) than the bit-size of its
 * output shape.
 * The size of the logical result is defined as the
 * {@linkplain #elementSize() element size} of the output,
 * times the {@code VLENGTH} of its input.
 *
 * Depending on the ratio of the changed lane sizes, the logical size
 * may be (in various cases) either larger or smaller than the input
 * vector, independently of whether the operation is an expansion
 * or contraction.
 *
 * <li>
 * Since {@link Vector#castShape(VectorSpecies,int) Vector.castShape()}
 * is a convenience method for {@code convertShape()}, its classification
 * as an expansion or contraction is the same as for {@code convertShape()}.
 *
 * <li>
 * {@link Vector#reinterpretShape(VectorSpecies,int) Vector.reinterpretShape()}
 * is an expansion (respectively, contraction) by ratio {@code M} if the
 * {@linkplain #bitSize() vector bit-size} of its input is
 * crammed into a smaller (respectively, dropped into a larger)
 * output container by a factor of {@code M}.
 * Otherwise it is an in-place operation.
 *
 * Since this method is a reinterpretation cast that can erase and
 * redraw lane boundaries as well as modify shape, the input vector's
 * lane size and lane count are irrelevant to its classification as
 * expanding or contracting.
 *
 * <li>
 * The {@link #unslice(int,Vector,int) unslice()} methods expand
 * by a ratio of {@code M=2}, because the single input slice is
 * positioned and inserted somewhere within two consecutive background
 * vectors. The part number selects the first or second background
 * vector, as updated by the inserted slice.
 * Note that the corresponding
 * {@link #slice(int,Vector) slice()} methods, although inverse
 * to the {@code unslice()} methods, do not contract their data
 * and thus require no part number. This is because
 * {@code slice()} delivers a slice of exactly {@code VLENGTH}
 * lanes extracted from two input vectors.
 * </ul>
 *
 * The method {@link VectorSpecies#partLimit(VectorSpecies,boolean)
 * partLimit()} on {@link VectorSpecies} can be used, before any
 * expanding or contracting operation is performed, to query the
 * limiting value on a part parameter for a proposed expansion
 * or contraction. The value returned from {@code partLimit()} is
 * positive for expansions, negative for contractions, and zero for
 * in-place operations. Its absolute value is the parameter {@code
 * M}, and so it serves as an exclusive limit on valid part number
 * arguments for the relevant methods. Thus, for expansions, the
 * {@code partLimit()} value {@code M} is the exclusive upper limit
 * for part numbers, while for contractions the {@code partLimit()}
 * value {@code -M} is the exclusive <em>lower</em> limit.
 *
 * <h2><a id="cross-lane"></a>Moving data across lane boundaries</h2>
 * The cross-lane methods which do not redraw lanes or change species
 * are more regularly structured and easier to reason about.
 * These operations are:
 * <ul>
 *
 * <li>The {@link #slice(int,Vector) slice()} family of methods,
 * which extract a contiguous slice of {@code VLENGTH} fields from
 * a given origin point within a concatenated pair of vectors.
 *
 * <li>The {@link #unslice(int,Vector,int) unslice()} family of
 * methods, which insert a contiguous slice of {@code VLENGTH} fields
 * into a concatenated pair of vectors at a given origin point.
 *
 * <li>The {@link #rearrange(VectorShuffle) rearrange()} family of
 * methods, which select an arbitrary set of {@code VLENGTH} lanes
 * from one or two input vectors, and assemble them in an arbitrary
 * order. The selection and order of lanes is controlled by a
 * {@code VectorShuffle} object, which acts as a routing table
 * mapping source lanes to destination lanes. A {@code VectorShuffle}
 * can encode a mathematical permutation as well as many other
 * patterns of data movement.
 *
 * </ul>
 * <p> Some vector operations are not lane-wise, but rather move data
 * across lane boundaries. Such operations are typically rare in SIMD
 * code, though they are sometimes necessary for specific algorithms
 * that manipulate data formats at a low level, and/or require SIMD
 * data to move in complex local patterns. (Local movement in a small
 * window of a large array of data is relatively unusual, although
 * some highly patterned algorithms call for it.) In this API such
 * methods are always clearly recognizable, so that simpler lane-wise
 * reasoning can be confidently applied to the rest of the code.
 *
 * <p> In some cases, vector lane boundaries are discarded and
 * "redrawn from scratch", so that data in a given input lane might
 * appear (in several parts) distributed through several output lanes,
 * or (conversely) data from several input lanes might be consolidated
 * into a single output lane. The fundamental method which can redraw
 * lane boundaries is
 * {@link #reinterpretShape(VectorSpecies,int) reinterpretShape()}.
 * Built on top of this method, certain convenience methods such
 * as {@link #reinterpretAsBytes() reinterpretAsBytes()} or
 * {@link #reinterpretAsInts() reinterpretAsInts()} will
 * (potentially) redraw lane boundaries, while retaining the
 * same overall vector shape.
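 *
 * <p> As an illustration (a sketch only, assuming a 128-bit species
 * is usable on the platform, with arbitrary array contents), redrawing
 * the lane boundaries of a byte vector as {@code int} lanes follows the
 * little-endian convention described under
 * <a href="Vector.html#lane-order">lane order and byte order</a>:
 * <pre>{@code
 * byte[] bytes = new byte[16];
 * for (int k = 0; k < bytes.length; k++) {
 *     bytes[k] = (byte) (k + 1);          // 0x01, 0x02, ..., 0x10
 * }
 * ByteVector bv = ByteVector.fromArray(ByteVector.SPECIES_128, bytes, 0);
 * IntVector iv = bv.reinterpretAsInts();  // same 128 bits, four int lanes
 * // Earlier byte lanes supply the lower bits of each int lane,
 * // so iv.lane(0) is 0x04030201 and iv.lane(3) is 0x100F0E0D.
 * }</pre>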
 *
 * <p> Operations which produce or consume a scalar result can be
 * viewed as very simple cross-lane operations. Methods in the
 * {@link #reduceLanesToLong(VectorOperators.Associative)
 * reduceLanes()} family fold together all lanes (or mask-selected
 * lanes) of a vector and return a single result. As an inverse, the
 * {@link #broadcast(long) broadcast} family of methods can be thought
 * of as crossing lanes in the other direction, from a scalar to all
 * lanes of the output vector. Single-lane access methods such as
 * {@code lane(I)} or {@code withLane(I,E)} might also be regarded as
 * very simple cross-lane operations.
 *
 * <p> Likewise, a method which moves a non-byte vector to or from a
 * byte array could be viewed as a cross-lane operation, because the
 * vector lanes must be distributed into separate bytes, or (in the
 * other direction) consolidated from array bytes.
 *
 * @implNote
 *
 * <h2>Hardware platform dependencies and limitations</h2>
 *
 * The Vector API is designed to accelerate computations in the style
 * of Single Instruction Multiple Data (SIMD), using available hardware
 * resources such as vector hardware registers and vector hardware
 * instructions. The API is designed to make effective use of
 * multiple SIMD hardware platforms.
 *
 * <p> This API will also work correctly even on Java platforms which
 * do not include specialized hardware support for SIMD computations.
 * The Vector API is not likely to provide any special performance
 * benefit on such platforms.
 *
 * <p> Currently the implementation is optimized to work best on:
 *
 * <ul>
 *
 * <li> Intel x64 platforms supporting at least AVX2 up to AVX-512.
 * Masking using mask registers and mask-accepting hardware
 * instructions on AVX-512 are not currently supported.
 *
 * <li> ARM AArch64 platforms supporting NEON.
 * Although the API has
 * been designed to ensure ARM SVE instructions can be supported
 * (vector sizes between 128 and 2048 bits) there is currently no
 * implementation of such instructions and the general masking
 * capability.
 *
 * </ul>
 * The implementation currently supports masked lane-wise operations
 * in a cross-platform manner by composing the unmasked lane-wise
 * operation with {@link #blend(Vector, VectorMask) blend} as in
 * the expression {@code a.blend(a.lanewise(op, b), m)}, where
 * {@code a} and {@code b} are vectors, {@code op} is the vector
 * operation, and {@code m} is the mask.
 *
 * <p> The implementation does not currently support optimal
 * vectorized instructions for floating point transcendental
 * functions (such as operators {@link VectorOperators#SIN SIN}
 * and {@link VectorOperators#LOG LOG}).
 *
 * <h2>No boxing of primitives</h2>
 *
 * Although a vector type like {@code Vector<Integer>} may seem to
 * work with boxed {@code Integer} values, the overheads associated
 * with boxing are avoided by having each vector subtype work
 * internally on lane values of the actual {@code ETYPE}, such as
 * {@code int}.
 *
 * <h2>Value-based classes and identity operations</h2>
 *
 * {@code Vector}, along with all of its subtypes and many of its
 * helper types like {@code VectorMask} and {@code VectorShuffle}, is a
 * <a href="{@docRoot}/java.base/java/lang/doc-files/ValueBased.html">value-based</a>
 * class.
 *
 * <p> Once created, a vector is never mutated, not even if only
 * {@linkplain IntVector#withLane(int,int) a single lane is changed}.
 * A new vector is always created to hold a new configuration
 * of lane values. The unavailability of mutative methods is a
 * necessary consequence of suppressing the object identity of
 * all vectors, as value-based classes.
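 *
 * <p> A brief sketch of this behavior, using a 128-bit {@code int}
 * species and lane values chosen arbitrarily for illustration:
 * <pre>{@code
 * IntVector v = IntVector.zero(IntVector.SPECIES_128);
 * IntVector w = v.withLane(0, 42);  // a new vector; v is not modified
 * // v.lane(0) is still 0, while w.lane(0) is 42
 * }</pre>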
 *
 * <p> With {@code Vector},
 *
 * <!-- The following paragraph is shared verbatim
 *   -- between Vector.java and package-info.java -->
 * identity-sensitive operations such as {@code ==} may yield
 * unpredictable results, or reduced performance. Oddly enough,
 * {@link Vector#equals(Object) v.equals(w)} is likely to be faster
 * than {@code v==w}, since {@code equals} is <em>not</em> an identity
 * sensitive method.
 *
 * Also, these objects can be stored in locals and parameters and as
 * {@code static final} constants, but storing them in other Java
 * fields or in array elements, while semantically valid, may incur
 * performance penalties.
 * <!-- The preceding paragraph is shared verbatim
 *   -- between Vector.java and package-info.java -->
 *
 * @param <E> the boxed version of {@code ETYPE},
 *           the element type of a vector
 *
 */
@SuppressWarnings("exports")
public abstract class Vector<E> extends jdk.internal.vm.vector.VectorSupport.Vector<E> {

    // This type is sealed within its package.
    // Users cannot roll their own vector types.
    Vector(Object bits) {
        super(bits);
    }

    /**
     * Returns the species of this vector.
     *
     * @return the species of this vector
     */
    public abstract VectorSpecies<E> species();

    /**
     * Returns the primitive <a href="Vector.html#ETYPE">element type</a>
     * ({@code ETYPE}) of this vector.
     *
     * @implSpec
     * This is the same value as {@code this.species().elementType()}.
     *
     * @return the primitive element type of this vector
     */
    public abstract Class<E> elementType();

    /**
     * Returns the size of each lane, in bits, of this vector.
     *
     * @implSpec
     * This is the same value as {@code this.species().elementSize()}.
     *
     * @return the lane size, in bits, of this vector
     */
    public abstract int elementSize();

    /**
     * Returns the shape of this vector.
     *
     * @implSpec
     * This is the same value as {@code this.species().vectorShape()}.
     *
     * @return the shape of this vector
     */
    public abstract VectorShape shape();

    /**
     * Returns the lane count, or <a href="Vector.html#VLENGTH">vector length</a>
     * ({@code VLENGTH}).
     *
     * @return the lane count
     */
    public abstract int length();

    /**
     * Returns the total size, in bits, of this vector.
     *
     * @implSpec
     * This is the same value as {@code this.shape().vectorBitSize()}.
     *
     * @return the total size, in bits, of this vector
     */
    public abstract int bitSize();

    /**
     * Returns the total size, in bytes, of this vector.
     *
     * @implSpec
     * This is the same value as {@code this.bitSize()/Byte.SIZE}.
     *
     * @return the total size, in bytes, of this vector
     */
    public abstract int byteSize();

    /// Arithmetic

    /**
     * Operates on the lane values of this vector.
     *
     * This is a <a href="Vector.html#lane-wise">lane-wise</a>
     * unary operation which applies
     * the selected operation to each lane.
     *
     * @apiNote
     * Subtypes improve on this method by sharpening
     * the method return type.
     *
     * @param op the operation used to process lane values
     * @return the result of applying the operation lane-wise
     *         to the input vector
     * @throws UnsupportedOperationException if this vector does
     *         not support the requested operation
     * @see VectorOperators#NEG
     * @see VectorOperators#NOT
     * @see VectorOperators#SIN
     * @see #lanewise(VectorOperators.Unary,VectorMask)
     * @see #lanewise(VectorOperators.Binary,Vector)
     * @see #lanewise(VectorOperators.Ternary,Vector,Vector)
     */
    public abstract Vector<E> lanewise(VectorOperators.Unary op);

    /**
     * Operates on the lane values of this vector,
     * with selection of lane elements controlled by a mask.
     *
     * This is a lane-wise unary operation which applies
     * the selected operation to each lane.
     *
     * @apiNote
     * Subtypes improve on this method by sharpening
     * the method return type.
     *
     * @param op the operation used to process lane values
     * @param m the mask controlling lane selection
     * @return the result of applying the operation lane-wise
     *         to the input vector
     * @throws UnsupportedOperationException if this vector does
     *         not support the requested operation
     * @see #lanewise(VectorOperators.Unary)
     */
    public abstract Vector<E> lanewise(VectorOperators.Unary op,
                                       VectorMask<E> m);

    /**
     * Combines the corresponding lane values of this vector
     * with those of a second input vector.
     *
     * This is a <a href="Vector.html#lane-wise">lane-wise</a>
     * binary operation which applies
     * the selected operation to each lane.
     *
     * @apiNote
     * Subtypes improve on this method by sharpening
     * the method return type.
1302 * 1303 * @param op the operation used to combine lane values 1304 * @param v the input vector 1305 * @return the result of applying the operation lane-wise 1306 * to the two input vectors 1307 * @throws UnsupportedOperationException if this vector does 1308 * not support the requested operation 1309 * @see VectorOperators#ADD 1310 * @see VectorOperators#XOR 1311 * @see VectorOperators#ATAN2 1312 * @see #lanewise(VectorOperators.Binary,Vector,VectorMask) 1313 * @see #lanewise(VectorOperators.Unary) 1314 * @see #lanewise(VectorOperators.Ternary,Vector, Vector) 1315 */ 1316 public abstract Vector<E> lanewise(VectorOperators.Binary op, 1317 Vector<E> v); 1318 1319 /** 1320 * Combines the corresponding lane values of this vector 1321 * with those of a second input vector, 1322 * with selection of lane elements controlled by a mask. 1323 * 1324 * This is a lane-wise binary operation which applies 1325 * the selected operation to each lane. 1326 * 1327 * @apiNote 1328 * Subtypes improve on this method by sharpening 1329 * the method return type. 1330 * 1331 * @param op the operation used to combine lane values 1332 * @param v the second input vector 1333 * @param m the mask controlling lane selection 1334 * @return the result of applying the operation lane-wise 1335 * to the two input vectors 1336 * @throws UnsupportedOperationException if this vector does 1337 * not support the requested operation 1338 * @see #lanewise(VectorOperators.Binary,Vector) 1339 */ 1340 public abstract Vector<E> lanewise(VectorOperators.Binary op, 1341 Vector<E> v, VectorMask<E> m); 1342 1343 /** 1344 * Combines the lane values of this vector 1345 * with the value of a broadcast scalar. 1346 * 1347 * This is a lane-wise binary operation which applies 1348 * the selected operation to each lane. 1349 * The return value will be equal to this expression: 1350 * {@code this.lanewise(op, this.broadcast(e))}. 
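     * <p>
     * A brief sketch of this equivalence, assuming an {@code int}
     * vector held in a hypothetical local named {@code a}:
     * <pre>{@code
     * Vector<Integer> a = IntVector.broadcast(IntVector.SPECIES_PREFERRED, 3);
     * Vector<Integer> r1 = a.lanewise(VectorOperators.ADD, 4L);
     * Vector<Integer> r2 = a.lanewise(VectorOperators.ADD, a.broadcast(4L));
     * // r1.equals(r2) is expected to be true
     * // a.lanewise(VectorOperators.ADD, 1L << 40) is expected to throw
     * // IllegalArgumentException, because 1L << 40 does not fit in an int
     * }</pre>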
1351 * 1352 * @apiNote 1353 * The {@code long} value {@code e} must be accurately 1354 * representable by the {@code ETYPE} of this vector's species, 1355 * so that {@code e==(long)(ETYPE)e}. This rule is enforced 1356 * by the implicit call to {@code broadcast()}. 1357 * <p> 1358 * Subtypes improve on this method by sharpening 1359 * the method return type and 1360 * the type of the scalar parameter {@code e}. 1361 * 1362 * @param op the operation used to combine lane values 1363 * @param e the input scalar 1364 * @return the result of applying the operation lane-wise 1365 * to the input vector and the scalar 1366 * @throws UnsupportedOperationException if this vector does 1367 * not support the requested operation 1368 * @throws IllegalArgumentException 1369 * if the given {@code long} value cannot 1370 * be represented by the right operand type 1371 * of the vector operation 1372 * @see #broadcast(long) 1373 * @see #lanewise(VectorOperators.Binary,long,VectorMask) 1374 */ 1375 public abstract Vector<E> lanewise(VectorOperators.Binary op, 1376 long e); 1377 1378 /** 1379 * Combines the lane values of this vector 1380 * with the value of a broadcast scalar, 1381 * with selection of lane elements controlled by a mask. 1382 * 1383 * This is a lane-wise binary operation which applies 1384 * the selected operation to each lane. 1385 * The second operand is a broadcast integral value. 1386 * The return value will be equal to this expression: 1387 * {@code this.lanewise(op, this.broadcast(e), m)}. 1388 * 1389 * @apiNote 1390 * The {@code long} value {@code e} must be accurately 1391 * representable by the {@code ETYPE} of this vector's species, 1392 * so that {@code e==(long)(ETYPE)e}. This rule is enforced 1393 * by the implicit call to {@code broadcast()}. 1394 * <p> 1395 * Subtypes improve on this method by sharpening 1396 * the method return type and 1397 * the type of the scalar parameter {@code e}.
1398 * 1399 * @param op the operation used to combine lane values 1400 * @param e the input scalar 1401 * @param m the mask controlling lane selection 1402 * @return the result of applying the operation lane-wise 1403 * to the input vector and the scalar 1404 * @throws UnsupportedOperationException if this vector does 1405 * not support the requested operation 1406 * @throws IllegalArgumentException 1407 * if the given {@code long} value cannot 1408 * be represented by the right operand type 1409 * of the vector operation 1410 * @see #broadcast(long) 1411 * @see #lanewise(VectorOperators.Binary,Vector,VectorMask) 1412 */ 1413 public abstract Vector<E> lanewise(VectorOperators.Binary op, 1414 long e, VectorMask<E> m); 1415 1416 /** 1417 * Combines the corresponding lane values of this vector 1418 * with the lanes of a second and a third input vector. 1419 * 1420 * This is a <a href="Vector.html#lane-wise">lane-wise</a> 1421 * ternary operation which applies 1422 * the selected operation to each lane. 1423 * 1424 * @apiNote 1425 * Subtypes improve on this method by sharpening 1426 * the method return type. 
1427 * 1428 * @param op the operation used to combine lane values 1429 * @param v1 the second input vector 1430 * @param v2 the third input vector 1431 * @return the result of applying the operation lane-wise 1432 * to the three input vectors 1433 * @throws UnsupportedOperationException if this vector does 1434 * not support the requested operation 1435 * @see VectorOperators#BITWISE_BLEND 1436 * @see VectorOperators#FMA 1437 * @see #lanewise(VectorOperators.Unary) 1438 * @see #lanewise(VectorOperators.Binary,Vector) 1439 * @see #lanewise(VectorOperators.Ternary,Vector,Vector,VectorMask) 1440 */ 1441 public abstract Vector<E> lanewise(VectorOperators.Ternary op, 1442 Vector<E> v1, 1443 Vector<E> v2); 1444 1445 /** 1446 * Combines the corresponding lane values of this vector 1447 * with the lanes of a second and a third input vector, 1448 * with selection of lane elements controlled by a mask. 1449 * 1450 * This is a lane-wise ternary operation which applies 1451 * the selected operation to each lane. 1452 * 1453 * @apiNote 1454 * Subtypes improve on this method by sharpening 1455 * the method return type. 1456 * 1457 * @param op the operation used to combine lane values 1458 * @param v1 the second input vector 1459 * @param v2 the third input vector 1460 * @param m the mask controlling lane selection 1461 * @return the result of applying the operation lane-wise 1462 * to the three input vectors 1463 * @throws UnsupportedOperationException if this vector does 1464 * not support the requested operation 1465 * @see #lanewise(VectorOperators.Ternary,Vector,Vector) 1466 */ 1467 public abstract Vector<E> lanewise(VectorOperators.Ternary op, 1468 Vector<E> v1, Vector<E> v2, 1469 VectorMask<E> m); 1470 1471 // Note: lanewise(Binary) has two rudimentary broadcast 1472 // operations from an approximate scalar type (long). 1473 // We can't do both with that, here, for lanewise(Ternary).
1474 // The vector subtypes supply a full suite of 1475 // broadcasting and masked lanewise operations 1476 // for their specific ETYPEs: 1477 // lanewise(Unary, [mask]) 1478 // lanewise(Binary, [e | v], [mask]) 1479 // lanewise(Ternary, [e1 | v1], [e2 | v2], [mask]) 1480 1481 /// Full-service binary ops: ADD, SUB, MUL, DIV 1482 1483 // Full-service functions support all four variations 1484 // of vector vs. broadcast scalar, and mask vs. not. 1485 // The lanewise generic operator is (by this definition) 1486 // also a full-service function. 1487 1488 // Other named functions handle just the one named 1489 // variation. Most lanewise operations are *not* named, 1490 // and are reached only by lanewise. 1491 1492 /** 1493 * Adds this vector to a second input vector. 1494 * 1495 * This is a lane-wise binary operation which applies 1496 * the primitive addition operation ({@code +}) 1497 * to each pair of corresponding lane values. 1498 * 1499 * This method is also equivalent to the expression 1500 * {@link #lanewise(VectorOperators.Binary,Vector) 1501 * lanewise}{@code (}{@link VectorOperators#ADD 1502 * ADD}{@code , v)}. 1503 * 1504 * <p> 1505 * As a full-service named operation, this method 1506 * comes in masked and unmasked overloadings, and 1507 * (in subclasses) also comes in scalar-broadcast 1508 * overloadings (both masked and unmasked). 1509 * 1510 * @param v a second input vector 1511 * @return the result of adding this vector to the second input vector 1512 * @see #add(Vector,VectorMask) 1513 * @see IntVector#add(int) 1514 * @see VectorOperators#ADD 1515 * @see #lanewise(VectorOperators.Binary,Vector) 1516 * @see IntVector#lanewise(VectorOperators.Binary,int) 1517 */ 1518 public abstract Vector<E> add(Vector<E> v); 1519 1520 /** 1521 * Adds this vector to a second input vector, selecting lanes 1522 * under the control of a mask. 
1523 * 1524 * This is a masked lane-wise binary operation which applies 1525 * the primitive addition operation ({@code +}) 1526 * to each pair of corresponding lane values. 1527 * 1528 * For any lane unset in the mask, the primitive operation is 1529 * suppressed and this vector retains the original value stored in 1530 * that lane. 1531 * 1532 * This method is also equivalent to the expression 1533 * {@link #lanewise(VectorOperators.Binary,Vector,VectorMask) 1534 * lanewise}{@code (}{@link VectorOperators#ADD 1535 * ADD}{@code , v, m)}. 1536 * 1537 * <p> 1538 * As a full-service named operation, this method 1539 * comes in masked and unmasked overloadings, and 1540 * (in subclasses) also comes in scalar-broadcast 1541 * overloadings (both masked and unmasked). 1542 * 1543 * @param v the second input vector 1544 * @param m the mask controlling lane selection 1545 * @return the result of adding this vector to the given vector 1546 * @see #add(Vector) 1547 * @see IntVector#add(int,VectorMask) 1548 * @see VectorOperators#ADD 1549 * @see #lanewise(VectorOperators.Binary,Vector,VectorMask) 1550 * @see IntVector#lanewise(VectorOperators.Binary,int,VectorMask) 1551 */ 1552 public abstract Vector<E> add(Vector<E> v, VectorMask<E> m); 1553 1554 /** 1555 * Subtracts a second input vector from this vector. 1556 * 1557 * This is a lane-wise binary operation which applies 1558 * the primitive subtraction operation ({@code -}) 1559 * to each pair of corresponding lane values. 1560 * 1561 * This method is also equivalent to the expression 1562 * {@link #lanewise(VectorOperators.Binary,Vector) 1563 * lanewise}{@code (}{@link VectorOperators#SUB 1564 * SUB}{@code , v)}. 1565 * 1566 * <p> 1567 * As a full-service named operation, this method 1568 * comes in masked and unmasked overloadings, and 1569 * (in subclasses) also comes in scalar-broadcast 1570 * overloadings (both masked and unmasked). 
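     * <p>
     * For illustration only, the four variations as they might appear
     * when the receiver is an {@code IntVector} (the local names
     * {@code a}, {@code b}, and {@code m} are hypothetical):
     * <pre>{@code
     * IntVector a = ..., b = ...;
     * VectorMask<Integer> m = ...;
     * IntVector r1 = a.sub(b);     // vector operand, unmasked
     * IntVector r2 = a.sub(b, m);  // vector operand, masked
     * IntVector r3 = a.sub(2);     // broadcast scalar, unmasked
     * IntVector r4 = a.sub(2, m);  // broadcast scalar, masked
     * }</pre>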
1571 * 1572 * @param v a second input vector 1573 * @return the result of subtracting the second input vector from this vector 1574 * @see #sub(Vector,VectorMask) 1575 * @see IntVector#sub(int) 1576 * @see VectorOperators#SUB 1577 * @see #lanewise(VectorOperators.Binary,Vector) 1578 * @see IntVector#lanewise(VectorOperators.Binary,int) 1579 */ 1580 public abstract Vector<E> sub(Vector<E> v); 1581 1582 /** 1583 * Subtracts a second input vector from this vector 1584 * under the control of a mask. 1585 * 1586 * This is a masked lane-wise binary operation which applies 1587 * the primitive subtraction operation ({@code -}) 1588 * to each pair of corresponding lane values. 1589 * 1590 * For any lane unset in the mask, the primitive operation is 1591 * suppressed and this vector retains the original value stored in 1592 * that lane. 1593 * 1594 * This method is also equivalent to the expression 1595 * {@link #lanewise(VectorOperators.Binary,Vector,VectorMask) 1596 * lanewise}{@code (}{@link VectorOperators#SUB 1597 * SUB}{@code , v, m)}. 1598 * 1599 * <p> 1600 * As a full-service named operation, this method 1601 * comes in masked and unmasked overloadings, and 1602 * (in subclasses) also comes in scalar-broadcast 1603 * overloadings (both masked and unmasked). 1604 * 1605 * @param v the second input vector 1606 * @param m the mask controlling lane selection 1607 * @return the result of subtracting the second input vector from this vector 1608 * @see #sub(Vector) 1609 * @see IntVector#sub(int,VectorMask) 1610 * @see VectorOperators#SUB 1611 * @see #lanewise(VectorOperators.Binary,Vector,VectorMask) 1612 * @see IntVector#lanewise(VectorOperators.Binary,int,VectorMask) 1613 */ 1614 public abstract Vector<E> sub(Vector<E> v, VectorMask<E> m); 1615 1616 /** 1617 * Multiplies this vector by a second input vector. 
1618 * 1619 * This is a lane-wise binary operation which applies 1620 * the primitive multiplication operation ({@code *}) 1621 * to each pair of corresponding lane values. 1622 * 1623 * This method is also equivalent to the expression 1624 * {@link #lanewise(VectorOperators.Binary,Vector) 1625 * lanewise}{@code (}{@link VectorOperators#MUL 1626 * MUL}{@code , v)}. 1627 * 1628 * <p> 1629 * As a full-service named operation, this method 1630 * comes in masked and unmasked overloadings, and 1631 * (in subclasses) also comes in scalar-broadcast 1632 * overloadings (both masked and unmasked). 1633 * 1634 * @param v a second input vector 1635 * @return the result of multiplying this vector by the second input vector 1636 * @see #mul(Vector,VectorMask) 1637 * @see IntVector#mul(int) 1638 * @see VectorOperators#MUL 1639 * @see #lanewise(VectorOperators.Binary,Vector) 1640 * @see IntVector#lanewise(VectorOperators.Binary,int) 1641 */ 1642 public abstract Vector<E> mul(Vector<E> v); 1643 1644 /** 1645 * Multiplies this vector by a second input vector 1646 * under the control of a mask. 1647 * 1648 * This is a lane-wise binary operation which applies 1649 * the primitive multiplication operation ({@code *}) 1650 * to each pair of corresponding lane values. 1651 * 1652 * For any lane unset in the mask, the primitive operation is 1653 * suppressed and this vector retains the original value stored in 1654 * that lane. 1655 * 1656 * This method is also equivalent to the expression 1657 * {@link #lanewise(VectorOperators.Binary,Vector,VectorMask) 1658 * lanewise}{@code (}{@link VectorOperators#MUL 1659 * MUL}{@code , v, m)}. 1660 * 1661 * <p> 1662 * As a full-service named operation, this method 1663 * comes in masked and unmasked overloadings, and 1664 * (in subclasses) also comes in scalar-broadcast 1665 * overloadings (both masked and unmasked). 
1666 * 1667 * @param v the second input vector 1668 * @param m the mask controlling lane selection 1669 * @return the result of multiplying this vector by the given vector 1670 * @see #mul(Vector) 1671 * @see IntVector#mul(int,VectorMask) 1672 * @see VectorOperators#MUL 1673 * @see #lanewise(VectorOperators.Binary,Vector,VectorMask) 1674 * @see IntVector#lanewise(VectorOperators.Binary,int,VectorMask) 1675 */ 1676 public abstract Vector<E> mul(Vector<E> v, VectorMask<E> m); 1677 1678 /** 1679 * Divides this vector by a second input vector. 1680 * 1681 * This is a lane-wise binary operation which applies 1682 * the primitive division operation ({@code /}) 1683 * to each pair of corresponding lane values. 1684 * 1685 * This method is also equivalent to the expression 1686 * {@link #lanewise(VectorOperators.Binary,Vector) 1687 * lanewise}{@code (}{@link VectorOperators#DIV 1688 * DIV}{@code , v)}. 1689 * 1690 * <p> 1691 * As a full-service named operation, this method 1692 * comes in masked and unmasked overloadings, and 1693 * (in subclasses) also comes in scalar-broadcast 1694 * overloadings (both masked and unmasked). 1695 * 1696 * @apiNote If the underlying scalar operator does not support 1697 * division by zero, but is presented with a zero divisor, 1698 * an {@code ArithmeticException} will be thrown. 1699 * 1700 * @param v a second input vector 1701 * @return the result of dividing this vector by the second input vector 1702 * @throws ArithmeticException if any lane 1703 * in {@code v} is zero 1704 * and {@code ETYPE} is not {@code float} or {@code double}. 1705 * @see #div(Vector,VectorMask) 1706 * @see DoubleVector#div(double) 1707 * @see VectorOperators#DIV 1708 * @see #lanewise(VectorOperators.Binary,Vector) 1709 * @see IntVector#lanewise(VectorOperators.Binary,int) 1710 */ 1711 public abstract Vector<E> div(Vector<E> v); 1712 1713 /** 1714 * Divides this vector by a second input vector 1715 * under the control of a mask. 
1716 * 1717 * This is a lane-wise binary operation which applies 1718 * the primitive division operation ({@code /}) 1719 * to each pair of corresponding lane values. 1720 * 1721 * For any lane unset in the mask, the primitive operation is 1722 * suppressed and this vector retains the original value stored in 1723 * that lane. 1724 * 1725 * This method is also equivalent to the expression 1726 * {@link #lanewise(VectorOperators.Binary,Vector,VectorMask) 1727 * lanewise}{@code (}{@link VectorOperators#DIV 1728 * DIV}{@code , v, m)}. 1729 * 1730 * <p> 1731 * As a full-service named operation, this method 1732 * comes in masked and unmasked overloadings, and 1733 * (in subclasses) also comes in scalar-broadcast 1734 * overloadings (both masked and unmasked). 1735 * 1736 * @apiNote If the underlying scalar operator does not support 1737 * division by zero, but is presented with a zero divisor, 1738 * an {@code ArithmeticException} will be thrown. 1739 * 1740 * @param v a second input vector 1741 * @param m the mask controlling lane selection 1742 * @return the result of dividing this vector by the second input vector 1743 * @throws ArithmeticException if any lane selected by {@code m} 1744 * in {@code v} is zero 1745 * and {@code ETYPE} is not {@code float} or {@code double}. 1746 * @see #div(Vector) 1747 * @see DoubleVector#div(double,VectorMask) 1748 * @see VectorOperators#DIV 1749 * @see #lanewise(VectorOperators.Binary,Vector,VectorMask) 1750 * @see DoubleVector#lanewise(VectorOperators.Binary,double,VectorMask) 1751 */ 1752 public abstract Vector<E> div(Vector<E> v, VectorMask<E> m); 1753 1754 /// END OF FULL-SERVICE BINARY METHODS 1755 1756 /// Non-full-service unary ops: NEG, ABS 1757 1758 /** 1759 * Negates this vector. 1760 * 1761 * This is a lane-wise unary operation which applies 1762 * the primitive negation operation ({@code -x}) 1763 * to each input lane. 
1764 * 1765 * This method is also equivalent to the expression 1766 * {@link #lanewise(VectorOperators.Unary) 1767 * lanewise}{@code (}{@link VectorOperators#NEG 1768 * NEG}{@code )}. 1769 * 1770 * @apiNote 1771 * This method has no masked variant, but the corresponding 1772 * masked operation can be obtained from the 1773 * {@linkplain #lanewise(VectorOperators.Unary,VectorMask) 1774 * lanewise method}. 1775 * 1776 * @return the negation of this vector 1777 * @see VectorOperators#NEG 1778 * @see #lanewise(VectorOperators.Unary) 1779 * @see #lanewise(VectorOperators.Unary,VectorMask) 1780 */ 1781 public abstract Vector<E> neg(); 1782 1783 /** 1784 * Returns the absolute value of this vector. 1785 * 1786 * This is a lane-wise unary operation which applies 1787 * the method {@code Math.abs} 1788 * to each input lane. 1789 * 1790 * This method is also equivalent to the expression 1791 * {@link #lanewise(VectorOperators.Unary) 1792 * lanewise}{@code (}{@link VectorOperators#ABS 1793 * ABS}{@code )}. 1794 * 1795 * @apiNote 1796 * This method has no masked variant, but the corresponding 1797 * masked operation can be obtained from the 1798 * {@linkplain #lanewise(VectorOperators.Unary,VectorMask) 1799 * lanewise method}. 1800 * 1801 * @return the absolute value of this vector 1802 * @see VectorOperators#ABS 1803 * @see #lanewise(VectorOperators.Unary) 1804 * @see #lanewise(VectorOperators.Unary,VectorMask) 1805 */ 1806 public abstract Vector<E> abs(); 1807 1808 /// Non-full-service binary ops: MIN, MAX 1809 1810 /** 1811 * Computes the smaller of this vector and a second input vector. 1812 * 1813 * This is a lane-wise binary operation which applies the 1814 * operation {@code Math.min()} to each pair of 1815 * corresponding lane values. 1816 * 1817 * This method is also equivalent to the expression 1818 * {@link #lanewise(VectorOperators.Binary,Vector) 1819 * lanewise}{@code (}{@link VectorOperators#MIN 1820 * MIN}{@code , v)}. 
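     * <p>
     * A short sketch with hypothetical locals, assuming an
     * {@code IntVector} receiver, showing the named method next to
     * the generic {@code lanewise} forms:
     * <pre>{@code
     * IntVector a = ..., b = ...;
     * VectorMask<Integer> m = ...;
     * IntVector r1 = a.min(b);
     * IntVector r2 = a.lanewise(VectorOperators.MIN, b);     // expected to equal r1
     * IntVector r3 = a.lanewise(VectorOperators.MIN, b, m);  // masked form, via lanewise only
     * }</pre>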
1821 * 1822 * @apiNote 1823 * This is not a full-service named operation like 1824 * {@link #add(Vector) add()}. A masked version of 1825 * this operation is not directly available 1826 * but may be obtained via the masked version of 1827 * {@code lanewise}. Subclasses define an additional 1828 * scalar-broadcast overloading of this method. 1829 * 1830 * @param v a second input vector 1831 * @return the lanewise minimum of this vector and the second input vector 1832 * @see IntVector#min(int) 1833 * @see VectorOperators#MIN 1834 * @see #lanewise(VectorOperators.Binary,Vector) 1835 * @see #lanewise(VectorOperators.Binary,Vector,VectorMask) 1836 */ 1837 public abstract Vector<E> min(Vector<E> v); 1838 1839 /** 1840 * Computes the larger of this vector and a second input vector. 1841 * 1842 * This is a lane-wise binary operation which applies the 1843 * operation {@code Math.max()} to each pair of 1844 * corresponding lane values. 1845 * 1846 * This method is also equivalent to the expression 1847 * {@link #lanewise(VectorOperators.Binary,Vector) 1848 * lanewise}{@code (}{@link VectorOperators#MAX 1849 * MAX}{@code , v)}. 1850 * 1851 * <p> 1852 * This is not a full-service named operation like 1853 * {@link #add(Vector) add()}. A masked version of 1854 * this operation is not directly available 1855 * but may be obtained via the masked version of 1856 * {@code lanewise}. Subclasses define an additional 1857 * scalar-broadcast overloading of this method. 1858 * 1859 * @param v a second input vector 1860 * @return the lanewise maximum of this vector and the second input vector 1861 * @see IntVector#max(int) 1862 * @see VectorOperators#MAX 1863 * @see #lanewise(VectorOperators.Binary,Vector) 1864 * @see #lanewise(VectorOperators.Binary,Vector,VectorMask) 1865 */ 1866 public abstract Vector<E> max(Vector<E> v); 1867 1868 // Reductions 1869 1870 /** 1871 * Returns a value accumulated from all the lanes of this vector. 
1872 * 1873 * This is an associative cross-lane reduction operation which 1874 * applies the specified operation to all the lane elements. 1875 * The return value will be equal to this expression: 1876 * {@code (long) ((EVector)this).reduceLanes(op)}, where {@code EVector} 1877 * is the vector class specific to this vector's element type 1878 * {@code ETYPE}. 1879 * <p> 1880 * In the case of operations {@code ADD} and {@code MUL}, 1881 * when {@code ETYPE} is {@code float} or {@code double}, 1882 * the precise result, before casting, will reflect the choice 1883 * of an arbitrary order of operations, which may even vary over time. 1884 * For further details see the section 1885 * <a href="VectorOperators.html#fp_assoc">Operations on floating point vectors</a>. 1886 * 1887 * @apiNote 1888 * If the {@code ETYPE} is {@code float} or {@code double}, 1889 * this operation can lose precision and/or range, as a 1890 * normal part of casting the result down to {@code long}. 1891 * 1892 * Usually 1893 * {@linkplain IntVector#reduceLanes(VectorOperators.Associative) 1894 * strongly typed access} 1895 * is preferable, if you are working with a vector 1896 * subtype that has a known element type. 1897 * 1898 * @param op the operation used to combine lane values 1899 * @return the accumulated result, cast to {@code long} 1900 * @throws UnsupportedOperationException if this vector does 1901 * not support the requested operation 1902 * @see #reduceLanesToLong(VectorOperators.Associative,VectorMask) 1903 * @see IntVector#reduceLanes(VectorOperators.Associative) 1904 * @see FloatVector#reduceLanes(VectorOperators.Associative) 1905 */ 1906 public abstract long reduceLanesToLong(VectorOperators.Associative op); 1907 1908 /** 1909 * Returns a value accumulated from selected lanes of this vector, 1910 * controlled by a mask. 1911 * 1912 * This is an associative cross-lane reduction operation which 1913 * applies the specified operation to the selected lane elements. 
1914 * The return value will be equal to this expression: 1915 * {@code (long) ((EVector)this).reduceLanes(op, m)}, where {@code EVector} 1916 * is the vector class specific to this vector's element type 1917 * {@code ETYPE}. 1918 * <p> 1919 * If no elements are selected, an operation-specific identity 1920 * value is returned. 1921 * <ul> 1922 * <li> 1923 * If the operation is {@code ADD}, {@code XOR}, or {@code OR}, 1924 * then the identity value is zero. 1925 * <li> 1926 * If the operation is {@code MUL}, 1927 * then the identity value is one. 1928 * <li> 1929 * If the operation is {@code AND}, 1930 * then the identity value is minus one (all bits set). 1931 * <li> 1932 * If the operation is {@code MAX}, 1933 * then the identity value is the {@code MIN_VALUE} 1934 * of the vector's native {@code ETYPE}. 1935 * (In the case of floating point types, the value 1936 * {@code NEGATIVE_INFINITY} is used, and will appear 1937 * after casting as {@code Long.MIN_VALUE}.) 1938 * <li> 1939 * If the operation is {@code MIN}, 1940 * then the identity value is the {@code MAX_VALUE} 1941 * of the vector's native {@code ETYPE}. 1942 * (In the case of floating point types, the value 1943 * {@code POSITIVE_INFINITY} is used, and will appear 1944 * after casting as {@code Long.MAX_VALUE}.) 1945 * </ul> 1946 * <p> 1947 * In the case of operations {@code ADD} and {@code MUL}, 1948 * when {@code ETYPE} is {@code float} or {@code double}, 1949 * the precise result, before casting, will reflect the choice 1950 * of an arbitrary order of operations, which may even vary over time. 1951 * For further details see the section 1952 * <a href="VectorOperators.html#fp_assoc">Operations on floating point vectors</a>. 1953 * 1954 * @apiNote 1955 * If the {@code ETYPE} is {@code float} or {@code double}, 1956 * this operation can lose precision and/or range, as a 1957 * normal part of casting the result down to {@code long}.
1958 * 1959 * Usually 1960 * {@linkplain IntVector#reduceLanes(VectorOperators.Associative,VectorMask) 1961 * strongly typed access} 1962 * is preferable, if you are working with a vector 1963 * subtype that has a known element type. 1964 * 1965 * @param op the operation used to combine lane values 1966 * @param m the mask controlling lane selection 1967 * @return the reduced result accumulated from the selected lane values 1968 * @throws UnsupportedOperationException if this vector does 1969 * not support the requested operation 1970 * @see #reduceLanesToLong(VectorOperators.Associative) 1971 * @see IntVector#reduceLanes(VectorOperators.Associative,VectorMask) 1972 * @see FloatVector#reduceLanes(VectorOperators.Associative,VectorMask) 1973 */ 1974 public abstract long reduceLanesToLong(VectorOperators.Associative op, 1975 VectorMask<E> m); 1976 1977 // Lanewise unary tests 1978 1979 /** 1980 * Tests the lanes of this vector 1981 * according to the given operation. 1982 * 1983 * This is a lane-wise unary test operation which applies 1984 * the given test operation 1985 * to each lane value. 1986 * @param op the operation used to test lane values 1987 * @return the mask result of testing the lanes of this vector, 1988 * according to the selected test operator 1989 * @see VectorOperators.Comparison 1990 * @see #test(VectorOperators.Test, VectorMask) 1991 * @see #compare(VectorOperators.Comparison, Vector) 1992 */ 1993 public abstract VectorMask<E> test(VectorOperators.Test op); 1994 1995 /** 1996 * Tests selected lanes of this vector, 1997 * according to the given operation. 1998 * 1999 * This is a masked lane-wise unary test operation which applies 2000 * the given test operation 2001 * to each lane value. 2002 * 2003 * The returned result is equal to the expression 2004 * {@code test(op).and(m)}.
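     * <p>
     * A small sketch of this identity, using hypothetical locals and
     * assuming a {@code float} vector:
     * <pre>{@code
     * FloatVector a = ...;
     * VectorMask<Float> m = ...;
     * VectorMask<Float> t1 = a.test(VectorOperators.IS_NAN, m);
     * VectorMask<Float> t2 = a.test(VectorOperators.IS_NAN).and(m);
     * // t1.equals(t2) is expected to be true
     * }</pre>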
2005 * 2006 * @param op the operation used to test lane values 2007 * @param m the mask controlling lane selection 2008 * @return the mask result of testing the lanes of this vector, 2009 * according to the selected test operator, 2010 * and only in the lanes selected by the mask 2011 * @see #test(VectorOperators.Test) 2012 */ 2013 public abstract VectorMask<E> test(VectorOperators.Test op, 2014 VectorMask<E> m); 2015 2016 // Comparisons 2017 2018 /** 2019 * Tests if this vector is equal to another input vector. 2020 * 2021 * This is a lane-wise binary test operation which applies 2022 * the primitive equals operation ({@code ==}) 2023 * to each pair of corresponding lane values. 2024 * The result is the same as {@code compare(VectorOperators.EQ, v)}. 2025 * 2026 * @param v a second input vector 2027 * @return the mask result of testing lane-wise if this vector 2028 * is equal to the second input vector 2029 * @see #compare(VectorOperators.Comparison,Vector) 2030 * @see VectorOperators#EQ 2031 * @see #equals 2032 */ 2033 public abstract VectorMask<E> eq(Vector<E> v); 2034 2035 /** 2036 * Tests if this vector is less than another input vector. 2037 * 2038 * This is a lane-wise binary test operation which applies 2039 * the primitive less-than operation ({@code <}) to each lane. 2040 * The result is the same as {@code compare(VectorOperators.LT, v)}. 2041 * 2042 * @param v a second input vector 2043 * @return the mask result of testing lane-wise if this vector 2044 * is less than the second input vector 2045 * @see #compare(VectorOperators.Comparison,Vector) 2046 * @see VectorOperators#LT 2047 */ 2048 public abstract VectorMask<E> lt(Vector<E> v); 2049 2050 /** 2051 * Tests this vector by comparing it with another input vector, 2052 * according to the given comparison operation. 2053 * 2054 * This is a lane-wise binary test operation which applies 2055 * the given comparison operation 2056 * to each pair of corresponding lane values.
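     * <p>
     * As an informal example (the local names are illustrative and
     * an {@code IntVector} receiver is assumed), a comparison mask is
     * commonly passed on to a masked operation such as
     * {@link #blend(Vector,VectorMask) blend()}:
     * <pre>{@code
     * IntVector a = ..., b = ...;
     * VectorMask<Integer> gt = a.compare(VectorOperators.GT, b);
     * IntVector larger = b.blend(a, gt);  // lane of a where a > b, otherwise lane of b
     * }</pre>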
2057 * 2058 * @param op the operation used to compare lane values 2059 * @param v a second input vector 2060 * @return the mask result of testing lane-wise if this vector 2061 * compares to the input, according to the selected 2062 * comparison operator 2063 * @see #eq(Vector) 2064 * @see #lt(Vector) 2065 * @see VectorOperators.Comparison 2066 * @see #compare(VectorOperators.Comparison, Vector, VectorMask) 2067 * @see #test(VectorOperators.Test) 2068 */ 2069 public abstract VectorMask<E> compare(VectorOperators.Comparison op, 2070 Vector<E> v); 2071 2072 /** 2073 * Tests this vector by comparing it with another input vector, 2074 * according to the given comparison operation, 2075 * in lanes selected by a mask. 2076 * 2077 * This is a masked lane-wise binary test operation which applies 2078 * the given comparison operation 2079 * to each pair of corresponding lane values. 2080 * 2081 * The returned result is equal to the expression 2082 * {@code compare(op,v).and(m)}. 2083 * 2084 * @param op the operation used to compare lane values 2085 * @param v a second input vector 2086 * @param m the mask controlling lane selection 2087 * @return the mask result of testing lane-wise if this vector 2088 * compares to the input, according to the selected 2089 * comparison operator, 2090 * and only in the lanes selected by the mask 2091 * @see #compare(VectorOperators.Comparison, Vector) 2092 */ 2093 public abstract VectorMask<E> compare(VectorOperators.Comparison op, 2094 Vector<E> v, 2095 VectorMask<E> m); 2096 2097 /** 2098 * Tests this vector by comparing it with an input scalar, 2099 * according to the given comparison operation. 2100 * 2101 * This is a lane-wise binary test operation which applies 2102 * the given comparison operation 2103 * to each lane value, paired with the broadcast value. 2104 * 2105 * <p> 2106 * The result is the same as 2107 * {@code this.compare(op, this.broadcast(e))}. 
2108 * That is, the scalar may be regarded as broadcast to 2109 * a vector of the same species, and then compared 2110 * against the original vector, using the selected 2111 * comparison operation. 2112 * 2113 * @apiNote 2114 * The {@code long} value {@code e} must be accurately 2115 * representable by the {@code ETYPE} of this vector's species, 2116 * so that {@code e==(long)(ETYPE)e}. This rule is enforced 2117 * by the implicit call to {@code broadcast()}. 2118 * <p> 2119 * Subtypes improve on this method by sharpening 2120 * the type of the scalar parameter {@code e}. 2121 * 2122 * @param op the operation used to compare lane values 2123 * @param e the input scalar 2124 * @return the mask result of testing lane-wise if this vector 2125 * compares to the input, according to the selected 2126 * comparison operator 2127 * @throws IllegalArgumentException 2128 * if the given {@code long} value cannot 2129 * be represented by the vector's {@code ETYPE} 2130 * @see #broadcast(long) 2131 * @see #compare(VectorOperators.Comparison,Vector) 2132 */ 2133 public abstract VectorMask<E> compare(VectorOperators.Comparison op, 2134 long e); 2135 2136 /** 2137 * Tests this vector by comparing it with an input scalar, 2138 * according to the given comparison operation, 2139 * in lanes selected by a mask. 2140 * 2141 * This is a masked lane-wise binary test operation which applies 2142 * the given comparison operation 2143 * to each lane value, paired with the broadcast value. 2144 * 2145 * The returned result is equal to the expression 2146 * {@code compare(op,e).and(m)}. 2147 * 2148 * @apiNote 2149 * The {@code long} value {@code e} must be accurately 2150 * representable by the {@code ETYPE} of this vector's species, 2151 * so that {@code e==(long)(ETYPE)e}. This rule is enforced 2152 * by the implicit call to {@code broadcast()}. 2153 * <p> 2154 * Subtypes improve on this method by sharpening 2155 * the type of the scalar parameter {@code e}. 
     *
     * @param op the operation used to compare lane values
     * @param e the input scalar
     * @param m the mask controlling lane selection
     * @return the mask result of testing lane-wise if this vector
     *         compares to the input, according to the selected
     *         comparison operator,
     *         and only in the lanes selected by the mask
     * @throws IllegalArgumentException
     *         if the given {@code long} value cannot
     *         be represented by the vector's {@code ETYPE}
     * @see #broadcast(long)
     * @see #compare(VectorOperators.Comparison,Vector)
     */
    public abstract VectorMask<E> compare(VectorOperators.Comparison op,
                                          long e,
                                          VectorMask<E> m);

    /**
     * Replaces selected lanes of this vector with
     * corresponding lanes from a second input vector
     * under the control of a mask.
     *
     * This is a masked lane-wise binary operation which
     * selects each lane value from one or the other input.
     *
     * <ul>
     * <li>
     * For any lane <em>set</em> in the mask, the new lane value
     * is taken from the second input vector, and replaces
     * whatever value was in that lane of this vector.
     * <li>
     * For any lane <em>unset</em> in the mask, the replacement is
     * suppressed and this vector retains the original value stored in
     * that lane.
     * </ul>
     *
     * The following pseudocode illustrates this behavior:
     * <pre>{@code
     * Vector<E> a = ...;
     * VectorSpecies<E> species = a.species();
     * Vector<E> b = ...;
     * b.check(species);
     * VectorMask<E> m = ...;
     * ETYPE[] ar = a.toArray();
     * for (int i = 0; i < ar.length; i++) {
     *     if (m.laneIsSet(i)) {
     *         ar[i] = b.lane(i);
     *     }
     * }
     * return EVector.fromArray(species, ar, 0);
     * }</pre>
     *
     * @param v the second input vector, containing replacement lane values
     * @param m the mask controlling lane selection from the second input vector
     * @return the result of blending the lane elements of this vector with
     *         those of the second input vector
     */
    public abstract Vector<E> blend(Vector<E> v, VectorMask<E> m);

    /**
     * Replaces selected lanes of this vector with
     * a scalar value
     * under the control of a mask.
     *
     * This is a masked lane-wise binary operation which
     * selects each lane value from one or the other input.
     *
     * The returned result is equal to the expression
     * {@code blend(broadcast(e),m)}.
     *
     * @apiNote
     * The {@code long} value {@code e} must be accurately
     * representable by the {@code ETYPE} of this vector's species,
     * so that {@code e==(long)(ETYPE)e}. This rule is enforced
     * by the implicit call to {@code broadcast()}.
     * <p>
     * Subtypes improve on this method by sharpening
     * the type of the scalar parameter {@code e}.
     *
     * @param e the input scalar, containing the replacement lane value
     * @param m the mask controlling lane selection of the scalar
     * @return the result of blending the lane elements of this vector with
     *         the scalar value
     */
    public abstract Vector<E> blend(long e, VectorMask<E> m);

    /**
     * Adds the lanes of this vector to their corresponding
     * lane numbers, scaled by a given constant.
     *
     * This is a lane-wise unary operation which, for
     * each lane {@code N}, computes the scaled index value
     * {@code N*scale} and adds it to the value already
     * in lane {@code N} of the current vector.
     *
     * <p> The scale must not be so large, and the element size must
     * not be so small, that there would be an overflow when
     * computing any of the {@code N*scale} or {@code VLENGTH*scale},
     * when the result is represented using the vector
     * lane type {@code ETYPE}.
     *
     * <p>
     * The following pseudocode illustrates this behavior:
     * <pre>{@code
     * Vector<E> a = ...;
     * VectorSpecies<E> species = a.species();
     * ETYPE[] ar = a.toArray();
     * for (int i = 0; i < ar.length; i++) {
     *     long d = (long)i * scale;
     *     if (d != (ETYPE) d) throw ...;
     *     ar[i] += (ETYPE) d;
     * }
     * long d = (long)ar.length * scale;
     * if (d != (ETYPE) d) throw ...;
     * return EVector.fromArray(species, ar, 0);
     * }</pre>
     *
     * @param scale the number to multiply by each lane index
     *        {@code N}, typically {@code 1}
     * @return the result of incrementing each lane element by its
     *         corresponding lane index {@code N}, scaled by {@code scale}
     * @throws IllegalArgumentException
     *         if the values in the interval
     *         {@code [0..VLENGTH*scale]}
     *         are not representable by the {@code ETYPE}
     */
    public abstract Vector<E> addIndex(int scale);

    // Slicing segments of adjacent lanes

    /**
     * Slices a segment of adjacent lanes, starting at a given
     * {@code origin} lane in the current vector, and continuing (as
     * needed) into an immediately following vector. The block of
     * {@code VLENGTH} lanes is extracted into its own vector and
     * returned.
     *
     * <p> This is a cross-lane operation that shifts lane elements
     * to the front, from the current vector and the second vector.
     * Both vectors can be viewed as a combined "background" of length
     * {@code 2*VLENGTH}, from which a slice is extracted.
     *
     * The lane numbered {@code N} in the output vector is copied
     * from lane {@code origin+N} of the input vector, if that
     * lane exists, else from lane {@code origin+N-VLENGTH} of
     * the second vector (which is guaranteed to exist).
     *
     * <p> The {@code origin} value must be in the inclusive range
     * {@code 0..VLENGTH}. As limiting cases, {@code v.slice(0,w)}
     * and {@code v.slice(VLENGTH,w)} return {@code v} and {@code w},
     * respectively.
     *
     * @apiNote
     *
     * This method may be regarded as the inverse of
     * {@link #unslice(int,Vector,int) unslice()},
     * in that the sliced value could be unsliced back into its
     * original position in the two input vectors, without
     * disturbing unrelated elements, as in the following
     * pseudocode:
     * <pre>{@code
     * EVector slice = v1.slice(origin, v2);
     * EVector w1 = slice.unslice(origin, v1, 0);
     * EVector w2 = slice.unslice(origin, v2, 1);
     * assert v1.equals(w1);
     * assert v2.equals(w2);
     * }</pre>
     *
     * <p> This method also supports a variety of cross-lane shifts and
     * rotates as follows:
     * <ul>
     *
     * <li>To shift lanes forward to the front of the vector, supply a
     * zero vector for the second operand and specify the shift count
     * as the origin. For example: {@code v.slice(shift, v.broadcast(0))}.
     *
     * <li>To shift lanes backward to the back of the vector, supply a
     * zero vector for the <em>first</em> operand, and specify the
     * negative shift count as the origin (modulo {@code VLENGTH}).
     * For example: {@code v.broadcast(0).slice(v.length()-shift, v)}.
     *
     * <li>To rotate lanes forward toward the front end of the vector,
     * cycling the earliest lanes around to the back, supply the same
     * vector for both operands and specify the rotate count as the
     * origin. For example: {@code v.slice(rotate, v)}.
     *
     * <li>To rotate lanes backward toward the back end of the vector,
     * cycling the latest lanes around to the front, supply the same
     * vector for both operands and specify the negative of the rotate
     * count (modulo {@code VLENGTH}) as the origin. For example:
     * {@code v.slice(v.length() - rotate, v)}.
     *
     * <li>
     * Since {@code origin} values less than zero or more than
     * {@code VLENGTH} will be rejected, if you need to rotate
     * by an unpredictable multiple of {@code VLENGTH}, be sure
     * to reduce the origin value into the required range.
     * The {@link VectorSpecies#loopBound(int) loopBound()}
     * method can help with this. For example:
     * {@code v.slice(rotate - v.species().loopBound(rotate), v)}.
     *
     * </ul>
     *
     * @param origin the first input lane to transfer into the slice
     * @param v1 a second vector logically concatenated with the first,
     *        before the slice is taken (if omitted it defaults to zero)
     * @return a contiguous slice of {@code VLENGTH} lanes, taken from
     *         this vector starting at the indicated origin, and
     *         continuing (as needed) into the second vector
     * @throws ArrayIndexOutOfBoundsException if {@code origin}
     *         is negative or greater than {@code VLENGTH}
     * @see #slice(int,Vector,VectorMask)
     * @see #slice(int)
     * @see #unslice(int,Vector,int)
     */
    public abstract Vector<E> slice(int origin, Vector<E> v1);

    /**
     * Slices a segment of adjacent lanes
     * under the control of a mask,
     * starting at a given
     * {@code origin} lane in the current vector, and continuing (as
     * needed) into an immediately following vector.
     * The block of
     * {@code VLENGTH} lanes is extracted into its own vector and
     * returned.
     *
     * The resulting vector will be zero in all lanes unset in the
     * given mask. Lanes set in the mask will contain data copied
     * from selected lanes of {@code this} or {@code v1}.
     *
     * <p> This is a cross-lane operation that shifts lane elements
     * to the front, from the current vector and the second vector.
     * Both vectors can be viewed as a combined "background" of length
     * {@code 2*VLENGTH}, from which a slice is extracted.
     *
     * The returned result is equal to the expression
     * {@code broadcast(0).blend(slice(origin,v1),m)}.
     *
     * @apiNote
     * This method may be regarded as the inverse of
     * {@link #unslice(int,Vector,int,VectorMask) unslice()},
     * in that the sliced value could be unsliced back into its
     * original position in the two input vectors, without
     * disturbing unrelated elements, as in the following
     * pseudocode:
     * <pre>{@code
     * EVector slice = v1.slice(origin, v2, m);
     * EVector w1 = slice.unslice(origin, v1, 0, m);
     * EVector w2 = slice.unslice(origin, v2, 1, m);
     * assert v1.equals(w1);
     * assert v2.equals(w2);
     * }</pre>
     *
     * @param origin the first input lane to transfer into the slice
     * @param v1 a second vector logically concatenated with the first,
     *        before the slice is taken (if omitted it defaults to zero)
     * @param m the mask controlling lane selection into the resulting vector
     * @return a contiguous slice of {@code VLENGTH} lanes, taken from
     *         this vector starting at the indicated origin, and
     *         continuing (as needed) into the second vector
     * @throws ArrayIndexOutOfBoundsException if {@code origin}
     *         is negative or greater than {@code VLENGTH}
     * @see #slice(int,Vector)
     * @see #unslice(int,Vector,int,VectorMask)
     */
    // This doesn't pull its weight, but it's symmetrical
    // with masked unslice, and might cause questions if missing.
    // It could make for clearer code.
    public abstract Vector<E> slice(int origin, Vector<E> v1, VectorMask<E> m);

    /**
     * Slices a segment of adjacent lanes, starting at a given
     * {@code origin} lane in the current vector. A block of
     * {@code VLENGTH} lanes, possibly padded with zero lanes, is
     * extracted into its own vector and returned.
     *
     * This is a convenience method which slices from a single
     * vector against an extended background of zero lanes.
     * It is equivalent to
     * {@link #slice(int,Vector) slice}{@code
     * (origin, }{@link #broadcast(long) broadcast}{@code (0))}.
     * It may also be viewed simply as a cross-lane shift
     * from later to earlier lanes, with zeroes filling
     * in the vacated lanes at the end of the vector.
     * In this view, the shift count is {@code origin}.
     *
     * @param origin the first input lane to transfer into the slice
     * @return the last {@code VLENGTH-origin} input lanes,
     *         placed starting in the first lane of the output,
     *         padded at the end with zeroes
     * @throws ArrayIndexOutOfBoundsException if {@code origin}
     *         is negative or greater than {@code VLENGTH}
     * @see #slice(int,Vector)
     * @see #unslice(int,Vector,int)
     */
    // This API point pulls its weight as a teaching aid,
    // though it's a one-off and broadcast(0) is easy.
    public abstract Vector<E> slice(int origin);

    /**
     * Reverses a {@linkplain #slice(int,Vector) slice()}, inserting
     * the current vector as a slice within another "background" input
     * vector, which is regarded as one or the other input to a
     * hypothetical subsequent {@code slice()} operation.
     *
     * <p> This is a cross-lane operation that permutes the lane
     * elements of the current vector toward the back and inserts them
     * into a logical pair of background vectors.
     * Only one of the
     * pair will be returned, however. The background is formed by
     * duplicating the second input vector. (However, the output will
     * never contain two duplicates from the same input lane.)
     *
     * The lane numbered {@code N} in the input vector is copied into
     * lane {@code origin+N} of the first background vector, if that
     * lane exists, else into lane {@code origin+N-VLENGTH} of the
     * second background vector (which is guaranteed to exist).
     *
     * The first or second background vector, updated with the
     * inserted slice, is returned. The {@code part} number of zero
     * or one selects the first or second updated background vector.
     *
     * <p> The {@code origin} value must be in the inclusive range
     * {@code 0..VLENGTH}. As limiting cases, {@code v.unslice(0,w,0)}
     * and {@code v.unslice(VLENGTH,w,1)} both return {@code v}, while
     * {@code v.unslice(0,w,1)} and {@code v.unslice(VLENGTH,w,0)}
     * both return {@code w}.
     *
     * @apiNote
     * This method supports a variety of cross-lane insertion
     * operations as follows:
     * <ul>
     *
     * <li>To insert near the end of a background vector {@code w}
     * at some offset, specify the offset as the origin and
     * select part zero. For example: {@code v.unslice(offset, w, 0)}.
     *
     * <li>To insert near the end of a background vector {@code w},
     * but capturing the overflow into the next vector {@code x},
     * specify the offset as the origin and select part one.
     * For example: {@code v.unslice(offset, x, 1)}.
     *
     * <li>To insert the last {@code N} items near the beginning
     * of a background vector {@code w}, supply {@code VLENGTH-N}
     * as the origin and select part one.
     * For example: {@code v.unslice(v.length()-N, w, 1)}.
     *
     * </ul>
     *
     * @param origin the first output lane to receive the slice
     * @param w the background vector that (as two copies) will receive
     *        the inserted slice
     * @param part the part number of the result (either zero or one)
     * @return either the first or second part of a pair of
     *         background vectors {@code w}, updated by inserting
     *         this vector at the indicated origin
     * @throws ArrayIndexOutOfBoundsException if {@code origin}
     *         is negative or greater than {@code VLENGTH},
     *         or if {@code part} is not zero or one
     * @see #slice(int,Vector)
     * @see #unslice(int,Vector,int,VectorMask)
     */
    public abstract Vector<E> unslice(int origin, Vector<E> w, int part);

    /**
     * Reverses a {@linkplain #slice(int,Vector) slice()}, inserting
     * (under the control of a mask)
     * the current vector as a slice within another "background" input
     * vector, which is regarded as one or the other input to a
     * hypothetical subsequent {@code slice()} operation.
     *
     * <p> This is a cross-lane operation that permutes the lane
     * elements of the current vector forward and inserts its lanes
     * (when selected by the mask) into a logical pair of background
     * vectors. As with the
     * {@linkplain #unslice(int,Vector,int) unmasked version} of this method,
     * only one of the pair will be returned, as selected by the
     * {@code part} number.
     *
     * For each lane {@code N} selected by the mask, the lane value
     * is copied into
     * lane {@code origin+N} of the first background vector, if that
     * lane exists, else into lane {@code origin+N-VLENGTH} of the
     * second background vector (which is guaranteed to exist).
     * Background lanes retain their original values if the
     * corresponding input lanes {@code N} are unset in the mask.
     *
     * The first or second background vector, updated with set lanes
     * of the inserted slice, is returned. The {@code part} number of
     * zero or one selects the first or second updated background
     * vector.
     *
     * @param origin the first output lane to receive the slice
     * @param w the background vector that (as two copies) will receive
     *        the lanes of the inserted slice that are set in {@code m}
     * @param part the part number of the result (either zero or one)
     * @param m the mask controlling lane selection from the current vector
     * @return either the first or second part of a pair of
     *         background vectors {@code w}, updated by inserting
     *         selected lanes of this vector at the indicated origin
     * @throws ArrayIndexOutOfBoundsException if {@code origin}
     *         is negative or greater than {@code VLENGTH},
     *         or if {@code part} is not zero or one
     * @see #unslice(int,Vector,int)
     * @see #slice(int,Vector)
     */
    public abstract Vector<E> unslice(int origin, Vector<E> w, int part, VectorMask<E> m);

    /**
     * Reverses a {@linkplain #slice(int) slice()}, inserting
     * the current vector as a slice within a "background" input
     * of zero lane values. Compared to other {@code unslice()}
     * methods, this method only returns the first of the
     * pair of background vectors.
     *
     * This is a convenience method which returns the result of
     * {@link #unslice(int,Vector,int) unslice}{@code
     * (origin, }{@link #broadcast(long) broadcast}{@code (0), 0)}.
     * It may also be viewed simply as a cross-lane shift
     * from earlier to later lanes, with zeroes filling
     * in the vacated lanes at the beginning of the vector.
     * In this view, the shift count is {@code origin}.
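     * <p>
     * The lane-copying rules for {@code slice()} and {@code unslice()}
     * can be sketched with a scalar model over plain {@code int[]}
     * arrays. This is an illustrative approximation, not the
     * implementation of this API: the class {@code SliceModel} and its
     * methods are hypothetical, and each array stands in for a vector
     * of {@code VLENGTH} lanes.

```java
import java.util.Arrays;

// Hypothetical scalar model of unmasked slice() and unslice().
// Arrays of equal length stand in for vectors of VLENGTH lanes.
public final class SliceModel {
    private SliceModel() {}

    // out[n] comes from a[origin+n] if that lane exists,
    // else from b[origin+n-VLENGTH].
    public static int[] slice(int[] a, int origin, int[] b) {
        int vlen = a.length;
        if (origin < 0 || origin > vlen) {
            throw new ArrayIndexOutOfBoundsException("origin " + origin);
        }
        int[] out = new int[vlen];
        for (int n = 0; n < vlen; n++) {
            int i = origin + n;
            out[n] = (i < vlen) ? a[i] : b[i - vlen];
        }
        return out;
    }

    // Inserts v at the given origin into two logical copies of w
    // and returns the copy selected by part (0 or 1).
    public static int[] unslice(int[] v, int origin, int[] w, int part) {
        int vlen = v.length;
        if (origin < 0 || origin > vlen || part < 0 || part > 1) {
            throw new ArrayIndexOutOfBoundsException("origin " + origin + ", part " + part);
        }
        int[] out = w.clone();
        for (int n = 0; n < vlen; n++) {
            int i = origin + n;
            if (part == 0 && i < vlen) {
                out[i] = v[n];          // lands in the first background copy
            } else if (part == 1 && i >= vlen) {
                out[i - vlen] = v[n];   // overflows into the second copy
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int[] v1 = {10, 11, 12, 13};
        int[] v2 = {20, 21, 22, 23};
        int[] s = slice(v1, 1, v2);
        System.out.println(Arrays.toString(s));                     // [11, 12, 13, 20]
        System.out.println(Arrays.toString(unslice(s, 1, v1, 0)));  // [10, 11, 12, 13]
        System.out.println(Arrays.toString(unslice(s, 1, v2, 1)));  // [20, 21, 22, 23]
    }
}
```

     * In this model, slicing or unslicing against an all-zero array
     * gives the shift-with-zero-fill view described for the
     * single-argument methods.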
     *
     * @param origin the first output lane to receive the slice
     * @return the first {@code VLENGTH-origin} input lanes,
     *         placed starting at the given origin,
     *         padded at the beginning with zeroes
     * @throws ArrayIndexOutOfBoundsException if {@code origin}
     *         is negative or greater than {@code VLENGTH}
     * @see #unslice(int,Vector,int)
     * @see #slice(int)
     */
    // This API point pulls its weight as a teaching aid,
    // though it's a one-off and broadcast(0) is easy.
    public abstract Vector<E> unslice(int origin);

    // ISSUE: Add a slice which uses a mask instead of an origin?
    //public abstract Vector<E> slice(VectorMask<E> support);

    // ISSUE: Add some more options for questionable edge conditions?
    // We might define enum EdgeOption { ERROR, ZERO, WRAP } for the
    // default of throwing AIOOBE, or substituting zeroes, or just
    // reducing the out-of-bounds index modulo VLENGTH. Similar
    // concerns also apply to general Shuffle operations. For now,
    // just support ERROR, since that is safest.

    /**
     * Rearranges the lane elements of this vector, selecting lanes
     * under the control of a specific shuffle.
     *
     * This is a cross-lane operation that rearranges the lane
     * elements of this vector.
     *
     * For each lane {@code N} of the shuffle, and for each lane
     * source index {@code I=s.laneSource(N)} in the shuffle,
     * the output lane {@code N} obtains the value from
     * the input vector at lane {@code I}.
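     * <p>
     * As a rough scalar illustration of this rule (not the API itself),
     * the sketch below treats the shuffle as an {@code int[]} of source
     * indexes and the vector as an {@code int[]} of lanes. The class
     * {@code RearrangeModel} is hypothetical, and it assumes the two
     * arrays have the same length.

```java
import java.util.Arrays;

// Hypothetical scalar model of single-vector rearrange():
// output lane n takes the value of input lane sources[n].
public final class RearrangeModel {
    private RearrangeModel() {}

    public static int[] rearrange(int[] in, int[] sources) {
        int[] out = new int[in.length];
        for (int n = 0; n < out.length; n++) {
            int i = sources[n];
            // Indexes outside 0..VLENGTH-1 are treated as exceptional here.
            if (i < 0 || i >= in.length) {
                throw new IndexOutOfBoundsException("source index " + i + " at lane " + n);
            }
            out[n] = in[i];
        }
        return out;
    }

    public static void main(String[] args) {
        int[] v = {10, 11, 12, 13};
        // Reverse the lanes.
        System.out.println(Arrays.toString(rearrange(v, new int[] {3, 2, 1, 0})));
        // Duplicate lanes 0 and 2.
        System.out.println(Arrays.toString(rearrange(v, new int[] {0, 0, 2, 2})));
    }
}
```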
     *
     * @param s the shuffle controlling lane index selection
     * @return the rearrangement of the lane elements of this vector
     * @throws IndexOutOfBoundsException if there are any exceptional
     *         source indexes in the shuffle
     * @see #rearrange(VectorShuffle,VectorMask)
     * @see #rearrange(VectorShuffle,Vector)
     * @see VectorShuffle#laneIsValid()
     */
    public abstract Vector<E> rearrange(VectorShuffle<E> s);

    /**
     * Rearranges the lane elements of this vector, selecting lanes
     * under the control of a specific shuffle and a mask.
     *
     * This is a cross-lane operation that rearranges the lane
     * elements of this vector.
     *
     * For each lane {@code N} of the shuffle, and for each lane
     * source index {@code I=s.laneSource(N)} in the shuffle,
     * the output lane {@code N} obtains the value from
     * the input vector at lane {@code I} if the mask is set.
     * Otherwise the output lane {@code N} is set to zero.
     *
     * <p> This method returns the value of this pseudocode:
     * <pre>{@code
     * Vector<E> r = this.rearrange(s.wrapIndexes());
     * VectorMask<E> valid = s.laneIsValid();
     * if (m.andNot(valid).anyTrue()) throw ...;
     * return broadcast(0).blend(r, m);
     * }</pre>
     *
     * @param s the shuffle controlling lane index selection
     * @param m the mask controlling application of the shuffle
     * @return the rearrangement of the lane elements of this vector
     * @throws IndexOutOfBoundsException if there are any exceptional
     *         source indexes in the shuffle where the mask is set
     * @see #rearrange(VectorShuffle)
     * @see #rearrange(VectorShuffle,Vector)
     * @see VectorShuffle#laneIsValid()
     */
    public abstract Vector<E> rearrange(VectorShuffle<E> s, VectorMask<E> m);

    /**
     * Rearranges the lane elements of two vectors, selecting lanes
     * under the control of a specific shuffle, using both normal and
     * exceptional indexes in the shuffle to steer data.
     *
     * This is a cross-lane operation that rearranges the lane
     * elements of the two input vectors (the current vector
     * and a second vector {@code v}).
     *
     * For each lane {@code N} of the shuffle, and for each lane
     * source index {@code I=s.laneSource(N)} in the shuffle,
     * the output lane {@code N} obtains the value from
     * the first vector at lane {@code I} if {@code I>=0}.
     * Otherwise, the exceptional index {@code I} is wrapped
     * by adding {@code VLENGTH} to it and used to index
     * the <em>second</em> vector, at index {@code I+VLENGTH}.
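     * <p>
     * The steering rule for two inputs can be sketched with a scalar
     * model, again using {@code int[]} arrays in place of vectors and
     * of the shuffle. The class {@code TwoVectorRearrangeModel} is
     * hypothetical, and the sketch assumes every source index lies in
     * the range {@code -VLENGTH..VLENGTH-1}.

```java
import java.util.Arrays;

// Hypothetical scalar model of two-vector rearrange():
// a non-negative index selects from the first input, and a negative
// (exceptional) index is wrapped by adding VLENGTH and selects from
// the second input.
public final class TwoVectorRearrangeModel {
    private TwoVectorRearrangeModel() {}

    public static int[] rearrange(int[] first, int[] sources, int[] second) {
        int vlen = first.length;
        int[] out = new int[vlen];
        for (int n = 0; n < vlen; n++) {
            int i = sources[n];
            // Assumption of this sketch: -vlen <= i < vlen.
            out[n] = (i >= 0) ? first[i] : second[i + vlen];
        }
        return out;
    }

    public static void main(String[] args) {
        int[] a = {10, 11, 12, 13};
        int[] b = {20, 21, 22, 23};
        // Interleave the low halves of a and b.
        int[] sources = {0, -4, 1, -3};
        System.out.println(Arrays.toString(rearrange(a, sources, b))); // [10, 20, 11, 21]
    }
}
```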
     *
     * <p> This method returns the value of this pseudocode:
     * <pre>{@code
     * Vector<E> r1 = this.rearrange(s.wrapIndexes());
     * // or else: r1 = this.rearrange(s, s.laneIsValid());
     * Vector<E> r2 = v.rearrange(s.wrapIndexes());
     * return r2.blend(r1,s.laneIsValid());
     * }</pre>
     *
     * @param s the shuffle controlling lane selection from both input vectors
     * @param v the second input vector
     * @return the rearrangement of lane elements of this vector and
     *         a second input vector
     * @see #rearrange(VectorShuffle)
     * @see #rearrange(VectorShuffle,VectorMask)
     * @see VectorShuffle#laneIsValid()
     * @see #slice(int,Vector)
     */
    public abstract Vector<E> rearrange(VectorShuffle<E> s, Vector<E> v);

    /**
     * Using index values stored in the lanes of this vector,
     * assemble values stored in a second vector {@code v}.
     * The second vector thus serves as a table, whose
     * elements are selected by indexes in the current vector.
     *
     * This is a cross-lane operation that rearranges the lane
     * elements of the argument vector, under the control of
     * this vector.
     *
     * For each lane {@code N} of this vector, and for each lane
     * value {@code I=this.lane(N)} in this vector,
     * the output lane {@code N} obtains the value from
     * the argument vector at lane {@code I}.
     *
     * In this way, the result contains only values stored in the
     * argument vector {@code v}, but presented in an order which
     * depends on the index values in {@code this}.
     *
     * The result is the same as the expression
     * {@code v.rearrange(this.toShuffle())}.
     *
     * @param v the vector supplying the result values
     * @return the rearrangement of the lane elements of {@code v}
     * @throws IndexOutOfBoundsException if any invalid
     *         source indexes are found in {@code this}
     * @see #rearrange(VectorShuffle)
     */
    public abstract Vector<E> selectFrom(Vector<E> v);

    /**
     * Using index values stored in the lanes of this vector,
     * assemble values stored in a second vector {@code v},
     * under the control of a mask.
     * The second vector thus serves as a table, whose
     * elements are selected by indexes in the current vector.
     * Lanes that are unset in the mask receive a
     * zero rather than a value from the table.
     *
     * This is a cross-lane operation that rearranges the lane
     * elements of the argument vector, under the control of
     * this vector and the mask.
     *
     * The result is the same as the expression
     * {@code v.rearrange(this.toShuffle(), m)}.
     *
     * @param v the vector supplying the result values
     * @param m the mask controlling selection from {@code v}
     * @return the rearrangement of the lane elements of {@code v}
     * @throws IndexOutOfBoundsException if any invalid
     *         source indexes are found in {@code this},
     *         in a lane which is set in the mask
     * @see #selectFrom(Vector)
     * @see #rearrange(VectorShuffle,VectorMask)
     */
    public abstract Vector<E> selectFrom(Vector<E> v, VectorMask<E> m);

    // Conversions

    /**
     * Returns a vector of the same species as this one
     * where all lane elements are set to
     * the primitive value {@code e}.
     *
     * The contents of the current vector are discarded;
     * only the species is relevant to this operation.
     *
     * <p> This method returns the value of this expression:
     * {@code EVector.broadcast(this.species(), (ETYPE)e)}, where
     * {@code EVector} is the vector class specific to this
     * vector's element type {@code ETYPE}.
     *
     * <p>
     * The {@code long} value {@code e} must be accurately
     * representable by the {@code ETYPE} of this vector's species,
     * so that {@code e==(long)(ETYPE)e}.
     *
     * If this rule is violated the problem is not detected
     * statically, but an {@code IllegalArgumentException} is thrown
     * at run-time. Thus, this method somewhat weakens the static
     * type checking of immediate constants and other scalars, but it
     * makes up for this by improving the expressiveness of the
     * generic API. Note that an {@code e} value in the range
     * {@code [-128..127]} is always acceptable, since every
     * {@code ETYPE} will accept every {@code byte} value.
     *
     * @apiNote
     * Subtypes improve on this method by sharpening
     * the method return type
     * and the type of the scalar parameter {@code e}.
     *
     * @param e the value to broadcast
     * @return a vector where all lane elements are set to
     *         the primitive value {@code e}
     * @throws IllegalArgumentException
     *         if the given {@code long} value cannot
     *         be represented by the vector's {@code ETYPE}
     * @see VectorSpecies#broadcast(long)
     * @see IntVector#broadcast(int)
     * @see FloatVector#broadcast(float)
     */
    public abstract Vector<E> broadcast(long e);

    /**
     * Returns a mask of the same species as this vector,
     * where each lane is set or unset according to a given
     * single boolean, which is broadcast to all lanes.
     * <p>
     * This method returns the value of this expression:
     * {@code species().maskAll(bit)}.
     *
     * @param bit the given mask bit to be replicated
     * @return a mask where each lane is set or unset according to
     *         the given bit
     * @see VectorSpecies#maskAll(boolean)
     */
    public abstract VectorMask<E> maskAll(boolean bit);

    /**
     * Converts this vector into a shuffle, converting the lane values
     * to {@code int} and regarding them as source indexes.
     * <p>
     * This method behaves as if it returns the result of creating a shuffle
     * given an array of the vector elements, as follows:
     * <pre>{@code
     * long[] a = this.toLongArray();
     * int[] sa = new int[a.length];
     * for (int i = 0; i < a.length; i++) {
     *     sa[i] = (int) a[i];
     * }
     * return VectorShuffle.fromValues(this.species(), sa);
     * }</pre>
     *
     * @return a shuffle representation of this vector
     * @see VectorShuffle#fromValues(VectorSpecies,int...)
     */
    public abstract VectorShuffle<E> toShuffle();

    // Bitwise preserving

    /**
     * Transforms this vector to a vector of the given species of
     * element type {@code F}, reinterpreting the bytes of this
     * vector without performing any value conversions.
     *
     * <p> Depending on the selected species, this operation may
     * either <a href="Vector.html#expansion">expand or contract</a>
     * its logical result, in which case a non-zero {@code part}
     * number can further control the selection and steering of the
     * logical result into the physical output vector.
     *
     * <p>
     * The underlying bits of this vector are copied to the resulting
     * vector without modification, but those bits, before copying,
     * may be truncated if this vector's bit-size is greater than the
     * desired vector's bit-size, or filled with zero bits if this
     * vector's bit-size is less than the desired vector's bit-size.
     *
     * <p> If the old and new species have different shape, this is a
     * <em>shape-changing</em> operation, and may have special
     * implementation costs.
     *
     * <p> The method behaves as if this vector is stored into a byte
     * buffer or array using little-endian byte ordering and then the
     * desired vector is loaded from the same byte buffer or array
     * using the same ordering.
     *
     * <p> The following pseudocode illustrates the behavior:
     * <pre>{@code
     * int domSize = this.byteSize();
     * int ranSize = species.vectorByteSize();
     * int M = (domSize > ranSize ? domSize / ranSize : ranSize / domSize);
     * assert Math.abs(part) < M;
     * assert (part == 0) || (part > 0) == (domSize > ranSize);
     * byte[] ra = new byte[Math.max(domSize, ranSize)];
     * if (domSize > ranSize) {  // expansion
     *     this.intoByteArray(ra, 0, ByteOrder.LITTLE_ENDIAN);
     *     int origin = part * ranSize;
     *     return species.fromByteArray(ra, origin, ByteOrder.LITTLE_ENDIAN);
     * } else {  // contraction or size-invariant
     *     int origin = (-part) * domSize;
     *     this.intoByteArray(ra, origin, ByteOrder.LITTLE_ENDIAN);
     *     return species.fromByteArray(ra, 0, ByteOrder.LITTLE_ENDIAN);
     * }
     * }</pre>
     *
     * @apiNote Although this method is defined as if the vectors in
     * question were loaded or stored into memory, memory semantics
     * has little or nothing to do with the actual implementation.
     * The appeal to little-endian ordering is simply a shorthand
     * for what could otherwise be a large number of detailed rules
     * concerning the mapping between lane-structured vectors and
     * byte-structured vectors.
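     * <p>
     * To make the little-endian shorthand concrete, the following
     * sketch models a same-size reinterpretation with
     * {@code java.nio.ByteBuffer} instead of vectors. It is only an
     * analogy under that assumption: the class
     * {@code ReinterpretModel} and its methods are hypothetical, and
     * {@code int[]} and {@code byte[]} arrays stand in for lanes.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

// Hypothetical scalar model of byte-wise reinterpretation:
// lanes are written little-endian and the same bytes are read back
// under a different lane type, with no value conversion.
public final class ReinterpretModel {
    private ReinterpretModel() {}

    // int lanes viewed as byte lanes.
    public static byte[] intsToBytes(int[] lanes) {
        ByteBuffer buf = ByteBuffer.allocate(lanes.length * Integer.BYTES)
                                   .order(ByteOrder.LITTLE_ENDIAN);
        for (int lane : lanes) {
            buf.putInt(lane);
        }
        return buf.array();
    }

    // byte lanes assembled back into int lanes.
    public static int[] bytesToInts(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN);
        int[] lanes = new int[bytes.length / Integer.BYTES];
        for (int n = 0; n < lanes.length; n++) {
            lanes[n] = buf.getInt();
        }
        return lanes;
    }

    public static void main(String[] args) {
        byte[] bytes = intsToBytes(new int[] {0x04030201});
        System.out.println(Arrays.toString(bytes));               // [1, 2, 3, 4]
        System.out.println(Arrays.toString(bytesToInts(bytes)));  // [67305985]
    }
}
```

     * The least significant byte of each {@code int} lane appears
     * first, which is the byte-to-lane mapping that the appeal to
     * little-endian order describes.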
     *
     * @param species the desired vector species
     * @param part the <a href="Vector.html#expansion">part number</a>
     *        of the result, or zero if neither expanding nor contracting
     * @param <F> the boxed element type of the species
     * @return a vector transformed, by shape and element type, from this vector
     * @see Vector#convertShape(VectorOperators.Conversion,VectorSpecies,int)
     * @see Vector#castShape(VectorSpecies,int)
     * @see VectorSpecies#partLimit(VectorSpecies,boolean)
     */
    public abstract <F> Vector<F> reinterpretShape(VectorSpecies<F> species, int part);

    /**
     * Views this vector as a vector of the same shape
     * and contents but a lane type of {@code byte},
     * where the bytes are extracted from the lanes
     * according to little-endian order.
     * It is a convenience method for the expression
     * {@code reinterpretShape(species().withLanes(byte.class), 0)}.
     * It may be considered an inverse to the various
     * methods which consolidate bytes into larger lanes
     * within the same vector, such as
     * {@link Vector#reinterpretAsInts()}.
     *
     * @return a {@code ByteVector} with the same shape and information content
     * @see Vector#reinterpretShape(VectorSpecies,int)
     * @see IntVector#intoByteArray(byte[], int, ByteOrder)
     * @see FloatVector#intoByteArray(byte[], int, ByteOrder)
     * @see VectorSpecies#withLanes(Class)
     */
    public abstract ByteVector reinterpretAsBytes();

    /**
     * Reinterprets this vector as a vector of the same shape
     * and contents but a lane type of {@code short},
     * where the lanes are assembled from successive bytes
     * according to little-endian order.
     * It is a convenience method for the expression
     * {@code reinterpretShape(species().withLanes(short.class), 0)}.
     * It may be considered an inverse to {@link Vector#reinterpretAsBytes()}.
     *
     * @return a {@code ShortVector} with the same shape and information content
     */
    public abstract ShortVector reinterpretAsShorts();

    /**
     * Reinterprets this vector as a vector of the same shape
     * and contents but a lane type of {@code int},
     * where the lanes are assembled from successive bytes
     * according to little-endian order.
     * It is a convenience method for the expression
     * {@code reinterpretShape(species().withLanes(int.class))}.
     * It may be considered an inverse to {@link Vector#reinterpretAsBytes()}.
     *
     * @return an {@code IntVector} with the same shape and information content
     */
    public abstract IntVector reinterpretAsInts();

    /**
     * Reinterprets this vector as a vector of the same shape
     * and contents but a lane type of {@code long},
     * where the lanes are assembled from successive bytes
     * according to little-endian order.
     * It is a convenience method for the expression
     * {@code reinterpretShape(species().withLanes(long.class))}.
     * It may be considered an inverse to {@link Vector#reinterpretAsBytes()}.
     *
     * @return a {@code LongVector} with the same shape and information content
     */
    public abstract LongVector reinterpretAsLongs();

    /**
     * Reinterprets this vector as a vector of the same shape
     * and contents but a lane type of {@code float},
     * where the lanes are assembled from successive bytes
     * according to little-endian order.
     * It is a convenience method for the expression
     * {@code reinterpretShape(species().withLanes(float.class))}.
     * It may be considered an inverse to {@link Vector#reinterpretAsBytes()}.
     *
     * @return a {@code FloatVector} with the same shape and information content
     */
    public abstract FloatVector reinterpretAsFloats();

    /**
     * Reinterprets this vector as a vector of the same shape
     * and contents but a lane type of {@code double},
     * where the lanes are assembled from successive bytes
     * according to little-endian order.
     * It is a convenience method for the expression
     * {@code reinterpretShape(species().withLanes(double.class))}.
     * It may be considered an inverse to {@link Vector#reinterpretAsBytes()}.
     *
     * @return a {@code DoubleVector} with the same shape and information content
     */
    public abstract DoubleVector reinterpretAsDoubles();

    /**
     * Views this vector as a vector of the same shape, length, and
     * contents, but a lane type that is not a floating-point type.
     *
     * This is a lane-wise reinterpretation cast on the lane values.
     * As such, this method does not change {@code VSHAPE} or
     * {@code VLENGTH}, and there is no change to the bitwise contents
     * of the vector.  If the vector's {@code ETYPE} is already an
     * integral type, the same vector is returned unchanged.
     *
     * This method returns the value of this expression:
     * {@code convert(conv,0)}, where {@code conv} is
     * {@code VectorOperators.Conversion.ofReinterpret(E.class,F.class)},
     * and {@code F} is the non-floating-point type of the
     * same size as {@code E}.
     *
     * @apiNote
     * Subtypes improve on this method by sharpening
     * the return type.
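     *
     * <p> A minimal sketch, for illustration only (the species and
     * lane value are assumptions of this example): viewing
     * {@code float} lanes as integral lanes exposes their IEEE 754
     * bit patterns without changing them.
     * <pre>{@code
     * FloatVector fv = FloatVector.broadcast(FloatVector.SPECIES_128, 1.0f);
     * IntVector bits = fv.viewAsIntegralLanes();
     * // each lane of bits holds Float.floatToRawIntBits(1.0f), i.e. 0x3F800000
     * }</pre>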
     *
     * @return the original vector, reinterpreted as non-floating point
     * @see VectorOperators.Conversion#ofReinterpret(Class,Class)
     * @see Vector#convert(VectorOperators.Conversion,int)
     */
    public abstract Vector<?> viewAsIntegralLanes();

    /**
     * Views this vector as a vector of the same shape, length, and
     * contents, but a lane type that is a floating-point type.
     *
     * This is a lane-wise reinterpretation cast on the lane values.
     * As such, this method does not change {@code VSHAPE} or
     * {@code VLENGTH}, and there is no change to the bitwise contents
     * of the vector.  If the vector's {@code ETYPE} is already a
     * floating-point type, the same vector is returned unchanged.
     *
     * If the vector's element size does not match any floating-point
     * type size, an {@code UnsupportedOperationException} is thrown.
     *
     * This method returns the value of this expression:
     * {@code convert(conv,0)}, where {@code conv} is
     * {@code VectorOperators.Conversion.ofReinterpret(E.class,F.class)},
     * and {@code F} is the floating-point type of the
     * same size as {@code E}, if any.
     *
     * @apiNote
     * Subtypes improve on this method by sharpening
     * the return type.
     *
     * @return the original vector, reinterpreted as floating point
     * @throws UnsupportedOperationException if there is no floating point
     *         type the same size as the lanes of this vector
     * @see VectorOperators.Conversion#ofReinterpret(Class,Class)
     * @see Vector#convert(VectorOperators.Conversion,int)
     */
    public abstract Vector<?> viewAsFloatingLanes();

    /**
     * Converts this vector to a vector of the same shape and a new
     * element type, converting lane values from the current {@code ETYPE}
     * to a new lane type (called {@code FTYPE} here) according to the
     * indicated {@linkplain VectorOperators.Conversion conversion}.
     *
     * This is a lane-wise shape-invariant operation which copies
     * {@code ETYPE} values from the input vector to corresponding
     * {@code FTYPE} values in the result.  Depending on the selected
     * conversion, this operation may either
     * <a href="Vector.html#expansion">expand or contract</a> its
     * logical result, in which case a non-zero {@code part} number
     * can further control the selection and steering of the logical
     * result into the physical output vector.
     *
     * <p> Each specific conversion is described by a conversion
     * constant in the class {@link VectorOperators}.  Each conversion
     * operator has a specified {@linkplain
     * VectorOperators.Conversion#domainType() domain type} and
     * {@linkplain VectorOperators.Conversion#rangeType() range type}.
     * The domain type must exactly match the lane type of the input
     * vector, while the range type determines the lane type of the
     * output vector.
     *
     * <p> A conversion operator may be classified as (respectively)
     * in-place, expanding, or contracting, depending on whether the
     * bit-size of its domain type is (respectively) equal, less than,
     * or greater than the bit-size of its range type.
     *
     * <p> Independently, conversion operations can also be classified
     * as reinterpreting or value-transforming, depending on whether
     * the conversion copies representation bits unchanged, or changes
     * the representation bits in order to retain (part or all of)
     * the logical value of the input value.
     *
     * <p> If a reinterpreting conversion contracts, it will truncate the
     * upper bits of the input.  If it expands, it will pad upper bits
     * of the output with zero bits, when there are no corresponding
     * input bits.
     *
     * <p> An expanding conversion such as {@code S2I} ({@code short}
     * value to {@code int}) takes a scalar value and represents it
     * in a larger format (always with some information redundancy).
     *
     * A contracting conversion such as {@code D2F} ({@code double}
     * value to {@code float}) takes a scalar value and represents it
     * in a smaller format (always with some information loss).
     *
     * Some in-place conversions may also include information loss,
     * such as {@code L2D} ({@code long} value to {@code double})
     * or {@code F2I} ({@code float} value to {@code int}).
     *
     * Reinterpreting in-place conversions are not lossy, unless the
     * bitwise value is somehow not legal in the output type.
     * Converting the bit-pattern of a {@code NaN} may discard bits
     * from the {@code NaN}'s significand.
     *
     * <p> This classification is important, because, unless otherwise
     * documented, conversion operations <em>never change vector
     * shape</em>, regardless of how they may change <em>lane sizes</em>.
     *
     * Therefore an <em>expanding</em> conversion cannot store all of its
     * results in its output vector, because the output vector has fewer
     * lanes of larger size, in order to have the same overall bit-size as
     * its input.
     *
     * Likewise, a contracting conversion must store its relatively small
     * results into a subset of the lanes of the output vector, defaulting
     * the unused lanes to zero.
     *
     * <p> As an example, a conversion from {@code byte} to {@code long}
     * ({@code M=8}) will discard 87.5% of the input values in order to
     * convert the remaining 12.5% into the roomy {@code long} lanes of
     * the output vector.  The inverse conversion will convert back all of
     * the large results, but will waste 87.5% of the lanes in the output
     * vector.
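     *
     * <p> As a sketch of that arithmetic, for illustration only
     * (the 128-bit species and the array {@code bytes} are
     * assumptions of this example): a 128-bit {@code byte} vector
     * has 16 lanes and a 128-bit {@code long} vector has 2, so
     * each call converts 2 of the 16 input lanes, chosen by the
     * {@code part} argument.
     * <pre>{@code
     * ByteVector bv = ByteVector.fromArray(ByteVector.SPECIES_128, bytes, 0);
     * Vector<Long> p0 = bv.convert(VectorOperators.B2L, 0);  // input lanes 0..1
     * Vector<Long> p7 = bv.convert(VectorOperators.B2L, 7);  // input lanes 14..15
     * }</pre>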
     *
     * <em>In-place</em> conversions ({@code M=1}) deliver all of
     * their results in one output vector, without wasting lanes.
     *
     * <p> To manage the details of these
     * <a href="Vector.html#expansion">expansions and contractions</a>,
     * a non-zero {@code part} parameter selects partial results from
     * expansions, or steers the results of contractions into
     * corresponding locations, as follows:
     *
     * <ul>
     * <li> expanding by {@code M}: {@code part} must be in the range
     * {@code [0..M-1]}, and selects the block of {@code VLENGTH/M} input
     * lanes starting at the <em>origin lane</em> at {@code part*VLENGTH/M}.
     *
     * <p> The {@code VLENGTH/M} output lanes represent a partial
     * slice of the whole logical result of the conversion, filling
     * the entire physical output vector.
     *
     * <li> contracting by {@code M}: {@code part} must be in the range
     * {@code [-M+1..0]}, and steers all {@code VLENGTH} input lanes into
     * the output located at the <em>origin lane</em> {@code -part*VLENGTH}.
     * There is a total of {@code VLENGTH*M} output lanes, and those not
     * holding converted input values are filled with zeroes.
     *
     * <p> A group of such output vectors, with logical result parts
     * steered to disjoint blocks, can be reassembled using the
     * {@linkplain VectorOperators#OR bitwise or} or (for floating
     * point) the {@link VectorOperators#FIRST_NONZERO FIRST_NONZERO}
     * operator.
     *
     * <li> in-place ({@code M=1}): {@code part} must be zero.
     * Both vectors have the same {@code VLENGTH}.  The result is
     * always positioned at the <em>origin lane</em> of zero.
     *
     * </ul>
     *
     * <p> This method is a restricted version of the more general
     * but less frequently used <em>shape-changing</em> method
     * {@link #convertShape(VectorOperators.Conversion,VectorSpecies,int)
     * convertShape()}.
     * The result of this method is the same as the expression
     * {@code this.convertShape(conv, rsp, part)},
     * where the output species is
     * {@code rsp=this.species().withLanes(FTYPE.class)}.
     *
     * @param conv the desired scalar conversion to apply lane-wise
     * @param part the <a href="Vector.html#expansion">part number</a>
     *        of the result, or zero if neither expanding nor contracting
     * @param <F> the boxed element type of the species
     * @return a vector converted by element type from this vector
     * @throws ArrayIndexOutOfBoundsException unless {@code part} is zero,
     *         or else the expansion ratio is {@code M} and
     *         {@code part} is positive and less than {@code M},
     *         or else the contraction ratio is {@code M} and
     *         {@code part} is negative and greater than {@code -M}
     *
     * @see VectorOperators#I2L
     * @see VectorOperators.Conversion#ofCast(Class,Class)
     * @see VectorSpecies#partLimit(VectorSpecies,boolean)
     * @see #viewAsFloatingLanes()
     * @see #viewAsIntegralLanes()
     * @see #convertShape(VectorOperators.Conversion,VectorSpecies,int)
     * @see #reinterpretShape(VectorSpecies,int)
     */
    public abstract <F> Vector<F> convert(VectorOperators.Conversion<E,F> conv, int part);

    /**
     * Converts this vector to a vector of the given species, shape and
     * element type, converting lane values from the current {@code ETYPE}
     * to a new lane type (called {@code FTYPE} here) according to the
     * indicated {@linkplain VectorOperators.Conversion conversion}.
     *
     * This is a lane-wise operation which copies {@code ETYPE} values
     * from the input vector to corresponding {@code FTYPE} values in
     * the result.
     *
     * <p> If the old and new species have the same shape, the behavior
     * is exactly the same as the simpler, shape-invariant method
     * {@link #convert(VectorOperators.Conversion,int) convert()}.
     * In such cases, the simpler method {@code convert()} should be
     * used, to make code easier to reason about.
     * Otherwise, this is a <em>shape-changing</em> operation, and may
     * have special implementation costs.
     *
     * <p> As a combined effect of shape changes and lane size changes,
     * the input and output species may have different lane counts, causing
     * <a href="Vector.html#expansion">expansion or contraction</a>.
     * In this case a non-zero {@code part} parameter selects
     * partial results from an expanded logical result, or steers
     * the results of a contracted logical result into a physical
     * output vector of the required output species.
     *
     * <p> The following pseudocode illustrates the behavior of this
     * method for in-place, expanding, and contracting conversions.
     * (This pseudocode also applies to the shape-invariant method,
     * but with shape restrictions on the output species.)
     * Note that only one of the three code paths is relevant to any
     * particular combination of conversion operator and shapes.
     *
     * <pre>{@code
     * FTYPE scalar_conversion_op(ETYPE s);
     * EVector a = ...;
     * VectorSpecies<F> rsp = ...;
     * int part = ...;
     * VectorSpecies<E> dsp = a.species();
     * int domlen = dsp.length();
     * int ranlen = rsp.length();
     * FTYPE[] logical = new FTYPE[domlen];
     * for (int i = 0; i < domlen; i++) {
     *     logical[i] = scalar_conversion_op(a.lane(i));
     * }
     * FTYPE[] physical;
     * if (domlen == ranlen) { // in-place
     *     assert part == 0; //else AIOOBE
     *     physical = logical;
     * } else if (domlen > ranlen) { // expanding
     *     int M = domlen / ranlen;
     *     assert 0 <= part && part < M; //else AIOOBE
     *     int origin = part * ranlen;
     *     physical = Arrays.copyOfRange(logical, origin, origin + ranlen);
     * } else { // (domlen < ranlen) // contracting
     *     int M = ranlen / domlen;
     *     assert 0 >= part && part > -M; //else AIOOBE
     *     int origin = -part * domlen;
     *     physical = new FTYPE[ranlen];  // unused lanes default to zero
     *     System.arraycopy(logical, 0, physical, origin, domlen);
     * }
     * return FVector.fromArray(rsp, physical, 0);
     * }</pre>
     *
     * @param conv the desired scalar conversion to apply lane-wise
     * @param rsp the desired output species
     * @param part the <a href="Vector.html#expansion">part number</a>
     *        of the result, or zero if neither expanding nor contracting
     * @param <F> the boxed element type of the output species
     * @return a vector converted by shape and element type from this vector
     * @see #convert(VectorOperators.Conversion,int)
     * @see #castShape(VectorSpecies,int)
     * @see #reinterpretShape(VectorSpecies,int)
     */
    public abstract <F> Vector<F> convertShape(VectorOperators.Conversion<E,F> conv, VectorSpecies<F> rsp, int part);

    /**
     * Convenience method for converting a vector from one lane type
     * to another, reshaping as needed when lane sizes change.
     *
     * This method returns the value of this expression:
     * {@code convertShape(conv,rsp,part)}, where {@code conv} is
     * {@code VectorOperators.Conversion.ofCast(E.class,F.class)}.
     *
     * <p> If the old and new species have different shape, this is a
     * <em>shape-changing</em> operation, and may have special
     * implementation costs.
     *
     * @param rsp the desired output species
     * @param part the <a href="Vector.html#expansion">part number</a>
     *        of the result, or zero if neither expanding nor contracting
     * @param <F> the boxed element type of the output species
     * @return a vector converted by shape and element type from this vector
     * @see VectorOperators.Conversion#ofCast(Class,Class)
     * @see Vector#convertShape(VectorOperators.Conversion,VectorSpecies,int)
     */
    // Does this carry its weight?
    public abstract <F> Vector<F> castShape(VectorSpecies<F> rsp, int part);

    /**
     * Checks that this vector has the given element type,
     * and returns this vector unchanged.
     * The effect is similar to this pseudocode:
     * {@code elementType == species().elementType()
     *        ? this
     *        : throw new ClassCastException()}.
     *
     * @param elementType the required lane type
     * @param <F> the boxed element type of the required lane type
     * @return the same vector
     * @throws ClassCastException if the vector has the wrong element type
     * @see VectorSpecies#check(Class)
     * @see VectorMask#check(Class)
     * @see Vector#check(VectorSpecies)
     * @see VectorShuffle#check(VectorSpecies)
     */
    public abstract <F> Vector<F> check(Class<F> elementType);

    /**
     * Checks that this vector has the given species,
     * and returns this vector unchanged.
     * The effect is similar to this pseudocode:
     * {@code species == species()
     *        ? this
     *        : throw new ClassCastException()}.
     *
     * @param species the required species
     * @param <F> the boxed element type of the required species
     * @return the same vector
     * @throws ClassCastException if the vector has the wrong species
     * @see Vector#check(Class)
     * @see VectorMask#check(VectorSpecies)
     * @see VectorShuffle#check(VectorSpecies)
     */
    public abstract <F> Vector<F> check(VectorSpecies<F> species);

    //Array stores

    /**
     * Stores this vector into a byte array starting at an offset
     * using explicit byte order.
     * <p>
     * Bytes are extracted from primitive lane elements according
     * to the specified byte ordering.
     * The lanes are stored according to their
     * <a href="Vector.html#lane-order">memory ordering</a>.
     * <p>
     * This method behaves as if it calls
     * {@link #intoByteBuffer(ByteBuffer,int,ByteOrder,VectorMask)
     * intoByteBuffer()} as follows:
     * <pre>{@code
     * var bb = ByteBuffer.wrap(a);
     * var m = maskAll(true);
     * intoByteBuffer(bb, offset, bo, m);
     * }</pre>
     *
     * @param a the byte array
     * @param offset the offset into the array
     * @param bo the intended byte order
     * @throws IndexOutOfBoundsException
     *         if {@code offset+N*ESIZE < 0}
     *         or {@code offset+(N+1)*ESIZE > a.length}
     *         for any lane {@code N} in the vector
     */
    public abstract void intoByteArray(byte[] a, int offset,
                                       ByteOrder bo);

    /**
     * Stores this vector into a byte array starting at an offset
     * using explicit byte order and a mask.
     * <p>
     * Bytes are extracted from primitive lane elements according
     * to the specified byte ordering.
     * The lanes are stored according to their
     * <a href="Vector.html#lane-order">memory ordering</a>.
     * <p>
     * This method behaves as if it calls
     * {@link #intoByteBuffer(ByteBuffer,int,ByteOrder,VectorMask)
     * intoByteBuffer()} as follows:
     * <pre>{@code
     * var bb = ByteBuffer.wrap(a);
     * intoByteBuffer(bb, offset, bo, m);
     * }</pre>
     *
     * @param a the byte array
     * @param offset the offset into the array
     * @param bo the intended byte order
     * @param m the mask controlling lane selection
     * @throws IndexOutOfBoundsException
     *         if {@code offset+N*ESIZE < 0}
     *         or {@code offset+(N+1)*ESIZE > a.length}
     *         for any lane {@code N} in the vector
     *         where the mask is set
     */
    public abstract void intoByteArray(byte[] a, int offset,
                                       ByteOrder bo,
                                       VectorMask<E> m);

    /**
     * Stores this vector into a byte buffer starting at an offset
     * using explicit byte order.
     * <p>
     * Bytes are extracted from primitive lane elements according
     * to the specified byte ordering.
     * The lanes are stored according to their
     * <a href="Vector.html#lane-order">memory ordering</a>.
     * <p>
     * This method behaves as if it calls
     * {@link #intoByteBuffer(ByteBuffer,int,ByteOrder,VectorMask)
     * intoByteBuffer()} as follows:
     * <pre>{@code
     * var m = maskAll(true);
     * intoByteBuffer(bb, offset, bo, m);
     * }</pre>
     *
     * @param bb the byte buffer
     * @param offset the offset into the byte buffer
     * @param bo the intended byte order
     * @throws IndexOutOfBoundsException
     *         if {@code offset+N*ESIZE < 0}
     *         or {@code offset+(N+1)*ESIZE > bb.limit()}
     *         for any lane {@code N} in the vector
     * @throws java.nio.ReadOnlyBufferException
     *         if the byte buffer is read-only
     */
    public abstract void intoByteBuffer(ByteBuffer bb, int offset, ByteOrder bo);

    /**
     * Stores this vector into a byte buffer starting at an offset
     * using explicit byte order and a mask.
     * <p>
     * Bytes are extracted from primitive lane elements according
     * to the specified byte ordering.
     * The lanes are stored according to their
     * <a href="Vector.html#lane-order">memory ordering</a>.
     * <p>
     * The following pseudocode illustrates the behavior, where
     * the primitive element type is not {@code byte},
     * {@code EBuffer} is the primitive buffer type, {@code ETYPE} is the
     * primitive element type, and {@code EVector} is the primitive
     * vector type for this vector:
     * <pre>{@code
     * EBuffer eb = bb.duplicate()
     *     .position(offset)
     *     .order(bo).asEBuffer();
     * ETYPE[] a = this.toArray();
     * for (int n = 0; n < a.length; n++) {
     *     if (m.laneIsSet(n)) {
     *         eb.put(n, a[n]);
     *     }
     * }
     * }</pre>
     * When the primitive element type is {@code byte}, the
     * byte buffer is obtained as follows, and the operation on the
     * buffer remains the same as in the prior pseudocode:
     * <pre>{@code
     * ByteBuffer eb = bb.duplicate()
     *     .position(offset);
     * }</pre>
     *
     * @implNote
     * This operation is likely to be more efficient if
     * the specified byte order is the same as
     * {@linkplain ByteOrder#nativeOrder()
     * the platform native order},
     * since this method will not need to reorder
     * the bytes of lane values.
     * In the special case where {@code ETYPE} is
     * {@code byte}, the byte order argument is
     * ignored.
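     *
     * <p> A minimal usage sketch, for illustration only (the vector
     * {@code v}, its mask {@code m}, and the buffer size are
     * assumptions of this example):
     * <pre>{@code
     * ByteBuffer bb = ByteBuffer.allocate(v.byteSize());
     * v.intoByteBuffer(bb, 0, ByteOrder.nativeOrder(), m);
     * // only lanes where m is set are written; other bytes of bb stay zero
     * }</pre>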
     *
     * @param bb the byte buffer
     * @param offset the offset into the byte buffer
     * @param bo the intended byte order
     * @param m the mask controlling lane selection
     * @throws IndexOutOfBoundsException
     *         if {@code offset+N*ESIZE < 0}
     *         or {@code offset+(N+1)*ESIZE > bb.limit()}
     *         for any lane {@code N} in the vector
     *         where the mask is set
     * @throws java.nio.ReadOnlyBufferException
     *         if the byte buffer is read-only
     */
    public abstract void intoByteBuffer(ByteBuffer bb, int offset,
                                        ByteOrder bo, VectorMask<E> m);

    /**
     * Returns a packed array containing all the lane values.
     * The array length is the same as the vector length.
     * The element type of the array is the same as the element
     * type of the vector.
     * The array elements are stored in lane order.
     * Overrides of this method on subtypes of {@code Vector}
     * which specify the element type have an accurately typed
     * array result.
     *
     * @apiNote
     * Usually {@linkplain FloatVector#toArray() strongly typed access}
     * is preferable, if you are working with a vector
     * subtype that has a known element type.
     *
     * @return an accurately typed array containing
     *         the lane values of this vector
     * @see ByteVector#toArray()
     * @see IntVector#toArray()
     * @see DoubleVector#toArray()
     */
    public abstract Object toArray();

    /**
     * Returns an {@code int[]} array containing all
     * the lane values, converted to the type {@code int}.
     * The array length is the same as the vector length.
     * The array elements are converted as if by casting
     * and stored in lane order.
     *
     * This operation may fail if the vector element type is {@code
     * float} or {@code double}, when lanes contain fractional or
     * out-of-range values.  If any vector lane value is not
     * representable as an {@code int}, an exception is thrown.
     *
     * @apiNote
     * Usually {@linkplain FloatVector#toArray() strongly typed access}
     * is preferable, if you are working with a vector
     * subtype that has a known element type.
     *
     * @return an {@code int[]} array containing
     *         the lane values of this vector
     * @throws UnsupportedOperationException
     *         if any lane value cannot be represented as an
     *         {@code int} array element
     * @see #toArray()
     * @see #toLongArray()
     * @see #toDoubleArray()
     * @see IntVector#toArray()
     */
    public abstract int[] toIntArray();

    /**
     * Returns a {@code long[]} array containing all
     * the lane values, converted to the type {@code long}.
     * The array length is the same as the vector length.
     * The array elements are converted as if by casting
     * and stored in lane order.
     *
     * This operation may fail if the vector element type is {@code
     * float} or {@code double}, when lanes contain fractional or
     * out-of-range values.  If any vector lane value is not
     * representable as a {@code long}, an exception is thrown.
     *
     * @apiNote
     * Usually {@linkplain FloatVector#toArray() strongly typed access}
     * is preferable, if you are working with a vector
     * subtype that has a known element type.
     *
     * @return a {@code long[]} array containing
     *         the lane values of this vector
     * @throws UnsupportedOperationException
     *         if any lane value cannot be represented as a
     *         {@code long} array element
     * @see #toArray()
     * @see #toIntArray()
     * @see #toDoubleArray()
     * @see LongVector#toArray()
     */
    public abstract long[] toLongArray();

    /**
     * Returns a {@code double[]} array containing all
     * the lane values, converted to the type {@code double}.
     * The array length is the same as the vector length.
     * The array elements are converted as if by casting
     * and stored in lane order.
     * This operation can lose precision
     * if the vector element type is {@code long}.
     *
     * @apiNote
     * Usually {@linkplain FloatVector#toArray() strongly typed access}
     * is preferable, if you are working with a vector
     * subtype that has a known element type.
     *
     * @return a {@code double[]} array containing
     *         the lane values of this vector,
     *         possibly rounded to representable
     *         {@code double} values
     * @see #toArray()
     * @see #toIntArray()
     * @see #toLongArray()
     * @see DoubleVector#toArray()
     */
    public abstract double[] toDoubleArray();

    /**
     * Returns a string representation of this vector, of the form
     * {@code "[0,1,2...]"}, reporting the lane values of this
     * vector, in lane order.
     *
     * The string is produced as if by a call to
     * {@link Arrays#toString(int[]) Arrays.toString()},
     * as appropriate to the array returned by
     * {@link #toArray() this.toArray()}.
     *
     * @return a string of the form {@code "[0,1,2...]"}
     *         reporting the lane values of this vector
     */
    @Override
    public abstract String toString();

    /**
     * Indicates whether this vector is identical to some other object.
     * Two vectors are identical only if they have the same species
     * and same lane values, in the same order.
     * <p>The comparison of lane values is produced as if by a call to
     * {@link Arrays#equals(int[],int[]) Arrays.equals()},
     * as appropriate to the arrays returned by
     * {@link #toArray toArray()} on both vectors.
     *
     * @return whether this vector is identical to some other object
     * @see #eq
     */
    @Override
    public abstract boolean equals(Object obj);

    /**
     * Returns a hash code value for the vector,
     * based on the lane values and the vector species.
     *
     * @return a hash code value for this vector
     */
    @Override
    public abstract int hashCode();

    // ==== JROSE NAME CHANGES ====

    // RAISED FROM SUBCLASSES (with generalized type)
    // * toArray() -> ETYPE[] <: Object (erased return type for interop)
    // * toString(), equals(Object), hashCode() (documented)
    // ADDED
    // * compare(OP,v) to replace most of the comparison methods
    // * maskAll(boolean) to replace maskAllTrue/False
    // * toLongArray(), toDoubleArray() (generic unboxed access)
    // * check(Class), check(VectorSpecies) (static type-safety checks)
    // * enum Comparison (enum of EQ, NE, GT, LT, GE, LE)
    // * zero(VS), broadcast(long) (basic factories)
    // * reinterpretAsEs(), viewAsXLanes (bytewise reinterpreting views)
    // * addIndex(int) (iota function)

}