llvm-project/libclc/generic/lib/geometric/fast_normalize.inc
Tom Stellard 17ec3a51c3 Implement fast_normalize builtin v4
This implementation was ported from the AMD builtin library
and has been tested with piglit, OpenCV, and the ocl conformance tests.

v2:
  - Remove f suffix from constant in double implementations.
  - Consolidate implementations using the .cl/.inc approach.

v3:
 - Use __CLC_FPSIZE instead of __CLC_FP{32,64}

v4 (Jan Vesely):
 - Limit to single precision.

llvm-svn: 236920
2015-05-09 00:04:12 +00:00

32 lines
1.3 KiB
PHP

/*
* Copyright (c) 2014,2015 Advanced Micro Devices, Inc.
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
* AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#ifndef __CLC_SCALAR
// Only handle vector implementations
_CLC_OVERLOAD _CLC_DEF __CLC_FLOATN fast_normalize(__CLC_FLOATN p) {
__CLC_FLOAT l2 = dot(p, p);
return l2 == 0.0f ? p : p * half_rsqrt(l2);
}
#endif