Avoid copying the original variable if it is going to be marked as firstprivate in task regions. For taskloops, the non-trivially copyable variables still need to be copied so that they are constructed correctly upon task creation.
This may improve performance for declare reduction constructs.
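For illustration only, here is a minimal sketch of the kind of source this change affects: a variable with a user-provided copy constructor captured as firstprivate in a task region. The names `Payload` and `run_task` are hypothetical and do not come from the patch. The sketch shows the user-visible semantics, not the code the compiler generates.

```cpp
#include <cassert>

// Hypothetical type: the user-provided copy constructor makes it
// non-trivially copyable.
struct Payload {
    int value;
    explicit Payload(int v) : value(v) {}
    Payload(const Payload &other) : value(other.value) {}
};

// Hypothetical helper: runs one task that reads its own firstprivate copy.
int run_task(int v) {
    Payload original(v);
    int result = 0;
    #pragma omp parallel
    #pragma omp single
    {
        // The task gets a private copy of `original`, initialized from it.
        #pragma omp task firstprivate(original) shared(result)
        { result = original.value + 1; }
        #pragma omp taskwait
    }
    // The task's copy carried the same value as the original.
    assert(result == v + 1);
    return result;
}
```

Without OpenMP enabled the pragmas are ignored and the function computes the same result serially.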
Added full support for the parallel master taskloop simd directive.
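As a rough usage sketch of the combined directive (the function `scale` and its parameters are hypothetical and not part of the patch), the loop below is split into tasks by the master thread of a parallel region, and each task's chunk is eligible for SIMD execution.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: multiplies every element of `data` by `factor`.
void scale(std::vector<float> &data, float factor) {
    const int n = static_cast<int>(data.size());
    float *p = data.data();
    // Each generated task receives its own copy of `factor`.
    #pragma omp parallel master taskloop simd firstprivate(factor)
    for (int i = 0; i < n; ++i)
        p[i] *= factor;
}
```

Iterations are independent, so the result does not depend on how the loop is partitioned, and a build without OpenMP runs the same loop serially.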