Releases: intel/intel-graphics-compiler
igc-1.0.5064
Fixed Issues / Improvements
- Multiple refactor changes in preparation of LLVM upgrade to version 11.
- Added shader dumping capabilities to VectorCompiler.
- VectorCompiler now uses DebugInfo library.
- Refactored debug information.
- Fixes for compilation for OpenCL 3.0.
- Remove alwaysinline attributes for call instructions if -cl-opt-disable is present.
- Avoid FP64 emulation related code if kernel doesn't use FP64 at all.
- Instruction splitting is now handled by vISA.
- Enable splitting of instructions with indirect addressing.
- Update GED version to 0.68.
- Improvements to the legalization of shuffle-vector.
- Removed LShr and AShr operands truncation.
- Restricted spill space compression to intra-iteration only, in the interest of compile time and memory usage.
- Removed the EnableOCLNoInlineAttr flag, NoInline should be honored by default.
- Multiple other improvements and minor changes throughout the project.
Dependencies revisions
- intel/llvm-patches@c4a0345
- intel/opencl-clang@6a9cd2c
- KhronosGroup/SPIRV-LLVM-Translator@424e375 (for opencl-clang)
- intel/vc-intrinsics@8300678
- KhronosGroup/SPIRV-LLVM-Translator@e8a52ab (for VectorCompiler)
- llvm/llvm-project@llvmorg-10.0.0
Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.
igc-1.0.4944
Fixed Issues / Improvements
- Multiple refactor changes in preparation of LLVM upgrade to version 11.
- Vector backend improvements:
- clearer reports about unsupported types,
- introduced i64 emulation pass,
- support for -ftime-report,
- support for genx_addc and genx_subb intrinsics.
- Added support for OpConstFunctionPointerINTEL, 1024-bit constants in SPIRVReader.
- Added patches for SPIRV-LLVM-Translator based on LLVM9.
- Added support for experimental SYCL unmasked call feature.
- Added option to force thread group size.
- Added fallback path when ZEBinary is enabled by -allow-zebin.
- Updated CMFE interface.
- Relaxed some checks to allow a subset of i64 operations for targets without native i64 support.
- Fixed translation of non 32/64-bit constants.
- Fixed processing of GEP instruction when the index is a vector.
- Fixed build break with VectorCompiler switched off.
- Removed unused IGA files, updated IGA.
- Many minor optimizations and code improvements throughout the whole project.
Dependencies revisions
- intel/llvm-patches@c4a0345
- intel/opencl-clang@6a9cd2c
- KhronosGroup/SPIRV-LLVM-Translator@424e375 (for opencl-clang)
- intel/vc-intrinsics@55124bb
- KhronosGroup/SPIRV-LLVM-Translator@e8a52ab (for VectorCompiler)
- llvm/llvm-project@llvmorg-10.0.0
Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.
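
The i64 emulation and add-with-carry items in this release both concern targets without native 64-bit integer support. As a rough illustration of the general idea only (not IGC's actual pass or the real `genx_addc` signature), a 64-bit add can be built from 32-bit halves plus a carry. The helper names `addc32` and `add64_emulated` are hypothetical:

```cpp
#include <cstdint>
#include <utility>

// Hypothetical helper: 32-bit add that also reports the carry-out.
// Returns {sum, carry}; carry is 1 when the add wrapped past 2^32.
std::pair<uint32_t, uint32_t> addc32(uint32_t a, uint32_t b) {
    uint32_t sum = a + b;                 // unsigned add wraps modulo 2^32
    uint32_t carry = (sum < a) ? 1u : 0u; // wrap happened iff sum < a
    return {sum, carry};
}

// Sketch of a 64-bit add that uses only 32-bit arithmetic internally.
uint64_t add64_emulated(uint64_t x, uint64_t y) {
    uint32_t xlo = static_cast<uint32_t>(x);
    uint32_t xhi = static_cast<uint32_t>(x >> 32);
    uint32_t ylo = static_cast<uint32_t>(y);
    uint32_t yhi = static_cast<uint32_t>(y >> 32);

    std::pair<uint32_t, uint32_t> low = addc32(xlo, ylo);
    uint32_t hi = xhi + yhi + low.second; // propagate carry into high half

    return (static_cast<uint64_t>(hi) << 32) | low.first;
}
```

The 64-bit type appears here only to pack and unpack the halves for demonstration; the arithmetic itself stays 32-bit.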
igc-1.0.4879
Fixed Issues / Improvements
- SWSB sync instruction optimization
- Increase large basic block size to calculate register pressure.
- Support for emitting GenISA_simdShuffleDown in missing execution size (SIMD32)
- Decode DIExpression operation in SPIRV reader.
- Support emulation of general call and return for i64 type.
- Enable accumulator use for ror/rol.
- Minor fixes and improvements.
Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.
igc-1.0.4756
Fixed Issues / Improvements
- Update EOTRenderTarget() to use RTW instruction instead of a raw send.
- Add WA to set noMask for all sampler.
- Type of for statement iterator changed from unsigned int to uint64_t to prevent hangs.
- Initial support for source/line debug information in VectorCompiler backend.
- Fix a bug for A64 block store with byte source operand.
- Fix number of elements to use in G4_Declare creation in remat.
- Make IGC emit DW_AT_abstract_origin for inlined DW_TAG_lexical_block constructs.
- Add a check to avoid invalid dereference.
- Fix incorrect code generation for xor.
- Adding a regkey to override product for debugging.
- Create dummy kernel to attach symbol table and indirectly-called functions.
- Move bfrev pattern from GenSpecificPattern to CustomSafeOptPass.
- Adding attribute for selective stack calls.
- Disable rematerialization of intrinsic.split instruction.
- Assign DebugLoc to pre-defined variables. Skip sinking of allocas when -cl-opt-disable is applied to keep .debug_ranges clean for inlined functions.
- Fixes for OCL 3.0 feature macro usage.
- Fix CISA offsets before splice operation in spill insertion.
- Fix regression in promotion of dynamic buffers to registers.
- Allow enabling some features in Release mode using environment variables.
- ZEBinary: support .spv and .gtpin_info section.
- Add pass to classify move types.
- Do not copy R0 when creating the header for bindless sampler message.
- Eliminate select+phi redundancy in SIMD CF
- Enable explicit variable split.
- Fix some bugs in ZEBinary:
- fixed ELF flag,
- renamed .data.global_const to .data.const,
- removed local_id info if not used.
- Remove unnecessary add for the common special case of add.pair with zero high32-bit values.
- Add backend configuration pass for VC.
- Merge instructions for 64bit emulation.
- Add pass to Split Indirect EE to sel to avoid VxH mov.
- Add dependency for EstimateFunctionSize.
- Emit single copy of pre-defined variables when -cl-kernel-debug-enable is passed.
- Inlining algorithm for controlling Kernel Total Size.
- Forward pointer support in SPIRV Reader.
- Run earlyCSE after GEPLowering to optimize out some instructions introduced by GEPLowering.
- Switch from the -runtime option to the -binary-format option: use -binary-format=ze for ZEBinary output.
- Stitch indirectly-called functions to the binary on VC side.
- Starting from tgllp, mid-thread preemption is no longer supported. The EnablePreemption value should be set to false for these new platforms.
Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.
igc-1.0.4594
Fixed Issues / Improvements
- DWARF SIMD location expressions support (continued).
- Emit debug info for lower and upper 16 channels for SIMD32
- Add an opt for dp4 with identity matrix
- GRF register info available in dump
- Minor fixes and improvements.
Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.
igc-1.0.4560
Fixed Issues / Improvements
- Removed redundant constant folding in HW legalization checks.
- Added ABI validation for CMFE.
- Added flushing L3 for device or cross-device memory fence on global memory.
- Added missing help text for alias options in VC.
- Moved IntrinsicGenISA.gen to build Config folder and added proper dependency requirement.
- Fixed VertexShaderLowering incorrectly clearing out Vertex Header when it is actually used.
- Fixed LRA to ensure startGRFReg is less than the number of GRFs available for allocation to linear scan.
- Fixes for vISA assembly.
- Fixed vector alloca type in TransformPrivMem for function pointers cases.
- Fixes for and further implementation of IGC_ASSERT.
- Fixed args passed to register for non-uniform function calls.
- Disabled SIMD32 slicing when -cl-opt-disable is passed instead of -g.
- Disabled legacy mad to mac optimization.
- Implemented SIMD compile info for OCL shaders.
- Improved alloca uniform analysis.
- Allowing DisableAddingAlwaysAttribute flag in release mode.
- Switched to accessing GenXSubtarget through TargetPassConfig.
- Minor refactoring and deprecated code removal.
Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.
igc-1.0.4521
Fixed Issues / Improvements
- Fix bugs in image/sampler tracking to properly track argument when compiling with -cl-opt-disable
- Move pass to erase redundant movs after other optimizations are done.
- Skip simd32 compilation for per-pixel dispatch with x16 samples.
- Revert new-reg-per-function behavior in VC RA
- Add new relocation type R_PER_THREAD_PAYLOAD_OFFSET_32. Also refactor the vISA::RelocationEntry create API.
- Moving IntrinsicGenISA.gen to build Config folder and adding proper dependency requirement
- Switch to using LLVMTargetMachine in VC. Initialized GenX pass in BackendPlugin.
- Add check for fp64 and i64 copy move if platform does not support 64b types.
- Program the correct response length for spill of a scalar variable used as send dst.
- Enable FP64 accumulator as mul instruction source.
- Add TGL emulation functions for DP and SP
- DWARF debugger location expressions
- Emit variable location off privateBase
- Additional include guards to avoid redefinition conflict of LARGE_INTEGER type
- Uniform analysis tuning for performance.
- For stack calls do not adjust the spill size by global scratch offset
- Relocations and symbols support in L0 binary in VC
- Update register numbering for debug info.
- Try to avoid bank conflict for Gen12 when scheduling
- Fix some excessive mov instructions emitted by VectorCompiler.
- Avoid unnecessary LLVM metadata regeneration to optimize compilation time
Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.
igc-1.0.4479
Fixed Issues / Improvements
- Fail compilation with error message instead of crash if we can't find sampler argument or inline/global sampler.
- DWARF debugger location expressions fixes.
- Remove redundant LICM pass.
- Enable timestats through regkey.
- Extend the MarkReadOnlyPass to mark loads from the constant address space with invariant.load.
- Adding metadata for computedDepthMode.
- Enable vISA instruction splitting pass.
- Open-sourcing CM FE related parts of driver.
- Fix system LLVM handling in VC.
- Update integer splitter API in VectorCompiler.
- Improve lowering of ord/unord fcmp.
- Correct Emask calculation by removing the write enable and using the right data types.
- Basic block where flow control partially joins should also be treated as divergent.
- Limit total thread payload size to 96 GRFs.
- Tighten up vISA assembly syntax: do not allow missing regions for general operands, do not allow two offsets for address operands.
- Fixed operations with overflow.
- Prevent tracking of images and samplers from falling into infinite loop.
- Fix lowpc/highpc for subroutines in debug info.
- Make Dst operand's subreg offset immutable.
- Get rid of unnecessary MOVS.
- Check for undefined predicate variables when parsing vISA assembly.
- Removing the dst/src overlap checking after augmentation.
- ZEBinary: add kernel symbol.
- Fix TPM's replaceGatherPrivate.
- Add ZEAutoTool.
- Report parser error for vISA inline assembly in releaseInternal build.
- ZEBinaryBuilder: Fix packed_local_ids size to 6 instead of 12.
- Avoid multiple metadata regenerating in AggregateArguments pass.
- Refactor for stack call functions. Combine code for caller/callee stack load/store.
- Fix shuffleVector lowering in legalization pass.
Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.
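
For readers unfamiliar with the ord/unord fcmp predicates mentioned in this release: in LLVM IR, `fcmp ord` is true when neither operand is NaN, and `fcmp uno` is true when at least one operand is NaN. The sketch below expresses those semantics through self-comparison, since a NaN never compares equal to itself. It only illustrates what the predicates mean and makes no claim about how IGC lowers them; the function names are hypothetical, and the code assumes compilation without fast-math style flags:

```cpp
#include <limits>

// Ordered: true only when neither operand is NaN.
// A value compares equal to itself exactly when it is not NaN.
bool fcmp_ord(float a, float b) {
    return (a == a) && (b == b);
}

// Unordered: true when at least one operand is NaN.
bool fcmp_uno(float a, float b) {
    return (a != a) || (b != b);
}
```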
igc-1.0.4427
Fixed Issues / Improvements
- Handle 16-byte alignment correctly for trivial and local RA.
- Introduce ExtraOCLOptions debug key.
- Improve URB merging.
- Initial support of L0 binary in cmc.
- Add pattern match to emit integer trunc instruction with saturation.
- Update of static bank conflict checking.
- Minor fixes and improvements.
Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.
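
The saturating truncation item in this release refers to a common source-level shape: clamp a wide integer to the range of a narrower type, then truncate. A minimal sketch of that shape follows, assuming an int32 to int16 conversion; the function name is hypothetical and the exact patterns IGC recognizes are not stated in these notes:

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>

// Clamp to the int16 range first, then truncate. Because the clamped
// value always fits in 16 bits, the final cast cannot lose information.
int16_t trunc_sat_i32_to_i16(int32_t v) {
    const int32_t lo = std::numeric_limits<int16_t>::min(); // -32768
    const int32_t hi = std::numeric_limits<int16_t>::max(); //  32767
    int32_t clamped = std::min(std::max(v, lo), hi);
    return static_cast<int16_t>(clamped);
}
```

A compiler that spots the min/max/trunc sequence can, on hardware with a saturating conversion, replace all three operations with a single instruction.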
igc-1.0.4361
Fixed Issues / Improvements
- Removed SIMD size dependence on -g. SIMD size is an optimization so it is dependent on whether optimizations are enabled.
- Removed the code related to replacing unreachable instructions with "return undef".
- Added an analysis to check if an instruction's def reaches the end of its parent basic block.
- Added Vector Compute backend
- Added build option, vISA_noStitchExternFunc, to control the stitching policy.
- Added extra condition for transforming ptr-arg.
- Fixed defect where 32word and 64word in align directives failed to parse.
- Fixed DILocation for globals that are localized.
- Fixed missing payload coalescing for dual-source RTW on SIMD16.
- Move Accumulator substitution into its own file.
- Skipping marking source variables with Output attribute.
- CISA assembly update.
- SWSB improvements.
- Avoiding div by zero while doing spillCost computations.
- Write to null register when inlineAsm output is unused.
- Enhance stateless simple push to allow promoting regions where the starting address can be a sum of 2 runtime values.
- Increase the maximum size of arguments that can be passed to OpenCL kernel to 2 KB.
- Optimize double precision SQRT instruction.
- Do not include zero-sized variable (e.g., Arg/Retval when there's no stack call) in global RA.
- Do explicit var split for local live-ranges only.
- Support vector type for llvm.copysign
- Protecting ISA generated variables from conflicting with vISA keywords (vISA reserved words are suffixed with _ (iteratively)).
- Set push constant mode to gather if the driver only supports gather.
- DWARF debugger location expressions fixes.
- Turn off writing caller's frame-pointer to callee's stack. Since this feature is needed only for stack-walk, we can turn it off by default with compiler flag: EnableWriteOldFPToStack.
Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.
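
The item in this release about protecting generated variable names from vISA keywords describes appending `_` iteratively. One plausible reading, sketched below, is to keep appending underscores until the name no longer collides with a reserved word. The function name and the reserved-word set are hypothetical stand-ins, not the real vISA keyword list or IGC's code:

```cpp
#include <set>
#include <string>

// Append '_' repeatedly until the name is not in the reserved set.
// Names that do not collide are returned unchanged.
std::string avoid_reserved(std::string name,
                           const std::set<std::string>& reserved) {
    while (reserved.count(name) != 0) {
        name += '_';
    }
    return name;
}
```

With a hypothetical reserved set of `{"mov", "mov_"}`, the name `mov` becomes `mov__` after two iterations, while a name such as `myvar` passes through untouched.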