% spirv-dis loop_lt.spv
; SPIR-V
; Version: 1.0
; Generator: Khronos LLVM/SPIR-V Translator; 14
; Bound: 33
; Schema: 0
OpCapability Addresses
OpCapability Linkage
OpCapability Kernel
OpCapability Int64
%1 = OpExtInstImport "OpenCL.std"
OpMemoryModel Physical64 OpenCL
OpEntryPoint Kernel %10 "loop_lt"
OpSource OpenCL_C 102000
OpName %5 "__spirv_BuiltInGlobalInvocationId"
OpName %11 "out"
OpName %12 "iterations"
OpName %13 "entry"
OpName %14 "for.cond"
OpName %15 "for.body"
OpName %16 "for.inc"
OpName %17 "for.end"
OpName %19 "call"
OpName %20 "conv"
OpName %21 "mul"
OpName %22 "conv1"
OpName %24 "inc"
OpName %25 "i.0"
OpName %27 "cmp"
OpName %28 "add"
OpName %29 "idxprom"
OpName %30 "arrayidx"
OpDecorate %5 BuiltIn GlobalInvocationId
OpDecorate %5 Constant
OpDecorate %5 LinkageAttributes "__spirv_BuiltInGlobalInvocationId" Import
%2 = OpTypeInt 64 0
%7 = OpTypeInt 32 0
%23 = OpConstant %7 0
%31 = OpConstant %7 1
%3 = OpTypeVector %2 3
%4 = OpTypePointer UniformConstant %3
%6 = OpTypeVoid
%8 = OpTypePointer CrossWorkgroup %7
%9 = OpTypeFunction %6 %8 %7
%26 = OpTypeBool
%5 = OpVariable %4 UniformConstant
%10 = OpFunction %6 None %9
%11 = OpFunctionParameter %8
%12 = OpFunctionParameter %7
%13 = OpLabel
%18 = OpLoad %3 %5
%19 = OpCompositeExtract %2 %18 0
%20 = OpUConvert %2 %12
%21 = OpIMul %2 %19 %20
%22 = OpUConvert %7 %21
OpBranch %14
%14 = OpLabel
%25 = OpPhi %7 %23 %13 %24 %16
%27 = OpULessThan %26 %25 %12
OpBranchConditional %27 %15 %17
%15 = OpLabel
%28 = OpIAdd %7 %22 %25
%29 = OpUConvert %2 %28
%30 = OpInBoundsPtrAccessChain %8 %11 %29
OpStore %30 %28 Aligned 4
OpBranch %16
%16 = OpLabel
%24 = OpIAdd %7 %25 %31
OpBranch %14
%17 = OpLabel
OpReturn
OpFunctionEnd
MAIN:-1 ()
---
BB:0 (0 instructions) - df = { }
loop_lt:10 ()
---
BB:0 (2 instructions) - df = { }
  -> BB:2 (tree)
  0: ld u32 %r42 c0[0x8] (0)
  1: bra BB:2 (0)
---
  <- BB:0 (tree)
BB:2 (19 instructions) - idom = BB:0, df = { }
  -> BB:3 (tree)
  2: rdsv u32 %r43 sv[TID:0] (0)
  3: rdsv u32 %r45 sv[CTAID:0] (0)
  4: mov u32 %r46 %r43 (0)
  5: mad u32 %r47 %r45 c7[0x100] %r46 (0)
  6: mov u32 %r48 0x00000000 (0)
  7: mov u32 %r101 %r48 (0)
  8: merge u64 %r49d %r47 %r101 (0)
  9: mov u32 %r102 %r48 (0)
  10: merge u64 %r66d %r42 %r102 (0)
  11: split u64 { %r84 %r85 } %r49d (0)
  12: split u64 { %r86 %r87 } %r66d (0)
  13: mul u32 %r88 %r85 %r86 (0)
  14: mad u32 %r89 %r84 %r87 %r88 (0)
  15: mad (SUBOP:1) u32 %r91 %r84 %r86 %r89 (0)
  16: mul u32 %r90 %r84 %r86 (0)
  17: merge u64 %r67d %r90 %r91 (0)
  18: split u64 { %r68 %r69 } %r67d (0)
  19: mov u32 %r105 %r48 (0)
  20: bra BB:3 (0)
---
  <- BB:6 (back)
  <- BB:2 (tree)
BB:3 (3 instructions) - idom = BB:2, df = { BB:3 }
  -> BB:5 (tree)
  -> BB:4 (tree)
  21: phi u32 %r72 %r104 %r105 (0)
  22: set u8 %p73 lt u32 %r72 c0[0x8] (0)
  23: %p73 bra BB:5 (0)
---
  <- BB:3 (tree)
BB:4 (1 instructions) - idom = BB:3, df = { }
  -> BB:7 (tree)
  24: bra BB:7 (0)
---
  <- BB:4 (tree)
BB:7 (1 instructions) - idom = BB:4, df = { }
  -> BB:1 (tree)
  25: bra BB:1 (0)
---
  <- BB:7 (tree)
BB:1 (1 instructions) - idom = BB:7, df = { }
  26: exit - # (0)
---
  <- BB:3 (tree)
BB:5 (15 instructions) - idom = BB:3, df = { BB:3 }
  -> BB:6 (tree)
  27: add u32 %r74 %r68 %r72 (0)
  28: mov u32 %r75 0x00000000 (0)
  29: mov u32 %r103 %r74 (0)
  30: merge u64 %r76d %r103 %r75 (0)
  31: mov u64 %r78d 0x0000000000000004 (0)
  32: split u64 { %r93 %r94 } %r78d (0)
  33: split u64 { %r95 %r96 } %r76d (0)
  34: mul u32 %r97 %r94 %r95 (0)
  35: mad u32 %r98 %r93 %r96 %r97 (0)
  36: mad (SUBOP:1) u32 %r100 %r93 %r95 %r98 (0)
  37: mul u32 %r99 %r93 %r95 (0)
  38: merge u64 %r79d %r99 %r100 (0)
  39: add u64 %r80d %r79d c0[0x0] (0)
  40: st u32 # g[%r80d+0x0] %r74 (0)
  41: bra BB:6 (0)
---
  <- BB:5 (tree)
BB:6 (3 instructions) - idom = BB:5, df = { BB:3 }
  -> BB:3 (back)
  42: add u32 %r81 %r72 0x00000001 (0)
  43: mov u32 %r104 %r81 (0)
  44: bra BB:3 (0)
}
BB:2: %68 <- live range [18(18), 21) [18 21)
BB:3: %68 <- live range [22(18), 24) [18 21) [22 24)
BB:5: %68 <- live range [27(18), 27) [18 21) [22 24)
BB:6: %68 <- live range [42(18), 45) [18 21) [22 24) [42 45)
After buildLiveSets() (liveSet seems to be the set of live-in values here):
BB:0: {}
BB:2: {42}
BB:3: {68}
BB:4: {}
BB:7: {}
BB:1: {}
BB:5: {68,72}
BB:6: {72}
Strictly speaking, since BB:3 is a successor of BB:6 (via the back edge), BB:6's live set should also include 68. But BB:6 is the first block to be filled in, and at that point BB:3's set is still empty, so BB:6 starts from an empty set (see nv50_ir_ra.cpp:577). In this case, should buildLiveSets() be called twice, or is this sorted out later on?
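
To make the question concrete, here is a minimal sketch of a textbook backward liveness computation on the loop_lt CFG. It is a toy model, not the nv50_ir code: only %r42, %r68 and %r72 are tracked, phi operands are ignored, and the visit order (BB:6 first) is an assumption taken from the description above. Under those assumptions a single pass gives the same live-in sets as the dump, and repeating the pass until nothing changes adds 68 to BB:6.

```python
# Toy model of the loop_lt CFG; only %r42, %r68 and %r72 are tracked.
SUCCS = {
    "BB:0": ["BB:2"],
    "BB:2": ["BB:3"],
    "BB:3": ["BB:5", "BB:4"],
    "BB:4": ["BB:7"],
    "BB:7": ["BB:1"],
    "BB:1": [],
    "BB:5": ["BB:6"],
    "BB:6": ["BB:3"],  # back edge
}
# Values read before any local definition (phi operands are ignored here).
USES = {"BB:2": {42}, "BB:5": {68, 72}, "BB:6": {72}}
# Values defined in the block (72 is the phi result in BB:3).
DEFS = {"BB:0": {42}, "BB:2": {68}, "BB:3": {72}}

# Assumed visit order: BB:6 is filled in first, as described above.
ORDER = ["BB:6", "BB:5", "BB:1", "BB:7", "BB:4", "BB:3", "BB:2", "BB:0"]


def one_pass(live_in):
    """Recompute every live-in set once; return True if any set changed."""
    changed = False
    for bb in ORDER:
        live_out = set()
        for succ in SUCCS[bb]:
            live_out |= live_in[succ]
        new = USES.get(bb, set()) | (live_out - DEFS.get(bb, set()))
        if new != live_in[bb]:
            live_in[bb] = new
            changed = True
    return changed


def solve(max_passes=None):
    """Run passes until a fixed point, or stop after max_passes passes."""
    live_in = {bb: set() for bb in SUCCS}
    passes = 0
    while one_pass(live_in):
        passes += 1
        if max_passes is not None and passes >= max_passes:
            break
    return live_in


single = solve(max_passes=1)  # one pass only
fixed = solve()               # iterate until nothing changes

print(sorted(single["BB:6"]))  # [72]
print(sorted(fixed["BB:6"]))   # [68, 72]
```

In this model the second pass is what propagates 68 around the back edge. Whether the real pass needs a second call or compensates elsewhere is exactly the open question.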
After collectLiveValues() (now it looks like liveSet is the set of live-out values):
BB:0: {42}
BB:2: {68}
BB:3: {68,72}
BB:4: {}
BB:7: {}
BB:1: {}
BB:5: {72}
BB:6: {68,72}
As BB:5 is now processed before BB:6, 68 is not found in BB:5's live-out set, even though it belongs there. Should this be computed in reverse order instead, or will it be fixed in the remainder of BuildIntervalsPass::visit()? 68 will be added to the live set when the sources of the instructions in BB:5 are processed, but in that case it will only be live up to the instruction that uses it, which is the first one of the block. Since 68 is live in BB:6, the function is in SSA form, and 68 is defined in BB:2, 68 should be a live-out of BB:5 once collectLiveValues() has been called. Since BuildIntervalsPass::visit() builds on the live sets of the outgoing BBs, should it process the BBs in reverse order, starting with the outgoing BBs and ending with the incoming ones?
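
The dependency can be sketched with the textbook definition of live-out as the union of the successors' live-in sets. This is an illustration only: it assumes collectLiveValues() behaves like such a union, and the `corrected` sets are hypothetical. With the live-in sets as dumped, BB:5's live-out lacks 68. It appears once BB:6's live-in contains 68.

```python
# Successor lists for the loop blocks of loop_lt.
SUCCS = {"BB:3": ["BB:5", "BB:4"], "BB:5": ["BB:6"], "BB:6": ["BB:3"]}


def live_out(bb, live_in):
    """Union of the live-in sets of all successors of bb."""
    out = set()
    for succ in SUCCS[bb]:
        out |= live_in.get(succ, set())
    return out


# Live-in sets as dumped after buildLiveSets().
dumped = {"BB:3": {68}, "BB:4": set(), "BB:5": {68, 72}, "BB:6": {72}}
# Hypothetical correction: 68 added to BB:6's live-in.
corrected = {**dumped, "BB:6": {68, 72}}

print(sorted(live_out("BB:5", dumped)))     # [72]
print(sorted(live_out("BB:5", corrected)))  # [68, 72]
```

In this simplified view, BB:5's live-out is only as complete as BB:6's live-in, so the visit order alone would not supply the missing 68.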