Manipulating SPIR-V binaries

SPIR-V is a binary format. One implication is that inspecting SPIR-V code requires specialized tools to read and make sense of the data.

This package provides ways to read binaries from files or IO streams. Depending on how deeply you want your SPIR-V binaries to be analyzed, you can choose among a few data structures:

  • PhysicalModule, which is a thin wrapper over the general SPIR-V format. Instructions are parsed into PhysicalInstructions, which encode the most basic structure of a SPIR-V instruction. This data structure is used exclusively for serialization/deserialization purposes, and little can be done with it beyond that.
  • SPIRV.Module, which preserves the general layout of a SPIR-V module and adds more information about its logical layout. Notably, instructions use the Instruction data structure, which encodes logical information about their semantics (SPIRV.OpCode), type and result identifiers (ResultID), and any arguments parsed into Julia types (Float32, Float64, etc.). Pretty-printing is provided for SPIRV.Modules, which shows the SPIR-V module as colored human-readable text on MIME"text/plain" outputs.
  • IR, which splits a SPIR-V module into many secondary associative data structures, grouped by semantics. This is the best form to inspect logical sections of SPIR-V modules such as functions, types, or constants. It may be constructed by hand or from a SPIRV.Module, and is convertible to a SPIRV.Module by aggregating all of the logical sections and serializing them into an instruction stream.

Let's start by looking at the first data structure, PhysicalModule. We will use a tiny vertex shader in SPIR-V form as our binary file.

using SPIRV, Test

bytes = read(joinpath(@__DIR__, "vert.spv"))
1396-element Vector{UInt8}:
 0x03
 0x02
 0x23
 0x07
 0x00
 0x00
 0x01
 0x00
 0x08
 0x00
    ⋮
 0x00
 0xfd
 0x00
 0x01
 0x00
 0x38
 0x00
 0x01
 0x00
pmod = PhysicalModule(bytes)
pmod.instructions
78-element Vector{PhysicalInstruction}:
 PhysicalInstruction(0x0002, 0x0011, nothing, nothing, UInt32[0x00000001])
 PhysicalInstruction(0x0006, 0x000b, nothing, 0x00000001, UInt32[0x4c534c47, 0x6474732e, 0x3035342e, 0x00000000])
 PhysicalInstruction(0x0003, 0x000e, nothing, nothing, UInt32[0x00000000, 0x00000001])
 PhysicalInstruction(0x0009, 0x000f, nothing, nothing, UInt32[0x00000000, 0x00000004, 0x6e69616d, 0x00000000, 0x0000000d, 0x00000012, 0x0000001b, 0x0000001d])
 PhysicalInstruction(0x0003, 0x0003, nothing, nothing, UInt32[0x00000002, 0x000001c2])
 PhysicalInstruction(0x0009, 0x0004, nothing, nothing, UInt32[0x415f4c47, 0x735f4252, 0x72617065, 0x5f657461, 0x64616873, 0x6f5f7265, 0x63656a62, 0x00007374])
 PhysicalInstruction(0x0004, 0x0005, nothing, nothing, UInt32[0x00000004, 0x6e69616d, 0x00000000])
 PhysicalInstruction(0x0006, 0x0005, nothing, nothing, UInt32[0x0000000b, 0x505f6c67, 0x65567265, 0x78657472, 0x00000000])
 PhysicalInstruction(0x0006, 0x0006, nothing, nothing, UInt32[0x0000000b, 0x00000000, 0x505f6c67, 0x7469736f, 0x006e6f69])
 PhysicalInstruction(0x0007, 0x0006, nothing, nothing, UInt32[0x0000000b, 0x00000001, 0x505f6c67, 0x746e696f, 0x657a6953, 0x00000000])
 ⋮
 PhysicalInstruction(0x0005, 0x0051, 0x00000006, 0x00000016, UInt32[0x00000013, 0x00000000])
 PhysicalInstruction(0x0005, 0x0051, 0x00000006, 0x00000017, UInt32[0x00000013, 0x00000001])
 PhysicalInstruction(0x0007, 0x0050, 0x00000007, 0x00000018, UInt32[0x00000016, 0x00000017, 0x00000014, 0x00000015])
 PhysicalInstruction(0x0005, 0x0041, 0x00000019, 0x0000001a, UInt32[0x0000000d, 0x0000000f])
 PhysicalInstruction(0x0003, 0x003e, nothing, nothing, UInt32[0x0000001a, 0x00000018])
 PhysicalInstruction(0x0004, 0x003d, 0x00000007, 0x0000001e, UInt32[0x0000001d])
 PhysicalInstruction(0x0003, 0x003e, nothing, nothing, UInt32[0x0000001b, 0x0000001e])
 PhysicalInstruction(0x0001, 0x00fd, nothing, nothing, UInt32[])
 PhysicalInstruction(0x0001, 0x0038, nothing, nothing, UInt32[])
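
The first two fields of each PhysicalInstruction appear to correspond to the word count and opcode that the SPIR-V specification packs into the first word of every instruction (high 16 bits and low 16 bits, respectively). As a minimal sketch in plain Julia, assuming that layout, the instruction stream can be walked like this; split_first_word and instruction_spans are hypothetical helpers, not part of SPIRV.jl:

```julia
# Hypothetical helpers for illustration; not part of SPIRV.jl.

# Split the first word of an instruction into (word count, opcode).
split_first_word(word::UInt32) = (UInt16(word >> 16), UInt16(word & 0xffff))

# Return the index range covered by each instruction in a header-less word stream.
function instruction_spans(words::AbstractVector{UInt32})
    spans = UnitRange{Int}[]
    i = 1
    while i <= length(words)
        word_count, _ = split_first_word(words[i])
        word_count == 0 && error("invalid word count at index $i")
        push!(spans, i:(i + word_count - 1))
        i += word_count
    end
    return spans
end

# The words of the first three instructions shown above (header excluded).
stream = UInt32[
    0x00020011, 0x00000001,
    0x0006000b, 0x00000001, 0x4c534c47, 0x6474732e, 0x3035342e, 0x00000000,
    0x0003000e, 0x00000000, 0x00000001,
]
spans = instruction_spans(stream)
```

Each range in spans covers one instruction, first word included, which is why the word count of the second instruction (6) is larger than its four operand words.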

There isn't much to look at here. We can assemble this module back into a sequence of words:

words = assemble(pmod)
349-element Vector{UInt32}:
 0x07230203
 0x00010000
 0x00080008
 0x00000023
 0x00000000
 0x00020011
 0x00000001
 0x0006000b
 0x00000001
 0x4c534c47
          ⋮
 0x0004003d
 0x00000007
 0x0000001e
 0x0000001d
 0x0003003e
 0x0000001b
 0x0000001e
 0x000100fd
 0x00010038

which, once reinterpreted as bytes, is identical to the sequence we originally read:

@test reinterpret(UInt8, words) == bytes
Test Passed
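
The first five words of this sequence form the module header. As a rough sketch in plain Julia, assuming the header layout from the SPIR-V specification (magic number, version, generator, bound, schema), they can be decoded as follows; parse_header is a hypothetical helper, not part of SPIRV.jl:

```julia
# Hypothetical helper for illustration; not part of SPIRV.jl.
function parse_header(words::AbstractVector{UInt32})
    length(words) >= 5 || error("a SPIR-V module starts with 5 header words")
    magic, version, generator, bound, schema = words[1:5]
    magic == 0x07230203 || error("unexpected magic number (wrong file or byte order?)")
    # The version word is laid out as 0x00 | major | minor | 0x00.
    major = (version >> 16) & 0xff
    minor = (version >> 8) & 0xff
    return (; version = VersionNumber(major, minor), generator, bound = Int(bound), schema = Int(schema))
end

# The first five words of the assembled module shown above.
header = parse_header(UInt32[0x07230203, 0x00010000, 0x00080008, 0x00000023, 0x00000000])
```

The magic number check also doubles as a byte-order check: the file above starts with the bytes 0x03, 0x02, 0x23, 0x07, which only read as 0x07230203 when interpreted as a little-endian word.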

To see better into our SPIR-V binary, let's parse it into a SPIRV.Module and pretty-print its contents:

mod = SPIRV.Module(bytes)
SPIR-V
Version: 1.0
Generator: 0x00080008
Schema: 0
Bound: 35


      Capability(Shader)
 %1 = ExtInstImport("GLSL.std.450")
      MemoryModel(Logical, GLSL450)
      EntryPoint(Vertex, %4, "main", %13, %18, %27, %29)
      Source(GLSL, 0x000001c2)
      SourceExtension("GL_ARB_separate_shader_objects")
      Name(%4, "main")
      Name(%11, "gl_PerVertex")
      MemberName(%11, 0x00000000, "gl_Position")
      MemberName(%11, 0x00000001, "gl_PointSize")
      MemberName(%11, 0x00000002, "gl_ClipDistance")
      MemberName(%11, 0x00000003, "gl_CullDistance")
      Name(%13, "")
      Name(%18, "in_position")
      Name(%27, "frag_color")
      Name(%29, "in_color")
      Name(%32, "UniformBufferObject")
      MemberName(%32, 0x00000000, "model")
      MemberName(%32, 0x00000001, "view")
      MemberName(%32, 0x00000002, "proj")
      Name(%34, "mvp")
      MemberDecorate(%11, 0x00000000, BuiltIn, Position)
      MemberDecorate(%11, 0x00000001, BuiltIn, PointSize)
      MemberDecorate(%11, 0x00000002, BuiltIn, ClipDistance)
      MemberDecorate(%11, 0x00000003, BuiltIn, CullDistance)
      Decorate(%11, Block)
      Decorate(%18, Location, 0x00000000)
      Decorate(%27, Location, 0x00000000)
      Decorate(%29, Location, 0x00000001)
      MemberDecorate(%32, 0x00000000, ColMajor)
      MemberDecorate(%32, 0x00000000, Offset, 0x00000000)
      MemberDecorate(%32, 0x00000000, MatrixStride, 0x00000010)
      MemberDecorate(%32, 0x00000001, ColMajor)
      MemberDecorate(%32, 0x00000001, Offset, 0x00000040)
      MemberDecorate(%32, 0x00000001, MatrixStride, 0x00000010)
      MemberDecorate(%32, 0x00000002, ColMajor)
      MemberDecorate(%32, 0x00000002, Offset, 0x00000080)
      MemberDecorate(%32, 0x00000002, MatrixStride, 0x00000010)
      Decorate(%32, Block)
      Decorate(%34, DescriptorSet, 0x00000000)
      Decorate(%34, Binding, 0x00000000)
 %2 = TypeVoid()
 %3 = TypeFunction(%2)
 %6 = TypeFloat(0x00000020)
 %7 = TypeVector(%6, 0x00000004)
 %8 = TypeInt(0x00000020, 0x00000000)
 %9 = Constant(0x00000001)::%8
%10 = TypeArray(%6, %9)
%11 = TypeStruct(%7, %6, %10, %10)
%12 = TypePointer(Output, %11)
%13 = Variable(Output)::%12
%14 = TypeInt(0x00000020, 0x00000001)
%15 = Constant(0x00000000)::%14
%16 = TypeVector(%6, 0x00000002)
%17 = TypePointer(Input, %16)
%18 = Variable(Input)::%17
%20 = Constant(0x00000000)::%6
%21 = Constant(0x3f800000)::%6
%25 = TypePointer(Output, %7)
%27 = Variable(Output)::%25
%28 = TypePointer(Input, %7)
%29 = Variable(Input)::%28
%31 = TypeMatrix(%7, 0x00000004)
%32 = TypeStruct(%31, %31, %31)
%33 = TypePointer(Uniform, %32)
%34 = Variable(Uniform)::%33
 %4 = Function(None, %3)::%2
 %5 = Label()
%19 = Load(%18)::%16
%22 = CompositeExtract(%19, 0x00000000)::%6
%23 = CompositeExtract(%19, 0x00000001)::%6
%24 = CompositeConstruct(%22, %23, %20, %21)::%7
%26 = AccessChain(%13, %15)::%25
      Store(%26, %24)
%30 = Load(%29)::%7
      Store(%27, %30)
      Return()
      FunctionEnd()

The contents remain unchanged, which we can verify:

@test assemble(mod) == words
Test Passed
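
Several arguments in the listing above are easier to read once decoded from their raw words. The following sketch, in plain Julia, shows how a string operand and a few hexadecimal literals map back to familiar values; decode_string is a hypothetical helper, not part of SPIRV.jl, and assumes strings are null-terminated UTF-8 packed into little-endian words, as described by the SPIR-V specification:

```julia
# Hypothetical helper for illustration; not part of SPIRV.jl.
# Decode a null-terminated string packed into little-endian 32-bit words.
function decode_string(words::AbstractVector{UInt32})
    bytes = UInt8[]
    for word in words, shift in 0:8:24
        byte = UInt8((word >> shift) & 0xff)
        byte == 0x00 && return String(bytes)
        push!(bytes, byte)
    end
    return String(bytes)
end

# Operand words of the ExtInstImport instruction.
import_name = decode_string(UInt32[0x4c534c47, 0x6474732e, 0x3035342e, 0x00000000])

# Bit pattern of constant %21, whose type %6 is a 32-bit float.
one = reinterpret(Float32, 0x3f800000)

# GLSL version stored in the Source instruction.
glsl_version = Int(0x000001c2)
```

Here import_name is "GLSL.std.450", one is 1.0f0, and glsl_version is 450.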

Now, let's go all the way and do some work to understand the contents of this SPIR-V module:

ir = IR(bytes)
SPIR-V
Version: 1.0
Generator: 0x00080008
Schema: 0
Bound: 35


      Capability(Shader)
 %1 = ExtInstImport("GLSL.std.450")
      MemoryModel(Logical, GLSL450)
      EntryPoint(Vertex, %main, "main", %13, %in_position, %frag_color, %in_color)
      Source(GLSL, 0x000001c2)
      SourceExtension("GL_ARB_separate_shader_objects")
      Name(%main, "main")
      Name(%gl_PerVertex, "gl_PerVertex")
      MemberName(%gl_PerVertex, 0x00000000, "gl_Position")
      MemberName(%gl_PerVertex, 0x00000001, "gl_PointSize")
      MemberName(%gl_PerVertex, 0x00000002, "gl_ClipDistance")
      MemberName(%gl_PerVertex, 0x00000003, "gl_CullDistance")
      Name(%13, "")
      Name(%in_position, "in_position")
      Name(%frag_color, "frag_color")
      Name(%in_color, "in_color")
      Name(%UniformBufferObject, "UniformBufferObject")
      MemberName(%UniformBufferObject, 0x00000000, "model")
      MemberName(%UniformBufferObject, 0x00000001, "view")
      MemberName(%UniformBufferObject, 0x00000002, "proj")
      Name(%mvp, "mvp")
      Decorate(%gl_PerVertex, Block)
      MemberDecorate(%gl_PerVertex, 0x00000000, BuiltIn, Position)
      MemberDecorate(%gl_PerVertex, 0x00000001, BuiltIn, PointSize)
      MemberDecorate(%gl_PerVertex, 0x00000002, BuiltIn, ClipDistance)
      MemberDecorate(%gl_PerVertex, 0x00000003, BuiltIn, CullDistance)
      Decorate(%in_position, Location, 0x00000000)
      Decorate(%frag_color, Location, 0x00000000)
      Decorate(%in_color, Location, 0x00000001)
      Decorate(%UniformBufferObject, Block)
      MemberDecorate(%UniformBufferObject, 0x00000000, ColMajor)
      MemberDecorate(%UniformBufferObject, 0x00000000, MatrixStride, 0x00000010)
      MemberDecorate(%UniformBufferObject, 0x00000000, Offset, 0x00000000)
      MemberDecorate(%UniformBufferObject, 0x00000001, ColMajor)
      MemberDecorate(%UniformBufferObject, 0x00000001, MatrixStride, 0x00000010)
      MemberDecorate(%UniformBufferObject, 0x00000001, Offset, 0x00000040)
      MemberDecorate(%UniformBufferObject, 0x00000002, ColMajor)
      MemberDecorate(%UniformBufferObject, 0x00000002, MatrixStride, 0x00000010)
      MemberDecorate(%UniformBufferObject, 0x00000002, Offset, 0x00000080)
      Decorate(%mvp, Binding, 0x00000000)
      Decorate(%mvp, DescriptorSet, 0x00000000)
 %2 = TypeVoid()
 %3 = TypeFunction(%2)
 %6 = TypeFloat(0x00000020)
 %7 = TypeVector(%6, 0x00000004)
 %8 = TypeInt(0x00000020, 0x00000000)
 %9 = Constant(0x00000001)::%8
%10 = TypeArray(%6, %9)
%gl_PerVertex = TypeStruct(%7, %6, %10, %10)
%12 = TypePointer(Output, %gl_PerVertex)
%13 = Variable(Output)::%12
%14 = TypeInt(0x00000020, 0x00000001)
%15 = Constant(0x00000000)::%14
%16 = TypeVector(%6, 0x00000002)
%17 = TypePointer(Input, %16)
%in_position = Variable(Input)::%17
%20 = Constant(0x00000000)::%6
%21 = Constant(0x3f800000)::%6
%25 = TypePointer(Output, %7)
%frag_color = Variable(Output)::%25
%28 = TypePointer(Input, %7)
%in_color = Variable(Input)::%28
%31 = TypeMatrix(%7, 0x00000004)
%UniformBufferObject = TypeStruct(%31, %31, %31)
%33 = TypePointer(Uniform, %UniformBufferObject)
%mvp = Variable(Uniform)::%33
 %main = Function(None, %3)::%2
 %5 = Label()
%19 = Load(%in_position)::%16
%22 = CompositeExtract(%19, 0x00000000)::%6
%23 = CompositeExtract(%19, 0x00000001)::%6
%24 = CompositeConstruct(%22, %23, %20, %21)::%7
%26 = AccessChain(%13, %15)::%25
      Store(%26, %24)
%30 = Load(%in_color)::%7
      Store(%frag_color, %30)
      Return()
      FunctionEnd()

The IR is still printed as an instruction stream, but that is only for pretty-printing; the data structure itself loses the sequential layout of the module. Pretty-printing first transforms the IR into a SPIRV.Module, then uses some of the information stored in the IR to enhance the display. Notice how some identifiers that were previously printed as numbers are now shown by name, such as %frag_color.
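
To illustrate the kind of substitution involved, here is a small sketch in plain Julia. It is not how SPIRV.jl implements its display; display_id and the debug_names table are hypothetical, with the table filled in from a few of the Name instructions printed above. Identifiers with no name, or with an empty one such as %13, keep their number:

```julia
# Hypothetical illustration; not SPIRV.jl's implementation.
# Names taken from a few `Name` instructions of the module above.
debug_names = Dict(4 => "main", 11 => "gl_PerVertex", 13 => "", 18 => "in_position")

# Show an identifier by name when a non-empty name is known, by number otherwise.
function display_id(id::Integer, names::AbstractDict)
    name = get(names, id, "")
    return isempty(name) ? "%$id" : "%$name"
end

shown = [display_id(id, debug_names) for id in (4, 13, 18, 19)]
```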

This IR data structure is not public, in the sense that we do not commit to a stable, semantically versioned implementation; it was notably designed in the context of the Julia to SPIR-V compiler, which is still experimental. However, if you want to explore a given SPIR-V module, you can inspect its fields to make sense of its contents:

ir.fdefs
SPIRV.BijectiveMapping with 1 elements:
  %4 <=> FunctionDefinition (0 arguments, 1 block, 10 expressions)
ir.constants
SPIRV.BijectiveMapping with 4 elements:
   %9 <=> SPIRV.Constant(0x00000001, IntegerType(32, false), Base.RefValue{Bool}(false))
  %15 <=> SPIRV.Constant(0, IntegerType(32, true), Base.RefValue{Bool}(false))
  %20 <=> SPIRV.Constant(0.0f0, FloatType(32), Base.RefValue{Bool}(false))
  %21 <=> SPIRV.Constant(1.0f0, FloatType(32), Base.RefValue{Bool}(false))
ir.types
SPIRV.BijectiveMapping with 16 elements:
   %2 <=> VoidType()
   %3 <=> SPIRV.FunctionType(VoidType(), SPIRType[])
   %6 <=> FloatType(32)
   %7 <=> VectorType(FloatType(32), 4)
   %8 <=> IntegerType(32, false)
  %10 <=> ArrayType(FloatType(32), SPIRV.Constant(0x00000001, IntegerType(32, false), Base.RefValue{Bool}(false)))
  %11 <=> StructType(Base.UUID("7d98aeb6-8711-11ef-3f0c-6d4787e071ed"), SPIRType[VectorType(FloatType(32), 4), FloatType(32), ArrayType(FloatType(32), SPIRV.Constant(0x00000001, IntegerType(32, false), Base.RefValue{Bool}(false))), ArrayType(FloatType(32), SPIRV.Constant(0x00000001, IntegerType(32, false), Base.RefValue{Bool}(false)))])
  %12 <=> PointerType(SPIRV.StorageClassOutput, StructType(Base.UUID("7d98aeb6-8711-11ef-3f0c-6d4787e071ed"), SPIRType[VectorType(FloatType(32), 4), FloatType(32), ArrayType(FloatType(32), SPIRV.Constant(0x00000001, IntegerType(32, false), Base.RefValue{Bool}(false))), ArrayType(FloatType(32), SPIRV.Constant(0x00000001, IntegerType(32, false), Base.RefValue{Bool}(false)))]))
  %14 <=> IntegerType(32, true)
  %16 <=> VectorType(FloatType(32), 2)
  %17 <=> PointerType(SPIRV.StorageClassInput, VectorType(FloatType(32), 2))
  %25 <=> PointerType(SPIRV.StorageClassOutput, VectorType(FloatType(32), 4))
  %28 <=> PointerType(SPIRV.StorageClassInput, VectorType(FloatType(32), 4))
  %31 <=> MatrixType(VectorType(FloatType(32), 4), 4, true)
  %32 <=> StructType(Base.UUID("7da3b63a-8711-11ef-2d71-1bdc96eaf652"), SPIRType[MatrixType(VectorType(FloatType(32), 4), 4, true), MatrixType(VectorType(FloatType(32), 4), 4, true), MatrixType(VectorType(FloatType(32), 4), 4, true)])
  %33 <=> PointerType(SPIRV.StorageClassUniform, StructType(Base.UUID("7da3b63a-8711-11ef-2d71-1bdc96eaf652"), SPIRType[MatrixType(VectorType(FloatType(32), 4), 4, true), MatrixType(VectorType(FloatType(32), 4), 4, true), MatrixType(VectorType(FloatType(32), 4), 4, true)]))
ir.capabilities
1-element Vector{SPIRV.Capability}:
 CapabilityShader::Capability = 0x00000001
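
The <=> notation above suggests that these mappings can be queried in both directions, from identifier to value and from value to identifier. As a conceptual sketch only (TwoWayMap and link! are hypothetical names, and this is not the implementation of SPIRV.BijectiveMapping), such a structure can be built from a pair of dictionaries kept in sync:

```julia
# Hypothetical illustration of a two-way mapping; not SPIRV.BijectiveMapping itself.
struct TwoWayMap{K,V}
    forward::Dict{K,V}
    backward::Dict{V,K}
end
TwoWayMap{K,V}() where {K,V} = TwoWayMap{K,V}(Dict{K,V}(), Dict{V,K}())

# Register a key/value pair, refusing anything that would break the bijection.
function link!(m::TwoWayMap, key, value)
    haskey(m.forward, key) && error("key $key is already mapped")
    haskey(m.backward, value) && error("value $value is already mapped")
    m.forward[key] = value
    m.backward[value] = key
    return m
end

types = TwoWayMap{Int,String}()
link!(types, 6, "FloatType(32)")
link!(types, 8, "IntegerType(32, false)")

types.forward[6]                 # look up a type by identifier
types.backward["FloatType(32)"]  # look up an identifier by type
```

Keeping both directions makes it cheap to answer "which identifier holds this type?", a question that comes up whenever a type or constant must be reused rather than declared twice.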

Let's now convert the IR back into a SPIRV.Module, that is, into a sequence of instructions. Note that the regenerated module may differ slightly from the original module; extra metadata may be added or removed.

regenerated = SPIRV.Module(ir)

setdiff(mod, regenerated)
Instruction[]
setdiff(regenerated, mod)
Instruction[]
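
Both differences being empty means that the two modules contain the same instructions, even though their order may differ (compare where Decorate(%11, Block) appears in the two listings above). A small sketch in plain Julia, using strings as stand-ins for instructions, shows why setdiff is the appropriate tool here; same_elements is a hypothetical helper:

```julia
# Strings stand in for instructions; the two vectors differ only in order.
printed_mod = ["MemberDecorate(%11, 0x00000000, BuiltIn, Position)", "Decorate(%11, Block)"]
printed_ir  = ["Decorate(%11, Block)", "MemberDecorate(%11, 0x00000000, BuiltIn, Position)"]

# Hypothetical helper: true when both collections hold the same distinct elements.
same_elements(a, b) = isempty(setdiff(a, b)) && isempty(setdiff(b, a))

ordered_equal = printed_mod == printed_ir             # false: order matters for ==
unordered_equal = same_elements(printed_mod, printed_ir)  # true: order is ignored
```

Note that setdiff treats its arguments as sets, so this comparison also ignores duplicated elements; it checks membership, not multiplicity.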

This page was generated using Literate.jl.