MLIR编译器手册,Dialect及Operation详解

发布时间 2023-04-28 05:07:01作者: 吴建明wujianming
MLIR编译器手册,Dialect及Operation详解
https://zhuanlan.zhihu.com/p/582517107
MLIR Language Reference
MLIR语言参考
 
MLIR(Multi-Level IR)是一种编译器中间表示,与传统的三地址SSA表示(如LLVM IR或SIL)相似,但它引入了多面体循环优化的概念作为一级概念。这种混合设计经过优化,可以表示、分析和转换高级数据流图以及为高性能数据并行系统生成的特定目标代码。除了它的代表性功能之外,它的单一连续设计提供了一个框架,可以从数据流图降低到高性能的目标特定代码。
 
本文档定义并描述了MLIR中的关键概念,旨在成为一份枯燥的参考文档——基本原理文档、术语表和其他内容托管在其他地方。
 
MLIR被设计为以三种不同的形式使用:一种适合调试的人类可读文本形式,一种适合编程转换和分析的内存形式,以及一种适合存储和传输的紧凑序列化形式。不同的形式都描述了相同的语义内容。本文档描述了人类可读的文本形式。
MLIR (Multi-Level IR) is a compiler intermediate representation with similarities to traditional three-address SSA representations (like LLVM IR or SIL), but which introduces notions from polyhedral loop optimization as first-class concepts. This hybrid design is optimized to represent, analyze, and transform high level dataflow graphs as well as target-specific code generated for high performance data parallel systems. Beyond its representational capabilities, its single continuous design provides a framework to lower from dataflow graphs to high-performance target-specific code.
This document defines and describes the key concepts in MLIR, and is intended to be a dry reference document - the rationale documentation, glossary, and other content are hosted elsewhere.
MLIR is designed to be used in three different forms: a human-readable textual form suitable for debugging, an in-memory form suitable for programmatic transformations and analysis, and a compact serialized form suitable for storage and transport. The different forms all describe the same semantic content. This document describes the human-readable textual form.
High-Level Structure
高层结构
MLIR基本上是基于节点(称为运算)和边(称为值)的类似图形的数据结构。每个值都是一个操作或块参数的结果,并且具有由类型系统定义的值类型。操作包含在块中,块包含在区域中。操作也在其包含块中排序,块在其包含区域中排序,尽管这种顺序在给定类型的区域中可能有语义意义,也可能没有语义意义)。操作还可以包含区域,从而能够表示层次结构。
操作可以代表许多不同的概念,从更高级别的概念,如函数定义、函数调用、缓冲区分配、缓冲区视图或切片以及进程创建,到更低级别的概念(如目标无关算术、目标特定指令、配置寄存器和逻辑门)。这些不同的概念由MLIR中的不同操作来表示,并且MLIR中可用的操作集可以任意扩展。
MLIR还使用熟悉的编译器过程概念,为操作转换提供了一个可扩展的框架。对任意一组操作启用任意一组传递会带来显著的扩展挑战,因为每次转换都可能考虑到任何操作的语义。MLIR通过允许使用特征和接口抽象地描述操作语义来解决这种复杂性,从而使转换能够更通用地操作操作。特征通常描述有效IR上的验证约束,使复杂的不变量能够被捕获和检查。
 
MLIR的一个明显应用是表示基于SSA的IR,如LLVM核心IR,通过适当的操作类型选择来定义模块、函数、分支、内存分配和验证约束,以确保SSA优势属性。MLIR包括一组方言,这些方言定义了这样的结构。然而,MLIR旨在足够通用,以表示其他类似编译器的数据结构,例如语言前端中的抽象语法树、目标特定后端中生成的指令或高级合成工具中的电路。
以下是MLIR模块的示例:
MLIR is fundamentally based on a graph-like data structure of nodes, called Operations, and edges, called Values. Each Value is the result of exactly one Operation or Block Argument, and has a Value Type defined by the type system. Operations are contained in Blocks and Blocks are contained in Regions. Operations are also ordered within their containing block and Blocks are ordered in their containing region, although this order may or may not be semantically meaningful in a given kind of region). Operations may also contain regions, enabling hierarchical structures to be represented.
Operations can represent many different concepts, from higher-level concepts like function definitions, function calls, buffer allocations, view or slices of buffers, and process creation, to lower-level concepts like target-independent arithmetic, target-specific instructions, configuration registers, and logic gates. These different concepts are represented by different operations in MLIR and the set of operations usable in MLIR can be arbitrarily extended.
MLIR also provides an extensible framework for transformations on operations, using familiar concepts of compiler Passes. Enabling an arbitrary set of passes on an arbitrary set of operations results in a significant scaling challenge, since each transformation must potentially take into account the semantics of any operation. MLIR addresses this complexity by allowing operation semantics to be described abstractly using Traits and Interfaces, enabling transformations to operate on operations more generically. Traits often describe verification constraints on valid IR, enabling complex invariants to be captured and checked. (see Op vs Operation)
One obvious application of MLIR is to represent an SSA-based IR, like the LLVM core IR, with appropriate choice of operation types to define Modules, Functions, Branches, Memory Allocation, and verification constraints to ensure the SSA Dominance property. MLIR includes a collection of dialects which defines just such structures. However, MLIR is intended to be general enough to represent other compiler-like data structures, such as Abstract Syntax Trees in a language frontend, generated instructions in a target-specific backend, or circuits in a High-Level Synthesis tool.
Here’s an example of an MLIR module:
// Compute A*B using an implementation of multiply kernel and print the // result using a TensorFlow op. The dimensions of A and B are partially // known. The shapes are assumed to match. func.func @mul(%A: tensor<100x?xf32>, %B: tensor<?x50xf32>) -> (tensor<100x50xf32>) { // Compute the inner dimension of %A using the dim operation. %n = memref.dim %A, 1 : tensor<100x?xf32> // Allocate addressable "buffers" and copy tensors %A and %B into them. %A_m = memref.alloc(%n) : memref<100x?xf32> memref.tensor_store %A to %A_m : memref<100x?xf32> %B_m = memref.alloc(%n) : memref<?x50xf32> memref.tensor_store %B to %B_m : memref<?x50xf32> // Call function @multiply passing memrefs as arguments, // and getting returned the result of the multiplication. %C_m = call @multiply(%A_m, %B_m) : (memref<100x?xf32>, memref<?x50xf32>) -> (memref<100x50xf32>) memref.dealloc %A_m : memref<100x?xf32> memref.dealloc %B_m : memref<?x50xf32> // Load the buffer data into a higher level "tensor" value. %C = memref.tensor_load %C_m : memref<100x50xf32> memref.dealloc %C_m : memref<100x50xf32> // Call TensorFlow built-in function to print the result tensor. "tf.Print"(%C){message: "mul result"} : (tensor<100x50xf32>) -> (tensor<100x50xf32>) return %C : tensor<100x50xf32> } // A function that multiplies two memrefs and returns the result. func.func @multiply(%A: memref<100x?xf32>, %B: memref<?x50xf32>) -> (memref<100x50xf32>) { // Compute the inner dimension of %A. %n = memref.dim %A, 1 : memref<100x?xf32> // Allocate memory for the multiplication result. %C = memref.alloc() : memref<100x50xf32> // Multiplication loop nest. affine.for %i = 0 to 100 { affine.for %j = 0 to 50 { memref.store 0 to %C[%i, %j] : memref<100x50xf32> affine.for %k = 0 to %n { %a_v = memref.load %A[%i, %k] : memref<100x?xf32> %b_v = memref.load %B[%k, %j] : memref<?x50xf32> %prod = arith.mulf %a_v, %b_v : f32 %c_v = memref.load %C[%i, %j] : memref<100x50xf32> %sum = arith.addf %c_v, %prod : f32 memref.store %sum, %C[%i, %j] : memref<100x50xf32> } } } return %C : memref<100x50xf32> }
Notation
符号
MLIR有一个简单而明确的语法,允许它可靠地往返于文本形式。这对编译器的开发很重要,例如,对于理解正在转换的代码的状态和编写测试用例。
本文档描述了使用扩展Backus-Naur巴克斯-诺尔形式(EBNF)的语法。
这是本文档中使用的EBNF语法,用黄色方框表示。
MLIR has a simple and unambiguous grammar, allowing it to reliably round-trip through a textual form. This is important for development of the compiler - e.g. for understanding the state of code as it is being transformed and writing test cases.
This document describes the grammar using Extended Backus-Naur Form (EBNF).
This is the EBNF grammar used in this document, presented in yellow boxes.
alternation ::= expr0 | expr1 | expr2 // Either expr0 or expr1 or expr2. sequence ::= expr0 expr1 expr2 // Sequence of expr0 expr1 expr2. repetition0 ::= expr* // 0 or more occurrences. repetition1 ::= expr+ // 1 or more occurrences. optionality ::= expr? // 0 or 1 occurrence. grouping ::= (expr) // Everything inside parens is grouped together. literal ::= `abcd` // Matches the literal `abcd`.
Code examples are presented in blue boxes.
// This is an example use of the grammar above: // This matches things like: ba, bana, boma, banana, banoma, bomana... example ::= `b` (`an` | `om`)* `a`
Common syntax
The following core grammar productions are used in this document:
// TODO: Clarify the split between lexing (tokens) and parsing (grammar). digit ::= [0-9] hex_digit ::= [0-9a-fA-F] letter ::= [a-zA-Z] id-punct ::= [$._-] integer-literal ::= decimal-literal | hexadecimal-literal decimal-literal ::= digit+ hexadecimal-literal ::= `0x` hex_digit+ float-literal ::= [-+]?[0-9]+[.][0-9]*([eE][-+]?[0-9]+)? string-literal ::= `"` [^"\n\f\v\r]* `"` TODO: define escaping rules
Not listed here, but MLIR does support comments. They use standard BCPL syntax, starting with a // and going until the end of the line.
Top level Productions
// Top level production toplevel := (operation | attribute-alias-def | type-alias-def)*
The production toplevel is the top level production that is parsed by any parsing consuming the MLIR syntax. Operations, Attribute aliases, and Type aliases can be declared on the toplevel.
Identifiers and keywords
Syntax:
// Identifiers bare-id ::= (letter|[_]) (letter|digit|[_$.])* bare-id-list ::= bare-id (`,` bare-id)* value-id ::= `%` suffix-id alias-name :: = bare-id suffix-id ::= (digit+ | ((letter|id-punct) (letter|id-punct|digit)*)) symbol-ref-id ::= `@` (suffix-id | string-literal) (`::` symbol-ref-id)? value-id-list ::= value-id (`,` value-id)* // Uses of value, e.g. in an operand list to an operation. value-use ::= value-id value-use-list ::= value-use (`,` value-use)*
标识符命名实体,如值、类型和函数,并由MLIR代码的编写者选择。标识符可以是描述性的(例如%batch_size、@matmul),也可以在自动生成时是非描述性的(如%23、@func42)。值的标识符名称可以在MLIR文本文件中使用,但不会作为IR的一部分保留-打印机会给它们提供匿名名称,如%42。
MLIR通过在标识符前面加一个sigil(例如%、#、@、^、!)来保证标识符永远不会与关键字冲突。在某些明确的上下文(例如仿射表达式)中,为了简洁起见,标识符没有前缀。可以将新的关键字添加到MLIR的未来版本中,而不会有与现有标识符冲突的危险。
值标识符只在定义它们的(嵌套)区域的范围内,不能在该区域之外访问或引用。映射函数中的参数标识符在映射主体的作用域中。特定的操作可能会进一步限制哪些标识符在其区域的范围内。例如,具有SSA控制流语义的区域中的值的范围是根据SSA优势的标准定义来约束的。另一个例子是IsolatedFromAbove特性,它限制直接访问包含区域中定义的值。
函数标识符和映射标识符与符号相关联,并且具有依赖于符号属性的作用域规则
Identifiers name entities such as values, types and functions, and are chosen by the writer of MLIR code. Identifiers may be descriptive (e.g. %batch_size, @matmul), or may be non-descriptive when they are auto-generated (e.g. %23, @func42). Identifier names for values may be used in an MLIR text file but are not persisted as part of the IR - the printer will give them anonymous names like %42.
MLIR guarantees identifiers never collide with keywords by prefixing identifiers with a sigil (e.g. %, #, @, ^, !). In certain unambiguous contexts (e.g. affine expressions), identifiers are not prefixed, for brevity. New keywords may be added to future versions of MLIR without danger of collision with existing identifiers.
Value identifiers are only in scope for the (nested) region in which they are defined and cannot be accessed or referenced outside of that region. Argument identifiers in mapping functions are in scope for the mapping body. Particular operations may further limit which identifiers are in scope in their regions. For instance, the scope of values in a region with SSA control flow semantics is constrained according to the standard definition of SSA dominance. Another example is the IsolatedFromAbove trait, which restricts directly accessing values defined in containing regions.
Function identifiers and mapping identifiers are associated with Symbols and have scoping rules dependent on symbol attributes.
Dialects
方言
方言是参与和扩展MLIR生态系统的机制。它们允许定义新的操作以及属性和类型。每个方言都有一个唯一的名称空间,该名称空间以每个定义的属性/操作/类型为前缀。例如,仿射方言定义了名称空间:Affine。
 
MLIR允许多种方言,甚至是主树之外的方言,在一个模块内共存。方言是由某些通行证产生和使用的。MLIR提供了一个在不同方言之间和内部转换的框架。
MLIR支持的几种方言:
 
Dialects are the mechanism by which to engage with and extend the MLIR ecosystem. They allow for defining new operations, as well as attributes and types. Each dialect is given a unique namespace that is prefixed to each defined attribute/operation/type. For example, the Affine dialect defines the namespace: affine.
MLIR allows for multiple dialects, even those outside of the main tree, to co-exist together within one module. Dialects are produced and consumed by certain passes. MLIR provides a framework to convert between, and within, different dialects.
A few of the dialects supported by MLIR:
Target specific operations
目标特定操作
方言提供了一种模块化的方式,通过这种方式,目标可以直接向MLIR公开特定于目标的操作。例如,一些目标通过LLVM。LLVM具有一组丰富的内部函数,用于某些与目标无关的操作(例如,带溢出检查的加法),并为其支持的目标提供对目标特定操作的访问(例如,向量置换操作)。MLIR中的LLVM内部函数通过以“LLVM.”名称开头的操作来表示。
例子:
Dialects provide a modular way in which targets can expose target-specific operations directly through to MLIR. As an example, some targets go through LLVM. LLVM has a rich set of intrinsics for certain target-independent operations (e.g. addition with overflow check) as well as providing access to target-specific operations for the targets it supports (e.g. vector permutation operations). LLVM intrinsics in MLIR are represented via operations that start with an “llvm.” name.
Example:
// LLVM: %x = call {i16, i1} @llvm.sadd.with.overflow.i16(i16 %a, i16 %b) %x:2 = "llvm.sadd.with.overflow.i16"(%a, %b) : (i16, i16) -> (i16, i1)
这些操作仅在将LLVM作为后端(例如CPU和GPU)时有效,并且需要与这些内部的LLVM定义保持一致。
These operations only work when targeting LLVM as a backend (e.g. for CPUs and GPUs), and are required to align with the LLVM definition of these intrinsics.
Operations
操作
语法:
Syntax:
operation ::= op-result-list? (generic-operation | custom-operation) trailing-location? generic-operation ::= string-literal `(` value-use-list? `)` successor-list? region-list? dictionary-attribute? `:` function-type custom-operation ::= bare-id custom-operation-format op-result-list ::= op-result (`,` op-result)* `=` op-result ::= value-id (`:` integer-literal) successor-list ::= `[` successor (`,` successor)* `]` successor ::= caret-id (`:` block-arg-list)? region-list ::= `(` region (`,` region)* `)` dictionary-attribute ::= `{` (attribute-entry (`,` attribute-entry)*)? `}` trailing-location ::= (`loc` `(` location `)`)?
MLIR引入了一个称为操作的统一概念,从而能够描述许多不同级别的抽象和计算。MLIR中的操作是完全可扩展的(没有固定的操作列表),并且具有特定于应用程序的语义。例如,MLIR支持独立于目标的操作、仿射操作和特定于目标的机器操作。
操作的内部表示很简单:一个操作由一个唯一的字符串(例如dim、tf.Conv2d、x86.repmovsb、ppc.eieio等)标识,可以返回零个或多个结果,接受零个或更多操作数,具有属性字典,具有零个或更多后继项,以及零个或更少封闭区域。通用打印表单从字面上包括所有这些元素,并带有一个函数类型来指示结果和操作数的类型。
示例:
MLIR introduces a uniform concept called operations to enable describing many different levels of abstractions and computations. Operations in MLIR are fully extensible (there is no fixed list of operations) and have application-specific semantics. For example, MLIR supports target-independent operations, affine operations, and target-specific machine operations.
The internal representation of an operation is simple: an operation is identified by a unique string (e.g. dim, tf.Conv2d, x86.repmovsb, ppc.eieio, etc), can return zero or more results, take zero or more operands, has a dictionary of attributes, has zero or more successors, and zero or more enclosed regions. The generic printing form includes all these elements literally, with a function type to indicate the types of the results and operands.
Example:
// An operation that produces two results. // The results of %result can be accessed via the <name> `#` <opNo> syntax. %result:2 = "foo_div"() : () -> (f32, i32) // Pretty form that defines a unique name for each result. %foo, %bar = "foo_div"() : () -> (f32, i32) // Invoke a TensorFlow function called tf.scramble with two inputs // and an attribute "fruit". %2 = "tf.scramble"(%result#0, %bar) {fruit = "banana"} : (f32, i32) -> f32
除了上面的基本语法之外,方言还可以注册已知的操作。这允许这些方言支持用于解析和打印操作的自定义汇编形式。在下面列出的操作集中,我们显示了这两种形式。
In addition to the basic syntax above, dialects may register known operations. This allows those dialects to support custom assembly form for parsing and printing operations. In the operation sets listed below, we show both forms.
Builtin Operations
内置操作
内置方言定义了一些MLIR方言广泛适用的操作,例如简化方言间/方言内转换的通用转换强制转换操作。这个方言还定义了一个顶级模块操作,它表示一个有用的IR容器。
分块
语法
The builtin dialect defines a select few operations that are widely applicable by MLIR dialects, such as a universal conversion cast operation that simplifies inter/intra dialect conversion. This dialect also defines a top-level module operation, that represents a useful IR container.
Blocks
Syntax:
block ::= block-label operation+ block-label ::= block-id block-arg-list? `:` block-id ::= caret-id caret-id ::= `^` suffix-id value-id-and-type ::= value-id `:` type // Non-empty list of names and types. value-id-and-type-list ::= value-id-and-type (`,` value-id-and-type)* block-arg-list ::= `(` value-id-and-type-list? `)`
块是一个操作列表。在SSACFG区域中,每个块表示编译器基本块,其中块内的指令按顺序执行,终止符操作实现基本块之间的控制流分支。
块中的最后一个操作必须是终止符操作。具有单个块的区域可以通过在封闭操作上附加NoTerminator来选择不满足这一要求。顶级ModuleOp就是这样一个操作的例子,它定义了这一特性,并且其块体没有终止符。
MLIR中的块采用块参数列表,以类似函数的方式表示。块参数绑定到由单个操作的语义指定的值。区域的入口块的块参数也是该区域的参数,并且绑定到这些参数的值由包含操作的语义确定。其他块的块参数是由终止符操作的语义决定的,例如将块作为后续块的分支。在具有控制流的区域中,MLIR利用该结构隐式地表示控制流相关值的通过,而没有传统SSA表示中PHI节点的复杂细微差别。请注意,与控制流无关的值可以直接引用,不需要通过块参数传递。
下面是一个简单的示例函数,显示分支、返回和块参数:
A Block is a list of operations. In SSACFG regions, each block represents a compiler basic block where instructions inside the block are executed in order and terminator operations implement control flow branches between basic blocks.
The last operation in a block must be a terminator operation. A region with a single block may opt out of this requirement by attaching the NoTerminator on the enclosing op. The top-level ModuleOp is an example of such an operation which defines this trait and whose block body does not have a terminator.
Blocks in MLIR take a list of block arguments, notated in a function-like way. Block arguments are bound to values specified by the semantics of individual operations. Block arguments of the entry block of a region are also arguments to the region and the values bound to these arguments are determined by the semantics of the containing operation. Block arguments of other blocks are determined by the semantics of terminator operations, e.g. Branches, which have the block as a successor. In regions with control flow, MLIR leverages this structure to implicitly represent the passage of control-flow dependent values without the complex nuances of PHI nodes in traditional SSA representations. Note that values which are not control-flow dependent can be referenced directly and do not need to be passed through block arguments.
Here is a simple example function showing branches, returns, and block arguments:
func.func @simple(i64, i1) -> i64 { ^bb0(%a: i64, %cond: i1): // Code dominated by ^bb0 may refer to %a cf.cond_br %cond, ^bb1, ^bb2 ^bb1: cf.br ^bb3(%a: i64) // Branch passes %a as the argument ^bb2: %b = arith.addi %a, %a : i64 cf.br ^bb3(%b: i64) // Branch passes %b as the argument // ^bb3 receives an argument, named %c, from predecessors // and passes it on to bb4 along with %a. %a is referenced // directly from its defining operation and is not passed through // an argument of ^bb3. ^bb3(%c: i64): cf.br ^bb4(%c, %a : i64, i64) ^bb4(%d : i64, %e : i64): %0 = arith.addi %d, %e : i64 return %0 : i64 // Return is also a terminator. }
上下文:与传统的“PHI节点是操作”SSA IR(如LLVM)相比,“块参数”表示从IR中消除了许多特殊情况。例如,SSA的并行复制语义是显而易见的,函数参数不再是特例:它们变成了入口块的参数[更多的基本原理]。块也是一个基本概念,不能用操作来表示,因为在操作中定义的值不能在操作之外访问。
Context: The “block argument” representation eliminates a number of special cases from the IR compared to traditional “PHI nodes are operations” SSA IRs (like LLVM). For example, the parallel copy semantics of SSA is immediately apparent, and function arguments are no longer a special case: they become arguments to the entry block [ more rationale]. Blocks are also a fundamental concept that cannot be represented by operations because values defined in an operation cannot be accessed outside the operation.
Regions
Definition
区域
释义
区域是MLIR块的有序列表。区域内的语义不是由IR强加的。相反,包含操作定义了它所包含的区域的语义。MLIR目前定义了两种区域:SSACFG区域和Graph区域,SSACFG区域描述块之间的控制流,Graph区域不需要块间的控制流。操作中的区域类型是使用RegionKindInterface描述的。
区域没有名称或地址,只有区域中包含的块有。区域必须包含在操作中,并且没有类型或属性。该区域中的第一个块是一个称为“入口块”的特殊块。入口块的参数也是区域本身的参数。入口块不能被列为任何其他块的后续块。区域的语法如下:
A region is an ordered list of MLIR Blocks. The semantics within a region is not imposed by the IR. Instead, the containing operation defines the semantics of the regions it contains. MLIR currently defines two kinds of regions: SSACFG regions, which describe control flow between blocks, and Graph regions, which do not require control flow between block. The kinds of regions within an operation are described using the RegionKindInterface.
Regions do not have a name or an address, only the blocks contained in a region do. Regions must be contained within operations and have no type or attributes. The first block in the region is a special block called the ‘entry block’. The arguments to the entry block are also the arguments of the region itself. The entry block cannot be listed as a successor of any other block. The syntax for a region is as follows:
region ::= `{` entry-block? block* `}` entry-block ::= operation+
函数体是区域的一个例子:它由块的CFG组成,并具有其他类型的区域可能没有的额外语义限制。例如,在函数体中,块终止符必须分支到不同的块,或者从函数返回,其中返回参数的类型必须与函数签名的结果类型匹配。同样,函数参数必须与区域参数的类型和计数相匹配。通常,具有区域的操作可以任意定义这些对应关系。
 
入口块是一个没有标签和参数的块,可能出现在区域的开头。它启用了一种使用区域来打开新作用域的通用模式。
A function body is an example of a region: it consists of a CFG of blocks and has additional semantic restrictions that other types of regions may not have. For example, in a function body, block terminators must either branch to a different block, or return from a function where the types of the return arguments must match the result types of the function signature. Similarly, the function arguments must match the types and count of the region arguments. In general, operations with regions can define these correspondences arbitrarily.
An entry block is a block with no label and no arguments that may occur at the beginning of a region. It enables a common pattern of using a region to open a new scope.
Value Scoping
数值范围界定
区域提供程序的分层封装:不可能引用(即分支到)与引用源不在同一区域的块,即终止符操作。类似地,区域为值可见性提供了一个自然的范围:在区域中定义的值不会逃逸到封闭区域(如果有的话)。默认情况下,只要封闭操作的操作数引用这些值是合法的,区域内的操作就可以引用区域外定义的值,但这可以使用特征来限制,如OpTrait::IsolatedFromAbove或自定义验证器。
示例:
Regions provide hierarchical encapsulation of programs: it is impossible to reference, i.e. branch to, a block which is not in the same region as the source of the reference, i.e. a terminator operation. Similarly, regions provides a natural scoping for value visibility: values defined in a region don’t escape to the enclosing region, if any. By default, operations inside a region can reference values defined outside of the region whenever it would have been legal for operands of the enclosing operation to reference those values, but this can be restricted using traits, such as OpTrait::IsolatedFromAbove, or a custom verifier.
Example:
"any_op"(%a) ({ // if %a is in-scope in the containing region... // then %a is in-scope here too. %new_value = "another_op"(%a) : (i64) -> (i64) }) : (i64) -> (i64)
MLIR定义了一个通用的“层次优势”概念,该概念在层次结构中运行,并定义值是否“在范围内”并且可以由特定操作使用。一个值是否可以由同一区域中的另一个操作使用由区域的类型定义。在区域中定义的值可以由在同一区域中具有父级的操作使用,如果且仅当父级可以使用该值。由一个区域的参数定义的值总是可以由该区域中包含的任何操作使用。在区域中定义的值永远不能在区域之外使用。
MLIR defines a generalized ‘hierarchical dominance’ concept that operates across hierarchy and defines whether a value is ‘in scope’ and can be used by a particular operation. Whether a value can be used by another operation in the same region is defined by the kind of region. A value defined in a region can be used by an operation which has a parent in the same region, if and only if the parent could use the value. A value defined by an argument to a region can always be used by any operation deeply contained in the region. A value defined in a region can never be used outside of the region.
Control Flow and SSACFG Regions
控制流和SSACFG区域
在MLIR中,区域的控制流语义由RegionKind::SSACFG表示。非正式地说,这些区域支持区域中的操作“按顺序执行”的语义。在执行操作之前,其操作数具有定义明确的值。执行操作后,操作数具有相同的值,结果也具有定义明确的值。在一个操作执行之后,块中的下一个操作将执行,直到该操作是块末尾的终止符操作,在这种情况下,将执行其他一些操作。下一条要执行的指令的确定是“控制流的传递”。
通常,当控制流被传递到操作时,MLIR不限制控制流何时进入或离开该操作中包含的区域。然而,当控制流进入一个区域时,它总是从该区域的第一个块开始,称为进入块。结束每个块的终止操作通过显式指定块的后续块来表示控制流。在分支操作中,控制流只能传递到指定的后续块之一,或在返回操作中返回到包含操作。没有后续操作的终止操作只能将控制权传递回包含操作。在这些限制范围内,终止符操作的特定语义由所涉及的特定方言操作决定。未被列为终止符操作的后续块的块(入口块除外)被定义为不可访问,并且可以在不影响包含操作的语义的情况下删除。
尽管控制流总是通过入口块进入区域,但控制流可以通过具有适当终止符的任何块离开区域。标准方言利用这一功能来定义具有单输入多输出(SEME)区域的操作,可能流经该区域中的不同块,并通过返回操作退出任何块。这种行为类似于大多数编程语言中的函数体。此外,控制流也可能无法到达块或区域的末尾,例如,如果函数调用没有返回。
示例:
In MLIR, control flow semantics of a region is indicated by RegionKind::SSACFG. Informally, these regions support semantics where operations in a region ‘execute sequentially’. Before an operation executes, its operands have well-defined values. After an operation executes, the operands have the same values and results also have well-defined values. After an operation executes, the next operation in the block executes until the operation is the terminator operation at the end of a block, in which case some other operation will execute. The determination of the next instruction to execute is the ‘passing of control flow’.
In general, when control flow is passed to an operation, MLIR does not restrict when control flow enters or exits the regions contained in that operation. However, when control flow enters a region, it always begins in the first block of the region, called the entry block. Terminator operations ending each block represent control flow by explicitly specifying the successor blocks of the block. Control flow can only pass to one of the specified successor blocks as in a branch operation, or back to the containing operation as in a return operation. Terminator operations without successors can only pass control back to the containing operation. Within these restrictions, the particular semantics of terminator operations is determined by the specific dialect operations involved. Blocks (other than the entry block) that are not listed as a successor of a terminator operation are defined to be unreachable and can be removed without affecting the semantics of the containing operation.
Although control flow always enters a region through the entry block, control flow may exit a region through any block with an appropriate terminator. The standard dialect leverages this capability to define operations with Single-Entry-Multiple-Exit (SEME) regions, possibly flowing through different blocks in the region and exiting through any block with a return operation. This behavior is similar to that of a function body in most programming languages. In addition, control flow may also not reach the end of a block or region, for example if a function call does not return.
Example:
func.func @accelerator_compute(i64, i1) -> i64 { // An SSACFG region ^bb0(%a: i64, %cond: i1): // Code dominated by ^bb0 may refer to %a cf.cond_br %cond, ^bb1, ^bb2 ^bb1: // This def for %value does not dominate ^bb2 %value = "op.convert"(%a) : (i64) -> i64 cf.br ^bb3(%a: i64) // Branch passes %a as the argument ^bb2: accelerator.launch() { // An SSACFG region ^bb0: // Region of code nested under "accelerator.launch", it can reference %a but // not %value. %new_value = "accelerator.do_something"(%a) : (i64) -> () } // %new_value cannot be referenced outside of the region ^bb3: ... }
Operations with Multiple Regions
具有多个区域的操作
包含多个区域的操作也完全决定了这些区域的语义。特别地,当控制流被传递到操作时,它可以将控制流传递到任何包含的区域。当控制流离开一个区域并返回到包含操作时,包含操作可以将控制流传递到同一操作中的任何区域。操作还可以将控制流同时传递到多个包含的区域。操作还可以将控制流传递到在其他操作中指定的区域,特别是那些定义了给定操作在调用操作中使用的值或符号的区域。这种控制的通过通常独立于控制流通过容纳区域的基本块的通过。
An operation containing multiple regions also completely determines the semantics of those regions. In particular, when control flow is passed to an operation, it may transfer control flow to any contained region. When control flow exits a region and is returned to the containing operation, the containing operation may pass control flow to any region in the same operation. An operation may also pass control flow to multiple contained regions concurrently. An operation may also pass control flow into regions that were specified in other operations, in particular those that defined the values or symbols the given operation uses as in a call operation. This passage of control is generally independent of passage of control flow through the basic blocks of the containing region.
Closure
关闭
区域允许定义一个创建闭包的操作,例如通过将区域的主体“装箱”为它们产生的值。它仍然由操作来定义其语义。请注意,如果操作触发了区域的异步执行,则操作调用方有责任等待区域的执行,以确保任何直接使用的值保持有效。
Regions allow defining an operation that creates a closure, for example by “boxing” the body of the region into a value they produce. It remains up to the operation to define its semantics. Note that if an operation triggers asynchronous execution of the region, it is under the responsibility of the operation caller to wait for the region to be executed guaranteeing that any directly used values remain live.
Graph Regions
图形区域
在MLIR中,区域中的类图语义由RegionKind::graph表示。图区域适用于没有控制流的并发语义,或用于建模通用的有向图数据结构。图区域适用于表示耦合值之间的循环关系,其中这些关系没有基本顺序。例如,图区域中的操作可以用表示数据流的值来表示独立的控制线程。在MLIR中,区域的特定语义完全由其包含操作决定。图形区域只能包含一个基本块(入口块)。
In MLIR, graph-like semantics in a region is indicated by RegionKind::Graph. Graph regions are appropriate for concurrent semantics without control flow, or for modeling generic directed graph data structures. Graph regions are appropriate for representing cyclic relationships between coupled values where there is no fundamental order to the relationships. For instance, operations in a graph region may represent independent threads of control with values representing streams of data. As usual in MLIR, the particular semantics of a region is completely determined by its containing operation. Graph regions may only contain a single basic block (the entry block).
理由:目前,图形区域被任意限制为单个基本块,尽管这种限制没有特定的语义原因。添加此限制是为了更容易稳定通行证基础设施和用于处理图区域的常用通行证,以正确处理反馈循环。如果出现需要多块区域的用例,那么将来可能会允许多块区域。
在图区域中,MLIR操作自然表示节点,而每个MLIR值表示连接单个源节点和多个目标节点的多边缘。作为操作结果在区域中定义的所有值都在区域内的范围内,并且可以由区域中的任何其他操作访问。在图区域中,块内的操作顺序和区域中的块顺序在语义上没有意义,非终止符操作可以自由地重新排序,例如通过规范化。其他类型的图,例如具有多个源节点和多个目的节点的图,也可以通过将图边表示为MLIR操作来表示。
请注意,循环可以发生在图形区域中的单个块内,也可以发生在基本块之间。
Rationale: Currently graph regions are arbitrarily limited to a single basic block, although there is no particular semantic reason for this limitation. This limitation has been added to make it easier to stabilize the pass infrastructure and commonly used passes for processing graph regions to properly handle feedback loops. Multi-block regions may be allowed in the future if use cases that require it arise.
In graph regions, MLIR operations naturally represent nodes, while each MLIR value represents a multi-edge connecting a single source node and multiple destination nodes. All values defined in the region as results of operations are in scope within the region and can be accessed by any other operation in the region. In graph regions, the order of operations within a block and the order of blocks in a region is not semantically meaningful and non-terminator operations may be freely reordered, for instance, by canonicalization. Other kinds of graphs, such as graphs with multiple source nodes and multiple destination nodes, can also be represented by representing graph edges as MLIR operations.
Note that cycles can occur within a single block in a graph region, or between basic blocks.
"test.graph_region"() ({ // A Graph region %1 = "op1"(%1, %3) : (i32, i32) -> (i32) // OK: %1, %3 allowed here %2 = "test.ssacfg_region"() ({ %5 = "op2"(%1, %2, %3, %4) : (i32, i32, i32, i32) -> (i32) // OK: %1, %2, %3, %4 all defined in the containing region }) : () -> (i32) %3 = "op2"(%1, %4) : (i32, i32) -> (i32) // OK: %4 allowed here %4 = "op3"(%1) : (i32) -> (i32) }) : () -> ()
Arguments and Results
争论和结果
区域的第一个块的自变量被视为该区域的自变量。这些参数的来源是由父操作的语义定义的。它们可能与操作本身使用的一些值相对应。
区域生成一个值列表(可能为空)。操作语义定义了区域结果和操作结果之间的关系。
The arguments of the first block of a region are treated as arguments of the region. The source of these arguments is defined by the semantics of the parent operation. They may correspond to some of the values the operation itself uses.
Regions produce a (possibly empty) list of values. The operation semantics defines the relation between the region results and the operation results.
Type System
类型系统
MLIR中的每个值都有一个由类型系统定义的类型。MLIR有一个开放的类型系统(即没有固定的类型列表),并且类型可能具有特定于应用程序的语义。MLIR方言可以定义任意数量的类型,对它们所代表的抽象没有任何限制。
Each value in MLIR has a type defined by the type system. MLIR has an open type system (i.e. there is no fixed list of types), and types may have application-specific semantics. MLIR dialects may define any number of types with no restrictions on the abstractions they represent.
type ::= type-alias | dialect-type | builtin-type type-list-no-parens ::= type (`,` type)* type-list-parens ::= `(` `)` | `(` type-list-no-parens `)` // This is a common way to refer to a value with a specified type. ssa-use-and-type ::= ssa-use `:` type ssa-use ::= value-use // Non-empty list of names and types. ssa-use-and-type-list ::= ssa-use-and-type (`,` ssa-use-and-type)* function-type ::= (type | type-list-parens) `->` (type | type-list-parens)
Type Aliases
type-alias-def ::= '!' alias-name '=' type type-alias ::= '!' alias-name
MLIR支持为类型定义命名别名。类型别名是一个标识符,可以用来代替它定义的类型。这些别名必须在使用之前进行定义。别名不能包含“.”,因为这些名称是为方言类型保留的。
示例:
MLIR supports defining named aliases for types. A type alias is an identifier that can be used in the place of the type that it defines. These aliases must be defined before their uses. Alias names may not contain a ‘.’, since those names are reserved for dialect types.
Example:
!avx_m128 = vector<4 x f32> // Using the original type. "foo"(%x) : vector<4 x f32> -> () // Using the type alias. "foo"(%x) : !avx_m128 -> ()
Dialect Types
方言类型
与操作类似,方言可以定义类型系统的自定义扩展。
Similarly to operations, dialects may define custom extensions to the type system.
dialect-namespace ::= bare-id dialect-type ::= '!' (opaque-dialect-type | pretty-dialect-type) opaque-dialect-type ::= dialect-namespace dialect-type-body pretty-dialect-type ::= dialect-namespace '.' pretty-dialect-type-lead-ident dialect-type-body? pretty-dialect-type-lead-ident ::= '[A-Za-z][A-Za-z0-9._]*' dialect-type-body ::= '<' dialect-type-contents+ '>' dialect-type-contents ::= dialect-type-body | '(' dialect-type-contents+ ')' | '[' dialect-type-contents+ ']' | '{' dialect-type-contents+ '}' | '[^\[<({\]>)}\0]+'
方言类型通常以不透明的形式指定,其中类型的内容是在用方言名称空间和<>包裹的主体中定义的。考虑以下示例:
Dialect types are generally specified in an opaque form, where the contents of the type are defined within a body wrapped with the dialect namespace and <>. Consider the following examples:
// A tensorflow string type. !tf<string> // A type with complex components. !foo<something<abcd>> // An even more complex type. !foo<"a123^^^" + bar>
 
足够简单的方言类型可能使用更漂亮的格式,将部分语法展开为等效但重量较轻的形式:
Dialect types that are simple enough may use a prettier format, which unwraps part of the syntax into an equivalent, but lighter weight form:
// A tensorflow string type. !tf.string // A type with complex components. !foo.something<abcd>
See here to learn how to define dialect types.
Builtin Types
The builtin dialect defines a set of types that are directly usable by any other dialect in MLIR. These types cover a range from primitive integer and floating-point types, function types, and more.
内置类型
内置方言定义了一组类型,MLIR中的任何其他方言都可以直接使用这些类型。这些类型涵盖了一系列基本整数和浮点类型、函数类型等。
属性
语法:
Attributes
Syntax:
attribute-entry ::= (bare-id | string-literal) `=` attribute-value attribute-value ::= attribute-alias | dialect-attribute | builtin-attribute
Attributes are the mechanism for specifying constant data on operations in places where a variable is never allowed - e.g. the comparison predicate of a cmpi operation. Each operation has an attribute dictionary, which associates a set of attribute names to attribute values. MLIR’s builtin dialect provides a rich set of builtin attribute values out of the box (such as arrays, dictionaries, strings, etc.). Additionally, dialects can define their own dialect attribute values.
The top-level attribute dictionary attached to an operation has special semantics. The attribute entries are considered to be of two different kinds based on whether their dictionary key has a dialect prefix:
  • inherent attributes are inherent to the definition of an operation’s semantics. The operation itself is expected to verify the consistency of these attributes. An example is the predicate attribute of the arith.cmpi op. These attributes must have names that do not start with a dialect prefix.
  • discardable attributes have semantics defined externally to the operation itself, but must be compatible with the operations’s semantics. These attributes must have names that start with a dialect prefix. The dialect indicated by the dialect prefix is expected to verify these attributes. An example is the gpu.container_module attribute.
属性是一种机制,用于在从不允许使用变量的地方指定操作的常量数据,例如cmpi操作的比较谓词。每个操作都有一个属性字典,它将一组属性名称与属性值相关联。MLIR的内置方言提供了一组丰富的内置属性值(如数组、字典、字符串等)。此外,方言可以定义自己的方言属性值。
附加到操作的顶级属性字典具有特殊的语义。根据其字典关键字是否具有方言前缀,属性条目被认为是两种不同的类型:
•固有属性是操作语义定义的固有属性。操作本身应验证这些属性的一致性。一个例子是arith.cmpi操作的谓词属性。这些属性的名称必须不以方言前缀开头。
•可丢弃属性具有在操作本身外部定义的语义,但必须与操作的语义兼容。这些属性的名称必须以方言前缀开头。由方言前缀指示的方言有望验证这些属性。gpu.container_module属性就是一个例子。
请注意,属性值本身可以是字典属性,但只有附加到操作的顶级字典属性才受上述分类的约束。
Note that attribute values are allowed to themselves be dictionary attributes, but only the top-level dictionary attribute attached to the operation is subject to the classification above.
属性值别名
Attribute Value Aliases
attribute-alias-def ::= '#' alias-name '=' attribute-value attribute-alias ::= '#' alias-name
MLIR支持为属性值定义命名别名。属性别名是一种标识符,可以用来代替它定义的属性。这些别名必须在使用之前进行定义。别名不能包含“.”,因为这些名称是为方言属性保留的。
示例:
MLIR supports defining named aliases for attribute values. An attribute alias is an identifier that can be used in the place of the attribute that it defines. These aliases must be defined before their uses. Alias names may not contain a ‘.’, since those names are reserved for dialect attributes.
Example:
#map = affine_map<(d0) -> (d0 + 10)> // Using the original attribute. %b = affine.apply affine_map<(d0) -> (d0 + 10)> (%a) // Using the attribute alias. %b = affine.apply #map(%a)
Dialect Attribute Values
方言属性值
与操作类似,方言可以定义自定义属性值。
Similarly to operations, dialects may define custom attribute values.
dialect-namespace ::= bare-id dialect-attribute ::= '#' (opaque-dialect-attribute | pretty-dialect-attribute) opaque-dialect-attribute ::= dialect-namespace dialect-attribute-body pretty-dialect-attribute ::= dialect-namespace '.' pretty-dialect-attribute-lead-ident dialect-attribute-body? pretty-dialect-attribute-lead-ident ::= '[A-Za-z][A-Za-z0-9._]*' dialect-attribute-body ::= '<' dialect-attribute-contents+ '>' dialect-attribute-contents ::= dialect-attribute-body | '(' dialect-attribute-contents+ ')' | '[' dialect-attribute-contents+ ']' | '{' dialect-attribute-contents+ '}' | '[^\[<({\]>)}\0]+'
方言属性通常以不透明的形式指定,其中属性的内容是在用方言名称空间和<>包裹的主体中定义的。考虑以下示例:
Dialect attributes are generally specified in an opaque form, where the contents of the attribute are defined within a body wrapped with the dialect namespace and <>. Consider the following examples:
// A string attribute. #foo<string<"">> // A complex attribute. #foo<"a123^^^" + bar>
足够简单的方言属性可以使用更漂亮的格式,将部分语法展开为等效但权重较轻的形式:
Dialect attributes that are simple enough may use a prettier format, which unwraps part of the syntax into an equivalent, but lighter weight form:
// A string attribute. #foo.string<"">
See here on how to define dialect attribute values.
Builtin Attribute Values
The builtin dialect defines a set of attribute values that are directly usable by any other dialect in MLIR. These types cover a range from primitive integer and floating-point values, attribute dictionaries, dense multi-dimensional arrays, and more.
请参阅此处,了解如何定义方言属性值。
内置属性值
内置方言定义了一组属性值,MLIR中的任何其他方言都可以直接使用这些属性值。这些类型包括基元整数值和浮点值、属性字典、密集多维数组等。
IR版本控制
方言可以选择通过BytecodeDialectInterface来处理版本控制。很少有钩子暴露在方言中,以允许管理编码到字节码文件中的版本。该版本是延迟加载的,允许在解析输入IR时检索版本信息,并为存在版本的每个方言提供机会,以便通过upgradeFromVersion方法在解析后执行IR升级。自定义属性和类型编码也可以使用readAttribute和readType方法根据方言版本进行升级。
方言可以编码什么样的信息来对其版本进行建模,这一点没有限制。目前,版本控制只支持字节码格式。
IR Versioning
A dialect can opt-in to handle versioning through the BytecodeDialectInterface. Few hooks are exposed to the dialect to allow managing a version encoded into the bytecode file. The version is loaded lazily and allows to retrieve the version information while parsing the input IR, and gives an opportunity to each dialect for which a version is present to perform IR upgrades post-parsing through the upgradeFromVersion method. Custom Attribute and Type encodings can also be upgraded according to the dialect version using readAttribute and readType methods.
There is no restriction on what kind of information a dialect is allowed to encode to model its versioning. Currently, versioning is supported only for bytecode formats.
 
[MLIR] Dialect及Operation详解
https://zhuanlan.zhihu.com/p/582517107
1. MLIR简介
MLIR 全称是 Multi-Level Intermediate Representation (多级中间表示),是一种全新的编译器框架。
1.1 IR是什么
IR即 Intermediate Representation,可以看作是一种数据格式,作为从端到端转换中的中间表示。例如深度学习模型一般表示为计算图,能够表示计算图的数据结果就可以称为一种IR,例如ONNXTorchScriptTVM Relay等等。
添加图片注释,不超过 140 字(可选)
计算图(conputation graph)
· ONNX(Open Neural Network Exchange) : ONNX 协议首先由微软和Meta提出,它定义了一组和环境、平台均无关的标准格式(如算子功能)。在训练完成后可以将支持框架(Pytorch、Tensorflow等)的模型转化为 ONNX 文件进行存储,ONNX 文件不仅存储了神经网络模型的权重,也存储了模型的结构信息以及网络中每一层的输入输出等信息。
· TorchScrpit : PyTorch 最大的卖点是它对动态网络的支持,比其他需要构建静态网络的框架拥有更低的学习成本。但动态图模式在每次执行计算时都要重新构造计算图,非固定的网络结构给网络结构分析并进行优化带来了困难。TorchScript 就是为了解决这个问题而诞生的工具,包括代码的追踪及解析、中间表示的生成、模型优化、序列化等各种功能。
· Relay IR : 与 TVM 框架绑定,是一个函数式、可微的、静态的、针对机器学习的领域定制编程语言,解决了普通DL框架不支持 control flow 以及 dynamic shape 的特点,使用 lambda calculus 作为基准IR。
1.2 常见的IR表示系统
​添加图片注释,不超过 140 字(可选)
在上图表示中:
(1) 由于 C、C++ 源码直接转成 AST 时,并不会进行语言特定的优化,程序的优化主要集中于 LLVM IR 阶段。但 LLVM IR 表示层级较低,会丢失源码中的部分信息(如报错信息),会导致优化不充分
(2) 类似于Tensorflow、Keras等框架,会先转化为计算图Computation Graph形式,然后会基于图做一定的优化。但图阶段缺少硬件部署的相关信息,所以后续会转化为某个后端的内部表示,根据不同的硬件(TPU、Phone),进行算子融合等等优化。
...
可见,当前IR表示系统主要的问题
· 可复用性差:针对不同种类IR开发的Pass(优化)可能重复,但不同IR的同类Pass可能并不兼容。
· 不透明:前层IR所作的Pass优化在后层中不可见,可能导致优化重复。
· 变换开销大:转换过程中存在多种IR,这些不同类型的IR转换时开销很大。
1.3 MLIR的提出
Tensorflow 团队较早时采用了多种IR的部署,这样导致软件碎片化较为严重。
因此 Tensorflow 团队就提出了 MLIR,主要是为了统一各类IR格式,协调各类IR的转换,带来更高的优化效率

 添加图片注释,不超过 140 字(可选)

注: SSA是静态单赋值,后面会讲到
2. Dialect及Operation详解
参考:
2.1 Dialect
1. Dialect 是什么?
从源程序到目标程序,要经过一系列的抽象以及分析,通过 Lowering Pass 来实现从一个IR到另一个IR的转换。但IR之间的转换需要统一格式,统一IR的第一步就是要统一“语言”,各个IR原来配合不默契,谁也理解不了谁,就是因为“语言”不通。
因此 MLIR 提出了Dialect,各种IR可以转换为对应的 mlir Dialect,不仅方便了转换,而且还能随意扩展。不妨将dialect看成各种具有IR表达能力的黑盒子,之后的编译流程就是在各种dialect之间转化。

 

. dialect 是怎么工作的?
dialect 将所有的IR放在了同一个命名空间中,分别对每个IR定义对应的产生式并绑定相应的操作,从而生成一个MLIR的模型。
每种语言的 dialect(如tensorflow dialect、HLO dialect、LLVM IR dialect)都是继承自 mlir::Dialect,并注册了属性、操作和数据类型,也可以使用虚函数来改变一些通用性行为。
整个的编译过程:从源语言生成 AST(Abstract Syntax Tree,抽象语法树),借助 dialect 遍历 AST,产生 MLIR 表达式(此处可为多层IR通过 Lowering Pass 依次进行分析),最后经过 MLIR 分析器,生成目标硬件程序。

 3. dialect 内部构成

dialect主要是由自定义的 TypeAttributeInterface 以及 operation 构成。operation 细分为Attribute、Type、Constraint、Interface、Trait(属性、类型、限制、接口、特征)。同时存在 ODS 和 DRR 两个重要的模块,这两个模块都是基于 tableGen 模块,ODS 模块用于定义 operation ,DRR 模块用于实现两个 dialect 之间的 conversion

 2.2 Operation

Operation 是 Dialect 的重要组成部分,是抽象和计算的核心单元,可以看成是方言语义的基本元素。
下面例子可理解为:
生成的结果是 %t_tensor,xxx dialect,执行的是 transpose 操作,输入数据是 %tensor,能够将 tensor<2x3xf64> 的数据转换成tensor<3x2xf64> 的数据,该 transpose 的位置在 "example/file/path",第12行,第1个字符
 
结构拆分解析:
(1)%t_tensor:定义结果名称,SSA值,由%<t_tensor>构成,一般<t_tensor>是一个整数型数字。
IR 是 LLVM 的设计核心,它采用 SSA(Single-Static Assignments,静态单赋值)的形式,并具备两个重要特性: - 代码被组织成三地址指令 - 有无限的寄存器
(2)"xxx.transpose":操作的名称,应该是唯一的字符串,方言空间以.开头;指明为 xxx Dialect 的transpose 操作;.之前的内容是 Dialect 命名空间的名字,.后面是操作的名称。
(3)(%tensor):输入的操作数的列表,多个操作数之间用逗号隔开。
(4){inplace = true}:属性字典,定义一个名为inplace的布尔类型,其常量值为true
(5)(tensor<2x3xf64>) -> tensor<3x2xf64>:函数形式表示的操作类型,前者是输入,后者是输出。<2x3xf64>号中间的内容描述了张量的尺寸2x3和张量中存储的数据类型f64,中间使用x连接。
(6)loc("example/file/path":12:1):此操作的源代码中的位置。每个操作都有与之关联的强制性源位置,在 MLIR 中是核心要求,并且 API 依赖并操纵他。例如:如果一个转换将操作替换成另一个操作,必须在新的操作中附加一个位置,可以追踪该操作的来源。所以,在使用工具链 mlir-opt 中默认没有这个位置信息,添加 -mlir-print-debuginfo 标志指定要包含位置。
更一般的格式可见下图:

 3. 创建新的dialect(添加新的operation)

本节创建新的dialect包括 手动编写C++创建 以及 利用ODS框架生成
ODS 全称 Operation Definition Specification,操作者只需要根据 operation 框架定义的规范,在一个.td文件中填写相应的内容,使用 mlir 的 tableGen 工具就可以自动生成上面的 C++ 代码。 本节完全参考官方文档 :Chapter 2: Emitting Basic MLIR - MLIR (llvm.org)
本节将以xxx语言为例,演示构造 xxx Dialect并添加相应的Operation的流程。
xxx语言是为了验证及演示MLIR系统的整个流程而开发的一种基于Tensor的语言。 xxx 语言具有以下特性: - Mix of scalar and array computations, as well as I/O - Array shape Inference - Generic functions - Very limiter set of operators and features
3.1 定义 xxx Dialect
Dialect 将对 xxx 语言的结构进行建模,并为高级分析和转换提供方便的途径。
1. 使用 C++ 语言手动编写
// 下面是官方给出的xxx Dialect定义,默认位置为 ../mlir/examples/xxx/Ch2/include/xxx/Dialect.h
 

// 下面是官方给出的xxx Dialect定义,默认位置为 ../mlir/examples/xxx/Ch2/include/xxx/Dialect.h
class xxxDialect : public mlir::Dialect {
public:
explicit XxxDialect(mlir::MLIRContext *ctx);

/// Provide a utility accessor to the dialect namespace.
static llvm::StringRef getDialectNamespace() { return "xxx"; }

/// An initializer called from the constructor of xxxDialect that is used to
/// register attributes, operations, types, and more within the xxx dialect.
void initialize();
}; 

2. 使用 ODS 框架自动生成
在使用 ODS 定义操作的这些代码,都在Ops.td中,默认位置为 ../mlir/examples/xxx/Ch2/include/xxx/Ops.td
下面的代码块定义一个名字为 xxx 的 Dialect 在 ODS 框架中,使用let <...> = "..."/[{...}];方式依次明确 name、summary、description 和 cppNamespace(对应 Dialect 类所在的 C++ 命名空间)各个字段的定义。
 
defxxx_Dialect:Dialect{
  // The namespace of our dialect, this corresponds 1-1 with the string we
  // provided in `XxxDialect::getDialectNamespace`.
  letname="xxx";
 
  // A short one-line summary of our dialect.
  letsummary="A high-level dialect for analyzing and optimizing the "
                "xxx language";
 
  // A much longer description of our dialect.
  letdescription=[{
    Thexxxlanguageisatensor-basedlanguagethatallowsyoutodefine
    functions,performsomemathcomputation,andprintresults.Thisdialect
    providesarepresentationofthelanguagethatisamenabletoanalysisand
    optimization.
  }];
 
  // The C++ namespace that the dialect class definition resides in.
  letcppNamespace="xxx";
} 
然后在编译阶段,由框架自动生成相应的 C++ 代码。当然也可以运行下面的命令 直接得到生成的 C++ 代码。
 
下图中右侧是 ODS 中的定义,左侧是自动生成的 C++ 代码。

 3.2 加载到 MLIRContext 中

定义好 Dialect 之后,需要将其加载到 MLIRContext 中。默认情况下,MLIRContext 只加载内置的 Dialect,若要添加自定义的 Dialect,需要加载到 MLIRContext。
// 此处的代码与官方文档中的稍有不同,但实际意义相同。
// 在代码文件 xxxc.cpp 中,默认位置为 ../mlir/examples/xxx/Ch2/xxxc.cpp。
 
// 在代码文件 xxxc.cpp 中,默认位置为 ../mlir/examples/xxx/Ch2/xxxc.cpp
int dumpMLIR() {
...
  // Load our Dialect in this MLIR Context.
  context.getOrLoadDialect<mlir::xxx::XxxDialect>();
...
}
 
3.3 定义 operation
有了上述的 xxx Dialect,便可以定义操作(operation)。官方文档围绕 xxx xxx.ConstantOp 的定义介绍如何使用 C++ 的方式直接定义 operation。
 
1. 使用 C++ 语言手动编写
operation 类是继承于 CRTP 类,有一些可选的 traits 来定义行为。下面是 ConstantOp 的官方定义:
// `mlir::Op` is a CRTP class
 
// `mlir::Op` is a CRTP class
classConstantOp:publicmlir::Op<
                     ConstantOp,// The ConstantOp 
                     mlir::OpTrait::ZeroOperands,// takes zero input operands
                     mlir::OpTrait::OneResult,// returns a single result.
                     mlir::OpTraits::OneTypedResult<TensorType>::Impl>{
public:
  // Op inherit the constructors from the base Op class.
  usingOp::Op;
  // Return a unique name of the operation
  staticllvm::StringRefgetOperationName(){return"xxx.constant";}
  // Return a value by fetching it from the attribute
  mlir::DenseElementsAttrgetValue();
  // Operations may provide additional verification beyond what the attached traits provide.
  LogicalResultverifyInvariants();
 
  // Provide an interface to build this operation from a set of input values.
  // mlir::OpBuilder::create<ConstantOp>(...)
  // Build a constant with the given return type and `value` attribute.
  staticvoidbuild(mlir::OpBuilder&builder,mlir::OperationState&state,
                    mlir::Typeresult,mlir::DenseElementsAttrvalue);
  // Build a constant and reuse the type from the given 'value'.
  staticvoidbuild(mlir::OpBuilder&builder,mlir::OperationState&state,
                    mlir::DenseElementsAttrvalue);
  // Build a constant by broadcasting the given 'value'.
  staticvoidbuild(mlir::OpBuilder&builder,mlir::OperationState&state,
                    doublevalue);
}; 
定义好 operation 的行为后,我们可以在 xxx Dialect 的 initialize 函数中注册(register),之后才可以正常在 xxx Dialect 中使用 ConstantOp。
//
void XxxDialect::initialize() {
addOperations<ConstantOp>();
}
2. 使用 ODS 框架自动生成
首先在 ODS 中定义一个继承自 Op 类的基类 xxx_Op
Operation 和 Op的区别 Operation:用于对所有操作的建模,并提供通用接口给操作的实例。 Op:每种特定的操作都是由 Op 类继承来的。同时它还是 Operation * 的 wrapper,这就意味着,当我们定义一个 Dialect 的 Operation 的时候,我们实际上是在提供一个 Operation 类的接口。 Op 类的定义在 OpBased.td 文件中,默认位置为 ../mlir/include/mlir/IR/OpBased.td。 下面的代码都在Ops.td中,默认位置为 ../mlir/examples/xxx/Ch2/include/xxx/Ops.td
 
// xxx_Dialect : 父类 Dialect 操作
// mnemonic : 注记符号,一般是一个字符串型的单词,代表了该操作的含义
// traits : 该操作的一些特征,放在一个列表中
其次以声明的方式定义相应操作:
 arguments和results:定义参数和结果,参数可以是SSA操作数的属性或类型。
通过为参数或结果提供名称,ODS将自动的生成匹配的访问器。
arguments一般模板(results同理):
let arguments = (ins <data_type><data_attribute>:$<variable_name>);
- ins: 输入 (results中该参数为 outs)
- <data_type>: 数据类型
- <data_structure>: 数据属性
- ElementsAttr: 稠元(dense element)
- <variable_name>: 变量名
 // 自定义程序的组装格式,使最终输出的 IR 格式更精简、易读
let parser = [{ return ::parseConstantOp(parser, result); }];
let printer = [{ return ::print(p, *this); }];
 
def ConstantOp : xxx_Op<"constant", [NoSideEffect]> {
  // "constant"就是注记符号,[NoSideEffect]说明了该操作的一个特点
  // Provide a summary and description for this operation. 
  let summary = "constant";
  let description = [{
    Constant operation turns a literal into an SSA value. The data is attached
    to the operation as an attribute. For example:
    ```mlir
      %0 = xxx.constant dense<[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]>
                        : tensor<2x3xf64>
    ```
  }];
 
  /*
  argumentsresults:定义参数和结果,参数可以是SSA操作数的属性或类型。
  通过为参数或结果提供名称,ODS将自动的生成匹配的访问器。
  arguments一般模板(results同理): 
  let arguments = (ins <data_type><data_attribute>:$<variable_name>);
  - ins: 输入 (results中该参数为 outs)
  - <data_type>: 数据类型
  - <data_structure>: 数据属性
  - ElementsAttr: 稠元(dense element)
  - <variable_name>: 变量名
  */
  // The constant operation takes an attribute as the only input.
  // `F64ElementsAttr` corresponds to a 64-bit floating-point ElementsAttr.
  let arguments = (ins F64ElementsAttr:$value);
  // The constant operation returns a single value of TensorType.
  let results = (outs F64Tensor);
 
  // Divert the printer and parser to `parse` and `print` methods on our operation.
  let hasCustomAssemblyFormat = 1;
  /*
  // 自定义程序的组装格式,使最终输出的 IR 格式更精简、易读
  let parser = [{ return ::parseConstantOp(parser, result); }];
  let printer = [{ return ::print(p, *this); }];
  */
 
  // ODS 可以自动生成一些简单的构建方法,用户也可自定义添加一些构造方法
  let builders = [
    // Build a constant with a given constant tensor value.
    OpBuilderDAG<(ins "DenseElementsAttr":$value), [{
      build($_builder, $_state, value.getType(), value);
    }]>,
    // Build a constant with a given constant floating-point value.
    OpBuilderDAG<(ins "double":$value)>
  ];
 
  // Add additional verification logic to the constant operation.
  // will generate a `::mlir::LogicalResult verify()`
  let hasVerifier = 1;
}
然后在编译阶段,由框架自动生成相应的 C++ 代码。当然也可以运行下面的命令 直接得到生成的 C++ 代码。
下图中右侧是 ODS 中的定义,左侧是自动生成的 C++ 代码。

 3.4 创建流程总结(使用ODS)

整个 tableGen 模块是基于 ODS (Operation Definition Specification)框架进行编写以及发挥作用。tableGen 模块促进了自动化生成,减少了 operation 的手动开发,并且避免了冗余开发。
我们以添加 xxx Dialect为例,总结添加流程如下:
Ops.td文件默认位置为 ../mlir/examples/xxx/Ch2/include/xxx/Ops.td

① (在Ops.td中) 定义一个和 xxx Dialect 的链接
def xxx_Dialect : Dialect {
let name = "xxx";
...
let cppNamespace = "xxx";
}
② (在Ops.td中) 创建 xxx Dialect Operation 基类
class xxx_Op<string mnemonic, list<OpTrait> traits = []> :
Op<xxx_Dialect, mnemonic, traits>;
③ (在Ops.td中) 创建 xxx Dialect 中各种 Operation
def ConstantOp : xxx_Op<"constant", [NoSideEffect]> {
let summary = "constant";
let arguments = (ins F64ElementsAttr:$value);
let results = (outs F64Tensor);
let builders = [
OpBulider<"Builder *b, OperationState &state, Value input">
];
let verifier = [{ return ::verify(*this); }];
}
④ 通过 mlir-tblgen 工具生成 C++ 文件
使用 mlir-tblgen -gen-dialect-decls 命令生成对应的 Dialect.h.inc 文件。
使用 mlir-tblgen -gen-op-defs 命令生成对应的 Ops.h.inc 文件。

 使用 #include 直接引用生成文件

#include"xxx/Dialect.h.inc"
#include"xxx/Ops.h.inc"
 
 
参考文献链接
https://zhuanlan.zhihu.com/p/582517107