/*---------------------------------------------------------------*/
/*--- High-level IR description                               ---*/
/*---------------------------------------------------------------*/

/* Vex IR is an architecture-neutral intermediate representation.
Unlike some IRs in systems similar to Vex, it is not like assembly language (ie. a list of instructions). Rather, it is more like the IR that might be used in a compiler.

Note: compared with assembly language, VEX IR is closer to a compiler's intermediate language.

Code blocks
~~~~~~~~~~~
The code is broken into small code blocks ("superblocks", type: 'IRSB'). Each code block typically represents from 1 to perhaps 50 instructions. IRSBs are single-entry, multiple-exit code blocks (Note: this is comparable to the Trace level in Intel Pin). Each IRSB contains three things:

- a type environment, which indicates the type of each temporary
  value present in the IRSB

  [Example] (*ir_block).tyenv
    types
      [0]  Ity_I32
      [1]  Ity_I32
    types_size  0x00000008
    types_used  0x00000002

  Note: types_used indicates how many temporaries are in use; the types array holds the type of each temporary.

- a list of statements, which represent code

  [Example]
    stmts_size  0x00000003  int
    stmts_used  0x00000003  int
    (*ir_block).stmts[0]  tag  Ist_IMark
    (*ir_block).stmts[1]  tag  Ist_WrTmp
    (*ir_block).stmts[2]  tag  Ist_Put

  Note: statements are likewise stored in the stmts array; stmts_used is the number of statements actually in use.

- a jump that exits from the end of the IRSB

  [Example] jumpkind  Ijk_Boring

The final printed result is as follows:

0x77D699A0:  movl %esi,%esp

IRSB {
   t0:I32   t1:I32                         [2 temporaries]

   ------ IMark(0x77D699A0, 2, 0) ------   [3 statements, counting the IMark but not the last line, which updates the IP register and is generated automatically]
   t0 = GET:I32(32)                        [the whole line is one statement; GET:I32(32) is an expression]
   PUT(24) = t0
   PUT(68) = 0x77D699A2:I32; exit-Boring
}

The second statement can be decomposed further:

(*ir_block).stmts[1]
   tag      Ist_WrTmp
   .tmp     0
   .tag     Iex_Get
   .offset  32
   .ty      Ity_I32

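The three parts of an IRSB and the statement decomposition shown above can be sketched with a small stand-alone model. This is only an illustration: every Toy*/toy_* name and the struct layouts are assumptions made for this note, not the declarations in libvex_ir.h. The values are the ones from the dump for movl %esi,%esp.

```c
#include <assert.h>

/* Toy stand-ins for libvex types. Names and layouts are assumptions for
   illustration only; they are not the declarations in libvex_ir.h. */
typedef enum { TOY_Ity_I32 } ToyType;
typedef enum { TOY_Iex_Get, TOY_Iex_RdTmp } ToyExprTag;
typedef enum { TOY_Ist_IMark, TOY_Ist_WrTmp, TOY_Ist_Put } ToyStmtTag;
typedef enum { TOY_Ijk_Boring } ToyJumpKind;

typedef struct {
    ToyExprTag tag;
    int        offset;   /* Get: guest state offset */
    int        tmp;      /* RdTmp: index of the temporary read */
    ToyType    ty;
} ToyExpr;

typedef struct {
    ToyStmtTag tag;
    int        tmp;      /* WrTmp: destination temporary */
    int        offset;   /* Put: destination guest state offset */
    ToyExpr    data;     /* WrTmp / Put: the value being written */
} ToyStmt;

typedef struct {
    ToyType     types[8];    /* 1. type environment */
    int         types_used;  /*    temporaries actually in use */
    ToyStmt     stmts[3];    /* 2. list of statements */
    int         stmts_used;  /*    statements actually in use */
    unsigned    next;        /* 3. target of the final jump ... */
    ToyJumpKind jumpkind;    /*    ... and the kind of jump */
} ToyIRSB;

/* Build a model of the block printed above (movl %esi,%esp). */
ToyIRSB toy_example_block(void)
{
    ToyIRSB sb = {0};

    sb.types[0] = TOY_Ity_I32;
    sb.types[1] = TOY_Ity_I32;
    sb.types_used = 2;

    sb.stmts[0].tag = TOY_Ist_IMark;

    sb.stmts[1].tag = TOY_Ist_WrTmp;        /* t0 = GET:I32(32) */
    sb.stmts[1].tmp = 0;
    sb.stmts[1].data.tag = TOY_Iex_Get;
    sb.stmts[1].data.offset = 32;
    sb.stmts[1].data.ty = TOY_Ity_I32;

    sb.stmts[2].tag = TOY_Ist_Put;          /* PUT(24) = t0 */
    sb.stmts[2].offset = 24;
    sb.stmts[2].data.tag = TOY_Iex_RdTmp;
    sb.stmts[2].data.tmp = 0;
    sb.stmts[2].data.ty = TOY_Ity_I32;
    sb.stmts_used = 3;

    sb.next = 0x77D699A2u;                  /* address after the 2-byte movl */
    sb.jumpkind = TOY_Ijk_Boring;
    return sb;
}

/* Count the statements with a given tag among those actually in use. */
int toy_count_tag(const ToyIRSB *sb, ToyStmtTag tag)
{
    int i, n = 0;
    assert(sb != 0 && sb->stmts_used <= 3);
    for (i = 0; i < sb->stmts_used; i++)
        if (sb->stmts[i].tag == tag)
            n++;
    return n;
}
```

Calling toy_example_block() and reading stmts[1].data gives the same fields as the decomposition above (tag Iex_Get, offset 32, type I32). Note that the final jump is kept outside the statement array, which matches the observation that the last printed line is not counted in stmts_used.
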
Because the blocks are multiple-exit, there can be additional conditional exit statements that cause control to leave the IRSB before the final exit. Also because of this, IRSBs can cover multiple non-consecutive sequences of code (up to 3). These are recorded in the type VexGuestExtents (see libvex.h).

Statements and expressions
~~~~~~~~~~~~~~~~~~~~~~~~~~
Statements (type 'IRStmt') represent operations with side-effects, eg. guest register writes, stores, and assignments to temporaries. Expressions (type 'IRExpr') represent operations without side-effects, eg. arithmetic operations, loads, constants. Expressions can contain sub-expressions, forming expression trees, eg. (3 + (4 * load(addr1))).

Note: statements can have side effects, whereas expressions are pure and have none.
Note: ST denotes a transfer of data from a register to memory; LD denotes a transfer of data from memory to a register.

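To make the idea of a side-effect-free expression tree concrete, here is a minimal sketch that builds and evaluates (3 + (4 * load(addr1))). The ToyExpr type, the toy_* functions and the use of a plain array as "memory" are all assumptions for illustration; the real IRExpr type in libvex is richer and is not evaluated this way.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical expression node; not the libvex IRExpr type. */
typedef enum { TOY_Const, TOY_Add, TOY_Mul, TOY_Load } ToyExprTag;

typedef struct ToyExpr {
    ToyExprTag            tag;
    long                  value; /* TOY_Const: the constant */
    const struct ToyExpr *arg1;  /* Add/Mul: left operand; Load: address */
    const struct ToyExpr *arg2;  /* Add/Mul: right operand */
} ToyExpr;

/* Evaluate an expression tree. Memory is only read (const), never
   written, so evaluation has no side effects. */
long toy_eval(const ToyExpr *e, const long *mem, size_t mem_len)
{
    assert(e != NULL);
    switch (e->tag) {
    case TOY_Const:
        return e->value;
    case TOY_Add:
        return toy_eval(e->arg1, mem, mem_len) + toy_eval(e->arg2, mem, mem_len);
    case TOY_Mul:
        return toy_eval(e->arg1, mem, mem_len) * toy_eval(e->arg2, mem, mem_len);
    case TOY_Load: {
        long addr = toy_eval(e->arg1, mem, mem_len);
        assert(addr >= 0 && (size_t)addr < mem_len);
        return mem[addr];
    }
    }
    assert(0 && "unknown expression tag");
    return 0;
}

/* Build (3 + (4 * load(addr1))) and evaluate it against mem. */
long toy_eval_example(const long *mem, size_t mem_len, long addr1)
{
    ToyExpr three = { TOY_Const, 3,     NULL,   NULL  };
    ToyExpr four  = { TOY_Const, 4,     NULL,   NULL  };
    ToyExpr addr  = { TOY_Const, addr1, NULL,   NULL  };
    ToyExpr load  = { TOY_Load,  0,     &addr,  NULL  };
    ToyExpr mul   = { TOY_Mul,   0,     &four,  &load };
    ToyExpr add   = { TOY_Add,   0,     &three, &mul  };
    return toy_eval(&add, mem, mem_len);
}
```

A store, by contrast, would need a non-const memory argument, which is why it belongs on the statement side of the split.
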
Expression types:

typedef enum {
   Iex_Binder=0x15000,
   Iex_Get,
   Iex_GetI,
   Iex_RdTmp,
   Iex_Qop,
   Iex_Triop,
   Iex_Binop,
   Iex_Unop,
   Iex_Load,
   Iex_Const,
   Iex_Mux0X,
   Iex_CCall
} IRExprTag;

Statement types:

typedef enum {
   Ist_NoOp=0x19000,
   Ist_IMark,     /* META */
   Ist_AbiHint,   /* META */
   Ist_Put,
   Ist_PutI,
   Ist_WrTmp,
   Ist_Store,
   Ist_CAS,
   Ist_LLSC,
   Ist_Dirty,
   Ist_MBE,       /* META (maybe) */
   Ist_Exit
} IRStmtTag;
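
As a small illustration of how these tags might be used, the sketch below copies the IRStmtTag listing and adds two helpers. The helper names stmt_tag_name and stmt_is_meta are hypothetical and are not part of libvex. The META classification follows only the comments in the listing, and leaving out Ist_MBE (marked "maybe") is an assumption.

```c
#include <assert.h>

/* Copied from the listing above so the sketch compiles on its own. */
typedef enum {
   Ist_NoOp=0x19000,
   Ist_IMark,     /* META */
   Ist_AbiHint,   /* META */
   Ist_Put,
   Ist_PutI,
   Ist_WrTmp,
   Ist_Store,
   Ist_CAS,
   Ist_LLSC,
   Ist_Dirty,
   Ist_MBE,       /* META (maybe) */
   Ist_Exit
} IRStmtTag;

/* Hypothetical helper: map a tag to its name, e.g. for debug dumps. */
const char *stmt_tag_name(IRStmtTag tag)
{
    static const char *const names[] = {
        "Ist_NoOp", "Ist_IMark", "Ist_AbiHint", "Ist_Put", "Ist_PutI",
        "Ist_WrTmp", "Ist_Store", "Ist_CAS", "Ist_LLSC", "Ist_Dirty",
        "Ist_MBE", "Ist_Exit"
    };
    assert(tag >= Ist_NoOp && tag <= Ist_Exit);
    /* Tags are consecutive, so the distance from Ist_NoOp is the index. */
    return names[tag - Ist_NoOp];
}

/* Hypothetical helper: 1 for the tags commented as META in the listing.
   Ist_MBE is only marked "maybe", so it is left out here (assumption). */
int stmt_is_meta(IRStmtTag tag)
{
    return tag == Ist_IMark || tag == Ist_AbiHint;
}
```

Because the enum starts at 0x19000 and the remaining tags take consecutive values, Ist_Exit ends up as 0x1900B, and a tag value that falls outside this range is easy to spot in a debugger.
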