Class ECompiler
Current limitations:
- The compilation of IEUntranslatedInstruction is not supported.
- Wildcard types are supported in << ... >> brackets. Leave the type info empty (<<>>) for EImm to specify that they should be mutable but not carry any type information (this is different from <<?>>, which specifies a wildcard type of 1 slot with no information).
Rules and syntax:
- The syntax is similar to the IR formatting, except that bitsizes must be present and etypes must be absent (to be supported at a later time), e.g.
s32:var1 = i32:01h
- Unless a wanted size is specified, statements are auto-assigned a size of 1
- Optional wanted size for statements: "/SIZE: ...", e.g.
/2: nop
will create an ENop of size 2 instead of the default size 1
- Optional wanted native address (hex) for statements: "ADDRESS: ...", e.g.
1A00: nop
will create an ENop statement whose mapping to a hypothetical native address is set to 0x1A00
- Literal labels for EJump are supported, e.g. @label1 ... goto @label1
- Spaces between tokens and parentheses around expressions are mostly mandatory; to avoid compilation errors, it is better to use them systematically
- Comments: // or ;
- EVar are created automatically upon first encounter unless they already exist in the routine-context (checked first), or the global-context (checked second)
- The class of EVar used for creation depends on its name prefix; by default, special locals (negative range [2, 0x10000[) are used
- Standardized prefixes:
global-context
  rID -> physical register EVar if possible
  RID -> virtual register EVar if possible
  gADDR -> memory-mapped global EVar if possible
  ptr_gAAA -> global symbol EVar if possible
routine-context
  vID -> virtual routine-context EVar if possible (similar to global-context's R..)
  varADDR -> memory-mapped local stack EVar variable (negative stack offset rel. to SP0)
  parADDR -> memory-mapped local stack EVar variable (positive or null stack offset rel. to SP0)
  ptr_varADDR -> pointer (reference) to a local memory-mapped stack variable
  ptr_parADDR -> pointer (reference) to a local memory-mapped stack parameter
  $r.. -> copy of var
  $r..$N -> additional copy of var (N>=1)
  $r.._r.. -> copy of var pair
  $r.._r..$N -> additional copy of var pair (N>=1)
  $r..loX -> copy of var, truncated (LSB part)
  $r..hiX -> copy of var, truncated (MSB part)
  $r..loX$N -> additional copy of var, truncated (LSB part)
  $r..hiX$N -> additional copy of var, truncated (MSB part)
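The optional "/SIZE:" and "ADDRESS:" statement prefixes described above can be illustrated with a small standalone parser. This is a minimal sketch, not the ECompiler implementation: the class StatementPrefixSketch and its Parsed type are hypothetical names, the size is assumed to be written in decimal, and the sketch assumes a line carries at most one of the two prefixes.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of ECompiler: splits the optional prefix off a statement line.
class StatementPrefixSketch {

    /** Prefix information plus the remaining statement text. */
    static final class Parsed {
        final int size;          // wanted size; 1 when no "/SIZE:" prefix is given
        final Long address;      // wanted native address; null when no "ADDRESS:" prefix is given
        final String statement;  // statement text with the prefix removed

        Parsed(int size, Long address, String statement) {
            this.size = size;
            this.address = address;
            this.statement = statement;
        }
    }

    // Matches "/2: nop" (assumption: the size is decimal)
    private static final Pattern SIZE = Pattern.compile("^/(\\d+):\\s+(.*)$");
    // Matches "1A00: nop" (the address is hex, written without a 0x prefix)
    private static final Pattern ADDRESS = Pattern.compile("^([0-9A-Fa-f]+):\\s+(.*)$");

    static Parsed parse(String line) {
        String s = line.trim();
        Matcher m = SIZE.matcher(s);
        if (m.matches()) {
            return new Parsed(Integer.parseInt(m.group(1)), null, m.group(2));
        }
        m = ADDRESS.matcher(s);
        if (m.matches()) {
            return new Parsed(1, Long.parseLong(m.group(1), 16), m.group(2));
        }
        // No prefix: default size 1, no native address mapping.
        return new Parsed(1, null, s);
    }
}
```

Both patterns require whitespace after the colon, so a typed operand such as s32:var1 is not mistaken for an address prefix. This relies on the rule that tokens are separated by spaces.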
Specific rules for expression and statement compilation:
- PC-assigns can receive additional information, to be provided as end-of-line tags enclosed in brackets:
  - [BRANCH] -> means the PC-assign should be generated as if it came from a normal branching instruction
  - [SUB] -> means the PC-assign should be generated as if it came from a call-to-sub instruction
  - [BRANCH_HINTS:offsets] -> provides pseudo-native target hints for the branching instructions; offsets must be a comma-separated list of pseudo-native offsets (not IR offsets)
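The end-of-line tags listed above can be illustrated with a standalone sketch that strips them from a statement line. This is not how ECompiler parses them: the class PcAssignTagsSketch and its Tagged type are hypothetical, the sketch assumes several tags may follow one another, and the hint offsets are kept as strings because their numeric base is not stated here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of ECompiler: strips trailing PC-assign tags from a line.
class PcAssignTagsSketch {

    /** Statement text plus the tags found at the end of the line. */
    static final class Tagged {
        String statement;
        boolean branch;                                        // [BRANCH] was present
        boolean sub;                                           // [SUB] was present
        final List<String> branchHints = new ArrayList<>();    // offsets from [BRANCH_HINTS:...]
    }

    // Captures: 1 = text before the last tag, 2 = tag body, 3 = hint offsets (if any)
    private static final Pattern TRAILING_TAG = Pattern.compile(
            "^(.*?)\\s*\\[(BRANCH|SUB|BRANCH_HINTS:([^\\]]*))\\]\\s*$");

    static Tagged parse(String line) {
        Tagged t = new Tagged();
        String s = line.trim();
        Matcher m = TRAILING_TAG.matcher(s);
        // Peel tags off the end of the line, one at a time.
        while (m.matches()) {
            String tag = m.group(2);
            if (tag.equals("BRANCH")) {
                t.branch = true;
            } else if (tag.equals("SUB")) {
                t.sub = true;
            } else {
                // Tags are peeled right to left, so prepend to keep source order.
                List<String> hints = new ArrayList<>();
                for (String off : m.group(3).split(",")) {
                    if (!off.trim().isEmpty()) {
                        hints.add(off.trim());
                    }
                }
                t.branchHints.addAll(0, hints);
            }
            s = m.group(1);
            m = TRAILING_TAG.matcher(s);
        }
        t.statement = s;
        return t;
    }
}
```

In this sketch the statement text itself is treated as opaque; only the three tag names given above are recognized, so other bracketed text at the end of a line is left in place.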
Specific rules for CFG compilation:
- N/A
Specific rules for routine compilation:
- routines may or may not be enclosed in PROC/ENDP
Specific rules for program compilation:
- routines must be enclosed in PROC/ENDP. The wanted name, wanted pseudo start address (native), and IR prototype are all optional:
PROC Name @NativeAddress :Prototype ... ... ENDP
- data elements: see below.
Defining references: simulate dynamically resolved references to routines and data imported into the module, but physically located in an external component.
IMPORT CODE MethodName [:OptionalPrototype]
IMPORT DATA FieldName [:OptionalType]
Defining data elements (native memory):
- syntax for raw bytes (does not create a variable object, just initializes memory):
  DB/DW/DD/DQ/DS @Address Value
  B, W, D, Q = BYTE, WORD, DWORD, QWORD (1, 2, 4, 8 bytes), hex or decimal value
  endianness for memory-encoding matches the processor's (referenced in the native context, held by the global context provided to the compiler)
  DB can also be used for byte sequences: Value is an arbitrarily-long hex-encoded byte sequence or an escaped string - no zero terminator is appended
  examples:
  DB @100 0x11
  DW @100 0x1122
  DD @100 0x11223344
  DQ @100 0x1122334455667788
  DB @100 '11aabb660099ff414141141' <--- hex-encoded string (note the single-quotes, vs double-quotes for strings)
  DB @100 "Hello World!" <--- encode to ASCII
  [NOT SUPPORTED YET] DB @100 U"Hello World!" <--- encode to UTF8
  [NOT SUPPORTED YET] DB @100 L"Hello World!" <--- encode to UTF16LE (note little-endian)
- syntax for regular data items:
  DV Name @Address :TypeName [OptionalValue]
  where Value is an optional hex-encoded string whose length must be less than or equal to the size of the variable Type
- syntax for string (ascii-encoded, 0-terminated) data items:
  DS Name @Address "Hello"
  the zero terminator is added implicitly, so the above string would translate to 6 bytes, not 5
- syntax for imported references:
  DR Name @Address &ImportName
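The byte layouts implied by the data directives above can be worked through with a short standalone sketch. It is not the compiler's own encoder: the class DataDirectiveSketch and its method names are hypothetical, and the byte order is passed in explicitly, whereas the compiler takes it from the processor referenced by the global context.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of ECompiler: shows the bytes a data directive would describe.
class DataDirectiveSketch {

    /** Encode an integer value for DB/DW/DD/DQ; kind is 'B', 'W', 'D' or 'Q'. */
    static byte[] encodeInt(char kind, long value, ByteOrder order) {
        int width;
        switch (kind) {
            case 'B': width = 1; break;
            case 'W': width = 2; break;
            case 'D': width = 4; break;
            case 'Q': width = 8; break;
            default: throw new IllegalArgumentException("unknown kind: " + kind);
        }
        // Encode as 8 bytes, then keep only the 'width' least significant bytes.
        byte[] all = ByteBuffer.allocate(8).order(order).putLong(value).array();
        byte[] out = new byte[width];
        // Little-endian stores the low bytes first; big-endian stores them last.
        int from = (order == ByteOrder.LITTLE_ENDIAN) ? 0 : 8 - width;
        System.arraycopy(all, from, out, 0, width);
        return out;
    }

    /** DB with a double-quoted string: ASCII bytes, no zero terminator appended. */
    static byte[] encodeDbString(String s) {
        return s.getBytes(StandardCharsets.US_ASCII);
    }

    /** DS data item: ASCII bytes followed by an implicit zero terminator. */
    static byte[] encodeDs(String s) {
        byte[] raw = s.getBytes(StandardCharsets.US_ASCII);
        byte[] out = new byte[raw.length + 1]; // the last byte stays 0
        System.arraycopy(raw, 0, out, 0, raw.length);
        return out;
    }
}
```

With this sketch, encodeInt('W', 0x1122, ByteOrder.LITTLE_ENDIAN) yields the bytes 22 11, while the big-endian order yields 11 22. encodeDs("Hello") yields 6 bytes, matching the note above, and encodeDbString("Hello World!") yields 12 bytes with no terminator.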
-
Nested Class Summary
Nested Classes (Modifier and Type: Description)
static class: A compiled expression.
static class: A compiled field.
static class: A compiled program.
static class: A compiled routine.
static class: A compiled statement.
-
Constructor Summary
Constructors
-
Method Summary
Modifier and Type, Method: Description
static IEGeneric cc(String s, IEGlobalContext gctx): Convenience method to parse an IR expression or statement.
static <T extends IEGeneric> T cc(String s, IEGlobalContext gctx, Class<T> clazz): Convenience method to parse an IR expression or statement.
compileCfg(IERoutineContext ctx, String... slist): Compile a sequence of statements and return the CFG.
compileCfg(String... slist): Compile a sequence of statements and return the CFG.
compileExpression(IERoutineContext ctx, String s): Compile a non-statement expression.
compileExpression(String s): Compile a non-statement expression.
compileProgram(File file): Compile an IR program made of 1 or more routines.
compileProgram(String... slist): Compile an IR program made of 1 or more routines.
compileProgram(List<String> slist): Compile an IR program made of 1 or more routines.
compileRoutine(String... slist): Compile an IR routine.
compileStatement(IERoutineContext ctx, String s): Compile a single statement.
compileStatement(String s): Compile a single statement.
void reset(): Reset this compiler's state.
-
Constructor Details
-
ECompiler
Create an IR compiler.
Parameters:
gctx - global IR context; one can be provided by IEConverter.getGlobalContext() or, if no converter is available, can be created ad hoc
-
-
Method Details
-
cc
Convenience method to parse an IR expression or statement.
Parameters:
s
gctx
-
cc
Convenience method to parse an IR expression or statement.
Type Parameters:
T
Parameters:
s
gctx
clazz
-
reset
public void reset()
Reset this compiler's state. Note that the global IR context (IEGlobalContext) is not reset.
-
compileExpression
Compile a non-statement expression.
Parameters:
s - pure expression string (not a statement)
Returns:
the compiled expression
-
compileExpression
Compile a non-statement expression.
Parameters:
ctx - optional routine context to be used; if null, a fresh context will be created
s - pure expression string (not a statement)
Returns:
the compiled expression
-
compileStatement
Compile a single statement.
Parameters:
s - statement string
Returns:
the compiled statement
-
compileStatement
Compile a single statement.
Parameters:
ctx - optional routine context to be used; if null, a fresh context will be created
s - statement string
Returns:
the compiled statement
-
compileCfg
Compile a sequence of statements and return the CFG.
Parameters:
slist - statement list
Returns:
IR CFG
-
compileCfg
Compile a sequence of statements and return the CFG.
Parameters:
ctx - optional routine context to be used; if null, a fresh context will be created
slist - statement list
Returns:
IR CFG
-
compileRoutine
Compile an IR routine.
Parameters:
slist - routine source
Returns:
the compiled routine
-
compileProgram
Compile an IR program made of 1 or more routines.
Parameters:
file - UTF8 encoded source file
Returns:
the compiled program
Throws:
IOException
-
compileProgram
Compile an IR program made of 1 or more routines.
Parameters:
slist - program source strings
Returns:
the compiled program
-
compileProgram
Compile an IR program made of 1 or more routines.
Parameters:
slist - program source strings
Returns:
the compiled program
-