isa - input file format for the isa utility
The isa utility is used to generate instruction stream encoders and decoders
from a textual description of a machine instruction set. This manual page
documents the form of the textual description accepted by the isa utility.
A machine instruction is composed of one or more tokens, each kind of token
having a defined width. Simple RISC-like instruction sets have instructions
that use 1 or 2 tokens, typically an instruction word and an optional
immediate field. More complex CISC instruction sets may use many more kinds
of tokens.
Each token is made up of fields. For example, an instruction token could be
made up of an opcode field, additional fields naming registers, fields
containing flags, immediate values and so on. Fields may be named using the
let directive, or can be unnamed. The bitslice operator ([]) can be used to
denote specific portions of a token.
Non-overlapping fields are grouped together into fragments. Fragments may be
composed using the “&” operator. The textual form of a fragment may be
specified using the names directive.
A set of fragments that fully specifies each bit in a token is said to be
‘complete’. Only complete fragment sets can be emitted.
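The notions of non-overlapping fields and completeness can be illustrated with a small Python sketch. It is not part of the isa utility: the helper names slice_mask, is_disjoint and is_complete are hypothetical, and a fragment set is modelled simply as a list of (highbit, lowbit) pairs within a single token.

```python
def slice_mask(high, low):
    """Return a mask covering bits high..low (inclusive, zero-based)."""
    if high < low or low < 0:
        raise ValueError("expected high >= low >= 0")
    return ((1 << (high - low + 1)) - 1) << low


def is_disjoint(fields):
    """True if no two (high, low) fields share a bit."""
    seen = 0
    for high, low in fields:
        mask = slice_mask(high, low)
        if seen & mask:
            return False
        seen |= mask
    return True


def is_complete(fields, width):
    """True if the fields are disjoint and cover every bit of a token
    that is `width` bits wide."""
    covered = 0
    for high, low in fields:
        covered |= slice_mask(high, low)
    return is_disjoint(fields) and covered == (1 << width) - 1


# A 16-bit token split into an 8-bit opcode and two 4-bit register fields.
fields = [(15, 8), (7, 4), (3, 0)]
print(is_complete(fields, 16))      # True
print(is_complete(fields[:2], 16))  # False: bits 3..0 are unspecified
```

In this model, only the first set of fields would qualify for emission, since the second leaves four bits of the token undefined.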
The semicolon “;” introduces a comment. All text from the semicolon to the
end of the line is ignored.
The language uses indentation to specify scope (i.e., it uses the offside
rule), as in the Python and Haskell programming languages.
-
-
- Composing Fragments
- The “&” operator is used to join fragments, forming a larger fragment. For
example, to specify a fragment that is composed of two previously named
fragments Rtop and Rbottom, write Rtop & Rbottom.
-
-
- Generators
- A generator expression has the form [expr1 | expr2...] and denotes a
sequence of values expr1, where the additional expressions expr2 serve to
define the range of values generated. Any “%”-escapes in expr1 are expanded.
For example,
[R%n | n = 0..31]
generates the sequence R0, R1, ... , R31.
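The expansion performed by a generator such as [R%n | n = 0..31] can be sketched in Python as follows. This is an illustration only: the function name expand is hypothetical, and it assumes that a “%”-escape is replaced textually by the decimal value of the named variable, which the manual does not spell out.

```python
def expand(template, name, low, high):
    """Expand each '%<name>' escape in `template` for every value in the
    inclusive range low..high, returning the resulting strings in order."""
    return [template.replace("%" + name, str(value))
            for value in range(low, high + 1)]


regs = expand("R%n", "n", 0, 31)
print(regs[0], regs[1], regs[-1])  # R0 R1 R31
print(len(regs))                   # 32
```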
-
-
- Numeric Ranges
- The notation “..” denotes a numeric range.
For example,
0..(2^16-1)
represents the numbers 0 to 65535, inclusive.
-
-
- Sequences
- Sequences of items are enclosed in square brackets “[” and “]”. For example,
let n = [ a b c d ]
Sequences can be given a local name using the name@sequence syntax, for
example:
bar@[ 1 2 3]
defines bar as a local name for the expression [ 1 2 3 ].
The “++” operator is used to concatenate sequences. These sequences must be
of the same type.
-
-
- Sequencing Tokens
- The “<+>” operator separates tokens in sequence. For example, to specify an
instruction that has two tokens T1 and T2 in sequence, use:
..the definition of T1..
<+>
..the definition of T2..
-
-
- Slices
- Slices may be specified using the slice notation, namely
name[highbit:lowbit], where highbit and lowbit are inclusive zero-based
indices and name is the name of a token. For example:
let Rsrc = instruction[3:0]
Sparse slices may be specified by separating slice expressions using commas.
For example, bits 7 and 5 of the ifield token may be specified using:
ifield[7,5]
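What a slice selects can be shown with ordinary shifts and masks. The Python sketch below is illustrative only: extract and extract_sparse are hypothetical helpers, and the ordering of bits in a sparse slice (first listed bit is most significant) is an assumption, since the manual does not specify it.

```python
def extract(value, high, low):
    """Return bits high..low (inclusive, zero-based) of `value`."""
    return (value >> low) & ((1 << (high - low + 1)) - 1)


def extract_sparse(value, bits):
    """Gather the listed bit positions into one value.
    Assumption: the first listed bit becomes the most significant."""
    result = 0
    for bit in bits:
        result = (result << 1) | ((value >> bit) & 1)
    return result


instruction = 0b1010_0110_1011_0101
print(extract(instruction, 3, 0))      # 5, the value of instruction[3:0]

ifield = 0b1010_0000
print(extract_sparse(ifield, [7, 5]))  # 3, bits 7 and 5 are both set
```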
-
-
- Specifying Assembly Formats
- The “<=>” infix operator is used to specify assembly language syntax and
its mapping to sequences of fragments defined earlier; see the section
Defining Assembly Syntax.
The “&*” operator indicates that all the named fragments in the LHS (the
assembly syntax side) of the “<=>” operator should be treated as being
present on the RHS. This operator allows instructions that have a simple
one-to-one mapping between their assembly language definition and
instruction encoding to be described succinctly. For example:
muls %Rd, %Rs <=> i[15:8] = 0b00000010 &*
The input language has the following constructs:
-
-
arch string
- Specifies the name of the instruction set architecture being processed.
-
-
cpus
- Starts a block naming CPU identifiers. Specific
instructions or groups of instructions may be flagged as being supported
on sets of the CPUs so declared.
cpus
basic = [ CPU1 CPU2 ]
advanced = basic ++ [ CPU3 ]
-
-
token name(width)
- Defines a token with name name and width width. For example, to define a
16-bit token named i (short for “instruction”), and an 8-bit offset token
named o, use:
token i(16) ; a comment here
o(8)
-
-
let name [params] = expression
- Declares name as being equivalent to expression.
-
-
names generator-expression
- Defines the textual representation for a fragment. For example,
let Rsrc = i[3:0]
names [ R%n | n = 0..15 ]
specifies that a value of 0 for fragment Rsrc should be shown as R0, and so
on. Conversely, when assembling text, the string “R15” would be translated
to a fragment value of 15.
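The two directions of this mapping can be sketched in Python. This is a minimal illustration rather than the utility's implementation: disassemble_field and assemble_field are hypothetical names, and the sketch assumes the names cover all sixteen values of a 4-bit field such as i[3:0].

```python
# Textual names for each value of a 4-bit register field (assumed range 0..15).
names = ["R%d" % n for n in range(16)]

# Reverse lookup used when assembling text into field values.
to_value = {text: value for value, text in enumerate(names)}


def disassemble_field(value):
    """Map a field value to its textual representation."""
    return names[value]


def assemble_field(text):
    """Map a textual name back to its field value."""
    return to_value[text]


print(disassemble_field(0))   # R0
print(assemble_field("R15"))  # 15
```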
-
-
where name [params] = expression
- Like the let statement, a where statement introduces local definitions,
except that the scope of these definitions is the statement preceding the
where keyword. Example:
let Kimm6 = Kimm6high & Kimm6low
where Kimm6[5:4] = Kimm6high
Kimm6[3:0] = Kimm6low
-
-
with fragment-definition
- Defines fragment assignments that hold for statements in the scope of the
with statement. For example,
with i[15:8] = 0b00000011
fmulsu %Rd, %Rs <=> i[7,3] = [1,1] &*
Assembly syntax is described using the <=> operator. The form of the
operator is
assembler-text <=> fragment & fragment & ...
The RHS of the <=> operator must specify a ‘complete’ fragment set, i.e., no
bits should be unspecified in any of the tokens used in the RHS. The LHS of
the <=> operator consists of literal text interspersed with fragment names.
Fragment names are prefixed by the ‘%’ character. These fragment names in
the LHS may refer to fragment names defined earlier, or may be new names
that are local to the current definition.
For example, the following definition defines an instruction with mnemonic
“rjmp”.
let reloffset = i[11:0]
reljmpcall = i[12]
in
with i[15:13] = 0b110
rjmp %label <=> reljmpcall = 0 & reloffset = (label - . - 1)
In this definition, the field label is a local fragment, one that is used to
compute the value of the reloffset field in the instruction. In the RHS, the
reljmpcall bit is defined as being 0. The rest of the bits in the token i
are specified by the enclosing with statement.
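The encoding implied by the rjmp definition can be worked through in Python. The sketch below is illustrative only and makes several assumptions that the manual does not state: the function name encode_rjmp is hypothetical, “.” is taken to be the address of the current instruction in the same units as label, and negative offsets are stored in the 12-bit reloffset field in two's complement.

```python
def encode_rjmp(label, here):
    """Build the 16-bit token i for 'rjmp label' located at address `here`."""
    offset = label - here - 1            # reloffset = (label - . - 1)
    if not -2048 <= offset <= 2047:
        raise ValueError("target out of range for a 12-bit offset")
    reloffset = offset & 0xFFF           # i[11:0], two's complement (assumed)
    reljmpcall = 0                       # i[12]
    return (0b110 << 13) | (reljmpcall << 12) | reloffset  # i[15:13] = 0b110


print(hex(encode_rjmp(label=0x11, here=0x10)))  # 0xc000, offset 0
print(hex(encode_rjmp(label=0x10, here=0x10)))  # 0xcfff, offset -1
```

Together, the three assignments cover bits 15 through 0 of the token, which is what makes the fragment set complete.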
elf(3),
elf(5),
isa(1)
The
isa utility is scheduled to appear in a future
release from the Elftoolchain project.
The isa(1) utility was written by Joseph Koshy <jkoshy@users.sourceforge.net>.
The
isa utility is currently under development. The
input format documented in this manual is likely to change in the future. If
you intend to use this utility, please get in touch with the project's
developers at
⟨elftoolchain-developers@lists.sourceforge.net⟩.