The JIT compiler translates statement blocks, which have been opened using the compile keyword, to an internal bytecode (p-code) representation. This p-code is optimized by some post processing routines and then finally translated to native CPU code thus speeding up execution time by up to 50 times.
By using a CPUTable to emit the actual machine code, most of the platform-dependent code is removed from the JIT compiler at source level. The CPUTable contains the micro-programs for all p-code instructions. In order to add support for other CPU architectures, this table can be exchanged for a more suitable set of micro programs.
P-Code virtual machines have been around for some decades (read a paper saying 1950ies). Nevertheless I first became aware of VMs when I heard of Sun Microsystems Java language. Actually, I have never been much into Java programming (had to learn it for school once) and I did not look at the Java p-code instruction set or Java VM when I started the TKScript JIT (to be true: When I started optimizing the JIT later on, I used a Java decompiler and actually *had* a look at the Java VM (a bit). Guess the JIT design is quite obvious anyway since the parser tree is stack based, too..). The main sources of inspiration for the JIT were
Some funny thing I just noticed (as of 10-Apr-2004) is that the Forth language had "Integrated access to assembly language.". Originally, the TkScript JIT compiler also included an X86/X87 inline assembler which was removed in 2003 to ease portability.
Probably the most important reference for bytecode virtual machines is UCSD Pascal, read more about it here.
Please notice that TkScript is basically not a bytecode-interpreted language. The bytecode used here is only an intermediate step for the native code generator. It is generated only for compile{/*...*/} statement sequences.
The native-code generator uses a peephole optimizer, which moves an optimization "window" over the generated code and collapses n instructions to a single one. Take a look at the difference between these two examples: sieve_dasm_opt and sieve_dasm_noopt.
The programmer must tag the statement blocks which are to be compiled to native code by opening them using the keyword compile.
Example:
int flags[8191]; flags.numElements=8191;
compile
{
int size= 8190;
int i, prime, k, count, iter;
trace "Eratosthenes Sieve prime number calculation";
trace "10 iterations<br>";
for (iter = 1; iter <= 10; iter++)
{
count = 0;
for (i = 0; i <= size; i++) flags[i] = true;
for (i = 0; i <= size; i++)
{
if (flags[i])
{
prime = i + i + 3;
k = i + prime;
while (k <= size)
{
flags[k] = false;
k += prime;
}
count ++;
}
}
}
trace count + " primes";
}
"sieve.tks" p-code disassembly:
(#0000)00000000: movvc size 8190 (0.000000f); (#0001)00000003: pushc 9659160 (0.000000f); (#0002)00000005: loadc 4370488 (0.000000f); (#0003)00000007: apicall; (#0004)00000008: incstp; (#0005)00000009: pushc 9659320 (0.000000f); (#0006)0000000b: loadc 4370488 (0.000000f); (#0007)0000000d: apicall; (#0008)0000000e: incstp; (#0009)0000000f: movvc iter 1 (0.000000f); (#0010)00000012: jivic iter > 10 005c; (#0011)00000016: movvc count 0 (0.000000f); (#0012)00000019: movvc i 0 (0.000000f); (#0013)0000001c: jiviv i > size 002b; (#0014)00000020: pushc 1 (0.000000f); (#0015)00000022: pushv i; (#0016)00000024: iasel 0; (#0017)00000026: siapopl; (#0018)00000027: inciv i; (#0019)00000029: bra 001c; (#0020)0000002b: movvc i 0 (0.000000f); (#0021)0000002e: jiviv i > size 0058; (#0022)00000032: pushv i; (#0023)00000034: siapushl; (#0024)00000035: sitestz 0054; (#0025)00000037: pushivaddiv i i; (#0026)0000003a: pushc 3 (0.000000f); (#0027)0000003c: siadd; (#0028)0000003d: popv prime; (#0029)0000003f: pushivaddiv i prime; (#0030)00000042: popv k; (#0031)00000044: jiviv k > size 0052; (#0032)00000048: pushc 0 (0.000000f); (#0033)0000004a: pushv k; (#0034)0000004c: siapopl; (#0035)0000004d: ivaddiv k prime; (#0036)00000050: bra 0044; (#0037)00000052: inciv count; (#0038)00000054: inciv i; (#0039)00000056: bra 002e; (#0040)00000058: inciv iter; (#0041)0000005a: bra 0012; (#0042)0000005c: pushc 9594240 (0.000000f); (#0043)0000005e: loadc 4370488 (0.000000f); (#0044)00000060: apicall; (#0045)00000061: incstp; (#0046)00000062: halt;
compiler output without optimization ("sieve.tks" disassembly page):
(#0000)00000000: pushc 8190 (0.000000f); (#0001)00000002: popv size; (#0002)00000004: pushc 9663256 (0.000000f); (#0003)00000006: loadc 4370488 (0.000000f); (#0004)00000008: apicall; (#0005)00000009: incstp; (#0006)0000000a: pushc 9663416 (0.000000f); (#0007)0000000c: loadc 4370488 (0.000000f); (#0008)0000000e: apicall; (#0009)0000000f: incstp; (#0010)00000010: pushc 1 (0.000000f); (#0011)00000012: popv iter; (#0012)00000014: pushv iter; (#0013)00000016: pushc 10 (0.000000f); (#0014)00000018: sicmpb <=; (#0015)00000019: sitestz 0075; (#0016)0000001b: pushc 0 (0.000000f); (#0017)0000001d: popv count; (#0018)0000001f: pushc 0 (0.000000f); (#0019)00000021: popv i; (#0020)00000023: pushv i; (#0021)00000025: pushv size; (#0022)00000027: sicmpb <=; (#0023)00000028: sitestz 0035; (#0024)0000002a: pushc 1 (0.000000f); (#0025)0000002c: pushv i; (#0026)0000002e: iasel 0; (#0027)00000030: siapopl; (#0028)00000031: inciv i; (#0029)00000033: bra 0023; (#0030)00000035: pushc 0 (0.000000f); (#0031)00000037: popv i; (#0032)00000039: pushv i; (#0033)0000003b: pushv size; (#0034)0000003d: sicmpb <=; (#0035)0000003e: sitestz 0071; (#0036)00000040: pushv i; (#0037)00000042: siapushl; (#0038)00000043: sitestz 006d; (#0039)00000045: pushv i; (#0040)00000047: pushv i; (#0041)00000049: siadd; (#0042)0000004a: pushc 3 (0.000000f); (#0043)0000004c: siadd; (#0044)0000004d: popv prime; (#0045)0000004f: pushv i; (#0046)00000051: pushv prime; (#0047)00000053: siadd; (#0048)00000054: popv k; (#0049)00000056: pushv k; (#0050)00000058: pushv size; (#0051)0000005a: sicmpb <=; (#0052)0000005b: sitestz 006b; (#0053)0000005d: pushc 0 (0.000000f); (#0054)0000005f: pushv k; (#0055)00000061: siapopl; (#0056)00000062: pushv k; (#0057)00000064: pushv prime; (#0058)00000066: siadd; (#0059)00000067: popv k; (#0060)00000069: bra 0056; (#0061)0000006b: inciv count; (#0062)0000006d: inciv i; (#0063)0000006f: bra 0039; (#0064)00000071: inciv iter; (#0065)00000073: bra 0014; (#0066)00000075: pushc 9598296 (0.000000f); (#0067)00000077: loadc 4370488 (0.000000f); (#0068)00000079: apicall; (#0069)0000007a: incstp; (#0070)0000007b: halt;
The following intermediate byte code instruction set is used by the JIT compiler to translate source code to native CPU code. Arguments are usually passed on the hardware stack. Script classes require a static array to keep track of nested class calls. The list may be a little outdated but basically nothing has gravely changed.
apicall <function_adr> - call API function (first argument last)
bra <label> - branch to user defined label
fvaddc <var> <const>
fvaddfv <var1> <var2>
fvdivc <var> <const>
fvdivfv <var1> <var2>
fvmulc <var> <const>
fvmulfv <var1> <var2>
fvsubc <var> <const>
fvsubfv <var1> <var2>
halt - terminate the VM.
iasel <var> load int array variable var into array base register
ivaddc <var> <const>
ivaddiv <var1> <var2>
ivandc <var> <const>
ivandiv <var1> <var2>
ivaslc <var> <const>
ivasliv <var1> <var2>
ivasrc <var> <const>
ivasriv <var1> <var2>
ivdivc <var> <const>
ivdiviv <var1> <var2>
iveorc <var> <const>
iveoriv <var1> <var2>
ivmodc <var> <const>
ivmodiv <var1> <var2>
ivmulc <var> <const>
ivmuliv <var1> <var2>
ivorc <var> <const>
ivoriv <var1> <var2>
ivsubc <var> <const>
ivsubiv <var1> <var2>
jeqivic <var1> <iconst> <label>
jeqiviv <var1> <var2> <label>
jgeivic <var1> <iconst> <label>
jgeiviv <var1> <var2> <label>
jgtivic <var1> <iconst> <label>
jgtiviv <var1> <var2> <label>
jltivic <var1> <iconst> <label>
jltiviv <var1> <var2> <label>
jleivic <var1> <iconst> <label>
jleiviv <var1> <var2> <label>
jneivic <var1> <iconst> <label>
jneiviv <var1> <var2> <label>
fasel <var> load float array variable var into array base register
poplv <lvar> - pop local variable lvar from stack
popv <var> -pop global/function variable var from stack
pushc <const> - push constant value const onto stack
pushfvaddc <var> <const>
pushfvaddfv <var1> <var2>
pushfvdivc <var> <const>
pushfvdivfv <var1> <var2>
pushfvmulc <var> <const>
pushfvmulfv <var1> <var2>
pushfvsubc <var> <const>
pushfvsubfv <var1> <var2>
pushivandc <var1> <const>
pushivandiv <var1> <var_2>
pushivorc <var1> <const>
pushivoriv <var1> <var_2>
pushiveorc <var1> <const>
pushiveoriv <var1> <var_2>
pushivmodc <var1> <const>
pushivmodiv <var1> <var_2>
pushivaddc <var1> <const>
pushivaddiv <var1> <var_2>
pushivsubc <var1> <const>
pushivsubiv <var1> <var_2>
pushivmulc <var1> <const>
pushivmuliv <var1> <var_2>
pushivdivc <var1> <const>
pushivdiviv <var1> <var_2>
pushivaslc <var1> <const>
pushivasliv <var1> <var_2>
pushivasrc <var1> <const>
pushivasriv <var1> <var_2>
pushs - push stackpointer onto stack
pushv <var> - push global/function variable var onto stack
pushdeciv <var> - decrement and push global/function int variable var onto stack
pushinciv <var> - increment and push global/function int variable var onto stack
pushivdec <var> - push global/function int variable var onto stack and decrement it
pushivinc <var> - push global/function int variable var onto stack and increment it
movlv <lvar> <lvar2> - move content of lvar2 to lvar
movlvc <lvar> <const> - move constant value const to lvar
movv <var> <var2> - move content of var2 to var
movvc <var> <const> - move constant value const to var
sfatan2 - calc arc tan of st[1]/st[0] (both float)
sfabs - calc absolute value of stack float val
sfadd - add stack float values
sfcmp <relop> <label> - compare float values from stack and branch to label if comparison evaluates to 1
sfcos - calc cos of stack float val (rad)
sfdiv sp[0]=sp[1]/sp[0] -
sffrac - calc remainder of stack float val
sfneg - change sign of stack float val
sfmul - multiplicate stack float values
sfpow - st[0]= st[0] raised to the power of st[1]
sfround - round stack float val
sfsin - calc sin of stack float val (rad)
sfsub sp[0]=sp[1]-sp[0] -
sfsqrt - calc square root of stack float val
sftan - calc tan of stack float val (rad)
sfquad - multiply stack float val with itself
sfcmpb <relop> - compare stack float values and store result (1,0)
siabs - calc absolute value of stack int val
siadd - add stack int values
siand - bitwise and stack int values
siasl sp[0]=sp[1]<<sp[0] -
siasr sp[0]=sp[1]>>sp[0] -
sicmp <relop> <label> - compare int values from stack and branch to label if comparison evaluates to 1
sicmpb compare stack int values and store result (1,0)
sidiv sp[0]=sp[1]/sp[0]. -
sieor - bitwise eor/xor stack int values
siinv - bitwise not int val from stack
siloop <label> - decrement int value from stack and branch to label if value>0
simod sp[0]=sp[1]%sp[0] -
simul - multiplicate stack int values
sineg - change sign of stack int val
sinot - C-like logical not operator. pops int val from stack and pushes either 0 or 1
sior - bitwise or stack int values
siapopl - read array element
siapushl - store array element
sipackargb - pop 4 (byte) ints from stack and push packed 32bit int
sipop - pop int/float val from stack and discard it.
sipush[0..3] - push constant int val [0..3] onto stack
sirnd - push new random int val onto stack
sisub sp[0]=sp[1]-sp[0] -
siswap - swaps swap upper stack values
siswapw - swap upper and lower word of int stack value
sitestbz - check if stack int val >0 and set it to either 0 or 1
sitestbz2 - check if stack+1 int val >0 and set it to either 0 or 1
sitestz <label> - pop int from stack, compare with 0 and branch to label if 1
sitestzp <label> - compare int value from stack and branch to label if comparison evaluates to 1
siquad - multiply stack int val with itself
siunpackargb - pop packed 32bit int from stack and push 4 (byte) ints
sreti - grab int/object result value from last C/C++ call (usually in eax)
sretf - grab float result value from last C/C++ call (usually in st0)
stcif - typecast stack int val to float
stcif2 - typecast stack+1 int val to float
stcfi - typecast stack float val to int
stcfi2 - typecast stack+1 float val to int
Variables, constants and jump/function call addresses in the native code are later replaced by their actual values with the help of some magic (32Bit) place holder symbols that are stored in the CPUTable:
| "magic" symbol | instruction |
|---|---|
| 0xC0DE1234 | halt |
| 0xC0DEC0DE | bra, sfcmp, sicmp, sitestz, sitestzp, siloop |
| 0xC0FFC0FE | movvc |
| 0xC0FFC4C4 | pushv, pushivinc, pushinciv, pushivdec, pushdeciv, popv, movv, movvc, inciv, |
| 0xC0FFC4C4 | deciv, pushv |
| 0xC0FFC0FF | pushc, loadc |
| 0xC0FFC4C5 | movv |
| 0xC0FFAAA1 | iasel, iaselbc |
| 0xC0FFABA1 | iaselbc |
| 0xC0FFAAA2 | fasel |
| 0xC0FFABA2 | faselbc |
| 0xC0FFC7C7 | incliv, decliv, siapushl, siapushlbc, siapopl, siapoplbc |