Author Topic: Scripting Language Development  (Read 2956 times)
Glaiel-Gamer
Moderator
Level 10
******


Stoleurface!


View Profile WWW Email
« on: June 14, 2009, 08:38:00 PM »

For fun, I've decided to write my own small, lightweight scripting language for describing the motions of the noninteractive background objects in Closure, and for triggering animations and such under certain conditions. It won't be implemented into the game soon, but it's fun to write a mini compiler and interpreter for it for now.

This is sorta a mini-dev log for this.

First, I decided what I want the scripting language to do, and what it should look like.
I decided on C++ syntax, but each script is pretty much its own function:
- only floats and ints
- operators: +-*/%, assignment: =, comparison: == != > >= < <=, grouping: ( )
- no dynamic allocation
- if statements, no while/for statements (for now)
- math functions: sin, cos, tan, sqrt (but easily able to add more if I so chose)
- pass the script pointers to data to access/modify through code

Essentially, each script will describe one algorithm.
so I can do stuff like this:

script: (ints A and B and OUTPUT are passed to the script from code)

int C = A * (B + 100)*sin(400 * A);
if(C > 200){
  OUTPUT = 1;
} else {
  OUTPUT = 2;
}


basic stuff pretty much.


Second, I decided on an instruction set for the virtual machine, and how it would operate. All I know about VMs comes from looking at Flash bytecode and the small amount of assembly I know; other than that I'm pretty much making this up as I go along.
I'm going for a stack based thingy
so that means converting normal expressions like A + B * C to

B C * A +

it would push B and C on the stack, Multiply them, place the result on the stack (in place of B and C), put A on the stack, then add them, so after this is done the value on the stack is what you'd expect
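
The stack evaluation described above can be sketched in a few lines of C++. This is only an illustration under assumptions: `evalPostfix` and the `vars` lookup table are hypothetical names, it handles ints only, and it does not guard against division by zero.

```cpp
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Evaluates a space-separated postfix expression such as "B C * A +".
// Hypothetical helper: variable values are looked up in `vars`.
int evalPostfix(const std::string& expr, const std::map<std::string, int>& vars) {
    std::vector<int> stack;
    std::istringstream in(expr);
    std::string tok;
    while (in >> tok) {
        if (tok == "+" || tok == "-" || tok == "*" || tok == "/" || tok == "%") {
            if (stack.size() < 2) throw std::runtime_error("stack underflow");
            // The right operand was pushed last, so it comes off first.
            int rhs = stack.back(); stack.pop_back();
            int lhs = stack.back(); stack.pop_back();
            if (tok == "+") stack.push_back(lhs + rhs);
            else if (tok == "-") stack.push_back(lhs - rhs);
            else if (tok == "*") stack.push_back(lhs * rhs);
            else if (tok == "/") stack.push_back(lhs / rhs);
            else stack.push_back(lhs % rhs);
        } else {
            stack.push_back(vars.at(tok));  // operand: push its value
        }
    }
    if (stack.size() != 1) throw std::runtime_error("malformed expression");
    return stack.back();
}
```

With A = 2, B = 3 and C = 4, evaluating "B C * A +" leaves 14 on the stack, the same as A + B * C.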

Also, I don't know how normal VMs do this, but since I want floats and ints, i made one stack for floating point expressions and one stack for integer expressions, each expression chooses one or the other to work with (if the expression is only ints, it uses the int stack otherwise it uses the float stack and casts ints to float when it pushes onto the stack)

so I also have 3 classes of variable: local, outside/reference, and constant

and came up with this instruction set:
Code:
            SETMODE,                                                         // sets the current stack (float or int)
            POPINT, POPFLOAT, POPINTR, POPFLOATR,                            // moves a number from the current stack to a register, R = reference, C = constant
            PUSHINT, PUSHFLOAT, PUSHINTR, PUSHFLOATR, PUSHINTC, PUSHFLOATC,  // moves a number from the register to the current working stack
            CALL,                                                            // calls a function with the current stacktop as a parameter, places value on stack
            MULT, DIV, ADD, SUB, MOD,                                        // operates on top 2 in stack and places return value in stack
            JUMP,                                                            // sets program pointer
            COMPEQ, COMPGR, COMPLT, COMPGREQ, COMPLTEQ, COMPNEQ,             // compares top 2 on stack, places result in "result" boolean
            SKIPIF,                                                          // skips next line if result is true

for CALL, it could have SIN, COS, TAN, or SQRT as a parameter

SETMODE, the PUSH and POP variants, CALL, and JUMP take 1 argument, the rest take none

this works in my head but I don't know if you can get it through my shoddy explanation


Then I got a function that reads in a file and splits it into distinct tokens (converts "A=B+45*C;" into {"A", "=", "B", "+", "45", "*", "C", ";"})
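
A tokenizer of this kind might look roughly like the sketch below. It is an assumption about how such a function could work, not the code from the post: `tokenize` is a hypothetical name, and it works on a string rather than a file to stay self-contained.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Splits source text into identifier, number, and symbol tokens.
// Two-character comparison operators (==, !=, >=, <=) are kept together.
std::vector<std::string> tokenize(const std::string& src) {
    std::vector<std::string> tokens;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) { ++i; continue; }
        std::size_t start = i;
        if (std::isalpha(c) || c == '_') {
            // identifier: letters, digits, underscores
            while (i < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[i])) || src[i] == '_')) ++i;
        } else if (std::isdigit(c)) {
            // number: digits with an optional decimal point
            while (i < src.size() &&
                   (std::isdigit(static_cast<unsigned char>(src[i])) || src[i] == '.')) ++i;
        } else {
            // symbol: one character, or two for ==, !=, <=, >=
            ++i;
            if (i < src.size() && src[i] == '=' &&
                (c == '=' || c == '!' || c == '<' || c == '>')) ++i;
        }
        tokens.push_back(src.substr(start, i - start));
    }
    return tokens;
}
```

Calling `tokenize("A=B+45*C;")` yields the eight tokens listed above.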

for a sample from now on, I'm gonna do what I have for my test which is

A * sin(B) + (A*B*B); with A as an int and B as a float passed to it

so the first bit converts it to (using spaces to delimit tokens, internally it's an array of strings)

A * sin ( B ) + ( A * B * B ) ;

Next, I pull out one line at a time and send it to the "expression compiler" (strips out the ; and will do a few things like resolving variable names in the future too)

A * sin ( B ) + ( A * B * B )

Now I begin processing this array of tokens into something easier for the program to compile into opcodes, so seeing as I need to get it into the stack form shown above, I need to swap the order of tokens around, so first I resolve function names (which is sin in this case)

In stack form, sin(B) is "B sin", which means push B on the stack, then call sin on the top of the stack. So that's simple: when I find a function name, I move it to right in front of the corresponding closing parenthesis

A * ( B sin ) + ( A * B * B )

Next up is shuffling operators around. For this, I treat a parenthesized grouping as a single token (I essentially pass over it during my swapping).

So I start by looking at the first token and incrementing till I find an operator (in this case the first operator is *).
I then swap the operator and the next "token group" (which is the next token if there's no parenthesis)

A ( B sin ) * + ( A * B * B ) (the * is done; the next operator is +, so swap that with its token group)

A ( B sin ) * ( A * B * B ) +

Next, I begin converting to opcodes, but, when I encounter a (, I remove it, and count forward till I find the corresponding ), remove that, and send the grouping in between to the expression compiler (recursion). So the following 2 expressions get compiled through the method above and placed in place:

B sin (already fine)
A * B * B (turns to)
A B * B *

so the final expression before translation is

A B sin * A B * B * +

Then it's simply just processing one token at a time and resolving names and symbols
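
The reordering walked through above can be sketched as follows. This is not the swapping code from the post: it reaches the same token order by reading one "token group" at a time and emitting each operator after its right-hand group. The names `toPostfix` and `emitGroup` are hypothetical, and like the method described it applies no operator precedence.

```cpp
#include <cstddef>
#include <set>
#include <stdexcept>
#include <string>
#include <vector>

using Tokens = std::vector<std::string>;

static bool isOperator(const std::string& t) {
    static const std::set<std::string> ops{"+", "-", "*", "/", "%"};
    return ops.count(t) != 0;
}
static bool isFunction(const std::string& t) {
    static const std::set<std::string> fns{"sin", "cos", "tan", "sqrt"};
    return fns.count(t) != 0;
}

Tokens toPostfix(const Tokens& in);

// Reads one token group starting at pos: a plain token, a parenthesized
// sub-expression, or a function call. Appends its postfix form to out.
static void emitGroup(const Tokens& in, std::size_t& pos, Tokens& out) {
    if (pos >= in.size()) throw std::runtime_error("operand expected");
    std::string fn;
    if (isFunction(in[pos])) fn = in[pos++];
    if (pos < in.size() && in[pos] == "(") {
        std::size_t depth = 1, start = ++pos;
        while (pos < in.size() && depth > 0) {   // find the matching ")"
            if (in[pos] == "(") ++depth;
            else if (in[pos] == ")") --depth;
            ++pos;
        }
        if (depth != 0) throw std::runtime_error("unbalanced parentheses");
        Tokens inner(in.begin() + start, in.begin() + (pos - 1));
        Tokens sub = toPostfix(inner);           // recursion on the grouping
        out.insert(out.end(), sub.begin(), sub.end());
    } else if (fn.empty()) {
        out.push_back(in[pos++]);
    } else {
        throw std::runtime_error("function call needs parentheses");
    }
    if (!fn.empty()) out.push_back(fn);          // "B sin": name goes last
}

// Strictly left to right: operand, then (group, operator) pairs.
Tokens toPostfix(const Tokens& in) {
    Tokens out;
    std::size_t pos = 0;
    emitGroup(in, pos, out);
    while (pos < in.size()) {
        std::string op = in[pos++];
        if (!isOperator(op)) throw std::runtime_error("operator expected");
        emitGroup(in, pos, out);
        out.push_back(op);
    }
    return out;
}
```

Fed the tokens of A * sin ( B ) + ( A * B * B ), this produces A B sin * A B * B * +, matching the final expression above.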

My compiler outputs:

Code:
SETMODE FLOAT //the expression type is float, remember A was an int and B was a float
PUSHINTR A
PUSHFLOATR B
CALL SIN
MULT
PUSHINTR A
PUSHFLOATR B
MULT
PUSHFLOATR B
MULT
ADD

the R on the end of the push opcodes stands for "reference" since A and B are references to variables declared in the c++ program.

But the opcodes look pretty much correct, and the compiler handles nested parentheses properly.

Up next: operator precedence. I can just insert parentheses around multiplication, division and modulus accordingly for now though; the thing will handle it correctly if I do that.
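
Inserting those parentheses could be automated along the lines of the sketch below. It is a guess at one way to do it, with a hypothetical `parenthesize` function. It assumes binary operators only (no unary minus) and does not look inside existing parenthesized groups, which would be handled when the expression compiler recurses into them.

```cpp
#include <string>
#include <vector>

using Tokens = std::vector<std::string>;

// Wraps every top-level term that contains *, / or % in parentheses,
// so "A + B * C" becomes "A + ( B * C )".
Tokens parenthesize(const Tokens& in) {
    Tokens out, term;
    bool needsParens = false;
    int depth = 0;
    auto flush = [&]() {
        if (needsParens) out.push_back("(");
        out.insert(out.end(), term.begin(), term.end());
        if (needsParens) out.push_back(")");
        term.clear();
        needsParens = false;
    };
    for (const std::string& t : in) {
        if (t == "(") ++depth;
        else if (t == ")") --depth;
        if (depth == 0 && (t == "+" || t == "-")) {
            flush();              // a top-level + or - ends the current term
            out.push_back(t);
        } else {
            if (depth == 0 && (t == "*" || t == "/" || t == "%")) needsParens = true;
            term.push_back(t);
        }
    }
    flush();
    return out;
}
```

A strictly left-to-right expression compiler then evaluates the wrapped term first, which gives the usual precedence for these operators.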
Logged

muku
Level 10
*****



View Profile WWW
« Reply #1 on: June 14, 2009, 11:22:29 PM »

Yeah, writing compilers/interpreters is oddly fun, I've done that myself at some time in the past. I wouldn't recommend rolling your own for an actual project though; just use Lua or something and be done with it, you'll have better performance, fewer bugs, better tools etc.

Still, some remarks:


Quote
Also, I don't know how normal VMs do this, but since I want floats and ints, i made one stack for floating point expressions and one stack for integer expressions, each expression chooses one or the other to work with (if the expression is only ints, it uses the int stack otherwise it uses the float stack and casts ints to float when it pushes onto the stack)

I don't have that much experience with these things myself, but I think that a dynamically typed language could have one big stack where every value carries a "type" tag that tells the interpreter whether it's dealing with an int, float, string, or whatever.

In a statically typed language (I think this is what you're going for here), on the other hand, the compiler knows at any point which type of variable is contained at any location of the stack, so no need to carry that information around at runtime; just again have one stack, push either integers or floats onto it, and have the compiler generate the correct opcode to deal with the data. Say, you could have IADD and FADD instructions to add integers and floats, respectively.
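
A minimal sketch of that single-stack idea, assuming typed opcodes chosen by the compiler. The opcode names besides IADD and FADD (PUSHI, PUSHF, I2F) and the helper functions are made up for the illustration.

```cpp
#include <vector>

// One stack slot holds either an int or a float; there is no runtime type
// tag, so the compiler must emit the opcode that matches the slot contents.
union Slot { int i; float f; };

enum class Op { PUSHI, PUSHF, IADD, FADD, I2F };

struct Instr { Op op; Slot arg; };

// Hypothetical helper constructors.
Instr pushI(int v)   { Instr in{Op::PUSHI, {}}; in.arg.i = v; return in; }
Instr pushF(float v) { Instr in{Op::PUSHF, {}}; in.arg.f = v; return in; }
Instr simple(Op op)  { return Instr{op, {}}; }

Slot run(const std::vector<Instr>& code) {
    std::vector<Slot> stack;
    for (const Instr& in : code) {
        switch (in.op) {
            case Op::PUSHI:
            case Op::PUSHF:
                stack.push_back(in.arg);
                break;
            case Op::I2F: {   // convert the int on top of the stack to float
                Slot& top = stack.back();
                int v = top.i;
                top.f = static_cast<float>(v);
                break;
            }
            case Op::IADD: {
                int b = stack.back().i; stack.pop_back();
                stack.back().i += b;
                break;
            }
            case Op::FADD: {
                float b = stack.back().f; stack.pop_back();
                stack.back().f += b;
                break;
            }
        }
    }
    return stack.back();
}
```

For a mixed expression like 2 + 1.5 the compiler would emit PUSHI 2, I2F, PUSHF 1.5, FADD, and the caller reads the result from the `f` member.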

Quote
Now I begin processing this array of tokens into something easier for the program to compile into opcodes, so seeing as I need to get it into the stack form shown above, I need to swap the order of tokens around, so first I resolve function names (which is sin in this case)

That sounds rather cumbersome and limited. I think the usual approach is to build up an Abstract Syntax Tree (AST), that is, a tree structure which contains all terms of one expression or program in a hierarchical way so that you could theoretically evaluate it from the bottom up (or the top down, whichever way you look at it). Here's an example:


(from http://www.anandsekar.com/2006/01/15/writing-a-interpretter/ )

I think the whole "swapping tokens around" thing will get really awkward once you try implementing more operators with varying precedences.
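
For illustration, a bare-bones AST and the post-order walk that turns it into stack form could look like this. The `Node`, `leaf`, `binary` and `emitPostfix` names are hypothetical, and the tree is built by hand here; a real parser would build it from the token stream.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct Node {
    std::string text;                 // operator symbol or operand name
    std::unique_ptr<Node> lhs, rhs;   // both null for a leaf
};

std::unique_ptr<Node> leaf(std::string name) {
    auto n = std::make_unique<Node>();
    n->text = std::move(name);
    return n;
}

std::unique_ptr<Node> binary(std::string op,
                             std::unique_ptr<Node> l,
                             std::unique_ptr<Node> r) {
    auto n = std::make_unique<Node>();
    n->text = std::move(op);
    n->lhs = std::move(l);
    n->rhs = std::move(r);
    return n;
}

// Post-order walk: emit both children first, then the operator itself.
void emitPostfix(const Node& n, std::vector<std::string>& out) {
    if (n.lhs) emitPostfix(*n.lhs, out);
    if (n.rhs) emitPostfix(*n.rhs, out);
    out.push_back(n.text);
}
```

Building `binary("+", leaf("A"), binary("*", leaf("B"), leaf("C")))` and walking it emits A B C * +. Precedence is encoded in the shape of the tree, so no token swapping is needed.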


Also, if you want to save yourself a lot of headaches on all the lexing and parsing, consider using a parser generator, like Lex/Yacc for instance.
Logged

The Cosyne Synthesis Engine - realtime music synthesis for games
Glaiel-Gamer
Moderator
Level 10
******


Stoleurface!


View Profile WWW Email
« Reply #2 on: June 15, 2009, 03:10:20 PM »

I see how that tree thing would be useful, but I like figuring out things for myself, it helps me learn.

I can convert to the better way later, but I'm still gonna play around with my method



I just got if/else working.

how my parser handles it is such:


When I encounter a {, it signifies the end of an expression (in fact, I think my parser doesn't even read the word "if"). If it doesn't come after an "else", I compile the previous line (just like ";"), then add the next 2 lines of code

SKIPIF
JUMP #

(an if statement has boolean operators and comparisons in it, so SKIPIF skips the next line of code if the last thing was true)

then when I encounter the corresponding }, I put

LABEL #

in the code. Finding the corresponding one was funky, and I used a stack for it.
Then, when I find

} else {

put

JUMP #
LABEL ##

Then, after the code compiled, I went through and replaced each jump # with the corresponding line number to jump to. Labels are treated as a NOP.
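
That final pass could be sketched like this, assuming instructions are stored as an opcode name plus one integer argument. The `Instr` struct and `resolveLabels` are hypothetical names, and line numbers are taken to be 0-based indices into the instruction list.

```cpp
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

struct Instr {
    std::string op;   // e.g. "JUMP", "LABEL", "SKIPIF", "ADD"
    int arg = 0;      // label id before resolution, line number after
};

// Rewrites the argument of every JUMP and LABEL from a label id
// to the index of the line the label sits on.
void resolveLabels(std::vector<Instr>& code) {
    std::map<int, int> lineOf;   // label id -> line index
    for (std::size_t i = 0; i < code.size(); ++i)
        if (code[i].op == "LABEL") lineOf[code[i].arg] = static_cast<int>(i);
    for (Instr& in : code) {
        if (in.op != "JUMP" && in.op != "LABEL") continue;
        auto it = lineOf.find(in.arg);
        if (it == lineOf.end()) throw std::runtime_error("undefined label");
        in.arg = it->second;
    }
}
```

Collecting all label positions before patching any jump means forward jumps (the common case for if/else) resolve just as easily as backward ones.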

I tested it in a variety of situations and it worked wonderfully.


I also got it resolving variable names and variable hooks to the outside world, and parenthesizing expressions to preserve operator precedence.

Also got it executing the instructions.


Basically I'm done for what I wanted.

Things I need to add:

Unary +, -, ! operators
&&, ||

global storage

Logged

Glaiel-Gamer
Moderator
Level 10
******


Stoleurface!


View Profile WWW Email
« Reply #3 on: June 15, 2009, 07:18:34 PM »

got globals, unary operators, and boolean logic (&&, ||) working.

There. Everything I can get without doing while/for loops is in now.

SCRIPT:
Code:
script test
-----------------------------------

GLOBALS

global int m = 300000;
global int n = 500000;

BEGIN

output = 0;

if(m==0 || !(n < 600000)){
  output = n+1;
}

if(m>0 && n < 600000){
  m += -1;
  n += 1;
}

END

compiled
Code:
SETMODE INT
PUSHINTC 0
POPINTR output

SETMODE INT
PUSHINT m
PUSHINTC 0
COMPEQ
PUSHINT n
PUSHINTC 600000
COMPLT
NOT
OR
SKIPIF
JUMP 19

SETMODE INT
PUSHINT n
PUSHINTC 1
ADD
POPINTR output

LABEL 19

SETMODE INT
PUSHINT m
PUSHINTC 0
COMPGR
PUSHINT n
PUSHINTC 600000
COMPLT
AND
SKIPIF
JUMP 41

SETMODE INT
PUSHINT m
PUSHINTC 1
NEG
ADD
POPINT m

SETMODE INT
PUSHINT n
PUSHINTC 1
ADD
POPINT n

LABEL 41

(label and jump numbers refer to the line number)
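
To show how a listing like this can be executed, here is a small int-only interpreter sketch. It is not the interpreter from the post and rests on several assumptions: comparisons and AND/OR/NOT push 0 or 1 on the stack, SKIPIF pops that value and skips the next instruction when it is non-zero, and all variables (local, global, and linked) live in one hypothetical `vars` map.

```cpp
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <vector>

struct Instr { std::string op, arg; };

// Parses one "OPCODE [ARG]" per line. Blank lines are skipped so that
// indices match the 0-based numbering used by JUMP and LABEL.
std::vector<Instr> parseListing(const std::string& text) {
    std::vector<Instr> code;
    std::istringstream lines(text);
    std::string line;
    while (std::getline(lines, line)) {
        std::istringstream words(line);
        Instr in;
        if (words >> in.op) { words >> in.arg; code.push_back(in); }
    }
    return code;
}

// Runs the listing once from top to bottom.
void run(const std::vector<Instr>& code, std::map<std::string, int>& vars) {
    std::vector<int> stack;
    auto pop = [&stack]() { int v = stack.back(); stack.pop_back(); return v; };
    for (std::size_t pc = 0; pc < code.size(); ++pc) {
        const std::string& op = code[pc].op;
        const std::string& arg = code[pc].arg;
        if (op == "PUSHINTC") stack.push_back(std::stoi(arg));
        else if (op == "PUSHINT" || op == "PUSHINTR") stack.push_back(vars[arg]);
        else if (op == "POPINT" || op == "POPINTR") vars[arg] = pop();
        else if (op == "NEG") stack.back() = -stack.back();
        else if (op == "NOT") stack.back() = !stack.back();
        else if (op == "SKIPIF") { if (pop() != 0) ++pc; }
        // The target is a LABEL (a no-op), so the loop's ++pc may step past it.
        else if (op == "JUMP") pc = static_cast<std::size_t>(std::stoi(arg));
        else if (op == "SETMODE" || op == "LABEL") { /* no-op in this sketch */ }
        else {
            // Remaining opcodes are binary; unknown ones are silently dropped.
            int b = pop(), a = pop();
            if (op == "ADD") stack.push_back(a + b);
            else if (op == "COMPEQ") stack.push_back(a == b);
            else if (op == "COMPGR") stack.push_back(a > b);
            else if (op == "COMPLT") stack.push_back(a < b);
            else if (op == "AND") stack.push_back(a && b);
            else if (op == "OR") stack.push_back(a || b);
        }
    }
}
```

Under these assumptions, running the compiled listing above repeatedly with m = 300000, n = 500000 and output = 0 until output becomes non-zero ends with output at 600001.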


main file:

Code:
int main(){
  Script script("test.txt");
  int output = 0;
  script.setLink("output", &output);
  script.compile();
  script.output();
 
  while(output == 0){
    script.run();
  }
 
  std::cout << output << std::endl;
  return 0;
}


outputs: 600001
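
The `setLink` call in the main file binds a script variable name to a C++ variable. The `Script` class itself is not shown in the thread, so the sketch below is only a guess at how such a binding could be stored: a hypothetical `LinkTable` that maps names to `int` pointers.

```cpp
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the linking part of a Script class.
class LinkTable {
public:
    // Remember where the host program keeps the variable called `name`.
    void setLink(const std::string& name, int* target) { links_[name] = target; }

    // What a PUSHINTR-style opcode would do: read through the pointer.
    int read(const std::string& name) const { return *find(name); }

    // What a POPINTR-style opcode would do: write through the pointer.
    void write(const std::string& name, int value) { *find(name) = value; }

private:
    int* find(const std::string& name) const {
        auto it = links_.find(name);
        if (it == links_.end()) throw std::runtime_error("unlinked variable: " + name);
        return it->second;
    }
    std::map<std::string, int*> links_;
};
```

Because the table stores pointers, a write made by the script is visible to the host program immediately, which is what lets the `while(output == 0)` loop in the main file terminate.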


takes a second to execute with no compiler optimization, executes instantly with compiler optimization (HUGE increase... must be some switch-statement magic)
« Last Edit: June 15, 2009, 07:21:42 PM by Glaiel-Gamer » Logged

dustin
Level 6
*


View Profile Email
« Reply #4 on: June 15, 2009, 11:07:05 PM »

Hey I realize this might sound funny but I'm just wondering.

Your scripting language looks so similar to C why aren't you just writing the stuff as functions in C and using function pointers or something.  I think I've missed what advantage you get out of using a scripting language...
Logged
BorisTheBrave
Level 10
*****


View Profile WWW
« Reply #5 on: June 16, 2009, 01:34:42 AM »

If you want a C-like language, I would recommend using ANTLR to skip the parsing part, as IMHO it's the least fun. As you say, you can switch after you get fed up with the current approach, but afterwards you'll see you put many opcodes in the VM for no reason other than to make parsing easier.

Have you considered writing a Forth parser? It's very easy to parse and run, stack based, and quite powerful for scripting. Or Scheme.
Logged
Glaiel-Gamer
Moderator
Level 10
******


Stoleurface!


View Profile WWW Email
« Reply #6 on: June 16, 2009, 04:12:25 AM »

Quote
Hey I realize this might sound funny but I'm just wondering.

Your scripting language looks so similar to C why aren't you just writing the stuff as functions in C and using function pointers or something.  I think I've missed what advantage you get out of using a scripting language...

So i can send scripts to my artist, or have him use scripts, so he doesn't have to go through the trouble of compiling the entire project if he wants to change behaviors
Logged

Glaiel-Gamer
Moderator
Level 10
******


Stoleurface!


View Profile WWW Email
« Reply #7 on: June 16, 2009, 06:16:07 PM »

i implemented it into Closure for scripting background and environment objects in the game now

It's quite awesome to see it working for real

I call it "CLOT" until i come up with a better name
Logged

dustin
Level 6
*


View Profile Email
« Reply #8 on: June 16, 2009, 08:24:06 PM »

Quote
So i can send scripts to my artist, or have him use scripts, so he doesn't have to go through the trouble of compiling the entire project if he wants to change behaviors

Ahh ok that makes sense.  All I could think of was that it was just for you to avoid compiling which didn't seem like too big a deal really.
Logged
yesfish
Guest
« Reply #9 on: June 16, 2009, 11:42:38 PM »

Wouldn't using something like python or lua for scripting work better? (I don't find programming that fun so don't mind me if you prefer to do it yourself, lol)
Logged
Alex May
...is probably drunk right now.
Level 10
*


hen hao wan


View Profile WWW Email
« Reply #10 on: June 17, 2009, 12:58:22 AM »

Quote
Wouldn't using something like python or lua for scripting work better? (I don't find programming that fun so don't mind me if you prefer to do it yourself, lol)

Here's your answer:

Quote
I like figuring out things for myself, it helps me learn.

(still... why reinvent the wheel that Lua does so well ;)

Logged

postlogic
Level 1
*



View Profile
« Reply #11 on: June 17, 2009, 04:31:49 AM »

Looping is pretty simple to implement, though... Just base it on if-else.

Basically: test the loop expression; if it's true, go through the statements and jump back to the top; if it's false, jump to the line below the jump to top (this might require you to update that line later, like I had to do in my implementation of this simple C thing)
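
Using the opcodes from earlier in the thread, that loop layout could be generated roughly as follows. This is a sketch with hypothetical names (`emitWhile`, `Listing`); the label ids are symbolic and would be replaced by line numbers in a later pass, as with if/else.

```cpp
#include <string>
#include <vector>

using Listing = std::vector<std::string>;

// Builds "while (cond) { body }" out of SKIPIF / JUMP / LABEL,
// following the same pattern as the if statement.
Listing emitWhile(const Listing& cond, const Listing& body,
                  int topLabel, int endLabel) {
    Listing out;
    out.push_back("LABEL " + std::to_string(topLabel));
    out.insert(out.end(), cond.begin(), cond.end());
    out.push_back("SKIPIF");                            // true: skip the exit jump
    out.push_back("JUMP " + std::to_string(endLabel));  // false: leave the loop
    out.insert(out.end(), body.begin(), body.end());
    out.push_back("JUMP " + std::to_string(topLabel));  // go back and re-test
    out.push_back("LABEL " + std::to_string(endLabel));
    return out;
}
```

The only difference from an if statement is the extra LABEL at the top and the backward JUMP at the end of the body.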

I like watching how you progress on this :)

@Alex May: Cause he can.
Logged

No animals were harmed during the making of this post. Except one.
Glaiel-Gamer
Moderator
Level 10
******


Stoleurface!


View Profile WWW Email
« Reply #12 on: June 17, 2009, 06:33:27 AM »

Quote
Looping is pretty simple to implement, though... Just base it on if-else.

I added "label" and "goto" so I can do looping if I want to
Logged

Ryan
Level 1
*



View Profile
« Reply #13 on: June 17, 2009, 01:52:38 PM »

It confuses me a bit when people ask 'why?' due to the fact there's so many viable scripting languages already out there.  There is joy when creating something by yourself, surely a community of independent game developers knows this.

Some people are happy using tools that get the job done, like Game Maker or Unity.  Others enjoy using their own tools.  Self-made tools may not be the 'best' way to do things, or even the most efficient- but they're yours.

Even if they're not used in the end, it's the journey and experience that makes that beautiful process all worthwhile.

Beer!
Logged
Overkill
Level 3
***


Andrew G. Crowell

overkill9999@gmail.com Minimum+Overkill
View Profile WWW Email
« Reply #14 on: June 17, 2009, 04:48:34 PM »

Sometimes it's fun to sit down and reinvent the wheel. It makes you really appreciate existing wheels, lets you learn to use the wheels that are out there better, and in rare cases it allows you to make BETTER ones.

That said, it really breaks down to what you want to make with your time, at the end of the day. If you're going to make a scripting language, you'll have less time to make games. Thus you have to think, besides "for fun", is there going to be a practical use for it? Or are you just coding something that you're going to throw away (Sometimes this is okay)?

I have tried making scripting languages a bunch of times too, but since my dev time now is scarce, I decided to concentrate on what I *really* wanted to do, make games. (Even then, now that I'm over language fascination for now, I'm still caught up in engine dev craze all the time)

Anyways, kind of neat. It gets really fun when you get past the basic control structures, and you add user functions and static arrays. And even more fun when you run into dynamic arrays/tables, and object types since they will probably screw over your existing interpreter implementation. I've been there :D
« Last Edit: June 17, 2009, 04:51:35 PM by Overkill » Logged

Glaiel-Gamer
Moderator
Level 10
******


Stoleurface!


View Profile WWW Email
« Reply #15 on: June 17, 2009, 04:53:54 PM »

Quote
Sometimes it's fun to sit down and reinvent the wheel. It makes you really appreciate existing wheels, lets you learn to use the wheels that are out there better, and in rare cases it allows you to make BETTER ones.

That said, it really breaks down to what you want to make with your time, at the end of the day. If you're going to make a scripting language, you'll have less time to make games. Thus you have to think, besides "for fun", is there going to be a practical use for it? Or are you just coding something that you're going to throw away (Sometimes this is okay)?

I have tried making scripting languages a bunch of times too, but since my dev time now is scarce, I decided to concentrate on what I *really* wanted to do, make games. (Even then, now that I'm over language fascination for now, I'm still caught up in engine dev craze all the time)

Anyways, kind of neat. It gets really fun when you get past the basic control structures, and you add functions and static arrays. And even more fun when you run into dynamic arrays/tables, and object types since they will probably screw over your existing interpreter implementation. I've been there :D

Ya but I have all I need now for the purpose I want, and it only took 3 days to write it and integrate it into my game.
Unless I want to add "while" loops directly into the language, but I really can't see any big use for them and if I do need them, i can get the same behavior with labels and gotos.

I have a very very specific need for my scripting, which is "make the background and environment feel more dynamic". I don't need a scripting language capable of writing an entire game in, I just need something capable of describing motion.

Plus it was pretty fun to do.
Logged

xelanoimis
Level 0
**



View Profile WWW Email
« Reply #16 on: July 16, 2009, 01:15:02 AM »

One reason for NOT using Lua is if you don't like the syntax. If you're more into C/C++ it's very easy to mess up the scripts (like forgetting that Lua indexes start from 1, not from 0, and so on). However, Lua is very fast for a scripting language. Lua 5 uses registers (no longer a pure stack machine), and having a single numeric type boosts the speed a lot. Unless you want something simpler, it is very difficult to come up with a faster scripting language. But you can be close enough for the purpose of a script, even if used at runtime (like for AI).

If you want you can have a look at my GS9 script here (full sources and docs):
www.simion.co.uk/gs9
It's used with much success in my DizzyAGE engine:
www.yolkfolk.com/dizzyage

Now I'm working on the next version and I managed to improve the performance to be close to Lua (even faster in some tests).

Having your own scripting language allows you to adjust it to your needs. You can have namespaces, or direct access to game's objects (the way you want). If you feel up to it, you can do classes and inheritance. Or you can have a simple script for loading data, with minimum of arithmetic operations.

But if you're not able to implement something clean and stable you'll just have more trouble than it's worth - that's why the general advice is to use something popular and already proven.

Good luck with the project!
Alex
Logged

skaldicpoet9
Level 10
*****


"The length of my life was fated long ago "


View Profile Email
« Reply #17 on: July 16, 2009, 01:51:41 AM »

Hmm, apparently someone's avatar has a trojan embedded in it according to my Avast. The img is from onipress and is named wlljicon3.gif, could be nothing but better safe than sorry...
Logged

"Fearlessness is better than a faint heart for any man who puts his nose out of doors. The date of my death and length of my life were fated long ago."
Triplefox
Level 9
****



View Profile WWW Email
« Reply #18 on: July 17, 2009, 06:04:06 PM »

Writing languages is fun, although every language I've done uses a hand-crafted parser based around some form of state machine. I would probably stand to gain from a grammar generator. On the other hand, the stuff I've done is pretty much all data-definition languages rather than computation/logic, so my syntax never has a great need to be complex.
Logged

Mipe
Level 10
*****


Migrating to imagination.


View Profile
« Reply #19 on: July 18, 2009, 02:34:20 AM »

Quote
Hmm, apparently someone's avatar has a trojan embedded in it according to my Avast. The img is from onipress and is named wlljicon3.gif, could be nothing but better safe than sorry...

Same here.
Logged