Author Topic: Creating a new programming language from scratch?  (Read 2157 times)
Siegfreide
Level 0
***



View Profile
« on: April 11, 2014, 03:28:17 PM »

I'd like to know if anyone has any suggestions regarding how someone would go about writing a new language from scratch, then using this language to create its own compiler.
 
I've been entertaining the idea of writing my own language, mainly because I find pretty much every other language unreadable in terms of syntax choices, and overall visual representation of the code.
 
My own experience with coding is limited, with most of my experience being in GML (GameMaker Language), and I've started picking up Java. I was really impressed with GML's syntax and loose variable declaration, but to my knowledge it can't be used to create general computer software. When I started Java, however, I was completely turned off by its formatting requirements and syntax.

Some of the parts of other languages I would like to include:
  • The high performance of C++
  • The portability of Java
  • The ease-of-use of GML

Basically, I would like to create an easily read all-purpose language. I'd like to try to combine all of the "pros" of the most popular languages, while eliminating as many of the "cons" as possible. Since I'm already being extremely ambitious, I might as well make it a "One language to rule them all" kind of thing.
« Last Edit: April 11, 2014, 03:35:26 PM by Siegfreide » Logged
jgrams
Level 3
***



View Profile
« Reply #1 on: April 11, 2014, 04:21:23 PM »

As someone who has been messing around with making toy programming languages for 18 years or so, I would say... "DOOOOM! Run while you still can!" :P

  • The high performance of C++
  • The portability of Java
  • The ease-of-use of GML

It's called Python. ;)



More seriously, be aware that:

Programming language implementation and theory is a huge field, both broad and deep. Getting an overview of what we already know about how to do (and how not to do) programming languages is probably a full-time job for a year or three.

Most of the modern language implementations have hundreds of man-years of work put into them.

So you're not going to build something amazing on your own.

You might, however, get somewhere building on existing work.

LLVM seems to be more and more the back-end of choice.

I've also heard good things about PyPy's infrastructure for building good VM/JIT systems, though I believe it's almost completely undocumented.

The VPRI project has just completed a six (?) year research project into the fundamentals of computing and how to build large amounts of functionality in small amounts of code, and they have produced a bunch of stuff that would be pretty interesting to look into and possibly use as a starting point.

But it may well make sense to start with a custom, naive, slow implementation that makes it easy to play around with language features, and then start optimizing once the language starts settling down (if it ever does).
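
To make the "naive slow implementation" idea concrete, here is a minimal sketch of a tree-walking interpreter in Python. It assumes the program has already been parsed into nested tuples; the node shapes and the `evaluate` name are invented for illustration and do not come from LLVM, PyPy, or VPRI.

```python
import operator

# Binary operators supported by this toy language (an illustrative choice).
BINARY_OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
    "<": operator.lt,
}


def evaluate(node, env):
    """Walk the tree recursively; slow, but trivial to extend."""
    kind = node[0]
    if kind == "num":        # ("num", 3)
        return node[1]
    if kind == "var":        # ("var", "x")
        return env[node[1]]
    if kind == "let":        # ("let", "x", value_expr)
        env[node[1]] = evaluate(node[2], env)
        return env[node[1]]
    if kind == "binop":      # ("binop", "+", left, right)
        left = evaluate(node[2], env)
        right = evaluate(node[3], env)
        return BINARY_OPS[node[1]](left, right)
    if kind == "if":         # ("if", cond, then_expr, else_expr)
        branch = node[2] if evaluate(node[1], env) else node[3]
        return evaluate(branch, env)
    if kind == "block":      # ("block", expr, expr, ...)
        result = None
        for child in node[1:]:
            result = evaluate(child, env)
        return result
    raise ValueError(f"unknown node kind: {kind!r}")


# x = 4; y = x * 10; if x < y then y + 2 else 0
program = (
    "block",
    ("let", "x", ("num", 4)),
    ("let", "y", ("binop", "*", ("var", "x"), ("num", 10))),
    ("if", ("binop", "<", ("var", "x"), ("var", "y")),
        ("binop", "+", ("var", "y"), ("num", 2)),
        ("num", 0)),
)
result = evaluate(program, {})
```

Adding a language feature here means adding one more `if kind == ...` branch, which is what makes this style handy while the design is still moving.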



Honestly, though, even if you're seriously dedicated and smart and all those other things, if you're starting from knowing nothing you're looking at two to three years of work (at least) before you get to the point of having a language that is good enough to start attracting a community at all.

So...I dunno. If it's really what you want to do, I don't want to discourage you too much. Designing and implementing a language will probably eat up large chunks of your life, and you are extremely unlikely to produce something that anyone but you wants to use, but it's also an incredibly fascinating endeavor, and you'll learn a lot.

I'd be happy to Skype or something if you want to talk about all the bits and pieces I've picked up over the years. I'm very much an amateur, but I read a lot of stuff, and I've built a lot of little experiments.

--Josh
Logged
Siegfreide
Level 0
***



View Profile
« Reply #2 on: April 11, 2014, 05:18:48 PM »

Taking your suggestion of looking into VPRI, I found this: http://www.vpri.org/pdf/tr2007008_steps.pdf

I've only read a few paragraphs, but it's a very interesting read so far, and it definitely has one of my goals in mind: a more expressive language that takes fewer lines to write a complete program.

I'm certainly aware I'm not going to create anything monumental on my own lol. But it's my hope that, once I lay down the general framework and basic syntax of the language, I might be able to generate some interest in others and get a little collaboration going. A long shot, of course. But I think it's a worthy goal and a good use of my time, if only for the learning experience like you mentioned.

If you have some know-how and knowledge in this type of development, I'd definitely enjoy the opportunity to glean any info about it from you. I'll send you a PM.
Logged
Average Software
Level 10
*****

Fleeing all W'rkncacnter


View Profile WWW
« Reply #3 on: April 11, 2014, 05:23:44 PM »

I've written compilers for college, and I've also been dabbling in a language of my own design for many years (all generic, all the time!).

Compilers are, as has been pointed out, not trivial to write, and languages are much harder to design than you might expect.  One thing to consider is that compilers can generally be broken down into semi-independent parts and you may want to start by learning how to write some of those smaller parts.

In particular, knowing how to write a general language parser is immensely valuable.  I can't think of a single significant project that I've worked on that didn't involve writing a language parser at some point.
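
As a rough illustration of two of those semi-independent parts, here is a sketch of a tokenizer and a recursive descent parser for arithmetic expressions in Python. The grammar, the tuple-based tree, and all names are assumptions made up for this example, not taken from any particular compiler.

```python
import re

# A number, or any other single character (whitespace is skipped first).
TOKEN_RE = re.compile(r"\s*(?:(\d+\.?\d*)|(.))")


def tokenize(text):
    """Split source text into ("num", value) and ("op", char) tokens."""
    tokens = []
    for number, op in TOKEN_RE.findall(text):
        if number:
            tokens.append(("num", float(number)))
        elif op.strip():
            tokens.append(("op", op))
    return tokens


class Parser:
    """Recursive descent: one method per grammar rule.

    expr   := term (("+" | "-") term)*
    term   := factor (("*" | "/") factor)*
    factor := NUMBER | "(" expr ")" | "-" factor
    """

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        if self.pos < len(self.tokens):
            return self.tokens[self.pos]
        return ("eof", None)

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def expr(self):
        node = self.term()
        while self.peek() in (("op", "+"), ("op", "-")):
            op = self.advance()[1]
            node = (op, node, self.term())
        return node

    def term(self):
        node = self.factor()
        while self.peek() in (("op", "*"), ("op", "/")):
            op = self.advance()[1]
            node = (op, node, self.factor())
        return node

    def factor(self):
        kind, value = self.advance()
        if kind == "num":
            return value
        if (kind, value) == ("op", "-"):
            return ("neg", self.factor())
        if (kind, value) == ("op", "("):
            node = self.expr()
            if self.advance() != ("op", ")"):
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token: {value!r}")


def parse(text):
    """Tokenize and parse a complete expression into a nested tuple tree."""
    parser = Parser(tokenize(text))
    tree = parser.expr()
    if parser.peek()[0] != "eof":
        raise SyntaxError("trailing input")
    return tree


tree = parse("1 + 2 * (3 - 4)")
```

Operator precedence falls out of the rule nesting: `term` binds tighter than `expr` because `expr` calls `term`, so `2 * (3 - 4)` ends up as a subtree of the addition.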
Logged



What would John Carmack do?
J-Snake
Level 10
*****


A fool with a tool is still a fool.


View Profile WWW
« Reply #4 on: April 11, 2014, 05:37:49 PM »

  • The high performance of C++
  • The portability of Java
  • The ease-of-use of GML
Those are contradictory goals. You will end up making compromises like every language does.
Logged

Independent game developer with an elaborate focus on interesting gameplay, rewarding depth of play and technical quality.

Trap Them: http://store.steampowered.com/app/375930
Siegfreide
Level 0
***



View Profile
« Reply #5 on: April 11, 2014, 05:43:49 PM »

I've written compilers for college, and I've also been dabbling in a language of my own design for many years (all generic, all the time!).

Compilers are, as has been pointed out, not trivial to write, and languages are much harder to design than you might expect.  One thing to consider is that compilers can generally be broken down into semi-independent parts and you may want to start by learning how to write some of those smaller parts.

In particular, knowing how to write a general language parser is immensely valuable.  I can't think of a single significant project that I've worked on that didn't involve writing a language parser at some point.

Might you be able to point me in the general direction of a tutorial or book that would be a useful resource in breaking down and reconstructing compilers, kind sir? :)

Also, if it's not proprietary information, would you mind if I take a look over your own created language, and maybe clue me in to important lessons you had to learn the hard way? :P
Logged
Siegfreide
Level 0
***



View Profile
« Reply #6 on: April 11, 2014, 05:48:09 PM »

  • The high performance of C++
  • The portability of Java
  • The ease-of-use of GML
Those are contradictory goals. You will end up making compromises like every language does.

Hmmm... Could you elaborate a bit further? I'm not aware of how they could be contradictory. I have limited knowledge of the subject, but I thought that C++ had higher performance than Java because of its manual memory management, that Java owed its portability to its virtual machine, and that GML... Well, it's just plain easy to read and understand lol. Am I wrong in thinking that these are not mutually exclusive?
Logged
Cheesegrater
Level 1
*



View Profile
« Reply #7 on: April 11, 2014, 05:57:50 PM »

The standard book on compilers and language design is 'Compilers: Principles, Techniques, and Tools' by Aho et al. It's complicated and long (1000 pages), but indispensable if you really want to understand every aspect of how compilers work.
Logged
Siegfreide
Level 0
***



View Profile
« Reply #8 on: April 11, 2014, 06:11:56 PM »

Thanks for the recommendation Cheesegrater, I'll look into that. Is that book more regarding theory of compilers, or is it based around an actual compiler that is dissected and studied in-depth?
Logged
Sqorgar
Level 0
**


View Profile
« Reply #9 on: April 11, 2014, 06:19:01 PM »

I wouldn't bother to write a completely new programming language - at least, not without a LOT of experience under my belt. It is a non-trivial problem, and I imagine that it would be one's lifelong goal rather than a fun project to do.

However, it can be useful to design "domain specific languages" (or DSLs) which take care of specific tasks using a more expressive grammar. For instance, I created a LISP-based language to define procedural components for random dungeon generation. LISP is trivial to parse (but not a language I would do a full game in) and its natural tree structure benefits procedural design directly.
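
To show how little code "trivial to parse" can mean, here is a sketch of an S-expression reader in Python that turns text into nested lists. The dungeon description at the bottom is a made-up example, not the format of the LISP-based language described above.

```python
def read_sexpr(text):
    """Parse one S-expression into nested Python lists, ints, and strings."""
    # Padding the parentheses with spaces lets split() do all the lexing.
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        token = tokens[pos]
        pos += 1
        if token == "(":
            items = []
            while pos < len(tokens) and tokens[pos] != ")":
                items.append(read())
            if pos >= len(tokens):
                raise SyntaxError("missing ')'")
            pos += 1  # consume the closing ")"
            return items
        if token == ")":
            raise SyntaxError("unexpected ')'")
        try:
            return int(token)
        except ValueError:
            return token  # anything that is not an integer is a symbol

    tree = read()
    if pos != len(tokens):
        raise SyntaxError("trailing input")
    return tree


# A hypothetical room description, purely for illustration.
dungeon = read_sexpr("(room (size 8 6) (exits north east) (spawn goblin 3))")
```

The result is already a tree of lists, so a generator can walk it directly, for example by dispatching on the first element of each list.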

The advantage of DSLs is that you are basically creating a data format rather than a programming language, so you don't have to worry about taking your syntax tree and compiling it to assembly language. With LISP, you can evaluate the syntax tree directly. With something like FORTH, you don't even need a syntax tree. You can basically evaluate it at the lexical analysis phase!
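
The FORTH point can be sketched in a few lines of Python: each word is executed as soon as it is split off the input, with no tree in between. This is a toy under stated assumptions; the word set is invented for the example, and real Forth systems also have a compile mode and user-defined words that are left out here.

```python
def run_forth(source):
    """Evaluate whitespace-separated words left to right on a data stack."""
    stack = []

    def binary(fn):
        # Wrap a two-argument function as a stack word.
        def word():
            b = stack.pop()
            a = stack.pop()
            stack.append(fn(a, b))
        return word

    def dup():
        stack.append(stack[-1])

    def swap():
        stack[-1], stack[-2] = stack[-2], stack[-1]

    def drop():
        stack.pop()

    words = {
        "+": binary(lambda a, b: a + b),
        "-": binary(lambda a, b: a - b),
        "*": binary(lambda a, b: a * b),
        "dup": dup,
        "swap": swap,
        "drop": drop,
    }

    for token in source.split():      # the "lexer" is just split()
        if token in words:
            words[token]()            # known word: execute it immediately
        else:
            stack.append(int(token))  # otherwise treat it as an integer literal
    return stack


# 3 squared plus 4 squared
result = run_forth("3 dup * 4 dup * +")
```

Compare this with a LISP-style evaluator, which needs the nested list structure first and then recurses over it.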

It was a lot of fun and really not as difficult as I initially suspected it would be, but it required a lot of reading. I read several books on compiler design and DSLs before I could make heads or tails out of the Dragon Book - which is as definitive as everyone says, but dense with mathspeak. And I can program in two dozen different languages, which was EXTREMELY helpful when trying to understand compiler design. Honestly, this stuff makes me love programming so much more, but one does not simply walk into the Dragon Book on a whim.
Logged
J-Snake
Level 10
*****


A fool with a tool is still a fool.


View Profile WWW
« Reply #10 on: April 11, 2014, 06:33:25 PM »

There are several factors which contradict your integration demands. You cannot have high abstraction and demand the same speed you could achieve with lower level "hacking". Also you cannot have the ease of use of GML and just integrate it into the clean object oriented philosophy of Java.
There is a reason why Java makes very large and complicated projects possible.

If you are serious about it, though, and have a good university around you, I recommend attending lectures, starting with an introduction to state machines and formal systems. Then take the lectures about compiler construction. Otherwise just google for well-rated books on the subject.
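
For a taste of what the state machine material covers, here is a sketch of a deterministic finite automaton in Python that recognizes decimal literals such as "42" or "3.14", the kind of machine a lexer is built from. The state names and the table layout are choices made for this example only.

```python
# Transition table: (current state, input class) -> next state.
# States: "start", "int" (accepting), "dot", "frac" (accepting).
TRANSITIONS = {
    ("start", "digit"): "int",
    ("int", "digit"): "int",
    ("int", "dot"): "dot",
    ("dot", "digit"): "frac",
    ("frac", "digit"): "frac",
}
ACCEPTING = {"int", "frac"}


def classify(ch):
    """Map a character to the input class used by the transition table."""
    if ch in "0123456789":
        return "digit"
    if ch == ".":
        return "dot"
    return "other"


def accepts(text):
    """Run the DFA over text; accept only if it ends in an accepting state."""
    state = "start"
    for ch in text:
        state = TRANSITIONS.get((state, classify(ch)))
        if state is None:  # no transition defined: reject immediately
            return False
    return state in ACCEPTING
```

Under these rules "3.14" is accepted, while "3." and ".5" are rejected because no accepting state is reached.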
Logged

InfiniteStateMachine
Level 10
*****



View Profile
« Reply #11 on: April 11, 2014, 07:03:47 PM »

There are several factors which contradict your integration demands. You cannot have high abstraction and demand the same speed you could achieve with lower level "hacking".

I would have agreed with you up until recently. There are starting to be some cases (concurrency related) where functional languages are beating low level languages in real world performance. This is partly due to ease for the programmer, and partly due to compiler optimizations afforded by a language that is immutable by default. Now it's up to you to decide where a functional language sits on the high-low spectrum. The language I'm specifically citing is Haskell.


As for the OP: take it a step at a time. Take Average Software's advice and start by looking into parsing. In games it comes up countless times. My first experience was writing my own data definition language, which was a shameless ripoff of Valve's KV system. I used it for years before I discovered XML :(
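
As an idea of the scale of such a first parsing project, here is a sketch in Python of a parser for a KV-like format with quoted keys, quoted values, and brace-delimited nested blocks. It is an assumption-laden approximation: comments, escape sequences, and unquoted tokens are not handled, and the names `tokenize_kv` and `parse_kv` are invented for the example.

```python
import re

# Tokens: a double-quoted string, or a single brace.
KV_TOKEN_RE = re.compile(r'"([^"]*)"|([{}])')


def tokenize_kv(text):
    """Return ("str", value) and ("brace", char) tokens; other text is skipped."""
    tokens = []
    for match in KV_TOKEN_RE.finditer(text):
        if match.group(2):
            tokens.append(("brace", match.group(2)))
        else:
            tokens.append(("str", match.group(1)))
    return tokens


def parse_kv(text):
    """Parse a KV-like document into nested dicts of strings."""
    tokens = tokenize_kv(text)
    pos = 0

    def parse_block(top_level):
        nonlocal pos
        result = {}
        while pos < len(tokens):
            kind, value = tokens[pos]
            pos += 1
            if (kind, value) == ("brace", "}"):
                if top_level:
                    raise SyntaxError("unexpected '}'")
                return result
            if kind != "str":
                raise SyntaxError("expected a quoted key")
            key = value
            if pos >= len(tokens):
                raise SyntaxError(f"key {key!r} has no value")
            kind, value = tokens[pos]
            pos += 1
            if kind == "str":
                result[key] = value                          # "key" "value"
            elif value == "{":
                result[key] = parse_block(top_level=False)   # "key" { ... }
            else:
                raise SyntaxError(f"key {key!r} has no value")
        if not top_level:
            raise SyntaxError("missing '}'")
        return result

    return parse_block(top_level=True)


config = parse_kv('''
"player"
{
    "name"   "hero"
    "health" "100"
    "weapon"
    {
        "type" "sword"
    }
}
''')
```

All values stay strings here; converting "100" to a number is left to the code that consumes the data.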
Logged

Cheesegrater
Level 1
*



View Profile
« Reply #12 on: April 11, 2014, 07:15:04 PM »

Is that book more regarding theory of compilers, or is it based around an actual compiler that is dissected and studied in-depth?

Mostly theory, with some examples. I doubt you'll find a deconstruction of a real language compiler in book or tutorial form. Maybe a simple toy language, but I don't know of any off the top of my head.

I believe O'Reilly's book on yacc includes a chapter about parsing C, but yacc only generates the parser stage of a compiler. There's still a lot (a lot) more to know than what goes into the parser, especially if you're thinking about virtual machines and JIT like Java uses.
Logged
J-Snake
Level 10
*****


A fool with a tool is still a fool.


View Profile WWW
« Reply #13 on: April 11, 2014, 07:39:56 PM »

There are starting to be some cases (concurrency related) where functional languages are beating low level languages in real world performance.
Depends on how general the low level compiler is. For most projects it is certainly not an issue though. I am using Java. If Java couldn't keep up with my performance demands I wouldn't use it.
Logged

InfiniteStateMachine
Level 10
*****



View Profile
« Reply #14 on: April 11, 2014, 08:27:23 PM »

There are starting to be some cases (concurrency related) where functional languages are beating low level languages in real world performance.
Depends on how general the low level compiler is. For most projects it is certainly not an issue though. I am using Java. If Java couldn't keep up with my performance demands I wouldn't use it.

You switched from C# to Java? May I ask why?
Logged

pelle
Level 2
**



View Profile WWW
« Reply #15 on: April 11, 2014, 10:40:23 PM »

In practice, writing a DSL or using an existing language that does most of what you want will always be the best choice.

But I always thought that designing a programming language at some point is something a programmer should do. Not that I have, but I always have a few on my possible-TODO-list for future projects.

One reaction to the goals given above is that Java isn't really that portable. Go beyond a few mainstream platforms and it will be a lot easier to run something written in Python or C or C++ or JavaScript. Technically the Java bytecode is more portable than C or C++ of course, so depending on the application it might win over C, but in practice you almost always have to build installers and some platform-specific glue for Java anyway, so an extra compilation step added for each new platform makes no big difference.
Logged
nikki
Level 10
*****


View Profile
« Reply #16 on: April 12, 2014, 01:27:03 AM »

How to create your own freaking awesome programming language
Logged
Eigen
Level 10
*****


Brobdingnagian ding dong


View Profile WWW
« Reply #17 on: April 12, 2014, 03:05:20 AM »

Have you really exhausted all the available languages out there, such that you need to roll your own? I doubt it.

What about the D language?
Logged

Average Software
Level 10
*****

Fleeing all W'rkncacnter


View Profile WWW
« Reply #18 on: April 12, 2014, 09:55:35 AM »

Thanks for the recommendation Cheesegrater, I'll look into that. Is that book more regarding theory of compilers, or is it based around an actual compiler that is dissected and studied in-depth?

In practice, the theory is nearly impossible to separate from the implementation, unless you're writing a bad compiler, I guess.

Also, if it's not proprietary information, would you mind if I take a look over your own created language, and maybe clue me in to important lessons you had to learn the hard way?

It's not far enough along to have anything to show, just notes and stuff in my head, really.  However, one of my current game projects has its own simple scripting language.  It's not a "programming language" in any real sense, but the code does have a complete parser and tokenizer.  It's all GPL, so feel free to look at it.  Source is included in all the downloads.
Logged



RandyGaul
Level 1
*

~~~


View Profile WWW
« Reply #19 on: April 12, 2014, 10:21:07 AM »

Maybe this is mostly humorous, but might add a little perspective on the topic for someone reading: http://colinm.org/language_checklist.html
Logged