This is the closest to my ideal language I've ever seen. Elegant and expressive syntax of Python, statically typed with automatic type inference, compiled to native code, garbage collected, macros, AST manipulation, very comprehensive "batteries included" standard library. Definitely worth a closer look!
This reminds me a lot of D. It's low-level, GCed, essentially intended as a nicer C/C++.
But I think it suffers from the same fatal flaw that D does: you have to semi-manually translate C headers that you want to use. Like D, it provides an automated tool to help, but the translated headers will inevitably lag their original counterparts.
As another commenter below noted, the var and proc thing is also redundant and annoying. However, the ability to compile to C (and, as a result, to easily cross-compile) is really nice.
Is "fatal flaw" the right term? I do not see any way to fix this. Even including a full C parser and preprocessor into the D/Nimrod compiler would not solve the problem.
Since I've written a C compiler, I know how to read C source files. The problem with automated conversion is you can do 90% of the job without any trouble, but there's that darned last 10% that doesn't map to D without some human decision making.
For example, there are the preprocessor macros. Most are straightforward, but it seems most C .h files succumb to the temptation at some point to do something wacky with them.
This is very true; however, c2nim distinguishes between #def and #define for this reason, and that helps a lot. #def means c2nim needs to expand the macro; #define means it's a macro that should be translated into a Nimrod template. Coming up with heuristics to make this #def vs. #define distinction automatically seems quite easy and might be implemented in a later version of c2nim.
Obviously, I have no perfect solution. I was initially harboring the vague hope that Nimrod would be able to avoid this problem by virtue of compiling to C.
But I do still assert that this is a "fatal flaw" for a systems programming language. When I wrap a C library in, say, Python, I am already forced to write a wrapper layer, so there's no discomfort or surprise in declaring a few additional structs or function prototypes.
For a novel systems programming language, however, whose raison d'être is to save me keystrokes over C/C++, it feels like two steps forward and one step back to have a more concise main program but lots of auxiliary handwritten header translations that have to be manually maintained.
I believe that the primary reason for the success of C++ is its easy compatibility with C, and any creators of would-be C++ replacements who fail to understand this are doomed to failure.
> I believe that the primary reason for the success of C++ is its easy compatibility with C, and any creators of would-be C++ replacements who fail to understand this are doomed to failure.
The main reasons were actually:
- Most C compiler vendors bundled C++ compilers in the beginning
- The C and C++ compiler vendors that matter are OS vendors
Any language can replace C or C++, if an OS vendor decides that it is the way to go.
Let's say Apple decides to have Objective-C as its systems language, or Microsoft decides to have C# as its systems language.
Once upon a time, Apple's system programming language was a Pascal dialect, just as an example.
Developers are always at the mercy of whichever languages are the official ones on a given platform.
This is how Microsoft is deprecating C on Windows nowadays, by leaving their C compiler at C90 level and stating C++ is the official systems language for Windows on their tooling.
> This is how Microsoft is deprecating C on Windows nowadays, by leaving their C compiler at C90 level
Actually they've recently gone back on that and are implementing C99 language[1] and standard library[2] features, in an apparent sudden realization that the rest of the C world has moved on. (Since they cite FFmpeg as a reason, my pet theory is that they were just that embarrassed by Ron Bultje's C99-to-C89 translator[3].)
Those C99 features are only at library level and are partially required by C++11 and C++14, there is no change of mind going on, at least for the moment.
> But I think it suffers from the same fatal flaw that D does: you have to semi-manually translate C headers that you want to use. Like D, it provides an automated tool to help, but the translated headers will inevitably lag their original counterparts.
I can't think of a better alternative. Can you suggest one? The fact that Nimrod compiles to C already is a huge plus, it makes wrapping C code very easy.
If the language has "A real package system" why not allow for packages written in other languages? If a C build system can look at a .h and a .lib and figure out what symbols are exported and use them then I seen no reason why a compiler for xyz could not also do that. If you reused OSS projects like Clang it may not even be a monumental effort.
In Nimrod, when you are declaring a variable with an explicit type, the syntax is:
var x, y: int
as opposed to the more concise C family way of doing it:
int x, y;
In C, there is no keyword needed to signal that this is a variable declaration. Similarly, in Nimrod, if you declare the return type of a function, you write it in addition to the "proc" keyword.
IMO, the C++0x/D way of declaring a type inferred variable/function, using the auto keyword, is superior, because it is an alternative to specifying the type, not in addition to specifying the type. The Python way, omitting the type entirely, is not really suitable for a statically typed language but at least has the advantage of brevity.
If you have type inference, this is usually not a problem. Also, the keyword-based style is much easier to parse, both for humans and compilers. Rust uses it, and it's extremely readable.
Even Herb Sutter, of C++ fame agrees:
> One of the things Go does that I would love C++ to do is a complete left-to-right declaration syntax. That is a good thing because the left-to-right makes you end up in a place where you have no ambiguities, you can read your code in a more straightforward way, which also makes it more toolable.
I prefer Pascal-style type declaration as well, but playing Devil's advocate for a moment: There's nothing that says you can't have a left-to-right type declaration syntax and have the variable name appear on the right. So something like this:
var ptr_to_array_of_ptr_to_int : *[](*int)
would become something like this:
*[](*int) ptr_to_array_of_ptr_to_int
The former syntax is slightly easier for a program to parse, makes it easier to see variable names, and is more consistent with typical syntax for type inference during variable assignment:
var something := some_function(whatever)
auto something = some_function(whatever)
However, the latter does, as xaa said, eliminate some keystrokes. I also happen to find
byte foo = 10
to read more nicely than any of these:
var foo : byte = 10
var foo := (u8)10
var foo := 10'u8
Go is a good example. It does variable declaration quite concisely while avoiding parsing ambiguity (although it falls into the same trap as Nimrod on function declaration with its "func" keyword).
Type inference can be useful, but important secondary purposes of declaring types, beyond informing the compiler what type a variable is, are as documentation to yourself, and as a sort of compile-time assertion, to ensure that the variable type is indeed what you thought it was. Thus, a language with type inference should not treat manually declared types as some kind of corner case.
Nimrod looks a lot like someone took Borland's ObjectPascal and removed the begin/end block semantics in favour of whitespace indentation.
For example, they use the "var: type" syntax; the syntax for declaring classes is nearly identical; "case [value] of" is right out of Pascal, as is the "0..2" range syntax. They refer to "procedures" as opposed to functions. They even use naming conventions from ObjectPascal: "T" as a prefix for types (eg., TChannel), "P" for the Nimrod equivalent to pointers (eg., PChannel is a "ref TChannel"), "F" for class members.
I would not be surprised if the author was an old Turbo Pascal/Delphi hand.
What's really amazing about this language is that until only recently it has been developed solely by one person in his spare time no less. It is competing with the likes of Rust and Go which have pretty big companies behind them.
Hey, guy behind Jester here (The web framework being benchmarked). Please don't let these results deter you from Nimrod. Not only is Jester alpha-grade software, but when I submitted these benchmarks I did not have much time to properly utilize Nimrod's concurrency features (I was in the middle of exams). My first submission was not parallel at all and I quickly corrected it so that multiple processes were simply spawned (with no regard to the amount of CPU cores) which is definitely not ideal and the results show this. So please don't blame Nimrod for this, blame me instead.
Hi dom96. Thanks again for contributing the Jester tests! I hope you get some time to continue your work on Jester. We'd like to have Nimrod represented properly in the results.
Are there other Nimrod web frameworks (and importantly, domain experts with those frameworks) that could be contributed for Round 7?
I'm in scientific computing so the floating-point performance is by far my biggest concern. In the tests I just ran, Nimrod and C are virtually identical in time (probably creating near-identical machine code).
test.nim:
var sum = 1.0
while sum < 1.0e10:
  sum = sum * 1.00000001
echo($sum)
test.c:
#include <stdio.h>
int main() {
    double sum = 1.0;
    while (sum < 1e10) {
        sum *= 1.00000001;
    }
    printf("%f\n", sum);
    return 0;
}
I've checked it out on your recommendation. I like the syntax even better than Nimrod's, it has better support for scientific work, and the floating-point performance is virtually identical. The only problem is that global variables are apparently not allowed to be typed as plain primitives, so this program:
s = 1.0
while s < 1.0e10
    s = s * 1.00000001
end
println(s)
Has mysteriously terrible performance. You have to use:
function test()
    s = 1.0
    while s < 1.0e10
        s = s * 1.00000001
    end
    println(s)
end
test()
I actually discovered this yesterday and spent Saturday morning learning about it. I really like most of the syntax, especially the "var/const/let" distinction together with "if/when". I sort of wish Rust and Go would just copy that. The semantics seemed to be up to par with this generation of potential C replacements, but not any better than the others.
I hate that this language looks so good. Fast benchmarks, good library, readable code, documentation. I don't want to learn another language! Now I have to …
Just when I started reading tutorials about Rust, this comes up... Looks nice, but does it also have concurrency built in (Rust tasks, Erlang's processes, message passing, etc.)?
> Beneath a nice infix/indentation based syntax with a powerful (AST based, hygienic) macro system lies a semantic model that supports a soft realtime GC on thread local heaps. Asynchronous message passing is used between threads, so no "stop the world" mechanism is necessary.
So, no. I've found the actor module, but it's based on OS threads, so it's not lightweight and not as powerful as Erlang's networked, lightweight processes. I'd really love to see that in a new language, especially since Rust and Nimrod are the first ones in a while I really like from a syntax point of view.
Erlang seems to pay a small performance penalty for that ability though. The VM is not allowed to run more than a few thousand instructions without checking for a signal to task switch.
But if that kind of guaranteed cooperation isn't needed, I'd expect that lightweight threads could be added to a language like this in a straightforward way.
Kitten, a statically typed, globally type-inferred, GC-less functional stack language, with opt-in layout-based syntax and an effect system to separate pure and impure operations.
Disclaimer: I wrote this; it’s nowhere near complete, but not bad to play around with. We’re working toward a release in a few weeks, to include some missing language features and a new x86_64 runtime.
Can I ask you one thing: why did you decide to write a stack-based programming language?
I'm really curious, as I personally see nothing good about them. The same kind of non-mathematical syntax as Lisp (`1 2 +` as opposed to Lisp's `(+ 1 2)`), without any of the goodness (code is data).
In Factor, code is data. Remember, though, that there are many sources of goodness in a language. As for your actual question, I decided to write a stack-based language because:
• They can be implemented very efficiently on real hardware. The stack-based VM is well established as an implementation technique for non–stack-based languages.
• Like in Lisp, the simple structure makes them suitable for useful visualisations beyond the program text. Static typing creates even more cool possibilities for such analysis.
• There is fertile ground for new research, and applying existing theory to these languages for the first time—particularly type and effect systems.
I agree with all of those; to me it just seems like the transformation into a stack-based language should be done as part of the compilation process, not written down by the programmer. Maybe it's just me, I totally admit that I'm not used to thinking in concatenative languages as I've never used one, but mathematics, with named variables and control- and data-flow denoted by functions or sequential lines of operations, seems the most natural notation to read, write and think in.
I totally understand. In fact, Kitten has named variables for that reason. You can write code that looks imperative, expressiony, or dataflowy, according to taste. Here’s a silly example:
// Implicitly thread state between functions.
def promptInt:
  ": " cat prompt
  readInt
  fromSome

// Write stacky code if you want.
def squareDiff:
  - dup *

// Use locals as much as you need to.
"x0" promptInt ->x0
"y0" promptInt ->y0
"x" promptInt ->x
"y" promptInt ->y
x0 x squareDiff ->a
y0 y squareDiff ->b
a b + sayInt
I've always felt that had I discovered Euphoria earlier in my programming life, I would have really liked it. These days I'd miss all the features more recent languages have borrowed from Lisp and/or ML, but it would have been an awesome second or third language (after BASIC and C).
I discovered Euphoria back in the DOS days when it was still commercial. I agree that it seems a little dated these days but I still fondly remember it.
Also it is still a great language for non-professional programmers and still has an active community there. Note the recently updated SDL2, SFML2, and Allegro bindings for example.
For an interesting systems programming language try ATS [1]. It's an ML style functional programming language with dependent types, linear types, macros, generics and compiles to C giving reasonable portability.
Reading the tutorial is indeed very pleasant. I have to say, however, that from such a new language I would ask for more correct string handling, as opposed to the let-me-sidestep-encoding-issues-and-still-call-it-UTF8 approach that's going on.
Defining characters as 8 bit is not quite correct (you will get char variables holding a fraction of a character), treating strings as arrays of characters by conversion also isn't (now you have an array of partial characters) and ignoring the encoding (just assuming UTF8) is a sure cause for annoying bugs - something the rest of the language goes great lengths to avoid.
I've only read the tutorial so far. Maybe the library fixes some of the issues - we'll see. The thing is just that if you start adding metadata to your string types (they do encode the length - in what? Characters? Partial characters? Bytes?), then why not also at least add an encoding (to help programmers not mix strings of different encodings)? Why do everything right and then weasel out when you get to strings?
Sorry, I'm totally obsessive where string encodings are concerned :-)
Strings should be converted to UTF-8 as part of input validation. For proper input validation we have the taint mode already. Note that often a file does not include any information about the encoding, so you can only use heuristics.
I've been tempted by this one several times. I may have to play around with it and SDL2.
One thing that gets me, though, is the var keyword in a statically typed language. Why say "var thing: string" instead of "string thing"?
I think that consistent, explicit declaration of types is much cleaner and easier to read than this recurring mixture of type inference and annotation.
One point in favor of inference is that it gives you less to read, and what the compiler can infer, the reader usually can, too. On the other side, you could say you shouldn't have to infer while you read and having types in front of you reduces cognitive load.
I bet the folks building these languages with inference wouldn't mind having fewer annotations (and being more consistent, in a sense) if they could. It's just that to get rid of some of these annotations you have to make deep changes to the language, like OCaml's polymorphic functions, and that may be inconsistent with their design goals or just hard.
Obviously folks won't agree on whether inference is a great thing or not. Whatever one's used to usually seems easier to read. I've worked more in dynamically typed languages than statically typed ones, and (so) languages with inference feel more like home.
In this language, I don't think it makes sense. (Then again, I think little of this language makes sense. Neat hack, but still in the "if I want GC I'll use the JVM" bucket that a lot of these fall into.)
In something like Scala, you can end up with some fairly hairy eyesores of variable declarations, and the type is often not that important when you and everyone else have the ability to hover over a variable name and see the computed type immediately. Wouldn't work for a vim-driven workflow, but works well in an IDE.
There was something similar to this about two weeks ago here, with vars and lambdas in C99. It was less of a full-blown language and more just functions and preprocessor definitions to be used in C. I am having trouble finding it now. Does anyone remember the name of this other project? I'd like to check these both out.
Unfortunately the documentation still states that at this point I can choose either threads or a dynamically linked run time library. For an embedded system, static linking isn't much of an option because there are real memory constraints.
It's a shame that a lot of these new languages focus on creating a really great language, but don't seem to give as much time to the practical usage of the language (for example, GHC did not support cross-compilation at all until very recently).
Ironing some of those fundamental issues out before adding some of the fancier features could probably really boost interest in the language and programmer uptake.
On the Clay thread a while back, people liked my list of systems-oriented languages that I've been compiling for a while. I can't do a good copy-paste-and-edit job on my iPad, so I'll just link to the thread here: https://news.ycombinator.com/item?id=6117456 . A bunch of other people came in with updates and other entries for the list. Included were of course Clay, Nimrod, Rust, and D, but also Deca, BitC, ATS, Habit (a Haskell dialect), and others.
I don't know anything about systems languages, but it seems that there is a lot of effort being put into C/C++ alternatives, and many bold efforts at that. But maybe that is just historical bias, because so few remember the failed (popularity-wise) systems-language efforts of yesteryear?
Before C became mainstream, Modula-2 and many Pascal dialects like Apple Pascal and Turbo Pascal could already fulfil the same role on 16-bit systems, with better type safety and Go-like compile times.
However they lacked the bundling to a commercial OS like UNIX, hence C grew in importance, because developers tend to use the languages supported by the OS vendors.
Nobody can explain to a dedicated C++ programmer why a new language is better than C++. I have tried for more than a decade many times and all I have ever gotten across has apparently been a "wah wah wah" noise like the adults in Peanuts.
Because C++ is multi-paradigm, all paradigms are possible within it (although they may not be syntactically easy or have reasonable error messages, etc. etc.). Therefore, either C++ is obviously the best language (to its partisans) or it is a hellish agglomeration of mutually contradictory confusing crap that takes years to begin to understand (to its detractors). Unfortunately, those two camps each find their position obvious, and seem to be unable to communicate with one another.
I dunno - I have about 15 years C++ experience, and if anything the better I have become at the language the less I like it...
Having said that any language with a GC is unlikely to supplant C++. I would have thought that anything that could get away with using GC is already not using C++. Ergo what will finally kill C++ in all its final domains is IMO some sort of better C. Maybe Rust, if they sort out their non-mandatory GC.
I was hoping for something more constructive, like "these 3 Nimrod features address programming problems much better than C++". I skimmed through the tutorial (both parts) and failed to see anything extraordinary.
If you're using namespace std then it's obvious (note that this is considered bad practice), assuming you haven't overridden <<, but otherwise it depends on endl's type and value. Maybe you meant std::endl?
The fact that the C++ preprocessor deals with that trigraph case inside quote marks does not have anything to do with what foobarbazqux said about std::endl.
The problem is that many old-timers, like myself, who have known C++ since its origins are quite comfortable with it, despite all its warts.
Nowadays I spend my consulting work in Java and .NET land, and actually I favour the Wirth family of languages over C and C++.
However, C and C++ have a big tooling support in the industry, and history shows system programming languages are only successful when backed by OS vendors.
Exactly, writing code in the same language as the OS (and their source examples) is generally the best approach ... based on what you're doing of course.
Ugh, the community is barely alive and it's an unwieldy collection of non-orthogonal constructs. Generics and a few whiz-bang features don't paper over the sins upon sins. I see this project cutting back features or dying under the weight of crushing unmaintainability. (sad.)
>> ...unwieldy collection of non-orthogonal constructs...
>> ...doesn't paper over the sins upon sins...
Please elaborate. I didn't see any of that and honestly don't know what "non-orthogonal constructs" means. I'm no language design expert but would like to know what, to you, are the obvious deficiencies.
Non-orthogonality: Multiple features that accomplish nearly the same result in different ways.
For a language maintainer, each such feature is another piece that will have to be supported, debugged, and documented.
The other problem for users is that it gives a sense of confusion and uncertainty (Paradox of Choice) about the correctness of a program, which may lead to multiple refactorings that don't add any value. Also, more features require more learning and make programs harder to understand.
The goals of any good language should be correctness and understandability. Arguably, Python learned the lessons from Perl. But there are many different languages, and some are better in different instances.
What Nimrod turns into will be hard to say, but Go and Erlang need more competition.
I don't understand why Nimrod has to mostly look like Python, but then make all kinds of innovations in syntax and formatting which have nothing to do with performance.
if yes("Should I delete all your important files?"):
  echo("I'm sorry Dave, I'm afraid I can't do that.")
else:
  echo("I think you know what the problem is just as well as I do.")
What would be really killer is if it could have bindings to Java. Then one could take advantage of the JVM being present everywhere w/o having to do a separate compilation for every platform.
Note that Nimrod compiles to C, and your JVM may itself be implemented in C or C++. So for the measly gain of not having to run a command on the destination platform (nimrod c name.nim), you are bringing in the runtime performance drawbacks of the whole JVM and requiring a further layer of abstraction.
Plus, bytecode doesn't mean a program will run at all: look at all the Java programs which run on Android without modification... oh, wait.
I'm in CompBio and the sense I get is that it's a lot easier to get someone to try your tool if all you have to do is provide them the jar which they just have to click on and it runs. Whereas with compilation, you first have to make sure you have all the platforms covered and then force the potential user to select the right executable for the platform.
I don't think forcing potential users to compile a program is going to lower the barrier much. A lot of the users could be biologists, and they would likely not know how to compile the source. I do agree that the software should be open source.
In my experience, biologists are allergic to the command line. They need web interfaces.
In the case where they don't mind using the command line, providing Linux x64 binaries (along with the source) is plenty sufficient.
Note that I'm not saying they shouldn't distribute the jar if that's the kind of game they want to play. I'm saying they shouldn't restrict themselves to distributing a jar.
> I do agree that the software should be open source.
Me too. But it's not an ideological point. I wasn't kidding when I said that, in all probability, I'll have to fix whatever software I'm using. Or at the very least, read the source code to understand what it is that they've implemented. (You'd think it'd be clear from the methods section in their paper...)
Err, why? It's fairly easy, and you get so much for free: portability, compiler optimizations, easy use of C libraries, easy interfacing with other languages...
One good argument I can think of would be compile time. C compiles slowly compared to almost every other language that isn't C++. And of course you get the C compile time on top of the compile-to-C time, so I expect the Nimrod compile chain to be quite slow.
Doesn't mean that I am generally opposed to compiling to C, it's often a good trade-off.