Wednesday, June 1, 2011

Hello World: An update on LLVM

When learning programming, the most common first program one writes is "Hello World" which basically just displays that message. It's a small but significant milestone to be able to produce this simple message. When we were working on adding support for Windows to Real Studio, it was exciting to see "Hello World" on Windows 98! We experienced the same excitement during the development of support for Mac OS X, Linux and the web.

We reached a similar milestone today with our work to replace the backend of our compiler with LLVM. First, a quick summary of how our compiler works and how LLVM will be an important part of the future of our compiler:


When you choose Run or Build in Real Studio, all of the code you wrote is passed to the Real Studio compiler, which is represented by the items in grey in this diagram. The front end is the part of the compiler that understands the Realbasic syntax. It converts your code to a sort of meta-assembly language. That meta-code is then passed to the compiler backend, which turns it into x86 assembler that your computer can understand. All this compiled code is then passed to the linker, which puts it all together and produces the actual application you can distribute. Of course all of this is hidden behind the scenes, but it's helpful (or at least mildly interesting) to understand how it works.
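To make the three stages concrete, here is a toy sketch in Python. It is not the Real Studio compiler: the function names, the tuple-based meta-code, and the x86-flavoured output lines are all invented for illustration. It only shows the shape of the pipeline, with the backend passed in as a parameter so that swapping it leaves the front end untouched.

```python
from typing import Callable, List, Tuple

# Hypothetical meta-code instruction: (opcode, operand).
MetaOp = Tuple[str, str]


def front_end(source: str) -> List[MetaOp]:
    """Toy front end: understands only lines of the form  Print "text"."""
    ops: List[MetaOp] = []
    for line in source.splitlines():
        line = line.strip()
        if not line:
            continue
        keyword, _, rest = line.partition(" ")
        if keyword != "Print":
            raise SyntaxError(f"unsupported statement: {line}")
        ops.append(("PUSH_STR", rest.strip().strip('"')))
        ops.append(("CALL", "print"))
    return ops


def x86_backend(ops: List[MetaOp]) -> List[str]:
    """Toy backend: maps each meta-op to an x86-flavoured text line."""
    out: List[str] = []
    for opcode, operand in ops:
        if opcode == "PUSH_STR":
            out.append(f'push "{operand}"')
        elif opcode == "CALL":
            out.append(f"call _{operand}")
        else:
            raise ValueError(f"unknown meta-op: {opcode}")
    return out


def link(objects: List[List[str]]) -> str:
    """Toy linker: concatenates compiled pieces into one listing."""
    return "\n".join(line for obj in objects for line in obj)


def build(source: str,
          backend: Callable[[List[MetaOp]], List[str]] = x86_backend) -> str:
    """Front end -> backend -> linker; only `backend` changes per target."""
    return link([backend(front_end(source))])


print(build('Print "Hello World"'))
```

Replacing the backend, in this picture, means passing a different function as `backend` while `front_end` and `link` stay as they are.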

We have written two different backends for the compiler: one that produces x86 assembler and the other that produces PowerPC assembler. The latter is no longer supported since Apple has stopped making Macs using PowerPC processors.

LLVM is an open-source compiler backend that has a lot of advantages over the ones we have written:
  1. It's an optimizing compiler which means it can make your code run faster.
  2. It does dead-code stripping which means it can leave out the parts of the Realbasic framework that your project isn't using resulting in a smaller application.
  3. It may enable us to return to a single-file executable on Windows.
  4. It supports the ARM processor used in mobile devices such as iPhone/iPad.
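As a rough illustration of item 2, dead-code stripping can be thought of as a reachability check: start from the program's entry point, follow every call, and drop whatever is never reached. The sketch below is a conceptual model in Python, not how LLVM or any linker actually implements it, and the function names in the call graph are made up.

```python
from typing import Dict, List, Set


def reachable(call_graph: Dict[str, List[str]], entry: str) -> Set[str]:
    """Return every function reachable from `entry` by following calls."""
    seen: Set[str] = set()
    stack = [entry]
    while stack:
        fn = stack.pop()
        if fn in seen:
            continue
        seen.add(fn)
        stack.extend(call_graph.get(fn, []))
    return seen


# Hypothetical framework: each function lists the functions it calls.
call_graph = {
    "main": ["print"],
    "print": ["write_stdout"],
    "write_stdout": [],
    "draw_window": ["write_stdout"],  # never called from main
    "open_socket": [],                # never called from main
}

kept = reachable(call_graph, "main")
stripped = sorted(set(call_graph) - kept)
print("kept:", sorted(kept))
print("stripped:", stripped)
```

In this made-up example a console "Hello World" never reaches `draw_window` or `open_socket`, so those parts of the framework could be left out of the final application.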
Last year we took the first step of making RBScript compile with LLVM. The next step is to use LLVM in place of our own backend for building entire applications. We have spent a bit of time on this next step recently. Today we were able to compile and run a "Hello World" console app and a desktop app as well. This is a significant milestone but it's only the first of many.

There is still a lot of work to do before you will be building your projects with LLVM, but reaching this "Hello World" milestone is an important step.

EDIT: For the sake of clarity, it's possible that some issue might prevent us from achieving single-file executables for Windows. So I have changed "will" to "may" in item #3 above. Also, I have removed mention of Android from item #4 because LLVM doesn't help (or hinder) us when it comes to supporting Android.

23 comments:

Mark said...

That's good to hear. Excellent stuff. Onwards and upwards...! :-)

shahid said...

Thanks for update, we wish you all the luck. Bring it to iOS and see the revolution..

Paul said...

Do I take it from the embedded PowerPC comment that you will be dropping PPC support once LLVM is the compiler du jour?

Alyssa said...

@Paul this earlier post should answer your question, http://www.realsoftwareblog.com/2010/08/future-support-for-powerpc.html

Rick said...

Good to know. I have been waiting for monolithic and smaller apps and Cocoa for ages. I have kept an eye on the evolution of RB for at least 3 years and never felt it was ready for my needs and quality demands and desires. LLVM seems to be the way to go. I hope you guys can do it fast. May the force be with you.

Konformagaitz said...

Could we know what the size reduction for the Hello World application was? Will it get reduced further? (just out of curiosity)

Geoff Perlman said...

@ Konformagaitz - It's not easy to say at this point. We compiled the code for the app with LLVM but the framework itself is still compiled with GCC. The next step will be recompiling the framework with LLVM. Then we should see a size reduction.

Martin Kvapil said...

Wow this is really great news, the timing is perfect but about timing, what are we talking about? In a ball park so to say.

Geoff Perlman said...

@ Martin - It's difficult to say. Very math intensive code will see the greatest speed gain. UI code won't speed up as much if at all.

Matt Trivisonno said...

Congratulations on achieving the milestone, Geoff. This is great news. Thanks for keeping us informed.

Mathias said...

ARM support is great! So it would be able to compile to Maemo platform? Or Meego?

Geoff Perlman said...

@ Mathias - it could compile for any ARM-based device but remember that's only compiling code. There would still need to be a framework provided for any new platform we were going to support.

Martin Kvapil said...

so if i have an ARM920T running debian, when in time will i be able to run my RB code on it?

This year?

Im so excited about this news :)

Geoff Perlman said...

@ Martin - Good question. I doubt you will be able to this year because our priority will be getting Mac, Windows and Linux working for x86. But once that's done we can experiment and see what would be involved in supporting Linux on ARM.

TJ said...

There's something that needs clarification here with regards to GCC versus LLVM and multiple architectures. Please understand that my comments have no bearing on RS' shift to LLVM and LLVM really is the best option for the REAL Studio environment. I simply provide it to offer a bit more information about GCC's capabilities since someone asked me off list about this because they knew we compiled for multiple architectures using GCC.

Many comments in the mailing lists and the RS forums concerning the RS shift to LLVM to support multiple architectures may have people thinking that GCC is not capable of supporting such an effort. GCC is quite capable of compiling for multiple platforms and architectures on a single system as long as you have the proper build environment in place. For example, we have been compiling our Unix core app on an Intel-based Linux system using GCC 3.4 since around 2002. We compile for FreeBSD and Linux x86, x86_64, Itanium, PPC, SPARC, and Alpha, AIX PPC and PowerRISC, Solaris SPARC and x86, HP-UX PA-RISC and Itanium, and IRIX MIPS.

Granted, our build environment is relatively huge (around 7 GB) since we need to maintain the supporting libraries and headers for each platform, but it is quite manageable once you know how the pieces all fit. We use a chroot environment to segregate each platform and architecture.

Here's a Wikipedia page that offers a basic overview:

Wikipedia GCC Article

We may consider moving our environment to LLVM in the future, but for now, GCC gets the job done for us very well.

Geoff Perlman said...

@ TJ - Because GCC is GPL, it's not an option for us as a backend compiler. Apple is moving to LLVM from GCC for the same reason.

TJ said...

@Geoff - I definitely understood and fully agree with your move (as I said). My point was simply to educate on GCC since some folks were under the impression that you had to change to get cross platform / architecture compiles.

Geoff Perlman said...

@ TJ - Ah. No, we didn't have to change for cross-platform. We have been compiling our cross-platform frameworks with GCC for Mac and Linux and Visual Studio for Windows for many years.

However, your apps are compiled, not with GCC or Visual Studio, but with our compiler. We have decided to switch our backend compiler to LLVM because there are a lot of benefits to doing so and few downsides. GCC would not have been an option for us to use as the backend for our compiler because of its GPL license.

Hank Fay said...

When you are compiling through LLVM, does that mean you will open up other Frontends for other dynamic languages, e.g., Python?

Geoff Perlman said...

@ Hank - Supporting other languages would be a big job and LLVM won't really help with that.

EB said...

iOS framework support would be killer vector for corp programming. I see great opps for factory floor iOS apps to monitor all the machinery and what not.

MacATDBB said...

Hey Guys, I've been watching the RS blog but haven't spotted any updates on this recently. Is the LLVM back-end still being actively worked on?

Geoff Perlman said...

@ MacATDBB - We have been keeping RBScript up to date with the latest versions of LLVM. Work on building apps with LLVM will start once we are finished with 2013r1.