Tuesday, August 11, 2015

ELENA 1.9.21 build (win32 / linux32) is out

A new release is available on GitHub

What's new:

    • new "variable" hint
    • "ByRef" parameters are supported
    • project template console_mt (corex) is operational once again
  • ELC
    • issue #47 fixed
    • issue #22 fixed
    • issue #68 fixed
    • issue #67 fixed
  • LIB
    • system'collections: Stack, Queue
    • system'dynamic: CastOver
    • bool inline if : iif[2]
    • forms : Combobox
    • system'dynamic'DynamicStruct is renamed to system'dynamic'Dynamic
  • Samples
    • rosetta code sample : 100 doors
    • rosetta code sample : 24 game
    • calc is using script engine now
  • IDE
    • issue #7 fixed
    • issue #74 fixed

Stack allocated variables in ELENA

In many programming languages numbers are immutable, so the result of any arithmetic operation is a new number. For an object-oriented language this means creating a new object, which puts extra load on the GC. Different languages choose different strategies to deal with this problem.

In ELENA it is possible to declare embeddable objects which will be stack allocated. All basic numeric classes (IntNumber, LongNumber, RealNumber, ...) are embeddable.

Declaring a stack allocated object is quite straightforward: we have to declare a strongly typed variable

   int i := 0.

The "type" attribute tells the compiler that we create a variable bound to system'IntNumber (int is a strong type associated with it). If the class is embeddable, the compiler allocates space on the method stack and copies 0 into it.

The following operation

   int m := 2 * i + 1.

will be executed directly without allocating any additional space in the program heap.

But if we pass it in a message call

   console writeLine:(a@n).

the compiler will box it (create a copy of the object in the heap). Moreover, if the object is mutable, it will be unboxed after the operation (the content of the dynamic object will be copied back into the original variable). In our case IntNumber is immutable, so there is no need to unbox.

In most cases we have to pass our number to one method or another, so there should be a way to avoid this boxing. It is possible to declare a method as stack safe, which allows the compiler to pass stack allocated variables directly. The only limitation is that it should be a direct / virtual message call (i.e. the target should be known: "typecasted", and either sealed or limited). Many classes provide such an interface, for example an array:

   console writeLine:
              (a array getAt &int:n).

In this case n is passed directly. Note that by sending the get&array message to a, we typecast it.

This approach works quite well, except that it cannot be used for returning values.

   int l := a array length.

In this case a new dynamic object is created despite using a stack allocated variable and a known operation (get&length).

So we have to introduce a new concept: the type wrapper. If a class is sealed and has only one typed embeddable field, it can be marked as a wrapper. In this case it can be used as a by-reference parameter.

   int l := 0.
   a array readLength &vint:l.

vint is a strong type associated with the Integer class. Integer is a wrapper around an IntNumber value.

Some magic has to happen here. In the normal case the compiler would send the vint message to the boxed l variable. Because IntNumber does not support the get&vint message, the operation would fail. But in this case the compiler knows that vint is a wrapper around int, so it implicitly boxes l into an instance of the Integer class. Moreover, readLength&vint is direct and stack safe, so the compiler is able to pass a reference to our variable directly without extra boxing / unboxing operations.

As a result we are able to read the array length into the variable without allocating the number in the heap.
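
The difference between returning a value and filling a by-reference wrapper can be modelled outside ELENA. The following is only a rough Python sketch: Python has no stack allocated objects, so it shows the calling convention rather than the memory behaviour, and the names Integer, Array, length and read_length are stand-ins chosen for illustration, not the ELENA library API.

```python
class Integer:
    """Mutable wrapper around one int value (stands in for the vint wrapper)."""

    def __init__(self, value=0):
        self.value = value


class Array:
    """Tiny array model with both ways of reporting its length."""

    def __init__(self, items):
        self._items = list(items)

    def length(self):
        # "Return" style: the result is handed back as a new object.
        return len(self._items)

    def read_length(self, out):
        # "By-reference" style: the result is written into the caller's wrapper.
        out.value = len(self._items)


a = Array([10, 20, 30])
l = Integer(0)
a.read_length(l)
print(l.value)  # 3
```

In the by-reference variant the caller owns the storage, which is what lets the ELENA compiler keep it on the stack.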

As you can see, using these simple tricks we are able to write quite optimized code.

Thursday, July 30, 2015

ELENA 1.9.20 build (win32 / linux32) is out

A new release is available on GitHub

What's new:
  • ELC
    • issue #15 fixed
  • LIB30
    • sqlite: supports float and blob fields
    • extensions'dynamic : scriptEngine
    • system'text'UT8Encoding : toLiteral&int&int&bytearray
  • ElenaScript
    • script engine is again operational
    • refactored / syntax modified
    • local variables are supported
    • structures are supported
    • functions are supported
    • EvaluateFile / InterpretFile: "~" indicates relative to elena bin folder
  • Samples
    • calc
    • sqlite_test - a first ELENA database example
    • rosetta samples ackermann, addfield, accumulator are migrated to Linux

Wednesday, July 8, 2015

ELENA setup blocked

Windows blocks the downloaded setup from being executed.

You may either unblock the file or download the zip archive.

Wednesday, April 29, 2015

Source code


I'm migrating my source code to GitHub. SVN repository is now obsolete!

The new repository address is http://github.com/ELENA-LANG/elena-lang

There are two branches: master (the latest release) and current (work in progress)

The latest build is now available from GitHub as well: http://github.com/ELENA-LANG/elena-lang/releases/tag/v1.9.19.4a

P.S. updated to reflect change from three to two branches

Tuesday, April 28, 2015

ELENA build (win32 / linux32) is out

A new build is available at http://sourceforge.net/projects/elenalang/files/ELENA%202.x/builds/

The $nil pseudo variable is added to replace the nil one.

In Linux version new examples are ported : goods, textfile, textdb

In Windows version gui samples are added : agenda, c_a_g, graph

Wednesday, April 15, 2015

Tutorial: Literal as an array of characters

In this tutorial we will see how to read a literal character by character.

Let's consider a simple example: we will read each character of the literal and print it on the screen.

#import extensions.

// ...

        var l := "Hello world".
        var i := 0.
        while (i < l length)
            console write:(l@i).
            i := i + 1.

Let's make our example a little bit more difficult:

        l := "Привет Мир".
        var i := 0.
        while (i < l length)
            console write:(l@i).
            i := i + 1.

The first loop works but the second one fails.

Let's find out what breaks our code.

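Starting from 1.9.19 LiteralValue is UTF-8. Each Russian character is encoded with two bytes, so the literal is almost twice as long in bytes as it is in characters (the space still takes one byte). The first loop was not broken because every character of "Hello world" takes exactly one byte, so the byte index and the character index coincide. In the second loop reading at index 0 still works: CharValue is UTF-32 and has enough room to store any Unicode character. But when we read at the second byte, CharValue raises an exception because the code is invalid.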

Note that it would work if "l" were a WideLiteralValue. But we would have a similar problem with symbols that need two UTF-16 code units, such as some rare Chinese characters.
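
The byte counts behind this explanation can be checked with a few lines of Python. This is only an illustration of the encodings themselves, not of ELENA's LiteralValue or CharValue; the helper decodes_alone and the sample character U+20000 are chosen here for the example.

```python
text = "Привет Мир"
utf8 = text.encode("utf-8")
utf16 = text.encode("utf-16-le")

char_count = len(text)         # 10 characters
utf8_bytes = len(utf8)         # 19 bytes: 9 Cyrillic letters * 2 + 1 space
utf16_units = len(utf16) // 2  # 10 code units, one per character


def decodes_alone(chunk: bytes) -> bool:
    """Return True if the chunk is valid UTF-8 on its own."""
    try:
        chunk.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


first_ok = decodes_alone(utf8[0:2])   # True: both bytes of the first letter
second_ok = decodes_alone(utf8[1:2])  # False: a continuation byte by itself

# A character outside the Basic Multilingual Plane needs two UTF-16 units.
rare = "\U00020000"
rare_units = len(rare.encode("utf-16-le")) // 2  # 2
```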

Fortunately we can easily fix the problem by using the CharValue.length method. It returns how many bytes it takes to encode the symbol.

        i := 0.
        while (i < l length)
            console write:(l@i).
            i := i + l@i length.
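
The same idea of advancing the index by the encoded length of the current character can be sketched in Python over raw UTF-8 bytes. The function names char_size and read_chars are hypothetical and only mirror the role that CharValue.length plays in the loop above.

```python
def char_size(lead: int) -> int:
    """Number of bytes in the UTF-8 sequence that starts with this lead byte."""
    if lead < 0x80:
        return 1
    if 0xC0 <= lead < 0xE0:
        return 2
    if 0xE0 <= lead < 0xF0:
        return 3
    if 0xF0 <= lead < 0xF8:
        return 4
    raise ValueError("not the first byte of a character: 0x%02X" % lead)


def read_chars(buf: bytes) -> list:
    """Walk the buffer, stepping by the size of each character."""
    result = []
    i = 0
    while i < len(buf):
        size = char_size(buf[i])
        result.append(buf[i:i + size].decode("utf-8"))
        i += size  # the analogue of i := i + l@i length
    return result


print("".join(read_chars("Привет Мир".encode("utf-8"))))  # Привет Мир
```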

Or we could use an enumerator:

        var enum := l enumerator.
        while (enum next)
            console write:(enum get).

And the simplest way would be to use extensions'control helper:

       control run:l &forEach: ch [ console write:ch. ].

Sunday, April 12, 2015

ELENA build (win32 / linux32) is out

A new build is available at http://sourceforge.net/projects/elenalang/files/ELENA%202.x/builds/

It includes several critical bug fixes (UTFxEncoder, elc2).

In Linux version new examples are ported : translit, matrix

In Windows version agenda sample is reintroduced (native implementation)

As usual, any bug report is welcome

Thursday, April 9, 2015

ELENA build (win32 / linux32) is out

A new build is available at http://sourceforge.net/projects/elenalang/files/ELENA%202.x/builds/

It includes several critical bug fixes (e.g. LiteralValue.new&length&index&chararray).

In Linux version new examples are ported : binary, bsort, replace, words

As usual, any bug report is welcome

Friday, April 3, 2015

Work in progress : first Linux alpha release

ELENA is finally ported to Linux. The first alpha release is available at http://sourceforge.net/projects/elenalang/files/ELENA%202.x/builds/

From the beginning I planned to implement ELENA on several platforms, but it took a lot of time to actually do this. When I implemented Unicode support I chose UTF-16 because of Windows. But when I started porting to Linux I realized that it actually gives me no advantages, so I switched to UTF-8, and now ELENA is internally 100% UTF-8. UTF-16 is used in the Windows version for file and display operations. LiteralValue is UTF-8. For Windows I implemented WideLiteralValue, which is UTF-16, and CharValue is UTF-32.

Currently only console operations are supported but I will move the rest of the code (except GUI) during this month.

Thursday, February 26, 2015

ELENA 2.x: messages, types, dispatching...

In this topic I would like to discuss in detail how ELENA objects interact with each other, the message structure, the concept of types and the "compiler magic" used to increase code performance.
In a dynamic language the type of an object (its class) cannot be resolved at compile time in most cases. So we need a way to resolve the message mapping for every object before we call the appropriate method. The simplest way would be to create a method table containing pairs of a message hash code and a reference to the executable code (the message handler, i.e. the method). Calculating the message hash code can be done by the compiler, so the following code:

   x add:y

will be compiled as

  <push> y
  <push> x
  <set-message> message-id-of:add
  <get> x.class
     <throw-exception> system'MethodNotFound
Resolving the message entry in the table can be done by the class itself. This gives the compiler the possibility to tune the operation; for big classes a binary search may be used, for example. So a special method, the dispatcher, can be declared in the super class, and it will always be the first entry in the method table (note that a custom dispatcher is used in many group objects: system'Variable, system'dynamic'Extension, ...). In this case our code will look like this:

  <push> y
  <push> x
  <set-message> message-id-of:add
  <get> x.class
In the simplest case a message hash code can be an index in the global message table.
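
The method table and the dispatcher described above can be modelled roughly as follows. This is a Python sketch under assumptions, not the ELENA runtime: the MESSAGE_IDS table, the ClassInfo and Obj classes and the send helper are all invented for the illustration, and the binary search is just one possible lookup strategy.

```python
import bisect


class MethodNotFound(Exception):
    """Raised when the receiver's class has no handler for the message."""


# Hypothetical global message table: message name -> message id.
MESSAGE_IDS = {"add": 1, "subtract": 2, "multiply": 3}


class ClassInfo:
    """A class with a method table sorted by message id."""

    def __init__(self, name, methods):
        self.name = name
        pairs = sorted(
            ((MESSAGE_IDS[msg], handler) for msg, handler in methods.items()),
            key=lambda pair: pair[0],
        )
        self.ids = [pair[0] for pair in pairs]
        self.handlers = [pair[1] for pair in pairs]

    def dispatch(self, message_id, receiver, *args):
        # The dispatcher: binary search over the sorted message ids.
        pos = bisect.bisect_left(self.ids, message_id)
        if pos == len(self.ids) or self.ids[pos] != message_id:
            raise MethodNotFound("%s has no handler for id %d" % (self.name, message_id))
        return self.handlers[pos](receiver, *args)


class Obj:
    """A dynamic object: a class reference plus a payload."""

    def __init__(self, cls, value):
        self.cls = cls
        self.value = value


def send(receiver, message, *args):
    # Models <set-message> followed by a call to the class dispatcher.
    return receiver.cls.dispatch(MESSAGE_IDS[message], receiver, *args)


int_class = ClassInfo("IntNumber", {
    "add": lambda self, y: Obj(int_class, self.value + y.value),
    "subtract": lambda self, y: Obj(int_class, self.value - y.value),
})

print(send(Obj(int_class, 2), "add", Obj(int_class, 3)).value)  # 5
```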

Let's review our example again and assume that x is system'IntNumber and y is system'LongNumber. Because x and y are dynamic objects, it is not possible to sum them directly. So the IntNumber.add[1] method should ask the operand about its type. Alternatively we could send a new message to y, providing it with the identity of x, e.g.

   add : y = y addToInt:$self.

So LongNumber knows the operand type and can perform the operation. The same goes for the multiply operation (multiplyByInt should be sent) and so on. As we see, the message may contain information about the operand "type" (note that it is not a real object type such as system'IntNumber, but a protocol, a convention between the objects).

All this led me to the idea of introducing a message structure. The message can be split into two parts: a verb describing the operation and a subject describing the operation parameter. In our case the message addToInt may be replaced with add&int (where "add" is a verb and "int" is a subject). If we somehow bind the object with its subject (e.g. as a field in the class header) we could dynamically dispatch the parameter:
  <push> y
  <push> x
  <set-verb> verb-id-of:add
  <add-subject> y.class.type
Unfortunately this works only in a few cases (when a message has only one parameter). In most cases the solution became too complex, so after a while I had to give up this approach and take another way.
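
The addToInt convention mentioned above is a form of double dispatch: the receiver forwards the operation to the operand under a message that names the receiver's own "type". A minimal Python sketch of that idea follows; the method names add_to_int and add_to_long are illustrative translations, and the choice of LongNumber as the result of mixed additions is an assumption.

```python
class IntNumber:
    def __init__(self, value):
        self.value = value

    def add(self, operand):
        # add : y = y addToInt:$self.
        return operand.add_to_int(self)

    def add_to_int(self, left):
        return IntNumber(left.value + self.value)

    def add_to_long(self, left):
        return LongNumber(left.value + self.value)


class LongNumber:
    def __init__(self, value):
        self.value = value

    def add(self, operand):
        return operand.add_to_long(self)

    def add_to_int(self, left):
        # The left operand is known to be an int; the result is widened.
        return LongNumber(left.value + self.value)

    def add_to_long(self, left):
        return LongNumber(left.value + self.value)


result = IntNumber(2).add(LongNumber(3))
print(type(result).__name__, result.value)  # LongNumber 5
```
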
Alternatively the message can be split into three parts: the generic action, the signature and the parameter count. The signature can be split into several subjects describing the parameters, e.g. the message insert&index&literal[2] is an insert action with the signature index&literal and has two parameters. Message dispatching is possible when the signature is the operand "type" and the parameter count is 1.
This allows us to perform several operations on message parts. We could use a signature symbol to qualify a generic message, or dispatch an action with a specific signature.
For example dispatching is used in cast methods:
   cast : aVerb &to:aTarget = aTarget::aVerb short:$self.
and it will be used in the following code:
   anObject cast:%add &to:$self.
where %add is a generic verb. In the cast method we dispatch it with a particular subject, i.e. we dynamically add a subject to the generic message.
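
The three-part message structure, and the step of qualifying a generic verb with a subject, can be sketched in Python. The textual form action&subject...[count] is taken from the examples above; the Message type and the parse_message, qualify and format_message helpers are hypothetical and say nothing about how ELENA encodes messages internally.

```python
import re
from typing import NamedTuple, Tuple


class Message(NamedTuple):
    action: str                 # generic action (verb), e.g. "insert"
    signature: Tuple[str, ...]  # subjects describing the parameters
    param_count: int


_PATTERN = re.compile(r"^(\w+)((?:&\w+)*)\[(\d+)\]$")


def parse_message(text: str) -> Message:
    """Split a message such as insert&index&literal[2] into its three parts."""
    match = _PATTERN.match(text)
    if match is None:
        raise ValueError("malformed message: %r" % text)
    action, subjects, count = match.groups()
    signature = tuple(s for s in subjects.split("&") if s)
    return Message(action, signature, int(count))


def qualify(action: str, subject: str) -> Message:
    """Attach one subject to a generic verb, giving a one-parameter message."""
    return Message(action, (subject,), 1)


def format_message(message: Message) -> str:
    subjects = "".join("&" + s for s in message.signature)
    return "%s%s[%d]" % (message.action, subjects, message.param_count)


print(parse_message("insert&index&literal[2]"))
print(format_message(qualify("add", "int")))  # add&int[1]
```
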
Though the "typeless" nature of dynamic languages is a good thing, in many cases we still need a specific class. For example, in the system'Array constructor the parameter should contain the length. In a strongly typed language this could be guaranteed at compile time; in ELENA we have to check the type at run time. So it would be convenient if we could create a special agreement between the method and the caller to guarantee the parameter role (or its type) without the need to check it every time in the method itself.
As the message signature consists of subjects, they can be used to describe not only the parameter role but in some cases its type as well. In most cases a subject may be used implicitly without the need to declare it. But if we would like to use it as a protocol, it should be explicitly declared. For example, we could declare a new subject "enumerable", which means that the object passed under this role supports the enumerator message. There is no way to guarantee that the object actually supports this protocol; it is up to the programmer to take care of it. But in some cases we can enforce it, especially for data types. In that case we associate the subject (or type) with a class. As a result only an instance of this class can be passed. But ELENA is still a dynamic language, so how can we do this without introducing actual types? The solution I found is that every time a strongly typed parameter is required, the compiler calls a typecast message: get&<type-subject> (e.g. get&int, get&literal). So actually the only place where compile-time typecasting should be performed is in a get&<type-subject> method.
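
The typecast message can be pictured as a call that the compiler inserts in front of every strongly typed parameter. The Python sketch below models only that idea; get_int stands in for get&int, new_array is a made-up function, and the assumption that the Integer wrapper answers get_int by producing an IntNumber is mine, not something stated above.

```python
class IntNumber:
    def __init__(self, value):
        self.value = value

    def get_int(self):
        # An int typecasts to itself.
        return self


class Integer:
    """Wrapper class; converting it to IntNumber here is an assumption."""

    def __init__(self, value):
        self.value = value

    def get_int(self):
        return IntNumber(self.value)


def new_array(length):
    # The line the compiler would insert for a parameter typed as int:
    n = length.get_int()
    return [None] * n.value


print(len(new_array(IntNumber(3))))  # 3
print(len(new_array(Integer(2))))    # 2
```

An argument that does not understand the typecast message fails at the call site (AttributeError in this model), so the method body never has to check the type itself.
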
Though subjects were initially designed only for message parameters, they can also be used to provide the "type" of a variable (local or class) and of the method result.
Strong types also give us a way to increase performance. If the type class is sealed or limited, the compiler can resolve the message at compile time. For example, system'Enumerator is a limited class and system'LiteralEnumerator inherits from it. Both of them are of the enumerator "type". So if we declare an enumerator variable, all operations on it will be resolved at compile time, because system'LiteralEnumerator may only override existing methods without adding new ones.
If the class is a structure (contains raw data) and is sealed, its type can be used for stack allocated variables. For example in the following code:
        int x := self int.
        int y := anOperand int.
        int z := x / y.
        z := z * y.
x, y, z are stack allocated system'IntNumber instances. As a result the basic arithmetic operations can be resolved at compile time without the need to create a new dynamic object for every operation.
Despite the introduction of the "type" concept, ELENA is still a dynamic language. Strong types can be used in performance critical parts of the application (like arithmetic operations), but in all other cases ELENA is a 100% dynamic language.

Monday, February 23, 2015

ELENA Language Compiler 1.9.18 released:

[!] binary incompatible due to changes in debugger / exception support 
    / typecasting
[-] dn files are no longer produced, debug info is inside an executable
[+] new shared library : elenart - run-time helper for stand-alone 
[+] constructor may have redirecting and initializing parts
[+] vmpath is no longer used, shared path used instead (for both 
    elenavm and elenart)
[*] type routine overhaul
[*] syntax : extending operator renamed - "::" should be used
[*] syntax : class syntax changed - "::" should be used to provide 
    the class parent
[+] new hint : type can be provided for the method result
[-] direct dispatching no longer supported
[-] statement terminator can be omitted only in lazy expression
[+] character constant (e.g. "a" is CharValue rather than LiteralValue)
[-] escape sequence is no longer supported (e.g. "Hello%n" => "Hello"#10 )
[*] 0r, 1.0e2r constants are correctly recognized
[*] ? and ! operators require an object now (e.g. a == b ? [ .. ]  is 
    no longer possible, correct syntax is (a==b) ? [ ... ])

[*] codebase refactored
[*] output code optimized
[-] compiler option is no longer supported : -xembed-

[+] Project - View - Call stack
[-] Project - Options - Debug Mode : Enabled for VM Client is no 
    longer supported
[-] Project - Options - VM Path is no longer supported / required 
    (SHARED path is used)

[*] system : ByteNumber, ShortNumber, IntNumber, LongNumber, RealNumber, 
             Integer, Long, Real, CharValue
[*] system : intConvertor, literalConvertor, longConvertor, realConvertor, 
             FunctionX, IndexFunctionX
[*] system : Object#class.new, Exception::theCallStack, 
[*] system : ByteArray, ShortArray, LiteralValue, Array
[*] system'control => extensions'control
[*] system'collections : List, Dictionary
[*] system'routines : literalOp
[*] system'math : intOp, realOp
[*] system'calendar : Date, TimeSpan
[*] system'text : TextBuffer, Encoder, ansiEncoder
[*] system'io : BinaryReader, StreamReader, BinaryWriter, StreamWriter, 
                FileStream, TextReader, TextWriter
[*] extensions: inputOp, outputOp, convertor
[+] system : CallStack, IntArray, byteConvertor, shortConvertor, CharArray
[+] extensions'dynamic: scriptEngine, ScriptEngineException
[+] system'math : byteOp, shortOp, longOp, Matrix

[*] refactored / syntax modified
[+] installation package is introduced
[*] ELENA API documentation (see doc\api) is up to date now
[*] Visual Studio Projects are migrated to Visual Studio Community 2013

Tuesday, January 13, 2015

ELENA 1.9.18 first beta release

ELENA 1.9.18 first beta version is now available at sourceforge

The upcoming release will include a deep redesign of type support, a refactored API and some minor syntax changes.