Inside naru programming language

« List of documents

This is a draft implementation note for naru programming language, a general-purpose programming language designed by Kang Seonghoon. This article serves as a design document for both naru standard library and runtime for naru implementation.

This document is available as both a standalone document and a wiki page. The recent changes are available in Mercurial repository.

1 Prelude

naru in a nutshell document describes that naru core module is implicitly imported before every naru source code is run. This is only partially true; it is among first modules to be run, but it is definitely not the first one to be run. The main reason is that the naru runtime is very complex, and typically divided into lots of modules. (By the way, the internal pieces of runtime are due to be located at _narurt package.)

The prelude (_narurt prelude module) is a part of the naru runtime that bootstraps basic types and initializes built-in operators. Due to its duty, it is automatically imported by import naru core * statement and also the very first module to be run. Since no value is available before the prelude runs, the prelude constructs the very first values during its lifetime; it is possible using the vendor-specific pragma or the external module. These values are truly ethereal (i.e. exempted or invisible from the garbage collector) and cannot be destroyed.

1.1 Bootstrapping Types

The prelude constructs the following types:

1.2 Built-in Operators

It is very important to realize that naru does not have operators like +, * and so on by default. In fact, the core language only supports the function call, and all operators are desugared into sequences of more primitive function calls. For example, the following code:

quadraticRootOf(a, b, c) := fun
    det = b**2 - 4*a*c
    if det < 0 then
        return {/}    -- no real solutions
    elseif det > 0 then
        rootdet = det**(1/2)
        return {(-b+rootdet)/(2*a), (-b-rootdet)/(2*a)}
    else
        return {-b/(2*a)}
    end
end

…is really desugared into the following form:

quadraticRootOf##FSSS_S(a:*, b:*, c:*)->* := fun
    det:* = #-#(#**#(b, 2), #*#(#*#(4, a), c))
    given #<#(det, 0)
    case True
        set0##V:Set{*} = Set##M _empty()
        return set0##V
    case False
        given #>#(det, 0)
        case True
            rootdet:* = #**#(det, #/#(1, 2))
            set1##V:Set{*} = Set##TS_ _singleton(#/#(#+#(-#(b), rootdet), #*#(2, a)))
            set1##V _add(#/#(#-#(-#(b), rootdet), #*#(2, a)))
            return set1##V
        case False
            set2##V:Set{*} = Set##TS_ _singleton(#/#(-#(b), #*#(2, a)))
            return set2##V
        end
    end
end

2 Memory Management

While hidden to the typical naru code, the memory management in naru is fully implemented in naru itself.

2.1 Garbage Collection

3 Numbers Internal

3.1 Integers

TODO

4 Collections Internal

4.1 String

_narurt string implements the String class and its companion, Symbol class. It is by far the most complex data structure in naru standard library (compared to its conceptual simplicity).

Internally the string and symbol share the same internal data structure, _StringTree; only the public interface differs between them. As the name suggests it is a heavyweight string (rope), implemented as a tree of short strings. Each node stores the length of that portion of string along with its children, so the indexing is O(log n) time (compared to the constant time for normal string). The same portion of the tree is shared, even when the explicit copy is made; when the change is made to the copy new tree is constructed so that as much portions of the original tree as possible is shared (i.e. copy-on-write scheme). TODO pointers to the next or previous leaves can be used if we split the leaf node and actual storage, at the expense of a copy of the entire tree (without expansion of shared parts!); i doubt its value however.

A rope is used because most disadvantages of the rope, namely slower indexing, are not a concern for most use cases. The direct indexing on a Unicode string is not useful since a Unicode code point is not same to a character; even when the code point is needed, most use cases requires the iteration of the whole string which is just as efficient as a normal string. It also has an advantage that UTF–8-encoded string can be stored as is; the verification is still required, but it is much faster than the full decoding. Finally, various algorithms on Unicode strings can be optimized using a rope.

Nodes. There are four kinds of leaves in the _StringTree:

There are two kinds of non-leaf nodes in the _StringTree:

Additional node informations. Each node contains the following informations. Some informations can be uninitialized on the construction of the string, and will be recalculated in-place during the traversal.

5 I/O Internal

6 Outside naru core