
Cicada built-in functions

Cicada provides a number of built-in functions that work just like user-defined functions. These are listed in Table 4. A function’s ID number is just its ‘name’ in bytecode -- we don’t usually have to deal with this. Functions that leave behind a value when used as a command (i.e. print a value from the command prompt) have checkmarks in the ‘@’ column. This section explains each built-in function in alphabetical order.


ID  name          @    ID  name   @    ID  name            @
0   call               10  trap        20  tan             X
1   setCompiler   X    11  throw       21  acos            X
2   compile       X    12  top    X    22  asin            X
3   transform          13  size   X    23  atan            X
4   load          X    14  abs    X    24  random          X
5   save               15  floor  X    25  find            X
6   input         X    16  ceil   X    26  type            X
7   print              17  log    X    27  member_ID       X
8   read_string        18  cos    X    28  bytecode        X
9   print_string       19  sin    X    29  springCleaning

Table 4: Built-in functions, by bytecode ID number. Functions which return values even as commands have a checkmark (X) in the ‘@’ columns.


abs()

syntax: (numeric) y = abs((numeric) x)

Returns the absolute value of its argument (which must be a number).


acos()

syntax: (numeric) y = acos((numeric) x)

Returns the inverse cosine of its argument. The argument must be a number on the interval [-1, 1] (a number outside this range will generate the ‘not a number’ value on many machines). The result is on the interval [0, pi].


asin()

syntax: (numeric) y = asin((numeric) x)

Returns the inverse sine of its argument. The argument must be a number on the interval [-1, 1] (a number outside this range will generate the ‘not a number’ value on many platforms). The result is on the interval [-pi/2, pi/2].


atan()

syntax: (numeric) y = atan((numeric) x)

Returns the inverse tangent of the argument, which must be numeric. The result is an angle in radians on the interval [-pi/2, pi/2].


bytecode()

syntax: (string) codeString = bytecode((variable) myFunction [, (numeric) memberIndex])

Returns the bytecode of a given variable or member. If there is one argument it returns the bytecode of that variable; if there are two arguments then it returns the bytecode of member myFunction[memberIndex]. Member code is never run directly, but it determines the sort of variable a member can point to (because code and type are equivalent in Cicada).

To read the bytecode we need to move the bytecode data from the string into an array of integers using the =! operator. The last integer is always 0, signifying the end of bytecode. If there are multiple codes (due to the inheritance operator) then the codes are concatenated in parent-to-child order in the same string, and each separate code ends in a null integer. bytecode() is the inverse operation to transform().

The bytecode() function returns the code for functions, but also for many other objects that we don’t normally think of as having code. In fact the only restriction is that myFunction must be some composite object (defined using curly braces). So if we define


    pow :: {
       params :: { x :: y :: double }
       
       code
       
       params = args
       return new(params.x^params.y)
    }
   

then bytecode(pow) returns the bytecode for everything inside pow()’s definition (including the definition of params and the code marker), whereas bytecode(pow.params) is also legal and returns the bytecode corresponding to x :: y :: double.


call()

syntax: (numeric) return_code = call((string/numeric) C_routine, [arguments])

Runs a user-defined C or C++ routine referenced in UserFunctions[] (in userfn.c). The first argument specifies which function to run, either as a string containing the Cicada function name (the string in UserFunctions[]), or else as the array index (beginning at 1) of the function in UserFunctions[]. The subsequent arguments form the argv array that the C routine receives. Returns the return value of the C/C++ function (an integer).

For example, if we write some C function


    ccInt myFunction(ccInt argc, char **argv) { ... }
   

just as if it were a complete program, then in order to use it from inside Cicada we would add an entry to the UserFunctions[] array in userfn.c:


    userFunction UserFunctions[] = { { "pass2nums", &pass2nums }, { "cicada", &runCicada },
                   { "runFunctionInC", &myFunction } };
   

runFunctionInC is Cicada’s name for the function, even though myFunction is its C name. Then, after recompiling Cicada, we could run the function with the command


    result := call("runFunctionInC", arg1, 14, "another argument")
   

or


    result := call(3, arg1, 14, "another argument")
   

(because runFunctionInC is the 3rd C function in UserFunctions[]). If we are using a string to specify the C function, then we can also use a shorthand syntax where that string follows a dollar-sign:


    result := $runFunctionInC(arg1, 14, "another argument")
   

The number of arguments is passed through argc, and argv holds the address of the array of pointers to the actual arguments. So argv[0] is the pointer to the first argument, and argv[0][0] is the first byte of the first argument; argv[1] is the pointer to the second argument; etc.

Only primitive variables can compose an argument to the C routine; a composite argument to call() generally contains, and is passed as, multiple primitive arguments, one for each primitive component. In the example above the number of arguments is at least 3, depending on the type of arg1. Suppose arg1 was an array of { bool, double } -- then the number of arguments is 4. (The first argument is an array of bools, the second is an array of doubles, the third contains the integer 14, and the fourth is a string). Strings are passed as linked lists (see the reference section). Void arguments or members are skipped.

call() also adds one final argument at the end, in position argv[argc], that can be used for type-checking. This argument-info array is a list of elements of type arg_info, defined in userfn.h:


    typedef struct {
       ccInt argType;
       ccInt argIndices;
    } arg_info;
   

There is one entry in this array for each argument passed by the user. The argType codes are defined in Table 1. argIndices gives the number of indices that were passed (1 if it was just a variable, N if an array).

All arguments are passed by reference: they are pointers to Cicada’s own data storage, so data can be exported as well as imported. It is easy to crash Cicada by overwriting the wrong regions of memory: a call() preceding a crash is frequently at fault, even if the crash happens far downstream.
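
As a rough illustration of the argument layout described above, here is a minimal sketch of a C routine that could be registered in UserFunctions[]. The routine name sumDoubles, its nonzero return codes, and the typedef of ccInt as int are assumptions made for this example (the real ccInt comes from Cicada's headers); the arg_info struct is copied from the userfn.h excerpt. The routine expects an array of doubles followed by a single double, and uses the argument-info array at argv[argc] to learn how many elements were passed.

```c
#include <stddef.h>

typedef int ccInt;              /* assumption: stand-in for Cicada's integer type */

typedef struct {                /* as shown in the userfn.h excerpt */
    ccInt argType;
    ccInt argIndices;
} arg_info;

/* Hypothetical user routine: sums the doubles in argument 1 and
   writes the total into argument 2 (a single double). */
ccInt sumDoubles(ccInt argc, char **argv)
{
    arg_info *info;
    double *values, *total;
    ccInt i;

    if (argc != 2) return 1;                 /* wrong number of primitive arguments */

    info = (arg_info *) argv[argc];          /* extra argument-info entry added by call() */
    if (info[1].argIndices != 1) return 2;   /* the output must be a single variable */

    values = (double *) argv[0];             /* pointer into Cicada's own storage */
    total = (double *) argv[1];

    *total = 0.0;
    for (i = 0; i < info[0].argIndices; i++) *total += values[i];

    return 0;                                /* becomes the return value of call() */
}
```

If this were registered under the (hypothetical) name "sumDoubles", a script could invoke it along the lines of call("sumDoubles", myArray, total) and read the sum back from total afterwards, since the arguments are passed by reference. A fuller routine would also compare each argType against the codes in Table 1 before casting.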


ceil()

syntax: (numeric) y = ceil((numeric) x)

Returns the nearest integer that is as high as or higher than the argument, which must be numeric. For example, ceil(5.6) returns 6, ceil(-5.6) returns -5, and ceil(2) returns 2.


compile()

syntax: (string) script_bytecode = compile((string) script [, (string) file_name [, (string) char_positions [, member_names]]])

Before Cicada can execute a script, that script must be compiled into a binary form called bytecode that is much easier to execute than the raw text. The built-in compile() function does this job. Given a string containing a Cicada script (script), compile() then returns a second string (script_bytecode) containing Cicada bytecode. Importantly, the bytecode is not machine code -- it is only used by Cicada.

A basic compile() call looks like:


    myBytecode := compile("x = 3")
   

This command produces bytecode using the currently-active compiler. Each compiler keeps a record of all variable names, so if x had been defined with bytecode produced by the current compiler then this command will run just fine. To change compilers use the setCompiler() command.

A compilation error will actually crash the script running the compile() command. To prevent this we can enclose the compile() call inside of the trap() function. If we want to print out the error message, we can write a semicolon or code marker at the beginning of trap()’s arguments.


    > trap(; compile("x = "))
   
    Error: right-hand argument expected
   
    x =
     ^
   

The optional second file_name argument causes any error message to reference that file name.


    > trap(; compile("x = ", "myFile.txt"))
   
    Error: right-hand argument expected in file myFile.txt
   

Often a script will compile but cause an error when it runs. In order to properly flag runtime error messages we must collect another piece of information: the character position in the original script of each bytecode word. This lets the error message flag the offending line in the original script. The character positions are stored inside of any string that is passed as an optional third argument to compile(). Both that string and the original Cicada script will be passed to transform(), the function that actually allows compiled bytecode to be run.

In some cases we may want to avoid using compile(), but instead hand-code the bytecode and load it in using transform(). After all, compile() is ‘only’ a string operation: it converts an ASCII script into a string containing binary bytecode.


cos()

syntax: (numeric) y = cos((numeric) x)

Returns the cosine of its argument. The argument must be numeric.


find()

syntax: (numeric) result = find((strings) search_in, search_for [, (numeric) mode [, (numeric) starting_position]])

Finds an instance of, or counts the number of instances of, a substring (argument 2) within another string (argument 1). If find() is used in search mode, it returns the character position (where 1 denotes the first character) where the substring was first found, and 0 if it was not found anywhere. If find() is run in count mode, it returns the number of instances of the substring found within the larger string.

The optional third argument controls the mode that find() is run in: it needs to be -1, 0 or 1. If a mode is not specified then it defaults to mode 1, which denotes a forward search; i.e. it will return the first instance of the substring that it finds. Mode -1 corresponds to a reverse search, which will find the last instance of the substring. Mode 0 is the count mode.

By default, a forward search begins from the first character, and a reverse search begins with the last character. A count proceeds forward from the first character. The starting character can be changed by specifying a starting position in the fourth argument. A mode has to be given in order for a starting position to be specified.
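
The three modes can be summarized by the following C sketch. It illustrates the behavior described above and is not Cicada's actual implementation: the name find_sketch is hypothetical, the optional starting position is left out, and counting overlapping matches separately in mode 0 is an assumption.

```c
#include <string.h>

/* Sketch of find()'s modes on C strings.
   mode  1: position of the first match (1-based), 0 if none
   mode -1: position of the last match (1-based), 0 if none
   mode  0: number of matches (overlapping matches counted; an assumption) */
long find_sketch(const char *search_in, const char *search_for, int mode)
{
    size_t n = strlen(search_in);
    size_t m = strlen(search_for);
    size_t i;
    long count = 0, last = 0;

    if (m == 0 || m > n) return 0;

    for (i = 0; i + m <= n; i++) {
        if (memcmp(search_in + i, search_for, m) == 0) {
            if (mode == 1) return (long) i + 1;   /* forward search stops at first hit */
            count++;
            last = (long) i + 1;                  /* remember the most recent hit */
        }
    }

    return (mode == 0) ? count : last;
}
```

With these conventions, searching "banana" for "an" gives 2 in mode 1, 4 in mode -1, and 2 in mode 0.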


floor()

syntax: (numeric) y = floor((numeric) x)

Returns the nearest integer that is as low as or lower than the (numeric) argument. For example, floor(2.3) returns 2, floor(-2.3) returns -3, and floor(-4) returns -4.


input()

syntax: (string) str = input()

Reads in a single line from the C standard input (which is usually the keyboard). input() causes Cicada’s execution to halt until an end-of-line character is read (i.e. the user hits return or enter), at which point execution resumes. The return string contains all characters before, but not including, the end-of-line. Reading in a null character causes the error “I/O error” to be thrown.


load()

syntax: (string) file_string = load((string) file_name)

Reads a file into a string. Both ASCII-encoded and binary files can be read this way. The file name must include a path if the file is not in the default directory, as in “/Users/bob/Desktop/MyFile.txt”. If there is an error in opening or reading the file (i.e. if the file was not found or there was a permissions problem), then load() returns “I/O error”, signifying that the error comes from the operating system, not Cicada. The counterpart to load() is save().

load() searches only in the default directory. The user.cicada routine Load() extends the built-in load() by searching all paths specified in the filePaths[] array. (The run() function in user.cicada also searches all filePaths[].)


log()

syntax: (numeric) y = log((numeric) x)

Returns the natural logarithm (base e) of its argument. The argument must be numeric. The logarithm is only defined for arguments greater than zero.


member_ID()

syntax: (numeric) mbr_ID = member_ID((composite variable) var, (numeric) member_number)

Returns the ID number of a given member of a composite variable. The ID is essentially the bytecode representation of the member’s name. Under normal conditions user-defined names are assigned positive ID numbers, whereas hidden members are given unique negative ID numbers. The variable enclosing the member is the first argument, and the member number is the second argument.


print()

syntax: print((vars) v1, v2, ...)

Writes data to the standard output (which is normally the command prompt window). The arguments are printed sequentially and without spaces in between. Numeric arguments are converted to ASCII and printed as legible integers or floating-point numbers. String arguments are written verbatim (byte-for-byte) to the screen, except that unprintable characters are replaced by their hexadecimal equivalents “\AA” (which is also the format in which these characters may be written into a string). Also, carriage returns in strings are written as end-of-line characters, so a PC-style line ending marked by “\0D\n” outputs as a double line-break.
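
The string-handling rules above can be pictured with a short C sketch. This is not Cicada's source: the function name escape_sketch is hypothetical, and using isprint() to decide what counts as unprintable (so that a tab, for instance, would be escaped) is an assumption.

```c
#include <ctype.h>
#include <stdio.h>

/* Copies src into dst the way the text describes print() treating strings:
   carriage returns become end-of-line characters, printable bytes are copied,
   and other bytes become a backslash plus two hexadecimal digits.
   Returns the number of characters written (excluding the terminator). */
size_t escape_sketch(const unsigned char *src, size_t len, char *dst, size_t dst_size)
{
    size_t i, used = 0;

    if (dst_size == 0) return 0;

    for (i = 0; i < len; i++) {
        unsigned char c = src[i];

        if (c == '\r' || c == '\n') {
            if (used + 1 >= dst_size) break;      /* keep room for the terminator */
            dst[used++] = '\n';
        } else if (isprint(c)) {
            if (used + 1 >= dst_size) break;
            dst[used++] = (char) c;
        } else {
            if (used + 3 >= dst_size) break;
            snprintf(dst + used, dst_size - used, "\\%02X", (unsigned) c);
            used += 3;
        }
    }

    dst[used] = '\0';
    return used;
}
```

Under these assumptions a PC-style line ending (carriage return followed by newline) comes out as two end-of-line characters, matching the double line-break mentioned above.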

When Cicada is run from the command prompt, user.cicada loads three further printing functions: printl() (print with line break), sprint() (for printing composite structures), and mprint() (printing arrays). sprint() is the default function for printing expressions typed by the user.


print_string()

syntax: print_string([(numeric) field_width, [(numeric) precision, ]] (string) to_write, (vars) v1, v2, ...)

Writes data to a text string. print_string() is the counterpart to read_string(). Roughly speaking, print_string() is to print() as C’s more elaborate sprintf() is to printf(). The string to write is followed by any number of variables whose data Cicada writes to the string (with no spaces in between). Strings from the source variables get copied into the destination string verbatim. Numeric variables are written as text, and here print_string differs from a forced equate. For example:


    print_string(str, 5, 2.7)
   

sets str to “52.7”, whereas


    str =! { 5, 2.7 }
   

gives something illegible (the raw bytes encoding the two numbers in binary format).

If the first argument is numeric, then it is taken as the minimum field width for numeric and Boolean (but not string or character) variables to be printed; otherwise the default minimum field width is zero. If both the first and second arguments are numeric, then the second argument is the output precision for floating-point variables; otherwise the output precision is determined by the C constant DBL_DIG for double-typed variables. When no precision is specified, print_string prints considerably more digits than does print(), whose precision is set by printFloatFormatString at the top of cmpile.c.
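
Since print_string() is compared to C's sprintf(), the field-width and precision arguments can be pictured with the C sketch below. It is only an analogy: the function name format_pair is hypothetical, and the choice of the %d and %g conversions is an assumption rather than a statement about Cicada's internal format strings.

```c
#include <float.h>
#include <stdio.h>

/* Writes an integer followed by a double into dst with no separator.
   field_width is the minimum width applied to each number; a negative
   precision means "not specified", in which case DBL_DIG is used. */
int format_pair(char *dst, size_t dst_size, int field_width, int precision,
                int i, double d)
{
    if (precision < 0) precision = DBL_DIG;

    return snprintf(dst, dst_size, "%*d%*.*g",
                    field_width, i, field_width, precision, d);
}
```

Called with a field width of 0 and no precision, the values 5 and 2.7 produce the four characters 52.7, the same text as in the print_string(str, 5, 2.7) example; a field width of 6 pads each number on the left with spaces.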


random()

syntax: (numeric) y = random()

Returns a pseudo-random number uniformly drawn on the interval [0, 1]. To obtain the random number to double-precision, Cicada calls C’s rand() function twice:


   random() = rand()/RAND_MAX + rand()/(RAND_MAX)^2

The random number generator is initialized by Cicada to the current clock time each time the program is run, so the generated sequence should not be repeatable.
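
The two-call construction can be sketched in C as follows. The function names are hypothetical and the code simply follows the formula above; it is not taken from Cicada's source.

```c
#include <stdlib.h>
#include <time.h>

/* Seeds the C generator from the clock, as the text says Cicada does at startup. */
void seed_from_clock(void)
{
    srand((unsigned) time(NULL));
}

/* Combines two rand() calls: the first supplies the coarse digits, the second
   fills in finer digits that a single call cannot resolve. The casts to double
   matter: integer division would give 0 for almost every draw. */
double random_sketch(void)
{
    double coarse = (double) rand() / RAND_MAX;
    double fine = (double) rand() / ((double) RAND_MAX * RAND_MAX);

    return coarse + fine;
}
```

A script-level call to random() would then correspond to one call of random_sketch() after a single seed_from_clock() at program start.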


read_string()

syntax: read_string((string) to_read, (vars) v1, v2, ...)

Reads data from an ASCII string into variables. The first argument is the string to read from; following arguments give the variables that will store the data. read_string() is the humble cousin to C’s sscanf() routine (it does not take a format string). The various fields within the string must be separated by white space or end-of-line characters.

read_string() converts ASCII data in the source string into the binary format of Cicada’s memory. Thus numeric fields in the source string need to be written out as text, as in “3.14”. Each string field must be one written word long, so “the quick brown” will be read into three string variables, not one. Composite variables are decomposed into their primitive components, which are read sequentially from the source string. Void members are skipped.

Here is an example of the use of read_string():


    date :: { month :: string, day :: year :: int }
    activity :: string
    read_string("Jan 5 2007  meeting", date, activity)
   

If the string cannot be read into the given variables (i.e. there are too many or too few variables to read), then read_string() throws a type-mismatch warning. Warnings can also be thrown if read_string() cannot read a field that should be numeric, or if there is an overflow in a numeric field.

read_string() is a counterpart to print_string(). However, print_string() does not write spaces in between the fields, so unless spaces are put in explicitly its output cannot be read directly by read_string().
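
For readers who know C, the date example above behaves roughly like the sscanf() call in the sketch below. This is only an analogy under stated assumptions: the struct, the function name parse_fields and the fixed buffer sizes are invented for the illustration, and Cicada itself takes no format string.

```c
#include <stdio.h>

/* Stand-in for the Cicada variables date { month, day, year } and activity. */
typedef struct {
    char month[16];
    int day;
    int year;
    char activity[32];
} parsed_fields;

/* Reads four whitespace-separated fields from src.
   Returns the number of fields converted; 4 means every field was read. */
int parse_fields(const char *src, parsed_fields *out)
{
    return sscanf(src, "%15s %d %d %31s",
                  out->month, &out->day, &out->year, out->activity);
}
```

A return value below 4 plays the role of the warning that read_string() throws when a field is missing or a numeric field cannot be read.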


save()

syntax: save((strings) file_name, data_to_write)

Saves the data from the second argument into the file specified in the first argument. There is no return value, although the error “I/O error” will be thrown if the save is unsuccessful. (An error would likely indicate a bad pathname, disk full, or that we don’t have write permissions for that file or directory). If the directory is not explicitly written before the file name, as in “/Library/my_file”, then the file is saved in the default directory, which is probably the Cicada directory.

There is no need for the data to be encoded in ASCII format, even though it gets passed to save() as a string. In-line conversion to the proper string type can be done in the following way:


    save("my_data", (temp_str :: string) =! the_data)
   

where the_data may be a variable or array or any other object. save() writes the data verbatim; if the data is ASCII text, then a text file will be produced; otherwise the output should be considered a binary file. The saved data can be read back into a string using the load() function.


setCompiler()

syntax: (numeric) compilerID = setCompiler([(numeric) compilerIDtoUse])

or: (numeric) compilerID = setCompiler(array of { (string) commandString, (int) precedence, (string) rtrnTypes, (string) translation }, (int array) opLevelDirections[])

Optionally sets the active compiler, and returns the ID number of the active compiler. To use a particular compiler, pass its ID number as the first argument. To simply find the ID number of the current compiler, we can just call setCompiler() with no arguments. To create a new compiler we call setCompiler() with two arguments: 1) an array of { string, int, string, string }, one element for each command, containing the definitions of each command, and 2) an array giving the direction in which to evaluate each order-of-operations level. These mirror the cicadaLanguage[] and cicadaLanguageAssociativity[] arrays, respectively, which are defined in cicada.c.


sin()

syntax: (numeric) y = sin((numeric) x)

Returns the sine of its argument, which must be numeric.


size()

syntax: (numeric) var_size = size((var) my_var [, (bool) physicalSize])

Returns the size, in bytes, of the first argument. For composite variables, this is the sum of the sizes of all its members. If two members of a composite variable point to the same data (i.e. one is an alias of the other), then that data will be double-counted unless the optional second argument is set to true (its default value is false). If the second argument is false then size() returns the number of bytes that will participate in, for example, a forced equate or save(), which may be more than the number of bytes of actual storage, which is what size() returns when the second argument is true.

If a member points back to the composite variable, as in


    a :: {
       self := @this
       data :: int   }
   
    size(a)  | will cause an error
   

then the size of a, including its members and its members’ members, etc., is effectively infinite, and Cicada throws a self-reference error unless the second argument was set to true.


springCleaning()

syntax: springCleaning()

This function removes all unused objects from Cicada’s memory, in order to free up memory. An object is termed ‘unused’ if it cannot be accessed by the user in any way. For example, if we remove the only member to a function then that function’s internal data can never be accessed unless it is currently running.

Cicada tries to free memory automatically, but unfortunately it is not always able to do so. (The reason is self-referencing loops between objects in memory.) The only way to eliminate these zombies is to comb the whole memory tree, which is what springCleaning() does. When Cicada is run from the command prompt, it disinfects itself with a springCleaning() after every command from the user. But we might want to scrub the memory more often if we are running a lengthy, memory-intensive script that allocates and removes memory frequently. springCleaning() can help unjam arrays, if there is no member leading to the jam.


tan()

syntax: (numeric) y = tan((numeric) x)

Returns the tangent of its (numeric) argument.


throw()

syntax: throw((numeric) error_code [, (composite) error_script [, (numeric) concat_number [, error_index [, (Boolean) if_warning]]]])

Causes an error to occur. This of course stops execution and throws Cicada back to the last enclosing trap() function; if there is none then Cicada either prints an error (if run from the command line) or bails out completely. The first argument is the error code to throw -- these are listed in Table 5. The optional second, third and fourth arguments allow one to specify the function, the part of the function (should be 1 unless the inheritance operator was used) and the bytecode word in that function the error appears to come from. If one sets the optional fifth argument to true, then the error will be thrown as a warning instead. All arguments may be skipped with a ‘*’.

Although all real errors have error codes in the range 1-50, throw() works perfectly well for larger (or smaller) error codes that Cicada has never heard of. It can be hard to tell when throw() is working. For starters, if the error code is zero then it will appear that throw() is not doing its job, just because 0 is code for ‘no error’. throw() does require that the error code be zero or positive, so it gives a number-out-of-range error if the argument is negative. However, the following also gives a range error:


    throw(2)
   

In this case throw() actually worked: we got an out-of-range error because that is error #2. (That once caused the author some confusion.)


top()

syntax: (numeric) vartop = top((composite variable) my_var)

Returns the number of indices of the argument variable. The argument must be a composite variable or equivalent (e.g. set, function, class, etc.). top() does not count hidden members. Therefore the value it returns corresponds to the highest index of the variable that can be accessed, so


    my_var[top(my_var)]
   

is legal (unless the top member is void) whereas


    my_var[top(my_var) + 1]
   

is always illegal (unless we are in the process of defining it). Notice that in both of these cases we can replace the top() function by the top keyword, which is always defined inside of array brackets: e.g. my_var[top+1].


transform()

syntax: (composite) target_function = transform((string) bytecode [, (composite) target_function [, (composite) code_path [, (string) file_name [, (string) script [, (string) char_positions]]]]])

Copies compiled bytecode stored as a string (1st argument) into the internal code of a target function variable (return value and/or 2nd argument), without running the code’s constructor. The bytecode is typically generated using the compile() function:


    newFunction :: transform(compile("toAdd := 2; return args[1]+toAdd"))
   

but it is also possible to write the bytecode by hand. This probably won’t work -- the member IDs depend on your workspace history -- but the code looks something like:


    newFunction :: {}
    (newBytecode :: string) =! { 8, 47, 10, 314, 54, 2, 4, &
                         5, 8, 237, 10, -999, 27, 12, 40, 54, 1, 10, 314, 0 }
    transform(newBytecode, newFunction)
   

At this point it is as if we had written


    newFunction :: { toAdd := 2; return args[1]+toAdd }
   

We can now execute the new code by running the target function.


    newFunction(3)       | will return 5
   

When we define a function as the return value of transform(), as in the previous example, the constructor runs automatically. If we don’t want this to happen, we should pass in a target function as the second argument of transform(). If a non-void function appears here, then that function’s existing codes are erased and replaced by the transformed code (assuming no error) without running the constructor.

The default search path for the transformed code is the same search path used by the function that called transform(), but we can replace this default with a manually-constructed path by passing a set of variables as the optional 3rd argument. For example


    A :: B :: C :: { D :: {} }
    transform(newBytecode, newFunction, { A, C.D, B })
   

causes newFunction()’s search path to go from newFunction to A to C.D and finally end at B.

The optional fourth, fifth and sixth arguments help Cicada to give helpful error messages if the new code crashes when we try to run it. The fourth argument is just the name of the file containing the script, if applicable (otherwise set it to the void). The fifth argument is the original ASCII text of the script, and the sixth is the mapping between bytecode words and script characters that is an optional output of compile(). Here is how we pass all of this information between compile() and transform():


    fileName := "scriptFile.cicada"
    myScript := load(fileName)
    opPositions :: string
   
    scriptBytecode := compile(myScript, fileName, opPositions)
   
    newFunction :: {}
    transform(scriptBytecode, newFunction, { }, fileName, myScript, opPositions)
   

It is certainly possible to pass bogus bytecode to transform() (particularly if we’re trying to write out the binary ourselves). transform() checks the bytecode’s syntax, and if there is a problem then it crashes out with an error message.


trap()

syntax: (numeric) error_code = trap([code to run])

Runs the code inside the parentheses (i.e. its argument), and returns any error value. Error codes are listed in Table 5. No code marker is needed within a trap() call. Upon error, the argument stops running and the error code is returned; if the argument finishes with no error then the return value is 0. trap() thus prevents a piece of dubious code from crashing a larger script. Note that the most egregious errors are compile-time errors and trap() will not be able to prevent those -- this includes some type-mismatch errors like trap(string = 4).

A trap() call can optionally print out an error message if needed. To do this we add a semicolon (or code marker) immediately at the beginning of its arguments. A second opening semicolon causes it to print any error message without clearing the error -- so code execution will then fall back to the next enclosing trap() and possibly print another message. This can help to trace errors through multiple nested functions.


    trap((a::*) = 2)                  | prevents a crash
    errCode := trap((a::*) = 2)       | returns the type-mismatch error code
    trap( ; (a::*) = 2)               | prints a type-mismatch error but doesn't cause crash
    trap( ; ; (a::*) = 2)             | prints a type-mismatch error, then crashes out
   

Notice that trap() will also print warning messages (minor errors that don’t stop the program). Warning codes are the same as error codes except that they are negated: for example an out-of-range error will return error code 2, but an out of range warning will return -2. If several warnings have been produced, trap() will only print and return the error code for the last one.

The trap() function has the unique ability to run its arguments in whatever function called trap(), rather than in a private argument variable used by all other built-in and user-defined functions. So variables which are defined within the trap() argument list will be accessible to the rest of the function. Also this and parent have the same meaning inside a trap() command as outside of it.


type()

syntax: (numeric) theType = type((composite variable) var [, (numeric) memberNumber])

Returns a number representing the type of the given variable (one argument) or one of its members (if there is a second argument). The variable is the first argument, and the member number is the second argument. The type IDs are listed in Table 1. A composite-typed variable or member only returns a ‘5’ even though its full type is properly determined by its code list -- use the bytecode() function to obtain the code list.




Last update: May 8, 2024