Guide to GAUSS Programming - a basic introduction


 
Procedures

Procedures are short self-contained blocks of code. When they are called by the program, the chain of command within the program switches to the procedure; when the procedure has completed all its operations, control returns to the main program. A number of procedures have already been encountered: READR, WRITER, DELIF, DET, ONES, and so on. This section discusses how procedures are written and work.

A procedure works in just the same way as code in the main program. So why bother with them? For a number of reasons, of which the main ones are:
  • Tidiness. An excessively large and complicated program may be difficult to read, understand, and alter. If the program is broken into separate sections with meaningful procedure names, it becomes much more manageable. Alternatively, there may be a piece of code which carries out some minor function. Placing this code in a procedure allows the programmer to concentrate on the main points of the program.
  • Repetitive operations. Some functions are used in many places; for example, the READR operation, or SEQA which creates ordered vectors. The choice is between explicitly programming the same operation several times, or writing a procedure and calling it several times; usually the latter wins hands down.
  • Security. Because the way a procedure interacts with the rest of the environment can be more strictly controlled, procedures are often easier to test and less susceptible to unexpected influences.
The main disadvantages of procedures are the associated efficiency loss and the extra memory usage. The first is due to the overhead of setting up subroutines and variables, although GAUSS seems to manage this relatively well. The second is largely due to the need to take copies of variables, and it is the programmer's responsibility to minimise this.

Before the details of writing procedures we require a short digression on variable visibility.


1 Scope rules and variable life

A variable always has a certain scope: the domain in which it is visible (accessible) to parts of a program. All of the variables considered so far have been global: they are visible to all parts of the program. Procedures allow the use of local variables: they can only be seen within the ambit of the procedure. Anything outside that procedure cannot read or access those variables; as far as the program outside the procedure goes, that variable does not exist.

Local variables are only visible at the level at which they were declared. Procedures may be nested: one procedure may call another. However, local variables are only visible to the procedure in which they were declared: they are not visible to the procedures it calls or to the code that called it. For example, suppose a program uses the following variables:

Part of program   Called by      Variables declared   Variables visible
main program      -              mVar1, mVar2         mVar1, mVar2
procedure P1      main program   p1Var1, p1Var2       mVar1, mVar2, p1Var1, p1Var2
procedure P2      procedure P1   p2Var1, p2Var2       mVar1, mVar2, p2Var1, p2Var2


Although P1 calls P2, variables local to P1 are not available to the subsidiary procedure P2.

Because procedures cannot see the variables created by other procedures, variables with the same name can be used in any number of procedures. If, however, variable names do conflict (a global variable has the same name as a local variable), then the local variable always takes precedence. If procedure P1 above had declared a local variable called "mVar1", then any reference to mVar1 inside the procedure would be deemed to refer to the local mVar1, and the global mVar1 would be hidden from P1 for as long as it runs.

Local variables only exist for the life of the procedure; once the procedure is completed and control returns to the calling code, all variables local to that procedure will be deleted from memory. If the procedure is called again, the local variables will be a completely new set, not the set that was used last time the procedure was called. Obviously, local variables always start off uninitialised.

Global variables cannot be declared inside a procedure. They may be used, their size may be changed, but they may not be declared afresh. Any variable which is used in a procedure must be either declared explicitly as a local variable or be a preexisting global variable.
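The precedence rule can be seen in a small sketch. The procedure name Shadow and the variables x and y are hypothetical and chosen only for illustration:

```gauss
x = 5;                      /* global x, visible everywhere */

PROC (1) = Shadow (a);
    LOCAL x;                /* local x hides the global x inside Shadow */
    x = a * 2;              /* assigns to the local x only */
    RETP (x);
ENDP;

y = Shadow(10);
PRINT x;                    /* the global x still holds 5 */
PRINT y;                    /* y holds 20, the value of the local x when Shadow returned */
```

Once Shadow returns, its local x has been deleted; the only x left is the untouched global.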

2 Writing procedures

A procedure contains five parts: the declaration of the procedure; the declaration of local variables; the body of the code; the statement of which variables are to be returned; and a closing statement:

PROC (numRets) = ProcName ( inParam1, inParam2,... inParamN);

LOCAL locVar1;
:
LOCAL locVarN;


instruction1;
instruction2;
:
instructionN;


RETP (outParam1, outParam2, ... outParamN);

ENDP;

As for the other control statements, this spacing and indentation is not necessary. The important bits are the order of the various elements and the location of the semi-colons.

2.1 The procedure declaration

The first element tells GAUSS that the procedure can be referred to as ProcName, that it will return numRets variables to the bit of code which called the procedure, and that it requires a number of pieces of information from the calling code: inParam1 to inParamN. GAUSS will check numRets against the number of variables actually being returned to the calling code and produce an error message if the two do not match. It will not check that the variables are the right sort of vector, matrix, etcetera. If you have just one return value, you can omit numRets but I leave it in for consistency.

These input parameters are variables which can be used like any other. They are copies of the variables with which the procedure was called. Therefore they can be altered in any way inside the procedure and this will have no effect on the original variables. This is equivalent to taking a photocopy of a piece of paper. The copy, originally an exact one, can be left untouched, drawn upon, made into an aeroplane - whatever its owner wants. The original is unaffected by the adventures of the copy.

This is part of the security issue raised earlier. The calling code can pass a variable to a procedure as a parameter, confident that the procedure will not alter its value. Of course, this is not guaranteed. If the procedure is called from the main program, then the variables used will be global and thus visible inside the procedure, where they could be changed directly. Thus procedures should, where possible, only make reference to input parameters and local variables. Besides, testing of the procedure is easier if it is a self-contained unit.
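As a minimal sketch of the "photocopy" behaviour, the following hypothetical procedure AddOne alters its parameter freely; the names AddOne, a, and b are illustrative only:

```gauss
PROC (1) = AddOne (m);
    m = m + 1;              /* alters only the procedure's own copy */
    RETP (m);
ENDP;

a = ONES(2, 2);
b = AddOne(a);
/* a is still a 2x2 matrix of ones; b is a 2x2 matrix of twos */
```

Had AddOne assigned to the global a directly instead of to its parameter m, the original would have been changed; this is the case the paragraph above warns against.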

2.2 Local variable declarations

Local variables are declared using the LOCAL statement. Any variables used in the procedure which are not input parameters or global variables must be declared here. Variables can be defined in two ways:

LOCAL x;
LOCAL y;
LOCAL z;

or

LOCAL x, y, z;

Note that there is no information about the size or type of the variable here. All this statement says is that there are variables x, y, and z which will be accessed during this procedure, and that GAUSS should add their names to the list of valid names while this procedure is running.

LET statements are legal in a procedure, once the variables have been identified as local, global, or parameter. However, DECLARE statements should not be used as these are for a different sort of initialisation.

2.3 Procedure code

The main body of the procedure can contain exactly the same instructions as any other section of code, with the obvious exception that procedures cannot be defined within another procedure. However, a procedure can call other procedures; the only effective limit to the number of nested procedure calls is the amount of memory available.

2.4 Return values

When the workings of the procedure are finished, the final action is to return any output parameters to the calling code. These can be of any type; GAUSS will not check. Nor will its compiler warn if the number of returns is not equal to numRets in the procedure declaration. GAUSS will only report an error when the procedure is actually called during a program run, so a program may run for a considerable time before an error in the number of returns is discovered.

The RETP statement is followed by a list of output parameters. These parameters can be any of the variables used, although returning global variables is clearly a remarkably foolish thing to do. If the aim of the procedure is to take a variable as an input parameter, alter it, and then return it, then it must also be included in the output parameter list (as the input parameters are only copies of the original variables).

If there is no value to be returned, then the RETP statement can be omitted. The procedure can have several RETPs; however, this is not recommended for the same reasons that multiple END statements are a poor idea: they confuse the flow of control, and rarely lead to more efficient programs. A RETP will usually be the penultimate line of the procedure.
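A short sketch of a procedure with two return values follows. MeanSd and its variable names are hypothetical; MEANC, STDC, and RNDN are standard GAUSS functions. Note that the number of items in RETP matches the 2 in the declaration, and that the caller collects the results in braces:

```gauss
PROC (2) = MeanSd (x);
    LOCAL m, s;

    m = MEANC(x);           /* column means */
    s = STDC(x);            /* column standard deviations */

    RETP (m, s);            /* two returns, matching PROC (2) */
ENDP;

{ mu, sigma } = MeanSd(RNDN(100, 1));
```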


2.5 Finishing the definition: ENDP

The statement ENDP tells GAUSS that the definition of the procedure is finished. GAUSS then adds the procedure to its list of symbols. It does not do anything with the code, because a procedure does not, in itself, generate any executable code. A procedure only "exists" in any meaningful sense when it is called; otherwise it is just a definition. Consider a procedure which is not called during a particular run of a program. Then that procedure could have contained any code statements and it would have made no difference whatsoever to the running of the program; for all intents and purposes, that procedure was completely ignored and might as well have been just another unused variable. This is why local variables have no existence outside their procedure: accessing variables local to a procedure that was never called is equivalent to being the child of parents who never existed.

2.6 Example

Consider first this simple procedure to take a column vector and fill it with ascending numbers. The start number and increment are given as parameters. This mimics the action of the standard function SEQA:

PROC (1) = FillVec (inVec, startNum, step);

    LOCAL i;
    LOCAL nRows;

    nRows = ROWS (inVec);
    inVec[1] = startNum;
    i = 2;                  /* element 1 is already set, so start from 2 */
    DO WHILE i <= nRows;
        inVec[i] = inVec[i-1] + step;
        i = i + 1;
    ENDO;

    RETP (inVec);

ENDP;

This procedure could be called by, for example,

    :
sequence = FillVec (ZEROS(10, 1), 10, 10);
    :

which would give a 10x1 vector counting to one hundred in tens.

In this case, even though the parameters are variables within the procedure, they were created using constants. This is due to the fact that parameters are copies of the variables passed to the procedure. In the above example, GAUSS calculated the results of the ZEROS operation; created three new variables, "inVec", "startNum", and "step", which have no further connection to the original values ZEROS(..), 10, 10; and then made these new variables visible to FillVec, and FillVec only. Thus to concatenate an index vector onto an existing matrix, a program could use

temp = FillVec (mat[.,1], 1, 1);
mat = mat ~ temp;

or, equivalently and without needing an extra variable,

mat = mat ~ FillVec(mat[.,1], 1, 1);

The column of mat used as the input vector is irrelevant; it will not be altered by the procedure call.
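For comparison, the same index column can be built with the standard function SEQA, which takes the start value, the increment, and the number of elements. This is a sketch that assumes mat already exists:

```gauss
/* append a column 1, 2, ..., ROWS(mat) to an existing matrix mat */
mat = mat ~ SEQA(1, 1, ROWS(mat));
```

Here no input vector is needed at all, because SEQA is told the length directly rather than inferring it from a parameter.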

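Note that when a procedure returns a single result, it can be treated like the result of any other operation. Thus, given a 50x1 vector iVec, a valid command could be

result = SQRT((FillVec(iVec, 50, 1)
                .*FillVec(iVec, 50, -1))'*ONES(50, 1));

The transpose makes the element-by-element product a 1x50 row vector, so that multiplying it by ONES(50, 1) is conformable and sums the fifty products.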

For a second example, consider a procedure which, given a GAUSS dataset handle, reads a number of lines or returns an end-of-file message:

PROC (2) = Extract (handle, numLines);

    LOCAL currRow;
    LOCAL readOkay;
    LOCAL data;

    currRow = SEEKR (handle, -1);       /* -1 returns the current row without moving */
    IF (currRow+numLines-1) > ROWSF(handle);
        readOkay = 0;                   /* not enough rows left */
        CLEAR data;
    ELSE;
        readOkay = 1;
        data = READR (handle, numLines);
    ENDIF;

    RETP (readOkay, data);

ENDP;

Note the need to CLEAR data: if we did not assign some value to data (in this case, 0) before we returned from the procedure, then GAUSS would report an error arising from an uninitialised variable.

This procedure could be then used:

{readOkay, data} = Extract (handle, 16);
IF NOT readOkay;
PRINT "Run out of data";
ELSE;
...

In this case all the variables in the procedure have the same name as in the calling code. This does not matter. The variables that Extract uses will be the local variables or the parameter copies. The procedure in turn calls the procedures SEEKR, ROWSF, and READR. However, none of the variables that Extract uses will be visible to any of these procedures except as parameters. Thus Extract will take a copy of "handle" and "numLines" and use the copies for its own use. It then calls READR with these two copies as input parameters, and READR will take its own copies of these. Thus, by the time the program gets to the level of READR's code, there will be the original variable "handle" and two copies of it lying around in memory, each being accessed by a different "layer" of the program.
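A fuller calling sketch might process a dataset in blocks of 16 rows until Extract reports that too few rows remain. The dataset name mydata is hypothetical, and the sketch does not check whether OPEN succeeded:

```gauss
OPEN handle = mydata;                   /* "mydata" is a hypothetical dataset */

{ readOkay, data } = Extract(handle, 16);
DO WHILE readOkay;
    PRINT MEANC(data)';                 /* e.g. column means of this block */
    { readOkay, data } = Extract(handle, 16);
ENDO;

handle = CLOSE(handle);
```

Any final block of fewer than 16 rows is left unread, since Extract only reads when a complete block is available.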

3 Procedures as variables

An extremely useful feature of GAUSS is the ability to pass procedures as variables to other procedures. For example,

PROC (1) = Sign (mat, procVar);

    LOCAL procVar: proc;
    LOCAL temp;

    temp = procVar(mat);
    IF temp < 0;
        temp = "negative";
    ELSE;
        temp = "non-negative";
    ENDIF;

    RETP (temp);

ENDP;

This procedure takes a procedure variable called procVar and a matrix mat as parameters. We need to declare in the procedure body that procVar is a procedure (by the LOCAL procVar: proc; statement) so that GAUSS will realise this is a procedure and not another matrix or string.

Having done that, we can then use procVar within the procedure as if it were a proper procedure, even though we have no idea what the procedure is. All we require is that procVar takes one input parameter and returns one numeric scalar.

To use this, we need to call it with a reference to the relevant function. We do this by putting an ampersand & in front of the function name.

To continue this example, we could call the above procedure thus:

v = someVector;
PRINT "The sign of the largest number is " Sign(v, &Max_mat);
PRINT "The sign of the smallest number is " Sign(v, &Min_mat);

assuming the procedures Max_mat and Min_mat have been defined as taking a vector input and producing a scalar output. So calling any one of these functions with a vector parameter satisfies the requirements of the procedure variable procVar.
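The text leaves Max_mat and Min_mat undefined. One possible pair of definitions, offered only as an assumption about what they might look like, uses the standard column functions MAXC and MINC:

```gauss
/* largest element of a vector or matrix, as a scalar */
PROC (1) = Max_mat (x);
    RETP (MAXC(MAXC(x)));   /* inner call: column maxima; outer call: their maximum */
ENDP;

/* smallest element of a vector or matrix, as a scalar */
PROC (1) = Min_mat (x);
    RETP (MINC(MINC(x)));
ENDP;
```

Both take one input and return one numeric scalar, which is all that Sign requires of procVar.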

GAUSS does not allow GAUSS reserved words (such as MINC) to be used as procedure variables, although some standard GAUSS procedures can be used. The list of proscribed procedures can be found in the User Guide - Reserved Words Appendix. Anything not in there can be used as a procedure variable.

These are trivial examples, but third-party products make extensive use of procedure variables - this is how they can supply generic optimisation routines while you just supply functions and derivatives. If you plan to use these add-on packages, it is worthwhile practising using procedure variables.

4 Functions and keywords

Functions are one-line procedures which return a single value. They are defined slightly differently:

FN fnName(inParam1,... inParamN) = someCode;

but otherwise operate in much the same way as procedures. However, the code in a function can only be one line, and functions do not have local variables. Thus functions can be neater than procedures for defining simple repetitive tasks, but apart from that they offer no real benefits.
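As an illustration, a function that cubes each element of its argument could be written as below; the name Cube is hypothetical:

```gauss
FN Cube(x) = x .* x .* x;

y = Cube(3);                /* y holds 27 */
```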

Keywords take a single string as input and do not return any output. They can be useful for printing messages to the screen, for example. They are called slightly differently to procedures and functions, looking more like the PRINT function. They do allow for local variables and more than one line of code, so in that sense they are more flexible than functions. However, only taking a string as input restricts their value somewhat.
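A minimal keyword sketch follows. The name Banner is hypothetical, and the exact form in which GAUSS hands the text to the keyword (for example, its treatment of case) should be checked against the manual:

```gauss
KEYWORD Banner (str);
    LOCAL msg;

    msg = "*** " $+ str $+ " ***";      /* $+ joins strings */
    PRINT msg;
ENDP;

Banner Reading data;        /* everything after the name is passed as one string */
```

Note that the call has no parentheses: as with PRINT, the rest of the statement up to the semi-colon is the argument.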

In general, functions and keywords can simplify programs, but as they do nothing that procedures can't do, you can happily ignore them.
