[Cockcroft98] Sections 17.5-17.10

Functions, Procedures, and Programming Notes

SymbEL supports encapsulated, scoped blocks that can return a value. These blocks are referred to as functions and procedures for notational brevity. To complete our picture, we need to make some points about these constructs.

Function Return Types

So far, the only value that functions have returned in the examples has been a scalar type, but functions can return double or string as well. More complex types, covered in later sections, can also be returned.

Functions cannot return arrays because there is no syntactic accommodation for doing so. However, there is a way to get around this limitation for arrays of nonstructured types. See “Returning an Array of Nonstructured Type from a Function” on page 552.

Scope

Although variables may be declared local to a function, the default semantics for local variables are for them to be the C equivalent of static. Therefore, even though a local variable has an initialization part in a local scope, this initialization is not performed on each entry to the function. It is done once before the first call and never done again.

Initializing Variables

Variables can be initialized to values that are compatible with their declared type. This is the case for both simple and structured types. The only exceptional condition in initializing variables is the ability to initialize a global variable with a function call. This capability is supported, but use it with great care. In general, avoid it as bad practice.

Arrays can be given initial values through an aggregate initialization. The syntax is identical to that in C. For example:

int array[ARRAY_SIZE] = {
1, 2, 3, 4, 5, -1
};

The size of the array must be large enough to accommodate the aggregate initialization or the parser will flag it as an error.

Notes About Arrays and Strings

The type string is native to SymbEL and has no equivalent in C. Since string is an atomic type, it is incorrect to use the subscript operator to access individual characters in the string. For this reason, we need to be able to interchange values between variables that are of type string and char[].

Array Assignment

Although pointer types are not allowed in SymbEL, assignment of arrays is allowed provided that the size of the target variable is equal to or greater than the size of the source variable. This is not a pointer assignment, though. Consider it a value assignment where the values of the source array are being copied to the target array.

String Type and Character Arrays

Variables declared as type string and arrays of type char have interchangeable values to allow access to individual characters contained in a string. For example:

char tmp[8]; 
string s = "hello";
tmp=s;
if (tmp[0] == 'h') {
...
}

Accessing the individual characters in the string cannot be done with the s variable by itself. If a subscript were used, it would mean that the variable s was an array of strings, not an array of characters. After any modification to the variable tmp in this example is done, the value could be assigned back to s.

Assignment to String Variables

When a variable of type string is assigned a new value, the existing value of the variable is freed and a new copy of the source string is allocated and assigned to the variable. This is also the case when string variables are assigned the return value of a function that returns type string. See “Using Return Values from Attached Functions” on page 553.

Empty Arrays

When a function accepts an array as a parameter, it is not always convenient to send an array of the same size as the parameter. For this reason, the empty array declaration was added for use in parameter declarations. This is a notation where no subscript size is included in the declaration, just the [] suffix to the variable name. Here is an example.

print_it(int array[]) 
{
int i;
for(i=0; array[i] != -1; i++) {
printf("%d\n", array[i]);
}
}
main()
{
int array[6] = { 1, 2, 3, 4, 5, -1 };
print_it(array);
}

Upon entry to the function containing the empty array parameter, the parameter variable obtains a size. In this example, the array parameter is given a size of 24 bytes (6 elements * 4 bytes) upon entry to the function print_it. This size will change for every array passed as an actual parameter.

Recursion

Recursion is not supported. Direct recursion is flagged as an error by the parser. Indirect recursion is silently ignored. The disallowance of recursion is due to a problem in the run time that could not be overcome in the short term and may be fixed in a future release. Examine the following program.

one() 
{
// remember, initialization is only done once
int i = 0;
i++;
switch(i) {
case 1:
printf("Here I am\n");
break;
case 2:
printf("Here I am again\n");
return;
}
two();
}
two()
{
one();
}
main()
{
one();
}


It seems that the output of this program would be

Here I am 
Here I am again

but, in fact, only the first line will be printed out. The second call to one is detected by the run time, and a return from the function is performed before anything is done. SymbEL is not the place to do recursion. Again, if you feel like being tricky, don't be. It probably won't work.

Built-in Functions

SymbEL currently supports a limited set of built-in functions. As the need arises, more built-ins will be added. Many of these built-ins work the same as or similarly to the C library version. For a complete description of those functions, see the manual page for the C function. The current built-in functions are described below.

int fileno(int)

Send the return value of an fopen() or popen() to fileno() to retrieve the underlying file descriptor. This function works like the stdio macro, only it’s a built-in.

int fprintf(int, string, ...)

Print a formatted string onto the file defined by the first parameter. The man page for the fprintf C library function defines the use of this function in more detail.

int sizeof(...)

Return the size of the parameter. This parameter can be a variable, a numeric value, or an expression.

string itoa(int)

Convert an integer into a string.

string sprintf(string, ...)

Return a string containing the format and data specified by the parameters. This function is like the C library function in what it does, but it does not use a buffer parameter as the first argument. The buffer is returned instead. The function is otherwise like the C function.

struct prpsinfo_t first_proc(void)

In conjunction with the next_proc() function, traverse through all of the processes in the system. Some fields of the prpsinfo_t structure may not be filled in because of permissions. With root permissions, they are filled in completely.

struct prpsinfo_t get_proc(int)

Get a process by its process ID (instead of traversing all of the processes). The same rules regarding permissions apply to this function as to first_proc().

struct prpsinfo_t next_proc(void)

In conjunction with first_proc(), traverse through all of the processes on the system. When the pr_pid member of the prpsinfo_t structure is -1 after a return from this function, then all of the processes have been visited.

ulong kvm_address(VAR)

Return the kernel seek value for this variable. This function works only on variables designated as special kvm variables in the declaration.

ulong kvm_declare(string)

Declare a new kvm variable while the program is running. The return value is the kvm address of the variable, used as the second parameter to kvm_cvt.

void debug_off(void)

Turn debugging off until the next call to debug_on().

void debug_on(void)

Turn debugging on following this statement. Debugging information is printed out until the next call to debug_off().

void kvm_cvt(VAR, ulong)

Change the kernel seek value for this variable to the value specified by the second parameter.

void printf(string, ...)

Print a formatted string. Internally, a call to fflush() is made after every call to printf(). This call causes a write() system call; consider the effects of this call when writing SymbEL programs.

void signal(int, string);

Specify a signal catcher. The first parameter specifies the signal name according to the signal.se include file. The second parameter is the name of the SymbEL function to call upon receipt of the signal.

void struct_empty(STRUCT, ulong)

Dump the contents of the structure variable passed as the first parameter into the memory location specified by the second parameter. The binary value dumped will be in the same format as when used in a C program.

void struct_fill(STRUCT, ulong)

Copy the data at the memory location specified by the second parameter into the structure variable passed as the first parameter. This function allows C-structure-format data to be translated into the internal representation of structures used by SymbEL.

void syslog(int, fmt, ...)

Log a message through the syslog() facility. Note that the %m string must be sent as %%m because the interpreter passes the format string through vsprintf() before passing it to syslog() internally.

Dynamic Constants

The SymbEL interpreter deals with a few physical resources that have variable quantities on each computer on which the interpreter is run. These are the disks, network interfaces, CPUs, and devices that have an interrupt-counters structure associated with the device. It is often necessary to declare arrays that are bounded by the quantity of such a resource. When this is the case, a value is required that is sufficiently large to prevent subscripting errors when the script is running. This requirement is dealt with by means of dynamic constants. These constants can be used as integer values, and the interpreter views them as such. These dynamic constants are:

  • MAX_DISK — Maximum number of disk or disk-like resources

  • MAX_IF — Maximum number of network interfaces

  • MAX_CPU — Maximum number of CPUs

  • MAX_INTS — Maximum number of devices with interrupt counters

    These values are typically set to the number of discovered resources plus one. A single-CPU computer, for instance, will have a MAX_CPU value of 2. Run this script on your system and see what it says.

main() 
{
printf("MAX_DISK = %d\n", MAX_DISK);
printf("MAX_IF = %d\n", MAX_IF);
printf("MAX_CPU = %d\n", MAX_CPU);
printf("MAX_INTS = %d\n", MAX_INTS);
}

Attachable Functions

To ensure against the rampant effects of “creeping featurism” overtaking the size and complexity of the interpreter, a mechanism had to be devised so many procedures and functions could be “built in” without being a built-in.

The solution was to provide a syntactic remedy that defined a shared object that could be attached to the interpreter at run time. This declaration would include the names of functions contained in that shared object. Here is an example.

attach "libc.so" {
int puts(string s);
};
main()
{
puts("hello");
}

The attach statements are contained in se include files that mirror their C counterparts in /usr/include. The man page for fopen, for instance, specifies that the file stdio.h should be included to obtain its declaration. In SymbEL, the include file stdio.se is included to obtain the declaration inside an attach block.

Here are some rules governing the use of attached functions:

  • Only values that are four bytes long or less can be passed as parameters. No long long types or doubles.

  • Structures can be passed, but they are sent as pointers to structures.

    The equivalent C representation of the SymbEL structure can be declared, and the parameters should then be declared as pointers to that type. The structure pointer parameter can then be used as it normally would be used. Although the structure parameter in the C code is a pointer, it still is not a reference parameter, and any changes made will not be copied back to the SymbEL variable.

  • Attached functions declaring a structure type as their return value are treated as if the function returns a pointer to that type.

    There is no way to declare an attached function that returns a structure, i.e., not a pointer to a structure, but a structure. The value returned is converted from the C representation into the internal SymbEL representation. No additional code is needed to convert the return value.

    Note that attached functions returning pointers to structures sometimes return zero (a null pointer) to indicate error or end-of-file conditions. Such functions should be declared as returning ulong, and their return value compared to zero. If a non-zero value is returned, the struct_fill built-in can be used to fill a structure. If an attached function is declared to return a structure and it returns zero, then a null pointer exception occurs in the program and the interpreter exits.

  • No more than 12 parameters can be passed.

  • Arrays passed to attached functions are passed by reference. The call

    fgets(buf, sizeof(buf), stdin);

    does exactly what it is expected to do. The buf parameter will be filled in by fgets directly because the internal representation of an array of characters is, not surprisingly, an array of characters. These semantics include passing arrays of structures. The SymbEL structure will be emptied before passing and filled upon return when sent to attached functions.

  • The rules for finding the shared library in an attach statement are the same as those defined in the man page for ld.

Ellipsis Parameter

For attached functions only, you can use the ellipsis parameter (...) to specify that there are an indeterminate number and type of parameters to follow. Values passed up until the ellipsis argument are type checked, but everything after that is not type checked. The ellipsis parameter allows functions like sscanf to work and therefore makes the language more flexible. For instance, the program

attach "libc.so" {
int sscanf(string buf, string format, ...);
};
main()
{
string buf_to_parse = "hello 1:15.16 f";
char str[32];
int n;
int i;
double d;
char c;
n = sscanf(buf_to_parse, "%s %d:%lf %c", str, &i, &d, &c);
printf("Found %d values: %s %d:%5.2lf %c\n", n, str, i, d, c);
}

yields the output

"Found 4 values: hello 1:15.16 f"

Attached Variables

Global variables contained in shared objects can be declared within an attach block with the keyword extern before the declaration. This declaration causes values to be read from and written to the shared object's variable whenever the corresponding SymbEL variable is used in the execution of the program.

Here is an example of the declaration of getopt, with its global variables optind, opterr, and optarg, from the include file stdlib.se.

attach "libc.so" {
int getopt(int argc, string argv[], string optstring);
extern int optind;
extern int opterr;
extern string optarg;
};

This code works for all types, including structures.

Built-in Variables

Although extern variables can be attached with the extern notation, there are three very special cases of variables that cannot be attached this way. These variables are stdin, stdout, and stderr. These “variables” in C are actually #define directives in the stdio.h include file; they reference the addresses of structure members. Since the address of structures cannot be taken in SymbEL, there is no way to represent these so-called variables. They are, therefore, provided by the interpreter as built-in variables. They can be used without any declaration or include file usage.

Parameters to main and Its Return Value

In C programs, the programmer can declare main as accepting three parameters:

  • An argument count (usually argc)

  • An argument vector (usually argv)

  • An environment vector (usually envp)

Similarly, the SymbEL main function can be declared as accepting two of these parameters, argc and argv. Here is an example that uses these variables.

main(int argc, string argv[]) 
{
int i;
for(i=0; i<argc; i++) {
printf("argv[%d] = %s\n", i, argv[i]);
}
}

This example also demonstrates the use of an empty array declaration. When this program is run with the command

se test.se one two three four five six

the resulting output is

argv[0] = test.se 
argv[1] = one
argv[2] = two
argv[3] = three
argv[4] = four
argv[5] = five
argv[6] = six

It is not necessary to declare these parameters to main. If they are not declared, then the interpreter does not send any values for them.

It is also possible to declare main as being an integer function. Although the exit function can be used to exit the application with a specific code, the value can also be returned from main. In this case, the previous example would be:

int main(int argc, string argv[]) 
{
int i;
for(i=0; i<argc; i++) {
printf("argv[%d] = %s\n", i, argv[i]);
}
return 0;
}

The value returned by the return statement is the code that the interpreter exits with.

Structures

SymbEL supports the aggregate type struct, which is similar to the C variety, with some exceptions. An aggregate is a collection of potentially dissimilar objects collected into a single group. As it turns out, most of the SymbEL code developed will contain structures.

As an example, here is what a SymbEL password file entry might look like.

struct passwd {
string pw_name;
string pw_passwd;
long pw_uid;
long pw_gid;
string pw_age;
string pw_comment;
string pw_gecos;
string pw_dir;
string pw_shell;
};

The declaration of structure variables differs from C in that the word struct is left out of the variable declaration. So, to declare a variable of type struct passwd, only passwd would be used.

Accessing Structure Members

You access a structure member with dot notation. The first part of the variable is the variable name itself, followed by a dot and then the structure member in question. To access the pw_name member of the passwd structure above, the code could look like this.

main() 
{
passwd pwd;
pwd.pw_name = "richp";
...
}

Structure members can be any type, including other structures. A structure may not contain a member of its own type. If it does, the parser posts an error.

Arrays of Structures

Declaration of arrays of structures is the same as for any other type, with the provision stated in the previous paragraph. Notation for accessing members of an array of structures is name[expression].member.

Structure Assignment

The assignment operation is available to variables of the same structure type.

Structure Comparison

Comparison of variables of structure type is not supported.

Structures as Parameters

Variables of structure type can be passed as parameters. As with other parameters, they are passed by value so the target function can access its structure parameter as a local variable.

Arrays of structures passed to other SymbEL functions are also passed by value. This is not the case with passing arrays of structures to attached functions (see “Attachable Functions” on page 532).

Structures as Return Values of Functions

Functions can return structure values. Assigning a variable the value of the result of a function call that returns a structure is the same as a structure assignment between two variables.

Language Classes

The preceding sections have discussed the basic structure of SymbEL. The remainder of this chapter discusses the features that make SymbEL powerful as a language for extracting, analyzing, and manipulating data from the kernel.

When generalizing a capability, the next step after creation of a library is the development of a syntactic notation that represents the capability that the library provided. The capability in question here is the retrieval of data from the sources within the kernel that provide performance tuning data. SymbEL provides a solution to this problem through the use of predefined language classes that can be used to declare the type of a variable and to designate it as being a special variable. When a variable with this special designation is accessed, the data from the source that the variable represents is extracted and placed into the variable before it is evaluated.

There are four predefined language classes in SymbEL:

  • kvm — Access to any global kernel symbol

  • kstat — Access to any information provided by the kstat framework

  • mib — Read-only access to the MIB2 variables in the IP, ICMP, TCP, and UDP modules in the kernel

  • ndd — Access to variables provided by the IP, ICMP, TCP, UDP, and ARP modules in the kernel

Variables of these language classes have the same structure as any other variable. They can be a simple type or a structured type. What needs clarification in the declaration of the variable is

  • Whether the variable type is simple or structured

  • Whether the variable has a predefined language class attribute

The syntax selected for this capability defines the variable with a name whose prefix is the concatenation of the language class name and a dollar sign ($). This convention allows these prefixes for variables to denote their special status.

  • kvm$ (kvm language class)

  • kstat$ (kstat language class)

  • mib$ (mib language class)

  • ndd$ (ndd language class)

Examples of variables declared with a special attribute are:

ks_system_misc kstat$misc;    // structured type, kstat language class 
int kvm$maxusers; // simple type, kvm language class
mib2_ip_t mib$ip; // structured type, mib language class
ndd_tcp_t ndd$tcp; // structured type, ndd language class

When any of these variables appear in a statement, the values that the variables represent are retrieved from the respective source before the variable is evaluated. Variables declared of the same type but not possessing the special prefix are not evaluated in the same manner. For instance, the variable

ks_system_misc tmp_misc;  // structured type, no language class specified

can be accessed without any data being read from the kstat framework.

Variables that use a language class prefix in their name are called active variables. Those that do not are called inactive variables.

The kvm Language Class

Let’s look at an example of the use of a kvm variable.

main() 
{
int kvm$maxusers;
printf("maxusers is set to %d\n", kvm$maxusers);
}

In this example, there is a local variable of type int. The fact that it is an int is not exceptional. The fact that the name of the variable begins with kvm$ is exceptional. It is the kvm$ prefix that flags the interpreter to look up this value in the kernel via the kvm library. The actual name of the kernel variable is whatever follows the kvm$ prefix. The program need not take special action to read the value from the kernel. Simply accessing the variable by using it as a parameter to the printf() statement (in this example) causes the interpreter to read the value from the kernel and place it in the variable before sending the value to printf(). Use of kvm variables is somewhat limiting since the effective uid of se must be superuser or the effective gid must be sys in order to successfully use the kvm library.

In this example, the variable maxusers is a valid variable in the kernel and when accessed is read from the kernel address space. It is possible and legal to declare a kvm$ active variable with the name of a variable that is not in the kernel address space. The value will contain the original initialized value, and refreshing of this type of variable is futile because there is no actual value in the kernel. This technique is useful when dealing with pointers, though, and an example is included in “Using kvm Variables and Functions” on page 553.

The kstat Language Class

The use of kstat variables differs from the use of kvm variables in that all of the kstat types are defined in the header file kstat.se. All kstat variables must be structures because this is how they are defined in the header file. Declaration of an active kstat variable that is not a structure results in a semantic error. Declaration of an active kstat variable that is not of a type declared in the kstat.se header file results in the variable always containing zeros unless the program manually places something else in the variable. Here is an example of using kstat variables.

#include <kstat.se>
main()
{
ks_system_misc kstat$misc;
printf("This machine has %u CPU(s) in it.\n", kstat$misc.ncpus);
}

Just as in the kvm example, no explicit access need be done to retrieve the data from the kstat framework. The access to the member of the active ks_system_misc variable in the parameter list of printf() causes the member to be updated by the run time.

Multiple Instances

The kstat.se header file contains many structures that have information that is unique in nature. The ks_system_misc structure is an example.

The number of CPUs on the system is unique and does not change depending on something else. However, the activity of each of the individual CPUs does change, depending on which CPU is in question. This is also the case for network interfaces and disks. This situation is handled by the addition to structures of two members that contain data for devices that have multiple instances. These members are name$ and number$.

The name$ member contains the name of the device as supplied by kstat. The number$ member is a linear number representing the nth device of this type encountered. It is not the device instance number. This representation allows a for loop to be written such that all of the devices of a particular type can be traversed without the need to skip over instances that are not in the system. It is not unusual, for instance, for a multiprocessor machine to contain CPUs that do not have linear instance numbers. When traversing through all the devices, the program will encounter the end of the list when the number$ member contains a -1. Here is an example of searching through multiple disk instances.

#include <kstat.se>
main()
{
ks_disks kstat$disk;
printf("Disks currently seen by the system:\n");
for(kstat$disk.number$=0; kstat$disk.number$ != -1; kstat$disk.number$++)
{
printf("\t%s\n", kstat$disk.name$);
}
}


In this program, kstat$disk.number$ is set initially to zero. The “while part” of the loop is then run, checking the value of kstat$disk.number$ to see if it’s -1. That comparison causes the run time to verify that there is an nth disk. If there is, then the number$ member is left with its value and the body of the loop runs. When the run time evaluates the kstat$disk.name$ value in the printf() statement, it reads the name of the nth disk and places it in the name$ member, which is then sent to printf().

Other Points About kstat

Here are some points about how to best use kstat variables in a program.

Some of the values contained in the kstat structures are not immediately useful by themselves. For instance, the cpu member of the ks_cpu_sysinfo structure is an array of four unsigned longs representing the number of clock ticks that have occurred since system boot in each of the four CPU states: idle, user, kernel, and wait. This raw data must be processed further before it is useful.

If a program needs to access many members of a kstat variable, then it is in the best interest of the performance of the program and the system to copy the values into an inactive kstat variable by using a structure assignment. The single structure assignment causes all of the members of the structure to be read from the kstat framework with one read and then copied to the inactive variable. When these values are accessed through the inactive variable, no more reads from the kstat framework will be initiated. The net result is a reduction in the number of system calls being performed by the run time, and therefore se does not have a significant impact on the performance of the system. Here is an example.

Example kstat Program
#include <stdio.se>
#include <unistd.se>
#include <kstat.se>

main()
{
ks_cpu_sysinfo kstat$cpusys; // active kstat variable
ks_cpu_sysinfo tmp_cpusys; // inactive kstat variable
ks_system_misc kstat$misc; // active kstat variable
int ncpus = kstat$misc.ncpus; // grab it and save it
int old_ints[MAX_CPU];
int old_cs[MAX_CPU];
int ints;
int cs;
int i;

// initialize the old values
for(i=0; i<ncpus; i++) {
kstat$cpusys.number$ = i; // does not cause an update
tmp_cpusys = kstat$cpusys; // struct assignment, update performed
old_ints[i] = tmp_cpusys.intr; // no update, inactive variable
old_cs[i] = tmp_cpusys.pswitch; // no update, inactive variable
}
for(;;) {
sleep(1);
for(i=0; i<ncpus; i++) {
kstat$cpusys.number$ = i; // does not cause an update
tmp_cpusys = kstat$cpusys; // struct assignment, update performed
ints = tmp_cpusys.intr - old_ints[i];
cs = tmp_cpusys.pswitch - old_cs[i];

printf("CPU: %d cs/sec = %d int/sec = %d\n", i, cs, ints);

old_ints[i] = tmp_cpusys.intr;
old_cs[i] = tmp_cpusys.pswitch; // save old values
}
}
}


About the Program
ks_cpu_sysinfo kstat$cpusys;   // active kstat variable 
ks_cpu_sysinfo tmp_cpusys; // inactive kstat variable

This code is the declaration of the active and inactive variables. Use of the active variable causes the run time to read the values from the kstat framework for the ks_cpu_sysinfo structure. Later accesses to the inactive variable do not cause the reads to occur.

ks_system_misc kstat$misc;     // active kstat variable 
int ncpus = kstat$misc.ncpus; // grab it and save it

Since the ncpus variable will be used extensively, it is best to put the value into a variable that does not cause continual updates.

int old_ints[MAX_CPU]; 
int old_cs[MAX_CPU];

Since the program computes the rate at which interrupts and context switches are occurring, the values from the previous iteration need to be saved so they can be subtracted from the values of the current iteration. They are arrays bounded by the maximum number of CPUs available on a system.

// initialize the old values 
for(i=0; i<ncpus; i++) {
kstat$cpusys.number$ = i; // does not cause an update
tmp_cpusys = kstat$cpusys; // struct assignment, update performed
old_ints[i] = tmp_cpusys.intr; // no update, inactive variable
old_cs[i] = tmp_cpusys.pswitch; // no update, inactive variable
}

This code grabs the initial values that will be subtracted from the current values after the first sleep() is completed. For simplicity, no timers are kept, and it is assumed that only one second has elapsed between updates. In practice, the elapsed time would be computed.

for(i=0; i<ncpus; i++) {
kstat$cpusys.number$ = i; // does not cause an update
tmp_cpusys = kstat$cpusys; // struct assignment, update performed

Here, the number$ member is set to the CPU in question, and then the contents of the entire active structure variable are copied into the inactive structure variable. This coding causes only one system call to update the kstat variable.

ints = tmp_cpusys.intr - old_ints[i]; 
cs = tmp_cpusys.pswitch - old_cs[i];

printf("CPU: %d cs/sec = %d int/sec = %d\n", i, cs, ints);

old_ints[i] = tmp_cpusys.intr;
old_cs[i] = tmp_cpusys.pswitch; // save old values

This code computes the number of interrupts and context switches for the previous second and prints it out. The current values are then saved as the old values, and the loop continues.

Runtime Declaration of kstat Structures

The kstat framework is dynamic and contains information regarding devices attached to the system. These devices are built by Sun and by third-party manufacturers. The interpreter contains static definitions of many devices, and these definitions are mirrored by the kstat.se include file. However, it is unreasonable to assume that the interpreter will always contain all of the possible definitions for devices. To accommodate this situation, a syntactic element was needed. This is the kstat structure.

A kstat structure can define only KSTAT_TYPE_NAMED structures, which are the structures that define devices such as network interfaces. As an example, the following script prints out the values of a kstat structure that is not declared in the kstat.se file but has been part of the kstat framework since the very beginning.

kstat struct "kstat_types" ks_types {
ulong raw;
ulong "name=value";
ulong interrupt;
ulong "i/o";
ulong event_timer;
};
main()
{
ks_types kstat$t;
ks_types tmp = kstat$t;
printf("raw = %d\n", tmp.raw);
printf("name=value = %d\n", tmp.name_value);
printf("interrupt = %d\n", tmp.interrupt);
printf("i/o = %d\n", tmp.i_o);
printf("event_timer = %d\n", tmp.event_timer);
}

The kstat structure introduces a few new concepts:

  • The structure starts with the word "kstat" to denote its significance.

  • The structure also contains members that are quoted. Quoted members work only for kstat structures and do not work in an ordinary structure declaration. Quoted members enable programmers to declare variables that accurately reflect the name of the member within the kstat framework. For instance, the member "name=value" could not be declared without quotes since the parser would generate errors. When accessed in the printf() statement, special characters are translated to underscores. This is the case for any character that is recognized as a token and also for spaces. The characters that will be translated to underscores are:

    []{}()@|!#;:.,+*/=-><~%? \t\n\\^
  • Members of KSTAT_TYPE_NAMED structures sometimes have no name. This situation will also be correctly handled by the interpreter. Any member of a structure with the name "" is changed to missingN, where N starts at 1 and increments for each occurrence of a missing member name. A declaration of

    kstat struct "asleep" ks_zzzz {
    ulong ""; // translates into missing1
    };

    translates into

    kstat struct "asleep" ks_zzzz {
    ulong missing1;
    };

    for the purposes of the programmer. It is a good idea to document such declarations, as shown above.

  • Members with reserved words as names are also munged into another form—the prefix SYM_ is added to the name. For instance, this declaration

    kstat struct "unnecessary" ks_complexity {
    short "short";
    };

    is munged into

    kstat struct "unnecessary" ks_complexity {
    short SYM_short;
    };

    so you can continue.

  • The quoted string following the keyword struct in the declaration represents the name of the KSTAT_TYPE_NAMED structure in the kstat framework and is an algebra unto itself. First, an introduction.

    Each “link” in the kstat “chain” that composes the framework has three name elements: a module, an instance number, and a name. The "kstat_types" link, for instance, has the complete name "unix".0."kstat_types". "unix" is the module, 0 is the instance number, and "kstat_types" is the name. Here are the possible ways to specify the kstat name within this quoted string.

    • "kstat_types" — The “name” of the kstat.

    • "cpu_info:" — The “module” of the kstat. A link with the full name of "cpu_info".0."cpu_info0" would map onto this structure. However, so too would "cpu_info".1."cpu_info1", and this case brings up an issue. When a kstat structure is declared with a kstat module name, the first two members of the structure must be:

      long number$; 
      string name$;

      This requirement is in keeping with other kstat declarations with multiple instances. In the case of structures with multiple module names that have the same structure members, the list of names continues with colon separators, for example:

      kstat struct "ieef:el:elx:pcelx" ks_elx_network { ...
    • "*kmem_magazine" — The prefix of the name portion of the kstat. In the case of the kmem_magazines, the module name is always "unix", which is the module name of many other links that do not share the same structure members as the kmem_magazines. As is the case with specifying a module name, the number$ and name$ members must be present.
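The three forms of the quoted name can be summarized as a matching rule. The Python sketch below is one reading of the description above, not the interpreter's actual code; spec_matches is a hypothetical name, and the "elx0" and "kmem_magazine_1" link names are invented for illustration.

```python
def spec_matches(spec, module, name):
    """Return True if a kstat link (module, name) maps onto a declaration.

    Assumed reading of the three forms described in the text:
      "*prefix"       matches links whose name starts with prefix
      "mod:" or "a:b" matches links whose module is in the colon-separated list
      "name"          matches links whose name equals the string
    """
    if spec.startswith("*"):
        return name.startswith(spec[1:])
    if ":" in spec:
        modules = [m for m in spec.split(":") if m]
        return module in modules
    return name == spec


# (module, instance, name) triples; the last two names are hypothetical.
links = [
    ("unix", 0, "kstat_types"),
    ("cpu_info", 0, "cpu_info0"),
    ("cpu_info", 1, "cpu_info1"),
    ("elx", 0, "elx0"),
    ("unix", 0, "kmem_magazine_1"),
]

for spec in ("kstat_types", "cpu_info:", "ieef:el:elx:pcelx", "*kmem_magazine"):
    hits = ['"%s".%d."%s"' % link for link in links
            if spec_matches(spec, link[0], link[2])]
    print(spec, "->", hits)
```

The module and prefix forms can match more than one link, which is why those declarations need the number$ and name$ members to tell the instances apart.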

Note that when a dynamic kstat structure declaration replaces a static declaration inside of the interpreter, the old declaration is discarded and replaced with the new one. Therefore, if a kmem_magazine declaration were used to replace the "ks_cache" declaration from kstat.se, the only kstat links seen would be the kmem_magazine members and all of the other cache links (and there are a lot of them) would no longer be seen.
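The member-name translations listed earlier (special characters to underscores, empty names to missingN, reserved words to a SYM_ prefix) can be collected into one small function. This Python sketch only models the rules as the text states them; munge_members is a hypothetical name, and the RESERVED set is an assumed subset, since the text confirms only "short" as an example.

```python
# Characters the text lists as translated to underscores.
SPECIAL = set("[]{}()@|!#;:.,+*/=-><~%? \t\n\\^")

# Assumed, incomplete list of reserved words; only "short" is from the text.
RESERVED = {"short", "int", "long", "double"}


def munge_members(names):
    """Map kstat member names to the identifiers a script would use."""
    result = []
    missing = 0
    for name in names:
        if name == "":
            # Unnamed members are numbered from 1 in order of occurrence.
            missing += 1
            result.append("missing%d" % missing)
        elif name in RESERVED:
            result.append("SYM_" + name)
        else:
            result.append("".join("_" if ch in SPECIAL else ch for ch in name))
    return result


print(munge_members(["raw", "name=value", "i/o", "", "short", ""]))
```

This is the mapping that lets the earlier script print tmp.name_value and tmp.i_o for the members declared as "name=value" and "i/o".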

Adding New Disk Names

You can use an internal function, se_add_disk_name(string name), to add new disk names to the existing list internally. Therefore, if the tape drives and nfs mounts that are recorded in the KSTAT_TYPE_IO section of the kstat framework were to be added to the list of disks for display by any script that shows disk statistics, you could add these lines at the beginning of the script.

se_add_disk_name("st"); 
se_add_disk_name("nfs");

This function is declared in the se.se include file.

The mib Language Class

A lot of data regarding the network resides in the mib variables of the kernel. Unfortunately, these mib variables are not part of the kstat framework. Therefore, a new language class was created to facilitate access to this information.

Variables of the mib class have a unique feature in that they can be read, but assigning values generates a warning from the interpreter. This warning is to remind you that assigning values to the members of the mib2_* structures will not result in the information being placed back into the kernel. The mib variables are read-only.

mib variables do not have the permissions limitation of kvm variables. Any user can view mib variable values without special access permissions.

To view the mib information available from within SymbEL, run the command netstat -s from the command line. All but the IGMP information is available.

Since all mib variables are structures, the rules regarding structure assignment being used to cut down on the overhead of the interpreter are the same as for the kstat and kvm classes. Here is an example of using mib class variables.

#include  
main()
{
mib2_tcp_t mib$tcp;
printf("Retransmitted TCP segments = %u\n", mib$tcp.tcpRetransSegs);
}

The ndd Language Class

SunOS 5.x provides access to variables that define the operation of the network stack through a command called ndd (see ndd(1M)). The ndd language class within SymbEL provides access to the variables within the IP, ICMP, TCP, UDP, and ARP modules. The definitions of the available variables are in the ndd.se include file. For each module, there is a structure that contains all of the variables available for that module.

Some of these variables are read-write and others are read-only. If you try to modify a variable that is read-only, the interpreter posts a warning message. Some of the read-only variables are tables that can be quite large. Note that the largest table size that can be handled is 64 kilobytes (65,536 bytes). If an ndd variable is larger than 64 kilobytes, it is truncated.

Like kstat and mib variables, all ndd variables are structures.

The following program displays the tcp_status variable of the TCP module. This variable is type string and when printed looks like a large table.

#include  
#include
main()
{
ndd_tcp_t ndd$tcp;
puts(ndd$tcp.tcp_status);
}

User-Defined Classes

The four language classes provide a significant amount of data to a program for analysis. But the analysis of this data can become convoluted and make the program difficult to deal with. This is one of the problems that SymbEL hoped to clear up. Adding more language classes is a potential solution to this problem.

An example of an additional language class that would be useful is a vmstat class. This would be a structure that provided all of the information that the vmstat program provides. The problem is that such an addition would make se larger and provide functionality that didn't really require the internals of the interpreter to accomplish. All of what vmstat does can be done by a SymbEL program.

In addition to the vmstat class, it would be useful to have classes for iostat, mpstat, nfsstat, netstat, and any other “stat” program that provided this type of statistical information. What was needed to accomplish this task correctly was a language feature that allowed programmers to create their own language classes in SymbEL. This “user defined class” would be a structure and an associated block of code that was called whenever one of the members of the structure was accessed. This idea led to the development of the aggregate type class.

A class type is a structure with a block of code inside it. The block is first called when the block that contains the declaration of the class variable is entered. Thereafter, whenever a member of the class variable is accessed, the block is called. To illustrate the class construct, here is a program that continually displays how long a system has been up. The first example is without the use of a class.

#include  
#include
#include

#define MINUTES (60 * hz)
#define HOURS (60 * MINUTES)
#define DAYS (24 * HOURS)

main()
{
ulong ticks;
ulong days;
ulong hours;
ulong minutes;
ulong seconds;
ks_system_misc kstat$misc;
long hz = sysconf(_SC_CLK_TCK);

for(;;) {
ticks = kstat$misc.clk_intr;
days = ticks / DAYS;
ticks -= (days * DAYS);
hours = ticks / HOURS;
ticks -= (hours * HOURS);
minutes = ticks / MINUTES;
ticks -= (minutes * MINUTES);
seconds = ticks / hz;
printf("System up for: %4u days %2u hours %2u minutes %2u seconds\r",
days, hours, minutes, seconds);
fflush(stdout);
sleep(1);
}
}


This program continues in an infinite for loop, computing the uptime based on the number of clock ticks the system has received since boot. The computation is contained completely within the main program. This code can be distilled into a user-defined class, as the following code shows.
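As an aside, the tick arithmetic in the loop above can be checked in isolation. This Python sketch mirrors the MINUTES, HOURS, and DAYS macros; split_uptime is a hypothetical name, and the hz value of 100 used in the example is an assumption rather than something read from sysconf.

```python
def split_uptime(ticks, hz):
    """Break a clock-tick count into (days, hours, minutes, seconds)."""
    minute_ticks = 60 * hz           # MINUTES
    hour_ticks = 60 * minute_ticks   # HOURS
    day_ticks = 24 * hour_ticks      # DAYS

    # Each step peels off the largest unit and keeps the remainder,
    # as the SymbEL code does with "ticks -= (days * DAYS)" and so on.
    days, ticks = divmod(ticks, day_ticks)
    hours, ticks = divmod(ticks, hour_ticks)
    minutes, ticks = divmod(ticks, minute_ticks)
    seconds = ticks // hz            # leftover partial-second ticks are dropped
    return days, hours, minutes, seconds


# Hypothetical tick count for a system with a 100 Hz clock.
print("System up for: %4u days %2u hours %2u minutes %2u seconds"
      % split_uptime(18384537, 100))
```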

#include  
#include

#define MINUTES (60 * hz)
#define HOURS (60 * MINUTES)
#define DAYS (24 * HOURS)

class uptime {

ulong ticks;
ulong days;
ulong hours;
ulong minutes;
ulong seconds;

uptime$()
{
ks_system_misc kstat$misc;
long hz = sysconf(_SC_CLK_TCK);

ticks = kstat$misc.clk_intr; /* assign these values to the */
days = ticks / DAYS; /* class members */
ticks -= (days * DAYS);
hours = ticks / HOURS;
ticks -= (hours * HOURS);
minutes = ticks / MINUTES;
ticks -= (minutes * MINUTES);
seconds = ticks / hz;
}
};


The start of the class looks like a structure, but the final “member” of the structure is a block of code called the “class block.” The name used after the class keyword is the type name that will be used in the declaration of the variable. The name of the class block is the prefix used in variable names to denote that the variable is active. Variables declared in a user-defined class type that do not use the prefix in the variable name are inactive.

The main() function of the uptime program would now be written to use the uptime class as shown in this example.

#include  
#include
#include "uptime_class.se"

main()
{
uptime uptime$value;
uptime tmp_uptime;

for(;;) {
tmp_uptime = uptime$value;
printf("System up for: %4u days %2u hours %2u minutes %2u seconds\r",
tmp_uptime.days, tmp_uptime.hours,
tmp_uptime.minutes, tmp_uptime.seconds);
fflush(stdout);
sleep(1);
}
}

The previous section discussed how the assignment of entire structures cuts down on the overhead of the system because only one copy is required. Not only is this true here as well, but the structure copy also ensures that the data printed out represents the calculations of one snapshot in time, instead of printing different values for each time that the class block was called to update each member of the class that was used as a parameter to printf().
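The snapshot point is easy to demonstrate with a model. The Python sketch below imitates an active class variable by re-running a "class block" on every member access; ActiveUptime, snapshot, and the fake tick source are all hypothetical names, and the model makes no claim about how the interpreter is implemented.

```python
import itertools


class ActiveUptime:
    """Model of an active class variable such as uptime$value."""

    def __init__(self, read_ticks, hz):
        self._read_ticks = read_ticks
        self._hz = hz

    def _class_block(self):
        # Recompute every member from a fresh reading of the tick counter.
        total = self._read_ticks() // self._hz
        return {
            "days": total // 86400,
            "hours": total % 86400 // 3600,
            "minutes": total % 3600 // 60,
            "seconds": total % 60,
        }

    def __getattr__(self, member):
        # Only reached for names not found normally, i.e. the class members.
        if member.startswith("_"):
            raise AttributeError(member)
        fields = self._class_block()
        if member not in fields:
            raise AttributeError(member)
        return fields[member]

    def snapshot(self):
        # One class-block call; plays the role of "tmp_uptime = uptime$value".
        return self._class_block()


# Fake 100 Hz clock that advances one second on every read, starting at 59 s.
reads = itertools.count(start=5900, step=100)
uptime = ActiveUptime(lambda: next(reads), hz=100)

torn = (uptime.minutes, uptime.seconds)  # two separate class-block calls
snap = uptime.snapshot()                 # one call, internally consistent
print("member by member:", torn)
print("from a snapshot: ", (snap["minutes"], snap["seconds"]))
```

In this model the member-by-member read straddles the minute boundary and reports a time that never existed, while the snapshot values all come from a single reading.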

Pitfalls

Here are some of the idiosyncrasies of the language that will catch programmers by surprise if they’re accustomed to using a particular feature in C and assume that it will be supported in SymbEL.

  • Only one variable can be declared per line. The variable names may not be a comma-separated list.

  • There is no type float. All floating-point variables are type double.

  • Curly braces must surround all sequences of statements in control structures, including sequences of length one.

  • The comparators work with scalars, floats, and strings. Therefore, the logical comparison ("hello" == "world") is valid and in this case returns false.

  • If the result of an expression yields a floating value as an operand to the modulus operator, that value is converted to long before the operation takes place. This conversion occurs while the program is running.

  • Assignment of the result of a logical expression is not allowed.

  • The for loop has some limitations.

    • There can be only one assignment in the assignment part.

    • There can be only logical expressions in the while part.

    • There can be only one assignment in the do part.

  • All local variables have static semantics.

  • All parameters are passed by value.

  • Global variables can be assigned the value of a function call.

  • while(running) is not syntactically correct. while(running != 0) is correct.

  • There is no recursion in SymbEL.

  • Structure comparison is not supported.

  • Syntax of conditional expressions is rigid: ( condition ? do_exp : else_exp )

  • Calling attached functions with incorrect values can result in a core dump and is not avoidable by the interpreter. This simple but effective script will cause a segmentation fault core dump:

#include  
main()
{
puts(nil);
}

Tricks

As the creator of a programming language and the developer of the interpreter, it is much easier for me to see through the intricacies of the features to the underlying functionality of the interpreter itself. This knowledge manifests itself in programming “tricks” that allow certain operations to be done that may not be obvious. Here are some that I’ve used. If there’s something you need done and it doesn’t seem to fit into any language feature, try to work around it. You may find a loophole that you didn’t know existed.

Returning an Array of Nonstructured Type from a Function

Although it is not allowed to declare a function as

int [] 
not_legal()
{
int array[ARRAY_SIZE]={1,2,3,4,5,-1};
return array;
}

it is still possible to return an array. Granted, this code is unattractive, but most of the tricks in this section involve something that is not very appealing from the programming standpoint. SymbEL is, after all, just a scripting language. And if it can be done at all, it’s worth doing. So, here’s how to return an array of nonstructured type from a function.

#define ARRAY_SIZE 128 
ulong
it_is_legal()
{
int array[ARRAY_SIZE]={1,2,3,4,5,-1};
return &array;
}
struct array_struct {
int array[ARRAY_SIZE];
};
main()
{
array_struct digits;
ulong address;
int i;
address = it_is_legal();
struct_fill(digits, address);
for(i=0; digits.array[i] != -1; i++) {
printf("%d\n", digits.array[i]);
}
}
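The same idea, returning an address as an integer and then copying the memory behind it into a structure, can be tried with Python's ctypes module. This is an analogy only: the struct_fill function defined here is a hypothetical stand-in written with ctypes.memmove, not the SymbEL built-in, and the module-level array plays the role of the static local in it_is_legal().

```python
import ctypes

ARRAY_SIZE = 128
IntArray = ctypes.c_int * ARRAY_SIZE

# Module-level storage stands in for SymbEL's static local array. It must
# outlive the call, because only its address is handed back.
_array = IntArray(1, 2, 3, 4, 5, -1)


def it_is_legal():
    """Return the address of the array as a plain integer."""
    return ctypes.addressof(_array)


class ArrayStruct(ctypes.Structure):
    _fields_ = [("array", IntArray)]


def struct_fill(target, address):
    """Copy sizeof(target) bytes from address into target (stand-in)."""
    ctypes.memmove(ctypes.addressof(target), address, ctypes.sizeof(target))


digits = ArrayStruct()
struct_fill(digits, it_is_legal())

values = []
i = 0
while digits.array[i] != -1:   # -1 is the end marker, as in the script
    values.append(digits.array[i])
    i += 1
print(values)
```

The trick depends on the array outliving the function call, which SymbEL guarantees because all local variables have static semantics.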

Using Return Values from Attached Functions

It is common to read input lines by using fgets, then locate the newline character with strchr and change it to a null character. This approach has unexpected results in SymbEL. For instance, the code segment

while(fgets(buf, sizeof(buf), stdin)!=nil){
p = strchr(buf, '\n');
p[0] = '\0';
puts(buf);
}

would be expected to null the newline character and print the line (yes, I know this code segment will cause se to exit with a null pointer exception if a line is read with no newline character). But this is not the case because the strchr function will return a string that is assigned to the variable p. When this happens, a new copy of the string returned by strchr is allocated and assigned to p. When the p[0] = '\0'; line is executed, the newline character in the copy is made null. The original buf from the fgets call remains intact. The way around this result (and this workaround should be done only when it is certain that the input lines contain the newline character) is:

while(fgets(buf, sizeof(buf), stdin)!=nil){
strcpy(strchr(buf, '\n'), "");
puts(buf);
}

In this case, the result of the strchr call is never assigned to a variable, and its return value remains uncopied before being sent to the strcpy function. strcpy then copies the string "" onto the newline and in doing so, changes it to the null character.
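The difference between writing through a copy and writing through a reference to the original buffer can be reproduced in Python, where slicing a bytearray copies and a memoryview does not. This sketch is an analogy for the copy semantics described above, with hypothetical function names; it does not show how se handles strings internally.

```python
def chop_via_copy(line):
    """Mimic 'p = strchr(buf, ...); p[0] = 0' when p receives a copy."""
    buf = bytearray(line)
    p = buf[buf.index(b"\n"):]   # slicing a bytearray allocates a new object
    p[0] = 0                     # only the copy is modified
    return bytes(buf)


def chop_in_place(line):
    """Mimic 'strcpy(strchr(buf, ...), "")', which writes into buf itself."""
    buf = bytearray(line)
    view = memoryview(buf)[buf.index(b"\n"):]   # shares storage with buf
    view[0] = 0                                 # the original is modified
    return bytes(buf)


print(chop_via_copy(b"hello\n"))
print(chop_in_place(b"hello\n"))
```

As in the SymbEL version, both functions assume that the line contains a newline; bytearray.index raises ValueError when it does not.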

Using kvm Variables and Functions

Using the kvm functions and dealing with kvm variables in general is quite confusing because there are so many levels of indirection of pointers. This simple script performs the equivalent of /bin/uname -m.

#include  
#include

main()
{
ulong kvm$top_devinfo; // top_devinfo is an actual kernel variable
dev_info_t kvm$root_node; // root_node is not, but it needs to be active

// The next line affects a pointer indirection. The value of top_devinfo
// is a pointer to the root of the devinfo tree in the kernel. This value
// is extracted, and the root_node variable has its kernel address changed
// to this value. Accessing the root_node variable after this assignment
// will cause the reading of the dev_info_t structure from the kernel
// since root_node is an active variable. Note that root_node is not
// a variable in the kernel though, but it's declared active so that
// the value will be read out *after* it's given a valid kernel address.
// And there's no need to explicitly read the string, it's done already.
kvm_cvt(kvm$root_node, kvm$top_devinfo);
puts(kvm$root_node.devi_name);
}


Another example of extracting kvm values is with the kvm_declare function. This function allows kernel variables to be declared while the program is running. Instead of declaring a kvm variable for maxusers, for instance, you could do it this way:

main() 
{
ulong address;
int kvm$integer_value;

address = kvm_declare("maxusers");
kvm_cvt(kvm$integer_value, address);
printf("maxusers is %d\n", kvm$integer_value);
}

A more general way to peruse integer variables entered at the user’s leisure is shown in this example.

#include  
#include

int main()
{
char var_name[BUFSIZ];
ulong address;
int kvm$variable;

for(;;) {
fputs("Enter the name of an integer variable: ", stdout);
if (fgets(var_name, sizeof(var_name), stdin) == nil) {
return 0;
}
strcpy(strchr(var_name, '\n'), ""); // chop
address = kvm_declare(var_name); // look it up with nlist
if (address == 0) {
printf("variable %s is not found in the kernel space\n", var_name);
continue;
}
kvm_cvt(kvm$variable, address); // convert the address of the kvm var
printf("%s = %u\n", var_name, kvm$variable);
}
}


Using an attach Block to Call Interpreter Functions

The attach feature of SymbEL implements the use of the dynamic linking feature of Solaris. The dl functions allow an external library to be attached to a running process, thus making the symbols within that binary available to the program.

One of the features of dynamic linking is the ability to access symbols within the binary that is running. That is, a process can look into itself for symbols. This can also be accomplished in SymbEL by using an attach block with no name. With this trick, a script can call functions contained within the interpreter, but the author of the script has to know what functions are available. Currently, the only functions available to the user are listed in the se.se include file.

The most useful of these functions is the se_function_call function, which allows the script to call a SymbEL function indirectly. This function can be used for a callback mechanism. It’s the equivalent of a pointer to a function. For example, this script calls the function "callback" indirectly.

#include  
main()
{
se_function_call("callback", 3, 2, 1);
}

callback(int a, int b, int c)
{
printf("a = %d b = %d c = %d\n", a, b, c);
}

The se_function_call function is declared with an ellipsis argument so any number of parameters can be passed (up to the internal limit) to the function being called. Be careful to pass the correct type and number of arguments.
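Calling a function by name is a familiar pattern in other languages. The Python sketch below shows the general idea with a lookup table; FUNCTIONS, script_function, and call_by_name are hypothetical names, and nothing here reflects how se_function_call locates functions inside the interpreter.

```python
FUNCTIONS = {}


def script_function(func):
    """Register a function under its own name so it can be called indirectly."""
    FUNCTIONS[func.__name__] = func
    return func


def call_by_name(name, *args):
    """Look the function up by name and call it with the given arguments."""
    try:
        target = FUNCTIONS[name]
    except KeyError:
        raise LookupError("no such function: %s" % name)
    return target(*args)


@script_function
def callback(a, b, c):
    return "a = %d b = %d c = %d" % (a, b, c)


print(call_by_name("callback", 3, 2, 1))
```

Python raises a TypeError here if the argument count is wrong. The text above gives no indication of a comparable check in se_function_call, hence the warning to pass the correct type and number of arguments.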

An extreme example of this functionality is demonstrated below. The script calls on the interpreter to parse a function from an external file and then run the function. It’s an absurd example, but it demonstrates the tangled web that can be woven with attached functions and variables.

// this is the file "other_file" 
some_function(int param)
{
printf("hello there: %d\n", param);
}
// this is the demo script
#include
#include
#include

attach "" {
extern ulong Lex_input;
extern int Se_errors;
yyparse();
se_fatal(string p);
};

int main()
{
Lex_input = fopen("other_file", "r");
if (Lex_input == 0) {
perror("fopen");
return 1;
}
yyparse();
if (Se_errors != 0) {
se_fatal("parse errors in other_file");
return 1;
}
se_function_call("some_function", 312);
return 0;
}