Programming, using and understanding

Pike

by Fredrik Hübinette

Preface

This book was written with the intention of making anybody with a little programming experience able to use Pike. It should also be possible to gain a deep understanding of how Pike works and to some extent why it works the way it does from this book. It will teach you how to write your own extensions to Pike. I have been trying for years to get someone else to write this book, but since it seems impossible without paying a fortune for it I will have to do it myself. A big thanks goes to Ian Carr-de Avelon and Henrik Wallin for helping me iron out some of the rough spots. The book assumes that you have programmed in some other language before and that you have some experience of UNIX.

Table of contents

Preface
Table of contents
Introduction
Overview
The history of Pike
A comparison with other languages
What is Pike
How to read this manual
1 Getting started
1.1 Your first Pike program
1.2 Improving hello_world.pike
1.3 Further improvements
1.4 Control structures
1.5 Functions
1.6 True and false
1.7 Data Types
2 A more elaborate example
2.1 Taking care of input
2.1.1 add_record()
2.1.2 main()
2.2 Communicating with files
2.2.1 save()
2.2.2 load()
2.2.3 main() revisited
2.3 Completing the program
2.3.1 delete()
2.3.2 search()
2.3.3 main() again
2.4 Then what?
2.5 Simple exercises
3 Control Structures
3.1 Conditions
3.1.1 if
3.1.2 switch
3.2 Loops
3.2.1 while
3.2.2 for
3.2.3 do-while
3.2.4 foreach
3.3 Breaking out of loops
3.3.1 break
3.3.2 continue
3.3.3 return
3.4 Exercises
4 Data types
4.1 Basic types
4.1.1 int
4.1.2 float
4.1.3 string
4.2 Pointer types
4.2.1 array
4.2.2 mapping
4.2.3 multiset
4.2.4 program
4.2.5 object
4.2.6 function
4.3 Sharing data
4.4 Writing data types
5 Operators
5.1 Arithmetic operators
5.2 Comparison operators
5.3 Logical operators
5.4 Bitwise/set operators
5.5 Indexing
5.6 The assignment operators
5.7 The rest of the operators
5.8 Operator precedence
5.9 Operator functions
6 Object orientation
6.1 Terminology
6.2 The approach
6.3 How does this help?
6.4 Pike and object orientation
6.5 Inherit
6.6 Multiple inherit
6.7 Pike inherit compared to other languages
6.8 Modifiers
6.9 Operator overloading
6.10 Simple exercises
7 Miscellaneous functions
7.1 sscanf
7.2 catch & throw
7.3 gauge
7.4 typeof
8 Modules
8.1 How to use modules
8.2 Where do modules come from?
8.3 The . operator
8.4 How to write a module
8.5 Simple exercises
9 File I/O
9.1 File management - Stdio.File
9.2 Buffered file management - Stdio.FILE
9.3 Standard streams - Stdio.stdin, stdout and stderr
9.4 Listening to sockets - Stdio.Port
9.5 UDP socket and message management - Stdio.UDP
9.6 Terminal management - Stdio.Terminfo
9.6.1 Stdio.Terminfo.Termcap
9.6.2 Stdio.Terminfo.Terminfo
9.7 Simple input-by-prompt - Stdio.Readline
9.8 Other Stdio functions
9.9 A simple example
9.10 A more complex example - a simple WWW server
10 Threads
10.1 Starting a thread
10.2 Threads reference section
10.3 Threads example
11 Modules for specific data types
11.1 String
11.2 Array
12 Image
12.1 Image.Image
12.2 Image.Colortable
12.3 Image.Layer
12.4 Image.Font
12.5 Image.colortable
12.6 Image.Poly
12.7 Image.Color
12.7.1 Image.Color.Color
12.8 Image.X
12.9 Image.ANY
12.10 Image.AVS
12.11 Image.BMP
12.12 Image.GD
12.13 Image.GIF
12.14 Image.JPEG
12.15 Image.TIFF
12.16 Image.TTF
12.16.1 Image.TTF.Face
12.16.2 Image.TTF.FaceInstance
12.17 Image.XFace
12.18 Image.HRZ
12.19 Image.ILBM
12.20 Image.PCX
12.21 Image.PNG
12.22 Image.PNM
12.23 Image.PSD
12.24 Image.TGA
12.25 Image.XBM
12.26 Image.XCF
12.27 Image.XWD
13 Protocols
13.1 Protocols.HTTP
13.1.1 Protocols.HTTP.Query
13.2 Protocols.LysKOM
13.2.1 Protocols.LysKOM.Session
13.2.2 Protocols.LysKOM.Connection
13.2.3 Protocols.LysKOM.Request
13.2.3.1 Protocols.LysKOM.Request._Request
13.3 Protocols.DNS
13.3.1 Protocols.DNS.client
14 Other modules
14.1 System
14.2 Process
14.3 Regexp
14.4 Gmp
14.5 Gdbm
14.6 Getopt
14.7 Gz
14.8 Yp
14.9 ADT.Table
14.10 Yabu transaction database
14.10.1 The database
14.10.2 Tables
14.10.3 Transactions
14.11 MIME
14.11.1 Global functions
14.11.2 The MIME.Message class
14.11.2.1 Public fields
14.11.2.2 Public methods
14.12 Simulate
14.13 Mysql.mysql
14.14 The Pike Crypto Toolkit
14.14.1 Introduction
14.14.2 Block ciphers
14.14.3 Stream Ciphers
14.14.4 Hash Functions
14.14.5 Public key algorithms
14.14.6 Combining block cryptos
14.15 Locale.Gettext
14.16 Calendar
14.16.1 Calendar.Calendar
14.16.2 Calendar.TimeRange
14.16.3 Calendar.SuperTimeRange
14.16.4 Calendar.Austrian
14.16.5 Calendar.Coptic
14.16.6 Calendar.Discordian
14.16.7 Calendar.Event
14.16.8 Calendar.Gregorian
14.16.9 Calendar.ISO
14.16.10 Calendar.Julian
14.16.11 Calendar.Roman
14.16.12 Calendar.Stardate
14.16.13 Calendar.Swedish
14.16.14 Calendar.TZnames
14.16.15 Calendar.Time
14.16.15.1 Calendar.Time.TimeOfDay
14.16.15.2 Calendar.Time.SuperTimeRange
14.16.15.3 Calendar.Time.Hour
14.16.15.4 Calendar.Time.Minute
14.16.15.5 Calendar.Time.Second
14.16.15.6 Calendar.Time.Fraction
14.16.16 Calendar.Timezone
14.16.17 Calendar.YMD
14.16.17.1 Calendar.YMD.YMD
14.16.17.2 Calendar.YMD.Year
14.16.17.3 Calendar.YMD.Week
14.16.17.4 Calendar.YMD.Hour
14.17 Parser
14.17.1 Parser.HTML
14.18 Math
14.18.1 Math.Matrix
14.19 Calendar.Calendar
14.20 Calendar.TimeRange
14.21 Calendar.SuperTimeRange
14.22 Calendar.Austrian
14.23 Calendar.Coptic
14.24 Calendar.Discordian
14.25 Calendar.Event
14.26 Calendar.Gregorian
14.27 Calendar.ISO
14.28 Calendar.Julian
14.29 Calendar.Roman
14.30 Calendar.Stardate
14.31 Calendar.Swedish
14.32 Calendar.TZnames
14.33 Calendar.Time
14.33.1 Calendar.Time.TimeOfDay
14.33.2 Calendar.Time.SuperTimeRange
14.33.3 Calendar.Time.Hour
14.33.4 Calendar.Time.Minute
14.33.5 Calendar.Time.Second
14.33.6 Calendar.Time.Fraction
14.34 Calendar.Timezone
14.35 Calendar.YMD
14.35.1 Calendar.YMD.YMD
14.35.2 Calendar.YMD.Year
14.35.3 Calendar.YMD.Week
14.35.4 Calendar.YMD.Hour
14.36 Calendar_I.time_unit
14.37 Calendar_I.Gregorian
14.37.1 Calendar_I.Gregorian.
14.37.2 Calendar_I.Gregorian.Year
14.37.3 Calendar_I.Gregorian.Stardate
14.37.3.1 Calendar_I.Gregorian.Stardate.TNGDate
14.38 Crypto.randomness
14.38.1 Crypto.randomness.pike_random
14.38.2 Crypto.randomness.arcfour_random
14.39 Geographical.Position
14.40 Geographical.Countries
14.41 Image.Product(string name, string version)
14.42 Parser.SGML
14.43 Protocols.HTTP
14.43.1 Protocols.HTTP.Query
14.44 Protocols.LysKOM
14.44.1 Protocols.LysKOM.Session
14.44.2 Protocols.LysKOM.Connection
14.44.3 Protocols.LysKOM.Request
14.44.3.1 Protocols.LysKOM.Request._Request
14.45 Protocols.DNS
14.45.1 Protocols.DNS.client
14.46 Mird.Glue
14.46.1 Mird.Glue.Mird
14.46.2 Mird.Glue.Transaction
14.46.3 Mird.Glue.Scanner
14.47 SANE.Scanner
15 The preprocessor
16 Builtin functions
17 Pike internals - how to extend Pike
17.1 The master object
17.2 Data types from the inside
17.2.1 Basic data types
17.2.2 struct svalue
17.2.3 struct pike_string
17.2.4 struct array
17.2.5 struct mapping
17.2.6 struct object
17.2.7 struct program
17.3 The interpreter
Appendix A Terms and jargon
Appendix B Register program
Appendix C Reserved words
Appendix D BNF for Pike
Appendix E How to install Pike
Appendix F How to convert from old versions of Pike
Appendix G Image.Layer modes
Appendix H Image.Color colors
Index

Introduction

This introduction will give you some background about Pike and this book and also compare Pike with other languages. If you want to start learning Pike immediately you can skip this chapter.

Overview

This book is designed for people who want to learn Pike fast. Since Pike is a simple language to learn, especially if you have some prior programming experience, this should benefit most people.

The introduction is devoted to background information about Pike and this book. It is not really necessary to read it to learn how to use and program Pike, but it might help explain why some things work the way they do. It might be more interesting to re-read it after you have learned the basics of Pike programming. Chapter one is where the action starts. Together with chapter two it is a crash course in Pike with examples and explanations of some of the basics. It explains the fundamentals of the Pike data types and control structures. The systematic documentation of all Pike capabilities starts in chapter three with a description of all control structures in Pike. It then continues with all the data types in chapter four and operators in chapter five. Chapter six deals with object orientation in Pike, which is slightly different from what you might be used to.

The history of Pike

In the beginning, there was Zork. Then a bunch of people decided to make multi-player adventure games. One of those people was Lars Pensjö at the Chalmers university in Gothenburg, Sweden. For his game he needed a simple, memory-efficient language, and thus LPC (Lars Pensjö C) was born. About a year later I started playing one of these games and found that the language was the most easy-to-use language I had ever encountered. I liked the language so much that I started improving it and before long I had made my own LPC dialect called LPC4. LPC4 was still geared towards writing adventure games, but was quite useful for writing other things with as well. A major problem with LPC4 was the copyright. Since it was based on Lars Pensjö's code, it came with a license that did not allow it to be used for commercial gain. So, in 1994 I started writing µLPC, which was a new but similar LPC interpreter. I got financial backing from Signum Support AB for writing µLPC. Signum is a company dedicated to supporting GNU and GPL software and they wanted to create more GPL software.

When µLPC became usable, InformationsVävarna AB started using it for their web-server. Before then, Roxen (then called Spinner) was non-commercial and written in LPC4. Then in 1996 I started working for InformationsVävarna developing µLPC for them. We also changed the name of µLPC to Pike to get a more commercially viable name.

A comparison with other languages

Python
Python is probably the language that is most like Pike. Pike is faster and has better object orientation. It also has a syntax similar to C++, which makes it more familiar for people who know C++. Python on the other hand, has a lot more libraries available.
C++
Pike's syntax is almost the same as for C++. A huge difference is that Pike is interpreted. This makes the code slower, but reduces compile times to almost nothing. For those few applications which require the speed of C or C++, it is often easier to write a Pike extension than to write the whole thing in C or C++.
Lisp and Scheme
Internally Pike has a lot in common with simple interpreted Lisp and Scheme implementations. They are all stack based, byte-compiled, interpreted languages. Pike is also a 'one-cell' language, just like Scheme.
Pascal
Pike has nothing in common with Pascal.
Tcl/Tk
Pike is similar to Tcl/Tk in intent and they both have good string handling. Pike has better data types and is much faster however. On the other hand Tcl/Tk has X windows system support.

What is Pike

Pike is: Pike has:

How to read this manual

This manual uses a couple of different typefaces to describe different things:
italics
Italics is used as a placeholder for other things. If it says a word in the text it means that you should put your own word there.
bold
Bold is just used to emphasize that this word is not merely what it sounds like. It is actually a term.
fixed size
Fixed size is used for examples and text directly from the computer.
Also, please beware that the word program is also a builtin Pike data type.

Chapter 1, Getting started

First you need to have Pike installed on your computer. See appendix E "How to install Pike" if this is not already done. It is also vital for the first of the following examples that the Pike binary is in your UNIX search path. If you have problems with this, consult the manual for your shell or go buy a beginner's book about UNIX.

1.1 Your first Pike program

int main()
{
    write("hello world\n");
    return 0;
}
Let's call this file hello_world.pike, and then we try to run it:

	$ pike hello_world.pike
	hello world
	$ 
Pretty simple. Let's see what everything means:
int main()
This begins the function main. Before the function name the type of value it returns is declared, in this case int, which is the name of the integer number type in Pike. The empty space between the parentheses indicates that this function takes no arguments. A Pike program has to contain at least one function, the main function. This function is where program execution starts and thus the function from which every other function is called, directly or indirectly. We can say that this function is called by the operating system. Pike is, like many other programming languages, built upon the concept of functions, i.e. what the program does is separated into small portions, or functions, each performing one (perhaps very complex) task. A function declaration consists of certain essential components: the type of the value it will return, the name of the function, the parameters, if any, it takes and the body of the function. A function is also a part of something greater: an object. You can program in Pike without caring about objects, but the programs you write will in fact be objects themselves anyway. Now let's examine the body of main:

{
    write("hello world\n");
    return 0;
}
Within the function body, programming instructions, statements, are grouped together in blocks. A block is a series of statements placed between curly brackets. Every statement has to end in a semicolon. This group of statements will be executed every time the function is called.

write("hello world\n");
The first statement is a call to the builtin function write. This will execute the code in the function write with the arguments as input data. In this case, the constant string hello world\n is sent. Well, not quite. The \n combination corresponds to the newline character. write then writes this string to stdout when executed. Stdout is the standard Unix output channel, usually the screen.

return 0;
This statement exits the function and returns the value zero. Any statements following the return statement will not be executed.

1.2 Improving hello_world.pike

Typing pike hello_world.pike to run our program may seem a bit impractical. Fortunately, Unix provides us with a way of automating this somewhat. If we modify hello_world.pike to look like this:

#!/usr/local/bin/pike

int main()
{
    write("hello world\n");
    return 0;
}
And then we tell UNIX that hello_world.pike is executable so we can run hello_world.pike without having to bother with running Pike:

	$ chmod +x hello_world.pike
	$ ./hello_world.pike
	hello world
	$ 
N.B.: The hash bang (#!) must be first in the file, not even whitespace is allowed to precede it! The file name after the hash bang must also be the complete file name to the Pike binary, and it may not exceed 30 characters.

1.3 Further improvements

Now, wouldn't it be nice if it said Hello world! instead of hello world ? But of course we don't want to make our program "incompatible" with the old version. Someone might need the program to work like it used to. Therefore we'll add a command line option that will make it type the old hello world. We also have to give the program the ability to choose what it should output based on the command line option. This is what it could look like:

#!/usr/local/bin/pike

int main(int argc, array(string) argv)
{
    if(argc > 1 && argv[1]=="--traditional")
    {
        write("hello world\n"); // old style
    }else{
        write("Hello world!\n"); // new style
    }
    return 0;
}
Let's run it:

	$ chmod +x hello_world.pike
	$ ./hello_world.pike
	Hello world!
	$ ./hello_world.pike --traditional
	hello world
	$ 
What is new in this version, then?
int main(int argc, array(string) argv)
In this version the space between the parentheses has been filled. What it means is that main now takes two arguments. One is called argc, and is of the type int. The other is called argv and is an array of strings.

The arguments to main are taken from the command line when the Pike program is executed. The first argument, argc, is how many words were written on the command line (including the command itself) and argv is an array formed by these words.

if(argc > 1 && argv[1] == "--traditional")
{
    write("hello world\n"); // old style
}else{
    write("Hello world!\n"); // new style
}
This is an if-else statement. It will execute what's between the first set of brackets if the expression between the parentheses evaluates to something other than zero. Otherwise what's between the second set of brackets will be executed. Let's look at that expression:

argc > 1 && argv[1] == "--traditional"
Loosely translated, this means: argc is greater than one, and the second element in the array argv is equal to the string --traditional. Since argc is the number of words on the command line the first part is true only if there was anything after the program invocation.
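The order of the two tests matters: argv[1] is only looked at when the first test has already succeeded, so a command line without any options never indexes past the end of the array. A minimal sketch of the same test, wrapped in a helper called wants_traditional (a name made up for this illustration, not part of the original program):

```pike
// Hypothetical helper: reports whether the first command line
// argument (if any) is the string "--traditional".
int wants_traditional(array(string) argv)
{
    // sizeof(argv) plays the role of argc here. The right-hand side
    // of && is only evaluated when the left-hand side is true, so
    // argv[1] is never indexed when it does not exist.
    return sizeof(argv) > 1 && argv[1] == "--traditional";
}

int main(int argc, array(string) argv)
{
    if(wants_traditional(argv))
        write("traditional greeting requested\n");
    else
        write("no --traditional option given\n");
    return 0;
}
```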

Also note the comments:

write("hello world\n"); // old style
The // begins a comment which continues to the end of the line. Comments will be ignored by the computer when it reads the code. This allows you to inform whoever might read your code (like yourself) of what the program does, to make it easier to understand. Comments are also allowed to look like C-style comments, i.e. /* ... */, which can extend over several lines. The // comment only extends to the end of the line.

1.4 Control structures

The first thing to understand about Pike is that just like any other programming language it executes one piece of code at a time. Most of the time it simply executes code line by line working its way downwards. Just executing a long list of instructions is not enough to make an interesting program however. Therefore we have control structures to make Pike execute pieces of code in more interesting orders than from top to bottom.

We have already seen an example of the if statement:

if( expression )
    statement1;
else
    statement2;
if simply evaluates the expression and if the result is true it executes statement1, otherwise it executes statement2. If you have no need for statement2 you can leave out the whole else part like this:
if( expression )
    statement1;
In this case statement1 is evaluated if expression is true, otherwise nothing is evaluated.

Note for beginners: go back to our first example and make sure you understand what if does.

Another very simple control structure is the while statement:

while( expression )
    statement;
This statement evaluates expression and if it is found to be true it evaluates statement. After that it starts over and evaluates expression again. This continues until expression is no longer true. This type of control structure is called a loop and is fundamental to all interesting programming.
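As a small illustration of such a loop, the following sketch adds up the integers from 1 to a given limit. The function name sum_to and its variables are invented for this example only.

```pike
// Sum the integers from 1 up to limit using a while loop.
// sum_to is a hypothetical helper written for this illustration.
int sum_to(int limit)
{
    int sum = 0;
    int i = 1;
    while(i <= limit)       // checked before every round
    {
        sum = sum + i;
        i = i + 1;          // without this the loop would never end
    }
    return sum;
}

int main()
{
    write(sprintf("%d\n", sum_to(10)));
    return 0;
}
```

If the expression is false the first time it is evaluated, as in sum_to(0), the statement inside the loop is never executed at all.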

1.5 Functions

Another control structure we have already seen is the function. A function is simply a block of Pike code that can be executed with different arguments from different places in the program. A function is declared like this:
modifiers type name(type varname1, type varname2, ...)
{
    statements
}
The modifiers are optional. See section 6.8 "Modifiers" for more details about modifiers. The type specifies what kind of data the function returns. For example, the word int would signify that the function returns an integer number. The name is used to identify the function when calling it. The names between the parentheses are the arguments to the function. They will be defined as local variables inside the function. Each variable will be declared to contain values of the preceding type. The three dots signify that you can have anything from zero to 256 arguments to a function. The statements between the brackets are the function body. Those statements will be executed whenever the function is called.

Example:

int sqr(int x) { return x*x; }
This line defines a function called sqr to take one argument of the type int and also returns an int. The code itself returns the argument multiplied by itself. To call this function from somewhere in the code you could simply put: sqr(17) and that would return the integer value 289.

As the example above shows, return is used to specify the return value of a function. The value after return must be of the type specified before the function name. If the function is specified to return void, nothing at all should be written after return. Note that when a return statement is executed, the function will finish immediately. Any statements following the return will be ignored.
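The two forms of return can be seen side by side in the sketch below. sqr is the function from the example above, while greet is a made-up void function that only exists to show an early return.

```pike
int sqr(int x) { return x*x; }

// A void function: return is written without a value.
// greet is a hypothetical example function.
void greet(string name)
{
    if(name == "")
        return;             // leave early, nothing below is executed
    write("Hello " + name + "!\n");
}

int main()
{
    write(sprintf("%d\n", sqr(17)));
    greet("");              // prints nothing
    greet("world");
    return 0;
}
```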

There are many more control structures. They will all be described in a later chapter devoted only to control structures.

1.6 True and false

Throughout this chapter the words true and false have been used without any explanation to what they mean. Pike has a fairly simple way of looking at this. The number 0 is false and everything else is true. (Except when using operator overloading as I will explain in a later chapter.)
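A quick way to convince yourself of this rule is to feed a few values to an if statement. The helper truth below is invented for this sketch. Note that, unlike in some other languages, an empty string or an empty array is not the number 0 and therefore counts as true.

```pike
// Hypothetical helper: describe whether a value counts as true.
string truth(mixed value)
{
    if(value)
        return "true";
    else
        return "false";
}

int main()
{
    write("0 is " + truth(0) + "\n");
    write("17 is " + truth(17) + "\n");
    write("-1 is " + truth(-1) + "\n");
    write("the empty string is " + truth("") + "\n");
    write("the empty array is " + truth(({})) + "\n");
    return 0;
}
```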

1.7 Data Types

As you saw in our first examples we have to indicate the type of value returned by a function or contained in a variable. We used integers (int), strings (string), and arrays (array). The others are mapping, mixed, void, float, multiset, function, object and program. Neither mixed nor void is really a type: void signifies that no value should be returned, and mixed that the return value can be of any type, or that the variable can contain any type of value. Function, object and program are all types related to object orientation. We will not discuss the last three in any great detail here.
Int
The integer type stores a signed integer. It is 32 bit or 64 depending on architecture.
Float
This variable type stores a floating point number.
Array
Arrays are basically a place to store a number of other values. Arrays in Pike are allocated blocks of values. They are dynamically allocated and do not need to be declared as in C. The values in the array can be set when creating the array like this:
arr=({1,2,3});
Or, if you have already created an array, you can change the values in the array like this:
arr [ ind ] = data;
This sets entry number ind in the array arr to data. ind must be an integer. The first index of an array is 0 (zero). A negative index will count from the end of the array rather than from the beginning, -1 being the last element. To declare that a variable is an array we simply type array in front of the variable name we want:
array i;
We can also declare several array variables on the same line:
array i, j;
If we want to specify that the variable should hold an array of strings, we would write:
array (string) i;
String
A string contains a sequence of characters, a text, i.e. a word, a sentence or a book. Note that this is not simply the letters A to Z; special characters, null characters, newlines and so on can all be stored in a string. Any 8-bit character is allowed. String is a basic type in Pike, it is not an array of char like it is in C. This means that you cannot assign new values to individual characters in a string. Also, all strings are "shared", i.e. if the same string is used in several places, it will be stored in memory only once. When writing a string in a program, you enclose it in double quotes. To write special characters you need to use the following syntax:
\n	newline
\r	carriage return
\t	tab
\b	backspace
\"	" (quotation character)
\\	\ (literal backslash)
Mapping
A mapping is basically an array that can be indexed on any type, not just integers. It can also be seen as a way of linking data (usually strings) together. It consists of a lot of index-data pairs which are linked together in such a way that map[index1] returns data1. A mapping can be created in a way similar to arrays:
mapping(string:string) map=(["five":"good", "ten":"excellent"]);
You can also set that data by writing map["five"]="good". If you try to set an index in a mapping that isn't already present in the mapping it will be added as well.
Multiset
A multiset is basically a mapping without data values. When referring to an index in the multiset a 1 (one) will be returned if the index is present and 0 (zero) otherwise.
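The container types described above can be tried out together in a few lines. This is only a sketch: the variable names and values are made up, and the (< >) notation for writing a multiset has not been introduced in the text so far.

```pike
int main()
{
    // Array: the first index is 0, negative indices count from the end.
    array(string) arr = ({ "a", "b", "c" });
    arr[1] = "x";
    write(sprintf("%s %s %s\n", arr[0], arr[1], arr[-1]));

    // Mapping: assigning to an index that is not present adds it.
    mapping(string:string) ratings = ([ "five":"good", "ten":"excellent" ]);
    ratings["one"] = "poor";
    write(sprintf("%s %d\n", ratings["one"], sizeof(ratings)));

    // Multiset: indexing gives 1 if the index is present, 0 otherwise.
    multiset(string) seen = (< "five", "ten" >);
    write(sprintf("%d %d\n", seen["five"], seen["three"]));
    return 0;
}
```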

Chapter 2, A more elaborate example

To illustrate several of the fundamental points of Pike we will now introduce an example program, that will be extended as we go. We will build a database program that keeps track of a record collection and the songs on the records. In the first version we hard-code our "database" into the program. The database is a mapping where the index is the record name and the data is an array of strings. The strings are of course the song names. The default register consists of one record.
#!/usr/local/bin/pike

mapping (string:array(string)) records =
([
    "Star Wars Trilogy" : ({
        "Fox Fanfare",
        "Main Title",
        "Princess Leia's Theme",
        "Here They Come",
        "The Asteroid Field",
        "Yoda's Theme",
        "The Imperial March",
        "Parade of the Ewoks",
        "Luke and Leia",
        "Fight with Tie Fighters",
        "Jabba the Hut",
        "Darth Vader's Death",
        "The Forest Battle",
        "Finale"
    })
]);
We want to be able to get a simple list of the records in our database. The function list_records just goes through the mapping records and puts the indices, i.e. the record names, in an array of strings, record_names. By using the builtin function sort we put the record names into the array in alphabetical order which might be a nice touch. For the printout we just print a header, "Records:", followed by a newline. Then we use the loop control structure for to traverse the array and print every item in it, including the number of the record, by counting up from zero to the last item of the array. The builtin function sizeof gives the number of items in an array. The printout is formatted through the use of sprintf which works more or less like the C function of the same name.
void list_records()
{
    int i;
    array (string) record_names=sort(indices(records));

    write("Records:\n");
    for(i=0;i<sizeof(record_names);i++)
        write(sprintf("%3d: %s\n", i+1, record_names[i]));
}
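The combination of indices, sort and sprintf used in list_records can be tried in isolation on a smaller mapping. The mapping stock and its contents below are invented for this sketch and have nothing to do with the record database.

```pike
int main()
{
    int i;
    // Hypothetical sample data, not part of the record database.
    mapping(string:int) stock = ([ "pears":3, "apples":12, "kiwis":7 ]);

    // indices() returns the indices in no particular order,
    // sort() puts them in alphabetical order.
    array(string) names = sort(indices(stock));

    // %3d pads the number to three characters, so the list lines up.
    for(i=0; i<sizeof(names); i++)
        write(sprintf("%3d: %s\n", i+1, names[i]));
    return 0;
}
```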
If the command line contained a number our program will find the record of that number and print its name along with the songs of this record. First we create the same array of record names as in the previous function, then we find the name of the record whose number (num) we gave as an argument to this function. Next we put the songs of this record in the array songs and print the record name followed by the songs, each song on a separate line.
void show_record(int num)
{
    int i;
    array (string) record_names = sort(indices (records));
    string name=record_names[num-1];
    array (string) songs=records[name];
    
    write(sprintf("Record %d, %s\n",num,name));
    for(i=0;i<sizeof(songs);i++)
        write(sprintf("%3d: %s\n", i+1, songs[i]));
}
The main function doesn't do much; it checks whether there was anything on the command line after the invocation. If this is not the case it calls the list_records function, otherwise it sends the given argument to the show_record function. When the called function is done the program just quits.
int main(int argc, array (string) argv)
{
    if(argc <= 1)
    {
        list_records();
    } else {
        show_record((int) argv[1]);
    }
}

2.1 Taking care of input

Now, it would be better and more general if we could enter more records into our database. Let's add such a function and modify the main() function to accept "commands".

2.1.1 add_record()

Using the method Stdio.Readline()->read() we wait for input which will be put into the variable record_name. The argument to ->read() is printed as a prompt in front of the user's input. Readline takes everything up to a newline character. Now we use the control structure while to keep asking for song names. The while(1) means "loop forever", because 1 is always true. This program does not in fact loop forever, because it uses return to exit the function from within the loop when you type a period. When something has been read into the variable song it is checked. If it is a "." we return from the function, which also ends the loop and stops the prompting for song names. If it is not a dot, the string will be added to the array of songs for this record, unless it's an empty string. Note the += operator. It is the same as saying records[record_name]=records[record_name]+({song}).
void add_record()
{
    string record_name=Stdio.Readline()->read("Record name: ");
    records[record_name]=({});
    write("Input song names, one per line. End with '.' on its own line.\n");
    while(1)
    {
        string song;
        song=Stdio.Readline()->read(sprintf("Song %2d: ",
                                            sizeof(records[record_name])+1));
        if(song==".")
            return;
        if (strlen(song))
            records[record_name]+=({song});
    }
}
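The way += grows the array stored under one index of the mapping can be tested without any typing at the prompt. In the sketch below, shelf and add_song are hypothetical stand-ins for records and the body of the loop in add_record.

```pike
// Hypothetical stand-in for the records mapping.
mapping(string:array(string)) shelf = ([]);

// Hypothetical helper: append one song to a record, creating the
// record entry first if it does not exist yet.
void add_song(string record_name, string song)
{
    // Indexing a mapping on a missing index gives 0, which is false.
    if(!shelf[record_name])
        shelf[record_name] = ({});
    // Same as shelf[record_name] = shelf[record_name] + ({ song });
    shelf[record_name] += ({ song });
}

int main()
{
    add_song("Demo", "First");
    add_song("Demo", "Second");
    write(sprintf("%d songs, last one is %s\n",
                  sizeof(shelf["Demo"]), shelf["Demo"][-1]));
    return 0;
}
```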

2.1.2 main()

The main function now does not care about any command line arguments. Instead we use Stdio.Readline()->read() to prompt the user for instructions and arguments. The available instructions are "add", "list" and "quit". What you enter into the variables cmd and args is checked in the switch() block. If you enter something that is not covered in any of the case statements the program just silently ignores it and asks for a new command. In a switch() the argument (in this case cmd) is checked in the case statements. The first case where the expression equals cmd then executes the statement after the colon. If no expression is equal, we just fall through without any action. The only command that takes an argument is "list" which works as in the first version of the program. If "list" receives an argument, that record is shown along with all the songs on it. If there is no argument it shows a list of the records in the database. When the program returns from either of the listing functions, the break instruction tells the program to jump out of the switch() block. "add" of course turns control over to the function described above. If the command given is "quit" the exit(0) statement stops the execution of the program and returns 0 (zero) to the operating system, telling it that everything was ok.
int main(int argc, array(string) argv)
{
    string cmd;
    while(cmd=Stdio.Readline()->read("Command: "))
    {
        string args;
        sscanf(cmd,"%s %s",cmd,args);

        switch(cmd)
        {
        case "list":
            if((int)args)
            {
                show_record((int)args);
            } else {
                list_records();
            }
            break;

        case "quit":
            exit(0);

        case "add":
            add_record();
            break;
        }
    }
}
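To see how sscanf and switch cooperate without running the whole interactive program, the command handling can be reduced to a function that only reports what it would do. The function describe and the strings it returns are made up for this sketch.

```pike
// Hypothetical helper: turn one input line into a short description
// of what the command loop would do with it.
string describe(string line)
{
    string cmd = line;
    string args = "";
    // "%s %s" needs a space in the input. cmd starts out as the
    // whole line, so a command typed without arguments is still
    // found in cmd afterwards.
    sscanf(line, "%s %s", cmd, args);

    switch(cmd)
    {
    case "list":
        if((int)args)
            return sprintf("show record %d", (int)args);
        return "list all records";

    case "add":
        return "add a record";

    case "quit":
        return "quit";
    }
    // No case matched: the command is silently ignored.
    return "ignored";
}

int main()
{
    write(describe("list") + "\n");
    write(describe("list 2") + "\n");
    write(describe("dance") + "\n");
    return 0;
}
```

Because every case here ends in a return, no break is needed. In the real main() the break is what keeps execution from continuing into the next case.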

2.2 Communicating with files

Now if we want to save the database and also be able to retrieve previously stored data, we have to communicate with the environment, i.e. with files on disk. This is also where we introduce programming with objects. To open a file for reading or writing we will use one of the programs built into Pike, called Stdio.File. To Pike, a program is a data type which contains code, functions and variables. A program can be cloned, which means that Pike creates a data area in memory for the program, places a reference to the program in the data area, and initializes it to act on the data in question. The result of cloning is called an object. The methods (i.e. functions in the object) and variables in a Stdio.File object enable us to perform actions on the associated data file. The methods we need to use are open, read, write and close. See chapter 9 "File I/O" for more details.
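
Before we use Stdio.File in the register program, here is a minimal sketch of the four methods on their own. The file name is a hypothetical one picked for this example, and the error handling is deliberately simple:
int main()
{
    // Hypothetical file name, used only in this example.
    string file_name="/tmp/pike_file_demo.txt";

    Stdio.File out=Stdio.File();        // clone the Stdio.File program
    if(!out->open(file_name,"wct"))     // w=write, c=create, t=truncate
    {
        write("Failed to open file for writing.\n");
        return 1;
    }
    out->write("hello file\n");
    out->close();

    Stdio.File in=Stdio.File();         // a second, independent object
    if(!in->open(file_name,"r"))        // r=read
    {
        write("Failed to open file for reading.\n");
        return 1;
    }
    write(in->read());                  // no arguments: read to end of file
    in->close();
    return 0;
}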

2.2.1 save()

First we clone the Stdio.File program into the object o. Then we use it to open the file whose name is given in the string file_name for writing. We use the fact that if there is an error during opening, open() will return a false value, which we can detect and act upon by printing a message and returning. The arrow operator (->) is what you use to access methods and variables in an object. If there is no error we use yet another control structure, foreach, to go through the mapping records one record at a time. We precede record names with the string "Record: " and song names with "Song: ". We also put every entry, be it song or record, on its own line by adding a newline to everything we write to the file.
Finally, remember to close the file.
void save(string file_name)
{
    string name, song;
    Stdio.File o=Stdio.File();

    if(!o->open(file_name,"wct"))
    {
        write("Failed to open file.\n");
        return;
    }

    foreach(indices(records),name)
    {
        o->write("Record: "+name+"\n");
        foreach(records[name],song)
            o->write("Song: "+song+"\n");
    }

    o->close();
}

2.2.2 load()

The load function begins much the same, except that we open the file named by file_name for reading instead. The data we receive from the file is put in the string file_contents. The absence of arguments to the method o->read means that the reading should not end until the end of the file. After having closed the file we initialize our database, i.e. the mapping records. Then we have to put file_contents into the mapping, and we do this by splitting the string on newlines (cf. the split operator in Perl) using the division operator. Yes, that's right: by dividing one string by another we obtain an array consisting of parts of the first. And by using a foreach statement we can take the string file_contents apart piece by piece, putting each piece back in its proper place in the mapping records.
void load(string file_name)
{
    string name="ERROR";
    string file_contents,line;

    Stdio.File o=Stdio.File();
    if(!o->open(file_name,"r"))
    {
        write("Failed to open file.\n");
        return;
    }

    file_contents=o->read();
    o->close();

    records=([]);

    foreach(file_contents/"\n",line)
    {
        string cmd, arg;
        if(sscanf(line,"%s: %s",cmd,arg))
        {
            switch(lower_case(cmd))
            {
            case "record":
                name=arg;
                records[name]=({});
                break;

            case "song":
                records[name]+=({arg});
                break;
            }
        }
    }
}
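The following sketch shows what the division in load() produces for a small piece of text. The sample contents are invented for this example:
int main()
{
    string file_contents="Record: A\nSong: one\nSong: two\n";
    array(string) lines=file_contents/"\n";

    // lines is now ({ "Record: A", "Song: one", "Song: two", "" }).
    // The text ends with a newline, so the last piece is an empty string.
    write(sprintf("%d pieces\n", sizeof(lines)));   // writes: 4 pieces
    return 0;
}
The empty last piece does no harm in load(), since it contains no ": " and therefore never reaches the switch.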

2.2.3 main() revisited

main() remains almost unchanged, except for the addition of two case statements with which we can now call the load and save functions. Note that you must provide a filename to load and save, respectively, otherwise they will fail with an error that crashes the program.
case "save":
    save(args);
    break;

case "load":
    load(args);
    break;

2.3 Completing the program

Now let's add the last functions we need to make this program useful: the ability to delete entries and search for songs.

2.3.1 delete()

If you sell one of your records it might be nice to be able to delete that entry from the database. The delete function is quite simple. First we set up a sorted array of record names (cf. the list_records function). Then we find the name of record number num and use the builtin function m_delete() to remove that entry from records.
void delete_record(int num)
{
    array(string) record_names=sort(indices(records));
    string name=record_names[num-1];

    m_delete(records,name);
}
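Here is a small sketch of the same steps on a throwaway mapping, so that you can see the effect of m_delete(). The record and song names are sample data made up for this example:
int main()
{
    mapping(string:array(string)) records=([
        "Kind of Blue":({ "So What" }),
        "Abbey Road":({ "Come Together" })
    ]);

    array(string) record_names=sort(indices(records));
    // record_names is now ({ "Abbey Road", "Kind of Blue" })

    m_delete(records, record_names[0]);     // removes "Abbey Road"
    write(sprintf("%d record(s) left\n", sizeof(records)));   // writes: 1 record(s) left
    return 0;
}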

2.3.2 search()

Searching for songs is quite easy too. To count the number of hits we declare the variable hits. Note that it's not necessary to initialize variables; if you do not do it explicitly, it is done automatically when the variable is declared. To be able to use the builtin function search(), which searches for the presence of a given string inside another, we put the search string in lowercase and compare it with the lowercase version of every song. The use of search() enables us to search for partial song titles as well. When a match is found it is immediately written to standard output with the record name followed by the name of the song where the search string was found and a newline. If there were no hits at all, the function prints out a message saying just that.
void find_song(string title)
{
    string name, song;
    int hits;

    title=lower_case(title);

    foreach(indices(records),name)
    {
        foreach(records[name],song)
        {
            if(search(lower_case(song), title) != -1)
            {
                write(name+"; "+song+"\n");
                hits++;
            }
        }
    }

    if(!hits) write("Not found.\n");
}
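The test against -1 in find_song can be tried out on its own. In this sketch the song title and the search strings are sample values:
int main()
{
    string song="Yellow Submarine";
    string title=lower_case("SUB");

    // search() returns the position of the match, or -1 if there is none.
    write(sprintf("%d\n", search(lower_case(song), title)));   // writes: 7
    write(sprintf("%d\n", search(lower_case(song), "xyz")));   // writes: -1
    return 0;
}
Since a match at the very beginning of the string gives position 0, which counts as false, the result has to be compared with -1 rather than used directly as a condition.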

2.3.3 main() again

Once again main() is left unchanged, except for yet another two case statements used to call the delete_record and find_song functions, respectively. Note that you must provide an argument to delete or it will not work properly.
case "delete":
    delete_record((int)args);
    break;

case "search":
    find_song(args);
    break;

2.4 Then what?

Well that's it! The example is now a complete working example of a Pike program. But of course there are plenty of details that we haven't attended to. Error checking is for example extremely sparse in our program. This is left for you to do as you continue to read this book. The complete listing of this example can be found in appendix B "Register program". Read it, study it and enjoy!

2.5 Simple exercises

  • Make a program which writes hello world 10 times.
  • Modify hello_world.pike to write the first argument to the program.
  • Make a program that writes a hello_world program to stdout when executed.
  • Modify the register program to store data about programs and diskettes instead of songs and records.
  • Add code to the register program that checks that the user typed an argument when required. The program should notify the user and wait to receive more commands instead of exiting with an error message.
  • Add code to the register program to check that the arguments to show_record and delete_record are numbers. Also make sure that the number isn't less than one or bigger than the available number of records.
  • Rewrite the register program and put all the code in main().

Chapter 3, Control Structures

    In this chapter all the control structures in Pike will be explained. As mentioned earlier, control structures are used to control the flow of the program execution. Note that functions that make the program pause and simple function calls are not qualified as control structures.

    3.1 Conditions

    Pike only has two major condition control structures. We have already seen examples of both of them in Chapter two. But for completeness they will be described again in this chapter.

    3.1.1 if

    The simplest one is called the if statement. It can be written anywhere where a statement is expected and it looks like this:
    if( expression ) statement1; else statement2;
    Please note that there is no semicolon after the parenthesis or after the else. Step by step, if does the following:
    1. First it evaluates expression.
    2. If the result was false go to point 5.
    3. Execute statement1.
    4. Jump to point 6.
    5. Execute statement2.
    6. Done.
    This is actually more or less how the interpreter executes the if statement. In short, statement1 is executed if expression is true, otherwise statement2 is executed. If you are not interested in having something executed when the expression is false, you can drop the whole else part like this:
    if( expression )
        statement1;
    If on the other hand you are not interested in having something executed when the expression is true, you should use the not operator to negate the true/false value of the expression. See chapter 5 for more information about the not operator. It would look like this:
    if( ! expression )
        statement2 ;
    Any of the statements here and in the rest of this chapter can also be a block of statements. A block is a list of statements, each ending with a semicolon, enclosed in curly brackets. Note that you should never put a semicolon after a block of statements. With a block, the example above would look like this:
    if ( ! expression )
    {
        statement;
        statement;
        statement;
    }
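    As a concrete sketch of the forms above, the following program uses an if with an else part, and an if with a negated expression and a block. The messages are made up for this example:
    int main(int argc, array(string) argv)
    {
        if(argc > 1)
            write("First argument: "+argv[1]+"\n");
        else
            write("No arguments given.\n");

        if(!(argc > 1))
        {
            write("Try running the program again\n");
            write("with a word after the program name.\n");
        }
        return 0;
    }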

    3.1.2 switch

    A more sophisticated condition control structure is the switch statement. A switch lets you select one of many choices depending on the value of an expression and it can look something like this:
    switch ( expression )
    {
        case constant1:
            statement1;
            break;

        case constant2:
            statement2;
            break;

        case constant3 .. constant4:
            statement3;
            break;

        default:
            statement4;
    }
    As you can see, a switch statement is a bit more complicated than an if statement. It is still fairly simple however. It starts by evaluating the expression, and then searches all the case statements in the following block. If one is found to be equal to the value returned by the expression, Pike will continue executing the code directly following that case statement. If none is equal and there is a default statement, Pike continues executing after the default statement instead. When a break is encountered Pike will skip the rest of the code in the switch block and continue executing after the block. Note that it is not strictly necessary to have a break before the next case statement. If there is no break before the next case statement Pike will simply continue executing and execute the code after that case statement as well.

    One of the case statements in the above example differs in that it is a range. In this case, any value between constant3 and constant4 will cause Pike to jump to statement3. Note that the ranges are inclusive, so the values constant3 and constant4 are also valid.
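
    The sketch below puts a missing break, a range and a default together. The function describe and its texts are invented for this example:
    string describe(int n)
    {
        string result="";

        switch(n)
        {
            case 0:
                result+="zero ";
                // no break here, so execution continues into case 1

            case 1:
                result+="small ";
                break;

            case 2 .. 9:
                result+="digit ";
                break;

            default:
                result+="other ";
        }
        return result;
    }

    int main()
    {
        write(describe(0)+"\n");    // writes "zero small "
        write(describe(1)+"\n");    // writes "small "
        write(describe(9)+"\n");    // writes "digit " (ranges are inclusive)
        write(describe(42)+"\n");   // writes "other "
        return 0;
    }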

    3.2 Loops

    Loops are used to execute a piece of code more than once. Since this can be done in quite a few different ways there are four different loop control structures. They may all seem very similar, but using the right one at the right time makes the code a lot shorter and simpler.

    3.2.1 while

    While is the simplest of the loop control structures. It looks just like an if statement without the else part:
    while ( expression )
        statement;
    The difference in how it works isn't that big either, the statement is executed if the expression is true. Then the expression is evaluated again, and if it is true the statement is executed again. Then it evaluates the expression again and so forth... Here is an example of how it could be used:
    int e=1;
    while(e<5)
    {
        show_record(e);
        e=e+1;
    }
    This would call show_record with the values 1, 2, 3 and 4.

    3.2.2 for

    For is simply an extension of while. It provides an even shorter and more compact way of writing loops. The syntax looks like this:
    for ( initializer_statement ; expression ; incrementor_expression )
        statement ;
    For does the following steps:
    1. Executes the initializer_statement. The initializer statement is executed only once and is most commonly used to initialize the loop variable.
    2. Evaluates expression
    3. If the result was false it exits the loop and continues with the program after the loop.
    4. Executes statement.
    5. Executes the incrementor_expression.
    6. Starts over from 2.
    This means that the example in the while section can be written like this:
    for(int e=1; e<5; e=e+1)
        show_record(e);

    3.2.3 do-while

    Sometimes it is impractical that the expression is always evaluated before the first time the loop is executed. Quite often you want to execute something, and then do it over and over until some condition is satisfied. This is exactly when you should use the do-while statement.
    do
        statement;
    while ( expression );
    As usual, the statement can also be a block of statements, and then you do not need a semicolon after it. To clarify, this statement executes statement first, and then evaluates the expression. If the expression is true it executes the loop again. For instance, if you want to make a program that lets your modem dial your Internet provider, it could look something like this:
    do {
        modem->write("ATDT441-9109\n"); // Dial 441-9109
    } while(modem->gets()[..6] != "CONNECT");
    This example assumes you have written something that can communicate with the modem by using the functions write and gets.
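
    If you do not have a modem at hand, the difference between do-while and while can be seen in this small sketch, where the expression is false from the very beginning:
    int main()
    {
        int n=10;

        do {
            // Executed once, even though n<5 is false from the start.
            write(sprintf("do-while body, n=%d\n", n));
            n++;
        } while(n<5);

        while(n<5)
        {
            write("while body\n");  // never executed, n is already 11
            n++;
        }
        return 0;
    }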

    3.2.4 foreach

    Foreach is unique in that it does not have an explicit test expression evaluated for each iteration in the loop. Instead, foreach executes the statement once for each element in an array. Foreach looks like this:
    foreach ( array_expression, variable )
        statement ;
    We have already seen an example of foreach in the find_song function in chapter 2. What foreach does is:
    1. It evaluates the array_expression which must return an array.
    2. If the array is empty, exit the loop.
    3. It then assigns the first element from the array to the variable.
    4. Then it executes the statement.
    5. If there are more elements in the array, the next one is assigned to the variable, otherwise exit the loop.
    6. Go to point 4.
    Foreach is not really necessary, but it is faster and clearer than doing the same thing with a for loop, as shown here:
    array tmp1= array_expression;
    for ( int tmp2 = 0; tmp2 < sizeof(tmp1); tmp2++ )
    {
        variable = tmp1 [ tmp2 ];
        statement;
    }
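    With real names filled in, the two forms could look like the sketch below. The array of songs is sample data; both loops write the same three lines:
    int main()
    {
        array(string) songs=({ "So What", "Blue in Green", "All Blues" });
        string song;

        foreach(songs, song)
            write(song+"\n");

        for(int i=0; i<sizeof(songs); i++)
        {
            song=songs[i];
            write(song+"\n");
        }
        return 0;
    }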

    3.3 Breaking out of loops

    The loop control structures above are enough to solve any problem, but they are not enough to provide an easy solution to all problems. One thing that is still missing is the ability to exit a loop in the middle of it. There are three ways to do this:

    3.3.1 break

    break exits a loop or switch statement immediately and continues executing after the loop. Break can not be used outside of a loop or switch. It is quite useful in conjunction with while(1) to construct command parsing loops for instance:
    while(1)
    {
        string command=Stdio.Readline()->read("> ");
        if(command=="quit") break;
        do_command(command);
    }

    3.3.2 continue

    Continue does almost the same thing as break, except instead of breaking out of the loop it only breaks out of the loop body. It then continues to execute the next iteration in the loop. For a while loop, this means it jumps up to the top again. For a for loop, it jumps to the incrementor expression. For a do-while loop it jumps down to the expression at the end. To continue our example above, continue can be used like this:
    while(1)
    {
        string command=Stdio.Readline()->read("> ");
        if(strlen(command) == 0) continue;
        if(command=="quit") break;
        do_command(command);
    }
    This way, do_command will never be called with an empty string as argument.
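
    A complete version of this loop might look like the following sketch. It assumes that read() returns zero at end of input, which is what the while loop in chapter 2 relies on, and it uses a made-up do_command that merely echoes the command:
    void do_command(string command)    // hypothetical stand-in
    {
        write("Got command: "+command+"\n");
    }

    int main()
    {
        while(1)
        {
            string command=Stdio.Readline()->read("> ");
            if(!command) break;                 // end of input
            if(strlen(command) == 0) continue;  // empty line, ask again
            if(command=="quit") break;
            do_command(command);
        }
        return 0;
    }
    The test for zero has to come first, since strlen cannot be used on zero.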

    3.3.3 return

    Return doesn't just exit the loop, it exits the whole function. We have seen several examples of how to use it in chapter 2. None of the functions in chapter two returned anything in particular however. To do that you just put the return value right after return. Of course the type of the return value must match the type in the function declaration. If your function declaration is int main() the value after return must be an int. For instance, if we wanted to make a program that always returns an error code to the system, just like the UNIX command false, this is how it would be done:
    #!/usr/local/bin/pike

    int main()
    {
        return 1;
    }
    This would return the error code 1 to the system when the program is run.

    3.4 Exercises

  • End all functions in the examples in chapter two with a return statement.
  • Change all foreach loops to for or while loops.
  • Make the find_song function in chapter 2 return when the first matching song is found.
  • Make the find_song function write the number of the record the song is on.
  • If you failed to get the program to work properly in the last exercise of chapter 2, try it again now.
  • Make a program that writes all the numbers from 1 to 1000.
  • Modify the program in the previous exercise to NOT write numbers divisible by 3, 7 or 17.
  • Make a program that writes all the prime numbers between 1 and 1000.

Chapter 4, Data types

    In this chapter we will discuss all the different ways to store data in Pike in detail. We have seen examples of many of these, but we haven't really gone into how they work. In this chapter we will also see which operators and functions work with the different types. There are two categories of data types in Pike: basic types, and pointer types. The difference is that basic types are copied when assigned to a variable. With pointer types, merely the pointer is copied, that way you get two variables pointing to the same thing.

    4.1 Basic types

    The basic types are int, float and string. For you who are accustomed to C or C++, it may seem odd that a string is a basic type as opposed to an array of char, but it is surprisingly easy to get used to.

    4.1.1 int

    Int is short for integer, or integer number. They are normally 32 bit integers, which means that they are in the range -2147483648 to 2147483647. Note that on some machines an int might be larger than 32 bits. Since they are integers, no decimals are allowed. An integer constant can be written in several ways:
    78 // decimal number
    0116 // octal number
    0x4e // hexadecimal number
    'N' // ASCII character
    All of the above represent the number 78. Octal notation means that each digit is worth 8 times as much as the one after. Hexadecimal notation means that each digit is worth 16 times as much as the one after. Hexadecimal notation uses the letters a, b, c, d, e and f to represent the numbers 10, 11, 12, 13, 14 and 15. The ASCII notation gives the ASCII value of the character between the single quotes. In this case the character is N which just happens to be 78 in ASCII.
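
    You can let Pike confirm this with a short sketch that compares the four notations:
    int main()
    {
        if(78 == 0116 && 78 == 0x4e && 78 == 'N')
            write("All four constants are equal.\n");
        return 0;
    }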

    Integers are coded in two's complement and overflows are silently ignored by Pike. This means that if your integers are 32-bit and you add 1 to the number 2147483647 you get the number -2147483648. This works exactly as in C or C++.

    All the arithmetic, bitwise and comparison operators can be used on integers. Also note these functions:

    int intp(mixed x)
    This function returns 1 if x is an int, 0 otherwise.
    int random(int x)
    This function returns a random number greater than or equal to zero and smaller than x.
    int reverse(int x)
    This function reverses the order of the bits in x and returns the new number. It is not very useful.
    int sqrt(int x)
    This computes the square root of x. The value is always rounded down.

    4.1.2 float

    Although most programs only use integers, they are impractical when doing trigonometric calculations, transformations or anything else where you need decimals. For this purpose you use float. Floats are normally 32 bit floating point numbers, which means that they can represent very large and very small numbers, but only with 9 accurate digits. To write a floating point constant, you just put in the decimals or write it in the exponential form:
    3.14159265358979323846264338327950288419716939937510 // Pi
    1.0e9 // A billion
    1.0e-9 // A billionth
    Of course you do not need this many decimals, but it doesn't hurt either. Usually digits after the ninth digit are ignored, but on some architectures float might have higher accuracy than that. In the exponential form, e means "times 10 to the power of", so 1.0e9 is equal to "1.0 times 10 to the power of 9".

    All the arithmetic and comparison operators can be used on floats. Also, these functions operate on floats:

    trigonometric functions
    The trigonometric functions are: sin, asin, cos, acos, tan and atan. If you do not know what these functions do you probably don't need them. Asin, acos and atan are of course short for arc sine, arc cosine and arc tangent. On a calculator they are often known as inverse sine, inverse cosine and inverse tangent.
    float log(float x)
    This function computes the natural logarithm of x.
    float exp(float x)
    This function computes e raised to the power of x.
    float pow(float|int x, float|int y)
    This function computes x raised to the power of y.
    float sqrt(float x)
    This computes the square root of x.
    float floor(float x)
    This function computes the largest integer value less than or equal to x. Note that the value is returned as a float, not an int.
    float ceil(float x)
    This function computes the smallest integer value greater than or equal to x and returns it as a float.
    float round(float x)
    This function computes the closest integer value to x and returns it as a float.
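
    Here is a short sketch of some of these functions in use. The value 2.7 is just a sample, and sprintf is used to control how the floats are written:
    int main()
    {
        float x=2.7;

        write(sprintf("%.1f %.1f %.1f\n", floor(x), ceil(x), round(x)));  // writes: 2.0 3.0 3.0
        write(sprintf("%.1f\n", pow(2.0, 10.0)));   // writes: 1024.0
        write(sprintf("%.1f\n", sqrt(16.0)));       // writes: 4.0
        write(sprintf("%d\n", (int)floor(x)));      // writes: 2
        return 0;
    }
    The last line shows how to get an int out of floor: since the result is a float, it has to be cast.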

    4.1.3 string

    A string can be seen as an array of values from 0 to 2³²-1. Usually a string contains text such as a word, a sentence, a page or even a whole book. But it can also contain parts of a binary file, compressed data or other binary data. Strings in Pike are shared, which means that identical strings share the same memory space. This reduces memory usage considerably for most applications and also speeds up string comparisons. We have already seen how to write a constant string:
    "hello world" // hello world
    "he" "llo" // hello
    "\116" // N (116 is the octal ASCII value for N)
    "\t" // A tab character
    "\n" // A newline character
    "\r" // A carriage return character
    "\b" // A backspace character
    "\0" // A null character
    "\"" // A double quote character
    "\\" // A single backslash
    "\x4e" // N (4e is the hexadecimal ASCII value for N)
    "\d78" // N (78 is the decimal ASCII value for N)
    "hello world\116\t\n\r\b\0\"\\" // All of the above
    "\xff" // the character 255
    "\xffff" // the character 65535
    "\xffffff" // the character 16777215
    "\116""3" // 'N' followed by a '3'
    As you can see, any sequence of characters within double quotes is a string. The backslash character is used to escape characters that are not allowed or impossible to type. As you can see, \t is the sequence to produce a tab character, \\ is used when you want one backslash and \" is used when you want a double quote (") to be a part of the string instead of ending it. Also, \XXX where XXX is an octal number from 0 to 37777777777 or \xXX where XX is 0 to ffffffff lets you write any character you want in the string, even null characters. From version 0.6.105, you may also use \dXXX where XXX is 0 to 2³²-1. If you write two constant strings after each other, they will be concatenated into one string.

    You might be surprised to see that individual characters can have values up to 2³²-1 and wonder how much memory that uses. Do not worry, Pike automatically decides the proper amount of memory for a string, so all strings with character values in the range 0-255 will be stored with one byte per character. You should also be aware that not all functions can handle strings which are not stored as one byte per character, so there are some limits to when this feature can be used.

    Although strings are a form of arrays, they are immutable. This means that there is no way to change an individual character within a string without creating a new string. This may seem strange, but keep in mind that strings are shared, so if you would change a character in the string "foo", you would change *all* "foo" everywhere in the program.

    However, the Pike compiler will allow you to write code as if you could change characters within strings. The following code is valid and works:

    string s="hello torld";
    s[6]='w';
    However, you should be aware that this does in fact create a new string and it may need to copy the string s to do so. This means that the above operation can be quite slow for large strings. You have been warned. Most of the time, you can use replace, sscanf, `/ or some other high-level string operation to avoid having to use the above construction too much.

    All the comparison operators plus the operators listed here can be used on strings:

    Summation
    Adding strings together will simply concatenate them. "foo"+"bar" becomes "foobar".
    Subtraction
    Subtracting one string from another will remove all occurrences of the second string from the first one. So "foobarfoogazonk" - "foo" results in "bargazonk".
    Indexing
    Indexing will let you get the ASCII value of any character in a string. The first index is zero.
    Range
    The range operator will let you copy any part of the string into a new string. Example: "foobar"[2..4] will return "oba".
    Division
    Division will let you divide a string at every occurrence of a word or character. For instance if you do "foobargazonk" / "o" the result would be ({"f","","bargaz","nk"}). It is also possible to divide the string into strings of length N by dividing the string by N. If N is converted to a float before dividing, the remainder of the division will be included in the result.
    Multiplication
    The inverse of the division operator can be accomplished by multiplying an array with a string. So if you evaluate ({"f","","bargaz","nk"}) * "o" the result would be "foobargazonk".
    Modulo
    To complement the division operator, you can do string % int. This operator will simply return the part of the string that was not included in the array returned by string / int.
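
    The operators above can be tried out with the following sketch. The variable names are made up, and the values in the comments are the ones given in the descriptions above:
    int main()
    {
        string s="foo"+"bar";                     // "foobar"
        string t="foobarfoogazonk"-"foo";         // "bargazonk"
        int c="foobar"[0];                        // 102, the value of 'f'
        string u="foobar"[2..4];                  // "oba"
        array(string) parts="foobargazonk"/"o";   // ({"f","","bargaz","nk"})
        string joined=parts*"o";                  // "foobargazonk"
        array(string) chunks="abcdefg"/3;         // ({"abc","def"})
        string rest="abcdefg"%3;                  // "g"

        write(s+" "+t+" "+u+" "+joined+" "+rest+"\n");
        // writes: foobar bargazonk oba foobargazonk g
        write(sprintf("%d %d %d\n", c, sizeof(parts), sizeof(chunks)));
        // writes: 102 4 2
        return 0;
    }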

    Also, these functions operate on strings:

    string String.capitalize(string s)
    Returns s with the first character converted to upper case.
    int String.count(string haystack, string needle)
    Returns the number of occurrences of needle in haystack. Equivalent to sizeof(haystack/needle)-1.
    int String.width(string s)
    Returns the width of s in bits (8, 16 or 32).
    string lower_case(string s)
    Returns s with all the upper case characters converted to lower case.
    string replace(string s, string from, string to)
    This function replaces all occurrences of the string from in s with to and returns the new string.
    string reverse(string s)
    This function returns a copy of s with the last byte from s first, the second last in second place and so on.
    int search(string haystack, string needle)
    This function finds the first occurrence of needle in haystack and returns where it found it.
    int sizeof(string s)
    Same as strlen(s), returns the length of the string.
    int stringp(mixed s)
    This function returns 1 if s is a string, 0 otherwise.
    int strlen(string s)
    Returns the length of the string s.
    string upper_case(string s)
    This function returns s with all lower case characters converted to upper case.
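
    A few of these functions are shown in the sketch below. The strings are sample values:
    int main()
    {
        string s="Hello World";

        write(lower_case(s)+"\n");                  // writes: hello world
        write(upper_case(s)+"\n");                  // writes: HELLO WORLD
        write(replace(s, "World", "Pike")+"\n");    // writes: Hello Pike
        write(reverse(s)+"\n");                     // writes: dlroW olleH
        write(String.capitalize("pike")+"\n");      // writes: Pike
        write(sprintf("%d %d\n", sizeof(s), strlen(s)));        // writes: 11 11
        write(sprintf("%d\n", String.count("banana", "an")));   // writes: 2
        return 0;
    }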

    4.2 Pointer types

    The basic types are, as the name implies, very basic. They are the foundation; most of the pointer types are merely interesting ways to store the basic types. The pointer types are array, mapping, multiset, program, object and function. They are all pointers, which means that they point to something in memory. This "something" is freed when there are no more pointers to it. Assigning a value of a pointer type to a variable will not copy this "something"; instead it will only generate a new reference to it. Special care sometimes has to be taken when giving one of these types as arguments to a function; the function can in fact modify the "something". If this effect is not wanted you have to explicitly copy the value. More about this will be explained later in this chapter.
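
    The sketch below shows the effect with an array. The function clear_first is made up for this example, and the builtin function copy_value is used here as one way of making an explicit copy:
    void clear_first(array(int) arr)
    {
        arr[0]=0;   // modifies the array the caller passed in
    }

    int main()
    {
        array(int) a=({ 1, 2, 3 });
        clear_first(a);
        write(sprintf("%d\n", a[0]));   // writes: 0

        array(int) b=({ 1, 2, 3 });
        clear_first(copy_value(b));     // only the copy is modified
        write(sprintf("%d\n", b[0]));   // writes: 1
        return 0;
    }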

    4.2.1 array

    Arrays are the simplest of the pointer types. An array is merely a block of memory with a fixed size containing a number of slots which can hold any type of value. These slots are called elements and are accessible through the index operator. To write a constant array you enclose the values you want in the array with ({ }) like this:
    ({ }) // Empty array
    ({ 1 }) // Array containing one element of type int
    ({ "" }) // Array containing a string
    ({ "", 1, 3.0 }) // Array of three elements, each of different type
    As you can see, each element in the array can contain any type of value. Indexing and ranges on arrays work just like on strings, except that with arrays you can change values inside the array with the index operator. However, there is no way to change the size of the array, so if you want to append values to the end you still have to add another array to it, which creates a new array. Figure 4.1 shows the schematics of an array. As you can see, it is a very simple memory structure.


    fig 4.1
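
    The difference between changing an element and appending can be seen in this small sketch, where two variables start out referring to the same array:
    int main()
    {
        array(int) a=({ 1, 2 });
        array(int) b=a;     // a and b refer to the same array

        a[1]=20;            // changes the array that both refer to
        a+=({ 3 });         // builds a new array and stores it in a

        write(sprintf("%d %d\n", sizeof(a), sizeof(b)));    // writes: 3 2
        write(sprintf("%d\n", b[1]));                       // writes: 20
        return 0;
    }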

    Operators and functions usable with arrays:

    indexing ( arr [ c ] )
    Indexing an array retrieves or sets a given element in the array. The index c has to be an integer. To set an index, simply put the whole thing on the left side of an assignment, like this: arr [ c ] = new_value
    range ( arr [ from .. to ] )
    The range copies the elements from, from+1, from+2 ... to into a new array. The new array will have the size to-from+1.
    comparing (a == b and a != b)
    The equal operator returns 1 if a and b are the same arrays. It is not enough that they have the same size and same data. They must be the same array. For example: ({1}) == ({1}) would return 0, while array(int) a=({1}); return a==a; would return 1. Note that you cannot use the operators >, >=, < or <= on arrays.
    Summation (a + b)
    As with strings, summation concatenates arrays. ({1})+({2}) returns ({1,2}).
    Subtractions (a - b)
    Subtracting one array from another returns a copy of a with all the elements that are also present in b removed. So ({1,3,8,3,2}) - ({3,1}) returns ({8,2}).
    Intersection (a & b)
    Intersection returns an array with all values that are present in both a and b. The order of the elements will be the same as the order of the elements in a. Example: ({1,3,7,9,11,12}) & ({4,11,8,9,1}) will return: ({1,9,11}).
    Union (a | b)
    Union works almost as summation, but it only adds elements not already present in a. So, ({1,2,3}) | ({1,3,5}) will return ({1,2,3,5}). Note: the order of the elements in a can be changed!
    Xor (a ^ b)
    This is also called symmetric difference. It returns an array with all elements present in a or b but the element must NOT be present in both. Example: ({1,3,5,6}) ^ ({4,5,6,7}) will return ({1,3,4,7}).
    Division (a / b)
    This will split the array a into an array of arrays. If b is another array, a will be split at each occurrence of that array. If b is an integer or float, a will be split between every bth element. Examples: ({1,2,3,4,5})/({2,3}) will return ({ ({1}), ({4,5}) }) and ({1,2,3,4})/2 will return ({ ({1,2}), ({3,4}) }).
    Modulo (a % b)
    This operation is valid only if b is an integer. It will return the part of the array that was not included by dividing a by b.
    array aggregate(mixed ... elems)
    This function does the same as the ({ }) operator; it creates an array from all arguments given to it. In fact, writing ({1,2,3}) is the same as writing aggregate(1,2,3).
    array allocate(int size)
    This function allocates a new array of size size. All the elements in the new array will be zeroes.
    int arrayp(mixed a)
    This function returns 1 if a is an array, 0 otherwise.
    array column(array(mixed) a, mixed ind)
    This function goes through the array a and indexes every element in it on ind and builds an array of the results. So if you have an array a in which each element is also an array, this function will take a cross section by picking out element ind from each of the arrays in a. Example: column( ({ ({1,2,3}), ({4,5,6}), ({7,8,9}) }), 2) will return ({3,6,9}).
    int equal(mixed a, mixed b)
    This function returns 1 if a and b look the same. They do not have to be pointers to the same array, as long as they are the same size and contain equal data.
    array filter(array a, mixed func, mixed ... args)
    filter returns every element in a for which func returns true when called with that element as first argument, and args for the second, third, etc. arguments. (Both a and func can be other things; see the reference for filter for details about that.)
    array map(array a, mixed func, mixed ... args)
    This function works similarly to filter but returns the results of the function func instead of returning the elements from a for which func returns true. (Like filter, this function accepts other things for a and func; see the reference for map.)
    array replace(array a, mixed from, mixed to)
    This function will create a copy of a with all elements equal to from replaced by to.
    array reverse(array a)
    Reverse will create a copy of a with the last element first, the last but one second, and so on.
    array rows(array a, array indexes)
    This function is similar to column. It indexes a with each element from indexes and returns the results in an array. For example: rows( ({"a","b","c"}), ({ 2,1,2,0}) ) will return ({"c","b","c","a"}).
    int search(array haystack, mixed needle)
    This function returns the index of the first occurrence of an element equal (tested with ==) to needle in the array haystack.
    int sizeof(mixed arr)
    This function returns the number of elements in the array arr.
    array sort(array arr, array ... rest)
    This function sorts arr in smaller-to-larger order. Numbers, floats and strings can be sorted. If there are any additional arguments, they will be permutated in the same manner as arr. See chapter 16 "Builtin functions" for more details.
    array uniq(array a)
    This function returns a copy of the array a with all duplicate elements removed. Note that this function can return the elements in any order.
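
    A few of the array operators and functions above can be tried out with a small sketch like the one below. The names split_with_rest, larger_than and double_large are made up for this example; they are not part of Pike.

```pike
// Split a into pieces of n elements and also return the leftover part.
// Element 0 of the result is a / n, element 1 is a % n.
array split_with_rest(array a, int n)
{
    return ({ a / n, a % n });
}

// Hypothetical helper used with filter below: is x larger than limit?
int larger_than(int x, int limit)
{
    return x > limit;
}

// Keep the elements that are larger than limit, then double each of them.
array double_large(array a, int limit)
{
    // filter calls larger_than(element, limit) for every element in a.
    array large = filter(a, larger_than, limit);
    // map collects the return values of the lambda instead.
    return map(large, lambda(int x) { return x * 2; });
}

int main()
{
    write(sprintf("%O\n", split_with_rest(({1,2,3,4,5}), 2)));
    write(sprintf("%O\n", double_large(({1,5,2,8}), 4)));
    return 0;
}
```

    Here the extra argument to filter ends up as the second argument to larger_than, which is how the args parameter described above is meant to be used.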

    4.2.2 mapping

    Mappings are really just more generic arrays. However, they are slower and use more memory than arrays, so they cannot replace arrays completely. What makes mappings special is that they can be indexed on other things than integers. We can imagine that a mapping looks like this:


    fig 4.2

    Each index-value pair is floating around freely inside the mapping. There is exactly one value for each index. We also have a (magical) lookup function. This lookup function can find any index in the mapping very quickly. Now, if the mapping is called m and we index it like this: m [ i ] the lookup function will quickly find the index i in the mapping and return the corresponding value. If the index is not found, zero is returned instead. If we on the other hand assign an index in the mapping the value will instead be overwritten with the new value. If the index is not found when assigning, a new index-value pair will be added to the mapping. Writing a constant mapping is easy:

    ([ ]) // Empty mapping
    ([ 1:2 ]) // Mapping with one index-value pair, the 1 is the index
    ([ "one":1, "two":2 ]) // Mapping which maps words to numbers
    ([ 1:({2.0}), "":([]), ]) // Mapping with lots of different types

    As with arrays, mappings can contain any type. The main difference is that the index can be any type too. Also note that the index-value pairs in a mapping are not stored in a specific order. You cannot refer to the fourteenth index-value pair, since there is no way of telling which one is the fourteenth. Because of this, you cannot use the range operator on mappings.
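
    The lookup and assignment rules described above can be sketched as follows. The function make_book and its contents are invented for this illustration only.

```pike
// Build a small mapping from names to numbers.
mapping(string:int) make_book()
{
    mapping(string:int) book = ([ "alice": 1234 ]);
    book["alice"] = 5678;   // the index exists: its value is overwritten
    book["bob"] = 42;       // the index is missing: a new pair is added
    return book;
}

int main()
{
    mapping(string:int) book = make_book();
    // "carol" was never stored, so indexing on it gives zero.
    write(sprintf("%d %d %d\n", book["alice"], book["bob"], book["carol"]));
    return 0;
}
```

    Note that looking up "carol" does not add anything to the mapping; only assignment adds new index-value pairs.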

    The following operators and functions are important:

    indexing ( m [ ind ] )
    As discussed above, indexing is used to retrieve, store and add values to the mapping.
    addition, subtraction, union, intersection and xor
    All these operators work exactly as on arrays, with the difference that they operate on the indices. In those cases when the value can come from either mapping, it will be taken from the right side of the operator. This makes it easier to add new values to a mapping with +=. Some examples:
    ([1:3, 3:1]) + ([2:5, 3:7]) returns ([1:3, 2:5, 3:7 ])
    ([1:3, 3:1]) - ([2:5, 3:7]) returns ([1:3])
    ([1:3, 3:1]) | ([2:5, 3:7]) returns ([1:3, 2:5, 3:7 ])
    ([1:3, 3:1]) & ([2:5, 3:7]) returns ([3:7])
    ([1:3, 3:1]) ^ ([2:5, 3:7]) returns ([1:3, 2:5])
    same ( a == b )
    Returns 1 if a is the same mapping as b, 0 otherwise.
    not same ( a != b )
    Returns 0 if a is the same mapping as b, 1 otherwise.
    array indices(mapping m)
    Indices returns an array containing all the indices in the mapping m.
    mixed m_delete(mapping m, mixed ind)
    This function removes the index-value pair with the index ind from the mapping m. It will return the value that was removed.
    int mappingp(mixed m)
    This function returns 1 if m is a mapping, 0 otherwise.
    mapping mkmapping(array ind, array val)
    This function constructs a mapping from the two arrays ind and val. Element 0 in ind and element 0 in val become one index-value pair. Element 1 in ind and element 1 in val become another index-value pair, and so on.
    mapping replace(mapping m, mixed from, mixed to)
    This function creates a copy of the mapping m with all values equal to from replaced by to.
    mixed search(mapping m, mixed val)
    This function returns the index of the 'first' index-value pair which has the value val.
    int sizeof(mapping m)
    Sizeof returns how many index-value pairs there are in the mapping.
    array values(mapping m)
    This function does the same as indices, but returns an array with all the values instead. If indices and values are called on the same mapping after each other, without any other mapping operations in between, the returned arrays will be in the same order. They can in turn be used as arguments to mkmapping to rebuild the mapping m again.
    int zero_type(mixed t)
    When indexing a mapping and the index is not found, zero is returned. However, problems can arise if you have also stored zeroes in the mapping. This function allows you to see the difference between the two cases. If zero_type(m [ ind ]) returns 1, it means that the value was not present in the mapping. If the value was present in the mapping, zero_type will return something other than 1.
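
    As a rough illustration of how some of these functions fit together, here is a small sketch. The helpers count_words, has_index and rebuild are hypothetical names chosen for this example.

```pike
// Count how many times each word occurs in words.
mapping(string:int) count_words(array(string) words)
{
    mapping(string:int) counts = ([]);
    foreach(words, string w)
        counts[w]++;    // a missing index reads as zero, so the count starts at 1
    return counts;
}

// Tell "the index is missing" apart from "the index is stored with value 0".
int has_index(mapping m, mixed ind)
{
    return zero_type(m[ind]) != 1;
}

// Rebuild a mapping from its indices and values, as described for values().
mapping rebuild(mapping m)
{
    return mkmapping(indices(m), values(m));
}

int main()
{
    mapping(string:int) counts = count_words(({ "a", "b", "a" }));
    write(sprintf("%O\n", counts));
    write(sprintf("%d\n", has_index(([ "x": 0 ]), "x")));
    write(sprintf("%d\n", has_index(([ "x": 0 ]), "y")));
    write(sprintf("%d\n", equal(rebuild(counts), counts)));
    return 0;
}
```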

    4.2.3 multiset

    A multiset is almost the same thing as a mapping. The difference is that there are no values:


    fig 4.3

    Instead, the index operator will return 1 if the value was found in the multiset and 0 if it was not. When assigning an index to a multiset like this: mset[ ind ] = val the index ind will be added to the multiset mset if val is true. Otherwise ind will be removed from the multiset instead.

    Writing a constant multiset is similar to writing an array:

    (< >) // Empty multiset
    (< 17 >) // Multiset with one index: 17
    (< "", 1, 3.0, 1 >) // Multiset with 3 indices
    Note that you can actually have more than one of the same index in a multiset. This is normally not used, but can be practical at times.
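
    A typical use of a multiset is to remember what has already been seen. The sketch below assumes nothing beyond the indexing and assignment rules described above; first_occurrences and toggle are names invented for the example.

```pike
// Return the elements of words that have not occurred earlier in the array.
array(string) first_occurrences(array(string) words)
{
    multiset(string) seen = (< >);
    array(string) result = ({});
    foreach(words, string w)
    {
        if (!seen[w])           // indexing gives 1 if w is in the multiset
        {
            seen[w] = 1;        // assigning a true value adds the index
            result += ({ w });
        }
    }
    return result;
}

// Add ind to s if it is missing, remove it if it is present.
// Returns 1 if ind is in s afterwards, 0 otherwise.
int toggle(multiset s, mixed ind)
{
    s[ind] = !s[ind];           // assigning a false value removes the index
    return s[ind];
}

int main()
{
    write(sprintf("%O\n", first_occurrences(({ "a", "b", "a", "c", "b" }))));
    return 0;
}
```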

    4.2.4 program

    Normally, when we say program we mean something we can execute from a shell prompt. However, Pike has another meaning for the same word. In Pike a program is the same as a class in C++. A program holds a table of what functions and variables are defined in that program. It also holds the code itself, debug information and references to other programs in the form of inherits. A program does not hold space to store any data however. All the information in a program is gathered when a file or string is run through the Pike compiler. The variable space needed to execute the code in the program is stored in an object which is the next data type we will discuss.


    fig 4.4
    Writing a program is easy. In fact, every example we have tried so far has been a program. To load such a program into memory, we can use compile_file which takes a file name, compiles the file and returns the compiled program. It could look something like this:
    program p = compile_file("hello_world.pike");
    You can also use the cast operator like this:
    program p = (program) "hello_world";
    This will also load the program hello_world.pike. The only difference is that it will cache the result, so that the next time you do (program)"hello_world" you will receive the _same_ program. If you call compile_file("hello_world.pike") repeatedly you will get a new program each time.

    There is also a way to write programs inside programs with the help of the class keyword:

    class class_name {
        inherits, variables and functions
    }
    The class keyword can be written as a separate entity outside of all functions, but it is also an expression which returns the program written between the brackets. The class_name is optional. If used you can later refer to that program by the name class_name. This is very similar to how classes are written in C++ and can be used in much the same way. It can also be used to create structs (or records if you program Pascal). Let's look at an example:
    class record {
        string title;
        string artist;
        array(string) songs;
    }

    array(record) records = ({});

    void add_empty_record()
    {
        records+=({ record() });
    }

    void show_record(record rec)
    {
        write("Record name: "+rec->title+"\n");
        write("Artist: "+rec->artist+"\n");
        write("Songs:\n");
        foreach(rec->songs, string song)
            write(" "+song+"\n");
    }
    This could be a small part of a better record register program. It is not a complete executable program in itself. In this example we create a program called record which has three identifiers. In add_empty_record a new object is created by calling record. This is called cloning and it allocates space to store the variables defined in the class record. Show_record takes one of the records created in add_empty_record and shows the contents of it. As you can see, the arrow operator is used to access the data allocated in add_empty_record. If you do not understand this section I suggest you go on and read the next section about objects and then come back and read this section again.

    cloning
    To create a data area for a program you need to instantiate or clone the program. This is accomplished by using a pointer to the program as if it were a function and calling it. That creates a new object and calls the function create in the new object with the arguments. It is also possible to use the functions new() and clone() which do exactly the same thing except you can use a string to specify what program you want to clone.
    compiling
    All programs are generated by compiling a string. The string may of course be read from a file. For this purpose there are three functions:
    program compile(string p);
    program compile_file(string filename);
    program compile_string(string p, string filename);
    compile_file simply reads the file given as argument, compiles it and returns the resulting program. compile_string instead compiles whatever is in the string p. The second argument, filename, is only used in debug printouts when an error occurs in the newly made program. Both compile_file and compile_string call compile to actually compile the string after calling cpp on it.
    casting
    Another way of compiling files to program is to use the cast operator. Casting a string to the type program calls a function in the master object which will compile the program in question for you. The master also keeps the program in a cache, so if you later need the same program again it will not be re-compiled.
    int programp(mixed p)
    This function returns 1 if p is a program, 0 otherwise.
    comparisons
    As with all data types, == and != can be used to see if two programs are the same or not.

    The following operators and functions are important:

    cloning ( p ( args ) )
    Creates an object from a program. Discussed in the next section.
    indexing ( p [ string ], or p -> identifier )
    Retrieves the value of the named constant from a program.
    array(string) indices(program p)
    Returns an array with the names of all non-static constants in the program.
    array(mixed) values(program p)
    Returns an array with the values of all non-static constants in the program.
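
    To see compiling, cloning and indexing of programs working together, consider the following sketch. The source text, the file name "tiny.pike" (only used in error messages) and the helper make_program are assumptions made for this example.

```pike
// Source text for a tiny program with one constant and one function.
constant source =
    "constant answer = 42;\n"
    "int twice(int x) { return x * 2; }\n";

// Compile the source text. Every call gives a brand new program.
program make_program()
{
    return compile_string(source, "tiny.pike");
}

int main()
{
    program p = make_program();
    object o = p();                             // cloning the program
    write(sprintf("%d\n", o->twice(21)));       // calling a function in the clone
    write(sprintf("%d\n", p->answer));          // indexing the program for a constant
    write(sprintf("%d\n", make_program() == p));
    return 0;
}
```

    The last line compares two separately compiled programs. Since nothing caches the result here, they are not the same program even though the source text is identical.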

    4.2.5 object

    Although programs are absolutely necessary for any application you might want to write, they are not enough. A program doesn't have anywhere to store data; it merely outlines how to store data. To actually store the data you need an object. An object is basically a chunk of memory with a reference to the program from which it was cloned. Many objects can be made from one program. The program outlines where in the object different variables are stored.

    fig 4.5
    Each object has its own set of variables, and when calling a function in that object, that function will operate on those variables. If we take a look at the short example in the section about programs, we see that it would be better to write it like this:
    class record {
        string title;
        string artist;
        array(string) songs;

        void show()
        {
            write("Record name: "+title+"\n");
            write("Artist: "+artist+"\n");
            write("Songs:\n");
            foreach(songs, string song)
                write(" "+song+"\n");
        }
    }

    array(record) records = ({});

    void add_empty_record()
    {
        records+=({ record() });
    }

    void show_record(object rec)
    {
        rec->show();
    }
    Here we can clearly see how the function show prints the contents of the variables in that object. In essence, instead of accessing the data in the object with the -> operator, we call a function in the object and have it write the information itself. This type of programming is very flexible, since we can later change how record stores its data, but we do not have to change anything outside of the record program.

    Functions and operators relevant to objects:

    indexing
    Objects can be indexed on strings to access identifiers. If the identifier is a variable, the value can also be set using indexing. If the identifier is a function, a pointer to that function will be returned. If the identifier is a constant, the value of that constant will be returned. Note that the -> operator is actually the same as indexing. This means that o->foo is the same as o["foo"].
    cloning
    As discussed in the section about programs, cloning a program can be done in two different ways: by calling the program as if it were a function, or by using the functions new() and clone().
    Whenever you clone an object, all the global variables will be initialized. After that the function create will be called with any arguments you call the program with.
    void destruct(object o)
    This function invalidates all references to the object o and frees all variables in that object. This function is also called when o runs out of references. If there is a function named destroy in the object, it will be called before the actual destruction of the object.
    array(string) indices(object o)
    This function returns a list of all identifiers in the object o.
    program object_program(object o)
    This function returns the program from which o was cloned.
    int objectp(mixed o)
    This function returns 1 if o is an object, 0 otherwise. Note that if o has been destructed, this function will return 0.
    object this_object()
    This function returns the object in which the interpreter is currently executing.
    array values(object o)
    This function returns the same as rows(o,indices(o)). That means it returns all the values of the identifiers in the object o.
    comparing
    As with all data types, == and != can be used to check if two objects are the same or not.
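
    The following sketch shows how create, indexing, object_program and destruct behave for a simple object. The class counter is a made-up example and not something that comes with Pike.

```pike
class counter {
    int value;

    // create is called automatically with the arguments given when cloning.
    void create(int start)
    {
        value = start;
    }

    // Increase the counter by one and return the new value.
    int next()
    {
        return ++value;
    }
}

int main()
{
    object c = counter(10);             // clones counter and calls create(10)
    write(sprintf("%d\n", c->next()));
    c["value"] = 20;                    // the same as c->value = 20
    function f = c["next"];             // a pointer to the function next in c
    write(sprintf("%d\n", f()));
    write(sprintf("%d\n", object_program(c) == counter));
    destruct(c);
    write(sprintf("%d\n", objectp(c))); // c no longer counts as an object
    return 0;
}
```

    Each clone of counter has its own value variable, so two counters created with different start values never affect each other.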

    4.2.6 function

    When indexing an object on a string, and that string is the name of a function in the object, a function is returned. Despite its name, a function is really a function pointer.

    fig 4.6
    When the function pointer is called, the interpreter sets this_object() to the object in which the function is located and proceeds to execute the function it points to. Also note that function pointers can be passed around just like any other data type:
    int foo() { return 1; }
    function bar() { return foo; }
    int gazonk() { return foo(); }
    int teleledningsanka() { return bar()(); }
    In this example, the function bar returns a pointer to the function foo. No indexing is necessary since the function foo is located in the same object. The function gazonk simply calls foo. However, note that the word foo in that function is an expression returning a function pointer that is then called. To further illustrate this, foo has been replaced by bar() in the function teleledningsanka.

    For convenience, there is also a simple way to write a function inside another function. To do this you use the lambda keyword. The syntax is the same as for a normal function, except you write lambda instead of the function name:

    lambda ( types ) { statements }
    The major difference is that this is an expression that can be used inside another function. Example:
    function bar() { return lambda() { return 1; }; }
    This is the same as the first two lines in the previous example; the keyword lambda allows you to write the function inside bar.

    Note that unlike C++ and Java you can not use function overloading in Pike. This means that you cannot have one function called 'foo' which takes an integer argument and another function 'foo' which takes a float argument.

    This is what you can do with a function pointer.

    calling ( f ( mixed ... args ) )
    As mentioned earlier, all function pointers can be called. In this example the function f is called with the arguments args.
    string function_name(function f)
    This function returns the name of the function f is pointing at.
    object function_object(function f)
    This function returns the object the function f is located in.
    int functionp(mixed f)
    This function returns 1 if f is a function, 0 otherwise. If f is located in a destructed object, 0 is returned.
    function this_function()
    This function returns a pointer to the function it is called from. This is normally only used with lambda functions because they do not have a name.
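
    The operations on function pointers listed above can be combined as in this sketch. The helpers describe and apply_twice are hypothetical and only serve to show that a function pointer is passed around like any other value.

```pike
int foo() { return 1; }

// Call the function f and report its name together with what it returned.
string describe(function f)
{
    return sprintf("%s returned %d", function_name(f), f());
}

// Call f on x, then call f again on the result.
mixed apply_twice(function f, mixed x)
{
    return f(f(x));
}

int main()
{
    write(describe(foo) + "\n");
    // A lambda is passed in the same way as a named function.
    write(sprintf("%d\n", apply_twice(lambda(int x) { return x * 3; }, 2)));
    // foo is located in the object that is currently executing.
    write(sprintf("%d\n", function_object(foo) == this_object()));
    return 0;
}
```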

    4.3 Sharing data

    As mentioned in the beginning of this chapter, the assignment operator (=) does not copy anything when you use it on a pointer type. Instead it just creates another reference to the memory object. In most situations this does not present a problem, and it speeds up Pike's performance. However, you must be aware of this when programming. This can be illustrated with an example:
    int main(int argc, array(string) argv)
    {
        array(string) tmp;
        tmp=argv;
        argv[0]="Hello world.\n";
        write(tmp[0]);
    }
    This program will of course write Hello world.

    Sometimes you want to create a copy of a mapping, array or object. To do so you simply call copy_value with whatever you want to copy as argument. Copy_value is recursive, which means that if you have an array containing arrays, copies will be made of all those arrays.

    If you don't want to copy recursively, or you know you don't have to copy recursively, you can use the plus operator instead. For instance, to create a copy of an array you simply add an empty array to it, like this: copy_of_arr = arr + ({}); If you need to copy a mapping you use an empty mapping, and for a multiset you use an empty multiset.
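
    The difference between a plain assignment, a copy made with the plus operator and a copy made with copy_value can be sketched like this. The function sharing_demo is an invented name, and the nested array is just test data.

```pike
// Returns ({ same, shallow, deep }) where each element is 1 if a change
// made to the inner array is visible through that reference or copy.
array(int) sharing_demo()
{
    array inner = ({ 1 });
    array a = ({ inner });

    array same = a;                 // another reference, nothing is copied
    array shallow = a + ({});       // new outer array, same inner array
    array deep = copy_value(a);     // new outer array and new inner array

    inner[0] = 2;                   // change the inner array in place

    return ({ same[0][0] == 2, shallow[0][0] == 2, deep[0][0] == 2 });
}

int main()
{
    write(sprintf("%O\n", sharing_demo()));
    return 0;
}
```

    Only the copy made with copy_value is unaffected by the change, since adding an empty array copies the outer array but still shares the arrays inside it.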

    4.4 Writing data types

    When declaring a variable, you also have to specify what type of variable it is. For most types, such as int and string, this is very easy. But there are many more interesting ways to declare variables than that. Let's look at a few examples:
    int x; // x is an integer
    int|string x; // x is a string or an integer
    array(string) x; // x is an array of strings
    array x; // x is an array of mixed
    mixed x; // x can be any type
    string *x; // x is an array of strings

    // x is a mapping from string to int
    mapping(string:int) x;

    // x implements Stdio.File
    Stdio.File x;

    // x implements Stdio.File
    object(Stdio.File) x;

    // x is a function that takes two integer
    // arguments and returns a string
    function(int,int:string) x;

    // x is a function taking any amount of
    // integer arguments and returns nothing.
    function(int...:void) x;

    // x is ... complicated
    mapping(string:function(string|int...:mapping(string:array(string)))) x;
    As you can see there are some interesting ways to specify types. Here is a list of what is possible:
    mixed
    This means that the variable can contain any type, or that the function can return any value.
    array( type )
    This means an array of elements with th