Chapter 9 : Utilizing the power of the Internet. PERL for server side programming.

Very often, when you surf the Internet, you will encounter forms like the one below. This chapter discusses how such forms are written, and how a program on the server side handles them.

You may fill in the form with fictitious data and send it. The server side will print the information it receives. You may not understand that information now, but you will after reading this chapter. To return to this chapter, click "Previous Page" in your browser.


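[The form, as displayed in the browser]

Your name               (one-line text box)
Your Password           (password box)
Choices                 (check boxes : Mathematics, English, Physics)
Your time preference    (radio buttons : Morning, Afternoon, Evening, Holidays)
Please explain in a few words the reason why you choose to study here   (text area)
Your country            (selection list)
Buttons                 ("o.k. send", "clear and reset")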




The actual coding of the above form is as follows. (Just have a quick look at it; the details will be explained later.)


<form name=form1 method=post action="http://balder.prohosting.com/sywu/cgi-bin/t5.pl">

<p>Your name <input type=text name=uname size=40 maxlength=40></p>

<p>Your Password <input type=password name=pword size=10 maxlength=10>

<h2>Choices</h2>
<p><input type=checkbox name=ck1 >Mathematics<br>
   <input type=checkbox name=ck2 >English<br>
   <input type=checkbox name=ck3 >Physics
</p>

<h2>Your time preference</h2>
<p><input type=radio value="0" name=timep >Morning<br>
   <input type=radio value="1" name=timep >Afternoon<br>
   <input type=radio value="2" name=timep >Evening<br>
   <input type=radio value="3" name=timep >Holidays
</p>

<h2>Please explain in a few words the reason why you choose to study here</h2>
<p align=center>
<table width=80% ><tr><td>
<textarea rows=10 cols=50 name=reason >
e.g. I wish to be a better person
</textarea>
</td></tr></table>
</p>

<h2>Your country</h2>
<select name=country size=5 multiple>
<option selected>Hong Kong</option>
<option>Kowloon</option>
<option>New Territories</option>
<option>China</option>
<option>Macau</option>
<option>Singapore</option>
<option>Malaysia</option>
<option>Japan</option>
</select>

<input type=hidden name=htext value=10>

<p ><input type=submit value="o.k. send" ><br><br>
 <input type=reset value="clear and reset">
</p>
</form>

And the response looks somewhat like the following. (Alas! I have often had difficulty accessing the server. So, if you submit the form and it fails, you may read the typical response below. The PERL program that produces it is shown later in this chapter.)

uname=Wu+Siu+Yan&pword=aaaaaa&ck2=on&ck3=on&timep=1&
reason=Testing+testing+one+two+three%0D%0Alast+line.&country=Kowloon&htext=10


reason => Testing testing one two three last line.

ck2 => on

ck3 => on

country => Kowloon

pword => aaaaaa

uname => Wu Siu Yan

timep => 1

htext => 10

You will find that the form contains text boxes for you to fill in, check boxes and radio buttons (the round ones) to click, and buttons (e.g. "o.k. send", "clear and reset").

If you do not know basic HTML (Hypertext Mark-Up Language), you should read [How to write web pages] before you proceed.

We shall discuss the following

  1. Form Tag - e.g. <form> ... </form>
  2. Text - where one may fill in 1 line of text only.
  3. Textarea - where one may fill in a whole box of data.
  4. Password - Sometimes, we may require a password to restrict access.
  5. Hidden Text - The Internet is "memory-less", in the sense that the server handles requests as they come along. Once the server has sent out its reply, it considers that request closed, and goes on to serve other requests. It does not remember what it has sent. "Hidden Text" is there to help the server remember, e.g. if it has sent page 9 together with a hidden text of 9, then when the user clicks "next", the hidden text comes back with the form, and the server knows that it should send out page 10. The user does not see hidden text on the page.
  6. Check Box - Like the "Choices" in the form above.
  7. Radio Button - Like "Time Preferences" above.
  8. Selections - Like "Your Country" above.
  9. Submit button, Reset button - These are special buttons. The first, when clicked, submits the form to the server; the second clears the data filled in by the user and restores the default values.

Form Tag

e.g.

    <form name=form1 method=post action="http://balder.prohosting.com/sywu/cgi-bin/t5.pl">
         .....
         .....
         .....
    </form>

Properties :

     name =           There may be more than 1 form in a webpage.  This is used to
                      identify them.
     
     method =         It may be (1) post (2) get.
                      For (1) post, the form data is passed to the server program
                      through "stdin" - standard input.  This is the preferred way.
                      For (2) get, the form data is appended to the URL after "?",
                      and is passed to the server program through the environment
                      variable "QUERY_STRING".  Notice that with "method=get" the
                      data is visible in the URL, and its length is limited by
                      browsers and servers.  Hence it is advisable to use "method=post".

     action = URL     The program on the server side that handles the form.  In the
                      example above, it is a PERL program "t5.pl" in the directory
                      "cgi-bin", in the WWW server "http://balder.prohosting.com/sywu/"

     target =framename    The output from the server side is displayed
                          in this frame.  This parameter is optional.

     onsubmit = applet    These two will not be used in this Chapter.  Usually "applet"   
     onreset = applet     is a subroutine written in Javascript or
                          VBscript that does data-validation on the browser side.

(Notice that "properties" are parameters of the tag, which the writer of the webpage may set. In HTML manuals they are called "attributes".)


Text input of one line

e.g.

   <p>Your name <input type=text name=uname size=40 maxlength=40></p>

Properties :

      type = text         This parameter must be there to identify that it is an 
                          input of one line of text.

      name =              Identifier in case there are many such input lines.

      size =              Width in number of characters.

      maxlength =         The maximum number of characters the user may enter.  The
                          browser accepts no more than this.  This parameter is optional.

      value =             We may put in some "default value".  This parameter
                          is quite useful in some scripts, e.g. Javascript.
                          This parameter is optional.

      onblur = applet     These will not be used in this Chapter.
      onchange = applet
      onfocus = applet
      onselect = applet

If the user has entered, say, "Wu Siu Yan", the browser will send the following to the server,

uname=Wu+Siu+Yan

The identifier is that in "name=uname". Notice that all blanks have been replaced by "+". Also, control characters (e.g. linefeed, carriage return) and most punctuation marks (e.g. "&", "=", "%", ",", "'") are encoded by the browser as %nn, where nn is the hexadecimal code of the character.
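The encoding described above can be sketched in PERL as follows. This is only an illustration under assumptions: the subroutine name url_encode is hypothetical (it is not part of this chapter's programs), it handles plain ASCII text only, and real browsers follow more detailed rules.

```perl
use strict;
use warnings;

# url_encode is a hypothetical helper, for illustration only.
# It mimics what the browser does to ONE form value before sending it.
sub url_encode {
    my ($text) = @_;
    # Every character that is not a letter, a digit, a blank, or one of
    # - _ . * is replaced by %nn, where nn is its hexadecimal code.
    $text =~ s/([^A-Za-z0-9 \-_.*])/sprintf("%%%02X", ord($1))/eg;
    # Blanks are replaced by "+".
    $text =~ tr/ /+/;
    return $text;
}

print "uname=",  url_encode("Wu Siu Yan"), "\n";
print "reason=", url_encode("LORD's questions on Job 38, 39"), "\n";
```

The "%" pass is done before the blank pass, so that a "+" typed by the user becomes %2B and cannot be mistaken for a blank.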


Text area of more than one line

e.g. (Notice that "textarea" below is within a table. "form" may contain tables, which in turn may contain form items.)

<h2>Please explain in a few words the reason why you choose to study here</h2>
<p align=center>
<table width=80% ><tr><td>
<textarea rows=10 cols=50 name=reason >
e.g. I wish to be a better person
</textarea>
</td></tr></table>
</p>

Properties :

      rows =               No. of rows the text area will occupy.
      
      cols =               No. of columns the text area will occupy.

      name =               Identifying name.

      value =              This parameter is usually used by Javascript etc.
                           and is usually omitted.

      onblur = applet      These will not be used in this Chapter.
      onchange = applet
      onfocus = applet
      onselect = applet

Notice that we may put "default text" within
<textarea .... > .... </textarea>.

If the user has entered, say, the following two lines,

lifelong learning should be our attitude.
While we cannot answer LORD's questions on Job 38, 39, we should continue to learn.

The browser will send the following to the server (the line break is encoded as %0D%0A),
reason=lifelong+learning+should+be+our+attitude.%0D%0AWhile+we+cannot+answer+LORD%27s+questions+on+Job+38%2C+39%2C+we+should+continue+to+learn.

The identifier is that in "name=reason".


Password

"password" entry is like "text" of one line, except that what the user types will not be printed.

e.g.

    <p>Your Password <input type=password name=pword size=10 maxlength=10 >

Properties :

      type = password     This parameter must be there to identify that it is a 
                          password entry.

      name =              Identifier.

      size =              Width in number of characters.

      maxlength =         The maximum number of characters the user may enter.  The
                          browser accepts no more than this.  This parameter is optional.

      value =             This parameter is usually used by Javascript etc.
                          and is usually omitted.

      onblur = applet     These will not be used in this Chapter.
      onchange = applet
      onfocus = applet

It can be seen that it is almost identical to "text input of one line".

Exercise : If the user has entered "Tom Sawyer", what will the browser send to the server ?

Ans : It will send,

pword=Tom+Sawyer

Hidden Text

e.g.

     <input type=hidden name=htext value=10>

Properties :

      type = hidden       This parameter must be there to identify that it is a 
                          hidden text.

      name =              Identifier.

      value =             This parameter is needed, as Internet is "memory-less".

Exercise : What will the browser send to the server in this case ?

Ans : It will send,

htext=10

Check Box

e.g.

<h2>Choices</h2>
<p><input type=checkbox name=ck1 >Mathematics<br>
   <input type=checkbox name=ck2 >English<br>
   <input type=checkbox name=ck3 >Physics
</p>

Properties :

      type = checkbox     This parameter must be there to identify that it is a 
                          checkbox.

      name =              Identifier.

      value =             The string sent to the server when the box is checked.
                          If it is omitted, "on" is sent.  This parameter is optional.

Exercise : Suppose we have checked "Mathematics" and "English", what will the browser send to the server ?

Ans : It will send,

ck1=on
ck2=on

(In the actual string the two are joined with "&". A box that is not checked sends nothing.)

Notice that if we had specified "value=something", then it will send, e.g.

ck1=something1
ck2=something2

Radio Button

e.g.

<h2>Your time preference</h2>
<p><input type=radio value="0" name=timep >Morning<br>
   <input type=radio value="1" name=timep >Afternoon<br>
   <input type=radio value="2" name=timep >Evening<br>
   <input type=radio value="3" name=timep >Holidays
</p>

Properties :

      type = radio        This parameter must be there to identify that it is a 
                          radio button.

      name =              Identifier.

      value =             This parameter must be here, one for each button.

(Notice that the "name" must be the same for all the buttons of one group, so that only one of them may be selected at a time. "value" must be there, or else the server would not know which one has been clicked.)

Exercise : Suppose we have clicked "Afternoon", what will the browser send to the server ?

Ans : It will send,

timep=1

Selection

e.g.

<h2>Your country</h2>
<select name=country size=5 multiple>
<option selected>Hong Kong</option>
<option>Kowloon</option>
<option>New Territories</option>
<option>China</option>
<option>Macau</option>
<option>Singapore</option>
<option>Malaysia</option>
<option>Japan</option>
</select>

Properties :

<select .... > 
       ....
       ....
</select>

      name =              Identifier.

      size =              This determines how many options will be visible to
                          the user at one time.

      multiple            If present, the user may select more than one option.

      onblur = applet     These will not be used in this Chapter.
      onchange = applet
      onfocus = applet


<option> .... </option>

      selected             Default selection.

      value =              The string sent to the server when this option is
                           selected.  If it is omitted, the text of the option is sent.

Exercise : Had we specified "size=1" instead of "size=5" in the above example, what would the appearance be?

Ans : Only one country would be shown at a time, instead of 5.

Exercise : Suppose we have clicked "Kowloon", what will the browser send to the server ?

Ans : It will send,

country=Kowloon

Submit Button, Reset Button

e.g.

<p ><input type=submit value="o.k. send" ><br><br>
    <input type=reset value="clear and reset">
</p>

Properties :

      type = submit/reset/button  This parameter must be there to identify
                                  its function. 

      name =              Identifier.

      value =             The text shown on the button.

      onblur = applet     These will not be used in this Chapter.
      onfocus = applet

How to write server side program

One way to write a webpage is to add HTML tags ourselves.

Another way is to write a program that generates the HTML file for us. For example, we have a master file of "student name, age, telephone", and we want to write a webpage listing all of them. We can write a program that reads the data from the master file, then generates the HTML file.

That program may be a PERL program, or a C program, or any other program so long as it can read from "stdin" (standard input), and write to "stdout" (standard output).

The WWW server (World Wide Web server, e.g. Apache) may be configured to execute such programs and send their output ("stdout") not to the screen, but to the user's browser.

Let us illustrate this with a simple master file, "student.txt",

     Tom Sawyer**36**123456
     Mary**27**9993456
     Peter**31**9499322
     John**29**9482678
     Robert**45**2378348
     Susanne**42**30030423
          

The PERL program may be as follows. (If you are not familiar with the HTML <table ..> tag, you should revise it first in [How to write webpages].)

#!/usr/bin/perl

open(FIN,"<student.txt") or die "Unable to open student.txt $!\n";
@aa = <FIN>;
close(FIN);

# Note : Every HTML file generated by a program must print 
#        something like the following line first.
print "Content-type:text/html\n\n";

print '<basefont face="courier new">',"\n";

print '<h2 align=center>Student Master File</h2>',"\n";
print '<p align=center><table width=80% border=0 cellpadding=10>',"\n";

$icount = 0;
for $x (@aa)
    {($name, $age, $tel) = split(/\*\*/, $x);

     $color = (($icount % 2) == 0) ? '#e0ffff' : '#ffe0e0';

     print '<tr bgcolor=', $color, '><td width=40%>', $name,
           '</td><td width=10% align=center>', $age,
           '</td><td>', $tel, '</td></tr>', "\n";

     $icount++;
    }

print '</table></p>',"\n";
exit 0;

What the PERL program generates is

Content-type:text/html

<basefont face="courier new">
<h2 align=center>Student Master File</h2>
<p align=center><table width=80% border=0 cellpadding=10>
<tr bgcolor=#e0ffff><td width=40%>     Tom Sawyer
</td><td width=10% align=center>36</td><td>123456
</td></tr>
<tr bgcolor=#ffe0e0><td width=40%>     Mary
</td><td width=10% align=center>27</td><td>9993456
</td></tr>
<tr bgcolor=#e0ffff><td width=40%>     Peter
</td><td width=10% align=center>31</td><td>9499322
</td></tr>
<tr bgcolor=#ffe0e0><td width=40%>     John
</td><td width=10% align=center>29</td><td>9482678
</td></tr>
<tr bgcolor=#e0ffff><td width=40%>     Robert
</td><td width=10% align=center>45</td><td>2378348
</td></tr>
<tr bgcolor=#ffe0e0><td width=40%>     Susanne
</td><td width=10% align=center>42</td><td>30030423</td></tr>
</table></p>

And what the user will see on their browser is,

                    Student Master File

     Tom Sawyer          36     123456
     Mary                27     9993456
     Peter               31     9499322
     John                29     9482678
     Robert              45     2378348
     Susanne             42     30030423
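The generated rows above keep the leading blanks and the trailing newline of each record, because every line is split exactly as it was read. The browser ignores them, but a minimal sketch of cleaning a record first might look like this. The subroutine name parse_student_line is hypothetical; the "**" separator is the one used in "student.txt".

```perl
use strict;
use warnings;

# parse_student_line is a hypothetical helper, for illustration only.
# It turns one record of student.txt into (name, age, telephone).
sub parse_student_line {
    my ($line) = @_;
    chomp $line;                  # drop the trailing newline, if any
    $line =~ s/^\s+|\s+$//g;      # drop leading and trailing blanks
    my ($name, $age, $tel) = split(/\*\*/, $line);
    return ($name, $age, $tel);
}

my ($name, $age, $tel) = parse_student_line("     Tom Sawyer**36**123456\n");
print "name=[$name] age=[$age] tel=[$tel]\n";
```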

Exercise : Will the program work if we have used,

print "Content-type:text/html\n";
and not
print "Content-type:text/html\n\n";
The only difference is "\n" versus "\n\n".

Ans : No. There must be two "newlines". The second one produces a blank line, which marks the end of the header. Without it, the WWW server will take the HTML that follows as part of the header, and the program will fail.
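To make the point of this exercise concrete, here is a small sketch that builds the complete output as one string, so that the header, the blank line, and the HTML body can be seen separately. The subroutine name cgi_page is hypothetical. The header is written here with a blank after the colon; the form without the blank, as used in this chapter, is accepted as well.

```perl
use strict;
use warnings;

# cgi_page is a hypothetical helper, for illustration only.
# It returns the header line, the blank line that ends the header,
# and then the HTML body.
sub cgi_page {
    my ($body) = @_;
    return "Content-type: text/html\n\n" . $body;
}

print cgi_page("<h2 align=center>Student Master File</h2>\n");
```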


In what follows, we continue to discuss using PERL for "form response".


The browser will join all answers with "&" into a long string, and send the string to the server. The server will then load the program, in our example, "/cgi-bin/t5.pl", which is a PERL program to handle it.

Since we have specified "method=post" in the <form> tag, the server will feed that string through stdin - "standard input" to "/cgi-bin/t5.pl". It may be pointed out that the program may be a C program or any other program that can read from stdin and print to stdout.
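As a hedged sketch of the two ways the string may arrive, the subroutine below reads it from standard input for "method=post" and from the environment variable "QUERY_STRING" for "method=get". The name read_form_data is hypothetical. "REQUEST_METHOD" and "CONTENT_LENGTH" are environment variables that a CGI server normally sets; reading exactly "CONTENT_LENGTH" bytes is a common precaution and is an assumption here, not what the program in this chapter does.

```perl
use strict;
use warnings;

# read_form_data is a hypothetical helper, for illustration only.
# $env is a reference to a hash of environment variables (normally \%ENV),
# $in is the input filehandle (normally \*STDIN).
sub read_form_data {
    my ($env, $in) = @_;
    my $method = uc($env->{'REQUEST_METHOD'} || 'GET');
    if ($method eq 'POST') {
        my $length = $env->{'CONTENT_LENGTH'} || 0;
        my $data = '';
        # Read exactly the number of bytes announced by the server.
        read($in, $data, $length) if $length > 0;
        return $data;
    }
    # method=get : the string arrives in QUERY_STRING instead.
    return defined $env->{'QUERY_STRING'} ? $env->{'QUERY_STRING'} : '';
}

# In a real server-side program one would call
#     read_form_data(\%ENV, \*STDIN);
# Here a "post" is simulated with an in-memory filehandle.
my $body = 'uname=Wu+Siu+Yan&timep=1';
open(my $fake_stdin, '<', \$body) or die "Unable to open in-memory file $!\n";
my %fake_env = (REQUEST_METHOD => 'POST', CONTENT_LENGTH => length($body));
print read_form_data(\%fake_env, $fake_stdin), "\n";
close($fake_stdin);
```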

If the program is a C program, we have to write several subroutines to decode that string. And if you have time, you should do that as an exercise.

To decode the string with PERL takes a little care, because PERL does not handle single characters the way C (which is assembler-like) does. Instead we use split(..), tr and regular-expression substitution. The steps to do so are outlined in the following exercises.

Exercise : How to read in the string through STDIN and split the string into a list at "&" with PERL ?

Ans :

$a = <STDIN>;
@b = split(/\&/, $a);

Exercise : After splitting the string into a list, each element is in the form,

identifier=value
How would you split each element into right and left portions ?

Ans :

     for $dum (@b)
        { ($lvalue, $rvalue) = split(/\=/, $dum);
              .......
              .......

Now we have split it into $lvalue and $rvalue. To decode $rvalue, we use,

     $rvalue =~ tr/+/ /;
     $rvalue =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C",hex($1))/eg;

First, we change "+" back into a blank. Next, we decode %nn, where nn is in hexadecimal. This is done with parentheses, which capture the two hex characters after % into $1; then hex(..) converts them into a number, and pack("C", ..) packs that number back into a single unsigned character. The options are "g = global substitution" and "e = evaluate the replacement part as a PERL expression".
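The two decoding statements can be tried on their own. The sketch below wraps them in a subroutine and applies it to one "identifier=value" element; the name url_decode and the sample string are only for illustration.

```perl
use strict;
use warnings;

# url_decode is a hypothetical helper that wraps the two statements above.
sub url_decode {
    my ($text) = @_;
    $text =~ tr/+/ /;                                            # "+"  -> blank
    $text =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;  # %nn  -> character
    return $text;
}

my $pair = 'reason=LORD%27s+questions+on+Job+38%2C+39';
my ($lvalue, $rvalue) = split(/=/, $pair, 2);
print $lvalue, " => ", url_decode($rvalue), "\n";
```

The order matters: "+" must be changed into a blank before %nn is decoded, or else a real "+" that was sent as %2B would be turned into a blank as well.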

The complete program is shown below. (Note that we have used a global hash %formvalue to store the values.)

#!/usr/bin/perl

# The following line must be the first output of a  
# server side program that produces a HTML file.
print "Content-type:text/html\n\n";

print '<basefont face="courier new">',"\n";

$a=<STDIN>;

print '<p>The string that the server gets is :</p><br>',"\n";
print '<p>'.$a.'</p><br><br>'."\n";

# split_form(..) is a subroutine that splits the string into a hash.

&split_form($a);

while (($name,$value) = each(%formvalue))
    {print '<p>'.$name.' => '.$value.'</p>'."\n";
    };

exit 0;

sub split_form
{  my ($a, $dum, $lvalue, $rvalue, @b, @c); 

     $a = $_[0];
#    print "check $a\n";

     @b = split(/\&/, $a);
#    print "after split @b\n";

#    Initialize the list @c.
     @c = ();

     for $dum (@b)
        { ($lvalue, $rvalue) = split(/\=/, $dum);
#         print "left value \n     $lvalue\nright value\n     $rvalue\n";
          $rvalue =~ tr/+/ /;
          $rvalue =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C",hex($1))/eg;
#         print "right value after change\n$rvalue\n";
          push(@c, $lvalue, $rvalue);
        };

     %formvalue = @c;

#    while (($name,$value) = each(%formvalue))
#      {print $name." => ".$value."\n";
#      };

     return 1;
};

(Notice that there are a lot of lines beginning with the comment mark "#". Those are statements that I used for debugging purposes.)
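One limitation of storing the values in a plain hash is that an identifier sent more than once keeps only its last value. This can happen with the "Your country" list, which is declared "multiple". The following is a sketch of one possible variant, not the author's program: the subroutine name parse_form_multi is hypothetical, and it returns a hash whose values are lists.

```perl
use strict;
use warnings;

# parse_form_multi is a hypothetical variant, for illustration only.
# It returns a reference to a hash whose values are lists, so that an
# identifier sent more than once keeps every value.
sub parse_form_multi {
    my ($query) = @_;
    my %form;
    for my $pair (split(/&/, $query)) {
        next if $pair eq '';
        my ($name, $value) = split(/=/, $pair, 2);
        $value = '' unless defined $value;
        # Decode both the identifier and the value.
        for ($name, $value) {
            tr/+/ /;
            s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;
        }
        push(@{ $form{$name} }, $value);
    }
    return \%form;
}

my $form = parse_form_multi('country=Hong+Kong&country=Kowloon&timep=1');
for my $name (sort keys %$form) {
    print $name, " => ", join(", ", @{ $form->{$name} }), "\n";
}
```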


NOTE :

  1. Many ISPs (Internet Service Providers) will allow you to use PERL programs (or PHP programs; PHP is another language similar to PERL, developed later and patterned after PERL) for a fee.

  2. Usually such server-side programs have to be put into a separate directory, e.g. "/cgi-bin/" under your directory, but not always. You have to ask your ISP. (NOTE : cgi means "common gateway interface", the convention by which the WWW server passes data to and from such programs; bin means binary programs.)

  3. The permission for such PERL programs would usually be "0700" (i.e. you have read/write/execute access, but others have none), and for data files in cgi-bin, "0600" (read/write access for yourself, none for others). The directory, e.g. "/cgi-bin/", should have "0711" (i.e. others are granted execute access, which for a directory means that they may reach the files in it by name, but may not list its contents). Some servers run the programs under another user, and may then need wider permissions. Again, you have to ask your ISP for details.
    Most "ftp" (file transfer) programs would not allow us to change "mode", but some would, e.g. "ws_ftp". If you use "ws_ftp", you first click to select the file or directory, then click the right mouse button, and a pop-up menu will appear, then choose "chmod" ...

  4. You may open "sequential file, random file, database file (dbmopen(...))" in PERL, hence you can automate a lot of data-processing jobs.

  5. You should know that the Almighty LORD is watching every one of us, so NEVER DO ANYTHING THAT WOULD HURT YOUR ISP, because he has granted you the privilege of running programs on his system. Remember the Golden Rule, Jesus : "So whatever you wish that men would do to you, do so to them. [Mt 7:12]", and Confucius : "Do not do to others what you don't want others to do to you", and imagine yourself an ISP that has to care about servers and wages to employees, and has to provide you service at the same time.
    FEAR THE ALMIGHTY LORD, WHO IS ETERNAL TRUTH WITH UNIMAGINABLE POWER.
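Regarding note 3, where the ISP gives you a way to run commands, the permissions can also be set from PERL itself with the built-in chmod. The following is a minimal sketch; the subroutine name set_mode and the scratch file name are hypothetical, and a UNIX/LINUX system is assumed.

```perl
use strict;
use warnings;

# set_mode is a hypothetical helper around PERL's built-in chmod.
# It returns the resulting permission bits, or dies on failure.
sub set_mode {
    my ($mode, $path) = @_;
    chmod($mode, $path) == 1 or die "Unable to chmod $path $!\n";
    my @info = stat($path) or die "Unable to stat $path $!\n";
    return $info[2] & 07777;
}

# Example with a scratch file; in practice the path would be the program
# in cgi-bin.  The mode must be an octal number (leading 0), not a string.
my $scratch = "scratch_$$.tmp";
open(my $fh, '>', $scratch) or die "Unable to create $scratch $!\n";
close($fh);
printf("%04o\n", set_mode(0700, $scratch));
unlink($scratch);
```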

Use "pipe", etc.

We may use "piping", e.g.

open(FH, "| output-pipe-command");
open(FH, " input-pipe-command |");

In the first case, whatever we print to the filehandle "FH" will be the input of "output-pipe-command".

In the second case, the output from "input-pipe-command" can be read from "FH".
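The two directions can be tried with harmless commands before using "mail". The sketch below assumes a UNIX/LINUX system on which the commands "echo" and "sort" exist; the subroutine names read_from_command and write_to_command are hypothetical.

```perl
use strict;
use warnings;

# read_from_command is a hypothetical helper : it runs a command and
# returns everything the command printed, using an input pipe.
sub read_from_command {
    my ($command) = @_;
    open(my $fh, "$command |") or die "Unable to pipe from $command $!\n";
    my @lines = <$fh>;
    close($fh);
    return join('', @lines);
}

# write_to_command is the other direction : what we print to the
# filehandle becomes the standard input of the command.
sub write_to_command {
    my ($command, $text) = @_;
    open(my $fh, "| $command") or die "Unable to pipe to $command $!\n";
    print $fh $text;
    close($fh) or die "Command failed : $command\n";
    return 1;
}

print read_from_command('echo hello');
write_to_command('sort', "pear\napple\n");
```

Checking the result of close(..) is worthwhile for an output pipe, because that is where PERL reports that the command itself has failed.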

Very often, we would use it to "mail" files to users.

Many UNIX/LINUX systems have a "mail" program,

mail -c carbon-copies-receivers -s subject receiver

e.g.

    mail -c tom@xxx.com,mary@yyy.com -s "Hello from S.Y.Wu" peter@zzz.com

Here "carbon copies" will be sent to "tom@xxx.com, mary@yyy.com", the subject matter is "Hello from S.Y.Wu", and the receiver is "peter@zzz.com".

The following is a simple example program,

    $a = 'mail -c tom@xxx.com,mary@yyy.com -s "Hello from S.Y.Wu" peter@zzz.com';
    open(FOUT, "| $a") or die "Unable to pipe $!\n";
    print FOUT "Dear Friends,\n",
               "If you judge the material in my homepage useful to others,\n",
               "please inform them of my webpages.\n",
               "From a preacher of Christianity,\n",
               "Wu Siu Yan from Hong Kong\n\n";
    close(FOUT);
    exit 0;

I have briefly discussed PERL and its use on the Internet. Personally, I favor C, because I think it is more reliable. Moreover, a C program, as tested on my PC, ran 16 times faster than a PERL program (imagine the work load on servers when all cgi-bin programs are in C and not in PERL or PHP !).

From the above, you can see that "data-validation" may be done on the server side. But it may also be done on the browser (client) side. "Javascript" is a language developed by Netscape for browser side automation, and its syntax is similar to C syntax. It can do browser side data-validation. You may read its manuals.

I hope you may learn about computers from my webpages. As I have said before, I came from the field of Mathematics, and not computer science. Moreover, I am isolated, in that secret people from Secret Alliance isolate me. Hence there would be many shortcomings in these webpages (you can read a lot on the Internet, and build up your own library, your own tool-kits, ... ), but still, it is my hope to offer some effort in this rapidly developing world of Internet.

