Thursday, April 17, 2008

Flexible CGI output with HTML templates

Consider our free search engine script. It is an excellent application, but it wouldn't be very helpful to others if it could display its results only in the output page format that we use for our site. Somebody else's site will probably need the output page to contain site-specific navigation links, different colors, or even a completely different layout concept.

It is not uncommon to see CGI programs that contain the HTML code that is returned to the browser inside one or more print statements in the code of the script. Clearly this is not a good idea, since it makes it difficult, if not nearly impossible, for a non-programmer to alter the output format. At best, a webmaster with little understanding of perl could identify the HTML code inside the program and alter some obvious items, like colors. As applications grow more complex, this approach becomes even more difficult to manage. If control structures (if, while, etc.) are used to generate the output, or if there are multiple pages of output composed of chunks of HTML that are generated in different places in the code, even someone who is comfortable with perl could find it hard to figure out how to alter the look of the output.

And this is not the only problem. The code itself becomes difficult to manage and maintain. Program code is difficult to read and maintain in its own right; cluttering it with HTML here and there does not help the situation at all.

I will describe a different approach to the problem here, the one that we use for our programs and which proves to be a very good solution for almost any type of application. The idea is to isolate the HTML code in a separate file, which we call a template. A template is just like any regular HTML file, but it contains some special markup dictating where to insert the various dynamic data produced by the CGI program. The program need only be concerned with generating the actual dynamic content; it can then simply insert it in the appropriate place in the template and thus come up with the final output.

What a template looks like

The only thing that sets a template apart from a regular HTML file is that it contains specially formatted HTML comments, each naming a variable. Such comments will be identified by the CGI program and replaced with the appropriate value for the variable. The template may contain references to as many such variables as the CGI program is made to define. For example, take our search engine program. At the top of the search results page there appears some text saying what search was made and how many results it came up with. Assuming the program defines variables search_string and total_results, our template could look like: ... Your search for [search_string] resulted in [total_results] matching pages. ... where each bracketed name stands for the comment directive of that variable.

Filling in the template

A simple way of doing this is to read the template in a loop and explicitly substitute each directive in the CGI program:

    open(TEMPLATE, $template_file);
    while( ...

This will do the job, but we can certainly do much better than that. A realistic CGI-based application might contain many scripts generating many pages each, and each of those pages might contain much more than just a handful of variable data. Clearly we could write something much more general, a little module, that can handle output generation in the same way, but without requiring the programmer to explicitly perform substitutions of the template directives in the CGI program.
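As a rough sketch of what "more general" could mean, the function below fills every directive from a hash in a single pass, so the CGI program never spells out individual substitutions. The directive syntax `<!--var: name-->` and the name `fill_template` are assumptions made for this illustration only; they are not necessarily what the search engine script or Perlfect::Template uses.

```perl
use strict;
use warnings;

# Hypothetical directive syntax for this sketch: <!--var: name-->
# Replace each directive with the matching value from %$vars;
# variables that are not defined are replaced with an empty string.
sub fill_template {
    my ($text, $vars) = @_;
    $text =~ s/<!--\s*var:\s*(\w+)\s*-->/defined($vars->{$1}) ? $vars->{$1} : ''/ge;
    return $text;
}

my $page = "<p>Your search for <!--var: search_string--> resulted in "
         . "<!--var: total_results--> matching pages.</p>";

my $output = fill_template($page, {
    search_string => 'perl templates',
    total_results => 42,
});
print "$output\n";
```

Adding a new variable to the page then only means adding a key to the hash and a directive to the template; the substitution code itself does not change.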

Perlfect::Template

As I already mentioned, we use this technique in all our programs here at Perlfect, and to make our life easier we have made a simple module that encapsulates all the template-related functionality that could be of use for CGI programming. The module is freely available for anyone to download and use in their own programs. To introduce you to the use of Perlfect::Template I will show a simple example:

    my $template = new Perlfect::Template("template.html");
    my %data = (
        name     => 'Nick',
        email    => 'nick@perlfect.com',
        homepage => 'http://www.perlfect.com',
    );
    my $html = $template->cast(\%data);

First of all we create a Perlfect::Template object corresponding to the template file we want to use. The filename (or fully qualified path to it) is passed as an argument to the object constructor, which returns a reference to the newly created object.

Then we prepare the data that we are going to cast to the template. Perlfect::Template expects us to supply it with a reference to a hash table (associative array) mapping data keys (template variables) to the values that should be inserted for them in the template. Thus all the processing done by the CGI program should end up defining a set of such key-value pairs in a hash reflecting the dynamic content of the output.

Finally, we call the cast method of the template object, passing a reference to our data hash. The cast method does not alter the template, but simply uses it to produce a filled-in copy that is returned by the call. That means that you can reuse a template as much as you like in a program.

This allows you to construct your output from 'sub-templates'. For example, in our search engine script, there is one template that defines the general layout of the results page, but there is also a template that defines the format of a single result listing. The program uses the latter to generate result listings by substituting title, description, url, score, etc., and then puts them all together and inserts them in the general layout template. To create the result listings we create only one template object and use it for all results, each time passing a different hash reference to it. This has the advantage that the template file is read only once and that the processing needed to generate another instance of the HTML output is minimal.
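The sub-template idea can be sketched as follows. `MiniTemplate` is a hypothetical stand-in written for this example, not Perlfect::Template itself: it takes the template text directly instead of a filename, and it uses a made-up directive syntax, `<!--var: name-->`. Only the `new`/`cast` shape mirrors what is described above.

```perl
use strict;
use warnings;

# MiniTemplate is a hypothetical stand-in for this sketch, not Perlfect::Template.
package MiniTemplate;

sub new {
    my ($class, $text) = @_;
    return bless { text => $text }, $class;
}

# Return a filled-in copy; the stored template text is never modified,
# so the same object can be cast again with different data.
sub cast {
    my ($self, $data) = @_;
    my $copy = $self->{text};
    $copy =~ s/<!--\s*var:\s*(\w+)\s*-->/defined($data->{$1}) ? $data->{$1} : ''/ge;
    return $copy;
}

package main;

my $layout = MiniTemplate->new("<h1>Results</h1>\n<ul>\n<!--var: listings--></ul>\n");
my $item   = MiniTemplate->new("<li><!--var: title--> (<!--var: score-->)</li>\n");

my @results = (
    { title => 'Templates', score => 90 },
    { title => 'select()',  score => 75 },
);

# One template object, cast once per result with a different hash each time.
my $listings = join '', map { $item->cast($_) } @results;

# The combined listings become a single variable of the layout template.
my $html = $layout->cast({ listings => $listings });
print $html;
```

Because `cast` works on a copy, the item template is parsed from its source once and reused for every result.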

Passing substitution handlers

In most cases the functionality we have discussed so far is more than enough to do the job. There are cases, though, where passing static values for substitution in a template is too restrictive a mechanism. Sometimes you want the template to be able to describe not only the location of the substitutions but also some dynamic behaviour. In the example above I might want to let the template author choose whether the homepage url should appear in linked or plain text form. Of course, the template variable could just store the url and the template could wrap the directive in an anchor tag to construct the linked form, but in other situations there might not be such workarounds. I would like to let the user pass parameters in a template directive, and instead of substituting a static string for the directive, I would like the substitution string to be generated by a function that will be called with the parameters in the directive.

Perlfect::Template allows you to use references to subroutines instead of scalars as values of the substitution hash. When a directive for such a template variable (that really is a subroutine) is met, Perlfect::Template will arrange for the subroutine to be called and will use its return value as the substitution string. Further, if the template directive includes arguments in parentheses, the subroutine will be passed those arguments using perl's standard argument passing mechanism. Arguments in the parentheses follow standard perl syntax for subroutine calls. So let's review our example above, now using the subroutine calling mechanism:

    my $template = new Perlfect::Template("template.html");
    my %data = (
        name     => 'Nick',
        email    => 'nick@perlfect.com',
        homepage => sub {
            if($_[0] eq 'linked') {
                return ...
            }
            else {
                return ...
            }
        }
    );
    my $html = $template->cast(\%data);
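To make the dispatch concrete, here is a minimal sketch of how a casting routine might treat scalar values and subroutine references differently. Everything in it is an assumption for illustration: the function name `cast_with_handlers`, the directive syntax `<!--var: name('arg')-->`, the restriction to single-quoted string arguments, and the bodies of the two branches of the homepage handler. It does not show how Perlfect::Template is actually implemented.

```perl
use strict;
use warnings;

# Hypothetical directive syntax: <!--var: name--> or <!--var: name('arg', ...)-->
# Only single-quoted string arguments are recognised in this sketch.
sub cast_with_handlers {
    my ($text, $data) = @_;
    $text =~ s{<!--\s*var:\s*(\w+)(?:\((.*?)\))?\s*-->}{
        my ($name, $arglist) = ($1, $2);
        my $value = $data->{$name};
        if (ref($value) eq 'CODE') {
            # Extract the quoted arguments and call the handler with them.
            my @args = defined($arglist) ? ($arglist =~ /'([^']*)'/g) : ();
            $value = $value->(@args);
        }
        defined($value) ? $value : '';
    }ge;
    return $text;
}

my %data = (
    name     => 'Nick',
    homepage => sub {
        my ($style) = @_;
        my $url = 'http://www.perlfect.com';
        # Assumed behaviour: 'linked' wraps the url in an anchor tag.
        return (defined($style) && $style eq 'linked')
            ? qq{<a href="$url">$url</a>}
            : $url;
    },
);

my $html = cast_with_handlers(
    "<!--var: name-->: <!--var: homepage('linked')--> or <!--var: homepage-->",
    \%data,
);
print "$html\n";
```

The template author decides between the two forms simply by writing the directive with or without the argument; the program supplies one handler for both.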

And who said it's only for HTML?

A final note to be made is that, while this tool was initially developed to aid the writing of CGI scripts that produce HTML output, it is absolutely fine to use for any kind of program that needs to format text files into a configurable layout. For example, one great use we've found is to create personalized email from a standard text template of the message and a database of names and email addresses of the recipients. In general any text file who is not likely to clash syntactically with the directive format of Perlfect::Template is perfectly suitable for processing with it.

[Via - Flexible CGI output with HTML templates] uk webmaster world

Multiplexing filehandles with select() in perl.

The problem
I/O requests such as read() and write() are blocking requests. Suppose you have a line in a program that reads from STDIN on a terminal, like the following:

    $input = <STDIN>;

What will happen here is that the program's execution will block until a line of input is available, i.e. until the user types something followed by a newline. In many cases this is the desired behavior, but not always. Suppose you have a program that accepts requests through a socket and does some processing for each request, then moves on to the next request.

    use IO::Socket::INET;

    # Create the receiving socket
    my $s = IO::Socket::INET->new(
        LocalHost => 'thekla',
        LocalPort => 7070,
        Proto     => 'tcp',
        Listen    => 16,
        Reuse     => 1,
    );
    die "Could not create socket: $!\n" unless $s;

    my ($ns, $buf);
    while( $ns = $s->accept() ) {            # wait for and accept a connection
        while( defined( $buf = <$ns> ) ) {   # read from the socket
            # do some processing
        }
    }
    close($s);

Although this is a perfectly valid way of handling the incoming requests, it does suffer from some serious problems, especially if the frequency of incoming requests is high and each request needs a lot of processing.

Clearly, the problem is that, once a request has been accepted, we have to keep other requests hanging in the queue while we read the request message and process it. Now, reading from a socket is a blocking call, so if the client takes too long to transmit the request message, we just sit there waiting while we could be doing useful processing of other requests. Not only is this unacceptable, but in cases where the demand for request processing is high, the program may not be able to meet its operating requirements. Consider also that a single client failure at a critical point (in the middle of an ongoing transmission) poses the risk of making the server block indefinitely.

What can we do about it?

What we need in order to deal with situations like the above is a way to handle I/O (we use sockets for this example, but the rules apply in general to any kind of filehandle) independently and with some sort of apparent parallelism/multiprocessing. There are two very common approaches to this.

One approach is to spawn separate threads of control to handle each request. This can be done either at process-level, using fork() to create a new process for each request, or at thread-level using perl's threading capabilities to create multiple threads within the same process. (Perl's support for threads was introduced in version 5.005)

The other approach - which is the one that we will discuss here - is to use select() to multiplex between several filehandles within a single thread of control, thus creating the effect of parallelism in the handling of I/O.

What does select() do?

The idea behind select() is to avoid blocking calls by making sure that a call will not block before attempting it. How do we do that? Suppose we have two filehandles, and we want to read data from them as it comes in. Let's call them A and B. Now, let's assume that A has no input pending yet, but B is ready to respond to a read() call. If we know this bit of information, we can try reading from B first, instead of A, knowing that our call will not block. select() gives us this bit of information. All we need to do is define sets of filehandles (one for reading, one for writing and one for errors) and call select() on them. It will return a filehandle that is ready to perform the operation for which it has been designated (depending on which set it is in) as soon as such a filehandle is ready.
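The A and B scenario can be tried out with perl's built-in four-argument select(), which takes the handle sets as bit vectors indexed by file descriptor number. In this sketch two pipes stand in for the two filehandles, and a one-second timeout is passed so the call cannot hang; the variable names are made up for the example.

```perl
use strict;
use warnings;

# Two pipes stand in for filehandles A and B.
pipe(my $a_read, my $a_write) or die "pipe A: $!";
pipe(my $b_read, my $b_write) or die "pipe B: $!";

# Only B has input pending; syswrite bypasses perl's output buffering.
syswrite($b_write, "hello from B\n");

# Build the bit vector describing the set of handles to test for reading.
my $rin = '';
vec($rin, fileno($a_read), 1) = 1;
vec($rin, fileno($b_read), 1) = 1;

# select() overwrites its arguments with the ready subset, so pass a copy.
# The write and error sets are not used here; wait at most one second.
my $rout   = $rin;
my $nfound = select($rout, undef, undef, 1);

my $a_ready = vec($rout, fileno($a_read), 1);
my $b_ready = vec($rout, fileno($b_read), 1);
print "handles ready: $nfound (A: $a_ready, B: $b_ready)\n";

# Reading from B will not block because select() reported it as ready.
my $line = $b_ready ? <$b_read> : undef;
```

The bit vector bookkeeping is exactly the kind of detail that a wrapper module can hide.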

Obviously this provides us with the advantage of always picking a filehandle that will not block, thus avoiding the possibility of delaying the entire program for one lazy filehandle just because it happened to be the first we picked at random. Still, it does not guarantee that the selected filehandle is the best choice, because we still don't know how much data can be read, or how quickly it can take in data that we write to it. But it is definitely a big step forward from our initial program.

Using select()

We will now rewrite the example program from the beginning of this article using the select() method. Instead of using perl's select call directly we will use a wrapper module, IO::Select, that makes life easier for us.

    # ... create socket as before ...
    use IO::Select;

    my $read_set = IO::Select->new();   # create handle set for reading
    $read_set->add($s);                 # add the main socket to the set

    while (1) {   # forever
        # get a set of readable handles (with no timeout given, this
        # blocks until at least one handle is ready)
        my ($rh_set) = IO::Select->select($read_set, undef, undef);
        # take all readable handles in turn
        foreach my $rh (@$rh_set) {
            # if it is the main socket then we have an incoming connection and
            # we should accept() it and then add the new socket to the $read_set
            if ($rh == $s) {
                my $ns = $rh->accept();
                $read_set->add($ns);
            }
            # otherwise it is an ordinary socket and we should read and process the request
            else {
                my $buf = <$rh>;
                if (defined $buf) {   # we get normal input
                    # ... process $buf ...
                }
                else {   # the client has closed the socket
                    # remove the socket from the $read_set and close it
                    $read_set->remove($rh);
                    close($rh);
                }
            }
        }
    }

We create an IO::Select object, $read_set, which is our set of handles to test for readability, and add all open handles to it. We start by adding the main socket and each time a new connection is made returning a new socket for it, we add that socket to the set. Then we go into a loop where we ask select to give us a list of readable handles and we examine each one in turn. If it is the main socket then we want to call accept() to receive the incoming connection and add the new socket to the read set. Otherwise it must be an ordinary socket in which case we read from it and process its input. If the read fails, that means the socket has been closed on the client side, so we close it, too, and remove it from the read set. So we work our way continuously through the incoming requests, by making sure that a call for I/O on any filehandle will progress since select() tells us it will.

As we already mentioned earlier, this method does not guarantee progress as it only tests whether a handle is ready to respond to I/O. The question still remains, whether the handle we pick from the ready ones is the one that will respond faster to I/O, and how much data there is available for reading or how much data it is ready to receive. So it is still possible to block a bit after the point where we picked the handle. Also, we did not take into account the impact on performance that the actual processing of requests will have. We might just be printing incoming data to a file, but then again, each request might need heavy processing that would slow down the entire handle processing loop. But these are issues that must be considered in the context of the individual application.
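One way to reduce the residual blocking described above is to read with sysread() instead of the line-oriented `<$rh>`: sysread() returns as soon as some data is available, while a line read keeps waiting for the newline. The sketch below is one possible variation under stated assumptions, not part of the original program: a pipe stands in for a client socket that has sent an incomplete line, and `%pending` is a made-up per-handle buffer.

```perl
use strict;
use warnings;
use IO::Select;

# A pipe stands in for a client socket that has sent an incomplete line.
pipe(my $reader, my $writer) or die "pipe: $!";
syswrite($writer, "partial request");   # note: no trailing newline

my $read_set = IO::Select->new($reader);
my %pending;                             # per-handle buffer of unprocessed bytes

# can_read(1) waits up to one second and returns the readable handles.
foreach my $rh ($read_set->can_read(1)) {
    # sysread returns whatever is available (up to 4096 bytes) without
    # waiting for a newline, so a slow client cannot stall the loop here.
    my $bytes = sysread($rh, my $chunk, 4096);
    if ($bytes) {
        $pending{fileno($rh)} .= $chunk;
    }
    else {
        # 0 means end of file, undef means error: drop the handle either way.
        $read_set->remove($rh);
        close($rh);
    }
}

my $buffered = $pending{fileno($reader)};
print "buffered so far: $buffered\n";
```

The program then has to split complete requests out of each buffer itself, which is the price of never waiting on a single handle.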

[Via - Multiplexing filehandles with select() in perl]

Wednesday, April 16, 2008

International Profit Associates

International Profit Associates (IPA-IBA) is a U.S. leader in business development and management consulting services for small and medium-size companies. IPA-IBA is one of the fastest growing management consulting companies. It is widely held to be one of the only management consulting firms in the world that delivers such a broad array of professional services to the small and medium-size business marketplace.

My space Layouts

My space layout is a place where you can find all the kinds of layouts that you need to design a cool business website, one which will provide all the necessary solutions to make your business grow. MyspaceMaster.Net offers Myspace layouts, myspace codes, myspace backgrounds, myspace graphics, and myspace generators. Customize your myspace profile to make sure you are seen.