    Some Perl Tips

    Date Published: 2000-05-01

    Written for the Perl Archive by Turk Scripts

    This article points out several Perl tips that may be helpful to beginner and intermediate Perl programmers. Please feel free to send an email to turkscripts@hotelspectra.com if you have any questions or corrections.


    Reading a whole file into a variable in one step:

    Instead of reading a file line by line, you might want to read the whole file into a variable in one step. This is especially useful for HTML files. By default, a single read from a filehandle into a scalar returns only the text up to and including the first newline. The reason is that the default "input record separator" in Perl is the newline character. This separator is stored in the special variable $/, and by default $/ is equal to "\n". If you undefine this variable, a single read returns the whole file. Because $/ is global, the change affects every later read in the program unless you restore the old value or limit the change with local.

    Example:

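    open(my $fh, "<", "data.htm") or die "Cannot open data.htm: $!";
    my $html;
    {
        local $/;        # undefined only inside this block
        $html = <$fh>;   # one read now returns the whole file
    }
    close($fh);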
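
    The difference between the two read modes can be sketched as follows. This is a minimal illustration, not a definitive implementation: the slurp() helper, the temporary file, and its three-line content are assumptions made up for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a small three-line sample file (hypothetical content).
my ($out, $path) = tempfile(UNLINK => 1);
print {$out} "<html>\n<body>\n</html>\n";
close($out) or die "Cannot close $path: $!";

# Default behaviour: $/ is "\n", so one read returns one line.
open(my $in, '<', $path) or die "Cannot open $path: $!";
my $first_line = <$in>;
close($in);

# Slurp mode: local undefines $/ only until the enclosing sub returns.
sub slurp {
    my ($file) = @_;
    open(my $fh, '<', $file) or die "Cannot open $file: $!";
    local $/;
    my $content = <$fh>;
    close($fh);
    return $content;
}

my $html = slurp($path);
print length($first_line), " characters in the first line\n";
print length($html), " characters in the whole file\n";
```

    Keeping the change to $/ inside a sub or block means the rest of the script still reads line by line.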

    Using qq{} for printing strings:

    If you look at most CGI scripts, you will see a line like:
    print "<a href=\"http://yahoo.com\">Yahoo is $property</a>";

    If you use double quotes to print strings, you have to escape every double quote that appears in your string. An easier way is to use the qq{} operator. It is a replacement for double quotes, so variables are still interpolated, but double quotes inside it need no escaping. Braces need escaping only when they are unbalanced: a "}" that closes an earlier "{" inside the string can be left as it is, and any other "}" has to be escaped. The same line can be written as:

    print qq{<a href="http://yahoo.com">Yahoo is $property</a>};

    Another benefit you might be interested in is that you can put multiple lines of data inside qq{}.

    Example:

    print qq{
    
    <html>
    
    <head>
    <meta http-equiv="Content-Language" content="en-us">
    <meta http-equiv="Content-Type" content="text/html; charset=windows-1252">
    <title>New Page 1</title>
    </head>
    
    <body>
    
    <p>Some html page</p>
    
    </body>
    
    </html>
    
    };
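
    The brace and interpolation rules for qq{} can be sketched as follows. The variable names and sample strings are assumptions chosen for the illustration.

```perl
use strict;
use warnings;

my $property = 'fast';
my @sites    = ('yahoo.com', 'perl.com');

# Balanced braces nest inside qq{}, so this CSS rule needs no escaping.
my $style = qq{<style>a { color: red; }</style>};

# An unbalanced closing brace has to be escaped with a backslash.
my $brace = qq{closing brace: \}};

# Variables still interpolate; escape $ and @ to print them literally.
my $mail = qq{Mail webmaster\@$sites[0] about \$property = "$property"};

# Other delimiters work too, for example square brackets.
my $link = qq[<a href="http://$sites[0]">Yahoo is $property</a>];

print "$_\n" for $style, $brace, $mail, $link;
```

    Choosing a delimiter that does not occur in the text at all is usually the simplest way to avoid escaping.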

    Regular expressions to search a variable:

    From here on I assume that you are familiar with the substitution operator in Perl: s///. A basic example:

    s/apple/orange/;

    replaces the first occurrence of "apple" in $_ with "orange". The separator "/" used in this example can be replaced with almost any other non-alphanumeric, non-whitespace character. The catch is that you have to escape the separator character wherever it appears inside your pattern or replacement. So when the text contains slashes, as URLs and file paths do, it is a better idea to use a less common character than "/" as the separator. I prefer "#" because it is less common in strings and is visually a good separator. The same substitution could be written as:

    s#apple#orange#;
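
    The benefit shows up as soon as the pattern itself contains slashes. The following sketch rewrites a directory prefix three ways; the paths are assumptions chosen for the illustration.

```perl
use strict;
use warnings;

my $path = '/usr/local/bin/perl';

# With "/" as the separator every slash in the pattern needs a backslash.
(my $with_slashes = $path) =~ s/\/usr\/local\/bin/\/opt\/perl\/bin/;

# With "#" as the separator the same substitution reads naturally.
(my $with_hashes = $path) =~ s#/usr/local/bin#/opt/perl/bin#;

# Paired brackets are another option and keep pattern and replacement apart.
(my $with_braces = $path) =~ s{/usr/local/bin}{/opt/perl/bin};

print "$with_slashes\n$with_hashes\n$with_braces\n";
```

    All three statements produce the same result; only the readability differs.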

    A common mistake when using regular expressions is to put a variable into the pattern without escaping its contents.

    Example:

    $data =~ s#$url#http://yahoo.com#;

    This works properly most of the time. But sometimes it will not behave as expected, or you will see occasional run-time errors. For example, if $url is equal to http://yahoo.com/do.cgi?action=go++&tell=poetry, the substitution fails and the script exits with an error message:

    "/http://yahoo.com/do.cgi?action=go++&tell=poetry/: nested *?+ in regex..."

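    The reason for the failure is that "++" is not treated as literal text inside a regular expression: "+" is a quantifier, and the Perl version that produced this message rejects two of them in a row. Newer Perl releases (5.10 and later) read "++" as a possessive quantifier, so there is no error, but the pattern still does not match the literal URL. Characters such as "?" and "." are special as well. A variable may contain several such special characters, and all of them have to be escaped properly. The correct way to implement this substitution is: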

    $temp = quotemeta($url);
    $data =~ s#$temp#http://yahoo.com#;

    quotemeta() is a built-in Perl function. It puts a backslash in front of every ASCII character in your variable other than letters, digits and the underscore.
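
    The same escaping is also available inline through the \Q and \E escapes, which saves the temporary variable. A minimal sketch, assuming a sample $data string made up for the illustration:

```perl
use strict;
use warnings;

my $url  = 'http://yahoo.com/do.cgi?action=go++&tell=poetry';
my $data = qq{<a href="$url">poetry</a>};

# quotemeta() puts a backslash before every non-word ASCII character.
my $escaped = quotemeta($url);

# \Q...\E applies the same escaping while the pattern is interpolated.
(my $fixed = $data) =~ s#\Q$url\E#http://yahoo.com#;

print "$escaped\n";
print "$fixed\n";
```

    Both forms treat the contents of $url as plain text, so "?", "." and "+" match only themselves.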


    Using eval for clever substitutions:

    If you have used regular expressions in Perl, you have probably used the substitution operator frequently. Most of the time a simple substitution is enough.

    Example:

    $html =~ s#\bdogs\b#cats#ig;

    In this example every occurrence of the word "dogs", in any letter case, is replaced by the word "cats". What if we want to replace "dogs" with a value calculated in our program rather than a fixed text?

    Example:

    $html =~ s#\bdogs\b#join(', ' , @animals)#ige;

    In this example we used the "e" switch, which lets us use the result of an expression as the replacement.
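
    A self-contained sketch of the "e" switch follows. The contents of @animals and the sample sentence are assumptions made up for the illustration.

```perl
use strict;
use warnings;

my @animals = ('cats', 'birds', 'fish');
my $html    = 'Dogs are welcome. We sell 3 kinds of dogs.';

# The replacement is Perl code: join() runs once for every match.
(my $listed = $html) =~ s#\bdogs\b#join(', ', @animals)#ige;

# Any expression works, including arithmetic on a captured group.
(my $doubled = $html) =~ s#(\d+)#$1 * 2#e;

print "$listed\n$doubled\n";
```

    Without the "e" switch the replacement would be inserted as literal text, including the word "join" and the parentheses.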

    "e" means: evaluate right side as an expression.

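    If you want a more complicated replacement built by a chunk of code, you can put a whole block of statements on the right side. The example below wraps the block in eval { }, which returns the value of its last statement and also traps any run-time error raised inside the block. A do { } block would return the value in the same way.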

    Example: if the target of a link in an HTML page is "_top", we replace that link with a link to http://yahoo.com.

    $html =~ s#<a href="([^"]*)" target="([^"]*)"#
    eval{
      if($2 eq "_top"){ 
        $string = qq{
          <a href="http://yahoo.com" target="_top"
        };
      } 
      else {
        $string = qq{
          <a href="$1" target="$2"
        };
      }
      $string
    }
    #iges;
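
    When the replacement logic grows, one alternative is to move it into a named sub and call that sub from the right side. This is a sketch of that variation, not the form used above: rewrite_link() and the sample HTML are hypothetical names and data introduced for the illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the rewritten start of an anchor tag.
sub rewrite_link {
    my ($href, $target) = @_;
    $href = 'http://yahoo.com' if $target eq '_top';
    return qq{<a href="$href" target="$target"};
}

my $html = qq{<a href="a.htm" target="_top">A</a> <a href="b.htm" target="main">B</a>};

# /e evaluates the call for each match and inserts its return value.
$html =~ s#<a href="([^"]*)" target="([^"]*)"#rewrite_link($1, $2)#ige;

print "$html\n";
```

    A named sub keeps the substitution on one line and lets the rewriting rule be tested on its own.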

    Company Info

    Turk Scripts is a Turkish company specializing in high-performance, bug-free CGI and Perl scripts. Current projects focus on fetching web pages on the fly, parsing databases, personalization, spidering, and information processing and retrieval. Please visit our web site for more information.
     
