Date Published: 2002-08-26
Sams Teach Yourself Perl in 21 Days, Second Edition
Author: Laura Lemay
Publisher: Sams
ISBN: 0672320355
Published: April 2002
Day 3: More Scalar Data and Operators
Scalar data, as you learned yesterday, involves individual items such as numbers
and strings. Yesterday, you learned several things you could do with scalar
data; today, we'll finish that discussion, show you more operators you
can play with, and close with some related topics. The things you can expect
to learn today are
- Various assignment operators
- String concatenation and repetition
- Operator precedence
- Pattern matching for digits
- A short overview of input and output
Assignment Operators
Yesterday, we discussed the basic assignment operator, =, which assigns
a value to a variable. One common use of assignment is an operation to change
the value of a variable based on the current value of that variable, such as:
$inc = $inc + 100;
This does exactly what you'd expect; it gets the value of $inc,
adds 100 to it, and then stores the result back into $inc. This sort
of operation is so common that there is a shorthand assignment operator to do
just that. The variable reference goes on the left side, and the amount to change
it on the right, like this:
$inc += 100;
Perl supports shorthand assignments for each of the arithmetic operators, for
string operators I haven't described yet, and even for &&
and ||. Table 3.1 shows a few of the shorthand assignment operators.
Basically, just about any operator that has two operands has a shorthand assignment
version, where the general rule is that
variable operator= expression
is equivalent to
variable = variable operator expression
There's only one difference between the two: in the longhand version,
the variable reference is evaluated twice, whereas in the shorthand it's
evaluated only once. Most of the time, this won't affect the outcome of
the expression, but keep it in mind if you start getting results you don't
expect.
Table 3.1 Some Common Assignment Operators
| Operator | Example | Longhand equivalent |
|---|---|---|
| += | $x += 10 | $x = $x + 10 |
| -= | $x -= 10 | $x = $x - 10 |
| *= | $x *= 10 | $x = $x * 10 |
| /= | $x /= 10 | $x = $x / 10 |
| %= | $x %= 10 | $x = $x % 10 |
| **= | $x **= 10 | $x = $x ** 10 |
Note that the pattern matching operator, =~, is not an assignment
operator and does not belong in this group. Despite the presence of the equals
sign (=) in the operator, pattern matching and variable assignment
are entirely different things.
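As a quick check of the shorthand rule, here is a minimal sketch that runs one variable through each operator in Table 3.1. The starting value and the variable names are arbitrary, and the use strict line and my declarations are additions of this sketch; the listings in this lesson use plain global variables.

```perl
use strict;
use warnings;

my $x = 7;

$x += 3;     # same as $x = $x + 3   (10)
$x -= 4;     # same as $x = $x - 4   (6)
$x *= 5;     # same as $x = $x * 5   (30)
$x /= 3;     # same as $x = $x / 3   (10)
$x %= 4;     # same as $x = $x % 4   (2)
$x **= 3;    # same as $x = $x ** 3  (8)

print "x is $x\n";    # prints "x is 8"

# The logical operators have shorthand forms too: ||= assigns only
# when the variable currently holds a false value.
my $name = '';
$name ||= 'anonymous';
print "name is $name\n";    # prints "name is anonymous"
```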
Increment and Decrement Operators
The ++ and -- operators are used with a variable to increment
or decrement that variable by 1 (that is, to add or subtract 1). And as with
C, both operators can be used either in prefix fashion (before the variable,
++$x) or in postfix (after the variable, $x++). Depending
on the usage, the variable will be incremented or decremented before or after
it's used.
If your reaction to the previous paragraph is "Huh?", here's
a wordier explanation. The ++ and -- operators are used with
scalar variables to increment or decrement the value of that variable by 1,
sort of an even shorter shorthand to the += or -= operators.
In addition, both operators can be used before the variable reference (called
prefix notation), like this:
++$x;
Or, in postfix notation (after the variable), like this:
$x++;
The difference is subtle: it determines when, during Perl's evaluation of an
expression, the variable actually gets incremented. If you use these operators
as I did in the previous two examples (alone, by themselves), then there is no
difference. The variable gets incremented and Perl moves on. But if you use
these operators on the right side of another variable assignment, then whether
you use prefix or postfix notation can be significant. For example, let's look
at this snippet of Perl code:
$a = 1;
$b = 1;
$a = ++$b;
At the end of these statements, both $a and $b will be 2.
Why? The prefix notation means that $b will be incremented before its
value is assigned to $a. So, the order of evaluation in this expression
is that $b is incremented to 2 first, and then that value is assigned
to $a.
Now let's look at postfix:
$a = 1;
$b = 1;
$a = $b++;
In this case, $b still ends up getting incremented; its value at the
end of these three statements is 2. But $a's value stays at 1.
In postfix notation, the value of $b is used before it's incremented.
$b evaluates to 1, that value is assigned to $a, and then
$b is incremented to 2.
Note - To be totally, rigorously correct, my ordering of how things
happen here is off. For a variable assignment, everything on the right side
of the = operator always gets evaluated before the assignment occurs,
so in reality, $a doesn't get changed until the very last step.
What actually happens is that the original value of $b is remembered
by Perl, so that when Perl gets around to assigning a value to $a,
it can use that actual value. But unless you're working with really complex
expressions, you might as well think of it as happening before the increment.
Caution - Even though using assignment operators and increment operators
in the same statement can be convenient, you should probably avoid it because
it can cause confusion.
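The two cases can be put side by side in one short script. This is only a sketch: the variable names are made up for the illustration, and use strict and my are additions that the lesson's own snippets do not use.

```perl
use strict;
use warnings;

# Prefix: increment first, then use the new value.
my $n = 1;
my $prefix_result = ++$n;     # $n becomes 2, and 2 is assigned

# Postfix: use the old value, then increment.
my $m = 1;
my $postfix_result = $m++;    # 1 is assigned, then $m becomes 2

print "prefix:  result=$prefix_result n=$n\n";     # prefix:  result=2 n=2
print "postfix: result=$postfix_result m=$m\n";    # postfix: result=1 m=2
```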
String Concatenation and Repetition
String and text management is one of Perl's biggest strengths, and quite
a lot of the examples throughout this book are going to involve working with
strings: finding things in them, changing things in them, getting them from
files and from the keyboard, and sending them to the screen, to files, or over
a network to a Web browser. Today, we started by talking about strings in general
terms.
There are just a couple more things I want to mention about strings here, however,
because they fit in with today's "All Operators, All the Time"
theme. Perl has two operators for using strings: . (dot) for string
concatenation, and x for string repetition.
To concatenate two strings, use the . operator, like this:
'four score' . ' and seven years ago';
This expression results in a third string containing 'four score and
seven years ago.' It does not modify either of the original strings.
You can put together multiple concatenations, and they'll result in one
single long string:
'this, ' . 'that, ' . 'and the ' . 'other thing.'
Perl also includes a concatenate-and-assign shorthand operator, similar to
the operators I listed in Table 3.1:
$x .= "dog";
In this example, if $x contained the string "mad",
then after this statement $x would contain the string "maddog".
As with the other shorthand assignment operators, $x .= 'foo'
is equivalent to $x = $x . 'foo'.
The other string-related operator is the x operator (not the X
operator; it must be a lowercase x). The x operator takes a
string on one side and a number on the other (but will convert them as needed),
and then creates a new string with the old string repeated the number of times
given on the right. Some examples:
'blah' x 4; # 'blahblahblahblah'
'*' x 3; # '***'
10 x 5; # '1010101010'
In that last example, the number 10 is converted to the string '10',
and then repeated five times.
Why is this useful? Consider having to pad a screen layout to include a certain
number of spaces or filler characters, where the width of that layout can vary.
Or consider, perhaps, doing some kind of ASCII art where the repetition of characters
can produce specific patterns (hey, this is Perl; you're allowed, no,
encouraged, to do weird stuff like that). At any rate, should you ever need
to repeat a string, the x operator can do it for you.
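To make the padding idea concrete, here is a small sketch that builds a centered title between two rules. The title, the width of 20, and the variable names are all assumptions chosen for the example; length and int are standard Perl built-ins that this lesson has not covered yet.

```perl
use strict;
use warnings;

my $title = 'Totals';
my $width = 20;

# Repeat '=' to draw a rule exactly as wide as the layout.
my $rule = '=' x $width;

# Pad the title on the left so it is centered within $width columns.
my $pad    = ' ' x int(($width - length($title)) / 2);
my $header = $pad . $title;

# .= appends to the string already stored in the variable.
my $report = $rule . "\n";
$report .= $header . "\n";
$report .= $rule . "\n";

print $report;
```

Changing $width or $title is enough to resize the whole layout, because every piece is computed with x rather than typed out by hand.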
Operator Precedence and Associativity
Operator precedence determines which operators in a complex expression
are evaluated first. Associativity determines how operators that have
the same precedence are evaluated (where your choices are left-to-right, right-to-left,
or nonassociative for those operators where order of evaluation is either not
important, not guaranteed, or not even possible). Table 3.2 shows the precedence
and associativity of the various operators available in Perl, with operators
of a higher precedence (evaluated first) higher up in the table than those of
a lower precedence (evaluated later). You'll want to fold down the corner
of this page or mark it with a sticky note; this is one of those tables you'll
probably refer to over and over again as you work with Perl.
You can always change the evaluation of an expression (or just make it easier
to read) by enclosing it with parentheses. Expressions inside parentheses are
evaluated before those outside parentheses.
Note that there are a number of operators in this table you haven't learned
about yet (and some I won't cover in this book at all). I've included
lesson references for those operators I do explain later on in this book.
Table 3.2 Operator Precedence and Associativity
| Operator | Associativity | What it means |
|---|---|---|
| -> | left | Dereference operator (Day 19, "Working with References") |
| ++ -- | non | Increment and decrement |
| ** | right | Exponent |
| ! ~ \ + - | right | Logical not, bitwise not, reference (Day 19), unary +, unary - |
| =~ !~ | left | Pattern matching |
| * / % x | left | Multiplication, division, modulus, string repeat |
| + - . | left | Add, subtract, string concatenate |
| << >> | left | Bitwise left shift and right shift |
| unary operators | non | Function-like operators (See today's "Going Deeper" section) |
| < > <= >= lt gt le ge | non | Tests |
| == != <=> eq ne cmp | non | More tests (<=> and cmp, Day 8, "Data Manipulation with Lists") |
| & | left | Bitwise AND |
| \| ^ | left | Bitwise OR, bitwise XOR |
| && | left | C-style logical AND |
| \|\| | left | C-style logical OR |
| .. | non | Range operator (Day 4, "Working with Lists and Arrays") |
| ?: | right | Conditional operator (Day 6, "Conditionals and Loops") |
| = += -= *= /=, etc. | right | Assignment operators |
| , => | left | Comma operators (Day 4) |
| list operators | non | list operators in list context (Day 4) |
| not | right | Perl logical NOT |
| and | left | Perl logical AND |
| or xor | left | Perl logical OR and XOR |
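A few rows of Table 3.2 are easier to remember once you have seen them in action. The sketch below is illustrative only; the expressions and variable names are invented for the example, and use strict and my are additions.

```perl
use strict;
use warnings;

my $sum_first  = (2 + 3) * 4;    # 20: parentheses force the addition first
my $mult_first = 2 + 3 * 4;      # 14: * binds tighter than +

my $power   = 2 ** 3 ** 2;       # 512: ** is right-associative, so 2 ** (3 ** 2)
my $negated = -2 ** 2;           # -4: ** binds tighter than unary minus

my $left = 10 - 4 - 3;           # 3: - is left-associative, so (10 - 4) - 3

# x binds tighter than ., so the repetition happens before the concatenation.
my $banner = '-' x 3 . '>';      # '--->'

print "$sum_first $mult_first $power $negated $left $banner\n";
# prints "20 14 512 -4 3 --->"
```

When an expression mixes several of these levels, adding parentheses costs nothing and spares the reader a trip back to the table.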
Using Patterns to Match Digits
Yesterday I introduced you to the bare basics of pattern matching. You learned
how to look for strings contained inside other strings, which gives you some
flexibility in your scripts and what sort of input you can accept and test.
Today, and in future days, you'll learn about new kinds of patterns you
can use in the pattern operator /.../ and how to use those to make
your scripts more flexible and more powerful.
Today we'll look at patterns that are used to match digits, any digit,
from 0 to 9. Using what you learned yesterday you could test for digits like
this:
if ($input =~ /1/ or $input =~ /2/ or $input =~ /3/ or $input =~ /4/ or
$input =~ /5/ or $input =~ /6/ or ... ) { ... }
But that would be a lot of repetitive typing, and Perl has a much better way
of doing the same thing. The \d pattern matches any single digit
character, from 0 to 9. So, the statement $input =~ /\d/ tests
$input to see if it contains any digits, and returns true if it does.
If $input contained 1, it would return true. It would also return true if
$input contained "23skidoo" or "luckynumber123" or "1234567".
Each \d stands for a single digit. If you want to match two digits,
you can use two \d patterns:
$input =~ /\d\d/;
In this case "23skidoo" would return true, but "number1"
would not. You need two digits in a row for the pattern to match.
You can also combine the \d pattern with specific characters:
$input =~ /\dup\d/;
This pattern matches, in order: any digit, the character u, the character
p, and then any digit again. So "3up4" would match;
"comma1up1semicolon" would match; but "14upper13"
would not. You need the exact sequence of digits and specific characters.
You can also match any character that is not a digit using the \D
pattern. \D matches all the letters, all the punctuation, all the whitespace,
anything in the ASCII character set that isn't 0 through 9. You might use
this one specifically (as we will in the next section) to test your
input to see if it contains any nonnumeric characters. When you want your input
to be numeric, for example, if you were going to perform arithmetic on it, or
compare it to something else numeric, you want to make sure there are no other
characters floating around in there. \D will do that:
$input =~ /\D/;
This test will return true if $input contains anything nonnumeric.
"foo" is true, as is "foo34", "23skidoo",
or "a123". If the input is "3456" it will
return false; there are only numeric characters there.
Table 3.3 shows the patterns you've learned so far.
Table 3.3 Patterns
| Pattern | Type | What it does |
|---|---|---|
| /a/ | Character | Matches a |
| /\d/ | Any digit | Matches a single digit |
| /\D/ | Any character not a digit | Matches a single character other than a digit |
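To see these patterns side by side, the following sketch runs a handful of the sample strings from this section through each test. The loop, the array, and the ?: operator have not been covered yet (see Days 4 and 6), and the variable names are assumptions of this example, so treat it as a preview rather than something you need to be able to write today.

```perl
use strict;
use warnings;

my @samples = ('23skidoo', 'number1', '3up4', '14upper13', '3456', 'foo');
my $report  = '';

for my $input (@samples) {
    my $digit    = $input =~ /\d/     ? 'yes' : 'no';   # at least one digit
    my $two      = $input =~ /\d\d/   ? 'yes' : 'no';   # two digits in a row
    my $up       = $input =~ /\dup\d/ ? 'yes' : 'no';   # digit, "up", digit
    my $nondigit = $input =~ /\D/     ? 'yes' : 'no';   # anything but 0-9

    $report .= "$input: digit=$digit two=$two up=$up nondigit=$nondigit\n";
}

print $report;
```

Note that a newline counts as a nondigit character, so input read from the keyboard needs its trailing newline removed before a /\D/ test means anything.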
An Example: Simple Statistics
Here's an example called stats.pl, which prompts you for numbers,
one at a time. When you're done entering numbers, it gives you a count
of the numbers, the sum, and the average. It's a rather silly kind of statistics
script, but it'll demonstrate tests, variable assignment, and pattern matching
for input verification (and we'll be building on this script later on).
Here's an example of what it looks like when run (including what happened
when I accidentally typed an r in the middle of entering the numbers):
% stats.pl
Enter a number: 3
Enter a number: 9
Enter a number: 3
Enter a number: r
Digits only, please.
Enter a number: 7
Enter a number: 4
Enter a number: 7
Enter a number: 3
Enter a number:
Total count of numbers: 7
Total sum of numbers: 36
Average (mean): 5.14
%
Listing 3.1 shows the code behind the statistics script.
Listing 3.1 The stats.pl Script
1: #!/usr/local/bin/perl -w
2:
3: $input = ''; # temporary input
4: $count = 0; # count of numbers
5: $sum = 0; # sum of numbers
6: $avg = 0; # average
7:
8: while () {
9:     print 'Enter a number: ';
10:     chomp ($input = <STDIN>);
11:     if ($input eq '') { last; }
12:
13:     if ($input =~ /\D/) {
14:         print "Digits only, please.\n";
15:         next;
16:     }
17:
18:     $count++;
19:     $sum += $input;
20: }
21:
22: $avg = $sum / $count;
23:
24: print "\nTotal count of numbers: $count\n";
25: print "Total sum of numbers: $sum\n";
26: printf("Average (mean): %.2f\n", $avg);
This script has three main sections: an initialization section, a section for
getting and storing the input, and a section for computing the average and printing
out the results.
Here's the initialization section (with line numbers in place):
3: $input = ''; # temporary input
4: $count = 0; # count of numbers
5: $sum = 0; # sum of numbers
6: $avg = 0; # average
We're using four scalar variables here: one to store the input as it comes
in, one to keep track of the count of numbers, one to hold the sum, and one
to hold the average.
The next section is where you prompt for the data and store it:
8: while () {
9:     print 'Enter a number: ';
10:     chomp ($input = <STDIN>);
11:     if ($input eq '') { last; }
12:
13:     if ($input =~ /\D/) {
14:         print "Digits only, please.\n";
15:         next;
16:     }
17:
18:     $count++;
19:     $sum += $input;
20: }
This second part of the script uses a while loop and a couple of if
conditionals to read the input repeatedly until we get a blank line, and
to test the input to make sure that we didn't get anything that wasn't
a number. I still haven't discussed how loops and conditionals are defined
in Perl (and we won't get around to it until Day 6), so I'm going
to pause here and give you a very basic introduction so you won't be totally
lost for the next few days.
A while loop says "while this thing is true, execute this stuff."
With a while loop Perl executes a test, and if the test is true it
executes everything inside the curly braces (here, lines 9 through 19).
Then it goes back and tries the test again, and if it's
true again, it executes all that code again, and so on. The loop
goes around and around until the test is false.
Usually, the test is contained inside the parentheses (line 8), and can be
any of the tests you learned about yesterday. Here, there is no test, so this
is an infinite loop; it never stops, at least not here. We'll find a way
to break out of it from inside the loop.
An if conditional is simpler than a loop. An if conditional
has a test, and if the test is true, Perl executes some code. If the test is
false, sometimes Perl executes some other code (if the if conditional
has a second part, called an else), and sometimes it just goes onto
the next part of the script. So, for example, in the if conditional
in line 11, the test is if the input is equal to the empty string ''.
If the test is true, last is executed. The last keyword is
used to immediately break out of a while loop and stop looping. If
the test is false, Perl skips over line 11 altogether and continues onto line
13.
In the if conditional in lines 13 through 16, the test is a pattern
match. Here we're testing $input to see if it contains any
nondigit characters. If it does, we execute the print statement in
line 14, and then call next. The next keyword skips to the
end of the while loop (in this case, skipping lines 18 and 19) and
starts the next pass at the top of the while again. Just as with
line 11, if the test in line 13 is false, Perl skips over everything in lines
13 to 16 and continues on to line 18 instead.
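If you want to watch last and next at work without typing at a prompt, the sketch below replays the same loop over a fixed list of lines. The @lines array and shift stand in for the keyboard and are assumptions of this example (arrays are covered on Day 4); the loop body otherwise follows the same shape as lines 8 through 20.

```perl
use strict;
use warnings;

# Simulated keyboard input: the empty string stands for the user
# pressing Return on a blank line.
my @lines = ('3', 'r', '9', '', '7');

my $count = 0;
my $sum   = 0;

while () {                        # no test: loops until last is called
    my $input = shift @lines;     # take the next simulated line
    if ($input eq '') { last; }   # blank line: leave the loop entirely

    if ($input =~ /\D/) {
        next;                     # bad input: skip the rest of this pass
    }

    $count++;
    $sum += $input;
}

print "count=$count sum=$sum\n";    # prints "count=2 sum=12"
```

The 'r' is skipped by next, and the 7 is never read because last has already ended the loop.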
Now that you know about if and while, let's start at
the top and figure out what this bit of code actually does. It's a while
loop with no test, so it'll keep going forever until something breaks you
out of it. Inside the body of the while, we use line 10 to grab the
actual input (and I know you're still waiting to learn what chomp
and <STDIN> do; it's coming up soon). Line 11, as I mentioned,
tests for an empty string, and if we got one, breaks out of the loop. The empty
string in the input will only occur if the user hit return without typing anything;
that is the signal to the stats program that the end of input has been reached.
Note the string test (eq) here; a number test (==) would convert the empty
string to 0, which is not what we want. 0 is valid input for the stats program.
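The difference between the two kinds of test is easy to confirm. In this sketch the variable names are invented for the illustration, and the warning is switched off inside a small block only so the numeric comparison runs quietly; with warnings on, Perl would point out that the empty string isn't numeric.

```perl
use strict;
use warnings;

my $blank = '';     # what you get when the user just presses Return
my $zero  = '0';    # what you get when the user types 0

# String comparison: only the blank line equals the empty string.
my $blank_is_empty = $blank eq '' ? 1 : 0;    # 1
my $zero_is_empty  = $zero  eq '' ? 1 : 0;    # 0

# Numeric comparison: the empty string is converted to 0 first, so a
# blank line and a typed 0 become indistinguishable.
my $blank_equals_zero;
{
    no warnings 'numeric';    # silence the "isn't numeric" warning for the demo
    $blank_equals_zero = $blank == 0 ? 1 : 0;    # 1
}

print "eq: blank=$blank_is_empty zero=$zero_is_empty\n";    # eq: blank=1 zero=0
print "==: blank=$blank_equals_zero\n";                      # ==: blank=1
```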
By the time we get to line 13 we know we have something in $input, but we
want to make sure that it is valid input, that is, numeric data. We're
going to be performing arithmetic on this data in lines 18 and 19, and if we
end up with nonnumeric data in the input, with warnings turned on, Perl is going
to complain about that data. By verifying and rejecting invalid input you can
make sure your scripts do not do unfriendly things like spew errors or crash
while your users are running them.
Lines 13 through 16 are the input validation test. If the input contains any
nondigit characters, we print an error and the loop restarts by prompting for
new data.
By the time we get to line 18 we now know that we have data to be handled in
$input and that data does not contain nonnumeric characters. Now we
can add that new data to our current store of data. In lines 18 and 19 we increment
the $count variable, and modify the $sum variable to add the
new value that was input. With these two lines we can keep a running total of
the count and the sum as each new bit of input comes along, but we'll wait
until all the data has been entered to calculate the average.
And, finally, we finish up by calculating the average and printing the results:
22: $avg = $sum / $count;
23:
24: print "\nTotal count of numbers: $count\n";
25: print "Total sum of numbers: $sum\n";
26: printf("Average (mean): %.2f\n", $avg);
Line 22 is a straightforward calculation, and lines 24 through 26 print the
count, the sum, and the average. Note the \n at the beginning of the
first print statement; this will print an extra blank line before the
summary. Remember that \n can appear anywhere in a string, not just
at the end.
In the third summary line, you'll note we're using printf
again to format the number output. This time, we used a printf format
for a floating-point number that prints 2 decimal places of that value (%.2f).
You get more information about printf in the next section.
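One gap worth knowing about: line 22 divides by $count, which is still 0 if the user presses Return at the very first prompt, and Perl stops with an "Illegal division by zero" error in that case. The sketch below shows one possible guard. The format_average subroutine is hypothetical and not part of stats.pl (subroutines come later in the book), and sprintf is the built-in that returns the formatted string that printf would print.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the mean formatted to two decimal
# places, or 'n/a' when there is nothing to average.
sub format_average {
    my ($sum, $count) = @_;
    return 'n/a' if $count == 0;    # avoid dividing by zero
    return sprintf('%.2f', $sum / $count);
}

my $typical = format_average(36, 7);    # '5.14', as in the sample run
my $empty   = format_average(0, 0);     # 'n/a'

print "Average (mean): $typical\n";
print "Average (mean): $empty\n";
```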
© Copyright Pearson Education. All rights reserved.