Date Published: 1999-10-01
by D. Jasmine Merced
TNS Group, Inc.
Hashes (Associative Arrays)
Hashes were previously known as "associative arrays". The developers of Perl, in their wisdom,
shortened the name and eliminated a common confusion with arrays
by referring to them as "hashes" instead. While the term "associative array" offers a bit of
insight into what a hash actually is, it was a little confusing in that you might think an associative
array was just a different name for an array.
Not true.
How are hashes different from arrays?
Arrays are lists storing multiple pieces of data, and are prefixed with an @
sign. The pieces of data, called elements, are retrieved by their numeric place in the array.
Below is an example of an array listing all of the months of the year.
@months = ("January", "February", "March", "April", "May", "June", "July", "August", "September",
"October", "November", "December");
In the above example, January would be retrieved by using $months[0].
(If this is foreign to you, please refer back to the discussion on arrays).
Hashes, on the other hand, store a list of key/value pairs and are prefixed with a percent sign
%. (Don't let that definition scare you - it will be explained in
detail after the example below.)
Note: The remainder of this article will use "hash" instead of "associative array".
What do hashes look like?
Before we continue, let's look at an example of a hash, using book titles and their authors (for
brevity, I've only listed the first author on the cover of each book).
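A minimal sketch of such a hash follows. It assumes only the five title/author pairs that are named elsewhere in this article (the sixth pair is not named in the text, so it is left out), and it is written in the same style as the article's other listings:

```perl
# Sketch of %perlbooks: book title (key) => first author (value).
# Only the five pairs named in this article are included.
%perlbooks = (
    "Learning Perl"                  => "Randal L. Schwartz",
    "Programming Perl"               => "Larry Wall",
    "Advanced Perl Programming"      => "Sriram Srinivasan",
    "Mastering Regular Expressions"  => "Jeffrey E.F. Friedl",
    "Mastering Algorithms with Perl" => "Jon Orwant"
);
```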
Here we defined a hash called %perlbooks. Remember, hashes begin with a
percent sign %. Within %perlbooks, we
listed 6 book titles and their respective authors. You'll notice that each book has its own author, and
all books are part of a list.
In this example, the book title is the "key" and the author is the "value". Each book and author pair
is part of the overall list, %perlbooks.
Back to the original definition, a hash stores a list of key/value pairs. Does this make sense now?
Hashes are the easiest and most efficient way to handle chunks of key/value pairs.
- You can have as many key/value pairs as you'd like.
- You can easily search the keys or values of the entire hash.
- You can easily add, modify or remove pairs by editing the hash, which is located in one section,
not scattered around the entire program.
The alternatives to using hashes aren't pretty. Suppose a programmer chose not to use hashes
and decided to define each book and author as a scalar variable. This is what
the code could look like:
$booktitle1 = "Learning Perl";
$bookauth1 = "Randal L. Schwartz";
$booktitle2 = "Programming Perl";
$bookauth2 = "Larry Wall";
$booktitle3 = "Advanced Perl Programming";
$bookauth3 = "Sriram Srinivasan";
Then, if the programmer needed to look for "Advanced Perl Programming" (and didn't know that it was
stored as $booktitle3), they would have to
check individually whether $booktitle1 matched "Advanced Perl Programming", then
$booktitle2, then $booktitle3. Imagine
if you had 50 books.
With a hash, each key and value can be searched quickly and effectively.
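To make the contrast concrete, here is a minimal sketch of the same lookup done with a hash. It assumes a three-book %perlbooks built from the titles above. The variable $wanted is a name made up for this illustration, `exists` is Perl's built-in test for whether a key is present, and `my` simply declares a variable (it is required under `use strict`, which the article's own listings do not use).

```perl
use strict;
use warnings;

# The same three books as above, stored as title => author pairs.
my %perlbooks = (
    "Learning Perl"             => "Randal L. Schwartz",
    "Programming Perl"          => "Larry Wall",
    "Advanced Perl Programming" => "Sriram Srinivasan"
);

# Hypothetical variable holding the title we are looking for.
my $wanted = "Advanced Perl Programming";

# One test replaces the chain of $booktitle1, $booktitle2, ... comparisons.
if (exists($perlbooks{$wanted})) {
    print "$wanted is by $perlbooks{$wanted}\n";
} else {
    print "$wanted is not in the list\n";
}
```

The lookup stays a single test no matter how many books the hash holds.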
Accessing individual keys/values
Similar to the way array elements are accessed, an individual hash value is
accessed as a scalar: the % prefix is replaced with a $, because a single value is a scalar.
Example: To get the author of "Programming Perl", the programmer would use
$perlbooks{'Programming Perl'}
Notice the curly braces {}. Whereas an array uses brackets
[], hashes use curly braces {}. The use
of curly braces tells perl that the value is located in a hash and not an array.
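A short side-by-side sketch of the two kinds of access may help. The data is cut down to the minimum, and $first_month and $author are names made up for this illustration; `my` only declares each variable, as `use strict` requires.

```perl
use strict;
use warnings;

my @months    = ("January", "February", "March");
my %perlbooks = ("Programming Perl" => "Larry Wall");

# Square brackets and a numeric position select an array element.
my $first_month = $months[0];

# Curly braces and a key select a hash value.
my $author = $perlbooks{'Programming Perl'};

print "$first_month\n";   # prints "January"
print "$author\n";        # prints "Larry Wall"
```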
This is going to get a little technical, and in most cases, it's not at all necessary to know anything
in this section if you're just installing downloaded/purchased programs. This explanation is for those
"curious ones" who just have to put all of the pieces together and see an actual implementation of how
hashes are searched. If it starts to make your head hurt, by all means, skip to the next section.
Let's start with a searching sample, then onto a line-by-line explanation:
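The sample can be pieced together from the five lines explained below. The sketch here keeps those statements as they appear in the explanation (in the article's own style, without `use strict`), changing only the indentation. It assumes a %perlbooks that holds just the five title/author pairs named in this article:

```perl
# Assumption: %perlbooks holds only the five pairs named in this article.
%perlbooks = (
    "Learning Perl"                  => "Randal L. Schwartz",
    "Programming Perl"               => "Larry Wall",
    "Advanced Perl Programming"      => "Sriram Srinivasan",
    "Mastering Regular Expressions"  => "Jeffrey E.F. Friedl",
    "Mastering Algorithms with Perl" => "Jon Orwant"
);

# Heading, followed by one blank line.
print "The advanced perl books in our hash are:\n\n";

# Sorted list of the book titles (the hash keys).
@allbooks = sort(keys(%perlbooks));

# Print only the titles containing "Mastering" or "Advanced".
foreach $title (@allbooks) {
    if (($title =~ /Mastering/) || ($title =~ /Advanced/)) {
        print "$title by $perlbooks{$title}\n";
    }
}
```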
The above program, when run, would print out:
The advanced perl books in our hash are:

Advanced Perl Programming by Sriram Srinivasan
Mastering Algorithms with Perl by Jon Orwant
Mastering Regular Expressions by Jeffrey E.F. Friedl
Now, the explanation.
print "The advanced perl books in our hash are:\n\n";
This line prints "The advanced perl books in our hash are:" with one blank line after it.
@allbooks = sort(keys(%perlbooks));
This line does 2 things. 1) it sorts the keys (book titles) of %perlbooks in ASCII* order; and 2) it assigns that sorted list of keys (not the whole hash) to an array, @allbooks.
foreach $title(@allbooks){
This line says "for every book title in @allbooks"...
if (($title =~ /Mastering/)||($title =~ /Advanced/)){
... if the title contains "Mastering" or "Advanced"...
print "$title by $perlbooks{$title}\n";}}
... print the book title, the word "by" and the author of the book.
If you were able to follow this, you can now see how easy hashes make searching.
* ASCII order is not what you and I would normally think of as alphabetical order. In ASCII
order, every uppercase letter comes before every lowercase letter, so both "A" and "Z" come before "a".
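The difference is easy to see with a small experiment. This is a sketch with a made-up word list; @words, @ascii and @folded are names invented for the illustration. The second sort compares lowercased copies of the strings, which is one common way to get a case-insensitive ordering.

```perl
use strict;
use warnings;

my @words = ("apple", "Zebra", "Banana", "cherry");

# The default sort compares strings by character code, so every
# uppercase letter sorts before every lowercase letter.
my @ascii = sort @words;

# Comparing lowercased copies ignores case.
my @folded = sort { lc($a) cmp lc($b) } @words;

print "@ascii\n";    # prints "Banana Zebra apple cherry"
print "@folded\n";   # prints "apple Banana cherry Zebra"
```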
Why do I need to know about hashes?
Many programs use hashes to store data as discussed above. In many cases, it's you who enters the
information in the hashes, perhaps in a configuration file or at the top of a program. If you install perl
programs, there'll probably be a time when you'll need to know the right way to enter information into
hashes.
If information is typed incorrectly into hashes, the perl program will fail or yield incorrect results.
Let's go through the proper format of a hash.
As discussed, a hash begins with a percent sign %. As you saw in the
%perlbooks example, the entire
contents of the list are enclosed in parentheses ().
The key "points" to the value by using the following format:
"key" => "value"
Both the key (book title) and its value (author) need to be entered following the same principles as
scalars. That is:
- "Straight text can be enclosed in quotation marks"
- 'Text using special characters should be enclosed in single quotes'
- "If you prefer not to use single quotes, special characters need to be escaped using a backslash,
like \@ this"
Please be sure to review the discussion on scalar variables for a detailed
discussion on proper formatting.
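As an illustration of these quoting rules inside a hash, here is a minimal sketch. The %contacts hash and its e-mail addresses are made up for the example. The @ sign is the special character in question: inside double quotes perl would try to read it as the start of an array name.

```perl
use strict;
use warnings;

# Hypothetical hash: department => e-mail address.
my %contacts = (
    "Sales"   => 'sales@example.com',      # single quotes: @ is taken literally
    "Support" => "support\@example.com"    # double quotes: @ must be escaped
);

print "$contacts{'Sales'}\n";     # prints "sales@example.com"
print "$contacts{'Support'}\n";   # prints "support@example.com"
```

Both values end up as plain strings containing an @ sign; only the way they are typed differs.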
Of course, the instructions that come with the program you're installing should let you know what type
of information you should include in your hash.
Adding and Removing Lines from Hashes
Let's say you're installing a shopping cart program that allows you to set up various shipping methods
and define a flat fee for each shipping method. Let's further suppose that the programmer gave you
the following default:
%shipping = (
"UPS" => "10.00",
"Federal Express" => "20.00",
"USPS" => "5.00"
);
Now you want to add "Airborne Express" and remove "Federal Express". Just copy the "USPS" line and
paste it below "USPS" and change the text, right?
Wrong.
See the way the Federal Express line ends with a comma? And the way USPS doesn't? Perl needs to have
key/value pairs separated by commas, so it knows where one pair begins and ends. Omitting a comma between
pairs will produce an error and cause the program to fail.
If you want to copy and paste the last line, just make sure the second to last line ends in a
comma. Add one if you must. Some programmers put a comma after the last pair as well (perl accepts it)
and others don't, so keep a sharp eye out for that. So now our hash looks like:
%shipping = (
"UPS" => "10.00",
"Federal Express" => "20.00",
"USPS" => "5.00",
"Airborne Express" => "7.50"
);
It may seem silly (it's just a comma after all), but that little omission is one of the most common
errors when using hashes.
Removing Federal Express is easy: just delete the entire line. Our final hash:
%shipping = (
"UPS" => "10.00",
"USPS" => "5.00",
"Airborne Express" => "7.50"
);
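The same change can also be made from inside a running program rather than by editing the definition. This is a minimal sketch using plain assignment and Perl's built-in `delete`. The shopping cart example does not require it, and $method is a name made up for the illustration.

```perl
use strict;
use warnings;

# The default hash supplied with the (hypothetical) shopping cart.
my %shipping = (
    "UPS"             => "10.00",
    "Federal Express" => "20.00",
    "USPS"            => "5.00"
);

# Assigning to a new key adds a pair.
$shipping{"Airborne Express"} = "7.50";

# delete removes a key together with its value.
delete $shipping{"Federal Express"};

# List what is left, sorted by method name.
foreach my $method (sort keys %shipping) {
    print "$method => $shipping{$method}\n";
}
```

After these two statements the hash holds the same three pairs as the final definition above.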
A hash stores a list of key/value pairs. These key/value pairs can easily be searched and offer a very
logical way to organize related data. Hashes start with a % and have a
few strict guidelines to follow to properly define them.
Next, we're going to delve into subroutines.
D. Jasmine Merced is a partner in Tintagel
Net Solutions Group, Inc. and the administrator of The Perl Archive. She also serves as a Director of
the World Organization of Webmasters.