PERL 4.0 Reference Guide - Norton Guide (X-Hacker.org, http://www.X-Hacker.org)

     Data Types and Objects

     Perl has three data types: scalars, arrays of  scalars,  and
     associative arrays of scalars.  Normal arrays are indexed by
     number, and associative arrays by string.

     The interpretation of operations and values  in  perl  some-
     times  depends on the requirements of the context around the
     operation or value.  There are three major contexts: string,
     numeric  and  array.  Certain operations return array values
     in contexts wanting an array, and scalar  values  otherwise.
     (If this is true of an operation it will be mentioned in the
     documentation for that operation.)  Operations which  return
     scalars  don't  care  whether  the  context is looking for a
     string or a number, but  scalar  variables  and  values  are
     interpreted as strings or numbers as appropriate to the con-
     text.  A scalar is interpreted as TRUE in the boolean  sense
     if  it  is  not  the null string or 0.  Booleans returned by
     operators are 1 for true and 0 or '' (the null  string)  for
     false.
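
     As a rough illustration of the context and truth rules above,
     the sketch below tests a few sample scalars.  The variable
     names are illustrative only, and the code sticks to perl 4
     syntax.

```perl
# Which of these sample scalars count as TRUE in a boolean test?
@samples = ('', 0, '0', ' ', 'a', 1, -1);
@true = ();
foreach $value (@samples) {
    push(@true, "<$value>") if $value;
}
print "true values: @true\n";

# The same scalar is read as a number or a string as the context demands.
$n    = 10;
$sum  = '10' + 5;     # numeric context: the string '10' is used as 10
$text = $n . 5;       # string context: both operands are used as strings
print "$sum $text\n";
```

     Note that a string holding a single space is true; only the
     null string and 0 (or '0') are false.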

     There are actually two varieties of null string: defined and
     undefined.   Undefined  null strings are returned when there
     is no real value for something, such as when  there  was  an
     error, or at end of file, or when you refer to an uninitial-
     ized variable or element of an  array.   An  undefined  null
     string  may become defined the first time you access it, but
     prior to that you can use the defined() operator  to  deter-
     mine whether the value is defined or not.
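
     A minimal sketch of telling the two kinds of null string
     apart with defined().  $never_set is a hypothetical variable
     that is deliberately left unassigned.

```perl
# An uninitialized variable is undefined; an explicit '' is defined.
$state_unset = defined($never_set) ? 'defined' : 'undefined';

$empty = '';
$state_empty = defined($empty) ? 'defined' : 'undefined';

# An element beyond the end of an array is undefined as well.
@list = (1, 2);
$state_elem = defined($list[5]) ? 'defined' : 'undefined';

print "$state_unset $state_empty $state_elem\n";
```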

     References to scalar variables always begin with  '$',  even
     when referring to a scalar that is part of an array.  Thus:

         $days           # a simple scalar variable
         $days[28]       # 29th element of array @days
         $days{'Feb'}    # one value from an associative array
         $#days          # last index of array @days

     but entire arrays or array slices are denoted by '@':

         @days           # ($days[0], $days[1],... $days[n])
         @days[3,4,5]    # same as @days[3..5]
         @days{'a','c'}  # same as ($days{'a'},$days{'c'})

     and entire associative arrays are denoted by '%':

         %days           # (key1, val1, key2, val2 ...)

     Any of these eight constructs may serve as an  lvalue,  that
     is,  may be assigned to.  (It also turns out that an assign-
     ment is itself an lvalue in certain  contexts--see  examples
     under s, tr and chop.)  Assignment to a scalar evaluates the
     righthand side in a scalar context, while assignment  to  an
     array  or  array  slice  evaluates  the righthand side in an
     array context.
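
     The constructs above can be tried out with a sketch like the
     following.  The sample data is made up for illustration; note
     that @days and %days are separate variables that happen to
     share a name.

```perl
@days = ('Sun', 'Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat');
%days = ('Feb', 28, 'Mar', 31);

$last = $#days;                  # last index of @days
@mid  = @days[1..3];             # array slice used as a value

@days[0,6]   = ('SUN', 'SAT');   # array slice used as an lvalue
$days{'Feb'} = 29;               # associative array element as an lvalue
@pair = @days{'Feb', 'Mar'};     # slice of the associative array

print "$last: @mid / $days[0] $days[6] / @pair\n";
```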

     You may  find  the  length  of  array  @days  by  evaluating
     "$#days",  as in csh.  (Actually, it's not the length of the
     array, it's the subscript of the last element,  since  there
     is (ordinarily) a 0th element.)  Assigning to $#days changes
     the length of the array.  Shortening an array by this method
     does  not actually destroy any values.  Lengthening an array
     that was previously shortened recovers the values that  were
     in  those elements.  You can also gain some measure of effi-
     ciency by preextending an array that is going  to  get  big.
     (You  can  also  extend  an array by assigning to an element
     that is off the end of the array.  This differs from assign-
     ing to $#whatever in that intervening values are set to null
     rather than recovered.)  You can truncate an array  down  to
     nothing  by assigning the null list () to it.  The following
     are exactly equivalent:

          @whatever = ();
          $#whatever = $[ - 1;


     If you evaluate an array in a scalar context, it returns the
     length of the array.  The following is always true:

          @whatever == $#whatever - $[ + 1;
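
     A short sketch of the length rules, assuming the default
     array base ($[ is 0).  @queue is an illustrative name.  The
     sketch only checks lengths and makes no claim about what
     happens to the values of truncated elements.

```perl
@queue = ('a', 'b', 'c');
$count = @queue;          # scalar context gives the length
$last  = $#queue;         # subscript of the last element

$queue[5] = 'f';          # assigning off the end extends the array
$count_extended = @queue;
$gap = defined($queue[4]) ? 'defined' : 'undefined';

$#queue = 1;              # assigning to $#queue changes the length
$count_short = @queue;

@queue = ();              # truncate to nothing
$count_empty = @queue;
$last_empty  = $#queue;

print "$count $last $count_extended $gap $count_short $count_empty $last_empty\n";
```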


     Multi-dimensional arrays are not directly supported, but see
     the  discussion of the $; variable later for a means of emu-
     lating multiple subscripts with an associative  array.   You
     could  also  write  a subroutine to turn multiple subscripts
     into a single subscript.
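
     As a sketch of the $; emulation mentioned above: a
     comma-separated subscript on an associative array is joined
     into a single key using $;.  %grid is a hypothetical table,
     and the split pattern assumes $; still has its default value
     of "\034".

```perl
# Two "subscripts" are really one key: join($;, 1, 2).
$grid{1,2} = 'x';
$grid{0,0} = 'origin';

$key      = join($;, 1, 2);
$via_join = $grid{$key};             # same element as $grid{1,2}

# Recover the subscripts from a key (assumes the default $; of "\034").
($row, $col) = split(/\034/, $key);

print "$via_join at row $row, column $col\n";
```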

     Every data type has its own  namespace.   You  can,  without
     fear  of  conflict, use the same name for a scalar variable,
     an array, an associative array, a filehandle,  a  subroutine
     name,  and/or  a label.  Since variable and array references
     always start with '$', '@', or  '%',  the  "reserved"  words
     aren't  in  fact  reserved  with  respect to variable names.
     (They ARE reserved with respect to labels  and  filehandles,
     however,  which  don't  have  an  initial special character.
     Hint:  you  could  say   open(LOG,'logfile')   rather   than
     open(log,'logfile').    Using   uppercase  filehandles  also
     improves readability and protects  you  from  conflict  with
     future  reserved  words.)  Case IS significant--"FOO", "Foo"
     and "foo" are all different names.  Names which start with a
     letter may also contain digits and underscores.  Names which
     do not start with a letter are  limited  to  one  character,
     e.g.  "$%" or "$$".  (Most of the one character names have a
     predefined significance to perl.  More later.)


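     String literals are delimited by either single or double quotes.
     Double-quoted strings are subject to backslash and variable
     substitution; single-quoted strings are not (except for \' and \\).
     You can also embed newlines directly in your  strings,  i.e.
     they  can  end on a different line than they begin.  This is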
     nice, but if you forget your trailing quote, the error  will
     not be reported until perl finds another line containing the
     quote character, which may be much further on in the script.
     Variable  substitution  inside  strings is limited to scalar
     variables, normal array values, and array slices.  (In other
     words,  identifiers  beginning  with  $ or @, followed by an
     optional bracketed expression as a subscript.)  The  follow-
     ing code segment prints out "The price is $100."

         $Price = '$100';               # not interpreted
         print "The price is $Price.\n";# interpreted

     Note that you can put curly brackets around  the  identifier
     to  delimit it from following alphanumerics.  Also note that
     a single quoted string must be separated  from  a  preceding
     word  by a space, since single quote is a valid character in
     an identifier (see Packages).
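
     A small sketch of why the curly brackets matter.  $fruit is
     an illustrative variable; $fruits is deliberately never set.

```perl
$fruit = 'apple';

# Without brackets perl looks for a variable named $fruits, which is unset.
$wrong = "Two $fruits";

# With brackets the identifier ends at the closing bracket.
$right = "Two ${fruit}s";

print "[$wrong] [$right]\n";
```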

     Two  special  literals  are  __LINE__  and  __FILE__,  which
     represent the current line number and filename at that point
     in your program.  They may only be used as separate  tokens;
     they  will  not  be interpolated into strings.  In addition,
     the token __END__ may be used to indicate the logical end of
     the  script  before  the  actual end of file.  Any following
     text is ignored (but may be read via the  DATA  filehandle).
     The  two  control  characters  ^D  and  ^Z  are synonyms for
     __END__.

     A word that doesn't have any  other  interpretation  in  the
     grammar  will  be  treated as if it had single quotes around
     it.  For this purpose, a word consists only of  alphanumeric
     characters  and underline, and must start with an alphabetic
     character.  As with filehandles and labels, a bare word that
     consists  entirely  of lowercase letters risks conflict with
     future reserved words, and if you use the  -w  switch,  Perl
     will warn you about any such words.

     Array values are interpolated into double-quoted strings  by
     joining  all  the  elements  of the array with the delimiter
     specified in the $" variable, space by default.   (Since  in
     versions  of  perl  prior  to  3.0 the @ character was not a
     metacharacter in double-quoted strings, the interpolation of
     @array,   $array[EXPR],   @array[LIST],   $array{EXPR},   or
     @array{LIST} only happens if array is  referenced  elsewhere
     in  the  program  or  is  predefined.)   The  following  are
     equivalent:

          $temp = join($",@ARGV);
          system "echo $temp";

          system "echo @ARGV";
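
     The effect of $" can also be seen without running a command.
     This is only a sketch; @parts is an illustrative array, and
     local() is used so that the change to $" is undone at the
     end of the block.

```perl
@parts = ('usr', 'local', 'bin');

$default = "@parts";          # joined with a space by default

{
    local($") = '/';          # temporary delimiter for this block
    $path = "/@parts";
}

$restored = "@parts";         # back to the default delimiter

print "$default\n$path\n$restored\n";
```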

     Within search patterns (which  also  undergo  double-quotish
     substitution)  there  is a bad ambiguity:  Is /$foo[bar]/ to
     be interpreted as /${foo}[bar]/ (where [bar] is a  character
     class for the regular expression) or as /${foo[bar]}/ (where
     [bar] is the subscript to array @foo)?  If @foo doesn't oth-
     erwise  exist,  then  it's  obviously a character class.  If
     @foo exists, perl takes a good guess  about  [bar],  and  is
     almost  always  right.  If it does guess wrong, or if you're
     just plain paranoid, you can force the  correct  interpreta-
     tion with curly brackets as above.

     A line-oriented form of quoting is based on the shell  here-
     is syntax.  Following a << you specify a string to terminate
     the quoted material, and all  lines  following  the  current
     line  down  to  the  terminating string are the value of the
     item.  The terminating string may be either an identifier (a
     word),  or  some quoted text.  If quoted, the type of quotes
     you use determines the treatment of the  text,  just  as  in
     regular  quoting.   An unquoted identifier works like double
     quotes.  There must be no space between the << and the iden-
     tifier.   (If  you  put a space it will be treated as a null
     identifier, which is valid,  and  matches  the  first  blank
     line--see  Merry  Christmas example below.)  The terminating
     string must appear by itself (unquoted and with no surround-
     ing whitespace) on the terminating line.

          print <<EOF;        # same as above
     The price is $Price.
     EOF

          print <<"EOF";      # same as above
     The price is $Price.
     EOF

          print << x 10;      # null identifier is delimiter
     Merry Christmas!

          print <<`EOC`;      # execute commands
     echo hi there
     echo lo there
     EOC

          print <<foo, <<bar; # you can stack them
     I said foo.
     foo
     I said bar.
     bar

     Array literals are denoted by separating  individual  values
     by commas, and enclosing the list in parentheses:

          (LIST)

     In a context not requiring an array value, the value of  the
     array literal is the value of the final element, as in the C
     comma operator.  For example,

         @foo = ('cc', '-E', $bar);

     assigns the entire array value to array foo, but

         $foo = ('cc', '-E', $bar);

     assigns the value of variable bar  to  variable  foo.   Note
     that the value of an actual array in a scalar context is the
     length of the array; the following assigns to $foo the value
     3:

         @foo = ('cc', '-E', $bar);
         $foo = @foo;         # $foo gets 3

     You  may  have  an  optional  comma   before   the   closing
     parenthesis of an array literal, so that you can say:

         @foo = (
          1,
          2,
          3,
         );

     When a LIST is  evaluated,  each  element  of  the  list  is
     evaluated in an array context, and the resulting array value
     is interpolated into LIST just as if each individual element
     were a member of LIST.  Thus arrays lose their identity in a
     LIST--the list

          (@foo,@bar,&SomeSub)

     contains all the elements of @foo followed by all  the  ele-
     ments  of @bar, followed by all the elements returned by the
     subroutine named SomeSub.
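
     A sketch of that flattening, with a stand-in definition of
     SomeSub and made-up contents for @foo and @bar:

```perl
# Stand-in subroutine for illustration; it returns a two-element list.
sub SomeSub { return ('x', 'y'); }

@foo = (1, 2);
@bar = (3);

@all = (@foo, @bar, &SomeSub);   # one flat list; no nested arrays
$n   = @all;

print "$n elements: @all\n";
```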

     A list value may also be subscripted like  a  normal  array.
     Examples:

          $time = (stat($file))[8];     # stat returns array value
          $digit = ('a','b','c','d','e','f')[$digit-10];
          return (pop(@foo),pop(@foo))[0];


     Array lists may be assigned to if and only if  each  element
     of the list is an lvalue:

         ($a, $b, $c) = (1, 2, 3);

         ($map{'red'}, $map{'blue'}, $map{'green'}) = (0x00f, 0x0f0, 0xf00);

     The final element may be an array or an associative array:

         ($a, $b, @rest) = split;
         local($a, $b, %rest) = @_;

     You can actually put an array anywhere in the list, but  the
     first  array  in  the  list will soak up all the values, and
     anything after it will get a null value.  This may be useful
     in a local().
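
     The difference between a trailing array and a leading one can
     be sketched as follows (all names are illustrative):

```perl
# A trailing array collects whatever is left over.
($first, @rest) = (1, 2, 3);

# A leading array soaks up everything; the scalar after it gets nothing.
(@all, $extra) = (1, 2, 3);
$extra_state = defined($extra) ? 'defined' : 'undefined';

print "$first / @rest / @all / $extra_state\n";
```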

     An associative array literal contains pairs of values to  be
     interpreted as a key and a value:

         # same as map assignment above
         %map = ('red',0x00f,'blue',0x0f0,'green',0xf00);

     Array assignment in a scalar context returns the  number  of
     elements produced by the expression on the right side of the
     assignment:

          $x = (($foo,$bar) = (3,2,1)); # set $x to 3, not 2


     There are several other pseudo-literals that you should know
     about.    If  a  string  is  enclosed  by  backticks  (grave
     accents), it first undergoes variable substitution just like
     a  double  quoted  string.  It is then interpreted as a com-
     mand, and the output of that command is  the  value  of  the
     pseudo-literal,  like  in  a  shell.  In a scalar context, a
     single string consisting of all the output is returned.   In
     an  array  context,  an array of values is returned, one for
     each line of output.  (You can set $/  to  use  a  different
     line  terminator.)   The  command  is executed each time the
     pseudo-literal is evaluated.  The status value of  the  com-
     mand  is  returned  in  $?  (see  Predefined  Names  for the
     interpretation of $?).  Unlike in  csh,  no  translation  is
     done  on  the return data--newlines remain newlines.  Unlike
     in any of the shells, single quotes  do  not  hide  variable
     names  in  the  command  from  interpretation.   To pass a $
     through to the shell you need to hide it with a backslash.
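
     A sketch of the two contexts, assuming a Unix-like system
     where the shell provides an echo command:

```perl
# Scalar context: all of the output comes back as one string.
$output = `echo hi there`;
$status = $?;                       # status of the command just run

# Array context: one element per line of output, newlines retained.
@lines = `echo one; echo two`;
$count = @lines;

print "status $status, $count lines, first is $lines[0]";
```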

     Evaluating a filehandle in angle brackets  yields  the  next
     line  from  that file (newline included, so it's never false
     until EOF, at which time an undefined  value  is  returned).
     Ordinarily  you  must  assign  that value to a variable, but
     there is one situation where an  automatic  assignment  hap-
     pens.   If  (and only if) the input symbol is the only thing
     inside the  conditional  of  a  while  loop,  the  value  is
     automatically assigned to the variable "$_".  (This may seem
     like an odd thing to you, but you'll use  the  construct  in
     almost  every perl script you write.)  Anyway, the following
     lines are equivalent to each other:

         while ($_ = <STDIN>) { print; }
         while (<STDIN>) { print; }
         for (;<STDIN>;) { print; }
         print while $_ = <STDIN>;
         print while <STDIN>;

     The filehandles STDIN, STDOUT  and  STDERR  are  predefined.
     (The  filehandles  stdin,  stdout  and stderr will also work
     except in packages, where they would be interpreted as local
     identifiers rather than global.)  Additional filehandles may
     be created with the open function.

     If a <FILEHANDLE> is used in a context that is  looking  for
     an  array,  an  array  consisting  of all the input lines is
     returned, one line per array element.  It's easy to  make  a
     LARGE data space this way, so use with care.
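
     To contrast the scalar and array behavior of <FILEHANDLE>,
     the sketch below writes a small scratch file and reads it
     back.  The file name and the handle names OUT and IN are
     assumptions made for the example.

```perl
# Scratch file name is hypothetical; $$ (the process id) keeps it unique.
$file = "/tmp/perl_lines.$$";

open(OUT, ">$file") || die "cannot write $file: $!";
print OUT "alpha\nbeta\ngamma\n";
close(OUT);

open(IN, $file) || die "cannot read $file: $!";
$first = <IN>;        # scalar context: the next line only
@rest  = <IN>;        # array context: every remaining line
close(IN);
unlink($file);

$count = @rest;
print "first: $first", "then $count more lines\n";
```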

     The null filehandle <> is special and can be used to emulate
     the  behavior  of  sed  and awk.  Input from <> comes either
     from standard input, or from each file listed on the command
     line.   Here's how it works: the first time <> is evaluated,
     the ARGV array is checked, and if it is  null,  $ARGV[0]  is
     set to '-', which when opened gives you standard input.  The
     ARGV array is then processed as a list  of  filenames.   The
     loop

          while (<>) {
               ...            # code for each line
          }

     is equivalent to

          unshift(@ARGV, '-') if $#ARGV < $[;
          while ($ARGV = shift) {
               open(ARGV, $ARGV);
               while (<ARGV>) {
                    ...       # code for each line
               }
          }

     except that it isn't as cumbersome to say.  It  really  does
     shift  array ARGV and put the current filename into variable
     ARGV.  It also uses filehandle  ARGV  internally.   You  can
     modify  @ARGV  before  the first <> as long as you leave the
     first filename at the beginning of the array.  Line  numbers
     ($.)  continue as if the input was one big happy file.  (But
     see example under eof for how to reset line numbers on  each
     file.)

     If you want to set @ARGV to your own list of files, go right
     ahead.   If  you want to pass switches into your script, you
     can put a loop on the front like this:

          while ($_ = $ARGV[0], /^-/) {
               shift;
               last if /^--$/;
               /^-D(.*)/ && ($debug = $1);
               /^-v/ && $verbose++;
               ...       # other switches
          }
          while (<>) {
               ...       # code for each line
          }

     The <> symbol will return FALSE only once.  If you  call  it
     again  after  this it will assume you are processing another
     @ARGV list, and if you haven't set @ARGV,  will  input  from
     STDIN.

     If the string inside the angle brackets is a reference to  a
     scalar  variable  (e.g. <$foo>), then that variable contains
     the name of the filehandle to input from.

     If the string inside angle brackets is not a filehandle,  it
     is  interpreted  as  a  filename  pattern to be globbed, and
     either an array of filenames or the  next  filename  in  the
     list  is  returned,  depending  on  context.  One level of $
     interpretation is done  first,  but  you  can't  say  <$foo>
     because  that's  an  indirect filehandle as explained in the
     previous paragraph.  You  could  insert  curly  brackets  to
     force interpretation as a filename glob: <${foo}>.  Example:

          while (<*.c>) {
               chmod 0644, $_;
          }

     is equivalent to

          open(foo, "echo *.c | tr -s ' \t\r\f' '\\012\\012\\012\\012'|");
          while (<foo>) {
               chop;
               chmod 0644, $_;
          }

     In fact, it's currently implemented that way.  (Which  means
     it will not work on filenames with spaces in them unless you
     have /bin/csh on your machine.)  Of course, the shortest way
     to do the above is:

          chmod 0644, <*.c>;

Online resources provided by: http://www.X-Hacker.org --- NG 2 HTML conversion by Dave Pearson