The evil of a global $_

I often write code like this:

    $_->store() for @objects;

and quite often it actually works. But suddenly a piece of code doing this in a loop broke with this error message: Can’t call method “store” without a package or object reference. And quite right, sometimes @objects would contain plain integers.

Unfortunately it wasn’t easy to track down the relevant change, so enter the Perl debugger. Declaring a watch on ‘@objects’ isn’t useful, as it triggers each time @objects enters or leaves scope. But saving a reference to @objects in $my::objects and then watching $my::objects->[0] worked.

I had reimplemented the store() method using Data::Walk for walking some structure instead of doing it by hand. And Data::Walk sets $_ to the current node. Due to the aliasing implied by the for modifier, $_ was an alias for the current element of @objects, so each element in my array was garbled.

Two solutions: The store() method can localize $_ by adding local $_; before using Data::Walk. This works in legacy perl interpreters. Alternatively, the routine looping through @objects can make $_ a lexical variable by adding my $_; before the loop. This is a new feature in Perl 5.10.

An even more robust solution would be to have Data::Walk localize $_ itself AND use a lexical $_ in code where $_ is aliased to important data. See RT #47309.
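
The clobbering and two ways around it can be sketched as follows. The two store_* subs are hypothetical stand-ins for a store() method whose helper assigns to $_; they are not the real Data::Walk code. Because lexical my $_ was later marked experimental and removed again in Perl 5.24, the sketch uses a named loop variable for the caller-side fix instead.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for a store() method whose helper assigns to $_.
sub store_careless { $_ = 'garbled'; return }
sub store_careful  { local $_; $_ = 'garbled'; return }

my @objects = ( 'a', 'b', 'c' );

# Callee-side fix: local $_ means the caller's alias is restored on return.
store_careful() for @objects;
print "after careful:  @objects\n";    # a b c

# Caller-side fix: with a named loop variable, $_ is never aliased to the data.
for my $object (@objects) { store_careless() }
print "after named:    @objects\n";    # a b c

# The problem case: $_ aliases each element and the callee overwrites it.
store_careless() for @objects;
print "after careless: @objects\n";    # garbled garbled garbled
```

Doing both, as suggested above, means neither side has to trust the other.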

New version of the Padre debugger plugin

I’ve just uploaded a new version of Padre-Plugin-Debugger to CPAN. It has been stuck on GitHub for a while, as I meant to write some documentation and start calling it version 1.0. But as I haven’t written any documentation yet, it is still just a puny 0.3 version.

The major update is how the interpreter is called. Now it will actually find the modules you are using, if you add the correct directories under ‘Edit -> Preferences -> Run Parameters’.

Benchmarking serialization modules

First you have to optimize for correctness, then you can optimize for speed. At the moment I’m working on a project where I do a lot of serialization, and to make sure I could debug the correctness I chose an easily readable serialization format: YAML.

But running Devel::NYTProf on my code showed that an awful lot of time was spent in the YAML code. Changing a few lines to use Storable instead cut the run time for my test data from 1000s to 200s. This is a significant win even while developing, but of course I shouldn’t have used the old YAML implementation to begin with.

Mentioning this to the local Perl Mongers group I got referred to a benchmark of different serialization modules by Christian Hansen (I think). I’ve updated the script and the interesting part is that JSON::XS outruns Storable on both tests.

Thank you Devel::NYTProf for showing me that 80% of my run time was spent in old pure perl serialization code.
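
A minimal sketch of this kind of comparison, using only modules that ship with Perl. This is not the script mentioned above: the test data is made up, and JSON::PP is used as a pure-Perl stand-in for JSON::XS, so the numbers it prints say nothing about the results reported here. It checks the round trips first and times them second.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use Storable qw(freeze thaw);
use JSON::PP qw(encode_json decode_json);

# Made-up test data; substitute a sample of your real structures.
my $data = {
    id    => 42,
    tags  => [ map { "tag$_" } 1 .. 50 ],
    attrs => { map { ( "key$_" => $_ ) } 1 .. 50 },
};

# One full serialize/deserialize round trip per module.
sub roundtrip_storable { return thaw( freeze( $_[0] ) ) }
sub roundtrip_json     { return decode_json( encode_json( $_[0] ) ) }

# Correctness first: both round trips must give the data back.
for my $copy ( roundtrip_storable($data), roundtrip_json($data) ) {
    die "round trip lost data"
        unless $copy->{id} == 42 && @{ $copy->{tags} } == 50;
}

# Then speed: run every candidate against the same data.
cmpthese( 500, {
    'Storable' => sub { roundtrip_storable($data) },
    'JSON::PP' => sub { roundtrip_json($data) },
} );
```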

More fun with DESTROY

When I wrote about exception handling in Perl I mentioned briefly that DESTROY() methods should localize $? too.

Try this script:

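#!/usr/bin/perl
use strict;
use warnings;

package Foo;

sub new {
    my $class = shift;
    # Read from a child process that never stops writing
    open my $fh, "-|", "cat /dev/zero"
        or die "Could not start cat: $!";
    return bless { fh => $fh }, $class;
}

sub DESTROY {
    my $self = shift;
    # Closing the pipe waits for the child and stores its status in $?
    close $self->{fh};
}

package main;

my $foo = Foo->new();
exit 42;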

You would expect it to return with status code 42, but when I run it the status code is 13. Localizing $? in the DESTROY() method gives the expected 42 status code.
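
A sketch of the localized destructor, assuming the same Foo package as above. To make the effect visible without looking at the exit code, it sets $? to 0, lets an object go out of scope, and prints $? afterwards; the pipe close inside DESTROY only changes the localized copy.

```perl
use strict;
use warnings;

package Foo;

sub new {
    my $class = shift;
    open my $fh, "-|", "cat /dev/zero"
        or die "Could not start cat: $!";
    return bless { fh => $fh }, $class;
}

sub DESTROY {
    my $self = shift;
    local $?;              # the pipe close below only changes this local copy
    close $self->{fh};
}

package main;

$? = 0;
{
    my $foo = Foo->new();
}                          # $foo goes out of scope here and DESTROY runs
print "\$? after the destructor: $?\n";
```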

The most common usage of $? is explained in the perlvar manual page:

The status returned by the last pipe close, backtick (``) command, successful call to wait() or waitpid(), or from the system() operator.

And the above destructor performs a ‘pipe close’, changing the value of $?. But the perlvar manual also mentions a secondary usage of $?:

Inside an “END” subroutine $? contains the value that is going to be given to “exit()”. You can modify $? in an “END” subroutine to change the exit status of your program.

But this doesn’t just hold for END blocks; it is true for any code run after the exit() invocation, including destructors.

And now for a mystery. Given the above Foo package, it isn’t very surprising that this script has the status code 13:

use Foo;
my $foo = Foo->new();

But why does the following change make the script have the status code 0?

use Foo;
my $foo = Foo->new();
$? = 42;

Danish political spam: Socialdemokraterne

We all have our own little pet causes. One of mine is that I would like to be free of advertising in my email and in my paper mail. Unfortunately, opinions are not covered by the provisions of the Danish marketing practices act (markedsføringsloven), so religious and political spam is not prohibited at all.

But the political parties ought to respect that I have put a clear notice on my door saying that I do not want advertisements. Socialdemokraterne, at least, cannot figure that out.

Therefore: vote no to political spam, vote no to Socialdemokraterne’s Claus Larsen-Jensen!

When perfection isn’t good enough

One of the many possible problems with using CPAN modules is that they often just implement the parts of the problems the author needed a solution for. But once in a while you run into the opposite problem: The module implements parts of a standard that you just don’t want.

Some time ago I had to parse parts of the WebDAV protocol. WebDAV properties are transmitted using XML with namespaces, which is one thing I think XML::Simple is particularly bad at. So I turned to XML::LibXML (which is becoming my XML module of choice either way).

So, the WebDAV RFC has examples like this:

     <?xml version="1.0" encoding="utf-8" ?>
     <D:propfind xmlns:D="DAV:">

       <D:prop xmlns:R="http://ns.example.com/boxschema/">
         <R:bigbox/>
         <R:author/>
         <R:DingALing/>
         <R:Random/>
       </D:prop>

     </D:propfind>

Unfortunately XML::LibXML insists that namespace URIs conform to the URI specification, which DAV: doesn’t. Due to XML::LibXML’s perfection I’m not able to just use it. Solution:

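sub escapeNamespace {
    # Modifies its argument in place through the @_ alias.
    # Prefix every namespace that isn't already a urn: or http: URI
    $_[0] =~ s/(xmlns(?::\w+)?)="(?!urn|http)([^"]+)"/$1="urn:xxx:$2"/g;
    # Give empty namespace declarations a placeholder URI
    $_[0] =~ s/(xmlns(?::\w+)?)=""/$1="urn:xxx:nonamespace"/g;
}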

I’m not quite sure that the second substitution is needed by the standard, but the litmus WebDAV test suite needs it…
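
As a usage sketch, here is the substitution applied to a shortened version of the RFC example before it would be handed to the parser. The sub is repeated so the snippet runs on its own. Code that later looks up elements would presumably have to use the escaped namespace, urn:xxx:DAV:, rather than DAV:.

```perl
use strict;
use warnings;

sub escapeNamespace {
    $_[0] =~ s/(xmlns(?::\w+)?)="(?!urn|http)([^"]+)"/$1="urn:xxx:$2"/g;
    $_[0] =~ s/(xmlns(?::\w+)?)=""/$1="urn:xxx:nonamespace"/g;
    return;
}

my $xml = <<'XML';
<?xml version="1.0" encoding="utf-8" ?>
<D:propfind xmlns:D="DAV:">
  <D:prop xmlns:R="http://ns.example.com/boxschema/">
    <R:bigbox/>
  </D:prop>
</D:propfind>
XML

escapeNamespace($xml);    # rewrites $xml in place
print $xml;

# The document element now reads <D:propfind xmlns:D="urn:xxx:DAV:">,
# while the http:// namespace on D:prop is left alone.
```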

Benchmarking is hard

Benchmarking is hard to do right. Recently Jason Switzer found an old benchmark of the smart match operator written by Michael Schwern. Jason Switzer comments:

If I were to publish a paper with such a gaping hole like that, it would never be taken serious. In each test, he’s generating a random number and a random character and storing the result! That’s not part of the test. The $needles should each be generated outside the timing loop. This is adding tons of additional instructions that are not considered part of the test. These results are basically useless and should be redone.

After running his test, Jason concludes: “Those results are astonishing!”. Yes, they clearly show that the grep solutions are 25% faster than using first. This is astonishing and surprising, and probably warrants an explanation.

Jason’s methodology consists of generating a single random array and needle for each piece of code he wants to benchmark. Running his benchmark on my machine shows that first is 15 times faster than smart match and about 28 times faster than grep. Of course, my random arrays might have been biased in favor of ‘first’ by randomly placing the needle near the start.

So Jason’s tests are not to be taken seriously either. Even when benchmarking with random data, we have to benchmark all our code pieces against the same data, and on more than just one data set.
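
A minimal sketch of that methodology, not the benchmark from either post: the data sets are generated once, outside the timed code, and every candidate then runs against exactly the same sets. The sizes, the iteration count and the fixed seed are arbitrary assumptions, and smart match is left out to keep the example short.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use List::Util qw(first);

srand(42);    # fixed seed so a rerun sees the same data (arbitrary choice)

# Generate all data sets up front; the needle is always present somewhere.
my @cases;
for ( 1 .. 20 ) {
    my @haystack = map { int rand 1000 } 1 .. 1000;
    push @cases, {
        haystack => \@haystack,
        needle   => $haystack[ int rand @haystack ],
    };
}

# grep always scans the whole list and counts the matches.
sub with_grep {
    my $case = shift;
    return scalar grep { $_ == $case->{needle} } @{ $case->{haystack} };
}

# first stops at the first match.
sub with_first {
    my $case = shift;
    my $hit  = first { $_ == $case->{needle} } @{ $case->{haystack} };
    return defined $hit;
}

# Each timed run walks all the data sets, so both see the same input.
cmpthese( 200, {
    grep  => sub { with_grep($_)  for @cases },
    first => sub { with_first($_) for @cases },
} );
```

Timing each data set separately as well would show how much of the difference comes from where the needle happens to sit.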

Update: I’ve been running some more benchmarks in the background. The general pattern seems to be that ‘first’ from List::Util and ‘any’ from List::MoreUtils are about 25% slower than grep, but with some spikes where ‘first’/‘any’ is blazingly fast. Is being a native opcode that much faster than XS code?

Perl exception handling is hard

It is a well-known Perl idiom to write exception handling with eval and die, instead of try and catch.


eval {
    do_something()
        or die "Err!";
};

if ($@) {
    print "Caught exception: $@";
}

Even Damian Conway does it in chapter 16 of Perl Best Practices, and ‘perldoc -f die’ does it too. Unfortunately it doesn’t work all the time. Consider the following situation:


#!/usr/bin/perl

package Foo;

sub new {
    my $class = shift;
    return bless {}, $class;
}

sub DESTROY {
    my $self = shift;
    eval { 1; };
}

package main;
eval {
    my $foo = Foo->new();
    open my $fh, ">", "/"
        or die "Could not open /: $!";
    print "Doing some work\n";
};
if ($@) {
    print STDERR "Something bad happened: $@\n";
} else {
    print "Everything went well\n";
}

As we shouldn’t be able to open ‘/’ for writing, we expect to see an error message. But it turns out that the script claims to succeed even though it didn’t do any “work”. So what happened? Well, $foo went out of scope, the destructor was called and the embedded eval changed $@.

What can we do to solve the problem? First of all, make sure that the destructor doesn’t change $@. You can do this by localizing it. Even though you’re not calling eval directly, some other code might do it. So always start destructors like this:


sub DESTROY {
    my $self = shift;
    local $@;
    ...;
}

By the way, you want to localize $? too, but that is another story.

Another solution is to change your try/catch emulation. If an eval’ed block dies, the resulting value is undef. So make sure that the ordinary case always ends with a true value, and replace catch with a simple or:


eval {
    do_work();
    1;
} or do {
    print "catch: $@";
};

Of course $@ might be misleading about the real error, so this doesn’t work as well as fixing the destructors. But when using modules written by others, you might implement defense in depth by using both solutions. I think there is a Perl::Critic policy advocating the above try/catch construction.
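
Both defenses together can be sketched like this. The try_work() sub and its error message are made up for the example. As far as I know, Perl 5.14 and later set $@ only after the stack has been unwound, so the original failure is mostly a problem on older interpreters, but the pattern below does not rely on that.

```perl
use strict;
use warnings;

package Foo;

sub new { return bless {}, shift }

sub DESTROY {
    my $self = shift;
    local $@;       # protect the caller's $@ from the eval below
    eval { 1; };
}

package main;

# Hypothetical unit of work that always fails, to exercise the error path.
sub try_work {
    my $ok = eval {
        my $foo = Foo->new();
        die "work failed\n";
        1;          # only reached when nothing died
    };
    return "ok\n" if $ok;
    return "caught: " . ( $@ || "unknown error\n" );
}

print try_work();    # caught: work failed
```

The fallback to "unknown error" covers the case where some other destructor, one you don’t control, has emptied $@ anyway.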

But how do the exception modules on CPAN fare against the above destructor? I tried two of them: Error.pm works correctly, because its throw implementation stores the exception before dying. The much newer TryCatch.pm doesn’t work (see RT #46294), but is much cooler otherwise.

Update: I’ve managed to write a fix for TryCatch.pm. It includes a throw function, just like Error.pm.

Persistence and objects from XS modules

After three workdays spent debugging, it is time for some blame allocation. But who to blame?

In a larger system we’re using Geo::IP to locate users. Recently we discovered a few hanging or segfaulting processes. After finding a reproducible case, we found that both kinds of failure shared the same backtrace:

[...]
#1  0x55791570 in GeoIPRecord_delete () from /usr/lib/libGeoIP.so.1
#2  0x5577ecd8 in XS_Geo__IP__Record_DESTROY () from /usr/lib/perl5/auto/Geo/IP/IP.so
#3  0x080bdaa1 in Perl_pp_entersub ()
#4  0x080626db in Perl_magicname ()
#5  0x080632ee in Perl_call_sv ()
#6  0x080c9bf4 in Perl_sv_clear ()
#7  0x080ca3e5 in Perl_sv_free ()
#8  0x080c9f07 in Perl_sv_clear ()
#9  0x080ca3e5 in Perl_sv_free ()
#10 0x080bb5da in Perl_av_undef ()
#11 0x080ca0e7 in Perl_sv_clear ()
#12 0x080ca3e5 in Perl_sv_free ()
#13 0x080c9f07 in Perl_sv_clear ()
#14 0x080ca3e5 in Perl_sv_free ()
#15 0x080b6f69 in Perl_hv_free_ent ()
#16 0x080b70e5 in Perl_hv_iterinit ()
#17 0x080b8e69 in Perl_hv_undef ()
#18 0x080ca0cc in Perl_sv_clear ()
#19 0x080ca3e5 in Perl_sv_free ()
#20 0x080e92ef in Perl_leave_scope ()
#21 0x080e93bc in Perl_pop_scope ()
#22 0x080bec4d in Perl_pp_leavesub ()
#23 0x080bc379 in Perl_runops_standard ()
#24 0x08063bfd in perl_run ()
#25 0x0805ffd1 in main ()

GeoIPRecord_delete consists of three free(3) calls for members of a struct and one to free(3) the struct itself. Running with GNU libc’s MALLOC_CHECK_ and using valgrind confirmed that it tried to free invalid pointers. But the ‘leak’ was nowhere to be found. Using the pure-Perl version of Geo::IP removed any sign of problems.

Almost giving up, I suddenly discovered that it happened while a value read from Storable was going out of scope. Then it was easy to reproduce:


use Geo::IP;
use Storable;
my $geoip = Geo::IP->open_type(
    GEOIP_CITY_EDITION_REV1,
    GEOIP_MEMORY_CACHE|GEOIP_CHECK_CACHE
);
my $gir = $geoip->record_by_name( "peter.makholm.net" );
my $copy = Storable::dclone( $gir );
$gir = undef;
$copy = undef;
[... crash ...]

What happens is that Geo::IP::Record is just a scalar reference containing a C pointer as the integer value. Dumping it with Devel::Peek looks like this:

SV = RV(0x2cdd4c8) at 0x2cdd4b8
  REFCNT = 2
  FLAGS = (ROK)
  RV = 0x27a4a50
  SV = PVMG(0x297e610) at 0x27a4a50
    REFCNT = 1
    FLAGS = (OBJECT,IOK,pIOK)
    IV = 47079264
    NV = 0
    PV = 0
    STASH = 0x2ca9520   "Geo::IP::Record"

This is almost the same as what we get for my $foo = bless do { \(my $o = 123456); }, "Foo::Bar":

SV = RV(0x2ce82f0) at 0x2ce82e0
  REFCNT = 2
  FLAGS = (ROK)
  RV = 0x2cf7d30
  SV = PVMG(0x297e730) at 0x2cf7d30
    REFCNT = 3
    FLAGS = (OBJECT,IOK,pIOK)
    IV = 123456
    NV = 0
    PV = 0
    STASH = 0x2cbde28   "Foo::Bar"

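So Storable can’t really see that the object is a C pointer, and Geo::IP::Record wasn’t meant to be dumped with Storable. The copy simply carries the same raw pointer value, which is then freed a second time when the copy is destroyed (or, after a trip to disk, points at memory this process never allocated). The only hint is the description in the documentation of Storable: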

The Storable package brings persistence to your Perl data structures containing SCALAR, ARRAY, HASH or REF objects, i.e. anything that can be conveniently stored to disk and retrieved at a later time.

But when you use a Geo::IP::Record, it looks exactly like your run-of-the-mill HASH-based record objects. Lessons learned? None really, except to watch out for XS objects.
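
The double free can be imitated in pure Perl. Fake::Record below is a hypothetical stand-in for an XS class, not the Geo::IP code: its "pointer" is just an integer, and its destructor counts releases where the real one would call free(3).

```perl
use strict;
use warnings;
use Storable qw(dclone);

package Fake::Record;    # hypothetical stand-in for an XS-backed object

my %freed;               # how often each "pointer" has been released

sub new {
    my ( $class, $address ) = @_;
    return bless \( my $pointer = $address ), $class;
}

# An XS destructor would hand the stored pointer to free() here.
sub DESTROY {
    my $self = shift;
    $freed{$$self}++;
}

sub times_freed {
    my ( $class, $address ) = @_;
    return $freed{$address} // 0;
}

package main;

my $record = Fake::Record->new(47079264);
my $copy   = dclone($record);    # copies the integer, not what it points at

undef $record;
undef $copy;

print "freed ", Fake::Record->times_freed(47079264), " times\n";    # freed 2 times
```

If you control such a class, one defensive option is to give it a STORABLE_freeze hook that dies, so that an attempt to serialize or clone the object fails loudly instead of corrupting memory later.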

What do you need in a debugger?

My one-and-a-half-hour debugger was very clearly just a proof-of-concept thingy. A week later I think I have most of the basic debugger functionality implemented, or at least most of the features I use in the standard Perl debugger.

A nice feature is that while single stepping, it just ignores code in files not loaded in Padre. If you want to debug this code:

use Error qw(:try);

try {
    do_it();
} catch Error::Simple with {
    my $E = shift;
    $log->notice("Failed to do it: $E");
};

but don’t care about Error.pm, just don’t load it in Padre and you can single-step through it, just as if try/catch were a first-class control structure. Double-click on the Error.pm entries in the stack trace and it will ask if you want to load Error.pm.

Version 1.0 is still far away. The main issue is to rewrite the menu and the handling of the ‘watches’ and ‘stack trace’ views: a complete refactoring of the user interface parts.

But functionality is more fun to implement, so what do you need from the debugger in a Perl IDE? Comments please, either here or in #padre @ irc.perl.org
