After three workdays spent debugging, it is time for some blame allocation. But who is to blame?

In a larger system we're using Geo::IP to locate users. Recently we discovered a few hanging or segfaulting processes. After finding a reproducible case, we found that both kinds of failure shared the same backtrace:

[...]
#1  0x55791570 in GeoIPRecord_delete () from /usr/lib/libGeoIP.so.1
#2  0x5577ecd8 in XS_Geo__IP__Record_DESTROY () from /usr/lib/perl5/auto/Geo/IP/IP.so
#3  0x080bdaa1 in Perl_pp_entersub ()
#4  0x080626db in Perl_magicname ()
#5  0x080632ee in Perl_call_sv ()
#6  0x080c9bf4 in Perl_sv_clear ()
#7  0x080ca3e5 in Perl_sv_free ()
#8  0x080c9f07 in Perl_sv_clear ()
#9  0x080ca3e5 in Perl_sv_free ()
#10 0x080bb5da in Perl_av_undef ()
#11 0x080ca0e7 in Perl_sv_clear ()
#12 0x080ca3e5 in Perl_sv_free ()
#13 0x080c9f07 in Perl_sv_clear ()
#14 0x080ca3e5 in Perl_sv_free ()
#15 0x080b6f69 in Perl_hv_free_ent ()
#16 0x080b70e5 in Perl_hv_iterinit ()
#17 0x080b8e69 in Perl_hv_undef ()
#18 0x080ca0cc in Perl_sv_clear ()
#19 0x080ca3e5 in Perl_sv_free ()
#20 0x080e92ef in Perl_leave_scope ()
#21 0x080e93bc in Perl_pop_scope ()
#22 0x080bec4d in Perl_pp_leavesub ()
#23 0x080bc379 in Perl_runops_standard ()
#24 0x08063bfd in perl_run ()
#25 0x0805ffd1 in main ()

GeoIPRecord_delete consists of three free(3) calls for members of a struct and one more to free(3) the struct itself. Running with GNU libc's MALLOC_CHECK_ and under valgrind confirmed that it tried to free invalid pointers. But the 'leak' was nowhere to be found. Using the pure-Perl version of Geo::IP removed any sign of problems.

Almost giving up, I suddenly discovered that it happened while a value read from Storable was going out of scope. Then it was easy to reproduce:


use Geo::IP;
use Storable;
my $geoip = Geo::IP->open_type(
    GEOIP_CITY_EDITION_REV1, 
    GEOIP_MEMORY_CACHE|GEOIP_CHECK_CACHE
);
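my $gir = $geoip->record_by_name( "peter.makholm.net" );
# The clone is a new Perl object holding the same C pointer value
my $copy = Storable::dclone( $gir );
# First DESTROY: GeoIPRecord_delete frees the C struct
$gir = undef;
# Second DESTROY: the same pointer is freed again
$copy = undef;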
[... crash ...]

What happens is that a Geo::IP::Record is just a blessed scalar reference containing a C pointer as its integer value. Dumping it with Devel::Peek looks like this:

SV = RV(0x2cdd4c8) at 0x2cdd4b8
  REFCNT = 2
  FLAGS = (ROK)
  RV = 0x27a4a50
  SV = PVMG(0x297e610) at 0x27a4a50
    REFCNT = 1
    FLAGS = (OBJECT,IOK,pIOK)
    IV = 47079264
    NV = 0
    PV = 0
    STASH = 0x2ca9520   "Geo::IP::Record"

This is almost the same as what we get for my $foo = bless do { \(my $o = 123456); }, "Foo::Bar":

SV = RV(0x2ce82f0) at 0x2ce82e0
  REFCNT = 2
  FLAGS = (ROK)
  RV = 0x2cf7d30
  SV = PVMG(0x297e730) at 0x2cf7d30
    REFCNT = 3
    FLAGS = (OBJECT,IOK,pIOK)
    IV = 123456
    NV = 0
    PV = 0
    STASH = 0x2cbde28   "Foo::Bar"

So Storable can't really see that the object is a C pointer, and Geo::IP::Record wasn't meant to be dumped with Storable. The only hint is the description in the documentation of Storable:

The Storable package brings persistence to your Perl data structures containing SCALAR, ARRAY, HASH or REF objects, i.e. anything that can be conveniently stored to disk and retrieved at a later time.
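To make the double free concrete without needing Geo::IP or a crashing process, here is a minimal pure-Perl sketch. Fake::Handle is a hypothetical stand-in for an XS class: its integer value plays the role of the C pointer, and its DESTROY only counts calls where the real one would call free(3).

```perl
use strict;
use warnings;
use Storable ();

# Hypothetical stand-in for an XS class: a blessed scalar reference
# whose integer value plays the role of a C pointer.
package Fake::Handle;

our %freed;    # "address" => number of times it was released

sub new {
    my ($class, $address) = @_;
    my $self = \(my $slot = $address);
    return bless $self, $class;
}

# A real XS DESTROY would call free() here; we only count the calls.
sub DESTROY {
    my ($self) = @_;
    $freed{$$self}++;
}

package main;

my $orig = Fake::Handle->new(47079264);
my $copy = Storable::dclone($orig);

# The clone is a distinct Perl object carrying the same integer.
printf "same value: %s, same object: %s\n",
    ( $$orig == $$copy ? "yes" : "no" ),
    ( $orig == $copy   ? "yes" : "no" );

undef $orig;
undef $copy;

print "address 47079264 released $Fake::Handle::freed{47079264} times\n";
```

Each of the two objects runs DESTROY once, so the same "address" is released twice. With a real pointer behind it, the second release is the invalid free that valgrind reported.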

But when you use a Geo::IP::Record, it looks exactly like your run-of-the-mill HASH-based record object. Lessons learned? None, really, except to watch out for XS objects.
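One way to watch out is to make such a class refuse serialization outright. Storable calls a STORABLE_freeze method on an object if its class defines one, and an exception thrown there aborts the store or dclone. The sketch below uses a hypothetical Fake::Handle class again. Nothing in this post suggests that Geo::IP::Record defines such a hook, so installing one there would be your own addition.

```perl
use strict;
use warnings;
use Storable ();

# Hypothetical pointer-wrapping class, standing in for an XS object.
package Fake::Handle;

sub new {
    my ($class, $address) = @_;
    return bless \(my $slot = $address), $class;
}

# Storable invokes this hook before serializing the object.
# Dying here prevents a second owner of the pointer from being created.
sub STORABLE_freeze {
    my ($self, $cloning) = @_;
    die ref($self) . " wraps a C pointer and cannot be serialized\n";
}

sub DESTROY { }    # the real class would free the C struct here

package main;

my $orig  = Fake::Handle->new(123456);
my $copy  = eval { Storable::dclone($orig) };
my $error = $@;

print defined $copy ? "cloned\n" : "refused: $error";
```

A loud failure at dclone time is much easier to trace than a corrupted heap noticed later, when the last copy goes out of scope.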