Using dtrace to track down memory leaks

June 30, 2009

I’ve been working with a customer to try and find a memory “leak”
in their application. Many things have been tried, libumem, and the
mdb ::findleaks command all with no success.


So I was, as I am sure others before me have, pondering if you
could use dtrace to do this. Well I think you can. I have a script
that puts probes into malloc et al and counts how often they are
called by this thread and when they are freed often free is called.


Then in the entry probe of the target application note away how
many calls there have been to the allocators and how many to free and
with a bit of care realloc. Then in the return probe compare the
number of calls to allocate and free with the saved values and
aggregate the results. The principle is that you find the routines
that are resulting in allocations that they don’t clear up. This
should give you a list of functions that are possible leakers which
you can then investigate1.


Using the same technique I for getting dtrace to “follow
fork” that I described
here
I ran this up on diskomizer, a program that I understand
well and I’m reasonably sure does not have systemic memory leaks. The
dtrace script reports three sets of results.



  1. A count of how many times each routine and it’s descendents
    have called a memory allocator.


  2. A count of how many times each routine and it’s descendents
    have called free or realloc with a non NULL pointer as the first
    argument.


  3. The difference between the two numbers above.



Then with a little bit of nawk to remove all the functions for
which the counts are zero gives:


# /usr/sbin/dtrace -Z -wD TARGET_OBJ=diskomizer2 -o /tmp/out-us \
-s /tmp/followfork.d \
-Cs /tmp/allocated.d -c \
"/opt/SUNWstc-diskomizer/bin/sparcv9/diskomizer -f /devs -f background \
-o background=0 -o SECONDS_TO_RUN=1800"
dtrace: failed to compile script /tmp/allocated.d: line 20: failed to create entry probe for ‘realloc’: No such process
dtrace: buffer size lowered to 25m
dtrace: buffer size lowered to 25m
dtrace: buffer size lowered to 25m
dtrace: buffer size lowered to 25m
# nawk ‘$1 != 0 { print $0 }’ < /tmp/out.3081
allocations
1 diskomizer`do_dev_control
1 diskomizer`set_dev_state
1 diskomizer`set_state
3 diskomizer`report_exit_reason
6 diskomizer`alloc_time_str
6 diskomizer`alloc_time_str_fmt
6 diskomizer`update_aio_read_stats
7 diskomizer`cancel_all_io
9 diskomizer`update_aio_write_stats
13 diskomizer`cleanup
15 diskomizer`update_aio_time_stats
15 diskomizer`update_time_stats
80 diskomizer`my_calloc
240 diskomizer`init_read
318 diskomizer`do_restart_stopped_devices
318 diskomizer`start_io
449 diskomizer`handle_write
606 diskomizer`do_new_write
2125 diskomizer`handle_read_then_write
2561 diskomizer`init_buf
2561 diskomizer`set_io_len
58491 diskomizer`handle_read
66255 diskomizer`handle_write_then_read
124888 diskomizer`init_read_buf
124897 diskomizer`do_new_read
127460 diskomizer`expect_signal
freecount
1 diskomizer`expect_signal
3 diskomizer`report_exit_reason
4 diskomizer`close_and_free_paths
6 diskomizer`update_aio_read_stats
9 diskomizer`update_aio_write_stats
11 diskomizer`cancel_all_io
15 diskomizer`update_aio_time_stats
15 diskomizer`update_time_stats
17 diskomizer`cleanup
160 diskomizer`init_read
318 diskomizer`do_restart_stopped_devices
318 diskomizer`start_io
442 diskomizer`handle_write
599 diskomizer`do_new_write
2125 diskomizer`handle_read_then_write
2560 diskomizer`init_buf
2560 diskomizer`set_io_len
58491 diskomizer`handle_read
66246 diskomizer`handle_write_then_read
124888 diskomizer`do_new_read
124888 diskomizer`init_read_buf
127448 diskomizer`cancel_expected_signal
mismatch_count
-127448 diskomizer`cancel_expected_signal
-4 diskomizer`cancel_all_io
-4 diskomizer`cleanup
-4 diskomizer`close_and_free_paths
1 diskomizer`do_dev_control
1 diskomizer`init_buf
1 diskomizer`set_dev_state
1 diskomizer`set_io_len
1 diskomizer`set_state
6 diskomizer`alloc_time_str
6 diskomizer`alloc_time_str_fmt
7 diskomizer`do_new_write
7 diskomizer`handle_write
9 diskomizer`do_new_read
9 diskomizer`handle_write_then_read
80 diskomizer`init_read
80 diskomizer`my_calloc
127459 diskomizer`expect_signal
#


From the above you can see that there are two functions that create
and free the majority of the allocations and the allocations almost
match each other, which is expected as they are effectively
constructor and destructor for each other. The small mismatch is not
unexpected in this context.


However it is the vast number of
functions that are not listed at all as they and their children make
no calls to the memory allocator or have exactly matching allocation
and free that are important here. Those are the functions that we
have just ruled out.


From here it is easy now to drill down
on the functions that are interesting you, ie the ones where there
are unbalanced allocations.





I’ve uploaded the files allocated.d
and followfork.d
so you can see the details. If you find it useful then let me know.



1Unfortunately
the list is longer than you want as on SPARC it includes any
functions that don’t have their own stack frame due to the way
dtrace calculates ustackdepth, which the script makes use of.




2The
script only probes particular objects, in this case the main
diskomizer binary, but you can limit it to a particular library or
even a particular set of entry points based on name if you edit the
script.



Follow fork for dtrace pid provider?

June 27, 2009

There is a ongoing request to have follow fork functionality for
the dtrace pid provider but so far no one has stood upto the plate
for that RFE. In the mean time my best workaround for this is this:


cjg@brompton:~/lang/d$ cat followfork.d
proc:::start
/ppid == $target/
{
stop();
printf("fork %d\n", pid);
system("dtrace -qs child.d -p %d", pid);
}
cjg@brompton:~/lang/d$ cat child.d
pid$target::malloc:entry
{
printf("%d %s:%s %d\n", pid, probefunc, probename, ustackdepth)
}
cjg@brompton:~/lang/d$ pfexec /usr/sbin/dtrace -qws followfork.d -s child.d -p 26758
26758 malloc:entry 22
26758 malloc:entry 15
26758 malloc:entry 18
26758 malloc:entry 18
26758 malloc:entry 18
fork 27548
27548 malloc:entry 7
27548 malloc:entry 7
27548 malloc:entry 18
27548 malloc:entry 16
27548 malloc:entry 18


Clearly you can have the child script do what ever you wish.


Better solutions are welcome!


Why a fixie

June 25, 2009

Why a fixie1?


A few people have asked
me this so here are the reasons for a fixie:



  • They are alleged to improve you pedaling.


  • They are supposed to make your legs stronger.


  • People say you are more in touch with the bike on a fixie


  • People say they are fun to ride. Quite why is hard to
    understand why would this be significantly different from just
    picking a gear and sticking to it. For a single speed bike with a
    free wheel I would agree except a single speed there is no way to
    give in up hills without getting off.



My additional reasons were:



  • I had noticed I generally ride in 3 gears during the winter
    so wondered if I could make it on a single speed.


  • Bike to work made it very affordable.


  • I wanted one.



Now having one I agree they are
fun more fun than I ever expected and even though I don’t think I
have mastered it yet I do understand about it being in touch with the
bike. Going up hill there is nowhere to hide I don’t know if it is
making me stronger but it feels like it.


It certainly has improved my
ability to “spin”.



1Obvioulsy
the answer that you can never have too many bikes I assume will not
wash. Indeed I have that problem at home since the house rule is
that I can only have three bikes. The brompton, luckily, does not
count leaving some others. However since under UK law a bike that
has no pedals is not a bike I only have three sets of pedals so I’m
o.k.



500 fixie miles

June 23, 2009

I’ve now commuted 500 miles on my fixie. Riding with a single gear
and no freewheel has proved to be more fun than I expected. I’ve
managed to stay on the the thing despite it having a pretty good
attempt to throw me on three occasions:



  • Trying to “freewheel” as I went over a speed bump.


  • Trying to “freewheel” going round a roundabout. Very
    exciting as I got lifted up at the same time as the rearwheel
    stepped out.


  • Going down hill and letting the speed build up and my legs
    not being able to keep up, in the wet all went very wobbly.



All these happened on the first two commutes since then I have
mastered not freewheeling and can control the speed going down hill
and spin at a rate that I previously thought impossible although it
is much easier to use the brakes.



  • Track standing is harder on the fixie than on a freewheel
    bike. I think this is just lack of practice as it is getting better.


  • Gettting too close to the kerb is frightening as that pedal
    is going to go down relentlessly so narrow gaps are narrower.


  • Judging whether there is room for another turn of the pedals
    before you have to stop is more important than I relised. If you get
    it wrong then your foot is stuck at the bottom of a pedal stroke so
    starting is really hard.


  • Getting your foot clipped in once you are moving is
    impossible, better to get clipped in before putting the power down.
    Another reason to perfect that track stand.


  • I’ve not scraped my pedals going round corners, Yet.



Diskomizer Open Sourced

June 18, 2009

I’m pleased to announce the Diskomizer
test suite has been open sourced. Diskomizer started life in the dark
days before ZFS when we lived in a world full1
of bit flips, phantom writes, phantom reads, misplaced writes and
misplaced reads.


With a storage architecture that does
not use end to end data verification the best that you can hope for
was that your application will spot errors quickly and allow you to
diagnose the broken part or bug quickly. Diskomizer was written to be
a “simple” application that could verify all the data
paths worked correctly and worked correctly under extreme load. It
has been and is used by support, development and test groups for
system verification.


For more details of what Diskomizer is
and how to build and install read these pages:


http://www.opensolaris.org/os/community/storage/tests/Diskomizer/


You can download the source and
precompiled binaries from:


http://dlc.sun.com/osol/test/downloads/current/


and can browse the source here:


http://src.opensolaris.org/source/xref/test/stcnv/usr/src/tools/diskomizer


Using Diskomizer


First remember in most cases Diskomizer will destroy all the
data on any target you point it at
. So extreme care is advised.


I will say that again.


Diskomizer will destroy all the data
on any target that you point it at.


For the purposes of this explanation I am going to use ZFS volumes
so that I can create and destroy them with confidence that I will not
be destroying someone’s data.


First lets create some volumes.


# i=0
# while (( i < 10 ))
do
zfs create -V 10G storage/chris/testvol$i
let i=i+1
done
#


Now write the names of the devices you wish to test into a file after
the key “DEVICE=”:


# echo DEVICE= /dev/zvol/rdsk/storage/chris/testvol* > test_opts


Now start the test. When you installed diskomizer it put the standard
option files on the system and has a search path so that it can find
them. I’m using the options file “background” which will
make the test go into the back ground redirecting the output into a
file called “stdout” and any errors into a file called
“stderr”:





# /opt/SUNWstc-diskomizer/bin/diskomizer -f test_opts -f background
#


If Diskomizer has any problems with the configuration it will report
them and exit. This is to minimize the risk to your data from a typo.
Also the default is to open devices and files exclusively to again
reduce the danger to your data (and to reduce false positives where
it detects data corruption).


Once up and running it will report it’s
progress for each process in the output file:


# tail -5 stdout
PID 1152: INFO /dev/zvol/rdsk/storage/chris/testvol7 (zvol0:a)2 write times (0.000,0.049,6.068) 100%
PID 1152: INFO /dev/zvol/rdsk/storage/chris/testvol1 (zvol0:a) write times (0.000,0.027,6.240) 100%
PID 1152: INFO /dev/zvol/rdsk/storage/chris/testvol7 (zvol0:a) read times (0.000,1.593,6.918) 100%
PID 1154: INFO /dev/zvol/rdsk/storage/chris/testvol9 (zvol0:a) write times (0.000,0.070,6.158) 79%
PID 1151: INFO /dev/zvol/rdsk/storage/chris/testvol0 (zvol0:a) read times (0.000,0.976,7.523) 100%
#


meanwhile all the usual tools can be used to view the IO:


# zpool iostat 5 5
capacity operations bandwidth
pool used avail read write read write
———- —– —– —– —– —– —–
storage 460G 15.9T 832 4.28K 6.49M 31.2M
storage 460G 15.9T 3.22K 9.86K 25.8M 77.2M
storage 460G 15.9T 3.77K 6.04K 30.1M 46.8M
storage 460G 15.9T 2.90K 11.7K 23.2M 91.4M
storage 460G 15.9T 3.63K 5.86K 29.1M 45.7M
#






1Full
may be an exaggeration but we will never know thanks to the fact
that the data loss was silent. There were enough cases reported
where there was reason to doubt whether the data was good to keep me
busy.




2The
fact that all the zvols have the same name (zvol0:a) is bug 6851545
found with diskomizer.



Hello world!

June 16, 2009

Welcome to WordPress.com. This is your first post. Edit or delete it and start blogging!


Passing the Baton

June 14, 2009

When I was about 16 I used to ride around Surrey and Sussex a lot
on my bike. Sometimes on my own and sometimes with friends. It
allowed me to miss the Wedding of Lady Di and Prince Charles by going
camping the entire weekend, little did I realise I would get complete
symetry and miss her death as well.


On one of these cycling trips I was on my own, tired and
struggling up a hill out of Horsham (I assume I was returning from
Worthing as I used to cycle down there quite often and since I was on
my own I must have been visiting my Great Aunt who lived there). As I
struggled up the hill a cyclist who was much older than me (probably
in his forties) pull along side and asked how I was. I replied
something like “knackered” at which point he put his hand on my
back and pushed me up the hill. I was both thankful for his kindness
and appaulled that an “old git” was pushing me, a teenager, up the
hill.


Well today I managed to pass the baton on. A teenager came out
with us on our ride today and on the way back from the cafe blew
quite spectacularly. Not able to keep up on the flat even in the tow
of the other riders unless we slowed to a crawl, which we did, then
we got to a hill and I saw my chance. I realised I had been waiting
28 years (or so) to pay this favour back and so I pushed him up the
hill.


OpenSolaris 2009.06

June 7, 2009

After a week of running 2009.06 on my Toshiba Tecra M9 having
upgraded from 2008.11 I’m in a position to comment on it. I’ve been
able to remove all the workarounds I had on the system. Nwam appears
to work even in the face a suspend
and resume
. Removalbe media also appears to be robust without the
occasional panics that would happen when I removed the SD card with
2008.11.


Feature wise the things I have noticed are the new tracker system
for searching files, but it seems to be completely non functional.
The big improvements are in the support for the special keys on the
Toshiba and the volume control, which unlike the volume on the M2 is
a logical control so requires software support. 2009.06 has this
support along with support fo the number pad, brightness and mute
buttons.


The downside was hitting this bug.
This pretty much renders resume useless and I was about to go back to
2008.06 when the bug was updated to say it will be fixed in the first
update release and in the mean time there are binary packages. So
after creating a new boot enviroment so that I have an unpatched one
to switch to when the fix gets into the support repository I have
applied the patch. Seems to work which is very pleasing as it has not
taken me long to get used to the brightness buttons working.


Possibly the best shell programming mistake ever

June 5, 2009

A colleague, lets call him Lewis, just popped over with the most bizarrely behaving shell script I have seen.


The problem was that the script would hang while the automounter timed out an attempt to NFS mount a file system on the customer’s system.

I narrowed it down to something in a shell function that looked like this:



# Make a copy even if the destination already exists.
safe_copy()
{
typeset src=”$1″
typeset dst=”$2″
/* Nothing to copy */
if [ ! -f $src ] ; then
return
fi
if [ ! -h $src -a ! -h $dst -a ! -d $dst ] ; then
cp -p $src $dst || exit 1
fi
}


safe_copy was called with a file as the $1 and a file as $2.



I laughed when saw the problem. Funny how you can read something and miss such an obvious mistake!


Thankfully the script has quietly been fixed.