
Making the zfs snapshot service run faster

January 3, 2009

I have not been using Tim’s auto-snapshot service on my home server because, once I had configured it to work there, I noticed it had a large impact on the system:

: pearson FSS 15 $; time /lib/svc/method/zfs-auto-snapshot \
         svc:/system/filesystem/zfs/auto-snapshot:frequent

real    1m22.28s
user    0m9.88s
sys     0m33.75s
: pearson FSS 16 $;

The reason is twofold. First, reading all the properties from the pool takes time. Second, it destroys the unneeded snapshots as it takes new ones, something the service I used cheats on by doing it only very late at night. Looking at the script there are plenty of things that could be made faster, so I wrote a python version that could replace the cron job. The results, while an improvement, were disappointing:

: pearson FSS 16 $; time ./ \
         svc:/system/filesystem/zfs/auto-snapshot:frequent

real    0m47.19s
user    0m9.45s
sys     0m31.54s
: pearson FSS 17 $;

Still too slow to actually use. The time was dominated by cases where the script could not use a recursive option to delete the snapshots. The problem is that there is no way to list all the snapshots of a filesystem or volume without also listing those of its descendants.
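To make the cost concrete, here is a minimal python sketch of the filtering the script is forced to do. It is not the code from either script: `own_snapshots` is a hypothetical name, the input is assumed to be the names printed by `zfs list -H -r -t snapshot -o name tank`, and the sample snapshot names are made up.

```python
def own_snapshots(dataset, snapshot_names):
    """Return only the snapshots that belong directly to `dataset`.

    `snapshot_names` is assumed to be the recursive listing for the
    dataset, one "dataset@snapshot" name per entry.
    """
    prefix = dataset + "@"
    return [name for name in snapshot_names if name.startswith(prefix)]


# Made-up names, shaped like the output of
#   zfs list -H -r -t snapshot -o name tank
listing = [
    "tank@frequent-1",
    "tank/fs@frequent-1",
    "tank/fs/home@frequent-1",
    "tank@frequent-2",
]

print(own_snapshots("tank", listing))
# → ['tank@frequent-1', 'tank@frequent-2']
```

The filter itself is cheap. The expense is that zfs has to produce every snapshot name below the dataset before the filter can throw most of them away.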

Consider this structure:

# zfs list -r -o name,com.sun:auto-snapshot tank
NAME                                  COM.SUN:AUTO-SNAPSHOT
tank                                  true
tank/backup                           false
tank/dump                             false
tank/fs                               true
tank/squid                            false
tank/tmp                              false

The problem here is that the script wants to snapshot and clean up “tank” but can’t use recursion without also snapshotting all the other file systems that have the flag set to false, and set that way for very good reason. However, if I did not bother to snapshot “tank” then tank/fs could be managed recursively and there would be no need for special handling. The above list does not reflect all the file systems I have, but you get the picture. Making this change brings the timing for the service down to:

: pearson FSS 21 $; time ./ \
         svc:/system/filesystem/zfs/auto-snapshot:frequent

real    0m9.27s
user    0m2.43s
sys     0m4.66s
: pearson FSS 22 $; time /lib/svc/method/zfs-auto-snapshot \
         svc:/system/filesystem/zfs/auto-snapshot:frequent

real    0m12.85s
user    0m2.10s
sys     0m5.42s
: pearson FSS 23 $;

While the python module still gets better results than the korn shell script, the korn shell script does not do so badly. However, it still seems worthwhile spending the time to get the python script to handle all the features of the korn shell script. More later.
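The choice between recursive and one-at-a-time handling described above can be sketched in python. This is only an illustration under stated assumptions, not the logic of either script: `plan_snapshots` is a hypothetical name, and `flags` is assumed to be a mapping built from the `zfs list -r -o name,com.sun:auto-snapshot` output.

```python
def plan_snapshots(flags):
    """Split datasets into recursive roots and ones needing single handling.

    `flags` maps dataset name -> True/False, the value of the
    com.sun:auto-snapshot property for that dataset.
    """
    def descendants(name):
        prefix = name + "/"
        return [d for d in flags if d.startswith(prefix)]

    # A dataset is "clean" if it and everything below it is flagged true.
    clean = {
        name for name, on in flags.items()
        if on and all(flags[d] for d in descendants(name))
    }

    recursive, single = [], []
    for name in sorted(flags):
        if not flags[name]:
            continue
        parent = name.rpartition("/")[0]
        if name in clean:
            # Only the topmost clean dataset needs a recursive snapshot.
            if parent not in clean:
                recursive.append(name)
        else:
            # Flagged true, but something below it is flagged false.
            single.append(name)
    return recursive, single


flags = {
    "tank": True,
    "tank/backup": False,
    "tank/dump": False,
    "tank/fs": True,
    "tank/squid": False,
    "tank/tmp": False,
}
print(plan_snapshots(flags))
# → (['tank/fs'], ['tank'])

flags["tank"] = False  # the change described in the post
print(plan_snapshots(flags))
# → (['tank/fs'], [])
```

With “tank” flagged true it lands in the slow, one-at-a-time list. Once it is flagged false, nothing is left that needs special handling.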



  1. UX-admin

    What’s with the sudden Python fad at Sun?
    Are you guys sacrificing high tech in favor of being hip & cool, or what? It will come back to bite you.

  2. Thanks Chris! A python re-implementation is something I’ve wanted to have a go at, but I don’t have much free time at the moment, with a newborn in the house. As you point out, getting correctness as well as speed is vital: I agree.
    There are some performance improvements in the hg repository, over and above what’s in 2008.11, but I suspect you’re already using the latest bits.
    iirc I’ve logged an RFE to ask for a means to show snapshots of just a given dataset – will try to dig up the CR (and if I haven’t, it’s certainly one I meant to file)

  3. @UX-admin
    My reasons for choosing python were threefold:
    1) I wanted to learn python due to the "fad" @ sun. I know I will end up needing to know it.
    2) The other choice was TCL but that would not have stood such a good chance of making it back into the base OS.
    3) I like learning new (computer) languages. I could get to like python a lot.
    Thanks for the RFE. On my home server I’m bang up to date so I think this is as good as it gets.
    I do recall writing a java app to download pictures from my first digital camera with my son asleep on my chest all night. If I moved he woke so I just stayed put and wrote code: Obviously I wrote the code to allow me to take pictures as well and hence I have that photo.
    So having a newborn is no excuse;-)

  4. Roland Mainz

    BTW: The performance of the original Korn shell script can be *VASTLY* improved – right now it is |fork()|’ing like mad and most of this stuff can be avoided (and I bet 10 Euro that I can tune this script using ksh93 (from ksh93-integration update1) and make it outperform any Python version =:-) ).

  5. Be my guest Roland! Giving the ksh method code a good kicking would be more than welcome, just be sure to start with the version at the tip of the hg repo:

  6. The point is not korn shell v python but that the performance of both scripts is dominated by reading the names of the snapshots, when the script only needs the names of the snapshots of this filesystem or volume. Instead it gets all the snapshots below this filesystem or volume.
    Anything else, while useful, will not get you significant gains.

  7. UX-admin

    "I wanted to learn python due to the "fad" @ sun. I know I will end up needing to know it."
    Sun seems to really like slow programming languages (Java), and Python fits that bill as well.
    The stuff which you are doing in the KSH script is trivial to write in AWK, with the additional bonus that, once working, the AWK program can be compiled into a binary executable, for maximum performance.
    I am unpleasantly amazed that you did not use the AWK programming language to solve this, since your program is exactly the kind of workload that AWK was designed for.
    It’s a case of "pick the right tool for the job", and I can’t justify Python being the right tool no matter how you slice it and dice it.
    Then again, learning something just because it’s a fad isn’t rational, so my argument completely sinks.

  8. Hmm, java is not slow, there are just slow java apps as there are slow apps written in any language.
    Even if I agreed that python was a "fad" then it would still be rational to learn it as I will find myself faced with python code to debug.
    Awk would be an interesting choice, but while it may be possible to write this in awk, it would not take the few hours it took a novice python programmer to do it in python, and that is speaking as someone who writes awk. ksh93 could well be one of the right tools for the job but python or even TCL would also do the job just fine.
    However that still misses the point. The way to improve this is to change the way the zfs command works to allow it to do only the things that the script needs. So CR 6352014 would appear to be the place to start and that is written in C.

  9. Anonymous

    BTW, I looked at 6352014, and the workaround for "-c" is trivial:
    awk -F'/' '$3 !~ /@/ && $4 == "" {print;}'
    And if you want to do it *right*:
    #include <sys/types.h>
    #include <regex.h>
    …and go to town with:
    cc -xO4 -xprefetch=auto [-xipo=2 -xlinkopt=2 -xlibmil -xlibmopt …]
    But, being a Sun Microsystems employee, you should already know all the Sun Forte switches "like drinking water", correct?

    was the RFE I’d logged requesting basically the same thing as 6352014, but a bit more useful.

  11. @
    While the workaround gets you the same output it does not solve the performance problem.
    I do like the idea that all Sun Employees know all the switches to the compiler. You should test Jonathan at the next investor con call;-)
    @Tim Foster
    Oh the irony. I had that functionality working until I saw bug 6352014 so changed my prototype to implement the -c flag. The results are impressive:
    : pearson FSS 1 $; time /usr/sbin/zfs list -r -t snapshot tank | wc
    43876 219380 4431485
    real 0m12.61s
    user 0m4.16s
    sys 0m7.11s
    : pearson FSS 2 $; time /usr/sbin/zfs list -c -t snapshot tank | wc
    370 1850 25539
    real 0m0.06s
    user 0m0.03s
    sys 0m0.03s
    : pearson FSS 3 $;
    Let me know if you want a binary to play with!
    Moving this from a prototype to real code and putting the code in to provide both depth and child would not be that hard. Getting it through PSARC will be harder;-)

  12. UX-admin

    Provided that 6762432 gets through the PSARC, "--depth" would be a really, really bad idea, because it diverges from the UNIX standard.
    For example, it would be much more constructive and consistent to use "-l" for "level" (depth), or if that’s not an option, at least use "-depth", to attempt to be consistent with other commands which support this option, like find(1).
    "--something" is GNU, and GNU is not UNIX. We’re on a System V UNIX here, please remain consistent.

  13. UX-admin

    And in that regard, Mr. Gerhard is on a better track because he went for a single letter switch, "-c".

  14. --depth would not be my choice and, strangely enough, I had chosen -l as well, but in my head it stood for limit until I switched to the other bug report. The important point is not the name of the flag but the functionality that it provides. Which would be better: a flag that allows listing the children, or a flag that allows listing to a fixed depth in the tree? Or should we aim for the same number of options as ls and have both!

  15. UX-admin

    Probably a flag to allow fixed depth.
    UNIX is all about the power to choose, so a flag to specify an arbitrary depth puts the consumer (of the technology) behind the steering wheel.
