Working around a really, really small but irritating nwam bug

November 29, 2008

The euphoria over having a laptop that would suspend to RAM did
not last long before it was shattered by a more real-world situation:
suspending while the wireless is connected, i.e. when I am not at
my desk. This is bug 6766807,
which is somewhat irritating and I’m sure will be resolved soon. With
my work hat on I wonder if this could be one of the bugs that will be
fixed in a supported update. However, there is a simple workaround.


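#!/bin/ksh
# Restart nwam each time the system resumes from suspend.
function restart_nwam
{
    pfexec svcadm restart nwam
}
# Signal 35 should be SIGTHAW on Solaris, which is delivered on resume.
trap restart_nwam 35
# Idle in day-long sleeps so the script stays around to catch the signal.
while :
do
    sleep $((60*60*24))
done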


Run that script as one of the programs started by the session and
this problem is history. Obviously keep an eye on the bug
so that when the fix is delivered you can remove the workaround. I’ll
update the bug with the workaround on Monday.


New Laptop == Fresh install of OpenSolaris

November 28, 2008

I have a new laptop, a Toshiba Tecra M9. Since it, like my
Brompton, is owned by Sun, it is called “brompton”.


The new OpenSolaris 2008.11 bits on this hardware support suspend
to RAM, so closing the lid with the power disconnected results in the
system sleeping almost instantly and, equally importantly, when I press
the power button it resumes from where it left off. This is really
something that any laptop needs to have, so this is real progress.


Tim is suggesting that Sun should give up on the desktop,
something I don’t completely agree with: the savings would not be
that great unless you give up on the X server as well, which would
leave Sun Ray high and dry, and that is something we should not do. The
desktop experience on a modern 3D accelerated frame buffer is
getting quite appealing. While most of the features
are really just icing (rotating the workspaces when you hit
<control><Alt><Left> & <control><Alt><Right>),
at least one I’ve found useful already. When I press <control>
& the key there is a ripple effect as if the
desktop were water and a water drop had landed where the mouse is.




It allowed me to track the mouse after VirtualBox had hidden it,
although in the snapshot you can see the mouse. This has probably
been on my old laptop, but either I had not noticed it or it was not
turned on, as I had selected the custom options to compiz a while
back. It makes me wonder what other new features are hidden in the
window system that I may be missing. A VT220 emulator maybe?


One mis-feature though is that by default savecore does not get
run at boot time. I recall the head-in-the-sand arguments that were
made for turning off savecore after beta in the dark (although at the
time less dark than now) days of SunOS 4.0. This seems like a
similar exercise in denying reality. On the upside this is not quite
as bad as it was, since at least there is a dedicated dump device, so the
dump will not get overwritten as part of swap and can be extracted
later by running savecore. Indeed the first thing I would do, and did
do, was this:


cjg@brompton:/boot/grub$ pfexec savecore
cjg@brompton:/boot/grub$ pfexec dumpadm -y
Dump content: kernel pages
Dump device: /dev/zvol/dsk/rpool/dump (dedicated)
Savecore directory: /var/crash/brompton
Savecore enabled: yes
cjg@brompton:/boot/grub$






Adding dependencies to exim

November 27, 2008

I finally got around to adding dependencies to the smtp (mail)
server I am using on my home server so that it depends on both
the spamassassin and the clam anti virus services. While there is
probably a way to do this using individual commands, it was much
quicker to export the XML, edit that and reimport it having added
these lines:


    <dependency name='spamd' grouping='require_all' restart_on='error' type='service'>
        <service_fmri value='svc:/network/spamd'/>
    </dependency>
    <dependency name='clam' grouping='require_all' restart_on='error' type='service'>
        <service_fmri value='svc:/network/clam'/>
    </dependency>


Having refreshed the service and restarted it, it now shows as
depending on the other two services:


: pearson FSS 3 $; svcs -d cswexim
STATE STIME FMRI
online Nov_24 svc:/network/loopback:default
online Nov_24 svc:/milestone/name-services:default
online Nov_24 svc:/system/filesystem/local:default
online Nov_24 svc:/network/clam:default
online Nov_26 svc:/network/spamd:default
: pearson FSS 4 $;


and any failure of either of those services results in cswexim being
restarted after the failed service restarts. Depressingly I had
found that small amounts of spam could sneak through thanks to exim
not depending on spamassassin.
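
For the record, the “individual commands” route would probably look
something like the sketch below. It has not been run against this exim
service: the add_dependency function, the SVCCFG variable and the
service FMRI argument are made up for illustration, and only the svccfg
addpg and setprop subcommands are standard SMF.

```sh
#!/bin/sh
# Sketch only: add a require_all/error dependency with svccfg
# instead of editing the exported manifest.
# SVCCFG is a hypothetical hook so the commands can be previewed,
# e.g. SVCCFG=echo prints them instead of running them.
SVCCFG=${SVCCFG:-svccfg}

# add_dependency <service FMRI> <dependency name> <FMRI depended on>
add_dependency() {
    svc=$1
    dep=$2
    target=$3
    # A dependency is a property group of type "dependency" ...
    $SVCCFG -s "$svc" addpg "$dep" dependency
    # ... holding the same values as the XML above.
    $SVCCFG -s "$svc" setprop "$dep/grouping" = astring: require_all
    $SVCCFG -s "$svc" setprop "$dep/restart_on" = astring: error
    $SVCCFG -s "$svc" setprop "$dep/type" = astring: service
    $SVCCFG -s "$svc" setprop "$dep/entities" = fmri: "$target"
}
```

Calling add_dependency with the exim service FMRI, spamd and
svc:/network/spamd, then refreshing the service, should give the same
result as the XML edit. Running it with SVCCFG=echo first shows what it
would do.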


Redirecting output to syslog

November 25, 2008

People are always asking this and often when they are not they
should be. How do you redirect all the output from a script to
syslog?


The obvious answer is:


my_script 2>&1 | logger -p local6.debug -t my_script


but how can you do that from within the script? Simply put this at
the top of your script:





#!/bin/ksh
logger -p daemon.notice -t ${0##*/}[$$] |&
exec >&p 2>&1





Clearly this is Korn shell specific, but
then who still writes Bourne shell scripts? If your script was called
redirect you would get messages logged thus:


Nov 25 17:40:41 enoexec redirect[17449]: [ID 702911 daemon.notice] bar
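
For anyone who does still have to write Bourne shell, a rough
equivalent can be built from a named pipe instead of a coprocess. This
is only a sketch: the redirect_to function name is invented here, and it
assumes that mktemp -u and mkfifo are available on the system.

```sh
#!/bin/sh
# Sketch: send all later output of this shell to a command, using
# a named pipe in place of the ksh coprocess.
# redirect_to is a made-up helper name, not a standard utility.
redirect_to() {
    # Assumes mktemp -u (print an unused name) and mkfifo exist.
    fifo=$(mktemp -u "${TMPDIR:-/tmp}/redirect.XXXXXX") || return 1
    mkfifo "$fifo" || return 1
    # Start the reader in the background; it blocks until the
    # writer below opens the pipe.
    "$@" < "$fifo" &
    # Point stdout and stderr of this shell at the pipe.
    exec > "$fifo" 2>&1
    # Both ends are open now, so the name is no longer needed.
    rm -f "$fifo"
}

# At the top of a script this would be used as:
#   redirect_to logger -p daemon.notice -t "${0##*/}[$$]"
```

The coprocess version is shorter and leaves nothing in the file system
even briefly, so this is only worth it where ksh is not an option.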


Two pools on one drive?

November 23, 2008

Now I’m committed to ZFS root I’m left with a dilemma. I have
four drives in the system, too much data, and drives of different
sizes, so raidz2 is not an option even though it would give the
greatest protection for the data. The next
best solution is some form of mirroring. Initially I simply had two
pools, which offers good redundancy and allows ZFS root to work but
gives suboptimal performance. If I could stripe the pool that would be
better, but that does not work with ZFS root.


However, since I used to run with a future proof Disk Suite,
UFS based root I still have the space that
used to contain the two boot environments that were on UFS, into which
I intended to grow the pool once they were not needed. What if I did
not grow the pool but instead put a second pool on that partition?
Then I would have a pool, “rpool”, mirrored across part of
the boot drives, and the data pool, “tank”, mirrored over the
rest of the boot drives and striped across a second mirror
consisting of the entire second pair of drives.


Clearly the solution is suboptimal, but given the constraints of
ZFS root and the hardware I have, would this perform better?


I should point out that the system as is does not perform badly,
but I don’t want to leave performance on the table if I don’t have
to. At the moment I am minded to do it, but I’m not going to rush
into this (that is, I’ve not already done it), since growing the pool
is a one way operation, there being no way to shrink it again.


Comments welcome


Forced to upgrade

November 22, 2008

Build 103 and ZFS root have come to the home server. While I was
travelling the system hit bug 6746456,
which resulted in the system panicking every time it booted. So I was
forced to return to build 100
and have now upgraded to build 103. Live upgrade
using UFS would not work at all, and since I have the space I’ve moved
over to ZFS root. However the nautilus bug is still in build 103, so
I’m either going to have to live with it, which is impossible,
disable nautilus completely, or work to get the time slider feature
disabled until it is usable. Disabling nautilus, while irritating, is
effectively what I have had to do now, so it could be the medium term
solution.


The other news for the home server was the failure of the power
supply. So it was goodbye to the small Antec case that used to house
the server. Since it did not really save any space, a more traditional
desk side unit has replaced it, which also allows up to six internal
drives. Since ZFS root will not support booting off stripes, the extra
two drives I have form a second pool.


# zpool list
NAME SIZE USED AVAIL CAP HEALTH ALTROOT
pool2 294G 36.8G 257G 12% ONLINE -
tank 556G 307G 249G 55% ONLINE -
#


The immediate effect of two pools is being able to have the Solaris
image from which I upgraded on a different pair of disks from the
ones being upgraded, with a dramatic performance boost. The other is
that I can let the automatic snapshot service take control of the
other pool rather than add it to my old snapshot policy. Early on I
realised I needed to turn off snapshots on the swap volumes, which are on
both pools (to get some striping):


# zfs set com.sun:auto-snapshot=false pool2/swap
# zfs set com.sun:auto-snapshot=false tank/swap
#


should do it.


Finding the correct device to install onto

November 19, 2008

After spending too long today installing onto the wrong disk of an
x4500 I thought I had better write down how to find the right one.


The Solaris install document:
http://docs.sun.com/source/819-4362-16/solaris.html
tells us that the bootable devices are:






















Device     Slot Number   Device Node
sata3/0    0 *           c5t0
sata3/4    1 *           c5t4







Now the important thing to remember is to ignore the device nodes
from the table. Instead boot off the media and use cfgadm to list the
devices and the device nodes for sata3/0 and sata3/4.


# cfgadm | grep 'sata3/[04]'
sata3/0::dsk/c3t0d0 disk connected configured ok
sata3/4::dsk/c3t4d0 disk connected configured ok
#


So on this system and this OS (snv_101a) the boot devices are c3t0d0
and c3t4d0.
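
To avoid reading the wrong column by eye, the slot to device node
mapping can be pulled out with a little awk. A minimal sketch, assuming
the cfgadm output format shown above; boot_nodes is just a name picked
for the function and is not part of Solaris.

```sh
#!/bin/sh
# Sketch: print the device node for each bootable x4500 slot.
# boot_nodes is a made-up name; it reads cfgadm output on stdin
# and assumes the "sata3/N::dsk/cXtYdZ" format shown above.
boot_nodes() {
    awk '$1 ~ /^sata3\/[04]::dsk\// {
        split($1, part, "::dsk/")   # part[1] = slot, part[2] = node
        print part[1], part[2]
    }'
}

# On the live system: cfgadm | boot_nodes
```

It only looks at sata3/0 and sata3/4 because those are the two slots the
install document marks as bootable.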


Throttling disks

November 5, 2008

The disk drivers in Solaris support SCSI tagged queuing and have
done for a long time. This enables them to send more than one command
to a LUN (Logical Unit) at a time. The number of commands that can be
sent in parallel is limited, or throttled, by the disk drivers so that
they never send more commands than the LUN can cope with. While it is
possible for the LUN to respond with a “queue full” SCSI status
to tell the driver that it cannot cope with any more commands, there
are significant problems with relying on this approach:



  • Devices connected via fibre channel have to negotiate onto
    the loop to return the queue full status. This can mean that by the
    time the device manages to return queue full the host has sent
    many more commands. The LUN can then end up in a
    situation it cannot cope with, which typically results in the LUN
    resetting.


  • If the LUN is being accessed from more than one host it is
    possible for it to return queue full on the very first command. This
    makes it hard for the host to know when it will be safe to send a
    command, since there are none outstanding from that host.


  • If the LUN is one of many LUNs on a single target it may
    share the total pool of commands that can be accepted by all the
    LUNs and so again could respond with “queue full” on the first
    command to a LUN.


  • In the two cases above the total number of commands a single
    host can send to a single LUN will vary depending on conditions that
    the host simply cannot know, making adaptive algorithms unreliable.



All the above issues result in people taking the safest option and
setting the throttle for a device as low as required so that the LUN
never needs to send queue full, in some cases as low as 1. This is
bad when limited to an individual LUN; it is terrible when done
globally on the entire system.


As soon as you get to the point where you hit the throttle two
things happen:



  1. You are no longer transferring data over the interconnect
    (fibre channel, parallel SCSI or iSCSI) for writes. This data has to
    wait until another command can complete before it can be
    transferred. This then reduces the throughput of the device. Your
    writes can end up being throttled by reads and hence tend towards
    the speed of the spinning disk if the read has to go to the disk,
    even though you may have a write cache.


  2. The command is queued on the waitq, which will increase the
    latency still further if the queue becomes deep. See here
    for information about disksort’s effect on latency.



Given that the system will regularly dump large numbers of
commands on devices for short periods of time, you want those commands
to be handled as quickly as possible to minimize applications
hanging while their IO is completed. If you want to observe the
maximum number of commands sent to a device then there is a D script
here
to do that.


So the advice for configuring storage would be:



dtrace top tip

November 5, 2008

When logged on to a laptop (or any desktop system using X) don’t
run this command:


 $ pfexec dtrace -l -p $(pgrep Xorg)


Instead do this:


 $ pfexec dtrace -l -o /tmp/dt -p $(pgrep Xorg)


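The former will deadlock the X server, presumably because dtrace
holds Xorg stopped while writing its output to a terminal that Xorg
itself has to draw, and if, like me, you are in a
hotel room with no other way to log in to the system, require you to
power cycle it. The latter will put the results in /tmp/dt which you
can then look at.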


Even as I hit return I thought I should
not do that as bad things could happen. See CR 4259419.


I blame this on staying up late at an
Election night Party with some very happy
people. Thank you to them for letting me share the experience. I just
hope that you are not let down in the same way we have been after the
1997 Labour victory, which brought so much hope to so many.


My baby is Sweet Sixteen

November 5, 2008

The oldest NIS+ namespace on the planet:


$; niscat -o org_dir.hotline.uk.sun.com.
Object Name : "org_dir"
Directory : "hotline.uk.sun.com."
Owner : "rangdo.hotline.uk.sun.com."
Group : "admins.hotline.uk.sun.com."
Access Rights : r---rmcdrmc-r---
Time to Live : 12:0:0
Creation Time : Thu Nov 5 13:33:46 1992
Mod. Time : Tue Sep 6 21:09:44 2005
Object Type : DIRECTORY
Name : 'org_dir.hotline.uk.sun.com.'
Type : NIS
Master Server :
……





I have not had anything to do with it
for over 8 years so I should thank those that have kept it alive and
kicking. I notice it has a new “Owner” since it was 12
and I did not even get a card!