
Converting sd minor numbers to instance numbers.

October 29, 2008

I had an email this week about a program that I wrote that would not die. The program is a disk test program that has been around in Sun for a while and with luck will be open sourced in the not too distant future, but I digress. The program was hanging: no IO was going on, and even sending it a kill -KILL would not kill it.

Generally, if a process doesn’t disappear when sent the signal “KILL”, that is not the fault of the program. Since there is nothing the program can do to protect itself from KILL, you need to look elsewhere, and that elsewhere is in the kernel. So a crash dump was generated and I was pointed at it.

From the stack it was clear that the program could not die because there were async IO requests still outstanding, and looking at the aio_t confirmed this.

To find a stuck IO I walk the structures down to the first element on the aio_pollq, and then on to the buf’s dev_t to see where we are hung up:

    > ::pgrep disko | ::print proc_t p_aio | ::print aio_t aio_pollq | ::print aio_req_t aio_req_buf.b_edev | ::devt
         MAJOR       MINOR
            27        9474
    > 0t27::major2name
    sd

Seeing that minor number rang alarm bells. The usual way to convert a minor number for the sd driver into an instance is to divide by 8 (the number of partitions), but that would still leave over 1000 devices. Possible, but not likely. Only at that point did it dawn on me that this was an x86 box, which thanks to a long history supports a different number of slices. A short grok in the source and the conversion for x86 is to divide by 64.

    > 0t9474%0t64=D
                    148
    > *sd_state::softstate 0t148 | ::print "struct sd_lun"
    {
        un_sd = 0xffffff016b72daa8
        un_rqs_bp = 0xffffff07aef1db80
        un_rqs_pktp = 0xffffff0548871080
        un_sense_isbusy = 0
        un_buf_chain_type = 0x1
        un_uscsi_chain_type = 0x8
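For anyone who wants to check the arithmetic away from mdb, here is a minimal sketch of the same conversion. The divisors 8 and 64 are the ones mentioned above. The function name, the "sparc"/"x86" keys and the idea of reading the remainder as the slice index are my own assumptions for illustration, not something lifted from the sd source.

```python
# Sketch only: divisors come from the post (8 on SPARC, 64 on x86).
# Treating the remainder as the slice/partition index is an assumption.
PARTITIONS_PER_DISK = {"sparc": 8, "x86": 64}


def sd_minor_to_instance(minor, arch):
    """Split an sd minor number into (instance, partition) for the given arch."""
    if minor < 0:
        raise ValueError("minor number must be non-negative")
    # Integer division gives the instance; the remainder is the partition.
    return divmod(minor, PARTITIONS_PER_DISK[arch])


if __name__ == "__main__":
    # The minor number from the crash dump, interpreted both ways.
    print(sd_minor_to_instance(9474, "x86"))
    print(sd_minor_to_instance(9474, "sparc"))
```

With the x86 divisor, minor 9474 comes out as instance 148, matching the mdb result above. With the SPARC divisor it would be instance 1184, which is the implausible number that set off the alarm bells.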

What shocked me was how far I could get through a crash dump before taking on board the architecture of the system.


From → Solaris

2 Comments
  1. Hi Chris,
    Is the disk test program, that you hope will be open sourced, the one that is mentioned at this link?
    http://research.sun.com/minds/2008-0312/

  2. Yes. It is slightly more than "hope" as well. We are actively working this. There just seem to be a lot of hoops to jump through.
