Applied Kerberos troubleshooting

The following is an IRC transcript taken from #afp548, irc.freenode.net. It chronicles the troubleshooting process of a fairly well-hidden edge case of Kerberos configuration in Mac OS X Server.

pastebin.ca was used to relay larger chunks of text; I’ve made local copies of the results since the pastebin pages expire after one month. Pastebin displays line numbers, and those numbers are used here to refer to specific portions of text. However, pastebin doesn’t seem to allow users to copy the text including line numbers, so I added those myself (awk '{print NR ". " $0}' file).

16:44 <@dre^> re the kerberos question: still have to use ‘connect to’ to get kerberos
16:44 <@dre^> which is weird, because the browsing method is how you get kerberos for the LKDC realms, heh
16:44 <@dre^> at least for things like screen sharing
16:51 <@dre^> wow, /dev/random is slow
16:51 <@dre^> erps, ww
17:34 -!- ideopathic [n=ideopath@75-56-246-1.lightspeed.brbnca.sbcglobal.net] has joined #afp548
17:39 < SpaceBass> dre^, connect to server doesnt use the ticket either
17:41 < SpaceBass> and for that matter, screen sharing doesnt seem to consistantly use kerberos either
17:41 < SpaceBass> apple really broke things with the whole lkdc implementation
17:44 <@dre^> heh
17:44 <@dre^> if you can’t get kerberos via connect to, there is some other problem
17:44 <@dre^> lkdc works, kerberos works… if configured and used properly ;)
17:45 <@dre^> a quick list of things to check regarding kerberized services in general:
17:45 <@dre^> * are the client and the service server configured for the same kerberos realm?
17:45 <@dre^> * does the client have a valid kerberos principal in the kdc? can the client user kinit at all?
17:46 <@dre^> * does the service server have service keytabs in the kdc? if you kadmin –> listprincs on the kdc, do you see afpserver/hostname@REALM?
17:46 <@dre^> * does the service’s configuration know what principal name to use? this is in the afp preferences in the case of afp server
17:48 < SpaceBass> dre^, hard to misconfigure Leopard Server – create the DNS, create the OD domain, join to the domain
17:48 < SpaceBass> there’s posts all over the apple forums about it…just though I’d see if anyone had identified a work around
17:49 <@dre^> have an example post?
17:49 <@dre^> I’ve used kerberos a ton
17:49 <@dre^> so I know it’s not always broken all the time
17:49 < SpaceBass> kinit works fine, and I get a ticket at login … but I cannot use that ticket via the finder for almost anything … it does work for SSH or mount_afp in the terminal
17:50 <@dre^> right, but pls distinguish between finder browsing vs finder connect to
17:50 < SpaceBass> every leopard machine that joins the realm creates 3 entries for each service … host.fqdn.com host.local and a random serial number for the LKDC /back to my mac stuff
17:51 < SpaceBass> so when you say connect to, do you mean GO menu –> connect to server?
17:51 <@dre^> yes
17:51 < SpaceBass> and it hasn’t been broken all the time … 10.4 worked flawlessly …
17:51 < SpaceBass> ok an in the connect to menu, what is the uri? I’m using afp://host … I have also tried host.domain.com and host.local
17:52 <@dre^> ah, .local…
17:52 < SpaceBass> ok, tried that and I get a box asking for user/pass
17:52 <@dre^> are you using .local in your actual DNS / realm names?
17:52 <@dre^> no. dont use .local unless you are forced to, heh
17:53 < SpaceBass> no, I have a private domain …
17:53 <@dre^> and yes, it should be afp://fqdn.goes.here
17:53 <@dre^> also verify that afpserver’s auth settings are either “any method” or “kerberos”
17:53 < SpaceBass> ok…with afp://host.domain.com I get 2 different results …somtimes it fails right off the bat, others it asks for user/pass
17:54 <@dre^> so then you check the KDC logs to see what’s going on
17:54 <@dre^> but of course you probably don’t have access to those…
17:54 < SpaceBass> dre^, I hand checked each plist last night … that occured to me late in the game, and I was impressed to see that they all said any and kerb
17:54 <@dre^> which is the crappy part about debugging kerberos
17:54 < SpaceBass> the logs? I’m the admin
17:54 <@dre^> ok good. check the kdc log
17:56 < SpaceBass> ok…logs show me requesting a ticket for host.local
17:56 < SpaceBass> but I’m using fqdn and the afp plist shows the host.fqdn.com as the principal to use
17:57 < SpaceBass> I dont mind manually adding those principals but that seems broken to me
17:57 <@dre^> ok… what are your existing tickets? klist
17:57 <@dre^> you should not have to add .local principals
17:58 <@dre^> specifically, what’s the realm associated with your existing tickets (if any)
17:58 < SpaceBass> right now I just have the krbtgt
17:58 <@dre^> but in what realm?
17:58 <@dre^> a .local realm or ‘other’?
17:59 < SpaceBass> NSNET.cc
17:59 <@dre^> ok great
17:59 < SpaceBass> my realm
17:59 < SpaceBass> krbtgt/NSNET.CC@NSNET.CC
17:59 < SpaceBass> what I’d expect
17:59 < SpaceBass> and if I ssh into a linux server I get host/linux.nsnet.cc@
17:59 <@dre^> so the next step would probably be to verify the client-side kerberos configuration. get root and take a walk into /var/db/dslocal/nodes/Default/config
18:00 <@dre^> ok interesting, so the client-side config is probably correct
18:00 <@dre^> is the afp service running on the OD master?
18:00 < SpaceBass> dre^, yes, but I dont really have any shares there…mostly on leopard workstations
18:01 < SpaceBass> (and a linux box running netatalk, but I don’t expect anyone to help me troubleshoot that)
18:01 <@dre^> no problem, just getting the lay of the land… in particular, in that configuration, it’s very unlikely that your afp service would not have the required keytabs
18:01 < SpaceBass> in …../config … didn’t know about this dir
18:01 <@dre^> yes, that config dir is the authoritative spot for such configurations
18:01 <@dre^> /L/P/edu.mit.Kerberos is an externalized representation of data found here
18:01 <@dre^> and is really ‘for legacy purposes only’
18:02 < SpaceBass> cool … I’m used to /L/P/edu …
18:02 < SpaceBass> gotcha
18:02 < SpaceBass> good to know
18:02 <@dre^> yes it is. cause sometimes that translation breaks down
18:02 <@dre^> and you need to go see what’s up
18:02 <@dre^> ok… so the next thing I would do is…
18:03 <@dre^> stand by, but I have some awesome debugging steps for you
18:03 < SpaceBass> very apperciative
18:04 <@dre^> ok here goes
18:04 <@dre^> a) open a terminal and execute the following:
18:04 <@dre^> sudo syslog -c syslog -d
18:04 <@dre^> sudo syslog -c 0 -d
18:04 <@dre^> killall NetAuthAgent
18:04 <@dre^> kdestroy -A
18:04 <@dre^> syslog -w
18:04 <@dre^> b) start a connection in Finder using ‘connect to’
18:05 <@dre^> once you attempt a connection using the proper fqdn, enter a name / pw if prompted
18:05 <@dre^> then wait 30 seconds for syslog in the terminal to catch up, then ctrl-c it
18:05 <@dre^> you should find ample / useful debugging info in the terminal (syslog) output
18:05 < SpaceBass> interesting
18:05 < SpaceBass> lots of info
18:05 <@dre^> but I can help make sense of it if you need
18:05 < SpaceBass> getting asked for user/pass for the share
18:06 < SpaceBass> checking the logs now
18:06 <@dre^> Look for KRBCreateSession, and right after that…
18:06 <@dre^> you should see the results of some realm_for_host calls…
18:07 < SpaceBass> now the kdestroy removed all tickets … expected ?
18:07 <@dre^> my guess is that such results are either wrong or missing
18:07 <@dre^> yes, expected
18:07 < SpaceBass> k
18:07 <@dre^> but this process should obtain new tickets
18:07 < SpaceBass> how would it get my password?
18:07 < SpaceBass> I dont have it saved in the keychain
18:08 < SpaceBass> right after the KRBCreateSession line I see:
18:08 < SpaceBass> (and I can’t cut/paste b/c I’m using two different machines)
18:09 < SpaceBass> parse_principal … decomposing afpserver/osx5.nsnet.cc@NSNET.cc (seems correct)
18:09 <@dre^> ok
18:10 <@dre^> and you probably do have it in your keychain if you got in without authing
18:10 -!- SpaceBass2 [n=SP@96.228.61.195] has joined #afp548
18:10 <@dre^> ok, so that means that afp server is returning the expected principal name
18:10 < SpaceBass2> flood warning
18:10 < SpaceBass2> : [[[ KRBCreateSession () – required parameters okay
18:10 < SpaceBass2> Thu Jul 3 18:02:07 osx1 NetAuthAgent[2861] <Debug>: [[[ parse_principal_name () decomposing afpserver/osx5.nsnet.com@NSNET.COM
18:10 < SpaceBass2> Thu Jul 3 18:02:07 osx1 NetAuthAgent[2861] <Debug>: ]]] parse_principal_name () – 0
18:10 < SpaceBass2> Thu Jul 3 18:02:07 osx1 NetAuthAgent[2861] <Debug>: KRBCreateSession: processed host name = osx5.nsnet.com
18:10 < SpaceBass2> Thu Jul 3 18:02:07 osx1 NetAuthAgent[2861] <Debug>: KRBCreateSession: last char of host name = 0x6d
18:11 < SpaceBass2> Thu Jul 3 18:02:07 osx1 NetAuthAgent[2861] <Debug>: KRBCreateSession: getaddrinfo = success (0)
18:11 < SpaceBass2> Thu Jul 3 18:02:07 osx1 NetAuthAgent[2861] <Debug>: KRBCreateSession: canonical host name = osx5.nsnet.com
18:11 < SpaceBass2> Thu Jul 3 18:02:07 osx1 NetAuthAgent[2861] <Debug>: [[[ realm_for_host: hostname=osx5.nsnet.com hintrealm=NSNET.COM
18:11 < SpaceBass2> Thu Jul 3 18:02:07 osx1 NetAuthAgent[2861] <Debug>: realm_for_host: krb5_get_host_realm returned unusable realm!
18:11 < SpaceBass2> Thu Jul 3 18:02:07 osx1 NetAuthAgent[2861] <Debug>: ]]] realm_for_host: failed to determine realm
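
The interesting part of a capture like the one above is the pair of realm_for_host lines: the lookup and its failure. As a rough sketch only, a saved copy of the `syslog -w` output could be filtered for hosts whose realm lookup failed. The function name `failed_realm_hosts` is made up for illustration, and the line format is assumed to match the NetAuthAgent debug lines pasted here.

```shell
# Hypothetical helper: reads saved syslog output on stdin and prints each
# hostname whose realm_for_host lookup ended in "failed to determine realm".
failed_realm_hosts() {
  awk '
    /realm_for_host: hostname=/ {
      # Remember the host named in the most recent lookup attempt.
      for (i = 1; i <= NF; i++)
        if ($i ~ /^hostname=/) { host = $i; sub(/^hostname=/, "", host) }
    }
    /realm_for_host: failed to determine realm/ {
      if (host != "") print host
      host = ""
    }
  '
}
```

Something like `failed_realm_hosts < capture.log` (where capture.log is an assumed file name) would then list only the hosts worth investigating; for the lines pasted above that is osx5.nsnet.com.
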
18:11 <@dre^> ah ha
18:11 < SpaceBass> dre^, I did NOT get in without authing … I got the finder prompt for user/pass
18:11 <@dre^> ok that’s good
18:11 <@dre^> and expected
18:12 <@dre^> it definitely looks as though the client kerberos config is malformed somehow
18:12 <@dre^> since it thinks NSNET.COM is unusable
18:12 <@dre^> go ahead and kinit and paste in the TGT you get
18:12 <@dre^> or jsut klist if you already have one
18:12 < SpaceBass> ok…here’s the thing…its a brand spanking new Macbook pro … first thing out of the box…configured DNS, did updates, jointed to domain using directory utility.app
18:13 <@dre^> is that the client or afp server?
18:13 < SpaceBass2> Kerberos 5 ticket cache: ‘API:Initial default ccache’
18:13 < SpaceBass2> Default principal: ndawson@NSNET.COM
18:13 < SpaceBass2> Valid Starting Expires Service Principal
18:13 < SpaceBass2> 07/03/08 18:09:58 07/04/08 04:09:58 krbtgt/NSNET.COM@NSNET.COM
18:13 < SpaceBass2> renew until 07/10/08 18:09:58
18:13 < SpaceBass> client
18:13 <@dre^> hmm, ok
18:14 <@dre^> could you post or email me your /L/P/edu.mit.Kerberos?
18:14 <@dre^> dre@mac.com
18:14 < SpaceBass> can post – its short
18:15 < SpaceBass> pastebin at least
18:15 <@dre^> sure
18:15 < SpaceBass2> https://pastebin.ca/1061728

# WARNING This file is automatically created, if you wish to make changes
# delete the next two lines
# autogenerated from : /LDAPv3/vail.nsnet.com
# generation_id : 97528862
[libdefaults]
default_realm = NSNET.COM
[realms]
NSNET.COM = {
admin_server = vail.local
kdc = vail.local
}
[domain_realm]
.local = NSNET.COM
local = NSNET.COM
[logging]
admin_server = FILE:/var/log/krb5kdc/kadmin.log
kdc = FILE:/var/log/krb5kdc/kdc.log

18:16 < SpaceBass> thats a little different than I’m used to seeing – but its what apple generates
18:16 <@dre^> loading…
18:16 <@dre^> (slowly)
18:17 < SpaceBass> again, really appreciate the help
18:17 <@dre^> sure no prob :)
18:18 < SpaceBass2> I am surprised that apple’s automated processes seem to be broken
18:19 <@dre^> heh, well… I guess that’s good. one should ideally expect things to work properly without too much work :)
18:20 < SpaceBass2> exactly
18:21 <@dre^> ok it loaded finally
18:21 <@dre^> oh, lol
18:21 <@dre^> I see the problem :P
18:22 <@dre^> kdc = vail.local
18:22 <@dre^> vail.local should be a fqdn
18:22 < SpaceBass2> in the edu… ?
18:22 <@dre^> yes absolutely
18:22 < SpaceBass2> see, I thought the same thing, but what is that part about the aliasing?
18:23 <@dre^> theoretically in a perfect world this would be a valid configuration
18:23 < SpaceBass2> :D
18:23 <@dre^> the thing is that Kerberos makes assumptions based on host name / fqdn
18:23 < SpaceBass2> ok … if I change edu.mit.kerb …how do I get it to update the files in /var…/config
18:23 <@dre^> so you need to use the fqdn for the KDC that matches the host name portion of the kerberos principals
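
The point about the KDC host name can be checked mechanically. The following is a minimal sketch, assuming a krb5-style file laid out like the edu.mit.Kerberos paste above; `local_kdc_entries` is a hypothetical name, and the check only looks for kdc and admin_server values ending in .local.

```shell
# Hypothetical check: reads a krb5-style config on stdin and prints every
# kdc / admin_server entry whose value ends in ".local".
local_kdc_entries() {
  awk -F'=' '
    /^[ \t]*(kdc|admin_server)[ \t]*=/ {
      key = $1; val = $2
      gsub(/[ \t]/, "", key)   # strip padding around the key
      gsub(/[ \t]/, "", val)   # strip padding around the value
      if (val ~ /\.local$/) print key " " val
    }
  '
}
```

Run as `local_kdc_entries < /Library/Preferences/edu.mit.Kerberos`, it prints nothing for a healthy config. The [logging] entries share the same key names but point at FILE: paths, so they are not reported.
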
18:23 <@dre^> you should not change it
18:23 <@dre^> you should unbind and rebind using a fqdn and see what happens
18:23 < SpaceBass2> ok
18:24 < SpaceBass2> rebind using the fqdn of the server?
18:24 <@dre^> yes
18:24 < SpaceBass2> odd, b/c thats what I did
18:24 <@dre^> unbind / rebind the client
18:24 <@dre^> ok, then don’t do that
18:24 < SpaceBass2> glad to re-try
18:24 -!- dakine [n=sam@bas3-toronto01-1177779856.dsl.bell.ca] has quit [“This computer has gone to sleep”]
18:24 <@dre^> let’s verify the server configuration
18:24 < SpaceBass2> k
18:24 <@dre^> on the OD master: sudo slapconfig -checkhostname
18:24 <@dre^> er, sorry
18:25 <@dre^> sudo changeip -checkhostname
18:25 < SpaceBass2> yeah , I figured thats what you meant :D … vail.nsnet.com
18:25 <@dre^> in general, it’s good to resist the temptation to hand-hack any config files, because doing so may break assumptions that apple makes about the contents of the files, in the cases where the same config files are maintained automatically by apple tools
18:26 < SpaceBass2> dre^, I’ve learned that the hard way before :)
18:26 <@dre^> so it says “there’s nothing to change” at the end?
18:26 < SpaceBass2> yes
18:26 <@dre^> ok good
18:26 < SpaceBass2> names match, nothing to change
18:27 <@dre^> does the server’s /L/P/edu.mit.Kerberos look the same?
18:27 <@dre^> it probably will…
18:27 < SpaceBass2> exactly the same
18:27 < SpaceBass2> (and that damn .local keeps throwing me off too)
18:27 <@dre^> yeah. it should be. that data is all downloaded by the client from the LDAP directory
18:28 <@dre^> (when you bind, a tool called kerberosautoconfig … well, does that)
18:28 < ideopathic> i’m following a long trying to learn a little about kerberos. where is the file located that you uploaded to pastbin?
18:28 < SpaceBass2> and, like I said…ssh and mount_afp work …
18:28 < SpaceBass2> ideopathic, /Library/Preferences
18:28 < SpaceBass2> ideopathic, this is a good one to follow :D learning a lot myself
18:28 <@dre^> there is still something wrong if it thinks your kdc is hosted by a .local thing
18:28 <@dre^> you’re supposed to get a fqdn there, e.g. vail.nsnet.com
18:29 <@dre^> ok, so let’s check your kdc configuration…
18:29 <@dre^> on the KDC (OD master): ps auxwww | grep krb
18:29 -!- dakine [n=sam@bas3-toronto01-1177779856.dsl.bell.ca] has joined #afp548
18:29 <@dre^> you should see krb5kdc running and supporting at least one realm
18:29 < SpaceBass2> root 96 0.0 0.2 82512 2480 ?? S 25Jun08 0:15.03 /usr/sbin/krb5kdc -n -r LKDC:SHA1.B3567769537F126486F54B94C5B03C7A439C0F80 -r NSNET.COM -a
18:29 <@dre^> very interesting
18:30 <@dre^> so the KDC thinks it’s hosting two realms, the LKDC realm and the NSNET.COM realm
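
The realms a running krb5kdc serves are visible as `-r` arguments in the ps output above. As an illustration only, they can be pulled out with a small awk filter; `kdc_realms` is a made-up name, and the sketch assumes each realm follows its `-r` flag as a separate word, as in the pasted line.

```shell
# Hypothetical helper: reads process listing lines on stdin and prints the
# argument following each "-r" flag, one realm per line.
kdc_realms() {
  awk '{ for (i = 1; i < NF; i++) if ($i == "-r") print $(i + 1) }'
}
```

A possible invocation is `ps auxwww | grep '[k]rb5kdc' | kdc_realms`; the bracket in the grep pattern keeps grep from matching its own process.
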
18:30 < SpaceBass2> yeah…theres those damn lkdc entries again
18:30 <@dre^> that’s fine, don’t fear the lkdc ;)
18:30 < SpaceBass2> oh but I do :D
18:30 <@dre^> perhaps this will alleviate your concern: https://dreness.com/wikimedia/index.php?title=LKDC
18:30 <@dre^> a little write-up I did about the LKDC
18:31 <@dre^> but that is beside the point
18:31 <@dre^> the question is: what broke between the KDC configuration and the population of the KerberosClientConfig record in OD
18:31 <@dre^> open workgroup manager
18:32 <@dre^> actually let’s just use dscl
18:32 < SpaceBass2> cool – good reading!
18:32 <@dre^> dscl /LDAPv3/127.0.0.1 (on the OD master)
18:32 < SpaceBass2> k
18:32 <@dre^> read /Config/KerberosClient
18:32 < SpaceBass2> I’ll warn you, my dscl-fu is weak
18:33 <@dre^> this should be similar to what you see in /L/P/edu.mit.kerberos (albeit formatted differently)
18:33 <@dre^> true or false?
18:33 < SpaceBass2> checking -its xml …but close
18:33 <@dre^> mainly looking for vail.local
18:33 < SpaceBass2> yeah
18:34 < SpaceBass2> its there
18:34 <@dre^> ok
18:34 < SpaceBass2> as the KDC for nsnet.com
18:34 < SpaceBass2> nsnet.cc
18:34 <@dre^> this is the data that is downloaded by clients when they bind
18:34 <@dre^> wait
18:34 < SpaceBass2> ah!
18:34 <@dre^> nsnet.cc or nsnet.com!?!
18:34 < SpaceBass2> cc
18:34 < SpaceBass2> sorry
18:34 < SpaceBass2> er..com
18:34 < SpaceBass2> it is com
18:34 <@dre^> hehe
18:34 < SpaceBass2> and .com is correct
18:35 <@dre^> ok
18:35 < SpaceBass2> and if I’ve been saying .cc its an old habit
18:35 < SpaceBass2> but nsnet.com is a private domain …in that i do not own it on the interwebs
18:35 <@dre^> … that is not recommended ;)
18:35 <@dre^> you should use fake TLDs in that case
18:35 < SpaceBass2> yeah, stupid move that I made years ago and wish I could undo
18:36 <@dre^> e.g. nsnet.lan
18:36 < SpaceBass2> but I suspect trying to change the realm now would be pretty challenging
18:36 <@dre^> you can and should un-do it as a reasonably high priority
18:36 <@dre^> it could cause very hard to track down DNS ‘problems’
18:36 <@dre^> but we’ll talk about that later
18:36 < SpaceBass2> what I’d really like to do get a public domain and do a dual horizon dns … would make getting a comercial cert much easier
18:37 < SpaceBass2> but like you said, I can tackel that later
18:37 <@dre^> ok, so
18:37 <@dre^> now let’s look at /Library/Logs/slapconfig.log
18:37 <@dre^> might wanna slap that on pastebin
18:37 <@dre^> (on the OD master)
18:37 <@dre^> slapconfig.log records information about OD role changes, such as promotion to master
18:38 < SpaceBass2> assume there is nothing sensitive in there
18:38 <@dre^> nothing that you haven’t already told us :)
18:38 <@dre^> might be an admin account name
18:38 * SpaceBass2 pats his PFsense box
18:38 <@dre^> but certainly no passwords…
18:39 < SpaceBass2> https://pastebin.ca/1061749

http://dreness.com/bits/tech/applied_kerberos_troubleshooting/paste1

18:39 <@dre^> (although before tiger shipped, I did find admin passwords in that log… heh. fixed before ship though, thankfully…)
18:39 < SpaceBass2> ouch!
18:39 <@dre^> full disclosure: I work at apple
18:40 <@dre^> loading slow again…
18:40 < SpaceBass2> yeah? awesome
18:40 < SpaceBass2> full discolsure I’m a fan boy
18:40 <@dre^> hehe
18:40 * SpaceBass2 has 16 macs …personally … this is a home setup by the way
18:41 < SpaceBass2> and my wife is only tolerating me troubleshooting this right now b/c I’ve promised that she’ll be able to mount the media share again
18:41 <@dre^> haha
18:41 <@dre^> ok it’s loaded, reading
18:42 < SpaceBass2> k
18:42 < SpaceBass2> reading myself as its new to me
18:42 <@dre^> I see you had one false start
18:43 < SpaceBass2> yeah – in fact, the long history is that I did a tiger-leo upgrade and it failed several times … so I blew it away and re-created the OD from sctatch …and did indeed have a false start
18:44 <@dre^> hmm, looks like you’re merging in an OD backup from tiger
18:45 < SpaceBass2> I did try and pull in a backup – again failed … you should see where I eventually re-created by hand
18:45 < SpaceBass2> if memory serves ….
18:45 <@dre^> heh ok, still reading
18:45 < SpaceBass2> I did try and pull in the backup and then create new passwords, but I wasn’t getting user principals
18:46 <@dre^> upgrades are risky business…
18:48 <@dre^> ok, so if you look at line 247
18:48 <@dre^> that’s where it starts creating the wrong service principals
18:48 <@dre^> though there is no obvious indication of why it’s doing it wrong… between line 202 and 247 appears normal
18:49 < SpaceBass2> leme look
18:49 < SpaceBass2> the warnings?
18:49 <@dre^> no, the principal name itself
18:49 <@dre^> er, the hostname portion of the service principals
18:49 <@dre^> vail.local
18:50 < SpaceBass2> i see
18:50 <@dre^> interestingly enough, when you kerberize other hosts, they work
18:50 <@dre^> e.g. telluride
18:50 <@dre^> that explains why ssh to linux is working
18:50 < SpaceBass2> telluride is a linux box – added by hand
18:50 <@dre^> *nod*
18:50 <@dre^> note line 327
18:51 <@dre^> the service principals are being created with the correct server name
18:51 < SpaceBass2> humm I cannot seem to get into kadmin
18:51 <@dre^> try kadmin.local as root
18:51 < SpaceBass2> but what I have observed in the past is that it creates 3 entries for each OSX host
18:51 <@dre^> yes, that is fixed in 10.5.3
18:51 <@dre^> but only for ‘new’ installs :/
18:51 < SpaceBass2> is it?!?!
18:52 <@dre^> it’s not really a functional problem, more cosmetic
18:52 < SpaceBass2> I’m on 10.5.2 – been avoiding the upgrade b/c I wasn’t sure it was safe yet
18:52 <@dre^> well now it’s 10.5.4, heh
18:52 < SpaceBass2> even for server?
18:52 <@dre^> yes
18:52 < SpaceBass2> on .4 for clients
18:52 < SpaceBass2> cool
18:52 <@dre^> in general, updates ship at the same time for client and server
18:52 < SpaceBass2> I’ll update tonight if all goes well
18:52 < smultron> i updated
18:53 < smultron> no problems
18:53 <@dre^> well… if you don’t have a lot of stuff in your OD master, you should probably demote / promote
18:53 < SpaceBass2> interesting – I only see vail.local in the keytab
18:53 <@dre^> yes, that is a problem :)
18:53 <@dre^> you might be able to slapconfig -kerberize your way to nirvana… lemme see
18:53 < SpaceBass2> oh yeah it is! can’t belive I missed that
18:53 <@dre^> I’ve never really done that, since I always stop at the first sign of weirdness and start over
18:53 < SpaceBass2> I mean, I can add em if need be
18:54 <@dre^> in general, watch slapconfig.log like a hawk whenever you do OD stuff
18:54 < SpaceBass2> but, since osx1.nsnet.com is trying to connect to osx5.nsnet.com … does vail.local matter?
18:54 < SpaceBass2> would that break the “chain” so to speak?
18:55 <@dre^> well, it matters in the sense that vail’s services are kerberized using the wrong hostname
18:55 < SpaceBass2> (and hostname on the kdc reports vail.nsnet.com )
18:55 <@dre^> right, it’s just the self-kerberization that failed for some reason
18:56 <@dre^> ok, couple more things to check…
18:57 <@dre^> sudo sso_util info -r /LDAPv3/127.0.0.1
18:57 <@dre^> should return NSNET.COM
18:58 < SpaceBass2> ’tis
18:58 < SpaceBass2> nsnet.com
19:00 <@dre^> ok, so there is an sso_util command that can attempt to kerberize services on the OD master
19:00 <@dre^> sso_util configure
19:00 < SpaceBass2> oh…?
19:00 <@dre^> but this will make changes
19:00 < SpaceBass2> at this point, its not like I cannot rebuild again … data is on the clients and its all backed up
19:00 <@dre^> so before doing that, let me ask: how much stuff is in the OD master? How long would it take you to demote and promote, and recreate all of the users / kerberized hosts?
19:00 <@dre^> ok
19:00 < SpaceBass2> and rebuilding the OD master isn’t too hard
19:01 <@dre^> well depends on how much stuff is in it :) the idea is we don’t want to restore from an archive
19:01 < SpaceBass2> I’d really prefer not to do that…at least not tonight … but its “do-able”
19:01 <@dre^> as that will restore potentially bad data
19:01 <@dre^> well doing the sso_util configure shouldn’t break anything other than kerberized services on the OD master
19:01 < SpaceBass2> guess what I’m saying is: I’m ok with risking it
19:01 <@dre^> which means that at works, you have to use standard auth and not kerberos
19:01 <@dre^> s/works/worst/
19:02 < SpaceBass2> I can live with standard for a few days if I have to
19:03 <@dre^> ok so try: sudo sso_util configure -r NSNET.COM -a admin-name all
19:03 <@dre^> where admin-name is your *directory* administrator
19:03 <@dre^> you will be prompted for a password
19:03 < SpaceBass2> says either us -p or named pipe
19:04 <@dre^> oh, interesting… must be a difference between versions
19:04 <@dre^> try passing -p with no password
19:04 < SpaceBass2> same error
19:04 <@dre^> blah, then do -p <password>
19:04 <@dre^> which is evil and stupid
19:04 <@dre^> 10.5.4 server allows you to get a secure prompt
19:04 < SpaceBass2> guess I can truncate history later :D
19:04 <@dre^> heh *nod*
19:05 <@dre^> hopefully you will see it creating new service principals…
19:05 < SpaceBass2> ok…same error …so I moved -p right after the -a diradmin
19:05 <@dre^> in the form service/vail.nsnet.com@NSNET.COM
19:05 <@dre^> hmm
19:05 < SpaceBass2> creating service princs
19:05 < SpaceBass2> add_principal: Principal or policy already exists while creating “ldap/vail.local@NSNET.COM”.
19:05 <@dre^> bah!
19:06 <@dre^> and you are sure that the ‘hostname’ command does not return vail.local?
19:06 < SpaceBass2> 100%
19:06 <@dre^> oh, I guess this could be keying off the KerberosConfig record…
19:06 <@dre^> maybe we need to re-publish that
19:06 <@dre^> ok let’s see…
19:07 < SpaceBass2> and by the way – if I’m keeping you from something, please say so
19:07 < SpaceBass2> you’v been more than helpful, to say the least
19:07 <@dre^> well thanks :) I kinda wanna solve this, I’m sure i’ll be seeing similar problems from others…
19:07 <@dre^> (I help scrub incoming server bugs)
19:08 < SpaceBass2> I really appreciate the help!
19:08 < SpaceBass2> gotcha – so this is right up your alley then
19:08 < SpaceBass2> although I suspect you dont see many home users with Server
19:09 <@dre^> well, no…
19:10 <@dre^> ok, gotta find how the KerberosClient record can be re-created
19:10 <@dre^> cause that’s where the bad data is coming from
19:10 < SpaceBass2> I’d show you my server cabinet and rack …but its a tad shoddy compared to a real server room
19:10 <@dre^> could very well have been left over from the false start(s)
19:12 < SpaceBass2> humm
19:12 <@dre^> ok how about this
19:12 <@dre^> dscl /Search list /Computers
19:13 < SpaceBass2> livingroom.local$
19:13 < SpaceBass2> livingroom.nsnet.com$
19:13 < SpaceBass2> LKDC:SHA1.2F5BAB71984D985DC0BA0D103C85DC067EF0A22E$
19:13 < SpaceBass2> LKDC:SHA1.64604752011301522B118A9CFE83A95560B194E5$
19:13 < SpaceBass2> LKDC:SHA1.AB999D5B63EDDCDC11B360E1EACB9536849844CC$
19:13 < SpaceBass2> LKDC:SHA1.C1E7E428054307B586CD240141B42583DF46FB5A$
19:13 < SpaceBass2> LKDC:SHA1.C2DA7627FD7C4E44EFE720A00FAE2CE2F76BA9A8$
19:13 < SpaceBass2> LKDC:SHA1.DD1F37D568FCC14ACE2F3935554012B235C87A4C$
19:13 < SpaceBass2> LKDC:SHA1.DD362AEF0FD6C7CBA5664D5FD27818058317ED49$
19:13 < SpaceBass2> osx1
19:13 < SpaceBass2> osx1.local$
19:13 < SpaceBass2> osx1.nsnet.com$
19:13 < SpaceBass2> osx10.local$
19:13 < SpaceBass2> osx10.nsnet.com$
19:13 < SpaceBass2> osx5
19:13 < SpaceBass2> osx5.nsnet.com$
19:13 < SpaceBass2> osx7.local$
19:13 < SpaceBass2> osx7.nsnet.com$
19:13 < SpaceBass2> telluride.nsnet.com
19:13 < SpaceBass2> vail.nsnet.com$
19:13 < SpaceBass2> oops…SORRY
19:13 < SpaceBass2> ment to put that into pastebin
19:13 <@dre^> no worries, but ok, vail.nsnet.com is there
19:15 < SpaceBass2> help me understand the $ … is that some kind of wild card
19:15 <@dre^> used for computer records
19:15 <@dre^> maybe only those with a qualified name
19:15 <@dre^> e.g. foo.tld instead of just foo
19:15 <@dre^> and I think only when they are auto-generated
19:16 <@dre^> which is why the linux box record doesn’t have one
19:16 < SpaceBass2> gotcha
19:16 < SpaceBass2> gotcha
19:17 <@dre^> ok hmmm
19:17 < SpaceBass2> I’ve avoided joining the other machines until I get the issues sussed out
19:18 <@dre^> dscl /Search read “/Computers/vail.nsnet.com$”
19:18 <@dre^> sorry
19:18 <@dre^> dscl /Search read “/Computers/vail.nsnet.com$” cn
19:19 < SpaceBass2> dsAttrTypeNative:cn: vail.nsnet.com$ vail.nsnet.com
19:19 <@dre^> ok
19:20 <@dre^> kdcsetup is the one who writes the KerberosClient record into LDAP
19:22 <@dre^> but it doesn’t appear to be able to only re-write KerberosClient without doing everything else
19:22 <@dre^> so fire up WGM
19:22 <@dre^> go into prefs, turn on the inspector
19:22 < SpaceBass2> k
19:22 < dakine> hey guys, quick question
19:23 < dakine> what do you say you do for a living?
19:23 <@dre^> click the bullseye icon (the right-most above the left-hand list view)
19:23 <@dre^> I work at apple as a seed engineer
19:23 < SpaceBass2> <– healthcare process improvement :D
19:23 <@dre^> software seeding, that is
19:23 < SpaceBass2> looking for inspector
19:24 <@dre^> second checkbox
19:24 <@dre^> (in the wgm prefs)
19:24 < dakine> lol
19:24 < dakine> ok
19:24 < SpaceBass2> see it now
19:24 <@dre^> dakine: in case that isn’t clear, I help mediate communications between external customers with bugs and apple software engineers
19:24 < SpaceBass2> ok…in the bulls eye
19:24 < SpaceBass2> also new to me
19:25 < dakine> ah
19:25 <@dre^> from the pop-up menu, select Config
19:25 < dakine> so you are the middleman
19:25 <@dre^> well I hate that term, heh
19:25 < dakine> cause the software engineers arent people persons
19:25 <@dre^> middleman implies that I’m good for nothing ;)
19:25 < dakine> lol
19:25 < dakine> listen
19:25 < dakine> nothing gets done without the middle man
19:26 <@dre^> space: then select KerberosClient
19:26 < dakine> its just the problem givers and the problems solvers in communicado
19:26 <@dre^> then select XMLPlist and click Edit below
19:26 < dakine> anyways I am off
19:26 <@dre^> later dakine :)
19:26 < SpaceBass2> im there
19:26 < SpaceBass2> later dakine
19:26 <@dre^> fix the hostnames
19:26 <@dre^> vail.local becomes vail.nsnet.com
19:27 < SpaceBass2> k
19:27 <@dre^> and increment the generation ID by one
19:27 <@dre^> (at the bottom)
19:27 < SpaceBass2> fixed
19:27 <@dre^> the generation ID is how the client tells if its local version of the config is stale
19:28 < SpaceBass2> ah
19:28 < SpaceBass2> that long integer at the btm?
19:28 <@dre^> yes
19:28 < SpaceBass2> k
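
The autogenerated edu.mit.Kerberos pasted earlier carries its generation ID in a header comment (`# generation_id : 97528862`). The sketch below reads that value and compares it with one taken from the directory. Both function names are hypothetical, and treating any difference as "stale" is an assumption; the transcript does not say how kerberosautoconfig actually compares the two.

```shell
# Hypothetical helper: reads an autogenerated edu.mit.Kerberos on stdin and
# prints the number from its "# generation_id : N" header line.
generation_id() {
  sed -n 's/^#[ ]*generation_id[ ]*:[ ]*\([0-9][0-9]*\).*$/\1/p' | head -n 1
}

# Assumption: any mismatch means the local copy should be regenerated.
# $1 = local generation ID, $2 = generation ID published in the directory
config_is_stale() {
  [ "$1" != "$2" ]
}
```

For example, `config_is_stale "$(generation_id < /Library/Preferences/edu.mit.Kerberos)" "$published"` succeeds when the two differ, where `$published` stands for whatever value was read from the KerberosClient record.
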
19:29 <@dre^> click OK to commit the changes
19:29 <@dre^> click Save if it’s lit up
19:29 < SpaceBass2> k
19:29 <@dre^> go back to the client and run sudo kerberosautoconfig
19:29 <@dre^> (we’ll do the server next if this works)
19:29 <@dre^> then examine edu.mit.Kerberos on the client
19:30 <@dre^> the kdc and kdc admin server should be reported as vail.nsnet.com
19:30 < SpaceBass2> yep
19:30 < SpaceBass2> it is
19:30 <@dre^> ok great
19:30 <@dre^> same thing on the OD master
19:30 < SpaceBass2> on the master huh?
19:30 < SpaceBass2> k
19:30 <@dre^> aye
19:31 < SpaceBass2> done
19:31 <@dre^> now we want to sso_util configure again, same as before… lemme double check the usage
19:31 <@dre^> sudo sso_util configure -r NSNET.COM -a whatever -p whatever all
19:32 <@dre^> now you should get correct keytabs
19:32 <@dre^> if so, that *should* be it
19:32 < SpaceBass2> still got warnings about the .local :(
19:33 <@dre^> BAH
19:33 <@dre^> and you did check that it got an updated edu.mit.kerberos, right?
19:33 <@dre^> the od master
19:33 < SpaceBass2> yeah
19:34 < SpaceBass2> its correct
19:34 <@dre^> hmm
19:34 <@dre^> oh, uhm..
19:34 <@dre^> well no, not a stale DS cache if the on-disk file is correct
19:35 < SpaceBass2> yeah, checking /L/P/edu…
19:35 -!- Azhi_Dahaka [n=Azhi@unaffiliated/azhidahaka/x-172934] has quit []
19:36 <@dre^> oooo
19:36 <@dre^> I think I know :)
19:37 <@dre^> you might have an ‘upgraded’ sso_util
19:37 <@dre^> from tiger
19:37 <@dre^> md5 /usr/sbin/sso_util
19:37 <@dre^> paste results pls
19:37 < SpaceBass2> its a fresh install from leopard
19:37 <@dre^> oh dammit
19:37 < SpaceBass2> its a one liner
19:37 < SpaceBass2> MD5 (/usr/sbin/sso_util) = 32a7a95f3e49502ddb0863583c30410d
19:37 < SpaceBass2> 10.5.3 remember
19:38 <@dre^> ppc?
19:38 < SpaceBass2> yeah …
19:38 < SpaceBass2> g4
19:38 <@dre^> k, no problem. but that probably explains why its different from mine
19:38 <@dre^> actually..
19:38 <@dre^> file /usr/sbin/sso_util
19:38 <@dre^> paste results
19:38 < SpaceBass2> if I buy an xserver my wife call it quits
19:39 <@dre^> heh, they are big and loud
19:39 < SpaceBass2> https://pastebin.ca/1061784
19:40 < SpaceBass2> can’t be louder than my 2u linux box :D
19:40 < SpaceBass2> but might be hotter
19:41 <@dre^> just looking for both a ppc and i386 image, that’s all…
19:41 <@dre^> not really that important.
19:41 <@dre^> hmm, there’s supposed to be an sso_util debug mode…
19:42 < SpaceBass2> is sso_util unique to OSX?
19:42 <@dre^> here we go
19:42 <@dre^> this is gonna be big-ass
19:43 <@dre^> same sso_util command, but add: -v 7 after configure and before -r
19:43 <@dre^> and pastebin results
19:43 < SpaceBass2> which cmd?
19:43 < SpaceBass2> the confgure ?
19:43 < SpaceBass2> configure ?
19:43 <@dre^> sso_util configure -v 7 …
19:44 < SpaceBass2> any second Im going to forget and pastebin the admin passwd
19:45 <@dre^> well at least you're conscious of that possibility ;)
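A side note on the password worry above: one way to lower the risk is to filter the debug output before it goes anywhere near a pastebin. The following is only a minimal sketch. The function name redact_secret, the output file name, and the ADMIN / PASSWORD placeholders are made up for illustration, and it assumes the secret appears verbatim in the output.

```shell
#!/bin/sh
# redact_secret SECRET: copy stdin to stdout, replacing every literal
# occurrence of SECRET with [REDACTED]. This helper is hypothetical; it is
# not part of sso_util or Mac OS X.
redact_secret() {
    # Hand the secret to awk through the environment so it is treated as a
    # plain string (no regex or backslash interpretation).
    REDACT_SECRET="$1" awk '
        BEGIN { s = ENVIRON["REDACT_SECRET"]; n = length(s) }
        n == 0 { print; next }          # empty secret: pass lines through
        {
            line = $0; out = ""
            # Replace each literal match, scanning left to right.
            while ((i = index(line, s)) > 0) {
                out = out substr(line, 1, i - 1) "[REDACTED]"
                line = substr(line, i + n)
            }
            print out line
        }'
}

# Hypothetical usage on the OD master (ADMIN and PASSWORD are placeholders):
#   sudo sso_util configure -v 7 -r NSNET.COM -a ADMIN -p 'PASSWORD' all 2>&1 \
#       | redact_secret 'PASSWORD' > sso_util_debug.txt
```

It is still worth reading the filtered file before pasting it, since this only catches exact matches of the string you give it.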
19:46 < SpaceBass2> https://pastebin.ca/1061791

http://dreness.com/bits/tech/applied_kerberos_troubleshooting/paste2

19:46 <@dre^> I suspect that the GetPrimaryHostName block will contain the error…
19:47 <@dre^> oh snap, do you have multiple IPs on the od master?
19:47 < SpaceBass2> leme check – I did under tiger server, but didn’t think I did any more
19:48 <@dre^> man pastebin.ca needs an upgrade
19:48 <@dre^> still loading… there it goes
19:48 <@dre^> shit
19:48 < SpaceBass2> I had to do it under tiger b/c I did a DNS move (migrated from Windows Server…what a mistake that was) … but thats a long story
19:49 < SpaceBass2> yeah…still have two IPs
19:50 < SpaceBass2> don’t need the 2nd anymore since I’m not doing VPN on the OSX server anymore
19:50 <@dre^> yep found the problem
19:50 <@dre^> line 433
19:51 < SpaceBass2> <CFArray 0x10ec80 [0xa07e7174]>{type = immutable, count = 2, values = (
19:51 < SpaceBass2> ?
19:51 <@dre^> heh, no not that specific line
19:51 <@dre^> but that begins a block…
19:51 <@dre^> 432: GetPrimaryHostName
19:51 <@dre^> then it steps through your network interfaces
19:52 <@dre^> in the two blocks that follow
19:52 <@dre^> for each interface, you see attributes like family, dnsName, name, serviceName, etc
19:52 <@dre^> note that the second one has isPrimaryIPv4Interface
19:52 < SpaceBass2> brb….going to change root on several computers
19:52 <@dre^> guess which one that is :)
19:52 <@dre^> good idea
19:53 <@dre^> you *should* be able to solve this by simply setting 10.1.1.15 as your primary address
19:53 <@dre^> which you can do by drag / drop in the network prefpane’s list of interfaces (under ‘change network service order’, from the gear menu)
19:54 <@dre^> woohoo, we found the problem!
19:54 < SpaceBass2> one sec…
19:55 <@dre^> do you mind if I post this chat log to my blog?
19:55 < SpaceBass2> and I created a huge one
19:55 <@dre^> yeah, heh. happens to everybody at one time or another…
19:55 < SpaceBass2> no, please do…I was going to ask you if I could keep it too
19:55 <@dre^> just be fast about changing and double-check access logs…
19:56 <@dre^> I’ve actually typed passwords directly into IRC before, when I thought a certain window had focus but between the time that it had focus and the time I typed the password, something caused a change in window focus…
19:56 <@dre^> like an errant mouse click, for instance…
19:56 < SpaceBass2> ok… ssh closed …. passwords changed
19:57 * SpaceBass2 wipes brow
19:57 <@dre^> so anyway, do you see what’s going on line 432?
19:57 <@dre^> “going on on line 432”
19:58 <@dre^> “GetPrimaryHostName”… this result will be used to form the server name portion of the kerberos service principal
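To make that concrete, here is a rough sketch of how a service principal name is put together from a service name, a hostname, and a realm, plus a quick sanity check on the hostname. Both function names are invented for this example. The check assumes the common convention that the realm is the upper-cased DNS domain (as NSNET.COM / nsnet.com is here), which Kerberos itself does not require.

```shell
#!/bin/sh
# service_principal SERVICE HOST REALM: print SERVICE/HOST@REALM.
# (Hypothetical helper, for illustration only.)
service_principal() {
    printf '%s/%s@%s\n' "$1" "$2" "$3"
}

# host_matches_realm HOST REALM: succeed if HOST ends in the lower-cased
# realm name, e.g. vail.nsnet.com for NSNET.COM. This relies on the
# "realm = upper-cased DNS domain" convention, which is an assumption.
host_matches_realm() {
    realm_domain=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
    case "$1" in
        *."$realm_domain") return 0 ;;
        *) return 1 ;;
    esac
}

# Example: the hostname the master should have used vs. the one it did use.
service_principal afpserver vail.nsnet.com NSNET.COM
host_matches_realm vail.local NSNET.COM || echo "vail.local is outside nsnet.com"
```

If the primary hostname comes back as vail.local, every principal built from it inherits that name, which is what the warnings were about.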
19:58 < SpaceBass2> leme look
19:59 <@dre^> looking at the two blocks directly following (434 – 439 and 441 – 448), you can see attributes that look like they are related to network interfaces
19:59 <@dre^> like ipAddress, dnsName, family, etc
19:59 < SpaceBass2> ahhhh
19:59 < SpaceBass2> snap!
19:59 <@dre^> so the bonus question is:
19:59 <@dre^> how does the system determine what the primary hostname is?
19:59 < SpaceBass2> of course
19:59 <@dre^> look at the differences in the attributes for each interface
19:59 < SpaceBass2> there’s no DNS entry for the 2nd interface
20:00 <@dre^> well… they both have dnsName
20:00 <@dre^> but what attribute is present for one but not the other?
20:00 < SpaceBass2> looking
20:00 <@dre^> ok there’s two… userDefinedName, and one other… the other one is the key :)
20:01 < SpaceBass2> yep… .nsnet.com vs .local
20:01 < SpaceBass2> binbo
20:01 < SpaceBass2> bingo
20:01 <@dre^> no no
20:01 <@dre^> keep looking
20:01 <@dre^> how does it know which of those to choose?
20:01 < SpaceBass2> en0?
20:01 <@dre^> nope
20:01 <@dre^> which attribute is present for one but not the other?
20:01 < SpaceBass2> ok…leme keep looking
20:01 <@dre^> besides userDefinedName
20:01 < SpaceBass2> dont tell me
20:02 <@dre^> en0 is not an attribute, it’s a value
20:02 < SpaceBass2> d’oh
20:02 < SpaceBass2> isPrimaryIPv4Interface = true
20:02 <@dre^> the attribute that corresponds to en0 is ‘name’, as in the bsd name of the interface
20:02 <@dre^> yep, that’s the one
20:02 < SpaceBass2> I’m actually laughing out loud
20:02 <@dre^> so then, how do you set which is the primary interface? :)
20:02 < SpaceBass2> never in a million years
20:02 < SpaceBass2> well now, thats a good question
20:03 <@dre^> there is a very easy GUI answer, also :)
20:03 < SpaceBass2> b/c the one it identifies as primary is actually a copy
20:03 <@dre^> and that’s perfectly legit
20:03 < SpaceBass2> I’m guessing you go into network prefs and drag it first
20:03 <@dre^> yep!
20:03 < SpaceBass2> ALRIGHT!
20:03 <@dre^> the top-most active interface is the primary
20:03 <@dre^> you should be able to simply make that change and re-run sso_util
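The same list can be read from the command line with networksetup -listnetworkserviceorder. The sketch below pulls out the first enabled service from that listing. The helper name is made up, and it assumes the usual output layout in which enabled services are numbered "(1) Name" and disabled ones are marked "(*) Name"; this may differ between OS X releases. Note that it reports the first enabled service, which is not necessarily the first one with an active link.

```shell
#!/bin/sh
# first_enabled_service: read `networksetup -listnetworkserviceorder`
# output on stdin and print the name of the first numbered service.
# Disabled services show up as "(*) Name" and are skipped.
# (Hypothetical helper; the output layout is an assumption.)
first_enabled_service() {
    awk '/^\([0-9]+\) / { sub(/^\([0-9]+\) /, ""); print; exit }'
}

# Usage on the server:
#   networksetup -listnetworkserviceorder | first_enabled_service
```

Comparing this against what sso_util reports as isPrimaryIPv4Interface is a quick way to spot a disagreement between the configured order and what the system actually believes.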
20:04 < SpaceBass2> ok …what if I just delete it?
20:04 < SpaceBass2> since I dont need it?
20:04 <@dre^> well… that could be a problem
20:04 < SpaceBass2> ok
20:04 <@dre^> because when you promote to master, the primary hostname / address is encoded in several spots
20:04 <@dre^> but no fear: changeip to the rescue
20:05 <@dre^> so you want to changeip over to .15 / vail.nsnet.com
20:05 <@dre^> see the changeip manpage for examples
20:05 < SpaceBass2> ok,… the one listed as primary is actually 2nd in the gui
20:05 <@dre^> really?!?
20:05 < SpaceBass2> yeah
20:05 <@dre^> well, which gui
20:05 <@dre^> are you in ‘change network service order’, or the overview?
20:06 <@dre^> sorry, ‘set network service order’, under the gear menu
20:06 < SpaceBass2> https://www.flickr.com/photos/nickdawson/2634507389/

20:06 <@dre^> which is just above the lock
20:06 <@dre^> ya, click the gear icon
20:06 <@dre^> ‘set network service order’
20:06 < SpaceBass2> modem, if1 (.15) firewire if2 (.17)
20:07 < SpaceBass2> .17 is the one set as .local and primary and is not needed
20:07 <@dre^> .17 should appear above .15 in the ‘set network service order’ list
20:07 <@dre^> since it is in fact the primary, and that list order is supposed to be what defines the primary
20:08 <@dre^> on the other hand
20:08 <@dre^> most of the system appears to believe that vail.nsnet.com is the primary hostname
20:08 <@dre^> which suggests that somehow, somewhere, the network config got confused
20:08 <@dre^> what I would try is simply dragging .17 to the top, and then dragging .15 to the top
20:09 <@dre^> which should re-set the isPrimaryIPv4Interface to be correct
20:09 < SpaceBass2> back…had to get power
20:09 < SpaceBass2> https://www.flickr.com/photos/nickdawson/2635335214/

20:10 <@dre^> yeah just try dragging ‘ethernet’ to the top
20:10 <@dre^> er sorry
20:10 <@dre^> oh wow
20:10 <@dre^> no this is very broken
20:10 <@dre^> lol
20:10 < SpaceBass2> lol!
20:10 <@dre^> both of those interfaces claim to be ‘en0’
20:11 <@dre^> which is theoretically impossible
20:11 < SpaceBass2> well, in linux-speak … en1 and en1:1
20:11 <@dre^> right, but when you create virtual interfaces in os x, they each get unique bsd names
20:11 < SpaceBass2> in other words 10.1.1.17 is a virtual IP
20:11 < SpaceBass2> right
20:11 < SpaceBass2> and bsd interface names baffle me :D
20:12 <@dre^> hmm… actually maybe I’m wrong about that. ifconfig would show them in the same physical interface
20:12 <@dre^> so maybe this isn’t as horribly broken as I thought
20:12 <@dre^> but they are definitely ordered wrong, or at least the OS thinks they are
20:12 <@dre^> (you can use ifconfig to read, but should not use it to change settings)
20:12 < SpaceBass2> how detrimental would it be to delete the virtual IP?
20:12 <@dre^> (the os x equivalent is ipconfig)
20:13 <@dre^> probably not very, since your system already thinks it is vail.nsnet.com
20:13 < SpaceBass2> yeah, I know ifconfig :D …
20:13 <@dre^> except for this one little piece of configuration which is wrong
20:13 <@dre^> but just to be safe, disable it instead of deleting
20:13 < SpaceBass2> ok
20:13 <@dre^> gear –> make service inactive
20:13 <@dre^> that way you can always turn it on if something assplodes
20:14 < SpaceBass2> is that the same as ifconfig <interface> down ?
20:14 <@dre^> yes, but don’t do that in os x
20:14 <@dre^> you should only use ifconfig to read settings, not write them
20:14 < SpaceBass2> yeah?
20:14 < SpaceBass2> you mentioned that
20:14 <@dre^> (because ifconfig bypasses the system frameworks that are used by the rest of the OS)
20:15 < SpaceBass2> I always have to remind myself that bash in osx is truly just a shell
20:15 <@dre^> so you could make a change, but the OS doesn’t know the change was made (only the very low networking layers), and so e.g. network prefs would be totally ignorant of the change
20:15 < SpaceBass2> which is arguably the way it should be
20:15 <@dre^> if you want to make network changes from the cli, use ipconfig or networksetup
20:16 <@dre^> so disable the interface and re-run sso_util
20:16 <@dre^> brb, potty
20:16 < SpaceBass2> jawdrop – ipconfig is a binary in 10.5 … wow
20:18 < SpaceBass2> ok … re-ran and same result …still .local
20:18 < SpaceBass2> but I feel that we are very close :D
20:18 <@dre^> hmm
20:18 <@dre^> let me see that relevant hunk of sso_util configure -v 7 output
20:19 < SpaceBass2> Entry for principal ftp/vail.local@NSNET.COM with kvno 7, encryption type ArcFour with HMAC/md5 added to keytab WRFILE:/etc/krb5.keytab.
20:19 <@dre^> the part where it detects network name
20:19 < SpaceBass2> leme get it
20:19 <@dre^> GetPrimaryHostName
20:20 < SpaceBass2> https://pastebin.ca/1061809

http://dreness.com/bits/tech/applied_kerberos_troubleshooting/paste3

20:21 <@dre^> (loading)
20:24 <@dre^> well that’s bizarre…
20:24 <@dre^> it still thinks vail.local is primary
20:26 <@dre^> maybe you will need to delete .17
20:26 <@dre^> it could also be that the settings are horked enough that you cannot change them
20:26 < SpaceBass2> yeah, not ruling that out
20:26 <@dre^> (you did remember to click Apply right?)
20:26 <@dre^> in network prefs…
20:27 < SpaceBass2> heck, let me delete it and see
20:27 < SpaceBass2> yeah, closed prefs and re-opened even
20:28 < SpaceBass2> BOOM!
20:28 < SpaceBass2> removed it and bingop
20:28 < SpaceBass2> bingo
20:28 < SpaceBass2> xmpp/vail.nsnet.com@NSNET.COM
20:28 <@dre^> woot!
20:28 < SpaceBass2> high-five!
20:28 <@dre^> ^5 :)
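For anyone repeating this, a quick way to confirm the fix is to scan the sso_util output for any principal that still carries a .local hostname. This is a sketch only: the helper name is invented, and it assumes every keytab line follows the "Entry for principal … with kvno …" wording quoted earlier in the chat.

```shell
#!/bin/sh
# local_principals: read "Entry for principal NAME with kvno ..." lines on
# stdin and print each distinct principal whose host part ends in .local.
# Prints nothing when every principal uses a non-.local hostname.
# (Hypothetical helper; the line format is taken from the chat log above.)
local_principals() {
    sed -n 's|^.*Entry for principal \([^ ]*/[^ /@]*\.local@[^ ]*\) with kvno.*$|\1|p' \
        | sort -u
}

# Usage, with the saved output of the sso_util run:
#   local_principals < sso_util_debug.txt
```

Empty output means no .local principals were written in that run; any lines printed are the ones that still need attention.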

Epilogue: After re-reading this, I realized that his afp server is actually a separate host from his OD master (vail), but the same troubleshooting steps apply… so in the end, I might not have actually fixed the AFP mounting problem, but we did fix at least *some* problems :)

Finally: if anyone knows how to make WordPress not DELETE AN ENTIRE POST when you paste in a chunk of text that is too big; or, how to adjust this threshold, please tell me. This post took entirely too long to compose, as I had to move text around in increasingly smaller chunks to work around this problem.
