Recently I did some spelunking into the Adaptive Firewall (AF) facility of OS X Server to devise a procedure for troubleshooting a reported failure of AF to block failed SSH logins. Consider this a supplement to this post at krypted (although do note that the hb_summary tool mentioned there seems to be defunct now). The procedure has three parts:
- 1) Verify that AdaptiveFirewall (AF) is actually enabled. The “Adaptive” part is what reacts to events such as login failures; I mention this because adding a block rule manually using afctl is roughly equivalent to adding a block rule in pf, and even if this block rule takes effect (because pf is enabled), that does not imply that AdaptiveFirewall is enabled.
- 2) AF doesn’t detect the events itself; it relies on Event Monitor (emond) for this. Verify that emond is seeing the activity in question.
- 3) Verify that AF is creating the correct rules in pf based on what it learns from emond.
First, create the following shell alias to allow easy invocation of afctl:
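The alias definition itself is not reproduced here. A likely form is sketched below. It assumes Server.app is installed in /Applications and ships afctl under ServerRoot/usr/libexec; that path is an assumption, so adjust it if your layout differs.

```shell
# Assumed location of afctl inside Server.app; verify that it exists on your system.
alias afctl='/Applications/Server.app/Contents/ServerRoot/usr/libexec/afctl'
```

The alias only lasts for the current shell session, which is enough for a troubleshooting run.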
Verify that AF is enabled
Check to see if AF’s launchd job is running. You should see the com.apple.afctl job listed.
    bash-3.2# launchctl list | grep afctl
    -    0    com.apple.afctl
If it’s not listed, re-initialize AF. This doesn’t destroy any state. Make sure it exits zero (no error).
    bash-3.2# afctl -c ; echo $?
    0
Re-enable any previously disabled rules, check exit status.
    bash-3.2# afctl -e ; echo $?
    0
Force AF into active state, check exit status. Don’t be scared by the pfctl boilerplate about the -f option.
    bash-3.2# afctl -f ; echo $?
    pfctl: Use of -f option, could result in flushing of rules
    present in the main ruleset added by the system at startup.
    See /etc/pf.conf for further details.

    No ALTQ support in kernel
    ALTQ related functions disabled
    0
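The three afctl steps above can be strung together so that the first non-zero exit status stops the run. This is only a sketch: run_step and check_af are hypothetical helper names, and the default AFCTL path is an assumption about where Server.app keeps the binary.

```shell
# Assumed afctl location; override by setting AFCTL before calling check_af.
AFCTL=${AFCTL:-/Applications/Server.app/Contents/ServerRoot/usr/libexec/afctl}

# Run a command, report whether it exited zero, and pass its status through.
run_step() {
    desc=$1
    shift
    if "$@"; then
        echo "ok: $desc"
    else
        status=$?
        echo "FAILED (exit $status): $desc" >&2
        return "$status"
    fi
}

# Re-initialize, re-enable, and force AF active, stopping at the first failure.
check_af() {
    run_step "re-initialize AF (afctl -c)" "$AFCTL" -c &&
    run_step "re-enable rules (afctl -e)"  "$AFCTL" -e &&
    run_step "force AF active (afctl -f)"  "$AFCTL" -f
}
```

Run check_af as root. The pfctl boilerplate from the -f step will still appear in between the status lines.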
Verify that emond is seeing the auth failure events
Configure emond to do some additional logging. Edit /etc/emond.d/emond.plist to increase debugLevel to 4 and set logEvents to true, as shown below:
    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
    <plist version="1.0">
    <dict>
        <key>config</key>
        <dict>
            <key>additionalRulesPaths</key>
            <array>
                <string>/Applications/Server.app/Contents/ServerRoot/private/etc/emond.d/rules/</string>
                <string>/Applications/Server.app/Contents/ServerRoot/private/etc/emond.d/rules/</string>
            </array>
            <key>debugLevel</key>
            <integer>4</integer>
            <key>errorLogPath</key>
            <string>/Library/Logs/EventMonitor/EventMonitor.error.log</string>
            <key>eventLogPath</key>
            <string>/Library/Logs/EventMonitor/EventMonitor.event.log</string>
            <key>filterByGID</key>
            <string></string>
            <key>filterByUID</key>
            <string>0,27,70,83,84,94,214,235</string>
            <key>logEvents</key>
            <true/>
            <key>periodicEvents</key>
            <array>
                <dict>
                    <key>eventType</key>
                    <string>periodic.daily.midnight</string>
                    <key>startTime</key>
                    <string>0</string>
                </dict>
            </array>
            <key>saveState</key>
            <true/>
        </dict>
        <key>initialGlobals</key>
        <dict>
            <key>notificationContacts</key>
            <array/>
        </dict>
    </dict>
    </plist>
After making the above change, run: sudo killall emond. There is now an additional log in /Library/Logs/EventMonitor (EventMonitor.event.log), and both that and the error.log now contain more verbose information. Watch these files with tail -f to see ongoing activity. Note that for arcane reasons, a single failed SSH attempt actually results in multiple detected auth failures.
You can also look at /etc/emond.d/state, which is only written upon reception of SIGTERM. The state file lists all the hosts that have attempted to connect to a protected service, along with the count of failed auths. Successful logins are indicated by a bad auth count of zero.
Verify correct rules in pf
pf rules associated with AF are all rooted under a pf anchor (anchor is pf’s word for ‘group’) called com.apple/400.AdaptiveFirewall. Show the active pf rules under this anchor:
    bash-3.2# pfctl -s Anchors -a com.apple/400.AdaptiveFirewall -s rules -v
    No ALTQ support in kernel
    ALTQ related functions disabled
    block drop in quick from <blockedHosts> to any
      [ Evaluations: 31705     Packets: 0         Bytes: 0           States: 0     ]
      [ Inserted: uid 0 pid 22564 ]
Note that the Evaluations counter should be non-zero. If it is zero, pf is most likely not enabled; afctl -f is supposed to enable it. Check pf’s status:
    bash-3.2# pfctl -s info
    No ALTQ support in kernel
    ALTQ related functions disabled
    Status: Enabled for 0 days 00:01:31           Debug: Urgent

    State Table                          Total             Rate
      current entries                        0
      searches                         2034928        22361.8/s
      inserts                                0            0.0/s
      removals                               0            0.0/s
    Counters
      match                             999161        10979.8/s
      bad-offset                             0            0.0/s
      fragment                               0            0.0/s
      short                                  0            0.0/s
      normalize                              0            0.0/s
      memory                                 0            0.0/s
      bad-timestamp                          0            0.0/s
      congestion                             0            0.0/s
      ip-option                            418            4.6/s
      proto-cksum                            0            0.0/s
      state-mismatch                         0            0.0/s
      state-insert                           0            0.0/s
      state-limit                            0            0.0/s
      src-limit                              0            0.0/s
      synproxy                               0            0.0/s
      dummynet                               0            0.0/s
If afctl -f doesn’t enable pf, that’s a bug. If this is the case, you could try manually enabling pf. If it’s already enabled, it says so:
    bash-3.2# pfctl -e
    No ALTQ support in kernel
    ALTQ related functions disabled
    pfctl: pf already enabled
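If you want to script the "is pf enabled?" check rather than eyeball it, one option is to look for the Status line in the output of pfctl -s info. This is a minimal sketch; pf_is_enabled is a hypothetical helper name, and it assumes the Status line looks like the one in the output above.

```shell
# Reads `pfctl -s info` output on stdin; exits zero if pf reports itself enabled.
pf_is_enabled() {
    grep -q '^Status: Enabled'
}
```

Typical use, as root: `pfctl -s info 2>/dev/null | pf_is_enabled || echo "pf is not enabled"`.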
pf uses ‘tables’ to efficiently store data associated with rules that only differ by a single element (such as IP address). Show the list of pf tables under the AF anchor:
    bash-3.2# pfctl -a com.apple/400.AdaptiveFirewall -s Tables -vvv
    No ALTQ support in kernel
    ALTQ related functions disabled
    -pa-r-  blockedHosts    com.apple/400.AdaptiveFirewall
        Addresses:   0
        Cleared:     Fri Mar 25 11:38:30 2016
        References:  [ Anchors: 0                  Rules: 1                  ]
        Evaluations: [ NoMatch: 529189             Match: 141                ]
        In/Block:    [ Packets: 141                Bytes: 15909              ]
        In/Pass:     [ Packets: 0                  Bytes: 0                  ]
        In/XPass:    [ Packets: 0                  Bytes: 0                  ]
        Out/Block:   [ Packets: 0                  Bytes: 0                  ]
        Out/Pass:    [ Packets: 0                  Bytes: 0                  ]
        Out/XPass:   [ Packets: 0                  Bytes: 0                  ]
Show the contents of the blockedHosts table in the AF anchor. In the output below, 22.214.171.124 is an address I added manually using afctl, and x.x.x.x is a redacted address that AF added automatically after failed SSH login attempts.
    bash-3.2# pfctl -a com.apple/400.AdaptiveFirewall -t blockedHosts -T show -vvv
    No ALTQ support in kernel
    ALTQ related functions disabled
       22.214.171.124
        Cleared:     Fri Mar 25 13:26:12 2016
        In/Block:    [ Packets: 0                  Bytes: 0                  ]
        In/Pass:     [ Packets: 0                  Bytes: 0                  ]
        Out/Block:   [ Packets: 0                  Bytes: 0                  ]
        Out/Pass:    [ Packets: 0                  Bytes: 0                  ]
       x.x.x.x
        Cleared:     Fri Mar 25 14:15:38 2016
        In/Block:    [ Packets: 8                  Bytes: 1088               ]
        In/Pass:     [ Packets: 0                  Bytes: 0                  ]
        Out/Block:   [ Packets: 0                  Bytes: 0                  ]
        Out/Pass:    [ Packets: 0                  Bytes: 0                  ]
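With more than a handful of blocked hosts, the verbose table listing gets long. The sketch below condenses it to one line per address with its inbound blocked packet count. af_blocked_summary is a hypothetical helper name, and the parsing assumes the layout shown above: each address alone on its own line, followed by an In/Block line whose fourth field is the packet count.

```shell
# Reads `pfctl ... -t blockedHosts -T show -vvv` output on stdin and prints
# "<address> <inbound blocked packets>" for every table entry.
af_blocked_summary() {
    awk '
        NF == 1           { addr = $1 }        # a line holding only the address
        $1 == "In/Block:" { print addr, $4 }   # fields: In/Block: [ Packets: N ...
    '
}
```

Typical use, as root: `pfctl -a com.apple/400.AdaptiveFirewall -t blockedHosts -T show -vvv 2>/dev/null | af_blocked_summary`. For the output above this prints `22.214.171.124 0` and `x.x.x.x 8`.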
… I think that’s pretty much everything, except for a few additional notes:
* Starting from a clean slate, you can get the failed auth counter for a given sending host up to 25 very quickly. At that point, the block rule is created and lasts for 15 minutes by default. No failed auths happen from that host in this 15-minute window, because the sending host is blocked and can’t reach sshd. After the 15-minute interval, the block rule is removed. Because the counter is still at the threshold, a single additional failed auth earns the sending host another 15-minute block rule. The bad auth counter is only reset by a successful login from that host.
* A block rule is only created once there have been 25 failed auths from the same IP address. This value is configurable with afctl. There is no time window associated with this policy. Therefore, a botnet with 100 hosts would be able to attempt 100 * 25 = 2,500 SSH auths against your server before all of its hosts are blocked. As there is no reliable way to know that you’re being hit by a botnet, AF cannot help you guard against this except by reducing the failed auth count threshold required for a block rule.
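To put numbers on the botnet scenario, here is a back-of-the-envelope sketch. It rests on a deliberately simple model that is an assumption, not AF’s documented behavior: every host reaches the threshold immediately, is blocked for the full interval, then fails exactly once more as soon as each block expires, and its counter is never reset. af_max_failed_auths is a hypothetical helper name.

```shell
# Rough upper bound on failed auths AF lets through under the model above.
# Arguments: hosts, failed-auth threshold, block length (minutes), window (minutes).
af_max_failed_auths() {
    hosts=$1
    threshold=$2
    block_min=$3
    window_min=$4
    # Initial burst up to the threshold, plus one failure per expired block.
    per_host=$(( threshold + window_min / block_min ))
    echo $(( hosts * per_host ))
}

af_max_failed_auths 100 25 15 0     # initial burst only: 2500
af_max_failed_auths 100 25 15 60    # over one hour: 100 * (25 + 4) = 2900
```

Lowering the threshold shrinks the initial burst, which dominates the total. Keep in mind that a single failed SSH attempt registers as several auth failures, so these figures count detected failures rather than login attempts.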