Thursday, July 28, 2011

Strange source of lockouts seen on the ISA server

A few times in our environment we have seen user account lockouts originating from the ISA servers, which are configured to require authentication before they will proxy traffic. In roughly 90% of these cases the problem is at the user's workstation. There can be several causes, but it is typically an internet-enabled application that doesn't support Windows authentication against proxies. When they receive a request to authenticate, some applications are really stupid and think they are authenticating against their remote service's servers, so they send whatever third-party username/password combination they hold to your proxy server. If that user name is the same as your domain user ID, you get locked out (I have seen this with Skype). Other applications let users enter their domain username and password, and later, after the domain password is changed, the users forget where they typed it in. In rare cases, applications find a way to cache domain credentials but don't always keep them up to date. The case I will present here is the latter, and the details are incomplete.

When you see the proxy server locking out a user, check their machine and look first at every obvious internet-enabled application. Sometimes you will find the culprit right away and can update or remove it. If you are still not sure, I recommend installing Microsoft Network Monitor (Netmon) 3.3 or higher on the workstation and running it until the next bad password attempt shows up. The advantage of this network capture software is that it can show the process name or process ID of the application doing the communication. Look at the HTTP requests; typically the culprit will be a plain-text (Basic) authentication attempt. Use this filter:

HTTP.Request.HeaderFields.ProxyAuthorization AND HTTP.Request.HeaderFields.ProxyAuthorization.Authorization.BasicAuthorization.Scheme == "Basic"
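
A Basic attempt counts as plain text because the Proxy-Authorization value is only base64-encoded "username:password", so anyone holding the capture can read it back. Here is a minimal Python sketch of that decoding. The function name and the sample credentials are made up for illustration and do not come from the capture described in this post.

```python
import base64


def decode_basic_proxy_auth(header_value):
    """Return (username, password) from a 'Basic <base64>' header value."""
    scheme, _, token = header_value.strip().partition(" ")
    if scheme.lower() != "basic" or not token:
        raise ValueError("not a Basic authorization value")
    # Basic credentials are base64("username:password"), nothing more.
    decoded = base64.b64decode(token.strip()).decode("utf-8", errors="replace")
    username, sep, password = decoded.partition(":")
    if not sep:
        raise ValueError("decoded credentials lack a ':' separator")
    return username, password


# Hypothetical header value, built here so the example is self-contained.
sample = "Basic " + base64.b64encode(b"jdoe:OldPassword1").decode("ascii")
user, password = decode_basic_proxy_auth(sample)
print(user, password)
```

Comparing the decoded user name and password against what the user remembers typing into applications is a quick way to tell a stale domain password from a third-party login that happens to share the same user name.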

In the case I am bringing up, the process name was not provided and the process ID was 4. PID 4 is the Windows System process. The packet capture showed plain-text authentication using the user's previous domain password. Since it was a system process, we looked at the system services and came up with the Akamai NetSession Interface service. This installs as a download manager or similar software that Adobe is bundling with some of its downloads. I didn't do a deep-dive inspection of the machine to see where it manages to cache this plain-text password, but it sounds like a good security project for someone to look at. If the software is grabbing domain credentials at some point, it would be nice to know what controls exist around that. In any case, this issue has come up several times in our environment with the same service, and disabling or removing the service fixes the problem. It may only occur in certain versions or in some specific use case, as we have seen it only a handful of times although more than a hundred machines have the service installed.

I hope this information is helpful in troubleshooting this type of authentication failure and lockout source. For additional information on account lockouts, you can visit my account lockout tracking general practice page.

Wednesday, July 27, 2011

The security database on the server does not have a computer account for this workstation trust relationship

This error can come up from time to time for a variety of reasons. When you get it, you usually cannot access the system in any way: local login, terminal services, RPC connections, shared folder access, etc. In that situation the computer account may have been deleted, or the system may have failed to update its password properly (memory problems, network problems, offline too long, etc.). But what if you can access the server remotely and log in, while local logins and terminal services logons are failing? You may see this problem with newer operating systems (Vista, Windows 7, and Server 2008) if you are using a disjointed DNS namespace.

First of all, what is a disjointed namespace? If your domain is mycorpdomain.com, and different sites in that same domain use other DNS zones, such as east.mycorpdomain.com and west.mycorpdomain.com, then the machines in those sites have a primary DNS suffix that differs from the AD domain name. That is a disjointed namespace (in this case, one built from subdomains). In the example I will provide, let us assume that the "primary dns suffix" setting of a machine is being pushed through group policy, either at an OU level or at an AD site level.

Let's explain the primary DNS suffix setting a little bit. It can be set in the same place where you would set the computer name or change the domain membership; just click the "More" button on that form. The primary DNS suffix is also reflected on the computer account in AD (in its DNS host name), and it is related to the machine's service principal names (used by Kerberos).

When GPOs are used to update the primary DNS suffix, there are occasions where a machine does not properly update its machine account information. You can see this by looking at the machine details with any AD search tool or ADUC (General tab -> DNS name attribute), and with the setspn.exe tool.

When the machine fails to update its information, it may show up in two places. To start with you want to look in ipconfig /all from that machine. See what the primary dns suffix is for that machine:

C:\>ipconfig /all

Windows IP Configuration

Host Name . . . . . . . . . . . . : MYMACHINE
Primary Dns Suffix . . . . . . . : east.mycorpdomain.com


Here we see the machine is set to use east. We can use Joeware's adfind to read the other important attributes:

C:\>adfind -b dc=mycorpdomain,dc=com -f "cn=MYMACHINE" dnshostname serviceprincipalname

AdFind V01.37.00cpp Joe Richards (joe@joeware.net) June 2007

Using server: myserver.mycorpdomain.com:389
Directory: Windows Server 2003

dn:CN=MYMACHINE,OU=Computers,OU=east,DC=mycorpdomain,DC=com
>dNSHostName: MYMACHINE.mycorpdomain.com
>servicePrincipalName: HOST/MYMACHINE.mycorpdomain.com
>servicePrincipalName: RestrictedKrbHost/MYMACHINE.mycorpdomain.com
>servicePrincipalName: TERMSRV/MYMACHINE.mycorpdomain.com
>servicePrincipalName: TERMSRV/MYMACHINE
>servicePrincipalName: RestrictedKrbHost/MYMACHINE
>servicePrincipalName: HOST/MYMACHINE


As you can see here, the dNSHostName attribute is not using the same primary DNS suffix that my machine is using. You may also see some mixed-up servicePrincipalName attributes, or some combination of the two. What matters is that dNSHostName, and the RestrictedKrbHost and HOST servicePrincipalNames, all match what the machine says its primary DNS suffix is. If they don't, the machine fails to find itself in AD and thinks it doesn't have a computer account, while in most cases still acting as if it can authenticate to the domain. This can typically be fixed by running a few manual setspn -A commands to add the valid servicePrincipalName values to the machine account, then rebooting.
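
The comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the suffix is the one ipconfig /all reported for the example machine, and the SPN list is copied from the adfind output above. The script only prints candidate setspn -A commands for review; it does not run anything against AD.

```python
def expected_spns(host, primary_dns_suffix,
                  services=("HOST", "RestrictedKrbHost")):
    """Build the HOST/RestrictedKrbHost SPNs the machine should carry."""
    fqdn = f"{host}.{primary_dns_suffix}"
    return [f"{svc}/{name}" for svc in services for name in (fqdn, host)]


def missing_spn_commands(host, primary_dns_suffix, current_spns):
    """Return 'setspn -A' command lines for expected SPNs that are absent."""
    have = {spn.lower() for spn in current_spns}  # SPN compare ignores case
    return [
        f"setspn -A {spn} {host}"
        for spn in expected_spns(host, primary_dns_suffix)
        if spn.lower() not in have
    ]


# Values taken from the example output above.
current = [
    "HOST/MYMACHINE.mycorpdomain.com",
    "RestrictedKrbHost/MYMACHINE.mycorpdomain.com",
    "TERMSRV/MYMACHINE.mycorpdomain.com",
    "TERMSRV/MYMACHINE",
    "RestrictedKrbHost/MYMACHINE",
    "HOST/MYMACHINE",
]
for command in missing_spn_commands("MYMACHINE", "east.mycorpdomain.com", current):
    print(command)
```

For the example machine this lists the two east.mycorpdomain.com SPNs (HOST and RestrictedKrbHost) as missing. It does not touch dNSHostName, which has to be checked separately.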

The problem arises somewhere in the delay between the change to the primary DNS suffix (which only takes effect after a reboot) and the updates to the machine account. I have been told that SPN updates may be done as more than one transaction, and it's possible they were written in more than one place, causing a last-writer-wins situation that overwrites some of the other updates. That is why you may see some SPNs with the correct disjointed DNS name while others are missing it.

To mitigate this problem, if the computer account's DNS suffix is correct (you will see this mostly on Vista machines), you can script a job that checks machine accounts for mismatches and fixes them proactively. For Windows 7 and higher it is more difficult, as the computer account's DNS suffix is wrong. Generally, though, you will see this problem on newly built machines, as it only occurs just after group policy first applies. So spreading knowledge of the problem to the people who build machines is a useful way to fix it before users see it.
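
A rough sketch of such a checking job, in Python, might parse saved adfind output and flag accounts whose fully qualified HOST or RestrictedKrbHost SPNs disagree with dNSHostName. It assumes the text format shown in the adfind example earlier ("dn:" lines followed by ">attribute: value" lines) and, as noted above, only helps when dNSHostName itself is correct. The function names and the sample record are hypothetical.

```python
def parse_adfind(text):
    """Parse adfind text output into a list of {'dn', 'attrs'} records."""
    records = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("dn:"):
            current = {"dn": line[3:].strip(), "attrs": {}}
            records.append(current)
        elif line.startswith(">") and current is not None:
            name, sep, value = line[1:].partition(":")
            if sep:
                key = name.strip().lower()
                current["attrs"].setdefault(key, []).append(value.strip())
    return records


def mismatched_spns(record, services=("host", "restrictedkrbhost")):
    """Return fully qualified SPNs that do not match the record's dNSHostName."""
    hostnames = record["attrs"].get("dnshostname", [])
    if not hostnames:
        return []
    fqdn = hostnames[0].lower()
    bad = []
    for spn in record["attrs"].get("serviceprincipalname", []):
        service, _, target = spn.partition("/")
        # Short names (no dot) carry no DNS suffix, so skip them.
        if service.lower() in services and "." in target and target.lower() != fqdn:
            bad.append(spn)
    return bad


# Hypothetical record with one stale SPN, for illustration only.
sample_output = """\
dn:CN=MYMACHINE,OU=Computers,OU=east,DC=mycorpdomain,DC=com
>dNSHostName: MYMACHINE.east.mycorpdomain.com
>servicePrincipalName: HOST/MYMACHINE.east.mycorpdomain.com
>servicePrincipalName: RestrictedKrbHost/MYMACHINE.mycorpdomain.com
>servicePrincipalName: HOST/MYMACHINE
"""
for rec in parse_adfind(sample_output):
    for spn in mismatched_spns(rec):
        print(rec["dn"], "->", spn)
```

The flagged accounts can then be reviewed and corrected with setspn before the users of those machines run into the logon error.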

Friday, July 22, 2011

Powershell sometimes less powerful than a snail

Today I was working on a simple script to delete old log files on remote machines. Since they are legacy boxes, they don't have PowerShell remoting capabilities, so I was going for basic access via admin shares. I wanted to clean up files that contain a date in the file name, so it was pretty simple: a wildcard-match delete operation and a few date operations, something that should take a few seconds to throw together. Just to be on the safe side, I ran my work through the PowerGUI Script Editor in debug mode and ended up with a hang on a deletion operation. That's weird, since it's only handling a few hundred files, something cmd's del command would knock out in no time. It seems that with the changes that came with PowerShell (.NET integration and the object-oriented way DEL and DIR were replaced with Remove-Item and Get-ChildItem), the operations in the background became ridiculously inefficient in some cases. In my example I'm accessing a server that has a ping response time of 249ms from the machine where I'm running the debugger. From another machine (my script jobs server) I have a 4ms response time to the target. Notice how this works for me:

From the 4ms server doing a get-childitem operation on a folder with 940 files in it

Friday, July 22, 2011 2:04:50 AM
Friday, July 22, 2011 2:05:02 AM


12 seconds is a bit slow, but tolerable. Let's see how cmd compares:

get-date; cmd /C "dir \\remoteserver\c$\mydir >c:\temp\somebsfile.txt"; get-date

Friday, July 22, 2011 2:18:08 AM
Friday, July 22, 2011 2:18:09 AM


1 second or less. Much nicer. So, how about that debug machine 249ms away from the target?

Let's start with cmd /C DIR, because my PowerShell Get-ChildItem has been running so long already:

Friday, July 22, 2011 3:08:54 PM
Friday, July 22, 2011 3:09:01 PM


and we're waiting for powershell.....

waiting....

15 minutes gone by.....

still waiting....

is this still processing????

firing up netmon.....

yeah it's still pulling data over SMB....

SMB query path info every 300ms or so...

comes out like this

Friday, July 22, 2011 3:05:05 PM
Friday, July 22, 2011 3:30:16 PM
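
To put those timestamps in perspective, here is a small back-of-the-envelope calculation in Python. It assumes the slow run listed the same 940-file folder as the 4ms test (the post does not say so explicitly) and uses the 249ms ping time as a stand-in for one SMB round trip.

```python
from datetime import datetime

# Timestamps copied from the runs above (249ms link).
ps_start = datetime(2011, 7, 22, 15, 5, 5)
ps_end = datetime(2011, 7, 22, 15, 30, 16)
cmd_start = datetime(2011, 7, 22, 15, 8, 54)
cmd_end = datetime(2011, 7, 22, 15, 9, 1)

FILE_COUNT = 940         # assumption: same folder as the 4ms test
ROUND_TRIP_SECONDS = 0.249  # ping time, used as a rough SMB round trip

ps_seconds = (ps_end - ps_start).total_seconds()
cmd_seconds = (cmd_end - cmd_start).total_seconds()
per_file = ps_seconds / FILE_COUNT
round_trips = per_file / ROUND_TRIP_SECONDS

print(f"PowerShell listing: {ps_seconds:.0f} s, cmd dir: {cmd_seconds:.0f} s")
print(f"PowerShell cost per file: {per_file:.2f} s "
      f"(about {round_trips:.1f} round trips)")
```

With these inputs the PowerShell run took 1511 seconds against 7 seconds for cmd, which works out to about 1.6 seconds per file, or roughly 6.5 round trips at 249ms each. That is in line with the per-file SMB path queries seen in netmon, and it suggests the cost scales with file count times latency rather than with the amount of data.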


Bottom line: PowerShell remoting is probably a better way if available; otherwise fall back to CMD.
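
For illustration only, the selection logic of the cleanup job (pick files whose name carries a date older than some cutoff) can be sketched in Python. This is not the script from this post. The YYYYMMDD pattern, the function name, and the sample file names are all assumptions, and the sketch works on a plain list of names, so the listing itself can come from a single dir call.

```python
import re
from datetime import date, datetime, timedelta

# Assumed naming scheme: an 8-digit YYYYMMDD stamp somewhere in the name.
DATE_IN_NAME = re.compile(r"(\d{8})")


def expired_names(names, today, keep_days):
    """Return names whose embedded date is older than keep_days before today."""
    cutoff = today - timedelta(days=keep_days)
    expired = []
    for name in names:
        match = DATE_IN_NAME.search(name)
        if not match:
            continue  # no date stamp, leave the file alone
        try:
            stamp = datetime.strptime(match.group(1), "%Y%m%d").date()
        except ValueError:
            continue  # eight digits that are not a valid date
        if stamp < cutoff:
            expired.append(name)
    return expired


# Hypothetical file names for illustration.
names = ["app_20110601.log", "app_20110720.log", "readme.txt", "app_20119999.log"]
print(expired_names(names, today=date(2011, 7, 22), keep_days=30))
```

Working from names alone keeps the number of remote calls down to one listing plus one delete per expired file, which matters on a high-latency link like the 249ms one above.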