SharePoint 2010 – Drive Space Conundrum

One of the new capabilities of SharePoint 2010 that comes in handy is the Health Monitoring alerts that pop up on the front page of Central Admin. One thing you might run into is when you start to run out of disk space.  You’ll probably see something similar to this:

Always something that you want to see while you’re working on your environment right? Not so much.

For some reason it always seems that just when things are going well, profiles are synchronizing, users are starting to engage the SharePoint platform, and boom, whammo, the file system fills up with log files, trace logs and event logs. So just a gentle reminder to examine where your log files are and consider moving them to an alternate drive than the core OS drive.

How do I do this you ask?  Pretty simply…

First off, decide what your disk plan is for your SharePoint Servers – hopefully you’ve got more than just a single drive in your system, if not slap on an extra drive (either physical or virtual) for log files, or if you’ve got a SAN handy, request an extra drive on separate spindles from where your data is stored and have them zone it for your SharePoint server to be added for offloading.

Next, for your IIS logs, simply open up IIS and go to the server name (in my case SP2010WFE-01) and then in the main information pane of IIS, scroll down to Logging underneath IIS.

Locate the Directory location and modify it to the location that you’ve setup for log files, in my case I’ve added an additional drive to my SharePoint server with the logical volume “E:”

Once IIS creates the structure, you’ll want to copy over old log files from your core OS drive (C:inetpublogs) to your new location.

Next up, Trace Logs for the Unified Logging System…

Within SharePoint Central Admin, navigate to Monitoring, within Reporting select “Configure Diagnostic Logging”.

Direct Link – http://<NetBiosNameOfSharePointServer&gt;:<CentralAdminPort>/_admin/metrics.aspx

Scroll down to the Trace Log section where you’ll see something like this:

Simply input the drive that you’d prefer to use, in my case replacing %CommonProgramFiles% with “E:Program Filescommon files”, and you end up with something like this:

Go ahead and copy over the contents of the Logs file on the original instance to the new instance to consolidate your log files.

Lastly, moving your event logs to the log drive is definitely a consideration to make – especially the Security Log file as this will grow quickly once you’ve implemented Kerberos and opened your system to your user base (NTLM spawns quite a few security events too). Out of the box you’ll see your event logs like this:

Application Event Logs

Security Event Logs

System Event Logs

Simply modify the “File” location to the new location where you are looking to store your files, in my case I use “E:Windows EventsLogs” as the directory followed by the appropriate event log file name. This is documented in: http://support.microsoft.com/kb/216169

Further, to ensure that log files don’t explode, leverage the “AutoBackupLogFiles” property within the Application Events (you’ll have to add this to the Security and System Event Logs, simple DWORD property). Setting the value to “1” or any value other than “0” will create backup files in the file location specified.

This is documented in http://support.microsoft.com/kb/312571 (though it’s specific to 2000/2003 server, it works for 2008).

These three simple changes should assist in keeping your core OS drive lighter weight and prevent your system from a hiccup caused by a disk filling up.

Expiring Service Accounts…

Recently I read an article regarding service accounts and how they should never be set to expire.  Bold statement that in some respects I agree with.  In the context of the author, I completely understand their frustration with the identity management system not properly alerting the end user that their account was about to be disabled and in turn bringing their development system to a screeching halt. 

In the context of an enterprise environments, password and user account expiration are standard obligations that not only ever system administrator must adhere to, but every user on the domain.  From an information assurance traceability perspective, without an account sponsor for each and every domain user object, there runs a risk of information loss and accessibility to information by individuals that should not have such access.

By and far I would see the majority of user account responsibilities and issues falling on the shoulders of the system administrator.  From an availability perspective, they are the engineer that ensures the system continues to operate properly.  While they may not be the face of a system, without their diligent caretaking, the other engineers and analysts are unable to perform their duties.

One core responsibility of a system administrator is to keep a running list of user accounts that are used in their Microsoft Windows Networking Infrastructure to include the service accounts used for MOSS, SQL and any other third party software that is operating which may have adverse effects on the availability of the system if disabled.  As a part of the technical governance, one of the responsibilities of the system administrator should clearly state and define that they track user accounts used within the system (some might even call this a best practice).

An appropriate checklist to ensure user account availability would potentially include (but not be limited to):

  • Listing of all service accounts (display name, UPN, sAMAccountName, POC, etc.)
  • Where each of the accounts reside in the OU structure
  • What security groups each of the accounts belong to at the local server level, the domain level, and the enterprise level
  • If there are any DCOM modifications required for a service account to operate properly on a server
  • What the operating period for the service account is (i.e. does it have a definitive expiration date)
  • GPO policy on the particular set of service accounts
  • Password change policy timeline

In my experience, I’d say that the majority of system outages and incidents that have occurred when either a service account expires, the password aging that is required catches the administrator off guard when they forget to change it, the core network switches that provide for general connectivity go offline, or a new GPO is pushed down and inadvertently modifies security groups or other domain user object properties.  Three of these four issues can be easily mitigated by the system administrator with proper notifications and alerts.  Networks are networks and you never know when your core switch is going to have a board go bad or worse melt due to over processing (I’ve not seen the latter, but I have seen the former).

Based on the context of the environment, a system administrator should have a maintenance calendar in SharePoint linked to Outlook that users are subscribed to and receive alerts which provide pertinent information.  Such information could potentially include when the next maintenance period is and what will be accomplished during the system outage. Additionally, and maybe it’s wishful thinking on my behalf, the System Administrator hopefully has a relationship of some sort with the Domain Admins or help desk and knows what the policy states in terms of how many days an account is valid for before expiration.

Should user accounts expire or passwords age?  It depends on the context.  How you approach the issue and handle it is a separate story.  Working in the context of an enterprise system requires a higher specification of diligence to properly ensure system availability.  In a small environment or dev system, rarely do I find password aging or expiration enabled, thereby reducing the risk of availability due to AD issues.

What works for your organization?