Monthly Archives: August 2015

Full disks and databases

How do you deal with a full hard disk/volume/partition?

I have recently had a number of discussions about dealing with hard drives or databases themselves filling up on SQL Servers. There seems to be a few different opinions on how to deal with this and I’d like to share those here and give my preferred solution.

N.B. (For brevity, I will refer to disks, but mean either disk, volume or partition where applicable)

The main error that people are confronted with when either a disk or a database file has filled to the brim is something like this:

Could not allocate a new page for database 'DatabaseName' because of insufficient disk space in filegroup 'PRIMARY'. Create the necessary space by dropping objects in the filegroup, adding additional files to the filegroup, or setting autogrowth on for existing files in the filegroup.

There are a few reasons this could happen (not an exhaustive list):

  1. The database file is full but the disk holding the database file does not have enough room to accommodate the file growth required for auto-growth to occur.
  2. The database file is full but the disk holding the database file has a quota activated that has been reached, so does not have enough room to accommodate the file growth required for auto-growth to occur.
  3. The database file is full but database does not have auto-growth setup.

The problem with the first two situations I mentioned is that they are generally not easy/quick situations to fix. If a disk is full, then extending that disk may not be possible, or only possible with a lot of administrative overhead. The same applies for disk quotas, even more so with quotas, as quotas usually have some sort of “internal politics” attached to them.

The third reason occurs surprisingly often. For some reason, certain companies just don’t want their databases to grow automatically. They want to keep everything under human control, so they turn off all sorts of automatic support options like auto-growth. I also heard a customer mention that they preferred the data files to be static in size and it was easier for them to extend these than the disk (again confirming my point about disk/partition/volume extensions being difficult).

Regardless of the reasoning for databases/database files actually filling up and the error occurring, what can be done in an attempt to help prevent this from happening, or at least allow for a speedier resolution of the issue?

Dummy/Buffer files

If you have a volume size limitation and need to ensure that you can quickly arrange for more disk space very quickly, one option is to use “dummy” or “buffer” files. These are files that are of a certain size, but have no actual content. They are created on the disk that you want to have “insurance” for and kept around until disk space is either low (a pre-defined threshold) or the disk is actually full. When either of these situations occurs, you simply delete one or more of these “dummy” files and you immediately have more disk space.

To create these files you can use fsutil a great little tool that is bundled with windows and allows sysadmins to do a number of things with disks. You call fsutil from the command line and provide it with the correct parameters and an “empty” file will be created.

Syntax of fsutil (length is in bytes)
fsutil file createnew <filename> <length>

Example to create a single 50MB file
fsutil file createnew d:\temp\dummyfile 52428800

dummy_file

Once the file(s) has been created you have a buffer of space so that you can quickly react to a disk filling up (even with some sort of scripted automation). Of course, with the consideration that the disk now has less space available for “real” data, meaning a full disk can actually occur quicker.

Using Max Size Limits for a Database

Among the settings that a database can have, you can control in what size “steps” a database should grow by if auto-growth is activated. You can also set a maximum size for the database file(s) to grow to.  This is similar to the idea of a disk quota, but on a database file level. Setting this limit can be useful to prevent a database from growing so large that it will completely fill a disk. I actually prefer the idea of setting an upper limit on the database file(s) rather than allowing unlimited growth. The reason for my preference is down to the fact that it is easier to change the maximum growth limitation than it is to extend a disk.

Setting the maximum size for a database is a simple operation, either in SSMS via the GUI:

database_autogrowth

Or via T-SQL:

ALTER DATABASE [sandbox] MODIFY FILE ( NAME = N'sandbox', MAXSIZE = 102400KB )

With this soft-limit for a database file, you are able to control file growth with greater precision. The main benefit is, that you can increase the maxsize at a moment’s notice and allow normal operations to continue if a database file should accidentally/suddenly fill up. This is not an ideal solution, but still improves control and allows normal operations to resume.

I like dummies!

I prefer the dummy file solution out of the two that I have explained above. Using a dummy file allows you to have auto-growth turned on (to cover those immediate growth requirements), while stilling offering you the chance to also avoid a disk from filling up completely through an auto-growth event.

Using the maxsize solution is ok, so long as you are able to connect to the instance to extend the maxsize. This is often possible for user databases, but what about TempDB? I have experienced systems where the system has locked up because TempDB is full and attempts to connect were fruitless (Remote DAC was not activated and no local access to the SQL instance), so it was not possible to extend TempDB.

Having a few dummy files around would allow me to quickly supply TempDB with the required disk space for an auto-growth and for me to connect to the instance again and take charge of the situation. The other alternative (which customers often chose in the past), was to reboot the server / restart the instance. The issue with this being, the culprit for the “crash” is no longer identifiable, unless you have a solution implemented to do just that (which I will blog about soon!).

Conclusion

In my experience most installations / customers do not have a solution implemented to counteract a disk/database filling up. The two solutions I have explained are good starting points to help avoid catastrophe in future. Please take these ideas and see if you can improve the stability and availability of your systems using them, or even better, provide your ideas/solutions in the remarks of this blogpost.

Thanks for reading and have fun with your diskspace!

The case of the non-starting SQL Agent Job

I ran into an interesting situation recently that involved a SQL Agent job that would not/could not run. Looking at the job settings everything seemed fine, yet the SQL Agent deemed the job “un-runnable”.

As we can see here “My Test Job” is enabled, and scheduled, but has no “Next Run” and is deemed not “Runnable”.

Job Activity Monitor

Taking a closer look at the job properties, we can further see that the job is enabled:

Job Properties - General_3

We also see that there really is a schedule for the job:

Job Properties - Schedules

And on further investigation I saw something that I have never really looked at before. The “Targets” properties:

There are no entries here, but more importantly, the radio button “Target local server” is not selected. This turned out to be the cause of the job not running!

If I try and exit the properties with “OK” (which you should never do unless you have intentionally changed a setting somewhere), then we are presented with a clear error message informing us about the missing Target:

Job Save Error

“You must specify the Target Servers on which this multi server job will execute.”

The “Targets” section of a SQL Agent Job is something that I have never delved into in any detail, but is a long standing feature to allow for administration of large server environments. You can read more into Multiserver Environment Administration on MSDN.

The reason for the setting being missing was through an incomplete scripting process for the affected job. If you script out a SQL Agent Job in SQL Server Management Studio, you will see that one of the last steps in the script is a system stored procedure “msdb.dbo.sp_add_jobserver”:

EXEC msdb.dbo.sp_add_jobserver @job_id = @jobId, @server_name = N'(local)'

This stored procedure associates a job to a jobserver, which in my experience has always been “(local)”. Obviously, this will be different if you are using master servers and target servers in your environment.

As soon as I had set the target server to local for the job, everything was back to normal and the job ran again according to it’s schedule.