Is It Time to Swap Corporate File Servers for Disk-Based Archiving Systems?


A frequently reiterated statistic is the high percentage of infrequently accessed or static data that resides on production storage systems - up to 80% according to some estimates. In fact, a recent joint study conducted over 3 months by researchers from NetApp and the University of California and presented at USENIX 2008 found that over 90% of the 22 TB of data stored on two enterprise file servers was rarely accessed after it was stored. Specifically, 66% of the files were re-opened only once and 95% were re-opened fewer than five times.

This infrequent access of production data stored on these corporate file servers raises a serious question about the type of storage systems that companies should select to host their file server data going forward. While using enterprise file servers with high performance disk drives certainly makes sense for hosting data that is frequently accessed, this report can lead one to extrapolate that the vast majority of data now found on corporate file servers does not fall into this category.

In particular, the study calls into question whether high performance storage systems are needed for many of the day-to-day file service workloads in most organizations, such as mail servers and home directories. Instead, alternative storage systems such as the Permabit Enterprise Archive, which present a file system interface and perform archiving functions, may be better fits in these environments.

I am not suggesting that companies should take such a leap without first doing some research, or that all corporate data should be moved onto these systems. But if the 90%+ infrequently accessed statistic holds true across multiple companies, not just for the two file servers examined in this study, companies should take a serious look at the activity on their own file servers to determine whether certain disk-based archiving systems are a more appropriate primary file serving target for certain applications.
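As a starting point for that kind of research, the sketch below walks a directory tree and counts how many files (and how many bytes) have not been accessed within a chosen window. It is not the method used in the study, which traced live CIFS traffic; it is a rough approximation that relies on file system access times. The function name `access_age_report` and the 90-day default are hypothetical choices for illustration, and the results are only meaningful if the volume actually records access times (many systems mount with `noatime` or `relatime`, which makes `st_atime` stale or coarse).

```python
import os
import time


def access_age_report(root, stale_days=90, now=None):
    """Count files under `root` whose last access is older than `stale_days`.

    Returns (total_files, stale_files, total_bytes, stale_bytes).
    Assumption: the file system records access times reliably.
    """
    now = time.time() if now is None else now
    cutoff = now - stale_days * 86400  # seconds per day

    total_files = stale_files = 0
    total_bytes = stale_bytes = 0

    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # do not follow symlinks
            except OSError:
                continue  # file vanished or is unreadable; skip it

            total_files += 1
            total_bytes += st.st_size
            if st.st_atime < cutoff:
                stale_files += 1
                stale_bytes += st.st_size

    return total_files, stale_files, total_bytes, stale_bytes
```

Running this against a home directory share and dividing `stale_files` by `total_files` gives a first, rough indication of whether a given share resembles the access pattern the study describes. A trace of actual client opens would be needed before drawing firmer conclusions.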

Mail servers and home directories, as shown in Table 4 of this study, are specific candidates that companies may consider moving from current corporate file servers to disk-based archiving storage systems. Practical reasons why the Permabit Enterprise Archive storage system would support the workloads of these specific corporate functions include:

  • It presents CIFS, NFS and WebDAV interfaces to corporate servers
  • It can store data as a regular file (does not need to be a WORM/archived format)
  • The system is managed through a web browser
  • It supports multiple concurrent connections from multiple clients
  • Its read, write and read/write I/O throughput ratios are in line with the results published in this study

It is perhaps because of uncertainty about this last point, more than any other, that companies have largely stayed with high performance file servers and avoided competing solutions. Companies lack knowledge about the performance characteristics of these workloads on their corporate file servers. As a result, they do not move these directories to alternative storage systems for fear of enraging their user base. This study points out that mail server and home directory workloads are not so demanding that companies cannot consider using a reasonably high performance disk-based archiving system in this role. As brought out in the first paragraph of Section 4.7 of the study, most files (76.1%) are opened by only one client, and 92.7% of files are only ever opened by two or fewer clients. This illustrates that the behavior of these files is more in line with the behavior of files stored on archiving systems than on corporate file servers anyway.

One should not hastily draw too many conclusions from a study that looked at only two enterprise file servers in one company. However, it should give companies pause going forward. File server performance and the ability of file servers to fit seamlessly into corporate LANs are major corporate concerns. But if the majority of the files stored on these file servers are accessed fewer than five times, with most files accessed by only one client after they are stored, it raises the question, "Is it time to use disk-based archiving storage systems as primary file servers for a majority of the company's files since that is essentially the role that current corporate file servers are serving anyway?"


Entry Sponsorship

This entry is sponsored by Permabit Technology Corporation

About Permabit Technology Corporation Blog

    Permabit Enterprise Archive is the only enterprise-class, disk-based storage system to archive petabytes of information at a fraction of the cost of tape. The system combines space saving compression and deduplication with multi-petabyte scalability to provide Scalable Data Reduction™ (SDR)