White Paper

NAS Virtualization

The Recent Past

In the mid-eighties organizations were empowering their employees by equipping them with computing resources at the desktop. This resulted in a major productivity boost for employees but added greatly to the headaches of IT administrators. One of the woes IT professionals had to wrestle with was the challenge of addressing and managing the increasing demand for storage, both personal and shared.

As we moved into the 1990s, networks and end-user populations grew exponentially, and so did the issues of data management, performance, scaling, downtime and recoverability. This was the environment that gave rise to Network Attached Storage (NAS).

Early generation NAS servers were basically generic servers that were customized to do one thing very well – serve files. NAS servers provided much needed relief for unwieldy management, configuration and availability concerns by using specialized techniques that streamlined configuration and installation. Early generation NAS servers allowed storage administrators to focus their time on managing files and file structures and less time on managing the physical storage media where the files were stored. Introducing a NAS server into an existing network was a simple and straightforward process. They were easy to manage and maintain, used mainly for department or group file and print serving, file sharing and storage consolidation. Installed in minutes and requiring minimal downtime, NAS also enabled data sharing in a heterogeneous environment.

The number of NAS installations began growing rapidly in response to the explosive storage demands. Their plug-and-play nature and flexibility made them incredibly attractive. Unfortunately, as NAS installations grew, their limitations became evident. Early generation NAS servers did not scale well, making them unsuitable for large enterprise needs.

One of the largest contributors to the NAS scaling problems was the inability of a file system to scale beyond a single NAS server. This issue had a dramatic impact on availability and on how and when storage administrators performed their functions. When additional storage or performance was required for a given file system, and the existing server was at capacity, a new server had to be added. This involved adding a server, decommissioning the original server and the applications that used the data, copying some of the existing data to the new server, modifying the shares / mount points on each and every desktop system to indicate the new data location, and then reactivating all resources with the new configuration. This was very disruptive to the business. Because it was so painful to scale, it was typical to overprovision NAS servers with excess capacity or processing power to increase the time between server additions. This was inefficient and costly.

All in all, first generation NAS capabilities did not keep up with growing demand. The scaling problems primarily relegated NAS to the department / group level and kept it out of the corporate processing environment. Its ease-of-use appeal was significantly diminished by the resurfacing of issues that NAS was originally designed to eliminate.

While NAS servers were being implemented at the department / group level, another storage architecture appeared on the scene at the enterprise level – Storage Area Network (SAN). SANs separated servers from storage by placing all of the storage devices on a block oriented fabric accessible to all servers in the fabric. Unlike their NAS counterparts with file-based management, SANs provided management of storage resources at the block and device level. There is no file system associated with a SAN. The lack of a file system allowed SANs to easily grow in capacity and easily attach new servers. This approach helped to address scalability, availability and performance issues.

As SANs grew in capacity and complexity, SAN management issues grew accordingly. Since SAN is implemented at the block and device level, and SAN has no knowledge of files or file systems, storage administrators were forced to manage the SAN at the very lowest level of components – LUNs, switch ports, disk drives, SCSI and Fibre Channel HBAs, etc. All data was managed at the block level and not the file level. As the SAN fabric grew the management of the fabric quickly got out of hand.

In an effort to assuage the pain of SAN management, a technique called storage virtualization became increasingly popular. The concept is aimed at allowing administrators to treat storage devices as a general pool of storage. SAN virtualization software creates a virtualized storage layer between the server operating system and the physical storage. This storage layer creates a virtualized storage pool, which is presented to the server operating system as one large expandable volume or disk. On the back end of the virtualization layer, physical capacity can be added and storage dynamically managed without modifying the application or users.

Virtualization was well suited to SANs. Administrators understood how to manage physical volumes, and learning to manage virtual volumes was relatively simple. SANs had the most to gain in simplification and ease of management. Many different approaches to virtualization emerged: in-band, out-of-band, virtualization appliances, virtualization software, etc. Some of these approaches to virtualization were better than others. All provided some relief to the storage administrator, but none totally eliminated the complexity and inefficiency of overall storage management. Storage professionals continued looking for a way to simplify storage management.

The Present

Today's NAS systems boast increased functionality, capacity and performance. Because of the increased functionality, NAS solutions are now being implemented in applications never dreamed of in the recent past – mission critical, customer facing, highly

Many companies have developed different approaches to solving the NAS scaling and management issues. Some have developed larger, faster NAS boxes, some have developed file switches and application switches that aggregate or virtualize existing NAS implementations, and some have developed software based NAS servers that may be host based, fabric based, or standalone. As a result, users must wade through a myriad of promises and capabilities that vary with assorted solution approaches and vendors. The problem with most of the attempts available today is that most new NAS servers either utilize the same limited NAS architecture of the nineties, or add extra software or hardware layers in the storage architecture that add complexity and overhead.

Let's look at some of the recent NAS solutions:

Monolithic Server Based
This traditional NAS design is built around a thin server concept with an OS optimized for fast file I/O operations, coupled with back-end RAID storage. Network clients access the NAS server via a 10/100/1000 Ethernet connection. The monolithic designs deliver the traditional NAS benefits of easy deployment and centralized file storage. The newer offerings claim increased storage capacity and increased performance, but these designs do nothing to address the fundamental limitation of NAS technology, that each server requires its own single file system, nor do they provide centralized management of geographically distributed systems.

Software Based
Software based providers offer software packages that turn standard off-the-shelf Windows and UNIX servers into NAS devices. The software provides the NAS functionality and NAS management capabilities. Because they are implemented on standard server technology, and they are added as an application on top of the server OS, they are usually lower performance than commercially available NAS servers. Similar to monolithic NAS architectures, software-based solutions are still hampered by the scalability, performance and management issues that are inherent in traditional NAS designs.

NAS Aggregation
Much of the startup activity is based around this new aggregation-server approach to NAS technology. The aggregation servers front-end installed traditional NAS servers. By residing between the clients and the existing NAS storage, they aggregate or virtualize the storage to the clients. This approach to building NAS solutions virtualizes all storage resources across multiple NAS servers. This technique is viewed as complicating an already complex situation. Aggregation devices may limit performance, introduce additional processing latency, add risk to data integrity due to complex locking schemes and may present interoperability and certification issues. Aggregation devices do solve some of the management and scaling issues, but they potentially introduce more problems than they solve.

Overall, higher capacities and improved management can be achieved with the new players and their offerings, but the old problems are still present and some new problems are introduced.

Even though NAS has evolved and experienced explosive growth in capabilities, the old nagging problems of scaling and management are more prominent today than ever before. Users are demanding a solution. The industry has seen what virtualization did for SAN management, and vendors are now looking for ways to leverage those techniques for simplifying NAS scaling and management issues.

The New NAS

The industry is demanding a new breed of NAS storage solution. This solution would not only meet the startling demand for capacity but would also provide crucial management features for enabling NAS to successfully serve in the strategic, mission-critical role that storage now plays.

The industry needs a solution built from the ground up to deliver the full feature set required with efficiency and simplicity. There is no efficient method to retrofit existing technologies and still adequately address today's 24x7 IT environment, in which capacity is on average doubling annually. The solution must be designed with virtualization features integral to the architecture to enable simplified management, eliminate inefficiencies and minimize overhead.

Spinnaker Networks has developed the Spinnaker SpinServer to meet the demands of the NAS market. Spinnaker delivers next-generation NAS solutions that scale and perform to meet the requirements of fast-changing enterprise and service provider environments. Spinnaker has introduced the SpinServer operating system, SpinFS, which incorporates many virtual features into the base system architecture to provide a complete solution to the scaling, performance and management problems that have plagued the industry for years. SpinFS implements a 2-stage distributed architecture, which separates network functions from disk functions. SpinFS supports clustering of SpinServers into a single storage resource, so that capacity and performance for a given file system may be increased transparently and non-disruptively.

Many of these features and capabilities are delivered by employing the virtual NAS functions available in Spinnaker Networks' SpinServer and SpinFS:

  • Transparent multi-server scaling of capacity and performance
  • Multi-server scaling of capacity and performance of a single file system
  • Global File System across multiple servers
  • Application specific storage pools
  • Non-disruptive, online data movement
  • Segregation of clients for secure resource sharing
  • Segregation of clients for logical management
  • Transparent access-port failover
  • Non-disruptive client access load distribution

Virtual Features

Virtual File System
The SpinServer's virtual file system (VFS) can be thought of as the universal storage container of the Spinnaker storage system. A VFS can consist of a variety of files that can be viewed and manipulated as a singular storage container. A VFS may be assigned to a user, group of users or applications. VFSes have quotas and access permissions. Clients access files as they normally would using a standard NAS server using shares or mount points. The client files are stored in VFSes on the SpinServer. SpinFS takes responsibility for virtualizing or mapping the clients' actual shares or mount points to a particular VFS within the Spinnaker storage system. SpinFS provides a de-coupling of the actual user's share or name space from the physical location of where the data or VFS resides. Because of this decoupling function, changes made to the VFS on the server are transparent to the user. The users are unaware of the actual physical location of their data. It may be on a local SpinServer or a SpinServer in a different state. The Virtual File System plays a big part in eliminating location dependency and downtime required for reconfiguration and scaling because the file access has been de-coupled from the physical storage location.
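
The decoupling described above can be pictured as a lookup table that sits between the client-visible share name and the physical location of the VFS. The following Python sketch is illustrative only: the names `Namespace`, `VfsLocation`, `export`, `resolve` and `relocate`, as well as the server and pool names, are hypothetical and are not SpinFS interfaces.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VfsLocation:
    """Hypothetical physical placement of a VFS."""
    server: str
    storage_pool: str


class Namespace:
    """Toy share-to-VFS indirection table (illustrative, not SpinFS code)."""

    def __init__(self):
        self._share_to_vfs = {}   # client-visible share -> VFS name
        self._vfs_location = {}   # VFS name -> current physical location

    def export(self, share, vfs, location):
        """Publish a VFS under a share name that clients mount."""
        self._share_to_vfs[share] = vfs
        self._vfs_location[vfs] = location

    def resolve(self, share):
        """Return where the data behind a share currently lives."""
        return self._vfs_location[self._share_to_vfs[share]]

    def relocate(self, vfs, new_location):
        """Record a new physical location; share names are left untouched."""
        if vfs not in self._vfs_location:
            raise KeyError(f"unknown VFS: {vfs}")
        self._vfs_location[vfs] = new_location


ns = Namespace()
ns.export("/home/eng", "vfs_eng", VfsLocation("spinserver-a", "pool-1"))
before = ns.resolve("/home/eng")

# The VFS moves to another server; the client still uses "/home/eng".
ns.relocate("vfs_eng", VfsLocation("spinserver-b", "pool-7"))
after = ns.resolve("/home/eng")
print(before.server, "->", after.server)
```

Because clients only ever hold the share name, a change to the second table is invisible to them, which is the property the text attributes to the VFS.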

Virtual Server
The virtual server (VS) logically groups various storage resources into one single virtual server that can be assigned and relegated to a particular user community. Typically a virtual server would be configured with client access ports, VFSes, and an administrator. Virtual servers can span multiple physical servers that may be geographically dispersed. The virtual server handles the presentation of storage resources as a combined whole to the clients. Virtual servers provide segmentation and logical isolation of users on a shared set of resources, securely and transparently. This works by associating each virtual server with a set of network ports and a set of virtual file systems. Only requests from ports configured on a virtual server may access VFSes stored on that virtual server. Users only see VFSes associated with their respective virtual server. A virtual server may have one or more administrators, allowing multiple levels of storage management.
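
The port-and-VFS association described above amounts to a simple membership check. The sketch below is a minimal model under that assumption; `VirtualServer`, `may_access`, `visible_vfses` and all port, VFS and administrator names are hypothetical, not part of any Spinnaker product interface.

```python
from dataclasses import dataclass, field


@dataclass
class VirtualServer:
    """Toy virtual server: a named group of ports, VFSes and administrators."""
    name: str
    ports: set = field(default_factory=set)
    vfses: set = field(default_factory=set)
    administrators: set = field(default_factory=set)

    def may_access(self, port, vfs):
        """A request is allowed only if both the port and the VFS belong here."""
        return port in self.ports and vfs in self.vfses


def visible_vfses(virtual_servers, port):
    """Return the VFSes a client arriving on the given port is able to see."""
    visible = set()
    for vs in virtual_servers:
        if port in vs.ports:
            visible |= vs.vfses
    return visible


finance = VirtualServer(
    "finance", ports={"a:eth0"}, vfses={"vfs_ledger"},
    administrators={"fin_admin"},
)
engineering = VirtualServer(
    "engineering", ports={"a:eth1", "b:eth0"}, vfses={"vfs_src", "vfs_builds"},
)
servers = [finance, engineering]

allowed = finance.may_access("a:eth0", "vfs_ledger")
denied = finance.may_access("a:eth1", "vfs_ledger")
eng_view = visible_vfses(servers, "b:eth0")
```

Note that the engineering virtual server in this example spans ports on two physical servers (`a` and `b`), mirroring the statement that a virtual server can span multiple SpinServers.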

Virtual Interface
The virtual interface (VI) provides client access to a virtual server. A virtual interface is mapped to one physical interface on a SpinServer. The virtual interface is then assigned to a specific user or user community. End users are not aware of the existence of virtual servers or virtual interfaces. A virtual server may have one or more virtual interfaces assigned. Since a virtual server may span multiple SpinServers, a single virtual server may have many virtual interfaces assigned on multiple SpinServers. The virtual interface is a straightforward but very important part of the SpinServer virtual offering because it helps preserve performance and non-disruptive client access, and supports high-availability configurations.

Storage Pool

A storage pool (SP) is the Spinnaker adaptation of aggregating physical storage media into a logical storage pool. A storage pool is comprised of one or more storage units, each of which is a single representation of an entire RAID set.

RAID set --> storage unit --> storage pool

A storage pool can vary in size from 0.5 TB to 22 TB. Because storage pools are comprised of physical storage, storage pools may take on the characteristics of the underlying physical media. Storage pools are contained completely within a SpinServer. Storage units may be added to a storage pool online and without disruption. VFSes may take advantage of new capacity added to a storage pool immediately.
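
The RAID set, storage unit and storage pool hierarchy can be sketched as a small data model. This is an illustration under stated assumptions: the class and field names are hypothetical, and treating the 0.5 TB and 22 TB figures from the text as bounds that are checked in code is an assumption of this sketch, not a documented product behavior.

```python
from dataclasses import dataclass, field

# Size range quoted in the text; enforcing it here is an assumption.
MIN_POOL_TB = 0.5
MAX_POOL_TB = 22.0


@dataclass(frozen=True)
class StorageUnit:
    """One storage unit represents one entire RAID set."""
    raid_set: str
    capacity_tb: float


@dataclass
class StoragePool:
    """Toy storage pool, contained within a single server."""
    name: str
    server: str
    units: list = field(default_factory=list)

    @property
    def capacity_tb(self):
        """Pool capacity is the sum of its storage units."""
        return sum(unit.capacity_tb for unit in self.units)

    def add_unit(self, unit):
        """Grow the pool; VFSes in the pool see the new total right away."""
        if self.capacity_tb + unit.capacity_tb > MAX_POOL_TB:
            raise ValueError("pool would exceed the assumed 22 TB maximum")
        self.units.append(unit)

    def is_valid_size(self):
        return MIN_POOL_TB <= self.capacity_tb <= MAX_POOL_TB


pool = StoragePool("pool-1", server="spinserver-a")
pool.add_unit(StorageUnit("raid-set-0", 2.0))
pool.add_unit(StorageUnit("raid-set-1", 4.0))
total_tb = pool.capacity_tb
```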

Feature and Benefit Detail

Many capabilities may be realized by utilizing the virtual features of the Spinnaker SpinServer.

Transparent Multi-server Scaling of a Single File System
SpinFS, the 2-stage distributed file system architecture, allows SpinServers to be clustered into a single storage cluster, the SpinCluster. If additional capacity is needed for a file system, an administrator may add another server to the cluster, provision storage behind the server, create a new virtual file system and assign the new VFS to the virtual server that exists on the original server. This can be performed across a 2-server cluster and ultimately up to a 512-node SpinCluster. All scaling is non-disruptive and transparent to users, whether one file system or many are scaled across the SpinCluster.

Global File System across multiple servers
As described above, virtual file systems may be created on each server and linked to a virtual server, in effect creating a global file system. The global file system may be on one or hundreds of servers, local or remote. SpinFS takes responsibility for mapping users' shares or mount points to the appropriate VFS locations transparently to the users.

Application Specific Storage Pools
Storage administrators can actually create application specific storage pools. By supporting different RAID configurations, different drive types, and different drive configurations, storage administrators will be able to configure storage based on the needs of different applications. For example, an application requiring high-performance and high-throughput such as e-commerce would assign its VFSes to storage pools comprised of low latency drives, and have a reasonably small number of drives in the RAID set for best performance. The storage pool may reside behind a SpinServer that is used solely for this application providing 100% of processing power to this application. Conversely, a large library access application would assign its VFSes to a storage pool that is configured using slower drives with higher capacity, and use RAID sets with a very large number of drives that minimize storage costs. Configuration may now be driven by business need, not by technology.

Segregation of clients for secure resource sharing
The use of virtual servers allows for segregation of clients on a shared set of storage resources. A client or multiple clients may be grouped and assigned a virtual server. The virtual server may only have access to specific resources such as ports and VFSes. Clients in a virtual server may only see and access resources within their respective virtual server. Virtual servers give administrators a secure method of sharing storage resources efficiently and transparently, improving the return on their storage investment.

Segregation of clients for logical management
Virtual servers provide client segregation as stated above. Virtual servers also may be assigned one or more administrators with varying degrees of authority. This is a powerful tool that administrators have at their disposal. Storage management may now be performed intuitively on an application, user segment, departmental or business unit basis.

Transparent access-port failover
Virtual interfaces provide client access to a virtual server. SpinFS, in conjunction with virtual interfaces, provides a means of ensuring client access availability in the event of a port failure. If a client access port fails, the virtual interfaces assigned to the failed port may be failed over to their designated backup port(s). A virtual interface may have multiple failover ports. The failover ports may be any ports in the SpinCluster. The failover is transparent and non-disruptive for stateless protocol clients. Virtual interfaces help ensure that clients retain access to their data.
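
The failover behavior described above can be sketched as follows. This is a simplified model, not Spinnaker's algorithm: the names `VirtualInterface` and `fail_over` are hypothetical, and the policy of trying backup ports in their configured order is an assumption.

```python
class VirtualInterface:
    """Toy virtual interface with an ordered list of designated backup ports."""

    def __init__(self, name, home_port, failover_ports):
        self.name = name
        self.active_port = home_port
        self.failover_ports = list(failover_ports)


def fail_over(virtual_interfaces, failed_port, healthy_ports):
    """Move every VI on the failed port to its first healthy backup port.

    Returns a dict of VI name -> new port for the interfaces that moved.
    A VI with no healthy backup stays where it is and is not reported.
    """
    moved = {}
    for vi in virtual_interfaces:
        if vi.active_port != failed_port:
            continue
        for candidate in vi.failover_ports:
            if candidate != failed_port and candidate in healthy_ports:
                vi.active_port = candidate
                moved[vi.name] = candidate
                break
    return moved


vis = [
    VirtualInterface("vi-eng", "a:eth0", ["a:eth1", "b:eth0"]),
    VirtualInterface("vi-fin", "a:eth0", ["b:eth0"]),
    VirtualInterface("vi-ops", "b:eth1", ["a:eth0"]),
]

# Port a:eth0 fails and a:eth1 is also down; only server b's ports are healthy.
moved = fail_over(vis, failed_port="a:eth0", healthy_ports={"b:eth0", "b:eth1"})
```

In this example both affected interfaces land on `b:eth0`, a port on a different server, while `vi-ops` is untouched because its active port did not fail.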

Non-disruptive client access load distribution
A virtual interface may be manually migrated to a different physical client access port on the same or a different server, non-disruptively and transparently to clients using stateless protocols. This is the key to front end client load distribution. Spinnaker recommends the best practice of defining a number of virtual interfaces per virtual server, dividing the client base into equal size segments, and assigning each segment its own virtual interface. This allows the storage administrator to migrate a virtual interface (and its associated clients) to a different or less used physical interface if load distribution becomes necessary. If warranted, a single client could be associated with each virtual interface to provide the ultimate granularity for load distribution.

Using virtual interface migration, administrators can adjust their storage landscape to meet changing demands.
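
To make the load distribution idea concrete, the sketch below models one manual migration decision: move a virtual interface from the busiest physical port to the least busy one when that narrows the gap. The text describes migration as an administrator action, so this helper only illustrates the reasoning; `port_loads`, `rebalance_once` and the client counts are hypothetical.

```python
def port_loads(vi_port, vi_clients, ports):
    """Sum the clients behind each physical port."""
    loads = {port: 0 for port in ports}
    for vi, port in vi_port.items():
        loads[port] += vi_clients[vi]
    return loads


def rebalance_once(vi_port, vi_clients, ports):
    """Migrate one VI from the busiest to the least busy port, if it helps.

    Returns (vi, old_port, new_port), or None when no move reduces the peak.
    """
    loads = port_loads(vi_port, vi_clients, ports)
    busiest = max(loads, key=loads.get)
    idlest = min(loads, key=loads.get)
    gap = loads[busiest] - loads[idlest]
    # A move helps only if the target port ends up below the old peak.
    movable = sorted(
        vi for vi, port in vi_port.items()
        if port == busiest and vi_clients[vi] < gap
    )
    if not movable:
        return None
    vi = movable[0]
    vi_port[vi] = idlest
    return vi, busiest, idlest


ports = ["a:eth0", "b:eth0"]
# Four equal client segments, one VI each, all initially on one port.
vi_clients = {"vi1": 25, "vi2": 25, "vi3": 25, "vi4": 25}
vi_port = {vi: "a:eth0" for vi in vi_clients}

moves = []
while True:
    move = rebalance_once(vi_port, vi_clients, ports)
    if move is None:
        break
    moves.append(move)

final_loads = port_loads(vi_port, vi_clients, ports)
```

With equal-size segments, as the recommended practice suggests, each migration shifts a predictable amount of load, which is why the example settles after two moves.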

Non-disruptive, online data movement
Spinnaker has devised a method of moving client data online, non-disruptively and while in use, to another location on the SpinCluster, with no modifications to users' shares or mount points. SpinFS, in conjunction with VFSes and the SpinMove software, provides this feature. SpinMove relocates a client VFS while in use to a new location in the SpinCluster. SpinFS takes responsibility for mapping the new location of the VFS to the users' shares or mount points. SpinMove provides the ability to non-disruptively move a VFS from one storage pool to another, on the same or a different server, providing relief for an overburdened server or storage pool. These abilities allow administrators to perform storage management functions when they want to, not only when activity is low or applications are offline.

Virtual NAS Implementation Examples

Below are a few examples of how the SpinServer may be implemented using some of the virtualized features in an enterprise data center.

Server De-centralization
A company has decided to pursue a decentralized storage approach by creating four (4) geographically dispersed data centers. Their biggest challenge was maintaining central management of all storage resources while making the physical change and ongoing operation transparent to the end-users. Critical to their success was maintaining the current level of staff expertise and keeping additional costs to a minimum. This meant retention of skilled and trained staff, little or no relocation, minimal hiring and avoidance of outsourcing.

SpinServers may be deployed at each center, with virtual servers defined spanning all SpinServers. Using SpinMove, the storage may be migrated to the new centers while in use and with no disruption to operations. All VFS metadata (ACLs, etc.) remains intact and requires no manual changes. The beauty of the SpinServer is that the client community has no idea of the physical location of their data; they access their data as they normally do, with no changes to their shares or mount points. Features such as the virtual interface strengthen fault tolerance, since all virtual interfaces have been assigned at least two physical interfaces in the event that one should need to fail over. Administrators may manage all storage from the corporate location using the web based GUI. This single point of management serves as the keystone for not only meeting the technical challenge of this project, but for addressing the staffing and cost-related issues as well. Under normal operating circumstances all administrative activities for all locations occur at corporate headquarters. In the event of a disaster at the corporate site, administration and operations can continue from the management console at a secondary data center location to keep business activities moving along.

The SpinServer 3300 has enabled Corporation X to successfully implement a major change in their operations and disaster recovery strategies by providing a single point of management for all storage resources regardless of location. As a result, the highly skilled staff so critical to business operations was retained because no one was asked to relocate to remote data center locations to support operations. Outsourcing and additional staffing were not necessary because all storage resources are managed by the same team located at corporate headquarters. The high availability and fault-tolerance features of the SpinServer 3300 also eased the concerns that are associated with supporting mission-critical applications. Lastly, the final project cost was significantly less than what is normally expected with this type of initiative, reducing TCO.

Server Consolidation
A large enterprise currently has multiple application servers, each with its own storage. New servers are being deployed for new applications at an ever increasing rate, and the management of the servers is quickly getting out of hand. There is unused stranded capacity on each server that can't be used by other applications. Highly skilled system staff are spending a majority of their time managing storage on each server. Backups are very time consuming because each server has its own set of backup procedures, and they must be executed serially due to limited tape resources. Since storage capacity demands will only increase, the company is seeking a solution that would enable efficient management of the servers and storage, and would reduce the time staff are spending managing storage on individual servers. This solution would also eliminate the unused stranded capacity on each server and enable efficient sharing of storage resources. They also require the ability to add capacity on demand with no outage to applications or users. Overall downtime to execute backups, restore data, perform normal maintenance or add new servers must be minimized as much as possible. Performance is another critical component and must be maintained. The company decides that implementing a NAS server for storage consolidation is the ideal solution.

Consolidations of this type are a good fit for the SpinServer 3300. As described earlier, the IT administrator can construct application specific storage pools on the SpinServer that are uniquely configured for the requirements of the individual applications. Once these are defined, virtual servers can be created to match the operating environment of the individual application servers. Equally important in terms of economic efficiencies and productivity are the streamlined management capabilities of the SpinServer 3300. Since all storage will be grouped as one cluster, it will also be viewed and managed as a single entity from a single console, greatly reducing the time and effort spent performing necessary administrative tasks. Not only can all storage be seen from a single console, but it can be viewed from an application perspective as well.

IT administrators can guarantee the elimination of downtime for normal administration functions such as adding new servers to an existing file system, adding new capacity to an existing file system and moving data from one server to another within a cluster. The end users have access to all of the data needed to perform their daily tasks with no knowledge of how storage is configured at the back-end. Performance is maximized to ensure smooth on-line sales transactions without the worry of bottlenecks during peak business times. Management is no longer a nightmare as administrators now have a single point of storage management with a single view. Since the SpinServer 3300 is so flexible, the company will be able to keep up with the anticipated growth with decisions being driven by the business, not the technology.

Data Center Consolidation
One example of the power of the new NAS virtualization architecture would be an enterprise undertaking a storage consolidation project. In this example, an enterprise with four separate physical locations (corporate, business unit 1 (BU1), business unit 2 (BU2), and business unit 3 (BU3)) each with their own data centers wishes to consolidate operations to the corporate headquarters location, while minimizing operational impact. The individual business units are to retain individual control of their storage resources while corporate needs to have

Using the capabilities of the virtual server, centrally located storage is deployed at the main data center. The storage is then assigned into virtual servers, one for each business unit, and resources such as virtual server administrator and virtual interfaces are added. Once the physical pooling and connections are in place, the NAS server is used to recreate and implement the storage environment that existed when the business units were physically distributed. The data can be moved non-disruptively to the new location, easing the burden on the storage administrator, allowing the move to occur even while the data is in use. Once the data is moved, the end-users will access their data and files as they normally did prior to the consolidation. The integrated distributed file system handles the mapping of the actual share or mount point to the appropriate VFS, and the users' shares or mounts never had to be modified – the virtual aspect of the architecture handles the mapping. The virtual server groups and manages access to VFSes spanning multiple storage pools that can reside at different geographic locations allowing the corporate administrator to have true global control and management.

Users at all four locations view and store files on their particular directories unaware of any physical changes. BU1, BU2, and BU3 all have control of their respective storage and corporate can manage and control the entire cluster from a single console as a single entity. When extra capacity is needed, it can be added and made available immediately without disruption to the end-users. The same goes for the addition of a new server to the existing file system or for moving data from one server to another. In addition, if a physical interface ever fails, the associated virtual interface may be migrated to a different physical interface on the same or different server. This occurs behind the scenes without any end-user disruption. Last but not least, the corporate administrator can associate enterprise applications with virtual servers based on the performance and availability requirements creating application specific storage pools designed to meet the needs of the business.

Summary

Spinnaker's next generation NAS solution delivers the virtual NAS capabilities considered essential in a comprehensive enterprise storage network without introducing additional issues or complexity. Management costs are significantly reduced through the SpinServer 3300's single point of management, single view of management and non-disruptive data movement and scaling features. Downtime costs associated with normal storage administration functions can be a thing of the past. The market now has a NAS solution specifically designed to meet the harsh demands of mission-critical environments in an economical manner.

Author

Mark Buczynski is senior marketing manager at Spinnaker Networks. He has 18+ years of experience in the storage, data processing and telecom industries and has held executive management, marketing and technical positions.

Submitted by Spinnaker Networks.