Life at the Edge - Protecting Enterprise Data
Here's a recipe for disaster: take large and growing volumes of critical, file-based data, stir in widely distributed networks and users, then add remote offices with poor or nonexistent backup plans and minimal IT resources. The result: valuable enterprise data is at serious and continual risk.
Ideally, this data would be stored centrally in a protected data center, where it's easier to manage, easier to protect, and much less expensive to maintain. The problem is that remote users can't access their files quickly over low-bandwidth, high-latency wide area networks (WANs). So data and storage remain distributed throughout the enterprise, creating "islands of storage" where storage management and data protection remain a constant challenge.
The most commonly employed alternatives — remote backup, thin clients, data replication, and client-based backups — work to a point, but each has significant drawbacks. New approaches built on file-caching technology are showing great promise in consolidating and protecting remote office data while meeting users' need for fast remote file access.
Challenges
Most enterprises are well aware of the value of their company's data. Government regulations, the threat of data loss, and unmanageable data all add to the pressure of protecting this critical asset.
- Government regulations. Regulatory requirements are quickly becoming 800-pound gorillas in the corporate world. For example, the Sarbanes-Oxley Act of 2002 includes general business oversight practices around data retention and corporate records. The government is serious about this — just this past July, government regulators invoked Sarbanes-Oxley in a large civil suit against executives at HealthSouth Corp.
- Threat of data loss. The terrorist attacks on the Oklahoma City federal building and New York's Twin Towers demonstrated the potential for data loss on a large scale, and natural disasters put many regional offices at risk. Less dramatic attacks from hackers and viruses can be just as costly as physical disasters, and loss from employee actions — mistaken or malicious — is common and widespread.
- Unmanageable data. Managing storage challenges even the most sophisticated enterprises. Storage area networks (SANs) enable storage administrators to consolidate block-level data, such as database information, into centrally managed storage arrays. However, according to Gartner Group, 80 percent of enterprise data is file-based (MS Office documents, e-mail attachments, graphics applications, etc.), using file system protocols that resist management and do not operate well over WANs.
In response to these challenges, corporations have carefully protected the data they directly control. They've built and consolidated highly managed data centers with sophisticated data protection and disaster recovery solutions, placing them at corporate and large regional headquarters. These centralized data protection solutions include replication, mirroring, snapshots, tiered storage, and online backup.
Although central protection is critical, according to Strategic Research Corp., as much as 60 percent of a corporation's data resides outside its managed servers on remote networks, desktops, and mobile systems. Not only does a corporation's core data protection scheme rarely extend to the distant remote offices at the network's edge, but even if it did, it couldn't handle the large volumes of unstructured files. As much as 75 percent of this "edge data" is unprotected, because it is either ineffectively backed up or not backed up at all. This is a risky business practice, as edge data can be as critical to the company's survival as its more manageable centralized data.
Some enterprises throw up their collective hands in frustration and do little to protect edge data. Other organizations create corporate backup policies and issue them to all their branch offices, holding each office responsible for its own local backup procedures. The larger offices with IT support personnel may implement workable data protection solutions, but smaller offices without IT staff may make spotty backups or none at all. Businesses that do take action to protect edge data generally either deploy branch office backup solutions, replicate data to their data centers, or deploy terminal servers to eliminate data stored at the edge. However, all of the solutions available for protecting edge data have significant drawbacks.
Branch office solutions (backup servers, backup software, tape drives, and tape media) are expensive, not always reliable, and add to the management complexity of a distributed IT infrastructure. The more common replication technologies are temperamental and complex, demand ever-growing capacity and bandwidth, and frequently fail over high-latency WANs. Overworked data center staff must also monitor daily replications and backups from dozens of remote offices to make sure they actually complete. And terminal servers, which use the WAN to process and display requests between remote thin clients and centralized application servers, suffer from poor performance and lack scalability.
Taming the Edge with File-Caching Gateways
The most commonly used file system protocols — Windows CIFS and Unix NFS — were never designed to operate over high-latency or limited-bandwidth network links. But if enterprises could keep all persistent data at the data center and still give remote office users LAN-like access to it, they would have the best of both worlds: centralized data protection and management combined with high application service levels for remote users.
The key is file-caching technology. Throughout most networks, 5 percent of network files account for more than 50 percent of I/O activity. If just these active data sets stay immediately available to users, processing and transport time drop significantly. Since file caching relieves a server of at least half of its I/O hits, it can increase server throughput considerably. File caching can also relieve WANs of large amounts of data traffic, greatly improving traffic movement over low-bandwidth, high-latency wide area networks.
File caching works by keeping the most active files in local storage. For example, a remote office with 1TB of total data requires only 100GB of cache for its active working data set. By storing just the active working data set in a local cache, users in the remote office can quickly and transparently retrieve their data no matter where it resides — in their office or at corporate headquarters.
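To make the working-set idea concrete, here is a minimal sketch of an LRU (least recently used) file cache in Python. It is illustrative only, not Actona's implementation: the class, method, and parameter names (FileCache, read, fetch_from_origin) are assumptions of the sketch.

```python
from collections import OrderedDict

class FileCache:
    """Minimal LRU working-set cache: a hedged sketch, not vendor code.

    Holds the most recently used files up to a byte budget; everything
    else is fetched from the central server on demand.
    """

    def __init__(self, capacity_bytes, fetch_from_origin):
        self.capacity = capacity_bytes
        self.fetch = fetch_from_origin        # callable: path -> bytes
        self.entries = OrderedDict()          # path -> cached file contents
        self.used = 0

    def read(self, path):
        if path in self.entries:              # cache hit: served at LAN speed
            self.entries.move_to_end(path)
            return self.entries[path]
        data = self.fetch(path)               # cache miss: one trip across the WAN
        self.entries[path] = data
        self.used += len(data)
        while self.used > self.capacity:      # evict least recently used files
            _, evicted = self.entries.popitem(last=False)
            self.used -= len(evicted)
        return data
```

With a 100GB budget (capacity_bytes = 100 * 2**30) in front of 1TB of remote-office data, the active tenth of the data stays local, and the long tail crosses the WAN only when someone actually asks for it.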
A typical deployment would involve a client-side caching appliance installed at each remote office and a server-side caching appliance at the data center. The client-side caching appliance responds to file system requests initiated by local clients and presents a cached view of centralized data. Because heavily accessed data already resides in cache, data requests receive a near-instant response. If the requested data is not in cache, the client-side appliance communicates with a corresponding server-side appliance to obtain the requested data from the centralized server or NAS head-end in the data center. Using network and protocol optimization techniques, the request is quickly served to the client in the remote office.
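The division of labor between the two appliances can be sketched the same way. Below, a hypothetical ServerSideAppliance fronts the central file server or NAS head-end, with zlib compression standing in for the network and protocol optimizations the article describes; the CIFS/NFS backend is reduced to a callable. Every name here is illustrative.

```python
import zlib

class ServerSideAppliance:
    """Hedged sketch of the data-center peer; names are illustrative.

    It fronts the central file server or NAS head-end and compresses
    responses before they cross the WAN. read_from_nas stands in for
    whatever backend protocol (CIFS or NFS) the real appliance speaks.
    """

    def __init__(self, read_from_nas):
        self.read_from_nas = read_from_nas    # callable: path -> bytes

    def serve(self, path):
        data = self.read_from_nas(path)
        return zlib.compress(data)            # shrink traffic over the slow link


def fetch_over_wan(server, path):
    """Client-side miss handler: ask the server-side peer, decompress locally."""
    return zlib.decompress(server.serve(path))


# Wiring the two halves together (100GB local cache, illustrative numbers):
# cache = FileCache(100 * 2**30, lambda path: fetch_over_wan(server, path))
```

The point of the split is that only cache misses pay the WAN penalty, and even those travel in compressed, protocol-optimized form rather than as chatty native CIFS or NFS traffic.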
With file caching, local clients can access their files, browse remote directories, and perform file system operations at LAN-like speeds. Global version management and locking mechanisms maintain data coherency and ensure that remote clients always retrieve the latest copy of a file from the cache. A central management agent provides remote management of each file-caching appliance through a web-based interface, letting data center administrators centrally control and manage storage capacity. They can also optimize system performance by prepositioning files and directories into target caches or an external directory, and by setting policies to manage data transfers.
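Coherency can be layered onto the same sketch. The hypothetical CoherentCache below extends the FileCache above with a version check: before serving a cached copy, it validates a version tag against the data center, a small metadata round trip instead of a full file transfer. Checking on every read is a simplification; a real appliance would rely on leases, locks, or server-initiated invalidation. Prepositioning then amounts to warming the cache ahead of demand.

```python
class CoherentCache(FileCache):
    """Adds a freshness check so remote clients never read a stale copy.

    version_of is a stand-in for the real protocol's version/locking
    mechanism; per-read validation is a simplification of this sketch.
    """

    def __init__(self, capacity_bytes, fetch_from_origin, version_of):
        super().__init__(capacity_bytes, fetch_from_origin)
        self.version_of = version_of          # callable: path -> version tag
        self.versions = {}                    # path -> tag we cached it under

    def read(self, path):
        current = self.version_of(path)
        if path in self.entries and self.versions.get(path) != current:
            # Stale copy: drop it so the base class re-fetches fresh data.
            self.used -= len(self.entries.pop(path))
        data = super().read(path)
        self.versions[path] = current
        return data

    def preposition(self, paths):
        """Warm the cache ahead of demand, e.g. on an overnight schedule."""
        for path in paths:
            self.read(path)
```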
File-caching appliances such as Actona's ActaStor use caching, compression, and network optimization techniques to run standard CIFS and NFS file protocols over low-bandwidth, high-latency WANs. The appliances integrate seamlessly into existing network and storage infrastructures and require no software to be installed on client machines or file servers. By combining centralized storage with local file services, Actona's file-caching solution enables companies to consolidate servers and storage, and centralize backup and disaster recovery processes, while providing remote users with LAN-like performance.
File caching enables efficient storage consolidation across the WAN, driving storage, management, and backup operations out of remote offices and into the data center. Backup and restore operations are much faster and more reliable in a centralized environment. Centralized, file-cached storage is cost-effective because it reduces storage provisioning across the extended enterprise and cuts the cost of remote office IT support. The results include lower management costs, higher resource utilization, more effective backup, improved disaster recovery, and the most important benefit of all: the confidence that all data will be available whenever and wherever it's needed.
To learn more about file caching, visit our website www.Actona.com or contact John Henze at Actona Technologies at (408) 399-8600.