Secure Storage: Continuous Data Protection
Continuous data protection (CDP) addresses the problem of keeping pace with constant changes in information. Specifically, it is the practice of preserving the whole set of data each time even the slightest alteration is made to it, and of recording the alteration itself. CDP also helps maintain a secure environment: if a cyber-attack contaminates a file, the system can recover the most recent non-infected version.
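The idea of capturing every alteration and rolling back to the last good version can be illustrated with a small sketch. This is not any vendor's CDP implementation: the names `Version`, `CdpJournal`, `write` and `latest_clean` are hypothetical, the journal lives in memory, and the "infection" test is a stand-in for whatever scanner a real system would use.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Version:
    seq: int      # position of this version in the change journal
    change: str   # description of the alteration that produced it
    data: bytes   # full copy of the data after the alteration


@dataclass
class CdpJournal:
    """Hypothetical in-memory journal: one full copy per alteration."""
    versions: List[Version] = field(default_factory=list)

    def write(self, data: bytes, change: str) -> int:
        # Every alteration, however small, is kept as a new version
        # together with a record of the change itself.
        seq = len(self.versions)
        self.versions.append(Version(seq, change, data))
        return seq

    def latest_clean(self, is_clean: Callable[[bytes], bool]) -> Optional[Version]:
        # Walk backwards from the newest version to the first one
        # that passes the caller-supplied check.
        for version in reversed(self.versions):
            if is_clean(version.data):
                return version
        return None


journal = CdpJournal()
journal.write(b"quarterly report v1", "create")
journal.write(b"quarterly report v2", "edit")
journal.write(b"MALWARE quarterly report v2", "edit (infected)")

# Assumed check: a version counts as infected if it contains the marker.
restored = journal.latest_clean(lambda d: b"MALWARE" not in d)
print(restored.seq, restored.data)
```

Storing a full copy per change keeps the sketch simple; a production system would more likely store only the differences between versions.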
Yet CDP is not as simple as this relatively straightforward description, said Giora Tarnopolsky, data storage systems team leader for the Information Storage Industry Consortium (INSIC) and founder of storage consultancy TarnoTek. “The subject is considerably more complex because there are several tiers of preservation,” he said. “One is the physical preservation, which is merely the preservation of the storage device containing a bit stream of the data, and the bit-stream preservation, assuring the permanency of a bit stream. If you have data stored on a tape, disc drive or optical DVD, regardless of any other consideration, you must ensure the mechanical or physical integrity of that medium.”
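One common way to check that a bit stream has survived unchanged is to record a cryptographic digest when the data is stored and compare it on every later read. Tarnopolsky does not describe a specific method, so the sketch below is only an assumption about how such a check might look; the helper names `fingerprint` and `is_intact` and the sample record are hypothetical.

```python
import hashlib


def fingerprint(bit_stream: bytes) -> str:
    # SHA-256 digest recorded at the time the data is stored.
    return hashlib.sha256(bit_stream).hexdigest()


def is_intact(bit_stream: bytes, recorded: str) -> bool:
    # The bit stream is considered preserved only if its digest
    # still matches the one recorded earlier.
    return fingerprint(bit_stream) == recorded


original = b"sample property record"
recorded = fingerprint(original)

# Simulate media degradation by flipping a single bit in the first byte.
damaged = bytes([original[0] ^ 0x01]) + original[1:]

print(is_intact(original, recorded))
print(is_intact(damaged, recorded))
```

A digest only detects damage. Repairing it requires a second copy or error-correcting codes, and none of this addresses the separate question of whether the storage device itself can still be read.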
Another layer of this schema is what Tarnopolsky termed “logical preservation,” which is sometimes referred to as “semantic preservation.” This involves organizing data so that individuals’ future interactions with information objects will be productive and meaningful. “A good example of logical preservation that was widely known in the public domain was the Y2K crisis,” he explained. “It didn’t materialize because there were preventative measures taken, but that reflected the fact that coded information that had been written several years ago would suddenly become incompatible with common demands and needs of software. Similarly, I encounter that in my profession. You will encounter an inability to read records that you possessed, simply because you have not taken care of the logical preservation. Your software is no longer compatible with that record.”
The scope of CDP is such that it can seem intimidating. Tarnopolsky cited the example of virtual preservation of all property deeds in Santa Clara County, Calif., which are recorded digitally and posted on the county’s Web site. Systems like this necessitate a vision for information storage that extends far into the future, he said. “Human needs in a legal sense—such as property deeds—require a very long view of preservation if recorded digitally.
“In terms of data preservation, because most records are being created digitally—and if they’re not being created digitally, then they’re being preserved digitally—the spectrum of required data preservation starts from one second after the item is created until a century (later),” he added. “All media degrades over time. That’s a problem that arises with any attempt to preserve data five years or more. You need to be concerned about the bit-stream preservation and the physical preservation of the storage device—that it’s compatible with that medium.”
Individuals are storing an increasing amount of personal data, and need solutions that can preserve it for decades. “Consider the photograph you’re taking of your family members today, which you’ll need to retrieve 15 or 20 years from now,” Tarnopolsky said. “At the pace in which the computing industry develops, neither the hardware nor the software of 15 years ago is supported now. Individuals will want to be able to access baby pictures in high resolution when that baby turns 50 years old, not to mention when the baby becomes a college graduate or gets married. All this creates large volumes of data creation, retention and retrieval. So in the consumer market, the big question—the big philosophical item, if you will—of semantic continuity is now in the forefront.”
In the enterprise market, one of the main drivers of CDP is the set of compliance concerns raised by new legislation, Tarnopolsky said. “Regulatory compliance is a very important aspect of data protection. The implications of Sarbanes-Oxley and HIPAA are enormous. They refer not just to the integrity of the data, but to security as well. For data that is confidential, access should be assured to those who have a right to it and denied to those who do not have a right to it.”
Taken together, the business and consumer sectors have created a substantial demand for CDP, which the information storage industry has not adequately met thus far. “What you see being used in the industry nowadays significantly in terms of data continuity or endurance is the concept of information lifecycle management (ILM),” Tarnopolsky said. “It has not been addressed far beyond ILM. The problem is very broad. That’s why it’s such a big business opportunity.”
–Brian Summerfield, firstname.lastname@example.org