Big Data describes datasets that are growing exponentially and that are too large, raw, and unstructured for analysis using traditional database technology and techniques. To capture and crunch Big Data, companies have to deploy new storage, computing, and analytic technologies and techniques.
The sheer size of a Big Data dataset can make implementing security controls unwieldy. “The challenge that you run into with Big Data is that a lot of the security controls that we’ve implemented as an industry are very dataset dependent”, Moyle, senior security expert at Savvis, told Infosecurity.
For example, employing data encryption, a data leak prevention tool, or malware scanning can pose significant engineering problems when it comes to Big Data, he noted.
“If you have a large dataset and you are engineering an application or process, or rolling out a deployment that is very large, it behooves you to think about how you are going to address security ahead of time”, Moyle said.
“Depending on the type of security control you want to implement, it may only be feasible to implement it at the beginning of the process”, he advised.
“If you have petabytes of data and you want to encrypt that data…that is a huge undertaking. In some cases, it might not even be feasible. But if you decide at the beginning of the process that you are going to apply encryption, you can build it into the architecture. As the data grows in volume, your encryption can keep pace with it”, Moyle explained.
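As a rough illustration of building encryption into the architecture, the Python sketch below encrypts each record as it is ingested, so the work scales with incoming data rather than with the accumulated dataset. It is a minimal sketch, not something Moyle described: the names `ingest`, `Cipher`, and `make_placeholder_cipher` are hypothetical, and the placeholder cipher is a stand-in built from the standard library only so the example runs on its own. A real deployment would substitute a vetted authenticated cipher from a maintained cryptography library.

```python
import hashlib
import hmac
import os
from typing import Callable, Iterable, Iterator, Tuple

# Hypothetical interface: (nonce, data) -> transformed data.
Cipher = Callable[[bytes, bytes], bytes]


def make_placeholder_cipher(key: bytes) -> Cipher:
    """Return a stand-in cipher for illustration only.

    It XORs the data with an HMAC-SHA256 keystream and provides no
    integrity protection. It is NOT a substitute for a vetted cipher.
    """
    def apply(nonce: bytes, data: bytes) -> bytes:
        out = bytearray()
        for offset in range(0, len(data), 32):
            counter = (offset // 32).to_bytes(8, "big")
            block = hmac.new(key, nonce + counter, hashlib.sha256).digest()
            chunk = data[offset:offset + 32]
            out.extend(b ^ k for b, k in zip(chunk, block))
        return bytes(out)

    return apply


def ingest(records: Iterable[bytes], cipher: Cipher) -> Iterator[Tuple[bytes, bytes]]:
    """Encrypt each record on arrival, before it reaches storage."""
    for record in records:
        nonce = os.urandom(16)  # fresh nonce per record
        yield nonce, cipher(nonce, record)
```

Because each record is handled on the way in, there is never a separate pass over petabytes of stored plaintext; the encryption cost grows with the ingest rate.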
“It is not every security control and it is not every organization that has this problem, but when the two collide, there are not a lot of folks thinking about that in the industry”, he added.
Big Data analytics requires centralizing information, which in turn makes securing and segmenting the data more challenging.
“Once you bring the data together, trying to identify sensitive data is very, very difficult because of the large size of the data”, he said.
“If you are undertaking a data centralization effort, the time to think about how you are going to label pieces of data and how you are going to protect them is not after you have done the work of consolidation”, Moyle concluded.
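One way to picture labeling data before consolidation is to tag each record with sensitivity labels as it enters the central store. The Python sketch below is an assumption-laden illustration rather than a method from the interview: `HYPOTHETICAL_RULES`, `LabeledRecord`, and `label_record` are invented names, and the two regular expressions are deliberately simple examples that would miss many real cases.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List, Pattern

# Hypothetical, deliberately simple detection rules; real classifiers
# would be far more thorough.
HYPOTHETICAL_RULES: Dict[str, Pattern[str]] = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


@dataclass
class LabeledRecord:
    source: str   # system the record came from
    payload: str  # record content
    labels: List[str] = field(default_factory=list)


def label_record(
    source: str,
    payload: str,
    rules: Dict[str, Pattern[str]] = HYPOTHETICAL_RULES,
) -> LabeledRecord:
    """Attach sensitivity labels to a record before it is consolidated."""
    labels = sorted(name for name, pattern in rules.items() if pattern.search(payload))
    return LabeledRecord(source=source, payload=payload, labels=labels)
```

With labels attached at ingestion, downstream access controls or segmentation can key off `labels` instead of rescanning the consolidated dataset to find sensitive records.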