There’s been quite a lot of discussion recently about applying big data to improving security. My colleagues Rashmi Knowles and Barrett Mononen have written a couple of blog posts about it. Our RSA/Netwitness CSO Eddie Schwartz spoke about it at RSA Conference China, and the Splunk IPO in April certainly created lots of buzz around big data in general and its application to security in particular.
But the related discussion of how to secure big data has received far less attention. At the Strata Conference in February, for example, there was very little discussion of security in the exhibition or in the sessions. The Data Science Summit at EMCworld in May had little to say on the topic, and the upcoming Strata Conferences in New York and London in October also appear to have very little related to security on their agendas.
Yet securing big data is a critical issue for any organization looking to take advantage of it. We explored this issue from several different perspectives in sessions at RSA Conference China. Branden Williams and I gave a session on the high-level strategy for securing big data. Our colleague Samir Saklikar gave a session on enhanced security capabilities for Hadoop. Branden and Jason Rader also gave a session on how to define the value of data, a critical issue in determining what data to protect and how to protect it.
In our presentation, Branden and I suggested that the same process that we at RSA encourage for addressing the security of information in general should also be applied to big data. It’s important to consider controls like identity management, encryption and data loss prevention. But as I wrote in an earlier blog, those controls should be considered within a security process that begins with intentional decisions about what information needs to be secured and what tools, processes and structures best address the risks to that information. The controls need to be complemented by visibility and analytics that verify the controls are working effectively and, even more importantly, quickly discover and address issues and attacks that get around those controls. This strategy, moreover, has to be applied to defining and addressing the risks for the source data, for the result set and for the analytics process.
“Isn’t this approach the same as for any data?” one of the folks in the audience asked at the end of our presentation. “Absolutely!” was my reply. But all too often, the focus of any security discussion moves quickly into what tools and technology to put in place.
Technology, including innovations such as those Samir discussed in his session on Hadoop security, is critically important. It can also play a key role in supporting security processes, as with the capability that our colleagues at Varonis® have provided in their Metadata Framework™ to support the collaborative processes so essential for big data. We’ll be discussing these and other technologies in upcoming blogs. But the place to start in securing big data is with understanding what information you care about and what threats there are to that information, including the derived information that Branden discussed in a recent blog. Then you can make the right decisions about the processes, organizational structures and technologies to put in place to protect it.