A Data Science Central Community
The combination of cloud computing and big data is a match made in heaven. Big data requires a flexible compute environment that can scale quickly and automatically to support massive amounts of data. Infrastructure clouds provide exactly that. But whenever cloud computing is discussed, the question of security comes up.
When it comes to cloud security in a big data use case, the expectation is that any security solution will provide the same flexibility as the cloud without compromising the overall security of the implementation. When taking your big data to the cloud, the following four tips will enable you to achieve cloud flexibility paired with strict cloud security.
Data encryption creates the “virtual walls” for your cloud infrastructure. Deploying cloud encryption is considered a fundamental first step, but there is no “one size fits all” solution. Some encryption solutions require on-premises gateway encryption, which does not work well in cloud big data scenarios. Other approaches (for example, data encryption powered by the cloud provider itself) force the end user to trust someone else with the encryption keys, which is both risky and a compliance deal-breaker.
Recent encryption technologies, like split-key encryption, are tailored specifically to the cloud and leverage the best of both worlds by providing an infrastructure cloud solution while keeping the encryption keys safe and in the hands of the customer.
To achieve the best possible encryption for your big data scenario, use split-key encryption.
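The core idea behind splitting a key can be illustrated with a minimal sketch. This is not any vendor's actual split-key scheme; it assumes a simple two-share XOR split, and the names `split_key`, `combine_shares`, `customer_share`, and `service_share` are hypothetical. Each share on its own is indistinguishable from random bytes, so the data key can only be rebuilt when both parties contribute their share.

```python
import secrets
from typing import Tuple


def split_key(key: bytes) -> Tuple[bytes, bytes]:
    """Split a key into two shares; both are required to rebuild it."""
    # The first share is pure randomness of the same length as the key.
    share_a = secrets.token_bytes(len(key))
    # The second share is the key XORed with the first share.
    share_b = bytes(k ^ a for k, a in zip(key, share_a))
    return share_a, share_b


def combine_shares(share_a: bytes, share_b: bytes) -> bytes:
    """Rebuild the original key by XORing the two shares together."""
    if len(share_a) != len(share_b):
        raise ValueError("shares must have equal length")
    return bytes(a ^ b for a, b in zip(share_a, share_b))


# Illustrative usage: a 256-bit data key split between two parties.
data_key = secrets.token_bytes(32)
customer_share, service_share = split_key(data_key)
rebuilt_key = combine_shares(customer_share, service_share)
```

In this sketch, the customer would hold `customer_share` and the cloud security service would hold `service_share`, so neither the service nor the cloud provider alone can recover `data_key`.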
In big data, each component of the architecture should scale, and the cloud security solution is no different. When selecting a cloud security solution, make sure it is available across all relevant cloud geo-locations. Furthermore, it must scale effectively with your big data infrastructure.
At the most basic level, this means that the solution cannot depend on dedicated hardware. Hardware Security Modules (HSMs) do not fit the big data use case because they cannot scale and flex to fit the cloud model.
To achieve the necessary scalability, use a cloud security solution that is designed for the cloud, but achieves security that is comparable to (or better than) hardware-based solutions.
Big data teams working in the cloud are often frustrated by the fact that their cloud security architecture does not easily scale (see tip #2). Traditional encryption solutions require an HSM (hardware) element. Needless to say, a hardware implementation cannot be automated.
To be able to automate as much of your cloud security as possible, strive for a virtual appliance approach, not a hardware approach. Also, make sure that a usable API (ideally a RESTful API) is available as part of the cloud security offering.
A virtual appliance plus RESTful API will enable the required flexibility and automation needed in a cloud big data use case.
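As a rough sketch of what such automation could look like, the snippet below prepares a REST call that a provisioning script might send whenever a new big data node is launched. The endpoint path `/v1/encrypted-volumes`, the payload fields, the bearer-token scheme, and the host `security.example.com` are all assumptions made for illustration; a real cloud security product defines its own API. The request is only constructed here, not sent.

```python
import json
import urllib.request


def build_provision_request(base_url: str, api_token: str,
                            node_id: str, volume_gb: int) -> urllib.request.Request:
    """Prepare a POST asking a (hypothetical) security appliance to
    attach an encrypted volume to a newly launched cluster node."""
    payload = json.dumps({"node_id": node_id, "volume_gb": volume_gb})
    return urllib.request.Request(
        url=f"{base_url}/v1/encrypted-volumes",  # hypothetical endpoint
        data=payload.encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
    )


# Illustrative usage: called from the same script that scales the cluster.
request = build_provision_request(
    "https://security.example.com", "EXAMPLE_TOKEN", "hadoop-node-17", 500
)
# Sending it would be: urllib.request.urlopen(request)
```

Because the call is plain HTTP and JSON, it can run from the same scripts that scale the cluster up and down, which is the kind of automation a hardware element cannot offer.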
Because cloud security is often complicated, we see “security shortcuts” in big data implementations. Security shortcuts are usually taken to avoid complexity and to leave the big data architecture “unharmed.”
Some customers use freeware encryption tools and keep the encryption key on disk (which is highly insecure and may expose the encrypted data to anyone with access to the virtual disk), while others simply do not encrypt. These shortcuts are certainly not complicated, but, obviously, they are also not secure.
When it comes to big data security, map your data according to its sensitivity and protect it accordingly. In some cases, the consequences of that mapping are dramatic: not all big data infrastructure is secure, and you might need to find an alternative if the data at stake is regulated or sensitive.
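One way to make this mapping concrete is to record a sensitivity level for each dataset and check it against the protections actually deployed. The sketch below is illustrative only: the three sensitivity levels, the dataset names, and the rule that regulated data needs customer-held keys are assumptions, not requirements taken from any specific regulation.

```python
from enum import Enum
from typing import Dict, List, Set


class Sensitivity(Enum):
    PUBLIC = "public"
    INTERNAL = "internal"
    REGULATED = "regulated"


# Assumed policy: which levels need which protection.
REQUIRES_ENCRYPTION = {Sensitivity.INTERNAL, Sensitivity.REGULATED}
REQUIRES_CUSTOMER_HELD_KEYS = {Sensitivity.REGULATED}


def find_gaps(datasets: Dict[str, Sensitivity],
              encrypted: Set[str],
              customer_keyed: Set[str]) -> List[str]:
    """List datasets whose deployed protection falls short of the policy."""
    gaps = []
    for name, level in sorted(datasets.items()):
        if level in REQUIRES_ENCRYPTION and name not in encrypted:
            gaps.append(f"{name}: must be encrypted")
        elif level in REQUIRES_CUSTOMER_HELD_KEYS and name not in customer_keyed:
            gaps.append(f"{name}: keys must stay with the customer")
    return gaps


# Hypothetical inventory of a big data deployment.
datasets = {
    "web_clickstream": Sensitivity.PUBLIC,
    "sales_forecasts": Sensitivity.INTERNAL,
    "patient_records": Sensitivity.REGULATED,
}
gaps = find_gaps(datasets, encrypted={"patient_records"}, customer_keyed=set())
for gap in gaps:
    print(gap)
```

A check like this can run as part of deployment, so that a shortcut such as an unencrypted internal dataset is flagged before it reaches production.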
Big data can continue to enjoy the scalability, flexibility, and automation offered by cloud computing while maintaining the strictest security standards for the data. Encryption is considered a fundamental first step in protecting cloud (big) data, and new technologies such as split-key encryption and homomorphic key management should be leveraged to protect sensitive data and comply with regulations like HIPAA, PCI, and many others.