Today, big data and analytics are entering a more mature deployment stage. This is good news for small and medium-sized businesses that are deploying these technologies and have been struggling to define a big data architecture for their company.
Uncertainty about how to define the overall architecture of big data and analytics is one of the reasons why SMBs lag behind in big data and analytics deployment. In many cases, they are waiting and watching to see how trends such as hybrid computing, data marts, master databases, etc. develop, and how controls over security and governance will play out.
Finally, a best-practice data architecture is emerging that everyone can follow. In this architecture, cloud computing services are used to store and process big data, while on-premises data centers are used to develop local data marts for the enterprise.
Let’s take a closer look at the reasons behind this big data and analytics architecture:
For a small enterprise, it is expensive to purchase server clusters to process big data in parallel in the data center, not to mention hiring or training the very expensive professionals who know how to optimize, upgrade and maintain a parallel processing environment. Businesses that choose to process and store data on-site must also make significant investments in hardware, software, and storage equipment. Procuring big data hardware and software costs a lot of money, which is what makes outsourcing processing and storage to the cloud attractive.
Data governance (for example, security and compliance) is one reason enterprises are reluctant to hand all their mission-critical data to the cloud: data held in the cloud is harder to manage. As a result, once the data has been processed in the cloud, many enterprises migrate it to their own on-premises data centers.
There is another reason why many enterprises choose to use their own data centers: to keep in-house the proprietary applications and algorithms they develop to work on this data, because it is the policy of many cloud computing providers that any applications customers develop in the cloud may be shared with other customers.
By keeping applications on-premises in the data center and developing an on-premises master data set from which smaller data marts can be separated, enterprises have direct control over their data and applications.
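The idea of separating smaller data marts from an on-premises master data set can be illustrated with a minimal sketch. The record fields (`department`, `customer_id`, and so on) and the function name `build_data_marts` are hypothetical and chosen only for illustration; a real deployment would normally do this in a database or warehouse rather than in memory.

```python
from collections import defaultdict


def build_data_marts(master_records, key):
    """Split a master data set into smaller data marts, one per value of `key`."""
    marts = defaultdict(list)
    for record in master_records:
        marts[record[key]].append(record)
    return dict(marts)


# Hypothetical master data set kept in the on-premises data center.
master = [
    {"department": "sales", "customer_id": 1, "amount": 120.0},
    {"department": "marketing", "customer_id": 1, "campaign": "spring"},
    {"department": "sales", "customer_id": 2, "amount": 75.5},
]

# Each department gets its own, smaller data mart.
marts = build_data_marts(master, "department")
```

Because the master data set stays on-premises, each mart derived from it remains under the enterprise's direct control.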
For example, if an enterprise needs to anonymize data, the process should be documented and agreed with its cloud computing provider, since the provider will perform the anonymization. If an enterprise wants its data cleaned, it should likewise give the provider detailed written instructions on the cleanup process. For example, does the business only want to unify the abbreviations for all U.S. states (e.g., "Tenn" and "Tennessee" = "TN"), or should other edits be made so that the data is uniform and easier to process? Finally, whether the business runs in a dedicated tenant or in a multi-tenant environment at the cloud computing provider, the provider should be able to guarantee that the enterprise's data is never shared with other customers.
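A cleanup rule such as the state abbreviation example is easier to agree on when it is written down precisely. The following is a minimal sketch of how such a rule might be expressed; the lookup table is a hypothetical excerpt, and the decision to leave unknown values unchanged is an assumption that the written instructions to the provider would need to settle.

```python
import re

# Hypothetical excerpt of the lookup table; a real one would cover every
# state and every spelling variant found in the data.
STATE_ABBREVIATIONS = {
    "tenn": "TN",
    "tennessee": "TN",
    "tn": "TN",
    "calif": "CA",
    "california": "CA",
    "ca": "CA",
}


def normalize_state(raw: str) -> str:
    """Map a free-form state name to its two-letter abbreviation.

    Values that are not in the table are returned unchanged (trimmed),
    so they can be reviewed rather than silently altered.
    """
    # Lowercase, drop punctuation such as the period in "Tenn.",
    # and collapse repeated whitespace.
    key = re.sub(r"[^a-z ]", "", raw.strip().lower())
    key = " ".join(key.split())
    return STATE_ABBREVIATIONS.get(key, raw.strip())
```

With this rule, "Tenn", "Tenn." and " Tennessee " all become "TN", while a value the table does not know is passed through untouched.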
Many enterprise IT departments completely overlook the need to update their policies and procedures for big data. They simply start implementing big data projects and forget that their existing application development policies and procedures come from the world of transactional applications. Businesses should not make this mistake. Instead, companies need to revise policies and procedures in the areas most likely to interact with big data (e.g., storage, database management, applications).
For cloud-based disaster recovery (DR) testing, enterprises should include provisions in the contract for documenting and executing DR. Existing DR plans, which focus on transactional data and systems, should also be kept up to date and extended to include recovery and test scripts for big data and analytics.