Implement robust AI governance to democratize data
The emergence of GenAI has accelerated the pace of unlocking the potential of data, providing opportunities for new insights and better decisions. However, achieving broader data access requires a comprehensive data governance strategy. Those enterprises that can strike a balance between data democratization and rigorous data governance will differentiate themselves in the market by unlocking unique data-driven insights.
According to Gartner, more than 80% of enterprises will have used GenAI APIs and models or deployed GenAI-enabled applications in production by 2026, up from less than 5% in 2023. GenAI's natural language interface allows non-technical users, from department heads to frontline workers, to more easily access and use data. This levels the playing field in access to information and skills, which Gartner calls “one of the most disruptive trends of this decade.”
If companies are to avoid increased risks to privacy, security and data quality, democratizing data in this way makes strong governance even more critical. That means knowing exactly what data you have, where it resides, who has access to it and how each type of user is allowed to use it. But how does a business enforce total control without stifling innovation?
At a high level, the favored approach is to consolidate data into a comprehensive repository that can be shared easily and securely among different teams and workgroups. By unifying data, enterprises can centralize management and expand access to data while minimizing complexity and optimizing costs. Centralized storage helps ensure data consistency and accuracy and avoids the problems caused by duplicated, inconsistent copies. It also improves data security and privacy protection, because access control and monitoring measures are easier to implement in one place.
In practice, this can be challenging because data sovereignty regulations require that certain data be stored in a specific country or region. In that situation, enterprises need to work to eliminate data silos and implement a consistent governance framework across their data platforms.
In addition, specific methods and technologies can help enterprises maintain effective governance and security as GenAI expands access to data. These include basic governance practices that apply in a variety of settings but become especially critical as GenAI drives further democratization of data access.
Granular controls for privacy and compliance
As employee access to data increases, so does the risk of data breaches and of personally identifiable information (PII) being accessed by unauthorized users. Implementing strict access control policies and using anonymization and de-identification technologies are therefore critical to ensure compliance and protect data from inappropriate access.
In our new Data Trends 2024 report, which analyzes usage trends in the Snowflake Data Cloud, we noticed a significant increase in the use of governance capabilities that provide granular control over data while making it appropriately available to more users for more use cases. For example, usage of applied masking or row access policies increased 98% in the 12 months ended January 31, 2024, compared to the same period a year earlier. Over the same period, the number of columns assigned masking policies increased by 97%.
However, it is worth noting that the total number of queries run against policy-protected objects rose by 142%. This number is significant because it shows that good data governance is not about saying "no" and restricting data usage. Even as governance through tags and masking policies increased, the report notes that the amount of work being done with this data is rising rapidly.
In some cases, employees may wish to inspect a dataset to which they cannot be granted direct access. In such cases, differential privacy is a powerful technique because it lets datasets be shared and explored through patterns in the data without revealing any individual user’s PII. Taking this a step further, data clean rooms allow multiple parties to collaborate on data without disclosing the raw data to each other. Data clean rooms are typically used to share data between different businesses, but we are seeing the technology being used internally to meet growing regulatory and privacy needs. It can be an effective technique for exploring PII data in the context of GenAI interfaces.
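To make the differential privacy idea above concrete, here is a minimal Python sketch of one common mechanism: adding Laplace noise to a count query. It is an illustration under stated assumptions, not how any particular platform implements it. The `employees` records, the function names and the `epsilon` values are all hypothetical, and a production system would need a hardened noise sampler and privacy budget tracking.

```python
import random
from typing import Callable, Dict, Iterable


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) noise as the difference of two exponential draws."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def private_count(
    rows: Iterable[Dict[str, object]],
    predicate: Callable[[Dict[str, object]], bool],
    epsilon: float,
    rng: random.Random,
) -> float:
    """Return a noisy count of the rows that match `predicate`."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for row in rows if predicate(row))
    # Adding or removing one person changes a count by at most 1 (sensitivity 1),
    # so the Laplace noise scale is 1 / epsilon.
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical records: the analyst receives only the noisy aggregate, never the rows.
employees = [
    {"name": "A", "department": "sales"},
    {"name": "B", "department": "sales"},
    {"name": "C", "department": "finance"},
]
noisy = private_count(
    employees,
    lambda row: row["department"] == "sales",
    epsilon=0.5,
    rng=random.Random(7),
)
print(f"Noisy count of sales employees: {noisy:.2f}")
```

A smaller `epsilon` adds more noise and gives stronger privacy, while a larger one gives more accurate answers. Choosing that trade-off is a governance decision as much as a technical one.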
Consistent, coordinated security
Security should be built into the fabric of the data platform rather than bolted on later for individual data sets and users. The technology that supports conversational interfaces should not replicate identity and other core permissions on data, because that results in a fragile setup. If two or more systems are tracking who has access to which data, the potential for errors and unauthorized access increases significantly.
Technologies that play a key role in protecting data for GenAI use cases include continuous risk monitoring and protection, role-based access control (RBAC) and fine-grained authorization policies. Role-based tags and tag-based masking policies allow you to protect data at the column level by assigning a masking policy to a tag and then setting the tag on one or more database objects.
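The tag-based masking flow described above (assign a masking policy to a tag, then set the tag on columns) can be modeled in a few lines of Python. This is a conceptual sketch only, not the Snowflake API. The `Catalog` class, the `PII_READER` role, the `pii_email` tag and the table and column names are hypothetical and exist purely for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Set, Tuple

# A masking policy maps (stored value, roles of the querying user) to the value shown.
MaskFn = Callable[[str, Set[str]], str]


def mask_email(value: str, roles: Set[str]) -> str:
    """Show the full email only to users holding the hypothetical PII_READER role."""
    if "PII_READER" in roles:
        return value
    _, _, domain = value.partition("@")
    return "***@" + domain


@dataclass
class Catalog:
    """Toy metadata catalog: policies attach to tags, tags attach to columns."""

    tag_policies: Dict[str, MaskFn] = field(default_factory=dict)
    column_tags: Dict[Tuple[str, str], str] = field(default_factory=dict)

    def assign_policy_to_tag(self, tag: str, policy: MaskFn) -> None:
        self.tag_policies[tag] = policy

    def set_tag(self, table: str, column: str, tag: str) -> None:
        self.column_tags[(table, column)] = tag

    def read(self, table: str, column: str, value: str, roles: Set[str]) -> str:
        """Return the value as the caller is allowed to see it."""
        tag = self.column_tags.get((table, column))
        policy = self.tag_policies.get(tag) if tag is not None else None
        return policy(value, roles) if policy is not None else value


catalog = Catalog()
catalog.assign_policy_to_tag("pii_email", mask_email)
# One tag protects every column it is set on, so the policy is defined once.
catalog.set_tag("customers", "email", "pii_email")
catalog.set_tag("leads", "contact_email", "pii_email")

print(catalog.read("customers", "email", "ana@example.com", {"ANALYST"}))
print(catalog.read("customers", "email", "ana@example.com", {"PII_READER"}))
```

The point of the indirection is that the policy is written once and follows the tag. Newly tagged columns are protected without anyone writing a new rule for them.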
Data silos are the enemy of good governance
Storing copies or fragments of data in disparate systems makes it extremely difficult to track who has access to what information and to keep access and control policies consistent, which is why data silos are the enemy of strong governance.
Data silos also make it difficult to ensure that employees are querying the most current and accurate data, which can lead to costly mistakes. To achieve broad access to data through GenAI, enterprises need a single source of truth to ensure all employees are viewing the same information and that controls and policies can be applied and updated across all data.
Ensure data quality for accurate results
Even if you eliminate silos and have the appropriate permissions, there is no guarantee that the information your employees are accessing is correct. A data quality framework applies configurable data quality rules to a specific column or set of columns in a table to help detect quality issues and ensure accurate information.
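As a rough illustration of configurable rules applied to a column or set of columns, the Python sketch below runs a list of rules over rows and reports which rows fail each one. The `Rule` structure, the rule names and the sample `orders` data are assumptions made for this example and do not correspond to any specific product's data quality feature.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Sequence

Row = Dict[str, Any]


@dataclass(frozen=True)
class Rule:
    """A named check over one or more columns of a row."""

    name: str
    columns: Sequence[str]
    check: Callable[..., bool]  # receives the values of `columns`, in order


def run_rules(rows: Sequence[Row], rules: Sequence[Rule]) -> Dict[str, List[int]]:
    """Return, for each rule name, the indexes of the rows that fail it."""
    failures: Dict[str, List[int]] = {rule.name: [] for rule in rules}
    for index, row in enumerate(rows):
        for rule in rules:
            values = [row.get(column) for column in rule.columns]
            if not rule.check(*values):
                failures[rule.name].append(index)
    return failures


# Hypothetical rules: one single-column rule and one rule spanning two columns.
rules = [
    Rule("email_not_null", ["email"], lambda email: email is not None),
    Rule(
        "ship_after_order",
        ["order_date", "ship_date"],
        # ISO-8601 date strings compare correctly as plain strings.
        lambda ordered, shipped: ordered is not None
        and shipped is not None
        and ordered <= shipped,
    ),
]

orders = [
    {"email": "ana@example.com", "order_date": "2024-01-05", "ship_date": "2024-01-07"},
    {"email": None, "order_date": "2024-01-06", "ship_date": "2024-01-08"},
    {"email": "bo@example.com", "order_date": "2024-01-09", "ship_date": "2024-01-02"},
]

report = run_rules(orders, rules)
print(report)
```

Keeping rules as data rather than hard-coded checks means they can be reviewed, versioned and applied to new tables in the same way as access policies.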
Additionally, by now we all know that GenAI can sometimes hallucinate and produce answers that have no factual basis, which is unacceptable for enterprise use. Enterprises can address this problem by combining large language models (LLMs) with data sources they know they can trust, such as internal customer databases or vetted data sets from trusted third-party providers.
These trusted data sources can be incorporated using processes that require LLM customization (such as fine-tuning) or that do not (such as prompt engineering or retrieval-augmented generation (RAG)). Whatever the case, these technologies help ensure employees receive accurate, high-quality results while adhering to the governance standards built into the underlying cloud environment.
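The retrieval half of RAG can be sketched without any model at all: pick the trusted passages most relevant to the question and place them in the prompt, so the LLM answers from vetted content. The Python below is a deliberately simplified illustration. It scores relevance by plain word overlap, where real systems usually use embedding-based vector search, and it stops before the LLM call. The documents, function names and prompt wording are hypothetical.

```python
import re
from typing import List, Set


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, documents: List[str], top_k: int = 2) -> List[str]:
    """Return up to `top_k` documents that share the most words with the question."""
    question_words = tokenize(question)
    scored = [(len(question_words & tokenize(doc)), doc) for doc in documents]
    relevant = [(score, doc) for score, doc in scored if score > 0]
    # sorted() is stable, so ties keep their original document order.
    ranked = sorted(relevant, key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in ranked[:top_k]]


def build_prompt(question: str, documents: List[str], top_k: int = 2) -> str:
    """Assemble a prompt that instructs the model to answer only from retrieved context."""
    context = retrieve(question, documents, top_k)
    context_block = "\n".join(f"- {passage}" for passage in context)
    return (
        "Answer using only the context below. "
        "If the context is insufficient, say so.\n"
        f"Context:\n{context_block}\n"
        f"Question: {question}"
    )


# Hypothetical vetted passages from a governed internal source.
trusted_documents = [
    "Refund policy: customers may return items within 30 days.",
    "Office hours are 9 to 5 on weekdays.",
    "Shipping takes 3 to 5 business days.",
]

prompt = build_prompt("What is the refund policy for returned items?", trusted_documents, top_k=1)
print(prompt)
```

Because retrieval runs against the governed data platform, the same access policies can decide which passages a given user's question is allowed to pull in.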
The power of data access and universal search
An important aspect of GenAI governance is making it easy for employees to find the right data sets and data products to help them with their analysis. One reason artificial intelligence is so powerful is that it allows employees to interact with data without going through a central team, but this requires those employees to know what data is available to them and how to find it.
Search provides this capability, allowing users to find and query data sets and data products. The search function itself can be powered by an LLM, which makes data search more intuitive. This is what we have developed at Snowflake as part of our Universal Search.
Governance is the foundation of data democratization
Business users are eager to make wider use of their organization’s data, and GenAI finally makes this possible. Thanks to LLMs and natural language processing, employees in areas such as finance, HR, sales and operations can now formulate questions specific to their role and get the answers they need to make more informed decisions.
But to meet the security and compliance needs of the enterprise, this can only happen in an environment with strong governance. The stronger the governance, the more freely your employees can explore the data without creating additional risk for the company. GenAI opens the door to true data democratization, and good governance is the foundation that makes it possible.