Using the Data Lakehouse for Vault CRM

Customers can access a Data Lakehouse containing a complete and up-to-date copy of their Vault CRM data. The Data Lakehouse retrieves all Vault CRM data using Vault Platform’s Direct Data API. Retrieved data is published as Apache Iceberg™ tables on Amazon S3, enabling customers to query Vault CRM data in place or copy it out to an external data warehouse.

Every CRM Vault has its own Data Lakehouse. Data in the Data Lakehouse is read-only. Any metadata or data changes in Vault CRM are reflected in the Data Lakehouse within 30 minutes, so queries return accurate, up-to-date results.
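When scheduling a downstream job that copies data out, it can help to account for the 30-minute window described above. The following Python snippet is a minimal sketch of that arithmetic, not part of any Veeva API. The names MAX_SYNC_LAG and should_be_visible are hypothetical, and treating 30 minutes as a hard upper bound is an assumption made for illustration.

```python
from datetime import datetime, timedelta, timezone

# The 30-minute figure comes from the text above; treating it as a hard
# upper bound for scheduling purposes is an assumption of this sketch.
MAX_SYNC_LAG = timedelta(minutes=30)


def should_be_visible(changed_at: datetime, now: datetime) -> bool:
    """Return True once the stated sync window has elapsed since the change."""
    return now - changed_at >= MAX_SYNC_LAG


if __name__ == "__main__":
    # Example: a change made in Vault CRM 45 minutes ago (timezone-aware, UTC).
    now = datetime.now(timezone.utc)
    print(should_be_visible(now - timedelta(minutes=45), now))
```

A job could use a check like this to decide whether a recent Vault CRM change is old enough to expect in the Data Lakehouse before querying for it.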

Enabling the Data Lakehouse

Veeva enables the Data Lakehouse on request, on a per-Vault basis. To request it, an admin user must open a Veeva Support ticket asking for the Data Lakehouse to be enabled in the relevant Vault CRM instances.

Connecting and Running Queries

Admin users can connect their existing data warehouse platforms to the Data Lakehouse to run queries against their Vault CRM data, or use any query engine that supports Apache Iceberg™, such as Databricks or Snowflake.
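As an illustration of querying Iceberg tables from code, the following Python sketch uses the open-source PyIceberg library to load a table and read a few rows. It is not Veeva's documented connection method. The catalog type, URI, warehouse name, namespace, and table name are all hypothetical placeholders, and the actual connection details depend on how the Data Lakehouse is exposed for a given Vault.

```python
from typing import Any

# Hypothetical placeholder values: the real catalog type, endpoint, and
# credentials are not specified in this document and must be replaced.
CATALOG_PROPERTIES = {
    "type": "rest",
    "uri": "https://example.invalid/iceberg",
    "warehouse": "my_vault_crm",
}


def table_identifier(namespace: str, table: str) -> str:
    """Join a namespace and table name into an Iceberg table identifier."""
    return f"{namespace}.{table}"


def read_rows(namespace: str, table: str, limit: int = 10) -> Any:
    """Load an Iceberg table and return up to `limit` rows as a PyArrow table."""
    # Imported here so the module loads even where PyIceberg is not installed.
    from pyiceberg.catalog import load_catalog

    catalog = load_catalog("vault_crm", **CATALOG_PROPERTIES)
    iceberg_table = catalog.load_table(table_identifier(namespace, table))
    # The scan reads the table in place; nothing is written back.
    return iceberg_table.scan(limit=limit).to_arrow()
```

With real connection properties in place, a call such as read_rows("crm", "account__v") would return the first rows of that table, where both names are assumed for the example. Engines such as Databricks or Snowflake would instead use their own Iceberg integrations.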