Artifact tables in schema changes
Overview
During online schema changes in Vitess, artifact tables are created as part of the migration process. Understanding how these tables work, their naming conventions, and their storage implications is crucial for managing schema changes effectively.
What are artifact tables?
When using deploy requests, you may notice some additional tables in your database with specific prefixes. These are artifact tables that VReplication creates while running your online schema changes.
These tables serve as intermediate storage during the schema change process, allowing Vitess to perform non-blocking migrations by copying data from the original table to the new structure before swapping them atomically.
Naming conventions
Artifact tables created during schema changes follow specific naming patterns:
- _vt prefix: Tables used to facilitate the deployment and revert process, with specific subtypes:
  - _vt_vrp_: Tables actively being used in VReplication during online schema changes
  - _vt_hld_: Tables that will be held onto until a specified timestamp (e.g., _vt_hld_6ace8bcef73211ea87e9f875a4d24e90_20200915120410_)
  - _vt_drp_: Tables that are marked for dropping at a specified timestamp (e.g., _vt_drp_6ace8bcef73211ea87e9f875a4d24e90_20200915120410_)
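If you want to identify artifact tables in a table listing, the naming patterns can be matched with a small script. The following is a minimal sketch in Python, not an official tool. The regular expression is inferred only from the example names above (a 32-character hexadecimal identifier followed by a 14-digit timestamp read as YYYYMMDDhhmmss), and applying the same layout to _vt_vrp_ tables is an assumption. The names `parse_artifact_name` and `Artifact` are hypothetical.

```python
import re
from datetime import datetime
from typing import NamedTuple, Optional

# Layout inferred from the example names in this section (an assumption):
# _vt_<subtype>_<32 hex characters>_<14-digit timestamp>_
ARTIFACT_RE = re.compile(r"_vt_(vrp|hld|drp)_([0-9a-f]{32})_(\d{14})_")


class Artifact(NamedTuple):
    subtype: str         # "vrp", "hld", or "drp"
    identifier: str      # the 32-character hexadecimal part of the name
    timestamp: datetime  # parsed as YYYYMMDDhhmmss; time zone is not stated


def parse_artifact_name(table_name: str) -> Optional[Artifact]:
    """Return the parsed parts of an artifact table name, or None if it does not match."""
    match = ARTIFACT_RE.fullmatch(table_name)
    if match is None:
        return None
    subtype, identifier, raw_timestamp = match.groups()
    return Artifact(subtype, identifier, datetime.strptime(raw_timestamp, "%Y%m%d%H%M%S"))


artifact = parse_artifact_name("_vt_hld_6ace8bcef73211ea87e9f875a4d24e90_20200915120410_")
print(artifact.subtype, artifact.timestamp)  # hld 2020-09-15 12:04:10
print(parse_artifact_name("customers"))      # None
```

A regular table name such as customers returns None, so the function can be used to filter a list of table names down to artifact tables only.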
Automatic cleanup
Artifact tables used during a schema change are automatically cleaned up 24 hours after the deploy request completes. The 24-hour delay allows the table's data to flush naturally from the buffer pool before the table is dropped, which protects against potential locking in the buffer pool when the table is removed.
Exception for foreign key constraints
For databases that have foreign key constraints, artifact tables are dropped immediately after the deploy request is completed, rather than waiting for the 24-hour period.
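The two cleanup rules can be summarized in a few lines of Python. This is an illustrative sketch of the timing described above, not PlanetScale's implementation; `artifact_cleanup_time` and its parameters are hypothetical names.

```python
from datetime import datetime, timedelta

# Retention period stated in this section.
RETENTION = timedelta(hours=24)


def artifact_cleanup_time(completed_at: datetime, has_foreign_keys: bool) -> datetime:
    """Return the earliest time artifact tables are expected to be dropped."""
    if has_foreign_keys:
        # Databases with foreign key constraints: dropped right after completion.
        return completed_at
    # Otherwise: dropped 24 hours after the deploy request completes.
    return completed_at + RETENTION


completed = datetime(2024, 1, 10, 9, 30)
print(artifact_cleanup_time(completed, has_foreign_keys=False))  # 2024-01-11 09:30:00
print(artifact_cleanup_time(completed, has_foreign_keys=True))   # 2024-01-10 09:30:00
```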
Billing
Artifact tables do not count toward storage costs on network-attached storage (Amazon Elastic Block Store or Google Persistent Disk).
Storage considerations
Metal instances
If you're using a Metal instance, these tables will temporarily affect your storage size. It's important to make sure you're on a large enough instance to account for these artifact tables during a deploy request.
Storage estimation and protection
To protect Metal instances from running out of storage during schema changes, PlanetScale will not allow a deploy request to be enqueued if it will cause the database to exceed available storage. We estimate the maximum storage usage to be table_size * 2 for all tables involved in the migration. This conservative estimate allows for plenty of room for:
- The data to be copied over during the migration process
- Any binlogs that are created during the migration process
- Additional overhead from the schema change operations
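To make the estimate concrete, here is a rough sketch in Python of the table_size * 2 rule. It is illustrative only: the function names, the table names, and the choice to compare the estimate directly against an `available_bytes` figure are assumptions, since the exact comparison PlanetScale performs is not spelled out here.

```python
from typing import Mapping

GIB = 1024 ** 3


def estimated_max_usage(table_sizes: Mapping[str, int]) -> int:
    """Conservative estimate: table_size * 2, summed over every table in the migration."""
    return sum(2 * size for size in table_sizes.values())


def can_enqueue(table_sizes: Mapping[str, int], available_bytes: int) -> bool:
    # Assumption: the estimate is compared directly against available storage.
    return estimated_max_usage(table_sizes) <= available_bytes


# Hypothetical tables changed by one deploy request.
tables = {"orders": 40 * GIB, "order_items": 10 * GIB}
print(estimated_max_usage(tables) // GIB)  # 100
print(can_enqueue(tables, 120 * GIB))      # True
print(can_enqueue(tables, 80 * GIB))       # False
```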
For all Metal deploy requests, the estimated storage usage will be displayed on the deploy request page, allowing you to see the projected storage impact for each shard before proceeding.
If you do not have enough storage space to make a copy of the changed table to perform the online schema change, you will not be able to begin a deploy request.
Sharded database calculations
For sharded databases, we check and create an estimate for each individual shard. If any individual shard has insufficient space, then the deploy request will not be able to begin.
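The per-shard rule means a single undersized shard is enough to block the deploy request. A minimal sketch in Python, assuming each shard reports the size of the involved tables and its available storage; `ShardStorage`, `shards_lacking_space`, and the shard names are hypothetical and only illustrate the logic:

```python
from typing import Dict, List, NamedTuple


class ShardStorage(NamedTuple):
    involved_table_bytes: int  # combined size of the tables being changed on this shard
    available_bytes: int       # storage available on this shard (assumed input)


def shards_lacking_space(shards: Dict[str, ShardStorage]) -> List[str]:
    """Return the shards whose table_size * 2 estimate exceeds their available storage."""
    return [
        name
        for name, shard in shards.items()
        if 2 * shard.involved_table_bytes > shard.available_bytes
    ]


def deploy_request_can_begin(shards: Dict[str, ShardStorage]) -> bool:
    # Every shard must pass; one failing shard blocks the whole deploy request.
    return not shards_lacking_space(shards)


shards = {
    "-80": ShardStorage(involved_table_bytes=30, available_bytes=100),
    "80-": ShardStorage(involved_table_bytes=60, available_bytes=100),
}
print(shards_lacking_space(shards))      # ['80-']
print(deploy_request_can_begin(shards))  # False
```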
Instant deployments exception
Instant deployments are the exception: they use MySQL's ALGORITHM=INSTANT and do not require creating artifact table copies.