Crystal: A Unified Cache Storage System for Analytical Databases
Summary: Crystal is a smart cache storage system, co-located with compute, that unifies analytical caching across DBMSs. It plugs into each DBMS through data-source connectors, receives their push-down predicates, and caches single-table regions (hyper-rectangles). Unmodified Spark and Greenplum see substantially lower query latency and use less remote bandwidth.
(summarized by gpt-5-nano on Feb 09 2026)
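The idea of caching single-table regions as hyper-rectangles can be illustrated with a small sketch. This is not Crystal's implementation; it only shows, under simplifying assumptions, how a cached region might be tested against a pushed-down predicate. The `Region` class, the `lookup` helper, and the column names are hypothetical, and each predicate is assumed to be a conjunction of closed ranges on individual columns.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Interval = Tuple[float, float]  # closed interval [low, high]


@dataclass
class Region:
    """Hypothetical single-table region: one closed interval per constrained column.

    A column that does not appear in `bounds` is treated as unconstrained.
    """
    table: str
    bounds: Dict[str, Interval]

    def contains(self, other: "Region") -> bool:
        """Return True if every row selected by `other` is also selected by `self`."""
        if self.table != other.table:
            return False
        for col, (lo, hi) in self.bounds.items():
            if col not in other.bounds:
                # `other` is unconstrained on a column that `self` restricts,
                # so `other` is wider than `self` along that dimension.
                return False
            q_lo, q_hi = other.bounds[col]
            if q_lo < lo or q_hi > hi:
                return False
        return True


def lookup(cache: List[Region], query: Region) -> Optional[Region]:
    """Return the first cached region that fully covers the query, if any."""
    for region in cache:
        if region.contains(query):
            return region
    return None


# Usage with made-up column names: one cached region, two incoming predicates.
cached = Region("lineitem", {"l_quantity": (0, 50)})
inside = Region("lineitem", {"l_quantity": (10, 20), "l_discount": (0.05, 0.07)})
overlapping = Region("lineitem", {"l_quantity": (40, 60)})

hit = lookup([cached], inside)        # covered: can be served from the cache
miss = lookup([cached], overlapping)  # only partly covered: not a full hit
```

In this sketch a query is served from the cache only when a single cached region fully covers it; a query that is only partly covered counts as a miss and would have to be fetched from remote storage.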
- Paper ID: 12419
- Venue: VLDB
- Year: 2021
- Pagerank: 5.1847534e-05
- Overall Rank: 6,149 | 57.23%
- DOI: 10.14778/3476249.3476292
Incoming Citations (Sorted by Pagerank)
Showing 10 of 10 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 4,544 | ScaleStore: A Fast and Cost-Efficient Storage Engine using DRAM, NVMe, and RDMA | 2022 | SIGMOD | 6.1000636e-05 |
| 4,870 | Exploiting Cloud Object Storage for High-Performance Analytics | 2023 | VLDB | 5.8613885e-05 |
| 6,972 | Predicate Caching: Query-Driven Secondary Indexing for Cloud Data Warehouses | 2024 | SIGMOD | 4.8785237e-05 |
| 7,427 | Selection Pushdown in Column Stores using Bit Manipulation Instructions | 2023 | SIGMOD | 4.7327406e-05 |
| 8,645 | Predicate Pushdown for Data Science Pipelines | 2023 | SIGMOD | 4.4772518e-05 |
| 8,945 | The Five-Minute Rule for the Cloud: Caching in Analytics Systems | 2025 | CIDR | 4.4254423e-05 |
| 9,125 | On-Demand State Separation for Cloud Data Warehousing | 2022 | VLDB | 4.3917246e-05 |
| 9,848 | Saving Money for Analytical Workloads in the Cloud | 2024 | VLDB | 4.2721228e-05 |
| 10,767 | The HANA Native Query Engine for Lakehouse Systems | 2025 | VLDB | 4.1945683e-05 |
| 10,854 | LiquidCache: Efficient Pushdown Caching for Cloud-Native Data Analytics | 2025 | VLDB | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 22 of 22 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 21 | C-Store: A Column-oriented DBMS | 2005 | VLDB | 0.00086087497 |
| 133 | On Rules, Procedures, Caching And Views In Data Base Systems | 1990 | SIGMOD | 0.00042757638 |
| 167 | The Snowflake Elastic Data Warehouse | 2016 | SIGMOD | 0.00039180521 |
| 442 | Semantic Data Caching and Replacement | 1996 | VLDB | 0.000230437 |
| 584 | Answering Queries with Aggregation Using Views | 1996 | VLDB | 0.0001971526 |
| 731 | Optimizing Queries Using Materialized Views: A Practical, Scalable Solution | 2001 | SIGMOD | 0.00017468889 |
| 746 | Delta Lake: High-Performance ACID Table Storage over Cloud Object Stores | 2020 | VLDB | 0.00017326979 |
| 981 | DynaMat: A Dynamic View Management System for Data Warehouses | 1999 | SIGMOD | 0.00014879532 |
| 1,021 | Materialized View Selection for Multidimensional Datasets* | 1998 | VLDB | 0.00014619259 |
| 1,477 | Fine-grained Partitioning for Aggressive Data Skipping | 2014 | SIGMOD | 0.00011770865 |
| 1,887 | Caching Multidimensional Queries Using Chunks | 1998 | SIGMOD | 0.00010204659 |
| 1,889 | Tsunami: A Learned Multi-dimensional Index for Correlated Data and Skewed Workloads | 2021 | VLDB | 0.00010200865 |
| 1,922 | Selecting Subexpressions to Materialize at Datacenter Scale | 2018 | VLDB | 0.00010082599 |
| 2,345 | Transparent Mid-Tier Database Caching in SQL Server | 2003 | SIGMOD | 8.9919454e-05 |
| 2,645 | WATCHMAN: A Data Warehouse Intelligent Cache Manager | 1996 | VLDB | 8.3829312e-05 |
| 2,691 | Greenplum: A Hybrid Database for Transactional and Analytical Workloads | 2021 | SIGMOD | 8.2909126e-05 |
| 2,693 | An Architecture for Recycling Intermediates in a Column-store | 2009 | SIGMOD | 8.2883398e-05 |
| 3,607 | Cache Tables: Paving the Way for an Adaptive Database Cache | 2003 | VLDB | 6.9253431e-05 |
| 3,922 | Pushing Data-Induced Predicates Through Joins in Big-Data Clusters | 2020 | VLDB | 6.6291079e-05 |
| 4,174 | Computation Reuse in Analytics Job Service at Microsoft | 2018 | SIGMOD | 6.3856219e-05 |
| 6,777 | Revisiting Reuse in Main Memory Database Systems | 2017 | SIGMOD | 4.9288776e-05 |
| 7,407 | Intermittent Query Processing | 2019 | VLDB | 4.7373205e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 6,861 | HetCache: Synergising NVMe Storage and GPU acceleration for Memory-Efficient Analytics | 2023 | CIDR | 4.905263e-05 |
| 7,842 | ChronoCache: Predictive and Adaptive Mid-Tier Query Result Caching | 2020 | SIGMOD | 4.6368238e-05 |
| 10,649 | STsCache: An Efficient Semantic Caching Scheme for Time-series Data Workloads Based on Hybrid Storage | 2025 | VLDB | 4.1945683e-05 |
| 6,066 | GPU Database Systems Characterization and Optimization | 2024 | VLDB | 5.2290447e-05 |
| 1,887 | Caching Multidimensional Queries Using Chunks | 1998 | SIGMOD | 0.00010204659 |
| 5,301 | ReCache: Reactive Caching for Fast Analytics over Heterogeneous Data | 2018 | VLDB | 5.5790928e-05 |
| 4,667 | FlexPushdownDB: Hybrid Pushdown and Caching in a Cloud DBMS | 2021 | VLDB | 6.0116919e-05 |
| 8,945 | The Five-Minute Rule for the Cloud: Caching in Analytics Systems | 2025 | CIDR | 4.4254423e-05 |
| 10,854 | LiquidCache: Efficient Pushdown Caching for Cloud-Native Data Analytics | 2025 | VLDB | 4.1945683e-05 |
| 4,870 | Exploiting Cloud Object Storage for High-Performance Analytics | 2023 | VLDB | 5.8613885e-05 |