| 6,388 | Optimizing Data-intensive Systems in Disaggregated Data Centers with TELEPORT | 2022 | SIGMOD | 5.0851841e-05 |
| 6,541 | ConnectorX: Accelerating Data Loading From Databases to Dataframes | 2022 | VLDB | 5.0216945e-05 |
| 6,590 | Interactive Demonstration of Probabilistic Predicates | 2018 | SIGMOD | 5.0010949e-05 |
| 6,658 | Scalable Querying of Nested Data | 2021 | VLDB | 4.9711629e-05 |
| 6,673 | Incorporating Super-Operators in Big-Data Query Optimizers | 2020 | VLDB | 4.966799e-05 |
| 6,715 | Shared Foundations: Modernizing Meta's Data Lakehouse | 2023 | CIDR | 4.9509939e-05 |
| 6,745 | DistME: A Fast and Elastic Distributed Matrix Computation Engine using GPUs | 2019 | SIGMOD | 4.9417155e-05 |
| 6,759 | AStream: Ad-hoc Shared Stream Processing | 2019 | SIGMOD | 4.9352213e-05 |
| 6,784 | SparkR: Scaling R Programs with Spark | 2016 | SIGMOD | 4.9265155e-05 |
| 6,871 | Towards General and Efficient Online Tuning for Spark | 2023 | VLDB | 4.8997004e-05 |
| 6,993 | Unit Testing Data with Deequ | 2019 | SIGMOD | 4.8693227e-05 |
| 7,059 | Adaptive and Robust Query Execution for Lakehouses at Scale | 2024 | VLDB | 4.8477825e-05 |
| 7,060 | SquirrelJoin: Network-Aware Distributed Join Processing with Lazy Partitioning | 2017 | VLDB | 4.8465382e-05 |
| 7,067 | JetScope: Reliable and Interactive Analytics at Cloud Scale | 2015 | VLDB | 4.8440936e-05 |
| 7,207 | Kodiak: Leveraging Materialized Views For Very Low-Latency Analytics Over High-Dimensional Web-Scale Data | 2016 | VLDB | 4.800763e-05 |
| 7,237 | CleanM: An Optimizable Query Language for Unified Scale-Out Data Cleaning | 2017 | VLDB | 4.7928651e-05 |
| 7,296 | Multi-Tenant Cloud Data Services: State-of-the-Art, Challenges and Opportunities | 2022 | SIGMOD | 4.7723197e-05 |
| 7,387 | Bubble Execution: Resource-aware Reliable Analytics at Cloud Scale | 2018 | VLDB | 4.7438193e-05 |
| 7,399 | SmartBench: A Benchmark For Data Management In Smart Spaces | 2020 | VLDB | 4.7410149e-05 |
| 7,427 | Selection Pushdown in Column Stores using Bit Manipulation Instructions | 2023 | SIGMOD | 4.7327406e-05 |
| 7,534 | Enabling Efficient and General Subpopulation Analytics in Multidimensional Data Streams | 2022 | VLDB | 4.7180004e-05 |
| 7,599 | Quill: Efficient, Transferable, and Rich Analytics at Scale | 2016 | VLDB | 4.7003593e-05 |
| 7,704 | ExDRa: Exploratory Data Science on Federated Raw Data | 2021 | SIGMOD | 4.6733838e-05 |
| 7,723 | Mind the Gap: Bridging Multi-Domain Query Workloads with EmptyHeaded | 2017 | VLDB | 4.6676712e-05 |
| 7,818 | A Survey and Experimental Comparison of Distributed SPARQL Engines for Very Large RDF Data | 2017 | VLDB | 4.6434716e-05 |
| 7,905 | S2RDF: RDF Querying with SPARQL on Spark | 2016 | VLDB | 4.6211706e-05 |
| 7,907 | Petabyte-Scale Row-Level Operations in Data Lakehouses | 2024 | VLDB | 4.6205839e-05 |
| 7,925 | Architecting a Query Compiler for Spatial Workloads | 2020 | SIGMOD | 4.6153403e-05 |
| 7,953 | Shasta: Interactive Reporting At Scale | 2016 | SIGMOD | 4.613363e-05 |
| 8,002 | Pangea: Monolithic Distributed Storage for Data Analytics | 2019 | VLDB | 4.6088289e-05 |
| 8,075 | AJoin: Ad-hoc Stream Joins at Scale | 2020 | VLDB | 4.5917655e-05 |
| 8,130 | Simple & Optimal Quantile Sketch: Combining Greenwald-Khanna with Khanna-Greenwald | 2024 | PODS | 4.5784634e-05 |
| 8,197 | SparkCruise: Workload Optimization in Managed Spark Clusters at Microsoft | 2021 | VLDB | 4.5607121e-05 |
| 8,230 | You Say 'What', I Hear 'Where' and 'Why' - (Mis-)Interpreting SQL to Derive Fine-Grained Provenance | 2018 | VLDB | 4.5541444e-05 |
| 8,248 | Flare & Lantern: Efficiently Swapping Horses Midstream | 2019 | VLDB | 4.5509332e-05 |
| 8,396 | Optimizing Declarative Graph Queries at Large Scale | 2019 | SIGMOD | 4.5276541e-05 |
| 8,429 | Handling Environments in a Nested Relational Algebra with Combinators and an Implementation in a Verified Query Compiler | 2017 | SIGMOD | 4.5156925e-05 |
| 8,479 | Excalibur: A Virtual Machine for Adaptive Fine-grained JIT-Compiled Query Execution based on VOILA | 2023 | VLDB | 4.5014929e-05 |
| 8,506 | New Query Optimization Techniques in the Spark Engine of Azure Synapse | 2022 | VLDB | 4.4957661e-05 |
| 8,534 | Translation of Array-Based Loops to Distributed Data-Parallel Programs | 2020 | VLDB | 4.4937074e-05 |
| 8,617 | A Spark Optimizer for Adaptive, Fine-Grained Parameter Tuning | 2024 | VLDB | 4.4846425e-05 |
| 8,645 | Predicate Pushdown for Data Science Pipelines | 2023 | SIGMOD | 4.4772518e-05 |
| 8,672 | Optimizing Video Selection LIMIT Queries With Commonsense Knowledge | 2024 | VLDB | 4.4710897e-05 |
| 8,758 | Hyperspace: The Indexing Subsystem of Azure Synapse | 2021 | VLDB | 4.456315e-05 |
| 8,781 | Accelerate Distributed Joins with Predicate Transfer | 2025 | SIGMOD | 4.4534753e-05 |
| 8,980 | HADAD: A Lightweight Approach for Optimizing Hybrid Complex Analytics Queries | 2021 | SIGMOD | 4.4169807e-05 |
| 9,001 | The Power of Nested Parallelism in Big Data Processing – Hitting Three Flies with One Slap – | 2021 | SIGMOD | 4.4107627e-05 |
| 9,016 | Making Data Engineering Declarative | 2023 | CIDR | 4.4094312e-05 |
| 9,093 | Databricks Lakeguard: Supporting Fine-grained Access Control and Multi-user Capabilities for Apache Spark Workloads | 2025 | SIGMOD | 4.398149e-05 |
| 9,124 | Dynamic Speculative Optimizations for SQL Compilation in Apache Spark | 2020 | VLDB | 4.391961e-05 |