| 35 | MonetDB/X100: Hyper-Pipelining Query Execution | 2005 | CIDR | 0.00076197749 |
| 60 | Efficiently Compiling Efficient Query Plans for Modern Hardware | 2011 | VLDB | 0.00064439773 |
| 254 | Snorkel: Rapid Training Data Creation with Weak Supervision | 2018 | VLDB | 0.00030540555 |
| 351 | Sort vs. Hash Revisited: Fast Join Implementation on Modern Multi-Core CPUs | 2009 | VLDB | 0.0002636504 |
| 404 | Multi-Core, Main-Memory Joins: Sort vs. Hash Revisited | 2014 | VLDB | 0.00024143076 |
| 536 | The LDBC Social Network Benchmark: Interactive Workload | 2015 | SIGMOD | 0.00020722862 |
| 557 | SystemML: Declarative Machine Learning on Spark | 2016 | VLDB | 0.00020197988 |
| 585 | Massively Parallel Sort-Merge Joins in Main Memory Multi-Core Database Systems | 2012 | VLDB | 0.00019706145 |
| 667 | Incremental Knowledge Base Construction Using DeepDive | 2015 | VLDB | 0.00018440557 |
| 683 | Cerebro: A Data System for Optimized Deep Learning Model Selection | 2020 | VLDB | 0.00018195476 |
| 727 | On Synopses for Distinct-Value Estimation Under Multiset Operations | 2007 | SIGMOD | 0.00017508726 |
| 761 | Materialization Optimizations for Feature Selection Workloads | 2014 | SIGMOD | 0.00017053783 |
| 1,215 | Snuba: Automating Weak Supervision to Label Training Data | 2019 | VLDB | 0.0001323375 |
| 1,402 | Hybrid Parallelization Strategies for Large-Scale Machine Learning in SystemML | 2014 | VLDB | 0.00012180605 |
| 1,420 | Data Management Challenges in Production Machine Learning | 2017 | SIGMOD | 0.00012057956 |
| 1,427 | Towards Scalable Dataframe Systems | 2020 | VLDB | 0.0001204248 |
| 1,482 | Automating Large-Scale Data Quality Verification | 2018 | VLDB | 0.00011725533 |
| 1,666 | HELIX: Holistic Optimization for Accelerating Iterative Machine Learning | 2019 | VLDB | 0.0001096361 |
| 1,727 | BigBench: Towards an Industry Standard Benchmark for Big Data Analytics | 2013 | SIGMOD | 0.00010740936 |
| 1,804 | An Experimental Comparison of Thirteen Relational Equi-Joins in Main Memory | 2016 | SIGMOD | 0.00010501185 |
| 1,967 | Compressed Linear Algebra for Large-Scale Machine Learning | 2016 | VLDB | 9.9131712e-05 |
| 2,122 | SystemDS: A Declarative Machine Learning System for the End-to-End Data Science Lifecycle | 2020 | CIDR | 9.4989076e-05 |
| 2,170 | tf.data: A Machine Learning Data Processing Framework | 2021 | VLDB | 9.3821603e-05 |
| 2,249 | Orca: A Modular Query Optimizer Architecture for Big Data | 2014 | SIGMOD | 9.2034693e-05 |
| 2,456 | Production Machine Learning Pipelines: Empirical Analysis and Optimization Opportunities | 2021 | SIGMOD | 8.7733773e-05 |
| 2,623 | GenBase: A Complex Analytics Genomics Benchmark | 2014 | SIGMOD | 8.4374366e-05 |
| 2,848 | Exploiting Matrix Dependency for Efficient Distributed Matrix Computation | 2015 | SIGMOD | 8.0208832e-05 |
| 3,721 | To Partition, or Not to Partition, That is the Join Question in a Real System | 2021 | SIGMOD | 6.8179379e-05 |
| 3,763 | Flexible Rule-Based Decomposition and Metadata Independence in Modin: A Parallel Dataframe System | 2022 | VLDB | 6.7801795e-05 |
| 3,918 | On Optimizing Operator Fusion Plans for Large-Scale Machine Learning in SystemML | 2018 | VLDB | 6.6315176e-05 |
| 3,948 | A Comparative Evaluation of Systems for Scalable Linear Algebra-based Analytics | 2018 | VLDB | 6.5959084e-05 |
| 4,196 | Overton: A Data System for Monitoring and Improving Machine-Learned Products | 2020 | CIDR | 6.3686231e-05 |
| 4,261 | Parallelizing Query Optimization | 2008 | VLDB | 6.31244e-05 |
| 4,505 | SPOOF: Sum-Product Optimization and Operator Fusion for Large-Scale Machine Learning | 2017 | CIDR | 6.1327108e-05 |
| 4,769 | Automated Feature Engineering for Algorithmic Fairness | 2021 | VLDB | 5.934329e-05 |
| 4,774 | LIMA: Fine-grained Lineage Tracing and Reuse in Machine Learning Systems | 2021 | SIGMOD | 5.9316087e-05 |
| 4,833 | MNC: Structure-Exploiting Sparsity Estimation for Matrix Expressions | 2019 | SIGMOD | 5.8916346e-05 |
| 5,087 | Accelerating Queries with Group-By and Join by Groupjoin | 2011 | VLDB | 5.7075009e-05 |
| 5,242 | Towards Benchmarking Feature Type Inference for AutoML Platforms | 2021 | SIGMOD | 5.6074743e-05 |
| 6,053 | Optimizing Machine Learning Workloads in Collaborative Environments | 2020 | SIGMOD | 5.2326838e-05 |
| 6,228 | Managing ML Pipelines: Feature Stores and the Coming Wave of Embedding Ecosystems | 2021 | VLDB | 5.1470042e-05 |
| 7,470 | The Case for Deep Query Optimisation | 2020 | CIDR | 4.7201897e-05 |
| 7,704 | ExDRa: Exploratory Data Science on Federated Raw Data | 2021 | SIGMOD | 4.6733838e-05 |
| 7,723 | Mind the Gap: Bridging Multi-Domain Query Workloads with EmptyHeaded | 2017 | VLDB | 4.6676712e-05 |
| 9,001 | The Power of Nested Parallelism in Big Data Processing – Hitting Three Flies with One Slap – | 2021 | SIGMOD | 4.4107627e-05 |