Database Paper Browser

Lachesis: Automatic Partitioning for UDF-Centric Analytics

Summary: Lachesis automates data partitioning for UDF-centric analytics. It models each workload as a set of sub-computations and uses deep reinforcement learning to select the sub-computations that guide partitioning, so that storage is optimized automatically across applications and developer productivity improves. (summarized by gpt-5-nano on Feb 09 2026)

Paper ID: 12316
Venue: VLDB
Year: 2021
Pagerank: 4.7188928e-05
Overall Rank: 7,476 | 48.00%
DOI: 10.14778/3457390.3457392

Incoming Citations (Sorted by Pagerank)

Showing 6 of 6 citing papers.

Outgoing Citations (Sorted by Pagerank)

Showing 33 of 33 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank | Cited Paper | Year | Venue | Pagerank
70 | Hive - A Warehousing Solution Over a Map-Reduce Framework | 2009 | VLDB | 0.00059533166
285 | Automating Physical Database Design in a Parallel Database | 2002 | SIGMOD | 0.0002899128
286 | Integrating Vertical and Horizontal Partitioning into Automated Physical Database Design | 2004 | SIGMOD | 0.00028990057
408 | Database Cracking | 2007 | CIDR | 0.00023953844
557 | SystemML: Declarative Machine Learning on Spark | 2016 | VLDB | 0.00020197988
704 | Building Efficient Query Engines in a High-Level Language | 2014 | VLDB | 0.00017900583
794 | Hadoop++: Making a Yellow Elephant Run Like a Cheetah (Without It Even Noticing) | 2010 | VLDB | 0.00016605103
979 | Interactive Analytical Processing in Big Data Systems: A Cross-Industry Study of MapReduce Workloads | 2012 | VLDB | 0.0001488055
1,108 | Froid: Optimization of Imperative Programs in a Relational Database | 2018 | VLDB | 0.00013984276
1,750 | Weld: A Common Runtime for High Performance Data Analytics | 2017 | CIDR | 0.00010683647
2,350 | An Intermediate Representation for Optimizing Machine Learning Pipelines | 2019 | VLDB | 8.9788641e-05
2,413 | Automated Partitioning Design in Parallel Database Systems | 2011 | SIGMOD | 8.8672223e-05
2,418 | Tupleware: "Big" Data, Big Analytics, Small Clusters | 2015 | CIDR | 8.8556595e-05
2,439 | CoHadoop: Flexible Data Placement and Its Exploitation in Hadoop | 2011 | VLDB | 8.8190594e-05
2,818 | Implicit Parallelism through Deep Language Embedding | 2015 | SIGMOD | 8.0665558e-05
2,983 | Supporting Table Partitioning By Reference in Oracle | 2008 | SIGMOD | 7.7796493e-05
3,076 | Learning a Partitioning Advisor for Cloud Databases | 2020 | SIGMOD | 7.6107677e-05
3,349 | Schema Management for Document Stores | 2015 | VLDB | 7.1903648e-05
3,535 | Scaling Spark in the Real World: Performance and Usability | 2015 | VLDB | 6.9992495e-05
3,821 | Locality-aware Partitioning in Parallel Database Systems | 2015 | SIGMOD | 6.7281515e-05
4,061 | Advanced Partitioning Techniques for Massively Distributed Computation | 2012 | SIGMOD | 6.483587e-05
4,174 | Computation Reuse in Analytics Job Service at Microsoft | 2018 | SIGMOD | 6.3856219e-05
4,409 | Declarative Recursive Computation on an RDBMS | 2019 | VLDB | 6.2104034e-05
4,437 | Clash of the Titans: MapReduce vs. Spark for Large Scale Data Analytics | 2015 | VLDB | 6.1907793e-05
5,118 | AdaptDB: Adaptive Partitioning for Distributed Joins | 2017 | VLDB | 5.6820984e-05
5,821 | Tensor Relational Algebra for Distributed Machine Learning System Design | 2021 | VLDB | 5.3134851e-05
5,960 | Skew-Aware Join Optimization for Array Databases | 2015 | SIGMOD | 5.2559595e-05
6,619 | Near-Optimal Distributed Band-Joins through Recursive Partitioning | 2020 | SIGMOD | 4.9910152e-05
7,134 | Incremental Elasticity For Array Databases | 2014 | SIGMOD | 4.822331e-05
7,304 | MRTuner: A Toolkit to Enable Holistic Optimization for MapReduce Jobs | 2014 | VLDB | 4.7684491e-05
8,002 | Pangea: Monolithic Distributed Storage for Data Analytics | 2019 | VLDB | 4.6088289e-05
8,137 | Customizable and Scalable Fuzzy Join for Big Data | 2019 | VLDB | 4.5774794e-05
9,332 | PlinyCompute: A Platform for High-Performance, Distributed, Data-Intensive Tool Development | 2018 | SIGMOD | 4.3556432e-05
