Azure Data Lake Store: A Hyperscale Distributed File Service for Big Data Analytics
Summary: ADLS is a fully managed, exabyte-scale file service optimized for highly parallel big-data analytics with co-located compute and data. It unifies Microsoft's Cosmos file system and Hadoop behind an HDFS-compatible API that also supports Cosmos semantics, adds multi-tier storage and security, and outlines the migration from Cosmos to ADLS.
(summarized by gpt-5-nano on Feb 09 2026)
- Paper ID: 5348
- Venue: SIGMOD
- Year: 2017
- Pagerank: 7.6717218e-05
- Overall Rank: 3,038 | 78.87%
- DOI: 10.1145/3035918.3056100
Incoming Non-self Citations Over Time
Incoming Citations (Sorted by Pagerank)
Showing 23 of 23 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
| --- | --- | --- | --- | --- |
| 1,922 | Selecting Subexpressions to Materialize at Datacenter Scale | 2018 | VLDB | 0.00010082599 |
| 2,062 | Dremel: A Decade of Interactive SQL Analysis at Web Scale | 2020 | VLDB | 9.6481955e-05 |
| 2,083 | Towards a Learning Optimizer for Shared Clouds | 2019 | VLDB | 9.5834572e-05 |
| 2,359 | Data Market Platforms: Trading Data Assets to Solve Data Problems | 2020 | VLDB | 8.9607667e-05 |
| 2,545 | POLARIS: The Distributed SQL Engine in Azure Synapse | 2020 | VLDB | 8.5725413e-05 |
| 2,954 | Magpie: Python at Speed and Scale using Cloud Backends | 2021 | CIDR | 7.8262582e-05 |
| 3,407 | End-to-end Optimization of Machine Learning Prediction Queries | 2022 | SIGMOD | 7.1295646e-05 |
| 3,625 | Cost Models for Big Data Query Processing: Learning, Retrofitting, and Our Findings | 2020 | SIGMOD | 6.9055212e-05 |
| 5,489 | To Share, or not to Share Online Event Trend Aggregation Over Bursty Event Streams | 2021 | SIGMOD | 5.4782335e-05 |
| 5,562 | A Deep Dive into Common Open Formats for Analytical DBMSs | 2023 | VLDB | 5.4331334e-05 |
| 6,242 | Helios: Hyperscale Indexing for the Cloud & Edge | 2020 | VLDB | 5.1408379e-05 |
| 6,261 | The Cosmos Big Data Platform at Microsoft: Over a Decade of Progress and a Decade to Look Forward | 2021 | VLDB | 5.1350714e-05 |
| 6,673 | Incorporating Super-Operators in Big-Data Query Optimizers | 2020 | VLDB | 4.966799e-05 |
| 6,757 | KEA: Tuning an Exabyte-Scale Data Infrastructure | 2021 | SIGMOD | 4.9372134e-05 |
| 7,778 | Runtime Variation in Big Data Analytics | 2023 | SIGMOD | 4.653651e-05 |
| 8,519 | Extending Polaris to Support Transactions | 2024 | SIGMOD | 4.494088e-05 |
| 8,758 | Hyperspace: The Indexing Subsystem of Azure Synapse | 2021 | VLDB | 4.456315e-05 |
| 8,785 | Bringing Cloud-Native Storage to SAP IQ | 2021 | SIGMOD | 4.4522556e-05 |
| 9,194 | Phoebe: A Learning-based Checkpoint Optimizer | 2021 | VLDB | 4.3761777e-05 |
| 9,232 | AutoComp: Automated Data Compaction for Log-Structured Tables in Data Lakes | 2025 | SIGMOD | 4.3690661e-05 |
| 9,701 | Towards Functional Decomposition of Storage Formats | 2025 | CIDR | 4.3008468e-05 |
| 10,803 | GraphAr: An Efficient Storage Scheme for Graph Data in Data Lakes | 2025 | VLDB | 4.1945683e-05 |
| 11,217 | Efficient Approximation Framework for Attribute Recommendation | 2023 | SIGMOD | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 4 of 4 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
| --- | --- | --- | --- | --- |
| 6,104 | Automating Distributed Tiered Storage Management in Cluster Computing | 2020 | VLDB | 5.2080102e-05 |
| 10,787 | AnalyticDB-PG: A Cloud-native High-performance Data Warehouse in Alibaba Cloud | 2025 | VLDB | 4.1945683e-05 |
| 7,907 | Petabyte-Scale Row-Level Operations in Data Lakehouses | 2024 | VLDB | 4.6205839e-05 |
| 8,416 | Towards Building Autonomous Data Services on Azure | 2023 | SIGMOD | 4.5196199e-05 |
| 746 | Delta Lake: High-Performance ACID Table Storage over Cloud Object Stores | 2020 | VLDB | 0.00017326979 |
| 11,668 | Cost-Effective, Workload-Adaptive Migration of Big Data Applications to the Cloud | 2019 | SIGMOD | 4.1945683e-05 |
| 5,318 | Analyzing and Comparing Lakehouse Storage Systems | 2023 | CIDR | 5.5715872e-05 |
| 4,861 | OctopusFS: A Distributed File System with Tiered Storage Management | 2017 | SIGMOD | 5.8708916e-05 |
| 11,545 | Pixels: Multiversion Wide Table Store for Data Lakes | 2020 | CIDR | 4.1945683e-05 |
| 6,261 | The Cosmos Big Data Platform at Microsoft: Over a Decade of Progress and a Decade to Look Forward | 2021 | VLDB | 5.1350714e-05 |