The Cosmos Big Data Platform at Microsoft: Over a Decade of Progress and a Decade to Look Forward
Summary: Cosmos' exabyte-scale evolution at Microsoft spans reliability, scale, efficiency, and usability, with next steps toward security, compliance, and heterogeneous analytics. The paper links Cosmos workload evolution to broad big-data trends, offering platform-driven design insights for researchers. (summarized by gpt-5-nano on Feb 09 2026)
Authors
1. Conor Power
2. Hiren Patel
3. Alekh Jindal
4. Jyoti Leeka
5. Bob Jenkins
6. Michael Rys
7. Ed Triou
8. Dexin Zhu
9. Lucky Katahanas
10. Chakrapani Bhat Talapady
11. Joshua Rowe
12. Fan Zhang
13. Rich Draves
14. Marc Friedman
15. Ivan Santa Maria Filho
16. Amrish Kumar
Incoming Citations (Sorted by Pagerank)
Showing 10 of 10 citing papers.
| Overall Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 4,690 | Deploying a Steered Query Optimizer in Production at Microsoft | 2022 | SIGMOD | 5.997226e-05 |
| 7,778 | Runtime Variation in Big Data Analytics | 2023 | SIGMOD | 4.653651e-05 |
| 8,416 | Towards Building Autonomous Data Services on Azure | 2023 | SIGMOD | 4.5196199e-05 |
| 8,854 | Optimizing the cloud? Don't train models. Build oracles! | 2024 | CIDR | 4.4349047e-05 |
| 8,859 | Pipemizer: An Optimizer for Analytics Data Pipelines | 2022 | VLDB | 4.4344107e-05 |
| 10,401 | Asynchronous Replication Strategies for a Real-Time DBMS | 2025 | SIGMOD | 4.1945683e-05 |
| 10,723 | UniClean: A Scalable Data Cleaning Solution for Mixed Errors based on Unified Cleaners and Optimized Cleaning Workflow | 2025 | VLDB | 4.1945683e-05 |
| 10,767 | The HANA Native Query Engine for Lakehouse Systems | 2025 | VLDB | 4.1945683e-05 |
| 10,931 | Proactive Resume and Pause of Resources for Microsoft Azure SQL Database Serverless | 2024 | SIGMOD | 4.1945683e-05 |
| 13,196 | PikePlace: Generating Intelligence for Marketplace Datasets | 2023 | VLDB | - |
Outgoing Citations (Sorted by Pagerank)
Showing 32 of 32 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 4,132 | Advanced Join Strategies for Large-Scale Distributed Computation | 2014 | VLDB | 6.4241067e-05 |
| 11,668 | Cost-Effective, Workload-Adaptive Migration of Big Data Applications to the Cloud | 2019 | SIGMOD | 4.1945683e-05 |
| 8,416 | Towards Building Autonomous Data Services on Azure | 2023 | SIGMOD | 4.5196199e-05 |
| 6,242 | Helios: Hyperscale Indexing for the Cloud & Edge | 2020 | VLDB | 5.1408379e-05 |
| 6,757 | KEA: Tuning an Exabyte-Scale Data Infrastructure | 2021 | SIGMOD | 4.9372134e-05 |
| 7,778 | Runtime Variation in Big Data Analytics | 2023 | SIGMOD | 4.653651e-05 |
| 3,038 | Azure Data Lake Store: A Hyperscale Distributed File Service for Big Data Analytics | 2017 | SIGMOD | 7.6717218e-05 |
| 5,297 | Continuous Cloud-Scale Query Optimization and Processing | 2013 | VLDB | 5.5801669e-05 |
| 6,136 | Scalable Progressive Analytics on Big Data in the Cloud | 2013 | VLDB | 5.1928748e-05 |
| 12,449 | The Microsoft Data Platform | 2007 | SIGMOD | 4.1945683e-05 |