Web Data Extraction using Hybrid Program Synthesis: A Combination of Top-down and Bottom-up Inference
Summary: Hybrid program synthesis combines top-down deductive inference with bottom-up enumerative inference to learn concise web data extraction programs from few examples. A semi-supervised approach improves accuracy, reduces the number of examples needed, and lowers program complexity; the technique is deployed in Power BI. (summarized by gpt-5-nano on Feb 09 2026)
Authors
1. Mohammad Raza
2. Sumit Gulwani
Incoming Citations (Sorted by Pagerank)
Showing 4 of 4 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 7,826 | The Smallest Extraction Problem | 2021 | VLDB | 4.6416742e-05 |
| 9,248 | Web Record Extraction with Invariants | 2023 | VLDB | 4.3690661e-05 |
| 11,245 | Cornet: Learning Table Formatting Rules By Example | 2023 | VLDB | 4.1945683e-05 |
| 11,343 | SPINE: Scaling up Programming-by-Negative-Example for String Filtering and Transformation | 2022 | SIGMOD | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 5 of 5 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 533 | RoadRunner: Towards Automatic Data Extraction from Large Web Sites | 2001 | VLDB | 0.00020757722 |
| 587 | Extracting Structured Data from Web Pages | 2003 | SIGMOD | 0.00019648348 |
| 1,132 | Building light-weight wrappers for legacy Web data-sources using W4F | 1999 | VLDB | 0.00013777657 |
| 1,469 | BlinkFill: Semi-supervised Programming By Example for Syntactic String Transformations | 2016 | VLDB | 0.00011836053 |
| 7,681 | SXPath - Extending XPath towards Spatial Querying on Web Documents | 2011 | VLDB | 4.6804276e-05 |