Combinatorial Pattern Discovery for Scientific Data: Some Preliminary Results
Summary: This paper presents combinatorial pattern discovery for structural data, using string edit distance with variable-length don't cares as the distance metric. Experiments on protein databases show that novel heuristics yield effective, generalizable patterns that complement top protein classifiers. (summarized by gpt-5-nano on Feb 09 2026)
Incoming Non-self Citations Over Time
Authors
1. Jason Tsong-Li Wang
2. Gung-Wei Chin
3. Thomas G. Marr
4. Bruce Shapiro
5. Dennis Shasha
6. Kaizhong Zhang
Incoming Citations (Sorted by Pagerank)
Showing 9 of 9 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 362 | Fast Similarity Search in the Presence of Noise, Scaling, and Translation in Time-Series Databases | 1995 | VLDB | 0.00025770385 |
| 547 | An Efficient Algorithm for Mining Association Rules in Large Databases | 1995 | VLDB | 0.00020420717 |
| 1,146 | Estimating Alphanumeric Selectivity in the Presence of Wildcards | 1996 | SIGMOD | 0.00013679782 |
| 4,941 | Comparing Hierarchical Data in External Memory | 1999 | VLDB | 5.8179899e-05 |
| 5,336 | SPIRIT: Sequential Pattern Mining with Regular Expression Constraints | 1999 | VLDB | 5.5641083e-05 |
| 6,128 | Data Mining with the SAP NetWeaver BI Accelerator | 2006 | VLDB | 5.1979556e-05 |
| 7,973 | Pattern Matching and Pattern Discovery in Scientific, Program, and Document Databases | 1995 | SIGMOD | 4.613363e-05 |
| 12,645 | Mining Long Sequential Patterns in a Noisy Environment | 2002 | SIGMOD | 4.1945683e-05 |
| 13,973 | Free Parallel Data Mining | 1998 | SIGMOD | - |
Outgoing Citations (Sorted by Pagerank)
Showing 8 of 8 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 13 | Mining Association Rules between Sets of Items in Large Databases | 1993 | SIGMOD | 0.0010864752 |
| 92 | Practical Selectivity Estimation through Adaptive Sampling | 1990 | SIGMOD | 0.00051315959 |
| 146 | Knowledge Discovery in Databases: An Attribute-Oriented Approach | 1992 | VLDB | 0.00041315295 |
| 230 | An Interval Classifier for Database Mining Applications | 1992 | VLDB | 0.00032217064 |
| 357 | Random Sampling from B+ trees | 1989 | VLDB | 0.00026020098 |
| 367 | Sequential Sampling Procedures For Query Size Estimation | 1992 | SIGMOD | 0.00025509745 |
| 5,756 | Query Processing for Distance Metrics | 1990 | VLDB | 5.3401202e-05 |
| 14,202 | Data and Knowledge Bases for Genome Mapping: What Lies Ahead? | 1991 | VLDB | - |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 919 | Distance-Join: Pattern Match Query In a Large Graph Database | 2009 | VLDB | 0.00015343179 |
| 12,648 | Searching on the Secondary Structure of Protein Sequences | 2002 | VLDB | 4.1945683e-05 |
| 11,559 | Approximate Pattern Matching in Massive Graphs with Precision and Recall Guarantees | 2020 | SIGMOD | 4.1945683e-05 |
| 10,599 | Time Series Motif Discovery: A Comprehensive Evaluation | 2025 | VLDB | 4.1945683e-05 |
| 3,862 | A Partition-Based Approach to Structure Similarity Search | 2014 | VLDB | 6.687769e-05 |
| 8,306 | Online Windowed Subsequence Matching over Probabilistic Sequences | 2012 | SIGMOD | 4.5435639e-05 |
| 4,817 | Clustering by Pattern Similarity in Large Data Sets | 2002 | SIGMOD | 5.8987807e-05 |
| 4,307 | Mining Periodic Patterns with Gap Requirement from Sequences | 2005 | SIGMOD | 6.2885419e-05 |
| 12,645 | Mining Long Sequential Patterns in a Noisy Environment | 2002 | SIGMOD | 4.1945683e-05 |
| 7,973 | Pattern Matching and Pattern Discovery in Scientific, Program, and Document Databases | 1995 | SIGMOD | 4.613363e-05 |