III-COR-Small: Collaborative Research: Subsequence Matching for Content-based Access in Very Large Multimedia Databases

Supported by National Science Foundation

Award Number: IIS-0812601

Award Number: IIS-0812309

PIs: Vassilis Athitsos (UTA), Gautam Das (UTA), George Kollios (BU).
Dates: from September 1, 2008, to August 31, 2012.

This was a collaborative project, led by Professors Vassilis Athitsos and Gautam Das at UTA, and Professor George Kollios at Boston University. The project investigated methods for efficient subsequence matching in large databases of sequences (time series and strings).

Problem Definition and Background

In subsequence matching, given a specific sequence as input, we want to identify the best matching subsequences of (possibly long) sequences stored in a database. The sequences can be sequences of letters (for example, text, or DNA/protein sequences), sequences of numbers (such as closing stock market prices for each day of a year or a decade), sequences of images (movies), or sequences of notes (melodies).

Since sequences can represent such diverse data, the subsequence matching problem is relevant for several interesting applications. An example is sign spotting. Here, the input sequence is a video of a specific sign from some sign language, e.g., American Sign Language (ASL). The database contains sign language videos. The goal is to identify spots in the database videos where the input sign occurs.

Another example application is query-by-humming. Imagine having a melody in your mind, that you can't remember (or never knew in the first place) what song it is from. In a query-by-humming system, the user gives that melody as input to the system, by simply humming that melody on the computer's microphone. The system contains a large database of melodies, and it identifies the database melodies that contain parts similar to what the user hummed.

In this project, our goal was to different design methods, sometimes more generally applicable, sometimes applicable to more specific types of data, that can improve the state-of-the-art in subsequence matching. Improving the state-of-the-art means improving accuracy and efficiency in such systems, so that the system identifies more often the correct results, and the system produces those results to the user faster.

Key Findings

  • Embedding-Based Subsequence Matching:

    We have introduced an embedding-based framework for subsequence matching in time-series databases that improves the ef?ciency of processing subsequence matching queries under the Dynamic Time Warping (DTW) distance measure. This framework partially reduces subsequence matching to vector matching, using an embedding that maps each query sequence to a vector and each database time series into a sequence of vectors. The database embedding is computed of?ine, as a preprocessing step. At runtime, given a query object, an embedding of that object is computed online. Relatively few areas of interest are ef?ciently identi?ed in the database sequences by comparing the embedding of the query with the database vectors. Those areas of interest are then fully explored using the exact DTW-based subsequence matching algorithm.

  • Reference-Based Sequence Alignment:

    We have developed a method called reference-based string alignment for efficient subsequence matching of strings, under the edit distance and the Smith-Waterman similarity measure. Our method operates under the assumption that the query and its optimal subsequence match do not differ by more than 15% (as a fraction of the query length). We have obtained state-of-the-art indexing performance, as measured on a DNA dataset where the database contains a sequence of over 23 million letters. Our method outperforms BLAST for query lengths ranging from 40 to 10000, and the improvement is about 2 orders of magnitude for query lengths of 10000.

  • Gesture Spotting Using Dynamic Space-Time Warping:

    We have developed a novel subsequence matching method that can be used for identifying occurrences of gestures and signs in video databases and in streaming video. The key novelty of this method is that it does not require accurate hand detection. Instead of assuming that a hand detector module detects the hands correctly at each frame, we make the milder assumption that the hand detection outputs, at each frame, a set of candidate hand locations (e.g., 10 or 20 candidate locations). Feature vectors are extracted from each of those locations. Given a gesture model, our method identifies the sequence of candidate feature vectors (selecting one candidate from each frame) that best matches the model. Although the number of sequences of candidate feature vectors is exponential to the length of the video, our algorithm finds the best matching sequence in polynomial time, using dynamic programming.

  • The Move-Split-Merge (MSM) Metric:

    We have defined a novel metric measure for time series, called MSM (move-split-merge). This metric uses as building blocks three fundamental operations: Move, Split, and Merge, which can be applied in sequence to transform any time series into any other time series. A Move operation changes the value of a single element, a Split operation converts a single element into two consecutive elements, and a Merge operation merges two consecutive elements into one. Each operation has an associated cost, and the MSM distance between two time series is de?ned to be the cost of the cheapest sequence of operations that transforms the ?rst time series into the second one.

  • Indexing Subsequence Matching Searches under Consistent Metrics:

    We have defined a property of distance measures that we called "consistency", which is satisfied by a diverse set of metrics, such as the Euclidean distance, ERP, Levenshtein distance, Frechet distance. We have proposed an indexing method that can be applied on top of any consistent metric. An attractive feature of this method is that it can be applied on top of several different metrics, including metrics that may be proposed in the future, as long as they satisfy the consistency property. This is in contrast to several existing methods (including some of our own methods) which work only with specific distance measures.

  • Detecting Falls in Depth Videos:

    We have developed a computer vision method for detecting falls in videos captured using a depth camera (such as the popular Kinect camera from Microsoft). Detecting falls can be seen as a specific application within the subsequence matching framework, as fall detection involves identifying time series subsequences of interest (i.e., corresponding to falls) within a larger sequence (the entire video), and the start, end, and length of these subsequences of interest is not known a priori. The key distinguishing feature of our method is that it requires only a single camera, and that it does not require the training examples to be captured from the same viewpoint as the test sequences. For example, in our method, moving the camera or turning it to a somewhat different viewpoint does not require capturing new training data or recalibrating our system in any way. This is in contrast to several existing vision-based methods, where moving a camera oftentimes requires time-consuming recalibration of the system (which cannot even be performed by non-technical users) and oftentimes also requires capturing new training data from the new camera position and viewpoint.

Software and Datasets

Software partially or fully developed under the support of the NSF Awards IIS-0812601 and IIS-0812309:

Selected Publications

  • Alexios Kotsifakos, Vassilis Athitsos, and Panagiotis Papapetrou.
    Query-sensitive Distance Measure Selection for Time Series Nearest Neighbor Classification
    International Journal of Intelligent Data Analysis (IDA), 20(1), pages 5-27, January 2016.
    [Pre-print PDF 1.2MB]

  • Alexios Kotsifakos, Alexandra Stefan, Vassilis Athitsos, Gautam Das, and Panagiotis Papapetrou.
    DRESS: Dimensionality Reduction for Efficient Sequence Search
    Data Mining and Knowledge Discovery Journal (DAMI), 29(5), pages 1280-1311, September 2015.
    [Pre-print PDF 1.0MB]

  • Alexios Kotsifakos, Isak Karlsson, Panagiotis Papapetrou, Vassilis Athitsos, and Dimitrios Gunopulos.
    Embedding-based Subsequence Matching with Gaps-Range- Tolerances: a Query-By-Humming application
    Very Large Databases Journal (VLDBJ), 24(4), pages 519-536, August 2015.
    [Pre-print PDF 694KB]

  • Alexandra Stefan, Vassilis Athitsos, and Gautam Das.
    The Move-Split-Merge Metric for Time Series.
    IEEE Transactions on Knowledge and Data Engineering (TKDE), 25(6), pages 1425-1438, June 2013.
    [PDF 810KB]

  • Alexios Kotsifakos, Panagiotis Papapetrou, and Vassilis Athitsos.
    IBSM: Interval-Based Sequence Matching.
    SIAM International Conference on Data Mining (SDM), May 2013.
    [PDF 937KB]

  • Haohan Zhu, George Kollios, and Vassilis Athitsos.
    A Generic Framework for Efficient and Effective Subsequence Retrieval.
    Proceedings of the VLDB Endowment (PVLDB), 5(11), pages 1579-1590, July 2012.
    [PDF 1.4MB]

  • Zhong Zhang, Weihua Liu, Vangelis Metsis, and Vassilis Athitsos.
    A Viewpoint-Independent Statistical Method for Fall Detection.
    International Conference on Pattern Recognition (ICPR), November 2012.
    [PDF 411KB]

  • Alexios Kotsifakos, Panagiotis Papapetrou, Jaakko Hollmén, Dimitrios Gunopulos, Vassilis Athitsos, and George Kollios.
    Hum-a-song: A Subsequence Matching with Gaps-Range-Tolerances Query-By-Humming System.
    International Conference on Very Large Databases (VLDB) demo paper, August 2012.
    [PDF 688KB]

  • Panagiotis Papapetrou, Vassilis Athitsos, Michalis Potamias, George Kollios, and Dimitrios Gunopulos.
    Embedding-based Subsequence Matching in Time Series Databases.
    ACM Transactions on Database Systems (TODS), accepted March 2011, to appear.
    [PDF 593KB]

  • Alexios Kotsifakos, Vassilis Athitsos, Panagiotis Papapetrou, Jaakko Hollmén, and Dimitrios Gunopulos.
    Model-Based Search in Large Time Series Databases.
    Conference on Pervasive Technologies Related to Assistive Environments (PETRA), May 2011.
    [PDF 103KB]

  • Haijing Wang, Alexandra Stefan, Sajjad Moradi, Vassilis Athitsos, Carol Neidle, and Farhad Kamangar.
    A System for Large Vocabulary Sign Search.
    Workshop on Sign, Gesture and Activity (SGA), September 2010.
    [PDF 253KB]

  • Jonathan Alon, Vassilis Athitsos, Quan Yuan, and Stan Sclaroff.
    A Unified Framework for Gesture Recognition and Spatiotemporal Gesture Segmentation.
    IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 31(9), pages 1685-1699, September 2009.
    [PDF 405KB]

  • Panagiotis Papapetrou, Vassilis Athitsos, George Kollios, and Dimitrios Gunopulos.
    Reference-Based Alignment in Large Sequence Databases.
    International Conference on Very Large Data Bases (VLDB), August 2009.
    Proceedings of the VLDB Endowment (PVLDB), 2(1), pages 205-216.
    [PDF 402KB]

  • Vassilis Athitsos, Panagiotis Papapetrou, Michalis Potamias, George Kollios, and Dimitrios Gunopulos.
    Approximate Embedding-Based Subsequence Matching of Time Series.
    ACM International Conference on Management of Data (SIGMOD), pages 365-378, June 2008.
    [PDF 238KB]

Any opinions, findings, and conclusions or recommendations expressed here are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.