Marking time in sequence mining
Abstract
Sequence mining is often conducted over static and
temporal datasets as well as over collections of events
(episodes). More recently, there has also been a focus
on the mining of streaming data. However, while
many sequences are associated with absolute time values,
most sequence mining routines treat time in a
relative sense, returning only patterns that can be described
in terms of Allen-style relationships (or simpler).
In this work we investigate the accommodation
of timing marks within the sequence mining process.
The paper discusses the opportunities presented and
the problems that may be encountered, and presents a
novel algorithm, INTEMTM, that provides support
for timing marks. This enables sequences to be examined
not only with respect to the order and occurrence
of tokens but also in terms of pace. Algorithmic considerations
are discussed and an example is provided for
the case of polled sensor data.