TY - GEN
T1 - XM2A
T2 - 4th IEEE International Conference on Multimedia Information Processing and Retrieval, MIPR 2021
AU - Garg, Yash
AU - Candan, K. Selçuk
N1 - Publisher Copyright:
© 2021 IEEE.
PY - 2021
Y1 - 2021
N2 - Advances in sensory technologies are enabling the capture of a diverse spectrum of real-world data streams. The increasing availability of such data, especially in the form of multivariate time series, opens new opportunities for applications that rely on identifying and leveraging complex temporal patterns. A particular challenge such algorithms face is that complex patterns consist of multiple simpler patterns of varying scales (temporal lengths). While several recent works (such as multi-head attention networks) recognize that complex patterns need to be understood as compositions of multiple simpler patterns, we note that existing works lack the ability to represent the interactions across these constituent patterns. To tackle this limitation, in this paper, we propose a novel Multi-scale Multi-head Attention with Cross-Talk (XM2A) framework designed to represent the multi-scale patterns that make up a complex pattern, by configuring each attention head to learn a pattern at a particular scale and accounting for the co-existence of patterns at multiple scales through a cross-talk mechanism among the heads. Experiments show that XM2A outperforms state-of-the-art attention mechanisms, such as the Transformer and MSMSA, on benchmark datasets such as SADD, AUSLAN, and MOCAP.
AB - Advances in sensory technologies are enabling the capture of a diverse spectrum of real-world data streams. The increasing availability of such data, especially in the form of multivariate time series, opens new opportunities for applications that rely on identifying and leveraging complex temporal patterns. A particular challenge such algorithms face is that complex patterns consist of multiple simpler patterns of varying scales (temporal lengths). While several recent works (such as multi-head attention networks) recognize that complex patterns need to be understood as compositions of multiple simpler patterns, we note that existing works lack the ability to represent the interactions across these constituent patterns. To tackle this limitation, in this paper, we propose a novel Multi-scale Multi-head Attention with Cross-Talk (XM2A) framework designed to represent the multi-scale patterns that make up a complex pattern, by configuring each attention head to learn a pattern at a particular scale and accounting for the co-existence of patterns at multiple scales through a cross-talk mechanism among the heads. Experiments show that XM2A outperforms state-of-the-art attention mechanisms, such as the Transformer and MSMSA, on benchmark datasets such as SADD, AUSLAN, and MOCAP.
KW - Information descriptors
KW - Multi-head attention
KW - Multi-scale features
KW - Transformer
UR - http://www.scopus.com/inward/record.url?scp=85126211878&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85126211878&partnerID=8YFLogxK
U2 - 10.1109/MIPR51284.2021.00030
DO - 10.1109/MIPR51284.2021.00030
M3 - Conference contribution
AN - SCOPUS:85126211878
T3 - Proceedings - 4th International Conference on Multimedia Information Processing and Retrieval, MIPR 2021
SP - 151
EP - 157
BT - Proceedings - 4th International Conference on Multimedia Information Processing and Retrieval, MIPR 2021
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 8 September 2021 through 10 September 2021
ER -