High-frequency financial data (order books, trade events, tick-by-tick prices) poses distinctive modeling challenges: extreme noise, irregular sampling, heavy-tailed return distributions, and volatility clustering at multiple timescales. Deep learning, and large Transformer-based architectures in particular, has emerged as a prominent approach to capturing the complex temporal dynamics of financial markets at this granularity.