Result detail

Towards Efficient Scheduling of Transformer Neural Network Computation for Edge AI Deployment

SEDLÁK, D.; KLHŮFEK, J.; MRÁZEK, V.; VAŠÍČEK, Z. Towards Efficient Scheduling of Transformer Neural Network Computation for Edge AI Deployment. Proceedings of the Genetic and Evolutionary Computation Conference Companion. Malaga: Association for Computing Machinery, 2025. p. 2242-2248. ISBN: 979-8-4007-1464-1.
Type
conference proceedings paper
Language
English
Authors
David Sedlák, Jan Klhůfek, Vojtěch Mrázek, Zdeněk Vašíček
Abstract

Transformer neural networks have gained popularity in recent years, demonstrating
remarkable performance across many application domains. However, inference on
resource-constrained embedded hardware remains challenging due to Transformers'
substantial computational demands. We address this problem by exploiting the
inherent parallelism of the multi-head self-attention operations of Transformers
to speed up processing on embedded hardware. In this paper, we present an
evolutionary-based scheduling
approach for distribution and allocation of Transformer operations across
systolic array-based hardware accelerators used for execution. Our methodology
takes as input specifications of the Transformer workload and the target systolic
array architecture and explores the large mapping space to identify an efficient
plan of operation-to-array assignments. The plans are evaluated against
a hardware-aware cost model, capturing the cost of computational cycles for
a given operation and systolic array, with the objective of minimizing the total
cycle count summed across all operations. Through extensive experimental evaluations across
diverse systolic array dimensions, we demonstrate that our evolutionary-based
scheduler surpasses conventional heuristics and finds plans offering up
to a 33.8% average reduction in overall cycle count.
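
The search loop described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the workload shapes in OPS, the array dimensions in ARRAYS, the toy tile-based cycle estimate in cycles(), and the choice of a (1 + lambda) evolution strategy with point mutation. Only the overall shape follows the abstract: a plan assigns each operation to an array, and plans are scored by the cycle cost summed over all operations.

```python
import math
import random

# Hypothetical workload: each operation is a matrix multiply of shape (M, K, N).
OPS = [(64, 64, 64), (128, 64, 32), (32, 256, 64), (64, 32, 128)]
# Hypothetical systolic arrays, given as (rows, cols).
ARRAYS = [(8, 8), (16, 4), (4, 32)]


def cycles(op, array):
    """Toy cycle estimate (an assumption, not the paper's cost model)."""
    m, k, n = op
    rows, cols = array
    tiles = math.ceil(m / rows) * math.ceil(n / cols)
    # Per output tile: stream K values, plus a fill/drain term for the array.
    return tiles * (k + rows + cols)


def plan_cost(plan):
    """Total cycles summed over all operations; plan[i] is an index into ARRAYS."""
    return sum(cycles(op, ARRAYS[a]) for op, a in zip(OPS, plan))


def evolve(generations=200, offspring=4, seed=0):
    """(1 + lambda) evolution strategy over operation-to-array assignments."""
    rng = random.Random(seed)
    parent = [rng.randrange(len(ARRAYS)) for _ in OPS]
    for _ in range(generations):
        for _ in range(offspring):
            child = list(parent)
            # Point mutation: reassign one randomly chosen operation.
            child[rng.randrange(len(OPS))] = rng.randrange(len(ARRAYS))
            # Accept the child if it is no worse than the current parent.
            if plan_cost(child) <= plan_cost(parent):
                parent = child
    return parent


best = evolve()
print(best, plan_cost(best))
```

Under this simplified, fully separable objective, choosing the cheapest array for each operation independently is already optimal, so the sketch only illustrates the encoding and the search loop. It does not reproduce the comparison against conventional heuristics reported in the abstract.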

Keywords

transformer networks, edge AI, evolutionary algorithms

Year
2025
Pages
2242–2248
Proceedings
Proceedings of the Genetic and Evolutionary Computation Conference Companion
Conference
Genetic and Evolutionary Computation Conference 2025 (Companion)
ISBN
979-8-4007-1464-1
Publisher
Association for Computing Machinery
Place
Malaga
DOI
10.1145/3712255.3734345
BibTeX
@inproceedings{BUT197537,
  author="David {Sedlák} and Jan {Klhůfek} and Vojtěch {Mrázek} and Zdeněk {Vašíček}",
  title="Towards Efficient Scheduling of Transformer Neural Network Computation for Edge AI Deployment",
  booktitle="Proceedings of the Genetic and Evolutionary Computation Conference Companion",
  year="2025",
  pages="2242--2248",
  publisher="Association for Computing Machinery",
  address="Malaga",
  doi="10.1145/3712255.3734345",
  isbn="979-8-4007-1464-1"
}
Projects
Application-specific HW/SW architectures and their applications, VUT, VUT internal projects, FIT-S-23-8141, start: 2023-03-01, end: 2026-02-28, in progress
LEDNeCo: Low Energy Deep Neurocomputing, GAČR, Standard projects, GA25-15490S, start: 2025-01-01, end: 2027-12-31, in progress