Result Details

A Multi-Dimensional DNS Domain Intelligence Dataset for Cybersecurity Research

HRANICKÝ, R.; ONDRYÁŠ, O.; HORÁK, A.; POUČ, P.; JEŘÁBEK, K.; EBERT, T.; POLIŠENSKÝ, J. A Multi-Dimensional DNS Domain Intelligence Dataset for Cybersecurity Research. Data in Brief, 2026, vol. 62, no. October, p. 1-13.
Type
journal article
Language
English
Authors
Hranický Radek, Ing., Ph.D., DIFS (FIT)
Ondryáš Ondřej, Ing., DIFS (FIT)
Horák Adam, Ing.
Pouč Petr, Ing.
Jeřábek Kamil, Ing., Ph.D., DIFS (FIT)
Ebert Tomáš, Bc.
Polišenský Jan, Ing., DIFS (FIT)
Abstract

The escalating sophistication and frequency of cyber threats call for advanced solutions in cybersecurity research. Phishing and malware detection in particular have become increasingly reliant on data-driven approaches. This paper presents a dataset carefully curated to support research in network security, focusing on the classification and analysis of internet domains. The dataset contains information on over a million internet domains, each labeled as phishing, malware, or benign. It is distinctive for its comprehensive compilation of metainformation derived from multiple sources, including DNS records, TLS handshakes and certificates, WHOIS and RDAP services, IP-related data, and geolocation details. Such multi-dimensional data allows for a deeper analysis and understanding of the domain characteristics that are critical in identifying and categorizing cyber threats. Integrating information from diverse sources provides a holistic view of each domain's footprint and its potential security implications. The data is formatted in JSON, which makes it accessible to researchers and easy to integrate into tools and platforms for statistical analysis, machine learning, and other computational analyses. The dataset's volume and variety surpass those of any known publicly available resource in this field, making it a valuable asset for both academic research and the practical development and testing of cybersecurity solutions. This paper describes the value of the data, details the methodology employed in the collection process, and provides a clear description of the data structure. Such documentation is crucial for ensuring that the dataset can be effectively used and reused in a variety of research contexts. Its structured format and broad range of included features support the development of robust cybersecurity solutions and can be adapted for emerging threats.
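
As a rough illustration of how JSON-formatted, labeled domain records like those described in the abstract might be consumed, the following minimal Python sketch counts labels and derives two simple features. The field names (`domain_name`, `label`, `dns`, `tls`) and the sample records are hypothetical assumptions made for this example; they are not the dataset's documented schema, which is described in the paper itself.

```python
import json
from collections import Counter

# Hypothetical records: field names and values are illustrative assumptions,
# not the schema or contents of the published dataset.
SAMPLE_JSON = """
[
  {"domain_name": "example.com", "label": "benign",
   "dns": {"A": ["93.184.216.34"]}, "tls": {"protocol": "TLSv1.3"}},
  {"domain_name": "login-example.test", "label": "phishing",
   "dns": {"A": ["192.0.2.10"]}, "tls": null},
  {"domain_name": "files-example.test", "label": "malware",
   "dns": {"A": []}, "tls": null}
]
"""


def load_records(text):
    """Parse a JSON array of domain records into a list of dicts."""
    return json.loads(text)


def count_labels(records):
    """Count how many records carry each label."""
    return Counter(record["label"] for record in records)


def to_features(record):
    """Derive two simple features from one (hypothetical) record."""
    dns = record.get("dns") or {}
    return {
        "num_a_records": len(dns.get("A", [])),
        "has_tls": record.get("tls") is not None,
    }


records = load_records(SAMPLE_JSON)
label_counts = count_labels(records)
features = [to_features(record) for record in records]
```

A feature list of this shape could then be passed to a statistical or machine learning library of choice; the real records would need to be mapped from the field names documented in the paper.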

Keywords

Domain; DNS; TLS; WHOIS; RDAP; IP; Geolocation; Malware; Phishing

URL
https://www.sciencedirect.com/science/article/pii/S235234092500784X
Published
2026
Pages
13
Journal
Data in Brief, vol. 62, no. October, ISSN 2352-3409
DOI
10.1016/j.dib.2025.112062
UT WoS
001580758100003
BibTeX
@article{BUT194220,
  author="Radek {Hranický} and Ondřej {Ondryáš} and Adam {Horák} and Petr {Pouč} and Kamil {Jeřábek} and Tomáš {Ebert} and Jan {Polišenský}",
  title="A Multi-Dimensional DNS Domain Intelligence Dataset for Cybersecurity Research",
  journal="Data in Brief",
  year="2026",
  volume="62",
  number="October",
  pages="1--13",
  doi="10.1016/j.dib.2025.112062",
  issn="2352-3409",
  url="https://www.sciencedirect.com/science/article/pii/S235234092500784X"
}
Projects
Smart Information Technology for a Resilient Society, BUT, BUT Internal Projects, FIT-S-23-8209, start: 2023-03-01, end: 2026-02-28, running
Flow-based Encrypted Traffic Analysis, MV, Strategic Support for the Development of Security Research in the Czech Republic 2019–2025 (IMPAKT 1), Subprogramme 1: Joint Research Projects (BV IMP1/2VS), VJ02010024, start: 2022-01-01, end: 2025-06-30, completed