DMSL: The Data Mining Specification Language
Our capacity to generate and store data has grown rapidly in recent years. Storing terabytes of data is no longer a problem. The problem is distilling these huge amounts of relatively primitive information into human-understandable forms: patterns and knowledge. We cannot perform this task unaided, because the volume of data is simply too large for a human to process. Fortunately, the field of knowledge discovery in databases (KDD) offers a solution: it aims at the automated and intelligent extraction of patterns representing implicit knowledge encoded in massive data repositories (databases, data warehouses, the WWW, etc.).
Data preparation is probably the most crucial step in the whole KDD process. Surprisingly, it receives little attention in the data mining community, and this thesis tries to fill that gap. We introduce a theoretical framework for the data preparation step of the KDD process and present an XML vocabulary, the Data Mining Specification Language (DMSL), that is centered on this framework. The wider purpose of DMSL is to support platform-independent definition of the whole KDD process, as well as its exchange and sharing among different applications, possibly operating in heterogeneous environments.
knowledge discovery in databases, data mining, data preprocessing, DMSL
@misc{BUT66696,
author="Petr {Kotásek}",
title="DMSL: The Data Mining Specification Language",
year="2003",
pages="179",
publisher="Faculty of Information Technology BUT",
address="Brno",
isbn="80-214-2685-3",
url="http://www.fit.vutbr.cz/~zendulka/theses/pkotasek.pdf"
}