Thesis Details
Web page segmentation utilizing clustering techniques
Information extraction and other techniques for mining data from the Web are becoming more important as web technologies develop and the amount of information stored exclusively on the Web grows. However, along with this information, the amount of content that is completely irrelevant to the presented information grows as well. This is only one of the reasons why it is so important to intensively study and develop methods for preprocessing information stored on the Web. Segmentation algorithms are one possible approach to web page preprocessing. This thesis is dedicated to the use of clustering techniques to improve the efficiency of existing web page segmentation algorithms, as well as to find completely new ones.
web page preprocessing, document preprocessing, segmentation, clustering, template, VIPS
@phdthesis{FITPT741,
   author = "Jan Zelen\'{y}",
   type = "Ph.D. thesis",
   title = "Web page segmentation utilizing clustering techniques",
   school = "Brno University of Technology, Faculty of Information Technology",
   year = 2017,
   location = "Brno, CZ",
   language = "english",
   url = "https://www.fit.vut.cz/study/phd-thesis/741/"
}