Publication Detail

Brno University of Technology at TRECVid 2010 SIN, CCD

HRADIŠ Michal, BERAN Vítězslav, ŘEZNÍČEK Ivo, HEROUT Adam, BAŘINA David, VLČEK Adam and ZEMČÍK Pavel. Brno University of Technology at TRECVid 2010 SIN, CCD. In: 2010 TREC Video Retrieval Evaluation Notebook Papers. Gaithersburg, MD: National Institute of Standards and Technology, 2010, pp. 1-10.
Title in Czech
Brno University of Technology at TRECVid 2010
Type
conference paper
Language
English
Abstract

This paper describes our approaches to semantic indexing and content-based copy detection, which were used in the TRECVID 2010 evaluation.

Semantic indexing

1.  The runs differ in the types of visual features used. All runs use several bag-of-words representations, each fed to a separate linear SVM, and the SVM outputs are fused by logistic regression.

  • F_A_Brno_resource_4: Only the single best-performing visual feature (selected on the training set) is used: dense image sampling with rgb-SIFT.
  • F_A_Brno_basic_3: This run uses dense sampling and the Harris-Laplace detector, each in combination with SIFT and rgb-SIFT descriptors.
  • F_A_Brno_color_2: This run extends F_A_Brno_basic_3 by adding dense sampling with rg-SIFT, Opponent-SIFT, Hue-SIFT, HSV-SIFT, C-SIFT, and opponent histogram descriptors.
  • F_A_Brno_spacetime_1: This run extends F_A_Brno_color_2 by adding the space-time visual features STIP and HESSTIP.

2. Combining multiple types of visual features improves results significantly: F_A_Brno_color_2 achieves more than twice the performance of F_A_Brno_resource_4. The space-time visual features did not improve results.

3. We learned that combining multiple types of visual features is important, and that linear SVMs are inferior to non-linear SVMs in the context of semantic indexing.
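The fusion scheme described in point 1 can be sketched as a two-stage classifier: one linear SVM per bag-of-words feature type, with the per-feature decision scores combined by logistic regression. This is a minimal illustrative sketch, not the authors' implementation; the feature names, dimensions, and data below are synthetic placeholders.

```python
# Hypothetical sketch of late fusion: per-feature linear SVMs whose decision
# scores are fused by logistic regression. All data here is random placeholder.
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_train, n_val = 200, 100
feature_dims = {"dense_rgb_sift": 64, "harris_sift": 64}  # placeholder BoW sizes

# Synthetic bag-of-words histograms and binary concept labels.
X_train = {f: rng.random((n_train, d)) for f, d in feature_dims.items()}
X_val = {f: rng.random((n_val, d)) for f, d in feature_dims.items()}
y_train = rng.integers(0, 2, n_train)

# Stage 1: train a separate linear SVM for each feature type.
svms = {f: LinearSVC().fit(X_train[f], y_train) for f in feature_dims}

def scores(X):
    # Stack per-feature SVM decision scores into one fusion input vector.
    return np.column_stack([svms[f].decision_function(X[f]) for f in feature_dims])

# Stage 2: fuse the SVM scores with logistic regression.
fuser = LogisticRegression().fit(scores(X_train), y_train)
probs = fuser.predict_proba(scores(X_val))[:, 1]  # fused concept scores in [0, 1]
```

A practical advantage of this design is that each feature-specific SVM can be trained independently (and cheaply, being linear), while the logistic-regression fuser learns how much to trust each feature type.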

Content-based Copy Detection

1.    Two runs were submitted with similar settings; they differ only in the amount of test data processed (40% and 60%).

  • brno.m.*.l3sl2: SURF features, bag-of-words (visual codebook of 2k words, soft-assignment to the 4 nearest neighbors), inverted file index, and a geometry-based (homography) image similarity metric

2.    What if any significant differences (in terms of what measures) did you find among the runs?

  • Only one setting was used, so there are no differences to report.

3.    Based on the results, can you estimate the relative contribution of each component of your system/approach to its effectiveness?

  • Search in the reference dataset was slow due to an unsuitable configuration of the visual codebook.

4.    Overall, what did you learn about runs/approaches and the research question(s) that motivated them?

  • The way video content is described needs to change: a frame-based (or key-frame-based) approach is not sufficient.
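The soft-assignment bag-of-words step used in the copy-detection run can be illustrated as follows: each local descriptor votes for its 4 nearest codewords, with votes weighted by inverse distance. This is a hedged sketch under assumed details (the weighting scheme and the random codebook/descriptor data are placeholders; the run used a 2k-word codebook over SURF descriptors).

```python
# Hypothetical sketch of soft-assignment bag-of-words: each descriptor votes
# for its k nearest visual words, weighted by inverse distance. Codebook and
# descriptors are random placeholders standing in for SURF data.
import numpy as np
from scipy.spatial.distance import cdist

rng = np.random.default_rng(0)
codebook = rng.random((2000, 64))    # 2k visual words, SURF-like 64-D vectors
descriptors = rng.random((500, 64))  # local descriptors from one frame

def soft_assign_bow(desc, words, k=4, eps=1e-8):
    d = cdist(desc, words)                          # Euclidean distance matrix
    nearest = np.argsort(d, axis=1)[:, :k]          # k nearest words per descriptor
    w = 1.0 / (np.take_along_axis(d, nearest, axis=1) + eps)
    w /= w.sum(axis=1, keepdims=True)               # normalize each descriptor's votes
    hist = np.zeros(words.shape[0])
    np.add.at(hist, nearest.ravel(), w.ravel())     # accumulate soft votes
    return hist / hist.sum()                        # L1-normalized BoW histogram

hist = soft_assign_bow(descriptors, codebook)
```

Compared with hard assignment, spreading each descriptor's vote over several nearby codewords makes the histogram less sensitive to quantization noise, at the cost of a denser representation and slower inverted-index lookups, which is consistent with the codebook-configuration slowdown noted in point 3.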
Year
2010
Pages
1-10
Proceedings
2010 TREC Video Retrieval Evaluation Notebook Papers
Conference
2010 TRECVID Workshop, Gaithersburg, US
Publisher
National Institute of Standards and Technology
Place
Gaithersburg, MD, US
BibTeX
@INPROCEEDINGS{FITPUB9444,
   author = "Michal Hradi\v{s} and V\'{i}t\v{e}zslav Beran and Ivo \v{R}ezn\'{i}\v{c}ek and Adam Herout and David Ba\v{r}ina and Adam Vl\v{c}ek and Pavel Zem\v{c}\'{i}k",
   title = "Brno University of Technology at TRECVid 2010 SIN, CCD",
   pages = "1--10",
   booktitle = "2010 TREC Video Retrieval Evaluation Notebook Papers",
   year = 2010,
   location = "Gaithersburg, MD, US",
   publisher = "National Institute of Standards and Technology",
   language = "english",
   url = "https://www.fit.vut.cz/research/publication/9444"
}