Research Datasets

Eye Disease Recognition

Cross-Lingual Image Captioning

  • TMM’19 COCO-CN: A bilingual image description dataset enriching MS-COCO with manually written Chinese sentences and tags
    Xirong Li, Chaoxi Xu, Xiaoxu Wang, Weiyu Lan, Zhengxiong Jia, Gang Yang, Jieping Xu: COCO-CN for Cross-Lingual Image Tagging, Captioning and Retrieval. In: IEEE Transactions on Multimedia, vol. 21, no. 9, pp. 2347–2360, 2019.
  • ACMMM’17 Flickr30k-CN: A bilingual extension of the popular Flickr30k dataset, used for evaluating image captioning in a cross-lingual setting  
    Weiyu Lan, Xirong Li, Jianfeng Dong: Fluency-Guided Cross-Lingual Image Captioning. In: ACM Multimedia, 2017.
  • ICMR’16 Flickr8k-CN: A bilingual extension of the popular Flickr8k dataset, used for evaluating image captioning in a cross-lingual setting  
    Xirong Li, Weiyu Lan, Jianfeng Dong, Hailong Liu: Adding Chinese Captions to Images. In: Proceedings of the 2016 ACM on International Conference on Multimedia Retrieval (ICMR), pp. 271–275, 2016.

Image Retrieval

Video Analysis

  • ACMMM’23 ChinaOpen: A video dataset for open-world multimodal learning
    Aozhu Chen, Ziyuan Wang, Chengbo Dong, Kaibin Tian, Ruixiang Zhao, Xun Liang, Zhanhui Kang, Xirong Li: ChinaOpen: A Dataset for Open-world Multimodal Learning. In: ACM Multimedia, 2023.
  • ACMMM’16 mm2016vsd: Datasets for video violence detection using subclasses and multi-modal features
    Xirong Li, Yujia Huo, Qin Jin, Jieping Xu: Detecting Violence in Video using Subclasses. In: ACM Multimedia, 2016.