Galaxy Project. https://galaxyproject.org/. Accessed 14 Feb 2025
Aichroth, P., Sieland, M., Cuccovillo, L., Köllmer, T.: The MICO broker: an orchestration framework for linked data extractors. In: Joint Proceedings of the 4th International Workshop on Linked Media and the 3rd Developers Hackshop Co-Located with the 13th Extended Semantic Web Conference ESWC 2016, vol. 1615 (2016)
McManus, B.: Investigation of Best Practice in Metadata for Sound, Moving Image & Audiovisual Collections. MPhil, Department of Information Studies, Aberystwyth University (2020)
Botticelli, P., Roe, B., Troia, L.: The American archive of public broadcasting: media access and preservation. In: Botticelli, P., Mahard, M.R., Cloonan, M.V. (eds.) Libraries, Archives, and Museums Today: Insights from the Field, pp. 39–47. Rowman & Littlefield (2019)
Dunn, J.W., et al.: Audiovisual metadata platform pilot development (AMPPD), final project report (2021)
Gallegos, I.O., et al.: Bias and fairness in large language models: a survey. Comput. Linguist. 50(3), 1097–1179 (2024). https://doi.org/10.1162/coli_a_00524
Greenberg, J.: The applicability of Natural Language Processing (NLP) to archival properties and objectives. Am. Archivist 61(2), 400–425 (1998). https://doi.org/10.17723/aarc.61.2.j3p8200745pj34v6
Haslhofer, B., Klas, W.: A survey of techniques for achieving metadata interoperability. ACM Comput. Surv. 42(2), 7:1–7:37 (2010). https://doi.org/10.1145/1667062.1667064
Heid, U., Schmid, H., Eckart, K., Hinrichs, E.: A corpus representation format for linguistic web services: the D-SPIN text corpus format and its relationship with ISO standards. In: Calzolari, N., et al. (eds.) Proceedings of the Seventh International Conference on Language Resources and Evaluation. European Language Resources Association (ELRA), Valletta, Malta (2010). https://aclanthology.org/L10-1348/
Hendrycks, D., et al.: Measuring massive multitask language understanding. In: International Conference on Learning Representations (2020). https://openreview.net/forum?id=d7KBjmI3GmQ
Jörgensen, C.: The MPEG-7 standard: multimedia description in theory and application. J. Am. Soc. Inform. Sci. Technol. 58(9), 1323–1328 (2007). https://doi.org/10.1002/asi.20571
Kroll, M., Kraus, K.: Optimizing the role of human evaluation in LLM-based spoken document summarization systems. In: Interspeech 2024, pp. 1935–1939 (2024). https://doi.org/10.21437/Interspeech.2024-2268
Lewis, S.C., Zamith, R., Hermida, A.: Content analysis in an era of big data: a hybrid approach to computational and manual methods. J. Broadcasting Electron. Media 57(1), 34–52 (2013). https://doi.org/10.1080/08838151.2012.761702
Liu, H., et al.: LLaVA-NeXT: improved reasoning, OCR, and world knowledge (2024). https://llava-vl.github.io/blog/2024-01-30-llava-next/
Liu, Z., Mao, H., Wu, C.Y., Feichtenhofer, C., Darrell, T., Xie, S.: A ConvNet for the 2020s. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11976–11986 (2022). https://openaccess.thecvf.com/content/CVPR2022/html/Liu_A_ConvNet_for_the_2020s_CVPR_2022_paper.html
Llama team: The Llama 3 Herd of Models (2024). https://ai.meta.com/research/publications/the-llama-3-herd-of-models/
Lynch, K., Jiang, B., Lambright, B., Rim, K., Pustejovsky, J.: Video content summarization with large language-vision models. In: 2024 IEEE International Conference on Big Data (BigData), pp. 2456–2463 (2024). https://doi.org/10.1109/BigData62323.2024.10825195
Heller, M.: Frameworks for analyzing the use of generative artificial intelligence in libraries. Comput. Libr. 44(10) (2024). https://www.infotoday.com/cilmag/dec24/Heller--Frameworks-for-Analyzing-the-Use-of-Generative-Artificial-Intelligence-in-Libraries.shtml
Meyer, M., Conroy, M.: See it, be it: what children are seeing on TV. Technical report (2022). https://geenadavisinstitute.org/research/see-jane-2022-tv-see-it-be-it-what-children-are-seeing-on-tv/
Mühling, M., et al.: VIVA: visual information retrieval in video archives. Int. J. Digit. Libr. 23(4), 319–333 (2022). https://doi.org/10.1007/s00799-022-00337-y
Nandzik, J., et al.: CONTENTUS-technologies for next generation multimedia libraries. Multimed. Tools Appl. 63(2), 287–329 (2013). https://doi.org/10.1007/s11042-011-0971-2
Oard, D.W., et al.: Cross-language access to recorded speech in the MALACH project. In: Sojka, P., Kopeček, I., Pala, K. (eds.) TSD 2002. LNCS (LNAI), vol. 2448, pp. 57–64. Springer, Heidelberg (2002). https://doi.org/10.1007/3-540-46154-X_8
Raemy, J.A., Fornaro, P., Rosenthaler, L.: Implementing a video framework based on IIIF: a customized approach from long-term preservation video formats to conversion on demand. In: Archiving Conference, vol. 14, pp. 68–73. Society for Imaging Science and Technology (2017). https://doi.org/10.2352/issn.2168-3204.2017.1.0.68
Reimers, N., Gurevych, I.: Sentence-BERT: sentence embeddings using Siamese BERT-networks. In: Inui, K., Jiang, J., Ng, V., Wan, X. (eds.) Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pp. 3982–3992. Association for Computational Linguistics, Hong Kong, China (2019). https://doi.org/10.18653/v1/D19-1410
Rim, K., Lynch, K., Pustejovsky, J.: Computational linguistics applications for multimedia services. In: Alex, B., Degaetano-Ortlieb, S., Kazantseva, A., Reiter, N., Szpakowicz, S. (eds.) Proceedings of the 3rd Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature, pp. 91–97. Association for Computational Linguistics, Minneapolis, USA (2019). https://doi.org/10.18653/v1/W19-2512
Rubin, N.: The PBCore metadata standard: a decade of evolution. J. Digit. Media Manag. 1(1), 55–68 (2012)
Sarlin, P.E., DeTone, D., Malisiewicz, T., Rabinovich, A.: SuperGlue: learning feature matching with graph neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4938–4947 (2020). https://openaccess.thecvf.com/content_CVPR_2020/html/Sarlin_SuperGlue_Learning_Feature_Matching_With_Graph_Neural_Networks_CVPR_2020_paper.html
Schmidt, T., et al.: An exchange format for multimodal annotations. In: International LREC Workshop on Multimodal Corpora, pp. 207–221 (2008). https://link.springer.com/chapter/10.1007/978-3-642-04793-0_13
Schweikert, A.: Audiovisual Algorithms: New Techniques for Digital Processing. Master of Arts, Moving Image Archiving and Preservation Program, New York University (2019)
Soucek, T., Lokoc, J.: TransNet V2: an effective deep network architecture for fast shot transition detection. In: Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, pp. 11218–11221. Association for Computing Machinery, New York (2024). https://doi.org/10.1145/3664647.3685517
Sultana, F., Sufian, A., Dutta, P.: Advancements in image classification using convolutional neural network. In: 2018 Fourth International Conference on Research in Computational Intelligence and Communication Networks (ICRCICN), pp. 122–129 (2018). https://doi.org/10.1109/ICRCICN.2018.8718718
Tiribelli, S., Pansoni, S., Frontoni, E., Giovanola, B.: Ethics of artificial intelligence for cultural heritage: opportunities and challenges. IEEE Trans. Technol. Soc. 5(3), 293–305 (2024). https://doi.org/10.1109/TTS.2024.3432407
Verhagen, M., et al.: The LAPPS interchange format. In: Murakami, Y., Lin, D. (eds.) WLSI 2015. LNCS (LNAI), vol. 9442, pp. 33–47. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-31468-6_3
Weller, A., Bleisteiner, W., Hufnagel, C., Iber, M.: The Future is Meta: Metadata, Formats and Perspectives towards Interactive and Personalized AV Content (2024). https://doi.org/10.48550/arXiv.2407.19590