The project
The Metadata Quality Assessment Framework is a research project that investigates how to decide algorithmically whether a metadata record in a cultural heritage database is “good” or “bad”. The project develops a general framework that enables metadata repositories and digital libraries (such as Europeana, TextGrid or the Digital Public Library of America) to run a range of measurements on their collections and get suggestions on where they should improve the quality of their metadata. Multiple extensions are being developed along with the core framework to work with specific metadata schemas or data sources.
These pages document the research process: my findings, results, code and talks.
The author
Péter Király is a software developer and researcher at GWDG, the data, compute and research centre of the Max Planck Society and the University of Göttingen. He received his PhD (summa cum laude) from the University of Göttingen in 2019 as the output of this research. His main research interests are quality assessment of cultural heritage metadata and cultural analytics, that is, the analysis of these metadata as a historical source. He is also interested in publishing and searching large text corpora on the web, and in new forms of web presence for cultural heritage (archival, library and museum materials, with a special focus on semantic web technologies). He is an editor of Code4Lib Journal, co-chair of the LIBER Data Science in Libraries working group, a member of various library and digital humanities groups, and a maker and supporter of open source and open data projects. He collaborates with the British Library, the Belgian National Library, Gemeinsamer Bibliotheksverbund (a German library consortium), Europeana, Deutsche Digitale Bibliothek, meemoo (the Flemish Institute for Archives), Ghent University Library, the Victoria and Albert Museum and other cultural heritage organisations.
Important links
The first version of the web interface: http://144.76.218.178/europeana-qa
Source code
- Metadata Quality Assessment Framework API:
  - Source repository: https://github.com/pkiraly/metadata-qa-api
  - Maven artifact: http://mvnrepository.com/artifact/de.gwdg.metadataqa/metadata-qa-api
- MARC measurement
- The Europeana-specific Europeana QA API:
  - Source repository: https://github.com/pkiraly/europeana-qa-api
  - Maven artifact: http://mvnrepository.com/artifact/de.gwdg.metadataqa/europeana-qa-api
- Harvester client: https://github.com/pkiraly/europeana-oai-pmh-client
- Measurement with Spark: https://github.com/pkiraly/europeana-qa-spark
- Analysis with R: https://github.com/pkiraly/europeana-qa-r
- Web interface: https://github.com/pkiraly/europeana-qa-web
- REST and command line interface: https://github.com/pkiraly/europeana-qa-client
- Solr connector: https://github.com/pkiraly/europeana-qa-solr
- Cassandra connector: https://github.com/pkiraly/europeana-qa-cassandra
- Measurement with Hadoop: https://github.com/pkiraly/europeana-qa
Publications
2015
Péter Király. “Metadata quality assurance framework”. Unpublished doctoral research plan (2015) pkiraly.github.io
cited by:
- Vivien Petras, and Juliane Stiller. “A decade of evaluating europeana-constructs, contexts, methods & criteria.” In International Conference on Theory and Practice of Digital Libraries, pp. 233-245. Springer, Cham, 2017. DOI: 10.1007/978-3-319-67008-9_19
- Marcin Roszkowski. “Diagnostyka metadanych w kolekcjach cyfrowych.” Diagnostyka w zarządzaniu informacją: perspektywa informatologiczna (2017): pp. 365-390. researchgate.net
- Oksana L. Zavalina, Shadi Shakeri, Priya Kizhakkethil, and Mark E. Phillips. “Uncovering Hidden Insights for Information Management: Examination and Modeling of Change in Digital Collection Metadata.” In International Conference on Information, pp. 645-651. Springer, Cham, 2018. DOI: 10.1007/978-3-319-78105-1_74
- Gabriel Olufemi Isheyemi. “Metadata integration tool for Open Educational Resources”. Master of Science Dissertation. Federal State Autonomous Educational Institution for Higher Professional Education National Research University, Higher School Of Economics, Faculty of Computer Science. Moscow 2018. researchgate.net
- Branka Badovinac. “Merjenje kakovosti podatkov v bibliografskih in normativnih zapisih: študija primera izbranih podatkovnih elementov za fasetno omejevanje in izpis seznama zadetkov v COBISS+.” Organizacija Znanja 24, no. 1/2 (2019): pp. 1-20. cobiss.si
- Mark Edward Phillips, Oksana L. Zavalina, and Hannah Tarver. “Exploring the utility of metadata record graphs and network analysis for metadata quality evaluation and augmentation.” International Journal of Metadata, Semantics and Ontologies 14, no. 2 (2020): 112-123. DOI: 10.1504/IJMSO.2020.108326
- Rachel Jaffe. “Rethinking Metadata’s Value and How It Is Evaluated.” Technical Services Quarterly 37, no. 4 (2020): 432-443. DOI: 10.1080/07317131.2020.1810443
- August Wierling, Valeria Jana Schwanitz, Sebnem Altinci, Maria Bałazinska, Michael J. Barber, Mehmet Efe Biresselioglu, Christopher Burger-Scheidlin, Massimo Celino, Muhittin Hakan Demir, Richard Dennis, Nicolas Dintzner, Adel el Gammal, Carlos M. Fernández-Peruchena, Winston Gilcrease, Paweł Gładysz, Carsten Hoyer-Klick, Kevin Joshi, Mariusz Kruczek, David Lacroix, Małgorzata Markowska, Rafael Mayo-García, Robbie Morrison, Manfred Paier, Giuseppe Peronato, Mahendranath Ramakrishnan, Janeita Reid, Alessandro Sciullo, Berfu Solak, Demet Suna, Wolfgang Süß, Astrid Unger, Maria Luisa Fernandez Vanoni and Nikola Vasiljevic. “FAIR Metadata Standards for Low Carbon Energy Research—A Review of Practices and How to Advance.” Energies (2021) 14, no. 20, 6692. DOI: 10.3390/en14206692
- Matteo Lorenzini, Marco Rospocher, Sara Tonelli. “On assessing metadata completeness in digital cultural heritage repositories.” Digital Scholarship in the Humanities 36, Supplement_2, (2021) pp. ii182–ii188. DOI: 10.1093/llc/fqab036
- Jhon Francined Herrera Cubides. “Metamodelo para vinculación de recursos educativos abiertos mediante especificaciones LOD y basado en principios de confianza.” Doctoral thesis. Universidad Distrital “Francisco José de Caldas” Facultad de Ingeniería. Bogotá, Colombia, 2021. udistrital.edu.co, proquest
- Lisandra Díaz de la Paz, Francisco N. Riestra Collado, Juan L. García Mendoza, Luisa M. González González, Amed A. Leiva Mederos, and Alberto Taboada Crispi. “Weights Estimation in the Completeness Measurement of Bibliographic Metadata.” Computación y Sistemas 25, no. 1. 2021. pp. 47–65. DOI: 10.13053/CyS-25-1-3355
- Jhon Francined Herrera-Cubides, Paulo Alonso Gaona-García, Carlos Enrique Montenegro-Marin, Salvador Sánchez-Alonso. “The Relevance of Open Data Principles for the Web of Data.” Journal of Electrical and Computer Engineering 2023. Article ID 4854965, pp. 1-17. DOI: 10.1155/2023/4854965
2017
Juliane Stiller, and Péter Király. “Multilinguality of Metadata: Measuring the Multilingual Degree of Europeana’s Metadata.” In M. Gäde, V. Trkulja, V. Petras (Eds.): Everything Changes, Everything Stays the Same? Understanding Information Spaces. Proceedings of the 15th International Symposium of Information Science (ISI 2017), Berlin, 13th–15th March 2017. Glückstadt: Verlag Werner Hülsbusch, pp. 164–176. URL (whole book): http://isi2017.ib.hu-berlin.de/ISI_17_ONLINE_FINAL.pdf (this paper): researchgate.net
cited by:
- Sarah Fallert. “Multilinguale Herausforderungen in der Sacherschließung.” Master’s thesis, Humboldt-Universität zu Berlin, 2020. edoc.hu-berlin.de
- August Wierling, Valeria Jana Schwanitz, Sebnem Altinci, Maria Bałazinska, Michael J. Barber, Mehmet Efe Biresselioglu, Christopher Burger-Scheidlin, Massimo Celino, Muhittin Hakan Demir, Richard Dennis, Nicolas Dintzner, Adel el Gammal, Carlos M. Fernández-Peruchena, Winston Gilcrease, Paweł Gładysz, Carsten Hoyer-Klick, Kevin Joshi, Mariusz Kruczek, David Lacroix, Małgorzata Markowska, Rafael Mayo-García, Robbie Morrison, Manfred Paier, Giuseppe Peronato, Mahendranath Ramakrishnan, Janeita Reid, Alessandro Sciullo, Berfu Solak, Demet Suna, Wolfgang Süß, Astrid Unger, Maria Luisa Fernandez Vanoni and Nikola Vasiljevic. “FAIR Metadata Standards for Low Carbon Energy Research—A Review of Practices and How to Advance.” Energies 2021, 14, 6692. DOI: 10.3390/en14206692
Péter Király. “Towards an extensible measurement of metadata quality.” In Second International Conference on Digital Access to Textual Cultural Heritage. Conference Proceedings. Göttingen, June 1-2, 2017. Published by ACM 2017. ISBN 978-1-4503-5265-9. pp. 111-115. DOI: 10.1145/3078081.3078109
URL: http://dl.acm.org/citation.cfm?doid=3078081.3078109
cited by:
- Gustavo Candela, Pilar Escobar, María Dolores Sáez and Manuel Marco-Such. “A Shape Expression approach for assessing the quality of Linked Open Data in Libraries.” Semantic Web 14, no. 2 (2023): 159-179. DOI: 10.3233/SW-210441
- Widad Elouataoui, Imane El Alaoui, and Youssef Gahi. “Metadata Quality in the Era of Big Data and Unstructured Content.” In The International Conference on Information, Communication & Cybersecurity, pp. 110-121. Springer, Cham, 2021. DOI: 10.1007/978-3-030-91738-8_11
- Hannah Tarver, Mark Edward Phillips, and Ana Krahmer. “EPIC: an iterative model for metadata improvement.” International Journal of Metadata, Semantics and Ontologies 15, no. 4 (2021): 244-253. DOI: 10.1504/IJMSO.2021.125885
- Volodymyr A. Shekhovtsov, Johann Eder. “Metadata Quality for Biobanks”. Applied Sciences 2022, 12, 9578. pp. 1–37. DOI: 10.3390/app12199578
- Gustavo Candela. “Towards a semantic approach in GLAM Labs: the case of the Data Foundry at the National Library of Scotland.” arXiv preprint (2023) arXiv:2301.11182.
- Bruno Zolotareff dos Santos, Sandra Santos Vales, and Jorge Rady Almeira Junior. “A Metrics-Based Approach to Metadata Classification Applied in a Recommendation System.” Preprint. (2023). DOI: 10.21203/rs.3.rs-3132032
- Luís Fernando Sayão, Luana Farias Sales. “Metadata authoring model: describing information about context and provenance of disciplinary research objects.” Informação & Informação 2024 28(4) pp. 1-37. DOI: 10.5433/1981-8920.2023v28n4p1
Péter Király. “Measuring completeness as metadata quality metric in Europeana.” In Digital Humanities 2017. Conference Abstracts. McGill University & Université de Montréal, Montréal, Canada, August 8–11, 2017. Prepared by Rhian Lewis and the DH2017 Local Organizers: Cecily Raynor, Dominic Forest, Michael Sinatra and Stéfan Sinclair. pp. 291-293. URL (whole book): https://dh2017.adho.org/abstracts/DH2017-abstracts.pdf, URL (the abstract): https://dh2017.adho.org/abstracts/458/458.pdf.
2018
Valentine Charles, Juliane Stiller, Péter Király, Werner Bailer, and Nuno Freire. “Data Quality Assessment in Europeana: Metrics for Multilinguality.” In Joint Proceedings of the 1st Workshop on Temporal Dynamics in Digital Libraries (TDDL 2017), the (Meta)-Data Quality Workshop (MDQual 2017) and the Workshop on Modeling Societal Future (Futurity 2017) (TDDL MDQual Futurity 2017) co-located with 21st International Conference on Theory and Practice of Digital Libraries (TPDL 2017) (Grand Hotel Palace, Thessaloniki, Greece, 21 September 2017), edited by A. Caputo, N. Kanhabua, P. Basile, S. Lawless, D. Gavrilis, Ch. Papatheodorou, D. Trandabat. (CEUR Workshop Proceedings Volume 2038. ISSN 1613-0073.), Published by CEUR, 2018. http://ceur-ws.org/Vol-2038/paper6.pdf.
cited by:
- Matteo Lorenzini, Marco Rospocher, and Sara Tonelli. “Proposta per una valutazione automatica della completeness dei metadati nel contesto delle biblioteche digitali.” DigItalia 2. (2020). pp. 159-167. DOI: 10.36181/digitalia-00023
- Subhi Issa, Onaopepo Adekunle, Fayçal Hamdi, Samira Si-Said Cherfi, Michel Dumontier, and Amrapali Zaveri. “Knowledge Graph Completeness: A Systematic Literature Review.” IEEE Access 9. (2021). pp. 31322-31339. DOI: 10.1109/ACCESS.2021.3056622
2019
Péter Király and Marco Büchler. “Measuring completeness as metadata quality metric in Europeana.” In 2018 IEEE International Conference on Big Data (Big Data). Published by IEEE, 2019. pp. 2711–2720. DOI: 10.1109/BigData.2018.8622487
cited by:
- Nadim Akhtar Khan, S. M. Shafi, and Humma Ahangar. “Digitization of cultural heritage: Global initiatives, opportunities and challenges.” Journal of Cases on Information Technology (JCIT) 20, no. 4 (2018): 1-16. igi-global.com
- Jongwook Lee. “Analysis and Suggestions of Digital Heritage Policy.” Journal of The Korea Society of Computer and Information 24, no. 10 (2019): 71-78. koreascience.or.kr
- Klara Martha Wanderley Freire. “A curadoria digital nas instituições culturais: possibilidades de reuso de dados de Arte.” (2019). repositorio.ibict.br
- Jussi Pajari. “Tutkimusaineistojen metatiedot: Metatietojen laatu data-ja metatietoarkistoissa.” Master’s thesis, 2019. trepo.tuni.fi
- Mohammadreza Tavakoli, Mirette Elias, Gábor Kismihók, and Sören Auer. “Quality prediction of open educational resources a metadata-based approach.” In 2020 IEEE 20th International Conference on Advanced Learning Technologies (ICALT), pp. 29-31. IEEE, 2020. DOI: 10.1109/ICALT49669.2020.00007
- Kristen Schuster, and Stuart Dunn, eds. Routledge International Handbook of Research Methods in Digital Humanities. Routledge, 2020. books.google.com
- Mohammadreza Tavakoli, Ali Faraji, Stefan T. Mol, and Gábor Kismihók. “OER Recommendations to Support Career Development.” In 2020 IEEE Frontiers in Education Conference (FIE), pp. 1-5. IEEE, 2020. DOI: 10.1109/FIE44824.2020.9274175
- A. J. Million. “Information Communication Technologies, Infrastructure, and Research Methods in the Digital Humanities.” In Routledge International Handbook of Research Methods in Digital Humanities, pp. 190-202. Routledge, 2020. DOI: 10.4324/9780429777028-15
- Mark Edward Phillips, Oksana L. Zavalina, and Hannah Tarver. “Exploring the utility of metadata record graphs and network analysis for metadata quality evaluation and augmentation.” International Journal of Metadata, Semantics and Ontologies 14, no. 2 (2020): 112-123. DOI: 10.1504/IJMSO.2020.108326
- Yalemisew Abgaz, Amelie Dorn, José Luis Preza Díaz, and Gerda Koch. “Towards a Comprehensive Assessment of the Quality and Richness of the Europeana Metadata of food-related Images.” In Proceedings of the 1st International Workshop on Artificial Intelligence for Historical Image Enrichment and Access, pp. 29-33. 2020. aclweb.org
- Johann Eder and Vladimir A. Shekhovtsov. “Data Quality for Medical Data Lakelands.” In International Conference on Future Data and Security Engineering, pp. 28-43. Springer, Cham, 2020. DOI: 10.1007/978-3-030-63924-2_2
- Matteo Lorenzini, Marco Rospocher, and Sara Tonelli. “Proposta per una valutazione automatica della completeness dei metadati nel contesto delle biblioteche digitali.” DigItalia 2. (2020). pp. 159-167. DOI: 10.36181/digitalia-00023
- Johann Eder and Vladimir A. Shekhovtsov. “Data quality for federated medical data lakes.” International Journal of Web Information Systems (2021). DOI: 10.1108/IJWIS-03-2021-0026
- Mark Edward Phillips and Hannah Tarver. “Investigating the use of metadata record graphs to analyze subject headings in the digital public library of America.” The Electronic Library (2021). DOI: 10.1108/EL-11-2020-0317
- Susan A. Barrett. “Participatory Description and Metadata Quality in Rapid Response Archives.” Collections (2021): 1550190620981038. DOI: 10.1177/1550190620981038
- Matteo Lorenzini, Marco Rospocher, and Sara Tonelli. “Automatically evaluating the quality of textual descriptions in cultural heritage records.” International Journal on Digital Libraries 22, no. 2 (2021): 217-231. DOI: 10.1007/s00799-021-00302-1
- Lisa Wenige, Claus Stadler, Michael Martin, Richard Figura, Robert Sauter, and Christopher W. Frank. “Open Data and the Status Quo–A Fine-Grained Evaluation Framework for Open Data Quality and an Analysis of Open Data portals in Germany.” arXiv preprint arXiv:2106.09590 (2021). arxiv.org
- Mohammadreza Tavakoli, Mirette Elias, Gábor Kismihók, and Sören Auer. “Metadata Analysis of Open Educational Resources.” In LAK21: 11th International Learning Analytics and Knowledge Conference, pp. 626-631. 2021. DOI: 10.1145/3448139.3448208
- Lisandra Díaz de la Paz, Francisco N. Riestra Collado, Juan L. García Mendoza, Luisa M. González González, Amed A. Leiva Mederos, and Alberto Taboada Crispi. “Weights Estimation in the Completeness Measurement of Bibliographic Metadata.” Computación y Sistemas 25, no. 1. 2021. pp. 47–65. DOI: 10.13053/CyS-25-1-3355
- Петр Сергеевич Ершов, and Юрий Евгеньевич Хохлов. “Цифровая инфраструктура для работы с большими данными.” Информационное общество 4-5 (2021): 110-131. infosoc.iis.ru
- Vladimir A. Shekhovtsov and Johann Eder. “Data Item Quality for Biobanks.” In Transactions on Large-Scale Data-and Knowledge-Centered Systems L, Springer, Berlin, Heidelberg, 2021. pp. 77-115. DOI: 10.1007/978-3-662-64553-6_5
- Widad Elouataoui, Imane El Alaoui, and Youssef Gahi. “Metadata Quality in the Era of Big Data and Unstructured Content.” In The International Conference on Information, Communication & Cybersecurity, pp. 110-121. Springer, Cham, 2021. DOI: 10.1007/978-3-030-91738-8_11
- Mohammadreza Tavakoli, Abdolali Faraji, Jarno Vrolijk, Mohammadreza Molavi, Stefan T. Mol, Gábor Kismihók. “An AI-based open recommender system for personalized labor market driven education.” Advanced Engineering Informatics, Volume 52, 2022, 101508. DOI: 10.1016/j.aei.2021.101508
- Kaylin Blount. “Knowing What We Have: Proposing a Sustainable Workflow for the Assessment of Digitized Archival Collections at Wilson Library.” Master’s thesis in Information Science. University of North Carolina, School of Information and Library Science, 2022. unc.edu
- Barbara Šlibar, and Enrique Mu. “OGD metadata country portal publishing guidelines compliance: A multi-case study search for completeness and consistency”. Government Information Quarterly 2022, August. 101756. DOI: 10.1016/j.giq.2022.101756
- Volodymyr A. Shekhovtsov, Johann Eder. “Metadata Quality for Biobanks”. Applied Sciences 2022, 12, 9578. pp. 1–37. DOI: 10.3390/app12199578
- Hannah Tarver, Meredith Hale, Rachel White, Steven Gentry, Madison Chartier, Rachel Wittmann. “Metadata Quality in Digital Libraries: An Analysis of Survey Response Data”. In Proceedings of the 18th International Conference on Digital Preservation 2022. Glasgow, Scotland. pp. 162-172. DOI: 10.7207/ipres2022-proceedings
- Juan Ribeiro Reis, Flavia Bernadini, Jose Viterbo. “A New Approach for Assessing Metadata Completeness in Open Data Portals”. International Journal of Electronic Government Research (IJEGR) 18, 2022, no. 1: 1-20. DOI: 10.4018/IJEGR.313636
- Johann Eder, Volodymyr A. Shekhovtsov. “Managing the Quality of Data and Metadata for Biobanks”. In T. K. Dang et al. (Eds.) Future Data and Security Engineering. Big Data, Security and Privacy, Smart City and Industry 4.0 Applications 2022 (FDSE 2022), CCIS 1688. pp. 52–69. DOI: 10.1007/978-981-19-8069-5_4
- Mirette Magdy Michel Elias. “Adapting OpenCourseWare Based on the Needs and Preferences of Disabled Learners”. PhD Dissertation, University of Bonn. 2022. uni-bonn.de
- B. V. Vishnu, Sharath S. Rao, and B. Netravathi. “An Interactive Framework for Querying Data from Large PDF Files.” In 2023 International Conference on Recent Advances in Information Technology for Sustainable Development (ICRAIS), pp. 25-30. IEEE, 2023. DOI: 10.1109/ICRAIS59684.2023.10367090
- Na Pang, Yue Yuan, Guangjian Li. “国外图书馆 开放元数据服务及其特点” (Open Metadata Service and Characteristics in Foreign Libraries) Documentation, Information and Knowledge, 2023 40 (1), pp. 112-123, DOI: 10.13366/j.dik.2023.01.112
- Hamza Elkina, Mohamed Rida Sahib, and Taher Zaki. “Generic model of metadata management system for data lakes.” International Journal of Metadata, Semantics and Ontologies 16, no. 4 (2023): 315-328. DOI: 10.1504/IJMSO.2023.140696
- Vinit Kumar, Chandrappa, and N. S. Harinarayana. “Exploring dimensions of metadata quality assessment: A scoping review.” Journal of Librarianship and Information Science (2024) DOI: 10.1177/09610006241239080
- Sandra Isabel Arango-Vásquez, Bell Manrique-Losada, Beatriz Eugenia Quiceno-Castañeda and Fernando Moreira. “Rethinking the creation of open educational resources: a model for virtual learning platforms in higher education,” IEEE Revista Iberoamericana de Tecnologias del Aprendizaje, 2024. DOI: 10.1109/RITA.2024.3458860.
Péter Király, Juliane Stiller, Valentine Charles, Werner Bailer, and Nuno Freire. “Evaluating Data Quality in Europeana: Metrics for Multilinguality.” In Metadata and Semantic Research 2019. 12th International Conference, MTSR 2018, Limassol, Cyprus, October 23-26, 2018, Revised Selected Papers (Communications in Computer and Information Science, volume 846) Published by Springer, 2019. pp. 199–211. DOI: 10.1007/978-3-030-14401-2_19
cited by:
- Nuno Freire and Antoine Isaac. “Technical usability of Wikidata’s linked data.” In International Conference on Business Information Systems, pp. 556-567. Springer, Cham, 2019. DOI: 10.1007/978-3-030-36691-9_47
- Mark E. Phillips, Oksana L. Zavalina, and Hannah Tarver. “Using metadata record graphs to understand digital library metadata.” In International Conference on Dublin Core and Metadata Applications, pp. 49-58. 2020. dcpapers.dublincore.org
- Nuno Freire, and Antoine Isaac. “Wikidata’s linked data for cultural heritage digital resources: an evaluation based on the Europeana data model.” In International Conference on Dublin Core and Metadata Applications, pp. 59-68. 2020. dcpapers.dublincore.org
- Sarantos Kapidakis. “Consistency and Interoperability on Dublin Core Element Values in Collections Harvested using the Open Archive Initiative Protocol for Metadata Harvesting.” In Proceedings of the 12th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2020) Volume 2: KEOD, pp. 181-188. 2020. DOI: 10.5220/0010112001810188
- Yalemisew Abgaz, Amelie Dorn, José Luis Preza Díaz, and Gerda Koch. “Towards a Comprehensive Assessment of the Quality and Richness of the Europeana Metadata of food-related Images.” In Proceedings of the 1st International Workshop on Artificial Intelligence for Historical Image Enrichment and Access, pp. 29-33. 2020. aclanthology.org
- Н. С. Редькина “ЕВРОПЕАНА: ЦИФРОВОЕ КУЛЬТУРНОЕ НАСЛЕДИЕ ЕВРОПЫ.” Ученые записки (Алтайская государственная академия культуры и искусств) 2 (24) (2020). cyberleninka.ru
- Inkyung Choi, Wan-Chen Lee, Ying-Hsang Liu, Hsinliang Chen, Douglas W. Oard, and Chi Young Oh. “Cross-cultural information access.” 85th Annual Meeting of the Association for Information Science & Technology, 2022. Proceedings of the Association for Information Science and Technology, 2022 59:1 pp. 551-554. DOI: 10.1002/pra2.624. Preprint: umd.edu.
- Julia Neumann. “Semantic interoperability in metrology through controlled vocabulary.” In IMEKO TC6 International Conference on Metrology and Digital Transformation. September 19 − September 21, 2022, Berlin, Germany. 2022 m4dconf2022.ptb.de
- Manuel Alejandro Flores Chávez. “MetaMetrics: prototipo de visualización de la calidad de los metadatos en revistas científicas latinoamericanas publicadas en Open Journal System.” Biblioteca Universitaria 26, no. 1 (2023). DOI: 10.22201/dgbsdi.0187750xp.2023.1.1466
- Scott McAvoy, Beatrice Tanduo, Alessandra Spreafico, Filiberto Chiabrando, Dominique Rissolo, John Ristevski, and Falko Kuester. “An Archival Framework for Sharing of Cultural Heritage 3D Survey Data: OpenHeritage3D.Org.” The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences 2024. XLVIII-2–2024 pp. 241–248. DOI: 10.5194/isprs-archives-XLVIII-2-2024-241-2024
Péter Király. “Adat a könyvtárban” (Data in the library – paper in Hungarian about the changing status of data in LAM). In Hagyomány és újítás a 21. századi könyvtárban (Erdélyi Évszázadok. A Kolozsvári Magyar Történeti Intézet Évkönyve. III.) eds. Rüsz-Fogarasi Enikő, Monok István. Kolozsvár (Romania), 2018. ISBN 978-606-8886-1. pp. 49-74. http://real.mtak.hu/92256/1/ErdEvsz_tordelt_nyomdaba.pdf
cited by:
- Virágos Márta. “Open Science a könyvtárban: könyvtáros kompetenciák újraértelmezése.” Tudományos és Műszaki Tájékoztatás 67, no. 12 (2020): 739-756. tmt.omikk.bme.hu
Péter Király. “Measuring metadata quality”. PhD dissertation, University of Göttingen. DOI: 10.13140/RG.2.2.33177.77920 (ResearchGate), Göttingen eDiss repository, Academia.edu.
cited by:
- Tyler J. Skluzacek, Ryan Wong, Zhuozhao Li, Ryan Chard, Kyle Chard, and Ian Foster. “A Serverless Framework for Distributed Bulk Metadata Extraction.” In Proceedings of the 30th International Symposium on High-Performance Parallel and Distributed Computing, pp. 7-18. 2020. DOI: 10.1145/3431379.3460636
- Widad Elouataoui, Imane El Alaoui, and Youssef Gahi. “Metadata Quality in the Era of Big Data and Unstructured Content.” In The International Conference on Information, Communication & Cybersecurity, pp. 110-121. Springer, Cham, 2021. DOI: 10.1007/978-3-030-91738-8_11
- Salman Haider. “Library Cataloging, Classification, and Metadata Research: A Bibliography of Doctoral Dissertations—A Supplement, 2021”. Cataloging & Classification Quarterly, 2022. no. 1. pp. 1-6. DOI: 10.1080/01639374.2021.2025183
- Tyler J. Skluzacek, Matthew Chen, Erica Hsu, Kyle Chard, and Ian Foster. “Models and Metrics for Mining Meaningful Metadata.” International Conference on Computational Science (ICCS), 2022. ResearchGate.net.
- Nhu, Nam Dan. “Nachschlagedienst für qualitative hochwertige Metadaten von Veröffentlichungen.” Bachelor’s thesis, Hannover: Gottfried Wilhelm Leibniz Universität Hannover, Institut für Verteilte Systeme, 2022. handle:123456789/12472
- Tyler J. Skluzacek. “Automated Metadata Extraction Can Make Data Swamps More Navigable.” Ph.D. dissertation. The University of Chicago, Computer Science (2022). DOI: 10.6082/uchicago.4760
- Volodymyr A. Shekhovtsov, and Johann Eder. “Metadata Quality for Biobanks”. Applied Sciences 2022, 12, 9578. pp. 1–37. DOI: 10.3390/app12199578
- Agnieszka Karlińska, Cezary Rosiński, Jan Wieczorek, Patryk Hubar, Jan Kocoń, Marek Kubis, Stanisław Woźniak, Arkadiusz Margraf, and Wiktor Walentynowicz. “Towards a contextualised spatial-diachronic history of literature: mapping emotional representations of the city and the country in Polish fiction from 1864 to 1939.” In Proceedings of the 6th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature, Gyeongju, Republic of Korea. International Conference on Computational Linguistics. 2022. pp. 115–125 aclanthology.org
- Murari Tapaswi. “Some issues in the Shodhganga–A theses repository from India.” Annals of Library and Information Studies 70, no. 2 (2023): 74-84. DOI: 10.56042/alis.v70i2.1834
- Widad Elouataoui, Imane El Alaoui, and Youssef Gahi. “Metadata Quality Dimensions for Big Data Use Cases.” In Proceedings of the 2nd International Conference on Big Data, Modelling and Machine Learning-BML, pp. 488-495. 2022. DOI: 10.5220/0010737400003101
- Bhola Gan Chaudhuri, and Shalli Rani. “Managing Metadata in Data Warehouse for Data Quality and Data Stewardship in Telecom Industry-A Compact Survey.” In 2023 International Conference on Computing, Communication, and Intelligent Systems (ICCCIS), pp. 353-361. IEEE, 2023. DOI: 10.1109/ICCCIS60361.2023.10425001
- Ghosh, Saptarshi, and Aditi Roy. “Metadata Quality Benchmarks of ETDs in International Institutional Repositories: An Automated Appraisal.” (2023). INFLIBNET Centre. ir.inflibnet.ac.in
- Vinit Kumar, Chandrappa, and N. S. Harinarayana. “Exploring dimensions of metadata quality assessment: A scoping review.” Journal of Librarianship and Information Science (2024) DOI: 10.1177/09610006241239080
Péter Király. “Validating 126 million MARC records”. In DATeCH2019: Proceedings of the 3rd International Conference on Digital Access to Textual Cultural Heritage, Brussels, Belgium, May 08-10, 2019. Published by ACM, 2019. ISBN: 978-1-4503-7194-0. pp. 161-168. DOI: 10.1145/3322905.3322929
cited by:
- Ungváry, Rudolf. “MARC21 tartalmi adatmezők használata jelentősebb nagykönyvtárakban. Egy elemzés néhány tanulsága.” Networkshop (2020): 33-53. real.mtak.hu
- Ungváry, Rudolf. “Ismeretszervező-könyvtári rendszerek tartalmi feltárásának összehasonlító vizsgálata MARC21 környezetben.” Tudományos és Műszaki Tájékoztatás, 2020. (67. évf.) 11. sz. pp. 655-680. tmt.omikk.bme.hu
- Evan Bryer, Theppatorn Rhujittawiwat, John R. Rose, and Colin F. Wilder. “Spelling Based Ranked Clustering Algorithm To Clean And Normalize Early Modern European Book Titles.” ihci-conf.org
- Evan Bryer, Theppatorn Rhujittawiwat, Samyu Comandur, Vasco Madrid, Stephanie Riley, John Rose, and Colin Wilder. “Analysis of Clustering Algorithms to Clean and Normalize Early Modern European Book Titles.” In 2021 The 4th International Conference on Software Engineering and Information Management, pp. 106-112. 2021. DOI: 10.1145/3451471.3451489
- Gustavo Candela, Pilar Escobar, María Dolores Sáez and Manuel Marco-Such. “A Shape Expression approach for assessing the quality of Linked Open Data in Libraries.” Semantic Web 14, no. 2 (2023): 159-179. DOI: 10.3233/SW-210441
- Jakob Voß. “Datenqualität als Grundlage qualitativer Inhaltserschließung.” In Qualität in der Inhaltserschließung. Edited by: Michael Franke-Maier, Anna Kasprzik, Andreas Ledl and Hans Schürmann. Berlin, Boston: De Gruyter Saur. ISBN: 9783110691597, DOI: 10.1515/9783110691597 (Bibliotheks- und Informationspraxis, Volume 70) pp. 167-176. DOI: 10.1515/9783110691597-011
- Vyacheslav Zavalin, Oksana L. Zavalina and Rachel Safa. “Patterns of Subject Metadata Change in MARC 21 Bibliographic Records for Video Recordings.” Proceedings of the Association for Information Science and Technology 58, no. 1 (2021): 543-547. DOI: 10.1002/pra2.494
- Evan Bryer, Theppatorn Rhujittawiwat, John R. Rose, and Colin F. Wilder. “Improvement of Clustering Algorithms by Implementation of Spelling Based Ranking.” IADIS International Journal on Computer Science and Information Systems 2021. Vol. 16, No. 2, pp. 45-60. ISSN: 1646-3692. iadisportal.org
- Tomasz Umerle and Vojtěch Malínek. “Literarybibliography.eu — modelový příklad pro tvorbu mezinárodní oborové bibliografie.” Ceska Literatura 70(2022):5. pp. 579–595. DOI: 10.51305/cl.2022.05.03
- Gustavo Candela, Nele Gabriëls, Sally Chambers, Thuy-An Pham, Sarah Ames, Neil Fitzgerald, Katrine Hofmann, Victor Harbo, Abigail Potter, Meghan Ferriter, Eileen Manchester, Alba Irollo, Ellen Van Keer, Mahendra Mahey, Olga Holownia, Milena Dobreva. “A Checklist to Publish Collections as Data in GLAM Institutions.” arXiv preprint (2023) arXiv: 2304.02603
Péter Király and Marco Büchler. “A teljesség minőségjelzőként való mérése az Europeanában”. (Hungarian translation of “Measuring completeness as metadata quality metric in Europeana”) In Digitális Bölcsészet 2, 2019. pp. 57-76. DOI: 10.31400/dh-hun.2019.2.388
2020
Péter Király. “Empirical evaluation of library catalogues”. In EuropeanaTech Newsletter 15, 2020. https://pro.europeana.eu/page/issue-15-swib-2019#empirical-evaluation-of-library-catalogues. In Spanish: “Evaluación empírica de los catálogos de las bibliotecas” (translator unknown; send me a message if you know the translator). Blog de la biblioteca de Traducción y Documentación de la Universidad de Salamanca, 2020. https://universoabierto.org/2020/06/01/evaluacion-empirica-de-los-catalogos-de-las-bibliotecas/
Péter Király. “A magyar népzenei adatok története és a (digitális) archiválás lehetőségei. Bolya Mátyás. Információelmélet és népzenekutatás: Rendszeralkotás, nyilvántartás, digitális archívum. Budapest: MTA BTK Zenetudományi Intézet–L’Harmattan Kiadó, 2019.” Book review. In Digitális Bölcsészet 3, 2020. pp. 7-15. DOI: 10.31400/dh-hun.2020.3.1405
2021
Péter Király, and Jan Brase. “Qualitätsmanagement”. In Praxishandbuch Forschungsdatenmanagement. Edited by: Markus Putnings, Heike Neuroth and Janna Neumann. Berlin, Boston: De Gruyter Saur. ISBN: 9783110653656, DOI: 10.1515/9783110657807 (De Gruyter Praxishandbuch) pp. 357–380. DOI: 10.1515/9783110657807-020
cited by:
- Dietmar Kammerer and Kai Matuszkiewicz. “Forschungsdaten in der Medienwissenschaft: Infrastrukturen, Plattformen und Forschungsdatenmanagement und ihre Bedeutung für die digitale Transformation der Medienwissenschaft”. In: Stollfuß, S., Niebling, L., Raczkowski, F. (eds) Handbuch Digitale Medien und Methoden. Springer VS, Wiesbaden, 2023. pp. 1-19. DOI: 10.1007/978-3-658-36629-2_6-1
- Maxi Kindling. “Qualitätssicherung von Datenpublikationen bei Data Journals und Forschungsdatenrepositorien”. Doctoral Thesis. Philosophischen Fakultät der Humboldt-Universität zu Berlin. 2022. DOI: 10.18452/26023
- Dr. Yves Vincent Grossmann (ed.) “Data Quality”. In: Max Planck Digital Library. Research Data Management. Information Platform for Max Planck Researchers mpdl.mpg.de
- Philipp Hegel, Melanie Seltmann and Helmut W. Klug. “Forschungsdatenmanagement”. In: KONDE Weißbuch. Hrsg. v. Selina Galka und Helmut W. Klug unter Mitarbeit von Susanne Höfer im Projekt “Enlarging ‘Weißbuch Digitale Edition’”. 2024. pp. 1-5. Handle: 11471/562.50.293
Rudolf Ungváry, and Péter Király. “Bemerkungen zu der Qualitätsbewertung von MARC-21-Datensätzen”. In Qualität in der Inhaltserschließung. Edited by: Michael Franke-Maier, Anna Kasprzik, Andreas Ledl and Hans Schürmann. Berlin, Boston: De Gruyter Saur. ISBN: 9783110691597, DOI: 10.1515/9783110691597 (Bibliotheks- und Informationspraxis, Volume 70) pp. 177-227. DOI: 10.1515/9783110691597-011
Király Péter. “Kulturális adatelemzés – bevezetés és önéletrajz. Lev Manovich. Cultural analytics. Cambridge, Massachusetts–London, England: The MIT Press, 2020.” Book Review. In Digitális Bölcsészet 4, 2021. pp. 11-22. DOI: 10.31400/dh-hun.2021.4.3510
2022
Tomasz Umerle, Giovanni Colavizza, Elżbieta Herden, Rindert Jagersma, Péter Király, Beata Koper, Leo Lahti, David Lindemann, Jakub Maciej Łubocki, Vojtěch Malínek, Alexandra Milanova, Róbert Péter, Nanette Rißler-Pipka, Matteo Romanello, Marcin Roszkowski, Dorota Siwecka, Mikko Tolonen, Ondřej Vimr. “An Analysis of The Current Bibliographical Data Landscape in the Humanities. A Case for the Joint Bibliodata Agendas of Public Stakeholders”. 46 p. DARIAH. DOI: 10.5281/zenodo.6559857
cited by:
- Gustavo Candela. “An automatic data quality approach to assess semantic data from cultural heritage institutions.” Journal of the Association for Information Science and Technology 74, no. 7 (2023): 866-878. DOI: 10.1002/asi.24761
- Vojtěch Malínek. “Poezie na křižovatce: poznámky ke kvantitativním aspektům české poezie v letech 1945–1959.” Bohemica litteraria 26, no. 2 (2023): 7-28. DOI: 10.5817/BL2023-2-1
- Vojtěch Malínek, Tomasz Umerle, Mikko Tolonen, Agnieszka Karlinska, Matteo Romanello, Giovanni Colavizza, Silvio Peroni, Dorota Siwecka, Jakub Łubocki, Nanette Rißler-Pipka, David Lindemann, Penny Labropoulou and Christiane Klaes. “Fostering Collaboration to Enable Bibliodata-driven Research in the Humanities.” In Digital Humanities 2023: Book of Abstracts. 2023. pp. 445-448. DOI: 10.5281/zenodo.8107517
Péter Király, and Hannes Lowagie. “Implementation of a daily MARC assessment with open source tools at KBR, the royal library of Belgium.” IFLA Metadata Newsletter Volume 8, Number 1, June 2022. pp. 12-15. ifla.org
cited by:
- Ann Van Camp and Sven Lieber. “ISNI, a top tool for quality enhancement, smooth data flows and efficient internal processes”. In: 87th IFLA World Library and Information Congress (WLIC) / 2022. pp. 1-10. ifla.org
Rudolf Ungváry, and Péter Király. “A MARC21 tételfejének és kódolt tartalmi jellemzőinek feldolgozási minősége néhány nemzeti könyvtárban”. Tudományos és Műszaki Tájékoztatás 2022. pp. 155-175. DOI: 10.3311/tmt.13174
Triet Ho Anh Doan, Péter Király, Sven Bingert. “MINE – Workspace as a Service for Text Analysis.” In: Gianmaria Silvello, Oscar Corcho, Paolo Manghi, Giorgio Maria Di Nunzio, Koraljka Golub, Nicola Ferro, Antonella Poggi (Eds.) Linking Theory and Practice of Digital Libraries. 26th International Conference on Theory and Practice of Digital Libraries, TPDL 2022 Padua, Italy, September 20–23, 2022, Proceedings. Lecture Notes in Computer Science, vol 13541. Springer, Cham. pp 328–334. DOI: 10.1007/978-3-031-16802-4_29
2023
Ungváry Rudolf, Király Péter. “Nemzeti könyvtárak és az OSZK MARC21 állományainak összehasonlító elemzése néhány adatmező alapján: Tanulságok.” Tudományos és Műszaki Tájékoztatás, 2023. pp. 1-28. DOI: 10.3311/tmt.13250
Király Péter. “Adatelemzés Pythonnal bölcsészeknek. Recenzió Folgert Karsdorp, Mike Kestemont, and Allen Riddell. Humanities Data Analysis: Case studies with Python című könyvéről” Digitális Bölcsészet, 2022. 6. K:3-K:9. DOI: 10.31400/dh-hun.2022.6.5580
Erzsébet Tóth-Czifra, Marta Błaszczyńska, Francesco Gelati, Femmy Admiraal, Mirjam Blümm, Erik Buelinckx, Vera Chiquet, Rita Gautschy, Peter Gietz, Péter Király, Maria Vivas-Romero, Walter Scholger, Bartłomiej Szleszyński, and Ulrike Wuttke. “Research Data Management for Arts and Humanities: Integrating Voices of the Community.” Zenodo, 2023. 97 p. DOI: 10.5281/zenodo.8059626
cited by:
- Ljiljana Poljak Bilić, Kristina Posavec. “FAIRness of Research Data in the European Humanities Landscape” Publications 2024. 12, no. 1: 6. DOI: 10.3390/publications12010006
- Sebastian Barzaghi, Alice Bordignon, Bianca Gualandi, Ivan Heibi, Arcangelo Massari, Arianna Moretti, Silvio Peroni, and Giulia Renda. “A Proposal for a FAIR Management of 3D Data in Cultural Heritage: The Aldrovandi Digital Twin Case.” arXiv preprint arXiv:2407.02018 (2024). DOI: 10.48550/arXiv.2407.02018
Ungváry Rudolf, Király Péter. “Nemzeti könyvtárak és az OSZK MARC21 állományainak összehasonlító elemzése néhány adatmező alapján.” (Short paper version) In: Networkshop, 2023. pp. 36-49. DOI: 10.31915/NWS.2023.7
2024
Vojtěch Malínek, Tomasz Umerle, Edward Gray, Ivan Heibi, Péter Király, Christiane Klaes, Przemysław Korytkowski, David Lindemann, Arianna Moretti, Charlotte Panušková, Róbert Péter, Mikko Tolonen, Aldona Tomczyńska, and Ondřej Vimr. “Open Bibliographical Data Workflows and the Multilinguality Challenge.” Journal of Open Humanities Data, 2024. 10: 27, pp. 1–14. DOI: 10.5334/johd.190
cited by:
- Lorella Viola. “Editorial: Data and Workflows for Multilingual Digital Humanities.” Journal of Open Humanities Data, 2024. 10: 37, pp. 1–6. DOI: 10.5334/johd.220
Péter Király. “QA Catalogue - A Quality Assessment Tool for Library Catalogues”. GWDG Nachrichten 2024. 04-05. pp. 19-24. gwdg.de
Péter Király. “Shacl4Bib: custom validation of library data”. Preprint. arXiv, 2024. arXiv:2405.09177 DOI: 10.48550/arXiv.2405.09177
Sponsors
Thanks to
- GWDG for supporting my research in different ways,
- to Europeana and the eTRAP research group for letting me use their computers,
- to JetBrains s.r.o. for IntelliJ IDEA community licence,
- to the British Library, KBR (the Belgian National Library) and Gemeinsamer Bibliotheksverbund for supporting the development of QA catalogue, and to Deutsche Digitale Bibliothek for supporting the development of the Metadata Quality Assessment Framework,
- to the developers of the open source software packages and infrastructure services I used in the research, and
- to Open Data publishers for their data.
You can reach me via the methods listed on the contact page.