Linguistic Coverage in the New Databases (Cobertura lingüística en las nuevas bases de datos)

Rodrigo Sánchez-Jiménez

Abstract

This study examines the linguistic coverage of five bibliographic databases, comparing traditional subscription-based sources (Web of Science, or WoS, and Scopus) with emerging open infrastructures and aggregators (OpenAlex, OpenAIRE, and SciLit). It starts from the well-documented geographic and linguistic biases of the classical sources and asks whether the new platforms offer a more diverse representation of global science. To this end, the complete body of indexed output available in each source up to 2025 was retrieved and harmonized, and the distribution of the twenty most prevalent languages was analyzed to quantify differences in visibility across indexing models.
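
The language-distribution step described above can be illustrated with a minimal sketch. It is not the author's actual pipeline: the record layout (dictionaries with a `language` key), the `LANGUAGE_ALIASES` table, and the function names are assumptions made for illustration only. Raw labels are harmonized to two-letter codes, counted, and the most frequent ones are returned with their share of the total.

```python
from collections import Counter

# Hypothetical alias table: each source may label languages differently
# (full names, 3-letter codes, 2-letter codes). A real table would be far larger.
LANGUAGE_ALIASES = {
    "english": "en", "eng": "en", "en": "en",
    "spanish": "es", "spa": "es", "es": "es",
    "japanese": "ja", "jpn": "ja", "ja": "ja",
    "korean": "ko", "kor": "ko", "ko": "ko",
    "indonesian": "id", "ind": "id", "id": "id",
}


def harmonize(label):
    """Map a raw language label to a 2-letter code, or 'und' if unknown or missing."""
    if label is None:
        return "und"
    key = label.strip().lower()
    return LANGUAGE_ALIASES.get(key, "und")


def top_language_shares(records, n=20):
    """Return (code, count, share) tuples for the n most frequent languages."""
    counts = Counter(harmonize(r.get("language")) for r in records)
    total = sum(counts.values())
    if total == 0:
        return []
    return [(code, count, count / total) for code, count in counts.most_common(n)]


if __name__ == "__main__":
    # Toy records standing in for one source's harmonized export.
    sample = [
        {"language": "English"},
        {"language": "eng"},
        {"language": "ja"},
        {"language": None},
    ]
    for code, count, share in top_language_shares(sample):
        print(f"{code}\t{count}\t{share:.1%}")
```

Running the same function over each source's records would give comparable top-twenty tables. Records with missing or unrecognized labels are grouped under `und` rather than dropped, so gaps in language metadata stay visible in the totals.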


The results reveal stark disparities in the volume of non-English documents: platforms such as OpenAlex surface millions of records in Asian languages (Japanese, Indonesian, Korean) and Middle Eastern languages that remain largely invisible in commercial databases. However, the analysis also highlights a trade-off between quantity and metadata quality. WoS and Scopus rely on editorial selection, and OpenAIRE on a more “notarial” harvesting approach (with lower completeness), whereas OpenAlex achieves massive coverage through algorithmic inference, which introduces a certain margin of error. The study concludes that professionals and researchers now navigate two complementary ecosystems and must choose between the curated selectivity of the scientific elite and the more inclusive, though noisier, panorama of global science.

How to Cite
Sánchez-Jiménez, R. (2025). Cobertura lingüística en las nuevas bases de datos. CLIP De SEDIC: Revista De La Sociedad Española De Documentación E Información Científica, (92), 45–54. https://doi.org/10.47251/clip.n92.186
Section
Panorama
Author Biography

Rodrigo Sánchez-Jiménez, Departamento de Biblioteconomía y Documentación, Universidad Complutense de Madrid

Associate Professor (Profesor Titular) at the Universidad Complutense de Madrid, where he has taught for more than twenty years. He received his PhD in Documentation from the UCM in 2006 and has specialized in scientometrics. His research focuses on the quantitative analysis of scientific activity, covering topics such as publication costs (APCs), gender gaps, and the evolution of doctoral theses. He also has extensive experience in editorial management and collaborates actively with the SCImago group on the development of new indicators and the study of open data sources.