# Load libraries
library(bibliometrix)
library(tidyverse)
library(ggplot2)
library(maps)
# Import raw Scopus data
scopus_raw <- convert2df(
file = "_small lake__.csv [no keyword limitation]",
dbsource = "scopus",
format = "csv"
)
##
## Converting your scopus collection into a bibliographic dataframe
##
## Done!
##
##
## Generating affiliation field tag AU_UN from C1: Done!
# Run bibliometric analysis on the raw (uncleaned) data
bib_analysis <- biblioAnalysis(scopus_raw, sep = ";")
bib_summary <- summary(bib_analysis, k = 10, pause = FALSE)
##
##
## MAIN INFORMATION ABOUT DATA
##
## Timespan 1898 : 2025
## Sources (Journals, Books, etc) 2236
## Documents 13181
## Annual Growth Rate % 0
## Document Average Age 15.5
## Average citations per doc 34.62
## Average citations per year per doc 2.167
## References 715186
##
## DOCUMENT TYPES
## article 11637
## book 117
## book chapter 356
## conference paper 370
## data paper 16
## editorial 21
## erratum 4
## letter 20
## note 33
## review 601
## short survey 6
##
## DOCUMENT CONTENTS
## Keywords Plus (ID) 31456
## Author's Keywords (DE) 23153
##
## AUTHORS
## Authors 31119
## Author Appearances 56688
## Authors of single-authored docs 1228
##
## AUTHORS COLLABORATION
## Single-authored docs 1558
## Documents per Author 0.424
## Co-Authors per Doc 4.3
## International co-authorships % 25.9
##
##
## Annual Scientific Production
##
## Year Articles
## 1898 1
## 1926 1
## 1943 1
## 1947 1
## 1948 2
## 1956 1
## 1961 2
## 1965 1
## 1967 1
## 1970 5
## 1971 7
## 1972 6
## 1973 17
## 1974 11
## 1975 13
## 1976 19
## 1977 21
## 1978 32
## 1979 37
## 1980 40
## 1981 37
## 1982 50
## 1983 56
## 1984 58
## 1985 48
## 1986 93
## 1987 79
## 1988 106
## 1989 89
## 1990 117
## 1991 83
## 1992 119
## 1993 99
## 1994 107
## 1995 103
## 1996 170
## 1997 187
## 1998 206
## 1999 206
## 2000 208
## 2001 201
## 2002 220
## 2003 275
## 2004 225
## 2005 285
## 2006 302
## 2007 271
## 2008 285
## 2009 345
## 2010 398
## 2011 363
## 2012 459
## 2013 463
## 2014 495
## 2015 532
## 2016 595
## 2017 579
## 2018 601
## 2019 603
## 2020 683
## 2021 727
## 2022 743
## 2023 700
## 2024 390
## 2025 1
##
## Annual Percentage Growth Rate 0
##
##
## Most Productive Authors
##
## Authors Articles Authors Articles Fractionalized
## 1 ZHANG Y 112 SMOL JP 21.2
## 2 WANG Y 83 BRILLO BBC 20.9
## 3 WANG J 79 BIRKS HJB 20.1
## 4 LI Y 75 ZHANG Y 19.5
## 5 LI J 71 ANDERSON NJ 17.3
## 6 SMOL JP 68 WANG Y 15.8
## 7 RASK M 62 RASK M 14.8
## 8 WANG L 60 JR 14.7
## 9 JR 57 MOISEENKO TI 14.5
## 10 BIRKS HJB 56 SCHINDLER DW 14.2
##
##
## Top manuscripts per citations
##
## Paper
## 1 SCHWARZENBACH RP, 2005, ENVIRONMENTAL ORGANIC CHEMISTRY
## 2 HYSLOP EJ, 1980, J FISH BIOL
## 3 REYNOLDS CS, 2006, THE ECOLOGY OF PHYTOPLANKTON
## 4 KIDD KA, 2007, PROC NATL ACAD SCI U S A
## 5 ALLAN JD, 2007, STREAM ECOL: STRUCT AND FUNCT OF RUNNING WATERS: SECOND EDITION
## 6 POFF NL, 1997, J NORTH AM BENTHOLOGICAL SOC
## 7 CORRELL DL, 1998, J ENVIRON QUAL
## 8 LIMA SL, 1998, BIOSCIENCE
## 9 ADRIAN R, 2009, LIMNOL OCEANOGR
## 10 WERNER EE, 2003, ECOLOGY
## DOI TC TCperYear NTC
## 1 10.1002/0471649643 3743 170.1 64.5
## 2 10.1111/j.1095-8649.1980.tb02775.x 3684 78.4 28.9
## 3 10.1017/CBO9780511542145 1944 92.6 33.0
## 4 10.1073/pnas.0609568104 1644 82.2 28.9
## 5 10.1007-978-1-4020-5583-6 1451 72.5 25.5
## 6 10.2307/1468026 1441 48.0 22.5
## 7 10.2134/jeq1998.00472425002700020004x 1429 49.3 18.1
## 8 10.2307/1313225 1428 49.2 18.1
## 9 10.4319/lo.2009.54.6_part_2.2283 1324 73.6 28.8
## 10 10.1890/0012-9658(2003)084[1083:AROTII]2.0.CO;2 1319 55.0 20.6
##
##
## Corresponding Author's Countries
##
## Country Articles Freq SCP MCP MCP_Ratio
## 1 USA 1872 0.2058 1452 420 0.224
## 2 CHINA 1142 0.1255 804 338 0.296
## 3 CANADA 935 0.1028 695 240 0.257
## 4 UNITED KINGDOM 464 0.0510 281 183 0.394
## 5 GERMANY 419 0.0461 229 190 0.453
## 6 FINLAND 338 0.0372 268 70 0.207
## 7 SWEDEN 314 0.0345 168 146 0.465
## 8 POLAND 272 0.0299 227 45 0.165
## 9 FRANCE 249 0.0274 124 125 0.502
## 10 JAPAN 226 0.0248 186 40 0.177
##
##
## SCP: Single Country Publications
##
## MCP: Multiple Country Publications
##
##
## Total Citations per Country
##
## Country Total Citations Average Article Citations
## 1 USA 85290 45.56
## 2 CANADA 34587 36.99
## 3 UNITED KINGDOM 25085 54.06
## 4 CHINA 22503 19.70
## 5 GERMANY 14531 34.68
## 6 SWEDEN 14507 46.20
## 7 FINLAND 11441 33.85
## 8 AUSTRALIA 10022 50.11
## 9 FRANCE 9505 38.17
## 10 SWITZERLAND 7209 49.72
##
##
## Most Relevant Sources
##
## Sources Articles
## 1 HYDROBIOLOGIA 573
## 2 FRESHWATER BIOLOGY 295
## 3 JOURNAL OF PALEOLIMNOLOGY 280
## 4 SCIENCE OF THE TOTAL ENVIRONMENT 260
## 5 LIMNOLOGY AND OCEANOGRAPHY 219
## 6 QUATERNARY SCIENCE REVIEWS 184
## 7 WATER (SWITZERLAND) 180
## 8 HOLOCENE 178
## 9 CANADIAN JOURNAL OF FISHERIES AND AQUATIC SCIENCES 174
## 10 JOURNAL OF HYDROLOGY 137
##
##
## Most Relevant Keywords
##
## Author Keywords (DE) Articles Keywords-Plus (ID) Articles
## 1 CLIMATE CHANGE 408 LAKES 2237
## 2 LAKES 377 ARTICLE 1351
## 3 EUTROPHICATION 356 CLIMATE CHANGE 1274
## 4 HOLOCENE 330 WATER QUALITY 1245
## 5 PHYTOPLANKTON 294 LAKE 1226
## 6 ZOOPLANKTON 280 UNITED STATES 1021
## 7 DIATOMS 267 ENVIRONMENTAL MONITORING 1012
## 8 WATER QUALITY 218 CHINA 975
## 9 LAKE SEDIMENTS 204 EUTROPHICATION 907
## 10 LAKE 196 ECOSYSTEM 888
Bibliometric analysis is a quantitative method used to evaluate and map the scientific literature of a specific field. This study uses bibliometric analysis to examine the research landscape on small lakes from 1990 to 2024, using data retrieved from the Scopus database.
The analysis aims to identify publication trends, leading authors, influential journals, productive countries, and dominant research themes in the field of small lakes research.
Data were retrieved from the Scopus database using “small lake*” and related terms as search keywords. The raw dataset consisted of 13,181 records spanning 1898 to 2025.
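The dataset was cleaned in three steps: restricting it to publications from 1990 to 2024, retaining only articles and reviews, and removing records with duplicate titles.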
The final cleaned dataset comprises 11,391 documents published across 1,729 sources by 28,700 authors.
All analyses and visualizations were performed in R using the bibliometrix, tidyverse, and ggplot2 packages.
The raw dataset retrieved from Scopus consisted of 13,181 records spanning from 1898 to 2025. Before proceeding with the analysis, the data was inspected and cleaned through the following steps.
The first step was to check whether key bibliometric fields contained any missing values.
## AU TI PY SO DE ID AB TC CR C1
## 0 0 0 0 0 0 0 0 0 0
All key fields returned zero missing values, indicating that the Scopus export was complete and no imputation was necessary.
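The code behind this check is not shown, so the following is only a minimal sketch of how such a count could be produced. The data frame `toy` and the helper `count_missing` are hypothetical stand-ins, and whether the original check also treated empty strings as missing is an assumption made here for illustration.

```r
# Toy stand-in for the converted Scopus data frame (all values are invented).
toy <- data.frame(
  AU = c("SMITH J;LEE K", "GARCIA M", NA),
  TI = c("A SMALL LAKE STUDY", "POND SEDIMENTS", "LAKE ICE"),
  DE = c("LAKES;CLIMATE", "", "ICE COVER"),
  stringsAsFactors = FALSE
)

key_fields <- c("AU", "TI", "DE")

# Hypothetical helper: count entries that are NA or empty/whitespace-only.
count_missing <- function(x) {
  sum(is.na(x) | trimws(as.character(x)) == "")
}

# One count per field, returned as a named integer vector.
missing_counts <- vapply(toy[key_fields], count_missing, integer(1))
print(missing_counts)
```

Counting empty strings alongside `NA` matters because a field can be free of `NA` values while still holding blank entries, for example a record with no author keywords.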
Next, the distribution of publication years was examined to identify any anomalous or sparse records.
##
## 1898 1926 1943 1947 1948 1956 1961 1965 1967 1970 1971 1972 1973 1974 1975 1976 1977 1978 1979 1980 1981 1982 1983 1984
## 1 1 1 1 2 1 2 1 1 5 7 6 17 11 13 19 21 32 37 40 37 50 56 58
## 1985 1986 1987 1988 1989 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008
## 48 93 79 106 89 117 83 119 99 107 103 170 187 206 206 208 201 220 275 225 285 302 271 285
## 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020 2021 2022 2023 2024 2025
## 345 398 363 459 463 495 532 595 579 601 603 683 727 743 700 390 1
The data contained records as far back as 1898, with publication counts remaining very sparse prior to 1990. To ensure a meaningful and consistent trend analysis, the dataset was filtered to include only publications from 1990 onwards. Additionally, the single record from 2025 was excluded as the year is incomplete and would skew the publication trend downward.
## [1] 72
A total of 72 duplicate records were identified based on document title and were removed to avoid inflating author and citation counts.
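As a hedged illustration of title-based duplicate detection, the sketch below compares exact matching with matching on normalised titles. The `titles` vector is invented, and the normalisation step is an assumption rather than part of the original workflow.

```r
# Invented titles; two pairs differ only by case, spacing, or not at all.
titles <- c(
  "Small Lake Dynamics",
  "SMALL LAKE  DYNAMICS ",
  "Pond Ecology",
  "Pond Ecology",
  "Lake Ice"
)

# Exact-match duplicates, as duplicated() sees the raw strings.
n_exact <- sum(duplicated(titles))

# Normalise: collapse repeated whitespace, trim the ends, convert to upper case.
norm_titles <- toupper(trimws(gsub("\\s+", " ", titles)))
n_normalised <- sum(duplicated(norm_titles))

cat("Exact duplicates:", n_exact, "\n")
cat("Duplicates after normalisation:", n_normalised, "\n")
```

Matching on title alone can also merge distinct documents that share a title, so a stricter key such as title plus year may be preferable where that risk matters.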
##
## ARTICLE BOOK BOOK CHAPTER CONFERENCE PAPER DATA PAPER EDITORIAL ERRATUM
## 11637 117 356 370 16 21 4
## LETTER NOTE REVIEW SHORT SURVEY
## 20 33 601 6
The dataset contained 11 document types, including articles, reviews, conference papers, book chapters, books, editorials, and others. For bibliometric analysis, only peer-reviewed articles and reviews were retained, as these are the most comparable and citable document types in the literature.
Based on the inspection above, the following cleaning steps were applied:
# Filter by year
scopus_clean <- scopus_raw[scopus_raw$PY >= 1990 & scopus_raw$PY <= 2024, ]
# Keep only Articles and Reviews
scopus_clean <- scopus_clean[scopus_clean$DT %in% c("ARTICLE", "REVIEW"), ]
# Remove duplicates
scopus_clean <- scopus_clean[!duplicated(scopus_clean$TI), ]
# Confirm final dataset dimensions
dim(scopus_clean)
## [1] 11391 53
The cleaned dataset comprises 11,391 documents ready for bibliometric analysis.
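To see how many records each cleaning step removes, an attrition table can be built alongside the filters. The sketch below applies the same three filters, in the same order, to an invented data frame `toy`. The intermediate object names are hypothetical and the counts do not come from the Scopus data.

```r
# Invented records with the three columns used by the cleaning steps.
toy <- data.frame(
  TI = c("LAKE A", "LAKE A", "POND B", "LAKE C", "LAKE D", "LAKE E"),
  PY = c(1995, 1995, 1985, 2010, 2025, 2020),
  DT = c("ARTICLE", "ARTICLE", "ARTICLE", "BOOK", "REVIEW", "REVIEW"),
  stringsAsFactors = FALSE
)

# Apply the filters one at a time so each intermediate size can be recorded.
step_year  <- toy[toy$PY >= 1990 & toy$PY <= 2024, ]
step_type  <- step_year[step_year$DT %in% c("ARTICLE", "REVIEW"), ]
step_dedup <- step_type[!duplicated(step_type$TI), ]

# Record the number of rows remaining after each step.
attrition <- data.frame(
  step = c("raw", "year 1990-2024", "article/review", "unique title"),
  records = c(nrow(toy), nrow(step_year), nrow(step_type), nrow(step_dedup))
)

# Rows removed by each step relative to the previous one.
attrition$removed <- c(0, -diff(attrition$records))
print(attrition)
```

Because the filters are applied in sequence, the number of duplicates removed at the last step can differ from a duplicate count taken on the raw data.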
The following key metrics summarize the dataset:
| Metric | Value |
|---|---|
| Timespan | 1990–2024 |
| Total Documents | 11,391 |
| Total Authors | 28,700 |
| Total Sources | 1,729 |
| Annual Growth Rate | 3.53% |
| Average Citations per Document | 33.55 |
| Co-Authors per Document | 4.53 |
| International Co-authorships | 27.71% |
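The annual growth rate can be read as a compound rate between the first and last year of the timespan. The sketch below assumes that definition. It is consistent with the 0 reported for the raw data, where both 1898 and 2025 contain a single record, but the source does not confirm the formula. The function name `annual_growth` is hypothetical.

```r
# Compound annual growth rate (percent) between the first and last year.
# Assumed definition; not taken from the bibliometrix documentation.
annual_growth <- function(n_first, n_last, year_first, year_last) {
  ((n_last / n_first)^(1 / (year_last - year_first)) - 1) * 100
}

# Raw dataset: one record in 1898 and one in 2025, so the rate is zero.
raw_rate <- annual_growth(1, 1, 1898, 2025)

# Raw yearly counts for 1990 (117) and 2024 (390), taken before the
# document-type filter and deduplication were applied.
window_rate <- annual_growth(117, 390, 1990, 2024)

cat("Raw timespan:", round(raw_rate, 2), "%\n")
cat("1990-2024 on raw counts:", round(window_rate, 2), "%\n")
```

On the raw yearly counts the 1990–2024 rate comes out at roughly 3.6%, close to the 3.53% in the table. The difference is expected, since the cleaned counts for 1990 and 2024 are lower than the raw ones.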
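# p1 is assumed to hold the plots returned by plot() on a biblioAnalysis object for scopus_clean; its creation is not shown here
p1$AnnualScientProd +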
labs(
title = "Annual Scientific Production on Small Lakes (1990–2024)",
subtitle = "Based on Scopus database | n = 11,391 documents",
x = "Year",
y = "Number of Articles",
caption = "Source: Scopus | Bibliometric Analysis"
)
# Count articles per journal and keep the ten most frequent
journal_df <- as.data.frame(table(scopus_clean$SO))
colnames(journal_df) <- c("Journal", "Articles")
journal_df <- journal_df[order(-journal_df$Articles), ][1:10, ]
ggplot(journal_df, aes(x = reorder(Journal, Articles), y = Articles)) +
geom_bar(stat = "identity", fill = "#2171b5") +
coord_flip() +
labs(
title = "Top 10 Most Relevant Journals in Small Lakes Research",
subtitle = "Based on Scopus database | n = 11,391 documents",
x = "Journal",
y = "Number of Articles",
caption = "Source: Scopus | Bibliometric Analysis"
)
library(maps)
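# Note: bib_analysis was built from scopus_raw above, so these country counts cover the raw records unless it was re-run on scopus_clean
country_df <- as.data.frame(bib_analysis$Countries)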
colnames(country_df) <- c("Country", "Articles")
country_df$Country <- stringr::str_to_title(country_df$Country)
world_map <- map_data("world")
country_df$Country[country_df$Country == "Usa"] <- "USA"
country_df$Country[country_df$Country == "United Kingdom"] <- "UK"
map_merged <- left_join(world_map, country_df, by = c("region" = "Country"))
ggplot(map_merged, aes(x = long, y = lat, group = group, fill = Articles)) +
geom_polygon(color = "white", linewidth = 0.1) +
scale_fill_gradient(
low = "#c6dbef", high = "#08306b",
na.value = "gray90",
name = "Articles"
) +
labs(
title = "Country Scientific Production on Small Lakes Research",
subtitle = "Based on Scopus database | n = 11,391 documents",
caption = "Source: Scopus | Bibliometric Analysis"
)
thematic_map <- thematicMap(
scopus_clean,
field = "DE",
n = 250,
minfreq = 5,
stemming = FALSE,
size = 0.5,
n.labels = 1,
repel = TRUE
)
# Add labels to the ggplot object directly; wrapping it in plot() would draw the unlabelled map first
thematic_map$map +
labs(
title = "Thematic Map of Small Lakes Research",
subtitle = "Based on Scopus database | n = 11,391 documents",
caption = "Source: Scopus | Bibliometric Analysis"
)
cited_df <- scopus_clean %>%
select(TI, AU, PY, SO, TC) %>%
arrange(desc(TC)) %>%
slice_head(n = 10) %>%
mutate(
Label = paste0(word(AU, 1), " (", PY, ")"),
TI = str_trunc(TI, 50)
)
ggplot(cited_df, aes(x = reorder(Label, TC), y = TC)) +
geom_bar(stat = "identity", fill = "#2171b5") +
geom_text(aes(label = TC), hjust = -0.2, size = 3, color = "gray30") +
coord_flip() +
labs(
title = "Top 10 Most Cited Papers in Small Lakes Research",
subtitle = "Based on Scopus database | n = 11,391 documents",
x = "Paper",
y = "Total Citations",
caption = "Source: Scopus | Bibliometric Analysis"
)
This bibliometric analysis examined the scientific landscape of small lake research using 11,391 documents retrieved from the Scopus database, covering the period from 1990 to 2024. The analysis reveals a field that has grown over more than three decades, with an annual growth rate of 3.53% and a peak publication output of 685 documents in 2022, reflecting growing recognition of the ecological importance of small lakes in the global scientific community.
The United States, China, and Canada emerged as the most productive countries, while Hydrobiologia and the Journal of Paleolimnology were identified as the leading publication outlets. The high international co-authorship rate of 27.71% and an average of 4.53 co-authors per document suggest that small lake research is increasingly collaborative and globally engaged.
Thematic analysis of author keywords revealed that the dominant research themes revolve around climate change, eutrophication, phytoplankton, and paleolimnology — all of which are rooted in the natural sciences. This pattern is consistent with the broader observation that small lake research has historically been concentrated in ecological and hydrological dimensions, with comparatively little attention given to the social and economic aspects of these water bodies.
This gap represents a significant opportunity for future research. Small lakes, despite being numerically dominant and biologically active, remain underexplored in terms of their socio-economic contributions to surrounding communities — including their roles in supporting local livelihoods, fisheries, aquaculture, and rural economies. Future studies that bridge the natural and social sciences in the context of small lake research would contribute meaningfully to a more holistic understanding of these ecosystems and their value to human communities.