Published April 1, 2026 | https://doi.org/10.59350/bwz65-rjn21

A Review of "Comparing Bibliographic Coupling and Conceptual Similarity" – Have We Been Wrong All Along?

Creators & Contributors

Jeffrey Demaine & Philippe Boisvert

A review of a recent article on bibliographic coupling (BC) and conceptual similarity prompted reflection among members of a bibliometrics reading group. In this piece, they examine the validity of BC as a measure of semantic similarity, unpack the article's methodology, and discuss its conclusions, considering the broader implications of using BC to capture the semantic content of cited works.

Bibliographic coupling and conceptual similarity

As part of a monthly reading club hosted by Philippe Boisvert of Université Laval, a few bibliometricians reviewed the article "Bibliographic Coupling and Conceptual Similarity: Are the Bibliographically Coupled Papers also Conceptually Similar?" This research by Nandy et al. (2024) casts doubt on the validity of Bibliographic Coupling (BC) as a method of capturing the semantic content of cited articles. It should be noted that questions about the usefulness of BC in identifying any meaningful relationship between publications date back several decades. The article in question does a good job of reviewing previous studies that both support and refute the validity of BC as a measure of semantic similarity. Yet for those of us who have used BC for years, the article's conclusions are surprising.

Bibliographic Coupling

Originally proposed by Michael Kessler in 1963, the rationale behind Bibliographic Coupling ("BC") is simple: the more references that two articles have in common, the more likely their contents are to be similar. This can be measured simply by counting the number of references they share (known as "BC strength"). It can also be expressed as a fraction by dividing the number of shared references by the total number of unique references cited across the two articles, that is, the size of the intersection divided by the size of the union (the "Jaccard Similarity Index").
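To make the two measures concrete, here is a minimal sketch in Python. The function names and the toy reference lists are our own illustrative assumptions; they are not taken from Kessler or from Nandy et al.

```python
def bc_strength(refs_a, refs_b):
    """BC strength: the number of cited references two articles share."""
    return len(set(refs_a) & set(refs_b))


def jaccard(refs_a, refs_b):
    """Jaccard index: shared references divided by all unique references."""
    a, b = set(refs_a), set(refs_b)
    union = a | b
    if not union:
        return 0.0  # two empty reference lists: treat as unrelated
    return len(a & b) / len(union)


# Hypothetical reference lists for two articles.
paper_a = ["ref1", "ref2", "ref3", "ref4"]
paper_b = ["ref3", "ref4", "ref5"]

print(bc_strength(paper_a, paper_b))  # 2 shared references
print(jaccard(paper_a, paper_b))      # 2 shared / 5 unique = 0.4
```

Note that BC strength grows with the length of the reference lists, whereas the Jaccard index is bounded between 0 and 1, which is why the normalized form is usually preferred when comparing pairs of articles with very different numbers of references.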

Nandy et al.'s methodology

Although it involves the manipulation of large matrices, the methodology is straightforward. The normalized Jaccard Similarity is calculated for all pairs of publications, both for BC and for a "concept coupling" score derived from the Dimensions database. The resulting matrices (document × document; document × concept) are then filtered to extract pairs of publications that are related both through BC and through concepts, and the Jaccard Similarity values of these pairs are compared. Somewhat controversially, the authors find no correlation between the BC scores and the "concept coupling" weights assigned by Dimensions.
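The pipeline described above can be sketched roughly as follows. This is our own simplified reading, not the authors' code: we assume each publication's concepts can be treated as a plain set (the Dimensions concept scores are weighted, which this sketch ignores), and all document identifiers, references and concepts are made up.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def coupled_pairs(features):
    """Return {(doc1, doc2): score} for every pair with a non-zero overlap."""
    scores = {}
    for d1, d2 in combinations(sorted(features), 2):
        score = jaccard(features[d1], features[d2])
        if score > 0:
            scores[(d1, d2)] = score
    return scores


# Hypothetical data: cited references and concepts per publication.
references = {
    "P1": {"r1", "r2", "r3"},
    "P2": {"r2", "r3", "r4"},
    "P3": {"r5"},
}
concepts = {
    "P1": {"citation analysis", "clustering"},
    "P2": {"clustering", "text mining"},
    "P3": {"clustering"},
}

bc = coupled_pairs(references)  # bibliographic coupling scores
cc = coupled_pairs(concepts)    # "concept coupling" scores

# Keep only the pairs that are related through BOTH references and concepts.
both = {pair: (bc[pair], cc[pair]) for pair in bc.keys() & cc.keys()}
print(both)
```

In this toy example only the pair (P1, P2) survives the filter, because P3 shares concepts with the other two publications but no references. The two scores of each surviving pair are what would then be compared.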

Thankfully, a close reading of the details of the methodology uncovered a couple of weaknesses in the analysis that may help to explain this unexpected finding:

  • The authors use a Spearman coefficient to test for a correlation between BC and the "concept coupling" score. The advantage of the Spearman correlation is that it is non-parametric, meaning that it can be applied to values that are not normally distributed. Presumably, the BC and concept coupling values were not normally distributed. On the other hand, the disadvantage is that the values for both BC and concept coupling must first be rank-ordered. In transforming the actual values into ranks, information about the size of the differences between values is lost. The result is a more approximate comparison of the two sets of values.

  • Similarly, the authors chose an arbitrary threshold for the value of the concept coupling metric. Only concept coupling scores greater than 0.6 were retained. As many concept scores are much lower, this effectively excluded much comparative detail from the dataset. Curiously, the values of BC were not subject to a similar threshold. As a result, BC values could range from 0 to 1, while concept coupling values began at 0.6. Perhaps if the authors had chosen a more balanced way of filtering the values, the resulting comparison would have been fairer. For example, the top 20% of both BC and concept coupling could have been retained.
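Both points can be illustrated with a small, self-contained sketch. Everything here is an assumption made for illustration: the scores are invented, the helper names are ours, and the "top 20%" filter is only one possible reading of the more balanced alternative suggested above.

```python
def average_ranks(values):
    """Rank values from 1 to n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def top_fraction_cutoff(values, fraction=0.2):
    """Smallest value that still falls within the top `fraction` of values."""
    keep = max(1, round(len(values) * fraction))
    return sorted(values, reverse=True)[keep - 1]


# Invented scores for five publication pairs.
bc = [0.05, 0.10, 0.20, 0.40, 0.90]
cc = [0.61, 0.62, 0.63, 0.64, 0.99]

# Ranking discards magnitudes: Spearman sees only the identical ordering.
print(spearman(bc, cc))  # 1.0
print(pearson(bc, cc))   # below 1.0, because the spacing of values differs

# One-sided filter (threshold on concept coupling only) vs. a symmetric one.
one_sided = [i for i in range(len(cc)) if cc[i] > 0.6]
bc_cut, cc_cut = top_fraction_cutoff(bc), top_fraction_cutoff(cc)
symmetric = [i for i in range(len(bc)) if bc[i] >= bc_cut and cc[i] >= cc_cut]
print(one_sided, symmetric)
```

The one-sided threshold keeps every pair regardless of its BC value, while the symmetric filter applies the same relative cut to both measures. Whether a symmetric filter would change the article's result is an open question; the sketch only shows that the two choices select different subsets.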

While the results presented in the article appear to make a strong case against the meaningfulness of BC, the take-home message may be that, despite decades of use, the jury is still out. Clearly, BC is measuring something different from the concepts expressed in the Dimensions database. Perhaps both are equally valid measures of document similarity, but in completely separate ways.

Connecting flights

Consider an analogy: the similarity of cities can be calculated based on the flight networks that connect them. It can be said that L.A. and Vancouver are coupled (from an airline's point of view) because there are direct flights to Tokyo from both. Yet the socio-economic characteristics of each city (perhaps analogous to the conceptual content of an article) are quite different. Thus, while the article by Nandy et al. (2024) demonstrates that BC and conceptual coupling are incomparable, this should not be taken as discrediting the usefulness of BC in expressing the structural or topical relatedness of publications. A prudent bibliometrician would use BC in conjunction with other measures of textual similarity.

References

  1. Kessler, M. M. (1963). Bibliographic coupling between scientific papers. American Documentation, 14, 10–25. https://doi.org/10.1002/asi.5090140103
  2. Nandy, A., Singh, A., Gupta, V., & Singh, V. K. (2024). Bibliographic Coupling and Conceptual Similarity: Are the Bibliographically Coupled Papers also Conceptually Similar? Journal of Scientometric Research, 13(3), 706–714. https://doi.org/10.5530/jscires.20041115

About the authors

Jeffrey Demaine is an organizer of the annual Bibliometrics and Research Impact (BRIC) conference. He has extensive experience in information science, beginning with the National Research Council in Ottawa, and then the iFQ in Germany. His publications have focused on leveraging metadata to uncover patterns in research and higher education, such as Linked Literature Analysis, gender distribution across faculties and institutions, and the fractionalization of citation impact. He was the Bibliometrics and Research Impact Librarian at the University of Waterloo and McMaster University.

Philippe Boisvert has been the Bibliometrics and Research Impact Librarian at Université Laval in Quebec City, Canada, since 2022. Before that, he had been a subject librarian in science and engineering since 2010, a role in which he has been involved in bibliometrics since 2014. He also has a background in physics (a master's degree and more than two years as a PhD candidate). Since 2023, he has been running the monthly reading club of the Canadian Association of Research Libraries BIR CoP, and he has served on the Steering Committee of that CoP since the same year.

Unless otherwise stated, the content of the Bibliomagician is licensed under a Creative Commons Attribution 4.0 International License.

Additional details

Dates

Issued
2026-04-01T12:49:55
Updated
2026-04-01T12:53:36