Published February 22, 2012 | https://doi.org/10.59350/wfhyy-qt220

Clustering strings

  • 1. University of Glasgow

Revisiting an old idea (Clustering taxonomic names), I've added code to the phyloinformatics course site that clusters strings into sets of similar strings.

This service (available at http://iphylo.org/~rpage/phyloinformatics/services/clusterstrings.php) takes a list of strings, one per line, and returns a list of clusters. For example, given the names


Ferrusac 1821
Bonavita 1965
Ferussa 1821
Fer.
Lamarck 1812
Ferussac 1821


the service finds three clusters, displayed here using Google images:



(Note to self: investigate canviz as an alternative for displaying Graphviz graphs.)

If you are curious, these strings are taxonomic authorities associated with the name Helicella. Based on this clustering there are three taxonomic names, one of which appears with three different variations of the author's name.
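For readers who want to experiment with the general idea, here is a minimal Python sketch of one way to cluster similar strings. It is not the code behind the service: the post does not say which similarity measure or threshold the service uses, so the choice of normalized Levenshtein distance, the 0.8 threshold, and the function names `edit_distance`, `similarity` and `cluster_strings` are all assumptions made for illustration. Strings are linked when they are similar enough, and each connected group of linked strings becomes a cluster.

```python
from itertools import combinations


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed row by row."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # delete ch_a
                               current[j - 1] + 1,       # insert ch_b
                               previous[j - 1] + cost))  # substitute
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """1.0 for identical strings, falling towards 0.0 as they diverge."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


def cluster_strings(strings, threshold=0.8):
    """Group strings into connected components of the similarity graph.

    The threshold is an assumed value, not one taken from the service.
    """
    names = list(dict.fromkeys(strings))  # drop duplicates, keep order
    parent = {name: name for name in names}

    def find(name):
        # Follow parent links to the representative of this component.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    # Link every pair of strings that is similar enough.
    for a, b in combinations(names, 2):
        if similarity(a, b) >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for name in names:
        clusters.setdefault(find(name), []).append(name)
    return list(clusters.values())


if __name__ == "__main__":
    authorities = [
        "Ferrusac 1821",
        "Bonavita 1965",
        "Ferussa 1821",
        "Fer.",
        "Lamarck 1812",
        "Ferussac 1821",
    ]
    for cluster in cluster_strings(authorities):
        print(cluster)
```

On the six names above this sketch groups the three 1821 spellings together and leaves Bonavita 1965 and Lamarck 1812 on their own, but it also leaves the abbreviation "Fer." in a cluster by itself, giving four clusters rather than the three the service reports. A length-normalized edit distance cannot link a four-character abbreviation to a thirteen-character name, so the sketch should be read as an illustration of the approach and not as a reproduction of the service.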

Additional details

Identifiers

UUID
5e34a711-3cc6-4bd1-a53c-8cec0c06cbb0
GUID
tag:blogger.com,1999:blog-16081779.post-5892993905869406165
URL
https://iphylo.blogspot.com/2012/02/clustering-strings.html

Dates

Issued
2012-02-22T16:15:00
Updated
2012-02-22T16:15:53