Rogue Scholar Posts

A brief overview of different types of clustering techniques and their algorithms. Authors: Aishwarya Nambissan (ORCID: 0009-0003-3823-6609), Amir Aryani (ORCID: 0000-0002-4259-9774). Background: Clustering is a fascinating technique used in machine learning, where patterns or data points are grouped based on their similarities. It’s like finding hidden connections among different data points without predefined labels.

Published in iRights.info
Author: Fabian Rack

When artificial intelligence technologies are used in education and teaching, various legal questions arise, copyright questions among them. Fabian Rack explains how open educational materials can serve as training data for AI generators and what to keep in mind when prompting with third-party works. Last month we reported on how best to license OER that contain AI-generated content.

Published in FAIR Data Digest

Dear subscriber, welcome to the second edition of the newsletter, and a warm welcome to all new subscribers. It has been an interesting week. In this edition I will talk about some work updates and a one-day workshop I attended last week, and I have a video recommendation.

Published in GigaBlog

Published today in GigaScience is a Data Note describing the National COVID-19 Chest Imaging Database (NCCID), a centralised database containing chest X-rays, computed tomography (CT) and MRI scans from patients across the UK. Utilising the UK National Health Service’s unique position as the world’s single largest integrated healthcare system, the benefits of collecting chest imaging data at this scale are extensive and already being used

Published in iPhylo

Quick note on a tool I've been working on to parse citations, that is to take a series of strings such as: Möllendorff O (1894) On a collection of land-shells from the Samui Islands, Gulf of Siam. Proceedings of the Zoological Society of London, 1894: 146–156. de Morgan J (1885) Mollusques terrestres & fluviatiles du royaume de Pérak et des pays voisins (Presqúile Malaise). Bulletin de la Société Zoologique de France, 10: 353–249.

Published in GigaBlog

GigaScience has always had a focus on reproducibility rather than subjective impact, and it can be challenging for our reviewers to judge this, especially now that more and more tools are being created, bringing data science to the masses. This also means that more efficiency and ease are required, especially when multiple collaborators and contributors are involved in a specific project.

Published in iPhylo

Note to self. The challenge of finding specimen citations in papers keeps coming around. It seems that this is basically the same problem as finding citations to papers, and can be approached in much the same way. If you want to build a database of references from scratch, one way is to scrape citations from papers (e.g., from the "literature cited" section), convert those strings into structured data, and add those to your database.
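
The "convert those strings into structured data" step can be sketched with a regular expression. This is only a minimal sketch, assuming every citation follows an "Author (Year) Title. Journal, Volume: Pages." pattern; real reference lists are far messier. The names `Citation`, `CITATION_RE`, and `parse_citation` are hypothetical and chosen for illustration; they are not taken from any existing tool.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Citation:
    """Structured form of one citation string (hypothetical schema)."""
    authors: str
    year: int
    title: str
    journal: str
    volume: str
    pages: str


# Assumed pattern: "Author (Year) Title. Journal, Volume: Pages."
CITATION_RE = re.compile(
    r"^(?P<authors>.+?)\s+\((?P<year>\d{4})\)\s+"   # authors, then (year)
    r"(?P<title>.+?)\.\s+"                          # title up to first ". "
    r"(?P<journal>[^.]+?),\s+"                      # journal name, no periods
    r"(?P<volume>\d+):\s*"                          # volume before the colon
    r"(?P<pages>\d+\s*[–-]\s*\d+)\.?$"              # page range, en dash or hyphen
)


def parse_citation(text: str) -> Optional[Citation]:
    """Return a Citation, or None when the string does not fit the pattern."""
    match = CITATION_RE.match(text.strip())
    if match is None:
        return None
    fields = match.groupdict()
    return Citation(
        authors=fields["authors"],
        year=int(fields["year"]),
        title=fields["title"],
        journal=fields["journal"],
        volume=fields["volume"],
        pages=fields["pages"],
    )


sample = (
    "Möllendorff O (1894) On a collection of land-shells from the Samui "
    "Islands, Gulf of Siam. Proceedings of the Zoological Society of "
    "London, 1894: 146–156."
)
record = parse_citation(sample)
```

Strings that do not fit the assumed pattern return `None` rather than a partial record, so they can be set aside for manual review before anything is added to a database.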

Published in iPhylo

Note to self about a possible project. This PLoS ONE paper describes a method for inferring a hierarchy from a set of tags (and cites related work that is of interest). I've grabbed the code and data from http://hiertags-beta.elte.hu/home/ and put it on GitHub. Possible project: use the Tibély et al. method (or others) on taxonomic names extracted from BHL text (or other sources) and see if we can reconstruct taxonomic classifications.