Rogue Scholar Posts

Published in Aaron Tay's Musings about librarianship
Author Aaron Tay

Ex Libris surprised us by suddenly releasing Primo Research Assistant to production on September 9, 2024 (when the earlier timeline was 4Q 2024, with some believing it might even be delayed). Despite the fact that there are so many RAG (retrieval augmented generation) academic search systems today that generate answers from search, this is still a significant enough event to be worth covering on my blog. Why?

Published in Aaron Tay's Musings about librarianship
Author Aaron Tay

IP and ethical issues surrounding the use of content in Large Language Models (LLMs) have sparked significant debate, but I've mostly stayed out of it as this isn't my area of expertise. While there's much to discuss and many legal opinions to consider, ultimately the courts will decide what's legal. However, for those interested in exploring this topic further, I recommend Peter Schoppert's AI & Copyright Substack.

Published in Aaron Tay's Musings about librarianship
Author Aaron Tay

I recently watched a librarian give a talk about their experiments teaching prompt engineering. The librarian, drawing from the academic literature on the subject (there is a lot of it!), tried to leverage "prompt engineering principles" from one such paper to craft a prompt and used it in a Retrieval Augmented Generation (RAG) system, more specifically Statista's brand-new "research AI" feature.
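
To make the idea concrete, here is a hypothetical prompt in that spirit, applying a few commonly cited prompt-engineering principles (assign a role, constrain the sources, specify the output format) to a RAG-style question. This is only an illustration; it is not the prompt used in the talk, nor anything specific to Statista's research AI feature.

```python
# Hypothetical example only: a prompt applying common prompt-engineering
# principles (role, source constraint, output format) for a RAG system.
prompt = (
    "You are a research assistant. Answer using only the retrieved passages.\n"
    "Question: How large was the global e-commerce market in 2023?\n"
    "Instructions: cite the source of every figure, state the reference year, "
    "and reply 'not found in the retrieved sources' if the passages do not "
    "cover the question."
)
print(prompt)
```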

Published in Aaron Tay's Musings about librarianship
Author Aaron Tay

As academic search engines and databases incorporate generative AI into their systems, an important concept that all librarians should grasp is that of retrieval augmented generation (RAG). You see it in use in all sorts of "AI products" today, from chatbots like Bing Copilot to Adobe's Acrobat AI Assistant, which allows you to chat with your PDFs.
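
At its core, RAG is a retrieve, augment, generate loop. The sketch below is a deliberately toy illustration of that loop, assuming nothing beyond the Python standard library: the retriever is a naive word-overlap ranker over an in-memory corpus, and the generator is a stand-in for a real LLM call.

```python
# A minimal, self-contained sketch of retrieval augmented generation (RAG).
# Toy retriever and placeholder generator; real systems use vector search
# and an actual LLM, but the retrieve -> augment -> generate flow is the same.

documents = [
    "RAG systems first retrieve passages relevant to the user's question.",
    "The retrieved passages are inserted into the prompt as grounding context.",
    "The language model then generates an answer constrained by that context.",
]

def retrieve(question: str, docs: list[str], top_k: int = 2) -> list[str]:
    """Retrieve: rank documents by naive word overlap with the question."""
    q_words = set(question.lower().split())
    ranked = sorted(docs, key=lambda d: len(q_words & set(d.lower().split())), reverse=True)
    return ranked[:top_k]

def build_prompt(question: str, passages: list[str]) -> str:
    """Augment: place the retrieved passages into the prompt as context."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

def generate(prompt: str) -> str:
    """Generate: stand-in for a call to a large language model."""
    return f"[LLM answer grounded in a prompt of {len(prompt)} characters]"

question = "How does a RAG system ground its answers in retrieved passages?"
print(generate(build_prompt(question, retrieve(question, documents))))
```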

Published in Aaron Tay's Musings about librarianship
Author Aaron Tay

In the last blog post, I argued that despite the advancements in AI brought by transformer-based large language models, most academic search engines are still focused mostly on supporting exploratory searches; they do not optimize for recall and in fact trade accuracy for low latency.

Published in Aaron Tay's Musings about librarianship
Author Aaron Tay

One of the tricks to using the newer "AI-powered" search systems like Elicit, SciSpace and even JSTOR's experimental search is that they recommend you type in your query or what you want in full natural language, not keyword-search style (where you drop the stop words), for better results. So, for example, do…
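
As a purely illustrative contrast (the topic below is my own invention, not drawn from the post), here is the same information need phrased both ways; the RAG-style tools above are tuned for the natural-language version.

```python
# Illustrative only: one information need, two query styles.
keyword_style = "coral reefs climate change bleaching impact"
natural_language_style = "What impact is climate change having on coral reef bleaching?"

for label, query in [("keyword", keyword_style), ("natural language", natural_language_style)]:
    print(f"{label}: {query}")
```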