
Can AI help us manage information overload?

By: Saskia Hoving, Mon Jul 25 2022

The digital age has given us unprecedented access to information. Researchers can now obtain far more research into their subject areas than ever before. But how much is too much? And could AI hold the key to tackling information overload in research? In the first of two blogs, we look at the role of publishers in this issue and the development of machine-generated books.

There are now millions of academic articles published every year. Even within niche subject areas, the sheer volume of papers, pre-prints, and data published is far too great for an individual researcher to keep up with. The first wave of the Covid-19 pandemic in 2020 illustrates the problem particularly well. During the first six months of 2020, the number of articles published about Covid-19 grew from zero to 28,000. In mid-May, nearly 3,000 papers were published in a single week. It would be impossible for a researcher to read this many papers and still manage to do their own research. So how can they make sure they're reading the seminal papers and most important findings? And is there more that publishers can do to support them?

We first looked at this subject in a webinar with Dr. Stephanie Preuss, Senior Editor at Springer Nature, and Markus Kaindl, Springer Nature's Group Product Manager for Research Intelligence. Here, we review what they covered and the developments that have taken place since.

The role of publishers in tackling information overload

Stephanie explained that while publishers can be part of the problem, they can also be part of the solution. This is why Springer Nature is investing in technology that uses artificial or 'augmented' intelligence to offer solutions to the 'information overload' challenge.

"As a large publisher, we have a lot of different brands, and, of course, those brands publish a lot of research," said Dr. Stephanie Preuss, speaking at the webinar. "We're very proud of that research, but it also means we are part of the problem of information overload."

So what exactly can publishers do? Stephanie laid out some of the key development areas:

  1. Structuring existing content: To help researchers find the content they need, we're looking at tools and new technology that will help us structure existing content so that it is easier to find and easier to digest. This includes developing and offering new products like machine-generated books, reports, or apps, as well as auto-clustering of content by, for example, subject area.
  2. Supporting researchers: Clustering and structuring research in the way discussed above can help researchers in several ways. For researchers entering a new field, publishers can create machine-generated overviews or clustered content that allow them to get up to speed with the latest research fast. The same applies to those wanting to stay up to date with research in their own field. More than this, because AI can cluster research by a topic, such as climate change, it can also help to overcome research silos. Researchers will see any research on that topic, regardless of which discipline the research originates from, thus broadening their perspective.
  3. New tools for authors: When it comes to writing papers, there are also a number of ways in which AI technology can be put to good use to support authors. Examples include automated Table of Contents generation and even text generation using generative transformers and large language models to overcome the dreaded 'writer's block' (more on this in our second blog).
  4. Shaping the future: As respected publishers, we believe our role is to engage with our communities to understand how best to use these technologies to shape the future of research.
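As a rough illustration of the topic clustering described in point 2, the sketch below groups papers by shared topic keyword rather than by discipline. It is a toy example under stated assumptions: the records in `PAPERS`, the field names, and the helper functions are all hypothetical, and plain keyword matching stands in for whatever clustering a production system would actually use.

```python
from collections import defaultdict

# Hypothetical records for illustration only; not real publications.
PAPERS = [
    {"title": "Sea-level rise and coastal cities", "discipline": "Geography",
     "keywords": {"climate change", "urban planning"}},
    {"title": "Carbon pricing in emerging markets", "discipline": "Economics",
     "keywords": {"climate change", "policy"}},
    {"title": "Heat stress and crop yields", "discipline": "Agriculture",
     "keywords": {"climate change", "food security"}},
    {"title": "Solid-state electrolytes", "discipline": "Chemistry",
     "keywords": {"batteries"}},
]


def cluster_by_topic(papers):
    """Group papers under every topic keyword they carry, ignoring discipline."""
    clusters = defaultdict(list)
    for paper in papers:
        for topic in paper["keywords"]:
            clusters[topic].append(paper)
    return dict(clusters)


def disciplines_for(topic, clusters):
    """List the distinct disciplines that contribute papers to one topic."""
    return sorted({paper["discipline"] for paper in clusters.get(topic, [])})


clusters = cluster_by_topic(PAPERS)
print(disciplines_for("climate change", clusters))
# ['Agriculture', 'Economics', 'Geography']
```

A researcher browsing the 'climate change' cluster here would see work from three disciplines side by side, which is the silo-crossing effect the list describes.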

"We think that there are some important questions around the role of artificial intelligence and publishing," said Stephanie. "We think that artificial intelligence will shape the future of our industry."

Making it a reality: machine-generated books

Stephanie went on to explain the development and release of the first-ever machine-generated academic book. The book was published in April 2019 following a collaboration between Springer Nature, the Applied Computational Linguistics lab of Goethe University Frankfurt, and Digital Science.

This innovative book prototype provided a compelling machine-generated overview of the latest research on lithium-ion batteries, automatically compiled by an algorithm dubbed "Beta Writer". The launch of the book generated significant media attention, with ge.com naming it one of their "coolest things on Earth this week".

In the three years leading up to the book's publication, more than 53,000 papers and articles were published about research being conducted in the lithium-ion battery field. Staying on top of all that research would be near impossible.

As Andrew Liszewski put it, "It's a firehose of data that Springer Nature has turned into a manageable trickle through this machine-generated publication."

The algorithm uses machine learning to first analyze thousands of publications to ensure that only those relevant are selected for the book. It then parses, condenses, and organizes those pre-approved, peer-reviewed publications from Springer Nature's online database into coherent chapters and sections that each focus on a different aspect of battery research.
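The three stages described above (select, condense, organize) can be sketched in a few lines. This is not Beta Writer's actual method: simple keyword overlap stands in for the machine learning, and every name, threshold, abstract, and chapter label below is a hypothetical chosen for illustration.

```python
import re
from collections import defaultdict


def tokens(text):
    """Lower-case word set for a piece of text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def select_relevant(papers, topic_terms, min_overlap=2):
    """Stage 1: keep only papers whose abstract shares enough topic terms."""
    return [p for p in papers
            if len(tokens(p["abstract"]) & topic_terms) >= min_overlap]


def condense(abstract, topic_terms):
    """Stage 2: extract the single sentence richest in topic terms."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", abstract) if s.strip()]
    return max(sentences, key=lambda s: len(tokens(s) & topic_terms))


def organize(papers, chapter_terms, topic_terms):
    """Stage 3: file each condensed paper under its best-matching chapter."""
    chapters = defaultdict(list)
    for paper in papers:
        words = tokens(paper["abstract"])
        best = max(chapter_terms, key=lambda ch: len(words & chapter_terms[ch]))
        chapters[best].append(condense(paper["abstract"], topic_terms))
    return dict(chapters)


# Hypothetical inputs; these are not real abstracts.
TOPIC = {"lithium", "ion", "battery", "batteries", "anode", "cathode", "electrolyte"}
CHAPTERS = {
    "Anode materials": {"anode", "silicon", "graphite"},
    "Electrolytes": {"electrolyte", "solid", "liquid"},
}
PAPERS = [
    {"abstract": "Silicon anode designs improve lithium storage. We thank our funders."},
    {"abstract": "Solid electrolyte layers stabilise the lithium battery. Results are preliminary."},
    {"abstract": "Protein folding was simulated on a laptop."},
]

book = organize(select_relevant(PAPERS, TOPIC), CHAPTERS, TOPIC)
for chapter, summaries in book.items():
    print(chapter, "->", summaries)
```

The third abstract never reaches a chapter because it fails the relevance filter. A real system would also need to handle citations, paraphrasing, and chapter ordering, none of which this sketch attempts.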

The algorithm produces no new results, so it is not new research output, but it accurately provides an unbiased summary of all known facts on a subject to provide a new perspective.

What's next for machine-generated texts?

The book published in 2019 was only the start of our work looking at machine-generated texts. In 2021, we published over 500 machine-generated literature overviews, and offered a new book format: AI-based literature overviews.

The new product is a mixture of human-written text and machine-generated literature overviews. An author puts these machine-generated reviews, created from a large set of previously published articles in Springer Nature journals, into book chapters to provide a scientific perspective.

"This is an exciting step in our innovation journey that started with the first machine-generated book, as this is effectively a new type of book format that resembles a kind of dialogue between the author (now editor) and the machine."

The first publication of this kind was edited by Guido Visconti. Professor Visconti devised a series of questions and keywords related to different aspects of climate studies, examining their most recent developments and their most practical applications. These were queried, discovered, collated, and structured by the machine using AI clustering, with the results presented in a series of book chapters for Professor Visconti to put into scientific context. The same model was used in 2022 for a second volume, edited by Ziheng Zhang, Ping Wang, and Ji-Long Liu.

"We are looking forward to seeing how this joint journey of authors, publishers, and machines helps advance science and show authors surprising new opportunities for future research. We hope others will be inspired and invite the submission of new ideas to produce similar publications in other research areas."

Look out for our second blog on this topic, where we'll consider how AI can help support the research community during times of crisis, from 'TLDR' abstracts to automating scientific content generation.

Author: Saskia Hoving

In the Dordrecht office, Marketing Manager Saskia Hoving is chief editor of The Link Newsletter and The Link Blog, covering trends & insights for all facilitators of research. Focusing on the evolving role of libraries regarding SDGs, Open Science, and researcher support, she explores academia's intersection with societal progress. With a lifelong passion for sports and recent exploration into "Women's inclusion in today's science", Saskia brings dynamic insights to her work.