Introduction to Bioimage Analysis: Open Textbook Interview
In the fourth of our Open Education Week 2024 open textbook interviews, Charlie Farley talks to Dr Peter Bankhead, Reader in the Institute of Genetics and Cancer at The University of Edinburgh. Dr Bankhead is the creator of QuPath, open-source digital imaging software that has been downloaded close to half a million times, has been cited in about 3,000 academic articles, and is used all over the world, both in academia and industry. He has published his open textbook Introduction to Bioimage Analysis using GitHub and Jupyter Book.
Charlie Farley
Could you share with our readers who you are and how you came to create an open textbook?
Dr Peter Bankhead
I’m a Reader in the Institute of Genetics and Cancer at The University of Edinburgh. Although I work in image analysis, my first degree was in Divinity. I changed direction and studied for an MSc in Computer Science, then I did a PhD in Biomedical Sciences with the idea that I would learn more biology with lab work and analysis. But it ended up being 100% analysis. Because I hadn’t been taught image analysis before, I remember how difficult it was to try to learn the key ideas quickly – and apply them at a deep enough level to do a PhD. So, I can really empathise with the biologists and biomedical scientists I meet now, who find themselves needing to learn how to analyse their imaging data, which requires quite different skills from those they were trained in.
I had always tried to make my work open, because I think it’s really important for transparency and reproducibility in science. I shared code with the papers I wrote for my PhD, and I benefitted a lot from open software written by others. Then, some years later, I found myself as a postdoc in the field of digital pathology, where the images are huge – maybe 40-50 GB for a single 2D image of a tissue sample – and all the open-source tools I used previously just didn’t work. I created new software to address this, and wanted to make it open source as well. That was when I discovered that not everyone in a university is necessarily very enthusiastic about openness. In the end, I felt I had to leave the university to get it released. My next job was in industry, which wasn’t very conducive to working open source either.
I left that job after a year, and spent some time figuring out what to do next. In the end, I joined Edinburgh as Senior Lecturer in September 2018.
Since the software I created back as a postdoc was finally open-source by this time, and I was free to work on it, I started improving it – and now, together with my group, we have made it much more powerful. The software, QuPath, has been downloaded close to half a million times, cited in more than 3,000 academic articles, and is used all over the world, both in academia and industry. All of that has only been possible because it is open.
The theme of ‘openness’ has really come to dominate the research side of my career. But during my first postdoc, back in 2010-2012, I was in a core facility and doing some teaching. I found myself trying to explain concepts to biologists who – like me – hadn’t been trained in image analysis. When I started to create new training resources to try to explain things in a more intuitive way, I just wanted to share them in the hope they’d be useful. I didn’t really know anything about licensing or FAIR principles – which only came out later anyway – so the initial handbook that I wrote was just a big 195-page PDF online. Even so, quite a lot of people downloaded it and used it to learn about image analysis.
Then I started to get questions from people wanting to re-use it, and I realised that I didn’t know the rules around this myself. Also, a PDF was very static and wasn’t the ideal format for teaching. By this time, my experiences with making software open had taught me that sharing doesn’t just mean putting something online, but also being clear about licences. I ended up completely revising the content, and replacing many of the images. It feels better that it can now be used by other people and by my future self, no matter where I am.
Charlie Farley
What was your process in deciding where to host and publish your open textbook?
Dr Peter Bankhead
After updating the PDF, I had the documentation on GitHub and published using GitBook, which turned it into a nice website. I still use GitHub to host all the source material, but now I use Jupyter Book – which is itself open-source. Jupyter Book makes it possible to make the book interactive. For example, the figures are created using Python code; particularly enthusiastic readers can modify this code in their browser, and see how the figure changes. This gives readers an extra way to use the book to explore new image processing concepts.
Charlie Farley
You said that you went through a process of checking the copyright of everything that you used in there and removing the images. How did you find that process?
Dr Peter Bankhead
Oh, torturous! There are images that are widely used among researchers and instructors, but it is difficult to find 100% clear statements on their licensing. Then there were microscopy images I had been using from previous institutions. I’d included them in my original PDF handbook with the agreement of the person who had taken the image, but I didn’t have explicit agreement for sharing under a specific licence – because I hadn’t realised at the time how important that would become. So I had to remove these and find replacement images where I could point readers to the original source.
I know I’m not alone in this. I was recently at a workshop on ‘Effectively Communicating Bioimage Analysis’. Chatting with one of the other attendees, who I know makes great training videos that are very popular, we talked about how we both found ourselves recreating our materials during the pandemic. In fact, he had the idea of submitting an opinion paper about why people should share training resources under clear open licences, which we’ve just done. Sharing training resources is important not just for learners, but for other trainers: we can create better and more advanced resources if we don’t need to continually spend time reinventing what already exists from scratch. And it’s good for the person doing the sharing, because they get credit and recognition – as well as the satisfaction that more people are benefiting from their work.
Charlie Farley
Do you have a message for people out there who are making this kind of material?
Dr Peter Bankhead
If at all possible, please share under a permissive licence. CC BY is a good choice, or even CC0 if you don’t want to be acknowledged; more restrictive licences are better than no licence at all, but too many restrictions (such as non-commercial, or no derivatives) can really complicate remixing resources because licence incompatibilities start to arise. I am by no means an expert on the legal intricacies, but there is a lot of good information on the Creative Commons website. Really, to keep things simple, I’d recommend sharing as openly and permissively as you can, and as early as you can. If you only think about licensing years later, it’s going to be harder to go back and sort it all out – especially if you’re at a different institution.
Charlie Farley
What sort of feedback have you gotten from students and colleagues and peers?
Dr Peter Bankhead
Every now and again I meet someone working in the field, and they tell me that they first encountered image analysis through my book. That’s always really encouraging. I also know from experienced people working in the field that the book is one that they recommend to their students.
One exciting thing happening now is that Beth Cimini, as part of the Center for Open Bioimage Analysis in the US, has been introducing me to the world of translation. They are starting to translate bioimage analysis resources into different languages and they’ve chosen this book as one of their first. We have machine translations in Spanish and German – now we need native speakers to edit and improve them.
Charlie Farley
That’s great news!
The last question I want to ask you is what would you say to somebody who is thinking about creating an open textbook?
Dr Peter Bankhead
If the alternative is to create a closed textbook, I would say create an open textbook!
I’ve certainly found it rewarding. I was much more motivated to work on it because I knew that it had the potential to help more people than something closed and restricted. I’ve learned a lot as the book has evolved over the years, and it has helped with the in-person workshops I’ve taught along the way. I understand the topics better because I’ve spent a long time thinking about how to explain them, and benefited from others’ feedback.
It also opens a lot of doors.
When I meet new people at conferences, sometimes they mention having read the book or recommended it to others. I know a lot of people in science suffer from imposter syndrome – I can certainly relate to it – and it feels a bit exposing to work openly. I might get things wrong, and when I make my work open then everyone can see my mistakes. But this can be a good thing, since then they can tell me and I’ll learn. And the positive response to the book, as well as my open-source software, gives me more confidence that I belong in the field and have something meaningful to contribute.
Even though it’s a bit scary, and certainly can be a lot of work, I’d highly recommend making an open textbook – it can end up creating new opportunities you could never have expected, both for yourself and for others.