A quarter of a century ago, when I thought my future was in science, automation and the idea of “big data” had just arrived for genetics. Automated sequencing, mathematical models, algorithms. Similar innovations spread to other areas of biology through things like better sensors, imaging systems, smarter radio tags for wildlife, data-loggers, and the like.

Wet labs got less wet, and computer suites multiplied. Science started following its own version of “Moore’s Law” - the relative amount of time spent on analytical and computational effort doubled every year (or so it seemed).

Now there is another revolution underway in biology labs: artificial intelligence methods, such as deep learning and neural networks. Several recent reports highlight this. The May issue of The Scientist looked at how AI is being used in biology, particularly in imaging. This includes applications that can infer cell structures from unlabelled cellular images, and that can identify bacteria or pollen spores from images (which enables real-time analysis outside of the lab).

AI is also becoming involved in the study of social behaviour in the wild through the use of facial recognition.

However, AI is being used for more than just image analysis. Nathan Benaich looks at six areas in the life sciences that are adopting AI approaches. Along with image analysis, he notes uses in drug discovery, chemical synthesis, protein identification, and protein structure prediction.

Much of biology is about identifying and making sense of patterns, so applications in many other areas are likely to follow.

Technology vs Sociology

The implications of using AI in science are well examined, from both technical and sociological points of view, by Mohammed AlQuraishi in his blog post about advances in predicting protein structures. He considers whether the advances shown by DeepMind’s AlphaFold programme reflect scientific insights or just “better engineering.”

AlQuraishi notes that “science vs engineering” debates are often more about scientists being eager to demonstrate that an advance was made because someone else had more resources, rather than because of a failing in their own thinking and approach.

He thinks both better scientific insights and better engineering are involved, and that this will benefit both small science teams and well-funded companies. His main message isn’t about how AI is transforming his area of research, but about how we think about the practices of biology (and other sciences) more generally in the face of continued technological developments.

AlQuraishi points out that Big Pharma companies, with enormous resources at their disposal, haven’t been able to achieve what AlphaFold’s small team, with little biological experience, has done. He considers this a failing and blames it on pharmaceutical companies’ very narrow and applied research focus.

AlQuraishi also suggests AlphaFold’s success illuminates a common behaviour in science that can hold back discoveries - the reluctance of competing science teams to share information. Competition is an important driver in science, but so is collaboration. When the emphasis is too much toward competition we get “toxic academia”. AlQuraishi thinks that different protein folding teams probably collectively had the information to improve folding predictions if they had shared more of their research and thinking.

Science in the future

The future of science, he thinks, will see a shift, with scientists focusing more on problems that require insights and conceptual breakthroughs, while the machines focus on data generation and crunching, and other automated tasks. That seems plausible.

But will it be more like an Amazon fulfillment centre, with a few graduate students run ragged, or like an updated version of the 1950s Cavendish Lab, where intellectual brilliance and quirkiness flourished?

Making DNA sequencing and analysis quicker and easier didn’t reduce the number of scientists involved in genetics. The opposite happened, because of the opportunities DNA sequencing created and the interesting questions and problems that opened up.

Will a similar trend occur as AI and other automated techniques spread in biology? I’m less sure, because of both supply and demand problems. For one thing, many universities seem to be struggling financially, and competition for good staff and students is rising. Investment in science and technology isn’t seeing much growth.

Rising costs for higher education, combined with the salaries and job security on offer to scientists, may make science less attractive. Graduate students and post-docs can be treated as easily replaceable resources (that’s not a new phenomenon, but it may get even easier with robotics).

Incentives (such as “publish or perish”) are now less aligned to needs, and there appears to be an oversupply of scientists, at least in developed countries. Top data-literate minds are being attracted into technology companies.

What’s of more interest, though, than the number of scientists is the type of science and the nature of the problems and questions being addressed, and the conditions that encourage great science.

Marc Kirschner advocates for renewed emphasis on “playful, non-conformist scientists,” rather than building industrial scientific complexes. That is not something that will appeal to university or government research administrators.

“I believe that science at its most creative is more akin to a hunter-gatherer society than it is to a highly regimented industrial activity, more like a play group than a corporation.”

Kirschner’s analogy to hunter-gatherers is to emphasise a collective with different but complementary skills and minimal hierarchies.

What vs Why

Science has always progressed through a combination of tools, observation and theory. This leads to progress about what (finding and describing what happens), and why (understanding the causes and implications).

One of the big challenges for science is that it needs to focus not just on data generation but also on continuing to develop and test causal hypotheses. Describing patterns and correlations can be useful, but there is more to science than just that. A risk with AI techniques is that how they come to infer patterns and correlations can be unknowable. No one knows how a neural network identifies dogs from millions of photos sans dog.

In an interview, AlQuraishi asked rhetorically:

“What is it that we mean by ‘doing science’? Is science about understanding natural phenomena or about building mathematical models that’ll predict what will happen?”

He’s concerned that an increasing emphasis on data generation may make it harder to publish theoretical papers, which are usually the source of paradigm shifts. An unreasonable faith in data beating theory was illustrated by Wired magazine a decade ago, when it gleefully claimed that a data deluge would do away with the scientific method.

Jonathan Zittrain picked up on this in an essay in The New Yorker. He described the risks that automated thinking has in some circumstances if it provides answers but not explanations. He calls this “intellectual debt.”

“It’s possible to discover what works without knowing why it works, and then to put that insight to use immediately, assuming that the underlying mechanism will be figured out later. In some cases, we pay off this intellectual debt quickly. But, in others, we let it compound, relying, for decades, on knowledge that’s not fully known.”

Science shouldn’t recklessly rack up intellectual debt using big data credits.

There is also the risk that as biology, and the rest of science, becomes more “artificially intelligent” and automated, there will be a correlated increase in expectations from funders and society that answers to pressing and complex problems will be easier and quicker to find. That’s not going to be the case.

Featured image by Michael Schiffer on Unsplash