#17 | Animal Consciousness Evidence

+ a 15 trillion token dataset, the tree of life for flowering plants, and more

Hello fellow curious minds!

Welcome back to another edition of The Aurorean.

Thank you to everyone who participated in our first raffle giveaway this past month! We received hundreds of survey responses and will pull our 3 randomly selected winners within the next 24 hours. If you are one of the winners, you will receive a separate email from us congratulating you this Friday.

Best of luck to everyone! Even if you do not win one of our $50 Visa gift card prizes this time around, the appreciation and success we saw from this event have motivated us to celebrate with more giveaway opportunities in the future for those who engage with and support our work. ❤️

***

One other quick point: last week’s poll was illuminating for us. We asked you all how often you listen to podcasts, and the most popular response was Never, with 28% of votes from participants. When we include participants who said Rarely, that number jumps to a majority of 52%. This was somewhat expected. We are a newsletter, so we expect our audience to enjoy reading. However, this was a reminder that we all prefer to consume information in different ways, and as we continue to grow our community we’ll explore how we can best serve the preferences and habits of our audience.

With that said, on to the news. Wondering what STEM discovered last week?

Let’s find out.

Quote of the Week 💬 

Experts Approaching A New Consensus On Animal Consciousness

“The empirical evidence indicates at least a realistic possibility of conscious experience in all vertebrates (including all reptiles, amphibians and fishes) and many invertebrates (including, at minimum, cephalopod mollusks, decapod crustaceans and insects).”

The New York Declaration on Animal Consciousness

⌛ The Seven Second Summary: A group of 30+ philosophers and biologists created and signed an updated declaration on animal consciousness, acknowledging an emerging consensus around a growing body of scientific evidence that consciousness is likely experienced by many vertebrates and invertebrates.

🔬 How It Was Done:

  • In 2012, the Cambridge Declaration on Consciousness became the first major scientific declaration to acknowledge consciousness is likely experienced by mammals and birds. Last week’s updated declaration builds upon this framework to include far more of the animal kingdom.

  • The new declaration focuses on the most basic kind of consciousness, known as phenomenal consciousness, which refers to an animal’s capacity to have sensory experiences, such as sensations of touch, taste, and sight. Phenomenal consciousness also implies that animals with sensory experiences can register them as positive or negative feelings, like pain, pleasure, or fear.

🧮 Key Results: To strengthen their claims, the researchers shared a list of recent evidence of animals demonstrating phenomenal consciousness. For example:

  • Vertebrates:

    • Reptiles: Garter snakes pass a scent-based version of the mirror-mark test, indicating self-recognition.

  • Invertebrates:

    • Cephalopods: Octopuses avoid pain and value pain relief, and cuttlefish remember details of specific past events.

    • Crustaceans: Crayfish display “anxiety-like” states, and these states can be altered by anti-anxiety drugs.

💡 Why This May Matter: Society already considers the welfare of livestock and of mice in lab research because we acknowledge these animals experience some level of consciousness. How might society’s relationship with fish, reptiles, insects and other animals change as we learn more about their consciousness?

🔎 Elements To Consider: This declaration does not suggest these animals experience more complex forms of consciousness, like a sense of self or an “inner monologue.” However, it does recognize that many animals may experience sensations, feelings and other forms of cognition, some of which humans do not.

Stat of the Week 📊 

Using AI To Improve Diagnosis Of Rare Genetic Disorders

57%

⌛ The Seven Second Summary: Researchers at Baylor College of Medicine developed an AI system to help geneticists diagnose dozens of rare genetic disorders caused by mutations in single genes, such as cystic fibrosis, sickle cell disease, and Duchenne muscular dystrophy.

🔬 How It Was Done:

  • The researchers used a public dataset containing details of ~3.5 million genetic variants from thousands of diagnosed cases, as well as genetic data and symptom information from 1,044 patients, to train their AI system.

  • For model training, the team identified dozens of important genetic characteristics that led to rare disorders and classified this information into 6 different groups: disease links, evolutionary history, mutation types, functional impact, biological connections, and inheritance patterns.

  • Afterwards, they developed a random forest machine learning model and optimized its performance by tuning its hyperparameters to improve its ability to accurately identify genetic variants associated with various diseases (a rough sketch of this kind of pipeline follows below).
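For readers curious how a pipeline like this fits together, here is a minimal sketch in Python with scikit-learn. It is not the team’s actual code: the file name, feature columns, and search ranges are illustrative placeholders standing in for the six feature groups described above.

```python
# Minimal sketch of the kind of pipeline described above, NOT the team's
# actual code: the file name, feature columns, and search ranges are
# illustrative placeholders.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Hypothetical table: one row per candidate variant, feature columns drawn
# from the six groups above (disease links, evolutionary history, mutation
# types, functional impact, biological connections, inheritance patterns),
# plus a label marking whether the variant was the diagnosed cause.
df = pd.read_csv("variant_features.csv")  # placeholder path
X = df.drop(columns=["is_causal"])
y = df["is_causal"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)

# Random forests are tuned through hyperparameters (tree count, depth, leaf
# size) rather than learned weights; grid search is one common approach.
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={
        "n_estimators": [200, 500],
        "max_depth": [None, 10, 20],
        "min_samples_leaf": [1, 5],
    },
    scoring="roc_auc",
    cv=5,
)
search.fit(X_train, y_train)
print(search.best_params_, search.score(X_test, y_test))
```

Grid search over tree count, depth and leaf size is one common way to tune a random forest; the team’s exact tuning strategy goes beyond what we can summarize here.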

🧮 Key Results: When the team tested their model on a separate dataset of 871 patient cases it had not seen before, they discovered:

  • Their model included the true cause of the genetic disorder among its top candidates 98% of the time, and in 57% of cases it ranked the correct diagnosis first.

  • The model’s top three factors contributing to a final diagnosis were: how rare a genetic variant is, how confident the model is in its own prediction, and how well the patient’s symptoms match the expected characteristics of a genetic disorder.

  • Their model outperformed other specialized algorithms like AMELIE, Exomiser and PhenIX at identifying the most likely gene responsible for different disorders.

💡 Why This May Matter: Although a correct diagnosis rate of 57% may appear modest, the research team claims it is significant because rare genetic disorders are only diagnosed 30-40% of the time, and a diagnosis can often take years to determine after symptoms first appear. As genome sequencing and AI compute costs continue to decrease, and as AI performance continues to improve, the potential for early intervention and treatment should grow alongside these advancements.

🔎 Elements To Consider: The robustness and effectiveness of models like this are often limited by their training data. While 3.5 million genetic variants sounds like a lot, far more data is needed to produce a dataset representative of diverse patient populations. Also, even with improved diagnostic accuracy, identifying which patients should be screened and tested early remains a significant challenge, even as countries expand newborn screening programs.

AI x Science 🤖

Credit: WrongTog on Unsplash

15 Trillion Tokens Of Training Data For AI Models

Hugging Face has released a massive, high-quality dataset for training large language models (LLMs). This dataset is a valuable resource for developers, and it may have important implications for scaling AI model performance moving forward.

To understand why, recall that Meta announced Llama 3 last week, its latest suite of open-source LLMs. In Meta’s press statement, the company noted the models were pre-trained on over 15 trillion tokens from publicly available sources, effectively the same size as the dataset Hugging Face has now shared. While Llama 3 does include some architectural changes relative to Llama 2, the main reason its 70 billion parameter model jumped from the 41st best LLM in the world to the 6th best in a single generation appears to be the amount of data it was trained on.

This makes sense: LLM performance has been observed to follow smooth power-law scaling as a system grows its training data, parameter count and computing power, which shows up as roughly logarithmic gains in performance. In fact, some research suggests that scaling these properties beyond certain thresholds can lead to emergent new model behaviors in understanding and reasoning. This appears loosely analogous to how expanding the brain’s size and connectivity presumably supports more intelligence, and how learning coding skills, for instance, enhances problem-solving abilities that can then be applied to unrelated tasks and challenges.

Credit: OpenAI, Scaling Laws for Neural Language Models (2020)
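To make the scaling point concrete, here is a back-of-the-envelope sketch in Python using approximate constants reported in the paper credited above. Treat it as a trend-line illustration under that paper’s assumptions, not a prediction for any specific model such as Llama 3.

```python
# Back-of-the-envelope illustration of the power-law scaling described above,
# using approximate constants reported in "Scaling Laws for Neural Language
# Models" (Kaplan et al., 2020). Trend lines only -- not predictions for any
# specific model.

def loss_from_data(n_tokens: float) -> float:
    # L(D) = (D_c / D) ** alpha_D, with model size effectively unconstrained;
    # D_c ~ 5.4e13 tokens and alpha_D ~ 0.095 per the paper.
    return (5.4e13 / n_tokens) ** 0.095

for tokens in (1e12, 7.5e12, 15e12):
    print(f"{tokens:.1e} tokens -> approx. test loss {loss_from_data(tokens):.3f}")

# Doubling data from 7.5T to 15T tokens trims only ~6% off the loss: each
# constant-factor gain costs exponentially more data, which is why a freely
# available 15-trillion-token corpus is such a big deal.
```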

While Big Tech giants like Microsoft, Apple, Amazon, and Google are often presumed to hold an unassailable AI advantage thanks to their massive datasets and near-limitless resources, Meta’s decision to open-source Llama 3 keeps the open-source AI model ecosystem competitive with proprietary models in terms of parameter size and computing power. Similarly, Hugging Face’s release keeps it competitive in terms of training data size, leaving algorithmic innovations and data quality as the main areas where the market competes for an advantage. It will be interesting to see how significant those advantages prove to be over time and whether the open-source ecosystem can continue to bridge the gap. Hugging Face.
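For developers who want to explore the release itself, the dataset appears to be Hugging Face’s FineWeb; assuming so (the repo id below is our assumption, not named in the announcement text above), this minimal sketch streams a few records with the datasets library instead of downloading the full multi-terabyte corpus.

```python
# Minimal sketch of sampling the release with the `datasets` library,
# assuming the dataset referenced above is Hugging Face's FineWeb
# ("HuggingFaceFW/fineweb" is our assumption). Streaming avoids
# downloading the full corpus to disk.
from datasets import load_dataset

fineweb = load_dataset("HuggingFaceFW/fineweb", split="train", streaming=True)

for i, record in enumerate(fineweb):
    # Each record carries raw web text plus metadata such as the source URL.
    print(record["text"][:200].replace("\n", " "))
    if i >= 2:
        break
```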

Our Full AI Index
  • Plants: Scientists at the Salk Institute developed an AI-embedded tool to analyze plant root systems and identify genes that can help plants absorb more carbon from the atmosphere to combat climate change. With this system, the researchers annotated plant traits 1.5x faster than before, trained AI models 10x faster, and predicted plant structure 10x faster, all while maintaining similar or higher levels of accuracy. Salk Institute. Plant Phenomics.

  • OpenCRISPR: Profluent, an AI startup based in Berkeley, unveiled OpenCRISPR-1, reportedly the world’s first open-source, AI-designed gene editor. Their technology is meant to enable scientists to precisely edit the human genome with customizable gene editors that are designed from scratch by their AI system. NYT. bioRxiv. GitHub.

  • On-Device Models: In the same week that Apple researchers shared details about OpenELM, their open-source models designed to run locally on devices, Microsoft researchers released Phi-3, their own suite of small AI models. The corresponding paper from Microsoft emphasizes why high-quality training data is important for model performance, noting that “our previous model trained on this data recipe, phi-2 matched the performance of models 25x larger trained on regular data.” arXiv. Hugging Face.

  • Policy: A new AI Safety and Security Board will advise the United States’ Department of Homeland Security on how best to develop and deploy responsible AI. The board is composed of 22 members drawn from academia, software and hardware companies, government, and the civil rights community. DHS.

Other Observations 📰

Credit: Google DeepMind on Unsplash

A Tree Of Life For Flowering Plants

An international team of 279 scientists from 138 organizations in 27 countries published a paper containing the most up-to-date understanding of the flowering plant tree of life.

This achievement was made possible by sequencing 1.8 billion letters of genetic code from 9,506 different species, including more than 800 that had never had their DNA sequenced before. The sample represents ~60% of known flowering plant genera and contains 15x more data than any comparable study of the flowering plant tree of life.

This study is noteworthy because it provides the most comprehensive understanding yet of the evolutionary relationships among flowering plants, which originated more than 140 million years ago and quickly became one of the most prolific groups of organisms on the planet. As researchers, conservationists, and agricultural scientists refine their ability to classify existing plant species, they may discover new species and develop innovative strategies to safeguard ecosystems from climate-driven biodiversity loss. University of Michigan. Nature.

Our Full Science Index
  • Gene-Edited Pig Kidney Transplant: Remember in March when we shared the story of the man who received the world’s first genetically-edited pig kidney transplant? A woman has now become the second patient to receive such a transplant, in an effort to treat her terminal illness. NYU Langone Health.

  • Solar Power: One of the most noteworthy ongoing developments in renewable energy is the amount of clean energy capacity China is installing. The country added 45.7GW of solar PV in Q1 2024, ~35% higher than its record-setting pace from Q1 last year. If this installation pace continues, 2024 may finally be the year when global CO2 emissions peak and begin a sustained decline. PV Tech.

  • Kidney Cancer: A recent phase III clinical trial found that a drug called pembrolizumab can help people with kidney cancer live longer. The study followed over 900 patients for an average of 57.2 months and found that those who took pembrolizumab had a 38% lower risk of dying from kidney cancer compared to those who took a placebo. The benefits of pembrolizumab became apparent about 15 months into the study and continued to increase over time. NEJM.

  • Immunization: A new report from the WHO estimates that global immunization efforts have saved 154 million lives over the past 50 years, the equivalent of six lives every minute of every year (a quick arithmetic check follows below). Vaccination against 14 diseases, including diphtheria, measles, polio, rubella, and tuberculosis, is credited with saving these lives and has also contributed to a 40% reduction in global infant deaths. UNICEF.
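As a quick sanity check on the “six lives every minute” framing, assuming the report’s roughly 50-year window:

```python
# Back-of-the-envelope check of the WHO report's "six lives every minute"
# figure, assuming a ~50-year window (1974-2024) for the 154 million estimate.
lives_saved = 154_000_000
minutes = 50 * 365.25 * 24 * 60          # minutes in 50 years
print(f"~{lives_saved / minutes:.1f} lives saved per minute")  # ~5.9, about six
```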

Media of the Week 📸 

Some Of The Most Impressive Robot Dexterity Yet

As we’ve showcased many times this year already, humanoid robot development appears to have passed an inflection point. This latest video is from an organization based in China, and their robot displays some of the most impressive speed, skill and dexterity we’ve seen yet.

This Week In The Cosmos 🪐

May 8: A new moon. The best time to stargaze!

Earthshine and the Eta Aquariid meteor shower will also be visible on the 4th and 5th, depending on your location in the world.

Credit: Denis Degioanni on Unsplash

That’s all for this week! Thanks for reading.