#16 | Screening Cancers with AI

+ Llama 3 AI models, detecting multiple sclerosis in blood, and more

Hello fellow curious minds!

Welcome back to another edition of The Aurorean.

In case you missed the details these last couple of weeks, we are hosting our first raffle giveaway at the end of April to celebrate our 3-month anniversary and our growing community of thousands of STEM enthusiasts!

Complete our short survey form no later than 11:59pm EST on April 30th to be eligible; 3 subscribers will each win a $50 Visa gift card.

As a reminder, if you have already completed the survey we have shared these past few weeks, you are already eligible and no further action is needed. If you have not filled it out and want to be eligible for the raffle giveaway, click the link below. This is the last week of eligibility, so make sure to complete the form before the deadline!

One other quick point: last week’s poll received a lot of great feedback and 40% of participants said Individual Interviews and Conversations are their favorite type of podcast format to listen to for STEM topics. Case studies received the 2nd most votes with 30% of the total.

Let’s continue this discussion with one other poll question:

I listen to STEM podcasts or interviews with distinguished experts…

With that said, on to the news. Wondering what STEM discovered last week?

Let’s find out.

Quote of the Week 💬 

Screening Pediatric Cancers w/ AI Improves Treatment Outcomes

“I believe that AI is a critical component of the future of functional precision medicine and driving better outcomes for cancer patients.”

Diana Azzam, Assistant Professor of Environmental Health Sciences @ Florida International University

⌛ The Seven Second Summary: Researchers used an AI-powered precision medicine solution to identify unique therapeutic treatment options for children with relapsed cancer diagnoses.

🔬 How It Was Done:

  • 25 patients with relapsed solid or blood-based cancers underwent drug sensitivity testing and genomic profiling to analyze the genetic characteristics of their cancer cells and identify potential treatment options.

  • To do this, tumor and blood samples were collected from each patient. These samples were then cultivated in a lab to mimic the natural growth conditions in the body for further testing.

  • The tumor and blood samples were exposed to over 120 FDA-approved medications, tested both alone and in various combinations. The team then used AI to rapidly analyze the results and determine which drug combinations and doses would produce the most effective treatment for each patient's unique cancer (a simplified illustration of this kind of ranking appears below).
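
To make the idea concrete, here is a toy sketch, not the study's actual pipeline, of how measured drug responses can be turned into a ranked list of treatment options: each tested drug or combination is scored by how much it reduces the viability of a patient's cultured cancer cells, and candidates are sorted by that score. The drug names, viability values, and scoring rule are all hypothetical.

```python
# Toy sketch only: rank drug options by how strongly they reduce the viability
# of a patient's cultured cancer cells. Drug names, viability measurements,
# and the scoring rule below are hypothetical, not data from the study.

from statistics import mean

# Measured cell viability (fraction of cancer cells still alive) for each
# tested treatment, with a few replicate wells per treatment.
viability_by_treatment = {
    ("drug_A",): [0.62, 0.58, 0.60],
    ("drug_B",): [0.45, 0.49, 0.47],
    ("drug_A", "drug_B"): [0.21, 0.19, 0.24],
    ("drug_C",): [0.80, 0.84, 0.79],
}

def efficacy_score(viabilities):
    """Higher score means more cancer cells were killed (1.0 = none survived)."""
    return 1.0 - mean(viabilities)

ranked = sorted(
    viability_by_treatment.items(),
    key=lambda item: efficacy_score(item[1]),
    reverse=True,
)

for drugs, viabilities in ranked:
    print(" + ".join(drugs), f"-> efficacy {efficacy_score(viabilities):.2f}")
```

The study's actual analysis worked with far richer inputs, such as genomic profiles and full dose-response data, but the ranking intuition is the same.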

🧮 Key Results:

  • In the end, 19 patients completed this process and 6 received a precision medicine treatment recommendation.

  • 83% of the patients who received personalized treatments experienced improved treatment responses. These patients went 8.5 times longer without their cancer progressing compared to their previous treatments.

💡 Why This May Matter: Despite analyzing different tumor types, the AI-assisted treatment recommendations yielded better patient outcomes. This may indicate a versatile AI system that can effectively analyze many solid and blood-based cancers, rather than an AI system that is only useful with a few specific types of cancers.

🔎 Elements To Consider: This was a small study that did not include a randomized control group. Until a larger, more robust follow-up study takes place, the findings should be interpreted with caution.

📚 Learn More: The Conversation. GEN. Nature.

Stat of the Week 📊 

Meta Releases Its Llama 3 AI Models

15,000,000,000,000

⌛ The Seven Second Summary: Meta released Llama 3, the next generation of its open-source Large Language Models (LLMs), with 8 billion and 70 billion parameter versions. An even larger model with over 400 billion parameters is expected to be released soon.

🔬 How It Was Done:

  • Human Evaluation: The team created 1,800 prompts to test their models across 12 use cases, such as coding, reasoning, and answering open-ended questions. Human feedback on these outputs helped the models learn what makes a good response, filter out low-quality answers, and adjust their performance accordingly.

  • Dataset Curation: The model was pre-trained on over 15 trillion tokens from publicly available sources, a dataset 7x larger than the one used to train Llama 2. To ensure high-quality data, the team developed filtering techniques to classify information and screen out redundant or unwanted data (a toy sketch of this idea appears after this list).

  • Training Techniques: Training Llama 3 was about 3x more efficient than training Llama 2. This was made possible by dividing the training data into smaller chunks, parallelizing the model so it can process more data at once, and utilizing more powerful compute hardware.
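
Meta has not published its exact curation code, but the general pattern described above, classifying documents, dropping duplicates, and screening out low-quality text, can be sketched in a few lines. Everything below (the hashing scheme, the heuristic thresholds, the sample documents) is an illustrative assumption, not Llama 3's actual pipeline.

```python
# Hypothetical sketch of pre-training data curation: exact-duplicate removal
# plus simple heuristic quality filters. This is NOT Meta's actual pipeline;
# the thresholds and rules here are illustrative assumptions only.

import hashlib

def is_high_quality(text: str) -> bool:
    """Crude heuristics: drop very short documents and highly repetitive text."""
    words = text.split()
    if len(words) < 20:                      # too short to be useful
        return False
    if len(set(words)) / len(words) < 0.3:   # repetitive / spammy
        return False
    return True

def deduplicate_and_filter(documents):
    seen_hashes, kept = set(), []
    for doc in documents:
        digest = hashlib.sha256(doc.strip().lower().encode()).hexdigest()
        if digest in seen_hashes:            # exact duplicate, skip it
            continue
        seen_hashes.add(digest)
        if is_high_quality(doc):
            kept.append(doc)
    return kept

good_doc = (
    "Researchers describe how proteins fold into three-dimensional shapes, "
    "why misfolding is linked to disease, and which computational methods "
    "can predict a structure from an amino acid sequence alone."
)
corpus = [good_doc, good_doc, "buy now " * 15]   # duplicate + spammy filler
print(len(deduplicate_and_filter(corpus)), "of", len(corpus), "documents kept")
```

Production pipelines typically rely on fuzzy deduplication and learned quality classifiers rather than hand-written rules like these, but the shape of the loop is similar.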

🧮 Key Results: The data Meta released shows Llama 3 outperforming other open models across various evaluation benchmarks. In fact, the 70 billion parameter model has already climbed to the 6th spot on the LMSYS Chatbot Arena leaderboard (hosted on Hugging Face), which suggests it is the most advanced open-source model on the market today.

💡 Why This May Matter: Open-source AI models are important because they allow developers and researchers around the world to access, modify, and test them for critical safety and STEM use cases. Meta is one of the few institutions with the infrastructure and resources to train state-of-the-art models, and its commitment to open source is pushing the field of AI research forward.

🔎 Elements To Consider: Meta has yet to release its 400 billion parameter model, though it is expected to perform similarly to GPT-4 Turbo and Claude 3 Opus, the very best proprietary models. This is another reminder that the open-source community is only a few months behind private corporations, which makes the LLM space one of the most competitive software markets in the world.

📚 Learn More: Meta AI. GitHub.

AI x Science 🤖

A curious mind. Credit: Maximalfocus on Unsplash

Using Reinforcement Learning To Improve LLM Reasoning

Researchers from Tencent AI shared a paper outlining AlphaLLM, a new training and architecture framework designed to improve the reasoning abilities of Large Language Models (LLMs). The framework draws inspiration from Google DeepMind's AlphaGo, the system that learned to master games like Go.

In 2016, AlphaGo was introduced and became the first AI model to defeat the world's best Go players. The model achieved superhuman performance by incorporating a combination of tree search algorithms, reinforcement learning and other machine learning techniques to evaluate a Go board and predict the next best move to win the game. Through self-play and game simulations, AlphaGo learned to predict moves by receiving rewards for winning and penalties for losing, which allowed the system to refine its strategy and adapt to new situations.

Similarly, AlphaLLM uses a tree search to explore potential responses and evaluate their outcomes. However, AlphaLLM innovates on this concept by introducing a system of rewards and penalties over the spectrum of possible language responses it can provide, which allows the model to train itself through similar self-play and reinforcement learning techniques. While AlphaGo operates within the well-defined rules of a game, AlphaLLM tackles the challenge of reasoning through open-ended language tasks. This requires the model to make subjective evaluations about which responses it should reward and penalize. To assist with this evaluation process, the researchers designed 3 other models to critique the LLM's outputs and provide different types of feedback, so it can learn from its mistakes over time.

For example, if you ask the AI for instructions to bake a cake for your spouse, this reward and penalty system might assess the model's ability to efficiently gather and consider crucial information for its response: your budget, cooking skills, available appliances, dietary restrictions, the measurements of each ingredient, your spouse's favorite flavors, the clarity and brevity of the instructions, and so on. The first several responses from AlphaLLM may be gibberish or low quality. Over time, however, the reward and penalty system teaches AlphaLLM to search for an optimal chain of reasoning and provide a concise, comprehensive and well-thought-out response. This type of training technique is a reminder of how seemingly simple language questions contain a wide array of subtle complexities for machines.
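
To ground the intuition, here is a minimal toy sketch of the general search-plus-critic loop, written as a simplified beam-style search rather than the paper's Monte Carlo tree search: a stand-in generator proposes possible next steps, a stand-in critic scores each partial response, and only the highest-scoring paths are kept and extended. The task, generator, and critic below are hypothetical placeholders, not AlphaLLM's actual components.

```python
# Toy sketch of search-guided generation in the spirit of AlphaLLM:
# expand candidate reasoning paths step by step, score each with a critic,
# and keep the best-scoring paths. The "generator" and "critic" here are
# hypothetical stand-ins, not the paper's actual models.

import random

random.seed(0)

TARGET = 24  # toy task: build a sequence of numbers whose sum approaches 24

def generate_next_steps(path, k=3):
    """Stand-in for an LLM proposing k possible next reasoning steps."""
    return [path + [random.randint(1, 10)] for _ in range(k)]

def critic_score(path):
    """Stand-in value model: reward paths whose running total nears TARGET."""
    total = sum(path)
    if total > TARGET:
        return -1.0                     # overshot the target: penalize
    return 1.0 - (TARGET - total) / TARGET

def tree_search(width=3, depth=6):
    frontier = [[]]                     # start from an empty reasoning path
    for _ in range(depth):
        candidates = []
        for path in frontier:
            candidates.extend(generate_next_steps(path))
        # Keep only the highest-scoring partial paths and expand them further.
        candidates.sort(key=critic_score, reverse=True)
        frontier = candidates[:width]
    return max(frontier, key=critic_score)

best = tree_search()
print("best path:", best, "total:", sum(best), "score:", round(critic_score(best), 2))
```

In AlphaLLM, the proposals come from the LLM itself and the scores come from the learned critic models described above, so the same loop both searches for a better answer and generates feedback the model can learn from.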

When tested on mathematical reasoning, the experimental results from this new design framework were encouraging. Notably, the model's accuracy significantly improved, from 57.8% to 92.0% on the GSM8K dataset and from 20.7% to 51.0% on the MATH dataset.

Of course, language tasks in everyday activities are far more complex than board games. There are about 10^170 possible board configurations in Go, and the possibilities for language use are virtually limitless. Moreover, although this framework excelled in mathematical reasoning tasks, this success is partly due to the objective nature of mathematical answers, which can be clearly labeled as correct or incorrect. In contrast, many language tasks involve subjective judgments of "correct" and "incorrect", which makes AlphaLLM's path to superhuman reasoning far more uncertain and challenging than AlphaGo's journey to superhuman Go skills.

Nevertheless, AI research papers are beginning to demonstrate how reasoning is not a linear process. Rather, it is a form of imagination and exploration that involves iterative searches and recursive feedback loops. Perhaps the right search mechanism and the correct reward system to reinforce proper reasoning skills on a step-by-step basis can actually teach machines to become super intelligent. Perhaps not. Either way, we may find out in the next few years as researchers continue to push the boundaries of AI. arXiv.

Our Full AI Index
  • Tumor Origins: Scientists from Tianjin Medical University in China developed an AI tool to identify the origins of metastatic cancer cells with an 83% accuracy rate, outperforming human pathologists and potentially improving diagnosis and treatment of late-stage cancer. The AI model was trained on 30,000 images of cancer cells and provides a list of possible cancer sources based on its data assessment. This may help medical experts spend less time and resources on tests, so more time can be spent on finding effective treatments and improving patient care. Nature Blog. Nature Paper.

  • Drug Design: Researchers from the University of Cambridge used an AI system to accelerate the search for Parkinson's disease treatments. They used machine learning techniques to identify 5 highly potent compounds to block the clumping of alpha-synuclein, a protein associated with the disease. The researchers report this AI-based approach sped up the initial screening process by 10x and reduced costs by 1000x. University of Cambridge. Nature.

  • Open Source Medical Models: Researchers from Open Life Science AI, the University of Edinburgh and Hugging Face announced the creation of the Open Medical-LLM Leaderboard. The goal of this leaderboard is to standardize how LLMs are evaluated and compare model performance on a variety of medical tasks and datasets. Hugging Face.

  • Trends Report: Stanford researchers released the 2024 edition of their AI Index Report. The report is more robust than ever, with new assessments and commentary on science and medicine, responsible AI and many other pillars. AI Index.

Other Observations 📰

Red blood cells. Credit: ANIRUDH on Unsplash

Multiple Sclerosis Signs Appear In Blood Years Before Symptoms

Researchers from the University of California, San Francisco, the University of Maryland, and Northwestern University have discovered a link between infections and autoimmune diseases like Multiple Sclerosis (MS). The team analyzed blood samples from 250 MS patients and used a technique called PhIP-Seq to detect autoantibodies directed against human proteins. They found that 10% of patients had a distinct signature of autoantibodies resembling proteins from the Epstein-Barr Virus (EBV), a common virus that infects over 90% of the global population and has been linked to MS in previous studies.

Crucially, this autoantibody signature was 100% predictive of an MS diagnosis, on average, 5-7 years before symptoms emerged. Blood biomarkers are invaluable for early detection and treatment, and a new 100% predictive marker means a definitive molecular sign for MS development in some people may have been discovered. Although 90% of patients in this study did not exhibit this immune response, leaving questions about how others develop MS, these findings nevertheless offer new research avenues and potential therapeutic interventions to slow or prevent disease progression. UCSF. Nature.
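
As a quick illustration of what "100% predictive" does and does not mean, the toy numbers below (entirely hypothetical, not the study's data) compute a marker's positive predictive value and sensitivity: a marker can flag only people who go on to develop MS while still missing most future cases.

```python
# Hypothetical counts, for illustration only (not the study's data):
# a marker that never fires in people who stay healthy, but only fires
# in a minority of people who later develop MS.
marker_pos_developed_ms = 25      # true positives
marker_pos_stayed_healthy = 0     # false positives
marker_neg_developed_ms = 225     # false negatives (missed cases)

ppv = marker_pos_developed_ms / (marker_pos_developed_ms + marker_pos_stayed_healthy)
sensitivity = marker_pos_developed_ms / (marker_pos_developed_ms + marker_neg_developed_ms)

print(f"positive predictive value: {ppv:.0%}")   # 100% -> a positive result is definitive
print(f"sensitivity: {sensitivity:.0%}")         # 10%  -> most future cases are missed
```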

Our Full Science Index
  • Technological Change: Rwanda has made remarkable progress in expanding electricity access in just 15 years. In 2009, only 6% of households had electricity. As of March 2024, 75% of households have electricity, along with 100% of healthcare centers and 84% of schools and small businesses. Kenya also deserves praise in this domain, increasing household access to electricity from 5% in 1995 to 71% as of 2020. World Bank.

  • Environmental Protection: The number of oil spills from tanker ships transporting oil has decreased by over 90% since the 1970s, from 70-100 spills per year to just a handful. Furthermore, the amount of oil spilled has also dramatically decreased, from over 300,000 tons per year in the 1970s to less than 10,000 tons per year in the last decade. Our World In Data.

  • Public Health: More children than ever before are surviving into their teenage years, with the global under-5 mortality rate declining by 51% since 2000. Similarly, maternal deaths have fallen by ~50% in the last 35 years. UNICEF.

  • Clean Energy: For the first time since the mid-20th century, over 95% of the new electric-generating capacity planned for 2024 in the United States is zero-carbon, meaning almost all new energy generation installed this year will not add CO2 to the atmosphere. This exceeds the global rate, where clean electricity accounted for around 80% of new capacity additions worldwide in 2023. The White House.

  • Space: A supernova in 2023 provided scientists with an opportunity to study how these explosions accelerate cosmic rays, but surprisingly, NASA's Fermi Gamma-ray Space Telescope did not detect any high-energy gamma-ray light. The findings suggest that supernovae may not be as efficient at accelerating cosmic rays as previously thought, and scientists are now exploring alternative scenarios and hypotheses to explain the mystery of cosmic ray origins. NASA. Astronomy & Astrophysics.

Media of the Week 📸 

A Jumping Robot For Low Gravity Exploration

Researchers at ETH Zurich designed a robot called SpaceHopper to move in low-gravity environments. The robot takes its name from the hopping motion it uses to get around, and it is intended to help future space missions explore asteroids. SpaceHopper.

An Autonomous Robot Repairs Another Robot

In January, we featured a demo video of the ALOHA robotics project showcasing its ability to handle various kitchen tasks, such as cooking eggs and cleaning dishes. Last week, the Stanford research team released a new video to highlight their robots' improved dexterity and latest advancements, including demonstrations of tasks like knotting shoelaces, hanging shirts, and even repairing other robots. It will be interesting to see what else the team will train their machines to do, and whether the robots can learn these tasks with fewer demonstrations and greater adaptability over time. arXiv.

A Map Of The Milky Way’s Dust Clouds & Magnetic Fields

A map of the central region of the Milky Way. Credit: Villanova University/Paré, Karpovich, Chuss (PI)

Scientists from Villanova University assembled a map of the magnetic fields at the center of the Milky Way. The map covers 500 light-years of space in our galaxy and is rendered in infrared wavelengths, with colors representing different temperatures of interstellar dust clouds. Cool, dense dust is green, whereas warmer dust is pink. The lines threaded throughout the image show the direction of the magnetic fields within the clouds. Does the texture of this map remind anyone else of the sky in Van Gogh's Starry Night? NYT.

This Week In The Cosmos 🪐

No major events this upcoming week. How disappointing...

Credit: Francesco Ungaro on Unsplash

That’s all for this week! Thanks for reading.