A Data-driven Model of Nucleosynthesis with Chemical Tagging in a Lower-dimensional Latent Space

Andrew R. Casey, John C. Lattanzio, Aldeida Aleti, David L. Dowe, Joss Bland-Hawthorn, Sven Buder, Geraint F. Lewis, Sarah L. Martell, Thomas Nordlander, Jeffrey D. Simpson, Sanjib Sharma, Daniel B. Zucker

Research output: Contribution to journal › Article › Research › peer-review

Abstract

Chemical tagging seeks to identify unique star formation sites from present-day stellar abundances. Previous techniques have treated each abundance dimension as being statistically independent, despite theoretical expectations that many elements can be produced by more than one nucleosynthetic process. In this work, we introduce a data-driven model of nucleosynthesis, where a set of latent factors (e.g., nucleosynthetic yields) contribute to all stars with different scores and clustering (e.g., chemical tagging) is modeled by a mixture of multivariate Gaussians in a lower-dimensional latent space. We use an exact method to simultaneously estimate the factor scores for each star, the partial assignment of each star to each cluster, and the latent factors common to all stars, even in the presence of missing data entries. We use an information-theoretic Bayesian principle to estimate the number of latent factors and clusters. Using the second Galah data release, we find that six latent factors are preferred to explain N = 2566 stars with 17 chemical abundances. We identify the rapid- and slow neutron-capture processes, as well as latent factors consistent with Fe-peak and α-element production, and another where K and Zn dominate. When we consider N ∼ 160,000 stars with missing abundances, we find another seven factors, as well as 16 components in latent space. Despite these components showing separation in chemistry, which is explained through different yield contributions, none show significant structure in their positions or motions. We argue that more data and joint priors on cluster membership that are constrained by dynamical models are necessary to realize chemical tagging at a galactic-scale. We release accompanying software that scales well with the available data, allowing for the model's parameters to be optimized in seconds given a fixed number of latent factors, components, and ∼10^7 abundance measurements.
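The generative idea in the abstract (latent factors shared by all stars, per-star factor scores, and clusters described by Gaussians in the latent space) can be illustrated with a toy simulation. This is only a minimal sketch under stated assumptions, not the authors' released software: the function name `simulate_abundances`, the equal mixing weights, the isotropic unit scatter around each cluster centre, and the single noise level are all hypothetical simplifications, and the paper's inference scheme and missing-data handling are not reproduced.

```python
import random


def simulate_abundances(n_stars, n_elements, n_factors, n_clusters,
                        noise=0.05, seed=0):
    """Draw synthetic abundances from a toy latent-factor mixture model."""
    rng = random.Random(seed)
    # Latent factors common to all stars: one loading per (factor, element).
    loads = [[rng.gauss(0.0, 1.0) for _ in range(n_elements)]
             for _ in range(n_factors)]
    # Cluster centres live in the lower-dimensional latent space.
    centres = [[rng.gauss(0.0, 2.0) for _ in range(n_factors)]
               for _ in range(n_clusters)]
    labels, scores, data = [], [], []
    for _ in range(n_stars):
        # Assumption: every cluster is equally likely.
        k = rng.randrange(n_clusters)
        # Factor scores scatter around the centre of the star's cluster
        # (assumption: unit, isotropic scatter).
        s = [rng.gauss(c, 1.0) for c in centres[k]]
        # Observed abundance = scores projected through the factors + noise.
        x = [sum(s[j] * loads[j][d] for j in range(n_factors))
             + rng.gauss(0.0, noise)
             for d in range(n_elements)]
        labels.append(k)
        scores.append(s)
        data.append(x)
    return loads, centres, labels, scores, data


if __name__ == "__main__":
    # Sizes loosely echo the abstract (17 abundances, six factors);
    # the number of stars and clusters here is arbitrary.
    loads, centres, labels, scores, data = simulate_abundances(
        n_stars=100, n_elements=17, n_factors=6, n_clusters=3)
    print(len(data), "stars x", len(data[0]), "abundances")
```

Fitting such a model means recovering `loads`, `scores`, and the cluster assignments from `data` alone, which is the estimation problem the abstract describes.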

Original language: English
Article number: 73
Number of pages: 18
Journal: Astrophysical Journal
Volume: 887
Issue number: 1
DOI: 10.3847/1538-4357/ab4fea
Publication status: Published - 11 Dec 2019

Cite this

Casey, Andrew R. ; Lattanzio, John C. ; Aleti, Aldeida ; Dowe, David L. ; Bland-Hawthorn, Joss ; Buder, Sven ; Lewis, Geraint F. ; Martell, Sarah L. ; Nordlander, Thomas ; Simpson, Jeffrey D. ; Sharma, Sanjib ; Zucker, Daniel B. / A Data-driven Model of Nucleosynthesis with Chemical Tagging in a Lower-dimensional Latent Space. In: Astrophysical Journal. 2019 ; Vol. 887, No. 1.
@article{5314a16bed4244d8874e4b0697c304b2,
title = "A Data-driven Model of Nucleosynthesis with Chemical Tagging in a Lower-dimensional Latent Space",
abstract = "Chemical tagging seeks to identify unique star formation sites from present-day stellar abundances. Previous techniques have treated each abundance dimension as being statistically independent, despite theoretical expectations that many elements can be produced by more than one nucleosynthetic process. In this work, we introduce a data-driven model of nucleosynthesis, where a set of latent factors (e.g., nucleosynthetic yields) contribute to all stars with different scores and clustering (e.g., chemical tagging) is modeled by a mixture of multivariate Gaussians in a lower-dimensional latent space. We use an exact method to simultaneously estimate the factor scores for each star, the partial assignment of each star to each cluster, and the latent factors common to all stars, even in the presence of missing data entries. We use an information-theoretic Bayesian principle to estimate the number of latent factors and clusters. Using the second Galah data release, we find that six latent factors are preferred to explain N = 2566 stars with 17 chemical abundances. We identify the rapid- and slow neutron-capture processes, as well as latent factors consistent with Fe-peak and α-element production, and another where K and Zn dominate. When we consider N ∼ 160,000 stars with missing abundances, we find another seven factors, as well as 16 components in latent space. Despite these components showing separation in chemistry, which is explained through different yield contributions, none show significant structure in their positions or motions. We argue that more data and joint priors on cluster membership that are constrained by dynamical models are necessary to realize chemical tagging at a galactic-scale. We release accompanying software that scales well with the available data, allowing for the model's parameters to be optimized in seconds given a fixed number of latent factors, components, and ∼$10^{7}$ abundance measurements.",
author = "Casey, {Andrew R.} and Lattanzio, {John C.} and Aldeida Aleti and Dowe, {David L.} and Joss Bland-Hawthorn and Sven Buder and Lewis, {Geraint F.} and Martell, {Sarah L.} and Thomas Nordlander and Simpson, {Jeffrey D.} and Sanjib Sharma and Zucker, {Daniel B.}",
year = "2019",
month = "12",
day = "11",
doi = "10.3847/1538-4357/ab4fea",
language = "English",
volume = "887",
journal = "The Astrophysical Journal",
issn = "0004-637X",
publisher = "American Astronomical Society",
number = "1",
pages = "73",
}

Casey, AR, Lattanzio, JC, Aleti, A, Dowe, DL, Bland-Hawthorn, J, Buder, S, Lewis, GF, Martell, SL, Nordlander, T, Simpson, JD, Sharma, S & Zucker, DB 2019, 'A Data-driven Model of Nucleosynthesis with Chemical Tagging in a Lower-dimensional Latent Space', Astrophysical Journal, vol. 887, no. 1, 73. https://doi.org/10.3847/1538-4357/ab4fea

A Data-driven Model of Nucleosynthesis with Chemical Tagging in a Lower-dimensional Latent Space. / Casey, Andrew R.; Lattanzio, John C.; Aleti, Aldeida; Dowe, David L.; Bland-Hawthorn, Joss; Buder, Sven; Lewis, Geraint F.; Martell, Sarah L.; Nordlander, Thomas; Simpson, Jeffrey D.; Sharma, Sanjib; Zucker, Daniel B.

In: Astrophysical Journal, Vol. 887, No. 1, 73, 11.12.2019.

TY - JOUR

T1 - A Data-driven Model of Nucleosynthesis with Chemical Tagging in a Lower-dimensional Latent Space

AU - Casey, Andrew R.

AU - Lattanzio, John C.

AU - Aleti, Aldeida

AU - Dowe, David L.

AU - Bland-Hawthorn, Joss

AU - Buder, Sven

AU - Lewis, Geraint F.

AU - Martell, Sarah L.

AU - Nordlander, Thomas

AU - Simpson, Jeffrey D.

AU - Sharma, Sanjib

AU - Zucker, Daniel B.

PY - 2019/12/11

Y1 - 2019/12/11

N2 - Chemical tagging seeks to identify unique star formation sites from present-day stellar abundances. Previous techniques have treated each abundance dimension as being statistically independent, despite theoretical expectations that many elements can be produced by more than one nucleosynthetic process. In this work, we introduce a data-driven model of nucleosynthesis, where a set of latent factors (e.g., nucleosynthetic yields) contribute to all stars with different scores and clustering (e.g., chemical tagging) is modeled by a mixture of multivariate Gaussians in a lower-dimensional latent space. We use an exact method to simultaneously estimate the factor scores for each star, the partial assignment of each star to each cluster, and the latent factors common to all stars, even in the presence of missing data entries. We use an information-theoretic Bayesian principle to estimate the number of latent factors and clusters. Using the second Galah data release, we find that six latent factors are preferred to explain N = 2566 stars with 17 chemical abundances. We identify the rapid- and slow neutron-capture processes, as well as latent factors consistent with Fe-peak and α-element production, and another where K and Zn dominate. When we consider N ∼ 160,000 stars with missing abundances, we find another seven factors, as well as 16 components in latent space. Despite these components showing separation in chemistry, which is explained through different yield contributions, none show significant structure in their positions or motions. We argue that more data and joint priors on cluster membership that are constrained by dynamical models are necessary to realize chemical tagging at a galactic-scale. We release accompanying software that scales well with the available data, allowing for the model's parameters to be optimized in seconds given a fixed number of latent factors, components, and ∼10^7 abundance measurements.

AB - Chemical tagging seeks to identify unique star formation sites from present-day stellar abundances. Previous techniques have treated each abundance dimension as being statistically independent, despite theoretical expectations that many elements can be produced by more than one nucleosynthetic process. In this work, we introduce a data-driven model of nucleosynthesis, where a set of latent factors (e.g., nucleosynthetic yields) contribute to all stars with different scores and clustering (e.g., chemical tagging) is modeled by a mixture of multivariate Gaussians in a lower-dimensional latent space. We use an exact method to simultaneously estimate the factor scores for each star, the partial assignment of each star to each cluster, and the latent factors common to all stars, even in the presence of missing data entries. We use an information-theoretic Bayesian principle to estimate the number of latent factors and clusters. Using the second Galah data release, we find that six latent factors are preferred to explain N = 2566 stars with 17 chemical abundances. We identify the rapid- and slow neutron-capture processes, as well as latent factors consistent with Fe-peak and α-element production, and another where K and Zn dominate. When we consider N ∼ 160,000 stars with missing abundances, we find another seven factors, as well as 16 components in latent space. Despite these components showing separation in chemistry, which is explained through different yield contributions, none show significant structure in their positions or motions. We argue that more data and joint priors on cluster membership that are constrained by dynamical models are necessary to realize chemical tagging at a galactic-scale. We release accompanying software that scales well with the available data, allowing for the model's parameters to be optimized in seconds given a fixed number of latent factors, components, and ∼10^7 abundance measurements.

UR - http://www.scopus.com/inward/record.url?scp=85077314033&partnerID=8YFLogxK

U2 - 10.3847/1538-4357/ab4fea

DO - 10.3847/1538-4357/ab4fea

M3 - Article

AN - SCOPUS:85077314033

VL - 887

JO - The Astrophysical Journal

JF - The Astrophysical Journal

SN - 0004-637X

IS - 1

M1 - 73

ER -