A data-driven model of nucleosynthesis with chemical tagging in a lower-dimensional latent space

Andrew Casey, John Lattanzio, Aldeida Aleti, David Dowe, Joss Bland-Hawthorn, Sven Buder, Geraint Lewis, Sarah L Martell, Thomas Nordlander, Jeffrey D. Simpson, Sanjib Sharma, Daniel Zucker

Research output: Other contribution > Other

Abstract

Chemical tagging seeks to identify unique star formation sites from present-day stellar abundances. Previous techniques have treated each abundance dimension as being statistically independent, despite theoretical expectations that many elements can be produced by more than one nucleosynthetic process. In this work we introduce a data-driven model of nucleosynthesis where a set of latent factors (e.g., nucleosynthetic yields) contribute to all stars with different scores, and clustering (e.g., chemical tagging) is modelled by a mixture of multivariate Gaussians in a lower-dimensional latent space. We use an exact method to simultaneously estimate the factor scores for each star, the partial assignment of each star to each cluster, and the latent factors common to all stars, even in the presence of missing data entries. We use an information-theoretic Bayesian principle to estimate the number of latent factors and clusters. Using the second GALAH data release we find that six latent factors are preferred to explain N = 2,566 stars with 17 chemical abundances. We identify the rapid and slow neutron-capture processes, as well as latent factors consistent with Fe-peak and α-element production, and another where K and Zn dominate. When we consider N ~ 160,000 stars with missing abundances we find another 7 factors, as well as 16 components in latent space. Despite these components showing separation in chemistry that is explained through different yield contributions, none show significant structure in their positions or motions. We argue that more data, and joint priors on cluster membership that are constrained by dynamical models, are necessary to realise chemical tagging at a galactic scale. We release software that allows for model parameters to be optimised in seconds given a fixed number of latent factors, components, and 10^7 abundance measurements.
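
The generative structure described in the abstract (latent factors shared by all stars, per-star factor scores, and a Gaussian mixture in the latent space) can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the released software: the function name `simulate_abundances`, every parameter value, the unit-covariance mixture components, and the independent per-abundance noise are hypothetical choices made for illustration.

```python
import numpy as np


def simulate_abundances(n_stars=500, n_abundances=17, n_factors=6,
                        n_components=3, seed=0):
    """Draw mock abundances from a latent-factor model with clustered scores.

    All names and numeric choices here are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)

    # Latent factors (loads) common to all stars: one column per factor.
    loads = rng.normal(size=(n_abundances, n_factors))
    mean = rng.normal(size=n_abundances)

    # Mixture of Gaussians in the lower-dimensional latent space.
    # Assumption: each component has identity covariance.
    weights = rng.dirichlet(np.ones(n_components))
    centres = rng.normal(scale=3.0, size=(n_components, n_factors))
    labels = rng.choice(n_components, size=n_stars, p=weights)

    # Factor scores: each star is drawn around its component centre.
    scores = centres[labels] + rng.normal(size=(n_stars, n_factors))

    # Assumption: independent Gaussian noise per abundance dimension.
    noise_sd = rng.uniform(0.05, 0.2, size=n_abundances)
    noise = rng.normal(size=(n_stars, n_abundances)) * noise_sd

    # Observed abundances = mean + scores projected through the loads + noise.
    data = mean + scores @ loads.T + noise
    return data, scores, labels, loads


data, scores, labels, loads = simulate_abundances()
print(data.shape, scores.shape, loads.shape)
```

Inference runs in the opposite direction: given only `data` (possibly with missing entries), the model estimates the loads, the scores, and the partial assignment of each star to each component. The simulation does not attempt that step.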
Original language: English
Media of output: arXiv.org
Number of pages: 22
Publication status: Accepted/In press - Oct 2019

Publication series

Name: The Astrophysical Journal
Publisher: American Astronomical Society
ISSN (Print): 0004-637X

Cite this

Casey, Andrew ; Lattanzio, John ; Aleti, Aldeida ; Dowe, David ; Bland-Hawthorn, Joss ; Buder, Sven ; Lewis, Geraint ; Martell, Sarah L ; Nordlander, Thomas ; Simpson, Jeffrey D. ; Sharma, Sanjib ; Zucker, Daniel. / A data-driven model of nucleosynthesis with chemical tagging in a lower-dimensional latent space. 2019. 22 p. (The Astrophysical Journal).
@misc{d2f6b4b8aede42aea1f48f15ecc6e15e,
title = "A data-driven model of nucleosynthesis with chemical tagging in a lower-dimensional latent space",
abstract = "Chemical tagging seeks to identify unique star formation sites from present-day stellar abundances. Previous techniques have treated each abundance dimension as being statistically independent, despite theoretical expectations that many elements can be produced by more than one nucleosynthetic process. In this work we introduce a data-driven model of nucleosynthesis where a set of latent factors (e.g., nucleosynthetic yields) contribute to all stars with different scores, and clustering (e.g., chemical tagging) is modelled by a mixture of multivariate Gaussians in a lower-dimensional latent space. We use an exact method to simultaneously estimate the factor scores for each star, the partial assignment of each star to each cluster, and the latent factors common to all stars, even in the presence of missing data entries. We use an information-theoretic Bayesian principle to estimate the number of latent factors and clusters. Using the second GALAH data release we find that six latent factors are preferred to explain N = 2,566 stars with 17 chemical abundances. We identify the rapid and slow neutron-capture processes, as well as latent factors consistent with Fe-peak and $\alpha$-element production, and another where K and Zn dominate. When we consider N $\sim$ 160,000 stars with missing abundances we find another 7 factors, as well as 16 components in latent space. Despite these components showing separation in chemistry that is explained through different yield contributions, none show significant structure in their positions or motions. We argue that more data, and joint priors on cluster membership that are constrained by dynamical models, are necessary to realise chemical tagging at a galactic scale. We release software that allows for model parameters to be optimised in seconds given a fixed number of latent factors, components, and $10^7$ abundance measurements.",
author = "Andrew Casey and John Lattanzio and Aldeida Aleti and David Dowe and Joss Bland-Hawthorn and Sven Buder and Geraint Lewis and Martell, {Sarah L} and Thomas Nordlander and Simpson, {Jeffrey D.} and Sanjib Sharma and Daniel Zucker",
year = "2019",
month = "10",
language = "English",
series = "The Astrophysical Journal",
publisher = "American Astronomical Society",
type = "Other",

}

Casey, A, Lattanzio, J, Aleti, A, Dowe, D, Bland-Hawthorn, J, Buder, S, Lewis, G, Martell, SL, Nordlander, T, Simpson, JD, Sharma, S & Zucker, D 2019, A data-driven model of nucleosynthesis with chemical tagging in a lower-dimensional latent space.

A data-driven model of nucleosynthesis with chemical tagging in a lower-dimensional latent space. / Casey, Andrew; Lattanzio, John; Aleti, Aldeida; Dowe, David; Bland-Hawthorn, Joss; Buder, Sven; Lewis, Geraint; Martell, Sarah L; Nordlander, Thomas; Simpson, Jeffrey D.; Sharma, Sanjib; Zucker, Daniel.

22 p. 2019. (The Astrophysical Journal).

TY - GEN

T1 - A data-driven model of nucleosynthesis with chemical tagging in a lower-dimensional latent space

AU - Casey, Andrew

AU - Lattanzio, John

AU - Aleti, Aldeida

AU - Dowe, David

AU - Bland-Hawthorn, Joss

AU - Buder, Sven

AU - Lewis, Geraint

AU - Martell, Sarah L

AU - Nordlander, Thomas

AU - Simpson, Jeffrey D.

AU - Sharma, Sanjib

AU - Zucker, Daniel

PY - 2019/10

Y1 - 2019/10

N2 - Chemical tagging seeks to identify unique star formation sites from present-day stellar abundances. Previous techniques have treated each abundance dimension as being statistically independent, despite theoretical expectations that many elements can be produced by more than one nucleosynthetic process. In this work we introduce a data-driven model of nucleosynthesis where a set of latent factors (e.g., nucleosynthetic yields) contribute to all stars with different scores, and clustering (e.g., chemical tagging) is modelled by a mixture of multivariate Gaussians in a lower-dimensional latent space. We use an exact method to simultaneously estimate the factor scores for each star, the partial assignment of each star to each cluster, and the latent factors common to all stars, even in the presence of missing data entries. We use an information-theoretic Bayesian principle to estimate the number of latent factors and clusters. Using the second GALAH data release we find that six latent factors are preferred to explain N = 2,566 stars with 17 chemical abundances. We identify the rapid and slow neutron-capture processes, as well as latent factors consistent with Fe-peak and α-element production, and another where K and Zn dominate. When we consider N ~ 160,000 stars with missing abundances we find another 7 factors, as well as 16 components in latent space. Despite these components showing separation in chemistry that is explained through different yield contributions, none show significant structure in their positions or motions. We argue that more data, and joint priors on cluster membership that are constrained by dynamical models, are necessary to realise chemical tagging at a galactic scale. We release software that allows for model parameters to be optimised in seconds given a fixed number of latent factors, components, and 10^7 abundance measurements.

M3 - Other contribution

T3 - The Astrophysical Journal

ER -