Introduction

The exoplanets package is a minimal interface to the NASA Exoplanet Archive API. The goal is to make it as easy as possible to interact with the API from R. Some key features are:

  1. exo: the main function for pulling data from the API via table names
  2. exo_kelt, exo_kepler, and exo_wasp: the functions for pulling time series data
  3. exo_summary: a function for summarising the database

Data Access

There are 3 ways to access data from the API:

  1. exo: provide a table name (see exo_tables for all available tables). Note that this function doesn’t support time series tables.
  2. exo_kelt, exo_kepler, and exo_wasp: no table name is needed because each function is hardcoded for a specific table; just provide the appropriate arguments.
  3. exo_raw: provide the raw URL.

By table name

By default, exo will pull from the exoplanets table.

library(exoplanets)

df_exo <- exo()
#> * <https://exoplanetarchive.ipac.caltech.edu/cgi-bin/nstedAPI/nph-nstedAPI?table=exoplanets>
df_exo
#> # A tibble: 4,352 x 82
#>    pl_hostname pl_letter pl_name  pl_discmethod pl_controvflag pl_pnum pl_orbper
#>    <chr>       <chr>     <chr>    <chr>                  <dbl>   <dbl>     <dbl>
#>  1 Kepler-128  b         Kepler-… Transit                    0       2     15.1 
#>  2 Kepler-128  c         Kepler-… Transit                    0       2     22.8 
#>  3 Kepler-129  b         Kepler-… Transit                    0       2     15.8 
#>  4 Kepler-129  c         Kepler-… Transit                    0       2     82.2 
#>  5 Kepler-130  b         Kepler-… Transit                    0       3      8.46
#>  6 Kepler-130  c         Kepler-… Transit                    0       3     27.5 
#>  7 Kepler-130  d         Kepler-… Transit                    0       3     87.5 
#>  8 Kepler-131  b         Kepler-… Transit                    0       2     16.1 
#>  9 Kepler-131  c         Kepler-… Transit                    0       2     25.5 
#> 10 Kepler-132  b         Kepler-… Transit                    0       4      6.18
#> # … with 4,342 more rows, and 75 more variables: pl_orbpererr1 <dbl>,
#> #   pl_orbpererr2 <dbl>, pl_orbperlim <dbl>, pl_orbpern <dbl>,
#> #   pl_orbsmax <dbl>, pl_orbsmaxerr1 <dbl>, pl_orbsmaxerr2 <dbl>,
#> #   pl_orbsmaxlim <dbl>, pl_orbsmaxn <dbl>, pl_orbeccen <dbl>,
#> #   pl_orbeccenerr1 <dbl>, pl_orbeccenerr2 <dbl>, pl_orbeccenlim <dbl>,
#> #   pl_orbeccenn <dbl>, pl_orbincl <dbl>, pl_orbinclerr1 <dbl>,
#> #   pl_orbinclerr2 <dbl>, pl_orbincllim <dbl>, pl_orbincln <dbl>,
#> #   pl_bmassj <dbl>, pl_bmassjerr1 <dbl>, pl_bmassjerr2 <dbl>,
#> #   pl_bmassjlim <dbl>, pl_bmassn <dbl>, pl_bmassprov <chr>, pl_radj <dbl>,
#> #   pl_radjerr1 <dbl>, pl_radjerr2 <dbl>, pl_radjlim <dbl>, pl_radn <dbl>,
#> #   pl_dens <dbl>, pl_denserr1 <dbl>, pl_denserr2 <dbl>, pl_denslim <dbl>,
#> #   pl_densn <dbl>, pl_ttvflag <dbl>, pl_kepflag <dbl>, pl_k2flag <dbl>,
#> #   ra_str <chr>, dec_str <chr>, ra <dbl>, st_raerr <dbl>, dec <dbl>,
#> #   st_decerr <dbl>, st_posn <dbl>, st_dist <dbl>, st_disterr1 <dbl>,
#> #   st_disterr2 <dbl>, st_distlim <dbl>, st_distn <dbl>, st_optmag <dbl>,
#> #   st_optmagerr <dbl>, st_optmaglim <dbl>, st_optband <chr>, gaia_gmag <dbl>,
#> #   gaia_gmagerr <lgl>, gaia_gmaglim <dbl>, st_teff <dbl>, st_tefferr1 <dbl>,
#> #   st_tefferr2 <dbl>, st_tefflim <dbl>, st_teffn <dbl>, st_mass <dbl>,
#> #   st_masserr1 <dbl>, st_masserr2 <dbl>, st_masslim <dbl>, st_massn <dbl>,
#> #   st_rad <dbl>, st_raderr1 <dbl>, st_raderr2 <dbl>, st_radlim <dbl>,
#> #   st_radn <dbl>, pl_nnotes <dbl>, rowupdate <date>, pl_facility <chr>

If you want to explore other datasets, take a look at exo_tables, a list that contains all available tables. One nice feature of having the tables in a list is that you can use autocompletion (assuming you’re using RStudio). Autocompletion paired with exo makes it easy to pull whatever data you’re interested in, with no need to memorize table names.

# available data
str(exo_tables)
#> List of 32
#>  $ exoplanets              : chr "exoplanets"
#>  $ compositepars           : chr "compositepars"
#>  $ exomultpars             : chr "exomultpars"
#>  $ aliastable              : chr "aliastable"
#>  $ microlensing            : chr "microlensing"
#>  $ cumulative              : chr "cumulative"
#>  $ q1_q17_dr25_sup_koi     : chr "q1_q17_dr25_sup_koi"
#>  $ q1_q17_dr25_koi         : chr "q1_q17_dr25_koi"
#>  $ q1_q17_dr24_koi         : chr "q1_q17_dr24_koi"
#>  $ q1_q16_koi              : chr "q1_q16_koi"
#>  $ q1_q12_koi              : chr "q1_q12_koi"
#>  $ q1_q8_koi               : chr "q1_q8_koi"
#>  $ q1_q6_koi               : chr "q1_q6_koi"
#>  $ q1_q17_dr25_tce         : chr "q1_q17_dr25_tce"
#>  $ q1_q17_dr24_tce         : chr "q1_q17_dr24_tce"
#>  $ q1_q16_tce              : chr "q1_q16_tce"
#>  $ q1_q12_tce              : chr "q1_q12_tce"
#>  $ keplerstellar           : chr "keplerstellar"
#>  $ q1_q17_dr25_supp_stellar: chr "q1_q17_dr25_supp_stellar"
#>  $ q1_q17_dr25_stellar     : chr "q1_q17_dr25_stellar"
#>  $ q1_q17_dr24_stellar     : chr "q1_q17_dr24_stellar"
#>  $ q1_q16_stellar          : chr "q1_q16_stellar"
#>  $ q1_q12_stellar          : chr "q1_q12_stellar"
#>  $ keplertimeseries        : chr "keplertimeseries"
#>  $ keplernames             : chr "keplernames"
#>  $ kelttimeseries          : chr "kelttimeseries"
#>  $ superwasptimeseries     : chr "superwasptimeseries"
#>  $ k2targets               : chr "k2targets"
#>  $ k2candidates            : chr "k2candidates"
#>  $ k2names                 : chr "k2names"
#>  $ missionstars            : chr "missionstars"
#>  $ mission_exocat          : chr "mission_exocat"

# using exo_tables with exo
df_stars <- exo(exo_tables$missionstars)
#> * <https://exoplanetarchive.ipac.caltech.edu/cgi-bin/nstedAPI/nph-nstedAPI?table=missionstars>
df_stars
#> # A tibble: 360 x 27
#>    star_name  hip_name  hd_name gj_name tm_name     st_exocatflag st_coronagflag
#>    <chr>      <chr>     <chr>   <chr>   <chr>               <dbl>          <dbl>
#>  1 HIP 3419   HIP 3419  HD 4128 GJ 31   2MASS J004…             1              1
#>  2 HIP 3765   HIP 3765  HD 4628 GJ 33   2MASS J004…             1              1
#>  3 HIP 3821 A HIP 3821… HD 4614 GJ 34 A 2MASS J004…             1              1
#>  4 HIP 3909   HIP 3909  HD 4813 GJ 37   2MASS J005…             1              0
#>  5 HIP 4151   HIP 4151  HD 5015 GJ 41   2MASS J005…             1              1
#>  6 HIP 5336   HIP 5336  HD 6582 GJ 53 A 2MASS J010…             1              1
#>  7 HIP 5862   HIP 5862  HD 7570 GJ 55   2MASS J011…             1              1
#>  8 ups And    HIP 7513  HD 9826 GJ 61 A 2MASS J013…             1              1
#>  9 HIP 7751 B HIP 7751… HD 103… GJ 66 B 2MASS J013…             1              0
#> 10 HIP 7918   HIP 7918  HD 103… GJ 67   2MASS J014…             1              1
#> # … with 350 more rows, and 20 more variables: st_starshadeflag <dbl>,
#> #   st_wfirstflag <dbl>, st_lbtiflag <dbl>, st_rvflag <dbl>, st_ppnum <dbl>,
#> #   rastr <chr>, decstr <chr>, st_dist <dbl>, st_disterr1 <dbl>,
#> #   st_disterr2 <dbl>, st_vmag <dbl>, st_vmagerr <dbl>, st_vmagsrc <chr>,
#> #   st_bmv <dbl>, st_bmverr <dbl>, st_bmvsrc <chr>, st_spttype <chr>,
#> #   st_lbol <dbl>, st_lbolerr <dbl>, st_lbolsrc <chr>
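
Because exo_tables is a plain named list, its names can also be searched with base R. The sketch below is only an illustration: it uses a small stand-in list (a few entries copied from the listing above) so that it runs without the package or a network connection. With the package loaded you would search names(exo_tables) instead.

```r
# Stand-in for a handful of exo_tables entries (copied from the listing above).
# This is an assumption for illustration; use exo_tables itself in practice.
tables <- list(
  exoplanets       = "exoplanets",
  cumulative       = "cumulative",
  q1_q17_dr25_koi  = "q1_q17_dr25_koi",
  q1_q16_koi       = "q1_q16_koi",
  keplertimeseries = "keplertimeseries",
  k2candidates     = "k2candidates"
)

# Keep the table names that end in "_koi"
koi_tables <- grep("_koi$", names(tables), value = TRUE)
koi_tables
#> [1] "q1_q17_dr25_koi" "q1_q16_koi"
```

Any of the matching names could then be passed to exo in the same way as exo_tables$missionstars above.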

Time series

There is one caveat to exo: it is not equipped to handle the time series datasets. These datasets are very large and require additional parameters to narrow the request down to a subset of the data.

Instead of cramming additional functionality into exo, three functions have been added specifically for time series data:

  1. exo_kelt: for the kelttimeseries table
  2. exo_kepler: for the keplertimeseries table
  3. exo_wasp: for the superwasptimeseries table

df_wasp <- exo_wasp(sourceid = "1SWASP J191645.46+474912.3")
#> * <https://exoplanetarchive.ipac.caltech.edu/cgi-bin/nstedAPI/nph-nstedAPI?&table=superwasptimeseries&sourceid=1SWASP%20J191645.46+474912.3>
df_wasp
#> # A tibble: 1 x 17
#>   sourceid      hjdstart hjdstop hjd_ref obsstart            obsstop            
#>   <chr>            <dbl>   <dbl>   <dbl> <dttm>              <dttm>             
#> 1 1SWASP J1916… 2453129.  2.45e6  2.45e6 2004-05-03 03:07:41 2008-08-10 02:47:57
#> # … with 11 more variables: tstart <dbl>, tstop <dbl>, ra <dbl>, dec <dbl>,
#> #   wasp_mag <dbl>, npts <dbl>, tile <chr>, tm_statnpts <dbl>, tm_median <dbl>,
#> #   tm_stddevwrtmed <dbl>, tm_range595 <dbl>
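
The table-to-function mapping described in this section can be written down as a small lookup. This is a minimal sketch, not part of the package: function_for_table is a hypothetical helper, and it only returns the name of the function to use rather than calling the API.

```r
# Time series tables and the dedicated function for each (from the list above)
time_series_functions <- c(
  kelttimeseries      = "exo_kelt",
  keplertimeseries    = "exo_kepler",
  superwasptimeseries = "exo_wasp"
)

# Hypothetical helper: name of the function that handles a given table.
# Anything that is not a time series table falls back to exo.
function_for_table <- function(table) {
  if (table %in% names(time_series_functions)) {
    unname(time_series_functions[table])
  } else {
    "exo"
  }
}

function_for_table("superwasptimeseries")
#> [1] "exo_wasp"
function_for_table("missionstars")
#> [1] "exo"
```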

Raw URL

The documentation has an excellent graphic explaining how the URLs are constructed. Generally, you will not need to touch exo_raw; exo and the time series functions should be all you need. However, if you have a very specific query, you might find exo_raw helpful, as the API supports SQL syntax in the URL.

x <- c(
  base = "https://exoplanetarchive.ipac.caltech.edu/cgi-bin/nstedAPI/nph-nstedAPI?",
  table = "table=exoplanets",
  columns = "&select=pl_hostname,ra,dec",
  parameters = "&order=dec"
)

query <- paste(x, collapse = "")

df_raw <- exo_raw(query)
df_raw
#> # A tibble: 4,352 x 3
#>    pl_hostname     ra   dec
#>    <chr>        <dbl> <dbl>
#>  1 HD 142022 A 243.   -84.2
#>  2 HD 39091     84.3  -80.5
#>  3 HD 39091     84.3  -80.5
#>  4 HD 137388   234.   -80.2
#>  5 GJ 3021       4.05 -79.9
#>  6 HD 63454    115.   -78.3
#>  7 HD 212301   337.   -77.7
#>  8 HD 97048    167.   -77.7
#>  9 CHXR 73     167.   -77.6
#> 10 HD 221420   353.   -77.4
#> # … with 4,342 more rows
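
If you build raw URLs often, the pasting step can be wrapped in a small function. The sketch below is an illustration only: build_query is a hypothetical helper that is not part of the package, and the where parameter is an assumption based on the API's SQL-like query syntax, so check it against the API documentation before relying on it. The function only assembles a string and makes no request.

```r
# Hypothetical helper: assemble a query URL from its parts.
# Each of where and order is expected to be a single string.
build_query <- function(table, select = NULL, where = NULL, order = NULL) {
  base <- "https://exoplanetarchive.ipac.caltech.edu/cgi-bin/nstedAPI/nph-nstedAPI?"
  params <- c(
    table  = table,
    select = if (!is.null(select)) paste(select, collapse = ","),
    where  = where,   # assumed parameter name, see note above
    order  = order
  )
  # NULL entries are dropped by c(), so only supplied parts appear in the URL
  paste0(base, paste0(names(params), "=", params, collapse = "&"))
}

query <- build_query(
  table  = "exoplanets",
  select = c("pl_hostname", "ra", "dec"),
  order  = "dec"
)
query
#> [1] "https://exoplanetarchive.ipac.caltech.edu/cgi-bin/nstedAPI/nph-nstedAPI?table=exoplanets&select=pl_hostname,ra,dec&order=dec"
```

The resulting string is identical to the hand-built query above and could be passed to exo_raw in the same way.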