<?xml version="1.0" encoding="utf-8" standalone="yes" ?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Resources and further references | Data Analytics: Learning from Data</title>
    <link>https://usi-emba-analytics.netlify.app/reference/</link>
      <atom:link href="https://usi-emba-analytics.netlify.app/reference/index.xml" rel="self" type="application/rss+xml" />
    <description>Resources and further references</description>
    <generator>Source Themes Academic (https://sourcethemes.com/academic/)</generator><lastBuildDate>Fri, 24 Jul 2020 00:00:00 +0000</lastBuildDate>
    <image>
      <url>https://usi-emba-analytics.netlify.app/media/social-image.png</url>
      <title>Resources and further references</title>
      <link>https://usi-emba-analytics.netlify.app/reference/</link>
    </image>
    
    <item>
      <title>Finance Data</title>
      <link>https://usi-emba-analytics.netlify.app/reference/finance_data/</link>
      <pubDate>Fri, 31 Jul 2020 00:00:00 +0000</pubDate>
      <guid>https://usi-emba-analytics.netlify.app/reference/finance_data/</guid>
      <description>
&lt;script src=&#34;https://usi-emba-analytics.netlify.app/rmarkdown-libs/kePrint/kePrint.js&#34;&gt;&lt;/script&gt;

&lt;div id=&#34;TOC&#34;&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#finance-data-with-the-tidyquant-package&#34;&gt;Finance data with the &lt;code&gt;tidyquant&lt;/code&gt; package&lt;/a&gt;&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#calculating-financial-returns&#34;&gt;Calculating financial returns&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#summarising-the-data-set&#34;&gt;Summarising the data set&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#minimum-and-maximum-price-of-each-stock-by-quarter&#34;&gt;Minimum and maximum price of each stock by quarter&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#sharpe-ratio&#34;&gt;Sharpe Ratio&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#investment-growth&#34;&gt;Investment Growth&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#scatterplots-of-individual-stocks-returns-versus-sp500-index-returns&#34;&gt;Scatterplots of individual stocks returns versus S&amp;amp;P500 Index returns&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#creating-a-portfolio-of-assets&#34;&gt;Creating a portfolio of assets&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#creating-various-portfolios-by-changing-weights-of-assets&#34;&gt;Creating various portfolios by changing weights of assets&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#data-from-the-federal-reserve-economic-data-with-tidyquant&#34;&gt;Data from the &lt;em&gt;Federal Reserve Economic Data&lt;/em&gt; with &lt;code&gt;tidyquant&lt;/code&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#acknowledgments&#34;&gt;Acknowledgments&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;“&lt;em&gt;Data! Data! Data!&lt;/em&gt;” he cried impatiently. “&lt;em&gt;I can’t make bricks without clay.&lt;/em&gt;” &lt;br&gt;
      –Arthur Conan Doyle, The Adventure of the Copper Beeches&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The easiest way to get data is when someone makes a CSV file available and we can read it directly off the web with &lt;code&gt;readr::read_csv()&lt;/code&gt; or with &lt;code&gt;data.table::fread()&lt;/code&gt;. Alternatively, we can use the &lt;code&gt;rio&lt;/code&gt; package to import many different types of files (Excel, SPSS, Stata, etc.).&lt;/p&gt;
&lt;p&gt;In this section we will look at three packages that wrap &lt;strong&gt;Application Programming Interfaces (APIs)&lt;/strong&gt; to get data off the web:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;tidyquant&lt;/code&gt; to get finance data&lt;br /&gt;
&lt;/li&gt;
&lt;li&gt;&lt;code&gt;wbstats&lt;/code&gt; to get data from the World Bank database, and&lt;/li&gt;
&lt;li&gt;&lt;code&gt;eurostat&lt;/code&gt; to get Eurostat data.&lt;/li&gt;
&lt;/ul&gt;
&lt;div id=&#34;finance-data-with-the-tidyquant-package&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Finance data with the &lt;code&gt;tidyquant&lt;/code&gt; package&lt;/h2&gt;
&lt;p&gt;The &lt;code&gt;tidyquant&lt;/code&gt; package comes with a number of utility functions that allow us to download financial data off the web, as well as ways of handling all this data.&lt;/p&gt;
&lt;p&gt;We begin by loading the data set into the R workspace. We create a collection of stocks with their ticker symbols and then use the &lt;em&gt;piping&lt;/em&gt; operator &lt;code&gt;%&amp;gt;%&lt;/code&gt; twice: first to pass the symbols to tidyquant’s &lt;code&gt;tq_get&lt;/code&gt;, which downloads historical data from Yahoo Finance, and then to group the data by ticker symbol.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(tidyquant)
myStocks &amp;lt;- c(&amp;quot;AAPL&amp;quot;,&amp;quot;JPM&amp;quot;,&amp;quot;DIS&amp;quot;,&amp;quot;DPZ&amp;quot;,&amp;quot;ANF&amp;quot;,&amp;quot;TSLA&amp;quot;,&amp;quot;XOM&amp;quot;,&amp;quot;SPY&amp;quot; ) %&amp;gt;%
  tq_get(get  = &amp;quot;stock.prices&amp;quot;,
         from = &amp;quot;2011-01-01&amp;quot;,
         to   = &amp;quot;2020-07-31&amp;quot;) %&amp;gt;%
  group_by(symbol) 

glimpse(myStocks) # examine the structure of the resulting data frame&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Rows: 19,280
## Columns: 8
## Groups: symbol [8]
## $ symbol   &amp;lt;chr&amp;gt; &amp;quot;AAPL&amp;quot;, &amp;quot;AAPL&amp;quot;, &amp;quot;AAPL&amp;quot;, &amp;quot;AAPL&amp;quot;, &amp;quot;AAPL&amp;quot;, &amp;quot;AAPL&amp;quot;, &amp;quot;AAPL&amp;quot;, &amp;quot;A...
## $ date     &amp;lt;date&amp;gt; 2011-01-03, 2011-01-04, 2011-01-05, 2011-01-06, 2011-01-0...
## $ open     &amp;lt;dbl&amp;gt; 46.5, 47.5, 47.1, 47.8, 47.7, 48.4, 49.3, 49.0, 49.3, 49.4...
## $ high     &amp;lt;dbl&amp;gt; 47.2, 47.5, 47.8, 47.9, 48.0, 49.0, 49.3, 49.2, 49.5, 49.8...
## $ low      &amp;lt;dbl&amp;gt; 46.4, 46.9, 47.1, 47.6, 47.4, 48.2, 48.5, 48.9, 49.1, 49.2...
## $ close    &amp;lt;dbl&amp;gt; 47.1, 47.3, 47.7, 47.7, 48.0, 48.9, 48.8, 49.2, 49.4, 49.8...
## $ volume   &amp;lt;dbl&amp;gt; 1.11e+08, 7.73e+07, 6.39e+07, 7.51e+07, 7.80e+07, 1.12e+08...
## $ adjusted &amp;lt;dbl&amp;gt; 40.8, 41.0, 41.3, 41.3, 41.6, 42.4, 42.3, 42.6, 42.8, 43.1...&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;For each ticker symbol, the data frame contains its &lt;code&gt;symbol&lt;/code&gt;, the &lt;code&gt;date&lt;/code&gt;, the prices for &lt;code&gt;open&lt;/code&gt;, &lt;code&gt;high&lt;/code&gt;, &lt;code&gt;low&lt;/code&gt; and &lt;code&gt;close&lt;/code&gt;, and the &lt;code&gt;volume&lt;/code&gt;, or how many shares were traded on that day. More importantly, the data frame contains the &lt;code&gt;adjusted&lt;/code&gt; closing price, which adjusts for any stock splits and/or dividends paid; this is what we will be using for our analyses.&lt;/p&gt;
&lt;div id=&#34;calculating-financial-returns&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Calculating financial returns&lt;/h3&gt;
&lt;p&gt;Financial performance and CAPM analysis depend on &lt;strong&gt;returns&lt;/strong&gt; and not on &lt;strong&gt;adjusted closing prices&lt;/strong&gt;. So given the adjusted closing prices, our first step is to calculate daily, monthly, and yearly returns.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;#calculate daily returns
myStocks_returns_daily &amp;lt;- myStocks %&amp;gt;%
  tq_transmute(select     = adjusted, 
               mutate_fun = periodReturn, 
               period     = &amp;quot;daily&amp;quot;, 
               type       = &amp;quot;log&amp;quot;,
               col_rename = &amp;quot;daily.returns&amp;quot;,
               cols = c(nested.col))  

#calculate monthly  returns
myStocks_returns_monthly &amp;lt;- myStocks %&amp;gt;%
  tq_transmute(select     = adjusted, 
               mutate_fun = periodReturn, 
               period     = &amp;quot;monthly&amp;quot;, 
               type       = &amp;quot;arithmetic&amp;quot;,
               col_rename = &amp;quot;monthly.returns&amp;quot;,
               cols = c(nested.col)) 

#calculate yearly returns
myStocks_returns_annual &amp;lt;- myStocks %&amp;gt;%
  group_by(symbol) %&amp;gt;%
  tq_transmute(select     = adjusted, 
               mutate_fun = periodReturn, 
               period     = &amp;quot;yearly&amp;quot;, 
               type       = &amp;quot;arithmetic&amp;quot;,
               col_rename = &amp;quot;yearly.returns&amp;quot;,
               cols = c(nested.col))&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;For yearly and monthly data, we assume discrete changes, so the formula used to calculate the return for period &lt;strong&gt;(t+1)&lt;/strong&gt; is&lt;/p&gt;
&lt;p&gt;&lt;span class=&#34;math inline&#34;&gt;\(Return(t+1)= \frac{Adj.Close(t+1)}{Adj.Close (t)}-1\)&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;For daily data we use log returns, or &lt;span class=&#34;math inline&#34;&gt;\(Return(t+1)= \ln\left(\frac{Adj.Close(t+1)}{Adj.Close (t)}\right)\)&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;The reasons we use log returns are:&lt;/p&gt;
&lt;ol style=&#34;list-style-type: lower-alpha&#34;&gt;
&lt;li&gt;&lt;p&gt;Compound interest interpretation; namely, that the log return can be interpreted as the continuously (rather than discretely) compounded rate of return&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Log returns are assumed to follow a normal distribution&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Log return over n periods is the sum of n log returns&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
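&lt;p&gt;The third property is easy to check by hand. The following is a minimal base R sketch that uses a made-up price vector (the numbers are illustrative and not taken from the data above). It computes simple and log returns, then checks that the log returns add up to the log return over the whole period, whereas simple returns compound multiplicatively.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# hypothetical adjusted closing prices for four consecutive days
prices &amp;lt;- c(100, 102, 99, 105)

# simple (arithmetic) returns: P(t+1) / P(t) - 1
simple_returns &amp;lt;- prices[-1] / prices[-length(prices)] - 1

# log returns: ln(P(t+1) / P(t))
log_returns &amp;lt;- diff(log(prices))

# the sum of the log returns equals the log return over the whole period
all.equal(sum(log_returns), log(prices[4] / prices[1]))

# simple returns compound multiplicatively instead
all.equal(prod(1 + simple_returns) - 1, prices[4] / prices[1] - 1)&lt;/code&gt;&lt;/pre&gt;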
&lt;/div&gt;
&lt;div id=&#34;summarising-the-data-set&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Summarising the data set&lt;/h3&gt;
Let us get quick summary statistics of daily returns for each stock, as well as a density plot where we use &lt;code&gt;facet_grid&lt;/code&gt; to show all the distributions in one figure, one panel per stock.
&lt;table class=&#34;table table-striped table-bordered&#34; style=&#34;margin-left: auto; margin-right: auto;&#34;&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th style=&#34;text-align:left;&#34;&gt;
symbol
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
min
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
median
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
max
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
mean
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
sd
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
annual_mean
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
annual_sd
&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
AAPL
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
-0.138
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.001
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.113
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.001
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.017
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.233
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.276
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
ANF
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
-0.307
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.001
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.296
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
-0.001
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.034
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
-0.152
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.540
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
DIS
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
-0.139
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.001
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.135
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.001
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.015
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.129
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.241
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
DPZ
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
-0.106
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.001
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.228
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.001
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.018
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.344
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.287
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
JPM
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
-0.162
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.001
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.166
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.000
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.018
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.111
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.284
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
SPY
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
-0.116
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.001
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.087
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.000
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.011
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.117
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.172
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
TSLA
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
-0.215
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.001
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.218
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.002
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.034
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.417
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.535
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
XOM
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
-0.130
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.000
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.119
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.000
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.015
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
-0.026
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.232
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;&lt;img src=&#34;https://usi-emba-analytics.netlify.app/reference/finance_data_files/figure-html/quick_density_plot-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;p&gt;Daily returns seem to follow a normal distribution with a mean close to zero. Since most people think of returns on an annual, rather than on a daily, basis, we can calculate summary statistics of annual returns, a boxplot of annual returns, and a bar plot that shows the return for each stock on a year-by-year basis.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;myStocks_returns_annual %&amp;gt;% 
  group_by(symbol) %&amp;gt;% 
  mutate(median_return= median(yearly.returns)) %&amp;gt;% 

  # arrange stocks by median yearly return, so highest median return appears first, etc.   
  ggplot(aes(x=reorder(symbol, median_return), y=yearly.returns, colour=symbol)) +
  geom_boxplot()+
  coord_flip()+
  labs(x=&amp;quot;Stock&amp;quot;, 
       y=&amp;quot;Returns&amp;quot;, 
       title = &amp;quot;Boxplot of Annual Returns&amp;quot;)+
  scale_y_continuous(labels = scales::percent_format(accuracy = 2))+
  guides(colour = &amp;quot;none&amp;quot;) + # hide the redundant colour legend
  theme_bw()+
  NULL&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://usi-emba-analytics.netlify.app/reference/finance_data_files/figure-html/annual_returns_plot-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;ggplot(myStocks_returns_annual, aes(x=year(date), y=yearly.returns, fill=symbol)) +
  geom_col(position = &amp;quot;dodge&amp;quot;)+
  labs(x=&amp;quot;Year&amp;quot;, y=&amp;quot;Returns&amp;quot;, title = &amp;quot;Annual Returns&amp;quot;)+
  scale_y_continuous(labels = scales::percent)+
  guides(fill=guide_legend(title=NULL))+
  theme_bw()+
  NULL&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://usi-emba-analytics.netlify.app/reference/finance_data_files/figure-html/annual_returns_plot-2.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;minimum-and-maximum-price-of-each-stock-by-quarter&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Minimum and maximum price of each stock by quarter&lt;/h3&gt;
&lt;p&gt;What if we wanted to find out and visualise the min/max price by quarter?&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://usi-emba-analytics.netlify.app/reference/finance_data_files/figure-html/minMiaxbyQ-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
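&lt;p&gt;The code behind this plot is not shown. As a rough illustration of the underlying calculation only, here is a base R sketch on a small made-up data frame (the dates and prices are hypothetical, and &lt;code&gt;quarterly_range&lt;/code&gt; is a name chosen for this example): each date is labelled with its calendar quarter and the minimum and maximum price are computed within each quarter.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# hypothetical adjusted prices for one stock (illustrative numbers only)
prices &amp;lt;- data.frame(
  date     = as.Date(c(&amp;quot;2020-01-15&amp;quot;, &amp;quot;2020-02-14&amp;quot;, &amp;quot;2020-03-16&amp;quot;,
                       &amp;quot;2020-04-15&amp;quot;, &amp;quot;2020-05-15&amp;quot;)),
  adjusted = c(100, 110, 80, 95, 105)
)

# label each date with its year and calendar quarter
prices$quarter &amp;lt;- paste(format(prices$date, &amp;quot;%Y&amp;quot;), quarters(prices$date))

# minimum and maximum adjusted price within each quarter
quarterly_range &amp;lt;- aggregate(adjusted ~ quarter, data = prices,
                             FUN = function(x) c(min = min(x), max = max(x)))
quarterly_range&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;With the data downloaded above, the same idea would be applied separately to each ticker symbol.&lt;/p&gt;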
&lt;/div&gt;
&lt;div id=&#34;sharpe-ratio&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Sharpe Ratio&lt;/h3&gt;
&lt;p&gt;The Sharpe ratio, introduced by William F. Sharpe, is used to understand the return of an investment compared to its risk. It is simply the return on an asset per unit of risk, with the unit of risk typically being the standard deviation of the returns of that particular asset.&lt;/p&gt;
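Mathematically, the ratio is the average return earned in excess of the risk-free rate per unit of volatility, where &lt;span class=&#34;math inline&#34;&gt;\(R_{p}\)&lt;/span&gt; is the return of the asset or portfolio, &lt;span class=&#34;math inline&#34;&gt;\(R_{f}\)&lt;/span&gt; is the risk-free rate, and &lt;span class=&#34;math inline&#34;&gt;\(\sigma_{p}\)&lt;/span&gt; is the standard deviation of the asset&#39;s returns.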
&lt;center&gt;
&lt;span class=&#34;math inline&#34;&gt;\(\text{Sharpe Ratio} = \frac{R_{p}-R_{f}}{\sigma_{p}}\)&lt;/span&gt;
&lt;/center&gt;
&lt;p&gt;Generally, the greater the value of the Sharpe ratio, the more attractive the risk-adjusted return.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;myStocks_returns_monthly %&amp;gt;%
  tq_performance(Ra = monthly.returns, #the name of the variable containing the returns of the asset
                 Rb = NULL, 
                 performance_fun = SharpeRatio) %&amp;gt;% 
  kable() %&amp;gt;%
  kable_styling(c(&amp;quot;striped&amp;quot;, &amp;quot;bordered&amp;quot;)) &lt;/code&gt;&lt;/pre&gt;
&lt;table class=&#34;table table-striped table-bordered&#34; style=&#34;margin-left: auto; margin-right: auto;&#34;&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th style=&#34;text-align:left;&#34;&gt;
symbol
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
ESSharpe(Rf=0%,p=95%)
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
StdDevSharpe(Rf=0%,p=95%)
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
VaRSharpe(Rf=0%,p=95%)
&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
AAPL
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.163
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.296
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.211
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
JPM
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.068
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.166
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.102
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
DIS
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.104
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.203
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.147
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
DPZ
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.313
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.427
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.416
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
ANF
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
-0.010
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
-0.022
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
-0.014
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
TSLA
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.207
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.286
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.329
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
XOM
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
-0.002
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
-0.006
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
-0.003
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
SPY
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.119
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.280
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.193
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
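&lt;p&gt;To make the formula concrete, here is a minimal base R sketch that computes a standard-deviation-based Sharpe ratio by hand. The returns are made up for illustration, and the risk-free rate is assumed to be zero to match the &lt;code&gt;Rf=0%&lt;/code&gt; shown in the table headers. This is the same idea as the &lt;code&gt;StdDevSharpe&lt;/code&gt; column; the other two columns use Value-at-Risk and Expected Shortfall as the measure of risk instead of the standard deviation.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# hypothetical monthly returns of one asset (illustrative numbers only)
monthly_returns &amp;lt;- c(0.02, -0.01, 0.03, 0.015, -0.005)

# assumed monthly risk-free rate
risk_free &amp;lt;- 0

# mean excess return per unit of volatility (standard deviation of returns)
sharpe_ratio &amp;lt;- (mean(monthly_returns) - risk_free) / sd(monthly_returns)
sharpe_ratio&lt;/code&gt;&lt;/pre&gt;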
&lt;/div&gt;
&lt;div id=&#34;investment-growth&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Investment Growth&lt;/h3&gt;
&lt;p&gt;We may also want to see what our investments would have grown to if we had invested $1000 in each of the assets on Jan 1, 2011.&lt;/p&gt;
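&lt;p&gt;The calculation is a cumulative product of one plus each period&#39;s arithmetic return. A minimal base R sketch, using made-up monthly returns for a single asset (illustrative numbers only), could look like this:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# hypothetical monthly arithmetic returns for one asset
monthly_returns &amp;lt;- c(0.05, -0.02, 0.03, 0.01)
initial_investment &amp;lt;- 1000

# value of the investment at the end of each month
investment_value &amp;lt;- initial_investment * cumprod(1 + monthly_returns)
investment_value&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;With the monthly returns calculated earlier, the same cumulative product would be computed separately for each ticker symbol.&lt;/p&gt;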
&lt;/div&gt;
&lt;div id=&#34;scatterplots-of-individual-stocks-returns-versus-sp500-index-returns&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Scatterplots of individual stocks returns versus S&amp;amp;P500 Index returns&lt;/h3&gt;
&lt;p&gt;Besides these exploratory graphs of returns and price evolution, we also need to create scatterplots among the returns of different stocks. &lt;code&gt;ggpairs&lt;/code&gt; from the &lt;code&gt;GGally&lt;/code&gt; package creates a scatterplot matrix that shows the distribution of returns for each stock along the diagonal, and scatter plots and correlations for each pair of stocks. A &lt;code&gt;ggpairs()&lt;/code&gt; scatterplot matrix typically takes a while to run.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# reshape daily returns to wide format: one row per date, one column per symbol
table_capm_returns &amp;lt;- myStocks_returns_daily %&amp;gt;%
            spread(key = symbol, value = daily.returns)  #just keep the period returns grouped by symbol

table_capm_returns[-1] %&amp;gt;% #exclude &amp;quot;Date&amp;quot;, the first column, from the correlation matrix
  GGally::ggpairs() +
  theme_bw()+
    theme(axis.text.x = element_text(angle = 90, size=8),
         axis.title.x = element_blank())&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://usi-emba-analytics.netlify.app/reference/finance_data_files/figure-html/correlationMatrix-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;creating-a-portfolio-of-assets&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Creating a portfolio of assets&lt;/h3&gt;
&lt;p&gt;DPZ may have been the best performing stock, but you believe that you can create a portfolio of technology stocks that will beat the relevant sector index, &lt;a href=&#34;https://finance.yahoo.com/quote/XLK&#34;&gt;XLK&lt;/a&gt;. To create a portfolio, you need to choose a few stocks and then the weights, or how much of your total investment is allocated to each stock. To keep things simple we will assume you will choose among &lt;code&gt;AAPL&lt;/code&gt;, &lt;code&gt;GOOG&lt;/code&gt;, &lt;code&gt;MSFT&lt;/code&gt;, &lt;code&gt;NFLX&lt;/code&gt;, and &lt;code&gt;NVDA&lt;/code&gt; and you will compare your performance against the sector index, &lt;code&gt;XLK&lt;/code&gt;. We will also add two non-tech stocks, &lt;code&gt;TSLA&lt;/code&gt; and &lt;code&gt;DPZ&lt;/code&gt;, so we can see their position on the risk/return frontier.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;ticker_symbols &amp;lt;- c(&amp;quot;AAPL&amp;quot;,&amp;quot;GOOG&amp;quot;,&amp;quot;MSFT&amp;quot;,&amp;quot;NFLX&amp;quot;,&amp;quot;NVDA&amp;quot;, &amp;quot;XLK&amp;quot;, &amp;quot;TSLA&amp;quot;, &amp;quot;DPZ&amp;quot;) 

tech_stock_returns_monthly &amp;lt;- ticker_symbols %&amp;gt;%
    tq_get(get  = &amp;quot;stock.prices&amp;quot;,
           from = &amp;quot;2011-01-01&amp;quot;,
           to   = &amp;quot;2020-07-31&amp;quot;) %&amp;gt;%
    group_by(symbol) %&amp;gt;%
    tq_transmute(select     = adjusted, 
                 mutate_fun = periodReturn, 
                 period     = &amp;quot;monthly&amp;quot;, 
                 col_rename = &amp;quot;monthly_return&amp;quot;)


baseline_returns_monthly &amp;lt;- &amp;quot;XLK&amp;quot; %&amp;gt;%
    tq_get(get  = &amp;quot;stock.prices&amp;quot;,
           from = &amp;quot;2011-01-01&amp;quot;,
           to   = &amp;quot;2020-07-31&amp;quot;) %&amp;gt;%
    tq_transmute(select     = adjusted, 
                 mutate_fun = periodReturn, 
                 period     = &amp;quot;monthly&amp;quot;, 
                 col_rename = &amp;quot;baseline_return&amp;quot;)

# Summary Stats for individual Stocks
stocks_risk_return &amp;lt;- tech_stock_returns_monthly %&amp;gt;%
  tq_performance(Ra = monthly_return, Rb = NULL, performance_fun = table.Stats) %&amp;gt;% 
  select(symbol, ArithmeticMean, GeometricMean, Minimum,Maximum,Stdev, Quartile1, Quartile3) 



ggplot(stocks_risk_return, aes(x=Stdev, y = ArithmeticMean, colour= symbol, label= symbol))+
  geom_point(size = 4)+
  labs(title = &amp;#39;Risk/Return profile of technology stocks&amp;#39;, 
       x = &amp;#39;Risk (stdev of monthly returns)&amp;#39;, 
       y =&amp;quot;Average monthly return&amp;quot;)+
  theme_bw()+
  scale_x_continuous(labels = scales::percent)+
  scale_y_continuous(labels = scales::percent)+
  geom_text_repel()+
  theme(legend.position = &amp;quot;none&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://usi-emba-analytics.netlify.app/reference/finance_data_files/figure-html/unnamed-chunk-1-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;p&gt;We have the monthly returns of the individual stocks and the relevant sector index. To create a portfolio, we must specify the weights; as an example, suppose we only choose three stocks and invest 50% in &lt;code&gt;AAPL&lt;/code&gt;, 35% in &lt;code&gt;NFLX&lt;/code&gt;, and 15% in &lt;code&gt;NVDA&lt;/code&gt;. To do this, we create a two-column tibble, with symbols in the first column and weights in the second; any symbol not specified gets a weight of zero by default.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;weights_map &amp;lt;- tibble(
    symbols = c(&amp;quot;AAPL&amp;quot;, &amp;quot;NFLX&amp;quot;, &amp;quot;NVDA&amp;quot;),
    weights = c(0.5, 0.35, 0.15)
)

tech_portfolio_returns &amp;lt;- tech_stock_returns_monthly %&amp;gt;%
    tq_portfolio(assets_col  = symbol, 
                 returns_col = monthly_return, 
                 weights     = weights_map, 
                 col_rename  = &amp;quot;monthly_portfolio_return&amp;quot;)

tech_portfolio_returns %&amp;gt;%
    ggplot(aes(x = date, y = monthly_portfolio_return)) +
    geom_col() +
    scale_y_continuous(labels = scales::percent) +
    # geom_bar(stat = &amp;quot;identity&amp;quot;, fill = palette_light()[[1]]) +
    labs(title = &amp;quot;Tech Portfolio Returns&amp;quot;,
         subtitle = &amp;quot;50% AAPL, 35% NFLX, and 15% NVDA&amp;quot;,
         x = &amp;quot;&amp;quot;, y = &amp;quot;Monthly Returns&amp;quot;) +
    theme_bw() &lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://usi-emba-analytics.netlify.app/reference/finance_data_files/figure-html/unnamed-chunk-2-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
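&lt;p&gt;To see what the weights do, it may help to work through a single period by hand. The sketch below uses the weights from the example and made-up returns for one month (the returns are hypothetical); for one period, the portfolio return is simply the weighted sum of the asset returns. How the weights evolve in later periods depends on the rebalancing settings passed to &lt;code&gt;tq_portfolio()&lt;/code&gt;, which this sketch does not attempt to reproduce.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# weights from the example above and hypothetical returns for one month
asset_weights &amp;lt;- c(AAPL = 0.50, NFLX = 0.35, NVDA = 0.15)
asset_returns &amp;lt;- c(AAPL = 0.04, NFLX = -0.02, NVDA = 0.10)

# the weights must add up to one
stopifnot(isTRUE(all.equal(sum(asset_weights), 1)))

# single-period portfolio return: weighted sum of the asset returns
portfolio_return &amp;lt;- sum(asset_weights * asset_returns)
portfolio_return&lt;/code&gt;&lt;/pre&gt;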
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;portfolio_growth_monthly &amp;lt;- tech_stock_returns_monthly %&amp;gt;%
    tq_portfolio(assets_col   = symbol, 
                 returns_col  = monthly_return, 
                 weights      = weights_map, 
                 col_rename   = &amp;quot;investment.growth&amp;quot;,
                 wealth.index = TRUE) %&amp;gt;%
    mutate(investment.growth = investment.growth * 1000)

plot1 &amp;lt;- portfolio_growth_monthly %&amp;gt;%
    ggplot(aes(x = date, y = investment.growth)) +
    geom_line(size = 2) +
    labs(title = &amp;quot;Portfolio Growth&amp;quot;,
         subtitle = &amp;quot;50% AAPL, 35% NFLX, and 15% NVDA&amp;quot;,
         x = &amp;quot;&amp;quot;, y = &amp;quot;Portfolio Value&amp;quot;) +
    # geom_smooth(method = &amp;quot;loess&amp;quot;, se = FALSE) +
    theme_bw() +
    scale_y_continuous(labels = scales::dollar)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now that we have our portfolio returns and the baseline returns of the &lt;code&gt;XLK&lt;/code&gt; index, we can merge to get our consolidated table of asset and baseline returns, create a scatter plot and fit a CAPM model.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;tech_single_portfolio &amp;lt;- left_join(tech_portfolio_returns, 
                                   baseline_returns_monthly,
                                   by = &amp;quot;date&amp;quot;)
tech_single_portfolio&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## # A tibble: 115 x 3
##    date       monthly_portfolio_return baseline_return
##    &amp;lt;date&amp;gt;                        &amp;lt;dbl&amp;gt;           &amp;lt;dbl&amp;gt;
##  1 2011-01-31                  0.162           0.0204 
##  2 2011-02-28                 -0.00466         0.0219 
##  3 2011-03-31                  0.0122         -0.0155 
##  4 2011-04-29                  0.00601         0.0261 
##  5 2011-05-31                  0.0609         -0.0105 
##  6 2011-06-30                 -0.0586         -0.0248 
##  7 2011-07-29                  0.0592          0.00428
##  8 2011-08-31                 -0.0596         -0.0531 
##  9 2011-09-30                 -0.215          -0.0307 
## 10 2011-10-31                 -0.00422         0.102  
## # ... with 105 more rows&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;ggplot(tech_single_portfolio, aes(x = baseline_return, y= monthly_portfolio_return)) +
  geom_point()+
  geom_smooth(method=&amp;quot;lm&amp;quot;, se=FALSE) +
  scale_x_continuous(labels = scales::percent) +
  scale_y_continuous(labels = scales::percent) +
  labs(x = &amp;quot;Baseline returns (XLK)&amp;quot;, 
       y= &amp;quot;Tech Portfolio Return&amp;quot;, 
       title= &amp;quot;How do our tech fund returns compare to the sector index XLK?&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://usi-emba-analytics.netlify.app/reference/finance_data_files/figure-html/unnamed-chunk-3-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;portfolio_CAPM &amp;lt;- lm(monthly_portfolio_return ~ baseline_return, data = tech_single_portfolio)
summary(portfolio_CAPM)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## 
## Call:
## lm(formula = monthly_portfolio_return ~ baseline_return, data = tech_single_portfolio)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -0.1841 -0.0375 -0.0016  0.0326  0.1618 
## 
## Coefficients:
##                 Estimate Std. Error t value Pr(&amp;gt;|t|)    
## (Intercept)      0.00836    0.00579    1.44     0.15    
## baseline_return  1.27904    0.12865    9.94   &amp;lt;2e-16 ***
## ---
## Signif. codes:  0 &amp;#39;***&amp;#39; 0.001 &amp;#39;**&amp;#39; 0.01 &amp;#39;*&amp;#39; 0.05 &amp;#39;.&amp;#39; 0.1 &amp;#39; &amp;#39; 1
## 
## Residual standard error: 0.0586 on 113 degrees of freedom
## Multiple R-squared:  0.467,  Adjusted R-squared:  0.462 
## F-statistic: 98.8 on 1 and 113 DF,  p-value: &amp;lt;2e-16&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;autoplot(portfolio_CAPM, which = 1:3) +
  theme_bw()&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://usi-emba-analytics.netlify.app/reference/finance_data_files/figure-html/unnamed-chunk-3-2.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
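&lt;p&gt;In this regression the slope on &lt;code&gt;baseline_return&lt;/code&gt; plays the role of the portfolio beta relative to XLK, and the intercept is the monthly alpha. The sketch below shows one way to pull both out of a fitted model with &lt;code&gt;coef()&lt;/code&gt;. It uses simulated returns so that it runs without downloading anything: the data frame &lt;code&gt;sim_returns&lt;/code&gt;, the model &lt;code&gt;sim_CAPM&lt;/code&gt;, and the true beta of 1.3 are assumptions made for illustration, not the data used above.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Simulated monthly returns; purely illustrative, not real market data
set.seed(42)
n_months &amp;lt;- 115
baseline_return &amp;lt;- rnorm(n_months, mean = 0.01, sd = 0.04)
noise &amp;lt;- rnorm(n_months, mean = 0, sd = 0.05)
sim_returns &amp;lt;- data.frame(
  baseline_return = baseline_return,
  monthly_portfolio_return = 0.005 + 1.3 * baseline_return + noise
)

sim_CAPM &amp;lt;- lm(monthly_portfolio_return ~ baseline_return, data = sim_returns)

# coef() returns a named vector: the intercept (alpha) and the slope (beta)
alpha_hat &amp;lt;- coef(sim_CAPM)[[&amp;quot;(Intercept)&amp;quot;]]
beta_hat  &amp;lt;- coef(sim_CAPM)[[&amp;quot;baseline_return&amp;quot;]]

# 95% confidence intervals for both estimates
confint(sim_CAPM, level = 0.95)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Applied to &lt;code&gt;portfolio_CAPM&lt;/code&gt;, the same calls would return the estimates printed in the summary above: an intercept of about 0.008 and a slope of about 1.28.&lt;/p&gt;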
&lt;/div&gt;
&lt;div id=&#34;creating-various-portfolios-by-changing-weights-of-assets&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Creating various portfolios by changing weights of assets&lt;/h3&gt;
&lt;p&gt;Suppose we wanted to examine a few more portfolios by varying the weights.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Naive portfolio: you split your investment equally among the five stocks, so each of them has a weight of 20%&lt;/li&gt;
&lt;li&gt;Bitcoin mining: you invest 80-20 in &lt;code&gt;NVDA&lt;/code&gt; and &lt;code&gt;GOOG&lt;/code&gt;&lt;br /&gt;
&lt;/li&gt;
&lt;li&gt;Binge TV watching: you invest most (70%) in &lt;code&gt;NFLX&lt;/code&gt; and 10% to &lt;code&gt;AAPL&lt;/code&gt;, &lt;code&gt;GOOG&lt;/code&gt;, and &lt;code&gt;MSFT&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
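&lt;p&gt;As a quick sanity check, the three allocations described above can be laid out as a matrix with one row per portfolio and one column per ticker; each row should add up to 1. This is a small base R sketch, and the object name &lt;code&gt;weights_matrix&lt;/code&gt; and its row labels are assumptions used only for illustration.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;ticker_symbols &amp;lt;- c(&amp;quot;AAPL&amp;quot;, &amp;quot;GOOG&amp;quot;, &amp;quot;MSFT&amp;quot;, &amp;quot;NFLX&amp;quot;, &amp;quot;NVDA&amp;quot;)

# One row per portfolio, one column per ticker (same order as ticker_symbols)
weights_matrix &amp;lt;- rbind(
  naive    = c(0.2, 0.2, 0.2, 0.2, 0.2),
  bitcoin  = c(0,   0.2, 0,   0,   0.8),
  binge_tv = c(0.1, 0.1, 0.1, 0.7, 0)
)
colnames(weights_matrix) &amp;lt;- ticker_symbols

# Each portfolio should be fully invested: its weights add up to 1
rowSums(weights_matrix)

# The largest holding in each portfolio (first one wins in case of a tie)
ticker_symbols[apply(weights_matrix, 1, which.max)]&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Reading the matrix row by row gives the flat vector of weights that is stacked portfolio after portfolio, which is why the ticker order has to be the same in every row.&lt;/p&gt;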
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;ticker_symbols = c(&amp;quot;AAPL&amp;quot;, &amp;quot;GOOG&amp;quot;, &amp;quot;MSFT&amp;quot;, &amp;quot;NFLX&amp;quot;, &amp;quot;NVDA&amp;quot;)

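weights &amp;lt;- c(
    0.2, 0.2, 0.2, 0.2, 0.2,  # Naive: 20% in each stock
    0, 0.2, 0, 0, 0.8,        # Bitcoin mining: 20% GOOG, 80% NVDA
    0.1, 0.1, 0.1, 0.7, 0     # Binge TV watching: 70% NFLX, 10% each in AAPL, GOOG, MSFT
)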

weights_table &amp;lt;-  tibble(ticker_symbols) %&amp;gt;%
    tq_repeat_df(n = 3) %&amp;gt;%
    bind_cols(tibble(weights)) %&amp;gt;%
    group_by(portfolio)


stock_returns_monthly_multi &amp;lt;- tech_stock_returns_monthly %&amp;gt;%
    tq_repeat_df(n = 3)

# Calculate monthly returns for all portfolios
portfolio_returns_monthly_multi &amp;lt;- stock_returns_monthly_multi %&amp;gt;%
    tq_portfolio(assets_col   = symbol, 
                 returns_col  = monthly_return, 
                 weights      = weights_table, 
                 col_rename   = &amp;quot;portfolio_return&amp;quot;,
                 wealth.index = FALSE) 

# Calculate what an investment of 1000 will grow to 
portfolio_growth_monthly_multi &amp;lt;- stock_returns_monthly_multi %&amp;gt;%
    tq_portfolio(assets_col   = symbol, 
                 returns_col  = monthly_return, 
                 weights      = weights_table, 
                 col_rename   = &amp;quot;investment.growth&amp;quot;,
                 wealth.index = TRUE) %&amp;gt;%
    mutate(investment.growth = investment.growth * 1000)

portfolio_growth_monthly_multi %&amp;gt;%
  ggplot(aes(x = date, y = investment.growth, colour = as.factor(portfolio))) +
  geom_line(size = 2) +
  labs(title = &amp;quot;Portfolio Growth&amp;quot;,
       subtitle = &amp;quot;Comparing Multiple Portfolios&amp;quot;,
       x = &amp;quot;&amp;quot;, y = &amp;quot;Portfolio Value&amp;quot;,
       color = &amp;quot;Portfolio&amp;quot;) +
  theme_bw()+
  scale_y_continuous(labels = scales::dollar)+
  scale_colour_discrete(name=&amp;quot;Portfolio&amp;quot;,
                      labels=c(&amp;quot;Naive&amp;quot;, &amp;quot;Bitcoiners&amp;quot;, &amp;quot;Binge Watchers&amp;quot;))&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://usi-emba-analytics.netlify.app/reference/finance_data_files/figure-html/unnamed-chunk-4-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Returns a basic set of statistics that match the period of the data passed in (e.g., monthly returns 
# will get monthly statistics, daily will be daily stats, and so on).

portfolio_risk_return &amp;lt;- portfolio_returns_monthly_multi %&amp;gt;%
  tq_performance(Ra = portfolio_return, Rb = NULL, performance_fun = table.Stats) %&amp;gt;% 
  select(portfolio, ArithmeticMean, GeometricMean, Minimum,Maximum,Stdev, Quartile1, Quartile3) 

portfolio_risk_return %&amp;gt;% 
  kable() %&amp;gt;%
  kable_styling(c(&amp;quot;striped&amp;quot;, &amp;quot;bordered&amp;quot;)) &lt;/code&gt;&lt;/pre&gt;
&lt;table class=&#34;table table-striped table-bordered&#34; style=&#34;margin-left: auto; margin-right: auto;&#34;&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
portfolio
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
ArithmeticMean
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
GeometricMean
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
Minimum
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
Maximum
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
Stdev
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
Quartile1
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
Quartile3
&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
1
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.026
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.023
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
-0.176
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.232
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.069
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
-0.017
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.076
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
2
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.033
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.028
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
-0.242
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.408
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.104
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
-0.028
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.079
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
3
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.032
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.028
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
-0.232
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.360
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.098
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
-0.026
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.074
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;ggplot(portfolio_risk_return, 
       aes(x=Stdev, 
           y = ArithmeticMean,
           label= portfolio, 
           colour= as.factor(portfolio)))+
  geom_point(size = 4)+
  labs(title = &amp;#39;Risk/Return profile of the three portfolios&amp;#39;, 
       x = &amp;#39;Risk (stdev of monthly returns)&amp;#39;, 
       y =&amp;quot;Average monthly return&amp;quot;)+
  theme_bw()+
  scale_x_continuous(labels = scales::percent)+
  scale_y_continuous(labels = scales::percent)+
  scale_colour_discrete(name=&amp;quot;Portfolio&amp;quot;,
                      labels=c(&amp;quot;Naive&amp;quot;, &amp;quot;Bitcoiners&amp;quot;, &amp;quot;Binge Watchers&amp;quot;))+
  geom_text_repel()&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://usi-emba-analytics.netlify.app/reference/finance_data_files/figure-html/unnamed-chunk-4-2.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
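&lt;p&gt;One rough way to read this chart is to divide each portfolio’s average monthly return by its standard deviation, which gives the return earned per unit of risk. The sketch below retypes the rounded values from the table above into a small data frame; the names &lt;code&gt;risk_return_tbl&lt;/code&gt; and &lt;code&gt;return_per_risk&lt;/code&gt; are assumptions for illustration, and because the inputs are rounded to three decimals the ratios are only approximate. Unlike a Sharpe ratio, this ignores the risk-free rate.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Rounded values copied from the table above
risk_return_tbl &amp;lt;- data.frame(
  portfolio      = c(1, 2, 3),
  ArithmeticMean = c(0.026, 0.033, 0.032),
  Stdev          = c(0.069, 0.104, 0.098)
)

# Average monthly return earned per unit of monthly volatility
risk_return_tbl$return_per_risk &amp;lt;- risk_return_tbl$ArithmeticMean / risk_return_tbl$Stdev

# Sort from the highest to the lowest ratio
risk_return_tbl[order(-risk_return_tbl$return_per_risk), ]&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;On these rounded figures, portfolio 1 earns the most return per unit of risk even though portfolios 2 and 3 have higher average returns.&lt;/p&gt;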
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&#34;data-from-the-federal-reserve-economic-data-with-tidyquant&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Data from the &lt;em&gt;Federal Reserve Economic Data&lt;/em&gt; with &lt;code&gt;tidyquant&lt;/code&gt;&lt;/h2&gt;
&lt;p&gt;A lot of economic data can be extracted from the &lt;a href=&#34;https://fred.stlouisfed.org/categories&#34;&gt;Federal Reserve Economic Data (FRED)&lt;/a&gt; database. For each series we are interested in, we need its FRED symbol; for instance, if we care about &lt;a href=&#34;https://fred.stlouisfed.org/categories/32217&#34;&gt;commodities&lt;/a&gt;, we can select the &lt;a href=&#34;https://fred.stlouisfed.org/series/DHHNGSP&#34;&gt;Henry Hub Natural Gas Spot Price&lt;/a&gt; and see that its FRED symbol is &lt;code&gt;DHHNGSP&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;So, if we wanted to download this series, as well as the prices of WTI crude and gold and the USD/EUR exchange rate, we first identify the FRED codes, which are shown below:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://fred.stlouisfed.org/series/DHHNGSP&#34;&gt;Henry Hub Natural Gas Spot Price&lt;/a&gt;: &lt;code&gt;DHHNGSP&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://fred.stlouisfed.org/series/DCOILWTICO&#34;&gt;WTI Crude Oil Prices&lt;/a&gt;: &lt;code&gt;DCOILWTICO&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://fred.stlouisfed.org/series/GOLDAMGBD228NLBM&#34;&gt;Gold Fixing Price&lt;/a&gt;: &lt;code&gt;GOLDAMGBD228NLBM&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://fred.stlouisfed.org/series/DEXUSEU&#34;&gt;U.S. / Euro Exchange Rate&lt;/a&gt;: &lt;code&gt;DEXUSEU&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;We can then download each series with &lt;code&gt;tq_get()&lt;/code&gt; and plot it:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;natgas_spot  &amp;lt;-   tq_get(&amp;quot;DHHNGSP&amp;quot;, get = &amp;quot;economic.data&amp;quot;,
                       from = &amp;quot;2011-01-01&amp;quot;,
                       to   = &amp;quot;2020-07-31&amp;quot;)

ggplot(natgas_spot, aes(x=date, y=price)) +
  geom_line()+
  labs(x=&amp;quot;Year&amp;quot;, 
       y=&amp;quot;NatGas Spot price&amp;quot;, 
       title = &amp;quot;Henry Hub Natural Gas Spot Prices&amp;quot;)+
  scale_y_continuous(labels = scales::dollar)+
  guides(fill=guide_legend(title=NULL))+
  theme_bw()+
  NULL&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://usi-emba-analytics.netlify.app/reference/finance_data_files/figure-html/FRED_Data-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;wti_price  &amp;lt;-   tq_get(&amp;quot;DCOILWTICO&amp;quot;, get = &amp;quot;economic.data&amp;quot;,
                       from = &amp;quot;2011-01-01&amp;quot;,
                       to   = &amp;quot;2020-07-31&amp;quot;)

ggplot(wti_price, aes(x=date, y=price)) +
  geom_line()+
  labs(x=&amp;quot;Year&amp;quot;, 
       y=&amp;quot;WTI price&amp;quot;, 
       title = &amp;quot;West Texas Intermediate Crude Oil (WTI) Prices&amp;quot;)+
  scale_y_continuous(labels = scales::dollar)+
  guides(fill=guide_legend(title=NULL))+
  theme_bw()+
  NULL&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://usi-emba-analytics.netlify.app/reference/finance_data_files/figure-html/FRED_Data-2.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;gold_price  &amp;lt;-   tq_get(&amp;quot;GOLDAMGBD228NLBM&amp;quot;, get = &amp;quot;economic.data&amp;quot;,
                        from = &amp;quot;2011-01-01&amp;quot;,
                        to   = &amp;quot;2020-07-31&amp;quot;) 

ggplot(gold_price, aes(x=date, y=price)) +
  geom_line()+
  labs(x=&amp;quot;Year&amp;quot;, 
       y=&amp;quot;Gold price&amp;quot;, 
       title = &amp;quot;Gold Fixing Price 10:30 A.M. (London time) in London Bullion Market&amp;quot;)+
  scale_y_continuous(labels = scales::dollar)+
  guides(fill=guide_legend(title=NULL))+
  theme_bw()+
  NULL&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://usi-emba-analytics.netlify.app/reference/finance_data_files/figure-html/FRED_Data-3.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;USDEUR_rate &amp;lt;-   tq_get(&amp;quot;DEXUSEU&amp;quot;, get = &amp;quot;economic.data&amp;quot;,
                        from = &amp;quot;2011-01-01&amp;quot;,
                        to   = &amp;quot;2020-07-31&amp;quot;) 

ggplot(USDEUR_rate, aes(x=date, y=price)) +
  geom_line()+
  labs(x=&amp;quot;Year&amp;quot;, 
       y=&amp;quot;Exchange rate&amp;quot;, 
       title = &amp;quot;USD to EUR Exchange Rate&amp;quot;)+
  scale_y_continuous(labels = scales::dollar)+
  guides(fill=guide_legend(title=NULL))+
  theme_bw()+
  NULL&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://usi-emba-analytics.netlify.app/reference/finance_data_files/figure-html/FRED_Data-4.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;p&gt;Now suppose we wanted to check if there is any correlation between natgas spot prices, WTI, and Gold prices. We will download prices, then calculate returns, calculate statistics on daily returns, and visualise some of the returns.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;commodities &amp;lt;- c(&amp;quot;DHHNGSP&amp;quot;, &amp;quot;DCOILWTICO&amp;quot;, &amp;quot;GOLDAMGBD228NLBM&amp;quot;)

commodities_prices  &amp;lt;- tq_get(commodities, get = &amp;quot;economic.data&amp;quot;,
                              from = &amp;quot;2011-01-01&amp;quot;,
                              to   = &amp;quot;2020-07-31&amp;quot;) %&amp;gt;% 
  group_by(symbol) 


commodities_returns_daily &amp;lt;- commodities_prices %&amp;gt;% na.omit() %&amp;gt;% 
  tq_transmute(select     = price, 
               mutate_fun = periodReturn, 
               period     = &amp;quot;daily&amp;quot;, 
               type       = &amp;quot;log&amp;quot;,
               col_rename = &amp;quot;daily.returns&amp;quot;)  

# Calculate monthly returns
commodities_returns_monthly &amp;lt;- commodities_prices %&amp;gt;%
  tq_transmute(select     = price, 
               mutate_fun = periodReturn, 
               period     = &amp;quot;monthly&amp;quot;, 
               type       = &amp;quot;arithmetic&amp;quot;,
               col_rename = &amp;quot;monthly.returns&amp;quot;) 

favstats(daily.returns ~ symbol,  data=commodities_returns_daily) %&amp;gt;% 
  mutate(
    annual_mean = mean *250,
    annual_sd = sd * sqrt(250)
  ) %&amp;gt;% 
  select(symbol, min, median, max, mean, sd, annual_mean, annual_sd)  %&amp;gt;% 
  kable() %&amp;gt;%
  kable_styling(c(&amp;quot;striped&amp;quot;, &amp;quot;bordered&amp;quot;)) &lt;/code&gt;&lt;/pre&gt;
&lt;table class=&#34;table table-striped table-bordered&#34; style=&#34;margin-left: auto; margin-right: auto;&#34;&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th style=&#34;text-align:left;&#34;&gt;
symbol
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
min
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
median
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
max
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
mean
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
sd
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
annual_mean
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
annual_sd
&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
DCOILWTICO
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
-0.281
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.001
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.426
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.030
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
-0.008
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.474
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
DHHNGSP
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
-0.476
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.000
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.525
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.042
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
-0.093
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.668
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
GOLDAMGBD228NLBM
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
-0.089
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.000
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.068
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.010
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.034
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.154
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
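&lt;p&gt;The table relies on two conventions: the daily returns are log returns, and they are annualised by assuming 250 trading days in a year. The toy example below uses made-up prices (the vector &lt;code&gt;toy_prices&lt;/code&gt; is purely illustrative) to show how log and arithmetic returns differ and how the scaling works.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Made-up prices, for illustration only
toy_prices &amp;lt;- c(100, 102, 101, 105)

# Arithmetic (simple) returns: P_t / P_{t-1} - 1
arithmetic_returns &amp;lt;- toy_prices[-1] / toy_prices[-length(toy_prices)] - 1

# Log returns: log(P_t) - log(P_{t-1})
log_returns &amp;lt;- diff(log(toy_prices))

# Log returns add up over time to the log of the total price change
sum(log_returns)
log(105 / 100)

# Annualising, assuming 250 trading days per year
trading_days &amp;lt;- 250
annual_mean &amp;lt;- mean(log_returns) * trading_days
annual_sd   &amp;lt;- sd(log_returns) * sqrt(trading_days)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The mean scales with the number of days and the standard deviation with its square root, which is only exact if daily returns are independent with constant variance. As a check against the table, a daily standard deviation of 0.030 multiplied by the square root of 250 gives roughly 0.474, the annualised figure shown for &lt;code&gt;DCOILWTICO&lt;/code&gt;.&lt;/p&gt;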
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;ggplot(commodities_returns_daily, aes(x=daily.returns, fill=symbol))+
  geom_density()+
  coord_cartesian(xlim=c(-0.05,0.05)) + 
  scale_x_continuous(labels = scales::percent_format(accuracy = 2))+
  facet_grid(rows = (vars(symbol))) + 
  theme_bw()+
  labs(x=&amp;quot;Daily Returns&amp;quot;, 
       y=&amp;quot;Density&amp;quot;, 
       title = &amp;quot;Charting the Distribution of Daily Log Returns&amp;quot;)+
  guides(fill=&amp;quot;none&amp;quot;) +
  NULL&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://usi-emba-analytics.netlify.app/reference/finance_data_files/figure-html/returns_and_stats-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;ggplot(commodities_returns_daily, aes(x=symbol, y=daily.returns))+
  geom_boxplot(aes(colour=symbol))+
  coord_flip()+
  scale_y_continuous(labels = scales::percent_format(accuracy = 2))+
  theme_bw()+
  labs(x=&amp;quot;&amp;quot;, 
       y=&amp;quot;Daily Returns&amp;quot;, 
       title = &amp;quot;Boxplot of Daily Log Returns&amp;quot;)+
  theme(legend.position=&amp;quot;none&amp;quot;) +
  NULL&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://usi-emba-analytics.netlify.app/reference/finance_data_files/figure-html/returns_and_stats-2.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;commodities_returns_daily %&amp;gt;% 
  pivot_wider(names_from=&amp;quot;symbol&amp;quot;, values_from=&amp;quot;daily.returns&amp;quot;) %&amp;gt;% 
  na.omit() %&amp;gt;% 
  select(-date) %&amp;gt;% 
  dplyr::rename(
    &amp;quot;NatGas&amp;quot; = &amp;#39;DHHNGSP&amp;#39;,
    &amp;quot;WTI Oil&amp;quot; = &amp;#39;DCOILWTICO&amp;#39;,
    &amp;quot;Gold&amp;quot; = &amp;#39;GOLDAMGBD228NLBM&amp;#39;
  ) %&amp;gt;% 
  ggpairs()+
  theme_bw()&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://usi-emba-analytics.netlify.app/reference/finance_data_files/figure-html/returns_and_stats-3.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
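&lt;p&gt;With its default settings, the pairs plot reports the pairwise correlations in its upper panels. If only the numbers are needed, &lt;code&gt;cor()&lt;/code&gt; applied to a wide data frame of returns gives the correlation matrix directly. The sketch below uses simulated returns: the data frame &lt;code&gt;sim_wide&lt;/code&gt; and the way its columns are generated are assumptions standing in for the reshaped data above, not FRED data.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Simulated daily returns; an illustrative stand-in, not FRED data
set.seed(123)
n_days &amp;lt;- 500
oil &amp;lt;- rnorm(n_days, mean = 0, sd = 0.03)
sim_wide &amp;lt;- data.frame(
  NatGas = 0.2 * oil + rnorm(n_days, mean = 0, sd = 0.04),
  WTI    = oil,
  Gold   = rnorm(n_days, mean = 0, sd = 0.01)
)

# Pairwise Pearson correlations, rounded for display
cor_matrix &amp;lt;- cor(sim_wide)
round(cor_matrix, 2)&lt;/code&gt;&lt;/pre&gt;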
&lt;/div&gt;
&lt;div id=&#34;acknowledgments&#34; class=&#34;section level2 toc-ignore&#34;&gt;
&lt;h2&gt;Acknowledgments&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;This page is derived in part from &lt;a href=&#34;https://cran.r-project.org/web/packages/tidyquant/vignettes/TQ05-performance-analysis-with-tidyquant.html&#34;&gt;Performance Analytics with &lt;code&gt;tidyquant&lt;/code&gt;&lt;/a&gt; by Matt Dancho.&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
      <title>Installing R and RStudio</title>
      <link>https://usi-emba-analytics.netlify.app/reference/01-reference/</link>
      <pubDate>Sat, 25 Jul 2020 00:00:00 +0000</pubDate>
      <guid>https://usi-emba-analytics.netlify.app/reference/01-reference/</guid>
      <description>

&lt;div id=&#34;TOC&#34;&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#installing-r-rstudio&#34;&gt;Installing R &amp;amp; RStudio&lt;/a&gt;&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#install-xcode-if-you-have-a-mac&#34;&gt;Install &lt;code&gt;XCode&lt;/code&gt; if you have a Mac&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#install-r&#34;&gt;Install R&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#install-rstudio-ide&#34;&gt;Install RStudio IDE&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#change-character-encoding-to-utf-8-and-utf-8-only&#34;&gt;Change character encoding to &lt;code&gt;UTF-8&lt;/code&gt;, and &lt;code&gt;UTF-8&lt;/code&gt; only&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#exiting-r-rstudio&#34;&gt;Exiting R &amp;amp; RStudio&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#updating-r-and-rstudio&#34;&gt;Updating R and RStudio&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#r-commands&#34;&gt;R commands&lt;/a&gt;&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#assignmnent-operator--&#34;&gt;Assignment Operator &lt;code&gt;&amp;lt;-&lt;/code&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#r-is-case-sensitive&#34;&gt;R is case sensitive&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#typos&#34;&gt;Typos&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#comments&#34;&gt;Comments&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#r-knows-youre-not-finished&#34;&gt;R knows you’re not finished&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#arithmetic-operations-and-functions&#34;&gt;Arithmetic Operations and Functions&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#rstudio-help&#34;&gt;RStudio help&lt;/a&gt;&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#tab-autocomplete&#34;&gt;Tab autocomplete&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#the-history-pane&#34;&gt;The history pane&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#more-resources&#34;&gt;More resources&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#acknowledgements&#34;&gt;Acknowledgements&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;

&lt;p&gt;In this section we download and install R and RStudio, and then show you how to write R commands and navigate around the RStudio interface. The goal in this section is not to learn any statistical or programming concepts: we’re just trying to learn how R works and get comfortable interacting with the system. We’ll spend a bit of time using R as a simple calculator. Specifically, we will learn the basics of R and RStudio, namely:&lt;/p&gt;
&lt;ol style=&#34;list-style-type: decimal&#34;&gt;
&lt;li&gt;How to install R and RStudio&lt;/li&gt;
&lt;li&gt;How to navigate around the RStudio interface, a free Integrated Development Environment (IDE) for R&lt;/li&gt;
&lt;li&gt;How to install and load packages that provide extra functionality for R&lt;/li&gt;
&lt;/ol&gt;
&lt;div id=&#34;installing-r-rstudio&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Installing R &amp;amp; RStudio&lt;/h2&gt;
&lt;p&gt;An important distinction to remember is between the R &lt;em&gt;programming language&lt;/em&gt; itself, and the software you use to interact with R. You could choose to interact with R directly from the terminal, but that’s painful, so most people use an &lt;em&gt;integrated development environment&lt;/em&gt; (IDE), which takes care of a lot of boring tasks for you. To get started, make sure you have both R and RStudio installed on your computer. Both are free and open source, and for most people they should be straightforward to install.&lt;/p&gt;
&lt;div id=&#34;install-xcode-if-you-have-a-mac&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Install &lt;code&gt;XCode&lt;/code&gt; if you have a Mac&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;If you have a Mac&lt;/strong&gt;, make sure that before installing R and RStudio you&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;upgrade to the latest version of &lt;a href=&#34;https://www.apple.com/macos/how-to-upgrade/&#34;&gt;macOS&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;install &lt;code&gt;XCode&lt;/code&gt; through the App Store&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;img src=&#34;https://usi-emba-analytics.netlify.app/img/install-xcode-3.png&#34; width=&#34;90%&#34; /&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;install-r&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Install R&lt;/h3&gt;
&lt;p&gt;First you need to install R itself (the engine). Go to &lt;a href=&#34;https://cran.r-project.org/&#34;&gt;CRAN (the Comprehensive R Archive Network)&lt;/a&gt;, the site where R itself and most R packages live. Click on “Download R for XXX”, where XXX is either Mac or Windows:&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://usi-emba-analytics.netlify.app/img/installR.png&#34; width=&#34;90%&#34; /&gt;&lt;/p&gt;
&lt;p&gt;Double click on the downloaded file. Click &lt;strong&gt;Yes&lt;/strong&gt; through all the prompts to install it like any other program. Once finished, proceed to install RStudio.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;install-rstudio-ide&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Install RStudio IDE&lt;/h3&gt;
&lt;p&gt;Go to the &lt;a href=&#34;https://www.rstudio.com/&#34;&gt;RStudio&lt;/a&gt; website and follow the links to download. RStudio is a powerful user interface for programming in R. I suggest you install the &lt;a href=&#34;https://www.rstudio.com/products/rstudio/download/preview/&#34;&gt;preview version of RStudio&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;To get started, open the &lt;strong&gt;RStudio&lt;/strong&gt; application (i.e., RStudio.exe or RStudio.app), not the plain R application (i.e., not R.exe or R.app). You should be looking at something like this:&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://usi-emba-analytics.netlify.app/img/rstudio_start.png&#34; width=&#34;90%&#34; /&gt;&lt;/p&gt;
&lt;p&gt;The RStudio IDE is divided into 4 separate panes (one of which is hidden for now) which all serve specific functions. The &lt;em&gt;Console&lt;/em&gt; starts with information about the R version number, license and contributors. The last line is a standard prompt &lt;code&gt;&amp;gt;&lt;/code&gt; that indicates R is ready and expecting instructions to do something.&lt;/p&gt;
&lt;p&gt;You edit scripts in the &lt;em&gt;editor&lt;/em&gt; panel in RStudio and see results in the bottom right &lt;em&gt;output&lt;/em&gt; panel.&lt;/p&gt;
&lt;center&gt;
&lt;img src=&#34;https://r4ds.had.co.nz/diagrams/rstudio-editor.png&#34; /&gt;
&lt;/center&gt;
&lt;p&gt;For now, to make sure R and RStudio are set up correctly, type &lt;code&gt;x &amp;lt;- 3 + 2&lt;/code&gt; into the &lt;em&gt;Console&lt;/em&gt; pane and execute it by pressing Enter/Return. You just created an object in R called &lt;code&gt;x&lt;/code&gt;. What does this object contain? Type &lt;code&gt;print(x)&lt;/code&gt; or just &lt;code&gt;x&lt;/code&gt; into the console and press Enter again. Your console should now contain the following output:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;x &amp;lt;- 3 + 2
print(x)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] 5&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Congratulations! You installed R and RStudio successfully, created an object &lt;code&gt;x&lt;/code&gt; to which you assigned the value of &lt;code&gt;3 + 2&lt;/code&gt;, and managed to print the value of &lt;code&gt;x&lt;/code&gt;.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;change-character-encoding-to-utf-8-and-utf-8-only&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Change character encoding to &lt;code&gt;UTF-8&lt;/code&gt;, and &lt;code&gt;UTF-8&lt;/code&gt; only&lt;/h3&gt;
&lt;p&gt;This may seem like an overly technical issue, but please bear with me. Since LBS is a very international school, we always seem to have issues with the language, or character encoding (Chinese, Arabic, Greek, Cyrillic, Hebrew, Thai, French, German, etc.), that people use on their computers. By default, all base R functions use the system’s native encoding, which depends on the language settings of each computer. Chinese and Greek users, whose languages use a completely different alphabet, typically report errors related to character encodings.&lt;/p&gt;
&lt;p&gt;UTF-8 is the most portable character encoding: it &lt;a href=&#34;http://utf8everywhere.org/&#34;&gt;works everywhere&lt;/a&gt;, so we shall ask RStudio to use UTF-8 encoding globally. Please go to &lt;code&gt;Tools&lt;/code&gt;… &lt;code&gt;Global Options&lt;/code&gt;… &lt;code&gt;Code&lt;/code&gt;… &lt;code&gt;Saving&lt;/code&gt; and change the default text encoding to UTF-8 as shown below:&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://usi-emba-analytics.netlify.app/img/utf8.png&#34; width=&#34;90%&#34; /&gt;&lt;/p&gt;
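&lt;p&gt;To check from the console whether the current R session runs in a UTF-8 locale, and whether a given string is valid UTF-8, base R offers a few helpers. This is a minimal sketch; the variable &lt;code&gt;city_name&lt;/code&gt; is an illustrative assumption, and the locale reported will differ from one computer to another.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# TRUE if the current session uses a UTF-8 locale
l10n_info()[[&amp;quot;UTF-8&amp;quot;]]

# The character-type locale R is currently using
Sys.getlocale(&amp;quot;LC_CTYPE&amp;quot;)

# A Greek word written with Unicode escapes, so the script itself stays plain ASCII
city_name &amp;lt;- &amp;quot;\u0391\u03b8\u03ae\u03bd\u03b1&amp;quot;

# Check that the string is valid UTF-8 and see how R has marked its encoding
validUTF8(city_name)
Encoding(city_name)&lt;/code&gt;&lt;/pre&gt;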
&lt;/div&gt;
&lt;div id=&#34;exiting-r-rstudio&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Exiting R &amp;amp; RStudio&lt;/h3&gt;
&lt;p&gt;When quitting RStudio you will be asked whether to &lt;code&gt;Save workspace&lt;/code&gt; with two options:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;Yes&lt;/code&gt; - Your current R workspace (containing the work that you have done) will be restored next time you open RStudio.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;No&lt;/code&gt; - You will start with a fresh R session next time you open RStudio. For now select “No” to prevent errors being carried over from previous sessions.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In general, it’s good practice to always start with a fresh new session. If you want to do that, please go to &lt;code&gt;Tools&lt;/code&gt;… &lt;code&gt;Global Options&lt;/code&gt; and make sure that&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;em&gt;Restore .RData into workspace at startup&lt;/em&gt; is &lt;strong&gt;NOT&lt;/strong&gt; ticked&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Save workspace to .RData on exit:&lt;/em&gt; select &lt;strong&gt;Never&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Always save history (even when not saving .RData)&lt;/em&gt; is &lt;strong&gt;NOT&lt;/strong&gt; ticked&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;as shown below&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://usi-emba-analytics.netlify.app/img/rstudio_preferences.png&#34; width=&#34;90%&#34; /&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&#34;updating-r-and-rstudio&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Updating R and RStudio&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;If you already installed R or RStudio for a previous course, update both to the most current version. Generally this entails downloading and installing the most recent version of both programs. When you update R, you don’t actually remove the old version; you have all versions on your computer and default to the most recent one. This is sometimes useful when specific R packages require an older version of R; however, we will generally stick to the most recent versions of R and RStudio.&lt;/li&gt;
&lt;li&gt;When you update R, make sure to update your packages as well. The command &lt;code&gt;update.packages(ask = FALSE, checkBuilt = TRUE)&lt;/code&gt; should perform most of this work; alternatively, you can go through the &lt;code&gt;Packages&lt;/code&gt; tab in the bottom right panel of RStudio.&lt;/li&gt;
&lt;/ul&gt;
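&lt;p&gt;Before and after updating, it can be useful to confirm which versions are actually in use. The short sketch below relies on base R only; the &lt;code&gt;4.0.0&lt;/code&gt; threshold is an arbitrary example, not a course requirement.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Version of R running in this session
R.version.string

# Compare the running version against a minimum (example threshold only)
getRversion() &amp;gt;= &amp;quot;4.0.0&amp;quot;

# Version of an installed package; stats ships with every R installation
packageVersion(&amp;quot;stats&amp;quot;)

# Folders where R looks for installed packages
.libPaths()&lt;/code&gt;&lt;/pre&gt;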
&lt;/div&gt;
&lt;div id=&#34;r-commands&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;R commands&lt;/h2&gt;
&lt;p&gt;We have already seen how we can type commands in the command prompt and use R as a simple calculator. For instance, try typing &lt;code&gt;5 + 20&lt;/code&gt;, and hitting enter. When you do this, you’ve entered a command, and R will &lt;strong&gt;execute&lt;/strong&gt; that command. What you see on screen now will be this:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;5 + 20&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] 25&lt;/code&gt;&lt;/pre&gt;
&lt;div id=&#34;assignmnent-operator--&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Assignment Operator &lt;code&gt;&amp;lt;-&lt;/code&gt;&lt;/h3&gt;
&lt;p&gt;R treats everything (single numbers, lists, vectors, datasets) as &lt;strong&gt;objects&lt;/strong&gt;. To create an object, we use the assignment operator &lt;code&gt;&amp;lt;-&lt;/code&gt;. For instance, if we had data on a student whose name is Alex, who is 28 years old and comes from Athens, we would create three objects, &lt;code&gt;name&lt;/code&gt;, &lt;code&gt;age&lt;/code&gt;, and &lt;code&gt;city&lt;/code&gt;, and assign them the values &lt;code&gt;Alex&lt;/code&gt;, &lt;code&gt;28&lt;/code&gt;, and &lt;code&gt;Athens&lt;/code&gt; respectively by typing:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;name &amp;lt;- &amp;quot;Alex&amp;quot;
age &amp;lt;- 28
city &amp;lt;- &amp;quot;Athens&amp;quot;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The three objects have now been created; if we want to print out their values, we can use the &lt;code&gt;print()&lt;/code&gt; function or just type the names of the objects.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;print(name); print(age); print(city)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] &amp;quot;Alex&amp;quot;&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] 28&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] &amp;quot;Athens&amp;quot;&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;name&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] &amp;quot;Alex&amp;quot;&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;age&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] 28&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;city&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] &amp;quot;Athens&amp;quot;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You can mentally read the command &lt;code&gt;age &amp;lt;- 28&lt;/code&gt; as &lt;em&gt;object &lt;code&gt;age&lt;/code&gt; becomes equal to the value 28&lt;/em&gt;. In RStudio, the keyboard shortcut &lt;code&gt;Alt + -&lt;/code&gt; inserts the assignment operator. We can do more interesting and useful things by creating variables and assigning values to them. For instance, if we have the relevant dimensions and want to calculate the area and volume of a room, we could do it as follows:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;room_length &amp;lt;- 5.63
room_width  &amp;lt;- 6.48
room_height &amp;lt;- 2.93
room_area &amp;lt;- room_length * room_width
room_volume &amp;lt;- room_length * room_width * room_height

room_area&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] 36.4824&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;room_volume&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] 106.8934&lt;/code&gt;&lt;/pre&gt;
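&lt;p&gt;One point worth illustrating: an object stores the value it was given at the moment of assignment and is not recalculated when the objects used to compute it change later. The short sketch below reuses the room dimensions from above; the new length of &lt;code&gt;6&lt;/code&gt; is an arbitrary value chosen for illustration.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;room_length &amp;lt;- 5.63
room_width  &amp;lt;- 6.48
room_area   &amp;lt;- room_length * room_width

# Overwrite room_length with a new value
room_length &amp;lt;- 6

# room_area still holds the old result, 36.4824
room_area

# Run the assignment again to recalculate with the new length
room_area &amp;lt;- room_length * room_width
room_area   # now 38.88&lt;/code&gt;&lt;/pre&gt;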
&lt;/div&gt;
&lt;div id=&#34;r-is-case-sensitive&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;R is case sensitive&lt;/h3&gt;
&lt;p&gt;R is case sensitive and needs everything exactly as it was defined. &lt;code&gt;age&lt;/code&gt; is different from &lt;code&gt;AgE&lt;/code&gt; and &lt;code&gt;Age&lt;/code&gt;. So if you type&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;age &amp;lt;- 28
AgE &amp;lt;- 34
Age &amp;lt;- 55

age; AgE; Age&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] 28&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] 34&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] 55&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;R will create three different objects.&lt;/p&gt;
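&lt;p&gt;The flip side is that asking for a name with the wrong capitalisation fails, typically with an &lt;em&gt;object not found&lt;/em&gt; error. As a small sketch, the base R function &lt;code&gt;exists()&lt;/code&gt; can check whether a name is defined; the example assumes a fresh session in which no object called &lt;code&gt;Room_Length&lt;/code&gt; has been created.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;room_length &amp;lt;- 5.63

# TRUE: this exact name was defined above
exists(&amp;quot;room_length&amp;quot;)

# FALSE (assuming a fresh session): different capitalisation means a different name
exists(&amp;quot;Room_Length&amp;quot;)&lt;/code&gt;&lt;/pre&gt;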
&lt;/div&gt;
&lt;div id=&#34;typos&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Typos&lt;/h3&gt;
&lt;p&gt;R is a brilliant piece of software, but it cannot handle typos. Unlike Google’s search, &lt;em&gt;“Did you mean…”&lt;/em&gt;, it takes it on faith that what you typed is &lt;strong&gt;exactly&lt;/strong&gt; what you meant. For example, suppose that you forgot to hit the shift key when trying to type &lt;code&gt;+&lt;/code&gt;, and as a result your command ended up being &lt;code&gt;5 = 20&lt;/code&gt; rather than &lt;code&gt;5 + 20&lt;/code&gt;. Here’s what happens:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;5 = 20&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Error in 5 = 20: invalid (do_set) left-hand side to assignment&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;R attempts to interpret &lt;code&gt;5 = 20&lt;/code&gt; as a command, and spits out an error message because this makes no sense to it. Even more subtle is the fact that some typos won’t produce errors at all, because they happen to correspond to valid R commands. For instance, suppose that instead of &lt;code&gt;5 + 20&lt;/code&gt;, you mistakenly type the command &lt;code&gt;5 - 20&lt;/code&gt;. Clearly, R has no way of knowing that you meant to add &lt;code&gt;20&lt;/code&gt; to &lt;code&gt;5&lt;/code&gt;, not subtract &lt;code&gt;20&lt;/code&gt; from &lt;code&gt;5&lt;/code&gt;, so what happens this time is this:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;5 - 20&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] -15&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In this case, R produces the right answer, but to the wrong question.&lt;/p&gt;
&lt;p&gt;R will always try to do exactly what you ask it to do. There is no autocorrect or equivalent to “Did you mean…” in R, and for good reason. When doing advanced stuff (and even the simplest of statistics is pretty advanced in a lot of ways), it’s dangerous to let a mindless automaton like R try to overrule the human user. But because of this, it’s your responsibility to be careful. Always make sure you type exactly what you mean. When dealing with computers, it’s not enough to type approximately the right thing. In general, you absolutely must be precise in what you say to R: like all machines, it is too stupid to be anything other than absurdly literal in its interpretation.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;comments&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Comments&lt;/h3&gt;
&lt;p&gt;It is useful to put comments in your code, to make everything more readable. These comments could help others and you when you go back to your code in the future. R comments start with a hashtag sign &lt;code&gt;#&lt;/code&gt;. Everything after the hashtag to the end of the line will be ignored by R. RStudio by default thinks that every line you write is a command; if you want to turn a line into a comment, place the cursor in the line and hit &lt;code&gt;Ctrl + Shift + C&lt;/code&gt; in Windows or &lt;code&gt;Cmd + Shift + C&lt;/code&gt; in a Mac.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# This line is a comment and will be ignored when run.
city # Text after the hashtag &amp;quot;#&amp;quot; is also ignored.&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] &amp;quot;Athens&amp;quot;&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;div id=&#34;r-knows-youre-not-finished&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;R knows you’re not finished&lt;/h3&gt;
&lt;p&gt;If you hit enter in a situation where it’s obvious to R that you haven’t actually finished typing the command, R is just smart enough to keep waiting. For example, if you wanted to calculate &lt;code&gt;15 - 4&lt;/code&gt;, and start by typing &lt;code&gt;15 -&lt;/code&gt; and then press enter by mistake, R is smart enough to realise that you probably wanted to type in another number. So here’s what happens:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;&amp;gt; 15 -
+&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;and there’s a blinking cursor next to the plus &lt;code&gt;+&lt;/code&gt; sign. What this means is that R is still waiting for you to finish. It thinks you’re still typing your command, so it hasn’t tried to execute it yet. In other words, this plus sign is actually another command prompt. It’s different from the usual one (i.e., the &lt;code&gt;&amp;gt;&lt;/code&gt; symbol) to remind you that R is going to add whatever you type now to what you typed last time. For example, if you then go on to type &lt;code&gt;4&lt;/code&gt; and hit enter, here is what you get:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;&amp;gt; 15 -
+ 4
[1] 11&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;And as far as R is concerned, this is exactly the same as if you had typed &lt;code&gt;15 - 4&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;By the way, if after entering the &lt;code&gt;15 -&lt;/code&gt; you wanted to stop execution and cancel your command, just hit the &lt;strong&gt;escape&lt;/strong&gt; key. R will return you to the normal command prompt (i.e. &lt;code&gt;&amp;gt;&lt;/code&gt;) without attempting to execute the botched command.&lt;/p&gt;
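&lt;p&gt;The same rule applies to commands written in a script: a line that ends with an operator is incomplete, so R reads on to the next line. The sketch below illustrates this with two hypothetical objects, &lt;code&gt;continued&lt;/code&gt; and &lt;code&gt;not_continued&lt;/code&gt;, and shows what happens when the operator is placed at the start of the next line instead.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# The first line ends with an operator, so R keeps reading: this is 15 - 4
continued &amp;lt;- 15 -
  4
continued       # 11

# Here the first line is already complete, so R assigns 15 and stops;
# the second line is then run as a separate command that evaluates to -4
not_continued &amp;lt;- 15
  - 4
not_continued   # 15&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;When you split a long calculation over several lines, ending each unfinished line with the operator avoids this silent mistake.&lt;/p&gt;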
&lt;/div&gt;
&lt;div id=&#34;arithmetic-operations-and-functions&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Arithmetic Operations and Functions&lt;/h3&gt;
&lt;p&gt;R has the basic operators and you can use it as a simple calculator: addition is &lt;code&gt;+&lt;/code&gt;, subtraction is &lt;code&gt;-&lt;/code&gt;, multiplication is &lt;code&gt;*&lt;/code&gt;, division is &lt;code&gt;/&lt;/code&gt;, and &lt;code&gt;^&lt;/code&gt; is the power operator:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;2 + 3 &lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] 5&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;5 - 8&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] -3&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;13 * 21&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] 273&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;34 / 55&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] 0.6181818&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;(5 * 13)/4 - 7&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] 9.25&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# ^ : to the power of
2^3&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] 8&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# square root
sqrt(25)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] 5&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Besides the basic operations, you can use standard mathematical functions:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Rounding: &lt;code&gt;round()&lt;/code&gt;, &lt;code&gt;floor()&lt;/code&gt;, &lt;code&gt;ceiling()&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Logarithms and exponentials: &lt;code&gt;exp()&lt;/code&gt;, &lt;code&gt;log()&lt;/code&gt;, &lt;code&gt;log10()&lt;/code&gt;, &lt;code&gt;log2()&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# R knows pi = 3.1415926...

# round to 2 decimal places 
round(pi, digits = 2); round(pi,2)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] 3.14&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] 3.14&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Round down to nearest integer
floor(pi)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] 3&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Round up to nearest integer
ceiling(pi)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] 4&lt;/code&gt;&lt;/pre&gt;
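&lt;p&gt;The logarithm and exponential functions listed above work the same way. As a brief sketch, using input values chosen only because their results are easy to check by hand:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# exp(x) raises the constant e to the power x
exp(0)               # 1
exp(1)               # the constant e, approximately 2.718282

# log() is the natural logarithm, so it undoes exp()
log(exp(1))          # 1

# Base-10 and base-2 logarithms
log10(1000)          # 3
log2(8)              # 3

# log() also accepts a base argument
log(100, base = 10)  # 2&lt;/code&gt;&lt;/pre&gt;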
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&#34;rstudio-help&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;RStudio help&lt;/h2&gt;
&lt;p&gt;At this stage you know how to type in basic commands, including how to use some R functions. Few analysts try to remember all the commands. What they really do is use tricks to make their lives easier. The first (and arguably most important one) is to use the internet. If you don’t know how a particular R function works, Google it. There is a lot of R documentation out there, and almost all of it is searchable! For the moment though, I want to call your attention to a couple of simple tricks that RStudio makes available to you.&lt;/p&gt;
&lt;div id=&#34;tab-autocomplete&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Tab autocomplete&lt;/h3&gt;
&lt;p&gt;The first thing I want to call your attention to is the &lt;em&gt;autocomplete&lt;/em&gt; ability in RStudio. Assume that what you want to do is to round a number. This time around, start typing the name of the function that you want, and then hit the &lt;code&gt;Tab&lt;/code&gt; key. RStudio will then display a little window like the one shown here:&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://usi-emba-analytics.netlify.app/img/rstudio_autocomplete.png&#34; width=&#34;90%&#34; /&gt;&lt;/p&gt;
&lt;p&gt;In this figure, we have typed the letters &lt;code&gt;rou&lt;/code&gt; at the command line, and then hit tab. The window has two panels. On the left, there’s a list of variables and functions that start with the letters typed, shown in black text, and some grey text that tells you where that variable/function is stored. In our case, &lt;code&gt;round&lt;/code&gt; is part of &lt;code&gt;{base}&lt;/code&gt; R, which is included in every new installation of R. There are a few options there, and the one we want is &lt;code&gt;round&lt;/code&gt;, but if you’re typing this yourself you’ll notice that when you hit the tab key the window pops up with the top entry highlighted. You can use the up and down arrow keys to select the one that you want. Or, if none of the options look right to you, you can hit the escape key (&lt;code&gt;ESC&lt;/code&gt;) or the left arrow key to make the window go away.&lt;/p&gt;
&lt;p&gt;In our case, the thing we want is the &lt;code&gt;round&lt;/code&gt; option, and the panel on the right tells you a bit about how the function works. This display is really handy. The very first thing it says is &lt;code&gt;round(x, digits = 0)&lt;/code&gt;: what this is telling you is that the &lt;code&gt;round&lt;/code&gt; function has two arguments. The first argument is called &lt;code&gt;x&lt;/code&gt;, and it doesn’t have a default value. The second argument is &lt;code&gt;digits&lt;/code&gt;, and it has a default value of &lt;code&gt;0&lt;/code&gt;. In a lot of situations, that’s all the information you need. But RStudio goes a bit further, and provides some additional information about the function underneath. Sometimes that additional information is very helpful, sometimes it’s not: RStudio pulls that text from the R help documentation, and my experience is that the helpfulness of that documentation varies wildly. Anyway, if you’ve decided that &lt;code&gt;round&lt;/code&gt; is the function that you want to use, you can hit the enter key and RStudio will finish typing the rest of the function name for you.&lt;/p&gt;
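&lt;p&gt;To see what a default value means in practice, here is a minimal sketch that calls &lt;code&gt;round&lt;/code&gt; with and without the &lt;code&gt;digits&lt;/code&gt; argument. The choice of &lt;code&gt;pi&lt;/code&gt; and of three decimal places is arbitrary.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# digits is not supplied, so its default value of 0 is used
round(pi)              # 3

# Supply digits by name to override the default
round(pi, digits = 3)  # 3.142

# Arguments can also be matched by position: x first, digits second
round(pi, 3)           # 3.142

# To open the full help page for a function, type a question mark before its name:
# ?round&lt;/code&gt;&lt;/pre&gt;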
&lt;/div&gt;
&lt;div id=&#34;the-history-pane&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;The history pane&lt;/h3&gt;
&lt;p&gt;One thing R does is keep track of your &lt;strong&gt;command history&lt;/strong&gt;, i.e., it remembers all the commands previously typed. You can access this history in a few different ways. The simplest way is to use the up and down arrow keys. If you hit the up key, the R console will show you the most recent command that you’ve typed. Hit it again, and it will show you the command before that. If you want the text on the screen to go away, hit escape. Using the up and down keys can be handy if you’ve typed a long command that had one typo in it. Rather than having to type it again from scratch, you can use the up key to bring up the command and fix it.&lt;/p&gt;
&lt;p&gt;The second way to get access to your command history is to look at the history panel in RStudio. On the upper right panel of the RStudio window, you’ll see a tab labelled &lt;strong&gt;History&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://usi-emba-analytics.netlify.app/img/rstudio_editor.png&#34; width=&#34;90%&#34; /&gt;&lt;/p&gt;
&lt;p&gt;Click on that, and you’ll see a list of all your recent commands displayed in that panel. Double-click on one of the commands, and it will be copied to the R console. You can achieve the same result by selecting the command you want with the mouse and then clicking the &lt;strong&gt;To Console&lt;/strong&gt; button.&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&#34;more-resources&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;More resources&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;RStudio has produced a great series of video tutorials: &lt;a href=&#34;https://resources.rstudio.com/wistia-rstudio-essentials-2/rstudioessentialsprogrammingpart1-2&#34;&gt;RStudio Essentials Videos&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://www.rstudio.com/resources/cheatsheets/#ide&#34;&gt;RStudio IDE Cheatsheet&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;div id=&#34;acknowledgements&#34; class=&#34;section level2 toc-ignore&#34;&gt;
&lt;h2&gt;Acknowledgements&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;This page is derived in part from &lt;a href=&#34;https://psyr.org/index.html&#34;&gt;“R for Psychological Science”&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;br&gt;
&lt;br&gt;&lt;/p&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
      <title>Textbooks and other resources</title>
      <link>https://usi-emba-analytics.netlify.app/reference/05-reference/</link>
      <pubDate>Sat, 25 Jul 2020 00:00:00 +0000</pubDate>
      <guid>https://usi-emba-analytics.netlify.app/reference/05-reference/</guid>
      <description>

&lt;div id=&#34;TOC&#34;&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#textbooksreadings&#34;&gt;Textbooks/Readings&lt;/a&gt;&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#r-programming&#34;&gt;R Programming&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#statistics-with-r&#34;&gt;Statistics with R&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#visualisations&#34;&gt;Visualisations&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#spatial-visualisations&#34;&gt;Spatial Visualisations&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#online-resources&#34;&gt;Online resources&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#software&#34;&gt;Software&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#companies-government-agencies-and-ngos-using-r&#34;&gt;Companies, Government Agencies, and NGOs Using R&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#podcasts&#34;&gt;Podcasts&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;

&lt;p&gt;The following is a non-exhaustive list of free online textbooks and resources that use R.&lt;/p&gt;
&lt;div id=&#34;textbooksreadings&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Textbooks/Readings&lt;/h2&gt;
&lt;div id=&#34;r-programming&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;R Programming&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;http://r4ds.had.co.nz/&#34; target=&#34;_blank&#34;&gt;R for Data Science&lt;/a&gt; – Garrett Grolemund and Hadley Wickham&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Open-source online version is available for free; &lt;a href=&#34;https://www.amazon.co.uk/R-Data-Science-Garrett-Grolemund/dp/1491910399&#34;&gt;Available for purchase online&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;No official solution manual for the book exercises exists, but several can be found online, like &lt;a href=&#34;https://jrnold.github.io/r4ds-exercise-solutions/&#34; target=&#34;_blank&#34;&gt;this version by Jeffrey B. Arnold&lt;/a&gt;. Your exact solutions may vary, but these are a good starting point.&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;https://rstudio-education.github.io/hopr/&#34; target=&#34;_blank&#34;&gt;Hands-On Programming with R&lt;/a&gt; by Garrett Grolemund. This is a non-statistical introduction to R programming with many hands-on examples.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;http://adv-r.had.co.nz/&#34; target=&#34;_blank&#34;&gt;Advanced R&lt;/a&gt; – Hadley Wickham&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Hardcover available online, but the online version is free&lt;/li&gt;
&lt;li&gt;A deeper dive into R as a programming language, not just a tool for data science. Most of this material is best covered on your own after you are familiar with R.&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;https://bookdown.org/rdpeng/rprogdatascience/&#34; target=&#34;_blank&#34;&gt;R Programming for Data Science&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;https://mdsr-book.github.io/&#34; target=&#34;_blank&#34;&gt;Modern Data Science with R&lt;/a&gt; – Benjamin S. Baumer, Daniel T. Kaplan, Nicholas J. Horton&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;div id=&#34;statistics-with-r&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Statistics with R&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;https://moderndive.com/&#34; target=&#34;_blank&#34;&gt;Modern Dive: A moderndive into R and the tidyverse&lt;/a&gt; by Chester Ismay and Albert Y. Kim&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;https://learningstatisticswithr.com/book/index.html&#34; target=&#34;_blank&#34;&gt;Learning Statistics with R&lt;/a&gt; by Danielle Navarro&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;https://www.openintro.org/stat/textbook.php?stat_book=os&#34; target=&#34;_blank&#34;&gt;OpenIntro Statistics&lt;/a&gt; Open-source online version is available for free&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;http://faculty.marshall.usc.edu/gareth-james/ISL/&#34; target=&#34;_blank&#34;&gt;An Introduction to Statistical Learning: with Applications in R&lt;/a&gt; – Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Each chapter includes code that demonstrates how to implement different methods. Unfortunately, their code uses a lot of base R functions and syntax, whereas our emphasis is on getting things done with the &lt;a href=&#34;http://tidyverse.org/&#34;&gt;&lt;code&gt;tidyverse&lt;/code&gt;&lt;/a&gt; collection of R packages. However, this is still a great book and the code provided is useful.&lt;/li&gt;
&lt;li&gt;You can download a free PDF of the entire book &lt;a href=&#34;http://faculty.marshall.usc.edu/gareth-james/ISL/ISLR%20Seventh%20Printing.pdf&#34; target=&#34;_blank&#34;&gt;from the authors’ site&lt;/a&gt;
&lt;br&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;https://bookdown.org/roback/bookdown-bysh/&#34; target=&#34;_blank&#34;&gt;Broadening Your Statistical Horizons&lt;/a&gt; is an applied textbook on generalized linear models, with all of the examples / code in R.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;https://otexts.com/fpp2/index.html&#34; target=&#34;_blank&#34;&gt;Forecasting: Principles and Practice&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;https://www.tmwr.org/&#34; target=&#34;_blank&#34;&gt;Tidy Modeling with R&lt;/a&gt; The purpose of this book is to demonstrate how the tidyverse and tidymodels can be used to produce high quality models.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;https://www.tidytextmining.com/&#34; target=&#34;_blank&#34;&gt;Text Mining with R&lt;/a&gt; by Julia Silge and David Robinson. What happens if your data is text, rather than numbers? What if you wanted to do sentiment analysis?&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;div id=&#34;visualisations&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Visualisations&lt;/h3&gt;
&lt;p&gt;The de facto standard for visualisations in R is the &lt;a href=&#34;https://cran.r-project.org/web/packages/ggplot2/index.html&#34; target=&#34;_blank&#34;&gt;&lt;code&gt;ggplot2&lt;/code&gt;&lt;/a&gt; package. If you want to read Hadley Wickham’s paper that implemented the grammar of graphics in R, you can find it &lt;a href=&#34;http://vita.had.co.nz/papers/layered-grammar.pdf&#34; target=&#34;_blank&#34;&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;http://socviz.co/&#34; target=&#34;_blank&#34;&gt;Data Visualization: A Practical Introduction&lt;/a&gt; by Kieran Healy.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;https://serialmentor.com/dataviz/&#34; target=&#34;_blank&#34;&gt;Fundamentals of Data Visualization&lt;/a&gt; by Claus O. Wilke.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;https://r-graphics.org/&#34; target=&#34;_blank&#34;&gt;R Graphics Cookbook&lt;/a&gt; A practical guide by Winston Chang that provides many specific examples/recipes to help you generate high-quality graphs quickly. I use it as a quick reference to get my ggplot working.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The &lt;a href=&#34;https://bbc.github.io/rcookbook/&#34; target=&#34;_blank&#34;&gt;BBC Visual and Data Journalism cookbook for R graphics&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;https://plotly-r.com/&#34; target=&#34;_blank&#34;&gt;Interactive web-based data visualization with R, plotly, and shiny&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;div id=&#34;spatial-visualisations&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Spatial Visualisations&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;http://strimas.com/r/tidy-sf/&#34; target=&#34;_blank&#34;&gt;Tidy spatial data in R: using dplyr, tidyr, and ggplot2 with sf&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;https://keen-swartz-3146c4.netlify.com/&#34; target=&#34;_blank&#34;&gt;Spatial Data Science&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;https://geocompr.robinlovelace.net/&#34; target=&#34;_blank&#34;&gt;Geocomputation with R&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&#34;online-resources&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Online resources&lt;/h2&gt;
&lt;p&gt;Data science and statistical programming can be challenging. Computers are dumb and tiny errors in your code can cause hours of frustration (even if you’ve been doing this stuff for years!).&lt;/p&gt;
&lt;p&gt;Fortunately, there are tons of online resources to help you with this. Two of the most important are &lt;a href=&#34;https://stackoverflow.com/&#34; target=&#34;_blank&#34;&gt;StackOverflow&lt;/a&gt; (a Q&amp;amp;A site with thousands of answers to all sorts of statistical and programming questions) and &lt;a href=&#34;https://community.rstudio.com/&#34; target=&#34;_blank&#34;&gt;RStudio Community&lt;/a&gt; (a forum specifically designed for people using RStudio and the tidyverse).&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;I highly recommend subscribing to the &lt;a href=&#34;https://rweekly.org/&#34; target=&#34;_blank&#34;&gt;R Weekly&lt;/a&gt; newsletter which is sent every Monday and is full of helpful tutorials and ideas on how to do stuff with R.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;https://www.rstudio.com/resources/cheatsheets/&#34;&gt;RStudio Cheatsheets&lt;/a&gt; Printable cheat sheets for common R tasks and features&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/rstudio/cheatsheets/raw/master/rmarkdown-2.0.pdf&#34; target=&#34;_blank&#34;&gt;R Markdown&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/rstudio/cheatsheets/raw/master/source/pdfs/data-import-cheatsheet.pdf&#34; target=&#34;_blank&#34;&gt;Data import&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/rstudio/cheatsheets/raw/master/source/pdfs/data-transformation-cheatsheet.pdf&#34; target=&#34;_blank&#34;&gt;Data transformation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/rstudio/cheatsheets/raw/master/source/pdfs/ggplot2-cheatsheet-2.1.pdf&#34; target=&#34;_blank&#34;&gt;Data visualization&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/rstudio/cheatsheets/raw/master/lubridate.pdf&#34; target=&#34;_blank&#34;&gt;Dates and Times&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/rstudio/cheatsheets/raw/master/strings.pdf&#34; target=&#34;_blank&#34;&gt;Work with Strings&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/rstudio/cheatsheets/raw/master/source/pdfs/rstudio-IDE-cheatsheet.pdf&#34; target=&#34;_blank&#34;&gt;RStudio IDE&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://www.rstudio.com/resources/cheatsheets/&#34; target=&#34;_blank&#34;&gt;And more!&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;div id=&#34;software&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Software&lt;/h2&gt;
&lt;p&gt;&lt;a href=&#34;https://typora.io&#34; target=&#34;_blank&#34;&gt;Typora&lt;/a&gt; is a lightweight, standalone editor for Markdown documents.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;companies-government-agencies-and-ngos-using-r&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Companies, Government Agencies, and NGOs Using R&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/ThinkR-open/companies-using-r&#34; target=&#34;_blank&#34;&gt;Organisations using R&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;div id=&#34;podcasts&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Podcasts&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://www.bbc.co.uk/programmes/b006qshd&#34;&gt;Tim Harford’s &lt;strong&gt;More or Less&lt;/strong&gt;&lt;/a&gt; explains and debunks the numbers and statistics used in political debate, the news and everyday life. A great episode on sampling can be found &lt;a href=&#34;https://www.bbc.co.uk/sounds/play/m0004sj2&#34; target=&#34;_blank&#34;&gt;here&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://everythinghertz.com/&#34; target=&#34;_blank&#34;&gt;Everything Hertz: A podcast by scientists, for scientists. Methodology, scientific life, and bad language&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
      <title>Using Markdown</title>
      <link>https://usi-emba-analytics.netlify.app/reference/03-reference/</link>
      <pubDate>Sat, 25 Jul 2020 00:00:00 +0000</pubDate>
      <guid>https://usi-emba-analytics.netlify.app/reference/03-reference/</guid>
      <description>
&lt;script src=&#34;https://usi-emba-analytics.netlify.app/rmarkdown-libs/header-attrs/header-attrs.js&#34;&gt;&lt;/script&gt;

&lt;div id=&#34;TOC&#34;&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#basic-markdown-formatting&#34;&gt;Basic Markdown formatting&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#mathematical-formulas&#34;&gt;Mathematical formulas&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#tables&#34;&gt;Tables&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#front-matter&#34;&gt;Front matter&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#other-references&#34;&gt;Other references&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;

&lt;p&gt;&lt;a href=&#34;https://daringfireball.net/projects/markdown/&#34; target=&#34;_blank&#34;&gt;Markdown&lt;/a&gt; is a special kind of markup language that lets you format text with simple syntax. You can then use a converter program like &lt;a href=&#34;https://pandoc.org/&#34; target=&#34;_blank&#34;&gt;pandoc&lt;/a&gt; to convert Markdown into whatever format you want: HTML, PDF, Word, PowerPoint, etc. (&lt;a href=&#34;https://pandoc.org/MANUAL.html#option--to&#34; target=&#34;_blank&#34;&gt;see the full list of output types here&lt;/a&gt;)&lt;/p&gt;
&lt;div id=&#34;basic-markdown-formatting&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Basic Markdown formatting&lt;/h2&gt;
&lt;table&gt;
&lt;colgroup&gt;
&lt;col width=&#34;40%&#34; /&gt;
&lt;col width=&#34;21%&#34; /&gt;
&lt;col width=&#34;38%&#34; /&gt;
&lt;/colgroup&gt;
&lt;thead&gt;
&lt;tr class=&#34;header&#34;&gt;
&lt;th&gt;Type…&lt;/th&gt;
&lt;th&gt;…or…&lt;/th&gt;
&lt;th&gt;…to get&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td&gt;&lt;pre&gt;Some text in a paragraph.

More text in the next paragraph. Always
use empty lines between paragraphs.&lt;/pre&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;p&gt;Some text in a paragraph.&lt;/p&gt;
&lt;p&gt;More text in the next paragraph. Always
use empty lines between paragraphs.&lt;/p&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td&gt;&lt;code&gt;*Italic*&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;_Italic_&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;em&gt;Italic&lt;/em&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td&gt;&lt;code&gt;**Bold**&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;__Bold__&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Bold&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td&gt;&lt;code&gt;# Heading 1&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;h1 class=&#34;smaller-h1&#34;&gt;
Heading 1
&lt;/h1&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td&gt;&lt;code&gt;## Heading 2&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;h2 class=&#34;smaller-h2&#34;&gt;
Heading 2
&lt;/h2&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td&gt;&lt;code&gt;### Heading 3&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;h3 class=&#34;smaller-h3&#34;&gt;
Heading 3
&lt;/h3&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td&gt;(Go up to heading level 6 with &lt;code&gt;######&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td&gt;&lt;code&gt;[Link text](http://www.example.com)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;a href=&#34;http://www.example.com&#34;&gt;Link text&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td&gt;&lt;code&gt;![Image caption](/path/to/image.png)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;img src=&#34;https://usi-emba-analytics.netlify.app/img/penguins.png&#34; title=&#34;fig:&#34; alt=&#34;Penguins&#34; /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td&gt;&lt;code&gt;`Inline code` with backticks&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;Inline code&lt;/code&gt; with backticks&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td&gt;&lt;code&gt;&amp;gt; Blockquote&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;blockquote&gt;
&lt;p&gt;Blockquote&lt;/p&gt;
&lt;/blockquote&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td&gt;&lt;pre&gt;- Things in
- an unordered
- list&lt;/pre&gt;&lt;/td&gt;
&lt;td&gt;&lt;pre&gt;* Things in
* an unordered
* list&lt;/pre&gt;&lt;/td&gt;
&lt;td&gt;&lt;ul&gt;
&lt;li&gt;Things in&lt;/li&gt;
&lt;li&gt;an unordered&lt;/li&gt;
&lt;li&gt;list&lt;/li&gt;
&lt;/ul&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td&gt;&lt;pre&gt;1. Things in
2. an ordered
3. list&lt;/pre&gt;&lt;/td&gt;
&lt;td&gt;&lt;pre&gt;1) Things in
2) an ordered
3) list&lt;/pre&gt;&lt;/td&gt;
&lt;td&gt;&lt;ol style=&#34;list-style-type: decimal&#34;&gt;
&lt;li&gt;Things in&lt;/li&gt;
&lt;li&gt;an ordered&lt;/li&gt;
&lt;li&gt;list&lt;/li&gt;
&lt;/ol&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td&gt;&lt;pre&gt;Horizontal line

---&lt;/pre&gt;&lt;/td&gt;
&lt;td&gt;&lt;pre&gt;Horizontal line

***&lt;/pre&gt;&lt;/td&gt;
&lt;td&gt;&lt;p&gt;Horizontal line&lt;/p&gt;
&lt;hr /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;/div&gt;
&lt;div id=&#34;mathematical-formulas&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Mathematical formulas&lt;/h2&gt;
&lt;p&gt;Markdown uses LaTeX to create fancy mathematical equations. There are tons of little options and features available for math equations; you can find &lt;a href=&#34;http://www.malinc.se/math/latex/basiccodeen.php&#34; target=&#34;_blank&#34;&gt;helpful examples of the most common basic commands here&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;You can use math in two different ways: inline or in a display block. To use math inline, wrap it in single dollar signs, like &lt;code&gt;$y = mx + b$&lt;/code&gt;:&lt;/p&gt;
&lt;table&gt;
&lt;colgroup&gt;
&lt;col width=&#34;52%&#34; /&gt;
&lt;col width=&#34;47%&#34; /&gt;
&lt;/colgroup&gt;
&lt;thead&gt;
&lt;tr class=&#34;header&#34;&gt;
&lt;th&gt;Type…&lt;/th&gt;
&lt;th&gt;…to get&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td&gt;&lt;pre&gt;Our regression model for
estimating the effect of education on wages
is $y = \beta_0 + \beta_1 x_1 + \epsilon$, or
$\text{Wages} = \beta_0 + \beta_1 \text{Education} + \epsilon$.&lt;/pre&gt;&lt;/td&gt;
&lt;td&gt;Our regression model for
estimating the effect of education on wages
is &lt;span class=&#34;math inline&#34;&gt;\(y = \beta_0 + \beta_1 x_1 + \epsilon\)&lt;/span&gt;, or
&lt;span class=&#34;math inline&#34;&gt;\(\text{Wages} = \beta_0 + \beta_1 \text{Education} + \epsilon\)&lt;/span&gt;.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;To put an equation on its own line in a display block, wrap it in double dollar signs, like this:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Type…&lt;/strong&gt;&lt;/p&gt;
&lt;pre class=&#34;text&#34;&gt;&lt;code&gt;The quadratic equation was an important part of high school math:

$$
x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}
$$

But now we just use computers to solve for $x$.&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;strong&gt;…to get…&lt;/strong&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The quadratic equation was an important part of high school math:&lt;/p&gt;
&lt;p&gt;&lt;span class=&#34;math display&#34;&gt;\[
x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}
\]&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;But now we just use computers to solve for &lt;span class=&#34;math inline&#34;&gt;\(x\)&lt;/span&gt;.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr /&gt;
&lt;p&gt;Because dollar signs are used to indicate math equations, you can’t just use dollar signs like normal if you’re writing about actual dollars. For instance, if you write &lt;code&gt;This book costs $5.75 and this other costs $40&lt;/code&gt;, Markdown will treat everything that comes between the dollar signs as math, like so: “This book costs $5.75 and this other costs $40”.&lt;/p&gt;
&lt;p&gt;To get around that, put a backslash (&lt;code&gt;\&lt;/code&gt;) in front of the dollar signs, so that &lt;code&gt;This book costs \$5.75 and this other costs \$40&lt;/code&gt; becomes “This book costs &lt;span&gt;$5.75&lt;/span&gt; and this other costs &lt;span&gt;$40&lt;/span&gt;”.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;tables&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Tables&lt;/h2&gt;
&lt;p&gt;There are 4 different ways to hand-create tables in Markdown—I say “hand-create” because it’s normally way easier to use R to generate these things with packages like &lt;a href=&#34;https://rapporter.github.io/pander/&#34;&gt;&lt;strong&gt;pander&lt;/strong&gt;&lt;/a&gt; (use &lt;code&gt;pandoc.table()&lt;/code&gt;) or &lt;strong&gt;knitr&lt;/strong&gt; (use &lt;a href=&#34;https://bookdown.org/yihui/rmarkdown-cookbook/kable.html&#34;&gt;&lt;code&gt;kable()&lt;/code&gt;&lt;/a&gt;). The two most common are simple tables and pipe tables. &lt;a href=&#34;https://pandoc.org/MANUAL.html#tables&#34;&gt;You can find the full documentation here&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;For simple tables, type…&lt;/strong&gt;&lt;/p&gt;
&lt;pre class=&#34;text&#34;&gt;&lt;code&gt;  Right     Left     Center     Default
-------     ------ ----------   -------
     12     12        12            12
    123     123       123          123
      1     1          1             1

Table: Caption goes here&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;strong&gt;…to get…&lt;/strong&gt;&lt;/p&gt;
&lt;table&gt;
&lt;caption&gt;Caption goes here&lt;/caption&gt;
&lt;thead&gt;
&lt;tr class=&#34;header&#34;&gt;
&lt;th align=&#34;right&#34;&gt;Right&lt;/th&gt;
&lt;th align=&#34;left&#34;&gt;Left&lt;/th&gt;
&lt;th align=&#34;center&#34;&gt;Center&lt;/th&gt;
&lt;th&gt;Default&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td align=&#34;right&#34;&gt;12&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;12&lt;/td&gt;
&lt;td align=&#34;center&#34;&gt;12&lt;/td&gt;
&lt;td&gt;12&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td align=&#34;right&#34;&gt;123&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;123&lt;/td&gt;
&lt;td align=&#34;center&#34;&gt;123&lt;/td&gt;
&lt;td&gt;123&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td align=&#34;right&#34;&gt;1&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;1&lt;/td&gt;
&lt;td align=&#34;center&#34;&gt;1&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;&lt;strong&gt;For pipe tables, type…&lt;/strong&gt;&lt;/p&gt;
&lt;pre class=&#34;text&#34;&gt;&lt;code&gt;| Right | Left | Default | Center |
|------:|:-----|---------|:------:|
|   12  |  12  |    12   |    12  |
|  123  |  123 |   123   |   123  |
|    1  |    1 |     1   |     1  |

Table: Caption goes here&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;strong&gt;…to get…&lt;/strong&gt;&lt;/p&gt;
&lt;table&gt;
&lt;caption&gt;Caption goes here&lt;/caption&gt;
&lt;thead&gt;
&lt;tr class=&#34;header&#34;&gt;
&lt;th align=&#34;right&#34;&gt;Right&lt;/th&gt;
&lt;th align=&#34;left&#34;&gt;Left&lt;/th&gt;
&lt;th&gt;Default&lt;/th&gt;
&lt;th align=&#34;center&#34;&gt;Center&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td align=&#34;right&#34;&gt;12&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;12&lt;/td&gt;
&lt;td&gt;12&lt;/td&gt;
&lt;td align=&#34;center&#34;&gt;12&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td align=&#34;right&#34;&gt;123&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;123&lt;/td&gt;
&lt;td&gt;123&lt;/td&gt;
&lt;td align=&#34;center&#34;&gt;123&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td align=&#34;right&#34;&gt;1&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;1&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td align=&#34;center&#34;&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
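&lt;p&gt;As mentioned at the start of this section, it is usually easier to let R write the table for you. Here is a minimal sketch with &lt;code&gt;knitr::kable()&lt;/code&gt;, assuming a recent version of &lt;strong&gt;knitr&lt;/strong&gt; is installed; the data frame &lt;code&gt;example_df&lt;/code&gt; is made up to mirror the hand-written tables above.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(knitr)

# Made-up data frame mirroring the hand-written tables above
example_df &amp;lt;- data.frame(Right  = c(12, 123, 1),
                         Left   = c(12, 123, 1),
                         Center = c(12, 123, 1))

# format = &amp;quot;pipe&amp;quot; asks for a pipe table;
# align sets right, left and center alignment, one letter per column
pipe_table &amp;lt;- kable(example_df,
                    format = &amp;quot;pipe&amp;quot;,
                    align = c(&amp;quot;r&amp;quot;, &amp;quot;l&amp;quot;, &amp;quot;c&amp;quot;),
                    caption = &amp;quot;Caption goes here&amp;quot;)
pipe_table&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Printing &lt;code&gt;pipe_table&lt;/code&gt; in the console shows the Markdown source of the table, which you can compare with the pipe table typed by hand above.&lt;/p&gt;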
&lt;/div&gt;
&lt;div id=&#34;front-matter&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Front matter&lt;/h2&gt;
&lt;p&gt;You can include a special section at the top of a Markdown document that contains metadata (or data about your document) like the title, date, author, etc. This section uses a special simple syntax named &lt;a href=&#34;https://learn.getgrav.org/16/advanced/yaml&#34; target=&#34;_blank&#34;&gt;YAML&lt;/a&gt; (or “YAML Ain’t Markup Language”) that follows this basic outline: &lt;code&gt;setting: value for setting&lt;/code&gt;. Here’s an example YAML metadata section. Note that it must start and end with three dashes (&lt;code&gt;---&lt;/code&gt;).&lt;/p&gt;
&lt;pre class=&#34;yaml&#34;&gt;&lt;code&gt;---
title: Title of your document
date: &amp;quot;January 13, 2020&amp;quot;
author: &amp;quot;Your name&amp;quot;
---&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You can put the values inside quotes (like the date and name in the example above), or you can leave them outside of quotes (like the title in the example above). I typically use quotes just to be safe—if the value you’re using has a colon (&lt;code&gt;:&lt;/code&gt;) in it, it’ll confuse Markdown since it’ll be something like &lt;code&gt;title: My cool title: a subtitle&lt;/code&gt;, which has two colons. It’s better to do this:&lt;/p&gt;
&lt;pre class=&#34;yaml&#34;&gt;&lt;code&gt;---
title: &amp;quot;My cool title: a subtitle&amp;quot;
---&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;If you want to use quotes inside one of the values (e.g. your document’s title is &lt;code&gt;An evaluation of &#34;scare quotes&#34;&lt;/code&gt;), you can wrap the value in single quotes instead:&lt;/p&gt;
&lt;pre class=&#34;yaml&#34;&gt;&lt;code&gt;---
title: &amp;#39;An evaluation of &amp;quot;scare quotes&amp;quot;&amp;#39;
---&lt;/code&gt;&lt;/pre&gt;
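&lt;p&gt;One way to check how a front matter value will be read is to parse it from R. The sketch below assumes the &lt;strong&gt;yaml&lt;/strong&gt; package is installed; the object names &lt;code&gt;quoted&lt;/code&gt; and &lt;code&gt;unquoted&lt;/code&gt; are just for illustration.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(yaml)

# Quoted value: the second colon is treated as part of the title
quoted &amp;lt;- yaml.load(&amp;#39;title: &amp;quot;My cool title: a subtitle&amp;quot;&amp;#39;)
quoted$title

# Unquoted value with a second colon: the parser is expected to reject it,
# so the error message is captured instead of stopping the script
unquoted &amp;lt;- tryCatch(
  yaml.load(&amp;quot;title: My cool title: a subtitle&amp;quot;),
  error = function(e) conditionMessage(e)
)
unquoted&lt;/code&gt;&lt;/pre&gt;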
&lt;/div&gt;
&lt;div id=&#34;other-references&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Other references&lt;/h2&gt;
&lt;p&gt;These websites have additional details and examples and practice tools:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://commonmark.org/help/tutorial/&#34; target=&#34;_blank&#34;&gt;&lt;strong&gt;CommonMark’s Markdown tutorial&lt;/strong&gt;&lt;/a&gt;: A quick interactive Markdown tutorial.&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://www.markdowntutorial.com/&#34; target=&#34;_blank&#34;&gt;&lt;strong&gt;Markdown tutorial&lt;/strong&gt;&lt;/a&gt;: Another interactive tutorial to practice using Markdown.&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;http://packetlife.net/media/library/16/Markdown.pdf&#34; target=&#34;_blank&#34;&gt;&lt;strong&gt;Markdown cheatsheet&lt;/strong&gt;&lt;/a&gt;: Useful one-page reminder of Markdown syntax.&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;http://plain-text.co/&#34; target=&#34;_blank&#34;&gt;&lt;strong&gt;The Plain Person’s Guide to Plain Text Social Science&lt;/strong&gt;&lt;/a&gt;: A comprehensive explanation and tutorial about why you should write data-based reports in Markdown.&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
      <title>World Bank Data</title>
      <link>https://usi-emba-analytics.netlify.app/reference/world_bank_data/</link>
      <pubDate>Fri, 31 Jul 2020 00:00:00 +0000</pubDate>
      <guid>https://usi-emba-analytics.netlify.app/reference/world_bank_data/</guid>
      <description>

&lt;div id=&#34;TOC&#34;&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#population-growth-1970-2017&#34;&gt;Population Growth 1970-2017&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#world-happiness-how-does-it-correlate-with-various-indicators&#34;&gt;World Happiness: how does it correlate with various indicators&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#acknowledgments&#34;&gt;Acknowledgments&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;

&lt;p&gt;The World Bank is one of the world’s largest producers of development data and research. It is a great source of &lt;a href=&#34;https://data.worldbank.org/&#34;&gt;global socio-economic data&lt;/a&gt;, spanning several decades and many topics. For example, you can read their &lt;a href=&#34;http://datatopics.worldbank.org/sdgatlas/index.html&#34;&gt;2018 Atlas of Sustainable Development Goals&lt;/a&gt; or a &lt;a href=&#34;http://blogs.worldbank.org/opendata/2018-atlas-sustainable-development-goals-all-new-visual-guide-data-and-development&#34;&gt;blog post on their all-new visual guide to data and development&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;wbstats&lt;/code&gt; package allows you to search for and download any open World Bank dataset. To identify the actual indicator you want, you have to find its &lt;strong&gt;code&lt;/strong&gt; either in the &lt;a href=&#34;https://datacatalog.worldbank.org/&#34;&gt;World Bank datacatalog&lt;/a&gt; or, even better, through &lt;code&gt;wbstats&lt;/code&gt;.&lt;/p&gt;
&lt;div id=&#34;population-growth-1970-2017&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Population Growth 1970-2017&lt;/h2&gt;
&lt;p&gt;Suppose we wanted to get data on population growth. Manually, we would navigate to the &lt;a href=&#34;https://datacatalog.worldbank.org/&#34;&gt;World Bank datacatalog website&lt;/a&gt;, and search for population growth.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://usi-emba-analytics.netlify.app/img/wb_population_growth.png&#34; width=&#34;80%&#34; /&gt;&lt;/p&gt;
&lt;p&gt;We get various results, but the more important ones are usually at the top with data on &lt;em&gt;Population Growth (Annual %)&lt;/em&gt; with code &lt;code&gt;SP.POP.GROW&lt;/code&gt;, on &lt;em&gt;Rural Population Growth (Annual %)&lt;/em&gt; with code &lt;code&gt;SP.RUR.TOTL.ZG&lt;/code&gt;, and on &lt;em&gt;Urban Population Growth (Annual %)&lt;/em&gt; with code &lt;code&gt;SP.URB.GROW&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Alternatively, we can load the &lt;code&gt;wbstats&lt;/code&gt; package and use &lt;code&gt;pop_growth_codes &amp;lt;- wb_search(pattern = &#34;population growth&#34;)&lt;/code&gt; to get a dataframe with the indicator codes that the search function returns.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(wbstats)

pop_growth_codes &amp;lt;- wb_search(pattern = &amp;quot;population growth&amp;quot;)
head(pop_growth_codes)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## # A tibble: 4 x 3
##   indicator_id    indicator            indicator_desc                           
##   &amp;lt;chr&amp;gt;           &amp;lt;chr&amp;gt;                &amp;lt;chr&amp;gt;                                    
## 1 IN.EC.POP.GRWT~ Decadal Growth of P~ Population growth rate over the 10 year ~
## 2 SP.POP.GROW     Population growth (~ Annual population growth rate for year t~
## 3 SP.RUR.TOTL.ZG  Rural population gr~ Rural population refers to people living~
## 4 SP.URB.GROW     Urban population gr~ Urban population refers to people living~&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Either way, the indicator we are interested in is &lt;em&gt;Population growth (annual %)&lt;/em&gt;, with code &lt;code&gt;SP.POP.GROW&lt;/code&gt;. The next step is to download the data with the &lt;code&gt;wbstats::wb_data()&lt;/code&gt; function.&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;country&lt;/code&gt; argument of &lt;code&gt;wb_data()&lt;/code&gt; takes a vector of countries. The special value &lt;code&gt;&#34;countries_only&#34;&lt;/code&gt; downloads data for individual countries, while &lt;code&gt;&#34;all&#34;&lt;/code&gt; also includes aggregate regions like Arab World, Euro area, etc. In our example, let us download data for individual countries only, starting in 1970 and ending in 2017.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(tidyverse) # provides glimpse(), left_join(), ggplot() and read_csv() used below

# Download data for Population growth (annual %), code SP.POP.GROW
pop_growth_data &amp;lt;- wb_data(country = &amp;quot;countries_only&amp;quot;, 
                           indicator = &amp;quot;SP.POP.GROW&amp;quot;, 
                           start_date = 1970, 
                           end_date = 2017,
                           return_wide = FALSE) # long format: one row per indicator, country and year

glimpse(pop_growth_data)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Rows: 10,416
## Columns: 11
## $ indicator_id &amp;lt;chr&amp;gt; &amp;quot;SP.POP.GROW&amp;quot;, &amp;quot;SP.POP.GROW&amp;quot;, &amp;quot;SP.POP.GROW&amp;quot;, &amp;quot;SP.POP.G...
## $ indicator    &amp;lt;chr&amp;gt; &amp;quot;Population growth (annual %)&amp;quot;, &amp;quot;Population growth (an...
## $ iso2c        &amp;lt;chr&amp;gt; &amp;quot;AF&amp;quot;, &amp;quot;AF&amp;quot;, &amp;quot;AF&amp;quot;, &amp;quot;AF&amp;quot;, &amp;quot;AF&amp;quot;, &amp;quot;AF&amp;quot;, &amp;quot;AF&amp;quot;, &amp;quot;AF&amp;quot;, &amp;quot;AF&amp;quot;, ...
## $ iso3c        &amp;lt;chr&amp;gt; &amp;quot;AFG&amp;quot;, &amp;quot;AFG&amp;quot;, &amp;quot;AFG&amp;quot;, &amp;quot;AFG&amp;quot;, &amp;quot;AFG&amp;quot;, &amp;quot;AFG&amp;quot;, &amp;quot;AFG&amp;quot;, &amp;quot;AFG&amp;quot;...
## $ country      &amp;lt;chr&amp;gt; &amp;quot;Afghanistan&amp;quot;, &amp;quot;Afghanistan&amp;quot;, &amp;quot;Afghanistan&amp;quot;, &amp;quot;Afghanis...
## $ date         &amp;lt;dbl&amp;gt; 2017, 2016, 2015, 2014, 2013, 2012, 2011, 2010, 2009, ...
## $ value        &amp;lt;dbl&amp;gt; 2.55, 2.78, 3.08, 3.36, 3.49, 3.41, 3.14, 2.75, 2.40, ...
## $ unit         &amp;lt;chr&amp;gt; NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
## $ obs_status   &amp;lt;chr&amp;gt; NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
## $ footnote     &amp;lt;chr&amp;gt; NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
## $ last_updated &amp;lt;date&amp;gt; 2020-08-18, 2020-08-18, 2020-08-18, 2020-08-18, 2020-...&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The &lt;code&gt;wb_cachelist&lt;/code&gt; is a cached version of useful information from the World Bank API and provides a snapshot of available countries, indicators, and other relevant information. The structure of &lt;code&gt;wb_cachelist&lt;/code&gt; is as follows:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;glimpse(wb_cachelist, max.level = 1)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## List of 8
##  $ countries    : tibble [304 x 18] (S3: tbl_df/tbl/data.frame)
##  $ indicators   : tibble [16,607 x 8] (S3: tbl_df/tbl/data.frame)
##  $ sources      : tibble [61 x 9] (S3: tbl_df/tbl/data.frame)
##  $ topics       : tibble [21 x 3] (S3: tbl_df/tbl/data.frame)
##  $ regions      : tibble [48 x 4] (S3: tbl_df/tbl/data.frame)
##  $ income_levels: tibble [7 x 3] (S3: tbl_df/tbl/data.frame)
##  $ lending_types: tibble [4 x 3] (S3: tbl_df/tbl/data.frame)
##  $ languages    : tibble [23 x 3] (S3: tbl_df/tbl/data.frame)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;As we can see, it contains data on &lt;code&gt;countries&lt;/code&gt; and aggregate regions, well over 16,000 &lt;code&gt;indicators&lt;/code&gt;, etc. To see the data on countries, let us create a dataframe &lt;code&gt;countries&lt;/code&gt; and glimpse its contents.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;countries &amp;lt;-  wb_cachelist$countries
glimpse(countries)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Rows: 304
## Columns: 18
## $ iso3c              &amp;lt;chr&amp;gt; &amp;quot;ABW&amp;quot;, &amp;quot;AFG&amp;quot;, &amp;quot;AFR&amp;quot;, &amp;quot;AGO&amp;quot;, &amp;quot;ALB&amp;quot;, &amp;quot;AND&amp;quot;, &amp;quot;ANR&amp;quot;,...
## $ iso2c              &amp;lt;chr&amp;gt; &amp;quot;AW&amp;quot;, &amp;quot;AF&amp;quot;, &amp;quot;A9&amp;quot;, &amp;quot;AO&amp;quot;, &amp;quot;AL&amp;quot;, &amp;quot;AD&amp;quot;, &amp;quot;L5&amp;quot;, &amp;quot;1A&amp;quot;, ...
## $ country            &amp;lt;chr&amp;gt; &amp;quot;Aruba&amp;quot;, &amp;quot;Afghanistan&amp;quot;, &amp;quot;Africa&amp;quot;, &amp;quot;Angola&amp;quot;, &amp;quot;Alb...
## $ capital_city       &amp;lt;chr&amp;gt; &amp;quot;Oranjestad&amp;quot;, &amp;quot;Kabul&amp;quot;, NA, &amp;quot;Luanda&amp;quot;, &amp;quot;Tirane&amp;quot;, &amp;quot;...
## $ longitude          &amp;lt;dbl&amp;gt; -70.02, 69.18, NA, 13.24, 19.82, 1.52, NA, NA, 5...
## $ latitude           &amp;lt;dbl&amp;gt; 12.52, 34.52, NA, -8.81, 41.33, 42.51, NA, NA, 2...
## $ region_iso3c       &amp;lt;chr&amp;gt; &amp;quot;LCN&amp;quot;, &amp;quot;SAS&amp;quot;, NA, &amp;quot;SSF&amp;quot;, &amp;quot;ECS&amp;quot;, &amp;quot;ECS&amp;quot;, NA, NA, &amp;quot;...
## $ region_iso2c       &amp;lt;chr&amp;gt; &amp;quot;ZJ&amp;quot;, &amp;quot;8S&amp;quot;, NA, &amp;quot;ZG&amp;quot;, &amp;quot;Z7&amp;quot;, &amp;quot;Z7&amp;quot;, NA, NA, &amp;quot;ZQ&amp;quot;, ...
## $ region             &amp;lt;chr&amp;gt; &amp;quot;Latin America &amp;amp; Caribbean&amp;quot;, &amp;quot;South Asia&amp;quot;, &amp;quot;Aggr...
## $ admin_region_iso3c &amp;lt;chr&amp;gt; NA, &amp;quot;SAS&amp;quot;, NA, &amp;quot;SSA&amp;quot;, &amp;quot;ECA&amp;quot;, NA, NA, NA, NA, &amp;quot;LA...
## $ admin_region_iso2c &amp;lt;chr&amp;gt; NA, &amp;quot;8S&amp;quot;, NA, &amp;quot;ZF&amp;quot;, &amp;quot;7E&amp;quot;, NA, NA, NA, NA, &amp;quot;XJ&amp;quot;, ...
## $ admin_region       &amp;lt;chr&amp;gt; NA, &amp;quot;South Asia&amp;quot;, NA, &amp;quot;Sub-Saharan Africa (exclu...
## $ income_level_iso3c &amp;lt;chr&amp;gt; &amp;quot;HIC&amp;quot;, &amp;quot;LIC&amp;quot;, NA, &amp;quot;LMC&amp;quot;, &amp;quot;UMC&amp;quot;, &amp;quot;HIC&amp;quot;, NA, NA, &amp;quot;...
## $ income_level_iso2c &amp;lt;chr&amp;gt; &amp;quot;XD&amp;quot;, &amp;quot;XM&amp;quot;, NA, &amp;quot;XN&amp;quot;, &amp;quot;XT&amp;quot;, &amp;quot;XD&amp;quot;, NA, NA, &amp;quot;XD&amp;quot;, ...
## $ income_level       &amp;lt;chr&amp;gt; &amp;quot;High income&amp;quot;, &amp;quot;Low income&amp;quot;, &amp;quot;Aggregates&amp;quot;, &amp;quot;Lowe...
## $ lending_type_iso3c &amp;lt;chr&amp;gt; &amp;quot;LNX&amp;quot;, &amp;quot;IDX&amp;quot;, NA, &amp;quot;IBD&amp;quot;, &amp;quot;IBD&amp;quot;, &amp;quot;LNX&amp;quot;, NA, NA, &amp;quot;...
## $ lending_type_iso2c &amp;lt;chr&amp;gt; &amp;quot;XX&amp;quot;, &amp;quot;XI&amp;quot;, NA, &amp;quot;XF&amp;quot;, &amp;quot;XF&amp;quot;, &amp;quot;XX&amp;quot;, NA, NA, &amp;quot;XX&amp;quot;, ...
## $ lending_type       &amp;lt;chr&amp;gt; &amp;quot;Not classified&amp;quot;, &amp;quot;IDA&amp;quot;, &amp;quot;Aggregates&amp;quot;, &amp;quot;IBRD&amp;quot;, &amp;quot;...&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The dataframe contains the &lt;a href=&#34;https://en.wikipedia.org/wiki/List_of_ISO_3166_country_codes&#34;&gt;ISO country codes&lt;/a&gt;, the country name, its capital with its longitude and latitude, the region the country is in, the region’s associated ISO codes, as well as the country’s &lt;a href=&#34;https://blogs.worldbank.org/opendata/new-country-classifications-income-level-2018-2019&#34;&gt;classification by income level&lt;/a&gt;, its lending type, etc.&lt;/p&gt;
&lt;p&gt;We can merge the dataframes &lt;code&gt;pop_growth_data&lt;/code&gt; and &lt;code&gt;countries&lt;/code&gt; with a left join on &lt;code&gt;iso3c&lt;/code&gt;, so we have a dataframe that contains data from both of them. Columns that appear in both dataframes but are not used as the join key, such as &lt;code&gt;country&lt;/code&gt;, come out of the join with the suffixes &lt;code&gt;.x&lt;/code&gt; and &lt;code&gt;.y&lt;/code&gt;.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;countries &amp;lt;- wb_cachelist$countries

# Merge with a left_join (a) country data with (b) population growth data
pop_growth &amp;lt;- 
  left_join(countries, pop_growth_data, by = &amp;quot;iso3c&amp;quot;) %&amp;gt;% 
  mutate(year = as.integer(date)) %&amp;gt;%  # make year an integer, rather than a double
  select(iso3c, country.x, region, income_level, value, year) %&amp;gt;% 
  na.omit()  # drop rows with missing values, e.g. aggregates with no match in pop_growth_data&lt;/code&gt;&lt;/pre&gt;
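&lt;p&gt;To see where the &lt;code&gt;country.x&lt;/code&gt; column in the code above comes from, here is a small self-contained sketch. The tibbles &lt;code&gt;countries_toy&lt;/code&gt; and &lt;code&gt;growth_toy&lt;/code&gt; are hypothetical toy data with only a few of the real columns; they are not the downloaded World Bank data.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(dplyr)

# Toy stand-in for wb_cachelist$countries: one country and one aggregate
countries_toy &amp;lt;- tibble(iso3c   = c(&amp;quot;AFG&amp;quot;, &amp;quot;AFR&amp;quot;),
                        country = c(&amp;quot;Afghanistan&amp;quot;, &amp;quot;Africa&amp;quot;),
                        region  = c(&amp;quot;South Asia&amp;quot;, &amp;quot;Aggregates&amp;quot;))

# Toy stand-in for pop_growth_data: two years for one country only
growth_toy &amp;lt;- tibble(iso3c   = c(&amp;quot;AFG&amp;quot;, &amp;quot;AFG&amp;quot;),
                     country = c(&amp;quot;Afghanistan&amp;quot;, &amp;quot;Afghanistan&amp;quot;),
                     date    = c(2017, 2016),
                     value   = c(2.55, 2.78))

# Every row of countries_toy is kept; &amp;quot;AFR&amp;quot; has no match, so its growth columns are NA
joined &amp;lt;- left_join(countries_toy, growth_toy, by = &amp;quot;iso3c&amp;quot;)

# country exists in both inputs, so the join returns
# iso3c, country.x, region, country.y, date, value
names(joined)

# na.omit() then drops the unmatched &amp;quot;AFR&amp;quot; row, leaving the two &amp;quot;AFG&amp;quot; rows
na.omit(joined)&lt;/code&gt;&lt;/pre&gt;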
&lt;p&gt;Let us calculate and plot the average population growth for all countries between 1970 and 2017, faceted by income level and coloured by region.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;average_pop_growth &amp;lt;- pop_growth %&amp;gt;% 
              dplyr::group_by(region, income_level, country.x, iso3c) %&amp;gt;% 
              summarise(average_growth = mean(value)) %&amp;gt;% 
              arrange(average_growth) %&amp;gt;% 
              ungroup()

ggplot(data = average_pop_growth, 
       aes(x = reorder(country.x, average_growth), 
           y = average_growth, 
           fill = region))+
  geom_col()+
  coord_flip()+
  theme_minimal(7)+
  expand_limits(y=c(-1,8))+
  facet_wrap(~income_level, nrow=3, scales=&amp;quot;free&amp;quot;)+
  labs(title = &amp;#39;Average annual population growth (%), 1970-2017&amp;#39;,
       x = &amp;quot;&amp;quot;,
       y = &amp;quot;Average Annual Population Growth (in %)&amp;quot;,
       caption = &amp;#39;Source: World Bank&amp;#39;) +
  # theme(legend.position=&amp;quot;none&amp;quot;)+
  NULL&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://usi-emba-analytics.netlify.app/reference/world_bank_data_files/figure-html/unnamed-chunk-1-1.png&#34; width=&#34;110%&#34; /&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;world-happiness-how-does-it-correlate-with-various-indicators&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;World Happiness: how does it correlate with various indicators&lt;/h2&gt;
&lt;p&gt;Data from the &lt;a href=&#34;https://www.kaggle.com/unsdsn/world-happiness&#34;&gt;UN’s World Happiness Report&lt;/a&gt; is available on Kaggle. We have downloaded the 2015 report as a CSV file; let us have a quick glimpse at its structure.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;world_happiness_2015 &amp;lt;- read_csv(here::here(&amp;quot;data&amp;quot;, &amp;quot;world_happiness_2015.csv&amp;quot;))
glimpse(world_happiness_2015)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;As you notice, some of the variable names include a space, like &lt;code&gt;Happiness Rank&lt;/code&gt;, all start with a capital letter, etc. We will use &lt;code&gt;janitor::clean_names()&lt;/code&gt; to clean the variable names, so they are easier to deal with.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(janitor)

world_happiness_2015 &amp;lt;- world_happiness_2015 %&amp;gt;%
  clean_names()

glimpse(world_happiness_2015)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Rows: 158
## Columns: 12
## $ country                     &amp;lt;chr&amp;gt; &amp;quot;Switzerland&amp;quot;, &amp;quot;Iceland&amp;quot;, &amp;quot;Denmark&amp;quot;, &amp;quot;N...
## $ region                      &amp;lt;chr&amp;gt; &amp;quot;Western Europe&amp;quot;, &amp;quot;Western Europe&amp;quot;, &amp;quot;We...
## $ happiness_rank              &amp;lt;dbl&amp;gt; 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, ...
## $ happiness_score             &amp;lt;dbl&amp;gt; 7.59, 7.56, 7.53, 7.52, 7.43, 7.41, 7.3...
## $ standard_error              &amp;lt;dbl&amp;gt; 0.0341, 0.0488, 0.0333, 0.0388, 0.0355,...
## $ economy_gdp_per_capita      &amp;lt;dbl&amp;gt; 1.397, 1.302, 1.325, 1.459, 1.326, 1.29...
## $ family                      &amp;lt;dbl&amp;gt; 1.350, 1.402, 1.361, 1.331, 1.323, 1.31...
## $ health_life_expectancy      &amp;lt;dbl&amp;gt; 0.941, 0.948, 0.875, 0.885, 0.906, 0.88...
## $ freedom                     &amp;lt;dbl&amp;gt; 0.666, 0.629, 0.649, 0.670, 0.633, 0.64...
## $ trust_government_corruption &amp;lt;dbl&amp;gt; 0.4198, 0.1414, 0.4836, 0.3650, 0.3296,...
## $ generosity                  &amp;lt;dbl&amp;gt; 0.2968, 0.4363, 0.3414, 0.3470, 0.4581,...
## $ dystopia_residual           &amp;lt;dbl&amp;gt; 2.52, 2.70, 2.49, 2.47, 2.45, 2.62, 2.4...&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;First, we can look at how &lt;code&gt;happiness_score&lt;/code&gt; correlates with each of the variables the UN uses. We will use &lt;code&gt;GGally::ggpairs()&lt;/code&gt; to get a combined correlation and scatterplot matrix. We do not want to include in our analysis the country name, its region, the &lt;code&gt;happiness_rank&lt;/code&gt;, or the standard error associated with the estimate of the happiness score.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;world_happiness_2015 %&amp;gt;% 
  select(-country, -region, -happiness_rank, -standard_error) %&amp;gt;% 
  GGally::ggpairs()&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://usi-emba-analytics.netlify.app/reference/world_bank_data_files/figure-html/happiness_ggpairs-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;p&gt;We will now choose six (6) indicators from the World Bank data, download their values for 2015, and see how these correlate with the overall happiness score.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Download data for the following indicators

indicators &amp;lt;- c(&amp;quot;SE.PRM.NENR&amp;quot;,     # School enrollment, primary (% net)
                &amp;quot;SP.DYN.LE00.IN&amp;quot;,  # Life expectancy
                &amp;quot;SI.POV.DDAY&amp;quot;,     # Extreme poverty (% earning less than $2/day)
                &amp;quot;EG.ELC.ACCS.ZS&amp;quot;,  # Access to electricity
                &amp;quot;SI.POV.GINI&amp;quot;,     # GINI Index
                &amp;quot;NY.GDP.PCAP.KD&amp;quot;)  # GDP per capita


happiness_data_WB_long &amp;lt;- wb_data(country = &amp;quot;countries_only&amp;quot;, 
                             indicator = indicators, 
                             start_date = 2015, 
                             end_date = 2015,
                             #since we have many indicators, we should get the data in long format
                             return_wide=FALSE) 

# look at the long dataframe
glimpse(happiness_data_WB_long)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Rows: 1,302
## Columns: 11
## $ indicator_id &amp;lt;chr&amp;gt; &amp;quot;SE.PRM.NENR&amp;quot;, &amp;quot;SE.PRM.NENR&amp;quot;, &amp;quot;SE.PRM.NENR&amp;quot;, &amp;quot;SE.PRM.N...
## $ indicator    &amp;lt;chr&amp;gt; &amp;quot;School enrollment, primary (% net)&amp;quot;, &amp;quot;School enrollme...
## $ iso2c        &amp;lt;chr&amp;gt; &amp;quot;AF&amp;quot;, &amp;quot;AL&amp;quot;, &amp;quot;DZ&amp;quot;, &amp;quot;AS&amp;quot;, &amp;quot;AD&amp;quot;, &amp;quot;AO&amp;quot;, &amp;quot;AG&amp;quot;, &amp;quot;AR&amp;quot;, &amp;quot;AM&amp;quot;, ...
## $ iso3c        &amp;lt;chr&amp;gt; &amp;quot;AFG&amp;quot;, &amp;quot;ALB&amp;quot;, &amp;quot;DZA&amp;quot;, &amp;quot;ASM&amp;quot;, &amp;quot;AND&amp;quot;, &amp;quot;AGO&amp;quot;, &amp;quot;ATG&amp;quot;, &amp;quot;ARG&amp;quot;...
## $ country      &amp;lt;chr&amp;gt; &amp;quot;Afghanistan&amp;quot;, &amp;quot;Albania&amp;quot;, &amp;quot;Algeria&amp;quot;, &amp;quot;American Samoa&amp;quot;,...
## $ date         &amp;lt;dbl&amp;gt; 2015, 2015, 2015, 2015, 2015, 2015, 2015, 2015, 2015, ...
## $ value        &amp;lt;dbl&amp;gt; NA, 94.2, 97.5, NA, NA, NA, 94.2, 99.5, 92.7, NA, 97.0...
## $ unit         &amp;lt;chr&amp;gt; NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
## $ obs_status   &amp;lt;chr&amp;gt; NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
## $ footnote     &amp;lt;chr&amp;gt; NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, &amp;quot;Natio...
## $ last_updated &amp;lt;date&amp;gt; 2020-08-18, 2020-08-18, 2020-08-18, 2020-08-18, 2020-...&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;To combine the two dataframes into one, they must share a column (variable). We will merge the two datasets with a &lt;code&gt;left_join()&lt;/code&gt; by “country”, and glimpse the structure of the resulting dataframe.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Merge with a left_join (a) happiness data with all indicators and (b) the 2015 World Happiness index 

happiness &amp;lt;- 
  left_join(happiness_data_WB_long, world_happiness_2015, by=&amp;quot;country&amp;quot;) 

glimpse(happiness)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Rows: 1,302
## Columns: 22
## $ indicator_id                &amp;lt;chr&amp;gt; &amp;quot;SE.PRM.NENR&amp;quot;, &amp;quot;SE.PRM.NENR&amp;quot;, &amp;quot;SE.PRM.N...
## $ indicator                   &amp;lt;chr&amp;gt; &amp;quot;School enrollment, primary (% net)&amp;quot;, &amp;quot;...
## $ iso2c                       &amp;lt;chr&amp;gt; &amp;quot;AF&amp;quot;, &amp;quot;AL&amp;quot;, &amp;quot;DZ&amp;quot;, &amp;quot;AS&amp;quot;, &amp;quot;AD&amp;quot;, &amp;quot;AO&amp;quot;, &amp;quot;AG...
## $ iso3c                       &amp;lt;chr&amp;gt; &amp;quot;AFG&amp;quot;, &amp;quot;ALB&amp;quot;, &amp;quot;DZA&amp;quot;, &amp;quot;ASM&amp;quot;, &amp;quot;AND&amp;quot;, &amp;quot;AGO...
## $ country                     &amp;lt;chr&amp;gt; &amp;quot;Afghanistan&amp;quot;, &amp;quot;Albania&amp;quot;, &amp;quot;Algeria&amp;quot;, &amp;quot;A...
## $ date                        &amp;lt;dbl&amp;gt; 2015, 2015, 2015, 2015, 2015, 2015, 201...
## $ value                       &amp;lt;dbl&amp;gt; NA, 94.2, 97.5, NA, NA, NA, 94.2, 99.5,...
## $ unit                        &amp;lt;chr&amp;gt; NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
## $ obs_status                  &amp;lt;chr&amp;gt; NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
## $ footnote                    &amp;lt;chr&amp;gt; NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
## $ last_updated                &amp;lt;date&amp;gt; 2020-08-18, 2020-08-18, 2020-08-18, 20...
## $ region                      &amp;lt;chr&amp;gt; &amp;quot;Southern Asia&amp;quot;, &amp;quot;Central and Eastern E...
## $ happiness_rank              &amp;lt;dbl&amp;gt; 153, 95, 68, NA, NA, 137, NA, 30, 127, ...
## $ happiness_score             &amp;lt;dbl&amp;gt; 3.58, 4.96, 5.61, NA, NA, 4.03, NA, 6.5...
## $ standard_error              &amp;lt;dbl&amp;gt; 0.0308, 0.0501, 0.0510, NA, NA, 0.0476,...
## $ economy_gdp_per_capita      &amp;lt;dbl&amp;gt; 0.320, 0.879, 0.939, NA, NA, 0.758, NA,...
## $ family                      &amp;lt;dbl&amp;gt; 0.303, 0.804, 1.078, NA, NA, 0.860, NA,...
## $ health_life_expectancy      &amp;lt;dbl&amp;gt; 0.3034, 0.8133, 0.6177, NA, NA, 0.1668,...
## $ freedom                     &amp;lt;dbl&amp;gt; 0.2341, 0.3573, 0.2858, NA, NA, 0.1038,...
## $ trust_government_corruption &amp;lt;dbl&amp;gt; 0.09719, 0.06413, 0.17383, NA, NA, 0.07...
## $ generosity                  &amp;lt;dbl&amp;gt; 0.3651, 0.1427, 0.0782, NA, NA, 0.1234,...
## $ dystopia_residual           &amp;lt;dbl&amp;gt; 1.95, 1.90, 2.43, NA, NA, 1.95, NA, 2.8...&lt;/code&gt;&lt;/pre&gt;
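&lt;p&gt;The &lt;code&gt;NA&lt;/code&gt; values in the happiness columns above are a consequence of how &lt;code&gt;left_join()&lt;/code&gt; works: it keeps every row of the left dataframe and fills in &lt;code&gt;NA&lt;/code&gt; wherever the right dataframe has no matching “country”. The sketch below illustrates this on two tiny, made-up dataframes; the names &lt;code&gt;indicators_toy&lt;/code&gt; and &lt;code&gt;happiness_toy&lt;/code&gt; are hypothetical and only loosely mirror the real data.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(dplyr)

# two toy dataframes that share the column &amp;quot;country&amp;quot;
indicators_toy &amp;lt;- tibble(country = c(&amp;quot;Albania&amp;quot;, &amp;quot;American Samoa&amp;quot;),
                         value   = c(94.2, NA))
happiness_toy  &amp;lt;- tibble(country         = &amp;quot;Albania&amp;quot;,
                         happiness_score = 4.96)

# keeps both rows of indicators_toy; American Samoa gets NA for happiness_score
left_join(indicators_toy, happiness_toy, by = &amp;quot;country&amp;quot;)

# lists the rows of indicators_toy that found no match in happiness_toy
anti_join(indicators_toy, happiness_toy, by = &amp;quot;country&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Running &lt;code&gt;anti_join()&lt;/code&gt; on the real dataframes is a quick way to spot countries whose names are spelled differently in the two sources.&lt;/p&gt;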
&lt;p&gt;We can create a histogram of &lt;code&gt;happiness_score&lt;/code&gt; by region.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;ggplot(data = happiness, aes(x = happiness_score , fill=region))+
  geom_histogram()+
  theme_minimal()+
  facet_wrap(~region,nrow=5) +
  labs(title = &amp;#39;2015 World Happiness&amp;#39;,
       x = &amp;quot;&amp;quot;,
       y = &amp;quot;Total Happiness Score&amp;quot;,
       caption = &amp;#39;Source: Worldbank&amp;#39;) +
  theme(legend.position=&amp;quot;none&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://usi-emba-analytics.netlify.app/reference/world_bank_data_files/figure-html/happiness_histogram-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;p&gt;We can also create a scatterplot of &lt;code&gt;happiness_score&lt;/code&gt; against all the indicators we have downloaded.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;ggplot(data = happiness, aes(x = value, y = happiness_score , colour=indicator))+
  geom_point()+
  geom_smooth(se=FALSE)+
  theme_minimal()+
  facet_wrap(~indicator,scales=&amp;quot;free&amp;quot;) +
  labs(title = &amp;#39;2015 World Happiness&amp;#39;,
       x = &amp;quot;&amp;quot;,
       y = &amp;quot;Total Happiness Score&amp;quot;,
       caption = &amp;#39;Source: Worldbank&amp;#39;) +
  theme(legend.position=&amp;quot;none&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://usi-emba-analytics.netlify.app/reference/world_bank_data_files/figure-html/happiness_correlation-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;acknowledgments&#34; class=&#34;section level2 toc-ignore&#34;&gt;
&lt;h2&gt;Acknowledgments&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;This page is derived in part from &lt;a href=&#34;https://cran.r-project.org/web/packages/wbstats/vignettes/Using_the_wbstats_package.html&#34;&gt;Introduction to the &lt;code&gt;wbstats&lt;/code&gt; R-package&lt;/a&gt; by Jesse Piburn.&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
      <title>Installing the tidyverse</title>
      <link>https://usi-emba-analytics.netlify.app/reference/02-reference/</link>
      <pubDate>Sat, 25 Jul 2020 00:00:00 +0000</pubDate>
      <guid>https://usi-emba-analytics.netlify.app/reference/02-reference/</guid>
      <description>
&lt;script src=&#34;https://usi-emba-analytics.netlify.app/rmarkdown-libs/header-attrs/header-attrs.js&#34;&gt;&lt;/script&gt;

&lt;div id=&#34;TOC&#34;&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#installing-the-tidyverse&#34;&gt;Installing the &lt;code&gt;tidyverse&lt;/code&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#installing-the-tidyverse-if-you-have-a-mac&#34;&gt;Installing the &lt;code&gt;tidyverse&lt;/code&gt; if you have a Mac&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#installing-further-packages&#34;&gt;Installing further packages&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#install-from-github&#34;&gt;Install from &lt;em&gt;Github&lt;/em&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#updating-packages&#34;&gt;Updating packages&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#updating-r&#34;&gt;Updating R&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#install-tinytex&#34;&gt;Install &lt;code&gt;tinytex&lt;/code&gt;&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;

&lt;div id=&#34;installing-the-tidyverse&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Installing the &lt;code&gt;tidyverse&lt;/code&gt;&lt;/h2&gt;
&lt;p&gt;R packages are easy to install with RStudio. Select the packages panel, click on “Install,” type the name of the package you want to install, and press enter.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://usi-emba-analytics.netlify.app/img/install/install-r-package-panel.png&#34; width=&#34;60%&#34; /&gt;&lt;/p&gt;
&lt;p&gt;This can sometimes be tedious when you’re installing lots of packages, though. &lt;a href=&#34;https://www.tidyverse.org/&#34;&gt;The tidyverse&lt;/a&gt;&lt;a href=&#34;#fn1&#34; class=&#34;footnote-ref&#34; id=&#34;fnref1&#34;&gt;&lt;sup&gt;1&lt;/sup&gt;&lt;/a&gt;, for instance, consists of dozens of packages that all work together. Rather than installing each package individually, you can install &lt;code&gt;tidyverse&lt;/code&gt;, a meta-package if you wish, and get them all at the same time.&lt;/p&gt;
&lt;p&gt;Go to the packages panel in RStudio, click on “Install,” type “tidyverse”, and press enter. You’ll see a bunch of output in the RStudio console as all the tidyverse packages are installed.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://usi-emba-analytics.netlify.app/img/install/install-r-tidyverse.png&#34; width=&#34;60%&#34; /&gt;&lt;/p&gt;
&lt;p&gt;RStudio generates a line of code for you and runs it: &lt;code&gt;install.packages(&#34;tidyverse&#34;)&lt;/code&gt;. You can also just paste and run this instead of using the packages panel.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# install the major packages from the tidyverse
install.packages(&amp;quot;tidyverse&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This will take a while as &lt;code&gt;tidyverse&lt;/code&gt; is a collection of packages and R will have to install all dependencies.&lt;/p&gt;
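&lt;p&gt;To check that the installation worked, you can load the package and ask R which version it found. This is a minimal sketch; the version number you see will depend on when you installed.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(tidyverse)           # load the core tidyverse packages
packageVersion(&amp;quot;tidyverse&amp;quot;)  # report the installed version
tidyverse_packages()         # list the packages that belong to the tidyverse&lt;/code&gt;&lt;/pre&gt;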
&lt;/div&gt;
&lt;div id=&#34;installing-the-tidyverse-if-you-have-a-mac&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Installing the &lt;code&gt;tidyverse&lt;/code&gt; if you have a Mac&lt;/h2&gt;
&lt;p&gt;Unfortunately, installing the &lt;code&gt;tidyverse&lt;/code&gt; isn’t always straightforward on macOS 10.14 (Mojave), which was released on September 24, 2018.&lt;/p&gt;
&lt;p&gt;To solve issues that may arise from a missing &lt;code&gt;xml2&lt;/code&gt; library, please do the following:&lt;/p&gt;
&lt;ol style=&#34;list-style-type: decimal&#34;&gt;
&lt;li&gt;Open &lt;strong&gt;Terminal&lt;/strong&gt; (the tab right next to Console)&lt;/li&gt;
&lt;li&gt;Type&lt;/li&gt;
&lt;/ol&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;xcode-select --install&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Be careful as you do need &lt;strong&gt;two&lt;/strong&gt; (2) dashes before the &lt;code&gt;install&lt;/code&gt;. A software update popup window should appear that will ask if you want to install command line developer tools. Click on “Install” (you don’t need to click on “Get Xcode”).&lt;/p&gt;
&lt;ol start=&#34;3&#34; style=&#34;list-style-type: decimal&#34;&gt;
&lt;li&gt;Go to &lt;a href=&#34;https://brew.sh&#34;&gt;https://brew.sh&lt;/a&gt;, copy the long command under “Install Homebrew”, paste it into Terminal, and press enter. Older versions of that page showed a command starting with &lt;code&gt;/usr/bin/ruby -e&lt;/code&gt;; that Ruby installer has since been deprecated in favour of the Bash installer shown below.&lt;/li&gt;
&lt;/ol&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;/bin/bash -c &amp;quot;$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)&amp;quot;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This installs &lt;code&gt;Homebrew&lt;/code&gt;, which is special software that lets you install Unix-y programs from the terminal.&lt;/p&gt;
&lt;ol start=&#34;4&#34; style=&#34;list-style-type: decimal&#34;&gt;
&lt;li&gt;Type the following command line in Terminal to install &lt;code&gt;libxml2&lt;/code&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;brew install libxml2 &lt;/code&gt;&lt;/pre&gt;
&lt;ol start=&#34;5&#34; style=&#34;list-style-type: decimal&#34;&gt;
&lt;li&gt;Then, within RStudio, type&lt;/li&gt;
&lt;/ol&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;install.packages(&amp;quot;xml2&amp;quot;) &lt;/code&gt;&lt;/pre&gt;
&lt;ol start=&#34;6&#34; style=&#34;list-style-type: decimal&#34;&gt;
&lt;li&gt;Finally, you can now proceed with the installation of the &lt;code&gt;tidyverse&lt;/code&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;install.packages(&amp;quot;tidyverse&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;div id=&#34;installing-further-packages&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Installing further packages&lt;/h2&gt;
&lt;p&gt;Once the &lt;code&gt;tidyverse&lt;/code&gt; collection of packages installs and you get back to the R prompt &lt;code&gt;&amp;gt;&lt;/code&gt;, you can install a series of packages that will be useful later in the course. You can copy/paste the code below; please note that this will take quite a while, so grab a coffee.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# install these packages as well
list_of_packages &amp;lt;- c(
  &amp;quot;moderndive&amp;quot;,   # https://www.moderndive.com/
  &amp;quot;DT&amp;quot;,           # Interactive HTML tables, built on the DataTables JavaScript library
  &amp;quot;unvotes&amp;quot;,      # How countries have voted in UN resolutions
  &amp;quot;gridExtra&amp;quot;,    # Miscellaneous Functions for &amp;quot;Grid&amp;quot; Graphics
  &amp;quot;GGally&amp;quot;,       # Allows us to create a correlations/scatterplots matrix 
  &amp;quot;tidyquant&amp;quot;,    # Download and manipulate financial data
  &amp;quot;wbstats&amp;quot;,      # Download World Bank Data
  &amp;quot;eurostat&amp;quot;,     # Download data from Eurostat
  &amp;quot;fpp2&amp;quot;,         # Time Series and Forecasting functions, with data too 
  &amp;quot;car&amp;quot;,          # Applied Regression: lets us calculate the VIF, Variance Inflation Factor
  &amp;quot;gapminder&amp;quot;,    # Data on life expectancy, GDP/capita, and population by country and year
  &amp;quot;nycflights13&amp;quot;, # Data on all domestic flights through NYC&amp;#39;s 3 airports (JFK, EWR, LGA) in 2013
  &amp;quot;fivethirtyeight&amp;quot;, # Data used in articles that appeared on the fivethirtyeight.com website
  &amp;quot;corrr&amp;quot;,        # correlation in R
  &amp;quot;plotly&amp;quot;,       # interactive visualizations
  &amp;quot;sf&amp;quot;,           # tidy geo-computing
  &amp;quot;cowplot&amp;quot;,      # ggplot multiple figures addon
  &amp;quot;coefplot&amp;quot;,     # plot coefficients from fitted models
  &amp;quot;interplot&amp;quot;,    # plot effects of variables in interaction terms
  &amp;quot;scales&amp;quot;,       # scale functions for visualisations 
  &amp;quot;ggridges&amp;quot;,     # ridgeline plots in ggplot2
  &amp;quot;skimr&amp;quot;,        # nice dataframe summaries
  &amp;quot;leaflet&amp;quot;,      # interactive maps
  &amp;quot;ggrepel&amp;quot;,      # geoms for ggplot2 to repel overlapping text labels
  &amp;quot;viridis&amp;quot;,      # Colour Maps
  &amp;quot;rvest&amp;quot;,        # scrape webpages
  &amp;quot;usethis&amp;quot;,      # automation of package and project setup
  &amp;quot;remotes&amp;quot;,      # installing packages from Github
  &amp;quot;tidytext&amp;quot;,     # text mining
  &amp;quot;here&amp;quot;,         # finding your files 
  &amp;quot;mosaic&amp;quot;        # summary stats, using mosaic::favstats()
)

install.packages(list_of_packages, dependencies=TRUE, repos = &amp;quot;https://cran.rstudio.com/&amp;quot;)
&lt;/code&gt;&lt;/pre&gt;
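&lt;p&gt;If the installation is interrupted, or you are unsure which of these packages you already have, one option is to install only the missing ones. The sketch below assumes you have already created &lt;code&gt;list_of_packages&lt;/code&gt; as above; &lt;code&gt;missing_packages&lt;/code&gt; is just an illustrative name.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# names of packages from the list that are not yet installed
missing_packages &amp;lt;- setdiff(list_of_packages, rownames(installed.packages()))

# install only those, if there are any
if (length(missing_packages) &amp;gt; 0) {
  install.packages(missing_packages, dependencies = TRUE)
}&lt;/code&gt;&lt;/pre&gt;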
&lt;/div&gt;
&lt;div id=&#34;install-from-github&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Install from &lt;em&gt;Github&lt;/em&gt;&lt;/h2&gt;
&lt;p&gt;Most of the time the packages that you’ll want to install have been made available on CRAN, the &lt;em&gt;Comprehensive R Archive Network&lt;/em&gt;, so you use the &lt;code&gt;install.packages(&#34;package_name&#34;)&lt;/code&gt; function. Sometimes people write packages that are not submitted to CRAN, and sometimes you might want to try out a package that is currently under development. In these situations, people who write packages will often make them available on &lt;a href=&#34;https://github.com/&#34;&gt;GitHub&lt;/a&gt;. We can install packages directly from GitHub, using the &lt;strong&gt;remotes&lt;/strong&gt; package.&lt;/p&gt;
&lt;p&gt;The first thing you need to do is install &lt;strong&gt;remotes&lt;/strong&gt;, which is easy because that package is available on CRAN and hopefully you installed it with all the packages listed earlier. If not, run:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;install.packages(&amp;quot;remotes&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Once you install &lt;strong&gt;remotes&lt;/strong&gt;, you can either load it with &lt;code&gt;library(remotes)&lt;/code&gt; or, as we do here, call its functions directly with the &lt;code&gt;remotes::&lt;/code&gt; prefix. You can then use the &lt;code&gt;install_github&lt;/code&gt; command to install a package directly from a GitHub repository. For example, there’s an R data package featuring every Lego set from 1970 to 2015 put together by &lt;a href=&#34;https://github.com/seankross/lego&#34;&gt;Sean Kross&lt;/a&gt;.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;remotes::install_github(&amp;quot;seankross/lego&amp;quot;) #install the lego package directly from Github &lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;R fetches and installs the package from Github, and we now have the new &lt;strong&gt;lego&lt;/strong&gt; package to play with. To verify that everything worked properly, let’s load the &lt;code&gt;lego&lt;/code&gt; package and look at its &lt;code&gt;legosets&lt;/code&gt; dataframe:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(tidyverse) #load the tidyverse for glimpse(), the pipe, and ggplot() used below
library(lego)      #load the lego package into the computer&amp;#39;s memory

legosets          #view the legosets dataframe&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## # A tibble: 6,172 x 14
##    Item_Number Name    Year Theme Subtheme Pieces Minifigures Image_URL GBP_MSRP
##    &amp;lt;chr&amp;gt;       &amp;lt;chr&amp;gt;  &amp;lt;int&amp;gt; &amp;lt;chr&amp;gt; &amp;lt;chr&amp;gt;     &amp;lt;int&amp;gt;       &amp;lt;int&amp;gt; &amp;lt;chr&amp;gt;        &amp;lt;dbl&amp;gt;
##  1 10246       Detec~  2015 Adva~ &amp;quot;Modula~   2262           6 http://i~   133.  
##  2 10247       Ferri~  2015 Adva~ &amp;quot;Fairgr~   2464          10 http://i~   150.  
##  3 10248       Ferra~  2015 Adva~ &amp;quot;Vehicl~   1158          NA http://i~    70.0 
##  4 10249       Toy S~  2015 Adva~ &amp;quot;Winter~    898          NA http://i~    60.0 
##  5 10581       Ducks   2015 Duplo &amp;quot;Forest~     13           1 http://i~     9.99
##  6 10582       Anima~  2015 Duplo &amp;quot;Forest~     39           2 http://i~    17.0 
##  7 10583       Fishi~  2015 Duplo &amp;quot;Forest~     32           2 http://i~    20.0 
##  8 10584       Forest  2015 Duplo &amp;quot;Forest~    105           3 http://i~    50.0 
##  9 10585       Mom a~  2015 Duplo &amp;quot;&amp;quot;           13           2 http://i~     8.99
## 10 10586       Ice C~  2015 Duplo &amp;quot;&amp;quot;           11           2 http://i~    13.0 
## # ... with 6,162 more rows, and 5 more variables: USD_MSRP &amp;lt;dbl&amp;gt;,
## #   CAD_MSRP &amp;lt;dbl&amp;gt;, EUR_MSRP &amp;lt;dbl&amp;gt;, Packaging &amp;lt;chr&amp;gt;, Availability &amp;lt;chr&amp;gt;&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;glimpse(legosets) #examine the structure of the dataframe- variables, observations, type of variables, etc.&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Rows: 6,172
## Columns: 14
## $ Item_Number  &amp;lt;chr&amp;gt; &amp;quot;10246&amp;quot;, &amp;quot;10247&amp;quot;, &amp;quot;10248&amp;quot;, &amp;quot;10249&amp;quot;, &amp;quot;10581&amp;quot;, &amp;quot;10582&amp;quot;, &amp;quot;10~
## $ Name         &amp;lt;chr&amp;gt; &amp;quot;Detective&amp;#39;s Office&amp;quot;, &amp;quot;Ferris Wheel&amp;quot;, &amp;quot;Ferrari F40&amp;quot;, &amp;quot;Toy~
## $ Year         &amp;lt;int&amp;gt; 2015, 2015, 2015, 2015, 2015, 2015, 2015, 2015, 2015, 201~
## $ Theme        &amp;lt;chr&amp;gt; &amp;quot;Advanced Models&amp;quot;, &amp;quot;Advanced Models&amp;quot;, &amp;quot;Advanced Models&amp;quot;, ~
## $ Subtheme     &amp;lt;chr&amp;gt; &amp;quot;Modular Buildings&amp;quot;, &amp;quot;Fairground&amp;quot;, &amp;quot;Vehicles&amp;quot;, &amp;quot;Winter Vi~
## $ Pieces       &amp;lt;int&amp;gt; 2262, 2464, 1158, 898, 13, 39, 32, 105, 13, 11, 52, 13, 2~
## $ Minifigures  &amp;lt;int&amp;gt; 6, 10, NA, NA, 1, 2, 2, 3, 2, 2, 3, 1, NA, NA, NA, NA, 1,~
## $ Image_URL    &amp;lt;chr&amp;gt; &amp;quot;http://images.brickset.com/sets/images/10246-1.jpg&amp;quot;, &amp;quot;ht~
## $ GBP_MSRP     &amp;lt;dbl&amp;gt; 132.99, 149.99, 69.99, 59.99, 9.99, 16.99, 19.99, 49.99, ~
## $ USD_MSRP     &amp;lt;dbl&amp;gt; 159.99, 199.99, 99.99, 79.99, 9.99, 19.99, 24.99, 59.99, ~
## $ CAD_MSRP     &amp;lt;dbl&amp;gt; 199.99, 229.99, 119.99, NA, 12.99, 24.99, 29.99, 69.99, 1~
## $ EUR_MSRP     &amp;lt;dbl&amp;gt; 149.99, 179.99, 89.99, 69.99, 9.99, 19.99, 24.99, 59.99, ~
## $ Packaging    &amp;lt;chr&amp;gt; &amp;quot;Box&amp;quot;, &amp;quot;Box&amp;quot;, &amp;quot;Box&amp;quot;, &amp;quot;Box&amp;quot;, &amp;quot;Box&amp;quot;, &amp;quot;Box&amp;quot;, &amp;quot;Box&amp;quot;, &amp;quot;Box&amp;quot;, &amp;quot;~
## $ Availability &amp;lt;chr&amp;gt; &amp;quot;Retail - limited&amp;quot;, &amp;quot;Retail - limited&amp;quot;, &amp;quot;LEGO exclusive&amp;quot;,~&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The dataframe has 14 variables (or columns) and 6,172 observations (rows). Besides the item number, year, theme/subtheme and the number of pieces and minifigures contained in each Lego box, we also have the recommended retail prices in GBP, USD, CAD, and EUR. While we are at it, let us have a quick look at how Lego prices (in GBP) have evolved over the years.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;avg_price_per_year &amp;lt;- legosets %&amp;gt;% # create &amp;quot;avg_price_per_year&amp;quot; by taking legosets, and then
  filter(!is.na(GBP_MSRP)) %&amp;gt;%    # filter out entries with no GBP prices, GBP_MSRP, and then
  group_by(Year) %&amp;gt;%              # group prices by year
  summarise(Price = mean(GBP_MSRP)) # create variable &amp;quot;Price&amp;quot; = yearly average of GBP_MSRP

ggplot(avg_price_per_year, 
       mapping = aes(x = Year, y = Price)) +  # time series plot: x=Year, y=Price
  geom_point(size = 0.5) +                    # simple scatterplot Y vs. X
  geom_line(size = 0.5) +                     # add the black line between points
  geom_smooth(se = FALSE) +                   # fit a trend line; se = FALSE removes the error band around it
  labs(x = &amp;quot;Year&amp;quot;,   
       y = &amp;quot;Price (GBP)&amp;quot;, 
       title = &amp;quot;Average price of LEGO sets&amp;quot;,
       subtitle = &amp;quot;Amounts are reported in current GBP&amp;quot;,
       caption = &amp;quot;Source: LEGO&amp;quot;) +
  theme_bw()&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://usi-emba-analytics.netlify.app/reference/02-reference_files/figure-html/price-over-time-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;p&gt;There is a clear upward trend in average GBP prices.&lt;/p&gt;
&lt;p&gt;And since we are talking about LEGOs, here is a fun application of &lt;a href=&#34;http://www.ryantimpe.com/post/lego-mosaic1/&#34;&gt;creating LEGO mosaics from photos using R &amp;amp; the tidyverse&lt;/a&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;updating-packages&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Updating packages&lt;/h2&gt;
&lt;p&gt;Every now and then the authors of packages release updated versions. The updated versions often add new functionality, fix bugs, and so on. It’s a good idea to update your packages periodically.&lt;/p&gt;
&lt;p&gt;There’s an &lt;code&gt;update.packages&lt;/code&gt; function, but it’s probably easier to stick with the RStudio tool. In the packages tab, click on the &lt;code&gt;Update Packages&lt;/code&gt; button. This will bring up a window that looks like the one shown below:&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://usi-emba-analytics.netlify.app/img/packages_update.png&#34; width=&#34;60%&#34; /&gt;&lt;/p&gt;
&lt;p&gt;In this window, each row refers to a package that needs to be updated. You can select which updates to install by checking the boxes on the left. If you feel lazy, click the &lt;em&gt;Select All&lt;/em&gt; button, and then &lt;em&gt;Install Updates&lt;/em&gt;. This might take a while to complete depending on how fast your internet connection is.&lt;/p&gt;
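&lt;p&gt;If you prefer the console, roughly the same can be done with base R functions. A minimal sketch, assuming a CRAN mirror is already configured (RStudio sets one by default):&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;old.packages()                # list installed packages that have a newer version available
update.packages(ask = FALSE)  # update all of them without asking about each one&lt;/code&gt;&lt;/pre&gt;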
&lt;/div&gt;
&lt;div id=&#34;updating-r&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Updating R&lt;/h2&gt;
&lt;p&gt;A few times a year, a new version of R is released, and package authors update their packages to stay compatible with it. A side effect is that when you update to a new major or minor version of R (for example, from 4.0 to 4.1), the new installation typically starts with a fresh package library, so the packages you had downloaded and installed are no longer available. Unfortunately, you then need to install them again, even though the new versions will typically behave just like the old ones.&lt;/p&gt;
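&lt;p&gt;One way to make reinstalling less painful is to save the names of your installed packages before updating R and to install whatever is missing afterwards. The sketch below is one possible approach, not the only one; the file name &lt;code&gt;installed_packages.rds&lt;/code&gt; is an arbitrary choice, and packages that came from GitHub rather than CRAN (such as &lt;strong&gt;lego&lt;/strong&gt;) would still need to be reinstalled with &lt;strong&gt;remotes&lt;/strong&gt;.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# BEFORE updating R: save the names of all installed packages to a file
saveRDS(rownames(installed.packages()), &amp;quot;installed_packages.rds&amp;quot;)

# AFTER updating R: read the names back and install whatever is missing
previous_packages &amp;lt;- readRDS(&amp;quot;installed_packages.rds&amp;quot;)
missing_packages  &amp;lt;- setdiff(previous_packages, rownames(installed.packages()))

if (length(missing_packages) &amp;gt; 0) {
  install.packages(missing_packages)
}&lt;/code&gt;&lt;/pre&gt;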
&lt;/div&gt;
&lt;div id=&#34;install-tinytex&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Install &lt;code&gt;tinytex&lt;/code&gt;&lt;/h2&gt;
&lt;p&gt;When you knit to PDF, R uses a special scientific typesetting program named LaTeX (pronounced “lay-tek” or “lah-tex”; for goofy nerdy reasons, the x is technically the “ch” sound in “Bach”, but most people just say it as “k”—saying “layteks” is frowned on for whatever reason).&lt;/p&gt;
&lt;p&gt;LaTeX makes pretty documents, but it’s a huge program—&lt;a href=&#34;https://tug.org/mactex/mactex-download.html&#34;&gt;the macOS version, for instance, is nearly 4 GB&lt;/a&gt;! To make life easier, there’s &lt;a href=&#34;https://yihui.org/tinytex/&#34;&gt;an R package named &lt;strong&gt;tinytex&lt;/strong&gt;&lt;/a&gt; that installs a minimal LaTeX program and that automatically deals with differences between macOS and Windows.&lt;/p&gt;
&lt;p&gt;Here’s how to install &lt;strong&gt;tinytex&lt;/strong&gt; so you can knit to pretty PDFs:&lt;/p&gt;
&lt;ol style=&#34;list-style-type: decimal&#34;&gt;
&lt;li&gt;Use the Packages panel in RStudio to install &lt;strong&gt;tinytex&lt;/strong&gt; like you did above with &lt;strong&gt;tidyverse&lt;/strong&gt;. Alternatively, run &lt;code&gt;install.packages(&#34;tinytex&#34;)&lt;/code&gt; in the console.&lt;/li&gt;
&lt;li&gt;Run &lt;code&gt;tinytex::install_tinytex()&lt;/code&gt; in the console.&lt;/li&gt;
&lt;li&gt;Wait for a bit while R downloads and installs everything you need.&lt;/li&gt;
&lt;li&gt;The end! You should now be able to knit to PDF.&lt;/li&gt;
&lt;/ol&gt;
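&lt;p&gt;The steps above can also be run entirely from the console. A minimal sketch; you may need to restart RStudio before the last line reports &lt;code&gt;TRUE&lt;/code&gt;.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;install.packages(&amp;quot;tinytex&amp;quot;)  # step 1: install the R package
tinytex::install_tinytex()   # step 2: download and install the minimal LaTeX distribution
tinytex::is_tinytex()        # check whether the LaTeX distribution in use is TinyTeX&lt;/code&gt;&lt;/pre&gt;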
&lt;/div&gt;
&lt;div class=&#34;footnotes&#34;&gt;
&lt;hr /&gt;
&lt;ol&gt;
&lt;li id=&#34;fn1&#34;&gt;&lt;p&gt;A universe of packages centered around tidy data, including &lt;code&gt;ggplot2&lt;/code&gt;&lt;a href=&#34;#fnref1&#34; class=&#34;footnote-back&#34;&gt;↩︎&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
      <title>Using R Markdown</title>
      <link>https://usi-emba-analytics.netlify.app/reference/04-reference/</link>
      <pubDate>Sat, 25 Jul 2020 00:00:00 +0000</pubDate>
      <guid>https://usi-emba-analytics.netlify.app/reference/04-reference/</guid>
      <description>

&lt;div id=&#34;TOC&#34;&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#reproducibility-in-scientific-research&#34;&gt;Reproducibility in scientific research&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#r-markdown-markdown-r-code&#34;&gt;R Markdown = Markdown + R Code&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#key-terms&#34;&gt;Key terms&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#add-chunks&#34;&gt;Add chunks&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#chunk-names&#34;&gt;Chunk names&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#chunk-options&#34;&gt;Chunk options&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#inline-chunks&#34;&gt;Inline chunks&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#caching&#34;&gt;Caching&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#output-formats&#34;&gt;Output formats&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#table-of-contents&#34;&gt;Table of contents&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#appearance-and-style&#34;&gt;Appearance and style&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#other-references&#34;&gt;Other references&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;

&lt;div id=&#34;reproducibility-in-scientific-research&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Reproducibility in scientific research&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Reproducibility&lt;/strong&gt; is the idea that data analyses, and more generally scientific claims, are published with their data and software code so that others may try to replicate the same work, get similar results, and build upon the works of others.&lt;/p&gt;
&lt;p&gt;While this sounds obvious, it actually happens far less frequently than it should.&lt;/p&gt;
&lt;p&gt;For instance, scientists at the biotechnology company Amgen were unable to replicate the majority of published preclinical cancer research studies; as a matter of fact, &lt;a href=&#34;https://www.nature.com/articles/483531a&#34;&gt;only 6 out of 53 landmark results could be reproduced&lt;/a&gt;. Similarly, it has been argued that the &lt;a href=&#34;https://www.ncbi.nlm.nih.gov/pubmed/25552691&#34;&gt;great majority of preclinical results cannot be reproduced&lt;/a&gt;, with the &lt;a href=&#34;https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.1002165&#34;&gt;&lt;strong&gt;annual&lt;/strong&gt; cost of irreproducible preclinical research estimated at 28 billion USD&lt;/a&gt;.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;“You are always working with at least one collaborator: Future you.” &lt;br&gt;
      – Hadley Wickham&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Suppose that your colleague sends you an Excel file with an analysis she has undertaken. The Excel file is likely to contain the raw data, but also graphs, results, etc. that were generated from the data. If you have ever received such a file, you know how long it takes to navigate around it and work out the logic used to arrive at the results.&lt;/p&gt;
&lt;p&gt;Data analysts who implement reproducibility in their projects can quickly and easily reproduce the original results and trace back to determine how they were derived. &lt;strong&gt;Literate programming&lt;/strong&gt;, an idea from &lt;a href=&#34;https://en.wikipedia.org/wiki/Donald_Knuth&#34;&gt;Donald Knuth&lt;/a&gt;, is a technique for mixing written text, where you write notes explaining what you did and why, with chunks of code that produce your graphs, analyses, etc.&lt;/p&gt;
&lt;p&gt;This makes documentation of code easier, enables verification and replication, and allows the analyst to precisely replicate her analysis. This is extremely important when revisiting work months later, because you are unlikely to remember how all the code and analysis fit together.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Let us change our traditional attitude to the construction of programs: Instead of imagining that our main task is to instruct a computer what to do, let us concentrate rather on explaining to humans what we want the computer to do. &lt;br&gt;
      – Donald E. Knuth (1984), &lt;em&gt;Literate Programming&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Reproducibility is also key for communicating findings to others, including decision makers; it allows them to follow your logic and verify your results, assess your assumptions, and understand how your answers were formed rather than relying solely on your claimed results. In the data science framework employed in &lt;a href=&#34;http://r4ds.had.co.nz&#34;&gt;R for Data Science&lt;/a&gt;, reproducibility is infused throughout the entire workflow.&lt;/p&gt;
&lt;p&gt;To judge whether your work is reproducible, ask yourself:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Are the results (tables and figures) reproducible from the code and data?&lt;/li&gt;
&lt;li&gt;Does the code actually do what you think it does?&lt;/li&gt;
&lt;li&gt;Is the code well documented so someone else can follow your work?&lt;/li&gt;
&lt;li&gt;In addition to what was done, is it clear &lt;strong&gt;why&lt;/strong&gt; it was done? (e.g., how were parameter settings chosen?)&lt;/li&gt;
&lt;li&gt;Can the code be used for other, or newer, data?&lt;/li&gt;
&lt;li&gt;Can you generalise the code to do other things?&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;div id=&#34;r-markdown-markdown-r-code&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;R Markdown = Markdown + R Code&lt;/h2&gt;
&lt;p&gt;&lt;a href=&#34;https://rmarkdown.rstudio.com/&#34;&gt;R Markdown&lt;/a&gt; is &lt;a href=&#34;https://usi-emba-analytics.netlify.app/reference/markdown/&#34;&gt;regular Markdown&lt;/a&gt; with R code and output sprinkled in. You can do everything you can with &lt;a href=&#34;https://usi-emba-analytics.netlify.app/reference/markdown/&#34;&gt;regular Markdown&lt;/a&gt;, but you can incorporate graphs, tables, and other R output directly in your document. You can create HTML, PDF, and Word documents, PowerPoint and HTML presentations, websites, books, and even &lt;a href=&#34;https://rmarkdown.rstudio.com/flexdashboard/index.html&#34;&gt;interactive dashboards&lt;/a&gt; with R Markdown. This whole course website is created with R Markdown (and &lt;a href=&#34;https://bookdown.org/yihui/blogdown/&#34;&gt;a package named &lt;strong&gt;blogdown&lt;/strong&gt;&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;&lt;code&gt;rmarkdown&lt;/code&gt; and &lt;code&gt;knitr&lt;/code&gt; are a powerful combination of packages for literate programming, reproducible analysis, and document generation. Together they can:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Combine R code and Markdown syntax&lt;/li&gt;
&lt;li&gt;Produce documents in PDF, Microsoft Word, and various types of HTML documents&lt;br /&gt;
&lt;/li&gt;
&lt;li&gt;Incorporate “extras” like interactive graphics when the output is HTML&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;An R Markdown file is a plain text file that uses the extension &lt;code&gt;.Rmd&lt;/code&gt; and contains three (3) major components:&lt;/p&gt;
&lt;ol style=&#34;list-style-type: decimal&#34;&gt;
&lt;li&gt;A &lt;strong&gt;YAML header&lt;/strong&gt; surrounded by &lt;code&gt;---&lt;/code&gt;s. This is the &lt;strong&gt;metadata&lt;/strong&gt; of the document and it tells you how it is formed - what the &lt;strong&gt;title&lt;/strong&gt; is, the &lt;strong&gt;author&lt;/strong&gt;, &lt;strong&gt;date&lt;/strong&gt;, &lt;strong&gt;output&lt;/strong&gt;, and other control information.&lt;br /&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Chunks&lt;/strong&gt; of R code surrounded by &lt;code&gt;```&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Text mixed with simple text formatting using the &lt;a href=&#34;https://www.markdowntutorial.com/&#34;&gt;Markdown syntax&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Code chunks are interspersed with text throughout the document. To complete the document, you “Knit” or “render” the document. Most of you probably knit the document by clicking the “Knit” button in the script editor panel. You can also do this programmatically from the console by running the command &lt;code&gt;rmarkdown::render(&#34;example.Rmd&#34;)&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;When you &lt;strong&gt;knit&lt;/strong&gt; the document you send your &lt;code&gt;.Rmd&lt;/code&gt; file to &lt;code&gt;knitr&lt;/code&gt;, a package for R that executes all the code chunks and creates a second &lt;strong&gt;markdown&lt;/strong&gt; document (&lt;code&gt;.md&lt;/code&gt;). That markdown document is then passed on to &lt;a href=&#34;http://pandoc.org/&#34;&gt;&lt;strong&gt;pandoc&lt;/strong&gt;&lt;/a&gt;, a document conversion program that is independent of R. Pandoc allows users to convert back and forth between many different document formats such as HTML, &lt;span class=&#34;math inline&#34;&gt;\(\LaTeX\)&lt;/span&gt;, Microsoft Word, etc. Because the workflow is split into these two steps, you can convert your R Markdown document into a wide range of output formats.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://r4ds.had.co.nz/images/RMarkdownFlow.png&#34; /&gt;&lt;/p&gt;
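&lt;p&gt;The two steps can also be run by hand from the console, which may help make the pipeline concrete. The sketch below is only an illustration: it assumes that the &lt;code&gt;knitr&lt;/code&gt; and &lt;code&gt;rmarkdown&lt;/code&gt; packages are installed and that pandoc is available (RStudio bundles it), and the file content is a made-up example. In everyday use, &lt;code&gt;rmarkdown::render()&lt;/code&gt; performs both steps for you.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Write a tiny R Markdown file to a temporary folder (hypothetical content)
rmd &amp;lt;- tempfile(fileext = &amp;quot;.Rmd&amp;quot;)
writeLines(c(&amp;quot;# Example&amp;quot;, &amp;quot;&amp;quot;, &amp;quot;```{r}&amp;quot;, &amp;quot;summary(cars)&amp;quot;, &amp;quot;```&amp;quot;), rmd)

# Step 1: knitr runs the chunks and writes a plain Markdown file
md &amp;lt;- knitr::knit(rmd, output = sub(&amp;quot;\\.Rmd$&amp;quot;, &amp;quot;.md&amp;quot;, rmd), quiet = TRUE)

# Step 2: pandoc converts the Markdown file to HTML
html &amp;lt;- sub(&amp;quot;\\.md$&amp;quot;, &amp;quot;.html&amp;quot;, md)
rmarkdown::pandoc_convert(md, to = &amp;quot;html&amp;quot;, output = html)

# Check that the HTML file was created
file.exists(html)&lt;/code&gt;&lt;/pre&gt;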
&lt;p&gt;The &lt;a href=&#34;https://rmarkdown.rstudio.com/&#34;&gt;documentation for R Markdown&lt;/a&gt; is extremely comprehensive, and their &lt;a href=&#34;https://rmarkdown.rstudio.com/lesson-1.html&#34;&gt;tutorials&lt;/a&gt; and &lt;a href=&#34;https://rmarkdown.rstudio.com/lesson-15.html&#34;&gt;cheatsheets&lt;/a&gt; are excellent—rely on those.&lt;/p&gt;
&lt;p&gt;Here are the most important things you’ll need to know about R Markdown in this class:&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;key-terms&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Key terms&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Document&lt;/strong&gt;: A Markdown file where you type stuff&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Chunk&lt;/strong&gt;: A piece of R code that is included in your document. It looks like this:&lt;/p&gt;
&lt;pre class=&#34;markdown&#34;&gt;&lt;code&gt;```{r chunk_name}
# Code goes here
```&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;There must be an empty line before and after the chunk. The final three backticks must be the only thing on the line—if you add more text, or if you forget to add the backticks, or accidentally delete the backticks, your document will not knit correctly.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Knit&lt;/strong&gt;: When you “knit” a document, R runs each of the chunks sequentially and converts the output of each chunk into Markdown. R then runs the knitted document through &lt;a href=&#34;https://pandoc.org/&#34;&gt;pandoc&lt;/a&gt; to convert it to HTML or PDF or Word (or whatever output you’ve selected).&lt;/p&gt;
&lt;p&gt;You can knit by clicking on the “Knit” button at the top of the editor window, or by pressing &lt;code&gt;⌘⇧K&lt;/code&gt; on macOS or &lt;code&gt;control + shift + K&lt;/code&gt; on Windows.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://usi-emba-analytics.netlify.app/img/assignments/knit-button.png&#34; width=&#34;30%&#34; /&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;div id=&#34;add-chunks&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Add chunks&lt;/h2&gt;
&lt;p&gt;There are three ways to insert chunks:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Press &lt;code&gt;⌘⌥I&lt;/code&gt; on macOS or &lt;code&gt;control + alt + I&lt;/code&gt; on Windows&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Click on the “Insert” button at the top of the editor window&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://usi-emba-analytics.netlify.app/img/reference/insert-chunk.png&#34; width=&#34;30%&#34; /&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Manually type all the backticks and curly braces (don’t do this)&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;div id=&#34;chunk-names&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Chunk names&lt;/h2&gt;
&lt;p&gt;You can add names to chunks to make it easier to navigate your document. If you click on the little dropdown menu at the bottom of your editor in RStudio, you can see a table of contents that shows all the headings and chunks. If you name chunks, they’ll appear in the list. If you don’t include a name, the chunk will still show up, but you won’t know what it does.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://usi-emba-analytics.netlify.app/img/reference/chunk-toc.png&#34; width=&#34;40%&#34; /&gt;&lt;/p&gt;
&lt;p&gt;To add a name, include it immediately after the &lt;code&gt;{r&lt;/code&gt; in the first line of the chunk. Names cannot contain spaces, but they can contain underscores and dashes. &lt;strong&gt;All chunk names in your document must be unique.&lt;/strong&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;A word of caution: If you use the same chunk name more than once, &lt;code&gt;knitr&lt;/code&gt; will give you an error message and refuse to knit your Rmd document. So if you copy/paste a named chunk, make sure you give each copy a unique name.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;pre class=&#34;markdown&#34;&gt;&lt;code&gt;```{r name-of-this-chunk}
# Code goes here
```&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;div id=&#34;chunk-options&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Chunk options&lt;/h2&gt;
&lt;p&gt;There are a bunch of different options you can set for each chunk. You can see a complete list in the &lt;a href=&#34;https://rstudio.com/wp-content/uploads/2015/03/rmarkdown-reference.pdf&#34;&gt;RMarkdown Reference Guide&lt;/a&gt; or at &lt;a href=&#34;https://yihui.org/knitr/options/&#34;&gt;&lt;strong&gt;knitr&lt;/strong&gt;’s website&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Options go inside the &lt;code&gt;{r}&lt;/code&gt; section of the chunk:&lt;/p&gt;
&lt;pre class=&#34;markdown&#34;&gt;&lt;code&gt;```{r name-of-this-chunk, warning=FALSE, message=FALSE}
# Code goes here
```&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The most common chunk options are these:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;fig.width=5&lt;/code&gt; and &lt;code&gt;fig.height=3&lt;/code&gt; (&lt;em&gt;or whatever number you want&lt;/em&gt;): Set the dimensions for figures&lt;/li&gt;
&lt;li&gt;&lt;code&gt;echo=FALSE&lt;/code&gt;: The code is not shown in the final document, but the results are.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;include=FALSE&lt;/code&gt;: The chunk still runs, but the code and results are not included in the final document&lt;/li&gt;
&lt;li&gt;&lt;code&gt;message=FALSE&lt;/code&gt;: Any messages that R generates (like all the notes that appear after you load a package) are omitted&lt;/li&gt;
&lt;li&gt;&lt;code&gt;warning=FALSE&lt;/code&gt;: Any warnings that R generates are omitted&lt;/li&gt;
&lt;li&gt;&lt;code&gt;eval=FALSE&lt;/code&gt;: The code is shown but not evaluated. I use this in my notes for class when I want to show how to write a specific function but don’t need to actually use it.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;error=TRUE&lt;/code&gt;: The document continues knitting and rendering even if the code generates an error. If you’re debugging your code, you might want to use this option. However, for the final version of your work, you do not want to allow errors to pass through unnoticed.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;You can also set chunk options by clicking on the little gear icon in the top right corner of any chunk:&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://usi-emba-analytics.netlify.app/img/reference/chunk-options.png&#34; width=&#34;70%&#34; /&gt;&lt;/p&gt;
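&lt;p&gt;If you find yourself repeating the same options in every chunk, one possible shortcut is to set document-wide defaults with &lt;code&gt;knitr::opts_chunk$set()&lt;/code&gt;, usually in a setup chunk near the top of the file. The values below are only an example, not a course requirement; any individual chunk can still override them in its own &lt;code&gt;{r}&lt;/code&gt; header.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Defaults applied to every chunk that follows (example values)
knitr::opts_chunk$set(
  echo = FALSE,     # hide the code, keep the results
  message = FALSE,  # omit messages such as package start-up notes
  warning = FALSE,  # omit warnings
  fig.width = 5,    # default figure width
  fig.height = 3    # default figure height
)&lt;/code&gt;&lt;/pre&gt;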
&lt;/div&gt;
&lt;div id=&#34;inline-chunks&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Inline chunks&lt;/h2&gt;
&lt;p&gt;You can also include R output directly in your text, which is really helpful if you want to report numbers from your analysis. To do this, use &lt;code&gt;`r r_code_here`&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;It’s generally easiest to calculate numbers in a regular chunk beforehand and then use an inline chunk to display the value in your text. For instance, this document…&lt;/p&gt;
&lt;pre class=&#34;markdown&#34;&gt;&lt;code&gt;```{r find-avg-mpg, echo=FALSE}
avg_mpg &amp;lt;- mean(mtcars$mpg)
```

The average fuel efficiency for cars from 1974 was `r round(avg_mpg, 1)` miles per gallon.&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;… would knit into this:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The average fuel efficiency for cars from 1974 was 20.1 miles per gallon.&lt;/p&gt;
&lt;/blockquote&gt;
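&lt;p&gt;Before relying on an inline chunk, it can help to run the same expression in the console and confirm that it returns a single, sensibly rounded value. A minimal check for the example above, using the built-in &lt;code&gt;mtcars&lt;/code&gt; data, might look like this:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;avg_mpg &amp;lt;- mean(mtcars$mpg)

avg_mpg            # unrounded value, too many decimals for running text
round(avg_mpg, 1)  # the value the inline chunk inserts: 20.1&lt;/code&gt;&lt;/pre&gt;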
&lt;/div&gt;
&lt;div id=&#34;caching&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Caching&lt;/h2&gt;
&lt;p&gt;By default, every time you knit a document R starts anew and no previous results are saved.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://r4ds.had.co.nz/images/RMarkdownFlow.png&#34; /&gt;&lt;/p&gt;
&lt;p&gt;If you have code chunks that run computationally intensive tasks, like running a &lt;code&gt;ggpairs()&lt;/code&gt; correlation/scatterplot matrix on a large dataset, you might want to store these results to save time. If you use &lt;code&gt;cache = TRUE&lt;/code&gt;, R will do exactly this. The output of the chunk will be saved to a specially named file on disk. From then on, every time you knit the document the cached results will be used instead of running the code fresh, as long as the chunk’s code and options have not changed.&lt;/p&gt;
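&lt;p&gt;As a rough illustration, a cached chunk could look like the one below. The chunk name and the choice of columns are made up for this example, and it assumes the &lt;code&gt;GGally&lt;/code&gt; package (which provides &lt;code&gt;ggpairs()&lt;/code&gt;) is installed.&lt;/p&gt;
&lt;pre class=&#34;markdown&#34;&gt;&lt;code&gt;```{r pairs-plot, cache=TRUE}
# Slow on large data; with cache=TRUE the result is stored and reused
library(GGally)
ggpairs(mtcars[, c(&amp;quot;mpg&amp;quot;, &amp;quot;hp&amp;quot;, &amp;quot;wt&amp;quot;)])
```&lt;/code&gt;&lt;/pre&gt;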
&lt;/div&gt;
&lt;div id=&#34;output-formats&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Output formats&lt;/h2&gt;
&lt;p&gt;You can specify what kind of document you create when you knit in the &lt;a href=&#34;https://usi-emba-analytics.netlify.app/reference/markdown/#front-matter&#34;&gt;YAML front matter&lt;/a&gt;.&lt;/p&gt;
&lt;pre class=&#34;yaml&#34;&gt;&lt;code&gt;title: &amp;quot;My document&amp;quot;
output:
  html_document: default
  pdf_document: default
  word_document: default&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You can also click on the down arrow on the “Knit” button to choose the output &lt;em&gt;and&lt;/em&gt; generate the appropriate YAML. If you click on the gear icon next to the “Knit” button and choose “Output options”, you can change settings for each specific output type, like default figure dimensions or whether a table of contents is included.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://usi-emba-analytics.netlify.app/img/reference/output-options.png&#34; width=&#34;35%&#34; /&gt;&lt;/p&gt;
&lt;p&gt;The first output type listed under &lt;code&gt;output:&lt;/code&gt; will be what is generated when you click on the “Knit” button or press the keyboard shortcut (&lt;code&gt;⌘⇧K&lt;/code&gt; on macOS; &lt;code&gt;control + shift + K&lt;/code&gt; on Windows). If you choose a different output with the “Knit” button menu, that output will be moved to the top of the &lt;code&gt;output&lt;/code&gt; section.&lt;/p&gt;
&lt;p&gt;The indentation of the YAML section matters, especially when you have settings nested under each output type. Here’s what a typical &lt;code&gt;output&lt;/code&gt; section might look like:&lt;/p&gt;
&lt;pre class=&#34;yaml&#34;&gt;&lt;code&gt;---
title: &amp;quot;My document&amp;quot;
author: &amp;quot;My name&amp;quot;
date: &amp;quot;January 13, 2020&amp;quot;
output: 
  html_document: 
    toc: yes
    fig_caption: yes
    fig_height: 8
    fig_width: 10
  pdf_document: 
    latex_engine: xelatex  # More modern PDF typesetting engine
    toc: yes
  word_document: 
    toc: yes
    fig_caption: yes
    fig_height: 4
    fig_width: 5
---&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;div id=&#34;table-of-contents&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Table of contents&lt;/h2&gt;
&lt;p&gt;Each output format has various options to customize the appearance of the final document. One option for HTML documents is to add a &lt;strong&gt;t&lt;/strong&gt;able &lt;strong&gt;o&lt;/strong&gt;f &lt;strong&gt;c&lt;/strong&gt;ontents through the &lt;code&gt;toc&lt;/code&gt; option. To add any option for an output format, just add it in a hierarchical format like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;---
title: &amp;quot;My report&amp;quot;
author: &amp;quot;My Name&amp;quot;
date: 2020-07-26
output:  
  html_document:
    toc: true
    toc_depth: 2
---&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You can explicitly set the number of levels included in the table of contents with &lt;code&gt;toc_depth&lt;/code&gt; (the default is 3).&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;appearance-and-style&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Appearance and style&lt;/h2&gt;
&lt;p&gt;There are several options that control the visual appearance of HTML documents.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;theme&lt;/code&gt;&lt;/strong&gt; specifies the Bootstrap theme to use for the page (themes are drawn from the &lt;a href=&#34;http://bootswatch.com/&#34; target=&#34;_blank&#34;&gt;Bootswatch&lt;/a&gt; theme library). Valid themes include &lt;code&gt;default&lt;/code&gt;, &lt;code&gt;cerulean&lt;/code&gt;, &lt;code&gt;journal&lt;/code&gt;, &lt;code&gt;flatly&lt;/code&gt;, &lt;code&gt;readable&lt;/code&gt;, &lt;code&gt;spacelab&lt;/code&gt;, &lt;code&gt;united&lt;/code&gt;, &lt;code&gt;cosmo&lt;/code&gt;, &lt;code&gt;lumen&lt;/code&gt;, &lt;code&gt;paper&lt;/code&gt;, &lt;code&gt;sandstone&lt;/code&gt;, &lt;code&gt;simplex&lt;/code&gt;, and &lt;code&gt;yeti&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;highlight&lt;/code&gt;&lt;/strong&gt; specifies the syntax highlighting style for code chunks. Supported styles include &lt;code&gt;default&lt;/code&gt;, &lt;code&gt;tango&lt;/code&gt;, &lt;code&gt;pygments&lt;/code&gt;, &lt;code&gt;kate&lt;/code&gt;, &lt;code&gt;monochrome&lt;/code&gt;, &lt;code&gt;espresso&lt;/code&gt;, &lt;code&gt;zenburn&lt;/code&gt;, &lt;code&gt;haddock&lt;/code&gt;, and &lt;code&gt;textmate&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
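&lt;p&gt;Both options are nested under &lt;code&gt;html_document&lt;/code&gt; in the YAML header. The title and the particular theme and highlight values below are arbitrary picks from the lists above, shown only to illustrate the indentation:&lt;/p&gt;
&lt;pre class=&#34;yaml&#34;&gt;&lt;code&gt;---
title: &amp;quot;My report&amp;quot;
output:
  html_document:
    theme: flatly     # one of the Bootswatch themes listed above
    highlight: tango  # one of the highlighting styles listed above
---&lt;/code&gt;&lt;/pre&gt;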
&lt;/div&gt;
&lt;div id=&#34;other-references&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Other references&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://themockup.blog/posts/2020-07-25-meta-rmarkdown/&#34; target=&#34;_blank&#34;&gt;How I share knowledge around R Markdown&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://rmd4sci.njtierney.com/&#34; target=&#34;_blank&#34;&gt;RMarkdown for Scientists&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://rstudio.com/resources/rstudioconf-2020/don-t-repeat-yourself-talk-to-yourself-repeated-reporting-in-the-r-universe/&#34; target=&#34;_blank&#34;&gt;Don’t repeat yourself, talk to yourself! Repeated reporting in the R universe&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
      <title>Eurostat Data</title>
      <link>https://usi-emba-analytics.netlify.app/reference/eurostat_data/</link>
      <pubDate>Fri, 02 Oct 2020 00:00:00 +0000</pubDate>
      <guid>https://usi-emba-analytics.netlify.app/reference/eurostat_data/</guid>
      <description>
&lt;script src=&#34;https://usi-emba-analytics.netlify.app/rmarkdown-libs/header-attrs/header-attrs.js&#34;&gt;&lt;/script&gt;

&lt;div id=&#34;TOC&#34;&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#eurostat-data-with-the-eurostat-package&#34;&gt;Eurostat Data with the &lt;code&gt;eurostat&lt;/code&gt; package&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#house-price-index-hpi&#34;&gt;House Price Index (HPI)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#tourism-seasonality-in-the-meditteranean&#34;&gt;Tourism Seasonality in the Meditteranean&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#disposable-income-of-private-households-by-nuts-2-regions&#34;&gt;Disposable income of private households by NUTS 2 regions&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#acknowledgments&#34;&gt;Acknowledgments&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;

&lt;div id=&#34;eurostat-data-with-the-eurostat-package&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Eurostat Data with the &lt;code&gt;eurostat&lt;/code&gt; package&lt;/h2&gt;
&lt;p&gt;The &lt;code&gt;eurostat&lt;/code&gt; package provides access to well over 9000 datasets from &lt;a href=&#34;https://ec.europa.eu/eurostat/web/main/home&#34;&gt;Eurostat&lt;/a&gt;. Finding the correct dataset may seem a challenging task, but you are essentially looking for the &lt;code&gt;code&lt;/code&gt; that identifies the dataset. We can get a &lt;em&gt;table of contents&lt;/em&gt;, namely all of the codes contained in the Eurostat database.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(eurostat)
library(dplyr) # for %&amp;gt;% and glimpse()
library(knitr) # for kable()
library(fpp2) # for time series decomposition
library(seasonal)
library(tmap) # mapping eurostat data

# Get Eurostat data listing
# Function get_eurostat_toc() downloads a table of contents of eurostat datasets. 
# The values in column ‘code’ should be used to download a selected dataset.
toc &amp;lt;- get_eurostat_toc()

# Check the first 20 rows 
head(toc, 20) %&amp;gt;% 
  kable()&lt;/code&gt;&lt;/pre&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th style=&#34;text-align:left;&#34;&gt;
title
&lt;/th&gt;
&lt;th style=&#34;text-align:left;&#34;&gt;
code
&lt;/th&gt;
&lt;th style=&#34;text-align:left;&#34;&gt;
type
&lt;/th&gt;
&lt;th style=&#34;text-align:left;&#34;&gt;
last update of data
&lt;/th&gt;
&lt;th style=&#34;text-align:left;&#34;&gt;
last table structure change
&lt;/th&gt;
&lt;th style=&#34;text-align:left;&#34;&gt;
data start
&lt;/th&gt;
&lt;th style=&#34;text-align:left;&#34;&gt;
data end
&lt;/th&gt;
&lt;th style=&#34;text-align:left;&#34;&gt;
values
&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Database by themes
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
data
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
folder
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
NA
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
NA
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
NA
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
NA
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
NA
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
General and regional statistics
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
general
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
folder
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
NA
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
NA
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
NA
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
NA
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
NA
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
European and national indicators for short-term analysis
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
euroind
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
folder
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
NA
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
NA
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
NA
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
NA
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
NA
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Business and consumer surveys (source: DG ECFIN)
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
ei_bcs
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
folder
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
NA
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
NA
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
NA
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
NA
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
NA
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Consumer surveys (source: DG ECFIN)
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
ei_bcs_cs
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
folder
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
NA
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
NA
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
NA
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
NA
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
NA
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Consumers - monthly data
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
ei_bsco_m
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
dataset
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
29.09.2020
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
29.09.2020
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
1980M01
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2020M09
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
NA
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Consumers - quarterly data
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
ei_bsco_q
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
dataset
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
29.09.2020
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
30.07.2020
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
1990Q1
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2020Q3
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
NA
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Business surveys - NACE Rev. 2 activity (source: DG ECFIN)
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
ei_bcs_bs
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
folder
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
NA
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
NA
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
NA
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
NA
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
NA
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Industry - monthly data
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
ei_bsin_m_r2
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
dataset
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
29.09.2020
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
29.09.2020
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
1980M01
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2020M09
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
NA
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Industry - quarterly data
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
ei_bsin_q_r2
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
dataset
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
29.09.2020
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
30.07.2020
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
1980Q1
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2020Q3
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
NA
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Construction - monthly data
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
ei_bsbu_m_r2
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
dataset
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
29.09.2020
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
29.09.2020
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
1980M01
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2020M09
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
NA
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Construction - quarterly data
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
ei_bsbu_q_r2
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
dataset
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
29.09.2020
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
30.07.2020
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
1981Q1
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2020Q3
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
NA
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Retail sale - monthly data
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
ei_bsrt_m_r2
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
dataset
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
29.09.2020
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
29.09.2020
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
1984M01
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2020M09
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
NA
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Sentiment indicators - monthly data
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
ei_bssi_m_r2
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
dataset
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
29.09.2020
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
29.09.2020
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
1980M01
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2020M09
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
NA
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Services - monthly data
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
ei_bsse_m_r2
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
dataset
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
29.09.2020
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
29.09.2020
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
1988M01
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2020M09
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
NA
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Services - quarterly data
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
ei_bsse_q_r2
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
dataset
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
29.09.2020
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
30.07.2020
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2001Q2
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2020Q3
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
NA
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Euro-zone Business Climate Indicator - monthly data
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
ei_bsci_m_r2
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
dataset
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
29.09.2020
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
29.09.2020
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
1985M01
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2020M09
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
NA
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Financial services - monthly data
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
ei_bsfs_m
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
dataset
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
29.09.2020
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
29.09.2020
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2006M04
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2020M09
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
NA
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Financial services - quarterly data
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
ei_bsfs_q
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
dataset
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
29.09.2020
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
30.07.2020
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2007Q3
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2020Q3
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
NA
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Employment expectations indicator
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
ei_bsee_m_r2
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
dataset
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
29.09.2020
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
29.09.2020
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
1980M01
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2020M09
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
NA
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
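&lt;p&gt;Because the table of contents is an ordinary data frame, you can narrow it down with the usual filtering tools. The sketch below is self-contained: &lt;code&gt;toc_sample&lt;/code&gt; is a hypothetical stand-in built from three rows of the table above, so it runs without downloading anything. With the real listing you would apply the same steps to &lt;code&gt;toc&lt;/code&gt;.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Stand-in for get_eurostat_toc(): three rows copied from the table above
toc_sample &amp;lt;- data.frame(
  title = c(&amp;quot;Consumer surveys (source: DG ECFIN)&amp;quot;,
            &amp;quot;Consumers - monthly data&amp;quot;,
            &amp;quot;Consumers - quarterly data&amp;quot;),
  code = c(&amp;quot;ei_bcs_cs&amp;quot;, &amp;quot;ei_bsco_m&amp;quot;, &amp;quot;ei_bsco_q&amp;quot;),
  type = c(&amp;quot;folder&amp;quot;, &amp;quot;dataset&amp;quot;, &amp;quot;dataset&amp;quot;),
  stringsAsFactors = FALSE
)

# Folders cannot be downloaded, so keep only rows of type &amp;quot;dataset&amp;quot;
datasets &amp;lt;- toc_sample[toc_sample$type == &amp;quot;dataset&amp;quot;, ]

# Then keep the datasets whose title mentions &amp;quot;monthly&amp;quot;
monthly &amp;lt;- datasets[grepl(&amp;quot;monthly&amp;quot;, datasets$title, ignore.case = TRUE), ]

# The code to pass on to get_eurostat()
monthly$code&lt;/code&gt;&lt;/pre&gt;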
&lt;div id=&#34;house-price-index-hpi&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;House Price Index (HPI)&lt;/h3&gt;
&lt;p&gt;The Eurostat &lt;a href=&#34;https://ec.europa.eu/eurostat/web/products-datasets/-/teicp270&#34;&gt;House Price Index (HPI)&lt;/a&gt; &lt;em&gt;measures price changes of all residential properties purchased by households (flats, detached houses, terraced houses, etc.), both new and existing, independently of their final use and their previous owners.&lt;/em&gt; First, we note that the code id for this dataset is &lt;code&gt;teicp270&lt;/code&gt;. Once we know the relevant code id, we can download Eurostat data using the &lt;code&gt;get_eurostat(id)&lt;/code&gt; function.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;hpi &amp;lt;- get_eurostat(id=&amp;quot;teicp270&amp;quot;)
glimpse(hpi)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Rows: 1,257
## Columns: 5
## $ indic  &amp;lt;chr&amp;gt; &amp;quot;TOTAL&amp;quot;, &amp;quot;TOTAL&amp;quot;, &amp;quot;TOTAL&amp;quot;, &amp;quot;TOTAL&amp;quot;, &amp;quot;TOTAL&amp;quot;, &amp;quot;TOTAL&amp;quot;, &amp;quot;TOTAL...
## $ unit   &amp;lt;chr&amp;gt; &amp;quot;I15_NSA&amp;quot;, &amp;quot;I15_NSA&amp;quot;, &amp;quot;I15_NSA&amp;quot;, &amp;quot;I15_NSA&amp;quot;, &amp;quot;I15_NSA&amp;quot;, &amp;quot;I15_...
## $ geo    &amp;lt;chr&amp;gt; &amp;quot;AT&amp;quot;, &amp;quot;BE&amp;quot;, &amp;quot;BG&amp;quot;, &amp;quot;CY&amp;quot;, &amp;quot;CZ&amp;quot;, &amp;quot;DE&amp;quot;, &amp;quot;DK&amp;quot;, &amp;quot;EA&amp;quot;, &amp;quot;EA19&amp;quot;, &amp;quot;EE&amp;quot;...
## $ time   &amp;lt;date&amp;gt; 2017-04-01, 2017-04-01, 2017-04-01, 2017-04-01, 2017-04-01,...
## $ values &amp;lt;dbl&amp;gt; 114.2, 104.7, 115.4, 102.7, 119.1, 113.1, 110.5, 107.8, 107....&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;head(hpi,40) %&amp;gt;% 
  kable()&lt;/code&gt;&lt;/pre&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th style=&#34;text-align:left;&#34;&gt;
indic
&lt;/th&gt;
&lt;th style=&#34;text-align:left;&#34;&gt;
unit
&lt;/th&gt;
&lt;th style=&#34;text-align:left;&#34;&gt;
geo
&lt;/th&gt;
&lt;th style=&#34;text-align:left;&#34;&gt;
time
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
values
&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
TOTAL
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
I15_NSA
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
AT
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
114.2
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
TOTAL
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
I15_NSA
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
BE
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
104.7
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
TOTAL
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
I15_NSA
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
BG
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
115.4
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
TOTAL
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
I15_NSA
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
CY
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
102.7
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
TOTAL
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
I15_NSA
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
CZ
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
119.1
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
TOTAL
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
I15_NSA
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
DE
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
113.1
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
TOTAL
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
I15_NSA
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
DK
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
110.5
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
TOTAL
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
I15_NSA
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
EA
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
107.8
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
TOTAL
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
I15_NSA
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
EA19
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
107.8
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
TOTAL
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
I15_NSA
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
EE
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
108.4
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
TOTAL
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
I15_NSA
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
ES
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
110.4
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
TOTAL
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
I15_NSA
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
EU
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
109.0
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
TOTAL
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
I15_NSA
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
EU27_2020
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
108.7
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
TOTAL
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
I15_NSA
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
EU28
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
109.0
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
TOTAL
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
I15_NSA
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
FI
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
103.0
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
TOTAL
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
I15_NSA
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
FR
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
103.4
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
TOTAL
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
I15_NSA
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
HR
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
104.5
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
TOTAL
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
I15_NSA
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
HU
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
125.5
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
TOTAL
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
I15_NSA
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
IE
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
115.9
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
TOTAL
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
I15_NSA
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
IS
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
130.1
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
TOTAL
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
I15_NSA
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
IT
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
99.6
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
TOTAL
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
I15_NSA
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
LT
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
114.5
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
TOTAL
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
I15_NSA
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
LU
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
112.2
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
TOTAL
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
I15_NSA
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
LV
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
119.5
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
TOTAL
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
I15_NSA
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
MT
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
108.9
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
TOTAL
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
I15_NSA
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
NL
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
111.4
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
TOTAL
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
I15_NSA
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
NO
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
115.5
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
TOTAL
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
I15_NSA
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
PL
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
105.4
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
TOTAL
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
I15_NSA
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
PT
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
115.5
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
TOTAL
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
I15_NSA
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
RO
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
114.3
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
TOTAL
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
I15_NSA
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
SE
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
116.0
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
TOTAL
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
I15_NSA
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
SI
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
111.4
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
TOTAL
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
I15_NSA
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
SK
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
113.1
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
TOTAL
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
I15_NSA
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
TR
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
124.3
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
TOTAL
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
I15_NSA
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
UK
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
111.2
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
TOTAL
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
PCH_Q1_NSA
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
AT
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
2.4
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
TOTAL
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
PCH_Q1_NSA
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
BE
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
-0.3
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
TOTAL
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
PCH_Q1_NSA
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
BG
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
2.4
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
TOTAL
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
PCH_Q1_NSA
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
CY
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
3.1
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
TOTAL
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
PCH_Q1_NSA
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
CZ
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
2.5
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Typically, the downloaded data has codes and abbreviations for all of the variables, but we can use &lt;code&gt;label_eurostat&lt;/code&gt; to get a more verbose description.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;house_price_index_data &amp;lt;-  hpi %&amp;gt;% 
  label_eurostat()

head(house_price_index_data,40) %&amp;gt;% 
  kable()&lt;/code&gt;&lt;/pre&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th style=&#34;text-align:left;&#34;&gt;
indic
&lt;/th&gt;
&lt;th style=&#34;text-align:left;&#34;&gt;
unit
&lt;/th&gt;
&lt;th style=&#34;text-align:left;&#34;&gt;
geo
&lt;/th&gt;
&lt;th style=&#34;text-align:left;&#34;&gt;
time
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
values
&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Total
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Index, 2015=100 (NSA)
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Austria
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
114.2
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Total
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Index, 2015=100 (NSA)
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Belgium
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
104.7
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Total
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Index, 2015=100 (NSA)
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Bulgaria
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
115.4
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Total
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Index, 2015=100 (NSA)
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Cyprus
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
102.7
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Total
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Index, 2015=100 (NSA)
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Czechia
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
119.1
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Total
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Index, 2015=100 (NSA)
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Germany (until 1990 former territory of the FRG)
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
113.1
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Total
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Index, 2015=100 (NSA)
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Denmark
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
110.5
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Total
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Index, 2015=100 (NSA)
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Euro area (EA11-1999, EA12-2001, EA13-2007, EA15-2008, EA16-2009, EA17-2011, EA18-2014, EA19-2015)
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
107.8
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Total
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Index, 2015=100 (NSA)
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Euro area - 19 countries (from 2015)
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
107.8
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Total
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Index, 2015=100 (NSA)
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Estonia
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
108.4
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Total
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Index, 2015=100 (NSA)
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Spain
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
110.4
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Total
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Index, 2015=100 (NSA)
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
European Union (EU6-1958, EU9-1973, EU10-1981, EU12-1986, EU15-1995, EU25-2004, EU27-2007, EU28-2013, EU27-2020)
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
109.0
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Total
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Index, 2015=100 (NSA)
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
European Union - 27 countries (from 2020)
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
108.7
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Total
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Index, 2015=100 (NSA)
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
European Union - 28 countries (2013-2020)
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
109.0
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Total
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Index, 2015=100 (NSA)
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Finland
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
103.0
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Total
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Index, 2015=100 (NSA)
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
France
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
103.4
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Total
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Index, 2015=100 (NSA)
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Croatia
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
104.5
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Total
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Index, 2015=100 (NSA)
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Hungary
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
125.5
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Total
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Index, 2015=100 (NSA)
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Ireland
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
115.9
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Total
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Index, 2015=100 (NSA)
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Iceland
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
130.1
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Total
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Index, 2015=100 (NSA)
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Italy
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
99.6
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Total
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Index, 2015=100 (NSA)
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Lithuania
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
114.5
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Total
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Index, 2015=100 (NSA)
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Luxembourg
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
112.2
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Total
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Index, 2015=100 (NSA)
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Latvia
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
119.5
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Total
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Index, 2015=100 (NSA)
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Malta
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
108.9
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Total
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Index, 2015=100 (NSA)
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Netherlands
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
111.4
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Total
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Index, 2015=100 (NSA)
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Norway
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
115.5
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Total
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Index, 2015=100 (NSA)
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Poland
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
105.4
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Total
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Index, 2015=100 (NSA)
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Portugal
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
115.5
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Total
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Index, 2015=100 (NSA)
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Romania
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
114.3
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Total
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Index, 2015=100 (NSA)
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Sweden
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
116.0
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Total
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Index, 2015=100 (NSA)
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Slovenia
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
111.4
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Total
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Index, 2015=100 (NSA)
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Slovakia
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
113.1
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Total
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Index, 2015=100 (NSA)
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Turkey
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
124.3
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Total
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Index, 2015=100 (NSA)
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
United Kingdom
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
111.2
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Total
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Percentage change q/q-1 (NSA)
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Austria
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
2.4
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Total
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Percentage change q/q-1 (NSA)
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Belgium
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
-0.3
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Total
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Percentage change q/q-1 (NSA)
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Bulgaria
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
2.4
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Total
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Percentage change q/q-1 (NSA)
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Cyprus
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
3.1
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Total
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Percentage change q/q-1 (NSA)
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Czechia
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
2017-04-01
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
2.5
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;We note that our dataframe contains both the value of the index (unit = &lt;code&gt;I15_NSA&lt;/code&gt;) and its quarter-on-quarter percentage change (unit = &lt;code&gt;PCH_Q1_NSA&lt;/code&gt;). We will select the &lt;code&gt;I15_NSA&lt;/code&gt; index for a few countries and for the EU-28, and plot the evolution of house prices over time.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;hpi_data &amp;lt;- hpi %&amp;gt;% 
  
  # choose the UK, France, Poland, Spain, Portugal, Germany, Italy, and the EU28
  filter(geo %in%  c(&amp;quot;UK&amp;quot;, &amp;quot;FR&amp;quot;, &amp;quot;PL&amp;quot;, &amp;quot;ES&amp;quot;,&amp;quot;PT&amp;quot;, &amp;quot;DE&amp;quot;,&amp;quot;IT&amp;quot;,&amp;quot;EU28&amp;quot;) ) %&amp;gt;%  
  
  # choose value of the index (unit =   `I15_NSA`) 
    filter(unit == &amp;quot;I15_NSA&amp;quot;)

ggplot(hpi_data, aes(x=time, y=values, group=geo, colour=geo))+
  geom_point()+
  geom_line()+
  theme_bw()+
  labs(
    title= &amp;quot;House price index in the EU (2015 = 100)&amp;quot;,
    x = &amp;quot;Time&amp;quot;,
    y = &amp;quot;Housing Price Index&amp;quot;, 
    caption = &amp;quot;Source: Eurostat, code id = teicp270&amp;quot;
  )&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://usi-emba-analytics.netlify.app/reference/eurostat_data_files/figure-html/unnamed-chunk-2-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;tourism-seasonality-in-the-meditteranean&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Tourism Seasonality in the Mediterranean&lt;/h3&gt;
&lt;p&gt;The Eurostat database has a dedicated &lt;a href=&#34;https://ec.europa.eu/eurostat/web/tourism/data/database&#34;&gt;tourism section&lt;/a&gt;.
I wanted to check the monthly nights spent at hotels (code id = &lt;a href=&#34;https://ec.europa.eu/eurostat/web/products-datasets/-/tour_occ_nim&#34;&gt;&lt;code&gt;tour_occ_nim&lt;/code&gt;&lt;/a&gt;) in four Mediterranean countries, Portugal, Spain, Italy, and Greece, since 2000.&lt;/p&gt;
&lt;p&gt;The code below downloads the data and draws a time series plot for each country.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# create a dataframe tourism_data that contains the eurostat data for
# code id = &amp;quot;tour_occ_nim&amp;quot;, namely value of monthly nights spent at hotels
tourism_data &amp;lt;- get_eurostat(id=&amp;quot;tour_occ_nim&amp;quot;)

med_tourism &amp;lt;-  tourism_data %&amp;gt;%   
  
  # choose Portugal, Spain, Italy, and Greece
  filter(geo %in%  c(&amp;quot;PT&amp;quot;, &amp;quot;ES&amp;quot;, &amp;quot;IT&amp;quot;, &amp;quot;EL&amp;quot; ) ) %&amp;gt;%
  
  #use label_eurostat to get verbose descriptions of codes
  label_eurostat() %&amp;gt;% 
  
  # choose number of total hotel accommodations since Jan 1, 2000
  filter (c_resid == &amp;quot;Total&amp;quot;, 
          nace_r2 == &amp;quot;Hotels and similar accommodation&amp;quot;, 
          unit == &amp;quot;Number&amp;quot;,
          time &amp;gt;= &amp;quot;2000-01-01&amp;quot;) %&amp;gt;% 
  
  # express values in millions of nights
  mutate(values = values/1000000) 

ggplot(med_tourism, aes(x=time, y=values, group=geo, colour=geo))+
  geom_point()+
  geom_line()+
  geom_smooth(se=FALSE)+
  facet_wrap(~geo)+
  theme_bw()+
  labs(title=&amp;quot;Hotel stays in the Mediterranean, 2000-present&amp;quot;, 
       y= &amp;quot;Millions of nights spent in hotels&amp;quot;,
       x = &amp;quot;Year&amp;quot;,
       caption = &amp;quot;Source: Eurostat, code = tour_occ_nim&amp;quot;)+
  theme(legend.position=&amp;quot;none&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://usi-emba-analytics.netlify.app/reference/eurostat_data_files/figure-html/get_tourism_data-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;p&gt;All countries exhibit the same seasonal pattern: there is a peak in July-August, and the minimum number is around December-January.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Look at the impact of Covid-19 on all countries!&lt;/p&gt;
&lt;/blockquote&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;#first define **ts** (time series ) objects; one for each country  

portugal_tourism &amp;lt;- med_tourism %&amp;gt;% 
  
  #select the country you are interested in, in this case Portugal
  filter (geo == &amp;quot;Portugal&amp;quot;) %&amp;gt;% 
  
  #sort by time in ascending order, so  earliest observation is first
  arrange(time) %&amp;gt;%
  
  #we just want to keep the values 
  select(values) %&amp;gt;% 
  
  #time series (ts) starts Jan 2000 and has monthly frequency (12 months/yr)
  ts(start=2000, frequency = 12) 



spain_tourism &amp;lt;- med_tourism %&amp;gt;% 
  filter (geo == &amp;quot;Spain&amp;quot;) %&amp;gt;% 
  arrange(time) %&amp;gt;% 
  select(values) %&amp;gt;% 
  ts(start=2000, frequency = 12)

italy_tourism &amp;lt;- med_tourism %&amp;gt;% 
  filter (geo == &amp;quot;Italy&amp;quot;) %&amp;gt;% 
  arrange(time) %&amp;gt;% 
  select(values) %&amp;gt;%   
  ts(start=2000, frequency = 12)

greece_tourism &amp;lt;- med_tourism %&amp;gt;% 
  filter (geo == &amp;quot;Greece&amp;quot;) %&amp;gt;% 
  arrange(time) %&amp;gt;% 
  select(values) %&amp;gt;%   
  ts(start=2000, frequency = 12)


#Season plot for Spain and Greece: the seasonal pattern is consistent since 2000
ggseasonplot(spain_tourism, year.labels=TRUE, year.labels.left=TRUE) +
  labs(
    title = &amp;quot;Seasonal plot: Hotel stays in Spain&amp;quot;,
    y = &amp;quot;Millions of nights spent in hotels&amp;quot;
  )+
    theme_bw()&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://usi-emba-analytics.netlify.app/reference/eurostat_data_files/figure-html/seasonal_plots-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;ggseasonplot(greece_tourism, year.labels=TRUE, year.labels.left=TRUE) +
  labs(
    title = &amp;quot;Seasonal plot: Hotel stays in Greece&amp;quot;,
    y = &amp;quot;Millions of nights spent in hotels&amp;quot;
  )+
  theme_bw()&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://usi-emba-analytics.netlify.app/reference/eurostat_data_files/figure-html/seasonal_plots-2.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;p&gt;An interesting question is which country has the strongest seasonality, namely, how much bigger the summer peak is than the winter trough. For this we produce a subseries plot, which emphasises the seasonal pattern by collecting the data for each season (here, each month) in a separate mini time plot. The horizontal lines indicate the means for each month. This form of plot shows the underlying seasonal pattern clearly, and also shows how seasonality changes over time. It is especially useful for identifying changes within particular seasons.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;ggsubseriesplot(portugal_tourism)+
  labs(
    title = &amp;quot;Seasonal subseries plot: Hotel stays in Portugal 2000-present&amp;quot;,
    subtitle = &amp;quot;Horizontal lines indicate monthly averages&amp;quot;,
    y = &amp;quot;Millions of nights spent in hotels&amp;quot;, 
    caption = &amp;quot;Source:Eurostat&amp;quot;
  )+
  theme_bw()&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://usi-emba-analytics.netlify.app/reference/eurostat_data_files/figure-html/subseries_plots-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;ggsubseriesplot(spain_tourism)+
  labs(
    title = &amp;quot;Seasonal subseries plot: Hotel stays in Spain 2000-present&amp;quot;,
    subtitle = &amp;quot;Horizontal lines indicate monthly averages&amp;quot;,
    y = &amp;quot;Millions of nights spent in hotels&amp;quot;, 
    caption = &amp;quot;Source:Eurostat&amp;quot;
  )+
  theme_bw()&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://usi-emba-analytics.netlify.app/reference/eurostat_data_files/figure-html/subseries_plots-2.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;ggsubseriesplot(italy_tourism)+
  labs(
    title = &amp;quot;Seasonal subseries plot: Hotel stays in Italy 2000-present&amp;quot;,
    subtitle = &amp;quot;Horizontal lines indicate monthly averages&amp;quot;,
    y = &amp;quot;Millions of nights spent in hotels&amp;quot;, 
        caption = &amp;quot;Source:Eurostat&amp;quot;
  )+
  theme_bw()&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://usi-emba-analytics.netlify.app/reference/eurostat_data_files/figure-html/subseries_plots-3.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;ggsubseriesplot(greece_tourism)+
  labs(
    title = &amp;quot;Seasonal subseries plot: Hotel stays in Greece 2000-present&amp;quot;,
    subtitle = &amp;quot;Horizontal lines indicate monthly averages&amp;quot;,
    y = &amp;quot;Millions of nights spent in hotels&amp;quot;, 
    caption = &amp;quot;Source:Eurostat&amp;quot;
  )+
  theme_bw()&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://usi-emba-analytics.netlify.app/reference/eurostat_data_files/figure-html/subseries_plots-4.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;p&gt;Visually, the approximate ratio of max:min averages for each of the four Mediterranean countries is as follows:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Portugal 6:2 = 3&lt;/li&gt;
&lt;li&gt;Spain 39:13 = 3&lt;/li&gt;
&lt;li&gt;Italy 43:10 = 4.3&lt;/li&gt;
&lt;li&gt;Greece 13.5:1= 13.5&lt;/li&gt;
&lt;/ul&gt;
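&lt;p&gt;The visual estimates above can be cross-checked numerically. The following is a minimal sketch, assuming a made-up monthly series (&lt;code&gt;toy_tourism&lt;/code&gt; is hypothetical, not Eurostat data); the same two calculations should apply to any of the &lt;code&gt;ts&lt;/code&gt; objects defined earlier.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# hypothetical two-year monthly series, in millions of nights (not real data)
toy_tourism &amp;lt;- ts(
  c(1, 1, 2, 3, 5, 7, 9, 10, 6, 3, 2, 1,
    1, 1, 2, 3, 5, 7, 9, 12, 6, 3, 2, 1),
  start = 2000, frequency = 12
)

# cycle() gives the month (1-12) of each observation, so tapply() returns
# the twelve monthly averages, i.e. the horizontal lines of a subseries plot
monthly_means &amp;lt;- tapply(toy_tourism, cycle(toy_tourism), mean)

# ratio of the highest monthly average to the lowest
peak_to_trough &amp;lt;- max(monthly_means) / min(monthly_means)
peak_to_trough&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;For this toy series the ratio is 11, because the August average is 11 and the lowest monthly average is 1.&lt;/p&gt;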
&lt;/div&gt;
&lt;div id=&#34;disposable-income-of-private-households-by-nuts-2-regions&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Disposable income of private households by NUTS 2 regions&lt;/h3&gt;
&lt;p&gt;Using the Eurostat data, we can create maps of, e.g., disposable income at a regional level. &lt;a href=&#34;https://en.wikipedia.org/wiki/Nomenclature_of_Territorial_Units_for_Statistics&#34;&gt;NUTS or &lt;em&gt;Nomenclature of Territorial Units for Statistics&lt;/em&gt;&lt;/a&gt; is a geocode standard for referencing subdivisions (regions, counties, districts, etc.) within a country.&lt;/p&gt;
&lt;p&gt;We will work with the &lt;a href=&#34;https://ec.europa.eu/eurostat/web/products-datasets/-/tgs00026&#34;&gt;Disposable income of private households by NUTS 2 regions&lt;/a&gt; database.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;income_data &amp;lt;- get_eurostat(id=&amp;quot;tgs00026&amp;quot;) %&amp;gt;% 
  select(geo,time,values) %&amp;gt;% 
  dplyr::mutate(cat = cut_to_classes(values))


income_2016 &amp;lt;- income_data %&amp;gt;% 
  filter(time == &amp;quot;2016-01-01&amp;quot;)

# Download geospatial data from GISCO
geodata &amp;lt;- get_eurostat_geospatial(output_class = &amp;quot;sf&amp;quot;,
                                   resolution = &amp;quot;60&amp;quot;,
                                   nuts_level = 2,
                                   year = 2016) 


map_data &amp;lt;- inner_join(geodata, income_2016)


ggplot(data=map_data) + geom_sf(aes(fill=cat),color=&amp;quot;dim grey&amp;quot;, size=.1) + 
  scale_fill_brewer(palette = &amp;quot;Accent&amp;quot;) +
  guides(fill = guide_legend(reverse=T, title = &amp;quot;euro&amp;quot;)) +
  labs(title=&amp;quot;Disposable household income in 2016&amp;quot;,
       caption=&amp;quot;(C) EuroGeographics for the administrative boundaries 
                Map produced in R with a help from Eurostat-package &amp;lt;github.com/ropengov/eurostat/&amp;gt;&amp;quot;) +
  theme_light() + theme(legend.position=c(.8,.8)) +
  coord_sf(xlim=c(-12,44), ylim=c(35,70))&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://usi-emba-analytics.netlify.app/reference/eurostat_data_files/figure-html/eurostat_map1-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&#34;acknowledgments&#34; class=&#34;section level2 toc-ignore&#34;&gt;
&lt;h2&gt;Acknowledgments&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;This page is derived in part from &lt;a href=&#34;https://ropengov.github.io/eurostat/articles/eurostat_tutorial.html&#34;&gt;Tutorial (vignette) for the &lt;code&gt;eurostat&lt;/code&gt; R package&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
      <title>Using Github</title>
      <link>https://usi-emba-analytics.netlify.app/reference/reference_github/</link>
      <pubDate>Tue, 25 Aug 2020 00:00:00 +0000</pubDate>
      <guid>https://usi-emba-analytics.netlify.app/reference/reference_github/</guid>
      <description>

&lt;div id=&#34;TOC&#34;&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#introduction-to-gitgithub&#34;&gt;Introduction to Git/GitHub&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#git-workflow&#34;&gt;Git workflow&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#further-resources&#34;&gt;Further resources&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;

&lt;div id=&#34;introduction-to-gitgithub&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Introduction to Git/GitHub&lt;/h2&gt;
&lt;p&gt;When you engage in any kind of data science or programming, there comes a (frustrating) point when you need to understand how Git and GitHub work. Learning how to use Git and GitHub is especially important for keeping versions of your work (think something like Dropbox + MS Word’s &lt;em&gt;Track Changes&lt;/em&gt;) and collaborating with others.&lt;/p&gt;
&lt;p&gt;Git is essentially a boring time machine. Remember when you worked on a Word file and saved it by adding the date, or calling it “mywork-version1”, “mywork-final”, “mywork-final-final”, etc.?&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://usi-emba-analytics.netlify.app/img/final_final.gif&#34; width=&#34;60%&#34; /&gt;&lt;/p&gt;
&lt;p&gt;Git is organised around &lt;strong&gt;repositories&lt;/strong&gt;; repos are folders where you keep a project with all necessary files (code, data, images, etc). So you first need to tell Git which files/folders to track for any changes you will be making.&lt;/p&gt;
&lt;p&gt;As you keep adding code to your project/assignment/etc, you &lt;strong&gt;commit changes&lt;/strong&gt; into your repository and you add an explanatory comment, or message to yourself briefly describing the changes/additions/new work you have done.&lt;/p&gt;
&lt;p&gt;When you commit changes, it’s as though you take a snapshot of your work and write a short comment to yourself; it would be the same as saving your Word document adding today’s date in the filename, or v1, v2, final, final-final, etc.&lt;/p&gt;
&lt;p&gt;After committing your changes, you need to &lt;strong&gt;pull&lt;/strong&gt; first, so you get the latest copy from GitHub, and then &lt;strong&gt;push&lt;/strong&gt; your commits to GitHub; this is when you actually upload your changes.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;git-workflow&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Git workflow&lt;/h2&gt;
&lt;p&gt;These are the main steps to create a repository and keep it updated:&lt;/p&gt;
&lt;ol style=&#34;list-style-type: decimal&#34;&gt;
&lt;li&gt;Create a repo on GitHub and initialize with a README.&lt;/li&gt;
&lt;li&gt;Clone the repo to your local machine. You can either do it as an RStudio Project, or using a shell command: &lt;code&gt;$ git clone REPOSITORY-URL&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Add&lt;/em&gt; or &lt;em&gt;Stage&lt;/em&gt; any changes you make: &lt;code&gt;$ git add -A&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Commit your changes: &lt;code&gt;$ git commit -m &#34;Helpful message to yourself/collaborators&#34;&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Pull from GitHub: &lt;code&gt;$ git pull&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Push your changes to GitHub: &lt;code&gt;$ git push&lt;/code&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Repeat steps 3-6, but especially steps 3-4, often.&lt;/p&gt;
&lt;p&gt;Git keeps track of all the changes you have made in your repo, just in case you made a mistake and need to go back to an earlier version where things actually worked. GitHub is a website built on top of Git that allows you to collaborate with others on code fixes, documentation, and more.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;further-resources&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Further resources&lt;/h2&gt;
&lt;p&gt;For R users, Jenny Bryan et al have created &lt;a href=&#34;https://happygitwithr.com/&#34; target=&#34;_blank&#34;&gt;Happy Git with R&lt;/a&gt;, a brilliant resource that shows you how to use Git and GitHub in RStudio effectively.&lt;/p&gt;
&lt;p&gt;One final thing: git can be confusing and frustrating as hell (ask me for details)– add git to the challenges of coding and you sometimes end up with &lt;a href=&#34;https://twitter.com/maciejwalkowiak/status/1295820902433730561&#34; target=&#34;_blank&#34;&gt;people asking themselves interesting questions&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;When things do go wrong (they will), have a look at &lt;a href=&#34;https://ohshitgit.com/&#34; class=&#34;uri&#34;&gt;https://ohshitgit.com/&lt;/a&gt; and &lt;a href=&#34;http://happygitwithr.com/burn.html&#34; class=&#34;uri&#34;&gt;http://happygitwithr.com/burn.html&lt;/a&gt;&lt;/p&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
      <title>H-E-L-P!</title>
      <link>https://usi-emba-analytics.netlify.app/reference/reference_help/</link>
      <pubDate>Sat, 25 Jul 2020 00:00:00 +0000</pubDate>
      <guid>https://usi-emba-analytics.netlify.app/reference/reference_help/</guid>
      <description>

&lt;div id=&#34;TOC&#34;&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#overwiew&#34;&gt;Overview&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#error-messages-in-r&#34;&gt;Error messages in R&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#failure-and-the-15-minute-rule&#34;&gt;Failure, and the 15 minute rule&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#the-reprex-package&#34;&gt;The &lt;code&gt;reprex&lt;/code&gt; package&lt;/a&gt;&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#reprex-with-copy-paste&#34;&gt;&lt;code&gt;reprex&lt;/code&gt; with copy-paste&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#reprex-directly-with-the-reprex-command&#34;&gt;&lt;code&gt;reprex&lt;/code&gt; directly with the reprex() command&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#further-resources&#34;&gt;Further Resources&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;

&lt;div id=&#34;overwiew&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Overview&lt;/h2&gt;
&lt;p&gt;I recently got an ad on my phone trying to sell me a service to &lt;em&gt;become a data scientist in a month&lt;/em&gt;. This may make you think that doing data science with R is an easy, straightforward process.&lt;/p&gt;
&lt;p&gt;&lt;font size=&#34;+2&#34;&gt;&lt;strong&gt;It is not.&lt;/strong&gt;&lt;/font&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://usi-emba-analytics.netlify.app/img/learn_new_tools.png&#34; width=&#34;80%&#34; /&gt;&lt;/p&gt;
&lt;blockquote class=&#34;twitter-tweet&#34; data-lang=&#34;en&#34;&gt;
&lt;p lang=&#34;en&#34; dir=&#34;ltr&#34;&gt;
Any time I have been struggling to learn a new tool/technology/software … I go back to this short clip I cut out of &lt;span class=&#34;citation&#34;&gt;@hadleywickham&lt;/span&gt;’s 2014 &lt;span class=&#34;citation&#34;&gt;@user2014_ucla&lt;/span&gt; tutorial on Dplyr to motivate myself to keep pushing through … learning new tools is hard for everyone at the beginning!
&lt;/p&gt;
— Aliakbar Akbaritabar (Ali) (&lt;span class=&#34;citation&#34;&gt;@Akbaritabar&lt;/span&gt;) &lt;a href=&#34;https://twitter.com/Akbaritabar/status/1022057084802748416&#34;&gt;July 25, 2018&lt;/a&gt;
&lt;/blockquote&gt;
&lt;p&gt;You will stumble, get frustrated, lost, and confused, make (silly) mistakes even when you think you know stuff, not understand how to perform a task, not understand why your code is generating an error, etc. It still happens to me all the time, and I am still googling really basic stuff about R, even after quite a few years using it. But as &lt;a href=&#34;https://www.youtube.com/watch?v=ZS8QHRtzcPg&#34; target=&#34;_blank&#34;&gt;Alfred so helpfully points out to Bruce Wayne in &lt;em&gt;Batman Begins&lt;/em&gt;&lt;/a&gt;, do not fall to pieces when you fail. Instead, learn to pick yourself up, learn from experience, practice more, and get better.&lt;/p&gt;
&lt;p&gt;Back in 2018, Hadley Wickham, one of the major driving forces behind the tidyverse, &lt;a href=&#34;https://www.youtube.com/watch?v=go5Au01Jrvs&#34; target=&#34;_blank&#34;&gt;recorded a video of his live analysis&lt;/a&gt; of a dataset with the goal of demonstrating his approach.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://usi-emba-analytics.netlify.app/img/hadley_live.png&#34; width=&#34;80%&#34; /&gt;&lt;/p&gt;
&lt;p&gt;It’s great to see Hadley, a true expert, undertake data analysis; he makes quite a few mistakes in this video and he even forgets the arguments for routines/packages he has written! But it’s even more powerful that he shrugs off the mistake, corrects it, and moves forward.&lt;/p&gt;
&lt;p&gt;Even more interesting is to see Hadley, the author of the &lt;code&gt;ggplot2&lt;/code&gt; package, admit to using Google to look things up in… &lt;code&gt;ggplot2&lt;/code&gt;!&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://usi-emba-analytics.netlify.app/img/hadley_ggplot_google.png&#34; width=&#34;80%&#34; /&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;error-messages-in-r&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Error messages in R&lt;/h2&gt;
&lt;p&gt;Error messages are a normal part of working in R, not a sign that you are bad at it. Confusingly, R uses red letters not just for errors but for warnings, too. It helps to learn relatively early on how to decipher these messages and what the common ones mean.&lt;/p&gt;
&lt;p&gt;First, if you see red letters after typing a command, don’t panic; it may just be a warning, and most of the time you can ignore it or worry about it later.&lt;/p&gt;
&lt;p&gt;But you will get errors (in red letters, too!). As an example, let me try to read a CSV file using the &lt;code&gt;read_csv&lt;/code&gt; function.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://usi-emba-analytics.netlify.app/img/error1.png&#34; width=&#34;80%&#34; /&gt;&lt;/p&gt;
&lt;p&gt;There are three main parts to an error:&lt;/p&gt;
&lt;ol style=&#34;list-style-type: decimal&#34;&gt;
&lt;li&gt;The declaration that it is an Error, and not a Warning&lt;/li&gt;
&lt;li&gt;The location of the error: it is in the &lt;code&gt;read_csv(&#34;myfile.csv&#34;)&lt;/code&gt; line of my code&lt;/li&gt;
&lt;li&gt;The issue my code caused: &lt;code&gt;could not find function &#34;read_csv&#34;&lt;/code&gt;, as I asked R to use a function from the &lt;code&gt;readr&lt;/code&gt; package but forgot to load it.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Let me try again.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://usi-emba-analytics.netlify.app/img/error2.png&#34; width=&#34;80%&#34; /&gt;&lt;/p&gt;
&lt;p&gt;This error is again produced by the same &lt;code&gt;read_csv&lt;/code&gt; function, but this time the issue is that the CSV file does not exist in the working directory.&lt;/p&gt;
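&lt;p&gt;The three parts of an error described above can also be inspected from code. The sketch below is only an illustration: &lt;code&gt;undefined_function&lt;/code&gt; and the file name are hypothetical, and base R’s &lt;code&gt;tryCatch()&lt;/code&gt; is used to capture the error instead of stopping.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# capture the error object instead of halting (undefined_function is made up)
err &amp;lt;- tryCatch(
  undefined_function(&amp;quot;myfile.csv&amp;quot;),
  error = function(e) e
)

inherits(err, &amp;quot;error&amp;quot;)   # part 1: it is an error, not a warning
conditionCall(err)       # part 2: the call where the error occurred
conditionMessage(err)    # part 3: the issue itself

# a missing file can be checked before reading it
if (!file.exists(&amp;quot;myfile.csv&amp;quot;)) {
  message(&amp;quot;myfile.csv is not in the working directory: &amp;quot;, getwd())
}&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Checking &lt;code&gt;file.exists()&lt;/code&gt; and &lt;code&gt;getwd()&lt;/code&gt; first is one simple way to tell a wrong path apart from a genuinely missing file.&lt;/p&gt;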
&lt;/div&gt;
&lt;div id=&#34;failure-and-the-15-minute-rule&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Failure, and the 15 minute rule&lt;/h2&gt;
&lt;p&gt;It’s good practice to follow the &lt;strong&gt;15 minute rule&lt;/strong&gt;. If you encounter a problem in your work, spend 15 minutes troubleshooting the problem on your own; &lt;a href=&#34;https://www.google.com&#34;&gt;Google&lt;/a&gt;, &lt;a href=&#34;https://support.rstudio.com/hc/en-us&#34;&gt;RStudio Support&lt;/a&gt;, and &lt;a href=&#34;http://stackoverflow.com/&#34;&gt;StackOverflow&lt;/a&gt; are good places to look for answers. If you google your error message, you will usually find that someone has had the same error and that the solution is on StackOverflow.&lt;/p&gt;
&lt;p&gt;However, if after 15 minutes you still cannot solve the problem, &lt;strong&gt;ask for help&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://usi-emba-analytics.netlify.app/img/15min_rule.png&#34; width=&#34;60%&#34; /&gt;&lt;/p&gt;
&lt;blockquote class=&#34;twitter-tweet&#34; data-lang=&#34;en&#34;&gt;
&lt;p lang=&#34;en&#34; dir=&#34;ltr&#34;&gt;
15 min rule: when stuck, you HAVE to try on your own for 15 min; after 15 min, you HAVE to ask for help.
&lt;/p&gt;
— Rachel Thomas (&lt;span class=&#34;citation&#34;&gt;@math_rachel&lt;/span&gt;) &lt;a href=&#34;https://twitter.com/math_rachel/status/764931533383749632&#34;&gt;August 15, 2016&lt;/a&gt;
&lt;/blockquote&gt;
&lt;/div&gt;
&lt;div id=&#34;the-reprex-package&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;The &lt;code&gt;reprex&lt;/code&gt; package&lt;/h2&gt;
&lt;p&gt;How should you ask for help? You must provide enough information so others can understand what the issue with your code is and try to reproduce it on their own computer. StackOverflow provides advice not only on technical questions but also on how to ask good questions! A very popular post addresses &lt;a href=&#34;https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example&#34;&gt;how to make a great R reproducible example&lt;/a&gt;:&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://usi-emba-analytics.netlify.app/img/reprex.png&#34; width=&#34;80%&#34; /&gt;&lt;/p&gt;
&lt;p&gt;The &lt;a href=&#34;https://reprex.tidyverse.org/index.html&#34;&gt;&lt;code&gt;reprex&lt;/code&gt; package&lt;/a&gt;, written by Jenny Bryan, was developed to help create &lt;strong&gt;repr&lt;/strong&gt;oducible &lt;strong&gt;ex&lt;/strong&gt;amples, so others can reproduce your code, run it, and see where the issue is.&lt;/p&gt;
&lt;div id=&#34;reprex-with-copy-paste&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;&lt;code&gt;reprex&lt;/code&gt; with copy-paste&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;reprex&lt;/code&gt; works with whatever is currently saved on your clipboard. The easiest way to use &lt;code&gt;reprex&lt;/code&gt; is to highlight with your mouse the part of the code that gives you an error and copy it to your clipboard using Command+c (Mac) or Control+c (Windows).&lt;/p&gt;
&lt;p&gt;Now that the code is on the clipboard, just type &lt;code&gt;reprex()&lt;/code&gt; in the console; the rendered reprex replaces it on the clipboard, which means you can paste it directly into a new Rmd file:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;x &amp;lt;- 1:10
mean(x)
#&amp;gt; [1] 5.5&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;div id=&#34;reprex-directly-with-the-reprex-command&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;&lt;code&gt;reprex&lt;/code&gt; directly with the reprex() command&lt;/h3&gt;
&lt;p&gt;Besides copy-and-paste, which is the easiest way to use reprex, you can pass the code you want to share or debug directly to the reprex() command. Let us look at a few examples.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;reprex(gapminder %&amp;gt;% summarise(lifeExp))
#&amp;gt; Error in gapminder %&amp;gt;% summarise(lifeExp): could not find function &amp;quot;%&amp;gt;%&amp;quot;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;sup&gt;Created on 2019-07-16 by the &lt;a href=&#34;https://reprex.tidyverse.org&#34;&gt;reprex package&lt;/a&gt; (v0.3.0)&lt;/sup&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://usi-emba-analytics.netlify.app/img/reprex1.png&#34; width=&#34;80%&#34; /&gt;&lt;/p&gt;
&lt;p&gt;The error message says that R cannot find the pipe operator &lt;code&gt;%&amp;gt;%&lt;/code&gt;, as we haven’t given the &lt;code&gt;library(dplyr)&lt;/code&gt; command. &lt;code&gt;reprex&lt;/code&gt; runs your code in a fresh R session, so the example must itself load all the data and packages it needs. The output above is now automatically stored on your clipboard, and you can paste it (with Ctrl/Cmd+v) wherever it is needed.&lt;/p&gt;
&lt;p&gt;Let us load the library and try again.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;reprex({library(dplyr); gapminder %&amp;gt;% summarise(lifeExp)})
#&amp;gt; 
#&amp;gt; Attaching package: &amp;#39;dplyr&amp;#39;
#&amp;gt; The following objects are masked from &amp;#39;package:stats&amp;#39;:
#&amp;gt; 
#&amp;gt;     filter, lag
#&amp;gt; The following objects are masked from &amp;#39;package:base&amp;#39;:
#&amp;gt; 
#&amp;gt;     intersect, setdiff, setequal, union
#&amp;gt; Error in eval(lhs, parent, parent): object &amp;#39;gapminder&amp;#39; not found&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://usi-emba-analytics.netlify.app/img/reprex2.png&#34; width=&#34;80%&#34; /&gt;&lt;/p&gt;
&lt;p&gt;&lt;code&gt;dplyr&lt;/code&gt; is ok now, and the pipe operator works, but we now realise that the &lt;code&gt;gapminder&lt;/code&gt; package has not been loaded; let’s try again.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;reprex({library(dplyr); library(gapminder); gapminder %&amp;gt;% summarise(lifeExp)})
#&amp;gt; 
#&amp;gt; Attaching package: &amp;#39;dplyr&amp;#39;
#&amp;gt; The following objects are masked from &amp;#39;package:stats&amp;#39;:
#&amp;gt; 
#&amp;gt;     filter, lag
#&amp;gt; The following objects are masked from &amp;#39;package:base&amp;#39;:
#&amp;gt; 
#&amp;gt;     intersect, setdiff, setequal, union
#&amp;gt; Error: Column `lifeExp` must be length 1 (a summary value), not 1704&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://usi-emba-analytics.netlify.app/img/reprex3.png&#34; width=&#34;80%&#34; /&gt;&lt;/p&gt;
&lt;p&gt;The error we get relates to the use of the &lt;code&gt;summarise&lt;/code&gt; function; this function collapses many values into a single summary, like mean, min, median, etc. R tells us that the result for &lt;code&gt;lifeExp&lt;/code&gt; must be of length 1 (a single summary value) rather than
1704, which is how many values &lt;code&gt;lifeExp&lt;/code&gt; has.&lt;/p&gt;
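&lt;p&gt;One way to resolve this error is to wrap the column in a function that returns a single value. The following is a minimal sketch, assuming a small hypothetical data frame (&lt;code&gt;toy_gapminder&lt;/code&gt;, with made-up values) in place of the real &lt;code&gt;gapminder&lt;/code&gt; data.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(dplyr)

# hypothetical stand-in for gapminder, with made-up values
toy_gapminder &amp;lt;- tibble(
  country = c(&amp;quot;A&amp;quot;, &amp;quot;B&amp;quot;, &amp;quot;C&amp;quot;),
  lifeExp = c(60, 70, 80)
)

# mean() collapses the column into one value, which is what summarise() expects
life_summary &amp;lt;- toy_gapminder %&amp;gt;% 
  summarise(mean_lifeExp = mean(lifeExp))

life_summary&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The result is a one-row tibble; any other single-value function, such as &lt;code&gt;median()&lt;/code&gt; or &lt;code&gt;max()&lt;/code&gt;, would work in the same place.&lt;/p&gt;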
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&#34;further-resources&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Further Resources&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;http://rex-analytics.com/decoding-error-messages-r/&#34;&gt;Decoding error messages in R&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://reprex.tidyverse.org/index.html&#34;&gt;&lt;code&gt;reprex&lt;/code&gt; vignette&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://cran.r-project.org/web/packages/reprex/vignettes/reprex-dos-and-donts.html&#34;&gt;&lt;code&gt;reprex&lt;/code&gt; do’s and don’ts&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
      <title>Using SQL within R</title>
      <link>https://usi-emba-analytics.netlify.app/reference/reference_sql/</link>
      <pubDate>Tue, 11 Aug 2020 00:00:00 +0000</pubDate>
      <guid>https://usi-emba-analytics.netlify.app/reference/reference_sql/</guid>
      <description>

&lt;div id=&#34;TOC&#34;&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#sql-and-dbplyr&#34;&gt;SQL and &lt;code&gt;dbplyr&lt;/code&gt;&lt;/a&gt;&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#sql-commands-vs-dplyr-verbs&#34;&gt;SQL commands vs dplyr verbs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#establish-a-connection-with-the-sqlite-database&#34;&gt;Establish a connection with the SQLite database&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#database-objects-or-tibbles&#34;&gt;Database objects or tibbles?&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#querying-the-database-with-dbplyr&#34;&gt;Querying the database with &lt;code&gt;dbplyr&lt;/code&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#generate-the-actual-sql-commands&#34;&gt;Generate the actual SQL commands&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#run-sql-query-and-retrieve-results&#34;&gt;Run SQL query and retrieve results&lt;/a&gt;&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#other-references&#34;&gt;Other references&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;

&lt;div id=&#34;sql-and-dbplyr&#34; class=&#34;section level1&#34;&gt;
&lt;h1&gt;SQL and &lt;code&gt;dbplyr&lt;/code&gt;&lt;/h1&gt;
&lt;p&gt;This short note will teach you the basics of using SQL databases with R. Sometimes you have a massive dataset, made up of many different dataframes (or &lt;em&gt;tables&lt;/em&gt; in SQL jargon), that would exhaust your computer’s memory if you tried to load it all at once. To interact with a database you typically use &lt;strong&gt;SQL&lt;/strong&gt;, the Structured Query Language.&lt;/p&gt;
&lt;p&gt;Rather than making you write SQL commands yourself, the &lt;code&gt;dbplyr&lt;/code&gt; package automatically generates SQL commands from dplyr sequences. Keep in mind, however, that SQL is a very large language and &lt;code&gt;dbplyr&lt;/code&gt; cannot translate everything; you can still get a lot out of it.&lt;/p&gt;
&lt;div id=&#34;sql-commands-vs-dplyr-verbs&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;SQL commands vs dplyr verbs&lt;/h2&gt;
&lt;p&gt;One of the advantages of learning about tidy data and &lt;code&gt;dplyr&lt;/code&gt; is that with dplyr verbs you can replicate a lot of what SQL does.&lt;/p&gt;
&lt;table&gt;
&lt;colgroup&gt;
&lt;col width=&#34;50%&#34; /&gt;
&lt;col width=&#34;50%&#34; /&gt;
&lt;/colgroup&gt;
&lt;thead&gt;
&lt;tr class=&#34;header&#34;&gt;
&lt;th&gt;&lt;code&gt;SQL&lt;/code&gt; command…&lt;/th&gt;
&lt;th&gt;… translates to &lt;code&gt;dplyr&lt;/code&gt; verb&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td&gt;SELECT&lt;/td&gt;
&lt;td&gt;&lt;p&gt;&lt;code&gt;select()&lt;/code&gt; for columns&lt;/p&gt;
&lt;p&gt;&lt;code&gt;mutate()&lt;/code&gt; for expressions&lt;/p&gt;
&lt;p&gt;&lt;code&gt;summarise()&lt;/code&gt; for aggregates&lt;/p&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td&gt;FROM&lt;/td&gt;
&lt;td&gt;which dataframe to use&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td&gt;WHERE&lt;/td&gt;
&lt;td&gt;&lt;code&gt;filter()&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td&gt;GROUP BY&lt;/td&gt;
&lt;td&gt;&lt;code&gt;group_by()&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td&gt;ORDER BY&lt;/td&gt;
&lt;td&gt;&lt;code&gt;arrange()&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td&gt;LIMIT&lt;/td&gt;
&lt;td&gt;&lt;code&gt;head()&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
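&lt;p&gt;As a rough illustration of the table above, the sketch below pipes a handful of verbs together and asks &lt;code&gt;dbplyr&lt;/code&gt; to print the SQL it would generate. The table &lt;code&gt;toy_matches&lt;/code&gt; and its columns are made up for this example, and &lt;code&gt;lazy_frame()&lt;/code&gt; with &lt;code&gt;simulate_sqlite()&lt;/code&gt; only simulates a SQLite backend, so no database is needed. The exact SQL text may differ between &lt;code&gt;dbplyr&lt;/code&gt; versions.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(dplyr)
library(dbplyr)

# hypothetical toy table; lazy_frame() never touches a real database
toy_matches &amp;lt;- lazy_frame(
  league = c(&amp;quot;A&amp;quot;, &amp;quot;A&amp;quot;, &amp;quot;B&amp;quot;),
  goals = c(3, 1, 2),
  con = simulate_sqlite()
)

toy_query &amp;lt;- toy_matches %&amp;gt;%
  filter(goals &amp;gt; 0) %&amp;gt;%                                 # WHERE
  group_by(league) %&amp;gt;%                                  # GROUP BY
  summarise(avg_goals = mean(goals, na.rm = TRUE)) %&amp;gt;%  # SELECT ... AVG()
  arrange(desc(avg_goals)) %&amp;gt;%                          # ORDER BY
  head(5)                                               # LIMIT

# print the SQL that dbplyr would send to the database
show_query(toy_query)&lt;/code&gt;&lt;/pre&gt;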
&lt;/div&gt;
&lt;div id=&#34;establish-a-connection-with-the-sqlite-database&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Establish a connection with the SQLite database&lt;/h2&gt;
&lt;p&gt;We will use the &lt;a href=&#34;https://www.kaggle.com/hugomathien/soccer&#34;&gt;European Soccer Database&lt;/a&gt;, which has more than 25,000 matches and more than 11,000 players. We first need to establish a connection with the SQL database. Unlike dataframes, which we simply load into memory, some SQL databases are too large to load in their entirety. Although this soccer database is a locally saved file, we would use a similar connection to any SQL database over the internet.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# set up a connection to sqlite database
football_db &amp;lt;- DBI::dbConnect(
  drv = RSQLite::SQLite(),
  dbname = &amp;quot;database.sqlite&amp;quot;
)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The general code for connecting to a remote database is:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# template only: replace every placeholder value with your own details
connection_to_db &amp;lt;- DBI::dbConnect(
  drv = odbc::odbc(),           # database driver, e.g. odbc::odbc()
  dbname = &amp;quot;database_name&amp;quot;,
  user = &amp;quot;User_ID&amp;quot;,
  password = &amp;quot;Password&amp;quot;,
  host = &amp;quot;host name&amp;quot;,           # default = local connection
  port = &amp;quot;port number&amp;quot;          # default = local connection
)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;That’s pretty much it - R now has a direct connection to the database and you can start making queries.&lt;/p&gt;
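&lt;p&gt;If you do not have the soccer database at hand, a minimal sketch like the one below lets you try out a connection. It assumes the &lt;code&gt;DBI&lt;/code&gt; and &lt;code&gt;RSQLite&lt;/code&gt; packages are installed and uses a temporary in-memory SQLite database filled with R’s built-in &lt;code&gt;mtcars&lt;/code&gt; data, which is not part of the soccer example. It also closes the connection at the end, which is good practice once you have finished querying.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(DBI)

# &amp;quot;:memory:&amp;quot; creates a temporary SQLite database that lives only in RAM
con &amp;lt;- dbConnect(RSQLite::SQLite(), dbname = &amp;quot;:memory:&amp;quot;)

# copy a built-in data frame into the database as a table
dbWriteTable(con, &amp;quot;mtcars&amp;quot;, mtcars)

# list the tables that the database now contains
tables_in_db &amp;lt;- dbListTables(con)
tables_in_db

# close the connection once you are done
dbDisconnect(con)&lt;/code&gt;&lt;/pre&gt;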
&lt;/div&gt;
&lt;div id=&#34;database-objects-or-tibbles&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Database objects or tibbles?&lt;/h2&gt;
&lt;p&gt;An SQL database will typically contain multiple &lt;em&gt;tables&lt;/em&gt;. You can think of these &lt;em&gt;tables&lt;/em&gt; as R data frames (or tibbles). What are the tables in the soccer database? We can list them using &lt;code&gt;DBI::dbListTables()&lt;/code&gt;.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;DBI::dbListTables(football_db)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] &amp;quot;Country&amp;quot;           &amp;quot;League&amp;quot;            &amp;quot;Match&amp;quot;            
## [4] &amp;quot;Player&amp;quot;            &amp;quot;Player_Attributes&amp;quot; &amp;quot;Team&amp;quot;             
## [7] &amp;quot;Team_Attributes&amp;quot;   &amp;quot;sqlite_sequence&amp;quot;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;We can easily set these tables up as &lt;strong&gt;database objects&lt;/strong&gt; using &lt;code&gt;dplyr::tbl()&lt;/code&gt;.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;countries &amp;lt;- dplyr::tbl(football_db, &amp;quot;Country&amp;quot;)
leagues &amp;lt;- dplyr::tbl(football_db, &amp;quot;League&amp;quot;)
matches &amp;lt;- dplyr::tbl(football_db, &amp;quot;Match&amp;quot;)
teams &amp;lt;- dplyr::tbl(football_db, &amp;quot;Team&amp;quot;)
team_attributes &amp;lt;- dplyr::tbl(football_db, &amp;quot;Team_Attributes&amp;quot;)
players &amp;lt;- dplyr::tbl(football_db, &amp;quot;Player&amp;quot;)
player_attributes &amp;lt;- dplyr::tbl(football_db, &amp;quot;Player_Attributes&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Each of these tables is an SQL database object in your R session, which you can manipulate in much the same way as a dataframe.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;class(countries)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] &amp;quot;tbl_SQLiteConnection&amp;quot; &amp;quot;tbl_dbi&amp;quot;              &amp;quot;tbl_sql&amp;quot;             
## [4] &amp;quot;tbl_lazy&amp;quot;             &amp;quot;tbl&amp;quot;&lt;/code&gt;&lt;/pre&gt;
&lt;blockquote&gt;
&lt;p&gt;When you define these tables, you are not physically downloading them, just creating a bare minimum extract to work with.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;If you want to handle these as normal dataframes or tibbles, you can simply pipe the database objects to &lt;code&gt;as_tibble()&lt;/code&gt;.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;player_attributes_df &amp;lt;- player_attributes %&amp;gt;% as_tibble()
class(player_attributes)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] &amp;quot;tbl_SQLiteConnection&amp;quot; &amp;quot;tbl_dbi&amp;quot;              &amp;quot;tbl_sql&amp;quot;             
## [4] &amp;quot;tbl_lazy&amp;quot;             &amp;quot;tbl&amp;quot;&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;class(player_attributes_df)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] &amp;quot;tbl_df&amp;quot;     &amp;quot;tbl&amp;quot;        &amp;quot;data.frame&amp;quot;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Notice the difference between &lt;code&gt;player_attributes&lt;/code&gt;, a database object, and &lt;code&gt;player_attributes_df&lt;/code&gt;, a ‘regular’ dataframe/tibble.&lt;/p&gt;
&lt;p&gt;Now that we have player attributes as a dataframe, we can handle it the usual way and, e.g., build a scatterplot/correlation matrix with &lt;code&gt;ggpairs()&lt;/code&gt; from the &lt;code&gt;GGally&lt;/code&gt; package.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;player_attributes_df %&amp;gt;% 
  filter(!is.na(preferred_foot)) %&amp;gt;% 
  select(preferred_foot, ball_control, overall_rating) %&amp;gt;% 
  ggpairs(aes(colour=preferred_foot, alpha = 0.3))+
  scale_colour_manual(values = c(&amp;quot;#67a9cf&amp;quot;,&amp;quot;#ef8a62&amp;quot;))+
  scale_fill_manual(values = c(&amp;quot;#67a9cf&amp;quot;,&amp;quot;#ef8a62&amp;quot;))+
  theme_bw()&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://usi-emba-analytics.netlify.app/img/player_attributes.png&#34; width=&#34;60%&#34; style=&#34;display: block; margin: auto;&#34; /&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;querying-the-database-with-dbplyr&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Querying the database with &lt;code&gt;dbplyr&lt;/code&gt;&lt;/h2&gt;
&lt;p&gt;To create the &lt;code&gt;ggpairs()&lt;/code&gt; plot we had to convert a database table to a dataframe, load it all in the computer’s memory, and then use ggplot. The beauty of working with databases is that we do &lt;strong&gt;NOT&lt;/strong&gt; have to load everything into memory. Instead, all dplyr calls are evaluated lazily, generating SQL code that is only sent to the database when you request the data.&lt;/p&gt;
&lt;p&gt;Let us look at an example. What if we wanted to calculate the average number of goals per league (country) per season and then plot those averages? This looks like a data wrangling exercise we would do with &lt;code&gt;dplyr&lt;/code&gt;.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# write dplyr code that will calculate average number of goals per country per season
goals_per_match &amp;lt;-  matches %&amp;gt;%
  group_by(country_id, season) %&amp;gt;%
  summarise(avg_goals = mean(home_team_goal + away_team_goal)) %&amp;gt;%
 
  #do a left_join, so we know the country&amp;#39;s name rather than the country&amp;#39;s ID
  left_join(countries, by = c(&amp;quot;country_id&amp;quot;=&amp;quot;id&amp;quot;)) %&amp;gt;%
  arrange(desc(avg_goals)) %&amp;gt;% 
  ungroup()&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;What kind of an object is &lt;code&gt;goals_per_match&lt;/code&gt;?&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;#what kind of an object is goals_per_match?
class(goals_per_match)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] &amp;quot;tbl_SQLiteConnection&amp;quot; &amp;quot;tbl_dbi&amp;quot;              &amp;quot;tbl_sql&amp;quot;             
## [4] &amp;quot;tbl_lazy&amp;quot;             &amp;quot;tbl&amp;quot;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;code&gt;goals_per_match&lt;/code&gt; is not a dataframe (tibble), but rather a query to an SQLite database table.&lt;/p&gt;
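&lt;p&gt;The same lazy behaviour can be seen on a toy table. The sketch below is only an illustration: it assumes &lt;code&gt;dplyr&lt;/code&gt;, &lt;code&gt;dbplyr&lt;/code&gt;, &lt;code&gt;DBI&lt;/code&gt; and &lt;code&gt;RSQLite&lt;/code&gt; are installed and uses R’s built-in &lt;code&gt;mtcars&lt;/code&gt; data in a temporary in-memory database rather than the soccer database. The object names &lt;code&gt;lazy_mpg&lt;/code&gt; and &lt;code&gt;local_mpg&lt;/code&gt; are made up for the example.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(dplyr)

con &amp;lt;- DBI::dbConnect(RSQLite::SQLite(), dbname = &amp;quot;:memory:&amp;quot;)
DBI::dbWriteTable(con, &amp;quot;mtcars&amp;quot;, mtcars)

# nothing is computed yet: this only stores the recipe for a query
lazy_mpg &amp;lt;- tbl(con, &amp;quot;mtcars&amp;quot;) %&amp;gt;%
  group_by(cyl) %&amp;gt;%
  summarise(avg_mpg = mean(mpg, na.rm = TRUE))

class(lazy_mpg)   # a lazy database object, not a tibble

# collect() runs the query in the database and returns a local tibble
local_mpg &amp;lt;- collect(lazy_mpg)
class(local_mpg)

DBI::dbDisconnect(con)&lt;/code&gt;&lt;/pre&gt;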
&lt;/div&gt;
&lt;div id=&#34;generate-the-actual-sql-commands&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Generate the actual SQL commands&lt;/h2&gt;
&lt;p&gt;We are familiar with all the &lt;code&gt;dplyr&lt;/code&gt; verbs (filter, select, group_by, summarise, arrange, etc.), but SQL has its own commands, which by convention are written in capital letters (is SQL constantly angry and shouting? Who knew…). We can generate the actual SQL commands using &lt;code&gt;dbplyr::sql_render()&lt;/code&gt; or &lt;code&gt;dplyr::show_query()&lt;/code&gt;.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Generate actual SQL commands: We can either use dbplyr::sql_render() or dplyr::show_query()
dbplyr::sql_render(goals_per_match)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## &amp;lt;SQL&amp;gt; SELECT *
## FROM (SELECT `LHS`.`country_id` AS `country_id`, `LHS`.`season` AS `season`, `LHS`.`avg_goals` AS `avg_goals`, `RHS`.`name` AS `name`
## FROM (SELECT `country_id`, `season`, AVG(`home_team_goal` + `away_team_goal`) AS `avg_goals`
## FROM `Match`
## GROUP BY `country_id`, `season`) AS `LHS`
## LEFT JOIN `Country` AS `RHS`
## ON (`LHS`.`country_id` = `RHS`.`id`)
## )
## ORDER BY `avg_goals` DESC&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;goals_per_match %&amp;gt;% show_query()&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## &amp;lt;SQL&amp;gt;
## SELECT *
## FROM (SELECT `LHS`.`country_id` AS `country_id`, `LHS`.`season` AS `season`, `LHS`.`avg_goals` AS `avg_goals`, `RHS`.`name` AS `name`
## FROM (SELECT `country_id`, `season`, AVG(`home_team_goal` + `away_team_goal`) AS `avg_goals`
## FROM `Match`
## GROUP BY `country_id`, `season`) AS `LHS`
## LEFT JOIN `Country` AS `RHS`
## ON (`LHS`.`country_id` = `RHS`.`id`)
## )
## ORDER BY `avg_goals` DESC&lt;/code&gt;&lt;/pre&gt;
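&lt;p&gt;You can also go the other way and send hand-written SQL to the database with &lt;code&gt;DBI::dbGetQuery()&lt;/code&gt;, which returns an ordinary data frame. The sketch below is a minimal illustration on R’s built-in &lt;code&gt;mtcars&lt;/code&gt; data in a temporary in-memory SQLite database, not on the soccer database; the query and the name &lt;code&gt;avg_mpg&lt;/code&gt; are made up for the example.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;con &amp;lt;- DBI::dbConnect(RSQLite::SQLite(), dbname = &amp;quot;:memory:&amp;quot;)
DBI::dbWriteTable(con, &amp;quot;mtcars&amp;quot;, mtcars)

# hand-written SQL, sent to the database as is
avg_mpg &amp;lt;- DBI::dbGetQuery(
  con,
  &amp;quot;SELECT cyl, AVG(mpg) AS avg_mpg
   FROM mtcars
   GROUP BY cyl
   ORDER BY avg_mpg DESC&amp;quot;
)

# one row per number of cylinders, highest average first
avg_mpg

DBI::dbDisconnect(con)&lt;/code&gt;&lt;/pre&gt;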
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&#34;run-sql-query-and-retrieve-results&#34; class=&#34;section level1&#34;&gt;
&lt;h1&gt;Run SQL query and retrieve results&lt;/h1&gt;
&lt;p&gt;Now that we have the SQL query, we can retrieve the results into a local dataframe (tibble) using &lt;code&gt;collect()&lt;/code&gt;. The main difference is that, rather than loading the entire database into memory, the &lt;code&gt;goals_per_match&lt;/code&gt; query goes to the SQL database, collects the necessary data, and returns only the result of the query.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# execute query and retrieve results in a tibble (dataframe). 
goals_match_tibble &amp;lt;- goals_per_match %&amp;gt;% 
  collect()

#have a look at the resulting dataframe with glimpse() and skim()
glimpse(goals_match_tibble)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Rows: 88
## Columns: 4
## $ country_id &amp;lt;int&amp;gt; 24558, 13274, 13274, 13274, 7809, 13274, 24558, 13274, 2...
## $ season     &amp;lt;chr&amp;gt; &amp;quot;2009/2010&amp;quot;, &amp;quot;2011/2012&amp;quot;, &amp;quot;2010/2011&amp;quot;, &amp;quot;2013/2014&amp;quot;, &amp;quot;201...
## $ avg_goals  &amp;lt;dbl&amp;gt; 3.33, 3.26, 3.23, 3.20, 3.16, 3.15, 3.14, 3.08, 3.00, 2....
## $ name       &amp;lt;chr&amp;gt; &amp;quot;Switzerland&amp;quot;, &amp;quot;Netherlands&amp;quot;, &amp;quot;Netherlands&amp;quot;, &amp;quot;Netherland...&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;skimr::skim(goals_match_tibble)&lt;/code&gt;&lt;/pre&gt;
&lt;table&gt;
&lt;caption&gt;&lt;span id=&#34;tab:unnamed-chunk-9&#34;&gt;Table 1: &lt;/span&gt;Data summary&lt;/caption&gt;
&lt;tbody&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td align=&#34;left&#34;&gt;Name&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;goals_match_tibble&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td align=&#34;left&#34;&gt;Number of rows&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;88&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td align=&#34;left&#34;&gt;Number of columns&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;4&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td align=&#34;left&#34;&gt;_______________________&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td align=&#34;left&#34;&gt;Column type frequency:&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td align=&#34;left&#34;&gt;character&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td align=&#34;left&#34;&gt;numeric&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td align=&#34;left&#34;&gt;________________________&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td align=&#34;left&#34;&gt;Group variables&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;None&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;&lt;strong&gt;Variable type: character&lt;/strong&gt;&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr class=&#34;header&#34;&gt;
&lt;th align=&#34;left&#34;&gt;skim_variable&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;n_missing&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;complete_rate&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;min&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;max&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;empty&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;n_unique&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;whitespace&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td align=&#34;left&#34;&gt;season&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;9&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;9&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;8&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td align=&#34;left&#34;&gt;name&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;5&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;11&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;11&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;&lt;strong&gt;Variable type: numeric&lt;/strong&gt;&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr class=&#34;header&#34;&gt;
&lt;th align=&#34;left&#34;&gt;skim_variable&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;n_missing&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;complete_rate&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;mean&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;sd&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;p0&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;p25&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;p50&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;p75&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;p100&lt;/th&gt;
&lt;th align=&#34;left&#34;&gt;hist&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td align=&#34;left&#34;&gt;country_id&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;12452.09&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;7877.88&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1.00&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;4769.00&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;13274.00&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;19694.00&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;24558.00&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;▇▂▅▅▇&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td align=&#34;left&#34;&gt;avg_goals&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;2.71&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.24&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;2.18&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;2.56&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;2.71&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;2.86&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;3.33&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;▂▆▇▃▂&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;The resulting dataframe only has the information we want: 4 variables (columns) and 88 rows (cases); for each country and season, we have the average number of goals scored per match. We can now use this smaller dataframe and plot the results.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# plot results, using goals_match_tibble
ggplot(goals_match_tibble) + 
  geom_point(aes(x=reorder(name, avg_goals),y=avg_goals, colour=name))+
  theme_bw(8)+ 
  facet_wrap(~season, nrow=4)+
  labs(
    title = &amp;quot;Which football leagues had the highest number of goals per game?&amp;quot;,
    y = &amp;quot;Average Number of Goals per Match&amp;quot;,
    x = &amp;quot;National League&amp;quot;
    ) + 
  coord_flip() +
  theme(legend.position = &amp;quot;none&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://usi-emba-analytics.netlify.app/reference/reference_SQL_files/figure-html/unnamed-chunk-10-1.png&#34; width=&#34;456&#34; style=&#34;display: block; margin: auto;&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;ggplot(goals_match_tibble, aes(x=reorder(name, avg_goals),y=avg_goals, colour=name)) + 
  geom_violin()+
  # geom_boxplot()+
  geom_jitter()+
  theme_bw()+
  labs(
    title = &amp;quot;Which football leagues had the highest number of goals per game?&amp;quot;,
    subtitle=&amp;quot;2008/2009 to 2015/2016&amp;quot;,
    y = &amp;quot;Average Number of Goals per Match&amp;quot;,
    x = &amp;quot;National League&amp;quot;
  ) + 
  coord_flip() +
  theme(legend.position = &amp;quot;none&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://usi-emba-analytics.netlify.app/reference/reference_SQL_files/figure-html/unnamed-chunk-10-2.png&#34; width=&#34;456&#34; style=&#34;display: block; margin: auto;&#34; /&gt;&lt;/p&gt;
&lt;p&gt;We can now run queries on the database, collect the results in a local dataframe, and show, e.g., the players with the highest overall rating in the database.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Which are the top 20 players by overall rating (`overall_rating`)?
top_players &amp;lt;-  player_attributes %&amp;gt;%
  group_by(player_api_id) %&amp;gt;%
  summarise(max_rating = max(overall_rating)) %&amp;gt;% 
  arrange(desc(max_rating)) %&amp;gt;% 
  left_join(players, by = c(&amp;quot;player_api_id&amp;quot;=&amp;quot;player_api_id&amp;quot;)) %&amp;gt;%
  collect()

top_players %&amp;gt;% 
  head(20) %&amp;gt;% 
  kableExtra::kable()&lt;/code&gt;&lt;/pre&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
player_api_id
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
max_rating
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
id
&lt;/th&gt;
&lt;th style=&#34;text-align:left;&#34;&gt;
player_name
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
player_fifa_api_id
&lt;/th&gt;
&lt;th style=&#34;text-align:left;&#34;&gt;
birthday
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
height
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
weight
&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
30981
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
94
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
6176
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Lionel Messi
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
158023
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
1987-06-24 00:00:00
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
170
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
159
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
30893
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
93
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
1995
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Cristiano Ronaldo
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
20801
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
1985-02-05 00:00:00
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
185
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
176
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
30829
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
93
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
10749
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Wayne Rooney
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
54050
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
1985-10-24 00:00:00
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
175
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
183
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
30717
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
93
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
3826
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Gianluigi Buffon
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
1179
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
1978-01-28 00:00:00
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
193
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
201
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
39989
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
92
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
3994
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Gregory Coupet
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
1747
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
1972-12-31 00:00:00
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
180
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
176
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
39854
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
92
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
10861
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Xavi Hernandez
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
10535
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
1980-01-25 00:00:00
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
170
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
148
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
34520
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
91
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
3183
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Fabio Cannavaro
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
1183
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
1973-09-13 00:00:00
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
175
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
165
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
30955
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
91
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
742
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Andres Iniesta
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
41
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
1984-05-11 00:00:00
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
170
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
150
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
30743
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
91
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
9216
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Ronaldinho
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
28130
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
1980-03-21 00:00:00
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
183
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
168
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
30723
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
91
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
388
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Alessandro Nesta
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
1088
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
1976-03-19 00:00:00
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
188
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
174
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
30657
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
91
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
4366
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Iker Casillas
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
5479
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
1981-05-20 00:00:00
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
185
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
185
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
30627
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
91
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
5120
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
John Terry
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
13732
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
1980-12-07 00:00:00
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
188
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
198
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
30626
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
91
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
10203
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Thierry Henry
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
1625
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
1977-08-17 00:00:00
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
188
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
183
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
41044
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
90
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
5592
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Kaka
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
138449
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
1982-04-22 00:00:00
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
185
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
183
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
40636
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
90
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
6377
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Luis Suarez
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
176580
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
1987-01-24 00:00:00
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
183
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
187
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
38843
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
90
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
11039
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Ze Roberto
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
28765
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
1974-07-06 00:00:00
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
173
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
159
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
35724
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
90
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
11057
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Zlatan Ibrahimovic
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
41236
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
1981-10-03 00:00:00
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
196
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
209
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
30924
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
90
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
3514
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Franck Ribery
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
156616
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
1983-04-07 00:00:00
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
170
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
159
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
30834
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
90
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
951
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Arjen Robben
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
9014
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
1984-01-23 00:00:00
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
180
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
176
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
30728
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
90
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
2426
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
David Trezeguet
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
5984
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
1977-10-15 00:00:00
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
190
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
176
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Which are the top 20 goalkeepers by the sum of all gk attributes (`gk_diving`, `gk_handling`, `gk_kicking`, etc.)?
top_goalies &amp;lt;- player_attributes %&amp;gt;%
  # total goalkeeping score for each attribute record
  mutate(goalie_rating = gk_diving + gk_handling + gk_kicking + gk_positioning + gk_reflexes) %&amp;gt;%
  # keep each player&#39;s best score across all of their records
  group_by(player_api_id) %&amp;gt;%
  summarise(max_goalie_rating = max(goalie_rating, na.rm = TRUE)) %&amp;gt;%
  # attach player details (name, birthday, height, weight)
  left_join(players, by = &amp;quot;player_api_id&amp;quot;) %&amp;gt;%
  # sort last, so the ordering applies to the final result
  arrange(desc(max_goalie_rating)) %&amp;gt;%
  # run the query and bring the result into R
  collect()

top_goalies %&amp;gt;%
  head(20) %&amp;gt;%
  kableExtra::kable()&lt;/code&gt;&lt;/pre&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
player_api_id
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
max_goalie_rating
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
id
&lt;/th&gt;
&lt;th style=&#34;text-align:left;&#34;&gt;
player_name
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
player_fifa_api_id
&lt;/th&gt;
&lt;th style=&#34;text-align:left;&#34;&gt;
birthday
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
height
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
weight
&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
30717
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
449
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
3826
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Gianluigi Buffon
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
1179
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
1978-01-28 00:00:00
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
193
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
201
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
39989
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
447
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
3994
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Gregory Coupet
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
1747
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
1972-12-31 00:00:00
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
180
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
176
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
30859
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
445
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
8580
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Petr Cech
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
48940
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
1982-05-20 00:00:00
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
196
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
198
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
30657
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
442
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
4366
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Iker Casillas
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
5479
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
1981-05-20 00:00:00
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
185
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
185
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
27299
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
440
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
6556
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Manuel Neuer
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
167495
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
1986-03-27 00:00:00
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
193
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
203
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
30989
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
438
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
5536
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Julio Cesar
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
48717
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
1979-09-03 00:00:00
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
185
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
174
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
24503
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
437
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
9579
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Sebastian Frey
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
1289
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
1980-03-18 00:00:00
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
190
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
198
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
30726
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
436
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
2900
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Edwin van der Sar
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
51539
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
1970-10-29 00:00:00
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
198
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
196
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
182917
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
429
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
2340
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
David De Gea
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
193080
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
1990-11-07 00:00:00
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
193
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
181
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
30660
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
428
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
8541
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Pepe Reina
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
24630
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
1982-08-31 00:00:00
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
188
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
203
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
30622
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
426
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
8413
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Paul Robinson
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
13914
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
1979-10-15 00:00:00
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
193
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
198
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
32657
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
425
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
10625
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Victor Valdes
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
106573
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
1982-01-14 00:00:00
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
183
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
172
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
30742
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
425
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
7470
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Mickael Landreau
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
3813
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
1979-05-14 00:00:00
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
183
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
185
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
26295
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
425
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
4272
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Hugo Lloris
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
167948
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
1986-12-26 00:00:00
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
188
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
172
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
30841
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
424
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
6446
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Maarten Stekelenburg
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
2147
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
1982-09-22 00:00:00
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
198
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
203
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
27341
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
424
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
9028
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Robert Enke,30
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
158400
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
1977-08-24 00:00:00
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
185
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
172
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
30648
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
423
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
4832
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Jens Lehmann
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
805
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
1969-11-10 00:00:00
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
190
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
192
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
33986
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
421
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
746
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Andres Palop
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
8247
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
1973-10-22 00:00:00
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
183
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
165
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
31293
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
421
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
10009
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Steve Mandanda
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
163705
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
1985-03-28 00:00:00
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
185
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
181
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
30380
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
420
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
1345
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Brad Friedel
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
11983
&lt;/td&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
1971-05-18 00:00:00
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
188
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
203
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
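&lt;p&gt;The goalkeeper query above is presumably built lazily: the trailing &lt;code&gt;collect&lt;/code&gt; suggests that &lt;code&gt;player_attributes&lt;/code&gt; is a remote database table whose dplyr verbs are translated into SQL by dbplyr. As a minimal sketch of that idea, the chunk below builds the same kind of query against a made-up, in-memory SQLite table, prints the generated SQL with &lt;code&gt;show_query()&lt;/code&gt;, and only then pulls the rows into R. The toy values and the names &lt;code&gt;toy_attributes&lt;/code&gt;, &lt;code&gt;attributes_db&lt;/code&gt; and &lt;code&gt;top_two&lt;/code&gt; are assumptions for illustration, not part of the original data; the sketch also assumes the &lt;code&gt;DBI&lt;/code&gt; and &lt;code&gt;RSQLite&lt;/code&gt; packages are installed.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(dplyr)
library(dbplyr)

# Hypothetical toy data: three players, one or two attribute records each (values are made up)
toy_attributes &amp;lt;- tibble::tibble(
  player_api_id  = c(1, 1, 2, 2, 3),
  gk_diving      = c(80, 85, 70, 72, 60),
  gk_handling    = c(78, 80, 71, 69, 65),
  gk_kicking     = c(75, 77, 68, 70, 58),
  gk_positioning = c(82, 84, 73, 74, 61),
  gk_reflexes    = c(88, 90, 75, 76, 63)
)

# In-memory SQLite database standing in for the real one
con &amp;lt;- DBI::dbConnect(RSQLite::SQLite(), &amp;quot;:memory:&amp;quot;)
copy_to(con, toy_attributes, name = &amp;quot;player_attributes&amp;quot;)
attributes_db &amp;lt;- tbl(con, &amp;quot;player_attributes&amp;quot;)

# Nothing is computed yet; this only describes the query
top_two &amp;lt;- attributes_db %&amp;gt;%
  mutate(goalie_rating = gk_diving + gk_handling + gk_kicking + gk_positioning + gk_reflexes) %&amp;gt;%
  group_by(player_api_id) %&amp;gt;%
  summarise(max_goalie_rating = max(goalie_rating, na.rm = TRUE)) %&amp;gt;%
  arrange(desc(max_goalie_rating)) %&amp;gt;%
  head(2)

show_query(top_two)  # print the SQL that dbplyr generates
collect(top_two)     # run the query in the database and return a tibble

DBI::dbDisconnect(con)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Placing &lt;code&gt;head()&lt;/code&gt; before &lt;code&gt;collect()&lt;/code&gt; asks the database to return only the rows you need, which can matter when the underlying table is large.&lt;/p&gt;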
&lt;div id=&#34;other-references&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Other references&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://cran.r-project.org/web/packages/dbplyr/vignettes/dbplyr.html&#34; target=&#34;_blank&#34;&gt;Introduction to dbplyr&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://medium.com/aclu-tech-analytics/dbplyr-a-path-to-more-inclusive-data-transformations-at-the-aclu-5e6af21f4042&#34; target=&#34;_blank&#34;&gt;dbplyr : A Path to More Inclusive Data Transformations at the ACLU&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://mystery.knightlab.com/&#34; target=&#34;_blank&#34;&gt;SQL Tutorial: Solving a Murder Mystery&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
      <title>Other Data Sources</title>
      <link>https://usi-emba-analytics.netlify.app/reference/other_data_sources/</link>
      <pubDate>Fri, 31 Jul 2020 00:00:00 +0000</pubDate>
      <guid>https://usi-emba-analytics.netlify.app/reference/other_data_sources/</guid>
      <description>


&lt;p&gt;The web is a vast source of datasets on almost any subject, including demographics, disease, economics, finance, geography, entertainment, and science. A good place to start is &lt;a href=&#34;https://toolbox.google.com/datasetsearch&#34; target=&#34;_blank&#34;&gt;Google’s Dataset Search&lt;/a&gt;, which indexes thousands of public datasets.&lt;/p&gt;
&lt;p&gt;Here are some more suggestions:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://www.kaggle.com/datasets&#34; target=&#34;_blank&#34;&gt;Kaggle&lt;/a&gt;: Kaggle hosts machine learning competitions and contains a large number of datasets that are generally free and open to the public.&lt;br /&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/awesomedata/awesome-public-datasets/blob/master/README.rst&#34; target=&#34;_blank&#34;&gt;Awesome Public Datasets&lt;/a&gt;: A collection of public datasets, arranged by topic.&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://www.reddit.com/r/datasets/&#34; target=&#34;_blank&#34;&gt;Reddit’s r/datasets&lt;/a&gt;: A community where people share and request datasets.&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://data.gov.uk/&#34; target=&#34;_blank&#34;&gt;UK data&lt;/a&gt; and &lt;a href=&#34;https://www.ons.gov.uk&#34; target=&#34;_blank&#34;&gt;UK Office for National Statistics&lt;/a&gt;&lt;br /&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://www.data.gov&#34; target=&#34;_blank&#34;&gt;U.S. Government’s open data&lt;/a&gt; with many datasets on a range of issues&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://tinyletter.com/data-is-plural&#34; target=&#34;_blank&#34;&gt;Data Is Plural&lt;/a&gt;: A weekly newsletter that has collected &lt;a href=&#34;https://docs.google.com/spreadsheets/d/1wZhPLMCHKJvwOkP4juclhjFgqIY8fQFMemwKL2c64vk/edit#gid=0&#34;&gt;over a thousand useful or curious datasets&lt;/a&gt;. This may well be one of my favourite dataset collections!&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://ourworldindata.org&#34;&gt;Our World in Data&lt;/a&gt; contains time series of demographic and global development data. Their &lt;a href=&#34;https://ourworldindata.org/coronavirus&#34;&gt;collection of Covid-19 data&lt;/a&gt; is among the best.&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/rfordatascience/tidytuesday&#34; target=&#34;_blank&#34;&gt;TidyTuesday&lt;/a&gt;: A weekly data project in R that releases a new dataset every week, with an emphasis on learning how to summarise and arrange data to make meaningful charts with ggplot2, tidyr, dplyr, and other tools in the tidyverse ecosystem.&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://fivethirtyeight.com/&#34; target=&#34;_blank&#34;&gt;fivethirtyeight.com&lt;/a&gt; is a data-driven journalism site that &lt;a href=&#34;https://github.com/fivethirtyeight/data&#34; target=&#34;_blank&#34;&gt;shares the data behind most of its stories&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;In investigative journalism, &lt;a href=&#34;https://themarkup.org/&#34; target=&#34;_blank&#34;&gt;The Markup&lt;/a&gt; and &lt;a href=&#34;https://www.propublica.org/&#34; target=&#34;_blank&#34;&gt;ProPublica&lt;/a&gt; are both data-driven and share their data: &lt;a href=&#34;https://github.com/the-markup&#34; target=&#34;_blank&#34;&gt;all Markup data is freely available&lt;/a&gt;, and &lt;a href=&#34;https://www.propublica.org/datastore/datasets/&#34; target=&#34;_blank&#34;&gt;ProPublica provides many of its datasets for free&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/erikgahner/PolData&#34; target=&#34;_blank&#34;&gt;Erik Gahner’s list of political science datasets&lt;/a&gt;: Datasets divided by topic (governance, elections, policy, political elites, etc.), geography (country, region), etc.&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://cloud.google.com/bigquery/public-data/&#34; target=&#34;_blank&#34;&gt;BigQuery public datasets&lt;/a&gt;: BigQuery is Google’s data warehouse, and it hosts some large datasets that you really need SQL to access. There is even an R package, &lt;a href=&#34;https://cran.r-project.org/web/packages/bigrquery/&#34; target=&#34;_blank&#34;&gt;&lt;code&gt;bigrquery&lt;/code&gt;&lt;/a&gt;, that lets you talk to BigQuery’s database easily.&lt;/li&gt;
&lt;/ul&gt;
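&lt;p&gt;As a rough sketch of the BigQuery suggestion, the chunk below connects to one of BigQuery’s public sample tables through the DBI backend that &lt;code&gt;bigrquery&lt;/code&gt; provides and queries it with dplyr verbs. It assumes you have a Google Cloud project to bill queries to and that you can authenticate interactively; &lt;code&gt;your-billing-project-id&lt;/code&gt; is a placeholder you must replace, and the &lt;code&gt;samples.natality&lt;/code&gt; table is used purely as an example.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(DBI)
library(dplyr)

# Placeholder (assumption): replace with the ID of a Google Cloud project you can bill
billing_project &amp;lt;- &amp;quot;your-billing-project-id&amp;quot;

# Connect to the public sample dataset; queries are billed to your own project
con &amp;lt;- dbConnect(
  bigrquery::bigquery(),
  project = &amp;quot;bigquery-public-data&amp;quot;,
  dataset = &amp;quot;samples&amp;quot;,
  billing = billing_project
)

# A lazy reference to the table; no rows are downloaded yet
natality &amp;lt;- tbl(con, &amp;quot;natality&amp;quot;)

# dplyr verbs are translated to SQL and run inside BigQuery
natality %&amp;gt;%
  select(year, month, day, weight_pounds) %&amp;gt;%
  head(10) %&amp;gt;%
  collect()&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Keeping &lt;code&gt;select()&lt;/code&gt; and &lt;code&gt;head()&lt;/code&gt; before &lt;code&gt;collect()&lt;/code&gt; means only a handful of rows and columns are downloaded, which is a sensible habit with tables of this size.&lt;/p&gt;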
</description>
    </item>
    
  </channel>
</rss>
