class: title-slide, middle, center

# BUS 320

# Excel to R - I

# Introduction to R and RStudio

## Elizabeth Stanny

---
layout: true

<div class="my-footer"><span>http://bus320.estanny.com</span></div>

---

# Overview

+ What are R and RStudio?

+ How do I code in R?

+ What are R packages?

+ Compare variables and functions in R and Excel

---
class: center

# What are R and RStudio?

<img src="images/R_vs_RStudio.png" width="80%" />

.pull-left[
R, the programming language, does the work

.footnote[.font80[Source: Figure 1.1. from Modern Dive https://moderndive.com/1-getting-started.html]]
]

.pull-right[
RStudio, an *integrated development environment (IDE)*, is the interface that makes R easier to use
]

---

# Open RStudio, not R

<div class="figure" style="text-align: center">
<img src="images/R_vs_RStudio_logos.png" alt="Icons of R versus RStudio on your computer." width="90%" />
<p class="caption">Icons of R versus RStudio on your computer.</p>
</div>

---
class: center

# RStudio

<img src="images/rstudio.png" width="70%" />

---

.pull-left[

# Create an R Markdown document

```
---
title: "Quiz 2"
author: "Elizabeth Stanny"
output: html_document
---
```

]

--

.pull-right[

# Look at the Markdown Reference

- type

```
# Below I will do some calculations

## In the Rchunk
```

- knit the document

]

---

.pull-left[

# Insert a code chunk

]

--

.pull-right[

## R is a calculator

- add
- subtract
- multiply
- divide

]

---

# What are R packages?

<img src="images/R_vs_R_packages.png" width="70%" style="display: block; margin: auto;" />

.footnote[.font80[Source: Modern Dive https://moderndive.com/1-getting-started.html]]

---

# Package installation - using menu

--

.pull-left[

a) Click on the "Packages" tab.

b) Click on "Install" next to Update.

c) Type the name of the package under "Packages (separate multiple with space or comma):" In this case, type `tidyverse`, `pacman`.

d) Click "Install."

.footnote[.font80[Source: Modern Dive https://moderndive.com/1-getting-started.html]]

]

--

.pull-right[

<img src="images/install_packages_easy_way.png" width="50%" style="display: block; margin: auto 0 auto auto;" />

]

---

.left-column[

## Package installation with pacman

Install the package `pacman` using the menu

]

.right-column[

## Type in the console or in an R code chunk

```r
library(pacman)
p_load(tidyquant, tidyverse, readxl, janitor)
```

]

---

# Help

.pull-left[

### Type in an R code chunk or in the console

```r
?p_load()
```

- What does this command do?

]

.pull-right[

### Type in an R code chunk

```r
??p_load
```

- What does this command do?

]

---

# Exploring data

1. About the data: *Corporate Tax Avoidance in the First Year of the Trump Tax Law* is available at https://itep.org/corporate-tax-avoidance-in-the-first-year-of-the-trump-tax-law/

>Profitable Fortune 500 companies avoided $73.9 billion in taxes under the first year of the Trump-GOP tax law. The study includes financial filings by 379 Fortune 500 companies that were profitable in 2018; it excludes companies that reported a loss. The report builds on a previous ITEP analysis released in April 2019, which reviewed corporate filings available as of that date.

---

# Download the spreadsheet

- Click on the link below and save to your project folder `bus320`

- [corp_tax.xlsx](corp_tax.xlsx)

---

.pull-left[

### First step in data exploration in R

Use the function `skim` from the package `skimr` to calculate descriptive statistics

**Character variables**

- minimum length of character variable
- maximum length of character variable
- number of unique values
- number of blank cells

]

--

.pull-right[

**Numeric variables**

- mean (average)
- standard deviation
- quartiles provide a summary of the distribution
  - p0 is the minimum value, 0% of the values are less than it
  - p25 is the 1st quartile, 25% of the values are smaller
  - p50 is the 2nd quartile, the middle value, 50% of the values are smaller
  - p75 is the 3rd quartile, 75% of the values are smaller
  - p100 is the maximum value, 100% of the values are smaller or equal

]

---

# Duplicating the output of the R skim function in Excel

- `COLUMNS(array)` number of columns
- `ROWS(array)` number of rows
- `AVERAGE()` mean
- `STDEV()` standard deviation
- Count unique values among duplicates
- Quartiles
  - p0 = `QUARTILE(array, 0)`
  - p25 = `QUARTILE(array, 1)`
  - p50 = `QUARTILE(array, 2)`
  - p75 = `QUARTILE(array, 3)`
  - p100 = `QUARTILE(array, 4)`

---

# Cheatsheets

<img src="images/cheatsheets.png" width="80%" />

---

# Tips on learning to code

* **Computers are not actually that smart**
* **Take the "copy, paste, and tweak" approach**
* **The best way to learn to code is by doing**
  - Analyze data you are interested in
* **Practice is key**

.footnote[.font80[Source: Modern Dive https://moderndive.com/1-getting-started.html]]

---

# Errors, warnings, and messages

* **Errors**: <span style="color:red">go to the start of the error to figure out the problem</span>
* **Warnings**: <span style="color:orange">anything unexpected? if not, don't worry</span>
* **Messages**:<span style="color:green"> don't worry</span>

.footnote[.font80[Source: Modern Dive https://moderndive.com/1-getting-started.html]]
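
---

# Errors, warnings, and messages: a sketch

The three kinds of feedback described on the previous slide can be told apart in code. What follows is a minimal sketch and not part of the original course materials: the helper `classify_condition()` is a hypothetical name introduced only for illustration, and it uses base R alone (`tryCatch()` and `conditionMessage()`). The exact wording of R's error and warning text may differ by R version or language setting, so no output is shown.

```r
# Hypothetical helper (not from the course): evaluate an expression and
# report whether R signaled an error, a warning, a message, or nothing.
classify_condition <- function(expr) {
  tryCatch(
    {
      expr  # the expression is evaluated here, inside tryCatch()
      "no condition"
    },
    error   = function(e) paste("error:", conditionMessage(e)),
    warning = function(w) paste("warning:", conditionMessage(w)),
    message = function(m) paste("message:", trimws(conditionMessage(m)))
  )
}

# Adding a number and text cannot be done, so R signals an error.
err_result <- classify_condition(1 + "a")

# Converting text to a number still returns a value (NA), with a warning.
warn_result <- classify_condition(as.numeric("abc"))

# A message is purely informational.
msg_result <- classify_condition(message("Loading data"))

# Ordinary arithmetic signals nothing.
ok_result <- classify_condition(1 + 1)

print(c(err_result, warn_result, msg_result, ok_result))
```

Try swapping in your own expressions: an error stops the calculation and needs fixing first, while a warning or a message still lets the code finish.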