sum(45, 978, 121)
[1] 1144
I will now introduce you to two tools you will use to successfully complete this course: R and RStudio. R is a versatile programming language that excels in statistical analysis. It is widely used by academics and in the private, government, and international sectors. You will certainly get a lot of use out of it going forward!
R is also a flexible language. You can build many kinds of things with it, including websites, apps, and slides. In fact, all the resources I produced for this course, including this website and the slides, were made using R in RStudio.
Another advantage R has over other statistical programming languages is its accessibility. It is entirely free to use, and many freely available resources introduce its many uses. There is also an enthusiastic and welcoming community of R users who continue to develop R itself and the resources you might need to expand your skills.
So, what is the difference between R and RStudio? R is the statistical programming language. RStudio is the platform, or integrated development environment, you will use to work with R. RStudio is free and used widely by R users.
This course aims to provide you with two broad skills: statistical analysis and R. I will now outline what you will learn in relation to R.
You will learn how to import your data into R. This includes how to load data stored in an external file, database, or online into a data frame in R.
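As a minimal sketch of what importing looks like, the base R function read.csv() reads a comma-separated file into a data frame. The file, its columns, and the name economies are made up for illustration, and the file is written to a temporary location only so that the example is self-contained. The import functions used later in the course may differ.

```r
# Write a small, made-up CSV to a temporary file so the example is self-contained
csv_path <- tempfile(fileext = ".csv")
writeLines(
  c("country,year,gdp",
    "A,2020,1.5",
    "B,2020,2.3",
    "C,2020,"),
  csv_path
)

# Read the file into a data frame; the empty gdp field is read as NA
economies <- read.csv(csv_path)

# Inspect the structure: one row per line of data, one column per variable
str(economies)
```

With your own data, you would replace csv_path with the path to your file.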
You will then be introduced to methods for cleaning those data. Data often come to us in a messy format, with missing values and inconsistencies. You will need to tidy them into a consistent format that is easy to work with.
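To make "messy" concrete, here is a small sketch using base R only. The data frame messy and its values are invented for illustration; it has inconsistent labels and one missing value. This is one possible approach, not necessarily the one taught later in the course.

```r
# Made-up messy data: inconsistent country labels and a missing score
messy <- data.frame(
  country = c("france", "France ", "GERMANY", "Germany"),
  score   = c(3.2, NA, 4.1, 3.9)
)

clean <- messy

# Make the labels consistent: trim stray spaces, then capitalise only the first letter
clean$country <- trimws(clean$country)
clean$country <- paste0(
  toupper(substring(clean$country, 1, 1)),
  tolower(substring(clean$country, 2))
)

# Count the missing scores, then drop the rows that contain them
sum(is.na(clean$score))
clean <- clean[!is.na(clean$score), ]

clean
```

Whether to drop rows with missing values or handle them some other way depends on your analysis. Dropping them is simply the shortest option to show here.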
Once you have tidy data, you will then need to transform it so that it is ready for your analysis. This includes focusing your data on the observations you are interested in and creating new variables.
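The two transformations mentioned above, focusing on some observations and creating a new variable, can be sketched with mtcars, a small data set that ships with R. The names four_cyl and wt_kg are chosen here for illustration.

```r
# Focus on the observations of interest: cars with four cylinders
four_cyl <- mtcars[mtcars$cyl == 4, ]

# Create a new variable: weight in kilograms
# (wt is recorded in thousands of pounds; one pound is about 0.4536 kg)
four_cyl$wt_kg <- four_cyl$wt * 1000 * 0.4536

# Look at the first few rows of the columns we care about
head(four_cyl[, c("mpg", "cyl", "wt", "wt_kg")])
```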
Next, we will focus on visualizing your data. You can learn a lot more about your data and relationships lurking within it from a plot than you can from looking at the raw numbers.
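As a quick illustration, the sketch below draws a scatter plot from the built-in mtcars data. It uses base R graphics so that it runs without installing anything. The course itself works with ggplot2, so treat this as a stand-in rather than the approach you will be taught.

```r
# Scatter plot of car weight against fuel efficiency
plot(
  mtcars$wt, mtcars$mpg,
  xlab = "Weight (1000 lbs)",
  ylab = "Miles per gallon",
  main = "Heavier cars tend to use more fuel"
)
```

The downward pattern in the points is much easier to spot here than by scanning the 32 rows of numbers.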
We will also spend a fair chunk of time learning how to model those relationships within our data. Alongside visualization, this is where R excels.
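For a first taste of modeling, the sketch below fits a simple linear regression with lm(), which is part of base R. The choice of the mtcars data and of these two variables is only for illustration.

```r
# Model fuel efficiency (mpg) as a linear function of weight (wt)
fit <- lm(mpg ~ wt, data = mtcars)

# The estimated intercept and slope
coef(fit)

# A fuller summary: standard errors, p-values, and R-squared
summary(fit)
```

The sign of the wt coefficient tells you the direction of the relationship. We will cover how to interpret the rest of the output later in the course.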
Finally, I will also introduce you to tools for communicating your findings in an engaging and replicable way.
An R package is a collection of functions, data, and documentation bundled together to extend R’s capabilities. Packages help users avoid reinventing the wheel by providing pre-written code for common tasks like data manipulation, visualization, and statistical modeling. The tidyverse, for example, is a collection of packages that simplify working with data in R. You can install packages from CRAN (the Comprehensive R Archive Network) using install.packages("package_name") and load them into your current session using library(package_name). These packages are written by R coders just like you. As your R skills develop, you might want to approach these common tasks in your own way. You might even want to write your own package for others to use.
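The install-then-load pattern can be sketched as follows. The two tidyverse lines are commented out because installing requires an internet connection and only needs to happen once. The check at the end uses stats, a package that ships with R, purely so the example runs on any installation; has_stats is a name chosen for illustration.

```r
# Install once per machine (requires an internet connection)
# install.packages("tidyverse")

# Load at the start of every session in which you use the package
# library(tidyverse)

# Check whether a package is installed without loading it
has_stats <- requireNamespace("stats", quietly = TRUE)
has_stats
```

A common slip is to quote the name in one call and not the other: install.packages() needs the name in quotes, while library() accepts it without.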
We will install and learn about some wonderful R packages throughout this course, starting with ggplot2 and the broader tidyverse family in the very next session.
Using the console, find the sum of 45, 978, and 121.
What is 67 divided by 6?
67 / 6
[1] 11.16667
What is the square root of 894? Hint: use the sqrt() function.
sqrt(894)
[1] 29.89983