3  Let’s take our first ride in R

You might be hesitant about R. Perhaps you are anxious at this point because you have never used scripting software to analyze quantitative information. We will take a gentle approach with some exercises, but first let’s cover some important details about R.

3.1 R is an object-oriented language

An object-oriented language probably sounds esoteric and strange when you first read about it. But it is not as weird as it sounds!

Programming languages offer different ways to represent human knowledge. Programming is, after all, a human activity, so we always try to represent what we have in our minds. Computers are good metaphors for the mind: using a programming language, we can create objects with characteristics, qualities, names, and properties.

Have you ever thought about a simple object such as a table? You see the object, you name the object, and then you start realizing that the table has some properties. Normally, tables are built from something solid, and typically a table has four legs. However, you could transform the table: you could pull it apart and create a chair. The table object can be manipulated. You could also paint the table, or change some of its parts when it gets old.

Likewise, an object in R can be created, manipulated, and transformed. Each object also has its own properties: some objects hold letters and numbers, while others hold only numbers. Consider the next example:

table <- c("has four legs", "it is brown", "it is solid")

What happened in the code above? You can see the word table, followed by an arrow that looks like this: <-. That arrow is called the assignment operator; it assigns, or saves, information in the object that I decided to name table. You can also see that I wrote statements in English. These statements are now assigned to the object table.

We can access the information inside table by typing the word table in the RStudio console. It should look similar to this:

table
[1] "has four legs" "it is brown"   "it is solid"  

Notice that the console printed the information contained in table; in this case the information was [1] "has four legs" "it is brown" "it is solid". In this example, we created a type of data object called a vector. Vectors are the most basic objects in the R language. Vectors can comprise numbers, letters, or both.

Let’s see the following example:

table <- c(4, 90, 50, "has four legs")

In this new example, I mixed numbers with a statement in English. We can see how the information is saved inside the object table by typing table in the console:

table
[1] "4"             "90"            "50"            "has four legs"

Aha! Can you spot the difference? When you save numbers and letters together, all the elements are wrapped in quotes. This means that vectors have an important property: they can handle numbers and letters, but when you mix them, the numbers are transformed into characters.

In R, characters are represented by wrapping the element in quotes. This is one of the characteristics of vector objects. Just like physical objects such as tables, virtual objects have properties. For instance, tables cannot fly; that’s a property of tables 🤓.
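The conversion described above can be checked directly in the console. The following is a minimal sketch: the object names numbersOnly, mixed, and recovered are made up for illustration, and class() and as.numeric() are base R functions that have not been introduced in the text so far.

```r
# Hypothetical object names, chosen only for this illustration.
numbersOnly <- c(4, 90, 50)
mixed <- c(4, 90, 50, "has four legs")

class(numbersOnly)  # "numeric"
class(mixed)        # "character": the numbers were converted to text

# Arithmetic works on the numeric vector ...
sum(numbersOnly)    # 144

# ... but the mixed vector has to be converted back first.
# mixed[1:3] keeps only the first three elements, which still look like numbers.
recovered <- as.numeric(mixed[1:3])
sum(recovered)      # 144
```

Calling sum(mixed) instead would stop with an error, because R does not add up text. That is the practical reason to keep numbers and text in separate vectors.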

Assignment character shortcut

In RStudio, press Alt + - (Option + - on macOS) to insert the assignment arrow <-.

3.2 Types of objects

3.2.1 Vectors

I already showed examples of vectors in the previous section. However, there are more details to mention. Objects can have different modes. In the case of vectors, you can have a vector of mode numeric or of mode character (Matloff, 2011).

vector1 <- c(1,2,3,4)
mode(vector1)
[1] "numeric"

In this example I created a vector that I named vector1; this vector contains only numbers. When a vector has only numeric values, its mode is numeric. I used the function mode() to examine what mode was assigned to the vector.

vector2 <- c("USA","Costa Rica","Jamaica","Argentina")
mode(vector2)
[1] "character"

In this case, vector2 has mode character. This means that all elements in the vector are strings or characters.
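The mode matters because it decides which operations make sense for a vector. The sketch below uses two hypothetical vectors, ages and countries; is.numeric(), is.character(), mean(), and nchar() are base R functions that do not appear in the text above and are shown here only as an illustration.

```r
# Hypothetical vectors, used only to illustrate modes.
ages <- c(21, 35, 48)
countries <- c("USA", "Costa Rica", "Jamaica")

mode(ages)               # "numeric"
mode(countries)          # "character"

# These helpers answer the same question with TRUE or FALSE.
is.numeric(ages)         # TRUE
is.character(ages)       # FALSE
is.character(countries)  # TRUE

# Numeric vectors support arithmetic summaries ...
mean(ages)               # 34.66667, the average age

# ... while character vectors support text operations.
nchar(countries)         # 3 10 7, the number of characters in each name
```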

3.2.2 Matrices

You might remember matrices from your high school or college math. If not, don’t worry. Let me show you what a matrix looks like:

  1. \[\begin{bmatrix} 1 & 2 & 3\\ 6 & 5 & 4 \end{bmatrix}\]

Matrix 1 is an example of a matrix. In this course we won’t use matrices. They are better suited to multivariate statistics, which is a more advanced topic. Nonetheless, matrices are objects that are closely related to data frames in R. We will study data frames later in this chapter.

In R you can create that same matrix by running this code:

matrixExample <- matrix(data = c(1,6,2,5,3,4),
                        nrow = 2, 
                        ncol = 3, 
                        byrow = FALSE)
matrixExample
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    6    5    4

A matrix has rows and columns. Each element in a matrix can be a number or a string of letters but, as in a vector, all the elements share the same mode. In R, you can subset the matrix by row:

## This gets the first row values.
matrixExample[1,]
[1] 1 2 3

Or by column:

## This gets the first column values.
matrixExample[,1]
[1] 1 6

Matrices have particular mathematical properties, and R respects these properties. However, I won’t explain them in this class. You just need to know that a matrix is a type of object in R.
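The argument byrow controls the order in which R fills the matrix. As a minimal sketch, the code below builds the same matrix again, but with the numbers typed in reading order (row by row), which requires byrow = TRUE. The name matrixByRow is made up for this illustration, and dim() is a base R function not used in the text above.

```r
# The same six numbers as in matrixExample, typed row by row this time.
matrixByRow <- matrix(data = c(1, 2, 3, 6, 5, 4),
                      nrow = 2,
                      ncol = 3,
                      byrow = TRUE)
matrixByRow

# dim() reports the number of rows first, then the number of columns.
dim(matrixByRow)    # 2 3

# Giving a row and a column together picks a single element.
matrixByRow[2, 3]   # 4, the value in row 2, column 3
```

With byrow = FALSE, R fills the first column from top to bottom before moving to the next one, which is why the earlier example listed the numbers as 1, 6, 2, 5, 3, 4.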

3.2.3 Lists

Lists are probably the most useful type of object in R. A list is similar to a vector, but it can contain multiple objects of different modes and different types:

list_1 <- list(firstGrade = c("Esteban", "David", "Sofia"),
               secondGrade = c("Jenny", "Alex", "Maria"),
               grades_1 = c(80,90,100),
               grades_2 = c(56,89,88))

In this example, I’m adding the names of first graders and second graders, along with their grades. I’m combining numeric vectors with character vectors, and then I save them in a single object named list_1. We can check what the information looks like when we print the content on the console:

list_1
$firstGrade
[1] "Esteban" "David"   "Sofia"  

$secondGrade
[1] "Jenny" "Alex"  "Maria"

$grades_1
[1]  80  90 100

$grades_2
[1] 56 89 88

Important

Pay attention to the function list() that I used in the example.

Also, you can see in the output that each vector is displayed separately as an element of the list. A dollar sign ($) appears before the name of each element. The dollar sign is like a pin that helps you access a specific vector inside the list:

list_1$firstGrade
[1] "Esteban" "David"   "Sofia"  

In this example, I’m printing the names of the first graders on the console.
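Because the dollar sign returns an ordinary vector, the result can be passed on to other functions. The sketch below uses a shortened, hypothetical list named classList so that it runs on its own; names(), length(), mean(), and the double square brackets are base R features that the text above has not introduced.

```r
# A shortened list with a hypothetical name, so this block runs on its own.
classList <- list(firstGrade = c("Esteban", "David", "Sofia"),
                  grades_1 = c(80, 90, 100))

names(classList)          # "firstGrade" "grades_1"
length(classList)         # 2, because the list holds two vectors

# The vector extracted with $ behaves like any other numeric vector.
mean(classList$grades_1)  # 90

# Double square brackets reach the same vector by name.
classList[["grades_1"]]   # 80 90 100
```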

3.2.4 Data Frames

The data frame is the object that you will see the most in this course. You can think of it as a list whose elements are vectors of the same length, one per column, bundled into a single object. This structure lets you combine different modes (character or numeric) and makes it easy to subset columns and rows. This is the data type that will help us analyze observed data in this course.

Example:

dataExample <- data.frame(firstGrade = c("Esteban", "David", "Sofia"),
                          secondGrade = c("Jenny", "Alex", "Maria"),
                          grades_1 = c(80,90,100),
                          grades_2 = c(56,89,88))

In this new example, I’m not using the function list(), because I replaced it with the function data.frame(). This will save the information with the following structure:

dataExample
  firstGrade secondGrade grades_1 grades_2
1    Esteban       Jenny       80       56
2      David        Alex       90       89
3      Sofia       Maria      100       88

After printing the information saved in the object, you’ll notice that we have column names and multiple rows. Now this looks more like a spreadsheet in Excel.

The data frame looks better displayed in the following table:

firstGrade   secondGrade   grades_1   grades_2
Esteban      Jenny         80         56
David        Alex          90         89
Sofia        Maria         100        88

You can also transform and subset data frames. For instance, you can access only the first row:

dataExample[1,] ### I'm printing only the first row
  firstGrade secondGrade grades_1 grades_2
1    Esteban       Jenny       80       56

You may also print only the first column:

dataExample[,1] ### I'm printing only the first column
[1] "Esteban" "David"   "Sofia"  

Matrix and Data Frame

Matrices and data frames are two-dimensional objects. These objects have rows and columns. That’s why we use square brackets to subset the object by row or column. The first number inside the square brackets represents the row, for instance data[1,], and the second number represents the column, for instance data[,1].
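The row and column positions can also be combined, and a column can be reached by its name. The following is a minimal sketch with a shortened, hypothetical data frame named gradesData, created here only so that the block runs on its own; mean() is a base R function not used in the text above.

```r
# A shortened data frame with a hypothetical name, so this block runs on its own.
gradesData <- data.frame(firstGrade = c("Esteban", "David", "Sofia"),
                         grades_1 = c(80, 90, 100))

# Row first, column second: the grade stored in row 2, column 2.
gradesData[2, 2]            # 90

# The column can be given by name instead of by position.
gradesData[2, "grades_1"]   # 90

# As with lists, the dollar sign extracts a whole column as a vector.
gradesData$grades_1         # 80 90 100
mean(gradesData$grades_1)   # 90
```

Using the column name is often safer than using its position, because the code keeps working if the order of the columns changes.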

3.3 Exercises

  1. Create a data frame object by copying the code below. Change the object’s name (you may name it “expenses”), then change the variable names in the example. Finally, run the code. How many rows does this data frame have? How many columns does it have? Can you tell what happened after running the function head()?

Example <- data.frame(variable1 = c(30,63,96),
                      variable2 = c(63,25,45),
                      variable3 = c(78,100,100),
                      variable4 = c(56,89,88))

head(Example)

  2. Can you guess what the mode of this vector is? You may use the function mode() to find out the answer.

vec <- c("Alexandra", "Apple", "Chair")

  3. Create a matrix with 5 columns and 5 rows.

References

Matloff, N. (2011). The art of R programming: A tour of statistical software design. No Starch Press.