15  R Basics

15.1 Data Structures

In R, data structures are used to store, organize, and manipulate data. Below are some common data structures in R:

Objects

An object is a data structure that stores a value or a set of values, along with information about the type of data and any associated attributes. Objects are usually created by assigning a value to a variable name. You can assign values by using either = or <-.
When naming objects in R use PascalCase, camelCase, snake_case or dot.case.

ScreenTime<-120

Vectors

A vector is a one-dimensional array that can hold elements of any data type.
Some common data types are numeric, character, logical, and complex.
Use the c() function to concatenate (combine) elements and store them in a vector.

ScreenTimeDays<-c(110,115,120,98,60)

15.1.1 Data Frames

Data Frames store data in rows and columns. Each columns stores a variable that can have its unique data type (i.e., character, boolean, numeric, etc.). The code below constructs a data frame called Devices:

Devices<- data.frame(Device=c("iphone","ipad","computer",
                              "kindle","watch"),
                     Days=ScreenTimeDays)
Devices
    Device Days
1   iphone  110
2     ipad  115
3 computer  120
4   kindle   98
5    watch   60

As you can see, Devices is made up of two vectors (Device and Days). The data frame holds information on five devices (rows), and two variables (the name of the device and the number of days engaged).

To retrieve a variable from a data frame, we can use the $ operator. For example, if we wanted to use the Days variable, we would write Devices$Days indicating to R to return the Days variable from the Devices data frame.

Devices$Days
[1] 110 115 120  98  60

If instead we wanted to retrive the Device variable, we would write:

Devices$Device
[1] "iphone"   "ipad"     "computer" "kindle"   "watch"   

Data Types

The main data types are numeric, character, logical, date, and complex. To identify the data type stored in a vector use the class() function.

class(Devices$Device)
[1] "character"

15.2 Functions

In general, functions relate an input (arguments) to an output. Below are some common functions in R.

sum

The sum() function takes as an input a vector with numeric values and returns the sum of the elements.

SleepingHours<-c(10,9,6,8)
sum(SleepingHours)
[1] 33

The total number of sleeping hours is 33.

mean

The mean() function takes a numeric vector and returns the average. Let’s try this with the Devices data frame.

mean(Devices)
Warning in mean.default(Devices): argument is not numeric or logical: returning
NA
[1] NA

Note that R returns an NA. This is because the mean() function expects a vector. Let’s try again by supplying the Device vector:

mean(Devices$Device)
Warning in mean.default(Devices$Device): argument is not numeric or logical:
returning NA
[1] NA

Once again we get an error. In this case we have supplied a vector that contains characters. These can’t be added up. The point here is to recognize that each function has requirements on the inputs it requires. When working with functions make sure to provide the correct inputs. Let’s provide the mean() function with the Days variable:

mean(Devices$Days)
[1] 100.6

At last we have provided a numerical vector, and now R can calculate its mean.

Help

To learn more about a function you can use ?. For example, to learn more about the sum() function, write ?sum in the console.