# Data Types

In the last lesson, we learned about two data types: vectors and data frames. We also learned about two different classes of vectors: `numeric` and `factor`. There are many other data types in R. Each has a special use, and to be productive in R, you need to be familiar with the major types and the operations on these types.

## Primitive Types

Each R object has an underlying “type”, which determines the set of possible values for that object. You can find the type of an object using the `typeof` function.

The main types include the following:

• `logical`: a logical value.

``TRUE``
```
[1] TRUE
```
``FALSE``
```
[1] FALSE
```
``TRUE | FALSE  # logical 'or'``
```
[1] TRUE
```
``TRUE & FALSE  # logical 'and'``
```
[1] FALSE
```
``!TRUE  # logical 'not'``
```
[1] FALSE
```
• `integer`: an integer (positive or negative). Many R programmers do not use this type explicitly, since every `integer` value can be represented exactly as a `double`.

``1L  # suffix integers with an L to distinguish them from doubles``
```
[1] 1
```
``-7L``
```
[1] -7
```
``1L:10L  # range of values``
```
 [1]  1  2  3  4  5  6  7  8  9 10
```
``1:10  # (L suffix is optional)``
```
 [1]  1  2  3  4  5  6  7  8  9 10
```
``7 %% 2  # modulo (remainder)``
```
[1] 1
```
``7 %/% 2  # integer division``
```
[1] 3
```
• `double`: a real number stored in “double-precision floating-point format.”

``1``
```
[1] 1
```
``3.14``
```
[1] 3.14
```
``-(3 + 8/2) * 7  # arithmetic operations``
```
[1] -49
```
``2^10  # exponentiation``
```
[1] 1024
```

A `double` type can store the special values `Inf`, `-Inf`, and `NaN`, which represent “positive infinity,” “negative infinity,” and “not a number”:

``1/0``
```
[1] Inf
```
``-1/0``
```
[1] -Inf
```
``0/0``
```
[1] NaN
```
• `complex`: a complex number

``1i  # suffix with i to denote 'imaginary'``
```
[1] 0+1i
```
``(2i)^2``
```
[1] -4+0i
```
``sqrt(-1+0i)``
```
[1] 0+1i
```
• `character`: a sequence of characters, called a “string” in other programming languages

``"Hello, World!"  # denote a string with double quotes...``
```
[1] "Hello, World!"
```
``'abracadabra'    # ...or with single quotes (both forms are equivalent).``
```
[1] "abracadabra"
```
• `list`: a sequence of values, optionally named (discussed in detail in the Lists section below)

``list(a = 10, b = 11, z = "hello")``
```
$a
[1] 10

$b
[1] 11

$z
[1] "hello"
```
• `builtin`, `closure`, `special`: a function or operator (for most purposes, the distinctions between these are not important)

``typeof(sqrt)``
```
[1] "builtin"
```
``typeof(read.csv)``
```
[1] "closure"
```
``typeof(`<-`)``
```
[1] "special"
```
• `NULL`: a special type with only one possible value, known as `NULL`

``typeof(NULL)``
```
[1] "NULL"
```

This is not an exhaustive list, but the other types are exotic and you probably won’t ever encounter them.
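To tie the list above back to `typeof`, here is a minimal sketch that queries the type of one literal of each kind. The values are arbitrary examples; the comments show what each call should return.

```r
# Ask R for the underlying type of a few literal values.
typeof(TRUE)               # "logical"
typeof(1L)                 # "integer"
typeof(1:10)               # "integer" (the colon operator yields integers)
typeof(1)                  # "double"  (no L suffix, so not an integer)
typeof(1/0)                # "double"  (Inf is a double value)
typeof(1i)                 # "complex"
typeof("hello")            # "character"
typeof(list(a = 10))       # "list"
typeof(NULL)               # "NULL"
```

Note that `1` and `1L` print identically but have different types, which is why the `L` suffix matters when you specifically want an integer.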

## Missing Values

One distinctive feature of R is its support for “Not Available” or “Missing” values. The `logical`, `integer`, `double`, `complex`, and `character` types can all represent missing values, using the special constant `NA`.
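A short sketch of how `NA` behaves in practice may help. The vector `ages` below is made up for illustration; `is.na`, `is.nan`, and the `na.rm` argument of `mean` are standard base R.

```r
# An integer vector with one missing value; the vector keeps its type.
ages <- c(31L, NA, 45L)
typeof(ages)                 # "integer"

# is.na() marks which elements are missing.
is.na(ages)                  # FALSE  TRUE FALSE

# Computations involving NA usually give NA ...
mean(ages)                   # NA
# ... unless the function is told to drop missing values first.
mean(ages, na.rm = TRUE)     # 38

# NA and NaN are related but distinct.
is.na(NaN)                   # TRUE
is.nan(NA)                   # FALSE
```

Testing for missingness with `x == NA` does not work, because the comparison itself evaluates to `NA`; use `is.na(x)` instead.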

## Conversions

Often, you don’t need to worry too much about the types, because R will implicitly convert between types for you. For example, consider the following sequence of commands:

```
x <- 1:10
x[1] <- 3.14
```

When the first line gets executed, `x` gets created as an `integer` vector. In the second line, R converts `x` to a `double` vector so that it can store the value `3.14`.
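The conversion described above can be observed directly with `typeof`. The sketch below also shows two related behaviours that are not covered in the text: combining values with `c` coerces them to the most general type, and the `as.*` functions convert explicitly.

```r
# Implicit conversion on assignment.
x <- 1:10
typeof(x)                    # "integer"
x[1] <- 3.14
typeof(x)                    # "double" (the whole vector was converted)

# Combining mixed types coerces to the most general one.
typeof(c(TRUE, 2L))          # "integer"
typeof(c(2L, 3.5))           # "double"
typeof(c(3.5, "a"))          # "character"

# Explicit conversion with the as.* family.
as.integer(3.99)             # 3 (truncates toward zero)
as.numeric("2.5")            # 2.5
as.character(42L)            # "42"
```

Implicit conversion only ever moves toward a more general type, so no information is lost; going the other way requires an explicit call.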

## Lists

A “list” is a primitive type that stores a sequence of values, along with optional names for these values. The power of the list type is that it allows you to represent complicated objects.

We construct lists using the `list` function:

```
abe <- list(first.name = "Abraham", last.name = "Lincoln", weight.lb = 180,
            height.in = 76.8)
```

In this example, `abe` is a list with four elements, with names `first.name`, `last.name`, `weight.lb`, and `height.in`.

We access the elements of a list using double square brackets. We can either specify the index of the element

``abe[[1]]``
```
[1] "Abraham"
```
``abe[[2]]``
```
[1] "Lincoln"
```

or we can specify the name

``abe[["first.name"]]``
```
[1] "Abraham"
```
``abe[["last.name"]]``
```
[1] "Lincoln"
```

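Another way to access an element by name is to use the `$` operator:

``abe$height.in``
```
[1] 76.8
```
``abe$weight.lb``
```
[1] 180
```

Both forms (`abe[["first.name"]]` and `abe$first.name`) are equivalent, but the `$` form is more common. The `$` operator also accepts an unambiguous prefix of a name, so `abe$height` returns the `height.in` element.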

As with vectors, we get the number of elements with the `length` function:

``length(abe)``
```
[1] 4
```

We slice lists with single square brackets:

``abe[1:2]``
```
$first.name
[1] "Abraham"

$last.name
[1] "Lincoln"
```
``abe[1]``
```
$first.name
[1] "Abraham"
```

For a vector, the slice `[1]` is logically equivalent to the element `[[1]]`, but for a list, these entities are distinct.
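One way to see the distinction is to compare the types of the two results. The sketch below redefines `abe` so that it is self-contained; the vector `v` is an arbitrary example.

```r
abe <- list(first.name = "Abraham", last.name = "Lincoln",
            weight.lb = 180, height.in = 76.8)

# Double brackets extract the element itself.
typeof(abe[[1]])             # "character"

# Single brackets return a list of length one that contains the element.
typeof(abe[1])               # "list"
length(abe[1])               # 1

# For an unnamed atomic vector the two forms give the same result.
v <- c(10, 20, 30)
identical(v[1], v[[1]])      # TRUE
```

A practical consequence is that `abe[3] * 2` fails, because a list cannot be multiplied, while `abe[[3]] * 2` works.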

We can delete a particular element of a list by assigning it the value `NULL`:

``abe[["last.name"]] <- NULL``

This removes the element and shifts the indexes of subsequent elements:

``abe[[2]]``
```
[1] 180
```
``abe[[3]]``
```
[1] 76.8
```
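The effect of the deletion can also be checked with `names` and `length`. The sketch below rebuilds `abe` so it runs on its own. The last step goes beyond the text: it shows the standard idiom for storing `NULL` as a value instead of deleting the element.

```r
abe <- list(first.name = "Abraham", last.name = "Lincoln",
            weight.lb = 180, height.in = 76.8)

# Assigning NULL with double brackets removes the element entirely.
abe[["last.name"]] <- NULL
names(abe)                   # "first.name" "weight.lb" "height.in"
length(abe)                  # 3

# To keep an element whose value is NULL, assign list(NULL) with single brackets.
abe["last.name"] <- list(NULL)
length(abe)                  # 4
is.null(abe[["last.name"]])  # TRUE
```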

## Classes

Two types we saw in the previous lesson are not primitive: data frames and factors. In fact, a data frame is a special type of list, and a factor is a special type of integer vector. These special types are known as “classes”.

Every R object is a member of one or more classes. To find these classes, use the `class` function:

``class(TRUE)``
```
[1] "logical"
```
``class(1L)``
```
[1] "integer"
```
``class(3.14)``
```
[1] "numeric"
```

(Confusingly, the class for `double` objects is not called `double`; it is called `numeric`.)

A data frame is a list whose elements are vectors, each with the same length. A factor is an integer vector taking values in the range `1`..`m`, where `m` is the number of levels, with each integer corresponding to a certain level. R distinguishes between these types and their underlying representations by assigning them to different classes.

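```
# In R 4.0 and later, stringsAsFactors = TRUE is needed to read text columns as factors
bikedata <- read.csv("bikedata.csv", stringsAsFactors = TRUE)
typeof(bikedata)
```
```
[1] "list"
```
``class(bikedata)``
```
[1] "data.frame"
```
``typeof(bikedata$colour)``
```
[1] "integer"
```
``class(bikedata$colour)``
```
[1] "factor"
```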
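The same relationships can be reproduced without the CSV file. The sketch below builds a small factor and data frame by hand; the colours and speeds are made-up values, not the contents of `bikedata.csv`.

```r
# A factor is an integer vector of codes plus a set of level labels.
colour <- factor(c("Red", "Blue", "Red", "Grey"))
typeof(colour)               # "integer"
class(colour)                # "factor"
levels(colour)               # "Blue" "Grey" "Red" (sorted alphabetically)
as.integer(colour)           # 3 1 3 2 (the code for each value)

# A data frame is a list of equal-length vectors, one per column.
bikes <- data.frame(colour = colour, speed = c(21.5, 18.0, 25.2, 19.9))
typeof(bikes)                # "list"
class(bikes)                 # "data.frame"
length(bikes)                # 2 (the number of columns)
```

Because a data frame is a list, the list operations from the previous section apply: `bikes$speed` and `bikes[["speed"]]` both extract a column.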

The power of classes is that they allow you to change how certain functions behave. Compare the following two outputs:

``summary(bikedata$colour)``
```
Black  Blue Green  Grey Other   Red White  NA's
  262   636   149   531    52   378   333    14
```
``summary(unclass(bikedata$colour))``
```
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's
   1.00    2.00    4.00    3.83    6.00    7.00      14
```

Here, `unclass` is a function that strips the class, leaving the underlying primitive type. When `summary` is given an object with class `factor`, it reports counts for the levels; when it is given an object with class `integer`, it reports quartiles and other statistics.

Advanced R programmers create new kinds of classes, along with specialized functions to act on these classes.
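As a rough illustration of what this looks like, here is a minimal sketch using R's simplest class mechanism (often called S3). The class name `person` and the function `print.person` are hypothetical names chosen for this example.

```r
# Attach a class to an ordinary list with structure().
abe <- structure(
  list(first.name = "Abraham", last.name = "Lincoln"),
  class = "person"
)

# A function named print.<class> is used automatically when print()
# is called on an object of that class.
print.person <- function(x, ...) {
  cat("Person:", x$first.name, x$last.name, "\n")
  invisible(x)
}

print(abe)                   # Person: Abraham Lincoln
typeof(abe)                  # "list" (the underlying type is unchanged)
class(abe)                   # "person"
```

This is the same mechanism that makes `summary` behave differently for factors and integers: R looks at the class of the argument and picks the matching specialized function.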