---
title: "Assignment 1"
author: "Burkay Genç"
date: "March 26, 2018"
output:
  html_document:
    theme: cerulean
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE, eval = TRUE)
```
## Question 1 (5 points)
Do the following in R:
- Assign 8 to `p`
- Assign 6 to `q`
- Swap the values of `p` and `q`. You are not allowed to directly assign values. You have to "swap" them!
```{r}
# Solution 1 : No extra variables used
p <- 8
q <- 6
p <- p + q
q <- p - q
p <- p - q
p
q
```
```{r}
# Solution 2 : Temporary variable used
p <- 8
q <- 6
temp <- p
p <- q
q <- temp
p
q
```
## Question 2 (10 points)
- Create a vector of the populations of the 10 largest cities in Turkey.
- Name your vector with the names of the cities.
- Print the names of the cities that have a population between 2 million and 3 million.
```{r}
# Write your answer here
city.pops <- c(14160467, 5045083, 4061074, 2740970, 2158265, 2149260, 2079225, 1844438, 1801980, 1705774)
names(city.pops) <- c("İstanbul", "Ankara", "İzmir", "Bursa", "Antalya", "Adana", "Konya", "Gaziantep", "Şanlıurfa", "Mersin")
names(city.pops)[city.pops > 2000000 & city.pops < 3000000]
```
## Question 3 (10 points)
- Create a matrix as follows:
    - First row consists of numbers: {1,2,3,4,5,6}
    - Second row consists of numbers: {2,4,6,8,10,12}
    - Third row consists of numbers: {1,3,5,7,9,11}
    - Fourth row consists of the sum of the second and third rows
    - Fifth row consists of the fourth row divided by the first row
- Swap the columns of the matrix so that the first row reads: {1,3,5,2,4,6}
```{r}
# Write your answer here
first <- 1:6
second <- first * 2
third <- second - 1
fourth <- second + third
fifth <- fourth / first
m <- matrix(c(first, second, third, fourth, fifth), nrow = 5, byrow = TRUE)
m
m <- m[, c(1, 3, 5, 2, 4, 6)]
m
```
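As a cross-check on the `matrix()` call above, the same matrix can also be built by stacking the row vectors with base R's `rbind`. This is only a sketch of an alternative, not the expected answer; the names `r1` to `r5` and `m.alt` are introduced here for illustration and do not appear in the assignment.

```{r}
# Alternative sketch: build the rows, then stack them with rbind()
r1 <- 1:6
r2 <- r1 * 2        # 2, 4, ..., 12
r3 <- r2 - 1        # 1, 3, ..., 11
r4 <- r2 + r3       # sum of the second and third rows
r5 <- r4 / r1       # fourth row divided by the first row
m.alt <- rbind(r1, r2, r3, r4, r5)
# Reorder the columns so that the first row reads 1, 3, 5, 2, 4, 6
m.alt <- m.alt[, c(1, 3, 5, 2, 4, 6)]
m.alt
```

Unlike `matrix()`, `rbind` keeps the variable names as row names, which can make the printed result easier to read.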
## Question 4 (10 points)
- Create a factor from the following vector:
`{"red", "red", "blue", "brown", "green", "blue", "red", "green", "green", "brown", "red", "blue"}`
- Display the frequencies of each factor value (level)
- Re-name `"red"` as `"purple"`
- Display the number of "purples"
```{r}
# Write your answer here
f <- factor(c("red", "red", "blue", "brown", "green", "blue", "red", "green", "green", "brown", "red", "blue"))
table(f)
index.of.red <- which(levels(f) == "red")
levels(f)[index.of.red] <- "purple"
table(f)["purple"]
```
## Question 5 (20 points)
- Create a data frame for the following girls. You must choose the correct column types:
    - Canan is 24 years old, blonde, 170 cm and 56 kg. She is married.
    - Deniz is 35 years old, has brown hair, 173 cm and 61 kg. She is married.
    - Eda is 21 years old, has brown hair, 156 cm and 45 kg. She is not married.
    - Fatma is 40 years old, has black hair, 164 cm and 60 kg. She is married.
    - Gonca is 33 years old, blonde, 182 cm and 65 kg. She is not married.
    - Hilal is 45 years old, has black hair, 165 cm and 58 kg. She is married.
    - Lale is 38 years old, has black hair, 175 cm and 59 kg. She is not married.
    - Mine is 28 years old, has brown hair, 190 cm and 71 kg. She is not married.
- Answer the following questions based on this dataframe:
    - What is the average age of the group?
    - How many girls are above the average height?
    - What is the most frequent hair color?
    - What is the average height of girls who weigh more than 60 kg?
    - Compare the height/weight ratio of married and single girls. Which is higher?
```{r}
# Write your answer here
name <- c("Canan", "Deniz", "Eda", "Fatma", "Gonca", "Hilal", "Lale", "Mine")
age <- c(24, 35, 21, 40, 33 ,45, 38, 28)
hair.color <- c("blonde", "brown", "brown", "black", "blonde", "black", "black", "brown")
height <- c(170, 173, 156, 164, 182, 165, 175, 190)
weight <- c(56, 61, 45, 60, 65, 58, 59, 71)
married <- c(TRUE, TRUE, FALSE, TRUE, FALSE, TRUE, FALSE, FALSE)
df <- data.frame(name, age, hair.color, height, weight, married)
# What is the average age of the group?
mean(df$age)
# How many girls are above the average height?
sum(df$height > mean(df$height))
# What is the most frequent hair color?
names(sort(table(df$hair.color), decreasing = TRUE))[1]
# What is the most frequent hair color? [Alternative answer]
t <- table(df$hair.color)
names(t)[which(t == max(t))]
# What is the average height of girls above 60kgs?
mean(df$height[df$weight > 60])
# Compare the height/weight ratio of married and single girls. Which is higher?
hw.m <- mean((df$height / df$weight)[df$married])
hw.nm <- mean((df$height / df$weight)[!df$married])
if (hw.m > hw.nm)
{
  print("Married is higher")
} else if (hw.nm > hw.m)
{
  print("Not Married is higher")
} else
{
  print("They are equal")
}
```
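The married/single comparison above can also be expressed with base R's `tapply`, which applies a function to each group of a vector. The sketch below rebuilds only the three columns it needs, under the hypothetical names `girls` and `ratio.by.status`, so that it runs on its own; the values are copied from the question.

```{r}
# Alternative sketch: group means of height/weight with tapply()
girls <- data.frame(
  height  = c(170, 173, 156, 164, 182, 165, 175, 190),
  weight  = c(56, 61, 45, 60, 65, 58, 59, 71),
  married = c(TRUE, TRUE, FALSE, TRUE, FALSE, TRUE, FALSE, FALSE)
)
# One mean ratio per marital status; the result is named "FALSE" and "TRUE"
ratio.by.status <- tapply(girls$height / girls$weight, girls$married, mean)
ratio.by.status
# Name of the group with the larger mean ratio
names(ratio.by.status)[which.max(ratio.by.status)]
```

Note that `which.max` returns the first maximum, so it would not report a tie; the explicit three-way comparison above remains the safer check if the two means could be equal.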
## Question 6 (15 points)
- Given the below vector, compute its mean without using **any** functions.
```{r}
# Do not change the two lines below
set.seed(1024)
v <- runif(100, 1, 20) + rnorm(100, 1, 3)
# Compute the mean of v below
# Use `total` rather than `sum` so the built-in sum() is not masked
total <- 0
count <- 0
for (i in v)
{
  total <- total + i
  count <- count + 1
}
# Computed
total / count
# Actual
mean(v)
```
## Question 7 (20 points)
- Write a function that takes two numeric vectors `a` and `b` and returns a matrix whose entry in row `i` and column `j` is `a[i] + b[j]`, as in the following example:
```
# Example:
> a <- c(1,3,5)
> b <- c(20, 40, 60)
> c <- your_function(a, b)
> c
     [,1] [,2] [,3]
[1,]   21   41   61
[2,]   23   43   63
[3,]   25   45   65
```
```{r}
# Write your answer here
f <- function(a, b)
{
  m <- matrix(0, nrow = length(a), ncol = length(b))
  for (r in seq_along(a))
    for (c in seq_along(b))
      m[r, c] <- a[r] + b[c]
  m
}
a <- c(1, 3, 5)
b <- c(20, 40, 60)
c <- f(a, b)
c
```
```{r}
# Alternative answer
f2 <- function(a, b)
{
  # m1 repeats a down each column; m2 repeats b across each row
  m1 <- matrix(a, nrow = length(a), ncol = length(b))
  m2 <- matrix(b, nrow = length(a), ncol = length(b), byrow = TRUE)
  m1 + m2
}
a <- c(1, 3, 5)
b <- c(20, 40, 60)
c <- f2(a, b)
c
```
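For comparison, base R already provides this "every pair" pattern through `outer`, which applies a binary function to all combinations of the elements of two vectors. The following is a minimal sketch rather than an expected answer; the names `a.vec`, `b.vec` and `m.outer` are made up here to avoid overwriting the variables used above.

```{r}
# Sketch: the same matrix with outer(); entry [i, j] is a.vec[i] + b.vec[j]
a.vec <- c(1, 3, 5)
b.vec <- c(20, 40, 60)
m.outer <- outer(a.vec, b.vec, "+")
m.outer
```

Writing the loops by hand, as in the first answer, shows what `outer` does internally, which is presumably the point of the exercise.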
## Question 8 (20 points)
- Write a function that takes a numeric vector `vec` and a numeric variable `val`, and returns `TRUE` if `val` exists in `vec`, and otherwise returns `FALSE`. You are **not** allowed to use `%in%` or any other functions present in R.
```{r}
# Write your answer here
f <- function(vec, val)
{
  for (i in vec)
  {
    if (i == val)
      return(TRUE)
  }
  return(FALSE)
}
# Testing
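# Numeric inputs, as the question specifies
f(c(1, 3, 5, 7, 9), 7)
f(c(1, 3, 5, 7, 9), 4)
# The same function also works on character vectors
f(c("a", "b", "c", "d", "e"), "d")
f(c("a", "b", "c", "d", "e"), "f")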
```
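One caveat about the exact `==` comparison: for non-integer numbers it can fail because of floating-point rounding (in R, `0.1 + 0.2 == 0.3` evaluates to `FALSE`). The sketch below is one possible tolerance-based variant, not part of the required answer. The function name `contains.approx` and the parameter `tol` are hypothetical, and the absolute difference is computed by hand to stay within the spirit of the "no built-in functions" rule.

```{r}
# Sketch: membership test that treats values within `tol` as equal
contains.approx <- function(vec, val, tol = 1e-8)
{
  for (x in vec)
  {
    # Absolute difference without calling abs()
    d <- x - val
    if (d < 0)
      d <- -d
    if (d <= tol)
      return(TRUE)
  }
  return(FALSE)
}
# Testing
contains.approx(c(0.1 + 0.2, 1, 2), 0.3)
contains.approx(c(1, 2, 3), 2.5)
```

The default tolerance of `1e-8` is an arbitrary choice for illustration; a suitable value depends on the scale of the data.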