Basic Operations

This chapter shows how to use basic operations in both R and Python. We will focus on operations that we encounter in a data analysis setting: arithmetic, comparisons, logic.

You are expected to be able to use all the operations described in this chapter.

Basic Mathematical operations

In both R and Python, basic mathematical operations can be written directly in the

Arithmetic operators

# In R
1 + 1
[1] 2
4 * 2
[1] 8
4 / 2
[1] 2

# In python
1 + 1
4 * 2
4 / 2
Pay Attention

For more complex mathematical operators we might need some functions that are not provided in the base version of the software we use. For instance, the ln() function and exp() function are not available in base python. We therefore need to load a package which contains these functions, this will be the maths package in Python. For R, both the ln() and exp() functions are part of the base package, we therefor do not need to load an additional package, we can use these functions right away.

# Base R
log(12)
[1] 2.484907
exp(12)
[1] 162754.8
sqrt(12) # square root
[1] 3.464102

import math
math.log(12)
math.exp(12)
math.sqrt(12)

Comparison Operators

Comparison operations function in the same way in both languages. The only difference we note is in the assignment of the values to a variable in which it is recommended in R to use <-.

Warning

Even though = works for simple operations in R, you might run into trouble with more complex code. For a detailed explanation see this post. We recommend you to always use the <- operator when programming in R.

# R
x <- 5
y <- 5
# Equal to
x == y
[1] TRUE
# Not equal to
x != y
[1] FALSE
# Greater than
x > y
[1] FALSE
# Less than
x < y
[1] FALSE
# Greater than or equal to
x >= y
[1] TRUE
# Less than or equal to
x <= y
[1] TRUE

# Python
x = 5
y = 5
# Equal to
x == y

# Not equal to
x != y

# Greater than
x > y

# Less than
x < y

# Greater than or equal to
x >= y

# Less than or equal to
x <= y

Logic Operators

Logical operators are commonly used in data analysis, especially when sub-setting datasets. For example when we want to extract documents that are from the year 2000 which have the term “sustainability” and the term “climate change” but not the term “fossil fuel”. Combining these operators is important, and so is understanding how they work.

Pay Attention

Some differences between R and Python become apparent here. In R, TRUE and FALSE must be written in all caps to be recognised as the logical operator. In Python, True and False must start with a capitalized letter. or, and, not should also be written exactly in this manner.If these operators are written differently, they will be recognized as objects.

x <- 4
y <- 8
# Equal (==)
x == y
[1] FALSE
# And (&)
x == 4 & y == 8
[1] TRUE
# Or (|)
x == 4 | y == 8
[1] TRUE
# Not (!)
!y
[1] FALSE
# Combine
z <- "plop"
x == 4 & (y == 8 | z == "plop")
[1] TRUE

x = 4
y = 8
# Equal (==)
x == y
# And 
x == 4 and y == 8

# Or
x == 4 or y == 8

# Not
not x

# Combine
z = "plop"
x == 4 and (y == 8 or z == "plop")

In data analysis, we usually use operators to subset data. This means that we compare a variable to a value to check if it fits our criteria. For example, if we have a column that contains a year, and we only want observations with the year 2003, we will search for year == 2003. In this setting the R operators we just described will be the same. It is possible that these operators varie when different packages are used in python. For instance, in the context of the pandas package, and becomes &, or becomes |, not becomes ~. We will adress these variations in the database manipulation chapter.