Blog posts on Data Science, Machine Learning, Data Mining, Artificial Intelligence, Spark Machine Learning

Saturday, December 2, 2017

Getting Started with R

This post introduces RStudio and the basic syntax of the R programming language.

RStudio Overview

RStudio has 4 panes:
1) Script pane - to write and save R scripts
2) Console pane - where the code is executed
3) Environment/History pane - displays the variables created and the functions
used within the current session
4) Helper pane - contains multiple tabs to install/display packages,
view plots,
locate files within the workspace


In [1]:
help(mean)

Getting and setting the working directory

In [2]:
# to display current working directory use getwd() function
getwd()
'C:/Users/Suresh/mlclassscripts'
In [ ]:
# to set up workspace or working directory use setwd() function
#syntax is shown below
setwd("path")
In [6]:
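# On Windows, either escape each backslash by doubling it...
setwd("C:\\Suresh\\R&D\\Projects\\ML classroom training\\sessions")
# ...or use forward slashes; both calls point to the same folder
setwd("C:/Suresh/R&D/Projects/ML classroom training/sessions")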
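A common pattern is to remember the current working directory before changing it, so that it can be restored later. The sketch below assumes nothing about your folder layout: it uses tempdir() only because that directory exists on every machine, and old_dir and new_dir are names chosen for illustration.

```r
# Remember the current working directory so it can be restored afterwards.
old_dir <- getwd()

# tempdir() stands in for a real project folder here (an assumption for
# illustration); replace it with your own path.
setwd(tempdir())
new_dir <- getwd()
print(new_dir)

# Switch back to where we started.
setwd(old_dir)
print(getwd())
```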

Getting help in R

To get help within the R environment, we use the help() function to open the
documentation
for any function or package that is available.
To see the arguments a function accepts, we use the args() function.
To run the examples from a function's help page, we use the example() function.
In [ ]:
help("stats")
help("mean")
args("mean")
example("mean")

#getting help documentation for a package
help(package="caret")
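When the exact function name is not known, it can help to search by a fragment of the name or to inspect a function's arguments programmatically. The sketch below uses apropos() and formals(), two functions that are not covered above; the variable names matches and arg_names are illustrative.

```r
# List objects on the search path whose names contain "mean".
matches <- apropos("mean")
print(head(matches))

# Inspect the formal arguments of sd() as a character vector.
arg_names <- names(formals(sd))
print(arg_names)
```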

Online help for R programming

We can browse the packages available for R, grouped by topic, in the CRAN Task Views on the official CRAN website:
https://cran.r-project.org/web/views/
We can also get online support for day-to-day questions from the websites below:
https://stackoverflow.com/
https://stats.stackexchange.com

Installing Packages

In [ ]:
# Packages can be installed in two ways:
# 1) using the install.packages() function, or
# 2) from the Packages tab in the bottom-right pane of RStudio
install.packages("randomForest")

# Installed packages are loaded with the library() function.
# Note that a package can only be loaded if
# it has already been installed in our R environment
library(cluster)
In [ ]:
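# The code below first checks whether the package is installed in the R environment;
# if it is not available,
# the package gets installed, and then it is loaded.
# (library() raises an error for a missing package, so requireNamespace() does the check.)
if (!requireNamespace("cluster", quietly = TRUE)) {
  install.packages("cluster")
}
library(cluster)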
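The same check-then-install idea can be wrapped in a small function when several packages are needed. This is only a sketch: ensure_packages() is a hypothetical helper written for this post, not a function from base R, and the call below uses packages that ship with R so nothing is actually downloaded.

```r
# Hypothetical helper (not part of base R): install any missing
# packages, then attach all of them.
ensure_packages <- function(pkgs) {
  for (pkg in pkgs) {
    # requireNamespace() returns FALSE instead of raising an error
    # when the package is not installed.
    if (!requireNamespace(pkg, quietly = TRUE)) {
      install.packages(pkg)
    }
    # character.only = TRUE lets library() accept the name as a string.
    library(pkg, character.only = TRUE)
  }
  invisible(pkgs)
}

# "stats" and "utils" are installed with R itself.
ensure_packages(c("stats", "utils"))
```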

Basic operations in R

In [ ]:
# adding two numbers
1+1

# multiplying two numbers
10*2

# dividing two numbers
10/2

# modulus: the remainder after dividing the first number by the second
10%%2
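A few more arithmetic operators follow the same pattern. The sketch below adds subtraction, exponentiation and integer division, and uses 10 and 3 so that the remainder is not zero; the variable names are illustrative.

```r
sub_result <- 10 - 3     # subtraction
pow_result <- 2^3        # exponentiation
int_div    <- 10 %/% 3   # integer division (quotient without the remainder)
remainder  <- 10 %% 3    # modulus (the remainder itself)

print(c(sub_result, pow_result, int_div, remainder))
```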

Printing results to the R console

In [ ]:
#printing the data on the console
print(10*2)

print("data science")

print(pi^2)
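print() shows one object at a time. To combine text and values on a single line, one option is paste() together with cat(), as sketched below; label and rounded are illustrative names.

```r
# paste() joins its arguments into one string, separated by a space.
label <- paste("data", "science")

# cat() writes its arguments to the console without quotes or an index prefix.
cat("Topic:", label, "\n")

# round() limits the number of decimal places before printing.
rounded <- round(pi^2, 2)
cat("pi squared is about", rounded, "\n")
```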

Variable declaration and assignment in R

Variable assignment: in the example below, we create a variable named z:
In [8]:
z <-  100
We can use either the left arrow (<-) or the = symbol for variable assignment. It is
good practice to use the left arrow for assignment.
In [9]:
z = 10.009
z <- 10.009
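One place where the two symbols behave differently is inside a function call. The sketch below assumes a fresh session in which no variable named x exists yet; m1, m2 and the two flag variables are illustrative names.

```r
# Inside a function call, `=` only names an argument; no variable is created.
# (Assumes no variable called x exists before this line.)
m1 <- mean(x = c(1, 2, 3))
x_exists_after_equals <- exists("x", inherits = FALSE)

# With `<-`, the vector is passed to mean() and also stored in a variable x.
m2 <- mean(x <- c(4, 5, 6))
x_exists_after_arrow <- exists("x", inherits = FALSE)

print(c(m1, m2))
print(c(x_exists_after_equals, x_exists_after_arrow))
```

At the top level of a script, as in the cells above, both symbols create the variable in the same way.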

Loading existing or default datasets available in R environment

We can access the default datasets available in R using the data() function.
Called with no arguments, data() displays all the available datasets.
In [ ]:
data()
To load a specific dataset into R, we pass the dataset name as an argument to the data() function.
In [ ]:
data(AirPassengers)
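After loading, it is worth confirming what kind of object the dataset is. The sketch below also lists the dataset names programmatically instead of opening the viewer; dataset_names is an illustrative variable name.

```r
# Load the dataset and check what kind of object it is.
data(AirPassengers)
print(class(AirPassengers))    # a monthly time series, not a data frame
print(length(AirPassengers))   # number of observations

# data() returns a table of results; the "Item" column holds the dataset names.
dataset_names <- data(package = "datasets")$results[, "Item"]
print(head(dataset_names))
```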

Viewing data of R objects

To view the first few records of an R object (e.g. a data frame), we use the head() function.
head() expects the data object as its argument and prints the first 6 records on the R console by default; pass the n argument to change that number.
In [ ]:
head(AirPassengers)
To view all the records in a spreadsheet-style tabular view, we use the View() function:
In [ ]:
View(AirPassengers)
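The number of records shown can be controlled with the n argument, and tail() does the same from the end of the object. A minimal sketch, with first_three and last_three as illustrative names:

```r
# First three observations instead of the default six.
first_three <- head(AirPassengers, n = 3)
print(first_three)

# tail() works the same way, starting from the end of the object.
last_three <- tail(AirPassengers, n = 3)
print(last_three)
```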

Getting the description and structure of R object

Use the str() function to see a compact description of the structure of the data object:
In [ ]:
str(AirPassengers)
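Since AirPassengers is a time series, a few other functions complement str() by reporting its time range and basic statistics. The sketch below uses summary(), start(), end() and frequency(), which are not covered above; the variable names are illustrative.

```r
# Minimum, quartiles, mean and maximum of the values.
print(summary(AirPassengers))

# Time-series attributes: first period, last period, observations per year.
series_start <- start(AirPassengers)
series_end   <- end(AirPassengers)
series_freq  <- frequency(AirPassengers)

print(series_start)
print(series_end)
print(series_freq)
```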