Tuesday, December 31, 2013

Data Scientist. The Path I Chose

As we march into the New Year, I would like to post about my plan to become a Data Scientist, my 2014 resolution on the professional front. I first came across the term Data Science around this time last year. Since then I have been researching and gathering the necessary information, and I decided to become a Data Scientist. One year on, I want to look back and understand where I stand now and what still needs to be done.
Power of Data – possible Use Cases:
Automobile companies are installing devices that continuously feed data about driving behaviour, the roads taken, the places visited, etc. to insurance companies, which can analyze this information to customize insurance plans for their customers.
Another possible use case: by making use of the data we share on social media, we can develop products that understand us and sometimes guide us. For example, based on people's posts on social media, combined with location, time, the road you are travelling and live traffic feeds, we could develop an app that suggests which road to take to your destination.
Who is a Data Scientist? 
Termed the Sexiest Job of the 21st Century, Data Scientist is among the most sought-after and well-paid jobs across the industry. The role of a Data Scientist is to make sense of the tons of data being generated every day and to build products that redefine, more statistically and scientifically, the way current businesses run. The best part of this job is that a Data Scientist can fit into any industry wherever data exists.
What does it take to become a Data Scientist? 
"Web is my University, Time is the only Investment" 

Going back to where my journey started: after my initial research with colleagues and over the internet, I took up the Machine Learning course on Coursera.org, an online offering from Stanford University and definitely a recommended starting point for anyone who wants to be a Data Scientist.
After completing the course, I started playing with data, predictive models being my first hands-on exercise. At this point I got a piece of advice from a colleague:

 “It is easy (because free and online) in order to grab some interesting key messages and get a rather complete overview within an 8-week timeframe. Be sure you have a sufficient mathematical background to follow (maybe Calculus/mathematics before?) and, more importantly, be certain you can practice afterwards!” 

My first analytics assignment on predictive modelling taught me a lot: the need to brush up on mathematics basics, statistics fundamentals, data mining techniques and visualization techniques.
I was lucky enough to find many of the courses on my TODO list at Coursera.org and Udacity:
  • Machine Learning – for Machine Learning Algorithms 
  • Neural Networks for Machine Learning 
  • Data Analysis – introduction to R, a programming language for data analysis 
  • Making Sense of Data - basics of Statistics 
  • Introduction to Data Science 
  • Social Network Analysis
The image below, which I found on the internet, is a simplified road map for a Data Scientist; it has become my target for the next few months, combined with hands-on practice.
Road map to data science 
After completing the courses mentioned above, I got the opportunity to play around with data using text mining, classification and predictive modelling techniques, both within my organization and outside (kaggle.com).
This post would be incomplete if I didn't mention Kaggle, the website where I found the much-needed hands-on experience. Kaggle is an online open competition forum where you can find data science use cases posted by big companies. The advantage of working on this site is that you not only get hands-on experience but also grow your network of similarly competent techies.
Apart from Kaggle, I have made use of LinkedIn data science/data analysis groups, where experts in the field gave me suggestions and answers to almost every question I had.
Data Analytics Should Meet Big Data: 
“Learn a bit of Java, Learn a bit of Linux – shell scripting, Learn Hadoop” – an advice from my friend.

After completing a few POCs and beginning to work on real-time projects, I soon realized the need to handle huge amounts of data and to learn Big Data. In my personal opinion, a Data Scientist should have very strong knowledge of Big Data and Hadoop.
For this, I plan to take the path of learning Big Data Hadoop concepts, Linux shell scripting and MongoDB for data storage over the next 3 months.
That is where I am, folks, after one year; there is a long way to go before becoming a Data Scientist, but I would still love to hear someone call me one. Over the last year I have spent most of my time learning new concepts and technologies, and I am eager to see the theory shaping into work.
In my next blog post I will explain the tools, concepts, technologies and online forums available on the internet.

Tuesday, December 17, 2013

Cluster Analysis using R

In this post I will explain Cluster Analysis: the process of grouping objects/individuals together in such a way that objects/individuals within one group are more similar to each other than to those in other groups.
For example, from a ticket booking engine database we can identify clients with similar booking activities and group them together (into clusters). These identified clusters can later be targeted for business improvement, for instance by issuing special offers.
Cluster Analysis falls under unsupervised learning algorithms: the data to be analyzed is provided to a clustering algorithm, which identifies the hidden patterns within it, as shown in the figure below.
In the image above, the clustering algorithm has grouped the input data into two groups. There are three popular clustering algorithms: Hierarchical Cluster Analysis, K-Means Cluster Analysis and Two-Step Cluster Analysis, of which today I will be dealing with K-Means clustering.
Explaining the k-Means Clustering Algorithm: 
In the K-means algorithm, k stands for the number of clusters (groups) to be formed; hence this algorithm is used when the number of groups to be found in the analyzed data is known in advance.
K-means is an iterative algorithm with two steps: first a Cluster Assignment step, and second a Move Centroid step.
CLUSTER ASSIGNMENT STEP: In this step we randomly choose two cluster centroids (the red and green dots) and assign each data point to whichever centroid is closer to it (top part of the image below).
MOVE CENTROID STEP: In this step we take the average of all the points assigned to each group and move the centroid to that newly calculated mean position (bottom part of the image below).
The two steps are repeated until the data points settle into their two groups and the centroids no longer move at the end of the Move Centroid step.


By repeating these steps, the final grouping of the input data is obtained.
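To make the two steps concrete, here is a small illustrative sketch in R (my own addition, using a toy randomly generated 2-D data set; the actual analysis below relies on R's built-in kmeans() function instead):

set.seed(42)
pts <- matrix(rnorm(40), ncol = 2)        # 20 random 2-D points
centroids <- pts[sample(nrow(pts), 2), ]  # pick 2 points as initial centroids
for (iter in 1:10) {
  # Cluster assignment step: assign each point to its nearest centroid
  d1 <- rowSums((pts - matrix(centroids[1, ], nrow(pts), 2, byrow = TRUE))^2)
  d2 <- rowSums((pts - matrix(centroids[2, ], nrow(pts), 2, byrow = TRUE))^2)
  assignment <- ifelse(d1 < d2, 1, 2)
  # Move centroid step: move each centroid to the mean of its assigned points
  centroids[1, ] <- colMeans(pts[assignment == 1, , drop = FALSE])
  centroids[2, ] <- colMeans(pts[assignment == 2, , drop = FALSE])
}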

Cluster Analysis on Accidental Deaths by Natural Causes in India using R 
An implementation of the k-means algorithm is readily available in R, along with the cluster package (used below to visualize the results), which can be downloaded from CRAN. Using these, we shall do a cluster analysis of accidental deaths by natural causes in India.
The steps implemented are discussed below: 
The data for our analysis, covering the years 2001 to 2012, was downloaded from www.data.gov.in. The input data is displayed below: 
For cluster analysis, all the features have to be converted to numeric values, and the larger values in the Year column are converted to z-scores for better results. 
The Elbow method (code available below) is run to find the optimal number of clusters present within the data points. 
The K-means clustering function is then run and the results are visualized as below:
Code:
#Fetch data
data = read.csv("Cluster Analysis.csv")
#Keep only the Andhra Pradesh records and aggregate male/female death counts
APStats = data[which(data$STATE == 'ANDHRA PRADESH'),]
APMale = rowSums(APStats[,4:8])      #male deaths across age groups
APFemale = rowSums(APStats[,9:13])   #female deaths across age groups
APStats[,'APMale'] = APMale
APStats[,'APFemale'] = APFemale
data = APStats[c(2,3,14,15)]         #keep CAUSE, Year, APMale, APFemale
library(cluster)
library(graphics)
library(ggplot2)
#factor the categorical CAUSE field and convert it to numeric codes
data$CAUSE = as.numeric(factor(data$CAUSE))
#Convert the Year column to z-scores
data$Year = (data$Year - mean(data$Year)) / sd(data$Year)
#Run K-means (cluster assignment & move centroid steps) for k = 1..100
cost_df <- data.frame()

for(i in 1:100){
  km <- kmeans(x=data, centers=i, iter.max=100)
  cost_df <- rbind(cost_df, cbind(i, km$tot.withinss))
}
names(cost_df) <- c("cluster", "cost")

#Elbow method to identify the ideal number of clusters
#Cost plot
ggplot(data=cost_df, aes(x=cluster, y=cost, group=1)) +
theme_bw(base_family="Garamond") +
geom_line(colour = "darkgreen") +
theme(text = element_text(size=20)) +
ggtitle("Reduction In Cost For Values of 'k'\n") +
xlab("\nClusters") +
ylab("Within-Cluster Sum of Squares\n")
#Fit the final model with the k chosen from the elbow plot (5 here)
clust = kmeans(data, 5)
#Visualize the clusters; labels=2 labels points and ellipses (valid values are 0-5)
clusplot(data, clust$cluster, color=TRUE, shade=TRUE, labels=2, lines=0)
#Attach the cluster assignment to the data and inspect one cluster
data[,'cluster'] = clust$cluster
head(data[which(data$cluster == 5),])



Thursday, October 24, 2013

Fetch Twitter data using R


This short post explains how you can fetch Twitter data using the twitteR and streamR packages available in R. In order to connect to the Twitter API, we need to go through an authentication process known as OAuth, explained in my previous post.
Twitter data can be fetched in two ways: a) the REST API and b) the Streaming API.
In today's blog post we shall go through both.
twitteR Package:
twitteR is one of the packages available in R for fetching Twitter data. The package can be obtained from here.
This package allows us to make REST API calls to Twitter using the ConsumerKey & ConsumerSecret codes. The code below illustrates how to extract Twitter data.
This package offers the following functionality:
  • Authenticate with Twitter API
  • Fetch User timeline
  • User Followers
  • User Mentions
  • Search twitter
  • User Information
  • User friends information
  • Location based Trends
  • Convert JSON object to dataframes
REST API CALLS using R - twitteR package: 
  1. Register your application with Twitter.
  2. After registration, you will get the ConsumerKey & ConsumerSecret codes, which are needed for calling the Twitter API.
  3. Load the twitteR library in the R environment.
  4. Create the OAuth object using the OAuthFactory$new() method, with the ConsumerKey & ConsumerSecret codes as input parameters.
  5. The handshake step will return an authorization link, which needs to be copied & pasted into a web browser.
  6. You will be redirected to the Twitter application authentication page, where you need to authenticate yourself by providing your Twitter credentials.
  7. After authenticating, you will be given an authorization code, which needs to be pasted into the R console.
  8. Call registerTwitterOAuth().
Source Code:
library(twitteR)
#Twitter OAuth endpoints
requestURL <-  "https://api.twitter.com/oauth/request_token"
accessURL =    "https://api.twitter.com/oauth/access_token"
authURL =      "https://api.twitter.com/oauth/authorize"
#Keys obtained when registering your application with Twitter
consumerKey =   "XXXXXXXXXXXX"
consumerSecret = "XXXXXXXXXXXXXXXX"
twitCred <- OAuthFactory$new(consumerKey=consumerKey,
                             consumerSecret=consumerSecret,
                             requestURL=requestURL,
                             accessURL=accessURL,
                             authURL=authURL)
#Download the CA certificate bundle needed for the SSL handshake
download.file(url="http://curl.haxx.se/ca/cacert.pem",
              destfile="cacert.pem")
#Opens the authorization link; paste the authorization code back into the R console
twitCred$handshake(cainfo="cacert.pem")
#Save the credentials so they can be reused without repeating the handshake
save(list="twitCred", file="twitteR_credentials")
load("twitteR_credentials")
registerTwitterOAuth(twitCred)#Register your app with Twitter.
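Once the handshake is complete and the credentials are registered, REST calls can be made right away. As a small illustrative example (the hashtag and count below are arbitrary choices of mine, not from the original post):

#Search recent tweets and convert the result to a data frame
bigdata_tweets = searchTwitter("#bigdata", n=100, lang="en")
bigdata_df = twListToDF(bigdata_tweets)
head(bigdata_df$text)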
streamR Package:
This package allows users to fetch Twitter data in real time by connecting to the Twitter Streaming API. We can obtain the package from here. It gives R users access to Twitter's filtered, sampled and user streams and parses the output into data frames. A few important functions this package offers are:

filterStream() - opens a connection to Twitter's Streaming API that returns public statuses matching one or more filter predicates, such as search keywords. Tweets can be filtered by keywords, users, language and location. The output can be saved as an object in memory or written to a text file.

parseTweets() - This function parses tweets downloaded using filterStream, sampleStream or userStream and returns a data frame.
The code example below shows how to fetch data in real time using streamR:
library(streamR)
library(twitteR)
load("twitteR_credentials")  # make using the save credentials in the previous code.
registerTwitterOAuth(twitCred)
filterStream(file.name = "tweets.json", track = "#bigdata",timeout = 0, locations=c(-74,40,-73,41), oauth = twitCred)
Executing the above captures tweets containing "#bigdata" or originating from the New York area (the Streaming API combines the track and locations filters with an OR). Setting timeout = 0 makes it fetch continuously; to fetch records for a fixed period instead, use e.g. timeout = 300 (fetches data for 300 seconds).
To parse the fetched tweets, use the code below:
tweets.df <- parseTweets("tweets.json")
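The resulting data frame contains one row per tweet, with the tweet text in the text column. A quick, illustrative sanity check on the capture:

nrow(tweets.df)       # number of tweets captured
head(tweets.df$text)  # first few tweet texts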


Sunday, October 6, 2013

Topic Modeling in R

As part of my Twitter data analysis series, so far I have covered a movie review engine using R and document classification using R. Today we will deal with discovering topics in tweets, i.e. mining the tweet data to discover its underlying topics - an approach known as Topic Modeling.
What is Topic Modeling?
Topic modeling is a statistical approach for discovering abstract "topics" in a collection of text documents, based on the statistics of each word. In simple terms, it is the process of looking at a large collection of documents, identifying clusters of words that tend to appear together, grouping them by similarity, and identifying the patterns in which those clusters appear.
Consider the below Statements:
  1. I love playing cricket.
  2. Sachin is my favorite cricketer.
  3. Titanic is heart touching movie.
  4. Data Analytics is the next big thing in IT.
  5. Data Analytics & Big Data complement each other.
When we apply topic modeling to the above statements, we are able to group statements 1 & 2 as Topic 1 (which we can later identify as Sport), statement 3 as Topic 2 (Movies), and statements 4 & 5 as Topic 3 (Data Analytics).
fig: Identifying topics in Documents and classifying as Topic 1 & Topic 2

Latent Dirichlet Allocation algorithm (LDA):
Topic modeling can be achieved using the Latent Dirichlet Allocation (LDA) algorithm. Without going into the nuts and bolts of the algorithm, LDA automatically learns to assign probabilities to each and every word in the corpus and classifies the documents into topics. A simple explanation of LDA can be found here.
Twitter Data Analysis Using LDA:
Steps Involved:
  1. Fetch tweets data using ‘twitteR’ package.
  2. Load the data into the R environment.
  3. Clean the data to remove: re-tweet information, links, special characters, emoticons, and frequent words like is, as, this, etc.
  4. Create a Term Document Matrix (TDM) using the 'tm' package.
  5. Calculate TF-IDF, i.e. Term Frequency - Inverse Document Frequency, for all the words in the matrix created in Step 4.
  6. Exclude all words with TF-IDF <= 0.1, to remove the uninformative words that appear across most of the tweets.
  7. Calculate the optimal number of topics (k) in the corpus using the log-likelihood method on the matrix from Step 6.
  8. Apply the LDA method using the topicmodels package to discover topics.
  9. Evaluate the model.
Conclusion:
Topic modeling using LDA is a very good method for discovering the underlying topics. The analysis gives good results only when we have a large corpus. In the above analysis, using tweets about the top 5 airlines, I could find that one of the topics people talk about is the FOOD being served. We can then use sentiment analysis techniques to mine what people think and say about products, companies, etc.

Source Code:
library("tm")
library("wordcloud")
library("slam")
library("topicmodels")
#Load Text
con <- file("tweets.txt", "rt")
tweets = readLines(con)
#Clean Text
tweets = gsub("(RT|via)((?:\\b\\W*@\\w+)+)","",tweets)
tweets = gsub("http[^[:blank:]]+", "", tweets)
tweets = gsub("@\\w+", "", tweets)
tweets = gsub("[ \t]{2,}", "", tweets)
tweets = gsub("^\\s+|\\s+$", "", tweets)
tweets <- gsub('\\d+', '', tweets)
tweets = gsub("[[:punct:]]", " ", tweets)
corpus = Corpus(VectorSource(tweets))
corpus = tm_map(corpus,removePunctuation)
corpus = tm_map(corpus,stripWhitespace)
corpus = tm_map(corpus,tolower)
corpus = tm_map(corpus,removeWords,stopwords("english"))
tdm = DocumentTermMatrix(corpus) # Create a document-term matrix (documents as rows, terms as columns)
# create tf-idf matrix
term_tfidf <- tapply(tdm$v/row_sums(tdm)[tdm$i], tdm$j, mean) * log2(nDocs(tdm)/col_sums(tdm > 0))
summary(term_tfidf)
tdm <- tdm[,term_tfidf >= 0.1]
tdm <- tdm[row_sums(tdm) > 0,]
summary(col_sums(tdm))
#Deciding best K value using Log-likelihood method
best.model <- lapply(seq(2, 50, by = 1), function(d){LDA(tdm, d)})
best.model.logLik <- as.data.frame(as.matrix(lapply(best.model, logLik)))
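# Illustrative addition (not in the original post): one simple way to choose k is to
# plot the log-likelihood against the number of topics and pick the k where it peaks
# or levels off.
best.model.logLik.df <- data.frame(topics = seq(2, 50, by = 1),
                                   logLik = as.numeric(unlist(best.model.logLik)))
with(best.model.logLik.df, plot(topics, logLik, type = "b",
                                xlab = "Number of topics k", ylab = "Log-likelihood"))
best.model.logLik.df$topics[which.max(best.model.logLik.df$logLik)]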
#calculating LDA
k = 50 #number of topics
SEED = 786 #random seed for reproducibility
CSC_TM <- list(
  VEM = LDA(tdm, k = k, control = list(seed = SEED)),
  VEM_fixed = LDA(tdm, k = k, control = list(estimate.alpha = FALSE, seed = SEED)),
  Gibbs = LDA(tdm, k = k, method = "Gibbs",
              control = list(seed = SEED, burnin = 1000, thin = 100, iter = 1000)),
  CTM = CTM(tdm, k = k,
            control = list(seed = SEED, var = list(tol = 10^-4), em = list(tol = 10^-3)))
)
#To compare the fitted models, we first look at the alpha values of the model fitted with VEM (alpha estimated) and with VEM (alpha fixed)
sapply(CSC_TM[1:2], slot, "alpha")
sapply(CSC_TM, function(x) mean(apply(posterior(x)$topics, 1, function(z) - sum(z * log(z)))))
Topic <- topics(CSC_TM[["VEM"]], 1)
Terms <- terms(CSC_TM[["VEM"]], 8)
Terms



Tuesday, August 20, 2013

Sentiment Analysis using R



Today I will explain how to create a basic movie review engine in R, based on people's tweets.
The implementation of the review engine will be as follows:
  • Get tweets from Twitter
  • Clean the data
  • Create a word cloud
  • Create a data dictionary
  • Score each tweet
Get Tweets from Twitter:
The first step is to fetch the data from Twitter. In R, we can call the Twitter API using the twitteR package; the steps for fetching tweets with it are shown below. Each tweet contains:
  • Text
  • Is re-tweeted
  • Re-tweet count
  • Tweeted User name
  • Latitude/Longitude 
  • Replied to, etc.
For our case we consider only the text of the tweet, as we are interested in the review of the movie. The other features, such as latitude/longitude, replied-to, etc., can be used for other kinds of analysis on the tweeted data.

          library(twitteR)  #searchTwitter() comes from the twitteR package
          tweets = searchTwitter("#ChennaiExpress", n=500, lang="en")


Clean the data:
In the next step, we need to clean the data so that we can use it for our analysis. Cleaning the data is a very important step in data analysis. This step includes:

Extracting only text from Tweets:
tweets_txt = sapply(tweets,function(x) x$getText())

Removing URL links, reply-to tags, punctuation, non-alphanumeric characters, extra spaces, etc.:
             tweets_cl = gsub("(RT|via)((?:\\b\\W*@\\w+)+)","",tweets)
             tweets_cl = gsub("http[^[:blank:]]+", "", tweets_cl)
             tweets_cl = gsub("@\\w+", "", tweets_cl)
             tweets_cl = gsub("[ \t]{2,}", "", tweets_cl)
             tweets_cl = gsub("^\\s+|\\s+$", "", tweets_cl)
             tweets_cl = gsub("[[:punct:]]", " ", tweets_cl)
             tweets_cl = gsub("[^[:alnum:]]", " ", tweets_cl)
             tweets_cl <- gsub('\\d+', '', tweets_cl)
Create a Word Cloud:
At this point, let us view a word cloud of the most frequently tweeted words in the data, for a visual understanding of what we are analyzing.
library(wordcloud)
wordcloud(tweets_cl)
               
Create a data dictionary:
In this step, we use a dictionary containing positive and negative words (the AFINN word list), which is downloaded from here. These two types of words are used as keywords for classifying each tweet into one of 4 categories: Very Positive, Positive, Negative and Very Negative.
Score each tweet:
In this step, we write a function that calculates the rating of the movie; the function is given below. After calculating the scores, we plot a graph showing the ratings as "WORST", "BAD", "GOOD" and "VERY GOOD".

Future steps in this project will be:
  • To create a UI preferably using .NET, as I’m a dot-net developer ;)
  • To build a movie review model which can classify a new tweet as and when it is provided.
Code:

#include required libraries
library(plyr)
library(twitteR)
library(stringr)


#get the tweets
tweets = searchTwitter("#ChennaiExpress", n=500, lang="en")
tweets_txt = sapply(tweets[1:50],function(x) x$getText())

#function to clean data
cleanTweets = function(tweets)
{
tweets_cl = gsub("(RT|via)((?:\\b\\W*@\\w+)+)","",tweets)
tweets_cl = gsub("http[^[:blank:]]+", "", tweets_cl)
tweets_cl = gsub("@\\w+", "", tweets_cl)
tweets_cl = gsub("[ \t]{2,}", "", tweets_cl)
tweets_cl = gsub("^\\s+|\\s+$", "", tweets_cl)
tweets_cl = gsub("[[:punct:]]", " ", tweets_cl)
tweets_cl = gsub("[^[:alnum:]]", " ", tweets_cl)
tweets_cl <- gsub('\\d+', '', tweets_cl)
return(tweets_cl)
}

#function to calculate number of words in each category within a sentence
sentimentScore <- function(sentences, vNegTerms, negTerms, posTerms, vPosTerms){
  final_scores <- matrix('', 0, 5)
  scores <- laply(sentences, function(sentence, vNegTerms, negTerms, posTerms, vPosTerms){
    initial_sentence <- sentence
    #remove unnecessary characters and split up by word
        sentence = cleanTweets(sentence)
        sentence <- tolower(sentence)
        wordList <- str_split(sentence, '\\s+')
    words <- unlist(wordList)
    #build vector with matches between sentence and each category
    vPosMatches <- match(words, vPosTerms)
    posMatches <- match(words, posTerms)
    vNegMatches <- match(words, vNegTerms)
    negMatches <- match(words, negTerms)
    #sum up number of words in each category
    vPosMatches <- sum(!is.na(vPosMatches))
    posMatches <- sum(!is.na(posMatches))
    vNegMatches <- sum(!is.na(vNegMatches))
    negMatches <- sum(!is.na(negMatches))
    score <- c(vNegMatches, negMatches, posMatches, vPosMatches)
    #add row to scores table
    newrow <- c(initial_sentence, score)
    final_scores <- rbind(final_scores, newrow)
    return(final_scores)
  }, vNegTerms, negTerms, posTerms, vPosTerms)
  return(scores)
}


#load pos,neg statements
afinn_list <- read.delim(file='~/AFINN-111.txt', header=FALSE, stringsAsFactors=FALSE)
names(afinn_list) <- c('word', 'score')
afinn_list$word <- tolower(afinn_list$word)

#categorize words as very negative to very positive and add some movie-specific words
vNegTerms <- afinn_list$word[afinn_list$score==-5 | afinn_list$score==-4]
negTerms <- c(afinn_list$word[afinn_list$score==-3 | afinn_list$score==-2 | afinn_list$score==-1], "second-rate", "moronic", "third-rate", "flawed", "juvenile", "boring", "distasteful", "ordinary", "disgusting", "senseless", "static", "brutal", "confused", "disappointing", "bloody", "silly", "tired", "predictable", "stupid", "uninteresting", "trite", "uneven", "outdated", "dreadful", "bland")
posTerms <- c(afinn_list$word[afinn_list$score==3 | afinn_list$score==2 | afinn_list$score==1], "first-rate", "insightful", "clever", "charming", "comical", "charismatic", "enjoyable", "absorbing", "sensitive", "intriguing", "powerful", "pleasant", "surprising", "thought-provoking", "imaginative", "unpretentious")
vPosTerms <- c(afinn_list$word[afinn_list$score==5 | afinn_list$score==4], "uproarious", "riveting", "fascinating", "dazzling", "legendary")   

#Calculate score on each tweet
tweetResult <- as.data.frame(sentimentScore(tweets_txt, vNegTerms, negTerms, posTerms, vPosTerms))
tweetResult$'2' = as.numeric(tweetResult$'2')
tweetResult$'3' = as.numeric(tweetResult$'3')
tweetResult$'4' = as.numeric(tweetResult$'4')
tweetResult$'5' = as.numeric(tweetResult$'5')
counts = c(sum(tweetResult$'2'),sum(tweetResult$'3'),sum(tweetResult$'4'),sum(tweetResult$'5'))
names = c("Worst","BAD","GOOD","VERY GOOD")
mr = list(counts,names)
colors = c("red", "yellow", "green", "violet")
barplot(mr[[1]], main="Movie Review", xlab="Number of votes",legend=mr[[2]],col=colors)

Thursday, July 25, 2013

Document Classification using R



Recently I have developed an interest in analyzing data to find trends, predict future events, etc., and have started working on a few data analytics POCs such as predictive analysis and text mining. This blog post is on data mining - more specifically, document classification using the R programming language, one of the powerful languages used for statistical analysis.
What is Document classification?
Document classification, or document categorization, is the task of classifying documents into one or more classes/categories, either manually or algorithmically. Today we try to classify algorithmically. Document classification falls under supervised machine learning.
Technically speaking, we create a machine learning model using a number of text documents (called a corpus) as input and their corresponding classes/categories (called labels) as output. The model thus generated will be able to assign a class when a new text is supplied.
Inside the Black Box:
Let's have a look at what happens inside the black box in the figure above. We can divide the process into the following steps:
  • Creation of the corpus
  • Preprocessing of the corpus
  • Creation of the Term Document Matrix
  • Preparing features & labels for the model
  • Creating train & test data
  • Running the model
  • Testing the model
To understand the above steps in detail, let us consider a small use case: we have speeches of the US presidential contestants Mr. Obama and Mr. Romney, and we need to create a classifier which can tell whether a particular new speech belongs to Mr. Obama or Mr. Romney.
Implementation

We implement the document classification using the tm and plyr packages; as a preliminary step, we need to load the required libraries into the R environment.

  • Step I: Corpus creation
    A corpus is a large and structured set of texts used for analysis. In our case, we create two corpora, one for each contestant.

  • Step II: Preprocessing of the corpus
    The created corpus needs to be cleaned before we use the data for our analysis. Preprocessing involves removal of punctuation, white space and stop words such as is, the, for, etc.

  • Step III: Term Document Matrix
    This step involves the creation of a Term Document Matrix, i.e. a matrix holding the frequency of the terms that occur in the collection of documents. For example:
    D1 = “I love Data analysis”
    D2 = “I love to create data models”
    TDM (terms vs. documents):
                 D1   D2
    i             1    1
    love          1    1
    data          1    1
    analysis      1    0
    to            0    1
    create        0    1
    models        0    1


  • Step IV: Feature extraction & labels for the model
    In this step, we extract the input feature words that are useful in distinguishing the documents, and attach the corresponding classes as labels.

  • Step V: Train & test data preparation
    In this step, we first randomize the data and then divide the data containing features and labels into training (70%) and test (30%) sets before we feed them into our model.

  • Step VI: Running the model
    We create our model using the training data separated in the earlier step. We use the KNN model, a description of which can be found here.

  • Step VII: Testing the model
    Now that the model is created, we have to test its accuracy using the test data created in Step V.
Find the complete code here.
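For readers who want a self-contained starting point, here is a minimal sketch of the pipeline described above, using the tm and class packages. The file names (obama.txt and romney.txt, assumed to contain one speech per line), the sparsity threshold and the 70/30 split are my own illustrative assumptions, not the exact setup of the original code.

library(tm)
library(class)  # provides knn()

# Assumed input: one speech per line, one file per speaker (hypothetical file names)
obama  <- readLines("obama.txt")
romney <- readLines("romney.txt")
texts  <- c(obama, romney)
labels <- factor(c(rep("Obama", length(obama)), rep("Romney", length(romney))))

# Steps I & II: build and preprocess the corpus
corpus <- Corpus(VectorSource(texts))
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, removePunctuation)
corpus <- tm_map(corpus, removeWords, stopwords("english"))
corpus <- tm_map(corpus, stripWhitespace)

# Step III: term document matrix (documents as rows, terms as columns)
dtm <- DocumentTermMatrix(corpus)
dtm <- removeSparseTerms(dtm, 0.99)   # drop very rare terms to keep the matrix manageable
features <- as.data.frame(as.matrix(dtm))

# Steps IV & V: randomize and split into 70% training / 30% test
set.seed(123)
idx <- sample(nrow(features), size = 0.7 * nrow(features))
train <- features[idx, ];  train_labels <- labels[idx]
test  <- features[-idx, ]; test_labels  <- labels[-idx]

# Step VI: run the KNN classifier
pred <- knn(train = train, test = test, cl = train_labels, k = 3)

# Step VII: test the model's accuracy
mean(pred == test_labels)
table(predicted = pred, actual = test_labels)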

Tuesday, July 16, 2013

OAuth Authentication - Part 2 - Signup using Social Networking Sites



As a continuation of my previous post, in this post I will explain how to implement sign-up to a web application through social networks using the OAuth 2.0 protocol.
I will not be going through the coding part; instead I will explain the steps to be followed.
In this post, I will explain how to register our application with the Google server, access the user's profile details from Google, and capture them in our database tables for future user authentication.
The implementation includes the steps below:
  • Registration of the application with the social network
  • Authentication, along with requesting the scope of access
  • User authorization for the application to access resource server data
  • Call to the resource server API for access
  • Saving user details to the application database table
Flow Diagram:


Registering the Application with the Google Server:
Every application that needs to access resources from the Google server has to be registered with it.
Registration steps are explained below:
  • Enter the name for your project & click Create Project.
  • As a next step, we need to create a Client ID for the newly created project. Click on “Create an OAuth 2.0 Client ID”.
  • Enter your product name & home page URL. Click on the Next button.
  • Select Web application and provide the Redirect URL in the next step; this is the page the Google server will redirect to after the user authenticates & authorizes.
  • Click on Create Client ID.
  • Now the Client ID and Client Secret for your newly registered application will be generated, as shown in the image. The Client ID & Client Secret are very important, as they are used while making API calls to fetch information from Google. Make sure you do not share them with others.

Authentication Step:
In the next step, our application should take the user through an authentication process with the Google server, along with the SCOPE of access that the application is requesting for the user's account.
The parameters required for this authentication step are as follows:
  • scope: the access level the application is requesting from the Google server.
  • response_type: should be code.
  • state: should include the value of an anti-forgery unique session token.
  • redirect_uri: the URL to which the Google server will redirect after the user authenticates. This is the same URL you gave in the registration step.
Note: This post just explains how to sign up to a website with Google accounts; in a similar way we can access resources from Google/Facebook/LinkedIn/Twitter by sending the proper scope parameter and API call.
A sample call to the Google server is shown below:
https://accounts.google.com/o/oauth2/auth?state=%2Fprofile&redirect_uri=http://localhost:14964/GoogleRedirectUrl.aspx&response_type=code&client_id=565625245779.apps.googleusercontent.com&approval_prompt=force&scope=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fuserinfo.email+https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fuserinfo.profile
Authorization Step:
A call to the above link takes the user to the Google login screen, where the user needs to authenticate. Once the user authenticates, he is redirected to Google's authorization screen. Google's authorization server displays the name of your application and the Google services it is requesting permission to access on the user's behalf. The user can then consent or refuse to grant access to your application. Either way, Google will then redirect the user to the redirect URL that you specified in the registration step.
At this point, the user can check the details that Google will expose to the application by simply clicking on the information icon (i) in the image below.

If the user grants access to your application, Google appends a code parameter to the redirect_uri and returns control to the application.
http://localhost/redirecturl?code=4/ux5gNj-_mIu4DOD_gNZdjX9EtOFf
The code obtained in the above step is a temporary authorization code that can be exchanged for an access_token by making an HTTPS POST request to Google's token endpoint, which should include the parameters below (the standard OAuth 2.0 authorization-code exchange parameters):
  • code: the authorization code returned in the previous step
  • client_id & client_secret: obtained when registering the application
  • redirect_uri: the same redirect URL specified in the registration step
  • grant_type: authorization_code
The above HTTPS request returns a JSON object with the details below:

{
  "access_token"  : "ya29.AHES6ZTtm7SuokEB-RGtbBty9IIlNiP9-eNMMQKtXdMP3sfjL1Fc",
  "token_type"    : "Bearer",
  "expires_in"    : 3600,
  "refresh_token" : "1/HKSmLFXzqP0leUihZp2xUt3-5wkU7Gmu2Os_eBnzw74"
}
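As an illustration only (the original application was a .NET web app), the same token exchange could be performed from R with the httr package; the client ID, client secret and authorization code below are placeholders:

library(httr)

# Placeholder values - replace with the ones from your registered application
client_id     <- "YOUR_CLIENT_ID.apps.googleusercontent.com"
client_secret <- "YOUR_CLIENT_SECRET"
redirect_uri  <- "http://localhost:14964/GoogleRedirectUrl.aspx"
auth_code     <- "CODE_RETURNED_ON_THE_REDIRECT_URL"

# Exchange the temporary authorization code for an access token
resp <- POST("https://accounts.google.com/o/oauth2/token",
             body = list(code          = auth_code,
                         client_id     = client_id,
                         client_secret = client_secret,
                         redirect_uri  = redirect_uri,
                         grant_type    = "authorization_code"),
             encode = "form")
token <- content(resp)   # parsed JSON: access_token, token_type, expires_in, refresh_token
token$access_token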
Accessing APIs Step:
Finally, an API call is made, sending the access_token received in the previous step, to fetch the required profile details from the resource server.


Google returns the user's profile details to the application as shown below, and we can store the required data in our application tables for subsequent login authentication.



Thank you folks for your encouragement; please contact me for the code-level implementation.