Tuesday, 22 January 2013

Business Application Lab : Session 3

Day3 - 22Jan , 2013

Assignment 1
Read the set of data given in the .csv file and fit a linear model for the data set.
Comment on its applicability.


Soln -:

Commands used -: 

reg1 <- lm(DependentVariable ~ IndependentVariable)   # to fit the linear model and estimate the regression coefficients

res1 <- resid(reg1)   # to calculate residuals
resStd <- rstandard(reg1)   # to calculate standardized residuals
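
As a minimal sketch of the commands above, the following fits a model to made-up data. The vectors `x` and `y` are assumptions standing in for the columns of the lab's .csv file, which is not reproduced here.

```r
# Hypothetical data standing in for the .csv columns (assumption)
set.seed(1)
x <- 1:20
y <- 3 + 2 * x + rnorm(20, sd = 1)

reg1   <- lm(y ~ x)        # fit the linear model
res1   <- resid(reg1)      # raw residuals
resStd <- rstandard(reg1)  # standardized residuals

coef(reg1)                 # estimated intercept and slope
```

Because the model includes an intercept, the raw residuals sum to (numerically) zero, which is a quick sanity check on the fit.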




 Plot between Independent variable and residuals



Plot between Independent variable and standardized residuals



Q-Q Normal plot



Q-Q normal plot fit with a line
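
The four plots listed above can be sketched as follows. The data here is simulated (an assumption, not the lab data set), and the axis labels are illustrative.

```r
# Simulated data, roughly linear with normal noise (assumption)
set.seed(2)
x <- 1:30
y <- 5 + 0.5 * x + rnorm(30)

reg1   <- lm(y ~ x)
res1   <- resid(reg1)
resStd <- rstandard(reg1)

# Independent variable against residuals
plot(x, res1, xlab = "Independent variable", ylab = "Residuals")
abline(h = 0, lty = 2)   # reference line at zero

# Independent variable against standardized residuals
plot(x, resStd, xlab = "Independent variable", ylab = "Standardized residuals")
abline(h = 0, lty = 2)

# Q-Q normal plot, then the same plot with a reference line added
qqnorm(resStd)
qqline(resStd)
```

A residual plot with no visible pattern, and Q-Q points lying close to the line, are the signs to look for when judging whether a linear model is applicable.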


-Regression applicability- :
As the residual plot shows a non-linear, parabolic pattern rather than a random scatter, linear regression is not appropriate for this data set.

Assignment 1(b)

Data set with Alpha and Pluto

Read data from the csv file and calculate the regression





Plot between Independent variable and residuals



Plot between Independent variable and standardized residuals



Q-Q Normal Plot



Q-Q Normal Plot fit with a line



-Regression applicability- :
As the residual plot is random and most of the points lie close to the Q-Q normal plot line, linearity is visible. Hence linear regression is applicable.


Assignment 2
To test a null hypothesis for a given data set using ANOVA

Soln-:
Commands used -: 


var_name.anv<-aov(<var_name>$<Dependent Variable> ~ <var_name>$<Nominal_scale_variable>)
summary(var_name.anv)


As shown , after reading the data from a csv file


The result shows a p-value of 0.687, which is much greater than the assumed significance level.

Hence the null hypothesis cannot be rejected.
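
The decision rule can be sketched in code. Everything in this example is an assumption for illustration: the data frame `dat`, its columns `score` and `group`, and the significance level `alpha` of 0.05 (the post does not state which level was used).

```r
# Hypothetical data: one numeric variable and one nominal grouping variable
set.seed(3)
dat <- data.frame(
  score = rnorm(30, mean = 50, sd = 5),
  group = factor(rep(c("A", "B", "C"), each = 10))
)

dat.anv <- aov(score ~ group, data = dat)
summary(dat.anv)

# Pull the p-value for the grouping variable out of the summary table
p_value <- summary(dat.anv)[[1]][["Pr(>F)"]][1]

alpha <- 0.05  # assumed significance level
decision <- if (p_value > alpha) {
  "null hypothesis cannot be rejected"
} else {
  "null hypothesis is rejected"
}
decision
```

Using the `data =` argument instead of repeating `<var_name>$` on each side of the formula gives the same model with shorter labels in the summary.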

Tuesday, 15 January 2013

Business Application Lab - Session 2

15th Jan,2013
Assignment 1 -:
To bind columns/rows from 2 different matrices into a new matrix

Sol -:
Matrix 1 assignment and generation


> mat1<-c(1:10)
> dim(mat1)<-c(2,5)


Matrix 2 assignment and generation


> mat2<-c(11:16)
> dim(mat2)<-c(2,3)

Taking the 3rd column from matrix 1 and the 2nd column from matrix 2
Binding using the cbind and rbind functions as shown -:
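
A short sketch of the binding step, using the two matrices defined above. The result names `col_bound` and `row_bound` are made up for this example.

```r
mat1 <- c(1:10)
dim(mat1) <- c(2, 5)    # 2 rows, 5 columns, filled column by column

mat2 <- c(11:16)
dim(mat2) <- c(2, 3)    # 2 rows, 3 columns

# 3rd column of mat1 is (5, 6); 2nd column of mat2 is (13, 14)
col_bound <- cbind(mat1[, 3], mat2[, 2])  # the two vectors become columns
row_bound <- rbind(mat1[, 3], mat2[, 2])  # the two vectors become rows

col_bound
row_bound
```

Both results are 2 x 2 matrices; `row_bound` is the transpose of `col_bound`.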




Assignment 2
Multiply 2 matrices 

Sol -:
Command to multiply 2 matrices
> multip <- z1 %*% z2
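
The matrices `z1` and `z2` are not defined in the post, so the values below are assumptions chosen only to make the command runnable. Note that `%*%` requires the number of columns of `z1` to equal the number of rows of `z2`.

```r
# Hypothetical matrices (assumption): z1 is 2 x 3, z2 is 3 x 2
z1 <- matrix(1:6, nrow = 2)
z2 <- matrix(1:6, nrow = 3)

multip <- z1 %*% z2   # matrix product, giving a 2 x 2 result
multip

# By contrast, z1 * z1 would multiply element by element
```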




Assignment 3-:
To read NSE historical data from 1st Dec, 2012 to 31st Dec, 2012 from a .csv file.
To fit a regression of the High Price on the Open Price and calculate the residuals.

Soln- :
Command For Regression :
> reg1<-lm(HighPrice ~ OpenPrice , data = NSEData)

NSEData - Object with file historical data
High Price - Dependent variable
Open Price - Independent variable
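
As a rough sketch, the regression and its residuals can be kept side by side in the data frame. The prices below are invented for illustration and are not the actual NSE data; only the names `NSEData`, `HighPrice` and `OpenPrice` come from the command above.

```r
# Invented prices standing in for the NSE file (assumption)
NSEData <- data.frame(
  OpenPrice = c(5880, 5900, 5870, 5925, 5905, 5860),
  HighPrice = c(5910, 5935, 5905, 5950, 5930, 5900)
)

reg1 <- lm(HighPrice ~ OpenPrice, data = NSEData)

# Store fitted values and residuals next to the observations
NSEData$Fitted   <- fitted(reg1)
NSEData$Residual <- resid(reg1)
NSEData
```

Each residual is the observed High Price minus the fitted value for that day.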



Residuals





Assignment 4
To generate data for a normal distribution and plot the distribution curve


Soln -:
To generate normally distributed random numbers, the function used is -:

rnorm(n, mean, sd)
where n is the number of observations
mean is the mean of the distribution (or a vector of means)
sd is the standard deviation
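
A minimal sketch of generating the data and drawing the curve. The sample size, mean and standard deviation are arbitrary choices for this example, and overlaying `dnorm` on a histogram is only one of several ways to show the distribution.

```r
set.seed(42)   # makes the random draw reproducible

# 1000 draws with mean 50 and standard deviation 10 (arbitrary values)
samples <- rnorm(1000, mean = 50, sd = 10)

# Histogram on the density scale, with the theoretical curve on top
hist(samples, freq = FALSE, main = "Normal distribution", xlab = "Value")
curve(dnorm(x, mean = 50, sd = 10), add = TRUE)
```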

As shown below -:


The plot is as shown -:









Tuesday, 8 January 2013

Business App IT Lab - Day1 - 8 Jan 2013

 IT Business Applications Lab 



Day 1 , 8th Jan 2013

Today we learned a statistical language in the Business Applications IT laboratory. It is named R, and it is a very important and powerful language for statistical computing and graphics.

R provides a wide variety of statistical (linear and nonlinear modelling, classical statistical tests, time-series analysis, classification, clustering, ...) and graphical techniques, and is highly extensible.

One of R's strengths is the ease with which well-designed publication-quality plots can be produced, including mathematical symbols and formulae where needed. Great care has been taken over the defaults for the minor design choices in graphics, but the user retains full control.

We did some great stuff like reading from a csv file and some functional operations on the data therein.
Below are the assignments from day1.


Assignment 1 : 

Draw a histogram after concatenating 3 data points.

Soln : 
Commands used are as under -:

> x<-c(1,2,3)
> plot(x, type = "h")
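
For comparison, a short sketch of the two plotting calls. `plot(x, type = "h")` draws one vertical line per data point, with height equal to the value, while `hist()` groups the values into bins and plots the counts. The variable `h` is introduced here only for illustration.

```r
x <- c(1, 2, 3)

# Histogram-like vertical lines: one line per point, height = value
plot(x, type = "h")

# Frequency histogram: values are grouped into bins and counted
h <- hist(x)
h$counts   # number of observations falling in each bin
```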


Histogram


Assignment 2:  Drawing a Histogram with the data extracted from the csv file.

Soln -: 

Reading from the csv file is done as under -:  

> z<-read.csv(file.choose(), header=T)

> zcol1<-z[,3]
> plot(zcol1 , type="h")
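
Since `file.choose()` needs an interactive session, the sketch below reads from an explicit path instead. The file contents and the column names (`Date`, `Open`, `High`, `Low`) are assumptions about the layout, made so that column 3 holds the HIGH values.

```r
# Write a small made-up file so the example is self-contained (assumption)
csv_path <- tempfile(fileext = ".csv")
write.csv(
  data.frame(
    Date = c("03-Dec-2012", "04-Dec-2012", "05-Dec-2012"),
    Open = c(5880, 5900, 5870),
    High = c(5910, 5935, 5905),
    Low  = c(5860, 5880, 5850)
  ),
  csv_path,
  row.names = FALSE
)

z <- read.csv(csv_path, header = TRUE)

zcol1 <- z[, 3]     # select the third column by position
zhigh <- z$High     # the same column selected by name

plot(zcol1, type = "h")
```

Selecting by name is less fragile than selecting by position if the column order in the file ever changes.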



Assignment 3:  Drawing a line graph with points and naming the graph and the axis.
Soln :Let z be the variable that contains data from the .csv file selected.

Reading from the csv file is done as under -:  

> z<-read.csv(file.choose(), header=T)

This command prompts the user to select the data file from the saved location.

Let zcol1 be the variable that contains the contents of column 3 of the csv data.
The following commands were used:
> zcol1<-z[,3]
> plot(zcol1 , type="b" , main="NSE Graph" , xlab="Time" , ylab="indices")






Assignment 4:
Create a scatter plot by using share HIGH and LOW values from the NSE Historical data as obtained from the .csv file.

Soln :

HIGH values, as obtained in the previous question:

> zcol1<-z[,3]

LOW values are in column 4 from the csv file

> zcol2<-z[,4]

To plot the scatter plot 
> plot(zcol1,zcol2)



Assignment 5:
To find the volatility between the share values obtained from NSE historical data and obtain the range for the same.

Soln -:

To obtain the volatility, we would require the maximum value amongst the HIGH values and the minimum value amongst the LOW values.
Both columns are merged into one vector variable 'y' to get the HIGH and LOW values together.

> y<-c(zcol1,zcol2)

> summary(y)
will give the min and the max values as under -:

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   4888    5660    5723    5758    5884    6021

> range(y)

will give the desired range of volatility

[1] 4888.20 6020.75
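
As a small follow-up sketch, the width of that range can be computed with `diff()`. The vectors below are made up, apart from the two extremes 4888.20 and 6020.75 reported above; `rng` and `spread` are names introduced for this example.

```r
# Made-up HIGH and LOW values that contain the reported extremes (assumption)
zcol1 <- c(5965.15, 6020.75, 5930.20)   # HIGH
zcol2 <- c(4888.20, 5823.15, 5795.10)   # LOW

y <- c(zcol1, zcol2)

rng    <- range(y)    # smallest and largest value
spread <- diff(rng)   # largest minus smallest
spread
```

With the extremes shown above, the spread is 6020.75 - 4888.20 = 1132.55.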