Saturday 7 December 2019

R Programming

Conditional Statement:
R has two main forms of conditional statement: the If Else statement and the Nested If Else statement. They work as they do in most other programming languages.
If Else statements are a very important part of R programming. R also has a lot of powerful packages for data manipulation. The condition produces a logical value, and the statement that follows is carried out only when that value is TRUE.
If statement:
If the condition is true, R enters the block that follows it; if it is not, control passes to the else statement.
For example, you can check whether a is less than 4.
If it is, the condition is satisfied, so R enters the block and prints that a is less than 4. Otherwise it goes on to check whether a is equal to 4 (a == 4).
If a is equal to 4, it prints that a has the value of 4; otherwise it moves on to the next statement, and so on. In short, you write a condition in the if statement, R checks it, and if it is satisfied R enters the block and does whatever you told it to do; otherwise it goes to the else statement.
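The original screenshot is not available, so the following is only a minimal sketch of the check described above. It assumes a holds 4; the variable name result is made up for illustration.

```r
# Assumed value, matching the text: a holds 4
a <- 4

if (a < 4) {
  result <- "a is less than 4"
} else {
  # The first check failed, so test the next condition inside the else block
  if (a == 4) {
    result <- "a has the value of 4"
  } else {
    result <- "a is greater than 4"
  }
}

print(result)  # "a has the value of 4"
```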

 


The same thing can be done with the Nested If Else. With a nested statement you say: if a is less than 4, print that a is less than 4, exactly as before. But instead of writing multiple else blocks, you write else if (a == 4) to print that a has the value of 4, and a final else to print that a is greater than 4.
Now look at the output, which is the same in both cases. The If Else statement reports that a has the value of 4, since we stored 4 in a.
The same thing happens for the Nested If Else statement.
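As a rough sketch of the else if form (again assuming a is 4, with msg as an illustrative variable name), the same three-way check could be written like this:

```r
a <- 4  # assumed value, as in the text

if (a < 4) {
  msg <- "a is less than 4"
} else if (a == 4) {
  msg <- "a has the value of 4"
} else {
  msg <- "a is greater than 4"
}

print(msg)  # "a has the value of 4"
```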
Another variant is the ifelse() function. With it you can print Yes when a == 4 and No otherwise. It does not give you the same level of control as the If Else statement, but it does the job when you only have two outcomes: one value to return when the condition is met and another when it is not.
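A small sketch of ifelse(), assuming a is 4. The second call is an addition to the text: it shows that ifelse() also works element by element on a vector (the names values and answers are illustrative).

```r
a <- 4
answer <- ifelse(a == 4, "Yes", "No")
print(answer)   # "Yes"

# ifelse() is vectorised: each element is tested separately
values  <- c(3, 4, 5)
answers <- ifelse(values == 4, "Yes", "No")
print(answers)  # "No" "Yes" "No"
```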
Loops:

                                       


  • For Loop in R:
Suppose you store 1, 2, 3, 4 in a vector using the combine function c(). You then write for, followed by the condition that i takes each value in the vector in turn, so the loop runs 4 times, and inside the loop you ask R to print(i).

                                                       


So i starts at 1, then becomes 2, then 3, then 4, and the output shows 1, 2, 3, 4. That is how a For Loop works. It is very similar to the for loop in other languages.
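A minimal sketch of the loop just described. The vector name numbers and the helper vector seen (used only to record what was visited) are illustrative assumptions.

```r
numbers <- c(1, 2, 3, 4)
seen <- c()            # records the values the loop has visited

for (i in numbers) {
  print(i)             # prints 1, then 2, then 3, then 4
  seen <- c(seen, i)
}
```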
  • While Loop:
A While Loop repeats a statement or group of statements while the given condition is true. It tests the condition before executing the loop body, so as long as the test expression remains true, the code inside the loop keeps on executing.

                                                     


The loop runs as long as the condition, x less than 6, is met. Suppose we start with x equal to 1. We check whether it is less than 6, print x, and then increase it by 1 before the next check. So it prints 1 and x becomes 2; it checks again, prints 2 and increases x to 3, and so on up to 5. Once 5 has been printed, x is increased to 6, so the condition fails and the loop body is not entered again. That is how a While Loop works. It is pretty simple.
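The walk-through above can be sketched as follows, assuming the counter starts at 1:

```r
x <- 1

while (x < 6) {   # the condition is tested before every pass
  print(x)        # prints 1, 2, 3, 4, 5
  x <- x + 1      # increase the counter before the next check
}

# Here x is 6, which is why the loop stopped
```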
  • Repeat Loop:
Now let's discuss the Repeat Loop. A Repeat Loop executes a sequence of statements multiple times and abbreviates the code that manages the loop variable.
In the For Loop and While Loop we gave the condition before the loop even started, but in a Repeat Loop we do not state any condition up front. Here we just start with x equal to 1 (x = 1) and print x. Before leaving the iteration you increase x by 1, and then you give your condition: if x is equal to 6 (x == 6) then break. break simply means that you exit the loop. That is how we write the condition in a Repeat Loop.

                                                                


So the loop body runs first and the condition is checked only at the end of each iteration. Until the condition is satisfied the loop prints 1, 2, 3, 4, 5; then x is increased by 1 once more, it becomes 6 (x == 6), and finally you break out of the loop.
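A minimal sketch of that Repeat Loop, with the exit condition checked at the end of the body:

```r
x <- 1

repeat {
  print(x)        # prints 1, 2, 3, 4, 5
  x <- x + 1
  if (x == 6) {   # exit condition, checked at the end of each pass
    break
  }
}
```

Without the break, a repeat loop would run forever, so the exit condition is not optional here.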

Loops in R:
Now we will discuss the Break statement and the Next statement.
There are situations where we have to terminate a loop without executing all of its statements. In those situations we can use the break and next statements. In a For Loop, just as in a While or Repeat Loop, you can break out of the loop completely by using the break statement. Additionally, if you want to skip the current iteration and continue with the loop, you can use the next statement.

Break Statement:
In a Repeat Loop, if you want to break out of the loop based on a condition, you write a break statement. The break statement is used inside a loop to stop the iteration and pass control to the first statement after the loop. Unlike in C-style languages, it is not needed to terminate a case in R's switch() function (covered in the next chapter). The break statement can also be used inside the else branch of an if else statement. It works the same way in For, While and Repeat loops.

                                         


For example:
Suppose you have 15 statements inside the loop and you want to exit from the loop when a certain condition is true; otherwise all of them have to be executed. In this situation you use an if statement to check the expression and place the break statement inside the if block. If the condition is true, R executes the break statement and control leaves the loop completely; otherwise, it executes all the statements.
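As a rough illustration of break inside an if block, here is a For Loop that would normally run 15 times but leaves early. The cut-off of 5 and the vector name printed are arbitrary choices made for this sketch.

```r
printed <- c()

for (i in 1:15) {
  if (i > 5) {
    break                    # leave the loop completely
  }
  printed <- c(printed, i)   # only reached while i <= 5
}

print(printed)  # 1 2 3 4 5
```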

Next Statement:
The next statement is useful when we want to skip the current iteration of a loop without terminating it. On encountering next, R skips the rest of the loop body and starts the next iteration. In other words, next discontinues a particular iteration and jumps to the next cycle: it goes straight back to the evaluation of the condition that controls the current loop. The next statement can also be used inside the else branch of an if else statement.

                                       


So, suppose a For Loop starts from i equal to 1 (i = 1) and runs up to 5, so we go from one to five (1 to 5) and print the value of i each time. But now you want to skip the iteration whenever i is equal to three (i == 3). What happens? It prints all of the values except for i equal to three. So it prints one and two, skips three, then prints four and five.
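The example above can be sketched like this (kept is an illustrative vector that records which values were printed):

```r
kept <- c()

for (i in 1:5) {
  if (i == 3) {
    next               # skip the rest of this iteration
  }
  print(i)             # prints 1, 2, 4, 5
  kept <- c(kept, i)
}
```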

Function in R:
What is a function?
A function in R is exactly the same concept as in any other language. It is like a black box: you give it an input, it works on that input, and it gives you back the output.

                              


For example, there is a function called mean(): it takes whatever input it is given and provides you with the output. If you give it the input 1, 5, 6, 7, combining the values with the combine function c(), you get 4.75 as the output.
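In code, that call looks like this (values and avg are illustrative names):

```r
values <- c(1, 5, 6, 7)   # combine the inputs into one vector
avg <- mean(values)       # (1 + 5 + 6 + 7) / 4
print(avg)                # 4.75
```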

                           


An R function is created by using the keyword ‘function’.
Function Structure & Documentation:
R has a lot of inbuilt functions, such as numeric functions, statistical functions, character functions and many others. Before we dive deep into those functions, let's understand the structure of a function.

                                                        


A function has a body, where you write what it should do. Suppose we create a function that adds: if you pass two arguments, x and y, it adds them and returns the value. You can also set default values. If you declare the arguments as x, y then there is no default value, but if you declare them as x, y = 1 then you are saying that y is 1 even if nothing is passed for it. So if you pass only 2 to this function, it gives the result 2 + 1 = 3.
We will learn how to write a function of our own, but before that we will look at a few inbuilt functions in R that are very popular and that we use in our subsequent courses.
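The screenshot of this function is missing, so the following is only a sketch of what such a function might look like. The name add is an assumption made for illustration.

```r
# Hypothetical name; the body adds its two arguments
add <- function(x, y = 1) {   # y defaults to 1 when it is not supplied
  return(x + y)
}

add(2)      # 3, because y falls back to its default
add(2, 5)   # 7, because the default is overridden
```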
Function Arguments Matching:
When a function has many arguments, it is very difficult to remember which one sits in the fifth or sixth position. That is why, whenever you pass a value to a function, you can pass it in two ways. One is by position, where, for example, the first position is the value and the second is na.rm. If you forget the position, you can pass the argument by name instead. Take the standard deviation function, which has these two arguments, as an example.

     


Suppose the argument you want to change is in the fifth position. Instead of counting positions, you can pass it by name: you write na.rm = FALSE or na.rm = TRUE and R understands which argument you mean.
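A short sketch of both styles using sd(), whose arguments are x and na.rm. The sample data is made up for illustration.

```r
x <- c(2, 4, 4, 4, 5, 5, 7, 9, NA)   # made-up data with one missing value

by_position  <- sd(x, TRUE)              # second position is na.rm
by_name      <- sd(x, na.rm = TRUE)      # same call, argument named
any_order    <- sd(na.rm = TRUE, x = x)  # named arguments can come in any order
with_missing <- sd(x)                    # NA, because na.rm defaults to FALSE
```

Passing by name is usually easier to read, because the call itself says what TRUE refers to.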
Introduction to Function:
Now we will talk about some of the inbuilt functions: a few numeric functions, a few statistical functions and a few character functions.
First, we discuss the numeric functions.
Numeric Function:
We will try out the numeric functions in RStudio and go through them one by one. Note that R is case-sensitive, so the function names are written in lower case.
We start with one of the easier ones, sqrt(x).
sqrt(x): x is a numeric value or a valid numeric expression whose square root you want. If the numeric expression is a positive value, sqrt() returns the square root of that value.
If the numeric expression is a negative value, sqrt() returns NaN (with a warning).
If the numeric expression is not a number (NaN) or negative infinity, sqrt() returns NaN.
If the numeric expression is positive infinity, sqrt() returns positive infinity.
ceiling(x): One of the R math functions. It returns the smallest integer value that is greater than or equal to a number or expression. x is the numeric value whose ceiling you want to find. If the numeric expression is a positive or negative numeric value, ceiling() returns the ceiling value.
If the numeric expression is positive or negative zero, ceiling() returns zero.
If the numeric expression is not a number (NaN), ceiling() returns NaN.
If the numeric expression is positive or negative infinity, ceiling() returns the same value.

                                        


floor(x): One of the R math functions. It returns the largest integer value that is less than or equal to a number or expression.
x is the numeric value whose floor you want to find. If the numeric expression is a positive or negative numeric value, floor() returns the floor value.
If the numeric expression is positive or negative zero, floor() returns zero.
If the numeric expression is not a number (NaN), floor() returns NaN.
If the numeric expression is positive or negative infinity, floor() returns the same value.
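The cases listed above for sqrt(), ceiling() and floor() can be checked with a few calls. The input values are arbitrary examples, and suppressWarnings() is used only to keep the negative-input case quiet.

```r
sqrt(16)                      # 4
sqrt(Inf)                     # Inf
sqrt(NaN)                     # NaN
suppressWarnings(sqrt(-4))    # NaN (R also warns that NaNs were produced)

ceiling(2.3)                  # 3
ceiling(-2.3)                 # -2
ceiling(Inf)                  # Inf

floor(2.7)                    # 2
floor(-2.7)                   # -3
floor(-Inf)                   # -Inf
```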
exp(x): Computes the exponential function, that is, e raised to the power x. It should not be confused with EXP() from the GAMLSS packages, which defines the exponential distribution, a one-parameter distribution for a gamlss.family object used in GAMLSS fitting with the function gamlss(). There the mu parameter represents the mean of the distribution, and dEXP, pEXP, qEXP and rEXP give the density, distribution function, quantile function and random generation for that parameterization.
log(x): log computes logarithms, by default natural logarithms; log10 computes common (that is, base 10) logarithms, and log2 computes binary (that is, base 2) logarithms. The general form log(x, base) computes logarithms with the given base.
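A few illustrative calls to base R's exp() and the log family (the inputs are arbitrary):

```r
exp(1)              # Euler's number e, about 2.718282
log(exp(2))         # 2, because log() is the natural logarithm by default
log10(1000)         # 3
log2(8)             # 3
log(81, base = 3)   # 4, using the general form log(x, base)
```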
round(x, digits = n): round() belongs to a family of rounding functions:
ceiling: Takes a single numeric argument x and returns a numeric vector containing the smallest integers not less than the corresponding elements of x.
floor: Takes a single numeric argument x and returns a numeric vector containing the largest integers not greater than the corresponding elements of x.
trunc: Takes a single numeric argument x and returns a numeric vector containing the integers formed by truncating the values in x toward 0.
round: Rounds the values in its first argument to the specified number of decimal places (default 0).
signif: Rounds the values in its first argument to the specified number of significant digits.
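To see how the rounding functions differ, here is a small sketch that applies each of them to one positive and one negative value (the values are arbitrary):

```r
x <- c(2.567, -2.567)

ceiling(x)              #  3 -2
floor(x)                #  2 -3
trunc(x)                #  2 -2   (always toward 0)
round(x)                #  3 -3   (0 decimal places by default)
round(x, digits = 2)    #  2.57 -2.57
signif(x, digits = 2)   #  2.6  -2.6
```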
append(): Appends values to x, possibly inserted into the middle of x. It is a convenient alternative to building the result by hand with the concatenation function c().
identical(): The safe and reliable way to test two objects for being exactly equal. It returns TRUE in this case and FALSE in every other case.
length(): Gets the length of vectors and factors, and of any other R object for which a method has been defined.
range(): Returns a vector containing the minimum and maximum of all the given arguments. range is a generic function: methods can be defined for it directly or via the Summary group generic. Its arguments should be unnamed, and dispatch is on the first argument. The keywords are “arith” and “univar”.

                                                                  


rep(x, n): rep replicates the values in x. It is a generic function, and the default method is described here. The keywords are “manip” and “chron”.
rev(): rev provides a reversed version of its argument. It is a generic function with a default method for vectors and one for dendrograms.
seq(x, y, n): Generates regular sequences. seq is a standard generic with a default method. seq.int is a primitive that can be much faster but has a few restrictions. seq_along and seq_len are very fast primitives for two common cases. The keyword is “manip”.
unique(): unique returns a vector, data frame or array like x but with duplicate elements removed. The keywords are “manip” and “logic”.
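The vector helpers described above can be tried on a small made-up vector. This is only an illustrative sketch; the after, times and by arguments are spelled out by name for clarity.

```r
x <- c(3, 1, 2)

append(x, 99, after = 1)       # 3 99 1 2   (inserted after the first element)
identical(c(1, 2), c(1, 2))    # TRUE
length(x)                      # 3
range(x)                       # 1 3        (minimum and maximum)
rep(x, times = 2)              # 3 1 2 3 1 2
rev(x)                         # 2 1 3
seq(1, 10, by = 3)             # 1 4 7 10
seq_len(4)                     # 1 2 3 4
seq_along(x)                   # 1 2 3
unique(c(1, 2, 2, 3, 3))       # 1 2 3
```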


Statistical Function:
Now we will learn the statistical functions in R. The first function we will talk about is the mean() function.
  • mean(x): The mean() function calculates the arithmetic mean in R. It is calculated by taking the sum of the values and dividing it by the number of values in the data series. In other words, the mean of an observation variable is a numerical measure of the central location of the data values.
The keyword is “univar”. The mean is the arithmetic average and is a common statistic used with ratio data. It can be calculated on a single variable via the mean(VAR) command, where VAR is the name of the variable.
  • median(x): The middle-most value in a data series is called the median. The median of an observation variable is the middle value when the data is sorted in ascending order.
It is an ordinal measure of the central location of the data values. This is a generic function for which methods can be written, and its default method will work for most classes for which a median is a reasonable concept.
  • sd(x): The standard deviation of an observation variable is the square root of its variance. This function computes the standard deviation of the values in x. If na.rm is TRUE, missing values are removed before the computation proceeds. In R, standard deviations are calculated in the same way as the mean.
The standard deviation of a single variable can be computed with the sd(VAR) command, where VAR is the name of the variable. A standard deviation can be calculated for each of the variables in a dataset by applying sd to every column, for example with sapply(DATAVAR, sd), where DATAVAR is the name of the variable containing the data.
  • range(x): The range of an observation variable is the difference between its largest and smallest data values. It is a measure of how far apart the data values are spread.
In R, range() returns a vector containing the minimum and maximum of all the given arguments, so the difference itself is obtained with diff(range(x)). The keywords are “arith” and “univar”. It is recommended that ranges also be computed on individual variables.
  • sum(x): The sum function in R is used to calculate the sum of vector elements.
sum returns the sum of all the values present in its arguments. It is a generic function: methods can be defined for it directly or via the Summary group generic.
  • min(x): The min function computes the minimum value of a vector.
A minimum can be computed on a single variable using the min(VAR) command.
  • max(x): The max function computes the maximum value of a vector.
The maximum, via max(VAR), is computed in the same way.
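A quick sketch that runs these statistical functions on one made-up vector. The name scores and its values are assumptions chosen so the results are easy to check by hand.

```r
scores <- c(4, 8, 15, 16, 23, 42)   # made-up data

mean(scores)          # 18     (108 / 6)
median(scores)        # 15.5   (average of the two middle values, 15 and 16)
sd(scores)            # square root of the sample variance
range(scores)         # 4 42   (minimum and maximum)
diff(range(scores))   # 38     (largest minus smallest)
sum(scores)           # 108
min(scores)           # 4
max(scores)           # 42

# With a missing value, na.rm = TRUE removes it before computing
mean(c(scores, NA), na.rm = TRUE)   # 18
```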
Character Function:
Now we will deal with character variables. Suppose you have customer data: the customers' names, locations and other attributes are mainly character in nature. Very often we have to manipulate and clean this data before we use it in a model, which is why we go through a couple of simple examples as well.
Here we discuss some inbuilt character functions. The first one is the tolower function.

  • tolower(): It converts a string to lower case letters.

  • toupper(): It converts a string to upper case letters.

  • substr(x, start = n1, stop = n2): It extracts or replaces substrings in a character vector. How does it extract?
It takes a starting point and an ending point and works on x between them. It can also be used to overwrite a part of the character string.

  • grep(pattern, x, ignore.case = FALSE): It searches for a pattern in x.
Where substr extracts by position, grep() finds a particular pattern in each element of a vector x.

  • sub(pattern, replacement, x, ignore.case = FALSE, fixed = FALSE): It finds a pattern in x and replaces it with the replacement text. There is a small difference between sub and gsub.
sub replaces only the first match, but gsub replaces the pattern everywhere it finds it. In other words, gsub() replaces all matches in a string.

  • paste(..., sep = " "): With this function you can paste together two words or two letters.
It converts its arguments to character strings and concatenates them.
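The character functions above can be tried on a few made-up strings. All the sample values here are assumptions for illustration.

```r
name <- "Data Science"

tolower(name)                       # "data science"
toupper(name)                       # "DATA SCIENCE"
substr(name, start = 1, stop = 4)   # "Data"

# substr() can also overwrite part of a string
label <- name
substr(label, 1, 4) <- "Text"
label                               # "Text Science"

cities <- c("Delhi", "Mumbai", "New Delhi")
grep("Delhi", cities)               # 1 3  (positions of the matching elements)

sub("a", "A", "banana")             # "bAnana"  (first match only)
gsub("a", "A", "banana")            # "bAnAnA"  (every match)

paste("R", "Programming")             # "R Programming"
paste("R", "Programming", sep = "-")  # "R-Programming"
```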


Now we will learn how to write our own function. Before that, we have to know when it makes sense to create one. Whenever you find yourself copying and pasting the same lines of code again and again with minimal changes, it is time to create your own function.
Why create a function? Writing your own functions is one way to reduce duplication in your code. The result is not only easier to use but also easier to understand, and it removes a lot of duplicated work. That is why we create functions.
Function structure: First you name your function. Like mean, standard deviation, range or append, it should have some kind of meaningful name.
                      
Suppose you want to create a function that takes whatever is passed to it and triples it. That is, it multiplies the input by 3 and returns the value. Here the meaningful name is triple, and the question is how to write it. You start with the name, triple <-, then you write the keyword function, and then you list the arguments it takes. Here there is only one argument, so you write x. Then you create the body of the function: it takes your input x, multiplies it by 3 and returns the result. Now the function has been created: triple <- function(x) { 3 * x }. If you pass 6 to your triple function you get 18. Working with one argument is pretty easy; next we work with two arguments.
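Written out on several lines, the triple function from the paragraph above looks like this:

```r
triple <- function(x) {
  3 * x        # the value of the last expression is returned
}

triple(6)      # 18
```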
                       
In the second example we create a new function called math magic, which takes two arguments, a and b, and computes a multiplied by b plus a divided by b, that is, a*b + a/b. Suppose this is the formula and we have created math magic. If you pass a = 2 and b = 1, you get 4 as the result: 2 multiplied by 1 is 2, 2 divided by 1 is 2, and 2 plus 2 is 4 (2*1 + 2/1 = 4).
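A sketch of the two-argument function. The identifier math_magic is an assumption, since the original screenshot is missing and an R name cannot contain a space.

```r
# Assumed identifier for the "math magic" function described in the text
math_magic <- function(a, b) {
  a * b + a / b
}

math_magic(2, 1)           # 4, i.e. 2*1 + 2/1
math_magic(a = 2, b = 1)   # 4, the same call with named arguments
```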
                   
If you forget to pass the second value, the one for b, you get an error. Remember that you could also have written the call by name, with a equal to 2 and b equal to 1; since the positions are the same as well, the function works either way. But if you forget to give the second value, you get an error. That is why, whenever you create your own function, you should try to give a meaningful default value as well.
                           
Here we set a default value, b equal to 1, and keep the rest of the code the same. If we pass 2 and 1 the result is the same as before, and if we forget to give the value of b we still get a result. Earlier the call without b gave an error, but now the function automatically uses the default value of 1, which is why we get a result.
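A minimal sketch of the difference a default makes. Both function names are assumptions made for illustration; try() is used only so that the failing call does not stop the script.

```r
# Hypothetical version without a default: b must always be supplied
math_magic_no_default <- function(a, b) {
  a * b + a / b
}
attempt <- try(math_magic_no_default(2), silent = TRUE)  # fails: b is missing

# Hypothetical version with a default of 1 for b
math_magic <- function(a, b = 1) {
  a * b + a / b
}

math_magic(2)      # 4, b falls back to 1
math_magic(2, 1)   # 4, same result when b is passed explicitly
math_magic(2, 4)   # 8.5, i.e. 2*4 + 2/4
```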
                            
That is how you create your own function. It is simple, and the examples above are very simple too. But suppose you are doing a lot of data manipulation on data that has many columns and attributes, and you want to perform a similar operation on each of those attributes. Writing the same code again and again quickly gets boring, so what can you do? You can simply create a function that handles each of those columns one by one, and in that way reuse your code.
              
