When the input data is unlabeled and we have to find hidden patterns or clusters in the data set, unsupervised learning comes into the picture. In clustering, given a data set, we look for similarities among …

Time Series Forecasting Using R: A Starter Pack. Some basic theoretical ideas are needed before we proceed. Time Series Data: A time series is a set of observations on the values that a variable takes at different times. …

Why Logistic Regression? The linear regression model assumes that the response variable Y is quantitative. But in many situations, the response variable is instead qualitative. For example, eye colour is qualitative, taking on the values blue, brown or green. Often qualitative …

Top Data Science Geeks to Follow on GitHub. How can you use and learn data science tools and techniques from these GitHub accounts? Create your account on GitHub. Decide on what you want to start learning (e.g. visualization or machine learning) …

This is the place to discover cool data, work together to solve data problems faster, and seamlessly analyse open data. Data Sets Repository. UC Irvine Machine Learning Repository – contains data sets good for machine learning. enigma.io – navigate the world of public …

So far we know that when the constants of a population (the parameters) are unknown, we estimate them from samples drawn from the same population. This method is called “estimation”. Now, since we are …

Is there any reason why probability distributions have to be discussed? What are their uses in understanding data? Can they be relevant to one’s needs? These are some of the questions that one has to …

Dispersion. Dispersion means the variability, or spread, in the data. An average gives a single representative value of the data; however, the average is more reliable when the dispersion is smaller. Consider the following example: suppose there are three screw manufacturing machines each …

Hypothesis Testing. The primary objective of any statistical analysis is to gather information about some characteristics of the population. But usually only a part of the population (i.e. a sample) can be accessed, and hence one needs to make a guess about the characteristics …

When we say data, we mean numbers, text or symbols that represent pieces of information. More often than not, we see numbers. Because numbers are involved, it is easy to think that it has some values of …