- Statistics is the study of the collection, analysis, interpretation, presentation, and organisation of data.
- It is a way to understand data and find the patterns in it.
Terminologies in Statistics:
- A population is the whole set: it contains every item or event of interest in an experiment.
- A parameter is a characteristic of the population, such as the population mean or median.
- A sample is a subset of the population.
- A statistic is a characteristic of a sample, such as the sample mean or median.
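The four terms above can be sketched in a few lines of Python. This is only an illustration: the population of ages, its size, and the sample size of 50 are all made-up assumptions, not data from this post.

```python
import random
import statistics

# Hypothetical population: the ages of all 1,000 members of a club.
random.seed(42)  # fixed seed so the run is repeatable
population = [random.randint(18, 70) for _ in range(1000)]

# Parameter: a characteristic computed from the whole population.
population_mean = statistics.mean(population)

# Sample: a subset of the population, drawn at random without replacement.
sample = random.sample(population, 50)

# Statistic: the same characteristic computed from the sample only.
sample_mean = statistics.mean(sample)

print(f"population mean (parameter): {population_mean:.2f}")
print(f"sample mean (statistic):     {sample_mean:.2f}")
```

The two means will usually be close but not identical, which is why a statistic is treated as an estimate of a parameter.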
Types of Data
Numerical or Quantitative
- Quantitative data is data whose variables are expressed in numerical terms.
- Example: price, income, etc.
- There are two types of numerical data.
Continuous Data Type:
- Continuous data is quantitative data measured on a scale, so it can take values other than whole numbers, such as decimals and fractions.
- Example: Height, weight, length, temperature.
Discrete Data Type:
- Discrete data is based on counts. Only distinct, separate values are possible.
- Consecutive values are separated by a constant interval.
- Example: number of children. The interval is 1, because a family cannot have 1.5 children.
Categorical or Qualitative
- Qualitative data is data whose variables represent characteristics that cannot be expressed in numerical terms.
- Example: marital status, etc.
- There are three types of categorical data.
Nominal Data Type:
- Nominal data has values with no specific order.
- Example: names, TV, fan, etc.
Ordinal Data Type:
- Ordinal data is categorical data in which the categories are ordered.
- Example: exam result class (Fail, Pass, Second Class, First Class, Distinction), age group (Young, Middle-aged, Old), etc.
Binary Data Type:
- Binary data is an important special case of categorical data that takes only one of two values.
- Example: 0/1, yes/no, accept/reject.
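As a rough sketch, the three categorical types can be represented in pandas as shown here. The values and variable names are made up for illustration; `pd.Categorical` with `ordered=True` is one way (not the only way) to tell pandas that categories have an order.

```python
import pandas as pd

# Nominal: categories with no inherent order.
appliance = pd.Categorical(["TV", "fan", "TV"], categories=["TV", "fan"], ordered=False)

# Ordinal: categories with a meaningful order, listed from lowest to highest.
age_group = pd.Categorical(
    ["Young", "Old", "Middle"],
    categories=["Young", "Middle", "Old"],
    ordered=True,
)

# Binary: a categorical variable that takes only one of two values.
accepted = pd.Series(["yes", "no", "yes"])

print(appliance.ordered)                  # whether pandas treats the categories as ordered
print(age_group.ordered)
print(age_group.min(), age_group.max())   # min/max are only defined for ordered categories
print(accepted.nunique())                 # number of distinct values
```

Asking for the minimum or maximum of the nominal `appliance` values would raise an error, which matches the idea that nominal categories cannot be ranked.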
Here is a small example.
Consider a data frame with three records and the columns Name, Degree, Gender, Performance, Experience and Promotion.
The Name column is an example of the nominal data type because there is no specific order among names.
The Degree column is an example of the ordinal data type because degrees can be ranked by level of qualification.
The Gender column is an example of the binary data type because in this data it takes one of two values, male or female.
The Performance column is an example of the ordinal data type because performance ratings can be ranked.
The Experience column is an example of the discrete data type: its values are integers, with no floating-point values.
The Promotion column is an example of the binary data type.
Here "data" is the variable in which we stored the data frame.
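A data frame like the one described could be built as follows. The three records are invented purely for illustration; only the column names come from the description above.

```python
import pandas as pd

# Hypothetical records: every value below is made up for this sketch.
data = pd.DataFrame(
    {
        "Name": ["Asha", "Ben", "Chitra"],                   # nominal
        "Degree": ["Bachelor", "Master", "PhD"],             # ordinal
        "Gender": ["female", "male", "female"],              # binary
        "Performance": ["Average", "Good", "Excellent"],     # ordinal
        "Experience": [2, 5, 9],                             # discrete (whole years)
        "Promotion": ["no", "yes", "yes"],                   # binary
    }
)

print(data)
```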
data.info()
This prints information about the data frame. The Dtype column shows the data type of each column: int64 and float64 indicate a numerical column.
An object Dtype usually indicates text values, which in most cases means the column holds categorical data.
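The same split can be done programmatically with `DataFrame.select_dtypes`. The sketch below uses a small made-up data frame as a stand-in for `data`, and it assumes that every non-numeric column is categorical, which is a simplification.

```python
import pandas as pd

# Hypothetical stand-in for `data`; the values are invented.
data = pd.DataFrame(
    {
        "Name": ["Asha", "Ben", "Chitra"],
        "Experience": [2, 5, 9],
        "Promotion": ["no", "yes", "yes"],
    }
)

# Columns with a numeric dtype (int64, float64, ...).
numerical_cols = data.select_dtypes(include="number").columns.tolist()

# Everything else is treated here as categorical (an assumption of this sketch).
categorical_cols = data.select_dtypes(exclude="number").columns.tolist()

print(numerical_cols)
print(categorical_cols)
```

Note that the dtype is only a hint: a column of 0/1 codes is stored as int64 but is really binary categorical data, so the result should still be checked by eye.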
Why are data types important?
Data types matter because statistical analysis treats continuous data differently from categorical data, and using a method meant for one type on the other leads to a wrong analysis. Knowing which type of data you are dealing with lets you choose the correct method of analysis.
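A minimal sketch of this point, using made-up values: a mean is a sensible summary for a numerical column, while a categorical column is better summarised by counts and the most frequent category.

```python
import statistics
from collections import Counter

# Hypothetical values for illustration.
experience_years = [2, 5, 9]        # numerical (discrete)
promotion = ["no", "yes", "yes"]    # categorical (binary)

# Numerical data: arithmetic summaries such as the mean are meaningful.
mean_experience = statistics.mean(experience_years)

# Categorical data: count how often each category occurs and find the mode.
promotion_counts = Counter(promotion)
most_common_promotion = statistics.mode(promotion)

print(mean_experience)
print(promotion_counts)
print(most_common_promotion)
```

Calling `statistics.mean(promotion)` instead would fail, because an average of "yes" and "no" has no meaning.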
Two types of statistics:
Descriptive Statistics
- In descriptive statistics you describe, present, summarise and organise your data.
- It gives basic information about the data, which helps you decide how to proceed with further analysis.
Inferential Statistics
- It is about using data from a sample to make inferences about the larger population from which the sample is drawn.
- The goal of the inferential statistics is to draw conclusions from a sample and generalize them to the population.
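The difference between the two can be sketched as follows. The population here is simulated, and the 95% confidence interval uses the normal approximation, which is just one common choice and an assumption of this example.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population of 10,000 measurements (simulated, not real data).
population = [random.gauss(170, 10) for _ in range(10_000)]
sample = random.sample(population, 100)

# Descriptive statistics: summarise the sample itself.
sample_mean = statistics.mean(sample)
sample_sd = statistics.stdev(sample)

# Inferential statistics: use the sample to say something about the population.
# Approximate 95% confidence interval for the population mean (normal approximation).
z = statistics.NormalDist().inv_cdf(0.975)  # about 1.96
margin = z * sample_sd / math.sqrt(len(sample))
ci_low, ci_high = sample_mean - margin, sample_mean + margin

print(f"sample mean: {sample_mean:.2f}, sample sd: {sample_sd:.2f}")
print(f"approx. 95% CI for the population mean: ({ci_low:.2f}, {ci_high:.2f})")
```

The first two numbers only describe the 100 sampled values; the interval is a statement about the whole population, made without looking at all of it.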
Descriptive statistics are covered in Part 2.