ABC of Statistics
- The terms "table" and "dataset" are often used interchangeably, although strictly a dataset can contain multiple tables.
- A table contains one or more columns.
- A table or dataset contains information that pertains to either a sample or a population.
- Samples represent a subset of observations, while populations encompass all observations.
- Census methods gather data for the entire population, whereas surveys collect data for a sample.
- Statistics involves collecting, analyzing, interpreting, and presenting data, and it includes methods for both sample and population data.
- A census involves a comprehensive count of an entire population, aiming to collect data from every individual or element in a specific group or area, leaving none out.
- Each row of a table holds one observation from the sample or population.
- The values in a column follow a distribution that depends on the data type (e.g., numerical, categorical) and on what the column measures (e.g., gender, city, age, wait time, population, etc.).
- Mean, mode, median, standard deviation, range, min, max, quartile, etc., are referred to as measures or metrics.
- Statistical measures fall into categories such as: 1) Measures of Central Tendency, 2) Measures of Dispersion or Variability, 3) Measures of Position, 4) Measures of Shape or Distribution, 5) Measures of Association.
- Statistics summarize tables using measures such as mean, mode, median, standard deviation, range, min, max, quartile, applicable to either sample or population data.
- Parameters, including mean, mode, median, standard deviation, minimum, maximum, etc., describe the characteristics of a population dataset.
- When census data isn’t available, the population parameters are unknown and have to be estimated from a sample.
- Inferential statistics involves drawing conclusions about a population from sample data.
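- “Non-parametric” refers to methods that do not assume the data follow a particular distribution described by a fixed set of parameters (e.g., a normal distribution defined by its mean and standard deviation). It does not simply mean that the population parameters are unknown.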
- Saying that a population parameter is unknown means there is not enough data to calculate it, not that the data has never been analyzed before.
- Statistical tests evaluate whether two samples (e.g., two columns) plausibly come from the same population, or whether a sample is consistent with a known population value, by comparing statistics such as means or variances.
- “Data” is a broad term in statistics, encompassing columns, column summaries, individual cell values, and context-specific references to both sample and population data. It includes various forms like voice recordings, images, articles, emails, logs, streaming or batch data. Clarify the specific type of data when discussing it.
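
The measures listed above can be tried out on a small column of numbers. The following is a minimal sketch using Python's standard `statistics` module (Python 3.8 or later). The "wait time" values, the choice of sample, and the `summarize` helper are all made up for illustration. The list `population` is assumed to be the full population, so its standard deviation is a parameter, while the standard deviation of `sample` is a statistic that estimates it.

```python
import statistics

# Hypothetical "wait time" column (minutes), treated here as the full population.
population = [4, 8, 8, 15, 16, 23, 42]

# A sample is a subset of the population's observations (picked by hand here).
sample = [8, 8, 16, 42]


def summarize(values):
    """Return common descriptive measures for one numeric column."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "min": min(values),
        "max": max(values),
        "range": max(values) - min(values),
        # Three cut points that split the data into four groups.
        "quartiles": statistics.quantiles(values, n=4),
    }


population_summary = summarize(population)
sample_summary = summarize(sample)

# Population standard deviation (a parameter): divides by N.
population_sd = statistics.pstdev(population)
# Sample standard deviation (a statistic): divides by n - 1.
sample_sd = statistics.stdev(sample)

print("population:", population_summary, "sd:", population_sd)
print("sample:    ", sample_summary, "sd:", sample_sd)
```

The sample summary will generally differ from the population summary. Closing that gap is what inferential statistics is for.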