Numbers that are used to summarize and describe data that is known is a branch of statistics called

Notes on Topic 1: Basic Statistical Concepts

    Statistics, Science, and Observations

      Science

      Science is based on the empirical method for making observations - for systematically obtaining information. It consists of methods for making observations.

      Observations

      Observations are the basic empirical "stuff" of science.

      Statistics

      Statistics is a set of methods and rules for organizing, summarizing and interpreting information.

      The methods and rules enable scientific researchers to describe and analyze the observations they have made. Statistical methods are tools for science.

      Science consists of methods for making observations;
      Statistics consists of methods for describing and analyzing the observations.

      Here are some of the "observations" we gathered in the survey we did on the first day of class in 1997 and 1998.

      [Image of the survey data not available.]

      Populations & Samples

      Populations

      A population is the set of all individuals of interest in a particular study. We will also refer to populations of scores.

      Samples

      A sample is a set of individuals selected from a population, usually intended to represent the population in a study. We will also refer to samples of scores.

      The data we gathered in class are a "sample" of scores obtained with a sample of individuals. The population we sampled from is the population of UNC undergraduates.

      Parameters

      A Parameter is a value, usually a numerical value, that describes a Population. A Parameter may be obtained from a single measurement, or it may be derived from a set of measurements from the Population.

      Statistics

      A Statistic is a value, usually a numerical value, that describes a Sample. A Statistic may be obtained from a single measurement, or it may be derived from a set of measurements from the Sample.

      Here are some "statistics" computed from our sample of data:

      [Image of the computed statistics not available.]

      Data

      Data (plural) are measurements or observations. A data set is a collection of measurements or observations. A datum (singular) is a single measurement or observation and is commonly called a data-value, a score, or a raw score.

      Descriptive Statistics

      Descriptive Statistics are statistical procedures used to summarize, organize and simplify data. It is also the branch of statistical activity focusing on the use of such procedures. These procedures are the focus of chapters 1 through 5.
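      As a minimal sketch of what "summarize, organize and simplify" can look like in practice, the Python snippet below reduces a handful of scores to a few summary values. The GPA scores are made up for illustration and are not the class data; the choice of summaries (count, mean, median, range) is also an assumption, not a list taken from these notes.

```python
import statistics

# Hypothetical GPA scores for a small sample (made-up values, not the class data).
gpas = [2.8, 3.1, 3.4, 3.4, 3.9]

# A few descriptive statistics that condense the raw scores.
summary = {
    "n": len(gpas),                       # how many scores there are
    "mean": statistics.mean(gpas),        # arithmetic average
    "median": statistics.median(gpas),    # middle score when sorted
    "min": min(gpas),                     # smallest score
    "max": max(gpas),                     # largest score
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

      Five raw scores are replaced by a short summary that is easier to read at a glance, which is the point of descriptive procedures.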

      Statistical Visualization

      Statistical visualization refers to recently developed computational statistical procedures used to visually summarize, organize and simplify data. The statistical system we are using is named ViSta, for "Visual Statistics", because it includes statistical visualization.

      A statistical visualization of our data is shown below. It shows the relationship between GPA and Satisfaction with the UNC experience. Higher satisfaction is associated with higher GPA.

      [Image: visualization of GPA against Satisfaction with the UNC experience, not available.]

      Exploratory Statistics

      Exploratory statistics is the process of exploring data by using descriptive and visualization methods to "see what the data seem to say" (Tukey, 19??). It is also the branch of statistics that focuses on this activity.

      Inferential Statistics

      Inferential Statistics consist of techniques that allow us to study samples and then to make generalizations about the populations from which the samples were selected. It is also the branch of statistical activity focusing on the use of such procedures. These procedures are the focus of chapters 8 through the remainder of the text. The groundwork for statistical inference is laid in chapters 6 and 7.

      Sampling Error

      Sampling error is the discrepancy, or amount of error, that exists between a sample statistic and the corresponding population parameter.
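      The definition can be made concrete with a small simulation. This is only a sketch under stated assumptions: the population is simulated (1,000 scores drawn with `random.gauss`), the sample size of 25 is arbitrary, and the mean is used as the example parameter and statistic.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population of 1,000 scores (simulated, not real data).
population = [random.gauss(100, 15) for _ in range(1000)]
parameter = statistics.mean(population)    # population mean: a parameter

# Select a sample of 25 individuals from the population.
sample = random.sample(population, 25)
statistic = statistics.mean(sample)        # sample mean: a statistic

# Sampling error: discrepancy between the statistic and the parameter.
sampling_error = statistic - parameter

print(f"parameter (population mean): {parameter:.2f}")
print(f"statistic (sample mean):     {statistic:.2f}")
print(f"sampling error:              {sampling_error:.2f}")
```

      Drawing a different sample (for example, by changing the seed) would generally give a different statistic and therefore a different sampling error, while the parameter stays fixed.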

      The Scientific Method and the Design of Experiments

      Science attempts to discover orderliness in the universe - to discover regularity in changes. Something that can change is called a variable.

      Variables

      A variable is a characteristic or condition that changes or has different values for different individuals. In the data we gathered, the variables include "Gender", "Age", etc.

      A constant is a characteristic or condition that does not vary, and is the same for every individual.

      The Correlational Method

      The correlational method is the scientific method in which two (or more) variables are observed without manipulation (i.e., as they exist naturally) to see if there is any relationship between them.

      The correlational method cannot establish cause-and-effect: Correlation is not causation!

      The data we gathered are an example of the correlational method. We can say that "Higher satisfaction is associated with higher GPA", but we can't say that "Higher GPA causes higher satisfaction" (or the converse).
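      One common way to put a number on "is associated with" is the Pearson correlation coefficient. These notes have not introduced that statistic, so the sketch below is an assumption brought in for illustration, and the satisfaction and GPA values are invented rather than taken from the class survey.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations, and sums of squared deviations.
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sp / math.sqrt(ss_x * ss_y)

# Hypothetical observations: both variables are measured, neither is manipulated.
satisfaction = [2, 3, 3, 4, 5]
gpa = [2.5, 2.9, 3.2, 3.4, 3.8]

r = pearson_r(satisfaction, gpa)
print(f"r = {r:.2f}")  # positive r: higher satisfaction goes with higher GPA
```

      A positive value describes the association only. Because nothing was manipulated, it says nothing about which variable, if either, causes the other.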

      The Experimental Method

      The experimental method is the scientific method which can establish a cause-and-effect relationship between two (or more) variables. Some important points:
      1. The researcher manipulates one variable and observes what happens on the other. More than one variable may be manipulated or observed.
      2. To correctly establish cause-and-effect, the researcher must exercise some control over the experimental situation to ensure that some other variable(s) do(es) not influence the relationship being watched.
      3. Random assignment can be used to keep other variables from systematically influencing the results.
      4. The experimental conditions must be identical, other than differing on values of the manipulated variable.
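      Random assignment, mentioned in point 3 above, can be sketched as follows. The function name `randomly_assign`, the participant labels, and the two condition names are hypothetical; this is one simple way to do it (shuffle, then deal into groups), not a procedure prescribed by these notes.

```python
import random

def randomly_assign(participants, conditions, seed=None):
    """Shuffle participants, then deal them into conditions in turn."""
    rng = random.Random(seed)
    shuffled = list(participants)   # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    groups = {condition: [] for condition in conditions}
    for i, person in enumerate(shuffled):
        # Dealing in rotation keeps the group sizes as equal as possible.
        groups[conditions[i % len(conditions)]].append(person)
    return groups

# Hypothetical participant labels P1..P8.
participants = [f"P{i}" for i in range(1, 9)]
groups = randomly_assign(participants, ["control", "experimental"], seed=42)

for condition, members in groups.items():
    print(condition, members)
```

      Because chance alone decides who lands in which group, characteristics of the participants are unlikely to vary systematically with the manipulated variable.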

      Independent Variable (also called the predictor variable)

      The variable which is manipulated by the researcher.

      Dependent Variable (also called the response variable)

      The variable which is observed by the researcher for changes in order to assess the effect of the treatment. (The treatment is the manipulation of the predictor variable.)

      Confounding Variable

      An uncontrolled variable that is unintentionally allowed to vary systematically with the independent variable. It confounds the results (bad, bad, bad!).

      The control group

      This is a condition of the independent variable that does not receive the experimental treatment. Usually, the control group receives either no treatment or a placebo treatment.

      The experimental group

      This is a condition of the independent variable that does receive an experimental treatment. There may be several experimental groups.

      The Quasi-Experimental Method

      The quasi-experimental method examines differences between pre-existing groups of subjects (such as men vs. women) or differences between groups of scores obtained at different times (before and after treatment).

      Hypotheses

      A hypothesis is a prediction about the outcome of an experiment. In experimental research, a hypothesis makes a prediction about how the manipulation of the independent (predictor) variable will affect the dependent (response) variable.

    Measurement

    Data are measurements of observations which involve categorizing, ordering or using numbers to characterize amount. Several levels of measurement are involved. These in turn determine what statistics can be computed. Measurements may also be discrete or continuous.

        Scales (Levels) of Measurement

        Nominal

        The nominal level of measurement labels observations so that they fall into different categories. Football jersey numbers and home street addresses are common examples.

        In ViSta, nominal variables are called "Category" variables.

        Ordinal

        The ordinal level of measurement consists of categories that are ordered in a sequence. Order of finish in a race is a common example.

        In ViSta, ordinal variables are called "Ordinal" variables.

        Interval

        The interval level of measurement consists of ordered categories where all of the categories are intervals of exactly the same size. Temperature in degrees Fahrenheit or Celsius is a common example. Here, equal differences between numbers reflect equal differences in magnitude of the observed variable.

        Ratio

        The ratio level of measurement is an interval scale with an absolute zero point. Length and weight are common examples. Here, ratios of numbers reflect ratios of variable magnitude.

        In ViSta, interval and ratio variables are called "Numeric" variables.
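        The idea that the level of measurement determines what statistics can be computed can be illustrated with a short Python sketch. All values are hypothetical, and the pairing of each level with one statistic (mode, median, mean, ratio) is a conventional choice assumed here, not one stated in these notes.

```python
import statistics

# Hypothetical values for illustration only.
jersey_numbers = [12, 7, 12, 23, 7, 12]     # nominal: numbers act as labels
finish_places = [1, 2, 3, 4, 5]             # ordinal: only the order matters
temperatures_f = [68.0, 70.5, 71.0, 75.5]   # interval: equal-sized units, no true zero
weights_kg = [50.0, 60.0, 75.0]             # ratio: equal units and an absolute zero

# Nominal: counting how often each category occurs is meaningful.
most_common_jersey = statistics.mode(jersey_numbers)

# Ordinal: the middle position in the ordering is meaningful.
middle_place = statistics.median(finish_places)

# Interval: differences are meaningful, so an average is too.
mean_temperature = statistics.mean(temperatures_f)

# Ratio: "1.5 times as heavy" is a meaningful statement.
weight_ratio = weights_kg[2] / weights_kg[0]

print(most_common_jersey, middle_place, mean_temperature, weight_ratio)
```

        By contrast, averaging the jersey numbers would run without error but would not describe anything, because the numbers are only labels.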

        Discrete and Continuous Variables

        Discrete

        A discrete variable has separate, indivisible categories. No values can exist in between two neighboring categories.

        Continuous

        A continuous variable has an infinite number of possible values falling between any two observed values.

      Mathematical Notation

      In statistical calculations you will constantly be required to add a set of values to find a specific total. We use algebraic expressions to represent the values being added. For example,
      X means "scores on a variable".
      For example, X = [1 2 3] refers to a variable with three observations, which are 1, 2, and 3.
      We will use the Greek letter Sigma to signify the summation process. Thus, for X = [1 2 3], we write
      $\Sigma X = 1 + 2 + 3 = 6$

      Note that
      1. All calculations within parentheses are done first.
      2. Squaring, multiplying, and dividing are done second, and should be completed in order from left to right.
      3. Adding and subtracting (including summation) are third, and should be completed in order from left to right.

      The following term, which is called the "squared sum", works as shown for X = [1 2 3]:

      $(\Sigma X)^2 = (1 + 2 + 3)^2 = 6^2 = 36$

      Because of the order of operations, the following term, which is called "the sum of squares", works as shown:

      $\Sigma X^2 = 1^2 + 2^2 + 3^2 = 1 + 4 + 9 = 14$
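      The difference between the squared sum and the sum of squares can be checked with a few lines of Python, using the X = [1 2 3] example from above. The variable names are chosen for this sketch only.

```python
# Scores on the variable X, from the example X = [1 2 3].
X = [1, 2, 3]

# Sum of the scores: add everything up.
sum_x = sum(X)                            # 1 + 2 + 3 = 6

# Squared sum: parentheses first, so add the scores, then square the total.
squared_sum = sum(X) ** 2                 # 6 squared = 36

# Sum of squares: squaring comes before summation, so square each score, then add.
sum_of_squares = sum(x ** 2 for x in X)   # 1 + 4 + 9 = 14

print(sum_x, squared_sum, sum_of_squares)
```

      The two results differ (36 versus 14) only because of the order in which squaring and summing are carried out.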

      Consider how the following summation equation works:

      [Equation image not available.]

      On the other hand, the next summation equation works differently:

      [Equation image not available.]

      Finally, consider how this last summation equation works:

      [Equation image not available.]

    Which branch of statistics is concerned with drawing conclusions about a population from a sample?

    Inferential statistics: the branch of statistics that analyzes sample data to reach conclusions about a population.

    What are numbers that summarize a sample of data called?

    Statistics are numbers that summarize data from a sample, i.e. some subset of the entire population.

    What are the branches of statistics?

    Statistics is commonly divided into three branches: data collection, descriptive statistics and inferential statistics.

    What branch of statistics organizes and summarizes data?

    The branch of statistics that involves organizing, displaying, and describing data is descriptive statistics. Descriptive statistics aims at describing the features of the data and providing summaries of the data collected, whether from a sample or from a whole population.