Statistics with a second grade math class. The student must know the different definitions (population, character, range, mode) but also know how to calculate the mean and median of a statistical series.
I. Statistics: vocabulary and definitions.
The study of the vocabulary and statistical concepts of the 2nd grade program will be illustrated by an example.
The example is the statistical series of the mathematics grades obtained by the students of a second grade class during the previous school year (in reality, each grade is the average, rounded to the nearest half point, of the grades of the first 8 math homework assignments completed in that class).
The population of a statistical series is the set of people among whom the survey is conducted.
Specifically, the set to which the question is addressed.
An element of the population is called an individual.
We are conducting a survey of the 35 students in a second grade class.
The population is the set of the 35 students in this second grade class, and each student in the class is an individual.
Characteristic studied or statistical variable:
Grades obtained by students in the second grade during the previous school year.
The values that a character can take are called modalities.
In general, a character can be:
Quantitative when the values are numerical (physical, physiological, sociological, demographic, economic measurements, …)
The character is said to be discrete when it can only take a finite number of numerical values: this is the case of the example studied here.
The character is said to be continuous when it can take any value in an interval, and therefore an infinite number of numerical values: for example, the height of a student is a continuous quantitative character.
In this situation it is convenient to group the values of the character into classes: for example, we can group the heights of the individuals into classes of width 1 cm.
Qualitative when values cannot be ordered or added (blood type, eye color, vote for a candidate).
To make computer or mathematical processing easier, qualitative characters are often converted into quantitative ones by coding their values as numbers.
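The two ideas above (grouping a continuous character into classes, and coding a qualitative character) can be sketched in Python as follows. This is only an illustration: the heights, the eye colors and the helper name `height_class` are hypothetical and do not come from the lesson.

```python
import math
from collections import Counter

# Hypothetical heights in cm (not data from the lesson).
heights = [162.4, 170.9, 170.1, 158.7, 162.8, 175.0]


def height_class(h, width=1):
    """Return the class [low, low + width) that contains the height h."""
    low = math.floor(h / width) * width
    return (low, low + width)


# Number of individuals in each class of width 1 cm.
class_counts = Counter(height_class(h) for h in heights)

# Hypothetical coding of a qualitative character (eye color) by integers.
eye_colors = ["brown", "blue", "brown", "green"]
codes = {color: i for i, color in enumerate(sorted(set(eye_colors)))}
coded = [codes[color] for color in eye_colors]

for (low, high), n in sorted(class_counts.items()):
    print(f"[{low}, {high}[ : {n}")
print(codes, coded)
```

Note that the integer codes are only labels: adding or averaging them has no statistical meaning.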
The number (or count) of a modality is the number of individuals in the population having this value of the character.
The number of the “10.5” modality is 3: this is the number of students with a score of 10.5.
The total number is the number of individuals in the population.
It is the sum of the numbers of each modality.
In the example, the total number is the number of students in the second grade class during the previous year, i.e. 35.
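As a minimal sketch, the number of each modality and the total number can be computed from a raw list of grades. The list below is hypothetical (it is not the class data of the lesson), chosen so that the modality 10.5 has a number of 3.

```python
from collections import Counter

# Hypothetical grades (not the class data from the lesson).
grades = [7, 8, 8, 10.5, 10.5, 10.5, 12, 6.5, 8]

# Number of each modality: modality -> number of individuals.
counts = Counter(grades)

# The total number is the sum of the numbers of each modality.
total = sum(counts.values())

for modality in sorted(counts):
    print(modality, counts[modality])
print("total number:", total)
```

The total number also equals `len(grades)`, since each individual is counted exactly once.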
The statistical series of the numbers is the function which, to each value of the character (modality), associates the number of this modality.
It is most often defined using a table.
See statistical data table below:
As for functions, statistical series can be represented graphically.
Since the character studied in the example is quantitative, the usual graphical representations of numerical functions can be used to illustrate it.
The following graphs have been made with modified data: the scores have been rounded up to the next point.
The statistical series of cumulative numbers is the function that associates to each modality the sum of the numbers of modalities with values less than or equal to this modality.
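The definition above can be sketched in Python with a running sum over the modalities sorted in ascending order. The series used here is hypothetical, not the table of the lesson.

```python
from itertools import accumulate

# Hypothetical series: modality -> number (not the lesson's table).
series = {6.5: 1, 7: 1, 8: 3, 10.5: 3, 12: 1}

# Modalities must be in ascending order before accumulating.
values = sorted(series)

# Cumulative number of a modality: sum of the numbers of all
# modalities less than or equal to it.
cumulative = dict(zip(values, accumulate(series[v] for v in values)))

for v in values:
    print(v, series[v], cumulative[v])
```

The cumulative number of the largest modality is always equal to the total number.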
Example to be carried out as an exercise from the data in the table:
The same type of graphical representations can be made for the cumulative numbers as for the numbers. Let’s see for example the polygon of cumulative numbers, which makes it possible to read graphically a useful statistical indicator: the median,
which corresponds to the value of the character at which the cumulative number reaches 50% of the total number.
Here, 50% of the total number corresponds to a cumulative number of: 35 / 2 = 17.5.
By graphical reading, confirmed by the reading of the table, the median of this statistical series is located around 8: it is the abscissa of the point of the graph whose ordinate is 17.5.
Let’s look at this in more detail with a more precise definition:
>The median of a statistical series is the central value of the ordered series: there are as many individuals below the median as above it. In other words, the modalities lower than the median account for 50% of the total number and the modalities higher than the median account for the other 50%. This is why the median is not very sensitive to extreme values, unlike the mean!
The series of observations is ordered in ascending order.
If the total number in the series is odd (of size: 2n + 1), the median is the value of the term of rank n + 1 in this ordered series.
If the total number of terms in the series is even (of size: 2n), the median is the average of the values of the terms of rank n and n + 1 in this ordered series.
In our example, the total number is odd: 35 = 2 × 17 + 1. The median is therefore the 18th grade in ascending order, i.e. 8.
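The odd/even rule above can be written as a short Python function. This is a sketch: the function name `median` and the test lists are illustrative, and ranks in the lesson start at 1 while Python indexes start at 0.

```python
def median(observations):
    """Median of a list of numbers, following the odd/even rule."""
    ordered = sorted(observations)  # ascending order
    size = len(ordered)
    if size == 0:
        raise ValueError("the series is empty")
    n = size // 2
    if size % 2 == 1:
        # size = 2n + 1: term of rank n + 1, i.e. index n.
        return ordered[n]
    # size = 2n: average of the terms of rank n and n + 1,
    # i.e. indexes n - 1 and n.
    return (ordered[n - 1] + ordered[n]) / 2


print(median([3, 1, 2]))
print(median([4, 1, 3, 2]))
# With 35 observations, the median is the term of rank 18.
print(median(list(range(1, 36))))
```

Python's standard library offers the same computation as `statistics.median`.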
The median of a statistical series is not to be confused with the average of this series!
The median divides the population into two groups of equal size, whereas this is not usually the case for the mean!
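To see the difference between the two indicators, here is a small sketch with hypothetical grades: replacing the highest grade by an extreme value changes the mean but leaves the median unchanged.

```python
from statistics import mean, median

# Hypothetical grades (not data from the lesson).
grades = [7, 8, 8, 9, 10]

# Same series, with the highest grade replaced by an extreme value.
with_extreme = grades[:-1] + [20]

print("mean:  ", mean(grades), "->", mean(with_extreme))
print("median:", median(grades), "->", median(with_extreme))
```

The median stays at the central value of the ordered series, while the mean is pulled toward the extreme value.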
The statistical series of frequencies is the function which, to each value of the character, associates the frequency of this value, that is, its number divided by the total number.
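As an illustration, frequencies can be computed by dividing each number by the total number. The series below is hypothetical, and exact fractions are used (an implementation choice) so that the frequencies add up to exactly 1.

```python
from fractions import Fraction

# Hypothetical series: modality -> number (not the lesson's table).
series = {6.5: 1, 7: 1, 8: 3, 10.5: 3, 12: 1}

total = sum(series.values())

# Frequency of a modality = its number / total number.
frequencies = {v: Fraction(n, total) for v, n in series.items()}

for v in sorted(frequencies):
    print(v, frequencies[v], f"{float(frequencies[v]):.1%}")
```

The sum of all the frequencies is always 1 (that is, 100%).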
Do this as an exercise using data rounded up:
The statistical series of the cumulative frequencies is the function which, to each value of the character, associates the cumulative frequency of this value, that is, the sum of the frequencies of the values less than or equal to it (same method as for the cumulative numbers).
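A possible sketch of the cumulative frequencies, again on a hypothetical series: the frequencies are accumulated over the values sorted in ascending order.

```python
from fractions import Fraction
from itertools import accumulate

# Hypothetical series: modality -> number (not the lesson's table).
series = {6.5: 1, 7: 1, 8: 3, 10.5: 3, 12: 1}

total = sum(series.values())
values = sorted(series)

# Frequencies in ascending order of the values, then running sum.
frequencies = [Fraction(series[v], total) for v in values]
cumulative_frequencies = dict(zip(values, accumulate(frequencies)))

for v in values:
    print(v, cumulative_frequencies[v])
```

The cumulative frequency of the largest value is 1, and the median is reached at the first value whose cumulative frequency is at least 1/2.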
Do this as an exercise using data rounded up.
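If the studied character takes the values $x_1, x_2, \dots, x_p$ with respective numbers $n_1, n_2, \dots, n_p$:
The population has the total number: $N = n_1 + n_2 + \dots + n_p$
The mean of this statistical series is the number $\bar{x}$ defined by: $\bar{x} = \dfrac{n_1 x_1 + n_2 x_2 + \dots + n_p x_p}{N}$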
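The mean of a series given by a table is the sum of each value multiplied by its number, divided by the total number. Here is a minimal Python sketch; the function name `weighted_mean` and the series are hypothetical.

```python
def weighted_mean(series):
    """Mean of a series given as a dict: value -> number."""
    total = sum(series.values())
    if total == 0:
        raise ValueError("the total number is zero")
    # Each value is multiplied by its number before summing.
    return sum(n * x for x, n in series.items()) / total


# Hypothetical series (not the lesson's table).
series = {6.5: 1, 7: 1, 8: 3, 10.5: 3, 12: 1}
print(weighted_mean(series))
```

This gives the same result as averaging the raw list of 9 grades in which each value is repeated as many times as its number.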
1. Calculate the average score using the original data (rounded up to the nearest half point) and then using the data rounded up. Compare the results obtained.
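2. Write and perform again the calculation of the mean with the formula using frequencies: $\bar{x} = f_1 x_1 + f_2 x_2 + \dots + f_p x_p$, where $f_i$ is the frequency of the value $x_i$ (its number divided by the total number).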
3. Compare the mean and median. What do you think about it?
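To check your answers to these exercises, the following sketch may help. It uses a hypothetical series (replace it with the data of the table) and compares the mean computed with the numbers, the mean computed with the frequencies, and the median.

```python
from statistics import median

# Hypothetical series: value -> number (replace with the table's data).
series = {6.5: 1, 7: 1, 8: 3, 10.5: 3, 12: 1}
total = sum(series.values())

# Mean with the numbers: sum of (number * value) / total number.
mean_from_numbers = sum(n * x for x, n in series.items()) / total

# Mean with the frequencies: sum of (frequency * value).
frequencies = {x: n / total for x, n in series.items()}
mean_from_frequencies = sum(f * x for x, f in frequencies.items())

# Rebuild the ordered list of observations to compute the median.
observations = [x for x, n in sorted(series.items()) for _ in range(n)]
med = median(observations)

print("mean (numbers):    ", mean_from_numbers)
print("mean (frequencies):", mean_from_frequencies)
print("median:            ", med)
```

Both formulas give the same mean up to floating-point rounding, while the median can differ from the mean.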