Basic newsroom math and statistics

Think of math in the context of stories…

Math is useful for:
– Going beyond anecdotes.
– Ensuring accuracy and credibility of anecdotes.
– Finding the numbers that will lead to the best anecdotes.

1. Fraction to decimal and percent – Because percents are easier to understand than fractions.

Formula: Divide top number by bottom number
Then multiply result by 100

Example: 
Five-eighths
5 / 8 = .625
.625 * 100 = 62.5%
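
For anyone who wants to check the arithmetic in code, here is a minimal Python sketch of the same two steps. The function name `fraction_to_percent` is made up for illustration.

```python
def fraction_to_percent(top, bottom):
    """Divide the top number by the bottom number, then multiply by 100."""
    decimal = top / bottom
    return decimal * 100


# Five-eighths, as in the example above
print(fraction_to_percent(5, 8))  # 62.5
```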

2. Compare two numbers using percent difference – To see how much more/less one number is than another.

Formula: ((X / Y) – 1) * 100 = the percent that X is MORE (positive result) OR LESS (negative result) THAN Y

Example:
10 and 17

(10 / 17) = .5882
.5882 – 1 = -.4118
-.4118 * 100 = -41.18

10 is 41% less than 17
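
The same comparison can be sketched in Python. The function name `percent_difference` is an illustrative assumption; note that the subtraction happens before the multiplication by 100.

```python
def percent_difference(x, y):
    """Return how much more (positive) or less (negative) x is than y, in percent."""
    return (x / y - 1) * 100


# 10 compared with 17, as in the example above
print(round(percent_difference(10, 17), 2))  # -41.18
```

A negative result means the first number is less than the second.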

Your turn:
Compare the pay of two employees by percent. Lisa makes $14 an hour. Joe makes $9 an hour. Lisa makes how much more than Joe (in percent)?

3. Percentage change – Comparing a new number to an old number.

Formula: (NEW minus OLD) divided by OLD
Then multiply the result by 100 and put % on it

Example:
50 murders in 2014
40 murders in 2013
50 – 40 = 10
10 / 40 = .25
.25 X 100 = 25
25% increase in murders

or the reverse

40 murders in 2014
50 murders in 2013
40 – 50 = -10
-10 / 50 = -.2
-.2 X 100 = -20
20% decrease in murders
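
A short Python sketch of the percentage change formula, using the murder counts from the two examples. The function name `percent_change` is made up for illustration.

```python
def percent_change(old, new):
    """(NEW minus OLD) divided by OLD, then multiplied by 100."""
    return (new - old) / old * 100


# 40 murders in 2013, 50 in 2014: an increase
print(round(percent_change(old=40, new=50), 1))

# 50 murders in 2013, 40 in 2014: a decrease (negative result)
print(round(percent_change(old=50, new=40), 1))
```

Always divide by the OLD number. Dividing by the new number is a common mistake and gives a different answer.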

Your turn:
In 2013, there were 342 homes sold in Thrillsville. In 2014, there were 432 homes sold. How much did home sales increase in the last year?

4. Rates – Allow you to compare places of different sizes.

Formula: EVENTS divided by POPULATION multiplied by PER UNIT
Common PER UNITS are 100,000, 10,000, 1,000, 100 or 10

The phrase "per capita" means per person, so PER UNIT = 1.

Example:
Compare the murder rates of two cities:

City 1 of 150,000 people with 25 murders

City 2 of 75,000 people with 20 murders.

City 1:
25 / 150,000 = .00016667
.00016667 * 10,000 = 1.7 murders per 10,000 residents

City 2:
20 / 75,000 = .00026667
.00026667 * 10,000 = 2.7 murders per 10,000 residents
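
Here is a minimal Python sketch of the rate formula applied to the two cities. The function name `rate` and its default of 10,000 for the PER UNIT are illustrative choices.

```python
def rate(events, population, per_unit=10_000):
    """EVENTS divided by POPULATION, multiplied by PER UNIT."""
    return events / population * per_unit


city_1 = rate(events=25, population=150_000)
city_2 = rate(events=20, population=75_000)

print(round(city_1, 1))  # 1.7 murders per 10,000 residents
print(round(city_2, 1))  # 2.7 murders per 10,000 residents
```

City 2 has fewer murders but a higher rate, which is why rates matter when comparing places of different sizes.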

Your turn:
Find the arson rate (per 1,000 people) for each of the following:

  • Maplewood – Population 23,867 Arsons 51
  • Mount Holly – Population 9,536 Arsons 15
  • North Brunswick – Population 40,742 Arsons 42

5. Mean, median, mode and outliers – Where is the center or middle of the data?

Mean or Average: Total of the values, divided by the number of those values.

Median: The middle value of an ordered list.

Mode: The most common value.

Outliers: Atypical values far from the average.
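
These measures can be computed with Python's built-in `statistics` module. The sketch below uses a small made-up list of values (not data from this lesson) in which one value is far larger than the rest.

```python
import statistics

# Hypothetical values for illustration only; 30 is far from the others
values = [4, 5, 5, 6, 30]

mean = statistics.mean(values)      # total (50) divided by count (5)
median = statistics.median(values)  # middle value of the ordered list
mode = statistics.mode(values)      # most common value

print(mean, median, mode)
```

The outlier pulls the mean (10) well above the median (5). When the two differ that much, the median is usually the better description of a typical value.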

Example: 2018 salaries for MLS players

See the story that ESPN did with these numbers

Your turn:
Find the mean, median, mode and outlier for the following. There are ten employees at a business. Pay ranges from $9 an hour to $40 an hour. The employees and their hourly wages are:

Joe $9
Mary $10
Bob $9
Marshall $15
Carrie $25
Alex $14
Jo Jo $40
Elizabeth $9
Bernard $14
Stephan $9

6. Correlation – The relationship between two or more variables in your data.

[Figure: Pearson correlation coefficient (r)]

Positive r: if one variable goes up, the other goes up.
Negative r: if one variable goes up, the other goes down.

Causation – The act or process of causing; the act or agency which produces an effect.

IMPORTANT: Correlation does not imply causation.
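
To see what positive and negative r look like in practice, here is a sketch that computes the standard Pearson correlation coefficient by hand. It assumes the figure above refers to that standard formula. The function name `pearson_r` and the sample lists are made up for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How the two variables move together around their means
    co_movement = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How much each variable moves on its own
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return co_movement / sqrt(spread_x * spread_y)


# Hypothetical data: y rises as x rises, so r is positive
print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))

# Hypothetical data: y falls as x rises, so r is negative
print(pearson_r([1, 2, 3, 4], [8, 6, 4, 2]))
```

An r near 1 or -1 signals a strong relationship, and an r near 0 signals a weak one. Neither says anything about which variable, if either, causes the other.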

 

7. Normal distribution – A symmetric, bell-shaped curve that gives the probability that an observation will fall between any two values. The curve approaches zero on either side.

Normal distribution (Mathisfun.com) – The peak is in the middle, at the mean. The area under the curve covers 100% of the values.
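
Python's `statistics.NormalDist` can show what "the probability of falling between two values" means. In this sketch the mean of 100 and standard deviation of 15 are hypothetical, and the helper name `probability_between` is made up for illustration.

```python
from statistics import NormalDist

# Hypothetical normal distribution: mean 100, standard deviation 15
dist = NormalDist(mu=100, sigma=15)


def probability_between(dist, low, high):
    """Share of the area under the curve that lies between low and high."""
    return dist.cdf(high) - dist.cdf(low)


# Probability that a value falls between 85 and 115
print(round(probability_between(dist, 85, 115), 3))  # 0.683
```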

8. Variability – How data can vary from the center.

Measures of variability:

Maximum and minimum: largest and smallest values.
Range: the distance between the maximum and minimum.
Quartiles: the medians of each half of the ordered list of values.
-Halfway down from the median is the first quartile.
-Halfway up from the median is the third quartile.
Standard deviation: the typical distance of values from the mean (the square root of the average squared distance).
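
The measures above can be sketched in Python as follows. The list of values is hypothetical. Quartile conventions vary between tools; this sketch follows the "median of each half" definition given here and, as an assumption, leaves the median itself out of both halves when the list has an odd number of values.

```python
import statistics

# Hypothetical values for illustration only
values = [1, 3, 5, 7, 9, 11, 13]
ordered = sorted(values)

minimum = ordered[0]
maximum = ordered[-1]
value_range = maximum - minimum

# Split the ordered list into lower and upper halves around the median
half = len(ordered) // 2
lower_half = ordered[:half]
upper_half = ordered[half + len(ordered) % 2:]

first_quartile = statistics.median(lower_half)
third_quartile = statistics.median(upper_half)

# Population standard deviation (treats the list as the whole group)
std_dev = statistics.pstdev(values)

print(minimum, maximum, value_range)
print(first_quartile, third_quartile)
print(std_dev)
```

If the list is a sample drawn from a larger group, `statistics.stdev` is the usual choice instead of `pstdev`.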

9. Standard deviation – Helps determine whether a value is in fact a true outlier.

[Figure: standard deviation diagram]

A value is reliably an outlier if it is more than 3 StdDev from the mean.

Empirical rule:
-68% of values within 1 StdDev of mean
-95% of values within 2 StdDev of mean
-99.7% of values within 3 StdDev of mean

Some variability is expected. Values within 3 StdDev of the mean are considered normal.
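
The 3 StdDev test can be sketched as a small Python check. The function names and the numbers in the example (a mean of 50 and a standard deviation of 10) are hypothetical.

```python
def std_devs_from_mean(value, mean, std_dev):
    """How many standard deviations a value sits from the mean."""
    return abs(value - mean) / std_dev


def is_outlier(value, mean, std_dev, threshold=3):
    """True if the value is more than `threshold` StdDev from the mean."""
    return std_devs_from_mean(value, mean, std_dev) > threshold


# Hypothetical data set with mean 50 and standard deviation 10
print(is_outlier(70, mean=50, std_dev=10))  # 2 StdDev away: False
print(is_outlier(85, mean=50, std_dev=10))  # 3.5 StdDev away: True
```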

Your turn:
Is Messi an outlier? Why or why not?
Is Ronaldo an outlier? Why or why not?

[Figure: chart for the Messi and Ronaldo exercise (morris-feature-messi-31)]

10. Margin of Error – How close the result from a sample is likely (not certain) to be to the number one would get if the whole population had been queried.

The margin of error in a sample is roughly 1 divided by the square root of the number of people in the sample (a rule of thumb at the 95% confidence level).

Or as Robert Niles says, “If a poll has a margin of error of 2.5 percent, that means that if you ran that poll 100 times — asking a different sample of people each time — the overall percentage of people who responded the same way would remain within 2.5 percent of your original result in at least 95 of those 100 polls.”
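
A minimal Python sketch of the rule of thumb above. The function name `margin_of_error` and the sample sizes are hypothetical, and the formula is the quick approximation from this section rather than an exact calculation.

```python
from math import sqrt


def margin_of_error(sample_size):
    """Approximate margin of error: 1 divided by the square root of the sample size."""
    return 1 / sqrt(sample_size)


# Hypothetical poll sizes, with the result shown as a percent
for n in (400, 1600):
    print(n, round(margin_of_error(n) * 100, 1))
```

Quadrupling the sample from 400 to 1,600 people only cuts the margin of error in half, from 5 percent to 2.5 percent.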

Your turn:

[Image: page1]
