John Akwei, ECMp ERMp Data Scientist

### This document contains examples of the Predictive Analytics capabilities of ContextBase, http://contextbase.github.io.

### Predictive Analytics Example 1: Linear Regression

Linear Regression allows for the prediction of future values of one response variable from one explanatory variable.

cat("The Intercept =", model$coefficients[1])

## The Intercept = 15.21788

### Example 1 — Linear Regression Conclusion:

cat("For a Price Index of ", as.character(test), ", the predicted Market Potential = ", round(result, 2), ".", sep="")

## For a Price Index of 4.57592, the predicted Market Potential = 13.03.

**In conclusion to ContextBase Predictive Analytics Example 1, a correlation between Price Index and Market Potential was found (see above graph). As a test of the Predictive Algorithm, a Price Index of 4.57592 was processed, and a Market Potential of 13.03 was predicted. The source R dataset shows this prediction to be accurate.**
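The workflow behind Example 1 can be sketched roughly as follows. This is a minimal sketch, not the original analysis code: it assumes that the "source R dataset" is R's built-in `freeny` data (which has `price.index` and `market.potential` columns), and it reuses the names `model`, `test`, and `result` from the `cat()` calls above.

```r
# Assumption: the source dataset is the built-in `freeny` data frame.
data(freeny)

# One explanatory variable (price.index), one response variable (market.potential).
model <- lm(market.potential ~ price.index, data = freeny)

# Intercept and slope of the fitted line.
intercept <- unname(coef(model)[1])
slope     <- unname(coef(model)[2])
cat("The Intercept =", intercept, "\n")

# Predict Market Potential for the Price Index tested in the text.
test   <- 4.57592
result <- unname(predict(model, newdata = data.frame(price.index = test)))

cat("For a Price Index of ", as.character(test),
    ", the predicted Market Potential = ", round(result, 2), ".\n", sep = "")
```

`predict()` evaluates intercept + slope * test, so the same value can be checked by hand from the two coefficients.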

### Predictive Analytics Example 2: Logistic Regression

Logistic Regression allows for the prediction of a logical (Yes or No) occurrence, based on the effect of an explanatory variable on a response variable. For example, the probability of winning a congressional election as a function of campaign expenditures.

How does the amount of money spent on a campaign affect the probability that the candidate will win the election?

**Source of Data Ranges: https://www.washingtonpost.com/news/the-fix/wp/2014/04/04/think-money-doesnt-matter-in-elections-this-chart-says-youre-wrong/**

**The logistic regression analysis gives the following output:**

model <- ...

model$coefficients

## (Intercept) Expenditures

## -7.615054e+00 4.098080e-06

The positive Expenditures coefficient indicates that the probability of winning the election rises with campaign expenditures.

The output provides the coefficients for Intercept = -7.615054e+00, and Expenditures = 4.098080e-06. These coefficients are entered in the logistic regression equation to estimate the probability of winning the election:

Probability of winning election = 1/(1+exp(-(-7.615054e+00+4.098080e-06*CampaignExpenses)))
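As an illustration, the equation above can be wrapped in a small R function. This is a sketch only: the helper name `win_probability` is hypothetical, and the two coefficients are copied from the model output rather than taken from a fitted model object.

```r
# Coefficients copied from the logistic regression output.
intercept <- -7.615054e+00
slope     <- 4.098080e-06

# Hypothetical helper: probability of winning for a given expenditure.
win_probability <- function(campaign_expenses) {
  1 / (1 + exp(-(intercept + slope * campaign_expenses)))
}

# plogis() is base R's logistic function and should agree with the manual formula.
p_manual  <- win_probability(1600000)
p_builtin <- plogis(intercept + slope * 1600000)
print(round(c(manual = p_manual, builtin = p_builtin), 2))
```

Because the function is vectorized, it can also be applied to a whole vector of expenditure levels at once.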

**For a Candidate that has $1,600,000 in expenditures:**

CampaignExpenses <- 1600000
ProbabilityOfWinningElection <- 1/(1+exp(-(-7.615054e+00+4.098080e-06*CampaignExpenses)))
cat("Probability of winning Election = 1/(1+exp(-(-7.615054e+00+4.098080e-06*",
CampaignExpenses, "))) = ", round(ProbabilityOfWinningElection, 2), ".", sep="")

## Probability of winning Election = 1/(1+exp(-(-7.615054e+00+4.098080e-06*1600000))) = 0.26.

**For a Candidate that has $2,100,000 in expenditures:**

CampaignExpenses <- 2100000
ProbabilityOfWinningElection <- 1/(1+exp(-(-7.615054e+00+4.098080e-06*CampaignExpenses)))
cat("Probability of winning Election = 1/(1+exp(-(-7.615054e+00+4.098080e-06*",
CampaignExpenses, "))) = ", round(ProbabilityOfWinningElection, 2), ".", sep="")

## Probability of winning Election = 1/(1+exp(-(-7.615054e+00+4.098080e-06*2100000))) = 0.73.
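The two results above fall on either side of 0.5, so it may help to see where the model crosses even odds. The logistic equation gives a probability of exactly 0.5 when the linear term is zero, that is, when CampaignExpenses = -Intercept / Expenditures. The sketch below works this out from the printed coefficients; the variable name `break_even` is an illustrative assumption.

```r
# Coefficients copied from the logistic regression output.
intercept <- -7.615054e+00
slope     <- 4.098080e-06

# The probability equals 0.5 when intercept + slope * expenses = 0.
break_even <- -intercept / slope

# Confirm by plugging the value back into the logistic equation.
p_at_break_even <- 1 / (1 + exp(-(intercept + slope * break_even)))

cat("Expenditure at which the predicted probability is 0.5: ",
    format(round(break_even), big.mark = ","), "\n", sep = "")
```

Under this model the crossover sits at roughly $1.86 million, between the two expenditure levels tested above.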

### Example 2 — Logistic Regression Conclusion:

ElectionWinTable <- data.frame(column1=c(1100000, 1400000, 1700000, 1900000, 2300000),
column2=
c(round(1/(1+exp(-(-7.615054e+00+4.098080e-06*1100000))), 2),
round(1/(1+exp(-(-7.615054e+00+4.098080e-06*1400000))), 2),
round(1/(1+exp(-(-7.615054e+00+4.098080e-06*1700000))), 2),
round(1/(1+exp(-(-7.615054e+00+4.098080e-06*1900000))), 2),
round(1/(1+exp(-(-7.615054e+00+4.098080e-06*2300000))), 2)))

names(ElectionWinTable)

### In conclusion to ContextBase Predictive Analytics Example 2, a direct correlation of Campaign Expenditures to Election Performance was verified. The above table displays the probabilities of winning an election that correspond to a range of campaign expenses.
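For reference, the table of probabilities can be rebuilt in a more compact, vectorized form. This is a sketch under stated assumptions: the column names `CampaignExpenses` and `ProbabilityOfWinning` are hypothetical, and the coefficients are the ones printed in the logistic regression output.

```r
# Coefficients copied from the logistic regression output.
intercept <- -7.615054e+00
slope     <- 4.098080e-06

# The five expenditure levels used in Example 2.
expenses <- c(1100000, 1400000, 1700000, 1900000, 2300000)

# plogis() applies the logistic function to the whole vector at once.
ElectionWinTable <- data.frame(
  CampaignExpenses     = expenses,
  ProbabilityOfWinning = round(plogis(intercept + slope * expenses), 2)
)

print(ElectionWinTable)
```

With these coefficients the five rounded probabilities come out as 0.04, 0.13, 0.34, 0.54, and 0.86, rising steadily with expenditure.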

### Predictive Analytics Example 3: Multiple Regression

Multiple Regression allows for the prediction of the future values of a response variable, based on values of multiple explanatory variables.

## Call:

## lm(formula = Life_Exp ~ Population + Income + Illiteracy, data = input)

##

## Coefficients:

## (Intercept) Population Income Illiteracy

## 7.120e+01 -1.024e-05 2.477e-04 -1.179e+00
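A model call of this shape could be reproduced along the following lines. This is a sketch, not the original code: it assumes that `input` was derived from R's built-in `state.x77` matrix, with the "Life Exp" column renamed to `Life_Exp` to match the formula in the printed call.

```r
# Assumption: `input` comes from the built-in state.x77 matrix.
input <- as.data.frame(
  state.x77[, c("Life Exp", "Population", "Income", "Illiteracy")]
)

# Rename so the response matches the formula shown in the Call.
names(input) <- c("Life_Exp", "Population", "Income", "Illiteracy")

# One response variable, three explanatory variables.
model <- lm(Life_Exp ~ Population + Income + Illiteracy, data = input)
print(model)

# The intercept is the first coefficient of the fitted model.
a <- unname(coef(model)[1])
cat("The Multiple Regression Intercept = ", a, ".\n", sep = "")
```

Each remaining coefficient is the estimated change in Life Expectancy per unit change in that variable, holding the other two fixed.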

a <- ...

cat("The Multiple Regression Intercept = ", a, ".", sep="")

## The Multiple Regression Intercept = 71.2023.

### Multiple Regression Conclusion:

Y = a + popl * XPopulation + Incm * XIncome + Illt * XIlliteracy

cat("For a City where Population = ", popl, ", Income = ", Incm, ", and Illiteracy = ", Illt, ",
the predicted Life Expectancy is: ", round(Y, 2), ".", sep="")

## For a City where Population = 3100, Income = 5348, and Illiteracy = 1.1,

## the predicted Life Expectancy is: 71.2.

### In conclusion to ContextBase Predictive Analytics Example 3, the variables “Population”, “Income”, and “Illiteracy” were used to predict the “Life Expectancy” of an area corresponding to a U.S. state. For an area with a Population of 3100, a per capita Income of 5348, and an Illiteracy Rate of 1.1, a Life Expectancy of 71.2 years was predicted.
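The arithmetic behind this prediction can be checked by hand. The sketch below assumes that `XPopulation`, `XIncome`, and `XIlliteracy` hold the fitted coefficients, and it uses the rounded values printed in the model output rather than the full-precision coefficients of a fitted model.

```r
# Rounded coefficients copied from the printed model output.
a           <- 71.2023     # intercept
XPopulation <- -1.024e-05
XIncome     <-  2.477e-04
XIlliteracy <- -1.179e+00

# Input values from the conclusion.
popl <- 3100
Incm <- 5348
Illt <- 1.1

# Linear predictor: intercept plus each coefficient times its input value.
Y <- a + popl * XPopulation + Incm * XIncome + Illt * XIlliteracy

cat("Predicted Life Expectancy: ", round(Y, 2), "\n", sep = "")
```

The Income and Illiteracy terms (about +1.32 and -1.30) nearly cancel, and the Population term is small, which is why the prediction lands so close to the intercept.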