This file demonstrates fitting a nonlinear multilevel model of delay discounting when one or two variables are assumed to predict the level of discounting. These examples do not test for interactions.

The Basics - Loading data, summarizing, plotting

library(nlme)
library(lattice)

Be sure that you either place the command and data files in your working directory or CHANGE your working directory (see R menus). To check the current working directory, use “getwd()”.
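For example (a minimal sketch; the path below is hypothetical and should be replaced with your own):

getwd()                                   # print the current working directory
# setwd("C:/Users/me/DiscountingData")    # hypothetical path; edit and uncomment to change it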

myd<-read.csv("Young E1 DelayDiscounting.dat")
summary(myd)
##     Subject        Delay        Indifference.point Sex      VGExperience  
##  01a    :  5   Min.   :   7.0   Min.   :  0        f:113   Min.   : 0.00  
##  01b    :  5   1st Qu.:  30.0   1st Qu.:402        m: 63   1st Qu.: 6.00  
##  01c    :  5   Median :  90.0   Median :812                Median :13.00  
##  02a    :  5   Mean   : 411.9   Mean   :679                Mean   :11.59  
##  02b    :  5   3rd Qu.: 365.0   3rd Qu.:989                3rd Qu.:17.00  
##  02d    :  5   Max.   :1825.0   Max.   :999                Max.   :21.00  
##  (Other):146                                               NA's   :10     
##     Cluster     
##  Min.   :1.000  
##  1st Qu.:1.000  
##  Median :1.000  
##  Mean   :1.561  
##  3rd Qu.:2.000  
##  Max.   :3.000  
##  NA's   :5

Note that if a categorical variable is coded with numbers, R will assume it is continuous. If so, convert it with something like myd$condition<-as.factor(myd$condition) before continuing. We’ll need to do this for Cluster in a later example.
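A quick way to check how R is treating each column before you start (a minimal sketch using base R):

str(myd)                 # shows the class of every column (factor, integer, numeric, ...)
sapply(myd, is.factor)   # TRUE for the columns R is already treating as factors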

This data set contains some very noisy data from a titration procedure. The variables include two between-subjects predictors: sex (categorical) and video game experience (continuous). The data also include missing values for VG experience and some missing delays.

I’ll start with an analysis involving none of the predictors to help me find good starting values.

mynls<-nls(Indifference.point~1000/(1+exp(logk)*Delay), data=myd, start=c(logk=-2))
summary(mynls) # suggests that -6 is a good starting value for logk
## 
## Formula: Indifference.point ~ 1000/(1 + exp(logk) * Delay)
## 
## Parameters:
##      Estimate Std. Error t value Pr(>|t|)    
## logk  -6.2735     0.1864  -33.65   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 357.1 on 175 degrees of freedom
## 
## Number of iterations to convergence: 14 
## Achieved convergence tolerance: 7.572e-06

If you explore various starting values for logk, you may encounter some unhelpful “solutions” (e.g., convergence failures).
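If you do want to probe several starting values systematically, here is one hedged sketch (the grid of candidate values is arbitrary):

for (k0 in c(-8, -6, -4, -2, 0)) {
  fit<-try(nls(Indifference.point~1000/(1+exp(logk)*Delay), data=myd, start=c(logk=k0)), silent=TRUE)
  if (inherits(fit, "try-error")) {
    cat("start logk =", k0, ": did not converge\n")
  } else {
    cat("start logk =", k0, ": estimated logk =", round(coef(fit)[["logk"]], 3), "\n")
  }
}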

Fit the multilevel hyperbolic model.

myr<-nlme(Indifference.point~1000/(1+exp(logk)*Delay), fixed=logk~1, random=logk~1|Subject, data=myd, start=c(logk=-6))
summary(myr)
## Nonlinear mixed-effects model fit by maximum likelihood
##   Model: Indifference.point ~ 1000/(1 + exp(logk) * Delay) 
##  Data: myd 
##       AIC      BIC    logLik
##   2359.89 2369.402 -1176.945
## 
## Random effects:
##  Formula: logk ~ 1 | Subject
##             logk Residual
## StdDev: 2.732835  139.972
## 
## Fixed effects: logk ~ 1 
##          Value Std.Error  DF   t-value p-value
## logk -6.159484 0.4664613 139 -13.20471       0
## 
## Standardized Within-Group Residuals:
##          Min           Q1          Med           Q3          Max 
## -7.019870117 -0.475215699 -0.009234761  0.259459548  2.835331276 
## 
## Number of Observations: 176
## Number of Groups: 37
hist(coef(myr)$logk)

The following highlights the messiness of the original data and the quality of the fits.

xyplot(fitted(myr)+Indifference.point~Delay|Subject, data=myd, type="a")

Let’s explore sex differences.

myrSex<-nlme(Indifference.point~1000/(1+exp(logk)*Delay), fixed=logk~Sex, random=logk~1|Subject, data=myd, start=c(logk=-6, 0))
summary(myrSex)
## Nonlinear mixed-effects model fit by maximum likelihood
##   Model: Indifference.point ~ 1000/(1 + exp(logk) * Delay) 
##  Data: myd 
##        AIC      BIC    logLik
##   2360.296 2372.978 -1176.148
## 
## Random effects:
##  Formula: logk ~ 1 | Subject
##         logk.(Intercept) Residual
## StdDev:          2.66784 139.9943
## 
## Fixed effects: logk ~ Sex 
##                      Value Std.Error  DF    t-value p-value
## logk.(Intercept) -5.729935 0.5671846 138 -10.102417  0.0000
## logk.Sexm        -1.216238 0.9584553 138  -1.268957  0.2066
##  Correlation: 
##           lg.(I)
## logk.Sexm -0.592
## 
## Standardized Within-Group Residuals:
##          Min           Q1          Med           Q3          Max 
## -7.023136979 -0.480374698 -0.006779411  0.277230989  2.834971965 
## 
## Number of Observations: 176
## Number of Groups: 37

Men had shallower discounting (logk for women = -5.73; for men, -5.73 - 1.22 = -6.95), but the difference was not significant (p = .2066).
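The value for men is just the intercept plus the Sexm adjustment; a minimal sketch of that arithmetic using the fitted model:

b<-fixef(myrSex)                          # named vector of fixed-effect estimates
b["logk.(Intercept)"]                     # women: about -5.73
b["logk.(Intercept)"]+b["logk.Sexm"]      # men: about -6.95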

Let’s try a continuous predictor, VG experience.

myrVG<-nlme(Indifference.point~1000/(1+exp(logk)*Delay), fixed=logk~VGExperience, random=logk~1|Subject, data=myd, start=c(logk=-6, 0))
## Error in na.fail.default(structure(list(Indifference.point = c(904L, 800L, : missing values in object

We get an error message: nlme doesn’t like the missing values in this predictor, so I’ll screen them out in the data= statement.

myrVG<-nlme(Indifference.point~1000/(1+exp(logk)*Delay), fixed=logk~VGExperience, random=logk~1|Subject, data=subset(myd, !is.na(VGExperience)), start=c(logk=-6, 0))
summary(myrVG)
## Nonlinear mixed-effects model fit by maximum likelihood
##   Model: Indifference.point ~ 1000/(1 + exp(logk) * Delay) 
##  Data: subset(myd, !is.na(VGExperience)) 
##        AIC      BIC    logLik
##   2233.666 2246.114 -1112.833
## 
## Random effects:
##  Formula: logk ~ 1 | Subject
##         logk.(Intercept) Residual
## StdDev:         2.675625 143.2707
## 
## Fixed effects: logk ~ VGExperience 
##                       Value Std.Error  DF   t-value p-value
## logk.(Intercept)  -7.502613 1.1297299 130 -6.641068  0.0000
## logk.VGExperience  0.124074 0.0873572 130  1.420311  0.1579
##  Correlation: 
##                   lg.(I)
## logk.VGExperience -0.908
## 
## Standardized Within-Group Residuals:
##          Min           Q1          Med           Q3          Max 
## -6.854544215 -0.510969068 -0.003055795  0.246596938  2.769817320 
## 
## Number of Observations: 166
## Number of Groups: 35

Higher values of VG Experience produced steeper discounting (higher logk), but the effect wasn’t significant (p = .1579). Note that, just like in a regression, the intercept here (-7.5) corresponds to the logk when the other predictors are zero (in this case, no video game experience). Let’s plot the relationship to see it.

plot(function(vgexp) -7.502613+.124074*vgexp, 0, 21, xlab="Total VG Experience", ylab="Predicted logk")
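Rather than hard-coding the two estimates, you can pull them from the fitted model (a sketch producing the same plot as above):

b<-fixef(myrVG)        # intercept and VGExperience slope
plot(function(vgexp) b[1]+b[2]*vgexp, 0, 21, xlab="Total VG Experience", ylab="Predicted logk")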

Note that if I try to plot the fits for the model with missing values, xyplot won’t properly align the predictions with the rows. So, I need to screen out the missing values in the plot, too.
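A quick way to see the mismatch (the row counts match the model summaries above):

length(fitted(myrVG))   # 166 fitted values (only rows with VGExperience present)
nrow(myd)               # 176 rows in the full data set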

xyplot(fitted(myrVG)+Indifference.point~Delay|Subject, data=myd, type="a") # with missing values

xyplot(fitted(myrVG)+Indifference.point~Delay|Subject, data=subset(myd, !is.na(VGExperience)), type="a") # without missing values

Let’s use Cluster as a predictor

In the original experiment, I performed a cluster analysis of the types of video game experience that each subject self-reported. I identified three clusters. This is a categorical variable, so…

myd$Cluster<-as.factor(myd$Cluster)

One subject doesn’t have a cluster, so I’ll have to watch for those missing values. The following command shows the default dummy coding R will use for this variable.

contrasts(myd$Cluster)
##   2 3
## 1 0 0
## 2 1 0
## 3 0 1
myrCluster<-nlme(Indifference.point~1000/(1+exp(logk)*Delay), fixed=logk~Cluster, random=logk~1|Subject, data=subset(myd, !is.na(Cluster)), start=c(logk=-6, 0))
## Error in nlme.formula(Indifference.point ~ 1000/(1 + exp(logk) * Delay), : starting values for the 'fixed' component are not the correct length

Starting values aren’t the correct length? Oh, my three-level categorical variable needs two dummy variables to be estimated in addition to the intercept.
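If you’re ever unsure how many fixed-effect starting values a formula requires, the number of columns of its model matrix tells you (a quick sketch):

ncol(model.matrix(~Cluster, data=subset(myd, !is.na(Cluster))))   # 3: intercept plus 2 dummies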

myrCluster<-nlme(Indifference.point~1000/(1+exp(logk)*Delay), fixed=logk~Cluster, random=logk~1|Subject, data=subset(myd, !is.na(Cluster)), start=c(logk=-6, 0, 0))
summary(myrCluster)
## Nonlinear mixed-effects model fit by maximum likelihood
##   Model: Indifference.point ~ 1000/(1 + exp(logk) * Delay) 
##  Data: subset(myd, !is.na(Cluster)) 
##        AIC      BIC    logLik
##   2297.445 2313.153 -1143.723
## 
## Random effects:
##  Formula: logk ~ 1 | Subject
##         logk.(Intercept) Residual
## StdDev:         2.520223 141.9058
## 
## Fixed effects: logk ~ Cluster 
##                      Value Std.Error  DF   t-value p-value
## logk.(Intercept) -5.293952 0.5716334 133 -9.261096  0.0000
## logk.Cluster2    -2.097144 1.0251071 133 -2.045781  0.0427
## logk.Cluster3    -1.272602 1.3077191 133 -0.973146  0.3322
##  Correlation: 
##               lg.(I) lgk.C2
## logk.Cluster2 -0.558       
## logk.Cluster3 -0.437  0.244
## 
## Standardized Within-Group Residuals:
##          Min           Q1          Med           Q3          Max 
## -6.929385690 -0.482972631 -0.005138608  0.267728087  2.796649374 
## 
## Number of Observations: 171
## Number of Groups: 36

The intercept estimates the logk for Cluster 1 (both dummy variables set to 0), the intercept plus the Cluster2 dummy weight estimates the logk for Cluster 2, and the intercept plus the Cluster3 dummy weight estimates the logk for Cluster 3. Although the differences turn out to be unreliable (more on this below), let’s plot them anyway.

xyplot(fitted(myrCluster)+Indifference.point~Delay|Cluster, group=Subject, data=subset(myd, !is.na(Cluster)), type="a")
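As a check on the dummy-coding arithmetic described above, the three cluster-level logk estimates can be computed directly from the fixed effects (a minimal sketch; compare with the emmeans output below):

b<-fixef(myrCluster)
c(Cluster1=unname(b[1]), Cluster2=unname(b[1]+b[2]), Cluster3=unname(b[1]+b[3]))   # about -5.29, -7.39, -6.57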

The second and third p-values correspond to the differences between Clusters 2 and 1 and between Clusters 3 and 1, respectively. The mean and SE for Cluster 1 are in the results summary. To obtain the means and SEs for the second and third clusters, you’ll need the emmeans package. We’ll also obtain pairwise Tukey tests of the group differences, which adjust for the number of comparisons.

library(emmeans)
myEm<-emmeans(myrCluster, "Cluster", param="logk")
myEm
##  Cluster    emmean        SE  df  lower.CL  upper.CL
##  1       -5.293952 0.5665969 133 -6.414658 -4.173245
##  2       -7.391096 0.8434315 133 -9.059371 -5.722821
##  3       -6.566554 1.1658022 133 -8.872465 -4.260642
## 
## Confidence level used: 0.95
pairs(myEm)
##  contrast   estimate       SE  df t.ratio p.value
##  1 - 2     2.0971444 1.016075 133   2.064  0.1014
##  1 - 3     1.2726020 1.296197 133   0.982  0.5896
##  2 - 3    -0.8245423 1.438913 133  -0.573  0.8347
## 
## P value adjustment: tukey method for comparing a family of 3 estimates
plot(myEm, comparisons=TRUE)

The blue bars in the plot are 95% confidence intervals, and the red arrows are for the comparisons among the three clusters. If an arrow from one mean overlaps an arrow from another group, the difference is not significant (based on the default Tukey test).

Note that the myrCluster summary suggested that Cluster 2’s logk (-7.39) was lower than Cluster 1’s (-5.29; p = .0427), but that comparison is no longer significant after the Tukey adjustment (p = .1014). The following command tests the clusters jointly, producing an omnibus F instead of two t values.

anova(myrCluster)
##                  numDF denDF   F-value p-value
## logk.(Intercept)     1   133 187.95857  <.0001
## logk.Cluster         2   133   2.21229  0.1135

…which suggests that there are no reliable differences among the clusters. Note that the anova command here returns Type I (sequential) tests, and this usually isn’t what you want when you have multiple predictors; it suffices here, though. If you want Type III (marginal) tests, you can type anova(myrCluster, type="marginal"), but with a single predictor it doesn’t matter.
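To see both forms side by side for this model (the test of Cluster comes out the same either way because Cluster is the only predictor):

anova(myrCluster)                      # sequential (Type I) tests
anova(myrCluster, type="marginal")     # marginal (Type III-style) tests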

Multiple Predictors

If we wanted to use two of these predictors at the same time and not their interaction, it would work like this:

myrMult<-nlme(Indifference.point~1000/(1+exp(logk)*Delay), fixed=logk~Cluster+Sex, random=logk~1|Subject, data=subset(myd, !is.na(Cluster)), start=c(logk=-6, 0, 0, 0))
# Note that I had to change the starting values: one for the intercept, two for the Cluster dummies, and one for the Sex dummy.
summary(myrMult)
## Nonlinear mixed-effects model fit by maximum likelihood
##   Model: Indifference.point ~ 1000/(1 + exp(logk) * Delay) 
##  Data: subset(myd, !is.na(Cluster)) 
##        AIC      BIC    logLik
##   2299.365 2318.215 -1143.683
## 
## Random effects:
##  Formula: logk ~ 1 | Subject
##         logk.(Intercept) Residual
## StdDev:         2.516245 141.9137
## 
## Fixed effects: logk ~ Cluster + Sex 
##                      Value Std.Error  DF   t-value p-value
## logk.(Intercept) -5.258871  0.584801 132 -8.992583  0.0000
## logk.Cluster2    -1.879830  1.263929 132 -1.487291  0.1393
## logk.Cluster3    -1.022278  1.557415 132 -0.656394  0.5127
## logk.Sexm        -0.359344  1.216038 132 -0.295504  0.7681
##  Correlation: 
##               lg.(I) lgk.C2 lgk.C3
## logk.Cluster2 -0.324              
## logk.Cluster3 -0.249  0.482       
## logk.Sexm     -0.204 -0.584 -0.541
## 
## Standardized Within-Group Residuals:
##          Min           Q1          Med           Q3          Max 
## -6.929794528 -0.480596778 -0.005035028  0.268657463  2.796505616 
## 
## Number of Observations: 171
## Number of Groups: 36

A whole lot of nothing. Still, when multiple predictors are used, I like effect coding (via the “contr.sum” command) because each estimate then holds the other variables at their average, thus estimating marginal means.

contrasts(myd$Sex)=contr.sum(2)
contrasts(myd$Cluster)=contr.sum(3)
myrMult<-nlme(Indifference.point~1000/(1+exp(logk)*Delay), fixed=logk~Cluster+Sex, random=logk~1|Subject, data=subset(myd, !is.na(Cluster)), start=c(logk=-6, 0, 0, 0))
summary(myrMult)
## Nonlinear mixed-effects model fit by maximum likelihood
##   Model: Indifference.point ~ 1000/(1 + exp(logk) * Delay) 
##  Data: subset(myd, !is.na(Cluster)) 
##        AIC      BIC    logLik
##   2299.365 2318.215 -1143.683
## 
## Random effects:
##  Formula: logk ~ 1 | Subject
##         logk.(Intercept) Residual
## StdDev:         2.516245 141.9137
## 
## Fixed effects: logk ~ Cluster + Sex 
##                      Value Std.Error  DF    t-value p-value
## logk.(Intercept) -6.405913 0.5222744 132 -12.265416  0.0000
## logk.Cluster1     0.967369 0.8111045 132   1.192657  0.2351
## logk.Cluster2    -0.912461 0.7468225 132  -1.221791  0.2240
## logk.Sex1         0.179672 0.6080187 132   0.295504  0.7681
##  Correlation: 
##               lg.(I) lgk.C1 lgk.C2
## logk.Cluster1 -0.430              
## logk.Cluster2 -0.055 -0.315       
## logk.Sex1      0.073 -0.649  0.282
## 
## Standardized Within-Group Residuals:
##          Min           Q1          Med           Q3          Max 
## -6.929794533 -0.480596780 -0.005035028  0.268657477  2.796505617 
## 
## Number of Observations: 171
## Number of Groups: 36
anova(myrMult, type="marginal")
##                  numDF denDF   F-value p-value
## logk.(Intercept)     1   132 150.44043  <.0001
## logk.Cluster         2   132   1.10841  0.3331
## logk.Sex             1   132   0.08732  0.7681

Here the intercept, -6.4, corresponds to the estimate for the “average” subject because Sex = 0 now represents halfway between male and female, and Cluster1 = 0 and Cluster2 = 0 represent the average across clusters. To see the effect coding (i.e., which numbers to plug into the resulting regression equation):

contrasts(myd$Sex)
##   [,1]
## f    1
## m   -1
contrasts(myd$Cluster)
##   [,1] [,2]
## 1    1    0
## 2    0    1
## 3   -1   -1
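For example, a hedged sketch of plugging those codes into the fitted equation (coefficient names as they appear in the summary above) for a male subject in Cluster 3:

b<-fixef(myrMult)
# Cluster 3 is coded (-1, -1) and male is coded -1, so the predicted logk is about -6.64:
b["logk.(Intercept)"]-b["logk.Cluster1"]-b["logk.Cluster2"]-b["logk.Sex1"]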

Model Comparison

AIC(myr, myrSex, myrVG, myrCluster, myrMult)
## Warning in AIC.default(myr, myrSex, myrVG, myrCluster, myrMult): models are
## not all fitted to the same number of observations
##            df      AIC
## myr         3 2359.890
## myrSex      4 2360.296
## myrVG       4 2233.666
## myrCluster  5 2297.445
## myrMult     6 2299.365

Note the warning! AIC can only be compared across models fit to the same data, and some of these models had to screen out rows with missing values. Look at the “Number of Observations” in each model summary. Thus, we can compare myr to myrSex, and myrCluster to myrMult. If you wanted to compare all of them, you’d need to refit each model to the same data, either by screening out the same NAs in every “data=” statement or by creating a new data set with those rows removed. I’m lazy, so I’ll do the latter:

newdata<-subset(myd, !is.na(VGExperience)&!is.na(Cluster))

myr<-nlme(Indifference.point~1000/(1+exp(logk)*Delay), fixed=logk~1, random=logk~1|Subject, data=newdata, start=c(logk=-6))
myrSex<-nlme(Indifference.point~1000/(1+exp(logk)*Delay), fixed=logk~Sex, random=logk~1|Subject, data=newdata, start=c(logk=-6, 0))
myrCluster<-nlme(Indifference.point~1000/(1+exp(logk)*Delay), fixed=logk~Cluster, random=logk~1|Subject, data=newdata, start=c(logk=-6, 0, 0))
myrVG<-nlme(Indifference.point~1000/(1+exp(logk)*Delay), fixed=logk~VGExperience, random=logk~1|Subject, data=newdata, start=c(logk=-6, 0))
myrMult<-nlme(Indifference.point~1000/(1+exp(logk)*Delay), fixed=logk~Cluster+Sex, random=logk~1|Subject, data=newdata, start=c(logk=-6, 0, 0, 0))

AIC(myr, myrSex, myrVG, myrCluster, myrMult)
##            df      AIC
## myr         3 2171.769
## myrSex      4 2172.199
## myrVG       4 2172.354
## myrCluster  5 2172.147
## myrMult     6 2174.148

All five provided similar fits, with the simplest model (no predictors) being best (lowest AIC). Note that AIC differences of less than 2 are meaningless, and differences of less than about 5 represent only modest evidence in favor of the model with the lowest AIC. See Wagenmakers & Farrell, 2004, Psychonomic Bulletin & Review.
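If you want to translate these differences into something more interpretable, Akaike weights (the approach described by Wagenmakers & Farrell) are easy to compute. A minimal sketch using the refit models above:

aics<-AIC(myr, myrSex, myrVG, myrCluster, myrMult)
delta<-aics$AIC-min(aics$AIC)                  # AIC difference from the best model
w<-exp(-delta/2)/sum(exp(-delta/2))            # Akaike weights
data.frame(model=rownames(aics), delta=round(delta, 2), weight=round(w, 3))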

Once you’ve identified the best model, refit it to as much of the data as possible if you can (i.e., without screening out NAs for variables that are not part of that model).

None of these examples involved within-subject variables, but they are easily added as random effects. See Young (2016) for the syntax.

Saving coefficients to a file for plotting in other software

If you want to output the coefficients for graphing in other software, the following demonstrates doing so for one of our models.

write.csv(coef(myrSex), "filename.txt")

Note that this file does not include the Sex of each individual, so you need that data to determine when to add the Sex fixed effect adjustment. There is likely an easier way to do this, but the following will work. If you discover an easier way, please email me!

jill<-aggregate(myd$Sex, by=list(myd$Subject), FUN=function(x) unique(x))
jack<-coef(myrSex)
jack$Subject<-rownames(coef(myrSex))
mytemp<- merge(jack, jill, by.x="Subject", by.y="Group.1")
write.csv(mytemp, "filename.txt")
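One possibly simpler route (a sketch that does the same merge using unique() instead of aggregate(), assuming Sex is constant within each subject):

subjSex<-unique(myd[, c("Subject", "Sex")])   # one row per subject
jack<-coef(myrSex)
jack$Subject<-rownames(jack)
write.csv(merge(jack, subjSex, by="Subject"), "filename.txt")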
Updated: 9/22/23