SMAP - Moderation

Frequently asked questions about moderation

  1. What is the difference between moderation and interaction?

    The words ‘moderation’ and ‘interaction’ are often used interchangeably. However, the term moderation can be regarded as the conceptual term for interaction between independent variables, often (but not always) implying causal relationships between the different variables in the model. Accordingly, to avoid confusion, moderation is the term you should use in the introduction and discussion section of your article (for theory building and interpretation of the results), while interaction is the term you should use in the methods section (to describe the effects of interest for your research question) and the results section (when reporting statistical outcomes). 
     

  2. Is moderation analysis a way to prove causality?

    Keep in mind that statistical analyses alone can never establish causality. That is, the significance of the interaction effect tells you nothing about causal effects in your moderation model.

    Drawing causal conclusions always requires a specific study design, typically including an experimental manipulation and experimental control measures, such as random assignment to groups or counterbalancing of orders.

  3. My research question is about moderation. Which analysis should I use?

    Which type of analysis you should choose depends on the measurement level of your independent variables (including the moderator).

    My independent variables, including the moderator, are all categorical (nominal/qualitative)

    Perform an ANOVA (SPSS: Analyze -> General Linear Model). Which type of ANOVA depends on the type of qualitative independent variables (factors) you have. If all of your factors are between-subjects factors, you can run a regular ANOVA. If one or more of the factors in your design are within-subjects factors, you need to run a repeated measures ANOVA. For more information on carrying out the analysis, see below.

    My independent variables, including the moderator, are all continuous (interval/ratio/quantitative)

    Run a multiple regression analysis. You have two options:

    1. Perform the regression analysis yourself as a multiple regression: Analyze -> Regression -> Linear. For more information, see below.
    2. Download and use the PROCESS macro for SPSS to do the regression analysis. PROCESS does some preparatory steps for you and provides additional information on the interaction effects, but you need to familiarize yourself with the procedure. For more information on PROCESS, see below.

    Should I use PROCESS for the moderation analysis (all independent variables being continuous)?

    PROCESS is a macro that makes moderation analysis with regression in SPSS easier. PROCESS provides the usual tests for effects: t-tests and F-tests. It has several advantages over doing it yourself with multiple regression:

    • You do not have to compute your own interaction terms as separate variables. PROCESS will create them for you.
    • PROCESS can center your continuous variables.
    • You can select your model out of various preset models, with one or more moderators. You can even combine moderation and mediation in one model: moderated mediation. To determine which model suits your design, please check the book by Hayes (available as e-book in the library) that belongs to the PROCESS macro.
    • PROCESS makes the interpretation of an interaction effect easier in two ways:
      • You can probe significant interactions, meaning that it gives you the simple effects of the IV on the DV at different values of the moderator.
      • PROCESS also gives data to visualize an interaction effect. This helps you to understand the interaction effect better.

    PROCESS also has some limitations:

    • The number of IVs and interaction effects that you can include in the PROCESS analysis is limited to one X-variable and two moderator variables and their interactions, with a maximum of one three-way-interaction. You can enter more independent variables as covariates (control variables), but these cannot be used to construct interaction terms with. If you only have two IVs (one X, one Moderator), this is no problem. Hint: If you have two independent variables and one moderator, you can reverse them in PROCESS: as one independent variable and two moderators.
    • PROCESS does not save the predicted and residual values, so you cannot easily check whether the assumptions of normality, homoscedasticity (also called homogeneity of variance) and linearity are met. You have several options to deal with this. See below for more information on assumptions.

     

    My independent variables, including the moderator, are a combination of continuous and categorical variables

    You have a choice:

    • An ANCOVA or repeated measures (RM) ANCOVA, with the continuous IV(s) as covariate(s). In an ANCOVA, the categorical IV(s) are between-subjects factors; in an RM ANCOVA, the categorical IV(s) are between-subjects and/or within-subjects factors. You can perform these analyses in SPSS by choosing: Analyze -> General Linear Model etc. For more information, see below.
    • A regression analysis (use of the PROCESS macro is strongly recommended if the covariate is the moderator) if you do not have any within-subjects factors. For more information on PROCESS, see below.

    You can pick the analysis that you are most familiar with (at RU Nijmegen, this will probably be ANCOVA for psychology students and PROCESS for pedagogy students). Both the ANCOVA and the regression-based (PROCESS) approach work equally well and come to the same conclusion, but the output looks different and both approaches come with slightly different additional options.

    If you have more than one categorical independent variable, an ANCOVA is the most logical choice.

    I have an ordinal independent variable or moderator

    In this case we recommend changing the measurement level of this variable. If you have only a few categories, you can treat your IV as if it were categorical. If you have many categories that are more or less equally spaced, you can treat it as continuous. If you have a scale score that consists of the sum or mean of multiple ordinal items, you can also treat this scale score as continuous. See our FAQ answer about Likert scale variables.

  4. I know which analysis I need. How do I continue?

    4.1 I want to test moderation with an ANOVA. What do I need to know?

     

    ANOVA: Carrying it out

    If you run an ANOVA or repeated measures ANOVA in SPSS, the interaction term(s) between your factors will automatically be added. This means that carrying out the statistical procedure is relatively straightforward. General knowledge of ANOVA / RM ANOVA will suffice. When carrying out the analysis, we recommend using the ‘Plots’ button in the menu to request an interaction plot that visualizes the interaction pattern between two (or three) factors. Also request Estimated Marginal Means and select ‘Compare simple main effects’ (available in SPSS 29, not in older versions of SPSS) to get follow-up tests of your interaction effect.

    ANOVA: Understanding the results

    Report all effects: main effects and interaction effect(s).

    If your interaction effect is significant, this means that the effect of your first factor on the DV is not the same for all levels of your second factor (and vice versa); in other words, the differences between the DV-means for the levels of factor 1 are not the same for all levels of your second factor (and vice versa). Please note that factor 2 is often called the moderator. This is still a very abstract description of interaction, and it will remain abstract to your reader unless you explain the interaction effect in terms of your study. To help you explain the interaction effects in terms of your own study, the interaction plot you created will be a helpful tool (but, you might need to do follow-up tests as well).

    So, you will want to report what these differences between the levels of factor 1 actually look like for each level of the second factor, because interaction can take on many different patterns.  For example, you can find an interaction, in which two groups differ statistically from each other on the DV at one value of the moderator but do not differ statistically from each other on the DV at the other value of the moderator. As a concrete example: suppose that for adults (moderator level 1), you find that dog persons are happier (DV) than cat persons, whereas for children (moderator level 2), dog persons and cat persons are equally happy.

    Or you can find an interaction, in which the difference between two groups is in opposite directions for the two levels of your moderator (e.g., at one value of the moderator, group 1 scores significantly higher than group 2, whereas at the other value of the moderator, group 1 scores significantly lower than group 2). Back to the concrete example: maybe for adults, dog persons are happier than cat persons, whereas for children, cat persons are happier than dog persons.

    Yet another example of interaction would be that group 1 scores significantly higher than group 2 for all levels of the moderator (so same direction), but that the difference between the two groups is significantly larger for one level of the moderator than for the other level of the moderator. In the concrete example, this would mean that, for both adults and children, dog persons are happier than cat persons, but that the difference in happiness between dog persons and cat persons is larger for adults than for children.

    The above were examples with factors consisting of only two levels. For some of these claims, you will need to do follow-up tests, since a plot or the means alone will not give you all the information. Looking back at the first concrete example, you would for instance have to confirm that the difference in happiness between dog persons and cat persons is indeed significant for adults and not significant for children.
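    As a rough illustration of the patterns described above, the following Python sketch computes the cell means of a 2 x 2 design and the difference between the two simple differences. The scores and the names `scores`, `diff_adult` and `diff_child` are made up for this example; the sketch is purely descriptive and does not replace the follow-up tests in SPSS.

```python
from statistics import mean

# Hypothetical happiness scores per cell (made-up numbers for illustration only).
scores = {
    ("adult", "dog"): [8, 7, 9],
    ("adult", "cat"): [5, 6, 4],
    ("child", "dog"): [7, 6, 8],
    ("child", "cat"): [7, 8, 6],
}

# Cell means: one mean per combination of moderator level and group.
cell_means = {cell: mean(values) for cell, values in scores.items()}

# Simple difference (dog minus cat) within each level of the moderator.
diff_adult = cell_means[("adult", "dog")] - cell_means[("adult", "cat")]
diff_child = cell_means[("child", "dog")] - cell_means[("child", "cat")]

# The interaction contrast is the difference between the two simple differences.
# A value of 0 would correspond to parallel lines in the interaction plot.
interaction_contrast = diff_adult - diff_child

print(diff_adult, diff_child, interaction_contrast)
```

    In this made-up data set the dog-cat difference exists for adults only, which matches the first pattern. Whether each simple difference is statistically significant still has to be tested.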

    If you have more than two levels, your description obviously needs to be about all those levels (and then might become more difficult and might require more follow-up tests). You need to tell the reader exactly what your interaction effect is about. So, do not only write down that an interaction effect is significant, but write down what it means.

    If your interaction effect is not significant, you do not need to delve any deeper. You can interpret the main effects, and conclude that the main effect of factor 1 on the DV (that is, whether the levels of factor 1 differ on the DV or not) is not significantly different for the levels of factor 2.

    4.2 I want to test moderation with an ANCOVA. What do I need to know?

     

    ANCOVA: Carrying it out

    In most cases, you need to standardize or center your continuous variable(s) before running the ANCOVA. Ideally, you would center on a data file that contains no missing data for any of the variables in the analysis. So, you might first need to create a new data file that contains no missing values on your variables of interest.

    The easiest way to center variables in SPSS is Analyze -> Descriptive Statistics -> Descriptives, and then check the box ‘save standardized scores as variables’. For each variable you get a new variable with z-scores. That means that SPSS does a bit more than actually needed: you only need the mean to be 0, not necessarily also the SD to be 1. But that is fine.

    Then you perform an ANCOVA via the GLM Univariate menu. Enter your categorical variables as factors and your (centered or standardized) continuous variables as covariates.

    SPSS does not automatically add interaction terms with covariates, so you have to build the correct model yourself. Use the Model button. Include the main effects of your factor(s) and covariate(s) and the interaction effect(s) between the factor(s) and the covariate(s). Important! You need to include the main effects of all variables, otherwise the resulting output is incorrect.

    ANCOVA: Understanding the results

    Report all effects: main effects and interaction effect(s).

    If your interaction effect is significant, this means that the strength of the relationship between your covariate and the dependent variable is not the same for all levels of your factor (groups). Now you will want to know exactly what this interaction looks like. For example, there might be a significant relationship between the covariate and the DV in one group but not in the other groups. Or, you might have an interaction pattern in which the relationship between the covariate and the DV is positive in one group but negative in the other group. As a concrete example of the second option: the relationship between extraversion (covariate) and happiness (DV) might be positive for psychology students (moderator level 1), but negative for pedagogy students (moderator level 2). Another possibility is that the relationship between the covariate and the DV is significant and in the same direction in all groups, but stronger in some groups than in others. This is what you need to find out and then, of course, report.

    To illustrate the interaction effect, you could use the Graph function to make a scatterplot (do not use Chart Builder, but directly ‘Scatter/Dot’). Choose a simple scatterplot. In this scatterplot, the dependent variable would be on the y-axis, the covariate (centered or non-centered) would be on the x-axis, and you would use your factor for the ‘Set Markers by’ option. Once you have the plot, you can double-click it to activate it and add regression lines for each group using ‘Add Fit Line at Subgroups’ (please note: if your model contains more variables than these, the lines might differ from what the parameter estimates in the ANCOVA output give you).

    If you find a significant interaction effect, you can rerun the analysis separately for each level of the factor (follow-up tests). You can do this by first using ‘Split file’ on the factor and then rerunning the analysis without the factor in your model. You will then be able to test the effect of the covariate for each of the levels of your factor separately.

    4.3 I want to test moderation with a multiple regression analysis. What do I need to know?

    If you cannot or do not want to use PROCESS, you can do the moderation analysis manually using a multiple regression analysis.

    Multiple regression analysis: carrying it out

    First center all variables that are involved in the interaction: independent variable(s) and moderator(s).

    Ideally, you center on a data file that contains no missing data for any of the variables in the analysis. So, you might first need to create a new data file that contains no missing values on your variables of interest.

    The easiest way to center variables in SPSS is Analyze -> Descriptive Statistics -> Descriptives, and then check the box ‘save standardized scores as variables’. For each variable you get a new variable with z-scores. That means that SPSS does a bit more than actually needed: you only need the mean to be 0, not necessarily also the SD to be 1. But that is fine (and note that if you want standardized results, as described at the end of this section, you do in fact need to standardize the variables).

    Then compute an interaction variable. You do this by multiplying the two centered/standardized variables using the Compute function. Do not standardize this interaction variable.
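    To make the difference between centering and standardizing concrete, here is a minimal Python sketch of both, followed by the product term. The variable names and values are hypothetical, and the z-scores here use the sample SD (n - 1), which is an assumption about how you want to mirror the SPSS z-scores.

```python
from statistics import mean, stdev

def center(values):
    """Subtract the mean so the new variable has mean 0 (SD unchanged)."""
    m = mean(values)
    return [v - m for v in values]

def standardize(values):
    """Convert to z-scores: mean 0 and (sample) SD 1."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

# Hypothetical complete cases (no missing values) for IV and moderator.
extraversion = [2.0, 4.0, 6.0, 8.0]
contacts = [1.0, 3.0, 2.0, 6.0]

x_c = center(extraversion)
w_c = center(contacts)

# Interaction variable: product of the centered variables, left as-is
# (it is not centered or standardized again).
interaction = [x * w for x, w in zip(x_c, w_c)]
print(x_c, w_c, interaction)
```

    Replacing `center` by `standardize` gives the z-score variant; in both cases the product is computed after the transformation and is not transformed again.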

    Now run a regression model with the centered/standardized IVs and the interaction term(s) as predictors. Always include the main effects of the independent variables, even if you are only interested in the interaction effect! You cannot interpret the interaction effect properly without the main effects also present in the model. That means that when you are testing interaction effects, you will always have at least three predictors in the model: the two main effects and the interaction effect.

    Would you like to report standardized regression coefficients? The values SPSS gives you are not entirely correct if you also have an interaction term in the model. The best procedure to obtain standardized coefficients is this:

    • Standardize all your variables (independent and dependent)
    • Compute the interaction term using these variables.
    • Run the regression analysis
    • Look at the unstandardized regression coefficients and interpret these as standardized coefficients.
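    The four steps above can be sketched in Python with NumPy. This is an illustration under stated assumptions, not the SPSS procedure itself: the function names `zscore` and `standardized_moderation_fit` and the simulated data are hypothetical, and ordinary least squares via `numpy.linalg.lstsq` stands in for the SPSS regression.

```python
import numpy as np

def zscore(a):
    """Standardize to mean 0 and sample SD 1."""
    a = np.asarray(a, dtype=float)
    return (a - a.mean()) / a.std(ddof=1)

def standardized_moderation_fit(x, w, y):
    """Z-score X, W and Y, multiply the z-scores of X and W, then fit OLS.

    Returns coefficients in the order: intercept, X, W, X*W.
    """
    zx, zw, zy = zscore(x), zscore(w), zscore(y)
    product = zx * zw  # the product itself is not standardized again
    design = np.column_stack([np.ones_like(zx), zx, zw, product])
    coefs, *_ = np.linalg.lstsq(design, zy, rcond=None)
    return coefs

# Hypothetical simulated data, for illustration only.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
w = rng.normal(size=200)
y = 0.5 * x + 0.3 * w + 0.2 * x * w + rng.normal(size=200)

b0, b_x, b_w, b_xw = standardized_moderation_fit(x, w, y)
print(round(b_x, 3), round(b_w, 3), round(b_xw, 3))
```

    The values printed are the unstandardized coefficients of the regression on standardized variables, which is what the last step tells you to read as standardized coefficients.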

     

    Multiple regression analysis: Understanding the results

    The coefficient of the interaction effect is the most important. This coefficient tells you by how much the effect of independent variable A changes if the value of independent variable B increases by 1. To interpret this interaction coefficient, you also need to look at the effect of independent variable A. Please note that the roles of A and B can be switched and note that the second independent variable is often referred to as the moderator.

    If both the effect of the independent variable and the interaction effect have the same sign (so both positive or both negative), this means that the effect of the independent variable is stronger at higher levels of the moderator. So for example, the effect of extraversion (IV) on happiness (DV) might be positive, and might be more positive with an increasing number of social contacts (moderator).

    If the effect of the independent variable and the interaction effect differ in direction (so one is positive and the other is negative), this means that the effect of the independent variable is weaker at higher levels of the moderator. For example, if the effect of the IV is positive and the interaction effect is negative, this means that the effect of the IV is less positive with higher levels of the moderator. The effect of the IV can even become negative with higher levels of the moderator. For example, the effect of the number of social contacts (IV) on happiness (DV) is positive for low levels of work stress (moderator), and becomes less positive for higher levels of work stress (maybe because with more work stress, you have less time for the many friends you have and therefore you are less happy).
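    The two sign rules above follow from the regression equation Y = b0 + b_x*X + b_w*W + b_interaction*X*W, in which the effect of X at a given moderator value W equals b_x + b_interaction*W. A small Python sketch with hypothetical coefficients (the function name `conditional_effect` is made up for this example):

```python
def conditional_effect(b_x, b_interaction, moderator_value):
    """Effect of X on Y at a given moderator value, for the model
    Y = b0 + b_x*X + b_w*W + b_interaction*X*W."""
    return b_x + b_interaction * moderator_value

# Hypothetical coefficients, with the moderator centered (0 = its mean).
same_sign = [conditional_effect(0.5, 0.25, w) for w in (-1, 0, 1)]
opposite_sign = [conditional_effect(0.5, -0.25, w) for w in (-1, 0, 1, 3)]

print(same_sign)      # [0.25, 0.5, 0.75]
print(opposite_sign)  # [0.75, 0.5, 0.25, -0.25]
```

    With equal signs the effect grows as the moderator increases; with opposite signs it shrinks and, at a high enough moderator value, changes sign.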

    If you have a dichotomous IV (a variable with two levels) to create an interaction term with, you have two options: either you center your dichotomous variable (the easiest way is to score the groups as -0.5 and 0.5), or you score the groups as 0 and 1 (so not 1 and 2 or any other variation). Think about your choice, because it affects the interpretation of the effects of your other variables. If you center the dichotomous variable, the (main) effects of your other variables give you their effect averaged over the two levels of your dichotomous IV. If you use 0 and 1, the effects of your other variables give you their effect at the level of your dichotomous IV that you coded as 0. The interaction term is the same in both cases: it reflects the difference in the effect of the other IV on the DV between the two levels of your dichotomous IV.
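    The consequence of the two coding schemes can be checked on a small noise-free example. The Python sketch below (NumPy; the data and the helper `fit` are hypothetical) builds data in which the slope of X is 2 in one group and 6 in the other, and fits the same model with both codings.

```python
import numpy as np

def fit(x, g, y):
    """OLS for Y = b0 + b_x*X + b_g*G + b_int*X*G; returns [b0, b_x, b_g, b_int]."""
    design = np.column_stack([np.ones_like(x), x, g, x * g])
    coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coefs

# Hypothetical noise-free data: slope of X is 2 in group 0 and 6 in group 1.
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0, -2.0, -1.0, 0.0, 1.0, 2.0])
group = np.array([0.0] * 5 + [1.0] * 5)
y = 1 + 2 * x + 3 * group + 4 * x * group

dummy = fit(x, group, y)           # groups scored 0 and 1
centered = fit(x, group - 0.5, y)  # groups scored -0.5 and 0.5

print(np.round(dummy, 6))
print(np.round(centered, 6))
```

    With 0/1 coding the coefficient of X is the slope in the group coded 0 (here 2); with -0.5/0.5 coding it is the average of the two group slopes (here 4). The interaction coefficient (here 4, the difference between the slopes) is identical in both fits.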

    4.4 I want to test moderation with PROCESS. What do I need to know?

     

    Multiple regression analysis with PROCESS: What is PROCESS and how do I get started?

    PROCESS is a free macro developed by Andrew Hayes and can be used in SPSS. You can download it via this website. By installing a custom dialog file, you add the option PROCESS in the pull-down menu under Analyze -> Regression (when you download the PROCESS material, you will find a document that describes how to install custom dialog files).

    Before you use PROCESS, you really should read the chapter on Moderation in Hayes, A.F. Introduction to Mediation, Moderation, and Conditional Process Analysis: A Regression-Based Approach. The Guilford Press. Refer to the book in your research proposal or thesis.

    The third edition of the book is available as e-book through RUQuest (you might have to login first with your RU-account to get access).

    The first edition is also available as print version in the University library.

    In addition, there is a FAQ about PROCESS by Professor Andrew Hayes.

    Multiple regression analysis with PROCESS: carrying it out

    Variable names. First, make sure that the names of all variables that you want to include in the analysis are at most 8 characters long. You will probably have to shorten some names.

    Building the model. In the most simple model of moderation, there is one independent variable X, one moderator W and one dependent variable Y. Statistically, it doesn’t matter which variable you call X and which one you call W. You need to identify one of your IVs as the X and one of your IVs as the moderator (in the output, you will get both main effects and the interaction between the IVs, as usual).

    In PROCESS, you have to fill in X, W and Y in the correct boxes and choose Model number 1.

    You may also have some more variables that you want to include, but that are not involved in an interaction effect, such as age. You can enter these as covariates.

    Multicategorical. If some of your variables are multicategorical (3 or more categories), make sure all categorical variables contain numbers (for example -0.5 and 0.5, not m and f). Inform PROCESS that this is the case with the button ‘Multicategorical’. You have to select a coding scheme: read the book for details on the different options.

    Options. Click on the ‘Options’ button and activate a few relevant options.

    1. Generate code for visualizing interactions. This generates syntax that you can use to plot your data. Visualizing the interaction effect can help you to understand the pattern of the interaction effect, if it is there. For example, does the effect of X become stronger or weaker for higher values of the moderator? Note that you don’t get this graph immediately. You have to copy a clearly designated part of the output to a new syntax, then select all the syntax that you pasted and run it.
    2. Mean center for construction of products. This is really important if you have not centered the variables yourself! Just as with regular regression analysis, it is often recommended that you center your predictors in advance so that the main effects are easier to interpret. In most cases ‘continuous variables that define products’ is a good choice.
    3. Note that the default for probing interactions is to ‘probe if p < .10 for 16th, 50th and 84th percentile’. You can also change this to ‘-1SD, Mean, and +1SD’. In the literature it seems more common to report the -1SD, mean, +1SD variant, but Hayes (the creator of PROCESS) recommends using the percentiles. If the moderator is perfectly normally distributed, then both options give practically the same results, because the 16th, 50th and 84th percentiles correspond almost exactly to Mean -1SD, Mean, and Mean +1SD respectively. You can use these estimated effects to interpret what the interaction effect looks like. See more below.
    4. Heteroscedasticity-consistent inference: this is a correction for a possible violation of the homoscedasticity assumption. As you cannot test within PROCESS itself if this assumption is met (but outside of PROCESS you still can), it is safest always to correct for it. There are several options: don’t worry about that, just pick one.
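    The correspondence between the percentiles and the SD-based values mentioned under option 3 can be verified with a few lines of Python. The sketch assumes a standard normal moderator (mean 0, SD 1), so the results are directly in SD units; real moderators are rarely perfectly normal, which is when the two options start to differ.

```python
from statistics import NormalDist

# Standard normal moderator: mean 0, SD 1, so values are in SD units.
standard_normal = NormalDist(mu=0.0, sigma=1.0)

# z-values at the percentiles PROCESS uses by default for probing.
z_at_percentiles = {p: standard_normal.inv_cdf(p / 100) for p in (16, 50, 84)}

for p, z in z_at_percentiles.items():
    print(f"{p}th percentile: {z:+.3f} SD from the mean")
```

    The 16th and 84th percentiles land very close to, but not exactly at, one SD below and above the mean.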

    PROCESS does not give standardized regression coefficients in models containing interaction effects. What you can do to get these, however, is standardizing all continuous variables (incl. the DV!) in SPSS before you run the PROCESS analysis. Then the coefficients in the output will be standardized coefficients.

    [Figure: PROCESS options pop-up dialog]

    PROCESS: Understanding the results

    If the interaction effect is significant, the effect of your IV on your DV is not the same in strength and/or direction, but depends on the value of the moderator. For example, this might mean that the effect of the IV on the DV is stronger at higher levels of the moderator. As a concrete example, the effect of extraversion (IV) on happiness (DV) might be positive, and might be more positive with an increasing number of social contacts (moderator).

    In the PROCESS output, the interaction effect is displayed as INT1 (and possibly INT2), and PROCESS tells you below the ANOVA table which variables constitute the interaction term.

    If this interaction effect is significant, PROCESS helps you to understand how the effect of the IV on the DV differs for different values of the moderator in two ways:

    1. Probing the interaction. This is also called simple slopes analysis. With the default settings, if the interaction term has a p-value below .10, PROCESS estimates the effect of X on Y for three values of the moderator: low, average, and high. If you do not get these conditional effects in your output, the p-value of your interaction effect was .10 or higher. Important! You should only report the probed interactions if your interaction effect was significant (the conditional effects are shown if p < .10, but you are most likely using an alpha value of .05).
    2. Syntax to make a graph. If, before running the analysis, you checked the box ‘generate code to visualize interactions’ under options, you don’t get a plot just yet. You get a bit of code in the output that you can copy and paste into a new syntax file and run from there. This gives you a nice plot that is easier to interpret than the raw numbers alone. Tip: if you have pasted the syntax into a new syntax file, then select the syntax before running it to make sure the plot is made based on the data in the syntax and not accidentally on your own data file.

    Methods 1 and 2 are complementary. The graph as described under 2. makes a plot of the three lines as described under 1. The statistical output tells you if effects are significant, and the plot gives a quick overview of what the pattern looks like.

    4.5 PROCESS doesn’t work! What did I do wrong?

    The most commonly made mistakes are these:

    1. You selected the wrong model number within PROCESS, so there is a mismatch between the selected model and the selected variables. Model 1 is for simple moderation (which is conceptually something completely different from mediation!), and PROCESS expects a variable in the field for moderator variable W and nothing in the field for mediator(s) M. Model 2 and Model 3 are appropriate if you have two moderators.
    2. You entered the moderator W in the wrong field. In some old documentation the moderator is referred to as M, but it is W now.
    3. Your variable names are too long. PROCESS can only handle variables that are 8 characters or shorter in length. Shortening your variable names beforehand (but keeping them meaningful) will solve this. It is also possible in the PROCESS dialogue window to click the button ‘Long variable names’ and check the box ‘I accept the risk of incorrect output.’ Only do this if there are no other variables with the same first 8 characters, or else PROCESS may select one of these for your analysis instead.
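    Before relying on the ‘Long variable names’ option from mistake 3, you could check your variable list for names that become identical after truncation. A minimal Python sketch, in which the function `truncation_collisions` and the variable names are hypothetical, and case-insensitive matching of names is an assumption:

```python
from collections import defaultdict

def truncation_collisions(names, max_len=8):
    """Group variable names that share the same first `max_len` characters."""
    groups = defaultdict(list)
    for name in names:
        # Assumption: names are compared case-insensitively.
        groups[name[:max_len].lower()].append(name)
    return {prefix: group for prefix, group in groups.items() if len(group) > 1}

# Hypothetical variable names from a data file.
variables = ["happiness_t1", "happiness_t2", "extraversion", "age"]
collisions = truncation_collisions(variables)
print(collisions)
# {'happines': ['happiness_t1', 'happiness_t2']}
```

    An empty result means no two names share their first 8 characters; any name listed in the result should be renamed first.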

    4.6 I managed to run PROCESS but I have trouble making the graph.

    First, be aware that you don’t get the graph automatically. PROCESS gives you a bit of code in the output that you can copy and paste into a new syntax file and run from there.

    Make sure that you select the entire new syntax that you pasted before running! Otherwise it only runs the last line. Unfortunately in this case you also get a plot but it doesn’t make any sense.

    4.7 Why don’t I see the conditional effects of my interaction effect in my PROCESS output?

    If you checked the box to probe the results but you don’t see it in your output, that is because your interaction effect wasn’t significant. In that case you don’t need it: ‘no interaction’ doesn’t need further specification.

  5. Do I need to center my variables?

    You need to center or standardize a variable when it is continuous, and you use it in an interaction effect (moderation). This can be the case for an independent variable and a moderator in regression analysis or ANCOVA. You do not need to center the dependent variable, or covariates that are not involved in an interaction effect.

    If you use PROCESS, you do not have to center variables yourself as PROCESS does it for you (but don’t forget to check that box under ‘Options’ if you want PROCESS to do so).

  6. How do I check the assumptions of my moderation analysis?

    Which assumptions you have to check depends on the type of analysis you are planning to use to study the moderation (e.g. ANOVA, ANCOVA, multiple regression analysis), which in turn depends on the measurement levels of your independent variables.

    Multiple regression analysis or PROCESS:

    If you use a multiple regression analysis or PROCESS, the assumptions of regression apply.

    If you use PROCESS, there is no way to directly check the assumptions within PROCESS (i.e. saving the residuals and creating histogram and scatterplots). You have several options to deal with this.

    • The first option is to also run a regular regression analysis via the Regression menu, for the sake of creating the histogram and scatterplot for the assumption check (you then have to make sure that the outcome of this regression is identical to the PROCESS output in terms of F-value, p-value, R-squared etc.).
    • A second option is to calculate the predicted and residual values yourself, based on the output of PROCESS, using the Compute function. That is, create the regression equation based on the output, calculate the predicted scores and then the residuals (observed scores minus predicted scores). Then use these new variables in your dataset to make the plots you need.
    • A third option is to avoid assuming normality and homoscedasticity in the first place. If you use the bootstrapping option, normality is no longer an assumption. However, in that case, you can only report the bootstrapped results from the output. When you use a heteroscedasticity-consistent inference method, homoscedasticity is no longer an assumption. That still leaves you with the assumption of linearity, for which you would still need to create a scatterplot, for example as described above using Regression.
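    The manual calculation of predicted scores and residuals from a regression equation works as in the following Python sketch. The coefficients and data rows are hypothetical; in practice you would copy the coefficients from your own output, and X and W must be centered in the same way as in the analysis that produced those coefficients.

```python
def predicted_and_residuals(rows, b0, b_x, b_w, b_xw):
    """Apply Y_hat = b0 + b_x*X + b_w*W + b_xw*X*W to each (x, w, y) row.

    Returns two lists: predicted scores and residuals (observed - predicted).
    """
    predicted = [b0 + b_x * x + b_w * w + b_xw * x * w for x, w, _ in rows]
    residuals = [y - y_hat for (_, _, y), y_hat in zip(rows, predicted)]
    return predicted, residuals

# Hypothetical coefficients copied from the output, and hypothetical
# (x, w, y) rows with x and w centered the same way as in the analysis.
rows = [(-1.0, 0.5, 3.0), (0.0, -0.5, 4.5), (1.0, 1.0, 8.0)]
predicted, residuals = predicted_and_residuals(rows, b0=5.0, b_x=1.0, b_w=2.0, b_xw=0.5)
print(predicted, residuals)
```

    The residuals can then be used for a histogram (normality), and plotted against the predicted scores (homoscedasticity and linearity).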

     

    ANOVA:

    If you do an ANOVA with only between-subjects factors, the assumptions are normality of residuals and homogeneity of variances. These can be checked by creating histograms or P-P plots for the residuals and by comparing the variances of the cells, for example using the Fmax rule.
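    The quantity behind the Fmax rule is the ratio of the largest to the smallest cell variance. A minimal Python sketch with hypothetical data follows; the function name `variance_ratio` is made up, and the cut-off to compare the ratio against is deliberately left out, so use the one prescribed in your course material.

```python
from statistics import variance

def variance_ratio(cells):
    """Ratio of the largest to the smallest cell variance (sample variances)."""
    variances = {cell: variance(values) for cell, values in cells.items()}
    return max(variances.values()) / min(variances.values()), variances

# Hypothetical DV scores per cell of a 2 x 2 between-subjects design.
cells = {
    ("adult", "dog"): [6, 8, 10],
    ("adult", "cat"): [4, 5, 6],
    ("child", "dog"): [5, 7, 9],
    ("child", "cat"): [6, 7, 8],
}
ratio, variances = variance_ratio(cells)
print(variances, ratio)
```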

    ANCOVA:

    The easiest way to check the assumptions of an ANCOVA is to follow the instructions about the assumptions of regression (normality of residuals, homogeneity of variances and linearity). However, within the GLM Univariate menu, there is no option to ask directly for plots of residuals, so you have to do it manually by 1) using the ‘Save’ button within the menu to save the predicted scores and residuals when running the analysis, and 2) then using these saved variables to create the histogram of the residuals and the scatterplot of the residuals (y-axis) and predicted scores (x-axis). 3) In addition, for the homogeneity assumption of the ANOVA part of the analysis, you still have to look at the ratio of the variances (or standard deviations) of the dependent variable in each group or cell (using the Fmax rule for example).

    RM ANOVA/ANCOVA:

    With a repeated measures ANOVA, assumptions are about difference scores. The assumptions here are that the difference scores should be approximately normally distributed within each cell and should have approximately the same variance in each cell.

    If you have a between-subjects factor as well, the mean scores (averages across the levels of the within-subjects factor) should be approximately normally distributed within each cell and should have approximately the same variance in each cell.

    If you have a repeated measures ANCOVA, you also have to check linearity by looking at the relation between the difference scores and the covariate.

    How many difference scores you have depends on the number of within-subjects factors and on the number of levels of each factor (e.g. with 1 within-subjects factor with 2 levels, you would have 1 difference score).
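    For the simplest case, one within-subjects factor with two levels, the difference scores and the average scores mentioned above can be computed as in this Python sketch. The data and variable names are hypothetical, and designs with more levels or more within-subjects factors need more than one difference score.

```python
from statistics import mean

# Hypothetical scores of four participants on a within-subjects factor
# with two levels (e.g. measurement 1 and measurement 2).
time1 = [5.0, 6.0, 4.0, 7.0]
time2 = [6.0, 8.0, 4.0, 9.0]

# One difference score per participant (two levels -> one difference).
difference_scores = [t2 - t1 for t1, t2 in zip(time1, time2)]

# One average score per participant, across the levels of the within factor.
average_scores = [mean(pair) for pair in zip(time1, time2)]

print(difference_scores)  # [1.0, 2.0, 0.0, 2.0]
print(average_scores)     # [5.5, 7.0, 4.0, 8.0]
```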

    Do not use the residuals that the ‘Save’ button of the RM ANOVA gives you. If you want to check assumptions with residuals, then perform separate MANOVAs on the difference scores and ANOVAs on the average scores and ask for their residuals.