Navigating from Generated Survey Data to Analysis-Ready Data File




The process:

Download data —> Screen and recode data —> EFA in SPSS —> SEM in AMOS 

For this topic, we will cover the following topics:

  1. Transferring data.
  2. Examining trends (to identify consistency or potential initial problems).
  3. Data screening.
  4. Exploratory Factor Analysis (EFA).
  5. Introduction to SEM-AMOS.

Intro:


Data management and data analysis can be exhausting processes in quantitative research. However, we can simplify them with step-by-step procedures. I will share the steps based on my approach to simplifying these complex methods. Others may have different procedures for conducting this analysis.


Google Sheets data contains a mixture of text and numbers. This needs to be addressed because an SPSS data file should consist of numbers exclusively before we prepare an Excel file suitable for importing into SPSS.


Usually, I maintain a comprehensive data file that includes all my variables. Then, when I perform a specific analysis, I simply make a copy of that file and proceed to modify it, retaining only the variables required for that particular analysis.


Disclaimer:

  • SEM is an advanced method, and our aim here is to provide an overview.
  • We will give you a broad view of what your analysis will entail, from data screening to final analysis.
  • Please note that some steps may not be explained in great detail.
  • The primary goal is to provide an understanding of the overall analysis process.
  • We will conduct a more in-depth methods workshop later on, so you can learn how to analyze your data independently.
  • If you don't grasp everything immediately, there's no need to worry; we'll offer assistance with data analysis as needed.

About the data to be used:

  • Problematic data
    • There are two screening questions; a response of 2 indicates a "No" answer.
    • This dataset comprises 7 factors/variables, each containing 4 items.
    • It includes reverse-coded items: SN4, ENVI4, HEAL4.
    • The Likert scale used ranges from 1 to 5.
  • Good data
    • There are two screening questions; a response of 2 indicates a "No" answer.
    • This dataset consists of 8 factors/variables, with some factors having 3 items and others having 4.
    • It includes reverse-coded items: SN3, HEAL3.
    • The Likert scale used ranges from 1 to 5.

1. Transferring Data From Google Forms Survey to EXCEL

File to use: Data from google sheets

Final output file: Raw dataset for use in further analysis


    1.1 Download the Data from Google Sheets

    • Go to your Google Form survey.
    • Access the responses by clicking on "Responses."
    • Choose "View in Sheets."
    • Download the Google Sheets file as a Microsoft Excel file.

    1.2 Reformat the Data in Excel

    • Replace the timestamp column with a respondent ID:
      • Clear the timestamp column entirely.
      • Start with the number 1 and drag to fill the series.
    • Replace column headings with variable codes (eliminate spaces):
      • For "Qualifying questions," "Lifestyle questions," "Constructs," and "Demographic":
      • Replace spaces with underscores or remove spaces entirely to create variable codes.

    1.3 Extract Numeric Values from Cells with Texts and Numeric Values

Note: This step can be completed later, focusing initially on preparing the dataset for EFA and SEM.

In the column where you want to fix the variable, insert an adjacent column.

Use the following formulas to extract the numeric values:

    • =RIGHT(SUBSTITUTE(A1," ",""),1)
      • This formula is for extracting a single word from a cell.
      • Replace "A1" with the cell you want to extract from.
      • It extracts the rightmost (last) character from a text string in cell A1 (the right column) after removing any spaces.
      • Drag the formula down to apply it to the entire column.
      • Copy the entire range and paste it back in the same range (but paste as values).
      • A warning may appear because it's stored as text, but click the warning and select "Convert it into number."
      • Rewrite the column heading.
      • You can delete the original column.
    • =IF(ISNUMBER(AL269), AL269, SUM(IF(ISNUMBER(--MID(AL269, ROW(INDIRECT("1:" & LEN(AL269))), 1)), 10^(LEN(AL269)-ROW(INDIRECT("1:" & LEN(AL269))))*INT(MID(AL269, ROW(INDIRECT("1:" & LEN(AL269))), 1)), 0)))
      • This formula is for extracting numeric values from cells with multiple words.
      • Replace "AL269" with the cell you want to extract from.

    1.4 Handling Multiple Responses

Note: Multiple responses will be addressed later.

    • We will address this task at a later stage of the analysis.
    • Copy the column with multiple responses to another sheet for future processing.
    • To separate the responses and extract the last number:
      • Highlight the data range containing multiple responses.
      • Go to "Data" > "Text to Columns."
      • Choose "Delimited" and click "Next."
      • Select "Comma" as the delimiter and click "Next."
      • Click "Finish" to separate the responses.
      • Once the responses are separated, you can use the formula to extract the numerical values as needed.

    1.5 Save the Excel Data File

    • Name the Excel data file: "Dataset raw."
    • Use this data set for further analysis in SPSS

2. Data Screening

File to Use: Dataset raw (copy and rename to Dataset screened)

Final Output File: Dataset to be used in recoding reverse-coded items


    2.1 Delete Unqualified Respondents

    • Utilize conditional formatting to identify unqualified respondents.
    • After identifying them, proceed to remove them from the dataset.

    2.2 Check for Missing Data on Rows

    • Ensure there are no missing data values in any rows of the dataset.

    2.3 Remove Unengaged Responses

    • To identify unengaged responses:
      • Check for the STDEV.P values (equal to zero).
      • Apply conditional formatting.
    • Once identified, delete these cases from the dataset.
    • Take note of the cases you deleted.

    2.4 Outliers, Skewness, and Kurtosis

    • (This step is necessary if you are using continuous variables in the model.)
    • However, if your model does not involve continuous variables, you can skip this step.

    2.5 Save the Dataset

    • Save the cleaned dataset with the name "Dataset screened."

3. Data Recoding

File to Use: Dataset screened

Final Output File: Data to be used in EFA proper


    3.1 Open SPSS

    • Go to "File" > "Import Data" > "Excel" > Click "OK."

    3.2 Adjust Measure Types in Variable View

    • In the "Variable View" tab:
    • Set the ID variable to "Nominal."
    • Ensure that measurement items are marked as "Scale."
    • Assign labels and values to the variables as needed.

    3.3 Recode the Reverse-Coded Items

    • Identify which items are reverse coded.
    • Go to "Transform" > "Recode Into Same Variable."
    • Select the variables that need to be recoded.

    3.4 Save the Data

    • Always save the data before running any SPSS file.
    • Save it as a new SPSS file with the name "dataset recoded."

4. Exploratory Factor Analysis

File to Use: Dataset recoded (copy and rename to Dataset EFA)

Final Output File: SPSS file with a clean pattern matrix


    4.1 Select Variables for Your Model

    • You can choose to retain only the variables that are relevant to your EFA model.

    4.2 Follow the Steps in the Blog Post

    • Refer to the steps provided on the blog post for detailed instructions on how to perform EFA using SPSS.


Additional notes:


Types of Screenings Using Different Datasets:

  1. Problematic Dataset:
    • Data that poses significant challenges.
    • Items do not logically align within one construct.
    • Results from items do not align with prior studies.
    • Inconsistent measurement of constructs.
    • Multiple cross-loadings observed.
  2. Good Enough Dataset:
    • Data with the potential for cleanliness.
    • Some cross-loadings are present but can be resolved for a clean pattern matrix.
    • Minor cleaning or adjustments needed.
    • Results align with validated items.
    • Well-screened data overall.
  3. Perfect Dataset:
    • A hypothetical dataset that does not exist.
    • Typically used as a theoretical benchmark but rarely achieved in practice.


How to Code, Encode and Analyze Multiple Response Items

Coding, encoding, and analyzing multiple response items in SPSS involves working with data where respondents can select more than one response option for a given question. Here are the detailed steps.



How to Code, Encode and Analyze Rank Order Data

To code, encode, and analyze rank order data in SPSS, you'll first need to re-enter the data in an Excel spreadsheet, and then import it into SPSS for analysis. Here are step-by-step instructions.



Intro to SEM-AMOS

Structural equation modeling (SEM) is a multivariate statistical analysis technique used to analyze structural relationships between measured variables and latent constructs. It combines elements of factor analysis and multiple regression analysis.


This step-by-step tutorial will guide you through the process, starting with data screening and leading to the execution of the causal model. Additionally, it covers special topics, including different types of analyses such as mediation, moderation, and interaction.



Hope this helps. Thanks for reading.



Comments