Experimental Design Checklist
Research Question Formulation
Is the Question Well-Formed?
- [ ] Specific: Clearly defined variables and relationships
- [ ] Answerable: Can be addressed with available methods
- [ ] Relevant: Addresses a gap in knowledge or practical need
- [ ] Feasible: Resources, time, and ethical considerations allow it
- [ ] Falsifiable: Some observable result could show it to be wrong
Have You Reviewed the Literature?
- [ ] Identified what's already known
- [ ] Found gaps or contradictions to address
- [ ] Learned from methodological successes and failures
- [ ] Identified appropriate outcome measures
- [ ] Determined typical effect sizes in the field
Hypothesis Development
Is Your Hypothesis Testable?
- [ ] Makes specific, quantifiable predictions
- [ ] Variables are operationally defined
- [ ] Specifies direction/nature of expected relationships
- [ ] Can be falsified by potential observations
Types of Hypotheses
- [ ] Null hypothesis (H₀): No effect/relationship exists
- [ ] Alternative hypothesis (H₁): Effect/relationship exists
- [ ] Directional vs. non-directional: One-tailed vs. two-tailed tests
Study Design Selection
What Type of Study is Appropriate?
Experimental (Intervention) Studies:
- [ ] Randomized Controlled Trial (RCT): Gold standard for causation
- [ ] Quasi-experimental: Manipulation without random assignment
- [ ] Within-subjects: Same participants in all conditions
- [ ] Between-subjects: Different participants per condition
- [ ] Factorial: Multiple independent variables
- [ ] Crossover: Participants receive multiple interventions sequentially
Observational Studies:
- [ ] Cohort: Follow groups over time
- [ ] Case-control: Compare those with/without outcome
- [ ] Cross-sectional: Snapshot at one time point
- [ ] Ecological: Population-level data
Consider:
- [ ] Can you randomly assign participants?
- [ ] Can you manipulate the independent variable?
- [ ] Is the outcome rare (favor case-control) or common?
- [ ] Do you need to establish temporal sequence?
- [ ] What's feasible given ethical and practical constraints?
Variables
Independent Variables (Manipulated/Predictor)
- [ ] Clearly defined and operationalized
- [ ] Appropriate levels/categories chosen
- [ ] Manipulation is sufficient to test hypothesis
- [ ] Manipulation check planned (if applicable)
Dependent Variables (Outcome/Response)
- [ ] Directly measures the construct of interest
- [ ] Validated and reliable measurement
- [ ] Sensitive enough to detect expected effects
- [ ] Appropriate for statistical analysis planned
- [ ] Primary outcome clearly designated
Control Variables
- [ ] Confounding variables identified:
- Variables that affect both IV and DV
- Alternative explanations for findings
- [ ] Strategy for control:
- Randomization
- Matching
- Stratification
- Statistical adjustment
- Restriction (inclusion/exclusion criteria)
- Blinding
Extraneous Variables
- [ ] Potential sources of noise identified
- [ ] Standardized procedures to minimize noise
- [ ] Environmental factors controlled
- [ ] Time of day, setting, equipment standardized
Sampling
Population Definition
- [ ] Target population: Who you want to generalize to
- [ ] Accessible population: Who you can actually sample from
- [ ] Sample: Who actually participates
- [ ] Differences between these documented
Sampling Method
- [ ] Probability sampling (preferred for generalizability):
- Simple random sampling
- Stratified sampling
- Cluster sampling
- Systematic sampling
- [ ] Non-probability sampling (common but limits generalizability):
- Convenience sampling
- Purposive sampling
- Snowball sampling
- Quota sampling
Sample Size
- [ ] A priori power analysis conducted
- Expected effect size (from literature or pilot)
- Desired power (typically .80 or .90)
- Significance level (typically .05)
- Statistical test to be used
- [ ] Accounts for expected attrition/dropout
- [ ] Sufficient for planned subgroup analyses
- [ ] Practical constraints acknowledged
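The inputs listed above (effect size, power, significance level, test) can be turned into a per-group sample size. The following is a minimal sketch, assuming a two-group comparison of means and the normal approximation; it gives slightly smaller numbers than an exact t-based calculation, so treat it as a planning estimate rather than a replacement for dedicated power software. The function names `n_per_group` and `inflate_for_attrition`, the effect size of 0.5, and the 20% dropout rate are illustrative assumptions, not values from this checklist.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size_d, alpha=0.05, power=0.80, two_tailed=True):
    """Approximate per-group n for comparing two means (normal approximation)."""
    if effect_size_d == 0:
        raise ValueError("effect size must be non-zero")
    z = NormalDist()
    # Critical value for the chosen significance level
    z_alpha = z.inv_cdf(1 - alpha / 2) if two_tailed else z.inv_cdf(1 - alpha)
    # Quantile corresponding to the desired power
    z_power = z.inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_power) ** 2 / effect_size_d ** 2)


def inflate_for_attrition(n, expected_dropout):
    """Number to recruit so that roughly n remain after the expected dropout fraction."""
    if not 0 <= expected_dropout < 1:
        raise ValueError("expected_dropout must be in [0, 1)")
    return math.ceil(n / (1 - expected_dropout))


# Hypothetical planning values: d = 0.5, alpha = .05, power = .80, 20% dropout
n = n_per_group(0.5)
recruit = inflate_for_attrition(n, 0.20)
print(n, recruit)  # 63 79
```

Smaller expected effects raise the required n sharply because the effect size enters the formula squared.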
Inclusion/Exclusion Criteria
- [ ] Clearly defined and justified
- [ ] Not overly restrictive (overly narrow criteria limit generalizability)
- [ ] Based on theoretical or practical considerations
- [ ] Ethical considerations addressed
- [ ] Documented and applied consistently
Blinding and Randomization
Randomization
- [ ] What is randomized:
- Participant assignment to conditions
- Order of conditions (within-subjects)
- Stimuli/items presented
- [ ] Method of randomization:
- Computer-generated random numbers
- Random number tables
- Coin flips (for very small studies)
- [ ] Allocation concealment:
- Sequence generated before recruitment
- Allocation hidden until after enrollment
- Sequentially numbered, sealed envelopes (if needed)
- [ ] Stratified randomization:
- Balance important variables across groups
- Block randomization to ensure equal group sizes
- [ ] Check randomization:
- Compare groups at baseline
- Report any significant differences
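Block randomization with a computer-generated sequence can be sketched as follows. This is an illustrative example, assuming two arms, a fixed block size of 4, and a recorded seed so the sequence can be regenerated for audit; the function name `block_randomize` and the arm labels are hypothetical. In practice the sequence would be generated before recruitment and kept concealed from the people enrolling participants.

```python
import random


def block_randomize(n_participants, arms=("treatment", "control"), block_size=4, seed=2024):
    """Return an allocation sequence built from shuffled, balanced blocks."""
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)  # fixed seed: the same sequence can be regenerated later
    per_arm = block_size // len(arms)
    sequence = []
    while len(sequence) < n_participants:
        block = list(arms) * per_arm  # each arm appears equally often within a block
        rng.shuffle(block)
        sequence.extend(block)
    # Truncating the final block can leave a small imbalance if n is not a multiple of block_size
    return sequence[:n_participants]


allocation = block_randomize(12)
for participant_id, arm in enumerate(allocation, start=1):
    print(participant_id, arm)
```

Fixed block sizes make the last assignment in each block predictable to unblinded staff; varying the block size at random is a common refinement.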
Blinding
- [ ] Single-blind: Participants don't know group assignment
- [ ] Double-blind: Participants and researchers don't know
- [ ] Triple-blind: Participants, researchers, and data analysts don't know
- [ ] Blinding feasibility:
- Is true blinding possible?
- Placebo/sham controls needed?
- Identical appearance of interventions?
- [ ] Blinding check:
- Assess whether blinding was maintained
- Ask participants/researchers to guess assignments
Control Groups and Conditions
What Type of Control?
- [ ] No treatment control: Natural course of condition
- [ ] Placebo control: Inert treatment for comparison
- [ ] Active control: Standard treatment comparison
- [ ] Wait-list control: Delayed treatment
- [ ] Attention control: Matches contact time without active ingredient
Multiple Conditions
- [ ] Factorial designs for multiple factors
- [ ] Dose-response relationship assessment
- [ ] Mechanism testing with component analyses
Procedures
Protocol Development
- [ ] Detailed, written protocol:
- Step-by-step procedures
- Scripts for standardized instructions
- Decision rules for handling issues
- Data collection forms
- [ ] Pilot tested before main study
- [ ] Staff trained to criterion
- [ ] Compliance monitoring planned
Standardization
- [ ] Same instructions for all participants
- [ ] Same equipment and materials
- [ ] Same environment/setting when possible
- [ ] Same assessment timing
- [ ] Deviations from protocol documented
Data Collection
- [ ] When collected:
- Baseline measurements
- Post-intervention
- Follow-up timepoints
- [ ] Who collects:
- Trained researchers
- Blinded when possible
- Inter-rater reliability established
- [ ] How collected:
- Valid, reliable instruments
- Standardized administration
- Multiple methods if possible (triangulation)
Measurement
Validity
- [ ] Face validity: Appears to measure construct
- [ ] Content validity: Covers all aspects of construct
- [ ] Criterion validity: Correlates with gold standard
- Concurrent validity
- Predictive validity
- [ ] Construct validity: Measures theoretical construct
- Convergent validity (correlates with related measures)
- Discriminant validity (doesn't correlate with unrelated measures)
Reliability
- [ ] Test-retest: Consistent over time
- [ ] Internal consistency: Items measure same construct (Cronbach's α)
- [ ] Inter-rater reliability: Agreement between raters (Cohen's κ, ICC)
- [ ] Parallel forms: Alternative versions consistent
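As one concrete instance of the reliability indices named above, Cohen's κ for two raters can be computed from observed and chance-expected agreement. The sketch below uses only the standard library; the function name `cohens_kappa` and the ratings are made up for illustration.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning categorical labels to the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same non-empty set of items")
    n = len(rater_a)
    # Proportion of items on which the raters agree
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal label frequencies
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / n ** 2
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Hypothetical ratings of 10 items by two raters
rater_a = ["yes", "yes", "yes", "no", "no", "no", "yes", "no", "yes", "no"]
rater_b = ["yes", "yes", "no", "no", "no", "yes", "yes", "no", "yes", "no"]
print(round(cohens_kappa(rater_a, rater_b), 2))  # 0.6
```

Here the raters agree on 8 of 10 items (0.80), chance agreement is 0.50, and κ = (0.80 - 0.50) / (1 - 0.50) = 0.60, which shows why raw percent agreement overstates reliability.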
Measurement Considerations
- [ ] Objective measures preferred when possible
- [ ] Validated instruments used when available
- [ ] Multiple measures of key constructs
- [ ] Sensitivity to change considered
- [ ] Floor/ceiling effects avoided
- [ ] Response formats appropriate
- [ ] Recall periods appropriate
- [ ] Cultural appropriateness considered
Bias Minimization
Selection Bias
- [ ] Random sampling when possible
- [ ] Clearly defined eligibility criteria
- [ ] Document who declines and why
- [ ] Minimize self-selection
Performance Bias
- [ ] Standardized protocols
- [ ] Blinding of providers
- [ ] Monitor protocol adherence
- [ ] Document deviations
Detection Bias
- [ ] Blinding of outcome assessors
- [ ] Objective measures when possible
- [ ] Standardized assessment procedures
- [ ] Multiple raters with reliability checks
Attrition Bias
- [ ] Strategies to minimize dropout
- [ ] Track reasons for dropout
- [ ] Compare dropouts to completers
- [ ] Intention-to-treat analysis planned
Reporting Bias
- [ ] Preregister study and analysis plan
- [ ] Designate primary vs. secondary outcomes
- [ ] Commit to reporting all outcomes
- [ ] Distinguish planned from exploratory analyses
Data Management
Data Collection
- [ ] Data collection forms designed and tested
- [ ] Electronic data capture platform selected (REDCap, Qualtrics, or similar)
- [ ] Range checks and validation rules
- [ ] Regular backups
- [ ] Secure storage (HIPAA/GDPR compliant if needed)
Data Quality
- [ ] Real-time data validation
- [ ] Regular quality checks
- [ ] Missing data patterns monitored
- [ ] Outliers identified and investigated
- [ ] Protocol deviations documented
Data Security
- [ ] De-identification procedures
- [ ] Access controls
- [ ] Audit trails
- [ ] Compliance with regulations (IRB, HIPAA, GDPR)
Statistical Analysis Planning
Analysis Plan (Prespecify Before Data Collection)
- [ ] Primary analysis:
- Statistical test(s) specified
- Hypothesis clearly stated
- Significance level set (usually α = .05)
- One-tailed or two-tailed
- [ ] Secondary analyses:
- Clearly designated as secondary
- Exploratory analyses labeled as such
- [ ] Multiple comparisons:
- Adjustment method specified (if needed)
- Single prespecified primary outcome limits Type I error inflation
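One adjustment method that could be prespecified is the Holm step-down procedure, which controls the familywise error rate and is never less powerful than plain Bonferroni. The sketch below is an illustration only; the function name `holm_adjust` and the p-values are hypothetical, and other methods (for example false discovery rate control) may suit a given study better.

```python
def holm_adjust(p_values):
    """Return Holm step-down adjusted p-values, in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value is multiplied by (m - k + 1), capped at 1
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Adjusted values must not decrease as the raw p-values increase
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p-values from four secondary outcomes
raw = [0.01, 0.04, 0.03, 0.005]
print([round(p, 3) for p in holm_adjust(raw)])  # [0.03, 0.06, 0.06, 0.02]
```

With these values, two of the four outcomes remain below .05 after adjustment, although all four were below .05 before it.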
Assumptions
- [ ] Assumptions of statistical tests identified
- [ ] Plan to check assumptions
- [ ] Backup non-parametric alternatives
- [ ] Transformation options considered
Missing Data
- [ ] Anticipated amount of missingness
- [ ] Missing data mechanism (MCAR, MAR, MNAR)
- [ ] Handling strategy:
- Complete case analysis
- Multiple imputation
- Maximum likelihood
- [ ] Sensitivity analyses planned
Effect Sizes
- [ ] Appropriate effect size measures identified
- [ ] Will be reported alongside p-values
- [ ] Confidence intervals planned
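For a two-group comparison of means, one commonly reported effect size is Cohen's d with a confidence interval. The following is a minimal sketch, assuming independent groups, a pooled standard deviation, and a large-sample normal approximation for the standard error of d; the function name `cohens_d_with_ci` and the scores are invented for illustration, and small samples may call for a correction such as Hedges' g.

```python
import math
from statistics import NormalDist, mean, variance


def cohens_d_with_ci(group_1, group_2, confidence=0.95):
    """Cohen's d (pooled SD) and an approximate confidence interval."""
    n1, n2 = len(group_1), len(group_2)
    if n1 < 2 or n2 < 2:
        raise ValueError("each group needs at least two observations")
    # Pooled variance from the two sample variances
    pooled_var = ((n1 - 1) * variance(group_1) + (n2 - 1) * variance(group_2)) / (n1 + n2 - 2)
    d = (mean(group_1) - mean(group_2)) / math.sqrt(pooled_var)
    # Large-sample approximation to the standard error of d
    se = math.sqrt((n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2)))
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return d, (d - z * se, d + z * se)


# Hypothetical outcome scores
treatment = [5, 6, 7, 8, 9]
control = [3, 4, 5, 6, 7]
d, (lower, upper) = cohens_d_with_ci(treatment, control)
print(f"d = {d:.2f}, 95% CI [{lower:.2f}, {upper:.2f}]")
```

With only five participants per group the interval is very wide, which is the kind of information a p-value alone would hide.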
Statistical Software
- [ ] Software selected (R, SPSS, Stata, Python, etc.)
- [ ] Version documented
- [ ] Analysis scripts prepared in advance
- [ ] Will be made available (Open Science)
Ethical Considerations
Ethical Approval
- [ ] IRB/Ethics committee approval obtained
- [ ] Study registered (ClinicalTrials.gov, etc.) if applicable
- [ ] Protocol follows Declaration of Helsinki or equivalent
Informed Consent
- [ ] Voluntary participation
- [ ] Comprehensible explanation
- [ ] Risks and benefits disclosed
- [ ] Right to withdraw without penalty
- [ ] Privacy protections explained
- [ ] Compensation disclosed
Risk-Benefit Analysis
- [ ] Potential benefits outweigh risks
- [ ] Risks minimized
- [ ] Vulnerable populations protected
- [ ] Data safety monitoring (if high risk)
Confidentiality
- [ ] Data de-identified
- [ ] Secure storage
- [ ] Limited access
- [ ] Reporting doesn't allow re-identification
Validity Threats
Internal Validity (Causation)
- [ ] History: External events between measurements
- [ ] Maturation: Changes in participants over time
- [ ] Testing: Effects of repeated measurement
- [ ] Instrumentation: Changes in measurement over time
- [ ] Regression to the mean: Extreme scores tend to be less extreme on remeasurement
- [ ] Selection: Groups differ at baseline
- [ ] Attrition: Differential dropout
- [ ] Diffusion: Control group receives treatment elements
External Validity (Generalizability)
- [ ] Sample representative of population
- [ ] Setting realistic/natural
- [ ] Treatment typical of real-world implementation
- [ ] Outcome measures ecologically valid
- [ ] Time frame appropriate
Construct Validity (Measurement)
- [ ] Measures actually tap intended constructs
- [ ] Operations match theoretical definitions
- [ ] No confounding of constructs
- [ ] Adequate coverage of construct
Statistical Conclusion Validity
- [ ] Adequate statistical power
- [ ] Assumptions met
- [ ] Appropriate tests used
- [ ] Alpha level appropriate
- [ ] Multiple comparisons addressed
Reporting and Transparency
Preregistration
- [ ] Study preregistered (OSF, ClinicalTrials.gov, AsPredicted)
- [ ] Hypotheses stated a priori
- [ ] Analysis plan documented
- [ ] Distinguishes confirmatory from exploratory
Reporting Guidelines
- [ ] RCTs: CONSORT checklist
- [ ] Observational studies: STROBE checklist
- [ ] Systematic reviews: PRISMA checklist
- [ ] Diagnostic studies: STARD checklist
- [ ] Qualitative research: COREQ checklist
- [ ] Case reports: CARE guidelines
Transparency
- [ ] All measures reported
- [ ] All manipulations disclosed
- [ ] Sample size determination explained
- [ ] Exclusion criteria and numbers reported
- [ ] Attrition documented
- [ ] Deviations from protocol noted
- [ ] Conflicts of interest disclosed
Open Science
- [ ] Data sharing planned (when ethical)
- [ ] Analysis code shared
- [ ] Materials available
- [ ] Preprint posted
- [ ] Open access publication when possible
Post-Study Considerations
Data Analysis
- [ ] Follow preregistered plan
- [ ] Clearly label deviations and exploratory analyses
- [ ] Check assumptions
- [ ] Report all outcomes
- [ ] Report effect sizes and CIs, not just p-values
Interpretation
- [ ] Conclusions supported by data
- [ ] Limitations acknowledged
- [ ] Alternative explanations considered
- [ ] Generalizability discussed
- [ ] Clinical/practical significance addressed
Dissemination
- [ ] Publish regardless of results (reduce publication bias)
- [ ] Present at conferences
- [ ] Share findings with participants (when appropriate)
- [ ] Communicate to relevant stakeholders
- [ ] Plain language summaries
Next Steps
- [ ] Replication needed?
- [ ] Follow-up studies identified
- [ ] Mechanism studies planned
- [ ] Clinical applications considered
Common Pitfalls to Avoid
- [ ] No power analysis → underpowered study
- [ ] Hypothesis formed after seeing data (HARKing)
- [ ] No blinding when feasible → bias
- [ ] P-hacking (data fishing, optional stopping)
- [ ] Multiple testing without correction → false positives
- [ ] Inadequate control group
- [ ] Confounding not addressed
- [ ] Instruments not validated
- [ ] High attrition not addressed
- [ ] Cherry-picking results to report
- [ ] Causal language from correlational data
- [ ] Ignoring assumptions of statistical tests
- [ ] Not preregistering → undisclosed analytic flexibility and a biased literature
- [ ] Conflicts of interest not disclosed
Final Checklist Before Starting
- [ ] Research question is clear and important
- [ ] Hypothesis is testable and specific
- [ ] Study design is appropriate
- [ ] Sample size is adequate (power analysis)
- [ ] Measures are valid and reliable
- [ ] Confounds are controlled
- [ ] Randomization and blinding implemented
- [ ] Data collection is standardized
- [ ] Analysis plan is prespecified
- [ ] Ethical approval obtained
- [ ] Study is preregistered
- [ ] Resources are sufficient
- [ ] Team is trained
- [ ] Protocol is documented
- [ ] Backup plans exist for problems
Remember
Good experimental design is about:
- Asking clear questions
- Minimizing bias
- Maximizing validity
- Appropriate inference
- Transparency
- Reproducibility
The best time to think about these issues is before collecting data, not after.